# Study Vector Bundle in a Relatively Harder Way - Tangent Bundle

## Tangent line and tangent surface as vector spaces

We begin our study with some elementary calculus. Take the function \(f(x)=x^2+\frac{e^x}{x^2+1}\) as our example. Finding its tangent line at the point \((0,1)\) is straightforward: since \(f'(x)=2x+\frac{e^x(x-1)^2}{(x^2+1)^2}\), we have \(f'(0)=1\), which gives the tangent line \(l:x-y+1=0\).
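As a quick sanity check, the slope and the tangent line can be recovered numerically. The sketch below is only an illustration: the helper names `derivative` and `tangent_line` and the step size are my own choices, and the derivative is approximated by a central difference rather than computed symbolically.

```python
import math

def f(x):
    """The example function f(x) = x^2 + e^x / (x^2 + 1)."""
    return x**2 + math.exp(x) / (x**2 + 1)

def derivative(func, x0, step=1e-6):
    """Central-difference approximation of func'(x0) (hypothetical helper)."""
    return (func(x0 + step) - func(x0 - step)) / (2 * step)

x0 = 0.0
slope = derivative(f, x0)          # approximates f'(0)
intercept = f(x0) - slope * x0     # tangent line: y = slope * x + intercept

def tangent_line(x):
    """Value of the tangent line to the graph of f at (0, f(0))."""
    return slope * x + intercept

print(slope, intercept)
```

Up to rounding, the slope and the intercept are both \(1\), which matches \(l: x-y+1=0\).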

\(l\) is not a vector space since, in general, it
does not pass through the origin. But \(l-\overrightarrow{OA}\), the translate of \(l\) by the point of tangency \(A(0,1)\), is a vector space.
In general, suppose \(P(x,y)\) is a
point on the curve determined by \(f\),
i.e. \(y=f(x)\), then we obtain a
vector space \(l_P-\overrightarrow{OP} \simeq
\mathbb{R}\). But the action of moving the tangent line to the
origin is superfluous, so naturally we consider the tangent line at \(P\) as a vector space
**determined** by \(P\).
In this case, the induced vector space (tangent line) is always of
dimension \(1\).

Now we move to two-variable functions. We have a function \(a(x,y)=x^2+y^2-x-y+xy\) as our example.
Some elementary Calculus work gives us the tangent surface of \(z=a(x,y)\) at \(A(1,1,1)\), which can be identified by
\(S:2x+2y-z=3\simeq\mathbb{R}^2\).
Again, this can be considered as a vector space
**determined** by \(A\),
or roughly speaking it is one if we take \(A\) as the origin. Further we have a basis
\((\overrightarrow{AB},\overrightarrow{AC})\).
Other vectors on \(S\), for example
\(\overrightarrow{AD}\), can be written
as a linear combination of \(\overrightarrow{AB}\) and \(\overrightarrow{AC}\). In other words,
\(S\) is "spanned" by \((\overrightarrow{AB},\overrightarrow{AC})\).
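The tangent surface and a spanning pair can also be checked numerically. This is a minimal sketch under my own assumptions: the partial derivatives are approximated by central differences, and the basis `e1`, `e2` is one hypothetical choice (it need not coincide with \(\overrightarrow{AB}\) and \(\overrightarrow{AC}\) above).

```python
def a(x, y):
    """The example function a(x, y) = x^2 + y^2 - x - y + xy."""
    return x**2 + y**2 - x - y + x * y

def partials(func, x0, y0, step=1e-6):
    """Central-difference approximations of the two partial derivatives."""
    ax = (func(x0 + step, y0) - func(x0 - step, y0)) / (2 * step)
    ay = (func(x0, y0 + step) - func(x0, y0 - step)) / (2 * step)
    return ax, ay

x0, y0 = 1.0, 1.0
z0 = a(x0, y0)                 # height of the point A = (1, 1, 1)
ax, ay = partials(a, x0, y0)   # both approximately 2

# One possible basis of the tangent surface, regarded as based at A.
e1 = (1.0, 0.0, ax)
e2 = (0.0, 1.0, ay)

def point(s, t):
    """The point A + s*e1 + t*e2."""
    return (x0 + s * e1[0] + t * e2[0],
            y0 + s * e1[1] + t * e2[1],
            z0 + s * e1[2] + t * e2[2])

def on_plane(p, tol=1e-6):
    """Check the equation 2x + 2y - z = 3 up to a tolerance."""
    x, y, z = p
    return abs(2 * x + 2 * y - z - 3) < tol

print(on_plane(point(0.5, -2.0)))
```

Every linear combination of `e1` and `e2`, attached at \(A\), lands on \(S\), which is the sense in which \(S\) is spanned by two vectors.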

Tangent lines and tangent surfaces play an important role in
differentiation. But sometimes we do not have a chance to use them with
ease; for example, \(S^1:x^2+y^2=1\)
cannot be represented by a single-variable function. However, the
implicit function theorem, which you have already learned in calculus,
lets us find a suitable function locally. In this
post we will generalize this concept and find the tangent
**space** at a point of a manifold. (The two examples
above have already determined two manifolds and two tangent spaces.)

## Definition of tangent vectors

We introduce the abstract definition of a tangent vector at the very beginning. You may think it is way too abstract, but actually it is not. Surprisingly, the following definition will simplify our work later. But before we go on, make sure that you have learned about the Fréchet derivative (along with some functional analysis).

Let \(M\) be a manifold of class
\(C^p\) with \(p \geq 1\) and let \(x\) be a point of \(M\). Let \((U,\varphi)\) be a chart at \(x\) and \(v\) be an element of the vector space \(\mathbf{E}\) where \(\varphi(U)\) lies (for example, if \(M\) is a \(d\)-dimensional manifold, then \(v \in \mathbb{R}^d\)). Next we consider the
triple \((U,\varphi,v)\). Suppose \((U,\varphi,v)\) and \((V,\psi,w)\) are two such triples. We say
these two triples are **equivalent** if the following
identity holds: \[
{\color{green}{[}}{\color{red}{(\psi\circ\varphi^{-1})'(}}{\color{purple}{\varphi(x)}}{\color{red}{)}}{\color{green}{]}}(v)=w.
\] This identity looks messy so we need to explain how to read
it. First we consider the function in red: the derivative of \(\psi\circ\varphi^{-1}\). The derivative of
\(\psi\circ\varphi^{-1}\) at the point
\(\varphi(x)\) (in purple) is a linear
transform, and the transform is enclosed in green brackets. Finally,
this linear transform maps \(v\) to
\(w\). In short we read: the derivative
of \(\psi\circ\varphi^{-1}\) at \(\varphi(x)\) maps \(v\) to \(w\). You may recall that you have met
something like \(\psi\circ\varphi^{-1}\) in the definition
of a manifold. At first sight it is not obvious that these 'triples' have anything to do
with tangent vectors. But before we explain it, we need to make sure that
we have indeed defined an equivalence relation.

(Theorem 1) The relation \[ (U,\varphi,v) \sim (V,\psi,w) \iff [(\psi\circ\varphi^{-1})'(\varphi(x))](v)=w \] is an equivalence relation.

*Proof.* This requires nothing beyond elementary calculus, in
fact only the chain rule:

(Chain rule) If \(f:U \to V\) is differentiable at \(x_0 \in U\) and \(g: V \to W\) is differentiable at \(f(x_0)\), then \(g \circ f\) is differentiable at \(x_0\), and \[ (g\circ f)'(x_0)=g'(f(x_0))\circ f'(x_0) \]

- \((U,\varphi,v)\sim(U,\varphi,v)\).

Since \(\varphi\circ\varphi^{-1}=\operatorname{id}\), whose derivative is still the identity everywhere, we have \[ [(\varphi\circ\varphi^{-1})'(\varphi(x))](v)=\operatorname{id}(v)=v \]

- If \((U,\varphi,v) \sim (V,\psi,w)\), then \((V,\psi,w)\sim(U,\varphi,v)\).

So now we have \[ [(\psi\circ\varphi^{-1})'(\varphi(x))](v)=w. \] To prove that \([(\varphi\circ\psi^{-1})'(\psi(x))](w)=v\), we need to apply the chain rule a few times.

Note first \[ (\psi\circ\varphi^{-1})'(\varphi(x))=\psi'(\varphi^{-1}(\varphi(x)))\circ(\varphi^{-1})'(\varphi(x))=\psi'(x)\circ(\varphi^{-1})'(\varphi(x)) \] while \[ (\varphi\circ\psi^{-1})'(\psi(x))=\varphi'(x)\circ(\psi^{-1})'(\psi(x)). \] But also by the chain rule, if \(f\) is a diffeomorphism, we have \[ (f^{-1}\circ f)'(x)=(f^{-1})'(f(x))\circ f'(x)=\operatorname{id} \] (and symmetrically for \(f\circ f^{-1}\)), or equivalently \[ f'(x)=[(f^{-1})'(f(x))]^{-1}, \quad (f^{-1})'(f(x))=[f'(x)]^{-1}. \]

Therefore \[ \begin{aligned} \{(\psi\circ\varphi^{-1})'(\varphi(x))\}^{-1} &=\{\psi'(x)\circ(\varphi^{-1})'(\varphi(x))\}^{-1} \\ &=\{(\varphi^{-1})'(\varphi(x))\}^{-1}\circ\{\psi'(x)\}^{-1} \\ &=\varphi'(x)\circ(\psi^{-1})'(\psi(x)) \\ &=(\varphi\circ\psi^{-1})'(\psi(x)) \end{aligned} \] which implies \[ (\varphi\circ\psi^{-1})'(\psi(x))(w)=\{(\psi\circ\varphi^{-1})'(\varphi(x))\}^{-1}(w)=v. \]

- If \((U,\varphi,v)\sim(V,\psi,w)\) and \((V,\psi,w)\sim(W,\lambda,z)\), then \((U,\varphi,v)\sim(W,\lambda,z)\).

We are given the identities \[ [(\psi\circ\varphi^{-1})'(\varphi(x))](v)=w \] and \[ [(\lambda\circ\psi^{-1})'(\psi(x))](w)=z. \] Substituting the first identity into the second to eliminate \(w\), we get \[ \begin{aligned} z = [(\lambda\circ\psi^{-1})'(\psi(x))] \circ [(\psi\circ\varphi^{-1})'(\varphi(x))] (v) \end{aligned}. \] On the other hand, \[ \begin{aligned} (\lambda\circ\varphi^{-1})'(\varphi(x))&=(\lambda\circ\psi^{-1}\circ\psi\circ\varphi^{-1})'(\varphi(x)) \\ &=(\lambda\circ\psi^{-1})'(\psi\circ\varphi^{-1}\circ\varphi(x))\circ(\psi\circ\varphi^{-1})'(\varphi(x)) \\ &=(\lambda\circ\psi^{-1})'(\psi(x))\circ(\psi\circ\varphi^{-1})'(\varphi(x)) \end{aligned} \] which is what we needed. \(\square\)
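Symmetry and transitivity can be tested numerically on a concrete example. The sketch below assumes \(M\) is the open right half-plane in \(\mathbb{R}^2\) with three hypothetical charts of my own choosing: the identity `phi`, polar coordinates `psi`, and a shear `lam`. Jacobians are approximated by central differences, so equalities hold only up to a small tolerance.

```python
import math

def jacobian(F, p, step=1e-6):
    """2x2 Jacobian of F: R^2 -> R^2 at p, by central differences."""
    x, y = p
    fxp, fxm = F((x + step, y)), F((x - step, y))
    fyp, fym = F((x, y + step)), F((x, y - step))
    return [[(fxp[0] - fxm[0]) / (2 * step), (fyp[0] - fym[0]) / (2 * step)],
            [(fxp[1] - fxm[1]) / (2 * step), (fyp[1] - fym[1]) / (2 * step)]]

def apply(J, v):
    """Apply the 2x2 matrix J to the vector v."""
    return (J[0][0] * v[0] + J[0][1] * v[1],
            J[1][0] * v[0] + J[1][1] * v[1])

# Three charts on the right half-plane (illustrative choices).
def phi(p):      return p
def phi_inv(q):  return q
def psi(p):      return (math.hypot(p[0], p[1]), math.atan2(p[1], p[0]))
def psi_inv(q):  return (q[0] * math.cos(q[1]), q[0] * math.sin(q[1]))
def lam(p):      return (p[0], p[1] + p[0] ** 2)

def transition(chart_to, chart_from_inv):
    """The map chart_to o chart_from^{-1} between coordinate domains."""
    return lambda q: chart_to(chart_from_inv(q))

x = (1.0, 2.0)     # a point of M
v = (0.3, -0.7)    # component of the triple (U, phi, v)

# w is defined so that (U, phi, v) ~ (V, psi, w).
w = apply(jacobian(transition(psi, phi_inv), phi(x)), v)

# Symmetry: going back from psi to phi recovers v.
v_back = apply(jacobian(transition(phi, psi_inv), psi(x)), w)

# Transitivity: phi -> psi -> lam agrees with phi -> lam directly.
z_two_steps = apply(jacobian(transition(lam, psi_inv), psi(x)), w)
z_direct = apply(jacobian(transition(lam, phi_inv), phi(x)), v)

print(v_back, z_two_steps, z_direct)
```

Here `v_back` agrees with `v`, and `z_two_steps` agrees with `z_direct`, up to the finite-difference error.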

An **equivalence class** of such triples \((U,\varphi,v)\) is called a **tangent
vector** of \(M\) at \(x\). The set of such tangent vectors is
called the **tangent space** to \(M\) at \(x\), which is denoted by \(T_xM\). But it seems that we have gone
too far. Is the triple even a 'vector'? To get a clear view let's see
Euclidean submanifolds first.

### Definition of tangent vectors of Euclidean submanifolds

Suppose \(M\) is a submanifold of \(\mathbb{R}^n\). We say \(z\) is a **tangent vector** of \(M\) at the point \(x\) if there exists a curve \(\alpha\) of class \(C^1\), defined on an open interval \(I\) containing \(t_0\) with \(\alpha(I) \subset M\), such that \(\alpha(t_0)=x\) and \(\alpha'(t_0)=z\). (For convenience we often take \(t_0=0\).)

This definition becomes clear once we check an example. For the curve \(M: x^2+\frac{e^x}{x^2+1}-y=0\), which is the graph of \(f\) from our first example, we can show that \((1,1)^T\) is a tangent vector of \(M\) at \((0,1)\), in agreement with the tangent line \(l\) found there. Taking \[ \alpha(t)=\left(t,t^2+\frac{e^t}{t^2+1}\right) \] we get \(\alpha(0)=(0,1)\) and \[ \alpha'(t)=\left(1,2t+\frac{e^t(t-1)^2}{(t^2+1)^2}\right)^T. \] Therefore \(\alpha'(0)=(1,1)^T\). \(\square\)
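The curve-based definition is easy to experiment with. The following sketch takes the graph of \(f(x)=x^2+\frac{e^x}{x^2+1}\) from the first example, parametrized as \(t \mapsto (t,f(t))\), and approximates the velocity at \(t=0\) by a central difference. The helper `velocity` is a hypothetical name, not part of any library.

```python
import math

def alpha(t):
    """Parametrization t -> (t, f(t)) of the graph of the first example."""
    return (t, t**2 + math.exp(t) / (t**2 + 1))

def velocity(curve, t0, step=1e-6):
    """Central-difference approximation of curve'(t0), componentwise."""
    ahead, behind = curve(t0 + step), curve(t0 - step)
    return tuple((p - q) / (2 * step) for p, q in zip(ahead, behind))

z = velocity(alpha, 0.0)     # approximately (1, 1)

# The line l: x - y + 1 = 0 has normal vector (1, -1); a tangent vector
# at (0, 1) should be orthogonal to it.
normal_of_l = (1.0, -1.0)
dot = sum(n * c for n, c in zip(normal_of_l, z))

print(alpha(0.0), z, dot)
```

The velocity is orthogonal to the normal of \(l\), so the tangent vector obtained from the curve points along the tangent line from the first section.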

#### Coordinate system and tangent vector

Let \(\mathbf{E}\) and \(\mathbf{F}\) be two Banach spaces and \(U\) an open subset of \(\mathbf{E}\). A \(C^p\) map \(f: U \to \mathbf{F}\) is called an **immersion** at \(x\) if \(f'(x)\) is injective.

For example, if we take \(\mathbf{E}=\mathbf{F}=\mathbb{R}=U\) and \(f(x)=x^2\), then \(f\) is an immersion at every point of \(\mathbb{R}\) except \(0\), since \(f'(0)=0\) is not injective. This may lead you to Sard's theorem.

(Theorem 2) Let \(M\) be a subset of \(\mathbb{R}^n\). Then \(M\) is a \(d\)-dimensional \(C^p\) submanifold of \(\mathbb{R}^n\) if and only if for every \(x \in M\) there exists an open neighborhood \(U \subset \mathbb{R}^n\) of \(x\), an open neighborhood \(\Omega \subset \mathbb{R}^d\) of \(0\) and a \(C^p\) map \(g: \Omega \to \mathbb{R}^n\) such that \(g\) is an immersion at \(0\) with \(g(0)=x\), and \(g\) is a homeomorphism between \(\Omega\) and \(M \cap U\) with the topology induced from \(\mathbb{R}^n\).

This follows from the definition of manifold and should not be
difficult to prove. But it is not what this blog post should cover. For
a proof you can check *Differential Geometry: Manifolds, Curves, and
Surfaces* by Marcel Berger and Bernard Gostiaux. The proof is
located in section 2.1.

A coordinate system on a \(d\)-dimensional \(C^p\) submanifold \(M\) of \(\mathbb{R}^n\) is a pair \((\Omega,g)\) consisting of an open set \(\Omega \subset \mathbb{R}^d\) and a \(C^p\) function \(g:\Omega \to \mathbb{R}^n\) such that \(g(\Omega)\) is open in \(M\) and \(g\) induces a homeomorphism between \(\Omega\) and \(g(\Omega)\).

For convenience, we say \((\Omega,g)\) is centered at \(x\) if \(g(0)=x\) and \(g\) is an immersion at \(0\). By Theorem 2 it is always possible to find such a coordinate system centered at a given point \(x \in M\). The following theorem will give us an easier approach to tangent vectors.

(Theorem 3) Let \(\mathbf{E}\) and \(\mathbf{F}\) be two finite-dimensional vector spaces, \(U \subset \mathbf{E}\) an open set, \(f:U \to \mathbf{F}\) a \(C^1\) map, \(M\) a submanifold of \(\mathbf{E}\) contained in \(U\) and \(W\) a submanifold of \(\mathbf{F}\) such that \(f(M) \subset W\). Take \(x \in M\) and set \(y=f(x)\). If \(z\) is a tangent vector to \(M\) at \(x\), then the image \(f'(x)(z)\) is a tangent vector to \(W\) at \(y\).

*Proof.* Since \(z\) is a
tangent vector, we see there exists a curve \(\alpha: J \to M\) such that \(\alpha(0)=x\) and \(\alpha'(0)=z\) where \(J\) is an open interval containing \(0\). The function \(\beta = f \circ \alpha: J \to W\) is also a
curve satisfying \(\beta(0)=f(\alpha(0))=f(x)\) and \[
\beta'(0)=f'(\alpha(0))\alpha'(0)=f'(x)(z),
\] which is our desired curve. \(\square\)
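To see Theorem 3 in action, here is a small numerical sketch with an example of my own choosing: \(M=S^1\), \(W\) the ellipse \(\frac{x^2}{4}+\frac{y^2}{3}=1\), and the linear map \(f(x,y)=(2x,\sqrt{3}y)\), which sends \(M\) onto \(W\). Because \(f\) is linear, \(f'(x)\) is \(f\) itself, so no separate Jacobian is needed. Velocities are approximated by central differences.

```python
import math

def alpha(t):
    """A curve on M = S^1 passing through x = (1, 0) at t = 0."""
    return (math.cos(t), math.sin(t))

def f(p):
    """Linear map sending S^1 onto the ellipse x^2/4 + y^2/3 = 1."""
    return (2.0 * p[0], math.sqrt(3.0) * p[1])

def beta(t):
    """The image curve beta = f o alpha on W."""
    return f(alpha(t))

def velocity(curve, t0, step=1e-6):
    """Central-difference approximation of curve'(t0), componentwise."""
    ahead, behind = curve(t0 + step), curve(t0 - step)
    return tuple((p - q) / (2 * step) for p, q in zip(ahead, behind))

z = velocity(alpha, 0.0)            # tangent vector to M at x, about (0, 1)
pushed = f(z)                       # f'(x)(z), since f is linear
beta_velocity = velocity(beta, 0.0) # tangent vector to W at y = f(x)

def ellipse_gradient(p):
    """Gradient of x^2/4 + y^2/3 at p, normal to the ellipse."""
    return (p[0] / 2.0, 2.0 * p[1] / 3.0)

y = beta(0.0)
dot = sum(n * c for n, c in zip(ellipse_gradient(y), pushed))

print(pushed, beta_velocity, dot)
```

The vector `pushed` coincides with `beta_velocity`, and it is orthogonal to the gradient of the ellipse at \(y=(2,0)\), as a tangent vector to \(W\) should be.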

### Why we use 'equivalence relation'

We shall show that the equivalence relation makes sense. Suppose \(M\) is a \(d\)-submanifold of \(\mathbb{R}^n\), \(x \in M\) and \(z\) is a tangent vector to \(M\) at \(x\). Let \((\Omega,g)\) be a coordinate system centered at \(x\). Since \(g \in C^p(\Omega;\mathbb{R}^n)\), we see \(g'(0)\) is an \(n \times d\) matrix, and injectivity ensures that \(\operatorname{rank}(g'(0))=d\).

Every open set \(\Omega \subset
\mathbb{R}^d\) is a \(d\)-dimensional submanifold of \(\mathbb{R}^d\) (of class \(C^p\)). Suppose now \(v \in \mathbb{R}^d\) is a tangent vector to
\(\Omega\) at \(0\) (determined by a curve \(\alpha\)), then by Theorem 3, \(g \circ \alpha\) determines a tangent
vector to \(M\) at \(x\), which is \(z_x=g'(0)(v)\). Suppose \((\Lambda,h)\) is another coordinate system
centered at \(x\). If we want to obtain
\(z_x\) as well, we must have \[
h'(0)(w)=g'(0)(v),
\] which is equivalent to \[
w = (h'(0)^{-1} \circ g'(0))(v)=(h^{-1}\circ g)'(0)(v),
\] for some \(w \in
\mathbb{R}^d\) which is a tangent vector to \(\Lambda\) at \(0
\in \Lambda\). *(The inverse makes sense because \(h'(0)\) is injective, so it is
invertible as a map from \(\mathbb{R}^d\) onto its image, and \(g'(0)(v)\) lies in that image.)*

However, we also have two charts given by \((U,\varphi)=(g(\Omega),g^{-1})\) and \((V,\psi) = (h(\Lambda),h^{-1})\), which gives \[ (h^{-1} \circ g)'(0)(v)=[(\psi \circ \varphi^{-1})'(\varphi(x))](v)=w \] and this is just our equivalence relation (don't forget that \(g(0)=x\) hence \(g^{-1}(x)=\varphi(x)=0\)!). This is our reason for using an equivalence relation: if \((U,\varphi,v) \sim (V,\psi,w)\), then \((U,\varphi,v)\) and \((V,\psi,w)\) determine the same tangent vector, and we do not have to evaluate it manually. In general, all elements in an equivalence class represent a single vector, so the vector is (algebraically) an equivalence class. This still holds for Banach manifolds, since topological properties of Euclidean spaces do not play a role. The proof generalizes with little difficulty.
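The relation between \(v\) and \(w\) can be made concrete on \(S^1\). In the sketch below, `g` and `h` are two hypothetical coordinate systems centered at \(x=(1,0)\): the angle parametrization and a rational (stereographic-type) parametrization. The inverse `h_inv` is valid away from \((-1,0)\). Derivatives are approximated by central differences.

```python
import math

def g(t):
    """Coordinate system centered at (1, 0): the angle parametrization."""
    return (math.cos(t), math.sin(t))

def h(s):
    """Another coordinate system centered at (1, 0): rational parametrization."""
    return ((1 - s * s) / (1 + s * s), 2 * s / (1 + s * s))

def h_inv(p):
    """Inverse of h on the circle minus the point (-1, 0)."""
    return p[1] / (1 + p[0])

def velocity(curve, t0=0.0, step=1e-6):
    """Central-difference approximation of curve'(t0), componentwise."""
    ahead, behind = curve(t0 + step), curve(t0 - step)
    return tuple((p - q) / (2 * step) for p, q in zip(ahead, behind))

def transition(t):
    """The map h^{-1} o g near 0."""
    return h_inv(g(t))

step = 1e-6
d_transition = (transition(step) - transition(-step)) / (2 * step)

v = 0.8                    # component in the coordinate system (Omega, g)
w = d_transition * v       # component in (Lambda, h), so the triples are equivalent

z_from_g = tuple(v * c for c in velocity(g))   # g'(0)(v)
z_from_h = tuple(w * c for c in velocity(h))   # h'(0)(w)

print(w, z_from_g, z_from_h)
```

Although \(v \neq w\), both triples produce the same vector in \(\mathbb{R}^2\), which is exactly what the equivalence relation records.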

## Tangent space

The tangent vectors at \(x \in M\) should form a vector space (based at \(x\)). We do hope so, because otherwise our definition of tangent vector would be incomplete and could not even handle a trivial example (such as those mentioned at the beginning). We shall show that, as desired, the set of tangent vectors to \(M\) at \(x\) (which we write \(T_xM\)) forms a vector space that is toplinearly isomorphic to \(\mathbf{E}\), on which \(M\) is modeled.

(Theorem 4) \(T_xM \simeq \mathbf{E}\). In other words, \(T_xM\) can be given the structure of a topological vector space by means of a chart.

*Proof.* Let \((U,\varphi)\)
be a chart at \(x\). For \(v \in \mathbf{E}\), the triple \((U,\varphi,v)\) represents a tangent
vector at \(x\), which we may write as \((\varphi^{-1})'(\varphi(x))(v)\). On the other hand,
pick \(\mathbf{w} \in T_xM\), which can
be represented by \((V,\psi,w)\). Then
\[
v=(\varphi\circ\psi^{-1})'(\psi(x))(w)
\] is the unique element of \(\mathbf{E}\) with \((U,\varphi,v) \sim
(V,\psi,w)\). To conclude, \[
T_xM \xrightarrow[{[(\varphi^{-1})'(\varphi(x))]^{-1}}]{\simeq}\mathbf{E}
\] which proves our theorem. Note that the resulting vector space structure on \(T_xM\) does not depend on
the choice of chart, since the derivatives of transition maps are toplinear isomorphisms. \(\square\)

For many reasons it is not a good idea to identify \(T_xM\) with \(\mathbf{E}\) without mentioning the point \(x\). For example, we shouldn't identify the tangent line of a curve with the \(x\)-axis. Instead, it would be better to identify or visualize \(T_xM\) as \((x,\mathbf{E})\), that is, a linear space with origin at \(x\).

## Tangent bundle

Now we treat *all* tangent spaces as a vector bundle. Let
\(M\) be a manifold of class \(C^p\) with \(p
\geq 1\), define the tangent bundle by the disjoint union \[
T(M)=\bigsqcup_{x \in M}T_xM.
\] This is a vector bundle if we define the projection by \[
\begin{aligned}
\pi: T(M) &\to M \\
y \in T_xM &\mapsto x
\end{aligned}
\] and we will verify it soon. First let's see an example. Below
is a visualization of the tangent bundle of \(\frac{x^2}{4}+\frac{y^2}{3}=1\), denoted by
red lines:

Also we can see \(\pi\) maps points on the blue line to a point on the curve, which is \(B\).

To show that the tangent bundle of a manifold is a vector bundle, we need to verify that it satisfies the three conditions we mentioned in the previous post. Let \((U,\varphi)\) be a chart of \(M\) such that \(\varphi(U)\) is open in \(\mathbf{E}\), then tangent vectors can be represented by \((U,\varphi,v)\). We get a bijection \[ \tau_U:\pi^{-1}(U) = T(U) \to U \times \mathbf{E} \] by definition of tangent vectors as equivalence classes. Let \(z_x\) be a tangent vector to \(U\) at \(x\), then there exists a unique \(v \in \mathbf{E}\) such that \((U,\varphi,v)\) represents \(z_x\). On the other hand, for any \(v \in \mathbf{E}\) and \(x \in U\), \((U,\varphi,v)\) represents some tangent vector at \(x\). Explicitly, \[ \tau_{U}(z_x)=(x,v)=(\pi(z_x),[(\varphi^{-1})'(\varphi(\pi(z_x)))]^{-1}(z_x)). \]

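Furthermore, \(\tau_U\) commutes with the projections onto \(U\), that is, \(pr\circ\tau_U=\pi\) on \(\pi^{-1}(U)\), where \(pr:U\times\mathbf{E}\to U\) is the projection onto the first factor (this establishes
**VB 1**).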

For **VB 2** and **VB 3** we need to check
different charts. Let \((U_i,\varphi_i)\), \((U_j,\varphi_j)\) be two charts. Define
\(\varphi_{ji}=\varphi_j \circ
\varphi_i^{-1}\) on \(\varphi_i(U_i
\cap U_j)\), and respectively we write \(\tau_{U_i}=\tau_i\) and \(\tau_{U_j}=\tau_j\). Identifying \(U_i \cap U_j\) with its images under the charts, we get a
transition mapping \[
\tau_{ji}=(\tau_j \circ \tau_i^{-1}):\varphi_i(U_i \cap U_j) \times \mathbf{E}
\to \varphi_j(U_i \cap U_j) \times \mathbf{E}.
\]

One can verify that \[
\tau_{ji}(x,v)=(\varphi_{ji}(x),D\varphi_{ji}(x) \cdot v)
\] for \(x \in \varphi_i(U_i \cap U_j)\)
and \(v \in \mathbf{E}\). Since \(D\varphi_{ji} \in C^{p-1}\) and \(D\varphi_{ji}(x)\) is a toplinear
isomorphism, we see \[
x \mapsto (\tau_j \circ
\tau_i^{-1})_x=D\varphi_{ji}(x) \in L(\mathbf{E},\mathbf{E})
\] is a morphism, which establishes **VB 3**. It
remains to verify **VB 2**. To do this we need a fact from
Banach space theory:

If \(f:U \to L(\mathbf{E},\mathbf{F})\) is a \(C^k\)-morphism, then the map of \(U \times \mathbf{E}\) into \(\mathbf{F}\) given by \[ (x,v) \mapsto [f(x)](v) \] is a \(C^k\)-morphism.

Here, we have \(f(x)=D\varphi_{ji}(x)\), the fiber part of \(\tau_{ji}(x,\cdot)\), and to conclude,
\(\tau_{ji}\) is a \(C^{p-1}\)-morphism. It is also an
isomorphism since it has an inverse \(\tau_{ij}\). Following the definition of
manifold, we can conclude that \(T(M)\)
has a unique **manifold structure** such that the \(\tau_i\) are morphisms (there will be a
formal proof in the next post about any total space of a vector bundle). By
**VB 1**, we also have \(\pi=pr\circ\tau_i\) on \(\pi^{-1}(U_i)\), which makes it a
morphism as well. On each fiber \(\pi^{-1}(x)\), we can freely transport the
topological vector space structure of \(\mathbf{E}\), for any \(U_i\) such that \(x\) lies in \(U_i\), by means of \(\tau_{ix}\). Since \(f(x)\) is a toplinear isomorphism, the
result is independent of the choice of \(U_i\). **VB 2** is therefore
established.
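The transition formula \(\tau_{ji}(x,v)=(\varphi_{ji}(x),D\varphi_{ji}(x)\cdot v)\) can be tried out on \(T(S^1)\). The sketch below assumes two hypothetical charts on the circle minus \((-1,0)\): the angle \(\theta\) as \(\varphi_i\) and the coordinate \(s=\tan(\theta/2)\) as \(\varphi_j\). Here \(x\) is read as the local coordinate of the point in the first chart, and the derivatives are written out by hand.

```python
import math

def phi_ji(theta):
    """Change of coordinates from the angle chart to the s-chart."""
    return math.tan(theta / 2)

def d_phi_ji(theta):
    """Derivative of phi_ji, a 1x1 toplinear isomorphism for |theta| < pi."""
    return 1 / (2 * math.cos(theta / 2) ** 2)

def phi_ij(s):
    """Inverse change of coordinates."""
    return 2 * math.atan(s)

def d_phi_ij(s):
    """Derivative of phi_ij."""
    return 2 / (1 + s * s)

def tau_ji(x, v):
    """Transition map of the tangent bundle: (x, v) -> (phi_ji(x), D phi_ji(x) v)."""
    return (phi_ji(x), d_phi_ji(x) * v)

def tau_ij(x, v):
    """The inverse transition map."""
    return (phi_ij(x), d_phi_ij(x) * v)

theta, v = 1.2, -0.5
s, w = tau_ji(theta, v)
theta_back, v_back = tau_ij(s, w)   # tau_ij o tau_ji should be the identity

# On each fiber the transition map is linear in v.
w_doubled = tau_ji(theta, 2 * v)[1]

print((s, w), (theta_back, v_back), w_doubled)
```

The round trip returns \((\theta,v)\), which reflects that \(\tau_{ij}\) is the inverse of \(\tau_{ji}\), and doubling \(v\) doubles the fiber component, which reflects linearity on fibers.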

In fancier words, we can also say that \(T:M \mapsto T(M)\) is a
**functor** from the category of \(C^p\)-manifolds to the category of vector
bundles of class \(C^{p-1}\).