In this post we prove the Mason-Stothers theorem, the so-called $abc$ theorem for polynomials, and derive Fermat's Last Theorem and Davenport's inequality for polynomials. These three theorems correspond to the $abc$ conjecture, Fermat's Last Theorem and Hall's conjecture in number theory.

In this post we study cyclotomic polynomials in field theory and deduce some basic properties of them. We will also use them to solve some problems in field theory.

In this post we give several forms of Maschke's theorem by studying the group algebra, which eventually becomes a study of semisimple rings. One can consider this post a chaotic evil introduction to representation theory or something.

You can find material on Dedekind domains (or Dedekind rings) in
*almost all* algebraic number theory books. But many of their properties
can be proved inside ring theory. I hope you can find the solution you
need in this post, which will not go further than elementary
ring theory. With that being said, you are assumed to have enough
knowledge of rings and rings of fractions (this post
serves well), but not too much mathematical maturity is assumed (at the
very least you are assumed to be familiar with the terminology in the
linked post).\(\def\mb{\mathbb}\) \(\def\mfk{\mathfrak}\)

There are several ways to define a Dedekind domain since there are several equivalent characterizations of it. We will start from the one based on the ring of fractions. As a friendly reminder, \(\mb{Z}\) or any principal ideal domain is already a Dedekind domain. In fact a Dedekind domain may be viewed as a generalization of a principal ideal domain.

Let \(\mfk{o}\) be an integral
domain (a.k.a. entire ring), and \(K\)
be its quotient field. A **Dedekind domain** is an integral
domain \(\mfk{o}\) such that the
fractional ideals form a group under multiplication. Let's have a
breakdown. By a **fractional ideal** \(\mfk{a}\) we mean a nontrivial additive
subgroup of \(K\) such that

- \(\mfk{o}\mfk{a}=\mfk{a}\),
- there exists some nonzero element \(c \in \mfk{o}\) such that \(c\mfk{a} \subset \mfk{o}\).

What does the group look like? As you may guess, the unit element is
\(\mfk{o}\). For a fractional ideal
\(\mfk{a}\), we have the inverse to be
another fractional ideal \(\mfk{b}\)
such that \(\mfk{ab}=\mfk{ba}=\mfk{o}\). Note we regard
\(\mfk{o}\) as a subring of \(K\). For \(a \in
\mfk{o}\), we treat it as \(a/1 \in
K\). This makes sense because the map \(i:a \mapsto a/1\) is injective. For the
existence of \(c\), you may consider it
as a restriction that the 'denominators' are *bounded*.
Alternatively, when \(\mfk{o}\) is Noetherian, a fractional ideal is the same thing as a nonzero finitely generated \(\mfk{o}\)-submodule of \(K\). But in this post it is not assumed
that you have learned module theory.

Let's take \(\mb{Z}\) as an example. The quotient field of \(\mb{Z}\) is \(\mb{Q}\). Fix a prime \(p\). We have a fractional ideal \(P\) consisting of all elements of the form \(\frac{np}{2}\) with \(n \in \mb{Z}\). Then indeed we have \(\mb{Z}P=P\). On the other hand, taking \(2 \in \mb{Z}\), we have \(2P \subset \mb{Z}\). For its inverse we can take the fractional ideal \(Q\) consisting of all elements of the form \(\frac{2n}{p}\) with \(n \in \mb{Z}\); indeed \(PQ\) is generated by \(\frac{p}{2}\cdot\frac{2}{p}=1\), so \(PQ=\mb{Z}\). As proved in algebraic number theory, the ring of algebraic integers in a number field is a Dedekind domain.
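The example above can be checked mechanically. The following is a minimal sketch, assuming we only deal with principal fractional ideals of \(\mathbb{Z}\), each represented by a positive rational generator; the function names `product`, `inverse` and `denominator_bound` are made up for this illustration.

```python
from fractions import Fraction

def product(q1: Fraction, q2: Fraction) -> Fraction:
    """Generator of (q1 Z)(q2 Z): the product of two principal
    fractional ideals is generated by the product of the generators."""
    return abs(q1 * q2)

def inverse(q: Fraction) -> Fraction:
    """Generator of the inverse fractional ideal of q Z (q nonzero)."""
    return abs(1 / q)

def denominator_bound(q: Fraction) -> int:
    """A nonzero integer c such that c * (q Z) is contained in Z."""
    return q.denominator

p = 5                    # any prime works; 5 is an arbitrary choice
P = Fraction(p, 2)       # P = (p/2) Z
Q = inverse(P)           # Q = (2/p) Z

print(Q)                         # 2/5
print(product(P, Q))             # 1, i.e. PQ = Z
print(denominator_bound(P) * P)  # 5, an integer, so 2P lies in Z
```

Here the unit element \(\mathbb{Z}\) corresponds to the generator \(1\), so the group law on these fractional ideals is just multiplication of positive rationals.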

Before we go on we need to clarify the definition of ideal multiplication. Let \(\mfk{a}\) and \(\mfk{b}\) be two ideals, we define \(\mfk{ab}\) to be the set of all sums

\[ x_1y_1+\cdots+x_ny_n \]

where \(x_i \in \mfk{a}\) and \(y_i \in \mfk{b}\). Here the number \(n\) is finite but not fixed. Alternatively we can say \(\mfk{ab}\) contains all finite sums of products of an element of \(\mfk{a}\) with an element of \(\mfk{b}\).

(Proposition 1)A Dedekind domain \(\mfk{o}\) is Noetherian.

By Noetherian ring we mean that every ideal in a ring is finitely generated. Precisely, we will prove that for every ideal \(\mfk{a} \subset \mfk{o}\) there are \(a_1,a_2,\cdots,a_n \in \mfk{a}\) such that, for every \(r \in \mfk{a}\), we have an expression

\[ r = c_1a_1 + c_2a_2 + \cdots + c_na_n \qquad c_1,c_2,\cdots,c_n \in \mfk{o}. \]

Also note that any ideal \(\mfk{a} \subset \mfk{o}\) can be viewed as a fractional ideal.

**Proof.** Let \(\mfk{a}\) be a nonzero ideal of \(\mfk{o}\) and let \(K\) be the quotient field of \(\mfk{o}\). Since \(\mfk{oa}=\mfk{a}\), we may view \(\mfk{a}\) as a fractional ideal. Since
\(\mfk{o}\) is a Dedekind domain, the
fractional ideals of \(\mfk{o}\) form a
group, so there is a fractional ideal \(\mfk{b}\) such that \(\mfk{ab}=\mfk{ba}=\mfk{o}\). Since \(1 \in \mfk{o}\), there
exist some \(a_1,a_2,\cdots, a_n \in \mfk{a}\) and \(b_1,b_2,\cdots,b_n \in \mfk{b}\) such that \(\sum_{i = 1}^{n}a_ib_i=1\). For any \(r \in \mfk{a}\), each \(rb_i\) lies in \(\mfk{ab}=\mfk{o}\), and we have an expression

\[ r = rb_1a_1+rb_2a_2+\cdots+rb_na_n. \]

On the other hand, any element of the form \(c_1a_1+c_2a_2+\cdots+c_na_n\), by definition, is an element of \(\mfk{a}\). \(\blacksquare\)

From now on, the inverse of a fractional ideal \(\mfk{a}\) will be written as \(\mfk{a}^{-1}\).

(Proposition 2)For ideals \(\mfk{a},\mfk{b} \subset \mfk{o}\), \(\mfk{b}\subset\mfk{a}\) if and only if there exists some \(\mfk{c}\) such that \(\mfk{ac}=\mfk{b}\) (or we simply say \(\mfk{a}|\mfk{b}\))

**Proof.** If \(\mfk{b}=\mfk{ac}\), simply note that \(\mfk{ac} \subset \mfk{a} \cap \mfk{c} \subset
\mfk{a}\). For the converse, suppose that \(\mfk{a} \supset \mfk{b}\), then \(\mfk{c}=\mfk{a}^{-1}\mfk{b}\) is an ideal
of \(\mfk{o}\) since \(\mfk{c}=\mfk{a}^{-1}\mfk{b} \subset
\mfk{a}^{-1}\mfk{a}=\mfk{o}\), hence we may write \(\mfk{b}=\mfk{a}\mfk{c}\). \(\blacksquare\)

(Proposition 3)If \(\mfk{a}\) is an ideal of \(\mfk{o}\), then there are prime ideals \(\mfk{p}_1,\mfk{p}_2,\cdots,\mfk{p}_n\) such that\[ \mfk{a}=\mfk{p}_1\mfk{p}_2\cdots\mfk{p}_n. \]

**Proof.** For this problem we use a classical
technique: contradiction on maximality. Suppose this is not true, and let
\(\mfk{U}\) be the set of proper nonzero ideals of
\(\mfk{o}\) that cannot be written as
a product of prime ideals. By assumption \(\mfk{U}\) is non-empty. Since, as we have
proved, \(\mfk{o}\) is Noetherian, we
can pick a maximal element \(\mfk{a}\)
of \(\mfk{U}\) with respect to
inclusion. If \(\mfk{a}\) is a maximal ideal of \(\mfk{o}\),
then since all maximal ideals are prime, \(\mfk{a}\) itself is prime, which is impossible for an element of \(\mfk{U}\). Otherwise \(\mfk{a}\) is properly contained in a maximal ideal
\(\mfk{m}\), and we write \(\mfk{a}=\mfk{m}\mfk{m}^{-1}\mfk{a}\), where \(\mfk{m}^{-1}\mfk{a} \subset \mfk{m}^{-1}\mfk{m}=\mfk{o}\) is an ideal of \(\mfk{o}\). We
have \(\mfk{m}^{-1}\mfk{a} \supset \mfk{a}\) because \(\mfk{o} \subset \mfk{m}^{-1}\), and the inclusion is proper since if not, we have \(\mfk{a}=\mfk{ma}\), which implies that
\(\mfk{m}=\mfk{o}\). But by maximality of \(\mfk{a}\),
\(\mfk{m}^{-1}\mfk{a}\not\in\mfk{U}\),
hence it is either \(\mfk{o}\) (so that \(\mfk{a}=\mfk{m}\), which we have excluded) or a product of prime ideals. Since \(\mfk{m}\) is prime as well, we obtain a prime
factorization for \(\mfk{a}\),
contradicting the definition of \(\mfk{U}\).

Next we show unicity up to a permutation. If

\[ \mfk{p}_1\mfk{p}_2\cdots\mfk{p}_k=\mfk{q}_1\mfk{q}_2\cdots\mfk{q}_j, \]

since \(\mfk{q}_1\mfk{q}_2\cdots\mfk{q}_j=\mfk{p}_1\mfk{p}_2\cdots\mfk{p}_k\subset\mfk{p}_1\) and \(\mfk{p}_1\) is prime, \(\mfk{p}_1\) contains some \(\mfk{q}_i\); after renumbering we may assume that \(\mfk{q}_1 \subset \mfk{p}_1\). By Proposition 2 we have \(\mfk{q}_1=\mfk{p}_1\mfk{r}_1\) for some ideal \(\mfk{r}_1\). However we also have \(\mfk{q}_1 \subset \mfk{r}_1\). Since \(\mfk{q}_1\) is prime and contains the product \(\mfk{p}_1\mfk{r}_1\), we either have \(\mfk{q}_1 \supset \mfk{p}_1\) or \(\mfk{q}_1 \supset \mfk{r}_1\). In the former case we get \(\mfk{p}_1=\mfk{q}_1\), and we finish the proof by cancelling \(\mfk{p}_1\) on both sides and continuing inductively. In the latter case we have \(\mfk{r}_1=\mfk{q}_1=\mfk{p}_1\mfk{q}_1\), which shows that \(\mfk{p}_1=\mfk{o}\), which is impossible. \(\blacksquare\)
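As an illustration, here is a standard worked example, which is not taken from this post. Assume \(\mathfrak{o}=\mathbb{Z}[\sqrt{-5}]\), the ring of integers of \(\mathbb{Q}(\sqrt{-5})\), which is therefore a Dedekind domain. Elements do not factor uniquely there, since \[ 6 = 2 \cdot 3 = (1+\sqrt{-5})(1-\sqrt{-5}), \] but ideals do. Writing \(\mathfrak{p}=(2,1+\sqrt{-5})\), \(\mathfrak{q}=(3,1+\sqrt{-5})\) and \(\mathfrak{q}'=(3,1-\sqrt{-5})\), one has \[ (2)=\mathfrak{p}^2, \quad (3)=\mathfrak{q}\mathfrak{q}', \quad (1+\sqrt{-5})=\mathfrak{p}\mathfrak{q}, \quad (1-\sqrt{-5})=\mathfrak{p}\mathfrak{q}', \] so both factorizations of the element \(6\) lead to the same prime factorization of the ideal: \[ (6)=\mathfrak{p}^2\mathfrak{q}\mathfrak{q}'. \]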

(Proposition 4)Every nontrivial prime ideal \(\mfk{p}\) is maximal.

**Proof.** Let \(\mfk{m}\) be a maximal ideal containing
\(\mfk{p}\). By proposition 2 we have
some \(\mfk{c}\) such that \(\mfk{p}=\mfk{mc}\). If \(\mfk{m} \neq \mfk{p}\), then \(\mfk{c} \neq \mfk{o}\), and we may write
\(\mfk{c}=\mfk{p}_1\cdots\mfk{p}_n\),
hence \(\mfk{p}=\mfk{m}\mfk{p}_1\cdots\mfk{p}_n\),
which is a prime factorisation, contradicting the fact that \(\mfk{p}\) has a unique prime factorisation,
which is \(\mfk{p}\) itself. Hence any
maximal ideal containing \(\mfk{p}\) is
\(\mfk{p}\) itself. \(\blacksquare\)

(Proposition 5)Suppose the Dedekind domain \(\mfk{o}\) only contains one prime (and maximal) ideal \(\mfk{p}\), let \(t \in \mfk{p}\) and \(t \not\in \mfk{p}^2\), then \(\mfk{p}\) is generated by \(t\).

**Proof.** Let \(\mfk{t}\) be the ideal generated by \(t\). By proposition 3 we have a
factorisation

\[ \mfk{t}=\mfk{p}^n \]

for some \(n \geq 1\) since \(\mfk{o}\) contains only one prime ideal. According to proposition 2, if \(n \geq 3\), we write \(\mfk{p}^n=\mfk{p}^2\mfk{p}^{n-2}\), and we see \(\mfk{p}^2 \supset \mfk{p}^n\). But this is impossible since if so we have \(t \in \mfk{p}^n \subset \mfk{p}^2\), contradicting our assumption. Hence \(0<n<3\). But if \(n=2\) we have \(t \in \mfk{p}^2\), which is also not possible. So \(\mfk{t}=\mfk{p}\) provided that such \(t\) exists.

For the existence of \(t\), note if not, then for all \(t \in \mfk{p}\) we have \(t \in \mfk{p}^2\), hence \(\mfk{p} \subset \mfk{p}^2\). On the other hand we already have \(\mfk{p}^2 = \mfk{p}\mfk{p}\), which implies that \(\mfk{p}^2 \subset \mfk{p}\) (proposition 2), hence \(\mfk{p}^2=\mfk{p}\), contradicting proposition 3. Hence such \(t\) exists and our proof is finished. \(\blacksquare\)

In fact there is another equivalent definition of Dedekind domain:

A domain \(\mfk{o}\) is Dedekind if and only if

- \(\mfk{o}\) is Noetherian.
- \(\mfk{o}\) is integrally closed.
- \(\mfk{o}\) has Krull dimension \(1\) (i.e. every non-zero prime ideal is maximal).

This is equivalent to saying that the fractional ideals form a group, and it is frequently used by mathematicians as well. But we need some more advanced techniques to establish the equivalence. Presumably there will be a post about this in the future.

First we recall some background. Suppose \(A\) is a ring with multiplicative identity
\(1_A\). A **left module**
over \(A\) is an additive abelian group
\((M,+)\), together with a scalar
multiplication \(A \times M \to M\) such
that \[
\begin{aligned}
(a+b)x &= ax+bx \\
a(x+y) &= ax+ay \\
a(bx) &= (ab)x \\
1_Ax &= x
\end{aligned}
\] for \(x,y \in M\) and \(a,b \in A\). As a corollary, we see \((0_A+0_A)x=0_Ax=0_Ax+0_Ax\), which shows
\(0_Ax=0_M\) for all \(x \in M\). On the other hand, \(a(x-x)=0_M\) which implies \(a(-x)=-(ax)\). We can also define right
\(A\)-modules but we are not discussing
them here.

Let \(S\) be a subset of \(M\). We say \(S\) is a **basis** of \(M\) if \(S\) generates \(M\) and \(S\) is linearly independent. That is, for
all \(m \in M\), we can pick \(s_1,\cdots,s_n \in S\) and \(a_1,\cdots,a_n \in A\) such that \[
m = a_1s_1+a_2s_2+\cdots+a_ns_n,
\] and, for any \(s_1,\cdots,s_n \in
S\), we have \[
a_1s_1+a_2s_2+\cdots+a_ns_n=0_M \implies a_1=a_2=\cdots=a_n=0_A.
\] Note this also shows that \(0_M\notin S\) (what happens if \(0_M \in S\)?). We say \(M\) is **free** if it has a
basis. The case when \(M\) or \(A\) is trivial is excluded.

If \(A\) is a field, then \(M\) is called a **vector
space**, which has no difference from the one we learn in linear
algebra and functional analysis. Mathematicians in functional analysis
may be interested in the cardinality of a vector space, for example,
when a vector space is of finite dimension, or when the basis is
countable. But the basis does not come from nowhere. In fact we can
prove that every vector space has a basis, but modules are not so lucky. \(\def\mb{\mathbb}\)

First of all let's consider the cyclic group \(\mb{Z}/n\mb{Z}\) for \(n \geq 2\). If we define \[
\begin{aligned}
\mb{Z} \times \mb{Z}/n\mb{Z} &\to \mb{Z}/n\mb{Z} \\
(m,k+n\mb{Z}) &\mapsto mk+n\mb{Z}
\end{aligned}
\] which is actually \(m\)
copies of an element, then we get a module, which will be denoted by
\(M\). For any \(x=k+n\mb{Z} \in M\), we see \(nx=nk+n\mb{Z}=0_M\). Therefore for
**any** non-empty subset \(S \subset
M\), if \(x_1,\cdots,x_k \in
S\), we have \[
nx_1+nx_2+\cdots+nx_k = 0_M,
\] which gives the fact that \(M\) has no basis. In fact this can be
generalized further. If \(A\) is a ring
but not a field, let \(I\) be a
nontrivial proper ideal, then \(A/I\)
is a module that has no basis.
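The argument for \(\mathbb{Z}/n\mathbb{Z}\) is easy to test by brute force. The sketch below is only an illustration with made-up helper names (`scalar_mul`, `killed_by_n`): it represents the class \(k+n\mathbb{Z}\) by the integer \(k \bmod n\) and checks that the nonzero scalar \(n\) sends every element to \(0\), so no single element, and hence no non-empty subset, is linearly independent.

```python
def scalar_mul(m: int, k: int, n: int) -> int:
    """Action of the integer m on the class k + nZ, as a representative in 0..n-1."""
    return (m * k) % n

def killed_by_n(n: int) -> bool:
    """True if n * x = 0 for every x in Z/nZ."""
    return all(scalar_mul(n, k, n) == 0 for k in range(n))

n = 6
# n is a nonzero scalar in Z, yet n * x = 0 for all x: a nontrivial relation.
print([scalar_mul(n, k, n) for k in range(n)])   # [0, 0, 0, 0, 0, 0]
print(killed_by_n(n))                            # True
```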

Following \(\mb{Z}/n\mb{Z}\) we also
have another example on finite order. Indeed, **any finite abelian
group is not free as a module over \(\mb{Z}\).** More generally,

Let \(G\) be an abelian group, and \(G_{tor}\) be its torsion subgroup. If \(G_{tor}\) is non-trivial, then \(G\) cannot be a free module over \(\mb{Z}\).

Next we shall take a look at infinite rings. Let \(F[X]\) be the polynomial ring over a field \(F\) and \(F'[X]\) be the subring of polynomials whose coefficient of \(X\) is equal to \(0\). Then \(F[X]\) is an \(F'[X]\)-module. However it is not free.

Suppose we have a basis \(S\) of \(F[X]\), then we claim that \(|S|>1\). If \(|S|=1\), say \(S=\{P\}\), then \(P\) cannot generate \(F[X]\): if \(P\) is constant then we cannot generate a polynomial whose coefficient of \(X\) is non-zero; if \(P\) is not constant, then the constant polynomials cannot be generated. Hence \(S\) contains at least two polynomials, say \(P_1 \neq 0\) and \(P_2 \neq 0\). However, note \(X^2P_1 \in F'[X]\) and \(X^2P_2 \in F'[X]\), which gives the nontrivial relation \[ (X^2P_2)P_1-(X^2P_1)P_2=0. \] Hence \(S\) cannot be a basis.
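To see the relation concretely, here is a small sketch that represents polynomials as coefficient lists `[c0, c1, c2, ...]` with integer coefficients. The two polynomials `P1` and `P2` are arbitrary choices for illustration, and the helper names are hypothetical.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_sub(p, q):
    """Difference p - q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def in_subring(p):
    """True if the coefficient of X is zero, i.e. p lies in F'[X]."""
    return len(p) < 2 or p[1] == 0

X2 = [0, 0, 1]          # X^2
P1 = [1, 1]             # 1 + X      (arbitrary example)
P2 = [2, 0, 3]          # 2 + 3X^2   (arbitrary example)

c1 = poly_mul(X2, P2)   # scalar in F'[X] multiplying P1
c2 = poly_mul(X2, P1)   # scalar in F'[X] multiplying P2
relation = poly_sub(poly_mul(c1, P1), poly_mul(c2, P2))

print(in_subring(c1), in_subring(c2))   # True True
print(relation)                         # [0, 0, 0, 0, 0, 0]
```

Both scalars are nonzero elements of \(F'[X]\), yet the combination vanishes, so \(P_1\) and \(P_2\) are linearly dependent over \(F'[X]\).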

I hope those examples have convinced you that basis is not a universal thing. We are going to prove that every vector space has a basis. More precisely,

Let \(V\) be a nontrivial vector space over a field \(K\). Let \(\Gamma\) be a set of generators of \(V\) over \(K\) and let \(S \subset \Gamma\) be a linearly independent subset. Then there exists a basis \(B\) of \(V\) such that \(S \subset B \subset \Gamma\).

Note we can always find such \(\Gamma\) and \(S\). For the extreme case, we can pick \(\Gamma=V\) and let \(S\) be a set containing any single non-zero element of \(V\). Note this also shows that we can obtain a basis by enlarging any linearly independent set. The proof relies on the fact that every non-zero element in a field is invertible, and also on Zorn's lemma. In fact, the axiom of choice is equivalent to the statement that every vector space has a basis. The converse can be found here. \(\def\mfk{\mathfrak}\)

*Proof.* Define \[
\mfk{T} =\{T \subset \Gamma:S \subset T, \text{ $T$ is linearly
independent}\}.
\] Then \(\mfk{T}\) is not empty
since it contains \(S\). If \(T_1 \subset T_2 \subset \cdots\) is a
totally ordered chain in \(\mfk{T}\),
then \(T=\bigcup_{i=1}^{\infty}T_i\) is
again linearly independent and contains \(S\). To show that \(T\) is linearly independent, note that if
\(x_1,x_2,\cdots,x_n \in T\), we can
find some \(k_1,\cdots,k_n\) such that
\(x_i \in T_{k_i}\) for \(i=1,2,\cdots,n\). If we pick \(k = \max(k_1,\cdots,k_n)\), then \[
x_1,x_2,\cdots,x_n \in \bigcup_{i=1}^{n}T_{k_i}=T_k.
\] But we already know that \(T_k\) is linearly independent, so \(a_1x_1+\cdots+a_nx_n=0_V\) implies \(a_1=\cdots=a_n=0_K\).

By Zorn's lemma, \(\mfk{T}\) has a maximal element \(B\), and \(B\) is linearly independent since it is an element of \(\mfk{T}\). Next we show that \(B\) generates \(V\). Suppose not. Since \(\Gamma\) generates \(V\), we can pick some \(x \in \Gamma\) that is not generated by \(B\). Define \(B'=B \cup \{x\}\). We see \(B'\) is linearly independent as well, because if we pick \(y_1,y_2,\cdots,y_n \in B\), and if \[ \sum_{k=1}^{n}a_ky_k+bx=0_V, \] then if \(b \neq 0\) we have \[ x = -\sum_{k=1}^{n}b^{-1}a_ky_k, \] so \(x\) is generated by \(B\), contradicting the choice of \(x\). Hence \(b=0_K\), and then \(a_1=\cdots=a_n=0_K\) because \(B\) is linearly independent. Therefore \(B'\) is a linearly independent set properly containing \(B\) and contained in \(\Gamma\), contradicting the maximality of \(B\) in \(\mfk{T}\). Hence \(B\) generates \(V\). \(\square\)
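In finite dimension no appeal to Zorn's lemma is needed: one can simply walk through \(\Gamma\) and add every vector that is not yet generated. The sketch below does this over \(\mathbb{Q}\), assuming \(\Gamma\) is a finite list of tuples; `rank` and `extend_to_basis` are illustrative names, and the rank computation by Gaussian elimination is just one convenient way to test independence.

```python
from fractions import Fraction

def rank(vectors):
    """Rank over Q of a list of equal-length vectors, by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def extend_to_basis(S, Gamma):
    """Greedily enlarge the independent list S by vectors taken from Gamma."""
    B = list(S)
    for x in Gamma:
        if rank(B + [x]) > len(B):   # x is not generated by B
            B.append(x)
    return B

S = [(1, 1, 0)]
Gamma = [(1, 1, 0), (2, 2, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)]
B = extend_to_basis(S, Gamma)
print(B)   # [(1, 1, 0), (0, 1, 0), (0, 0, 1)]
```

The result satisfies \(S \subset B \subset \Gamma\), mirroring the statement of the theorem in this small case.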

Let \(A\) be an abelian group. Let
\((e_i)_{i \in I}\) be a family of
elements of \(A\). We say that this
family is a **basis** for \(A\) if the family is not empty, and if
every element of \(A\) has a unique
expression as a **linear expression** \[
x = \sum_{i \in I} x_i e_i
\] where \(x_i \in \mathbb{Z}\)
and almost all \(x_i\) are equal to
\(0\). This means that the sum is
actually finite. An abelian group is said to be **free** if
it has a basis. Alternatively, we may write \(A\) as a direct sum by \[
A \cong \bigoplus_{i \in I}\mathbb{Z}e_i.
\]

Let \(S\) be a set. Say we want to
get a group out of it for some reason, so how? It is not a good idea
to endow \(S\) with a binary operation
beforehand since after all \(S\) is
merely a set. We shall **generate** a group out of \(S\) in the **freest**
way.

Let \(\mathbb{Z}\langle S \rangle\)
be the set of all **maps** \(\varphi:S \to \mathbb{Z}\) such that, for
only a **finite** number of \(x
\in S\), we have \(\varphi(x) \neq
0\). For simplicity, we denote \(k
\cdot x\) to be some \(\varphi_0 \in
\mathbb{Z}\langle S \rangle\) such that \(\varphi_0(x)=k\) but \(\varphi_0(y) = 0\) if \(y \neq x\). For any \(\varphi\), we claim that \(\varphi\) has a unique expression \[
\varphi=k_1 \cdot x_1 + k_2 \cdot x_2 + \cdots + k_n \cdot x_n.
\] One can consider these integers \(k_i\) as the multiplicity of \(x_i\), or simply the number of times that \(x_i\) appears (may be negative). For \(\varphi\in\mathbb{Z}\langle S \rangle\),
let \(I=\{x_1,x_2,\cdots,x_n\}\) be the
set of elements of \(S\) such that
\(\varphi(x_i) \neq 0\). If we denote
\(k_i=\varphi(x_i)\), we can show that
\(\psi=k_1 \cdot x_1 + k_2 \cdot x_2 + \cdots
+ k_n \cdot x_n\) is equal to \(\varphi\). For \(x \in I\), we have \(\psi(x)=k\) for some \(k=k_i\neq 0\) by definition of the '\(\cdot\)'; if \(y
\notin I\) however, we then have \(\psi(y)=0\). This coincides with \(\varphi\). \(\blacksquare\)

By definition the zero map \(\mathcal{O}=0 \cdot x \in \mathbb{Z}\langle S \rangle\) and therefore we may write any \(\varphi\) by \[ \varphi=\sum_{x \in S}k_x\cdot x \] where \(k_x \in \mathbb{Z}\) and can be zero. Suppose now we have two expressions, for example \[ \varphi=\sum_{x \in S}k_x \cdot x=\sum_{x \in S}k_x'\cdot x \] Then \[ \varphi-\varphi=\mathcal{O}=\sum_{x \in S}(k_x-k'_x)\cdot x \] Suppose \(k_y - k_y' \neq 0\) for some \(y \in S\), then \[ \mathcal{O}(y)=k_y-k_y'\neq 0 \] which is a contradiction. Therefore the expression is unique. \(\blacksquare\)

This \(\mathbb{Z}\langle S \rangle\)
is what we are looking for. It is an additive group (which can be proved
immediately) and, what is more important, every element can be expressed
as a 'sum' associated with a finite number of elements of \(S\). We shall write \(F_{ab}(S)=\mathbb{Z}\langle S \rangle\),
and call it the **free abelian group generated by \(S\)**. For elements in \(S\), we say they are **free
generators** of \(F_{ab}(S)\).
If \(S\) is a finite set, we say \(F_{ab}(S)\) is **finitely
generated**.
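A finitely supported map \(S \to \mathbb{Z}\) is naturally modelled by a dictionary that stores only the nonzero values. The following is a minimal sketch under that assumption; the names `unit`, `add` and `neg` are chosen for this illustration only.

```python
def unit(x, k=1):
    """The map k . x: it sends x to k and everything else to 0."""
    return {x: k} if k != 0 else {}

def add(phi, psi):
    """Pointwise sum of two finitely supported maps S -> Z."""
    out = dict(phi)
    for x, k in psi.items():
        out[x] = out.get(x, 0) + k
        if out[x] == 0:
            del out[x]        # store nonzero values only
    return out

def neg(phi):
    """Additive inverse of phi."""
    return {x: -k for x, k in phi.items()}

phi = add(unit("a", 2), unit("b", -3))   # 2 . a + (-3) . b
psi = add(unit("b", 3), unit("c", 1))    # 3 . b + 1 . c

print(add(phi, psi))        # {'a': 2, 'c': 1}
print(add(phi, neg(phi)))   # {}, the zero map
```

Because only nonzero values are stored, two dictionaries are equal exactly when the maps are equal, which reflects the uniqueness of the expression \(k_1 \cdot x_1 + \cdots + k_n \cdot x_n\).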

An abelian group is free if and only if it is isomorphic to a free abelian group \(F_{ab}(S)\) for some set \(S\).

**Proof.** First we shall show that \(F_{ab}(S)\) is free. For \(x \in S\), we denote \(\varphi = 1 \cdot x\) by \([x]\). Then for any \(k \in \mathbb{Z}\), we have \(k[x]=k \cdot x\) and \(k[x]+k'[y] = k\cdot x + k' \cdot y\). By definition of \(F_{ab}(S)\), any element \(\varphi \in F_{ab}(S)\) has a unique
expression \[
\varphi = k_1 \cdot x_1 + \cdots + k_n \cdot x_n
=k_1[x_1]+\cdots+k_n[x_n]
\] Therefore \(F_{ab}(S)\) is
free since we have found the basis \(([x])_{x
\in S}\).

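Conversely, if \(A\) is free with basis \((e_i)_{i \in I}\), then the map \(F_{ab}(I) \to A\) sending \(\sum_{i \in I} k_i \cdot i\) to \(\sum_{i \in I} k_ie_i\) is a homomorphism, and it is bijective because every element of \(A\) has a unique expression in terms of the basis. Hence \(A \cong F_{ab}(I)\), and our statement is proved. \(\blacksquare\)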

(Proposition 1)If \(A\) is an abelian group, then there is a free abelian group \(F\) which has a subgroup \(H\) such that \(A \cong F/H\).

**Proof.** Let \(S\) be
any set with a surjective map \(\gamma: S \to A\)
(for example \(S=A\) with the identity map). Then we get
a free abelian group \(F_{ab}(S)\). We also get
a unique homomorphism \(\gamma_\ast:F_{ab}(S)
\to A\) by \[
\begin{aligned}
\gamma_\ast:F_{ab}(S) &\to A \\
\varphi=\sum_{x \in S}k_x\cdot x &\mapsto \sum_{x \in S}k_x\gamma(x)
\end{aligned}
\] which is also surjective. By the first isomorphism theorem, if
we set \(H=\ker(\gamma_\ast)\) and
\(F_{ab}(S)=F\), then \[
F/H \cong A.
\] \(\blacksquare\)

(Proposition 2)If \(A\) is finitely generated, then \(F\) can also be chosen to be finitely generated.

**Proof.** Let \(S\) be
a finite set of generators of \(A\), and let \(S'\) be a set admitting a surjective map \(\lambda:S' \to S\). Note that since \(S\) is finite, \(S'\) can also be chosen finite, for example by adding
one or any finite number of elements to \(S\). We have maps from \(S\) and \(S'\) into \(F_{ab}(S)\) and \(F_{ab}(S')\) respectively, given by \(f_S(x)=1 \cdot x\) and \(f_{S'}(x')=1 \cdot x'\). Defining
\(g=f_{S} \circ \lambda:S' \to
F_{ab}(S)\), we get another homomorphism by \[
\begin{aligned}
g_\ast:F_{ab}(S') &\to F_{ab}(S) \\
\varphi'=\sum_{x \in S'}k_{x} \cdot x &\mapsto \sum_{x \in
S'}k_{x}\cdot g(x)
\end{aligned}
\] This is the unique homomorphism such that \(g_\ast \circ f_{S'} = g\). As one can
verify, it is surjective because \(\lambda\) is. Composing it with the surjective homomorphism \(\gamma_\ast:F_{ab}(S) \to A\) induced by the inclusion \(S \to A\) as in Proposition 1 (surjective because \(S\) generates \(A\)), the first
isomorphism theorem gives \[
A \cong F_{ab}(S)/\ker(\gamma_\ast) \cong F_{ab}(S')/\ker(\gamma_\ast \circ g_\ast),
\] where both \(F_{ab}(S)\) and \(F_{ab}(S')\) are finitely generated. \(\blacksquare\)

It's worth mentioning separately that we have implicitly proved two statements with commutative diagrams:

(Proposition 3 | Universal property)If \(g:S \to B\) is a mapping of \(S\) into some abelian group \(B\), then there is a unique group-homomorphism \(g_\ast:F_{ab}(S) \to B\) such that \(g_\ast \circ f_S = g\), where \(f_S(x)=1 \cdot x\).

(Proposition 4)If \(\lambda:S \to S'\) is a mapping of sets, there is a unique homomorphism \(\overline{\lambda}:F_{ab}(S) \to F_{ab}(S')\) such that \(\overline{\lambda} \circ f_S = f_{S'} \circ \lambda\).

(In the proof of Proposition 2 we exchanged \(S\) and \(S'\).)

(The Grothendieck group)Let \(M\) be a commutative monoid written additively. We shall prove that there exists a commutative group \(K(M)\) with a monoid homomorphism \[ \gamma:M \to K(M) \]satisfying the following universal property: If \(f:M \to A\) is a homomorphism from \(M\) into an abelian group \(A\), then there exists a unique homomorphism \(f_\gamma:K(M) \to A\) such that \(f=f_\gamma\circ\gamma\).

**Proof.** Let \(F_{ab}(M)\) be the free abelian group generated by \(M\). For \(x \in M\), we denote \(1 \cdot x \in F_{ab}(M)\) by \([x]\). Let \(B\) be the subgroup of \(F_{ab}(M)\) generated by all elements of the type \[ [x+y]-[x]-[y] \] where \(x,y \in M\). We let \(K(M)=F_{ab}(M)/B\). Let \(i:x \mapsto [x]\) and let \(\pi\) be the canonical map \[ \pi:F_{ab}(M) \to F_{ab}(M)/B. \] We define \(\gamma= \pi \circ i\) and verify that \(\gamma\) is our desired homomorphism satisfying the universal property. For \(x,y \in M\), we have \(\gamma(x+y)=\pi([x+y])\) and \(\gamma(x)+\gamma(y) = \pi([x])+\pi([y])=\pi([x]+[y])\). However we have \[ [x+y]-[x]-[y] \in B, \] which implies that \[ \gamma(x)+\gamma(y)=\pi([x]+[y])=\pi([x+y]) = \gamma(x+y). \] Hence \(\gamma\) is a monoid-homomorphism. Finally, the universal property. By Proposition 3, we have a unique homomorphism \(f_\ast:F_{ab}(M) \to A\) such that \(f_\ast \circ i = f\). Since \(f\) is a monoid-homomorphism, \[ f_\ast([x+y]-[x]-[y])=f(x+y)-f(x)-f(y)=0, \] so \(f_\ast(y) =0\) for all \(y \in B\), that is, \(B \subset \ker{f_\ast}\). Therefore \(f_\gamma(x+B)=f_\ast (x)\) is well defined and satisfies \(f_\gamma \circ \gamma = f\). It is unique because the elements \(\gamma(x)\) generate \(K(M)\). \(\blacksquare\)

Why such a \(B\)? Note in general
\([x+y]\) is not necessarily equal to
\([x]+[y]\) in \(F_{ab}(M)\), but we don't want it to be so.
So instead we create a new **equivalence relation**, by
factoring a subgroup generated by \([x+y]-[x]-[y]\). Therefore in \(K(M)\) we see \([x+y]+B = [x]+[y]+B\), which finally makes
\(\gamma\) a homomorphism. We use the
same strategy to generate the **tensor product** of two
modules later. But at that time we have more than one relation to take
care of.

If for all \(x,y,z \in M\), \(x+y=x+z\) implies \(y=z\), then we say \(M\) is a cancellative monoid, or the cancellation law holds in \(M\). Note for the proof above we didn't use any property of cancellation. However we still have an interesting property for cancellation law.

(Theorem)The cancellation law holds in \(M\) if and only if \(\gamma\) is injective.

**Proof.** This proof involves another approach to the
Grothendieck group. We consider pairs \((x,y)
\in M \times M\) with \(x,y \in
M\). Define \[
(x,y) \sim (x',y') \iff \exists \ell \in M,
y+x'+\ell=x+y'+\ell.
\] Then we get an equivalence relation (try to prove it
yourself!). We define the addition component-wise, that is, \((x,y)+(x',y')=(x+x',y+y')\),
then the equivalence classes of pairs form a group \(A\), where the zero element is \([(0,0)]\). We have a monoid-homomorphism
\[
f:x \mapsto [(x,0)].
\] If cancellation law holds in \(M\), then \[
\begin{aligned}
f(x) = f(y) &\implies [(x,0)] = [(y,0)] \\
&\implies 0+y+\ell=x+0+\ell \\
&\implies x=y.
\end{aligned}
\] Hence \(f\) is injective. By
the universal property of the Grothendieck group, we get a unique
homomorphism \(f_\gamma\) such that
\(f_\gamma \circ \gamma = f\). If \(\gamma(x)=\gamma(y)\) for \(x,y \in M\), then \(f(x)=f_\gamma(\gamma(x))=f_\gamma(\gamma(y))=f(y)\), which implies \(x=y\) since
\(f\) is injective. Hence \(\gamma\) is injective.

Conversely, suppose \(\gamma\) is injective and \(x+\ell = y+\ell\) for some \(x,y,\ell \in M\). Since \(\gamma\) is a monoid-homomorphism, \(\gamma(x)+\gamma(\ell)=\gamma(y)+\gamma(\ell)\) in the group \(K(M)\), and cancelling \(\gamma(\ell)\) there gives \(\gamma(x)=\gamma(y)\). By injectivity \(x=y\), so the cancellation law holds in \(M\). \(\blacksquare\)

Our first example is the additive monoid \(\mathbb{N}\). Elements of \(F_{ab}(\mathbb{N})\) are of the form \[ \varphi=k_1 \cdot n_1 + k_2 \cdot n_2+\cdots + k_m \cdot n_m. \] The elements of \(B\) are generated by \[ \varphi=1\cdot (m+n)-1\cdot m - 1\cdot n \] which we wish to represent \(0\). Indeed, \(K(\mathbb{N}) \simeq \mathbb{Z}\) via the homomorphism \[ \begin{aligned} f:K(\mathbb{N}) &\to \mathbb{Z} \\ \sum_{j=1}^{m}k_j \cdot n_j +B &\mapsto \sum_{j=1}^{m}k_j n_j. \end{aligned} \] This map is well defined since the generators of \(B\) are sent to \((m+n)-m-n=0\). For \(r \in \mathbb{N}\), we see \(f(1 \cdot r+B)=r\) and \(f((-1) \cdot r+B)=-r\), so \(f\) is surjective. On the other hand, one can check that if \(\sum_{j=1}^{m}k_j \cdot n_j \not\in B\), then its image under \(f\) is not \(0\), so \(f\) is injective.
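The same group can be explored with the pair model used in the proof of the theorem above, where the class of \((x,y)\) plays the role of the formal difference \(x-y\). The sketch below assumes that model for \(\mathbb{N}\); since \(\mathbb{N}\) is cancellative, the auxiliary element \(\ell\) in the equivalence relation can be dropped. All function names are illustrative.

```python
def equivalent(p, q):
    """(x, y) ~ (x', y') iff y + x' == x + y' (no extra element l is
    needed here because cancellation holds in N)."""
    (x, y), (x2, y2) = p, q
    return y + x2 == x + y2

def add(p, q):
    """Component-wise addition of pairs."""
    return (p[0] + q[0], p[1] + q[1])

def gamma(x):
    """The canonical map N -> K(N), x |-> class of (x, 0)."""
    return (x, 0)

def to_integer(p):
    """The isomorphism K(N) -> Z, sending the class of (x, y) to x - y."""
    return p[0] - p[1]

print(equivalent((3, 5), (0, 2)))                 # True: both represent -2
print(to_integer(add((3, 5), (4, 1))))            # 1, since (-2) + 3 = 1
print(equivalent(add((3, 5), (5, 3)), (0, 0)))    # True: (5, 3) inverts (3, 5)
```

Swapping the two components gives the inverse of a class, which is exactly the 'subtraction' that \(\mathbb{N}\) lacked.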

In the first example we 'granted' the natural numbers 'subtraction'. Next we grant the division on multiplicative monoid.

Consider \(M=\mathbb{Z} \setminus 0\) as a multiplicative monoid. Now for \(F_{ab}(M)\) we write elements in the form \[ \varphi={}^{k_1}n_1{}^{k_2}n_2\cdots{}^{k_m}n_m \] which denotes that \(\varphi(n_j)=k_j\) and has no other differences. Then the elements of \(B\) are generated by \[ \varphi = {}^1(n_1n_2)\,{}^{-1}(n_1)\,{}^{-1}(n_2) \] which we wish to represent \(1\). Then we see \(K(M) \simeq \mathbb{Q} \setminus 0\) if we take the isomorphism \[ \begin{aligned} f:K(M) &\to \mathbb{Q} \setminus 0 \\ \left(\prod_{j=1}^{m}{}^{k_j}n_j\right)B &\mapsto \prod_{j=1}^{m}n_j^{k_j}. \end{aligned} \]

Of course this is not the end of the Grothendieck group. But for further examples we may need a lot of topology background. For example, we have the topological \(K\)-theory group of a topological space to be the Grothendieck group of isomorphism classes of topological vector bundles. But I think it is not a good idea to post these examples at this point.

We begin our study with some elementary Calculus. Take the function \(f(x)=x^2+\frac{e^x}{x^2+1}\) as our example. It should not be a problem to find its tangent line at the point \(A(0,1)\): calculating its derivative gives \(f'(0)=1\), so we have \(l:x-y+1=0\) as the tangent line.
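The slope can be double-checked numerically. This is only a sanity check, assuming a central difference with step `h = 1e-6` is accurate enough here; `derivative` is a helper written for this illustration.

```python
import math

def f(x):
    return x**2 + math.exp(x) / (x**2 + 1)

def derivative(g, x, h=1e-6):
    """Central-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

x0 = 0.0
slope = derivative(f, x0)            # approximately 1
intercept = f(x0) - slope * x0       # approximately 1

# Tangent line y = slope * x + intercept, i.e. x - y + 1 = 0.
print(round(slope, 6), round(intercept, 6))
```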

\(l\) is not a vector space since it
does not pass through the origin. But \(l-\overrightarrow{OA}\) is a vector space.
In general, suppose \(P(x,y)\) is a
point on the curve determined by \(f\),
i.e. \(y=f(x)\), then we obtain a
vector space \(l_p-\overrightarrow{OP} \simeq
\mathbb{R}\). But the action of moving the tangent line to the
origin is superfluous so naturally we consider the tangent line at \(P\) as a vector space
**determined** by \(P\).
In this case, the induced vector space (tangent line) is always of
dimension \(1\).

Now we move to two-variable functions. We have a function \(a(x,y)=x^2+y^2-x-y+xy\) as our example.
Some elementary Calculus gives us the tangent plane of \(z=a(x,y)\) at \(A(1,1,1)\), which can be identified by
\(S:2x+2y-z=3\simeq\mathbb{R}^2\).
Again, this can be considered as a vector space
**determined** by \(A\),
or roughly speaking it is one if we take \(A\) as the origin. Further we have a base
\((\overrightarrow{AB},\overrightarrow{AC})\).
Other vectors on \(S\), for example
\(\overrightarrow{AD}\), can be written
as a linear combination of \(\overrightarrow{AB}\) and \(\overrightarrow{AC}\). In other words,
\(S\) is "spanned" by \((\overrightarrow{AB},\overrightarrow{AC})\).
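The equation of \(S\) can be recovered from the two partial derivatives at \((1,1)\). The sketch below approximates them by central differences; the helper `partials` and the two direction vectors at the end are choices made for this illustration and are not necessarily the vectors \(\overrightarrow{AB}\) and \(\overrightarrow{AC}\) above.

```python
def a(x, y):
    return x**2 + y**2 - x - y + x * y

def partials(g, x, y, h=1e-6):
    """Central-difference approximations of the two partial derivatives."""
    gx = (g(x + h, y) - g(x - h, y)) / (2 * h)
    gy = (g(x, y + h) - g(x, y - h)) / (2 * h)
    return gx, gy

x0, y0 = 1.0, 1.0
z0 = a(x0, y0)                       # 1.0
ax, ay = partials(a, x0, y0)         # both approximately 2

# Plane: z - z0 = ax (x - x0) + ay (y - y0), i.e. ax x + ay y - z = rhs.
rhs = ax * x0 + ay * y0 - z0         # approximately 3

# Two independent directions lying in the plane, based at A.
u = (1.0, 0.0, ax)
v = (0.0, 1.0, ay)
print(round(ax, 6), round(ay, 6), round(rhs, 6))
```

Any vector based at \(A\) and lying in \(S\) is a linear combination of `u` and `v`, which is the sense in which the plane is two-dimensional.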

Tangent lines and tangent planes play an important role in
differentiation. But sometimes we do not have a chance to use them with
ease, for example \(S^1:x^2+y^2=1\)
cannot be represented by a single-variable function. However the
implicit function theorem, which you have already learned in Calculus,
gives us a chance to find a satisfying function locally. Here in this
post we will try to generalize this concept, trying to find the tangent
**space** at some point of a manifold. (The two examples
above have already determined two manifolds and two tangent spaces.)

We will introduce the abstract definition of a tangent vector at the beginning. You may think it is way too abstract but actually it is not. Surprisingly, the following definition can simplify our work in the future. But before we go, make sure that you have learned about the Fréchet derivative (along with some functional analysis).

Let \(M\) be a manifold of class
\(C^p\) with \(p \geq 1\) and let \(x\) be a point of \(M\). Let \((U,\varphi)\) be a chart at \(x\) and \(v\) be an element of the vector space \(\mathbf{E}\) where \(\varphi(U)\) lies (for example, if \(M\) is a \(d\)-dimensional manifold, then \(v \in \mathbb{R}^d\)). Next we consider the
triple \((U,\varphi,v)\). Suppose \((U,\varphi,v)\) and \((V,\psi,w)\) are two such triples. We say
these two triples are **equivalent** if the following
identity holds: \[
{\color{green}{[}}{\color{red}{(\psi\circ\varphi^{-1})'(}}{\color{purple}{\varphi(x)}}{\color{red}{)}}{\color{green}{]}}(v)=w.
\] This identity looks messy so we need to explain how to read
it. First we consider the function in red: the derivative of \(\psi\circ\varphi^{-1}\). The derivative of
\(\psi\circ\varphi^{-1}\) at point
\(\varphi(x)\) (in purple) is a linear
transform, and the transform is embraced with green brackets. Finally,
this linear transform maps \(v\) to
\(w\). In short we read: the derivative
of \(\psi\circ\varphi^{-1}\) at \(\varphi(x)\) maps \(v\) to \(w\). You may recall that you have met
something like \(\psi\circ\varphi^{-1}\) in the definition
of a manifold. It may seem unlikely that these 'triples' should be associated
with tangent vectors. But before we explain it, we need to make sure that
we have indeed defined an equivalence relation.

(Theorem 1)The relation \[ (U,\varphi,v) \sim (V,\psi,w) \iff [(\psi\circ\varphi^{-1})'(\varphi(x))](v)=w \] is an equivalence relation.

*Proof.* This will not go further than elementary Calculus, in
fact, chain rule:

(Chain rule)If \(f:U \to V\) is differentiable at \(x_0 \in U\), if \(g: V \to W\) is differentiable at \(f(x_0)\), then \(g \circ f\) is differentiable at \(x_0\), and \[ (g\circ f)'(x_0)=g'(f(x_0))\circ f'(x_0) \]

- \((U,\varphi,v)\sim(U,\varphi,v)\).

Since \(\varphi\circ\varphi^{-1}=\operatorname{id}\), whose derivative is still the identity everywhere, we have \[ [(\varphi\circ\varphi^{-1})'(\varphi(x))](v)=\operatorname{id}(v)=v \]

- If \((U,\varphi,v) \sim (V,\psi,w)\), then \((V,\psi,w)\sim(U,\varphi,v)\).

So now we have \[ [(\psi\circ\varphi^{-1})'(\varphi(x))](v)=w. \] To prove that \([(\varphi\circ\psi^{-1})'(\psi(x))]{}(w)=v\), we need some implementation of chain rule.

Note first \[ (\psi\circ\varphi^{-1})'(\varphi(x))=\psi'(\varphi^{-1}(\varphi(x)))\circ(\varphi^{-1})'(\varphi(x))=\psi'(x)\circ(\varphi^{-1})'(\varphi(x)) \] while \[ (\varphi\circ\psi^{-1})'(\psi(x))=\varphi'(x)\circ(\psi^{-1})'(\psi(x)). \] But also by the chain rule, if \(f\) is a diffeomorphism, we have \[ (f^{-1}\circ f)'(x)=(f^{-1})'(f(x))\circ f'(x)=\operatorname{id} \] or equivalently \[ f'(x)=[(f^{-1})'(f(x))]^{-1} \quad\text{and}\quad (f^{-1})'(f(x))=[f'(x)]^{-1}. \]

Therefore \[ \begin{aligned} \{(\psi\circ\varphi^{-1})'(\varphi(x))\}^{-1} &=\{\psi'(x)\circ(\varphi^{-1})'(\varphi(x))\}^{-1} \\ &=\{(\varphi^{-1})'(\varphi(x))\}^{-1}\circ\{\psi'(x)\}^{-1} \\ &=\varphi'(x)\circ(\psi^{-1})'(\psi(x)) \\ &=(\varphi\circ\psi^{-1})'(\psi(x)) \end{aligned} \] which implies \[ (\varphi\circ\psi^{-1})'(\psi(x))(w)=\{(\psi\circ\varphi^{-1})'(\varphi(x))\}^{-1}(w)=v. \]

- If \((U,\varphi,v)\sim(V,\psi,w)\) and \((V,\psi,w)\sim(W,\lambda,z)\), then \((U,\varphi,v)\sim(W,\lambda,z)\).

We are given identities \[ [(\psi\circ\varphi^{-1})'(\varphi(x))](v)=w \] and \[ [(\lambda\circ\psi^{-1})'(\psi(x))](w)=z. \] Substituting the first identity into the second, we get \[ \begin{aligned} z = [(\lambda\circ\psi^{-1})'(\psi(x))] \circ [(\psi\circ\varphi^{-1})'(\varphi(x))] (v). \end{aligned} \] On the other hand, \[ \begin{aligned} (\lambda\circ\varphi^{-1})'(\varphi(x))&=(\lambda\circ\psi^{-1}\circ\psi\circ\varphi^{-1})'(\varphi(x)) \\ &=(\lambda\circ\psi^{-1})'(\psi\circ\varphi^{-1}\circ\varphi(x))\circ(\psi\circ\varphi^{-1})'(\varphi(x)) \\ &=(\lambda\circ\psi^{-1})'(\psi(x))\circ(\psi\circ\varphi^{-1})'(\varphi(x)) \end{aligned} \] so \([(\lambda\circ\varphi^{-1})'(\varphi(x))](v)=z\), which is what we needed. \(\square\)
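The three properties can also be sanity-checked numerically in the simplest possible setting. The sketch below is only an illustration under stated assumptions: the manifold is \(M=\mathbb{R}\), and the three global charts `phi`, `psi`, `lam` are hypothetical choices that do not appear in the text. In one dimension every linear map is multiplication by a number, so the transition derivative is just a quotient of derivatives.

```python
import math

# Hypothetical global charts on M = R: name -> (chart, derivative of the chart).
CHARTS = {
    "phi": (lambda x: x, lambda x: 1.0),
    "psi": (lambda x: x**3 + x, lambda x: 3 * x**2 + 1),
    "lam": (math.exp, math.exp),
}

def transition_derivative(src, dst, x):
    """(dst o src^{-1})'(src(x)); in one dimension the chain rule gives dst'(x) / src'(x)."""
    return CHARTS[dst][1](x) / CHARTS[src][1](x)

def equivalent(src, v, dst, w, x, tol=1e-12):
    """Test (U, src, v) ~ (V, dst, w) at the point x, up to rounding error."""
    return abs(transition_derivative(src, dst, x) * v - w) <= tol * max(1.0, abs(w))

x, v = 0.7, 2.0
w = transition_derivative("phi", "psi", x) * v  # chosen so that (phi, v) ~ (psi, w)
z = transition_derivative("psi", "lam", x) * w  # chosen so that (psi, w) ~ (lam, z)

print(equivalent("phi", v, "phi", v, x))  # reflexivity
print(equivalent("psi", w, "phi", v, x))  # symmetry
print(equivalent("phi", v, "lam", z, x))  # transitivity
```

A numerical check like this proves nothing, of course; it only confirms that the identities in the proof compose the way the chain rule says they should.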

An **equivalence class** of such triples \((U,\varphi,v)\) is called a **tangent
vector** of \(M\) at \(x\). The set of such tangent vectors is
called the **tangent space** to \(M\) at \(x\), which is denoted by \(T_x(M)\). But it seems that we have gone
too far. Is the triple even a 'vector'? To get a clear view let's see
Euclidean submanifolds first.

Suppose \(M\) is a submanifold of \(\mathbb{R}^n\). We say \(z\) is a **tangent vector** of \(M\) at point \(x\) if there exists a curve \(\alpha\) of class \(C^1\), defined on an open interval \(I \subset \mathbb{R}\) with \(\alpha(I) \subset M\), such that \(\alpha(t_0)=x\) and \(\alpha'(t_0)=z\) for some \(t_0 \in I\). (For convenience we often take \(t_0=0\).)

This definition becomes intuitive once we check an example. For the curve \(M: x^2+1+\frac{e^x}{x^2+1}-y=0\), we can show that \((1,1)^T\) is a tangent vector of \(M\) at \((0,2)\), which is identical to our first example. Taking \[ \alpha(t)=\left(t,t^2+1+\frac{e^t}{t^2+1}\right) \] we get \(\alpha(0)=(0,2)\) and \[ \alpha'(t)=\left(1,2t+\frac{e^t(t-1)^2}{(t^2+1)^2}\right)^T. \] Therefore \(\alpha'(0)=(1,1)^T\). \(\square\)
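The computation above is easy to double-check by machine. The following sketch uses a central finite difference (an approximation, not a symbolic derivative); the helper names `on_curve` and `velocity` are made up for this illustration.

```python
import math

def alpha(t):
    """The curve from the example, t -> (t, t^2 + 1 + e^t / (t^2 + 1))."""
    return (t, t**2 + 1 + math.exp(t) / (t**2 + 1))

def on_curve(p, tol=1e-12):
    """Check that p = (x, y) satisfies x^2 + 1 + e^x / (x^2 + 1) - y = 0."""
    x, y = p
    return abs(x**2 + 1 + math.exp(x) / (x**2 + 1) - y) < tol

def velocity(curve, t0, h=1e-6):
    """Central-difference approximation of curve'(t0)."""
    a, b = curve(t0 - h), curve(t0 + h)
    return tuple((q - p) / (2 * h) for p, q in zip(a, b))

print(alpha(0.0))            # base point of the tangent vector
print(on_curve(alpha(0.3)))  # the curve stays on M
print(velocity(alpha, 0.0))  # approximately (1, 1)
```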

Let \(\mathbf{E}\) and \(\mathbf{F}\) be two Banach spaces and \(U\) an open subset of \(\mathbf{E}\). A \(C^p\) map \(f: U \to \mathbf{F}\) is called an **immersion** at \(x\) if \(f'(x)\) is injective.

For example, if we take \(\mathbf{E}=\mathbf{F}=\mathbb{R}=U\) and \(f(x)=x^2\), then \(f\) is an immersion at every point of \(\mathbb{R}\) except \(0\), since \(f'(0)=0\) is not injective. This may lead you to Sard's theorem.

(Theorem 2)Let \(M\) be a subset of \(\mathbb{R}^n\). Then \(M\) is a \(d\)-dimensional \(C^p\) submanifold of \(\mathbb{R}^n\) if and only if for every \(x \in M\) there exist an open neighborhood \(U \subset \mathbb{R}^n\) of \(x\), an open neighborhood \(\Omega \subset \mathbb{R}^d\) of \(0\) and a \(C^p\) map \(g: \Omega \to \mathbb{R}^n\) such that \(g\) is an immersion at \(0\) with \(g(0)=x\), and \(g\) is a homeomorphism between \(\Omega\) and \(M \cap U\) with the topology induced from \(\mathbb{R}^n\).

This follows from the definition of manifold and should not be
difficult to prove. But it is not what this blog post should cover. For
a proof you can check *Differential Geometry: Manifolds, Curves, and
Surfaces* by Marcel Berger and Bernard Gostiaux. The proof is
located in section 2.1.

A coordinate system on a \(d\)-dimensional \(C^p\) submanifold \(M\) of \(\mathbb{R}^n\) is a pair \((\Omega,g)\) consisting of an open set \(\Omega \subset \mathbb{R}^d\) and a \(C^p\) function \(g:\Omega \to \mathbb{R}^n\) such that \(g(\Omega)\) is open in \(M\) and \(g\) induces a homeomorphism between \(\Omega\) and \(g(\Omega)\).

For convenience, we say \((\Omega,g)\) is centered at \(x\) if \(g(0)=x\) and \(g\) is an immersion at \(0\). By Theorem 2 it is always possible to find such a coordinate system centered at a given point \(x \in M\). The following theorem will show that we can get an easier approach to tangent vectors.

(Theorem 3)Let \(\mathbf{E}\) and \(\mathbf{F}\) be two finite-dimensional vector spaces, \(U \subset \mathbf{E}\) an open set, \(f:U \to \mathbf{F}\) a \(C^1\) map, \(M\) a submanifold of \(\mathbf{E}\) contained in \(U\) and \(W\) a submanifold of \(\mathbf{F}\) such that \(f(M) \subset W\). Take \(x \in M\) and set \(y=f(x)\). If \(z\) is a tangent vector to \(M\) at \(x\), then the image \(f'(x)(z)\) is a tangent vector to \(W\) at \(y=f(x)\).

*Proof.* Since \(z\) is a
tangent vector, we see there exists a curve \(\alpha: J \to M\) such that \(\alpha(0)=x\) and \(\alpha'(0)=z\) where \(J\) is an open interval containing \(0\). The function \(\beta = f \circ \alpha: J \to W\) is also a
curve satisfying \(\beta(0)=f(\alpha(0))=f(x)\) and \[
\beta'(0)=f'(\alpha(0))\alpha'(0)=f'(x)(z),
\] which is our desired curve. \(\square\)

We shall show that the equivalence relation makes sense. Suppose \(M\) is a \(d\)-dimensional submanifold of \(\mathbb{R}^n\), \(x \in M\) and \(z\) is a tangent vector to \(M\) at \(x\). Let \((\Omega,g)\) be a coordinate system centered at \(x\). Since \(g \in C^p(\Omega;\mathbb{R}^n)\), we see \(g'(0)\) is an \(n \times d\) matrix, and injectivity ensures that \(\operatorname{rank}(g'(0))=d\).

Every open set \(\Omega \subset
\mathbb{R}^d\) is a \(d\)-dimensional submanifold of \(\mathbb{R}^d\) (of \(C^p\)). Suppose now \(v \in \mathbb{R}^d\) is a tangent vector to
\(\Omega\) at \(0\) (determined by a curve \(\alpha\)), then by Theorem 3, \(g \circ \alpha\) determines a tangent
vector to \(M\) at \(x\), which is \(z_x=g'(0)(v)\). Suppose \((\Lambda,h)\) is another coordinate system
centered at \(x\). If we want to obtain
\(z_x\) as well, we must have \[
h'(0)(w)=g'(0)(v),
\] which is equivalent to \[
w = (h'(0)^{-1} \circ g'(0))(v)=(h^{-1}\circ g)'(0)(v),
\] for some \(w \in
\mathbb{R}^d\) which is a tangent vector to \(\Lambda\) at \(0
\in \Lambda\). *(The inverse makes sense since we implicitly
restricted ourselves to \(\mathbb{R}^d\), on which \(h'(0)\) is an isomorphism onto its image.)*

However, we also have two charts \((U,\varphi)=(g(\Omega),g^{-1})\) and \((V,\psi) = (h(\Lambda),h^{-1})\), which give \[ (h^{-1} \circ g)'(0)(v)=[(\psi \circ \varphi^{-1})'(\varphi(x))](v)=w \] and this is just our equivalence relation (don't forget that \(g(0)=x\), hence \(g^{-1}(x)=\varphi(x)=0\)!). Here we have our reason for the equivalence relation: if \((U,\varphi,v) \sim (V,\psi,w)\), then \((U,\varphi,v)\) and \((V,\psi,w)\) determine the same tangent vector, but we do not have to evaluate it manually. In general, all elements in an equivalence class represent a single vector, so the vector is (algebraically) an equivalence class. This still holds for Banach manifolds, since topological properties of Euclidean spaces do not play a role. The generalized proof can be carried out with little difficulty.
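To see the identity \(h'(0)(w)=g'(0)(v)\) with \(w=(h^{-1}\circ g)'(0)(v)\) in action, here is a small numerical sketch on the unit circle \(S^1 \subset \mathbb{R}^2\). Everything in it is an assumption made for illustration: the point \(x=(1,0)\), the angle parametrization `g`, the rational parametrization `h`, and the closed form \(h^{-1}(g(t))=\tan(t/2)\) are not taken from the text. Derivatives are approximated by central differences.

```python
import math

def g(t):
    """Angle parametrization of the unit circle, centered at x = (1, 0)."""
    return (math.cos(t), math.sin(t))

def h(s):
    """Rational parametrization of the unit circle, also centered at x = (1, 0)."""
    d = 1.0 + s * s
    return ((1.0 - s * s) / d, 2.0 * s / d)

def h_inv_after_g(t):
    """h^{-1}(g(t)) = tan(t / 2), valid for |t| < pi."""
    return math.tan(t / 2.0)

def d_scalar(f, t0, eps=1e-6):
    """Central-difference derivative of a real-valued function."""
    return (f(t0 + eps) - f(t0 - eps)) / (2 * eps)

def d_vector(f, t0, eps=1e-6):
    """Central-difference derivative of a tuple-valued function."""
    a, b = f(t0 - eps), f(t0 + eps)
    return tuple((q - p) / (2 * eps) for p, q in zip(a, b))

v = 3.0                                           # a vector in the first coordinate system
w = d_scalar(h_inv_after_g, 0.0) * v              # w = (h^{-1} o g)'(0)(v)
z_from_g = tuple(c * v for c in d_vector(g, 0.0))  # g'(0)(v)
z_from_h = tuple(c * w for c in d_vector(h, 0.0))  # h'(0)(w)

print(w, z_from_g, z_from_h)
```

Both triples produce the same vector in \(\mathbb{R}^2\), even though \(v \neq w\): the coordinates change, the tangent vector does not.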

The tangent vectors at \(x \in M\) should form a vector space (based at \(x\)). We do hope so, because otherwise our definition of tangent vector would be incomplete and would fail even for a trivial example (such as the one mentioned at the beginning). We shall show, satisfyingly, that the set of tangent vectors to \(M\) at \(x\) (which we write \(T_xM\)) forms a vector space that is toplinearly isomorphic to \(\mathbf{E}\), on which \(M\) is modeled.

(Theorem 4)\(T_xM \simeq \mathbf{E}\). In other words, \(T_xM\) can be given the structure of topological vector space given by the chart.

*Proof.* Let \((U,\varphi)\)
be a chart at \(x\). For \(v \in \mathbf{E}\), we see \((\varphi^{-1})'(\varphi(x))(v)\) is a tangent
vector at \(x\). On the other hand,
pick \(\mathbf{w} \in T_xM\), which can
be represented by \((V,\psi,w)\). Then
\[
v=(\varphi\circ\psi^{-1})'(\psi(x))(w)
\] is the unique element of \(\mathbf{E}\) with \((U,\varphi,v) \sim
(V,\psi,w)\). To conclude, \[
T_xM \xrightarrow[[(\varphi^{-1})'(\varphi(x))]^{-1}]{\simeq}\mathbf{E}
\] which proves our theorem. Note that the resulting vector space structure does not depend on
the choice of chart, since changing the chart composes this bijection with a toplinear isomorphism. \(\square\)

For many reasons it is not a good idea to identify \(T_xM\) with \(\mathbf{E}\) without mentioning the point \(x\). For example we shouldn't identify the tangent line of a curve with the \(x\)-axis. Instead, it would be better to identify or visualize \(T_xM\) as \((x,\mathbf{E})\), that is, a linear space with origin at \(x\).

Now we treat *all* tangent spaces as a vector bundle. Let
\(M\) be a manifold of class \(C^p\) with \(p
\geq 1\), define the tangent bundle by the disjoint union \[
T(M)=\bigsqcup_{x \in M}T_xM.
\] This is a vector bundle if we define the projection by \[
\begin{aligned}
\pi: T(M) &\to M \\
y \in T_xM &\mapsto x
\end{aligned}
\] and we will verify it soon. First let's see an example. Below
is a visualization of the tangent bundle of \(\frac{x^2}{4}+\frac{y^2}{3}=1\), denoted by
red lines:

Also we can see \(\pi\) maps points on the blue line to a point on the curve, which is \(B\).
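The same picture can be sketched in code. The snippet below is a minimal illustration, assuming the standard parametrization \((2\cos t,\sqrt{3}\sin t)\) of the ellipse; the function names are hypothetical. An element of \(T(M)\) is stored as a pair (base point, vector), and \(\pi\) simply forgets the vector.

```python
import math

SQRT3 = math.sqrt(3.0)

def point(t):
    """A point of the ellipse x^2/4 + y^2/3 = 1."""
    return (2.0 * math.cos(t), SQRT3 * math.sin(t))

def tangent_direction(t):
    """Velocity of the parametrization; it spans the tangent line at point(t)."""
    return (-2.0 * math.sin(t), SQRT3 * math.cos(t))

def bundle_element(t, c):
    """The tangent vector c * tangent_direction(t), based at point(t)."""
    dx, dy = tangent_direction(t)
    return (point(t), (c * dx, c * dy))

def pi(element):
    """Bundle projection: send a tangent vector to its base point."""
    base, _vector = element
    return base

def gradient(p):
    """Gradient of x^2/4 + y^2/3 at p; it is normal to the ellipse."""
    x, y = p
    return (x / 2.0, 2.0 * y / 3.0)

elem = bundle_element(1.0, -2.5)
base = pi(elem)
gx, gy = gradient(base)
print(base, elem[1], gx * elem[1][0] + gy * elem[1][1])
```

The last printed number is the inner product of the tangent vector with the normal direction, which vanishes up to rounding.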

To show that a tangent bundle of a manifold is a vector bundle, we need to verify that it satisfies the three conditions we mentioned in the previous post. Let \((U,\varphi)\) be a chart of \(M\) such that \(\varphi(U)\) is open in \(\mathbf{E}\), then tangent vectors can be represented by \((U,\varphi,v)\). We get a bijection \[ \tau_U:\pi^{-1}(U) = T(U) \to U \times \mathbf{E} \] by definition of tangent vectors as equivalence classes. Let \(z_x\) be a tangent vector to \(U\) at \(x\), then there exists some \(v \in \mathbf{E}\) such that \((U,\varphi,v)\) represents \(z_x\). On the other hand, for any \(v \in \mathbf{E}\) and \(x \in U\), \((U,\varphi,v)\) represents some tangent vector at \(x\). Explicitly, \[ \tau_{U}(z_x)=(x,v)=(\pi(z_x),[(\varphi^{-1})'(\pi(z_x))]^{-1}(z_x)) \]

Further, the following diagram is commutative, that is, \(pr \circ \tau_U = \pi\) on \(\pi^{-1}(U)\) (which establishes
**VB 1**):

For **VB 2** and **VB 3** we need to check
different charts. Let \((U_i,\varphi_i)\), \((U_j,\varphi_j)\) be two charts. Define
\(\varphi_{ji}=\varphi_j \circ
\varphi_i^{-1}\) on \(\varphi_i(U_i
\cap U_j)\), and respectively we write \(\tau_{U_i}=\tau_i\) and \(\tau_{U_j}=\tau_j\). Then we get a
transition mapping \[
\tau_{ji}=\tau_j \circ \tau_i^{-1}:(U_i \cap U_j) \times \mathbf{E}
\to (U_i \cap U_j) \times \mathbf{E}.
\]

One can verify that \[
\tau_{ji}(x,v)=(\varphi_{ji}(x),D\varphi_{ji}(x) \cdot v)
\] for \(x \in U_i \cap U_j\)
and \(v \in \mathbf{E}\). Since \(D\varphi_{ji} \in C^{p-1}\) and \(D\varphi_{ji}(x)\) is a toplinear
isomorphism, we see \[
x \mapsto (\tau_j \circ
\tau_i^{-1})_x=(\varphi_{ji}(x),D\varphi_{ji}(x)\cdot(\cdot))
\] is a morphism, which establishes **VB 3**. It
remains to verify **VB 2**. To do this we need a fact from
Banach space theory:

If \(f:U \to L(\mathbf{E},\mathbf{F})\) is a \(C^k\)-morphism, then the map of \(U \times \mathbf{E}\) into \(\mathbf{F}\) given by \[ (x,v) \mapsto [f(x)](v) \] is a \(C^k\)-morphism.

Here, we have \(f(x)=\tau_{ji}(x,\cdot)\) and to conclude,
\(\tau_{ji}\) is a \(C^{p-1}\)-morphism. It is also an
isomorphism since it has an inverse \(\tau_{ij}\). Following the definition of
manifold, we can conclude that \(T(U)\)
has a unique **manifold structure** such that \(\tau_i\) are morphisms (there will be a
formal proof in next post about any total space of a vector bundle). By
**VB 1**, we also have \(\pi=pr\circ\tau_i\), which makes it a
morphism as well. On each fiber \(\pi^{-1}(x)\), we can freely transport the
topological vector space structure of any \(\mathbf{E}\) such that \(x\) lies in \(U_i\), by means of \(\tau_{ix}\). Since \(f(x)\) is a toplinear isomorphism, the
result is independent of the choice of \(U_i\). **VB 2** is therefore
established.

Using fancier words, we can also say that \(T:M \mapsto T(M)\) is a
**functor** from the category of \(C^p\)-manifolds to the category of vector
bundles of class \(C^{p-1}\).

Direction is something worth careful consideration. For example take a look at this picture (by David Gunderman):

The position of the red ball and black ball shows that this triple of
balls turns upside down every time they finish one round. This wouldn't
happen if this triple were on a normal band, which can be denoted by
\(S^1 \times (0,1)\). What would happen
if we try to describe their velocity on the Möbius band, both locally
and globally? There must be some significant difference from a normal
band. If we set some movement pattern for the balls, for example letting them run
horizontally or in a zig-zag, we should expect to get different *sets* of
vectors. Those vectors can span some vector spaces as well.

Here and in the following posts, we will try to develop purely formally certain functorial constructions having to do with vector bundles. It may be overly generalized, but we will offer some examples to make it concrete.

Let \(M\) be a manifold (of class
\(C^p\), where \(p \geq 0\) and can be set to \(\infty\)) modeled on a Banach space \(\mathbf{E}\). Let \(E\) be another topological space and \(\pi: E \to M\) a surjective \(C^p\)-morphism. A **vector
bundle** is a topological construction associated with \(M\) (base space), \(E\) (total space) and \(\pi\) (bundle projection) such that,
roughly speaking, \(E\) is locally a
product of \(M\) and \(\mathbf{E}\).

We use \(\mathbf{E}\) instead of \(\mathbb{R}^n\) to include the infinite dimensional cases. We will try to distinguish finite-dimensional and infinite-dimensional Banach spaces here. There are a lot of things to do, since, for example, infinite dimensional Banach spaces have no countable Hamel basis, while the finite-dimensional ones have finite ones (this can be proved by using the Baire category theorem).

Next we will show precisely how \(E\) locally becomes a product space. Let
\(\mathfrak{U}=(U_i)_i\) be an open
covering of \(M\), and for each \(i\), suppose that we are *given* a
mapping \[
\tau_i:\pi^{-1}(U_i)\to U_i \times \mathbf{E}
\] satisfying the following three conditions.

**VB 1** \(\tau_i\) is
a \(C^p\) diffeomorphism making the
following diagram commutative:

where \(pr\) is the projection of the first component: \((x,y) \mapsto x\). By restricting \(\tau_i\) on one point of \(U_i\), we obtain an isomorphism on each fiber \(\pi^{-1}(x)\): \[ \tau_{ix}:\pi^{-1}(x) \xrightarrow{\simeq} \{x\} \times \mathbf{E} \]

**VB 2** For each pair of open sets \(U_i\), \(U_j \in
\mathfrak{U}\) and each \(x \in U_i \cap U_j\), the map \[
\tau_{jx} \circ \tau_{ix}^{-1}: \mathbf{E} \to \mathbf{E}
\] is a toplinear isomorphism (that is, it preserves the structure of \(\mathbf{E}\) as a
*topological* vector space).

**VB 3** For any two members \(U_i\), \(U_j \in
\mathfrak{U}\), we have the following function to be a \(C^p\)-morphism: \[
\begin{aligned}
\varphi:U_i \cap U_j &\to L(\mathbf{E},\mathbf{E}) \\
x &\mapsto \left(\tau_j\circ \tau_i^{-1}\right)_x
\end{aligned}
\]

**REMARKS.** As with manifold, we call the set of
2-tuples \((U_i,\tau_i)_i\) a
**trivializing covering** of \(\pi\), and that \((\tau_i)\) are its **trivializing
maps**. Precisely, for \(x \in
U_i\), we say \(U_i\) or \(\tau_i\) trivializes at \(x\).

Two trivializing *coverings* for \(\pi\) are said to be
**VB-equivalent** if taken together they also satisfy
conditions of **VB 2** and **VB 3**. It's
immediate that **VB-equivalence** is an equivalence
relation and we leave the verification to the reader. It is this
VB-equivalence *class* of trivializing coverings that determines
a structure of **vector bundle** on \(\pi\). With respect to the Banach space
\(\mathbf{E}\), we say that the vector
bundle has **fiber** \(\mathbf{E}\), or is **modeled
on** \(\mathbf{E}\).

Next we shall give some motivations of each condition. Each pair
\((U_i,\tau_i)\) determines a local
product of 'a part of the manifold' and the model space, on the latter
of which we can deploy the direction with ease. This is what **VB
1** tells us. But that's far from enough if we want our vectors
fine enough. We do want the total space \(E\) to actually be able to qualify our
requirements. As for **VB 2**, it is ensured that using two
different trivializing maps will give the same structure of some Banach
spaces (with *equivalent* norms). Looking at the image of \(\tau_{ix}\), we can say that for each point
\(x \in M\), the fiber \(\pi^{-1}(x)\) (the domain
of \(\tau_{ix}\)) can also be given a
Banach space structure by being sent via \(\tau_{jx}\) for some other \(j\), and the two structures agree. Note that \(\pi^{-1}(x) \subset E\), the total space. In
fact, **VB 2** has an equivalent alternative:

**VB 2'** On each fiber \(\pi^{-1}(x)\) we are given a structure of
Banach space as follows. For \(x \in
U_i\), we have a toplinear isomorphism which is in fact the
trivializing map: \[
\tau_{ix}:\pi^{-1}(x)=E_x \to \mathbf{E}.
\] As stated, **VB 2** implies **VB
2'**. Conversely, if **VB 2'** is satisfied, then
for open sets \(U_i\), \(U_j \in \mathfrak{U}\), and \(x \in U_i \cap U_j\), we have \(\tau_{jx} \circ \tau_{ix}^{-1}:\mathbf{E} \to
\mathbf{E}\) to be a toplinear isomorphism. Hence, we can
consider **VB 2** or **VB 2'** as the
refinement of **VB 1**.

In the finite-dimensional case, one can omit **VB 3** since
it can be implied by **VB 2**, and we will prove it
below.

(Lemma)Let \(\mathbf{E}\) and \(\mathbf{F}\) be two finite dimensional Banach spaces. Let \(U\) be open in some Banach space. Let \[ f:U \times \mathbf{E} \to \mathbf{F} \] be a \(C^p\)-morphism such that for each \(x \in U\), the map \[ f_x: \mathbf{E} \to \mathbf{F} \] given by \(f_x(v)=f(x,v)\) is a linear map. Then the map of \(U\) into \(L(\mathbf{E},\mathbf{F})\) given by \(x \mapsto f_x\) is a \(C^p\)-morphism.

**PROOF.** Since \(L(\mathbf{E},\mathbf{F})=L(\mathbf{E},\mathbf{F_1})
\times L(\mathbf{E},\mathbf{F_2}) \times \cdots \times
L(\mathbf{E},\mathbf{F_n})\) where \(\mathbf{F}=\mathbf{F_1} \times \cdots \times
\mathbf{F_n}\), by induction on the dimension of \(\mathbf{F}\) and \(\mathbf{E}\), it suffices to assume that
\(\mathbf{E}\) and \(\mathbf{F}\) are toplinearly isomorphic to
\(\mathbb{R}\). But in that case, the
function \(f(x,v)\) can be written
\(g(x)v\) for some \(g:U \to \mathbb{R}\). Since \(f\) is a morphism, it follows that it is also a morphism as a
function of each argument \(x\) and \(v\) separately. Putting \(v=1\) shows that \(g\) is also a morphism, which finishes the
case when both the dimension of \(\mathbf{E}\) and \(\mathbf{F}\) are equal to \(1\), and the proof is completed by
induction. \(\blacksquare\)

To show that **VB 3** is implied by **VB
2**, put \(\mathbf{E}=\mathbf{F}\) as in the lemma.
Note that \(\tau_j \circ \tau_i^{-1}\)
maps \((U_i \cap U_j) \times \mathbf{E}\)
to \(\mathbf{E}\), and \(U_i \cap U_j\) is open, and for each \(x \in U_i \cap U_j\), the map \((\tau_j \circ \tau_i^{-1})_x=\tau_{jx} \circ
\tau_{ix}^{-1}\) is toplinear, hence linear. Then the fact that
\(\varphi\) is a morphism follows from
the lemma.

Let \(M\) be any \(n\)-dimensional smooth manifold that you
are familiar with, then \(pr:M \times
\mathbb{R}^n \to M\) is actually a vector bundle. Here the total
space is \(M \times \mathbb{R}^n\) and
the base is \(M\) and \(pr\) is the bundle projection but in this
case it is simply a projection. Intuitively, on a total space, we can
determine a point \(x \in M\), and
another component can be any direction in \(\mathbb{R}^n\), hence a
*vector*.

We need to verify three conditions carefully. Let \((U_i,\varphi_i)_i\) be any atlas of \(M\), and let \(\tau_i\) be the identity map on \(U_i \times \mathbb{R}^n\) (which is naturally of class \(C^p\)). We claim that \((U_i,\tau_i)_i\) satisfies the three conditions, thus we get a vector bundle.

For **VB 1** things are clear: since \(pr^{-1}(U_i)=U_i \times \mathbb{R}^n\), the
diagram is commutative. Each fiber \(pr^{-1}(x)\) is essentially \(\{x\} \times \mathbb{R}^n\), and still, \(\tau_{jx} \circ \tau_{ix}^{-1}\) is the
identity map between \(\{x\} \times
\mathbb{R}^n\) and \(\{x\} \times
\mathbb{R}^n\), under the same Euclidean topology, hence
**VB 2** is verified. Since the fiber is finite-dimensional, we have no need to verify
**VB 3**.

First of all, imagine you have embedded a circle into a Möbius band. Now we try to give some formal definition. As with quotient topology, \(S^1\) can be defined as \[ S^1=I/\sim_1, \]

where \(I\) is the unit interval and \(0 \sim_1 1\) (identifying two ends). On the other hand, the infinite Möbius band can be defined by \[ B= (I \times \mathbb{R})/\sim_2 \] where \((0,v) \sim_2 (1,-v)\) for all \(v \in \mathbb{R}\) (not only identifying the two ends of \(I\) but also 'flipping' the vertical line). Then all we need is a natural projection on the first component: \[ \pi:B \to S^1. \] The verification differs little from that of the trivial bundle. The quotient topology of Banach spaces follows naturally in this case, but things might be troublesome if we restrict ourselves to \(\mathbb{R}^n\).
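One way to make the 'flip' concrete is to write down two trivializations and look at the transition between them. The sketch below is an illustration under assumptions that are not in the text: the covering consists of \(U_1\), the image of \((0,1)\), and \(U_2\), the circle minus the point \(1/2\), and the maps `tau1`, `tau2` are hypothetical choices. A class in \(B\) is passed around by a representative \((t,v)\) with \(t \in [0,1]\).

```python
def tau1(t, v):
    """Trivialization over U1, the image of the open interval (0, 1)."""
    if not 0.0 < t < 1.0:
        raise ValueError("base point not in U1")
    return (t, v)

def tau2(t, v):
    """Trivialization over U2, the circle minus the point 1/2.

    The two ends of I are glued with a flip, so the fiber coordinate
    changes sign on the right half to keep tau2 well defined at 0 ~ 1.
    """
    if not 0.0 <= t <= 1.0 or t == 0.5:
        raise ValueError("base point not in U2")
    return (t % 1.0, v) if t < 0.5 else (t % 1.0, -v)

def transition(t, v):
    """tau2 o tau1^{-1} on the intersection of U1 and U2."""
    tau1(t, v)  # only validates that t lies in U1; tau1^{-1}(t, v) is the class of (t, v)
    return tau2(t, v)

print(transition(0.25, 3.0))  # left half: fiber coordinate unchanged
print(transition(0.75, 3.0))  # right half: fiber coordinate flipped
print(tau2(0.0, 2.0) == tau2(1.0, -2.0))  # the glued points (0, v) ~ (1, -v) agree
```

On each fiber the transition is multiplication by \(+1\) or \(-1\), a linear isomorphism of \(\mathbb{R}\) that is locally constant in \(t\). With the trivial bundle a single sign would do everywhere; here the two components of \(U_1 \cap U_2\) need opposite signs.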

The first example is relatively rare in many senses. By \(S^n\) we mean the set in \(\mathbb{R}^{n+1}\) with \[
S^n=\{(x_0,x_1,\dots,x_n):x_0^2+x_1^2+\cdots+x_n^2=1\}
\] and the tangent bundle can be defined by \[
TS^n=\{(\mathbf{x},\mathbf{y}):\langle\mathbf{x},\mathbf{y}\rangle=0\}
\subset S^{n} \times\mathbb{R}^{n+1},
\] where, of course, \(\mathbf{x} \in
S^n\) and \(\mathbf{y} \in
\mathbb{R}^{n+1}\). The vector bundle is given by \(pr:TS^n \to S^n\) where \(pr\) is the projection of the first factor.
This total space is of course much finer than \(M \times \mathbb{R}^n\) in the first
example. Each point in the manifold now is associated with a *tangent
space* \(T_x(M)\) at this
point.
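The description of \(TS^n\) as pairs \((\mathbf{x},\mathbf{y})\) with \(\langle\mathbf{x},\mathbf{y}\rangle=0\) translates directly into code. In the sketch below, vectors are plain tuples, and the helper `project_to_tangent` (orthogonal projection \(\mathbf{y} \mapsto \mathbf{y}-\langle\mathbf{x},\mathbf{y}\rangle\mathbf{x}\)) is an addition for illustration that the text does not use.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def normalize(a):
    """Scale a nonzero vector so that it lies on the unit sphere."""
    n = math.sqrt(dot(a, a))
    return tuple(p / n for p in a)

def project_to_tangent(x, y):
    """Orthogonal projection of y onto the hyperplane orthogonal to the unit vector x."""
    c = dot(x, y)
    return tuple(q - c * p for p, q in zip(x, y))

def in_TSn(x, y, tol=1e-12):
    """Membership test for TS^n: x is a unit vector and <x, y> = 0."""
    return abs(dot(x, x) - 1.0) < tol and abs(dot(x, y)) < tol

def pr(element):
    """Bundle projection onto the first factor."""
    return element[0]

x = normalize((1.0, 2.0, 2.0))                # a point of S^2
y = project_to_tangent(x, (1.0, 0.0, 0.0))    # a tangent vector at x
print(in_TSn(x, y), in_TSn(x, (1.0, 0.0, 0.0)), pr((x, y)) == x)
```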

More generally, we can define it in any Hilbert space \(H\), for example, an \(L^2\) space: \[ TS=\{(x,y):\langle x , y \rangle=0\} \subset S \times H \] where \[ S=\{x:\langle x , x \rangle = 1\}. \] The projection is natural: \[ \begin{aligned} \pi: TS &\to S \\ T_x(S) & \mapsto x \end{aligned} \] But we will not cover the verification in this post since it requires introducing the abstract definition of tangent vectors. This will be done in the following post.

We want to study those 'vectors' associated to some manifold both
globally and locally. For example we may want to describe the tangent
line of some curves at some point without heavily using elementary
calculus stuff. Also, we may want to describe the vector bundle of a
manifold globally, for example, when will we have a trivial one? Can we
classify the manifold using the behavior of the bundle? Can we make it a
little more abstract, for example, by considering the isomorphism classes
of bundles? How does one bundle *transform* into another? But to do this
we need a large number of definitions and propositions.

Before we go into group theory, let's recall how a Cauchy sequence is defined in analysis.

A sequence \((x_n)_{n=1}^{\infty}\)
of real/complex numbers is called a Cauchy sequence if, for every \(\varepsilon>0\), there is a positive
integer \(N\) such that for all \(m,n>N\), we have \[
|x_m-x_n|<\varepsilon.
\] In other words, the **distance** between any two terms far enough along
the sequence is arbitrarily small. Notice that only distance is involved, so the
definition of a Cauchy sequence in a metric space arises
naturally.

Given a metric space \((X,d)\), a sequence \((x_n)_{n=1}^{\infty}\) is Cauchy if for every real number \(\varepsilon>0\), there is a positive integer \(N\) such that, for all \(m,n>N\), the distance satisfies \[ d(x_m,x_n)<\varepsilon. \]

By considering the topology induced by metric, we see that \(x_n\) lies in a neighborhood of \(x_m\) with radius \(\varepsilon\). But a topology can be constructed by neighborhood, hence the Cauchy sequence for topological vector space follows.

For a topological vector space \(X\), pick a local base \(\mathcal{B}\), then \((x_n)_{n=1}^{\infty}\) is a Cauchy sequence if for each member \(U \in \mathcal{B}\), there exists some number \(N\) such that for \(m,n>N\), we have \[ x_m-x_n \in U. \] But in a general topological space this does not work. Consider the two topological spaces \[ X=(0,1)\quad Y=(1,+\infty) \] with the usual topology. We have \(X \simeq Y\) since the map \[ \begin{aligned} h:X &\to Y \\ x &\mapsto \frac{1}{x} \end{aligned} \] is a homeomorphism. Consider the Cauchy sequence \((\frac{1}{n+1})_{n=1}^{\infty}\) in \(X\); we see \((h(\frac{1}{n+1}))_{n=1}^{\infty}=(n+1)_{n=1}^{\infty}\), which is not Cauchy. This counterexample shows that being a Cauchy sequence is not preserved by homeomorphisms.
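The counterexample can be made tangible with exact rational arithmetic. The sketch below measures how far apart the terms of a sequence are on a finite window of indices; the helper `tail_spread` and the window size are assumptions made for illustration, and a finite window can only illustrate, not prove, the Cauchy property.

```python
from fractions import Fraction

def x(n):
    """The Cauchy sequence 1 / (n + 1) in X = (0, 1)."""
    return Fraction(1, n + 1)

def h(t):
    """The homeomorphism X -> Y, t -> 1 / t."""
    return 1 / t

def tail_spread(seq, N, window=50):
    """Largest distance between two terms seq(m), seq(n) with N < m, n <= N + window."""
    terms = [seq(n) for n in range(N + 1, N + window + 1)]
    return max(terms) - min(terms)

for N in (10, 100, 1000):
    # The first spread shrinks as N grows, the second stays at 49.
    print(N, float(tail_spread(x, N)), tail_spread(lambda n: h(x(n)), N))
```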

Similarly, one can have a Cauchy sequence in a topological group (by considering subtraction as inverse).

A sequence \((x_n)_{n=1}^{\infty}\) in a topological group \(G\) is a Cauchy sequence if for every open neighborhood \(U\) of the identity of \(G\), there exists some number \(N\) such that whenever \(m,n>N\), we have \[ x_nx_m^{-1} \in U. \]

A metric space \((X,d)\) in which every Cauchy sequence converges is called **complete**.

Spaces like \(\mathbb{R}\), \(\mathbb{C}\) are complete with the Euclidean metric. But consider the sequence in \(\mathbb{Q}\) given by \[ a_n=\left(1+\frac{1}{n}\right)^{n}. \] We have \(a_n\in\mathbb{Q}\) for all \(n\), but the sequence does not converge in \(\mathbb{Q}\). Indeed in \(\mathbb{R}\) we can naturally write \(a_n \to e\), but \(e \notin \mathbb{Q}\) as we all know.
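Python's `fractions.Fraction` lets us stay inside \(\mathbb{Q}\) while watching the sequence approach an irrational limit. This is a small illustration only; the comparison with `math.e` uses floating point and is therefore approximate.

```python
from fractions import Fraction
import math

def a(n):
    """a_n = (1 + 1/n)^n, computed exactly as a rational number."""
    return (1 + Fraction(1, n)) ** n

print(a(1), a(2), a(3))  # exact rationals
for n in (10, 100, 1000):
    # Every term is rational, yet the gap to e keeps shrinking.
    print(n, math.e - float(a(n)))
```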

There are several ways to construct \(\mathbb{R}\) from \(\mathbb{Q}\). One of the most famous methods is that of Dedekind cuts, which makes no explicit use of Cauchy sequences. There is another method that uses Cauchy sequences explicitly. We are following that way algebraically.

Suppose we are given a group \(G\) with a sequence of normal subgroups \((H_n)_{n=1}^{\infty}\) with \(H_n \supset H_{n+1}\) for all \(n\), all of which have finite index. We are going to complete this group.

A sequence \((x_n)_{n=1}^{\infty}\)
in \(G\) will be called **Cauchy
sequence** if given \(H_k\),
there exists some \(N>0\) such that
for \(m,n>N\), we have \[
x_nx_m^{-1} \in H_k
\] Indeed, this looks very similar to what we see in topological
group, but we don't want to grant a topology to the group anyway. This
definition does not go too far from the original definition of Cauchy
sequence in \(\mathbb{R}\) as well. If
you treat \(H_k\) as some 'small'
thing, it shows that \(x_m\) and \(x_n\) are close enough (by considering
\(x_nx_m^{-1}\) as their
difference).

A sequence \((x_n)_{n=1}^{\infty}\)
in \(G\) will be called **null
sequence** if given \(k\), there
exists some \(N>0\) such that for
all \(n>N\), we have \[
x_n\in H_k
\] or you may write \(x_ne^{-1} \in
H_k\). It can be considered as being *arbitrarily close to the
identity \(e\)*.
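A concrete instance may help here. The sketch below takes \(G=\mathbb{Z}\) (written additively, so \(x_nx_m^{-1}\) becomes \(x_n-x_m\)) and \(H_k=p^k\mathbb{Z}\), which is a decreasing sequence of normal subgroups of finite index \(p^k\). This choice of group, the prime \(p=3\), and the two sample sequences are assumptions made for illustration and do not come from the text. The checks only look at a finite window of indices, so they illustrate the definitions rather than prove anything.

```python
P = 3  # a hypothetical choice of prime

def in_H(k, g):
    """Membership in H_k = P^k Z."""
    return g % (P ** k) == 0

def x(n):
    """Partial sums 1 + P + ... + P^(n-1): a Cauchy sequence that is not null."""
    return sum(P ** i for i in range(n))

def y(n):
    """Powers P^n: a null sequence."""
    return P ** n

def looks_cauchy(seq, k, N, window=20):
    """Check seq(n) - seq(m) in H_k for all N < m, n <= N + window."""
    idx = range(N + 1, N + window + 1)
    return all(in_H(k, seq(n) - seq(m)) for n in idx for m in idx)

def looks_null(seq, k, N, window=20):
    """Check seq(n) in H_k for all N < n <= N + window."""
    return all(in_H(k, seq(n)) for n in range(N + 1, N + window + 1))

print(all(looks_cauchy(x, k, N=k) for k in range(1, 8)))
print(all(looks_null(y, k, N=k) for k in range(1, 8)))
print(looks_null(x, 1, N=1))
```

Here taking \(N=k\) works for both sequences: once \(m,n>k\), the difference \(x_n-x_m\) is a sum of powers \(p^i\) with \(i>k\). The last check fails for \(x\) because every \(x_n\) is congruent to \(1\) modulo \(p\).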

The Cauchy sequences (of \(G\)) form a group under termwise product.

*Proof.* Let \(C\) be the set
of Cauchy sequences, we shall show that \(C\) forms a group. For \((x_1,x_2,\cdots),(y_1,y_2,\cdots)\in C\),
the product is defined by \[
(x_1,x_2,\cdots)(y_1,y_2,\cdots)=(x_1y_1,x_2y_2,\cdots)
\] The associativity follows naturally from the associativity of
\(G\). To show that \((x_1y_1,x_2y_2,\cdots)\) is still a Cauchy
sequence, notice that for big enough \(m\), \(n\)
and some \(k\), we have \[
x_nx_m^{-1}\in H_k \quad y_ny_m^{-1}\in H_k.
\] But \((x_ny_n)(x_my_m)^{-1}=x_ny_ny_m^{-1}x_m^{-1}\).
To show that this is an element of \(H_k\), notice that \[
x_ny_ny_m^{-1}x_m^{-1}=(x_ny_ny_m^{-1}{\color{red}{x_n^{-1}}})({\color{red}{x_n}}x_m^{-1})
\] Since \(y_ny_m^{-1}\in H_k\) and
\(H_k\) is normal, we have \(x_ny_ny_m^{-1}x_n^{-1} \in H_k\). Since \(x_nx_m^{-1} \in H_k\), \((x_ny_n)(x_my_m)^{-1}\) can be viewed as a
product of two elements of \(H_k\),
therefore is an element of \(H_k\).

Obviously, if we define \(e_C=(e_G,e_G,\cdots)\), where \(e_G\) is the identity of \(G\), \(e_C\) becomes the identity of \(C\), since \[ (x_1,x_2,\cdots)(e_G,e_G,\cdots)=(x_1,x_2,\cdots)=(e_G,e_G,\cdots)(x_1,x_2,\cdots). \] Finally the inverse. We need to show that \[ (x_1,x_2,\cdots)^{-1}=(x_1^{-1},x_2^{-1},\cdots) \] is still an element of \(C\). This follows since if we have \[ x_nx_m^{-1} \in H_k \] then \[ (x_n^{-1})(x_m^{-1})^{-1}=x_n^{-1}x_m=x_n^{-1}(x_nx_m^{-1})^{-1}x_n \in H_k \] as \(H_k\) is a normal subgroup.

The null sequences (of \(G\)) form a group, further, it's a normal subgroup of \(C\), that is, the group of Cauchy sequences.

Let \(N\) be the set of null sequences of \(G\). Still, the identity is defined by \((e_G,e_G,\cdots)\), and there is no need to duplicate the validation. The associativity still follows from \(G\). To show that \(N\) is closed under termwise product, namely if \((x_n),(y_n) \in N\), then \((x_ny_n)\in N\), one only needs to notice that, for big \(n\), we already have \[ x_n,y_n \in H_k. \] Therefore \(x_ny_n \in H_k\) since \(x_n\) and \(y_n\) are two elements of \(H_k\).

To show that \((x_n^{-1})\), which should be treated as the inverse of \((x_n)\), is still in \(N\), notice that if \(x_n \in H_k\), then \(x_n^{-1} \in H_k\).

Next, we shall show that \(N\) is a subgroup of \(C\), which is equivalent to showing that every null sequence is Cauchy. Given \(H_p \supset H_q\) and \((x_n)\in{N}\), for all big enough \(m\) and \(n\) we have \[ x_n \in H_p \quad x_m \in H_q \subset H_p \] therefore \[ x_nx_m^{-1} \in H_p \] as desired. Finally, pick \((p_n) \in N\) and \((q_n) \in C\); we shall show that \((q_n)(p_n)(q_n)^{-1} \in N\). That is, the sequence \((q_np_nq_n^{-1})\) is a null sequence. Given \(H_k\), for all big enough \(n\) we have \[ p_n \in H_k \] therefore \[ q_np_nq_n^{-1} \in H_k \] since \(H_k\) is normal. Our statement is proved.

The factor group \(C/N\) is called the **completion** of \(G\) (with respect to \((H_n)\)).

As we know, the elements of \(C/N\) are cosets. A coset can be considered as an element of \(G\)'s completion. Let's head back to some properties of factor group. Pick \(x,y \in C\), then \(xN=yN\) if and only if \(x^{-1}y \in N\). With that being said, two Cauchy sequences are equivalent if their 'difference' is a null sequence.

Informally, consider the additive group \(\mathbb{Q}\). There are two Cauchy sequences given by \[ (1,1,1,\cdots,1,\cdots) \\ (0.9,0.99,0.999,0.9999,\cdots). \] They are equivalent since \[ (0.1,0.01,0.001,\cdots) \] is a null sequence. That's why people say \(0.99999... = 1\) (in analysis, the difference is convergent to \(0\); but in algebra, we say the two sequences are equivalent). As another example, \(\ln{2}\) can be represented by the equivalence class of \[ (0.6,0.69,0.693,\cdots). \]

We made our completion using Cauchy sequences. The completion is filled with some Cauchy sequence and some additions of 'nothing', whence the gap disappears.

Again, the sequence of normal subgroups does not have to be indexed by \(\mathbb{N}\). It can be indexed by any directed partially ordered set, or simply partially ordered set. Removing the restriction of index set gives us a great variety of implementation.

However, can we finish everything about completing \(\mathbb{Q}\) using this? The answer is no: the multiplication is not verified! To finish this, field theory has to be taken into consideration.