Dedekind Domain and Properties in an Elementary Approach

Dedekind domains (or Dedekind rings) are covered in almost every book on algebraic number theory, but many of their properties can be proved inside ring theory alone. I hope you can find the solution you need in this post, which will not go further than elementary ring theory. That said, you are assumed to know rings and rings of fractions (this post serves well), but not much mathematical maturity is assumed (at the very least you should be familiar with the terminology in the linked post).\(\def\mb{\mathbb}\) \(\def\mfk{\mathfrak}\)

Definition

There are several ways to define a Dedekind domain, since there are several equivalent characterisations. We will start from the one based on the ring of fractions. As a friendly reminder, \(\mb{Z}\) or any principal ideal domain is already a Dedekind domain. In fact, a Dedekind domain may be viewed as a generalization of a principal ideal domain.

Let \(\mfk{o}\) be an integral domain (a.k.a. entire ring), and \(K\) be its quotient field. A Dedekind domain is an integral domain \(\mfk{o}\) such that the fractional ideals form a group under multiplication. Let's have a breakdown. By a fractional ideal \(\mfk{a}\) we mean a nontrivial additive subgroup of \(K\) such that

  • \(\mfk{o}\mfk{a}=\mfk{a}\),
  • there exists some nonzero element \(c \in \mfk{o}\) such that \(c\mfk{a} \subset \mfk{o}\).

What does the group look like? As you may guess, the unit element is \(\mfk{o}\). For a fractional ideal \(\mfk{a}\), the inverse is another fractional ideal \(\mfk{b}\) such that \(\mfk{ab}=\mfk{ba}=\mfk{o}\). Note we regard \(\mfk{o}\) as a subring of \(K\). For \(a \in \mfk{o}\), we treat it as \(a/1 \in K\). This makes sense because the map \(i:a \mapsto a/1\) is injective. For the existence of \(c\), you may consider it as a restriction that the 'denominators' are bounded. In the language of modules, a fractional ideal is a nonzero \(\mfk{o}\)-submodule of \(K\) with bounded denominators; every nonzero finitely generated \(\mfk{o}\)-submodule of \(K\) is one, and when \(\mfk{o}\) is Noetherian the converse holds as well. But in this post it is not assumed that you have learned module theory.

Let's take \(\mb{Z}\) as an example. The quotient field of \(\mb{Z}\) is \(\mb{Q}\). Fix a prime \(p\) and let \(P\) be the fractional ideal whose elements are of the form \(\frac{np}{2}\) with \(n \in \mb{Z}\). Then indeed we have \(\mb{Z}P=P\). On the other hand, taking \(2 \in \mb{Z}\), we have \(2P \subset \mb{Z}\). Its inverse is the fractional ideal \(Q\) whose elements are of the form \(\frac{2n}{p}\) with \(n \in \mb{Z}\). As proved in algebraic number theory, the ring of algebraic integers in a number field is a Dedekind domain.
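The example above can be checked by machine. The sketch below relies on a fact about \(\mb{Z}\) that this post does not prove: every fractional ideal of \(\mb{Z}\) has the form \(q\mb{Z}\) for a positive rational \(q\), so multiplying fractional ideals amounts to multiplying generators. The helper names are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def ideal_product(a: Fraction, b: Fraction) -> Fraction:
    """Generator of (aZ)(bZ) = (ab)Z, normalised to be positive."""
    return abs(a * b)

def ideal_inverse(a: Fraction) -> Fraction:
    """Generator of the inverse fractional ideal of aZ (a must be nonzero)."""
    return 1 / abs(a)

def common_denominator(a: Fraction) -> int:
    """A nonzero integer c such that c * (aZ) is contained in Z."""
    return a.denominator

p = 3                     # any fixed prime works; 3 is an arbitrary choice
P = Fraction(p, 2)        # P = (p/2)Z
Q = ideal_inverse(P)      # Q = (2/p)Z

print(Q)                              # generator of the inverse
print(ideal_product(P, Q))            # generator of PQ, which should be Z
print(common_denominator(P) * P)      # 2P is generated by an integer
```

The unit element \(\mb{Z}\) corresponds to the generator \(1\), so `ideal_product(P, Q) == 1` is the statement \(PQ=\mb{Z}\).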

Before we go on we need to clarify the definition of ideal multiplication. Let \(\mfk{a}\) and \(\mfk{b}\) be two ideals. We define \(\mfk{ab}\) to be the set of all sums \[ x_1y_1+\cdots+x_ny_n \] where \(x_i \in \mfk{a}\) and \(y_i \in \mfk{b}\). Here the number \(n\) is finite but not fixed. Alternatively, we can say that \(\mfk{ab}\) consists of all finite sums of products of an element of \(\mfk{a}\) with an element of \(\mfk{b}\).
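For a concrete feel of this definition, here is a minimal sketch in \(\mb{Z}\), where an ideal given by finitely many generators is generated by their greatest common divisor. It uses the standard fact that the product ideal is generated by the pairwise products of generators; the function names are hypothetical.

```python
from functools import reduce
from math import gcd

def generator(gens):
    """Non-negative generator of the ideal of Z generated by `gens`."""
    return reduce(gcd, gens, 0)

def product_generators(a_gens, b_gens):
    """Generators x*y of the product ideal, one for each pair of generators."""
    return [x * y for x in a_gens for y in b_gens]

a = [4, 6]      # the ideal (4, 6) = (2)
b = [9, 15]     # the ideal (9, 15) = (3)
ab = product_generators(a, b)

print(ab)                               # [36, 60, 54, 90]
print(generator(ab))                    # 6
print(generator(a) * generator(b))      # 6
```

Both computations give \((6)\), matching \((2)(3)=(6)\).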

Propositions

(Proposition 1) A Dedekind domain \(\mfk{o}\) is Noetherian.

By Noetherian ring we mean that every ideal in a ring is finitely generated. Precisely, we will prove that for every ideal \(\mfk{a} \subset \mfk{o}\) there are \(a_1,a_2,\cdots,a_n \in \mfk{a}\) such that, for every \(r \in \mfk{a}\), we have an expression \[ r = c_1a_1 + c_2a_2 + \cdots + c_na_n \qquad c_1,c_2,\cdots,c_n \in \mfk{o}. \] Also note that any ideal \(\mfk{a} \subset \mfk{o}\) can be viewed as a fractional ideal.

Proof. The zero ideal is generated by \(0\), so let \(\mfk{a}\) be a nonzero ideal of \(\mfk{o}\) and let \(K\) be the quotient field of \(\mfk{o}\). Since \(\mfk{oa}=\mfk{a}\), we may view \(\mfk{a}\) as a fractional ideal. Since \(\mfk{o}\) is a Dedekind domain, the fractional ideals of \(\mfk{o}\) form a group, so there is a fractional ideal \(\mfk{b}\) such that \(\mfk{ab}=\mfk{ba}=\mfk{o}\). Since \(1 \in \mfk{o}=\mfk{ab}\), there exist \(a_1,a_2,\cdots, a_n \in \mfk{a}\) and \(b_1,b_2,\cdots,b_n \in \mfk{b}\) such that \(\sum_{i = 1 }^{n}a_ib_i=1\). For any \(r \in \mfk{a}\), we have an expression \[ r = rb_1a_1+rb_2a_2+\cdots+rb_na_n, \] where each coefficient \(rb_i\) lies in \(\mfk{ab}=\mfk{o}\). On the other hand, any element of the form \(c_1a_1+c_2a_2+\cdots+c_na_n\) with \(c_i \in \mfk{o}\) is, by definition, an element of \(\mfk{a}\). \(\blacksquare\)

From now on, the inverse of a fractional ideal \(\mfk{a}\) will be written as \(\mfk{a}^{-1}\).

(Proposition 2) For nonzero ideals \(\mfk{a},\mfk{b} \subset \mfk{o}\), \(\mfk{b}\subset\mfk{a}\) if and only if there exists some ideal \(\mfk{c}\) such that \(\mfk{ac}=\mfk{b}\) (or we simply say \(\mfk{a}|\mfk{b}\)).

Proof. If \(\mfk{b}=\mfk{ac}\), simply note that \(\mfk{ac} \subset \mfk{a} \cap \mfk{c} \subset \mfk{a}\). For the converse, suppose that \(\mfk{a} \supset \mfk{b}\). Then \(\mfk{c}=\mfk{a}^{-1}\mfk{b}\) is an ideal of \(\mfk{o}\) since \(\mfk{c}=\mfk{a}^{-1}\mfk{b} \subset \mfk{a}^{-1}\mfk{a}=\mfk{o}\), hence we may write \(\mfk{b}=\mfk{a}\mfk{c}\). \(\blacksquare\)
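In \(\mb{Z}\) this proposition is the familiar slogan "to contain is to divide": \(b\mb{Z} \subset a\mb{Z}\) exactly when \(a\) divides \(b\). A small sketch with hypothetical helper names, restricted to nonzero principal ideals of \(\mb{Z}\):

```python
def contains(a: int, b: int) -> bool:
    """True when the ideal bZ is contained in aZ (a must be nonzero)."""
    return b % a == 0

def cofactor(a: int, b: int) -> int:
    """Generator c with (a)(c) = (b); requires bZ to lie inside aZ."""
    if not contains(a, b):
        raise ValueError("bZ is not contained in aZ")
    return b // a

print(contains(4, 12), cofactor(4, 12))   # True 3
print(contains(4, 6))                     # False
```

Here `cofactor` plays the role of \(\mfk{c}=\mfk{a}^{-1}\mfk{b}\) in the proof.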

(Proposition 3) If \(\mfk{a}\) is an ideal of \(\mfk{o}\), then there are prime ideals \(\mfk{p}_1,\mfk{p}_2,\cdots,\mfk{p}_n\) such that \[ \mfk{a}=\mfk{p}_1\mfk{p}_2\cdots\mfk{p}_n. \]

Proof. For this problem we use a classical technique: contradiction on maximality. Here \(\mfk{a}\) is understood to be a nonzero proper ideal. Suppose the statement is not true, and let \(\mfk{U}\) be the set of nonzero proper ideals of \(\mfk{o}\) that cannot be written as a product of prime ideals. By assumption \(\mfk{U}\) is nonempty. Since, as we have proved, \(\mfk{o}\) is Noetherian, we can pick a maximal element \(\mfk{a}\) of \(\mfk{U}\) with respect to inclusion. If \(\mfk{a}\) were a maximal ideal, then, since all maximal ideals are prime, \(\mfk{a}\) itself would be prime and hence not in \(\mfk{U}\). So \(\mfk{a}\) is properly contained in some maximal ideal \(\mfk{m}\), and we write \(\mfk{a}=\mfk{m}\mfk{m}^{-1}\mfk{a}\). Here \(\mfk{m}^{-1}\mfk{a} \subset \mfk{m}^{-1}\mfk{m}=\mfk{o}\) is an ideal of \(\mfk{o}\), and it contains \(\mfk{a}\) because \(\mfk{o} \subset \mfk{m}^{-1}\). We have \(\mfk{m}^{-1}\mfk{a} \supsetneq \mfk{a}\) since if not, we have \(\mfk{a}=\mfk{ma}\), which implies \(\mfk{m}=\mfk{o}\). It is also a proper ideal, since \(\mfk{m}^{-1}\mfk{a}=\mfk{o}\) would give \(\mfk{a}=\mfk{m}\). By maximality, \(\mfk{m}^{-1}\mfk{a}\not\in\mfk{U}\), hence it can be written as a product of prime ideals. But \(\mfk{m}\) is prime as well, so we have a prime factorization for \(\mfk{a}\), contradicting the definition of \(\mfk{U}\).

Next we show uniqueness up to permutation. If \[ \mfk{p}_1\mfk{p}_2\cdots\mfk{p}_k=\mfk{q}_1\mfk{q}_2\cdots\mfk{q}_j, \] then \(\mfk{q}_1\mfk{q}_2\cdots\mfk{q}_j\subset\mfk{p}_1\), and since \(\mfk{p}_1\) is prime, some \(\mfk{q}_i\) is contained in \(\mfk{p}_1\); after renumbering we may assume that \(\mfk{q}_1 \subset \mfk{p}_1\). By Proposition 2 we have \(\mfk{q}_1=\mfk{p}_1\mfk{r}_1\) for some ideal \(\mfk{r}_1\). In particular we also have \(\mfk{q}_1 \subset \mfk{r}_1\). Since \(\mfk{q}_1\) is prime, we either have \(\mfk{q}_1 \supset \mfk{p}_1\) or \(\mfk{q}_1 \supset \mfk{r}_1\). In the former case we get \(\mfk{p}_1=\mfk{q}_1\); multiplying both sides by \(\mfk{p}_1^{-1}\), we finish the proof by continuing inductively. In the latter case we have \(\mfk{r}_1=\mfk{q}_1=\mfk{p}_1\mfk{q}_1\), which shows that \(\mfk{p}_1=\mfk{o}\), which is impossible. \(\blacksquare\)
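In \(\mb{Z}\), Proposition 3 is the fundamental theorem of arithmetic: the ideal \((n)\) is the product of the prime ideals \((p)\) over the prime factors of \(n\), counted with multiplicity. A minimal sketch by trial division (the function name is hypothetical):

```python
from collections import Counter

def prime_ideal_factors(n: int) -> Counter:
    """Multiset of primes p with (n) = product of the (p), for n >= 2."""
    if n < 2:
        raise ValueError("expected a nonzero proper ideal, i.e. n >= 2")
    factors = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:      # divide out p as often as possible
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:                  # whatever remains is itself prime
        factors[n] += 1
    return factors

print(prime_ideal_factors(360))   # (360) = (2)^3 (3)^2 (5)
```

A `Counter` ignores order, which mirrors uniqueness "up to permutation".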

(Proposition 4) Every nontrivial prime ideal \(\mfk{p}\) is maximal.

Proof. Let \(\mfk{m}\) be a maximal ideal containing \(\mfk{p}\). By proposition 2 we have some \(\mfk{c}\) such that \(\mfk{p}=\mfk{mc}\). If \(\mfk{m} \neq \mfk{p}\), then \(\mfk{c} \neq \mfk{o}\), and we may write \(\mfk{c}=\mfk{p}_1\cdots\mfk{p}_n\), hence \(\mfk{p}=\mfk{m}\mfk{p}_1\cdots\mfk{p}_n\), which is a prime factorisation, contradicting the fact that \(\mfk{p}\) has a unique prime factorisation, which is \(\mfk{p}\) itself. Hence any maximal ideal containing \(\mfk{p}\) is \(\mfk{p}\) itself. \(\blacksquare\)

(Proposition 5) Suppose the Dedekind domain \(\mfk{o}\) contains only one nonzero prime (and hence maximal) ideal \(\mfk{p}\). Let \(t \in \mfk{p}\) with \(t \not\in \mfk{p}^2\). Then \(\mfk{p}\) is generated by \(t\).

Proof. Let \(\mfk{t}\) be the ideal generated by \(t\). By proposition 3 we have a factorisation \[ \mfk{t}=\mfk{p}^n \] for some \(n \geq 1\), since \(\mfk{o}\) contains only one nonzero prime ideal. If \(n \geq 2\), we write \(\mfk{p}^n=\mfk{p}^2\mfk{p}^{n-2}\), and proposition 2 gives \(\mfk{p}^2 \supset \mfk{p}^n\). But this is impossible, since then \(t \in \mfk{p}^n \subset \mfk{p}^2\), contradicting our assumption. Hence \(n=1\), so \(\mfk{t}=\mfk{p}\) provided that such \(t\) exists.

For the existence of \(t\), note that if not, then for all \(t \in \mfk{p}\) we have \(t \in \mfk{p}^2\), hence \(\mfk{p} \subset \mfk{p}^2\). On the other hand we already have \(\mfk{p}^2 = \mfk{p}\mfk{p}\), which implies that \(\mfk{p}^2 \subset \mfk{p}\) (proposition 2), hence \(\mfk{p}^2=\mfk{p}\). Multiplying by \(\mfk{p}^{-1}\) gives \(\mfk{p}=\mfk{o}\), which is impossible. Hence such \(t\) exists and our proof is finished. \(\blacksquare\)

Characterisation of Dedekind domain

In fact there is another equivalent definition of Dedekind domain:

A domain \(\mfk{o}\) is Dedekind if and only if

  • \(\mfk{o}\) is Noetherian.
  • \(\mfk{o}\) is integrally closed.
  • \(\mfk{o}\) has Krull dimension \(1\) (i.e. every non-zero prime ideal is maximal).

This is equivalent to saying that the fractional ideals form a group, and it is frequently used by mathematicians as well. But we need some more advanced techniques to establish the equivalence. Presumably there will be a post about this in the future.

Why Does a Vector Space Have a Basis (Module Theory)

Module and vector space

First we recall some background. Suppose \(A\) is a ring with multiplicative identity \(1_A\). A left module over \(A\) is an additive abelian group \((M,+)\), together with a scalar multiplication \(A \times M \to M\) such that \[ \begin{aligned} (a+b)x &= ax+bx \\ a(x+y) &= ax+ay \\ a(bx) &= (ab)x \\ 1_Ax &= x \end{aligned} \] for \(x,y \in M\) and \(a,b \in A\). As a corollary, we see \((0_A+0_A)x=0_Ax=0_Ax+0_Ax\), which shows \(0_Ax=0_M\) for all \(x \in M\). On the other hand, \(a(x-x)=0_M\), which implies \(a(-x)=-(ax)\). We can also define right \(A\)-modules but we are not discussing them here.

Let \(S\) be a subset of \(M\). We say \(S\) is a basis of \(M\) if \(S\) generates \(M\) and \(S\) is linearly independent. That is, for all \(m \in M\), we can pick \(s_1,\cdots,s_n \in S\) and \(a_1,\cdots,a_n \in A\) such that \[ m = a_1s_1+a_2s_2+\cdots+a_ns_n, \] and, for any distinct \(s_1,\cdots,s_n \in S\), we have \[ a_1s_1+a_2s_2+\cdots+a_ns_n=0_M \implies a_1=a_2=\cdots=a_n=0_A. \] Note this also shows that \(0_M\notin S\) (what happens if \(0_M \in S\)?). We say \(M\) is free if it has a basis. The case when \(M\) or \(A\) is trivial is excluded.

If \(A\) is a field, then \(M\) is called a vector space, which is no different from the one we learn about in linear algebra and functional analysis. Mathematicians in functional analysis may be interested in the cardinality of a basis of a vector space, for example, whether a vector space is of finite dimension, or whether the basis is countable. But the basis does not come from nowhere. In fact we can prove that every vector space has a basis, but modules are not so lucky. \(\def\mb{\mathbb}\)

Examples of non-free modules

First of all let's consider the cyclic group \(\mb{Z}/n\mb{Z}\) for \(n \geq 2\). If we define \[ \begin{aligned} \mb{Z} \times \mb{Z}/n\mb{Z} &\to \mb{Z}/n\mb{Z} \\ (m,k+n\mb{Z}) &\mapsto mk+n\mb{Z} \end{aligned} \] which amounts to adding \(m\) copies of an element, then we get a module, which will be denoted by \(M\). For any \(x=k+n\mb{Z} \in M\), we see \(nx=nk+n\mb{Z}=0_M\). Therefore for any nonempty subset \(S \subset M\), if \(x_1,\cdots,x_k \in S\), we have \[ nx_1+nx_2+\cdots+nx_k = 0_M \] with nonzero coefficients \(n\), so \(S\) is linearly dependent and \(M\) has no basis. In fact this can be generalized further. If \(A\) is a ring but not a field, let \(I\) be a nontrivial proper ideal, then \(A/I\) is a module that has no basis.
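The obstruction is easy to test numerically. The sketch below (hypothetical helper names) checks that \(n\) annihilates every residue, and then searches for residues \(x\) such that the singleton \(\{x\}\) is linearly independent over \(\mb{Z}\); testing the coefficients \(1,\cdots,n\) is enough to find a dependence.

```python
def annihilated_by_n(n: int) -> bool:
    """Check that n * x = 0 in Z/nZ for every residue x."""
    return all((n * x) % n == 0 for x in range(n))

def independent_singletons(n: int) -> list:
    """Residues x with k * x != 0 in Z/nZ for every k in 1..n."""
    return [x for x in range(n)
            if all((k * x) % n != 0 for k in range(1, n + 1))]

print(annihilated_by_n(6))          # True
print(independent_singletons(6))    # []
```

The empty list says that not even a one-element subset of \(\mb{Z}/6\mb{Z}\) is linearly independent, so no basis can exist.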

Following \(\mb{Z}/n\mb{Z}\) we also have another example on finite order. Indeed, any finite abelian group is not free as a module over \(\mb{Z}\). More generally,

Let \(G\) be an abelian group, and \(G_{tor}\) be its torsion subgroup. If \(G_{tor}\) is non-trivial, then \(G\) cannot be a free module over \(\mb{Z}\).


Next we shall take a look at infinite rings. Let \(F[X]\) be the polynomial ring over a field \(F\) and \(F'[X]\) be the subring of polynomials whose coefficient of \(X\) is equal to \(0\). Then \(F[X]\) is an \(F'[X]\)-module. However, it is not free.

Suppose we have a basis \(S\) of \(F[X]\) over \(F'[X]\); we claim that \(|S|>1\). If \(|S|=1\), say \(S=\{P\}\), then \(P\) cannot generate \(F[X]\): if \(P\) is constant, then we cannot generate a polynomial containing \(X\) with power \(1\); if \(P\) is not constant, then the constant polynomials cannot be generated. Hence \(S\) contains at least two distinct polynomials, say \(P_1 \neq 0\) and \(P_2 \neq 0\). However, note \(X^2P_1 \in F'[X]\) and \(X^2P_2 \in F'[X]\), which gives \[ (X^2P_2)P_1-(X^2P_1)P_2=0 \] with nonzero coefficients. Hence \(S\) cannot be a basis.
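The dependence relation can be verified on sample polynomials. In this sketch a polynomial is a list of coefficients, lowest degree first; the two polynomials `P1` and `P2` are arbitrary choices with integer coefficients (so \(F=\mb{Q}\) is an assumption of the example), and the helper names are hypothetical.

```python
def poly_mul(p, q):
    """Product of two coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_sub(p, q):
    """Difference p - q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def in_subring(p):
    """True when the coefficient of X is zero, i.e. p lies in F'[X]."""
    return len(p) < 2 or p[1] == 0

X2 = [0, 0, 1]              # the polynomial X^2
P1 = [1, 2]                 # 1 + 2X
P2 = [3, 0, 1, 5]           # 3 + X^2 + 5X^3

c1 = poly_mul(X2, P2)       # coefficient attached to P1
c2 = poly_mul(X2, P1)       # coefficient attached to P2
relation = poly_sub(poly_mul(c1, P1), poly_mul(c2, P2))

print(in_subring(c1), in_subring(c2))   # both coefficients lie in F'[X]
print(relation)                          # all zeros
```

Note that `P1` itself is not in \(F'[X]\), but `c1` and `c2` are, which is what makes the relation a legitimate linear dependence over \(F'[X]\).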

Why does a vector space have a basis

I hope those examples have convinced you that basis is not a universal thing. We are going to prove that every vector space has a basis. More precisely,

Let \(V\) be a nontrivial vector space over a field \(K\). Let \(\Gamma\) be a set of generators of \(V\) over \(K\) and let \(S \subset \Gamma\) be a linearly independent subset. Then there exists a basis \(B\) of \(V\) such that \(S \subset B \subset \Gamma\).

Note we can always find such \(\Gamma\) and \(S\). In the extreme case, we can pick \(\Gamma=V\) and let \(S\) be a set containing any single non-zero element of \(V\). Note this also shows that we can obtain a basis by expanding any linearly independent set. The proof relies on the fact that every non-zero element in a field is invertible, and also on Zorn's lemma. In fact, the axiom of choice is equivalent to the statement that every vector space has a basis.\(\def\mfk{\mathfrak}\)

Proof. Define \[ \mfk{T} =\{T \subset \Gamma:S \subset T, \text{ $T$ is linearly independent}\}. \] Then \(\mfk{T}\) is not empty since it contains \(S\). If \(T_1 \subset T_2 \subset \cdots\) is a totally ordered chain in \(\mfk{T}\) (we write the chain as a sequence for readability; the same argument works for a chain indexed by an arbitrary totally ordered set), then \(T=\bigcup_{i=1}^{\infty}T_i\) is again linearly independent and contains \(S\). To show that \(T\) is linearly independent, note that if \(x_1,x_2,\cdots,x_n \in T\), we can find some \(k_1,\cdots,k_n\) such that \(x_i \in T_{k_i}\) for \(i=1,2,\cdots,n\). If we pick \(k = \max(k_1,\cdots,k_n)\), then \[ x_1,x_2,\cdots,x_n \in \bigcup_{i=1}^{n}T_{k_i}=T_k. \] But we already know that \(T_k\) is linearly independent, so \(a_1x_1+\cdots+a_nx_n=0_V\) implies \(a_1=\cdots=a_n=0_K\).

By Zorn's lemma, \(\mfk{T}\) has a maximal element \(B\), and \(B\) is linearly independent since it is an element of \(\mfk{T}\). Next we show that \(B\) generates \(V\). Suppose not; then, since \(\Gamma\) generates \(V\), we can pick some \(x \in \Gamma\) that is not generated by \(B\). Define \(B'=B \cup \{x\}\). We see \(B'\) is linearly independent as well, because if we pick \(y_1,y_2,\cdots,y_n \in B\), and if \[ \sum_{k=1}^{n}a_ky_k+bx=0_V, \] then if \(b \neq 0\) we have \[ x = -\sum_{k=1}^{n}b^{-1}a_ky_k, \] which lies in the span of \(B\), contradicting the assumption that \(x\) is not generated by \(B\). Hence \(b=0_K\), and then \(a_1=\cdots=a_n=0_K\) because \(B\) is linearly independent. However, we have now proved that \(B'\) is a linearly independent set properly containing \(B\) and contained in \(\Gamma\), contradicting the maximality of \(B\) in \(\mfk{T}\). Hence \(B\) generates \(V\). \(\square\)
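When \(\Gamma\) is finite, Zorn's lemma is not needed: the maximal element can be built greedily, adding a generator whenever it is not in the span of what has been collected so far. The following is a minimal sketch over \(\mb{Q}\) under that finiteness assumption, with hypothetical function names; it is not a substitute for the general proof.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors over Q, by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def extend_to_basis(S, Gamma):
    """Grow the independent list S to a basis B with S in B in Gamma."""
    B = list(S)
    for x in Gamma:
        if rank(B + [x]) > len(B):    # x is not in the span of B
            B.append(x)
    return B

Gamma = [(1, 1, 0), (2, 2, 0), (0, 1, 1), (1, 2, 1), (0, 0, 1)]
S = [(1, 1, 0)]
B = extend_to_basis(S, Gamma)
print(B)    # [(1, 1, 0), (0, 1, 1), (0, 0, 1)]
```

The test `rank(B + [x]) > len(B)` is exactly the step in the proof where \(B \cup \{x\}\) is shown to be linearly independent.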

The Grothendieck Group

Free group

Let \(A\) be an abelian group. Let \((e_i)_{i \in I}\) be a family of elements of \(A\). We say that this family is a basis for \(A\) if the family is not empty, and if every element of \(A\) has a unique expression as a linear expression \[ x = \sum_{i \in I} x_i e_i \] where \(x_i \in \mathbb{Z}\) and almost all \(x_i\) are equal to \(0\). This means that the sum is actually finite. An abelian group is said to be free if it has a basis. Alternatively, we may write \(A\) as a direct sum by \[ A \cong \bigoplus_{i \in I}\mathbb{Z}e_i. \]

Free abelian group generated by a set

Let \(S\) be a set. Say we want to get a group out of this for some reason, so how? It is not a good idea to endow \(S\) with a binary operation beforehand, since after all \(S\) is merely a set. We shall generate a group out of \(S\) in the freest possible way.

Let \(\mathbb{Z}\langle S \rangle\) be the set of all maps \(\varphi:S \to \mathbb{Z}\) such that, for only a finite number of \(x \in S\), we have \(\varphi(x) \neq 0\). For simplicity, we denote by \(k \cdot x\) the map \(\varphi_0 \in \mathbb{Z}\langle S \rangle\) such that \(\varphi_0(x)=k\) but \(\varphi_0(y) = 0\) if \(y \neq x\). For any \(\varphi\), we claim that \(\varphi\) has a unique expression \[ \varphi=k_1 \cdot x_1 + k_2 \cdot x_2 + \cdots + k_n \cdot x_n. \] One can consider these integers \(k_i\) as the order of \(x_i\), or simply the number of times that \(x_i\) appears (which may be negative). First we show existence. For \(\varphi\in\mathbb{Z}\langle S \rangle\), let \(I=\{x_1,x_2,\cdots,x_n\}\) be the set of elements of \(S\) such that \(\varphi(x_i) \neq 0\). If we denote \(k_i=\varphi(x_i)\), we can show that \(\psi=k_1 \cdot x_1 + k_2 \cdot x_2 + \cdots + k_n \cdot x_n\) is equal to \(\varphi\). For \(x = x_i \in I\), we have \(\psi(x)=k_i\neq 0\) by definition of the '\(\cdot\)'; if \(y \notin I\) however, we then have \(\psi(y)=0\). This coincides with \(\varphi\). \(\blacksquare\)

By definition the zero map \(\mathcal{O}=0 \cdot x \in \mathbb{Z}\langle S \rangle\) and therefore we may write any \(\varphi\) by \[ \varphi=\sum_{x \in S}k_x\cdot x \] where \(k_x \in \mathbb{Z}\) and can be zero. Suppose now we have two expressions, for example \[ \varphi=\sum_{x \in S}k_x \cdot x=\sum_{x \in S}k_x'\cdot x \] Then \[ \varphi-\varphi=\mathcal{O}=\sum_{x \in S}(k_x-k'_x)\cdot x \] Suppose \(k_y - k_y' \neq 0\) for some \(y \in S\), then \[ \mathcal{O}(y)=k_y-k_y'\neq 0 \] which is a contradiction. Therefore the expression is unique. \(\blacksquare\)

This \(\mathbb{Z}\langle S \rangle\) is what we are looking for. It is an additive group (which can be proved immediately) and, what is more important, every element can be expressed as a 'sum' associated with finite number of elements of \(S\). We shall write \(F_{ab}(S)=\mathbb{Z}\langle S \rangle\), and call it the free abelian group generated by \(S\). For elements in \(S\), we say they are free generators of \(F_{ab}(S)\). If \(S\) is a finite set, we say \(F_{ab}(S)\) is finitely generated.
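A finitely supported map \(S \to \mathbb{Z}\) is naturally modelled by a dictionary that stores only the nonzero values. The sketch below is one possible encoding (the function names are hypothetical); dropping zero coefficients makes the dictionary a canonical form, which mirrors the uniqueness of the expression proved above.

```python
def normalise(phi):
    """Drop zero coefficients so that each map has one canonical dict."""
    return {x: k for x, k in phi.items() if k != 0}

def single(k, x):
    """The map k . x: value k at x, zero elsewhere."""
    return normalise({x: k})

def add(phi, psi):
    """Pointwise sum of two finitely supported maps."""
    out = dict(phi)
    for x, k in psi.items():
        out[x] = out.get(x, 0) + k
    return normalise(out)

def neg(phi):
    """Pointwise negative, the inverse in the group."""
    return {x: -k for x, k in phi.items()}

phi = add(single(2, "a"), single(-3, "b"))    # 2.a + (-3).b
print(phi)                    # {'a': 2, 'b': -3}
print(add(phi, neg(phi)))     # {} is the zero map
```

Here \(S\) is the set of Python strings, which has no group structure of its own; the addition lives entirely in the coefficients.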

An abelian group is free if and only if it is isomorphic to a free abelian group \(F_{ab}(S)\) for some set \(S\).

Proof. First we shall show that \(F_{ab}(S)\) is free. For \(x \in S\), we denote \(\varphi = 1 \cdot x\) by \([x]\). Then for any \(k \in \mathbb{Z}\), we have \(k[x]=k \cdot x\) and \(k[x]+k'[y] = k\cdot x + k' \cdot y\). By definition of \(F_{ab}(S)\), any element \(\varphi \in F_{ab}(S)\) has a unique expression \[ \varphi = k_1 \cdot x_1 + \cdots + k_n \cdot x_n =k_1[x_1]+\cdots+k_n[x_n]. \] Therefore \(F_{ab}(S)\) is free since we have found the basis \(([x])_{x \in S}\).

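Conversely, if \(A\) is free with basis \((e_i)_{i \in I}\), then the map \(F_{ab}(I) \to A\) sending \(\sum_{i}k_i \cdot i\) to \(\sum_{i}k_ie_i\) is a homomorphism. It is surjective because the basis generates \(A\), and injective because expressions in terms of the basis are unique. Hence \(A \cong F_{ab}(I)\), and our statement is proved. \(\blacksquare\)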

The connection between an arbitrary abelian group and a free abelian group

(Proposition 1) If \(A\) is an abelian group, then there is a free abelian group \(F\) which has a subgroup \(H\) such that \(A \cong F/H\).

Proof. Let \(S\) be any set admitting a surjective map \(\gamma: S \to A\) (for example \(S=A\) with \(\gamma\) the identity), and consider the free abelian group \(F_{ab}(S)\). We get a unique homomorphism \(\gamma_\ast:F_{ab}(S) \to A\) extending \(\gamma\) by \[ \begin{aligned} \gamma_\ast:F_{ab}(S) &\to A \\ \varphi=\sum_{x \in S}k_x\cdot x &\mapsto \sum_{x \in S}k_x\gamma(x) \end{aligned} \] which is also surjective. By the first isomorphism theorem, if we set \(H=\ker(\gamma_\ast)\) and \(F_{ab}(S)=F\), then \[ F/H \cong A. \] \(\blacksquare\)
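As an illustration of the proof, take \(A=\mathbb{Z}/6\mathbb{Z}\), \(S=A\) and \(\gamma\) the identity. The sketch below encodes elements of \(F_{ab}(S)\) as dictionaries of nonzero coefficients (an encoding chosen for this example, with hypothetical names) and computes \(\gamma_\ast\).

```python
N = 6
S = range(N)                   # here S = A = Z/6Z as a set

def gamma(x):
    """The chosen surjection S -> A; the identity on residues."""
    return x % N

def gamma_star(phi):
    """Send sum k_x . x to sum k_x * gamma(x), computed in Z/6Z."""
    return sum(k * gamma(x) for x, k in phi.items()) % N

def add(phi, psi):
    """Addition in F_ab(S), dropping zero coefficients."""
    out = dict(phi)
    for x, k in psi.items():
        out[x] = out.get(x, 0) + k
    return {x: k for x, k in out.items() if k != 0}

phi = {2: 1, 4: 1}             # 1.2 + 1.4, a nonzero element of F_ab(S)
psi = {5: 3}                   # 3.5

print(gamma_star(phi))                                   # 0
print(gamma_star(add(phi, psi)))                         # 3
print((gamma_star(phi) + gamma_star(psi)) % N)           # 3
```

The element `phi` is not the zero map, yet it is sent to \(0\); it is a typical element of the kernel \(H\).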

(Proposition 2) If \(A\) is finitely generated, then \(F\) can also be chosen to be finitely generated.

Proof. Let \(S\) be a set of generators of \(A\) and let \(\gamma:S \to A\) be the inclusion. The homomorphism \(\gamma_\ast:F_{ab}(S) \to A\) constructed in Proposition 1 is surjective because \(S\) generates \(A\), so \(A \cong F_{ab}(S)/\ker(\gamma_\ast)\); if \(S\) is finite, then \(F=F_{ab}(S)\) is finitely generated. More generally, let \(S'\) be a set with a surjective map \(\lambda:S' \to S\) (for example a set containing \(S\)); if \(S\) is finite, then \(S'\) can also be chosen finite. We have maps from \(S\) and \(S'\) into \(F_{ab}(S)\) and \(F_{ab}(S')\) respectively by \(f_S(x)=1 \cdot x\) and \(f_{S'}(x')=1 \cdot x'\). Defining \(g=f_{S} \circ \lambda:S' \to F_{ab}(S)\), we get another homomorphism by \[ \begin{aligned} g_\ast:F_{ab}(S') &\to F_{ab}(S) \\ \varphi'=\sum_{x \in S'}k_{x} \cdot x &\mapsto \sum_{x \in S'}k_{x}\cdot g(x) \end{aligned} \] This is the unique homomorphism such that \(g_\ast \circ f_{S'} = g\). As one can verify, this map is also surjective. Therefore \(\gamma_\ast \circ g_\ast:F_{ab}(S') \to A\) is surjective, and by the first isomorphism theorem we have \[ A \cong F_{ab}(S')/\ker(\gamma_\ast \circ g_\ast). \] \(\blacksquare\)

It's worth mentioning separately that we have implicitly proved two statements with commutative diagrams:

(Proposition 3 | Universal property) If \(g:S \to B\) is a mapping of \(S\) into some abelian group \(B\), then we can define a unique group-homomorphism making the following diagram commutative:

diagram-000001

(Proposition 4) If \(\lambda:S \to S'\) is a mapping of sets, there is a unique homomorphism \(\overline{\lambda}:F_{ab}(S) \to F_{ab}(S')\) making the following diagram commutative:

diagram-000001

(In the proof of Proposition 2 we exchanged \(S\) and \(S'\).)

The Grothendieck group

(The Grothendieck group) Let \(M\) be a commutative monoid written additively. We shall prove that there exists a commutative group \(K(M)\) with a monoid homomorphism \[ \gamma:M \to K(M) \]

satisfying the following universal property: If \(f:M \to A\) is a homomorphism from \(M\) into an abelian group \(A\), then there exists a unique homomorphism \(f_\gamma:K(M) \to A\) such that \(f=f_\gamma\circ\gamma\). This can be represented by a commutative diagram:

diagram-000001

Proof. The following commutative diagram describes what we are doing.

grothendieck-group-universal-proof

Let \(F_{ab}(M)\) be the free abelian group generated by \(M\). For \(x \in M\), we denote \(1 \cdot x \in F_{ab}(M)\) by \([x]\). Let \(B\) be the subgroup of \(F_{ab}(M)\) generated by all elements of the type \[ [x+y]-[x]-[y] \] where \(x,y \in M\). We let \(K(M)=F_{ab}(M)/B\). Let \(i:x \mapsto [x]\) and let \(\pi\) be the canonical map \[ \pi:F_{ab}(M) \to F_{ab}(M)/B. \] We define \(\gamma = \pi \circ i\), and verify that \(\gamma\) is our desired homomorphism satisfying the universal property. For \(x,y \in M\), we have \(\gamma(x+y)=\pi([x+y])\) and \(\gamma(x)+\gamma(y) = \pi([x])+\pi([y])=\pi([x]+[y])\). However we have \[ [x+y]-[x]-[y] \in B, \] which implies that \[ \gamma(x)+\gamma(y)=\pi([x]+[y])=\pi([x+y]) = \gamma(x+y). \] Hence \(\gamma\) is a monoid-homomorphism (note that \(\gamma(0)=0\) follows from \(\gamma(0)+\gamma(0)=\gamma(0)\) in the group \(K(M)\)). Finally the universal property. By proposition 3, we have a unique homomorphism \(f_\ast:F_{ab}(M) \to A\) such that \(f_\ast \circ i = f\). Since \(f\) is a monoid-homomorphism, \(f_\ast([x+y]-[x]-[y])=f(x+y)-f(x)-f(y)=0\), so \(f_\ast(y) =0\) for all \(y \in B\), that is, \(B \subset \ker{f_\ast}\). Therefore \(f_\gamma(x+B)=f_\ast (x)\) is well defined and satisfies \(f_\gamma \circ \gamma = f\). It is unique because the elements \(\gamma(x)\) generate \(K(M)\), and \(f_\gamma\) must send \(\gamma(x)\) to \(f(x)\). \(\blacksquare\)

Comments of the proof

Why such a \(B\)? Note in general \([x+y]\) is not necessarily equal to \([x]+[y]\) in \(F_{ab}(M)\), but we want them to be equal. So instead we create a new equivalence relation, by factoring out the subgroup generated by the elements \([x+y]-[x]-[y]\). Therefore in \(K(M)\) we see \([x+y]+B = [x]+[y]+B\), which finally makes \(\gamma\) a homomorphism. We use the same strategy to generate the tensor product of two modules later. But at that time we have more than one relation to take care of.

Cancellative monoid

If for all \(x,y,z \in M\), \(x+y=x+z\) implies \(y=z\), then we say \(M\) is a cancellative monoid, or the cancellation law holds in \(M\). Note for the proof above we didn't use any property of cancellation. However we still have an interesting property for cancellation law.

(Theorem) The cancellation law holds in \(M\) if and only if \(\gamma\) is injective.

Proof. This proof involves another approach to the Grothendieck group. We consider pairs \((x,y) \in M \times M\). Define \[ (x,y) \sim (x',y') \iff \exists \ell \in M, y+x'+\ell=x+y'+\ell. \] Then we get an equivalence relation (try to prove it yourself!). We define the addition component-wise, that is, \((x,y)+(x',y')=(x+x',y+y')\), then the equivalence classes of pairs form a group \(A\), where the zero element is \([(0,0)]\). We have a monoid-homomorphism \[ f:x \mapsto [(x,0)]. \] If the cancellation law holds in \(M\), then \[ \begin{aligned} f(x) = f(y) &\implies [(x,0)] = [(y,0)] \\ &\implies 0+y+\ell=x+0+\ell \text{ for some } \ell \in M \\ &\implies x=y. \end{aligned} \] Hence \(f\) is injective. By the universal property of the Grothendieck group, we get a unique homomorphism \(f_\gamma\) such that \(f_\gamma \circ \gamma = f\). If \(\gamma(x)=\gamma(y)\), then \(f(x)=f_\gamma(\gamma(x))=f_\gamma(\gamma(y))=f(y)\), so \(x=y\) since \(f\) is injective. Hence \(\gamma\) is injective.

Conversely, suppose \(\gamma\) is injective and \(x+\ell = y+\ell\) in \(M\). Applying \(\gamma\) gives \(\gamma(x)+\gamma(\ell)=\gamma(y)+\gamma(\ell)\) in the group \(K(M)\), where we may subtract \(\gamma(\ell)\) to get \(\gamma(x)=\gamma(y)\). Since \(\gamma\) is injective, \(x=y\), so the cancellation law holds in \(M\). \(\blacksquare\)
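The theorem can be tested on small finite monoids using the pair construction from the proof. The sketch below assumes, as is standard but not shown in this post, that the group of pair classes is itself a model of \(K(M)\), so that \(x \mapsto [(x,0)]\) plays the role of \(\gamma\). All names are hypothetical.

```python
from itertools import product

def equivalent(p, q, elements, add):
    """(x, y) ~ (x', y') iff y + x' + l == x + y' + l for some l in M."""
    (x, y), (x2, y2) = p, q
    return any(add(add(y, x2), l) == add(add(x, y2), l) for l in elements)

def gamma_is_injective(elements, add, zero):
    """gamma(x) is the class of (x, 0); distinct x must give distinct classes."""
    return all(x == y or not equivalent((x, zero), (y, zero), elements, add)
               for x, y in product(elements, repeat=2))

def is_cancellative(elements, add):
    """x + y == x + z must force y == z."""
    return all(y == z or add(x, y) != add(x, z)
               for x, y, z in product(elements, repeat=3))

cyclic = (range(3), lambda a, b: (a + b) % 3, 0)    # Z/3Z, a group
maximum = (range(2), max, 0)                         # {0, 1} under max

for elements, add, zero in (cyclic, maximum):
    print(is_cancellative(elements, add),
          gamma_is_injective(elements, add, zero))
```

The monoid \(\{0,1\}\) under \(\max\) is not cancellative, since \(\max(1,0)=\max(1,1)\), and correspondingly \(0\) and \(1\) land in the same class.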

Examples

Our first example is \(\mathbb{N}\) (including \(0\)). Elements of \(F_{ab}(\mathbb{N})\) are of the form \[ \varphi=k_1 \cdot n_1 + k_2 \cdot n_2+\cdots + k_m \cdot n_m. \] The subgroup \(B\) is generated by the elements \[ \varphi=1\cdot (m+n)-1\cdot m - 1\cdot n \] which we wish to represent \(0\). Indeed, \(K(\mathbb{N}) \simeq \mathbb{Z}\) via the homomorphism \[ \begin{aligned} f:K(\mathbb{N}) &\to \mathbb{Z} \\ \sum_{j=1}^{m}k_j \cdot n_j +B &\mapsto \sum_{j=1}^{m}k_j n_j. \end{aligned} \] It is well defined because every generator of \(B\) is sent to \((m+n)-m-n=0\). For \(r \in \mathbb{N}\), we see \(f(1 \cdot r+B)=r\) and \(f((-1) \cdot r+B)=-r\), so \(f\) is surjective. On the other hand, \(1 \cdot n+B=n(1 \cdot 1+B)\) for every \(n \in \mathbb{N}\), so every class has the form \(k(1 \cdot 1+B)\) with image \(k\); hence only the zero class is sent to \(0\), and \(f\) is injective.


In the first example we 'granted' the natural numbers 'subtraction'. Next we grant 'division' to a multiplicative monoid.

Consider \(M=\mathbb{Z} \setminus 0\). Now for \(F_{ab}(M)\) we write elements in the form \[ \varphi={}^{k_1}n_1{}^{k_2}n_2\cdots{}^{k_m}n_m \] which denotes that \(\varphi(n_j)=k_j\) and has no other differences. Then for elements in \(B\) they are generated by \[ \varphi = {}^{1}(n_1n_2){}^{-1}(n_1){}^{-1}(n_2) \] which we wish to represent \(1\). Then we see \(K(M) \simeq \mathbb{Q} \setminus 0\) if we take the isomorphism \[ \begin{aligned} f:K(M) &\to \mathbb{Q} \setminus 0 \\ \left(\prod_{j=1}^{m}{}^{k_j}n_j\right)B &\mapsto \prod_{j=1}^{m}n_j^{k_j}. \end{aligned} \]
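The map \(f\) is easy to evaluate on representatives. In this sketch an element of \(F_{ab}(M)\) is a dictionary sending a nonzero integer \(n_j\) to its exponent \(k_j\) (an encoding chosen for this example; the function name is hypothetical).

```python
from fractions import Fraction

def evaluate(phi):
    """Image under f of the class of phi: the product of n ** k."""
    result = Fraction(1)
    for n, k in phi.items():
        result *= Fraction(n) ** k
    return result

phi = {6: 1, 4: -1}                  # the formal symbol "6 over 4"
relation = {15: 1, 3: -1, 5: -1}     # a generator of B with n1 = 3, n2 = 5

print(evaluate(phi))         # 3/2
print(evaluate(relation))    # 1
```

The generator of \(B\) is sent to \(1\), the unit of \(\mathbb{Q} \setminus 0\), which is what makes \(f\) well defined on classes.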


Of course this is not the end of the Grothendieck group. But for further examples we may need a lot of topology background. For example, the topological \(K\)-theory group of a topological space is the Grothendieck group of isomorphism classes of topological vector bundles. But I think it is not a good idea to post these examples at this point.

Study Vector Bundle in a Relatively Harder Way - Tangent Bundle

Tangent line and tangent surface as vector spaces

We begin our study by some elementary Calculus. Now we have the function \(f(x)=x^2+\frac{e^x}{x^2+1}\) as our example. It should not be a problem to find its tangent line at point \((0,1)\), by calculating its derivative, we have \(l:x-y+1=0\) as the tangent line.
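The tangent line can be double-checked numerically. The sketch below uses the derivative obtained by hand with the quotient rule and compares it with a central finite difference; the step size `h` is an arbitrary choice and the function names are hypothetical.

```python
import math

def f(x):
    return x**2 + math.exp(x) / (x**2 + 1)

def f_prime(x):
    """Derivative of f, computed by hand with the quotient rule."""
    return 2*x + math.exp(x) * (x - 1)**2 / (x**2 + 1)**2

def numeric_derivative(g, x, h=1e-6):
    """Central finite-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2*h)

x0 = 0.0
slope = f_prime(x0)
intercept = f(x0) - slope * x0

print(slope, intercept)              # 1.0 1.0, i.e. y = x + 1
print(numeric_derivative(f, x0))     # close to 1
```

A slope and intercept of \(1\) give \(y=x+1\), which is the line \(x-y+1=0\).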

\(l\) is not a vector space since, in general, it does not pass through the origin. But \(l-\overrightarrow{OA}\), where \(A=(0,1)\) is the point of tangency, is a vector space. In general, suppose \(P(x,y)\) is a point on the curve determined by \(f\), i.e. \(y=f(x)\), then we obtain a vector space \(l_p-\overrightarrow{OP} \simeq \mathbb{R}\). But the action of moving the tangent line to the origin is superfluous, so naturally we consider the tangent line at \(P\) as a vector space determined by \(P\). In this case, the induced vector space (tangent line) is always of dimension \(1\).

image-20201211153752166

Now we move to two-variable functions. We have a function \(a(x,y)=x^2+y^2-x-y+xy\) as our example. Some elementary Calculus work gives us the tangent surface of \(z=a(x,y)\) at \(A(1,1,1)\), which is the plane \(S:2x+2y-z=3\), and \(S \simeq\mathbb{R}^2\). Again, this can be considered as a vector space determined by \(A\), or roughly speaking it is one if we take \(A\) as the origin. Further we have a basis \((\overrightarrow{AB},\overrightarrow{AC})\). Other vectors on \(S\), for example \(\overrightarrow{AD}\), can be written as a linear combination of \(\overrightarrow{AB}\) and \(\overrightarrow{AC}\). In other words, \(S\) is "spanned" by \((\overrightarrow{AB},\overrightarrow{AC})\).

image-20201211153513707
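The equation of the tangent surface follows from the two partial derivatives at \((1,1)\). A short sketch, with the partial derivatives computed by hand and hypothetical function names:

```python
def a(x, y):
    return x**2 + y**2 - x - y + x*y

def gradient(x, y):
    """Partial derivatives (a_x, a_y), computed by hand."""
    return (2*x - 1 + y, 2*y - 1 + x)

def tangent_plane(x0, y0):
    """Coefficients (p, q, d) of the tangent plane p*x + q*y - z = d."""
    p, q = gradient(x0, y0)
    return p, q, p*x0 + q*y0 - a(x0, y0)

print(a(1, 1))               # 1, so A = (1, 1, 1) lies on the surface
print(tangent_plane(1, 1))   # (2, 2, 3), i.e. 2x + 2y - z = 3
```

With \(p=a_x\) and \(q=a_y\), the vectors \((1,0,p)\) and \((0,1,q)\) are one possible choice of basis for the plane once \(A\) is taken as the origin.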

Tangent line and tangent surface play an important role in differentiation. But sometimes we do not have a chance to use it with ease, for example \(S^1:x^2+y^2=1\) cannot be represented by a single-variable function. However the implicit function theorem, which you have already learned in Calculus, gives us a chance to find a satisfying function locally. Here in this post we will try to generalize this concept, trying to find the tangent space at some point of a manifold. (The two examples above have already determined two manifolds and two tangent spaces.)

Definition of tangent vectors

We will introduce the abstract definition of a tangent vector at the beginning. You may think it is way too abstract but actually it is not. Surprisingly, the following definition can simplify our work in the future. But before we go, make sure that you have learned about the Fréchet derivative (along with some functional analysis knowledge).

Let \(M\) be a manifold of class \(C^p\) with \(p \geq 1\) and let \(x\) be a point of \(M\). Let \((U,\varphi)\) be a chart at \(x\) and \(v\) be an element of the vector space \(\mathbf{E}\) where \(\varphi(U)\) lies (for example, if \(M\) is a \(d\)-dimensional manifold, then \(v \in \mathbb{R}^d\)). Next we consider the triple \((U,\varphi,v)\). Suppose \((U,\varphi,v)\) and \((V,\psi,w)\) are two such triples. We say these two triples are equivalent if the following identity holds: \[ {\color{green}{[}}{\color{red}{(}}{\color{red}{\psi\circ\varphi^{-1}}}{\color{red}{)'}}{\color{red}{(}}{\color{purple}{\varphi(x)}}{\color{red}{)}}{\color{green}{]}}(v)=w. \] This identity looks messy so we need to explain how to read it. First we consider the function in red: the derivative of \(\psi\circ\varphi^{-1}\). The derivative of \(\psi\circ\varphi^{-1}\) at the point \(\varphi(x)\) (in purple) is a linear transform, and the transform is enclosed in green brackets. Finally, this linear transform maps \(v\) to \(w\). In short we read: the derivative of \(\psi\circ\varphi^{-1}\) at \(\varphi(x)\) maps \(v\) to \(w\). You may recall that you have met something like \(\psi\circ\varphi^{-1}\) in the definition of a manifold. It may seem unlikely that these 'triples' should be associated to tangent vectors. But before we explain it, we need to make sure that we have indeed defined an equivalence relation.

(Theorem 1) The relation \[ (U,\varphi,v) \sim (V,\psi,w) \iff [(\psi\circ\varphi^{-1})'(\varphi(x))](v)=w \] is an equivalence relation.

Proof. This will not go further than elementary Calculus, in fact, chain rule:

(Chain rule) If \(f:U \to V\) is differentiable at \(x_0 \in U\), if \(g: V \to W\) is differentiable at \(f(x_0)\), then \(g \circ f\) is differentiable at \(x_0\), and \[ (g\circ f)'(x_0)=g'(f(x_0))\circ f'(x_0) \]

  1. \((U,\varphi,v)\sim(U,\varphi,v)\).

Since \(\varphi\circ\varphi^{-1}=\operatorname{id}\), whose derivative is still the identity everywhere, we have \[ [(\varphi\circ\varphi^{-1})'(\varphi(x))](v)=\operatorname{id}(v)=v \]

  2. If \((U,\varphi,v) \sim (V,\psi,w)\), then \((V,\psi,w)\sim(U,\varphi,v)\).

So now we have \[ [(\psi\circ\varphi^{-1})'(\varphi(x))](v)=w. \] To prove that \([(\varphi\circ\psi^{-1})'(\psi(x))]{}(w)=v\), we need to apply the chain rule.

Note first \[ (\psi\circ\varphi^{-1})'(\varphi(x))=\psi'(\varphi^{-1}(\varphi(x)))\circ(\varphi^{-1})'(\varphi(x))=\psi'(x)\circ(\varphi^{-1})'(\varphi(x)) \] while \[ (\varphi\circ\psi^{-1})'(\psi(x))=\varphi'(x)\circ(\psi^{-1})'(\psi(x)). \] But also by the chain rule, if \(f\) is a diffeomorphism, we have \[ (f^{-1}\circ f)'(x)=(f^{-1})'(f(x))\circ f'(x)=\operatorname{id} \] or equivalently \[ f'(x)=[(f^{-1})'(f(x))]^{-1} \quad\text{and}\quad (f^{-1})'(f(x))=[f'(x)]^{-1}. \]

Therefore \[ \begin{aligned} \{(\psi\circ\varphi^{-1})'(\varphi(x))\}^{-1} &=\{\psi'(x)\circ(\varphi^{-1})'(\varphi(x))\}^{-1} \\ &=\{(\varphi^{-1})'(\varphi(x))\}^{-1}\circ\{\psi'(x)\}^{-1} \\ &=\varphi'(x)\circ(\psi^{-1})'(\psi(x)) \\ &=(\varphi\circ\psi^{-1})'(\psi(x)) \end{aligned} \] which implies \[ (\varphi\circ\psi^{-1})'(\psi(x))(w)=\{(\psi\circ\varphi^{-1})'(\varphi(x))\}^{-1}(w)=v. \]

  3. If \((U,\varphi,v)\sim(V,\psi,w)\) and \((V,\psi,w)\sim(W,\lambda,z)\), then \((U,\varphi,v)\sim(W,\lambda,z)\).

We are given identities \[ [(\psi\circ\varphi^{-1})'(\varphi(x))](v)=w \] and \[ [(\lambda\circ\psi^{-1})'(\psi(x))](w)=z. \] By canceling \(w\), we get \[ \begin{aligned} z = [(\lambda\circ\psi^{-1})'(\psi(x))] \circ [(\psi\circ\varphi^{-1})'(\varphi(x))] (v) \end{aligned}. \] On the other hand, \[ \begin{aligned} (\lambda\circ\varphi^{-1})'(\varphi(x))&=(\lambda\circ\psi^{-1}\circ\psi\circ\varphi^{-1})'(\varphi(x)) \\ &=(\lambda\circ\psi^{-1})'(\psi\circ\varphi^{-1}\circ\varphi(x))\circ(\psi\circ\varphi^{-1})'(\varphi(x)) \\ &=(\lambda\circ\psi^{-1})'(\psi(x))\circ(\psi\circ\varphi^{-1})'(\varphi(x)) \end{aligned} \] which is what we needed. \(\square\)
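The three properties can also be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it takes the one-dimensional manifold \(M=\mathbb{R}\) with three hypothetical charts (`phi`, `psi`, `lam`, none of which appear in the text), and it uses the identity \((\psi\circ\varphi^{-1})'(\varphi(x))=\psi'(x)\circ[\varphi'(x)]^{-1}\) from the proof, with derivatives approximated by central differences.

```python
import math

def derivative(f, t, h=1e-6):
    """Central finite-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

def transport(phi, psi, x, v):
    """Coordinates in chart psi of the vector with coordinates v in chart phi.

    In one dimension (psi o phi^{-1})'(phi(x)) = psi'(x) / phi'(x),
    which is the chain-rule identity used in the proof.
    """
    return derivative(psi, x) / derivative(phi, x) * v

# Hypothetical charts on M = R; each is a diffeomorphism onto its image.
phi = lambda t: t
psi = lambda t: t**3 + t
lam = math.exp  # named lam because 'lambda' is a Python keyword

x, v = 0.5, 2.0
w = transport(phi, psi, x, v)   # (U, phi, v) ~ (V, psi, w)
z = transport(psi, lam, x, w)   # (V, psi, w) ~ (W, lam, z)

reflexive = transport(phi, phi, x, v)    # should give back v
symmetric = transport(psi, phi, x, w)    # should give back v
transitive = transport(phi, lam, x, v)   # should agree with z
```

Up to finite-difference error, `reflexive` and `symmetric` return `v`, and `transitive` agrees with `z`, mirroring the three steps of the proof.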

An equivalence class of such triples \((U,\varphi,v)\) is called a tangent vector of \(M\) at \(x\). The set of such tangent vectors is called the tangent space to \(M\) at \(x\), which is denoted by \(T_x(M)\). But it seems that we have gone too far. Is the triple even a 'vector'? To get a clear view let's see Euclidean submanifolds first.

Definition of tangent vectors of Euclidean submanifolds

Suppose \(M\) is a submanifold of \(\mathbb{R}^n\). We say \(z\) is a tangent vector of \(M\) at the point \(x\) if there exists a curve \(\alpha\) of class \(C^1\), defined on \(\mathbb{R}\), and an interval \(I\) containing \(t_0\) with \(\alpha(I) \subset M\), such that \(\alpha(t_0)=x\) and \(\alpha'(t_0)=z\). (For convenience we often take \(t_0=0\).)

This definition becomes natural once we check an example. For the curve \(M: x^2+1+\frac{e^x}{x^2+1}-y=0\), we can show that \((1,1)^T\) is a tangent vector of \(M\) at \((0,2)\), which is identical to our first example. Taking \[ \alpha(t)=\left(t,t^2+1+\frac{e^t}{t^2+1}\right) \] we get \(\alpha(0)=(0,2)\) and \[ \alpha'(t)=\left(1,2t+\frac{e^t(t-1)^2}{(t^2+1)^2}\right)^T. \] Therefore \(\alpha'(0)=(1,1)^T\). \(\square\)
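As a quick check of the computation, the following sketch evaluates the closed-form derivative of \(\alpha\) and compares it with a central finite difference. The helper names (`alpha`, `alpha_prime`, `numeric_derivative`) are ours and purely illustrative.

```python
import math

def alpha(t):
    """The curve alpha(t) = (t, t^2 + 1 + e^t / (t^2 + 1))."""
    return (t, t**2 + 1 + math.exp(t) / (t**2 + 1))

def alpha_prime(t):
    """Closed-form derivative of alpha, as computed in the text."""
    return (1.0, 2 * t + math.exp(t) * (t - 1)**2 / (t**2 + 1)**2)

def numeric_derivative(curve, t, h=1e-6):
    """Componentwise central difference of a curve in R^2."""
    p, q = curve(t + h), curve(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(p, q))

base_point = alpha(0.0)                    # the point of M where we differentiate
exact = alpha_prime(0.0)                   # tangent vector from the formula
approx = numeric_derivative(alpha, 0.0)    # tangent vector from finite differences
```

Both `exact` and `approx` should agree with \((1,1)^T\) up to rounding, and `base_point` is the point of \(M\) at which this tangent vector is based.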

Coordinate system and tangent vector

Let \(\mathbf{E}\) and \(\mathbf{F}\) be two Banach spaces and \(U\) an open subset of \(\mathbf{E}\). A \(C^p\) map \(f: U \to \mathbf{F}\) is called an immersion at \(x\) if \(f'(x)\) is injective.

For example, if we take \(\mathbf{E}=\mathbf{F}=\mathbb{R}=U\) and \(f(x)=x^2\), then \(f\) is an immersion at every point of \(\mathbb{R}\) except \(0\), since \(f'(0)=0\) is not injective. This may lead you to Sard's theorem.

(Theorem 2) Let \(M\) be a subset of \(\mathbb{R}^n\). Then \(M\) is a \(d\)-dimensional \(C^p\) submanifold of \(\mathbb{R}^n\) if and only if for every \(x \in M\) there exists an open neighborhood \(U \subset \mathbb{R}^n\) of \(x\), an open neighborhood \(\Omega \subset \mathbb{R}^d\) of \(0\) and a \(C^p\) map \(g: \Omega \to \mathbb{R}^n\) such that \(g\) is an immersion at \(0\), \(g(0)=x\), and \(g\) is a homeomorphism between \(\Omega\) and \(M \cap U\) with the topology induced from \(\mathbb{R}^n\).

This follows from the definition of manifold and should not be difficult to prove. But it is not what this blog post should cover. For a proof you can check Differential Geometry: Manifolds, Curves, and Surfaces by Marcel Berger and Bernard Gostiaux. The proof is located in section 2.1.

A coordinate system on a \(d\)-dimensional \(C^p\) submanifold \(M\) of \(\mathbb{R}^n\) is a pair \((\Omega,g)\) consisting of an open set \(\Omega \subset \mathbb{R}^d\) and a \(C^p\) function \(g:\Omega \to \mathbb{R}^n\) such that \(g(\Omega)\) is open in \(M\) and \(g\) induces a homeomorphism between \(\Omega\) and \(g(\Omega)\).

For convenience, we say \((\Omega,g)\) is centered at \(x\) if \(g(0)=x\) and \(g\) is an immersion at \(0\). By Theorem 2 it is always possible to find such a coordinate system centered at a given point \(x \in M\). The following theorem will show that we can get an easier approach to tangent vectors.

(Theorem 3) Let \(\mathbf{E}\) and \(\mathbf{F}\) be two finite-dimensional vector spaces, \(U \subset \mathbf{E}\) an open set, \(f:U \to \mathbf{F}\) a \(C^1\) map, \(M\) a submanifold of \(\mathbf{E}\) contained in \(U\) and \(W\) a submanifold of \(\mathbf{F}\) such that \(f(M) \subset W\). Take \(x \in M\) and set \(y=f(x)\). If \(z\) is a tangent vector to \(M\) at \(x\), then the image \(f'(x)(z)\) is a tangent vector to \(W\) at \(y=f(x)\).

Proof. Since \(z\) is a tangent vector, we see there exists a curve \(\alpha: J \to M\) such that \(\alpha(0)=x\) and \(\alpha'(0)=z\) where \(J\) is an open interval containing \(0\). The function \(\beta = f \circ \alpha: J \to W\) is also a curve satisfying \(\beta(0)=f(\alpha(0))=f(x)\) and \[ \beta'(0)=f'(\alpha(0))\alpha'(0)=f'(x)(z), \] which is our desired curve. \(\square\)

Why we use 'equivalence relation'

We shall show that the equivalence relation makes sense. Suppose \(M\) is a \(d\)-submanifold of \(\mathbb{R}^n\), \(x \in M\) and \(z\) is a tangent vector to \(M\) at \(x\). Let \((\Omega,g)\) be a coordinate system centered at \(x\). Since \(g \in C^p(\Omega;\mathbb{R}^n)\), we see \(g'(0)\) is an \(n \times d\) matrix, and injectivity ensures that \(\operatorname{rank}(g'(0))=d\).

Every open set \(\Omega \subset \mathbb{R}^d\) is a \(d\)-dimensional submanifold of \(\mathbb{R}^d\) (of class \(C^p\)). Suppose now \(v \in \mathbb{R}^d\) is a tangent vector to \(\Omega\) at \(0\) (determined by a curve \(\alpha\)), then by Theorem 3, \(g \circ \alpha\) determines a tangent vector to \(M\) at \(x\), which is \(z_x=g'(0)(v)\). Suppose \((\Lambda,h)\) is another coordinate system centered at \(x\). If we want to obtain \(z_x\) as well, we must have \[ h'(0)(w)=g'(0)(v), \] which is equivalent to \[ w = (h'(0)^{-1} \circ g'(0))(v)=(h^{-1}\circ g)'(0)(v), \] for some \(w \in \mathbb{R}^d\) which is a tangent vector to \(\Lambda\) at \(0 \in \Lambda\). (The inverse makes sense since we implicitly restricted ourselves to \(\mathbb{R}^d\).)

However, we also have two charts given by \((U,\varphi)=(g(\Omega),g^{-1})\) and \((V,\psi) = (h(\Lambda),h^{-1})\), which gives \[ (h^{-1} \circ g)'(0)(v)=[(\psi \circ \varphi^{-1})'(\varphi(x))](v)=w \] and this is just our equivalence relation (don't forget that \(g(0)=x\) hence \(g^{-1}(x)=\varphi(x)=0\)!). There we have our reason for the equivalence relation: if \((U,\varphi,v) \sim (V,\psi,w)\), then \((U,\varphi,v)\) and \((V,\psi,w)\) determine the same tangent vector, but we do not have to evaluate it manually. In general, all elements in an equivalence class represent a single vector, so the vector is (algebraically) an equivalence class. This still holds when talking about Banach manifolds since topological properties of Euclidean spaces do not play a role. The generalized proof can be carried out with little difficulty.
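To make this concrete, here is a small numerical sketch under assumptions of our own: \(M\) is the unit circle in \(\mathbb{R}^2\), \(x=(1,0)\), and the two coordinate systems centered at \(x\) are the hypothetical choices \(g(t)=(\cos t,\sin t)\) and the rational parametrization \(h(s)=\left(\frac{1-s^2}{1+s^2},\frac{2s}{1+s^2}\right)\). Neither appears in the text; they are only there to illustrate that \(w=(h^{-1}\circ g)'(0)(v)\) and \(v\) determine the same vector in \(\mathbb{R}^2\).

```python
import math

def g(t):
    """Angle parametrization of the unit circle, g(0) = (1, 0)."""
    return (math.cos(t), math.sin(t))

def h(s):
    """Rational parametrization of the unit circle, h(0) = (1, 0)."""
    return ((1 - s * s) / (1 + s * s), 2 * s / (1 + s * s))

def h_inv(p):
    """Inverse of h away from (-1, 0): s = y / (1 + x)."""
    return p[1] / (1 + p[0])

def diff(f, t, eps=1e-6):
    """Central difference of a scalar- or tuple-valued function of one variable."""
    a, b = f(t + eps), f(t - eps)
    if isinstance(a, tuple):
        return tuple((ai - bi) / (2 * eps) for ai, bi in zip(a, b))
    return (a - b) / (2 * eps)

v = 3.0                                     # coordinates in the chart (g(Omega), g^{-1})
w = diff(lambda t: h_inv(g(t)), 0.0) * v    # w = (h^{-1} o g)'(0)(v)
z_via_g = tuple(c * v for c in diff(g, 0.0))   # g'(0)(v)
z_via_h = tuple(c * w for c in diff(h, 0.0))   # h'(0)(w)
```

Here `w` differs from `v`, yet `z_via_g` and `z_via_h` coincide up to finite-difference error: the two triples are different representatives of one tangent vector.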

Tangent space

The tangent vectors at \(x \in M\) form a vector space (which is based at \(x\)). We certainly hope so, because otherwise our definition of tangent vector would be incomplete and would fail even for a trivial example (such as the one we mentioned at the beginning). We shall show, satisfyingly, that the set of tangent vectors to \(M\) at \(x\) (which we write \(T_xM\)) forms a vector space that is toplinearly isomorphic to \(\mathbf{E}\), on which \(M\) is modeled.

(Theorem 4) \(T_xM \simeq \mathbf{E}\). In other words, \(T_xM\) can be given the structure of topological vector space given by the chart.

Proof. Let \((U,\varphi)\) be a chart at \(x\). For \(v \in \mathbf{E}\), we see \((\varphi^{-1})'(x)(v)\) is a tangent vector at \(x\). On the other hand, pick \(\mathbf{w} \in T_xM\), which can be represented by \((V,\psi,w)\). Then \[ v=(\varphi\circ\psi^{-1})'(\psi(x))(w) \] makes \((U,\varphi,v) \sim (V,\psi,w)\) uniquely, and therefore we get some \(v \in \mathbf{E}\). To conclude, \[ T_xM \xrightarrow[(\varphi^{-1})'(x)]{\simeq}\mathbf{E} \] which proves our theorem. Note that this does not depend on the choice of charts. \(\square\)

For many reasons it is not a good idea to identify \(T_xM\) with \(\mathbf{E}\) without mentioning the point \(x\). For example, we shouldn't identify the tangent line of a curve with the \(x\)-axis. Instead, it would be better to identify or visualize \(T_xM\) as \((x,\mathbf{E})\), that is, a linear space with origin at \(x\).

Tangent bundle

Now we treat all tangent spaces as a vector bundle. Let \(M\) be a manifold of class \(C^p\) with \(p \geq 1\), define the tangent bundle by the disjoint union \[ T(M)=\bigsqcup_{x \in M}T_xM. \] This is a vector bundle if we define the projection by \[ \begin{aligned} \pi: T(M) &\to M \\ y \in T_xM &\mapsto x \end{aligned} \] and we will verify it soon. First let's see an example. Below is a visualization of the tangent bundle of \(\frac{x^2}{4}+\frac{y^2}{3}=1\), denoted by red lines:

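[Figure: the curve \(\frac{x^2}{4}+\frac{y^2}{3}=1\) with tangent lines drawn in red; one tangent line, drawn in blue, touches the curve at the point \(B\).]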

Also we can see \(\pi\) maps points on the blue line to a point on the curve, which is \(B\).
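The picture can be imitated in a few lines. The sketch below assumes the standard parametrization \((2\cos t,\sqrt{3}\sin t)\) of the ellipse, which is our choice and not taken from the text; it builds an element of \(T(M)\) as a pair (base point, tangent vector) and applies the projection \(\pi\).

```python
import math

def point(t):
    """Point of the ellipse x^2/4 + y^2/3 = 1 with parameter t."""
    return (2 * math.cos(t), math.sqrt(3) * math.sin(t))

def velocity(t):
    """Derivative of the parametrization: a tangent vector at point(t)."""
    return (-2 * math.sin(t), math.sqrt(3) * math.cos(t))

def gradient(p):
    """Gradient of F(x, y) = x^2/4 + y^2/3 at p; it is normal to the curve."""
    return (p[0] / 2, 2 * p[1] / 3)

def bundle_element(t, scale):
    """An element of T(M): a base point together with a tangent vector there."""
    vx, vy = velocity(t)
    return (point(t), (scale * vx, scale * vy))

def pi(element):
    """Bundle projection: forget the vector, keep the base point."""
    return element[0]

elem = bundle_element(math.pi / 3, 2.5)
base, vec = elem
# A tangent vector has no component along the normal direction.
normal_component = gradient(base)[0] * vec[0] + gradient(base)[1] * vec[1]
```

Every vector on the same tangent line is sent by `pi` to the same base point, which is what the blue line and the point \(B\) illustrate.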

To show that the tangent bundle of a manifold is a vector bundle, we need to verify that it satisfies the three conditions we mentioned in the previous post. Let \((U,\varphi)\) be a chart of \(M\) such that \(\varphi(U)\) is open in \(\mathbf{E}\), then tangent vectors can be represented by \((U,\varphi,v)\). We get a bijection \[ \tau_U:\pi^{-1}(U) = T(U) \to U \times \mathbf{E} \] by definition of tangent vectors as equivalence classes. Let \(z_x\) be a tangent vector to \(U\) at \(x\), then there exists some \(v \in \mathbf{E}\) such that \((U,\varphi,v)\) represents \(z_x\). On the other hand, for any \(v \in \mathbf{E}\) and \(x \in U\), \((U,\varphi,v)\) represents some tangent vector at \(x\). Explicitly, \[ \tau_{U}(z_x)=(x,v)=(\pi(z_x),[(\varphi^{-1})'(\pi(z_x))]^{-1}(z_x)) \]

Further, the following diagram commutes (which establishes VB 1):

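[Diagram: \(\tau_U:\pi^{-1}(U)\to U\times\mathbf{E}\) on top, with \(\pi\) and \(pr\) mapping both spaces down to \(U\); commutativity means \(pr\circ\tau_U=\pi\).]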

For VB 2 and VB 3 we need to check different charts. Let \((U_i,\varphi_i)\), \((U_j,\varphi_j)\) be two charts. Define \(\varphi_{ji}=\varphi_j \circ \varphi_i^{-1}\) on \(\varphi_i(U_i \cap U_j)\), and respectively we write \(\tau_{U_i}=\tau_i\) and \(\tau_{U_j}=\tau_j\). Then we get a transition mapping \[ \tau_{ji}=\tau_j \circ \tau_i^{-1}:(U_i \cap U_j) \times \mathbf{E} \to (U_i \cap U_j) \times \mathbf{E}. \]

One can verify that \[ \tau_{ji}(x,v)=(\varphi_{ji}(x),D\varphi_{ji}(x) \cdot v) \] for \(x \in U_i \cap U_j\) and \(v \in \mathbf{E}\). Since \(D\varphi_{ji} \in C^{p-1}\) and \(D\varphi_{ji}(x)\) is a toplinear isomorphism, we see \[ x \mapsto (\tau_j \circ \tau_i^{-1})_x=(\varphi_{ji}(x),D\varphi_{ji}(x)\cdot(\cdot)) \] is a morphism, which gives VB 3. It remains to verify VB 2. To do this we need a fact from Banach space theory:

If \(f:U \to L(\mathbf{E},\mathbf{F})\) is a \(C^k\)-morphism, then the map of \(U \times \mathbf{E}\) into \(\mathbf{F}\) given by \[ (x,v) \mapsto [f(x)](v) \] is a \(C^k\)-morphism.

Here, we have \(f(x)=\tau_{ji}(x,\cdot)\) and to conclude, \(\tau_{ji}\) is a \(C^{p-1}\)-morphism. It is also an isomorphism since it has an inverse \(\tau_{ij}\). Following the definition of manifold, we can conclude that \(T(U)\) has a unique manifold structure such that the \(\tau_i\) are morphisms (there will be a formal proof in the next post about the total space of any vector bundle). By VB 1, we also have \(\pi=pr\circ\tau_i\), which makes it a morphism as well. On each fiber \(\pi^{-1}(x)\), we can freely transport the topological vector space structure of \(\mathbf{E}\), for any \(U_i\) such that \(x\) lies in \(U_i\), by means of \(\tau_{ix}\). Since \(f(x)\) is a toplinear isomorphism, the result is independent of the choice of \(U_i\). VB 2 is therefore established.


In fancier words, we can also say that \(T:M \to T(M)\) is a functor from the category of \(C^p\)-manifolds to the category of vector bundles of class \(C^{p-1}\).

Study Vector Bundle in a Relatively Harder Way - Definition

Motivation

Direction deserves careful consideration. For example take a look at this picture (by David Gunderman):

[Figure: a triple of balls, including a red ball and a black ball, travelling around a Möbius band.]

The positions of the red ball and the black ball show that this triple of balls turns upside down every time they finish one round. This wouldn't happen if this triple were on a normal band, which can be denoted by \(S^1 \times (0,1)\). What would happen if we try to describe their velocity on the Möbius band, both locally and globally? There must be some significant difference from a normal band. If we prescribe some pattern of motion for the balls, for example letting them run horizontally or in a zig-zag, we hopefully get different sets of vectors. Those vectors can span vector spaces as well.

A Formal Construction

Here and in the following posts, we will try to develop purely formally certain functorial constructions having to do with vector bundles. It may be overly generalized, but we will offer some examples to make it concrete.

Let \(M\) be a manifold (of class \(C^p\), where \(p \geq 0\) and can be set to \(\infty\)) modeled on a Banach space \(\mathbf{E}\). Let \(E\) be another topological space and \(\pi: E \to M\) a surjective \(C^p\)-morphism. A vector bundle is a topological construction associated with \(M\) (base space), \(E\) (total space) and \(\pi\) (bundle projection) such that, roughly speaking, \(E\) is locally a product of \(M\) and \(\mathbf{E}\).

We use \(\mathbf{E}\) instead of \(\mathbb{R}^n\) to include the infinite dimensional cases. We will try to distinguish finite-dimensional and infinite-dimensional Banach spaces here. There are a lot of things to do, since, for example, infinite dimensional Banach spaces have no countable Hamel basis, while the finite-dimensional ones have finite ones (this can be proved by using the Baire category theorem).

Next we will show precisely how \(E\) locally becomes a product space. Let \(\mathfrak{U}=(U_i)_i\) be an open covering of \(M\), and for each \(i\), suppose that we are given a mapping \[ \tau_i:\pi^{-1}(U_i)\to U_i \times \mathbf{E} \] satisfying the following three conditions.

VB 1 \(\tau_i\) is a \(C^p\) diffeomorphism making the following diagram commutative:

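[Diagram: \(\tau_i:\pi^{-1}(U_i)\to U_i\times\mathbf{E}\) on top, with \(\pi\) and \(pr\) mapping both spaces down to \(U_i\); commutativity means \(pr\circ\tau_i=\pi\).]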

where \(pr\) is the projection of the first component: \((x,y) \mapsto x\). By restricting \(\tau_i\) on one point of \(U_i\), we obtain an isomorphism on each fiber \(\pi^{-1}(x)\): \[ \tau_{ix}:\pi^{-1}(x) \xrightarrow{\simeq} \{x\} \times \mathbf{E} \]

VB 2 For each pair of open sets \(U_i\), \(U_j \in \mathfrak{U}\) and each \(x \in U_i \cap U_j\), the map \[ \tau_{jx} \circ \tau_{ix}^{-1}: \mathbf{E} \to \mathbf{E} \] is a toplinear isomorphism (that is, it preserves the structure of \(\mathbf{E}\) as a topological vector space).

VB 3 For any two members \(U_i\), \(U_j \in \mathfrak{U}\), we have the following function to be a \(C^p\)-morphism: \[ \begin{aligned} \varphi:U_i \cap U_j &\to L(\mathbf{E},\mathbf{E}) \\ x &\mapsto \left(\tau_j\circ \tau_i^{-1}\right)_x \end{aligned} \]

REMARKS. As with manifolds, we call the family of pairs \((U_i,\tau_i)_i\) a trivializing covering of \(\pi\), and the \((\tau_i)\) its trivializing maps. Precisely, for \(x \in U_i\), we say \(U_i\) or \(\tau_i\) trivializes at \(x\).

Two trivializing coverings for \(\pi\) are said to be VB-equivalent if taken together they also satisfy the conditions VB 2 and VB 3. It's immediate that VB-equivalence is an equivalence relation and we leave the verification to the reader. It is this VB-equivalence class of trivializing coverings that determines a structure of vector bundle on \(\pi\). With respect to the Banach space \(\mathbf{E}\), we say that the vector bundle has fiber \(\mathbf{E}\), or is modeled on \(\mathbf{E}\).

Next we shall give some motivation for each condition. Each pair \((U_i,\tau_i)\) determines a local product of 'a part of the manifold' and the model space, on the latter of which we can deploy the direction with ease. This is what VB 1 tells us. But that's far from enough if we want our vectors to be fine enough. We do want the total space \(E\) to actually meet our requirements. As for VB 2, it ensures that two different trivializing maps give the same Banach space structure on a fiber (with equivalent norms). Looking at the image of \(\tau_{ix}\), we can say that each point \(x \in M\), which is determined by the fiber \(\pi^{-1}(x)\) (the domain of \(\tau_{ix}\)), can be given another Banach space structure by being sent via \(\tau_{jx}\) for some \(j\). Note that \(\pi^{-1}(x) \subset E\), the total space. In fact, VB 2 has an equivalent alternative:

VB 2' On each fiber \(\pi^{-1}(x)\) we are given a structure of Banach space as follows. For \(x \in U_i\), we have a toplinear isomorphism which is in fact the trivializing map: \[ \tau_{ix}:\pi^{-1}(x)=E_x \to \mathbf{E}. \] As stated, VB 2 implies VB 2'. Conversely, if VB 2' is satisfied, then for open sets \(U_i\), \(U_j \in \mathfrak{U}\), and \(x \in U_i \cap U_j\), we have \(\tau_{jx} \circ \tau_{ix}^{-1}:\mathbf{E} \to \mathbf{E}\) to be a toplinear isomorphism. Hence, we can consider VB 2 or VB 2' as the refinement of VB 1.

In the finite-dimensional case, one can omit VB 3 since it is implied by VB 2, and we will prove it below.

(Lemma) Let \(\mathbf{E}\) and \(\mathbf{F}\) be two finite dimensional Banach spaces. Let \(U\) be open in some Banach space. Let \[ f:U \times \mathbf{E} \to \mathbf{F} \] be a \(C^p\)-morphism such that for each \(x \in U\), the map \[ f_x: \mathbf{E} \to \mathbf{F} \] given by \(f_x(v)=f(x,v)\) is a linear map. Then the map of \(U\) into \(L(\mathbf{E},\mathbf{F})\) given by \(x \mapsto f_x\) is a \(C^p\)-morphism.

PROOF. Since \(L(\mathbf{E},\mathbf{F})=L(\mathbf{E},\mathbf{F_1}) \times L(\mathbf{E},\mathbf{F_2}) \times \cdots \times L(\mathbf{E},\mathbf{F_n})\) where \(\mathbf{F}=\mathbf{F_1} \times \cdots \times \mathbf{F_n}\), by induction on the dimension of \(\mathbf{F}\) and \(\mathbf{E}\), it suffices to assume that \(\mathbf{E}\) and \(\mathbf{F}\) are toplinearly isomorphic to \(\mathbb{R}\). But in that case, the function \(f(x,v)\) can be written \(g(x)v\) for some \(g:U \to \mathbb{R}\). Since \(f\) is a morphism, it follows that, as a function of each argument \(x\), \(v\), it is also a morphism. Putting \(v=1\) shows that \(g\) is also a morphism, which finishes the case when both the dimension of \(\mathbf{E}\) and \(\mathbf{F}\) are equal to \(1\), and the proof is completed by induction. \(\blacksquare\)

To show that VB 3 is implied by VB 2, put \(\mathbf{E}=\mathbf{F}\) as in the lemma. Note that \(\tau_j \circ \tau_i^{-1}\) maps \((U_i \cap U_j) \times \mathbf{E}\) to \(\mathbf{E}\), and \(U_i \cap U_j\) is open, and for each \(x \in U_i \cap U_j\), the map \((\tau_j \circ \tau_i^{-1})_x=\tau_{jx} \circ \tau_{ix}^{-1}\) is toplinear, hence linear. Then the fact that \(\varphi\) is a morphism follows from the lemma.

Examples

Trivial bundle

Let \(M\) be any \(n\)-dimensional smooth manifold that you are familiar with, then \(pr:M \times \mathbb{R}^n \to M\) is actually a vector bundle. Here the total space is \(M \times \mathbb{R}^n\), the base is \(M\), and \(pr\) is the bundle projection, which in this case is simply the projection onto the first factor. Intuitively, in the total space, we can pick a point \(x \in M\), and the other component can be any direction in \(\mathbb{R}^n\), hence a vector.

We need to verify the three conditions carefully. Let \((U_i,\varphi_i)_i\) be any atlas of \(M\), and let \(\tau_i\) be the identity map on \(U_i \times \mathbb{R}^n\) (which is naturally of class \(C^p\)). We claim that \((U_i,\tau_i)_i\) satisfy the three conditions, thus we get a vector bundle.

For VB 1 things are clear: since \(pr^{-1}(U_i)=U_i \times \mathbb{R}^n\), the diagram is commutative. Each fiber \(pr^{-1}(x)\) is essentially \(\{x\} \times \mathbb{R}^n\), and still, \(\tau_{jx} \circ \tau_{ix}^{-1}\) is the identity map between \(\{x\} \times \mathbb{R}^n\) and \(\{x\} \times \mathbb{R}^n\), under the same Euclidean topology, hence VB 2 is verified, and we have no need to verify VB 3.

Möbius band

First of all, imagine you have embedded a circle into a Möbius band. Now we try to give some formal definition. As with quotient topology, \(S^1\) can be defined as \[ S^1=I/\sim_1, \]

where \(I\) is the unit interval and \(0 \sim_1 1\) (identifying two ends). On the other hand, the infinite Möbius band can be defined by \[ B= (I \times \mathbb{R})/\sim_2 \] where \((0,v) \sim_2 (1,-v)\) for all \(v \in \mathbb{R}\) (not only identifying two ends of \(I\) but also 'flipping' the vertical line). Then all we need is a natural projection on the first component: \[ \pi:B \to S^1. \] The verification differs little from that of the trivial bundle. The quotient topology of Banach spaces follows naturally in this case, but things might be troublesome if we restrict ourselves to \(\mathbb{R}^n\).
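For readers who want to see where the 'flip' ends up, here is one possible pair of trivializations, written as a sketch. The open sets \(U_1=S^1\setminus\{[0]\}\) and \(U_2=S^1\setminus\{[1/2]\}\) and the maps `tau1`, `tau2` are our own assumptions, not taken from the text; points of \(B\) are handled through representatives \((t,v)\in I\times\mathbb{R}\).

```python
def tau1(t, v):
    """Trivialization over U1 = S^1 minus the seam point [0] = [1]."""
    if not 0 < t < 1:
        raise ValueError("U1 excludes the seam")
    return (t, v)

def tau2(t, v):
    """Trivialization over U2 = S^1 minus the point [1/2].

    The sign flip for t > 1/2 makes the map agree on the two
    representatives (0, v) and (1, -v) of the same point of B.
    """
    if t == 0.5 or not 0 <= t <= 1:
        raise ValueError("U2 excludes t = 1/2")
    base = t % 1.0                     # identify the end points 0 and 1
    return (base, v if t < 0.5 else -v)

def transition(t):
    """Fiber part of tau2 o tau1^{-1} at t, as the scalar multiplying v."""
    t, v = tau1(t, 1.0)                # checks t is in U1; tau1 is the identity on representatives
    return tau2(t, v)[1]

left = transition(0.25)                # on the component (0, 1/2) of U1 and U2
right = transition(0.75)               # on the component (1/2, 1) of U1 and U2
```

On each component of \(U_1\cap U_2\) the transition is multiplication by a constant \(\pm 1\), a toplinear isomorphism of \(\mathbb{R}\) depending continuously on the base point, which is the content of VB 2 and VB 3 here. The sign change between the two components is what distinguishes this bundle from the trivial one.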

Tangent bundle of the sphere

The first example (the trivial bundle) is relatively rare in many senses. By \(S^n\) we mean the set in \(\mathbb{R}^{n+1}\) with \[ S^n=\{(x_0,x_1,\dots,x_n):x_0^2+x_1^2+\cdots+x_n^2=1\} \] and the tangent bundle can be defined by \[ TS^n=\{(\mathbf{x},\mathbf{y}):\langle\mathbf{x},\mathbf{y}\rangle=0\} \subset S^{n} \times\mathbb{R}^{n+1}, \] where, of course, \(\mathbf{x} \in S^n\) and \(\mathbf{y} \in \mathbb{R}^{n+1}\). The vector bundle is given by \(pr:TS^n \to S^n\) where \(pr\) is the projection onto the first factor. This total space is of course much finer than \(M \times \mathbb{R}^n\) in the first example. Each point in the manifold now is associated with a tangent space \(T_x(M)\) at this point.
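The defining condition \(\langle\mathbf{x},\mathbf{y}\rangle=0\) is easy to experiment with. The following sketch (helper names are ours) tests membership in \(TS^n\) and produces an element of the fiber over \(\mathbf{x}\) by removing the radial component of an arbitrary vector.

```python
def dot(a, b):
    """Euclidean inner product of two tuples of equal length."""
    return sum(s * t for s, t in zip(a, b))

def in_tangent_bundle(x, y, tol=1e-12):
    """True if x lies on the unit sphere and y is orthogonal to x."""
    return abs(dot(x, x) - 1) < tol and abs(dot(x, y)) < tol

def project_to_tangent(x, y):
    """Remove from y its component along the unit vector x."""
    c = dot(x, y)
    return tuple(yi - c * xi for xi, yi in zip(x, y))

def pr(pair):
    """Bundle projection: (x, y) -> x."""
    return pair[0]

x = (0.0, 0.0, 1.0)                       # north pole of S^2
y = project_to_tangent(x, (1.0, 2.0, 3.0))
```

At the north pole the fiber is the horizontal plane, so the projection simply discards the last coordinate of the input vector.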

More generally, we can define it in any Hilbert space \(H\), for example, an \(L^2\) space: \[ TS=\{(x,y):\langle x , y \rangle=0\} \subset S \times H \] where \[ S=\{x:\langle x , x \rangle = 1\}. \] The projection is natural: \[ \begin{aligned} \pi: TS &\to S \\ (x,y) & \mapsto x \end{aligned} \] But we will not cover the verification in this post since it requires introducing the abstract definition of tangent vectors. This will be done in the following post.

Many things remain undiscovered

We want to study those 'vectors' associated to some manifold both globally and locally. For example we may want to describe the tangent line of some curve at some point without heavily using elementary calculus. Also, we may want to describe the vector bundle of a manifold globally, for example, when will we have a trivial one? Can we classify the manifold using the behavior of the bundle? Can we make it a little more abstract, for example, consider the class of all isomorphic bundles? How does one bundle transform to another? But to do this we need a large number of definitions and propositions.

Cauchy sequence in group theory

Recall - Cauchy sequence in analysis

Before we go into group theory, let's recall how Cauchy sequence is defined in analysis.

Real/complex number

A sequence \((x_n)_{n=1}^{\infty}\) of real/complex numbers is called a Cauchy sequence if, for every \(\varepsilon>0\), there is a positive integer \(N\) such that for all \(m,n>N\), we have \[ |x_m-x_n|<\varepsilon. \] That is, terms far enough along the sequence are arbitrarily close to each other. Notice that only distance is involved, so the definition of Cauchy sequence in a metric space comes up naturally.

Metric space

Given a metric space \((X,d)\), a sequence \((x_n)_{n=1}^{\infty}\) is Cauchy if for every real number \(\varepsilon>0\), there is a positive integer \(N\) such that, for all \(m,n>N\), the distance satisfies \[ d(x_m,x_n)<\varepsilon. \]

Topological vector space

By considering the topology induced by metric, we see that \(x_n\) lies in a neighborhood of \(x_m\) with radius \(\varepsilon\). But a topology can be constructed by neighborhood, hence the Cauchy sequence for topological vector space follows.

For a topological vector space \(X\), pick a local base \(\mathcal{B}\), then \((x_n)_{n=1}^{\infty}\) is a Cauchy sequence if for each member \(U \in \mathcal{B}\), there exists some number \(N\) such that for \(m,n>N\), we have \[ x_m-x_n \in U. \] But in a general topological space this does not work. Consider the two topological spaces \[ X=(0,1)\quad Y=(1,+\infty) \] with the usual topology. We have \(X \simeq Y\) since the map \[ \begin{aligned} h:X &\to Y \\ x &\mapsto \frac{1}{x} \end{aligned} \] is a homeomorphism. Consider the Cauchy sequence \((\frac{1}{n+1})_{n=1}^{\infty}\); we see \((h(\frac{1}{n+1}))_{n=1}^{\infty}=(n+1)_{n=1}^{\infty}\), which is not Cauchy. This counterexample shows that being a Cauchy sequence is not preserved by homeomorphisms.
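The counterexample can be checked with exact rational arithmetic. In the sketch below (function names are ours), the gaps between consecutive terms of the source sequence shrink, consistent with \(|x_m-x_n|\le\frac{1}{N+1}\) for \(m,n>N\), while consecutive terms of the image sequence always differ by \(1\), so the image cannot be Cauchy.

```python
from fractions import Fraction

def x(n):
    """The source sequence x_n = 1/(n+1) in X = (0, 1)."""
    return Fraction(1, n + 1)

def h(t):
    """The homeomorphism h: (0, 1) -> (1, +infinity), t -> 1/t."""
    return 1 / t

def gap(seq, n):
    """Distance between the n-th and (n+1)-th terms of a sequence."""
    return abs(seq(n + 1) - seq(n))

indices = (10, 100, 1000)
source_gaps = [gap(x, n) for n in indices]
image_gaps = [gap(lambda n: h(x(n)), n) for n in indices]
```

Using `Fraction` keeps every value exact, so the comparison does not depend on floating-point rounding.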

Topological group

Similarly, one can define a Cauchy sequence in a topological group (by replacing subtraction with multiplication by the inverse).

A sequence \((x_n)_{n=1}^{\infty}\) in a topological group \(G\) is a Cauchy sequence if for every open neighborhood \(U\) of the identity of \(G\), there exists some number \(N\) such that whenever \(m,n>N\), we have \[ x_nx_m^{-1} \in U. \]

Completeness

A metric space \((X,d)\) in which every Cauchy sequence converges is called complete.

Spaces like \(\mathbb{R}\), \(\mathbb{C}\) are complete with the Euclidean metric. But consider the sequence in \(\mathbb{Q}\) given by \[ a_n=\left(1+\frac{1}{n}\right)^{n}. \] We have \(a_n\in\mathbb{Q}\) for all \(n\) but the sequence does not converge in \(\mathbb{Q}\). Indeed in \(\mathbb{R}\) we can naturally write \(a_n \to e\) but \(e \notin \mathbb{Q}\) as we all know.
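A short experiment makes the point tangible: every term is an honest rational number, yet the terms creep towards \(e\). This is a minimal sketch using Python's exact `Fraction` type; the names `a`, `terms` and `errors` are ours.

```python
from fractions import Fraction
import math

def a(n):
    """The n-th term (1 + 1/n)^n as an exact rational number."""
    return Fraction(n + 1, n) ** n

terms = {n: a(n) for n in (1, 10, 100, 1000)}
# Distance to e, measured in floating point only for display purposes.
errors = {n: math.e - float(q) for n, q in terms.items()}
```

Each value in `terms` is a `Fraction`, so it lies in \(\mathbb{Q}\), while the entries of `errors` decrease towards \(0\) without any term ever being equal to \(e\).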

There are several ways to construct \(\mathbb{R}\) from \(\mathbb{Q}\). One of the most famous methods uses Dedekind cuts. However, that construction makes no explicit use of Cauchy sequences. There is another method that uses Cauchy sequences explicitly. We follow that way algebraically.

Completing a group

Suppose we are given a group \(G\) with a sequence of normal subgroups \((H_n)_{n=1}^{\infty}\) with \(H_n \supset H_{n+1}\) for all \(n\), all of which have finite index. We are going to complete this group.

Cauchy sequence

A sequence \((x_n)_{n=1}^{\infty}\) in \(G\) will be called a Cauchy sequence if, given \(H_k\), there exists some \(N>0\) such that for \(m,n>N\), we have \[ x_nx_m^{-1} \in H_k. \] Indeed, this looks very similar to what we saw for topological groups, but we don't want to put a topology on the group anyway. This definition does not go too far from the original definition of Cauchy sequence in \(\mathbb{R}\) either. If you treat \(H_k\) as some 'small' thing, it shows that \(x_m\) and \(x_n\) are close enough (by considering \(x_nx_m^{-1}\) as their difference).

Null sequence

A sequence \((x_n)_{n=1}^{\infty}\) in \(G\) will be called a null sequence if, given \(k\), there exists some \(N>0\) such that for all \(n>N\), we have \[ x_n\in H_k \] or you may write \(x_ne^{-1} \in H_k\). It can be considered as being arbitrarily close to the identity \(e\).
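To see both definitions at work, consider an example of our own choosing (it is not in the text): \(G=\mathbb{Z}\) written additively, with \(H_k=5^k\mathbb{Z}\), a decreasing chain of normal subgroups of finite index. In additive notation \(x_nx_m^{-1}\) becomes \(x_n-x_m\). The sketch checks the two conditions on a finite range of indices only, so it gives evidence rather than a proof.

```python
P = 5  # hypothetical choice: H_k = P**k * Z inside G = Z

def in_H(k, g):
    """Membership test for the subgroup H_k = P^k Z."""
    return g % P**k == 0

def x(n):
    """Partial sums 1 + P + ... + P^(n-1): Cauchy but not null."""
    return sum(P**i for i in range(n))

def y(n):
    """Powers P^n: a null sequence."""
    return P**n

def is_cauchy_witness(seq, k, N, bound=30):
    """Check seq(n) - seq(m) in H_k for all N < m, n < bound."""
    return all(in_H(k, seq(n) - seq(m))
               for m in range(N + 1, bound) for n in range(N + 1, bound))

def is_null_witness(seq, k, N, bound=30):
    """Check seq(n) in H_k for all N < n < bound."""
    return all(in_H(k, seq(n)) for n in range(N + 1, bound))

cauchy_ok = all(is_cauchy_witness(x, k, k) for k in range(1, 8))
null_ok = all(is_null_witness(y, k, k) for k in range(1, 8))
x_is_null = is_null_witness(x, 1, 1)
```

Here \(N=k\) works for both sequences. The sequence `x` is Cauchy without being null, since every term is congruent to \(1\) modulo \(5\).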

Cauchy sequence and null sequence as groups

The Cauchy sequences (of \(G\)) form a group under termwise product

Proof. Let \(C\) be the set of Cauchy sequences, we shall show that \(C\) forms a group. For \((x_1,x_2,\cdots),(y_1,y_2,\cdots)\in C\), the product is defined by \[ (x_1,x_2,\cdots)(y_1,y_2,\cdots)=(x_1y_1,x_2y_2,\cdots) \] The associativity follows naturally from the associativity of \(G\). To show that \((x_1y_1,x_2y_2,\cdots)\) is still a Cauchy sequence, notice that given \(k\), for big enough \(m\) and \(n\), we have \[ x_nx_m^{-1}\in H_k \quad y_ny_m^{-1}\in H_k. \] But \((x_ny_n)(x_my_m)^{-1}=x_ny_ny_m^{-1}x_m^{-1}\). To show that this is an element of \(H_k\), notice that \[ x_ny_ny_m^{-1}x_m^{-1}=(x_ny_ny_m^{-1}{\color{red}{x_n^{-1}}})({\color{red}{x_n}}x_m^{-1}) \] Since \(y_ny_m^{-1}\in H_k\) and \(H_k\) is normal, we have \(x_ny_ny_m^{-1}x_n^{-1} \in H_k\). Since \(x_nx_m^{-1} \in H_k\), \((x_ny_n)(x_my_m)^{-1}\) can be viewed as a product of two elements of \(H_k\), and is therefore an element of \(H_k\).

Obviously, if we define \(e_C=(e_G,e_G,\cdots)\), where \(e_G\) is the identity of \(G\), \(e_C\) becomes the identity of \(C\), since \[ (x_1,x_2,\cdots)(e_G,e_G,\cdots)=(x_1,x_2,\cdots)=(e_G,e_G,\cdots)(x_1,x_2,\cdots). \] Finally the inverse. We need to show that \[ (x_1,x_2,\cdots)^{-1}=(x_1^{-1},x_2^{-1},\cdots) \] is still an element of \(C\). Indeed, if we have \[ x_nx_m^{-1} \in H_k \] then \[ (x_n^{-1})(x_m^{-1})^{-1}=x_n^{-1}x_m=x_n^{-1}(x_nx_m^{-1})^{-1}x_n \in H_k \] as \(H_k\) is a normal subgroup.


The null sequences (of \(G\)) form a group, further, it's a normal subgroup of \(C\), that is, the group of Cauchy sequences.

Let \(N\) be the set of null sequences of \(G\). Still, the identity is defined by \((e_G,e_G,\cdots)\), and there is no need to duplicate the validation. And the associativity still follows from \(G\). To show that \(N\) is closed under termwise product, namely if \((x_n),(y_n) \in N\), then \((x_ny_n)\in N\), one only needs to notice that, given \(k\), for big \(n\), we already have \[ x_n,y_n \in H_k. \] Therefore \(x_ny_n \in H_k\) since \(x_n\) and \(y_n\) are two elements of \(H_k\).

To show that \((x_n^{-1})\), which should be treated as the inverse of \((x_n)\), is still in \(N\), notice that if \(x_n \in H_k\), then \(x_n^{-1} \in H_k\).

Next, we shall show that \(N\) is a subgroup of \(C\), which is equivalent to showing that every null sequence is Cauchy. Given \(H_k\) and \((x_n)\in{N}\), for all big enough \(m\) and \(n\) we have \[ x_n \in H_k \quad x_m \in H_k \] therefore \[ x_nx_m^{-1} \in H_k \] as desired. Finally, pick \((p_n) \in N\) and \((q_n) \in C\); we shall show that \((q_n)(p_n)(q_n)^{-1} \in N\). That is, the sequence \((q_np_nq_n^{-1})\) is a null sequence. Given \(H_k\), for all big enough \(n\) we have \[ p_n \in H_k \] therefore \[ q_np_nq_n^{-1} \in H_k \] since \(H_k\) is normal. Our statement is proved.

The factor group

The factor group \(C/N\) is called the completion of \(G\) (with respect to \((H_n)\)).

As we know, the elements of \(C/N\) are cosets. A coset can be considered as an element of \(G\)'s completion. Let's head back to some properties of factor group. Pick \(x,y \in C\), then \(xN=yN\) if and only if \(x^{-1}y \in N\). With that being said, two Cauchy sequences are equivalent if their 'difference' is a null sequence.

Informally, consider the additive group \(\mathbb{Q}\). There are two Cauchy sequences given by \[ (1,1,1,\cdots,1,\cdots) \\ (0.9,0.99,0.999,0.9999,\cdots). \] They are equivalent since \[ (0.1,0.01,0.001,\cdots) \] is a null sequence. That's why people say \(0.99999... = 1\) (in analysis, the difference converges to \(0\); but in algebra, we say the two sequences are equivalent). As another example, \(\ln{2}\) can be represented by the equivalence class of \[ (0.6,0.69,0.693,\cdots). \]
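The termwise difference of the two sequences can be computed exactly. The sketch below follows the informal spirit of this paragraph: 'null' is read in the analytic sense (the terms tend to \(0\)), and the function names are ours.

```python
from fractions import Fraction

def ones(n):
    """The constant sequence 1, 1, 1, ..."""
    return Fraction(1)

def nines(n):
    """The sequence 0.9, 0.99, 0.999, ... as exact rationals."""
    return 1 - Fraction(1, 10**n)

def difference(n):
    """Termwise difference in the additive group Q."""
    return ones(n) - nines(n)

first = [difference(n) for n in range(1, 5)]
```

The list `first` reproduces \((0.1,0.01,0.001,\cdots)\) exactly: the \(n\)-th difference is \(10^{-n}\).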


Remarks

We made our completion using Cauchy sequences. The completion is filled with some Cauchy sequence and some additions of 'nothing', whence the gap disappears.

Again, the sequence of normal subgroups does not have to be indexed by \(\mathbb{N}\). It can be indexed by any directed partially ordered set, or simply partially ordered set. Removing the restriction of index set gives us a great variety of implementation.

However, can we finish everything about completing \(\mathbb{Q}\) using this? The answer is no: the multiplication is not verified! To finish this, field theory has to be taken into consideration.