We are going to evaluate the Fourier transform of $\frac{\sin{x}}{x}$ and $\left(\frac{\sin{x}}{x}\right)^2$. This turns out to be a comprehensive application of many elementary theorems on functions of a single complex variable, so it is recommended that you make sure you can evaluate and understand all the identities in this post by yourself. Also, make sure that you can recall all words in *italics*.

For real $t$, find the limit by
$$\lim_{A \to \infty}I_A(t), \quad\text{where } I_A(t)=\int_{-A}^{A}\frac{\sin{x}}{x}e^{itx}dx.$$

We will do this using contour integration. Since the complex function $f(z)=\frac{\sin{z}}{z}e^{itz}$ is *entire*, by *Cauchy’s theorem* its integral over $[-A,A]$ equals the integral over the path $\Gamma_A$, which goes from $-A$ to $-1$ along the real axis, from $-1$ to $1$ along the lower half of the unit circle, and from $1$ to $A$ along the real axis (why?). Since the path $\Gamma_A$ avoids the origin, we may use the identity
$$I_A(t)=\int_{\Gamma_A}\frac{\sin{z}}{z}e^{itz}dz.$$

Replacing $\sin{z}$ with $\frac{1}{2i}(e^{iz}-e^{-iz})$, we get
$$I_A(t)=\int_{\Gamma_A}\frac{1}{2iz}\left(e^{i(t+1)z}-e^{i(t-1)z}\right)dz.$$

If we put $\varphi_A(t)=\int_{\Gamma_A}\frac{1}{2iz}e^{itz}dz$, we see $I_A(t)=\varphi_A(t+1)-\varphi_A(t-1)$. It is convenient to divide $\varphi_A$ by $\pi$, since we then get
$$\frac{1}{\pi}\varphi_A(t)=\frac{1}{2\pi i}\int_{\Gamma_A}\frac{e^{itz}}{z}dz,$$

and we are cool with the divisor $2\pi i$.

Now, close the path $\Gamma_A$ in two ways: first, by the semicircle from $A$ through $-Ai$ to $-A$; second, by the semicircle from $A$ through $Ai$ to $-A$ (which actually completes a circle of radius $A$). For simplicity we denote the resulting closed paths by $\Gamma_L$ and $\Gamma_U$ respectively. Since the first closed path does not wind around the origin, Cauchy’s theorem gives

Notice that
$$\left|\exp\left(iAte^{i\theta}\right)\right|=\exp(-At\sin\theta),$$

we see that if $t\sin\theta>0$, then $|\exp(iAte^{i\theta})| \to 0$ as $A \to \infty$. On the lower semicircle we have $-\pi < \theta <0$, so $\sin\theta<0$; hence for $t<0$ we get
$$\lim_{A \to \infty}\frac{1}{\pi}\varphi_A(t)=0 \quad (t<0).$$

(You should be able to prove the convergence above.) Also trivially
$$\frac{1}{\pi}\varphi_A(0)=\frac{1}{2}.$$

But what if $t>0$? Indeed, it would be difficult to obtain the limit using the integral over $[-\pi,0]$. But we have another path, namely the upper one.

Note that $\frac{e^{itz}}{z}$ is a *meromorphic function* in $\mathbb{C}$ with a pole at $0$. For such a function we have
$$\frac{e^{itz}}{z}=\frac{1}{z}+it+\frac{(it)^2}{2!}z+\frac{(it)^3}{3!}z^2+\cdots,$$

which implies that the residue at $0$ is $1$. By the *residue theorem*,

Note that we have used the *change-of-variable* formula as we did for the lower path. $\operatorname{Ind}_{\Gamma_U}(0)$ means the *winding number* of $\Gamma_U$ around $0$, which is of course $1$. The identity above implies

Thus if $t>0$, since $\sin\theta>0$ when $0<\theta<\pi$, we get
$$\lim_{A \to \infty}\frac{1}{\pi}\varphi_A(t)=1 \quad (t>0).$$

But as already shown, $I_A(t)=\varphi_A(t+1)-\varphi_A(t-1)$, thus to conclude,
$$\lim_{A \to \infty}I_A(t)=\begin{cases}\pi & |t|<1, \\ \frac{\pi}{2} & |t|=1, \\ 0 & |t|>1.\end{cases}$$

Since $\psi(x)=\frac{\sin{x}}{x}$ is even, by multiplying $I_A$ by $\sqrt{\frac{1}{2\pi}}$ we actually obtain the *Fourier transform* of it, by abuse of language. Therefore we also get
$$\hat\psi(t)=\sqrt{\frac{1}{2\pi}}\lim_{A \to \infty}I_A(t)=\begin{cases}\sqrt{\frac{\pi}{2}} & |t|<1, \\ \frac{1}{2}\sqrt{\frac{\pi}{2}} & |t|=1, \\ 0 & |t|>1.\end{cases}$$

Note that $\hat\psi(t)$ is not continuous, let alone uniformly continuous. ‘Therefore’ $\psi(x) \notin L^1$, since if $f \in L^1$, then $\hat{f}$ is *uniformly continuous* (proof). Another interesting fact is that this also implies the value of the Dirichlet integral, since we have
$$\int_{-\infty}^{\infty}\frac{\sin{x}}{x}dx=\lim_{A \to \infty}I_A(0)=\pi.$$

We end this section by evaluating the inverse of $\hat\psi(t)$. This requires a simple calculation.

For real $t$, compute
$$\int_{-\infty}^{\infty}\frac{\sin^2{x}}{x^2}e^{itx}dx.$$

Now since $h(x)=\frac{\sin^2{x}}{x^2} \in L^1$, we can say with ease that the integral above is the Fourier transform of $h(x)$. But we will still be using the limit form by
$$J(t)=\lim_{A \to \infty}J_A(t),$$

where
$$J_A(t)=\int_{-A}^{A}\frac{\sin^2{x}}{x^2}e^{itx}dx.$$

We still use contour integration as above (with the same $\Gamma_A$, $\Gamma_U$ and $\Gamma_L$). For this we get

Therefore it suffices to discuss the function
$$\mu_A(t)=\int_{\Gamma_A}\frac{e^{itz}}{z^2}dz$$

since we have
$$J_A(t)=\frac{1}{4}\left(2\mu_A(t)-\mu_A(t+2)-\mu_A(t-2)\right).$$

Dividing $\mu_A(t)$ by $\pi i$, we see

Integrating $\frac{e^{itz}}{z^2}$ over $\Gamma_L$, we see

Since we still have

if $t<0$ in this case, we see $\frac{1}{\pi i}\mu_A(t) \to 0$ as $A \to \infty$. For $t>0$, integrating over $\Gamma_U$, we have

We can also evaluate $\mu_A(0)$ by computing the integral but we are not doing that. To conclude, we have

Therefore for $J_A$ we have

Now you may ask: how did we find the values at $0$, $2$ or $-2$? After all, $\mu_A(0)$ was never evaluated. But since $h \in L^1$, we see $\hat{h}(t)=\sqrt{\frac{1}{2\pi}}J(t)$ is uniformly continuous, thus continuous, and the values at these points follow from continuity.

Again, we get the value of a classic improper integral by
$$\int_{-\infty}^{\infty}\frac{\sin^2{x}}{x^2}dx=J(0)=\pi.$$

And this time it’s not hard to find the Fourier inverse:

Thereafter you are able to evaluate the improper integral of $\left(\frac{\sin{x}}{x}\right)^n$. Using *Fubini’s* or *Tonelli’s* theorem would be almost infeasible, while using contour integration as above forces you to deal with $n$ binomial coefficients, which might still be tedious. It’s even possible to discuss the convergence of the sequence $(I_n)$ where
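As a quick numerical sanity check of the two classical values above ($\int_{\mathbb{R}}\frac{\sin{x}}{x}dx=\pi$ and $\int_{\mathbb{R}}\left(\frac{\sin{x}}{x}\right)^2dx=\pi$), a plain trapezoid rule on a large interval is enough, since the tails contribute only $O(1/A)$. This is a throwaway sketch with made-up helper names, not part of the proofs:

```python
# Trapezoid-rule check of ∫ sin(x)/x dx = π and ∫ (sin(x)/x)^2 dx = π
# over the real line, truncated to [-A, A]; the tails are O(1/A).
import math

def sinc(x):
    # sin(x)/x with the removable singularity at 0 filled in
    return 1.0 if x == 0 else math.sin(x) / x

def trapezoid(f, a, b, n):
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        s += f(a + k * h)
    return s * h

A, n = 500.0, 100_000
I1 = trapezoid(sinc, -A, A, n)
I2 = trapezoid(lambda x: sinc(x) ** 2, -A, A, n)
assert abs(I1 - math.pi) < 0.02 and abs(I2 - math.pi) < 0.02
```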

This post is intended to prepare for establishing the existence of the Lebesgue measure, often denoted by $m$, in the future. In fact, the Lebesgue measure follows as a special case of the R-M-K representation theorem. You may not believe it, but the Euclidean properties of $\mathbb{R}^k$ play no role in the existence of $m$. The only topological property that matters is the fact that $\mathbb{R}^k$ is a locally compact Hausdorff space.

The theorem is named after F. Riesz, who introduced it for continuous functions on $[0,1]$ (with respect to the Riemann-Stieltjes integral). Years later, after the generalizations by A. Markov and S. Kakutani, we are able to view it on a locally compact Hausdorff space.

You may find some of the properties over-generalized, but this is intended to let you enjoy more along the way (there are some tools related to differential geometry). There are also many topology and analysis tricks worth your attention.

Again, Euclidean topology plays no role in this proof. We need to specify the topology for different reasons. This is similar to what we do in linear functional analysis. Throughout, let $X$ be a topological space.

**0.0 Definition.** $X$ is a *Hausdorff space* if the following is true: If $p \in X$, $q\in X$ but $p \neq q$, then there are two **disjoint** open sets $U$ and $V$ such that $p \in U$ and $q \in V$.

**0.1 Definition.** $X$ is *locally compact* if every point of $X$ has a neighborhood whose closure is compact.

**0.2 Remarks.** A Hausdorff space is also called a $T_2$ space (see Kolmogorov classification) or a separated space. There is a classic example of locally compact Hausdorff space: $\mathbb{R}^n$. It is trivial to verify this. But this is far from being enough. In the future we will see, we can construct some ridiculous but mathematically valid measures.

**0.3 Definition.** A set $E \subset X$ is called *$\sigma$-compact* if $E$ is a countable union of compact sets. Note that every open subset of a Euclidean space $\mathbb{R}^n$ is $\sigma$-compact, since it can always be written as a countable union of closed balls (which are compact).

**0.4 Definition.** A covering of $X$ is *locally finite* if every point has a neighborhood which intersects only finitely many elements of the covering. Of course, if the covering is already finite, it’s also locally finite.

**0.5 Definition.** A *refinement* of a covering of $X$ is a second covering, each element of which is contained in an element of the first covering.

**0.6 Definition.** $X$ is *paracompact* if it is Hausdorff, and every open covering has a locally finite open refinement. Obviously any compact space is paracompact.

**0.7 Theorem.** If $X$ is a second countable Hausdorff space and is locally compact, then $X$ is paracompact. For proof, see this [Theorem 2.6].

**0.8 Theorem.** If $X$ is locally compact and $\sigma$-compact, then $X=\bigcup_{i=1}^{\infty}K_i$ where for all $i \in \mathbb{N}$, $K_i$ is compact and $K_i \subset\operatorname{int}K_{i+1}$.
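For $X=\mathbb{R}$ this exhaustion is explicit: $K_n=[-n,n]$. A tiny sketch of the two properties (illustrative only; the helper names are made up):

```python
# Theorem 0.8 for X = R: the compact sets K_n = [-n, n] exhaust R and
# satisfy K_n ⊂ int(K_{n+1}) = (-(n+1), n+1).
import math

def K(n):
    return (-float(n), float(n))          # endpoints of [-n, n]

for x in (0.0, -3.7, 1e6):
    n = max(1, math.ceil(abs(x)))         # x belongs to K_n for this n
    lo, hi = K(n)
    assert lo <= x <= hi                  # x ∈ K_n
    assert -(n + 1) < lo and hi < n + 1   # K_n ⊂ int(K_{n+1})
```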

The basic technical tool in the theory of differential manifolds is the existence of a partition of unity. We will borrow this tool for applications in analysis.

**1.0 Definition.** A **partition of unity** on $X$ is a collection $(g_i)$ of continuous real valued functions on $X$ such that

- $g_i \geq 0$ for each $i$.
- every $x \in X$ has a neighborhood $U$ such that $U \cap \operatorname{supp}(g_i)=\varnothing$ for all but finitely many of $g_i$.
- for each $x \in X$, we have $\sum_{i}g_i(x)=1$. (That’s why you see the word ‘unity’.)

**1.1 Definition.** A partition of unity $(g_i)$ on $X$ is *subordinate* to an open cover of $X$ if and only if for each $g_i$ there is an element $U$ of the cover such that $\operatorname{supp}(g_i) \subset U$. We say $X$ *admits* partitions of unity if and only if for every open cover of $X$, there exists a partition of unity subordinate to the cover.
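To see Definitions 1.0 and 1.1 in action, here is a minimal sketch (the cover and the tent functions are made-up illustrations, not part of the theory above): two tent functions on the line, normalized by their sum, form a partition of unity on $[0,1]$ subordinate to the cover $U_1=(-0.2,0.7)$, $U_2=(0.3,1.2)$.

```python
# A partition of unity on [0, 1] subordinate to the open cover
# U1 = (-0.2, 0.7), U2 = (0.3, 1.2).  Each tent g_i is continuous with
# supp(g_i) contained in U_i; normalizing by the sum gives h_1 + h_2 = 1.
def tent(a, b):
    # continuous, positive exactly on (a, b), so supp = [a, b]
    return lambda x: max(0.0, min(x - a, b - x))

g1 = tent(-0.1, 0.6)   # supp(g1) = [-0.1, 0.6] ⊂ U1
g2 = tent(0.4, 1.1)    # supp(g2) = [0.4, 1.1] ⊂ U2

def h(i, x):
    s = g1(x) + g2(x)  # strictly positive for every x in [0, 1]
    return (g1(x) if i == 1 else g2(x)) / s

for k in range(101):   # check nonnegativity and the 'unity' property
    x = k / 100
    assert h(1, x) >= 0 and h(2, x) >= 0
    assert abs(h(1, x) + h(2, x) - 1.0) < 1e-12
```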

**1.2 Theorem.** A Hausdorff space admits a partition of unity if and only if it is paracompact. As a corollary, we have:

**1.3 Corollary.** Suppose $V_1,\cdots,V_n$ are open subsets of a locally compact Hausdorff space $X$, $K$ is compact, and
$$K \subset V_1 \cup V_2 \cup \cdots \cup V_n.$$

Then there exists a partition of unity $(h_i)$ subordinate to the cover $(V_i)$ such that $\operatorname{supp}(h_i) \subset V_i$ and $\sum_{i=1}^{n}h_i(x)=1$ for all $x \in K$.

**2.0 Notation.** The notation
$$K \prec f$$

will mean that $K$ is a compact subset of $X$, that $f \in C_c(X)$, that $f(X) \subset [0,1]$, and that $f(x)=1$ for all $x \in K$. The notation
$$f \prec V$$

will mean that $V$ is open, that $f \in C_c(X)$, that $f(X) \subset [0,1]$ and that $\operatorname{supp}(f) \subset V$. If both hold, we write
$$K \prec f \prec V.$$

**2.1 Remarks.** Clearly, with this notation, we are able to simplify the statement of being subordinate. We merely need to write $g_i \prec U$ in 1.1 instead of $\operatorname{supp}(g_i) \subset U$.

**2.2 Urysohn’s Lemma for locally compact Hausdorff space.** Suppose $X$ is locally compact and Hausdorff, $V$ is open in $X$ and $K \subset V$ is a compact set. Then there exists an $f \in C_c(X)$ such that
$$K \prec f \prec V.$$

**2.3 Remarks.** By $f \in C_c(X)$ we shall mean $f$ is a continuous function with compact support. This relation also says that $\chi_K \leq f \leq \chi_V$. For more details and the proof, visit this page. This lemma holds more generally for normal spaces; for a proof at that level, see arXiv:1910.10381. (Question: why do we consider two disjoint closed subsets there?)
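On a metric space one can write such a function down explicitly: $f(x)=\frac{d(x,V^c)}{d(x,V^c)+d(x,K)}$. A sketch for $K=[0,1] \subset V=(-1,2)$ on the real line (the distance functions below are computed by hand for these particular intervals; an illustration, not the general proof):

```python
# Explicit Urysohn-type function on R for K = [0, 1] ⊂ V = (-1, 2):
#   f(x) = d(x, V^c) / (d(x, V^c) + d(x, K))
# f is continuous, 0 ≤ f ≤ 1, f = 1 on K, supp(f) ⊂ V, i.e. K ≺ f ≺ V.
def dist_to_K(x):            # distance to the compact set [0, 1]
    return max(0.0, -x, x - 1.0)

def dist_to_Vc(x):           # distance to the complement of (-1, 2)
    return max(0.0, min(x + 1.0, 2.0 - x))

def f(x):
    a, b = dist_to_Vc(x), dist_to_K(x)
    return a / (a + b)       # denominator never vanishes: K ∩ V^c = ∅

assert f(0.0) == f(0.5) == f(1.0) == 1.0   # f = 1 on K
assert f(-1.0) == f(2.5) == 0.0            # f vanishes outside V
assert 0.0 < f(-0.5) < 1.0                 # intermediate values on V - K
```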

We will be using the $\varepsilon$-definitions of $\sup$ and $\inf$, which will make the proof easier in this case, but which can be troublesome if you have not seen them. So we put them down here.

Let $S$ be a nonempty subset of the real numbers that is bounded below. A lower bound $w$ is the infimum of $S$ if and only if for any $\varepsilon>0$ there exists an element $x_\varepsilon \in S$ such that $x_\varepsilon<w+\varepsilon$.

This definition of $\inf$ is equivalent to the if-then definition by

Let $S$ be a set that is bounded below. We say $w=\inf S$ when $w$ satisfies the following conditions.

- $w$ is a lower bound of $S$.
- If $t$ is also a lower bound of $S$, then $t \leq w$.

We have the analogous definition for $\sup$.
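A concrete instance of the $\varepsilon$-characterization, as a quick sketch (illustrative values only):

```python
# The ε-characterization of inf on a concrete set: S = {1/n : n ≥ 1}
# has infimum w = 0, and for every ε > 0 some x_eps ∈ S lies below w + ε.
S = [1.0 / n for n in range(1, 10_001)]
w = 0.0
assert all(x >= w for x in S)             # w is a lower bound of S
for eps in (0.5, 0.1, 1e-3):
    assert any(x < w + eps for x in S)    # the ε-condition
assert min(S) > w                         # note: w itself is not attained
```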

Analysis is full of vector spaces and linear transformations. We already know that the Lebesgue integral induces a linear functional. That is, for example, $L^1([0,1])$ is a vector space, and we have a linear functional by
$$f \mapsto \int_{0}^{1}f(x)\,dx.$$

But what about the reverse? Given a linear functional, is it guaranteed that we have a measure to establish the integral? The R-M-K theorem answers this question affirmatively. The functional to be discussed is *positive*, meaning that if $f(X) \subset [0,\infty)$, then $\Lambda{f} \in [0,\infty)$.
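For instance, a Riemann-sum approximation of the ordinary integral gives a positive linear functional on $C_c(\mathbb{R})$; the sketch below uses made-up helper names and is only meant to make ‘positive linear functional’ tangible:

```python
# A positive linear functional on C_c(R), approximated by a midpoint
# Riemann sum on [-10, 10] (wide enough for the supports used below).
def Lam(f, a=-10.0, b=10.0, n=20_000):
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

tri = lambda x: max(0.0, 1.0 - abs(x))   # triangle bump, support [-1, 1]
g = lambda x: tri(x - 1.0)               # shifted copy, support [0, 2]

assert Lam(tri) > 0                      # positivity on nonnegative f
assert abs(Lam(tri) - 1.0) < 1e-4        # area of the triangle is 1
# linearity (up to floating point): Λ(2f + g) = 2Λf + Λg
lhs = Lam(lambda x: 2 * tri(x) + g(x))
rhs = 2 * Lam(tri) + Lam(g)
assert abs(lhs - rhs) < 1e-9
```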

Let $X$ be a locally compact Hausdorff space, and let $\Lambda$ be a positive linear functional on $C_c(X)$. Then there exists a $\sigma$-algebra $\mathfrak{M}$ on $X$ which contains all Borel sets in $X$, and there exists a unique positive measure $\mu$ on $\mathfrak{M}$ which represents $\Lambda$ in the sense that
$$\Lambda{f}=\int_X f\,d\mu$$

for all $f \in C_c(X)$.

For the measure $\mu$ and the $\sigma$-algebra $\mathfrak{M}$, we have four assertions:

1. $\mu(K)<\infty$ for every compact set $K \subset X$.
2. For every $E \in \mathfrak{M}$, we have $\mu(E)=\inf\{\mu(V):E \subset V, V \text{ open}\}$.
3. For every open set $E$, and for every $E \in \mathfrak{M}$ with $\mu(E)<\infty$, we have $\mu(E)=\sup\{\mu(K):K \subset E, K \text{ compact}\}$.
4. If $E \in \mathfrak{M}$, $A \subset E$, and $\mu(E)=0$, then $A \in \mathfrak{M}$.

**Remarks before proof.** It would be great if we could establish the Lebesgue measure $m$ simply by putting $X=\mathbb{R}^n$, but we need a little extra work to get that result naturally. If assertion 2 is satisfied, we say $\mu$ is *outer* regular, and *inner* regular for assertion 3. If both hold, we say $\mu$ is *regular*. The partition of unity and Urysohn’s lemma will be heavily used in the proof of the main theorem, so make sure you have no problem with them.

The proof is rather long so we will split it into several steps. I will try my best to make every line clear enough.

For every open set $V \subset X$, define
$$\mu(V)=\sup\{\Lambda{f}:f \prec V\}.$$

If $V_1 \subset V_2$ and both are open, we claim that $\mu(V_1) \leq \mu(V_2)$. For $f \prec V_1$, since $\operatorname{supp}f \subset V_1 \subset V_2$, we see $f \prec V_2$. But we are able to find some $g \prec V_2$ such that $g \geq f$, or more precisely, $\operatorname{supp}(g) \supset \operatorname{supp}(f)$. By taking another look at the proof of Urysohn’s lemma for locally compact Hausdorff spaces, we see there is an open set $G$ with compact closure such that
$$\operatorname{supp}(f) \subset G \subset \overline{G} \subset V_2.$$

Applying Urysohn’s lemma to the pair $(\overline{G},V_2)$, we see there exists a function $g \in C_c(X)$ such that
$$\overline{G} \prec g \prec V_2.$$

Therefore

Thus for any $f \prec V_1$ and $g \prec V_2$, we have $\Lambda{g} \geq \Lambda{f}$ (monotonic) since $\Lambda{g}-\Lambda{f}=\Lambda{(g-f)}\geq 0$. By taking the supremum over $f$ and $g$, we see

The ‘monotonic’ property of such $\mu$ enables us to *define* $\mu(E)$ for all $E \subset X$ by
$$\mu(E)=\inf\{\mu(V):E \subset V, V \text{ open}\}.$$

The definition above is trivially valid for open sets. Sometimes people say $\mu$ is the outer measure. We will discuss other kinds of sets thoroughly in the following steps. Warning: we are not saying that $\mathfrak{M} = 2^X$. The crucial property of $\mu$, namely countable additivity, will be proved only on a certain $\sigma$-algebra.
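To get a feel for the supremum definition, a numerical sketch (with $\Lambda$ taken to be the ordinary integral on the line, as in the motivation earlier; helper names are made up): the supremum over $f \prec V$ recovers the length of an open interval.

```python
# With Λf = ∫ f dx (midpoint sum), the definition μ(V) = sup{Λf : f ≺ V}
# recovers the length of V = (0, 1): trapezoidal bumps f_eps ≺ V satisfy
# Λf_eps = 1 - 3*eps, which climbs toward 1 as eps → 0.
def integral(f, a=-1.0, b=2.0, n=30_000):
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

def bump(eps):
    # continuous, 0 ≤ f ≤ 1, f = 1 on [2*eps, 1-2*eps], supp(f) = [eps, 1-eps]
    return lambda x: max(0.0, min(1.0, (x - eps) / eps, (1.0 - eps - x) / eps))

vals = [integral(bump(eps)) for eps in (0.2, 0.05, 0.01)]
assert vals[0] < vals[1] < vals[2] < 1.0
assert abs(vals[2] - (1 - 3 * 0.01)) < 1e-3
```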

It follows from the definition of $\mu$ that if $E_1 \subset E_2$, then $\mu(E_1) \leq \mu(E_2)$.

Let $\mathfrak{M}_F$ be the class of all $E \subset X$ which satisfy the two following conditions:

$\mu(E) <\infty$.

‘Inner regular’:
$$\mu(E)=\sup\{\mu(K):K \subset E, K \text{ compact}\}.$$

One may say here $\mu$ is the ‘inner measure’. Finally, let $\mathfrak{M}$ be the class of all $E \subset X$ such that for every compact $K$, we have $E \cap K \in \mathfrak{M}_F$. We shall show that $\mathfrak{M}$ is the desired $\sigma$-algebra.

**Remarks of Step 0.** So far, we have only proved that $\mu(E) \geq 0$ for all $E \subset X$. What about countable additivity? It’s clear that $\mathfrak{M}_F$ and $\mathfrak{M}$ have some strong relation, and we need to get a clearer view of it. Also, if we restrict $\mu$ to $\mathfrak{M}_F$, we restrict ourselves to finite values. In fact, we will finally show that $\mathfrak{M}_F \subset \mathfrak{M}$.

If $K$ is compact, then $K \in \mathfrak{M}_F$, and
$$\mu(K)=\inf\{\Lambda{f}:K \prec f\}.$$

Define $V_\alpha=f^{-1}(\alpha,1]$ for $K \prec f$ and $0 < \alpha < 1$. Since $f(x)=1$ for all $x \in K$, we have $K \subset V_{\alpha}$. Therefore by definition of $\mu$ for all $E \subset X$, we have

Note that $f \geq \alpha{g}$ whenever $g \prec V_{\alpha}$, since $\alpha{g} \leq \alpha < f$ on $V_\alpha$. Since $\mu(K)$ is a lower bound of $\frac{1}{\alpha}\Lambda{f}$ with $0<\alpha<1$, we see

Since $f(X) \subset [0,1]$, $\Lambda{f}$ is finite; namely $\mu(K) <\infty$. Since $K$ itself is compact, we see $K \in \mathfrak{M}_F$.

To prove the identity, note that for any $\varepsilon>0$ there exists some open $V \supset K$ such that $\mu(V)<\mu(K)+\varepsilon$. By Urysohn’s lemma, there exists some $h \in C_c(X)$ such that $K \prec h \prec V$. Therefore

Therefore $\mu(K)$ is the infimum of $\Lambda{h}$ with $K \prec h$.

**Remarks of Step 1.** We have just proved assertion 1 of the property of $\mu$. The hardest part of this proof is the inequality

But this is merely the $\varepsilon$-definition of $\inf$. Note that $\mu(K)$ is the infimum of $\mu(V)$ with $V \supset K$. For any $\varepsilon>0$, there exists some open $V$ for what? Under certain conditions, this definition is much easier to use. Now we will examine the relation between $\mathfrak{M}_F$ and $\tau_X$, namely the topology of $X$.

$\mathfrak{M}_F$ contains every open set $V$ with $\mu(V)<\infty$.

It suffices to show that for every open set $V$, we have
$$\mu(V)=\sup\{\mu(K):K \subset V, K \text{ compact}\}.$$

For $0<\varepsilon<\mu(V)$, we see there exists an $f \prec V$ such that $\Lambda{f}>\mu(V)-\varepsilon$. If $W$ is any open set which contains $K= \operatorname{supp}(f)$, then $f \prec W$, and therefore $\Lambda{f} \leq \mu(W)$. Again by definition of $\mu(K)$, we see

Therefore

This is exactly the definition of $\sup$. The identity is proved.

**Remarks of Step 2.** It’s important to note that this identity is only guaranteed for open sets and for sets $E$ with $\mu(E)<\infty$, the latter of which will be proved in the following steps. This is the *flaw* of this theorem. With these preparations however, we are able to show the countable additivity of $\mu$ on $\mathfrak{M}_F$.

If $E_1,E_2,E_3,\cdots$ are arbitrary subsets of $X$, then
$$\mu\left(\bigcup_{i=1}^{\infty}E_i\right) \leq \sum_{i=1}^{\infty}\mu(E_i).$$

First we show this holds for finitely many open sets. This is tantamount to showing that
$$\mu(V_1 \cup V_2) \leq \mu(V_1)+\mu(V_2)$$

if $V_1$ and $V_2$ are open. Pick $g \prec V_1 \cup V_2$. This is possible due to Urysohn’s lemma. By corollary 1.3, there is a partition of unity $(h_1,h_2)$ subordinate to $(V_1,V_2)$ in the sense of corollary 1.3. Therefore,

Notice that $h_1g \prec V_1$ and $h_2g \prec V_2$. By taking the supremum, we have

Now we go back to arbitrary subsets of $X$. If $\mu(E_i)=\infty$ for some $i$, then there is nothing to prove. Therefore we shall assume that $\mu(E_i)<\infty$ for all $i$. By definition of $\mu(E_i)$, we see there are open sets $V_i \supset E_i$ such that

Put $V=\bigcup_{i=1}^{\infty}V_i$, and choose $f \prec V$. Since $f \in C_c(X)$, there is a finite collection of the $V_i$ that covers the support of $f$. Therefore without loss of generality, we may say that

for some $n$. We therefore obtain

for all $f \prec V$. Since $\bigcup E_i \subset V$, we have $\mu(\bigcup E_i) \leq \mu(V)$. Therefore

Since $\varepsilon$ is arbitrary, the inequality is proved.

**Remarks of Step 3.** Again, we are using the $\varepsilon$-definition of $\inf$. One may say this step showed the subadditivity of the outer measure. Also note the geometric series $\sum_{k=1}^{\infty}\frac{\varepsilon}{2^k}=\varepsilon$.

Suppose $E=\bigcup_{i=1}^{\infty}E_i$, where $E_1,E_2,\cdots$ are pairwise disjoint members of $\mathfrak{M}_F$, then
$$\mu(E)=\sum_{i=1}^{\infty}\mu(E_i).$$

If $\mu(E)<\infty$, we also have $E \in \mathfrak{M}_F$.

As a dual to Step 3, we first show this holds for finitely many compact sets. As proved in Step 1, compact sets are in $\mathfrak{M}_F$. Suppose now $K_1$ and $K_2$ are disjoint compact sets. We want to show that
$$\mu(K_1 \cup K_2)=\mu(K_1)+\mu(K_2).$$

Note that compact sets in a Hausdorff space are closed. Therefore we are able to apply Urysohn’s lemma to the pair $(K_1,K_2^c)$. That said, there exists an $f \in C_c(X)$ such that

In other words, $f(x)=1$ for all $x \in K_1$ and $f(x)=0$ for all $x \in K_2$, since $\operatorname{supp}(f) \cap K_2 = \varnothing$. By Step 1, since $K_1 \cup K_2$ is compact, there exists some $g \in C_c(X)$ such that

Now things become tricky. We are able to write $g$ by

But $K_1 \prec fg$ and $K_2 \prec (1-f)g$ by the properties of $f$ and $g$. Also since $\Lambda$ is linear, we have

Therefore we have

On the other hand, by Step 3, we have

Therefore they must be equal.

If $\mu(E)=\infty$, there is nothing to prove. So now we should assume that $\mu(E)<\infty$. Since $E_i \in \mathfrak{M}_F$, there are compact sets $K_i \subset E_i$ with
$$\mu(K_i)>\mu(E_i)-\frac{\varepsilon}{2^i}.$$

Putting $H_n=K_1 \cup K_2 \cup \cdots \cup K_n$, we see $E \supset H_n$ and
$$\mu(E) \geq \mu(H_n)=\sum_{i=1}^{n}\mu(K_i)>\sum_{i=1}^{n}\mu(E_i)-\varepsilon.$$

This inequality holds for all $n$ and $\varepsilon$, therefore
$$\mu(E) \geq \sum_{i=1}^{\infty}\mu(E_i).$$

Therefore by Step 3, the identity holds.

Finally we shall show that $E \in \mathfrak{M}_F$ if $\mu(E) <\infty$. To make it more understandable, we will use elementary calculus notation. If we write $\mu(E)=x$ and $x_n=\sum_{i=1}^{n}\mu(E_i)$, we see
$$\lim_{n \to \infty}x_n=x.$$

Therefore, for any $\varepsilon>0$, there exists some $N \in \mathbb{N}$ such that
$$x-x_N<\varepsilon.$$

This is tantamount to
$$\sum_{i=1}^{N}\mu(E_i)>\mu(E)-\varepsilon.$$

But by definition of the *compact* set $H_N$ above, we see
$$\mu(H_N)=\sum_{i=1}^{N}\mu(K_i)>\sum_{i=1}^{N}\mu(E_i)-\varepsilon>\mu(E)-2\varepsilon.$$

Hence $E$ satisfies the requirements of $\mathfrak{M}_F$, thus an element of it.

**Remarks of Step 4.** You should realize that we are heavily using the $\varepsilon$-definition of $\sup$ and $\inf$. As you may guess, $\mathfrak{M}_F$ should be a subset of $\mathfrak{M}$ though we don’t know whether it is a $\sigma$-algebra or not. In other words, we hope that the countable additivity of $\mu$ holds on a $\sigma$-algebra that is *properly extended* from $\mathfrak{M}_F$. However it’s still difficult to show that $\mathfrak{M}$ is a $\sigma$-algebra. We need more properties of $\mathfrak{M}_F$ to go on.

If $E \in \mathfrak{M}_F$ and $\varepsilon>0$, there is a compact $K$ and an open $V$ such that $K \subset E \subset V$ and $\mu(V-K)<\varepsilon$.

There are two ways to write $\mu(E)$, namely
$$\mu(E)=\inf\{\mu(V):V \supset E\}=\sup\{\mu(K):K \subset E\},$$

where $K$ is compact and $V$ is open. Therefore there exists some $K$ and $V$ such that
$$\mu(V)-\frac{\varepsilon}{2}<\mu(E)<\mu(K)+\frac{\varepsilon}{2}.$$

Since $V-K$ is open, and $\mu(V-K)<\infty$, we have $V-K \in \mathfrak{M}_F$. By Step 4, we have
$$\mu(K)+\mu(V-K)=\mu(V)<\mu(E)+\frac{\varepsilon}{2}<\mu(K)+\varepsilon.$$

Therefore $\mu(V-K)<\varepsilon$ as proved.

**Remarks of Step 5.** You should be familiar with the $\varepsilon$-definitions of $\sup$ and $\inf$ now. Since $V-K =V\cap K^c \subset V$, we have $\mu(V-K)\leq\mu(V)<\mu(E)+\frac{\varepsilon}{2}<\infty$.

If $A,B \in \mathfrak{M}_F$, then $A-B,A\cup B$ and $A \cap B$ are elements of $\mathfrak{M}_F$.

This shows that $\mathfrak{M}_F$ is closed under union, intersection and relative complement. In fact, we merely need to prove $A-B \in \mathfrak{M}_F$, since $A \cup B=(A-B) \cup B$ and $A\cap B = A-(A-B)$.
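The two set identities invoked here are easy to check on concrete finite sets (a throwaway sanity check, nothing more):

```python
# A ∪ B = (A - B) ∪ B and A ∩ B = A - (A - B), checked on finite sets.
A, B = {1, 2, 3, 4}, {3, 4, 5}
assert A | B == (A - B) | B
assert A & B == A - (A - B)
```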

By Step 5, for $\varepsilon>0$, there are sets $K_A$, $K_B$, $V_A$, $V_B$ such that $K_A \subset A \subset V_A$, $K_B \subset B \subset V_B$, and for $A-B$ we have
$$A-B \subset V_A-K_B \subset (V_A-K_A) \cup (K_A-V_B) \cup (V_B-K_B).$$

With an application of Steps 3 and 5, we have
$$\mu(A-B) \leq \mu(V_A-K_A)+\mu(K_A-V_B)+\mu(V_B-K_B)<\mu(K_A-V_B)+2\varepsilon.$$

Since $K_A-V_B$ is a closed subset of $K_A$, we see $K_A-V_B$ is compact as well (a closed subset of a compact set is compact). But $K_A-V_B \subset A-B$ and $\mu(A-B) <\mu(K_A-V_B)+2\varepsilon$, so $A-B$ meets the requirements of $\mathfrak{M}_F$ (the fact that $\mu(A-B)<\infty$ is trivial since $\mu(A-B)\leq\mu(A)$).

Since $A-B$ and $B$ are disjoint members of $\mathfrak{M}_F$, we see

Thus $A \cup B \in \mathfrak{M}_F$. Since $A,A-B \in \mathfrak{M}_F$, we see $A \cap B = A-(A-B) \in \mathfrak{M}_F$.

**Remarks of Step 6.** In this step, we demonstrated several ways to express a set, all of which end up with a huge simplification. Now we are able to show that $\mathfrak{M}_F$ is a subset of $\mathfrak{M}$.

There is a precise relation between $\mathfrak{M}$ and $\mathfrak{M}_F$ by
$$\mathfrak{M}_F=\{E \in \mathfrak{M}:\mu(E)<\infty\}.$$

If $E \in \mathfrak{M}_F$, we shall show that $E \in \mathfrak{M}$. For any compact $K$ we have $K\in\mathfrak{M}_F$ (Step 1); by Step 6, $K \cap E \in \mathfrak{M}_F$, therefore $E \in \mathfrak{M}$.

If $E \in \mathfrak{M}$ with $\mu(E)<\infty$ however, we need to show that $E \in \mathfrak{M}_F$. By definition of $\mu$, for $\varepsilon>0$, there is an open $V$ such that
$$E \subset V \quad\text{and}\quad \mu(V)<\mu(E)+\varepsilon<\infty.$$

Therefore $V \in \mathfrak{M}_F$. By Step 5, there is a compact set $K$ such that $\mu(V-K)<\varepsilon$ (the open set containing $V$ should be $V$ itself). Since $E \cap K \in \mathfrak{M}_F$, there exists a compact set $H \subset E \cap K$ with
$$\mu(E \cap K)<\mu(H)+\varepsilon.$$

Since $E \subset (E \cap K) \cup (V-K)$, it follows from Step 1 that
$$\mu(E) \leq \mu(E \cap K)+\mu(V-K)<\mu(H)+2\varepsilon.$$

Therefore $E \in \mathfrak{M}_F$.

**Remarks of Step 7.** Several tricks in the preceding steps are used here. Now we are pretty close to the fact that $(X,\mathfrak{M},\mu)$ is a measure space. Note that for $E \in \mathfrak{M}-\mathfrak{M}_F$, we have $\mu(E)=\infty$, but we have already proved the countable additivity for $\mathfrak{M}_F$. Is it ‘almost trivial’ for $\mathfrak{M}$? Before that, we need to show that $\mathfrak{M}$ is a $\sigma$-algebra. Note that assertion 3 of $\mu$ has been proved.

We will validate the definition of $\sigma$-algebra one by one.

$X \in \mathfrak{M}$.

For any compact $K \subset X$, we have $K \cap X=K$. But as proved in Step 1, $K \in \mathfrak{M}_F$, therefore $X \in \mathfrak{M}$.

If $A \in \mathfrak{M}$, then $A^c \in\mathfrak{M}$.

If $A \in \mathfrak{M}$, then $A \cap K \in \mathfrak{M}_F$. But
$$K \cap A^c=K-(A \cap K).$$

By Step 1 and Step 6, we see $K \cap A^c \in \mathfrak{M}_F$, thus $A^c \in \mathfrak{M}$.

If $A_n \in \mathfrak{M}$ for all $n \in \mathbb{N}$, then $A=\bigcup_{n=1}^{\infty}A_n \in \mathfrak{M}$.

We assign an auxiliary sequence of sets inductively. For $n=1$, we write $B_1=A_1 \cap K$ where $K$ is compact. Then $B_1 \in \mathfrak{M}_F$. For $n \geq 2$, we write
$$B_n=(A_n \cap K)-(B_1 \cup B_2 \cup \cdots \cup B_{n-1}).$$

Since $A_n \cap K \in \mathfrak{M}_F$ and $B_1,B_2,\cdots,B_{n-1} \in \mathfrak{M}_F$, by Step 6, $B_n \in \mathfrak{M}_F$. Also, the $B_n$ are pairwise disjoint.

Another set-theoretic manipulation shows that
$$A \cap K=\bigcup_{n=1}^{\infty}B_n.$$

Now we are able to evaluate $\mu(A \cap K)$ by Step 4.

Therefore $A \cap K \in \mathfrak{M}_F$, which implies that $A \in \mathfrak{M}$.
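The disjointification $B_n=(A_n \cap K)-(B_1 \cup \cdots \cup B_{n-1})$ used in this step can be sketched on finite sets (illustrative values only):

```python
# Disjointification: B_n = (A_n ∩ K) - (B_1 ∪ ... ∪ B_{n-1}) yields pairwise
# disjoint sets whose union equals (∪ A_n) ∩ K.
K = set(range(10))
A = [{0, 1, 2}, {2, 3}, {3, 4, 100}]      # 100 lies outside K
Bs, covered = [], set()
for An in A:
    Bn = (An & K) - covered
    Bs.append(Bn)
    covered |= Bn
assert Bs == [{0, 1, 2}, {3}, {4}]        # pairwise disjoint
assert covered == (A[0] | A[1] | A[2]) & K
```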

$\mathfrak{M}$ contains all Borel sets.

Indeed, it suffices to prove that $\mathfrak{M}$ contains all open sets and/or all closed sets. We’ll show both routes. Let $K$ be a compact set.

- If $C$ is closed, then $C \cap K$ is compact, therefore $C \cap K \in \mathfrak{M}_F$ (by Step 1), and hence $C \in \mathfrak{M}$.
- If $D$ is open, then $D \cap K \subset K$, so $\mu(D \cap K) \leq \mu(K)<\infty$, which shows that $D \cap K \in \mathfrak{M}_F$ (by Step 7), and hence $D \in \mathfrak{M}$.

Therefore by 1 or 2, $\mathfrak{M}$ contains all Borel sets.

Again, we will verify all properties of $\mu$ one by one.

$\mu(E) \geq 0$ for all $E \in \mathfrak{M}$.

This follows immediately from the definition of $\mu$, since $\Lambda$ is positive and $0 \leq f \leq 1$.

$\mu$ is countably additive.

If $A_1,A_2,\cdots$ form a disjoint countable collection of members of $\mathfrak{M}$, we need to show that
$$\mu\left(\bigcup_{n=1}^{\infty}A_n\right)=\sum_{n=1}^{\infty}\mu(A_n).$$

If $A_n \in \mathfrak{M}_F$ for all $n$, then this is merely what we proved in Step 4. If some $A_j \in \mathfrak{M}-\mathfrak{M}_F$ however, we have $\mu(A_j)=\infty$, so $\sum_n\mu(A_n)=\infty$. For $\mu(\bigcup_n A_n)$, notice that $\bigcup_n A_n \supset A_j$, hence $\mu(\bigcup_n A_n) \geq \mu(A_j)=\infty$. The identity is now proved.

So far assertions 1 to 3 have been proved, but the final assertion has not been proved explicitly. We do that now, since this property will be used when discussing the Lebesgue measure $m$. In fact, it shows that $(X,\mathfrak{M},\mu)$ is a complete measure space.

If $E \in \mathfrak{M}$, $A \subset E$, and $\mu(E)=0$, then $A \in \mathfrak{M}$.

It suffices to show that $A \in \mathfrak{M}_F$. By definition, $\mu(A)=0$ as well. If $K \subset A$, where $K$ is compact, then $\mu(K)=\mu(A)=0$. Therefore $0$ is the supremum of $\mu(K)$. It follows that $A \in \mathfrak{M}_F \subset \mathfrak{M}$.

For every $f \in C_c(X)$, $\Lambda{f}=\int_X fd\mu$.

This is the main result of the theorem. It suffices to prove the inequality
$$\Lambda{f} \leq \int_X f\,d\mu$$

for all $f \in C_c(X)$. What about the other side? By the linearity of $\Lambda$ and of $\int_X \cdot\, d\mu$, once the inequality above is proved, applying it to $-f$ gives

Therefore

holds as well, and this establishes the equality.

Notice that since $K=\operatorname{supp}(f)$ is compact, the range of $f$ has to be compact. Namely, we may assume that $[a,b]$ contains the range of $f$. For $\varepsilon>0$, we are able to pick a partition of an interval containing $[a,b]$ such that $y_i - y_{i-1}<\varepsilon$ and

Put

Since $f$ is continuous, $f$ is Borel measurable. The sets $E_i$ are trivially pairwise disjoint Borel sets. Again, there are open sets $V_i \supset E_i$ such that

for $i=1,2,\cdots,n$, and such that $f(x)<y_i + \varepsilon$ for all $x \in V_i$. Notice that $(V_i)$ covers $K$; therefore by the partition of unity, there is a collection of functions $(h_i)$ such that $h_i \prec V_i$ for all $i$ and $\sum h_i=1$ on $K$. By Step 1 and the fact that $f=\sum_i h_i f$, we see

By the way we picked $V_i$, we see $h_if \leq (y_i+\varepsilon)h_i$. We have the following inequality:

Since $h_i \prec V_i$, we have $\mu(E_i)+\frac{\varepsilon}{n}>\mu(V_i) \geq \Lambda{h_i}$. And we already get $\sum_i \Lambda{h_i} \geq \mu(K)$. If we put them into the inequality above, we get

Observe that $\cup_i E_i=K$, by Step 9 we have $\sum_{i}\mu(E_i)=\mu(K)$. A slight manipulation shows that

Therefore for $\Lambda f$ we get

Now here comes the trickiest part of the whole blog post. By definition of $E_i$, we see $f(x) > y_{i-1}>y_{i}-\varepsilon$ for $x \in E_i$. Therefore we get a simple function $s_n$ by

If we evaluate the Lebesgue integral of $f$ with respect to $\mu$, we see

For $2\varepsilon\mu(K)$, things are simple since $0\leq\mu(K)<\infty$. Therefore $2\varepsilon\mu(K) \to 0$ as $\varepsilon \to 0$. Now let’s estimate the final part of the inequality. It’s trivial that $\frac{\varepsilon}{n}\sum_{i=1}^{n}(|a|+\varepsilon)=\varepsilon(\varepsilon+|a|)$. For $y_i$, observe that $y_i \leq b$ for all $i$, therefore $\frac{\varepsilon}{n}\sum_{i=1}^{n}y_i \leq \frac{\varepsilon}{n}nb=\varepsilon b$. Thus

Notice that $b+|a| \geq 0$ since $b \geq a \geq -|a|$. Our estimation of $\Lambda{f}$ is finally done:

Since $\varepsilon$ is arbitrary, we see $\Lambda{f} \leq \int_X fd\mu$. The identity is proved.

If there are two measures $\mu_1$ and $\mu_2$ that satisfy assertions 1 to 4 and correspond to $\Lambda$, then $\mu_1=\mu_2$.

In fact, according to assertions 2 and 3, $\mu$ is determined by its values on compact subsets of $X$. It suffices to show that

If $K$ is a compact subset of $X$, then $\mu_1(K)=\mu_2(K)$.

Fix $K$ compact and $\varepsilon>0$. By Step 1, there exists an open $V \supset K$ such that $\mu_2(V)<\mu_2(K)+\varepsilon$. By Urysohn’s lemma, there exists some $f$ such that $K \prec f \prec V$. Hence

Thus $\mu_1(K) \leq \mu_2(K)$. If $\mu_1$ and $\mu_2$ are exchanged, we see $\mu_2(K) \leq \mu_1(K)$. The uniqueness is proved.

Can we simply put $X=\mathbb{R}^k$ right now? The answer is no. Note that outer regularity holds for all sets, but inner regularity only for open sets and members of $\mathfrak{M}_F$. We expect the outer and inner regularity to be ‘symmetric’, and there is an example showing that *locally compact* alone is far from enough to offer this ‘symmetry’.

Define $X=\mathbb{R}_1 \times \mathbb{R}_2$, where $\mathbb{R}_1$ is the real line equipped with discrete metric $d_1$, and $\mathbb{R}_2$ is the real line equipped with euclidean metric $d_2$. The metric of $X$ is defined by

The topology $\tau_X$ induced by $d_X$ is naturally Hausdorff and locally compact by considering the vertical segments. So what would happen to this weird locally compact Hausdorff space?

If $f \in C_c(X)$, let $x_1,x_2,\cdots,x_n$ be those values of $x$ for which $f(x,y) \neq 0$ for at least one $y$. Since $f$ has compact support, it is ensured that there are only finitely many $x_i$’s. We are able to define a positive linear functional by
$$\Lambda{f}=\sum_{i=1}^{n}\int_{-\infty}^{\infty}f(x_i,y)\,dy,$$

where $\mu$ is the measure associated with $\Lambda$ in the sense of R-M-K theorem. Let
$$E=\{(x,0):x \in \mathbb{R}\},$$

By squeezing the disjoint vertical segments around $(x_i,0)$, we see $\mu(K)=0$ for all compact $K \subset E$ but $\mu(E)=\infty$.

This is in violent contrast to what we would expect. However, if $X$ is required to be $\sigma$-compact (note that the space in this example is not), this kind of problem disappears neatly.

- Walter Rudin, *Real and Complex Analysis*
- Serge Lang, *Fundamentals of Differential Geometry*
- Joel W. Robbin, *Partition of Unity*
- Brian Conrad, *Paracompactness and local compactness*
- Raoul Bott & Loring W. Tu, *Differential Forms in Algebraic Topology*

We are finally going to prove the open mapping theorem for $F$-spaces. In this version, only metric and completeness are required; therefore it naturally contains the Banach space version.

(Theorem 0) Suppose we have the following conditions:

- $X$ is an $F$-space,
- $Y$ is a topological vector space,
- $\Lambda: X \to Y$ is continuous and linear, and
- $\Lambda(X)$ is of the second category in $Y$.
Then $\Lambda$ is an open mapping.

*Proof.* Let $B$ be a neighborhood of $0$ in $X$. Let $d$ be an invariant metric on $X$ that is compatible with the $F$-topology of $X$. Define a sequence of balls by
$$B_n=\left\{x \in X:d(x,0)<\frac{r}{2^n}\right\} \quad (n=0,1,2,\cdots),$$

where $r$ is picked in such a way that $B_0 \subset B$. To show that $\Lambda$ is an open mapping, we need to prove that there exists some neighborhood $W$ of $0$ in $Y$ such that

$$W \subset \Lambda(B).$$
To do this, however, we need an auxiliary set. In fact, we will show that there exists some $W$ such that

$$W \subset \overline{\Lambda(B_1)} \subset \Lambda(B).$$
We need to prove the inclusions one by one.

The first inclusion requires BCT. Since $B_2 -B_2 \subset B_1$, and $Y$ is a topological vector space, we get

$$\overline{\Lambda(B_2)}-\overline{\Lambda(B_2)} \subset \overline{\Lambda(B_2)-\Lambda(B_2)}=\overline{\Lambda(B_2-B_2)} \subset \overline{\Lambda(B_1)}.$$
Since

$$\Lambda(X)=\bigcup_{k=1}^{\infty}k\Lambda(B_2),$$
according to BCT, at least one $k\Lambda(B_2)$ is of the second category in $Y$. But since scalar multiplication $y\mapsto ky$ is a homeomorphism of $Y$ onto $Y$, we see $k\Lambda(B_2)$ is of the second category for all $k$, especially for $k=1$. Therefore $\overline{\Lambda(B_2)}$ has nonempty interior, which implies that there exists some open neighborhood $W$ of $0$ in $Y$ such that $W \subset \overline{\Lambda(B_1)}$. By shifting the index, it's easy to see this holds for all $n$. That is, for $n \geq 1$, there exists some neighborhood $W_n$ of $0$ in $Y$ such that $W_n \subset \overline{\Lambda(B_n)}$.

The second inclusion requires the completeness of $X$. Fix $y_1 \in \overline{\Lambda(B_1)}$; we will show that $y_1 \in \Lambda(B)$. Pick $y_n$ inductively. Assume $y_n$ has been chosen in $\overline{\Lambda(B_n)}$. As stated before, there exists some neighborhood $W_{n+1}$ of $0$ in $Y$ such that $W_{n+1} \subset \overline{\Lambda(B_{n+1})}$. Hence

$$(y_n-W_{n+1}) \cap \Lambda(B_n) \neq \varnothing.$$
Therefore there exists some $x_n \in B_n$ such that

$$\Lambda x_n \in y_n-W_{n+1}.$$
Put $y_{n+1}=y_n-\Lambda x_n$, we see $y_{n+1} \in W_{n+1} \subset \overline{\Lambda(B_{n+1})}$. Therefore we are able to pick $y_n$ naturally for all $n \geq 1$.

Since $d(x_n,0)<\frac{r}{2^n}$ for all $n \geq 1$, the partial sums $z_n=\sum_{k=1}^{n}x_k$ form a Cauchy sequence, which converges to some $z \in X$ since $X$ is an $F$-space. Notice we also have

$$d(z_n,0) \leq \sum_{k=1}^{n}d(x_k,0)<\sum_{k=1}^{n}\frac{r}{2^k}<r;$$

letting $n \to \infty$ gives $d(z,0) \leq \sum_{k=1}^{\infty}d(x_k,0)<r$, hence $z \in B_0 \subset B$.

By the continuity of $\Lambda$, we see $\lim_{n \to \infty}y_n = 0$. Notice we also have

$$\Lambda z_n=\sum_{k=1}^{n}\Lambda x_k=\sum_{k=1}^{n}(y_k-y_{k+1})=y_1-y_{n+1};$$
we see $y_1 = \Lambda z \in \Lambda(B)$.

The whole theorem is now proved, that is, $\Lambda$ is an open mapping. $\square$

You may think the following relation comes from nowhere:

$$(y_n-W_{n+1}) \cap \Lambda(B_n) \neq \varnothing.$$
But it's not. We need to review some point-set topology definitions. Notice that $y_n$ is a limit point of $\Lambda(B_n)$, and $y_n-W_{n+1}$ is an open neighborhood of $y_n$. If $(y_n - W_{n+1}) \cap \Lambda(B_{n})$ were empty, then $y_n$ could not be a limit point.

The geometric series

$$\sum_{n=1}^{\infty}\frac{r}{2^n}=r$$

is widely used when such sums are estimated. It is a good idea to keep this technique in mind.
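The identity is easy to check numerically; below is a quick sketch (the value $r=1$ is an arbitrary illustrative choice):

```python
# Partial sums of the geometric series sum_{n>=1} r / 2^n approach r.
# r = 1.0 is an arbitrary illustrative choice.
r = 1.0
partial = sum(r / 2**n for n in range(1, 51))
print(partial)  # very close to r = 1.0
```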

The formal proofs will not be written down here, but they are quite easy to carry out.

(Corollary 0) $\Lambda(X)=Y$.

This is an immediate consequence of the fact that $\Lambda$ is open. Since $X$ is open in itself, $\Lambda(X)$ is an open subspace of $Y$. But the only open subspace of $Y$ is $Y$ itself.

(Corollary 1) $Y$ is an $F$-space as well.

If you have already seen the commutative diagram given by the quotient space (put $N=\ker\Lambda$), you know that the induced map $f$ is open and continuous. Treating the spaces as groups, by corollary 0 and the first isomorphism theorem, we have

$$X/\ker{\Lambda} \simeq \Lambda(X)=Y.$$

Therefore $f$ is an isomorphism; hence one-to-one. Therefore $f$ is a homeomorphism as well. In this post we showed that $X/\ker{\Lambda}$ is an $F$-space, therefore $Y$ has to be an $F$-space as well. (We are using the fact that $\ker{\Lambda}$ is a closed set. But why is it closed?)

(Corollary 2) If $\Lambda$ is a continuous linear mapping of an $F$-space $X$ onto an $F$-space $Y$, then $\Lambda$ is open.

This is a direct application of BCT and the open mapping theorem. Notice that $Y$, being a complete metric space, is of the second category in itself.

(Corollary 3) If the linear map $\Lambda$ in Corollary 2 is injective, then $\Lambda^{-1}:Y \to X$ is continuous.

This comes from corollary 2 directly since $\Lambda$ is open.

(Corollary 4) If $X$ and $Y$ are Banach spaces, and if $\Lambda: X \to Y$ is a continuous linear bijective map, then there exist positive real numbers $a$ and $b$ such that

$$a\lVert x \rVert \leq \lVert \Lambda x \rVert \leq b\lVert x \rVert$$

for every $x \in X$.

This comes from corollary 3 directly since both $\Lambda$ and $\Lambda^{-1}$ are bounded as they are continuous.
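In finite dimensions the corollary is concrete: for an invertible matrix, $a$ and $b$ can be taken to be its extreme singular values. A minimal numerical sketch, assuming the illustrative map $\Lambda = \operatorname{diag}(2,3)$ on $\mathbb{R}^2$ (so $a=2$, $b=3$):

```python
import math

# For Lambda = diag(2, 3) on R^2, the constants a = 2 and b = 3 satisfy
# a * ||x|| <= ||Lambda x|| <= b * ||x|| (checked here on sample vectors).
def norm(v):
    return math.sqrt(sum(c * c for c in v))

def Lam(v):  # illustrative linear bijection, not taken from the text
    return (2 * v[0], 3 * v[1])

a, b = 2.0, 3.0
samples = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (-2.5, 4.0), (0.3, -0.7)]
ok = all(
    a * norm(v) <= norm(Lam(v)) + 1e-12 and norm(Lam(v)) <= b * norm(v) + 1e-12
    for v in samples
)
print(ok)  # True
```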

(Corollary 5) If $\tau_1 \subset \tau_2$ are vector topologies on a vector space $X$ and if both $(X,\tau_1)$ and $(X,\tau_2)$ are $F$-spaces, then $\tau_1 = \tau_2$.

This is obtained by applying corollary 3 to the identity mapping $\iota:(X,\tau_2) \to (X,\tau_1)$.

(Corollary 6) If $\lVert \cdot \rVert_1$ and $\lVert \cdot \rVert_2$ are two norms in a vector space $X$ such that

- $\lVert\cdot\rVert_1 \leq K\lVert\cdot\rVert_2$ for some constant $K>0$, and
- $(X,\lVert\cdot\rVert_1)$ and $(X,\lVert\cdot\rVert_2)$ are Banach spaces,

then $\lVert\cdot\rVert_1$ and $\lVert\cdot\rVert_2$ are equivalent.

This is merely a more restrictive version of corollary 5.

We are going to show the completeness of $X/N$, where $X$ is an $F$-space and $N$ a closed subspace. Along the way, a bunch of useful analysis tricks will be demonstrated (and that's why you may find this blog post a little tedious). More importantly, the theorem proved here will be used in the future.

To make it clear, we should give a formal definition of $F$-space.

A topological vector space $X$ is an $F$-space if its topology $\tau$ is induced by a complete invariant metric $d$.

A metric $d$ on a vector space $X$ will be called invariant if for all $x,y,z \in X$, we have

$$d(x+z,y+z)=d(x,y).$$
By complete we mean every Cauchy sequence of $(X,d)$ converges.

The metric can be inherited by the quotient space naturally (we will use this fact later), that is:

If $X$ is an $F$-space and $N$ is a closed subspace of $X$, then $X/N$ is still an $F$-space.

Suppose $d$ is a complete invariant metric compatible with $\tau_X$. The metric on $X/N$ is defined by

$$\rho(\pi(x),\pi(y))=\inf_{z \in N}d(x-y,z).$$
*Proof.* First, if $\pi(x)=\pi(y)$, that is, $x-y \in N$, we see

$$\rho(\pi(x),\pi(y))=\inf_{z \in N}d(x-y,z)=0$$

by taking $z=x-y$.
If $\pi(x) \neq \pi(y)$ however, we shall show that $\rho(\pi(x),\pi(y))>0$. In this case, we have $x-y \notin N$. Since $N$ is closed, $N^c$ is open, and $x-y$ is an interior point of $X-N$. Therefore there exists an open ball $B_r(x-y)$ centered at $x-y$ with radius $r>0$ such that $B_r(x-y) \cap N = \varnothing$. Notice we have $d(x-y,z) \geq r$ for all $z \in N$, since otherwise $z \in B_r(x-y)$. By putting

$$r_0=\inf_{z \in N}d(x-y,z),$$

we see $d(x-y,z) \geq r_0$ for all $z \in N$ and indeed $r_0 \geq r>0$. In general, $\inf_{z \in N} d(x-y,z)=0$ if and only if $x-y \in \overline{N}$.

Next, we shall show that $\rho(\pi(x),\pi(y))=\rho(\pi(y),\pi(x))$, and it suffices to assume that $\pi(x) \neq \pi(y)$. Since $d$ is translation invariant, we get

$$d(x-y,z)=d(x-y-z,0)=d(0,-(x-y-z))=d(y-x+z,0)=d(y-x,-z).$$

As $z$ runs over $N$, so does $-z$; therefore the $\inf$ of the left-hand side is equal to that of the right-hand side. The identity is proved.

Finally, we need to verify the triangle inequality. Let $r,s,t \in X$. For any $\varepsilon>0$, there exist some $z_\varepsilon, z_\varepsilon' \in N$ such that

$$d(r-s,z_\varepsilon)<\rho(\pi(r),\pi(s))+\varepsilon, \qquad d(s-t,z_\varepsilon')<\rho(\pi(s),\pi(t))+\varepsilon.$$
Since $d$ is invariant, we see

$$d(r-t,z_\varepsilon+z_\varepsilon') \leq d(r-s,z_\varepsilon)+d(s-t,z_\varepsilon').$$
*(I owe @LeechLattice for the inequality above.)*

Therefore

$$d(r-t,z_\varepsilon+z_\varepsilon')<\rho(\pi(r),\pi(s))+\rho(\pi(s),\pi(t))+2\varepsilon.$$
*(Warning: This does not imply that $\rho(\pi(r),\pi(s))+\rho(\pi(s),\pi(t))=\inf_z d(r-t,z)$ since we don’t know whether it is the lower bound or not.)*

If $\rho(\pi(r),\pi(s))+\rho(\pi(s),\pi(t))<\rho(\pi(r),\pi(t))$ however, let

$$2\varepsilon=\rho(\pi(r),\pi(t))-\rho(\pi(r),\pi(s))-\rho(\pi(s),\pi(t))>0;$$

then there exists some $z''_\varepsilon=z_\varepsilon+z'_\varepsilon$ such that

$$d(r-t,z''_\varepsilon)<\rho(\pi(r),\pi(s))+\rho(\pi(s),\pi(t))+2\varepsilon=\rho(\pi(r),\pi(t)),$$
which is a contradiction since $\rho(\pi(r),\pi(t)) \leq d(r-t,z)$ for all $z \in N$.

*(We are using the $\varepsilon$ definition of $\inf$.)*

Since $\pi$ is surjective, we see that if $u \in X/N$, there exists some $a \in X$ such that $\pi(a)=u$. Therefore

If $\pi(x)=\pi(x')$ and $\pi(y)=\pi(y')$, we have to show that $\rho(\pi(x),\pi(y))=\rho(\pi(x'),\pi(y'))$. In fact,

$$\rho(\pi(x),\pi(y)) \leq \rho(\pi(x),\pi(x'))+\rho(\pi(x'),\pi(y'))+\rho(\pi(y'),\pi(y))=\rho(\pi(x'),\pi(y'))$$

since $\rho(\pi(x),\pi(x'))=0$ as $\pi(x)=\pi(x')$, and likewise $\rho(\pi(y'),\pi(y))=0$. Meanwhile

$$\rho(\pi(x'),\pi(y')) \leq \rho(\pi(x'),\pi(x))+\rho(\pi(x),\pi(y))+\rho(\pi(y),\pi(y'))=\rho(\pi(x),\pi(y)),$$

therefore $\rho(\pi(x),\pi(y))=\rho(\pi(x'),\pi(y'))$.

To prove this, we need to show that a set $E \subset X/N$ is open with respect to $\tau_N$ if and only if $E$ is a union of open balls. But we need to show a generalized version first:

If $\mathscr{B}$ is a local base for $\tau$, then the collection $\mathscr{B}_N$, which contains all sets $\pi(V)$ where $V \in \mathscr{B}$, forms a local base for $\tau_N$.

*Proof.* We already know that $\pi$ is continuous, linear and open. Therefore $\pi(V)$ is open for all $V \in \mathscr{B}$. For any open set $E \subset X/N$ containing $\pi(0)$, we see $\pi^{-1}(E)$ is open, and we have

$$V \subset \pi^{-1}(E)$$

for some $V \in \mathscr{B}$, and therefore

$$\pi(V) \subset E.$$
Now consider the local base $\mathscr{B}$ containing all open balls around $0 \in X$. Since

$$\pi\left(\{x \in X: d(x,0)<r\}\right)=\{u \in X/N: \rho(u,\pi(0))<r\},$$
we see $\rho$ determines $\mathscr{B}_N$. But we have already proved that $\rho$ is invariant; hence $\mathscr{B}_N$ determines $\tau_N$.

Once this is proved, we are able to claim that if $X$ is an $F$-space, then $X/N$ is still an $F$-space, since its topology is induced by a complete invariant metric $\rho$.

*Proof.* Suppose $(x_n)$ is a Cauchy sequence in $X/N$, relative to $\rho$. There is a subsequence $(x_{n_k})$ with $\rho(x_{n_k},x_{n_{k+1}})<2^{-k}$. Since $\pi$ is surjective, we are able to pick some $z_k \in X$ such that $\pi(z_k) = x_{n_k}$ and such that

$$d(z_k,z_{k+1})<2^{-k}.$$
(The existence can be verified by contradiction still.) By the inequality above, we see $(z_k)$ is Cauchy (can you see why?). Since $X$ is complete, $z_k \to z$ for some $z \in X$. By the **continuity** of $\pi$, we also see $x_{n_k} \to \pi(z)$ as $k \to \infty$. Therefore $(x_{n_k})$ converges. Hence $(x_n)$ converges since it has a convergent subsequence. $\rho$ is complete.

This fact will be used to prove some corollaries of the open mapping theorem. For instance, for any continuous linear map $\Lambda:X \to Y$, we see $\ker(\Lambda)$ is closed; therefore if $X$ is an $F$-space, then $X/\ker(\Lambda)$ is an $F$-space as well. We will show in the future that $X/\ker(\Lambda)$ and $\Lambda(X)$ are homeomorphic if $\Lambda(X)$ is of the second category.

There are more properties that can be inherited by $X/N$ from $X$, for example normability, metrizability, and local convexity. In particular, if $X$ is Banach, then $X/N$ is Banach as well. To see this, it suffices to define the quotient norm by

$$\lVert \pi(x) \rVert = \inf_{z \in N}\lVert x-z \rVert.$$
Before going to it, we are going to give several motivations for defining the Riemann-Stieltjes integral, which can be considered as a generalization of the Riemann integral, the one everyone learns in their Calculus class.

When talking about $\int_a^b fdg$, one may simply think about $\int_a^b fg'dx$. But is it even necessary that $g$ is differentiable? What would happen if $g$ is merely continuous, or even not continuous? Further, given that $g$ is differentiable, can we prove that

$$\int_a^b fdg=\int_a^b fg'dx$$

in a general way (without assuming $f$ is differentiable)?

Another motivation comes from probability theory. Oftentimes one needs to consider the discrete case ($\sum$) and the continuous case ($\int$) separately. One may say that an integral is the limit of summations, but it would be weird to write $\int$ as $\lim\sum$ every time. However, if we have a way to write a sum, for example the expected value of a discrete variable, as an integral, things would be easier. Of course, we don't want to write such a sum as another sum by adding up the integrals on several disjoint segments. That would be weirder.

If you have learned measure theory, you will know that the Lebesgue integral does not perfectly cover the Riemann integral. For example, $\int_{0}^{\infty}\frac{\sin{x}}{x}dx$ exists as an improper Riemann integral but not as a Lebesgue integral. So we cannot treat the Lebesgue integral as a generalization of the Riemann integral. In this blog post however, we are showing a direct generalization of the Riemann integral.

We are trying our best to avoid using $\sup$, $\inf$, and differentiation theory. But the $\varepsilon-\delta$ language is heavily used here, so make sure that you are good at it.

By a partition $P$ on $[a,b]$ we mean a finite sequence of numbers $(x_k)$ such that

$$a=x_0 \leq x_1 \leq \cdots \leq x_n=b,$$

and we associate with it its size by

$$\sigma(P)=\max_{0 \leq k \leq n-1}(x_{k+1}-x_k).$$
Let $f$, $g$ be bounded real functions on $[a,b]$ (again, no continuity or differentiability required). Given a partition $P$ and numbers $c_k$ with $x_k \leq c_k \leq x_{k+1}$, we define the Riemann-Stieltjes sum (RS-sum) by

$$S(P,f,g)=\sum_{k=0}^{n-1}f(c_k)\left(g(x_{k+1})-g(x_k)\right).$$
We say that the **limit**

$$\lim_{\sigma(P) \to 0}S(P,f,g)$$

exists if there exists some $L \in \mathbb{R}$ such that, given $\varepsilon>0$, there exists $\delta>0$ such that whenever $\sigma(P)<\delta$, we have

$$|S(P,f,g)-L|<\varepsilon.$$

In this case, we say $f$ is RS(g)-integrable, and the limit is denoted by

$$\int_a^b fdg=L.$$
This is the so-called **Riemann-Stieltjes** integral. When $g(x)=x$, we get the **Riemann integral** naturally.
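The definition is easy to experiment with numerically. A minimal sketch (the helper `rs_sum`, the uniform partition, and the left-endpoint choice of the $c_k$ are all illustrative assumptions):

```python
# Riemann-Stieltjes sum S(P, f, g) on a uniform partition of [a, b],
# taking the intermediate points c_k as left endpoints.
def rs_sum(f, g, a, b, n):
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    return sum(f(xs[k]) * (g(xs[k + 1]) - g(xs[k])) for k in range(n))

# With g(x) = x this is an ordinary Riemann sum: int_0^1 x dx = 1/2.
approx = rs_sum(lambda x: x, lambda x: x, 0.0, 1.0, 10_000)
print(approx)  # close to 0.5
```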

This integral method can be generalized to Banach spaces. Let $f$, $g$ be bounded maps of $[a,b]$ into Banach spaces $E$, $F$ respectively. Assume we have a product $E \times F \to G$ denoted by $(u,v) \mapsto uv$ with $\lVert uv \rVert \leq \lVert u \rVert \lVert v \rVert$. Then by replacing the absolute value with the norm, we still get the Riemann-Stieltjes integral, although in this case we have

$$\int_a^b fdg \in G,$$

and $G$ is not necessarily $\mathbb{R}$. This is different from the Bochner integral, since no measure theory is involved here.

First, we shall show that RS(g)-integrable functions form a vector space. To do this, it suffices to show that the maps

$$f \mapsto \int_a^b fdg$$

and

$$g \mapsto \int_a^b fdg$$

are linear. This follows directly from the definition of the RS-sum. Let's see the result.

Suppose we have

$$\int_a^b fdg=I, \qquad \int_a^b hdg=J, \qquad \int_a^b fdu=K.$$

Then we have the following identities for any scalar $\alpha$.

- $\int_a^b \alpha fdg=\alpha I$.
- $\int_a^b (f+h)dg=I+J$.
- $\int_a^bfd(g+u)=I+K$.
- $\int_a^b fd(\alpha g)=\alpha I$.

*Proof.* We shall show 2 for example. The other three identities follow in the same way.

Notice that the existence of the limit of the RS-sum depends only on the size of $P$. For $\varepsilon>0$, there exist some $\delta_1,\delta_2>0$ such that

$$|S(P,f,g)-I|<\frac{\varepsilon}{2}, \qquad |S(P,h,g)-J|<\frac{\varepsilon}{2}$$

when $\sigma(P)<\delta_1$ and $\sigma(P)<\delta_2$ respectively. By picking $\delta=\min(\delta_1,\delta_2)$, we see for $\sigma(P)<\delta$, we have

$$|S(P,f+h,g)-(I+J)| \leq |S(P,f,g)-I|+|S(P,h,g)-J|<\varepsilon.$$
$f \in RS(g)$ if and only if $g \in RS(f)$. In this case, we also have integration by parts:

You may not believe it, but differentiation does not play any role here, as promised at the beginning.

*Proof.* Using the summation by parts (by Abel), we have

By writing

we have

where

Consider the partition $Q$ by

we take $x_0,x_1,\cdots,x_{n-1},x_n$ as intermediate points, and

Since $0 < \sigma(Q) \leq 2\sigma(P) \leq 4\sigma(Q)$, when $\sigma(P) \to 0$ we also have $\sigma(Q) \to 0$, and vice versa. Suppose now $\int_a^b gdf$ exists; we have

And integration by parts follows.

Suppose $\int_a^bfdg$ exists, then

The proposition is proved. $\square$
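The integration-by-parts identity can be checked numerically; the sketch below uses the illustrative choices $f(x)=x^2$ and $g(x)=\sin x$ on $[0,1]$:

```python
import math

# Check int f dg + int g df = f(b)g(b) - f(a)g(a) numerically,
# approximating both RS integrals by sums over a fine uniform partition.
def rs_sum(f, g, a, b, n):
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    return sum(f(xs[k]) * (g(xs[k + 1]) - g(xs[k])) for k in range(n))

f = lambda x: x * x
g = math.sin
a, b, n = 0.0, 1.0, 20_000
lhs = rs_sum(f, g, a, b, n) + rs_sum(g, f, a, b, n)
rhs = f(b) * g(b) - f(a) * g(a)
print(abs(lhs - rhs))  # a small discretization error
```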

As said before, we want to represent both continuous and discrete case using integral. For measure theory, we have Lebesgue measure and counting measure. But in some cases, this can be done using Riemann-Stieltjes integral as well. Ordinary Riemann integral and finite or infinite series are both special cases of Riemann-Stieltjes integral.

To do this, we need the unit step function defined by

$$I(x)=\begin{cases}0, & x \leq 0, \\ 1, & x>0.\end{cases}$$
If $a<s<b$, $f$ is bounded on $[a,b]$ and continuous at $s$, then by putting $g(x)=I(x-s)$, we have

$$\int_a^b fdg=f(s).$$
*Proof.* A simple verification shows that $\int_a^b fdg=\int_s^b fdg$ (by unwinding the RS-sum, one sees immediately that $g(x_k)=0$ for all $x_k\leq s$, so the part of the partition before $s$ contributes nothing to the value of the integral). Now consider the partition $P$ given by

$$s=x_0<x_1<x_2=b, \qquad c_0 \in [x_0,x_1].$$

We see

$$S(P,f,g)=f(c_0)\left(g(x_1)-g(x_0)\right)=f(c_0).$$
As $x_1 \to s$, we have $c_0 \to s$; since $f$ is continuous at $s$, we have $f(c_0) \to f(s)$ as desired. $\square$
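A numerical sketch of this proposition (the integrand $f(x)=x^2+1$, the jump point $s=\frac12$, and the partition size are illustrative choices):

```python
# RS integral of f against the unit step g(x) = I(x - s) picks out f(s).
def rs_sum(f, g, a, b, n):
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    return sum(f(xs[k]) * (g(xs[k + 1]) - g(xs[k])) for k in range(n))

s = 0.5
step = lambda x: 1.0 if x - s > 0 else 0.0  # the unit step I(x - s)
f = lambda x: x * x + 1.0
approx = rs_sum(f, step, 0.0, 1.0, 1001)  # n = 1001 keeps s off the grid
print(approx)  # close to f(0.5) = 1.25
```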

By the linearity of the RS integral, it's easy to generalize this to a finite linear combination. Namely, for $g(x)=\sum_{k=1}^{n}c_kI(x-s_k)$, we have

$$\int_a^b fdg=\sum_{k=1}^{n}c_kf(s_k).$$
But now we are discussing the infinite case.

Suppose $c_n \geq 0$ for all $n$ and $\sum_n c_n$ converges, $(s_n)$ is a sequence of distinct points in $(a,b)$, and

$$g(x)=\sum_{n=1}^{\infty}c_nI(x-s_n).$$

Let $f$ be continuous on $[a,b]$. Then

$$\int_a^b fdg=\sum_{n=1}^{\infty}c_nf(s_n).$$
*Proof.* First it's easy to see that $g(x)$ converges for every $x$, and is monotonic with $g(a)=0$, $g(b)=\sum_n c_n$. For given $\varepsilon>0$, there exists some $N$ such that

$$\sum_{n=N+1}^{\infty}c_n<\varepsilon.$$

Put

$$g_1(x)=\sum_{n=1}^{N}c_nI(x-s_n), \qquad g_2(x)=\sum_{n=N+1}^{\infty}c_nI(x-s_n);$$

we have

$$\int_a^b fdg_1=\sum_{n=1}^{N}c_nf(s_n), \qquad \int_a^b fdg=\int_a^b fdg_1+\int_a^b fdg_2.$$
By putting $M=\sup|f(x)|$, we see

$$\left|\int_a^b fdg-\sum_{n=1}^{N}c_nf(s_n)\right|=\left|\int_a^b fdg_2\right| \leq M\left(g_2(b)-g_2(a)\right)<M\varepsilon.$$
The inequality holds since $g_2(b)-g_2(a)<\varepsilon$. Since $M$ is finite, when $N \to \infty$, we have the desired result.

Finally we are discussing some differentiation. The following theorem shows the connection between RS integral and Riemann integral.

Let $f$ be continuous and suppose that $g$ is differentiable on $[a,b]$ with $g'$ Riemann integrable as well; then $f \in RS(g)$ and

$$\int_a^b fdg=\int_a^b f(x)g'(x)dx.$$
*Proof.* By the mean value theorem, for each $k$ there exists some $t_k \in (x_k,x_{k+1})$ such that

$$g(x_{k+1})-g(x_k)=g'(t_k)(x_{k+1}-x_k).$$
The RS-sum can then be written as

$$S(P,f,g)=\sum_{k=0}^{n-1}f(c_k)g'(t_k)(x_{k+1}-x_k).$$
Since $g’$ is Riemann integrable, we have

given that $|S(P,g’,x)-\int_a^b g’dx|<\varepsilon$. Therefore

where $M=\sup|f(x)|<\infty$ ($f$ is assumed to be bounded). Also notice that $fg'$ is Riemann integrable since $f$ is continuous. Therefore

Therefore,

which proves the theorem. $\square$
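The theorem can be checked numerically as well; the sketch below uses the illustrative choices $f(x)=x$ and $g(x)=x^2$ on $[0,1]$, where $\int_0^1 x\,d(x^2)=\int_0^1 2x^2dx=\frac{2}{3}$:

```python
# When g is differentiable, int f dg should equal int f g' dx.
# Here f(x) = x and g(x) = x^2, so both sides are 2/3.
def rs_sum(f, g, a, b, n):
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    return sum(f(xs[k]) * (g(xs[k + 1]) - g(xs[k])) for k in range(n))

approx = rs_sum(lambda x: x, lambda x: x * x, 0.0, 1.0, 10_000)
print(approx)  # close to 2/3
```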

To sum up, given $\varepsilon>0$, there exists some $\delta>0$ such that if $\sigma(P)<\delta$, we have

and

After some estimation, we get

We are restricting ourselves to $\mathbb{R}$ endowed with the usual topology. Recall that a function is continuous if and only if for any open set $U \subset \mathbb{R}$, the preimage

$$f^{-1}(U)=\{x \in \mathbb{R}: f(x) \in U\}$$

is open. One can rewrite this statement using $\varepsilon-\delta$ language. To say a function $f: \mathbb{R} \to \mathbb{R}$ is continuous at $x$, we mean for any $\varepsilon>0$, there exists some $\delta>0$ such that for $t \in (x-\delta,x+\delta)$, we have

$$|f(t)-f(x)|<\varepsilon.$$
$f$ is continuous on $\mathbb{R}$ if and only if $f$ is continuous at every point of $\mathbb{R}$.

If $(x-\delta,x+\delta)$ is replaced with $(x-\delta,x)$ or $(x,x+\delta)$, we get left continuous and right continuous, one of which plays an important role in probability theory.

But the problem is, sometimes continuity is too strong for being a restriction, but the ‘direction’ associated with left/right continuous functions are unnecessary as well. For example the function

is neither left nor right continuous (globally), but it is a thing. Left/right continuity is not a perfectly weakened version of continuity. We need something different.

Let $f$ be a real (or extended-real) function on $\mathbb{R}$. The semicontinuity of $f$ is defined as follows.

If

$$\{x: f(x)>\alpha\}$$

is open for all real $\alpha$, we say $f$ is **lower semicontinuous**. If

$$\{x: f(x)<\alpha\}$$

is open for all real $\alpha$, we say $f$ is **upper semicontinuous**.

Is it possible to rewrite these definitions à la $\varepsilon-\delta$? The answer is yes if we restrict ourselves to metric spaces.

$f: \mathbb{R} \to \mathbb{R}$ is upper semicontinuous at $x$ if for every $\varepsilon>0$, there exists some $\delta>0$ such that for $t \in (x-\delta,x+\delta)$, we have

$$f(t)<f(x)+\varepsilon.$$

$f: \mathbb{R} \to \mathbb{R}$ is lower semicontinuous at $x$ if for every $\varepsilon>0$, there exists some $\delta>0$ such that for $t \in (x-\delta,x+\delta)$, we have

$$f(t)>f(x)-\varepsilon.$$
Of course, $f$ is upper/lower semicontinuous on $\mathbb{R}$ if and only if it is so on every point of $\mathbb{R}$. One shall find no difference between the definitions in different styles.
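These $\varepsilon-\delta$ definitions can be probed numerically. The sketch below is only a heuristic (sampling finitely many points proves nothing, and the function name and sampling parameters are ad hoc choices), but it separates $\chi_{[0,1]}$ from $\chi_{(0,1)}$ at the boundary point $0$:

```python
# Heuristic probe of upper semicontinuity at x: for a fixed eps, try several
# deltas and sample t in (x - delta, x + delta), testing f(t) < f(x) + eps.
def looks_upper_semicontinuous_at(f, x, eps=0.5, deltas=(1e-1, 1e-3, 1e-6)):
    for d in deltas:
        ts = [x + d * s / 100.0 for s in range(-99, 100)]
        if all(f(t) < f(x) + eps for t in ts):
            return True  # some delta worked for this eps
    return False

chi_closed = lambda t: 1.0 if 0.0 <= t <= 1.0 else 0.0  # chi_F, F = [0, 1]
chi_open = lambda t: 1.0 if 0.0 < t < 1.0 else 0.0      # chi_E, E = (0, 1)
print(looks_upper_semicontinuous_at(chi_closed, 0.0))  # True
print(looks_upper_semicontinuous_at(chi_open, 0.0))    # False: f jumps up near 0
```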

Here is another way to see it. For the continuity of $f$, we are looking for *arbitrary* open subsets $V$ of $\mathbb{R}$, and $f^{-1}(V)$ is expected to be open. For the lower/upper semicontinuity of $f$, however, the open sets are restricted to be like $(\alpha,+\infty]$ and $[-\infty,\alpha)$. Since all open sets of $\mathbb{R}$ can be generated by the union or intersection of sets like $[-\infty,\alpha)$ and $(\beta,+\infty]$, we immediately get

$f$ is continuous if and only if $f$ is both upper semicontinuous and lower semicontinuous.

*Proof.* If $f$ is continuous, then for any $\alpha \in \mathbb{R}$, we see $[-\infty,\alpha)$ is open, and therefore

$$f^{-1}([-\infty,\alpha))=\{x: f(x)<\alpha\}$$
has to be open. The upper semicontinuity is proved. The lower semicontinuity of $f$ is proved in the same manner.

If $f$ is both upper and lower semicontinuous, we see

$$f^{-1}((\alpha,\beta))=f^{-1}([-\infty,\beta)) \cap f^{-1}((\alpha,+\infty])$$
is open. Since every open subset of $\mathbb{R}$ can be written as a countable union of segments of the above types, we see for any open subset $V$ of $\mathbb{R}$, $f^{-1}(V)$ is open. (If you have trouble with this part, it is recommended to review the definition of topology.) $\square$

There are two important examples.

- If $E \subset \mathbb{R}$ is open, then $\chi_E$ is lower semicontinuous.
- If $F \subset \mathbb{R}$ is closed, then $\chi_F$ is upper semicontinuous.

We will prove the first one. The second one follows in the same manner of course. For $\alpha<0$, the set $A=\chi_E^{-1}((\alpha,+\infty])$ is equal to $\mathbb{R}$, which is open. For $\alpha \geq 1$, since $\chi_E \leq 1$, we see $A=\varnothing$. For $0 \leq \alpha < 1$ however, the set of $x$ where $\chi_E>\alpha$ has to be $E$, which is still open.

When checking the semicontinuity of a function, we work from the bottom to the top or from the top to the bottom. The function $\chi_E$ is defined by

$$\chi_E(x)=\begin{cases}1, & x \in E, \\ 0, & x \notin E.\end{cases}$$
If $f_1$ and $f_2$ are upper/lower semicontinuous, then so is $f_1+f_2$.

*Proof.* We are going to prove this using different tools. Suppose now both $f_1$ and $f_2$ are upper semicontinuous, and fix $x$. For $\varepsilon>0$, there exist some $\delta_1>0$ and $\delta_2>0$ such that

$$f_1(t)<f_1(x)+\frac{\varepsilon}{2} \text{ for } t \in (x-\delta_1,x+\delta_1), \qquad f_2(t)<f_2(x)+\frac{\varepsilon}{2} \text{ for } t \in (x-\delta_2,x+\delta_2).$$

If we pick $\delta=\min(\delta_1,\delta_2)$, then we see for all $t \in (x-\delta,x+\delta)$, we have

$$f_1(t)+f_2(t)<f_1(x)+f_2(x)+\varepsilon.$$
The upper semicontinuity of $f_1+f_2$ is proved by considering all $x \in \mathbb{R}$.

Now suppose both $f_1$ and $f_2$ are lower semicontinuous. We have an identity:

$$\{x: f_1(x)+f_2(x)>\alpha\}=\bigcup_{\beta \in \mathbb{Q}}\left(\{x: f_1(x)>\beta\} \cap \{x: f_2(x)>\alpha-\beta\}\right).$$
The set on the right side is always open. Hence $f_1+f_2$ is lower semicontinuous. $\square$

However, when there are infinitely many semicontinuous functions, things are different.

Let $\{f_n\}$ be a sequence of nonnegative functions on $\mathbb{R}$, then

- If each $f_n$ is lower semicontinuous, then so is $\sum_{1}^{\infty}f_n$.
- If each $f_n$ is upper semicontinuous, then $\sum_{1}^{\infty}f_n$ is not necessarily upper semicontinuous.

*Proof.* To prove this we are still using the properties of open sets. Put $g_n=\sum_{k=1}^{n}f_k$. Now suppose all $f_k$ are lower semicontinuous. Since $g_n$ is a finite sum of lower semicontinuous functions, we see each $g_n$ is lower semicontinuous. Let $f=\sum_{n}f_n$. As the $f_k$ are nonnegative, we see $f(x)>\alpha$ if and only if there exists some $n_0$ such that $g_{n_0}(x)>\alpha$. Therefore

$$\{x: f(x)>\alpha\}=\bigcup_{n=1}^{\infty}\{x: g_n(x)>\alpha\}.$$
The set on the right hand is open already.

For the upper semicontinuity, it suffices to give a counterexample, but before that, we shall give the motivation.

As said, the characteristic function of a closed set is upper semicontinuous. Suppose $\{E_n\}$ is a sequence of almost disjoint closed sets; then $E=\cup_{n\geq 1}E_n$ is not necessarily closed, therefore $\chi_E=\sum_n\chi_{E_n}$ (a.e.) is not necessarily upper semicontinuous. Now we give a concrete example. Put $f_0=\chi_{[1,+\infty)}$ and $f_n=\chi_{E_n}$ for $n \geq 1$, where

$$E_n=\left[\frac{1}{n+1},\frac{1}{n}\right].$$

For $x > 0$, we have $f=\sum_nf_n \geq 1$. Meanwhile, $f^{-1}([-\infty,1))=(-\infty,0]$, which is not open. $\square$

Notice that $f$ can be defined on any topological space here.

There is one fact we already know about continuous functions.

If $X$ is compact and $f: X \to \mathbb{R}$ is continuous, then there exist some $a,b \in X$ such that $f(a)=\min f(X)$ and $f(b)=\max f(X)$.

In fact, $f(X)$ is compact still. But for semicontinuous functions, things will be different but reasonable. For upper semicontinuous functions, we have the following fact.

If $X$ is compact and $f: X \to (-\infty,+\infty)$ is upper semicontinuous, then there exists some $a \in X$ such that $f(a)=\max f(X)$.

Notice that $X$ is not assumed to have any other topological property. It can be Hausdorff or Lindelöf, but we are not asking for restrictions like this. The only property we will be using is that every open cover of $X$ has a finite subcover. Of course, one can replace $X$ with any compact subset of $\mathbb{R}$, for example $[a,b]$.

*Proof.* Put $\alpha=\sup f(X)$, and define

$$E_n=\left\{x \in X: f(x)<\alpha-\frac{1}{n}\right\}.$$

Each $E_n$ is open since $f$ is upper semicontinuous.
If $f$ attains no maximum, then for any $x \in X$, there exists some $n \geq 1$ such that $f(x)<\alpha-\frac{1}{n}$. That is, $x \in E_n$ for some $n$. Therefore $\bigcup_{n \geq 1}E_n$ covers $X$. But this cover has no finite subcover of $X$. A contradiction since $X$ is compact. $\square$

This is a comprehensive application of several properties of semicontinuity.

(Vitali–Carathéodory theorem) Suppose $f \in L^1(\mathbb{R})$, where $f$ is a real-valued function. For $\varepsilon>0$, there exist functions $u$ and $v$ on $\mathbb{R}$ such that $u \leq f \leq v$, $u$ is upper semicontinuous and bounded above, $v$ is lower semicontinuous and bounded below, and

$$\int_{\mathbb{R}}(v-u)dm<\varepsilon.$$
It suffices to prove this theorem for $f \geq 0$ (of course $f$ is not identically equal to $0$, since that case is trivial). Since $f$ is the pointwise limit of an increasing sequence of simple functions $s_n$, we are able to write $f$ as

$$f=\lim_{n \to \infty}s_n.$$

By putting $t_1=s_1$, $t_n=s_n-s_{n-1}$ for $n \geq 2$, we get $f=\sum_n t_n$. Since each $t_n$ is a nonnegative simple function, we are able to write $f$ as

$$f=\sum_{k=1}^{\infty}c_k\chi_{E_k},$$
where $E_k$ is measurable for all $k$. Also we have

$$\int_{\mathbb{R}}fdm=\sum_{k=1}^{\infty}c_km(E_k),$$

and the series on the right hand converges (since $f \in L^1$). By the properties of Lebesgue measure, there exist a compact set $F_k$ and an open set $V_k$ such that $F_k \subset E_k \subset V_k$ and $c_km(V_k-F_k)<\frac{\varepsilon}{2^{k+1}}$. Put

$$u=\sum_{k=1}^{N}c_k\chi_{F_k}, \qquad v=\sum_{k=1}^{\infty}c_k\chi_{V_k}$$
(now you can see $v$ is lower semicontinuous and $u$ is upper semicontinuous). The $N$ is chosen in such a way that

$$\sum_{k=N+1}^{\infty}c_km(E_k)<\frac{\varepsilon}{2}.$$
Since $V_k \supset E_k$, we have $\chi_{V_k} \geq \chi_{E_k}$. Therefore $v \geq f$. Similarly, $f \geq u$. Now we need to check the desired integral inequality. A simple recombination shows that

$$v-u=\sum_{k=1}^{N}c_k(\chi_{V_k}-\chi_{F_k})+\sum_{k=N+1}^{\infty}c_k\chi_{V_k} \leq \sum_{k=1}^{\infty}c_k(\chi_{V_k}-\chi_{F_k})+\sum_{k=N+1}^{\infty}c_k\chi_{E_k}.$$

If we integrate the function above, we get

$$\int_{\mathbb{R}}(v-u)dm \leq \sum_{k=1}^{\infty}c_km(V_k-F_k)+\sum_{k=N+1}^{\infty}c_km(E_k)<\frac{\varepsilon}{2}+\frac{\varepsilon}{2}=\varepsilon.$$
This proves the case when $f \geq 0$. In the general case, we write $f=f^{+}-f^{-}$. Attach the semicontinuous functions to $f^{+}$ and $f^{-}$ respectively by $u_1 \leq f^{+} \leq v_1$ and $u_2 \leq f^{-} \leq v_2$. Put $u=u_1-v_2$, $v=v_1-u_2$. As we can see, $u$ is upper semicontinuous and $v$ is lower semicontinuous. Also, $u \leq f \leq v$ with the desired property since

$$v-u=(v_1-u_1)+(v_2-u_2),$$
and the theorem follows. $\square$

Indeed, the only special property of the measure used is the existence of $F_k$ and $V_k$. The domain $\mathbb{R}$ here can be replaced by $\mathbb{R}^k$ for $1 \leq k < \infty$, and $m$ by the respective $m_k$. Much more generally, the domain can be replaced by any locally compact Hausdorff space $X$, and the measure by any measure associated with the Riesz-Markov-Kakutani representation theorem on $C_c(X)$.

The answer is no. Consider the fat Cantor set $K$, which has Lebesgue measure $\frac{1}{2}$. We shall show that $\chi_K$ cannot be approximated from below by a lower semicontinuous function.
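The measure $\frac12$ comes from the standard Smith–Volterra–Cantor construction: at step $n$ one removes $2^{n-1}$ open middle intervals of length $4^{-n}$ each from $[0,1]$. A quick sketch of the computation (truncating the series at an illustrative depth):

```python
from fractions import Fraction

# Total length removed in the fat Cantor construction:
# sum_{n>=1} 2^(n-1) / 4^n = 1/2, so the remaining set has measure 1/2.
removed = sum(Fraction(2) ** (n - 1) / Fraction(4) ** n for n in range(1, 60))
measure = 1 - removed
print(float(measure))  # essentially 1/2
```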

If $v$ is a lower semicontinuous function such that $v \leq \chi_K$, then $v \leq 0$.

*Proof.* Consider the set $V=v^{-1}((0,1])=v^{-1}((0,+\infty))$. Since $v \leq \chi_K$, we have $V \subset K$. We will show that $V$ has to be empty.

Pick $t \in V$. Since $V$ is open, there exists some open neighborhood $U$ of $t$ such that $U \subset V$. But then $U$ is a nonempty open subset of $K$, which is impossible since $K$ has empty interior. Therefore $V = \varnothing$. That is, $v \leq 0$ everywhere. $\square$

Suppose $u$ is any upper semicontinuous function such that $u \geq f=\chi_K$, and let $v$ be any lower semicontinuous function with $v \leq f$, so that $v \leq 0$ by the above. For $\varepsilon=\frac{1}{2}$, we have

$$\int_{\mathbb{R}}(u-v)dm \geq \int_{\mathbb{R}}u\,dm \geq m(K)=\frac{1}{2}=\varepsilon.$$
This example shows that there exist integrable functions that cannot be approximated in the reversed sense of the Vitali–Carathéodory theorem.

Fix $p$ with $1 \leq p \leq \infty$. It's easy to see that $L^p(\mu)$ is a topological vector space. But it is not a metric space if we define

$$d(f,g)=\lVert f-g \rVert_p.$$

The reason is, if $d(f,g)=0$, we can only conclude that $f=g$ almost everywhere, not that they are strictly equal. With that being said, this function $d$ is actually a pseudometric, which is unnatural. However, the relation $\sim$ defined by $f \sim g \Leftrightarrow d(f,g)=0$ is an equivalence relation. This inspires us to take the quotient set into consideration.

For a vector space $V$, every subspace of $V$ is a subgroup; since a vector space is an abelian group under addition, every subspace is automatically a normal subgroup. There is no reason to prevent ourselves from considering the quotient group and looking for some interesting properties.

Let $N$ be a subspace of a vector space $X$. For every $x \in X$, let $\pi(x)$ be the coset of $N$ that contains $x$, that is,

$$\pi(x)=x+N.$$
Trivially, $\pi(x)=\pi(y)$ if and only if $x-y \in N$ (that is, $\pi$ is well-defined since $N$ is a vector space). This is a linear function since we also have the addition and scalar multiplication given by

$$\pi(x)+\pi(y)=\pi(x+y), \qquad \alpha\pi(x)=\pi(\alpha x).$$
These cosets are the elements of a vector space $X/N$, which reads, the quotient space of $X$ modulo $N$. The map $\pi$ is called the canonical map as we all know.

First we shall treat $\mathbb{R}^2$ as a vector space, with the $x$-axis as a subspace (we will write it as $X$). For a vector $v=(2,3)$, which is represented by $AB$, we see the coset $v+X$ has something special. Pick any $u \in X$, for example $AE$, $AC$, or $AG$. We see $v+u$ has the same $y$ value. The reason is simple: since we have $v+u=(2+x,3)$, the $y$ value remains fixed however $u$ may vary.

With that being said, the set $v+X$, which is not a vector space, can be represented by $\overrightarrow{AD}$. This procedure can be generalized to $\mathbb{R}^n$ with $\mathbb{R}^m$ as a subspace with ease.
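The observation above can be sketched in a few lines; representing a coset of the $x$-axis by its $y$-value is the illustrative choice here:

```python
# Cosets of the x-axis X in R^2 are determined by the y-coordinate alone,
# so the canonical map pi can be modeled by projecting onto it.
def pi(v):
    return v[1]  # the coset v + X is identified with the y-value of v

v = (2.0, 3.0)
for u in [(1.0, 0.0), (-5.0, 0.0), (100.0, 0.0)]:  # arbitrary elements of X
    w = (v[0] + u[0], v[1] + u[1])
    assert pi(w) == pi(v)  # adding u in X does not change the coset
print(pi(v))  # 3.0
```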

We now consider a fancier example. Consider all rational Cauchy sequences, that is, sequences

$$(a_n)=(a_1,a_2,a_3,\cdots)$$
where $a_k\in\mathbb{Q}$ for all $k$. In analysis class we learned two facts.

- Any Cauchy sequence is bounded.
- If $(a_n)$ converges, then $(a_n)$ is Cauchy.

However, the converse of 2 does not hold in $\mathbb{Q}$. For example, if we put $a_k=(1+\frac{1}{k})^k$, the limit would be $e$, but $e \notin \mathbb{Q}$.
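A quick numerical illustration of this sequence, computed with exact rational arithmetic so that every term really lies in $\mathbb{Q}$:

```python
from fractions import Fraction

# a_k = (1 + 1/k)^k is a rational sequence increasing toward e = 2.71828...,
# which is irrational, so the sequence has no limit inside Q.
def a(k):
    return (1 + Fraction(1, k)) ** k

vals = [float(a(k)) for k in (10, 100, 1000)]
print(vals)  # increasing, approaching e
```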

If we define the addition and scalar multiplication term by term, namely

$$(a_n)+(b_n)=(a_n+b_n)$$

and

$$\alpha(a_n)=(\alpha a_n),$$

where $\alpha \in \mathbb{Q}$, we get a vector space (the verification is easy). The zero vector is defined by

$$(0)=(0,0,0,\cdots).$$
This vector space is denoted by $\overline{\mathbb{Q}}$. The subspace containing all sequences converging to $0$ will be denoted by $\overline{\mathbb{O}}$. Again, $(a_n)+\overline{\mathbb{O}}=(b_n)+\overline{\mathbb{O}}$ if and only if $(a_n-b_n) \in \overline{\mathbb{O}}$. Using the language of equivalence relations, we also say $(a_n)$ and $(b_n)$ are equivalent if $(a_n-b_n) \in \overline{\mathbb{O}}$. For example, the two following sequences are equivalent:

Actually we will get $\mathbb{R} \simeq \overline{\mathbb{Q}}/\overline{\mathbb{O}}$ in the end. But to make sure that this quotient space is exactly the one we meet in our analysis class, a lot of verification has to be done.

We shall give more definitions for calculation. The multiplication of two Cauchy sequences is defined term by term à la the addition. For $\overline{\mathbb{Q}}/\overline{\mathbb{O}}$ we have

$$((a_n)+\overline{\mathbb{O}})+((b_n)+\overline{\mathbb{O}})=(a_n+b_n)+\overline{\mathbb{O}}$$

and

$$((a_n)+\overline{\mathbb{O}})((b_n)+\overline{\mathbb{O}})=(a_nb_n)+\overline{\mathbb{O}}.$$
As for inequality, a partial order has to be defined. We say $(a_n) > (0)$ if there exists some $N>0$ such that $a_n>0$ for all $n \geq N$. By $(a_n) > (b_n)$ we mean $(a_n-b_n)>(0)$ of course. For cosets, we say $(a_n)+\overline{\mathbb{O}}>\overline{\mathbb{O}}$ if $(x_n) > (0)$ for some $(x_n) \in (a_n)+\overline{\mathbb{O}}$. This is well defined. That is, if $(x_n)>(0)$, then $(y_n)>(0)$ for all $(y_n) \in (a_n)+\overline{\mathbb{O}}$.

With these operations defined, it can be verified that $\overline{\mathbb{Q}}/\overline{\mathbb{O}}$ has the desired properties, for example the least-upper-bound property. But this goes too far from the topic, so we are not proving it here.

Finally, we are trying to make $L^p$ a Banach space. Fix $p$ with $1 \leq p < \infty$. There is a seminorm defined for all Lebesgue measurable functions on $[0,1]$ by

$$p(f)=\left(\int_0^1|f(t)|^pdt\right)^{1/p}.$$
$L^p$ is the vector space containing all functions $f$ with $p(f)<\infty$. But it's not a normed space under $p$, since $p(f)=0$ only implies $f=0$ almost everywhere. However, the set $N$ of all functions that equal $0$ almost everywhere is also a vector space. Now consider the quotient space with

$$\tilde{p}(\pi(f))=p(f),$$
where $\pi$ is the canonical map of $L^p$ into $L^p/N$. We shall prove that $\tilde{p}$ is well-defined here. If $\pi(f)=\pi(g)$, we have $f-g \in N$; therefore

$$p(f) \leq p(f-g)+p(g)=p(g), \qquad p(g) \leq p(g-f)+p(f)=p(f),$$
which forces $p(f)=p(g)$. Therefore in this case we also have $\tilde{p}(\pi(f))=\tilde{p}(\pi(g))$. This indeed ensures that $\tilde{p}$ is a *norm*, and $L^p/N$ a Banach space. There are some topological facts required to prove this; we are going to cover a few of them.

We know that if $X$ is a topological vector space with a topology $\tau$, then addition and scalar multiplication are continuous. Suppose now $N$ is a closed subspace of $X$. Define $\tau_N$ by

We expect $\tau_N$ to be properly defined, and fortunately it is. Some interesting techniques will be used in the following section.

There will be two steps to get this done.

$\tau_N$ is a topology.

It is trivial that $\varnothing$ and $X/N$ are elements of $\tau_N$. Other properties are immediate as well since we have

and

Indeed, if we have $A,B\in \tau_N$, then $A \cap B \in \tau_N$, since $\pi^{-1}(A \cap B)=\pi^{-1}(A) \cap \pi^{-1}(B) \in \tau$.

Similarly, if $A_\alpha \in \tau_N$ for all $\alpha$, we have $\cup A_\alpha \in \tau_N$. Also, by definition of $\tau_N$, $\pi$ is continuous.

$\tau_N$ is a vector topology.

First, we show that a point in $X/N$, which can be written as $\pi(x)$, is closed. Notice that $N$ is assumed to be closed, and

therefore has to be closed.

In fact, $F \subset X/N$ is $\tau_N$-closed if and only if $\pi^{-1}(F)$ is $\tau$-closed. To prove this, one needs to notice that $\pi^{-1}(F^c)=(\pi^{-1}(F))^{c}$.

Suppose $V$ is open, then

is open. By definition of $\tau_N$, we have $\pi(V) \in \tau_N$. Therefore $\pi$ is an open mapping.

If now $W$ is a neighborhood of $0$ in $X/N$, there exists a neighborhood $V$ of $0$ in $X$ such that

Hence $\pi(V)+\pi(V) \subset W$. Since $\pi$ is open, $\pi(V)$ is a neighborhood of $0$ in $X/N$, this shows that the addition is continuous.

The continuity of scalar multiplication will be shown directly (the same could be done for addition, but the proof above is intended to illustrate a particular technique). We already know that the scalar multiplication on $X$ by

is continuous, where $\Phi$ is the scalar field (usually $\mathbb{R}$ or $\mathbb{C}$). Now the scalar multiplication on $X/N$ is given by

We see $\psi(\alpha,x+N)=\pi(\varphi(\alpha,x))$. But the composition of two continuous functions is continuous, therefore $\psi$ is continuous.

We are going to discuss a classic commutative diagram that you have already seen in algebra class.

There are some assumptions.

- $X$ and $Y$ are topological vector spaces.
- $\Lambda$ is linear.
- $\pi$ is the canonical map.
- $N$ is a closed subspace of $X$ and $N \subset \ker\Lambda$.

Algebraically, there exists a unique map $f: X/N \to Y$ by $x+N \mapsto \Lambda(x)$. Namely, the diagram above is commutative. But now we are interested in some analysis facts.

$f$ is linear.

This is obvious. Since $\pi$ is **surjective**, for $u,v \in X/N$, we are able to find some $x,y \in X$ such that $\pi(x)=u$ and $\pi(y)=v$. Therefore we have

and

$\Lambda$ is open if and only if $f$ is open.

If $f$ is open, then for any open set $U \subset X$, we have

an open set: $\pi$ is open, so $\pi(U)$ is an open set, and $f$ maps it to an open set.

If $f$ is not open, then there exists some open $V \subset X/N$ such that $f(V)$ is not open. However, since $\pi$ is continuous, $\pi^{-1}(V)$ is open. In this case we have

which is not open, so $\Lambda$ is not open. This shows that if $\Lambda$ is open, then $f$ is open.

$\Lambda$ is continuous if and only if $f$ is continuous.

If $f$ is continuous, for any open set $W \subset Y$, we have $\pi^{-1}(f^{-1}(W))=\Lambda^{-1}(W)$ to be open. Therefore $\Lambda$ is continuous.

Conversely, if $\Lambda$ is continuous, for any open set $W \subset Y$, we have $\Lambda^{-1}(W)$ to be open. Therefore $f^{-1}(W)=\pi(\Lambda^{-1}(W))$ has to be open since $\pi$ is open.

Twenty-five years old, no longer growing any taller, and my family has decided to bury me. There is really nothing left to resist: listen a little longer to the half-coherent eulogy, to that jumble of wailing and murmured responses, endure a little longer the cold of the coffin, and it will be over. Whether this is a delusion I cannot tell, but I suppose I am, after all, only a supporting character in this death.

Whether this is a nightmare hardly matters, nor does it matter whether the smell of cadaverine is coming from my own body; there is no need for doubt. This is simply my third endurance, and a willing one. Why struggle? I am no longer the sickly child of eighteen years ago who could still move about. I am plainly a dead man trying to keep up appearances. A few days ago, perhaps, one could still smell the fragrance of violets.

Ashes to ashes, dust to dust. I may feel a little sorrow; my spiritual existence took a severe blow long ago. Perhaps the stench of the corpse is one reason, or perhaps it is the simple fact of being buried. Let me be buried like this, then; that smell, at least, is real. I have no strength anyway, and my nervous system, if it still exists at all, no longer obeys me. I have no strength to struggle.

I should not feel smug about it, though it is nothing much: a second dream about death. It could just as well be a death in earnest. I am still the protagonist. For eighteen years, or perhaps only eighteen days, my coffin has grown together with my body, until time and the stench inside blended into a translucent, soft jelly, and my shell was fixed within it. Perhaps one day it will melt.

*Eyes of a Blue Dog*, "The Third Resignation"

La tercera resignación

Pic by https://www.deviantart.com/insaneattraction/art/La-tercera-resignacion-57045720.

An open map is a function between two topological spaces that maps open sets to open sets. Precisely speaking, a function $f: X \to Y$ is open if for any open set $U \subset X$, $f(U)$ is open in $Y$. Likewise, a closed map is a function mapping closed sets to closed sets.

You may think an open/closed map is just another name for a continuous function, but it's not: the definition of an open/closed mapping is quite different from continuity. Here are some simple examples.

- $f(x)=\sin{x}$ defined on $\mathbb{R}$ is not open, though it’s continuous. It can be verified by considering $(0,2\pi)$, since we have $f((0,2\pi))=[-1,1]$.
- The projection $\pi: \mathbb{R}^2 \to \mathbb{R}$ defined by $(x,y) \mapsto x$ is open. Indeed, it maps an open ball onto an open interval on $x$ axis.
- The inclusion map $\varphi: \mathbb{R} \to \mathbb{R}^2$ by $x \mapsto (x,0)$, however, is not open. An open interval on the plane is *locally closed* but not open or closed.
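The first example can be checked numerically; a minimal sketch (the sample size is an arbitrary choice):

```python
import math

# f(x) = sin(x) is continuous but not open: the image of the open interval
# (0, 2*pi) is the closed interval [-1, 1], since sin attains its maximum 1
# at pi/2 and its minimum -1 at 3*pi/2, both of which lie inside (0, 2*pi).
xs = [2 * math.pi * k / 10_000 for k in range(1, 10_000)]  # points in (0, 2*pi)
image = [math.sin(x) for x in xs]

# The endpoints +/-1 are attained, so the image is not open.
print(max(image), min(image))
```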

Under what condition is a continuous linear function between two TVS an open mapping? We'll give an answer in this blog post: the open mapping theorem is a sufficient condition for a continuous linear function to be open.

Let $X,Y$ be Banach spaces and $T: X \to Y$ a surjective bounded linear map. Then $T$ is an open mapping.

The open balls in $X$ and $Y$ are defined respectively by

All we need to do is show that there exists some $r>0$ such that

Every open set in $X$ or $Y$ can be expressed as a union of open balls, and a ball in $X$ centered at $x \in X$ with radius $r$ can be written as $x+B_r^X$. With that, it becomes obvious that $T$ maps open sets to open sets.

First we have

The surjectivity of $T$ ensures that

Since $Y$ is Banach, hence a complete metric space, by the Baire category theorem there must be some $n_0 \in \mathbb{N}$ such that $\overline{T(B_{n_0}^{X})}$ has nonempty interior. If not, i.e. if $T(B_n^{X})$ were nowhere dense for all $n \in \mathbb{N}$, then $Y$ would be of the first category, a contradiction.

Since $x \mapsto nx$ is a homeomorphism of $X$ onto $X$, we see that in fact $T(B_1^X)$ is not nowhere dense. Therefore, there exist some $y_0 \in \overline{T(B_1^{X})}$ and some $\varepsilon>0$ such that

the open set on the left-hand side is a neighborhood of $y_0$, which lies in the interior of $\overline{T(B_1^X)}$.

On the other hand, we claim

We shall prove it as follows. Pick any $y \in \overline{T(B_1^X)}$; we shall show that $y-y_0 \in \overline{T(B_2^X)}$. For $y_0$, there exists a sequence $(y_n)$ with $\lVert y_n \rVert <1$ for all $n$ such that $Ty_n \to y_0$. Also we can find a sequence $(x_n)$ with $\lVert x_n \rVert <1$ for all $n$ such that $Tx_n \to y$. Notice that we also have

since

we see $T(x_n-y_n) \in T(B_2^X)$ for all $n$, it follows that

Combining all these relations, we get

Since $T$ is linear, we see

By induction we get

for all $n \geq 1$.

We shall show however

For any $u \in B_{\varepsilon/4}^Y$, we have $u \in \overline{T(B_{1/2}^X)}$. There exists some $x_1 \in B_{1/2}^{X}$ such that

This implies that $u-Tx_1 \in B_{\varepsilon/8}^Y$. Under the same fashion, we are able to pick $x_n$ in such a way that

where $\lVert x_n \rVert<2^{-n}$. Now let $z_n=\sum_{k=1}^{n}x_k$; we shall show that $(z_n)$ is Cauchy. For $m<n$, we have

Since $X$ is Banach, there exists some $z \in X$ such that $z_n \to z$. Further we have

therefore $z \in B_1^X$. Since $T$ is bounded, hence continuous, we get $T(z)=u$. To summarize: for every $u \in B_{\varepsilon/4}^Y$ there is some $z \in B_{1}^X$ such that $T(z)=u$, which implies $T(B_1^X) \supset B_{\varepsilon/4}^Y$.

Let $U \subset X$ be open, we want to show that $T(U)$ is also open. Take $y \in T(U)$, then $y=T(x)$ with $x \in U$. Since $U$ is open, there exists some $\varepsilon>0$ such that $B_{\varepsilon}^{X}+x \subset U$. By the linearity of $T$, we obtain $B_{r\varepsilon}^Y \subset T(B_{\varepsilon}^X)$ for some small $r$. Using the linearity of $T$ again, we obtain

which shows that $T(U)$ is open, therefore $T$ is an open mapping.

One has to notice that the completeness of $X$ and $Y$ has been used more than once. For example, the existence of $z$ depends on the fact that Cauchy sequences converge in $X$. Also, the surjectivity of $T$ cannot be omitted; can you see why?

There are some different ways to state this theorem.

- To every $y$ with $\lVert y \rVert < \delta$, there corresponds an $x$ with $\lVert x \rVert<1$ such that $T(x)=y$.
- Let $U$ and $V$ be the open unit balls of the Banach spaces $X$ and $Y$. To every surjective bounded linear map, there corresponds a $\delta>0$ such that

You may also realize that we have used a lot of basic definitions of topology. For example, we checked the openness of $T(U)$ by using neighborhood. The set $\overline{T(B_1^X)}$ should also remind you of limit point.

The difference between an open mapping and a continuous mapping can be seen via the topologies of the two spaces. Suppose $f: X \to Y$, and let $\tau_X$ and $\tau_Y$ be the topologies of $X$ and $Y$, respectively. Then $f$ is open if for any $U \in \tau_X$ we have $f(U) \in \tau_Y$. But this has nothing to do with continuity: by continuity we mean that for any $V \in \tau_Y$, we have $f^{-1}(V) \in \tau_X$.

Fortunately, this theorem can be generalized to $F$-spaces, which will be demonstrated in a following blog post of the series. A space $X$ is an $F$-space if its topology $\tau$ is induced by a complete invariant metric $d$. Still, completeness plays a critical role.

Before we go into group theory, let's recall how a Cauchy sequence is defined in analysis.

A sequence $(x_n)_{n=1}^{\infty}$ of real/complex numbers is called a Cauchy sequence if, for every $\varepsilon>0$, there is a positive integer $N$ such that for all $m,n>N$, we have

That is to say, the **distance** between the terms eventually becomes arbitrarily small. Notice that only distance is involved, so the definition of a Cauchy sequence in a metric space comes up naturally.

Given a metric space $(X,d)$, a sequence $(x_n)_{n=1}^{\infty}$ is Cauchy if for every real number $\varepsilon>0$, there is a positive integer $N$ such that, for all $m,n>N$, the distance satisfies

By considering the topology induced by the metric, we see that $x_n$ lies in a neighborhood of $x_m$ with radius $\varepsilon$. But a topology can be constructed from neighborhoods, hence the definition of a Cauchy sequence in a topological vector space follows.

For a topological vector space $X$, pick a local base $\mathcal{B}$, then $(x_n)_{n=1}^{\infty}$ is a Cauchy sequence if for each member $U \in \mathcal{B}$, there exists some number $N$ such that for $m,n>N$, we have

But in a general topological space this does not work. Consider the two topological spaces

with usual topology. We have $X \simeq Y$ since we have the map by

as a homeomorphism. Consider the Cauchy sequence $(\frac{1}{n+1})_{n=1}^{\infty}$; we see $(h(\frac{1}{n+1}))_{n=1}^{\infty}=(n+1)_{n=1}^{\infty}$, which is not Cauchy. This counterexample shows that being a Cauchy sequence is not preserved by homeomorphisms.
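A quick numerical illustration; I am assuming, from the sequence computation above, that the spaces are $X=(0,1)$, $Y=(1,\infty)$ and the homeomorphism is $h(x)=1/x$:

```python
# h(x) = 1/x is a homeomorphism of (0, 1) onto (1, oo), yet it maps the
# Cauchy sequence x_n = 1/(n+1) to h(x_n) = n+1, which is not Cauchy.
def h(x):
    return 1.0 / x

xs = [1.0 / (n + 1) for n in range(1, 50)]
ys = [h(x) for x in xs]

# Successive gaps of (x_n) shrink toward 0; gaps of (h(x_n)) stay equal to 1.
x_gaps = [abs(a - b) for a, b in zip(xs, xs[1:])]
y_gaps = [abs(a - b) for a, b in zip(ys, ys[1:])]
print(x_gaps[-1], y_gaps[-1])
```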

Similarly, one can define a Cauchy sequence in a topological group (by considering multiplication by the inverse as subtraction).

A sequence $(x_n)_{n=1}^{\infty}$ in a topological group $G$ is a Cauchy sequence if for every open neighborhood $U$ of the identity of $G$, there exists some number $N$ such that whenever $m,n>N$, we have

A metric space $(X,d)$ in which every Cauchy sequence converges is called complete.

Spaces like $\mathbb{R}$ and $\mathbb{C}$ are complete with the Euclidean metric. But consider the sequence in $\mathbb{Q}$ given by

we have $a_n\in\mathbb{Q}$ for all $n$, but the sequence does not converge in $\mathbb{Q}$. Indeed, in $\mathbb{R}$ we can naturally write $a_n \to e$, but $e \notin \mathbb{Q}$ as we all know.

There are several ways to construct $\mathbb{R}$ from $\mathbb{Q}$. One of the most famous is Dedekind's cut, but it makes no explicit use of Cauchy sequences. There is another method that uses Cauchy sequences explicitly; we follow that route algebraically.

Suppose we are given a group $G$ with a sequence of normal subgroups $(H_n)_{n=1}^{\infty}$ with $H_n \supset H_{n+1}$ for all $n$, each of finite index. We are going to complete this group.

A sequence $(x_n)_{n=1}^{\infty}$ in $G$ will be called **Cauchy sequence** if given $H_k$, there exists some $N>0$ such that for $m,n>N$, we have

Indeed, this looks very similar to what we saw for topological groups, but we do not want to put a topology on the group here. Nor does this definition stray far from the original definition of a Cauchy sequence in $\mathbb{R}$: if you treat $H_k$ as something 'small', it says that $x_m$ and $x_n$ are close enough (viewing $x_nx_m^{-1}$ as their difference).
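As a toy instance of this definition (my own example, not from the text): take $G=\mathbb{Z}$ written additively with $H_k = 10^k\mathbb{Z}$, so that $x_nx_m^{-1}$ becomes the difference $x_n - x_m$. The repunits then form a Cauchy sequence:

```python
# In the additive group G = Z with H_k = 10**k * Z, the repunits
# x_n = 1, 11, 111, ... are Cauchy: for m, n >= k the difference
# x_n - x_m is divisible by 10**k, i.e. lies in H_k.
def x(n):
    return (10**n - 1) // 9  # n ones: 1, 11, 111, ...

def in_H(g, k):
    return g % 10**k == 0

k = 5
checks = [in_H(x(n) - x(m), k) for m in range(k, 20) for n in range(k, 20)]
print(all(checks))
```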

A sequence $(x_n)_{n=1}^{\infty}$ in $G$ will be called **null sequence** if given $k$, there exists some $N>0$ such that for all $n>N$, we have

or you may write $x_ne^{-1} \in H_k$. It can be considered as being *arbitrarily close to the identity $e$*.

The Cauchy sequences (of $G$) form a group under termwise product

*Proof.* Let $C$ be the set of Cauchy sequences, we shall show that $C$ forms a group. For $(x_1,x_2,\cdots),(y_1,y_2,\cdots)\in C$, the product is defined by

The associativity follows naturally from the associativity of $G$. To show that $(x_1y_1,x_2y_2,\cdots)$ is still a Cauchy sequence, notice that for big enough $m$, $n$ and some $k$, we have

But $(x_ny_n)(x_my_m)^{-1}=x_ny_ny_m^{-1}x_m^{-1}$. To show that this is an element of $H_k$, notice that

Since $y_ny_m^{-1}\in H_k$ and $H_k$ is normal, we have $x_ny_ny_m^{-1}x_n^{-1} \in H_k$. Since also $x_nx_m^{-1} \in H_k$, $(x_ny_n)(x_my_m)^{-1}$ can be viewed as a product of two elements of $H_k$, and is therefore an element of $H_k$.

Obviously, if we define $e_C=(e_G,e_G,\cdots)$, where $e_G$ is the identity of $G$, $e_C$ becomes the identity of $C$, since

Finally the inverse. We need to show that

is still an element of $C$. This is trivial since if we have

then

as $H_k$ is a group.

The null sequences (of $G$) form a group, further, it’s a normal subgroup of $C$, that is, the group of Cauchy sequences.

Let $N$ be the set of null sequences of $G$. Still, the identity is $(e_G,e_G,\cdots)$, and there is no need to repeat the verification; associativity again follows from $G$. To show that $N$ is closed under the termwise product, namely that $(x_n),(y_n) \in N$ implies $(x_ny_n)\in N$, one only needs to notice that, for large $n$, we already have

Therefore $x_ny_n \in H_k$ since $x_n$ and $y_n$ are two elements of $H_k$.

To show that $(x_n^{-1})$, which should be treated as the inverse of $(x_n)$, is still in $N$, notice that if $x_n \in H_k$, then $x_n^{-1} \in H_k$.

Next, we shall show that $N$ is a subgroup of $C$, which amounts to showing that every null sequence is Cauchy. Given $H_k$, for $(x_n)\in{N}$ and all large enough $m$ and $n$ we have

therefore

as desired. Finally, pick $(p_n) \in N$ and $(q_n) \in C$, we shall show that $(q_n)(p_n)(q_n)^{-1} \in N$. That is, the sequence $(q_np_nq_n^{-1})$ is a null sequence. Given $H_k$, we have some big $n$ such that

therefore

since $H_k$ is normal. Our statement is proved.

The factor group $C/N$ is called the **completion** of $G$ (with respect to $(H_n)$).

As we know, the elements of $C/N$ are cosets, and a coset can be considered an element of the completion of $G$. Let's recall a property of factor groups: pick $x,y \in C$; then $xN=yN$ if and only if $x^{-1}y \in N$. That is, two Cauchy sequences are equivalent if and only if their 'difference' is a null sequence.

Informally, consider the additive group $\mathbb{Q}$. There are two Cauchy sequences given by

They are equivalent since

is a null sequence. That's why people say $0.99999\ldots = 1$ (in analysis, the difference converges to $0$; in algebra, we say the two sequences are equivalent). As another example, $\ln{2}$ can be represented by the equivalence class of

We made our completion using Cauchy sequences. The completion is filled with the Cauchy sequences together with additions of 'nothing', whence the gaps disappear.
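The $0.999\ldots = 1$ example can be checked concretely with exact rational arithmetic; my reading of the two sequences above is $a_n = 1-10^{-n}$ and $b_n = 1$:

```python
from fractions import Fraction

# a_n = 0.9, 0.99, 0.999, ... and b_n = 1, 1, 1, ... as Cauchy sequences in
# the additive group Q; their termwise difference is a null sequence.
a = [Fraction(10**n - 1, 10**n) for n in range(1, 30)]
b = [Fraction(1) for _ in range(1, 30)]
diff = [x - y for x, y in zip(a, b)]

# |a_n - b_n| = 10**(-n): eventually smaller than any fixed 10**(-k).
print(all(abs(d) < Fraction(1, 10**k) for k, d in enumerate(diff)))
```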

Again, the sequence of normal subgroups does not have to be indexed by $\mathbb{N}$. It can be indexed by any directed partially ordered set, or simply partially ordered set. Removing the restriction of index set gives us a great variety of implementation.

However, can we finish everything about completing $\mathbb{Q}$ this way? The answer is no: the multiplication has not been handled! To finish the job, field theory has to be taken into consideration.

A real-valued function $f(t)$ of a real variable, defined on some neighborhood of $0$, is said to be $o(t)$ if

And its derivative at some point $a$ is defined by

We also have this equivalent equation:

Now suppose $f:U \subset \mathbb{R}^n \to \mathbb{R}^m$ where $U$ is an open set. The function $f$ is differentiable at $x_0 \in U$ if satisfying the following conditions.

All partial derivatives of $f$, i.e. $\frac{\partial f_i}{\partial x_j}$, exist at $x_0$ for all $i=1,\cdots,m$ and $j = 1,\cdots,n$ (which ensures that the Jacobian matrix exists and is well-defined).

The Jacobian matrix $J(x_0)\in\mathbb{R}^{m\times n}$ satisfies

In fact the Jacobian matrix is the derivative of $f$ at $x_0$, although it is a matrix instead of a number. Rather, in the general case we should treat a number as a matrix. In the following definition of the Fréchet derivative, you will see that we should treat *something* as a linear operator.

Let $f:U\to\mathbf{F}$ be a function where $U$ is an open subset of $\mathbf{E}$. We say $f$ is Fréchet differentiable at $x \in U$ if there is a bounded **linear operator** $\lambda:\mathbf{E} \to \mathbf{F}$ such that

We say that $\lambda$ is the **derivative** of $f$ at $x$, denoted by $Df(x)$ or $f'(x)$. Notice that $\lambda \in L(\mathbf{E},\mathbf{F})$. If $f$ is differentiable at every point of $U$, then $f'$ is a map given by

The definition above doesn’t go too far from real functions defined on the real axis. Now we are assuming that both $\mathbf{E}$ and $\mathbf{F}$ are merely topological vector spaces, and still we can get the definition of Fréchet derivative (generalized).

Let $\varphi$ be a mapping of a neighborhood of $0$ of $\mathbf{E}$ into $\mathbf{F}$. We say that $\varphi$ is **tangent to** $0$ if given a neighborhood $W$ of $0$ in $\mathbf{F}$, there exists a neighborhood $V$ of $0$ in $\mathbf{E}$ such that

for some function $o(t)$. For example, if both $\mathbf{E}$ and $\mathbf{F}$ are normed (not necessarily Banach), then we get the usual condition

where $\lim_{\lVert x \rVert \to 0}\psi(x)=0$.

Still we assume that $\mathbf{E}$ and $\mathbf{F}$ are topological vector spaces. Let $f:U \to \mathbf{F}$ be a continuous map. We say that $f$ is differentiable at a point $x \in U$ if there exists some $\lambda \in L(\mathbf{E},\mathbf{F})$ such that for small $y$ we have

where $\varphi$ is tangent to $0$. Notice that $\lambda$ is uniquely determined.

You must be familiar with some properties of derivative, but we are redoing these in Banach space.

If $f: U \to V$ is differentiable at $x_0$, and $g:V \to W$ is differentiable at $f(x_0)$, then $g \circ f$ is differentiable at $x_0$, and

*Proof.* We are proving this in topological vector space. By definition, we already have some linear operator $\lambda$ and $\mu$ such that

where $\varphi$ and $\psi$ are tangent to $0$. Further, we got

To evaluate $g(f(x_0+y))$, notice that

It’s clear that $\mu\circ\varphi(y)+\psi(\lambda{y}+\varphi(y))$ is tangent to $0$, and $\mu\circ\lambda$ is the linear map we are looking for. That is,

From now on, we are dealing with Banach spaces. Let $U$ be an open subset of $\mathbf{E}$, and $f:U \to \mathbf{F}$ be differentiable at each point of $U$. If $f'$ is continuous, then we say that $f$ is **of class** $C^1$. Functions of class $C^p$ for $p \geq 1$ are defined inductively. The $p$-th derivative $D^pf$ is defined as $D(D^{p-1}f)$ and is itself a map of $U$ into $L(\mathbf{E},L(\mathbf{E},\cdots,L(\mathbf{E},\mathbf{F})\cdots))$, which is isomorphic to $L^p(\mathbf{E},\mathbf{F})$. A map $f$ is said to be **of class** $C^p$ if its $k$-th derivative $D^kf$ exists for $1 \leq k \leq p$ and is continuous. With the help of the chain rule, and the fact that the composition of two continuous functions is continuous, we get

Let $U,V$ be open subsets of some Banach spaces. If $f:U \to V$ and $g: V \to \mathbf{F}$ are of class $C^p$, then so is $g \circ f$.

We in fact get a category $\{(U,f_U)\}$, where the objects $U$ are open subsets of Banach spaces and the morphisms $f_U$ are maps of class $C^p$ from $U$ into another open set. To verify this, one only has to realize that the composition of two maps of class $C^p$ is still of class $C^p$ (as stated above).

We say that $f$ is of class $C^\infty$ if $f$ is of class $C^p$ for all integers $p \geq 1$. Meanwhile $C^0$ maps are the continuous maps.

We are going to evaluate the Fréchet derivative of a nonlinear functional, i.e. the derivative of a functional mapping an infinite-dimensional space into $\mathbb{R}$ (instead of $\mathbb{R}$ into $\mathbb{R}$).

Consider the functional by

where the norm is defined by

For $u\in C[0,1]$, we are going to find a linear operator $\lambda$ such that

where $\varphi(\eta)$ is tangent to $0$.

*Solution.* By evaluating $\Gamma(u+\eta)$, we get

To prove that $\int_{0}^{1}\eta^2\sin{x}dx$ is the $\varphi(\eta)$ desired, notice that

Therefore we have

as desired. The Fréchet derivative of $\Gamma$ at $u$ is therefore given by

It’s hard to believe but, the derivative is not a number, not a matrix, but a linear operator. But conversely, one can treat a matrix or a number as a linear operator effortlessly.

This blog serves as an introduction to profinite groups without touching anything other than elementary group theory (no ring, field, Galois theory, topological group, etc.), though we may not be able to go for further application.

We begin with an easy-to-understand motivation by introducing $\mathbb{Z}_p$. Consider the binary expansion of an integer

where $a_k=0,1$. For example we may have

You must be familiar with binary expansion if you write code. As an exercise, show that the set of all such $a$ is uncountable. In the octal number system you may also write

This notation is pretty useful in some real-life occasions, but not here. We are looking for connections between number systems and **prime** numbers (you will see why later), but number systems with bases like $8,10,16$ definitely won't work.

Fix a prime number $p$. A $p$-adic integer $\alpha$ is defined by a sequence of integers $x_k$, for which we write

satisfying

For example, we write $88$ as a $2$-adic number by

As you may realize, $x_k$ can be written as

where $a_i \in \{0,1,\cdots,p-1\}$ for $i \leq k-1$; the $a_i$ are called the $p$-adic digits.
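A small sketch of the truncations and digits for the example $\alpha=88$, $p=2$ above (the helper names are mine):

```python
# Truncations x_k = alpha mod p**k and p-adic digits a_i, computed for the
# example alpha = 88, p = 2.
def truncations(alpha, p, n):
    return [alpha % p**k for k in range(1, n + 1)]

def digits(alpha, p, n):
    return [(alpha // p**i) % p for i in range(n)]

xs = truncations(88, 2, 8)
# Compatibility condition: x_{k+1} is congruent to x_k modulo 2**k.
assert all(xs[k + 1] % 2**(k + 1) == xs[k] for k in range(len(xs) - 1))
print(xs)
print(digits(88, 2, 8))  # 88 = 0b1011000, least significant digit first
```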

In fact, if we define the addition componentwise, i.e.

then it’s a group. Further, if we define the multiplication componentwise, we get a ring. The group of all $p$-adic numbers is denoted by $\mathbb{Z}_p$. But this blog won’t touch anything other than group theory.

As you may wonder, this doesn't seem to work for 'negative' ones. For example, if we have

how do we get $-\alpha$? In fact we have

which suggests the limit value of $x_k$ associated to $-\alpha$ as $k\to\infty$ is

It doesn’t converge in the usual sense. But if it does, we have

But this is valid under these circumstances. We can check it using $p$-adic digits. In fact, the $p$-adic digits of $1$ are

if we add $p-1$ to each component, we get

(there are infinitely many $p-1$!).
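The all-digits-$p-1$ description of $-1$ can be checked directly; a sketch for $p=2$:

```python
# In Z_p the truncations of -1 are x_k = p**k - 1, whose p-adic digits are
# all equal to p - 1; adding 1 gives 0 in every truncation, i.e. -1 + 1 = 0.
p = 2
xs = [(-1) % p**k for k in range(1, 10)]        # p**k - 1 for each k
digit_ok = all(
    ((x // p**i) % p) == p - 1
    for k, x in enumerate(xs, start=1) for i in range(k)
)
sum_ok = all((x + 1) % p**k == 0 for k, x in enumerate(xs, start=1))
print(digit_ok, sum_ok)
```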

With all this being said, you can treat $\alpha=(x_1,x_2,\cdots)$ as a **limit**:

which makes everything natural. We are not digging into $\mathbb{Z}_p$ further, but keep two words in mind: limit and group.

The definition of $\mathbb{Z}_p$ by $x_{k+1} \equiv x_k\mod{p^k}$ might remind you of $\mathbb{Z}/p^k\mathbb{Z}$. Let’s give a review of $\mathbb{Z}/p^k\mathbb{Z}$.

For integers $x,y$, we have

if $x \in p^k\mathbb{Z}$. Further we have

if $(x-y)\in p^k\mathbb{Z}$. We may also write $x \equiv y \mod p^k$. Since there are infinitely many congruences $x_{k+1} \equiv x_k \mod p^k$, shall we associate infinitely many copies of $\mathbb{Z}/p^k\mathbb{Z}$? If that works, we may treat $\mathbb{Z}_p$ as the 'limit' of the $\mathbb{Z}/p^k\mathbb{Z}$. But we need a proper **operation** to do that.

Let $G_n=\mathbb{Z}/p^{n+1}\mathbb{Z}$ for each $n \geq 0$. Let

be the canonical homomorphism. Notice that $f_n$ is surjective. Now consider a $p$-adic integer

we have

Therefore we get an expression of $\mathbb{Z}_p$ by

We will write $\mathbb{Z}_p=\varprojlim\mathbb{Z}/p^n\mathbb{Z}$, since it is an example of an inverse limit. It's *inverse* since $f_n$ goes 'backwards', associating each $x_n$ to $x_{n-1}$. Since $f_n$ is **surjective**, we can always lift $x_{n-1}$ to $G_{n}$ via $f_{n}$. We have treated one group as a limit of a sequence of groups. We don't want to limit ourselves to number theory; in the following section we offer a much more general definition, in which even the numbers are generalized.
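A minimal sketch of the compatibility condition that defines an element of $\varprojlim\mathbb{Z}/p^n\mathbb{Z}$ (function names are mine; the reduction maps are as described above):

```python
# An element of varprojlim Z/p**n Z is a sequence (x_1, x_2, ...) with
# x_n in Z/p**n Z such that reducing x_{n+1} mod p**n gives back x_n.
p = 3

def f(n, x):             # canonical homomorphism Z/p**(n+1) Z -> Z/p**n Z
    return x % p**n

def is_compatible(seq):  # seq[n-1] is the component in Z/p**n Z
    return all(f(n, seq[n]) == seq[n - 1] for n in range(1, len(seq)))

alpha = 1 + 2 * p + p**2               # some 3-adic integer, truncated
good = [alpha % p**n for n in range(1, 6)]
bad = good[:3] + [good[3] + 1] + good[4:]  # break one component
print(is_compatible(good), is_compatible(bad))
```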

We are going to give a generalized definition of profinite groups. Notice that in the example of $\mathbb{Z}_p$ the sequence is indexed by $\mathbb{N}$. This is easy to understand, but such an index set prevents profinite groups from being applied further. Of course, the index set $\mathbb{N}$ is not excluded.

A set $I$ is **directed partially ordered** if it’s associated with a partial order $\geq$ such that for any two elements $i,j \in I,$ there exists a $k \in I$ such that $k \geq i$ and $k \geq j$.

$\mathbb{Z}$ with the natural order is of course directed partially ordered. However, we can define another partial order by divisibility: if we define $n \geq m$ whenever $m \mid n$, then we have $\operatorname{lcm}(m,n) \geq m,n$.

As another example, consider the family $\mathcal{F}$ of all subgroups of a group $G$, with the partial order defined by inclusion, i.e. for $M,N \in \mathcal{F}$ we have $M \geq N$ if $M \supset N$. In this case $\langle M, N\rangle \geq M,N$.

A **projective system** is a collection of groups $G_i$ ($i \in I$), together with group homomorphisms $f^{j}_i: G_j \to G_i$ for $i,j\in {I}$ with $j \geq i$ such that

- $f_{i}^{i}=\operatorname{id}_{G_i}$ for every $i \in I$.
- $f_{i}^{j}\circ f_{j}^{k}=f_{i}^{k}$ for $k \geq j \geq i$.

Given any such projective system with a directed partially ordered index set, we have the **inverse limit** (or projective limit) defined by

It’s easy to see that $\mathbb{Z}_p$ can be defined with the same manner, although we have $I = \mathbb{N}$. It also can be verified that the inverse limit forms a group (also topological group, but we are not discussing that here).

A group is **profinite** if it is a **pro**jective limit of **finite** groups (up to isomorphism).

For any $g \in \mathbb{N}_+$, it would be interesting to consider the following projective limit by

It can be verified that we have

That said, the base-$8$ number system is 'useless' here, since it is isomorphic to the base-$2$ one. That's why we focused on primes first. We will now give another 'generalization' of the $p$-adic numbers.

Suppose we have a sequence of normal subgroups $(H_n)$ of $G$ such that $H_n \supset H_{n+1}$ for all $n$. It doesn’t matter whether $G$ is finite. Let

be the canonical homomorphisms. Then the inverse limit follows:

We also have a natural homomorphism

by sending $x$ to the sequence $(x_n)$, where $x_n$ is the image of $x$ in $G/H_n$. Notice that we don't have to use $\mathbb{N}$ as the index set; this inverse limit can also be indexed by the set of all the $H_n$.

You may think of this as algebraists stealing something from analysts and making it up to them with the magic of algebra. There are many other applications I want to show you in the future (still within elementary group theory). If you have learned functional analysis, you may know that for $1 \leq p < \infty$ the space $L^p$ fails to be a Banach space as-is, because of the functions equal to $0$ a.e.; but $L^p/N$ is a Banach space, where $N$ contains all functions equal to $0$ a.e. Both $L^p$ and $N$ are groups, and we 'completed' $L^p$ by passing to a factor space which is still a group. In fact, in algebra we also have the **Cauchy sequence** and the **completion** of a group, which are again associated with inverse limits.

- Luis Ribes, *Introduction to Profinite Groups*
- Hendrik Lenstra, *Profinite Groups*
- Serge Lang, *Algebra, Revised Third Edition*

We are going to discuss a general method for solving these two kinds of equations

where the $a_i$ are constants. In the previous post we saw that, assuming we can solve the first equation, the solution of the second can be obtained with Cramer's rule, by solving an ordinary system of linear equations and then integrating. But we only assumed we had this 'superpower', with no actual procedure. This post will grant us that superpower. Of course, it is not merely about solving the equations quickly and accurately; more importantly, we get to see the plain yet ingenious connections between classical theories.

The method of this post is based on polynomials; I assume you have at least seen some simple manipulations of polynomials in your calculus course. We will use the classical fundamental theorem of algebra, which states:

Every nonzero polynomial in one variable of degree $n$ with complex coefficients has exactly $n$ complex roots (counting multiplicity).

The most suitable equation to discuss first is

Indeed, anything simpler would just be an ordinary indefinite integral. Let us review how to analyze the structure of the solutions of this equation. First, $y=0$ is obviously a solution, which serves as a sanity check. On the other hand, in the first post we also gave the general formula for the solution:

So the solution of this equation is

where $C$ is an arbitrary constant.

We can also try a simple second-order equation

Clearly, $y=0$ is still a particular solution. What we hope is to solve an equation of the form $y'+p(x)y=0$ twice, since we already know how to solve equations of that form.

Notice that we can rewrite the equation as

Let $u=y'-y$; then we have

We know how to solve this equation; in fact we already have

so we then have

from which we can solve for

For first-order equations this is just the method we have learned. As for the second-order equation, you should have sensed that there seems to be a pattern. Notice that for the equation $y''-2y'+y=0$, we solved $y'-y$ twice. It is then natural to imagine that some other equation could be solved like solving $y'-2y$ twice. Note that if we regard taking the derivative twice as a 'square', we have an equation

Then can we write down an equation that requires solving $y'-2y$ twice? Let us first write down an equation in $\lambda$, and then associate a differential equation to it; that is,

and for the corresponding equation we indeed have

so that we again solve $y'-2y=0$ twice. Similarly, some equation might require solving $y-3y'$ a total of $n$ times, and so on. We can also discuss 'mixed' scenarios: for instance, a second-order equation that requires solving $y-ay'$ once and then $y-by'$ once. If the two are swapped, will the result differ? We can analyze this without even solving. In fact we have

We will now give a general method for solving homogeneous equations with constant coefficients. We already know that differentiation is a linear operation: differentiating a differentiable function produces a new function. So let us write $y'$ as $Dy$, where $D$ stands for this linear operation; for higher-order derivatives we write $y^{(n)}=D^ny$. Taking no derivative at all, i.e. $D^0y$, may be written as $Iy$, with the $I$ usually omitted.

Then if we already have

that is,

then we obtain a polynomial

so the original equation can be written as

What does this have to do with the examples above? Notice that if $a_1,\cdots,a_n$ are complex numbers, then $P(D)$ can always be written in the following form

where the $\lambda_i$ may or may not equal one another. Let us look at the example $y''-2y'+y=0$ again. With the help of $P(D)$, the equation can be written as

Examining the solution from this viewpoint, we can set $u=(D-1)y$, solve $(D-1)u=0$ to get $u$, and then solve $(D-1)y=u$ to obtain $y$.

So we already have the general method; in fact, it is a recursive one.

For the equation

just set $\varphi_1=(D-\lambda_2)\cdots(D-\lambda_n)y$, then solve $(D-\lambda_1)\varphi_1=0$ to obtain $\varphi_1$; continue in the same manner, setting $\varphi_2=(D-\lambda_3)\cdots(D-\lambda_n)y$ and solving for $\varphi_2$, and so on; finally set $\varphi_n=y$, and what you solve out last is the final result. At this point you have acquired the 'superpower' required in the previous post (note: this $\varphi_n$ already contains $n$ constants).
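As a worked instance of the recursion, here is the earlier example $y''-2y'+y=0$, i.e. $(D-1)^2y=0$, written out step by step:

```latex
% Apply the recursion to (D-1)^2 y = 0 with \varphi_1 = (D-1)y.
\begin{aligned}
(D-1)\varphi_1 = 0 &\implies \varphi_1 = C_1 e^{x},\\
(D-1)y = \varphi_1 \iff \bigl(e^{-x}y\bigr)' = C_1
  &\implies y = (C_1 x + C_2)\,e^{x},
\end{aligned}
% which indeed carries n = 2 arbitrary constants.
```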

The above handles the homogeneous linear equation. Is there a general method for the non-homogeneous one as well? There are three concrete approaches.

- If a particular solution of the equation is easy to spot, say there exists $\mu(x)$ with $P(D)\mu(x)=f(x)$, then the solution of the non-homogeneous equation is $\mu(x)+\varphi_n(x)$.
- Solve $P(D)y=f(x)$ directly, in the same way as the homogeneous case, only noting that when solving for $\varphi_1$ we have $(D-\lambda_1)\varphi_1=f(x)$; recursing down gives the same solution as method 1.
- Use the method of the previous post. Note that the final $\varphi_n(x)$ contains $n$ constants, i.e. it can be written as $\varphi_n(x)=\sum_{k=1}^{n}C_ku_k(x)$, where the $u_k(x)$ are exactly the desired fundamental system of solutions.

**Step 1: Factor $P(D)$**

This is easy; in fact we have

**Step 2: Solve recursively**

Set

so that we have

Solving this yields

So next set

which gives

and we obtain

Finally, solving

gives

In fact it is not hard to see that $y=-1$ is a particular solution of this equation, and solving $y'''-3y''+3y'-y=0$ gives $y=C_1x^2e^x+C_2xe^x+C_3e^x$; the result agrees with the method above.

Note that we can also write this equation as

Now that we have a mechanical procedure, can we summarize what the solutions look like before actually solving? This is certainly feasible. We will try to discuss all the basic cases of $P(D)$. The most elementary computations are omitted here, but they are nothing more than basic first-order equations.

For such an equation, what we ultimately do is solve $n$ first-order equations of the form $y'-\lambda y$. To handle them, we define

so that we only need to solve $(D-\lambda)\varphi_{k+1}=\varphi_k$ a total of $n$ times. A simple computation gives

We would naturally hope for something like $e^{\lambda_1 x},\cdots,e^{\lambda_n x}$; what actually happens? We can simply compute.

First we have

then solving

yields

and after rearranging we have

Continuing the computation, we obtain

This is naturally the combination of cases 1 and 2. In case 1 we noticed that if adjacent $\lambda_i$ are equal, then $\varphi_{i+1}=x\varphi_i+C_{i+1}e^{\lambda_{i}x}$; if adjacent $\lambda_i$ are not equal, then $\varphi_{i+1}=\varphi_i+C_{i+1}e^{\lambda_{i+1}x}$. Combining these two conclusions with a simple computation, we can summarize the general form of the solution:

At this point solving equations becomes much easier. For example, for the equation

we get

so the solution is

As another example, to solve

note that

so the solution is

Throughout this post we have done only one thing: reduce a higher-order differential equation to several first-order ones. But simplifying by hand and by inspection is impractical, so we regard the differentiation operation as an abstract 'number' and work with the corresponding polynomial, turning the solution into two steps: factor the polynomial, then solve recursively. In working with this polynomial we have indirectly simplified the original equation.

But does the polynomial method always apply to equations with variable coefficients? Not necessarily. For example, the equation $y''=xy$ has no simple solution, and we cannot expect a simple procedure to produce the functions we hope for.

The merit of this method is that it is plain and mechanical: one only needs to carry out several first-order solves. However, it does not really exhibit the concept of 'linearity', and the connection with linear algebra is hard to see. The next post will give a matrix-based approach.

It is taught in elementary calculus courses that if a function is differentiable at some point $x_0$, then it is continuous at $x_0$, but **not** vice versa. It is easy to construct a counterexample which is continuous but not differentiable; an example is $y=|x|$ at $x=0$. In fact, we can even find a nowhere differentiable function via a series:

where $g(x)=|x|$ for $x \in [-1,1]$ and $g(x+2)=g(x)$ for all $x \in \mathbb{R}$.
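The series itself is elided above; a common concrete choice (an assumption here, following Rudin's classical example) is $f(x)=\sum_{n\geq 0}(3/4)^n g(4^n x)$ with this sawtooth $g$. A numerical sketch of why such an $f$ fails to be differentiable:

```python
# Numerical sketch of a nowhere differentiable function, assuming the
# classical choice f(x) = sum_{n>=0} (3/4)^n g(4^n x) (Rudin's example),
# where g(x) = |x| on [-1, 1] and g(x + 2) = g(x).
def g(x):
    return abs(((x + 1.0) % 2.0) - 1.0)

def f(x, terms=40):
    return sum((0.75 ** n) * g((4.0 ** n) * x) for n in range(terms))

# The series is dominated by sum (3/4)^n = 4, so f is well defined.
assert all(abs(f(x)) <= 4.0 for x in [0.0, 0.1, 0.5, 0.9])

# Difference quotients at x = 0 with h = 4^{-m} blow up as m grows,
# hinting that f'(0) does not exist.
quotients = [(f(4.0 ** -m) - f(0.0)) / (4.0 ** -m) for m in (1, 3, 5, 7)]
assert quotients == sorted(quotients) and quotients[-1] > 100.0
print(quotients)
```

One can check by hand that the quotient at $h=4^{-m}$ equals $(3^{m+1}-1)/2$, which is unbounded.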

Though the existence has been proved already, we are looking for some feasible estimation on “quantity”. Also, this post is a good chance to review some elementary analysis techniques such as continuity and differentiability.

I'm assuming that you have finished this post or an equivalent one (covering basic facts about the Baire Category Theorem).

Let $K=[0,1]$ and let $C(K)$ be the set of all real-valued continuous functions defined on $K$. For $f \in C(K)$, we define the norm of $f$ by $\lVert f \rVert = \sup_{x \in K}|f(x)|$, and a metric $d$ on $C(K)$ by $d(f,g)=\lVert f-g \rVert$, then $C(K)$ becomes a **complete metric space**, where BCT applies, showing that $C(K)$ is of the second category.

A function $f \in C(K)$ will be called somewhere differentiable if there exists some point $x \in K$ such that $f$ is differentiable at $x$ (a somewhere differentiable function need not be everywhere differentiable on $K$, but that's out of our consideration). For the several collections of functions defined by

then trivially $E \subset A \subset S \subset C(K)$. We'll show, however, that $S$ is of the first category (let alone $E$ and $A$), which indicates that there are "very few" somewhere differentiable functions.

(Theorem)$S$ is of the first category.

By somewhere differentiable, we mean that there exists some $x \in K$ such that the limit

exists and is finite. If we drop the existence requirement and keep only boundedness, we get a superset of $S$ defined by

We will show that each $B_{n,m}$ is nowhere dense, which forces $S$ to be of the first category.

It suffices to show that every Cauchy sequence in $B_{n,m}$, namely $(f_k)_{k=1}^{\infty}$, converges in $B_{n,m}$. Since $C(K)$ is complete, we have $f_k \to f$ (uniformly) for some $f \in C(K)$ as $k \to \infty$. For each $k$, we can find $x_k \in [0,1]$ such that

for all $0 < h < \frac{1}{m}$. Since $[0,1]$ is compact, by Bolzano-Weierstrass theorem, $(x_k)_{k=1}^{\infty}$ has a convergent subsequence, namely $(x_{k_i})_{i=1}^{\infty}$. Therefore, for each $i$, we have some $g_{i}=f_{k_i}$ and $y_i=x_{k_i}$ such that

for all $0<h<\frac{1}{m}$. Suppose $y_i \to x$ as $i \to\infty$; we then have

for all $0<h<\frac{1}{m}$. Therefore $f \in B_{n,m}$, which implies that $B_{n,m}$ is closed.

There are two basic analysis facts used in this step.

First, a subspace of a complete metric space is closed iff it is complete. This can be shown easily by proving that a subspace of a complete metric space is not complete iff it is not closed (a word of warning: not closed $\neq$ open!). For a detailed proof, click here.

Second, the Bolzano-Weierstrass theorem. A subset $M$ of $\mathbb{R}$ is compact iff every sequence of elements of $M$ has a subsequence which converges to an element of $M$. We already know that $[0,1]$ is compact.

To show that $B_{n,m}$ is nowhere dense, we have to show that $B_{n,m}$ contains no open ball. The open ball for $f \in B_{n,m}$ is defined by

We will show that $B(f,\varepsilon)-B_{n,m} \neq \varnothing$ for any $\varepsilon >0$. This is done by constructing a function $g$ such that $g \in B(f,\varepsilon)$ but $g \notin B_{n,m}$.

Since $PL(K)$ (the family of piecewise linear functions defined on $K$) is dense in $C(K)$, we may find a function $p \in PL(K)$ such that $\left\Vert f-p \right\Vert<\frac{\varepsilon}{2}$.

Since $p$ is piecewise differentiable as well, we may find some $M \in \mathbb{N}$ such that $|p’(x)| \leq M$ for all $x$ where $p$ is differentiable. Pick an integer $k$ where $k>\frac{2(M+n)}{\varepsilon}$.

Define another periodic piecewise linear function $\varphi$ on $K$ as follows: for $x \in [0, \frac{2}{k}]$, $\varphi(x)=-k|x-\frac{1}{k}|+1$, and $\varphi(x+\frac{2}{k})=\varphi(x)$ elsewhere, so that $\varphi$ has period $\frac{2}{k}$. For this function, we have $|\varphi(x)| \leq 1$, and $|\varphi'(x)|=k$ for all $x$ where $\varphi$ is differentiable. With $g(x)$ defined by

we have

Thus $g \in B(f,\varepsilon)$. However, as we’ll show, $g \notin B_{n,m}$.

Pick $x \in K$ such that both $p$ and $\varphi$ are differentiable at $x$, then

Still, $g \in PL(K)$. For $x$ where $g$ is differentiable, we have $|g'(x)|>n$. For the turning points of $g$, we have $|g'_{-}(x)|>n$ and $|g'_{+}(x)|>n$, where $g'_{-}$ and $g'_{+}$ denote the left and right derivatives of $g$. Therefore, for any $n,m \in \mathbb{N}$ and $\varepsilon \in \mathbb{R}^{+}$, we have $B(f,\varepsilon) - B_{n,m} \neq \varnothing$. That is, $B_{n,m}$ is nowhere dense.
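The construction can be sanity-checked numerically. The sketch below assumes $g=p+\frac{\varepsilon}{2}\varphi$ (the displayed definition above) and uses illustrative values of $\varepsilon$, $n$, $M$ rather than the ones in the text:

```python
# A numeric sketch of the perturbation step, assuming g = p + (eps/2)*phi;
# p, n, M are illustrative choices, not the ones in the text.
eps, n = 0.1, 5.0
M = 0.3
p = lambda x: M * x               # a (piecewise) linear p with |p'| <= M
k = int(2 * (M + n) / eps) + 2    # guarantees k > 2(M + n)/eps

# Sawtooth of amplitude 1 and slope +-k, period 2/k.
phi = lambda x: -k * abs((x % (2.0 / k)) - 1.0 / k) + 1.0
g = lambda x: p(x) + (eps / 2.0) * phi(x)

# ||g - p|| <= eps/2 on [0, 1], so g stays inside B(f, eps).
xs = [i / 20000.0 for i in range(20001)]
assert max(abs(g(x) - p(x)) for x in xs) <= eps / 2 + 1e-12

# On a linear piece, |g'| >= (eps/2)*k - M > n.
h, x0 = 1e-7, 0.5 / k             # x0 lies inside a rising segment
slope = (g(x0 + h) - g(x0 - h)) / (2 * h)
assert abs(slope) > n
print(k, abs(slope))
```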

There are some confusing parts worth talking about.

Why do we have to prove that $B_{n,m}$ contains no open balls?

Since $B_{n,m}$ is closed, the closure of $B_{n,m}$ is itself. Therefore it suffices to show that $B_{n,m}$ has empty interior, or equivalently, contains no open balls.

The detail of $g \notin B_{n,m}$?

Notice that $g$ is a piecewise linear function. Suppose the turning points of $g$ are $x_1,x_2,\cdots,x_N$. Then $g(x)$ is differentiable (indeed linear) on $(x_i,x_{i+1})$ for $i=1,2,\cdots,N-1$, and $g'(x)$ is the **slope** of the segment, which is equal to $\frac{g(x+h)-g(x)}{h}$ for $x+h \in (x_i,x_{i+1})$. If $x=x_i$ for some $i$, then $\frac{g(x+h)-g(x)}{h}$ equals the **slope** of the segment on the left or on the right. Therefore, for all $x \in K$, we can find some $0<|h|<\frac{1}{m}$ such that $|g(x+h)-g(x)|/|h|>n$.

Piecewise linear functions?

A function $p$ is called piecewise linear on $[0,1]$ if there is a partition $0=a_0<a_1<\cdots<a_n=1$ such that $p$ is "linear" (i.e. expressible in the form $y=ax+b$) on each $[a_i,a_{i+1}]$. We also assume that $p$ is continuous; the collection of such functions is denoted by $PL(K)$. The fact that $PL(K)$ is dense in $C(K)$ can be shown as follows (for an advanced proof, you may want to check here).

$PL(K)$ is dense in $C(K)$. Alternatively, for every $f \in C(K)$ and $\varepsilon >0$, there exists some $p(x)$ such that $\left\Vert p-f \right\Vert < \varepsilon$.

(Proof) Since $K$ is compact, $f$ is uniformly continuous on $K$. Therefore, there exists some $\delta>0$ such that $|f(x)-f(y)|<\frac{\varepsilon}{2}$ for all $x,y\in{K}$ with $|x-y|<\delta$. Take a partition with $|a_{i+1}-a_i|< \delta$ and define $p \in PL(K)$ by $p(a_i)=f(a_i)$, linear in between. If $x \in K$, there is an $i$ such that $a_i \leq x \leq a_{i+1}$. Then $|f(x)-f(a_i)|<\frac{\varepsilon}{2}$. Also,

Therefore

This holds for all $x \in K$, and therefore for $\sup_{x \in K}|f(x)-p(x)|$ as well.
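The approximation step can be illustrated numerically; here $f=\sin$ is an illustrative choice and the interpolant plays the role of $p$:

```python
import math

# Numeric sketch of the density of PL(K) in C(K): interpolate a sample
# continuous f (here f = sin, an illustrative choice) at a partition fine
# enough for its modulus of continuity, then check the sup-norm error.
def f(x):
    return math.sin(x)

eps = 0.01
nodes = [i / 20.0 for i in range(21)]        # partition of [0,1], mesh 0.05

def p(x):                                    # piecewise linear interpolant
    i = min(int(x * 20), 19)
    a, b = nodes[i], nodes[i + 1]
    t = (x - a) / (b - a)
    return (1 - t) * f(a) + t * f(b)

sup_err = max(abs(f(x) - p(x)) for x in [j / 5000.0 for j in range(5001)])
assert sup_err < eps                          # ||f - p|| < eps
print(sup_err)
```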

Since $f$ is differentiable at $x$, there exists a real number $A$ such that for any $\varepsilon>0$, there exists some $\delta>0$ such that

for $|h| \in (0,\delta)$. In fact, $A=f'(x)$; all we have to do is pick $n,m$ such that $n > |A|$ and $\frac{1}{m}<\delta$. In other words, all we need to show is that if $f \in S$, then $f \in B_{n,m}$ for some $m$ and $n$. Therefore

If we think of meagre sets as being "small", this tells us that "most" $f \in C(K)$ are nowhere differentiable. At the very least, nowhere differentiable functions exist for topological reasons, since a space cannot be of the first and second category at the same time.

]]>People call the Banach-Steinhaus theorem the first of the big three, which sit at the foundation of linear functional analysis. None of them can do without Baire's category theorem.

This blog post offers the Banach-Steinhaus theorem on different abstract levels. Recall that we have

First, there will be a simple version for Banach spaces, which may be more frequently used, and you will realize why it’s referred to as the *uniform boundedness principle*. After that, there will be a much more generalized version for TVS. Typically, the metrization of the space will not be considered.

Also, it will be a good chance to get a better view of the first and second space by Baire.

For metric spaces, equicontinuity is defined as follows. Let $(X,d_X)$ and $(Y,d_Y)$ be two metric spaces.

Let $\Lambda$ be a collection of functions from $X$ to $Y$. We have three different levels of equicontinuity.

- **Equicontinuous at a point.** For $x_0 \in X$: for every $\varepsilon>0$, there exists a $\delta>0$ such that $d_Y(Lx_0,Lx)<\varepsilon$ for all $L \in \Lambda$ whenever $d_X(x_0,x)<\delta$ (that is, the continuity estimate holds for all $L$ simultaneously on a ball centered at $x_0$ with radius $\delta$).
- **Pointwise equicontinuous.** $\Lambda$ is equicontinuous at each point of $X$.
- **Uniformly equicontinuous.** For every $\varepsilon>0$, there exists a $\delta>0$ such that $d_Y(Lx,Ly)<\varepsilon$ for all $L \in \Lambda$ and all $x,y \in X$ with $d_X(x,y) < \delta$.

Indeed, if $\Lambda$ contains only one element, namely $L$, then everything goes with the continuity and uniform continuity.

But for the Banach-Steinhaus theorem, we need a few more restrictions. In fact, $X$ and $Y$ should be Banach spaces, and $\Lambda$ should contain linear maps only. In this setting, for $L \in \Lambda$, the following three conditions are equivalent.

- $L$ is bounded.
- $L$ is continuous.
- $L$ is continuous at one point of $X$.

For topological vector spaces, where only topology and linear structure are taken into consideration, things get different. Since no metrization is considered, we have to state it in the language of topology.

Suppose $X$ and $Y$ are TVS and $\Lambda$ is a collection of linear functions from $X$ to $Y$. $\Lambda$ is equicontinuous if for every neighborhood $N$ of $0$ in $Y$, there corresponds a neighborhood $V$ of $0$ in $X$ such that $L(V) \subset N$ for all $L \in \Lambda$.

Indeed, for TVS, $L \in \Lambda$ has the three conditions equivalent as well. With that being said, equicontinuous collection has the boundedness property in a uniform manner. That’s why the Banach-Steinhaus theorem is always referred to as the *uniform boundedness principle*.

Suppose $X$ is a Banach space, $Y$ is a normed linear space, and $\Lambda$ is a collection of bounded linear transformations of $X$ into $Y$. We have two equivalent statements:

(The Resonance Theorem) If $\sup\limits_{L \in \Lambda}\left\Vert{L}\right\Vert=\infty$, then there exists some $x \in X$ such that $\sup\limits_{L \in \Lambda}\left\Vert{Lx}\right\Vert=\infty$. (In fact, these $x$ form a dense $G_\delta$.)

(The Uniform Boundedness Principle) If $\sup\limits_{L \in \Lambda}\left\Vert{Lx}\right\Vert<\infty$ for all $x \in X$, then we have $\left\Vert L \right\Vert \leq M$ for all $L \in \Lambda$ and some $M<\infty$.

(A summary of 1 and 2) Either there exists an $M<\infty$ such that $\lVert L \rVert \leq M$ for all $L \in \Lambda$, or $\sup\limits_{L\in\Lambda}\lVert Lx \rVert = \infty$ for all $x$ belonging to some dense $G_\delta$ in $X$.

*Though it would be easier if we finish the TVS version proof, it’s still a good idea to leave the formal proof without the help of TVS here. The equicontinuity of $\Lambda$ will be shown in the next section.*

First, we offer an elementary proof in which the hardest part is the Cauchy sequence.

(Lemma) For any $x \in X$ and $r >0$, we have

where $B(x,r)=\{y \in X:\lVert x-y \rVert < r\}$.

(Proof of the lemma)For $t \in X$ we have a simple relation

If we have $t \in B(0,r)$, then $x+t,x-t\in{B(x,r)}$. And the desired inequality follows by taking the supremum over $t \in B(0,r)$. (If you find trouble understanding this, take a look at the definition of $\lVert L \rVert$.)

Suppose now $\sup\limits_{L \in \Lambda}\left\Vert{L}\right\Vert=\infty$. Pick a sequence of linear transformations in $\Lambda$, say $(L_n)_{n=1}^{\infty}$, such that $\lVert L_n \rVert \geq 4^n$. Pick $x_0 \in X$, and for $n \geq 1$, pick $x_n$ inductively.

Set $r_n=3^{-n}$. With $x_{n-1}$ being picked, $x_n \in B(x_{n-1},r_n)$ is picked in such a way that

(It’s easy to validate this inequality by reaching a contradiction.) Also, it’s easy to check that $(x_n)_{n=1}^{\infty}$ is Cauchy. Since $X$ is **complete**, $(x_n)$ converges to some $x \in X$. Further we have

Therefore we have

The previous proof is easy to understand, but it is hard to see the topological properties of the set formed by such $x$. Thus we offer a topology-based proof which gives us a topological view.

Put

and let

we claim that each $V_n$ is open. Indeed, we have to show that $x \mapsto \lVert Lx \rVert$ is continuous, and it suffices to show that the norm $\lVert\cdot\rVert$ on $Y$ is continuous. This follows immediately from the triangle inequality, since for $x,y \in Y$ we have

which implies

by interchanging $x$ and $y$, we get

Thus $x \mapsto \lVert Lx \rVert$ is continuous since it’s a composition of $\lVert\cdot\rVert$ and $L$. Hence $\varphi$, by the definition, is lower semicontinuous, which forces $V_n$ to be open.
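The inequality chain above amounts to the reverse triangle inequality; a quick numerical spot check:

```python
import math
import random

# A numeric spot check of the reverse triangle inequality
# | ||x|| - ||y|| | <= ||x - y||, which is what makes the norm continuous.
def norm(v):
    return math.sqrt(sum(t * t for t in v))

random.seed(0)
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(3)]
    y = [random.uniform(-1, 1) for _ in range(3)]
    diff = [a - b for a, b in zip(x, y)]
    assert abs(norm(x) - norm(y)) <= norm(diff) + 1e-12
print("reverse triangle inequality verified on 1000 samples")
```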

If **every** $V_n$ is dense in $X$ (consider $\sup\lVert L \rVert=\infty$), then by BCT, $B=\bigcap_{n=1}^{\infty} V_n$ is dense in $X$. Since each $V_n$ is open, $B$ is a dense $G_\delta$. Again by the definition of $B$, we have $\varphi(x)=\infty$ for all $x \in B$.

If one of these sets, namely $V_N$, fails to be dense in $X$, then there exist an $x_0 \in X - V_N$ and an $r>0$ such that for $x \in B(0,r)$ we have $x_0+x \notin V_N$, which is equivalent to

considering the definition of $\varphi$, we also have

for all $L \in \Lambda$. Since $x=(x+x_0)-x_0$, we also have

Dividing both sides by $r$, we get

therefore $\lVert L \rVert \leq M=\frac{2N}{r}$ as is to be shown. Again, this follows from the definition of $\lVert L \rVert$.

Suppose $X$ and $Y$ are topological vector spaces, $\Lambda$ is a collection of continuous linear mapping from $X$ into $Y$, and $B$ is the set of all $x \in X$ whose orbits

are bounded in $Y$. For this $B$, we have:

- If $B$ is of the second category, then $\Lambda$ is equicontinuous.

Pick balanced neighborhoods $W$ and $U$ of the origin in $Y$ such that $\overline{U} + \overline{U} \subset W$. The balanced neighborhood exists since every neighborhood of $0$ contains a balanced one.

Put

If $x \in B$, then $\Lambda(x)$ is bounded, which means that for this $U$ there exists some $n$ such that $\Lambda(x) \subset nU$ (be aware: no metric is introduced; this is the definition of boundedness in a topological vector space). Therefore we have $x \in nE$. Consequently,

If no $nE$ is of the second category, then $B$ is of the first category. Therefore, there exists at least one $n$ such that $nE$ is of the second category. Since $x \mapsto nx$ is a homeomorphism of $X$ onto $X$, $E$ is of the second category as well. But $E$ is closed since each $L$ is continuous. Therefore $E$ has an interior point $x$. In this case, $x-E$ contains a neighborhood $V$ of $0$ in $X$, and

This proves that $\Lambda$ is equicontinuous.

We'll show that $B=X$. But before that, we need another lemma, which states the connection between equicontinuity and uniform boundedness.

(Lemma)Suppose $X$ and $Y$ are TVS, $\Gamma$ is an equicontinuous collection of linear mappings from $X$ to $Y$, and $E$ is a bounded subset of $X$. Then $Y$ has a bounded subset $F$ such that $T(E) \subset F$ for every $T \in \Gamma$.

(Proof of the lemma) We'll show that the set

is bounded. By the definition of equicontinuity, there is a neighborhood $V$ of the origin in $X$ such that $T(V) \subset W$ for all $T \in \Gamma$. Since $E$ is bounded, there exists some $t$ such that $E \subset tV$. For such $t$, by the linearity of the maps, we have

Therefore $F \subset tW$. $F$ is bounded.

Thus $\Lambda$ is uniformly bounded. Picking $E=\{x\}$ in the lemma, we also see $\Lambda(x)$ is bounded in $Y$ for every $x$. Thus $B=X$.

$X$ is an $F$-space if its topology $\tau$ is induced by a **complete** invariant metric $d$. By BCT, $X$ is of the second category. If we already have $B=X$, in which case $B$ is of the second category, then by the Banach-Steinhaus theorem, $\Lambda$ is equicontinuous. Formally speaking, we have:

If $\Lambda$ is a collection of continuous linear mappings from an $F$-space $X$ into a topological vector space $Y$, and if the sets

are bounded in $Y$ for every $x \in X$, then $\Lambda$ is equicontinuous.

Notice that all Banach spaces are $F$-spaces. Therefore we can restate the Uniform Boundedness Principle in Banach space with equicontinuity.

Suppose $X$ is a Banach space, $Y$ is a normed linear space, and $\Lambda$ is a collection of bounded linear transformations of $X$ into $Y$. We have:

(The Uniform Boundedness Principle)If $\sup\limits_{L \in {\Lambda}}\left\Vert{Lx}\right\Vert<\infty$ for all $x \in X$, then we have $ \left\Vert L \right\Vert \leq M$ for all $L \in {\Lambda}$ and some $M<\infty$. Further, $\Lambda$ is equicontinuous.

Surprisingly enough, the Banach-Steinhaus theorem can be used to do Fourier analysis. An important example follows.

There is a periodic continuous function $f$ on $[0,1]$ such that the Fourier series

of $f$ diverges at $0$. $\hat{f}(n)$ is defined by

Notice that $f \mapsto \hat{f}$ is linear, and the divergence of the series at $0$ can be considered by

To invoke Banach-Steinhaus theorem, the family of linear functionals are defined by

It can be proved that

which goes to infinity as $N \to \infty$. The existence of such $f$ that

follows from the resonance theorem. Further, we also know that these $f$ are in a dense $G_\delta$ subset of the vector space generated by all periodic continuous functions on $[0,1]$.
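The growth of the norms of these functionals can be observed numerically. The sketch below assumes the standard Dirichlet-kernel expression for the norm, the Lebesgue constant $L_N=\frac{1}{2\pi}\int_{0}^{2\pi}|D_N(t)|\,dt$ with $D_N(t)=\sin((N+\frac12)t)/\sin(\frac{t}{2})$, which grows like $\frac{4}{\pi^2}\log N$:

```python
import math

# Lebesgue constants L_N = (1/2π) ∫ |D_N(t)| dt via the midpoint rule;
# midpoints on (0, 2π) avoid the removable singularities of D_N.
def lebesgue_constant(N, samples=100000):
    total = 0.0
    for j in range(samples):
        t = (j + 0.5) * 2.0 * math.pi / samples
        total += abs(math.sin((N + 0.5) * t) / math.sin(t / 2.0))
    return total / samples  # = (1/2π) * Σ |D_N(t_j)| * Δt

L = [lebesgue_constant(N) for N in (1, 4, 16, 64)]
assert L == sorted(L)                 # the norms grow with N ...
assert L[-1] > 2.5                    # ... and keep growing (≈ (4/π²) log N)
assert abs(L[0] - 1.436) < 0.01       # known value L_1 = 1/3 + 2√3/π
print(L)
```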

- arXiv:1005.1585v2
- W. Rudin, *Real and Complex Analysis*
- W. Rudin, *Functional Analysis*

There are three theorems about Banach spaces that occur frequently in the crux of functional analysis, which are called the ‘big three’:

- The Hahn-Banach Theorem
- The Banach-Steinhaus Theorem
- The Open Mapping Theorem

The coming series of blog posts is intended to offer a self-study friendly explanation with richer details. Some basic analysis and topology background is required.

The term ‘category’ is due to Baire, who developed the category theorem afterwards. Let $X$ be a topological space. A set $E \subset X$ is said to be *nowhere dense* if $\overline{E}$ has empty interior, i.e. $\text{int}(\overline{E})= \varnothing$.

There are some easy examples of nowhere dense sets. For example, suppose $X=\mathbb{R}$, equipped with the usual topology. Then $\mathbb{N}$ is nowhere dense in $\mathbb{R}$ while $\mathbb{Q}$ is not: $\overline{\mathbb{N}}=\mathbb{N}$ has empty interior, whereas $\overline{\mathbb{Q}}=\mathbb{R}$ is open and equals its own interior. The category of a set is defined using nowhere dense sets. In fact,

- A set $S$ is *of the first category* if $S$ is a countable union of nowhere dense sets.
- A set $T$ is *of the second category* if $T$ is not of the first category.

In this blog post, we consider two cases: BCT in complete metric space and in locally compact Hausdorff space. These two cases have nontrivial intersection but they are not equal. There are some complete metric spaces that are not locally compact Hausdorff.

Some classic topological spaces, for example $\mathbb{R}^n$, are both complete metric spaces and locally compact Hausdorff. If a locally compact Hausdorff space happens to be a topological vector space, then it has finite dimension. Also, a topological vector space has to be Hausdorff.

By a *Baire space* we mean a topological space $X$ such that the intersection of every countable collection of dense open subsets of $X$ is also dense in $X$.

The Baire category theorem states that

(BCT1)Every complete metric space is a Baire space.

(BCT2)Every locally compact Hausdorff space is a Baire space.

By taking the complement of the definition, we can see that, **every Baire space is not of the first category**.

Suppose we have a sequence of sets $\{X_n\}$ where each $X_n$ is dense and open in $X$; then $X_0=\cap_n X_n$ is also dense in $X$. Notice that $X_0^{c} = \cup_n X_n^c$ is a countable union of nowhere dense sets, i.e. of the first category.

Let $X$ be the given complete metric space or locally compact Hausdorff space, and $\{V_n\}$ a countable collection of dense open subsets of $X$. Pick an arbitrary nonempty open subset of $X$, namely $A_0$ (this is possible due to the topology defined on $X$). To prove that $\cap_n V_n$ is dense, we have to show that $A_0 \cap \left(\cap_n V_n\right) \neq \varnothing$. This follows from the definition of density. Typically we have

A subset $A$ of $X$ is dense if and only if $A \cap U \neq \varnothing$ for all nonempty open subsets $U$ of $X$.

We pick a sequence of nonempty open sets $\{A_n\}$ inductively. With $A_{n-1}$ picked, and since $V_n$ is open and dense in $X$, the intersection $V_n \cap A_{n-1}$ is nonempty and open, so $A_n$ can be chosen such that

For BCT1, $A_n$ can be chosen to be open balls with radius $< \frac{1}{n}$; for BCT2, $A_n$ can be chosen such that the closure is compact. Define

Now, if $X$ is a locally compact Hausdorff space, then due to compactness, $C$ is not empty; therefore we have

which shows that $A_0 \cap \left(\cap_n V_n\right) \neq \varnothing$. BCT2 is proved.

For BCT1 we cannot argue this way, since it is not ensured that $X$ has the Heine-Borel property, for example when $X$ is an infinite-dimensional Hilbert space (this is also a reason why BCT1 and BCT2 are not equivalent). The only tool remaining is the Cauchy sequence. But how and where?

For any $\varepsilon > 0$, we have some $N$ such that $\frac{1}{N} < \varepsilon$. For all $m>n>N$, we have $A_m \subset A_n\subset A_N$; therefore the centers of the $A_n$ form a Cauchy sequence, converging to some point of $C$, which implies that $C \neq \varnothing$. BCT1 follows.

BCT will be used directly in the big three. It can be considered as the origin of them. But there are many other applications in different branches of mathematics. The applications shown below are in the same pattern: **if it does not hold, then we have a Baire space of the first category, which is not possible.**

$\mathbb{R}$ is uncountable

Suppose $\mathbb{R}$ is countable, then we have

where each $x_n$ is a real number. But each singleton $\{x_n\}$ is nowhere dense, so $\mathbb{R}$ would be of the first category. A contradiction.

Suppose that $f$ is an entire function such that every power series expansion

has at least one coefficient equal to $0$; then $f$ is a polynomial (there exists an $N$ such that $c_n=0$ for all $n>N$).

You can find the proof here. We are using the fact that $\mathbb{C}$ is complete.

An infinite dimensional Banach space $B$ has no countable basis

Assume that $B$ has a countable basis $\{x_1,x_2,\cdots\}$ and define

It can be easily shown that each $B_n$ is nowhere dense, while $B=\cup_n B_n$. A contradiction, since $B$ is a complete metric space.

]]>In this post we will compute the Jordan canonical form of a matrix (over $\mathbb{C}$):

Following the idea of the previous post, we will compute the Jordan canonical form by a fairly 'mechanical' method, in which every step can be justified. 'Mechanical' means that there are definite rules to follow; if you have a programming background, you could well write this procedure as a program, though that is not my field and I will not comment further. Note that there are several routes to the Jordan canonical form, and the one better suited to a computer program is, I think, not the method of this post but the algorithm used by this site.

Computing the characteristic polynomial of this matrix gives

Of course, this matrix cannot be diagonalized, since we find that

The purpose of computing these two polynomials is clear: the Jordan form uses the primary decomposition and the cyclic decomposition; the primary decomposition needs the minimal polynomial, and the minimal polynomial is closely related to the characteristic polynomial.

Meanwhile, we obtain the eigenvalues

We already know that over $\mathbb{C}$ there exists an invertible matrix $P$ such that

where $A_1$ has characteristic polynomial $f_1=(\lambda-2)^2$ and minimal polynomial $m_1=(\lambda-2)^2$, $A_2$ has minimal and characteristic polynomial $m_2=f_2=\lambda-(3-\sqrt{5}i)$, and $A_3$ has minimal and characteristic polynomial $m_3=f_3=\lambda-(3+\sqrt{5}i)$.

This is the primary decomposition discussed before: first cut the space apart using the factors of the minimal polynomial, then analyze each sub-block carefully. The detailed steps follow.

We already know that

so we need to find $4$ generalized eigenvectors. From $(A-2I)^2\mathbf{v}=0$ we get

From $(A-(3-\sqrt{5}i)I)\mathbf{v}=0$ we get

From $(A - (3+\sqrt{5}i)I)\mathbf{v}=0$ we get

We obtain the transition matrix

and the primary decomposition

First, let us carefully review what has to be done to obtain the Jordan chains. For each matrix $A_i$ we perform the cyclic decomposition, obtaining the rational canonical form

Then, for each sub-block $A_{ij}$ (the companion matrix of $m_{ij}=(\lambda-\lambda_i)^{k_{ij}}$), consider $B_{ij}=A_{ij}-\lambda_iI$. It is easy to see that the minimal polynomial of $B_{ij}$ is $\lambda^{k_{ij}}$, so it suffices to perform one more cyclic decomposition on $B_{ij}$ (if this is hard to follow, review the previous post). By this principle, among the blocks $A_1$, $A_2$, $A_3$ obtained from the primary decomposition, the last two have order $1$; they need no further decomposition and are simply similar to themselves.

For the matrix

there is exactly one invariant factor, $m_{11}=(\lambda-2)^2=\lambda^2-4\lambda+4$, so its rational canonical form is

For the transition from $A_{1}$ to $A_{11}$ we may take the cyclic basis $\alpha=(1,0)^T$ and $A_1\alpha=(\frac{5}{4},\frac{3}{4})^T$, forming the transition matrix $Q=\begin{bmatrix}1&\frac{5}{4} \\ 0 & \frac{3}{4}\end{bmatrix}$; then

Let $B_{11}=A_{11}-2{I}=\begin{bmatrix}-2& -4 \\ 1 & 2\end{bmatrix}$ and again take a cyclic basis, $\alpha=(1,0)^T$ and $B_{11}\alpha=(-2,1)^T$, giving the transition matrix $U=\begin{bmatrix}1& -2 \\ 0& 1\end{bmatrix}$; then

and hence for $A_{11}$,

Returning to the matrix $A_1$, we then have

Setting $V_1=QU$, we obtain $A_1=V_1^{-1}J_2(2)V_1$, a similarity to a Jordan block.

For $A_2$ and $A_3$, the corresponding $V_2$ and $V_3$ are both $1\times 1$ identity matrices, since nothing is left to simplify. As a result, we obtain the transition matrix $S=\text{diag}(V_1,V_2,V_3)$ from $\text{diag}(A_{11},A_{22},A_{33})$ to the Jordan canonical form; that is,

satisfying

Finally, adding the matrix $P$ used for the primary decomposition, we obtain the transition matrix from the original matrix $A$ to its Jordan canonical form $J$,

satisfying $T^{-1}AT=J$. At this point the transition matrix has been found. Every step has its reason and directly applies the linear algebra theory developed earlier, with no deviation.

Note, however, that a Jordan canonical form may contain more than one Jordan block for a given eigenvalue. The matrix $A$ of this post has exactly one block for the eigenvalue $2$, because the minimal polynomial happens to equal the characteristic polynomial. Are there examples with several Jordan blocks for the same eigenvalue? Certainly, and a very simple one: the identity matrix.

]]>When dealing with larger-scale linear computations (mainly additions and multiplications), we always hope to simplify the work: for instance when discussing the solutions of several linear equations, solving a higher-order differential equation, or locally approximating a nonlinear problem by a linear one. A classical tool is matrix diagonalization. But not every matrix can be diagonalized. For example, the matrix $\begin{bmatrix}1&0\\1&1\end{bmatrix}$ is diagonalizable over neither $\mathbb{R}$ nor $\mathbb{C}$, since its minimal polynomial is $(\lambda-1)^2$. The matrix

is not diagonalizable over $\mathbb{R}$ but is diagonalizable over $\mathbb{C}$, since its minimal polynomial is $(\lambda-1)(\lambda^2+1)$. Is there a reliable way to simplify an arbitrary matrix? The diagonal form may be the limit of simplification, so can we settle for slightly less and obtain a reasonably simple form that contains the diagonal form as a special case?

Fortunately, the Jordan canonical form gives us exactly such an opportunity. In fact, as long as we agree that the entries of the matrix are complex numbers (with real numbers regarded as having zero imaginary part), the matrix can always be reduced to a Jordan canonical form: a triangular matrix whose entries are all concentrated near the diagonal. Moreover, if the matrix happens to be diagonalizable, then the diagonal form is just an 'extreme' case of the Jordan canonical form.

If these two examples look baffling, if you do not know where these criteria come from, or if you are not quite sure what a minimal polynomial is, then you are not yet familiar with the basic techniques used below. Before reading on, make sure you know how to diagonalize a matrix and understand the primary decomposition and the cyclic decomposition. I have two posts devoted to explaining them that you may consult.
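The first example can be checked directly: for $A=\begin{bmatrix}1&0\\1&1\end{bmatrix}$, $A-I\neq 0$ while $(A-I)^2=0$, so the minimal polynomial is $(\lambda-1)^2$, which is not squarefree, and $A$ is not diagonalizable:

```python
# Check: for A = [[1,0],[1,1]], A - I is nonzero but (A - I)^2 = 0,
# so the minimal polynomial is (λ-1)^2 and A cannot be diagonalized
# (a diagonalizable matrix has a squarefree minimal polynomial).
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 0], [1, 1]]
N = [[A[i][j] - (1 if i == j else 0) for j in range(2)] for i in range(2)]

assert N != [[0, 0], [0, 0]]               # A - I ≠ 0
assert matmul(N, N) == [[0, 0], [0, 0]]    # (A - I)^2 = 0
print("minimal polynomial of A is (λ-1)^2")
```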

In fact, $\begin{bmatrix}1&0\\1&1\end{bmatrix}$ is itself a simple example of a Jordan canonical form. A Jordan canonical form is already the limit of simplification, so, unfortunately, diagonalization is out of the question for it. We define the $n\times n$ matrix

that is, a lower triangular matrix with $1$s immediately below the diagonal. We then define the Jordan block (the building block of the Jordan canonical form) as follows:

The Jordan canonical form is the block diagonal matrix having such Jordan blocks as its diagonal entries, that is,

where $\text{diag}(\cdots)$ denotes the block diagonal matrix with the listed entries on the diagonal. One possible Jordan canonical form is

that is, $n_1=n_2=\cdots=n_k=1$, in which case we already have a diagonal matrix; the relation between diagonal matrices and Jordan canonical forms is now clear.

Another possibility is

which we may write as $J=\text{diag}(J_2(2),J_1(7))$.

Of course, $J$ could equally well be written here as an upper triangular matrix; the reason will be explained later.

In this section we prove the following result:

For any complex square matrix $A$, there always exists an invertible complex matrix $P$ such that

In fact, whenever the characteristic polynomial of $A$ has $n$ roots in the field under consideration (repeated or not), the matrix can be transformed into a Jordan canonical form by the formula above; thus a real matrix whose characteristic polynomial splits over $\mathbb{R}$ can be handled without viewing its entries as complex numbers. By the fundamental theorem of classical algebra, a complex polynomial of degree $n$ always has $n$ roots, so every complex matrix can be put into Jordan canonical form. This post discusses only the complex case. Two different viewpoints are offered below, but the overall idea is the same: first apply the primary decomposition to the matrix, then the cyclic decomposition, and finally 'shift' the cyclic part to obtain the Jordan canonical form.

First, we define a linear transformation (with respect to the standard basis $\{(1,0,\cdots,0),(0,1,\cdots,0),\cdots,(0,0,\cdots,1)\}$ of $\mathbb{C}^n$)

and let the characteristic polynomial of $\mathscr{A}$ and of $A$ be

First, perform the primary decomposition of $\mathbb{C}^n$, obtaining

Then each subspace $V_i$ can be cyclically decomposed:

Let $\mathscr{A}_i$ and $\mathscr{A}_{ij}$ be the restrictions of $\mathscr{A}$ to $V_i$ and $V_{ij}$, with characteristic polynomials $(\lambda-\lambda_i)^{d_i}$ and $(\lambda-\lambda_i)^{r_{ij}}$ respectively. For convenience we write $V_{ij}=V$, $r_{ij}=r$, $\mathscr{A}_{ij}=\mathscr{A}_V$, and define a new linear transformation

where $\mathscr{I}_V$ is the identity transformation on $V$. Since no ambiguity arises, we abbreviate it as $\mathscr{B}_V=\mathscr{A}_V-\lambda_i$. Because the minimal polynomial of $\mathscr{A}_V$ is $(\lambda-\lambda_i)^r$, it is easy to see that the minimal polynomial of $\mathscr{B}_V$ is $\lambda^r$; that is, $\mathscr{B}^{r}_V=\mathscr{O}$ is the zero transformation.

We also know that $V$ is a cyclic subspace generated under $\mathscr{A}_V$. Let the cyclic basis of $V$ be

We will show that if $\mathscr{A}_V$ is replaced by $\mathscr{B}_V$, the resulting vectors still form a basis. That is, we need to show that the vectors

are linearly independent. It suffices to show that the transition matrix from $\{\alpha_i\}$ to $\{\beta_i\}$ is invertible; equivalently, that the matrix expressing the $\alpha_i$ in terms of the $\beta_i$ has nonzero determinant; equivalently, that the vectors $\{\beta_i\}$ have rank $r$. All these statements are equivalent. It is not hard to see (this can also be proved by induction) that

In matrix form, writing $\beta=\begin{bmatrix}\beta_1\\ \beta_2 \\ \vdots \\\beta_n\end{bmatrix}$ and $\alpha=\begin{bmatrix}\alpha_1\\ \alpha_2 \\ \vdots \\\alpha_n\end{bmatrix}$, we have

That is, each row of the transition matrix (call it $B$) consists of the corresponding coefficients of $(1-\lambda)^k$. Clearly $\det{B} \neq 0$. This shows that these vectors are linearly independent and form a basis of $V$. For this basis, we have

where the last identity holds because $(\mathscr{A}-\lambda_i)\beta_r=\mathscr{B}_V^{r}\beta_1=0$. The $\beta_i$ are also called a Jordan chain, since the relations can be written as a chain:

If we use $\{\beta_i\}$ as a basis, the matrix of $\mathscr{A}_V$ is exactly the Jordan block $J_{r}(\lambda_i)$. This is the operation performed on an arbitrary cyclic subspace $V_{ij}$: after choosing a Jordan chain in every subspace, take all the Jordan chains together as a new basis of $\mathbb{C}^n$; this gives $P$. By the principle of matrix similarity, changing from the standard basis to the full collection of Jordan chains yields the Jordan canonical form.

Using the primary and cyclic decompositions here is not arbitrary. The primary decomposition splits the space according to the eigenvalues of the matrix: separating the parts with equal eigenvalues ensures that each subspace, and each corresponding sub-block, deals with a single eigenvalue, which guarantees that **the eigenvalues can still be put to use in this decomposition**.

Next, each primary subspace is cyclically decomposed, **making full and direct use of the concept of invariant subspaces** (rather than extracting only the tiny bit of invariant-subspace information carried by eigenvectors), splitting each subspace into cyclic subspaces. But the cyclic decomposition produces a companion matrix, which is not simple enough: its last column involves the coefficients of the binomial expansion of $(\lambda-\lambda_i)^r$. If we shift the cyclic basis appropriately, we find that the subspace is also generated under the transformation $\mathscr{B}_V = \mathscr{A}_V-\lambda$, whose matrix in the new basis is exactly an $L_n$, giving a very clean representation.

If we completely reverse the order of a Jordan chain, from $\beta_1,\beta_2,\cdots,\beta_n$ to $\beta_n,\beta_{n-1},\cdots,\beta_1$, the resulting canonical form $J'$ is the transpose of $J$. Hence both $J$ and $J^{T}$ can serve as the Jordan canonical form.

Let the characteristic polynomial of $A$ be $f(\lambda)=(\lambda-\lambda_1)^{d_1}\cdots(\lambda-\lambda_s)^{d_s}$ (where $\lambda_1,\lambda_2,\cdots,\lambda_s$ are distinct). First, perform the primary decomposition of the matrix $A$; that is, we first have

where the characteristic polynomial of $A_i$ is $(\lambda-\lambda_i)^{d_i}$.

Next perform the cyclic decomposition of each $A_i$; that is,

and then consider the blocks $A_{ij}$. The minimal polynomial of $A_{ij}$ must be of the form $(\lambda-\lambda_i)^{k_{ij}}$. Setting $B_{ij}=A_{ij}-\lambda_iI$, the minimal polynomial of $B_{ij}$ is $\lambda^{k_{ij}}$, so by the cyclic decomposition it is similar to the companion matrix $C(\lambda^{k_{ij}})$; that is, there exists an invertible matrix $U_{ij}$ such that

and hence

which is exactly a Jordan block. Once all the $U_{ij}$ have been found, the Jordan canonical form follows. Choosing the $U_{ij}$ is precisely choosing the Jordan chains.

]]>In this post we discussed the topological properties of the zero points of an entire nonzero function, or roughly, what those points look like. The set of zero points contains no limit point and is at most countable (countable or finite). So if it is finite, we can find the zeros one by one. For example, the function $f(z)=z$ has exactly one zero point. But what if it is countably infinite? How fast does the number of zeros grow?

Another question. Suppose we have an entire function $f$, and the zeros of $f$, namely $z_1,z_2,\cdots,z_n$, are ordered increasingly by moduli:

Is it possible to get a fine enough estimation of $|z_n|$? Interestingly enough, we can get there with the help of Jensen's formula.

Suppose $\Omega=D(0;R)$, $f \in H(\Omega)$, $f(0) \neq 0$, $0<r<R$, and $z_1,z_2,\cdots,z_{n(r)}$ are the zeros of $f$ in $\overline{D}(0;r)$; then

There is no need to worry about the assumption $f(0) \neq 0$. Take another look at this proof. Every zero point $a$ has a unique positive number $m$ such that $f(z)=(z-a)^mg(z)$ and $g \in H(\Omega)$ but $g(a) \neq 0$. The number $m$ is called the order of the zero at $a$. Therefore if we have $f(0)=0$ we can simply consider another function, namely $\frac{f}{z^m}$ where $m$ is the order of zero at $0$.

We are not proving this identity at this point. But it can be done by considering the following function

where $m$ is found by ordering the $z_j$ so that $z_1,\cdots,z_m \in D(0;r)$ and $|z_{m+1}|=\cdots=|z_{n}|=r$. One can prove the identity by evaluating $|g(0)|$ as well as $\log|g(re^{i\theta})|$.
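Jensen's formula is easy to verify numerically for a function whose zeros are known; the sketch below uses the illustrative choice $f(z)=(z-a)(z-b)$ with both zeros inside the unit disc:

```python
import cmath
import math

# Numerically verify Jensen's formula for an illustrative f with known
# zeros: f(z) = (z - a)(z - b), r = 1, both zeros inside the unit disc.
a, b, r = 0.3, 0.5j, 1.0
f = lambda z: (z - a) * (z - b)

# Left side: log|f(0)| + sum over zeros of log(r/|z_k|).
lhs = math.log(abs(f(0))) + math.log(r / abs(a)) + math.log(r / abs(b))

# Right side: (1/2π) ∫ log|f(re^{iθ})| dθ, by the periodic trapezoid rule.
samples = 20000
rhs = sum(math.log(abs(f(r * cmath.exp(2j * math.pi * j / samples))))
          for j in range(samples)) / samples

assert abs(lhs - rhs) < 1e-6
print(lhs, rhs)
```

For this particular $f$ both sides equal $0$, since $\log|f(0)|$ exactly cancels the two $\log(r/|z_k|)$ terms.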

For simplicity we shall assume $f(0)=1$, which causes no loss of generality. Let

and $n(r)$ be the number of zeros of $f$ in $\overline{D}(0;r)$. By the maximum modulus theorem, we have

If we insert Jensen’s formula into this inequality and order $|z_n|$ by increasing moduli, we get

which implies

So $n(r)$ is controlled by $M(2r)$. The second and third inequalities look tricky and require more explanation.

First we should notice that $|z_n| \leq 2r$ for every zero counted here; hence $\log\frac{2r}{|z_n|} \geq \log1=0$ for each such $z_n$, and the second inequality follows. For the third one, we simply have

So this is it: the rapidity with which $n(r)$ can grow is dominated by $M(r)$. Namely, the number of zeros of $f$ in the closed disc with radius $r$ is controlled by the maximum modulus of $f$ on a circle with a bigger radius.

Let's begin with a simple example. Let $f(z)=1$; then $M(r)=1$ for all $r$, but also $n(r)=0$, so this estimate tells us nothing. Indeed, as long as $M(r)$ is bounded by a constant, $f(z)$ is bounded; then by Liouville's theorem $f(z)$ is constant and the estimate is vacuous.

But if $M(r)$ grows properly, things become interesting. For example, if we have

where $A$ and $k$ are given positive numbers, we have a good enough estimation by

This estimation becomes interesting if we consider the logarithm of $n(r)$ and $r$, that is

If we have $f(z)=1-\exp(z^k)$ where $k$ is a positive integer, we have $n(r) \sim \frac{kr^k}{\pi}$, also
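The asymptotic count for $f(z)=1-\exp(z^k)$ can be reproduced directly: the zeros solve $z^k=2\pi i m$ for $m\in\mathbb{Z}$, giving $z=0$ with multiplicity $k$ plus $k$ roots of modulus $(2\pi|m|)^{1/k}$ for each $m\neq 0$:

```python
import math

# Count the zeros of f(z) = 1 - exp(z^k) in |z| <= r and compare with the
# asymptotic n(r) ~ k r^k / π quoted above. The zeros solve z^k = 2πi m,
# m ∈ Z: z = 0 with multiplicity k, plus k roots of modulus (2π|m|)^{1/k}
# for each m ≠ 0.
def n(r, k):
    m_max = math.floor(r ** k / (2.0 * math.pi))   # largest |m| with zeros inside
    return k + 2 * m_max * k                        # z = 0 (order k) + the rest

r, k = 10.0, 2
ratio = n(r, k) / (k * r ** k / math.pi)
assert abs(ratio - 1.0) < 0.1                       # within 10% already at r = 10
print(n(r, k), ratio)
```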

We'll see here how to evaluate a lower bound of $|z_{n(r)}|$ using Jensen's formula, provided that $M(r)$, or simply an upper bound of $f(z)$, is properly described. Without loss of generality we shall assume that $f(0)=1$. Also, we assume that the zero points of $f(z)$ are ordered by increasing moduli.

First we still consider

and see what will happen.

By Jensen’s, we have

This gives

By the arrangement of $\{z_n\}$, we have

Another example is when we have

where $\Im{z}$ denotes the imaginary part of $z$.

We shall notice that in this case,

Following Jensen’s formula, we therefore have

We are treating linear ODE problems as an example of quotient spaces in this blog post. You are assumed to be able to solve linear ODEs without theoretical difficulties, and to have some basic understanding of normal subgroups.

Let $X$ be a vector space, and $N$ a subspace of it. Naturally $N$ is normal in $X$, since $X$ is abelian as an additive group. Define

$$\pi(x) = x + N$$

for $x \in X$; then the collection of sets $\{\pi(x) : x \in X\}$ is the *quotient space* of $X$ *modulo* $N$, in which case we write $X/N$. Addition and scalar multiplication are defined by

$$\pi(x) + \pi(y) = \pi(x+y), \qquad \alpha\pi(x) = \pi(\alpha x).$$
This is well-defined since $N$ is a subspace. The kernel of $\pi$, which is also the origin of $X/N$, is $N = 0 + N$ itself. People call $\pi$ the **canonical map**.
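A minimal concrete sketch of these definitions (the example $X = \mathbb{R}^2$, $N = \operatorname{span}\{(1,1)\}$ and all names are mine, not from the post): each coset $x + N$ is pinned down by the canonical representative $x - x_1(1,1) = (0,\, x_2 - x_1)$, and the coset operations reduce to operations on representatives.

```python
# X = R^2, N = span{(1, 1)}; a coset x + N is represented canonically
# by x - x_1*(1, 1) = (0, x_2 - x_1).

def pi(x):
    """Canonical map X -> X/N; returns the canonical coset representative."""
    x1, x2 = x
    return (0.0, x2 - x1)

def coset_add(cx, cy):
    """Addition on X/N, defined by pi(x) + pi(y) = pi(x + y)."""
    return (cx[0] + cy[0], cx[1] + cy[1])

x, y = (3.0, 5.0), (1.0, -2.0)

# Well-defined: shifting x by an element of N does not change pi(x).
assert pi((x[0] + 7.0, x[1] + 7.0)) == pi(x)
# The rule pi(x) + pi(y) = pi(x + y) holds on representatives.
assert coset_add(pi(x), pi(y)) == pi((x[0] + y[0], x[1] + y[1]))
# The origin of X/N is N itself: pi sends every element of N there.
assert pi((4.0, 4.0)) == (0.0, 0.0)
print("quotient-space checks passed")
```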

If one solves a first-order linear ODE system of $n$ equations on an interval $(a,b)$, namely

$$\frac{d\mathbf{y}}{dx} = \mathbf{Ay} + \mathbf{f},$$

where $\mathbf{A}$ is a constant matrix, they will find that the solution can be written as

$$\mathbf{y}(x) = \mathbf{\Phi}(x)\mathbf{c} + \mathbf{\Phi}(x)\int_{x_0}^{x}\mathbf{\Phi}(t)^{-1}\mathbf{f}(t)\,dt,$$
where $\mathbf{c}$ is a given constant vector, $\mathbf{\Phi}(x)=\begin{bmatrix}\mathbf{\varphi_1},\mathbf{\varphi_2},\cdots,\mathbf{\varphi_n}\end{bmatrix}$, and $\{\mathbf{\varphi_i}\}$ are the fundamental solutions of $\frac{d\mathbf{y}}{dx}=\mathbf{Ay}$. We’ll translate this into the language of quotient spaces. We have to show that **the solutions of the equation form an element of $X/N$**.

So where is the $X$? It suffices to pick $\mathcal{C}^n$, the space of all functions $\mathbf{y}=(y_1,y_2,\cdots,y_n)^T$ such that each $y_k$ is $n$ times differentiable. The crux therefore becomes finding $N$, namely the set of solutions of the homogeneous equation $\frac{d\mathbf{y}}{dx}=\mathbf{Ay}$, and we’ll show that it consists precisely of the functions $\mathbf{\Phi}(x)\mathbf{c}$.

Indeed, it is trivially verified that $N$ is a subspace, since $\frac{d}{dx}$ and matrix multiplication are linear. The question is: what does this vector space *look* like? Why does a fundamental set of solutions of this equation have exactly $n$ elements? Does that mean this space (denoted by $N$) has dimension $n$? Fortunately, the following isomorphism answers this question in the affirmative.

**Theorem.** The vector space $N$ is isomorphic to $\mathbb{R}^n$.

Pick and fix $x_0 \in (a,b)$. Picard’s existence and uniqueness theorem ensures that the initial value problem

$$\frac{d\mathbf{y}}{dx} = \mathbf{Ay}, \qquad \mathbf{y}(x_0) = \mathbf{c}$$

has a unique solution. Therefore we have a bijection

$$H: \mathbb{R}^n \to N, \qquad \mathbf{c} \mapsto \mathbf{y_c},$$

where $\mathbf{y_c}$ denotes the unique solution with $\mathbf{y_c}(x_0)=\mathbf{c}$.
It suffices to prove that $H$ is linear. Namely, we need to show that

$$H(\alpha\mathbf{c} + \beta\mathbf{d}) = \alpha H(\mathbf{c}) + \beta H(\mathbf{d}),$$

which is trivial, since $\alpha H(\mathbf{c}) + \beta H(\mathbf{d})$ also solves $\frac{d\mathbf{y}}{dx}=\mathbf{Ay}$ and takes the value $\alpha\mathbf{c}+\beta\mathbf{d}$ at $x_0$; by uniqueness it must coincide with $H(\alpha\mathbf{c}+\beta\mathbf{d})$.
Let’s see this solution again:

$$\mathbf{y}(x) = \mathbf{\Phi}(x)\mathbf{c} + \mathbf{\Phi}(x)\int_{x_0}^{x}\mathbf{\Phi}(t)^{-1}\mathbf{f}(t)\,dt.$$

For $\mathbf{c}=(c_1,c_2,\cdots,c_n)^T$, we have

$$\mathbf{\Phi}(x)\mathbf{c} = c_1\mathbf{\varphi_1} + c_2\mathbf{\varphi_2} + \cdots + c_n\mathbf{\varphi_n}.$$
Notice that $\{\mathbf{\varphi_k}\}$ is a **basis** of $N$, and there we have it: $\mathbf{c}$ can be any element of $\mathbb{R}^n$, hence $\mathbf{\Phi}(x)\mathbf{c}$ ranges over all elements of $N$. Also we know that the function

$$\mathbf{z}(x) = \mathbf{\Phi}(x)\int_{x_0}^{x}\mathbf{\Phi}(t)^{-1}\mathbf{f}(t)\,dt$$

is a special solution to $\frac{d\mathbf{y}}{dx}=\mathbf{Ay}+\mathbf{f}$. Thus we have the coset

$$\pi(\mathbf{z}) = \mathbf{z} + N = \{\mathbf{z} + \mathbf{\Phi}(x)\mathbf{c} : \mathbf{c} \in \mathbb{R}^n\}$$
to be the set of all solutions where $\pi(\mathbf{z}) \in X/N$.
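The coset picture can be verified on a hypothetical concrete instance (the choice of $\mathbf{A}$, $\mathbf{f}$ and all names below are mine, not from the post): take $\mathbf{A}=\begin{bmatrix}0&1\\-1&0\end{bmatrix}$ and $\mathbf{f}=(1,0)^T$, so that $\mathbf{\Phi}(x)$ is the rotation matrix and $\mathbf{z}=(0,-1)^T$ is one special solution (since $\mathbf{Az}+\mathbf{f}=\mathbf{0}=\frac{d\mathbf{z}}{dx}$); every member of $\pi(\mathbf{z})$ should then solve the full equation.

```python
import math

# A = [[0, 1], [-1, 0]], f = (1, 0); Phi(x) is the rotation matrix
# and z = (0, -1) is a constant special solution of y' = Ay + f.
A = [[0.0, 1.0], [-1.0, 0.0]]
f = [1.0, 0.0]
z = [0.0, -1.0]

def phi(x):
    return [[math.cos(x), math.sin(x)],
            [-math.sin(x), math.cos(x)]]

def mat_vec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def y(x, c):
    """An element of the coset pi(z): y = z + Phi(x) c."""
    hom = mat_vec(phi(x), c)
    return [z[0] + hom[0], z[1] + hom[1]]

# Verify y' = Ay + f by a central difference, for several c in R^2.
h = 1e-6
for c in ([1.0, 0.0], [0.0, 1.0], [2.5, -3.0]):
    for x in (0.0, 0.7, 2.0):
        deriv = [(a - b) / (2 * h) for a, b in zip(y(x + h, c), y(x - h, c))]
        rhs = [u + v for u, v in zip(mat_vec(A, y(x, c)), f)]
        assert all(abs(p - q) < 1e-5 for p, q in zip(deriv, rhs))
print("every member of the coset z + Phi(x)c solves y' = Ay + f")
```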
