# Partition of Unity on Different Manifolds (Part 1. Introduction)

## An application of partition of unity

A partition of unity builds a bridge between local properties and global properties. A nice example is Stokes' theorem on manifolds.

Suppose $\omega$ is an $(n-1)$-form with compact support on an oriented manifold $M$ of dimension $n$, and $\partial{M}$ is given the induced orientation. Then $\int_M d\omega=\int_{\partial{M}}\omega$.

This theorem can be proved in two steps. First, by Fubini's theorem, one proves the identity on $\mathbb{R}^n$ and $\mathbb{H}^n$. Second, for the general case, let $(U_\alpha)$ be an oriented atlas for $M$ and $(\rho_\alpha)$ a partition of unity subordinate to $(U_\alpha)$, so that one naturally writes $\omega=\sum_{\alpha}\rho_\alpha\omega$. Since both sides of $\int_M d\omega=\int_{\partial M}\omega$ are linear in $\omega$, it suffices to prove the identity for each $\rho_\alpha\omega$. Note that the support of $\rho_\alpha\omega$ is contained in the intersection of the supports of $\rho_\alpha$ and $\omega$, hence is compact.

On the other hand, since $U_\alpha$ is diffeomorphic to either $\mathbb{R}^n$ or $\mathbb{H}^n$, it is immediate that $\int_M d(\rho_\alpha\omega)=\int_{U_\alpha}d(\rho_\alpha\omega)=\int_{\partial U_\alpha}\rho_\alpha\omega=\int_{\partial{M}}\rho_\alpha\omega,$ which furnishes the proof for the general case.

As we have seen, to prove a global statement, we work locally. If you have trouble with this terminology, never mind: we will go through it right now (in a more abstract way, however). If you are already familiar with it, feel free to skip ahead.

## Prerequisites

### Manifold (of finite or infinite dimension)

Throughout, we use bold letters like $\mathbf{E}$, $\mathbf{F}$ to denote Banach spaces. We treat Euclidean spaces as a special case rather than as a restriction. Since Banach spaces are not necessarily finite-dimensional, this approach can be more troublesome, but the benefit is a better view of the abstraction.

Let $X$ be a set. An atlas of class $C^p$ ($p \geq 0$) on $X$ is a collection of pairs $(U_i,\varphi_i)$ where $i$ ranges through some indexing set, satisfying the following conditions:

AT 1. Each $U_i$ is a subset of $X$ and $\bigcup_{i}U_i=X$.

AT 2. Each $\varphi_i$ is a bijection of $U_i$ onto an open subset $\varphi_iU_i$ of some Banach space $\mathbf{E}_i$ and for any $i$ and $j$, $\varphi_i(U_i \cap U_j)$ is open in $\mathbf{E}_i$.

AT 3. The map $\varphi_j\circ\varphi_i^{-1}:\varphi_i(U_i \cap U_j) \to \varphi_j(U_i \cap U_j)$ is a $C^p$-isomorphism for all $i$ and $j$.

One should be advised that isomorphism here does not come from group theory, but from category theory. Precisely speaking, it is an isomorphism in the category $\mathfrak{O}$ whose objects are the open subsets of Banach spaces and whose morphisms are the maps of class $C^p$.

Also, one can give $X$ a topology in a unique way such that each $U_i$ is open and each $\varphi_i$ is a topological isomorphism. We see no need to assume that $X$ is Hausdorff unless we start with Hausdorff spaces. Lifting this restriction gives us more freedom (and sometimes more difficulty).

For condition AT 2, we did not require that the vector spaces be the same for all indexes $i$, or even that they be toplinearly isomorphic. If they are all equal to the same space $\mathbf{E}$, then we say that the atlas is an $\mathbf{E}$-atlas.

Suppose that we are given an open subset $U$ of $X$ and a topological isomorphism $\varphi:U \to U'$ onto an open subset of some Banach space $\mathbf{E}$. We shall say that $(U,\varphi)$ is compatible with the atlas $(U_i,\varphi_i)_i$ if each map $\varphi_i\circ\varphi^{-1}$ (defined on a suitable intersection, as in AT 3) is a $C^p$-isomorphism. Two atlases are said to be compatible if each chart of one is compatible with the other atlas. It can be verified that this is an equivalence relation. An equivalence class of atlases of class $C^p$ on $X$ is said to define a structure of $C^p$-manifold on $X$. If all the vector spaces $\mathbf{E}_i$ in some atlas are toplinearly isomorphic, we can find an equivalent atlas for which they are all equal to a single space $\mathbf{E}$. In this case, we say $X$ is an $\mathbf{E}$-manifold or that $X$ is modeled on $\mathbf{E}$.

As we know, $\mathbb{R}^n$ is a Banach space. If $\mathbf{E}=\mathbb{R}^n$ for some fixed $n$, then we say that the manifold is $n$-dimensional. Also we have the local coordinates. A chart $\varphi:U \to \mathbb{R}^n$ is given by $n$ coordinate functions $\varphi_1,\cdots,\varphi_n$. If $P$ denotes a point of $U$, these functions are often written $x_1(P),\cdots,x_n(P),$ or simply $x_1,\cdots,x_n$.

### Topological prerequisites

Let $X$ be a topological space. A covering $\mathfrak{U}$ of $X$ is locally finite if every point $x$ has a neighborhood $U$ that intersects only finitely many members of $\mathfrak{U}$ (as you will see, this prevents some nonsensical summations). A refinement of a covering $\mathfrak{U}$ is a covering $\mathfrak{U}'$ such that for any $U' \in \mathfrak{U}'$, there exists some $U \in \mathfrak{U}$ such that $U' \subset U$. If we write $\mathfrak{U} \leq \mathfrak{U}'$ in this case, we see that the set of open covers of a topological space forms a directed set.

A topological space is paracompact if it is Hausdorff and every open covering has a locally finite open refinement. Here are some examples of paracompact spaces.

1. Any compact Hausdorff space.
2. Any CW complex.
3. Any metric space (hence $\mathbb{R}^n$).
4. Any regular Hausdorff Lindelöf space.
5. Any locally compact Hausdorff $\sigma$-compact space.

These are not too difficult to prove, and one can easily find proofs on the Internet. Below are several key properties of paracompact spaces.

If $X$ is paracompact, then $X$ is normal. (Proof here)

Let $X$ be a paracompact (hence normal) space and $\mathfrak{U}=(U_i)$ a locally finite open covering. Then there exists a locally finite open covering $\mathfrak{V}=(V_i)$ such that $\overline{V_i} \subset U_i$. (Proof here. Note the axiom of choice is assumed.)

One can find proofs of the following propositions in Elements of Mathematics, General Topology, Chapters 1-4 by N. Bourbaki. It is interesting to compare them with the corresponding statements for compact spaces.

Every closed subspace $F$ of a paracompact space $X$ is paracompact.

The product of a paracompact space and a compact space is paracompact.

Let $X$ be a locally compact paracompact space. Then every open covering $\mathfrak{R}$ of $X$ has a locally finite open refinement $\mathfrak{R}'$ formed of relatively compact sets. If $X$ is $\sigma$-compact then $\mathfrak{R}'$ can be taken to be countable.

### Partition of unity

A partition of unity (of class $C^p$) on a manifold $X$ consists of an open covering $(U_i)$ of $X$ and a family of functions $\psi_i:X \to \mathbb{R}$ satisfying the following conditions:

PU 1. For all $x \in X$ we have $\psi_i(x) \geq 0$.

PU 2. The support of $\psi_i$ is contained in $U_i$.

PU 3. The covering is locally finite.

PU 4. For each point $x \in X$ we have $\sum_{i}\psi_i(x)=1$.

The sum in PU 4 makes sense because, for a given point $x$, there are only finitely many $i$ such that $\psi_i(x) >0$, according to PU 2 and PU 3.

A manifold $X$ will be said to admit partitions of unity if it is paracompact, and if, given a locally finite open covering $(U_i)$, there exists a partition of unity $(\psi_i)$ such that the support of $\psi_i$ is contained in $U_i$.
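To make PU 1 to PU 4 concrete, here is a minimal Python sketch on the one-dimensional manifold $\mathbb{R}$. Everything in it is an illustrative assumption rather than part of the theory above: the covering is $U_i=(i-1,i+1)$ for integers $i$, the helper names `bump` and `psi` are made up, and the bump has support $[-0.75,0.75]$ so that the support of each $\psi_i$ stays inside $U_i$.

```python
import math

RADIUS = 0.75  # assumed support radius, chosen so that supp(psi_i) lies in (i-1, i+1)

def bump(x: float) -> float:
    """Smooth bump: positive on (-RADIUS, RADIUS), zero elsewhere."""
    if abs(x) >= RADIUS:
        return 0.0
    return math.exp(-1.0 / (RADIUS * RADIUS - x * x))

def psi(i: int, x: float) -> float:
    """psi_i(x) = bump(x - i) / sum_j bump(x - j), subordinate to U_i = (i-1, i+1)."""
    k = math.floor(x)
    # Local finiteness: only the integers near x can contribute to the sum.
    total = sum(bump(x - j) for j in range(k - 1, k + 3))
    return bump(x - i) / total

if __name__ == "__main__":
    for x in (-2.3, 0.0, 0.5, 1.999, 7.25):
        values = [psi(i, x) for i in range(-10, 11)]
        nonzero = sum(1 for v in values if v > 0.0)
        print(f"x={x}: sum={sum(values)}, nonzero terms={nonzero}")
```

The denominator never vanishes because every real number is within $\frac{1}{2}$ of some integer, and at each point only one or two of the $\psi_i$ are nonzero, which is exactly why the sum in PU 4 is a finite sum.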

### Bump function

This function will be useful when dealing with the finite-dimensional case.

For every positive integer $n$ and every real number $\delta>0$ there exist maps $\psi_n \in C^{\infty}(\mathbb{R}^n;\mathbb{R})$ which equal $1$ on $B(0,1)$ and vanish in $\mathbb{R}^n\setminus B(0,1+\delta)$.

Proof. It suffices to prove it for $\mathbb{R}$ since, once we have proved the existence of $\psi_1$, we may write $\psi_n(x_1,x_2,\cdots,x_n)=\psi_1(\sqrt{x_1^2+x_2^2+\cdots+x_n^2}).$ Consider the function $\phi: \mathbb{R} \to \mathbb{R}$ defined by $\phi(t)= \begin{cases} \exp\left(\frac{1}{(t-a)(t-b)}\right)&\quad\text{if } a<t<b,\\ 0 &\quad \text{otherwise}. \end{cases}$ The reader may have seen it in some analysis course and should be able to check that $\phi \in C^{\infty}(\mathbb{R};\mathbb{R})$. Integrate $\phi$ from $-\infty$ to $x$ and divide by $\lVert \phi \rVert_1$ (you may have done this in probability theory) to obtain $\theta(x)=\frac{\int_{-\infty}^{x}\phi(t)dt}{\int_{-\infty}^{+\infty}\phi(t)dt};$ it is immediate that $\theta$ is smooth, that $\theta(x)=0$ for $x \leq a$ and that $\theta(x)=1$ for $x \geq b$. Taking $a=1$ and $b=(1+\delta)^2$, our job is done by letting $\psi_1(x)=1-\theta(x^2)$. Since $x^2=|x|^2$, we get $\psi_n(x)=1-\theta(|x|^2)$ directly, which is smooth because $|x|^2$ is a polynomial; so the square root in the identity relating $\psi_n$ and $\psi_1$ causes no trouble at the origin. $\square$
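The construction in the proof can be followed numerically. The sketch below is only an illustration under stated assumptions: the integrals defining $\theta$ are approximated by a composite midpoint rule (so the values in the transition region are approximate), and the function names `phi`, `theta`, `psi` and the choice $\delta=0.5$ in the demo are ours.

```python
import math

def phi(t: float, a: float, b: float) -> float:
    """exp(1/((t-a)(t-b))) on (a, b), zero elsewhere."""
    if a < t < b:
        return math.exp(1.0 / ((t - a) * (t - b)))
    return 0.0

def integrate(f, lo: float, hi: float, n: int = 2000) -> float:
    """Composite midpoint rule with n subintervals (an approximation)."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

def theta(x: float, a: float, b: float) -> float:
    """Normalized integral of phi: 0 for x <= a, 1 for x >= b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    f = lambda t: phi(t, a, b)
    return integrate(f, a, x) / integrate(f, a, b)

def psi(point, delta: float) -> float:
    """psi_n(x) = 1 - theta(|x|^2) with a = 1 and b = (1 + delta)^2."""
    r2 = sum(c * c for c in point)
    return 1.0 - theta(r2, 1.0, (1.0 + delta) ** 2)

if __name__ == "__main__":
    for p in [(0.5,), (1.0,), (1.2,), (1.5,), (0.6, 0.6), (1.5, 1.5)]:
        print(p, psi(p, 0.5))
```

Note that `psi` takes a point of any dimension and never computes a square root, which mirrors the last remark of the proof.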

In the following blog posts, we will generalize this to Hilbert spaces.

## Is partition of unity ALWAYS available?

Of course this is desirable. But we will give an example showing that sometimes no satisfactory partition of unity exists.

Let $D$ be a connected bounded open set in $\ell^p$ where $p$ is not an even integer. Assume $f$ is a real-valued function, continuous on $\overline{D}$ and $n$-times differentiable in $D$ with $n \geq p$. Then $f(\overline{D}) \subset \overline{f(\partial D)}$.

(Corollary) Let $f$ be an $n$-times differentiable function on $\ell^p$ space, where $n \geq p$, and $p$ is not an even integer. If $f$ has its support in a bounded set, then $f$ is identically zero.

It follows that for $n \geq p$, $C^n$ partitions of unity do not exist on $\ell^p$ whenever $p$ is not an even integer. For example, $\ell^1$ does not have a $C^2$ partition of unity. It is then our duty to find out under what conditions the desired partition of unity is available.

## Existence of partition of unity

Below are two theorems about the existence of partitions of unity. We will not prove them here but in a future blog post, since the proofs are rather long. The restrictions on $X$ are acceptable. For example, $\mathbb{R}^n$ is locally compact, and hence so is any manifold modeled on $\mathbb{R}^n$.

Let $X$ be a manifold which is locally compact Hausdorff and whose topology has a countable base. Then $X$ admits partitions of unity.

Let $X$ be a paracompact manifold of class $C^p$, modeled on a separable Hilbert space $\mathbf{E}$. Then $X$ admits partitions of unity (of class $C^p$).

## References

• N. Bourbaki, Elements of Mathematics
• S. Lang, Fundamentals of Differential Geometry
• M. Berger, Differential Geometry: Manifolds, Curves, and Surfaces
• R. Bonic and J. Frampton, Differentiable Functions on Certain Banach Spaces

# Several Classical Estimates for Stirling's Formula

## Estimating a definite integral

### The sequence

Now consider a short exact sequence of differential complexes $0 \rightarrow A \xrightarrow{f} B \xrightarrow{g} C \rightarrow 0$ where both $f$ and $g$ are chain maps (this is important). Then there exists a long exact sequence given by $\cdots\rightarrow H^q(A) \xrightarrow{f^*} H^{q}(B) \xrightarrow{g^*} H^q(C)\xrightarrow{d^{*}}H^{q+1}(A) \xrightarrow{f^*}\cdots.$ Here, $f^{\ast}$ and $g^{\ast}$ are the naturally induced maps. For a closed $c \in C^q$, $d^{\ast}[c]$ is defined to be the cohomology class $[a]$, where $a \in A^{q+1}$ satisfies $f(a)=db$ for some $b \in B^q$ with $g(b)=c$. The sequence can be described using the two-layer commutative diagram below.

The long exact sequence is actually the purple one (you see why people may call this the zig-zag lemma). This sequence is 'based on' the blue diagram, which can be considered naturally as an expansion of the short exact sequence. The method that will be used in the following proof is called diagram-chasing, whose importance has already been described by Professor James Munkres: master this. We will be abusing the properties of almost every homomorphism and group appearing in this commutative diagram to trace the elements.

#### Proof

First, we give a precise definition of $d^{\ast}$. For a closed $c \in C^q$, by the surjectivity of $g$ (note this sequence is exact), there exists some $b \in B^q$ such that $g(b)=c$. Since $g(db)=d(g(b))=dc=0$, the element $db \in B^{q+1}$ lies in $\ker g$. By the exactness of the sequence, $db \in \operatorname{im}{f}$, that is, there exists some $a \in A^{q+1}$ such that $f(a)=db$. Further, $a$ is closed since $f(da)=d(f(a))=d^2b=0$ and we already know that $f$ has trivial kernel (which contains $da$).

$d^{\ast}$ is therefore defined by $d^*[c]=[a],$ where $[\cdot]$ means "the cohomology class of".

But it is expected that $d^{\ast}$ is a well-defined homomorphism. Let $c_q$ and $c_q'$ be two closed elements of $C^q$. To show $d^{\ast}$ is well-defined, we suppose $[c_q]=[c_q']$ (i.e. they are cohomologous). Choose $b_q$ and $b_q'$ so that $g(b_q)=c_q$ and $g(b_q')=c_q'$. Accordingly, we also pick $a_{q+1}$ and $a_{q+1}'$ such that $f(a_{q+1})=db_q$ and $f(a_{q+1}')=db_q'$. By definition of $d^{\ast}$, we need to show that $[a_{q+1}]=[a_{q+1}']$.

Recall the properties of factor groups: $[c_q]=[c_q']$ if and only if $c_q-c_q' \in \operatorname{im}d$. Therefore we can pick some $c_{q-1} \in C^{q-1}$ such that $c_q-c_q'=dc_{q-1}$. Again, by the surjectivity of $g$, there is some $b_{q-1}$ such that $g(b_{q-1})=c_{q-1}$.

Note that $\begin{aligned} g(b_q-b_q'-db_{q-1})&=c_q-c_{q}'-g(db_{q-1}) \\ &=dc_{q-1}-d(g(b_{q-1})) \\ &=dc_{q-1}-dc_{q-1} \\ &= 0. \end{aligned}$ Therefore $b_q-b_q'-db_{q-1} \in \operatorname{im} f$. We are able to pick some $a_q \in A^{q}$ such that $f(a_q)=b_q-b_q'-db_{q-1}$. But now we have $\begin{aligned} f(da_q)=df(a_q)&=d(b_q-b_q'-db_{q-1}) \\ &=db_q-db_q'-d^2b_{q-1} \\ &=db_q-db_q' \\ &=f(a_{q+1}-a_{q+1}'). \end{aligned}$ Since $f$ is injective, we have $da_q=a_{q+1}-a_{q+1}'$, which implies that $a_{q+1}-a_{q+1}' \in \operatorname{im}d$. Hence $[a_{q+1}]=[a_{q+1}']$.

To show that $d^{\ast}$ is a homomorphism, note that $g(b_q+b_q')=c_q+c_q'$ and $f(a_{q+1}+a_{q+1}')=d(b_q+b_q')$. Thus we have $d^*[c_q+c_q']=[a_{q+1}+a_{q+1}'].$ The latter equals $[a_{q+1}]+[a_{q+1}']$ since the canonical map is a homomorphism. Therefore we have $d^*[c_q+c_q']=d^*[c_q]+d^*[c_q'].$ Therefore the long sequence exists. It remains to prove exactness.

First we prove exactness at $H^q(B)$. Pick $[b] \in H^q(B)$. If $[b]=f^{\ast}[a]$ for some closed $a \in A^q$, then $g(f(a))=0$ by the exactness of the short sequence. Therefore $g^{\ast}[b]=g^{\ast}[f(a)]=[g(f(a))]=[0]$; hence $\operatorname{im}f^{\ast} \subset \ker g^{\ast}$.

Conversely, suppose now $g^{\ast}[b]=[0]$; we shall show that there exists some $[a] \in H^q(A)$ such that $f^{\ast}[a]=[b]$. Note $g(b) \in \operatorname{im}d$ where $d$ is the differential operator of $C$ (why?). Therefore there exists some $c_{q-1} \in C^{q-1}$ such that $g(b)=dc_{q-1}$. Pick some $b_{q-1}$ such that $g(b_{q-1})=c_{q-1}$. Then we have $g(b-db_{q-1})=g(b)-d(g(b_{q-1}))=g(b)-dc_{q-1}=0.$

Therefore $f(a)=b-db_{q-1}$ for some $a \in A^q$. Note $a$ is closed since $f(da)=df(a)=d(b-db_{q-1})=db-d^2b_{q-1}=db=0$ and $f$ is injective. Here $db=0$ because $b$ is closed, being a representative of a class in $H^q(B)$. Furthermore, $f^*[a]=[f(a)]=[b-db_{q-1}]=[b]-[0]=[b].$ Therefore $\ker g^{\ast} \subset \operatorname{im} f^{\ast}$ as desired.

Now we prove exactness at $H^q(C)$. (Notation:) pick $[c_q] \in H^q(C)$; there exists some $b_q$ such that $g(b_q)=c_q$; choose $a_{q+1}$ such that $f(a_{q+1})=db_q$. Then $d^{\ast}[c_q]=[a_{q+1}]$ by definition.

If $[c_q] \in \operatorname{im}g^{\ast}$, then, since $d^{\ast}$ is well-defined, we may assume $c_q=g(b_q)$ with $b_q$ closed, so that $[c_q]=[g(b_q)]=g^{\ast}[b_q]$. Since $b_q$ is closed, $f(a_{q+1})=db_q=0$, hence $a_{q+1}=0$ because $f$ is injective, and therefore $d^{\ast}[c_q]=[a_{q+1}]=[0]$. Therefore $\operatorname{im}g^{\ast} \subset \ker d^{\ast}$.

Conversely, suppose $d^{\ast}[c_q]=[0]$. By definition of $H^{q+1}(A)$, there is some $a_q \in A^q$ such that $da_q = a_{q+1}$ (can you see why?). We claim that $b_q-f(a_q)$ is closed and we have $[c_q]=g^{\ast}[b_q-f(a_q)]$.

By direct computation, $d(b_q-f(a_q))=db_q-d(f(a_q))=db_q-f(d(a_q))=db_q-f(a_{q+1})=0.$ Meanwhile, since $g(f(a_q))=0$ by exactness, $g^*[b_q-f(a_q)]=[g(b_q)]-[g(f(a_q))]=[c_q].$ Therefore $\ker d^{\ast} \subset \operatorname{im}g^{\ast}$.

Finally, we prove exactness at $H^{q+1}(A)$. Pick $\alpha \in H^{q+1}(A)$. If $\alpha \in \operatorname{im}d^{\ast}$, then $\alpha=[a_{q+1}]$ where $f(a_{q+1})=db_q$ by definition. Then $f^*(\alpha)=[f(a_{q+1})]=[db_q]=[0].$ Therefore $\alpha \in \ker f^{\ast}$. Conversely, if we have $f^{\ast}(\alpha)=[0]$, pick a representative of $\alpha$, namely write $\alpha=[a]$; then $[f(a)]=[0]$. This implies that $f(a) \in \operatorname{im}d$ where $d$ denotes the differential operator of $B$, so there exists some $b_q \in B^q$ such that $db_{q}=f(a)$. Put $c_q=g(b_q)$. Then $c_q$ is closed since $dc_q=g(db_q)=g(f(a))=0$. By definition, $\alpha=d^{\ast}[c_q]$. Therefore $\ker f^{\ast} \subset \operatorname{im}d^{\ast}$.

### Remarks

As you may see, almost every property of the diagram has been used. The exactness at $B^q$ ensures that $g(f(a))=0$. The definition of $H^q(A)$ ensures that we can simplify the meaning of $[0]$. We even use the injectivity of $f$ and the surjectivity of $g$.

This proof is also a demonstration of diagram-chasing technique. As you have seen, we keep running through the diagram to ensure that there is "someone waiting" at the destination.

This long exact sequence is useful. Here is an example.

## Application: Mayer-Vietoris Sequence

By differential forms on an open set $U \subset \mathbb{R}^n$, we mean $\Omega^*(U)=\{C^{\infty}\text{ functions on }U\}\otimes_\mathbb{R}\Omega^*.$ The de Rham cohomology of $U$ then arises naturally.

We are able to compute the cohomology of the union of two open sets. Suppose $M=U \cup V$ is a manifold with $U$ and $V$ open, and $U \amalg V$ is the disjoint union of $U$ and $V$ (the coproduct in the category of sets). Let $\partial_0$ and $\partial_1$ be the inclusions of $U \cap V$ in $U$ and $V$ respectively. We have a natural sequence of inclusions $M \leftarrow U\amalg V \underset{\partial_1}{\overset{\partial_0}{\leftleftarrows}} U \cap V.$ Since $\Omega^{*}$ can also be treated as a contravariant functor from the category of Euclidean spaces with smooth maps to the category of commutative differential graded algebras and their homomorphisms, we have $\Omega^*(M) \rightarrow \Omega^*(U) \oplus \Omega^*(V) \underset{\partial^*_1}{\overset{\partial^*_0}{\rightrightarrows}}\Omega^*({U \cap V}).$ By taking the difference of the last two maps, we have $\begin{aligned} 0 \rightarrow \Omega^*(M) \rightarrow \Omega^*(U) \oplus \Omega^*(V) &\rightarrow \Omega^*(U \cap V) \rightarrow 0 \\ (\omega,\tau) &\mapsto \tau-\omega \end{aligned}$ The sequence above is a short exact sequence (surjectivity of the last map is where a partition of unity subordinate to $\{U,V\}$ comes in). Therefore we may use the zig-zag lemma to find a long exact sequence (also called the Mayer-Vietoris sequence) given by $\cdots\to H^q(M) \to H^q(U) \oplus H^q(V) \to H^q(U \cap V) \xrightarrow{d^*} H^{q+1}(M) \to \cdots$

### An example

This sequence allows one to compute the cohomology of the union of two open sets. For example, for $H^{*}_{DR}(\mathbb{R}^2-P-Q)$, where $P(x_p,y_p)$ and $Q(x_q,y_q)$ are two distinct points in $\mathbb{R}^2$, we may write $(\mathbb{R}^2-P)\cap(\mathbb{R}^2-Q)=\mathbb{R}^2-P-Q$ and $(\mathbb{R}^2-P)\cup(\mathbb{R}^2-Q)=\mathbb{R}^2.$ Therefore we may write $M=\mathbb{R}^2$, $U=\mathbb{R}^2-P$ and $V=\mathbb{R}^2-Q$. For $U$ and $V$, we have another decomposition given by $\mathbb{R}^2-P=(\mathbb{R}^2-P_x)\cup(\mathbb{R}^2-P_y)$ where $P_x=\{(x,y_p):x \in \mathbb{R}\}$ and $P_y=\{(x_p,y):y \in \mathbb{R}\}.$ But $(\mathbb{R}^2-P_x)\cap(\mathbb{R}^2-P_y)$ is a disjoint union of four (homeomorphic) copies of $\mathbb{R}^2$. So things become clear after we compute $H^{\ast}_{DR}(\mathbb{R}^2)$.
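Once the cohomology of the pieces is known, exactness alone pins down the remaining dimensions, because in a finite exact sequence of finite-dimensional vector spaces the alternating sum of the dimensions is zero. The following Python sketch does this bookkeeping for $M=\mathbb{R}^2$, $U=\mathbb{R}^2-P$, $V=\mathbb{R}^2-Q$. The input dimensions (those of $H^*_{DR}(\mathbb{R}^2)$ and of the punctured plane) are assumed here as standard facts and are not derived in this post; `solve_unknown` is a hypothetical helper name.

```python
def solve_unknown(dims):
    """Fill in the single None in the dimensions of an exact sequence
    0 -> V_0 -> V_1 -> ... -> V_k -> 0 of finite-dimensional vector spaces,
    using that the alternating sum of the dimensions vanishes."""
    unknown = dims.index(None)
    partial = sum((-1) ** i * d for i, d in enumerate(dims) if d is not None)
    # partial + (-1)^unknown * x = 0
    x = -partial * (-1) ** unknown
    if x < 0:
        raise ValueError("these dimensions cannot come from an exact sequence")
    return x

# Assumed inputs: dim H^q for q = 0, 1, 2.
H_M = [1, 0, 0]  # M = R^2
H_U = [1, 1, 0]  # U = R^2 - P
H_V = [1, 1, 0]  # V = R^2 - Q

# 0 -> H^0(M) -> H^0(U)+H^0(V) -> H^0(U∩V) -> H^1(M) = 0
b0 = solve_unknown([H_M[0], H_U[0] + H_V[0], None])
# 0 = H^1(M) -> H^1(U)+H^1(V) -> H^1(U∩V) -> H^2(M) = 0
b1 = solve_unknown([H_U[1] + H_V[1], None])

if __name__ == "__main__":
    print("dim H^0(R^2 - P - Q) =", b0)
    print("dim H^1(R^2 - P - Q) =", b1)
```

The long sequence is cut at the places where $H^q(M)$ vanishes, which is what makes each piece a short exact sequence with a single unknown.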

• Raoul Bott, Loring W. Tu, Differential Forms in Algebraic Topology
• Munkres J. R., Elements of Algebraic Topology
• Michael Spivak, Calculus on Manifolds
• Serge Lang, Algebra

# The Fourier transform of sinx/x and (sinx/x)^2 and more

## In this post

We are going to evaluate the Fourier transform of $\frac{\sin{x}}{x}$ and $\left(\frac{\sin{x}}{x}\right)^2$. It turns out to be a comprehensive application of many elementary theorems on functions of a single complex variable. Thus it is recommended to make sure that you can evaluate and understand all the identities in this post by yourself. Also, make sure that you can recall what all the words in italics mean.

To be clear, by Fourier transform we actually mean $\hat{f}(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}f(x)e^{-itx}dx.$ But we omit $\frac{1}{\sqrt{2\pi}}$ and use $e^{itx}$ in place of $e^{-itx}$ because it is easier to compute, and does not change the final result.

## Problem 1

For real $t$, find the limit $\lim_{A \to \infty}\int_{-A}^{A}\frac{\sin{x}}{x}e^{itx}dx.$

Since $\frac{\sin{x}}{x}e^{itx}\not\in L^1$, we cannot evaluate the integral of it over $\mathbb{R}$ directly since it's not defined. Instead, for given $A>0$, the integral of it over $[-A,A]$ is defined, and we evaluate this limit to get what we want.

We will do this using contour integration. Since the complex function $f(z)=\frac{\sin{z}}{z}e^{itz}$ is entire, by Cauchy's theorem, its integral over $[-A,A]$ is equal to the one over the path $\Gamma_A$ going from $-A$ to $-1$ along the real axis, from $-1$ to $1$ along the lower half of the unit circle, and from $1$ to $A$ along the real axis (why?). Since the path $\Gamma_A$ avoids the origin, we may use the identity $2i\sin{z}=e^{iz}-e^{-iz}$ and split the integral. Replacing $\sin{z}$ with $\frac{1}{2i}(e^{iz}-e^{-iz})$, we get $I_A(t)=\int_{\Gamma_A}f(z)dz=\int_{\Gamma_A}\frac{1}{2iz}(e^{i(t+1)z}-e^{i(t-1)z})dz.$ If we put $\varphi_A(t)=\int_{\Gamma_A}\frac{1}{2iz}e^{itz}dz$, we see $I_A(t)=\varphi_A(t+1)-\varphi_A(t-1)$. It is convenient to divide $\varphi_A$ by $\pi$, since we then get $\frac{1}{\pi}\varphi_A(t)=\frac{1}{2\pi i}\int_{\Gamma_A}\frac{e^{itz}}{z}dz$ and the factor $\frac{1}{2\pi i}$ is exactly the one appearing in the Cauchy and residue formulas.

Now, complete the path $\Gamma_A$ to a closed path in two ways: first, by the semicircle from $A$ to $-Ai$ to $-A$; second, by the semicircle from $A$ to $Ai$ to $-A$ (the two semicircles together form the circle of radius $A$). For simplicity we denote the closed path through $-Ai$ by $\Gamma_L$ and the one through $Ai$ by $\Gamma_U$. The path $\Gamma_L$ does not wind around the origin, so by Cauchy's theorem the integral of $\frac{e^{itz}}{z}$ over it is $0$, and thus $\frac{1}{\pi}\varphi_A(t)=\frac{1}{2\pi i}\int_{-\pi}^{0}\frac{\exp{(itAe^{i\theta})}}{Ae^{i\theta}}dAe^{i\theta}=\frac{1}{2\pi}\int_{-\pi}^{0}\exp{(itAe^{i\theta})}d\theta.$ Notice that $\begin{aligned} |\exp(itAe^{i\theta})|&=|\exp(itA(\cos\theta+i\sin\theta))| \\ &=|\exp(itA\cos\theta)|\cdot|\exp(-At\sin\theta)| \\ &=\exp(-At\sin\theta) \end{aligned}$

hence if $t\sin\theta>0$, we have $|\exp(iAte^{i\theta})| \to 0$ as $A \to \infty$. When $-\pi < \theta <0$ however, we have $\sin\theta<0$. Therefore we get $\frac{1}{\pi}\varphi_{A}(t)=\frac{1}{2\pi}\int_{-\pi}^{0}\exp(itAe^{i\theta})d\theta \to 0\quad (A \to \infty,t<0).$ (You should be able to prove the convergence above.) Also trivially $\varphi_A(0)=\frac{1}{2}\int_{-\pi}^{0}1d\theta=\frac{\pi}{2}.$ But what if $t>0$? Indeed, it would be difficult to obtain the limit using the integral over $[-\pi,0]$. But we have another path, namely the upper one.

Note that $\frac{e^{itz}}{z}$ is a meromorphic function in $\mathbb{C}$ with a pole at $0$. For such a function we have $\frac{e^{itz}}{z}=\frac{1}{z}\left(1+itz+\frac{(itz)^2}{2!}+\cdots\right)=\frac{1}{z}+it+\frac{(it)^2z}{2!}+\cdots,$ which implies that the residue at $0$ is $1$. By the residue theorem, applied to the closed path $\Gamma_U$ obtained by completing $\Gamma_A$ with the upper semicircle through $Ai$, $\begin{aligned} \frac{1}{2\pi{i}}\int_{\Gamma_U}\frac{e^{itz}}{z}dz&=\frac{1}{2\pi{i}}\int_{\Gamma_A}\frac{e^{itz}}{z}dz+\frac{1}{2\pi}\int_{0}^{\pi}\exp(itAe^{i\theta})d\theta \\ &=1\cdot\operatorname{Ind}_{\Gamma_U}(0)=1. \end{aligned}$ Note that we have used the change-of-variable formula as we did for the lower semicircle. $\operatorname{Ind}_{\Gamma_U}(0)$ denotes the winding number of $\Gamma_U$ around $0$, which is $1$ of course. The identity above implies $\frac{1}{\pi}\varphi_A(t)=1-\frac{1}{2\pi}\int_{0}^{\pi}\exp{(itAe^{i\theta})}d\theta.$ Thus if $t>0$, since $\sin\theta>0$ when $0<\theta<\pi$, we get $\frac{1}{\pi}\varphi_A(t)\to 1 \quad(A \to \infty,t>0).$ But as is already shown, $I_A(t)=\varphi_A(t+1)-\varphi_A(t-1)$. Since $\varphi_A(t)$ tends to $\pi$, $\frac{\pi}{2}$, $0$ for $t>0$, $t=0$, $t<0$ respectively, we conclude $\lim_{A\to\infty}I_A(t)= \begin{cases} \pi\quad &|t|<1, \\ 0 \quad &|t|>1 ,\\ \frac{\pi}{2} \quad &|t|=1. \end{cases}$
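If you want to sanity-check the limit without any complex analysis, you can integrate numerically for a large but finite $A$. The Python sketch below is only a rough check under assumptions of ours: it uses a composite Simpson rule, takes $A=200$, and exploits the fact that the integrand's imaginary part is odd, so that $I_A(t)=2\int_0^A\frac{\sin x}{x}\cos(tx)dx$. The names `simpson`, `sinc` and `truncated_I` are hypothetical. Since the convergence is only of order $\frac{1}{A}$, expect agreement to about two decimal places.

```python
import math

def simpson(f, lo: float, hi: float, n: int) -> float:
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    s += 4.0 * sum(f(lo + k * h) for k in range(1, n, 2))
    s += 2.0 * sum(f(lo + k * h) for k in range(2, n, 2))
    return s * h / 3.0

def sinc(x: float) -> float:
    """sin(x)/x with the removable singularity at 0 filled in."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def truncated_I(A: float, t: float, n: int = 40000) -> float:
    """I_A(t): integral of sin(x)/x * e^{itx} over [-A, A], reduced to [0, A]."""
    return 2.0 * simpson(lambda x: sinc(x) * math.cos(t * x), 0.0, A, n)

if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.0):
        print(f"t={t}: I_A(t) ~ {truncated_I(200.0, t):.4f}")
    print("pi =", math.pi, " pi/2 =", math.pi / 2)
```

The value at $t=1$ is the interesting one: the truncated integrals settle near $\frac{\pi}{2}$, halfway between the two one-sided limits.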

### What we can learn from this integral

Since $\psi(x)=\frac{\sin{x}}{x}$ is even, multiplying the limit of $I_A$ by $\frac{1}{\sqrt{2\pi}}$, we actually obtain its Fourier transform, by abuse of language. Therefore we also get $\hat\psi(t)= \begin{cases} \sqrt{\frac{\pi}{2}}\quad & |t|<1, \\ 0 \quad & |t|>1, \\ \frac{1}{2}\sqrt{\frac{\pi}{2}} & |t|=1. \end{cases}$ Note that $\hat\psi(t)$ is not continuous, let alone uniformly continuous. Therefore, $\psi(x) \notin L^1$. The reason is, if $f \in L^1$, then $\hat{f}$ is uniformly continuous (proof). Another interesting fact is that this also gives the value of the Dirichlet integral, since we have $\begin{aligned} \int_{-\infty}^{\infty}\left(\frac{\sin{x}}{x}\right)dx&=\int_{-\infty}^{\infty}\left(\frac{\sin{x}}{x}\right)e^{0\cdot ix}dx \\ &=\sqrt{2\pi}\hat\psi(0) \\ &=\pi. \end{aligned}$ We end this section by evaluating the inverse of $\hat\psi(t)$. This requires a simple calculation. $\begin{aligned} \sqrt{\frac{1}{2\pi}}\int_{-\infty}^{\infty}\hat\psi(t)e^{itx}dt &= \sqrt{\frac{1}{2\pi}}\int_{-1}^{1}\sqrt{\frac{\pi}{2}}e^{itx}dt \\ &=\frac{1}{2}\cdot\frac{1}{ix}(e^{ix}-e^{-ix}) \\ &=\frac{\sin{x}}{x}. \end{aligned}$

## Problem 2

For real $t$, compute $J(t)=\int_{-\infty}^{\infty}\left(\frac{\sin{x}}{x}\right)^2e^{itx}dx.$

Now since $h(x)=\frac{\sin^2{x}}{x^2} \in L^1$, we are able to say with ease that the integral above is the Fourier transform of $h(x)$ (multiplied by $\sqrt{2\pi}$). But we will still use the limit form $J(t)=\lim_{A \to \infty}J_A(t)$ where $J_A(t)=\int_{-A}^{A}\left(\frac{\sin{x}}{x}\right)^2e^{itx}dx.$ And we again use contour integration as above (keep $\Gamma_A$, $\Gamma_U$ and $\Gamma_L$ in mind!). For this we get $\left(\frac{\sin z}{z}\right)^2e^{itz}=\frac{e^{i(t+2)z}+e^{i(t-2)z}-2e^{itz}}{-4z^2}.$ Therefore it suffices to discuss the function $\mu_A(t)=\int_{\Gamma_A}\frac{e^{itz}}{2z^2}dz$ since we have $J_A(t)=\mu_A(t)-\frac{1}{2}(\mu_A(t+2)+\mu_A(t-2)).$ Dividing $\mu_A(t)$ by $\pi i$, we see $\frac{1}{\pi i}\mu_A(t)=\frac{1}{2\pi i}\int_{\Gamma_A}\frac{e^{itz}}{z^2}dz.$ Closing $\Gamma_A$ by the lower semicircle (the path $\Gamma_L$), over which the integral of $\frac{e^{itz}}{z^2}$ vanishes by Cauchy's theorem, gives $\begin{aligned} \frac{1}{\pi i}\mu_A(t)&=\frac{1}{2\pi i}\int_{-\pi}^{0}\frac{\exp(itAe^{i\theta})}{A^2e^{2i\theta}}dAe^{i\theta} \\ &=\frac{1}{2\pi}\int_{-\pi}^{0}\frac{\exp(itAe^{i\theta})}{Ae^{i\theta}}d\theta. \end{aligned}$ Since we still have $\left|\frac{\exp(itAe^{i\theta})}{Ae^{i\theta}}\right|=\frac{1}{A}\exp(-At\sin\theta),$ if $t<0$ in this case, $\frac{1}{\pi i}\mu_A(t) \to 0$ as $A \to \infty$. For $t>0$, closing by the upper semicircle (the path $\Gamma_U$) and using that the residue of $\frac{e^{itz}}{z^2}$ at $0$ is $it$, we have $\frac{1}{\pi i}\mu_A(t)=it-\frac{1}{2\pi}\int_{0}^{\pi}\frac{\exp(itAe^{i\theta})}{Ae^{i\theta}}d\theta \to it \quad (A \to \infty).$ We can also evaluate $\mu_A(0)$ by computing the integral but we are not doing that. To conclude, we have $\lim_{A \to\infty}\mu_A(t)=\begin{cases} -\pi t \quad &t>0, \\ 0 \quad &t<0. \end{cases}$ Therefore for $J_A$ we have $J(t)=\lim_{A \to\infty}J_A(t)=\begin{cases} 0 \quad &|t| \geq 2, \\ \pi(1+\frac{t}{2}) \quad &-2<t \leq 0, \\ \pi(1-\frac{t}{2}) \quad & 0<t <2. \end{cases}$ Now you may ask, how did we find the value at $0$, $2$ or $-2$, given that $\mu_A(0)$ was not evaluated? Because $h \in L^1$, $\hat{h}(t)=\sqrt{\frac{1}{2\pi}}J(t)$ is uniformly continuous, thus continuous, and the values at these points follow from continuity.
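The triangle shape of $J$ is easy to confirm numerically. The Python sketch below compares a truncated integral with $\pi\max(0,1-\frac{|t|}{2})$. It is a rough check under assumptions of ours: Simpson's rule, truncation at $A=200$ (the neglected tail is at most $\frac{2}{A}$ in absolute value), and hypothetical helper names `simpson`, `sinc`, `truncated_J` and `tent`.

```python
import math

def simpson(f, lo: float, hi: float, n: int) -> float:
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    s += 4.0 * sum(f(lo + k * h) for k in range(1, n, 2))
    s += 2.0 * sum(f(lo + k * h) for k in range(2, n, 2))
    return s * h / 3.0

def sinc(x: float) -> float:
    """sin(x)/x with the removable singularity at 0 filled in."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def truncated_J(A: float, t: float, n: int = 40000) -> float:
    """J_A(t): integral of (sin(x)/x)^2 * e^{itx} over [-A, A], reduced to [0, A]."""
    return 2.0 * simpson(lambda x: sinc(x) ** 2 * math.cos(t * x), 0.0, A, n)

def tent(t: float) -> float:
    """The claimed limit: pi * (1 - |t|/2) for |t| < 2 and 0 otherwise."""
    return math.pi * max(0.0, 1.0 - abs(t) / 2.0)

if __name__ == "__main__":
    for t in (-3.0, -1.0, 0.0, 0.5, 2.0, 3.0):
        print(f"t={t}: J_A(t) ~ {truncated_J(200.0, t):.4f}, claimed {tent(t):.4f}")
```

Unlike in Problem 1, the values at $t=0$ and $t=\pm 2$ need no special treatment here, in line with the continuity argument above.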

### What we can learn from this integral

Again, we get the value of a classic improper integral: $\int_{-\infty}^{\infty}\left(\frac{\sin{x}}{x}\right)^2dx = J(0)=\pi.$ And this time it's not hard to find the Fourier inverse: $\begin{aligned} \sqrt{\frac{1}{2\pi}}\int_{-\infty}^{\infty}\hat{h}(t)e^{itx}dt&=\frac{1}{2\pi}\int_{-\infty}^{\infty}J(t)e^{itx}dt \\ &=\frac{1}{2\pi}\int_{-2}^{2}\pi(1-\frac{1}{2}|t|)e^{itx}dt \\ &=\frac{e^{2ix}+e^{-2ix}-2}{-4x^2} \\ &=\frac{(e^{ix}-e^{-ix})^2}{-4x^2} \\ &=\left(\frac{\sin{x}}{x}\right)^2. \end{aligned}$

In the same way you are able to evaluate the improper integral of $\left(\frac{\sin{x}}{x}\right)^n$. Using Fubini's or Tonelli's theorem is not a good idea. But using the contour integral as above will force you to deal with $n$ binomial coefficients, which might still be tedious. It's even possible to discuss the convergence of the sequence $(I_n)$ where $I_n(t)=\lim_{A \to \infty}\int_{-A}^{A}\left(\frac{\sin{x}}{x}\right)^ne^{itx}dx.$

# The Riesz-Markov-Kakutani Representation Theorem

## This post

This post is intended to prepare for establishing, in a future post, the existence of the Lebesgue measure, which is often denoted by $m$. In fact, the Lebesgue measure follows as a special case of the R-M-K representation theorem. You may not believe it, but Euclidean properties of $\mathbb{R}^k$ play no role in the existence of $m$. The only topological property that matters is the fact that $\mathbb{R}^k$ is a locally compact Hausdorff space.

The theorem is named after F. Riesz, who introduced it for continuous functions on $[0,1]$ (with respect to the Riemann-Stieltjes integral). Years later, after the generalization done by A. Markov and S. Kakutani, we are able to view it in a locally compact Hausdorff space.

You may find that some properties are stated in more generality than needed, but this is intended to let you enjoy more along the way (there are some tools related to differential geometry). There are also many topology and analysis tricks worth your attention.

## Tools

### Different kinds of topological spaces

Again, the Euclidean topology plays no role in this proof. We need to specify the topology for different reasons. This is similar to what we do in linear functional analysis. Throughout, let $X$ be a topological space.

0.0 Definition. $X$ is a Hausdorff space if the following is true: If $p \in X$, $q\in X$ but $p \neq q$, then there are two disjoint open sets $U$ and $V$ such that $p \in U$ and $q \in V$.

0.1 Definition. $X$ is locally compact if every point of $X$ has a neighborhood whose closure is compact.

0.2 Remarks. A Hausdorff space is also called a $T_2$ space (see Kolmogorov classification) or a separated space. There is a classic example of locally compact Hausdorff space: $\mathbb{R}^n$. It is trivial to verify this. But this is far from being enough. In the future we will see, we can construct some ridiculous but mathematically valid measures.

0.3 Definition. A set $E \subset X$ is called $\sigma$-compact if $E$ is a countable union of compact sets. Note that every open subset in a Euclidean space $\mathbb{R}^n$ is $\sigma$-compact since it can always be written as a countable union of closed balls (which are compact).

0.4 Definition. A covering of $X$ is locally finite if every point has a neighborhood which intersects only finitely many elements of the covering. Of course, if the covering is already finite, it's also locally finite.

0.5 Definition. A refinement of a covering of $X$ is a second covering, each element of which is contained in an element of the first covering.

0.6 Definition. $X$ is paracompact if it is Hausdorff, and every open covering has a locally finite open refinement. Obviously any compact Hausdorff space is paracompact.

0.7 Theorem. If $X$ is a second countable Hausdorff space and is locally compact, then $X$ is paracompact. For proof, see this [Theorem 2.6].

0.8 Theorem. If $X$ is locally compact and $\sigma$-compact, then $X=\bigcup_{i=1}^{\infty}K_i$ where for all $i \in \mathbb{N}$, $K_i$ is compact and $K_i \subset\operatorname{int}K_{i+1}$.

### Partition of unity

The basic technical tool in the theory of differential manifolds is the existence of a partition of unity. We will steal this tool for applications in analysis.

1.0 Definition. A partition of unity on $X$ is a collection $(g_i)$ of continuous real valued functions on $X$ such that

1. $g_i \geq 0$ for each $i$.
2. every $x \in X$ has a neighborhood $U$ such that $U \cap \operatorname{supp}(g_i)=\varnothing$ for all but finitely many of $g_i$.
3. for each $x \in X$, we have $\sum_{i}g_i(x)=1$. (That's why you see the word 'unity'.)

1.1 Definition. A partition of unity $(g_i)$ on $X$ is subordinate to an open cover of $X$ if and only if for each $g_i$ there is an element $U$ of the cover such that $\operatorname{supp}(g_i) \subset U$. We say $X$ admits partitions of unity if and only if for every open cover of $X$, there exists a partition of unity subordinate to the cover.

1.2 Theorem. A Hausdorff space admits a partition of unity if and only if it is paracompact (the 'only if' part is by considering the definition of partition of unity. For the 'if' part, see here). As a corollary, we have:

1.3 Corollary. Suppose $V_1,\cdots,V_n$ are open subsets of a locally compact Hausdorff space $X$, $K$ is compact, and $K \subset \bigcup_{k=1}^{n}V_k.$ Then there exists a partition of unity $(h_i)$ that is subordinate to the cover $(V_i)$ such that $\operatorname{supp}(h_i) \subset V_i$ and $\sum_{i=1}^{n}h_i(x)=1$ for all $x \in K$.
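A small worked instance may help to see what 1.3 provides. The following is only an illustration under assumed data: $X=\mathbb{R}$, $K=[0,1]$, $V_1=(-1,1)$, $V_2=(0,2)$, and the piecewise linear functions $h_1$, $h_2$ are chosen by hand rather than produced by the general construction.

$$\begin{aligned} h_1(x)&=\begin{cases} 2x+1, & -\tfrac12 \leq x \leq 0, \\ 1, & 0 \leq x \leq \tfrac14, \\ \tfrac32-2x, & \tfrac14 \leq x \leq \tfrac34, \\ 0, & \text{otherwise}, \end{cases} \qquad h_2(x)=\begin{cases} 2x-\tfrac12, & \tfrac14 \leq x \leq \tfrac34, \\ 1, & \tfrac34 \leq x \leq 1, \\ 3-2x, & 1 \leq x \leq \tfrac32, \\ 0, & \text{otherwise}, \end{cases} \\ \operatorname{supp}(h_1)&=\left[-\tfrac12,\tfrac34\right] \subset V_1, \qquad \operatorname{supp}(h_2)=\left[\tfrac14,\tfrac32\right] \subset V_2, \\ h_1(x)+h_2(x)&=\begin{cases} 1+0=1, & 0 \leq x \leq \tfrac14, \\ \left(\tfrac32-2x\right)+\left(2x-\tfrac12\right)=1, & \tfrac14 \leq x \leq \tfrac34, \\ 0+1=1, & \tfrac34 \leq x \leq 1. \end{cases} \end{aligned}$$

Both functions are continuous with values in $[0,1]$, and they sum to $1$ on $K$ but not on all of $\mathbb{R}$, which is all that 1.3 promises.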

### Urysohn's lemma (for locally compact Hausdorff spaces)

2.0 Notation. The notation $K \prec f$ will mean that $K$ is a compact subset of $X$, that $f \in C_c(X)$, that $f(X) \subset [0,1]$, and that $f(x)=1$ for all $x \in K$. The notation $f \prec V$ will mean that $V$ is open, that $f \in C_c(X)$, that $f(X) \subset [0,1]$ and that $\operatorname{supp}(f) \subset V$. If both hold, we write $K \prec f \prec V.$

2.1 Remarks. Clearly, with this notation, we are able to simplify the statement of being subordinate. We merely need to write $g_i \prec U$ in 1.1 instead of $\operatorname{supp}(g_i) \subset U$.

2.2 Urysohn's Lemma for locally compact Hausdorff spaces. Suppose $X$ is locally compact and Hausdorff, $V$ is open in $X$ and $K \subset V$ is a compact set. Then there exists an $f \in C_c(X)$ such that $K \prec f \prec V.$

2.3 Remarks. By $f \in C_c(X)$ we shall mean that $f$ is a continuous function with compact support. This relation also says that $\chi_K \leq f \leq \chi_V$. For more details and the proof, visit this page. The lemma holds more generally for normal spaces; for a proof at that level, see arXiv:1910.10381. (Question: why do we consider two disjoint closed subsets there?)
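In a metric setting an Urysohn function can be written down explicitly, which may make the notation $K \prec f \prec V$ easier to digest. The example below assumes $X=\mathbb{R}$, $K=[0,1]$ and $V=(-1,2)$; the formula via the distance function is a convenient choice for this special case, not the construction used in the general proof.

$$\begin{aligned} f(x)&=\max\{0,\ 1-2\operatorname{dist}(x,K)\}, \\ f(x)&=1 \text{ for } x \in [0,1], \qquad 0 \leq f \leq 1, \\ \operatorname{supp}(f)&=\left\{x:\operatorname{dist}(x,K) \leq \tfrac12\right\}=\left[-\tfrac12,\tfrac32\right] \subset (-1,2)=V. \end{aligned}$$

Hence $K \prec f \prec V$, and in particular $\chi_K \leq f \leq \chi_V$.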

### The $\varepsilon$-definitions of $\sup$ and $\inf$

We will be using the $\varepsilon$-definitions of $\sup$ and $\inf$, which make the proofs below easier; without them the arguments would be troublesome to follow. So we put them down here.

Let $S$ be a nonempty subset of the real numbers that is bounded below. A lower bound $w$ of $S$ is the infimum of $S$ if and only if for any $\varepsilon>0$, there exists an element $x_\varepsilon \in S$ such that $x_\varepsilon<w+\varepsilon$.

This definition of $\inf$ is equivalent to the following if-then definition:

Let $S$ be a set that is bounded below. We say $w=\inf S$ when $w$ satisfies the following conditions.

1. $w$ is a lower bound of $S$.
2. If $t$ is also a lower bound of $S$, then $t \leq w$.

We have the analogous definition for $\sup$.
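For reference, the analogous statement for $\sup$ and a quick sanity check can be spelled out as follows. The set $S=\{1/n:n \in \mathbb{N}\}$ is just an illustrative choice.

$$\begin{aligned} &w=\sup S \iff w \text{ is an upper bound of } S \text{ and } \forall \varepsilon>0\ \exists x_\varepsilon \in S:\ x_\varepsilon>w-\varepsilon. \\ &\text{Example: } S=\left\{\tfrac1n:n \in \mathbb{N}\right\}. \text{ Then } 0 \text{ is a lower bound, and for } \varepsilon>0 \text{ any } n>\tfrac1\varepsilon \text{ gives } \tfrac1n<0+\varepsilon, \\ &\text{so } \inf S=0. \text{ Likewise } 1 \text{ is an upper bound and } 1 \in S \text{ satisfies } 1>1-\varepsilon, \text{ so } \sup S=1. \end{aligned}$$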

## The main theorem

Analysis is full of vector spaces and linear transformations. We already know that the Lebesgue integral induces a linear functional. That is, for example, $L^1([0,1])$ is a vector space, and we have a linear functional by $f \mapsto \int_0^1 f(x)dx.$ But what about the reverse? Given a linear functional, is it guaranteed that we have a measure to establish the integral? The R-M-K theorem answers this question affirmatively. The functional to be discussed is positive, which means that if $\Lambda$ is positive and $f(X) \subset [0,\infty)$, then $\Lambda{f} \in [0,\infty)$.

Let $X$ be a locally compact Hausdorff space, and let $\Lambda$ be a positive linear functional on $C_c(X)$. Then there exists a $\sigma$-algebra $\mathfrak{M}$ on $X$ which contains all Borel sets in $X$, and there exists a unique positive measure $\mu$ on $\mathfrak{M}$ which represents $\Lambda$ in the sense that $\Lambda{f}=\int_X fd\mu$ for all $f \in C_c(X)$.

For the measure $\mu$ and the $\sigma$-algebra $\mathfrak{M}$, we have four assertions:

1. $\mu(K)<\infty$ for every compact set $K \subset X$.
2. For every $E \in \mathfrak{M}$, we have

$\mu(E)=\inf\{\mu(V):E \subset V, V\text{ open}\}.$

3. For every open set $E$, and for every $E \in \mathfrak{M}$ with $\mu(E)<\infty$, we have

$\mu(E)=\sup\{\mu(K):K \subset E, K\text{ compact}\}.$

4. If $E \in \mathfrak{M}$, $A \subset E$, and $\mu(E)=0$, then $A \in \mathfrak{M}$.

Remarks before proof. It would be great if we could establish the Lebesgue measure $m$ simply by putting $X=\mathbb{R}^n$, but we need a little extra work to get this result naturally. If assertion 2 is satisfied, we say $\mu$ is outer regular; if assertion 3 is satisfied, we say $\mu$ is inner regular. If both hold, we say $\mu$ is regular. The partition of unity and Urysohn's lemma will be heavily used in the proof of the main theorem, so make sure you have no problem with them.

### Proving the theorem

The proof is rather long so we will split it into several steps. I will try my best to make every line clear enough.

#### Step 0 - Construction of $\mu$ and $\mathfrak{M}$

For every open set $V \subset X$, define $\mu(V)=\sup\{\Lambda{f}:f \prec V\}.$

If $V_1 \subset V_2$ and both are open, we claim that $\mu(V_1) \leq \mu(V_2)$. For $f \prec V_1$, since $\operatorname{supp}f \subset V_1 \subset V_2$, we see $f \prec V_2$. But we are also able to find some $g \prec V_2$ such that $g \geq f$, and in particular $\operatorname{supp}(g) \supset \operatorname{supp}(f)$. By taking another look at the proof of Urysohn's lemma for locally compact Hausdorff spaces, we see there is an open set $G$ with compact closure such that $\operatorname{supp}(f) \subset G \subset \overline{G} \subset V_2.$ Applying Urysohn's lemma to the pair $(\overline{G},V_2)$, we see there exists a function $g \in C_c(X)$ such that $\overline{G} \prec g \prec V_2.$ Therefore $\operatorname{supp}(f) \subset \overline{G} \subset \operatorname{supp}(g)$, and $g \geq f$ because $g=1 \geq f$ on $\operatorname{supp}(f)$ while $f=0 \leq g$ elsewhere. Thus for every $f \prec V_1$ there is such a $g \prec V_2$ with $\Lambda{g} \geq \Lambda{f}$ (monotonic), since $\Lambda{g}-\Lambda{f}=\Lambda{(g-f)}\geq 0$. Hence $\Lambda{f} \leq \Lambda{g} \leq \mu(V_2)$, and by taking the supremum over $f$, we see $\mu(V_1) \leq \mu(V_2).$

The 'monotonic' property of such $\mu$ enables us to define $\mu(E)$ for all $E \subset X$ by $\mu(E)=\inf \{\mu(V):E \subset V, V\text{ open}\}.$ By monotonicity, this definition trivially agrees with the previous one when $E$ is open. Sometimes people say $\mu$ is the outer measure. We will discuss other kinds of sets thoroughly in the following steps. Warning: we are not saying that $\mathfrak{M} = 2^X$. The crucial property of $\mu$, namely countable additivity, will be proved only on a certain $\sigma$-algebra.

It follows from the definition of $\mu$ that if $E_1 \subset E_2$, then $\mu(E_1) \leq \mu(E_2)$.
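To see the construction of Step 0 at work, consider a very simple functional. The following is a sketch under an assumption that is not part of the original text: fix a point $x_0 \in X$ and let $\Lambda f=f(x_0)$, which is clearly positive and linear on $C_c(X)$.

$$\begin{aligned} x_0 \in V \text{ open}:&\quad \{x_0\} \prec f \prec V \text{ for some } f \text{ (Urysohn)}, \text{ so } \Lambda f=1, \text{ and } \Lambda g=g(x_0) \leq 1 \text{ for all } g \prec V; \text{ hence } \mu(V)=1. \\ x_0 \notin V \text{ open}:&\quad f \prec V \implies \operatorname{supp}(f) \subset V \implies f(x_0)=0; \text{ hence } \mu(V)=0. \\ E \subset X \text{ arbitrary}:&\quad \mu(E)=\inf\{\mu(V):E \subset V,\ V \text{ open}\}=\begin{cases} 1, & x_0 \in E, \\ 0, & x_0 \notin E, \end{cases} \end{aligned}$$

where in the last case we use that $V=X-\{x_0\}$ is an open set containing $E$ because $X$ is Hausdorff. So this $\Lambda$ produces the point mass at $x_0$, as one would hope.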

Let $\mathfrak{M}_F$ be the class of all $E \subset X$ which satisfy the two following conditions:

1. $\mu(E) <\infty$.

2. 'Inner regular': $\mu(E)=\sup\{\mu(K):K \subset E, K\text{ compact}\}.$

One may say here $\mu$ is the 'inner measure'. Finally, let $\mathfrak{M}$ be the class of all $E \subset X$ such that for every compact $K$, we have $E \cap K \in \mathfrak{M}_F$. We shall show that $\mathfrak{M}$ is the desired $\sigma$-algebra.

Remarks of Step 0. So far, we have only proved that $\mu(E) \geq 0$ for all $E {\color{red}{\subset}}X$. What about the countable additivity? It's clear that $\mathfrak{M}_F$ and $\mathfrak{M}$ have some strong relation. We need to get a clearer view of it. Also, if we restrict $\mu$ to $\mathfrak{M}_F$, we restrict ourselves to finite numbers. In fact, we will finally show that $\mathfrak{M}_F \subset \mathfrak{M}$.

#### Step 1 - The 'measure' of compact sets (outer)

If $K$ is compact, then $K \in \mathfrak{M}_F$, and $\mu(K)=\inf\{\Lambda{f}:K \prec f\}<\infty$

Define $V_\alpha=f^{-1}((\alpha,1])$ for $K \prec f$ and $0 < \alpha < 1$; this set is open since $f$ is continuous with values in $[0,1]$. Since $f(x)=1$ for all $x \in K$, we have $K \subset V_{\alpha}$. Therefore by the definition of $\mu$ for all $E \subset X$, we have $\mu(K) \leq \mu(V_\alpha)=\sup\{\Lambda{g}:g \prec V_{\alpha}\} \leq \frac{1}{\alpha}\Lambda{f}.$ Note that $f \geq \alpha{g}$ whenever $g \prec V_{\alpha}$: on $V_\alpha$ we have $\alpha{g} \leq \alpha < f$, and outside $V_\alpha$ we have $g=0 \leq f$. Since $\mu(K)$ is a lower bound of $\frac{1}{\alpha}\Lambda{f}$ with $0<\alpha<1$, we see $\mu(K) \leq \inf_{\alpha \in (0,1)}\{\frac{1}{\alpha}\Lambda{f}\}=\Lambda{f}.$ Since $\Lambda{f}$ is a real number, we get $\mu(K) <\infty$. Since $K$ itself is compact, the supremum in the inner regularity condition is attained at $K$, and we see $K \in \mathfrak{M}_F$.

To prove the identity, note that for any $\varepsilon>0$ there exists some open $V \supset K$ such that $\mu(V)<\mu(K)+\varepsilon$. By Urysohn's lemma, there exists some $h \in C_c(X)$ such that $K \prec h \prec V$. Therefore $\Lambda{h} \leq \mu(V) < \mu(K)+\varepsilon.$ Since we have already shown that $\mu(K) \leq \Lambda{f}$ whenever $K \prec f$, it follows that $\mu(K)$ is the infimum of $\Lambda{h}$ with $K \prec h$.
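The formula $\mu(K)=\inf\{\Lambda f:K \prec f\}$ can be checked by hand in a familiar case. The sketch below assumes $X=\mathbb{R}$, $\Lambda f=\int_{-\infty}^{\infty}f(x)dx$ (the Riemann integral) and $K=[0,1]$; the trapezoid functions $f_n$ are an illustrative choice.

$$\begin{aligned} f_n(x)&=\max\{0,\ 1-n\operatorname{dist}(x,[0,1])\}, \qquad K \prec f_n, \\ \Lambda f_n&=\int_0^1 1\,dx+2\int_0^{1/n}(1-nt)\,dt=1+\frac{1}{n}, \\ K \prec f &\implies \Lambda f \geq \int_0^1 f(x)dx=1, \\ \mu([0,1])&=\inf\{\Lambda f:K \prec f\}=1. \end{aligned}$$

This agrees with the length of the interval, which is the first hint that the measure obtained from this $\Lambda$ is the Lebesgue measure.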

Remarks of Step 1. We have just proved assertion 1 of the property of $\mu$. The hardest part of this proof is the inequality $\mu(V)<\mu(K)+\varepsilon.$ But this is merely the $\varepsilon$-definition of $\inf$. Note that $\mu(K)$ is the infimum of $\mu(V)$ with $V \supset K$ open, so for any $\varepsilon>0$ the definition hands us exactly such an open $V$. Under certain conditions, this definition is much easier to use. Now we will examine the relation between $\mathfrak{M}_F$ and $\tau_X$, namely the topology of $X$.

#### Step 2 - The 'measure' of open sets (inner)

$\mathfrak{M}_F$ contains every open set $V$ with $\mu(V)<\infty$.

It suffices to show that for every open set $V$, we have $\mu(V)=\sup\{\mu(K):K \subset V, K\text{ compact}\}.$ For $0<\varepsilon<\mu(V)$, we see there exists an $f \prec V$ such that $\Lambda{f}>\mu(V)-\varepsilon$. If $W$ is any open set which contains $K= \operatorname{supp}(f)$, then $f \prec W$, and therefore $\Lambda{f} \leq \mu(W)$. Taking the infimum over all such $W$, the definition of $\mu(K)$ gives $\Lambda{f}\leq\mu(K).$ Therefore $\mu(V)-\varepsilon<\Lambda{f}\leq\mu(K)\leq\mu(V).$ This is exactly the definition of $\sup$. The identity is proved.

Remarks of Step 2. It's important to note that this identity can only be guaranteed for open sets and for sets $E \in \mathfrak{M}$ with $\mu(E)<\infty$, the latter of which will be proved in the following steps. This is the flaw of this theorem. With these preparations however, we are able to show the countable additivity of $\mu$ on $\mathfrak{M}_F$.

#### Step 3 - The subadditivity of $\mu$ on $2^X$

If $E_1,E_2,E_3,\cdots$ are arbitrary subsets of $X$, then $\mu\left(\bigcup_{k=1}^{\infty}E_k\right) \leq \sum_{k=1}^{\infty}\mu(E_k)$

First we show this holds for finitely many open sets. This is tantamount to showing that $\mu(V_1 \cup V_2)\leq \mu(V_1)+\mu(V_2)$ if $V_1$ and $V_2$ are open. Pick $g \prec V_1 \cup V_2$. This is possible due to Urysohn's lemma. By corollary 1.3 applied to the compact set $\operatorname{supp}(g)$, there is a partition of unity $(h_1,h_2)$ subordinate to $(V_1,V_2)$ in the sense of corollary 1.3, so that $h_1+h_2=1$ on $\operatorname{supp}(g)$. Therefore,

$$\begin{aligned} \Lambda(g)&=\Lambda((h_1+h_2)g) \\ &=\Lambda(h_1g)+\Lambda(h_2g) \\ &\leq\mu(V_1)+\mu(V_2). \end{aligned}$$

Notice that $h_1g \prec V_1$ and $h_2g \prec V_2$, which gives the last inequality. By taking the supremum over $g$, we have $\mu(V_1 \cup V_2)\leq \mu(V_1)+\mu(V_2).$

Now we go back to arbitrary subsets of $X$. If $\mu(E_i)=\infty$ for some $i$, then there is nothing to prove. Therefore we shall assume that $\mu(E_i)<\infty$ for all $i$. Fix $\varepsilon>0$. By definition of $\mu(E_i)$, we see there are open sets $V_i \supset E_i$ such that $\mu(V_i)<\mu(E_i)+\frac{\varepsilon}{2^i}.$ Put $V=\bigcup_{i=1}^{\infty}V_i$, and choose $f \prec V$. Since $f \in C_c(X)$, there is a finite collection of $V_i$ that covers the support of $f$. Therefore without loss of generality, we may say that $f \prec V_1 \cup V_2 \cup \cdots \cup V_n$ for some $n$. We therefore obtain

$$\begin{aligned} \Lambda{f} &\leq \mu(V_1 \cup V_2 \cup \cdots \cup V_n) \\ &\leq \mu(V_1)+\mu(V_2)+\cdots+\mu(V_n) \\ &\leq \sum_{i=1}^{n}\left(\mu(E_i)+\frac{\varepsilon}{2^i}\right) \\ &\leq \sum_{i=1}^{\infty}\mu(E_i)+\varepsilon, \end{aligned}$$

for all $f \prec V$. Since $\bigcup E_i \subset V$, we have $\mu(\bigcup E_i) \leq \mu(V)$. Therefore $\mu(\bigcup_{i=1}^{\infty}E_i)\leq\mu(V)=\sup\{\Lambda{f}:f \prec V\}\leq\sum_{i=1}^{\infty}\mu(E_i)+\varepsilon.$ Since $\varepsilon$ is arbitrary, the inequality is proved.

Remarks of Step 3. Again, we are using the $\varepsilon$-definition of $\inf$. One may say this step showed the subadditivity of the outer measure. Also note the geometric series $\sum_{k=1}^{\infty}\frac{\varepsilon}{2^k}=\varepsilon$.

#### Step 4 - Additivity of $\mu$ on $\mathfrak{M}_F$

Suppose $E=\bigcup_{i=1}^{\infty}E_i$, where $E_1,E_2,\cdots$ are pairwise disjoint members of $\mathfrak{M}_F$, then $\mu(E)=\sum_{i=1}^{\infty}\mu(E_i).$ If $\mu(E)<\infty$, we also have $E \in \mathfrak{M}_F$.

As a dual to Step 3, we first show this holds for finitely many compact sets. As proved in Step 1, compact sets are in $\mathfrak{M}_F$. Suppose now $K_1$ and $K_2$ are disjoint compact sets. We want to show that $\mu(K_1 \cup K_2)=\mu(K_1)+\mu(K_2).$ Note that compact sets in a Hausdorff space are closed. Therefore $K_2^c$ is open and we are able to apply Urysohn's lemma to the pair $(K_1,K_2^c)$. That said, there exists an $f \in C_c(X)$ such that $K_1 \prec f \prec K_2^c.$ In other words, $f(x)=1$ for all $x \in K_1$ and $f(x)=0$ for all $x \in K_2$, since $\operatorname{supp}(f) \cap K_2 = \varnothing$. By Step 1, since $K_1 \cup K_2$ is compact, for any $\varepsilon>0$ there exists some $g \in C_c(X)$ such that $K_1 \cup K_2 \prec g \quad \text{and} \quad \Lambda(g) < \mu(K_1 \cup K_2)+\varepsilon.$ Now things become tricky. We are able to write $g$ as $g=fg+(1-f)g.$ But $K_1 \prec fg$ and $K_2 \prec (1-f)g$ by the properties of $f$ and $g$. Also since $\Lambda$ is linear, Step 1 gives $\mu(K_1)+\mu(K_2) \leq \Lambda(fg)+\Lambda((1-f)g)=\Lambda(g) < \mu(K_1 \cup K_2)+\varepsilon.$ Since $\varepsilon$ is arbitrary, we have $\mu(K_1)+\mu(K_2) \leq \mu(K_1 \cup K_2).$ On the other hand, by Step 3, we have $\mu(K_1 \cup K_2) \leq \mu(K_1)+\mu(K_2).$ Therefore they must be equal.

If $\mu(E)=\infty$, there is nothing to prove, since Step 3 already gives $\sum_i\mu(E_i) \geq \mu(E)$. So now we should assume that $\mu(E)<\infty$. Since $E_i \in \mathfrak{M}_F$, for any $\varepsilon>0$ there are compact sets $K_i \subset E_i$ with $\mu(K_i) > \mu(E_i)-\frac{\varepsilon}{2^i}.$ Putting $H_n=K_1 \cup K_2 \cup \cdots \cup K_n$, we see $E \supset H_n$ and, since the $K_i$ are pairwise disjoint compact sets, $\mu(E) \geq \mu(H_n)=\sum_{i=1}^{n}\mu(K_i)>\sum_{i=1}^{n}\mu(E_i)-\varepsilon.$ This inequality holds for all $n$ and $\varepsilon$, therefore $\mu(E) \geq \sum_{i=1}^{\infty}\mu(E_i).$ Therefore by Step 3, the identity holds.
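The two relations $K_1 \prec fg$ and $K_2 \prec (1-f)g$ for disjoint compact $K_1$, $K_2$ are worth verifying once. A short check, assuming only $K_1 \prec f \prec K_2^c$ and $K_1 \cup K_2 \prec g$:

$$\begin{aligned} x \in K_1:&\quad f(x)=1,\ g(x)=1 \implies (fg)(x)=1, \\ x \in K_2:&\quad f(x)=0,\ g(x)=1 \implies ((1-f)g)(x)=1, \\ \text{everywhere}:&\quad 0 \leq fg \leq 1, \quad 0 \leq (1-f)g \leq 1, \quad \operatorname{supp}(fg) \cup \operatorname{supp}((1-f)g) \subset \operatorname{supp}(g). \end{aligned}$$

Since $\operatorname{supp}(g)$ is compact, both products lie in $C_c(X)$, so Step 1 applies to each of them and yields $\mu(K_1) \leq \Lambda(fg)$ and $\mu(K_2) \leq \Lambda((1-f)g)$.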

Finally we shall show that $E \in \mathfrak{M}_F$ if $\mu(E) <\infty$. To make it more understandable, we will use elementary calculus notation. If we write $\mu(E)=x$ and $x_n=\sum_{i=1}^{n}\mu(E_i)$, we see $\lim_{n \to \infty}x_n=x.$ Therefore, for any $\varepsilon>0$, there exists some $N \in \mathbb{N}$ such that $x-x_N<\varepsilon.$ This is tantamount to $\mu(E)<\sum_{i=1}^{N}\mu(E_i)+\varepsilon.$ But by definition of the compact set $H_N$ above, we see $\mu(E)<{\color{red}{\sum_{i=1}^{N}\mu(E_i)}}+\varepsilon<{\color{red}{\mu(H_N)+\varepsilon}}+\varepsilon=\mu(H_N)+2\varepsilon.$ Hence $E$ satisfies the requirements of $\mathfrak{M}_F$, and is thus an element of it.

Remarks of Step 4. You should realize that we are heavily using the $\varepsilon$-definition of $\sup$ and $\inf$. As you may guess, $\mathfrak{M}_F$ should be a subset of $\mathfrak{M}$ though we don't know whether it is a $\sigma$-algebra or not. In other words, we hope that the countable additivity of $\mu$ holds on a $\sigma$-algebra that is properly extended from $\mathfrak{M}_F$. However it's still difficult to show that $\mathfrak{M}$ is a $\sigma$-algebra. We need more properties of $\mathfrak{M}_F$ to go on.

#### Step 5 - The 'continuity' of $\mathfrak{M}_F$.

If $E \in \mathfrak{M}_F$ and $\varepsilon>0$, there is a compact $K$ and an open $V$ such that $K \subset E \subset V$ and $\mu(V-K)<\varepsilon$.

There are two ways to write $\mu(E)$, namely $\mu(E)=\sup\{\mu(K):K \subset E\} \quad \text{and} \quad \mu(E)=\inf\{\mu(V):V\supset E\}$ where $K$ is compact and $V$ is open. Therefore there exist some $K$ and $V$ such that $\mu(V)-\frac{\varepsilon}{2}<\mu(E)<\mu(K)+\frac{\varepsilon}{2}.$ Since $V-K$ is open, and $\mu(V-K)<\infty$, we have $V-K \in \mathfrak{M}_F$ by Step 2. By Step 4 applied to the disjoint sets $K$ and $V-K$, we have $\mu(K)+\mu(V-K)=\mu(V) <\mu(K)+\varepsilon.$ Therefore $\mu(V-K)<\varepsilon$ as desired.

Remarks of Step 5. You should be familiar with the $\varepsilon$-definitions of $\sup$ and $\inf$ now. Since $V-K =V\cap K^c \subset V$, we have $\mu(V-K)\leq\mu(V)<\mu(E)+\frac{\varepsilon}{2}<\infty$.

#### Step 6 - $\mathfrak{M}_F$ is closed under certain operations

If $A,B \in \mathfrak{M}_F$, then $A-B,A\cup B$ and $A \cap B$ are elements of $\mathfrak{M}_F$.

This shows that $\mathfrak{M}_F$ is closed under union, intersection and relative complement. In fact, we merely need to prove $A-B \in \mathfrak{M}_F$, since $A \cup B=(A-B) \cup B$ and $A\cap B = A-(A-B)$.

By Step 5, for $\varepsilon>0$, there are compact sets $K_A$, $K_B$ and open sets $V_A$, $V_B$ such that $K_A \subset A \subset V_A$, $K_B \subset B \subset V_B$, $\mu(V_A-K_A)<\varepsilon$ and $\mu(V_B-K_B)<\varepsilon$. For $A-B$ we have $A-B \subset V_A-K_B \subset (V_A-K_A) \cup (K_A-V_B) \cup (V_B-K_B).$ With an application of Step 3 and Step 5, we have $\mu(A-B) \leq \mu(V_A-K_A)+\mu(K_A-V_B)+\mu(V_B-K_B)< \varepsilon+\mu(K_A-V_B)+\varepsilon.$ Since $K_A-V_B$ is a closed subset of $K_A$, we see $K_A-V_B$ is compact as well (a closed subset of a compact set is compact). Since $K_A-V_B \subset A-B$ and $\mu(A-B) <\mu(K_A-V_B)+2\varepsilon$, we see $A-B$ meets the requirements of $\mathfrak{M}_F$ (the fact that $\mu(A-B)<\infty$ is trivial since $\mu(A-B) \leq \mu(A)$).
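The chain of inclusions used for $A-B$ is easy to believe but also easy to get wrong, so here is a pointwise check. It only uses $K_A \subset A \subset V_A$ and $K_B \subset B \subset V_B$.

$$\begin{aligned} &A \subset V_A,\ K_B \subset B \implies A-B \subset V_A-K_B. \\ &\text{Let } x \in V_A-K_B. \\ &\quad x \notin K_A \implies x \in V_A-K_A; \\ &\quad x \in K_A,\ x \notin V_B \implies x \in K_A-V_B; \\ &\quad x \in K_A,\ x \in V_B \implies x \in V_B-K_B \quad (\text{since } x \notin K_B). \\ &K_A \subset A,\ B \subset V_B \implies K_A-V_B \subset A-B. \end{aligned}$$

The three cases are exhaustive, which gives the second inclusion, and the last line confirms that $K_A-V_B$ is a compact subset of $A-B$.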

Since $A-B$ and $B$ are pairwise disjoint members of $\mathfrak{M}_F$, we see $\mu(A \cup B)=\mu(A-B)+\mu(B)<\infty.$ Thus $A \cup B \in \mathfrak{M}_F$. Since $A,A-B \in \mathfrak{M}_F$, we see $A \cap B = A-(A-B) \in \mathfrak{M}_F$.

Remarks of Step 6. In this step, we demonstrated several ways to express a set, all of which end up with a huge simplification. Now we are able to show that $\mathfrak{M}_F$ is a subset of $\mathfrak{M}$.

#### Step 7 - $\mathfrak{M}_F \subset \mathfrak{M}$

There is a precise relation between $\mathfrak{M}$ and $\mathfrak{M}_F$ by $\mathfrak{M}_F=\{E \in \mathfrak{M}:\mu(E)<\infty\} \subset \mathfrak{M}.$

If $E \in \mathfrak{M}_F$, we shall show that $E \in \mathfrak{M}$. For compact $K\in\mathfrak{M}_F$ (Step 1), by Step 6, we see $K \cap E \in \mathfrak{M}_F$, therefore $E \in \mathfrak{M}$.

If $E \in \mathfrak{M}$ with $\mu(E)<\infty$ however, we need to show that $E \in \mathfrak{M}_F$. By definition of $\mu$, for $\varepsilon>0$, there is an open $V \supset E$ such that $\mu(V)<\mu(E)+\varepsilon<\infty.$ Therefore $V \in \mathfrak{M}_F$ by Step 2. By Step 5, there is a compact set $K \subset V$ such that $\mu(V-K)<\varepsilon$ (the open set containing $V$ should be $V$ itself). Since $E \cap K \in \mathfrak{M}_F$, there exists a compact set $H \subset E \cap K$ with $\mu(E \cap K)<\mu(H)+\varepsilon.$ Since $E \subset (E \cap K) \cup (V-K)$, it follows from Step 3 that $\mu(E) \leq {\color{red}{\mu(E\cap K)}}+\mu(V-K)<{\color{red}{\mu(H)+\varepsilon}}+\varepsilon=\mu(H)+2\varepsilon.$ Therefore $E \in \mathfrak{M}_F$.

Remarks of Step 7. Several tricks in the preceding steps are used here. Now we are pretty close to the fact that $(X,\mathfrak{M},\mu)$ is a measure space. Note that for $E \in \mathfrak{M}-\mathfrak{M}_F$, we have $\mu(E)=\infty$, but we have already proved the countable additivity for $\mathfrak{M}_F$. Is it 'almost trivial' for $\mathfrak{M}$? Before that, we need to show that $\mathfrak{M}$ is a $\sigma$-algebra. Note that assertion 3 of $\mu$ has been proved.

#### Step 8 - $\mathfrak{M}$ is a $\sigma$-algebra in $X$ containing all Borel sets

We will validate the definition of $\sigma$-algebra one by one.

$X \in \mathfrak{M}$.

For any compact $K \subset X$, we have $K \cap X=K$. But as proved in Step 1, $K \in \mathfrak{M}_F$, therefore $X \in \mathfrak{M}$.

If $A \in \mathfrak{M}$, then $A^c \in\mathfrak{M}$.

If $A \in \mathfrak{M}$, then $A \cap K \in \mathfrak{M}_F$ for every compact $K$. But $K-(A \cap K)=K \cap(A^c \cup K^c)=(K\cap A^c) \cup \varnothing=K \cap A^c.$ By Step 1 and Step 6, we see $K \cap A^c \in \mathfrak{M}_F$, thus $A^c \in \mathfrak{M}$.

If $A_n \in \mathfrak{M}$ for all $n \in \mathbb{N}$, then $A=\bigcup_{n=1}^{\infty}A_n \in \mathfrak{M}$.

We define an auxiliary sequence of sets inductively. For $n=1$, we write $B_1=A_1 \cap K$ where $K$ is compact. Then $B_1 \in \mathfrak{M}_F$. For $n \geq 2$, we write $B_n=(A_n \cap K)-(B_1 \cup \cdots\cup B_{n-1}).$ Since $A_n \cap K \in \mathfrak{M}_F$ and $B_1,B_2,\cdots,B_{n-1} \in \mathfrak{M}_F$, by Step 6, $B_n \in \mathfrak{M}_F$. Also the sets $B_n$ are pairwise disjoint.

Another set-theoretic manipulation shows that

$$\begin{aligned} A \cap K&=K \cap\left(\bigcup_{n=1}^{\infty}A_n\right) \\ &=\bigcup_{n=1}^{\infty}(K \cap A_n) \\ &=\bigcup_{n=1}^{\infty}\left(B_n \cup(B_1 \cup \cdots\cup B_{n-1})\right) \\ &=\bigcup_{n=1}^{\infty}B_n. \end{aligned}$$

Now we are able to evaluate $\mu(A \cap K)$ by Step 4: since the $B_n$ are pairwise disjoint members of $\mathfrak{M}_F$, we have $\mu(A \cap K)=\sum_{n=1}^{\infty}\mu(B_n)$, and by monotonicity and Step 1, $\mu(A \cap K) \leq \mu(K)<\infty.$ Therefore $A \cap K \in \mathfrak{M}_F$ by Step 4, which implies that $A \in \mathfrak{M}$.

$\mathfrak{M}$ contains all Borel sets.

Indeed, it suffices to prove that $\mathfrak{M}$ contains all open sets and/or closed sets. We'll show two different paths. Let $K$ be a compact set.

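1. If $C$ is closed, then $C \cap K$ is compact, therefore $C \cap K$ is an element of $\mathfrak{M}_F$ (by Step 1), and $C \in \mathfrak{M}$.
2. If $D$ is open, pick an open $V \supset K$ with $\mu(V)<\infty$, which exists since $\mu(K)<\infty$. Then $D \cap V$ is open with $\mu(D \cap V) \leq \mu(V)<\infty$, so $D \cap V \in \mathfrak{M}_F$ (by Step 2). Therefore $D \cap K=(D \cap V) \cap K \in \mathfrak{M}_F$ (by Step 6), which shows that $D \in \mathfrak{M}$.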

Therefore by 1 or 2, $\mathfrak{M}$ contains all Borel sets.

#### Step 9 - $\mu$ is a positive measure on $\mathfrak{M}$

Again, we will verify all properties of $\mu$ one by one.

$\mu(E) \geq 0$ for all $E \in \mathfrak{M}$.

This follows immediately from the definition of $\mu$, since $\Lambda$ is positive and $0 \leq f \leq 1$.

$\mu$ is countably additive.

If $A_1,A_2,\cdots$ form a disjoint countable collection of members of $\mathfrak{M}$, we need to show that $\mu\left(\bigcup_{n=1}^{\infty}A_n\right)=\sum_{n=1}^{\infty}\mu(A_n).$ If $A_n \in \mathfrak{M}_F$ for all $n$, then this is merely what we have just proved in Step 4. If $A_j \in \mathfrak{M}-\mathfrak{M}_F$ however, we have $\mu(A_j)=\infty$. So $\sum_n\mu(A_n)=\infty$. For $\mu(\cup_n A_n)$, notice that $\cup_n A_n \supset A_j$, we have $\mu(\cup_n A_n) \geq \mu(A_j)=\infty$. The identity is now proved.

#### Step 10 - The completeness of $\mu$

So far assertion 1-3 have been proved. But the final assertion has not been proved explicitly. We do that since this property will be used when discussing the Lebesgue measure $m$. In fact, this will show that $(X,\mathfrak{M},\mu)$ is a complete measure space.

If $E \in \mathfrak{M}$, $A \subset E$, and $\mu(E)=0$, then $A \in \mathfrak{M}$.

It suffices to show that $A \in \mathfrak{M}_F$. By monotonicity, $\mu(A) \leq \mu(E)=0$, so $\mu(A)=0$ as well. If $K \subset A$, where $K$ is compact, then $0 \leq \mu(K) \leq \mu(A)=0$. Therefore $0$ is the supremum of $\mu(K)$ over compact $K \subset A$. It follows that $A \in \mathfrak{M}_F \subset \mathfrak{M}$.

#### Step 11 - The functional and the measure

For every $f \in C_c(X)$, $\Lambda{f}=\int_X fd\mu$.

This is the absolute main result of the theorem. It suffices to prove the inequality $\Lambda f \leq \int_X fd\mu$ for all real $f \in C_c(X)$. What about the other side? By the linearity of $\Lambda$ and $\int_X \cdot d\mu$, once the inequality above is proved, we have $\Lambda(-f)=-\Lambda{f}\leq\int_{X}-fd\mu=-\int_Xfd\mu.$ Therefore $\Lambda{f} \geq \int_X fd\mu$ holds as well, and this establishes the equality.

Notice that since $K=\operatorname{supp}(f)$ is compact, the range of $f$ has to be compact. Namely, we may assume that $[a,b]$ contains the range of $f$. For $\varepsilon>0$, we are able to pick a partition around $[a,b]$ such that $y_i - y_{i-1}<\varepsilon$ for $i=1,\cdots,n$ and $y_0 < a < y_1<\cdots<y_n=b.$ Put $E_i=\{x:y_{i-1}< f(x) \leq y_i\}\cap K.$ Since $f$ is continuous, $f$ is Borel measurable. The sets $E_i$ are trivially pairwise disjoint Borel sets. Again, there are open sets $V_i \supset E_i$ such that $\mu(V_i) < \mu(E_i)+\frac{\varepsilon}{n}$ for $i=1,2,\cdots,n$, and such that $f(x)<y_i + \varepsilon$ for all $x \in V_i$.

Notice that $(V_i)$ covers $K$, therefore by the partition of unity, there are functions $h_1,\cdots,h_n$ such that $h_i \prec V_i$ for all $i$ and $\sum_i h_i=1$ on $K$. Hence $f=\sum_i h_if$. By Step 1 and the fact that $\sum_i h_i=1$ on $K$, we see $\mu(K) \leq \Lambda(\sum_i h_i)=\sum_i \Lambda{h_i}.$ By the way we picked $V_i$, we see $h_if \leq (y_i+\varepsilon)h_i$. We have the following inequality:

$$\begin{aligned} \Lambda{f} &= \sum_{i=1}^{n}\Lambda(h_if) \leq\sum_{i=1}^{n}(y_i+\varepsilon)\Lambda{h_i} \\ &= \sum_{i=1}^{n}\left(|a|-|a|+y_i+\varepsilon\right)\Lambda{h_i} \\ &=\sum_{i=1}^{n}(|a|+y_i+\varepsilon)\Lambda{h_i}-|a|\sum_{i=1}^{n}\Lambda{h_i}. \end{aligned}$$

Since $h_i \prec V_i$, we have $\mu(E_i)+\frac{\varepsilon}{n}>\mu(V_i) \geq \Lambda{h_i}$. And we already have $\sum_i \Lambda{h_i} \geq \mu(K)$. Since $|a|+y_i+\varepsilon>0$, putting these into the inequality above, we get

$$\begin{aligned} \Lambda{f} &\leq \sum_{i=1}^{n}(|a|+y_i+\varepsilon)\Lambda{h_i}-|a|\sum_{i=1}^{n}\Lambda{h_i} \\ &\leq \sum_{i=1}^{n}(|a|+y_i+\varepsilon){\color{red}{\left(\mu(E_i)+\frac{\varepsilon}{n}\right)}}-|a|{\color{red}{\mu(K)}}. \end{aligned}$$

Observe that $\cup_i E_i=K$; by Step 9 we have $\sum_{i}\mu(E_i)=\mu(K)$. A slight manipulation shows that

$$\begin{aligned} \sum_{i=1}^{n}(|a|+y_i+\varepsilon)\mu(E_i)-|a|\mu(K)&=|a|\sum_{i=1}^{n}\mu(E_i)-|a|\mu(K)+\sum_{i=1}^{n}(y_i+\varepsilon)\mu(E_i) \\ &=\sum_{i=1}^{n}(y_i-\varepsilon)\mu(E_i)+2\varepsilon\mu(K). \end{aligned}$$

Therefore for $\Lambda f$ we get

$$\begin{aligned} \Lambda{f} &\leq\sum_{i=1}^{n}(|a|+y_i+\varepsilon)\left(\mu(E_i)+\frac{\varepsilon}{n}\right)-|a|\mu(K) \\ &=\sum_{i=1}^{n}(y_i-\varepsilon)\mu(E_i)+2\varepsilon\mu(K)+\frac{\varepsilon}{n}\sum_{i=1}^n(|a|+y_i+\varepsilon). \end{aligned}$$

Now here comes the trickiest part of the whole blog post. By definition of $E_i$, we see $f(x) > y_{i-1}>y_{i}-\varepsilon$ for $x \in E_i$. Therefore we get a simple function $s_n$ by $s_n=\sum_{i=1}^{n}(y_i-\varepsilon)\chi_{E_i},$ which satisfies $s_n \leq f$ on $K$ and $s_n=f=0$ outside $K$. If we evaluate the Lebesgue integral of $s_n$ with respect to $\mu$, we see $\int_X s_nd\mu={\color{red}{\sum_{i=1}^{n}(y_i-\varepsilon)\mu(E_i)}} \leq {\color{red}{\int_X fd\mu}}.$ For $2\varepsilon\mu(K)$, things are simple since $0\leq\mu(K)<\infty$. Therefore $2\varepsilon\mu(K) \to 0$ as $\varepsilon \to 0$. Now let's estimate the final part of the inequality. It's trivial that $\frac{\varepsilon}{n}\sum_{i=1}^{n}(|a|+\varepsilon)=\varepsilon(\varepsilon+|a|)$. For $y_i$, observe that $y_i \leq b$ for all $i$, therefore $\frac{\varepsilon}{n}\sum_{i=1}^{n}y_i \leq \frac{\varepsilon}{n}nb=\varepsilon b$. Thus ${\color{green}{\frac{\varepsilon}{n}\sum_{i=1}^{n}(|a|+y_i+\varepsilon)}} \leq {\color{green}{\varepsilon(|a|+b+\varepsilon)}}.$ Notice that $b+|a| \geq 0$ since $b \geq a \geq -|a|$. Our estimation of $\Lambda{f}$ is finally done:

$$\begin{aligned} \Lambda{f} &\leq{\color{red}{\sum_{i=1}^{n}(y_i-\varepsilon)\mu(E_i)}}+2\varepsilon\mu(K)+{\color{green}{\frac{\varepsilon}{n}\sum_{i=1}^n(|a|+y_i+\varepsilon)}} \\ &\leq{\color{red}{\int_Xfd\mu}}+2\varepsilon\mu(K)+{\color{green}{\varepsilon(|a|+b+\varepsilon)}} \\ &= \int_X fd\mu+\varepsilon(2\mu(K)+|a|+b+\varepsilon). \end{aligned}$$

Since $\varepsilon$ is arbitrary, we see $\Lambda{f} \leq \int_X fd\mu$. The identity is proved.

#### Step 12 - The uniqueness of $\mu$

If there are two measures $\mu_1$ and $\mu_2$ that satisfy assertions 1 to 4 and correspond to $\Lambda$, then $\mu_1=\mu_2$.

In fact, according to assertion 2 and 3, $\mu$ is determined by the values on compact subsets of $X$. It suffices to show that

If $K$ is a compact subset of $X$, then $\mu_1(K)=\mu_2(K)$.

Fix $K$ compact and $\varepsilon>0$. By assertions 1 and 2, there exists an open $V \supset K$ such that $\mu_2(V)<\mu_2(K)+\varepsilon$. By Urysohn's lemma, there exists some $f$ such that $K \prec f \prec V$. Hence $\mu_1(K)=\int_X\chi_Kd\mu_1 \leq\int_X fd\mu_1=\Lambda{f}=\int_X fd\mu_2 \leq \int_X \chi_V d\mu_2=\mu_2(V)<\mu_2(K)+\varepsilon.$ Since $\varepsilon$ is arbitrary, $\mu_1(K) \leq \mu_2(K)$. If $\mu_1$ and $\mu_2$ are exchanged, we see $\mu_2(K) \leq \mu_1(K)$. The uniqueness is proved.

## The flaw

Can we simply put $X=\mathbb{R}^k$ right now? The answer is no. Note that the outer regularity is for all sets but inner is only for open sets and members of $\mathfrak{M}_F$. But we expect the outer and inner regularity to be 'symmetric'. There is an example showing that locally compact is far from being enough to offer the 'symmetry'.

### A weird example

Define $X=\mathbb{R}_1 \times \mathbb{R}_2$, where $\mathbb{R}_1$ is the real line equipped with the discrete metric $d_1$, and $\mathbb{R}_2$ is the real line equipped with the Euclidean metric $d_2$. The metric of $X$ is defined by $d_X((x_1,y_1),(x_2,y_2))=d_1(x_1,x_2)+d_2(y_1,y_2).$ The topology $\tau_X$ induced by $d_X$ is naturally Hausdorff and locally compact by considering the vertical segments. So what would happen to this weird locally compact Hausdorff space?

If $f \in C_c(X)$, let $x_1,x_2,\cdots,x_n$ be those values of $x$ for which $f(x,y) \neq 0$ for at least one $y$. Since $f$ has compact support, it is ensured that there are only finitely many $x_i$'s. We are able to define a positive linear functional by $\Lambda f=\sum_{i=1}^{n}\int_{-\infty}^{+\infty}f(x_i,y)dy=\int_X fd\mu,$ where $\mu$ is the measure associated with $\Lambda$ in the sense of R-M-K theorem. Let $E=\mathbb{R}_1 \times \{0\}.$ By squeezing the disjoint vertical segments around $(x_i,0)$, we see $\mu(K)=0$ for all compact $K \subset E$ but $\mu(E)=\infty$.
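The two claims about $E=\mathbb{R}_1 \times \{0\}$ can be unpacked as follows. This is only a sketch of the standard argument in the notation of this section; the auxiliary symbols $\delta$, $\delta_x$, $m$, $N$ and the tent functions are introduced here for illustration.

$$\begin{aligned} &\textbf{(a) } \mu(K)=0. \text{ Distinct points of } E \text{ are at distance} \geq 1, \text{ so a compact } K \subset E \text{ is finite: } K=\{(x_1,0),\cdots,(x_m,0)\}. \\ &\quad V_\delta=\bigcup_{i=1}^{m}\{x_i\} \times (-\delta,\delta) \text{ is open, and } f \prec V_\delta \implies \Lambda f=\sum_{i=1}^{m}\int_{-\delta}^{\delta}f(x_i,y)dy \leq 2m\delta, \\ &\quad \text{hence } \mu(K) \leq \mu(V_\delta) \leq 2m\delta \to 0 \quad (\delta \to 0). \\ &\textbf{(b) } \mu(E)=\infty. \text{ Let } V \supset E \text{ be open. For each } x \in \mathbb{R} \text{ there is } \delta_x>0 \text{ with } \{x\} \times (-\delta_x,\delta_x) \subset V. \\ &\quad \mathbb{R}=\bigcup_{N=1}^{\infty}\left\{x:\delta_x>\tfrac1N\right\} \text{ is uncountable, so some } N \text{ admits infinitely many such } x. \\ &\quad \text{Pick } m \text{ of them, } x_1,\cdots,x_m, \text{ and put } f(x_i,y)=\max\{0,1-N|y|\},\ f=0 \text{ on all other vertical lines.} \\ &\quad \text{Then } f \prec V \text{ and } \Lambda f=\sum_{i=1}^{m}\int_{-1/N}^{1/N}(1-N|y|)dy=\frac{m}{N}, \text{ so } \mu(V) \geq \frac{m}{N} \text{ for every } m. \\ &\quad \text{Thus } \mu(V)=\infty \text{ for every open } V \supset E, \text{ and } \mu(E)=\infty. \end{aligned}$$

So inner regularity fails for $E$: the supremum over compact subsets is $0$, while the outer measure is infinite.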

This is in violent contrast to what we expect. However, if $X$ is required to be $\sigma$-compact (note that the space in this example is not), this kind of problem disappears neatly.

1. Walter Rudin, Real and Complex Analysis
2. Serge Lang, Fundamentals of Differential Geometry
3. Joel W. Robbin, Partition of Unity
4. Brian Conrad, Paracompactness and local compactness
5. Raoul Bott & Loring W. Tu, Differential Forms in Algebraic Topology

# The Big Three Pt. 4 - The Open Mapping Theorem (F-Space)

## The Open Mapping Theorem

We are finally going to prove the open mapping theorem in $F$-space. In this version, only metric and completeness are required. Therefore it contains the Banach space version naturally.

(Theorem 0) Suppose we have the following conditions:

1. $X$ is a $F$-space,
2. $Y$ is a topological vector space,
3. $\Lambda: X \to Y$ is continuous and linear, and
4. $\Lambda(X)$ is of the second category in $Y$.

Then $\Lambda$ is an open mapping.

Proof. Let $B$ be a neighborhood of $0$ in $X$. Let $d$ be an invariant metric on $X$ that is compatible with the $F$-topology of $X$. Define a sequence of balls by $B_n=\{x:d(x,0) < \frac{r}{2^n}\}$ where $r$ is picked in such a way that $B_0 \subset B$. To show that $\Lambda$ is an open mapping, we need to prove that there exists some neighborhood $W$ of $0$ in $Y$ such that $W \subset \Lambda(B).$ To do this however, we need an auxiliary set. In fact, we will show that there exists some $W$ such that $W \subset \overline{\Lambda(B_1)} \subset \Lambda(B).$ We need to prove the inclusions one by one.

The first inclusion requires BCT. Since $B_2 -B_2 \subset B_1$, and $Y$ is a topological space, we get $\overline{\Lambda(B_2)}-\overline{\Lambda(B_2)} \subset \overline{\Lambda(B_2)-\Lambda(B_2)} \subset \overline{\Lambda(B_1)}$ Since $\Lambda(X)=\bigcup_{k=1}^{\infty}k\Lambda(B_2),$ according to BCT, at least one $k\Lambda(B_2)$ is of the second category in $Y$. But scalar multiplication $y\mapsto ky$ is a homeomorphism of $Y$ onto $Y$, we see $k\Lambda(B_2)$ is of the second category for all $k$, especially for $k=1$. Therefore $\overline{\Lambda(B_2)}$ has nonempty interior, which implies that there exists some open neighborhood $W$ of $0$ in $Y$ such that $W \subset \overline{\Lambda(B_1)}$. By replacing the index, it's easy to see this holds for all $n$. That is, for $n \geq 1$, there exists some neighborhood $W_n$ of $0$ in $Y$ such that $W_n \subset \overline{\Lambda(B_n)}$.

The second inclusion requires the completeness of $X$. Fix $y_1 \in \overline{\Lambda(B_1)}$; we will show that $y_1 \in \Lambda(B)$. Pick $y_n$ inductively. Assume $y_n$ has been chosen in $\overline{\Lambda(B_n)}$. As stated before, there exists some neighborhood $W_{n+1}$ of $0$ in $Y$ such that $W_{n+1} \subset \overline{\Lambda(B_{n+1})}$. Hence $(y_n-W_{n+1}) \cap \Lambda(B_n) \neq \varnothing.$ Therefore there exists some $x_n \in B_n$ such that $\Lambda x_n \in y_n - W_{n+1}.$ Put $y_{n+1}=y_n-\Lambda x_n$; we see $y_{n+1} \in W_{n+1} \subset \overline{\Lambda(B_{n+1})}$. Therefore we are able to pick $y_n$ for all $n \geq 1$.

Since $d(x_n,0)<\frac{r}{2^n}$ for all $n \geq 1$, the partial sums $z_n=\sum_{k=1}^{n}x_k$ form a Cauchy sequence, which converges to some $z \in X$ since $X$ is complete. Moreover, $$\begin{aligned} d(z,0)& \leq d(x_1,0)+d(x_2,0)+\cdots \\ & < \frac{r}{2}+\frac{r}{4}+\cdots \\ & = r, \end{aligned}$$ so $z \in B_0 \subset B$.

By the continuity of $\Lambda$, we see $\lim_{n \to \infty}y_n = 0$. Notice we also have $\Lambda z_n=\sum_{k=1}^{n} \Lambda x_k = \sum_{k=1}^{n}(y_k-y_{k+1})=y_1-y_{n+1} \to y_1 \quad (n \to \infty).$ Since $z_n \to z$ and $\Lambda$ is continuous, $\Lambda z_n \to \Lambda z$ as well, so $y_1 = \Lambda z \in \Lambda(B)$.

The whole theorem is now proved, that is, $\Lambda$ is an open mapping. $\square$

### Remarks

You may think the following relation comes from nowhere: $(y_n - W_{n+1}) \cap \Lambda(B_{n}) \neq \varnothing.$ But it does not. We need to review some point-set topology. Notice that $y_n$ lies in the closure of $\Lambda(B_n)$, and $y_n-W_{n+1}$ is a neighborhood of $y_n$. If $(y_n - W_{n+1}) \cap \Lambda(B_{n})$ were empty, then $y_n$ could not lie in the closure of $\Lambda(B_n)$.

The geometric series $\frac{\varepsilon}{2}+\frac{\varepsilon}{4}+\cdots+\frac{\varepsilon}{2^n}+\cdots=\varepsilon$ is widely used whenever an infinite sum has to be kept under control. It is a good idea to keep this technique in mind.
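As a worked check of the inclusion $B_2-B_2 \subset B_1$ used in the first step of the proof, the computation can be spelled out for general $n$, using only the invariance of $d$ and the triangle inequality. For $x,y \in B_{n+1}$, $$\begin{aligned} d(x-y,0)&=d(x,y) \\ &\leq d(x,0)+d(0,y) \\ &< \frac{r}{2^{n+1}}+\frac{r}{2^{n+1}} = \frac{r}{2^n}, \end{aligned}$$ so $x-y \in B_n$, that is, $B_{n+1}-B_{n+1} \subset B_n$.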

## Corollaries

Formal proofs will not be written down here, but they are quite easy to carry out.

(Corollary 0) $\Lambda(X)=Y$.

This is an immediate consequence of the fact that $\Lambda$ is open. Since $X$ is open in itself, $\Lambda(X)$ is an open subspace of $Y$. But the only open subspace of $Y$ is $Y$ itself.

(Corollary 1) $Y$ is an $F$-space as well.

If you have already seen the commutative diagram for the quotient space (put $N=\ker\Lambda$), you know that the induced map $f: X/N \to Y$ is open and continuous. By treating topological vector spaces as groups, by corollary 0 and the first isomorphism theorem, we have $X/\ker\Lambda \simeq \Lambda(X)=Y.$ Therefore $f$ is an isomorphism; hence one-to-one. Therefore $f$ is a homeomorphism as well. In the post on the completeness of the quotient space we showed that $X/\ker{\Lambda}$ is an $F$-space, therefore $Y$ has to be an $F$-space as well. (We are using the fact that $\ker{\Lambda}$ is a closed set. But why closed?)

(Corollary 2) If $\Lambda$ is a continuous linear mapping of an $F$-space $X$ onto an $F$-space $Y$, then $\Lambda$ is open.

This is a direct application of BCT and the open mapping theorem. By BCT, the complete metric space $Y=\Lambda(X)$ is of the second category in itself, so Theorem 0 applies.

(Corollary 3) If the linear map $\Lambda$ in Corollary 2 is injective, then $\Lambda^{-1}:Y \to X$ is continuous.

This comes from corollary 2 directly since $\Lambda$ is open.

(Corollary 4) If $X$ and $Y$ are Banach spaces, and if $\Lambda: X \to Y$ is a continuous linear bijective map, then there exist positive real numbers $a$ and $b$ such that $a \lVert x \rVert \leq \lVert \Lambda{x} \rVert \leq b\lVert x \rVert$ for every $x \in X$.

This comes from corollary 3 directly since both $\Lambda$ and $\Lambda^{-1}$ are bounded as they are continuous.
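A small finite-dimensional sanity check of Corollary 4 may help. The sketch below assumes $X=Y=\mathbb{R}^2$ with the Euclidean norm and a hypothetical invertible matrix `LAMBDA` (not taken from the text); in that setting one admissible choice is $a=$ the smallest and $b=$ the largest singular value of the matrix.

```python
import math

def singular_values_2x2(m):
    """Return (smallest, largest) singular value of a 2x2 matrix m."""
    (p, q), (r, s) = m
    t = p * p + q * q + r * r + s * s      # trace of m^T m
    det = p * s - q * r                    # determinant of m
    disc = math.sqrt(max(t * t - 4 * det * det, 0.0))
    return math.sqrt((t - disc) / 2), math.sqrt((t + disc) / 2)

def apply(m, x):
    """Matrix-vector product for a 2x2 matrix."""
    (p, q), (r, s) = m
    return (p * x[0] + q * x[1], r * x[0] + s * x[1])

def norm(x):
    return math.hypot(x[0], x[1])

LAMBDA = ((2.0, 1.0), (0.0, 1.0))   # hypothetical bijective linear map on R^2
a, b = singular_values_2x2(LAMBDA)

def bounds_hold(x, tol=1e-12):
    """Check a*||x|| <= ||LAMBDA x|| <= b*||x|| up to rounding."""
    n, image = norm(x), norm(apply(LAMBDA, x))
    return a * n - tol <= image <= b * n + tol

# Sample points on the unit circle.
samples = [(math.cos(k * 0.1), math.sin(k * 0.1)) for k in range(63)]
all_ok = all(bounds_hold(x) for x in samples)
```

In infinite dimensions no such explicit formula is available in general; the point of the corollary is that the constants $a$ and $b$ exist anyway.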

(Corollary 5) If $\tau_1 \subset \tau_2$ are vector topologies on a vector space $X$ and if both $(X,\tau_1)$ and $(X,\tau_2)$ are $F$-spaces, then $\tau_1 = \tau_2$.

This is obtained by applying corollary 3 to the identity mapping $\iota:(X,\tau_2) \to (X,\tau_1)$, which is continuous because $\tau_1 \subset \tau_2$.

(Corollary 6) If $\lVert \cdot \rVert_1$ and $\lVert \cdot \rVert_2$ are two norms in a vector space $X$ such that

• $\lVert\cdot\rVert_1 \leq K\lVert\cdot\rVert_2$ for some constant $K>0$.
• $(X,\lVert\cdot\rVert_1)$ and $(X,\lVert\cdot\rVert_2)$ are Banach spaces.

Then $\lVert\cdot\rVert_1$ and $\lVert\cdot\rVert_2$ are equivalent.

This is merely a more restrictive version of corollary 5.

## The series

Since there is no strong reason to write more posts on this topic, i.e. the three fundamental theorems of linear functional analysis, I think it's time to make a list of the series. It's been around half a year.

# The completeness of the quotient space (topological vector space)

## The Goal

We are going to show the completeness of $X/N$ where $X$ is an $F$-space (a topological vector space with a complete invariant metric) and $N$ a closed subspace. Along the way, a bunch of useful analysis tricks will be demonstrated (and that's why you may find this blog post a little tedious). More importantly, the theorem proved here will be used in the future.

## The main process

To make it clear, we should give a formal definition of $F$-space.

A topological vector space $X$ is an $F$-space if its topology $\tau$ is induced by a complete invariant metric $d$.

A metric $d$ on a vector space $X$ will be called invariant if for all $x,y,z \in X$, we have $d(x+z,y+z)=d(x,y).$ By complete we mean every Cauchy sequence of $(X,d)$ converges.

### Defining the quotient metric $\rho$

The metric is inherited by the quotient space naturally (we will use this fact later), that is

If $X$ is an $F$-space and $N$ is a closed subspace of $X$, then $X/N$ is still an $F$-space.

Suppose $d$ is a complete invariant metric compatible with $\tau_X$. The metric on $X/N$ is defined by $\boxed{\rho(\pi(x),\pi(y))=\inf_{z \in N}d(x-y,z)}$

### $\rho$ is a metric

Proof. First, if $\pi(x)=\pi(y)$, that is, $x-y \in N$, we see $\rho(\pi(x),\pi(y))=\inf_{z \in N}d(x-y,z)=d(x-y,x-y)=0.$ If $\pi(x) \neq \pi(y)$ however, we shall show that $\rho(\pi(x),\pi(y))>0$. In this case, we have $x-y \notin N$. Since $N$ is closed, $N^c$ is open, and $x-y$ is an interior point of $X \setminus N$. Therefore there exists an open ball $B_r(x-y)$ centered at $x-y$ with radius $r>0$ such that $B_r(x-y) \cap N = \varnothing$. Notice we have $d(x-y,z) \geq r$ for every $z \in N$, since otherwise $z \in B_r(x-y)$. By putting $r_0=\sup\{r:B_r(x-y) \cap N = \varnothing\},$ we see $d(x-y,z) \geq r_0$ for all $z \in N$ and indeed $r_0=\inf_{z \in N}d(x-y,z)>0$ (the verification can be done by contradiction). In general, $\inf_z d(x-y,z)=0$ if and only if $x-y \in \overline{N}$.

Next, we shall show that $\rho(\pi(x),\pi(y))=\rho(\pi(y),\pi(x))$, and it suffices to assume that $\pi(x) \neq \pi(y)$. Since $d$ is translation invariant, we get $$\begin{aligned} d(x-y,z)&=d(x-y-z,0) \\ &=d(0,y-x+z) \\ &=d(-z,y-x) \\ &=d(y-x,-z). \end{aligned}$$ As $z$ ranges over $N$, so does $-z$, because $N$ is a subspace. Therefore the $\inf$ of the left-hand side over $z \in N$ is equal to that of the right-hand side. The identity is proved.

Finally, we need to verify the triangle inequality. Let $r,s,t \in X$. For any $\varepsilon>0$, there exist some $z_\varepsilon, z_\varepsilon' \in N$ such that $d(r-s,z_\varepsilon)<\rho(\pi(r),\pi(s))+\frac{\varepsilon}{2},\quad d(s-t,z'_\varepsilon)<\rho(\pi(s),\pi(t))+\frac{\varepsilon}{2}.$ Since $d$ is invariant, we see $$\begin{aligned} d(r-t,z_\varepsilon+z'_\varepsilon)&=d((r-s)+(s-t)-(z_\varepsilon+z'_\varepsilon),0) \\ &=d([(r-s)-z_\varepsilon]+[(s-t)-z'_\varepsilon],0) \\ &=d(r-s-z_\varepsilon,t-s+z'_\varepsilon) \\ &\leq d(r-s-z_\varepsilon,0)+d(t-s+z'_\varepsilon,0) \\ &=d(r-s,z_\varepsilon)+d(s-t,z'_\varepsilon) \end{aligned}$$ (I owe [@LeechLattice](https://onp4.com/@leechlattice) for the inequality above.)

Therefore $$\begin{aligned} d(r-t,z_\varepsilon+z'_\varepsilon)&\leq d(r-s,z_\varepsilon)+d(s-t,z'_\varepsilon) \\ &<\rho(\pi(r),\pi(s))+\rho(\pi(s),\pi(t))+\varepsilon. \end{aligned}$$ (Warning: This does not imply that $\rho(\pi(r),\pi(s))+\rho(\pi(s),\pi(t))=\inf_z d(r-t,z)$ since we don't know whether it is the lower bound or not.)

If $\rho(\pi(r),\pi(s))+\rho(\pi(s),\pi(t))<\rho(\pi(r),\pi(t))$ however, let $0<\varepsilon<\rho(\pi(r),\pi(t))-(\rho(\pi(r),\pi(s))+\rho(\pi(s),\pi(t)))$; then $z''_\varepsilon=z_\varepsilon+z'_\varepsilon$, which lies in $N$ because $N$ is a subspace, satisfies $d(r-t,z''_\varepsilon)<\rho(\pi(r),\pi(t)),$ which is a contradiction since $\rho(\pi(r),\pi(t)) \leq d(r-t,z)$ for all $z \in N$.

(We are using the $\varepsilon$ characterization of $\inf$.)

### $\rho$ is translate invariant

Since $\pi$ is surjective, we see if $u \in X/N$, there exists some $a \in X$ such that $\pi(a)=u$. Therefore $$\begin{aligned} \rho(\pi(x)+u,\pi(y)+u) &=\rho(\pi(x)+\pi(a),\pi(y)+\pi(a)) \\ &=\rho(\pi(x+a),\pi(y+a)) \\ &=\inf_{z \in N}d(x+a-y-a,z) \\ &=\rho(\pi(x),\pi(y)). \end{aligned}$$

### $\rho$ is well-defined

If $\pi(x)=\pi(x')$ and $\pi(y)=\pi(y')$, we have to show that $\rho(\pi(x),\pi(y))=\rho(\pi(x'),\pi(y'))$. In fact, $$\begin{aligned} \rho(\pi(x),\pi(y)) &\leq \rho(\pi(x),\pi(x'))+\rho(\pi(x'),\pi(y'))+\rho(\pi(y'),\pi(y)) \\ &=\rho(\pi(x'),\pi(y')) \end{aligned}$$ since $\rho(\pi(x),\pi(x'))=0$ as $\pi(x)=\pi(x')$, and likewise $\rho(\pi(y'),\pi(y))=0$. Meanwhile $$\begin{aligned} \rho(\pi(x'),\pi(y')) &\leq \rho(\pi(x'),\pi(x)) + \rho(\pi(x),\pi(y)) + \rho(\pi(y),\pi(y')) \\ &= \rho(\pi(x),\pi(y)). \end{aligned}$$ Therefore $\rho(\pi(x),\pi(y))=\rho(\pi(x'),\pi(y'))$.

### $\rho$ is compatible with $\tau_N$

To prove this, we need to show that a set $E \subset X/N$ is open with respect to $\tau_N$ if and only if $E$ is a union of open balls. But we will show a more general statement:

If $\mathscr{B}$ is a local base for $\tau$, then the collection $\mathscr{B}_N$, which contains all sets $\pi(V)$ where $V \in \mathscr{B}$, forms a local base for $\tau_N$.

Proof. We already know that $\pi$ is continuous, linear and open. Therefore $\pi(V)$ is an open set containing $\pi(0)$ for all $V \in \mathscr{B}$. Conversely, let $E \subset X/N$ be any open set containing $\pi(0)$. Then $\pi^{-1}(E)$ is an open neighborhood of $0$ in $X$, so there is some $V \in \mathscr{B}$ with $V \subset \pi^{-1}(E)$, and therefore $\pi(V) \subset E.$ Hence $\mathscr{B}_N$ is a local base for $\tau_N$.

Now consider the local base $\mathscr{B}$ containing all open balls around $0 \in X$. Since $\pi(\{x:d(x,0)<r\})=\{u:\rho(u,\pi(0))<r\}$ we see $\rho$ determines $\mathscr{B}_N$. But we have already proved that $\rho$ is invariant; hence $\mathscr{B}_N$ determines $\tau_N$.

### If $d$ is complete, then $\rho$ is complete.

Once this is proved, we are able to claim that, if $X$ is a $F$-space, then $X/N$ is still a $F$-space, since its topology is induced by a complete invariant metric $\rho$.

Proof. Suppose $(x_n)$ is a Cauchy sequence in $X/N$, relative to $\rho$. There is a subsequence $(x_{n_k})$ with $\rho(x_{n_k},x_{n_{k+1}})<2^{-k}$. Since $\pi$ is surjective, we are able to pick inductively some $z_k \in X$ such that $\pi(z_k) = x_{n_k}$ and such that $d(z_{k},z_{k+1})<2^{-k}.$ (The existence follows from the definition of $\rho$ as an infimum; it can also be verified by contradiction.) By the inequality above, we see $(z_k)$ is Cauchy (can you see why?). Since $X$ is complete, $z_k \to z$ for some $z \in X$. By the continuity of $\pi$, we also see $x_{n_k} \to \pi(z)$ as $k \to \infty$. Therefore $(x_{n_k})$ converges. Hence $(x_n)$ converges, since a Cauchy sequence with a convergent subsequence converges. Hence $\rho$ is complete.
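One way to see why $(z_k)$ is Cauchy, sketched here with the geometric series trick: for $m>k$, the triangle inequality gives $$\begin{aligned} d(z_k,z_m) &\leq \sum_{j=k}^{m-1}d(z_j,z_{j+1}) \\ &< \sum_{j=k}^{m-1}2^{-j} \\ &< 2^{1-k}, \end{aligned}$$ and $2^{1-k} \to 0$ as $k \to \infty$.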

## Remarks

This fact will be used to prove some corollaries in the open mapping theorem. For instance, for any continuous linear map $\Lambda:X \to Y$, we see $\ker(\Lambda)$ is closed, therefore if $X$ is a $F$-space, then $X/\ker(\Lambda)$ is a $F$-space as well. We will show in the future that $X/\ker(\Lambda)$ and $\Lambda(X)$ are homeomorphic if $\Lambda(X)$ is of the second category.

There are more properties that can be inherited by $X/N$ from $X$. For example, normability, metrizability, local convexity. In particular, if $X$ is Banach, then $X/N$ is Banach as well. To do this, it suffices to define the quotient norm by $\lVert \pi(x) \rVert = \inf\{\lVert x-z \rVert:z \in N\}.$
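To make the quotient norm concrete, here is a minimal numerical sketch under stated assumptions: $X=\mathbb{R}^2$ with the Euclidean norm and the hypothetical closed subspace $N=\operatorname{span}\{(1,1)\}$ (neither is from the text). In this special case the infimum is attained at the orthogonal projection onto $N$, which is what `quotient_norm` uses; `brute_force_norm` is only a crude cross-check.

```python
import math

DIRECTION = (1.0, 1.0)  # hypothetical N = span{(1, 1)} inside X = R^2

def quotient_norm(x):
    """inf over t of ||x - t*DIRECTION|| (Euclidean), via orthogonal projection."""
    dot = x[0] * DIRECTION[0] + x[1] * DIRECTION[1]
    t = dot / (DIRECTION[0] ** 2 + DIRECTION[1] ** 2)   # minimizing scalar
    return math.hypot(x[0] - t * DIRECTION[0], x[1] - t * DIRECTION[1])

def brute_force_norm(x, steps=20001, span=10.0):
    """Crude check: minimize ||x - t*DIRECTION|| over a grid of t values."""
    best = float("inf")
    for i in range(steps):
        t = -span + 2 * span * i / (steps - 1)
        best = min(best, math.hypot(x[0] - t * DIRECTION[0], x[1] - t * DIRECTION[1]))
    return best

x = (3.0, 1.0)
x_shifted = (3.0 + 2.5, 1.0 + 2.5)   # same coset: differs from x by an element of N

exact = quotient_norm(x)
same_coset = quotient_norm(x_shifted)
approx = brute_force_norm(x)
```

Both representatives of the coset give the same value, which is the well-definedness proved above in the metric setting.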

# Introducing Riemann-Stieltjes Integral

## Motivation

Before going into it, we give several motivations for defining the Riemann-Stieltjes integral, which can be considered a generalization of the Riemann integral, the one everyone learns in calculus class.

When talking about $\int_a^b fdg$, one may simply think about $\int_a^b fg'dx$. But is it even necessary that $g$ is differentiable? What would happen if $g$ is merely continuous, or not even continuous? Further, given that $g$ is differentiable, can we prove that $\int_a^b f(x)dg(x)=\int_a^bf(x)g'(x)dx$ in a general way (without assuming $f$ is differentiable)?

Another motivation comes from probability theory. Oftentimes one needs to consider the discrete case ($\sum$) and the continuous case ($\int$) separately. One may say that an integral is the limit of sums, but it would be weird to write $\int$ as $\lim\sum$ every time. However, if we have a way to write a sum, for example the expected value of a discrete variable, as an integral, things would be easier. Of course, we don't want to write such a sum as another sum by adding up the integrals on several disjoint segments. That would be weirder.

If you have learned measure theory, you will know that the Lebesgue integral does not perfectly cover the Riemann integral. For example, $\int_{0}^{\infty}\frac{\sin{x}}{x}dx$ exists as an improper Riemann integral, but $\frac{\sin x}{x}$ is not integrable on $[0,\infty)$ in the sense of Lebesgue. So we cannot treat the Lebesgue integral as a generalization of the (improper) Riemann integral. In this blog post however, we are showing a direct generalization of the Riemann integral.

We are trying our best to prevent ourselves from using $\sup$, $\inf$, and differentiation theory. But $\varepsilon-\delta$ language is heavily used here, so make sure that you are good at it.

## Riemann-Stieltjes Integral

By a partition $P$ on $[a,b]$ we mean a finite sequence of numbers $(x_k)$ such that $a=x_0 \leq x_1 \leq \cdots \leq x_n=b$, and we define its size by $\sigma(P)=\max_{k}(x_{k+1}-x_k).$ Let $f$, $g$ be bounded real functions on $[a,b]$ (again, no continuity or differentiability required). Given a partition $P$ and numbers $c_k$ with $x_k \leq c_k \leq x_{k+1}$, we define the Riemann-Stieltjes sum (RS-sum) by $S(P,f,g)=\sum_{k=0}^{n-1}f(c_k)[g(x_{k+1})-g(x_k)].$ We say that the limit $\lim_{\sigma(P) \to 0}S(P,f,g)$ exists if there exists some $L \in \mathbb{R}$ such that, given $\varepsilon>0$, there exists $\delta>0$ such that whenever $\sigma(P)<\delta$, we have $|S(P,f,g)-L|<\varepsilon$ for every choice of the $c_k$. In this case, we say $f$ is RS(g)-integrable, and the limit is denoted by $\int_a^bfdg.$ This is the so-called Riemann-Stieltjes integral. When $g(x)=x$, we get the Riemann integral naturally.
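The RS-sum is easy to compute directly. The following is a minimal sketch, not a general-purpose integrator: the helper names `rs_sum`, `uniform_partition`, `midpoints` and the choices $f(x)=x$, $g(x)=x^2$ on $[0,1]$ are illustrative assumptions. It evaluates $S(P,f,g)$ on uniform partitions with midpoints as the $c_k$.

```python
def rs_sum(f, g, points, tags):
    """Riemann-Stieltjes sum S(P, f, g) for partition `points` and tags c_k."""
    return sum(
        f(c) * (g(right) - g(left))
        for left, right, c in zip(points, points[1:], tags)
    )

def uniform_partition(a, b, n):
    """n + 1 equally spaced points from a to b, so sigma(P) = (b - a) / n."""
    return [a + (b - a) * k / n for k in range(n + 1)]

def midpoints(points):
    """One admissible choice of tags: the midpoint of each subinterval."""
    return [(left + right) / 2 for left, right in zip(points, points[1:])]

f = lambda x: x          # integrand (illustrative choice)
g = lambda x: x * x      # integrator (illustrative choice)

approximations = {}
for n in (10, 100, 1000):
    pts = uniform_partition(0.0, 1.0, n)
    approximations[n] = rs_sum(f, g, pts, midpoints(pts))
```

As $\sigma(P)=1/n$ shrinks, the sums settle near $2/3$, which agrees with $\int_0^1 x \cdot 2x\,dx$.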

### Remarks: Further generalization still available

This integral can be generalized to Banach spaces. Let $f$, $g$ be bounded maps of $[a,b]$ into Banach spaces $E$, $F$ respectively. Assume we have a product $E \times F \to G$ denoted by $(u,v) \mapsto uv$ with $\lVert uv \rVert \leq \lVert u \rVert \lVert v \rVert$. Then by replacing the absolute value by the norm, we still get the Riemann-Stieltjes integral, although in this case we have $\int_a^b fdg \in G$ and $G$ need not be $\mathbb{R}$. This is different from the Bochner integral, since no measure theory is involved here.

## Linearity with respect to $f$ and $g$

First, we shall show that RS(g)-integrable functions form a vector space. To do this, it suffices to show that $f \mapsto S(P,f,g)$ and $g \mapsto S(P,f,g)$ are linear. This follows directly from the definition of RS-sum. Let's see the result.

Suppose we have $\int_a^b fdg=I, \quad \int_a^b hdg=J, \quad \int_a^b fdu=K.$ Then we have the following identities for $\alpha \in \mathbb{R}$.

1. $\int_a^b \alpha fdg=\alpha I$.
2. $\int_a^b (f+h)dg=I+J$.
3. $\int_a^bfd(g+u)=I+K$.
4. $\int_a^b fd(\alpha g)=\alpha I$.

Proof. We shall show 2 as an example. The other three identities follow in the same way.

Notice that the existence of the limit of RS-sum depends only on the size of $P$. For $\varepsilon>0$, there exist some $\delta_1,\delta_2>0$ such that $|S(P,f,g)-I|<\frac{\varepsilon}{2},\quad |S(P,h,g)-J| < \frac{\varepsilon}{2}$ when $\sigma(P)<\delta_1$ and $\sigma(P)<\delta_2$ respectively. By picking $\delta=\min(\delta_1,\delta_2)$, we see for $\sigma(P)<\delta$, we have $$\begin{aligned} |S(P,f+h,g)-(I+J)|&=|(S(P,f,g)-I)+(S(P,h,g)-J)| \\ &\leq |S(P,f,g)-I| + |S(P,h,g)-J| \\ &< \frac{\varepsilon}{2}+\frac{\varepsilon}{2}=\varepsilon. \end{aligned}$$

## Integration by parts but no differentiation

$f \in RS(g)$ if and only if $g \in RS(f)$. In this case, we also have integration by parts: $\int_a^b fdg + \int_a^b gdf=f(b)g(b)-f(a)g(a)$

You may not believe it, but differentiation does not play any role here, as promised at the beginning.

Proof. Using the summation by parts (by Abel), we have $$\begin{aligned} S(P,f,g)&=\sum_{k=0}^{n-1}f(c_k)[g(x_{k+1})-g(x_k)] \\ &=-\sum_{k=1}^{n-1}g(x_k)[f(c_k)-f(c_{k-1})]+f(c_{n-1})g(b)-f(c_0)g(a). \end{aligned}$$ By writing $S(P,f,g)=S(P,f,g)+f(a)g(a)-f(a)g(a)+f(b)g(b)-f(b)g(b)$ we have $S(P,f,g)=f(b)g(b)-f(a)g(a)-S(Q,g,f)$ where $S(Q,g,f)=\sum_{k=1}^{n-1}g(x_k)[f(c_k)-f(c_{k-1})]+[f(b)-f(c_{n-1})]g(b)+[f(c_0)-f(a)]g(a).$ Consider the partition $Q$ given by $y_k=\begin{cases} a &\quad k=0, \\ c_{k-1}&\quad 1 \leq k \leq n, \\ b &\quad k=n+1. \end{cases}$ Since $y_k \leq x_k \leq y_{k+1}$ for $0 \leq k \leq n$, we may take $x_0,x_1,\cdots,x_n$ to be the intermediate points, and $S(Q,g,f)=\sum_{k=0}^{n}g(x_k)[f(y_{k+1})-f(y_k)].$ Since $0 < \sigma(Q) \leq 2\sigma(P) \leq 4\sigma(Q)$, when $\sigma(P) \to 0$, we also have $\sigma(Q) \to 0$ and vice versa. Suppose now $\int_a^b gdf$ exists; then $\lim_{\sigma(P) \to 0}S(P,f,g)=f(b)g(b)-f(a)g(a)-\int_a^bgdf=\int_a^bfdg.$ And integration by parts follows.

Conversely, suppose $\int_a^bfdg$ exists; then the same identity gives $\lim_{\sigma(Q) \to 0}S(Q,g,f)=f(b)g(b)-f(a)g(a)-\int_a^b fdg=\int_a^b gdf.$ The proposition is proved. $\square$
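The algebraic identity behind the proof, $S(P,f,g)+S(Q,g,f)=f(b)g(b)-f(a)g(a)$, holds for every tagged partition, so it can be checked numerically. The sketch below is only an illustration: the random partition, the helper `rs_sum`, and the choices $f=\sin$, $g(x)=|x-1|$ are assumptions made for the example. Here $Q$ consists of $a$, the tags of $P$, and $b$, and the points of $P$ serve as the tags of $Q$.

```python
import math
import random

def rs_sum(f, g, points, tags):
    """Riemann-Stieltjes sum of f against g; tags[k] lies in [points[k], points[k+1]]."""
    return sum(
        f(c) * (g(right) - g(left))
        for left, right, c in zip(points, points[1:], tags)
    )

random.seed(0)
a, b, n = 0.0, 2.0, 50

# Tagged partition P: sorted random interior points plus a tag in every subinterval.
xs = [a] + sorted(random.uniform(a, b) for _ in range(n - 1)) + [b]
cs = [random.uniform(left, right) for left, right in zip(xs, xs[1:])]

# Partition Q: the endpoints plus the tags of P; the points of P become its tags.
ys = [a] + cs + [b]

f = math.sin                      # illustrative choice
g = lambda x: abs(x - 1.0)        # illustrative choice, not differentiable at 1

lhs = rs_sum(f, g, xs, cs) + rs_sum(g, f, ys, xs)
rhs = f(b) * g(b) - f(a) * g(a)
```

The two sides agree up to rounding even though $g$ is not differentiable, which is the point of the section title.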

## The flexibility of Riemann-Stieltjes integral

As said before, we want to represent both continuous and discrete case using integral. For measure theory, we have Lebesgue measure and counting measure. But in some cases, this can be done using Riemann-Stieltjes integral as well. Ordinary Riemann integral and finite or infinite series are both special cases of Riemann-Stieltjes integral.

### From integral to series (discrete case)

To do this, we need the unit step function defined by $I(x)=\begin{cases} 0 \quad x \leq 0, \\ 1 \quad x > 0 .\end{cases}$

If $a<s<b$, $f$ is bounded on $[a,b]$ and continuous at $s$, by putting $g(x)=I(x-s)$, we have $\int_a^b fdg=f(s)$

Proof. A simple verification shows that $\int_a^b fdg=\int_s^b fdg$ (by unwinding the RS-sum, one sees immediately that $g(x_k)=0$ for all $x_k\leq s$, therefore the part of the partition before $s$ makes no contribution to the value of the integral). Now consider the partition $P$ by $s=x_0<x_1<\cdots<x_n=b.$ We see $S(P,f,g)=\sum_{k=0}^{n-1}f(c_k)(g(x_{k+1})-g(x_k))=f(c_0)(g(x_1)-g(x_0))=f(c_0).$ As $x_1 \to s$, we have $c_0 \to s$, and since $f$ is continuous at $s$, we have $f(c_0) \to f(s)$ as desired. $\square$
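A quick numerical illustration of this statement, under assumed values ($[a,b]=[0,1]$, $s=1/3$, $f=\cos$; the helper names are hypothetical): with $g(x)=I(x-s)$, only the subinterval containing the jump contributes to the RS-sum.

```python
import math

def rs_sum(f, g, points, tags):
    """Riemann-Stieltjes sum S(P, f, g)."""
    return sum(
        f(c) * (g(right) - g(left))
        for left, right, c in zip(points, points[1:], tags)
    )

def step(x):
    """Unit step I(x): 0 for x <= 0, 1 for x > 0."""
    return 1.0 if x > 0 else 0.0

a, b, s = 0.0, 1.0, 1 / 3      # illustrative interval and jump location
f = math.cos                   # continuous at s
g = lambda x: step(x - s)

n = 10_000
pts = [a + (b - a) * k / n for k in range(n + 1)]
tags = [(left + right) / 2 for left, right in zip(pts, pts[1:])]
value = rs_sum(f, g, pts, tags)
```

The result is $f$ evaluated at a tag within $\sigma(P)$ of $s$, so it approaches $f(s)$ as the partition is refined.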

By the linearity of the RS integral, it's easy to generalize this to the case of a finite linear combination. Namely, for $g(x)=\sum_{k=1}^{n}c_kI(x-s_k)$, we have $\int_a^b fdg=\sum_{k=1}^{n}c_kf(s_k).$ Now we discuss the infinite case.

Suppose $c_n \geq 0$ for all $n$ and $\sum_n c_n$ converges, $(s_n)$ is a sequence of distinct points in $(a,b)$, and $g(x)=\sum_nc_nI(x-s_n).$ Let $f$ be continuous on $[a,b]$. Then $\int_a^b fdg=\sum_{n}c_nf(s_n)$

Proof. First it's easy to see that $g(x)$ converges for every $x$, and is monotonic with $g(a)=0$, $g(b)=\sum_n c_n$. For given $\varepsilon>0$, there exists some $N$ such that $\sum_{N+1}^{\infty}c_n<\varepsilon.$ Put $g_1(x)=\sum_{n=1}^{N}c_nI(x-s_n),\quad g_2(x)=\sum_{N+1}^{\infty}c_nI(x-s_n)=g(x)-g_1(x)$ we have $\int_a^b fdg_1=\sum_{n=1}^{N}c_nf(s_n).$ By putting $M=\sup|f(x)|$, we see $\left\vert\int_a^b fdg_2 \right\vert=\left\vert\int_a^b fdg-\int_a^bfdg_1 \right\vert=\left\vert\int_a^b fdg-\sum_{n=1}^{N}c_nf(s_n)\right\vert \leq M\varepsilon$ The inequality holds since $g_2(b)-g_2(a)<\varepsilon$. Since $M$ is finite, when $N \to \infty$, we have the desired result.
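The series formula can also be explored numerically. This sketch makes several assumptions that are not in the text: weights $c_n=2^{-n}$, jump points $s_n=1/(n+1)$ in $(0,1)$, $f=\exp$, and a truncation of the series at 40 terms (the neglected tail is below $2^{-40}$). It compares an RS-sum on a fine uniform partition with the truncated series $\sum_n c_nf(s_n)$.

```python
import math

def rs_sum(f, g, points, tags):
    """Riemann-Stieltjes sum S(P, f, g)."""
    return sum(
        f(c) * (g(right) - g(left))
        for left, right, c in zip(points, points[1:], tags)
    )

def step(x):
    """Unit step I(x): 0 for x <= 0, 1 for x > 0."""
    return 1.0 if x > 0 else 0.0

N_TERMS = 40                                              # truncation level (assumption)
weights = [2.0 ** -n for n in range(1, N_TERMS + 1)]      # c_n = 2^{-n}
jumps = [1.0 / (n + 1) for n in range(1, N_TERMS + 1)]    # s_n = 1/(n+1), distinct, in (0, 1)

def g(x):
    """Truncated integrator: sum of c_n * I(x - s_n)."""
    return sum(c * step(x - s) for c, s in zip(weights, jumps))

f = math.exp
n = 5_000
pts = [k / n for k in range(n + 1)]
tags = [(left + right) / 2 for left, right in zip(pts, pts[1:])]

integral = rs_sum(f, g, pts, tags)
series = sum(c * f(s) for c, s in zip(weights, jumps))
```

Each jump $s_n$ contributes $c_n$ times the value of $f$ at a tag within $\sigma(P)$ of $s_n$, so `integral` and `series` differ only by a small amount controlled by the continuity of $f$.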

### Transformed into ordinary Riemann integral (continuous case)

Finally we are discussing some differentiation. The following theorem shows the connection between RS integral and Riemann integral.

Let $f$ be continuous and suppose that $g$ is real differentiable on $[a,b]$ while $g'$ is Riemann integrable as well, then $f \in RS(g)$ and $\int_a^b fdg=\int_a^b fg'dx$

Proof. By the mean value theorem, for each $k$ there exists some $\zeta_k \in [x_k,x_{k+1}]$ such that $g(x_{k+1})-g(x_k)=g'(\zeta_k)(x_{k+1}-x_k).$ The RS-sum can be written as $S(P,f,g)=\sum_{k=0}^{n-1}f(c_k)[g(x_{k+1})-g(x_k)]=\sum_{k=0}^{n-1}f(c_k)g'(\zeta_k)(x_{k+1}-x_k).$ Since $g'$ is Riemann integrable, we have $\sum_{k=0}^{n-1}|g'(c_k)-g'(\zeta_k)|(x_{k+1}-x_k) <\varepsilon$ whenever $\sigma(P)$ is small enough. Therefore $\left\vert\sum_{k=0}^{n-1}f(c_k)g'(\zeta_k)(x_{k+1}-x_k)-\sum_{k=0}^{n-1}f(c_k)g'(c_k)(x_{k+1}-x_k)\right\vert\leq M\varepsilon$ where $M=\sup|f(x)|<\infty$ ($f$ is continuous on $[a,b]$, hence bounded). Also notice that $fg'$ is integrable since $f$ is continuous. Therefore $$\begin{aligned} \left\vert S(P,f,g)-\int_a^bfg'dx \right\vert&=\left\vert S(P,f,g)-S(P,fg',x)+S(P,fg',x)-\int_a^bfg'dx \right\vert \\ &\leq \left\vert S(P,f,g)-S(P,fg',x) \right\vert+\left\vert S(P,fg',x)-\int_a^bfg'dx \right\vert \\ &< (M+1)\varepsilon. \end{aligned}$$ Therefore, $\int_a^bfdg=\int_a^b fg'dx,$ which proves the theorem. $\square$

To sum up, given $\varepsilon>0$, there exists some $\delta>0$ such that if $\sigma(P)<\delta$, we have $\left|S(P,g',x)-\int_a^b g'dx\right|<\varepsilon/(M+1)$ and $\left\vert S(P,fg',x)-\int_a^bfg'dx \right\vert<\varepsilon/(M+1).$

After some estimation, we get $\left|S(P,f,g)-\int_{a}^{b}fg'dx \right|<(M+1)\frac{\varepsilon}{M+1}=\varepsilon.$
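The two sums compared in the proof, $S(P,f,g)$ and $S(P,fg',x)$, can be placed side by side numerically. The sketch below assumes $f=\cos$ and $g(x)=x^3$ on $[0,1]$ (illustrative choices, not from the text), for which $\int_0^1 3x^2\cos x\,dx=3(2\cos 1-\sin 1)$ by two integrations by parts.

```python
import math

def rs_sum(f, g, points, tags):
    """Riemann-Stieltjes sum S(P, f, g)."""
    return sum(
        f(c) * (g(right) - g(left))
        for left, right, c in zip(points, points[1:], tags)
    )

f = math.cos                       # continuous integrand (illustrative)
g = lambda x: x ** 3               # differentiable integrator (illustrative)
g_prime = lambda x: 3 * x ** 2     # its derivative, Riemann integrable

n = 1_000
pts = [k / n for k in range(n + 1)]
tags = [(left + right) / 2 for left, right in zip(pts, pts[1:])]

stieltjes = rs_sum(f, g, pts, tags)                                     # S(P, f, g)
riemann = rs_sum(lambda x: f(x) * g_prime(x), lambda x: x, pts, tags)   # S(P, f g', x)
exact = 3 * (2 * math.cos(1.0) - math.sin(1.0))
```

Both sums approach the same value as $\sigma(P) \to 0$, as the theorem predicts.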