# The Riesz-Markov-Kakutani Representation Theorem

## This post

This post is intended to prepare the ground for establishing the existence of the Lebesgue measure, often denoted by \(m\), in a future post. In fact, the Lebesgue measure follows as a special case of the R-M-K representation theorem. You may not believe it, but the euclidean properties of \(\mathbb{R}^k\) play no role in the existence of \(m\). The only topological property that matters is that \(\mathbb{R}^k\) is a locally compact Hausdorff space.

The theorem is named after F. Riesz, who introduced it for continuous functions on \([0,1]\) (with respect to the Riemann-Stieltjes integral). Years later, after the generalizations by A. Markov and S. Kakutani, we are able to state it on a locally compact Hausdorff space.

You may find some of the properties below more general than strictly necessary, but this is intentional: the extra generality lets you enjoy more along the way (some of the tools are related to differential geometry). There are also many topology and analysis tricks worth your attention.

## Tools

### Different kinds of topological spaces

Again, euclidean topology plays no role in this proof. We need to specify the topology for different reasons. This is similar to what we do in linear functional analysis. Throughout, let \(X\) be a topological space.

**0.0 Definition.** \(X\) is a *Hausdorff space* if the
following is true: If \(p \in X\),
\(q\in X\) but \(p \neq q\), then there are two
**disjoint** open sets \(U\) and \(V\) such that \(p
\in U\) and \(q \in V\).

**0.1 Definition.** \(X\) is *locally compact* if every
point of \(X\) has a neighborhood whose
closure is compact.

**0.2 Remarks.** A Hausdorff space is also called a
\(T_2\) space (see Kolmogorov
classification) or a separated space. The classic example of a
locally compact Hausdorff space is \(\mathbb{R}^n\), and it is easy to verify
this. But this example is far from being the whole story. As we will see later, we
can construct some ridiculous but mathematically valid measures.

**0.3 Definition.** A set \(E
\subset X\) is called *\(\sigma\)-compact* if \(E\) is a countable union of compact sets.
Note that every open subset of a euclidean space \(\mathbb{R}^n\) is \(\sigma\)-compact, since it can always be written as a
countable union of closed balls (which are compact).

**0.4 Definition.** A covering of \(X\) is *locally finite* if every
point has a neighborhood which intersects only finitely many elements of
the covering. Of course, if the covering is already finite, it's also
locally finite.

**0.5 Definition.** A *refinement* of a covering
of \(X\) is a second covering, each
element of which is contained in an element of the first covering.

**0.6 Definition.** \(X\) is *paracompact* if it is
Hausdorff, and every open covering has a locally finite open refinement.
Obviously any compact Hausdorff space is paracompact.

**0.7 Theorem.** If \(X\) is a second countable Hausdorff space
and is locally compact, then \(X\) is
paracompact. For proof, see this
[Theorem 2.6]. One uses this to prove that a differentiable manifold
admits a partition of unity.

**0.8 Theorem.** If \(X\) is locally compact and \(\sigma\)-compact,
then \(X=\bigcup_{i=1}^{\infty}K_i\)
where for all \(i \in \mathbb{N}\),
\(K_i\) is compact and \(K_i \subset\operatorname{int}K_{i+1}\).
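
As a concrete instance of 0.8 (a sketch in the familiar case \(X=\mathbb{R}^n\), which is not needed for the general theory), one may take closed balls centred at the origin: \[ K_i=\{x \in \mathbb{R}^n:|x| \leq i\}, \qquad \operatorname{int}K_{i+1}=\{x \in \mathbb{R}^n:|x|<i+1\}. \] Each \(K_i\) is compact by the Heine-Borel theorem, \(K_i \subset \operatorname{int}K_{i+1}\) because \(i<i+1\), and \(\bigcup_{i=1}^{\infty}K_i=\mathbb{R}^n\) because every \(x\) satisfies \(|x| \leq i\) for some \(i \in \mathbb{N}\).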

### Partition of unity

The basic technical tool in the theory of differentiable manifolds is the existence of a partition of unity. We will borrow this tool for use in analysis.

**1.0 Definition.** A **partition of
unity** on \(X\) is a collection
\((g_i)\) of continuous real valued
functions on \(X\) such that

- \(g_i \geq 0\) for each \(i\).
- every \(x \in X\) has a neighborhood \(U\) such that \(U \cap \operatorname{supp}(g_i)=\varnothing\) for all but finitely many of \(g_i\).
- for each \(x \in X\), we have \(\sum_{i}g_i(x)=1\). (That's why you see the word 'unity'.)

Keep in mind that partitions of unity are frequently used in many other fields. For example, in differential geometry, one uses them to construct a Riemannian structure on a smooth manifold. In the theory of generalised functions, one uses them to connect local properties with global properties as well.

**1.1 Definition.** A partition of unity \((g_i)\) on \(X\) is *subordinate* to an open
cover of \(X\) if and only if for each
\(g_i\) there is an element \(U\) of the cover such that \(\operatorname{supp}(g_i) \subset U\). We
say \(X\) *admits* partitions of
unity if and only if for every open cover of \(X\), there exists a partition of unity
subordinate to the cover.
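
To see 1.0 and 1.1 at work, here is a small example on \(X=\mathbb{R}\); the cover and the functions are chosen purely for illustration. Take the open cover \(V_1=(-\infty,2)\), \(V_2=(-1,\infty)\) and put \[ g_1(x)=\begin{cases}1, & x \leq 0,\\ 1-x, & 0<x<1,\\ 0, & x \geq 1,\end{cases} \qquad g_2(x)=1-g_1(x). \] Both functions are continuous and nonnegative, \(g_1+g_2=1\) everywhere, and \[ \operatorname{supp}(g_1)=(-\infty,1] \subset V_1, \qquad \operatorname{supp}(g_2)=[0,\infty) \subset V_2, \] so \((g_1,g_2)\) is a partition of unity subordinate to \((V_1,V_2)\). Note that the same functions are not subordinate to the smaller cover \((-\infty,1)\), \((0,\infty)\): supports are closed, so \(\operatorname{supp}(g_1)\) contains the point \(1\).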

**1.2 Theorem.** A Hausdorff space admits a partition of
unity if and only if it is paracompact (the 'only if' part is by
considering the definition of partition of unity. For the 'if' part, see
here).
As a corollary, we have:

**1.3 Corollary.** Suppose \(V_1,\cdots,V_n\) are open subsets of a
locally compact Hausdorff space \(X\),
\(K\) is compact, and \[
K \subset \bigcup_{k=1}^{n}V_k.
\] Then there exist continuous functions \(h_1,\cdots,h_n\) on \(X\) with compact support and values in \([0,1]\), subordinate to the cover
\((V_i)\) in the sense that \(\operatorname{supp}(h_i) \subset V_i\) for each \(i\), such that
\(\sum_{i=1}^{n}h_i(x)=1\) for all \(x \in K\). We call \((h_i)\) a partition of unity on \(K\) subordinate to \((V_i)\).

### Urysohn's lemma (for locally compact Hausdorff spaces)

**2.0 Notation.** The notation \[
K \prec f
\] will mean that \(K\) is a
compact subset of \(X\), that \(f \in C_c(X)\), that \(f(X) \subset [0,1]\), and that \(f(x)=1\) for all \(x \in K\). The notation \[
f \prec V
\] will mean that \(V\) is open,
that \(f \in C_c(X)\), that \(f(X) \subset [0,1]\) and that \(\operatorname{supp}(f) \subset V\). If both
hold, we write \[
K \prec f \prec V.
\]

**2.1 Remarks.** Clearly, with this notation, we
are able to simplify the statement of being subordinate. When \(g_i\) has compact support, we merely need
to write \(g_i \prec U\) in 1.1 instead
of \(\operatorname{supp}(g_i) \subset
U\).

**2.2 Urysohn's Lemma for locally compact Hausdorff
space.** Suppose \(X\) is
locally compact and Hausdorff, \(V\) is
open in \(X\) and \(K \subset V\) is a compact set. Then there
exists an \(f \in C_c(X)\) such that
\[
K \prec f \prec V.
\]

**2.3 Remarks.** By \(f
\in C_c(X)\) we shall mean that \(f\)
is a continuous function with compact support. This relation also says
that \(\chi_K \leq f \leq \chi_V\). For
more details and the proof, visit this
page. The lemma is usually stated for normal spaces; for a proof at that
level, see arXiv:1910.10381.
(Question: why do we consider two disjoint closed subsets there?)
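
For intuition, here is an explicit function of this kind on \(X=\mathbb{R}\); the sets \(K=[0,1]\) and \(V=(-1,2)\) are chosen purely for illustration. Writing \(d(x,K)\) for the distance from \(x\) to \(K\), put \[ f(x)=\max\{0,\,1-2d(x,K)\}. \] Then \(f\) is continuous with values in \([0,1]\), \(f(x)=1\) exactly when \(x \in K\), and \(f(x)>0\) exactly when \(d(x,K)<\frac{1}{2}\), so \[ \operatorname{supp}(f)=\left[-\tfrac{1}{2},\tfrac{3}{2}\right] \subset V. \] Hence \(K \prec f \prec V\). In a general locally compact Hausdorff space there is no metric to lean on, which is why the lemma needs a proof.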

### The \(\varepsilon\)-definitions of \(\sup\) and \(\inf\)

We will be using the \(\varepsilon\)-definitions of \(\sup\) and \(\inf\), which make the proof easier in this case but can be troublesome if you have not seen them before. So we put them down here.

Let \(S\) be a nonempty subset of the real numbers that is bounded below. A lower bound \(w\) of \(S\) is the infimum of \(S\) if and only if for any \(\varepsilon>0\), there exists an element \(x_\varepsilon \in S\) such that \(x_\varepsilon<w+\varepsilon\).

This definition of \(\inf\) is equivalent to the usual if-then definition:

Let \(S\) be a nonempty set that is bounded below. We say \(w=\inf S\) when \(w\) satisfies the following conditions.

- \(w\) is a lower bound of \(S\).
- If \(t\) is also a lower bound of \(S\), then \(t \leq w\).

We have the analogous definition for \(\sup\).
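
A quick worked example may help (the set \(S\) here is chosen purely for illustration). Take \(S=\{1/n:n \in \mathbb{N}\}\) and \(w=0\), which is a lower bound of \(S\). Given \(\varepsilon>0\), pick an integer \(n>1/\varepsilon\) and put \(x_\varepsilon=1/n\). Then \[ x_\varepsilon=\frac{1}{n}<\varepsilon=w+\varepsilon, \] so \(\inf S=0\). By contrast, the lower bound \(w'=-1\) fails the test: for \(\varepsilon=\frac{1}{2}\), no element of \(S\) is smaller than \(w'+\varepsilon=-\frac{1}{2}\). The analogous statement for \(\sup\) reads: an upper bound \(v\) of \(S\) is the supremum of \(S\) if and only if for any \(\varepsilon>0\) there exists \(x_\varepsilon \in S\) with \(x_\varepsilon>v-\varepsilon\).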

## The main theorem

Analysis is full of vector spaces and linear transformations. We
already know that the Lebesgue integral induces a linear functional.
That is, for example, \(L^1([0,1])\) is
a vector space, and we have a linear functional by \[
f \mapsto \int_0^1 f(x)dx.
\] But what about the reverse? Given a linear functional, is it
guaranteed that we have a measure to establish the integral? The R-M-K
theorem answers this question affirmatively. The functional to be
discussed is *positive*, which means that \(\Lambda{f} \in [0,\infty)\) whenever \(f \in C_c(X)\) and \(f(X) \subset [0,\infty)\).

Let \(X\) be a locally compact Hausdorff space, and let \(\Lambda\) be a positive linear functional on \(C_c(X)\). Then there exists a \(\sigma\)-algebra \(\mathfrak{M}\) on \(X\) which contains all Borel sets in \(X\), and there exists a unique positive measure \(\mu\) on \(\mathfrak{M}\) which represents \(\Lambda\) in the sense that \[ \Lambda{f}=\int_X fd\mu \] for all \(f \in C_c(X)\).

For the measure \(\mu\) and the \(\sigma\)-algebra \(\mathfrak{M}\), we have four assertions:

- \(\mu(K)<\infty\) for every compact set \(K \subset X\).
- For every \(E \in \mathfrak{M}\), we have
\[ \mu(E)=\inf\{\mu(V):E \subset V, V\text{ open}\}. \]

- For every open set \(E\), and for every \(E \in \mathfrak{M}\) with \(\mu(E)<\infty\), we have
\[ \mu(E)=\sup\{\mu(K):K \subset E, K\text{ compact}\}. \]

- If \(E \in \mathfrak{M}\), \(A \subset E\), and \(\mu(E)=0\), then \(A \in \mathfrak{M}\).
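
As orientation, here is the instance this post is ultimately aiming at, stated as a sketch rather than proved. On \(X=\mathbb{R}\), the Riemann integral defines a functional on \(C_c(\mathbb{R})\) by \[ \Lambda f=\int_{-\infty}^{\infty}f(x)dx, \] which makes sense because \(f\) is continuous and vanishes outside a bounded interval. It is linear, and it is positive since \(f \geq 0\) implies \(\int f \geq 0\). The measure \(\mu\) that the theorem produces for this \(\Lambda\) is the Lebesgue measure \(m\) on \(\mathbb{R}\).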

**Remarks before proof.** It would be great if we could
establish the Lebesgue measure \(m\) simply by
putting \(X=\mathbb{R}^n\). But we need
a little extra work to get this result naturally. If assertion 2 is
satisfied, we say \(\mu\) is
*outer* regular, and *inner* regular for assertion 3. If both hold,
we say \(\mu\) is *regular*. The
partition of unity and Urysohn's lemma will be heavily used in the proof
of the main theorem, so make sure you have no problem with them. The theorem can
also be extended to the complex case, but that requires much non-trivial
work.

### Proving the theorem

The proof is rather long so we will split it into several steps. I will try my best to make every line clear enough.

#### Step 0 - Construction of \(\mu\) and \(\mathfrak{M}\)

For every open set \(V \subset X\), define \[ \mu(V)=\sup\{\Lambda{f}:f \prec V\}. \]

If \(V_1 \subset V_2\) and both are
open, we claim that \(\mu(V_1) \leq
\mu(V_2)\). Indeed, for \(f \prec V_1\),
since \(\operatorname{supp}(f) \subset V_1
\subset V_2\), we see \(f \prec
V_2\), and therefore \(\Lambda{f} \leq \mu(V_2)\) by the definition of \(\mu(V_2)\). Taking the supremum over all \(f \prec V_1\), we see \[
\mu(V_1) \leq \mu(V_2).
\] It is worth recording that \(\Lambda\) itself is monotonic: if \(f,g \in C_c(X)\) and \(f \leq g\), then \(\Lambda{g}-\Lambda{f}=\Lambda{(g-f)}\geq
0\), since \(\Lambda\) is positive and linear. Such a \(g\) dominating a given \(f \prec V_1\) can always be found inside \(V_2\). By taking another look at the proof of
Urysohn's lemma for locally compact Hausdorff spaces, we see there is an
open set \(G\) with compact closure such that \[
\operatorname{supp}(f) \subset G \subset \overline{G} \subset V_2.
\] Applying Urysohn's lemma to the pair \((\overline{G},V_2)\), we see there exists a
function \(g \in C_c(X)\) such that
\[
\overline{G} \prec g \prec V_2.
\] Since \(g=1\) on \(\operatorname{supp}(f)\) and \(0 \leq f \leq 1\), we get \(g \geq f\), hence \(\Lambda{f} \leq \Lambda{g} \leq \mu(V_2)\), which gives the same conclusion.

The 'monotonic' property of \(\mu\) enables us to *define* \(\mu(E)\) for all \(E \subset X\) by \[
\mu(E)=\inf \{\mu(V):E \subset V, V\text{ open}\}.
\] This definition trivially agrees with the previous one when \(E\) is open, since then \(E\) itself is the smallest open set containing \(E\).
Sometimes people call this \(\mu\) the
outer measure. We will discuss other kinds of sets thoroughly in the
following steps. Warning: we are not saying that \(\mathfrak{M} = 2^X\). The crucial property
of \(\mu\), namely countable
additivity, will be proved only on a certain \(\sigma\)-algebra.

It follows from the definition of \(\mu\) that if \(E_1 \subset E_2\), then \(\mu(E_1) \leq \mu(E_2)\).
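
To see the construction at work on a toy case, assume (purely as an illustration) that \(\Lambda\) is evaluation at a fixed point \(x_0 \in X\), that is, \(\Lambda f=f(x_0)\). This is a positive linear functional on \(C_c(X)\). For an open set \(V\) we get \[ \mu(V)=\sup\{f(x_0):f \prec V\}=\begin{cases}1, & x_0 \in V,\\ 0, & x_0 \notin V.\end{cases} \] Indeed, if \(x_0 \notin V\), then every \(f \prec V\) vanishes at \(x_0\); if \(x_0 \in V\), Urysohn's lemma gives some \(f\) with \(\{x_0\} \prec f \prec V\), so \(f(x_0)=1\), and no \(f \prec V\) exceeds \(1\). For an arbitrary \(E \subset X\), the infimum over open \(V \supset E\) gives \(\mu(E)=1\) if \(x_0 \in E\) and \(\mu(E)=0\) otherwise, because \(X-\{x_0\}\) is an open set (points are closed in a Hausdorff space) of measure \(0\) that contains every \(E\) not containing \(x_0\). In other words, \(\mu\) is the Dirac measure at \(x_0\).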

Let \(\mathfrak{M}_F\) be the class of all \(E \subset X\) which satisfy the two following conditions:

- \(\mu(E) <\infty\).
- 'Inner regular': \[ \mu(E)=\sup\{\mu(K):K \subset E, K\text{ compact}\}. \]

One may say here \(\mu\) is the 'inner measure'. Finally, let \(\mathfrak{M}\) be the class of all \(E \subset X\) such that for every compact \(K\), we have \(E \cap K \in \mathfrak{M}_F\). We shall show that \(\mathfrak{M}\) is the desired \(\sigma\)-algebra.

**Remarks of Step 0.** So far, we have only proved that
\(\mu(E) \geq 0\) for all \(E {\color{red}{\subset}}X\). What about the
countable additivity? It's clear that \(\mathfrak{M}_F\) and \(\mathfrak{M}\) have some strong relation. We
need to get a clearer view of it. Also, if we restrict \(\mu\) to \(\mathfrak{M}_F\), we restrict ourselves to
finite numbers. In fact, we will finally show that \(\mathfrak{M}_F \subset \mathfrak{M}\).

#### Step 1 - The 'measure' of compact sets (outer)

If \(K\) is compact, then \(K \in \mathfrak{M}_F\), and \[ \mu(K)=\inf\{\Lambda{f}:K \prec f\}<\infty \]

Let \(K \prec f\) and \(0 < \alpha < 1\), and define \(V_\alpha=f^{-1}((\alpha,1])=\{x:f(x)>\alpha\}\), which is open since \(f\) is continuous. Since \(f(x)=1\) for all \(x \in K\), we have \(K \subset V_{\alpha}\). If \(g \prec V_{\alpha}\), then \(\alpha{g} \leq f\): on \(V_\alpha\) we have \(\alpha{g} \leq \alpha < f\), and outside \(V_\alpha\) we have \(g=0 \leq f\). Since \(\Lambda\) is monotonic, \(\alpha\Lambda{g} \leq \Lambda{f}\). Therefore, by the definition of \(\mu\) for all \(E \subset X\), we have \[ \mu(K) \leq \mu(V_\alpha)=\sup\{\Lambda{g}:g \prec V_{\alpha}\} \leq \frac{1}{\alpha}\Lambda{f}. \] Since \(\mu(K)\) is a lower bound of \(\frac{1}{\alpha}\Lambda{f}\) with \(0<\alpha<1\), we see \[ \mu(K) \leq \inf_{\alpha \in (0,1)}\left\{\frac{1}{\alpha}\Lambda{f}\right\}=\Lambda{f}. \] Since \(\Lambda{f}\) is a real number, and at least one \(f\) with \(K \prec f\) exists by Urysohn's lemma (applied with \(V=X\)), we get \(\mu(K) <\infty\). Since \(K\) itself is compact, the supremum in the inner regularity condition is attained at \(K\), and we see \(K \in \mathfrak{M}_F\).

To prove the identity, fix \(\varepsilon>0\). By the definition of \(\mu(K)\), there exists some open \(V \supset K\) such that \(\mu(V)<\mu(K)+\varepsilon\). By Urysohn's lemma, there exists some \(h \in C_c(X)\) such that \(K \prec h \prec V\). Therefore \[ \Lambda{h} \leq \mu(V) < \mu(K)+\varepsilon. \] Together with \(\mu(K) \leq \Lambda{f}\) for every \(K \prec f\), this shows that \(\mu(K)\) is the infimum of \(\Lambda{f}\) with \(K \prec f\).
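
As a sanity check of the formula, assume for illustration that \(X=\mathbb{R}\) and that \(\Lambda f=\int_{-\infty}^{\infty}f(x)dx\) is the Riemann integral, and take \(K=[0,1]\). For \(n \in \mathbb{N}\), let \(f_n\) be the trapezoid function that equals \(1\) on \([0,1]\), vanishes outside \((-\frac{1}{n},1+\frac{1}{n})\), and is linear in between. Then \(K \prec f_n\) and \[ \Lambda f_n=1+2 \cdot \frac{1}{2} \cdot \frac{1}{n}=1+\frac{1}{n}. \] On the other hand, every \(f\) with \(K \prec f\) satisfies \(f \geq \chi_K\), so \(\Lambda f \geq \int_0^1 1\,dx=1\). Therefore \[ \mu([0,1])=\inf\{\Lambda f:K \prec f\}=1, \] which is the length of the interval, as one would hope.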

**Remarks of Step 1.** We have just proved assertion 1
of the properties of \(\mu\). The hardest
part of this proof is the inequality \[
\mu(V)<\mu(K)+\varepsilon.
\] But this is merely the \(\varepsilon\)-definition of \(\inf\). Note that \(\mu(K)\) is the infimum of \(\mu(V)\) over open \(V
\supset K\), so for any \(\varepsilon>0\), there exists some open
\(V \supset K\) satisfying this inequality. Under certain
conditions, this definition is much easier to use. Now we will examine
the relation between \(\mathfrak{M}_F\)
and \(\tau_X\), namely the topology of
\(X\).

#### Step 2 - The 'measure' of open sets (inner)

\(\mathfrak{M}_F\) contains every open set \(V\) with \(\mu(V)<\infty\).

It suffices to show that for every open set \(V\), we have \[ \mu(V)=\sup\{\mu(K):K \subset V, K\text{ compact}\}. \] Let \(\alpha\) be any real number with \(\alpha<\mu(V)\) (if \(\mu(V)<\infty\), one may take \(\alpha=\mu(V)-\varepsilon\)). By the definition of \(\mu(V)\), there exists an \(f \prec V\) such that \(\Lambda{f}>\alpha\). Put \(K= \operatorname{supp}(f)\), which is a compact subset of \(V\). If \(W\) is any open set which contains \(K\), then \(f \prec W\), and therefore \(\Lambda{f} \leq \mu(W)\). Taking the infimum over such \(W\), by the definition of \(\mu(K)\) we see \[ \Lambda{f}\leq\mu(K). \] Therefore \[ \alpha<\Lambda{f}\leq\mu(K)\leq\mu(V). \] This is exactly the \(\varepsilon\)-definition of \(\sup\). The identity is proved.

**Remarks of Step 2.** It's important to note that this
identity is only guaranteed for open sets and for sets \(E \in \mathfrak{M}\) with \(\mu(E)<\infty\), the latter of which
will be proved in the following steps. This is the *flaw* of this
theorem. With these preparations, however, we are able to show the
countable additivity of \(\mu\) on
\(\mathfrak{M}_F\).

#### Step 3 - The subadditivity of \(\mu\) on \(2^X\)

If \(E_1,E_2,E_3,\cdots\) are arbitrary subsets of \(X\), then \[ \mu\left(\bigcup_{k=1}^{\infty}E_k\right) \leq \sum_{k=1}^{\infty}\mu(E_k) \]

First we show this holds for finitely many open sets. By induction, this is tantamount to showing that \[ \mu(V_1 \cup V_2)\leq \mu(V_1)+\mu(V_2) \] if \(V_1\) and \(V_2\) are open. Pick \(g \prec V_1 \cup V_2\). By corollary 1.3, applied to the compact set \(\operatorname{supp}(g) \subset V_1 \cup V_2\), there are functions \(h_1,h_2\) with \(h_i \prec V_i\) and \(h_1(x)+h_2(x)=1\) for all \(x \in \operatorname{supp}(g)\). Hence \(g=(h_1+h_2)g\), and notice that \(h_1g \prec V_1\) and \(h_2g \prec V_2\). Therefore, \[ \begin{aligned} \Lambda(g)&=\Lambda((h_1+h_2)g) \\ &=\Lambda(h_1g)+\Lambda(h_2g) \\ &\leq\mu(V_1)+\mu(V_2). \end{aligned} \] Since this holds for every \(g \prec V_1 \cup V_2\), by taking the supremum, we have \[ \mu(V_1 \cup V_2)\leq \mu(V_1)+\mu(V_2). \]

Now we go back to arbitrary subsets of \(X\). If \(\mu(E_i)=\infty\) for some \(i\), then there is nothing to prove. Therefore we shall assume that \(\mu(E_i)<\infty\) for all \(i\). Fix \(\varepsilon>0\). By definition of \(\mu(E_i)\), there are open sets \(V_i \supset E_i\) such that \[ \mu(V_i)<\mu(E_i)+\frac{\varepsilon}{2^i}. \] Put \(V=\bigcup_{i=1}^{\infty}V_i\), and choose \(f \prec V\). Since \(f\) has compact support, finitely many of the \(V_i\) cover the support of \(f\). Therefore \[ f \prec V_1 \cup V_2 \cup \cdots \cup V_n \] for some \(n\) (take \(n\) to be the largest index used). By the finite case, we obtain \[ \begin{aligned} \Lambda{f} &\leq \mu(V_1 \cup V_2 \cup \cdots \cup V_n) \\ &\leq \mu(V_1)+\mu(V_2)+\cdots+\mu(V_n) \\ &\leq \sum_{i=1}^{n}\left(\mu(E_i)+\frac{\varepsilon}{2^i}\right) \\ &\leq \sum_{i=1}^{\infty}\mu(E_i)+\varepsilon, \end{aligned} \] for all \(f \prec V\). Since \(\bigcup E_i \subset V\), we have \(\mu(\bigcup E_i) \leq \mu(V)\). Therefore \[ \mu\left(\bigcup_{i=1}^{\infty}E_i\right)\leq\mu(V)=\sup\{\Lambda{f}:f \prec V\}\leq\sum_{i=1}^{\infty}\mu(E_i)+\varepsilon. \] Since \(\varepsilon\) is arbitrary, the inequality is proved.

**Remarks of Step 3.** Again, we are using the \(\varepsilon\)-definition of \(\inf\). One may say this step showed the
subadditivity of the outer measure. Also note the geometric series
\(\sum_{k=1}^{\infty}\frac{\varepsilon}{2^k}=\varepsilon\).

#### Step 4 - Additivity of \(\mu\) on \(\mathfrak{M}_F\)

Suppose \(E=\bigcup_{i=1}^{\infty}E_i\), where \(E_1,E_2,\cdots\) are pairwise disjoint members of \(\mathfrak{M}_F\), then \[ \mu(E)=\sum_{i=1}^{\infty}\mu(E_i). \] If \(\mu(E)<\infty\), we also have \(E \in \mathfrak{M}_F\).

As a dual to Step 3, we first show this holds for finitely many compact sets. As proved in Step 1, compact sets are in \(\mathfrak{M}_F\). Suppose now \(K_1\) and \(K_2\) are disjoint compact sets. We want to show that \[ \mu(K_1 \cup K_2)=\mu(K_1)+\mu(K_2). \] Note that compact sets in a Hausdorff space are closed, so \(K_2^c\) is an open set containing \(K_1\). Therefore we are able to apply Urysohn's lemma to the pair \((K_1,K_2^c)\). That is, there exists an \(f \in C_c(X)\) such that \[ K_1 \prec f \prec K_2^c. \] In other words, \(f(x)=1\) for all \(x \in K_1\) and \(f(x)=0\) for all \(x \in K_2\), since \(\operatorname{supp}(f) \cap K_2 = \varnothing\). Fix \(\varepsilon>0\). By Step 1, since \(K_1 \cup K_2\) is compact, there exists some \(g \in C_c(X)\) such that \[ K_1 \cup K_2 \prec g \quad \text{and} \quad \Lambda(g) < \mu(K_1 \cup K_2)+\varepsilon. \] Now comes the trick. We are able to write \(g\) as \[ g=fg+(1-f)g. \] But \(K_1 \prec fg\) and \(K_2 \prec (1-f)g\) by the properties of \(f\) and \(g\), so Step 1 gives \(\mu(K_1) \leq \Lambda(fg)\) and \(\mu(K_2) \leq \Lambda((1-f)g)\). Since \(\Lambda\) is linear, we have \[ \mu(K_1)+\mu(K_2) \leq \Lambda(fg)+\Lambda((1-f)g)=\Lambda(g) < \mu(K_1 \cup K_2)+\varepsilon. \] Since \(\varepsilon\) is arbitrary, we have \[ \mu(K_1)+\mu(K_2) \leq \mu(K_1 \cup K_2). \] On the other hand, by Step 3, we have \[ \mu(K_1 \cup K_2) \leq \mu(K_1)+\mu(K_2). \] Therefore they must be equal.
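
The claim \(K_1 \prec fg\) and \(K_2 \prec (1-f)g\) deserves a pointwise check, spelled out here as a sketch using only the properties of \(f\) and \(g\) stated above. Since \(0 \leq f \leq 1\) and \(0 \leq g \leq 1\), both products take values in \([0,1]\); both are continuous, and their supports are closed subsets of the compact set \(\operatorname{supp}(g)\), so both lie in \(C_c(X)\). Moreover \[ x \in K_1 \implies f(x)g(x)=1 \cdot 1=1, \qquad x \in K_2 \implies (1-f(x))g(x)=(1-0) \cdot 1=1. \] Step 1 then yields \(\mu(K_1) \leq \Lambda(fg)\) and \(\mu(K_2) \leq \Lambda((1-f)g)\), which is what the chain of inequalities above uses.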

If \(\mu(E)=\infty\), then Step 3 gives \(\sum_{i}\mu(E_i) \geq \mu(E)=\infty\), and there is nothing more to prove. So now we assume that \(\mu(E)<\infty\) and fix \(\varepsilon>0\). Since \(E_i \in \mathfrak{M}_F\), there are compact sets \(K_i \subset E_i\) with \[ \mu(K_i) > \mu(E_i)-\frac{\varepsilon}{2^i}. \] The \(K_i\) are pairwise disjoint because the \(E_i\) are. Putting \(H_n=K_1 \cup K_2 \cup \cdots \cup K_n\), we see \(E \supset H_n\) and, by induction on the case of two compact sets, \[ \mu(E) \geq \mu(H_n)=\sum_{i=1}^{n}\mu(K_i)>\sum_{i=1}^{n}\mu(E_i)-\varepsilon. \] This inequality holds for all \(n\) and \(\varepsilon\), therefore \[ \mu(E) \geq \sum_{i=1}^{\infty}\mu(E_i). \] Combined with Step 3, the identity holds.

Finally we shall show that \(E \in
\mathfrak{M}_F\) if \(\mu(E)
<\infty\). To make it more understandable, we will use
elementary calculus notation. If we write \(\mu(E)=x\) and \(x_n=\sum_{i=1}^{n}\mu(E_i)\), we see \[
\lim_{n \to \infty}x_n=x.
\] Therefore, for any \(\varepsilon>0\), there exists some \(N \in \mathbb{N}\) such that \[
x-x_N<\varepsilon.
\] This is tantamount to \[
\mu(E)<\sum_{i=1}^{N}\mu(E_i)+\varepsilon.
\] But for the *compact* set \(H_N\) defined above, we have \(\sum_{i=1}^{N}\mu(E_i)<\mu(H_N)+\varepsilon\), so \[
\mu(E)<{\color{red}{\sum_{i=1}^{N}\mu(E_i)}}+\varepsilon<{\color{red}{\mu(H_N)+\varepsilon}}+\varepsilon=\mu(H_N)+2\varepsilon.
\] Since \(H_N \subset E\) is compact, \(E\) satisfies the
requirements of \(\mathfrak{M}_F\),
and is thus an element of it.

**Remarks of Step 4.** You should realize that we are
heavily using the \(\varepsilon\)-definition of \(\sup\) and \(\inf\). As you may guess, \(\mathfrak{M}_F\) should be a subset of
\(\mathfrak{M}\) though we don't know
whether it is a \(\sigma\)-algebra or
not. In other words, we hope that the countable additivity of \(\mu\) holds on a \(\sigma\)-algebra that is *properly
extended* from \(\mathfrak{M}_F\).
However it's still difficult to show that \(\mathfrak{M}\) is a \(\sigma\)-algebra. We need more properties
of \(\mathfrak{M}_F\) to go on.

#### Step 5 - The 'continuity' of \(\mathfrak{M}_F\).

If \(E \in \mathfrak{M}_F\) and \(\varepsilon>0\), there is a compact \(K\) and an open \(V\) such that \(K \subset E \subset V\) and \(\mu(V-K)<\varepsilon\).

There are two ways to write \(\mu(E)\), namely \[ \mu(E)=\sup\{\mu(K):K \subset E\} \quad \text{and} \quad \mu(E)=\inf\{\mu(V):V\supset E\} \] where \(K\) is compact and \(V\) is open. Since \(\mu(E)<\infty\), there exist some \(K \subset E\) and \(V \supset E\) such that \[ \mu(V)-\frac{\varepsilon}{2}<\mu(E)<\mu(K)+\frac{\varepsilon}{2}. \] Since \(V-K\) is open and \(\mu(V-K) \leq \mu(V)<\infty\), we have \(V-K \in \mathfrak{M}_F\) by Step 2. Since \(K\) and \(V-K\) are disjoint members of \(\mathfrak{M}_F\) with union \(V\), by Step 4 we have \[ \mu(K)+\mu(V-K)=\mu(V) <\mu(K)+\varepsilon. \] Therefore \(\mu(V-K)<\varepsilon\), as desired.

**Remarks of Step 5.** You should be familiar with the
\(\varepsilon\)-definitions of \(\sup\) and \(\inf\) now. Since \(V-K =V\cap K^c \subset V\), we have \(\mu(V-K)\leq\mu(V)<\mu(E)+\frac{\varepsilon}{2}<\infty\).

#### Step 6 - \(\mathfrak{M}_F\) is closed under certain operations

If \(A,B \in \mathfrak{M}_F\), then \(A-B,A\cup B\) and \(A \cap B\) are elements of \(\mathfrak{M}_F\).

This shows that \(\mathfrak{M}_F\) is closed under union, intersection and relative complement. In fact, we merely need to prove \(A-B \in \mathfrak{M}_F\), since \(A \cup B=(A-B) \cup B\) and \(A\cap B = A-(A-B)\).

By Step 5, for \(\varepsilon>0\), there are compact sets \(K_A\), \(K_B\) and open sets \(V_A\), \(V_B\) such that \(K_A \subset A \subset V_A\), \(K_B \subset B \subset V_B\), \(\mu(V_A-K_A)<\varepsilon\) and \(\mu(V_B-K_B)<\varepsilon\). For \(A-B\) we have \[ A-B \subset V_A-K_B \subset (V_A-K_A) \cup (K_A-V_B) \cup (V_B-K_B). \] With an application of Steps 3 and 5, we have \[ \mu(A-B) \leq \mu(V_A-K_A)+\mu(K_A-V_B)+\mu(V_B-K_B)< \varepsilon+\mu(K_A-V_B)+\varepsilon. \] Since \(K_A-V_B\) is a closed subset of \(K_A\), we see \(K_A-V_B\) is compact as well (a closed subset of a compact set is compact). Since \(K_A-V_B \subset A-B\) and \(\mu(A-B) <\mu(K_A-V_B)+2\varepsilon\), we see \(A-B\) meets the requirements of \(\mathfrak{M}_F\) (the fact that \(\mu(A-B)<\infty\) is trivial since \(\mu(A-B) \leq \mu(A)\)).
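
The two inclusions used above can be verified by a short case analysis, sketched here for completeness. If \(x \in A-B\), then \(x \in A \subset V_A\) and \(x \notin B \supset K_B\), so \(x \in V_A-K_B\). Now let \(x \in V_A-K_B\). Then \[ \begin{aligned} x \notin K_A &\implies x \in V_A-K_A, \\ x \in K_A,\ x \notin V_B &\implies x \in K_A-V_B, \\ x \in K_A,\ x \in V_B &\implies x \in V_B-K_B, \end{aligned} \] where the last line uses \(x \notin K_B\). These three cases are exhaustive, which gives the second inclusion.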

Since \(A-B\) and \(B\) are disjoint members of \(\mathfrak{M}_F\), Step 4 gives \[ \mu(A \cup B)=\mu(A-B)+\mu(B)<\infty. \] Thus \(A \cup B \in \mathfrak{M}_F\). Since \(A,A-B \in \mathfrak{M}_F\), we see \(A \cap B = A-(A-B) \in \mathfrak{M}_F\).

**Remarks of Step 6.** In this step, we demonstrated
several ways to express a set, all of which end up with a huge
simplification. Now we are able to show that \(\mathfrak{M}_F\) is a subset of \(\mathfrak{M}\).

#### Step 7 - \(\mathfrak{M}_F \subset \mathfrak{M}\)

There is a precise relation between \(\mathfrak{M}\) and \(\mathfrak{M}_F\) given by \[ \mathfrak{M}_F=\{E \in \mathfrak{M}:\mu(E)<\infty\} \subset \mathfrak{M}. \]

If \(E \in \mathfrak{M}_F\), we shall show that \(E \in \mathfrak{M}\). For every compact \(K\), we have \(K\in\mathfrak{M}_F\) (Step 1), so by Step 6 we see \(K \cap E \in \mathfrak{M}_F\); therefore \(E \in \mathfrak{M}\).

Conversely, if \(E \in \mathfrak{M}\) with \(\mu(E)<\infty\), we need to show that \(E \in \mathfrak{M}_F\). By definition of \(\mu\), for \(\varepsilon>0\), there is an open \(V \supset E\) such that \[ \mu(V)<\mu(E)+\varepsilon<\infty. \] Therefore \(V \in \mathfrak{M}_F\) by Step 2. By Step 5, there is a compact set \(K \subset V\) such that \(\mu(V-K)<\varepsilon\) (the open set containing \(V\) should be \(V\) itself). Since \(E \cap K \in \mathfrak{M}_F\), there exists a compact set \(H \subset E \cap K\) with \[ \mu(E \cap K)<\mu(H)+\varepsilon. \] Since \(E \subset (E \cap K) \cup (V-K)\), it follows from Step 3 that \[ \mu(E) \leq {\color{red}{\mu(E\cap K)}}+\mu(V-K)<{\color{red}{\mu(H)+\varepsilon}}+\varepsilon=\mu(H)+2\varepsilon. \] Since \(H \subset E\) is compact and \(\varepsilon\) is arbitrary, \(E \in \mathfrak{M}_F\).

**Remarks of Step 7.** Several tricks in the preceding
steps are used here. Now we are pretty close to the fact that \((X,\mathfrak{M},\mu)\) is a measure space.
Note that for \(E \in
\mathfrak{M}-\mathfrak{M}_F\), we have \(\mu(E)=\infty\), but we have already proved
the countable additivity for \(\mathfrak{M}_F\). Is it 'almost trivial'
for \(\mathfrak{M}\)? Before that, we
need to show that \(\mathfrak{M}\) is a
\(\sigma\)-algebra. Note that assertion
3 of \(\mu\) has been proved.

#### Step 8 - \(\mathfrak{M}\) is a \(\sigma\)-algebra in \(X\) containing all Borel sets

We will validate the definition of \(\sigma\)-algebra one by one.

\(X \in \mathfrak{M}\).

For any compact \(K \subset X\), we have \(K \cap X=K\). But as proved in Step 1, \(K \in \mathfrak{M}_F\), therefore \(X \in \mathfrak{M}\).

If \(A \in \mathfrak{M}\), then \(A^c \in\mathfrak{M}\).

If \(A \in \mathfrak{M}\), then \(A \cap K \in \mathfrak{M}_F\) for every compact \(K\). But \[ K-(A \cap K)=K \cap(A^c \cup K^c)=(K\cap A^c) \cup \varnothing=K \cap A^c. \] By Step 1 and Step 6, we see \(K \cap A^c \in \mathfrak{M}_F\), thus \(A^c \in \mathfrak{M}\).

If \(A_n \in \mathfrak{M}\) for all \(n \in \mathbb{N}\), then \(A=\bigcup_{n=1}^{\infty}A_n \in \mathfrak{M}\).

We assign an auxiliary sequence of sets inductively. For \(n=1\), we write \(B_1=A_1 \cap K\) where \(K\) is compact. Then \(B_1 \in \mathfrak{M}_F\). For \(n \geq 2\), we write \[ B_n=(A_n \cap K)-(B_1 \cup \cdots\cup B_{n-1}). \] Since \(A_n \cap K \in \mathfrak{M}_F\) and \(B_1,B_2,\cdots,B_{n-1} \in \mathfrak{M}_F\), by Step 6, \(B_n \in \mathfrak{M}_F\). Also, the sets \(B_n\) are pairwise disjoint.

Another set-theoretic manipulation shows that \[ \begin{aligned} A \cap K&=K \cap\left(\bigcup_{n=1}^{\infty}A_n\right) \\ &=\bigcup_{n=1}^{\infty}(K \cap A_n) \\ &=\bigcup_{n=1}^{\infty}\left[B_n \cup\left((K \cap A_n) \cap (B_1 \cup \cdots\cup B_{n-1})\right)\right] \\ &=\bigcup_{n=1}^{\infty}B_n. \end{aligned} \] Now we are able to evaluate \(\mu(A \cap K)\) by Step 4: \[ \mu(A \cap K)=\sum_{n=1}^{\infty}\mu(B_n), \] and since \(A \cap K \subset K\), Step 1 gives \(\mu(A \cap K) \leq \mu(K)<\infty\). Therefore, by Step 4 again, \(A \cap K \in \mathfrak{M}_F\), which implies that \(A \in \mathfrak{M}\).
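
Here is a small instance of the construction of \(B_n\), with sets chosen only for illustration: in \(X=\mathbb{R}\), take \(K=[0,3]\), \(A_1=[0,2]\) and \(A_2=[1,4]\). Then \[ B_1=A_1 \cap K=[0,2], \qquad B_2=(A_2 \cap K)-B_1=[1,3]-[0,2]=(2,3]. \] The sets \(B_1\) and \(B_2\) are disjoint, and \[ B_1 \cup B_2=[0,3]=(A_1 \cup A_2) \cap K, \] as the general manipulation predicts.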

\(\mathfrak{M}\) contains all Borel sets.

Indeed, since \(\mathfrak{M}\) is a \(\sigma\)-algebra, it suffices to prove that \(\mathfrak{M}\) contains all open sets and/or closed sets. We'll show two different paths. Let \(K\) be a compact set.

- If \(C\) is closed, then \(C \cap K\) is compact, therefore \(C \cap K \in \mathfrak{M}_F\) (by Step 1), which shows that \(C\) is an element of \(\mathfrak{M}\).
- If \(D\) is open, then \(K-D\) is compact and \(D \cap K=K-(K-D)\). Since \(K\) and \(K-D\) are elements of \(\mathfrak{M}_F\) (Step 1), so is \(D \cap K\) (Step 6), which shows that \(D\) is an element of \(\mathfrak{M}\).

Therefore, by either path, \(\mathfrak{M}\) contains all Borel sets.

#### Step 9 - \(\mu\) is a positive measure on \(\mathfrak{M}\)

Again, we will verify all properties of \(\mu\) one by one.

\(\mu(E) \geq 0\) for all \(E \in \mathfrak{M}\).

This follows immediately from the definition of \(\mu\), since \(\Lambda\) is positive and \(0 \leq f \leq 1\).

\(\mu\) is countably additive.

If \(A_1,A_2,\cdots\) form a disjoint countable collection of members of \(\mathfrak{M}\), we need to show that \[ \mu\left(\bigcup_{n=1}^{\infty}A_n\right)=\sum_{n=1}^{\infty}\mu(A_n). \] If \(A_n \in \mathfrak{M}_F\) for all \(n\), then this is merely what we have just proved in Step 4. If \(A_j \in \mathfrak{M}-\mathfrak{M}_F\) for some \(j\), however, we have \(\mu(A_j)=\infty\) by Step 7. So \(\sum_n\mu(A_n)=\infty\). As for \(\mu(\cup_n A_n)\), since \(\cup_n A_n \supset A_j\), we have \(\mu(\cup_n A_n) \geq \mu(A_j)=\infty\). The identity is now proved.

#### Step 10 - The completeness of \(\mu\)

So far assertions 1-3 have been proved. But the final assertion has not been proved explicitly. We do that now, since this property will be used when discussing the Lebesgue measure \(m\). In fact, this will show that \((X,\mathfrak{M},\mu)\) is a complete measure space.

If \(E \in \mathfrak{M}\), \(A \subset E\), and \(\mu(E)=0\), then \(A \in \mathfrak{M}\).

It suffices to show that \(A \in \mathfrak{M}_F\). By monotonicity, \(\mu(A) \leq \mu(E)=0\), so \(\mu(A)=0\) as well. If \(K \subset A\), where \(K\) is compact, then \(0 \leq \mu(K) \leq \mu(A)=0\). Therefore \(0\) is the supremum of \(\mu(K)\) over such \(K\). It follows that \(A \in \mathfrak{M}_F \subset \mathfrak{M}\).

#### Step 11 - The functional and the measure

For every \(f \in C_c(X)\), \(\Lambda{f}=\int_X fd\mu\).

This is the absolute main result of the theorem. It suffices to prove the inequality \[ \Lambda f \leq \int_X fd\mu \] for all real-valued \(f \in C_c(X)\). What about the other side? By the linearity of \(\Lambda\) and \(\int_X \cdot d\mu\), once the inequality above is proved, we have \[ \Lambda(-f)=-\Lambda{f}\leq\int_{X}-fd\mu=-\int_Xfd\mu. \] Therefore \[ \Lambda{f} \geq \int_X fd\mu \] holds as well, and this establishes the equality.

Let \(K=\operatorname{supp}(f)\). Since \(K\) is compact and \(f\) is continuous, the range of \(f\) is compact, so we may assume that \([a,b]\) contains the range of \(f\). For \(\varepsilon>0\), we are able to pick points \(y_0,y_1,\cdots,y_n\) such that \(y_i - y_{i-1}<\varepsilon\) for all \(i\) and \[ y_0 < a < y_1<\cdots<y_n=b. \] Put \[ E_i=\{x:y_{i-1}< f(x) \leq y_i\}\cap K. \] Since \(f\) is continuous, \(f\) is Borel measurable. The sets \(E_i\) are therefore pairwise disjoint Borel sets whose union is \(K\). Again, there are open sets \(V_i \supset E_i\) such that \[ \mu(V_i) < \mu(E_i)+\frac{\varepsilon}{n} \] for \(i=1,2,\cdots,n\), and such that \(f(x)<y_i + \varepsilon\) for all \(x \in V_i\) (intersect \(V_i\) with the open set \(\{x:f(x)<y_i+\varepsilon\}\) if necessary).

Notice that \((V_i)\) covers \(K\). Therefore, by corollary 1.3, there are functions \(h_1,\cdots,h_n\) such that \(h_i \prec V_i\) for all \(i\) and \(\sum_i h_i=1\) on \(K\). Hence \(f=\sum_i h_if\), and by Step 1 and the fact that \(K \prec \sum_i h_i\), we see \[ \mu(K) \leq \Lambda\left(\sum_i h_i\right)=\sum_i \Lambda{h_i}. \] By the way we picked \(V_i\), we see \(h_if \leq (y_i+\varepsilon)h_i\). We have the following inequality: \[ \begin{aligned} \Lambda{f} &= \sum_{i=1}^{n}\Lambda(h_if) \leq\sum_{i=1}^{n}(y_i+\varepsilon)\Lambda{h_i} \\ &= \sum_{i=1}^{n}\left(|a|-|a|+y_i+\varepsilon\right)\Lambda{h_i} \\ &=\sum_{i=1}^{n}(|a|+y_i+\varepsilon)\Lambda{h_i}-|a|\sum_{i=1}^{n}\Lambda{h_i}. \end{aligned} \] Since \(h_i \prec V_i\), we have \(\mu(E_i)+\frac{\varepsilon}{n}>\mu(V_i) \geq \Lambda{h_i}\). And we already have \(\sum_i \Lambda{h_i} \geq \mu(K)\). Since \(|a|+y_i+\varepsilon>0\) for \(i \geq 1\), putting them into the inequality above, we get \[ \begin{aligned} \Lambda{f} &\leq \sum_{i=1}^{n}(|a|+y_i+\varepsilon)\Lambda{h_i}-|a|\sum_{i=1}^{n}\Lambda{h_i} \\ &\leq \sum_{i=1}^{n}(|a|+y_i+\varepsilon){\color{red}{\left(\mu(E_i)+\frac{\varepsilon}{n}\right)}}-|a|{\color{red}{\mu(K)}}. \end{aligned} \] Observe that the \(E_i\) are pairwise disjoint with \(\cup_i E_i=K\), so by Step 9 we have \(\sum_{i}\mu(E_i)=\mu(K)\).

A slight manipulation shows that \[ \begin{aligned} \sum_{i=1}^{n}(|a|+y_i+\varepsilon)\mu(E_i)-|a|\mu(K)&=|a|\sum_{i=1}^{n}\mu(E_i)-|a|\mu(K)+\sum_{i=1}^{n}(y_i+\varepsilon)\mu(E_i) \\ &=\sum_{i=1}^{n}(y_i-\varepsilon)\mu(E_i)+2\varepsilon\mu(K). \end{aligned} \] Therefore for \(\Lambda f\) we get \[ \begin{aligned} \Lambda{f} &\leq\sum_{i=1}^{n}(|a|+y_i+\varepsilon)\left(\mu(E_i)+\frac{\varepsilon}{n}\right)-|a|\mu(K) \\ &=\sum_{i=1}^{n}(y_i-\varepsilon)\mu(E_i)+2\varepsilon\mu(K)+\frac{\varepsilon}{n}\sum_{i=1}^n(|a|+y_i+\varepsilon). \end{aligned} \]

Now here comes the trickiest part of the whole blog post. By definition of \(E_i\), we see \(f(x) > y_{i-1}>y_{i}-\varepsilon\) for \(x \in E_i\). Therefore the simple function \[ s_n=\sum_{i=1}^{n}(y_i-\varepsilon)\chi_{E_i} \] satisfies \(s_n \leq f\) on \(X\) (both vanish outside \(K\)). If we evaluate the Lebesgue integral of \(s_n\) with respect to \(\mu\), we see \[ \int_X s_nd\mu={\color{red}{\sum_{i=1}^{n}(y_i-\varepsilon)\mu(E_i)}} \leq {\color{red}{\int_X fd\mu}}. \] For \(2\varepsilon\mu(K)\), things are simple since \(0\leq\mu(K)<\infty\). Therefore \(2\varepsilon\mu(K) \to 0\) as \(\varepsilon \to 0\). Now let's estimate the final part of the inequality. It's trivial that \(\frac{\varepsilon}{n}\sum_{i=1}^{n}(|a|+\varepsilon)=\varepsilon(\varepsilon+|a|)\). For \(y_i\), observe that \(y_i \leq b\) for all \(i\), therefore \(\frac{\varepsilon}{n}\sum_{i=1}^{n}y_i \leq \frac{\varepsilon}{n}nb=\varepsilon b\). Thus \[ {\color{green}{\frac{\varepsilon}{n}\sum_{i=1}^{n}(|a|+y_i+\varepsilon)}} \leq {\color{green}{\varepsilon(|a|+b+\varepsilon)}}. \] Notice that \(b+|a| \geq 0\) since \(b \geq a \geq -|a|\).

Our estimation of \(\Lambda{f}\) is finally done: \[ \begin{aligned} \Lambda{f} &\leq{\color{red}{\sum_{i=1}^{n}(y_i-\varepsilon)\mu(E_i)}}+2\varepsilon\mu(K)+{\color{green}{\frac{\varepsilon}{n}\sum_{i=1}^n(|a|+y_i+\varepsilon)}} \\ &\leq{\color{red}{\int_Xfd\mu}}+2\varepsilon\mu(K)+{\color{green}{\varepsilon(|a|+b+\varepsilon)}} \\ &= \int_X fd\mu+\varepsilon(2\mu(K)+|a|+b+\varepsilon). \end{aligned} \] Since \(\varepsilon\) is arbitrary, we see \(\Lambda{f} \leq \int_X fd\mu\). The identity is proved.

#### Step 12 - The uniqueness of \(\mu\)

If there are two measures \(\mu_1\) and \(\mu_2\) that satisfy assertions 1 to 4 and both correspond to \(\Lambda\), then \(\mu_1=\mu_2\).

In fact, according to assertions 2 and 3, such a measure is determined by its values on compact subsets of \(X\). It suffices to show that

If \(K\) is a compact subset of \(X\), then \(\mu_1(K)=\mu_2(K)\).

Fix \(K\) compact and \(\varepsilon>0\). By assertions 1 and 2, there exists an open \(V \supset K\) such that \(\mu_2(V)<\mu_2(K)+\varepsilon\). By Urysohn's lemma, there exists some \(f\) such that \(K \prec f \prec V\). Hence \[ \begin{aligned} \mu_1(K)=\int_X\chi_Kd\mu_1 &\leq\int_X fd\mu_1=\Lambda{f}=\int_X fd\mu_2 \\ &\leq \int_X \chi_V d\mu_2=\mu_2(V)<\mu_2(K)+\varepsilon. \end{aligned} \] Since \(\varepsilon\) is arbitrary, \(\mu_1(K) \leq \mu_2(K)\). If \(\mu_1\) and \(\mu_2\) are exchanged, we see \(\mu_2(K) \leq \mu_1(K)\). The uniqueness is proved.

## The flaw

Can we simply put \(X=\mathbb{R}^k\)
right now? The answer is no. Note that the outer regularity is for all
sets but inner is only for open sets and members of \(\mathfrak{M}_F\). But we expect the outer
and inner regularity to be 'symmetric'. There is an example showing that
*locally compact* is far from being enough to offer the
'symmetry'.

### A weird example

Define \(X=\mathbb{R}_1 \times \mathbb{R}_2\), where \(\mathbb{R}_1\) is the real line equipped with the discrete metric \(d_1\), and \(\mathbb{R}_2\) is the real line equipped with the euclidean metric \(d_2\). The metric on \(X\) is defined by \[ d_X((x_1,y_1),(x_2,y_2))=d_1(x_1,x_2)+d_2(y_1,y_2). \] The topology \(\tau_X\) induced by \(d_X\) is naturally Hausdorff, and it is locally compact by considering the vertical segments. So what would happen on this weird locally compact Hausdorff space?

If \(f \in C_c(X)\), let \(x_1,x_2,\cdots,x_n\) be those values of \(x\) for which \(f(x,y) \neq 0\) for at least one \(y\). Since \(f\) has compact support, it is ensured that there are only finitely many \(x_i\)'s. We are able to define a positive linear functional by \[ \Lambda f=\sum_{i=1}^{n}\int_{-\infty}^{+\infty}f(x_i,y)dy=\int_X fd\mu, \] where \(\mu\) is the measure associated with \(\Lambda\) in the sense of R-M-K theorem. Let \[ E=\mathbb{R}_1 \times \{0\}. \] By squeezing the disjoint vertical segments around \((x_i,0)\), we see \(\mu(K)=0\) for all compact \(K \subset E\) but \(\mu(E)=\infty\).
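
The two claims about \(E\) can be sketched as follows; the auxiliary names \(\delta_x\), \(n\), \(N\) and the tent function below are introduced only for this illustration. Let \(V \supset E\) be open. For each \(x \in \mathbb{R}_1\) there is a \(\delta_x>0\) with \(\{x\} \times (-\delta_x,\delta_x) \subset V\). Since there are uncountably many \(x\), for some \(n \in \mathbb{N}\) the set \(\{x:\delta_x>\frac{1}{n}\}\) is infinite. Pick \(N\) distinct points \(x_1,\cdots,x_N\) in it and define \[ f(x_i,y)=\max\{0,\,1-n|y|\} \quad (i=1,\cdots,N), \qquad f(x,y)=0 \text{ for all other } x. \] Then \(f \prec V\), because \(\operatorname{supp}(f)=\bigcup_{i}\{x_i\} \times [-\frac{1}{n},\frac{1}{n}]\) is a compact subset of \(V\), and \[ \Lambda f=\sum_{i=1}^{N}\int_{-1/n}^{1/n}(1-n|y|)dy=\frac{N}{n}. \] Since \(N\) is arbitrary, \(\mu(V)=\infty\) for every open \(V \supset E\), hence \(\mu(E)=\infty\). On the other hand, \(E\) carries the discrete topology, so a compact \(K \subset E\) is a finite set; each point \((x,0)\) lies in the open segment \(\{x\} \times (-\varepsilon,\varepsilon)\), whose measure is at most \(2\varepsilon\), so \(\mu(\{(x,0)\})=0\) and therefore \(\mu(K)=0\) by subadditivity.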

This is in violent contrast to what we expect. However, if \(X\) is required to be \(\sigma\)-compact (note that the space in this example is not), this kind of problem disappears neatly.

## References / Further reading

- Walter Rudin, *Real and Complex Analysis*
- Serge Lang, *Fundamentals of Differential Geometry*
- Joel W. Robbin, *Partition of Unity*
- Brian Conrad, *Paracompactness and local compactness*
- Raoul Bott & Loring W. Tu, *Differential Forms in Algebraic Topology*