The Lebesgue-Radon-Nikodym theorem and how von Neumann proved it

An introduction

To understand the fundamental theorem of calculus in the setting of the Lebesgue integral, one has to take properties of measures into account. In elementary calculus, one may consider something like

$df(x)=f'(x)dx$

where $f$ is differentiable, say, everywhere on an interval. Now we restrict $f$ to be a differentiable and nondecreasing real function defined on $I=[a,b]$. Then we get a strictly increasing, hence one-to-one, function defined by

$g(x)=x+f(x)$

For measurable sets $E\in\mathfrak{M}$, it can be seen that if $m(E)=0$, we have $m(g(E))=0$. Moreover, $g(E) \in \mathfrak{M}$, and $g$ is one-to-one. Therefore we can define a measure like

$\mu(E)=m(g(E))$

If we have a relation

$\mu(E)=\int_{E}hdm$

(in fact, this is what the Radon-Nikodym theorem, proved later, provides), the fundamental theorem of calculus for $f$ becomes somewhat clear: if $E=[a,x]$, then $g(E)=[a+f(a),x+f(x)]$, and thus

$\begin{aligned} \mu(E)=m(g(E))&=g(x)-g(a)\\ &=f(x)-f(a)+\int_a^xdt \\ &=\int_a^xh(t)dt \end{aligned}$

which trivially implies

$f(x)-f(a)=\int_a^x[h(t)-1]dt$

Thus the function $h$ appears to be $g'=f'+1$.
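As a quick sanity check (an illustrative choice of $f$, not taken from the argument above), let $f(x)=x^2$ on $[0,1]$, so that $g(x)=x+x^2$. For $E=[0,x]$ we get

$\mu(E)=m(g(E))=g(x)-g(0)=x+x^2=\int_0^x(1+2t)dt,$

so here $h(t)=1+2t=f'(t)+1$, and subtracting $\int_0^xdt=x$ recovers $f(x)-f(0)=\int_0^x2tdt$.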

We are not proving the fundamental theorem here, but this gives rise to a question. Given a measure $\mu$, is it possible to find a function $h$ such that

$\mu(E)=\int_{E}hdm$

which one may write as

$d\mu=hdm$

or, more generally, to express a measure $\lambda$ in this way with respect to another measure $\mu$? When does such an $h$ exist? Fortunately, the Lebesgue decomposition and the Radon-Nikodym theorem answer these questions.

Notations

Let $\mu$ be a positive measure on a $\sigma$-algebra $\mathfrak{M}$, and let $\lambda$ be an arbitrary measure (positive or complex) defined on $\mathfrak{M}$.

We write

$\lambda \ll \mu$

if $\lambda(E)=0$ for every $E\in\mathfrak{M}$ for which $\mu(E)=0$. (In the previous section, for instance, we had $\mu \ll m$.) In this case we say that $\lambda$ is absolutely continuous with respect to $\mu$.

Another relation between measures worth considering is mutual singularity. If there is a set $A \in \mathfrak{M}$ such that $\lambda(E)=\lambda(A \cap E)$ for every $E \in \mathfrak{M}$, we say $\lambda$ is concentrated on $A$.

If we have two measures $\mu_1$ and $\mu_2$ and two disjoint sets $A$ and $B$ such that $\mu_1$ is concentrated on $A$ and $\mu_2$ is concentrated on $B$, we say $\mu_1$ and $\mu_2$ are mutually singular, and write

$\mu_1 \perp \mu_2$
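A standard example, added here only for illustration: on $\mathbb{R}$ with the Lebesgue measure $m$, let $\delta_0$ be the unit point mass at $0$, that is, $\delta_0(E)=1$ if $0 \in E$ and $\delta_0(E)=0$ otherwise. Then $\delta_0$ is concentrated on $\{0\}$, while $m$ is concentrated on $\mathbb{R}\setminus\{0\}$ because $m(\{0\})=0$:

$\delta_0(E)=\delta_0(E\cap\{0\}),\quad m(E)=m(E\cap(\mathbb{R}\setminus\{0\})),\quad\text{hence}\quad\delta_0\perp m.$

On the other hand, $\delta_0$ is not absolutely continuous with respect to $m$, since $m(\{0\})=0$ but $\delta_0(\{0\})=1$.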

The Theorem of Lebesgue-Radon-Nikodym

Let $\mu$ be a positive $\sigma$-finite measure on $\mathfrak{M}$, and $\lambda$ a complex measure on $\mathfrak{M}$.

There exists a unique pair of complex measures $\lambda_{ac}$ and $\lambda_{s}$ on $\mathfrak{M}$ such that

$\lambda = \lambda_{ac}+\lambda_s \quad \lambda_{ac}\ll\mu\quad \lambda_s \perp \mu$

There is a unique $h \in L^1(\mu)$ such that

$\lambda_{ac}(E)=\int_{E}hd\mu$
for every $E \in \mathfrak{M}$.

The unique pair $(\lambda_{ac},\lambda_s)$ is called the Lebesgue decomposition; the existence of $h$ is called the Radon-Nikodym theorem, and $h$ is called the Radon-Nikodym derivative. One also writes $d\lambda_{ac}=hd\mu$ or $\frac{d\lambda_{ac}}{d\mu}=h$ in this situation.

These are two separate theorems, but von Neumann found a way to prove both at one stroke.

If we already have $\lambda \ll \mu$, then $\lambda_s=0$ and the Radon-Nikodym derivative arises naturally: $d\lambda=hd\mu$.

Also, one should not overlook the fact that the Lebesgue measure $m$ is $\sigma$-finite, so the theorem applies with $\mu=m$.
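To make the statement concrete, here is a small example chosen for illustration (the measure $\lambda$ below is an assumption of this example, not part of the theorem). Take $\mu=m$ on $\mathbb{R}$ and let $\delta_0$ denote the unit point mass at $0$. Define

$\lambda(E)=\delta_0(E)+\int_{E}e^{-|x|}dm.$

Then the decomposition reads

$\lambda_{ac}(E)=\int_{E}e^{-|x|}dm,\quad \lambda_s=\delta_0,\quad h(x)=e^{-|x|},$

and indeed $h \in L^1(m)$ because $\int_{\mathbb{R}}e^{-|x|}dm=2$, while $\delta_0 \perp m$ because $\delta_0$ is concentrated on $\{0\}$ and $m(\{0\})=0$.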

Proof explained

Step 1 - Construct a bounded functional

We are going to employ Hilbert space techniques in this proof. More precisely, we are going to construct a bounded linear functional in order to find a function, namely $g$, which is the centerpiece of this proof.

The boundedness of $\lambda$ is clear since it is complex, but $\mu$ is only assumed to be $\sigma$-finite. Therefore we need to adjust $\mu$.

1.1 Replacing $\mu$ with a finite measure

If $\mu$ is a positive $\sigma$-finite measure on a $\sigma$-algebra $\mathfrak{M}$ in a set $X$, then there is a function $w$ such that $w \in L^1(\mu)$ and $0<w(x)<1$ for every $x \in X$.

The $\sigma$-finiteness of $\mu$ means that there exist sets $E_n \in \mathfrak{M}$ such that

$X=\bigcup_{n=1}^{\infty}E_n$

and that $\mu(E_n)<\infty$ for all $n$.

Define

$w_n(x)= \begin{cases} \frac{1}{2^n(1+\mu(E_n))}\quad &x \in E_n \\ 0 \quad &x\notin E_n \end{cases}$

(you can also say that $w_n=\frac{1}{2^n(1+\mu(E_n))}\chi_{E_n}$). Then the function

$w = \sum_{n=1}^{\infty}w_n$

satisfies $0<w(x)<1$ for all $x$: it is positive because every $x$ lies in some $E_n$, and $w \leq \sum_{n}2^{-n}=1$, with strict inequality as soon as some $\mu(E_n)>0$. With $w$, we are able to define a new measure, namely

$\tilde{\mu}(E)=\int_{E}wd\mu.$

That $\tilde{\mu}$ is a measure can be verified by writing $\int_{E}wd\mu=\int_{X}\chi_{E}wd\mu$ and applying the monotone convergence theorem to countable disjoint unions. More importantly, $\tilde{\mu}$ is bounded, and $\tilde{\mu}(E)=0$ if and only if $\mu(E)=0$. The second property comes from the strict positivity of $w$. For the first one, notice that

$\begin{aligned} \tilde{\mu}(X)=\int_{X}wd\mu &= \sum_{n=1}^{\infty}\int_{X}w_nd\mu \\ &= \sum_{n=1}^{\infty}\frac{\mu(E_n)}{2^n(1+\mu(E_n))} \\ &\leq \sum_{n=1}^{\infty}\frac{1}{2^n}=1. \end{aligned}$

In particular, $w \in L^1(\mu)$.
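For a concrete instance of this construction (an example chosen for illustration), let $X=\{1,2,3,\dots\}$ with $\mu$ the counting measure and $E_n=\{n\}$, so that $\mu(E_n)=1$ while $\mu(X)=\infty$. Then

$w(n)=\frac{1}{2^n(1+1)}=2^{-(n+1)},\quad \tilde{\mu}(E)=\sum_{n\in E}2^{-(n+1)},\quad \tilde{\mu}(X)=\sum_{n=1}^{\infty}2^{-(n+1)}=\frac{1}{2}.$

So $\tilde{\mu}$ is finite although $\mu$ is not, and the two measures have the same null sets (here only the empty set).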

1.2 A bounded linear functional associated with $\lambda$

Since $\lambda$ is complex, it is bounded. We first treat the case where $\lambda$ is a positive bounded measure on $\mathfrak{M}$; the general case will be reduced to this one later. Using 1.1, we obtain a positive bounded measure by setting

$\varphi=\lambda+\tilde{\mu}$

Following the construction of the Lebesgue integral (characteristic functions first, then simple functions, then monotone limits), we have

$\int_{X}fd\varphi=\int_{X}fd\lambda+\int_{X}fwd\mu$

for every nonnegative measurable function $f$. Also, since $\lambda \leq \varphi$, we have

$\left\vert \int_{X}fd\lambda \right\vert \leq \int_{X}|f|d\lambda \leq \int_{X}|f|d\varphi \leq \sqrt{\varphi(X)}\left\Vert f \right\Vert_2$

for $f \in L^2(\varphi)$, by the Schwarz inequality.
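In more detail, the last inequality is the Schwarz inequality applied to the pair $|f|$ and the constant function $1$, both of which lie in $L^2(\varphi)$ because $\varphi$ is finite:

$\int_{X}|f|\cdot 1d\varphi \leq \left(\int_{X}|f|^2d\varphi\right)^{1/2}\left(\int_{X}1^2d\varphi\right)^{1/2}=\left\Vert f \right\Vert_2\sqrt{\varphi(X)}.$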

Since $\varphi(X)<\infty$, the map

$\Lambda{f}=\int_{X}fd\lambda$

is a bounded linear functional on $L^2(\varphi)$.

Step 2 - Find the associated function with respect to $\lambda$

$L^2(\varphi)$ is a Hilbert space, and every bounded linear functional on a Hilbert space $H$ is given by an inner product with an element of $H$ (the Riesz representation theorem, which relies on completeness). Hence there exists a function $g \in L^2(\varphi)$ such that

$\Lambda{f}=\int_{X}fd\lambda=\int_{X}fgd\varphi=(f,g).$

The properties of the $L^2$ space show that $g$ is determined almost everywhere with respect to $\varphi$.

For $E \in \mathfrak{M}$, putting $f=\chi_E$ gives

$0 \leq (\chi_{E},g)=\int_{E}gd\varphi=\int_{E}d\lambda=\lambda(E)\leq\varphi(E)$

Hence the average of $g$ over any $E$ with $\varphi(E)>0$ lies in $[0,1]$, which implies $0 \leq g(x) \leq 1$ for almost every $x$ with respect to $\varphi$. Therefore we may assume that $0 \leq g \leq 1$ everywhere without affecting the identity. The proof is within reach once we define $A$ to be the set where $0 \leq g < 1$ and $B$ the set where $g=1$.

Step 3 - Generate $\lambda_{ac}$ and $\lambda_{s}$ and the Radon-Nikodym derivative at one stroke

We claim that the measures $E \mapsto \lambda(A \cap E)$ and $E \mapsto \lambda(B \cap E)$ form the decomposition we are looking for. Namely, $\lambda_{ac}(E)=\lambda(A \cap E)$ and $\lambda_s(E)=\lambda(B \cap E)$ for every $E \in \mathfrak{M}$.

Proving $\lambda_s \perp \mu$

Combining $\Lambda{f}=(f,g)$ with $\varphi=\lambda+\tilde{\mu}$ and $d\tilde{\mu}=wd\mu$ gives $\int_{X}fd\lambda=\int_{X}fgd\lambda+\int_{X}fgwd\mu$, that is,

$\int_{X}(1-g)fd\lambda=\int_{X}fgwd\mu.$

Put $f=\chi_{B}$. Since $g=1$ on $B$, the left-hand side vanishes and we have

$\int_{B}wd\mu=0.$

Since $w$ is strictly positive, we see that $\mu(B)=0$. Notice that $A \cap B = \varnothing$ and $A \cup B=X$. For $E \in \mathfrak{M}$, we write $E=E_A \cup E_B$, where $E_A=E \cap A$ and $E_B=E \cap B$. Therefore

$\mu(E)=\mu(E_A)+\mu(E_B)=\mu(E \cap A)+\mu(E \cap B)=\mu(E \cap A).$

Therefore $\mu$ is concentrated on $A$.

For $\lambda_s$, observe that

$\lambda_s(E)=\lambda(E \cap B)=\lambda((E \cap B) \cap B)=\lambda_s(E \cap B).$

Hence $\lambda_s$ is concentrated on $B$. This observation shows that $\lambda_s \perp \mu$.

Proving $\lambda_{ac} \ll \mu$ by the Radon-Nikodym derivative

The relation $\lambda_{ac} \ll \mu$ will be shown by exhibiting the Radon-Nikodym derivative.

If we replace $f$ by

$(1+g+\cdots+g^n)\chi_E,$

where $E \in \mathfrak{M}$, we have

$\int_X(1-g)fd\lambda=\int_E(1-g^{n+1})d\lambda=\int_Eg(1+g+\cdots+g^n)wd\mu.$
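To see where the middle expression comes from (a routine expansion, spelled out here for convenience), note that the geometric sum telescopes:

$(1-g)(1+g+\cdots+g^n)=(1+g+\cdots+g^n)-(g+g^2+\cdots+g^{n+1})=1-g^{n+1}.$

Since $0 \leq g \leq 1$ and $\varphi$ is finite, this $f$ is bounded and therefore belongs to $L^2(\varphi)$, so the identity does apply to it.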

Notice that

$\begin{aligned} \int_{E}(1-g^{n+1})d\lambda &=\int\limits_{E \cap A}(1-g^{n+1})d\lambda + \int\limits_{E \cap B}(1-g^{n+1})d\lambda \\ &=\int\limits_{E \cap A}(1-g^{n+1})d\lambda \\ &\to\lambda(E \cap A) = \lambda_{ac}(E)\quad(n\to\infty) \end{aligned}$

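Here the integral over $E \cap B$ vanishes because $g=1$ on $B$, and the limit follows from the monotone convergence theorem since $1-g^{n+1}$ increases to $1$ on $A$. Define $h_n=g(1+g+g^2+\cdots+g^n)w$. On $A$, $h_n$ increases monotonically to $gw/(1-g)$; since $\mu(B)=0$, the values on $B$ do not affect integrals with respect to $\mu$, and we set

$h= \begin{cases} \frac{gw}{1-g} \quad &x\in{A}\\ 0 \quad &x\in{B} \end{cases}$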

By the monotone convergence theorem, we get

$\lim_{n\to\infty}\int_{E}h_nd\mu = \int_{E}hd\mu=\lambda_{ac}(E)$

for every $E\in\mathfrak{M}$.

The measurable function $h$ is the desired Radon-Nikodym derivative once we show that $h \in L^1(\mu)$. Replacing $E$ with $X$, we see that

$\int_{X}|h|d\mu=\int_{X}hd\mu=\lambda_{ac}(X)\leq\lambda(X)<\infty.$

Clearly, if $\mu(E)=0$, we have

$\lambda_{ac}(E)=\int_{E}hd\mu=0$

which shows that

$\lambda_{ac}\ll\mu$

as desired.
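The three steps can be traced by hand on a finite set, where every measure is a list of point masses and the Riesz representative $g$ is simply a pointwise ratio. The following is a minimal sketch under those assumptions, not part of the proof itself: the function name `von_neumann_decomposition` and the sample masses are hypothetical, and $w$ is taken to be the constant $\frac{1}{2(1+\mu(X))}$, which corresponds to choosing $E_1=X$ in 1.1.

```python
from fractions import Fraction


def von_neumann_decomposition(mu, lam):
    """Trace the construction on a finite set X = {0, ..., n-1}.

    mu, lam: lists of nonnegative point masses (illustrative inputs).
    Returns (g, h, lam_ac, lam_s), each a list indexed by the points of X.
    """
    mu = [Fraction(m) for m in mu]
    lam = [Fraction(l) for l in lam]

    # Step 1.1: a weight with 0 < w < 1 (assumption: E_1 = X, so w is constant).
    w = Fraction(1, 2) / (1 + sum(mu))

    # Step 1.2: phi = lam + w * mu is a finite positive measure.
    phi = [l + w * m for l, m in zip(lam, mu)]

    # Step 2: on a finite set, g is the pointwise ratio lam / phi.
    # Where phi vanishes, g is arbitrary; we pick 0 so the point falls in A.
    g = [l / p if p != 0 else Fraction(0) for l, p in zip(lam, phi)]

    # Step 3: B = {g = 1}, A = {g < 1}; h = g * w / (1 - g) on A and 0 on B.
    in_B = [gx == 1 for gx in g]
    h = [Fraction(0) if b else gx * w / (1 - gx) for gx, b in zip(g, in_B)]

    # lam_ac has density h with respect to mu; lam_s is lam restricted to B.
    lam_ac = [hx * m for hx, m in zip(h, mu)]
    lam_s = [l if b else Fraction(0) for l, b in zip(lam, in_B)]
    return g, h, lam_ac, lam_s


# Sample masses: point 2 carries lam-mass but no mu-mass.
g, h, lam_ac, lam_s = von_neumann_decomposition(mu=[1, 2, 0, 0], lam=[3, 0, 5, 0])
print(h)
print(lam_ac, lam_s)
```

For these sample masses, the point that carries $\lambda$-mass but no $\mu$-mass ends up in $B$, and on $A$ the computed $h$ agrees with the ratio of the point masses of $\lambda$ and $\mu$. Exact rational arithmetic is used so that the test $g=1$ is not disturbed by rounding.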

Step 4 - Generalization to complex measures

So far we have proved the theorem for positive bounded measures. For a real bounded measure, we can apply the preceding case to its positive and negative parts. For a complex measure, we have

$\lambda=\lambda_1+i\lambda_2$

where $\lambda_1$ and $\lambda_2$ are real, so we apply the real case to each of them and combine the results by linearity.

Step 5 - Uniqueness of the decomposition

If we have two Lebesgue decompositions of the same measure, namely $(\lambda_{ac},\lambda_s)$ and $(\lambda'_{ac},\lambda'_s)$, we shall show that

$\lambda_{ac}-\lambda_{ac}'=\lambda_s'-\lambda_s=0$

By the definition of the decomposition we got

$\lambda_{ac}-\lambda'_{ac}=\lambda'_s-\lambda_s$

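with $\lambda_{ac}-\lambda_{ac}' \ll \mu$ and $\lambda_{s}'-\lambda_{s}\perp\mu$. This implies that $\lambda'_{s}-\lambda_{s} \ll \mu$ as well.

Since $\lambda'_s-\lambda_s\perp\mu$, there exists a set $N$ with $\mu(N)=0$ on which $\lambda'_s-\lambda_s$ is concentrated; the absolute continuity shows that $\lambda'_s(E)-\lambda_s(E)=0$ for all measurable $E \subset N$. Hence $\lambda_s'-\lambda_s$ is also concentrated on $X\setminus N$. Therefore we get $(\lambda'_s-\lambda_s)\perp(\lambda'_s-\lambda_s)$, which forces $\lambda'_s-\lambda_s=0$, since a measure concentrated on two disjoint sets vanishes on every set. The uniqueness of the decomposition is proved. The uniqueness of $h$ follows as well, because two functions in $L^1(\mu)$ with the same integral over every $E \in \mathfrak{M}$ agree almost everywhere with respect to $\mu$.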

(Following the same process one can also show that $\lambda_{ac}\perp\lambda_s$.)

updated at 2025-05-11

# Walter Rudin # Real and Complex Analysis # von Neumann # Lebesgue # Radon-Nikodym