The Banach Algebra of Borel Measures on Euclidean Space

This blog post is intended to deliver a quick explanation of the algebra of Borel measures on $\mathbb{R}^n$. It will be broken into pieces. The complex Borel measures on $\mathbb{R}^n$, denoted by $M(\mathbb{R}^n)$, clearly form a vector space over $\mathbb{C}$. The main goal of this post is to show that this space is a Banach space and also a Banach algebra.

In fact, the $\mathbb{R}^n$ case can be generalised to any locally compact abelian group (see any book on abstract harmonic analysis), because what really matters here is being locally compact and abelian. But for the moment we stick to Euclidean spaces. Note that since $\mathbb{R}^n$ is $\sigma$-compact, all complex Borel measures on it are regular.

To read this post you need to be familiar with some basic properties of Banach algebras, complex Borel measures, and, most importantly, Fubini’s theorem.

Several ways to prove Hardy's inequality

Suppose $1 < p < \infty$ and $f \in L^p((0,\infty))$ (with respect to Lebesgue measure of course) is a nonnegative function. Take

$$F(x)=\frac{1}{x}\int_0^x f(t)dt, \quad 0<x<\infty.$$

Then we have Hardy’s inequality $\def\lrVert[#1]{\lVert #1 \rVert}$

$$\lrVert[F]_p \leq q\lrVert[f]_p,$$

where $\frac{1}{p}+\frac{1}{q}=1$ of course.

There are several ways to prove it. I think there are good reasons to write them down thoroughly, since that may be why you found this page. Maybe you are burnt out because it was left as an exercise. You are assumed to have enough knowledge of Lebesgue measure and integration.
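Before the proofs, here is a quick numerical sanity check of the statement. It is only a sketch, assuming that $F$ denotes the average $F(x)=\frac{1}{x}\int_0^x f(t)dt$ and that Hardy’s inequality reads $\lVert F \rVert_p \leq q \lVert f \rVert_p$. The test function $f(t)=e^{-t}$, the helper names and the quadrature scheme are illustrative choices, not part of the argument.

```python
import math

def lp_norm(g, p, steps=50_000):
    """Approximate (int_0^inf |g(x)|^p dx)^(1/p) with the midpoint rule
    after substituting x = t / (1 - t), t in (0, 1)."""
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) / steps
        x = t / (1.0 - t)
        total += abs(g(x)) ** p / (1.0 - t) ** 2   # dx = dt / (1 - t)^2
    return (total / steps) ** (1.0 / p)

def f(t):
    # Test function chosen for illustration (an assumption, not from the post).
    return math.exp(-t)

def F(x):
    # (1/x) * int_0^x e^{-t} dt = (1 - e^{-x}) / x, in closed form.
    return -math.expm1(-x) / x

def hardy_sides(p):
    """Return (||F||_p, q * ||f||_p), where q is the conjugate exponent of p."""
    q = p / (p - 1.0)
    return lp_norm(F, p), q * lp_norm(f, p)
```

Calling `hardy_sides(2.0)` returns both sides for $p=2$; the first entry should not exceed the second.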

Minkowski’s integral inequality

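Let $S_1,S_2 \subset \mathbb{R}$ be two measurable sets, and suppose $F:S_1 \times S_2 \to \mathbb{R}$ is measurable. Then

$$\left[\int_{S_2}\left|\int_{S_1}F(x,y)dx\right|^pdy\right]^{\frac{1}{p}} \leq \int_{S_1}\left[\int_{S_2}|F(x,y)|^pdy\right]^{\frac{1}{p}}dx.$$

A proof can be found here by turning to Example A9. You may need to replace all measures with Lebesgue measure $m$.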

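Now let’s get into it. The measurable function to use here is $G(x,t)=\frac{f(t)}{x}$, so that $F(x)=\int_0^x G(x,t)dt$, and the substitution $t=ux$ gives $F(x)=\int_0^1 f(ux)du$. If we put this inside the inequality, we see

$$\lVert F \rVert_p = \left[\int_0^\infty\left|\int_0^1 f(ux)du\right|^pdx\right]^{\frac{1}{p}} \leq \int_0^1\left[\int_0^\infty |f(ux)|^pdx\right]^{\frac{1}{p}}du = \int_0^1 u^{-\frac{1}{p}}du\,\lVert f \rVert_p = q\lVert f \rVert_p.$$

Note we have used change-of-variable twice and the inequality once.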

A constructive approach

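I have no idea how people came up with this solution. Take $xF(x)=\int_0^x f(t)t^{u}t^{-u}dt$ where $0<u<1-\frac{1}{p}$. Hölder’s inequality gives us

$$xF(x) \leq \left[\int_0^x f(t)^pt^{up}dt\right]^{\frac{1}{p}}\left[\int_0^x t^{-uq}dt\right]^{\frac{1}{q}} = \left(\frac{1}{1-uq}\right)^{\frac{1}{q}}x^{\frac{1}{q}-u}\left[\int_0^x f(t)^pt^{up}dt\right]^{\frac{1}{p}}.$$

Hence

$$F(x)^p \leq \left(\frac{1}{1-uq}\right)^{p-1}x^{-1-up}\int_0^x f(t)^pt^{up}dt.$$

Note we have used the fact that $\frac{1}{p}+\frac{1}{q}=1 \implies p+q=pq$ and $\frac{p}{q}=p-1$. Fubini’s theorem gives us the final answer:

$$\int_0^\infty F^pdx \leq \left(\frac{1}{1-uq}\right)^{p-1}\int_0^\infty f(t)^pt^{up}\left(\int_t^\infty x^{-1-up}dx\right)dt = \left(\frac{1}{1-uq}\right)^{p-1}\frac{1}{up}\int_0^\infty f^pdt.$$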

It remains to find the minimum of $\varphi(u) = \left(\frac{1}{1-uq}\right)^{p-1}\frac{1}{up}$. This is an elementary calculus problem. By taking its derivative, we see that it attains its minimum $\left(\frac{p}{p-1}\right)^p=q^p$ at $u=\frac{1}{pq}<1-\frac{1}{p}$. Hence we get

$$\int_0^\infty F^pdx \leq q^p\int_0^\infty f^pdx,$$

which is exactly what we want. Note the constant $q$ cannot be replaced with a smaller one. So far we have only proved the case $f \geq 0$. For the general case, one simply needs to take absolute values.
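The calculus step can be checked numerically. The sketch below evaluates $\varphi(u)$ on a grid inside $0<u<1-\frac{1}{p}$ and compares the value at $u=\frac{1}{pq}$ with $q^p$. The function names and the grid size are illustrative assumptions.

```python
def phi(u, p):
    """phi(u) = (1 / (1 - u*q))**(p - 1) / (u*p) from the constructive proof."""
    q = p / (p - 1)              # conjugate exponent, 1/p + 1/q = 1
    assert 0 < u < 1 / q         # admissible range 0 < u < 1 - 1/p
    return (1.0 / (1.0 - u * q)) ** (p - 1) / (u * p)

def check_minimum(p, samples=999):
    """Return (phi at u = 1/(p*q), q**p, smallest grid value of phi)."""
    q = p / (p - 1)
    u_star = 1.0 / (p * q)
    grid = [(k / (samples + 1)) / q for k in range(1, samples + 1)]
    best = min(phi(u, p) for u in grid)
    return phi(u_star, p), q ** p, best
```

For each $p$ tried, the first two returned numbers should agree up to rounding, and no grid value should fall below them.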

Integration by parts

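This approach makes use of properties of $L^p$ space. Still we assume that $f \geq 0$ but we also assume $f \in C_c((0,\infty))$, that is, $f$ is continuous and has compact support. Hence $F$ is differentiable in this situation. Integration by parts gives

$$\int_0^\infty F(x)^pdx = xF(x)^p\Big\vert_0^\infty - p\int_0^\infty xF(x)^{p-1}F'(x)dx.$$

Note since $f$ has compact support, there is some $[a,b]$ with $0 < a \leq b < \infty$ such that $f(x) >0$ only if $a \leq x \leq b$, and hence $xF(x)^p\vert_0^\infty=0$. Next it is natural to take a look at $F'(x)$. Note we have

$$F'(x) = \frac{f(x)}{x}-\frac{1}{x^2}\int_0^x f(t)dt,$$

hence $xF'(x)=f(x)-F(x)$. A substitution gives us

$$\int_0^\infty F^pdx = -p\int_0^\infty F^{p-1}(f-F)dx,$$

which is equivalent to saying

$$\int_0^\infty F^pdx = \frac{p}{p-1}\int_0^\infty F^{p-1}fdx = q\int_0^\infty F^{p-1}fdx.$$

Hölder’s inequality gives us

$$\int_0^\infty F^{p-1}fdx \leq \left[\int_0^\infty F^{(p-1)q}dx\right]^{\frac{1}{q}}\left[\int_0^\infty f^pdx\right]^{\frac{1}{p}} = \left[\int_0^\infty F^pdx\right]^{\frac{1}{q}}\left[\int_0^\infty f^pdx\right]^{\frac{1}{p}}.$$

Together with the identity above we get

$$\int_0^\infty F^pdx \leq q\left[\int_0^\infty F^pdx\right]^{\frac{1}{q}}\left[\int_0^\infty f^pdx\right]^{\frac{1}{p}},$$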

which is exactly what we want since $1-\frac{1}{q}=\frac{1}{p}$ and all we need to do is divide both sides by $\left[\int_0^\infty F^pdx\right]^{1/q}$. So what’s next? Note $C_c((0,\infty))$ is dense in $L^p((0,\infty))$. For any $f \in L^p((0,\infty))$, we can take a sequence of functions $f_n \in C_c((0,\infty))$ such that $f_n \to f$ with respect to the $L^p$-norm. Taking $F=\frac{1}{x}\int_0^x f(t)dt$ and $F_n = \frac{1}{x}\int_0^x f_n(t)dt$, we need to show that $F_n \to F$ pointwise, so that we can use Fatou’s lemma. For $\varepsilon>0$, there exists some $N$ such that $\lVert f_n-f \rVert_p < \varepsilon$ for all $n \geq N$. Thus, by Hölder’s inequality,

$$|F_n(x)-F(x)| \leq \frac{1}{x}\int_0^x |f_n(t)-f(t)|dt \leq \frac{1}{x}\cdot x^{\frac{1}{q}}\lVert f_n-f \rVert_p = x^{-\frac{1}{p}}\lVert f_n-f \rVert_p < x^{-\frac{1}{p}}\varepsilon.$$

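Hence $F_n \to F$ pointwise, which also implies that $|F_n|^p \to |F|^p$ pointwise. For $|F_n|$ we have

$$\int_0^\infty |F_n|^pdx \leq \int_0^\infty\left(\frac{1}{x}\int_0^x |f_n(t)|dt\right)^pdx \leq q^p\int_0^\infty |f_n|^pdx;$$

note the last inequality follows since we have already proved it for nonnegative functions in $C_c((0,\infty))$, and $|f_n|$ is such a function. By Fatou’s lemma, we have

$$\int_0^\infty |F|^pdx \leq \liminf_{n\to\infty}\int_0^\infty |F_n|^pdx \leq \liminf_{n\to\infty}q^p\int_0^\infty |f_n|^pdx = q^p\int_0^\infty |f|^pdx.$$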

A Continuous Function Sending L^p Functions to L^1

Throughout, let $(X,\mathfrak{M},\mu)$ be a measure space where $\mu$ is positive.

The question

If $f$ is of $L^p(\mu)$, which means $\lVert f \rVert_p=\left(\int_X |f|^p d\mu\right)^{1/p}<\infty$, or equivalently $\int_X |f|^p d\mu<\infty$, then we may say $|f|^p$ is of $L^1(\mu)$. In other words, we have a function

$$\lambda: L^p(\mu) \to L^1(\mu), \quad f \mapsto |f|^p.$$

This function does not have to be one-to-one because of the absolute value. But we hope this function is fine enough; at the very least, we hope it is continuous.

Here, $f \sim g$ means that $f-g$ equals $0$ almost everywhere with respect to $\mu$. It can be easily verified that this is an equivalence relation.


We still use the $\varepsilon-\delta$ argument but it’s in a metric space. Suppose $(X,d_1)$ and $(Y,d_2)$ are two metric spaces and $f:X \to Y$ is a function. We say $f$ is continuous at $x_0 \in X$ if, for any $\varepsilon>0$, there exists some $\delta>0$ such that $d_2(f(x_0),f(x))<\varepsilon$ whenever $d_1(x_0,x)<\delta$. Further, we say $f$ is continuous on $X$ if $f$ is continuous at every point $x \in X$.


For $1\leq p<\infty$, we already have a metric by

$$d(f,g)=\lVert f-g \rVert_p,$$

given that $d(f,g)=0$ if and only if $f \sim g$. This is complete and makes $L^p$ a Banach space. But for $0<p<1$ (yes we are going to cover that), things are quite different, and there is one reason: Minkowski’s inequality holds reversely! In fact, we have

$$\lVert |f|+|g| \rVert_p \geq \lVert f \rVert_p+\lVert g \rVert_p$$

for $0<p<1$. $L^p$ space has too many weird things when $0<p<1$. Precisely,

For $0<p<1$, $L^p(\mu)$ is locally convex if and only if $\mu$ assumes finitely many values. (Proof.)

On the other hand, if for example $X=[0,1]$ and $\mu=m$ is the Lebesgue measure, then $L^p(\mu)$ has no open convex subset other than $\varnothing$ and $L^p(\mu)$ itself. However,

A topological vector space $X$ is normable if and only if its origin has a convex bounded neighbourhood. (See Kolmogorov’s normability criterion.)

Therefore $L^p(m)$ is not normable, hence not Banach.

We have gone too far. We need a metric that is fine enough.

Metric of $L^p$ when $0<p<1$


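Consider the quantity

$$\int_X |f|^pd\mu$$

(without the power $1/p$) for $f \in L^p(\mu)$. We will show that we have a metric by

$$d(f,g)=\int_X |f-g|^pd\mu.$$

Fix $y\geq 0$, consider the function

$$f(x)=(x+y)^p-x^p.$$

We have $f(0)=y^p$ and

$$f'(x)=p\left[(x+y)^{p-1}-x^{p-1}\right] \leq 0$$

when $x > 0$ and hence $f(x)$ is nonincreasing on $[0,\infty)$, which implies that

$$(x+y)^p \leq x^p+y^p.$$

Hence for any $f$, $g \in L^p$, we have

$$\int_X |f+g|^pd\mu \leq \int_X (|f|+|g|)^pd\mu \leq \int_X |f|^pd\mu+\int_X |g|^pd\mu.$$

This inequality ensures that

$$d(f,g)=\int_X |f-g|^pd\mu$$

is a metric. It’s immediate that $d(f,g)=d(g,f) \geq 0$ for all $f$, $g \in L^p(\mu)$. For the triangle inequality, note that

$$d(f,h)=\int_X |(f-g)+(g-h)|^pd\mu \leq d(f,g)+d(g,h).$$

This is translate-invariant as well since

$$d(f+h,g+h)=\int_X |(f+h)-(g+h)|^pd\mu=d(f,g).$$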

The completeness can be verified in the same way as in the case $p \geq 1$. In fact, this metric makes $L^p$ a locally bounded F-space.
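To see the difference between the two candidate distances concretely, here is a small sketch on step functions on $[0,1]$ with equal-width steps (a hypothetical toy setting, with Lebesgue measure). It assumes the metric in question is $d(f,g)=\int |f-g|^p d\mu$ and compares it with the quantity $\left(\int |f|^p\right)^{1/p}$, which fails the triangle inequality when $0<p<1$.

```python
def p_norm(f, p):
    """(int_0^1 |f|^p dx)^(1/p) for a step function given by its heights."""
    return (sum(abs(v) ** p for v in f) / len(f)) ** (1.0 / p)

def d(f, g, p):
    """Candidate metric int_0^1 |f - g|^p dx (no outer power 1/p)."""
    return sum(abs(a - b) ** p for a, b in zip(f, g)) / len(f)

def triangle_holds(f, g, h, p, tol=1e-12):
    """Check d(f, h) <= d(f, g) + d(g, h) up to a rounding tolerance."""
    return d(f, h, p) <= d(f, g, p) + d(g, h, p) + tol

# Indicators of the two halves of [0, 1] and their sum.
left, right, whole = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
```

With $p=\frac{1}{2}$, `p_norm(whole, 0.5)` is larger than `p_norm(left, 0.5) + p_norm(right, 0.5)`, which is the reversed Minkowski inequality in action, while `triangle_holds` still succeeds for $d$.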

The continuity of $\lambda$

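The metric of $L^1$ is defined by

$$d_1(f,g)=\lVert f-g \rVert_1=\int_X |f-g|d\mu.$$

We need to find a relation between $d_p(f,g)$ and $d_1(\lambda(f),\lambda(g))$, where $d_p$ is the metric of the corresponding $L^p$ space.

As we have proved,

$$(x+y)^p \leq x^p+y^p \quad (x,y \geq 0).$$

Without loss of generality we assume $x \geq y$ and therefore

$$x^p=(x-y+y)^p \leq (x-y)^p+y^p, \quad \text{that is,} \quad x^p-y^p \leq (x-y)^p.$$

By interchanging $x$ and $y$, we get

$$|x^p-y^p| \leq |x-y|^p.$$

Replacing $x$ and $y$ with $|f|$ and $|g|$ where $f$, $g \in L^p$, we get

$$\left||f|^p-|g|^p\right| \leq \left||f|-|g|\right|^p \leq |f-g|^p,$$

and we therefore have

$$d_1(\lambda(f),\lambda(g))=\int_X \left||f|^p-|g|^p\right|d\mu \leq \int_X |f-g|^pd\mu=d_p(f,g).$$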

Hence $\lambda$ is continuous (and in fact, Lipschitz continuous and uniformly continuous) when $0<p<1$.
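The Lipschitz estimate can be probed numerically. The sketch below works with step functions on $[0,1]$ (an illustrative toy setting) and assumes $\lambda(f)=|f|^p$, $d_1(u,v)=\int |u-v|$ and $d_p(f,g)=\int |f-g|^p$ for $0<p<1$. It reports the largest observed ratio $d_1(\lambda(f),\lambda(g))/d_p(f,g)$ over random samples, which should not exceed $1$.

```python
import random

def d1_of_lambda(f, g, p):
    """int_0^1 | |f|^p - |g|^p | dx for step functions given by their heights."""
    return sum(abs(abs(a) ** p - abs(b) ** p) for a, b in zip(f, g)) / len(f)

def dp(f, g, p):
    """int_0^1 |f - g|^p dx, the metric used when 0 < p < 1."""
    return sum(abs(a - b) ** p for a, b in zip(f, g)) / len(f)

def worst_ratio(p, trials=1000, steps=8, seed=0):
    """Largest observed d_1(lambda f, lambda g) / d_p(f, g) over random samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        f = [rng.uniform(-5, 5) for _ in range(steps)]
        g = [rng.uniform(-5, 5) for _ in range(steps)]
        denom = dp(f, g, p)
        if denom > 0:
            worst = max(worst, d1_of_lambda(f, g, p) / denom)
    return worst
```

A random search like this cannot prove the inequality, but a ratio above $1$ would immediately expose a mistake.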

$1 \leq p < \infty$

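It’s natural to think about Minkowski’s inequality and Hölder’s inequality in this case since they are the key tools for $L^p$ estimates. You need to think about how to create the conditions to use them and get a fine result. In this section we need to prove that

$$|x^p-y^p| \leq p|x-y|(x^{p-1}+y^{p-1}) \quad \text{for all } x,y \geq 0.$$

This inequality is surprisingly easy to prove, however. We will use nothing but the mean value theorem. Without loss of generality we assume that $x > y \geq 0$ and define $f(t)=t^p$. Then

$$\frac{f(x)-f(y)}{x-y}=f'(\zeta)=p\zeta^{p-1},$$

where $y < \zeta < x$. But since $p-1 \geq 0$, we see $\zeta^{p-1} \leq x^{p-1} \leq x^{p-1}+y^{p-1}$. Therefore

$$x^p-y^p \leq p(x-y)(x^{p-1}+y^{p-1}).$$

For $x=y$ the equality holds.

Replacing $x$ and $y$ with $|f|$ and $|g|$ and applying Hölder’s inequality, we have

$$\int_X \left||f|^p-|g|^p\right|d\mu \leq p\int_X |f-g|\left(|f|^{p-1}+|g|^{p-1}\right)d\mu \leq p\lVert f-g \rVert_p\left\lVert |f|^{p-1}+|g|^{p-1} \right\rVert_q.$$

By Minkowski’s inequality, we have

$$\left\lVert |f|^{p-1}+|g|^{p-1} \right\rVert_q \leq \left\lVert |f|^{p-1} \right\rVert_q+\left\lVert |g|^{p-1} \right\rVert_q.$$

Now things are clear. Since $1/p+1/q=1$, or equivalently $1/q=(p-1)/p$, we have $(p-1)q=p$. Suppose $\lVert f \rVert_p$, $\lVert g \rVert_p \leq R$; then

$$\left\lVert |f|^{p-1} \right\rVert_q=\left(\int_X |f|^{(p-1)q}d\mu\right)^{\frac{1}{q}}=\lVert f \rVert_p^{p-1} \leq R^{p-1},$$

and likewise for $g$. Summing the inequalities above, we get

$$d_1(\lambda(f),\lambda(g)) \leq 2pR^{p-1}\lVert f-g \rVert_p=2pR^{p-1}d_p(f,g),$$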

hence $\lambda$ is continuous.
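As a sanity check for $1 \leq p < \infty$, the sketch below tests a local Lipschitz bound of the form $d_1(\lambda(f),\lambda(g)) \leq 2pR^{p-1}\lVert f-g \rVert_p$ with $R=\max(\lVert f \rVert_p,\lVert g \rVert_p)$. This explicit constant is what the Hölder and Minkowski steps yield under my reading of the argument, so treat it as an assumption. The functions are random step functions on $[0,1]$, again purely for illustration.

```python
import random

def lp_norm(f, p):
    """(int_0^1 |f|^p dx)^(1/p) for a step function with equal-width steps."""
    return (sum(abs(v) ** p for v in f) / len(f)) ** (1.0 / p)

def d1_of_lambda(f, g, p):
    """int_0^1 | |f|^p - |g|^p | dx."""
    return sum(abs(abs(a) ** p - abs(b) ** p) for a, b in zip(f, g)) / len(f)

def lipschitz_sides(f, g, p):
    """Return (d_1(lambda f, lambda g), 2 p R^(p-1) ||f - g||_p)."""
    R = max(lp_norm(f, p), lp_norm(g, p))
    diff = [a - b for a, b in zip(f, g)]
    return d1_of_lambda(f, g, p), 2 * p * R ** (p - 1) * lp_norm(diff, p)

def all_trials_pass(p, trials=500, steps=8, seed=0):
    """True if the bound held (up to rounding) on every random sample."""
    rng = random.Random(seed)
    for _ in range(trials):
        f = [rng.uniform(-5, 5) for _ in range(steps)]
        g = [rng.uniform(-5, 5) for _ in range(steps)]
        lhs, rhs = lipschitz_sides(f, g, p)
        if lhs > rhs * (1 + 1e-12):
            return False
    return True
```

Note that the bound depends on $R$, which is why this case gives continuity (Lipschitz on bounded sets) rather than a global Lipschitz constant.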

Conclusion and further

We have proved that $\lambda$ is continuous, and when $0<p<1$, we have seen that $\lambda$ is Lipschitz continuous. It’s natural to think about its differentiability afterwards, but the absolute value function is not even differentiable so we may have no chance. But this is still a fine enough result. For example, we have no restriction on $(X,\mathfrak{M},\mu)$ other than the positivity of $\mu$. Therefore we may take $\mathbb{R}^n$ with the Lebesgue measure as the measure space here, or we can take something else.

It’s also interesting how we use elementary Calculus to solve some much more abstract problems.

Basic Facts of Semicontinuous Functions


We restrict ourselves to $\mathbb{R}$ endowed with the usual topology. Recall that a function $f$ is continuous if and only if for any open set $U \subset \mathbb{R}$, we have

$$f^{-1}(U)=\{x \in \mathbb{R}:f(x) \in U\}$$

to be open. One can rewrite this statement using $\varepsilon-\delta$ language. To say a function $f: \mathbb{R} \to \mathbb{R}$ is continuous at $x$, we mean for any $\varepsilon>0$, there exists some $\delta>0$ such that for $t \in (x-\delta,x+\delta)$, we have

$$|f(t)-f(x)|<\varepsilon.$$

$f$ is continuous on $\mathbb{R}$ if and only if $f$ is continuous at every point of $\mathbb{R}$.

If $(x-\delta,x+\delta)$ is replaced with $(x-\delta,x)$ or $(x,x+\delta)$, we get left continuity and right continuity, one of which plays an important role in probability theory.

But the problem is, sometimes continuity is too strong a restriction, while the ‘direction’ associated with left/right continuous functions is unnecessary as well. For example the function

is neither left nor right continuous (globally), but it is a thing. Left/right continuity is not a perfectly weakened version of continuity. We need something different.

Definition of semicontinuous

Let $f$ be a real (or extended-real) function on $\mathbb{R}$. The semicontinuity of $f$ is defined as follows.

If

$$\{x:f(x)>\alpha\}$$

is open for all real $\alpha$, we say $f$ is lower semicontinuous.

If

$$\{x:f(x)<\alpha\}$$

is open for all real $\alpha$, we say $f$ is upper semicontinuous.

Is it possible to rewrite these definitions à la $\varepsilon-\delta$? The answer is yes if we restrict ourselves to metric spaces.

$f: \mathbb{R} \to \mathbb{R}$ is upper semicontinuous at $x$ if, for every $\varepsilon>0$, there exists some $\delta>0$ such that for $t \in (x-\delta,x+\delta)$, we have

$$f(t)<f(x)+\varepsilon.$$

$f: \mathbb{R} \to \mathbb{R}$ is lower semicontinuous at $x$ if, for every $\varepsilon>0$, there exists some $\delta>0$ such that for $t \in (x-\delta,x+\delta)$, we have

$$f(t)>f(x)-\varepsilon.$$

Of course, $f$ is upper/lower semicontinuous on $\mathbb{R}$ if and only if it is so at every point of $\mathbb{R}$. One shall find no difference between the definitions in different styles.

Relation with continuous functions

Here is another way to see it. For the continuity of $f$, we are looking for arbitrary open subsets $V$ of $\mathbb{R}$, and $f^{-1}(V)$ is expected to be open. For the lower/upper semicontinuity of $f$, however, the open sets are restricted to be like $(\alpha,+\infty]$ and $[-\infty,\alpha)$. Since all open sets of $\mathbb{R}$ can be generated by the union or intersection of sets like $[-\infty,\alpha)$ and $(\beta,+\infty]$, we immediately get

$f$ is continuous if and only if $f$ is both upper semicontinuous and lower semicontinuous.

Proof. If $f$ is continuous, then for any $\alpha \in \mathbb{R}$, we see $[-\infty,\alpha)$ is open, and therefore

$$f^{-1}([-\infty,\alpha))$$

has to be open. The upper semicontinuity is proved. The lower semicontinuity of $f$ is proved in the same manner.

If $f$ is both upper and lower semicontinuous, we see

$$f^{-1}((\alpha,\beta))=f^{-1}([-\infty,\beta)) \cap f^{-1}((\alpha,+\infty])$$

is open. Since every open subset of $\mathbb{R}$ can be written as a countable union of segments of the above types, we see for any open subset $V$ of $\mathbb{R}$, $f^{-1}(V)$ is open. (If you have trouble with this part, it is recommended to review the definition of topology.) $\square$


There are two important examples.

  1. If $E \subset \mathbb{R}$ is open, then $\chi_E$ is lower semicontinuous.
  2. If $F \subset \mathbb{R}$ is closed, then $\chi_F$ is upper semicontinuous.

We will prove the first one. The second one follows in the same manner of course. For $\alpha<0$, the set $A=\chi_E^{-1}((\alpha,+\infty])$ is equal to $\mathbb{R}$, which is open. For $\alpha \geq 1$, since $\chi_E \leq 1$, we see $A=\varnothing$. For $0 \leq \alpha < 1$ however, the set of $x$ where $\chi_E>\alpha$ has to be $E$, which is still open.

When checking the semicontinuity of a function, we check from bottom to top or top to bottom. The function $\chi_E$ is defined by

$$\chi_E(x)=\begin{cases}1, & x \in E, \\ 0, & x \notin E.\end{cases}$$

Addition of semicontinuous functions

If $f_1$ and $f_2$ are upper/lower semicontinuous, then so is $f_1+f_2$.

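Proof. We are going to prove this using different tools. Suppose now both $f_1$ and $f_2$ are upper semicontinuous, and fix $x \in \mathbb{R}$. For $\varepsilon>0$, there exist some $\delta_1>0$ and $\delta_2>0$ such that

$$f_1(t)<f_1(x)+\frac{\varepsilon}{2} \text{ for } t \in (x-\delta_1,x+\delta_1), \quad f_2(t)<f_2(x)+\frac{\varepsilon}{2} \text{ for } t \in (x-\delta_2,x+\delta_2).$$

If we pick $\delta=\min(\delta_1,\delta_2)$, then we see for all $t \in (x-\delta,x+\delta)$, we have

$$f_1(t)+f_2(t)<f_1(x)+f_2(x)+\varepsilon.$$

The upper semicontinuity of $f_1+f_2$ is proved by considering all $x \in \mathbb{R}$.

Now suppose both $f_1$ and $f_2$ are lower semicontinuous. We have an identity by

$$\{x:f_1(x)+f_2(x)>\alpha\}=\bigcup_{\beta \in \mathbb{R}}\{x:f_1(x)>\beta\} \cap \{x:f_2(x)>\alpha-\beta\}.$$

The set on the right side is always open. Hence $f_1+f_2$ is lower semicontinuous. $\square$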

However, when there are infinitely many semicontinuous functions, things are different.

Let $\{f_n\}$ be a sequence of nonnegative functions on $\mathbb{R}$, then

  • If each $f_n$ is lower semicontinuous, then so is $\sum_{1}^{\infty}f_n$.
  • If each $f_n$ is upper semicontinuous, then $\sum_{1}^{\infty}f_n$ is not necessarily upper semicontinuous.

Proof. To prove this we are still using the properties of open sets. Put $g_n=\sum_{1}^{n}f_k$. Now suppose all $f_k$ are lower semicontinuous. Since $g_n$ is a finite sum of lower semicontinuous functions, we see each $g_n$ is lower semicontinuous. Let $f=\sum_{n}f_n$. As the $f_k$ are non-negative, we see $f(x)>\alpha$ if and only if there exists some $n_0$ such that $g_{n_0}(x)>\alpha$. Therefore

$$\{x:f(x)>\alpha\}=\bigcup_{n \geq 1}\{x:g_n(x)>\alpha\}.$$

The set on the right hand side is open already.

For the upper semicontinuity, it suffices to give a counterexample, but before that, we shall give the motivation.

As said, the characteristic function of a closed set is upper semicontinuous. Suppose $\{E_n\}$ is a sequence of almost disjoint closed sets, then $E=\cup_{n\geq 1}E_n$ is not necessarily closed, therefore $\chi_E=\sum\chi_{E_n}$ (a.e.) is not necessarily upper semicontinuous. Now we give a concrete example. Put $f_0=\chi_{[1,+\infty)}$ and $f_n=\chi_{E_n}$ for $n \geq 1$ where

$$E_n=\left[\frac{1}{n+1},\frac{1}{n}\right].$$

For $x > 0$, we have $f=\sum_nf_n \geq 1$, while $f(x)=0$ for $x \leq 0$. Hence $f^{-1}([-\infty,1))=(-\infty,0]$, which is not open. $\square$
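The counterexample is easy to probe by machine. The sketch below assumes the sets are $E_n=\left[\frac{1}{n+1},\frac{1}{n}\right]$ and $f_0=\chi_{[1,+\infty)}$, and evaluates a partial sum of the series in exact rational arithmetic. The function name and the truncation level `terms` are illustrative choices.

```python
from fractions import Fraction

def partial_sum(x, terms=2000):
    """chi_[1,oo)(x) + sum_{n=1}^{terms} chi_{[1/(n+1), 1/n]}(x), computed exactly."""
    x = Fraction(x)
    total = 1 if x >= 1 else 0          # contribution of f_0
    for n in range(1, terms + 1):
        if Fraction(1, n + 1) <= x <= Fraction(1, n):
            total += 1                  # x lies in E_n
    return total
```

The partial sum is $0$ at $x=0$ but at least $1$ at every positive $x \geq \frac{1}{\texttt{terms}+1}$, so points where $f<1$ (such as $0$) have no neighbourhood on which $f<1$.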

Notice that $f$ can be defined on any topological space here.

Maximum and minimum

There is one fact we already know about continuous functions.

If $X$ is compact, $f: X \to \mathbb{R}$ is continuous, then there exists some $a,b \in X$ such that $f(a)=\min f(X)$, $f(b)=\max f(X)$.

In fact, $f(X)$ is compact still. But for semicontinuous functions, things will be different but reasonable. For upper semicontinuous functions, we have the following fact.

If $X$ is compact and $f: X \to (-\infty,+\infty)$ is upper semicontinuous, then there exists some $a \in X$ such that $f(a)=\max f(X)$.

Notice that $X$ is not assumed to hold any other topological property. It can be Hausdorff or Lindelöf, but we are not asking for restrictions like this. The only property we will be using is that every open cover of $X$ has a finite subcover. Of course, one can replace $X$ with any compact subset of $\mathbb{R}$, for example, $[a,b]$.

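Proof. Put $\alpha=\sup f(X)$, and define

$$E_n=\left\{x \in X:f(x)<\alpha-\frac{1}{n}\right\}.$$

(If $\alpha=+\infty$, take $E_n=\{x \in X:f(x)<n\}$ instead; the argument is the same.) Each $E_n$ is open since $f$ is upper semicontinuous, and $E_1 \subset E_2 \subset \cdots$. If $f$ attains no maximum, then for any $x \in X$, there exists some $n \geq 1$ such that $f(x)<\alpha-\frac{1}{n}$. That is, $x \in E_n$ for some $n$. Therefore $\bigcup_{n \geq 1}E_n$ covers $X$. But this cover has no finite subcover of $X$: a finite subcover would give $X=E_N$ for some $N$, hence $\sup f(X) \leq \alpha-\frac{1}{N}<\alpha$. A contradiction since $X$ is compact. $\square$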

Approximating integrable functions

This is a comprehensive application of several properties of semicontinuity.

(Vitali–Carathéodory theorem) Suppose $f \in L^1(\mathbb{R})$, where $f$ is a real-valued function. For $\varepsilon>0$, there exist some functions $u$ and $v$ on $\mathbb{R}$ such that $u \leq f \leq v$, $u$ is an upper semicontinuous function bounded above, $v$ is a lower semicontinuous function bounded below, and

$$\int_{\mathbb{R}}(v-u)dm<\varepsilon.$$

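It suffices to prove this theorem for $f \geq 0$ (of course $f$ is not identically equal to $0$ since this case is trivial). Since $f$ is the pointwise limit of an increasing sequence of simple functions $s_n$, we can write $f$ as

$$f=\lim_{n \to \infty}s_n.$$

By putting $t_1=s_1$, $t_n=s_n-s_{n-1}$ for $n \geq 2$, we get $f=\sum_n t_n$. Since each $t_n$ is a nonnegative simple function, we can write $f$ as

$$f=\sum_{k=1}^{\infty}c_k\chi_{E_k},$$

where $c_k>0$ and $E_k$ is measurable for all $k$. Also, we have

$$\int_{\mathbb{R}}fdm=\sum_{k=1}^{\infty}c_km(E_k),$$

and the series on the right hand side converges (since $f \in L^1$). By the properties of Lebesgue measure, there exist a compact set $F_k$ and an open set $V_k$ such that $F_k \subset E_k \subset V_k$ and $c_km(V_k-F_k)<\frac{\varepsilon}{2^{k+1}}$. Put

$$v=\sum_{k=1}^{\infty}c_k\chi_{V_k}, \quad u=\sum_{k=1}^{N}c_k\chi_{F_k}$$

(now you can see $v$ is lower semicontinuous and $u$ is upper semicontinuous). The $N$ is chosen in such a way that

$$\sum_{k=N+1}^{\infty}c_km(E_k)<\frac{\varepsilon}{2}.$$

Since $V_k \supset E_k$, we have $\chi_{V_k} \geq \chi_{E_k}$. Therefore $v \geq f$. Similarly, $f \geq u$. Now we need to check the desired integral inequality. A simple recombination shows that

$$v-u=\sum_{k=1}^{N}c_k(\chi_{V_k}-\chi_{F_k})+\sum_{k=N+1}^{\infty}c_k\chi_{V_k} \leq \sum_{k=1}^{\infty}c_k(\chi_{V_k}-\chi_{F_k})+\sum_{k=N+1}^{\infty}c_k\chi_{E_k}.$$

If we integrate the function above, we get

$$\int_{\mathbb{R}}(v-u)dm \leq \sum_{k=1}^{\infty}c_km(V_k-F_k)+\sum_{k=N+1}^{\infty}c_km(E_k)<\frac{\varepsilon}{2}+\frac{\varepsilon}{2}=\varepsilon.$$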

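This proves the case when $f \geq 0$. In the general case, we write $f=f^{+}-f^{-}$. Attach the semicontinuous functions to $f^{+}$ and $f^{-}$ respectively by $u_1 \leq f^{+} \leq v_1$ and $u_2 \leq f^{-} \leq v_2$, each chosen with $\frac{\varepsilon}{2}$ in place of $\varepsilon$. Put $u=u_1-v_2$, $v=v_1-u_2$. As we can see, $u$ is upper semicontinuous and $v$ is lower semicontinuous. Also, $u \leq f \leq v$ with the desired property since

$$\int_{\mathbb{R}}(v-u)dm=\int_{\mathbb{R}}(v_1-u_1)dm+\int_{\mathbb{R}}(v_2-u_2)dm<\frac{\varepsilon}{2}+\frac{\varepsilon}{2}=\varepsilon,$$

and the theorem follows. $\square$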


Indeed, the only property of the measure used is the existence of $F_k$ and $V_k$. The domain $\mathbb{R}$ here can be replaced with $\mathbb{R}^k$ for $1 \leq k < \infty$, and $m$ with the respective $m_k$. Much more generally, the domain can be replaced by any locally compact Hausdorff space $X$ and the measure by any measure associated with the Riesz-Markov-Kakutani representation theorem on $C_c(X)$.

Is the reverse approximation always possible?

The answer is no. Consider the fat Cantor set $K$, which has Lebesgue measure $\frac{1}{2}$. We shall show that $\chi_K$ can not be approximated below by a lower semicontinuous function.

If $v$ is a lower semicontinuous function such that $v \leq \chi_K$, then $v \leq 0$.

Proof. Consider the set $V=v^{-1}((0,1])=v^{-1}((0,+\infty))$. Since $v \leq \chi_K$, we have $V \subset K$. We will show that $V$ has to be empty.

Suppose $V$ is nonempty and pick $t \in V$. Since $V$ is open by the lower semicontinuity of $v$, there exists some open neighbourhood $U$ of $t$ such that $U \subset V \subset K$. But this is impossible since $U$ is a nonempty open set and $K$ has an empty interior. Therefore $V = \varnothing$. That is, $v(x) \leq 0$ for all $x$. $\square$

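Suppose $u$ is an upper semicontinuous function such that $u \geq \chi_K$, and $v$ is a lower semicontinuous function such that $v \leq \chi_K$. Since $v \leq 0$, for $\varepsilon=\frac{1}{2}$, we have

$$\int_{\mathbb{R}}(u-v)dm \geq \int_{\mathbb{R}}\chi_Kdm=m(K)=\frac{1}{2}=\varepsilon.$$

This example shows that there exist some integrable functions that cannot be reversely approximated in the sense of the Vitali–Carathéodory theorem.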

More properties of zeros of an entire function

What’s going on again

In this post we discussed the topological properties of the zero points of an entire nonzero function, or roughly, what those points look like. The set of zero points contains no limit point, and is at most countable (countable or finite). So if it’s finite, then we can find them out one by one. For example, the function $f(z)=z$ has simply one zero point. But what if it’s countably infinite? How fast does the number of zeros grow?

Another question. Suppose we have an entire function $f$, and the zeros of $f$, namely $z_1,z_2,\cdots$, are ordered increasingly by moduli:

$$|z_1| \leq |z_2| \leq \cdots \leq |z_n| \leq \cdots.$$

Is it possible to get a fine enough estimate of $|z_n|$? Interestingly enough, we can get there with the help of Jensen’s formula.

Jensen’s formula

Suppose $\Omega=D(0;R)$, $f \in H(\Omega)$, $f(0) \neq 0$, $0<r<R$, and $z_1,z_2,\cdots,z_{n(r)}$ are the zeros of $f$ in $\overline{D}(0;r)$, listed according to their multiplicities, then

$$|f(0)|\prod_{n=1}^{n(r)}\frac{r}{|z_n|}=\exp\left\{\frac{1}{2\pi}\int_{-\pi}^{\pi}\log|f(re^{i\theta})|d\theta\right\}.$$

There is no need to worry about the assumption $f(0) \neq 0$. Take another look at this proof. Every zero point $a$ has a unique positive integer $m$ such that $f(z)=(z-a)^mg(z)$ and $g \in H(\Omega)$ but $g(a) \neq 0$. The number $m$ is called the order of the zero at $a$. Therefore if we have $f(0)=0$ we can simply consider another function, namely $\frac{f}{z^m}$ where $m$ is the order of the zero at $0$.

We are not proving this identity at this point. But it can be done by considering the following function

$$g(z)=f(z)\prod_{j=1}^{m}\frac{r^2-\overline{z_j}z}{r(z_j-z)}\prod_{j=m+1}^{n}\frac{z_j}{z_j-z},$$

where $n=n(r)$ and $m$ is found by ordering $z_j$ in such a way that $z_1,\cdots,z_m \in D(0;r)$ and $|z_{m+1}|=\cdots=|z_{n}|=r$. One can prove this identity by considering $|g(0)|$ as well as $\log|g(re^{i\theta})|$.
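Jensen’s formula is pleasant to verify numerically. The sketch below assumes the formula in the form $|f(0)|\prod_{n=1}^{n(r)}\frac{r}{|z_n|}=\exp\left\{\frac{1}{2\pi}\int_{-\pi}^{\pi}\log|f(re^{i\theta})|d\theta\right\}$ and uses a polynomial with hand-picked zeros (the list `ZEROS` is a hypothetical example, not from the text). The integral is approximated by an equally spaced average over the circle, so $r$ should not equal the modulus of a zero.

```python
import cmath
import math

ZEROS = [1 + 0j, 2j, -3 + 0j]   # hypothetical zeros chosen for illustration

def f(z):
    """Polynomial with f(0) = 1 whose zeros are exactly the points in ZEROS."""
    value = 1 + 0j
    for a in ZEROS:
        value *= 1 - z / a
    return value

def jensen_left(r):
    """|f(0)| times the product of r / |z_n| over the zeros with |z_n| <= r."""
    prod = abs(f(0))
    for a in ZEROS:
        if abs(a) <= r:
            prod *= r / abs(a)
    return prod

def jensen_right(r, samples=4000):
    """exp of the average of log|f| over equally spaced points on |z| = r."""
    mean_log = sum(
        math.log(abs(f(r * cmath.exp(2j * math.pi * k / samples))))
        for k in range(samples)
    ) / samples
    return math.exp(mean_log)
```

For instance, `jensen_left(2.5)` and `jensen_right(2.5)` should agree to many digits, since the circle $|z|=2.5$ avoids all three zeros.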

Several applications

The number of zeros of $f$ in $\overline{D}(0;r)$

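For simplicity we shall assume $f(0)=1$, which loses no generality. Let

$$M(r)=\sup_{\theta}|f(re^{i\theta})| \quad (0<r<\infty)$$

and $n(r)$ be the number of zeros of $f$ in $\overline{D}(0;r)$. By the maximum modulus theorem, we have

$$\log|f(z)| \leq \log M(2r) \quad \text{whenever } |z| \leq 2r.$$

If we insert Jensen’s formula (with radius $2r$) into this inequality and order $|z_n|$ by increasing moduli, we get

$$\log M(2r) \geq \frac{1}{2\pi}\int_{-\pi}^{\pi}\log|f(2re^{i\theta})|d\theta=\sum_{n=1}^{n(2r)}\log\frac{2r}{|z_n|} \geq \sum_{n=1}^{n(r)}\log\frac{2r}{|z_n|} \geq n(r)\log 2,$$

which implies

$$n(r) \leq \frac{\log M(2r)}{\log 2}.$$

So $n(r)$ is controlled by $M(2r)$. The second and third inequalities look tricky and require more explanation.

First we should notice that every zero $z_n$ with $n \leq n(2r)$ lies in $\overline{D}(0;2r)$. Hence we have $\log\frac{2r}{|z_n|} \geq \log1=0$ for all such $z_n$, so dropping the terms with $n(r)<n \leq n(2r)$ can only decrease the sum. Hence the second inequality follows. For the third one, we simply have

$$\log\frac{2r}{|z_n|} \geq \log\frac{2r}{r}=\log 2 \quad \text{for } n \leq n(r).$$

So this is it: the rapidity with which $n(r)$ can grow is dominated by the growth of $M(r)$. Namely, the number of zeros of $f$ in the closed disc with radius $r$ is controlled by the maximum modulus of $f$ on a circle with bigger radius.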
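The counting estimate can be tried on a concrete function. The sketch below assumes the bound reads $n(r)\log 2 \leq \log M(2r)$ with $f(0)=1$, and uses a polynomial with hand-picked zeros (`ZEROS` is a hypothetical example). $M(2r)$ is approximated from below by sampling the circle, which is only a numerical stand-in for the true supremum.

```python
import cmath
import math

ZEROS = [1 + 0j, 2j, -3 + 0j]   # hypothetical zeros chosen for illustration

def f(z):
    """Polynomial with f(0) = 1 whose zeros are exactly the points in ZEROS."""
    value = 1 + 0j
    for a in ZEROS:
        value *= 1 - z / a
    return value

def count_zeros(r):
    """n(r): number of zeros in the closed disc of radius r."""
    return sum(1 for a in ZEROS if abs(a) <= r)

def max_modulus(r, samples=4000):
    """Sampled approximation of M(r) = sup over theta of |f(r e^{i theta})|."""
    return max(abs(f(r * cmath.exp(2j * math.pi * k / samples)))
               for k in range(samples))

def bound_holds(r):
    """Check n(r) * log 2 <= log M(2r) using the sampled maximum."""
    return count_zeros(r) * math.log(2) <= math.log(max_modulus(2 * r))
```

Trying several radii with `bound_holds` shows how much slack the estimate leaves for a polynomial, where $n(r)$ eventually stops growing while $M(2r)$ does not.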

Examples based on different $M(r)$

Let’s begin with a simple example. Let $f(z)=1$; we have $M(r)=1$ for all $r$, but also $n(r)=0$, so this estimate tells us nothing. Indeed, if $M(r)$ is bounded by a constant, then $f$ is bounded, so by Liouville’s theorem $f$ is constant and the estimate carries no information.

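But if $M(r)$ grows properly, things become interesting. For example, if we have

$$M(r)<\exp(Ar^k)$$

for large $r$, where $A$ and $k$ are given positive numbers, we have a good enough estimate by

$$n(r) \leq \frac{\log M(2r)}{\log 2}<\frac{A2^kr^k}{\log 2}.$$

This estimate becomes interesting if we consider the logarithm of $n(r)$ and $r$, that is

$$\limsup_{r \to \infty}\frac{\log n(r)}{\log r} \leq k.$$

If we have $f(z)=1-\exp(z^k)$ where $k$ is a positive integer, we have $n(r) \sim \frac{kr^k}{\pi}$, also

$$\lim_{r \to \infty}\frac{\log n(r)}{\log r}=k.$$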

Lower bound of $|z_{n(r)}|$

Here we’ll see how to find a lower bound of $|z_{n(r)}|$ using Jensen’s formula, provided that $M(r)$, or simply an upper bound of $|f(z)|$, is properly described. Without loss of generality we shall assume that $f(0)=1$. Also, we assume that the zero points of $f(z)$ are ordered by increasing moduli.

First we still consider

and see what will happen.

By Jensen’s, we have

This gives

By the arrangement of $\{z_n\}$, we have

Another example is when we have

where $\Im{z}$ means the imaginary part of $z$.

We shall notice that in this case,

Following Jensen’s formula, we therefore have

The Lebesgue-Radon-Nikodym theorem and how von Neumann proved it

An introduction

If one wants to learn the fundamental theorem of calculus in the sense of the Lebesgue integral, properties of measures have to be taken into account. In elementary calculus, one may consider something like

$$\int_a^b f'(x)dx=f(b)-f(a),$$

where $f$ is differentiable, say, everywhere on an interval. Now we restrict $f$ to be a differentiable and nondecreasing real function defined on $I=[a,b]$. Then we get a one-to-one function defined by

$$g(x)=x+f(x).$$

For measurable sets $E\in\mathfrak{M}$, it can be seen that if $m(E)=0$, we have $m(g(E))=0$. Moreover, $g(E) \in \mathfrak{M}$, and $g$ is one-to-one. Therefore we can define a measure like

$$\mu(E)=m(g(E)).$$

If we have a relation

$$\mu(E)=\int_E hdm$$

(in fact, this is the Radon-Nikodym theorem we will prove later), the fundamental theorem of calculus for $f$ becomes somewhat clear since if $E=[a,x]$, we get $g(E)=[a+f(a),x+f(x)]$, thus

$$\mu(E)=m(g(E))=x-a+f(x)-f(a)=\int_a^x h(t)dt,$$

which trivially implies

$$f(x)-f(a)=\int_a^x [h(t)-1]dt;$$

the function $h$ looks like $g'=f'+1$.
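A tiny numerical illustration of this heuristic, under the assumptions $g(x)=x+f(x)$, $\mu(E)=m(g(E))$ and $h=f'+1$: take the hypothetical choice $f(x)=x^2$ on $[0,1]$, so that $g$ is increasing and $m(g([a,x]))=g(x)-g(a)$. The names below are illustrative.

```python
def f(x):
    # Hypothetical nondecreasing differentiable function on [0, 1].
    return x * x

def g(x):
    return x + f(x)

def h(x):
    # Candidate density g' = f' + 1 for f(x) = x^2.
    return 1 + 2 * x

def mu_interval(a, x):
    """Lebesgue measure of g([a, x]); g is increasing, so this is g(x) - g(a)."""
    return g(x) - g(a)

def integral(func, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / steps
    return sum(func(a + (k + 0.5) * width) for k in range(steps)) * width
```

Comparing `mu_interval(a, x)` with `integral(h, a, x)` for a few intervals shows the two agree, as the relation $\mu(E)=\int_E hdm$ predicts.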

We are not proving the fundamental theorem here. But this gives rise to a question. Is it possible to find a function $h$ such that

$$\mu(E)=\int_E hdm,$$

which one may write as

$$d\mu=hdm,$$

or, more generally, to describe a measure $\mu$ with respect to another measure $\lambda$? Does this $\mu$ exist with respect to $\lambda$? Does this $h$ exist? A lot of questions. Luckily the Lebesgue decomposition and the Radon-Nikodym theorem make it possible.


Let $\mu$ be a positive measure on a $\sigma$-algebra $\mathfrak{M}$, let $\lambda$ be any arbitrary measure (positive or complex) defined on $\mathfrak{M}$.

We write

$$\lambda \ll \mu$$

if $\lambda(E)=0$ for every $E\in\mathfrak{M}$ for which $\mu(E)=0$. (You may write $\mu \ll m$ in the previous section.) We say $\lambda$ is absolutely continuous with respect to $\mu$.

Another relation between measures worth considering is being mutually singular. If there is a set $A \in \mathfrak{M}$ such that $\lambda(E)=\lambda(A \cap E)$ for every $E \in \mathfrak{M}$, we say $\lambda$ is concentrated on $A$.

If we now have two measures $\mu_1$ and $\mu_2$, and two disjoint sets $A$ and $B$ such that $\mu_1$ is concentrated on $A$ and $\mu_2$ is concentrated on $B$, we say $\mu_1$ and $\mu_2$ are mutually singular, and write

$$\mu_1 \perp \mu_2.$$

The Theorem of Lebesgue-Radon-Nikodym

Let $\mu$ be a positive $\sigma$-finite measure on $\mathfrak{M}$, and $\lambda$ a complex measure on $\mathfrak{M}$.

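  • There exists a unique pair of complex measures $\lambda_{ac}$ and $\lambda_{s}$ on $\mathfrak{M}$ such that $\lambda=\lambda_{ac}+\lambda_s$, $\lambda_{ac} \ll \mu$ and $\lambda_s \perp \mu$.
  • There is a unique $h \in L^1(\mu)$ such that $\lambda_{ac}(E)=\int_E hd\mu$

for every $E \in \mathfrak{M}$.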

The unique pair $(\lambda_{ac},\lambda_s)$ is called the Lebesgue decomposition; the existence of $h$ is called the Radon-Nikodym theorem, and $h$ is called the Radon-Nikodym derivative. One also writes $d\lambda_{ac}=hd\mu$ or $\frac{d\lambda_{ac}}{d\mu}=h$ in this situation.

These are two separate theorems, but von Neumann gave the idea to prove these two at one stroke.

If we already have $\lambda \ll \mu$, then $\lambda_s=0$ and the Radon-Nikodym derivative shows up in the nature of things.

Also, one cannot ignore the fact that $m$ the Lebesgue measure is $\sigma$-finite.

Proof explained

Step 1 - Construct a bounded functional

We are going to employ Hilbert space technique in this proof. Precisely speaking, we are going to construct a bounded linear functional to find another function, namely $g$, which is the epicentre of this proof.

The boundedness of $\lambda$ is clear since it’s complex, but $\mu$ is only assumed to be $\sigma$-finite. Therefore we need some adjustment onto $\mu$.

1.1 Replacing $\mu$ with a finite measure

If $\mu$ is a positive $\sigma$-finite measure on a $\sigma$-algebra $\mathfrak{M}$ in a set $X$, then there is a function $w$ such that $w \in L^1(\mu)$ and $0<w(x)<1$ for every $x \in X$.

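The $\sigma$-finiteness of $\mu$ means that there exist some sets $E_n \in \mathfrak{M}$ such that

$$X=\bigcup_{n=1}^{\infty}E_n$$

and that $\mu(E_n)<\infty$ for all $n$.

Define

$$w_n(x)=\begin{cases}\frac{1}{2^n(1+\mu(E_n))}, & x \in E_n, \\ 0, & x \notin E_n\end{cases}$$

(you can also say that $w_n=\frac{1}{2^n(1+\mu(E_n))}\chi_{E_n}$), then

$$w=\sum_{n=1}^{\infty}w_n$$

satisfies $0<w<1$ for all $x$. With $w$, we are able to define a new measure, namely

$$\tilde{\mu}(E)=\int_E wd\mu.$$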

The fact that $\tilde{\mu}$ is a measure can be validated by considering $\int_{E}wd\mu=\int_{X}\chi_{E}wd\mu$. It’s more important that $\tilde{\mu}$ is bounded and $\tilde{\mu}(E)=0$ if and only if $\mu(E)=0$. The second one comes from the strict positivity of $w$. For the first one, notice that

$$\tilde{\mu}(X)=\int_X wd\mu=\sum_{n=1}^{\infty}\frac{\mu(E_n)}{2^n(1+\mu(E_n))} \leq \sum_{n=1}^{\infty}\frac{1}{2^n}=1.$$
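To make the construction of $w$ concrete, here is a sketch in a hypothetical toy case: $X=\{1,2,3,\dots\}$ with counting measure $\mu$ and $E_n=\{1,\dots,n\}$, so that $\mu(E_n)=n$. It assumes $w=\sum_n \frac{1}{2^n(1+\mu(E_n))}\chi_{E_n}$, which here becomes $w(x)=\sum_{n \geq x}\frac{1}{2^n(1+n)}$. The series are truncated, so the numbers are approximations from below.

```python
def w(x, terms=200):
    """w(x) = sum over n >= x of 1 / (2^n * (1 + n)), truncated at `terms`."""
    return sum(1.0 / (2 ** n * (1 + n)) for n in range(x, terms + 1))

def total_mass(points=60, terms=200):
    """Approximate integral of w with respect to counting measure: sum of w(x)."""
    return sum(w(x, terms) for x in range(1, points + 1))
```

Every value `w(x)` lies strictly between $0$ and $1$, and `total_mass()` stays below $1$, matching the claim that the new measure $\tilde{\mu}$ is bounded even though counting measure on $X$ is infinite.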

1.2 A bounded linear functional associated with $\lambda$

Since $\lambda$ is complex, without loss of generality, we are able to assume that $\lambda$ is a positive bounded measure on $\mathfrak{M}$. By 1.1, we are able to obtain a positive bounded measure by

$$\varphi=\lambda+\tilde{\mu}.$$

Following the construction of the Lebesgue integral, we have

$$\int_X fd\varphi=\int_X fd\lambda+\int_X fwd\mu$$

for all nonnegative measurable functions $f$. Also, since $\lambda \leq \varphi$, we have

$$\left|\int_X fd\lambda\right| \leq \int_X |f|d\lambda \leq \int_X |f|d\varphi \leq \sqrt{\varphi(X)}\lVert f \rVert_2$$

for $f \in L^2(\varphi)$ by the Schwarz inequality.

Since $\varphi(X)<\infty$, we have

$$\Lambda f=\int_X fd\lambda$$

to be a bounded linear functional on $L^2(\varphi)$.

Step 2 - Find the associated function with respect to $\lambda$

Since $L^2(\varphi)$ is a Hilbert space, and every bounded linear functional on a Hilbert space $H$ is given by an inner product with an element in $H$, there exists a function $g \in L^2(\varphi)$ such that

$$\Lambda f=\int_X fd\lambda=\int_X fgd\varphi=(f,g).$$

The properties of $L^2$ space show that $g$ is determined almost everywhere with respect to $\varphi$.

For $E \in \mathfrak{M}$ with $\varphi(E)>0$, putting $f=\chi_E$, we get

$$0 \leq \frac{1}{\varphi(E)}\int_E gd\varphi=\frac{\lambda(E)}{\varphi(E)} \leq 1,$$

which implies $0 \leq g \leq 1$ for almost every $x$ with respect to $\varphi$. Therefore we are able to assume that $0 \leq g \leq 1$ without ruining the identity. The proof is in the bag once we define $A$ to be the set where $0 \leq g < 1$ and $B$ the set where $g=1$.

Step 3 - Generate $\lambda_{ac}$ and $\lambda_{s}$ and the Radon-Nikodym derivative at one stroke

We claim that $\lambda(A \cap E)$ and $\lambda(B \cap E)$ form the decomposition we are looking for, $\lambda_{ac}$ and $\lambda_s$, respectively. Namely, $\lambda_{ac}(E)=\lambda(A \cap E)$, $\lambda_s(E)=\lambda(B \cap E)$.

Proving $\lambda_s \perp \mu$

If we combine $\Lambda{f}=(f,g)$ and $\varphi=\lambda+\tilde{\mu}$ together, we have

$$\int_X (1-g)fd\lambda=\int_X fgwd\mu.$$

Put $f=\chi_{B}$, we have

$$\int_B wd\mu=0.$$

Since $w$ is strictly positive, we see that $\mu(B)=0$. Notice that $A \cap B = \varnothing$ and $A \cup B=X$. For $E \in \mathfrak{M}$, we write $E=E_A \cup E_B$, where $E_A \subset A$ and $E_B \subset B$. Therefore

$$\mu(E)=\mu(E_A)+\mu(E_B)=\mu(E_A)=\mu(E \cap A).$$

Therefore $\mu$ is concentrated on $A$.

For $\lambda_s$, observe that

$$\lambda_s(E)=\lambda(E \cap B)=\lambda_s(E \cap B).$$

Hence $\lambda_s$ is concentrated on $B$. This observation shows that $\lambda_s \perp \mu$.

Proving $\lambda_{ac} \ll \mu$ by the Radon-Nikodym derivative

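The relation that $\lambda_{ac} \ll \mu$ will be shown by the existence of the Radon-Nikodym derivative.

If we replace $f$ by

$$(1+g+\cdots+g^n)\chi_E,$$

where $E \in \mathfrak{M}$, we have

$$\int_E (1-g^{n+1})d\lambda=\int_E g(1+g+\cdots+g^n)wd\mu.$$

Notice that

$$\lim_{n \to \infty}\left(1-g^{n+1}(x)\right)=\begin{cases}1, & x \in A, \\ 0, & x \in B,\end{cases}$$

so the left hand side tends to $\lambda(A \cap E)=\lambda_{ac}(E)$.

Define $h_n=g(1+g+g^2+\cdots+g^n)w$, we see that on $A$, $h_n$ converges monotonically to

$$h=\frac{gw}{1-g}.$$

(The values of $h$ on $B$ do not matter since $\mu(B)=0$.) By the monotone convergence theorem, we get

$$\lambda_{ac}(E)=\lambda(A \cap E)=\lim_{n \to \infty}\int_E h_nd\mu=\int_E hd\mu$$

for every $E\in\mathfrak{M}$.

The measurable function $h$ is the desired Radon-Nikodym derivative once we show that $h \in L^1(\mu)$. Replacing $E$ with $X$, we see that

$$\int_X |h|d\mu=\int_X hd\mu=\lambda_{ac}(X) \leq \lambda(X)<\infty.$$

Clearly, if $\mu(E)=0$, we have

$$\lambda_{ac}(E)=\int_E hd\mu=0,$$

which shows that

$$\lambda_{ac} \ll \mu$$

as desired.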
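On a finite set everything in this proof can be computed by hand, which makes a good sanity check. The sketch below assumes the construction goes through $\varphi=\lambda+w\,d\mu$, the density $g=d\lambda/d\varphi$, the sets $A=\{g<1\}$, $B=\{g=1\}$, and $h=\frac{gw}{1-g}$ on $A$. The point masses and the constant weight $w$ used in the test are hypothetical, and every point is assumed to have positive $\varphi$-mass so that $g$ is defined pointwise.

```python
from fractions import Fraction

def von_neumann_decomposition(lam, mu, w):
    """lam, mu: dicts point -> nonnegative mass; w: dict point -> value in (0, 1).

    Returns (A, B, h) where lam restricted to A is the absolutely continuous
    part, lam restricted to B is the singular part, and h is the density on A.
    """
    # lam({x}) = g(x) * phi({x}) with phi = lam + w * mu (assumed positive).
    g = {x: lam[x] / (lam[x] + w[x] * mu[x]) for x in lam}
    A = {x for x in g if g[x] < 1}
    B = {x for x in g if g[x] == 1}
    h = {x: g[x] * w[x] / (1 - g[x]) for x in A}
    return A, B, h
```

In this discrete setting $h$ reduces to the ratio $\lambda(\{x\})/\mu(\{x\})$ wherever $\mu(\{x\})>0$, and $B$ collects exactly the points charged by $\lambda$ but not by $\mu$.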

Step 4 - Generalization onto complex measures

So far we have proved this theorem for positive bounded measures. For real bounded measures, we can apply the preceding case to the positive and negative parts. For all complex measures, we have

$$\lambda=\lambda_1+i\lambda_2,$$

where $\lambda_1$ and $\lambda_2$ are real.

Step 5 - Uniqueness of the decomposition

If we have two Lebesgue decompositions of the same measure, namely $(\lambda_{ac},\lambda_s)$ and $(\lambda'_{ac},\lambda'_s)$, we shall show that

$$\lambda_{ac}=\lambda'_{ac}, \quad \lambda_s=\lambda'_s.$$

By the definition of the decomposition we get

$$\lambda_{ac}-\lambda'_{ac}=\lambda'_s-\lambda_s,$$

with $\lambda_{ac}-\lambda'_{ac} \ll \mu$ and $\lambda'_{s}-\lambda_{s}\perp\mu$. This implies that $\lambda'_{s}-\lambda_{s} \ll \mu$ as well.

Since $\lambda'_s-\lambda_s\perp\mu$, there exists a set $A$ with $\mu(A)=0$ on which $\lambda'_s-\lambda_s$ is concentrated; the absolute continuity shows that $\lambda'_s(E)-\lambda_s(E)=0$ for all $E \subset A$. Hence $\lambda'_s-\lambda_s$ is concentrated on $X-A$ as well. Therefore we get $(\lambda'_s-\lambda_s)\perp(\lambda'_s-\lambda_s)$, which forces $\lambda'_s-\lambda_s=0$. The uniqueness is proved.

(Following the same process one can also show that $\lambda_{ac}\perp\lambda_s$.)

Topological properties of the zeros of a holomorphic function

What’s going on

If for every $z_0 \in \Omega$, where $\Omega$ is an open set in the plane, the limit

$$\lim_{z \to z_0}\frac{f(z)-f(z_0)}{z-z_0}$$

exists, we say that $f$ is holomorphic (a.k.a. analytic) in $\Omega$. If $f$ is holomorphic in the whole plane, it’s called entire. The class of all holomorphic functions (denoted by $H(\Omega)$) has many interesting properties. For example it does form a ring.

But what happens if we talk about the points where $f$ is equal to $0$? Is it possible to find an entire function $g$ such that $g(z)=0$ if and only if $z$ is on the unit circle? The topological property we will discuss in this post answers this question negatively.


Suppose $\Omega$ is a region and $f \in H(\Omega)$. Then the set

$$Z(f)=\{z \in \Omega: f(z)=0\}$$

is at most countable and has no limit point in $\Omega$, as long as $f$ is not identically equal to $0$ on $\Omega$.

Trivially, if $f(\Omega)=\{0\}$, we have $Z(f)=\Omega$. The unit circle is uncountable and every point of it is a limit point of it. Hence if an entire function is equal to $0$ on the unit circle, then the function is equal to $0$ on the whole plane.

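Note: the connectedness of $\Omega$ is important. For example, for two disjoint nonempty open sets $\Omega_0$ and $\Omega_1$, define $f(z)=0$ on $\Omega_0$ and $f(z)=1$ on $\Omega_1$. Then $f$ is holomorphic on $\Omega_0 \cup \Omega_1$ and not identically $0$, yet $Z(f)=\Omega_0$ is uncountable and every point of it is a limit point, so the conclusion fails.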
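It is worth stressing that the statement only excludes limit points inside $\Omega$. As an illustration (my own example, not part of the original argument), take $\Omega=\mathbb{C}\setminus\{0\}$ and $f(z)=\sin(1/z)$:

$$Z(f)=\left\{\frac{1}{k\pi}: k \in \mathbb{Z},\ k \neq 0\right\}, \qquad \frac{1}{k\pi} \to 0 \notin \Omega \quad (|k| \to \infty).$$

The zero set is countable, every zero is isolated, and the only accumulation point is $0$, which lies on the boundary of $\Omega$. So this is consistent with the statement even though the zeros pile up.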

A simple application (Feat. Baire Category Theorem)

Before establishing the proof, let’s see what we can do using this result.

Suppose that $f$ is an entire function, and that for every $a \in \mathbb{C}$ the power series

$$f(z)=\sum_{n=0}^{\infty}c_n(z-a)^n$$

has at least one coefficient equal to $0$. Then $f$ is a polynomial.

Clearly we have $n!c_n=f^{(n)}(a)$, thus for every $a \in \mathbb{C}$, we can find a nonnegative integer $n_0$ such that $f^{(n_0)}(a)=0$. Thus we establish the identity:

$$\mathbb{C}=\bigcup_{n=0}^{\infty}Z(f^{(n)}).$$

Notice that each $f^{(n)}$ is entire. So $Z(f^{(n)})$ is either an at most countable set without limit point, or simply equal to $\mathbb{C}$. If there exists a number $N$ such that $Z(f^{(N)})=\mathbb{C}$, then naturally $Z(f^{(n)})=\mathbb{C}$ holds for all $n \geq N$. In that case $c_n=f^{(n)}(a)/n!=0$ for all $n \geq N$, so the power series of $f$ has only finitely many nonzero coefficients, and $f$ is a polynomial.

So the question is: does such an $N$ always exist? Suppose not. Then every $Z(f^{(n)})$ is an at most countable set without limit points; such a set is closed and has empty interior, hence is nowhere dense. But according to the Baire Category Theorem, $\mathbb{C}$ cannot be a countable union of nowhere dense sets (that is, $\mathbb{C}$ is not of the first category). This contradicts the identity we established and forces the existence of $N$.
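Two quick sanity checks of this result, with functions chosen by me purely for illustration. For $f(z)=e^z$ we have $f^{(n)}(a)=e^a$, so the expansion about any $a$ has

$$c_n=\frac{e^a}{n!} \neq 0 \quad \text{for all } n \geq 0,$$

which is consistent with $e^z$ not being a polynomial. On the other hand, for $f(z)=z^2$,

$$z^2=a^2+2a(z-a)+(z-a)^2,$$

so $c_n=0$ for every $n \geq 3$ and every centre $a$, and here $N=3$ works.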


The proof will be finished using some basic topology techniques.

Let $A$ be the set of all limit points of $Z(f)$ in $\Omega$. The continuity of $f$ shows that $A \subset Z(f)$. We’ll show that if $A \neq \varnothing$, then $Z(f)=\Omega$.

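First we claim that if $a \in A$, then $a \in \bigcap_{n \geq 0}Z(f^{(n)})$. That is, $f^{(k)}(a) = 0$ for all $k \geq 0$. Suppose this fails. Since $f(a)=0$, there is a smallest positive integer $m$ such that $c_m \neq 0$ in the power series of $f$ on a disc $D(a;r) \subset \Omega$:

$$f(z)=\sum_{n=0}^{\infty}c_n(z-a)^n.$$

Define

$$g(z)=\begin{cases}(z-a)^{-m}f(z), & z \in D(a;r)\setminus\{a\}, \\ c_m, & z=a.\end{cases}$$

It’s clear that $g \in H(D(a;r))$ since we have

$$g(z)=\sum_{k=0}^{\infty}c_{m+k}(z-a)^k.$$

But $a$ is a limit point of $Z(f)$, so there are zeros $z_j \neq a$ of $f$ with $z_j \to a$. Then $g(z_j)=0$ for all $j$, and the continuity of $g$ shows that $g(a)=0$, while $g(a)=c_m \neq 0$. A contradiction.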
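To see the factorisation at work in a concrete case, take (as an illustrative choice of mine) $f(z)=1-\cos z$ and $a=0$. Then

$$1-\cos z=\frac{z^2}{2}-\frac{z^4}{24}+\cdots=z^2\left(\frac{1}{2}-\frac{z^2}{24}+\cdots\right),$$

so $m=2$ and $g(z)=\frac{1}{2}-\frac{z^2}{24}+\cdots$ with $g(0)=\frac{1}{2} \neq 0$. By continuity $g$ has no zero in some disc around $0$, hence $0$ is an isolated zero of $f$ and does not belong to $A$. This is the situation the contradiction above rules out for points of $A$.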

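Next fix a point $b \in \Omega$. Since $\Omega$ is a region, we can choose a curve (continuous mapping) $\gamma$ defined on $[0,1]$ with values in $\Omega$ such that $\gamma(0)=a$ and $\gamma(1)=b$. Let

$$\Gamma=\left\{t \in [0,1]: \gamma(t) \in \bigcap_{n \geq 0}Z(f^{(n)})\right\}.$$

By hypothesis, $0 \in \Gamma$. We shall prove that $1 \in \Gamma$. Let

$$s=\sup\Gamma.$$

There exists a sequence $\{t_n\}\subset\Gamma$ such that $t_n \to s$. The continuity of $f^{(k)}$ and $\gamma$ shows that

$$f^{(k)}(\gamma(s))=\lim_{n \to \infty}f^{(k)}(\gamma(t_n))=0 \quad \text{for all } k \geq 0.$$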

Hence $s \in \Gamma$. Choose a disc $D(\gamma(s);\delta)\subset\Omega$. On this disc, $f$ is represented by its power series about $\gamma(s)$, but all coefficients are $0$. It follows that $f(z)=0$ for all $z \in D(\gamma(s);\delta)$, and further that $f^{(k)}(z)=0$ for all $z \in D(\gamma(s);\delta)$ and all $k \geq 0$. By the continuity of $\gamma$, there exists $\varepsilon>0$ such that $\gamma\big((s-\varepsilon,s+\varepsilon)\cap[0,1]\big)\subset D(\gamma(s);\delta)$, which implies that $(s-\varepsilon, s+\varepsilon)\cap[0,1]\subset\Gamma$. If $s<1$, then $\Gamma$ would contain points larger than $s$, contradicting $s=\sup\Gamma$. So $s=1$, and therefore $1 \in \Gamma$.

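Since $b \in \Omega$ was arbitrary, we have shown that $\Omega = \bigcap_{n \geq 0}Z(f^{(n)})$, which forces $Z(f)=\Omega$. In other words, if $Z(f)$ has a limit point in $\Omega$, then $f$ is identically $0$; equivalently, if $f$ is not identically $0$, then $Z(f)$ has no limit point in $\Omega$.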

When $Z(f)$ has no limit point in $\Omega$, all points of $Z(f)$ are isolated. Moreover, each compact subset $K$ of $\Omega$ contains at most finitely many points of $Z(f)$, since an infinite subset of $K$ would have a limit point in $K \subset \Omega$. Since $\Omega$ is $\sigma$-compact, $Z(f)$ is at most countable. In this situation $Z(f)$ is also called a discrete set.