The Big Three Pt. 1 - Baire Category Theorem Explained

Three theorems about Banach spaces appear again and again at the heart of functional analysis. They are often called the ‘big three’:

1. The Hahn-Banach Theorem
2. The Banach-Steinhaus Theorem
3. The Open Mapping Theorem

The upcoming series of blog posts is intended to offer an explanation that is friendly to self-study and richer in detail. Some background in basic analysis and topology is required.

## First and second category

The term ‘category’ is due to Baire, who developed the category theorem afterwards. Let $X$ be a topological space. A set $E \subset X$ is said to be nowhere dense if $\overline{E}$ has empty interior, i.e. $\text{int}(\overline{E})= \varnothing$.

There are some easy examples of nowhere dense sets. For example, suppose $X=\mathbb{R}$, equipped with the usual topology. Then $\mathbb{N}$ is nowhere dense in $\mathbb{R}$ while $\mathbb{Q}$ is not. Indeed, $\overline{\mathbb{N}}=\mathbb{N}$, which has empty interior, whereas $\overline{\mathbb{Q}}=\mathbb{R}$, which is open and therefore equal to its own interior. Category is defined in terms of nowhere dense sets:

• A set $S$ is of the first category if $S$ is a countable union of nowhere dense sets.
• A set $T$ is of the second category if $T$ is not of the first category.
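
As a quick illustration (a standard example, added here as a sketch rather than taken from the discussion above), a set can be of the first category without being nowhere dense. Enumerating the rationals as $q_1,q_2,\cdots$, each singleton $\{q_n\}$ is closed with empty interior in $\mathbb{R}$, so

$$
\mathbb{Q}=\bigcup_{n=1}^{\infty}\{q_n\}
$$

is of the first category in $\mathbb{R}$, even though $\overline{\mathbb{Q}}=\mathbb{R}$ has nonempty interior.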

## Baire category theorem (BCT)

In this blog post, we consider two cases: BCT in complete metric spaces and in locally compact Hausdorff spaces. The two classes overlap, but neither contains the other: there are complete metric spaces that are not locally compact (any infinite-dimensional Banach space, for instance), and there are locally compact Hausdorff spaces that are not metrizable.

Some classic topological spaces, for example $\mathbb{R}^n$, are both complete metric spaces and locally compact Hausdorff spaces. If a locally compact Hausdorff space happens to be a topological vector space, then it has finite dimension. Also, under the usual convention that points are closed, a topological vector space is automatically Hausdorff.

By a Baire space we mean a topological space $X$ such that the intersection of every countable collection of dense open subsets of $X$ is also dense in $X$.

The Baire category theorem states that

(BCT 1) Every complete metric space is a Baire space.

(BCT 2) Every locally compact Hausdorff space is a Baire space.

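Passing to complements in the definition, we can see that a nonempty Baire space is never of the first category in itself.

To see this, suppose $\{X_n\}$ is a sequence of open sets, each dense in $X$. In a Baire space, $X_0=\cap_n X_n$ is also dense in $X$. Each complement $X_n^c$ is closed with empty interior, hence nowhere dense, so $X_0^{c} = \cup_n X_n^c$ is a countable union of nowhere dense sets, i.e. of the first category. Conversely, if $X$ itself were a countable union of nowhere dense sets $E_n$, then the open dense sets $X_n=(\overline{E_n})^c$ would have empty intersection, which is impossible when $X$ is nonempty.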

### Proving BCT 1 and BCT 2 via Choquet game

Let $X$ be the given complete metric space or locally compact Hausdorff space, and $\{V_n\}$ a countable collection of dense open subsets of $X$. Pick an arbitrary nonempty open subset of $X$, namely $A_0$. To prove that $\cap_n V_n$ is dense, we have to show that $A_0 \cap \left(\cap_n V_n\right) \neq \varnothing$. This follows from the definition of denseness. Recall that

A subset $A$ of $X$ is dense if and only if $A \cap U \neq \varnothing$ for all nonempty open subsets $U$ of $X$.

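We pick a sequence of nonempty open sets $\{A_n\}$ inductively. With $A_{n-1}$ being picked, and since $V_n$ is open and dense in $X$, the intersection $V_n \cap A_{n-1}$ is nonempty and open. $A_n$ can be chosen such that

$$
\overline{A_n} \subset V_n \cap A_{n-1}.
$$

For BCT 1, $A_n$ can be chosen to be an open ball with radius $< \frac{1}{n}$; for BCT 2, $A_n$ can be chosen such that its closure is compact. Define

$$
C = \bigcap_{n=1}^{\infty}\overline{A_n}.
$$

Now, if $X$ is a locally compact Hausdorff space, then the sets $\overline{A_n}$ are nonempty, compact and nested, so $C$ is not empty. Therefore we have

$$
C \subset \overline{A_1} \subset A_0 \quad \text{and} \quad C \subset \overline{A_n} \subset V_n \quad (n=1,2,\cdots),
$$

which shows that $A_0 \cap \left(\cap_n V_n\right) \neq \varnothing$. BCT 2 is proved.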

For BCT 1, we cannot follow this since it’s not ensured that $X$ has the Heine-Borel property, for example when $X$ is an infinite-dimensional Hilbert space (this is also a reason why BCT 1 and BCT 2 are not equivalent). The only tool remaining is the Cauchy sequence. But how and where?

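For any $\varepsilon > 0$, we have some $N$ such that $\frac{2}{N} < \varepsilon$. For all $m>n>N$, we have $A_m \subset A_n\subset A_N$, and $A_N$ is a ball of radius $<\frac{1}{N}$, so the centers of $A_m$ and $A_n$ are less than $\frac{2}{N}<\varepsilon$ apart. Therefore the centers of $\{A_n\}$ form a Cauchy sequence, which converges by completeness. Its limit lies in every $\overline{A_n}$, hence in $C$, which implies that $C \neq \varnothing$. As before, $C \subset A_0 \cap \left(\cap_n V_n\right)$, and BCT 1 follows.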

## Applications of BCT

BCT is used directly in the proofs of the Banach-Steinhaus theorem and the open mapping theorem, so it can be regarded as their origin. But there are many other applications in different branches of mathematics. The applications shown below follow the same pattern: if the claim did not hold, we would have a nonempty Baire space of the first category, which is not possible.

$\mathbb{R}$ is uncountable

Suppose $\mathbb{R}$ is countable, then we have

$$
\mathbb{R} = \bigcup_{n=1}^{\infty}\{x_n\}
$$

where $x_n$ is a real number. But $\{x_n\}$ is nowhere dense, therefore $\mathbb{R}$ is of the first category. A contradiction.

Suppose that $f$ is an entire function, and that every power series

$$
f(z)=\sum_{n=0}^{\infty}c_n(z-a)^n
$$

has at least one coefficient equal to $0$. Then $f$ is a polynomial (there exists an $N$ such that $c_n=0$ for all $n>N$).

You can find the proof here. We are using the fact that $\mathbb{C}$ is complete.

An infinite dimensional Banach space $B$ has no countable basis

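Assume that $B$ has a countable (Hamel) basis $\{x_1,x_2,\cdots\}$ and define

$$
B_n = \operatorname{span}\{x_1,x_2,\cdots,x_n\}.
$$

It can be easily shown that $B_n$ is nowhere dense. Since every element of $B$ is a finite linear combination of basis vectors, $B=\cup_n B_n$. This is a contradiction since $B$ is a complete metric space.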
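
The claim that each $B_n$ is nowhere dense can be spelled out as follows. This is a standard argument, sketched here under the assumption that $B_n$ denotes the span of the first $n$ basis vectors and that $U$ denotes the open unit ball of $B$. Being finite dimensional, $B_n$ is closed, so it suffices to show that it has empty interior. If $B_n$ contained a ball $x+rU$ with $r>0$, then, since $B_n$ is a subspace,

$$
x + rU \subset B_n \implies rU \subset B_n \implies B=\bigcup_{k=1}^{\infty}k\,rU \subset B_n,
$$

which would make $B$ finite dimensional.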

## The series

Since there is no strong reason to write more posts on this topic, i.e. the three fundamental theorems of linear functional analysis, I think it’s time to make a list of the series. It’s been around half a year.

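How to find the Jordan normal form: a worked example with a 4×4 matrix

## The Jordan normal form of a complex square matrix always exists

### Understanding it from the viewpoint of matrix operations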

More properties of zeros of an entire function

## What’s going on again

In this post we discussed the topological properties of the zeros of a nonzero entire function, or roughly, what the set of zeros looks like. The zero set has no limit point and is at most countable (finite or countably infinite). If it is finite, we can list the zeros one by one; for example, the function $f(z)=z$ has exactly one zero. But what if there are infinitely many? How fast does their number grow?

Another question. Suppose we have an entire function $f$, and the zeros of $f$, namely $z_1,z_2,\cdots$, are ordered increasingly by moduli:

$$
|z_1| \leq |z_2| \leq |z_3| \leq \cdots
$$

Is it possible to get a fine enough estimate of $|z_n|$? Interestingly enough, we can get there with the help of Jensen’s formula.

## Jensen’s formula

Suppose $\Omega=D(0;R)$, $f \in H(\Omega)$, $f(0) \neq 0$, $0<r<R$, and $z_1,z_2,\cdots,z_{n(r)}$ are the zeros of $f$ in $\overline{D}(0;r)$, listed according to their multiplicities. Then

$$
|f(0)|\prod_{n=1}^{n(r)}\frac{r}{|z_n|}=\exp\left\{\frac{1}{2\pi}\int_{-\pi}^{\pi}\log|f(re^{i\theta})|d\theta\right\}.
$$

There is no need to worry about the assumption $f(0) \neq 0$. Take another look at this proof. Every zero $a$ has a unique positive integer $m$ such that $f(z)=(z-a)^mg(z)$ with $g \in H(\Omega)$ and $g(a) \neq 0$. The number $m$ is called the order of the zero at $a$. Therefore if we have $f(0)=0$ we can simply consider another function, namely $\frac{f}{z^m}$ where $m$ is the order of the zero at $0$.

We are not proving this identity at this point. But it can be done by considering the following function

$$
g(z)=f(z)\prod_{j=1}^{m}\frac{r^2-\overline{z_j}z}{r(z_j-z)}\prod_{j=m+1}^{n(r)}\frac{z_j}{z_j-z},
$$

where $m$ is found by ordering $z_j$ in such a way that $z_1,\cdots,z_m \in D(0;r)$ and $|z_{m+1}|=\cdots=|z_{n(r)}|=r$. One can prove this identity by considering $|g(0)|$ as well as $\log|g(re^{i\theta})|$.

## Several applications

### The number of zeros of $f$ in $\overline{D}(0;r)$

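For simplicity we shall assume $f(0)=1$, which loses no generality. Let

$$
M(r)=\sup_{\theta}|f(re^{i\theta})| \quad (0<r<\infty)
$$

and $n(r)$ be the number of zeros of $f$ in $\overline{D}(0;r)$. By the maximum modulus theorem, $M(r)$ is also the supremum of $|f|$ over the whole closed disc, and we have

$$
\frac{1}{2\pi}\int_{-\pi}^{\pi}\log|f(2re^{i\theta})|d\theta \leq \log M(2r).
$$

If we insert Jensen’s formula (on the circle of radius $2r$) into this inequality and order the zeros $z_n$ by increasing moduli, we get

$$
\log M(2r) \geq \sum_{n=1}^{n(2r)}\log\frac{2r}{|z_n|} \geq \sum_{n=1}^{n(r)}\log\frac{2r}{|z_n|} \geq n(r)\log 2,
$$

which implies

$$
n(r) \leq \frac{\log M(2r)}{\log 2}.
$$

So $n(r)$ is controlled by $M(2r)$. The second and third inequalities look tricky and require more explanation.

First we should notice that $|z_n| \leq 2r$ for every zero $z_n \in \overline{D}(0;2r)$. Hence we have $\log\frac{2r}{|z_n|} \geq \log1=0$ for all such $z_n$, so dropping the terms with $n(r)<n\leq n(2r)$ can only decrease the sum, and the second inequality follows. For the third one, since $|z_n| \leq r$ for $n \leq n(r)$, we simply have

$$
\log\frac{2r}{|z_n|} \geq \log\frac{2r}{r} = \log 2.
$$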

So this is it, the rapidity with which $n(r)$ can grow is dominated by $M(r)$. Namely, the number of zeros of $f$ in the closed disc with radius $r$ is controlled by the maximum modulus of $f$ on a circle with bigger radius.
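
As a numerical sanity check of the bound $n(r)\log 2 \leq \log M(2r)$, here is a minimal sketch for the entire function $\cos z$, which satisfies $f(0)=1$. The choice of $\cos$ is an assumption made for illustration and does not come from the text above. It relies on two standard facts: the zeros of $\cos z$ are $(m+\frac{1}{2})\pi$ for integers $m$, and the maximum of $|\cos z|$ on $|z|=r$ is $\cosh r$. The function names are hypothetical.

```python
import math


def zeros_of_cos_in_disc(r: float) -> int:
    """Number of zeros of cos(z) with |z| <= r.

    The zeros are (m + 1/2) * pi for integer m; all are real and simple.
    """
    count = 0
    m = 0
    while (m + 0.5) * math.pi <= r:
        count += 2  # the zero (m + 1/2) * pi and its mirror image
        m += 1
    return count


def log_max_modulus_of_cos(r: float) -> float:
    """log M(r) for cos(z): the maximum on |z| = r is cosh(r), attained at z = i*r."""
    return math.log(math.cosh(r))


def jensen_bound_holds(r: float) -> bool:
    """Check the estimate n(r) * log 2 <= log M(2r)."""
    return zeros_of_cos_in_disc(r) * math.log(2) <= log_max_modulus_of_cos(2 * r)


if __name__ == "__main__":
    for r in (1.0, 2.0, 5.0, 10.0, 50.0):
        bound = log_max_modulus_of_cos(2 * r) / math.log(2)
        print(r, zeros_of_cos_in_disc(r), bound)
```

For $\cos z$ the count $n(r)$ grows linearly while $\log M(2r)$ is close to $2r$, so the bound holds with plenty of room; it is an upper estimate, not an asymptotic formula.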

### Examples based on different $M(r)$

Let’s begin with a simple example. Let $f(z)=1$; then $M(r)=1$ for all $r$ and $n(r)=0$, so the estimate tells us nothing. Indeed, if $M(r)$ is bounded by a constant, then $f$ is bounded, hence constant by Liouville’s theorem, and the estimate carries no information.

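But if $M(r)$ grows properly, things become interesting. For example, if we have

$$
M(r) < \exp(Ar^k)
$$

for all large $r$, where $A$ and $k$ are given positive numbers, we have a good enough estimate given by

$$
n(r) \leq \frac{\log M(2r)}{\log 2} < \frac{A2^kr^k}{\log 2}.
$$

This estimate becomes interesting if we consider the logarithms of $n(r)$ and $r$, that is

$$
\limsup_{r\to\infty}\frac{\log n(r)}{\log r} \leq k.
$$

If we have $f(z)=1-\exp(z^k)$ where $k$ is a positive integer, we have $n(r) \sim \frac{kr^k}{\pi}$, and also

$$
\lim_{r\to\infty}\frac{\log n(r)}{\log r}=k.
$$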

### Lower bound of $|z_{n(r)}|$

Here we’ll see how to obtain a lower bound for $|z_{n(r)}|$ using Jensen’s formula, provided that $M(r)$, or simply an upper bound of $|f(z)|$, is properly described. Without loss of generality we shall assume that $f(0)=1$. Also, we assume that the zeros of $f(z)$ are ordered by increasing moduli.

First we still consider

and see what will happen.

By Jensen’s, we have

This gives

By the arrangement of $\{z_n\}$, we have

Another example is when we have

where $\Im{z}$ means the imaginary part of $z$.

We shall notice that in this case,

Following Jensen’s formula, we therefore have

Linear ODE but Quotient Spaces

## What does this blog do

We are treating linear ODE problems as an example of quotient space in this blog post. You are assumed to be able to solve linear ODEs without theoretical problems, and have some basic understanding of normal subgroups.

## General theories

### Quotient Space

Let $X$ be a vector space, and $N$ a subspace of it. Naturally $N$ is normal in $X$ since $X$ is abelian (as an additive group). Define

$$
\pi(x)=x+N
$$

for $x \in X$. The collection of sets $\{\pi(x):x \in X\}$ is the quotient space of $X$ modulo $N$, written $X/N$. Addition and scalar multiplication are defined by

$$
\pi(x)+\pi(y)=\pi(x+y), \quad \alpha\pi(x)=\pi(\alpha x).
$$

This is well-defined since $N$ is a vector space. The kernel of $\pi$, which is also the origin of $X/N$, is $N=0+N$. People call $\pi$ the canonical map.
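
The well-definedness claim can be checked directly. A short verification, assuming the usual definitions $\pi(x)+\pi(y)=\pi(x+y)$ and $\alpha\pi(x)=\pi(\alpha x)$: if $\pi(x)=\pi(x')$ and $\pi(y)=\pi(y')$, then $x-x' \in N$ and $y-y' \in N$, so

$$
(x+y)-(x'+y')=(x-x')+(y-y') \in N, \quad \alpha x-\alpha x'=\alpha(x-x') \in N,
$$

that is, $\pi(x+y)=\pi(x'+y')$ and $\pi(\alpha x)=\pi(\alpha x')$. The result does not depend on the chosen representatives.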

### Linear ODE

If one solves a linear ODE system of $n$ equations on an interval $(a,b)$, namely

$$
\frac{d\mathbf{y}}{dx}=\mathbf{Ay}+\mathbf{f}(x),
$$

where $\mathbf{A}$ is a constant matrix, they will find that the solution can be written as

$$
\mathbf{y}=\mathbf{\Phi}(x)\mathbf{c}+\mathbf{\Phi}(x)\int_{x_0}^{x}\mathbf{\Phi}^{-1}(s)\mathbf{f}(s)ds,
$$

where $x_0 \in (a,b)$ is fixed, $\mathbf{c}$ is an arbitrary constant vector, $\mathbf{\Phi}(x)=\begin{bmatrix}\mathbf{\varphi_1},\mathbf{\varphi_2},\cdots,\mathbf{\varphi_n}\end{bmatrix}$ and $\{\mathbf{\varphi_i}\}$ are the fundamental solutions of $\frac{d\mathbf{y}}{dx}=\mathbf{Ay}$. We’ll translate this into the language of quotient spaces. We have to show that the solutions of the equation form an element of $X/N$.

## Steps to quotient space

So where is the $X$? It suffices to pick $\mathcal{C}^n$, the space of all functions $\mathbf{y}=(y_1,y_2,\cdots,y_n)^T$ such that each $y_k$ is $n$ times differentiable. The crux therefore becomes finding $N$, and we’ll show that it consists exactly of the functions of the form $\mathbf{\Phi}(x)\mathbf{c}$.

### The solutions of $\frac{d\mathbf{y}}{dx}=\mathbf{Ay}$ form a vector space

Indeed, it’s trivially verified since $\frac{d}{dx}$ and matrix multiplication are linear. The question is, what does this vector space look like? Why does a fundamental system of solutions have exactly $n$ elements? Does that mean this space (denoted by $N$) has dimension $n$? Fortunately, the following isomorphism answers this question in the affirmative.

Theorem The vector space $N$ is isomorphic to $\mathbb{R}^n$

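Pick and fix $x_0 \in (a,b)$. Picard’s existence and uniqueness theorem ensures that, for every $\mathbf{y}_0 \in \mathbb{R}^n$, the initial value problem

$$
\frac{d\mathbf{y}}{dx}=\mathbf{Ay}, \quad \mathbf{y}(x_0)=\mathbf{y}_0
$$

has a unique solution. Therefore we have a bijection

$$
H:N \to \mathbb{R}^n, \quad \mathbf{y} \mapsto \mathbf{y}(x_0).
$$

It suffices to prove that $H$ is linear. Namely, we need to show that

$$
H(\alpha\mathbf{y}+\beta\mathbf{z})=\alpha H(\mathbf{y})+\beta H(\mathbf{z}),
$$

which is trivial, since

$$
(\alpha\mathbf{y}+\beta\mathbf{z})(x_0)=\alpha\mathbf{y}(x_0)+\beta\mathbf{z}(x_0).
$$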

### Quotient space and coset are there

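Let’s see this solution again:

$$
\mathbf{y}=\mathbf{\Phi}(x)\mathbf{c}+\mathbf{\Phi}(x)\int_{x_0}^{x}\mathbf{\Phi}^{-1}(s)\mathbf{f}(s)ds.
$$

For $\mathbf{c}=(c_1,c_2,\cdots,c_n)^T$, we have

$$
\mathbf{\Phi}(x)\mathbf{c}=c_1\mathbf{\varphi_1}+c_2\mathbf{\varphi_2}+\cdots+c_n\mathbf{\varphi_n}.
$$

Notice that $\{\mathbf{\varphi_k}\}$ is a basis of $N$, and there we have it: $\mathbf{c}$ can be any element of $\mathbb{R}^n$, hence $\mathbf{\Phi}(x)\mathbf{c}$ runs through all elements of $N$. Also we know that the function

$$
\mathbf{z}(x)=\mathbf{\Phi}(x)\int_{x_0}^{x}\mathbf{\Phi}^{-1}(s)\mathbf{f}(s)ds
$$

is a particular solution to $\frac{d\mathbf{y}}{dx}=\mathbf{Ay}+\mathbf{f}$. Thus we have the coset

$$
\pi(\mathbf{z})=\mathbf{z}+N
$$

to be the set of all solutions, where $\pi(\mathbf{z}) \in X/N$.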
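
The coset picture can be checked numerically on a small example. The sketch below is an illustration under assumptions not found in the text: it takes $\mathbf{A}=\begin{bmatrix}0&1\\-1&0\end{bmatrix}$, the constant forcing $\mathbf{f}=(0,1)^T$ and $x_0=0$, for which the fundamental matrix and the particular solution have closed forms. It verifies by finite differences that every $\mathbf{\Phi}(x)\mathbf{c}+\mathbf{z}(x)$ solves the inhomogeneous system, and that the difference of two such solutions lies in $N$. All function names are hypothetical.

```python
import math


def phi(x):
    """Fundamental matrix of y' = A y for A = [[0, 1], [-1, 0]]; its columns solve the system."""
    return [[math.cos(x), math.sin(x)],
            [-math.sin(x), math.cos(x)]]


def particular(x):
    """z(x) = Phi(x) * integral_0^x Phi^{-1}(s) f ds for f = (0, 1)^T, in closed form."""
    return [1.0 - math.cos(x), math.sin(x)]


def solution(c, x):
    """An element of the coset z + N, namely Phi(x) c + z(x)."""
    p, z = phi(x), particular(x)
    return [p[0][0] * c[0] + p[0][1] * c[1] + z[0],
            p[1][0] * c[0] + p[1][1] * c[1] + z[1]]


def residual(y, x, forcing, h=1e-6):
    """Max-norm of y'(x) - A y(x) - forcing, with y' taken from a central difference."""
    derivative = [(a - b) / (2 * h) for a, b in zip(y(x + h), y(x - h))]
    value = y(x)
    rhs = [value[1] + forcing[0], -value[0] + forcing[1]]
    return max(abs(d - r) for d, r in zip(derivative, rhs))


if __name__ == "__main__":
    u = lambda x: solution([2.0, -3.0], x)
    v = lambda x: solution([0.5, 1.0], x)
    diff = lambda x: [a - b for a, b in zip(u(x), v(x))]
    print(residual(u, 0.7, (0.0, 1.0)))     # u solves y' = A y + f
    print(residual(diff, 0.7, (0.0, 0.0)))  # u - v solves y' = A y
```

Both residuals are small up to finite-difference error, which is the numerical counterpart of saying that $u$ and $v$ represent the same element $\pi(\mathbf{z})$ of $X/N$.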

Prove Picard's Existence and Uniqueness Theorem on Different Levels

## IVP and Picard’s

By IVP (Initial Value Problem), we mean the problem of solving

$$
\frac{dy}{dx}=f(x,y), \quad y(x_0)=y_0.
$$

Of course we hope to solve this problem easily and obtain a simple result, but that won’t happen if the function $f$ is ‘ugly’ enough. Hence the question is, in general and in theory: when does some solution exist, and when is it unique? Fortunately Picard ensured that

If $f$ is uniformly Lipschitz continuous in $y$ and continuous in $x$ on $R=\{(x,y):|x-x_0| \leq a,|y-y_0| \leq b\}$, then for some $\varepsilon>0$, there exists a unique solution $y$ to the IVP on the interval $[x_0-\varepsilon,x_0+\varepsilon]$.

Interestingly enough, there are several ways to prove it on different levels. This blog post offers two proofs.

## Proving Picard’s existence and uniqueness theorem

### Some preparation

By uniformly Lipschitz continuous (in $y$), we mean that for all $(x,y_1),(x,y_2) \in R$, $f$ satisfies

$$
|f(x,y_1)-f(x,y_2)| \leq L|y_1-y_2|
$$

for some $L > 0$ independent of $x$. This condition is useful in many different branches of mathematics.

Also, it’s easy to verify that the IVP is equivalent to the integral equation

$$
y(x)=y_0+\int_{x_0}^{x}f(t,y(t))dt.
$$

And yes, our job becomes finding a $y$ satisfying this equation.

### A sketch of the proof in elementary calculus

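Honestly the proof is kind of long. I’ll leave the basic steps here. First, we define the Picard sequence by

$$
y_0(x)=y_0, \quad y_{n+1}(x)=y_0+\int_{x_0}^{x}f(t,y_n(t))dt
$$

for $n=0,1,\cdots$. It can be shown by induction that

$$
|y_n(x)-y_0| \leq M|x-x_0| \leq b \quad \text{for } |x-x_0| \leq \varepsilon,
$$

where $M = \sup\limits_{(x,y) \in R}|f(x,y)|$ and $\varepsilon = \min\left(a,\frac{b}{M}\right)$. We want to prove that

$$
y(x)=\lim_{n\to\infty}y_n(x)
$$

is the solution to the IVP, where the Lipschitz condition comes into play by considering

$$
|y_{n+1}(x)-y_n(x)| \leq L\left|\int_{x_0}^{x}|y_n(t)-y_{n-1}(t)|dt\right| \leq \frac{ML^n|x-x_0|^{n+1}}{(n+1)!}.
$$

Finally, it should be shown that $y$ is the unique solution. To do this, it can be shown that, if we have two solutions, say $y=u(x)$ and $y=v(x)$, then $u(x)-v(x)=0$.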
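
To see the Picard sequence at work, here is a minimal sketch for the specific IVP $y'=y$, $y(0)=1$. This example is an assumption chosen for illustration; it is convenient because every iterate is a polynomial and can be computed exactly. The helper names are hypothetical. The iterates turn out to be the Taylor partial sums of $e^x$.

```python
from fractions import Fraction


def picard_step(coeffs):
    """One Picard step for y' = y, y(0) = 1.

    coeffs[k] is the coefficient of x**k in y_n; the result represents
    y_{n+1}(x) = 1 + integral_0^x y_n(t) dt.
    """
    integrated = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]
    integrated[0] += 1  # add the initial value y_0 = 1
    return integrated


def picard_iterate(n):
    """Coefficient list of the n-th Picard iterate, starting from y_0(x) = 1."""
    coeffs = [Fraction(1)]
    for _ in range(n):
        coeffs = picard_step(coeffs)
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(float(c) * x ** k for k, c in enumerate(coeffs))


if __name__ == "__main__":
    print(picard_iterate(3))                 # coefficients 1, 1, 1/2, 1/6
    print(evaluate(picard_iterate(15), 1.0))  # close to e
```

The factorial decay of the coefficients mirrors the bound $\frac{ML^n|x-x_0|^{n+1}}{(n+1)!}$ used to show that the sequence converges uniformly.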

### Osgood’s condition ensures uniqueness

Osgood’s condition is much weaker than Lipschitz’s. Under Osgood’s condition it’s easy to check uniqueness (existence is a separate question), but the condition gives no way to actually construct the solution, while Lipschitz’s condition shows you the way through the Picard sequence.

The IVP $y'=f(x,y)$ has at most one solution through every point of $R$ if $f$ is continuous and satisfies Osgood’s condition, namely

$$
|f(x,y_1)-f(x,y_2)| \leq F(|y_1-y_2|)
$$

in $R$, where $F$ is a continuous function such that $F(r)>0$ for all $r>0$. Also, $F$ is required to satisfy

$$
\int_{0}^{r_1}\frac{dr}{F(r)}=\infty
$$

for some $r_1>0$.

Naturally, if we define $F(r)=Lr$, we have $\int_{0}^{r_1}\frac{1}{Lr}dr=\infty$.
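
To illustrate that Osgood’s condition is strictly weaker than Lipschitz’s, here is a standard example, not taken from the original post: take $F(r)=r\log\frac{1}{r}$ for small $r>0$, which is not bounded by $Lr$ near $0$ for any constant $L$. Substituting $u=\log\frac{1}{r}$ gives, for $0<r_1<1$,

$$
\int_{0}^{r_1}\frac{dr}{r\log\frac{1}{r}}=\int_{\log\frac{1}{r_1}}^{\infty}\frac{du}{u}=\infty,
$$

so this $F$ still satisfies the divergence requirement.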

#### Proof of Osgood’s

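We’ll prove this theorem indirectly. That is, if there exists some point $(x_0,y_0) \in R$ through which $y'=f(x,y)$ has at least two solutions, then $f$ does not satisfy Osgood’s condition. This is equivalent to the statement that if $f$ satisfies Osgood’s condition, then the IVP has no more than one solution.

Suppose now there exists a point $(x_0,y_0) \in R$ such that $y'=f(x,y)$ has two distinct solutions

$$
y=y_1(x), \quad y=y_2(x), \quad y_1(x_0)=y_2(x_0)=y_0,
$$

and $y_1(x_1) \neq y_2(x_1)$ at some $x_1 \neq x_0$. W.L.O.G. we suppose that $x_1 > x_0$ and $y_1(x_1)>y_2(x_1)$. Define

$$
x_s=\sup\{x \in [x_0,x_1]:y_1(x)=y_2(x)\}
$$

and $r(x)=y_1(x)-y_2(x)$ on $[x_s,x_1]$, so that $r(x_s)=0$ and $r(x)>0$ on $(x_s,x_1]$. The derivative of $r$ is interesting since we have

$$
r'(x)=f(x,y_1(x))-f(x,y_2(x)) \leq F(|y_1(x)-y_2(x)|)=F(r(x)).
$$

Then the desired improper integral converges since we have

$$
\int_{0}^{r(x_1)}\frac{dr}{F(r)}=\int_{x_s}^{x_1}\frac{r'(x)}{F(r(x))}dx \leq \int_{x_s}^{x_1}dx=x_1-x_s<\infty.
$$

Therefore $f$ does not satisfy Osgood’s condition.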

### Banach FPT is applied onto Picard’s

#### What is Banach FPT (Fixed Point Theorem)

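A map $T: X \to X$ defined on a complete metric space $(X,d)$ is a contraction if there exists some $k \in [0,1)$ such that

$$
d(T(x),T(y)) \leq kd(x,y) \quad \text{for all } x,y \in X.
$$

The Banach Fixed Point Theorem states that such a $T$ admits a unique fixed point, namely $T(x)=x$ for exactly one $x \in X$.

The proof is an application of Cauchy sequences (notice that $(X,d)$ is complete). Pick an arbitrary $x_0 \in X$. By defining $x_{n+1}=T(x_n)$, we have

$$
d(x_{n+1},x_n)=d(T(x_n),T(x_{n-1})) \leq kd(x_n,x_{n-1}),
$$

which finally goes to

$$
d(x_{n+1},x_n) \leq k^nd(x_1,x_0).
$$

$\{x_n\}$ is then Cauchy since, for $m>n$, we have

$$
d(x_m,x_n) \leq \sum_{j=n}^{m-1}d(x_{j+1},x_j) \leq \frac{k^n}{1-k}d(x_1,x_0).
$$

Since $(X,d)$ is complete, we see that $\{x_n\}$ converges to some $x \in X$. Also, $T$ is (uniformly) continuous, therefore

$$
T(x)=T\left(\lim_{n\to\infty}x_n\right)=\lim_{n\to\infty}T(x_n)=\lim_{n\to\infty}x_{n+1}=x.
$$

Uniqueness of the fixed point follows from the contraction property: if $T(x)=x$ and $T(y)=y$, then $d(x,y)=d(T(x),T(y)) \leq kd(x,y)$, which forces $d(x,y)=0$.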
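
The iteration in the proof is easy to try out. Below is a minimal sketch using $T=\cos$ on $[0,1]$, an example chosen here as an assumption: $\cos$ maps $[0,1]$ into itself and $|\cos'(x)|=|\sin x| \leq \sin 1<1$ there, so it is a contraction on a complete metric space. The function name `fixed_point` is hypothetical.

```python
import math


def fixed_point(T, x0, tol=1e-12, max_iter=10_000):
    """Iterate x_{n+1} = T(x_n) until successive iterates are within tol."""
    x = x0
    for _ in range(max_iter):
        nxt = T(x)
        if abs(nxt - x) < tol:
            return nxt
        x = nxt
    raise RuntimeError("no convergence within max_iter iterations")


if __name__ == "__main__":
    a = fixed_point(math.cos, 0.0)
    b = fixed_point(math.cos, 1.0)
    print(a, b)  # both starting points lead to the same fixed point of cos
```

Starting from either end of the interval gives the same limit, in line with the uniqueness part of the theorem.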

#### Proving Picard’s using Banach FPT

Fortunately, Picard’s existence and uniqueness theorem can be directly proved using Banach FPT. All we need is a proper translation.

##### Complete metric space

Let $\mathcal{C_B}(R)$ be the space of all bounded continuous functions on $R$. Then $\mathcal{C_B}(R)$ is complete with respect to the metric

$$
d(u,v)=\sup_{t \in R}|u(t)-v(t)|.
$$

##### Contraction map

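The map $T:\mathcal{C_B}(R)\to\mathcal{C_B}(R)$ given by

$$
(Ty)(x)=y_0+\int_{x_0}^{x}f(t,y(t))dt
$$

is the contraction we are looking for, if we take the uniform Lipschitz condition into consideration. Namely, if we have $\varepsilon<\frac{1}{L}$, then for $|x-x_0| \leq \varepsilon$ we can see that

$$
|(Tu)(x)-(Tv)(x)| \leq \left|\int_{x_0}^{x}|f(t,u(t))-f(t,v(t))|dt\right| \leq L\varepsilon d(u,v),
$$

so that $d(Tu,Tv) \leq L\varepsilon d(u,v)$ with $L\varepsilon<1$.

Therefore by Banach FPT, the map $T$ has a unique fixed point, which is equivalent to Picard’s existence and uniqueness theorem. Further, the solution can be obtained by taking

$$
y=\lim_{n\to\infty}T^n(y_0),
$$

where the constant function $y_0$ comes from the initial condition $y(x_0)=y_0$. Picard’s theorem follows from the FPT.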

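## The purpose of the matrix representation

1. Suppose we have already obtained the solutions of the equations. What properties do these solutions have, and how do they differ from the solutions of ordinary systems of equations over the real numbers?
2. Since differentiation is a linear operation, how much of the theory from linear algebra can be put to use?
3. If the coefficients of the system to be solved are all real, can we use the properties of real matrices to simplify the matrix (for example, by diagonalizing it) and thereby simplify solving the equations?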