Basic Facts of Semicontinuous Functions

Continuity

We restrict ourselves to $\mathbb{R}$ endowed with the usual topology. Recall that a function $f$ is continuous if and only if for any open set $U \subset \mathbb{R}$, we have

$$f^{-1}(U)=\{x \in \mathbb{R}: f(x) \in U\}$$

to be open. One can rewrite this statement using $\varepsilon$-$\delta$ language. To say a function $f: \mathbb{R} \to \mathbb{R}$ is continuous at $x$, we mean that for any $\varepsilon>0$, there exists some $\delta>0$ such that for $t \in (x-\delta,x+\delta)$, we have

$$|f(t)-f(x)|<\varepsilon.$$

$f$ is continuous on $\mathbb{R}$ if and only if $f$ is continuous at every point of $\mathbb{R}$.

If $(x-\delta,x+\delta)$ is replaced with $(x-\delta,x)$ or $(x,x+\delta)$, we get left continuity and right continuity, one of which plays an important role in probability theory.

But the problem is that continuity is sometimes too strong a restriction, while the ‘direction’ associated with left/right continuous functions is unnecessary as well. For example, the function

is neither left nor right continuous (globally), but it is a thing. Left/right continuity is not a perfectly weakened version of continuity. We need something different.

Definition of semicontinuous

Let $f$ be a real (or extended-real) function on $\mathbb{R}$. The semicontinuity of $f$ is defined as follows.

If

$$\{x \in \mathbb{R}: f(x)>\alpha\}$$

is open for all real $\alpha$, we say $f$ is lower semicontinuous.

If

$$\{x \in \mathbb{R}: f(x)<\alpha\}$$

is open for all real $\alpha$, we say $f$ is upper semicontinuous.

Is it possible to rewrite these definitions à la $\varepsilon$-$\delta$? The answer is yes if we restrict ourselves to metric spaces.

$f: \mathbb{R} \to \mathbb{R}$ is upper semicontinuous at $x$ if, for every $\varepsilon>0$, there exists some $\delta>0$ such that for $t \in (x-\delta,x+\delta)$, we have

$$f(t)<f(x)+\varepsilon.$$

$f: \mathbb{R} \to \mathbb{R}$ is lower semicontinuous at $x$ if, for every $\varepsilon>0$, there exists some $\delta>0$ such that for $t \in (x-\delta,x+\delta)$, we have

$$f(t)>f(x)-\varepsilon.$$

Of course, $f$ is upper/lower semicontinuous on $\mathbb{R}$ if and only if it is so at every point of $\mathbb{R}$. One finds no difference between the definitions in the two styles.
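The $\varepsilon$-$\delta$ conditions can be probed numerically. What follows is a minimal sketch, not a proof: it only samples a finite grid in $(x-\delta,x+\delta)$ and tests $f(t)<f(x)+\varepsilon$ (upper) or $f(t)>f(x)-\varepsilon$ (lower). The helper names `grid`, `usc_at`, `lsc_at` and `chi_closed` are hypothetical and chosen for this illustration only.

```python
def grid(x, delta, samples=1000):
    """Sample points strictly inside the open interval (x - delta, x + delta)."""
    return [x - delta + 2 * delta * (i + 0.5) / samples for i in range(samples)]


def usc_at(f, x, eps, delta):
    """Grid test of the upper condition f(t) < f(x) + eps near x."""
    return all(f(t) < f(x) + eps for t in grid(x, delta))


def lsc_at(f, x, eps, delta):
    """Grid test of the lower condition f(t) > f(x) - eps near x."""
    return all(f(t) > f(x) - eps for t in grid(x, delta))


def chi_closed(t):
    """Indicator of the closed interval [0, 1]."""
    return 1.0 if 0.0 <= t <= 1.0 else 0.0


# At the endpoint x = 0 the upper condition holds while the lower one fails,
# because points slightly to the left of 0 have value 0 < 1 - eps.
upper_ok = usc_at(chi_closed, 0.0, eps=0.5, delta=0.1)
lower_ok = lsc_at(chi_closed, 0.0, eps=0.5, delta=0.1)
```

A grid test can only refute a condition for a given $\delta$ or fail to refute it; it is meant to build intuition for why the indicator of a closed set behaves well from above but not from below.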

Relation with continuous functions

Here is another way to see it. For the continuity of $f$, we look at arbitrary open subsets $V$ of $\mathbb{R}$ and expect $f^{-1}(V)$ to be open. For the lower/upper semicontinuity of $f$, however, the open sets are restricted to those of the form $(\alpha,+\infty]$ and $[-\infty,\alpha)$ respectively. Since every open set of $\mathbb{R}$ can be generated from sets like $[-\infty,\alpha)$ and $(\beta,+\infty]$ by finite intersections and arbitrary unions, we immediately get

$f$ is continuous if and only if $f$ is both upper semicontinuous and lower semicontinuous.

Proof. If $f$ is continuous, then for any $\alpha \in \mathbb{R}$, we see $[-\infty,\alpha)$ is open, and therefore

$$f^{-1}([-\infty,\alpha))=\{x: f(x)<\alpha\}$$

has to be open. The upper semicontinuity is proved. The lower semicontinuity of $f$ is proved in the same manner.

If $f$ is both upper and lower semicontinuous, we see

$$f^{-1}((\alpha,\beta))=f^{-1}([-\infty,\beta)) \cap f^{-1}((\alpha,+\infty])$$

is open for all $\alpha<\beta$. Since every open subset of $\mathbb{R}$ can be written as a countable union of open intervals of this type, and preimages commute with unions, we see that for any open subset $V$ of $\mathbb{R}$, $f^{-1}(V)$ is open. (If you have trouble with this part, it is recommended to review the definition of topology.) $\square$

Examples

There are two important examples.

  1. If $E \subset \mathbb{R}$ is open, then $\chi_E$ is lower semicontinuous.
  2. If $F \subset \mathbb{R}$ is closed, then $\chi_F$ is upper semicontinuous.

We will prove the first one. The second one follows in the same manner of course. For $\alpha<0$, the set $A=\chi_E^{-1}((\alpha,+\infty])$ is equal to $\mathbb{R}$, which is open. For $\alpha \geq 1$, since $\chi_E \leq 1$, we see $A=\varnothing$. For $0 \leq \alpha < 1$ however, the set of $x$ where $\chi_E>\alpha$ has to be $E$, which is still open.

When checking the semicontinuity of a function, we check from bottom to top or top to bottom. The function $\chi_E$ is defined by

$$\chi_E(x)=\begin{cases}1 & x \in E, \\ 0 & x \notin E.\end{cases}$$

Addition of semicontinuous functions

If $f_1$ and $f_2$ are upper/lower semicontinuous, then so is $f_1+f_2$.

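Proof. We are going to prove this using different tools. Suppose first that both $f_1$ and $f_2$ are upper semicontinuous, and fix $x \in \mathbb{R}$. For $\varepsilon>0$, there exist some $\delta_1>0$ and $\delta_2>0$ such that

$$f_1(t)<f_1(x)+\frac{\varepsilon}{2} \quad (|t-x|<\delta_1), \qquad f_2(t)<f_2(x)+\frac{\varepsilon}{2} \quad (|t-x|<\delta_2).$$

If we pick $\delta=\min(\delta_1,\delta_2)$, then for all $t \in (x-\delta,x+\delta)$, we have

$$f_1(t)+f_2(t)<f_1(x)+f_2(x)+\varepsilon.$$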

The upper semicontinuity of $f_1+f_2$ is proved by considering all $x \in \mathbb{R}$.

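Now suppose both $f_1$ and $f_2$ are lower semicontinuous. We have the identity

$$\{x: f_1(x)+f_2(x)>\alpha\}=\bigcup_{\beta \in \mathbb{R}}\{x: f_1(x)>\beta\} \cap \{x: f_2(x)>\alpha-\beta\}.$$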

The set on the right side is always open. Hence $f_1+f_2$ is lower semicontinuous. $\square$


However, when there are infinitely many semicontinuous functions, things are different.

Let $\{f_n\}$ be a sequence of nonnegative functions on $\mathbb{R}$, then

  • If each $f_n$ is lower semicontinuous, then so is $\sum_{1}^{\infty}f_n$.
  • If each $f_n$ is upper semicontinuous, then $\sum_{1}^{\infty}f_n$ is not necessarily upper semicontinuous.

Proof. To prove this we still use the properties of open sets. Put $g_n=\sum_{k=1}^{n}f_k$. Suppose all $f_k$ are lower semicontinuous. Since $g_n$ is a finite sum of lower semicontinuous functions, each $g_n$ is lower semicontinuous. Let $f=\sum_{n}f_n$. As the $f_k$ are nonnegative, the $g_n$ increase to $f$, so $f(x)>\alpha$ if and only if there exists some $n_0$ such that $g_{n_0}(x)>\alpha$. Therefore

$$\{x: f(x)>\alpha\}=\bigcup_{n \geq 1}\{x: g_n(x)>\alpha\}.$$

The set on the right-hand side is a union of open sets, hence open.

For the upper semicontinuity, it suffices to give a counterexample, but before that, we shall give the motivation.

As said, the characteristic function of a closed set is upper semicontinuous. Suppose $\{E_n\}$ is a sequence of almost disjoint closed sets; then $E=\cup_{n\geq 1}E_n$ is not necessarily closed, and therefore $\chi_E=\sum\chi_{E_n}$ (a.e.) is not necessarily upper semicontinuous. Now we give a concrete example. Put $f_0=\chi_{[1,+\infty)}$ and $f_n=\chi_{E_n}$ for $n \geq 1$ where

$$E_n=\left[\frac{1}{n+1},\frac{1}{n}\right].$$

For $x > 0$, we have $f=\sum_nf_n \geq 1$, while $f(x)=0$ for $x \leq 0$. Hence $f^{-1}([-\infty,1))=(-\infty,0]$, which is not open. $\square$
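The failure of upper semicontinuity at $0$ can be seen on exact rational sample points. The sketch below assumes the closed sets are $E_n=[\frac{1}{n+1},\frac{1}{n}]$ (an assumption made for this illustration) and only evaluates finitely many terms; the names `f0`, `f_n` and `partial_sum` are hypothetical.

```python
from fractions import Fraction


def f0(x):
    """Indicator of [1, +infinity)."""
    return 1 if x >= 1 else 0


def f_n(n, x):
    """Indicator of the assumed closed set E_n = [1/(n+1), 1/n]."""
    return 1 if Fraction(1, n + 1) <= x <= Fraction(1, n) else 0


def partial_sum(x, terms):
    """f_0(x) + f_1(x) + ... + f_terms(x)."""
    return f0(x) + sum(f_n(n, x) for n in range(1, terms + 1))


# The sum vanishes at 0 ...
at_zero = partial_sum(Fraction(0), 200)
# ... but is at least 1 at the points 1/k, which accumulate at 0.
near_zero = [partial_sum(Fraction(1, k), 200) for k in range(1, 200)]
```

So every neighbourhood of $0$ contains points where the sum is at least $1$, although the value at $0$ itself is $0$; this is exactly why the set where the sum is less than $1$ fails to be open.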

Notice that $f$ can be defined on any topological space here.

Maximum and minimum

There is one fact we already know about continuous functions.

If $X$ is compact, $f: X \to \mathbb{R}$ is continuous, then there exists some $a,b \in X$ such that $f(a)=\min f(X)$, $f(b)=\max f(X)$.

In fact, $f(X)$ is compact still. But for semicontinuous functions, things will be different but reasonable. For upper semicontinuous functions, we have the following fact.

If $X$ is compact and $f: X \to (-\infty,+\infty)$ is upper semicontinuous, then there exists some $a \in X$ such that $f(a)=\max f(X)$.

Notice that $X$ is not assumed to hold any other topological property. It can be Hausdorff or Lindelöf, but we are not asking for restrictions like this. The only property we will be using is that every open cover of $X$ has a finite subcover. Of course, one can replace $X$ with any compact subset of $\mathbb{R}$, for example, $[a,b]$.

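Proof. Put $\alpha=\sup f(X)$. Note that $\alpha<\infty$: the open sets $\{x: f(x)<n\}$ cover $X$, so finitely many of them do. Define

$$E_n=\left\{x \in X: f(x)<\alpha-\frac{1}{n}\right\}.$$

Each $E_n$ is open since $f$ is upper semicontinuous, and $E_n \subset E_{n+1}$. If $f$ attains no maximum, then for any $x \in X$, there exists some $n \geq 1$ such that $f(x)<\alpha-\frac{1}{n}$. That is, $x \in E_n$ for some $n$. Therefore $\bigcup_{n \geq 1}E_n$ covers $X$. But this cover has no finite subcover: otherwise $X=E_N$ for some $N$, so $f<\alpha-\frac{1}{N}$ on $X$, contradicting $\alpha=\sup f(X)$. This is a contradiction since $X$ is compact. $\square$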

Approximating integrable functions

This is a comprehensive application of several properties of semicontinuity.

(Vitali–Carathéodory theorem) Suppose $f \in L^1(\mathbb{R})$, where $f$ is a real-valued function. For $\varepsilon>0$, there exist some functions $u$ and $v$ on $\mathbb{R}$ such that $u \leq f \leq v$, $u$ is an upper semicontinuous function bounded above, $v$ is a lower semicontinuous function bounded below, and

$$\int_{\mathbb{R}}(v-u)\,dm<\varepsilon.$$

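It suffices to prove this theorem for $f \geq 0$ (of course $f$ is not identically equal to $0$ since this case is trivial). Since $f$ is the pointwise limit of an increasing sequence of simple functions $s_n$, we can write $f$ as

$$f(x)=\lim_{n \to \infty}s_n(x).$$

By putting $t_1=s_1$, $t_n=s_n-s_{n-1}$ for $n \geq 2$, we get $f=\sum_n t_n$. Since each $t_n$ is a nonnegative simple function, we can write $f$ as

$$f(x)=\sum_{k=1}^{\infty}c_k\chi_{E_k}(x), \quad c_k>0,$$

where $E_k$ is measurable for all $k$. Also, we have

$$\int_{\mathbb{R}}f\,dm=\sum_{k=1}^{\infty}c_km(E_k),$$

and the series on the right-hand side converges (since $f \in L^1$). By the properties of Lebesgue measure, there exist a compact set $F_k$ and an open set $V_k$ such that $F_k \subset E_k \subset V_k$ and $c_km(V_k-F_k)<\frac{\varepsilon}{2^{k+1}}$. Put

$$v=\sum_{k=1}^{\infty}c_k\chi_{V_k}, \quad u=\sum_{k=1}^{N}c_k\chi_{F_k}$$

(now you can see $v$ is lower semicontinuous and $u$ is upper semicontinuous). The $N$ is chosen in such a way that

$$\sum_{k=N+1}^{\infty}c_km(E_k)<\frac{\varepsilon}{2}.$$

Since $V_k \supset E_k$, we have $\chi_{V_k} \geq \chi_{E_k}$. Therefore $v \geq f$. Similarly, $f \geq u$. Now we need to check the desired integral inequality. A simple recombination shows that

$$v-u=\sum_{k=1}^{N}c_k(\chi_{V_k}-\chi_{F_k})+\sum_{k=N+1}^{\infty}c_k\chi_{V_k} \leq \sum_{k=1}^{\infty}c_k(\chi_{V_k}-\chi_{F_k})+\sum_{k=N+1}^{\infty}c_k\chi_{E_k}.$$

If we integrate the function above, we get

$$\int_{\mathbb{R}}(v-u)\,dm \leq \sum_{k=1}^{\infty}c_km(V_k-F_k)+\sum_{k=N+1}^{\infty}c_km(E_k)<\frac{\varepsilon}{2}+\frac{\varepsilon}{2}=\varepsilon.$$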

This proves the case when $f \geq 0$. In the general case, we write $f=f^{+}-f^{-}$. Attach the semicontinuous functions to $f^{+}$ and $f^{-}$ respectively by $u_1 \leq f^{+} \leq v_1$ and $u_2 \leq f^{-} \leq v_2$, each pair chosen for $\frac{\varepsilon}{2}$ in place of $\varepsilon$. Put $u=u_1-v_2$, $v=v_1-u_2$. As we can see, $u$ is upper semicontinuous and $v$ is lower semicontinuous. Also, $u \leq f \leq v$ with the desired property since

$$\int_{\mathbb{R}}(v-u)\,dm=\int_{\mathbb{R}}(v_1-u_1)\,dm+\int_{\mathbb{R}}(v_2-u_2)\,dm<\varepsilon,$$

and the theorem follows. $\square$

Generalisation

Indeed, the only property of the measure used is the existence of $F_k$ and $V_k$. The domain $\mathbb{R}$ here can be replaced with $\mathbb{R}^k$ for $1 \leq k < \infty$, and $m$ with the respective $m_k$. Much more generally, the domain can be replaced by any locally compact Hausdorff space $X$ and the measure by any measure associated with the Riesz-Markov-Kakutani representation theorem on $C_c(X)$.

Is the reverse approximation always possible?

The answer is no. Consider the fat Cantor set $K$, which has Lebesgue measure $\frac{1}{2}$. We shall show that $\chi_K$ can not be approximated below by a lower semicontinuous function.

If $v$ is a lower semicontinuous function such that $v \leq \chi_K$, then $v \leq 0$.

Proof. Consider the set $V=v^{-1}((0,1])=v^{-1}((0,+\infty))$. Since $v \leq \chi_K$, we have $V \subset K$. We will show that $V$ has to be empty.

Pick $t \in V$. Since $V$ is open (by the lower semicontinuity of $v$), there exists some open neighbourhood $U$ of $t$ such that $U \subset V \subset K$. But $K$ has an empty interior, so it contains no nonempty open set, which contradicts $t \in U$. Therefore $V = \varnothing$. That is, $v(x) \leq 0$ for all $x$. $\square$

Suppose $u$ is an upper semicontinuous function such that $u \geq \chi_K$. For $\varepsilon=\frac{1}{2}$, we have

$$\int_{\mathbb{R}}(u-v)\,dm \geq \int_{\mathbb{R}}\chi_K\,dm=\frac{1}{2}=\varepsilon.$$

This example shows that there exist integrable functions that cannot be approximated in the reverse sense of the Vitali–Carathéodory theorem.

An Introduction to Quotient Space

I’m assuming the reader has some abstract algebra and functional analysis background. You may have learned this already in your linear algebra class, but we are making our way to functional analysis problems.

Motivation

The trouble with $L^p$ spaces

Fix $p$ with $1 \leq p \leq \infty$. It’s easy to see that $L^p(\mu)$ is a topological vector space. But it is not a metric space if we define

$$d(f,g)=\lVert f-g \rVert_p.$$

The reason is that $d(f,g)=0$ only gives $f=g$ a.e.; the two functions need not be strictly equal. In other words, $d$ is only a pseudometric, which is unnatural. However, the relation $\sim$ defined by $f \sim g \iff d(f,g)=0$ is an equivalence relation. This inspires us to take the quotient set into consideration.

Vector spaces are groups anyway

For a vector space $V$, every subspace of $V$ is a normal subgroup of the additive group of $V$, since that group is abelian. There is no reason to prevent ourselves from considering the quotient group and looking for some interesting properties.

Definition

Let $N$ be a subspace of a vector space $X$. For every $x \in X$, let $\pi(x)$ be the coset of $N$ that contains $x$, that is

$$\pi(x)=x+N=\{x+n: n \in N\}.$$

Trivially, $\pi(x)=\pi(y)$ if and only if $x-y \in N$ (the cosets are well-defined since $N$ is a subspace). The map $\pi$ is linear since we define the addition and scalar multiplication of cosets by

$$\pi(x)+\pi(y)=\pi(x+y), \quad \alpha\pi(x)=\pi(\alpha x).$$

These cosets are the elements of a vector space $X/N$, which reads, the quotient space of $X$ modulo $N$. The map $\pi$ is called the canonical map as we all know.

Examples

R^2-quotient

First, we shall treat $\mathbb{R}^2$ as a vector space, and the subspace $\mathbb{R}$, which is graphically represented by $x$-axis, as a subspace (we will write it as $X$). For a vector $v=(2,3)$, which is represented by $AB$, we see the coset $v+X$ has something special. Pick any $u \in X$, for example, $AE$, $AC$, or $AG$. We see $v+u$ has the same $y$ value. The reason is simple since we have $v+u=(2+x,3)$, where the $y$ value remains fixed however $u$ may vary.

With that being said, the set $v+X$, which is not a vector space, can be represented by $\overrightarrow{AD}$. This procedure can be generalized to $\mathbb{R}^n$ with $\mathbb{R}^m$ as a subspace with ease.
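The observation that every element of $v+X$ shares the same $y$ value can be checked directly. This is a minimal sketch with hypothetical helper names (`add`, `coset_label`), where a coset of the $x$-axis is labelled by that shared $y$ value.

```python
def add(u, v):
    """Componentwise addition in R^2."""
    return (u[0] + v[0], u[1] + v[1])


def coset_label(v):
    """Label the coset v + X (X = the x-axis) by the shared y value."""
    return v[1]


v = (2, 3)
axis_vectors = [(-1, 0), (4, 0), (7.5, 0)]  # sample elements of X

# Every v + u with u on the x-axis lands in the same coset.
labels = {coset_label(add(v, u)) for u in axis_vectors}
```

With this labelling, adding two cosets amounts to adding their $y$ values, which is one way to see that the quotient of $\mathbb{R}^2$ by the $x$-axis behaves like $\mathbb{R}$.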


We now consider a fancy example. Consider all rational Cauchy sequences, that is

$$(a_n)=(a_1,a_2,\cdots)$$

where $a_k\in\mathbb{Q}$ for all $k$. In analysis class, we learned two facts.

  1. Any Cauchy sequence is bounded.
  2. If $(a_n)$ converges, then $(a_n)$ is Cauchy.

However, the converse of 2 does not hold in $\mathbb{Q}$. For example, if we put $a_k=(1+\frac{1}{k})^k$, the limit should be $e$, but $e \notin \mathbb{Q}$.
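To see that every term of this sequence really is rational while the terms still cluster together, one can compute with exact fractions. A minimal sketch using Python's standard `fractions` module; the function name `a` mirrors $a_k$ and the chosen indices are arbitrary.

```python
from fractions import Fraction


def a(k):
    """The k-th term (1 + 1/k)^k as an exact rational number."""
    return (1 + Fraction(1, k)) ** k


terms = [a(k) for k in (1, 2, 3, 100, 200)]

# The distance between two far-out terms is already small.
gap = abs(a(200) - a(100))
```

Every entry of `terms` is a `Fraction`, so the sequence never leaves $\mathbb{Q}$, yet the gap between late terms keeps shrinking, which is the Cauchy behaviour the text refers to.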

If we define the addition and scalar multiplication term by term, namely

$$(a_n)+(b_n)=(a_n+b_n)$$

and

$$\alpha(a_n)=(\alpha a_n)$$

where $\alpha \in \mathbb{Q}$, we get a vector space (the verification is easy). The zero vector is defined by

$$(0)=(0,0,\cdots).$$

This vector space is denoted by $\overline{\mathbb{Q}}$. The subspace containing all sequences that converge to $0$ will be denoted by $\overline{\mathbb{O}}$. Again, $(a_n)+\overline{\mathbb{O}}=(b_n)+\overline{\mathbb{O}}$ if and only if $(a_n-b_n) \in \overline{\mathbb{O}}$. Using the language of equivalence relations, we also say $(a_n)$ and $(b_n)$ are equivalent if $(a_n-b_n) \in \overline{\mathbb{O}}$. For example, the two following sequences are equivalent:

Actually, we will get $\mathbb{R} \simeq \overline{\mathbb{Q}}/\overline{\mathbb{O}}$ in the end. But to make sure that this quotient space is exactly the one we meet in our analysis class, many verifications remain to be done.

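We shall give more definitions for calculation. The multiplication of two Cauchy sequences is defined term by term à la the addition. For $\overline{\mathbb{Q}}/\overline{\mathbb{O}}$ we have

$$((a_n)+\overline{\mathbb{O}})+((b_n)+\overline{\mathbb{O}})=(a_n+b_n)+\overline{\mathbb{O}}$$

and

$$((a_n)+\overline{\mathbb{O}})((b_n)+\overline{\mathbb{O}})=(a_nb_n)+\overline{\mathbb{O}}.$$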

As for inequality, an order has to be defined. We say $(a_n) > (0)$ if there exist a rational $\varepsilon>0$ and some $N>0$ such that $a_n>\varepsilon$ for all $n \geq N$ (merely asking for $a_n>0$ is not enough, since $(\frac{1}{n})$ and $(-\frac{1}{n})$ lie in the same coset $\overline{\mathbb{O}}$). By $(a_n) > (b_n)$ we mean $(a_n-b_n)>(0)$ of course. For cosets, we say $(a_n)+\overline{\mathbb{O}}>\overline{\mathbb{O}}$ if $(x_n) > (0)$ for some $(x_n) \in (a_n)+\overline{\mathbb{O}}$. This is well defined. That is, if $(x_n)>(0)$, then $(y_n)>(0)$ for all $(y_n) \in (a_n)+\overline{\mathbb{O}}$.

With these operations being defined, it can be verified that $\overline{\mathbb{Q}}/\overline{\mathbb{O}}$ has the desired properties, for example, the least-upper-bound property. But this goes too far from the topic, we are not proving it here. If you are interested, you may visit here for more details.


Finally, we are trying to make $L^p$ a Banach space. Fix $p$ with $1 \leq p < \infty$. There is a seminorm defined for all Lebesgue measurable functions on $[0,1]$ by

$$p(f)=\lVert f \rVert_p=\left\{\int_0^1|f(t)|^p\,dt\right\}^{1/p}.$$

$L^p$ is a vector space containing all functions $f$ with $p(f)<\infty$. But it’s not a normed space by $p$, since $p(f)=0$ only implies $f=0$ almost everywhere. However, the set $N$ which contains all functions that equal $0$ almost everywhere is also a vector space. Now consider the quotient space with

$$\tilde{p}(\pi(f))=p(f),$$

where $\pi$ is the canonical map of $L^p$ into $L^p/N$. We shall prove that $\tilde{p}$ is well-defined here. If $\pi(f)=\pi(g)$, we have $f-g \in N$, therefore

$$|p(f)-p(g)| \leq p(f-g)=0,$$

which forces $p(f)=p(g)$. Therefore in this case we also have $\tilde{p}(\pi(f))=\tilde{p}(\pi(g))$. Moreover, $\tilde{p}(\pi(f))=0$ only if $f \in N$, that is, $\pi(f)=0$, so $\tilde{p}$ is a norm; in fact $L^p/N$ is a Banach space. Some topological facts are required to prove this, and we are going to cover a few of them.

Topology of quotient space

Definition

We know if $X$ is a topological vector space with a topology $\tau$, then the addition and scalar multiplication are continuous. Suppose now $N$ is a closed subspace of $X$. Define $\tau_N$ by

$$\tau_N=\{E \subset X/N: \pi^{-1}(E) \in \tau\}.$$

We expect $\tau_N$ to be a well-defined topology, and fortunately, it is. Some interesting techniques will be used in the following section.

$\tau_N$ is a vector topology

There will be two steps to get this done.

$\tau_N$ is a topology.

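It is trivial that $\varnothing$ and $X/N$ are elements of $\tau_N$. Other properties are immediate as well since we have

$$\pi^{-1}(A \cap B)=\pi^{-1}(A) \cap \pi^{-1}(B)$$

and

$$\pi^{-1}\left(\bigcup_\alpha A_\alpha\right)=\bigcup_\alpha \pi^{-1}(A_\alpha).$$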

That said, if we have $A,B\in \tau_N$, then $A \cap B \in \tau_N$ since $\pi^{-1}(A \cap B)=\pi^{-1}(A) \cap \pi^{-1}(B) \in \tau$.

Similarly, if $A_\alpha \in \tau_N$ for all $\alpha$, we have $\cup A_\alpha \in \tau_N$. Also, by definition of $\tau_N$, $\pi$ is continuous.

$\tau_N$ is a vector topology.

First, we show that a point in $X/N$, which can be written as $\pi(x)$, is closed. Notice that $N$ is assumed to be closed, and

$$\pi^{-1}(\pi(x))=x+N$$

therefore has to be closed.

In fact, $F \subset X/N$ is $\tau_N$-closed if and only if $\pi^{-1}(F)$ is $\tau$-closed. To prove this, one needs to notice that $\pi^{-1}(F^c)=(\pi^{-1}(F))^{c}$.

Suppose $V$ is open in $X$, then

$$\pi^{-1}(\pi(V))=N+V=\bigcup_{n \in N}(n+V)$$

is open. By definition of $\tau_N$, we have $\pi(V) \in \tau_N$. Therefore $\pi$ is an open mapping.

If now $W$ is a neighbourhood of $0$ in $X/N$, there exists a neighbourhood $V$ of $0$ in $X$ such that

$$V+V \subset \pi^{-1}(W).$$

Hence $\pi(V)+\pi(V) \subset W$. Since $\pi$ is open, $\pi(V)$ is a neighbourhood of $0$ in $X/N$. This shows that the addition is continuous.

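The continuity of scalar multiplication will be shown in a direct way (so can the addition, but the proof above is intended to offer some special technique). We already know that the scalar multiplication on $X$ by

$$\varphi: \Phi \times X \to X, \quad (\alpha,x) \mapsto \alpha x$$

is continuous, where $\Phi$ is the scalar field (usually $\mathbb{R}$ or $\mathbb{C}$). Now the scalar multiplication on $X/N$ is by

$$\psi: \Phi \times X/N \to X/N, \quad (\alpha,x+N) \mapsto \alpha x+N.$$

We see $\psi(\alpha,x+N)=\pi(\varphi(\alpha,x))$, that is, $\psi \circ (\mathrm{id} \times \pi)=\pi \circ \varphi$, and the right-hand side is continuous as a composition of two continuous functions. Since $\mathrm{id} \times \pi$ is open and surjective, for any open $W \subset X/N$ the set $\psi^{-1}(W)=(\mathrm{id} \times \pi)\left((\pi \circ \varphi)^{-1}(W)\right)$ is open; therefore $\psi$ is continuous.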

A commutative diagram by quotient space

We are going to talk about a classic commutative diagram that you already see in algebra class.

[Commutative diagram: $\Lambda: X \to Y$ factors as $f \circ \pi$, with $\pi: X \to X/N$ and $f: X/N \to Y$.]

There are some assumptions.

  1. $X$ and $Y$ are topological vector spaces.
  2. $\Lambda$ is linear.
  3. $\pi$ is the canonical map.
  4. $N$ is a closed subspace of $X$ and $N \subset \ker\Lambda$.

Algebraically, there exists a unique map $f: X/N \to Y$ by $x+N \mapsto \Lambda(x)$. Namely, the diagram above is commutative. But now we are interested in some analysis facts.

$f$ is linear.

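This is obvious. Since $\pi$ is surjective, for $u,v \in X/N$, we are able to find some $x,y \in X$ such that $\pi(x)=u$ and $\pi(y)=v$. Therefore we have

$$f(u+v)=f(\pi(x)+\pi(y))=f(\pi(x+y))=\Lambda(x+y)=\Lambda(x)+\Lambda(y)=f(u)+f(v)$$

and

$$f(\alpha u)=f(\alpha\pi(x))=f(\pi(\alpha x))=\Lambda(\alpha x)=\alpha\Lambda(x)=\alpha f(u).$$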

$\Lambda$ is open if and only if $f$ is open.

If $f$ is open, then for any open set $U \subset X$, we have

$$\Lambda(U)=f(\pi(U))$$

to be an open set since $\pi$ is open, and therefore $\pi(U)$ is an open set.

If $f$ is not open, then there exists some open set $V \subset X/N$ such that $f(V)$ is not open. However, since $\pi$ is continuous, $\pi^{-1}(V)$ is open. In this case, since $\pi$ is surjective, we have

$$\Lambda(\pi^{-1}(V))=f(\pi(\pi^{-1}(V)))=f(V)$$

which is not open. $\Lambda$ is therefore not open. This shows that if $\Lambda$ is open, then $f$ is open.

$\Lambda$ is continuous if and only if $f$ is continuous.

If $f$ is continuous, for any open set $W \subset Y$, we have $\pi^{-1}(f^{-1}(W))=\Lambda^{-1}(W)$ to be open. Therefore $\Lambda$ is continuous.

Conversely, if $\Lambda$ is continuous, for any open set $W \subset Y$, we have $\Lambda^{-1}(W)$ to be open. Therefore $f^{-1}(W)=\pi(\Lambda^{-1}(W))$ has to be open since $\pi$ is open.
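A toy finite-dimensional instance may help to see the factorisation $\Lambda=f \circ \pi$ at work. The sketch below assumes $X=\mathbb{R}^3$, $N$ the span of $(1,0,0)$, $Y=\mathbb{R}$ and $\Lambda(x,y,z)=y+z$; all of these choices and the names `Lam`, `pi`, `f` are hypothetical, picked only so that $N \subset \ker\Lambda$.

```python
def Lam(v):
    """Hypothetical linear map R^3 -> R whose kernel contains N = span{(1,0,0)}."""
    return v[1] + v[2]


def pi(v):
    """Canonical map: the coset v + N is labelled by the coordinates (y, z)."""
    return (v[1], v[2])


def f(coset):
    """Induced map on X/N, x + N -> Lam(x), written in terms of the coset label."""
    return coset[0] + coset[1]


samples = [(0, 0, 0), (1, 2, 3), (-4, 2, 3), (5, -1, 7)]

# The diagram commutes on the samples: f(pi(v)) equals Lam(v).
commutes = all(f(pi(v)) == Lam(v) for v in samples)
```

The vectors `(1, 2, 3)` and `(-4, 2, 3)` differ by an element of $N$, so they share a coset and `f` assigns them the same value; this is the well-definedness that the condition $N \subset \ker\Lambda$ guarantees.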

On "The Third Resignation"

Credit: https://www.deviantart.com/insaneattraction/art/La-tercera-resignacion-57045720

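Already twenty-five, no longer growing taller, and my family has decided to bury me. There is really nothing left to resist. Listen a little longer to the half-coherent eulogy, to the jumble of sobbing voices and murmured responses, endure the cold of the coffin a little longer, and it will be over. Whether this is a delusion I cannot tell, but I suspect I am merely a supporting character in this death.

Whether this is a nightmare does not matter either; whether the smell of cadaverine comes from my own body is not important, and there is no need to doubt it. This is nothing more than my third resignation, one I accept willingly. Why struggle? I am no longer the sickly child of eighteen years ago who could still move about. I am plainly a dead man trying to keep his dignity. Perhaps a few days ago I could still smell the fragrance of violets.

Ashes to ashes, dust to dust. I may feel some sorrow; my spirit suffered a severe blow long ago, and perhaps this smell of a corpse is one reason. Or perhaps it is the fact of the burial itself. Better to be buried like this after all; the smell is real enough. I have no strength left, and my nervous system, though I do not know whether it still exists, no longer obeys me. I have no strength to struggle.

I should not feel smug, though it really is nothing much: a second dream about death. It could just as well be a genuine death. I am still the protagonist. For eighteen years, or perhaps only eighteen days, my coffin has grown together with my body, until time and the stench of the corpse inside the coffin blended into a translucent, soft jelly in which my shell was fixed as well. Perhaps there will come a day when it melts.

"Eyes of a Blue Dog: The Third Resignation"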

La tercera resignación


The Big Three Pt. 3 - The Open Mapping Theorem (Banach Space)

What is open mapping

An open map is a function between two topological spaces that maps open sets to open sets. Precisely speaking, a function $f: X \to Y$ is open if for any open set $U \subset X$, $f(U)$ is open in $Y$. Likewise, a closed map is a function mapping closed sets to closed sets.

You may think open/closed map is an alternative name of continuous function. But it’s not. The definition of open/closed mapping is totally different from continuity. Here are some simple examples.

  1. $f(x)=\sin{x}$ defined on $\mathbb{R}$ is not open, though it’s continuous. It can be verified by considering $(0,2\pi)$, since we have $f((0,2\pi))=[-1,1]$.
  2. The projection $\pi: \mathbb{R}^2 \to \mathbb{R}$ defined by $(x,y) \mapsto x$ is open. Indeed, it maps an open ball onto an open interval on $x$ axis.
  3. The inclusion map $\varphi: \mathbb{R} \to \mathbb{R}^2$ by $x \mapsto (x,0)$ however, is not open. An open interval on the plane is locally closed but not open or closed.

Under what condition will a continuous linear function between two TVS be an open mapping? We’ll give the answer in this blog post. The open mapping theorem gives a sufficient condition for a continuous linear function to be open.

Open Mapping Theorem

Let $X,Y$ be Banach spaces and $T: X \to Y$ a surjective bounded linear map. Then $T$ is an open mapping.

The open balls in $X$ and $Y$ are defined respectively by

$$B_r^X=\{x \in X: \lVert x \rVert<r\}, \quad B_r^Y=\{y \in Y: \lVert y \rVert<r\}.$$

All we need to do is show that there exists some $r>0$ such that

$$B_r^Y \subset T(B_1^X),$$

since every open set in $X$ or $Y$ can be expressed as a union of open balls. A ball in $X$ centered at $x \in X$ with radius $r$ can be expressed as $x+B_r^X$. After that, it becomes obvious that $T$ maps open sets to open sets.

First we have

$$X=\bigcup_{n=1}^{\infty}B_n^X.$$

The surjectivity of $T$ ensures that

$$Y=T(X)=\bigcup_{n=1}^{\infty}T(B_n^X).$$

Since $Y$ is Banach, or simply a complete metric space, by Baire category theorem, there must be some $n_0 \in \mathbb{N}$ such that $\overline{T(B_{n_0}^{X})}$ has nonempty interior. If not, which means $T(B_n^{X})$ is nowhere dense for all $n \in \mathbb{N}$, we have $Y$ is of the first category. A contradiction.

Since $y \mapsto ny$ is a homeomorphism of $Y$ onto $Y$ and $T(B_n^X)=nT(B_1^X)$, we see in fact that $T(B_n^X)$ is not nowhere dense for any $n \in \mathbb{N}$. Therefore, there exist some $y_0 \in \overline{T(B_1^{X})}$ and some $\varepsilon>0$ such that

$$y_0+B_\varepsilon^Y \subset \overline{T(B_1^X)}.$$

The open set on the left-hand side is a neighborhood of $y_0$, which therefore lies in the interior of $\overline{T(B_1^X)}$.

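On the other hand, we claim

$$\overline{T(B_1^X)}-y_0 \subset \overline{T(B_2^X)}.$$

We shall prove it as follows. Pick any $y \in \overline{T(B_1^X)}$; we shall show that $y-y_0 \in \overline{T(B_2^X)}$. For $y_0$, there exists a sequence $(y_n)$ in $X$ with $\lVert y_n \rVert <1$ for all $n$ such that $Ty_n \to y_0$. Also we are able to find a sequence $(x_n)$ in $X$ with $\lVert x_n \rVert <1$ for all $n$ such that $Tx_n \to y$. Notice that we also have

$$x_n-y_n \in B_2^X$$

since

$$\lVert x_n-y_n \rVert \leq \lVert x_n \rVert+\lVert y_n \rVert<2.$$

We see $T(x_n-y_n) \in T(B_2^X)$ for all $n$, and it follows that

$$y-y_0=\lim_{n \to \infty}T(x_n-y_n) \in \overline{T(B_2^X)}.$$

Combining all these relations, we get

$$B_\varepsilon^Y \subset \overline{T(B_1^X)}-y_0 \subset \overline{T(B_2^X)}.$$

Since $T$ is linear, we see

$$B_{\varepsilon/2}^Y \subset \overline{T(B_1^X)}.$$

By induction we get

$$B_{\varepsilon/2^n}^Y \subset \overline{T(B_{1/2^{n-1}}^X)}$$

for all $n \geq 1$.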

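We shall show however

$$B_{\varepsilon/4}^Y \subset T(B_1^X).$$

For any $u \in B_{\varepsilon/4}^Y$, we have $u \in \overline{T(B_{1/2}^X)}$. There exists some $x_1 \in B_{1/2}^{X}$ such that

$$\lVert u-Tx_1 \rVert<\frac{\varepsilon}{8}.$$

This implies that $u-Tx_1 \in B_{\varepsilon/8}^Y$. In the same fashion, we are able to pick $x_n$ in such a way that

$$\left\lVert u-\sum_{k=1}^{n}Tx_k \right\rVert<\frac{\varepsilon}{2^{n+2}}$$

where $\lVert x_n \rVert<2^{-n}$. Now let $z_n=\sum_{k=1}^{n}x_k$; we shall show that $(z_n)$ is Cauchy. For $m<n$, we have

$$\lVert z_n-z_m \rVert \leq \sum_{k=m+1}^{n}\lVert x_k \rVert<\sum_{k=m+1}^{n}2^{-k}<2^{-m}.$$

Since $X$ is Banach, there exists some $z \in X$ such that $z_n \to z$. Further we have

$$\lVert z \rVert=\lim_{n \to \infty}\lVert z_n \rVert \leq \sum_{k=1}^{\infty}\lVert x_k \rVert<\sum_{k=1}^{\infty}2^{-k}=1,$$

therefore $z \in B_1^X$. Since $T$ is bounded, therefore continuous, we get $T(z)=\lim_{n \to \infty}T(z_n)=u$. To summarize, for $u \in B_{\varepsilon/4}^Y$, we have some $z \in B_{1}^X$ such that $T(z)=u$, which implies $T(B_1^X) \supset B_{\varepsilon/4}^Y$.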

Let $U \subset X$ be open; we want to show that $T(U)$ is also open. Take $y \in T(U)$, then $y=T(x)$ with $x \in U$. Since $U$ is open, there exists some $\varepsilon>0$ such that $B_{\varepsilon}^{X}+x \subset U$. By the linearity of $T$, we obtain $B_{r\varepsilon}^Y \subset T(B_{\varepsilon}^X)$ for some small $r$. Using the linearity of $T$ again, we obtain

$$y+B_{r\varepsilon}^Y \subset T(x)+T(B_\varepsilon^X)=T(x+B_\varepsilon^X) \subset T(U),$$

which shows that $T(U)$ is open, therefore $T$ is an open mapping.

Remarks

One has to notice that the completeness of $X$ and $Y$ has been used more than once. For example, the existence of $z$ depends on the fact that Cauchy sequences converge in $X$. Also, the surjectivity of $T$ cannot be omitted; can you see why?

There are some different ways to state this theorem.

  • There exists some $\delta>0$ such that to every $y$ with $\lVert y \rVert < \delta$, there corresponds an $x$ with $\lVert x \rVert<1$ such that $T(x)=y$.
  • Let $U$ and $V$ be the open unit balls of the Banach spaces $X$ and $Y$. To every surjective bounded linear map $T$, there corresponds a $\delta>0$ such that

$$T(U) \supset \delta V.$$

You may also realize that we have used a lot of basic definitions of topology. For example, we checked the openness of $T(U)$ by using neighborhood. The set $\overline{T(B_1^X)}$ should also remind you of limit point.

The difference between open mappings and continuous mappings can be viewed via the topologies of the two spaces. Suppose $f: X \to Y$, where $\tau_X$ and $\tau_Y$ are the topologies of $X$ and $Y$, respectively. The map $f$ is open if for any $U \in \tau_X$, we have $f(U) \in \tau_Y$. But this has nothing to do with continuity. By continuity we mean, for any $V \in \tau_Y$, we have $f^{-1}(V) \in \tau_X$.

Fortunately, this theorem can be generalized to $F$-spaces, which will be demonstrated in the following blog post of the series. A space $X$ is an $F$-space if its topology $\tau$ is induced by a complete invariant metric $d$. Still, completeness plays a critical role.

The series

Since there is no strong reason to write more posts on this topic, i.e. the three fundamental theorems of linear functional analysis, I think it’s time to make a list of the series. It’s been around half a year.

Cauchy sequence in group theory

Recall - Cauchy sequence in analysis

Before we go into group theory, let’s recall how Cauchy sequence is defined in analysis.

Real/complex number

A sequence $(x_n)_{n=1}^{\infty}$ of real/complex numbers is called a Cauchy sequence if, for every $\varepsilon>0$, there is a positive integer $N$ such that for all $m,n>N$, we have

$$|x_m-x_n|<\varepsilon.$$

That said, the terms of the sequence eventually become arbitrarily close to each other. Notice that only distance is involved, so the definition of Cauchy sequence in a metric space comes up naturally.

Metric space

Given a metric space $(X,d)$, a sequence $(x_n)_{n=1}^{\infty}$ is Cauchy if for every real number $\varepsilon>0$, there is a positive integer $N$ such that, for all $m,n>N$, the distance satisfies

$$d(x_m,x_n)<\varepsilon.$$

Topological vector space

By considering the topology induced by metric, we see that $x_n$ lies in a neighborhood of $x_m$ with radius $\varepsilon$. But a topology can be constructed by neighborhood, hence the Cauchy sequence for topological vector space follows.

For a topological vector space $X$, pick a local base $\mathcal{B}$, then $(x_n)_{n=1}^{\infty}$ is a Cauchy sequence if for each member $U \in \mathcal{B}$, there exists some number $N$ such that for $m,n>N$, we have

$$x_n-x_m \in U.$$

But in a general topological space, this does not work. Consider the two topological spaces

$$X=(0,1), \quad Y=(1,+\infty)$$

with the usual topology. We have $X \simeq Y$ since we have the map

$$h: X \to Y, \quad x \mapsto \frac{1}{x}$$

as a homeomorphism. Consider the Cauchy sequence $(\frac{1}{n+1})_{n=1}^{\infty}$ in $X$; we see $(h(\frac{1}{n+1}))_{n=1}^{\infty}=(n+1)_{n=1}^{\infty}$, which is not Cauchy. This counterexample shows that being a Cauchy sequence is not preserved by homeomorphisms.
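The counterexample is easy to tabulate. The sketch below assumes the homeomorphism is $h(x)=\frac{1}{x}$, which is consistent with $h(\frac{1}{n+1})=n+1$, and compares consecutive gaps before and after applying it; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction


def h(x):
    """Assumed homeomorphism x -> 1/x."""
    return 1 / x


xs = [Fraction(1, n + 1) for n in range(1, 51)]  # 1/2, 1/3, ..., 1/51
ys = [h(x) for x in xs]                          # 2, 3, ..., 51

# Consecutive gaps shrink before applying h and stay equal to 1 afterwards.
x_gaps = [abs(xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]
y_gaps = [abs(ys[i + 1] - ys[i]) for i in range(len(ys) - 1)]
```

The gaps in `xs` tend to $0$ while those in `ys` never drop below $1$, so no tail of the image sequence can satisfy the Cauchy condition for $\varepsilon<1$.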

Topological group

Similarly, one can have a Cauchy sequence in a topological group (by considering subtraction as inverse).

A sequence $(x_n)_{n=1}^{\infty}$ in a topological group $G$ is a Cauchy sequence if for every open neighborhood $U$ of the identity of $G$, there exists some number $N$ such that whenever $m,n>N$, we have

$$x_nx_m^{-1} \in U.$$

Completeness

A metric space $(X,d)$ where every Cauchy sequence converges is complete.

Spaces like $\mathbb{R}$, $\mathbb{C}$ are complete with the Euclidean metric. But consider the sequence in $\mathbb{Q}$ by

$$a_n=\left(1+\frac{1}{n}\right)^n,$$

we have $a_n\in\mathbb{Q}$ for all $n$ but the sequence does not converge in $\mathbb{Q}$. Indeed in $\mathbb{R}$ we can naturally write $a_n \to e$ but $e \notin \mathbb{Q}$ as we all know.

There are several ways to construct $\mathbb{R}$ from $\mathbb{Q}$. One of the most famous methods is Dedekind’s cut. However you can find no explicit usage of Cauchy sequence. There is another method by using Cauchy sequence explicitly. We are following that way algebraically.

Completing a group

Suppose we are given a group $G$ with a sequence of normal subgroups $(H_n)_{n=1}^{\infty}$ with $H_n \supset H_{n+1}$ for all $n$, all of which have finite index. We are going to complete this group.

Cauchy sequence

A sequence $(x_n)_{n=1}^{\infty}$ in $G$ will be called a Cauchy sequence if given $H_k$, there exists some $N>0$ such that for $m,n>N$, we have

$$x_nx_m^{-1} \in H_k.$$

Indeed, this looks very similar to what we see in a topological group, but we don’t want to grant a topology to the group anyway. This definition does not go too far from the original definition of Cauchy sequence in $\mathbb{R}$ either. If you treat $H_k$ as some ‘small’ thing, it shows that $x_m$ and $x_n$ are close enough (by considering $x_nx_m^{-1}$ as their difference).
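For a concrete feel, one can take the additive group $G=\mathbb{Z}$ with $H_k=3^k\mathbb{Z}$, so that $x_nx_m^{-1}$ reads $x_n-x_m$. This choice of group, the prime $3$, the sequence $x_n=1+3+\cdots+3^{n-1}$ and the helper names are all assumptions made for this sketch only; the subgroups are nested, normal and of finite index $3^k$, as required.

```python
P = 3  # hypothetical choice of prime


def in_H(k, g):
    """Membership in H_k = P**k * Z, a subgroup of index P**k in Z."""
    return g % P**k == 0


def x(n):
    """x_n = 1 + P + ... + P**(n-1)."""
    return sum(P**i for i in range(n))


def cauchy_witness(k, N, bound):
    """Check x_n - x_m in H_k for all indices N < m, n < bound."""
    return all(in_H(k, x(n) - x(m))
               for m in range(N + 1, bound)
               for n in range(N + 1, bound))


# For each k, the threshold N = k works on the sampled range.
checks = [cauchy_witness(k, N=k, bound=k + 15) for k in range(1, 6)]
```

The check only covers finitely many indices, but the reason it succeeds is general: for $n>m$ the difference $x_n-x_m$ is a sum of powers $3^i$ with $i \geq m$, hence divisible by $3^m$.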

Null sequence

A sequence $(x_n)_{n=1}^{\infty}$ in $G$ will be called a null sequence if given $k$, there exists some $N>0$ such that for all $n>N$, we have

$$x_n \in H_k,$$

or you may write $x_ne^{-1} \in H_k$. It can be considered as being arbitrarily close to the identity $e$.

Cauchy sequence and null sequence as groups

The Cauchy sequences (of $G$) form a group under termwise product

Proof. Let $C$ be the set of Cauchy sequences; we shall show that $C$ forms a group. For $(x_1,x_2,\cdots),(y_1,y_2,\cdots)\in C$, the product is defined by

$$(x_1,x_2,\cdots)(y_1,y_2,\cdots)=(x_1y_1,x_2y_2,\cdots).$$

The associativity follows naturally from the associativity of $G$. To show that $(x_1y_1,x_2y_2,\cdots)$ is still a Cauchy sequence, notice that given $k$, for big enough $m$ and $n$, we have

$$x_nx_m^{-1} \in H_k, \quad y_ny_m^{-1} \in H_k.$$

But $(x_ny_n)(x_my_m)^{-1}=x_ny_ny_m^{-1}x_m^{-1}$. To show that this is an element of $H_k$, notice that

$$x_ny_ny_m^{-1}x_m^{-1}=(x_ny_ny_m^{-1}x_n^{-1})(x_nx_m^{-1}).$$

Since $y_ny_m^{-1}\in H_k$ and $H_k$ is normal, we have $x_ny_ny_m^{-1}x_n^{-1} \in H_k$. Since $x_nx_m^{-1} \in H_k$, $(x_ny_n)(x_my_m)^{-1}$ can be viewed as a product of two elements of $H_k$, and is therefore an element of $H_k$.

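Obviously, if we define $e_C=(e_G,e_G,\cdots)$, where $e_G$ is the identity of $G$, $e_C$ becomes the identity of $C$, since

$$(x_1,x_2,\cdots)(e_G,e_G,\cdots)=(x_1,x_2,\cdots)=(e_G,e_G,\cdots)(x_1,x_2,\cdots).$$

Finally the inverse. We need to show that

$$(x_1^{-1},x_2^{-1},\cdots)$$

is still an element of $C$. This is immediate since if we have

$$x_nx_m^{-1} \in H_k,$$

then

$$x_n^{-1}x_m=x_n^{-1}(x_nx_m^{-1})^{-1}x_n \in H_k$$

as $H_k$ is a normal subgroup.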


The null sequences (of $G$) form a group, further, it’s a normal subgroup of $C$, that is, the group of Cauchy sequences.

Let $N$ be the set of null sequences of $G$. Still, the identity is defined by $(e_G,e_G,\cdots)$, and there is no need to duplicate the validation. The associativity still follows from $G$. To show that $N$ is closed under termwise product, namely if $(x_n),(y_n) \in N$, then $(x_ny_n)\in N$, one only needs to notice that, given $k$, for big $n$, we already have

$$x_n \in H_k, \quad y_n \in H_k.$$

Therefore $x_ny_n \in H_k$ since $x_n$ and $y_n$ are two elements of $H_k$.

To show that $(x_n^{-1})$, which should be treated as the inverse of $(x_n)$, is still in $N$, notice that if $x_n \in H_k$, then $x_n^{-1} \in H_k$.

Next, we shall show that $N$ is a subgroup of $C$, which is equivalent to showing that every null sequence is Cauchy. Given $H_k$, for $(x_n)\in{N}$, there is some $M$ such that for all $m,n>M$,

$$x_n \in H_k, \quad x_m \in H_k,$$

therefore

$$x_nx_m^{-1} \in H_k$$

as desired. Finally, pick $(p_n) \in N$ and $(q_n) \in C$; we shall show that $(q_n)(p_n)(q_n)^{-1} \in N$. That is, the sequence $(q_np_nq_n^{-1})$ is a null sequence. Given $H_k$, we have for all big enough $n$ that

$$p_n \in H_k,$$

therefore

$$q_np_nq_n^{-1} \in H_k$$

since $H_k$ is normal. Our statement is proved.

The factor group

The factor group $C/N$ is called the completion of $G$ (with respect to $(H_n)$).

As we know, the elements of $C/N$ are cosets. A coset can be considered as an element of $G$’s completion. Let’s head back to some properties of factor group. Pick $x,y \in C$, then $xN=yN$ if and only if $x^{-1}y \in N$. With that being said, two Cauchy sequences are equivalent if their ‘difference’ is a null sequence.

Informally, consider the additive group $\mathbb{Q}$. There are two Cauchy sequences given by

They are equivalent since

is a null sequence. That’s why people say $0.99999\ldots = 1$ (in analysis, the difference converges to $0$; in algebra, we say the two sequences are equivalent). As another example, $\ln{2}$ can be represented by the equivalence class of


Remarks

We made our completion using Cauchy sequences. The completion is filled with some Cauchy sequence and some additions of ‘nothing’, whence the gap disappears.

Again, the sequence of normal subgroups does not have to be indexed by $\mathbb{N}$. It can be indexed by any directed partially ordered set, or simply partially ordered set. Removing the restriction of index set gives us a great variety of implementation.

However, can we finish everything about completing $\mathbb{Q}$ using this? The answer is no: the multiplication is not verified! To finish this, field theory has to be taken into consideration.

An elementary introduction to profinite groups

Introduction

This blog serves as an introduction to profinite groups without touching anything other than elementary group theory (no ring, field, Galois theory, topological group, etc.), though we may not be able to go for further application.

Number system

We begin with an easy-to-understand motivation by introducing $\mathbb{Z}_p$. Consider the binary expansion of an integer

where $a_k=0,1$. For example we may have

You must be familiar with binary expansion if you write code. As a topology exercise, show that the set containing all such $a$ is uncountable. In the octal number system you may also write

This notation is pretty useful in some real-life occasions but not here. We are looking for connections between number systems and prime numbers (you will see why later), but number systems with bases like $8,10,16$ definitely won’t work.

$p$-adic integer

Fix a prime number $p$. A $p$-adic integer $\alpha$ is defined by a sequence of integers $x_k$, for which we write

satisfying

For example, we write $88$ as a $2$-adic number by

As you may realize, $x_k$ can be written by

where $a_i \in \{0,1,\cdots,p-1\}$ for $i \leq k-1$; the $a_i$ are called the $p$-adic digits.

In fact, if we define the addition componentwise, i.e.

then it’s a group. Further, if we define the multiplication componentwise, we get a ring. The group of all $p$-adic integers is denoted by $\mathbb{Z}_p$. But this blog won’t touch anything other than group theory.

As you may wonder, it doesn’t seem to work for ‘negative’ ones. For example if we have

how do we get $-\alpha$? In fact we have

which suggests the limit value of $x_k$ associated to $-\alpha$ as $k\to\infty$ is

It doesn’t converge in the usual sense. But if it did, we would have

And this is indeed valid in the present setting. We can check this using $p$-adic digits. In fact, the $p$-adic digits of $1$ are

if we add $p-1$ to each component, we get

(there are infinitely many $p-1$!).
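
The claim about the digits of $-1$ can be checked on truncations. Below is a minimal Python sketch; it assumes each $x_k$ is represented by its residue in $[0,p^k)$, and the function names `padic_sequence` and `padic_digits` are hypothetical helpers introduced only for illustration.

```python
def padic_sequence(a, p, length):
    """Return (x_1, ..., x_length) with x_k = a mod p^k, 0 <= x_k < p^k."""
    return [a % p**k for k in range(1, length + 1)]


def padic_digits(a, p, length):
    """First `length` p-adic digits a_0, a_1, ... of the integer a."""
    digits = []
    for _ in range(length):
        digits.append(a % p)      # Python's % is non-negative for p > 0
        a = (a - a % p) // p      # strip the lowest digit
    return digits


print(padic_sequence(88, 2, 8))   # the 2-adic sequence of 88
print(padic_digits(-1, 5, 6))     # every digit of -1 is p - 1
```

Adding the digit sequence of $1$ to that of $-1$ produces a carry in every position, which is the digit-level way of seeing that the sum is $0$.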

With all this being said, you can treat $\alpha=(x_1,x_2,\cdots)$ as a limit:

which makes everything natural. We are not digging into $\mathbb{Z}_p$ further. But keep two words in mind: limit and group.

$\mathbb{Z}_p$ and $\mathbb{Z}/p^k\mathbb{Z}$

The definition of $\mathbb{Z}_p$ by $x_{k+1} \equiv x_k\mod{p^k}$ might remind you of $\mathbb{Z}/p^k\mathbb{Z}$. Let’s give a review of $\mathbb{Z}/p^k\mathbb{Z}$.

For integers $x,y$, we have

if $x \in p^k\mathbb{Z}$. Further we have

if $(x-y)\in p^k\mathbb{Z}$. We may also write $x \equiv y \mod p^k$. So there are infinitely many congruences $x_{k+1} \equiv x_k \mod p^k$; shall we associate infinitely many $\mathbb{Z}/p^k\mathbb{Z}$ with them? If it works, we may treat $\mathbb{Z}_p$ as the ‘limit’ of $\mathbb{Z}/p^k\mathbb{Z}$. But we need some proper operation to do that.

Let $G_n=\mathbb{Z}/p^{n+1}\mathbb{Z}$ for each $n \geq 0$. Let

be the canonical homomorphism. Notice that $f_n$ is surjective. Now consider a $p$-adic integer

we have

Therefore we get an expression of $\mathbb{Z}_p$ by

We will write $\mathbb{Z}_p=\varprojlim\mathbb{Z}/p^n\mathbb{Z}$ since it’s an example of an inverse limit. It’s inverse since $f_n$ goes ‘back’ by associating each $x_n$ to $x_{n-1}$. Since $f_n$ is surjective, we can always lift $x_{n-1}$ to $G_{n}$ via $f_{n}$. We treated one group as a limit of a sequence of groups. We don’t want to limit ourselves to number theory. In the following section we offer a much more general definition, where even the numbers are generalized.
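
To make the ‘compatible sequence’ picture concrete, here is a small Python sketch on finite truncations. It assumes $G_n=\mathbb{Z}/p^{n+1}\mathbb{Z}$ with residues stored in $[0,p^{n+1})$ and takes $f_n$ to be reduction modulo $p^n$; the names `project`, `is_compatible` and `add` are hypothetical.

```python
def project(x, n, p):
    """f_n: Z/p^(n+1)Z -> Z/p^n Z, reduction modulo p^n (for n >= 1)."""
    return x % p**n


def is_compatible(seq, p):
    """seq[n] lives in G_n; check f_n(x_n) = x_(n-1) for every n >= 1."""
    return all(project(seq[n], n, p) == seq[n - 1] for n in range(1, len(seq)))


def add(seq_a, seq_b, p):
    """Componentwise addition, each component reduced in its own G_n."""
    return [(a + b) % p**(n + 1) for n, (a, b) in enumerate(zip(seq_a, seq_b))]


p = 3
alpha = [7 % p**(n + 1) for n in range(6)]          # truncation of 7
minus_one = [(-1) % p**(n + 1) for n in range(6)]   # truncation of -1
total = add(alpha, minus_one, p)

print(is_compatible(alpha, p), is_compatible(minus_one, p), is_compatible(total, p))
```

Componentwise addition keeps sequences compatible because each $f_n$ is a homomorphism, which is why the set of compatible sequences is a group.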

Definition of profinite group

We are going to give a generalized definition for profinite group. Notice that in the example of $\mathbb{Z}_p$, the sequence is indexed by $\mathbb{N}$. It’s easy to understand but this index set prevents profinite group from being further applied. Of course, the index $\mathbb{N}$ is not excluded.

Directed partially ordered set

A set $I$ is directed partially ordered if it’s associated with a partial order $\geq$ such that for any two elements $i,j \in I,$ there exists a $k \in I$ such that $k \geq i$ and $k \geq j$.

$\mathbb{Z}$ with the natural inequality is of course directed partially ordered. However, on the positive integers we can define another partial order by division: if we set $n \geq m$ whenever $m|n$, then we have $\operatorname{lcm}(m,n) \geq m,n$.
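
The divisibility order is easy to experiment with. The following Python sketch assumes we work with positive integers only; `geq` and `lcm` are hypothetical helper names (the latter is written out with `math.gcd` so that it runs on older Python versions too).

```python
from math import gcd


def lcm(m, n):
    """Least common multiple of two positive integers."""
    return m * n // gcd(m, n)


def geq(n, m):
    """The divisibility order: n >= m iff m divides n."""
    return n % m == 0


pairs = [(4, 6), (5, 7), (12, 18), (9, 9)]
upper_bounds = [lcm(m, n) for m, n in pairs]
print(upper_bounds)

# 4 and 6 are not comparable, yet they have a common upper bound.
print(geq(4, 6), geq(6, 4), geq(lcm(4, 6), 4) and geq(lcm(4, 6), 6))
```

The second line of output shows why ‘directed’ is weaker than ‘totally ordered’: two elements need not be comparable, they only need something above both.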

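As another example, consider the family $\mathcal{F}$ of all subgroups of a group $G$. The partial order is defined by inclusion, i.e. for $M,N \in \mathcal{F}$, we have $M \geq N$ if $M \supset N$. In this case $\langle M \cup N \rangle \geq M,N$, where $\langle M \cup N \rangle$ is the subgroup generated by $M \cup N$ (the union itself need not be a subgroup).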

Projective system

A projective system is a collection of groups $G_i$ ($i \in I$), together with group homomorphisms $f^{j}_i: G_j \to G_i$ for $i,j\in {I}$ with $j \geq i$ such that

  1. $f_{i}^{i}=\operatorname{id}_{G_i}$ for every $i \in I$.
  2. $f_{i}^{j}\circ f_{j}^{k}=f_{i}^{k}$ for $k \geq j \geq i$.

Inverse limit and profinite group

Given any such projective system with a directed partially ordered index set, we have the inverse limit (or projective limit) defined by

$$\varprojlim_{i \in I} G_i=\left\{(x_i)_{i\in I}\in\prod_{i\in I}G_i:f_i^j(x_j)=x_i\text{ for all }j\geq i\right\}.$$

It’s easy to see that $\mathbb{Z}_p$ can be defined in the same manner, with $I = \mathbb{N}$. It can also be verified that the inverse limit forms a group (also a topological group, but we are not discussing that here).

A group is profinite if it is a projective limit of finite groups (up to isomorphism).

Examples

For any $g \in \mathbb{N}_+$, it would be interesting to consider the following projective limit by

It can be verified that we have

That said, the base $8$ number system is ‘useless’ since it’s isomorphic to the base $2$ number system. That’s why we focus on primes first. We will give another ‘generalization’ of $p$-adic numbers.


Suppose we have a sequence of normal subgroups $(H_n)$ of $G$ such that $H_n \supset H_{n+1}$ for all $n$. It doesn’t matter whether $G$ is finite. Let

be the canonical homomorphisms. Then the inverse limit follows:

We also have a natural homomorphism

by sending $x$ to the sequence $(x_n)$, where $x_n$ is the image of $x$ in $G/H_n$. Notice that we don’t have to use $\mathbb{N}$ as the index. This inverse limit can also be indexed by the set containing all $H_n$.

Further

You may think this is like algebraists stole something from analysts and made it up with the magic of algebra. There are many other applications that I want to show you in the future (not beyond elementary group theory). If you have learned functional analysis you may know that the space $L^p$ for $1 \leq p < \infty$ is not a Banach space as it stands, because of the nonzero functions equal to $0$ a.e. But $L^p/N$ is a Banach space, where $N$ contains all functions equal to $0$ a.e. Both $L^p$ and $N$ are groups, and we ‘completed’ $L^p$ by defining a factor space which still is a group. In fact, in algebra, we also have Cauchy sequences and the completion of a group, which are again associated with inverse limits.

References / Further readings

  • Luis Ribes, Introduction to Profinite Groups
  • Hendrik Lenstra, Profinite Groups
  • Serge Lang, Algebra Revised Third Edition

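Linear ODE (4): A general method for higher-order linear differential equations with constant coefficients

The ultimate goal

We want to discuss a general method for solving the following two kinds of equations

$$y^{(n)}+a_1y^{(n-1)}+\cdots+a_{n-1}y'+a_ny=0,$$

$$y^{(n)}+a_1y^{(n-1)}+\cdots+a_{n-1}y'+a_ny=f(x),$$

where the $a_i$ are constants. In the previous post we saw that, assuming we can solve the first equation, the solution of the second one can be obtained with Cramer’s rule: solve an ordinary linear system, then integrate. But there we merely assumed that we had this ‘superpower’; we had no practical procedure. This post gives us that ‘superpower’. Of course, the point is not only to solve equations quickly and accurately; more importantly, we get to see the plain yet ingenious connections between classical theories.

The method of this post is based on polynomials; I assume you have at least seen some simple manipulations of polynomials in a calculus course. We will need the classical fundamental theorem of algebra, which says:

Every nonzero polynomial of degree $n$ in one variable with complex coefficients has exactly $n$ complex roots (counted with multiplicity).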

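The polynomial method

A review of simple linear equations

The most suitable equation to discuss here is

Indeed, anything simpler would just be an ordinary indefinite integral. Let us review how to analyze the structure of its solutions. First, $y=0$ is clearly a solution, so a solution certainly exists. On the other hand, in the first post of this series we gave a general formula for the general solution

So the solution of this equation is

where $C$ is an arbitrary constant.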


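We can also try a simple second-order equation

$$y''-2y'+y=0.$$

Clearly $y=0$ is still a particular solution of this equation. What we hope for is to solve an equation of the form $y'+p(x)y=0$ twice, since we already know how to solve those.

Notice that we can write the equation as

$$(y'-y)'-(y'-y)=0.$$

Let $u=y'-y$; then we have

$$u'-u=0.$$

We know the solution of this equation; in fact we already have

$$u=C_1e^x,$$

so we get

$$y'-y=C_1e^x,$$

from which we can solve

$$y=(C_1x+C_2)e^x.$$

An attempt to abstract the method above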

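For first-order equations this is just the method we have already learned. For the second-order equation, as you may have noticed, there seems to be a pattern. For the equation $y''-2y'+y=0$, we solved $y'-y$ twice. It is then natural to think that some other equation might amount to solving $y'-2y$ twice. If we regard differentiating twice as a ‘square’, we get an equation

$$\lambda^2-2\lambda+1=(\lambda-1)^2=0.$$

Can we then write down an equation that requires solving $y'-2y$ twice? We first write an equation in $\lambda$ and then the corresponding differential equation, that is,

$$(\lambda-2)^2=\lambda^2-4\lambda+4=0,$$

and the corresponding equation indeed satisfies

$$y''-4y'+4y=(y'-2y)'-2(y'-2y)=0,$$

so we again have to solve $y'-2y=0$ twice. Similarly, some equation may require solving $y'-3y$ $n$ times, and so on. We can also discuss ‘mixed’ situations. For example, a second-order equation may require solving $y'-ay$ once and then $y'-by$ once. If we swap the two, do we get a different result? We can analyze this without even solving anything. In fact we have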

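‘Polynomials’ in differentiation: your ‘superpower’

We will give a general method for solving homogeneous equations with constant coefficients. We already know that differentiation is a linear operation: differentiating a differentiable function gives a new function. So we write $y'$ as $Dy$, where $D$ denotes this linear operation, and for higher derivatives we write $y^{(n)}=D^ny$. Not differentiating at all, i.e. $D^0y$, can be written as $Iy$, or with $I$ omitted.

Now suppose we have

$$y^{(n)}+a_1y^{(n-1)}+\cdots+a_{n-1}y'+a_ny=0,$$

that is,

$$D^ny+a_1D^{n-1}y+\cdots+a_{n-1}Dy+a_ny=0.$$

Then we obtain a polynomial

$$P(D)=D^n+a_1D^{n-1}+\cdots+a_{n-1}D+a_n,$$

so the original equation can be written as

$$P(D)y=0.$$

What does this have to do with the examples above? Notice that if $a_1,\cdots,a_n$ are complex numbers, then $P(D)$ can always be written in the form

$$P(D)=(D-\lambda_1)(D-\lambda_2)\cdots(D-\lambda_n),$$

where the $\lambda_i$ may or may not coincide. Let us revisit the example $y''-2y'+y=0$. With the help of $P(D)$ the equation can be written as

$$(D-1)^2y=0.$$

Viewing the solution from this angle, we set $u=(D-1)y$, solve $(D-1)u=0$ for $u$, and then solve $(D-1)y=u$ to obtain $y$.

So we already have the general method; it is in fact recursive.

For the equation

$$(D-\lambda_1)(D-\lambda_2)\cdots(D-\lambda_n)y=0,$$

it suffices to set $\varphi_1=(D-\lambda_2)\cdots(D-\lambda_n)y$ and solve $(D-\lambda_1)\varphi_1=0$ for $\varphi_1$; then proceed in the same way, setting $\varphi_2=(D-\lambda_3)\cdots(D-\lambda_n)y$ and solving $(D-\lambda_2)\varphi_2=\varphi_1$ for $\varphi_2$, and so on; finally set $\varphi_n=y$, and what we solve for is the final result. At this point you have acquired the ‘superpower’ required in the previous post (note that $\varphi_n$ already contains $n$ constants).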

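That settles homogeneous linear equations. What general methods are there for nonhomogeneous ones? There are three.

  1. If a particular solution can easily be spotted, say some $\mu(x)$ with $P(D)\mu(x)=f(x)$, then the solution of the nonhomogeneous equation is $\mu(x)+\varphi_n(x)$.
  2. Solve $P(D)y=f(x)$ directly, exactly as in the homogeneous case, only noting that $\varphi_1$ now satisfies $(D-\lambda_1)\varphi_1=f(x)$; the recursion yields the same solution as method 1.
  3. Use the method of the previous post. The final $\varphi_n(x)$ contains $n$ constants, so it can be written as $\varphi_n(x)=\sum_{k=1}^{n}C_ku_k(x)$, and these $u_k(x)$ are exactly the fundamental system of solutions we need.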

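Example: solving mechanically with the polynomial method

Step 1: Simplify $P(D)$

This is easy; in fact we have

Step 2: Solve recursively

We have

which gives

Next set

so that

which gives

Finally, solve

In fact it is not hard to see that $y=-1$ is a particular solution of this equation, while solving $y'''-3y''+3y'-y=0$ gives $y=C_1x^2e^x+C_2xe^x+C_3e^x$; the result agrees with the method above.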
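
The result can be double-checked exactly with a few lines of Python. This is only a sketch under stated assumptions: the displayed equation is inferred to be $y'''-3y''+3y'-y=1$ (consistent with the particular solution $y=-1$), functions are restricted to the form $q(x)e^{\mu x}$ with $q$ a polynomial stored as a coefficient list, and the names `derivative`, `apply_factor` and `apply_factors` are hypothetical helpers.

```python
def derivative(coeffs):
    """Coefficients of q'(x) for q(x) = sum(coeffs[i] * x**i)."""
    return [i * c for i, c in enumerate(coeffs)][1:] or [0]


def apply_factor(coeffs, mu, lam):
    """Apply (D - lam) to q(x) * exp(mu * x) and return the new polynomial part.

    By the product rule, (D - lam)(q e^{mu x}) = (q' + (mu - lam) q) e^{mu x}.
    """
    dq = derivative(coeffs) + [0] * len(coeffs)   # pad q' to the length of q
    return [dq[i] + (mu - lam) * c for i, c in enumerate(coeffs)]


def apply_factors(coeffs, mu, lams):
    """Apply (D - lams[0]) ... (D - lams[-1]), rightmost factor first."""
    for lam in reversed(lams):
        coeffs = apply_factor(coeffs, mu, lam)
    return coeffs


# Homogeneous part: (D - 1)^3 applied to (3 + 2x + x^2) e^x.
print(apply_factors([3, 2, 1], 1, [1, 1, 1]))
# Particular solution: (D - 1)^3 applied to -1 = -1 * e^{0x}.
print(apply_factors([-1], 0, [1, 1, 1]))
# The order of distinct factors does not matter, here on e^{2x}.
print(apply_factors([1], 2, [1, 2]), apply_factors([1], 2, [2, 1]))
```

The first call returns the zero polynomial and the second returns the constant $1$, so $(C_1x^2+C_2x+C_3)e^x-1$ does satisfy the assumed equation for the sample constants chosen.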

Notice that we can also write this equation as

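A summary of the basic cases of $P(D)y=0$

Now that we have a mechanical method, can we summarize what the solutions look like before actually solving? Certainly. We will discuss all the basic cases of $P(D)$. The most elementary computations are omitted; they are nothing more than basic first-order equations.

1. The case $P(D)=(D-\lambda)^n$

For such an equation, we end up solving a first-order equation of type $y'-\lambda y$ $n$ times. To do so, we define

Then it suffices to solve $(D-\lambda)\varphi_{k+1}=\varphi_k$ $n$ times. A simple computation gives

2. The case $P(D)=(D-\lambda_1)(D-\lambda_2)\cdots(D-\lambda_n)$ (the $\lambda_i$ pairwise distinct)

We naturally hope for a form involving $e^{\lambda_1 x},\cdots,e^{\lambda_n x}$. Is that what happens? We can compute directly.

First we have

Then solving

gives

which after rearranging becomes

Continuing the computation, we obtain

3. The general case: $P(D)=(D-\lambda_1)^{r_1}(D-\lambda_2)^{r_2}\cdots(D-\lambda_k)^{r_k}$

This is naturally a combination of cases 1 and 2. In case 1 we saw that if adjacent $\lambda_i$ are equal, then $\varphi_{i+1}=x\varphi_i+C_{i+1}e^{\lambda_{i}x}$; if adjacent $\lambda_i$ are different, then $\varphi_{i+1}=\varphi_i+C_{i+1}e^{\lambda_{i+1}x}$. Combining these two conclusions, and after a simple computation, we can summarize the general form of the solution: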


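Solving equations is now much easier. For example, for the equation

we obtain

so the solution is

As another example, to solve

notice that

so the solution is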


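Summary of the polynomial method

Throughout this post we did only one thing: reduce a higher-order differential equation so that solving it becomes solving several first-order equations. Doing that reduction by inspection is impractical, so we treated differentiation as an abstract ‘number’ and worked with the corresponding polynomial, turning the solution into two steps: factor the polynomial, then solve recursively. In handling the polynomial we indirectly simplified the original equation.

But does the polynomial method necessarily apply to equations with non-constant coefficients? Not necessarily. Take the equation $y''=xy$: it has no simple solution, and we cannot expect a simple method to produce the function we want.

The advantage of this method is that it is plain and mechanical: one only needs to solve several first-order equations. However, it does not really reflect the notion of ‘linearity’, and the relation to linear algebra is hard to see. The next post gives a matrix-based method.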

A continuous function is 'mostly' nowhere differentiable

About this post

It is taught in elementary calculus courses that, if a function is differentiable at some point $x_0$, then it’s continuous at $x_0$, but not vice versa. It’s easy to construct a counterexample which is continuous but not differentiable, for example $y=|x|$ at $x=0$. In fact, we can even find a nowhere differentiable function given by a series:

where $g(x)=|x|$ for $x \in [-1,1]$ and $g(x+2)=g(x)$ for all $x \in \mathbb{R}$.

Though the existence has been proved already, we are looking for some feasible estimation on “quantity”. Also, this post is a good chance to review some elementary analysis techniques such as continuity and differentiability.

Some basic facts

I’m assuming that you are familiar with the basic facts about the Baire Category Theorem (BCT).

Let $K=[0,1]$ and let $C(K)$ be the set of all real-valued continuous functions defined on $K$. For $f \in C(K)$, we define the norm of $f$ by $\lVert f \rVert = \sup_{x \in K}|f(x)|$, and a metric $d$ on $C(K)$ by $d(f,g)=\lVert f-g \rVert$, then $C(K)$ becomes a complete metric space, where BCT applies, showing that $C(K)$ is of the second category.

A function $f \in C(K)$ will be called somewhere differentiable if there exists some point $x \in K$ such that $f$ is differentiable at $x$ (a somewhere differentiable function need not be everywhere continuous on $K$, but that’s out of our consideration). If there are several collections of functions defined by

then trivially $E \subset A \subset S \subset C(K)$. We’ll show that, however, $S$ is of the first category (not to mention $E$ and $A$), which indicates that there are “very few” somewhere differentiable functions.

Proof in detail

(Theorem) $S$ is of the first category.

By somewhere differentiable, we mean that there exists some $x \in K$ such that the limit

exists and is finite. If we drop the requirement that the limit exists, we get a superset of $S$ defined by

We will show that each $B_{n,m}$ is nowhere dense, which forces $S$ to be of the first category.

Step 1 - Each $B_{n,m}$ is closed

Proof

It suffices to show that every Cauchy sequence in $B_{n,m}$, say $(f_k)_{k=1}^{\infty}$, converges in $B_{n,m}$. Since $C(K)$ is complete, we have $f_k \to f$ (uniformly) for some $f \in C(K)$ as $k \to \infty$. For each $k$, we can find $x_k \in [0,1]$ such that

for all $0 < h < \frac{1}{m}$. Since $[0,1]$ is compact, by Bolzano-Weierstrass theorem, $(x_k)_{k=1}^{\infty}$ has a convergent subsequence, namely $(x_{k_i})_{i=1}^{\infty}$. Therefore, for each $i$, we have some $g_{i}=f_{k_i}$ and $y_i=x_{k_i}$ such that

for all $0<h<\frac{1}{m}$. Suppose then $y_i \to x$ as $i \to\infty$, we therefore have

for all $0<h<\frac{1}{m}$. Therefore $f \in B_{n,m}$, which implies that $B_{n,m}$ is closed.

Remarks

There are two basic analysis facts used in this step.

First, a subspace of a complete metric space is closed iff it’s complete. This can be easily shown by proving that a subspace of a complete metric space is not complete iff it’s not closed (a word of warning: not closed $\neq$ open!).

Second, the Bolzano-Weierstrass theorem. A subset $M$ of $\mathbb{R}$ is compact iff every sequence of elements of $M$ has a subsequence which converges to an element of $M$. We already know that $[0,1]$ is compact.

Step 2 - Each $B_{n,m}$ is nowhere dense

Proof

To show that $B_{n,m}$ is nowhere dense, we have to show that $B_{n,m}$ contains no open ball. The open ball for $f \in B_{n,m}$ is defined by

We will show that $B(f,\varepsilon)-B_{n,m} \neq \varnothing$ for any $\varepsilon >0$. This is done by constructing a function $g$ such that $g \in B(f,\varepsilon)$ but $g \notin B_{n,m}$.

Since $PL(K)$ (the family of piecewise linear functions defined on $K$) is dense in $C(K)$, we may find a function $p \in PL(K)$ such that $\left\Vert f-p \right\Vert<\frac{\varepsilon}{2}$.

Since $p$ is piecewise differentiable as well, we may find some $M \in \mathbb{N}$ such that $|p'(x)| \leq M$ for all $x$ where $p$ is differentiable. Pick an integer $k$ where $k>\frac{2(M+n)}{\varepsilon}$.

Define another periodic piecewise linear function $\varphi$ on $K$ as follows. For $x \in [0, \frac{2}{k}]$, $\varphi(x)=-k|x-\frac{1}{k}|+1$; for $\frac{2}{k} < x \leq 1$, extend periodically with period $\frac{2}{k}$, that is, $\varphi(x)=\varphi(x-\frac{2}{k})$. For this function, we have $|\varphi(x)| \leq 1$ and $|\varphi'(x)|=k$ for all $x$ where $\varphi$ is differentiable. With $g(x)$ defined by

we have

Thus $g \in B(f,\varepsilon)$. However, as we’ll show, $g \notin B_{n,m}$.

Pick $x \in K$ such that both $p$ and $\varphi$ are differentiable at $x$, then

Still, $g \in PL(K)$. For $x$ where $g$ is differentiable, we have $|g'(x)|>n$. At the turning points of $g$, we have $|g'_{-}(x)|>n$ and $|g'_{+}(x)|>n$ for the left and right derivatives of $g$. Therefore $g \notin B_{n,m}$, and for any $n,m \in \mathbb{N}$ and $\varepsilon \in \mathbb{R}^{+}$ we have $B(f,\varepsilon) - B_{n,m} \neq \varnothing$. That is, $B_{n,m}$ is nowhere dense.
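
The construction can be tried out numerically. The sketch below assumes $g=p+\frac{\varepsilon}{2}\varphi$ (a choice consistent with the bound $k>\frac{2(M+n)}{\varepsilon}$, since the displayed formula is not reproduced here) and uses the toy function $p(x)=Mx$; the name `make_g` and the sample values of $M$, $n$, $\varepsilon$ are hypothetical.

```python
def make_g(p, eps, k):
    """Return g = p + (eps/2) * phi, where phi is the tent function of
    height 1, slopes +-k and period 2/k."""
    period = 2.0 / k

    def phi(x):
        t = x % period                      # position inside one period
        return 1.0 - k * abs(t - 1.0 / k)

    return lambda x: p(x) + 0.5 * eps * phi(x)


M, n, eps = 3, 5, 0.5            # |p'| <= M, slope bound n, ball radius eps
k = int(2 * (M + n) / eps) + 1   # an integer k > 2(M+n)/eps
p = lambda x: M * x              # a (trivially) piecewise linear p
g = make_g(p, eps, k)

h = 1e-6
samples = [(j + 0.25) / k for j in range(k)]   # points away from the kinks of phi
quotients = [abs(g(x + h) - g(x)) / h for x in samples]
sup_dist = max(abs(g(x) - p(x)) for x in samples)
print(min(quotients) > n, sup_dist <= eps / 2)
```

On each linear piece the slope of $g$ is $p'\pm\frac{\varepsilon}{2}k$, and $\frac{\varepsilon}{2}k>M+n$ forces its absolute value above $n$, which is what the difference quotients confirm.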

Remarks

There are some confusing parts worth talking about.

  • Why do we have to prove that $B_{n,m}$ contains no open balls?

    ​ Since $B_{n,m}$ is closed, the closure of $B_{n,m}$ is itself. Therefore it suffices to show that $B_{n,m}$ has empty interior, or equivalently, contains no open balls.

  • The detail of $g \notin B_{n,m}$?

    Notice that $g$ is a piecewise linear function. Suppose the turning points of $g$ are $x_1,x_2,\cdots,x_N$. Then $g(x)$ is differentiable (also linear) on $(x_i,x_{i+1})$ for $i=1,2,\cdots,N-1$, and $g'(x)$ is the slope of the segment, which is equal to $\frac{g(x+h)-g(x)}{h}$ whenever $x+h \in (x_i,x_{i+1})$. If $x=x_i$ for some $i$, then for small $h$ the quotient $\frac{g(x+h)-g(x)}{h}$ is equal to the slope of the segment on the left ($h<0$) or on the right ($h>0$). Therefore, for all $x \in K$, we can find some $0<|h|<\frac{1}{m}$ such that $|g(x+h)-g(x)|/|h|>n$.

  • Piecewise linear functions?

    A function $p$ is called piecewise linear on $[0,1]$ if there is a partition $0=a_0<a_1<\cdots<a_N=1$ such that $p$ is “linear” (can be expressed in the form $y=ax+b$) on each $[a_i,a_{i+1}]$. We also assume that $p$ is continuous; the collection of such functions is denoted by $PL(K)$. The fact that $PL(K)$ is dense in $C(K)$ can be shown as follows.

$PL(K)$ is dense in $C(K)$. That is, for every $f \in C(K)$ and $\varepsilon >0$, there exists some $p \in PL(K)$ such that $\left\Vert p-f \right\Vert < \varepsilon$.

(Proof) Since $K$ is compact, $f$ is uniformly continuous on $K$. Therefore, there exists some $\delta>0$ such that $|f(x)-f(y)|<\frac{\varepsilon}{2}$ for all $|x-y|<\delta$ where $x,y\in{K}$.

Choose a partition $0=a_0<a_1<\cdots<a_N=1$ with $|a_{i+1}-a_i|< \delta$ for every $i$, and define $p \in PL(K)$ by $p(a_i)=f(a_i)$, linear in between. If $x \in K$, there is an $i$ such that $a_i \leq x \leq a_{i+1}$. Then $|f(x)-f(a_i)|<\frac{\varepsilon}{2}$. Also,

Therefore

This holds for all $x \in K$, and therefore also for $\sup_{x \in K}|f(x)-p(x)|$.
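
Here is a minimal Python sketch of the density argument for one concrete function. It assumes $f(x)=\sin(2\pi x)$, whose Lipschitz constant $2\pi$ lets us take $\delta=\frac{\varepsilon}{4\pi}$, and a uniform partition; `piecewise_linear` is a hypothetical helper, and the sup norm is only estimated on a finite grid.

```python
import math


def piecewise_linear(f, nodes):
    """Continuous piecewise linear p with p(a_i) = f(a_i) at every node."""
    values = [f(a) for a in nodes]

    def p(x):
        for i in range(len(nodes) - 1):
            a, b = nodes[i], nodes[i + 1]
            if a <= x <= b:
                s = (x - a) / (b - a)
                return (1 - s) * values[i] + s * values[i + 1]
        raise ValueError("x outside [nodes[0], nodes[-1]]")

    return p


f = lambda x: math.sin(2 * math.pi * x)
eps = 0.1
# f is Lipschitz with constant 2*pi, so |x - y| < delta = eps / (4*pi)
# implies |f(x) - f(y)| < eps / 2.
delta = eps / (4 * math.pi)
pieces = math.ceil(1 / delta) + 1            # mesh width 1/pieces < delta
nodes = [i / pieces for i in range(pieces + 1)]
p = piecewise_linear(f, nodes)

grid = [j / 5000 for j in range(5001)]
sup_error = max(abs(f(x) - p(x)) for x in grid)
print(pieces, sup_error < eps)
```

The observed error is far below $\varepsilon$ because the proof’s estimate only uses uniform continuity; for smooth $f$ linear interpolation converges much faster.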

Step 3 - $S = \bigcup_{m}\bigcup_{n} B_{n,m}$, therefore of the first category

Proof

Since $f$ is differentiable at $x$, there exists a real number $A$ such that for any $\varepsilon>0$, there exists some $\delta>0$ such that

for $|h| \in (0,\delta)$. In fact, $A=f'(x)$; all we have to do is pick $n,m$ such that $n > |A|+\varepsilon$ and $\frac{1}{m}<\delta$. In other words, all we need to show is that, if $f \in S$, then $f \in B_{n,m}$ for some $m$ and $n$. Therefore

Remarks

If we think of meagre sets as being “small”, this tells us that “most” $f \in C(K)$ are nowhere differentiable. At the very least, nowhere differentiable functions exist, since a space cannot be of the first and second category at the same time.

The Big Three Pt. 2 - The Banach-Steinhaus Theorem

About this blog post

People call the Banach-Steinhaus theorem the first of the big three, which sit at the foundation of linear functional analysis. None of them can go without Baire’s category theorem.

This blog post offers the Banach-Steinhaus theorem on different abstract levels. Recall that we have

First, there will be a simple version for Banach spaces, which may be more frequently used, and you will realize why it’s referred to as the uniform boundedness principle. After that, there will be a much more generalized version for TVS. Typically, the metrization of the space will not be considered.

Also, it will be a good chance to get a better view of spaces of the first and second category in the sense of Baire.

Equicontinuity

For metric spaces, equicontinuity is defined as follows. Let $(X,d_X)$ and $(Y,d_Y)$ be two metric spaces.

Let $\Lambda$ be a collection of functions from $X$ to $Y$. We have three different levels of equicontinuity.

  1. Equicontinuous at a point. $\Lambda$ is equicontinuous at $x_0 \in X$ if for every $\varepsilon>0$, there exists a $\delta>0$ such that $d_Y(Lx_0,Lx)<\varepsilon$ for all $L \in \Lambda$ and all $x$ with $d_X(x_0,x)<\delta$ (that is, the continuity estimate holds for all $L$ simultaneously in a ball centered at $x_0$ with radius $\delta$).
  2. Pointwise equicontinuous. $\Lambda$ is equicontinuous at each point of $X$.
  3. Uniformly equicontinuous. For every $\varepsilon>0$, there exists a $\delta>0$ such that $d_Y(Lx,Ly)<\varepsilon$ for all $L \in \Lambda$ and $x,y \in X$ such that $d_X(x,y) < \delta$.

Indeed, if $\Lambda$ contains only one element, namely $L$, then everything goes with the continuity and uniform continuity.

But for the Banach-Steinhaus theorem, we need a few more restrictions. In fact, $X$ and $Y$ should be considered Banach spaces, and $\Lambda$ contains linear functions only. In this setting, for $L \in \Lambda$, the following three conditions are equivalent.

  1. $L$ is bounded.
  2. $L$ is continuous.
  3. $L$ is continuous at one point of $X$.

For topological vector spaces, where only topology and linear structure are taken into consideration, things get different. Since no metrization is considered, we have to state it in the language of topology.

Suppose $X$ and $Y$ are TVS and $\Lambda$ is a collection of linear functions from $X$ to $Y$. $\Lambda$ is equicontinuous if for every neighborhood $N$ of $0$ in $Y$, there corresponds a neighborhood $V$ of $0$ in $X$ such that $L(V) \subset N$ for all $L \in \Lambda$.

Indeed, for TVS, $L \in \Lambda$ has the three conditions equivalent as well. With that being said, equicontinuous collection has the boundedness property in a uniform manner. That’s why the Banach-Steinhaus theorem is always referred to as the uniform boundedness principle.

The Banach-Steinhaus theorem, a sufficient condition for being equicontinuous

Banach space version

Suppose $X$ is a Banach space, $Y$ is a normed linear space, and $\Lambda$ is a collection of bounded linear transformations of $X$ into $Y$. Then we have the following equivalent statements:

  1. (The Resonance Theorem) If $\sup\limits_{L \in \Lambda}\left\Vert{L}\right\Vert=\infty$, then there exists some $x \in X$ such that $\sup\limits_{L \in \Lambda}\left\Vert{Lx}\right\Vert=\infty$. (In fact, these $x$ form a dense $G_\delta$.)

  2. (The Uniform Boundedness Principle) If $\sup\limits_{L \in {\Lambda}}\left\Vert{Lx}\right\Vert<\infty$ for all $x \in X$, then we have $ \left\Vert L \right\Vert \leq M$ for all $L \in {\Lambda}$ and some $M<\infty$.

  3. (A summary of 1 and 2) Either there exists an $M<\infty$ such that $\lVert L \rVert \leq M$ for all $L \in \Lambda$, or $\sup\limits_{L \in \Lambda}\lVert Lx \rVert = \infty$ for all $x$ belonging to some dense $G_\delta$ in $X$.

Proof

Though it would be easier if we finish the TVS version proof, it’s still a good idea to leave the formal proof without the help of TVS here. The equicontinuity of $\Lambda$ will be shown in the next section.

An elementary proof of the Resonance theorem

First, we offer an elementary proof in which the hardest part is the Cauchy sequence.

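(Lemma) For any bounded linear $L: X \to Y$, any $x \in X$ and any $r >0$, we have

$$\sup_{y\in B(x,r)}\lVert Ly\rVert\geq\lVert L\rVert r,$$

where $B(x,r)=\{y \in X:\lVert x-y \rVert < r\}$.

(Proof of the lemma)

For $t \in X$ we have a simple relation

$$\max\{\lVert L(x+t)\rVert,\lVert L(x-t)\rVert\}\geq\frac{1}{2}\left(\lVert L(x+t)\rVert+\lVert L(x-t)\rVert\right)\geq\lVert Lt\rVert.$$

If we have $t \in B(0,r)$, then $x+t,x-t\in{B(x,r)}$, and the desired inequality follows by taking the supremum over $t \in B(0,r)$. (If you have trouble understanding this, take a look at the definition of $\lVert L \rVert$.)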

Suppose now $\sup\limits_{L \in \Lambda}\left\Vert{L}\right\Vert=\infty$. Pick a sequence of linear transformations in $\Lambda$, say $(L_n)_{n=1}^{\infty}$, such that $\lVert L_n \rVert \geq 4^n$. Pick $x_0 \in X$, and for $n \geq 1$, we pick $x_n$ inductively.

Set $r_n=3^{-n}$. With $x_{n-1}$ being picked, $x_n \in B(x_{n-1},r_n)$ is picked in such a way that

$$\lVert L_nx_n\rVert\geq\frac{2}{3}r_n\lVert L_n\rVert.$$

(Such an $x_n$ exists: otherwise we would contradict the lemma.) Also, it’s easy to check that $(x_n)_{n=1}^{\infty}$ is Cauchy. Since $X$ is complete, $(x_n)$ converges to some $x \in X$. Further we have

$$\lVert x-x_n\rVert\leq\sum_{k=n+1}^{\infty}r_k=\frac{1}{2}r_n.$$

Therefore we have

$$\lVert L_nx\rVert\geq\lVert L_nx_n\rVert-\lVert L_n\rVert\lVert x-x_n\rVert\geq\frac{1}{6}r_n\lVert L_n\rVert\geq\frac{1}{6}\left(\frac{4}{3}\right)^n\to\infty.$$

A topology-based proof

The previous proof is easy to understand but it’s not easy to see the topological properties of the set formed by such $x$. Thus we are offering a topology-based proof which enables us to get a topology view.

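Put

$$\varphi(x)=\sup_{L\in\Lambda}\lVert Lx\rVert\quad(x\in X)$$

and let

$$V_n=\{x\in X:\varphi(x)>n\}\quad(n=1,2,3,\cdots).$$

We claim that each $V_n$ is open. Indeed, we have to show that $x \mapsto \lVert Lx \rVert$ is continuous. It suffices to show that $\lVert\cdot\rVert$ defined in $Y$ is continuous. This follows immediately from the triangle inequality, since for $x,y \in Y$ we have

$$\lVert x\rVert\leq\lVert x-y\rVert+\lVert y\rVert,$$

which implies

$$\lVert x\rVert-\lVert y\rVert\leq\lVert x-y\rVert;$$

by interchanging $x$ and $y$, we get

$$\left|\lVert x\rVert-\lVert y\rVert\right|\leq\lVert x-y\rVert.$$

Thus $x \mapsto \lVert Lx \rVert$ is continuous since it’s a composition of $\lVert\cdot\rVert$ and $L$. Hence $\varphi$, being a supremum of continuous functions, is lower semicontinuous, which forces $V_n$ to be open.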

If every $V_n$ is dense in $X$ (consider $\sup\lVert L \rVert=\infty$), then by BCT, $B=\bigcap_{n=1}^{\infty} V_n$ is dense in $X$. Since each $V_n$ is open, $B$ is a dense $G_\delta$. Again by the definition of $B$, we have $\varphi(x)=\infty$ for all $x \in B$.

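If one of these sets, namely $V_N$, fails to be dense in $X$, then there exist an $x_0 \in X - V_N$ and an $r>0$ such that for $x \in B(0,r)$ we have $x_0+x \notin V_N$, which is equivalent to

$$\varphi(x_0+x)\leq N;$$

considering the definition of $\varphi$, we also have

$$\lVert L(x_0+x)\rVert\leq N$$

for all $L \in \Lambda$. Since $x=(x+x_0)-x_0$, we also have

$$\lVert Lx\rVert\leq\lVert L(x+x_0)\rVert+\lVert Lx_0\rVert\leq 2N.$$

Dividing by $r$ on both sides, we get

$$\left\lVert L\left(\frac{x}{r}\right)\right\rVert\leq\frac{2N}{r}\quad\text{for all }x\in B(0,r),$$

therefore $\lVert L \rVert \leq M=\frac{2N}{r}$, as was to be shown. Again, this follows from the definition of $\lVert L \rVert$.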

Topological vector space version

Suppose $X$ and $Y$ are topological vector spaces, $\Lambda$ is a collection of continuous linear mappings from $X$ into $Y$, and $B$ is the set of all $x \in X$ whose orbits

$$\Lambda(x)=\{Lx:L\in\Lambda\}$$

are bounded in $Y$. For this $B$, we have:

  • If $B$ is of the second category, then $\Lambda$ is equicontinuous.

A proof using properties of TVS

Pick balanced neighborhoods $W$ and $U$ of the origin in $Y$ such that $\overline{U} + \overline{U} \subset W$. The balanced neighborhood exists since every neighborhood of $0$ contains a balanced one.

Put

$$E=\bigcap_{L\in\Lambda}L^{-1}(\overline{U}).$$

If $x \in B$, then $\Lambda(x)$ is bounded, which means that for $U$ there exists some $n$ such that $\Lambda(x) \subset nU$ (be aware that no metric is introduced; this is the definition of boundedness in a topological vector space). Therefore we have $x \in nE$. Consequently,

$$B\subset\bigcup_{n=1}^{\infty}nE.$$

If no $nE$ is of the second category, then $B$ is of the first category. Therefore, there exists at least one $n$ such that $nE$ is of the second category. Since $x \mapsto nx$ is a homeomorphism of $X$ onto $X$, $E$ is of the second category as well. But $E$ is closed since each $L$ is continuous. Therefore $E$ has an interior point $x$. In this case, $x-E$ contains a neighborhood $V$ of $0$ in $X$, and

$$L(V)\subset Lx-L(E)\subset\overline{U}-\overline{U}\subset W\quad\text{for all }L\in\Lambda.$$

This proves that $\Lambda$ is equicontinuous.

Equicontinuity and uniform boundedness

We’ll show that $B=X$. But before that, we need another lemma, which states the connection between equicontinuity and uniform boundedness

(Lemma) Suppose $X$ and $Y$ are TVS, $\Gamma$ is an equicontinuous collection of linear mappings from $X$ to $Y$, and $E$ is a bounded subset of $X$. Then $Y$ has a bounded subset $F$ such that $T(E) \subset F$ for every $T \in \Gamma$.

(Proof of the lemma) We’ll show that the set

$$F=\bigcup_{T\in\Gamma}T(E)$$

is bounded. Let $W$ be a neighborhood of the origin in $Y$. By the definition of equicontinuity, there is a neighborhood $V$ of the origin in $X$ such that $T(V) \subset W$ for all $T \in \Gamma$. Since $E$ is bounded, $E \subset tV$ for all sufficiently large $t$. For these $t$, by linearity, we have

$$T(E)\subset T(tV)=tT(V)\subset tW.$$

Therefore $F \subset tW$, and $F$ is bounded.

Thus $\Lambda$ is uniformly bounded. Picking $E=\{x\}$ in the lemma, we also see $\Lambda(x)$ is bounded in $Y$ for every $x$. Thus $B=X$.

A special case when $X$ is an $F$-space or Banach space

$X$ is an $F$-space if its topology $\tau$ is induced by a complete invariant metric $d$. By BCT, $X$ is of the second category. If we already have $B=X$, in which case $B$ is of the second category, then by the Banach-Steinhaus theorem, $\Lambda$ is equicontinuous. Formally speaking, we have:

If $\Lambda$ is a collection of continuous linear mappings from an $F$-space $X$ into a topological vector space $Y$, and if the sets

$$\Lambda(x)=\{Lx:L\in\Lambda\}$$

are bounded in $Y$ for every $x \in X$, then $\Lambda$ is equicontinuous.

Notice that all Banach spaces are $F$-spaces. Therefore we can restate the Uniform Boundedness Principle in Banach space with equicontinuity.

Suppose $X$ is a Banach space, $Y$ is a normed linear space, and $\Lambda$ is a collection of bounded linear transformations of $X$ into $Y$. Then we have:

  • (The Uniform Boundedness Principle) If $\sup\limits_{L \in {\Lambda}}\left\Vert{Lx}\right\Vert<\infty$ for all $x \in X$, then we have $\lVert L \rVert \le M$ for all $L \in {\Lambda}$ and some $M<\infty$. Further, $\Lambda$ is equicontinuous.

Application

Surprisingly enough, the Banach-Steinhaus theorem can be used to do Fourier analysis. An important example follows.

There is a periodic continuous function $f$ on $[0,1]$ such that the Fourier series

of $f$ diverges at $0$. $\hat{f}(n)$ is defined by

Notice that $f \mapsto \hat{f}$ is linear, and the divergence of the series at $0$ can be considered by

To invoke the Banach-Steinhaus theorem, the family of linear functionals is defined by

It can be proved that

which goes to infinity as $N \to \infty$. The existence of such $f$ that

follows from the resonance theorem. Further, we also know that these $f$ are in a dense $G_\delta$ subset of the vector space generated by all periodic continuous functions on $[0,1]$.
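
The growth of the norms can be observed numerically. The sketch below assumes the period-$1$ normalization, in which the norm of the $N$-th partial-sum functional at $0$ is $\int_0^1|D_N(t)|\,dt$ with the Dirichlet kernel $D_N(t)=\frac{\sin((2N+1)\pi t)}{\sin(\pi t)}$; the function name `dirichlet_l1_norm` and the midpoint rule are illustrative choices, not part of the original argument.

```python
import math


def dirichlet_l1_norm(N, steps=20000):
    """Midpoint-rule estimate of the integral over [0, 1] of |D_N(t)|, where
    D_N(t) = sum_{n=-N}^{N} exp(2*pi*i*n*t) = sin((2N+1)*pi*t) / sin(pi*t)."""
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) / steps           # midpoints avoid t = 0 and t = 1
        total += abs(math.sin((2 * N + 1) * math.pi * t) / math.sin(math.pi * t))
    return total / steps


norms = [dirichlet_l1_norm(N) for N in (1, 2, 4, 8, 16, 32, 64)]
print([round(v, 3) for v in norms])
```

Each doubling of $N$ adds roughly the same amount, reflecting the logarithmic growth that makes the family of functionals unbounded in norm while every single functional is bounded.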

The series

Since there is no strong reason to write more posts on this topic, i.e. the three fundamental theorems of linear functional analysis, I think it’s time to make a list of the series. It’s been around half a year.

References / Further readings

  1. arXiv:1005.1585v2
  2. W. Rudin, Real and Complex Analysis
  3. W. Rudin, Functional Analysis
  4. Applications to Fourier series