The Big Three Pt. 6 - Closed Graph Theorem with Applications

(Before everything: elementary background of topology and vector spaces, Banach spaces, is assumed.)

A surprising result of Banach spaces

We can define several relations between two norms. Suppose we have a vector space \(X\) and two norms \(\lVert \cdot \rVert_1\) and \(\lVert \cdot \rVert_2\) on it. One says \(\lVert \cdot \rVert_1\) is weaker than \(\lVert \cdot \rVert_2\) if there is \(K>0\) such that \(\lVert x \rVert_1 \leq K \lVert x \rVert_2\) for all \(x \in X\). Two norms are equivalent if each is weaker than the other (trivially this is an equivalence relation). The idea of stronger and weaker norms corresponds to the idea of "finer" and "coarser" topologies in the setting of topological spaces.

So what about convergence? Unsurprisingly this can be verified with elementary \(\varepsilon-N\) arguments. Suppose \(\lVert \cdot \rVert_1\) is weaker than \(\lVert \cdot \rVert_2\) and \(\lVert x_n - x \rVert_2 \to 0\) as \(n \to \infty\). For any \(\varepsilon>0\) we immediately have \[ \lVert x_n - x \rVert_1 \leq K \lVert x_n-x \rVert_2 < K\varepsilon \]

for all large enough \(n\). Hence \(\lVert x_n - x \rVert_1 \to 0\) as well. But what about the converse? We give a new relation between norms.

(Definition) Two norms \(\lVert \cdot \rVert_1\) and \(\lVert \cdot \rVert_2\) on a vector space \(X\) are compatible if, whenever \(\lVert x_n - x \rVert_1 \to 0\) and \(\lVert x_n - y \rVert_2 \to 0\) as \(n \to \infty\), we have \(x=y\).

By the uniqueness of limit, we see if two norms are equivalent, then they are compatible. And surprisingly, with the help of the closed graph theorem we will discuss in this post, we have

(Theorem 1) If \(\lVert \cdot \rVert_1\) and \(\lVert \cdot \rVert_2\) are compatible, and both \((X,\lVert\cdot\rVert_1)\) and \((X,\lVert\cdot\rVert_2)\) are Banach, then \(\lVert\cdot\rVert_1\) and \(\lVert\cdot\rVert_2\) are equivalent.

This result looks natural but is not easy to prove directly, since one finds no obvious way to build a bridge between the limit and a general inequality. But before that, we need to introduce some terminology.
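To see why completeness of both norms matters in Theorem 1, here is a small sketch with a standard example that is not taken from the post: on \(C([0,1])\) the sup norm and the \(L^1\) norm are compatible, and the \(L^1\) norm is weaker than the sup norm, yet they are not equivalent (the space is not complete under the \(L^1\) norm). The function names below are hypothetical, and the norms of the monomials \(x^n\) are entered in closed form rather than computed numerically.

```python
from fractions import Fraction

def sup_norm(n):
    """Sup norm of x**n on [0, 1]: the maximum is attained at x = 1."""
    return Fraction(1)

def l1_norm(n):
    """L^1 norm of x**n on [0, 1]: the integral of x**n is 1/(n+1)."""
    return Fraction(1, n + 1)

def constant_lower_bound(n):
    # Any K with sup_norm <= K * l1_norm must be at least this ratio.
    return sup_norm(n) / l1_norm(n)

# The required constant grows without bound, so no single K works.
ratios = [constant_lower_bound(n) for n in (1, 10, 100)]
print(ratios)
```

The ratios grow linearly in \(n\), so no constant \(K\) can bound the sup norm by the \(L^1\) norm.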

Preliminaries

(Definition) For \(f:X \to Y\), the graph of \(f\) is defined by \[ G(f)=\{(x,f(x)) \in X \times Y:x \in X\}. \]

If both \(X\) and \(Y\) are topological spaces, and the topology of \(X \times Y\) is the usual one, that is, the smallest topology that contains all sets \(U \times V\) where \(U\) and \(V\) are open in \(X\) and \(Y\) respectively, and if \(f: X \to Y\) is continuous, it is natural to expect \(G(f)\) to be closed. For example, by taking \(f(x)=x\) and \(X=Y=\mathbb{R}\), one would expect the diagonal line of the plane to be closed.

(Definition) The topological space \((X,\tau)\) is an \(F\)-space if \(\tau\) is induced by a complete invariant metric \(d\). Here invariant means that \(d(x+z,y+z)=d(x,y)\) for all \(x,y,z \in X\).

A Banach space is easily verified to be an \(F\)-space by defining \(d(x,y)=\lVert x-y \rVert\).

(Open mapping theorem) See this post

By definition of closed set, we have a practical criterion on whether \(G(f)\) is closed.

(Proposition 1) Suppose \(X\) and \(Y\) are metrizable. Then \(G(f)\) is closed if and only if, for any sequence \((x_n)\) in \(X\) such that the limits \[ x=\lim_{n \to \infty}x_n \quad \text{ and }\quad y=\lim_{n \to \infty}f(x_n) \] exist, we have \(y=f(x)\).

In this case, we say \(f\) is closed. For continuous functions, things are trivial.

(Proposition 2) If \(X\) and \(Y\) are two topological spaces and \(Y\) is Hausdorff, and \(f:X \to Y\) is continuous, then \(G(f)\) is closed.

Proof. Let \(G^c\) be the complement of \(G(f)\) with respect to \(X \times Y\). Fix \((x_0,y_0) \in G^c\); then \(y_0 \neq f(x_0)\). By the Hausdorff property of \(Y\), there exist open subsets \(U \subset Y\) and \(V \subset Y\) such that \(y_0 \in U\), \(f(x_0) \in V\) and \(U \cap V = \varnothing\). Since \(f\) is continuous, \(W=f^{-1}(V)\) is open in \(X\). We obtain an open neighborhood \(W \times U\) of \((x_0,y_0)\) which has empty intersection with \(G(f)\): if \(x \in W\) then \(f(x) \in V\), so \(f(x) \notin U\). That is to say, every point of \(G^c\) has an open neighborhood contained in \(G^c\), hence is an interior point. Therefore \(G^c\) is open, which is to say that \(G(f)\) is closed. \(\square\)

[Figure: closed-graph]

REMARKS. For \(X \times Y=\mathbb{R} \times \mathbb{R}\), we have a simple visualization. For \(\varepsilon>0\), there exists some \(\delta\) such that \(|f(x)-f(x_0)|<\varepsilon\) whenever \(|x-x_0|<\delta\). For \(y_0 \neq f(x_0)\), pick \(\varepsilon\) such that \(0<\varepsilon<\frac{1}{2}|f(x_0)-y_0|\), we have two boxes (\(CDEF\) and \(GHJI\) on the picture), namely \[ B_1=\{(x,y):x_0-\delta<x<x_0+\delta,f(x_0)-\varepsilon<y<f(x_0)+\varepsilon\} \] and \[ B_2=\{(x,y):x_0-\delta<x<x_0+\delta,y_0-\varepsilon<y<y_0+\varepsilon\}. \] In this case, \(B_2\) will not intersect the graph of \(f\), hence \((x_0,y_0)\) is an interior point of \(G^c\).

The Hausdorff property of \(Y\) is not removable. To see this, since \(X\) has no restriction, it suffices to take a look at \(X \times X\). Let \(f\) be the identity map (which is continuous); the graph \[ G(f)=\{(x,x):x \in X\} \] is the diagonal. Suppose \(X\) is not Hausdorff. By definition, there exist distinct \(x\) and \(y\) such that every neighborhood \(U\) of \(x\) meets every neighborhood \(V\) of \(y\). Then \((x,y) \in G^c\), but every basic neighborhood \(U \times V\) of \((x,y)\) in \(X \times X\) contains a point \((z,z)\) with \(z \in U \cap V\). So \((x,y)\) is not an interior point of \(G^c\), hence \(G^c\) is not open and \(G(f)\) is not closed.
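As a sanity check on a finite example, the sketch below tests whether the diagonal of \(X \times X\) is closed for a topology given as an explicit list of open sets. The helper name `diagonal_is_closed` and the two sample topologies (the Sierpinski space and the discrete two-point space) are choices made for this illustration, not part of the argument above.

```python
from itertools import product

def diagonal_is_closed(points, opens):
    """Check whether {(x, x)} is closed in the product topology on points x points.

    `opens` lists the open sets of a topology on `points` as frozensets.
    Boxes U x V form a basis of the product topology, and a box misses
    the diagonal exactly when U and V are disjoint.
    """
    for x, y in product(points, repeat=2):
        if x == y:
            continue
        separated = any(
            x in U and y in V and not (U & V) for U in opens for V in opens
        )
        if not separated:
            return False  # (x, y) is not interior to the complement
    return True

points = {0, 1}
sierpinski = [frozenset(), frozenset({1}), frozenset({0, 1})]
discrete = [frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})]

print(diagonal_is_closed(points, sierpinski))
print(diagonal_is_closed(points, discrete))
```

The Sierpinski space is not Hausdorff and its diagonal fails to be closed, while the discrete space passes the check.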

Also, since polynomials are continuous, every affine algebraic variety in \(\mathbb{C}^n\) and \(\mathbb{R}^n\) is closed with respect to the Euclidean topology. Further, we have the Zariski topology \(\mathcal{Z}\), defined by declaring that, if \(V\) is an affine algebraic variety, then \(V^c \in \mathcal{Z}\). It's worth noting that \(\mathcal{Z}\) is not Hausdorff (example?) and is in fact much coarser than the Euclidean topology, although an affine algebraic variety is closed in both the Zariski topology and the Euclidean topology.

The closed graph theorem

After we have proved this theorem, we are able to prove the theorem about compatible norms. We shall assume that both \(X\) and \(Y\) are \(F\)-spaces, since the norm plays no critical role here. This offers greater generality and should not be considered an abuse of abstraction.

(The Closed Graph Theorem) Suppose

  1. \(X\) and \(Y\) are \(F\)-spaces,

  2. \(f:X \to Y\) is linear,

  3. \(G(f)\) is closed in \(X \times Y\).

Then \(f\) is continuous.

In short, the closed graph theorem gives a sufficient condition to claim the continuity of \(f\) (keep in mind, linearity does not imply continuity). If \(f:X \to Y\) is continuous, then \(G(f)\) is closed; if \(G(f)\) is closed and \(f\) is linear, then \(f\) is continuous.

Proof. First of all we should make \(X \times Y\) an \(F\)-space by assigning addition, scalar multiplication and metric. Addition and scalar multiplication are defined componentwise in the natural way: \[ \alpha(x_1,y_1)+\beta(x_2,y_2)=(\alpha x_1+\beta x_2,\alpha y_1 + \beta y_2). \] The metric can be defined without extra effort: \[ d((x_1,y_1),(x_2,y_2))=d_X(x_1,x_2)+d_Y(y_1,y_2). \] Then it can be verified that \(d\) is a complete translation-invariant metric inducing the product topology, so \(X \times Y\) is an \(F\)-space. (Potentially the verifications will be added in the future but it's recommended to do it yourself.)


Since \(f\) is linear, the graph \(G(f)\) is a subspace of \(X \times Y\). Next we quote an elementary result in point-set topology: a subset of a complete metric space is closed if and only if it's complete. Since \(G(f)\) is closed and \(d\) is translation invariant, we see \(G(f)\) is an \(F\)-space as well. Let \(p_1: X \times Y \to X\) and \(p_2: X \times Y \to Y\) be the natural projections respectively (for example, \(p_1(x,y)=x\)). Our proof is done by verifying the properties of \(p_1\) and \(p_2\) on \(G(f)\).

For simplicity one can simply define \(p_1\) on \(G(f)\) instead of the whole space \(X \times Y\), but we make it a global projection on purpose to emphasize the difference between global properties and local properties. One can also write \(p_1|_{G(f)}\) to dodge confusion.


Claim 1. \(p_1\) (with restriction on \(G(f)\)) defines an isomorphism between \(G(f)\) and \(X\).

For \(x \in X\), we see \(p_1(x,f(x)) = x\) (surjectivity). If \(p_1(x,f(x))=0\), we see \(x=0\) and therefore \((x,f(x))=(0,0)\), hence the restriction of \(p_1\) to \(G(f)\) has trivial kernel (injectivity). Further, it's trivial that \(p_1\) is linear.

Claim 2. \(p_1\) is continuous on \(G(f)\).

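Suppose \((x_n,f(x_n)) \to (x,f(x))\) in \(G(f)\). By the definition of the metric on \(X \times Y\), \[ d_X(p_1(x_n,f(x_n)),p_1(x,f(x)))=d_X(x_n,x) \leq d((x_n,f(x_n)),(x,f(x))) \to 0, \] and therefore \(\lim_{n \to \infty}p_1(x_n,f(x_n)) =x=p_1(x,f(x))\). The continuity of \(p_1\) is proved. (Note that the closedness of \(G(f)\) is not needed here; it was used to make \(G(f)\) complete.)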

Claim 3. \(p_1\) is a homeomorphism with restriction on \(G(f)\).

We already know that \(G(f)\) is an \(F\)-space, and so is \(X\). The image \(p_1(G(f))=X\) is of the second category in \(X\) (since \(X\) is a complete metric space, by the Baire category theorem), and \(p_1\) is continuous and linear on \(G(f)\). By the open mapping theorem, \(p_1\) is an open mapping on \(G(f)\). Being also continuous and bijective, it is a homeomorphism.

Claim 4. \(p_2\) is continuous.

This follows directly from the definition of the metric on \(X \times Y\), since \(d_Y(p_2(x,y),p_2(x',y'))=d_Y(y,y') \leq d((x,y),(x',y'))\). Here we have no need to care about \(f\).


Now things are immediate once one realizes that \(f=p_2 \circ p_1|_{G(f)}^{-1}\), and hence \(f\) is continuous. \(\square\)

Applications

Before we go for Theorem 1 from the beginning of this post, we mention an application to Hilbert spaces.

Let \(T\) be a bounded operator on the Hilbert space \(L_2([0,1])\) so that if \(\phi \in L_2([0,1])\) is a continuous function so is \(T\phi\). Then the restriction of \(T\) to \(C([0,1])\) is a bounded operator of \(C([0,1])\).

For details please check this.


Now we go for the identification of norms. Define \[ \begin{aligned} f:(X,\lVert\cdot\rVert_1) &\to (X,\lVert\cdot\rVert_2) \\ x &\mapsto x \end{aligned} \] i.e. the identity map between two Banach spaces (hence \(F\)-spaces). Then \(f\) is linear. We need to prove that \(G(f)\) is closed. Suppose \((x_n)\) is a sequence such that \[ \lim_{n \to \infty}\lVert x_n -x \rVert_1=0 \quad\text{and}\quad \lim_{n \to \infty} \lVert f(x_n)-y \rVert_2=\lim_{n \to \infty}\lVert x_n -y\rVert_2=0. \] Since the two norms are compatible, we have \(y=x=f(x)\). Hence \(G(f)\) is closed by Proposition 1. Therefore \(f\) is continuous, hence bounded, and we have some \(K\) such that \[ \lVert x \rVert_2 =\lVert f(x) \rVert_2 \leq K \lVert x \rVert_1. \] By defining \[ \begin{aligned} g:(X,\lVert\cdot\rVert_2) &\to (X,\lVert\cdot\rVert_1) \\ x &\mapsto x \end{aligned} \] the same argument (compatibility is symmetric in the two norms) shows that \(g\) is continuous as well, hence we have some \(K'\) such that \[ \lVert x \rVert_1 =\lVert g(x) \rVert_1 \leq K'\lVert x \rVert_2. \] Hence the two norms are weaker than each other, that is, they are equivalent.

The series

Since there is no strong reason to write more posts on this topic, i.e. the three fundamental theorems of linear functional analysis, I think it's time to make a list of the series. It's been around half a year.

References

  • Walter Rudin, Functional Analysis
  • Peter Lax, Functional Analysis
  • Jesús Gil de Lamadrid, Some Simple Applications of the Closed Graph Theorem

The Big Three Pt. 5 - The Hahn-Banach Theorem (Dominated Extension)

About this post

The Hahn-Banach theorem has been a central tool in functional analysis and therefore comes in a wide variety of versions, many of which have numerous uses in other fields of mathematics. Therefore it's not possible to cover all of them. In this post we are covering two 'abstract enough' results, which are sometimes called the dominated extension theorems. Both of them will be discussed in vector spaces where no topology is endowed. This allows us to apply them to any topological vector space.

Another interesting thing is that we will be using the axiom of choice, or whatever equivalent form you may like, for example Zorn's lemma or the well-ordering principle. Before everything, we need to examine more properties of vector spaces.

Vector space

It's obvious that every complex vector space is also a real vector space. Suppose \(X\) is a complex vector space, and we shall give the definition of real-linear and complex-linear functionals.

An additive functional \(\Lambda\) on \(X\) is called real-linear (complex-linear) if \(\Lambda(\alpha x)=\alpha\Lambda(x)\) for every \(x \in X\) and for every real (complex) scalar \(\alpha\).

For *-linear functionals, we have two important but easy theorems.

If \(u\) is the real part of a complex-linear functional \(f\) on \(X\), then \(u\) is real-linear and \[ f(x)=u(x)-iu(ix) \quad (x \in X). \]

Proof. Write \(f(x)=u(x)+iv(x)\) with \(u\) and \(v\) real-valued; it suffices to express \(v(x)\) correctly. Since \[ if(x)=iu(x)-v(x), \] we see \(\Im(f(x))=v(x)=-\Re(if(x))\). Therefore \[ f(x)=u(x)-i\Re(if(x))=u(x)-i\Re(f(ix)), \] and since \(\Re(f(ix))=u(ix)\), we get \[ f(x)=u(x)-iu(ix). \] To show that \(u\) is real-linear, note that \[ f(x+y)=u(x+y)+iv(x+y)=f(x)+f(y)=u(x)+u(y)+i(v(x)+v(y)). \] Therefore \(u(x)+u(y)=u(x+y)\). A similar argument applies to a real scalar \(\alpha\). \(\square\)

Conversely, we are able to generate a complex-linear functional by a real one.

If \(u\) is a real-linear functional, then \(f(x)=u(x)-iu(ix)\) is a complex-linear functional.

Proof. Direct computation. \(\square\)
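The identity \(f(x)=u(x)-iu(ix)\) can be checked on a concrete case. The sketch below assumes \(X=\mathbb{C}^2\) and a hypothetical functional \(f(x)=c_1x_1+c_2x_2\) with hand-picked coefficients; the entries are small integers so that the floating-point arithmetic is exact.

```python
# Hypothetical coefficients of a complex-linear functional on C^2.
c = (2 + 3j, -1 + 1j)

def f(x):
    """Complex-linear functional f(x) = c_1 x_1 + c_2 x_2."""
    return sum(ck * xk for ck, xk in zip(c, x))

def u(x):
    """Real part of f, a real-linear functional."""
    return f(x).real

def rebuilt(x):
    """Recover f from its real part via f(x) = u(x) - i u(ix)."""
    ix = tuple(1j * xk for xk in x)
    return u(x) - 1j * u(ix)

x = (1 - 2j, 4 + 1j)
print(f(x), rebuilt(x))
```

Both printed values agree, which is what the two statements above predict.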

Suppose now \(X\) is a complex topological vector space, we see a complex-linear functional on \(X\) is continuous if and only if its real part is continuous. Every continuous real-linear \(u: X \to \mathbb{R}\) is the real part of a unique complex-linear continuous functional \(f\).

Sublinear, seminorm

A sublinear functional is 'almost' linear but also 'almost' a norm. Explicitly, we say \(p: X \to \mathbb{R}\) is a sublinear functional when it satisfies \[ \begin{aligned} p(x+y) &\leq p(x)+p(y) \\ p(tx) &= tp(x) \end{aligned} \] for all \(x,y \in X\) and all \(t \geq 0\). As one can see, if \(X\) is normable, then \(p(x)=\lVert x \rVert\) is a sublinear functional. This should not be confused with a semilinear functional, where no inequality is involved. Another thing worth noting is that \(p\) is not restricted to be nonnegative.


A seminorm on a vector space \(X\) is a real-valued function \(p\) on \(X\) such that \[ \begin{aligned} p(x+y) &\leq p(x)+p(y) \\ p(\alpha x)&=|\alpha|p(x) \end{aligned} \] for all \(x,y \in X\) and scalar \(\alpha\).

Obviously a seminorm is also a sublinear functional. For the connection between norm and seminorm, note that a seminorm \(p\) is a norm if and only if \(p(x) \neq 0\) whenever \(x \neq 0\).
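To make the difference between a sublinear functional and a seminorm concrete, here is a small check on \(\mathbb{R}^2\) with \(p(x)=\max(x_1,x_2)\). This example is chosen for illustration and does not come from the post; the properties are only tested on a finite integer grid, which is evidence rather than a proof.

```python
from itertools import product

def p(v):
    """Candidate sublinear functional on R^2: the larger coordinate."""
    return max(v)

grid = list(product(range(-3, 4), repeat=2))

# p(x + y) <= p(x) + p(y) on the grid
subadditive = all(
    p((x[0] + y[0], x[1] + y[1])) <= p(x) + p(y) for x in grid for y in grid
)
# p(t x) = t p(x) for t >= 0 on the grid
positively_homogeneous = all(
    p((t * x[0], t * x[1])) == t * p(x) for x in grid for t in range(0, 4)
)
# A sublinear functional may take negative values ...
takes_negative_value = p((-1, -2)) < 0
# ... and need not satisfy p(-x) = p(x), so it is not a seminorm.
symmetric = all(p((-x[0], -x[1])) == p(x) for x in grid)

print(subadditive, positively_homogeneous, takes_negative_value, symmetric)
```

So \(p\) passes both sublinearity checks, takes a negative value, and fails the symmetry \(p(-x)=p(x)\) that every seminorm has.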

Dominated extension theorems

These are the results that will be covered in this post. Generally speaking, we are able to extend a functional defined on a subspace to the whole space as long as it's dominated by a sublinear functional. This is loosely reminiscent of the dominated convergence theorem, which states that if a pointwise convergent sequence of measurable functions is dominated by an integrable function, then the limit can be taken under the integral sign.

(Hahn-Banach) Suppose

  1. \(M\) is a subspace of a real vector space \(X\),
  2. \(f: M \to \mathbb{R}\) is linear and \(f(x) \leq p(x)\) on \(M\), where \(p\) is a sublinear functional on \(X\).

Then there exists a linear \(\Lambda: X \to \mathbb{R}\) such that \[ \Lambda(x)=f(x) \] for all \(x \in M\) and \[ -p(-x) \leq \Lambda(x) \leq p(x) \] for all \(x \in X\).

Step 1 - Extending the function by one dimension

In other words, if \(f\) is dominated by a sublinear functional on \(M\), then we are able to extend it to the whole space while keeping its values in a controlled range.

Proof. If \(M=X\) we have nothing to do. So suppose now \(M\) is a proper subspace of \(X\). Choose \(x_1 \in X-M\) and define \[ M_1=\{x+tx_1:x \in M,t \in \mathbb{R}\}. \] It's easy to verify that \(M_1\) satisfies all axioms of a vector space (warning again: no topology is endowed). Now we will be using the properties of sublinear functionals.

Since \[ f(x)+f(y)=f(x+y) \leq p(x+y) \leq p(x-x_1)+p(x_1+y) \] for all \(x,y \in M\), we have \[ f(x)-p(x-x_1) \leq p(x_1+y) -f(y). \] Let \[ \alpha=\sup_{x}\{f(x)-p(x-x_1):x \in M\}, \] which is finite because the left-hand side above is bounded by \(p(x_1+y)-f(y)\) for any fixed \(y\). By definition, we naturally get \[ f(x)-\alpha \leq p(x-x_1) \] and \[ f(y)+\alpha \leq p(x_1+y). \] Define \(f_1\) on \(M_1\) by \[ f_1(x+tx_1)=f(x)+t\alpha. \] When \(x +tx_1 \in M\), we have \(t=0\), and therefore \(f_1=f\) on \(M\).

To show that \(f_1 \leq p\) on \(M_1\), note that for \(t>0\), we have \[ f(x/t)-\alpha \leq p(x/t-x_1), \] which, after multiplying by \(t\), implies \[ f(x)-t\alpha=f_1(x-tx_1)\leq p(x-tx_1). \] Similarly, \[ f(y/t)+\alpha \leq p(y/t+x_1), \] and therefore \[ f(y)+t\alpha=f_1(y+tx_1) \leq p(y+tx_1). \] Hence \(f_1 \leq p\).
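A tiny worked instance of Step 1 may help. The sketch assumes \(X=\mathbb{R}^2\), \(M=\{(t,0)\}\), \(f(t,0)=t\), \(p(x)=|x_1|+|x_2|\) and \(x_1=(0,1)\); all of these are choices made for this illustration. The supremum defining \(\alpha\) is taken over a finite integer grid, which happens to be exact here because the supremum is attained for \(t \geq 0\).

```python
def p(v):
    """Sublinear functional on R^2 (here the l^1 norm, an assumption)."""
    return abs(v[0]) + abs(v[1])

def f(t):
    """Linear functional on M = {(t, 0)}, identified with t; f <= p on M."""
    return t

x1 = (0, 1)          # vector outside M used for the one-dimensional extension
grid = range(-5, 6)  # finite sample of M (and of the coefficient t)

# alpha = sup over M of f(x) - p(x - x1), evaluated on the grid
alpha = max(f(t) - p((t - x1[0], 0 - x1[1])) for t in grid)
# matching upper bound: inf over M of p(x1 + y) - f(y)
upper = min(p((x1[0] + t, x1[1] + 0)) - f(t) for t in grid)

def f1(s, t):
    """Extension to M_1 = R^2: f1((s, 0) + t * x1) = f(s) + t * alpha."""
    return f(s) + t * alpha

dominated = all(f1(s, t) <= p((s, t)) for s in grid for t in grid)
print(alpha, upper, dominated)
```

Any value between `alpha` and `upper` would give a dominated extension; the proof simply picks the smallest admissible one.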

Step 2 - An application of Zorn's lemma

Side note: Why Zorn's lemma

It seems that we can never stop using step 1 to extend \(M\) to a larger space, but we have to extend. (If \(X\) is a finite dimensional space, then this is merely a linear algebra problem.) This meets exactly what William Timothy Gowers said in his blog post:

If you are building a mathematical object in stages and find that (i) you have not finished even after infinitely many stages, and (ii) there seems to be nothing to stop you continuing to build, then Zorn’s lemma may well be able to help you.

-- How to use Zorn's lemma

And we will show that, as W. T. Gowers said,

If the resulting partial order satisfies the chain condition and if a maximal element must be a structure of the kind one is trying to build, then the proof is complete.


To apply Zorn's lemma, we need to construct a partially ordered set. Let \(\mathscr{P}\) be the collection of all ordered pairs \((M',f')\) where \(M'\) is a subspace of \(X\) containing \(M\) and \(f'\) is a linear functional on \(M'\) that extends \(f\) and satisfies \(f' \leq p\) on \(M'\). For example we have \[ (M,f) , (M_1,f_1) \in \mathscr{P}. \] The partial order \(\leq\) is defined as follows. By \((M',f') \leq (M'',f'')\), we mean \(M' \subset M''\) and \(f' = f''\) on \(M'\). Obviously this is a partial order (you should be able to check this).

Suppose now \(\mathcal{F}\) is a chain (totally ordered subset of \(\mathscr{P}\)). We claim that \(\mathcal{F}\) has an upper bound (which is required by Zorn's lemma). Let \[ M_0=\bigcup_{(M',f') \in \mathcal{F}}M' \] and \[ f_0(y)=f'(y) \] whenever \((M',f') \in \mathcal{F}\) and \(y \in M'\). Since \(\mathcal{F}\) is totally ordered, \(M_0\) is a subspace and \(f_0\) is well defined, and it's easy to verify that \((M_0,f_0)\) is the upper bound we are looking for. But \(\mathcal{F}\) is arbitrary, therefore by Zorn's lemma, there exists a maximal element \((M^\ast,f^\ast)\) in \(\mathscr{P}\). If \(M^\ast \neq X\), according to step 1, we are able to extend \((M^\ast,f^\ast)\), which contradicts its maximality. Hence \(M^\ast=X\), and \(\Lambda\) is defined to be \(f^\ast\), so that \(\Lambda \leq p\) on \(X\). By the linearity of \(\Lambda\), we see \[ -p(-x) \leq -\Lambda(-x)=\Lambda{x}. \] The theorem is proved. \(\square\)

How this proof is constructed

This is a classic application of Zorn's lemma (or, equivalently, the well-ordering principle or the Hausdorff maximality theorem). First, we showed that we are able to extend \(M\) and \(f\). But since we do not know the dimension or other properties of \(X\), it's not easy to control the extension so that it finally 'converges' to \((X,\Lambda)\). However, Zorn's lemma saves us from this random exploration: whatever happens, a maximal element is there, and we take it to finish the proof.

Generalisation onto the complex field

Since an inequality appears in the theorem above, the complex case needs more careful treatment.

(Bohnenblust-Sobczyk-Soukhomlinoff) Suppose \(M\) is a subspace of a vector space \(X\), \(p\) is a seminorm on \(X\), and \(f\) is a linear functional on \(M\) such that \[ |f(x)| \leq p(x) \] for all \(x \in M\). Then \(f\) extends to a linear functional \(\Lambda\) on \(X\) satisfying \[ |\Lambda (x)| \leq p(x) \] for all \(x \in X\).

Proof. If the scalar field is \(\mathbb{R}\), then we are done, since \(p(-x)=p(x)\) in this case (can you see why?). So we assume the scalar field is \(\mathbb{C}\).

Put \(u = \Re f\). Then \(u\) is real-linear and \(u \leq |f| \leq p\) on \(M\). By the dominated extension theorem, there is some real-linear functional \(U\) on \(X\) such that \(U=u\) on \(M\) and \(U \leq p\) on \(X\). Define \[ \Lambda(x)=U(x)-iU(ix); \] then \(\Lambda\) is complex-linear and \(\Lambda=f\) on \(M\).

To show that \(|\Lambda(x)| \leq p(x)\), we may assume \(\Lambda(x) \neq 0\). Taking \(\alpha=\frac{|\Lambda(x)|}{\Lambda(x)}\), we see that \(\Lambda(\alpha x)=\alpha\Lambda(x)=|\Lambda(x)|\) is real, so \[ |\Lambda(x)|=\Lambda(\alpha{x})=U(\alpha{x})\leq p(\alpha x)=p(x) \] since \(|\alpha|=1\) and \(p(\alpha{x})=|\alpha|p(x)=p(x)\). \(\square\)

Extending Hahn-Banach theorem under linear transform

To end this post, we state a beautiful and useful extension of the Hahn-Banach theorem, which is due to R. P. Agnew and A. P. Morse.

(Agnew-Morse) Let \(X\) denote a real vector space and \(\mathcal{A}\) be a collection of linear maps \(A_\alpha: X \to X\) that commute, or namely \[ A_\alpha A_\beta=A_\beta A_\alpha \] for all \(A_\alpha,A_\beta \in \mathcal{A}\). Let \(p\) be a sublinear functional such that \[ p(A_\alpha{x})=p(x) \] for all \(A_\alpha \in \mathcal{A}\). Let \(Y\) be a subspace of \(X\) on which a linear functional \(f\) is defined such that

  1. \(f(y) \leq p(y)\) for all \(y \in Y\).
  2. For each \(A \in \mathcal{A}\) and \(y \in Y\), we have \(Ay \in Y\).
  3. For each \(A \in \mathcal{A}\) and \(y \in Y\), we have \(f(Ay)=f(y)\).

Then \(f\) can be extended to \(X\) by \(\Lambda\) so that \(-p(-x) \leq \Lambda(x) \leq p(x)\) for all \(x \in X\), and \[ \Lambda(A_\alpha{x})=\Lambda{x}. \]

To prove this theorem, we need to construct a sublinear functional that dominates \(f\). For the whole proof, see Functional Analysis by Peter Lax.


References / Further Readings

  1. Walter Rudin, Functional Analysis.
  2. Peter Lax, Functional Analysis.
  3. William Timothy Gowers, How to use Zorn's lemma.

The Big Three Pt. 4 - The Open Mapping Theorem (F-Space)

The Open Mapping Theorem

We are finally going to prove the open mapping theorem for \(F\)-spaces. In this version, only an invariant metric and completeness are required. Therefore it contains the Banach space version naturally.

(Theorem 0) Suppose we have the following conditions:

  1. \(X\) is an \(F\)-space,
  2. \(Y\) is a topological vector space,
  3. \(\Lambda: X \to Y\) is continuous and linear, and
  4. \(\Lambda(X)\) is of the second category in \(Y\).

Then \(\Lambda\) is an open mapping.

Proof. Let \(B\) be a neighborhood of \(0\) in \(X\). Let \(d\) be an invariant metric on \(X\) that is compatible with the \(F\)-topology of \(X\). Define a sequence of balls by \[ B_n=\{x:d(x,0) < \frac{r}{2^n}\} \] where \(r\) is picked in such a way that \(B_0 \subset B\). To show that \(\Lambda\) is an open mapping, we need to prove that there exists some neighborhood \(W\) of \(0\) in \(Y\) such that \[ W \subset \Lambda(B). \] To do this however, we need an auxiliary set. In fact, we will show that there exists some \(W\) such that \[ W \subset \overline{\Lambda(B_1)} \subset \Lambda(B). \] We need to prove the inclusions one by one.


The first inclusion requires the category hypothesis. Since \(B_2 -B_2 \subset B_1\), and \(Y\) is a topological vector space, we get \[ \overline{\Lambda(B_2)}-\overline{\Lambda(B_2)} \subset \overline{\Lambda(B_2)-\Lambda(B_2)} \subset \overline{\Lambda(B_1)}. \] Since \[ \Lambda(X)=\bigcup_{k=1}^{\infty}k\Lambda(B_2) \] is of the second category in \(Y\), at least one \(k\Lambda(B_2)\) is of the second category in \(Y\). But scalar multiplication \(y\mapsto ky\) is a homeomorphism of \(Y\) onto \(Y\), so \(k\Lambda(B_2)\) is of the second category for all \(k\), especially for \(k=1\). Therefore \(\overline{\Lambda(B_2)}\) has nonempty interior, which together with the inclusion above implies that there exists some open neighborhood \(W\) of \(0\) in \(Y\) such that \(W \subset \overline{\Lambda(B_1)}\). By replacing the index, it's easy to see this holds for all \(n\). That is, for \(n \geq 1\), there exists some neighborhood \(W_n\) of \(0\) in \(Y\) such that \(W_n \subset \overline{\Lambda(B_n)}\).


The second inclusion requires the completeness of \(X\). Fix \(y_1 \in \overline{\Lambda(B_1)}\); we will show that \(y_1 \in \Lambda(B)\). Pick \(y_n\) inductively. Assume \(y_n\) has been chosen in \(\overline{\Lambda(B_n)}\). As stated before, there exists some neighborhood \(W_{n+1}\) of \(0\) in \(Y\) such that \(W_{n+1} \subset \overline{\Lambda(B_{n+1})}\). Hence \[ (y_n-W_{n+1}) \cap \Lambda(B_n) \neq \varnothing. \] Therefore there exists some \(x_n \in B_n\) such that \[ \Lambda x_n \in y_n - W_{n+1}. \] Put \(y_{n+1}=y_n-\Lambda x_n\); we see \(y_{n+1} \in W_{n+1} \subset \overline{\Lambda(B_{n+1})}\). Therefore we are able to pick \(y_n\) for all \(n \geq 1\).

Since \(d(x_n,0)<\frac{r}{2^n}\) for all \(n \geq 1\), the sums \(z_n=\sum_{k=1}^{n}x_k\) form a Cauchy sequence and converge to some \(z \in X\), since \(X\) is complete. Notice we also have \[ \begin{aligned} d(z,0)& \leq d(x_1,0)+d(x_2,0)+\cdots \\ & < \frac{r}{2}+\frac{r}{4}+\cdots \\ & = r, \end{aligned} \] so \(z \in B_0 \subset B\).

Since \(y_n \in \overline{\Lambda(B_n)}\) and the balls \(B_n\) shrink to \(0\), the continuity of \(\Lambda\) gives \(\lim_{n \to \infty}y_n = 0\). Notice we also have \[ \sum_{k=1}^{n} \Lambda x_k = \sum_{k=1}^{n}(y_k-y_{k+1})=y_1-y_{n+1} \to y_1 \quad (n \to \infty), \] so \(y_1 = \Lambda z \in \Lambda(B)\).

The whole theorem is now proved, that is, \(\Lambda\) is an open mapping. \(\square\)

Remarks

You may think the following relation comes from nowhere: \[ (y_n - W_{n+1}) \cap \Lambda(B_{n}) \neq \varnothing. \] But it does not. We need to review some point-set topology definitions. Notice that \(y_n\) lies in the closure of \(\Lambda(B_n)\), and \(y_n-W_{n+1}\) is an open neighborhood of \(y_n\). If \((y_n - W_{n+1}) \cap \Lambda(B_{n})\) were empty, then \(y_n\) could not be in the closure.

The geometric series \[ \frac{\varepsilon}{2}+\frac{\varepsilon}{4}+\cdots+\frac{\varepsilon}{2^n}+\cdots=\varepsilon \] is widely used when infinite sums need to be controlled. It is a good idea to keep this technique in mind.

Corollaries

The formal proofs will not be written down here, but they are quite easy to carry out.

(Corollary 0) \(\Lambda(X)=Y\).

This is an immediate consequence of the fact that \(\Lambda\) is open. Since \(X\) is open in itself, \(\Lambda(X)\) is an open subspace of \(Y\). But the only open subspace of \(Y\) is \(Y\) itself.

(Corollary 1) \(Y\) is an \(F\)-space as well.

If you have already seen the commutative diagram for the quotient space (put \(N=\ker\Lambda\)), you know that the induced map \(f\) is open and continuous. By treating topological vector spaces as groups, by corollary 0 and the first isomorphism theorem, we have \[ X/\ker\Lambda \simeq \Lambda(X)=Y. \] Therefore \(f\) is an isomorphism, hence one-to-one. Therefore \(f\) is a homeomorphism as well. In this post we showed that \(X/\ker{\Lambda}\) is an \(F\)-space, therefore \(Y\) has to be an \(F\)-space as well. (We are using the fact that \(\ker{\Lambda}\) is a closed set. But why closed?)

(Corollary 2) If \(\Lambda\) is a continuous linear mapping of an \(F\)-space \(X\) onto an \(F\)-space \(Y\), then \(\Lambda\) is open.

This is a direct application of BCT and open mapping theorem. Notice that \(Y\) is now of the second category.

(Corollary 3) If the linear map \(\Lambda\) in Corollary 2 is injective, then \(\Lambda^{-1}:Y \to X\) is continuous.

This comes from corollary 2 directly since \(\Lambda\) is open.

(Corollary 4) If \(X\) and \(Y\) are Banach spaces, and if \(\Lambda: X \to Y\) is a continuous linear bijective map, then there exist positive real numbers \(a\) and \(b\) such that \[ a \lVert x \rVert \leq \lVert \Lambda{x} \rVert \leq b\lVert x \rVert \] for every \(x \in X\).

This comes from corollary 3 directly since both \(\Lambda\) and \(\Lambda^{-1}\) are bounded as they are continuous.

(Corollary 5) If \(\tau_1 \subset \tau_2\) are vector topologies on a vector space \(X\) and if both \((X,\tau_1)\) and \((X,\tau_2)\) are \(F\)-spaces, then \(\tau_1 = \tau_2\).

This is obtained by applying corollary 3 to the identity mapping \(\iota:(X,\tau_2) \to (X,\tau_1)\).

(Corollary 6) If \(\lVert \cdot \rVert_1\) and \(\lVert \cdot \rVert_2\) are two norms in a vector space \(X\) such that

  • \(\lVert\cdot\rVert_1 \leq K\lVert\cdot\rVert_2\) for some \(K>0\),
  • \((X,\lVert\cdot\rVert_1)\) and \((X,\lVert\cdot\rVert_2)\) are Banach.

Then \(\lVert\cdot\rVert_1\) and \(\lVert\cdot\rVert_2\) are equivalent.

This is merely a more restrictive version of corollary 5.


The Big Three Pt. 3 - The Open Mapping Theorem (Banach Space)

What is open mapping

An open map is a function between two topological spaces that maps open sets to open sets. Precisely speaking, a function \(f: X \to Y\) is open if for any open set \(U \subset X\), \(f(U)\) is open in \(Y\). Likewise, a closed map is a function mapping closed sets to closed sets.

You may think open/closed map is an alternative name of continuous function. But it's not. The definition of open/closed mapping is totally different from continuity. Here are some simple examples.

  1. \(f(x)=\sin{x}\) defined on \(\mathbb{R}\) is not open, though it's continuous. It can be verified by considering \((0,2\pi)\), since we have \(f((0,2\pi))=[-1,1]\).
  2. The projection \(\pi: \mathbb{R}^2 \to \mathbb{R}\) defined by \((x,y) \mapsto x\) is open. Indeed, it maps an open ball onto an open interval on \(x\) axis.
  3. The inclusion map \(\varphi: \mathbb{R} \to \mathbb{R}^2\) by \(x \mapsto (x,0)\) however, is not open. An open interval on the plane is locally closed but not open or closed.
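The first example in the list can be checked numerically. The sketch below only evaluates \(\sin\) at two hand-picked points of \((0,2\pi)\); since \(|\sin x| \leq 1\) everywhere, attaining \(\pm 1\) inside the open interval means the image contains its endpoints and so cannot be open.

```python
import math

# Two points that lie strictly inside the open interval (0, 2*pi).
inside = [math.pi / 2, 3 * math.pi / 2]
values = [math.sin(x) for x in inside]

# sin attains its extreme values 1 and -1 at these interior points,
# so the image of (0, 2*pi) contains the endpoints of [-1, 1].
attains_max = math.isclose(values[0], 1.0, abs_tol=1e-12)
attains_min = math.isclose(values[1], -1.0, abs_tol=1e-12)
print(attains_max, attains_min)
```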

Under what condition will a continuous linear function between two TVS be an open mapping? We'll give the answer in this blog post. The open mapping theorem gives a sufficient condition for a continuous linear function to be open.

Open Mapping Theorem

Let \(X,Y\) be Banach spaces and \(T: X \to Y\) a surjective bounded linear map. Then \(T\) is an open mapping.

The open balls in \(X\) and \(Y\) are defined respectively by \[ B_r^X=\{x \in X:\lVert x \rVert<r\}\quad\text{and}\quad B_r^Y=\{y \in Y:\lVert y \rVert<r\}. \] All we need to do is show that there exists some \(r>0\) such that \[ B_r^Y \subset T(B_1^X), \] since every open set in \(X\) or \(Y\) can be expressed as a union of open balls, and a ball in \(X\) centered at \(x \in X\) with radius \(r\) can be written as \(x+B_r^X\). After that, it becomes obvious that \(T\) maps open sets to open sets.

First we have \[ X=\bigcup_{n=1}^{\infty}B_n^{X}. \] The surjectivity of \(T\) ensures that \[ Y=\bigcup_{n=1}^{\infty}T(B_n^X). \] Since \(Y\) is Banach, or simply a complete metric space, by Baire category theorem, there must be some \(n_0 \in \mathbb{N}\) such that \(\overline{T(B_{n_0}^{X})}\) has nonempty interior. If not, which means \(T(B_n^{X})\) is nowhere dense for all \(n \in \mathbb{N}\), we have \(Y\) is of the first category. A contradiction.

Since \(T(B_n^X)=nT(B_1^X)\) and \(y \mapsto ny\) is a homeomorphism of \(Y\) onto \(Y\), we see in fact that \(\overline{T(B_n^X)}\) has nonempty interior for all \(n \in \mathbb{N}\). Therefore, there exist some \(y_0 \in \overline{T(B_1^{X})}\) and some \(\varepsilon>0\) such that \[ y_0+B_\varepsilon^Y \subset \overline{T(B_1^X)}; \] the open set on the left-hand side is a neighborhood of \(y_0\), so \(y_0\) is in the interior of \(\overline{T(B_1^X)}\).

On the other hand, we claim \[ \overline{T(B_1^X)} - y_0 \subset \overline{T(B_2^X)}. \] We shall prove it as follows. Pick any \(y \in \overline{T(B_1^X)}\); we shall show that \(y-y_0 \in \overline{T(B_2^X)}\). For \(y_0\), there exists a sequence \((x_n')\) in \(X\) with \(\lVert x_n' \rVert <1\) for all \(n\) such that \(Tx_n' \to y_0\). Also we are able to find a sequence \((x_n)\) in \(X\) with \(\lVert x_n \rVert <1\) for all \(n\) such that \(Tx_n \to y\). Notice that we also have \[ y-y_0=\lim_{n \to \infty}T(x_n-x_n'), \] and since \[ \lVert x_n -x_n' \rVert \leq \lVert x_n \rVert+\lVert x_n' \rVert <2, \] we see \(T(x_n-x_n') \in T(B_2^X)\) for all \(n\). It follows that \[ y-y_0 \in \overline{T(B_2^X)}. \] Combining all these relations, we get \[ B_\varepsilon^Y \subset \overline{T(B_1^X)}-y_0 \subset \overline{T(B_2^X)}. \] Since \(T\) is linear, we see \[ 2B_{\varepsilon/2}^{Y} \subset \overline{T(2B_1^X)}=2\overline{T(B_1^X)}. \] Scaling repeatedly, we get \[ B_{\varepsilon/2^n}^Y \subset \overline{T(B_{1/2^{n-1}}^X)} \] for all \(n \geq 1\).

We shall show, however, that \[ B_{\varepsilon/4}^Y \subset T(B_1^X). \] For any \(u \in B_{\varepsilon/4}^Y\), we have \(u \in \overline{T(B_{1/2}^X)}\). There exists some \(x_1 \in B_{1/2}^{X}\) such that \[ \lVert u-Tx_1 \rVert < \frac{\varepsilon}{8}. \] This implies that \(u-Tx_1 \in B_{\varepsilon/8}^Y \subset \overline{T(B_{1/4}^X)}\). In the same fashion, we are able to pick \(x_n\) inductively in such a way that \[ \lVert u-Tx_1-Tx_2-\cdots-Tx_n \rVert < \frac{\varepsilon}{2^{n+2}} \] where \(\lVert x_n \rVert<2^{-n}\). Now let \(z_n=\sum_{k=1}^{n}x_k\); we shall show that \((z_n)\) is Cauchy. For \(m<n\), we have \[ \lVert z_n - z_m \rVert =\left\Vert\sum_{k=m+1}^nx_k \right\Vert \leq \sum_{k=m+1}^{n}\lVert x_k\rVert < \frac{1}{2^{m}}. \] Since \(X\) is Banach, there exists some \(z \in X\) such that \(z_n \to z\). Further we have \[ \lVert z\rVert = \lim_{n \to \infty}\lVert z_n \rVert \leq \sum_{k=1}^{\infty}\lVert x_k \rVert < 1, \] therefore \(z \in B_1^X\). Since \(T\) is bounded, therefore continuous, we get \(Tz_n \to Tz\); on the other hand \(\lVert u-Tz_n \rVert<\varepsilon/2^{n+2} \to 0\), so \(T(z)=u\). To summarize, for every \(u \in B_{\varepsilon/4}^Y\), we have some \(z \in B_{1}^X\) such that \(T(z)=u\), which implies \(T(B_1^X) \supset B_{\varepsilon/4}^Y\).

Let \(U \subset X\) be open; we want to show that \(T(U)\) is also open. Take \(y \in T(U)\), then \(y=T(x)\) with \(x \in U\). Since \(U\) is open, there exists some \(\varepsilon>0\) such that \(B_{\varepsilon}^{X}+x \subset U\). We have shown that \(B_r^Y \subset T(B_1^X)\) for some \(r>0\), so by the linearity of \(T\) we obtain \(B_{r\varepsilon}^Y \subset T(B_{\varepsilon}^X)\). Using the linearity of \(T\) again, we obtain \[ B_{r\varepsilon}^Y + y \subset T(B_{\varepsilon}^X+x) \subset T(U) \] which shows that \(T(U)\) is open, therefore \(T\) is an open mapping.

Remarks

One has to notice that the completeness of \(X\) and \(Y\) has been used more than once. For example, the existence of \(z\) depends on the fact that every Cauchy sequence converges in \(X\). Also, the surjectivity of \(T\) cannot be omitted. Can you see why?

There are some different ways to state this theorem. Let \(T\) be a bounded linear map of a Banach space \(X\) onto a Banach space \(Y\).

  • There is a \(\delta>0\) such that to every \(y\) with \(\lVert y \rVert < \delta\), there corresponds an \(x\) with \(\lVert x \rVert<1\) such that \(T(x)=y\).
  • Let \(U\) and \(V\) be the open unit balls of the Banach spaces \(X\) and \(Y\). To every surjective bounded linear map \(T: X \to Y\), there corresponds a \(\delta>0\) such that

\[ T(U) \supset \delta{V}. \]

You may also realize that we have used a lot of basic definitions of topology. For example, we checked the openness of \(T(U)\) by using neighborhoods. The set \(\overline{T(B_1^X)}\) should also remind you of limit points.

The difference between an open mapping and a continuous mapping can be viewed via the topologies of the two spaces. Suppose \(f: X \to Y\), and let \(\tau_X\) and \(\tau_Y\) be the topologies of \(X\) and \(Y\), respectively. We say \(f\) is open if \(f(U) \in \tau_Y\) for every \(U \in \tau_X\). But this has nothing to do with continuity. By continuity we mean that \(f^{-1}(V) \in \tau_X\) for every \(V \in \tau_Y\).

Fortunately, this theorem can be generalized to \(F\)-spaces, which will be demonstrated in the following blog post of the series. A space \(X\) is an \(F\)-space if its topology \(\tau\) is induced by a complete invariant metric \(d\). Still, completeness plays a critical role.

The series

Since there is no strong reason to write more posts on this topic, i.e. the three fundamental theorems of linear functional analysis, I think it's time to make a list of the series. It's been around half a year.

The Big Three Pt. 2 - The Banach-Steinhaus Theorem

About this blog post

People call the Banach-Steinhaus theorem the first of the big three, which sits at the foundation of linear functional analysis. None of them can go without the Baire category theorem.

This blog post offers the Banach-Steinhaus theorem on different abstract levels. Recall that we have \[ \text{TVS} \supset \text{Metrizable TVS} \supset \text{F-space} \supset \text{Fréchet space}\supset\text{Banach space} \supset \text{Hilbert space} \] First, there will be a simple version for Banach spaces, which may be more frequently used, and you will realize why it's referred to as the uniform boundedness principle. After that, there will be a much more generalized version for TVS. Typically, the metrization of the space will not be considered.

Also, it will be a good chance to get a better view of the first and second category in the sense of Baire.

Equicontinuity

For metric spaces, equicontinuity is defined as follows. Let \((X,d_X)\) and \((Y,d_Y)\) be two metric spaces.

Let \(\Lambda\) be a collection of functions from \(X\) to \(Y\). We have three different levels of equicontinuity.

  1. Equicontinuous at a point. \(\Lambda\) is equicontinuous at \(x_0 \in X\) if for every \(\varepsilon>0\), there exists a \(\delta>0\) such that \(d_Y(Lx_0,Lx)<\varepsilon\) for all \(L \in \Lambda\) and all \(x\) with \(d_X(x_0,x)<\delta\) (that is, the continuity estimate holds for all \(L\) simultaneously in a ball centered at \(x_0\) with radius \(\delta\)).
  2. Pointwise equicontinuous. \(\Lambda\) is equicontinuous at each point of \(X\).
  3. Uniformly equicontinuous. For every \(\varepsilon>0\), there exists a \(\delta>0\) such that \(d_Y(Lx,Ly)<\varepsilon\) for all \(L \in \Lambda\) and all \(x,y \in X\) such that \(d_X(x,y) < \delta\).

Indeed, if \(\Lambda\) contains only one element, namely \(L\), then these notions reduce to the continuity and uniform continuity of \(L\).
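To make the definition concrete, here is a small numerical sketch in Python. The two families and the helper name `worst_oscillation` are illustrative choices made for this example only. On \(\mathbb{R}\), the family \(\{x \mapsto x/n\}\) is equicontinuous at \(0\) (one \(\delta\) works for every member), while \(\{x \mapsto nx\}\) is not; the code only looks at finite truncations of both families, so it is evidence rather than a proof.

```python
def worst_oscillation(family, x0, delta, samples=200):
    """Approximate the sup over L in family and |x - x0| < delta of |L(x) - L(x0)|."""
    worst = 0.0
    for L in family:
        for i in range(1, samples):
            # Sample points strictly inside the ball of radius delta around x0.
            x = x0 + delta * (2 * i / samples - 1)
            worst = max(worst, abs(L(x) - L(x0)))
    return worst


# Finite truncations (n = 1..N) of the two hypothetical families.
N = 200
shrinking = [lambda x, n=n: x / n for n in range(1, N + 1)]
growing = [lambda x, n=n: n * x for n in range(1, N + 1)]

for delta in (1.0, 0.1, 0.01):
    print(delta,
          worst_oscillation(shrinking, 0.0, delta),
          worst_oscillation(growing, 0.0, delta))
```

For the first family the worst oscillation shrinks with \(\delta\); for the second it scales with the truncation level \(N\), so no single \(\delta\) can serve the whole family.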

But for the Banach-Steinhaus theorem, we need a few more restrictions. In fact, \(X\) and \(Y\) should be considered Banach spaces, and \(\Lambda\) contains linear functions only. In this setting, for \(L \in \Lambda\), the following three conditions are equivalent.

  1. \(L\) is bounded.
  2. \(L\) is continuous.
  3. \(L\) is continuous at one point of \(X\).

For topological vector spaces, where only topology and linear structure are taken into consideration, things get different. Since no metrization is considered, we have to state it in the language of topology.

Suppose \(X\) and \(Y\) are TVS and \(\Lambda\) is a collection of linear functions from \(X\) to \(Y\). \(\Lambda\) is equicontinuous if for every neighborhood \(N\) of \(0\) in \(Y\), there corresponds a neighborhood \(V\) of \(0\) in \(X\) such that \(L(V) \subset N\) for all \(L \in \Lambda\).

Indeed, for TVS, continuity of a linear map \(L \in \Lambda\) is still equivalent to continuity at one point, and continuity implies boundedness (the converse holds when \(X\) is metrizable). With that being said, an equicontinuous collection is bounded in a uniform manner. That's why the Banach-Steinhaus theorem is often referred to as the uniform boundedness principle.

The Banach-Steinhaus theorem, a sufficient condition for being equicontinuous

Banach space version

Suppose \(X\) is a Banach space, \(Y\) is a normed linear space, and \(\Lambda\) is a collection of bounded linear transformations of \(X\) into \(Y\). We have two equivalent statements, together with a summary of both:

  1. (The Resonance Theorem) If \(\sup\limits_{L \in \Lambda}\left\Vert{L}\right\Vert=\infty\), then there exists some \(x \in X\) such that \(\sup\limits_{L \in \Lambda}\left\Vert{Lx}\right\Vert=\infty\). (In fact, these \(x\) form a dense \(G_\delta\).)
  2. (The Uniform Boundedness Principle) If \(\sup\limits_{L \in {\Lambda}}\left\Vert{Lx}\right\Vert<\infty\) for all \(x \in X\), then we have \(\lVert L \rVert \leq M\) for all \(L \in {\Lambda}\) and some \(M<\infty\).
  3. (A summary of 1 and 2) Either there exists an \(M<\infty\) such that \(\lVert L \rVert \leq M\) for all \(L \in \Lambda\), or \(\sup\limits_{L \in \Lambda}\lVert Lx \rVert = \infty\) for all \(x\) belonging to some dense \(G_\delta\) in \(X\).

Proof

Though it would be easier if we finish the TVS version proof, it's still a good idea to leave the formal proof without the help of TVS here. The equicontinuity of \(\Lambda\) will be shown in the next section.

An elementary proof of the Resonance theorem

First, we offer an elementary proof in which the hardest part is the Cauchy sequence.

(Lemma) Let \(L\) be a bounded linear transformation of \(X\) into \(Y\). For any \(x \in X\) and \(r >0\), we have \[ \sup_{y\in B(x,r)}\lVert Ly \rVert \geq \lVert L \rVert r \] where \(B(x,r)=\{y \in X:\lVert x-y \rVert < r\}\).

(Proof of the lemma)

For \(t \in X\) we have a simple relation \[ \begin{aligned} \max(\lVert{L(x+t)}\rVert,\lVert{L(x-t)}\rVert)&=\frac{1}{2}(\lVert{L(x+t)}\rVert+\lVert{L(x-t)}\rVert)+\frac{1}{2}\left\vert\lVert{L(x+t)}\rVert-\lVert{L(x-t)}\rVert\right\vert \\ &\geq \frac{1}{2}(\lVert{L(x+t)}\rVert+\lVert{L(x-t)}\rVert) \\ &\geq \frac{1}{2}\lVert{L(2t)}\rVert=\lVert Lt \rVert \end{aligned} \] If we have \(t \in B(0,r)\), then \(x+t,x-t\in{B(x,r)}\). And the desired inequality follows by taking the supremum over \(t \in B(0,r)\). (If you find trouble understanding this, take a look at the definition of \(\lVert L \rVert\).)
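The pointwise inequality \(\max(\lVert L(x+t)\rVert,\lVert L(x-t)\rVert) \geq \lVert Lt \rVert\) is easy to sanity-check numerically. The following is a minimal sketch in Python, assuming \(X=Y=\mathbb{R}^2\) with the Euclidean norm and an arbitrarily chosen matrix `L`; the matrix and the helper names are hypothetical and serve only this illustration.

```python
import math
import random


def apply(L, v):
    """Multiply the matrix L (a list of rows) by the vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in L]


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(c * c for c in v))


def add(u, v, sign=1.0):
    """Return u + sign * v componentwise."""
    return [a + sign * b for a, b in zip(u, v)]


random.seed(0)
L = [[2.0, -1.0], [0.5, 3.0]]  # an arbitrary operator on R^2 (assumption)


def check_once():
    """Test max(|L(x+t)|, |L(x-t)|) >= |Lt| for one random pair (x, t)."""
    x = [random.uniform(-5, 5) for _ in range(2)]
    t = [random.uniform(-1, 1) for _ in range(2)]
    lhs = max(norm(apply(L, add(x, t))), norm(apply(L, add(x, t, -1.0))))
    rhs = norm(apply(L, t))
    return lhs >= rhs - 1e-12  # small tolerance for floating-point rounding


print(all(check_once() for _ in range(10_000)))
```

The check passes for every sampled pair because the inequality is just the triangle inequality applied to \(L(2t)=L(x+t)-L(x-t)\); nothing about the particular matrix is used.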

Suppose now \(\sup\limits_{L \in \Lambda}\left\Vert{L}\right\Vert=\infty\). Pick a sequence of linear transformations in \(\Lambda\), say \((L_n)_{n=1}^{\infty}\), such that \(\lVert L_n \rVert \geq 4^n\). Pick \(x_0 \in X\), and for \(n \geq 1\), we pick \(x_n\) inductively.

Set \(r_n=3^{-n}\). With \(x_{n-1}\) being picked, \(x_n \in B(x_{n-1},r_n)\) is picked in such a way that \[ \lVert L_n x_n \rVert \geq \frac{2}{3}\lVert L_n \rVert r_n. \] (Such an \(x_n\) exists by the lemma: if every \(y \in B(x_{n-1},r_n)\) satisfied \(\lVert L_n y \rVert < \frac{2}{3}\lVert L_n \rVert r_n\), the supremum over the ball would be smaller than \(\lVert L_n \rVert r_n\), a contradiction.) Since \(\lVert x_k-x_{k+1} \rVert < r_{k+1}=3^{-(k+1)}\), it's easy to check that \((x_n)_{n=1}^{\infty}\) is Cauchy. Since \(X\) is complete, \((x_n)\) converges to some \(x \in X\). Further we have \[ \begin{aligned} \lVert x-x_n \rVert &\leq \sum_{k=n}^{\infty}\lVert x_k - x_{k+1}\rVert \\ &\leq \sum_{k=n}^{\infty}3^{-(k+1)}=\frac{1}{2\cdot 3^n} \end{aligned} \] Therefore we have \[ \begin{aligned} \lVert L_n x \rVert &=\lVert L_n[x_n-(x_n-x)] \rVert \\ &\geq \lVert L_nx_n \rVert - \lVert L_n(x_n-x) \rVert \\ &\geq \frac{2}{3}\lVert{L_n}\rVert{3}^{-n}-\lVert{L_n}\rVert\lVert{x_n-x}\rVert\\ &\geq \frac{1}{6}\lVert{L_n}\rVert{3}^{-n} \\ & \geq \frac{1}{6}\left(\frac{4}{3}\right)^n \to\infty \end{aligned} \] as \(n \to \infty\). Hence \(\sup\limits_{L \in \Lambda}\lVert Lx \rVert=\infty\), as desired.
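The constants in this argument can be double-checked with exact rational arithmetic. The sketch below (Python, with the hypothetical helper `tail_bound`) verifies that the tail \(\sum_{k \geq n} 3^{-(k+1)}\) has value \(\frac{1}{2\cdot 3^n}\) and that \(\left(\frac{2}{3}-\frac{1}{2}\right)4^n3^{-n}=\frac{1}{6}\left(\frac{4}{3}\right)^n\), taking \(\lVert L_n \rVert = 4^n\) as the worst case allowed by the construction.

```python
from fractions import Fraction


def tail_bound(n, terms=60):
    """Partial sum of sum_{k >= n} 3^{-(k+1)} with `terms` summands, as an exact rational."""
    return sum(Fraction(1, 3 ** (k + 1)) for k in range(n, n + terms))


for n in range(1, 6):
    limit = Fraction(1, 2 * 3 ** n)  # claimed value of the full tail
    partial = tail_bound(n)
    assert partial < limit  # partial sums stay below the limit
    # The part of the geometric series not yet summed, computed in closed form.
    assert limit - partial == Fraction(1, 2 * 3 ** (n + 60))
    # (2/3) * 3^{-n} - 1/(2 * 3^n), scaled by ||L_n|| = 4^n (assumed worst case).
    lower = (Fraction(2, 3) * Fraction(1, 3 ** n) - limit) * 4 ** n
    assert lower == Fraction(1, 6) * Fraction(4, 3) ** n
    print(n, float(lower))
```

Using `Fraction` avoids any rounding, so the assertions are checking the identities themselves and not floating-point approximations of them.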

A topology-based proof

The previous proof is easy to understand, but it does not reveal the topological properties of the set formed by such \(x\). Thus we offer a topology-based proof, which gives us a topological view of that set.

Put \[ \varphi(x)=\sup_{L \in \Lambda}\lVert Lx \rVert \] and let \[ V_n=\{x:\varphi(x)>n\}. \] We claim that each \(V_n\) is open. Indeed, we have to show that \(x \mapsto \lVert Lx \rVert\) is continuous. It suffices to show that \(\lVert\cdot\rVert\) defined in \(Y\) is continuous. This follows immediately from the triangle inequality, since for \(x,y \in Y\) we have \[ \lVert x \rVert \leq \lVert x-y \rVert + \lVert y \rVert \] which implies \[ \lVert x \rVert - \lVert y \rVert \leq \lVert x-y \rVert. \] By interchanging \(x\) and \(y\), we get \[ |\lVert x \rVert - \lVert y \rVert | \leq \lVert x-y \rVert. \] Thus \(x \mapsto \lVert Lx \rVert\) is continuous since it's a composition of \(\lVert\cdot\rVert\) and \(L\). Hence \(\varphi\), being a supremum of continuous functions, is lower semicontinuous, which forces \(V_n\) to be open.

If every \(V_n\) is dense in \(X\) (consider \(\sup\lVert L \rVert=\infty\)), then by BCT, \(B=\bigcap_{n=1}^{\infty} V_n\) is dense in \(X\). Since each \(V_n\) is open, \(B\) is a dense \(G_\delta\). Again by the definition of \(B\), we have \(\varphi(x)=\infty\) for all \(x \in B\).

If one of these sets, say \(V_N\), fails to be dense in \(X\), then there exist an \(x_0 \in X - V_N\) and an \(r>0\) such that for \(x \in B(0,r)\) we have \(x_0+x \notin V_N\), which is equivalent to \[ \varphi(x+x_0) \leq N. \] Considering the definition of \(\varphi\), we also have \[ \lVert L(x+x_0) \rVert \leq N \] for all \(L \in \Lambda\). Since \(x=(x+x_0)-x_0\) and \(\lVert Lx_0 \rVert \leq N\), we also have \[ \lVert Lx \rVert \leq \lVert L(x+x_0) \rVert+\lVert Lx_0 \rVert \leq 2N. \] Dividing both sides by \(r\), we get \[ \lVert L\frac{x}{r}\rVert \leq \frac{2N}{r}, \] and \(x/r\) ranges over the open unit ball as \(x\) ranges over \(B(0,r)\). Therefore \(\lVert L \rVert \leq M=\frac{2N}{r}\), as was to be shown. Again, this follows from the definition of \(\lVert L \rVert\).

Topological vector space version

Suppose \(X\) and \(Y\) are topological vector spaces, \(\Lambda\) is a collection of continuous linear mapping from \(X\) into \(Y\), and \(B\) is the set of all \(x \in X\) whose orbits \[ \Lambda(x)=\{Lx:L\in\Lambda\} \] are bounded in \(Y\). For this \(B\), we have:

  • If \(B\) is of the second category, then \(\Lambda\) is equicontinuous.

A proof using properties of TVS

Pick balanced neighborhoods \(W\) and \(U\) of the origin in \(Y\) such that \(\overline{U} + \overline{U} \subset W\). The balanced neighborhood exists since every neighborhood of \(0\) contains a balanced one.

Put \[ E=\bigcap_{L \in \Lambda}L^{-1}(\overline{U}). \] If \(x \in B\), then \(\Lambda(x)\) is bounded, which means that for \(U\) there exists some \(n\) such that \(\Lambda(x) \subset nU\) (be aware that no metric is introduced; this is the definition of boundedness in a topological vector space). Therefore we have \(x \in nE\). Consequently, \[ B\subset \bigcup_{n=1}^{\infty}nE. \] If no \(nE\) were of the second category, then \(B\) would be of the first category, contrary to the assumption. Therefore, there exists at least one \(n\) such that \(nE\) is of the second category. Since \(x \mapsto nx\) is a homeomorphism of \(X\) onto \(X\), \(E\) is of the second category as well. But \(E\) is closed since each \(L\) is continuous. Therefore \(E\) has an interior point \(x\). In this case, \(x-E\) contains a neighborhood \(V\) of \(0\) in \(X\), and \[ L(V) \subset Lx-L(E) \subset \overline{U} - \overline{U} \subset W \] for every \(L \in \Lambda\). This proves that \(\Lambda\) is equicontinuous.

Equicontinuity and uniform boundedness

We'll show that \(B=X\). But before that, we need another lemma, which states the connection between equicontinuity and uniform boundedness.

(Lemma) Suppose \(X\) and \(Y\) are TVS, \(\Gamma\) is an equicontinuous collection of linear mappings from \(X\) to \(Y\), and \(E\) is a bounded subset of \(X\). Then \(Y\) has a bounded subset \(F\) such that \(T(E) \subset F\) for every \(T \in \Gamma\).

(Proof of the lemma) We'll show that the set \[ F=\bigcup_{T \in \Gamma}T(E) \] is bounded. Let \(W\) be any neighborhood of the origin in \(Y\). By the definition of equicontinuity, there is a neighborhood \(V\) of the origin in \(X\) such that \(T(V) \subset W\) for all \(T \in \Gamma\). Since \(E\) is bounded, we have \(E \subset tV\) for all sufficiently large \(t\). For these \(t\), by the linearity of \(T\), we have \[ T(E) \subset T(tV)=tT(V) \subset tW. \] Therefore \(F \subset tW\), and \(F\) is bounded.

Thus \(\Lambda\) is uniformly bounded. Picking \(E=\{x\}\) in the lemma, we also see \(\Lambda(x)\) is bounded in \(Y\) for every \(x\). Thus \(B=X\).

A special case when \(X\) is an \(F\)-space or Banach space

\(X\) is an \(F\)-space if its topology \(\tau\) is induced by a complete invariant metric \(d\). By BCT, \(X\) is of the second category. If we already have \(B=X\), in which case \(B\) is of the second category, then by the Banach-Steinhaus theorem, \(\Lambda\) is equicontinuous. Formally speaking, we have:

If \(\Lambda\) is a collection of continuous linear mappings from an \(F\)-space \(X\) into a topological vector space \(Y\), and if the sets \[ \Lambda(x)=\{Lx:L\in\Lambda\} \] are bounded in \(Y\) for every \(x \in X\), then \(\Lambda\) is equicontinuous.

Notice that all Banach spaces are \(F\)-spaces. Therefore we can restate the Uniform Boundedness Principle in Banach space with equicontinuity.

Suppose \(X\) is a Banach space, \(Y\) is a normed linear space, and \(\Lambda\) is a collection of bounded linear transformations of \(X\) into \(Y\). Then we have:

  • (The Uniform Boundedness Principle) If \(\sup\limits_{L \in {\Lambda}}\left\Vert{Lx}\right\Vert<\infty\) for all \(x \in X\), then we have \(\|L\| \le M\) for all \(L \in {\Lambda}\) and some \(M<\infty\). Further, \(\Lambda\) is equicontinuous.

Application

Surprisingly enough, the Banach-Steinhaus theorem can be used to do Fourier analysis. An important example follows.

There is a periodic continuous function \(f\) on \([0,1]\) such that the Fourier series \[ \sum_{n\in\mathbb{Z}}\hat{f}(n)e^{2\pi inx} \] of \(f\) diverges at \(0\). \(\hat{f}(n)\) is defined by \[ \hat{f}(n)=\int_{0}^{1}e^{-2\pi inx}f(x)dx \]

Notice that \(f \mapsto \hat{f}\) is linear, and the divergence of the series at \(0\) can be considered by \[ \sum_{n\in\mathbb{Z}}\hat{f}(n)e^{2\pi in\cdot0}=\sum_{n\in\mathbb{Z}}\hat{f}(n). \] To invoke the Banach-Steinhaus theorem, we define a family of linear functionals on the Banach space of periodic continuous functions on \([0,1]\) (with the supremum norm) by \[ \lambda_N(f)=\sum_{|n| \leq N}\hat{f}(n). \] It can be proved that \[ \lVert \lambda_N \rVert=\int_0^1\left\vert\sum_{|n| \leq N}e^{-2\pi inx}\right\vert dx \] which goes to infinity as \(N \to \infty\). The existence of an \(f\) such that \[ \sup_{N}|\lambda_N(f)|=+\infty \] follows from the resonance theorem. Further, we also know that such \(f\) form a dense \(G_\delta\) subset of the Banach space of all periodic continuous functions on \([0,1]\).
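The growth of \(\lVert \lambda_N \rVert\) can be observed numerically. The following Python sketch approximates the integral with a plain midpoint rule; the function names `dirichlet` and `lebesgue_constant` and the step count are choices made for this illustration, and the output is a numerical approximation, not a proof of divergence.

```python
import math


def dirichlet(N, x):
    """sum_{|n| <= N} exp(-2*pi*i*n*x), rewritten as the real sum 1 + 2*sum cos(2*pi*n*x)."""
    return 1.0 + 2.0 * sum(math.cos(2.0 * math.pi * n * x) for n in range(1, N + 1))


def lebesgue_constant(N, steps=4000):
    """Midpoint-rule approximation of the integral of |dirichlet(N, x)| over [0, 1]."""
    h = 1.0 / steps
    return h * sum(abs(dirichlet(N, (j + 0.5) * h)) for j in range(steps))


for N in (0, 1, 2, 4, 8, 16, 32):
    print(N, round(lebesgue_constant(N), 4))
```

The printed values increase slowly with \(N\), consistent with the known logarithmic growth of these norms: slow, but unbounded, which is all the resonance theorem needs.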

References / Further readings

  1. arXiv:1005.1585v2
  2. W. Rudin, Real and Complex Analysis
  3. W. Rudin, Functional Analysis
  4. Applications to Fourier series

The Big Three Pt. 1 - Baire Category Theorem Explained

About the 'Big Three'

There are three theorems about Banach spaces that occur frequently in the crux of functional analysis, which are called the 'big three':

  1. The Hahn-Banach Theorem
  2. The Banach-Steinhaus Theorem
  3. The Open Mapping Theorem

The incoming series of blog posts is intended to offer a self-read friendly explanation with richer details. Some basic analysis and topology backgrounds are required.

First and second category

The term 'category' is due to Baire, who developed the category theorem afterwards. Let \(X\) be a topological space. A set \(E \subset X\) is said to be nowhere dense if \(\overline{E}\) has empty interior, i.e. \(\text{int}(\overline{E})= \varnothing\).

There are some easy examples of nowhere dense sets. For example, suppose \(X=\mathbb{R}\), equipped with the usual topology. Then \(\mathbb{N}\) is nowhere dense in \(\mathbb{R}\) while \(\mathbb{Q}\) is not. Indeed, \(\overline{\mathbb{N}}=\mathbb{N}\) has empty interior, whereas \(\overline{\mathbb{Q}}=\mathbb{R}\), and \(\mathbb{R}\) is open, so its interior is itself. The category is defined using nowhere dense sets. In fact,

  • A set \(S\) is of the first category if \(S\) is a countable union of nowhere dense sets.
  • A set \(T\) is of the second category if \(T\) is not of the first category.

Baire category theorem (BCT)

In this blog post, we consider two cases: BCT in complete metric space and in locally compact Hausdorff space. These two cases have nontrivial intersection but they are not equal. There are some complete metric spaces that are not locally compact Hausdorff.

Some classic topological spaces, for example \(\mathbb{R}^n\), are both complete metric spaces and locally compact Hausdorff spaces. If a locally compact Hausdorff space happens to be a topological vector space, then this space has finite dimension. Also, a topological vector space has to be Hausdorff.

By a Baire space we mean a topological space \(X\) such that the intersection of every countable collection of dense open subsets of \(X\) is also dense in \(X\).

The Baire category theorem states that

(BCT 1) Every complete metric space is a Baire space.

(BCT 2) Every locally compact Hausdorff space is a Baire space.

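By passing to complements in the definition, we can see that no nonempty Baire space is of the first category in itself.

Indeed, suppose \(\{X_n\}\) is a sequence of open dense subsets of a Baire space \(X\); then \(X_0=\cap_n X_n\) is also dense in \(X\). Each complement \(X_n^c\) is closed with empty interior, hence nowhere dense, so \(X_0^{c} = \cup_n X_n^c\) is a countable union of nowhere dense sets, i.e. of the first category. Conversely, if \(X\) itself were a countable union of nowhere dense sets \(E_n\), then the open dense sets \(X_n=X - \overline{E_n}\) would have empty intersection, which is impossible in a nonempty Baire space.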

Proving BCT 1 and BCT 2 via Choquet game

Let \(X\) be the given complete metric space or locally compact Hausdorff space, and \(\{V_n\}\) a countable collection of dense open subsets of \(X\). Pick an arbitrary nonempty open subset of \(X\), namely \(A_0\). To prove that \(\cap_n V_n\) is dense, we have to show that \(A_0 \cap \left(\cap_n V_n\right) \neq \varnothing\). This follows from the definition of denseness. Namely, we have

A subset \(A\) of \(X\) is dense if and only if \(A \cap U \neq \varnothing\) for all nonempty open subsets \(U\) of \(X\).

We pick a sequence of nonempty open sets \(\{A_n\}\) inductively. With \(A_{n-1}\) being picked, and since \(V_n\) is open and dense in \(X\), the intersection \(V_n \cap A_{n-1}\) is nonempty and open. \(A_n\) can be chosen such that \[ \overline{A}_n \subset V_n \cap A_{n-1}. \] For BCT 1, \(A_n\) can be chosen to be an open ball with radius \(< \frac{1}{n}\); for BCT 2, \(A_n\) can be chosen such that the closure is compact. Define \[ K = \bigcap_{n=1}^{\infty}\overline{A}_n. \] Now, if \(X\) is a locally compact Hausdorff space, then \(K\) is an intersection of nested nonempty compact sets, so due to the compactness \(K\) is not empty. Therefore we have \[ \begin{cases} K \subset A_0 \\ K \subset V_n \quad(n \in \mathbb{N}) \end{cases} \] which shows that \(A_0 \cap \left(\cap_n V_n\right) \neq \varnothing\). BCT 2 is proved.

For BCT 1, we cannot follow this argument since it's not ensured that \(X\) has the Heine-Borel property, for example when \(X\) is an infinite-dimensional Hilbert space (this is also a reason why BCT 1 and BCT 2 are not equivalent). The only tool remaining is the Cauchy sequence. But how and where?

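Let \(x_n\) be the center of the ball \(A_n\). For any \(\varepsilon > 0\), we have some \(N\) such that \(\frac{2}{N} < \varepsilon\). For all \(m>n>N\), we have \(A_m \subset A_n\subset A_N\), so \(x_m\) and \(x_n\) both lie in \(A_N\), a ball of radius \(<\frac{1}{N}\), and their distance is less than \(\frac{2}{N}<\varepsilon\). Therefore the centers of \(\{A_n\}\) form a Cauchy sequence, which converges to some point \(x\) since \(X\) is complete. For each \(n\), we have \(x_m \in \overline{A}_n\) for all \(m \geq n\) and \(\overline{A}_n\) is closed, so \(x \in \overline{A}_n\). Hence \(x \in K\), which implies that \(K \neq \varnothing\). BCT 1 follows.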

Applications of BCT

BCT will be used directly in the big three. It can be considered as the origin of them. But there are many other applications in different branches of mathematics. The applications shown below are in the same pattern: if it does not hold, then we have a Baire space of the first category, which is not possible.

\(\mathbb{R}\) is uncountable

Suppose \(\mathbb{R}\) is countable, then we have \[ \mathbb{R}=\bigcup_{n=1}^{\infty}\{x_n\} \] where \(x_n\) is a real number. But \(\{x_n\}\) is nowhere dense, therefore \(\mathbb{R}\) is of the first category. A contradiction.

Suppose that \(f\) is an entire function, and that every power series expansion \[ f(z)=\sum_{n=0}^{\infty}c_n(z-a)^n \] has at least one coefficient equal to \(0\). Then \(f\) is a polynomial (there exists an \(N\) such that \(c_n=0\) for all \(n>N\)).

You can find the proof here. We are using the fact that \(\mathbb{C}\) is complete.

An infinite dimensional Banach space \(B\) has no countable basis

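Assume that \(B\) has a countable basis \(\{x_1,x_2,\cdots\}\) (in the algebraic sense, i.e. a Hamel basis) and define \[ B_n=\text{span}\{x_1,x_2,\cdots,x_n\}. \] Each \(B_n\) is a finite-dimensional, hence closed, proper subspace of \(B\), and a proper subspace has empty interior, so \(B_n\) is nowhere dense. But then \(B=\cup_n B_n\) is of the first category, a contradiction since \(B\) is a complete metric space.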
