We develop two nearly elementary ways to compute the Fourier transform of $\exp(-cx^2)$, in the sense that neither contour integration nor the calculus of residues is required. The first approach invites us to think much more deeply about these elementary concepts, so I highly recommend studying it as long as you are familiar with first-order ODEs.

We offer a detailed proof of the Riemann mapping theorem, which states that every proper simply connected region is conformally equivalent to the open unit disc.

This blog post is intended to deliver a quick explanation of the algebra of Borel measures on \(\mathbb{R}^n\). It will be broken into pieces. The complex Borel measures on \(\mathbb{R}^n\), denoted \(M(\mathbb{R}^n)\), clearly form a vector space over \(\mathbb{C}\). The main goal of this post is to show that this is a Banach space and also a Banach algebra.

In fact, the \(\mathbb{R}^n\) case can be generalised to any locally compact abelian group (see any book on abstract harmonic analysis), because what really matters here is being locally compact and abelian. But for the moment we stick to Euclidean spaces. Note that since \(\mathbb{R}^n\) is \(\sigma\)-compact, all complex Borel measures on it are regular.

To read this post you need to be familiar with some basic properties of Banach algebras and complex Borel measures, and, most importantly, Fubini's theorem.

We study the average of a sum, in the sense of an integral.

Throughout we consider the Hilbert space \(L^2=L^2(\mathbb{R})\), the space of all complex-valued functions of a real variable such that \(f \in L^2\) if and only if \[ \lVert f \rVert_2^2=\int_{-\infty}^{\infty}|f(t)|^2dm(t)<\infty \] where \(m\) denotes the ordinary Lebesgue measure (in fact it's legitimate to consider the Riemann integral in this context).

For each \(t \geq 0\), we assign a bounded linear operator \(Q(t)\) on \(L^2\) such that \[ (Q(t)f)(s)=f(s+t). \] This is indeed bounded since we have \(\lVert Q(t)f \rVert_2 = \lVert f \rVert_2\), as the Lebesgue measure is translation-invariant. This is a left translation operator with a single step \(t\).

We study the concept of quasi-analytic functions, which are quite close to being analytic.

Suppose \(1 < p < \infty\) and \(f \in L^p((0,\infty))\) (with respect to Lebesgue measure, of course) is a nonnegative function. Setting \[ F(x) = \frac{1}{x}\int_0^x f(t)dt \quad 0 < x <\infty, \] we have Hardy's inequality \(\def\lrVert[#1]{\lVert #1 \rVert}\) \[ \lrVert[F]_p \leq q\lrVert[f]_p \] where \(\frac{1}{p}+\frac{1}{q}=1\).

There are several ways to prove it. I think there are good
reasons to write them down thoroughly, since that may be why you found
this page. Maybe you are burnt out because it was *left as an exercise*.
You are assumed to have enough knowledge of Lebesgue measure and
integration.

Let \(S_1,S_2 \subset \mathbb{R}\) be two measurable sets and suppose \(F:S_1 \times S_2 \to \mathbb{R}\) is measurable. Then, for \(1 \leq p < \infty\), \[ \left[\int_{S_2} \left\vert\int_{S_1}F(x,y)dx \right\vert^pdy\right]^{\frac{1}{p}} \leq \int_{S_1} \left[\int_{S_2} |F(x,y)|^p dy\right]^{\frac{1}{p}}dx. \] A proof can be found here by turning to Example A9. You may need to replace all measures with Lebesgue measure \(m\).

Now let's get into it. For a measurable function in this place we should have \(G(x,t)=\frac{f(t)}{x}\). If we put this function inside this inequality, we see \[ \begin{aligned} \lrVert[F]_p &= \left[\int_0^\infty \left\vert \int_0^x \frac{f(t)}{x}dt \right\vert^p dx\right]^{\frac{1}{p}} \\ &= \left[\int_0^\infty \left\vert \int_0^1 f(ux)du \right\vert^p dx\right]^{\frac{1}{p}} \\ &\leq \int_0^1 \left[\int_0^\infty |f(ux)|^pdx\right]^{\frac{1}{p}}du \\ &= \int_0^1 \left[\int_0^\infty |f(ux)|^pudx\right]^{\frac{1}{p}}u^{-\frac{1}{p}}du \\ &= \lrVert[f]_p \int_0^1 u^{-\frac{1}{p}}du \\ &=q\lrVert[f]_p. \end{aligned} \] Note we have used change-of-variable twice and the inequality once.
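As a sanity check, the inequality can be tested numerically. The following is only a rough sketch under my own assumptions: the test function \(f(t)=e^{-t}\), the exponent \(p=2\), the truncation of \((0,\infty)\) to \((0,200)\) and the midpoint rule are all illustrative choices, and `hardy_sides` is a hypothetical helper name, not part of any proof above.

```python
import math

def hardy_sides(f, p, upper=200.0, n=200_000):
    """Midpoint-rule estimates of int F^p and int f^p over (0, upper).

    Assumption: truncating (0, infinity) to (0, upper) is accurate enough
    for the chosen f; this is a numerical illustration, not a proof.
    """
    h = upper / n
    cum = 0.0              # estimate of int_0^{i*h} f(t) dt (left edge of cell i)
    lhs = rhs = 0.0
    for i in range(n):
        x = (i + 0.5) * h                  # midpoint of cell i
        fx = f(x)
        F = (cum + 0.5 * h * fx) / x       # F(x) = (1/x) * int_0^x f(t) dt
        lhs += h * F ** p
        rhs += h * fx ** p
        cum += h * fx
    return lhs, rhs

p = 2.0
q = p / (p - 1.0)                          # conjugate exponent
lhs, rhs = hardy_sides(lambda t: math.exp(-t), p)
print(lhs, q ** p * rhs)                   # int F^p versus q^p * int f^p
```

For this particular \(f\) the full integral \(\int_0^\infty F^2dx\) equals \(2\ln 2\) (a Frullani integral), comfortably below \(q^p\int_0^\infty f^pdt=2\).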

I have no idea how people came up with this solution. Take \(xF(x)=\int_0^x f(t)t^{u}t^{-u}dt\) where \(0<u<1-\frac{1}{p}\). Hölder's inequality gives us \[ \begin{aligned} xF(x) &= \int_0^x f(t)t^ut^{-u}dt \\ &\leq \left[\int_0^x t^{-uq}dt\right]^{\frac{1}{q}}\left[\int_0^xf(t)^pt^{up}dt\right]^{\frac{1}{p}} \\ &=\left(\frac{1}{1-uq}x^{1-uq}\right)^{\frac{1}{q}}\left[\int_0^xf(t)^pt^{up}dt\right]^{\frac{1}{p}} \end{aligned} \] Hence \[ \begin{aligned} F(x)^p & \leq \frac{1}{x^p}\left\{\left(\frac{1}{1-uq}x^{1-uq}\right)^{\frac{1}{q}}\left[\int_0^xf(t)^pt^{up}dt\right]^{\frac{1}{p}}\right\}^{p} \\ &= \left(\frac{1}{1-uq}\right)^{\frac{p}{q}}x^{\frac{p}{q}(1-uq)-p}\int_0^x f(t)^pt^{up}dt \\ &= \left(\frac{1}{1-uq}\right)^{p-1}x^{-up-1}\int_0^x f(t)^pt^{up}dt \end{aligned} \]

Note we have used the fact that \(\frac{1}{p}+\frac{1}{q}=1 \implies p+q=pq\) and \(\frac{p}{q}=p-1\). Fubini's theorem gives us the final answer: \[ \begin{aligned} \int_0^\infty F(x)^pdx &\leq \int_0^\infty\left[\left(\frac{1}{1-uq}\right)^{p-1}x^{-up-1}\int_0^x f(t)^pt^{up}dt\right]dx \\ &=\left(\frac{1}{1-uq}\right)^{p-1}\int_0^\infty dx\int_0^x f(t)^pt^{up}x^{-up-1}dt \\ &=\left(\frac{1}{1-uq}\right)^{p-1}\int_0^\infty dt\int_t^\infty f(t)^pt^{up}x^{-up-1}dx \\ &=\left(\frac{1}{1-uq}\right)^{p-1}\frac{1}{up}\int_0^\infty f(t)^pdt. \end{aligned} \] It remains to find the minimum of \(\varphi(u) = \left(\frac{1}{1-uq}\right)^{p-1}\frac{1}{up}\). This is an elementary calculus problem. By taking its derivative, we see when \(u=\frac{1}{pq}<1-\frac{1}{p}\) it attains its minimum \(\left(\frac{p}{p-1}\right)^p=q^p\). Hence we get \[ \int_0^\infty F(x)^pdx \leq q^p\int_0^\infty f(t)^pdt, \] which is exactly what we want. Note the constant \(q\) cannot be replaced with a smaller one. We simply proved the case when \(f \geq 0\). For the general case, one simply needs to take absolute value.
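The elementary calculus step can also be checked by brute force. This is a small sketch, assuming a uniform grid on \((0,\frac{1}{q})\) is fine enough; `phi` and `check_minimum` are names I made up for illustration.

```python
def phi(u, p):
    """phi(u) = (1 / (1 - u q))^(p - 1) / (u p), defined for 0 < u < 1/q."""
    q = p / (p - 1.0)
    return (1.0 / (1.0 - u * q)) ** (p - 1.0) / (u * p)

def check_minimum(p, samples=10_000):
    """Return (phi(u*), q^p, grid minimum of phi) with u* = 1/(pq)."""
    q = p / (p - 1.0)
    u_star = 1.0 / (p * q)
    # Assumption: this uniform grid of (0, 1/q) stands in for the whole interval.
    grid = [(k / samples) / q for k in range(1, samples)]
    return phi(u_star, p), q ** p, min(phi(u, p) for u in grid)

for p in (1.5, 2.0, 3.0):
    print(p, check_minimum(p))
```

In each row the first two numbers should agree up to rounding, and the grid minimum should not fall below them.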

This approach makes use of properties of \(L^p\) space. Still we assume that \(f \geq 0\) but we also assume \(f \in C_c((0,\infty))\), that is, \(f\) is continuous and has compact support. Hence \(F\) is differentiable in this situation. Integration by parts gives \[ \int_0^\infty F^p(x)dx=xF(x)^p\vert_0^\infty- \int_0^\infty xd(F^p) = -p\int_0^\infty xF^{p-1}(x)F'(x)dx. \] Note since \(f\) has compact support, there is some \([a,b]\) with \(0 < a \leq b < \infty\) such that \(f(x) >0\) only if \(a \leq x \leq b\). Then \(F(x)=0\) for \(x<a\), while for \(x>b\) we have \(F(x)=\frac{C}{x}\) with \(C=\int_a^b f(t)dt\), so \(xF(x)^p=C^px^{1-p} \to 0\) as \(x \to \infty\) because \(p>1\); hence \(xF(x)^p\vert_0^\infty=0\). Next it is natural to take a look at \(F'(x)\). Note we have \[ F'(x) = \frac{f(x)}{x}-\frac{\int_0^x f(t)dt}{x^2}, \] hence \(xF'(x)=f(x)-F(x)\). A substitution gives us \[ \int_0^\infty F^p(x)dx = -p\int_0^\infty F^{p-1}(x)[f(x)-F(x)]dx, \] which is equivalent to saying \[ \int_0^\infty F^p(x)dx = \frac{p}{p-1}\int_0^\infty F^{p-1}(x)f(x)dx. \] Hölder's inequality gives us \[ \begin{aligned} \int_0^\infty F^{p-1}(x)f(x)dx &\leq \left[\int_0^\infty F^{(p-1)q}(x)dx\right]^{\frac{1}{q}}\left[\int_0^\infty f(x)^pdx\right]^{\frac{1}{p}} \\ &=\left[\int_0^\infty F^{p}(x)dx\right]^{\frac{1}{q}}\left[\int_0^\infty f(x)^pdx\right]^{\frac{1}{p}}. \end{aligned} \] Together with the identity above and \(\frac{p}{p-1}=q\) we get \[ \int_0^\infty F^p(x)dx \leq q\left[\int_0^\infty F^{p}(x)dx\right]^{\frac{1}{q}}\left[\int_0^\infty f(x)^pdx\right]^{\frac{1}{p}} \] which is exactly what we want since \(1-\frac{1}{q}=\frac{1}{p}\) and all we need to do is divide both sides by \(\left[\int_0^\infty F^pdx\right]^{1/q}\) (this quantity is finite, and if it is zero there is nothing to prove). So what's next? Note \(C_c((0,\infty))\) is dense in \(L^p((0,\infty))\). For any \(f \in L^p((0,\infty))\), we can take a sequence of functions \(f_n \in C_c((0,\infty))\) such that \(f_n \to f\) with respect to \(L^p\)-norm. Taking \(F=\frac{1}{x}\int_0^x f(t)dt\) and \(F_n = \frac{1}{x}\int_0^x f_n(t)dt\), we need to show that \(F_n \to F\) pointwise, so that we can use Fatou's lemma.
For \(\varepsilon>0\), there exists some \(N\) such that \(\lrVert[f_n-f]_p < \varepsilon\) for all \(n \geq N\). Thus, for such \(n\) and any fixed \(x>0\), \[ \begin{aligned} |F_n(x)-F(x)| &= \frac{1}{x}\left\vert \int_0^x f_n(t)dt - \int_0^x f(t)dt \right\vert \\ &\leq \frac{1}{x} \int_0^x |f_n(t)-f(t)|dt \\ &\leq \frac{1}{x} \left[\int_0^x|f_n(t)-f(t)|^pdt\right]^{\frac{1}{p}}\left[\int_0^x 1^qdt\right]^{\frac{1}{q}} \\ &=\frac{1}{x^{1/p}}\left[\int_0^x|f_n(t)-f(t)|^pdt\right]^{\frac{1}{p}} \\ &\leq \frac{1}{x^{1/p}}\lrVert[f_n-f]_p <\frac{\varepsilon}{x^{1/p}}. \end{aligned} \] Hence \(F_n \to F\) pointwise, which also implies that \(|F_n|^p \to |F|^p\) pointwise. For \(|F_n|\) we have \[ \begin{aligned} \int_0^\infty |F_n(x)|^pdx &= \int_0^\infty \left\vert\frac{1}{x}\int_0^x f_n(t)dt\right\vert^p dx \\ &\leq \int_0^\infty \left[\frac{1}{x}\int_0^x |f_n(t)|dt\right]^{p}dx \\ &\leq q^p\int_0^\infty |f_n(t)|^pdt, \end{aligned} \] where the last inequality follows since we have already proved it for nonnegative functions in \(C_c((0,\infty))\). By Fatou's lemma, we have \[ \begin{aligned} \int_0^\infty |F(x)|^pdx &= \int_0^\infty \lim_{n \to \infty}|F_n(x)|^pdx \\ &\leq \liminf_{n \to \infty} \int_0^\infty |F_n(x)|^pdx \\ &\leq \liminf_{n \to \infty}q^p\int_0^\infty |f_n(x)|^pdx \\ &=q^p\int_0^\infty |f(x)|^pdx. \end{aligned} \]

Throughout, let \((X,\mathfrak{M},\mu)\) be a measure space where \(\mu\) is positive.

If \(f\) belongs to \(L^p(\mu)\), which means \(\lVert f \rVert_p=\left(\int_X |f|^p
d\mu\right)^{1/p}<\infty\), or equivalently \(\int_X |f|^p d\mu<\infty\), then
\(|f|^p\) belongs to \(L^1(\mu)\). In other words, we have a
function \[
\begin{aligned}
\lambda: L^p(\mu) &\to L^1(\mu) \\
f &\mapsto |f|^p.
\end{aligned}
\] This function does not have to be one-to-one because of the absolute
value. But we hope this function is *fine* enough; at the very
least, we hope it is continuous.

Here, \(f \sim g\) means that \(f-g\) equals \(0\) almost everywhere with respect to \(\mu\). It can be easily verified that this is an equivalence relation.

We still use the \(\varepsilon-\delta\) argument but it's in a metric space. Suppose \((X,d_1)\) and \((Y,d_2)\) are two metric spaces and \(f:X \to Y\) is a function. We say \(f\) is continuous at \(x_0 \in X\) if, for any \(\varepsilon>0\), there exists some \(\delta>0\) such that \(d_2(f(x_0),f(x))<\varepsilon\) whenever \(d_1(x_0,x)<\delta\). Further, we say \(f\) is continuous on \(X\) if \(f\) is continuous at every point \(x \in X\).

For \(1\leq p<\infty\), we already have a metric given by \[ d(f,g)=\lVert f-g \rVert_p, \] with \(d(f,g)=0\) if and only if \(f \sim g\). This metric is complete and makes \(L^p\) a Banach space. But for \(0<p<1\) (yes, we are going to cover that), things are quite different, and there is one reason: Minkowski's inequality is reversed! In fact, for nonnegative \(f\), \(g\) we have \[ \lVert f+g \rVert_p \geq \lVert f \rVert_p + \lVert g \rVert_p \] when \(0<p<1\). The \(L^p\) space has many weird properties when \(0<p<1\). Precisely,

For \(0<p<1\), \(L^p(\mu)\) is locally convex if and only if \(\mu\) assumes finitely many values. (Proof.)

On the other hand, if for example \(X=[0,1]\) and \(\mu=m\) is the Lebesgue measure, then \(L^p(\mu)\) has *no* open convex
subset other than \(\varnothing\) and
\(L^p(\mu)\) itself. However,

A topological vector space \(X\) is normable if and only if its origin has a convex bounded neighbourhood. (See Kolmogorov's normability criterion.)

Therefore \(L^p(m)\) is not normable, hence not Banach.

We have gone too far. We need a metric that is fine enough.

Define \[ \Delta(f)=\int_X |f|^p d\mu \] for \(f \in L^p(\mu)\), where \(0<p<1\). We will show that we have a metric given by \[ d(f,g)=\Delta(f-g). \] Fix \(y\geq 0\) and consider the function \[ f(x)=(x+y)^p-x^p. \] We have \(f(0)=y^p\) and \[ f'(x)=p(x+y)^{p-1}-px^{p-1} \leq px^{p-1}-px^{p-1}=0 \] when \(x > 0\), because \(p-1<0\). Hence \(f(x)\) is nonincreasing on \([0,\infty)\), so \(f(x) \leq f(0)\), which means \[ (x+y)^p \leq x^p+y^p. \] Hence for any \(f\), \(g \in L^p\), applying this pointwise together with \(|f+g| \leq |f|+|g|\), we have \[ \Delta(f+g)=\int_X |f+g|^p d\mu \leq \int_X |f|^p d\mu + \int_X |g|^p d\mu=\Delta(f)+\Delta(g). \] This inequality ensures that \[ d(f,g)=\Delta(f-g) \] is a metric. It's immediate that \(d(f,g)=d(g,f) \geq 0\) for all \(f\), \(g \in L^p(\mu)\), with equality if and only if \(f \sim g\). For the triangle inequality, note that \[ d(f,h)+d(h,g)=\Delta(f-h)+\Delta(h-g) \geq \Delta((f-h)+(h-g))=\Delta(f-g)=d(f,g). \] This metric is translation-invariant as well since \[ d(f+h,g+h)=\Delta(f+h-g-h)=\Delta(f-g)=d(f,g). \] The completeness can be verified in the same way as in the case \(p \geq 1\). In fact, this metric makes \(L^p\) a locally bounded F-space.
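To get a feeling for why \(\Delta\) works while \(\lVert \cdot \rVert_p\) does not, here is a small numerical sketch. It assumes the simplest possible measure space, namely counting measure on finitely many points, so that integrals become finite sums; the function names are mine.

```python
import random

def delta(f, p):
    """Delta(f) = sum |f_i|^p.

    Assumption: counting measure on a finite set stands in for (X, mu).
    """
    return sum(abs(v) ** p for v in f)

def dist(f, g, p):
    """d(f, g) = Delta(f - g)."""
    return delta([a - b for a, b in zip(f, g)], p)

def quasi_norm(f, p):
    """(sum |f_i|^p)^(1/p), which is not a norm when 0 < p < 1."""
    return delta(f, p) ** (1.0 / p)

def triangle_holds(p, trials=1000, n=5, seed=0):
    """Test d(f, g) <= d(f, h) + d(h, g) on random vectors."""
    rng = random.Random(seed)
    for _ in range(trials):
        f, g, h = ([rng.uniform(-3, 3) for _ in range(n)] for _ in range(3))
        if dist(f, g, p) > dist(f, h, p) + dist(h, g, p) + 1e-12:
            return False
    return True

print(triangle_holds(0.5))
# Reversed Minkowski: compare ||f + g||_p with ||f||_p + ||g||_p for p = 1/2.
print(quasi_norm([1.0, 1.0], 0.5), quasi_norm([1.0, 0.0], 0.5) + quasi_norm([0.0, 1.0], 0.5))
```

The second line of output compares \(\lVert f+g \rVert_p\) with \(\lVert f \rVert_p+\lVert g \rVert_p\) for \(f=(1,0)\), \(g=(0,1)\), where the usual triangle inequality fails.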

The metric of \(L^1\) is defined by \[ d_1(f,g)=\lVert f-g \rVert_1=\int_X |f-g|d\mu. \] We need to find a relation between \(d_p(f,g)\) and \(d_1(\lambda(f),\lambda(g))\), where \(d_p\) is the metric of the corresponding \(L^p\) space.

As we have proved, \[ (x+y)^p \leq x^p+y^p \] for \(x,y \geq 0\). Without loss of generality we assume \(x \geq y\) and therefore \[ x^p=(x-y+y)^p \leq (x-y)^p+y^p. \] Hence \[ x^p-y^p \leq (x-y)^p. \] By interchanging \(x\) and \(y\), we get \[ |x^p-y^p| \leq |x-y|^p. \] Replacing \(x\) and \(y\) with \(|f|\) and \(|g|\) where \(f\), \(g \in L^p\), and using \(\lvert |f|-|g| \rvert \leq |f-g|\), we get \[ \int_{X}\lvert |f|^p-|g|^p \rvert d\mu \leq \int_X |f-g|^p d\mu. \] But \[ \begin{aligned} d_1(\lambda(f),\lambda(g))&=\int_{X}\lvert |f|^p-|g|^p \rvert d\mu, \\ d_p(f,g)&=\Delta(f-g)=\int_X |f-g|^p d\mu, \end{aligned} \] and we therefore have \[ d_1(\lambda(f),\lambda(g)) \leq d_p(f,g). \] Hence \(\lambda\) is continuous (in fact Lipschitz continuous, hence uniformly continuous) when \(0<p<1\).
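The Lipschitz estimate is easy to test on random data. Again this is only a sketch under the assumption that counting measure on a finite set replaces \((X,\mu)\); the helper names are hypothetical.

```python
import random

def d1_of_lambda(f, g, p):
    """d_1(lambda(f), lambda(g)) = sum | |f_i|^p - |g_i|^p |."""
    return sum(abs(abs(a) ** p - abs(b) ** p) for a, b in zip(f, g))

def d_p(f, g, p):
    """d_p(f, g) = Delta(f - g) = sum |f_i - g_i|^p, for 0 < p < 1."""
    return sum(abs(a - b) ** p for a, b in zip(f, g))

def lipschitz_holds(p, trials=1000, n=6, seed=1):
    """Test d_1(lambda(f), lambda(g)) <= d_p(f, g) on random complex vectors."""
    rng = random.Random(seed)
    for _ in range(trials):
        # Assumption: complex Gaussian entries are a reasonable test family.
        f = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        g = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        if d1_of_lambda(f, g, p) > d_p(f, g, p) + 1e-12:
            return False
    return True

print(lipschitz_holds(0.3), lipschitz_holds(0.9))
```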

It's natural to think about Minkowski's inequality and Hölder's inequality in this case, since they are the standard tools for \(L^p\) estimates when \(p \geq 1\). The point is to arrange things so that they apply and yield a usable bound. In this section we first need to prove that \[ |x^p-y^p| \leq p|x-y|(x^{p-1}+y^{p-1}) \] for \(x,y \geq 0\). This inequality is surprisingly easy to prove, however. We will use nothing but the mean value theorem. Without loss of generality we assume that \(x > y \geq 0\) and define \(f(t)=t^p\). Then \[ \frac{f(x)-f(y)}{x-y}=f'(\zeta)=p\zeta^{p-1} \] where \(y < \zeta < x\). Since \(p-1 \geq 0\), we see \(\zeta^{p-1} \leq x^{p-1} \leq x^{p-1}+y^{p-1}\). Therefore \[ f(x)-f(y)=x^p-y^p=p(x-y)\zeta^{p-1} \leq p(x-y)(x^{p-1}+y^{p-1}). \] For \(x=y\) both sides vanish and equality holds.

Therefore \[
\begin{aligned}
d_1(\lambda(f),\lambda(g)) &= \int_X \left||f|^p-|g|^p\right|d\mu \\
&\leq
\int_Xp\left||f|-|g|\right|(|f|^{p-1}+|g|^{p-1})d\mu
\end{aligned}
\] By *Hölder's inequality*, we have \[
\begin{aligned}
\int_X ||f|-|g||(|f|^{p-1}+|g|^{p-1})d\mu & \leq \left[\int_X
\left||f|-|g|\right|^pd\mu\right]^{1/p}\left[\int_X\left(|f|^{p-1}+|g|^{p-1}\right)^qd\mu\right]^{1/q}
\\
&\leq \left[\int_X
\left|f-g\right|^pd\mu\right]^{1/p}\left[\int_X\left(|f|^{p-1}+|g|^{p-1}\right)^qd\mu\right]^{1/q}
\\
&=\lVert f-g \rVert_p
\left[\int_X\left(|f|^{p-1}+|g|^{p-1}\right)^qd\mu\right]^{1/q}.
\end{aligned}
\] By *Minkowski's inequality*, we have \[
\left[\int_X\left(|f|^{p-1}+|g|^{p-1}\right)^qd\mu\right]^{1/q} \leq
\left[\int_X|f|^{(p-1)q}d\mu\right]^{1/q}+\left[\int_X
|g|^{(p-1)q}d\mu\right]^{1/q}.
\] Now things are clear. Since \(1/p+1/q=1\), or equivalently \(1/q=(p-1)/p\), we have \((p-1)q=p\). Suppose \(\lVert f \rVert_p\), \(\lVert g \rVert_p \leq R\); then \[
\left[\int_X|f|^{(p-1)q}d\mu\right]^{1/q}+\left[\int_X
|g|^{(p-1)q}d\mu\right]^{1/q} = \lVert f \rVert_p^{p-1}+\lVert g
\rVert_p^{p-1} \leq 2R^{p-1}.
\] Combining the inequalities above, we get \[
\begin{aligned}
d_1(\lambda(f),\lambda(g)) \leq 2pR^{p-1}\lVert f-g \rVert_p
=2pR^{p-1}d_p(f,g)
\end{aligned}
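\] whenever \(\lVert f \rVert_p\), \(\lVert g \rVert_p \leq R\). Since \(R\) is arbitrary, \(\lambda\) is Lipschitz on every bounded set, hence \(\lambda\) is
continuous.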
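The local Lipschitz bound for \(p \geq 1\) can be sanity-checked in the same spirit. This sketch assumes counting measure on a finite set and takes \(R=\max(\lVert f \rVert_p,\lVert g \rVert_p)\); both choices and the names `norm_p`, `bound_holds` are mine.

```python
import random

def norm_p(f, p):
    """||f||_p for counting measure on a finite set (an illustrative assumption)."""
    return sum(abs(v) ** p for v in f) ** (1.0 / p)

def bound_holds(p, trials=1000, n=6, seed=2):
    """Test d_1(lambda(f), lambda(g)) <= 2 p R^(p-1) ||f - g||_p on random vectors."""
    rng = random.Random(seed)
    for _ in range(trials):
        f = [rng.uniform(-2, 2) for _ in range(n)]
        g = [rng.uniform(-2, 2) for _ in range(n)]
        R = max(norm_p(f, p), norm_p(g, p))   # smallest admissible R for this pair
        lhs = sum(abs(abs(a) ** p - abs(b) ** p) for a, b in zip(f, g))
        diff = norm_p([a - b for a, b in zip(f, g)], p)
        if lhs > 2 * p * R ** (p - 1) * diff + 1e-9:
            return False
    return True

print(bound_holds(1.0), bound_holds(2.0), bound_holds(3.5))
```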

We have proved that \(\lambda\) is continuous, and when \(0<p<1\), we have seen that \(\lambda\) is Lipschitz continuous. It's natural to think about its differentiability afterwards, but the absolute value function is not even differentiable, so we may have no chance. But this is still a fine enough result. For example, we have no restriction on \((X,\mathfrak{M},\mu)\) other than the positivity of \(\mu\). Therefore we may take \(\mathbb{R}^n\) with the Lebesgue measure here, or we can take something else.

It's also interesting how we use elementary Calculus to solve some much more abstract problems.

(Gleason-Kahane-Żelazko) If \(\phi\) is a complex linear functional on a unitary Banach algebra \(A\), such that \(\phi(e)=1\) and \(\phi(x) \neq 0\) for every invertible \(x \in A\), then \[ \phi(xy)=\phi(x)\phi(y) \] for all \(x,y \in A\). Namely, \(\phi\) is a complex homomorphism.

Suppose \(A\) is a complex unitary
Banach algebra and \(\phi: A \to
\mathbb{C}\) is a linear functional which is not identically
\(0\) (for convenience), and if \[
\phi(xy)=\phi(x)\phi(y)
\] for all \(x \in A\) and \(y \in A\), then \(\phi\) is called a *complex
homomorphism* on \(A\). Note that a
unitary Banach algebra (with \(e\) as
multiplicative unit) is also a ring, and so is \(\mathbb{C}\), so we may say that in this case \(\phi\) is a ring-homomorphism. For such
\(\phi\), we have an immediate
proposition:

Proposition 0. \(\phi(e)=1\) and \(\phi(x) \neq 0\) for every invertible \(x \in A\).

*Proof.* Since \(\phi(e)=\phi(ee)=\phi(e)\phi(e)\), we have
\(\phi(e)=0\) or \(\phi(e)=1\). If \(\phi(e)=0\) however, for any \(y \in A\), we have \(\phi(y)=\phi(ye)=\phi(y)\phi(e)=0\), which
is an excluded case. Hence \(\phi(e)=1\).

For invertible \(x \in A\), note that \(\phi(xx^{-1})=\phi(x)\phi(x^{-1})=\phi(e)=1\). This can't happen if \(\phi(x)=0\). \(\square\)

The theorem reveals that Proposition \(0\) actually characterizes the complex homomorphisms (ring-homomorphisms) among the linear functionals (group-homomorphisms).
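A toy example may help here. Take the commutative Banach algebra \(\mathbb{C}^2\) with coordinatewise multiplication and unit \(e=(1,1)\); this choice of algebra, the weight \(w\) and the function names are assumptions made purely for illustration. Every linear functional with \(\phi(e)=1\) has the form \(\phi(x)=wx_1+(1-w)x_2\), and unless \(w \in \{0,1\}\) it kills an invertible element, as the sketch below checks with exact rational arithmetic.

```python
from fractions import Fraction

def phi(w, x):
    """Linear functional on C^2 (coordinatewise product) with phi(w, (1, 1)) = 1."""
    return w * x[0] + (1 - w) * x[1]

def mul(x, y):
    """Coordinatewise product, the multiplication of the toy algebra."""
    return (x[0] * y[0], x[1] * y[1])

def invertible_zero(w):
    """An element with both coordinates nonzero (hence invertible) when w is not 0 or 1."""
    return (1 - w, -w)

w = Fraction(1, 3)
x = invertible_zero(w)
y = (Fraction(2), Fraction(5))
print(phi(w, x))                          # phi vanishes at the invertible x
print(phi(w, mul(y, y)), phi(w, y) ** 2)  # phi(y^2) versus phi(y)^2
```

For \(w=1\) the functional is evaluation at the first coordinate, which is multiplicative and never vanishes on an invertible element.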

This theorem was proved by Andrew M. Gleason in 1967 and later independently by J.-P. Kahane and W. Żelazko in 1968. Both papers dealt mainly with commutative Banach algebras; the non-commutative version, which focuses on complex homomorphisms, is due to W. Żelazko. In this post we will follow the third one.

Unfortunately, one cannot find an educational proof on the Internet with ease, which may be the reason why I write this post and why you read this.

Following definitions of Banach algebra and some logic manipulation, we have several equivalences worth noting.

(Stated by Gleason) Let \(M\) be a linear subspace of codimension one in a commutative Banach algebra \(A\) having an identity. Suppose no element of \(M\) is invertible, then \(M\) is an ideal.

(Stated by Kahane and Żelazko) A subspace \(X \subset A\) of codimension \(1\) is a maximal ideal if and only if it consists of non-invertible elements.

(Stated by Kahane and Żelazko) Let \(A\) be a commutative complex Banach algebra with unit element. Then a functional \(f \in A^\ast\) is a multiplicative linear functional if and only if \(f(x) \in \sigma(x)\) holds for all \(x \in A\).

Here \(\sigma(x)\) denotes the spectrum of \(x\).

Clearly any maximal ideal contains no invertible element (if it did, it would contain \(e\), and then it would be the ring itself). Also note that every maximal ideal is the kernel of some complex homomorphism, so it has codimension \(1\). So it suffices to show that a subspace of codimension \(1\) consisting of non-invertible elements is a maximal ideal. For such a subspace \(X \subset A\), since \(e \notin X\), we may define a linear functional \(\phi\) with kernel \(X\) so that \(\phi(e)=1\); then \(\phi(x) \in \sigma(x)\) for all \(x \in A\), because \(x-\phi(x)e \in X\) is not invertible. Note that, for a linear functional \(\phi\), the conditions \(\phi(e)=1\) and \(\phi(x) \neq 0\) for every invertible \(x\) hold if and only if \(\phi(x) \in \sigma(x)\) for all \(x \in A\). As we will show, \(\phi\) has to be a complex homomorphism.

Lemma 0. Suppose \(A\) is a unitary Banach algebra, \(x \in A\), \(\lVert x \rVert<1\), then \(e-x\) is invertible.

This lemma can be found in any functional analysis book introducing Banach algebra.
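The standard proof goes through the Neumann series \(\sum_{n \geq 0}x^n\), which converges when \(\lVert x \rVert<1\) and sums to \((e-x)^{-1}\). Below is a minimal sketch in the algebra of \(2 \times 2\) real matrices; the particular matrix and the number of terms are arbitrary choices of mine.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def neumann_inverse(x, terms=200):
    """Partial sum e + x + x^2 + ... + x^terms, approximating (e - x)^{-1}."""
    total = [[1.0, 0.0], [0.0, 1.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(terms):
        power = matmul(power, x)
        total = [[total[i][j] + power[i][j] for j in range(2)] for i in range(2)]
    return total

# Assumption: this x has operator norm below 1 (its Frobenius norm is sqrt(0.3)).
x = [[0.2, 0.3], [-0.1, 0.4]]
e_minus_x = [[1 - x[0][0], -x[0][1]], [-x[1][0], 1 - x[1][1]]]
product = matmul(e_minus_x, neumann_inverse(x))
print(product)   # should be close to the identity matrix
```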

Lemma 1. Suppose \(f\) is an entire function of one complex variable, \(f(0)=1\), \(f'(0)=0\), and \[ 0<|f(\lambda)| \leq e^{|\lambda|} \] for all complex \(\lambda\), then \(f(\lambda)=1\) for all \(\lambda \in \mathbb{C}\).

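Note that, since \(f\) has no zero, there is an entire function \(g\) such that \(f=\exp(g)\), and we may choose \(g(0)=0\); then \(g'(0)=f'(0)/f(0)=0\) and \(\operatorname{Re} g(\lambda) \leq |\lambda|\). It can be shown that \(g=0\). Indeed, if we put \[ h_r(\lambda) = \frac{r^2g(\lambda)}{\lambda^2[2r-g(\lambda)]} \] then \(h_r\) is holomorphic in the open disk centred at \(0\) with radius \(2r\): the denominator \(2r-g(\lambda)\) does not vanish there since \(\operatorname{Re} g(\lambda) \leq |\lambda|<2r\), and \(g\) has a zero of order at least two at \(0\). Besides, \(|h_r(\lambda)| \leq 1\) if \(|\lambda|=r\), because \(\operatorname{Re} g(\lambda) \leq r\) implies \(|g(\lambda)| \leq |2r-g(\lambda)|\). By the maximum modulus theorem, we have \[ |h_r(\lambda)| \leq 1 \] whenever \(|\lambda| \leq r\). Fix \(\lambda\) and let \(r \to \infty\); since \(|g(\lambda)| \leq |\lambda|^2|2r-g(\lambda)|/r^2 \to 0\), we must have \(g(\lambda)=0\).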

A mapping \(\phi\) from one ring
\(R\) to another ring \(R'\) is said to be a **Jordan
homomorphism** from \(R\) to
\(R'\) if \[
\phi(a+b)=\phi(a)+\phi(b)
\] and \[
\phi(ab+ba)=\phi(a)\phi(b)+\phi(b)\phi(a).
\] It's of course clear that every homomorphism is Jordan. Note
if \(R'\) is not of characteristic
\(2\), the second identity is
equivalent to \[
\phi(a^2)=\phi(a)^2.
\] *To show the equivalence, one lets \(b=a\) in the first case and puts \(a+b\) in place of \(a\) in the second case.*

Since in this case \(R=A\) and \(R'=\mathbb{C}\), the latter of which is commutative, we also write \[ \phi(ab+ba)=2\phi(a)\phi(b). \] As we will show, the \(\phi\) in the theorem is a Jordan homomorphism.

We will follow an unusual approach. By repeatedly 'downgrading' the goal, one will see this algebraic problem neatly transformed into a pure analysis problem.

To begin with, let \(N\) be the kernel of \(\phi\).

If \(\phi\) is a complex homomorphism, it is immediate that \(\phi\) is a Jordan homomorphism. Conversely, if \(\phi\) is Jordan, we have \[ \phi(xy+yx) =2\phi(x)\phi(y). \] If \(x\in N\), the right hand becomes \(0\), and therefore \[ xy+yx \in N \quad \text{if } x \in N, y \in A. \] Consider the identity \[ (xy-yx)^2+(xy+yx)^2=2[x(yxy)+(yxy)x] \]

Therefore \[ \begin{aligned} \phi((xy-yx)^2+(xy+yx)^2)&=\phi((xy-yx)^2)+\phi((xy+yx)^2) \\ &=\phi(xy-yx)^2+\phi(xy+yx)^2 \\ &= \phi(xy-yx)^2 \\ &=2\phi[x(yxy)+(yxy)x] \\ &=0 \end{aligned} \] where the last step holds since \(x \in N\) and \(yxy \in A\) give \(x(yxy)+(yxy)x \in N\). Therefore \(\phi(xy-yx)=0\) and \[ xy-yx \in N \] if \(x \in N\) and \(y \in A\). Further we see \[ xy-yx+xy+yx=2xy \in N \quad \text {and}\quad xy+yx-xy+yx = 2yx \in N, \] which implies that \(N\) is an ideal.
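The identity \((xy-yx)^2+(xy+yx)^2=2[x(yxy)+(yxy)x]\) used above holds in any ring, and it is reassuring to test it in a genuinely non-commutative one. Here is a small sketch with random \(2 \times 2\) integer matrices; the matrix ring and the helper names are my own illustrative choices.

```python
import random

def mm(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def lin(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]

def identity_gap(x, y):
    """(xy - yx)^2 + (xy + yx)^2 - 2 [x(yxy) + (yxy)x]; zero if the identity holds."""
    xy, yx = mm(x, y), mm(y, x)
    comm, anti = lin(xy, yx, -1), lin(xy, yx)
    lhs = lin(mm(comm, comm), mm(anti, anti))
    yxy = mm(y, xy)
    inner = lin(mm(x, yxy), mm(yxy, x))
    return lin(lhs, lin(inner, inner), -1)

def random_matrix(rng):
    return [[rng.randint(-9, 9) for _ in range(2)] for _ in range(2)]

rng = random.Random(3)
print(identity_gap(random_matrix(rng), random_matrix(rng)))
```

Integer entries keep the arithmetic exact, so the gap should be the zero matrix rather than merely small.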

For \(x,y \in A\), we have \(x \in \phi(x)e+N\) and \(y \in \phi(y)e+N\). As a result, \(xy \in \phi(x)\phi(y)e+N\), and therefore \[ \phi(xy)=\phi(x)\phi(y)+0. \]

Again, if \(\phi\) is Jordan, we have \(\phi(x^2)=\phi(x)^2\) for all \(x \in A\), and in particular \(\phi(a^2)=0\) for all \(a \in N\). Conversely, suppose \(\phi(a^2)=0\) for all \(a \in N\). Every \(x \in A\) can be written as \[ x=\phi(x)e+a \] where \(a \in N\). Therefore \[ \begin{aligned} \phi(x^2)&=\phi((\phi(x)e+a)^2)=\phi(x)^2+2\phi(x)\phi(a)+\phi(a^2)=\phi(x)^2, \end{aligned} \] which also shows that \(\phi\) is Jordan.

Fix \(a \in N\), assume \(\lVert a \rVert = 1\) without loss of generality, and define \[ f(\lambda)=\sum_{n=0}^{\infty}\frac{\phi(a^n)}{n!}\lambda^n \] for all complex \(\lambda\). If this function is constant (lemma 1), we immediately have \(f''(0)=\phi(a^2)=0\). This is purely a complex analysis problem however.

Note in the definition of \(f\), we
have \[
\lvert \phi(a^n) \rvert \leq \lVert \phi \rVert \lVert a^n \rVert \leq
\lVert \phi \rVert \lVert a \rVert^n=\lVert \phi \rVert.
\] So once we know that the norm of \(\phi\) is finite, the coefficients are bounded and
\(f\) is entire. To see that \(\phi\) is bounded, argue by contradiction: if \(\lVert e-a \rVert <
1\) for some \(a \in N\), then by lemma 0
the element \(e-(e-a)=a\) is invertible,
which is impossible since \(\phi(a)=0\). Hence \(\lVert e-a \rVert
\geq 1\) for all \(a \in N\). Consequently,
for nonzero \(\lambda \in
\mathbb{C}\) and \(a \in N\), we have the following inequality: \[
\begin{aligned}
\lVert \lambda e-a \rVert = |\lambda|\lVert e-\lambda^{-1}a \rVert
&\geq|\lambda| \\
&= |\phi(\lambda e)-\phi(a)| \\
&= |\phi(\lambda e-a)|
\end{aligned}
\] Since every \(x \in A\) can be written as \(\lambda e-a\) with \(\lambda=\phi(x)\) and \(a \in N\) (the case \(\lambda=0\) being trivial), this shows \(|\phi(x)| \leq \lVert x \rVert\). Therefore \(\phi\) is
*continuous* with norm at most \(1\). The continuity of \(\phi\) is not assumed at the beginning but
proved here.

For \(f\) we have some immediate facts. Since the coefficients \(\phi(a^n)\) are bounded, \(f\) is entire with \(f(0)=\phi(e)=1\) and \(f'(0)=\phi(a)=0\). Also, since \(\lVert \phi \rVert \leq 1\) and \(\lVert a \rVert=1\), we have \[ |f(\lambda)|=\left|\sum_{n=0}^{\infty}\frac{\phi(a^n)}{n!}\lambda^n\right| \leq \sum_{n=0}^{\infty}\frac{|\lambda^n|}{n!}=e^{|\lambda|}. \] All we need in the end is to show that \(f(\lambda) \neq 0\) for all \(\lambda \in \mathbb{C}\).

The series \[
E(\lambda)=\exp(a\lambda)=\sum_{n=0}^{\infty}\frac{(\lambda a)^n}{n!}
\] converges since \(\lVert a
\rVert=1\). The continuity of \(\phi\) shows now \[
f(\lambda)=\phi(E(\lambda)).
\] Note \[
E(-\lambda)E(\lambda)=\left(\sum_{n=0}^{\infty}\frac{(-\lambda
a)^n}{n!}\right)\left(\sum_{n=0}^{\infty}\frac{(\lambda
a)^n}{n!}\right)=e.
\] Hence \(E(\lambda)\)
*is* invertible for all \(\lambda \in
\mathbb{C}\), hence \(f(\lambda)=\phi(E(\lambda)) \neq 0\). By
lemma 1, \(f(\lambda)=1\) is constant.
The proof is completed by reversing the steps. \(\square\)

- Walter Rudin, *Real and Complex Analysis*
- Walter Rudin, *Functional Analysis*
- Andrew M. Gleason, *A Characterization of Maximal Ideals*
- J.-P. Kahane and W. Żelazko, *A Characterization of Maximal Ideals in Commutative Banach Algebras*
- W. Żelazko, *A Characterization of Multiplicative linear functionals in Complex Banach Algebras*
- I. N. Herstein, *Jordan Homomorphisms*

We are going to show the completeness of \(X/N\) where \(X\) is a TVS and \(N\) a closed subspace. Alongside, a bunch of useful analysis tricks will be demonstrated (and that's why you may find this blog post a little tedious). More importantly, the theorem proved here will be used in the future.

To make it clear, we should give a formal definition of \(F\)-space.

A topological vector space \(X\) is an \(F\)-space if its topology \(\tau\) is induced by a complete invariant metric \(d\).

A metric \(d\) on a vector space \(X\) will be called invariant if for all \(x,y,z \in X\), we have \[ d(x+z,y+z)=d(x,y). \] By complete we mean every Cauchy sequence of \((X,d)\) converges.

The metric can be inherited by the quotient space naturally (we will use this fact later), that is,

If \(X\) is an \(F\)-space and \(N\) is a closed subspace of \(X\), then \(X/N\) is still an \(F\)-space.

Suppose \(d\) is a complete invariant metric compatible with \(\tau_X\), and let \(\pi:X \to X/N\) denote the quotient map. The metric on \(X/N\) is defined by \[ \boxed{\rho(\pi(x),\pi(y))=\inf_{z \in N}d(x-y,z)} \]

### \(\rho\) is a metric

*Proof.* First, if \(\pi(x)=\pi(y)\), that is, \(x-y \in N\), we see \[
\rho(\pi(x),\pi(y))=\inf_{z \in N}d(x-y,z)=d(x-y,x-y)=0.
\] If \(\pi(x) \neq \pi(y)\)
however, we shall show that \(\rho(\pi(x),\pi(y))>0\). In this case,
we have \(x-y \notin N\). Since \(N\) is closed, \(N^c\) is open, and \(x-y\) is an interior point of \(N^c\). Therefore there exists an open ball
\(B_r(x-y)\) centered at \(x-y\) with radius \(r>0\) such that \(B_r(x-y) \cap N = \varnothing\). Notice we
have \(d(x-y,z) \geq r\) for every \(z \in N\), since otherwise
\(z \in B_r(x-y)\). By putting \[
r_0=\sup\{r:B_r(x-y) \cap N = \varnothing\},
\] we see \(d(x-y,z) \geq r_0\)
for all \(z \in N\) and indeed \(r_0=\inf_{z \in N}d(x-y,z)>0\) (the
verification can be done by contradiction). In general, \(\inf_{z \in N} d(x-y,z)=0\) if and only if \(x-y \in \overline{N}\).

Next, we shall show that \(\rho(\pi(x),\pi(y))=\rho(\pi(y),\pi(x))\), and it suffices to assume that \(\pi(x) \neq \pi(y)\). Since \(d\) is translation-invariant, we get \[ \begin{aligned} d(x-y,z)&=d(x-y-z,0) \\ &=d(0,y-x+z) \\ &=d(-z,y-x) \\ &=d(y-x,-z). \end{aligned} \] Since \(-z\) runs over \(N\) as \(z\) does, the infimum of the left-hand side over \(z \in N\) equals that of the right-hand side. The identity is proved.

Finally, we need to verify the triangle inequality. Let \(r,s,t \in X\). For any \(\varepsilon>0\), there exist some \(z_\varepsilon, z_\varepsilon' \in N\) such that \[
d(r-s,z_\varepsilon)<\rho(\pi(r),\pi(s))+\frac{\varepsilon}{2},\quad
d(s-t,z'_\varepsilon)<\rho(\pi(s),\pi(t))+\frac{\varepsilon}{2}.
\] Since \(d\) is invariant, we
see \[
\begin{aligned}
d(r-t,z_\varepsilon+z'_\varepsilon)&=d((r-s)+(s-t)-(z_\varepsilon+z'_\varepsilon),0)
\\
&=d([(r-s)-z_\varepsilon]+[(s-t)-z'_\varepsilon],0)
\\
&=d(r-s-z_\varepsilon,t-s+z'_\varepsilon)
\\
&\leq
d(r-s-z_\varepsilon,0)+d(t-s+z'_\varepsilon,0) \\
&=d(r-s,z_\varepsilon)+d(s-t,z'_\varepsilon)
\end{aligned}
\] *(I owe @LeechLattice for
the inequality above.)*

Therefore \[
\begin{aligned}
d(r-t,z_\varepsilon+z'_\varepsilon)&\leq
d(r-s,z_\varepsilon)+d(s-t,z'_\varepsilon) \\
&<\rho(\pi(r),\pi(s))+\rho(\pi(s),\pi(t))+\varepsilon.
\end{aligned}
\] *(Warning: This does not imply that \(\rho(\pi(r),\pi(s))+\rho(\pi(s),\pi(t))=\inf_z
d(r-t,z)\) since we don't know whether it is the lower bound or
not.)*

If \(\rho(\pi(r),\pi(s))+\rho(\pi(s),\pi(t))<\rho(\pi(r),\pi(t))\) however, let \[ 0<\varepsilon<\rho(\pi(r),\pi(t))-(\rho(\pi(r),\pi(s))+\rho(\pi(s),\pi(t))) \] then there exists some \(z''_\varepsilon=z_\varepsilon+z'_\varepsilon\) such that \[ d(r-t,z''_\varepsilon)<\rho(\pi(r),\pi(t)) \] which is a contradiction since \(\rho(\pi(r),\pi(t)) \leq d(r-t,z)\) for all \(z \in N\).

*(We are using the \(\varepsilon\) definition of \(\inf\). See here.)*

Since \(\pi\) is surjective, we see if \(u \in X/N\), there exists some \(a \in X\) such that \(\pi(a)=u\). Therefore \[ \begin{aligned} \rho(\pi(x)+u,\pi(y)+u) &=\rho(\pi(x)+\pi(a),\pi(y)+\pi(a)) \\ &=\rho(\pi(x+a),\pi(y+a)) \\ &=\inf_{z \in N}d(x+a-y-a,z) \\ &=\rho(\pi(x),\pi(y)). \end{aligned} \]

If \(\pi(x)=\pi(x')\) and \(\pi(y)=\pi(y')\), we have to show that \(\rho(\pi(x),\pi(y))=\rho(\pi(x'),\pi(y'))\). In fact, \[ \begin{aligned} \rho(\pi(x),\pi(y)) &\leq \rho(\pi(x),\pi(x'))+\rho(\pi(x'),\pi(y'))+\rho(\pi(y'),\pi(y)) \\ &=\rho(\pi(x'),\pi(y')) \end{aligned} \] since \(\rho(\pi(x),\pi(x'))=0\) as \(\pi(x)=\pi(x')\). Meanwhile \[ \begin{aligned} \rho(\pi(x'),\pi(y')) &\leq \rho(\pi(x'),\pi(x)) + \rho(\pi(x),\pi(y)) + \rho(\pi(y),\pi(y')) \\ &= \rho(\pi(x),\pi(y)). \end{aligned} \] therefore \(\rho(\pi(x),\pi(y))=\rho(\pi(x'),\pi(y'))\).

To prove that \(\rho\) is compatible with the quotient topology \(\tau_N\), we need to show that a set \(E \subset X/N\) is open with respect to \(\tau_N\) if and only if \(E\) is a union of open balls. But we will show a more general statement:

If \(\mathscr{B}\) is a local base for \(\tau\), then the collection \(\mathscr{B}_N\), which contains all sets \(\pi(V)\) where \(V \in \mathscr{B}\), forms a local base for \(\tau_N\).

*Proof.* We already know that \(\pi\) is continuous, linear and open.
Therefore \(\pi(V)\) is open for all
\(V \in \mathscr{B}\). For any open set
\(E \subset X/N\) containing
\(\pi(0)\), we see \(\pi^{-1}(E)\) is an open neighbourhood of \(0\), so there is some \(V \in \mathscr{B}\) with \[
V \subset \pi^{-1}(E)
\] and therefore \[
\pi(V) \subset \pi(\pi^{-1}(E)) = E.
\]

Now consider the local base \(\mathscr{B}\) containing all open balls around \(0 \in X\). Since \[ \pi(\{x:d(x,0)<r\})=\{u:\rho(u,\pi(0))<r\} \] we see \(\rho\) determines \(\mathscr{B}_N\). But we have already proved that \(\rho\) is invariant; hence \(\mathscr{B}_N\) determines \(\tau_N\).

Once this is proved, we are able to claim that, if \(X\) is an \(F\)-space, then \(X/N\) is still an \(F\)-space, provided that \(\rho\) is complete, since its topology is then induced by a complete invariant metric \(\rho\). It remains to prove completeness.

*Proof.* Suppose \((x_n)\) is
a Cauchy sequence in \(X/N\), relative
to \(\rho\). There is a subsequence
\((x_{n_k})\) with \(\rho(x_{n_k},x_{n_{k+1}})<2^{-k}\).
Since \(\pi\) is surjective, we are
able to pick some \(z_k \in X\) such
that \(\pi(z_k) = x_{n_k}\) and such
that \[
d(z_{k},z_{k+1})<2^{-k}.
\] (Such \(z_k\) can be chosen inductively, using the definition of \(\rho\) as an infimum.) By the
inequality above, we see \((z_k)\) is
Cauchy (can you see why?). Since \(X\)
is complete, \(z_k \to z\) for some
\(z \in X\). By the
**continuity** of \(\pi\),
we also see \(x_{n_k} \to \pi(z)\) as
\(k \to \infty\). Therefore \((x_{n_k})\) converges. Hence \((x_n)\) converges since it is Cauchy and has a convergent
subsequence. Therefore \(\rho\) is complete.

This fact will be used to prove some corollaries in the open mapping theorem. For instance, for any continuous linear map \(\Lambda:X \to Y\), we see \(\ker(\Lambda)\) is closed, therefore if \(X\) is an \(F\)-space, then \(X/\ker(\Lambda)\) is an \(F\)-space as well. We will show in the future that \(X/\ker(\Lambda)\) and \(\Lambda(X)\) are homeomorphic if \(\Lambda(X)\) is of the second category.

There are more properties that \(X/N\) inherits from \(X\), for example normability, metrizability and local convexity. In particular, if \(X\) is Banach, then \(X/N\) is Banach as well. To see this, it suffices to define the quotient norm by \[ \lVert \pi(x) \rVert = \inf\{\lVert x-z \rVert:z \in N\}. \]
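To make the quotient norm concrete, here is a sketch in the simplest setting I could think of: \(X=\mathbb{R}^2\) with the Euclidean norm and \(N\) the line spanned by \((1,1)\). These choices, the ternary search and the name `quotient_norm` are all assumptions made for illustration. In this case the infimum is just the distance from \(x\) to the line, namely \(|x_1-x_2|/\sqrt{2}\).

```python
import math

def quotient_norm(x, direction=(1.0, 1.0), lo=-1e3, hi=1e3, iters=200):
    """inf over t of ||x - t * direction||_2, found by ternary search.

    Assumptions: N is the line spanned by `direction`, the minimiser lies in
    [lo, hi], and the objective is convex in t (true for any norm).
    """
    def dist(t):
        return math.hypot(x[0] - t * direction[0], x[1] - t * direction[1])
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if dist(m1) < dist(m2):
            hi = m2
        else:
            lo = m1
    return dist((lo + hi) / 2)

print(quotient_norm((3.0, 1.0)), abs(3.0 - 1.0) / math.sqrt(2))
print(quotient_norm((8.0, 6.0)))   # same coset as (3, 1), shifted by 5 * (1, 1)
print(quotient_norm((2.0, 2.0)))   # an element of N itself
```

The second call illustrates that the value depends only on the coset \(\pi(x)\), and the third that elements of \(N\) have quotient norm zero.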