The abc Theorem of Polynomials

In this post we show the Mason-Stothers theorem, the so-called $abc$ theorem for polynomials, and derive Fermat's Last theorem and Davenport's inequality for polynomials. These three theorems correspond to the $abc$ conjecture, Fermat's Last Theorem and Hall's conjecture in number theory.

The Fourier Transform of exp(-cx^2) and Its Convolution

We develop two almost straightforward ways to compute the Fourier transform of $\exp(-cx^2)$, in the sense that neither contour integration nor the calculus of residues is required. The first approach invites us to think more deeply about these elementary concepts, so I highly recommend studying it, provided you are familiar with first-order ODEs.

The Banach Algebra of Borel Measures on Euclidean Space

This blog post is intended to deliver a quick explanation of the algebra of Borel measures on \(\mathbb{R}^n\). It will be broken into pieces. The complex Borel measures on \(\mathbb{R}^n\), denoted \(M(\mathbb{R}^n)\), clearly form a vector space over \(\mathbb{C}\). The main goal of this post is to show that this is a Banach space and also a Banach algebra.

In fact, the \(\mathbb{R}^n\) case can be generalised to any locally compact abelian group (see any book on abstract harmonic analysis), because what really matters here is being locally compact and abelian. But for the moment we stick to Euclidean spaces. Note that since \(\mathbb{R}^n\) is \(\sigma\)-compact, all complex Borel measures on it are regular.

To read this post you need to be familiar with some basic properties of Banach algebras, complex Borel measures, and, most importantly, Fubini's theorem.

Dedekind Domain and Properties in an Elementary Approach

You can find material on Dedekind domains (or Dedekind rings) in almost all algebraic number theory books. But many properties can be proved inside ring theory. I hope you can find the solution you need in this post, and this post will not go further than elementary ring theory. With that being said, you are assumed to have enough knowledge of rings and rings of fractions (this post serves well), but not too much mathematical maturity is assumed (at the very least you are assumed to be familiar with the terminology in the linked post).\(\def\mb{\mathbb}\) \(\def\mfk{\mathfrak}\)

Definition

There are several ways to define a Dedekind domain, since there are several equivalent characterisations of it. We will start from the one based on the ring of fractions. As a friendly reminder, \(\mb{Z}\) or any principal ideal domain is already a Dedekind domain. In fact, a Dedekind domain may be viewed as a generalization of a principal ideal domain.

Let \(\mfk{o}\) be an integral domain (a.k.a. entire ring), and \(K\) be its quotient field. A Dedekind domain is an integral domain \(\mfk{o}\) such that the fractional ideals form a group under multiplication. Let's have a breakdown. By a fractional ideal \(\mfk{a}\) we mean a nontrivial additive subgroup of \(K\) such that

  • \(\mfk{o}\mfk{a}=\mfk{a}\),
  • there exists some nonzero element \(c \in \mfk{o}\) such that \(c\mfk{a} \subset \mfk{o}\).

What does the group look like? As you may guess, the unit element is \(\mfk{o}\). For a fractional ideal \(\mfk{a}\), the inverse is another fractional ideal \(\mfk{b}\) such that \(\mfk{ab}=\mfk{ba}=\mfk{o}\). Note we regard \(\mfk{o}\) as a subring of \(K\). For \(a \in \mfk{o}\), we treat it as \(a/1 \in K\). This makes sense because the map \(i:a \mapsto a/1\) is injective. As for the existence of \(c\), you may consider it as a restriction that the 'denominator' is bounded. Alternatively, when \(\mfk{o}\) is Noetherian, a fractional ideal is the same thing as a nonzero finitely generated \(\mfk{o}\)-submodule of \(K\). But in this post it is not assumed that you have learned module theory.

Let's take \(\mb{Z}\) as an example. The quotient field of \(\mb{Z}\) is \(\mb{Q}\). Fix a prime \(p\). We have a fractional ideal \(P\) whose elements are all of the type \(\frac{np}{2}\) with \(n \in \mb{Z}\). Then indeed we have \(\mb{Z}P=P\). On the other hand, taking \(2 \in \mb{Z}\), we have \(2P \subset \mb{Z}\). For its inverse we can take the fractional ideal \(Q\) whose elements are all of the type \(\frac{2n}{p}\). As proved in algebraic number theory, the ring of algebraic integers in a number field is a Dedekind domain.

Before we go on we need to clarify the definition of ideal multiplication. Let \(\mfk{a}\) and \(\mfk{b}\) be two ideals, we define \(\mfk{ab}\) to be the set of all sums

\[ x_1y_1+\cdots+x_ny_n \]

where \(x_i \in \mfk{a}\) and \(y_i \in \mfk{b}\). Here the number \(n\) is finite but not fixed. Alternatively we can say \(\mfk{ab}\) contains all finite sums of products of elements of \(\mfk{a}\) and \(\mfk{b}\).
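To see the definition at work in the simplest Dedekind domain \(\mb{Z}\): every ideal of \(\mb{Z}\) has the form \(a\mb{Z}\), and the definition above gives \((a\mb{Z})(b\mb{Z})=ab\mb{Z}\). The following Python sketch checks this on finite samples. The function names and the sampling bound are illustrative assumptions, not part of the theory.

```python
from itertools import product
from math import gcd


def sample_ideal(generator, bound):
    """Finite sample {n * generator : |n| <= bound} of the ideal generator*Z."""
    return [n * generator for n in range(-bound, bound + 1)]


def product_ideal_generator(a, b, bound=5):
    """Nonnegative generator of the ideal spanned by the sampled products x*y."""
    g = 0
    for x, y in product(sample_ideal(a, bound), sample_ideal(b, bound)):
        # gcd(g, x*y) generates the same ideal of Z as g and x*y together.
        g = gcd(g, x * y)
    return g


print(product_ideal_generator(4, 6))  # 24, i.e. (4Z)(6Z) = 24Z
```

Every finite sum of products is a multiple of this gcd, and the gcd is itself an integer combination of products, so it generates the product ideal.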

Propositions

(Proposition 1) A Dedekind domain \(\mfk{o}\) is Noetherian.

By Noetherian ring we mean that every ideal in a ring is finitely generated. Precisely, we will prove that for every ideal \(\mfk{a} \subset \mfk{o}\) there are \(a_1,a_2,\cdots,a_n \in \mfk{a}\) such that, for every \(r \in \mfk{a}\), we have an expression

\[ r = c_1a_1 + c_2a_2 + \cdots + c_na_n \qquad c_1,c_2,\cdots,c_n \in \mfk{o}. \]

Also note that any ideal \(\mfk{a} \subset \mfk{o}\) can be viewed as a fractional ideal.

Proof. Let \(\mfk{a}\) be a nonzero ideal of \(\mfk{o}\) and let \(K\) be the quotient field of \(\mfk{o}\). Since \(\mfk{oa}=\mfk{a}\), we may also view \(\mfk{a}\) as a fractional ideal. Since \(\mfk{o}\) is a Dedekind domain, the fractional ideals of \(\mfk{o}\) form a group, so there is a fractional ideal \(\mfk{b}\) such that \(\mfk{ab}=\mfk{ba}=\mfk{o}\). Since \(1 \in \mfk{o}=\mfk{ab}\), there exist some \(a_1,a_2,\cdots, a_n \in \mfk{a}\) and \(b_1,b_2,\cdots,b_n \in \mfk{b}\) such that \(\sum_{i = 1 }^{n}a_ib_i=1\). For any \(r \in \mfk{a}\) we have \(rb_i \in \mfk{ab}=\mfk{o}\), and hence an expression

\[ r = rb_1a_1+rb_2a_2+\cdots+rb_na_n. \]

On the other hand, any element of the form \(c_1a_1+c_2a_2+\cdots+c_na_n\), by definition, is an element of \(\mfk{a}\). \(\blacksquare\)

From now on, the inverse of a fractional ideal \(\mfk{a}\) will be written as \(\mfk{a}^{-1}\).

(Proposition 2) For ideals \(\mfk{a},\mfk{b} \subset \mfk{o}\), \(\mfk{b}\subset\mfk{a}\) if and only if there exists some ideal \(\mfk{c}\) such that \(\mfk{ac}=\mfk{b}\) (or we simply say \(\mfk{a}|\mfk{b}\)).

Proof. If \(\mfk{b}=\mfk{ac}\), simply note that \(\mfk{ac} \subset \mfk{a} \cap \mfk{c} \subset \mfk{a}\). For the converse, suppose that \(\mfk{a} \supset \mfk{b}\). Then \(\mfk{c}=\mfk{a}^{-1}\mfk{b}\) is an ideal of \(\mfk{o}\) since \(\mfk{c}=\mfk{a}^{-1}\mfk{b} \subset \mfk{a}^{-1}\mfk{a}=\mfk{o}\), hence we may write \(\mfk{b}=\mfk{a}\mfk{c}\). \(\blacksquare\)

(Proposition 3) If \(\mfk{a}\) is an ideal of \(\mfk{o}\), then there are prime ideals \(\mfk{p}_1,\mfk{p}_2,\cdots,\mfk{p}_n\) such that

\[ \mfk{a}=\mfk{p}_1\mfk{p}_2\cdots\mfk{p}_n. \]

Proof. For this problem we use a classical technique: contradiction on maximality. Suppose this is not true, and let \(\mfk{A}\) be the set of ideals of \(\mfk{o}\) that cannot be written as a product of prime ideals. By assumption \(\mfk{A}\) is non-empty. Since, as we have proved, \(\mfk{o}\) is Noetherian, we can pick a maximal element \(\mfk{a}\) of \(\mfk{A}\) with respect to inclusion. If \(\mfk{a}\) is a maximal ideal, then since all maximal ideals are prime, \(\mfk{a}\) itself is prime, which is absurd. Otherwise \(\mfk{a}\) is properly contained in a maximal ideal \(\mfk{m}\), and we write \(\mfk{a}=\mfk{m}\mfk{m}^{-1}\mfk{a}\). Here \(\mfk{m}^{-1}\mfk{a}\) is an ideal of \(\mfk{o}\) because \(\mfk{a} \subset \mfk{m}\), and \(\mfk{m}^{-1}\mfk{a} \supsetneq \mfk{a}\) since if not, we have \(\mfk{a}=\mfk{ma}\), which implies that \(\mfk{m}=\mfk{o}\). By the maximality of \(\mfk{a}\), \(\mfk{m}^{-1}\mfk{a}\not\in\mfk{A}\), hence it can be written as a product of prime ideals. But \(\mfk{m}\) is prime as well, so we have a prime factorization for \(\mfk{a}\), contradicting the definition of \(\mfk{A}\).

Next we show unicity up to a permutation. If

\[ \mfk{p}_1\mfk{p}_2\cdots\mfk{p}_k=\mfk{q}_1\mfk{q}_2\cdots\mfk{q}_j, \]

since \(\mfk{p}_1\mfk{p}_2\cdots\mfk{p}_k\subset\mfk{p}_1\) and \(\mfk{p}_1\) is prime, we may assume that \(\mfk{q}_1 \subset \mfk{p}_1\). By the property of fractional ideal we have \(\mfk{q}_1=\mfk{p}_1\mfk{r}_1\) for some fractional ideal \(\mfk{r}_1\). However we also have \(\mfk{q}_1 \subset \mfk{r}_1\). Since \(\mfk{q}_1\) is prime, we either have \(\mfk{q}_1 \supset \mfk{p}_1\) or \(\mfk{q}_1 \supset \mfk{r}_1\). In the former case we get \(\mfk{p}_1=\mfk{q}_1\), and we finish the proof by continuing inductively. In the latter case we have \(\mfk{r}_1=\mfk{q}_1=\mfk{p}_1\mfk{q}_1\), which shows that \(\mfk{p}_1=\mfk{o}\), which is impossible. \(\blacksquare\)
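For a concrete instance of Proposition 3, take \(\mfk{o}=\mb{Z}\): the nonzero prime ideals are \(p\mb{Z}\) with \(p\) prime, so factoring the ideal \(n\mb{Z}\) amounts to factoring the integer \(n\). The Python sketch below uses plain trial division; the function name is an illustrative assumption.

```python
def prime_ideal_factors(n):
    """Primes p, with multiplicity, such that nZ is the product of the ideals pZ."""
    n = abs(n)  # n and -n generate the same ideal
    if n < 2:
        raise ValueError("nZ must be a nonzero proper ideal, i.e. |n| >= 2")
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)  # the remaining cofactor is prime
    return factors


print(prime_ideal_factors(360))  # [2, 2, 2, 3, 3, 5]
```

The output lists the primes in increasing order, which reflects uniqueness of the factorization up to permutation.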

(Proposition 4) Every nontrivial prime ideal \(\mfk{p}\) is maximal.

Proof. Let \(\mfk{m}\) be a maximal ideal containing \(\mfk{p}\). By proposition 2 we have some \(\mfk{c}\) such that \(\mfk{p}=\mfk{mc}\). If \(\mfk{m} \neq \mfk{p}\), then \(\mfk{c} \neq \mfk{o}\), and we may write \(\mfk{c}=\mfk{p}_1\cdots\mfk{p}_n\), hence \(\mfk{p}=\mfk{m}\mfk{p}_1\cdots\mfk{p}_n\), which is a prime factorisation, contradicting the fact that \(\mfk{p}\) has a unique prime factorisation, which is \(\mfk{p}\) itself. Hence any maximal ideal containing \(\mfk{p}\) is \(\mfk{p}\) itself. \(\blacksquare\)

(Proposition 5) Suppose the Dedekind domain \(\mfk{o}\) only contains one prime (and maximal) ideal \(\mfk{p}\), let \(t \in \mfk{p}\) and \(t \not\in \mfk{p}^2\), then \(\mfk{p}\) is generated by \(t\).

Proof. Let \(\mfk{t}\) be the ideal generated by \(t\). By proposition 3 we have a factorisation

\[ \mfk{t}=\mfk{p}^n \]

for some \(n\) since \(\mfk{o}\) contains only one prime ideal. According to proposition 2, if \(n \geq 3\), we write \(\mfk{p}^n=\mfk{p}^2\mfk{p}^{n-2}\) and see \(\mfk{p}^2 \supset \mfk{p}^n\). But this is impossible, since then \(t \in \mfk{p}^n \subset \mfk{p}^2\), contradicting our assumption. Hence \(0<n<3\). But if \(n=2\) we have \(t \in \mfk{p}^2\), which is also not possible. So \(\mfk{t}=\mfk{p}\) provided that such \(t\) exists.

For the existence of \(t\), note if not, then for all \(t \in \mfk{p}\) we have \(t \in \mfk{p}^2\), hence \(\mfk{p} \subset \mfk{p}^2\). On the other hand we already have \(\mfk{p}^2 = \mfk{p}\mfk{p}\), which implies that \(\mfk{p}^2 \subset \mfk{p}\) (proposition 2), hence \(\mfk{p}^2=\mfk{p}\), contradicting proposition 3. Hence such \(t\) exists and our proof is finished. \(\blacksquare\)

Characterisation of Dedekind domain

In fact there is another equivalent definition of Dedekind domain:

A domain \(\mfk{o}\) is Dedekind if and only if

  • \(\mfk{o}\) is Noetherian.
  • \(\mfk{o}\) is integrally closed.
  • \(\mfk{o}\) has Krull dimension \(1\) (i.e. every non-zero prime ideal is maximal).

This is equivalent to saying that the fractional ideals form a group, and it is frequently used by mathematicians as well. But we need some more advanced techniques to establish the equivalence. Presumably there will be a post about this in the future.

A Continuous Function Sending L^p Functions to L^1

Throughout, let \((X,\mathfrak{M},\mu)\) be a measure space where \(\mu\) is positive.

The question

If \(f\) is of \(L^p(\mu)\), which means \(\lVert f \rVert_p=\left(\int_X |f|^p d\mu\right)^{1/p}<\infty\), or equivalently \(\int_X |f|^p d\mu<\infty\), then we may say \(|f|^p\) is of \(L^1(\mu)\). In other words, we have a function \[ \begin{aligned} \lambda: L^p(\mu) &\to L^1(\mu) \\ f &\mapsto |f|^p. \end{aligned} \] This function does not have to be one to one due to absolute value. But we hope this function to be fine enough, at the very least, we hope it is continuous.

Here, \(f \sim g\) means that \(f-g\) equals \(0\) almost everywhere with respect to \(\mu\). It can be easily verified that this is an equivalence relation.

Continuity

We still use the \(\varepsilon-\delta\) argument but it's in a metric space. Suppose \((X,d_1)\) and \((Y,d_2)\) are two metric spaces and \(f:X \to Y\) is a function. We say \(f\) is continuous at \(x_0 \in X\) if, for any \(\varepsilon>0\), there exists some \(\delta>0\) such that \(d_2(f(x_0),f(x))<\varepsilon\) whenever \(d_1(x_0,x)<\delta\). Further, we say \(f\) is continuous on \(X\) if \(f\) is continuous at every point \(x \in X\).

Metrics

For \(1\leq p<\infty\), we already have a metric given by \[ d(f,g)=\lVert f-g \rVert_p, \] noting that \(d(f,g)=0\) if and only if \(f \sim g\). This metric is complete and makes \(L^p\) a Banach space. But for \(0<p<1\) (yes, we are going to cover that), things are very different, and there is one reason: the Minkowski inequality is reversed! In fact, for nonnegative \(f\) and \(g\) we have \[ \lVert f+g \rVert_p \geq \lVert f \rVert_p + \lVert g \rVert_p \] when \(0<p<1\). The \(L^p\) spaces behave strangely when \(0<p<1\). Precisely,

For \(0<p<1\), \(L^p(\mu)\) is locally convex if and only if \(\mu\) assumes finitely many values. (Proof.)

On the other hand, if, for example, \(X=[0,1]\) and \(\mu=m\) is the Lebesgue measure, then \(L^p(\mu)\) has no open convex subset other than \(\varnothing\) and \(L^p(\mu)\) itself. However,

A topological vector space \(X\) is normable if and only if its origin has a convex bounded neighbourhood. (See Kolmogorov's normability criterion.)

Therefore \(L^p(m)\) is not normable, hence not Banach.

We have gone too far. We need a metric that is fine enough.

Metric of \(L^p\) when \(0<p<1\)

Define \[ \Delta(f)=\int_X |f|^p d\mu \] for \(f \in L^p(\mu)\). We will show that we have a metric by \[ d(f,g)=\Delta(f-g). \] Fix \(y\geq 0\), and consider the function \[ f(x)=(x+y)^p-x^p. \] We have \(f(0)=y^p\) and \[ f'(x)=p(x+y)^{p-1}-px^{p-1} \leq px^{p-1}-px^{p-1}=0 \] when \(x > 0\) (recall that \(p-1<0\)), and hence \(f(x)\) is nonincreasing on \([0,\infty)\), which implies that \[ (x+y)^p \leq x^p+y^p. \] Hence for any \(f\), \(g \in L^p\), we have \[ \Delta(f+g)=\int_X |f+g|^p d\mu \leq \int_X |f|^p d\mu + \int_X |g|^p d\mu=\Delta(f)+\Delta(g). \] This inequality ensures that \[ d(f,g)=\Delta(f-g) \] is a metric. It's immediate that \(d(f,g)=d(g,f) \geq 0\) for all \(f\), \(g \in L^p(\mu)\), and that \(d(f,g)=0\) if and only if \(f \sim g\). For the triangle inequality, note that \[ d(f,h)+d(h,g)=\Delta(f-h)+\Delta(h-g) \geq \Delta((f-h)+(h-g))=\Delta(f-g)=d(f,g). \] This metric is translation-invariant as well, since \[ d(f+h,g+h)=\Delta(f+h-g-h)=\Delta(f-g)=d(f,g). \] The completeness can be verified in the same way as in the case \(p \geq 1\). In fact, this metric makes \(L^p\) a locally bounded F-space.
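As a numerical sanity check of the triangle inequality, here is a small Python sketch. It assumes the simplest possible setting, the counting measure on a finite set, so that a function is just a list of numbers and \(\Delta(f)=\sum_i |f_i|^p\); the helper names are hypothetical.

```python
import random


def delta(f, p):
    """Delta(f): the integral of |f|^p with respect to counting measure."""
    return sum(abs(v) ** p for v in f)


def dist(f, g, p):
    """d(f, g) = Delta(f - g)."""
    return delta([a - b for a, b in zip(f, g)], p)


def triangle_holds(f, g, h, p, tol=1e-12):
    """Check d(f, g) <= d(f, h) + d(h, g) up to floating-point tolerance."""
    return dist(f, g, p) <= dist(f, h, p) + dist(h, g, p) + tol


rng = random.Random(0)
samples = [[rng.uniform(-3, 3) for _ in range(6)] for _ in range(3)]
print(triangle_holds(*samples, p=0.5))
```

A check like this cannot replace the proof, but it is a quick way to catch a wrong exponent or a reversed inequality.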

The continuity of \(\lambda\)

The metric of \(L^1\) is defined by \[ d_1(f,g)=\lVert f-g \rVert_1=\int_X |f-g|d\mu. \] We need to find a relation between \(d_p(f,g)\) and \(d_1(\lambda(f),\lambda(g))\), where \(d_p\) is the metric of the corresponding \(L^p\) space.

\(0<p<1\)

As we have proved, for \(x,y \geq 0\), \[ (x+y)^p \leq x^p+y^p. \] Without loss of generality we assume \(x \geq y \geq 0\) and therefore \[ x^p=(x-y+y)^p \leq (x-y)^p+y^p. \] Hence \[ x^p-y^p \leq (x-y)^p. \] By interchanging \(x\) and \(y\), we get \[ |x^p-y^p| \leq |x-y|^p. \] Replacing \(x\) and \(y\) with \(|f|\) and \(|g|\) where \(f\), \(g \in L^p\), and using \(\left||f|-|g|\right| \leq |f-g|\), we get \[ \int_{X}\lvert |f|^p-|g|^p \rvert d\mu \leq \int_X |f-g|^p d\mu. \] But \[ \begin{aligned} d_1(\lambda(f),\lambda(g))&=\int_{X}\lvert |f|^p-|g|^p \rvert d\mu, \\ d_p(f,g)&=\Delta(f-g)=\int_X |f-g|^p d\mu, \end{aligned} \] and we therefore have \[ d_1(\lambda(f),\lambda(g)) \leq d_p(f,g). \] Hence \(\lambda\) is continuous (and in fact, Lipschitz continuous and uniformly continuous) when \(0<p<1\).
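The Lipschitz estimate can also be tested numerically. The sketch below again assumes the counting measure on a finite set (functions are lists of reals), and the function name is hypothetical; it returns \(d_p(f,g)-d_1(\lambda(f),\lambda(g))\), which should never be negative.

```python
import random


def lipschitz_gap(f, g, p):
    """Return d_p(f, g) - d_1(lambda(f), lambda(g)) for counting measure, 0 < p < 1."""
    d1 = sum(abs(abs(a) ** p - abs(b) ** p) for a, b in zip(f, g))
    dp = sum(abs(a - b) ** p for a, b in zip(f, g))
    return dp - d1


rng = random.Random(0)
f = [rng.uniform(-2, 2) for _ in range(5)]
g = [rng.uniform(-2, 2) for _ in range(5)]
print(lipschitz_gap(f, g, 0.3) >= 0)
```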

\(1 \leq p < \infty\)

It's natural to think about Minkowski's inequality and Hölder's inequality in this case, since they are the standard tools for estimating \(L^p\) norms. You need to think about how to create the conditions to use them and get a fine result. In this section we first need to prove that, for \(x,y \geq 0\), \[ |x^p-y^p| \leq p|x-y|(x^{p-1}+y^{p-1}). \] This inequality is surprisingly easy to prove, however. We will use nothing but the mean value theorem. Without loss of generality we assume that \(x > y \geq 0\) and define \(f(t)=t^p\). Then \[ \frac{f(x)-f(y)}{x-y}=f'(\zeta)=p\zeta^{p-1} \] where \(y < \zeta < x\). But since \(p-1 \geq 0\), we see \(\zeta^{p-1} \leq x^{p-1} \leq x^{p-1}+y^{p-1}\). Therefore \[ f(x)-f(y)=x^p-y^p=p(x-y)\zeta^{p-1} \leq p(x-y)(x^{p-1}+y^{p-1}). \] For \(x=y\) the equality holds.


Therefore \[ \begin{aligned} d_1(\lambda(f),\lambda(g)) &= \int_X \left||f|^p-|g|^p\right|d\mu \\ &\leq \int_Xp\left||f|-|g|\right|(|f|^{p-1}+|g|^{p-1})d\mu. \end{aligned} \] For \(p=1\) the right-hand side is at most \(2\lVert f-g \rVert_1\), so assume \(p>1\) and let \(q\) be the conjugate exponent of \(p\). By Hölder's inequality, we have \[ \begin{aligned} \int_X \left||f|-|g|\right|(|f|^{p-1}+|g|^{p-1})d\mu & \leq \left[\int_X \left||f|-|g|\right|^pd\mu\right]^{1/p}\left[\int_X\left(|f|^{p-1}+|g|^{p-1}\right)^qd\mu\right]^{1/q} \\ &\leq \left[\int_X \left|f-g\right|^pd\mu\right]^{1/p}\left[\int_X\left(|f|^{p-1}+|g|^{p-1}\right)^qd\mu\right]^{1/q} \\ &=\lVert f-g \rVert_p \left[\int_X\left(|f|^{p-1}+|g|^{p-1}\right)^qd\mu\right]^{1/q}. \end{aligned} \] By Minkowski's inequality, we have \[ \left[\int_X\left(|f|^{p-1}+|g|^{p-1}\right)^qd\mu\right]^{1/q} \leq \left[\int_X|f|^{(p-1)q}d\mu\right]^{1/q}+\left[\int_X |g|^{(p-1)q}d\mu\right]^{1/q}. \] Now things are clear. Since \(1/p+1/q=1\), or equivalently \(1/q=(p-1)/p\), we have \((p-1)q=p\). Suppose \(\lVert f \rVert_p\), \(\lVert g \rVert_p \leq R\); then \[ \left[\int_X|f|^{(p-1)q}d\mu\right]^{1/q}+\left[\int_X |g|^{(p-1)q}d\mu\right]^{1/q} = \lVert f \rVert_p^{p-1}+\lVert g \rVert_p^{p-1} \leq 2R^{p-1}. \] Combining the inequalities above, we get \[ \begin{aligned} d_1(\lambda(f),\lambda(g)) \leq 2pR^{p-1}\lVert f-g \rVert_p =2pR^{p-1}d_p(f,g). \end{aligned} \] Given \(f\), every \(g\) with \(d_p(f,g)<1\) satisfies \(\lVert g \rVert_p \leq \lVert f \rVert_p+1\), so we may take \(R=\lVert f \rVert_p+1\); hence \(\lambda\) is continuous.
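The final estimate \(d_1(\lambda(f),\lambda(g)) \leq 2pR^{p-1}\lVert f-g \rVert_p\) can be checked numerically as well. The Python sketch below assumes the counting measure on a finite set and takes \(R=\max(\lVert f \rVert_p,\lVert g \rVert_p)\); the function names are hypothetical.

```python
import random


def norm_p(f, p):
    """The L^p norm of f with respect to counting measure (p >= 1)."""
    return sum(abs(v) ** p for v in f) ** (1.0 / p)


def d1_of_lambda(f, g, p):
    """d_1(lambda(f), lambda(g)): the integral of | |f|^p - |g|^p |."""
    return sum(abs(abs(a) ** p - abs(b) ** p) for a, b in zip(f, g))


def holder_bound(f, g, p):
    """The bound 2 p R^(p-1) ||f - g||_p with R = max(||f||_p, ||g||_p)."""
    radius = max(norm_p(f, p), norm_p(g, p))
    diff = [a - b for a, b in zip(f, g)]
    return 2 * p * radius ** (p - 1) * norm_p(diff, p)


rng = random.Random(0)
f = [rng.uniform(-2, 2) for _ in range(5)]
g = [rng.uniform(-2, 2) for _ in range(5)]
print(d1_of_lambda(f, g, 2.0) <= holder_bound(f, g, 2.0))
```

The bound depends on \(R\), so it gives Lipschitz continuity only on bounded sets, in contrast to the global estimate for \(0<p<1\).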

Conclusion and further

We have proved that \(\lambda\) is continuous, and when \(0<p<1\), we have seen that \(\lambda\) is Lipschitz continuous. It's natural to think about its differentiability afterwards, but the absolute value function is not even differentiable so we may have no chance. But this is still a fine enough result. For example we have no restriction to \((X,\mathfrak{M},\mu)\) other than the positivity of \(\mu\). Therefore we may take \(\mathbb{R}^n\) as the Lebesgue measure space here, or we can take something else.

It's also interesting how we use elementary Calculus to solve some much more abstract problems.

Basic Facts of Semicontinuous Functions

Continuity

We restrict ourselves to \(\mathbb{R}\) endowed with the usual topology. Recall that a function is continuous if and only if for any open set \(U \subset \mathbb{R}\), we have \[ \{x:f(x) \in U\}=f^{-1}(U) \]

to be open. One can rewrite this statement using \(\varepsilon-\delta\) language. To say a function \(f: \mathbb{R} \to \mathbb{R}\) is continuous at \(x\), we mean that for any \(\varepsilon>0\), there exists some \(\delta>0\) such that for \(t \in (x-\delta,x+\delta)\), we have \[ |f(x)-f(t)|<\varepsilon. \] \(f\) is continuous on \(\mathbb{R}\) if and only if \(f\) is continuous at every point of \(\mathbb{R}\).

If \((x-\delta,x+\delta)\) is replaced with \((x-\delta,x)\) or \((x,x+\delta)\), we get left continuity and right continuity, one of which plays an important role in probability theory.

But the problem is that continuity is sometimes too strong a restriction, while the 'direction' attached to left/right continuity is unnecessary as well. For example the function \[ f(x)=\chi_{(0,1)}(x) \] is neither left nor right continuous (globally), yet it is a perfectly natural function. Left/right continuity is not a perfectly weakened version of continuity. We need something different.

Definition of semicontinuous

Let \(f\) be a real (or extended-real) function on \(\mathbb{R}\). The semicontinuity of \(f\) is defined as follows.

If \[ \{x:f(x)>\alpha\} \] is open for all real \(\alpha\), we say \(f\) is lower semicontinuous.

If \[ \{x:f(x)<\alpha\} \] is open for all real \(\alpha\), we say \(f\) is upper semicontinuous.

Is it possible to rewrite these definitions à la \(\varepsilon-\delta\)? The answer is yes if we restrict ourselves to metric spaces.

\(f: \mathbb{R} \to \mathbb{R}\) is upper semicontinuous at \(x\) if, for every \(\varepsilon>0\), there exists some \(\delta>0\) such that for \(t \in (x-\delta,x+\delta)\), we have \[ f(t)<f(x)+\varepsilon \]

\(f: \mathbb{R} \to \mathbb{R}\) is lower semicontinuous at \(x\) if, for every \(\varepsilon>0\), there exists some \(\delta>0\) such that for \(t \in (x-\delta,x+\delta)\), we have \[ f(t)>f(x)-\varepsilon \]

Of course, \(f\) is upper/lower semicontinuous on \(\mathbb{R}\) if and only if it is so at every point of \(\mathbb{R}\). One shall find no difference between the definitions in different styles.

Relation with continuous functions

Here is another way to see it. For the continuity of \(f\), we are looking for arbitrary open subsets \(V\) of \(\mathbb{R}\), and \(f^{-1}(V)\) is expected to be open. For the lower/upper semicontinuity of \(f\), however, the open sets are restricted to be like \((\alpha,+\infty]\) and \([-\infty,\alpha)\). Since all open sets of \(\mathbb{R}\) can be generated by the union or intersection of sets like \([-\infty,\alpha)\) and \((\beta,+\infty]\), we immediately get

\(f\) is continuous if and only if \(f\) is both upper semicontinuous and lower semicontinuous.

Proof. If \(f\) is continuous, then for any \(\alpha \in \mathbb{R}\), we see \([-\infty,\alpha)\) is open, and therefore \[ f^{-1}([-\infty,\alpha)) \] has to be open. The upper semicontinuity is proved. The lower semicontinuity of \(f\) is proved in the same manner.

If \(f\) is both upper and lower semicontinuous, we see \[ f^{-1}((\alpha,\beta))=f^{-1}([-\infty,\beta)) \cap f^{-1}((\alpha,+\infty]) \] is open. Since every open subset of \(\mathbb{R}\) can be written as a countable union of segments of the above types, we see for any open subset \(V\) of \(\mathbb{R}\), \(f^{-1}(V)\) is open. (If you have trouble with this part, it is recommended to review the definition of topology.) \(\square\)

Examples

There are two important examples.

  1. If \(E \subset \mathbb{R}\) is open, then \(\chi_E\) is lower semicontinuous.
  2. If \(F \subset \mathbb{R}\) is closed, then \(\chi_F\) is upper semicontinuous.

We will prove the first one. The second one follows in the same manner of course. For \(\alpha<0\), the set \(A=\chi_E^{-1}((\alpha,+\infty])\) is equal to \(\mathbb{R}\), which is open. For \(\alpha \geq 1\), since \(\chi_E \leq 1\), we see \(A=\varnothing\). For \(0 \leq \alpha < 1\) however, the set of \(x\) where \(\chi_E>\alpha\) has to be \(E\), which is still open.

When checking the semicontinuity of a function, we check from bottom to top or top to bottom. The function \(\chi_E\) is defined by \[ \chi_E(x)=\begin{cases} 1 \quad x \in E \\ 0 \quad x \notin E \end{cases}. \]

Addition of semicontinuous functions

If \(f_1\) and \(f_2\) are upper/lower semicontinuous, then so is \(f_1+f_2\).

Proof. We are going to prove this using different tools. Suppose first that both \(f_1\) and \(f_2\) are upper semicontinuous, and fix \(x \in \mathbb{R}\). For \(\varepsilon>0\), there exist some \(\delta_1>0\) and \(\delta_2>0\) such that \[ \begin{aligned} f_1(t) &< f_1(x)+\varepsilon/2, \quad t \in (x-\delta_1,x+\delta_1), \\ f_2(t) &< f_2(x) + \varepsilon/2, \quad t \in (x-\delta_2,x+\delta_2). \end{aligned} \] If we pick \(\delta=\min(\delta_1,\delta_2)\), then we see for all \(t \in (x-\delta,x+\delta)\), we have \[ f_1(t)+f_2(t)<f_1(x)+f_2(x)+\varepsilon. \] The upper semicontinuity of \(f_1+f_2\) is proved by considering all \(x \in \mathbb{R}\).

Now suppose both \(f_1\) and \(f_2\) are lower semicontinuous. We have an identity by \[ \{x:f_1+f_2>\alpha\}=\bigcup_{\beta\in\mathbb{R}}\{x:f_1>\beta\}\cap\{x:f_2>\alpha-\beta\}. \] The set on the right side is always open. Hence \(f_1+f_2\) is lower semicontinuous. \(\square\)


However, when there are infinitely many semicontinuous functions, things are different.

Let \(\{f_n\}\) be a sequence of nonnegative functions on \(\mathbb{R}\), then

  • If each \(f_n\) is lower semicontinuous, then so is \(\sum_{1}^{\infty}f_n\).
  • If each \(f_n\) is upper semicontinuous, then \(\sum_{1}^{\infty}f_n\) is not necessarily upper semicontinuous.

Proof. To prove this we again use the properties of open sets. Put \(g_n=\sum_{k=1}^{n}f_k\). Now suppose all \(f_k\) are lower semicontinuous. Since \(g_n\) is a finite sum of lower semicontinuous functions, each \(g_n\) is lower semicontinuous. Let \(f=\sum_{n}f_n\). As the \(f_k\) are non-negative, \(g_n\) increases to \(f\), so \(f(x)>\alpha\) if and only if there exists some \(n\) such that \(g_{n}(x)>\alpha\). Therefore \[ \{x:f(x)>\alpha\}=\bigcup_{n \geq 1}\{x:g_n(x)>\alpha\}. \] The set on the right-hand side is a union of open sets, hence open.

For the upper semicontinuity, it suffices to give a counterexample, but before that, we shall give the motivation.

As said, the characteristic function of a closed set is upper semicontinuous. Suppose \(\{E_n\}\) is a sequence of almost disjoint closed sets; then \(E=\cup_{n\geq 1}E_n\) is not necessarily closed, therefore \(\chi_E=\sum\chi_{E_n}\) (a.e.) is not necessarily upper semicontinuous. Now we give a concrete example. Put \(f_0=\chi_{[1,+\infty)}\) and \(f_n=\chi_{E_n}\) for \(n \geq 1\) where \[ E_n=\left\{x:\frac{1}{1+n} \leq x \leq \frac{1}{n}\right\}. \] For \(x > 0\), we have \(f=\sum_nf_n \geq 1\), while \(f(x)=0\) for \(x \leq 0\). Hence \(f^{-1}([-\infty,1))=(-\infty,0]\), which is not open. \(\square\)
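The counterexample can be evaluated exactly. The Python sketch below computes \(f(x)=\sum_{n \geq 0} f_n(x)\) with rational arithmetic by counting the integers \(n \geq 1\) with \(n \leq 1/x \leq n+1\) instead of summing infinitely many terms; the function name is hypothetical.

```python
from fractions import Fraction
from math import ceil, floor


def f(x):
    """Exact value of chi_[1, oo)(x) + sum over n >= 1 of chi_[1/(n+1), 1/n](x)."""
    x = Fraction(x)
    if x <= 0:
        return 0  # no E_n and no part of [1, oo) contains a nonpositive point
    total = 1 if x >= 1 else 0
    inverse = 1 / x
    # x lies in E_n exactly when n <= 1/x <= n + 1, with n >= 1.
    lowest = max(1, ceil(inverse - 1))
    highest = floor(inverse)
    total += max(0, highest - lowest + 1)
    return total


print(f(0), [f(Fraction(1, k)) for k in (1, 2, 10, 1000)])
```

The value at \(0\) is \(0\), while the values along \(1/k \to 0\) stay at least \(1\), so \(f(t)<f(0)+\varepsilon\) fails near \(0\) for small \(\varepsilon\).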

Notice that \(f\) can be defined on any topological space here.

Maximum and minimum

There is one fact we already know about continuous functions.

If \(X\) is compact, \(f: X \to \mathbb{R}\) is continuous, then there exists some \(a,b \in X\) such that \(f(a)=\min f(X)\), \(f(b)=\max f(X)\).

In fact, \(f(X)\) is compact still. But for semicontinuous functions, things will be different but reasonable. For upper semicontinuous functions, we have the following fact.

If \(X\) is compact and \(f: X \to (-\infty,+\infty)\) is upper semicontinuous, then there exists some \(a \in X\) such that \(f(a)=\max f(X)\).

Notice that \(X\) is not assumed to hold any other topological property. It can be Hausdorff or Lindelöf, but we are not asking for restrictions like this. The only property we will be using is that every open cover of \(X\) has a finite subcover. Of course, one can replace \(X\) with any compact subset of \(\mathbb{R}\), for example, \([a,b]\).

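Proof. Put \(\alpha=\sup f(X)\), and define \[ E_n=\left\{x:f(x)<\alpha-\frac{1}{n}\right\}. \] (If \(\alpha=+\infty\), use \(E_n=\{x:f(x)<n\}\) instead; the argument below is the same.) Each \(E_n\) is open since \(f\) is upper semicontinuous, and \(E_n \subset E_{n+1}\). If \(f\) attains no maximum, then for any \(x \in X\), there exists some \(n \geq 1\) such that \(f(x)<\alpha-\frac{1}{n}\). That is, \(x \in E_n\) for some \(n\). Therefore \(\bigcup_{n \geq 1}E_n\) covers \(X\). But this cover has no finite subcover of \(X\): otherwise \(X=E_N\) for some \(N\), so \(f<\alpha-\frac{1}{N}\) on \(X\), contradicting \(\alpha=\sup f(X)\). This is a contradiction since \(X\) is compact. \(\square\)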

Approximating integrable functions

This is a comprehensive application of several properties of semicontinuity.

(Vitali–Carathéodory theorem) Suppose \(f \in L^1(\mathbb{R})\), where \(f\) is a real-valued function. For \(\varepsilon>0\), there exist some functions \(u\) and \(v\) on \(\mathbb{R}\) such that \(u \leq f \leq v\), \(u\) is an upper semicontinuous function bounded above, and \(v\) is lower semicontinuous bounded below, and \[ \boxed{\int_{\mathbb{R}}(v-u)dm<\varepsilon} \]

It suffices to prove this theorem for \(f \geq 0\) (of course \(f\) is not identically equal to \(0\) since this case is trivial). Since \(f\) is the pointwise limit of an increasing sequence of simple functions \(s_n\), we can write \(f\) as \[ f=s_1+\sum_{n=2}^{\infty}(s_n-s_{n-1}). \] By putting \(t_1=s_1\), \(t_n=s_n-s_{n-1}\) for \(n \geq 2\), we get \(f=\sum_n t_n\). Since each \(t_n\) is a nonnegative simple function, we can write \(f\) as \[ f=\sum_{k=1}^{\infty}c_k\chi_{E_k} \] where \(c_k>0\) and \(E_k\) is measurable for all \(k\). Also, we have \[ \int_{\mathbb{R}} f dm = \sum_{k=1}^{\infty}c_km(E_k), \] and the series on the right-hand side converges (since \(f \in L^1\)). By the properties of Lebesgue measure, there exist a compact set \(F_k\) and an open set \(V_k\) such that \(F_k \subset E_k \subset V_k\) and \(c_km(V_k-F_k)<\frac{\varepsilon}{2^{k+1}}\). Put \[ v=\sum_{k=1}^{\infty}c_k\chi_{V_k},\quad u=\sum_{k=1}^{N}c_k\chi_{F_k} \] (now you can see \(v\) is lower semicontinuous and \(u\) is upper semicontinuous). Here \(N\) is chosen in such a way that \[ \sum_{k=N+1}^{\infty}c_km(E_k)<\frac{\varepsilon}{2}. \] Since \(V_k \supset E_k\), we have \(\chi_{V_k} \geq \chi_{E_k}\). Therefore \(v \geq f\). Similarly, \(f \geq u\). Now we need to check the desired integral inequality. A simple recombination shows that \[ \begin{aligned} v-u&=\sum_{k=1}^{\infty}c_k\chi_{V_k}-\sum_{k=1}^{N}c_k\chi_{F_k} \\ &\leq \sum_{k=1}^{\infty}c_k\chi_{V_k}-\sum_{k=1}^{N}c_k\chi_{F_k}+\sum_{k=N+1}^{\infty}c_k(\chi_{E_k}-\chi_{F_k}) \\ &=\sum_{k=1}^{\infty}c_k(\chi_{V_k}-\chi_{F_k})+\sum_{k=N+1}^{\infty}c_k\chi_{E_k}. \end{aligned} \] If we integrate the function above, we get \[ \begin{aligned} \int_{\mathbb{R}}(v-u)dm &\leq \sum_{k=1}^{\infty}c_km(V_k-F_k)+\sum_{k=N+1}^{\infty}c_km(E_k) \\ &< \sum_{k=1}^{\infty}\frac{\varepsilon}{2^{k+1}}+\frac{\varepsilon}{2} \\ &=\varepsilon. \end{aligned} \] This proves the case when \(f \geq 0\).

In the general case, we write \(f=f^{+}-f^{-}\). Attach the semicontinuous functions to \(f^{+}\) and \(f^{-}\) respectively by \(u_1 \leq f^{+} \leq v_1\) and \(u_2 \leq f^{-} \leq v_2\). Put \(u=u_1-v_2\), \(v=v_1-u_2\). As we can see, \(u\) is upper semicontinuous and \(v\) is lower semicontinuous. Also, \(u \leq f \leq v\) with the desired property since \[ \int_\mathbb{R}(v-u)dm=\int_\mathbb{R}(v_1-u_1)dm+\int_\mathbb{R}(v_2-u_2)dm<2\varepsilon, \] and the theorem follows since \(\varepsilon\) is arbitrary. \(\square\)

Generalisation

Indeed, the only property of the measure used is the existence of \(F_k\) and \(V_k\). The domain \(\mathbb{R}\) here can be replaced with \(\mathbb{R}^k\) for \(1 \leq k < \infty\), and \(m\) replaced with the respective \(m_k\). Much more generally, the domain can be replaced by any locally compact Hausdorff space \(X\) and the measure by any measure associated with the Riesz-Markov-Kakutani representation theorem on \(C_c(X)\).

Is the reverse approximation always possible?

The answer is no. Consider the fat Cantor set \(K\), which has Lebesgue measure \(\frac{1}{2}\). We shall show that \(\chi_K\) can not be approximated below by a lower semicontinuous function.

If \(v\) is a lower semicontinuous function such that \(v \leq \chi_K\), then \(v \leq 0\).

Proof. Consider the set \(V=v^{-1}((0,1])=v^{-1}((0,+\infty))\). Since \(v \leq \chi_K\), we have \(V \subset K\). We will show that \(V\) has to be empty.

Suppose there were some \(t \in V\). Since \(V\) is open, there exists some neighbourhood \(U\) of \(t\) such that \(U \subset V\). But then \(U \subset K\), which is impossible since \(U\) is a non-empty open set and \(K\) has an empty interior. Therefore \(V = \varnothing\). That is, \(v(x) \leq 0\) for all \(x\). \(\square\)

Let \(f=\chi_K\), and suppose \(u\) is an upper semicontinuous function such that \(u \geq f\) and \(v\) is a lower semicontinuous function such that \(v \leq f\). Then \(v \leq 0\) by the above, so \[ \int_{\mathbb{R}}(u-v)dm \geq \int_\mathbb{R}(f-v)dm \geq \int_\mathbb{R}f dm = \frac{1}{2}. \] Hence for \(\varepsilon=\frac{1}{2}\) no such pair exists. This example shows that there exist integrable functions that cannot be reversely approximated in the sense of the Vitali–Carathéodory theorem.