In this post we prove the Mason-Stothers theorem, the so-called $abc$ theorem for polynomials, and derive Fermat's Last Theorem and Davenport's inequality for polynomials. These three theorems correspond to the $abc$ conjecture, Fermat's Last Theorem and Hall's conjecture in number theory.

In this post we study cyclotomic polynomials in field theory and deduce some of their basic properties. We will also use them to solve some problems in field theory.

We prove the celebrated Hensel's lemma using Newton's method and "double induction", and try to find roots of polynomials in $\mathbb{Q}_p$.

We develop two almost straightforward ways to compute the Fourier transform of $\exp(-cx^2)$, in the sense that neither contour integration nor the calculus of residues is required. The first approach lets us think about these elementary concepts much more deeply, so I highly recommend studying it as long as you are familiar with first-order ODEs.

This blog post is intended to deliver a quick explanation of the algebra of Borel measures on \(\mathbb{R}^n\). It will be broken into pieces. The complex Borel measures on \(\mathbb{R}^n\), denoted \(M(\mathbb{R}^n)\), clearly form a vector space over \(\mathbb{C}\). The main goal of this post is to show that this is a Banach space and also a Banach algebra.

In fact, the \(\mathbb{R}^n\) case can be generalised to any locally compact abelian group (see any book on abstract harmonic analysis), because what really matters here is being locally compact and abelian. But at this moment we stick to Euclidean spaces. Note that since \(\mathbb{R}^n\) is \(\sigma\)-compact, all complex Borel measures on it are regular.

To read this post you need to be familiar with some basic properties of Banach algebras, complex Borel measures, and, most importantly, Fubini's theorem.

We study the average of sum, in the sense of integral.

You can find material on Dedekind domains (or Dedekind rings) in
*almost all* algebraic number theory books. But many of their properties
can be proved inside ring theory. I hope you can find the solution you
need in this post, and this post will not go further than elementary
ring theory. That said, you are assumed to have enough
knowledge of rings and rings of fractions (this post
serves well), but not much mathematical maturity is assumed (at the
very least you are assumed to be familiar with the terminology in the
linked post).\(\def\mb{\mathbb}\) \(\def\mfk{\mathfrak}\)

There are several ways to define a Dedekind domain since there are several equivalent characterisations of it. We will start from the one based on rings of fractions. As a friendly reminder, \(\mb{Z}\) or any principal ideal domain is already a Dedekind domain. In fact a Dedekind domain may be viewed as a generalization of a principal ideal domain.

Let \(\mfk{o}\) be an integral
domain (a.k.a. entire ring), and \(K\)
be its quotient field. A **Dedekind domain** is an integral
domain \(\mfk{o}\) such that the
fractional ideals form a group under multiplication. Let's have a
breakdown. By a **fractional ideal** \(\mfk{a}\) we mean a nontrivial additive
subgroup of \(K\) such that

- \(\mfk{o}\mfk{a}=\mfk{a}\),
- there exists some nonzero element \(c \in \mfk{o}\) such that \(c\mfk{a} \subset \mfk{o}\).

What does the group look like? As you may guess, the unit element is
\(\mfk{o}\). For a fractional ideal
\(\mfk{a}\), we have the inverse to be
another fractional ideal \(\mfk{b}\)
such that \(\mfk{ab}=\mfk{ba}=\mfk{o}\). Note we regard
\(\mfk{o}\) as a subring of \(K\). For \(a \in
\mfk{o}\), we treat it as \(a/1 \in
K\). This makes sense because the map \(i:a \mapsto a/1\) is injective. For the
existence of \(c\), you may consider it
as a restriction that the 'denominator' is *bounded*.
Alternatively, for a Noetherian domain (which a Dedekind domain is, by Proposition 1 below), a fractional ideal is a nonzero finitely generated \(\mfk{o}\)-submodule of \(K\). But in this post it is not assumed
that you have learned module theory.

Let's take \(\mb{Z}\) as an example. The quotient field of \(\mb{Z}\) is \(\mb{Q}\). Fix a prime \(p\). We have a fractional ideal \(P\) whose elements are those of the form \(\frac{np}{2}\) with \(n \in \mb{Z}\). Then indeed we have \(\mb{Z}P=P\). On the other hand, taking \(2 \in \mb{Z}\), we have \(2P \subset \mb{Z}\). Its inverse is the fractional ideal \(Q\) whose elements are those of the form \(\frac{2n}{p}\) with \(n \in \mb{Z}\), since \(PQ=\mb{Z}\). More generally, as proved in algebraic number theory, the ring of algebraic integers in a number field is a Dedekind domain.
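The example above can be checked with a few lines of Python. This is only a sketch under one assumption: since \(\mathbb{Z}\) is a principal ideal domain, every fractional ideal of \(\mathbb{Z}\) has the form \(q\mathbb{Z}\) for a positive rational \(q\), so we store a fractional ideal as its generator. The function names are made up for illustration.

```python
from fractions import Fraction

def ideal_product(g, h):
    """(gZ)(hZ) = (gh)Z; generators are normalised to be positive."""
    return abs(g * h)

def ideal_inverse(g):
    """The inverse of gZ is (1/g)Z."""
    return 1 / abs(g)

def clearing_element(g):
    """A nonzero integer c with c * (gZ) contained in Z: the denominator of g."""
    return g.denominator

p = 7                      # any prime works; 7 is an arbitrary choice
P = Fraction(p, 2)         # P = (p/2)Z
Q = ideal_inverse(P)       # Q = (2/p)Z

print(Q)                       # 2/7
print(ideal_product(P, Q))     # 1, i.e. PQ = Z
print(clearing_element(P))     # 2, i.e. 2P is contained in Z
```

In a general Dedekind domain a fractional ideal need not be generated by a single element, so this representation is specific to \(\mathbb{Z}\).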

Before we go on we need to clarify the definition of ideal multiplication. Let \(\mfk{a}\) and \(\mfk{b}\) be two ideals. We define \(\mfk{ab}\) to be the set of all sums

\[ x_1y_1+\cdots+x_ny_n \]

where \(x_i \in \mfk{a}\) and \(y_i \in \mfk{b}\). Here the number \(n\) is finite but not fixed. Alternatively we can say that \(\mfk{ab}\) consists of all finite sums of products of an element of \(\mfk{a}\) with an element of \(\mfk{b}\).

(Proposition 1) A Dedekind domain \(\mfk{o}\) is Noetherian.

By Noetherian ring we mean that every ideal in a ring is finitely generated. Precisely, we will prove that for every ideal \(\mfk{a} \subset \mfk{o}\) there are \(a_1,a_2,\cdots,a_n \in \mfk{a}\) such that, for every \(r \in \mfk{a}\), we have an expression

\[ r = c_1a_1 + c_2a_2 + \cdots + c_na_n \qquad c_1,c_2,\cdots,c_n \in \mfk{o}. \]

Also note that any ideal \(\mfk{a} \subset \mfk{o}\) can be viewed as a fractional ideal.

**Proof.** Let \(\mfk{a}\) be a nonzero ideal of \(\mfk{o}\) and let \(K\) be the quotient field of \(\mfk{o}\). Since \(\mfk{oa}=\mfk{a}\), we may view \(\mfk{a}\) as a fractional ideal. Since
\(\mfk{o}\) is a Dedekind domain, the
fractional ideals of \(\mfk{o}\) form a
group, so there is a fractional ideal \(\mfk{b}\) such that \(\mfk{ab}=\mfk{ba}=\mfk{o}\). Since \(1 \in \mfk{o}=\mfk{ab}\), there
exist some \(a_1,a_2,\cdots, a_n \in
\mfk{a}\) and \(b_1,b_2,\cdots,b_n \in
\mfk{b}\) such that \(\sum_{i = 1
}^{n}a_ib_i=1\). For any \(r \in
\mfk{a}\), we have \(rb_i \in \mfk{ab}=\mfk{o}\) for each \(i\), which gives an expression

\[ r = rb_1a_1+rb_2a_2+\cdots+rb_na_n. \]

On the other hand, any element of the form \(c_1a_1+c_2a_2+\cdots+c_na_n\), by definition, is an element of \(\mfk{a}\). \(\blacksquare\)

From now on, the inverse of a fractional ideal \(\mfk{a}\) will be written as \(\mfk{a}^{-1}\).

(Proposition 2) For nonzero ideals \(\mfk{a},\mfk{b} \subset \mfk{o}\), \(\mfk{b}\subset\mfk{a}\) if and only if there exists some ideal \(\mfk{c}\) such that \(\mfk{ac}=\mfk{b}\) (or we simply say \(\mfk{a}|\mfk{b}\)).

**Proof.** If \(\mfk{b}=\mfk{ac}\), simply note that \(\mfk{ac} \subset \mfk{a} \cap \mfk{c} \subset
\mfk{a}\). For the converse, suppose that \(\mfk{a} \supset \mfk{b}\), then \(\mfk{c}=\mfk{a}^{-1}\mfk{b}\) is an ideal
of \(\mfk{o}\) since \(\mfk{c}=\mfk{a}^{-1}\mfk{b} \subset
\mfk{a}^{-1}\mfk{a}=\mfk{o}\), hence we may write \(\mfk{b}=\mfk{a}\mfk{c}\). \(\blacksquare\)
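In \(\mathbb{Z}\), Proposition 2 is the familiar slogan "to contain is to divide": \(a\mathbb{Z} \supset b\mathbb{Z}\) exactly when \(a\) divides \(b\). The following Python sketch illustrates this, together with the inclusion \(\mathfrak{ac} \subset \mathfrak{a} \cap \mathfrak{c}\) used in the proof. It assumes ideals of \(\mathbb{Z}\) are represented by a nonzero integer generator, and the helper names are hypothetical.

```python
from math import gcd

def contains(a, b):
    """True if the ideal aZ contains the ideal bZ (a, b nonzero integers)."""
    return b % a == 0

def cofactor(a, b):
    """Generator of the ideal cZ with (aZ)(cZ) = bZ, assuming aZ contains bZ."""
    if not contains(a, b):
        raise ValueError("aZ does not contain bZ")
    return abs(b // a)

def intersection(a, c):
    """The intersection of aZ and cZ is generated by lcm(a, c)."""
    return abs(a * c) // gcd(a, c)

a, b = 4, 24
c = cofactor(a, b)
print(contains(a, b), c)                      # True 6
# The product (4Z)(6Z) = 24Z sits inside the intersection 12Z.
print(contains(intersection(a, c), a * c))    # True
print(contains(4, 6))                         # False: 4 does not divide 6
```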

(Proposition 3) If \(\mfk{a}\) is a nonzero proper ideal of \(\mfk{o}\), then there are prime ideals \(\mfk{p}_1,\mfk{p}_2,\cdots,\mfk{p}_n\) such that \[ \mfk{a}=\mfk{p}_1\mfk{p}_2\cdots\mfk{p}_n, \] and this factorisation is unique up to a permutation of the factors.

**Proof.** For this problem we use a classical
technique: contradiction on maximality. Suppose this is not true, and let
\(\mfk{U}\) be the set of nonzero proper ideals of
\(\mfk{o}\) that cannot be written as
a product of prime ideals. By assumption \(\mfk{U}\) is non-empty. Since, as we have
proved, \(\mfk{o}\) is Noetherian, we
can pick a maximal element \(\mfk{a}\)
of \(\mfk{U}\) with respect to
inclusion. If \(\mfk{a}\) were a maximal ideal of \(\mfk{o}\),
then since all maximal ideals are prime, \(\mfk{a}\) itself would be prime, which is impossible for an element of \(\mfk{U}\). Hence \(\mfk{a}\) is properly contained in a maximal ideal
\(\mfk{m}\), and we write \(\mfk{a}=\mfk{m}\mfk{m}^{-1}\mfk{a}\), where \(\mfk{m}^{-1}\mfk{a} \subset \mfk{m}^{-1}\mfk{m}=\mfk{o}\) is a proper ideal of \(\mfk{o}\). We
have \(\mfk{m}^{-1}\mfk{a} \supsetneq
\mfk{a}\) since if not, we have \(\mfk{a}=\mfk{ma}\), which implies that
\(\mfk{m}=\mfk{o}\). By the maximality of \(\mfk{a}\),
\(\mfk{m}^{-1}\mfk{a}\not\in\mfk{U}\),
hence it can be written as a product of prime ideals. But \(\mfk{m}\) is prime as well, so we have a prime
factorization for \(\mfk{a}\),
contradicting the definition of \(\mfk{U}\).

Next we show uniqueness up to a permutation. If

\[ \mfk{p}_1\mfk{p}_2\cdots\mfk{p}_k=\mfk{q}_1\mfk{q}_2\cdots\mfk{q}_j, \]

then \(\mfk{q}_1\mfk{q}_2\cdots\mfk{q}_j=\mfk{p}_1\mfk{p}_2\cdots\mfk{p}_k\subset\mfk{p}_1\). Since \(\mfk{p}_1\) is prime, some \(\mfk{q}_i\) is contained in \(\mfk{p}_1\), and after reordering we may assume that \(\mfk{q}_1 \subset \mfk{p}_1\). By Proposition 2 we have \(\mfk{q}_1=\mfk{p}_1\mfk{r}_1\) for some ideal \(\mfk{r}_1\). However we also have \(\mfk{q}_1 \subset \mfk{r}_1\). Since \(\mfk{q}_1\) is prime and contains \(\mfk{p}_1\mfk{r}_1\), we either have \(\mfk{q}_1 \supset \mfk{p}_1\) or \(\mfk{q}_1 \supset \mfk{r}_1\). In the former case we get \(\mfk{p}_1=\mfk{q}_1\), and we finish the proof by multiplying both sides by \(\mfk{p}_1^{-1}\) and continuing inductively. In the latter case we have \(\mfk{r}_1=\mfk{q}_1=\mfk{p}_1\mfk{q}_1\), which shows that \(\mfk{p}_1=\mfk{o}\), which is impossible. \(\blacksquare\)
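For \(\mathfrak{o}=\mathbb{Z}\), Proposition 3 is the fundamental theorem of arithmetic: the ideal \(n\mathbb{Z}\) factors as the product of the prime ideals \(p\mathbb{Z}\) over the prime factors \(p\) of \(n\). A minimal Python sketch using plain trial division (the function name is an illustrative choice, not something from the text):

```python
def prime_ideal_factorisation(n):
    """Return the primes p, with multiplicity, such that nZ is the product of the pZ."""
    n = abs(n)
    if n < 2:
        raise ValueError("need a nonzero proper ideal, i.e. |n| >= 2")
    primes = []
    d = 2
    while d * d <= n:
        # Strip every factor d before moving on, so d is prime whenever it divides n.
        while n % d == 0:
            primes.append(d)
            n //= d
        d += 1
    if n > 1:
        primes.append(n)   # the remaining cofactor is itself prime
    return primes

print(prime_ideal_factorisation(360))   # [2, 2, 2, 3, 3, 5]
```

The list is returned in increasing order, which is one way to pick a canonical representative of the factorisation "up to a permutation".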

(Proposition 4) Every nontrivial prime ideal \(\mfk{p}\) is maximal.

**Proof.** Let \(\mfk{m}\) be a maximal ideal containing
\(\mfk{p}\). By proposition 2 we have
some \(\mfk{c}\) such that \(\mfk{p}=\mfk{mc}\). If \(\mfk{m} \neq \mfk{p}\), then \(\mfk{c} \neq \mfk{o}\), and we may write
\(\mfk{c}=\mfk{p}_1\cdots\mfk{p}_n\),
hence \(\mfk{p}=\mfk{m}\mfk{p}_1\cdots\mfk{p}_n\),
which is a prime factorisation, contradicting the fact that \(\mfk{p}\) has a unique prime factorisation,
which is \(\mfk{p}\) itself. Hence any
maximal ideal containing \(\mfk{p}\) is
\(\mfk{p}\) itself. \(\blacksquare\)

(Proposition 5) Suppose the Dedekind domain \(\mfk{o}\) contains only one nonzero prime (and maximal) ideal \(\mfk{p}\). Let \(t \in \mfk{p}\) and \(t \not\in \mfk{p}^2\), then \(\mfk{p}\) is generated by \(t\).

**Proof.** Let \(\mfk{t}\) be the ideal generated by \(t\). By proposition 3 we have a
factorisation

\[ \mfk{t}=\mfk{p}^n \]

for some \(n\) since \(\mfk{o}\) contains only one prime ideal. According to proposition 2, if \(n \geq 3\), we write \(\mfk{p}^n=\mfk{p}^2\mfk{p}^{n-2}\) and see \(\mfk{p}^2 \supset \mfk{p}^n\). But this is impossible since if so we have \(t \in \mfk{p}^n \subset \mfk{p}^2\), contradicting our assumption. Hence \(0<n<3\). But if \(n=2\) we have \(t \in \mfk{p}^2\), which is also not possible. So \(\mfk{t}=\mfk{p}\) provided that such \(t\) exists.

For the existence of \(t\), note if not, then for all \(t \in \mfk{p}\) we have \(t \in \mfk{p}^2\), hence \(\mfk{p} \subset \mfk{p}^2\). On the other hand we already have \(\mfk{p}^2 = \mfk{p}\mfk{p}\), which implies that \(\mfk{p}^2 \subset \mfk{p}\) (proposition 2), hence \(\mfk{p}^2=\mfk{p}\), contradicting proposition 3. Hence such \(t\) exists and our proof is finished. \(\blacksquare\)

In fact there is another equivalent definition of Dedekind domain:

A domain \(\mfk{o}\) is Dedekind if and only if

- \(\mfk{o}\) is Noetherian.
- \(\mfk{o}\) is integrally closed.
- \(\mfk{o}\) has Krull dimension \(1\) (i.e. every non-zero prime ideal is maximal).

This is equivalent to saying that the fractional ideals form a group, and it is frequently used by mathematicians as well. But we need some more advanced techniques to establish the equivalence. Presumably there will be a post about this in the future.

Throughout, let \((X,\mathfrak{M},\mu)\) be a measure space where \(\mu\) is positive.

If \(f\) is of \(L^p(\mu)\), which means \(\lVert f \rVert_p=\left(\int_X |f|^p
d\mu\right)^{1/p}<\infty\), or equivalently \(\int_X |f|^p d\mu<\infty\), then we may
say \(|f|^p\) is of \(L^1(\mu)\). In other words, we have a
function \[
\begin{aligned}
\lambda: L^p(\mu) &\to L^1(\mu) \\
f &\mapsto |f|^p.
\end{aligned}
\] This function does not have to be one to one due to absolute
value. But we hope this function to be *fine* enough, at the very
least, we hope it is continuous.

Here, \(f \sim g\) means that \(f-g\) equals \(0\) almost everywhere with respect to \(\mu\). It can be easily verified that this is an equivalence relation.

We still use the \(\varepsilon-\delta\) argument but it's in a metric space. Suppose \((X,d_1)\) and \((Y,d_2)\) are two metric spaces and \(f:X \to Y\) is a function. We say \(f\) is continuous at \(x_0 \in X\) if, for any \(\varepsilon>0\), there exists some \(\delta>0\) such that \(d_2(f(x_0),f(x))<\varepsilon\) whenever \(d_1(x_0,x)<\delta\). Further, we say \(f\) is continuous on \(X\) if \(f\) is continuous at every point \(x \in X\).

For \(1\leq p<\infty\), we already have a metric by \[ d(f,g)=\lVert f-g \rVert_p \] given that \(d(f,g)=0\) if and only if \(f \sim g\). This metric is complete and makes \(L^p\) a Banach space. But for \(0<p<1\) (yes we are going to cover that), things are very different, and there is one reason: Minkowski's inequality is reversed! In fact, for nonnegative \(f\) and \(g\) we have \[ \lVert f+g \rVert_p \geq \lVert f \rVert_p + \lVert g \rVert_p \] for \(0<p<1\). The \(L^p\) space has many strange features when \(0<p<1\). Precisely,

For \(0<p<1\), \(L^p(\mu)\) is locally convex if and only if \(\mu\) assumes finitely many values. (Proof.)

On the other hand, if for example \(X=[0,1]\) and \(\mu=m\) is the Lebesgue measure, then \(L^p(\mu)\) has *no* open convex
subset other than \(\varnothing\) and
\(L^p(\mu)\) itself. However,

A topological vector space \(X\) is normable if and only if its origin has a convex bounded neighbourhood. (See Kolmogorov's normability criterion.)

Therefore \(L^p(m)\) is not normable, hence not Banach.

We have gone too far. We need a metric that is fine enough.

Define \[ \Delta(f)=\int_X |f|^p d\mu \] for \(f \in L^p(\mu)\). We will show that we have a metric by \[ d(f,g)=\Delta(f-g). \] Fix \(y\geq 0\), consider the function \[ f(x)=(x+y)^p-x^p. \] We have \(f(0)=y^p\) and \[ f'(x)=p(x+y)^{p-1}-px^{p-1} \leq px^{p-1}-px^{p-1}=0 \] when \(x > 0\) and hence \(f(x)\) is nonincreasing on \([0,\infty)\), which implies that \[ (x+y)^p \leq x^p+y^p. \] Hence for any \(f\), \(g \in L^p\), we have \[ \Delta(f+g)=\int_X |f+g|^p d\mu \leq \int_X |f|^p d\mu + \int_X |g|^p d\mu=\Delta(f)+\Delta(g). \] This inequality ensures that \[ d(f,g)=\Delta(f-g) \] is a metric. It's immediate that \(d(f,g)=d(g,f) \geq 0\) for all \(f\), \(g \in L^p(\mu)\). For the triangle inequality, note that \[ d(f,h)+d(g,h)=\Delta(f-h)+\Delta(h-g) \geq \Delta((f-h)+(h-g))=\Delta(f-g)=d(f,g). \] This is translate-invariant as well since \[ d(f+h,g+h)=\Delta(f+h-g-h)=\Delta(f-g)=d(f,g) \] The completeness can be verified in the same way as the case when \(p>1\). In fact, this metric makes \(L^p\) a locally bounded F-space.
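To see the triangle inequality for \(d(f,g)=\Delta(f-g)\) in action, here is a small numerical sanity check in Python. It is only a sketch under an assumption that is not in the text: the measure space has three atoms with arbitrarily chosen masses, so that integrals become finite weighted sums. All names are illustrative.

```python
import random

def delta(f, weights, p):
    """Delta(f): the integral of |f|^p against a measure with finitely many atoms."""
    return sum(w * abs(v) ** p for v, w in zip(f, weights))

def dist(f, g, weights, p):
    """d(f, g) = Delta(f - g)."""
    return delta([a - b for a, b in zip(f, g)], weights, p)

def random_function(n):
    """A random real-valued function on n atoms."""
    return [random.uniform(-5, 5) for _ in range(n)]

random.seed(0)
p = 0.5
weights = [0.2, 1.0, 3.5]   # masses of the three atoms (arbitrary choice)

worst_gap = float("inf")
for _ in range(1000):
    f, g, h = (random_function(len(weights)) for _ in range(3))
    gap = dist(f, h, weights, p) + dist(h, g, weights, p) - dist(f, g, weights, p)
    worst_gap = min(worst_gap, gap)

# The triangle inequality predicts a nonnegative gap (up to rounding).
print(worst_gap >= -1e-12)
```

A random search like this cannot prove the inequality, of course; it only fails to find a counterexample, which is what the proof above guarantees.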

The metric of \(L^1\) is defined by \[ d_1(f,g)=\lVert f-g \rVert_1=\int_X |f-g|d\mu. \] We need to find a relation between \(d_p(f,g)\) and \(d_1(\lambda(f),\lambda(g))\), where \(d_p\) is the metric of the corresponding \(L^p\) space.

As we have proved, \[ (x+y)^p \leq x^p+y^p \] for \(x,y \geq 0\). Without loss of generality we assume \(x \geq y \geq 0\) and therefore \[ x^p=(x-y+y)^p \leq (x-y)^p+y^p. \] Hence \[ x^p-y^p \leq (x-y)^p. \] By interchanging \(x\) and \(y\), we get \[ |x^p-y^p| \leq |x-y|^p. \] Replacing \(x\) and \(y\) with \(|f|\) and \(|g|\) where \(f\), \(g \in L^p\), and using \(\lvert |f|-|g| \rvert \leq |f-g|\), we get \[ \int_{X}\lvert |f|^p-|g|^p \rvert d\mu \leq \int_X |f-g|^p d\mu. \] But \[ \begin{aligned} d_1(\lambda(f),\lambda(g))&=\int_{X}\lvert |f|^p-|g|^p \rvert d\mu, \\ d_p(f,g)&=\Delta(f-g)=\int_X |f-g|^p d\mu, \end{aligned} \] and we therefore have \[ d_1(\lambda(f),\lambda(g)) \leq d_p(f,g). \] Hence \(\lambda\) is continuous (in fact Lipschitz continuous, hence uniformly continuous) when \(0<p<1\).
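The Lipschitz estimate \(d_1(\lambda(f),\lambda(g)) \leq d_p(f,g)\) for \(0<p<1\) can be sanity-checked numerically. The Python sketch below assumes, purely for illustration, a measure with four atoms of arbitrary mass, and draws random functions and exponents; the helper names are hypothetical.

```python
import random

def d_p(f, g, weights, p):
    """d_p(f, g): the integral of |f - g|^p (the metric for 0 < p < 1)."""
    return sum(w * abs(a - b) ** p for a, b, w in zip(f, g, weights))

def d_1_lambda(f, g, weights, p):
    """d_1(lambda(f), lambda(g)): the integral of | |f|^p - |g|^p |."""
    return sum(w * abs(abs(a) ** p - abs(b) ** p) for a, b, w in zip(f, g, weights))

random.seed(1)
weights = [0.5, 2.0, 1.25, 4.0]   # masses of the atoms (arbitrary choice)
violations = 0
for _ in range(1000):
    p = random.uniform(0.05, 0.95)
    f = [random.uniform(-10, 10) for _ in weights]
    g = [random.uniform(-10, 10) for _ in weights]
    if d_1_lambda(f, g, weights, p) > d_p(f, g, weights, p) + 1e-12:
        violations += 1

print(violations)   # the inequality predicts that no violation is found
```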

It's natural to think about Minkowski's inequality and Hölder's inequality in this case since they are the key tools for \(L^p\) estimates when \(p \geq 1\). You need to think about how to create the conditions to use them and get a fine result. In this section we need to prove that \[ |x^p-y^p| \leq p|x-y|(x^{p-1}+y^{p-1}) \] for \(x,y \geq 0\). This inequality is surprisingly easy to prove. We will use nothing but the mean value theorem. Without loss of generality we assume that \(x > y \geq 0\) and define \(f(t)=t^p\). Then \[ \frac{f(x)-f(y)}{x-y}=f'(\zeta)=p\zeta^{p-1} \] where \(y < \zeta < x\). But since \(p-1 \geq 0\), we see \(\zeta^{p-1} \leq x^{p-1} \leq x^{p-1}+y^{p-1}\). Therefore \[ f(x)-f(y)=x^p-y^p=p(x-y)\zeta^{p-1} \leq p(x-y)(x^{p-1}+y^{p-1}). \] For \(x=y\) both sides are \(0\), so the inequality holds trivially.
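The elementary inequality \(|x^p-y^p| \leq p|x-y|(x^{p-1}+y^{p-1})\) for \(x,y \geq 0\) and \(p \geq 1\) is easy to test on a grid. The following Python sketch does so; the grid, the list of exponents and the tolerance are arbitrary choices made for illustration.

```python
def mvt_bound_holds(x, y, p, tol=1e-9):
    """Check |x^p - y^p| <= p |x - y| (x^(p-1) + y^(p-1)) for x, y >= 0 and p >= 1."""
    lhs = abs(x ** p - y ** p)
    rhs = p * abs(x - y) * (x ** (p - 1) + y ** (p - 1))
    return lhs <= rhs + tol

grid = [i / 10 for i in range(51)]      # sample points in [0, 5]
exponents = [1, 1.5, 2, 3.7]            # arbitrary test exponents with p >= 1
all_ok = all(mvt_bound_holds(x, y, p)
             for p in exponents for x in grid for y in grid)
print(all_ok)
```

For instance, with \(x=2\), \(y=1\), \(p=2\) the left side is \(3\) and the right side is \(2\cdot 1\cdot(2+1)=6\).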

Therefore \[
\begin{aligned}
d_1(\lambda(f),\lambda(g)) &= \int_X \left||f|^p-|g|^p\right|d\mu \\
&\leq
\int_Xp\left||f|-|g|\right|(|f|^{p-1}+|g|^{p-1})d\mu
\end{aligned}
\] By *Hölder's inequality* with the conjugate exponent \(q\) of \(p\) (we assume \(p>1\) here; for \(p=1\) the desired bound is immediate from \(\lvert |f|-|g| \rvert \leq |f-g|\)), we have \[
\begin{aligned}
\int_X ||f|-|g||(|f|^{p-1}+|g|^{p-1})d\mu & \leq \left[\int_X
\left||f|-|g|\right|^pd\mu\right]^{1/p}\left[\int_X\left(|f|^{p-1}+|g|^{p-1}\right)^qd\mu\right]^{1/q}
\\
&\leq \left[\int_X
\left|f-g\right|^pd\mu\right]^{1/p}\left[\int_X\left(|f|^{p-1}+|g|^{p-1}\right)^qd\mu\right]^{1/q}
\\
&=\lVert f-g \rVert_p
\left[\int_X\left(|f|^{p-1}+|g|^{p-1}\right)^qd\mu\right]^{1/q}.
\end{aligned}
\] By *Minkowski's inequality*, we have \[
\left[\int_X\left(|f|^{p-1}+|g|^{p-1}\right)^qd\mu\right]^{1/q} \leq
\left[\int_X|f|^{(p-1)q}d\mu\right]^{1/q}+\left[\int_X
|g|^{(p-1)q}d\mu\right]^{1/q}.
\] Now things are clear. Since \(1/p+1/q=1\), or equivalently \(1/q=(p-1)/p\), we have \((p-1)q=p\). Suppose \(\lVert f \rVert_p\), \(\lVert g \rVert_p \leq R\); then \[
\left[\int_X|f|^{(p-1)q}d\mu\right]^{1/q}+\left[\int_X
|g|^{(p-1)q}d\mu\right]^{1/q} = \lVert f \rVert_p^{p-1}+\lVert g
\rVert_p^{p-1} \leq 2R^{p-1}.
\] Combining the inequalities above, we get \[
\begin{aligned}
d_1(\lambda(f),\lambda(g)) \leq 2pR^{p-1}\lVert f-g \rVert_p
=2pR^{p-1}d_p(f,g)
\end{aligned}
\] hence \(\lambda\) is
continuous.
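The local Lipschitz bound \(d_1(\lambda(f),\lambda(g)) \leq 2pR^{p-1}\lVert f-g \rVert_p\) can also be checked numerically. The Python sketch below again assumes a measure with finitely many atoms of arbitrary mass (an illustrative assumption, not part of the argument) and takes \(R=\max(\lVert f \rVert_p,\lVert g \rVert_p)\).

```python
import random

def norm_p(f, weights, p):
    """The L^p norm of f for a measure with finitely many atoms."""
    return sum(w * abs(v) ** p for v, w in zip(f, weights)) ** (1 / p)

def d_1_lambda(f, g, weights, p):
    """d_1(lambda(f), lambda(g)): the integral of | |f|^p - |g|^p |."""
    return sum(w * abs(abs(a) ** p - abs(b) ** p) for a, b, w in zip(f, g, weights))

def lipschitz_bound(f, g, weights, p):
    """2 p R^(p-1) ||f - g||_p with R = max(||f||_p, ||g||_p)."""
    R = max(norm_p(f, weights, p), norm_p(g, weights, p))
    diff = [a - b for a, b in zip(f, g)]
    return 2 * p * R ** (p - 1) * norm_p(diff, weights, p)

random.seed(2)
weights = [0.3, 1.0, 2.5]   # masses of the atoms (arbitrary choice)
violations = 0
for _ in range(1000):
    p = random.uniform(1.0, 3.0)
    f = [random.uniform(-5, 5) for _ in weights]
    g = [random.uniform(-5, 5) for _ in weights]
    if d_1_lambda(f, g, weights, p) > lipschitz_bound(f, g, weights, p) + 1e-9:
        violations += 1

print(violations)   # the estimate predicts that no violation is found
```

Note that the constant depends on \(R\), so this is a Lipschitz bound on bounded subsets of \(L^p\) rather than a global one.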

We have proved that \(\lambda\) is continuous, and when \(0<p<1\), we have seen that \(\lambda\) is Lipschitz continuous. It's natural to think about its differentiability afterwards, but the absolute value function is not even differentiable so we may have no chance. But this is still a fine enough result. For example we have no restriction on \((X,\mathfrak{M},\mu)\) other than the positivity of \(\mu\). Therefore we may take \(\mathbb{R}^n\) with the Lebesgue measure here, or we can take something else.

It's also interesting how we use elementary Calculus to solve some much more abstract problems.

In this post we compute the Fourier transform of $\sin{x}/x$ and $(\sin{x}/x)^2$ through contour integration.

We are restricting ourselves to \(\mathbb{R}\) endowed with the usual topology. Recall that a function \(f\) is continuous if and only if for any open set \(U \subset \mathbb{R}\), we have \[ \{x:f(x) \in U\}=f^{-1}(U) \]

to be open. One can rewrite this statement using \(\varepsilon-\delta\) language. To say a function \(f: \mathbb{R} \to \mathbb{R}\) is continuous at \(x\), we mean for any \(\varepsilon>0\), there exists some \(\delta>0\) such that for \(t \in (x-\delta,x+\delta)\), we have \[ |f(x)-f(t)|<\varepsilon. \] \(f\) is continuous on \(\mathbb{R}\) if and only if \(f\) is continuous at every point of \(\mathbb{R}\).

If \((x-\delta,x+\delta)\) is replaced with \((x-\delta,x)\) or \((x,x+\delta)\), we get left continuity and right continuity, one of which plays an important role in probability theory.

But the problem is, sometimes continuity is too strong a restriction, while the 'direction' associated with left/right continuous functions is unnecessary as well. For example the function \[ f(x)=\chi_{(0,1)}(x) \] is neither left nor right continuous (globally), but it is still a perfectly reasonable function. Left/right continuity is not a perfectly weakened version of continuity. We need something different.

Let \(f\) be a real (or extended-real) function on \(\mathbb{R}\). The semicontinuity of \(f\) is defined as follows.

If \[ \{x:f(x)>\alpha\} \] is open for all real \(\alpha\), we say \(f\) is *lower semicontinuous*.

If \[ \{x:f(x)<\alpha\} \] is open for all real \(\alpha\), we say \(f\) is *upper semicontinuous*.

Is it possible to rewrite these definitions à la \(\varepsilon-\delta\)? The answer is yes if we restrict ourselves to metric spaces.

\(f: \mathbb{R} \to \mathbb{R}\) is upper semicontinuous at \(x\) if, for every \(\varepsilon>0\), there exists some \(\delta>0\) such that for \(t \in (x-\delta,x+\delta)\), we have \[ f(t)<f(x)+\varepsilon \]

\(f: \mathbb{R} \to \mathbb{R}\) is lower semicontinuous at \(x\) if, for every \(\varepsilon>0\), there exists some \(\delta>0\) such that for \(t \in (x-\delta,x+\delta)\), we have \[ f(t)>f(x)-\varepsilon \]

Of course, \(f\) is upper/lower semicontinuous on \(\mathbb{R}\) if and only if it is so at every point of \(\mathbb{R}\). One shall find no difference between the definitions in different styles.

Here is another way to see it. For the continuity of \(f\), we are looking for *arbitrary*
open subsets \(V\) of \(\mathbb{R}\), and \(f^{-1}(V)\) is expected to be open. For the
lower/upper semicontinuity of \(f\),
however, the open sets are restricted to be like \((\alpha,+\infty]\) and \([-\infty,\alpha)\). Since all open sets of
\(\mathbb{R}\) can be generated by the
union or intersection of sets like \([-\infty,\alpha)\) and \((\beta,+\infty]\), we immediately get

\(f\) is continuous if and only if \(f\) is both upper semicontinuous and lower semicontinuous.

*Proof.* If \(f\) is
continuous, then for any \(\alpha \in
\mathbb{R}\), we see \([-\infty,\alpha)\) is open, and therefore
\[
f^{-1}([-\infty,\alpha))
\] has to be open. The upper semicontinuity is proved. The lower
semicontinuity of \(f\) is proved in
the same manner.

If \(f\) is both upper and lower semicontinuous, we see \[ f^{-1}((\alpha,\beta))=f^{-1}([-\infty,\beta)) \cap f^{-1}((\alpha,+\infty]) \] is open. Since every open subset of \(\mathbb{R}\) can be written as a countable union of segments of the above types, we see for any open subset \(V\) of \(\mathbb{R}\), \(f^{-1}(V)\) is open. (If you have trouble with this part, it is recommended to review the definition of topology.) \(\square\)

There are two important examples.

- If \(E \subset \mathbb{R}\) is open, then \(\chi_E\) is lower semicontinuous.
- If \(F \subset \mathbb{R}\) is closed, then \(\chi_F\) is upper semicontinuous.

We will prove the first one. The second one follows in the same manner of course. For \(\alpha<0\), the set \(A=\chi_E^{-1}((\alpha,+\infty])\) is equal to \(\mathbb{R}\), which is open. For \(\alpha \geq 1\), since \(\chi_E \leq 1\), we see \(A=\varnothing\). For \(0 \leq \alpha < 1\) however, the set of \(x\) where \(\chi_E>\alpha\) has to be \(E\), which is still open.

When checking the semicontinuity of a function, we check from bottom to top or top to bottom. The function \(\chi_E\) is defined by \[ \chi_E(x)=\begin{cases} 1 \quad x \in E \\ 0 \quad x \notin E \end{cases}. \]

If \(f_1\) and \(f_2\) are upper/lower semicontinuous, then so is \(f_1+f_2\).

*Proof.* We are going to prove this using different tools.
Suppose now both \(f_1\) and \(f_2\) are upper semicontinuous. Fix \(x \in \mathbb{R}\). For \(\varepsilon>0\), there exist some \(\delta_1>0\) and \(\delta_2>0\) such that \[
\begin{aligned}
f_1(t) &< f_1(x)+\varepsilon/2 \quad t \in (x-\delta_1,x+\delta_1), \\
f_2(t) &< f_2(x) + \varepsilon/2 \quad t \in (x-\delta_2,x+\delta_2).
\end{aligned}
\] If we pick \(\delta=\min(\delta_1,\delta_2)\), then we
see for all \(t \in
(x-\delta,x+\delta)\), we have \[
f_1(t)+f_2(t)<f_1(x)+f_2(x)+\varepsilon.
\] The upper semicontinuity of \(f_1+f_2\) is proved by considering all
\(x \in \mathbb{R}\).

Now suppose both \(f_1\) and \(f_2\) are lower semicontinuous. We have an identity by \[ \{x:f_1+f_2>\alpha\}=\bigcup_{\beta\in\mathbb{R}}\{x:f_1>\beta\}\cap\{x:f_2>\alpha-\beta\}. \] The set on the right side is always open. Hence \(f_1+f_2\) is lower semicontinuous. \(\square\)

However, when there are infinitely many semicontinuous functions, things are different.

Let \(\{f_n\}\) be a sequence of nonnegative functions on \(\mathbb{R}\), then

- If each \(f_n\) is lower semicontinuous, then so is \(\sum_{1}^{\infty}f_n\).
- If each \(f_n\) is upper semicontinuous, then \(\sum_{1}^{\infty}f_n\) is not necessarily upper semicontinuous.

*Proof.* To prove this we are still using the properties of
open sets. Put \(g_n=\sum_{k=1}^{n}f_k\).
Now suppose all \(f_k\) are lower semicontinuous.
Since \(g_n\) is a finite sum of lower semicontinuous
functions, we see each \(g_n\) is
lower semicontinuous. Let \(f=\sum_{n}f_n\). As the \(f_k\) are non-negative, the sequence \(g_n\) increases to \(f\), so \(f(x)>\alpha\) if and only if there
exists some \(n\) such that \(g_{n}(x)>\alpha\). Therefore \[
\{x:f(x)>\alpha\}=\bigcup_{n \geq 1}\{x:g_n(x)>\alpha\}.
\] The set on the right hand side is a union of open sets, hence open.

For the upper semicontinuity, it suffices to give a counterexample, but before that, we shall give the motivation.

As said, the characteristic function of a closed set is upper semicontinuous. Suppose \(\{E_n\}\) is a sequence of almost disjoint closed sets, then \(E=\cup_{n\geq 1}E_n\) is not necessarily closed, therefore \(\chi_E=\sum\chi_{E_n}\) (a.e.) is not necessarily upper semicontinuous. Now we give a concrete example. Put \(f_0=\chi_{[1,+\infty)}\) and \(f_n=\chi_{E_n}\) for \(n \geq 1\) where \[ E_n=\left\{x:\frac{1}{1+n} \leq x \leq \frac{1}{n}\right\}. \] Each \(f_n\) is upper semicontinuous since \([1,+\infty)\) and \(E_n\) are closed. For \(x > 0\), we have \(f(x)=\sum_{n \geq 0}f_n(x) \geq 1\), while \(f(x)=0\) for \(x \leq 0\). Hence \(f^{-1}([-\infty,1))=(-\infty,0]\), which is not open. \(\square\)
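The counterexample can be evaluated exactly with rational arithmetic. In the Python sketch below, `f` computes \(\chi_{[1,+\infty)}+\sum_{n \geq 1}\chi_{[1/(n+1),1/n]}\) at a rational point; the only observation needed is that \(x>0\) lies in \(E_n\) exactly when \(n \leq 1/x \leq n+1\), so only finitely many terms are nonzero. The function name and the sample points are illustrative choices.

```python
from fractions import Fraction
from math import ceil, floor

def f(x):
    """chi_[1, oo) plus the sum over n >= 1 of chi_[1/(n+1), 1/n], evaluated exactly."""
    x = Fraction(x)
    if x <= 0:
        return 0
    total = 1 if x >= 1 else 0
    if x <= 1:
        inv = 1 / x
        # x lies in E_n exactly when n <= 1/x <= n + 1.
        lo = max(1, ceil(inv - 1))
        hi = floor(inv)
        total += hi - lo + 1
    return total

print(f(0))                                          # 0
print([f(Fraction(1, k)) for k in (1, 2, 10, 1000)]) # [2, 2, 2, 2]
print(f(Fraction(3, 4)))                             # 1
```

So \(f(0)=0\) while \(f \geq 1\) at points arbitrarily close to \(0\) from the right, which is exactly why \(\{x:f(x)<1\}\) fails to be open.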

Notice that \(f\) can be defined on any topological space here.

There is one fact we already know about continuous functions.

If \(X\) is compact, \(f: X \to \mathbb{R}\) is continuous, then there exists some \(a,b \in X\) such that \(f(a)=\min f(X)\), \(f(b)=\max f(X)\).

In fact, \(f(X)\) is compact still. But for semicontinuous functions, things will be different but reasonable. For upper semicontinuous functions, we have the following fact.

If \(X\) is compact and \(f: X \to (-\infty,+\infty)\) is upper semicontinuous, then there exists some \(a \in X\) such that \(f(a)=\max f(X)\).

Notice that \(X\) is not assumed to hold any other topological property. It can be Hausdorff or Lindelöf, but we are not asking for restrictions like this. The only property we will be using is that every open cover of \(X\) has a finite subcover. Of course, one can replace \(X\) with any compact subset of \(\mathbb{R}\), for example, \([a,b]\).

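*Proof.* Put \(\alpha=\sup
f(X)\), and suppose first that \(\alpha<+\infty\). Define \[
E_n=\left\{x:f(x)<\alpha-\frac{1}{n}\right\}.
\] Each \(E_n\) is open since \(f\) is upper semicontinuous, and \(E_1 \subset E_2 \subset \cdots\). If \(f\) attains no maximum,
then for any \(x \in X\), there exists
some \(n \geq 1\) such that \(f(x)<\alpha-\frac{1}{n}\). That is,
\(x \in E_n\) for some \(n\). Therefore \(\{E_n\}_{n \geq 1}\) is an open cover of \(X\). But this cover has no finite subcover
of \(X\): if \(X=E_1\cup\cdots\cup E_N=E_N\), then \(f<\alpha-\frac{1}{N}\) on \(X\), contradicting \(\alpha=\sup f(X)\). This is a contradiction since \(X\) is compact. If \(\alpha=+\infty\), the same argument with \(E_n=\{x:f(x)<n\}\) gives a contradiction. \(\square\)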

This is a comprehensive application of several properties of semicontinuity.

(Vitali–Carathéodory theorem) Suppose \(f \in L^1(\mathbb{R})\), where \(f\) is a real-valued function. For \(\varepsilon>0\), there exist some functions \(u\) and \(v\) on \(\mathbb{R}\) such that \(u \leq f \leq v\), \(u\) is upper semicontinuous and bounded above, \(v\) is lower semicontinuous and bounded below, and \[ \boxed{\int_{\mathbb{R}}(v-u)dm<\varepsilon} \]

It suffices to prove this theorem for \(f \geq 0\) (of course \(f\) is not identically equal to \(0\) since this case is trivial). Since \(f\) is the pointwise limit of an increasing sequence of nonnegative simple functions \(s_n\), we can write \(f\) as \[ f=s_1+\sum_{n=2}^{\infty}(s_n-s_{n-1}). \] By putting \(t_1=s_1\), \(t_n=s_n-s_{n-1}\) for \(n \geq 2\), we get \(f=\sum_n t_n\). Since each \(t_n\) is a nonnegative simple function, we can write \(f\) as \[ f=\sum_{k=1}^{\infty}c_k\chi_{E_k} \] where \(c_k>0\) and \(E_k\) is measurable for all \(k\). Also, we have \[ \int_{\mathbb{R}} f dm = \sum_{k=1}^{\infty}c_km(E_k), \] and the series on the right hand side converges (since \(f \in L^1\)). By the properties of Lebesgue measure, there exist a compact set \(F_k\) and an open set \(V_k\) such that \(F_k \subset E_k \subset V_k\) and \(c_km(V_k-F_k)<\frac{\varepsilon}{2^{k+1}}\). Put \[ v=\sum_{k=1}^{\infty}c_k\chi_{V_k},\quad u=\sum_{k=1}^{N}c_k\chi_{F_k} \] (now you can see \(v\) is lower semicontinuous, being a sum of nonnegative lower semicontinuous functions, and \(u\) is upper semicontinuous, being a finite sum of upper semicontinuous functions). The \(N\) is chosen in such a way that \[ \sum_{k=N+1}^{\infty}c_km(E_k)<\frac{\varepsilon}{2}. \] Since \(V_k \supset E_k\), we have \(\chi_{V_k} \geq \chi_{E_k}\). Therefore \(v \geq f\). Similarly, \(f \geq u\). Now we need to check the desired integral inequality. A simple recombination shows that \[ \begin{aligned} v-u&=\sum_{k=1}^{\infty}c_k\chi_{V_k}-\sum_{k=1}^{N}c_k\chi_{F_k} \\ &\leq \sum_{k=1}^{\infty}c_k\chi_{V_k}-\sum_{k=1}^{N}c_k\chi_{F_k}+\sum_{k=N+1}^{\infty}c_k(\chi_{E_k}-\chi_{F_k}) \\ &=\sum_{k=1}^{\infty}c_k(\chi_{V_k}-\chi_{F_k})+\sum_{k=N+1}^{\infty}c_k\chi_{E_k}. \end{aligned} \] If we integrate the function above, we get \[ \begin{aligned} \int_{\mathbb{R}}(v-u)dm &\leq \sum_{k=1}^{\infty}c_km(V_k-F_k)+\sum_{k=N+1}^{\infty}c_km(E_k) \\ &< \sum_{k=1}^{\infty}\frac{\varepsilon}{2^{k+1}}+\frac{\varepsilon}{2} \\ &=\varepsilon. \end{aligned} \] This proves the case when \(f \geq 0\). In the general case, we write \(f=f^{+}-f^{-}\).
Attach the semicontinuous functions to \(f^{+}\) and \(f^{-}\) respectively by \(u_1 \leq f^{+} \leq v_1\) and \(u_2 \leq f^{-} \leq v_2\). Put \(u=u_1-v_2\), \(v=v_1-u_2\). As we can see, \(u\) is upper semicontinuous and \(v\) is lower semicontinuous. Also, \(u \leq f \leq v\) with the desired property since \[ \int_\mathbb{R}(v-u)dm=\int_\mathbb{R}(v_1-u_1)dm+\int_\mathbb{R}(v_2-u_2)dm<2\varepsilon, \] and the theorem follows after replacing \(\varepsilon\) with \(\varepsilon/2\). \(\square\)

Indeed, the only property of the measure used is the existence of \(F_k\) and \(V_k\). The domain \(\mathbb{R}\) here can be replaced with \(\mathbb{R}^k\) for \(1 \leq k < \infty\), and \(m\) with the respective \(m_k\). Much more generally, the domain can be replaced by any locally compact Hausdorff space \(X\) and the measure by any measure associated with the Riesz-Markov-Kakutani representation theorem on \(C_c(X)\).

The answer is no. Consider the fat Cantor set \(K\), which has Lebesgue measure \(\frac{1}{2}\). We shall show that \(\chi_K\) can not be approximated below by a lower semicontinuous function.

If \(v\) is a lower semicontinuous function such that \(v \leq \chi_K\), then \(v \leq 0\).

*Proof.* Consider the set \(V=v^{-1}((0,1])=v^{-1}((0,+\infty))\).
Since \(v \leq \chi_K\), we have \(V \subset K\). We will show that \(V\) has to be empty.

Suppose there is some \(t \in V\). Since \(V\) is open (by the lower semicontinuity of \(v\)), there exists an open neighbourhood \(U\) of \(t\) such that \(U \subset V \subset K\). This is impossible since \(K\) has an empty interior. Therefore \(V = \varnothing\). That is, \(v(x) \leq 0\) for all \(x\). \(\square\)

Suppose \(f=\chi_K\), \(u\) is an upper semicontinuous function such that \(u \geq f\), and \(v\) is a lower semicontinuous function such that \(v \leq f\). Then \(v \leq 0\), and we have \[ \int_{\mathbb{R}}(u-v)dm \geq \int_\mathbb{R}(f-v)dm \geq \int_\mathbb{R}\chi_K dm = \frac{1}{2}, \] so the approximation already fails for \(\varepsilon=\frac{1}{2}\). This example shows that there exist integrable functions that cannot be approximated in the reverse order in the sense of the Vitali–Carathéodory theorem.