Basic Facts of Semicontinuous Functions

Continuity

We restrict ourselves to \(\mathbb{R}\) endowed with the usual topology. Recall that a function \(f\) is continuous if and only if, for every open set \(U \subset \mathbb{R}\), the preimage \[ \{x:f(x) \in U\}=f^{-1}(U) \]

is open. One can rewrite this statement in \(\varepsilon-\delta\) language. To say that a function \(f: \mathbb{R} \to \mathbb{R}\) is continuous at \(x\), we mean that for any \(\varepsilon>0\) there exists some \(\delta>0\) such that for all \(t \in (x-\delta,x+\delta)\) we have \[ |f(x)-f(t)|<\varepsilon. \] \(f\) is continuous on \(\mathbb{R}\) if and only if \(f\) is continuous at every point of \(\mathbb{R}\).

If \((x-\delta,x+\delta)\) is replaced with \((x-\delta,x)\) or \((x,x+\delta)\), we get left continuity and right continuity respectively, one of which plays an important role in probability theory.

The problem is that continuity is sometimes too strong a requirement, while the 'direction' attached to left/right continuity is not needed either. For example, the function \[ f(x)=\chi_{(0,1)}(x) \] is neither left nor right continuous on all of \(\mathbb{R}\) (it fails to be right continuous at \(0\) and left continuous at \(1\)), yet it is still well behaved in a weaker sense. Left/right continuity is not quite the right weakening of continuity. We need something different.

Definition of semicontinuity

Let \(f\) be a real (or extended-real) function on \(\mathbb{R}\). The semicontinuity of \(f\) is defined as follows.

If \[ \{x:f(x)>\alpha\} \] is open for all real \(\alpha\), we say \(f\) is lower semicontinuous.

If \[ \{x:f(x)<\alpha\} \] is open for all real \(\alpha\), we say \(f\) is upper semicontinuous.

Is it possible to rewrite these definitions à la \(\varepsilon-\delta\)? The answer is yes if we restrict ourselves to metric spaces.

\(f: \mathbb{R} \to \mathbb{R}\) is upper semicontinuous at \(x\) if for every \(\varepsilon>0\), there exists some \(\delta>0\) such that for \(t \in (x-\delta,x+\delta)\), we have \[ f(t)<f(x)+\varepsilon \]

\(f: \mathbb{R} \to \mathbb{R}\) is lower semicontinuous at \(x\) if for every \(\varepsilon>0\), there exists some \(\delta>0\) such that for \(t \in (x-\delta,x+\delta)\), we have \[ f(t)>f(x)-\varepsilon \]

Of course, \(f\) is upper/lower semicontinuous on \(\mathbb{R}\) if and only if it is so at every point of \(\mathbb{R}\). The two styles of definition are equivalent.
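The \(\varepsilon-\delta\) conditions can be tried out numerically on \(\chi_{(0,1)}\) at the boundary point \(x=0\). The following is only a sampled sanity check, not a proof: the helper names `chi_open_unit`, `sample_window`, `passes_lower_test` and `passes_upper_test` are made up for this illustration, and a finite sample can of course miss bad points in general.

```python
def chi_open_unit(x):
    """Characteristic function of the open interval (0, 1)."""
    return 1.0 if 0.0 < x < 1.0 else 0.0


def sample_window(x, delta, n=2001):
    """Return n evenly spaced points strictly inside (x - delta, x + delta)."""
    step = 2.0 * delta / (n + 1)
    return [x - delta + step * (i + 1) for i in range(n)]


def passes_lower_test(f, x, eps, delta, n=2001):
    """Sampled check of f(t) > f(x) - eps on (x - delta, x + delta)."""
    return all(f(t) > f(x) - eps for t in sample_window(x, delta, n))


def passes_upper_test(f, x, eps, delta, n=2001):
    """Sampled check of f(t) < f(x) + eps on (x - delta, x + delta)."""
    return all(f(t) < f(x) + eps for t in sample_window(x, delta, n))


# At x = 0 the lower test passes, while the upper test fails for
# every delta we try (points just to the right of 0 have value 1).
lower_ok = passes_lower_test(chi_open_unit, 0.0, 0.5, 0.1)
upper_ok = any(
    passes_upper_test(chi_open_unit, 0.0, 0.5, d) for d in (0.1, 0.01, 0.001)
)
```

This matches the expectation that \(\chi_{(0,1)}\) is lower but not upper semicontinuous at \(0\).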

Relation with continuous functions

Here is another way to see it. For the continuity of \(f\), we consider arbitrary open subsets \(V\) of \(\mathbb{R}\) and require \(f^{-1}(V)\) to be open. For the lower/upper semicontinuity of \(f\), however, the open sets are restricted to those of the form \((\alpha,+\infty]\) and \([-\infty,\alpha)\) respectively. Since every open set of \(\mathbb{R}\) can be generated from sets of the form \([-\infty,\alpha)\) and \((\beta,+\infty]\) by finite intersections and arbitrary unions, we immediately get

\(f\) is continuous if and only if \(f\) is both upper semicontinuous and lower semicontinuous.

Proof. If \(f\) is continuous, then for any \(\alpha \in \mathbb{R}\), we see \([-\infty,\alpha)\) is open, and therefore \[ f^{-1}([-\infty,\alpha)) \] has to be open. The upper semicontinuity is proved. The lower semicontinuity of \(f\) is proved in the same manner.

If \(f\) is both upper and lower semicontinuous, we see \[ f^{-1}((\alpha,\beta))=f^{-1}([-\infty,\beta)) \cap f^{-1}((\alpha,+\infty]) \] is open. Since every open subset of \(\mathbb{R}\) can be written as a countable union of segments of the above types, we see for any open subset \(V\) of \(\mathbb{R}\), \(f^{-1}(V)\) is open. (If you have trouble with this part, it is recommended to review the definition of topology.) \(\square\)

Examples

There are two important examples.

  1. If \(E \subset \mathbb{R}\) is open, then \(\chi_E\) is lower semicontinuous.
  2. If \(F \subset \mathbb{R}\) is closed, then \(\chi_F\) is upper semicontinuous.

We will prove the first one. The second one follows in the same manner of course. For \(\alpha<0\), the set \(A=\chi_E^{-1}((\alpha,+\infty])\) is equal to \(\mathbb{R}\), which is open. For \(\alpha \geq 1\), since \(\chi_E \leq 1\), we see \(A=\varnothing\). For \(0 \leq \alpha < 1\) however, the set of \(x\) where \(\chi_E>\alpha\) has to be \(E\), which is still open.

When checking the semicontinuity of a function, we let \(\alpha\) run from bottom to top or from top to bottom. Recall that the function \(\chi_E\) is defined by \[ \chi_E(x)=\begin{cases} 1, & x \in E, \\ 0, & x \notin E. \end{cases} \]

Addition of semicontinuous functions

If \(f_1\) and \(f_2\) are upper/lower semicontinuous, then so is \(f_1+f_2\).

Proof. We are going to prove the two cases using different tools. Suppose first that both \(f_1\) and \(f_2\) are upper semicontinuous, and fix \(x \in \mathbb{R}\). For \(\varepsilon>0\), there exist \(\delta_1>0\) and \(\delta_2>0\) such that \[ \begin{aligned} f_1(t) &< f_1(x)+\varepsilon/2, \quad t \in (x-\delta_1,x+\delta_1), \\ f_2(t) &< f_2(x) + \varepsilon/2, \quad t \in (x-\delta_2,x+\delta_2). \end{aligned} \] If we pick \(\delta=\min(\delta_1,\delta_2)\), then for all \(t \in (x-\delta,x+\delta)\) we have \[ f_1(t)+f_2(t)<f_1(x)+f_2(x)+\varepsilon. \] The upper semicontinuity of \(f_1+f_2\) is proved by considering all \(x \in \mathbb{R}\).

Now suppose both \(f_1\) and \(f_2\) are lower semicontinuous. We have the identity \[ \{x:f_1+f_2>\alpha\}=\bigcup_{\beta\in\mathbb{R}}\{x:f_1>\beta\}\cap\{x:f_2>\alpha-\beta\}. \] Each set in the union on the right side is an intersection of two open sets, so the right side is open. Hence \(f_1+f_2\) is lower semicontinuous. \(\square\)
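The only nontrivial inclusion in the identity is that a point with \(f_1(x)+f_2(x)>\alpha\) lies in one of the sets on the right. A small sketch of how a suitable \(\beta\) can be picked is given below; the function name `witness_beta` is hypothetical, and splitting the surplus evenly is just one convenient choice.

```python
from fractions import Fraction


def witness_beta(f1_x, f2_x, alpha):
    """Return a beta with f1_x > beta and f2_x > alpha - beta, or None.

    Such a beta exists exactly when f1_x + f2_x > alpha; here the surplus
    f1_x + f2_x - alpha is split evenly between the two inequalities.
    """
    surplus = f1_x + f2_x - alpha
    if surplus <= 0:
        return None
    return f1_x - surplus / 2


# Example with exact arithmetic: f1(x) = 3/10, f2(x) = 9/10, alpha = 1.
beta = witness_beta(Fraction(3, 10), Fraction(9, 10), Fraction(1))
```

Conversely, if no surplus is available then no \(\beta\) works, since \(f_1(x)>\beta\) and \(f_2(x)>\alpha-\beta\) together force \(f_1(x)+f_2(x)>\alpha\).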


However, when there are infinitely many semicontinuous functions, things are different.

Let \(\{f_n\}\) be a sequence of nonnegative functions on \(\mathbb{R}\). Then

  • If each \(f_n\) is lower semicontinuous, then so is \(\sum_{1}^{\infty}f_n\).
  • If each \(f_n\) is upper semicontinuous, then \(\sum_{1}^{\infty}f_n\) is not necessarily upper semicontinuous.

Proof. We again use the properties of open sets. Put \(g_n=\sum_{k=1}^{n}f_k\). Suppose all \(f_k\) are lower semicontinuous. Since \(g_n\) is a finite sum of lower semicontinuous functions, each \(g_n\) is lower semicontinuous. Let \(f=\sum_{n}f_n\). As the \(f_k\) are nonnegative, \(g_n(x)\) increases to \(f(x)\), so \(f(x)>\alpha\) if and only if there exists some \(n_0\) (depending on \(x\)) such that \(g_{n_0}(x)>\alpha\). Therefore \[ \{x:f(x)>\alpha\}=\bigcup_{n \geq 1}\{x:g_n(x)>\alpha\}. \] The right-hand side is a union of open sets, hence open.

For the upper semicontinuity, it suffices to give a counterexample, but before that, we shall give the motivation.

As said, the characteristic function of a closed set is upper semicontinuous. Suppose \(\{E_n\}\) is a sequence of almost disjoint closed sets. Then \(E=\cup_{n\geq 1}E_n\) is not necessarily closed, and therefore \(\chi_E=\sum\chi_{E_n}\) (a.e.) is not necessarily upper semicontinuous. Now we give a concrete example. Put \(f_0=\chi_{[1,+\infty)}\) and \(f_n=\chi_{E_n}\) for \(n \geq 1\), where \[ E_n=\{x:\frac{1}{1+n} \leq x \leq \frac{1}{n}\}. \] For \(x > 0\), we have \(f(x)=\sum_{n \geq 0}f_n(x) \geq 1\), while \(f(x)=0\) for \(x \leq 0\). Hence \(f^{-1}([-\infty,1))=(-\infty,0]\), which is not open. \(\square\)

Notice that \(f\) can be defined on any topological space here.
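The counterexample on \(\mathbb{R}\) can be evaluated exactly at rational points. The sketch below is only an illustration: the function name `f_sum` is made up, and instead of summing the series it counts directly how many of the sets \(E_n\) contain \(x\).

```python
from fractions import Fraction
from math import ceil, floor


def f_sum(x):
    """Value of f = f_0 + f_1 + f_2 + ... at a rational point x.

    f_0 is the indicator of the points x >= 1 and f_n is the indicator
    of the closed interval [1/(n+1), 1/n].
    """
    x = Fraction(x)
    if x <= 0:
        return 0
    total = 1 if x >= 1 else 0          # contribution of f_0
    inv = 1 / x
    # x lies in E_n exactly when inv - 1 <= n <= inv, with n >= 1.
    first = max(1, ceil(inv - 1))
    last = floor(inv)
    total += max(0, last - first + 1)
    return total


values_near_zero = [f_sum(Fraction(1, 10 ** k)) for k in range(1, 7)]
```

Every positive sample point, however close to \(0\), gives a value of at least \(1\), while `f_sum(0)` is \(0\); this is the jump that prevents \(\{f<1\}\) from being open.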

Maximum and minimum

There is one fact we already know about continuous functions.

If \(X\) is compact and \(f: X \to \mathbb{R}\) is continuous, then there exist \(a,b \in X\) such that \(f(a)=\min f(X)\), \(f(b)=\max f(X)\).

In fact, \(f(X)\) is still compact. For semicontinuous functions the situation is different, but in a reasonable way. For upper semicontinuous functions, we have the following fact.

If \(X\) is compact and \(f: X \to (-\infty,+\infty)\) is upper semicontinuous, then there exists some \(a \in X\) such that \(f(a)=\max f(X)\).

Notice that \(X\) is not assumed to hold any other topological property. It can be Hausdorff or Lindelöf, but we are not asking for restrictions like this. The only property we will be using is that every open cover of \(X\) has a finite subcover. Of course, one can replace \(X\) with any compact subset of \(\mathbb{R}\), for example, \([a,b]\).

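Proof. Put \(\alpha=\sup f(X)\). First, \(\alpha<+\infty\): the sets \(\{x:f(x)<n\}\), \(n \geq 1\), are open by upper semicontinuity, increase with \(n\), and cover \(X\) since \(f\) is real-valued, so by compactness \(X=\{x:f(x)<N\}\) for some \(N\). Now define \[ E_n=\{x:f(x)<\alpha-\frac{1}{n}\}. \] Each \(E_n\) is open and \(E_n \subset E_{n+1}\). If \(f\) attains no maximum, then for any \(x \in X\) we have \(f(x)<\alpha\), so there exists some \(n \geq 1\) such that \(f(x)<\alpha-\frac{1}{n}\). That is, \(x \in E_n\) for some \(n\). Therefore \(\{E_n\}_{n \geq 1}\) is an open cover of \(X\). But this cover has no finite subcover: since the \(E_n\) are nested, a finite subcover would give \(X=E_N\) for some \(N\), that is, \(\sup f(X) \leq \alpha-\frac{1}{N}<\alpha\). This contradicts the compactness of \(X\). \(\square\)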
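To see that upper semicontinuity cannot simply be swapped for lower semicontinuity here, one can look at a pair of functions on \([0,1]\). The two functions and the helper names below are chosen purely for illustration, and evaluating on a finite grid only suggests the behaviour rather than proving it.

```python
from fractions import Fraction


def lsc_without_max(x):
    """Equals x on [0, 1) and 0 at x = 1; lower semicontinuous on [0, 1]."""
    return x if x < 1 else Fraction(0)


def usc_with_max(x):
    """Equals x on [0, 1) and 2 at x = 1; upper semicontinuous on [0, 1]."""
    return x if x < 1 else Fraction(2)


def grid(n):
    """The points 0, 1/n, ..., 1 of [0, 1] as exact fractions."""
    return [Fraction(k, n) for k in range(n + 1)]


def grid_max(f, n):
    """Largest value of f over the grid with n + 1 points."""
    return max(f(x) for x in grid(n))


best_lsc = grid_max(lsc_without_max, 1000)
best_usc = grid_max(usc_with_max, 1000)
```

On the grid, the values of the first function creep up towards the supremum \(1\) without ever reaching it, while the second function attains its maximum at \(x=1\).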

Approximating integrable functions

This is a comprehensive application of several properties of semicontinuity.

(Vitali–Carathéodory theorem) Suppose \(f \in L^1(\mathbb{R})\), where \(f\) is a real-valued function. For every \(\varepsilon>0\), there exist functions \(u\) and \(v\) on \(\mathbb{R}\) such that \(u \leq f \leq v\), \(u\) is upper semicontinuous and bounded above, \(v\) is lower semicontinuous and bounded below, and \[ \boxed{\int_{\mathbb{R}}(v-u)dm<\varepsilon} \]

It suffices to prove this theorem for \(f \geq 0\) (and \(f\) not identically equal to \(0\), since that case is trivial). Since \(f\) is the pointwise limit of an increasing sequence of simple functions \(s_n\), we are able to write \(f\) as \[ f=s_1+\sum_{n=2}^{\infty}(s_n-s_{n-1}). \] By putting \(t_1=s_1\), \(t_n=s_n-s_{n-1}\) for \(n \geq 2\), we get \(f=\sum_n t_n\). Each \(t_n\) is a nonnegative simple function, that is, a finite linear combination of characteristic functions of measurable sets with positive coefficients, so we are able to write \(f\) as \[ f=\sum_{k=1}^{\infty}c_k\chi_{E_k} \] where \(c_k>0\) and \(E_k\) is measurable for all \(k\). Also we have \[ \int_{\mathbb{R}} f dm = \sum_{k=1}^{\infty}c_km(E_k), \] and the series on the right-hand side converges (since \(f \in L^1\)).

By the regularity of Lebesgue measure, there exist a compact set \(F_k\) and an open set \(V_k\) such that \(F_k \subset E_k \subset V_k\) and \(c_km(V_k-F_k)<\frac{\varepsilon}{2^{k+1}}\). Put \[ v=\sum_{k=1}^{\infty}c_k\chi_{V_k},\quad u=\sum_{k=1}^{N}c_k\chi_{F_k} \] (now you can see that \(v\) is lower semicontinuous, being a sum of nonnegative lower semicontinuous functions, and \(u\) is upper semicontinuous, being a finite sum of upper semicontinuous functions). The \(N\) is chosen in such a way that \[ \sum_{k=N+1}^{\infty}c_km(E_k)<\frac{\varepsilon}{2}. \] Since \(V_k \supset E_k\), we have \(\chi_{V_k} \geq \chi_{E_k}\). Therefore \(v \geq f\). Similarly, \(f \geq u\).

Now we need to check the desired integral inequality. A simple recombination shows that \[ \begin{aligned} v-u&=\sum_{k=1}^{\infty}c_k\chi_{V_k}-\sum_{k=1}^{N}c_k\chi_{F_k} \\ &\leq \sum_{k=1}^{\infty}c_k\chi_{V_k}-\sum_{k=1}^{N}c_k\chi_{F_k}+\sum_{k=N+1}^{\infty}c_k(\chi_{E_k}-\chi_{F_k}) \\ &=\sum_{k=1}^{\infty}c_k(\chi_{V_k}-\chi_{F_k})+\sum_{k=N+1}^{\infty}c_k\chi_{E_k}. \end{aligned} \] If we integrate the function above, we get \[ \begin{aligned} \int_{\mathbb{R}}(v-u)dm &\leq \sum_{k=1}^{\infty}c_km(V_k-F_k)+\sum_{k=N+1}^{\infty}c_km(E_k) \\ &< \sum_{k=1}^{\infty}\frac{\varepsilon}{2^{k+1}}+\frac{\varepsilon}{2} \\ &=\varepsilon. \end{aligned} \] This proves the case when \(f \geq 0\).

In the general case, we write \(f=f^{+}-f^{-}\). Attach the semicontinuous functions to \(f^{+}\) and \(f^{-}\) respectively by \(u_1 \leq f^{+} \leq v_1\) and \(u_2 \leq f^{-} \leq v_2\). Put \(u=u_1-v_2\), \(v=v_1-u_2\). As we can see, \(u\) is upper semicontinuous and bounded above, and \(v\) is lower semicontinuous and bounded below. Also, \(u \leq f \leq v\) with the desired property since \[ \int_\mathbb{R}(v-u)dm=\int_\mathbb{R}(v_1-u_1)dm+\int_\mathbb{R}(v_2-u_2)dm<2\varepsilon, \] and the theorem follows since \(\varepsilon\) is arbitrary. \(\square\)
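For a concrete feel of the construction, consider the special case where \(f\) is a finite sum \(\sum_k c_k\chi_{[a_k,b_k]}\) with \(c_k>0\). The sketch below is a simplified illustration under that assumption: each \(E_k=[a_k,b_k]\) is already compact, so one may take \(F_k=E_k\), and \(V_k\) is obtained by widening the interval slightly. The name `sandwich_gap` and the particular choice of widening `eta` are not from the proof above.

```python
from fractions import Fraction


def sandwich_gap(terms, eps):
    """Integral of v - u for f = sum of c * chi_[a, b] over `terms`.

    Each term is (c, a, b) with c > 0 and a <= b.  We take F_k = [a, b]
    (already compact) and V_k = (a - eta_k, b + eta_k), with eta_k chosen
    so that c * m(V_k - F_k) = eps / 2**(k + 2) < eps / 2**(k + 1).
    """
    eps = Fraction(eps)
    gap = Fraction(0)
    open_sets = []
    for k, (c, a, b) in enumerate(terms, start=1):
        c = Fraction(c)
        eta = eps / (c * 2 ** (k + 3))
        open_sets.append((Fraction(a) - eta, Fraction(b) + eta))
        gap += c * 2 * eta              # c_k * m(V_k - F_k)
    return gap, open_sets


# f = 2 * chi_[0, 1] + 3 * chi_[1/2, 2], with eps = 1/10.
gap, open_sets = sandwich_gap([(2, 0, 1), (3, Fraction(1, 2), 2)], Fraction(1, 10))
```

Here \(u=f\) and \(v=\sum_k c_k\chi_{V_k}\), so the returned `gap` is exactly \(\int(v-u)\,dm\), and it stays below \(\varepsilon\) no matter how many terms are supplied.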

Generalization

Indeed, the only property of the measure that was used is the existence of \(F_k\) and \(V_k\). The domain \(\mathbb{R}\) here can be replaced with \(\mathbb{R}^k\) for \(1 \leq k < \infty\), and \(m\) replaced with the respective \(m_k\). Much more generally, the domain can be replaced by any locally compact Hausdorff space \(X\), and the measure by any measure associated with the Riesz-Markov-Kakutani representation theorem on \(C_c(X)\).

Is the reverse approximation always possible?

The answer is no. Consider the fat Cantor set \(K\), which has Lebesgue measure \(\frac{1}{2}\). We shall show that \(\chi_K\) cannot be approximated from below by a lower semicontinuous function.

If \(v\) is a lower semicontinuous function such that \(v \leq \chi_K\), then \(v \leq 0\).

Proof. Consider the set \(V=v^{-1}((0,1])=v^{-1}((0,+\infty))\). Since \(v \leq \chi_K\), we have \(V \subset K\). We will show that \(V\) has to be empty.

Suppose there were some \(t \in V\). Since \(V\) is open, there exists some open neighborhood \(U\) of \(t\) such that \(U \subset V\). But then \(U\) is a nonempty open set with \(U \subset K\), which is impossible since \(K\) has empty interior. Therefore \(V = \varnothing\). That is, \(v(x) \leq 0\) for all \(x\). \(\square\)

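Let \(f=\chi_K\). Suppose \(u\) is any upper semicontinuous function such that \(u \geq f\), and \(v\) is any lower semicontinuous function such that \(v \leq f\). Since \(v \leq 0\), we have \[ \int_{\mathbb{R}}(u-v)dm \geq \int_\mathbb{R}(f-v)dm \geq \int_\mathbb{R}f\,dm = \frac{1}{2}. \] So for \(\varepsilon=\frac{1}{2}\) no such pair satisfies \(\int_{\mathbb{R}}(u-v)dm<\varepsilon\). This example shows that there exist integrable functions that cannot be reversely approximated in the sense of the Vitali–Carathéodory theorem.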
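The two properties of \(K\) used above, measure \(\frac{1}{2}\) and empty interior, can be checked on the finite stages of one standard construction. The sketch below assumes that \(K\) is the Smith–Volterra–Cantor set, where at step \(k\) an open middle piece of length \(4^{-k}\) is removed from every remaining interval; the text above does not specify which fat Cantor set is meant, and the function names are made up.

```python
from fractions import Fraction


def fat_cantor_stage(n):
    """Closed intervals left after n removal steps, starting from [0, 1].

    At step k an open middle piece of length 4**(-k) is removed from
    every remaining interval (Smith-Volterra-Cantor construction).
    """
    intervals = [(Fraction(0), Fraction(1))]
    for k in range(1, n + 1):
        gap = Fraction(1, 4 ** k)
        next_intervals = []
        for left, right in intervals:
            mid = (left + right) / 2
            next_intervals.append((left, mid - gap / 2))
            next_intervals.append((mid + gap / 2, right))
        intervals = next_intervals
    return intervals


def total_length(intervals):
    """Sum of the lengths of the intervals."""
    return sum(right - left for left, right in intervals)


def longest(intervals):
    """Length of the longest interval."""
    return max(right - left for left, right in intervals)


stage3 = fat_cantor_stage(3)
```

After \(n\) steps the total length is \(\frac{1}{2}+2^{-(n+1)}\), which decreases to \(\frac{1}{2}\), while the longest remaining interval shrinks to \(0\), so the limit set contains no interval.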

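Several Different Proofs of L'Hôpital's Rule

Preface

I suspect that many high school students, and even some middle school students, have heard of L'Hôpital's rule and know how to apply it in simple cases. Put simply, when dealing with a function of the form \(\frac{0}{0}\), differentiating the numerator and the denominator may simplify the computation considerably. But why does L'Hôpital's rule work? Can it be justified in rigorous mathematical language? That is the question this article sets out to answer.

The complete statement of L'Hôpital's rule

Suppose \(f\) and \(g\) are real functions differentiable on \((a,b)\), and \(g'(x)\neq0\) for all \(x\in(a,b)\), where \(a\) and \(b\) satisfy \[ -\infty\leq{a}<{b}\leq+\infty.\]
If \[\lim_{x\to a}\frac{f'(x)}{g'(x)}=A,\] and if either \[\lim_{x\to a}f(x)=\lim_{x\to a}g(x)=0\] or \[\lim_{x\to a}g(x)=+\infty,\] then \[\lim_{x\to a}\frac{f(x)}{g(x)}=A.\] Analogous statements hold for \(x\to{b}\) or \(g(x)\to-\infty\).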

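Proof 1: linear approximation

Bernoulli's original "proof"

L'Hôpital's rule first appeared in 1696, in L'Hôpital's book Analyse des Infiniment Petits pour l'Intelligence des Lignes Courbes. The book is, of course, famous for "L'Hôpital's rule". The proof there runs as follows: \[ \frac{f(a+dx)}{g(a+dx)}=\frac{f(a)+f'(a)dx}{g(a)+g'(a)dx}=\frac{f'(a)dx}{g'(a)dx}=\frac{f'(a)}{g'(a)} \]

This proof is easy to follow: expand by linear approximation, then use \(f(a)=g(a)=0\) to obtain the result. But the argument is certainly not rigorous: the meaning of \(dx\) is very vague here, and it is awkward for expressing the case \(x\to\infty\). For the history, see the book The Historical Development of the Calculus.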

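A rigorous proof by linear approximation

First, we only discuss the case \(h\to0\). Other cases can be handled by a change of variable; for example, when \(h\to\infty\) we can put \(u=\frac{1}{h}\), which turns it into the case \(u\to0\). We also only discuss the case where the functions tend to \(0\), because linear approximation may be unavailable when a function tends to infinity. For example, \(y=\frac{1}{x}\) has no linear approximation near \(x=0\).

For the derivative of a function we have \[ f'(x) = \lim_{h\to 0} \frac{f(x+h)-f(x)}{h}, \]

which we can write as \[ f'(x) = \frac{f(x+h)-f(x)}{h} + r(h) \]

where \(\lim\limits_{h\to0}r(h)=0\) and \(r(h)\) is a continuous function. Rearranging (the sign of \(r(h)\) has been adjusted here), we obtain the linear approximation \[ f(x+h)=f(x)+f'(x)h+r(h)h. \]

Similarly, we can write down the linear approximation of \(g(x)\): \[ g(x+h)=g(x)+g'(x)h+s(h)h. \]

Hence \[ \frac{f(a+h)}{g(a+h)}=\frac{f(a)+f'(a)h+r(h)h}{g(a)+g'(a)h+s(h)h}=\frac{f'(a)h+r(h)h}{g'(a)h+s(h)h}=\frac{f'(a)+r(h)}{g'(a)+s(h)} \]

As \(h\to0\), we have \(r(h)\to0\) and \(s(h)\to0\), which gives the conclusion (here \(f(a)=g(a)=0\) and \(g'(a)\neq0\) are used).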
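The last identity can be watched numerically. The sketch below picks \(f(x)=\sin x\) and \(g(x)=e^x-1\) at \(a=0\) purely as an example (so \(f(0)=g(0)=0\) and \(f'(0)=g'(0)=1\)); the function name `ratio_and_remainders` is made up for this illustration.

```python
import math


def ratio_and_remainders(h):
    """For f = sin and g(x) = exp(x) - 1 at a = 0, return f/g, r(h), s(h).

    r and s are the remainders in f(h) = f'(0) h + r(h) h and
    g(h) = g'(0) h + s(h) h, with f'(0) = g'(0) = 1.
    """
    f_val = math.sin(h)
    g_val = math.expm1(h)        # exp(h) - 1, accurate for small h
    r = f_val / h - 1.0
    s = g_val / h - 1.0
    return f_val / g_val, r, s


# The ratio, r(h) and s(h) for a few shrinking values of h.
table = [(h, *ratio_and_remainders(h)) for h in (1e-1, 1e-2, 1e-3, 1e-4)]
```

As \(h\) shrinks, both remainders tend to \(0\) and the quotient \(\frac{f(h)}{g(h)}=\frac{1+r(h)}{1+s(h)}\) approaches \(\frac{f'(0)}{g'(0)}=1\).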

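Proof 2: the mean value theorem

In this proof we use Cauchy's mean value theorem (the generalized mean value theorem, GMVT) to give a complete proof covering all cases; along the way some techniques for manipulating inequalities are involved. The proof comes from W. Rudin's Principles of Mathematical Analysis, and I will add some extra explanation.

Case 1: \(-\infty\leq{A}<+\infty\)

Choose real numbers \(\varepsilon>0\) and \(q\) such that \(A<A+\varepsilon<q\). Since \(\frac{f'(x)}{g'(x)}\to{A}\), there must be a real number \(\delta\in(0,b-a)\) such that for all \(a<x<a+\delta\) we have \(-\varepsilon<\frac{f'(x)}{g'(x)}-A<\varepsilon\). In particular \[\frac{f'(x)}{g'(x)}<A+\varepsilon.\] (When \(A=-\infty\) or \(a=-\infty\), the same argument applies with \(A+\varepsilon\) replaced by any real \(r<q\) and \((a,a+\delta)\) replaced by an interval \((a,c)\); only the upper bound is needed below.)

If \(a<x<y<a+\delta\), then by the GMVT there exists \(t\in(x,y)\) such that inequality (A) holds: \[ \frac{f(x)-f(y)}{g(x)-g(y)}=\frac{f'(t)}{g'(t)}<A+\varepsilon \] The last inequality holds because \(t\in(x,y)\subset(a,a+\delta)\), and the bound on \(\frac{f'}{g'}\) holds throughout \((a,a+\delta)\).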

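Case 1.1: \(f(x)\to0\) and \(g(x)\to0\)

Let \(x\to{a}\) in inequality (A). Since \(f(x)\to0\) and \(g(x)\to0\), the inequality in \(x\) and \(y\) becomes \(\frac{f(y)}{g(y)}\leq{A+\varepsilon}<q\quad(a<y<a+\delta)\).

That is, for any real \(\varepsilon>0\) there is \(\delta>0\) such that inequality (B) holds whenever \(a<y<a+\delta\): \[ \frac{f(y)}{g(y)}\leq\varepsilon+A<q \]

(Note: this is not yet an \(\varepsilon-\delta\) proof of convergence in this case; so far we only have an upper bound.)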

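Case 1.2: \(g(x)\to+\infty\)

Let \(r=A+\varepsilon\). Fix \(y\) in inequality (A). Since \(g(x)\to+\infty\), we can find a point \(c\in(a,y)\) such that \(g(x)>g(y)\) and \(g(x)>0\) both hold for all \(x\in(a,c)\). Multiplying both sides of inequality (A) by \([g(x)-g(y)]/g(x)\), which is positive, we obtain inequality (C): \[ \frac{f(x)}{g(x)}<r-r\frac{g(y)}{g(x)}+\frac{f(y)}{g(x)}\quad(a<x<c) \]

Let \(x\to{a}\). Since \(g(x)\to+\infty\), the right side of (C) tends to \(r<q\), so there is a point \(c_1\in(a,c)\) such that inequality (D) holds: \[ \frac{f(x)}{g(x)}<q\quad(a<x<c_1) \]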
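Inequalities (C) and (D) can be checked on a concrete pair of functions. The example below is an assumption made only for illustration: \(f(x)=-\ln x\) and \(g(x)=\frac{1}{x}\) on \((0,1)\) with \(a=0\), so that \(g(x)\to+\infty\), \(\frac{f'(x)}{g'(x)}=x\to0=A\), and \(\frac{f(x)}{g(x)}=-x\ln x\). The numbers \(\varepsilon=0.005\), \(q=0.01\), \(y=0.004\) are arbitrary choices with \(y<\varepsilon\), so that \(\frac{f'}{g'}<r\) on \((0,y)\).

```python
import math


def f(x):
    """Example numerator: f(x) = -ln(x) on (0, 1)."""
    return -math.log(x)


def g(x):
    """Example denominator: g(x) = 1/x, which tends to +infinity at 0."""
    return 1.0 / x


def rhs_of_C(x, y, r):
    """Right side of inequality (C): r - r g(y)/g(x) + f(y)/g(x)."""
    return r - r * g(y) / g(x) + f(y) / g(x)


# A = 0, r = A + eps with eps = 0.005, and q = 0.01 > r.
r, q, y = 0.005, 0.01, 0.004
sample_points = (1e-3, 1e-4, 1e-5, 1e-6)
checks_C = [f(x) / g(x) < rhs_of_C(x, y, r) for x in sample_points]
checks_D = [f(x) / g(x) < q for x in sample_points]
```

For the sampled \(x<y\), the quotient stays below the right side of (C), and for these small \(x\) it is also below \(q\), as (D) predicts.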

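Combining Cases 1.1 and 1.2

Inequalities (B) and (D) both say only that there exists \(c\in(a,b)\) such that \(\frac{f(x)}{g(x)}<q\) for all \(x\in(a,c)\). They do not yet tell us how \(\frac{f(x)}{g(x)}\) compares with \(A\) from below.

Note that (B) and (D) were obtained for an arbitrary \(q>A\). In other words, for every \(q>A\) there exists \(c\in(a,b)\) such that \(\frac{f(x)}{g(x)}<q\) for all \(x\in(a,c)\). This is only an upper bound; the matching lower bound comes from Case 2.

Case 2: \(-\infty<{A}\leq+\infty\)

This case is completely analogous to Case 1. By the same argument, for every \(p<A\) there is \(c'\in(a,b)\) such that \(p<\frac{f(x)}{g(x)}\) for all \(x\in(a,c')\).

Combining the two cases when \(A\) is finite (when \(A=\pm\infty\), one of them alone suffices): for any \(p<A<q\) we have \(p<\frac{f(x)}{g(x)}<q\) for all \(x\) sufficiently close to \(a\), which is exactly \(\frac{f(x)}{g(x)}\to{A}\). This proves the theorem.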

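A few small questions about the proof

Why is the denominator of the first term in inequality (A) necessarily nonzero?

Suppose it were zero, that is, \(g(x)=g(y)\). Then by Rolle's theorem there would be some \({x}<t<y\) with \(g'(t)=0\), contradicting the assumption that \(g'\neq0\) on \((a,b)\); in particular \[ f'(t)/g'(t)<A+\varepsilon \] would make no sense.

Why does the inequality become "less than or equal to" in (B)?

Each time \(x\) changes, \(t\) changes as well; write it as \(t(x)\). In the limit it is possible that \(\lim\limits_{x\to{a}}\frac{f'(t(x))}{g'(t(x))}=r\), where \(r=A+\varepsilon\), so the strict inequality may turn into an equality.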