The notion of Cohen-Macaulay ring is sufficiently general to cover a wealth of examples in algebraic geometry, invariant theory and combinatorics; meanwhile it is sufficiently strict to allow a rich theory. The notion of Cohen-Macaulay is a workhorse of commutative algebra. In this post, we study an important subclass of Cohen-Macaulay rings: regular local rings.

It is recommended to be familiar with basic commutative algebra tools such as Nakayama's lemma and minimal prime ideals.

The content can be generalised to modules to a good extent, but we are not doing so for the sake of quick accessibility.

Definition 1. The **Krull dimension** of \(R\), written as \(\dim{R}\), is the supremum taken over the lengths of prime ideal chains \[\mathfrak{p}_0 \subsetneq \mathfrak{p}_1 \subsetneq \dots \subsetneq \mathfrak{p}_d.\]

This definition was introduced to define the dimension of affine varieties, in a global sense. Locally, we have the following definition.

Definition 2. Let \(R\) be a local ring with maximal ideal \(\mathfrak{m}\) and residue field \(k=R/\mathfrak{m}\). The **embedding dimension** of \(R\) is \[emb.\dim{R} = \dim_k(\mathfrak{m}/\mathfrak{m}^2),\] the dimension of \(\mathfrak{m}/\mathfrak{m}^2\) as a \(k\)-vector space.

Let \(R\) be the local ring of a complex variety \(X\) at a point; the embedding dimension then measures the smallest number of coordinates needed to describe \(X\) near that point.

Definition 3. The ring \(R\) is called **regular** if \(\dim{R}=emb.\dim{R}\).

The most immediate intuitive examples of regular local rings have to be rings of the form \[K[[x_1,\dots,x_n]],\] where \(K\) is a field. These kinds of rings are regular local rings of Krull dimension \(n\).

But by working on regular local rings, we are not essentially restricting ourselves to rings of power series over a field. For example, the ring

Nevertheless, our primary model of regular local rings is still a ring of the form \(K[[x_1,\dots,x_n]]\).

Definition 4. A **regular sequence** of \(R\), also written as an \(R\)-sequence, is a sequence \([x_1,\dots,x_n]\) of elements in \(\mathfrak{m}\) such that \(x_1\) is a non-zero-divisor in \(R\), and such that for each \(i>1\), \(x_i\) is a non-zero-divisor in \(R/(x_1,\dots,x_{i-1})\). The **grade** of \(R\), written \(G(R)\), is the maximal length of a regular sequence. If \(G(R)=\dim{R}\), then \(R\) is called **Cohen-Macaulay**.
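For instance (a standard example, not specific to this post): in the power series ring \(k[[x,y]]\), the sequence \([x,y]\) already witnesses the Cohen-Macaulay property.

```latex
% x is a non-zero-divisor in k[[x,y]];
% y is a non-zero-divisor in k[[x,y]]/(x) \cong k[[y]].
% Hence [x, y] is a regular sequence, and
G(k[[x,y]]) \ge 2 = \dim k[[x,y]]
\quad\Longrightarrow\quad
k[[x,y]] \text{ is Cohen-Macaulay.}
```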

It is quite intuitive that, for

Amongst many sequences, we are in particular interested in the sequences whose elements are mapped onto a basis of \(\mathfrak{m}/\mathfrak{m}^2\).

Proposition 1. Let \(x_1,\dots,x_n\) be elements in \(\mathfrak{m} \subset R\) whose images form a basis of \(\mathfrak{m}/\mathfrak{m}^2\); then \(x_1,\dots,x_n\) generate the maximal ideal \(\mathfrak{m}\).

*Proof.* Put \(I=(x_1,\dots,x_n)\). By assumption \(I+\mathfrak{m}^2=\mathfrak{m}\), so \(\mathfrak{m}(\mathfrak{m}/I)=\mathfrak{m}/I\), and Nakayama's lemma yields \(\mathfrak{m}/I=0\), i.e. \(\mathfrak{m}=I\). \(\square\)

Proposition 2. If \(R\) is a regular local ring of dimension \(n\) and \(x_1, \dots, x_n \in \mathfrak{m}\) map to a basis of \(\mathfrak{m}/\mathfrak{m}^2\), then \(R/(x_1,\dots,x_i)\) is a regular local ring of dimension \(n-i\).

*Proof.* By proposition 1, we have

It now looks quite promising that such basis sequences can get everything down to earth, and we will show that in the following section.

Proposition 3. If \(R\) is regular, then \(R\) is an integral domain.

*Proof.* We use induction on \(\dim{R}\). When \(\dim{R}=0\) and \(R\) is regular, we have \(\mathfrak{m}=\mathfrak{m}^2\), so \(\mathfrak{m}=0\) by Nakayama's lemma; hence \(R\) is a field and in particular a domain.

Pick \(x \in \mathfrak{m} \setminus \mathfrak{m}^2\). Then this element maps to a nonzero element in \(\mathfrak{m}/\mathfrak{m}^2\). There exists a basis of

We claim that there exists \(x \in \mathfrak{m}\setminus\mathfrak{m}^2\) such that

Thus, as our claim is true, we can write

We now reach our conclusion of this post.

Proposition 4. If \(R\) is regular and of Krull dimension \(n\), then any \(x_1,\dots,x_n \in \mathfrak{m}\) mapping to a basis of \(\mathfrak{m}/\mathfrak{m}^2\) gives rise to a regular sequence (\(R\)-sequence). Hence \(G(R)=\dim{R}\) and therefore \(R\) is Cohen-Macaulay.

*Proof.* As \(G(R) \le \dim{R}\), once we have shown that


Theorem 1 (Mason-Stothers). Let \(a(X),b(X),c(X) \in K[X]\) be polynomials such that \((a,b,c)=1\) and \(a+b=c\). Then \[\max\{\deg a,\deg b,\deg c\} \le n_0(abc)-1,\] where \(n_0(f)\) denotes the number of distinct zeros of \(f\).
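As a sanity check, the inequality can be tested on a concrete triple with a short script. Polynomials are represented as coefficient lists over \(\mathbb{Q}\), and \(n_0(f)\) is computed as \(\deg f - \deg\gcd(f,f')\); the helper names below are my own, not from the proof.

```python
from fractions import Fraction

# polynomials over Q as lists of Fractions, lowest-degree coefficient first
def trim(p):
    while p and p[-1] == 0:
        p.pop()
    return p

def deg(p):
    return len(p) - 1

def mul(p, q):
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return trim(r)

def deriv(p):
    return trim([i * c for i, c in enumerate(p)][1:])

def pmod(p, q):
    p = p[:]
    while p and deg(p) >= deg(q):
        k, f = deg(p) - deg(q), p[-1] / q[-1]
        for i, c in enumerate(q):
            p[i + k] -= f * c
        trim(p)
    return p

def pgcd(p, q):
    while q:
        p, q = q, pmod(p, q)
    return p

def n0(p):
    # number of distinct roots = deg p - deg gcd(p, p')
    return deg(p) - deg(pgcd(p, deriv(p)))

F = Fraction
a = [F(0), F(2), F(1)]   # a = X^2 + 2X
b = [F(1)]               # b = 1
c = [F(1), F(2), F(1)]   # c = (X+1)^2, so a + b = c
abc = mul(mul(a, b), c)  # = X(X+2)(X+1)^2, three distinct roots
assert n0(abc) == 3
assert max(deg(a), deg(b), deg(c)) <= n0(abc) - 1  # 2 <= 2
```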

*Proof.* Putting \(f=a/c\) and \(g=b/c\), we have

We interrupt the proof here for some good reasons. Rational functions of the form \(f'/f\) remind us of the chain rule applied to \(\log{x}\). In the context of calculus, we have

*Proof (continued).* Since

Next we observe the degrees of

We present some applications of this theorem.

Corollary 1 (Fermat's theorem for polynomials). Let \(a(X),b(X)\) and \(c(X)\) be relatively prime polynomials in \(K[X]\) such that not all of them are constant, and such that \[a(X)^n+b(X)^n=c(X)^n.\] Then \(n \le 2\).

Alternatively one can argue the curve

*Proof.* Since \(a,b\) and \(c\) are relatively prime, we also have \(a^n\),

Corollary 2 (Davenport's inequality). Let \(f,g \in K[X]\) be non-constant polynomials such that \(f^3-g^2 \ne 0\). Then \[\deg (f^3-g^2) \ge \frac{1}{2}\deg f + 1.\]

One may discuss cases separately on whether

Lemma 1. Suppose \(p,q \in K[X]\) are two distinct non-constant polynomials; then \[\deg(p-q) \ge \deg p - n_0(p)-n_0(q)+1.\]

*Proof.* Let \(k(f)\) be the leading coefficient of a polynomial

Next suppose \(\deg p = \deg q\) and\(k(p)=k(q)\). If

*Proof (of corollary 2).* Put

One may also generalise the case to

For a more geometrical interpretation, one may be interested in [Zannier 95], where Riemann's existence theorem is also discussed.

In Stothers's paper [Stothers 81], the author discussed the condition where the equality holds. If you look carefully you will realise his theorem 1.1 is exactly the Mason-Stothers theorem.

- [Davenport 65] H. Davenport, *On \(f^3(t)-g^2(t)\)*, 1965. (can someone find a digital copy of this paper?)
- [Ma 84] R. C. Mason, *Diophantine Equations over Function Fields*, 1984.
- [Shioda 04] Tetsuji Shioda, *The abc-theorem, Davenport's inequality and elliptic surfaces*, 2004. (https://www2.rikkyo.ac.jp/web/shioda/papers/esdstadd.pdf)
- [Stothers 81] W. W. Stothers, *Polynomial Identities and Hauptmoduln*, 1981. (https://doi.org/10.1093/qmath/32.3.349)
- [Zannier 95] Umberto Zannier (Venezia), *On Davenport's bound for the degree of \(f^3-g^2\) and Riemann's Existence Theorem*, 1995. (https://eudml.org/doc/206763)

The **Riemann zeta function** is widely known:

\[\zeta(s)=\sum_{n=1}^{\infty}\frac{1}{n^s}.\]

It is widely known mainly because of the celebrated hypothesis by Riemann that remains unsolved after more than a century's attempts by mathematicians and 150 million attempts by computers:

Riemann Hypothesis: The non-trivial zeros of \(\zeta(s)\) lie on the line \(\Re(s)=\frac{1}{2}\).

People are told by pop-science how important and mysterious this hypothesis is, or how disastrous it would be if it were solved one day. We can put that aside. A question is: why would Riemann ever think about the zero set of *such* a function, rather than something else? According to Riemann, the distribution function of primes

\[\pi(x)=\sum_{\substack{p \le x \\ p\text{ prime}}}1\]

may be written as the series

\[\pi(x)=R(x)-\sum_{\rho}R(x^\rho)\]

where

\[R(x)=1+\sum_{n=1}^{\infty}\frac{1}{n\zeta(n+1)}\frac{(\log{x})^n}{n!}\]

and \(\rho\) varies over all zeros of \(\zeta(s)\). With these being said, once this *hypothesis* is proven true, we may have a much more concrete say about the distribution of prime numbers.
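To see numerically how striking this is, one can truncate the series \(R(x)\) (dropping the correction terms over the zeros \(\rho\)) and compare it with \(\pi(x)\); the helper names below are illustrative, not from Riemann.

```python
from math import log, factorial

def zeta_real(m, terms=10000):
    # direct summation of zeta(m) for real m > 1
    return sum(1 / k**m for k in range(1, terms + 1))

def R(x, terms=60):
    # truncated Gram series R(x) = 1 + sum_{n>=1} (log x)^n / (n zeta(n+1) n!)
    return 1 + sum(log(x)**n / (n * zeta_real(n + 1) * factorial(n))
                   for n in range(1, terms + 1))

def prime_pi(x):
    # naive sieve-based prime counting
    sieve = [True] * (x + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(x**0.5) + 1):
        if sieve[i]:
            sieve[i*i::i] = [False] * len(sieve[i*i::i])
    return sum(sieve)

# already without the zeros, R(100) lands within one unit of pi(100) = 25
assert prime_pi(100) == 25
assert abs(R(100) - prime_pi(100)) < 1
```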

But this is actually not the topic of this post. The author of this post is not trying to prove the Riemann Hypothesis in a few pages, and nobody could. In this post, we investigate the analytic continuation of \(\zeta(s)\) step by step, so that it will make sense to even think about evaluating the value at \(s=-1\).

A sketch of our procedure follows. The function

With these two tools being developed, we will observe the so-called complete zeta function, which will bring us to THE continuation we are looking for.

We will carry out more details about the non-trivial processes, instead of basic complex analysis. The reader may skip our preparation if they are familiar with these contents.

The Gamma function should be studied in an analysis course:

\[\Gamma(s)=\int_0^\infty e^{-t}t^{s-1}dt,s>0.\]

In an analysis course we have studied some of this function's important properties:

\(\Gamma(1)=1\).

\(\Gamma(s+1)=s\Gamma(s)\) (as aresult \(n!=\Gamma(n+1)\))

\(\log\Gamma(s)\) is a convexfunction.

In this section however, we will study it in the context of complexanalysis.

Theorem 1. The Gamma function \[\Gamma(s)=\int_0^\infty e^{-t}t^{s-1}dt\] is well-defined as an analytic function in the half plane \(\Re(s)>0\).

*Proof.* If we write \(s=u+iv\), then

\[\begin{aligned}|e^{-t}t^{s-1}|&=|e^{-t}t^{u-1}||t^{iv}| \\ &=|e^{-t}t^{u-1}||e^{iv\log t}| \\ &=e^{-t}t^{u-1}.\end{aligned}\]

Therefore

\[\begin{aligned}\int_{0}^{\infty}|e^{-t}t^{s-1}|dt &= \int_0^\infty e^{-t}t^{u-1}dt \\&=\Gamma(u) \\&<\infty.\end{aligned}\]

Then the other properties follow.

Theorem 2. If \(\Re(s)>0\), then \[\Gamma(s+1)=s\Gamma(s),\]

and as a consequence

\(\Gamma(n+1)=n!\) for \(n=0,1,\dots\).

*Proof.* The second statement follows immediately because \(\Gamma(1)=1\). For the first equation, we do an integration by parts:

\[\int_{\varepsilon}^{1/\varepsilon}\frac{d}{dt}(e^{-t} t^s)dt=-\int_{\varepsilon}^{1/\varepsilon}e^{-t}t^sdt+s\int_{\varepsilon}^{1/\varepsilon}e^{-t}t^{s-1}dt.\]

Taking \(\varepsilon \to 0\), we getwhat we want. \(\square\)
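These properties can be spot-checked with the standard library (a quick numerical illustration, not part of the proof):

```python
from math import gamma, factorial, isclose, pi, sqrt

# Gamma(n+1) = n! for non-negative integers n
for n in range(7):
    assert isclose(gamma(n + 1), factorial(n))

# the functional equation Gamma(s+1) = s*Gamma(s) at a non-integer point
s = 2.3
assert isclose(gamma(s + 1), s * gamma(s))

# Gamma(1/2) = sqrt(pi), which will follow from Euler's reflection formula
assert isclose(gamma(0.5), sqrt(pi))
```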

Now we are ready for the analytic continuation of the Gamma function, which builds a bridge to the analytic continuation of \(\zeta(s)\).

Theorem 3. The function \(\Gamma(s)\) defined in theorem 1 admits an analytic continuation to a meromorphic function on the complex plane whose singularities are simple poles at \(s=0,-1,-2,\dots\), with residue \(\frac{(-1)^n}{n!}\) at \(s=-n\).

*Proof.* It suffices to show that we can extend \(\Gamma\) strip by strip. Writing \(\Gamma_0\) for the function defined in theorem 1, the function

\[\Gamma_1(s)=\frac{\Gamma_0(s+1)}{s}\]

is THE analytic continuation of \(\Gamma_0\) to \(\Re(s)>-1\), with a simple pole at \(s=0\):

\[\operatorname{Res}_{s=0}\Gamma_1(s)=\lim_{s \to 0}s\Gamma_1(s)=\Gamma_0(1)=1.\]

Likewise, we can define

\[\Gamma_2(s)=\frac{\Gamma_1(s+1)}{s}=\frac{\Gamma_0(s+2)}{s(s+1)}.\]

Overall, whenever \(m \ge 1\) is an integer, we can define

\[\Gamma_m(s)=\frac{\Gamma_0(s+m)}{\prod_{j=0}^{m-1}(s+j)}.\]

This function is meromorphic in the half plane \(\Re(s)>-m\), with simple poles at \(s=0,-1,\dots,-(m-1)\). For \(0 \le n \le m-1\),

\[\operatorname{res}_{s=-n}\Gamma_m(s)=\frac{\Gamma(-n+m)}{ (m-1-n)!(-1)(-2)\dots(-n)}=\frac{(-1)^n}{n!}.\]

One checks that \(\Gamma_{m+1}\) agrees with \(\Gamma_m\) where both are defined, so these functions patch together into the desired meromorphic continuation. \(\square\)

Throughout, unless specified otherwise, we will call the function obtained in the proof of theorem 3 THE function \(\Gamma(s)\).

For all \(s \in \mathbb{C}\) away from the poles, this function satisfies \(\Gamma(s+1)=s\Gamma(s)\).
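The poles and residues from theorem 3 can also be checked numerically, since \(\operatorname{Res}_{s=-n}\Gamma(s)=\lim_{s\to-n}(s+n)\Gamma(s)\) (again only a numerical sketch):

```python
from math import gamma, factorial, isclose

# (s+n)*Gamma(s) near s = -n should approach (-1)^n / n!
eps = 1e-7
for n in range(5):
    approx = eps * gamma(-n + eps)
    exact = (-1) ** n / factorial(n)
    assert isclose(approx, exact, rel_tol=1e-5)
```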

Before we proceed, we develop two relationships between the Gamma function and the zeta function.

If we perform a change of variable \(t=nu\) in the Gamma integral, we get

\[\int_{0}^\infty e^{-nu}n^{s}u^{s-1}du=\Gamma(s).\]

This is to say,

\[\frac{1}{n^s}\Gamma(s)=\int_0^\infty e^{-nu}u^{s-1}du.\]

Taking the sum over all \(n\), we see

\[\begin{aligned}\Gamma(s)\sum_{n=1}^{\infty}\frac{1}{n^s}&=\Gamma(s)\zeta(s) \\&= \sum_{n=1}^{\infty}\int_0^\infty e^{-nu}u^{s-1}du \\&=\int_0^\infty \sum_{n=1}^{\infty}e^{-nu}u^{s-1}du \\&=\int_0^{\infty}\frac{e^{-u}u^{s-1}}{1-e^{-u}}du \\&=\int_0^{\infty}\frac{u^{s-1}}{e^u-1}du.\end{aligned}\]

This relationship is beautiful, but may make our computation a little bit more complicated. However, if we get our hands dirty earlier, our study will be easier. Thus we will do an "uglier" change of variable \(t \mapsto \pi n^2y\) to obtain

\[\pi^{-s}\Gamma(s)\frac{1}{n^{2s}}=\int_0^\infty e^{-\pi n^{2}y }y^{s-1}dy\]

which implies

\[\pi^{-s}\Gamma(s)\zeta(2s)=\int_0^\infty \sum_{n=1}^{\infty}e^{-\pi n^2y}y^{s-1}dy.\]

In either case, it is legitimate to change the order of summation and integration, because of the monotone convergence theorem.

Before we proceed, we need some more properties of the Gammafunction.

Theorem 3 (Euler's reflection formula). For all \(s \in \mathbb{C}\), \[\Gamma(s)\Gamma(1-s)=\frac{\pi}{\sin\pi s}.\]

Observe that this identity makes sense at all poles. Since

By analytic continuation, it suffices to prove it for real \(s \in (0,1)\).

*Proof (real version).* First of all, observe that

\[\csc{x}=\frac{1}{x}+\sum_{n=1}^{\infty}\frac{2x}{x^2-n^2\pi^2}.\]

On the other hand, we have

\[\begin{aligned}\Gamma(x)\Gamma(1-x)&=B(x,1-x) \\ &=\int_0^1 t^{x-1}(1-t)^{-x}dt \\ &=\int_0^\infty \frac{1}{y^x(1+y)}dy\end{aligned}\]

by taking \(t=\frac{1}{1+y}\). Next we compute this integral on both \((0,1)\) and \((1,\infty)\).

\[\begin{aligned}\int_0^1\frac{1}{y^x(1+y)}dy &= \int_0^1\frac{1}{y^x} \sum_{n=0}^{\infty}(-y)^ndy \\ &= \sum_{n=0}^{\infty}(-1)^n\int_0^1 y^{n-x}dy \\ &= \sum_{n=0}^{\infty}\frac{(-1)^{n}}{n+1-x}=\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n-x}.\end{aligned}\]

(One shall be disturbed by our exchange of infinite sum and integration due to his or her study in analysis, but will be relaxed after being informed about

On the other hand, taking \(u=1/y\),

\[\begin{aligned}\int_1^\infty\frac{1}{y^x(1+y)}dy &= \int_0^1\frac{u^{x-1}}{1+u}du\\&=\frac{1}{x}+\sum_{n=1}^{\infty}\frac{(-1)^n}{n+x}\end{aligned}\]

Summing up, one has

\[\Gamma(x)\Gamma(1-x)=\frac{1}{x}+\sum_{n=1}^{\infty} (-1)^n\frac{2x}{x^2-n^2}.\]

It remains to show that

\[\cos\alpha{t}=\frac{\sin\pi \alpha}{\pi \alpha}+\sum_{n=1}^{\infty}(-1)^n\frac{2\alpha}{\pi(\alpha^2-n^2)}\sin\alpha\pi\cos{nt}.\]

Taking \(t=0\) and multiplying both sides by \(\pi\csc\pi\alpha\), we obtain what we want. \(\square\)
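A quick numerical confirmation of the reflection formula:

```python
from math import gamma, pi, sin, isclose

# Gamma(s)*Gamma(1-s) = pi / sin(pi*s) at a few non-integer points
for s in (0.1, 0.25, 0.5, 0.77):
    assert isclose(gamma(s) * gamma(1 - s), pi / sin(pi * s))
```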

*Proof (complex version).* By definition,

\[\begin{aligned}\Gamma(1-s)\Gamma(s) &= \int_0^\infty e^{-t}t^{s-1}\Gamma(1-s)dt \\ &= \int_0^\infty e^{-t}t^{s-1}\left(\int_0^\infty e^{-v}v^{-s}dv\right)dt \\ &= \int_0^\infty e^{-t}t^{s-1}t \left( \int_0^\infty e^{-ut}(ut)^{-s}du \right)dt \\ &= \int_0^\infty du \int_0^\infty e^{-t(u+1)}u^{-s}dt \\ &= \int_0^\infty \frac{u^{-s}}{1+u}du\end{aligned}\]

Here we performed the change of variable \(v=ut\) for fixed \(t\) and then changed the order of integration. Next, substituting \(u=e^x\),

\[\int_0^\infty \frac{u^{-s}}{1+u}du = \int_{-\infty}^{+\infty}\frac{e^{(1-s)x}}{1+e^x}dx.\]

The integral on the right hand side can be computed, via the residue theorem on a rectangular contour, to be \(\frac{\pi}{\sin\pi s}\). \(\square\)

In particular, by putting \(s=1/2\), we obtain

\[\Gamma(1/2)=\sqrt{\pi}.\]

As a bonus, by putting \(t=u^2\) in the defining integral, we get

\[\Gamma(1/2)=\int_0^\infty e^{-t}t^{-1/2}dt=2\int_0^\infty e^{-u^2}du.\]

Therefore

\[\int_{-\infty}^{\infty}e^{-u^2}du=2\int_0^\infty e^{-u^2}du=\Gamma(1/2)=\sqrt{\pi}.\]

To conclude this section, we mention the Legendre duplication formula.

Theorem 4 (Legendre duplication formula). \[\Gamma(s)\Gamma(s+1/2)=\frac{2\sqrt\pi}{2^{2s}}\Gamma(2s).\]

One can find a proof in most complex analysis textbooks.

Put

\[\theta(s)=\sum_{n \in \mathbb{Z}}e^{-\pi n^2 s}, \quad \Re(s)>0\]

and try to deduce its relation with the zeta function.

To begin with, we first show that the series converges to a holomorphic function.

Proposition 1. The theta function is holomorphic on the right half plane.

*Proof.* Let \(C\) be a compact subset of the right half plane, put \(y_0=\min_{s \in C}\Re(s)>0\), and pick an integer \(n_0\) such that \(n_0y_0 \ge 1\). Writing \(u=\Re(s)\), for \(s \in C\) we have

\[\begin{aligned}\sum_{|n|\ge n_0}|e^{-\pi n^2 s}| &= \sum_{|n| \ge n_0}e^{-\pi n^2u} \\ &\le \sum_{|n| \ge n_0}e^{-\pi n^2 y_0} \\ &\le \sum_{|n| \ge n_0}e^{-\pi |n|}\end{aligned}\]

Therefore \(\theta(s)\) converges absolutely and uniformly on any compact subset of the right half plane. (Note we have used the fact that \(n^2y_0 \ge |n|n_0y_0 \ge |n|\) for \(|n| \ge n_0\).) Since each term is holomorphic, we have shown that \(\theta\) is holomorphic on the right half plane. \(\square\)

Therefore it is safe to work with the theta function. Now we are ready to deduce a functional equation.

Theorem 4. The theta function satisfies the functional equation on \(\{\Re(s)>0\}\): \[\theta(s)=\frac{1}{\sqrt{s}}\theta\left(\frac{1}{s}\right).\]

The square root is chosen to be in the branch with positive real part.

*Proof.* Consider the function \(f(x)=e^{-\pi x^2}\), which is its own Fourier transform under the normalisation

\[\hat{g}(t)=\int_{-\infty}^{\infty}g(x)e^{-2\pi ixt}dx.\]

Now we put \(g(x)=e^{-\pi ux^2}=f(\sqrt{u}x)\) for real \(u>0\). The Fourier transform of \(g\) is then

\[\hat{g}(t)=\frac{1}{\sqrt{u}}\hat{f}\left(\frac{t}{\sqrt{u}}\right)= \frac{1}{\sqrt{u}}e^{-\pi t^2 / u}.\]

Since \(g(x)\) is a Schwartz function, by the Poisson summation formula,

\[\sum_{n \in \mathbb{Z}}g(n)=\theta(u)=\sum_{n \in \mathbb Z}\hat{g}(n)=\frac{1}{\sqrt{u}}\theta\left(\frac{1}{u}\right).\]

By extending with analytic continuation, we are done.
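The functional equation is easy to test numerically for real \(t>0\), where the series converges extremely fast (truncation at 50 terms is far more than enough):

```python
from math import exp, pi, sqrt, isclose

def theta(t, terms=50):
    # truncated series for theta(t) = sum over n in Z of e^{-pi n^2 t}, real t > 0
    return 1 + 2 * sum(exp(-pi * n * n * t) for n in range(1, terms + 1))

# theta(t) = theta(1/t) / sqrt(t)
for t in (0.5, 1.0, 2.0, 3.7):
    assert isclose(theta(t), theta(1 / t) / sqrt(t))
```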

For Schwartz functions, also known as rapidly decreasing functions, we refer the reader to chapter 7 of W. Rudin's *Functional Analysis*.

Next we will study the behaviour of \(\theta(t)\) for real \(t\), as \(t \to 0\) and as \(t \to \infty\).

Proposition 2. When the real number \(t \to 0\), the theta function is asymptotically equivalent to \(\frac{1}{\sqrt{t}}\). More precisely, when \(t\) is small enough, the following inequality holds: \[\left|\theta(t)-\frac{1}{\sqrt{t}}\right|<e^{-(\pi-1)/t}.\]

*Proof.* Rewrite \(\theta(t)\) as

\[\theta(t)=1+2\sum_{n=1}^{\infty}e^{-\pi n^2 t}.\]

Therefore

\[\begin{aligned}\left|\theta(t)-\frac{1}{\sqrt{t}}\right| &= \left| \frac{1}{\sqrt{t}}\left(\theta\left(\frac{1}{t}\right)-1\right)\right| \\&= \frac{2}{\sqrt{t}}\sum_{n=1}^{\infty}e^{-\pi n^2/t}\end{aligned}\]

Pick \(t>0\) small enough so that

\[e^{-1/t}<\frac{\sqrt{t}}{4}, \quad e^{-2\pi/t}<\frac{1}{2}.\]

It follows that

\[\begin{aligned}\left|\theta(t)-\frac{1}{\sqrt{t}}\right| &=\frac{2}{\sqrt{t}}\sum_{n=1}^{\infty}e^{-\pi n^2/t} \\ &< \frac{1}{2}e^{1/t}\sum_{n=1}^{\infty}e^{-\pi/t} e^{-\pi(n^2-1)/t} \\ &=\frac{1}{2}e^{-(\pi-1)/t}\sum_{n=1}^{\infty}e^{-\pi(n+1)(n-1)/t}\\ &<\frac{1}{2}e^{-(\pi-1)/t}\sum_{n=1}^{\infty} e^{-2\pi(n-1)/t} \\ &<\frac{1}{2}e^{-(\pi-1)/t}\sum_{n=1}^{\infty} \left(\frac{1}{2}\right)^{n-1} \\ &=e^{-(\pi-1)/t}.\end{aligned}\]

\(\square\)

As a result, we also know how \(\theta(t)\) behaves at infinity.

Corollary 1. The limit of \(\theta(t)\) at infinity is \(1\) in the following sense: when \(t\) is big enough, \[|\theta(t)-1| < e^{-(\pi-1)t}/\sqrt{t}.\]

*Proof.* Let \(t\) be big enough such that \(\frac{1}{t}\) is small enough. That is,

\[\left|\theta\left(\frac{1}{t}\right)-\sqrt{t}\right| =\left|\sqrt{t}\theta(t)-\sqrt{t}\right|<e^{-(\pi-1)t}\]

according to proposition 2. The result follows.

To begin with, we introduce the Mellin transform. In a manner of speaking, this transform can actually be understood as the multiplicative version of the two-sided Laplace transform.

Definition. Given a function \(f:\mathbb{R}_+ \to \mathbb{C}\), the Mellin transform of \(f\) is defined to be \[\mathcal{M}_f(s)=\int_0^\infty f(x)x^{s-1}dx,\]

provided that the integral converges.

For example, \(\Gamma(s)\) is the Mellin transform of \(e^{-x}\). Moreover, for the two-sided Laplace transform

\[\mathcal{B}_f(s)=\int_{-\infty}^{+\infty}e^{-sx}f(x)dx,\]

we actually have

\[\mathcal{M}_f(s)=\mathcal{B}_{\tilde{f}}(s),\]

where \(\tilde{f}(x)=f(e^{-x})\).

Our goal is to recover \(Z(s)=\pi^{-s/2}\Gamma(s/2)\zeta(s)\) through the Mellin transform of the theta function. Recall that

\[\pi^{-s}\Gamma(s)\zeta(2s)=\int_0^\infty \sum_{n=1}^{\infty}e^{-\pi n^2y}y^{s-1}dy.\]

It seems we can get our result really quickly by studying the function

\[\phi(s)=\int_0^1\left(\theta(x)-\frac{1}{\sqrt{x}}\right)x^{s/2-1}dx + \int_1^\infty (\theta(x)-1)x^{s/2-1}dx.\]

We use \(s/2\) in place of \(s\) so that \(\zeta(s)\), rather than \(\zeta(2s)\), will appear; the subtracted terms \(\frac{1}{\sqrt{x}}\) and \(1\) make both integrals converge.

The partition \((0,1) \cup (1,\infty)\) immediately inspires one to use the change of variable \(y=\frac{1}{x}\). As a result,

\[\begin{aligned}\phi(s)&=\int_0^1\left(\theta(x)-\frac{1}{\sqrt{x}}\right)x^{s/2-1}dx + \int_1^\infty (\theta(x)-1)x^{s/2-1}dx \\&= -\int_1^\infty\left( \theta\left(\frac{1}{y}\right)-\sqrt{y}\right)y^{1-s/2}(-y^{-2})dy -\int_0^1\left( \theta\left(\frac{1}{y}\right)-1\right)y^{1-s/2}(-y^{-2})dy \\&=\int_1^\infty (\theta(y)-1)y^{(1-s)/2-1}dy + \int_0^1\left(\theta(y)-\frac{1}{\sqrt{y}}\right)y^{(1-s)/2-1}dy \\&= \phi(1-s).\end{aligned}\]

Now we are ready to compute \(\phi(s)\). For the first integral,

\[\begin{aligned}\int_0^1\left(\theta(x)-\frac{1}{\sqrt{x}}\right)x^{s/2-1}dx &=\int_0^1\theta(x)x^{s/2-1}dx -\frac{2}{s-1} \\&=\int_0^1 \sum_{n=-\infty}^{+\infty}e^{-\pi n^2 x}x^{s/2-1}dx-\frac{2}{s-1} \\&= \int_0^1 x^{s/2-1}dx + 2\sum_{n=1}^{\infty}\int_0^1e^{-\pi n^2x}x^{s/2-1}dx - \frac{2}{s-1} \\&= 2\sum_{n=1}^{\infty}\int_0^1e^{-\pi n^2 x}x^{s/2-1}dx +\frac{2}{s}-\frac{2}{s-1}.\end{aligned}\]

On the other hand,

\[\int_1^\infty (\theta(x)-1)x^{s/2-1}dx = 2\sum_{n=1}^{\infty}\int_1^\infty e^{-\pi n^2 x}x^{s/2-1}dx.\]

Therefore

\[\begin{aligned}\phi(s)&=2\sum_{n=1}^{\infty}\int_0^\infty e^{-\pi n^2 x} x^{s/2-1}dx + \frac{2}{s}-\frac{2}{s-1} \\ &=2\left(Z(s)+\frac{1}{s}-\frac{1}{s-1}\right).\end{aligned}\]

Therefore

\[Z(s)=\pi^{-s/2}\Gamma(s/2)\zeta(s)=\frac{1}{2}\phi(s)-\frac{1}{s}+\frac{1}{s-1}.\]

In particular,

\[\begin{aligned}Z(1-s)&=\frac{1}{2}\phi(1-s)-\frac{1}{1-s}+\frac{1}{1-s-1} \\ &=\frac{1}{2}\phi(s)-\frac{1}{s}+\frac{1}{s-1} \\ &=Z(s)\end{aligned}\]

Expanding this equation above, we see

\[\pi^{-(1-s)/2}\Gamma((1-s)/2)\zeta(1-s) =\pi^{-s/2}\Gamma(s/2)\zeta(s).\]

This gives

\[\zeta(1-s)=\pi^{\frac{1}{2}-s}\frac{\Gamma\left(\frac{s}{2}\right)}{\Gamma\left(\frac{1-s}{2}\right)}\zeta(s).\]

Finally we try to simplify the quotient above. By Legendre'sduplication formula,

\[\Gamma(s)=\frac{2^s}{2\sqrt{\pi}}\Gamma\left(\frac{s}{2}\right)\Gamma\left(\frac{s+1}{2}\right).\]

By Euler's reflection formula,

\[\Gamma\left(\frac{1-s}{2}\right)\Gamma\left(\frac{s+1}{2}\right)=\frac{\pi}{\sin\pi\left(\frac{1-s}{2}\right)}=\frac{\pi}{\cos\frac{\pi s}{2}}.\]

Combining these two equations, we obtain

Proposition 3.The Riemann Zeta function\(\zeta(s)\) admits an analytic continuationsatisfying the functional equation \[\zeta(1-s)=2(2\pi)^{-s}\Gamma(s)\cos\frac{\pi s}{2}\zeta(s).\]

In particular, since we also have

\[\zeta(s)=\frac{\pi^{s/2}}{\Gamma(s/2)}\left(\frac{1}{2}\phi(s)-\frac{1}{s}+\frac{1}{s-1}\right),\]

it is immediate that \(\zeta(s)\) admits a simple pole at \(s=1\) with residue \(1\). Another concern is \(s=0\): writing

\[\begin{aligned}\zeta(s)&= \frac{\pi^{s/2}}{2\Gamma(s/2)}\phi(s)- \frac{\pi^{s/2}}{s\Gamma(s/2)}+\frac{\pi^{s/2}}{(s-1)\Gamma(s/2)}\\&=\frac{\pi^{s/2}}{2\Gamma(s/2)}\phi(s)- \frac{\pi^{s/2}}{2\Gamma(s/2+1)}+\frac{\pi^{s/2}}{(s-1)\Gamma(s/2)},\end{aligned}\]

there is no pole at \(s=0\) (notice that \(\phi(s)\) is entire and \(1/\Gamma(s/2)\) vanishes at \(s=0\)). We now know a little bit more about the analyticity of \(\zeta(s)\).

Corollary 2. The Riemann zeta function \(\zeta(s)\) has its analytic continuation defined on \(\mathbb{C} \setminus \{1\}\), with a simple pole at \(s=1\) with residue \(1\).

Now we are safe to compute \(\zeta(-1)\), by taking \(s=2\) in the functional equation:

\[\zeta(-1)=2(2\pi)^{-2}\Gamma(2)\cos(\pi)\zeta(2) =\frac{2}{4\pi^2}\cdot 1 \cdot (-1)\cdot \frac{\pi^2}{6} = -\frac{1}{12}.\]
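Since the right-hand side at \(s=2\) only involves \(\zeta(2)=\pi^2/6\), this evaluation can be verified numerically from the functional equation alone:

```python
from math import pi, cos, gamma, isclose

zeta2 = sum(1 / n**2 for n in range(1, 200001))   # approximates pi^2/6
assert isclose(zeta2, pi**2 / 6, rel_tol=1e-4)

# zeta(-1) = 2 (2 pi)^{-2} Gamma(2) cos(pi) zeta(2)
rhs = 2 * (2 * pi) ** -2 * gamma(2) * cos(pi) * zeta2
assert isclose(rhs, -1 / 12, rel_tol=1e-4)
```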

But I believe, after these long computations of the analytic continuation, we can be confident enough to say that, when

This post ends with

So why is this important? Part of what I've tried to show on this blog is that mathematics and physics can be extremely non-intuitive, even bizarre, but that they have their own rules and logic that make perfect sense once you get familiar with them. The original video, in my opinion, acts more like a magic trick than an explanation: it shows a peculiar, non-intuitive result and tries to pass it off as absolute truth without qualification. Making science and math look like incomprehensible magic does not do any favors for the scientists who study it nor for the public who would like to understand it.

Let \(K\) be a field (in this post we mostly assume that \(K \supset \mathbb{Q}\)) and \(n\) an integer \(>1\) which is not divisible by the characteristic of \(K\). The polynomial

\[X^n-1\]

is separable, because its derivative \(nX^{n-1}\) is nonzero and has no root in common with \(X^n-1\).

The generator \(\zeta_n\) of the cyclic group of \(n\)-th roots of unity in \(\overline{K}\) is called a primitive \(n\)-th root of unity. Put \(L=K(\zeta_n)\).

Proposition 1. Notation being above, \(L/K\) is Galois, the Galois group \(\operatorname{Gal}(L/K) \cong (\mathbb{Z}/n\mathbb{Z})^\ast\) (the group of units in \(\mathbb{Z}/n\mathbb{Z}\)) and \([L:K]=\varphi(n)\).

Let's first elaborate the fact that

The proof can be produced by two lemmas, the first of which is independent of the characteristic of the field.

Lemma 1. Let \(k\) be a field and \(n\) be not divisible by the characteristic \(p\). Let \(\zeta=\zeta_n\) be a primitive \(n\)-th root of unity in \(\overline{k}\); then \((\mathbb{Z}/n\mathbb{Z})^\ast \supset \operatorname{Gal}(k(\zeta)/k)\) and therefore \([k(\zeta):k] \le \varphi(n)\). Besides, \(k(\zeta)/k\) is a normal abelian extension.

*Proof.* Let \(\sigma\) be an embedding of \(k(\zeta)\) in \(\overline{k}\) over \(k\). Then

\[(\sigma\zeta)^n=\sigma(\zeta^n)=\sigma(1)=1\]

so that \(\sigma\zeta\) is an \(n\)-th root of unity as well. Hence \(\sigma\zeta=\zeta^{i}\) for some integer \(i=i(\sigma)\), uniquely determined modulo \(n\). For two such embeddings \(\sigma,\tau\),

\[\sigma\tau\zeta=\zeta^{i(\sigma)i(\tau)}.\]

It follows that \(i(\sigma)\) and \(i(\tau)\) are prime to \(n\), that \(i(\sigma\tau)=i(\sigma)i(\tau)\), and that \(\sigma \mapsto i(\sigma)\) is an injective homomorphism \(\operatorname{Gal}(k(\zeta)/k) \to (\mathbb{Z}/n\mathbb{Z})^\ast\); the extension is therefore normal and abelian. \(\square\)

It is easy to find an example with strict inclusion. One only needs to look at \(k=\mathbb{R}\) or \(k=\mathbb{C}\).

Lemma 2. Let \(\zeta=\zeta_n\) be a primitive \(n\)-th root of unity with irreducible polynomial \(f\) over \(\mathbb{Q}\); then for any prime \(p \nmid n\), \(\zeta^p\) is also a root of \(f\), hence a conjugate of \(\zeta\).

*Proof.* Let \(f(X)\) be the irreducible polynomial of \(\zeta\) over \(\mathbb{Q}\); then \(f\) divides \(X^n-1\), say \(X^n-1=f(X)h(X)\). By Gauss's lemma, \(f\) and \(h\) have integral coefficients.

Suppose \(\zeta^p\) is not a root of \(f\). Since \((\zeta^p)^n=1\), \(\zeta^p\) is then a root of \(h\), so \(\zeta\) is a root of \(h(X^p)\), and therefore \(f(X)\) divides \(h(X^p)\):

\[h(X^p)=f(X)g(X).\]

Again by Gauss's lemma, \(g(X)\) hasintegral coefficients.

Next we reduce these equations modulo \(p\), i.e., in \(\mathbf{F}_p[X]\):

\[\overline{f}(X)\overline{g}(X)=\overline{h}(X^p).\]

By Fermat's little theorem \(a^p=a\) for all \(a \in \mathbf{F}_p\), we also have

\[\overline{h}(X^p)=\overline{h}(X)^p.\]

Therefore

\[\overline{f}(X)\overline{g}(X)=\overline{h}(X)^p,\]

which implies that \(\overline{f}\) and \(\overline{h}\) have a common irreducible factor. But then \(X^n-1=\overline{f}(X)\overline{h}(X)\) has a repeated root in \(\overline{\mathbf{F}_p}\), contradicting the separability of \(X^n-1\) (recall \(p \nmid n\)). \(\square\)

Now we are ready for Proposition 1.

*Proof of Proposition 1.* Since

Recall in elementary group theory, if

We will show that \(f\) in the proof of lemma 2 is actually the cyclotomic polynomial \(\Phi_n\).

We have

\[X^n-1=\prod_{\zeta}(X-\zeta),\]

where the product is taken over all \(n\)-th roots of unity \(\zeta\). For each divisor \(d\) of \(n\), collect the roots of period \(d\) and put

\[\Phi_d(X)=\prod_{\operatorname{period} \zeta=d}(X-\zeta).\]

Then

\[X^n-1=\prod_{d|n}\Phi_d(X).\]

It follows that \(\Phi_1(X)=X-1\) and that \(\Phi_n\) can be computed recursively:

\[\Phi_n(X)=\frac{X^n-1}{\prod_{d\mid n}^{d<n}\Phi_d(X)}.\]

This presentation makes our computation much easier. But to understand \(\Phi_n\), we still should keep in mind that the \(n\)-th cyclotomic polynomial is defined to be

\[\Phi_n(X)=\prod_{\operatorname{period}\zeta=n}(X-\zeta),\]

whose roots are all primitive \(n\)-th roots of unity.

Proposition 2. The cyclotomic polynomial \(\Phi_n\) is irreducible and is the irreducible polynomial of \(\zeta\) over \(\mathbb{Q}\), where \(\zeta\) is a primitive \(n\)-th root of unity.
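The recursion \(\Phi_n(X)=(X^n-1)/\prod_{d\mid n,\ d<n}\Phi_d(X)\) is easy to implement; the sketch below (helper names are my own) computes \(\Phi_n\) as an integer coefficient list and checks a couple of values:

```python
# integer polynomials as coefficient lists, lowest degree first
def polmul(p, q):
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def poldiv_exact(p, q):
    # exact division (the remainder is known to vanish)
    p = p[:]
    out = [0] * (len(p) - len(q) + 1)
    for k in range(len(out) - 1, -1, -1):
        c = p[k + len(q) - 1] // q[-1]
        out[k] = c
        for i, b in enumerate(q):
            p[k + i] -= c * b
    return out

def cyclotomic(n, cache={}):
    # Phi_n(X) = (X^n - 1) / product of Phi_d(X) over proper divisors d of n
    if n not in cache:
        num = [-1] + [0] * (n - 1) + [1]          # X^n - 1
        den = [1]
        for d in range(1, n):
            if n % d == 0:
                den = polmul(den, cyclotomic(d))
        cache[n] = poldiv_exact(num, den)
    return cache[n]

assert cyclotomic(5) == [1, 1, 1, 1, 1]           # Phi_5 = X^4+X^3+X^2+X+1
assert cyclotomic(12) == [1, 0, -1, 0, 1]         # Phi_12 = X^4 - X^2 + 1
# problem 3 below: Phi_6(X) = Phi_3(-X)
assert cyclotomic(6) == [c * (-1) ** i for i, c in enumerate(cyclotomic(3))]
```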

We end this section with a problem on number fields, making use of what we have studied above.

Problem 0. A number field \(F\) only contains finitely many roots of unity.

*Solution.* Let \(\zeta \in F\) be a root of unity with period \(n\). Then \(\mathbb{Q}(\zeta) \subset F\), so \(\varphi(n)=[\mathbb{Q}(\zeta):\mathbb{Q}] \le [F:\mathbb{Q}]\). Since \(\varphi(n) \to \infty\) as \(n \to \infty\), only finitely many periods can occur, and there are finitely many roots of unity of each period. \(\square\)

We will do some dirty computation in this section.

Problem 1. If \(p\) is prime, then \(\Phi_p(X)=X^{p-1}+X^{p-2}+\dots+1\), and for an integer \(\nu \ge 1\), \(\Phi_{p^\nu}(X)=\Phi_p(X^{p^{\nu-1}})\).

*Solution.* The only proper divisor of the prime \(p\) is \(1\), hence

\[\Phi_p(X)=\frac{X^p-1}{\Phi_1(X)}=X^{p-1}+\dots+1.\]

For the second statement, we use induction on \(\nu\). Once

\[\Phi_{p^\nu}(X)=\Phi_p(X^{p^{\nu-1}})=\frac{X^{p^{\nu}}-1}{X^{p^{\nu-1}}-1}=\frac{X^{p^{\nu}}-1}{\prod_{r=0}^{\nu-1}\Phi_{p^{r}}(X)}\]

is proved, then

\[\begin{aligned}\Phi_{p^{\nu+1}}(X)&=\frac{X^{p^{\nu+1}}-1}{\prod_{r=0}^{\nu}\Phi_{p^r}(X)}\\ &=\frac{X^{p^{\nu+1}}-1}{X^{p^{\nu}}-1} \\ &=\Phi_p(X^{p^\nu}).\end{aligned}\]

Problem 2. Let \(p\) be a prime number. If \(p \nmid n\), then \[\Phi_{pn}(X)=\frac{\Phi_n(X^p)}{\Phi_n(X)}.\]

*Solution.* It holds clearly for \(n=1\). By induction on \(n\), using \(\Phi_d(X^p)=\Phi_{dp}(X)\Phi_d(X)\) for proper divisors \(d\) of \(n\), we compute

\[\begin{aligned}\frac{\Phi_n(X^p)}{\Phi_n(X)} &= \frac{X^{pn}-1}{\prod_{d|n}^{d<n}\Phi_d(X^p)}\frac{\prod_{d|n}^{d<n}\Phi_d(X)}{X^n-1}\\&= \frac{X^{pn}-1}{(X^n-1)\prod_{d|n}^{d<n}\Phi_{dp}(X)} \\&=\frac{X^{pn}-1}{\prod_{d|n}\Phi_d(X)\prod_{d|n}^{d<n}\Phi_{dp}(X)} \\&=\Phi_{np}(X).\end{aligned}\]

Problem 3. If \(n\) is an odd number \(>1\), then \(\Phi_{2n}(X)=\Phi_n(-X)\).

*Solution.* By problem 2 (with \(p=2\)), it suffices to show that

\[\Phi_n(X)\Phi_n(-X)=\Phi_n(X^2).\]

For \(n=3\) we see

\[\begin{aligned}\Phi_3(X)\Phi_3(-X) &= (X^2+X+1)(X^2-X+1) \\&=(X^2+1)^2-X^2 \\&=X^4+X^2+1 \\&=\Phi_3(X^2).\end{aligned}\]

Now suppose it holds for all odd numbers \(3 \le d < n\). Then

\[\begin{aligned}\Phi_n(X)\Phi_n(-X) &= \frac{(X^n-1)(-X^n-1)}{ (X-1)(-X-1)\prod_{3\le d < n}^{d|n}\Phi_d(X)\Phi_d(-X)} \\&= \frac{-(X^{2n}-1)}{-(X^2-1)\prod_{3 \le d < n}^{d|n} \Phi_d(X^2)} \\&= \Phi_n(X^2).\end{aligned}\]

The following problem would not be very easy without the Möbius inversion formula, so we will use it anyway. The problems above can also be deduced from this formula. If \(F(n)=\prod_{d|n}f(d)\), then

\[f(n)=\prod_{d|n}F(n/d)^{\mu(d)}\]

with

\[\mu(n)=\begin{cases}0 & \text{if $n$ is divisible by $p^2$ for some prime $p$}, \\(-1)^r & \text{if $n$ is a product of $r$ distinct primes,} \\1 & \text{if $n=1$.}\end{cases}\]

Putting \(f(d)=\Phi_d(X)\), so that \(F(n)=X^n-1\), we see

\[\Phi_n(X)=\prod_{d|n}(X^{n/d}-1)^{\mu(d)}.\]

Now we proceed.

Problem 4. If \(p|n\), then \(\Phi_{pn}(X)=\Phi_n(X^p)\).

*Solution.* By the Möbius inversion formula, we see

\[\begin{aligned}\Phi_{pn}(X) &= \prod_{d|pn}(X^{pn/d}-1)^{\mu(d)} \\ &= \left(\prod_{d|n}(X^{pn/d}-1)^{\mu(d)} \right) \left(\prod_{d|np}^{d\nmid n}(X^{pn/d}-1)^{\mu(d)}\right) \\ &= \Phi_n(X^p)\end{aligned}\]

because every \(d\) that divides \(np\) but not \(n\) is divisible by \(p^2\) (as \(p \mid n\)), whence \(\mu(d)=0\).

Problem 5. Let \(n=p_1^{r_1}\dots p_s^{r_s}\); then \[\Phi_n(X)=\Phi_{p_1 \dots p_s}(X^{p_1^{r_1-1}\dots p_s^{r_s-1}}).\]

*Solution.* This problem can be solved by induction on the number of primes. For \(s=1\) it is problem 1. Suppose it has been proved for \(s-1\) primes. Writing

\[n_{s-1}=p_1^{r_1}\dots p_{s-1}^{r_{s-1}}\]

and a prime \(p_s \nmid n_{s-1}\), we have, by problem 2,

\[\Phi_{n_{s-1}p_s}(X)=\frac{\Phi_{n_{s-1}}(X^{p_s})}{\Phi_{n_{s-1}}(X)}\]

On the other hand,

\[\begin{aligned}\frac{\Phi_{n_{s-1}}(X^{p_s})}{\Phi_{n_{s-1}}(X)}&=\frac{\Phi_{p_1\dots p_{s-1}}(X^{p_1^{r_1-1}\dots p_{s-1}^{r_{s-1}-1}p_s})}{\Phi_{p_1\dots p_{s-1}}(X^{p_1^{r_1-1}\dots p_{s-1}^{r_{s-1}-1}})} \\&=\Phi_{p_1 \dots p_{s-1}p_s}(X^{p_1^{r_1-1}\dots p_{s-1}^{r_{s-1}-1}p_s^{1-1}})\end{aligned}\]

if we put \(Y=X^{p_1^{r_1-1}\dots p_{s-1}^{r_{s-1}-1}}\) and apply problem 2 again. When it comes to higher powers of \(p_s\), apply problem 4 repeatedly. \(\square\)

Let \(\zeta\) be a primitive \(n\)-th root of unity, \(K=\mathbb{Q}(\zeta)\), and \(G=\operatorname{Gal}(K/\mathbb{Q})\). We compute the norm of \(1-\zeta\):

\[\begin{aligned}N_\mathbb{Q}^K(1-\zeta)&=\prod_{\sigma \in G}\sigma(1-\zeta) \\&=\prod_{\sigma \in G}(1-\sigma\zeta).\end{aligned}\]

Since \(G\) acts transitively on the set of primitive \(n\)-th roots of unity, and these are exactly the roots of \(\Phi_n\),

\[N_\mathbb{Q}^K(1-\zeta)=\Phi_n(1).\]

If \(n=p^r\), then by problem 1, \(\Phi_n(1)=\Phi_p(1)=p\). If instead \(n\) has at least two distinct prime factors, say

\[n=p_1^{r_1}\dots p_s^{r_s},\]

then

\[\Phi_n(1)=\Phi_{p_1\dots p_s}(1)=1.\]


Definition. For a polynomial with coefficients in a number field \(K\) \[f(t_1,\dots,t_n)=\sum_{j_1,\dots,j_n}a_{j_1\dots j_n}t_1^{j_1}\dots t_n^{j_n}=\sum_{\mathbf{j}}a_{\mathbf{j}}\mathbf{t}^{\mathbf{j}},\]

the **height** of \(f\) is defined to be \[h(f)=\sum_{v \in M_K}\log|f|_v,\]

where

\[|f|_v=\max_{\mathbf{j}}|a_{\mathbf{j}}|_v\]

is the **Gauss norm** for any place \(v\).

Here, \(M_K\) refers to the canonical set of non-equivalent places on \(K\).

As one can expect, this can tell us about some complexity of a polynomial, just like how the height of an algebraic number tells us its complexity. Let us compute some examples.

Let us consider the simplest one

\[f(x)=x^2-1 \in \mathbb{Q}[x]\]

first. Since \(|x^2-1|_v=1\) for all places \(v\), the height of \(f\) is \(h(f)=0\).

Next, we take care of a polynomial that involves prime numbers

\[g(x)=\frac{1}{4}x^4+\frac{1}{3}x^3+\frac{1}{2}x^2+x+2\]

We see \(|g|_\infty=2\), \(|g|_2=4\), \(|g|_3=3\), and \(|g|_v=1\) for all other places \(v\), hence

\[h(g)=\log{2}+\log{4}+\log{3}=3\log{2}+\log{3}.\]

Put \(u(x,y)=\sqrt{2}x^2 + 3\sqrt{2}xy+5y^2+7 \in \mathbb{Q}(\sqrt{2})[x,y]\); we can compute its height carefully. At each of the two archimedean places (the two real embeddings of \(\mathbb{Q}(\sqrt{2})\), each of weight \(\frac{1}{2}\)) the largest coefficient in absolute value is \(7\), while at every non-archimedean place some coefficient is a \(v\)-adic unit, so \(|u|_v=1\) there. Hence

\[h(u)=\tfrac{1}{2}\log 7+\tfrac{1}{2}\log 7=\log 7.\]

If \(f \in K[s_1,\dots,s_n]\) and \(g \in K[t_1,\dots,t_m]\) are two polynomials in different variables, then, as a polynomial in all the variables, \(fg\) satisfies \(|fg|_v=|f|_v|g|_v\) for every place \(v\), and hence \(h(fg)=h(f)+h(g)\).

But if variables coincide, things get different. For example, \(|x+1|_\infty^2=1\) while \(|(x+1)^2|_\infty=|x^2+2x+1|_\infty=2\).

Gauss's lemma. If \(v\) is not Archimedean, then \(|fg|_v=|f|_v|g|_v\).

*Proof.* First of all, it suffices to prove it for the univariable case. If \(f\) and \(g\) have all partial degrees \(<d\), then the substitution

\[x_j=t^{d^{j-1}}\]

reduces our study into \(K[t]\).This is because, with such a \(d\),this substitution gives a univariable polynomial with the same set ofcoefficients.

Therefore we only need to show that \(|fg|_v=|f|_v|g|_v\) for \(f=\sum_i a_ix^i\) and \(g=\sum_j b_jx^j\) in \(K[x]\).

After scaling, we may assume \(|f|_v=|g|_v=1\). We suppose that \(|fg|_v<1\), i.e., \(|c_j|_v<1\) for all \(j\), where \(fg=\sum_j c_jx^j\). Since

\[|c_j|_v=|a_0b_j+a_1b_{j-1}+\dots|_v<1,\]

we get a contradiction: letting \(i_0\) (resp. \(j_0\)) be the smallest index with \(|a_{i_0}|_v=1\) (resp. \(|b_{j_0}|_v=1\)), every term of \(c_{i_0+j_0}\) other than \(a_{i_0}b_{j_0}\) has \(v\)-norm \(<1\), so the ultrametric inequality forces \(|c_{i_0+j_0}|_v=1\). \(\square\)

So much for the non-Archimedean case. For the Archimedean case things are more complicated, so we do not have enough space to cover that. Nevertheless, we have

Gelfond's lemma. Let \(f_1,\dots,f_m\) be complex polynomials in \(n\) variables and set \(f=f_1\cdots f_m\); then \[2^{-d}\prod_{j=1}^{m}\ell_\infty(f_j) \le \ell_\infty(f) \le 2^d\prod_{j=1}^{m}\ell_\infty(f_j),\]

where \(d\) is the sum of the partial degrees of \(f\), and \(\ell_\infty(f)=\max_{\mathbf{j}}|a_{\mathbf{j}}|=|f|_\infty\).

Combining Gelfond's lemma and Gauss's lemma, we obtain

\[\left|h(f)-\sum_{j=1}^{m}h(f_j) \right| \le d\log{2}.\]

The Mahler measure was not actually given by Mahler initially. It was named after Mahler because he successfully extended it to multivariable cases in an elegant way. We will cover the original motivation anyway.

Say we want to find prime numbers large enough. Pierce came up with an idea. Consider \(p(x) \in \mathbb{Z}[x]\), which is factored into

\[p(x)=a\prod_i(x-\alpha_i).\]

Consider

\[\lim_{n \to \infty}\frac{|\alpha^{n+1}-1|}{|\alpha^n-1|}=\begin{cases}|\alpha|, & |\alpha|>1, \\1, & |\alpha|<1.\end{cases}\]

So it makes sense to compare all roots of

\[M(p)=|a|\prod_i \max\{1,|\alpha_i|\}.\]

This number appears if we consider

He also asked the following question, which is now understood as the **Lehmer conjecture**, although in his paper he addressed it as a problem instead of a conjecture:

Is there a constant \(c>1\) such that \(M(p)>1 \implies M(p)\ge c\)?

It remains open but we can mention some key bounds.

- Lehmer himself found that

\[M(x^{10}+x^9-x^7-x^6-x^5-x^4-x^3+x+1)=1.1762808\cdots\]

and this is still the smallest value greater than \(1\) that has ever been discovered. It was because of this discovery that he gave his *problem*.

This polynomial has also led to the discovery of a large prime number \(\sqrt{\Delta_{379}}=1{,}794{,}327{,}140{,}357\), although by studying

- Breusch (and later Smyth) discovered that if
\(p\) is monic, irreducible and nonreciprocal, i.e. it does not satisfy \(p(x)=\pm x^{\deg p}p(1/x)\), then

\[M(p)\ge 1.3247179\cdots=\text{ real root of $x^3-x-1$.}\]

- E. Dobrowolski found that, if
\(p(x)\) is monic, irreducible and noncyclotomic, and has degree \(d\), then

\[M(p)>1+c\left( \frac{\log\log d}{\log d} \right)^3\]

for some \(c>0\).
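These numerical values are easy to reproduce. The sketch below (my own illustration, using `numpy.roots`) computes \(M(p)=|a|\prod_i\max\{1,|\alpha_i|\}\) from a coefficient list; applied to Lehmer's polynomial it recovers \(1.1762808\cdots\), and applied to \(x^3-x-1\) it recovers Smyth's bound:

```python
import numpy as np

def mahler_measure(coeffs):
    # M(p) = |a| * prod over roots of max(1, |alpha_i|);
    # coeffs are listed from the leading coefficient down
    roots = np.roots(coeffs)
    return abs(coeffs[0]) * float(np.prod(np.maximum(1.0, np.abs(roots))))

lehmer = [1, 1, 0, -1, -1, -1, -1, -1, 0, 1, 1]  # Lehmer's degree-10 polynomial
print(mahler_measure(lehmer))          # ≈ 1.1762808
print(mahler_measure([1, 0, -1, -1]))  # x^3 - x - 1, ≈ 1.3247179
```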

Definition.For \(f \in\mathbb{C}[x_1,\dots,x_n]\), the Mahler measure is defined to be\[M(f)=\exp\left(\int_{\mathbb{T}^n}\log|f(e^{i\theta_1},\dots,e^{i\theta_n})|d\mu_1\dots d\mu_n \right),\]

where

\(d\mu_i=\frac{1}{2\pi}d\theta_i\), i.e., \(d\mu_1\dots d\mu_n\) corresponds to the (completion of the) Haar measure on \(\mathbb{T}^n\) with total measure \(1\).

We see through Jensen's formula that when

\[M(f)=M(a)\prod_{i=1}^{d}M(t-\alpha_i)=|a|\prod_{i=1}^{d}M(t-\alpha_i).\]

On the other hand, as an exercise in complex analysis, one can show that

\[\int_0^{2\pi}\log|t-\alpha_i|d\mu=\log^+|\alpha_i|=\log\max\{1,|\alpha_i|\}.\]

Combining them, we see

\[M(f)=|a|\prod_{i=1}^{d}\exp\left(\log^+|\alpha_i|\right)=|a|\prod_{i=1}^{d}\max\{1,|\alpha_i|\}.\]

Taking the logarithm we also obtain **Jensen's formula**

\[\log M(f)=\log|a|+\sum_{i=1}^{d}\log^+|\alpha_i|.\]
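Jensen's formula can be sanity-checked numerically. The sketch below (an illustration, not part of the original argument) evaluates \(\log M(f)\) both as the circle integral from the definition and via the roots, for the test polynomial \(t^3-2\); both values should be \(\log 2\):

```python
import numpy as np

def log_mahler_integral(coeffs, n=4096):
    # (1/2π)∫ log|f(e^{iθ})| dθ, averaged over n equispaced angles;
    # accurate when f has no root on the unit circle
    theta = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    return float(np.mean(np.log(np.abs(np.polyval(coeffs, np.exp(1j * theta))))))

def log_mahler_roots(coeffs):
    # Jensen's formula: log M(f) = log|a| + sum of log^+ |alpha_i|
    return float(np.log(abs(coeffs[0]))
                 + sum(max(0.0, float(np.log(abs(r)))) for r in np.roots(coeffs)))

f = [1, 0, 0, -2]  # t^3 - 2: all three roots have modulus 2^(1/3)
print(log_mahler_integral(f), log_mahler_roots(f))
```

Since the integrand is smooth and periodic, the equispaced average converges very quickly.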

We first give a reasonable and useful estimation of

Definition.For\(f(t)=a_dt^d+\dots+a_0\), the \(\ell_p\)-norm of \(f\) is naturally defined to be \[\ell_p(f)=\left(\sum_{j=0}^{d}|a_j|^p \right)^{\frac{1}{p}}.\]

For \(p=\infty\), we have

\(\ell_\infty(f)=\max_j|a_j|\).

Lemma 1.Notation being above,\(M(f) \le \ell_1(f)\) and \[{d \choose \lfloor{d/2}\rfloor }^{-1} \ell_\infty(f)\le M(f) \le \ell_2(f) \le\sqrt{d+1}\ell_\infty(f).\]

*Proof.* To begin with, we observe the obvious ones. First of all,

\[\begin{aligned}|f(e^{i\theta})|&=|a_d e^{id\theta}+\dots+a_0| \\ &\le |a_de^{id\theta}|+\dots+|a_0| \\ &=|a_d|+\dots+|a_0| \\ &=\ell_1(f).\end{aligned}\]

Therefore

\[\begin{aligned}M(f) &=\exp\left(\int_0^{2\pi}\log|f(e^{i\theta})|d\mu \right) \\ &\le \exp\left(\int_0^{2\pi}\log\ell_1(f)d\mu \right) \\ &=\ell_1(f).\end{aligned}\]

Next, by Jensen's inequality

\[\begin{aligned}M(f) &=\exp\left(\frac{1}{2}\int_0^{2\pi}\log|f(e^{i\theta})|^2d\mu\right) \\ &=\sqrt{\exp\left( \int_0^{2\pi} \log|f(e^{i\theta})|^2d\mu\right)} \\ &\le \sqrt{\int_0^{2\pi}|f(e^{i\theta})|^2d\mu} \\\end{aligned}\]

However, by Parseval's formula, the last term equals

\[\ell_2(f)=\sqrt{\sum_{j=0}^{d}|a_j|^2} \le \sqrt{d+1}\ell_\infty(f).\]

For the remaining inequality, we use Vieta's formula

\[\left|\frac{a_{d-r}}{a_d} \right| = \left|\sum_{j_1 < \dots< j_r}\alpha_{j_1} \cdots \alpha_{j_r}\right|\]

and therefore

\[|a_{d-r}| \le {d \choose r}|a_d|\prod_{j=1}^{d}\max\{1,|\alpha_j|\}={d \choose r}M(f)\le {d \choose \lfloor{d/2}\rfloor }M(f)\]

for all \(0 \le r \le d\). Replacing \(|a_{d-r}|\) with

Before proving Northcott's theorem, we show the connection between Mahler measure and heights.

Proposition 1.Let\(\alpha \in \overline{\mathbb{Q}}\) and let\(f\) be the minimal polynomial of\(\alpha\) over \(\mathbb{Z}\). Then \[\log M(f) = \deg(\alpha)h(\alpha)\]

and

\[\log|N_{\mathbb{Q}(\alpha)/\mathbb Q}(\alpha)| \le\deg(\alpha)h(\alpha).\]

*Proof.* Put

\[f(t)=a_dt^d+\dots+a_0.\]

Choose a number field \(K\) that contains \(\alpha\) and is a Galois extension of \(\mathbb{Q}\), with Galois group \(G\). Then

\[|a_d|_v\prod_{\sigma \in G}\max\{1,|\sigma\alpha|_v\}^{d/[K:\mathbb{Q}]}=1.\]

Now we are ready to compute the height of

\[h(\alpha)=\sum_{v \in M_K}\log^+|\alpha|_v, \quadh(\sigma\alpha)=h(\alpha),\quad \forall \sigma \in G.\]

We therefore obtain

\[\begin{aligned}h(\alpha)&=\frac{1}{[K:\mathbb{Q}]}\sum_{\sigma \in G}h(\sigma\alpha) \\ &=\frac{1}{[K:\mathbb{Q}]}\sum_{v \in M_K}\sum_{\sigma \in G}\log^+|\sigma\alpha|_v \\ &=\frac{1}{[K:\mathbb{Q}]}\sum_{v\mid\infty}\sum_{\sigma\in G}\log^+|\sigma\alpha|_v + \frac{1}{[K:\mathbb{Q}]}\sum_{v\nmid \infty}\sum_{\sigma\in G}\log^+|\sigma\alpha|_v \\\end{aligned}\]

The last term corresponds to what we have computed above about non-Archimedean absolute values, so we break it down a little bit:

\[\begin{aligned}\sum_{\sigma \in G}\log^+|\sigma\alpha|_v &= \sum_{\sigma \in G}\log\max\{1,|\sigma\alpha|_v\} \\ &=\log\prod_{\sigma \in G}\max\{1,|\sigma\alpha|_v\} \\ &=\frac{[K:\mathbb{Q}]}{d}\log\prod_{\sigma \in G} \max\{1,|\sigma\alpha|_v\}^{d/[K:\mathbb{Q}]} \\ &=-\frac{[K:\mathbb{Q}]}{d}\log|a_d|_v \\ &=\frac{[K:\mathbb{Q}]}{d}\log|a_d|_w\end{aligned}\]

for some \(w \mid \infty\), according to the product formula. On the other hand, for

\[\sum_{\sigma \in G}\log^+|\sigma\alpha|_v =\frac{[K:\mathbb{Q}]}{d}\sum_{j=1}^{d}\log^+|\alpha_j|_v.\]

All in all,

\[\begin{aligned}h(\alpha)&=\frac{1}{[K:\mathbb{Q}]}\frac{[K:\mathbb{Q}]}{d} \sum_{v \mid \infty}\left( \log|a_d|_v+\sum_{j=1}^{d}\log^+|\alpha_j|_v \right) \\ &=\frac{1}{d}\log M(f).\end{aligned}\]

The second assertion follows immediately because

\[\log|N_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha)|=\sum_{v\mid \infty}\sum_{j=1}^{d}\log|\alpha_j|_v \le \sum_{v\mid \infty}\sum_{j=1}^{d}\log^+|\alpha_j|_v.\]

The set of non-zero algebraic integers of height

When it comes to algebraic integers of small heights, things may get complicated, but Northcott's theorem assures that we will be studying a finite set.

Northcott's Theorem.Given an integer \(N>0\) and a real number \(H \ge 1\), there are only a finite number of algebraic integers \(\alpha\) satisfying \(\deg(\alpha) \le N\) and \(h(\alpha) \le \log H\).

*Proof.* Let \(\alpha\) be an algebraic integer of degree \(d \le N\) and height \(h(\alpha) \le \log H\). Suppose \(f(t)=a_dt^d+\dots+a_0 \in \mathbb{Z}[t]\) is the minimal polynomial of

\[\max|a_i|=\ell_\infty(f) \le {d \choose \lfloor d/2 \rfloor} M(f) \le 2^dM(f).\]

On the other hand, by proposition 1,

\[\log M(f) =dh(\alpha) \le d\log{H}=\log H^d,\]

we have actually

\[\max|a_i| \le (2H)^d.\]

This gives rise to no more than \((2\lfloor(2H)^d \rfloor+1)^{d+1}\) distinct polynomials
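The finiteness argument above is completely effective: for each degree \(d\) there are at most \((2\lfloor(2H)^d\rfloor+1)^{d+1}\) candidate polynomials. A toy count of the search space (the function name is my own):

```python
def candidate_poly_count(N, H):
    # every algebraic integer alpha with deg(alpha) <= N and h(alpha) <= log H
    # is a root of one of these integer polynomials, so the set is finite
    total = 0
    for d in range(1, N + 1):
        bound = int((2 * H) ** d)            # each |a_i| <= (2H)^d
        total += (2 * bound + 1) ** (d + 1)  # choices for a_0, ..., a_d
    return total

print(candidate_poly_count(2, 1))  # 25 + 729 = 754
```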

We also have the **Northcott property**, where we do not care about degrees. A set \(L\) of algebraic integers is said to satisfy the Northcott property if, for every \(T>0\), the set

\[\{\alpha \in L:h(\alpha)<T\}\]

is finite. Such a set \(L\) is said to satisfy the **Bogomolov property** if there exists \(T>0\) such that the set

\[\{\alpha \in L:0<h(\alpha)<T\}\]

is empty. As a matter of elementary topology, the Northcott property implies the Bogomolov property. It would be quite interesting if

Enrico Bombieri, Walter Gubler, *Heights in Diophantine Geometry*.

Michel Waldschmidt, *Diophantine Approximation on Linear Algebraic Groups, Transcendence Properties of the Exponential Function in Several Variables*.

Chris Smyth, *The Mahler Measure of Algebraic Numbers: A Survey*.

Let \(F\) be a non-Archimedean local field, meaning that \(F\) is complete under the metric induced by a

\[\mathfrak{o}_F=\{\alpha \in F:|\alpha| \le 1\}\]

and its unique prime (hence maximal) ideal

\[\mathfrak{p}=\{\alpha \in F:|\alpha| <1\}.\]

The residue field is \(k=\mathfrak{o}_F/\mathfrak{p}\).

Let \(f \in \mathfrak{o}_F[x]\) be a polynomial. Hensel's lemma states that, if

To do this, we need to use Newton's method of approximating roots of \(f(x)=0\), something like

\[a_{n+1}=a_n-\frac{f(a_n)}{f'(a_n)}.\]

We know that \(a_n \to \zeta\) where \(f(\zeta)=0\), under suitable conditions (see *Principles of Mathematical Analysis* if you are not familiar with it; I heartily recommend it). Now we will borrow Newton's method into number theory to find roots in a non-Archimedean field, which is radically different from

We will also use induction, in a form which I would like to call "double induction". Instead of claiming that

This method is inspired by this

Hensel's Lemma.Let \(F\) be a non-Archimedean local field with ring of integers \(\mathfrak{o}_F=\{\alpha \in F:|\alpha| \le 1\}\) and prime ideal \(\mathfrak{p}=\{\alpha \in F:|\alpha|<1\}\). Let \(f \in \mathfrak{o}_F[x]\) be a polynomial whose reduction \(\overline{f} \in k[x]\) has a simple root \(a \in k\); then \(a\) can be lifted to some \(\alpha \equiv a \bmod \mathfrak{p}\) such that \(f(\alpha)=0\).

By simple root we mean

Put \(F=\mathbb{Q}_7\). Then

\[\begin{aligned}2 &\equiv 3^2 \mod 7 \\2 &\equiv (3+7)^2 \mod 7^2 \\&\cdots\cdots\end{aligned}\]

Hence we can put

Since \(k\) is a finite field, we see \(k^\times\) is a cyclic group of order \(q-1\) where

Pick any \(a_0 \in \mathfrak{o}_F\)that is a lift of

\[a_n=a_{n-1}-\frac{f(a_{n-1})}{f'(a_{n-1})},\quad n \ge 1,\]

then we claim that \(a_n\) converges to the root we are looking for.
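For \(F=\mathbb{Q}_7\) and \(f(x)=x^2-2\), the iteration can be carried out concretely with plain integers modulo \(7^k\). A sketch (my own, assuming \(p\) is odd and \(n\) has a simple square root mod \(p\)):

```python
def hensel_sqrt(n, p, prec):
    # Newton iteration a <- a - f(a)/f'(a) for f(x) = x^2 - n,
    # performed with integers modulo p^prec
    m = p ** prec
    a = next(x for x in range(1, p) if (x * x - n) % p == 0)  # root mod p
    for _ in range(prec):
        # f'(a) = 2a is a unit mod p, so it is invertible mod p^prec
        a = (a - (a * a - n) * pow(2 * a, -1, m)) % m
    return a

r = hensel_sqrt(2, 7, 10)
print(r % 7, (r * r - 2) % 7 ** 10)  # 3 and 0: the lift of 3 squares to 2
```

The quadratic convergence of Newton's method means far fewer than `prec` iterations are actually needed.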

First of all, we need to show that \(a_n \in \mathfrak{o}_F\), i.e., \(|a_n| \le 1\) for all \(n\). It suffices to show that \(|f(a_{n-1})/f'(a_{n-1})| \le 1\). We first observe the case when

Since \(\overline{f}(a)=0\) but \(\overline{f}'(a) \ne 0\), we have \(f(a_0) \in \mathfrak{p}\) but \(f'(a_0) \notin \mathfrak{p}\), i.e., \(|f(a_0)|<1\) and \(|f'(a_0)|=1\).

By Taylor's theorem,

\[f'(a_n)=f'(a_{n-1})+f''(a_{n-1})(a_n-a_{n-1})+g_n(a_n)(a_n-a_{n-1})^2\]

for some \(g_n \in \mathfrak{o}_F[x]\). When

\[|f'(a_1)|=\max\{|f'(a_0)|,|f''(a_{0})(a_1-a_0)|,|g_1(a_1)(a_1-a_0)^2|\}=|f'(a_0)|=1.\]

Since \(a_1 \in \mathfrak{o}_F\), we also see that \(f(a_1) \in \mathfrak{o}_F\), hence its absolute value is not greater than \(1\). As a result

This inspires us to claim the following *two* statements:

(a) \(|f(a_n)| < 1\) for all \(n \ge 0\).

(b) \(|f'(a_n)|=|f'(a_0)|=1\) for all \(n \ge 0\).

We have verified (a) and (b) for \(n=0\) and \(n=1\).

First of all, by (a) and (b) for

Consider the Taylor expansion

\[\begin{aligned}f(a_n)&=f(a_{n-1})+f'(a_{n-1})(a_n-a_{n-1})+h_n(a_n)(a_n-a_{n-1})^2\\ &=f(a_{n-1})-f'(a_{n-1})\frac{f(a_{n-1})}{f'(a_{n-1})}+h_n(a_n) \left(\frac{f(a_{n-1})}{f'(a_{n-1})}\right)^2 \\ &= h_n(a_n)\left(\frac{f(a_{n-1})}{f'(a_{n-1})}\right)^2\end{aligned}\]

where \(h_n \in \mathfrak{o}_F[x]\). It follows that \(|h_n(a_n)| \le 1\). Since \(|f'(a_{n-1})|=1\) by (b), we actually have

\[|f(a_n)| \le |f(a_{n-1})|^2<1.\]

To prove (b) for \(n\), we consider the Taylor expansion

\[f'(a_n)=f'(a_{n-1})+f''(a_{n-1})(a_n-a_{n-1})+g_n(a_n)(a_n-a_{n-1})^2\]

Notice that since \(a_n \in\mathfrak{o}_F\), we have

\[\begin{aligned}|f''(a_{n-1})(a_n-a_{n-1})|&=|f''(a_{n-1})||f(a_{n-1})|<1,\\|g_n(a_n)(a_{n}-a_{n-1})^2|&=|g_n(a_n)||f(a_{n-1})|^2<1.\end{aligned}\]

Hence

\[\begin{aligned}|f'(a_n)|&=\max\{|f'(a_{n-1})|,|f''(a_{n-1})(a_n-a_{n-1})|,|g_n(a_n)(a_{n}-a_{n-1})^2|\}\\ &=|f'(a_{n-1})|,\end{aligned}\]

bearing in mind that for a non-Archimedean absolute value,

We need to show that \(\{a_n\}\) is a Cauchy sequence. To do this, it suffices to show that

For any \(\varepsilon>0\), thereexists \(N>0\) such that

\[\begin{aligned}|a_m-a_n|&=|(a_m-a_{m-1})+(a_{m-1}-a_{m-2})+\dots+(a_{n+1}-a_{n})|\\ &\le\max\{|a_m-a_{m-1}|,\dots,|a_{n+1}-a_n|\} \\ &\le |f(a_n)| \\ &< \varepsilon.\end{aligned}\]

Therefore \(\{a_n\}\) is Cauchy. Since \(F\) is complete,

In local fields, congruence is determined by inequality. In fact, we only need to show that

Suppose \(|a_{n-1}-a_0|<1\); then

\[|a_n-a_0| \le \max\{|a_n-a_{n-1}|,|a_{n-1}-a_0|\}<1.\]

Therefore \(|\alpha-a_0|=\lim_{n \to\infty}|a_n-a_0|<1\), from which the result follows.

In fact we have not explicitly used the fact that

Hensel's lemma, stronger version.Let \(F\) be a non-Archimedean local field with ring of integers \(\mathfrak{o}_F\), and let \(f \in \mathfrak{o}_F[x]\). Suppose there exists \(a \in \mathfrak{o}_F\) such that \(|f(a)|<|f'(a)|^2\); then there exists some \(b \in \mathfrak{o}_F\) such that \(f(b)=0\) and \(|b-a|<|f'(a)|\).

Instead of asserting

For example, we can find a square root of

There are a lot of variants of Hensel's lemma; for example, you can do exercise 10.9 of Atiyah-MacDonald. In fact, we later even have

There are other proofs of Hensel's lemma besides the one in this post; for example, since Newton's method can also be understood as a contraction mapping, we can prove the lemma using properties of contraction mappings (see

The group \(GL_2(\mathbb{F}_q)\) consists of invertible \(2 \times 2\) matrices with entries in the finite field

The field \(\mathbb{F}_q\) itself is already pretty complicated, let alone a matrix group over it. In this post we try to follow Fulton-Harris' idea in *Representation Theory: A First Course* to classify all irreducible representations of

First of all we determine the cardinality of

\[G \supset B = \left\{\begin{pmatrix}a & b \\ 0 &d\end{pmatrix}:a,d \ne 0\right\} \supset N= \left\{\begin{pmatrix}1 & b \\ 0 & 1\end{pmatrix}\right\}.\]

The cardinality of \(G\) is determined by the class formula; consider the canonical action on

First of all, notice that

The action of \(G\) on

\[\begin{pmatrix}a & b \\c & d\end{pmatrix}(r:s)=(ar+bs:cr+ds).\]

In particular, \(B\) is the isotropy group of the set \(\{(1:0)\}\), because in this case,

Since \(G\) clearly acts on

\[|G|=|B||\mathbb{P}^1(\mathbb{F}_q)|=(q-1)^2q(q+1).\]
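The formula can be confirmed by brute force for small prime \(q\) (a quick check, not part of the argument):

```python
from itertools import product

def gl2_order(q):
    # brute-force count of invertible 2x2 matrices over F_q; for q prime,
    # a nonzero determinant mod q means the matrix is invertible
    return sum(1 for a, b, c, d in product(range(q), repeat=4)
               if (a * d - b * c) % q != 0)

for q in (2, 3, 5):
    print(q, gl2_order(q), (q - 1) ** 2 * q * (q + 1))
```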

In general, the cardinality of

We next consider the diagonal subgroup

\[D=\left\{\begin{pmatrix}a & 0 \\ 0 & d\end{pmatrix}:a,d \ne 0\right\} \cong \mathbb{F}_q^\ast \times \mathbb{F}_q^\ast.\]

Let

Let \(\varepsilon\) be a generator of the cyclic group

\[K=\left\{\begin{pmatrix}x & \varepsilon{y} \\ y & x\end{pmatrix}:\text{$x$ and $y$ not both zero}\right\} \cong (\mathbb{F}')^\ast.\]

The isomorphism is given by

\[\begin{pmatrix}x & \varepsilon{y} \\ y & x\end{pmatrix} \leftrightarrow \zeta = x+ y\sqrt\varepsilon.\]

To make \(K\) a subgroup of

*At the end of this section one can see a table of theresult.*

For matrices, conjugacy gives rise to eigenvalues and the Jordan canonical form. So we immediately come up with the following three forms:

\[a_x = \begin{pmatrix}x & 0 \\ 0 & x \end{pmatrix}, \quadb_x = \begin{pmatrix}x & 1 \\ 0 & x \end{pmatrix}, \quadc_{x,y} = \begin{pmatrix}x & 0 \\ 0 & y \end{pmatrix} (x \ne y).\]

For each \(x\) and

Conjugacy classes represented by

\[A^{-1}\begin{pmatrix}x & 0 \\ 0 & x\end{pmatrix}A=\begin{pmatrix}x & 0 \\ 0 & x\end{pmatrix}A^{-1}A=\begin{pmatrix}x & 0 \\ 0 & x\end{pmatrix}.\]

Therefore there is only one element in the conjugacy class represented by \(a_x\). Ranging through all \(x \ne 0\), we obtain

For Jordan canonical form like

Let \(\mathcal{C}=(b_x)\) be a conjugacy class. Let \(G\) act on

To determine the cardinality of

\[Ab_x=\begin{pmatrix}a_{11}x & a_{11}+a_{12}x \\a_{21}x & a_{21}+a_{22}x\end{pmatrix}=\begin{pmatrix}a_{21}+a_{11}x & a_{22}+a_{12}x \\a_{21}x & a_{22}x\end{pmatrix} = b_x A.\]

The equation above implies that

\[J=\left\{\begin{pmatrix}a & b \\ 0 & a \end{pmatrix}:a \ne0\right\}.\]

It follows that

Let \(\mathcal{D}=(c_{x,y})\) be a conjugacy class. Ranging through all \(x,y \ne 0\) with \(y \ne x\), then dividing by \(2\), we obtain

We determine the cardinality of

Now let's count how many conjugacy classes we have obtained:

\[1(q-1)+(q^2-1)(q-1)+(q^2+q)\frac{(q-1)(q-2)}{2}=\frac{1}{2}(q^4-3q^2+2q).\]
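Once the classes \((d_{x,y})\) below are included, the total number of conjugacy classes of \(GL_2(\mathbb{F}_q)\) comes to \(q^2-1\). For \(q=3\) this can be verified by brute force (a small check of my own):

```python
from itertools import product

def conjugacy_class_count(q):
    # brute-force conjugacy classes of GL_2(F_q), q prime;
    # a matrix is stored as a tuple (a, b, c, d) for [[a, b], [c, d]]
    mats = [m for m in product(range(q), repeat=4)
            if (m[0] * m[3] - m[1] * m[2]) % q != 0]

    def mul(x, y):
        return ((x[0] * y[0] + x[1] * y[2]) % q, (x[0] * y[1] + x[1] * y[3]) % q,
                (x[2] * y[0] + x[3] * y[2]) % q, (x[2] * y[1] + x[3] * y[3]) % q)

    def inv(x):
        di = pow((x[0] * x[3] - x[1] * x[2]) % q, -1, q)
        return ((x[3] * di) % q, (-x[1] * di) % q,
                ((-x[2]) * di) % q, (x[0] * di) % q)

    seen, classes = set(), 0
    for g in mats:
        if g not in seen:
            classes += 1
            seen.update(mul(mul(a, g), inv(a)) for a in mats)
    return classes

print(conjugacy_class_count(3))  # q^2 - 1 = 8
```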

We still need to find

\[d_{x,y}=\begin{pmatrix} x & \varepsilon{y} \\ y & x\end{pmatrix}, \quad y \ne 0.\]

Then the eigenvalues of \(d_{x,y}\) are \(x\pm \sqrt\varepsilon y\), none of which lies in \(\mathbb{F}_q\). Therefore it has nothing to do with the Jordan canonical form. We will explore the remainder of the conjugacy classes of

Ranging through all \(x\) and

\[\begin{pmatrix}a & -\varepsilon{c} \\c & -a\end{pmatrix}.\]

Now let \(\mathcal{E}=(d_{x,y})\) be a conjugacy class. Notice the isotropy group of

Representative | Number of Elements in Class | Number of Classes |
---|---|---|
\(a_x =\begin{pmatrix}x & 0 \\ 0 & x \end{pmatrix}\) | \(1\) | \(q-1\) |
\(b_x =\begin{pmatrix}x & 1 \\ 0 & x \end{pmatrix}\) | \(q^2-1\) | \(q-1\) |
\(c_{x,y} =\begin{pmatrix}x & 0 \\ 0 & y \end{pmatrix} (x \ne y)\) | \(q^2+q\) | \(\frac{1}{2}(q-1)(q-2)\) |
\(d_{x,y}=\begin{pmatrix} x & \varepsilon{y} \\ y & x\end{pmatrix} (y \ne 0)\) | \(q^2-q\) | \(\frac{1}{2}q(q-1)\) |

*These matrices will not frequently appear in the remainder of the post because they would mess up the format.*

There are

Some computations are omitted because otherwise this section would be unreadable. However, the author of this post has checked most of them on paper. The reader should find it easy to compute by themselves. For complete computations, one refers to

Recall how we find irreducible representations of

- \(\chi_W(a_x)=q+1\): it fixes every point.

- \(\chi_W(b_x)=1\): it fixes only one point, \((1:0)\).

- \(\chi_W(c_{x,y})=2\): it fixes two points, \((1:0)\) and \((0:1)\).

- \(\chi_W(d_{x,y})=0\): if \(d_{x,y}\) fixed \((a:b)\), then \(a^2=\varepsilon b^2\), and this cannot happen.

Therefore we have

We see \((\chi_V,\chi_V)=1\). Therefore \(V\) is irreducible and we cannot decompose \(W\) further. We have to find different approaches.

The Pontryagin dual of a group \(H\) is defined to be

\[\hat{H}=\operatorname{Hom}(H,S^1).\]

If \(H\) admits a topology, we may want to eliminate non-continuous homomorphisms, but that is not our concern here because we only care about finite groups, which admit the discrete topology. Notice that if \(H\) is finite and cyclic, then \(\hat{H} \cong H\). We will use this fact right now.

Since \(G\) can be pretty big, it is not realistic to study all eigenvalues of representations. Instead, we consider the Pontryagin dual of

\[\chi_{U_\alpha}(g)=\alpha(\det(g)), \quad g \in G.\]

Note the trivial representation is one of the

Tensoring \(U_\alpha\) with

We have successfully determined

\[\begin{aligned}\gamma_{\alpha,\beta}:B \to B/N \cong D \cong \mathbb{F}^\ast \times\mathbb{F}^\ast&\xrightarrow{(\alpha,\beta)}\mathbb{C}^\ast \times \mathbb{C}^\ast\xrightarrow{\times}\mathbb{C}^\ast\\\begin{pmatrix}a & b \\ 0 & d\end{pmatrix}&\mapsto \alpha(a)\beta(d)\end{aligned}\]

Let \(W'_{\alpha,\beta}\) be the representation of \(B\) with character \(\gamma_{\alpha,\beta}\), and let \(W_{\alpha,\beta}=\operatorname{Ind}_B^GW_{\alpha,\beta}'\). We can quite easily (no, with a lot of dirty computation) write down the character table of

If \(\alpha=\beta\), then

We haven't used this subgroup yet, so we first explore it in the same vein as attempt 3. We consider the dual of

We put \(\zeta=x+y\sqrt\varepsilon \in K =(\mathbb{F}')^\ast\). Since

Let's try to tensor what we have found. It is easy to see that

\(V \otimes W_{\alpha,1}\)

Let \(\varphi \in\widehat{(\mathbb{F}')^\ast}\) be a homomorphism such that

\[\begin{aligned}(\chi_{V \otimes W_{\alpha,1}},\chi_{W_{\alpha,1}})&=2, \\(\chi_{V \otimes W_{\alpha,1}},\chi_{V \otimes W_{\alpha,1}})&=q+3,\\(\chi_{\operatorname{Ind}(\varphi)},\chi_{W_{\alpha,1}}) &= 1, \\(\chi_{\operatorname{Ind}(\varphi)},\chi_{V \otimes W_{\alpha,1}})&= q.\end{aligned}\]

We see \(W_{\alpha,1}\) is contained in the representation determined by

\[\chi_\varphi=\chi_{V \otimesW_{\alpha,1}}-\chi_{W_{\alpha,1}}-\chi_{\operatorname{Ind}(\varphi)}.\]

We can compute that

It follows that

A few remarks are in order. We can call these four classes of irreducible representations in the following way (excerpted from

- \(U_\alpha\): \(1\)-dimensional representations. There are \(q-1\) of them.

- \(V_\alpha=V \otimes U_\alpha\): \(q\)-dimensional representations. Here \(V\) is also called the Steinberg representation. There are \(q-1\) of them.

- \(W_{\alpha,\beta}\): \((q+1)\)-dimensional irreducible principal series. There are \(\frac{1}{2}(q-1)(q-2)\) of them. Some authors may also treat the \(q\)-dimensional representations as principal series.

- \(X_\varphi\): irreducible cuspidal representations, or complementary series representations. There are \(\frac{1}{2}q(q-1)\) of them. A representation is cuspidal if the Jacquet module is trivial.

The Segre embedding allows us to define the product of projective varieties reasonably, and we will discuss it right now. To begin with we consider the product of

Definition 1.TheSegre embeddingis defined as follows:\[\begin{aligned}\iota:\mathbb{P}^m \times \mathbb{P}^n &\to \mathbb{P}^N \\([X_0:\cdots:X_m],[Y_0:\cdots:Y_n]) &\mapsto[X_0Y_0:X_0Y_1:\cdots:X_mY_n]\end{aligned}\]

Clearly, \(N=(m+1)(n+1)-1=mn+m+n\). The coordinates \(X_iY_j\) of the image on the right hand side are ordered lexicographically.

First of all we make sure that this function is well-defined; otherwise our work would be useless.

Proposition 1.The Segre embedding is a well-definedinjective map.

*Proof.* Assume \(X_i'=\lambda X_i\) and \(Y_j'=\mu Y_j\) for some \(\lambda,\mu \ne 0\); then

\[\begin{aligned}\iota([X_0':\dots:X_m'],[Y_0':\dots:Y_n'])&=[X_0'Y_0':\dots:X_m'Y_n']\\ &=[\lambda\mu X_0Y_0:\dots:\lambda\mu X_mY_n] \\ &=\lambda\mu[X_0Y_0:\dots:X_mY_n] \\ &=[X_0Y_0:\dots:X_mY_n] \\\end{aligned}\]

Next suppose that

Next we study the image further using linear algebra.

We can write elements in

\[\begin{bmatrix}Z_{00} & \dots & Z_{0n} \\\vdots & \ddots & \vdots \\Z_{m0} & \cdots & Z_{mn}\end{bmatrix}.\]

Therefore the image of \(\iota\) is given by \(Z_{ij}=X_iY_j\). Through an elementary observation, we see the matrix

\[\begin{bmatrix}X_0Y_0 & \cdots & X_0Y_n \\\vdots & \ddots & \vdots \\X_mY_0 & \cdots & X_mY_n\end{bmatrix}\]

has rank \(1\). The question is, is the converse true? For this reason we study the set

\[Z=Z(\{Z_{ij}Z_{kl}-Z_{kj}Z_{il}:0 \le i,k \le m,0 \le j,l \le n\}).\]

Note \(Z_{ij}Z_{kl}-Z_{kj}Z_{il}\) is the determinant of a \(2\times2\) submatrix of the matrix \([Z_{ij}]\). This \(Z\) contains all

\[V_{kl} \cap Z\to U_k \times U_l', \quad [Z_{ij}] \mapsto([Z_{0l}:\dots:Z_{ml}],[Z_{k0}:\dots:Z_{kn}])\]

This is indeed the inverse map of

Therefore, the image of the Segre embedding is a projective variety. As a classic example, the image of
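The embedding and the vanishing of the \(2\times2\) minors are easy to check computationally. A small illustration of my own, with integer coordinates:

```python
from itertools import product

def segre(x, y):
    # iota([x],[y]) = [x_i * y_j], in lexicographic order
    return [xi * yj for xi in x for yj in y]

x, y = [1, 2, 3], [4, 5]       # points of P^2 and P^1
z = segre(x, y)                # their image in P^5
m, n = len(x), len(y)
Z = lambda i, j: z[i * n + j]  # the matrix (Z_ij) read off from z

# every 2x2 minor Z_ij Z_kl - Z_kj Z_il vanishes on the image
assert all(Z(i, j) * Z(k, l) - Z(k, j) * Z(i, l) == 0
           for i, k, j, l in product(range(m), range(m), range(n), range(n)))
print(z)  # [4, 5, 8, 10, 12, 15]
```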

In this section we offer a way to understand the Segre embedding in number fields. To begin with, we need some definitions.

Height is computed by absolute values on a field, so we first normalise all absolute values on

The ordinary absolute value

\[M_\mathbb{Q}=\{|\cdot|_p:p\text{ is prime or } p=\infty\}.\]

Likewise we define \(M_K\), where throughout \(K\) will always be a number field. \(M_K\) consists of the ordinary absolute value and extensions of

\[|x|_v=|N_{K_v/\mathbb{Q}_p}(x)|_p^{1/[K:\mathbb{Q}]},\quad \forall x\in K,v|p.\]

In particular, \(M_K\) satisfies theproduct formula:

\[\prod_{v \in M_K}|x|_v=1 \text{ or } \sum_{v \in M_K}\log|x|_v=0\]

for all \(x \in K^\times\). This restriction allows us to work fine with projective spaces, as we will see later.

Definition 2.The (absolute logarithmic) height of \(x\in\mathbb{P}^n_{\overline{\mathbb{Q}}}\), with coordinates \((x_0,\dots,x_n) \in K^{n+1}\), is defined by \[h(x)=\sum_{v \in M_K}\max_j \log |x_j|_v.\]

Actually, the height function measures the "algebraic complexity" of \(x\), and is well-defined in many senses.

Proposition 2.The height\(h(x)\) is independent of the choice of\(K\).

*Sketch of the proof.* Let

\[\sum_{w|v}[L_w:K_v]=[L:K],\]

which implies that

\[\sum_{w|v}\log |x|_w=\log |x|_v.\]

Therefore

\[\sum_{w \in M_L}\max_j\log|x_j|_w=\sum_{v \in M_K}\sum_{w|v}\max_j\log|x_j|_w\]

gives what we want. \(\square\)

Proposition 3.\(h(x)\) is well-defined on \(\mathbb{P}^n_{\overline{\mathbb{Q}}}\).

*Proof.* It remains to show that

\[\begin{aligned}h(\lambda{x})&=\sum_{v \in M_K}\max_j\log|\lambda x_j|_v \\ &=\sum_{v \in M_K}\left(\log|\lambda|_v+\max_j\log|x_j|_v \right) \\ &=\sum_{v \in M_K}\log|\lambda|_v+\sum_{v \in M_K}\max_j\log|x_j|_v \\ &=\sum_{v \in M_K}\max_j\log|x_j|_v \\ &=h(x).\end{aligned}\]

Note \(\sum_{v \in M_K}\log|\lambda|_v=0\) because of the product formula.

To highlight the ability of height to measure algebraic complexity, let's mention the following theorem of Kronecker.

Theorem 1 (Kronecker).The height of\(\zeta\in\overline{\mathbb{Q}}^\times\) is\(0\) if and only if \(\zeta\) is a root of unity.

One direction is straightforward. To prove the converse, one may need some combinatorics, symmetric functions and Dirichlet's pigeonhole principle. See theorem 2.4 of this

Now let's invite the Segre embedding into the party:

\[\begin{aligned}\iota:\mathbb{P}^n_{\overline{\mathbb{Q}}} \times\mathbb{P}^m_{\overline{\mathbb{Q}}} &\to\mathbb{P}^N_{\overline{\mathbb{Q}}} \\ (x,y) &\mapsto x \otimes y \\ &:=(x_iy_j).\end{aligned}\]

Using the fact that

\[h(x \otimes y) = h(x) + h(y).\]
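Over \(\mathbb{Q}\) this additivity is concrete: represent a rational point by coprime integer coordinates; then all non-Archimedean terms vanish and the height is just the log of the largest coordinate, and the Segre image of coprime representatives is again coprime. A sketch of my own, assuming positive integer coordinates:

```python
from functools import reduce
from math import gcd, log

def height_Q(coords):
    # height of a point of P^n(Q) with positive integer coordinates:
    # divide out the gcd, then take log of the largest coordinate
    g = reduce(gcd, coords)
    return log(max(c // g for c in coords))

x, y = [2, 3, 5], [7, 11]
xy = [xi * yj for xi in x for yj in y]  # the Segre image x ⊗ y
print(height_Q(xy), height_Q(x) + height_Q(y))  # both equal log 55
```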

The Segre embedding is used immediately after introducing the height of a polynomial.

Definition 3.For\(f(t_1,\dots,t_n) \in K[t_1,\dots,t_n]\), wewrite \[f(t_1,\dots,t_n)=\sum_{\mathbf{j}}a_{\mathbf{j}}\mathbf{t^j}.\]

Then the height of \(f\) is defined to be \[h(f)=\sum_{v \in M_K}\log |f|_v\]

where

\[|f|_v=\max_{\mathbf{j}}|a_{\mathbf{j}}|_v.\]

Likewise, it measures the complexity of a polynomial. Products of polynomials in **different variables** can be understood as tensor products, and therefore we have the following fact.

Proposition 4.Let\(f(t_1,\dots,t_n)\) and \(g(s_1,\dots,s_m)\) be polynomials indifferent sets of variables, then \[h(fg)=h(f)+h(g).\]

Note: if \(f\) and the factors share variables, say \(f=f_1\cdots f_m\), we only have the weaker estimate

\[\left|h(f)-\sum_{j=1}^{m}h(f_j)\right| \le d\log 2\]

where \(d\) is the sum of the partial degrees of \(f\).

We want to apply calculus to fields, but tools are needed. For the ordinary calculus, on \(\mathbb{R}\) or \(\mathbb{C}\), the most important role is played by the limit:

\[\lim_{x \to a}f(x)=A.\]

However we cannot import the absolute value into other fields directly. Indeed, if the field \(k\) is an extension of \(\mathbb{Q}\), then we may define an absolute value on \(k\) to be the restriction of the absolute value of

Definition 1.An absolute value on a field \(K\) is a real valued function \(|\cdot|:K \to \mathbb{R}_+\) such that

- For all \(x \in K\), we have \(|x| \ge 0\), and \(|x|=0\) if and only if \(x=0\).

- \(|xy|=|x||y|\).

- There exists \(c>0\) such that \(|x+y| \le c\max\{|x|,|y|\}\).

Before we dive into some technical details of the inequality, let's see some trivial and non-trivial examples.

- On any field \(K\), we can define \(|x|=1\) for all \(x \ne 0\). This is the most trivial absolute value and it carries little to no information. But whether or not the absolute value is trivial, we always have \(|1|=1\) because \(|1 \cdot x|=|1||x|=|x|\).

- If \(K=\mathbb{Q}\), we can define \(|m/n|\) to be the ordinary absolute value \(\sqrt{\left(\frac{m}{n}\right)^2}\). We are familiar with it for sure. It is customary to write \(|\cdot|_\infty\).

- However, for \(K=\mathbb{Q}\) and \(m/n \in K\), we can also write

\[\frac{m}{n}=p^a\frac{m'}{n'}.\]

where \(m'\) and \(n'\) are integers coprime to \(p\). Under this presentation we can put

\[\left|\frac{m}{n}\right|_p=p^{-a}.\]

In this way we obtain an absolute value \(|\cdot|_p\), the \(p\)-adic absolute value.
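A direct implementation of this definition on \(\mathbb{Q}\) (a sketch of my own, using `fractions.Fraction`):

```python
from fractions import Fraction

def abs_p(x, p):
    # |m/n|_p = p^(-a) where m/n = p^a * m'/n' with p dividing neither m' nor n'
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    a, num, den = 0, x.numerator, x.denominator
    while num % p == 0:
        num //= p
        a += 1
    while den % p == 0:
        den //= p
        a -= 1
    return Fraction(1, p) ** a

print(abs_p(12, 2), abs_p(Fraction(1, 12), 2), abs_p(5, 3))  # 1/4 4 1
```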

- Let \(K=\mathbb{F}_q\) be a finite field; then the only absolute value on \(K\) is trivial. To see this, notice that \(K^\times\) is a cyclic group. Pick any \(x \in K^\times\); we have \(|x|^{q-1}=|x^{q-1}|=|1|=1\).

It seems we have ignored the triangle inequality for no reason, but actually we didn't. To see this, we give a refinement of the triangle inequality first.

Proposition 1.Let \(|\cdot|:K \to \mathbb{R}\) be an absolute value with \(|x+y|\le c\max\{|x|,|y|\}\); then the following two statements are equivalent:

1. \(c \le 2\).

2. For all \(a,b \in K\), we have \(|a+b|\le |a|+|b|\). This is the triangle inequality.

*Proof.* It is obvious that 2 implies 1. Conversely, assume \(c \le 2\). For \(n=2^m\), iterating the inequality gives

\[\left|\sum_{k=1}^{n}a_k \right| \le 2^m \max |a_k|=n\max|a_k| \le 2n\max|a_k|.\]

For a general positive integer \(n\), choose \(m\) such that \(2^{m-1} < n \le 2^m\) and set \(a_k=0\) for \(k>n\); then

\[\begin{aligned}\left|\sum_{k=1}^{n}a_k\right| &\le c\max\left\{\left|\sum_{k=1}^{2^{m-1}}a_k\right|,\left|\sum_{k=2^{m-1}+1}^{2^{m}}a_k\right|\right\}\\ &\le2\max\left\{\left|\sum_{k=1}^{2^{m-1}}a_k\right|,\left|\sum_{k=2^{m-1}+1}^{2^{m}}a_k\right|\right\}\\ &\le2\max\left\{2^{m-1}\cdot\max_{1\le k \le2^{m-1}}|a_k|,2^{m-1}\cdot\max_{2^{m-1}<k\le 2^m}|a_k|\right\} \\ &\le 2 \cdot 2^{m-1}\max_{1 \le k \le2^m}|a_k| \\ &\le 2n \max_{1 \le k \le 2^m}|a_k|\end{aligned}\]

Let \(\tilde{n}\) be the image of \(n\) in \(K\); applying the bound above to \(n\) copies of \(1\) gives \(|\tilde{n}| \le 2n\). In general,

\[\left|\sum_{k=1}^{n}a_k\right| \le 2n \sum_{k=1}^{n}|a_k|.\]

We therefore can write

\[\begin{aligned}|a+b|^n &= |(a+b)^n| \\ &=\left|\sum_{k=0}^{n}{n \choose k}a^k b^{n-k} \right| \\ &\le 2(n+1)\sum_{k=0}^{n}\left|\widetilde{n\choose k}\right||a|^k|b|^{n-k} \\ &\le 4(n+1)\sum_{k=0}^{n}{n \choose k}|a|^k|b|^{n-k} \\ &=4(n+1) (|a|+|b|)^n.\end{aligned}\]

It follows that

\[|a+b| \le \sqrt[n]{4(n+1)}(|a|+|b|), \quad \forall n \in \mathbb{N}.\]

Since \(\lim_{n \to\infty}\sqrt[n]{4(n+1)}=1\), we are done.

\[f(X)=\sum_{k=n}^{\infty}a_kX^k\]

where \(a_n \ne 0\). We can define an absolute value on

Notice that an absolute value induces a translation-invariant metric in an obvious way:

\[d(x,y)=|x-y|.\]

A topology comes up in the nature of things. We cannot apply theorems in functional analysis directly because we do not have a real or complex vector space. But we can try to import those important concepts. When studying the open mapping theorem, we care about equivalent norms or metrics, i.e. whether they induce the same topology. Here we will also do that.

Definition 2.Two absolute values \(|\cdot|_1\) and \(|\cdot|_2\) are equivalent if they induce the same topology (this is clearly an equivalence relation). An equivalence class of absolute values is called a place.

Clearly, the topology is discrete if and only if the absolute value is trivial. Therefore a trivial absolute value is not equivalent to any non-trivial one. But let's see two non-trivial absolute values that are not equivalent. On \(\mathbb{Q}\), consider \(|\cdot|_\infty\) and

\[ \limsup_{n \to \infty}\left|\frac{1}{n}\right|_2=1\]

if we take odd numbers into account. On the other hand,

We have an important characterisation of equivalent absolute values.

Proposition 2.Let \(|\cdot|_1\) and \(|\cdot|_2\) be two non-trivial absolute values; then the following statements are equivalent.

1. \(|\cdot|_1\) and \(|\cdot|_2\) are equivalent.

2. \(|x|_1<1\) implies that \(|x|_2<1\).

3. There exists \(\lambda>0\) such that \(|\cdot|_1=|\cdot|_2^\lambda\).

*Proof.* Assume that

Assume that \(|x|_1<1\) always implies that \(|x|_2<1\). It follows that \(|x|_1>1\) implies that

\[|x|_2=\left(|x_0|_1^{\log_ab}\right)^\alpha=|x_0|_1^{\alpha\lambda}=|x|_1^\lambda.\]

3 implying 1 is immediate because

If \(|\cdot|_1=|\cdot|_2^\lambda\), \(|x+y|_1\le c_1\max\{|x|_1,|y|_1\}\) and \(|x+y|_2 \le c_2\max\{|x|_2,|y|_2\}\), then

Proposition 3.Each absolute value is equivalent toone that satisfies the triangle inequality.

Bearing this in mind, we can study the case when

Proposition 4.Let \(|\cdot|\) be an absolute value on \(K\). Then the following statements are equivalent:

1. \(|\cdot|\) satisfies the ultrametric inequality: \(|x+y|\le\max\{|x|,|y|\}\).

2. \(|\tilde{n}|\le 1\) for all \(n \in \mathbb{N}\).

*Proof.* Suppose that \(|x+y| \le\max\{|x|,|y|\}\). Then

\[|\widetilde{n+1}|=|\tilde{n}+1|\le\max\{|\tilde{n}|,|1|\}\le 1.\]

Conversely, suppose that \(|\tilde{n}| \le 1\) for all \(n\). Replace the absolute value with one satisfying the triangle inequality if necessary. It follows that

\[\begin{aligned}|a+b|^n &\le \sum_{k=0}^{n}\left|{n \choose k}a^k b^{n-k}\right| \\&\le \sum_{k=0}^{n}\left|\widetilde{n \choose k}\right||a|^k|b|^{n-k} \\&\le (n+1)\max\{|a|^n,|b|^n\} \\&=(n+1)\max\{|a|,|b|\}^n.\end{aligned}\]

Therefore \(|a+b| \le \sqrt[n]{n+1}\max\{|a|,|b|\}\). The result follows from the fact that \(\sqrt[n]{n+1} \to 1\) as \(n \to \infty\).

Definition 3.An absolute value is called non-Archimedean, or ultrametric, if the condition in proposition 4 is satisfied. Otherwise it is called Archimedean or ordinary.

For example, trivial absolute values are ultrametric, but we are not interested in that. What is interesting is that

There is a second classification - Ostrowski's theorem, which states that all nontrivial places on

- Theorem 4.2 of this note for the ordinary theorem of Ostrowski on \(\mathbb{Q}\).
- This expository paper for the theorem of Ostrowski on number fields.
- This expository paper for the theorem of Ostrowski on function fields.

When we have a field extension

Definition 4. A field \(K\) is complete with respect to \(|\cdot|\) if \(K\) is a complete metric space with respect to the metric \(d(x,y)=|x-y|\).

To employ a similar device, we will define completion in a similar style. Let \(\mathscr{P}_F\) be the set of all places of a field \(F\). Each place \(v\) of \(L\) restricts to a place \(u\) of \(K\), giving a map

\[\begin{aligned} r:\mathscr{P}_L &\to \mathscr{P}_K \\ v &\mapsto u\end{aligned}\]

from the places of \(L\) to the places of \(K\).

Definition 5. Let \(L/K\) be a field extension and \(u \in \mathscr{P}_K\). If \(v \in r^{-1}(u)\), we write \(v|u\) and say \(v\) divides \(u\), or \(v\) lies over \(u\).

Definition 6. A completion of \(K\) with respect to a place \(v\) is an extension field \(K_v\) with a place \(w\) such that

1. \(w|v\).
2. The topology of \(K_v\) induced by \(w\) is complete.
3. \(K\) is a dense subset of \(K_v\).

The extension exists and is unique up to isomorphism (to see this, modify the proof on the completion of

For \(|\cdot|_p\) on

As a striking example, in \(\mathbb{Q}_2\) we have

\[\sum_{k=0}^{\infty}2^k=-1\]

because

\[\lim_{n \to \infty}\left|\sum_{k=0}^{n}2^k+1\right|_2=\lim_{n \to \infty}\left|2^{n+1}-1+1\right|_2=\lim_{n \to \infty} 2^{-(n+1)}=0.\]
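This convergence is easy to witness numerically. A quick sketch (the function name `abs2` is ad hoc, not from the post) of the 2-adic distance between the partial sums and \(-1\):

```python
def abs2(n: int) -> float:
    """2-adic absolute value of a nonzero integer."""
    v = 0
    while n % 2 == 0:
        n //= 2
        v += 1
    return 2.0 ** (-v)

# |partial sum - (-1)|_2 = |2^(n+1)|_2 = 2^(-(n+1)) -> 0
for n in [1, 5, 10, 20]:
    s = sum(2 ** k for k in range(n + 1))  # 1 + 2 + ... + 2^n = 2^(n+1) - 1
    assert abs2(s + 1) == 2.0 ** (-(n + 1))
```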

There is nothing sloppy or mistaken here, unlike that Numberphile video on the "identity"

To conclude this post and prepare for future posts, we show that absolute values work fine with norms on a vector space (not to be confused with norms in Galois theory).

Definition 7. Let \(K\) be a field with absolute value \(|\cdot|\) and \(E\) be a vector space over \(K\). A norm \(E \to \mathbb{R}\) compatible with \(|\cdot|\) is a function \(\|\cdot\|\) that satisfies

1. \(\|\xi\|\ge 0\) for all \(\xi \in E\), and \(\|\xi\|=0\) if and only if \(\xi=0\).
2. For all \(x \in K\) and \(\xi \in E\), \(\|x\xi\|=|x|\|\xi\|\).
3. \(\|\xi_1+\xi_2\| \le \|\xi_1\|+\|\xi_2\|\) for all \(\xi_1,\xi_2 \in E\).

Two norms \(\|\cdot\|_1\) and \(\|\cdot\|_2\) are called **equivalent** if there exist numbers \(C_1,C_2>0\) such that

\[C_1\|\xi\|_1 \le \|\xi\|_2 \le C_2 \|\xi\|_1.\]

This is an equivalence relation, and we have already seen it in elementary linear algebra and functional analysis. This is equivalent to the fact that \(\|\cdot\|_1\) and

\[\xi=x_1\xi_1+\cdots+x_n\xi_n,\quad x_1,\dots,x_n \in K.\]

We can define norms like

Proposition 5. Let \(K\) be a complete field under a non-trivial absolute value \(|\cdot|\), and let \(E\) be a finite-dimensional vector space over \(K\). Then any two norms on \(E\) that are compatible with \(|\cdot|\) are equivalent.

*Proof.* It suffices to show that

\[\xi^{(\nu)} = x_1^{(\nu)}\xi_1+\cdots+x_n^{(\nu)}\xi_n\]

is a Cauchy sequence (with respect to a norm) in

Suppose this is false for

\[\frac{\xi^{(\nu)}}{x_1^{(\nu)}}-\xi_1=\frac{x_2^{(\nu)}}{x_1^{(\nu)}}\xi_2+\dots+\frac{x_n^{(\nu)}}{x_1^{(\nu)}}\xi_n.\]

Taking the limit, we see \(\xi_1\) is a linear combination of

We will need this proposition to work with finite field extensions.

- Enrico Bombieri and Walter Gubler, *Heights in Diophantine Geometry*.
- Serge Lang, *Algebra, Revised Third Edition*.
- Dinakar Ramakrishnan and Robert J. Valenza, *Fourier Analysis on Number Fields*.

In our previous post on the irreducible representations of

The result is satisfying for

This post should be relatively easier to read than the previous one. Other than the basic language of representation theory (of Lie groups), only multivariable calculus and linear algebra are needed.

The group \(SO(3)\) has a rich background in physics. See, for example, "

Like in the previous post, we first determine a good playground and then show that this is all we need. The playground here is

\[P_\ell=\mathbb{C} \otimes_\mathbb{R}\operatorname{Sym}^\ell\mathbb{R}^3.\]

The reason for the symmetric product of

Recall that

\[\dim \operatorname{Sym}^\ell \mathbb{R}^3 = {\ell+3-1 \choose \ell}={\ell + 2 \choose \ell}={\ell+2 \choose 2}=\frac{(\ell+2)(\ell+1)}{2}.\]
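The stars-and-bars count above can be double-checked in a couple of lines (an illustration only; `dim_sym` is an ad hoc name):

```python
import math

def dim_sym(l: int, n: int = 3) -> int:
    """Dimension of the space of degree-l homogeneous polynomials in n variables."""
    return math.comb(l + n - 1, l)

for l in range(10):
    assert dim_sym(l) == (l + 2) * (l + 1) // 2
```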

Therefore \(\dim P_\ell=\frac{(\ell+2)(\ell+1)}{2}\), as a complex vector space.

The action of \(SO(3)\) on \(P_\ell\) is given by

\[(Af)(x)=f(xA).\]

Here, \(x=(x_1,x_2,x_3) \in \mathbb{R}^3\), and

To study this representation, we need to find some morphisms

\[\Delta:f \mapsto \left(\frac{\partial^2}{\partial x_1^2}+\frac{\partial^2}{\partial x_2^2}+\frac{\partial^2}{\partial x_3^2}\right)f.\]

In other words, \(\Delta f\) is the trace of the Hessian matrix of \(f\). The trace is used in representation theory to define characters, so there is a chance to find a good connection to the representation of

We shall also not forget the **kernel** of the Laplacian, whose elements are called **harmonic polynomials** of degree \(\ell\) in this context:

\[\mathfrak{H}_\ell = \{f \in P_\ell:\Delta{f}=0\}.\]

Since functions in \(P_\ell\) are homogeneous, the value of \(f\) at a point \(x\) is determined by the value at \(\frac{x}{\|x\|} \in S^2\), the unit sphere. Therefore we also call them **spherical harmonics** of degree \(\ell\). And we certainly need to know the nullity of \(\Delta\) on \(P_\ell\).

Lemma 1.The dimension of\(\mathfrak{H}_\ell\) is \(2\ell+1\).

*Proof.* To begin with, we need to write a more explicit expression of the Laplacian. First of all we perform a Taylor expansion of \(f \in P_\ell\) with respect to the first variable \(x_1\): \[f=\sum_{k=0}^{\ell}\frac{f_k(x_2,x_3)}{k!}x_1^k.\]

Here, \(f_k(x_2,x_3)\) is homogeneous of degree \(\ell-k\) in \(x_2,x_3\). Therefore we only need to study one term of the right hand side.

\[\begin{aligned}\Delta \frac{f_k(x_2,x_3)}{k!}x_1^k &=\frac{f_k(x_2,x_3)}{k!}k(k-1)x_1^{k-2}+\frac{x_1^k}{k!}\left(\frac{\partial^2f_k}{\partial x_2^2}+\frac{\partial^2 f_k}{\partial x_3^2}\right) \\&=\frac{f_k(x_2,x_3)}{(k-2)!}x_1^{k-2}+\frac{x_1^k}{k!}\left(\frac{\partial^2f_k}{\partial x_2^2}+\frac{\partial^2 f_k}{\partial x_3^2}\right)\end{aligned}\]

Now we can put them together naturally:

\[\Delta f =\sum_{k=0}^{\ell-2}\frac{f_{k+2}}{k!}x_1^{k}+\sum_{k=0}^{\ell}\frac{x_1^k}{k!}\left(\frac{\partial^2f_k}{\partial x_2^2}+\frac{\partial^2 f_k}{\partial x_3^2}\right)\]

Let's try to explore the last term a little bit more. If \(k>\ell-2\), then \(f_k\) is of degree at most \(1\) in \(x_2,x_3\), so its second-order derivatives vanish. Therefore

\[\Delta f =\sum_{k=0}^{\ell-2}\frac{x_1^k}{k!}\left[f_{k+2}+\left(\frac{\partial^2f_k}{\partial x_2^2}+\frac{\partial^2 f_k}{\partial x_3^2}\right)\right]\]

Therefore \(\Delta{f}=0\) if and only if, for every \(0 \le k \le \ell-2\),

\[f_{k+2}+\left(\frac{\partial^2 f_k}{\partial x_2^2}+\frac{\partial^2f_k}{\partial x_3^2}\right)=0.\]

Therefore, once \(f_0\) and \(f_1\) are given, the remaining \(f_k\) are determined recursively. Hence

\[\dim \mathfrak{H}_\ell=\dim P_\ell^2+\dim P_{\ell-1}^2\]

where \(P_k^2\) is the space of homogeneous polynomials of degree \(k\) in two variables, hence is isomorphic to \(\mathbb{C}^{k+1}\):

\[\dim P_\ell^2 = {\ell+2-1 \choose \ell}=\ell+1, \quad \dim P_{\ell-1}^2= \ell.\]

Hence

\[\dim \mathfrak{H}_\ell=2\ell+1.\]

\(\square\)
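The count \(\dim\mathfrak{H}_\ell=2\ell+1\) can also be verified by brute force: write the Laplacian \(P_\ell \to P_{\ell-2}\) as a matrix in the monomial basis and compute its nullity by exact Gaussian elimination over \(\mathbb{Q}\). This sketch is not part of the proof; all names are ad hoc.

```python
from fractions import Fraction

def monomials(l):
    """Exponent triples (a, b, c) with a + b + c = l (empty when l < 0)."""
    return [(a, b, l - a - b) for a in range(l + 1) for b in range(l + 1 - a)]

def laplacian_nullity(l):
    """Nullity of the Laplacian P_l -> P_{l-2} on homogeneous polynomials in 3 variables."""
    src, dst = monomials(l), monomials(l - 2)
    index = {m: i for i, m in enumerate(dst)}
    rows = [[Fraction(0)] * len(src) for _ in dst]
    for j, mono in enumerate(src):
        # d^2/dx_k^2 of a monomial contributes e*(e-1) times the monomial with e lowered by 2.
        for k, e in enumerate(mono):
            if e >= 2:
                m = list(mono)
                m[k] -= 2
                rows[index[tuple(m)]][j] += e * (e - 1)
    # Rank via Gauss-Jordan elimination over the rationals.
    rank = 0
    for col in range(len(src)):
        r = next((i for i in range(rank, len(rows)) if rows[i][col] != 0), None)
        if r is None:
            continue
        rows[rank], rows[r] = rows[r], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col] != 0:
                f = rows[i][col] / rows[rank][col]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return len(src) - rank

assert all(laplacian_nullity(l) == 2 * l + 1 for l in range(6))
```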

Recall that \(\dim W_n=2n+1\). This should not be a coincidence, and we shall dive into it right now. To do this we immediately establish the connection between

Lemma 2. The action of the Laplacian on \(C^\infty(\mathbb{R}^3;\mathbb{C})\) (which contains \(P_\ell\) for all \(\ell \ge 0\)) commutes with the action of \(SO(3)\), i.e. \(\Delta\) is \(SO(3)\)-equivariant.

*Proof.* Really routine verification.

As a result, we have a very important consequence:

Theorem 1.The space\(\mathfrak{H}_\ell\) is an \(SO(3)\)-invariant subspace of \(P_\ell\).

We start with a direct observation of matrices in

Lemma 3. Every element in \(SO(3)\) is conjugate to \(R(t)\) for some \(t\), where \[R(t)= \begin{pmatrix}1 & 0 & 0 \\0&\cos{t}&-\sin{t} \\0&\sin{t}&\cos{t} \end{pmatrix}.\]

*Proof.* Pick any \(A \in SO(3)\). First of all we show that \(1\) is an eigenvalue of \(A\). Since

\[\begin{aligned}\det (I-A)&=\det(AA^T-A) \\ &=\det(A(A^T-I)) \\ &=\det(A)\det(A^T-I) \\ &=\det(A-I) \\ &=-\det(I-A)\end{aligned}\]

we therefore have \(\det(I-A)=0\). Hence we can pick \(v_1 \in \ker(I-A) \setminus \{0\}\) with norm \(1\). Extending \(v_1\) to an orthonormal basis and writing \(V\) for the corresponding orthogonal matrix, we obtain

\[V^{-1}AV=R=\begin{pmatrix}1 & 0 & 0 \\ 0&a&b \\ 0&c&d \end{pmatrix} \in SO(3).\]

In particular, \(R \in SO(3)\) alsoimplies

\[\begin{cases}a^2+b^2=1 \\c^2+d^2=1 \\a^2+c^2=1 \\b^2+d^2=1 \\ad-bc=1\end{cases}\]

Solving this system, we must have \(a=d=\cos{t}\) and \(-b=c=\sin{t}\) for some \(t\); that is, \(R=R(t)\). \(\square\)

Since characters are invariant under conjugation, the study of the character of \(SO(3)\) is reduced to \(T\), the subgroup generated by matrices of the form \(R(t)\). But direct computation is a nightmare, so we try our best to do it more elegantly. For this reason, we return to the irreducible representations of \(SU(2)\) (there are only two variables, anyway). The canonical map \(SU(2) \to SO(3)\) has a specific value:

\[e(t)=\begin{pmatrix}\exp(it) & 0 \\0 & \exp(-it)\end{pmatrix} \mapsto\begin{pmatrix}1 & 0 & 0 \\0 & \cos{2t} & -\sin{2t} \\0 & \sin{2t} & \cos{2t}\end{pmatrix}=R(2t).\]

One can refer to

\[\chi_{V_{2n}}(e(t/2))=\sum_{k=0}^{2n}\exp\left(i(2n-2k)\frac{t}{2}\right)=\sum_{k=0}^{2n}\exp(i(n-k)t).\]

Now we are ready for the irreducible representations of

Since we basically have \(\dim \mathfrak{H}_\ell=\dim W_\ell\), it is natural to believe that \(\mathfrak{H}_\ell \cong W_\ell\) in the sense of \(SO(3)\)-modules, and the following theorem answers this question affirmatively.

Theorem 2.The space\(\mathfrak{H}_\ell\) is isomorphic to \(W_\ell\). In other words, irreducible \(SO(3)\)-modules are determined by sphericalharmonics whose Laplacians are \(0\).

*Proof.* We will use the fact that every compact Lie group is completely reducible. (First of all,

Therefore we have

\[\mathfrak{H}_\ell= \bigoplus_{\nu}W_{n_\nu}\]

where each \(W_{n_\nu}\) is an irreducible representation of

\[2\ell+1 = \sum_{\nu}(2n_\nu+1).\]

To prove that

\[\chi_{\mathfrak{H}_\ell}(R(t))=\sum_{n_\nu}\sum_{k=-n_\nu}^{n_\nu}\exp(ikt).\]

In other words, the character is a linear combination of

To do this, we can consider the vector \(f_\ell(x)=(x_2+ix_3)^\ell \in \mathfrak{H}_\ell\). Observe that

\[\begin{aligned}(R(t)f_\ell)(x)&=\left(x_2\cos{t}+x_3\sin{t}+i(-x_2\sin{t}+x_3\cos{t})\right)^\ell\\ &=(e^{-it}x_2+ie^{-it}x_3)^\ell \\ &=e^{-i\ell t}(x_2+ix_3)^\ell \\ &=e^{-i\ell t}f_\ell(x).\end{aligned}\]

\(\square\)
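The eigenvalue computation above is easy to sanity-check numerically (a standalone sketch, not part of the proof; names are ad hoc):

```python
import cmath
import math

def f(l, x2, x3):
    """f_l(x) = (x2 + i*x3)^l, a harmonic homogeneous polynomial of degree l."""
    return (x2 + 1j * x3) ** l

def rotated(l, x2, x3, t):
    """(R(t) f_l)(x) = f_l(x R(t)): rotate the (x2, x3)-plane by R(t), then evaluate."""
    y2 = x2 * math.cos(t) + x3 * math.sin(t)
    y3 = -x2 * math.sin(t) + x3 * math.cos(t)
    return f(l, y2, y3)

# R(t) f_l = exp(-i*l*t) f_l
l, t, x2, x3 = 4, 0.7, 0.3, -1.2
assert abs(rotated(l, x2, x3, t) - cmath.exp(-1j * l * t) * f(l, x2, x3)) < 1e-12
```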

There are no even-dimensional irreducible representations of \(SO(3)\). This is what every reader has to take away from this post.

We computed the eigenvalue because it shows that \(\exp(-i\ell t)\) appears in the summands of \(\chi_{\mathfrak{H}_\ell}(R(t))\), hence \(|-\ell|=\ell \le \max n_\nu\). Since \(\{n_\nu\}\) is finite, the maximum is attained, and therefore our argument on dimensions is complete. The representations of \(U(2)\) can be deduced algebraically, for one only needs to notice that

\(U(2) = (S^1 \times SU(2))/H\), where \(H=\{(1,I),(-1,-I)\}\). One will also need an odd-even argument just like we did for \(SO(3)\). Likewise, since \(O(3)=SO(3) \times \mathbb{Z}/2\mathbb{Z}\), we can deduce irreducible representations of \(O(3)\) in a similar fashion.

Representation theory is important in various branches of mathematics and physics. When studying representations of finite groups, we have quite some algebra and combinatorics. When differentiation (more precisely, smoothness) joins the party, we have Lie groups, involving calculus, linear algebra, geometry and much more. Especially, theories around \(SU(2)\) and \(SO(3)\)

In this post we develop a way to study irreducible representations of these two Lie groups, in a mathematician's way. I try my best to make sure that everything is down-to-earth, and everything can be "reduced" to 19th-century (pre-modern) mathematics.

Nevertheless, the reader is assumed to be familiar with the elementary language of representation theory (and, you know, there is a lot of abuse of language), which I think is not a problem because otherwise you wouldn't be reading this post. You need to recall eigenvalue theory in linear algebra, as well as Fourier series. We need the fact that the trigonometric system is complete; in other words, trigonometric polynomials are dense in the space of continuous functions.

We will first study \(SU(2)\) and a first classification of irreducible representations of

\[SU(2)/\{-I,I\} \cong SO(3).\]

This is to say, \(SU(2)\) is a "double cover" of \(SO(3)\). To see this, notice that \(SU(2) \cong S^3\) and \(SO(3) \cong \mathbb{R}P^3\) as Lie groups, meanwhile \(\mathbb{R}P^3 \cong S^3/\{-1,1\}\) can be considered as the definition.

Of course, by representation we mean finite dimensional and unitaryrepresentations.

Indeed, it seems we have nowhere to start. Instead of trying to find all of them, we will work on some seemingly immediate representations, and it turns out that they are all we are looking for.

Let \(V_0\) be the trivial representation on \(\mathbb{C}\) and \(V_1\) be the standard representation on \(\mathbb{C}^2\), which is given by ordinary matrix multiplication. These representations are irreducible. We want to extend this family to

- Direct sum: \(\bigoplus_{i=1}^{n}V_1\). The dimension is \(2n\) and unfortunately, the representation is determined by each component, so essentially there is no "new thing".
- Tensor product: \(\bigotimes_{i=1}^{n}V_1\). The dimension is \(2^n\), which is way too big.
- Wedge product: \(\bigwedge^{n}V_1\). It stops at \(n=2\) and we have to deal with \(u \wedge v = - v \wedge u\). This can be annoying.
- Symmetric product: \(\operatorname{Sym}^{n}V_1\). The dimension is \(n+1\) and it doesn't stop. Besides, it can be understood as homogeneous polynomials of degree \(n\) in two variables. This is a fantastic choice. Besides, we have \(\operatorname{Sym}^0 V_1=V_0\), so nothing is abruptly excluded.

Put \(V_n=\operatorname{Sym}^nV_1\), which can be understood as the space of homogeneous polynomials of degree \(n\) in the two variables \(z_1,z_2\), with basis

\[P_k=z_1^k z_2^{n-k},\quad k=0,1,\dots,n.\]

And we will make use of it later.

For each \(g \in SU(2)\), we have aleft action

\[\begin{aligned}\rho:SU(2) &\to \operatorname{Aut}(V_n), \\ g &\mapsto (P(z) \mapsto P(zg)).\end{aligned}\]

In other words,

\[g=\begin{pmatrix}\alpha & \beta \\-\overline{\beta} & \overline{\alpha}\end{pmatrix}, \quad |\alpha|^2+|\beta|^2=1.\]

Then

\[zg=(\alpha z_1-\overline{\beta}z_2,\beta z_1+\overline{\alpha}z_2)\]

When there is no confusion, we will write

Since \(z \mapsto zg\) is a homogeneous map of degree \(1\) (it is linear and non-degenerate), \(P(zg)\) is again homogeneous of degree \(n\). **We now have a well-defined representation.** Note

Proposition 1.The representations\(V_n\) are irreducible.

*Proof.* By Schur's lemma, we need to show that each \(SU(2)\)-equivariant linear map \(A:V_n \to V_n\) is a scalar.

The group \(SU(2)\) can be complicated, but \(U(1) \cong S^1\) is simple and can be considered as a subgroup of \(SU(2)\).

First of all we embed \(S^1\) into\(SU(2)\) by

\[a \mapsto \begin{pmatrix}a & 0 \\ 0 & a^{-1} \end{pmatrix}.\]

Call the matrix on the right hand side \(g_a\). Then

\[g_a P_k=(az_1)^{k}(a^{-1}z_2)^{n-k}=a^{2k-n}z_1^kz_2^{n-k}=a^{2k-n}P_k\]

for all \(k\). This is to say, \(P_k\) is an **eigenvector** of \(\rho(g_a)\) corresponding to the **eigenvalue** \(a^{2k-n}\).
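This eigenvalue computation can be replayed numerically (an ad hoc sketch, not part of the argument):

```python
import cmath

def act(a, k, n, z1, z2):
    """(g_a P_k)(z) = P_k(z g_a) with g_a = diag(a, 1/a) and P_k = z1^k z2^(n-k)."""
    return (a * z1) ** k * (z2 / a) ** (n - k)

n, z1, z2 = 5, 0.4 + 0.2j, -1.1 + 0.6j
a = cmath.exp(0.9j)  # a point of the embedded circle U(1)
for k in range(n + 1):
    # P_k is an eigenvector with eigenvalue a^(2k-n)
    expected = a ** (2 * k - n) * z1 ** k * z2 ** (n - k)
    assert abs(act(a, k, n, z1, z2) - expected) < 1e-12
```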

Since the \(\{P_k\}\) are linearly independent, under this basis we have a matrix representation

\[\rho(g_a) = \operatorname{diag}(a^{-n},a^{-n+2},\dots,a^{n-2},a^n).\]

but we don't know how eigenspaces are spanned because we may have \(a^j=a^k\) for \(j \neq k\).

On the other hand, since \(A\) commutes with the action,

\[g_a AP_k = Ag_a P_k = A a^{2k-n}P_k = a^{2k-n}AP_k.\]

Hence \(AP_k\) lies in the eigenspace of \(a^{2k-n}\). Choosing \(a\) so that the numbers \(a^{2k-n}\) are pairwise distinct, each eigenspace is spanned by a single \(P_k\); hence \(AP_k=c_kP_k\) and, under the basis \(\{P_k\}\),

\[A = \begin{pmatrix}c_0 & & \\ & \ddots & \\ & & c_n\end{pmatrix}.\]

We want this matrix to be a scalar matrix. The result follows from another embedding of \(U(1)\) into \(SU(2)\). Note

\[g_t = \begin{pmatrix}\cos{t} & -\sin{t} \\\sin{t} & \cos{t}\end{pmatrix} \in SU(2).\]

Still we have \(Ag_t=g_tA\). As wecan see,

\[\begin{aligned}Ag_tP_n &= A(z_1\cos{t}+z_2\sin{t})^n \\ &= A\sum_{k=0}^{n}{n \choose k}(z_1\cos{t})^k(z_2\sin{t})^{n-k} \\ &= A\sum_{k=0}^{n}{n \choosek}(\cos{t})^k(\sin{t})^{n-k}z_1^k z_2^{n-k} \\ &= A\sum_{k=0}^{n}{n \choose k}(\cos{t})^k(\sin{t})^{n-k}P_k \\ &= \sum_{k=0}^{n}{n \choose k}(\cos{t})^k(\sin{t})^{n-k}AP_k \\ &= \sum_{k=0}^{n}{n \choose k}(\cos{t})^k(\sin{t})^{n-k}c_kP_k.\end{aligned}\]

This follows from our observation on eigenvalues. Next, we immediately use the eigenvalue \(c_n\) to obtain

\[g_t AP_n = g_t c_nP_n = c_n \sum_{k=0}^{n}{n \choosek}(\cos{t})^k(\sin{t})^{n-k} P_k.\]

Comparing the coefficients of \(P_k\) in the two expressions, we obtain \(c_k=c_n\) for all \(k\). Hence \(A=c_nI\) is a scalar matrix, and the proof is complete. \(\square\)

So far we have used diagonalisation of representations of

Let's recall diagonalisation in \(SU(2)\). Every \(g \in SU(2)\) is diagonalisable:

\[g \sim \begin{pmatrix}\lambda & 0 \\ 0 & \lambda^{-1}\end{pmatrix} \sim \begin{pmatrix}\lambda^{-1} & 0 \\ 0 &\lambda \end{pmatrix}\]

where \(\lambda\) is one of the eigenvalues of \(g\). Since the diagonalised matrix is still in

\[e(t) = \begin{pmatrix} \exp(it) & 0 \\ 0 & \exp(-it)\end{pmatrix}.\]

We see, \(e(s) \sim e(t)\) if and only if \(s = \pm t \mod 2\pi\). By periodicity of the \(\exp\) function, we also see \(e(t)\) is in particular \(2\pi\)-periodic. If

Define \(\Lambda:SU(2) \to S^1\) sending \(g \in SU(2)\) to the eigenvalue of \(g\) with non-negative imaginary part (one can also pick the non-positive one, because

With the help of this \(e(t)\) and

\[\{\text{Class functions }SU(2) \to \mathbb{C}\} \longleftrightarrow \{\text{even }2\pi\text{-periodic functions }\mathbb{R} \to \mathbb{C}\}\]

Recall that the space on the right hand side has a countable uniform basis

\[1,\cos{t},\cos{2t},\dots.\]

In other words, \(\{\cos{nt}\}_{n \ge 0}\) spans a dense subspace. This is about the completeness of the trigonometric system; since we only deal with even functions, the cosines suffice. See *Real and Complex Analysis* by W. Rudin.

For class functions, we certainly want to know about characters. Let \(\chi_n\) be the character of \(V_n\). Then

\[\chi_n(e(t))=\operatorname{tr}(\rho(e(t)))=\operatorname{tr}(\operatorname{diag}(\exp(it)^{-n},\dots,\exp(it)^n))=\sum_{k=0}^{n}e^{i(n-2k)t}.\]

When \(t \in \pi\mathbb{Z}\), then \(\chi_n(e(t)) \in \mathbb{Z}\). Otherwise, as a classic exercise in calculus, we have

\[\kappa_n(t)=\chi_n(e(t))=\frac{\sin(n+1)t}{\sin{t}}.\]

We have \(\kappa_0(t)=1\). For \(n \ge 1\),

\[\kappa_n(t)=\frac{\cos{nt}\sin{t}+\sin{nt}\cos{t}}{\sin{t}}=\cos{nt}+\kappa_{n-1}(t)\cos{t}.\]

We see \(\kappa_1(t)=2\cos{t}\). By induction, every \(\kappa_n(t)\) is a polynomial in \(\cos{t}\).
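The recurrence and the closed form \(\kappa_n(t)=\frac{\sin(n+1)t}{\sin t}\) agree numerically (an illustration only; names are ad hoc):

```python
import math

def kappa(n, t):
    """chi_n(e(t)) computed via the recurrence kappa_n = cos(nt) + kappa_{n-1} cos(t)."""
    k = 1.0  # kappa_0
    for m in range(1, n + 1):
        k = math.cos(m * t) + k * math.cos(t)
    return k

t = 0.83
for n in range(6):
    assert abs(kappa(n, t) - math.sin((n + 1) * t) / math.sin(t)) < 1e-12
```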

The argument above shows that

Proposition 2. For a continuous class function \(f:SU(2) \to \mathbb{C}\), we have \[\int_{SU(2)}f(x)dx = \frac{2}{\pi}\int_0^\pi f \circ e(t)\sin^2{t}\,dt.\]

*Proof.* On one hand, since the

\[\int_{SU(2)}\chi_n(x)dx = \dim V_n^{SU(2)} = \begin{cases} 1 & n=0,\\ 0 & n>0. \end{cases}\]

Here, for a group \(G\) and a representation \(V\), \(V^G\) denotes the subspace of \(G\)-invariant vectors.

On the right hand side, we are looking for even \(2\pi\)-periodic functions, and for \(f=\chi_n\),

\[\frac{2}{\pi}\int_{0}^{\pi}\kappa_n(t)\sin^2{t}\,dt = \frac{2}{\pi}\int_{0}^{\pi}\sin((n+1)t)\sin{t}\,dt = \begin{cases} 1 & n=0, \\ 0 & n>0. \end{cases}\]

Since the functional \(h \mapsto \frac{2}{\pi}\int_{0}^{\pi}h(t)\sin^2{t}\,dt\) is continuous in the uniform topology and \(\{\kappa_n\}\) spans a dense subspace, the result is now obtained. \(\square\)
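With the normalisation \(\frac{2}{\pi}\int_0^\pi(\cdot)\sin^2 t\,dt\) (which makes the constant function integrate to \(1\)), the orthonormality of the characters can be checked by a crude midpoint rule. A numerical sketch only, with ad hoc names:

```python
import math

def kappa(n, t):
    """chi_n(e(t)) = sin((n+1)t)/sin(t), extended by its limit n+1 at sin(t) = 0."""
    s = math.sin(t)
    return math.sin((n + 1) * t) / s if s else float(n + 1)

def inner(n, m, steps=20000):
    """<chi_n, chi_m> = (2/pi) * integral_0^pi kappa_n kappa_m sin^2(t) dt (midpoint rule)."""
    h = math.pi / steps
    ts = [(j + 0.5) * h for j in range(steps)]
    return 2 / math.pi * h * sum(kappa(n, t) * kappa(m, t) * math.sin(t) ** 2 for t in ts)

assert abs(inner(0, 0) - 1) < 1e-5
assert abs(inner(3, 3) - 1) < 1e-5
assert abs(inner(2, 5)) < 1e-5
```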

Finally, surprisingly and satisfyingly enough, this denseness has actually ruled out all other possibilities of irreducible representations. In other words, our search within symmetric products is exhaustive. We can see this through Parseval's identity. This is the heart of this blog post.

Proposition 3.Every irreducible representation of\(SU(2)\) is isomorphic to one of the\(V_n\).

*Proof.* Suppose we have a character \(\chi\) that is different from all of the \(\chi_n\). Then orthonormality shows that \(\langle \chi,\chi_n \rangle = 0\) for all \(n \ge 0\) and \(\langle \chi,\chi \rangle=1\). Now let's see why this is absurd.

Since \(\{\chi_n\}_{n \ge 0}\) spans a dense subspace in the space of class functions, we actually have

\[\chi = \sum_{n = 0 }^{\infty} a_n \chi_n.\]

Therefore

\[\langle \chi,\chi_n \rangle = \int_{SU(2)}\overline\chi(x)\chi_n(x)dx =a_n=0,\quad n \ge 0\]

and

\[\langle \chi,\chi \rangle = \sum_{n=0}^{\infty}\langle a_n\chi_n,a_n\chi_n \rangle = \sum_{n=0}^{\infty}|a_n|^2=1.\]

It is impossible to have a sum of zeros equal to \(1\), a contradiction. \(\square\)

Now we head to \(SO(3)\). In fact, the result follows almost immediately from the surjection

\[\pi:SU(2) \to SO(3).\]

We have \(\ker\pi=\{-I,I\}\). Let \(W\) be a representation of \(SO(3)\):

\[\rho:SO(3) \to GL(W).\]

Then

\[\pi^\ast\rho:SU(2) \to GL(W)\]

by \(g \mapsto \rho(\pi(g))\) is an induced representation, and we write

On the other hand, if \(\vartheta:SU(2) \to GL(V)\) is an irreducible representation on which \(-I\) acts as the identity, then we obtain

\[\pi_\ast\vartheta:SO(3) \cong SU(2)/\{I,-I\} \to GL(V)\]

given by \(g\ker\pi \mapsto\vartheta(g)\). Let's denote it by

Therefore we have realised a correspondence

\[\{\text{Irreducible representations of $SO(3)$}\} \\\updownarrow \\\{\text{Irreducible representations of $SU(2)$ where $-I$ acts as identity}\}\]

So it remains to determine the irreducible representations of \(SU(2)\) on which \(-I\) acts as the identity. For \(V_n\),

\[\rho_n(-I)P(z)=P(z(-I))=P(-z)=(-1)^nP(z)\]

because \(P \in \mathbb{C}[z_1,z_2]\) is homogeneous of degree \(n\). Hence \(-I\) acts as the identity precisely when \(n\) is even. We conclude:

Proposition 4. Every irreducible representation of \(SO(3)\) is of the form \[W_n = \pi_\ast V_{2n},\]

where \(V_{2n}\) is as described above.

This is, of course, just a first classification. But to introduce a classification as explicit as what we have done for

Let \(P_{\ell}\) be the complex vector space of homogeneous polynomials in three variables of degree \(\ell\), which can be considered as functions on \(\mathbb{R}^3\) immediately. This setting makes sense immediately, just as what we have done for \(SU(2)\). Then, in fact,

\[W_\ell=\mathfrak{H}_\ell = \{f \in P_\ell:\Delta f=0\}.\]

This is to say, \(W_\ell\) can be understood as the space of harmonic homogeneous polynomials of degree \(\ell\) in three variables.

- Theodor Bröcker and Tammo tom Dieck, *Representations of Compact Lie Groups*.
- Walter Rudin, *Real and Complex Analysis, 3rd Edition*.

\[f_c(x)=\exp(-cx^2).\]

We want to compute the Fourier transform

\[\hat{f}_c(t)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}f_c(x)e^{-ixt}dx.\]

As one can expect, the computation can be quite interesting, as

\[\int_{-\infty}^{+\infty}f_c(x)dx=\frac{1}{\sqrt{c}}\int_{-\infty}^{+\infty}\exp(-(\sqrt{c}x)^2)d\sqrt{c}x=\sqrt\frac{\pi}{c}.\]

Now we dive into this integral and see what we can get.

Let's admit it: trying to compute the integral straightforwardly is somewhat unrealistic, so we need to go through an alternative way. For convenience (of writing MathJax code) we write \(\varphi(t)=\hat{f}_c(t)\).

First of all, \(\hat{f}_c(t)\) is always well-defined, because

\[\int_{-\infty}^{+\infty}|f_c(x)e^{-ixt}|dx=\int_{-\infty}^{+\infty}|f_c(x)|dx<\infty\]

so we can compute it without worrying about anything.

The trick is hard to come up with, but it works: an integration by parts gives

\[\begin{aligned}\varphi(t)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}\exp(-cx^2)e^{-itx}dx&=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}\exp(-cx^2)\frac{1}{-it}de^{-itx}\\&=\frac{i}{t\sqrt{2\pi}}[\exp(-cx^2)e^{-itx}]|_{-\infty}^{+\infty}\\&\quad-\frac{i}{t\sqrt{2\pi}}\int_{-\infty}^{+\infty}e^{-itx}d\exp(-cx^2) \\&=\frac{-2c}{t\sqrt{2\pi}}\int_{-\infty}^{+\infty}-xi\exp(-cx^2)e^{-itx}dx\end{aligned}\]

On the other hand, we have

\[\varphi'(t)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}-ixf_c(x)e^{-itx}dx.\]

(The well-definedness of the integral can be verified easily.) Combining both, we obtain a differential equation

\[t\varphi(t)+2c\varphi'(t)=0\]

This differential equation corresponds to an integral equation

\[\int2c\frac{d\varphi}{\varphi}=-\int tdt.\]

And we solve it to obtain

\[2c\log\varphi=-\frac{1}{2}t^2+C\]

or alternatively,

\[\varphi(t)=C\exp(-\frac{1}{4c}t^2).\]

Now put the initial value back in. As shown above, by the Gaussian integral,

\[\varphi(0)=\frac{1}{\sqrt{2\pi}}\sqrt{\frac{\pi}{c}}=\frac{1}{\sqrt{2c}}.\]

Therefore

\[\hat{f}_c(t)=\varphi(t)=\frac{1}{\sqrt{2c}}\exp\left(-\frac{1}{4c}t^2\right)\]

is exactly what we want.
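The closed form can be confirmed by discretising the defining integral directly; here is a crude midpoint-rule sketch (not part of the derivation, and the names are ad hoc):

```python
import math

def f_hat(c, t, R=30.0, steps=100000):
    """(1/sqrt(2*pi)) * integral of exp(-c x^2) e^{-ixt} dx over [-R, R], midpoint rule.
    Only the real part is computed; the imaginary part vanishes since f_c is even."""
    h = 2 * R / steps
    s = sum(math.exp(-c * x * x) * math.cos(x * t)
            for x in (-R + (j + 0.5) * h for j in range(steps)))
    return s * h / math.sqrt(2 * math.pi)

c, t = 0.5, 1.3
closed = math.exp(-t * t / (4 * c)) / math.sqrt(2 * c)
assert abs(f_hat(c, t) - closed) < 1e-4
```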

Before showing another method, we first have a question: can we have \(\hat{f}_c=f_c\)? Solving an equation in the variable \(c\) answers this question affirmatively:

\[\hat{f}_c=f_c \iff \begin{cases}\frac{1}{\sqrt{2c}}=1 \\-\frac{1}{4c}=-c \end{cases} \iff c = \frac{1}{2}.\]

In other words, \(f_{1/2}\) is a fixed point of the Fourier transform. Within this class of functions, it is the only one.

We can also make use of the Gaussian integral to get what we want.

\[\begin{aligned} \varphi(t) &=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}\exp(-cx^2)e^{-itx}dx \\ &= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \exp\left[-c\left( x+\frac{it}{2c} \right)^2-\frac{t^2}{4c}\right]dx \\ &= \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{t^2}{4c} \right)\int_{-\infty}^{\infty}\exp \left[-c\left( x+\frac{it}{2c} \right)^2\right]dx \\ &= \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{t^2}{4c} \right)\sqrt{\frac{\pi}{c}} \\ &=\frac{1}{\sqrt{2c}}\exp\left(-\frac{t^2}{4c}\right)\end{aligned}\]

As a classic property of the Fourier transform, for \(f,g \in L^1\),

\[\widehat{f \ast g}(t)=\hat{f}(t)\hat{g}(t)\]

where

\[f \ast g(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}f(x-y)g(y)dy.\]

By the way, \(f \in L^1\) means \(\int_{-\infty}^{+\infty}|f(x)|dx<\infty\).

With this result, we can compute \(f_a \ast f_b\) easily. Note

\[\widehat{f_a \ast f_b}(t)=\hat{f_a}(t)\hat{f_b}(t)=\frac{1}{2\sqrt{ab}}\exp\left[-\left(\frac{1}{4a}+\frac{1}{4b}\right)t^2\right].\]

Now let's see if we can have \(f_a \ast f_b = \gamma f_c\) for some constants \(\gamma\) and \(c\). Comparing the exponents, we need

\[\frac{1}{4c}=\frac{1}{4a}+\frac{1}{4b} \implies c=\frac{1}{\frac{1}{a}+\frac{1}{b}}=\frac{ab}{a+b}.\]

We also have

\[\gamma \frac{1}{\sqrt{2c}}=\frac{1}{2\sqrt{ab}} \implies \gamma =\sqrt\frac{c}{2ab}=\sqrt{\frac{1}{2(a+b)}}\]

Therefore

\[f_a \ast f_b = \sqrt{\frac{1}{2(a+b)}}f_c\]

where \(c\) is given above. We do not even have to compute the integral explicitly.
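The convolution identity is also easy to test numerically (a sketch with ad hoc names; the integral is truncated to \([-30,30]\), where the Gaussians are negligible):

```python
import math

def gauss(c, x):
    """f_c(x) = exp(-c x^2)."""
    return math.exp(-c * x * x)

def conv(a, b, x, R=30.0, steps=100000):
    """(f_a * f_b)(x) with the 1/sqrt(2*pi) normalisation, by a midpoint rule."""
    h = 2 * R / steps
    s = sum(gauss(a, x - y) * gauss(b, y)
            for y in (-R + (j + 0.5) * h for j in range(steps)))
    return s * h / math.sqrt(2 * math.pi)

a, b, x = 1.0, 2.0, 0.7
c = a * b / (a + b)
gamma = math.sqrt(1 / (2 * (a + b)))
assert abs(conv(a, b, x) - gamma * gauss(c, x)) < 1e-5
```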

This post is intended to supply a detailed proof of the Riemann mapping theorem.

Riemann mapping theorem. Every simply connected region \(\Omega \subsetneq \mathbb{C}\) is conformally equivalent to the open unit disc \(U\).

Fortunately, the proof can be found in many textbooks on complex analysis, but it is fairly technical and can be painful to read. This post can be considered a painkiller: in it you will see the proof filled in with many details. However, the writer still encourages the reader to reproduce the proof with their own pen and paper, and hopes that this post increases the accessibility of this theorem and its proof.

However, there is a bar: we need to assume some background in complex analysis, although it is quite basic. The minimal prerequisite is being able to answer the following questions.

- Contour integration and Cauchy's formula.
- Almost uniform convergence. Let \(\Omega \subset \mathbb{C}\) be open and suppose that \(f_j \in H(\Omega)\) for all \(j=1,2,\dots\), and \(f_j \to f\) uniformly on every compact subset \(K \subset \Omega\). Does \(f \in H(\Omega)\)? What is the uniform limit of \(f'_j\)? Informally, we call the phenomenon that a sequence of functions converges uniformly on every compact subset *almost uniform* convergence. This has nothing to do with *almost everywhere* in integration theory; in fact, this post does not require background in Lebesgue integration theory.
- The open mapping theorem (complex analysis version).
- The maximum modulus principle and some variants.
- Rouché's theorem, or even more, the calculus of residues.

Despite the prerequisites, we still need some preparation beforehand.

Definition 1. Let \(X\) be a connected topological space. We say \(X\) is simply connected if every closed curve is null-homotopic. Here, a closed curve is a continuous map \(\gamma:[0,1] \to X\) such that \(\gamma(0)=\gamma(1)\), and \(\gamma\) is null-homotopic if it is homotopic to a constant map \(\gamma_0:[0,1] \to \{x\}\) with \(x \in X\).

Intuitively, if \(X\) is simply connected, then \(X\) contains no "hole". For example, the unit disc \(U\) is simply connected.

There are a lot of good properties of simply connected regions, which will be summarised below.

Proposition 1. For a region \(\Omega\) (an open and connected subset of \(\mathbb{R}^2\)), the following conditions are equivalent. Each one implies the other eight.

1. \(\Omega\) is homeomorphic to the open unit disc \(U\).
2. \(\Omega\) is simply connected.
3. \(\operatorname{Ind}_\gamma(\alpha)=0\) for every closed path \(\gamma\) in \(\Omega\) and every \(\alpha \in S^2 \setminus \Omega\), where \(S^2\) is the Riemann sphere.
4. \(S^2 \setminus \Omega\) is connected.
5. Every \(f \in H(\Omega)\) can be approximated by polynomials, almost uniformly.
6. For every \(f \in H(\Omega)\) and every closed path \(\gamma\) in \(\Omega\), \[\int_\gamma f(z)\mathrm{d}z=0.\]
7. Every \(f \in H(\Omega)\) has an anti-derivative; that is, there exists an \(F \in H(\Omega)\) such that \(F'=f\).
8. If \(f \in H(\Omega)\) and \(1/f \in H(\Omega)\), then there exists a \(g \in H(\Omega)\) such that \(f=\exp{g}\).
9. For such \(f\), there also exists a \(\varphi \in H(\Omega)\) such that \(f=\varphi^2\).

5~9 are pretty much saying that calculus is fine here and we are not worrying about nightmare counterexamples, to some extent. Most of the implications \(n \implies n+1\) are not that difficult, but some deserve a mention. 4 implying 5 is a consequence of Runge's theorem. In the implication of 7 to 8, one needs to use the fact that \(\Omega\) is connected. When we have \(f=\exp{g}\), we can put \(\varphi=\exp(g/2)\) to obtain 9.

If you know the definition of sheaf, you will realise that

Since we haven't proved the Riemann mapping theorem, we cannot use the equivalence above yet. However, we can use 9 right away. This gives rise to Koebe's square root trick.

Equicontinuity is quite an important concept. You may have seen it in differential equations, harmonic functions, or just sequences of functions. We will use it to describe a family of functions for which almost uniform convergence can be well established.

Definition 2. Let \(\mathscr{F}\) be a family of functions \((X,d) \to \mathbb{C}\), where \((X,d)\) is a metric space.

- We say that \(\mathscr{F}\) is equicontinuous if, to every \(\varepsilon>0\), there corresponds a \(\delta>0\) such that whenever \(d(x,y)<\delta\), we have \(|f(x)-f(y)|<\varepsilon\) for all \(f \in \mathscr{F}\). In particular, by definition, all functions in \(\mathscr{F}\) are uniformly continuous.
- We say that \(\mathscr{F}\) is pointwise bounded if, to every \(x \in X\), there corresponds some \(0 \le M(x) < \infty\) such that \(|f(x)| \le M(x)\) for every \(f \in \mathscr{F}\).
- We say that \(\mathscr{F}\) is uniformly bounded on each compact subset if, to each compact \(K \subset X\), there corresponds a number \(M(K)\) such that \(|f(z)| \le M(K)\) for all \(f \in \mathscr{F}\) and \(z \in K\).

These concepts are about "a family of" continuity and boundedness. In our proof of the Riemann mapping theorem, we do not construct the map explicitly; instead, we will use the concepts above to obtain one (which is a limit) that exists. In this post we simply put \(X=\Omega \subset \mathbb{C}\), a simply connected region, and \(d\) is the natural metric.

A famous result on equicontinuity is the Arzelà-Ascoli theorem, which says that pointwise boundedness plus equicontinuity yields almost uniformly convergent subsequences.

Theorem 1 (Arzelà-Ascoli). Let \(\mathscr{F}\) be a pointwise bounded and equicontinuous family of complex functions on a metric space \(X\), and suppose that \(X\) is separable, i.e., it contains a countable dense set. Then every sequence \(\{f_n\}\) in \(\mathscr{F}\) has a subsequence that converges uniformly on every compact subset of \(X\).

Here is a self-contained

Certainly it is OK to let \(X\) be a subset of \(\mathbb{R}\),

Definition 3. Suppose \(\mathscr{F} \subset H(\Omega)\) for some region \(\Omega \subset \mathbb{C}\). We call \(\mathscr{F}\) a normal family if every sequence of members of \(\mathscr{F}\) contains a subsequence which converges uniformly on every compact subset of \(\Omega\). The limit function is not required to be in \(\mathscr{F}\).

We now apply Arzelà-Ascoli to complex analysis.

Theorem 2 (Montel). Suppose \(\mathscr{F} \subset H(\Omega)\) is uniformly bounded on each compact subset of \(\Omega\); then \(\mathscr{F}\) is a normal family.

*Proof.* We need to show that

Let \(\{K_n\}\) be a sequence of compact sets such that (1) \(\bigcup_n K_n = \Omega\) and (2) \(K_n \subset K^\circ_{n+1} \subset K_{n+1}\), where \(K^\circ_{n+1}\) is the interior of \(K_{n+1}\). For **every** \(z \in K_n\), there exists a positive number \(\delta_n\) such that

For such \(\delta_n\), we pick

What does this inequality imply? For **restrictions** of the members of

Now consider a sequence \(\{f_j\}\) in \(\mathscr{F}\). For each

**Remarks.** We have no idea what the limit is, and this happens in our proof of the Riemann mapping theorem as well.

The sequence \(\{K_n\}\) can be constructed explicitly, however. In fact, for every open set

- \(\bigcup_n K_n=\Omega\).
- \(K_n \subset K_{n+1}^\circ\).
- For every compact \(K \subset \Omega\), there is some \(n\) such that \(K \subset K_n\).
- Every component of \(S^2 \setminus K_n\) contains a component of \(S^2 \setminus \Omega\).

The set is constructed as follows and can be verified to satisfy what we want above. For each \(n\), define \[V_n = D(\infty,n) \cup \bigcup_{a \not\in \Omega}D(a,1/n).\] Then \(K_n=S^2 \setminus V_n\) is what we want.

The Schwarz lemma is another important tool for our proof of the Riemann mapping theorem. We need this lemma to establish important inequalities. This lemma, as well as its variants, shows the rigidity of holomorphic maps. We make use of the maximum modulus theorem. For simplicity, let \(U\) denote the open unit disc and \(H^\infty\) the space of bounded holomorphic functions on \(U\).

Theorem 3 (Schwarz lemma). Suppose \(f:U \to \mathbb{C}\) is a holomorphic map in \(H^\infty\) such that \(f(0)=0\) and \(\|f\|_\infty \le 1\); then \[\begin{aligned}|f(z)| &\le |z| \quad (z \in U), \\|f'(0)| &\le 1.\end{aligned}\] On the other hand, if \(|f(z)|=|z|\) holds for some \(z \in U \setminus \{0\}\), or if \(|f'(0)|=1\) holds, then \(f(z)=\lambda{z}\) for some complex constant \(\lambda\) such that \(|\lambda|=1\).
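Before the proof, a quick numerical sanity check may help. This is a sketch only: the sample map \(f(z)=z\cdot\varphi_a(z)\), with \(\varphi_a(z)=(z-a)/(1-\bar a z)\) and the point \(a=0.3+0.4i\), is an arbitrary choice satisfying the hypotheses \(f(0)=0\) and \(\|f\|_\infty\le 1\).

```python
import cmath
import random

a = 0.3 + 0.4j  # an arbitrary point of U (a choice of this demo)

def f(z):
    # f(0) = 0 and |f| <= 1 on U, since |(z - a)/(1 - conj(a) z)| <= 1 there
    return z * (z - a) / (1 - a.conjugate() * z)

random.seed(0)
for _ in range(1000):
    # sample a random point of the open unit disc
    r, t = random.random(), random.uniform(0, 2 * cmath.pi)
    z = r * cmath.exp(1j * t)
    assert abs(f(z)) <= abs(z) + 1e-12  # the Schwarz bound |f(z)| <= |z|

# |f'(0)| = |a| = 0.5 <= 1, estimated here by a central difference
h = 1e-6
fp0 = abs((f(h) - f(-h)) / (2 * h))
assert abs(fp0 - 0.5) < 1e-4
```

The second inequality of the lemma is visible too: for this map \(f'(0)=-a\), so \(|f'(0)|=|a|<1\), and equality would force \(f\) to be a rotation.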

*Proof.* Since \(f(0)=0\), \(f(z)/z\) has a removable singularity at \(z=0\). Hence there exists

There are many variants of the Schwarz lemma, and we will be using the Schwarz-Pick lemma.

Definition 4. For any \(\alpha \in U\), define \[\varphi_\alpha(z) = \frac{z-\alpha}{1-\overline\alpha z}.\]

This family is a subfamily of the Möbius transformations, but we are not paying very much attention to it right now. We need the fact that such a \(\varphi_\alpha\) is always a one-to-one mapping which carries

Theorem 4 (Schwarz-Pick lemma). Suppose \(\alpha,\beta \in U\), \(f \in H^\infty\) with \(\|f\|_\infty \le 1\), and \(f(\alpha)=\beta\). Then \[|f'(\alpha)| \le \frac{1-|\beta|^2}{1-|\alpha|^2}.\]

*Proof.* Consider \[g=\varphi_\beta \circ f \circ \varphi_{-\alpha}.\] We see \(g \in H^\infty\) and \(\|g\|_\infty \le 1\). What's more important, \(g(0)=\varphi_\beta \circ f(\alpha)=\varphi_\beta(\beta)=0\). By the Schwarz lemma,
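The inequality can also be watched numerically. A sketch, with the sample map \(f(z)=z^2\) (a choice of this demo): then \(\beta=\alpha^2\), \(f'(\alpha)=2\alpha\), and the Schwarz-Pick bound reduces to \(2|\alpha| \le 1+|\alpha|^2\).

```python
import random

random.seed(1)
for _ in range(1000):
    # random alpha in the open unit disc (|alpha| <= 0.7 * sqrt(2) < 1)
    alpha = complex(random.uniform(-0.7, 0.7), random.uniform(-0.7, 0.7))
    beta = alpha ** 2                                  # f(alpha) for f(z) = z^2
    lhs = 2 * abs(alpha)                               # |f'(alpha)|
    rhs = (1 - abs(beta) ** 2) / (1 - abs(alpha) ** 2) # simplifies to 1 + |alpha|^2
    assert lhs <= rhs + 1e-12
```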

Each \(z \ne 0\) determines a *direction* from the origin, which can be described by \(A(z)=z/|z|\); this is used to define what it means for a map to *preserve angles* at

Conformal mappings preserve angles in a reasonable way. A function \(f\) is **conformal** if it is holomorphic and \(f'(z) \ne 0\) everywhere. We have a theorem that describes this, but it is pretty elementary, so we are not including the proof in this post.

Theorem 5. Let \(f\) map a region \(\Omega\) into the plane. If \(f'(z_0)\) exists at some \(z_0 \in \Omega\) and \(f'(z_0) \ne 0\), then \(f\) preserves angles at \(z_0\). Conversely, if the differential \(Df\) exists and is different from \(0\) at \(z_0\), and if \(f\) preserves angles at \(z_0\), then \(f'(z_0)\) exists and is different from \(0\).

We say two regions \(\Omega_1\) and \(\Omega_2\) are **conformally equivalent** if there is a conformal one-to-one mapping of \(\Omega_1\) onto \(\Omega_2\).

Theorem 6 (Riemann mapping theorem). Every proper simply connected region \(\Omega\) in the plane is conformally equivalent to the open unit disc \(U\).

As a famous example, the upper half plane

As one may expect, this theorem asserts that the study of a simply connected region \(\Omega\) can be reduced to \(U\) to some extent. But a conformal equivalence is more than a homeomorphism: a conformal equivalence \(\varphi:\Omega_1 \to \Omega_2\) **induces** a ring isomorphism \[\begin{aligned}\varphi^\ast:H(\Omega_2) &\to H(\Omega_1) \\ f &\mapsto f \circ \varphi.\end{aligned}\] Therefore, the ring

The proof is fairly technical, but it is a good chance to attest to our skill in complex analysis. The bread and butter of this proof is the following set: \[\Sigma = \{\psi \in H(\Omega):\psi(\Omega) \subset U;\psi\text{ is one-to-one}\}.\] Our aim is to prove that some member of \(\Sigma\) maps \(\Omega\) onto \(U\); the key fact is that \(\Sigma\) is a **normal family**.

Pick \(w_0 \in \mathbb{C} \setminus \Omega\). Then \(g(z)=z-w_0 \in H(\Omega)\) and, what is more important,

Since \(\varphi\) is an open mapping, there is an open disc \(D(a,r) \subset \varphi(\Omega)\), where \(a \in \varphi(\Omega)\), \(a \ne 0\) and \(0<r<|a|\). But by the arguments above we have \(-a \not\in \varphi(\Omega)\), and therefore

**Remark.** You may have trouble believing that

Since \(D(-a,r) \cap \varphi(\Omega)=\varnothing\), we have

If \(\psi \in \Sigma\), \(\psi(\Omega) \subsetneqq U\), and \(z_0 \in \Omega\), then there exists a \(\psi_1 \in \Sigma\) such that \(|\psi_1'(z_0)|>|\psi'(z_0)|\).

This step shows that we can "enlarge" the range in some way.

For convenience we use the Möbius transformation

By putting \(s(z)=z^2\), we have \[\begin{aligned}\psi(z)&=\varphi_{-\alpha} \circ g^2(z) \\ &= \varphi_{-\alpha} \circ s \circ g(z) \\ &= \varphi_{-\alpha} \circ s \circ \varphi_{-\beta} \circ\psi_1(z).\end{aligned}\] If we put \(F(z)=\varphi_{-\alpha}\circ s \circ \varphi_{-\beta}(z)\), then the chain rule shows that \[\psi'(z_0) = F'(0)\psi_1'(z_0).\] (Note we used the fact that

We take the contrapositive of step 2:

Fix \(z_0 \in \Omega\). If \(h \in \Sigma\) is an element such that \(|h'(z_0)| \ge |\psi'(z_0)|\) for all \(\psi \in \Sigma\), then \(h(\Omega)=U\).

The proof is complete once we have found such a function! To do this, we use the fact that \(\Sigma\) is a normal family. Put \[\eta = \sup\{|\psi'(z_0)|:\psi \in \Sigma\}.\] By definition of \(\eta\), there is a sequence \(\{\psi_n\}\) in \(\Sigma\) such that \(|\psi_n'(z_0)| \to \eta\). By normality of

It remains to show that \(h\) is one-to-one. Fix distinct \(z_1, z_2 \in \Omega\). Put \(\alpha=h(z_1)\) and \(\alpha_n=\varphi_n(z_1)\); then \(\alpha_n \to \alpha\). Let

- \(z_1 \not\in \overline{D}\).
- \(h-\alpha\) has no zero on the boundary of \(\overline{D}\).

We see \(\varphi_n -\alpha_n\) converges to \(h-\alpha\), uniformly on \(\overline{D}\). They have no zero in \(D\), since they are one-to-one and have a zero at \(z_1\), which lies outside \(\overline{D}\). By Rouché's theorem, \(h-\alpha\) has no zero in \(D\) either, and in particular

**Remark.** First of all, such a

Our choice of \(\overline{D}\) enables us to use Rouché's theorem (in case the application was not clear). Since \(h-\alpha\) has no zero on the boundary, we have \(\zeta=\inf_{z \in \partial D}|h(z)-\alpha|>0\). When

This proof is a reproduction of W. Rudin's *Real and Complex Analysis*. For a comprehensive further reading, I highly recommend

In the previous post we were convinced that the Galois group of a separable irreducible polynomial \(f\) can be realised as a subgroup of the symmetric group, the elements of which permute the roots of \(f\). We worked on cubic polynomials over a field with characteristic not equal to

Let \(f \in \mathbb{Q}[X]\) be an irreducible polynomial of prime degree

We present some handy results in finite group theory that will be used in the main result. One may skip this section until needed. I will collapse the proofs in case one wants to treat them as exercises.

Lemma 1. Let \(p\) be a prime number. The symmetric group \(\mathfrak{S}_p\) is generated by \([12 \cdots p]\) and an arbitrary transposition \([rs]\).

*Proof.* We prove this by presenting several sets ofgenerators of \(\mathfrak{S}_n\) where\(n\) is a positive integer.

- It is generated by cycles. This is a really, really routine verification and is sometimes assumed as a fact.
- It is generated by transpositions, i.e., \(2\)-cycles. It suffices to show that a cycle is a product of transpositions. Indeed, for any cycle \([i_1\dots i_k]\) in \(\mathfrak{S}_n\), we have \([i_1\cdots i_k]=[i_1i_2][i_2i_3]\cdots[i_{k-1}i_k]\). This proves our statement.
- It is generated by transpositions of the form \([1k]\). It suffices to show that every transposition is generated as such. For any transposition \([rs]\), we have \([rs]=[1r][1s][1r]\).
- It is generated by adjacent transpositions, i.e. the generators can be of the form \([k-1,k]\). This follows from the following identity:

\[[1k]=[12][23]\cdots[k-1,k][k-2,k-1]\cdots [23][12]\]

- It is generated by two elements: \(\sigma=[12]\) and \(\tau=[12\cdots n]\). This follows from the following identity:

\[\tau^{k-2}\sigma\tau^{-(k-2)}=[\tau^{k-2}(1)\tau^{k-2}(2)]=[k-1,k].\]

Now, back to the case when \(n=p\) is prime. Put \(\sigma=[rs]\) and \(\tau=[12\cdots p]\).

We have many good reasons to study the Galois group of *something*. It would be great if the group can be written down explicitly. In this section we show that the group can be revealed by the number of nonreal roots.
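Lemma 1 is easy to check computationally. A sketch in Python, with permutations written 0-indexed as tuples; the choices \(p=5\) and the transposition \([12]\) are assumptions of this demo (the lemma allows an arbitrary transposition when \(p\) is prime).

```python
from itertools import permutations

p = 5
cycle = (1, 2, 3, 4, 0)   # the p-cycle [12...p], written 0-indexed
swap = (1, 0, 2, 3, 4)    # the transposition [12]

def compose(s, t):
    """(s . t)(i) = s(t(i))"""
    return tuple(s[t[i]] for i in range(p))

# the orbit of the identity under right multiplication by the generators
# is exactly the subgroup they generate
generated = set()
frontier = [tuple(range(p))]
while frontier:
    g = frontier.pop()
    if g in generated:
        continue
    generated.add(g)
    frontier.append(compose(g, cycle))
    frontier.append(compose(g, swap))

assert generated == set(permutations(range(p)))  # all of S_5: 120 elements
```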

Proposition 1. Let \(f(X)\in \mathbb{Q}[X]\) be an irreducible polynomial of prime degree \(p\). If \(f\) has precisely two nonreal roots, then the Galois group \(G\) of \(f\) over \(\mathbb{Q}\) is \(\mathfrak{S}_p\).

*Proof.* Let \(L\) be the splitting field of \(f\). It suffices to show that \(G\) contains a transposition and a \(p\)-cycle, which is \([12\cdots p]\). By Sylow's theorem, \(G\) has a subgroup

In fact, \(\sigma\) can be considered as \([12\dots p]\). Suppose the order of the roots of \(f\) is given, for which we have \(\sigma=[i_1 i_2 \dots i_p]\). Then if we re-order these roots, by putting the

It remains to prove that \(G\) contains a transposition. Let

For example, consider the polynomial \[f(X)=X^5-4X+2.\] With calculus one can show that it has exactly three real roots, hence it has two nonreal roots. Eisenstein's criterion shows that \(f\) is irreducible.
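The real-root count can be double-checked numerically. A crude sketch: every real root of \(f\) lies in \((-2,2)\), since \(|x|\ge 2\) gives \(|x|^5 \ge 16|x| > |4x-2|\); and since the roots of this particular polynomial are simple, counting sign changes on a fine grid suffices.

```python
def f(x):
    return x**5 - 4*x + 2

# count sign changes of f on a fine grid over [-2, 2]
N = 400000
count = 0
prev = f(-2.0)
for k in range(1, N + 1):
    cur = f(-2.0 + 4.0 * k / N)
    if prev * cur < 0:
        count += 1
    prev = cur

assert count == 3  # three real roots, hence exactly two nonreal roots
```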

This also works fine when \(p=2\) or \(3\). The case when

It is way too ambitious to restrict ourselves to one single pair of roots. Also, it seems we have ignored the alternating group \(\mathfrak{A}_p\).

Proposition 2. Let \(f \in \mathbb{Q}[X]\) be an irreducible polynomial of prime degree \(p \ge 5\). Suppose that \(f\) has \(k>0\) pairs of nonreal roots. If \(p \ge 4k+1\), then the Galois group \(G\) is isomorphic to \(\mathfrak{A}_p\) or \(\mathfrak{S}_p\). If \(k\) is odd then \(G \cong \mathfrak{S}_p\).

The proof is done by showing that

Indeed, what we have proved in "the simplest case" is nothing but \(k=1\). When

Input: An irreducible polynomial f(X) over Q with prime degree p >= 5

Here, \(\Delta(f)\) is the discriminant of \(f\). We have seen that whether \(\Delta\) is a perfect square matters a lot. The discussion of `ReductionMethod` can be traced in Oz Ben-Shimol's paper.

Let \(k\) be an arbitrary field and suppose \(f(X) \in k[X]\) is separable, i.e., \(f\) has no multiple roots in an algebraic closure, and of degree \(\ge 1\). Let \[f(X)=(X-x_1)\cdots(X-x_n)\] be its factorisation in a splitting field

For example, consider

Based on this fact, we can consider

To begin with, we note that we can assume that the quadratic term is \(0\). Let

Now back to our main story. First of all we study irreducibility. If \(f\) is irreducible, then clearly it has no root in \(K\). On the other hand, if \(f\) has no root in

Proposition 1. Let \(f(X)\) be a cubic polynomial in \(K[X]\), where \(\operatorname{char}K \ne 2,3\) (i.e. \(\operatorname{char}K=0,5,7,\dots\)); then \(f\) is irreducible over \(K\) if and only if \(f\) has no root in \(K\).

Notation being above, we assume that

\(f\) is separable if and only if \(\gcd(f,f')=1\). The derivative of \(f\), which simplifies because \(f\) has been reduced, is given by \[f'(X)=3X^2+a.\] It is not equal to the constant \(a\) because the characteristic of \(K\) is not \(3\). We will show carefully that \(f(X)\) is separable by working on these two polynomials.
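The gcd criterion can be made concrete with a small Euclidean algorithm in \(\mathbb{Q}[X]\). A sketch: the list-of-coefficients encoding (highest degree first) and the sample polynomial \(X^3-X-1\), which reappears below, are choices of this demo.

```python
from fractions import Fraction

def polydiv(a, b):
    """Divide polynomial a by b over Q; coefficients highest degree first."""
    a = a[:]
    q = []
    while len(a) >= len(b):
        c = a[0] / b[0]
        q.append(c)
        for i in range(len(b)):
            a[i] -= c * b[i]
        a.pop(0)
    while a and a[0] == 0:  # strip leading zeros of the remainder
        a.pop(0)
    return q, a

def polygcd(a, b):
    while b:
        _, r = polydiv(a, b)
        a, b = b, r
    return a

f  = [Fraction(1), Fraction(0), Fraction(-1), Fraction(-1)]  # X^3 - X - 1
df = [Fraction(3), Fraction(0), Fraction(-1)]                # f' = 3X^2 - 1
g = polygcd(f, df)
assert len(g) == 1  # gcd is a nonzero constant, so f is separable
```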

The first question is the value of

Now there only remains the most general case:

Where are we now? We want to ensure that

Since \(G\) permutes three elements, \(G\) has to be a subgroup of \(\mathfrak{S}_3\).

The question now is, when is \(G\) equal to \(\mathfrak{S}_3\) and when is it \(\mathfrak{A}_3\)?

If \(\delta \in K\), or more precisely, \(\sqrt\Delta \in K\), then \(\sigma(\delta)=\delta\), and it follows that \(\varepsilon(\sigma)=1\) for all \(\sigma \in G\). This can only happen if \(G=\mathfrak{A}_3\).

If \(\sqrt\Delta \not\in K\), then \(\delta\) is not fixed by

We have the following conclusion.

Proposition 2. Notation being above, assume that \(f\) is irreducible. Then the Galois group of \(f\) is \(\mathfrak{S}_3\) if and only if \(\sqrt\Delta \not\in K\). The group is \(\mathfrak{A}_3\) if and only if \(\sqrt\Delta \in K\).

A dirty calculation shows that

Let's conclude this post with a handy but nontrivial example. Consider \[f(X)=X^3-X-1.\] The discriminant is \(-4\cdot(-1)^3-27 \cdot (-1)^2=-23\), which lies in
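The arithmetic here is trivial to double-check, using the discriminant formula \(\Delta=-4a^3-27b^2\) for the depressed cubic \(X^3+aX+b\):

```python
# f(X) = X^3 + aX + b with a = b = -1
a, b = -1, -1
delta = -4 * a**3 - 27 * b**2
assert delta == -23

# Delta < 0, so sqrt(Delta) is certainly not in Q,
# and by Proposition 2 the Galois group is S_3
assert delta < 0
```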

The method is presented by Artin: we will be actively using Sylow theory. Recall that for a finite group *other* \(p\)-Sylow groups here. However, one needs to also recall that a

On the other hand, we also make use of analysis (which is Gauss's idea). For every \(a>0\), there is a square root \(\sqrt{a}>0\). In other words, we have a positive root of the equation *not* irreducible over \(\mathbb{R}\) unless

Next we take a look at *not* irreducible. With this being said,

We also need a part of the following lemma on field extensions. In brief, a finite separable extension induces a *minimal* Galois extension.

Lemma. Let \(E/F\) be a finite separable extension. Then \(E\) is contained in an extension \(K\) such that \(K/F\) is Galois. It is minimal in the sense that, in a fixed algebraic closure \(K^\mathrm{a}\) of \(K\), any other Galois extension \(L\) of \(F\) containing \(E\) must contain \(K\) as well. We have the following tower: \[F \subset E \subset K \subset L \subset K^\mathrm{a}.\]

*Proof.* First of all, we can find a finite Galois extension of \(F\) containing

The complex field \(\mathbb{C}\) isalgebraically closed.

The following proof focuses on algebra and tries its best to avoid analysis. If you are a fan of analysis, you can dive into complex analysis and use the maximum modulus theorem to study a polynomial. Or, you can study the behaviour of

*Proof.* Let us first make it a problem of Galois theory. Since \(\mathbb{R} \supset \mathbb{Q}\), it is of characteristic

Applying the lemma above, we can find a finite Galois extension \(K/\mathbb{R}\). We need to prove that \(K=\mathbb{C}\).

Put \(G=G(K/\mathbb{R})\). We want to show that \(|G|=2\), hence

Indeed, applying the lemma again, we see

Therefore \(G=H\) is a

Why do we have to prove this? Recall that a Galois extension is an **algebraic** extension which is normal and separable.

Let \(G\) be a finite group and \(R\) be a commutative ring. The *group algebra* \(R[G]\) of \(G\) over \(R\) is the free \(R\)-module generated by the basis \(\{e_s\}_{s \in G}\), with multiplication determined by

\[e_s e_t = e_{st},\quad \forall s,t \in G.\]

With this being said, given \(u=\sum_{s \in G}a_se_s\) and \(v=\sum_{t \in G}b_te_t\), we have

\[uv = \sum_{s \in G}\sum_{t \in G}a_sb_te_{st}.\]

For example, take \(G=\{1,x,x^2\}\) cyclic of order \(3\), \(u=a_1e_1+a_xe_x\) and \(v=b_xe_x+b_{x^2}e_{x^2}\); then

\[uv = a_xb_{x^2}e_1+a_1b_xe_x+(a_1b_{x^2}+a_xb_x)e_{x^2}.\]
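This multiplication rule is a convolution over the group, which is easy to sketch in Python. Assumptions of this demo: elements of \(R[G]\) are stored as coefficient dicts, \(G\) is the cyclic group of order \(3\) with the integer \(k\) (mod \(3\)) standing for \(x^k\), and the numeric coefficients are arbitrary.

```python
def multiply(u, v, n=3):
    """Convolution product in R[G] for G = Z/nZ (k stands for x^k)."""
    w = {}
    for s, a in u.items():
        for t, b in v.items():
            key = (s + t) % n            # e_s * e_t = e_{st}
            w[key] = w.get(key, 0) + a * b
    return w

# u = a_1 e_1 + a_x e_x, v = b_x e_x + b_{x^2} e_{x^2}
# with sample coefficients a_1 = 2, a_x = 3, b_x = 5, b_{x^2} = 7
u = {0: 2, 1: 3}
v = {1: 5, 2: 7}
uv = multiply(u, v)

# e_1 coefficient: a_x b_{x^2} = 21; e_x: a_1 b_x = 10;
# e_{x^2}: a_1 b_{x^2} + a_x b_x = 14 + 15 = 29
assert uv == {0: 21, 1: 10, 2: 29}
```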

As one will notice, the structure of this algebra is determined by both \(G\) and \(R\). A lot of things in elementary linear algebra can be recovered here, and that is part of the mission of this blog post. Before we dive in, we need to look into group algebras in a general setting first. It is not common to see group algebras and representation theory treated together, but let's try it. While the majority of this post is (non-commutative) ring theory and module theory, we encourage the reader to use representation theory as a source of examples. Standalone examples may drive us too far, and we may not have enough space for them.

First of all, we list some very obvious facts that do not even need proof.

- \(R[G]\) is a free \(R\)-module with dimension (rank) \(|G|\).
- \(R[G]\) is itself a ring.
- The commutativity of \(R[G]\) is determined by \(G\): it is commutative if and only if \(G\) is abelian.

However, as one may overlook,

Proposition 1. \(R[G]\) is not a division ring (provided \(G\) is nontrivial).

*Proof.* Pick \(g \in G\) that is not the identity, and let \(m\) be its order. Then \[(e_1-e_g)(e_1+e_g+\cdots+e_{g^{m-1}})=e_1-e_{g^m}=0,\] so \(e_1-e_g \ne 0\) is a zero-divisor.

But in a division ring, there is no zero-divisor.

As a ring, we certainly can consider modules over

Let \(R\) be a ring (not assumed to be commutative here). A nonzero \(R\)-module \(E\) is called **simple** if it has no nontrivial proper submodule. This may remind you of irreducible or simple representations of a group; we will see the connection later. Following the definition, we immediately have a special version of Schur's lemma:

Proposition 2 (Schur's Lemma). Let \(E,F\) be two simple \(R\)-modules. Every nontrivial homomorphism \(f:E \to F\) is an isomorphism.

*Proof.* Note \(\ker{f}\) and \(f(E)\) are submodules of \(E\) and \(F\) respectively.

Corollary 1. If \(E\) is a simple \(R\)-module, then \(\operatorname{End}_R(E)\) is a division ring.

*Proof.* If \(f:E \to E\) is nontrivial, then according to Schur's lemma, it has an inverse.

This definitely reminds you of irreducible representations. But not every representation is irreducible, and likewise not every module is simple. Recall Maschke's theorem in representation theory: *every representation of a finite group over \(\mathbb{C}\) having positive dimension is completely reducible.* For modules, we have a similar statement.

Definition-Proposition 3. Let \(E\) be an \(R\)-module. Then the following three conditions are equivalent:

SS 1. \(E\) is a sum of simple \(R\)-modules.

SS 2. \(E\) is a direct sum of simple \(R\)-modules.

SS 3. For every submodule \(E'\) of \(E\), there is another submodule \(F\) such that \(E= E' \oplus F\), i.e. every submodule is a direct summand.

If \(E\) satisfies the three conditions above, then \(E\) is called **semisimple**. A ring \(R\) is semisimple if it is a semisimple module over itself.

*Proof.* Assume **SS 1**, say we have

Next we assume **SS 2** and we have **SS 3**.

Finally we assume **SS 3**. Let

Lemma 4. Let \(E\) be an \(R\)-module satisfying SS 3; then every nontrivial submodule \(F\) has a simple submodule.

*Proof.* It suffices to show that every nonzero cyclic submodule has a simple submodule. Indeed, for any

Let \(L\) be the kernel of the morphism

\[\begin{aligned}R &\to Rv \\a &\mapsto av.\end{aligned}\]

Then \(L\) is a left ideal, which is contained in a maximal left ideal \(M\) of \(R\). It follows that

\[R/L \cong Rv.\]

By **SS 3**, we can find a submodule

\[E = Mv \oplus M'\]

which gives

\[Rv = E \cap Rv = (Mv \cap Rv) \oplus (M' \cap Rv)=Mv \oplus (M'\cap Rv).\]

We claim that \(M' \cap Rv\) is simple. Pick any proper submodule \(E' \subset M' \cap Rv\); then \(Mv \oplus E'\) is a submodule of

Proposition 5. Let \(E\) be a semisimple \(R\)-module; then every nontrivial submodule and every quotient module of \(E\) is semisimple.

*Proof.* Pick a nontrivial submodule

\[F + \bigoplus_{j \in J}E_j\]

is direct. Then the direct sum is actually

Corollary 6. \(R\) is a semisimple ring if and only if every \(R\)-module is semisimple.

*Proof.* By the universal property of free modules, every \(R\)-module is a factor module of a free \(R\)-module, while a free

Let \(R\) be a ring. We say it is a finite dimensional algebra if it is also a vector space over some field \(K\) of finite dimension. We study the Jacobson radical \(J(R)=\bigcap\{\text{left maximal ideals of }R\}\) in this subsection, which will be used in the next section.

We summarise what we want to prove in the following proposition.

Proposition 7 (Jacobson Radical). Let \(R\) be a ring (not necessarily commutative) and \(J(R)\) be the Jacobson radical of \(R\); then

1. \(J(R)\) is a two-sided ideal containing every nilpotent ideal.

2. For every simple \(R\)-module \(E\) we have \(J(R)E=0\). More precisely, \(J(R)=\{a \in R:aE=0\text{ for all simple }R\text{-modules }E\}\).

3. Suppose \(R\) is a finite dimensional algebra (or more generally, \(R\) is Artinian); then \(R/J(R)\) is semisimple, and if \(I\) is a two-sided ideal such that \(R/I\) is semisimple, then \(J(R) \subset I\). It follows that \(R\) is semisimple if and only if \(J(R)\) is trivial.

4. Assumption being as above, \(J(R)\) is nilpotent.

*Proof.* We first prove 2. Pick any

Conversely, suppose \(J(R)E \ne 0\) for some simple \(E\). Since

Next we prove 1. By definition

To prove 3, we first note that \(R\) is Artinian: every descending chain of left ideals

\[M_1 \supset M_1 \cap M_2 \supset M_1 \cap M_2 \cap M_3 \supset\cdots\supset J(R)\]

must be finite. Therefore we can write

\[\begin{aligned}\phi:R/J(R) &\to R/M_1 \oplus R/M_2 \oplus \cdots \oplus R/M_n \\ x+J(R) &\mapsto (x+M_1,x+M_2,\dots,x+M_n).\end{aligned}\]

Since \(J(R)=\bigcap_{i=1}^{n}M_i\),this follows from nothing but the Chinese Remainder Theorem.

Now suppose \(I\) is a two-sided ideal such that \(R/I\) is semisimple. By definition we can write

\[R/I=\bigoplus_{j \in J}L_j\]

for some simple \(L_j\). Pick any \(a \in J(R)\); we have

If \(J(R)=0\), then

To prove 4, we work on the descending chain

Let \(R\) be a commutative ring and \(G\) a finite group. Let \(E\) be an \(R\)-module. We can study a representation of \(G\) on \(E\), i.e. a group homomorphism

\[\rho: G \to \operatorname{Aut}_{R}E\]

and we can also study the ring homomorphism

\[\lambda:R[G] \to \operatorname{End}_{R}E.\]

We show that they are the same thing. Given

Conversely, for a representation \(\rho\), we can define a ring homomorphism

\[\begin{aligned}\lambda:R[G] &\to \operatorname{End}_{R}E \\\sum_{g \in G}a_ge_g &\mapsto \sum_{g \in G}a_g\rho(g).\end{aligned}\]

Therefore, the study of group representations can also be transferred into the study of group algebras. For simplicity we call such a module \(E\) together with a representation \(\rho\) a *\(G\)-module*. Note such a \(G\)-module can also be considered as a module over \(R[G]\) in the usual sense; conversely, an \(R[G]\)-module is a \(G\)-module. When the context is clear, we write \(gx\) in place of \(\rho(g)x\).

We generalise Maschke's theorem to an arbitrary field:

Theorem 8 (Maschke). Let \(G\) be a finite group of order \(n\). Let \(K\) be a field; then \(K[G]\) is semisimple if and only if the characteristic of \(K\) does not divide \(n\) (it can also be \(0\)).

In introductory representation theory, we study the case when \(K=\mathbb{C}\).

*Proof.* Let \(E\) be a

To do this, we *average* a \(K\)-linear projection \(\pi\) over \(G\):

\[\varphi:x \mapsto \frac{1}{n}\sum_{g \in G}g^{-1} \circ\pi\circ g(x)\]

This map is \(K[G]\)-linear. We can therefore write \(E=\ker\varphi \oplus F\), because \(\varphi\) is a left inverse of the inclusion \(F \to E\): for \(x \in F\),

\[\varphi(x)=\frac{1}{n}\sum_{g \in G}g^{-1} \circ g(x)=\frac{1}{n}\sum_{g\in G}x=x.\]
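The averaging trick can be watched in action in the smallest interesting case. A sketch, with assumptions chosen for this demo: \(G=C_2\) acts on \(K^2\) by swapping coordinates, \(F\) is the invariant line spanned by \((1,1)\), and \(\pi\) is a deliberately non-equivariant projection onto \(F\).

```python
def matmul(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

I = [[1, 0], [0, 1]]
S = [[0, 1], [1, 0]]   # the swap (x, y) -> (y, x); S is its own inverse
P = [[0, 1], [0, 1]]   # projection onto F along (1, 0): (x, y) -> (y, y)

# phi = (1/|G|) * sum over g of g^{-1} pi g
terms = [matmul(matmul(I, P), I), matmul(matmul(S, P), S)]
phi = [[(terms[0][i][j] + terms[1][i][j]) / 2 for j in range(2)]
       for i in range(2)]

assert phi == [[0.5, 0.5], [0.5, 0.5]]
assert matmul(phi, S) == matmul(S, phi)   # phi is now K[G]-linear
assert matmul(phi, phi) == phi            # phi is still a projection
assert all(sum(row) == 1 for row in phi)  # phi fixes (1, 1), i.e. phi|_F = id
```

The averaged map turns out to be the orthogonal projection onto \(F\); the point is that averaging forced equivariance without destroying the projection property.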

Note, since \(F\) is a

Next we suppose that

\[s=\sum_{g \in G}e_g.\]

Note \(gs:=e_gs=s\) for all \(g \in G\).

In other words, if \(E\) is a finite dimensional representation over

It would be great if, given a matrix representation, we could decompose it into a diagonal block matrix, with each block being a subrepresentation. But it would not be an easy job: we need to know whether the field is algebraically closed, its characteristic, et cetera. Perhaps we need some Galois theory, but that would go too far from this post. Anyway, we need to see through the structure to know how to work with it.

In this section we study the structure of semisimple rings. A ring is called **simple** if it is semisimple and all of its simple left ideals are isomorphic. A left ideal is called simple if it is a simple left \(R\)-module.

Theorem 9 (Structure theorem of semisimple rings). Let \(R\) be a semisimple ring. Then the set of isomorphism classes of simple left ideals of \(R\) is finite; say it is represented by \(L_1,L_2,\dots,L_s\). If \(R_i = \sum_{L \cong L_i}L\) (the sum of all left ideals isomorphic to \(L_i\)), then \(R_i\) is a two-sided ideal, and is a simple ring. One can write \(R\) as a product \[R=\prod_{i=1}^{s}R_i.\]

Besides, \(R\) admits a Peirce decomposition with respect to these \(R_i\): there are elements \(e_i \in R_i\) such that \[1=e_1+\cdots+e_s.\] The \(e_i\) are idempotent (\(e_i^2=e_i\)) and orthogonal (\(e_ie_j=0\) if \(i\ne j\)). The element \(e_i\) is the multiplicative identity of the ring \(R_i\), and \(R_i=e_iR=Re_i\).

*Proof.* To begin with, we first study the behaviour of simple left ideals.

Lemma 10. Let \(L\) be a simple left ideal of \(R\) and \(E\) be a simple \(R\)-module; then \(LE = 0\) unless \(L \cong E\).

*Proof of the lemma.* Since

is surjective. It is injective because the kernel is a submodule of \(L\), and it has to be trivial.

According to this lemma, \(R_iR_j=0\) whenever \(i \ne j\). This will be frequently used. For the time being we can write

\[R_i \subset R_i R = R_i R_i \subset R_i \implies R_iR=R_i.\]

Therefore \(R_i\) is also a right ideal for all \(i\). But before we proceed we need to explain the relation above. Since

Also note that \(R_i \cap R_j=0\) for all \(i \ne j\), because it is an intersection of two distinct classes of simple modules. Therefore we can write \(R=\bigoplus_{i \in I}R_i\) for the time being.

Now consider \(1=\sum_{i \in I}e_i\) with \(e_i \in R_i\). This sum is finite (by definition of direct sum, where cofiniteness is required). Let \(J \subset I\) be the finite subset such that \(e_j \ne 0\) for all \(j \in J\). It follows that

Now we study the relation of

\[a_i=a_i(e_1+\cdots+e_n)=a_ie_i=(e_1+\cdots+e_n)a_i=e_ia_i.\]

Therefore \(e_i\) is the unit in \(R_i\) (it follows automatically that \(e_i^2=e_i\)). For any

\[a=a_1+\cdots+a_n.\]

This gives us a projection

Let's extract more information from this theorem. First of all, the sum of \(1\) is also finite in every \(R_i\), hence each

Theorem 11. Every simple ring \(R\) admits a finite direct sum decomposition into simple left ideals: \[R = \bigoplus_{i=1}^{n}R_i.\]

*Proof.* Since \(R\) is semisimple, it is a sum of simple left ideals, the collection of which can be chosen to be direct. Say we have

Consider \(1 \in R\):

\[1=\sum_{i \in I}x_i\]

where \(x_i \in R_i\). This sum is finite; say we have

\[R=1 \cdot R = \bigoplus_{i=1}^{n}x_iR=\bigoplus_{i=1}^{n}R_i.\]

This proves our assertion.

Combining theorems 9 and 11, we see

Corollary 12.Every semisimple ring\(R\) admits a decomposition \[R=n_1L_1 \oplus \cdots \oplus n_rL_r\]

where \(n_iL_i\) denotes the direct sum of \(n_i\) copies of the simple left ideal \(L_i\). This direct sum is unique in the following sense: \(L_1,\dots,L_r\) are unique up to isomorphism, and the pairs \((n_i,L_i)\) are unique up to a permutation.

This must remind you of the isotypic decomposition of a representation into irreducible representations. They are the same thing. It used the semisimplicity of

We include here an elementary ring theory result that really doesn't need a proof here.

Proposition 13. Let \(R_1,R_2,\cdots, R_n\) be rings with units. The direct product \[R=R_1 \times \cdots \times R_n\]

has the following property. Every ideal (no matter left, right or two-sided) of \(R_i\) is an ideal of \(R\). Every minimal ideal of \(R_i\) is a minimal ideal of \(R\), and every minimal ideal of \(R\) is an ideal of some \(R_i\).

The proof is quite similar to how we prove that

Corollary 14.If\(R_1,\cdots,R_n\) are semisimple rings, thenso is \[R=R_1 \times \cdots \times R_n.\]

We want to work with matrices, i.e., we want to work with linear equations. This becomes possible because of Wedderburn-Artin theory. We don't know what can happen yet, so we can only try to generalise things very carefully.

When talking about matrices, we can talk about endomorphisms as well. So our first step is to find a bridge to endomorphisms. We now need to consider \(R\) as a left module over itself.

The most immediate map is multiplication: for \(a \in R\), define

\[\lambda_a:x \mapsto ax.\]

It may look natural, but unfortunately it is not necessarily an endomorphism of \(R\) as a left module: \(\lambda_a(yx)=ayx\) need not equal \(y\lambda_a(x)=yax\). Instead, consider

\[\rho_a:x \mapsto xa.\]

Now \(\rho_a(yx)=y\rho_a(x)\) holds naturally. We can show that every endomorphism is obtained in this way. Consider the map \(\rho:a \mapsto (x \mapsto xa)\). We have:

- \(\rho\) is an anti-homomorphism. Indeed, \(\rho(ab)=\rho(b)\rho(a)\) for all \(a,b \in R\), and \(\rho(a+b)=\rho(a)+\rho(b)\).
- \(\rho\) is surjective. For any \(\psi:x \mapsto \psi(x)\), we have \(\psi(x)=\psi(x \cdot 1)=x\psi(1)\). Therefore \(\rho(\psi(1))=\psi\).
- \(\rho\) is injective. If \(\rho(a)(x)=xa=0\) for all \(x \in R\), then in particular \(\rho(a)(1)=a=0\).

We could call \(\rho\) an *anti-isomorphism*, but that causes headaches. Instead, if we consider the opposite ring \(A^{op}\), where addition is the same as in \(A\) and multiplication \(\ast\) is given by

\[a \ast b = ba\]

then we have

Proposition 14. Let \(R\) be a ring. There is a natural isomorphism \(R^{op} \cong \operatorname{End}_R(R)\) given by \(a \mapsto (x \mapsto xa)\).

Note \((R^{op})^{op}=R\), so we may be able to take the opposite to decompose

Now write

\[R^{op} \cong \bigoplus_{i=1}^{r}\operatorname{End}_R(n_iL_i).\]

However, by Schur's lemma, \(D_i=\operatorname{End}_R(L_i)\) is a division ring, and

\[\operatorname{End}_R(n_iL_i) \cong \operatorname{Mat}_{n_i}(D_i).\]

For each \(f \in \operatorname{End}_R(n_kL_k)\), we have a corresponding matrix \((p_ift_j)\):

\[L_k \xrightarrow{t_j}L_k \oplus \cdots \oplus L_k \xrightarrow{f} L_k\oplus \cdots\oplus L_k \xrightarrow{p_i}L_k\]

where \(t_j\) is the inclusion and \(p_i\) is the projection. This is to say, the isomorphism is given by

\[f \mapsto (p_ift_j)\]

The verification is a matter of linear algebra and techniques frequently used in this post.

Therefore we have

\[R^{op}\cong \bigoplus_{i=1}^{r}\operatorname{Mat}_{n_i}(D_i).\]

Taking the opposite again we have

\[R=(R^{op})^{op} \cong\bigoplus_{i=1}^{r}\operatorname{Mat}_{n_i}(D_i^{op}).\]

Since \(D_i^{op}\) is again a division ring, we may relabel and state the isomorphism as

\[R \cong \bigoplus_{i=1}^{r}\operatorname{Mat}_{n_i}(D_i),\]

where \(D_i\) is a division ring.

Conversely, rings of the form above are semisimple. This is easy because, for

Lemma. Let \(R\) be a ring. All two-sided ideals of \(\operatorname{Mat}_n(R)\) are of the form \(\operatorname{Mat}_n(I)\), where \(I\) is a two-sided ideal of \(R\).

*Proof.* If \(I\) is a two-sided ideal of \(R\), then clearly \(\operatorname{Mat}_n(I)\) is a two-sided ideal of \(\operatorname{Mat}_n(R)\). Conversely, let \(J\) be a two-sided ideal of \(\operatorname{Mat}_n(R)\) and put

\[I=\{a \in R:\text{$a$ is the $(1,1)$-th element of $A$ for some $A \inJ$}\}.\]

Then \(I\) is a two-sided ideal. Now pick some \(A \in \operatorname{Mat}_n(R)\). Letting \(E_{ij}\) denote the matrix units, a direct computation shows

\[E_{ij}AE_{k\ell}=a_{jk}E_{i\ell}.\]

Therefore if \(A \in J\), then

\[E_{1j}AE_{k1}=a_{jk}E_{11} \in J \implies a_{jk} \in I\]

for all \(j,k\). Therefore \(A \in \operatorname{Mat}_n(I)\), i.e. \(J \subset \operatorname{Mat}_n(I)\); the reverse inclusion follows by multiplying by matrix units again.
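The matrix-unit identity driving this proof is easy to verify numerically. A sketch with \(3\times 3\) integer matrices, indices written 0-based:

```python
import random

n = 3

def E(i, j):
    """The matrix unit E_ij: 1 in position (i, j), 0 elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

random.seed(0)
A = [[random.randint(-9, 9) for _ in range(n)] for _ in range(n)]

# E_ij A E_kl = a_jk E_il for every choice of indices
for i in range(n):
    for j in range(n):
        for k in range(n):
            for l in range(n):
                lhs = matmul(matmul(E(i, j), A), E(k, l))
                rhs = [[A[j][k] * x for x in row] for row in E(i, l)]
                assert lhs == rhs
```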

It follows that a matrix algebra over a division ring or a field is semisimple. But let's head back to where we were.

The direct sum (or product, because it is finite) of matrix algebras over division rings

\[\operatorname{Mat}_{n_1}(D_1) \oplus \cdots \oplus\operatorname{Mat}_{n_r}(D_r).\]

is therefore semisimple as well. To conclude, we have the Wedderburn-Artin theorem.

Theorem 15 (Wedderburn-Artin). \(R\) is a semisimple ring if and only if it can be written as a direct sum (or product, because they are the same when finite) of matrix algebras over some division rings: \[R \cong \operatorname{Mat}_{n_1}(D_1) \oplus \cdots \oplus \operatorname{Mat}_{n_r}(D_r).\]

Since the opposite of a division ring is a division ring, we also have

Corollary 16.A ring\(R\) is semisimple if and only if \(R^{op}\) is.

Now back to representation theory. But it can be extremely hard: we have no idea about the division rings. However, when the base field is algebraically closed, there is no problem. Note some authors use *skew field* in place of division ring.

Proposition 17. Let \(K\) be an algebraically closed field and \(D\) be a finite dimensional division ring over \(K\); then \(D \cong K\).

*Proof.* Pick \(a \in D\) that is not \(0\). The map \(x \mapsto ax\) is a \(K\)-linear endomorphism of the finite dimensional space \(D\); since \(K\) is algebraically closed, it has an eigenvalue \(\lambda \in K\), i.e.

\[(\lambda{e}-a)x=0\]

for some nonzero \(x\), where \(e\) is the unit of \(D\). Since \(D\) has no zero-divisors, \(a=\lambda e\), i.e. \(a \in K\).

If you have studied Banach algebra theory, you will realise that this is nothing but the Gelfand-Mazur theorem (see any book on functional analysis that discusses Banach algebras, for example, *Functional Analysis* by W. Rudin). In the infinite dimensional case, we have to consider the topology of the field and the algebra.

Therefore we can now state Maschke's theorem in the finest waypossible:

Theorem 18 (Maschke). Let \(G\) be a finite group, and \(K\) be an algebraically closed field whose characteristic does not divide the order of \(G\); then \[K[G]=\operatorname{Mat}_{n_1}(K) \oplus \cdots \oplus \operatorname{Mat}_{n_r}(K).\]

Those \(n_i\) are uniquely determined. In particular,

\(n_1^2+\cdots+n_r^2=|G|\).
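For instance, for \(G=\mathfrak{S}_3\) the irreducible complex representations have dimensions \(1,1,2\) (trivial, sign, and standard), and the sum-of-squares identity is immediate:

```python
from itertools import permutations

dims = [1, 1, 2]  # trivial, sign and standard representations of S_3
order = len(list(permutations(range(3))))  # |S_3| = 3! = 6
assert sum(d * d for d in dims) == order
```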

- *Algebra* (Revised Third Edition), Serge Lang.
- *Abstract Algebra*, Pierre Antoine Grillet.
- *Linear Representations of Finite Groups*, Jean-Pierre Serre.

\[R'=\mathbb{C}[\cos{x},\sin{x}].\]

in a different style.

Again, if we consider the map

\[\begin{aligned}\Phi:\mathbb{C}[X,Y] &\to \mathbb{C}[\cos{x},\sin{x}] \\ f(X,Y) &\mapsto f(\cos{x},\sin{x})\end{aligned}\]

we will see that

\[\mathbb{C}[\cos{x},\sin{x}] \cong \mathbb{C}[X,Y]/(X^2+Y^2-1).\]

Following the same steps as in the previous post, we can show that \(R'=\mathbb{C}[\cos{x},\sin{x}]\) is Dedekind. However, the map

\[\begin{aligned}\Psi:\mathbb{C}[U,V] &\to \mathbb{C}[X,Y]/(X^2+Y^2-1) \\ g(U,V) &\mapsto \overline{g(X+iY,X-iY)}\end{aligned}\]

shows that

(Proposition 1)\[\mathbb{C}[X,Y]/(X^2+Y^2-1) \cong \mathbb{C}[U,V]/(UV-1) \cong\mathbb{C}[T,T^{-1}] \cong \mathbb{C}[T]_T.\]
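A quick numerical check of this chain of isomorphisms (a sketch: on the real circle \(X^2+Y^2=1\), the substitution \(U=X+iY\), \(V=X-iY\) satisfies \(UV=1\), so \(V\) is forced to be \(U^{-1}\), matching \(\mathbb{C}[T,T^{-1}]\)):

```python
import cmath
import math
import random

random.seed(0)
for _ in range(100):
    theta = random.uniform(0, 2 * math.pi)
    X, Y = math.cos(theta), math.sin(theta)        # a point on X^2 + Y^2 = 1
    U, V = complex(X, Y), complex(X, -Y)
    assert abs(U * V - 1) < 1e-12                  # UV = 1, i.e. V = U^{-1}
    assert abs(U - cmath.exp(1j * theta)) < 1e-12  # T corresponds to e^{ix}
```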

The localisation of a UFD is a UFD, hence we see that \(R'\) is a UFD. Concretely, every element of \(R'\) can be written as

\[f(\cos{x},\sin{x}) = e^{-inx}P(e^{ix})\]

where \(P(X) \in \mathbb{C}[X]\) and \(n \ge 0\). Conversely, all elements of this form lie in \(R'\). The isomorphism of Proposition 1 is realised concretely by the map

\[\begin{aligned}\Lambda: \mathbb{C}[T]_{T} &\to \mathbb{C}[\cos{x},\sin{x}], \\ T &\mapsto \cos{x}+i\sin{x}.\end{aligned}\]

Note it follows that \(T^{-1}\) maps to \(\cos{x}-i\sin{x}\).

Now we return to the identity

\[\sin^2{x}=(1-\cos{x})(1+\cos{x}).\]

In \(\mathbb{R}[\cos{x},\sin{x}]\), since \(\sin{x}\), \(1-\cos{x}\) and \(1+\cos{x}\) are irreducible and pairwise non-associated, this identity shows that factorisation into irreducibles is not unique.

However, since \(R'\) is a UFD, \(\sin^2{x}=(1-\cos{x})(1+\cos{x})\) tells us *nothing*. We need to figure out why and what is going on. To work with it we consider elements of the form

\(P(T)=\sum_{j=m}^{n}a_jT^{j}\) with \(m,n \in \mathbb{Z}\) and \(a_m,a_n \neq 0\) (you should be reminded of Laurent series!). Since \(T\) is a unit in \(\mathbb{C}[T]_T\), \(P(T)\) is irreducible if and only if \(Q(T)=T^{-m}P(T)\) is. However, \(Q(T) \in \mathbb{C}[T]\) has a nonzero constant term, so it is irreducible if and only if \(Q\) is of degree \(1\), which is equivalent to saying that \(n-m=1\) in \(P(T)\).

Therefore, pulling back along \(\Lambda\), we obtain the irreducible elements of \(R'\).

(Proposition 2)Irreducible elements of \(R'\) are, up to units, of the form \[\cos{x}+i\sin{x}+a, \quad a \in \mathbb{C}^\ast.\]

With this being said, \(\sin{x}\), \(1-\cos{x}\) and \(1+\cos{x}\) are *not* irreducible in \(R'\). For example,

\[\begin{aligned}\sin{x}&=\frac{1}{2i}(e^{ix}-e^{-ix})\\& = \frac{1}{2ie^{ix}}(e^{2ix}-1) \\& = \frac{1}{2ie^{ix}}(e^{ix}+1)(e^{ix}-1) \\& = \frac{1}{2ie^{ix}}(\cos{x}+i\sin{x}+1)(\cos{x}+i\sin{x}-1)\end{aligned}\]
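This factorisation can be checked mechanically (a sympy sketch; the variable `u` stands for \(\cos{x}+i\sin{x}\)):

```python
import sympy as sp

x = sp.symbols('x', real=True)

# u plays the role of cos x + i sin x, i.e. the image of T under Lambda.
u = sp.cos(x) + sp.I*sp.sin(x)
rhs = (u + 1)*(u - 1) / (2*sp.I*sp.exp(sp.I*x))

# The claimed factorisation of sin x:
diff = (rhs - sp.sin(x)).rewrite(sp.exp)
assert sp.simplify(diff) == 0
```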

We can find some obvious facts about these two rings. For example, \(R\) is a free \(\mathbb{R}[\cos{x}]\)-module with basis \(\{1,\sin{x}\}\) (use \(\sin^2{x}=1-\cos^2{x}\) to reduce higher powers of \(\sin{x}\)).

Treating \(R\) as a free \(\mathbb{R}[\cos{x}]\)-module, every element can be written as

\[f(\cos{x},\sin{x})=P(\cos{x})+Q(\cos{x})\sin{x}\]

where \(P,Q \in \mathbb{R}[X]\). For simplicity we write \(f=P+Q\sin{x}\). Suppose now we have \(f=P_1+Q_1\sin{x}\) and \(g=P_2+Q_2\sin{x} \neq 0\); then

\[\begin{aligned}\frac{f}{g} &= \frac{P_1+Q_1\sin{x}}{P_2+Q_2\sin{x}} \\ &=\frac{(P_1+Q_1\sin{x})(P_2-Q_2\sin{x})}{(P_2+Q_2\sin{x})(P_2-Q_2\sin{x})}\\ &=\frac{P_1P_2-Q_1Q_2(1-\cos^2{x})+(P_2Q_1-P_1Q_2)\sin{x}}{P_2^2-Q_2^2(1-\cos^{2}{x})}\end{aligned}\]

Therefore every element of \(K(R)\) can be written in the form \(A(\cos{x})+B(\cos{x})\sin{x}\) with \(A,B \in \mathbb{R}(\cos{x})\).

(Proposition 3)The quotient field of\(R\) is \[K(R)=\mathbb{R}(\cos{x})[\sin{x}].\]
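The rationalisation trick is easy to test on a concrete element (a sympy sketch; the element \(1/(1+\sin{x})\) is our own example): multiplying through by \(1-\sin{x}\) gives \(A=1/\cos^2{x}\) and \(B=-1/\cos^2{x}\).

```python
import sympy as sp

x = sp.symbols('x', real=True)
c, s = sp.cos(x), sp.sin(x)

# 1/(1 + sin x) = A(cos x) + B(cos x) sin x after rationalising:
A = 1/c**2
B = -1/c**2
assert sp.simplify(A + B*s - 1/(1 + s)) == 0
```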

Likewise,

\[K(R')=\mathbb{C}(\cos{x})[\sin{x}]\]

can be proved in exactly the same way.

Since \(R\) is Dedekind, it is integrally closed in \(K(R)\). But what about its relation with \(K(R')\)? For this we have an elegant result:

(Proposition 4)\(R'\) is the integral closure of \(R\) in \(K(R')\).

*Proof.* Let \(C\) be the integral closure of \(R\) in \(K(R')\).

*We are using the notation that Hartshorne used in his book Algebraic Geometry.*

Put \(f(X,Y)=X^2+Y^2-1\), then

\[\begin{pmatrix}\partial f/\partial X \\\partial f/\partial Y\end{pmatrix} =\begin{pmatrix}2X \\2Y\end{pmatrix}\]

has rank \(1\) at every point of the curve, since \((0,0)\) does not lie on \(Z(X^2+Y^2-1)\); hence the curve is nonsingular. The coordinate ring \(A(Y)\) is exactly \(\mathbb{C}[X,Y]/(X^2+Y^2-1)\).
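The nonsingularity argument can be checked with a short computation (a sympy sketch):

```python
import sympy as sp

X, Y = sp.symbols('X Y')
f = X**2 + Y**2 - 1

# The Jacobian (2X, 2Y) vanishes only at the origin...
sols = sp.solve([sp.diff(f, X), sp.diff(f, Y)], [X, Y], dict=True)
assert sols == [{X: 0, Y: 0}]

# ...which does not lie on the curve, so the curve is nonsingular.
assert f.subs({X: 0, Y: 0}) != 0
```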

Let \(P=(X_0,Y_0)\) be a point on the curve \(Z(X^2+Y^2-1)\).

(Proposition 5)Every point \(P=(X_0,Y_0)\) on the curve \(Z(X^2+Y^2-1)\) corresponds to a unique nonzero complex number \(a=X_0+iY_0 \in \mathbb{C}^\ast\), with inverse \(X_0=\frac{1}{2}(a+a^{-1})\), \(Y_0=\frac{1}{2i}(a-a^{-1})\).

Since \(Y\) is nonsingular, it also follows that every local ring \(\mathcal{O}_{P,Y}\) is a discrete valuation ring, consistent with the coordinate ring \(A(Y) \cong R'\) being a Dedekind domain.

- M. F. Atiyah & I. G. MacDonald, *Introduction to Commutative Algebra*.
- Robin Hartshorne, *Algebraic Geometry*.
- Marco Fontana, Salah-Eddine Kabbaj and Sylvia Wiegand (eds.), *Commutative Ring Theory and Applications*.

Let \(M\) be a monoid and \(K\) a field. A *character* of \(M\) in \(K\) is a homomorphism \[\chi:M \to K^\ast.\] By trivial character we mean the character such that \(\chi(x)=1\) for all \(x \in M\). Functions \(f_1,\dots,f_n\) from \(M\) to \(K\) are **linearly independent over \(K\)** if whenever \[a_1f_1+\cdots+a_nf_n=0\] with all \(a_i \in K\), we have \(a_i=0\) for all \(i\).

In Fourier analysis we are always interested in functions like \(x \mapsto e^{itx}\).

Let \(G\) be a LCA group. A group homomorphism \(\gamma\) from \(G\) to \(\mathbb{C}^\ast\) is called a *character* if \(|\gamma(x)|=1\) for all \(x \in G\). The *continuous* characters form a group \(\Gamma\), the *dual group* of \(G\).

If \(G=\mathbb{R}\), solving the equation \(\gamma(x+y)=\gamma(x)\gamma(y)\) under the constraint \(|\gamma|=1\) shows that every continuous character is of the form \(\gamma_t(x)=e^{itx}\) for some \(t \in \mathbb{R}\), so \(\Gamma \cong \mathbb{R}\).

With this we have our generalised version of the Fourier transform. Let \(G\) be a LCA group with Haar measure \(dx\); the **Fourier transform** of \(f \in L^1(G)\) is given by \[\hat{f}(\gamma) = \int_G f(x)\gamma(-x)dx, \quad \gamma \in \Gamma.\] One can intuitively verify that this recovers the classical Fourier transform when \(G=\mathbb{R}\) and \(\gamma_t(x)=e^{itx}\).
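For a finite group the abstract transform is computable directly. A numpy sketch for \(G=\mathbb{Z}/n\), whose characters are \(\gamma_k(x)=e^{2\pi i kx/n}\), showing the abstract definition agrees with the usual DFT:

```python
import numpy as np

# On the finite group G = Z/n, the characters are
# gamma_k(x) = exp(2*pi*i*k*x/n), and the abstract Fourier transform
# f_hat(gamma_k) = sum_x f(x) * gamma_k(-x) is exactly the DFT.
n = 8
rng = np.random.default_rng(0)
f = rng.standard_normal(n)

x = np.arange(n)
fhat = np.array([np.sum(f * np.exp(-2j*np.pi*k*x/n)) for k in range(n)])

assert np.allclose(fhat, np.fft.fft(f))
```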

Indeed, with this identification, we can also identify \(\Gamma\) with the maximal ideal space of \(L^1(G)\), and \(\hat{f}\) with the Gelfand transform of \(f\).

If \(G\) is discrete, then \(\Gamma\) is compact; if \(G\) is compact, then \(\Gamma\) is discrete.

*Proof.* If \(G\) is discrete, then \(L^1(G)\) has a unit. The maximal ideal space of a unital commutative Banach algebra is compact in the weak\(\ast\) topology, and it can be identified with \(\Gamma\); hence \(\Gamma\) is compact.

If \(G\) is compact, then its Haar measure can be normalised so that \(m(G)=1\).

Besides, the compactness of \(G\) implies the constant function \(f \equiv 1\) is in \(L^1(G)\). As a result, \(\hat{f}(\gamma)=1\) when \(\gamma\) is the trivial character, but \(\hat{f}(\gamma)=0\) for every other \(\gamma\). Since \(\hat{f}\) is continuous on \(\Gamma\), the set \(\{\gamma: |\hat{f}(\gamma)|>\frac{1}{2}\}\) is open and contains only the trivial character; by translation every singleton in \(\Gamma\) is open, hence \(\Gamma\) is discrete. \(\square\)

If characters of \(G\) are linearly independent, then they are pairwise distinct, but what about the converse? Dedekind answered this question affirmatively. But his approach is rather complicated: it needed determinants. However, Artin found a neat way to do it:

Theorem (Dedekind-Artin)Let \(M\) be a monoid and \(K\) a field. Let \(\chi_1,\dots,\chi_n\) be distinct characters of \(M\) in \(K\). Then they are linearly independent over \(K\).

*Proof.* Suppose this is false. Among all nontrivial relations \[a_1\chi_1+\cdots+a_n\chi_n=0\] take one with the smallest number of nonzero coefficients; after renumbering we may assume all \(a_i \neq 0\) and that \(n \ge 2\) is minimal. Since \(\chi_1 \neq \chi_2\), pick \(z \in M\) with \(\chi_1(z) \neq \chi_2(z)\). Evaluating the relation at \(zx\) gives \[a_1\chi_1(z)\chi_1(x)+\cdots+a_n\chi_n(z)\chi_n(x)=0\] for all \(x \in M\). Multiplying the original relation by \(\chi_1(z)\) and subtracting, we obtain \[\sum_{i=2}^{n} a_i(\chi_1(z)-\chi_i(z))\chi_i = 0,\] a relation among \(n-1\) characters whose \(i=2\) coefficient \(a_2(\chi_1(z)-\chi_2(z))\) is nonzero, contradicting minimality. \(\square\)

As an application, we consider the following result.

Let \(\alpha_1,\cdots,\alpha_n\) be distinct non-zero elements of a field \(K\). If \(a_1,\cdots,a_n\) are elements of \(K\) such that for all integers \(v \ge 0\) we have \[a_1\alpha_1^v + \cdots + a_n\alpha_n^v = 0,\] then \(a_i=0\) for all \(i\).

*Proof.* Consider the \(n\) distinct characters \(\chi_i: v \mapsto \alpha_i^v\) of the additive monoid \(\mathbb{Z}_{\ge 0}\) in \(K\). They are distinct because the \(\alpha_i\) are, so by the Dedekind-Artin theorem they are linearly independent over \(K\), which is exactly the assertion. \(\square\)
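Numerically, the corollary reflects the invertibility of a Vandermonde matrix (a numpy sketch with arbitrarily chosen distinct \(\alpha_i\)):

```python
import numpy as np

# For distinct nonzero alpha_i, the matrix [alpha_i^v] (v = 0..n-1)
# is Vandermonde, hence of full rank -- matching the corollary that
# sum_i a_i * alpha_i^v = 0 for all v forces every a_i = 0.
alphas = np.array([1.0, 2.0, 3.0, -1.5])
V = np.vander(alphas, increasing=True).T  # rows indexed by the power v

assert np.linalg.matrix_rank(V) == len(alphas)
```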

The linear independence of characters gives us a good chance of studying the relation between field extensions and Galois groups.

Hilbert's Theorem 90 (Modern Version)Let \(K/k\) be a Galois extension with Galois group \(G\); then \(H^1(G,K^\ast)=1\) and \(H^1(G,K)=0\). That is to say, the first cohomology group is trivial for both the additive and the multiplicative structure.

It may look confusing but the classic version is about cyclicextensions (\(K/k\) is cyclic if it isGalois and the Galois group is cyclic).

Hilbert's Theorem 90 (Classic Version, Multiplicative Form)Let \(K/k\) be cyclic of degree \(n\) with Galois group \(G\) generated by \(\sigma\). Then \[\frac{\ker N}{(1/\sigma)A} \cong 1\] where \(A=K^\ast\), \((1/\sigma)A\) consists of all elements of the form \(\alpha/\sigma(\alpha)\) with \(\alpha \in A\), and \(N(\beta)\) is the norm of \(\beta \in K\) over \(k\).

This corresponds to the statement that \(H^1(G,K^\ast)=1\): for a cyclic group, a \(1\)-cocycle is determined by its value \(\beta\) at \(\sigma\), the cocycle condition forces \(N(\beta)=1\), and being a coboundary means \(\beta=\gamma/\sigma(\gamma)\).

Hilbert's Theorem 90 (Classic Version, Additive Form)Let \(K/k\) be cyclic of degree \(n\) with Galois group \(G\) generated by \(\sigma\). Then \[\frac{\ker \Tr}{(1-\sigma){A}} \cong 0\] where \(A=K\), \((1-\sigma)A\) consists of all elements of the form \((1-\sigma)(\alpha)=\alpha-\sigma(\alpha)\) with \(\alpha \in A\), and \(\Tr(\beta)\) is the trace of \(\beta \in K\) over \(k\).

This corresponds to, of course, the statement that \(H^1(G,K)=0\). To define these cohomology groups, consider the category **\(G\)-mod** of left \(G\)-modules. The set of morphisms of two objects \(A\) and \(B\) is written \(\operatorname{Hom}_G(A,B)\). The *cohomology groups of \(G\) with coefficients in \(A\)* are the right derived functors of \(\operatorname{Hom}_G(\mathbb{Z},-)\): \[H^\ast (G,A) \cong\operatorname{Ext}^\ast_{\mathbb{Z}[G]}(\mathbb{Z},A).\] It follows that \(H^0(G,A) \cong \operatorname{Hom}_G(\mathbb{Z},A) = A^G = \{a \in A: ga=a \text{ for all } g \in G\}\). In particular, if \(G\) is trivial, then \(\operatorname{Hom}_G(\mathbb{Z},-)\) is exact and therefore \(H^n(G,A)=0\) for all \(n \ge 1\).

which is to say \(\alpha_\tau = \gamma/\tau\gamma\); that is, the cocycle \((\alpha_\tau)\) is a coboundary, and therefore \(H^1(G,K^\ast)=1\).

For the additive form, take \(\theta \in K\setminus \ker \Tr\), so \(\Tr(\theta) \neq 0\). Given a cocycle \((\alpha_\tau)\), put \[\gamma = \frac{1}{\Tr(\theta)}\sum_{\tau \in G}\alpha_\tau \tau(\theta);\] a direct computation using the cocycle relation \(\alpha_{\sigma\tau}=\alpha_\sigma+\sigma(\alpha_\tau)\) then shows \(\alpha_\sigma=\gamma-\sigma(\gamma)\), so \(H^1(G,K)=0\).

*Additive form.* Pick any \(\theta \in K\) with \(\Tr(\theta) \neq 0\); such \(\theta\) exists because \(\Tr\) is not identically zero, by linear independence of characters.

Conversely, assume \(\beta=\alpha-\sigma(\alpha)\) for some \(\alpha \in K\); then \(\Tr(\beta)=\Tr(\alpha)-\Tr(\sigma(\alpha))=0\), since \(\Tr \circ \sigma = \Tr\).

*Multiplicative form.* This can be done in a quite similar setting. For any \(\beta\) with \(N(\beta)=1\), one produces \(\alpha\) with \(\beta=\alpha/\sigma(\alpha)\) by the same averaging trick, using the linear independence of the characters \(1,\sigma,\dots,\sigma^{n-1}\).

Consider the extension \(\mathbb{Q}(i)/\mathbb{Q}\), which is cyclic of degree \(2\) with Galois group generated by complex conjugation; the norm is \(N(x+yi)=x^2+y^2\). Applying the multiplicative Hilbert 90 to elements of norm \(1\) yields:

TheoremIntegers \(x,y,z\) satisfy the Diophantine equation \(x^2+y^2=z^2\) if and only if \((x,y,z)\) is proportional to \((m^2-n^2,2mn,m^2+n^2)\) for some integers \(m,n\).
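A quick computational check of the parametrisation (a Python sketch; the helper `triple` is ours):

```python
# Hilbert 90 for Q(i)/Q parametrises norm-1 elements as
# (m+ni)/(m-ni); clearing denominators gives the classical
# Pythagorean parametrisation below.
def triple(m, n):
    return (m*m - n*n, 2*m*n, m*m + n*n)

for m in range(1, 6):
    for n in range(1, m):
        x, y, z = triple(m, n)
        assert x*x + y*y == z*z
```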

This can be generalised to all Diophantine equations of the form \(x^2+Axy+By^2=Cz^2\) for some nonzero constant \(C\) and constants \(A,B\).

The additive form is a good friend of characteristic \(p\):

Theorem (Artin-Schreier)Let \(k\) be a field of characteristic \(p\) and \(K/k\) a cyclic extension of degree \(p\). Then \(K=k(\alpha)\), where \(\alpha\) is the zero of a polynomial \(X^p-X-a\) for some \(a \in k\).

*Proof.* Note the Galois group is cyclic of order \(p\), generated by some \(\sigma\). Since \(\Tr(-1)=p\cdot(-1)=0\), the additive Hilbert 90 gives \(\alpha \in K\) with \(-1=\alpha-\sigma(\alpha)\), i.e. \(\sigma(\alpha)=\alpha+1\). In particular \(\alpha \notin k\), so \(K=k(\alpha)\) because \([K:k]=p\) is prime. Finally, \(\sigma(\alpha^p-\alpha)=(\alpha+1)^p-(\alpha+1)=\alpha^p-\alpha\), so \(a:=\alpha^p-\alpha\) is fixed by the Galois group and hence lies in \(k\), and \(\alpha\) is a zero of \(X^p-X-a\). \(\square\)

For the case when the characteristic is \(0\), see the references below.
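Over the prime field itself the Artin-Schreier polynomial is genuinely irreducible for every nonzero \(a\); this can be checked with sympy (a sketch factoring over \(\mathbb{F}_5\)):

```python
import sympy as sp

x = sp.symbols('x')

# Over F_p, the Artin-Schreier polynomial X^p - X - a is irreducible
# for every nonzero a: its roots are alpha, alpha+1, ..., alpha+p-1,
# none of which lie in F_p (since t^p - t = 0 for t in F_p).
p = 5
for a in range(1, p):
    poly = sp.Poly(x**p - x - a, x, modulus=p)
    factors = poly.factor_list()[1]
    assert len(factors) == 1 and factors[0][1] == 1  # irreducible
```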

- Serge Lang, *Algebra, Revised Third Edition*.
- Charles A. Weibel, *An Introduction to Homological Algebra*.
- Noam D. Elkies, *Pythagorean triples and Hilbert's Theorem 90*. (https://abel.math.harvard.edu/~elkies/Misc/hilbert.pdf)
- Jose Capco, *The Two Artin-Schreier Theorems*. (https://www3.risc.jku.at/publications/download/risc_5477/the_two_artin_schreier_theorems__jcapco.pdf)
- Walter Rudin, *Fourier Analysis on Groups*.

In fact, the \(\mathbb{R}^n\) case can be generalised to any locally compact abelian group (see any abstract harmonic analysis book); this is because what really matters here is being locally compact and abelian. But at this moment we stick to Euclidean spaces.

To read this post you need to be familiar with some basic properties of Banach algebras, complex Borel measures, and most importantly, Fubini's theorem.

The norm on \(M(\mathbb{R}^n)\), the space of complex Borel measures on \(\mathbb{R}^n\), is the *total variation*: \[\lVert \mu \rVert = |\mu|(\mathbb{R}^n) = \sup\sum_{i=1}^{\infty}|\mu(E_i)|\] the supremum being taken over all countable partitions \(\{E_i\}\) of \(\mathbb{R}^n\) into measurable sets.

Note each measure in \(M(\mathbb{R}^n)\) induces a bounded linear functional on \(C_0(\mathbb{R}^n)\) via \(f \mapsto \int f d\mu\).

Conversely, every bounded linear functional on \(C_0(\mathbb{R}^n)\) arises this way: this is the Riesz representation theorem, and it identifies \(M(\mathbb{R}^n)\) isometrically with \(C_0(\mathbb{R}^n)^\ast\).

A vector space \(V\) over a field \(\mathbb{F}\) is called an algebra if there is an \(\mathbb{F}\)-bilinear form \[B:V \times V \to V.\] It is a Banach algebra if \(V\) itself is a Banach space and the bilinear form is associative and submultiplicative, i.e. \(\lVert B(x,y) \rVert \le \lVert x \rVert \lVert y \rVert\).

The convolution of measures is defined in the style of convolution of functions, in a natural sense. For any Borel set \(E\), put \(E_2=\{(x,y): x+y \in E\}\) and define \[(\mu \ast \lambda)(E) = (\mu \times \lambda)(E_2).\]

First, we need to show that \(E_2\) is Borel. In fact, we have \[\chi_{E_2}(x,y) = \chi_E(x+y).\] Since \(E\) is Borel, we see \(\chi_E\) is Borel. Meanwhile \((x,y) \mapsto x+y\) is continuous, so \(\chi_{E_2}\) is the composition of a Borel function with a continuous map, hence Borel; therefore \(E_2\) is Borel.

Next, is \(\mu \ast \lambda\) an element of \(M(\mathbb{R}^n)\)? For any Borel set \(E\), the value of \((\mu \ast \lambda)(E)\) is finite, and countable additivity follows from that of the product measure \(\mu \times \lambda\); moreover \(\lVert \mu \ast \lambda \rVert \le \lVert \mu \rVert \lVert \lambda \rVert\).

For any \(f \in C_0(\mathbb{R}^n)\), we have a linear functional \[\Phi:f \mapsto \iint f(x+y)d\mu(x)d\lambda(y).\] By the Riesz representation theorem, there exists a unique measure \(\nu\) such that \(\Phi(f)=\int f d\nu\) for all \(f \in C_0(\mathbb{R}^n)\), and this \(\nu\) is exactly \(\mu \ast \lambda\): \[\iint f(x+y)d\mu(x)d\lambda(y) = \int fd(\mu \ast \lambda).\]

Next we show that \(\ast\) is associative. It can be carried out by Riesz's theorem. Put \(\nu_1=(\mu \ast \lambda) \ast \sigma\) and \(\nu_2=\mu \ast (\lambda \ast \sigma)\). For every \(f \in C_0(\mathbb{R}^n)\), Fubini's theorem gives \[\int f d\nu_1 = \iiint f(x+y+z)d\mu(x)d\lambda(y)d\sigma(z) = \int f d\nu_2,\] so \(\nu_1=\nu_2\) by the uniqueness part of Riesz's theorem.

To conclude, \(M(\mathbb{R}^n)\) is a commutative Banach algebra. Even better, this space has a unit, which is customarily called the **Dirac measure**. Let \(\delta\) be the point mass at the origin: \(\delta(E)=1\) if \(0 \in E\) and \(\delta(E)=0\) otherwise. Then \(\mu \ast \delta = \delta \ast \mu = \mu\) for every \(\mu \in M(\mathbb{R}^n)\).
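A discrete analogue makes the unit property tangible (a numpy sketch, treating finitely supported measures on \(\mathbb{Z}_{\ge 0}\) as weight vectors):

```python
import numpy as np

# Discrete analogue on Z: measures are weight sequences and
# convolution is np.convolve. The Dirac measure (unit mass at 0)
# acts as the unit: mu * delta = mu.
mu = np.array([0.2, 0.5, 0.3])   # a discrete measure on {0, 1, 2}
delta = np.array([1.0])          # unit mass at 0

assert np.allclose(np.convolve(mu, delta), mu)

# Convolution of Diracs translates: delta_1 * delta_1 = delta_2.
delta_1 = np.array([0.0, 1.0])   # unit mass at 1
out = np.convolve(delta_1, delta_1)
assert np.allclose(out, [0.0, 0.0, 1.0])  # unit mass at 2
```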

A measure \(\mu\) is said to be *discrete* if there is a countable set \(E\) such that \(\mu(A)=\mu(A \cap E)\) for all measurable sets \(A\) (in general we say \(\mu\) is concentrated on \(E\)). \(\mu\) is said to be *continuous* if \(\mu(A)=0\) whenever \(A\) only contains a single point. We write \(\mu \ll \lambda\), and say \(\mu\) is *absolutely continuous* with respect to \(\lambda\), if \(\lambda(A)=0 \implies \mu(A)=0\).

We now play some games between continuous and discrete measures. First, we study the subspace of discrete measures, which we write \(M_d(\mathbb{R}^n)\).

For convolution, things are a little trickier. Suppose \(\mu\) is concentrated on a countable set \(E\) and \(\lambda\) on a countable set \(F\); then \(\mu \ast \lambda\) is concentrated on the countable set \(E+F\), so the convolution of two discrete measures is discrete.

Next, we focus on the subspace of continuous measures, which we write \(M_c(\mathbb{R}^n)\).

Next, suppose \(\mu \ll m\) and \(m(E)=0\). We see \(m(E-t)=0\) for every \(t\) by translation invariance of \(m\), hence \(\mu(E-t)=0\), and therefore \((\mu \ast \lambda)(E)=\int \mu(E-t)d\lambda(t)=0\). That is, \(\mu \ast \lambda \ll m\) whenever \(\mu \ll m\): the absolutely continuous measures form an ideal.

Finally, we consider the Radon-Nikodym derivatives of absolutely continuous measures (the derivative exists, and is unique almost everywhere). If \[\mu(E) = \int_E fdm, \quad \lambda(E) = \int_E gdm,\] then \(\mu \ast \lambda\) coincides with \(f \ast g\) in the following sense: \[\begin{aligned}(\mu \ast \lambda)(E) &= \int_{\mathbb{R^n}} \mu(E-t)d\lambda(t) \\ &=\int_{\mathbb{R^n}}\left(\int_{E}f(x-t)dm(x) \right)g(t)dm(t) \\ &= \int_E\int_{\mathbb{R}^n}f(x-t)g(t)dm(t)dm(x) \\ &= \int_E (f \ast g)dm\end{aligned}\] In other words, we have \(d(\mu \ast \lambda) = (f \ast g)dm\). Through this, we have established an algebra isomorphism \(M_{ac}(\mathbb{R}^n) \cong L^1(\mathbb{R}^n,m)\).
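The identity \(d(\mu \ast \lambda)=(f \ast g)dm\) has an elementary discrete counterpart (a numpy sketch with two probability densities of our own choosing):

```python
import numpy as np

# Discrete sanity check of d(mu * lambda) = (f * g) dm: convolving the
# densities of two probability measures gives the density of the sum
# of independent samples.
f = np.array([0.5, 0.5])        # fair coin on {0, 1}
g = np.array([0.25, 0.75])      # biased coin on {0, 1}

fg = np.convolve(f, g)          # density of the sum, on {0, 1, 2}
assert np.allclose(fg, [0.125, 0.5, 0.375])
assert np.isclose(fg.sum(), 1.0)   # total mass is multiplicative (1*1)
```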

\(L^1(\mathbb{R}^n,m)\) would be a unital Banach algebra, but the unit is missing. However, one can embed it into \(M(\mathbb{R}^n)\), which does have a unit, as the subalgebra \(M_{ac}(\mathbb{R}^n)\).

Let \(G\) be the group of invertible elements of \(M=M(\mathbb{R})\), and \(G_1\) be the component of \(G\) that contains the unit \(\delta\).