<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Desvl&#39;s blog</title>
  <icon>https://desvl.xyz/icon.png</icon>
  
  <link href="https://desvl.xyz/atom.xml" rel="self"/>
  
  <link href="https://desvl.xyz/"/>
  <updated>2026-01-11T22:44:50.756Z</updated>
  <id>https://desvl.xyz/</id>
  
  <author>
    <name>Desvl</name>
    
  </author>
  
  <generator uri="https://hexo.io/">Hexo</generator>
  
  <entry>
    <title>On the Boyd–Deninger polynomial x+1/x+y+1/y+1, pt. I - The curve</title>
    <link href="https://desvl.xyz/2025/12/30/boyd-deninger/"/>
    <id>https://desvl.xyz/2025/12/30/boyd-deninger/</id>
    <published>2025-12-29T23:00:00.000Z</published>
    <updated>2026-01-11T22:44:50.756Z</updated>
    
    <content type="html"><![CDATA[<h2 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h2><p>Throughout, let $P(x,y)=x+\frac{1}{x}+y+\frac{1}{y}+1$. This Laurent polynomial is one of the Boyd-Deninger polynomials, which have been extensively studied for their connections to Mahler measures and special values of $L$-functions. </p><p>The polynomial $P(x,y)$ is special for several reasons. First of all, the projective closure of $P(x,y)=0$ is an elliptic curve $E$ of conductor $15$ and the curve does not admit complex multiplication (we will show this in this post). Second, this polynomial is related to Beilinson’s conjecture (we will justify this in another post). As one can prove, there is a satisfyingly simple relation between the Mahler measure of $P$ and the $L$-function of the curve $E$:</p><script type="math/tex; mode=display">m(P)=L'(E,0)= \frac{15}{4\pi^2}L(E,2).</script><p>In this post, we will do the preparation: the geometry of $P$ and $E$, and the determination of the Deninger path of $P$, an important subset of $E$, which will be useful for future study.</p><p>The reader is expected to know the basic arithmetic of elliptic curves and the basics of projective curves. </p><h2 id="The-geometry-of-the-polynomial"><a href="#The-geometry-of-the-polynomial" class="headerlink" title="The geometry of the polynomial"></a>The geometry of the polynomial</h2><p>Let $Z_P$ be the zero locus of $P(x,y)=x+\frac{1}{x}+y+\frac{1}{y}+1$ in $(\mathbb{C}^\times)^2=\operatorname{Spec}(\mathbb{C}[X_1,X_1^{-1},X_2,X_2^{-1}])$. 
We would like to know its properties in terms of a curve, before moving on to the study of the projective closure of $Z_P$ in $\mathbb{P}^2(\mathbb{C})$.</p><blockquote><p><strong>Proposition 1.1</strong> The algebraic set $Z_P$ is a smooth algebraic curve.</p></blockquote><p><em>Proof.</em> We notice that $\frac{\partial P}{\partial x}(x,y)=1-\frac{1}{x^2}$, $\frac{\partial P}{\partial y}(x,y)=1-\frac{1}{y^2}$. A singular point of $P$ must satisfy $\frac{\partial P}{\partial x}(x,y)=\frac{\partial P}{\partial y}(x,y)=P(x,y)=0$, and the only possible candidates of $(x,y) \in (\mathbb{C}^\times)^2$ are $(1,1)$, $(1,-1)$, $(-1,-1)$ and $(-1,1)$, where the values of $P$ are never $0$, therefore $P$ does not admit singular points. $\square$</p><p>Let $E=\overline{Z_P}$ be the projective closure of $Z_P$ in $\mathbb{P}^2(\mathbb{C})$. We will show that $E$ is an elliptic curve of conductor $15$.</p><blockquote><p><strong>Proposition 1.2.</strong> The following statements are true:</p><ol><li>$E$ is a non-singular cubic defined over $\mathbb{Q}$.</li><li>$E$ has a rational point, hence is an elliptic curve defined over $\mathbb{Q}$.</li><li>$E \setminus Z_P$ is a subgroup of $E(\mathbb{Q})$ isomorphic to $\mathbb{Z}/4\mathbb{Z}$.</li><li>$E$ has conductor $15$.</li><li>$E$ does not admit complex multiplication.</li><li>$E(\mathbb{Q}) \cong \mathbb{Z}/4\mathbb{Z}$.</li></ol></blockquote><p>A few comments are in order before the proof. As a matter of fact we are looking at the compactification of $Z_P$, which is, according to the third point, $Z_P$ with $4$ points added. Besides, we are allowed to use the modularity theorem to construct a map $X_1(15) \to E$ to enable us to study the polynomial in the domain of modular forms. 
The modularity theorem is nevertheless overkill; later we can give a more explicit construction.</p><p><em>Proof.</em> We notice that the homogenization of $xyP(x,y)$ is</p><script type="math/tex; mode=display">\widetilde{P}(X,Y,Z)=X^2Y+YZ^2+XY^2+XZ^2+XYZ.</script><p>In other words, $E$ is the zero locus of $\widetilde{P}(X,Y,Z)$ in $\mathbb{P}^2(\mathbb{C})$. We can see indeed that $E$ is a cubic curve defined over $\mathbb{Q}$. We need to see if $E$ is smooth. To do this, we search for singular points. </p><p>We have</p><script type="math/tex; mode=display">\begin{cases}\frac{\partial\widetilde{P}}{\partial X}(X,Y,Z) &=  2XY+Y^2+Z^2+YZ \\\frac{\partial\widetilde{P}}{\partial Y}(X,Y,Z) &=X^2+Z^2+2XY+XZ \\\frac{\partial\widetilde{P}}{\partial Z}(X,Y,Z) &= 2YZ + 2XZ+XY\end{cases}</script><p>We can verify the first statement on each affine chart $U_X=\{X \ne 0\}$, $U_Y=\{Y \ne 0\}$ and $U_Z=\{Z\ne 0\}$.</p><p>On $U_X$, a singular point $[1:Y:Z]$ on $E$ satisfies</p><script type="math/tex; mode=display">\begin{cases}Y+YZ^2+Y^2+Z^2+YZ&=0 \\2Y+Y^2+Z^2+YZ &= 0 \\1+Z^2+2Y+Z&=0 \\2YZ+2Z+Y&=0\end{cases}</script><p>We will show that this system of equations has no solution. By subtracting the second equation from the first, we get</p><script type="math/tex; mode=display">Y(Z^2-1)=0</script><p>If $Y=0$, then $2YZ+2Z+Y=0$ becomes $Z=0$, which is absurd because then $1+Z^2+2Y+Z=1=0$. Therefore we can only have $Z^2=1$, i.e. $Z=1$ or $Z=-1$. The case where $Z=1$ implies that $2Y+3=0$ and $3Y+2=0$, which is again a contradiction. Finally, the case where $Z=-1$ implies that $2Y+1=0$ and $-2-Y=0$, which is still not possible. Conclusion: the four equations cannot hold at the same time, therefore on $U_X$ there is no singular point. The absence of singular points on $U_Y$ and $U_Z$ is verified in the same manner, so we omit the details. 
To conclude, $E$ is a smooth curve, hence irreducible (see proposition 5.1 of <a href="https://agag-gathmann.math.rptu.de/class/curves-2023/curves-2023-c5.pdf">this document</a>).</p><p>To see that $E$ is an elliptic curve, we notice that $\widetilde{P}(1,0,0)=0$, so $E$ is a smooth cubic curve that admits a rational point $P_1=[1:0:0]$, and is thus an elliptic curve.</p><p>One also finds some other simple points that lie on $E$, namely $P_2=[0:0:1]$, $P_3=[1:-1:0]$ and $P_4=[0:1:0]$. In fact, we have $E \setminus Z_P = \{P_1,P_2,P_3,P_4\}$. To see this, we notice that $Z_P \subset U_Z$ and for all points on $Z_P$, the coordinates $X$ and $Y$ cannot be zero. Thus $P_2 \in U_Z$ is the only point of $E \cap U_Z$ that is not on $Z_P$. On $\mathbb{P}^2(\mathbb{C}) \setminus U_Z$, the points on $E$ satisfy $X^2Y+XY^2=XY(X+Y)=0$, the only solutions of which are $P_1,P_3$ and $P_4$. </p><p>Moving on, we study the structure of $E \setminus Z_P$ as a group. Indeed, the unit element of $E(\mathbb{Q})$ can be set as $P_1=[1:0:0]$. Recall that the sum of two points $A$ and $B$ is then obtained in two steps: take the third intersection point $C$ of $E$ with the line through $A$ and $B$ (the tangent line if $A=B$), then take the third intersection point of $E$ with the line through $P_1$ and $C$. To calculate the sum $P_2+P_2$, we first of all calculate the tangent line of $E$ at $P_2$. Indeed, the gradient of $\widetilde{P}$ at $P_2$ is $(1,1,0)$, therefore the tangent line is $L_2=\{X+Y=0\}$, and we have $E \cap L_2 = \{X^2Z=0\} \cap \{X+Y = 0\}=\{[1:-1:0],[0:0:1]\}$. Therefore the other point of $E$ on $L_2$ is $P_3=[1:-1:0]$. The point $2P_2$ is therefore the third intersection point of $E$ and the line containing $P_1$ and $P_3$. We notice that the line containing $P_1$ and $P_3$ is $L_{13}=\{Z=0\}$, which also contains $P_4$, so that $L_{13}=L_{14}$. We see that $L_{14} \cap E=\{X^2Y+XY^2=0\} \cap \{Z=0\}=\{P_1,P_3,P_4\}$, which implies that $2P_2=P_4$. </p><p>We can calculate $2P_4$ in the same manner. Indeed, the tangent line of $E$ at $P_4$ is $L_4=\{X=0\}$, and $L_4 \cap E = \{YZ^2=0\} \cap \{X=0\}=\{P_4,P_2\}$. Therefore $2P_4$ is the third intersection point of $E$ and the line passing through $P_2$ and $P_1$, which is $L_{12}=\{Y=0\}$. 
We see that $L_{12} \cap E = \{XZ^2=0\}\cap \{Y=0\}=\{P_1,P_2\}$. Notice that $L_{12}$ is tangent to $E$ at $P_1$, therefore we have $2P_4 = P_1$.</p><p>The direct calculation of $3P_2$ can be tricky so we circumvent it by considering $P_2+P_4$. Indeed, the line passing through $P_2$ and $P_4$ is $L_{24}=L_4=\{X=0\}$, which is tangent to $E$ at $P_4$. Therefore the sum $P_2+P_4$ is collinear with $P_1$ and $P_4$, which is exactly $P_3$ since we have shown that $L_{14} \cap E = \{P_1,P_3,P_4\}$.</p><p>To conclude, with $P_1$ being chosen as the unit, we have $2P_2 = P_4$, $3P_2 = P_2+P_4 = P_3$ and $4P_2 = 2P_4 = P_1$. Therefore $E \setminus Z_P$ is a cyclic group of order $4$ generated by $P_2$ as expected.</p><p>Finally, we see why the conductor of $E$ is $15$. The Weierstrass form of $E$ is $y^2 + xy + y = x^3 + x^2$. This form can be found by a change of variable $(X,Y,Z) \mapsto (X+Y+Z,-Y,-X)$ as we have</p><script type="math/tex; mode=display">\widetilde{P}(X+Y+Z,-Y,-X)=X^{3} + X^{2} Z - X Y Z - Y^{2} Z - Y Z^{2}.</script><p>From the Weierstrass form one computes the discriminant $\Delta = -15$, so the primes of bad reduction are $3$ and $5$. The reduction is multiplicative at both primes, and therefore the conductor is $N=15$. To see this, one can simply use the first two steps of <a href="https://en.wikipedia.org/wiki/Tate%27s_algorithm">Tate’s algorithm</a> working on prime numbers $p=3$ and $5$. One can also simply use SageMath to carry out the calculation. The code can be found after the proof.</p><p>For the fifth statement, one computes (with the help of SageMath) that $j(E)=-\frac{1}{15}$, which is <strong>not</strong> an algebraic integer, therefore $E$ does not admit complex multiplication. See Theorem II.6.1. of <a href="https://link.springer.com/book/10.1007/978-1-4612-0851-8">this book</a>.</p><p>The proof that $E(\mathbb{Q})$ is isomorphic to $\mathbb{Z}/4\mathbb{Z}$ relies on consulting a table. 
All rational elliptic curves of conductor $15$ have rank $0$ (a curve whose rank is $\ge 1$ has conductor $\ge 37$; see <a href="https://wstein.org/rank4/intro.html#citations">this page</a> for an explanation; see <a href="https://johncremona.github.io/book/fulltext/table4.pdf">this document</a> for the table), therefore $E(\mathbb{Q})$ is a finite abelian group. To show that $E(\mathbb{Q})$ is exactly $\mathbb{Z}/4\mathbb{Z}$, we need to check the reduction of $E$ modulo primes $p$ where $E$ has good reduction, because in this case, we have $E(\mathbb{Q}) \subset E(\mathbf{F}_p)$ (Theorem VII.3.1 of Silverman’s The Arithmetic of Elliptic Curves). The prime numbers that we will use are $7$ and $23$. Indeed, we see that $E(\mathbf{F}_7) \cong \mathbb{Z}/8\mathbb{Z}$ so $\mathbb{Z}/4\mathbb{Z} \subset E(\mathbb{Q}) \subset \mathbb{Z}/8\mathbb{Z}$. Therefore $E(\mathbb{Q})$ is either $\mathbb{Z}/4\mathbb{Z}$ or $\mathbb{Z}/8\mathbb{Z}$. Assume that $E(\mathbb{Q}) \cong \mathbb{Z}/8\mathbb{Z}$, then $E(\mathbb{Q}) \subset E(\mathbf{F}_{23}) \cong \mathbb{Z}/12\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$ would imply that $\mathbb{Z}/12\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$ contains an element of order $8$, which is absurd because every element of this group has order dividing $12$. Therefore we must have $E(\mathbb{Q}) \cong \mathbb{Z}/4\mathbb{Z}$. $\square$</p><p><strong>Remark.</strong> It is worth noting that we have proved that there is no rational point on $Z_P$. In other words, the equation $x+\frac{1}{x}+y+\frac{1}{y}+1=0$ has no rational solution.</p><p>The following code block contains a minimal implementation of the elliptic curve $E$ in SageMath. If the reader does not have SageMath on their device, it is recommended to run the code on an online server like <a href="https://sagecell.sagemath.org/">SageMathCell</a>. 
It is also recommended to check the graph of $P(x,y)$ with the help of for instance <a href="https://www.geogebra.org/calculator/tfrpejdn">Geogebra</a>.</p> <figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment"># Sage implementation on finding the Weierstrass form and the conductor</span></span><br><span class="line"><span class="comment"># SageMath version 10.6, Release Date: 2025-03-31</span></span><br><span class="line"><span class="comment"># Python version 3.12.11</span></span><br><span class="line"><span class="comment"># ========================</span></span><br><span class="line"><span class="comment"># 1. 
Define the polynomial and the point</span></span><br><span class="line">R.&lt;X, Y, Z&gt; = PolynomialRing(QQ, <span class="number">3</span>)</span><br><span class="line">P = X^<span class="number">2</span>*Y + X*Y^<span class="number">2</span> + X*Y*Z + X*Z^<span class="number">2</span> + Y*Z^<span class="number">2</span></span><br><span class="line">P1 = [<span class="number">1</span>, <span class="number">0</span>, <span class="number">0</span>]</span><br><span class="line">E = EllipticCurve_from_cubic(P,P1)</span><br><span class="line"><span class="built_in">print</span>(E)</span><br><span class="line"><span class="comment"># === Output begins ===</span></span><br><span class="line"><span class="comment"># Scheme morphism:</span></span><br><span class="line"><span class="comment">#  From: Projective Plane Curve over Rational Field defined by X^2*Y + X*Y^2 + X*Y*Z + X*Z^2 + Y*Z^2</span></span><br><span class="line"><span class="comment">#  To:   Elliptic Curve defined by y^2 + x*y + y = x^3 + x^2 over Rational Field</span></span><br><span class="line"><span class="comment">#  Defn: Defined on coordinates by sending (X : Y : Z) to</span></span><br><span class="line"><span class="comment">#        (-Z : X : X + Y + Z)</span></span><br><span class="line"><span class="comment"># === Output ends ===</span></span><br><span class="line"><span class="comment"># 2. The change of variable that yields the Weierstrass normal form, i.e. 
the inverse of (X:Y:Z) -&gt; (-Z:X:X+Y+Z)</span></span><br><span class="line">total_map = E.defining_polynomials()</span><br><span class="line">M = matrix(QQ,[[p.coefficient(v) <span class="keyword">for</span> v <span class="keyword">in</span> [X,Y,Z]] <span class="keyword">for</span> p <span class="keyword">in</span> total_map])</span><br><span class="line">M_inv = M.inverse()</span><br><span class="line">new_vars = M_inv*vector([X,Y,Z])</span><br><span class="line"><span class="built_in">print</span>(<span class="string">&quot;The change of variable sends (X,Y,Z) to&quot;</span>,new_vars)</span><br><span class="line"><span class="built_in">print</span>(<span class="string">&quot;The Weierstrass normal form can be found after the change of variable of F:&quot;</span>, P(new_vars[<span class="number">0</span>],new_vars[<span class="number">1</span>],new_vars[<span class="number">2</span>]))</span><br><span class="line"><span class="comment"># === Output begins ===</span></span><br><span class="line"><span class="comment"># The change of variable sends (X,Y,Z) to (Y, X - Y + Z, -X)</span></span><br><span class="line"><span class="comment"># The Weierstrass normal form can be found after the change of variable of F: X^3 + X^2*Z + X*Y*Z - Y^2*Z + Y*Z^2</span></span><br><span class="line"><span class="comment"># === Output ends ===</span></span><br><span class="line"><span class="comment"># 3. 
Important values of the elliptic curve</span></span><br><span class="line">E_curve = E.codomain()</span><br><span class="line"><span class="built_in">print</span>(<span class="string">&quot;The discriminant of E is&quot;</span>,E_curve.discriminant())</span><br><span class="line"><span class="built_in">print</span>(<span class="string">&quot;The j-invariant of E is&quot;</span>, E_curve.j_invariant())</span><br><span class="line"><span class="built_in">print</span>(<span class="string">&quot;The conductor of E is&quot;</span>, E_curve.conductor())</span><br><span class="line"><span class="comment"># === Output begins ===</span></span><br><span class="line"><span class="comment"># The discriminant of E is -15</span></span><br><span class="line"><span class="comment"># The j-invariant of E is -1/15</span></span><br><span class="line"><span class="comment"># The conductor of E is 15</span></span><br><span class="line"><span class="comment"># === Output ends ===</span></span><br></pre></td></tr></table></figure><h2 id="The-Deninger-path-of-the-polynomial"><a href="#The-Deninger-path-of-the-polynomial" class="headerlink" title="The Deninger path of the polynomial"></a>The Deninger path of the polynomial</h2><p>The <strong>Deninger path</strong> of a polynomial $Q(x,y) \in \mathbb{C}[x^{\pm},y^{\pm}]$ is defined to be</p><script type="math/tex; mode=display">\gamma(Q)=\{(x,y)\in(\mathbb{C}^{\times})^2:Q(x,y)=0,|x|=1,|y|>1\}.</script><p>The reason why we are interested in such a circle is that we want to learn the value of the Mahler measure of a polynomial, which is defined in the following manner for $Q$:</p><script type="math/tex; mode=display">m(Q)=  \frac{1}{(2\pi i)^2}\iint_{|x|=|y|=1}\log|Q(x,y)|\frac{dx}{x}\frac{dy}{y}.</script><p>This integral is not easy to study. To make our lives easier, we try to integrate $\log|Q(x,y)|$ with one variable at a time. 
Let $Q^\ast(x)$ be the leading coefficient of $Q(x,y)$ viewed as a Laurent polynomial in $y$ with coefficients in $\mathbb{C}[x^{\pm}]$. Then whenever $Q^\ast(x) \ne 0$, we have, by <a href="https://en.wikipedia.org/wiki/Jensen%27s_formula">Jensen’s formula</a>,</p><script type="math/tex; mode=display">\frac{1}{2\pi i}\int_{|y|=1}\log|Q(x,y)|\frac{dy}{y}=\log|Q^\ast(x)|+\sum_{\substack{|y|>1 \\Q(x,y)=0}}\log|y|.</script><p>We then integrate the value above with respect to $x$, only to find</p><script type="math/tex; mode=display">\begin{aligned}m(Q) &= \frac{1}{2\pi i}\int_{|x|=1}\log|Q^\ast(x)|\frac{dx}{x}+\frac{1}{2\pi i}\int_{\gamma(Q)}\log|y|\frac{dx}{x} \\    &= m(Q^\ast)+\frac{1}{2\pi i}\int_{\gamma(Q)}\log|y|\frac{dx}{x}\end{aligned}</script><p>Therefore the integration path $\gamma(Q)$ appears naturally. In this part, we want to find the Deninger path of $P(x,y)=x+\frac{1}{x}+y+\frac{1}{y}+1$.</p><blockquote><p><strong>Proposition 2.</strong> The Deninger path $\gamma(P)$ is given by</p><script type="math/tex; mode=display">\gamma(P)= \left\{(e^{i\theta},Y(\theta)):\theta\in\left(-\frac{\pi}{3},\frac{\pi}{3}\right),\,Y(\theta)=-\cos\theta-\frac{1}{2}-\sqrt{\left(\cos\theta+\frac{1}{2}\right)^2-1}\right\}.</script></blockquote><p><em>Proof.</em> The determination of the path is surprisingly elementary. It is routine to verify that such $(e^{i\theta},Y(\theta))$ lie on $\gamma(P)$ so it suffices to verify the converse. Let $(x,y) \in \gamma(P)$. Since $|x|=1$, there exists $\theta \in \mathbb{R}/2\pi\mathbb{Z}$ such that $x=e^{i\theta}$. We can pick $\theta$ in such a manner that $-\pi \le \theta \le \pi$. As a result,</p><script type="math/tex; mode=display">P(x,y)=2\cos\theta+y+\frac{1}{y}+1=0,</script><p>which implies that</p><script type="math/tex; mode=display">y^2+(2\cos\theta+1)y+1=0,</script><p>and $y$ has to be one of the solutions. 
Before finding the solutions, we notice that in order that $(x,y) \in \gamma(P)$, we must have $\operatorname{Im}(y)=0$. If not, then $y \ne \overline{y}$ and both $y$ and $\overline{y}$ are roots of $y^2+(2\cos\theta+1)y+1$, which implies that $|y|^2=y\overline{y}=1$ according to Vieta’s formula, contradicting $|y|>1$. </p><p>It follows that $\Delta = (2\cos\theta+1)^2-4 \ge 0$, i.e. $|2\cos\theta+1| \ge 2$. Since $2\cos\theta+1 \ge -1$, this forces $\cos\theta \ge \frac{1}{2}$, which implies that $-\frac{\pi}{3}\le \theta \le \frac{\pi}{3}$. The boundary cases have to be excluded because if $\theta=\pm \frac{\pi}{3}$ then $2\cos\theta+1=2$ and the double solution to $y^2+2y+1=0$ is $y=-1$, whose absolute value is not greater than $1$.</p><p>Therefore we must have $-\frac{\pi}{3}&lt; \theta &lt; \frac{\pi}{3}$ and solving $y^2+(2\cos\theta+1)y+1=0$ yields</p><script type="math/tex; mode=display">y = -\sqrt{\left(\cos\theta+\frac{1}{2}\right)^2-1}-\left(\cos\theta+\frac{1}{2}\right),</script><p>and the other root is discarded because its absolute value is smaller than $1$. To conclude, points on $\gamma(P)$ are exactly of the form $(e^{i\theta},Y(\theta))$. $\square$</p><p>We terminate the post before it becomes way too long. In a future post, we will explain how to compute $m(P)$ using the method of modular forms. </p><h2 id="Addendum-determine-the-conductor-manually"><a href="#Addendum-determine-the-conductor-manually" class="headerlink" title="Addendum : determine the conductor manually"></a>Addendum : determine the conductor manually</h2><p>We show that the conductor of $E$ is $15=3 \cdot 5$ manually. Since $\Delta(E)=-15=-3\cdot 5$, we only need to study the singularity of the reduction of $E$ modulo $3$ and $5$. For the definition of conductor, please see <a href="https://planetmath.org/conductorofanellipticcurve">PlanetMath</a>.</p><p>The question is to determine the type of singularity of $E(\mathbf{F}_3)$ and $E(\mathbf{F}_5)$. Since these two curves have only finitely many points, we cannot visualise the singularity. Therefore we have to determine the type of singularity algebraically. 
Recall that the type of a singular point of a cubic curve is determined by its tangent lines: if the two tangent lines coincide, then the singular point is a cusp and the reduction is additive. Otherwise the singular point is a node and the reduction is multiplicative. With the point given, the sole factor that can determine the tangent line is the slope, which brings us to the following observation.</p><p>Let $f(x,y)=y^2+a_1xy+a_3y-x^3-a_2x^2-a_4x-a_6$ be a Weierstrass equation of a cubic curve $E':f(x,y)=0$ over a field $K$. Assume that $(x_0,y_0)$ is the singular point of $E'$, then $f(x,y)$ admits a Taylor expansion </p><script type="math/tex; mode=display">f(x,y)=f(x,y)-f(x_0,y_0)=[(y-y_0)-\alpha(x-x_0)][(y-y_0)-\beta(x-x_0)]-(x-x_0)^3</script><p>where $\alpha,\beta\in \overline{K}$. In fact, $y-y_0=\alpha(x-x_0)$ and $y-y_0=\beta(x-x_0)$ should be considered as the tangent lines of $E'$ at $(x_0,y_0)$. If $\alpha=\beta$ then we have a cusp and otherwise we have a node. We remind the reader that it is possible that $\alpha,\beta\not\in K$. Besides, we look for only one singular point as there is at most one. See <a href="https://math.stackexchange.com/questions/3505935/any-irreducible-cubic-affine-curve-has-at-most-1-singular-point">this discussion</a>.</p><p>For our curve $E$, we have $f(x,y)=y^2 + xy + y -x^3- x^2$. First of all we determine the slopes $\alpha$ and $\beta$ in the case where $K=\mathbf{F}_3$. To find the singular point, we notice that in $\mathbf{F}_3$ we have</p><script type="math/tex; mode=display">\begin{cases}\frac{\partial f}{\partial x} = y +x \\\frac{\partial f}{\partial y} = x-y+1\end{cases}</script><p>and solving $f=\frac{\partial f}{\partial x}=\frac{\partial f}{\partial y}=0$ yields $(x_0,y_0)=(1,-1)$ as the singular point of $E(\mathbf{F}_3)$. 
We have therefore the Taylor expansion</p><script type="math/tex; mode=display">f(x,y)=[(y+1)-\alpha(x-1)][(y+1)-\beta(x-1)]-(x-1)^3.</script><p>As a matter of fact we do not have to expand all the terms. Notice that</p><script type="math/tex; mode=display">f(x,y)=(y+1)^2-(\alpha+\beta)(x-1)(y+1)+\alpha\beta(x-1)^2-(x-1)^3.</script><p>If we check the coefficients of $xy$ and $x^2$ (the term $-(x-1)^3$ contributes $3x^2$, which vanishes in $\mathbf{F}_3$), we see</p><script type="math/tex; mode=display">\alpha+\beta=-1,\,\alpha\beta=-1</script><p>and therefore $\alpha$ and $\beta$ are the roots of the polynomial $\varphi(X)=X^2+X-1$. The discriminant of $\varphi$ is $1+4=5=2$ in $\mathbf{F}_3$. However since $\left(\dfrac{2}{3}\right)=2^{1}=-1$, we see that the solutions of $\varphi(X)=0$ do not lie in $\mathbf{F}_3$. Denote by $\sqrt{2}$ an element of $\mathbf{F}_9$ that satisfies $X^2-2=0$. Then without loss of generality we can put $\alpha= 1-\sqrt{2}$ and $\beta=1+\sqrt{2}$. Since $\alpha \ne \beta$, $E(\mathbf{F}_3)$ admits a nodal singular point.</p><p>We can follow the same procedure with $\mathbf{F}_5$, but this time the position of the singular point matters. Let $(x_0,y_0)$ be the singular point of $E(\mathbf{F}_5)$, then</p><script type="math/tex; mode=display">f(x,y)=(y-y_0)^2-(\alpha+\beta)(x-x_0)(y-y_0)+\alpha\beta(x-x_0)^2-(x-x_0)^3.</script><p>Comparing the coefficients of $xy$ and $x^2$ again, where the term $-(x-x_0)^3$ now contributes $3x_0x^2$, we find that</p><script type="math/tex; mode=display">\alpha+\beta=-1,\,\alpha\beta=-1-3x_0.</script><p>We will see that $x_0=3$, so that $\alpha\beta=-10=0$ in $\mathbf{F}_5$, and we can put $\alpha=0$ and $\beta=-1$. Since $\alpha \ne \beta$, $E(\mathbf{F}_5)$ admits a nodal singular point, as long as such a point exists! We nonetheless need to determine the point. 
To this end, we compute</p><script type="math/tex; mode=display">\begin{cases}\frac{\partial f}{\partial x} = y-3x^2-2x\\\frac{\partial f}{\partial y} = 2y+x+1\end{cases}</script><p>and it is not difficult to see that the point $(x_0,y_0)$ that satisfies $\frac{\partial f}{\partial x}=\frac{\partial f}{\partial y}=f=0$ in $\mathbf{F}_5$ is $(3,3)$. Both reductions are therefore nodal, i.e. multiplicative, and the conductor is $N=3\cdot 5=15$.</p><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><p>Other than the linked documents in text, this post is based on <a href="https://www.cambridge.org/core/books/many-variations-of-mahler-measures/29DB6CD1A87B356AD304DED9ECC9F4EE">Many Variations of Mahler Measures, <em>A Lasting Symphony</em></a> by <a href="https://perso.ens-lyon.fr/francois.brunault/">François Brunault</a> and <a href="https://www.math.ru.nl/~zudilin/">Wadim Zudilin</a>.</p>]]></content>
    
    
    <summary type="html">In this post we study the Boyd-Deninger polynomial P(x,y)=x+1/x+y+1/y+1. In particular, we are interested in the elliptic curve that is defined by it.</summary>
    
    
    
    <category term="Geometry" scheme="https://desvl.xyz/categories/Geometry/"/>
    
    <category term="Algebraic Geometry" scheme="https://desvl.xyz/categories/Geometry/Algebraic-Geometry/"/>
    
    <category term="Elliptic Curve" scheme="https://desvl.xyz/categories/Geometry/Elliptic-Curve/"/>
    
    
    <category term="Elliptic Curve" scheme="https://desvl.xyz/tags/Elliptic-Curve/"/>
    
    <category term="SageMath" scheme="https://desvl.xyz/tags/SageMath/"/>
    
  </entry>
  
  <entry>
    <title>Boolean ring and algebraic numbers</title>
    <link href="https://desvl.xyz/2025/10/13/boolean/"/>
    <id>https://desvl.xyz/2025/10/13/boolean/</id>
    <published>2025-10-13T14:06:01.000Z</published>
    <updated>2025-12-31T15:33:49.467Z</updated>
    
    <content type="html"><![CDATA[<h2 id="Boolean-ring"><a href="#Boolean-ring" class="headerlink" title="Boolean ring"></a>Boolean ring</h2><p>Let $B$ be a commutative ring with unity. We say that $B$ is a <strong>Boolean ring</strong> if $x^2=x$ for all $x \in B$. The name “Boolean” is certainly reminiscent of bool values in programming, or in general, of the <strong>Boolean algebra</strong> that is frequently used in logic, digital electronics and computer science. </p><p>In this post, we will examine Boolean rings on the level of commutative algebra, followed by an explicit example in algebraic number theory.</p><h2 id="Basic-properties-of-Boolean-rings"><a href="#Basic-properties-of-Boolean-rings" class="headerlink" title="Basic properties of Boolean rings"></a>Basic properties of Boolean rings</h2><p>Throughout, let $B$ be a Boolean ring.</p><blockquote><p><strong>Proposition 1.</strong> In the Boolean ring $B$, we have</p><ol><li>$2x=0$ for all $x \in B$.</li><li>Every prime ideal $\mathfrak{p} \subset B$ is maximal, and $B/\mathfrak{p}$ is a field with two elements.</li><li>Every finitely generated ideal in $B$ is principal.</li></ol></blockquote><p><em>Proof.</em> For 1, notice that</p><script type="math/tex; mode=display">2x=(2x)^2=4x^2=4x \implies 2x=0.</script><p>For 2, it suffices to show that for every prime ideal $\mathfrak{p} \subset B$, we have $B/\mathfrak{p} \cong \mathbb{Z}/2\mathbb{Z}$.</p><p>Pick $x \in B \setminus \mathfrak{p}$. Then in $B/\mathfrak{p}$ we have $\overline{x}^2=\overline{x}$, where $\overline{x}=x+\mathfrak{p} \in B/\mathfrak{p}$. Therefore $\overline{x}(\overline{x}-\overline{1})=0$. However, since $B/\mathfrak{p}$ is an integral domain and $\overline{x} \ne \overline{0}$ because $x \not\in \mathfrak{p}$, we must have $\overline{x}=\overline{1}$. Therefore there are exactly two elements in $B/\mathfrak{p}$, namely $\overline{0}$ and $\overline{1}$.</p><p>For 3, we use induction on the number of generators. 
If $\mathfrak{a}$ is generated by one element, there is nothing to prove. If $\mathfrak{a}=(x,y)$, then we set</p><script type="math/tex; mode=display">u = x+y+xy.</script><p>This element is interesting because, using $x^2=x$, $y^2=y$ and $2xy=0$,</p><script type="math/tex; mode=display">ux = x^2+xy+x^2y=x+2xy=x,\, uy=xy+y^2+xy^2 = y+2xy=y.</script><p>Therefore for all elements $a=rx+sy$, we have</p><script type="math/tex; mode=display">a = r(ux)+s(uy)=(rx+sy)u.</script><p>Since $u \in \mathfrak{a}$ as well, we have $\mathfrak{a}=(u)=(x+y+xy)$.</p><p>Suppose now we have proved that all ideals generated by $n$ elements are principal. Then for an ideal generated by $n+1$ elements, let’s say $\mathfrak{a}=(x_1,\dots,x_n,x_{n+1})$, for an element</p><script type="math/tex; mode=display">a=a_1x_1+\dots+a_nx_n+a_{n+1}x_{n+1} \in \mathfrak{a},</script><p>the induction hypothesis gives an element $y_{n+1} \in \mathfrak{a}$ with $(x_1,\dots,x_n)=(y_{n+1})$, so that $a_1x_1+\dots+a_nx_n=b_{n+1}y_{n+1}$ for some $b_{n+1} \in B$, and if we set $u_{n+1}=x_{n+1}+y_{n+1}+x_{n+1}y_{n+1}$, then by the case of two generators</p><script type="math/tex; mode=display">a = a_{n+1}x_{n+1}+b_{n+1}y_{n+1}=(a_{n+1}x_{n+1}+b_{n+1}y_{n+1})u_{n+1}</script><p> and therefore $\mathfrak{a}=(u_{n+1})$ as expected. $\square$</p><p>In particular, every prime ideal of $B$ is maximal, so we see immediately that $\dim B = 0$, where $\dim$ denotes the Krull dimension. Besides, if $B$ is noetherian, then every ideal of $B$ is principal. We should notice however that a Boolean ring $B$ is not necessarily noetherian. For example </p><script type="math/tex; mode=display">\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z} \times \cdots</script><p>is Boolean but not noetherian because we can consider the chain of ideals</p><script type="math/tex; mode=display">I_n = \underbrace{\mathbb{Z}/2\mathbb{Z} \times \cdots \times \mathbb{Z}/2\mathbb{Z}}_{n \mathrm{\,times}} \times \{0\}\times \{0\} \times \cdots</script><p>and $I_1 \subset I_2 \subset \cdots$ is not a stationary chain.</p><p>Next we investigate the topology of $\operatorname{Spec}B$. 
Basic knowledge of the Zariski topology is assumed.</p><blockquote><p><strong>Proposition 2.</strong> Let $X=\operatorname{Spec}B$ and let $X_f=X \setminus V(f)$, where $V(f)=\{\mathfrak{p} \in X:f \in \mathfrak{p}\}$, be the basic open sets of $X$ (recall that an open subset of $X$ is quasi-compact if and only if it is a finite union of sets $X_f$). Then </p><ol><li>For each $f \in B$, the set $X_f$ is both open and closed in $X$.</li><li>Let $f_1,\dots,f_n \in B$, then $X_{f_1} \cup \cdots \cup X_{f_n}=X_f$ for some $f \in B$.</li><li>The sets $X_f$ are the only subsets of $X$ which are both open and closed.</li><li>$X$ is a compact Hausdorff space.</li></ol></blockquote><p><em>Proof.</em> By definition $X_f$ is indeed open. To show that $X_f$ is closed, it suffices to show that $V(f)$ is always open. To do this, we use the fact that $B$ is Boolean, i.e. $f^2=f$ for all $f \in B$. We see immediately that</p><script type="math/tex; mode=display">V(f) \cup V(1-f) = V(f(1-f))=V(f-f^2)=V(0)=X</script><p>and on the other hand,</p><script type="math/tex; mode=display">V(f) \cap V(1-f) = V(f,1-f)=V(1)=\varnothing.</script><p>In other words, $X_f = X \setminus V(f) = V(1-f)$ is always closed.</p><p>For 2, we can simply use the identity $X_f=V(1-f)$ proved above. Indeed, </p><script type="math/tex; mode=display">X_{f_1} \cup \cdots \cup X_{f_n} = V(1-f_1) \cup \cdots \cup V(1-f_n) = V((1-f_1)\cdots(1-f_n))=V(1-F(f_1,\dots,f_n))</script><p>where $F(f_1,\dots,f_n)=1-(1-f_1)\cdots(1-f_n) \in B$ is a finite sum of products of $f_1,\dots,f_n$ and is the element $f$ that we were looking for.</p><p>For 3, we pick an open and closed set $Y \subset X$. Since $Y$ is open, we can write $Y = \bigcup_{i \in I}X_{f_i}$ for some index set $I$. Since $Y$ is closed in the quasi-compact space $X$, we see that $Y$ is quasi-compact, and therefore the index set $I$ can be chosen to be finite. By 2, there is therefore an $f \in B$ such that $Y=X_f$.</p><p>Finally, $X$ is quasi-compact, like the spectrum of any ring, so it remains to show that $X$ is Hausdorff. 
Indeed, if $\mathfrak{p},\mathfrak{q} \in X$ with $\mathfrak{p} \ne \mathfrak{q}$, then without loss of generality we can assume that there exists $x \in \mathfrak{p}$ such that $x \not\in \mathfrak{q}$. We see then $\mathfrak{p} \in V(x)$ and $\mathfrak{q} \in V(1-x)$, and both $V(x)$ and $V(1-x)$ are open, while $V(x) \cap V(1-x)=\varnothing$. $\square$</p><h2 id="Boolean-ring-coming-from-a-ring-of-integers"><a href="#Boolean-ring-coming-from-a-ring-of-integers" class="headerlink" title="Boolean ring coming from a ring of integers."></a>Boolean ring coming from a ring of integers.</h2><p>Let $K$ be a number field and let $\mathcal{O}_K$ be the ring of integers of $K$. One might hope that $\mathcal{O}_K=\mathbb{Z}[x]$ for some $x \in \mathcal{O}_K$. For example, if $d \in \mathbb{Z}\setminus\{0,1\}$ is a square-free integer, and if we set $K=\mathbb{Q}(\sqrt{d})$, then</p><script type="math/tex; mode=display">\mathcal{O}_K=\begin{cases}\mathbb{Z}\left[\frac{1+\sqrt{d}}{2}\right],& d\equiv 1\pmod{4}, \\\mathbb{Z}[\sqrt{d}],& d\equiv 2,3\pmod{4}.\end{cases}</script><p>So now we pose a question: if we consider $K=\mathbb{Q}(\sqrt{-7},\sqrt{17})$, then does there exist $x \in \mathcal{O}_K$ such that $\mathcal{O}_K = \mathbb{Z}[x]$?</p><p>Instead of trying to find such an $x$ manually, we will answer this question in a general setting.</p><blockquote><p><strong>Proposition 3.</strong> Let $m,n \in \mathbb{Z}\setminus\{0,1\}$ be distinct integers, square-free, such that $m \equiv n \equiv 1 \pmod{4}$. If we put $K = \mathbb{Q}(\sqrt{m},\sqrt{n})$, then $\mathcal{O}_K = \mathbb{Z} \oplus \mathbb{Z}\alpha \oplus \mathbb{Z}\beta\oplus\mathbb{Z}\alpha\beta$, where $\alpha=\frac{1+\sqrt{n}}{2}$ and $\beta = \frac{1+\sqrt{m}}{2}$.</p></blockquote><p><em>Proof.</em> First of all we notice that $[K:\mathbb{Q}]=4$ and that $\{1,\alpha,\beta,\alpha\beta\}$ as well as $\{1,\sqrt{m},\sqrt{n},\sqrt{mn}\}$ are two $\mathbb{Q}$-bases of $K$. 
The Galois group $G(K/\mathbb{Q})$ is isomorphic to $\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$, generated by $\sigma:\sqrt{n} \mapsto -\sqrt{n}$ (fixing $\sqrt{m}$) and $\gamma:\sqrt{m} \mapsto -\sqrt{m}$ (fixing $\sqrt{n}$). Therefore</p><script type="math/tex; mode=display">d(1,\alpha,\beta,\alpha\beta)=\begin{vmatrix}1 & \alpha & \beta &\alpha\beta \\1 & \sigma(\alpha) & \beta & \sigma(\alpha)\beta \\1 & \alpha & \gamma(\beta) &\alpha\gamma(\beta) \\1 & \sigma(\alpha) & \gamma(\beta) & \sigma(\alpha)\gamma(\beta)\end{vmatrix}^2 = m^2n^2.</script><p>We next want to show that</p><script type="math/tex; mode=display">\Gamma=\mathbb{Z} \oplus \mathbb{Z}\alpha \oplus \mathbb{Z}\beta\oplus\mathbb{Z}\alpha\beta \subset \mathcal{O}_K \subset \Gamma'= \frac{1}{4}\left(\mathbb{Z} \oplus \mathbb{Z}\sqrt{m} \oplus \mathbb{Z}\sqrt{n}\oplus\mathbb{Z}\sqrt{mn}\right).</script><p>On one hand, $\alpha,\beta \in \mathcal{O}_K$ should be clear. Notice that</p><script type="math/tex; mode=display">\alpha^2-\alpha=\frac{1+n+2\sqrt{n}-2\sqrt{n}-2}{4}=\frac{n-1}{4} \in \mathbb{Z}</script><p>and </p><script type="math/tex; mode=display">\beta^2-\beta=\frac{m-1}{4} \in \mathbb{Z}.</script><p>Therefore if we put $f(X)=X^2-X-\frac{n-1}{4}$, then $f(\alpha)=0$. Likewise, if we put $g(X)=X^2-X-\frac{m-1}{4}$, then $g(\beta)=0$. The first inclusion $\mathbb{Z} \oplus \mathbb{Z}\alpha \oplus \mathbb{Z}\beta\oplus\mathbb{Z}\alpha\beta \subset \mathcal{O}_K$ is then proved. </p><p>On the other hand, pick an arbitrary $x = a+b\sqrt{n}+c\sqrt{m}+d\sqrt{mn} \in \mathcal{O}_K \subset K$. We know in the first place that $a,b,c,d\in\mathbb{Q}$. However, we notice that</p><script type="math/tex; mode=display">\operatorname{Tr}_{K/\mathbb{Q}(\sqrt{n})}(x)=2a+2b\sqrt{n}</script><p>is an algebraic integer, being the sum $x+\gamma(x)$ of two algebraic integers. At the same time, we have $2a+2b\sqrt{n} \in \mathbb{Q}(\sqrt{n})$. 
Therefore $2a+2b\sqrt{n} \in \mathcal{O}_{\mathbb{Q}(\sqrt{n})}=\mathbb{Z}\left[\frac{1+\sqrt{n}}{2}\right]$ (since $n\equiv 1\pmod{4}$). Therefore there exist $a',b'\in\mathbb{Z}$ such that</p><script type="math/tex; mode=display">2a+2b\sqrt{n}=a'+b'\frac{1+\sqrt{n}}{2}=a'+\frac{b'}{2}+\frac{b'}{2}\sqrt{n},</script><p>which implies that</p><script type="math/tex; mode=display">4b=b'\in \mathbb{Z},\, 4a=2a'+b' \in \mathbb{Z}.</script><p>Likewise, we see</p><script type="math/tex; mode=display">\operatorname{Tr}_{K/\mathbb{Q}(\sqrt{m})}(x)=2a+2c\sqrt{m}</script><p>and in the same way we can prove that $4c \in \mathbb{Z}$.</p><p>Finally,</p><script type="math/tex; mode=display">\operatorname{Tr}_{K/\mathbb{Q}(\sqrt{mn})}(x)=2a+2d\sqrt{mn} \in \mathcal{O}_{\mathbb{Q}(\sqrt{mn})}</script><p>from which it follows that $4d \in \mathbb{Z}$. We have therefore proved that</p><script type="math/tex; mode=display">\mathcal{O}_K \subset \frac{1}{4}\left(\mathbb{Z} \oplus \mathbb{Z}\sqrt{m} \oplus \mathbb{Z}\sqrt{n}\oplus\mathbb{Z}\sqrt{mn}\right).</script><p>Finally, since</p><script type="math/tex; mode=display">d(1,\sqrt{m},\sqrt{n},\sqrt{mn})=\begin{vmatrix}1 & \sqrt{m} & \sqrt{n} &\sqrt{mn} \\1 & -\sqrt{m} & \sqrt{n} & -\sqrt{mn} \\1 & \sqrt{m} & -\sqrt{n} & -\sqrt{mn} \\1 & -\sqrt{m} & -\sqrt{n} & \sqrt{mn}\end{vmatrix}^2=(16mn)^2,</script><p>we have</p><script type="math/tex; mode=display">d\left(\frac{1}{4},\frac{\sqrt{m}}{4},\frac{\sqrt{n}}{4},\frac{\sqrt{mn}}{4}\right)=\frac{4^4m^2n^2}{16^4}=\frac{m^2n^2}{4^4}.</script><p>If we consider the discriminant of $\mathcal{O}_K$, denoted by $\Delta_K$, then</p><script type="math/tex; mode=display">d(1,\alpha,\beta,\alpha\beta)=[\mathcal{O}_K:\Gamma]^2\Delta_K=m^2n^2 \implies \Delta_K|m^2n^2,</script><p>and at the same time,</p><script type="math/tex; mode=display">\Delta_K = [\Gamma':\mathcal{O}_K]^2 d\left(\frac{1}{4},\frac{\sqrt{m}}{4},\frac{\sqrt{n}}{4},\frac{\sqrt{mn}}{4}\right) = [\Gamma':\mathcal{O}_K]^2\frac{m^2n^2}{4^4} \implies m^2n^2 | 
4^4\Delta_K</script><p>However, since $m^2n^2$ is odd (as $m\equiv n \equiv 1 \pmod{4}$), we can only have</p><script type="math/tex; mode=display">\Delta_K=m^2n^2=d(1,\alpha,\beta,\alpha\beta),</script><p>which forces $\{1,\alpha,\beta,\alpha\beta\}$ to be a $\mathbb{Z}$-basis of $\mathcal{O}_K$. $\square$</p><hr><p>To answer our question, we restrict ourselves to the case $m \equiv n \equiv 1 \pmod{8}$. In this question we will see that the Boolean ring arises naturally.</p><blockquote><p><strong>Proposition 4.</strong> Let $m,n \in \mathbb{Z}\setminus\{0,1\}$ be distinct integers, square-free, such that $m \equiv n \equiv 1 \pmod{8}$. If we put $K = \mathbb{Q}(\sqrt{m},\sqrt{n})$, then there does not exist $t \in \mathcal{O}_K$ such that $\mathcal{O}_K = \mathbb{Z}[t]$.</p></blockquote><p><em>Proof.</em> The proposition invites us to try to write $\mathcal{O}_K$ as a quotient of a polynomial ring over $\mathbb{Z}$. As one can see easily,</p><script type="math/tex; mode=display">\mathcal{O}_K =\mathbb{Z} \oplus \mathbb{Z}\alpha \oplus \mathbb{Z}\beta\oplus\mathbb{Z}\alpha\beta \cong \mathbb{Z}[X,Y]/\left(X^2-X+\frac{1-n}{4},Y^2-Y+\frac{1-m}{4}\right)</script><p>where $\alpha=\frac{1+\sqrt{n}}{2}$ and $\beta=\frac{1+\sqrt{m}}{2}$ as above with the isomorphism induced by the map</p><script type="math/tex; mode=display">\begin{aligned}\mathbb{Z}[X,Y] &\to \mathcal{O}_K \\1 &\mapsto 1 \\X &\mapsto \alpha \\Y &\mapsto \beta\end{aligned}</script><p>Since in our question, $m\equiv n \equiv 1 \pmod{8}$, we see that $\frac{1-n}{4},\, \frac{1-m}{4} \in 2\mathbb{Z}$. Therefore, reducing modulo $2$, we obtain</p><script type="math/tex; mode=display">\mathcal{O}_K/2\mathcal{O}_K = \mathbf{F}_2[X,Y]/(X^2-X,Y^2-Y)</script><p>where $\mathbf{F}_2$ is the finite field of $2$ elements. Here, the ring $B=\mathcal{O}_K/2\mathcal{O}_K$ is a <strong>Boolean ring</strong>. Indeed, we can now even explicitly write down $\mathcal{O}_K/2\mathcal{O}_K$ as $\mathbf{F}_2[x,y]$ with $x^2=x$ and $y^2=y$. 
All elements of $\mathcal{O}_K/2\mathcal{O}_K$ can be identified as $a+bx+cy+dxy$ with $a,b,c,d\in\mathbf{F}_2$. There are $2^4=16$ elements in total, and it can be easily seen that $(a+bx+cy+dxy)^2=a+bx+cy+dxy$.</p><p>Since $B\cong \mathbf{F}_2[x,y]$ is Boolean, all prime ideals are maximal. There are exactly $4$ maximal ideals:</p><ol><li>$(x,y-1)=(y-1+xy)$</li><li>$(x-1,y)=(x-1+xy)$</li><li>$(x-1,y-1)=(xy+1)$</li><li>$(x,y)=(x+y+xy)$.</li></ol><p>For a homomorphism $\varphi:B \to \mathbf{F}_2$, we have $\varphi(0)=0$, $\varphi(1)=1$ so $\varphi$ is surjective. The kernel $\ker\varphi$ is therefore a maximal ideal. There are thus exactly $4$ homomorphisms $B \to \mathbf{F}_2$, which correspond to sending $x$ to $0$ and $y$ to $1$, sending $x$ to $1$ and $y$ to $0$, sending $x$ and $y$ to $1$ and finally sending $x$ and $y$ to $0$, respectively.</p><p>Now we show that we cannot pick $t \in \mathcal{O}_K$ such that $\mathcal{O}_K=\mathbb{Z}[t]$. To reach a contradiction, we suppose that such a $t$ exists. It follows that</p><script type="math/tex; mode=display">\mathcal{O}_K/2\mathcal{O}_K \cong \mathbf{F}_2[t] \cong\mathbf{F}_2[X]/(P)</script><p>where $P$ is a polynomial of degree $4$. However this is absurd because for a homomorphism</p><script type="math/tex; mode=display">\psi:\mathbf{F}_2[X]/(P) \to \mathbf{F}_2,</script><p>there are at most two possibilities: $\psi(X)=1$ or $\psi(X)=0$. However we have shown that $\mathcal{O}_K/2\mathcal{O}_K$ can be mapped onto $\mathbf{F}_2$ in $4$ ways. A contradiction. $\square$</p><p>Therefore unfortunately, for the number field $K=\mathbb{Q}(\sqrt{-7},\sqrt{17})$, we cannot find $x \in \mathcal{O}_K$ such that $\mathcal{O}_K=\mathbb{Z}[x]$.</p><h2 id="References"><a href="#References" class="headerlink" title="References"></a>References</h2><ul><li>M. F. Atiyah FRS, I. G. MacDonald, <em>Introduction to Commutative Algebra</em>.</li></ul>]]></content>
    
    
    <summary type="html">In this post, we study the Boolean ring and see how it can be used in algebraic number theory.</summary>
    
    
    
    <category term="Algebra" scheme="https://desvl.xyz/categories/Algebra/"/>
    
    <category term="Commutative Algebra" scheme="https://desvl.xyz/categories/Algebra/Commutative-Algebra/"/>
    
    <category term="Number Theory" scheme="https://desvl.xyz/categories/Number-Theory/"/>
    
    <category term="Algebraic Number Theory" scheme="https://desvl.xyz/categories/Number-Theory/Algebraic-Number-Theory/"/>
    
    
    <category term="Exercise solution" scheme="https://desvl.xyz/tags/Exercise-solution/"/>
    
    <category term="Atiyah-MacDonald" scheme="https://desvl.xyz/tags/Atiyah-MacDonald/"/>
    
  </entry>
  
  <entry>
    <title>Artin-Schreier Extensions</title>
    <link href="https://desvl.xyz/2025/05/16/artin-schreier/"/>
    <id>https://desvl.xyz/2025/05/16/artin-schreier/</id>
    <published>2025-05-16T14:55:17.000Z</published>
    <updated>2025-05-18T13:22:24.996Z</updated>
    
    <content type="html"><![CDATA[<h2 id="Recall"><a href="#Recall" class="headerlink" title="Recall"></a>Recall</h2><p>Throughout, let $K$ be a field of characteristic $p\ne 0$ and $E/K$ a cyclic extension of degree $p^{m-1}$ with $m &gt;1$. The algebraic closure $\overline K^\mathrm{a}$ and the separable algebraic closure $\overline K^{\mathrm{s}}$ are always fixed. We use $\mathbf{F}_p$ to denote the finite field of $p$ elements. </p><p>For proposition 2 in the post, let $G$ be the Galois group of the extension $\overline K^\mathrm{s}/K$ (that is, the projective limit of $\mathrm{Gal}(K'/K)$, with $K'$ running over all finite separable extensions of $K$; see <a href="/2020/07/21/profinite-group/">this post</a> for the definition of projective limit). The reader is expected to know how to induce a long exact sequence from a short exact sequence, for example from <a href="/2020/09/27/diagram-chasing/">this post</a>.</p><p>In <a href="/2021/10/10/Characters-in-Analysis-and-Algebra/">this post</a> (the reader is urged to make sure that he or she has understood the concept of characters and more importantly Hilbert’s theorem 90), we have shown that if $E/K$ is cyclic with $[E:K]=p$, then $E=K(x)$ where $x$ is the zero of a polynomial of the form $X^p-X-\alpha$ where $\alpha \in K$. In this belated post, we want to show that, <u>whenever it comes to a cyclic extension of degree $p^{m-1}$, we run into a polynomial of the form $X^p-X-\alpha$.</u> The theory behind it is called Artin-Schreier theory, which has its own (highly non-trivial) nature. 
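</p><p>To fix ideas, here is a small worked example of our own choosing (it is not taken from the posts cited above): take $K=\mathbf{F}_2$, so $p=2$, and $\alpha=1$. The polynomial $X^2-X-1=X^2+X+1$ has no root in $\mathbf{F}_2$, since</p><script type="math/tex; mode=display">0^2-0-1=1\ne 0,\quad 1^2-1-1=1\ne 0.</script><p>Hence, if $x$ is a root in $\overline K^\mathrm{a}$, then $K(x)=\mathbf{F}_4$ is of degree $2$ over $K$, the other root is $x+1$, and $\operatorname{Gal}(K(x)/K)$ is generated by $x \mapsto x+1$.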
</p><h2 id="Artin-Schreier-extensions"><a href="#Artin-Schreier-extensions" class="headerlink" title="Artin-Schreier extensions"></a>Artin-Schreier extensions</h2><blockquote><p><strong>Definition 1.</strong> An Artin-Schreier polynomial $A_\alpha(X) \in K[X]$ is of the form</p><script type="math/tex; mode=display">A_\alpha(X)=X^p-X-\alpha,\;\alpha\in K.</script></blockquote><p>An immediate property of Artin-Schreier polynomials that one should notice is the equation</p><script type="math/tex; mode=display">A_\alpha(X+Y)=A_\alpha(X)+A_\alpha(Y)-A_\alpha(0).</script><p>To see this, one should notice that in characteristic $p$ we have $(x+y)^p=x^p+y^p$.</p><p>With this equation we can easily show that</p><blockquote><p><strong>Proposition 1.</strong> If $A_\alpha(X)$ has a root in $K$, then all roots of $A_\alpha(X)$ are in $K$. Otherwise, $A_\alpha(X)$ is irreducible over $K$. In this case, let $x$ be a root of $A_\alpha(X)$, then $K(x)/K$ is a cyclic extension of degree $p$.</p></blockquote><p><em>Proof.</em> We suppose that $x \in K$ is a root of $A_\alpha(X)$. Then </p><script type="math/tex; mode=display">A_\alpha(x+1)=A_\alpha(x)+A_\alpha(1)-A_\alpha(0)=0-\alpha+\alpha=0.</script><p>Therefore, by induction, we see easily that $x, x+1, \cdots, x+p-1$ are roots of $A_\alpha(X)$, all of which are in $K$.</p><p>Now we suppose that $A_\alpha(X)$ has no root in $K$. Let $x \in \overline K$ be a root of $A_\alpha(X)$. Then in $\overline K[X]$, the polynomial will be written in the form</p><script type="math/tex; mode=display">A_\alpha(X)=\prod_{j=1}^{p}(X-x+j)</script><p>because, again due to the equation $A_\alpha(X+Y)=A_\alpha(X)+A_\alpha(Y)-A_\alpha(0)$, we can see that $x,x+1,\dots,x+p-1$ are roots of $A_\alpha$.</p><p>Suppose, for contradiction, that $A_\alpha$ is reducible, say $A_\alpha(X)=f(X)g(X)$ where $1 \le d=\deg f &lt; p$ and $f,g \in K[X]$. 
It follows that </p><script type="math/tex; mode=display">f(X)=\prod_{j=1}^{d}(X-x+n_j)</script><p>where $\{n_1,\dots,n_d\} \subset \{1,2,\cdots,p\}$. If we expand the polynomial above, we see</p><script type="math/tex; mode=display">f(X)=X^d+\left(\sum_{j=1}^{d}n_j-dx\right)X^{d-1}+\text{lower terms} \in K[X]</script><p>Therefore $\left(\sum_{j=1}^{d}n_j-dx\right) \in K$ which is absurd because, $d$ being invertible in $K$ (as $1 \le d &lt; p$), we then have $x \in K$. Therefore we see that $A_\alpha$ is irreducible.</p><p>To see that $K(x)/K$ is Galois, we first notice that this extension is normal: $K(x)$ contains all roots of $A_\alpha(X)$. This extension is separable because all roots of $A_\alpha(X)$, namely $x,x+1,\dots,x+p-1$, are pairwise distinct, i.e. $A_\alpha(X)$ has no multiple roots.</p><p>Finally, to see why the Galois group of $K(x)/K$ is cyclic, we look at the action of the Galois group $G$ on the roots of $A_\alpha(X)$. Since $A_\alpha(X)$ is irreducible, there exists $\sigma \in G$ such that $\sigma(x)=x+1$. We see easily that $\sigma^j(x)=x+j$, so $\sigma$ has order $p$ and generates $G$, which has order $p$. $\square$</p><p>The correspondence between extensions of degree $p$ and polynomials of the form $X^p-X-\alpha$ inspires us to consider them in a distinguished manner.</p><blockquote><p><strong>Definition 2.</strong> The field extension $E/K$ is called an Artin-Schreier extension if $E=K(x)$ for some $x \in E \setminus K$ such that $x^p-x\in K$.</p></blockquote><p>Consider the map $\wp:\overline K^\mathrm{s} \to \overline K^\mathrm{s}$ defined by $u \mapsto u^p-u$. We certainly want to find the deep relation between Artin-Schreier extensions of a given field $K$ and the map $\wp$. 
One key piece of information can be found through the following correspondence.</p><blockquote><p><strong>Proposition 2.</strong> There is an isomorphism $\operatorname{Hom}(G,\mathbf{F}_p) \cong K/\wp(K)$.</p></blockquote><p><em>Proof.</em> We first notice that $\wp$ is a $G$-homomorphism, that is, it commutes with the action of $G$ on $\overline K^\mathrm{s}$. Indeed, for any $x \in \overline K^\mathrm{s}$ and $g \in G$, we have</p><script type="math/tex; mode=display">\wp(gx)=(gx)^p-gx=g(x^p)-gx=g(x^p-x)=g\wp(x).</script><p>On the other hand, $\wp$ is surjective. Indeed, for any $a \in \overline{K}^\mathrm{s}$, the equation $X^p-X=a$ always has a solution in $\overline K^\mathrm{s}$ because the polynomial $X^p-X-a$ is separable.</p><p>We can also see that the kernel of $\wp$ is $\mathbf{F}_p$. This is because the splitting field of $X^p-X$ is the field of $p^1$ elements, which has to be $\mathbf{F}_p$ itself. Therefore we have obtained a short exact sequence</p><script type="math/tex; mode=display">0 \to \mathbf{F}_p\xrightarrow{\iota}\overline K^\mathrm{s} \xrightarrow{\wp}\overline K^\mathrm{s} \to 0</script><p>where $\iota$ is the embedding. Taking the long exact sequence of cohomology, noticing that, by the additive form of Hilbert’s Theorem 90, $H^1(G,\overline{K}^\mathrm{s})=0$, we have another exact sequence</p><script type="math/tex; mode=display">K \to K \to \operatorname{Hom}(G,\mathbf{F}_p)\to 0</script><p>where the first arrow is induced by $\wp$ and the second is the connecting homomorphism. Therefore we have $\operatorname{Hom}(G,\mathbf{F}_p) \cong K/\wp(K)$. One can explicitly show that there is a surjective map $K \to \operatorname{Hom}(G,\mathbf{F}_p)$ with kernel $\wp(K)$ that defines the isomorphism. For $c \in K$, one solves $x^p-x=c$, then $\varphi_c:g\mapsto g(x)-x$ is the desired map. The key ingredient of the verification involves the (infinite) Galois correspondence, but otherwise the verification is very tedious. 
We remark that for any $\varphi \in \operatorname{Hom}(G,\mathbf{F}_p)\setminus\{0\}$, if we put $H=\ker\varphi$, then $K^H/K$ is an Artin-Schreier extension with Galois group $G/H$ and on the other hand $H=\mathrm{Gal}(\overline K^\mathrm{s}/K^H)$. $\square$</p><h2 id="“Artin-Schreier-of-higher-order”"><a href="#“Artin-Schreier-of-higher-order”" class="headerlink" title="“Artin-Schreier of higher order”"></a>“Artin-Schreier of higher order”</h2><p>We conclude this post by showing that, under a certain condition, one can find an Artin-Schreier extension $L/E$ such that $L/K$ is cyclic of degree $p^m$.</p><blockquote><p><strong>Lemma 1.</strong> Let $\beta \in E$ be an element such that $\operatorname{Tr}_K^E(\beta)=1$, then there exists $\alpha \in E$ such that $\sigma(\alpha)-\alpha = \beta^p-\beta$, where $\sigma$ is the generator of $\operatorname{Gal}(E/K)$.</p></blockquote><p><em>Proof.</em> Notice that $\operatorname{Tr}_K^E(\beta^p)=\operatorname{Tr}_K^E(\beta)^p=1$, which implies that $\operatorname{Tr}_K^E(\beta^p-\beta)=0$. By the additive form of Hilbert’s theorem 90, such $\alpha$ exists. $\square$</p><blockquote><p><strong>Lemma 2.</strong> Let $\alpha$ be as in Lemma 1. The polynomial $f(X)=X^p-X-\alpha$ is irreducible over $E$; that is, let $\theta$ be a root of $f$, then $E(\theta)$ is an Artin-Schreier extension of $E$.</p></blockquote><p><em>Proof.</em> By contradiction, we suppose that $\theta \in E$ (by Proposition 1, this is the only way $f$ can fail to be irreducible over $E$). Then $\alpha=\theta^p-\theta$ and, by the choice of $\alpha$ in Lemma 1,</p><script type="math/tex; mode=display">\begin{aligned}\sigma\alpha-\alpha&=\sigma(\theta^p-\theta)-(\theta^p-\theta) \\                   &=(\sigma\theta-\theta)^p-(\sigma\theta-\theta) \\                   &=\beta^p-\beta\end{aligned}</script><p>which implies that</p><script type="math/tex; mode=display">(\sigma\theta-\theta-\beta)^p=\sigma\theta-\theta-\beta.</script><p>It follows that $\sigma\theta-\theta-\beta$ is a root of $g(X)=X^p-X$. 
This implies that $\sigma\theta-\theta-\beta\in\mathbf{F}_p \subset K$ and therefore, since $[E:K]=p^{m-1}$ is divisible by $p$,</p><script type="math/tex; mode=display">\operatorname{Tr}_K^E(\sigma\theta-\theta-\beta)=0.</script><p>However, since $\operatorname{Tr}_K^E(\sigma\theta)=\operatorname{Tr}_K^E(\theta)$, we have $\operatorname{Tr}_K^E(\sigma\theta-\theta)=0$ and finally</p><script type="math/tex; mode=display">0=\operatorname{Tr}_K^E(\sigma\theta-\theta-\beta)=\operatorname{Tr}_K^E(\sigma\theta-\theta)-\operatorname{Tr}_K^E(\beta)=-1</script><p>which is absurd. $\square$</p><blockquote><p><strong>Proposition 3.</strong> The field extension $K(\theta)/K$ is Galois, cyclic of degree $p^m$, and its Galois group is generated by an extension $\sigma^\ast$ of $\sigma$ such that</p><script type="math/tex; mode=display">\sigma^\ast(\theta)=\theta+\beta</script></blockquote><p><em>Proof.</em> First of all we show that $K(\theta)=E(\theta)$. Indeed, since $K \subset E$, we have $K(\theta) \subset E(\theta)$. However, since $\theta \not \in E$, we must have $K \subset E \subsetneq K(\theta)$. Therefore $p=[E(\theta):K(\theta)][K(\theta):E]$ with $[K(\theta):E]&gt;1$, which forces $E(\theta)$ to be exactly $K(\theta)$. </p><p>Let $h(X)$ be the minimal polynomial of $\theta$ over $K$, which is of degree $p^m$. We now give an explicit expression of $h$. Notice that since $f(X)$ is the minimal polynomial of $\theta$ over $E$, of degree $p$, we must have $f(X)|h(X)$. For any $k$, we see that $f^{\sigma^k}(X)|h^{\sigma^k}(X)$ too. However, since $\sigma$ fixes $K$, we must have $h^{\sigma^k}(X)=h(X)$, from which it follows that $f^{\sigma^k}(X)|h(X)$ for all $0 \le k \le p^{m-1}-1$. 
Since the degree of each $f^{\sigma^k}(X)$ is $p$, we obtain</p><script type="math/tex; mode=display">h(X)=\prod_{k=0}^{p^{m-1}-1}f^{\sigma^k}(X)=\prod_{k=0}^{p^{m-1}-1}(X^p-X-\sigma^k\alpha).</script><p>Knowing that $\theta$ is a root of $h$, we see that $\theta+\beta$ is a root of $h(X)$ too because</p><script type="math/tex; mode=display">f^\sigma(\theta+\beta)=(\theta+\beta)^p-(\theta+\beta)-\sigma\alpha=(\theta^p-\theta)+(\beta^p-\beta)-\sigma\alpha=\alpha+\underbrace{\sigma\alpha-\alpha}_{\text {Lemma 1}}-\sigma\alpha=0</script><p>and by induction we see that for $0 \le k \le p^{m-1}-1$, $f^{\sigma^k}(X)$ has a root of the form</p><script type="math/tex; mode=display">\theta+\beta+\sigma\beta+\cdots+\sigma^{k-1}\beta.</script><p>By Artin-Schreier, all roots of $f^{\sigma^k}(X)$ lie in $E(\theta)$ and therefore $h(X)$ splits in $E(\theta)$. Since $E(\theta)/E$ is separable and $E/K$ is separable, we see that $E(\theta)/K$ is separable as well, which means that $E(\theta)=K(\theta)$ is Galois over $K$. </p><p>To see why $K(\theta)/K$ is cyclic, we consider an automorphism $\sigma^\ast$ of $K(\theta)$ such that $\sigma^{\ast}|_E=\sigma$ and that $\sigma^\ast(\theta)=\theta+\beta$. It follows that $\sigma^\ast \in \operatorname{Gal}(K(\theta)/K)$ because its restriction to $K$, which is the restriction of $\sigma$ to $K$, is the identity. We see then for all $0 \le n \le p^{m}$, one has </p><script type="math/tex; mode=display">(\sigma^\ast)^n(\theta)=\theta+\beta+\sigma\beta+\cdots+\sigma^{n-1}\beta.</script><p>In particular,</p><script type="math/tex; mode=display">(\sigma^\ast)^{p^{m-1}}(\theta)=\theta+\operatorname{Tr}_K^E(\beta)=\theta+1,</script><p>from which it follows that $(\sigma^\ast)^{p^{m-1}}$ has order $p$, which implies that $\sigma^\ast$ has order $p^m$, thus the Galois group is generated by $\sigma^\ast$. 
$\square$</p><h2 id="References"><a href="#References" class="headerlink" title="References"></a>References</h2><ul><li>Jean-Pierre Serre, Local Fields (chapter X) (<a href="https://link.springer.com/book/10.1007/978-1-4757-5673-9">link</a>).</li><li>Serge Lang, Algebra, chapter VI (<a href="https://link.springer.com/book/10.1007/978-1-4613-0041-0">link</a>)</li></ul>]]></content>
    
    
    <summary type="html">We are interested in a special category of field extensions. Let $K$ be a field of characteristic $p &#92;ne 0$, we want to know the structure of an extension of $K$ of degree $p$. It turns out that behind it lies an Artin-Schreier polynomial of the form $X^p-X-&#92;alpha$.</summary>
    
    
    
    <category term="Algebra" scheme="https://desvl.xyz/categories/Algebra/"/>
    
    <category term="Field Theory" scheme="https://desvl.xyz/categories/Algebra/Field-Theory/"/>
    
    <category term="Galois Theory" scheme="https://desvl.xyz/categories/Algebra/Galois-Theory/"/>
    
    
    <category term="Exercise solution" scheme="https://desvl.xyz/tags/Exercise-solution/"/>
    
    <category term="Serge Lang" scheme="https://desvl.xyz/tags/Serge-Lang/"/>
    
  </entry>
  
  <entry>
    <title>Equivalent Conditions of Regular Local Rings of Dimension 1</title>
    <link href="https://desvl.xyz/2025/05/11/regular-local-ring/"/>
    <id>https://desvl.xyz/2025/05/11/regular-local-ring/</id>
    <published>2025-05-10T23:33:19.000Z</published>
    <updated>2025-05-11T12:06:56.090Z</updated>
    
    <content type="html"><![CDATA[<h2 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h2><p>Regular local rings are important objects in modern algebra, number theory and algebraic geometry. Therefore it would be way too ambitious to try to briefly justify the motivation of studying regular local rings. In this post, we try to collect equivalent conditions of being a regular local ring of dimension $1$ and prove them. There are plenty of equivalent conditions and it is difficult to find a book that collects as many of them as possible, let alone gives detailed proofs. The reader is also encouraged to prove the conditions himself, after knowing that the most important tool in the proof is Nakayama’s lemma.</p><h3 id="Discrete-valuation-ring"><a href="#Discrete-valuation-ring" class="headerlink" title="Discrete valuation ring"></a>Discrete valuation ring</h3><p>The reader may have come across the definition of discrete valuation rings, without knowing the motivation. Indeed, one way to interpret discrete valuation rings is to see them as “Taylor expansions”. The analogy after the definition may explain why. </p><blockquote><p><strong>Definition 1.</strong> Let $F$ be a field. A surjective function $v:F \to \mathbb{Z} \cup \{\infty\}$  is called a <strong>discrete valuation</strong> if</p><ol><li>$v(\alpha)=\infty \iff \alpha = 0$;</li><li>$v(\alpha\beta)=v(\alpha)+v(\beta)$;</li><li>$v(\alpha+\beta)\ge\min(v(\alpha),v(\beta))$.</li></ol><p>The ring $R_v=\{\alpha \in F:v(\alpha) \ge 0\}$ is called a <strong>discrete valuation ring</strong>. It is a local ring with maximal ideal $\mathfrak{m}_v=\{\alpha \in F:v(\alpha) &gt; 0\}$. </p></blockquote><p>We should <strong>not</strong> compare $R_v$ with a polynomial ring, as a polynomial ring is never local. Let $t \in \mathfrak{m}_v$ be an element such that $v(t)=1$. We will show that $\mathfrak{m}_v = (t)$. 
Indeed, for any $u \in \mathfrak{m}_v$, we see that</p><script type="math/tex; mode=display">v(ut^{-1})=v(u)+v(t^{-1})=v(u)-1 \ge 0 \implies ut^{-1} \in R_v</script><p>and as a result we can write $u=(ut^{-1})t$. If we look further, suppose that $v(u)=m$. Then $\alpha = ut^{-m} \in R_v$ is a unit and thus we have $u=\alpha t^m$. In other words, every non-zero element can be expressed as a unit times a power of $t$. </p><p>The analogy, or even example, to bring up here is the order of vanishing at the origin of (rational) functions over $\mathbb{R}$. For a non-zero rational function $F(x)=f(x)/g(x)$, let $v(F)=\operatorname{ord}_0 f-\operatorname{ord}_0 g$, where $\operatorname{ord}_0$ denotes the lowest degree of a non-zero term of a polynomial. Then $m=v(F)$ is the unique integer for which $\lim_{x\to 0}\frac{F(x)}{x^m}$ is non-zero and finite. The order of the zero polynomial depends on the context, and in our context we make it infinite, as no matter how big $m$ is, we never reach a point where $\lim_{x \to 0}\frac{0}{x^m}$ is non-zero and finite. Therefore the discrete valuation ring in our story consists of $0$ and the rational functions that behave like a monomial of non-negative degree near the origin, and the generator of the maximal ideal is the “identity” map $x$. In short, one way of imagining a discrete valuation ring is as the space of “smooth” functions at a point, with the valuation being the order to which they converge to $0$ there.</p><h2 id="Regular-local-ring-of-dimension-1"><a href="#Regular-local-ring-of-dimension-1" class="headerlink" title="Regular local ring of dimension 1"></a>Regular local ring of dimension 1</h2><p>For a ring $R$, we use $\dim(R)$ to denote the Krull dimension and for a vector space $V$ over a field $K$, $\dim_K(V)$ is used to denote the dimension of $V$ as a vector space over $K$.</p><blockquote><p><strong>Theorem 2.</strong> Let $R$ be a commutative noetherian local ring with unit and maximal ideal $\mathfrak{m}$ with the residue field $\kappa=R/\mathfrak{m}$. 
Then the following conditions are equivalent.</p><ol><li>$R$ is a discrete valuation ring in its field of fractions;</li><li>$\dim_\kappa(\mathfrak{m}/\mathfrak{m}^2)=\dim(R)=1$, i.e., $R$ is a regular local ring of dimension $1$;</li><li>$R$ is a unique factorization domain of Krull dimension $1$;</li><li>$\mathfrak{m}$ is a principal ideal and $\dim(R)=1$;</li><li>$R$ is a principal ideal domain which is not a field;</li><li>$R$ is an integrally closed domain of Krull dimension $1$.</li></ol></blockquote><p>(N. B. - We <strong>have to</strong> assume the axiom of choice by all means, otherwise none of these makes sense. In fact, without assuming the axiom of choice, it is <strong>unprovable</strong> that a principal ideal domain has a maximal ideal or the ring has a prime element when it is not a field. See <a href="https://doi.org/10.1017/S0305004100052282">this article</a> for more details.)</p><p><em>Proof.</em> Suppose first that $R$ is a discrete valuation ring with a discrete valuation $v$. Then $\mathfrak{m}=\{a\in R:v(a)&gt;0\}$ is the maximal ideal of $R$ that can be generated by an element $t \in \mathfrak{m}$ such that $v(t)=1$. Let $\mathfrak{a}$ be another non-zero proper ideal of $R$ and let $k=\min v(\mathfrak{a})$. There is an element $x \in \mathfrak{a}$ such that $v(x)=k$ and we can write $x=ut^k$ where $u$ is a unit of $R$. For any other non-zero element $y \in \mathfrak{a}$, we have $\ell=v(y)\ge k$ and therefore $y=wt^{\ell}=wu^{-1}t^{\ell-k}ut^{k}=wu^{-1}t^{\ell-k}x$, where $w$ is a unit of $R$. In other words, we have $\mathfrak{a}=(x)=(t^k)$ for some $k \ge 1$. When $k&gt;1$, the ideal $(t^k)$ is not prime let alone maximal, so we have shown that when $R$ is a discrete valuation ring, the maximal ideal $\mathfrak{m}$ is principal, the Krull dimension of $R$ is $1$ and $R$ is principal but not a field because the maximal ideal is not zero. </p><p>This is to say, we have $1 \implies  4,5$. Since a principal ideal domain is also a unique factorization domain, we also get $3$. 
Besides, in all 6 scenarios, the ring $R$ is of Krull dimension $1$: this is part of the hypothesis in 2, 3, 4 and 6, we have just shown it for 1, and for 5 it holds because every non-zero prime ideal of a principal ideal domain is maximal. <u>Therefore from now on we assume that $R$ is a commutative noetherian local ring of Krull dimension $1$ a priori.</u> This condition implies that the maximal ideal $\mathfrak{m}$ is <strong>not</strong> nilpotent because $\mathfrak{m}$ is nilpotent if and only if the dimension of $R$ is $0$ (hint: Nakayama’s lemma; consider the possibility that $\mathfrak{m}^n=\mathfrak{m}^{n+1}$).</p><p>Now assume that $\mathfrak{m}$ is principal and we write $\mathfrak{m}=(t)$ for some $t\in\mathfrak{m}$. For any $a \in R \setminus \{0\}$, if $a$ is invertible, then we can write $a=at^{0}$. Otherwise we have $a\in\mathfrak{m}$ and therefore $a=a_1t$ for some $a_1 \in R\setminus\{0\}$. We show that there exists a unique $n \ge 0$ such that $a = a_n t^n$ where $a_n$ is a unit in $R$. </p><p>When $a$ is a unit, as shown above, there is nothing to prove. Therefore, to reach a contradiction, we suppose that such $n$ does not exist when $a$ is not a unit. Then by induction, for each $j&gt;0$, there exists $a_j \in R\setminus\{0\}$ such that $a=a_jt^j$, which means that $a \in (t^j)=\mathfrak{m}^j$ for all $j$.  By Krull’s intersection theorem, we have $\bigcap_{j=1}^{\infty}\mathfrak{m}^j=\{0\}$ (<a href="https://public.websites.umich.edu/~mmustata/ReviewSheet3_2017.pdf">this is a consequence of Nakayama’s lemma and Artin-Rees lemma</a>), and therefore $a=0$, which is absurd. Therefore the desired $n$ always exists.</p><p>Next we show that such $n$ is unique. Suppose that $a = a_m t^m=a_nt^n$ where $a_m,a_n \in R^\times$ and without loss of generality we assume that $m \ge n$. Then $a - a = (a_mt^{m-n}-a_n)t^n=0$. Since $t$ is not nilpotent, we must have $a_mt^{m-n}-a_n=0$ (otherwise, writing $a_mt^{m-n}-a_n=ct^k$ with $c$ a unit, we would get $t^{n+k}=0$). In this case we must have $m=n$ and $a_m=a_n$ because otherwise $a_mt^{m-n}$ would not be a unit in $R$. 
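</p><p>As a quick illustration (the ring below is an example of our own choosing, not part of the argument), take $R=\mathbb{Z}_{(3)}$, the ring of rational numbers with denominator prime to $3$, whose maximal ideal is $\mathfrak{m}=(t)$ with $t=3$. For $a=\frac{18}{5}$ we find</p><script type="math/tex; mode=display">\frac{18}{5}=\frac{2}{5}\cdot 3^2,\quad \frac{2}{5} \in R^\times,\quad n=2,</script><p>and no other exponent works: $\frac{18}{5}=\frac{6}{5}\cdot 3$ with $\frac{6}{5} \in \mathfrak{m}$ not a unit, while $\frac{18}{5}=\frac{2}{15}\cdot 3^3$ with $\frac{2}{15} \not\in R$.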
</p><p>Therefore for all $a\in R \setminus\{0\}$, we can always uniquely write $a = ut^{v(a)}$ where $u$ is a unit and $v(a) \ge 0$ is an integer. Since $t$ is not nilpotent, we see that $R$ is an integral domain and it is a discrete valuation ring in its field of fractions. Besides, $R$ is a principal ideal domain because any non-zero ideal $\mathfrak{a} \subset \mathfrak{m}$ is generated by any element $a \in \mathfrak{a}$ such that $v(a)=\min v(\mathfrak{a}\setminus\{0\})$. </p><p>Next we study the dimension of $\mathfrak{m}/\mathfrak{m}^2$ over $\kappa$, where $\mathfrak{m}=(t)$. Notice that $\dim_\kappa \mathfrak{m}/\mathfrak{m}^2\ge 1$ because otherwise $\mathfrak{m}=\mathfrak{m}^2$ and Nakayama’s lemma would give $\mathfrak{m}=0$, contradicting $\dim(R)=1$. We show that $\dim_\kappa\mathfrak{m}/\mathfrak{m}^2 &lt;2$ under the assumption of 4. Let $\xi,\eta\in \mathfrak{m}/\mathfrak{m}^2$ be two distinct non-zero vectors. We show that there exists $\alpha \in \kappa$ such that $\alpha \xi = -\eta$. Suppose that $\xi = rt \pmod{\mathfrak{m}^2}$ and $\eta = st \pmod{\mathfrak{m}^2}$. Then $r,s \not\in \mathfrak{m}$ because otherwise $\xi$ or $\eta$ would be zero. If we choose $\alpha = -\frac{s}{r}\pmod{\mathfrak{m}}$, we see that $\alpha \xi = -st\pmod{\mathfrak{m}^2}=-\eta$ as desired.</p><p>To conclude, we have shown that $4 \implies 1,2,5$. </p><p>Moving on, we assume 5 and see what we can get. First of all every principal ideal domain is a unique factorization domain so we get $3$ (the axiom of choice is indispensable here). Besides, since every ideal is principal, in particular the maximal ideal is principal, so we get $4$. To conclude, we get $5 \implies 3,4$.</p><p>Finally we need to study the points 2, 3 and 6. To begin with, we assume 3. Then by an elementary verification we see that $R$ is integrally closed (see <a href="https://proofwiki.org/wiki/Unique_Factorization_Domain_is_Integrally_Closed">ProofWiki</a>). Next we show that $\mathfrak{m}$ is principal.  Let $\mathscr{P}$ be the family of proper principal ideals of $R$ (they are contained in $\mathfrak{m}$ since $R$ is local). 
Then the set $\mathscr{P}$ is ordered by inclusion and, since $R$ is noetherian, it has a maximal element $\mathfrak{M}=(t)$, which is non-zero because $R$ is not a field. Next we show that $\mathfrak{M}$ is equal to $\mathfrak{m}$. The maximality of $(t)$ among proper principal ideals means that $t$ is irreducible, hence prime because $R$ is a unique factorization domain. Therefore $\mathfrak{M}$ is a non-zero prime ideal of the local domain $R$ of Krull dimension $1$, whose only non-zero prime ideal is $\mathfrak{m}$, which means $\mathfrak{M}=\mathfrak{m}$. This shows that $3 \implies 4,6$.</p><p>Next we assume 2. We use proposition 2 of this <a href="2022/12/05/regular-local-ring-cohen-macaulay/#The-Sequence-That-Forms-a-Basis">old post</a>; we only need to notice that the dimension of $\mathfrak{m}/\mathfrak{m}^2$ is exactly the minimal number of generators of $\mathfrak{m}$. Therefore we obtain $2 \implies 4$.</p><p>For the last part we assume that $R$ is integrally closed. Choose an arbitrary non-zero $a \in \mathfrak{m}$ and consider the ring $\tilde{R}=R_\mathfrak{m}/aR_\mathfrak{m}$ (recall that $R_\mathfrak{m}=R$ because $R$ is local). Then $\tilde{R}$ is of Krull dimension $0$, therefore its maximal ideal $\tilde{\mathfrak{m}}=\mathfrak{m}R_\mathfrak{m}/aR_\mathfrak{m}$ is nilpotent. There exists $n\ge 0$ such that $\tilde{\mathfrak{m}}^n\ne 0$ but $\tilde{\mathfrak{m}}^{n+1}=0$ (with the convention $\tilde{\mathfrak{m}}^0=\tilde{R}$), which implies that $\mathfrak{m}^n \not \subset (a)$ but $\mathfrak{m}^{n+1} \subset (a)$. Choose $b\in \mathfrak{m}^n \setminus (a)$. Then we claim that $\mathfrak{m}=(x)$ where $x=a/b \in K(R)$, the field of fractions of $R$. To see this, notice that $x^{-1}\mathfrak{m} \subset R$ because $b\mathfrak{m} \subset \mathfrak{m}^{n+1} \subset (a)$, so every element of $b\mathfrak{m}$ is of the form $ua$ where $u \in R$, and consequently every element of $\frac{b}{a}\mathfrak{m}$ lies in $R$. 
Therefore $x^{-1}\mathfrak{m}$ can be considered as an ideal of $R$. However, we also have $x^{-1}\mathfrak{m} \not\subset \mathfrak{m}$. Otherwise $\mathfrak{m}$, which is finitely generated as an $R$-module, would be a faithful $R[x^{-1}]$-module, and therefore $x^{-1}$ would be integral over $R$, thus would lie in $R$ because $R$ is integrally closed. This is impossible: $x^{-1}=b/a \in R$ means $b \in (a)$, contradicting the choice of $b$. Hence we must have $x^{-1}\mathfrak{m}=R$, which implies that $\mathfrak{m}=(x)$. Therefore we obtain $6 \implies 4$.</p><p>We have established all necessary implications to obtain the equivalences. $\square$</p>]]></content>
    
    
    <summary type="html">In this post we collect and prove (in as much detail as possible) the equivalent conditions for being a regular local ring of dimension 1.</summary>
    
    
    
    <category term="Algebra" scheme="https://desvl.xyz/categories/Algebra/"/>
    
    <category term="Commutative Algebra" scheme="https://desvl.xyz/categories/Algebra/Commutative-Algebra/"/>
    
    
    <category term="Study new content" scheme="https://desvl.xyz/tags/Study-new-content/"/>
    
    <category term="Quick reference" scheme="https://desvl.xyz/tags/Quick-reference/"/>
    
  </entry>
  
  <entry>
    <title>The Structure of SL_2(F_3) as a Semidirect Product</title>
    <link href="https://desvl.xyz/2023/11/11/sl2-f3/"/>
    <id>https://desvl.xyz/2023/11/11/sl2-f3/</id>
    <published>2023-11-11T22:12:19.000Z</published>
    <updated>2025-05-29T22:41:52.171Z</updated>
    
    <content type="html"><![CDATA[<h1 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h1><p>Let $\mathbb{F}_3$ be the field of three elements and $SL_2(\mathbb{F}_3)$ be the group of $2 \times 2$ matrices over $\mathbb{F}_3$ with determinant $1$. In this post we show that $SL_2(\mathbb{F}_3)$ is the semi-direct product of $H_8$ and $\mathbb{Z}/3\mathbb{Z}$.</p><p>First of all we determine the cardinality of $SL_2(\mathbb{F}_3)$. To do this, we consider $GL_2(\mathbb{F}_3)$ and notice that $SL_2(\mathbb{F}_3)$ is the kernel of the determinant map $\det$.</p><p>To determine $|GL_2(\mathbb{F}_3)|$, fix a basis of $\mathbb{F}_3 \oplus \mathbb{F}_3$ and let $A$ be the matrix representing an element of $GL_2(\mathbb{F}_3)$. The first column of $A$ has $3^2-1$ choices: we only exclude the case of $(0,0)^T$. The second column has $3^2-3$ choices: we exclude the $3$ scalar multiples of the first column to prevent linear dependence. Therefore $|GL_2(\mathbb{F}_3)|=(3^2-1)(3^2-3)=48$. Next we consider the exact sequence</p><script type="math/tex; mode=display">1 \to SL_2(\mathbb{F}_3) \to GL_2(\mathbb{F}_3)  \to (\mathbb{F}_3)^\ast \to 1.</script><p>We get $|SL_2(\mathbb{F}_3)|=|GL_2(\mathbb{F}_3)|/|(\mathbb{F}_3)^\ast|=48/2=24$.</p><p>Since $|\mathfrak{S}_4|=24$ as well, we immediately think about the possibility that $SL_2(\mathbb{F}_3)\cong \mathfrak{S}_4$. Is that the case?</p><h1 id="24-3-8"><a href="#24-3-8" class="headerlink" title="24=3*8"></a>24=3*8</h1><p>Since the group has order $24$, we consider the elements of order $2$, $3$ and $4$ in order to understand the structure of the group we are looking at.</p><h2 id="The-element-of-order-2"><a href="#The-element-of-order-2" class="headerlink" title="The element of order 2"></a>The element of order 2</h2><p>There are ${4 \choose 2}/2!=3$ elements of order $2$ in $\mathfrak{A}_4 \subset \mathfrak{S}_4$, i.e. the products of two disjoint $2$-cycles, and $\mathfrak{S}_4$ contains six transpositions on top of these. However, how many elements of order $2$ are there in $SL_2(\mathbb{F}_3)$? 
Let $A$ be such an element, then $A^2 = I$. Therefore all elements of order $2$ are annihilated by the polynomial</p><script type="math/tex; mode=display">f(X)=X^2-1=(X-1)(X+1).</script><p>If $A \in SL_2(\mathbb{F}_3)$ is of order $2$, then the minimal polynomial of $A$ divides $f(X)$ and is not $X-1$ because $A \ne I$, hence it is either $X+1$ or $X^2-1$. The second case is impossible because then $f(X)$ would be the characteristic polynomial of $A$, and therefore $A$ would have eigenvalues $1$ and $-1$, thus determinant $-1$. We get</p><blockquote><p><strong>Proposition 1.</strong> The only element of order $2$ in $SL_2(\mathbb{F}_3)$ is $-I$. In particular, $SL_2(\mathbb{F}_3)$ is not isomorphic to $\mathfrak{S}_4$.</p></blockquote><h2 id="Determine-the-group-using-Sylow-theory"><a href="#Determine-the-group-using-Sylow-theory" class="headerlink" title="Determine the group using Sylow theory"></a>Determine the group using Sylow theory</h2><p>Checking elements of order $2$ does not come out of nowhere. Since $24=2^3 \cdot 3$, it makes sense to look at the $2$-Sylow and $3$-Sylow subgroups of $SL_2(\mathbb{F}_3)$. Sylow’s theorem ensures that there is a subgroup of order $3$, which can only be isomorphic to $\mathbb{Z}/3\mathbb{Z}$. We have also determined that the only subgroup of order $2$ is $\{-I,I\}$. Next we determine the subgroups of order $8$.</p><h3 id="Elements-of-order-4"><a href="#Elements-of-order-4" class="headerlink" title="Elements of order 4"></a>Elements of order 4</h3><p>To study elements of order $4$, we immediately consider the polynomial</p><script type="math/tex; mode=display">g(X)=X^4-1=(X+1)(X-1)(X^2+1).</script><p>Let $A \in SL_2(\mathbb{F}_3)$ be an element of order $4$. Then $g(A)=0$. Moreover $A^2$ is of order $2$, so $A^2=-I$ by proposition 1, i.e. $h(A)=0$ where $h(X)=X^2+1$. Notice that $h(X)$ is irreducible in $\mathbb{F}_3[X]$ and therefore it is the minimal polynomial of $A$. Since the degree of $h$ is $2$, we also see that $h(X)$ is the characteristic polynomial of $A$.</p><p>From this polynomial we see that $\mathrm{tr}(A)=0$. 
Combining with the fact that $\det(A)=1$, we can easily deduce that the elements of order $4$ are</p><script type="math/tex; mode=display">i=\begin{pmatrix}0 & 1 \\-1 & 0\end{pmatrix},\;j = \begin{pmatrix}-1 & -1 \\-1 & 1\end{pmatrix},\;k = \begin{pmatrix}1 & -1 \\-1 & -1\end{pmatrix},\;-i,\;-j,\;-k.</script><p>We in particular have $i^3=i^{-1}=-i$, $j^3=j^{-1}=-j$ and $k^3=k^{-1}=-k$. Furthermore, $k=ji=-ij$. These identities ring a bell: up to replacing $k$ by $-k$, they are the defining relations of the quaternions. We therefore have the quaternion group lying in $SL_2(\mathbb{F}_3)$ as a $2$-Sylow subgroup:</p><script type="math/tex; mode=display">H_8=\{1,i,j,k,-1,-i,-j,-k\}.</script><p>Is there any other $2$-Sylow subgroup? The answer is no. To see this, let $H'$ be another $2$-Sylow subgroup. Then there exists some $g \in SL_2(\mathbb{F}_3)$ such that $H'=gH_8 g^{-1}$. Conjugation preserves the order of an element, so the non-trivial elements of $H'$ have order $2$ or $4$, and all elements of order $2$ or $4$ already lie in $H_8$. Hence $H' \subset H_8$, which forces $H'=H_8$ because both groups have $8$ elements.</p><blockquote><p><strong>Proposition 2.</strong> The quaternion group $H_8$ can be embedded into $SL_2(\mathbb{F}_3)$ as the unique $2$-Sylow subgroup. In particular, $SL_2(\mathbb{F}_3)$ has no element of order $8$.</p></blockquote><h3 id="An-element-of-order-3"><a href="#An-element-of-order-3" class="headerlink" title="An element of order 3"></a>An element of order 3</h3><p>Let $A \in SL_2(\mathbb{F}_3)$ be an element of order $3$. Then its minimal polynomial $m(X)$ divides $X^3-1=(X-1)^3$ and has degree at most $2$. Since $A-I \ne 0$, we must have $m(X)=(X-1)^2=X^2+X+1$. We can also see that the characteristic polynomial of $A$ is also $X^2+X+1$. In particular, we see the trace of $A$ is $-1=2$. 
We can then choose</p><script type="math/tex; mode=display">A=\begin{pmatrix}1 & 0 \\1 & 1\end{pmatrix}</script><p>Therefore $K=\{I,A,A^2\}$ is a $3$-Sylow subgroup of $SL_2(\mathbb{F}_3)$, which is not unique, because for example one can also consider the group generated by the transpose of $A$.</p><h3 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h3><p>Write $G=SL_2(\mathbb{F}_3)$ and $H=H_8$. Notice that $H \cap K = \{1\}$ because $\gcd(8,3)=1$. Therefore the map $H \times K \to HK$ given by $(x,y) \mapsto xy$ is bijective. Since $H$ is also normal and $|HK|=|H||K|=24=|G|$, we are safe to write $G=H\rtimes K$. </p><h1 id="References"><a href="#References" class="headerlink" title="References"></a>References</h1><ul><li>Serge Lang, <em>Algebra Revised Third Edition</em></li><li>Olivier SERMAN, <a href="https://math.univ-lille1.fr/~serman/agreg/SL2F3.pdf">$SL_2(\mathbb{F}_3)$</a>. </li></ul>]]></content>
    
    
    <summary type="html">In this post we determine $SL_2(&#92;mathbb{F}_3)$ using Sylow theory and linear algebra.</summary>
    
    
    
    <category term="Algebra" scheme="https://desvl.xyz/categories/Algebra/"/>
    
    <category term="Group Theory" scheme="https://desvl.xyz/categories/Algebra/Group-Theory/"/>
    
    
    <category term="Exercise solution" scheme="https://desvl.xyz/tags/Exercise-solution/"/>
    
    <category term="Serge Lang" scheme="https://desvl.xyz/tags/Serge-Lang/"/>
    
    <category term="Sylow" scheme="https://desvl.xyz/tags/Sylow/"/>
    
  </entry>
  
  <entry>
    <title>A Separable Extension Is Solvable by Radicals Iff It Is Solvable</title>
    <link href="https://desvl.xyz/2023/10/21/solvable-by-radical/"/>
    <id>https://desvl.xyz/2023/10/21/solvable-by-radical/</id>
    <published>2023-10-21T08:39:01.000Z</published>
    <updated>2025-05-29T22:48:55.797Z</updated>
    
    <content type="html"><![CDATA[<h1 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h1><p>Polynomials are of great interest in various fields, such as analysis, geometry and algebra. Given a polynomial, we try to extract as much information as possible. For example, given a polynomial, we certainly want to find its roots. However this is not very realistic. The Abel-Ruffini theorem states that it is impossible to solve polynomial equations of degree $\ge 5$ by radicals in general. For example, one can always solve the equation $x^n-1=0$ by radicals for arbitrary $n$, but solving $x^5-x-1=0$ over $\mathbb{Q}$ by radicals is not possible. Galois showed that the crux of solvability lies in the structure of the Galois group, depending on whether it is solvable group-theoretically.</p><p>In this post, we will explore the theory of solvability in the modern sense, considering extensions of arbitrary characteristic rather than solely number fields over $\mathbb{Q}$.</p><h1 id="Solvable-Extensions"><a href="#Solvable-Extensions" class="headerlink" title="Solvable Extensions"></a>Solvable Extensions</h1><blockquote><p><strong>Definition 1.</strong> Let $E/k$ be a separable and finite field extension, and $K$ the smallest Galois extension of $k$ containing $E$. We say $E/k$ is <strong>solvable</strong> if $G(K/k)$ (the Galois group of $K$ over $k$) is solvable.</p></blockquote><p>Throughout we will deal with separable extensions because without this assumption one will be dealing with normal extensions instead of Galois extensions, although one would arrive at a similar result.</p><blockquote><p><strong>Proposition 1.</strong> Let $E/k$ be a separable extension. Then $E/k$ is solvable if and only if there exists a solvable Galois extension $L/k$ such that $k \subset E \subset L$.</p></blockquote><p><em>Proof.</em> If $E/k$ is solvable, it suffices to take $L$ to be the smallest Galois extension of $k$ containing $E$. 
Conversely, suppose $L/k$ is a solvable Galois extension such that $k \subset E \subset L$. Let $K$ be the smallest Galois extension of $k$ containing $E$, i.e. we have $k \subset E \subset K \subset L$. We see $G(K/k) \cong G(L/k)/G(L/K)$ is a homomorphic image of $G(L/k)$ and it has to be solvable. $\square$</p><p>Next we introduce an important concept concerning field extensions.</p><blockquote><p><strong>Definition 2.</strong> Let $\mathcal{C}$ be a certain class of extension fields $F \subset E$. We say that $\mathcal{C}$ is <strong>distinguished</strong> if it satisfies the following conditions:</p><ol><li>Let $k \subset F \subset E$ be a tower of fields. The extension $k \subset E$ is in $\mathcal{C}$ if and only if $k \subset F$ is in $\mathcal{C}$ and $F \subset E$ is in $\mathcal{C}$.</li><li>If $k \subset E$ is in $\mathcal{C}$ and if $F$ is any given extension of $k$, and $E,F$ are both contained in some field, then $F \subset EF$ is in $\mathcal{C}$ too. Here $EF$ is the compositum of $E$ and $F$, i.e. the smallest field that contains both $E$ and $F$.</li><li>If $k \subset F$ and $k \subset E$ are in $\mathcal{C}$ and $F,E$ are subfields of a common field, then $k \subset FE$ is in $\mathcal{C}$.</li></ol></blockquote><p>When dealing with several extensions at the same time, it can be a great idea to consider the class of extensions they are in. For example, the class of Galois extensions is <em>not</em> distinguished because normal extensions do not satisfy 1. That’s why we need the fundamental theorem of Galois theory, a.k.a. the Galois correspondence, because not all intermediate fields are Galois over the base field. The class of separable extensions is distinguished however. We introduce this concept because:</p><blockquote><p><strong>Proposition 2.</strong> Solvable extensions form a distinguished class of extensions. (N.B. these extensions are finite and separable by default.)</p></blockquote><p><em>Proof.</em> We verify all three conditions mentioned in definition 2. 
To make our proof easier however, we first verify 2.</p><p><code>Step 1.</code> Let $E/k$ be solvable. Let $F$ be a field containing $k$ and assume $E, F$ are subfields of some algebraically closed field. We need to show that $EF/F$ is solvable. By proposition 1, there is a Galois solvable extension $K/k$ such that $K \supset E \supset k$. Then $KF$ is Galois over $F$ and $G(KF/F)$ is isomorphic to a subgroup of $G(K/k)$ via restriction to $K$. Therefore $KF/F$ is a Galois solvable extension and we have $KF \supset  EF \supset F$, which implies that $EF/F$ is solvable.</p><p><code>Step 2.</code> Consider a tower of extensions $E \supset F \supset k$. Assume now $E/k$ is solvable. Then there exists a Galois solvable extension $K/k$ containing $E$, which implies that $F/k$ is solvable because $K \supset F$. We see $E/F$ is also solvable because $EF=E$ and we are back to step 1. </p><p>Conversely, assume that $E/F$ is solvable and $F/k$ is solvable. We will find a solvable extension $M/k$ containing $E$.  Let $K/k$ be a Galois solvable extension such that $K \supset F$, then $EK/K$ is solvable by step 1. Let $L$ be a Galois solvable extension of $K$ containing $EK$. If $\sigma$ is any embedding of $L$ over $k$ in a given algebraic closure, then $\sigma K = K$ and hence $\sigma L$ is a solvable extension of $K$. [This sentence deserves some explanation. Notice that $L/k$ is not necessarily Galois, therefore $\sigma$ is not necessarily an automorphism of $L$ and $\sigma L \ne L$ in general. However, since $K/k$ is Galois, the restriction of $\sigma$ to $K$ is an automorphism and therefore $\sigma K = K$. The extension $\sigma L / \sigma K$ is solvable because $\sigma L$ is isomorphic to $L$ and $\sigma K = K$.]</p><p>We let $M$ be the compositum of all extensions $\sigma L$ for all embeddings $\sigma$ of $L$ over $k$. Then $M/k$ is Galois and so is $M/K$ [note: this is a property of normal extensions; besides, $M/k$ is finite]. 
We have $G(M/K) \subset \prod_{\sigma}G(\sigma L/K)$ which is a product of solvable groups. Therefore $G(M/K)$ is solvable, meaning $M/K$ is a solvable extension. We have a surjective homomorphism $G(M/k) \to G(K/k)$ (given by $\sigma \mapsto \sigma|_K$) with kernel $G(M/K)$, and therefore $G(M/k)$ has a solvable normal subgroup whose factor group is solvable, meaning $G(M/k)$ is solvable. Since $E \subset M$, we are done.</p><p><code>Step 3.</code> If $F/k$ and $E/k$ are solvable and $E,F$ are subfields of a common field, we need to show that $EF$ is solvable over $k$. By step 1, $EF/F$ is solvable. By step 2, $EF/k$ is solvable. $\square$</p><h1 id="Solvable-By-Radicals"><a href="#Solvable-By-Radicals" class="headerlink" title="Solvable By Radicals"></a>Solvable By Radicals</h1><blockquote><p><strong>Definition 3.</strong> Let $F/k$ be a finite and separable extension. We say $F/k$ is solvable by radicals if there exists a finite extension $E$ of $k$ containing $F$, and admitting a tower decomposition</p><script type="math/tex; mode=display">k=E_0 \subset E_1 \subset E_2 \subset \dots \subset E_m=E</script><p>such that each step $E_{i+1}/E_i$ is one of the following types:</p><ol><li>It is obtained by adjoining a root of unity.</li><li>It is obtained by adjoining a root of a polynomial $X^n-a$ with $a \in E_i$ and $n$ prime to the characteristic.</li><li>It is obtained by adjoining a root of an equation $X^p-X-a$ with $a \in E_i$ if $p$ is the characteristic $&gt;0$.</li></ol></blockquote><p>For example, $\mathbb{Q}(\sqrt{-2})/\mathbb{Q}$ is solvable by radicals. We consider the polynomial $f(x)=x^2-2x+3$. We know its roots are $x_1=1-\sqrt{-2}$ and $x_2=1+\sqrt{-2}$. However let’s see the question in the sense of field theory. Notice that</p><script type="math/tex; mode=display">f(x)=(x-1)^2+2.</script><p>Therefore $f(x)=0$ is equivalent to $(x-1)^2=-2$. Then $x-1=\sqrt{-2}$ and $x-1=-\sqrt{-2}$ in $\mathbb{Q}(\sqrt{-2})$ are two equations that make perfect sense. 
Thus we obtain our desired roots. The field gives us the liberty of basic arithmetic, and the radical extension gives us the method to look for a radical root.</p><p>It is immediate that the class of extensions solvable by radicals is a distinguished class.</p><p>In general, we are adding “$n$-th root of something”. However, when the characteristic of the field is not zero, there are some complications. For example, talking about the $p$-th root of an element in a field of characteristic $p&gt;0$ will not work. Therefore we need to take good care of that. The second and third types are nods to Kummer theory and Artin-Schreier theory respectively, which are deduced from the multiplicative and additive forms of Hilbert’s theorem 90 respectively. We interrupt the post by introducing the respective theorems.</p><hr><p>Let $K/k$ be a cyclic extension of degree $n$, that is, $K/k$ is Galois and $G(K/k)$ is cyclic. Suppose $G(K/k)$ is generated by $\sigma$. Then we have the celebrated “Theorem 90”:</p><blockquote><p><strong>Theorem 1 (Hilbert’s theorem 90, multiplicative form).</strong> With the notation above, let $\beta \in K$. The norm $N_{k}^{K}(\beta)=1$ if and only if there exists an element $\alpha \ne 0$ in $K$ such that $\beta = \alpha/\sigma\alpha$.</p></blockquote><p>To prove this, we need Artin’s theorem on the independence of characters. With this theorem, we see that the second type of extension in definition 3 is cyclic as soon as the base field contains the relevant roots of unity.</p><blockquote><p><strong>Theorem 2.</strong> Let $k$ be a field, $n$ an integer $&gt;0$ prime to the characteristic of $k$, and assume that there is a primitive $n$-th root of unity in $k$.</p><ol><li>Let $K$ be a cyclic extension of degree $n$. Then there exists $\alpha \in K$ such that $K = k(\alpha)$ and $\alpha$ satisfies an equation $X^n-a=0$ for some $a \in k$.</li><li>Conversely, let $a \in k$. Let $\alpha$ be a root of $X^n-a$. 
Then $k(\alpha)$ is cyclic over $k$ of degree $d|n$, and $\alpha^d$ is an element of $k$.</li></ol></blockquote><p>All in all, theorem 2 states that an $n$-th root of $a$ yields a cyclic extension. However we cannot drop the assumption that $n$ is prime to the characteristic of $k$. When this is not the case, we will use the Artin-Schreier theorem.</p><blockquote><p><strong>Theorem 3 (Hilbert’s theorem 90, additive form).</strong> Let $K/k$ be a cyclic extension of degree $n$. Let $\sigma$ be a generator of $G(K/k)$. Let $\beta \in K$. The trace $\mathrm{Tr}_k^K(\beta)=0$ if and only if there exists an element $\alpha \in K$ such that $\beta = \alpha-\sigma\alpha$.</p></blockquote><p>This theorem requires another application of the independence of characters. </p><blockquote><p><strong>Theorem 4 (Artin-Schreier).</strong> Let $k$ be a field of characteristic $p&gt;0$.</p><ol><li>Let $K$ be a cyclic extension of $k$ of degree $p$. Then there exists $\alpha \in K$ such that $K=k(\alpha)$ and $\alpha$ satisfies an equation $X^p-X-a=0$ with some $a \in k$. </li><li>Conversely, given $a \in k$, the polynomial $f(X)=X^p-X-a$ either has one root in $k$, in which case all its roots are in $k$, or it is irreducible. In the latter case, if $\alpha$ is a root then $k(\alpha)$ is cyclic of degree $p$ over $k$.</li></ol></blockquote><p>In other words, instead of looking at a $p$-th root of an element in a field of characteristic $p$, we look at a root of $X^p-X-a$, which still yields a cyclic extension.</p><hr><p>Now we are ready for the core theorem of this post.</p><blockquote><p><strong>Theorem 5.</strong> Let $E$ be a finite separable extension of $k$. Then $E$ is solvable by radicals if and only if $E/k$ is solvable.</p></blockquote><p> <em>Proof.</em> First of all we assume that $E/k$ is solvable. Then there exists a finite Galois solvable extension of $k$ containing $E$ and we call it $K$. 
Let $m$ be the product of all primes $l$ such that $l \ne \operatorname{char}k$ but $l|[K:k]$. Let $F=k(\zeta)$ where $\zeta$ is a primitive $m$-th root of unity. Then $F/k$ is abelian and is solvable by radicals by definition. </p><p>Since solvable extensions form a distinguished class, we see $KF/F$ is solvable. There is a tower of subfields between $F$ and $KF$ such that each step is cyclic of prime order, because every finite solvable group admits a normal tower whose successive quotients are cyclic of prime order, and we can use the Galois correspondence. By theorems 2 and 4, we see $KF/F$ is solvable by radicals because cyclic extensions of prime order have been determined by these two theorems. It follows that $E/k$ is solvable by radicals: $KF/F$ is solvable by radicals, $F/k$ is solvable by radicals $\implies$ $KF/k$ is solvable by radicals $\implies$ $E/k$ is solvable by radicals because $KF \supset E \supset k$.</p><hr><p>The elaboration of the “if” part is as follows. In order to prove $E/k$ is solvable by radicals, we show that there is a much bigger field $KF$ containing $E$ such that $KF/k$ is solvable by radicals. First of all there exists a finite Galois solvable extension $K/k$ containing $E$. Next we define a cyclotomic extension $F/k$ with the following intentions</p><ol><li>$F/k$ should be solvable by radicals.</li><li>$F$ contains enough primitive roots of unity, so that we can use theorem 2 freely. </li></ol><p>To reach these two goals, we decide to put $F=k(\zeta)$ where $\zeta$ is a primitive $m$-th root of unity and $m$ is the radical of $[K:k]$ divided by the characteristic of $k$ when necessary. This field $F$ certainly ensures that $F/k$ is solvable by radicals. For the second goal, we need to take a look at the subfields between $F$ and $KF$. Let $k = K_0 \subset K_1 \subset \dots \subset K_n = K$ be a tower of field extensions such that every step $K_{i+1}/K_i$ is of prime degree [this is possible due to the solvability of $K/k$]. 
These prime numbers can only be factors of $[K:k]$. Then in the lifted field extension $F=K_0F \subset K_1F \subset \dots \subset K_nF=KF$ we do not introduce new prime numbers. Why do we consider prime factors of $[K:k]$? Let’s say $[K_{i+1}F:K_iF] = \ell$ is a prime number. If $\ell=\operatorname{char}k$ then we can use theorem 4. Otherwise we still have $\ell|[K:k]$ so we use theorem 2. However this theorem requires a primitive $\ell$-th root of unity to be in $K_{i}F$. Our choice of $m$ and $\zeta$ guarantees this because $\ell|m$ and therefore a primitive $\ell$-th root of unity exists in $F$. We can make $m$ bigger but there is no necessity. The “only if” part does nearly the same thing, with the chain of logic reversed.</p><hr><p>Conversely, assume that $E/k$ is solvable by radicals. For any embedding $\sigma$ of $E$ in $E^{\mathrm{a}}$ over $k$, the extension $\sigma E/k$ is also solvable by radicals. Hence the smallest Galois extension $K$ of $k$ containing $E$, which is the compositum of $E$ and its conjugates, is solvable by radicals. Let $m$ be the product of all primes unequal to the characteristic dividing the degree $[K:k]$ and again let $F=k(\zeta)$ where $\zeta$ is a primitive $m$-th root of unity. It will suffice to prove that $KF$ is solvable over $F$, because it follows that $KF$ is solvable over $k$ and hence $G(K/k)$ is solvable because it is a homomorphic image of $G(KF/k)$. But $KF/F$ can be decomposed into a tower of extensions such that each step is of prime degree and of the type described in theorem 2 and theorem 4. The corresponding root of unity is in the field $F$. Hence $KF/F$ is solvable, proving the theorem. $\square$</p>]]></content>
    
    
    <summary type="html">We show that a separable extension is solvable by radicals iff it is solvable, i.e. it has a Galois closure with solvable Galois group. The proof is done in a general setting.</summary>
    
    
    
    <category term="Algebra" scheme="https://desvl.xyz/categories/Algebra/"/>
    
    <category term="Field Theory" scheme="https://desvl.xyz/categories/Algebra/Field-Theory/"/>
    
    <category term="Galois Theory" scheme="https://desvl.xyz/categories/Algebra/Galois-Theory/"/>
    
    
    <category term="Galois" scheme="https://desvl.xyz/tags/Galois/"/>
    
  </entry>
  
  <entry>
    <title>Picard&#39;s Little Theorem and Twice-Punctured Plane</title>
    <link href="https://desvl.xyz/2023/09/18/picard-little/"/>
    <id>https://desvl.xyz/2023/09/18/picard-little/</id>
    <published>2023-09-18T11:47:20.000Z</published>
    <updated>2025-05-10T22:19:10.276Z</updated>
    
    <content type="html"><![CDATA[<h1 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h1><p>Let $f:\mathbb{C} \to \mathbb{C}$ be a holomorphic function. By Liouville’s theorem, if $f(\mathbb{C})$ is bounded, then $f$ has to be a constant function. However, there is a much stronger result. In fact, if $f(\mathbb{C})$ misses at least $2$ points of $\mathbb{C}$, then $f$ is a constant. In other words, if $f$ is non-constant, then the equation $f(z)=a$ has a solution for all $a \in \mathbb{C}$ except at most one $a$. To think about this, if $f$ is a non-constant polynomial, then $f(z)=a$ always has a solution (the fundamental theorem of algebra). If, for example, $f(z)=\exp(z)$, then $f(z)=a$ has no solution only if $a=0$.</p><p>The proof will not be easy. It cannot be done within a few lines of obvious observations, either in elementary approaches or advanced approaches. In this post we will follow the latter by studying the twice-punctured plane $\mathbb{C} \setminus\{0,1\}$. To be specific, without loss of generality, we can assume that $0$ and $1$ are not in the range of $f$. Then $f(\mathbb{C}) \subset \mathbb{C}\setminus\{0,1\}$. Next we use advanced tools to study $\mathbb{C}\setminus\{0,1\}$ in order to reduce the question to Liouville’s theorem by constructing a bounded holomorphic function related to $f$.</p><p>We will find a holomorphic covering map $\lambda:\mathfrak{H} \to \mathbb{C}\setminus\{0,1\}$ and then replace $\mathfrak{H}$ with the unit disc $D$ using the Cayley transform $z \mapsto \frac{z-i}{z+i}$. 
Then the aforementioned $f$ will be lifted to a holomorphic function $F:\mathbb{C} \to D$, which has to be constant due to Liouville’s theorem, and as a result $f$ is constant.</p><p>With these being said, we need analytic continuation theory to establish the desired $\lambda$, and on top of that, (algebraic) topology will be needed to justify the function $F$.</p><h1 id="Analytic-Continuation"><a href="#Analytic-Continuation" class="headerlink" title="Analytic Continuation"></a>Analytic Continuation</h1><p>For a concrete example of analytic continuation, I recommend <a href="2022/11/24/riemann-zeta-continuation/">this post on the Riemann $\zeta$ function</a>. In this post however, we only focus on the basic language of it so that we can explain later content using analytic continuation.</p><p>Our continuation is always established “piece by piece”, which is the reason we formulate continuation in the following sense.</p><blockquote><p><strong>Definition 1.</strong> A <em>function element</em> is an ordered pair $(f,D)$ where $D$ is an open disc and $f \in H(D)$. Two function elements $(f_1,D_1)$ and $(f_2,D_2)$ are <em>direct continuations</em> of each other if $D_1 \cap D_2 \ne \varnothing$ and $f_1=f_2$ on $D_1 \cap D_2$. In this case we write</p><script type="math/tex; mode=display">(f_1,D_1) \sim (f_2,D_2).</script></blockquote><p>The notion of ordered pair may remind the reader of sheaves and stalks. Indeed some authors do formulate analytic continuation in this language, see for example <a href="https://link.springer.com/book/10.1007/978-3-030-59365-0">Principles of Complex Analysis</a> by <a href="https://link.springer.com/book/10.1007/978-3-030-59365-0#author-0-0">Serge Lvovski</a>. </p><p>The $\sim$ relation is by definition reflexive and symmetric, but not transitive. To see this, let $\omega$ be a primitive cube root of unity. Let $D_0, D_1,D_2$ be open discs with radius $1$ and centres $\omega^0,\omega^1,\omega^2$. 
Since the $D_i$ are simply connected and do not contain $0$, we can always pick $f_i \in H(D_i)$ such that $f_i^2(z)=z$, $(f_0,D_0) \sim (f_1,D_1)$ and $(f_1,D_1) \sim (f_2,D_2)$, but on $D_0 \cap D_2$ one has $f_2 =-f_0 \ne f_0$. There is nothing mysterious here: we are merely rephrasing the fact that the square root function cannot be defined on a region containing $0$.</p><blockquote><p><strong>Definition 2.</strong> A <em>chain</em> is a finite sequence $\mathscr{C}$ of discs $(D_0,D_1,\dots,D_n)$ such that $D_{i-1} \cap D_i \ne \varnothing$ for $i=1,\dots,n$. If $(f_0,D_0)$ is given and if there exist function elements $(f_i,D_i)$ such that $(f_{i-1},D_{i-1}) \sim (f_i,D_i)$ for $i=1,\dots,n$, then $(f_n,D_n)$ is said to be the <em>analytic continuation of $(f_0,D_0)$ along $\mathscr{C}$</em>.</p><p>A chain $\mathscr{C}=(D_0,\dots,D_n)$ is said to <em>cover</em> a curve $\gamma$ with parameter interval $[0,1]$ if there are numbers $0=s_0&lt;s_1&lt;\dots&lt;s_n=1$ such that $\gamma(0)$ is the centre of $D_0$, $\gamma(1)$ is the centre of $D_n$, and</p><script type="math/tex; mode=display">\gamma([s_i,s_{i+1}]) \subset D_i \quad(i=0,\dots,n-1).</script><p>If $(f_0,D_0)$ can be continued along this $\mathscr{C}$ to $(f_n,D_n)$, we call $(f_n,D_n)$ an <em>analytic continuation of $(f_0,D_0)$ along $\gamma$</em>; $(f_0,D_0)$ is then said to <em>admit</em> an analytic continuation along $\gamma$.</p></blockquote><p>Either way, it is not necessary that $(f_0,D_0) \sim (f_n,D_n)$. However, uniqueness of $(f_n,D_n)$ is always guaranteed. We will sketch the proof of uniqueness.</p><blockquote><p><strong>Lemma 1.</strong> Suppose that $D_0 \cap D_1 \cap D_2 \ne \varnothing$, $(f_0,D_0) \sim (f_1,D_1)$ and $(f_1,D_1) \sim (f_2,D_2)$. Then $(f_0,D_0) \sim (f_2,D_2)$.</p></blockquote><p><em>Proof.</em> By assumption, $f_0=f_1$ in $D_0 \cap D_1$, and $f_1=f_2$ in $D_1 \cap D_2$. It follows that $f_0=f_2$ in $D_0 \cap D_1 \cap D_2$, which is open and non-empty. 
Since $f_0$ and $f_2$ are holomorphic in $D_0 \cap D_2$ and $D_0 \cap D_2$ is connected, we have $f_0 = f_2$ in $D_0 \cap D_2$. This is because on the open connected set $D_0 \cap D_2$, the zero set of $f_0-f_2$ is not discrete. Therefore $f_0-f_2$ has to be $0$ everywhere on $D_0 \cap D_2$. $\square$</p><blockquote><p><strong>Theorem 1.</strong> If $(f,D)$ is a function element and $\gamma$ is a curve which starts at the centre of $D$, then $(f,D)$ admits at most one analytic continuation along $\gamma$. </p></blockquote><p><em>Sketch of the proof.</em> Let $\mathscr{C}_1=(A_0,A_1,\dots,A_m)$ and $\mathscr{C}_2=(B_0,B_1,\dots,B_n)$ be two chains that cover $\gamma$. We need to show that if $(f,D)$ can be analytically continued along $\mathscr{C}_1$ to a function element $(g_m,A_m)$ and along $\mathscr{C}_2$ to $(h_n,B_n)$, then $g_m=h_n$ in $A_m \cap B_n$.</p><p>We are also given partitions $0=s_0&lt;s_1&lt;\dots&lt;s_m=s_{m+1}=1$ and $0=t_0&lt;t_1&lt;\dots&lt;t_n=t_{n+1}=1$ such that</p><script type="math/tex; mode=display">\gamma([s_i,s_{i+1}]) \subset A_i,\;\gamma([t_j,t_{j+1}]) \subset B_j,\; (0\le i \le m, 0 \le j \le n),</script><p> and function elements $(g_i,A_i) \sim (g_{i+1},A_{i+1})$ and $(h_j,B_j) \sim (h_{j+1},B_{j+1})$ for $0 \le i \le m-1$ and $0 \le j \le n-1$ with $g_0=h_0=f$. The proof is established by showing that the continuation is compatible with intersecting intervals, where lemma 1 will be used naturally. To be specific, if $0 \le i \le m$ and $0 \le j \le n$, and if $[s_i,s_{i+1}] \cap [t_j,t_{j+1}] \ne \varnothing$, then $(g_i, A_i) \sim (h_j,B_j)$.</p><h1 id="The-Monodromy-Theorem"><a href="#The-Monodromy-Theorem" class="headerlink" title="The Monodromy Theorem"></a>The Monodromy Theorem</h1><p>The monodromy theorem asserts that on a simply connected region $\Omega$, a function element $(f,D)$ with $D \subset \Omega$ can be extended to all of $\Omega$, provided $(f,D)$ can be continued along every curve in $\Omega$ that starts at the centre of $D$. 
To prove this we need homotopy properties of analytic continuation and simply connected spaces.</p><blockquote><p><strong>Definition 3.</strong> A simply connected space is a path connected topological space $X$ with trivial fundamental group $\pi_1(X,x_0)=\{e\}$ for all $x_0 \in X$.</p></blockquote><p>The following fact is intuitive and will be used in the monodromy theorem.</p><blockquote><p><strong>Lemma 2.</strong> Let $X$ be a simply connected space and let $\gamma_1$ and $\gamma_2$ be two curves $[0,1] \to X$ with $\gamma_1(0)=\gamma_2(0)$ and $\gamma_1(1)=\gamma_2(1)$. Then $\gamma_1$ and $\gamma_2$ are homotopic relative to their endpoints.</p></blockquote><p><em>Proof.</em> Let $\gamma_i^{-1}$ be the curve defined by $\gamma_i^{-1}(t)=\gamma_i(1-t)$ for $i=1,2$. Then</p><script type="math/tex; mode=display">\gamma_1 \simeq \gamma_1 \ast \gamma_2^{-1} \ast \gamma_2 \simeq e \ast \gamma_2 \simeq \gamma_2</script><p>where $e$ is the identity of $\pi_1(X,\gamma_1(0))$, which is the class of the loop $\gamma_1 \ast \gamma_2^{-1}$ because $X$ is simply connected. $\square$</p><p>Next we prove the two-point version of the monodromy theorem.</p><blockquote><p><strong>Monodromy theorem (two-point version).</strong> Let $\alpha,\beta$ be two points of $\mathbb{C}$ and $(f,D)$ be a function element where $D$ is centred at $\alpha$. Let $\{\gamma_t\}$ be a family of curves indexed by a continuous map $H(s,t):[0,1] \times [0,1] \to \mathbb{C}$, all with the same origin $\alpha$ and terminal point $\beta$. If $(f,D)$ admits an analytic continuation along each $\gamma_t$, to an element $(g_t,D_t)$, then $g_1=g_0$. </p></blockquote><p>In brief, analytic continuation is invariant under homotopy with fixed endpoints. By being indexed by $H(s,t)$ we mean that $\gamma_t(s)=H(s,t)$. The proof uses the uniform continuity of $H(s,t)$.</p><p><em>Proof.</em> Fix $t \in [0,1]$. By definition, there is a chain $\mathscr{C}=(A_0,\dots,A_n)$ which covers $\gamma_t$, with $A_0=D$, such that $(g_t,D_t)$ is obtained by continuation of $(f,D)$ along $\mathscr{C}$. 
There are numbers $0=s_0&lt;\dots&lt;s_n=1$ such that</p><script type="math/tex; mode=display">E_i=\gamma_t([s_i,s_{i+1}]) \subset A_i,\;(i=0,1,\dots,n-1).</script><p>For each $i$, define</p><script type="math/tex; mode=display">d_i=\inf\{|x-y|:x \in E_i,y \in A_i^c\}.</script><p>Each $d_i$ is well defined and positive, because $E_i$ is compact and $A_i$ is an open set. Then pick any $\varepsilon \in (0,\min_i\{d_i\})$. Since $H(s,t)$ is uniformly continuous, there exists a $\delta&gt;0$ such that</p><script type="math/tex; mode=display">|\gamma_t(s)-\gamma_u(s)|<\varepsilon,\; \forall t,u,s \in [0,1]; |u-t|<\delta.</script><p>We claim that $\mathscr{C}$ also covers $\gamma_u$ whenever $|u-t|&lt;\delta$. To see this, pick any $s \in [s_i,s_{i+1}]$. Then $\gamma_u(s) \in A_i$ because</p><script type="math/tex; mode=display">|\gamma_t(s)-\gamma_u(s)| <\varepsilon<d_i=\inf\{|x-y|:x \in E_i,y\not\in A_i\}.</script><p>Therefore by theorem 1, we have $g_t=g_u$. Thus for every $t \in [0,1]$ there is an open interval $I_t$ containing $t$ such that $g_u=g_t$ for all $u \in [0,1] \cap I_t$. Since $[0,1]$ is compact, finitely many of the $I_t$ cover $[0,1]$. Since $[0,1]$ is connected, we can pass from $g_0$ to $g_1$ through overlapping intervals in a finite number of steps, so $g_0=g_1$. $\square$</p><blockquote><p><strong>Monodromy theorem.</strong> Suppose $\Omega$ is a simply connected open subset of the plane, $(f,D)$ is a function element with $D \subset \Omega$, and $(f,D)$ can be analytically continued along every curve in $\Omega$ that starts at the centre of $D$. Then there exists $g \in H(\Omega)$ such that $g(z)=f(z)$ for all $z \in D$.</p></blockquote><p><em>Proof.</em> Let $\gamma_0$ and $\gamma_1$ be two curves in $\Omega$ from the centre $\alpha$ of $D$ to some point $\beta \in \Omega$. Then the two-point monodromy theorem and lemma 2 ensure that these two curves lead to the same element $(g_\beta,D_\beta)$, where $D_\beta \subset \Omega$ is a disc with centre at $\beta$. 
If $D_{\beta_1}$ intersects $D_\beta$, then $(g_{\beta_1},D_{\beta_1})$ can be obtained by continuing $(f,D)$ to $\beta$, then along the segment connecting $\beta$ and $\beta_1$. By definition of analytic continuation, $g_{\beta_1}=g_\beta$ in $D_{\beta_1} \cap D_\beta$. Therefore the definition </p><script type="math/tex; mode=display">g(z)=g_{\beta}(z),\; z \in D_\beta,\beta\in \Omega</script><p>is a consistent definition and gives the desired holomorphic extension of $f$. $\square$</p><h1 id="Modular-Function"><a href="#Modular-Function" class="headerlink" title="Modular Function"></a>Modular Function</h1><p>Let $\mathfrak{H}$ be the open upper half plane. We will find a function $\lambda \in H(\mathfrak{H})$ whose image is $E=\mathbb{C} \setminus\{0,1\}$ and which is in fact a (holomorphic) covering map onto $E$. The function $\lambda$ is called a modular function. </p><p>As usual, consider the action of $G=SL(2,\mathbb{Z})$ on $\mathfrak{H}$ given by</p><script type="math/tex; mode=display">\begin{pmatrix}a & b \\c & d\end{pmatrix}z=\frac{az+b}{cz+d}.</script><blockquote><p><strong>Definition 4.</strong> A <strong>modular function</strong> is a holomorphic (or meromorphic) function $f$ on $\mathfrak{H}$ which is invariant under a non-trivial subgroup $\Gamma$ of $G$. That is, for any $\varphi \in \Gamma$, one has $f \circ \varphi=f$.</p></blockquote><p>In this section, we consider this subgroup:</p><script type="math/tex; mode=display">\Gamma=\left\{\begin{pmatrix}a & b \\c & d\end{pmatrix} \in SL(2,\mathbb{Z}):a,d \equiv 1 \bmod{2};b,c\equiv0\bmod{2}\right\}.</script><p>It has a fundamental domain</p><script type="math/tex; mode=display">Q=\{z=x+iy \in \mathfrak{H}:y>0,-1 \le x <1,|2z+1| \ge 1,|2z-1|>1\}.</script><p>In words, $Q$ is bounded by the two vertical lines $x=-1$ and $x=1$ and by the two semicircles of diameter $1$ centred at $-\frac{1}{2}$ and $\frac{1}{2}$, and only the left half of this boundary belongs to $Q$. 
The term <em>fundamental domain</em> will be justified by the following theorem.</p><blockquote><p><strong>Theorem 4.</strong> Let $\Gamma$ and $Q$ be as above. </p><p>(a) Let $\varphi_1,\varphi_2$ be two distinct elements of $\Gamma$, then $\varphi_1(Q) \cap \varphi_2(Q) = \varnothing$.</p><p>(b) $\bigcup_{\varphi \in \Gamma}\varphi(Q)=\mathfrak{H}$.</p><p>(c) $\Gamma$ is generated by two elements</p><script type="math/tex; mode=display">\sigma=\begin{pmatrix}1 & 0 \\2 & 1\end{pmatrix},\;\tau=\begin{pmatrix}1 & 2 \\0 & 1\end{pmatrix}</script></blockquote><p><em>Sketch of the proof.</em> Let $\Gamma_1$ be the subgroup of $\Gamma$ generated by $\sigma$ and $\tau$, and show (b’):</p><script type="math/tex; mode=display">\bigcup_{\varphi \in \Gamma_1}\varphi(Q)=\mathfrak{H}.</script><p>Then (a) and (b’) together imply that $\Gamma_1=\Gamma$, which proves (b) and (c). To prove (a), one replaces $\varphi_1$ with the identity element and discusses the relationship between $c$ and $d$ for $\varphi_2=\begin{pmatrix}a &amp; b \\ c &amp; d \end{pmatrix}$. To prove (b’), one needs to notice that</p><script type="math/tex; mode=display">\Sigma = \bigcup_{\varphi \in \Gamma_1}\varphi(Q) \supset\{z \in \mathfrak{H}:|2z-(2m+1)| \ge 1 \text{ for all } m \in \mathbb{Z}\}.</script><p>For $w \in \mathfrak{H}$, pick $\varphi_0 \in \Gamma_1$ that maximises $\Im\varphi_0(w)$; it then suffices to show that $z=\varphi_0(w) \in \Sigma$, whence $w \in \Sigma$.</p><hr><p>We are now ready to introduce the modular function.</p><blockquote><p><strong>Theorem 5.</strong> With the notation above, there exists a function $\lambda \in H(\mathfrak{H})$ such that</p><p>(a) $\lambda \circ \varphi = \lambda$ for every $\varphi \in \Gamma$.</p><p>(b) $\lambda$ is one-to-one on $Q$.</p><p>(c) $\lambda(\mathfrak{H})=\lambda(Q)=E=\mathbb{C}\setminus\{0,1\}$.</p><p>(d) $\lambda$ has the real axis as its natural boundary. 
That is, $\lambda$ has no holomorphic extension to any region that properly contains $\mathfrak{H}$.</p></blockquote><p><em>Proof.</em> Consider</p><script type="math/tex; mode=display">Q_0=\{z \in \mathfrak{H}:0<\Re z <1,|2z-1|>1\} \subset Q.</script><p>This is a simply connected region with simple boundary. There is a continuous function $h$ which is one-to-one on $\overline{Q}_0$ and is holomorphic in $Q_0$ such that $h(Q_0)=\mathfrak{H}$, $h(0)=0$, $h(1)=1$ and $h(\infty)=\infty$. This is a consequence of conformal mapping theory.</p><p>The <a href="https://mathweb.ucsd.edu/~jmckerna/Teaching/19-20/Spring/120B/l_22.pdf">Schwarz reflection principle</a> extends $h$ to a continuous function on $\overline{Q}$ which is a conformal mapping of $Q^\circ$ (the interior of $Q$) onto the plane minus the non-negative real axis, by the formula</p><script type="math/tex; mode=display">h(-x+iy)=\overline{h(x+iy)}.</script><p>Note that the extended $h$ is one-to-one on $Q$, and $h(Q)$ is the set $E$ defined in (c).</p><p>On the boundary of $Q$, the function $h$ is real. In particular,</p><script type="math/tex; mode=display">h(-1+iy)=h(1+iy)=h((-1+iy)+2)=h(\tau(-1+iy))\quad (0<y<\infty)</script><p>and</p><script type="math/tex; mode=display">h\left(-\frac{1}{2}+\frac{1}{2}e^{i\theta}\right)=h\left(\frac{1}{2}+\frac{1}{2}e^{i(\pi-\theta)}\right)=h\left(\sigma\left(-\frac{1}{2}+\frac{1}{2}e^{i\theta}\right)\right).</script><p>We now define</p><script type="math/tex; mode=display">\lambda(z)=h(\varphi^{-1}(z))</script><p>for $\varphi \in \Gamma$ and $z \in \varphi(Q)$. This definition makes sense because for each $z \in \mathfrak{H}$, there is one and only one $\varphi \in \Gamma$ such that $z \in \varphi(Q)$. Properties (a), (b) and (c) follow immediately.</p><p>Notice $\lambda$ is continuous on</p><script type="math/tex; mode=display">Q \cup \tau^{-1}(Q) \cup \sigma^{-1}(Q),</script><p>and therefore on an open set $V$ containing $Q$. 
Cauchy’s theorem shows that $\lambda$ is holomorphic in $V$. Since $\mathfrak{H}$ is covered by the union of the sets $\varphi(V)$ for $\varphi \in \Gamma$, and since $\lambda \circ \varphi = \lambda$, we conclude that $\lambda \in H(\mathfrak{H})$. </p><p>Finally, the set of all numbers $\varphi(0)=b/d$ with $\varphi \in \Gamma$ is dense on the real axis. If $\lambda$ could be analytically continued to a region which properly contains $\mathfrak{H}$, the zeros of $\lambda$ would have a limit point in this region, which is impossible since $\lambda$ is not constant. $\square$</p><p>We are now ready for the pièce de résistance of this post.</p><h1 id="Picard’s-Little-Theorem"><a href="#Picard’s-Little-Theorem" class="headerlink" title="Picard’s Little Theorem"></a>Picard’s Little Theorem</h1><blockquote><p><strong>Theorem (Picard).</strong> If $f$ is an entire function and if there are two distinct complex numbers $\alpha$ and $\beta$ that are not in the range of $f$, then $f$ is constant.</p></blockquote><p>The proof is established by considering an analytic continuation of a function $g$ associated with $f$. The continuation starts at the origin and is justified by the monodromy theorem. Composing with the Cayley transform then gives a bounded entire function, hence $g$ is constant, and so is $f$.</p><p><em>Proof.</em> Without loss of generality, we assume that $\alpha=0$ and $\beta=1$, because otherwise we can replace $f$ with $(f-\alpha)/(\beta-\alpha)$. Then the range of $f$ lies in the set $E$ of theorem 5. There is a disc $A_0$ with centre at $0$ so that $f(A_0)$ lies in a disc $D_0 \subset E$. </p><p>For every disc $D \subset E$, there is an associated region $V \subset \mathfrak{H}$ such that $\lambda$ in theorem 5 is one-to-one on $V$ and $\lambda(V)=D$; each such $V$ intersects at most two of the domains $\varphi(Q)$. 
Corresponding to each choice of $V$, there is a function $\psi \in H(D)$ such that $\psi(\lambda(z))=z$ for all $z \in V$.</p><p>Now let $\psi_0 \in H(D_0)$ be the function such that $\psi_0(\lambda(z))=z$ as above, for a choice of region $V_0$. Define $g(z)=\psi_0(f(z))$ for $z \in A_0$. We claim that $g(z)$ can be analytically continued to an entire function.</p><p>If $D_1$ is another disc in $E$ with $D_0 \cap D_1 \ne \varnothing$, we can choose a corresponding $V_1$ so that $V_0 \cap V_1 \ne \varnothing$. Then $(\psi_0,D_0)$ and $(\psi_1,D_1)$ are direct analytic continuations of each other. Continuing in this way, we find a direct analytic continuation $(\psi_{i+1},D_{i+1})$ of $(\psi_i,D_i)$ with $V_{i+1} \cap V_i \ne \varnothing$. Note $\psi_i(D_i) \subset V_i \subset \mathfrak{H}$ for all $i$.</p><p>Let $\gamma$ be a curve in the plane which starts at $0$. The range of $f \circ \gamma$ is a compact subset of $E$ and therefore $\gamma$ can be covered by a chain of discs, say $A_0,\dots,A_n$, so that each $f(A_i)$ is in a disc $D_i \subset E$. By considering function elements $\{(\psi_{i},D_i)\}$ and composing with $f$ on each $A_i$ (this is safe because $f$ is entire), we get an analytic continuation of $(g,A_0)$ along the chain $(A_0,\dots,A_n)$. Note $\psi_i \circ f(A_i) \subset \psi_i(D_i) \subset \mathfrak{H}$ again. </p><p>Since $\gamma$ is arbitrary, we have shown that $(g,A_0)$ can be analytically continued along every curve in the plane. The monodromy theorem implies that $g$ extends to an entire function. This proves our claim. </p><p>Note that the extended $g$ maps every such $A_i$ into $\mathfrak{H}$. Therefore $g(\mathbb{C}) \subset \mathfrak{H}$. It follows that</p><script type="math/tex; mode=display">h=\frac{g-i}{g+i} \in H(\mathbb{C})</script><p>has range in the unit disc. By Liouville’s theorem, $h$ is a constant function. Thus $g$ is constant too. </p><p>Now we move back to $f$ by looking at $A_0$. 
Since $g=\psi_0 \circ f$ is constant on $A_0$ and $\psi_0$ is one-to-one on $f(A_0)$, the set $f(A_0)$ has to be a singleton. Thus $f$ is constant on $A_0$. Since $A_0$ is a non-empty open set, representing $f$ as a power series on a disc lying inside $A_0$ shows that $f$ has to be constant. $\square$</p><p>Note we have also seen that the range of a non-constant entire function cannot lie in a half plane. But this is a much weaker statement, because a half plane already omits far more than two points.</p><h1 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h1><ul><li>Walter Rudin, <em>Real and Complex Analysis</em>.</li><li>Tammo tom Dieck, <em>Algebraic Topology</em>.</li></ul>]]></content>
    
    
    <summary type="html">We show that a non-constant entire function cannot omit two points, that is, its range cannot lie in a twice-punctured plane.</summary>
    
    
    
    <category term="Analysis" scheme="https://desvl.xyz/categories/Analysis/"/>
    
    <category term="Complex Analysis" scheme="https://desvl.xyz/categories/Analysis/Complex-Analysis/"/>
    
    
    <category term="Walter Rudin" scheme="https://desvl.xyz/tags/Walter-Rudin/"/>
    
    <category term="Study new content" scheme="https://desvl.xyz/tags/Study-new-content/"/>
    
    <category term="Analytic Continuation" scheme="https://desvl.xyz/tags/Analytic-Continuation/"/>
    
  </entry>
  
  <entry>
    <title>SL(2,R) As a Topological Space and Topological Group</title>
    <link href="https://desvl.xyz/2023/08/12/sl2-decomposition/"/>
    <id>https://desvl.xyz/2023/08/12/sl2-decomposition/</id>
    <published>2023-08-12T17:28:22.000Z</published>
    <updated>2026-02-25T01:24:50.256Z</updated>
    
    <content type="html"><![CDATA[<h1 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h1><p>There are a lot of important linear algebraic groups that are widely used in mathematics, physics and industry. Some of them have nice visualisations. For example, it is widely known that $SU(2) \cong S^3$ and $SO(3) \cong \mathbb{RP}^3$. The group $SL(2,\mathbb{R})$ is no less important, but a visualisation of it is harder to find. In this post we show that </p><script type="math/tex; mode=display">SL(2,\mathbb{R}) \cong S^1 \times \mathbb{R} \times \mathbb{R} \cong  S^1 \times D,</script><p>where $D$ is the open unit disk. In other words, $SL(2,\mathbb{R})$ can be considered as a donut, not the shell of it ($S^1 \times S^1$) but the “content” or “flesh” of it. More formally, it is the interior of a solid torus.</p><p>The related core theory can be found in Iwasawa decomposition, but to access it we need Lie group and Lie algebra theories, which involve differential geometry and certainly go beyond the scope of this post. Interested readers can refer to <a href="doi.org/10.1007/978-1-4757-2453-0">Lie Groups Beyond an Introduction</a> chapter 6 for Iwasawa decomposition theory.</p><h1 id="Immediate-topological-consequences"><a href="#Immediate-topological-consequences" class="headerlink" title="Immediate topological consequences"></a>Immediate topological consequences</h1><p>Before we establish the homeomorphism</p><script type="math/tex; mode=display">SL(2,\mathbb{R}) \cong S^1 \times \mathbb{R} \times \mathbb{R} \cong S^1 \times D</script><p>we first see what we can derive from it. </p><ul><li>Is $SL(2,\mathbb{R})$ compact?</li></ul><p>No. Since $D$ is not compact, $S^1 \times D$ cannot be compact.</p><ul><li>What is the fundamental group of $SL(2,\mathbb{R})$?</li></ul><p>Notice there is a (strong) deformation retraction of $S^1 \times D$ onto $S^1$. 
Therefore $\pi_1(SL(2,\mathbb{R})) = \pi_1(S^1)=\mathbb{Z}$.</p><ul><li>Is $SL(2,\mathbb{R})$ connected?</li></ul><p>It is connected because $S^1$ and $D$ are connected. It is not simply connected because the fundamental group is not trivial.</p><ul><li>What is the dimension of $SL(2,\mathbb{R})$ as a manifold?</li></ul><p>The dimension is $3$.</p><h1 id="The-Iwasawa-decomposition"><a href="#The-Iwasawa-decomposition" class="headerlink" title="The Iwasawa decomposition"></a>The Iwasawa decomposition</h1><p>If we jumped directly to the conclusion without mentioning Lie theory, the decomposition would seem to come from nowhere. So instead of defining the subgroups $K$, $A$ and $N$ (which will appear later) up front and verifying that they work, we deduce the decomposition without Lie theory, from the action of $SL(2,\mathbb{R})$ on the upper half plane, because a group action is likely to expose more information about the group.</p><p>Consider the group action of $SL(2,\mathbb{R})$ on the upper half plane</p><script type="math/tex; mode=display">\mathfrak{H}=\{z \in \mathbb{C}:\Im(z)>0\}</script><p>given by</p><script type="math/tex; mode=display">\begin{pmatrix}a & b \\c& d\end{pmatrix}(z)=\frac{az+b}{cz+d}.</script><p>With a somewhat lengthy calculation, one can indeed verify that this is a group action and in particular</p><script type="math/tex; mode=display">\Im\left(\frac{az+b}{cz+d}\right)=\frac{\Im(z)}{|cz+d|^2}.</script><p>As one may guess, it is not wise to continue without investigating the action first, or we will be lost in calculation. 
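</p><p>The imaginary-part identity above can be checked in a few lines. The following is a short worked computation, added for illustration, which uses only that $a,b,c,d$ are real and $ad-bc=1$. Multiplying numerator and denominator by $c\bar{z}+d$ gives</p><script type="math/tex; mode=display">\frac{az+b}{cz+d}=\frac{(az+b)(c\bar{z}+d)}{|cz+d|^2}=\frac{ac|z|^2+bd+adz+bc\bar{z}}{|cz+d|^2},</script><p>and since $ac|z|^2+bd$ is real and $\Im(\bar{z})=-\Im(z)$,</p><script type="math/tex; mode=display">\Im\left(\frac{az+b}{cz+d}\right)=\frac{(ad-bc)\Im(z)}{|cz+d|^2}=\frac{\Im(z)}{|cz+d|^2}.</script><p>In particular, the action maps $\mathfrak{H}$ into itself. </p><p>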
We first show that this action is transitive by showing that for any $z=x+yi \in \mathfrak{H}$, there is some $\sigma \in SL(2,\mathbb{R})$ such that $\sigma(z)=i$:</p><script type="math/tex; mode=display">\frac{az+b}{cz+d}=\frac{ax+b+ayi}{cx+d+cyi}=i \implies \begin{cases}ax+b=-cy \\cx+d=ay\end{cases}</script><p>Let us play with this linear system, subtracting the resulting equations and using $ad-bc=1$:</p><script type="math/tex; mode=display">\begin{cases}ax+b=-cy \\cx+d=ay\end{cases} \implies\begin{cases}acx+bc=-c^2y \\acx+ad=a^2y\end{cases} \implies a^2+c^2=\frac{1}{y}.</script><p>We can put $c=0$ and $a=\frac{1}{\sqrt{y}}$ so that $b=-\frac{x}{\sqrt{y}}$ and $d=\sqrt{y}$. That is,</p><script type="math/tex; mode=display">\begin{pmatrix}\frac{1}{\sqrt{y}} & -\frac{x}{\sqrt{y}} \\0 & \sqrt{y}\end{pmatrix}(x+iy)=i.</script><p>As a consequence:</p><blockquote><p>The action of $SL(2,\mathbb{R})$ on $\mathfrak{H}$ is transitive.</p></blockquote><p><em>Proof.</em> For any $z,z' \in \mathfrak{H}$, there exist $\sigma$ and $\sigma'$ such that $\sigma(z)=i$ and $\sigma'(z')=i$. Then $\sigma'^{-1}(\sigma(z))=z'$, i.e. $\sigma'^{-1}\sigma$ sends $z$ to $z'$. $\square$</p><p>Working at the point $i$ of $\mathfrak{H}$ saves us a lot of trouble. It is then desirable to find the stabiliser of $i$.</p><blockquote><p>The stabiliser of $i \in \mathfrak{H}$ is $SO(2) \cong S^1$.</p></blockquote><p><em>Proof.</em> Suppose $\sigma=\begin{pmatrix} a &amp; b \\c &amp; d \end{pmatrix}$ stabilises $i$. 
Then first of all we have</p><script type="math/tex; mode=display">\Im\left(\frac{ai+b}{ci+d}\right)=\frac{1}{|ci+d|^2}=1 \implies c^2+d^2=1.</script><p>Then</p><script type="math/tex; mode=display">\frac{ai+b}{ci+d}=i \implies ai+b=-c+di \implies \begin{cases}a=d \\ b=-c\end{cases}</script><p>It follows that</p><script type="math/tex; mode=display">\begin{pmatrix}a & b \\c & d\end{pmatrix}\begin{pmatrix}a & c \\b & d\end{pmatrix}=\begin{pmatrix}a^2+b^2 & ac+bd \\ac+bd & c^2+d^2\end{pmatrix}=\begin{pmatrix}1 & 0 \\0 & 1\end{pmatrix}.</script><p>Therefore $\sigma \in O(2) \cap SL(2,\mathbb{R}) = SO(2)$ as expected. Conversely, every element of $SO(2)$ satisfies $a=d$ and $b=-c$, and hence fixes $i$. $\square</p><p>With that said, the action of $SL(2,\mathbb{R})$ on $i$ consists of $SO(2)$, which moves nothing, and the rest, which actually moves things. In other words, $SL(2,\mathbb{R})/SO(2) \cong \mathfrak{H}$ as a $2$-manifold. Moreover, each element of $SL(2,\mathbb{R})$ acts as an isometry of the hyperbolic metric on $\mathfrak{H}$. We now single out the effective part of the group action. For $\sigma \in SL(2,\mathbb{R})$, we assume that $\sigma(i)=x+iy$. Then</p><script type="math/tex; mode=display">\sigma(i)=\begin{pmatrix}\frac{1}{\sqrt{y}} & -\frac{x}{\sqrt{y}} \\0 & \sqrt{y}\end{pmatrix}^{-1}(i)=\begin{pmatrix}\sqrt{y} & \frac{x}{\sqrt{y}} \\ 0 & \frac{1}{\sqrt{y}}\end{pmatrix}(i).</script><p>Let $B$ be the group of upper triangular matrices in $SL(2,\mathbb{R})$ with positive diagonal elements. Then it is the elements of $B$ that actually move things. This gives a first decomposition:</p><blockquote><p> The matrix multiplication map $B \times SO(2) \to SL(2,\mathbb{R})$ is surjective. </p></blockquote><p><em>Proof.</em> Notice that every element of $B$ can be written in the form </p><script type="math/tex; mode=display">\lambda_{x,y}=\begin{pmatrix}\sqrt{y} & \frac{x}{\sqrt{y}} \\ 0 & \frac{1}{\sqrt{y}}\end{pmatrix}.</script><p>For any $\sigma \in SL(2,\mathbb{R})$, suppose $\sigma(i)=x+iy$, then $\sigma(i)=\lambda_{x,y}(i)$, therefore $\lambda_{x,y}^{-1}\sigma(i)=i$, i.e. 
$\lambda_{x,y}^{-1}\sigma$ lies in the stabiliser $SO(2)$ of $i$. Hence the product $\sigma = \lambda_{x,y}(\lambda_{x,y}^{-1}\sigma)$ always lies in the image of $B \times SO(2)$. $\square$</p><p>We can decompose $B$ further:</p><script type="math/tex; mode=display">\begin{pmatrix}\sqrt{y} & \frac{x}{\sqrt{y}} \\ 0 & \frac{1}{\sqrt{y}}\end{pmatrix}=\begin{pmatrix}1 & x \\ 0 & 1\end{pmatrix}\begin{pmatrix}\sqrt{y} & 0 \\ 0 & \frac{1}{\sqrt{y}} \end{pmatrix}.</script><p><strong>Let $N$ be the group of upper triangular matrices in $SL(2,\mathbb{R})$ with $1$ on the diagonal and let $A$ be the group of diagonal matrices in $SL(2,\mathbb{R})$ with positive diagonal entries.</strong> Then $B=NA$. <strong>Let $K=SO(2) \subset SL(2,\mathbb{R})$</strong>, then we have obtained the so-called Iwasawa decomposition:</p><blockquote><p>There is a diffeomorphism onto</p><script type="math/tex; mode=display">\begin{aligned}N \times A \times K &\to SL(2,\mathbb{R}) \\(n,a,k) &\mapsto nak\end{aligned}</script></blockquote><p><em>Proof.</em> Surjectivity was shown above, and the smoothness of the map and of its inverse can be read off from the explicit formulas, so it only remains to show injectivity. Suppose $n_1a_1k_1=n_2a_2k_2$. Applying both sides to $i$ we obtain $n_1a_1(i)=n_2a_2(i)$. Suppose</p><script type="math/tex; mode=display">n_i=\begin{pmatrix}1 & x_i \\0 & 1\end{pmatrix},\;a_i=\begin{pmatrix}\sqrt{y_i} & 0 \\0 & \frac{1}{\sqrt{y_i}}\end{pmatrix},\; i=1,2.</script><p>Then we have $n_1a_1(i)=x_1+y_1i=n_2a_2(i)=x_2+y_2i$. It follows that $x_1=x_2$ and $y_1=y_2$, i.e. $n_1=n_2$ and $a_1=a_2$ and therefore $k_1=k_2$. $\square$</p><p>By investigating $N$ and $A$ further we obtain</p><blockquote><p>The group $SL(2,\mathbb{R})$ is homeomorphic to $S^1 \times D$.</p></blockquote><p><em>Proof.</em> Notice that $N$ is homeomorphic to $\mathbb{R}$ and $A$ is homeomorphic to $\mathbb{R}_{&gt;0}\cong \mathbb{R}$. Hence $SL(2,\mathbb{R}) \cong \mathbb{R} \times \mathbb{R} \times S^1$, and $\mathbb{R}^2$ is homeomorphic to the open disc $D$. $\square$</p><p>Notice the order of $N,A,K$ does not matter very much: $NAK,KAN,ANK,KNA$ are the same thing. 
This is because $AN=NA$ and for $nak \in SL(2,\mathbb{R})$, we have $(nak)^{-1}=k^{-1}a^{-1}n^{-1}$ which lies in the image of $K \times A \times N$ under matrix multiplication.</p><h1 id="Immediate-group-theoretical-consequences"><a href="#Immediate-group-theoretical-consequences" class="headerlink" title="Immediate group-theoretical consequences"></a>Immediate group-theoretical consequences</h1><p>With the full Iwasawa decomposition in mind, we can scratch the surface of the rather complicated $SL(2,\mathbb{R})$.</p><blockquote><p>The only continuous homomorphism of $SL(2,\mathbb{R})$ to $\mathbb{R}$ is trivial.</p></blockquote><p><em>Proof.</em> Let $f:SL(2,\mathbb{R}) \to \mathbb{R}$ be such a map. Every element can be written as $kan$, and we have $f(kan)=f(k)+f(a)+f(n)$. We need to show that $f(k)=f(a)=f(n)=0$.</p><p>First of all, since $K$ is a compact subgroup of $SL(2,\mathbb{R})$, its image in $\mathbb{R}$ has to be a compact subgroup. The only compact subgroup of $\mathbb{R}$ is $\{0\}$, so $f(k)=0$. On the other hand, $f$ on $A$ and $N$ can be described more explicitly. For $A$, we see $\begin{pmatrix}r &amp; 0 \\ 0 &amp; \frac{1}{r} \end{pmatrix} \mapsto r \mapsto \log{r}$ yields an isomorphism of $A$ and $\mathbb{R}$, in both the algebraic and the topological sense. For $N$ on the other hand, we immediately have an isomorphism $\begin{pmatrix}1 &amp; x \\ 0 &amp; 1\end{pmatrix} \mapsto x$. Since every continuous homomorphism $\mathbb{R} \to \mathbb{R}$ is linear, the restrictions of $f$ to $A$ and $N$ can be realised as $u\log{r}$ and $vx$ for some $u,v \in \mathbb{R}$. We use the fact that $AN=NA$ to determine $u$ and $v$. 
Notice that</p><script type="math/tex; mode=display">\begin{pmatrix}r & 0 \\ 0 & \frac{1}{r} \end{pmatrix}\begin{pmatrix}1 & x \\ 0 & 1\end{pmatrix}=\begin{pmatrix}1 & r^2x \\ 0 & 1\end{pmatrix}\begin{pmatrix}r & 0 \\ 0 & \frac{1}{r}\end{pmatrix},</script><p>applying $f$ on both sides, we have</p><script type="math/tex; mode=display">u\log{r}+vx=vr^2x+u\log{r}\;\forall x\in \mathbb{R},\;r>0 \implies v=0</script><p>For $u$, we consider the conjugate relation</p><script type="math/tex; mode=display">\begin{pmatrix}0 & -1 \\1 & 0\end{pmatrix}\begin{pmatrix}r & 0 \\ 0 & \frac{1}{r} \end{pmatrix}\begin{pmatrix}0 & -1 \\1 & 0\end{pmatrix}^{-1}=\begin{pmatrix}\frac{1}{r} & 0 \\ 0 & r \end{pmatrix}=\begin{pmatrix}r & 0 \\ 0 & \frac{1}{r} \end{pmatrix}^{-1}.</script><p>Applying $f$ on both sides we obtain</p><script type="math/tex; mode=display">u\log{r}=-u\log{r},\;\forall r>0 \implies u=0.</script><p>This proves the triviality of $f$. $\square$</p><blockquote><p>Let $f:SL(2,\mathbb{R}) \to GL(n,\mathbb{R})$ be a continuous homomorphism, then $f(SL(2,\mathbb{R})) \subset SL(n,\mathbb{R})$.</p></blockquote><p><em>Proof.</em> Consider the sequence of group homomorphisms</p><script type="math/tex; mode=display">SL(2,\mathbb{R}) \xrightarrow{f}GL(n,\mathbb{R}) \xrightarrow{\det}\mathbb{R}^\ast</script><p>Since $SL(2,\mathbb{R})$ is connected, we see $\det\circ f(SL(2,\mathbb{R}))$ is connected, thus lying in $\mathbb{R}_{&gt;0}$. We can then modify the sequence a little bit:</p><script type="math/tex; mode=display">SL(2,\mathbb{R}) \xrightarrow{f}GL(n,\mathbb{R}) \xrightarrow{\det}\mathbb{R}_{>0} \xrightarrow{\log}\mathbb{R}\xrightarrow{\exp}\mathbb{R}_{>0}</script><p>The map $\log \circ \det \circ f$ is a continuous homomorphism sending $SL(2,\mathbb{R})$ to $\mathbb{R}$, which is trivial, and therefore</p><script type="math/tex; mode=display">\exp \circ \log \circ \det \circ f(SL(2,\mathbb{R}))=\det\circ f(SL(2,\mathbb{R}))=\{1\}.</script><p>This proves our assertion. 
$\square$</p><p>There is still a lot we can do without much Lie theory, using only Haar measure theory. The reader is advised to try this <a href="https://metaphor.ethz.ch/x/2018/fs/401-3226-00L/ex/ex1.pdf">exercise set</a> to see, for example, that the “volume” of $SL(2,\mathbb{R})/SL(2,\mathbb{Z})$ is $\zeta(2)$. In the references / further reading section the reader will also find a way to show that $SL(2,\mathbb{Z})\backslash SL(2,\mathbb{R})/SO(2,\mathbb{R})$ has volume $\frac{\pi}{3}$.</p><h1 id="References-Further-Reading"><a href="#References-Further-Reading" class="headerlink" title="References / Further Reading"></a>References / Further Reading</h1><ul><li>Keith Conrad, <em>Decomposing $SL(2,\mathbb{R})$</em>. <a href="https://kconrad.math.uconn.edu/blurbs/grouptheory/SL(2,R).pdf">https://kconrad.math.uconn.edu/blurbs/grouptheory/SL(2,R).pdf</a></li><li>Serge Lang, $SL(2,\mathbb{R})$. (He wrote this book for self-study because he didn’t know much about Lie theory. In chapter 3 one can see how to do calculus with $SL(2,\mathbb{R})$.)</li><li>Anthony W. Knapp, <em>Lie Groups Beyond an Introduction</em>.</li><li>Merrick Cai, <em>The Volume of $SL(2,\mathbb{Z})\backslash SL(2,\mathbb{R})/SO(2,\mathbb{R})$</em>. <a href="https://math.mit.edu/~mqt/math/teaching/mit/18-704/final-papers/cai-m_final.pdf">https://math.mit.edu/~mqt/math/teaching/mit/18-704/final-papers/cai-m_final.pdf</a> (A more concrete example of doing calculus on this group.)</li></ul>]]></content>
    
    
    <summary type="html">In this post we show that $SL(2,&#92;mathbb{R})$ can be identified as the inside of a solid torus and see what we can learn from it.</summary>
    
    
    
    <category term="Algebra" scheme="https://desvl.xyz/categories/Algebra/"/>
    
    <category term="Group Theory" scheme="https://desvl.xyz/categories/Algebra/Group-Theory/"/>
    
    <category term="Lie Groups" scheme="https://desvl.xyz/categories/Algebra/Lie-Groups/"/>
    
    <category term="Topology" scheme="https://desvl.xyz/categories/Topology/"/>
    
    
    <category term="Study new content" scheme="https://desvl.xyz/tags/Study-new-content/"/>
    
    <category term="Lie theory" scheme="https://desvl.xyz/tags/Lie-theory/"/>
    
  </entry>
  
  <entry>
    <title>Artin&#39;s Theorem of Induced Characters</title>
    <link href="https://desvl.xyz/2023/07/17/artin-theorem/"/>
    <id>https://desvl.xyz/2023/07/17/artin-theorem/</id>
    <published>2023-07-17T19:39:59.000Z</published>
    <updated>2025-05-16T23:53:35.204Z</updated>
    
    <content type="html"><![CDATA[<h1 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h1><p>When some subspaces of a linear space are known, we are interested in the contribution of these subspaces, which we study through their sum, or their (inner) direct sum if possible. This philosophy can be applied to many other fields.</p><p>In the context of representation theory, say we are given a finite group $G$ with a subgroup $H$. We want to know how a character of $H$ is related to a character of $G$, most naturally through induction. Next we state the content of this post more formally.</p><p>Let $G$ be a finite group with distinct irreducible characters $\chi_1,\dots,\chi_h$. A class function $f$ on $G$ is a character if and only if it is a linear combination of the $\chi_i$’s with non-negative integer coefficients. We denote the set of characters by $R^+(G)$. However, $R^+(G)$ lacks a satisfying algebraic structure: for example, one cannot even subtract freely. For this reason, we extend the coefficients to all integers by defining</p><script type="math/tex; mode=display">R(G)=\mathbb{Z}\chi_1 \oplus \cdots \oplus \mathbb{Z}\chi_h.</script><p>An element of $R(G)$ is called a <em>virtual character</em> because when the coefficient of some $\chi_i$ is negative, it cannot be a character in the usual sense. Note that $R(G)$ is a finitely generated free abelian group, hence we are free to do subtraction in the normal sense.</p><p>Besides, since the product of two characters is still a character, we see that $R(G)$ is a commutative ring. To be precise, it is a subring of the ring $F_\mathbb{C}(G)$, the ring of class functions of $G$ over $\mathbb{C}$. Furthermore, we actually have $F_\mathbb{C}(G) \cong \mathbb{C} \otimes R(G)$.</p><p>Let $H$ be a subgroup of $G$. 
Then the operations of restriction and induction define homomorphisms $\mathrm{Res}:R(G) \to R(H)$ and $\mathrm{Ind}:R(H) \to R(G)$. Extending Frobenius reciprocity linearly, we still find that $\mathrm{Res}$ and $\mathrm{Ind}$ are adjoints of each other. We also notice that the image of $\mathrm{Ind}:R(H) \to R(G)$ is an ideal of $R(G)$. This is because, for any $\varphi \in R(H)$ and $\psi \in R(G)$, one has</p><script type="math/tex; mode=display">\mathrm{Ind}(\varphi)\cdot \psi = \mathrm{Ind}(\varphi \cdot \mathrm{Res}(\psi)) \in \operatorname{Im}(\mathrm{Ind}).</script><p>But being an ideal should not be the end of our story. We want to know what happens if we consider more than one subgroup. For example, since every group is the union of its cyclic subgroups, what if we consider all cyclic subgroups of $G$? We are also interested in how all these ideals work together. This is where Artin’s theorem comes in.</p><h1 id="Artin’s-Theorem-Statement-and-a-Concrete-Example"><a href="#Artin’s-Theorem-Statement-and-a-Concrete-Example" class="headerlink" title="Artin’s Theorem - Statement and a Concrete Example"></a>Artin’s Theorem - Statement and a Concrete Example</h1><blockquote><p><strong>Artin’s Theorem.</strong> Let $X$ be a family of subgroups of a finite group $G$. Let $\mathrm{Ind}:\bigoplus_{H \in X}R(H) \to R(G)$ be the homomorphism defined by the family of $\mathrm{Ind}_H^G$, $H \in X$. Then the following statements are equivalent:</p><p>(i) $G$ is the union of the conjugates of all $H \in X$. Equivalently, for any $\sigma \in G$, there is some $H \in X$ such that $H$ contains a conjugate of $\sigma$.</p><p>(ii) The cokernel of $\mathrm{Ind}:\bigoplus_{H \in X}R(H) \to R(G)$ is finite. </p></blockquote><p><strong>Example.</strong> Put $G=D_4$, the dihedral group consisting of rotations ($\sigma$) and flips ($\tau$) of the square. 
We write</p><script type="math/tex; mode=display">G=\{1,\sigma,\sigma^2,\sigma^3,\tau,\tau\sigma,\tau\sigma^2,\tau\sigma^3\}.</script><p>In this example we take $X=\{\langle\sigma\rangle,\langle\tau\rangle,\langle\tau\sigma\rangle\}$. First of all we put down the character table of $G$:</p><script type="math/tex; mode=display">\begin{array}{c|ccccc}   G & 1 & \sigma^2 & \sigma & \tau & \tau\sigma \\   \hline   \chi_1 & 1 & 1 & 1 & 1 & 1 \\   \chi_2 & 1 & 1 & -1 & 1 & -1 \\   \chi_3 & 1 & 1 & -1 & -1 & 1 \\   \chi_4 & 1 & 1 & 1 & -1 & -1 \\    \chi_5 & 2 & -2 & 0 & 0 & 0\end{array}</script><p>The character tables of the elements of $X$ are not difficult to write down, as these are cyclic groups.</p><script type="math/tex; mode=display">\begin{array}{c|cccc}\langle\sigma\rangle & 1 & \sigma & \sigma^2 & \sigma^3 \\\hline\chi_1^\sigma & 1 & 1 & 1 & 1 \\\chi_2^\sigma & 1 & i & -1 & -i \\\chi_3^\sigma & 1 & -1 & 1 & -1 \\\chi_4^\sigma & 1 & -i & -1 & i\end{array}\quad\quad\begin{array}{c|cc}\langle\tau\rangle & 1 &\tau \\\hline\chi_1^\tau & 1 & 1 \\\chi_2^\tau & 1 & -1 \end{array}\quad\quad\begin{array}{c|cc}\langle\tau\sigma\rangle & 1 &\tau\sigma \\\hline\chi_1^{\tau\sigma} & 1 & 1 \\\chi_2^{\tau\sigma} & 1 & -1 \end{array}</script><p>Instead of writing something like $\mathrm{Ind}_{\langle\sigma\rangle}^{D_4}\chi_1^\sigma=\chi_1+\chi_4$ manually for all characters, we put all of them in an induction-restriction table:</p><script type="math/tex; mode=display">\begin{array}{c|cccc|cc|cc}D_4 & \langle\sigma\rangle & & & &\langle\tau\rangle & & \langle\tau\sigma\rangle & \\ &\chi_1^\sigma & \chi_2^\sigma & \chi_3^\sigma &\chi_4^\sigma &\chi_1^\tau & \chi_2^\tau & \chi_1^{\tau\sigma} & \chi_2^{\tau\sigma} \\ \hline\chi_1 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\\chi_2 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 \\\chi_3 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 \\\chi_4 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\\chi_5 & 0 & 1 & 0 & 1 & 1 & 1 & 1 & 1\end{array}</script><p>which yields a matrix 
naturally:</p><script type="math/tex; mode=display">T=\left(\begin{array}{cccc|cc|cc}1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 \\0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 \\1 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\0 & 1 & 0 & 1 & 1 & 1 & 1 & 1\end{array}\right).</script><p>How should one read the induction-restriction table? For example, the first column consists of the inner products $\langle \mathrm{Ind}_{\langle\sigma\rangle}^{D_4}\chi_1^\sigma,\chi_j\rangle$ for $j=1,\dots,5$. Since $\mathrm{Ind}_{\langle\sigma\rangle}^{D_4}\chi_1^\sigma=\chi_1+\chi_4$, the column becomes $(1,0,0,1,0)$. On the other hand, the rows can be read off from inner products with restrictions. For example, since $\mathrm{Res}_{\langle\sigma\rangle}^{D_4}\chi_5=\chi_2^\sigma+\chi_4^\sigma$, we have $\langle\chi_4^\sigma,\mathrm{Res}_{\langle\sigma\rangle}^{D_4}\chi_5\rangle=1$ and therefore $T_{54}=1$. Induction and restriction coexist up to a transpose, which is another way to illustrate Frobenius reciprocity.</p><p>We obtain the induction map explicitly:</p><script type="math/tex; mode=display">R(\langle\sigma\rangle) \oplus R(\langle\tau\rangle) \oplus R(\langle\tau\sigma\rangle)  \xrightarrow{\mathrm{Ind}}R(D_4), \\\mathbb{Z}^4 \oplus \mathbb{Z}^2 \oplus \mathbb{Z}^2 \xrightarrow{\text{multiplication by $T$}} \mathbb{Z}^5</script><p>where the basis of $R(D_4)$ is $\chi_1,\dots,\chi_5$ and the basis of $R(\langle\sigma\rangle) \oplus R(\langle\tau\rangle) \oplus R(\langle\tau\sigma\rangle)$ is given by the second row of the induction-restriction table. By doing Gaussian elimination on the rows and columns of $T$ (over $\mathbb{Z}$), i.e. 
changing the basis for $\mathbb{Z}^5$ and $\mathbb{Z}^8$, the matrix $T$ is reduced to the form</p><script type="math/tex; mode=display">U=\begin{pmatrix}1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\0 & 0 & 0 & 0 & 2 & 0 & 0 & 0\end{pmatrix}</script><p>The image of $U$ is $\mathbb{Z} \oplus \mathbb{Z} \oplus \mathbb{Z} \oplus \mathbb{Z} \oplus 2\mathbb{Z}$, hence the cokernel of the induction map is</p><script type="math/tex; mode=display">\mathbb{Z}^5/(\mathbb{Z} \oplus \mathbb{Z} \oplus \mathbb{Z} \oplus \mathbb{Z} \oplus 2\mathbb{Z})\cong \mathbb{Z}/2\mathbb{Z},</script><p>which is certainly finite. One can also verify that $X$ satisfies (i).</p><h1 id="Proof-of-Artin’s-Theorem"><a href="#Proof-of-Artin’s-Theorem" class="headerlink" title="Proof of Artin’s Theorem"></a>Proof of Artin’s Theorem</h1><h2 id="i-gt-ii"><a href="#i-gt-ii" class="headerlink" title="(i) =&gt; (ii)"></a>(i) =&gt; (ii)</h2><p>Consider the exact sequence</p><script type="math/tex; mode=display">\bigoplus_{H \in X}R(H) \xrightarrow{\mathrm{Ind}}R(G) \to \mathrm{coker}(\mathrm{Ind}) \to 0.</script><p>To show that $\mathrm{coker}(\mathrm{Ind})$ is finite (it is a finitely generated abelian group to begin with), it suffices to show that it vanishes after tensoring with $\mathbb{Q}$, in other words, that</p><script type="math/tex; mode=display">1_\mathbb{Q}\otimes\mathrm{Ind}:\bigoplus_{H \in X}\mathbb{Q} \otimes R(H) \to \mathbb{Q}\otimes R(G)</script><p>is a surjective map, i.e. it has trivial cokernel. 
This is equivalent to the surjectivity of the $\mathbb{C}$-linear map</p><script type="math/tex; mode=display">1_\mathbb{C}\otimes\mathrm{Ind}:\bigoplus_{H\in X}\mathbb{C}\otimes R(H) \to \mathbb{C} \otimes R(G).</script><p>By Frobenius reciprocity, this is in turn equivalent to the injectivity of</p><script type="math/tex; mode=display">1 \otimes \mathrm{Res}:\mathbb{C}\otimes R(G) \to \bigoplus_{H \in X}\mathbb{C} \otimes R(H).</script><p>Notice that $\mathbb{C} \otimes R(G)$ is the space of class functions of $G$. Let $f$ be a class function of $G$ whose restriction to each $H \in X$ is $0$. Since $f$ is constant on conjugacy classes, it vanishes on every conjugate of every $H \in X$, and according to (i) these conjugates cover $G$. Therefore $f$ is $0$ everywhere.</p><h2 id="ii-gt-i"><a href="#ii-gt-i" class="headerlink" title="(ii) =&gt; (i)"></a>(ii) =&gt; (i)</h2><p>Let $S$ be the union of the conjugates of the subgroups $H \in X$. Every element in the image of $\mathrm{Ind}:\bigoplus_{H \in X}R(H) \to R(G)$ is of the form $g=\sum_{H \in X}\mathrm{Ind}_H^G(f_H)$ with $f_H \in R(H)$. By the formula for induced characters, such a $g$ always vanishes on $G \setminus S$. If (ii) holds, then</p><script type="math/tex; mode=display">1_\mathbb{C}\otimes\mathrm{Ind}:\bigoplus_{H\in X}\mathbb{C}\otimes R(H) \to \mathbb{C} \otimes R(G)</script><p>is a surjective map. Therefore all class functions of $G$, i.e. elements of $\mathbb{C} \otimes R(G)$, vanish on $G \setminus S$. Since the constant function $1$ is a class function, this forces $G \setminus S$ to be empty, i.e. $G=S$.</p><h1 id="References-Further-Reading"><a href="#References-Further-Reading" class="headerlink" title="References / Further Reading"></a>References / Further Reading</h1><ul><li>Jean-Pierre Serre, <em>Linear Representations of Finite Groups</em>.</li><li>Kiyoshi Igusa, <em><a href="https://people.brandeis.edu/~igusa/Math101bS07/Math101b_notesD.pdf">ALGEBRA II, PART D: REPRESENTATIONS OF GROUPS</a></em></li></ul>]]></content>
    
    
    <summary type="html">We give a relatively detailed proof of Artin&#39;s theorem in the representation theory of finite groups, together with an example involving a dihedral group.</summary>
    
    
    
    <category term="Algebra" scheme="https://desvl.xyz/categories/Algebra/"/>
    
    <category term="Representation Theory" scheme="https://desvl.xyz/categories/Algebra/Representation-Theory/"/>
    
    <category term="Linear Algebra" scheme="https://desvl.xyz/categories/Linear-Algebra/"/>
    
    <category term="Finite Groups" scheme="https://desvl.xyz/categories/Algebra/Finite-Groups/"/>
    
    
    <category term="Study new content" scheme="https://desvl.xyz/tags/Study-new-content/"/>
    
    <category term="Jean-Pierre Serre" scheme="https://desvl.xyz/tags/Jean-Pierre-Serre/"/>
    
  </entry>
  
  <entry>
    <title>Chinese Remainder Theorem in Several Scenarios of Ring Theory</title>
    <link href="https://desvl.xyz/2023/05/27/chinese-remainder-theorem-ring-theory/"/>
    <id>https://desvl.xyz/2023/05/27/chinese-remainder-theorem-ring-theory/</id>
    <published>2023-05-27T11:09:12.000Z</published>
    <updated>2025-05-11T12:05:00.444Z</updated>
    
    <content type="html"><![CDATA[<h1 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h1><p>The classic version of the Chinese remainder theorem tells us that we can solve systems of congruences. You may have seen this poem when you were young.</p><blockquote><p>有物不知其數，三三數之剩二，五五數之剩三，七七數之剩二。問物幾何？</p></blockquote><p>Translation:</p><blockquote><p>There are certain things whose number is unknown. If we count them by threes, we have two left over; by fives, we have three left over; and by sevens, two are left over. How many things are there?</p></blockquote><p>This poem can be translated into finding a solution of a system of congruences:</p><script type="math/tex; mode=display">\begin{cases}x \equiv 2 \pmod 3, \\x \equiv 3 \pmod 5, \\x \equiv 2 \pmod 7.\end{cases}</script><p>In modern language, we consider the integer ring $\mathbb{Z}$ (Z for Zahlen in German). The ideals $(3)$, $(5)$ and $(7)$ are pairwise coprime (or comaximal), and we consider the map</p><script type="math/tex; mode=display">\begin{aligned}\mathbb{Z} &\to \mathbb{Z}/(3) \times \mathbb{Z}/(5) \times \mathbb{Z}/(7) \\n &\mapsto (n +(3), n +(5), n +(7))\end{aligned}</script><p>The poem is all about finding a representative of the preimage of $(2 +(3), 3+(5), 2+ (7))$. The Chinese remainder theorem tells us that the map is surjective, so such a representative exists. Indeed, $233$ is such an element. </p><p>Nevertheless, what really matters here is about rings and ideals. That is why we try to revisit the Chinese remainder theorem in the language of ring theory. Although the importance of the classic version should never be ignored, we should also look further.</p><p>We will study the Chinese remainder theorem in a ring, assuming that the ring is commutative or something even weaker. We will also see a special case in Dedekind domains. 
We try to make as few assumptions as possible.</p><h1 id="Chinese-Remainder-Theorem"><a href="#Chinese-Remainder-Theorem" class="headerlink" title="Chinese Remainder Theorem"></a>Chinese Remainder Theorem</h1><p>We want to impose as few restrictions as possible. Let $A$ be a ring that is not necessarily commutative and does not necessarily contain a unit. A lot of things developed in ring theory will fail here, but we can still consider <strong>direct product</strong> of rings and <strong>coprime</strong> (or comaximal) ideals. Two ideals $\mathfrak{a}$ and $\mathfrak{b}$ of $A$ are coprime if $\mathfrak{a}+\mathfrak{b}=A$. We will do the Chinese remainder theorem on two levels of abstraction. Throughout, all ideals are two-sided.</p><h2 id="Level-1-Ring-with-Unity"><a href="#Level-1-Ring-with-Unity" class="headerlink" title="Level 1 - Ring with Unity"></a>Level 1 - Ring with Unity</h2><p>When the ring has a unit, we have an easy view of the intersection and product of ideals.</p><blockquote><p><strong>Proposition 1.</strong> Let $A$ be a ring with unity, $\mathfrak{a}$ and $\mathfrak{b}$ two ideals of $A$. If $\mathfrak{a}$ and $\mathfrak{b}$ are coprime, i.e. $\mathfrak{a}+\mathfrak{b}=A$, then</p><script type="math/tex; mode=display">\mathfrak{a} \cap \mathfrak{b}=\mathfrak{ab}+\mathfrak{ba}.</script><p>In particular, when $A$ is commutative, one always has $\mathfrak{a} \cap \mathfrak{b} = \mathfrak{ab}$.</p></blockquote><p><em>Proof.</em> The last statement follows from the relation that $\mathfrak{ab}=\mathfrak{ba}$, therefore it suffices to prove the first relation. Notice that there exist $x \in \mathfrak{a}$ and $y \in \mathfrak{b}$ such that $x+y=1$. 
As a result, for any $a \in \mathfrak{a} \cap \mathfrak{b}$, one has</p><script type="math/tex; mode=display">a=a(x+y)=ax+ay \in \mathfrak{ba}+\mathfrak{ab} \implies \mathfrak{a} \cap \mathfrak{b} \subset \mathfrak{ba}+\mathfrak{ab}.</script><p>Conversely, since both $\mathfrak{a}$ and $\mathfrak{b}$ are two-sided ideals, we see $\mathfrak{ab} \subset \mathfrak{a} \cap \mathfrak{b}$ and $\mathfrak{ba} \subset \mathfrak{a} \cap \mathfrak{b}$, and hence their sum is contained in $\mathfrak{a} \cap \mathfrak{b}$ as well. $\square$</p><p>Let $A$ be a ring with unity. Consider a finite number of ideals $\mathfrak{a}_1,\dots,\mathfrak{a}_n$. Define a homomorphism</p><script type="math/tex; mode=display">\begin{aligned}\phi:A &\to \prod_{i=1}^{n}(A/\mathfrak{a}_i), \\x &\mapsto (x+\mathfrak{a}_1, \dots, x+\mathfrak{a}_n).\end{aligned}</script><p>We do not assume that these $\mathfrak{a}_i$ are pairwise coprime just yet. We will see what happens when they are.</p><blockquote><p><strong>Theorem 1.</strong> For the homomorphism $\phi$ defined above,</p><ol><li>$\phi$ is injective if and only if $\bigcap_{i=1}^{n}\mathfrak{a}_i=(0)$.</li><li>If the $\mathfrak{a}_i$ are pairwise coprime and $A$ is commutative, then $\prod_{i=1}^{n}\mathfrak{a}_i=\bigcap_{i=1}^{n}\mathfrak{a}_i$.</li><li>$\phi$ is surjective if and only if the $\mathfrak{a}_i$ are pairwise coprime.</li></ol></blockquote><p><em>Proof.</em> The first statement follows from the fact that $\ker\phi=\bigcap_{i=1}^{n}\mathfrak{a}_i$.</p><p>For the second statement, according to proposition 1, this equality holds for $n=2$. Now suppose that $n&gt;2$ and this statement holds for $\mathfrak{a}_1,\dots,\mathfrak{a}_{n-1}$. 
Let $\mathfrak{b}=\prod_{i=1}^{n-1}\mathfrak{a}_i=\bigcap_{i=1}^{n-1}\mathfrak{a}_i$, then we will show that $\mathfrak{a}_n+\mathfrak{b}=A$ and therefore</p><script type="math/tex; mode=display">\mathfrak{b}\mathfrak{a}_n=\prod_{i=1}^{n}\mathfrak{a}_i=\mathfrak{b} \cap \mathfrak{a}_n = \bigcap_{i=1}^{n}\mathfrak{a}_i.</script><p>Notice that $\mathfrak{a}_i$ and $\mathfrak{a}_n$ are coprime for all $1 \le i \le n-1$. Therefore, in particular, for each of these $i$, we have an equation $x_i+y_i =1$ where $x_i \in \mathfrak{a}_i$ and $y_i \in \mathfrak{a}_n$. Here we use the fact that $1 \in A$. Commutativity of $A$ is also used because proposition 1 shows us that without commutativity we cannot even prove it for $n=2$.</p><p>From this equation we deduce that</p><script type="math/tex; mode=display">\prod_{i=1}^{n-1}x_i = \prod_{i=1}^{n-1}(1-y_i).</script><p>Expanding the product on the right hand side, we see $\prod_{i=1}^{n-1}(1-y_i)=\prod_{i=1}^{n-1}x_i \equiv 1 \pmod{\mathfrak{a}_n}$. This implies that there exists $y \in \mathfrak{a}_n$ such that $\prod_{i=1}^{n-1}x_i+y=1$. Since $\prod_{i=1}^{n-1}x_i \in \mathfrak{b}$, we have shown that $\mathfrak{a}_n+\mathfrak{b}=A$.</p><p>For the third statement, we first assume that $\phi$ is surjective. It suffices to show, for example, that $\mathfrak{a}_1$ and $\mathfrak{a}_2$ are coprime. There exists $x \in A$ such that $\phi(x)=(1+\mathfrak{a}_1,0+\mathfrak{a}_2,\dots,0+\mathfrak{a}_n)$. This shows us that $x \in \mathfrak{a}_2$ and $1-x \in \mathfrak{a}_1$. As a result,</p><script type="math/tex; mode=display">1=(1-x)+x \in \mathfrak{a}_1+\mathfrak{a}_2,</script><p>hence these two ideals are coprime. 
This procedure applies to every other pair of ideals by merely modifying the indices.</p><p>Conversely, assume the $\mathfrak{a}_i$ are pairwise coprime. It suffices to show that there exists $x \in A$ such that $\phi(x)=(1+\mathfrak{a}_1,0+\mathfrak{a}_2,\dots,0+\mathfrak{a}_n)$. Indeed, the same procedure applies to every other component, and $1$ can then be replaced with any other element $a$ of $A$ (in this case, we replace $x$ with $ax$). All other elements of the product can be obtained by addition.</p><p>Since $\mathfrak{a}_1+\mathfrak{a}_i=(1)$ for all $i&gt;1$, we have $u_i+v_i = 1$ where $u_i \in \mathfrak{a}_1$ and $v_i \in \mathfrak{a}_i$. Take</p><script type="math/tex; mode=display">x=\prod_{i=2}^{n}v_i = \prod_{i=2}^{n}(1-u_i),</script><p>then we see $x \equiv 0 \pmod{\mathfrak{a}_i}$ for all $i&gt;1$ but $x \equiv 1 \pmod{\mathfrak{a}_1}$. This $x$ will be mapped to $(1+\mathfrak{a}_1,0+\mathfrak{a}_2,\dots,0+\mathfrak{a}_n)$ as expected. $\square$</p><p>It also matters that ideals are two-sided, otherwise these products of $n-1$ terms will make less sense.</p><blockquote><p><strong>Corollary 1 (Chinese Remainder Theorem).</strong> If the $\mathfrak{a}_i$ are pairwise coprime, then $\phi$ induces an isomorphism:</p><script type="math/tex; mode=display">\frac{A}{\bigcap_{i=1}^{n}\mathfrak{a}_i} \cong \prod_{i=1}^{n}\frac{A}{\mathfrak{a}_i}.</script><p>If $A$ is commutative, then</p><script type="math/tex; mode=display">\frac{A}{\prod_{i=1}^{n}\mathfrak{a}_i} \cong \prod_{i=1}^{n}\frac{A}{\mathfrak{a}_i}.</script></blockquote><h2 id="Level-2-Noncommutative-Ring-without-Unity"><a href="#Level-2-Noncommutative-Ring-without-Unity" class="headerlink" title="Level 2 - Noncommutative Ring without Unity"></a>Level 2 - Noncommutative Ring without Unity</h2><p>We first need a clarification of what we mean by “noncommutative ring”. 
When we say “let $A$ be a noncommutative ring”, we mean $A$ is not necessarily commutative (it can be but we do not care); when we say “$A$ is noncommutative”, we mean $A$ is not commutative. This is a matter of convenience.</p><p>What hurts most on this level is that we cannot use unity anymore (there can be a unit, but we should not rely on it here). To work around this, we need to figure out what we essentially did when proving the surjectivity of $\phi$ in theorem 1. We found a suitable element in the first ideal and a suitable element in the intersection of all the other ideals, whose sum is $1$.</p><p>For this reason, we replace being pairwise coprime with a different condition. As the proof of theorem 1 shows, in a ring with unity pairwise coprime ideals satisfy the following condition automatically.</p><blockquote><p><strong>Theorem 2 (Chinese Remainder Theorem).</strong> Let $A$ be a noncommutative ring, let $\mathfrak{a}_1,\dots,\mathfrak{a}_n$ be ideals such that </p><script type="math/tex; mode=display">\mathfrak{a}_i+\bigcap_{j \ne i}\mathfrak{a}_j = A,</script><p>for all $i=1,2,\dots,n$, then one has an isomorphism</p><script type="math/tex; mode=display">\frac{A}{\bigcap_{i=1}^{n}\mathfrak{a}_i} \cong \prod_{i=1}^{n}\frac{A}{\mathfrak{a}_i},</script><p>induced by the map</p><script type="math/tex; mode=display">\begin{aligned}\phi:A &\to \prod_{i=1}^{n}\frac{A}{\mathfrak{a}_i} \\x &\mapsto (x+\mathfrak{a}_1,\dots,x+\mathfrak{a}_n).\end{aligned}</script></blockquote><p><em>Proof.</em> We have $\ker\phi=\bigcap_{i=1}^{n}\mathfrak{a}_i$. Therefore it remains to show that our improved coprime condition implies that $\phi$ is surjective. Again, it suffices to show that the preimage of $(a+\mathfrak{a}_1,0+\mathfrak{a}_2,\dots,0+\mathfrak{a}_n)$ exists for all $a \in A$ (note that we cannot simply take $a=1$ here, since $A$ may have no unity). 
Since $\mathfrak{a}_1+\bigcap_{i=2}^{n}\mathfrak{a}_i=A$, for any $a \in A$, there exist $a_1 \in \mathfrak{a}_1$ and $a_2 \in \bigcap_{i=2}^{n}\mathfrak{a}_i$ such that</p><script type="math/tex; mode=display">a=a_1+a_2.</script><p>As a result,</p><script type="math/tex; mode=display">\begin{aligned}\phi(a_2)&=(a_2+\mathfrak{a}_1,a_2+\mathfrak{a}_2,\dots,a_2+\mathfrak{a}_n) \\         &=(a_1+a_2+\mathfrak{a}_1,0+\mathfrak{a}_2,\dots,0+\mathfrak{a}_n) \\         &=(a+\mathfrak{a}_1,0+\mathfrak{a}_2,\dots,0+\mathfrak{a}_n) \\\end{aligned}</script><p>This proves the surjectivity of $\phi$. $\square$</p><h2 id="Bonuses"><a href="#Bonuses" class="headerlink" title="Bonuses"></a>Bonuses</h2><blockquote><p><strong>Theorem 3 (Chinese Remainder Theorem).</strong> Let $A$ be a ring, let $\mathfrak{a}_1,\dots,\mathfrak{a}_n$ be ideals of $A$. If all the $\mathfrak{a}_i$ are pairwise coprime and for all $i$,</p><script type="math/tex; mode=display">A=\mathfrak{a}_i+A^2,</script><p>then the Chinese remainder theorem holds, i.e. the conclusion of theorem 2 is valid.</p></blockquote><p>If $A$ has a unit, then the condition $A=\mathfrak{a}_i+A^2$ is automatically satisfied.</p><p><em>Proof.</em> Assume that the $\mathfrak{a}_i$ are pairwise coprime and satisfy $A=\mathfrak{a}_i+A^2$. By theorem 2, it suffices to show that $\mathfrak{a}_i+\bigcap_{j \ne i}\mathfrak{a}_j=A$ for every $i$, and by symmetry it suffices to prove the case when $i=1$. Notice that</p><script type="math/tex; mode=display">\begin{aligned}A=\mathfrak{a}_1+A^2 &= \mathfrak{a}_1+(\mathfrak{a}_1+\mathfrak{a}_2)(\mathfrak{a}_1+\mathfrak{a}_3) \\                     &= \mathfrak{a}_1 + \mathfrak{a}_1^2 +\mathfrak{a}_2\mathfrak{a}_1+\mathfrak{a}_1\mathfrak{a}_3+\mathfrak{a}_2\mathfrak{a}_3 \\                     &\subset \mathfrak{a}_1 + \mathfrak{a}_2\mathfrak{a}_3 \\                     &\subset \mathfrak{a}_1 + \mathfrak{a}_2 \cap \mathfrak{a}_3 \\                     &\subset A.\end{aligned}</script><p>Therefore $A=\mathfrak{a}_1 + \mathfrak{a}_2 \cap \mathfrak{a}_3$. 
Suppose now $3 \le m &lt; n$ and</p><script type="math/tex; mode=display">A = \mathfrak{a}_1 + \mathfrak{a}_2 \cap \mathfrak{a}_3 \cap \dots \cap \mathfrak{a}_m,</script><p>then</p><script type="math/tex; mode=display">\begin{aligned}A = \mathfrak{a}_1 + A^2 & = \mathfrak{a}_1 + (\mathfrak{a}_1 + \mathfrak{a}_2 \cap \mathfrak{a}_3 \cap \dots \cap \mathfrak{a}_m)(\mathfrak{a}_1+\mathfrak{a}_{m+1}) \\                         & = \mathfrak{a}_1+\mathfrak{a}_1^2 +\mathfrak{a}_1\mathfrak{a}_{m+1}+(\mathfrak{a}_2 \cap \mathfrak{a}_3 \cap \dots \cap \mathfrak{a}_m)\mathfrak{a}_1 + (\mathfrak{a}_2 \cap \mathfrak{a}_3 \cap \dots \cap \mathfrak{a}_m)\mathfrak{a}_{m+1} \\                         & \subset \mathfrak{a}_1 + \mathfrak{a}_2 \cap \mathfrak{a}_3 \cap \dots \cap \mathfrak{a}_m \cap \mathfrak{a}_{m+1} \\                         &\subset A.\end{aligned}</script><p>By induction we have $\mathfrak{a}_1+\bigcap_{j=2}^{n}\mathfrak{a}_j = A$ as desired. The rest follows from theorem 2. $\square$</p><p>Finally we offer an interesting version of the Chinese remainder theorem, involving Dedekind domains.</p><blockquote><p><strong>Theorem 4 (Chinese Remainder Theorem).</strong> Let $\mathfrak{a}_1,\dots,\mathfrak{a}_n$ be ideals and let $x_1,\dots,x_n$ be elements in a Dedekind domain $A$. 
Then the system of congruences $x\equiv x_i \pmod{\mathfrak{a}_i}$ ($1 \le i \le n$) has a solution $x$ in $A$ if and only if $x_i \equiv x_j \pmod{\mathfrak{a}_i+\mathfrak{a}_j}$ whenever $i \ne j$.</p></blockquote><p><em>Proof.</em> Define $\phi:A \to \bigoplus_{i=1}^{n}A/\mathfrak{a}_i$ by $x \mapsto (x+\mathfrak{a}_1,\dots,x+\mathfrak{a}_n)$ and $\psi: \bigoplus_{i=1}^{n}A/\mathfrak{a}_i \to \bigoplus_{i&lt;j}A/(\mathfrak{a}_i+\mathfrak{a}_j)$ such that the $(i,j)$-component of $\psi(x_1+\mathfrak{a}_1,\dots,x_n+\mathfrak{a}_n)$ is $x_i-x_j+\mathfrak{a}_i+\mathfrak{a}_j$, then the statement is equivalent to saying that the sequence of $A$-modules</p><script type="math/tex; mode=display">A \xrightarrow{\phi}\bigoplus_{i=1}^{n}A/\mathfrak{a}_i \xrightarrow{\psi}\bigoplus_{i<j}A/(\mathfrak{a}_i+\mathfrak{a}_j)</script><p>is exact. It is clear that $\operatorname{im}\phi \subset \ker \psi$. We need to show the reverse inclusion. Since exactness is a local property, we can assume that $A$ is a discrete valuation ring, meaning that there is an element $\pi \in A$ such that all nonzero ideals $\mathfrak{a}$ are of the form $(\pi^k)$. Therefore we can rearrange the $\mathfrak{a}_i$ so that $\mathfrak{a}_1 = (\pi^{k_1}) \supset \mathfrak{a}_2=(\pi^{k_2}) \supset \cdots \supset \mathfrak{a}_n=(\pi^{k_n})$, that is, $k_1 \le k_2 \le \dots \le k_n$. In this case, we have $\mathfrak{a}_i+\mathfrak{a}_j=\mathfrak{a}_i$ whenever $i&lt;j$.</p><p>Now pick $(x_1+\mathfrak{a}_1,\dots,x_n+\mathfrak{a}_n)\in \ker\psi$, then $x_i-x_j \in \mathfrak{a}_i+\mathfrak{a}_j=\mathfrak{a}_i$ whenever $i&lt;j$. Therefore $x_i \equiv x_j \pmod{\mathfrak{a}_i}$. In particular, taking $j=n$, we see</p><script type="math/tex; mode=display">\phi(x_n)=(x_n+\mathfrak{a}_1,\dots,x_n+\mathfrak{a}_n)=(x_1+\mathfrak{a}_1,\dots,x_n+\mathfrak{a}_n) \implies (x_1+\mathfrak{a}_1,\dots,x_n+\mathfrak{a}_n)\in \operatorname{im}\phi,</script><p>as desired. 
$\square$</p><h1 id="Examples-and-Remarks"><a href="#Examples-and-Remarks" class="headerlink" title="Examples and Remarks"></a>Examples and Remarks</h1><p>If we replace $A$ with $\mathbb{Z}$ and the $\mathfrak{a}_i$ with ideals generated by pairwise coprime integers, then we reach the classic version of the Chinese remainder theorem.</p><p>For the non-commutative case, the reader can read <a href="/2022/01/18/group-algebra/">this post</a>.</p><p>For another example, we consider the <a href="https://en.wikipedia.org/wiki/Lagrange_polynomial">Lagrange interpolation</a>, which is the special case of the Chinese Remainder theorem on the ring $\mathbb{R}[X]$ with extra consideration on evaluations. As you may guess, $\mathbb{R}$ can be replaced with other fields. </p><blockquote><p> Find a polynomial $f(X) \in \mathbb{R}[X]$ passing through the three points $(1,2)$, $(2,-1)$ and $(3,2)$. </p></blockquote><p>Consider the ideals $\mathfrak{a}_1=(X-1)$, $\mathfrak{a}_2=(X-2)$ and $\mathfrak{a}_3=(X-3)$, which are pairwise coprime. The condition $f(1)=2$ reads $f(X) \equiv 2 \pmod{\mathfrak{a}_1}$, and similarly $f(X) \equiv -1 \pmod{\mathfrak{a}_2}$ and $f(X) \equiv 2 \pmod{\mathfrak{a}_3}$. The Chinese Remainder theorem tells us that such $f$ exists; for instance, $f(X)=3X^2-12X+11$ works. This approach is seemingly an unnecessary overkill but it allows us to view a theorem in numerical analysis in an algebraic way.</p><p>We can also make things that are not related to polynomial rings a matter of polynomial rings. 
For example, via the Chinese Remainder Theorem, we can compute $\mathbb{C} \otimes_\mathbb{R} \mathbb{C}$ in the following way:</p><script type="math/tex; mode=display">\begin{aligned}\mathbb{C} \otimes_\mathbb{R} \mathbb{C} &\cong \mathbb{C} \otimes_\mathbb{R} \mathbb{R}[X]/(X^2+1) \\                                         &\cong (\mathbb{C} \otimes_\mathbb{R} \mathbb{R}[X])/(X^2+1) \\                                         &\cong\mathbb{C}[X]/(X^2+1) \\                                         &\cong\mathbb{C}[X]/((X-i)(X+i)) \\                                         &\cong\mathbb{C}[X]/(X-i) \times \mathbb{C}[X]/(X+i) \\                                         &\cong \mathbb{C} \times \mathbb{C}\end{aligned}</script><p>There is a proof of Hilbert’s theorem 90 (of cyclic extensions) where the isomorphism above is used, with degrees higher than $2$: <a href="https://mathoverflow.net/a/21117/172944">https://mathoverflow.net/a/21117/172944</a> </p><p>Our last remark shows the geometrical interpretation of the Chinese Remainder Theorem. In geometry, we consider the spectrum of a unitary commutative ring, which gives rise to an affine scheme. If $A=\prod_{i=1}^{n}A_i$ is a direct product of such rings $A_i$, then $\operatorname{Spec}A \cong \coprod_{i=1}^{n}\operatorname{Spec}(A_i)$. Conversely, using the Chinese remainder theorem, we can show that if $\operatorname{Spec}A$ is a disjoint union of two spectra, then $A$ is a direct product of two other rings. To be precise:</p><blockquote><p>Let $A$ be a unitary commutative ring, then the following statements are equivalent:</p><ol><li>$X=\operatorname{Spec}(A)$ is disconnected.</li><li>$A=A_1 \times A_2$ where neither of the two rings is a zero ring.</li><li>There is an element $e \ne 0,1$ such that $e^2=e$, i.e. 
an idempotent element.</li></ol></blockquote><p>In particular, since in a local ring any idempotent element is $0$ or $1$, we see that its spectrum has to be connected.</p><h1 id="References"><a href="#References" class="headerlink" title="References"></a>References</h1><ul><li><p>Michael Atiyah, I. G. MacDonald, <em>Introduction to Commutative Algebra</em></p></li><li><p>Ravi Vakil, <em>Foundations of Algebraic Geometry</em></p></li></ul>]]></content>
    
    
    <summary type="html">We study the Chinese remainder theorem in various contexts and abstract levels.</summary>
    
    
    
    <category term="Algebra" scheme="https://desvl.xyz/categories/Algebra/"/>
    
    <category term="Commutative Algebra" scheme="https://desvl.xyz/categories/Algebra/Commutative-Algebra/"/>
    
    <category term="Non-commutative Algebra" scheme="https://desvl.xyz/categories/Algebra/Non-commutative-Algebra/"/>
    
    
    <category term="Chinese Remainder Theorem" scheme="https://desvl.xyz/tags/Chinese-Remainder-Theorem/"/>
    
  </entry>
  
  <entry>
    <title>Projective Representations of SO(3)</title>
    <link href="https://desvl.xyz/2023/04/06/projective-rep-so3/"/>
    <id>https://desvl.xyz/2023/04/06/projective-rep-so3/</id>
    <published>2023-04-06T18:27:23.000Z</published>
    <updated>2025-05-10T22:19:10.342Z</updated>
    
    <content type="html"><![CDATA[<h1 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h1><p>In <a href="/2022/06/16/so3-laplacian/">another post</a> we gave an exposition of irreducible representations of $SO(3)$, where we found ourselves studying harmonic polynomials on a sphere. In this post, we study another class of representations of $SO(3)$ that has its own significance in physics: projective representations. The result will be written as direct sums of irreducible representations of $SU(2)$ so the reader is advised to review the <a href="2022/05/08/rep-SU2/">corresponding post</a>. We recall that</p><blockquote><p>Every irreducible unitary representation of $SU(2)$ is of the form $V_n$, where</p><script type="math/tex; mode=display">V_n=\operatorname{Sym}^n \mathbb{C}^2=\{\text{The space of homogeneous polynomials of degree $n$ in two variables}\}</script></blockquote><p>Representation theory has a billion applications in physics. The group $SO(3)$ acts as the group of orientation-preserving orthogonal symmetries in $\mathbb{R}^3$ in an obvious way. The invariance under this action justifies the principle that physical reactions such as those between elementary particles should not depend on the observer’s vantage point. </p><p>Nevertheless, applications of representation theory in physics do not end at finite dimensional vector spaces. Putting infinite dimensional vector spaces aside, we sometimes also need a class of vectors, in lieu of a single vector. For example, given a wavefunction $\psi$, we know $|\psi|^2$ is interpreted as a probability density. But then for any $\lambda \in S^1$, we see $|\lambda\psi|^2=|\psi|^2$, therefore $\lambda\psi$ and $\psi$ should be equivalent in a sense. By considering these equivalence classes, we find ourselves considering a projective space. 
Hence it makes sense to consider projective representations</p><script type="math/tex; mode=display">G \to PGL(n,\mathbb{C})=GL(n,\mathbb{C})/\mathbb{C}^\ast</script><p>where $G$ is compact. In this post we will assume $G=SO(3)$ and see how far we can go.</p><h1 id="Simplification-of-Arguments"><a href="#Simplification-of-Arguments" class="headerlink" title="Simplification of Arguments"></a>Simplification of Arguments</h1><p>We begin with a simple group-theoretic lemma:</p><blockquote><p><strong>Lemma 1.</strong> One has</p><script type="math/tex; mode=display">PGL(n,\mathbb{C})=GL(n,\mathbb{C})/\mathbb{C}^\ast \cong SL(n,\mathbb{C})/C_n</script><p>where $C_n$ is the group of $n$th roots of unity, embedded into $SL(n,\mathbb{C})$ via the map $\xi \mapsto \xi I$. </p></blockquote><p><em>Proof.</em> Consider the canonical map</p><script type="math/tex; mode=display">\begin{aligned}p:SL(n,\mathbb{C}) &\to GL(n,\mathbb{C})/\mathbb{C}^\ast, \\A &\mapsto A\mathbb{C}^\ast.\end{aligned}</script><p>This map is surjective. For any $B\mathbb{C}^\ast \in GL(n,\mathbb{C})/\mathbb{C}^\ast$, pick $\mu \in \mathbb{C}^\ast$ with $\mu^n=|B|$, where $|B|$ denotes the determinant. Then $B\mathbb{C}^\ast=\mu^{-1}B\mathbb{C}^\ast$, and since $|\mu^{-1}B|=\mu^{-n}|B|=1$, the matrix $\mu^{-1}B \in SL(n,\mathbb{C})$ is a preimage of $B\mathbb{C}^\ast$. </p><p>On the other hand, we see $\ker p$ consists of scalar matrices in $SL(n,\mathbb{C})$. If $\lambda I \in SL(n,\mathbb{C})$, then $|\lambda I|=\lambda^n=1$, thereby $\ker p$ can be identified with $C_n$, proving the isomorphism. $\square$</p><p>Therefore, when studying a projective representation $G \to PGL(n,\mathbb{C})$, we are quickly reduced to the special linear group, which is much simpler. Besides, the group of $n$th roots of unity is much simpler than the group of nonzero complex numbers. </p><p>However, our simplification has not reached the end. We will see next that the special linear group can then be reduced to the special unitary group. Recall that a linear matrix representation of a compact Lie group is similar to a unitary one. 
The following lemma is a projective analogue.</p><blockquote><p><strong>Lemma 2.</strong> Let $G$ be a compact Lie group. Every homomorphism $\varphi:G \to PGL(n,\mathbb{C})=SL(n,\mathbb{C})/C_n$ is conjugate to a homomorphism whose image lies in $SU(n)/C_n$.</p></blockquote><p><em>Proof.</em> Consider the <a href="https://encyclopediaofmath.org/wiki/Fibre_product_of_objects_in_a_category">fibre product</a> $H$ of $G$ and $SL(n,\mathbb{C})$ over $PGL(n,\mathbb{C})$:</p><p align="center"><img src="/img/so3-fibre-product.png"></p><p>Here, $p$ is the canonical projection $SL(n,\mathbb{C}) \to SL(n,\mathbb{C})/C_n$. It suffices to show that $\tilde\varphi$ is similar to a unitary representation. Explicitly, one has</p><script type="math/tex; mode=display">H=\{(g,A) \in G \times SL(n,\mathbb{C}):p(A)=\varphi(g)\},</script><p>with $\tilde\varphi:(g,A) \mapsto A$ and $\tilde{p}:(g,A) \mapsto g$.  Since $G$ is compact and $\tilde{p}$ has finite kernel $C_n$, one sees that $H$ is a compact Lie group. Therefore the matrix representation $\tilde\varphi:H \to SL(n,\mathbb{C})$ is similar to a homomorphism $H \to SU(n)$, from which the lemma follows. $\square$</p><p>Therefore we are reduced to considering homomorphisms</p><script type="math/tex; mode=display">\varphi:SO(3) \to SU(n)/C_n</script><p>for the sake of this post. But we are not done yet. Having to deal with a quotient group is not satisfactory anyway. </p><p>Since $SU(n)$ is simply connected (see <a href="http://jde27.uk/lgla/16_topology_accessible.html">this video</a>), the projections $SU(n) \to SU(n)/C_n$ are universal coverings. In particular, when $n=2$, we see $SU(2) \to SU(2)/C_2 \cong SO(3)$ is our well-known universal covering. If we lift $\varphi$ to universal coverings, we find ourselves dealing with $SU(2) \to SU(n)$. To be precise, we have the following commutative diagram (universal cover is a functor):</p><p align="center"><img src="/img/so3-covering.png"></p><p>Dealing with $\tilde\varphi$ is much simpler. 
Physicists are more interested in unitary representations of the group of unit quaternions $SU(2) = \operatorname{Spin}(3)$ rather than $SO(3)$, even though the latter looks more natural. </p><h1 id="Discovering-Projective-Representations"><a href="#Discovering-Projective-Representations" class="headerlink" title="Discovering Projective Representations"></a>Discovering Projective Representations</h1><p>Now we are interested in finding all unitary representations that can be pushed down to a projective representation of $SO(3)$. We have three questions:</p><blockquote><p><strong>Question 1.</strong> Does it suffice to consider maps of the form $\tilde\varphi:SU(2)\to SU(n)$?</p></blockquote><p>The answer is yes. Notice that every homomorphism $f:SU(2) \to U(1)$ has to be trivial. If not, then $\ker f$ is a proper normal subgroup of $SU(2)$, i.e. it is either trivial or $C_2$. In both cases the quotient $SU(2)/\ker f$, which is $SU(2)$ or $SU(2)/C_2 \cong SO(3)$, is non-abelian, so it cannot embed into the abelian group $U(1)$. A contradiction.</p><p>Also recall the exact sequence</p><script type="math/tex; mode=display">1 \to SU(n) \xrightarrow{i} U(n) \xrightarrow{\det}U(1) \to 1.</script><p>Let $g:SU(2) \to U(n)$ be any homomorphism, and consider the canonical projection $\pi:U(n) \to \frac{U(n)}{SU(n)}=U(1)$. We see $\pi \circ g$ sends every element of $SU(2)$ to $1$, meaning the image of $SU(2)$ in $U(n)$ must lie in $SU(n)$. Therefore, by considering maps of the form $SU(2) \to SU(n)$, we are not missing anything. $\square$</p><blockquote><p><strong>Question 2.</strong> What should be considered in order to determine whether $\tilde\varphi:SU(2) \to SU(n)$ can be pushed down into a morphism $\varphi:SO(3) \to SU(n)/C_n$? </p></blockquote><p>The answer is, one should consider the element $-I$. Let $p:SU(2) \to SO(3)$ be the universal covering, and let $p_n:SU(n) \to SU(n)/C_n$ be the corresponding universal covering. 
For $\tilde\varphi:SU(2) \to SU(n)$, we want to know when there will be a homomorphism $\varphi:SO(3) \to SU(n)/C_n$ such that $p_n \circ \tilde\varphi = \varphi \circ p$.</p><p>Notice that $p(-I)=I$. Therefore, should $\varphi$ exist, one has $p_n \circ \tilde\varphi(-I)=\varphi(p(-I))=\varphi(I)=e$, the identity in the group $SU(n)/C_n$. Hence $\tilde\varphi(-I) \in \ker p_n$. Therefore $\tilde\varphi(-I)$ can be identified with an $n$th root of unity. Since $\tilde\varphi(-I)\tilde\varphi(-I)=\tilde\varphi(I)$, we see $\tilde\varphi(-I)$ should also be identified with a square root of $1$. That is, $\tilde\varphi(-I)$ is either $\operatorname{id}$ or $-\operatorname{id}$. We discuss these two cases in the following question.</p><p>On the other hand, if $\tilde\varphi(-I)=\pm\operatorname{id}$, then one can verify that $p_n \circ \tilde\varphi \circ p^{-1}$ is well-defined. Therefore $\tilde\varphi$ can be pushed down into a morphism of $SO(3)$ if and only if $\tilde\varphi(-I)=\pm\operatorname{id}$. $\square$</p><blockquote><p><strong>Question 3.</strong> Let $W=\bigoplus_n k_n V_n$ be a representation of $SU(2)$. What will happen if it can be pushed down to a projective representation of $SO(3)$?</p></blockquote><p>Let $\tilde\varphi:SU(2) \to SU(N)$, with $N=\dim W$, be the homomorphism corresponding to $W$. We know that $\tilde\varphi$ can be pushed down to $SO(3)$ if and only if $\tilde\varphi(-I)=\pm\operatorname{id}$. Recall that $-I$ acts on a homogeneous polynomial of degree $n$ by the scalar $(-1)^n$.</p><p> If $\tilde\varphi(-I)=\operatorname{id}$, then all the $n$ with $k_n \ne 0$ have to be even because the action on the polynomials cannot be the identity when $n$ is odd. 
If $\tilde\varphi(-I)=-\operatorname{id}$, then all the $n$ have to be odd because when $n$ is even the action of $-I$ on the polynomials must be the identity.</p><p>To be more explicit, $W=\bigoplus_n k_{2n}V_{2n}$ or $W=\bigoplus_{n}k_{2n+1}V_{2n+1}$.</p><blockquote><p><strong>Theorem 1.</strong> The projective representations of $SO(3)$ are given, up to conjugation, by the representations of $SU(2)$ of the form</p><script type="math/tex; mode=display">\bigoplus_n k_{2n} V_{2n}\;\text{(even) or }\; \bigoplus_n k_{2n+1}V_{2n+1}\;\text{(odd)},</script><p>depending on whether $(-I)$ acts by $\operatorname{id}$ or $-\operatorname{id}$.</p></blockquote><p>In brief, when thinking about projective representations of $SO(3)$, one thinks about polynomials in two variables whose terms are either all of even degree or all of odd degree.</p><p>When studying $\tilde\varphi:SU(2) \to SU(n)$, we see $\tilde\varphi(-I)$ can be identified both with an $n$th root of unity and with a square root of unity. When $n$ is odd, however, we see $\tilde\varphi(-I)$ cannot be identified with $-1$, i.e. $-I$ cannot act as $-\operatorname{id}$. Unexpectedly, number theory plays a small role here.</p>]]></content>
    
    
    <summary type="html">In this post we study projective representations of $SO(3)$, although we will make more use of $SU(2)$. At the end of this post we reach the conclusion that one will think about polynomials with odd or even terms. Projective representations have their own significance in physics although the room of this post is too small to contain it. Nevertheless, the reader is invited to use linear algebra much more extensively with a taste of modern physics in this post.</summary>
    
    
    
    <category term="Algebra" scheme="https://desvl.xyz/categories/Algebra/"/>
    
    <category term="Representation Theory" scheme="https://desvl.xyz/categories/Algebra/Representation-Theory/"/>
    
    <category term="Linear Algebra" scheme="https://desvl.xyz/categories/Linear-Algebra/"/>
    
    <category term="Lie Groups" scheme="https://desvl.xyz/categories/Algebra/Lie-Groups/"/>
    
    <category term="Topology" scheme="https://desvl.xyz/categories/Topology/"/>
    
    <category term="Algebraic Topology" scheme="https://desvl.xyz/categories/Topology/Algebraic-Topology/"/>
    
    
    <category term="representation" scheme="https://desvl.xyz/tags/representation/"/>
    
  </entry>
  
  <entry>
    <title>The Quadratic Reciprocity Law</title>
    <link href="https://desvl.xyz/2023/03/20/quadratic-reciprocity-law-1/"/>
    <id>https://desvl.xyz/2023/03/20/quadratic-reciprocity-law-1/</id>
    <published>2023-03-20T11:12:03.000Z</published>
    <updated>2025-05-23T21:05:00.587Z</updated>
    
    <content type="html"><![CDATA[<h1 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h1><p>Historically, thanks to Gauss, the quadratic reciprocity law marked the beginning of algebraic number theory. Therefore it deserves a good dose of attention. However, throwing the definition at a beginner would not work very well.</p><p>We consider the equation</p><script type="math/tex; mode=display">x^2+by = a,\; (a,b\in \mathbb{Z}),</script><p>one of the simplest non-trivial multi-variable Diophantine equations that can be imagined. Trying to search violently for all solutions without any precaution is not wise. Therefore we consider reductions first. In order that $x^2+by=a$ has a solution, it is necessary that</p><script type="math/tex; mode=display">x^2 \equiv a \pmod{b}.</script><p>Then the Chinese remainder theorem inspires us to first look into the case when $b$ is a prime. The case when $b=2$ is excluded because there the question only concerns whether $x$ is odd or even. </p><p>Therefore we study the equation $x^2=a$ in the finite field of order $p$ where $p \ne 2$. We give a very straightforward characterisation, which is seemingly stupid. For $a \in \mathbf{F}_p^\ast$, define</p><script type="math/tex; mode=display">\left(\frac{a}{p}\right)=\begin{cases}\;\;\;1 & \text{$x^2=a$ has at least one solution in $\mathbf{F}_p$,} \\-1 & \text{$x^2=a$ has no solution in $\mathbf{F}_p$.}\end{cases}</script><p>It is also convenient to define $\left(\frac{0}{p}\right)=0$. The symbol $\left(\frac{a}{p}\right)$ is called the Legendre symbol. </p><p>This post will start with an equivalent form that is easier to compute (although less intuitive).  
Then we will demonstrate how to do basic computations with it, and finally we try to view it from the viewpoint of algebraic number theory.</p><h1 id="Elementary-Observations"><a href="#Elementary-Observations" class="headerlink" title="Elementary Observations"></a>Elementary Observations</h1><h2 id="Basic-Computation"><a href="#Basic-Computation" class="headerlink" title="Basic Computation"></a>Basic Computation</h2><p>We begin with a simplified formula for the Legendre symbol. </p><blockquote><p><strong>Proposition 1.</strong> $\left(\frac{a}{p}\right) = a^\frac{p-1}{2}$ for $a \in \mathbf{F}_p^\ast$.</p></blockquote><p>N.B. The power on the right hand side is taken in the corresponding finite field. For example, $\left(\frac{2}{3}\right)=2=-1$ in $\mathbf{F}_3$. By abuse of language, we identify the integers $1$ and $-1$ with their canonical images in the finite field.</p><p><em>Proof.</em> Notice that $\left(\frac{a}{p}\right)=1$ if and only if $a \in \mathbf{F}_p^{\ast 2}$. The rest comes from the following lemma, which deserves to be stated separately in a more general setting. $\square$</p><blockquote><p><strong>Lemma 1.</strong> Let $p$ be a prime (it can be $2$ this time) and $K$ a finite field of order $q=p^n$ for some $n&gt;0$. Then</p><ol><li><p>If $p=2$, then all elements of $K$ are squares.</p></li><li><p>If $p \ne 2$, then the squares $K^{\ast 2}$ of $K^\ast$ form a subgroup of index $2$ in $K^\ast$; it is the kernel of the map $p:x \mapsto x^{(q-1)/2}$ from $K^\ast $ to $\{-1,1\}$.</p><p>To be precise, one has an exact sequence of cyclic groups:</p><script type="math/tex; mode=display">1 \to K^{\ast 2} \to K^\ast \xrightarrow{p} \{1,-1\} \to 1.</script></li></ol></blockquote><p><em>Proof.</em> The first case is a restatement of the fact that the Frobenius endomorphism $x \mapsto x^2$ of a finite field of characteristic $2$ is an automorphism (see <a href="https://ncatlab.org/nlab/show/Frobenius+morphism#for_fields_2">nlab</a>). For the second case, let $\overline{K}$ be an algebraic closure of $K$. 
If $x \in K^\ast$, let $y \in \overline{K}$ be a square root of $x$, i.e. such that $y^2=x$. We have</p><script type="math/tex; mode=display">y^{q-1}=x^{(q-1)/2} = \pm 1 \;\text{ since }x^{q-1}=1.</script><p>Since $x \in K^{\ast 2}$ if and only if $y \in K^\ast$, which is equivalent to $y^{q-1}=p(x)=1$, one has $\ker p = K^{\ast 2}$. Since $K^\ast$ is cyclic of even order $q-1$, its squares form a subgroup of index $2$, and the rest follows from elementary calculation. $\square$ </p><p>You need to recall or study basic structures of finite fields. For example, a finite field is always of prime power order. All finite fields of order $p^n$ are isomorphic, uniquely determined as a subfield of an algebraic closure of $\mathbf{F}_p$, being the splitting field of the polynomial $X^{p^n}-X$. Besides, the multiplicative group of a finite field is cyclic. </p><p>From proposition 1 it follows that</p><blockquote><p><strong>Corollary 1.</strong> For any prime number $p \ne 2$,</p><ol><li>The Legendre symbol is multiplicative, i.e. $\left(\frac{ab}{p}\right)=\left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$.</li><li>$\left(\frac{1}{p}\right)=1$</li><li>$\left(\frac{-1}{p}\right)=(-1)^{\varepsilon(p)}$ where $\varepsilon(p)=\frac{p-1}{2} \pmod{2}$.</li></ol></blockquote><p>The harder thing to compute is the Legendre symbol when $a=2$.</p><blockquote><p><strong>Proposition 2.</strong> One has $\left(\frac{2}{p}\right)=(-1)^{\omega(p)}$ where $\omega(p)=\frac{p^2-1}{8}\pmod{2}$.</p></blockquote><p>We want to find a square root of $2$, i.e. an element $y$ satisfying $y^2=2$ so that computing $2^{(p-1)/2}$ becomes computing $y^{p-1}$. This is not an easy job, and we do not expect to find it inside the field. For example, $\left(\frac{2}{3}\right)=2=-1$ and $\left(\frac{2}{5}\right)=4=-1$, meaning there is no such $y$ in $\mathbf{F}_3$ or $\mathbf{F}_5$. However, there is an easy way to generate a $2$. Consider $y=\alpha+\alpha^{-1}$, then $y^2=2+\alpha^2+\alpha^{-2}$. If we have $\alpha^2+\alpha^{-2}=0$ then we are done. 
To find such an $\alpha$, notice that $\alpha^2+\alpha^{-2}=0$ implies that $\alpha^4+1=0$. Therefore $\alpha^8=1$. It suffices to use a primitive $8$th root of unity.</p><p><em>Proof.</em> Let $\alpha$ be a primitive $8$th root of unity in an algebraic closure $\Omega$ of $\mathbf{F}_p$. Then $y=\alpha+\alpha^{-1}$ satisfies $y^2=2$. Since $\Omega$ has characteristic $p$, we have</p><script type="math/tex; mode=display">y^p=\alpha^p+\alpha^{-p}.</script><p>Observe that if $p \equiv 1 \pmod{8}$, then $y^p=\alpha+\alpha^{-1}=y$ (we used the fact that $\alpha$ is an $8$th root of unity). Therefore $y^{p-1}=\left(\frac{2}{p}\right)=1$. This inspires us to determine $y^{p-1}$ through the relation between $p$ and $8$. As $p$ is odd, there are four possibilities: $p\equiv 1,3,5,7 \pmod{8}$.</p><p>If $p \equiv 7 \pmod{8}$, i.e. $p \equiv -1 \pmod{8}$, we still have $y^p=\alpha^{-1}+\alpha=y$. Therefore $\left(\frac{2}{p}\right)=1$ whenever $p \equiv \pm 1 \pmod{8}$. This discovery inspires us to study $p \equiv \pm 5 \pmod{8}$ together. When this is the case, one finds $y^p=\alpha^5+\alpha^{-5}$. Since $\alpha^4=\alpha^{-4}=-1$ (the primitivity of $\alpha$ matters here), $y^p$ becomes $-(\alpha+\alpha^{-1})=-y$. Cancelling $y$ on both sides, we obtain $y^{p-1}=\left(\frac{2}{p}\right)=-1$. To conclude,</p><script type="math/tex; mode=display">\left(\frac{2}{p}\right)=\begin{cases}1 &p\equiv \pm 1\pmod{8} \\-1 & p\equiv \pm 5 \pmod{8}\end{cases}</script><p>It remains to justify the $\omega$ function as above. We need to find a function $\omega(p)$ such that $\omega(p) \equiv 0 \pmod 2$ when $p \equiv \pm 1 \pmod 8$ and $\omega(p) \equiv 1 \pmod 2$ when $p \equiv \pm 5 \pmod 8$. 
If we square $p$, we can ignore the difference of the signs:</p><script type="math/tex; mode=display">\begin{aligned}p&=8k \pm 1 \implies p^2=64k^2\pm 16k+1 \implies p^2-1=8 \cdot 2 \cdot (4k^2\pm k), \\p&=8k \pm 5 \implies p^2=64k^2\pm 80k +25 \implies p^2-1 = 8 \cdot (8k^2\pm10k+3).\end{aligned}</script><p>Therefore, whether $(p^2-1)/8$ is odd or even is completely determined by the remainder of $p$ modulo $8$: it is even in the first case and odd in the second, because $8k^2\pm 10k+3$ is odd. We therefore put $\omega(p)=(p^2-1)/8$ and this concludes our proof. $\square$</p><p>To conclude in simpler form, we have</p><ul><li>$1$ is always a square in a finite field.</li><li>$-1$ is a square in $\mathbf{F}_p$ if and only if $\frac{p-1}{2}$ is even, i.e., $p \equiv 1 \pmod{4}$.</li><li>$2$ is a square in $\mathbf{F}_p$ if and only if $\frac{p^2-1}{8}$ is even, i.e. $p \equiv \pm 1 \pmod{8}$.</li></ul><h2 id="Gauss’s-Quadratic-Reciprocity-Law"><a href="#Gauss’s-Quadratic-Reciprocity-Law" class="headerlink" title="Gauss’s Quadratic Reciprocity Law"></a>Gauss’s Quadratic Reciprocity Law</h2><p>The Legendre symbol says a lot of things, but you do not want to compute, for example, $\left(\frac{37}{53}\right)$ by hand in the basic way as above. 
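</p><p>To see what the “basic way” would cost, here is a worked sketch of Euler’s criterion (Proposition 1) for this example, using repeated squaring modulo $53$ and the decomposition $26=16+8+2$. The intermediate residues are our own arithmetic, added for illustration only:</p><script type="math/tex; mode=display">\begin{aligned}37^2 &= 1369 \equiv 44 \equiv -9 \pmod{53}, \\37^4 &\equiv (-9)^2 = 81 \equiv 28 \pmod{53}, \\37^8 &\equiv 28^2 = 784 \equiv 42 \equiv -11 \pmod{53}, \\37^{16} &\equiv (-11)^2 = 121 \equiv 15 \pmod{53}, \\37^{26} &= 37^{16}\cdot 37^{8}\cdot 37^{2} \equiv 15\cdot(-11)\cdot(-9) = 1485 \equiv 1 \pmod{53}.\end{aligned}</script><p>So $\left(\frac{37}{53}\right)=1$, but only after several modular multiplications with three-digit intermediate numbers. 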
However, granted the following law, things are much easier.</p><blockquote><p><strong>Proposition 3 (Gauss’s Quadratic Reciprocity Law).</strong> For two distinct odd prime numbers $p$ and $\ell$, the following identity holds:</p><script type="math/tex; mode=display">\left(\frac{\ell}{p}\right)\left(\frac{p}{\ell}\right)=(-1)^{\varepsilon(p)\varepsilon(\ell)}.</script><p>Alternatively,</p><script type="math/tex; mode=display">\left(\frac{\ell}{p}\right)=\left(\frac{p}{\ell}\right)(-1)^{\varepsilon(\ell)\varepsilon(p)}.</script></blockquote><p>Instead of computing $37^{(53-1)/2}$ modulo $53$, we obtain the value of $\left(\frac{37}{53}\right)$ in a much easier way (the sign in the first equality is $+1$ because $\frac{53-1}{2}=26$ is even):</p><script type="math/tex; mode=display">\left(\frac{37}{53}\right)=\left(\frac{53}{37}\right)=\left(\frac{16}{37}\right)=\left(\frac{2}{37}\right)^4=1.</script><p>In other words, there exists an integer solution of the equation $x^2+53y=37$. </p><p>The proof is carried out by means of a Gauss sum. The proof looks contrived, but one can see a lot of important tricks. We will use corollary 1.1 frequently.</p><p><em>Proof.</em> Again, let $\Omega$ be an algebraic closure of $\mathbf{F}_p$, and let $\omega \in \Omega$ be a primitive $\ell$-th root of unity. If $x \in \mathbf{F}_\ell$, then $\omega^x$ is well-defined because $\omega^\ell=1$. Thus it is legitimate to write the “Gauss sum”:</p><script type="math/tex; mode=display">y = \sum_{x \in \mathbf{F}_\ell}\left(\frac{x}{\ell}\right)\omega^x.</script><p>Following the inspiration of what we have done in proposition 2, we study $y^2$ and $y^{p-1}$ again. 
The second one is quick.</p><blockquote><p> <strong>Claim 1.</strong> $y^{p-1}=\left(\frac{p}{\ell}\right)$.</p></blockquote><p>To show claim $1$, we notice that, as $\Omega$ is of characteristic $p$, we have</p><script type="math/tex; mode=display">y^p = \sum_{x \in \mathbf{F}_\ell}\left(\frac{x}{\ell}\right)\omega^{xp}=\sum_{z \in \mathbf{F}_\ell}\left(\frac{zp^{-1}}{\ell}\right)\omega^z=\left(\frac{p^{-1}}{\ell}\right)\sum_{z \in \mathbf{F}_\ell}\left(\frac{z}{\ell}\right)\omega^z = \left(\frac{p^{-1}}{\ell}\right)y</script><p>and therefore (dividing by $y$ is legitimate because $y \ne 0$, as claim $2$ will show)</p><script type="math/tex; mode=display">y^{p-1}=\left(\frac{p^{-1}}{\ell}\right)=\left(\frac{p}{\ell}\right).</script><blockquote><p><strong>Claim 2.</strong> $y^2 = \left(\frac{-1}{\ell}\right)\ell$ (by abuse of language, $\ell$ (the one outside the Legendre symbol) is used to denote the image of $\ell$ in the field $\mathbf{F}_p$.)</p></blockquote><p>Notice that</p><script type="math/tex; mode=display">y^2= \sum_{x,z \in \mathbf{F}_\ell}\left(\frac{xz}{\ell}\right)\omega^{x+z}=\sum_{u \in \mathbf{F}_\ell}\omega^u\left(\sum_{t \in \mathbf{F}_\ell}\left(\frac{t(u-t)}{\ell}\right)\right)=\sum_{u \in \mathbf{F}_\ell}\omega^u\left(\sum_{t \in \mathbf{F}_\ell^\ast}\left(\frac{t(u-t)}{\ell}\right)\right).</script><p>Terms with $t=0$ vanish and can be safely ignored. Then we notice that</p><script type="math/tex; mode=display">\left(\frac{t(u-t)}{\ell}\right)=\left(\frac{-t^2}{\ell}\right)\left(\frac{1-ut^{-1}}{\ell}\right)=\left(\frac{-1}{\ell}\right)\left(\frac{t^2}{\ell}\right)\left(\frac{1-ut^{-1}}{\ell}\right)=\left(\frac{-1}{\ell}\right)\left(\frac{1-ut^{-1}}{\ell}\right).</script><p>For this reason we put</p><script type="math/tex; mode=display">C_u=\sum_{t \in \mathbf{F}_\ell^\ast}\left(\frac{1-ut^{-1}}{\ell}\right).</script><p>It follows that</p><script type="math/tex; mode=display">y^2 = \left(\frac{-1}{\ell}\right)\sum_{u \in \mathbf{F}_\ell}C_u\omega^u.</script><p>It remains to compute the coefficients $C_u$. 
We see</p><script type="math/tex; mode=display">C_0 = \sum_{t \in \mathbf{F}_\ell^{\ast}}\left(\frac{1}{\ell}\right)=\sum_{t \in \mathbf{F}_\ell^\ast}1=\ell-1.</script><p>When $u \ne 0$, the term $s=1-ut^{-1}$ runs over all of $\mathbf{F}_\ell$ except $1$. Therefore</p><script type="math/tex; mode=display">C_u=\sum_{s \in \mathbf{F}_\ell}\left(\frac{s}{\ell}\right)-\left(\frac{1}{\ell}\right)=-\left(\frac{1}{\ell}\right)=-1,</script><p>since $[\mathbf{F}_\ell^\ast:\mathbf{F}_\ell^{\ast 2}]=2$ (read: exactly half of the nonzero elements of $\mathbf{F}_\ell$ are squares, the rest are not). Therefore</p><script type="math/tex; mode=display">y^2=\left(\frac{-1}{\ell}\right)(\ell-1-\omega-\omega^2-\dots-\omega^{\ell-1})</script><p>Recall that $1-\omega^\ell=(1-\omega)(1+\omega+\dots+\omega^{\ell-1})=0$. As $\omega$ is a primitive root, we see $\omega \ne 1$ and therefore $1+\omega+\dots+\omega^{\ell-1}=0$. Hence $-1-\omega-\dots-\omega^{\ell-1}=0$ and $y^2=\left(\frac{-1}{\ell}\right)\ell$, as claimed.</p><p>Finally, the reciprocity follows because</p><script type="math/tex; mode=display">y^{p-1}=(y^2)^{\frac{p-1}{2}}=\left(\frac{\left(\frac{-1}{\ell}\right)\ell}{p}\right)=\left(\frac{p}{\ell}\right).</script><p>We invite the reader to expand the identity above using corollary 1 and see the result. $\square$</p><h1 id="Observation-from-Algebraic-Number-Theory"><a href="#Observation-from-Algebraic-Number-Theory" class="headerlink" title="Observation from Algebraic Number Theory"></a>Observation from Algebraic Number Theory</h1><p>In this section we introduce some observations from the point of view of algebraic number theory without complete proofs.</p><p>Let $p$ be an odd prime, and $\zeta_p$ a primitive $p$-th root of unity. 
We have seen that the Gauss sum</p><script type="math/tex; mode=display">S_p = \sum_{x \in \mathbf{F}_p}\left(\frac{x}{p}\right)\zeta_p^x</script><p> satisfies the relation</p><script type="math/tex; mode=display">S^2_p=\left(\frac{-1}{p}\right)p \implies \sqrt{p}=\sqrt{\left(\frac{-1}{p}\right)}S_p.</script><p>Therefore the field $\mathbb{Q}(\sqrt{p})$ is contained in $\mathbb{Q}(\zeta_p)$ or $\mathbb{Q}(\zeta_p,i)$, depending on the sign of $\left(\frac{-1}{p}\right)$. The first one is a cyclotomic extension of $\mathbb{Q}$ by definition. The second one equals $\mathbb{Q}(\zeta_{4p})$, so it is cyclotomic as well. More generally, every finite abelian extension of $\mathbb{Q}$ is a subfield of a cyclotomic field. See <a href="https://math.mit.edu/classes/18.785/2015fa/LectureNotes19.pdf">this note</a>. To conclude,</p><blockquote><p>Every field of the form $\mathbb{Q}(\sqrt{p})$ is a subfield of $\mathbb{Q}(\zeta_m)$ for some $m&gt;1$.</p></blockquote><p>Solving the equation $x^2 \equiv a \pmod p$ also inspires us to look at the quadratic field $K=\mathbb{Q}(\sqrt{a})$. For simplicity we assume that $a$ is square free. If $\left(\frac{a}{p}\right)=1$, then there exists $\alpha \in \mathbb{Z}$ such that</p><script type="math/tex; mode=display">x^2-a \equiv (x-\alpha)(x+\alpha)\pmod p.</script><p>This relation is interesting because on the left hand side we actually have the minimal polynomial of $\sqrt{a}$ over $\mathbb{Q}$, namely $f(x)=x^2-a$. This polynomial splits completely modulo $p$. The relation above actually signifies that there exist distinct prime ideals $\mathfrak{P}_1,\mathfrak{P}_2\subset \mathfrak{o}_K$ such that</p><script type="math/tex; mode=display">(p)=\mathfrak{P}_1^{e_1}\mathfrak{P}_2^{e_2}</script><p>where the <strong>ramification indices</strong> $e_1=e_2=1$. This says the prime ideal $(p)$ is <strong>totally split</strong> in $\mathbb{Q}(\sqrt{a})$. Conversely, if $(p)$ is totally split in $\mathbb{Q}(\sqrt{a})$ (where $(a,p)=1$ for sure), then $\left(\frac{a}{p}\right)=1$. 
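</p><p>As a small illustration (our own example, not taken from the references), take $a=2$ and $p=7$, working in $\mathbb{Z}[\sqrt{2}]$, the ring of integers of $\mathbb{Q}(\sqrt{2})$. Since $3^2=9\equiv 2 \pmod 7$, we have $\left(\frac{2}{7}\right)=1$, and the factorisation modulo $7$ lifts to a factorisation of the ideal $(7)$:</p><script type="math/tex; mode=display">\begin{aligned}x^2-2 &\equiv (x-3)(x+3) \pmod 7, \\(7) &= (3+\sqrt{2})(3-\sqrt{2}), \qquad (3+\sqrt{2})(3-\sqrt{2})=9-2=7.\end{aligned}</script><p>Both factors are prime ideals of norm $7$, and they are distinct because $\frac{3+\sqrt{2}}{3-\sqrt{2}}=\frac{11+6\sqrt{2}}{7}$ does not lie in $\mathbb{Z}[\sqrt{2}]$; so $(7)$ is totally split with both ramification indices equal to $1$. 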
To conclude,</p><blockquote><p>The Legendre symbol $\left(\frac{a}{p}\right)=1$ if and only if $(p)$ totally splits in $\mathbb{Q}(\sqrt{a})$.</p></blockquote><p>In fact, one can make a more profound observation about number fields which implies the quadratic reciprocity law:</p><blockquote><p>Let $\ell$ and $p$ be two distinct odd primes and put $\ell^\ast=\left(\frac{-1}{\ell}\right)\ell$. Then $(p)$ is totally split in $\mathbb{Q}(\sqrt{\ell^\ast})$ if and only if $(p)$ splits into an even number of prime ideals in $\mathbb{Q}(\zeta_\ell)$.</p></blockquote><p>Besides, you may want to know about <a href="https://websites.math.leidenuniv.nl/algebra/artin.pdf">Artin’s reciprocity</a>, which generalises Gauss’s reciprocity, but that is quite an advanced topic (class field theory). This also shows the significance of the quadratic reciprocity law. </p><h1 id="References"><a href="#References" class="headerlink" title="References"></a>References</h1><ul><li>Jean-Pierre Serre, <em>A Course in Arithmetic</em></li><li>Jürgen Neukirch, <em>Algebraic Number Theory</em></li><li>Serge Lang, <em>Algebraic Number Theory</em></li></ul>]]></content>
    
    
    <summary type="html">In this post we deliver the basic computation of the quadratic reciprocity law and see its importance in algebraic number theory.</summary>
    
    
    
    <category term="Number Theory" scheme="https://desvl.xyz/categories/Number-Theory/"/>
    
    <category term="Algebraic Number Theory" scheme="https://desvl.xyz/categories/Number-Theory/Algebraic-Number-Theory/"/>
    
    <category term="Elementary Number Theory" scheme="https://desvl.xyz/categories/Number-Theory/Elementary-Number-Theory/"/>
    
    
    <category term="Serre" scheme="https://desvl.xyz/tags/Serre/"/>
    
  </entry>
  
  <entry>
    <title>Vague Convergence in Measure-theoretic Probability Theory - Equivalent Conditions</title>
    <link href="https://desvl.xyz/2023/02/13/vague-convergence/"/>
    <id>https://desvl.xyz/2023/02/13/vague-convergence/</id>
    <published>2023-02-13T15:33:53.000Z</published>
    <updated>2025-05-10T22:19:10.699Z</updated>
    
    <content type="html"><![CDATA[<h1 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h1><p>In analysis and probability theory, one studies various sorts of convergence (of random variables) for various reasons. In this post we study vague convergence, which is responsible for the convergence in distribution.</p><p>Vaguely speaking, vague convergence is the weakest kind of convergence one can expect (whilst still caring about continuity whenever possible). We do not consider any dependence relation among the random variables in the sequence.</p><p>Throughout, fix a probability space $(\Omega,\mathscr{F},\mathscr{P})$, where $\Omega$ is the sample space, $\mathscr{F}$ the event space and $\mathscr{P}$ the probability function. Let $(X_n)$ be a sequence of random variables on this space. Each random variable $X_n$ canonically induces a probability space $(\mathbb{R},\mathscr{B},\mu_n)$ where $\mathscr{B}$ is the Borel $\sigma$-algebra. To avoid notation hell we only consider the correspondence $X_n \leftrightarrow \mu_n$ where</p><script type="math/tex; mode=display">\mu_n(B)=\mathscr{P}(X_n^{-1}(B))=\mathscr{P}\{X_n \in B\}.</script><p>Here comes the question: if $X_n$ tends to a limit, then we would expect that $\mu_n$ converges to a limit (say $\mu$) in some sense (at least on some intervals). But is that always the case? Even if the sequence converges, can we always have $\mu(\mathbb{R})=1$? We will see through some examples that this is not always the case.</p><h2 id="Examples-Failure-of-Convergence-on-Intervals"><a href="#Examples-Failure-of-Convergence-on-Intervals" class="headerlink" title="Examples: Failure of Convergence on Intervals"></a>Examples: Failure of Convergence on Intervals</h2><p>Let $X_n\equiv\frac{(-1)^n}{n}$, then $X_n \to 0$ deterministically. For any $a&gt;0$, the sequence $\mu_n((0,a))$ oscillates between $0$ and $1$, i.e. 
it ends up in the form</p><script type="math/tex; mode=display">\dots,0,1,0,1,\dots</script><p>which does not converge at all. Likewise, for any $b&lt;0$, the sequence $\mu_n((b,0))$ oscillates between $1$ and $0$.</p><p>As another example of convergence failure, consider $b_n&lt;0&lt;a_n$ with $a_n \to 0$ and $b_n \to 0$ as $n \to \infty$, and let $X_n$ be a random variable having the uniform distribution on $(b_n,a_n)$. We see $X_n \to 0$ a.e. but $\mu_n((b,0))$, which equals $\frac{-b_n}{a_n-b_n}$ as soon as $b&lt;b_n$, may not converge at all, or may converge to any number between $0$ and $1$.</p><h2 id="Example-Failure-of-Converging-to-a-Probability-Measure"><a href="#Example-Failure-of-Converging-to-a-Probability-Measure" class="headerlink" title="Example: Failure of Converging to a Probability Measure"></a>Example: Failure of Converging to a Probability Measure</h2><p>We construct an example where $\mu_n$ converges to a measure $\mu$ with $\mu(\mathbb{R})&lt;1$, preventing $\mu$ from being a probability measure. To do this, fix two positive numbers $\alpha$ and $\beta$ such that $\alpha+\beta&lt;1$. Consider the sequence of random variables $X_n$ with</p><script type="math/tex; mode=display">\begin{aligned}X_n = \begin{cases}n & \text{with probability $\alpha$,} \\0 & \text{with probability $1-\alpha-\beta$,} \\-n &\text{with probability $\beta$.}\end{cases}\end{aligned}</script><p>Then $X_n \to X$ where</p><script type="math/tex; mode=display">\begin{aligned}X = \begin{cases}\infty & \text{with probability $\alpha$,} \\0 & \text{with probability $1-\alpha-\beta$,} \\-\infty &\text{with probability $\beta$.}\end{cases}\end{aligned}</script><p>Every $\mu_n$ is a probability measure, yet the limit $\mu$ only retains the mass at $0$, so $\mu(\mathbb{R})=1-\alpha-\beta&lt;1$. 
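</p><p>To make the limit explicit (a short verification added for illustration; the symbol $\delta_0$ for the point mass at $0$ is notation introduced here), fix real numbers $a&lt;b$. As soon as $n&gt;\max(|a|,|b|)$, the atoms at $\pm n$ lie outside $(a,b]$, so</p><script type="math/tex; mode=display">\mu_n(a,b]=\begin{cases}1-\alpha-\beta & \text{if } a<0\le b,\\0 & \text{otherwise,}\end{cases}</script><p>which is exactly $(1-\alpha-\beta)\delta_0(a,b]$. In other words, on every bounded interval $\mu_n$ eventually only shows its atom at $0$. 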
The atoms of $\mu_n$ at $\pm n$ have escaped to $+\infty$ and $-\infty$.</p><p>These examples inspire us to develop a weaker sense of convergence, where we only take intervals into account (because we would expect continuous functions to play a role).</p><h1 id="Definitions"><a href="#Definitions" class="headerlink" title="Definitions"></a>Definitions</h1><p>From the example above, it is clear that the total mass of a limit need not equal $1$. Therefore we allow total mass $\le 1$ instead, which leads to the following weakened versions of probability measure and distribution function.</p><blockquote><p><strong>Definition 1.</strong> A measure $\mu$ on $\mathbb{R}$ is a <em>subprobability measure</em> (s.p.m.) if $\mu(\mathbb{R}) \le 1$. Correspondingly, one defines the <em>subdistribution function</em> (s.d.f.) with respect to $\mu$ by</p><script type="math/tex; mode=display">\forall x, F(x)=\mu((-\infty,x]).</script></blockquote><p>When $\mu(\mathbb{R})=1$, there is nothing new, but even if not, there are few obstacles. We still see that $F$ is a right continuous function with $F(-\infty)=0$ and $F(+\infty)=\mu(\mathbb{R}) \le 1$. For brevity’s sake, we will write $\mu((a,b])$ as $\mu(a,b]$ from now on, and similarly for other kinds of intervals. We also put $\mu(a,b)=0$ when $a&gt;b$ because why not.</p><p>Our examples also warn us that atoms are a big deal, which leads us to the following definition concerning intervals.</p><blockquote><p><strong>Definition 2.</strong> With the notation above, an interval $(a,b)$ is called a <em>continuous interval</em> if neither $a$ nor $b$ is an <a href="https://proofwiki.org/wiki/Definition:Atom_of_Measure">atom</a> of $\mu$, i.e. if $\mu(a,b)=\mu[a,b]$. </p></blockquote><p>One can test if $(0,1)$ is a continuous interval in our first group of examples. Now we are ready for the definition of vague convergence.</p><blockquote><p><strong>Definition 3.</strong> A sequence $(\mu_n)$ of s.p.m.’s is said to <em>converge vaguely</em> to an s.p.m. 
$\mu$ if there exists a dense subset $D \subset \mathbb{R}$ such that</p><script type="math/tex; mode=display">\forall a,b\in D,a<b,\quad \mu_n(a,b] \to \mu(a,b].</script><p>We write $\mu_n \xrightarrow{v} \mu$.</p><p>Let $(F_n)$ be the corresponding s.d.f.’s of $(\mu_n)$ and $F$ the s.d.f. of $\mu$. Then we say that $F_n$ converges vaguely to $F$ and write $F_n \xrightarrow{v} F$.</p></blockquote><p>It would be unfair not to build the corresponding infrastructure for random variables (r.v.) in this context. We introduce the following concept that you may have already studied in calculus-based probability theory:</p><blockquote><p><strong>Definition 4.</strong> Let $(X_n)$ be a sequence of r.v.’s with corresponding cumulative distribution functions (c.d.f.) $(F_n)$. We say $X_n$ converges weakly or in distribution to $X$ (with corresponding c.d.f. $F$) if $F_n \xrightarrow{v} F$.</p></blockquote><p>In calculus-based probability theory, one learns that $F_n(x) \to F(x)$ whenever $F$ is continuous at $x$. This definition is easier to understand but skips a lot of important details.</p><h1 id="Equivalent-Conditions"><a href="#Equivalent-Conditions" class="headerlink" title="Equivalent Conditions"></a>Equivalent Conditions</h1><p>In this section we study vague convergence from the viewpoint of measure theory, utilising $\varepsilon$-$\delta$ arguments most of the time. We will see that this convergence looks quite similar to convergence in $\mathbb{R}$. </p><p>Let $(a_n)$ be a sequence of real numbers. We can recall that</p><ul><li>If $(a_n)$ converges, then the limit is unique. </li><li>If $(a_n)$ is bounded, then it has a convergent subsequence.</li><li>If every subsequence of $(a_n)$ converges to $a$, then $a_n$ converges to $a$. </li></ul><p>These results are natural in the context of calculus, but in the world of topology and functional analysis, these are not naturally expected. However, s.p.m.’s enjoy all three of them (for the second point, notice that an s.p.m. 
is bounded in a sense anyway.) Nevertheless, it would be too ambitious to include everything here and assume that the reader will finish it in one shot.</p><blockquote><p><strong>Theorem 1.</strong> Let $(\mu_n)$ and $\mu$ be s.p.m.’s. The following conditions are equivalent:</p><p>(1) $\mu_n \xrightarrow{v} \mu$.</p><p>(2) For every finite interval $(a,b)$ and $\varepsilon&gt;0$, there exists an $n_0(a,b,\varepsilon)$ such that whenever $n \ge n_0$,</p><script type="math/tex; mode=display">\mu(a+\varepsilon,b-\varepsilon)-\varepsilon \le \mu_n(a,b) \le \mu(a-\varepsilon,b+\varepsilon)+\varepsilon.</script><p>(3) For every continuous interval $(a,b)$ of $\mu$, we have</p><script type="math/tex; mode=display">\mu_n(a,b] \to \mu(a,b].</script><p>When $(\mu_n)$ and $\mu$ are p.m.’s, the second condition is equivalent to the “uniform” version:</p><p>(4) For every $\delta&gt;0$ and $\varepsilon&gt;0$, there exists $n_0(\delta,\varepsilon)$ such that if $n \ge n_0$, then for <strong>every</strong> interval $(a,b)$, possibly infinite:</p><script type="math/tex; mode=display">\mu(a+\delta,b-\delta)-\varepsilon \le \mu_n(a,b) \le \mu(a-\delta,b+\delta)+\varepsilon.</script></blockquote><p><em>Proof.</em> We first study the equivalence of the first three statements. Suppose $\mu_n$ converges vaguely to $\mu$. We are given a dense subset $D$ of the real line such that whenever $a,b \in D$ and $a&lt;b$, we have $\mu_n(a,b] \to \mu(a,b]$. For every finite interval $(a,b)$ and $\varepsilon&gt;0$, there are $a_1,a_2,b_1,b_2 \in D$ satisfying</p><script type="math/tex; mode=display">a-\varepsilon<a_1<a<a_2<a+\varepsilon,\;b-\varepsilon<b_1<b<b_2<b+\varepsilon.</script><p>By vague convergence, there exists $n_0&gt;0$ such that whenever $n \ge n_0$,</p><script type="math/tex; mode=display">|\mu_n(a_i,b_j]-\mu(a_i,b_j]|<\varepsilon</script><p>for $i=1,2$ and $j=1,2$. 
It follows that</p><script type="math/tex; mode=display">\mu_n(a,b) \ge \mu_n(a_2,b_1] > \mu(a_2,b_1]-\varepsilon\ge \mu(a+\varepsilon,b-\varepsilon)-\varepsilon</script><p>and on the other hand</p><script type="math/tex; mode=display">\mu_n(a,b) \le \mu_n(a_1,b_2] < \mu(a_1,b_2]+\varepsilon \le \mu(a-\varepsilon,b+\varepsilon)+\varepsilon.</script><p>Combining both, the implication is clear.</p><p>Next, we assume (2), and let $(a,b)$ be a continuous interval of $\mu$, i.e. we have $\mu(a,b)=\mu[a,b]$. The relation $\mu(a+\varepsilon,b-\varepsilon)-\varepsilon \le \mu_n(a,b)$ implies that</p><script type="math/tex; mode=display">\mu(a+\varepsilon,b-\varepsilon) - \varepsilon \le \varliminf_{n\to\infty} \mu_n(a,b)</script><p>holds for all $\varepsilon&gt;0$. Letting $\varepsilon \to 0$ on the left hand side, we see</p><script type="math/tex; mode=display">\mu(a,b) \le \varliminf_{n\to\infty}\mu_n(a,b).</script><p>Likewise, the relation $\mu_n(a,b) \le \mu(a-\varepsilon,b+\varepsilon)+\varepsilon$ yields</p><script type="math/tex; mode=display">\varlimsup_{n\to\infty}\mu_n(a,b) \le \mu(a-\varepsilon,b+\varepsilon)+\varepsilon.</script><p>As $\varepsilon \to 0$ on the right hand side, we obtain</p><script type="math/tex; mode=display">\varlimsup_{n \to \infty}\mu_n(a,b) \le \mu[a,b]=\mu(a,b).</script><p>Combining both sides, notice that</p><script type="math/tex; mode=display">\mu(a,b) \le \varliminf_{n\to\infty}\mu_n(a,b)\le\varlimsup_{n\to\infty}\mu_n(a,b)\le\mu[a,b]=\mu(a,b).</script><p>This forces $\mu_n(a,b)$ to converge to $\mu(a,b)$. This implies that $\mu_n(a,b] \to \mu(a,b]$. To see this, pick another continuous interval $(a,b')$ which properly contains $(a,b)$. Then $(b,b')$ is another continuous interval. It follows that</p><script type="math/tex; mode=display">\mu_n(a,b] = \mu_n(a,b')-\mu_n(b,b') \to \mu(a,b')-\mu(b,b')=\mu(a,b].</script><p>Assume (3). 
Notice that the set of atoms $A$ of $\mu$ has to be at most countable, therefore $D=\mathbb{R} \setminus A$ is dense in $\mathbb{R}$. On the other hand, $(a,b)$ is a continuous interval if and only if $a,b \in D$. This implies (1).</p><hr><p>The arguments above also show that when discussing vague convergence, one can replace $(a,b]$ with $(a,b)$, $[a,b)$ or $[a,b]$ freely, as long as $(a,b)$ is a continuous interval. It also follows that $\mu_n(\{a\}) \to 0$ whenever $a$ is not an atom of $\mu$.</p><hr><p>For (4), as (4) implies (2) (by taking $\delta=\varepsilon&gt;0$), it remains to show that (3) implies (4) assuming that $\mu_n$ and $\mu$ are p.m.’s. Indeed, it suffices to prove it on a finite interval, and we first justify this reduction. Let $A$ denote the set of atoms of $\mu$. First of all we can pick an integer $N&gt;0$ such that $\mu(-N,N) &gt; 1-\frac{\varepsilon}{4}$ (that is, the interval is so big that its measure is close enough to $1$). Pick $\alpha,\beta \in A^c$ such that $\alpha \le -N$ and $\beta \ge N$ (this is possible because $A^c$ is dense). For the interval $(\alpha,\beta)$, we can put a finite partition</p><script type="math/tex; mode=display">\alpha=a_1<a_2<\dots<a_\ell=\beta</script><p> such that $|a_{j+1}-a_j| \le \delta$ and $a_j \in A^c$ for all $j=1,\dots,\ell-1$. Therefore, we have</p><script type="math/tex; mode=display">\begin{equation}\mu((a_1,a_\ell)^c)<\frac{\varepsilon}{4}.\end{equation}</script><p>By (3), there exists $n_0$ depending on $\varepsilon$ and $\ell$ (thereby $\delta$) such that</p><script type="math/tex; mode=display">\sup_{1 \le j \le \ell -1}|\mu(a_j,a_{j+1}]-\mu_n(a_j,a_{j+1}]|  < \frac{\varepsilon}{4\ell},</script><p>for all $n \ge n_0$. 
Summing over all $j$ and replacing the half-open interval with the open one, we see</p><script type="math/tex; mode=display">|\mu(a_1,a_\ell)-\mu_n(a_1,a_\ell)|<\frac{\varepsilon}{4}.</script><p>It follows that</p><script type="math/tex; mode=display">\begin{aligned}\mu_n((a_1,a_\ell)^c) &= 1-\mu_n(a_1,a_\ell) \\                      &= \mu((a_1,a_\ell))+\mu((a_1,a_\ell)^c)-\mu_n(a_1,a_\ell) \\                      &\le \mu((a_1,a_\ell)^c)+|\mu(a_1,a_\ell)-\mu_n(a_1,a_\ell)| \\                      &< \frac{\varepsilon}{2}.\end{aligned}</script><p>(This is where being p.m. matters.) Therefore when $n \ge n_0$ and discussing $\mu(a,b)$ versus $\mu_n(a,b)$, ignoring $(a,b) \setminus (a_1,a_\ell)$ results only in an error of $&lt;\frac{\varepsilon}{2}$. Therefore it suffices to assume that $(a,b) \subset (a_1,a_\ell)$ and show that</p><script type="math/tex; mode=display">\mu(a+\delta,b-\delta)-\frac{\varepsilon}{2}\le \mu_n(a,b) \le \mu(a-\delta,b+\delta)+\frac{\varepsilon}{2}.</script><p>Since $(a,b) \subset (a_1,a_\ell)$, there exist $j,k$ with $1 \le j \le k &lt; \ell$, $a_j \le a \le a_{j+1}$ and $a_k \le b \le a_{k+1}$, so that</p><script type="math/tex; mode=display">\begin{aligned}\mu(a+\delta,b-\delta)-\frac{\varepsilon}{4} &\le \mu(a_{j+1},a_k)-\frac{\varepsilon}{4} \le \mu_n(a_{j+1},a_k) \le \mu_n(a,b) \\&\le \mu_n(a_j,a_{k+1}) \le \mu(a_j,a_{k+1})+\frac{\varepsilon}{4} \\&\le \mu(a-\delta,b+\delta) + \frac{\varepsilon}{4}.\end{aligned}</script><p>This concludes the proof and demonstrates why our specific choice of $a_j$ is important. $\square$</p><p>We cannot give a treatment of all three points above but the first point, the unicity of vague limit, is now clear.</p><blockquote><p><strong>Corollary 1 (Unicity of vague limit).</strong> With the notation of Definition 3, if there is another s.p.m. 
$\mu'$ and another dense set $D'$ such that whenever $a,b \in D'$ and $a&lt;b$, one has $\mu_n(a,b] \to \mu'(a,b]$, then $\mu$ and $\mu'$ are identical.</p></blockquote><p><em>Proof.</em> Let $A$ be the set of atoms of $\mu$ and $\mu'$; then if $a,b \in A^c$, Theorem 1 (3) gives $\mu_n(a,b] \to \mu(a,b]$ and $\mu_n(a,b] \to \mu'(a,b]$. Therefore $\mu(a,b]=\mu'(a,b]$. Since $A^c$ is dense in $\mathbb{R}$, the two measures must be identical. $\square$ </p>]]></content>
    
    
    <summary type="html">We give an introduction to vague convergence and see several equivalent conditions of it.</summary>
    
    
    
    <category term="Analysis" scheme="https://desvl.xyz/categories/Analysis/"/>
    
    <category term="Measure Theory" scheme="https://desvl.xyz/categories/Analysis/Measure-Theory/"/>
    
    <category term="Probability Theory" scheme="https://desvl.xyz/categories/Probability-Theory/"/>
    
    
    <category term="Chung" scheme="https://desvl.xyz/tags/Chung/"/>
    
  </entry>
  
  <entry>
    <title>The Pontryagin Dual group of Q_p</title>
    <link href="https://desvl.xyz/2022/12/23/pontryagin-dual-q-p/"/>
    <id>https://desvl.xyz/2022/12/23/pontryagin-dual-q-p/</id>
    <published>2022-12-23T09:57:42.000Z</published>
    <updated>2025-10-15T18:35:42.538Z</updated>
    
    <content type="html"><![CDATA[<h1 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h1><p>Let $G$ be a locally compact abelian group (for example, $\mathbb{R}$, $\mathbb{Z}$, $\mathbb{T}$, $\mathbb{Q}_p$). Then every irreducible <a href="https://en.wikipedia.org/wiki/Unitary_representation">unitary representation</a> $\pi:G \to U(\mathcal{H}_\pi)$ is one dimensional, where $\mathcal{H}_\pi$ is a non-zero Hilbert space, which we may therefore take to be $\mathbb{C}$. It follows that $\pi(x)(z)=\xi(x)z$ for all $z \in \mathbb{C}$ where $\xi \in \operatorname{Hom}(G,\mathbb{T})$ is continuous, viewing $\mathbb{T}$ as the unit circle in the complex plane. Such homomorphisms are called (unitary) <strong>characters</strong>, and we denote the set of all characters of $G$ by $\widehat{G}$, calling it the Pontryagin dual group. This should be reminiscent of the representation theory of finite groups. For convenience, instead of $\xi(x)$, we often write $\langle x,\xi \rangle$. The homomorphism property then reads $\langle x,\xi\rangle\langle y,\xi \rangle=\langle x+y ,\xi\rangle$. </p><p>Some easily accessible examples are:</p><ul><li>$\widehat{\mathbb{R}} \cong \mathbb{R}$, with $\langle x,\xi \rangle = e^{2\pi i \xi x}$.</li><li>$\widehat{\mathbb{T}} \cong \mathbb{Z}$, with $\langle z, n \rangle = z^n$.</li><li>$\widehat{\mathbb Z} \cong \mathbb{T}$, with $\langle n,z \rangle = z^n$. </li><li>$\widehat{\mathbb{Z}/k\mathbb{Z}} \cong \mathbb{Z}/k\mathbb{Z}$, with $\langle m,n\rangle =e^{2\pi i m n / k}$.</li></ul><h1 id="The-Dual-of-p-adic-Field"><a href="#The-Dual-of-p-adic-Field" class="headerlink" title="The Dual of p-adic Field"></a>The Dual of p-adic Field</h1><p>In this post we want to show that </p><script type="math/tex; mode=display">\widehat{\mathbb{Q}}_p \cong \mathbb{Q}_p.</script><p>The proof is broken down into several steps. 
First, it should be clear that $\mathbb{Q}_p$ is a topological group with respect to addition.</p><h2 id="Step-1-Find-the-simplest-character"><a href="#Step-1-Find-the-simplest-character" class="headerlink" title="Step 1 - Find the simplest character"></a>Step 1 - Find the simplest character</h2><p>Every $p$-adic number $x \in \mathbb{Q}_p$ can be written in the form </p><script type="math/tex; mode=display">x = \sum_{j=m}^{\infty}x_jp^j</script><p>where $m \in \mathbb{Z}$, $x_j \in \{0,1,\dots,p-1\}$ for all $j$. We define </p><script type="math/tex; mode=display">\langle x,\xi_1 \rangle = \exp\left(2\pi i \sum_{j=m}^{\infty}x_jp^j\right),</script><p>and claim that $\xi_1$ is a character. Notice that the right hand side is always well-defined, because all summands with $j \ge 0$ contribute nothing as $\exp(2\pi i x_jp^j)=1$. That is to say, the right hand side can be understood as a finite product: when $m \ge 0$, i.e. $x \in \mathbb{Z}_p$, the pairing $\langle x, \xi_1 \rangle = 1$; when $m&lt;0$ however, $\langle x,\xi_1 \rangle = \exp\left( 2\pi i \sum_{j=m}^{-1}x_jp^j\right)$. Therefore it is legitimate to write</p><script type="math/tex; mode=display">\langle x,\xi_1 \rangle =\exp (2\pi i x).</script><p>From this it follows immediately that</p><script type="math/tex; mode=display">\langle x+y ,\xi_1 \rangle = \langle x,\xi_1 \rangle\langle y,\xi_1 \rangle.</script><p>The function $\xi_1$ is continuous because it is constant on the open subgroup $\mathbb{Z}_p$. Therefore it is safe to say that $\xi_1$ is a character with kernel $\mathbb{Z}_p$. </p><p>A quick thought would be to generate all characters out of $\xi_1$, to get something like $\xi_p$, $\xi_{1+p+p^2+\dots}$. But that approach might lead us to a nightmare of subscripts. Instead, we try to discover as many characters as possible. 
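</p><p>Before doing so, here is a concrete evaluation of $\xi_1$ (an illustrative example; the prime $p=5$ and the digits are chosen arbitrarily). Take $x=3\cdot 5^{-2}+2\cdot 5^{-1}+4+1\cdot 5+\cdots$. Only the terms with negative exponents contribute, hence</p><script type="math/tex; mode=display">\langle x,\xi_1\rangle=\exp\left(2\pi i\left(\frac{3}{25}+\frac{2}{5}\right)\right)=\exp\left(2\pi i\cdot\frac{13}{25}\right),</script><p>a $25$-th root of unity, as expected since $25x \in \mathbb{Z}_5$. 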
For any $y \in \mathbb{Q}_p$, we define</p><script type="math/tex; mode=display">\langle x,\xi_y\rangle = \langle xy,\xi_1\rangle.</script><p>In other words, $\xi_y$ is defined by $x \mapsto \langle xy,\xi_1\rangle$. Since the multiplication is continuous, we see immediately that $\xi_y$ is a character, not much more complicated than $\xi_1$. We will show that this is all we need. To do this, we need to <em>characterise</em> all characters. Notice that for $y \ne 0$ the characters $\xi_y$ have the same image but their kernels differ. So we try to characterise the characters by characterising their kernels.</p><h2 id="Step-2-Study-the-kernels-of-characters"><a href="#Step-2-Study-the-kernels-of-characters" class="headerlink" title="Step 2 - Study the kernels of characters"></a>Step 2 - Study the kernels of characters</h2><p>For the $\xi_y$ above, we notice that $\langle x,\xi_y\rangle=1$ if and only if $xy \in \ker\xi_1=\mathbb{Z}_p$, i.e. $|xy|_p \le 1$. Therefore</p><script type="math/tex; mode=display">\ker\xi_y = \{x:|xy|_p \le 1\}.</script><p>We expect that all characters are of the form $\xi_y$. Therefore their kernels should naturally look like $\ker\xi_y$. Notice that for fixed $y \ne 0$, we have $|y|_p=p^m$ for some $m \in \mathbb{Z}$. As a result $\ker\xi_y = \overline{B}(0,p^{-m})$. For this reason we have the following (more obscure) statement:</p><blockquote><p><strong>Lemma 1.</strong> If $\xi \in \widehat{\mathbb{Q}}_p$, there exists an integer $k$ such that $\overline{B}(0,p^{-k}) \subset \ker\xi$.</p></blockquote><p><em>Proof.</em> Since $\xi$ is continuous and $\langle 0,\xi\rangle=1$, the set $\xi^{-1}\{z \in \mathbb{T}:|z-1| &lt; 1\}$ is an open neighbourhood of $0$, so there exists $k$ such that $\overline{B}(0,p^{-k})$ is contained in it. But $\overline{B}(0,p^{-k})$ is a group (as $|\cdot|_p$ is non-Archimedean), therefore it maps onto a subgroup of $\mathbb{T}$ contained in $\{z:|z-1|&lt;1\}$, which can only be $\{1\}$. 
$\square$</p><p>We cannot say the kernel of $\xi$ is exactly of the form $\overline{B}(0,p^{-k})$ yet, but we have a way to formalise it now. If $\overline{B}(0,p^{-k}) \subset \ker\xi$ for all $k$, then $\xi=1$ is the unit in $\widehat{\mathbb{Q}}_p$. Otherwise, for each $\xi$, there is a smallest $k_0$ such that $\overline{B}(0,p^{-k_0})\subset \ker\xi$ but $\overline{B}(0,p^{-k}) \not \subset \ker\xi$ whenever $k&lt;k_0$. Put differently, we have $\langle p^{k_0-1},\xi\rangle \ne1$ but $\langle p^k,\xi\rangle=1$ whenever $k \ge k_0$. As one may guess, such $k_0$ reflects the “size” of $\xi$. For convenience we study the case when $k_0=0$ first.</p><blockquote><p><strong>Lemma 2 (“Fourier series”).</strong> Suppose for given $\xi \in \widehat{\mathbb{Q}}_p$, $\langle 1,\xi \rangle = 1$ but $\langle p^{-1},\xi \rangle \ne 1$. There is a sequence $(c_j)$ taking values in $\{0,1,\dots,p-1\}$ such that $\langle p^{-k},\xi \rangle=\exp\left(2\pi i\sum_{j=1}^k c_{k-j}p^{-j}\right)$ for all $k=1,2,\dots$. In particular, $c_0 \ne 0$.</p></blockquote><p><em>Proof.</em> Put $\omega_k=\langle p^{-k},\xi\rangle$. Then $\omega_0=1$ but $\omega_k \ne 1$ for all $k \ge 1$. Since</p><script type="math/tex; mode=display">\omega_{k+1}^p = \langle p^{-(k+1)},\xi \rangle^p=\langle p \cdot p^{-(k+1)},\xi \rangle =\langle p^{-k},\xi \rangle =\omega_k,</script><p>each $\omega_{k+1}$ is a $p$-th root of $\omega_{k}$, and in particular $\omega_1$ is a $p$-th root of unity. There exists $c_0 \in \{1,\dots,p-1\}$ such that </p><script type="math/tex; mode=display">\omega_1=\exp\left(2\pi i c_0p^{-1}\right),</script><p>and the overall formula for $\omega_k$ follows by induction. $\square$</p><p>One would guess that for the corresponding $k_0$, the “size” of $\xi$ should be $p^{k_0}$. This looks realistic, but will be tedious. 
Right now we still only study the case when $k_0=0$.</p><blockquote><p><strong>Lemma 3.</strong> With the notation of Lemma 2, there exists $y \in \mathbb{Q}_p$ with $|y|_p=1$ such that $\xi = \xi_y$.</p></blockquote><p><em>Proof.</em> From Lemma 2 we obtain a series $y=\sum_{j=0}^{\infty}c_jp^j$ with $c_0 \ne 0$. Then in particular $|y|_p=1$. By expanding the terms, we see</p><script type="math/tex; mode=display">\begin{aligned}\langle p^{-k},\xi \rangle &= \exp\left(2\pi i \sum_{j=1}^k c_{k-j}p^{-j}\right) \\                           &= \exp\left(2\pi i \sum_{j=-k}^{-1} c_{j+k}p^j \right) \\                           &= \left\langle \sum_{j=-k}^{\infty}c_{j+k}p^j,\xi_1 \right\rangle= \langle p^{-k}y,\xi_1 \rangle = \langle p^{-k},\xi_y \rangle.\end{aligned}</script><p>Since both characters are trivial on $\mathbb{Z}_p$ and every $x \in \mathbb{Q}_p$ differs from an integer multiple of some $p^{-k}$ by an element of $\mathbb{Z}_p$, it follows that $\langle x,\xi \rangle = \langle x,\xi_y \rangle$ for all $x \in \mathbb{Q}_p$. $\square$</p><p>Now we are ready to conclude our observation of the dual group.</p><h2 id="Step-3-Realise-the-dual-group"><a href="#Step-3-Realise-the-dual-group" class="headerlink" title="Step 3 - Realise the dual group"></a>Step 3 - Realise the dual group</h2><blockquote><p><strong>Theorem.</strong> The map $\Lambda:y \mapsto \xi_y$ is an isomorphism of topological groups. Hence $\mathbb{Q}_p \cong \widehat{\mathbb{Q}}_p$.</p></blockquote><p><em>Proof.</em> First of all we study the algebraic isomorphism. The map $\Lambda$ is a group homomorphism because $\langle x,\xi_{y+y'}\rangle=\langle xy+xy',\xi_1\rangle=\langle x,\xi_y\rangle\langle x,\xi_{y'}\rangle$. Next, if $\xi_y=1$, then </p><script type="math/tex; mode=display">\{x \in \mathbb{Q}_p:|xy|_p \le 1\}= \mathbb{Q}_p \implies y=0.</script><p>Hence the map $\Lambda$ is injective. To show that $\Lambda$ is surjective, fix $\xi \in \widehat{\mathbb{Q}}_p$ with $\xi \ne 1$ (the trivial character is $\xi_0$). By the comment below Lemma 1, there is a smallest integer $k$ such that $\langle p^k,\xi \rangle = 1$. 
Then one considers the character $\eta$ defined by</p><script type="math/tex; mode=display">\langle x,\eta \rangle = \langle p^k x,\xi \rangle.</script><p>It satisfies the condition in Lemma 3, therefore there exists $z \in \mathbb{Q}_p$ such that $\eta=\xi_z$, and it follows that $\xi=\xi_{p^{-k}z}$.</p><p>Next we show that $\Lambda$ is a homeomorphism. Consider the following sets</p><script type="math/tex; mode=display">N(\ell,k)=\{\xi \in \widehat{\mathbb{Q}}_p:|\langle x,\xi\rangle-1|< \ell^{-1}\text{ for }|x|_p \le p^{k}\}</script><p>ranging over $\ell \ge 1$ and $k \in \mathbb{Z}$. These sets constitute a local base at $1$ for $\widehat{\mathbb{Q}}_p$. We need to show that they correspond to a local base at $0$ of $\mathbb{Q}_p$ under the map $\Lambda$:</p><p> The image of the set $\{x:|x|_p \le p^k\}$ under $\xi_1$ is $\{1\}$ if $k \le 0$ and is the group of $p^k$-th roots of unity if $k&gt;0$, and hence is contained in $\{z:|z-1|&lt;\ell^{-1}\}$ if and only if $k \le 0$. It follows that $\xi_y \in N(\ell,k)$ if and only if $|y|_p \le p^{-k}$, i.e., $y \in \overline{B}(0,p^{-k})$. We are done. $\square$</p>]]></content>
    
    
    <summary type="html">In this post we show that the Pontryagin dual group of $&#92;mathbb{Q}_p$ is isomorphic to itself.</summary>
    
    
    
    <category term="Number Theory" scheme="https://desvl.xyz/categories/Number-Theory/"/>
    
    <category term="p-Adic Analysis" scheme="https://desvl.xyz/categories/Number-Theory/p-Adic-Analysis/"/>
    
    
    <category term="p-adic" scheme="https://desvl.xyz/tags/p-adic/"/>
    
    <category term="Pontryagin" scheme="https://desvl.xyz/tags/Pontryagin/"/>
    
  </entry>
  
  <entry>
    <title>The Haar Measure on the Field of p-Adic Numbers</title>
    <link href="https://desvl.xyz/2022/12/20/haar-measure-p-adic/"/>
    <id>https://desvl.xyz/2022/12/20/haar-measure-p-adic/</id>
    <published>2022-12-20T13:28:41.000Z</published>
    <updated>2025-05-11T12:07:41.127Z</updated>
    
    <content type="html"><![CDATA[<h2 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h2><p>Let $p$ be a prime number. Then the space of $p$-adic numbers $\mathbb{Q}_p$ is a locally compact abelian group. This can be observed through the local basis</p><script type="math/tex; mode=display">\overline{B}(x,r)=\{y \in \mathbb{Q}_p:|x-y|_p \le r\},</script><p>where $|\cdot|_p$ is the $p$-adic norm such that, whenever we write a non-zero rational number as $s=p^mq$ with the numerator and denominator of $q$ prime to $p$, we have $|s|_p=p^{-m}$.</p><p>We remind the reader that every locally compact abelian group $G$ admits a Haar measure, which is unique up to a positive scalar multiple (<a href="https://math.uchicago.edu/~may/VIGRE/VIGRE2010/REUPapers/Gleason.pdf">proof</a>). In this post, we try to find the Haar measure on $\mathbb{Q}_p$, which makes it possible to do harmonic analysis on it. For this reason, in future posts, we will also find the dual group of $\mathbb{Q}_p$ as well as the dual measure. </p><h2 id="Haar-Measure"><a href="#Haar-Measure" class="headerlink" title="Haar Measure"></a>Haar Measure</h2><p>Let us first recall the basic structure of $\mathbb{Q}_p$. Every element can be written as a Laurent series</p><script type="math/tex; mode=display">x=\sum_{j=m}^{\infty}c_jp^j</script><p>where $m \in \mathbb{Z}$ and $c_j \in \{0,\dots,p-1\}$. The ring of integers $\mathbb{Z}_p$ is exactly the closed disc of radius $1$ at the origin. That is, $\mathbb{Z}_p=\overline{B}(0,1)$, and it is a compact set. Let $\mu$ be an arbitrary Haar measure on $\mathbb{Q}_p$. Then $\mu(\mathbb{Z}_p)$ is non-zero and finite, because $\mathbb{Z}_p$ is compact with non-empty interior. We can therefore put</p><script type="math/tex; mode=display">m_p(E)=\frac{1}{\mu(\mathbb{Z}_p)}\mu(E).</script><p>Then in particular $m_p(\mathbb{Z}_p)=1$. This is the canonical Haar measure we are looking for. But it would be hilarious to end the post here. 
We will take a closer look at it, at least on a $p$-adic level.</p><p>Recall that when studying the Lebesgue measure on $\mathbb{R}$ we have encountered a definition of the form</p><script type="math/tex; mode=display">m(E)=\inf\left\{\sum_j\ell(I_j)\right\}</script><p>where the infimum is taken over all countable collections of open intervals $\{I_j\}$ such that $\bigcup_j I_j \supset E$, and $\ell(I_j)$ is the length of $I_j$. In fact, we can actually write</p><script type="math/tex; mode=display">m(E)=\inf\{m(V):V \supset E \text{ is open}\}.</script><p>On $\mathbb{Q}_p$, we write</p><script type="math/tex; mode=display">m_p(E)=\inf\{m_p(V): V \supset E \text{ is open}\}.</script><p>The point here is how to express $V$. For this reason we need to recall some topology of $\mathbb{Q}_p$.</p><h2 id="Some-p-adic-topology"><a href="#Some-p-adic-topology" class="headerlink" title="Some p-adic topology"></a>Some p-adic topology</h2><blockquote><p>$\mathbb{Q}_p$ is a separable metric space. Therefore every open set $V$ is a countable union of open balls.</p></blockquote><p>There is nothing special about this statement. The space has already been equipped with a norm. Besides, as $\mathbb{Q}$ is dense in $\mathbb{Q}_p$, we have nothing to worry about regarding second countability.</p><blockquote><p>Every closed ball of $\mathbb{Q}_p$ is open (hence we simply call them “balls” hereafter). Every point in the ball is a “centre”. If two balls intersect then one is contained in the other. </p></blockquote><p>This is dramatically different from our understanding of $\mathbb{R}$ or $\mathbb{C}$. Notice that the $p$-adic norm $|\cdot|_p$ only takes the values $p^k$ with $k \in \mathbb{Z}$, or $0$. For any $r&gt;0$, there exists some $\varepsilon&gt;0$ such that</p><script type="math/tex; mode=display">\overline{B}(x,r)=\{y \in \mathbb{Q}_p:|x-y|_p \le r\}=\{y \in \mathbb{Q}_p:|x-y|_p<r+\varepsilon\}.</script><p> The clopenness of balls in $\mathbb{Q}_p$ follows. 
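</p><p>To make the choice of $\varepsilon$ concrete (a sketch under the assumption $r=p^k$ with $k \in \mathbb{Z}$; the specific $\varepsilon$ below is one admissible choice among many): since no value of $|\cdot|_p$ lies strictly between $p^k$ and $p^{k+1}$, one may take $\varepsilon=p^{k+1}-p^k$ and obtain</p><script type="math/tex; mode=display">\overline{B}(x,p^k)=\{y \in \mathbb{Q}_p:|x-y|_p \le p^k\}=\{y \in \mathbb{Q}_p:|x-y|_p<p^{k+1}\}.</script><p>For $p=2$ and $k=0$ this reads $\mathbb{Z}_2=\{y:|y|_2&lt;2\}$. 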
</p><p>Next, recall that $|\cdot|_p$ is non-Archimedean. Consider $y \in \overline{B}(x,r)$. It follows that $|x-y|_p=|y-x|_p \le r$. On the other hand, for any $z \in \overline{B}(x,r)$, we have $|x-z|_p \le r$. Therefore $|y-z|_p \le \max(|y-x|_p,|x-z|_p) \le r$. Hence $\overline{B}(x,r)\subset \overline{B}(y,r)$. Symmetrically we see $\overline{B}(y,r) \subset \overline{B}(x,r)$. Hence they are equal.</p><p>Let $\overline{B}(x,r)$ and $\overline{B}(x',r')$ be two balls that intersect, and without loss of generality we assume that $r \le r'$. Let $y$ be a point in the intersection, then we see</p><script type="math/tex; mode=display">\overline{B}(x',r')=\overline{B}(y,r') \supset \overline{B}(y,r)=\overline{B}(x,r).</script><p>So far so good. We next try to compute the Haar measure of every ball.</p><h2 id="Measure-of-a-ball"><a href="#Measure-of-a-ball" class="headerlink" title="Measure of a ball"></a>Measure of a ball</h2><blockquote><p>Every ball of radius $p^k$ has measure $p^k$ ($k \in \mathbb{Z}$).</p></blockquote><p>First of all notice that $\overline{B}(0,1)=\mathbb{Z}_p$, and we defined $m_p$ so that $m_p(\mathbb{Z}_p)=1$. Therefore, by translation invariance, every ball of the form $\overline{B}(x,1)$ has measure $1$. Next, since $\overline{B}(0,p^k)=p^{-k}\mathbb{Z}_p$ for all $k \in \mathbb{Z}$, it is necessary to unwind $\mathbb{Z}_p$ a little bit more.</p><p>We have</p><script type="math/tex; mode=display">\mathbb{Z}_p=\coprod_{i=0}^{p-1}(i+p\mathbb{Z}_p)= \coprod_{i=0}^{p-1}\coprod_{j=0}^{p-1}(i+jp+p^2\mathbb{Z}_p)=\cdots</script><p>Therefore $\mathbb{Z}_p$ is a disjoint union of $p^k$ balls of radius $p^{-k}$ when $k&gt;0$, all of which have the same measure by translation invariance. Hence in this case,</p><script type="math/tex; mode=display">m_p(\overline{B}(0,p^{-k}))=p^{-k}m_p(\overline{B}(0,1))=p^{-k}</script><p>as expected. 
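</p><p>As a sanity check (an illustrative case with $p=3$ and $k=1$, chosen here for concreteness), the decomposition reads</p><script type="math/tex; mode=display">\mathbb{Z}_3=(0+3\mathbb{Z}_3)\sqcup(1+3\mathbb{Z}_3)\sqcup(2+3\mathbb{Z}_3).</script><p>The three cosets are translates of $3\mathbb{Z}_3=\overline{B}(0,3^{-1})$, so by translation invariance they share the same measure, and their measures add up to $m_3(\mathbb{Z}_3)=1$. Hence $m_3(\overline{B}(0,3^{-1}))=\frac{1}{3}$, in agreement with the formula above. 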
In other words, for $k&lt;0$, the ball $\overline{B}(0,p^k)$ has measure $p^k$.</p><p> For the remaining case $k&gt;0$, we notice that </p><script type="math/tex; mode=display">\begin{aligned}p^{-k}\mathbb{Z}_p &= p^{-k}\coprod_{a_0=0}^{p-1}\dots\coprod_{a_{k-1}=0}^{p-1}(a_0+\dots+a_{k-1}p^{k-1}+p^k\mathbb{Z}_p) \\                   &= \coprod_{a_0=0}^{p-1}\dots\coprod_{a_{k-1}=0}^{p-1}(a_0p^{-k}+\dots+a_{k-1}p^{-1}+\mathbb{Z}_p)\end{aligned}</script><p>which is to say $\overline{B}(0,p^k)=p^{-k}\mathbb{Z}_p$ is a disjoint union of $p^k$ balls of radius $1$. Hence its measure is $p^k$. This concludes our computation of the measure of balls in $\mathbb{Q}_p$.</p><h2 id="Back-to-the-remaining-problem"><a href="#Back-to-the-remaining-problem" class="headerlink" title="Back to the remaining problem"></a>Back to the remaining problem</h2><p>Now we come back to the definition of $m_p$. Every open set $V$ can be written in the form</p><script type="math/tex; mode=display">V=\bigcup_{i=1}^{\infty}\overline{B}(x_i,p^{m_i}).</script><p>The union is countable because $\mathbb{Q}_p$ is second countable. By combining intersecting balls, we can assume that the union is also disjoint. It follows that</p><script type="math/tex; mode=display">m_p(V)=\sum_{i=1}^{\infty}p^{m_i}.</script><p>Note: this should be understood in the sense of real series, instead of $p$-adic numbers, because $m_p$ takes values in $\mathbb{R}$. So for an arbitrary measurable set, we have</p><script type="math/tex; mode=display">m_p(E)=\inf\left\{\sum_{i=1}^{\infty}p^{m_i}:E \subset\bigcup_{i=1}^{\infty}\overline{B}(x_i,p^{m_i})\right\}.</script>]]></content>
    
    
    <summary type="html">In this post we study the canonical Haar measure on $\mathbb{Q}_p$, and give an explicit definition in the same way as for the Lebesgue measure.</summary>
    
    
    
    <category term="Number Theory" scheme="https://desvl.xyz/categories/Number-Theory/"/>
    
    <category term="p-Adic Analysis" scheme="https://desvl.xyz/categories/Number-Theory/p-Adic-Analysis/"/>
    
    
    <category term="p-adic" scheme="https://desvl.xyz/tags/p-adic/"/>
    
    <category term="Haar" scheme="https://desvl.xyz/tags/Haar/"/>
    
  </entry>
  
  <entry>
    <title>Every Regular Local Ring is Cohen-Macaulay</title>
    <link href="https://desvl.xyz/2022/12/05/regular-local-ring-cohen-macaulay/"/>
    <id>https://desvl.xyz/2022/12/05/regular-local-ring-cohen-macaulay/</id>
    <published>2022-12-04T23:00:00.000Z</published>
    <updated>2025-12-31T15:49:06.412Z</updated>
    
    <content type="html"><![CDATA[<p>Throughout, let $R$ be a commutative Noetherian local ring with maximal ideal $\mathfrak{m}$ and residue field $k=R/\mathfrak{m}$.</p><h1 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h1><p>The notion of a Cohen-Macaulay ring is sufficiently general to admit a wealth of examples in algebraic geometry, invariant theory and combinatorics; meanwhile it is sufficiently strict to allow a rich theory. The notion of Cohen-Macaulay is a workhorse of commutative algebra. In this post, we discover an important subclass of Cohen-Macaulay rings: regular local rings (one may think of $k[[x_1,\dots,x_n]]$). See also <a href="https://mathoverflow.net/q/138218/">“Why Cohen-Macaulay rings have become important in commutative algebra?”</a> on MathOverflow. </p><p>The reader is recommended to be familiar with basic commutative algebra tools such as Nakayama’s lemma and minimal prime ideals.</p><p>The content can be generalised to modules to a good extent, but we do not do so here, for the sake of quick accessibility.</p><h1 id="Embedding-Dimension-Krull-Dimension-and-Grade"><a href="#Embedding-Dimension-Krull-Dimension-and-Grade" class="headerlink" title="Embedding Dimension, Krull Dimension and Grade"></a>Embedding Dimension, Krull Dimension and Grade</h1><blockquote><p> <strong>Definition 1.</strong> The <em>Krull dimension</em> of $R$, written as $\dim{R}$, is the supremum of the lengths $d$ of chains of prime ideals</p><script type="math/tex; mode=display">\mathfrak{p}_0 \subsetneq \mathfrak{p}_1 \subsetneq \dots \subsetneq  \mathfrak{p}_d.</script></blockquote><p>This definition was introduced to define the dimension of affine varieties, in a global sense. 
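</p><p>For instance, here is a small worked example (the ring $K[[x,y]]$, with $K$ a field, is chosen for illustration and is not taken from the text above). It is a local ring with maximal ideal $(x,y)$, and it has the chain of prime ideals</p><script type="math/tex; mode=display">(0) \subsetneq (x) \subsetneq (x,y),</script><p>which has length $2$, so $\dim K[[x,y]] \ge 2$; in fact the dimension is exactly $2$.</p><p>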
Locally, we have the following definition.</p><blockquote><p><strong>Definition 2.</strong> The <em>embedding dimension</em> of $R$ is defined as</p><script type="math/tex; mode=display">emb.\dim{R} = \dim_k(\mathfrak{m}/\mathfrak{m}^2).</script><p>The right hand side is the dimension of the $k$-vector space $\mathfrak{m}/\mathfrak{m}^2$.</p></blockquote><p>Let $R$ be the local ring of a complex variety $X$ at a point $P$; in other words, we write $R=\mathcal{O}_{P,X}$. Then $(\mathfrak{m}/\mathfrak{m}^2)^\ast$ is the Zariski tangent space of $X$ at $P$, whose dimension equals $\dim_k(\mathfrak{m}/\mathfrak{m}^2)=emb.\dim{R}$. Here for a vector space $V$ over $k$, $V^\ast$ is the dual space of $V$. The embedding dimension of $R$ is the smallest integer $n$ such that some analytic neighbourhood of $P$ in $X$ embeds into $\mathbb{C}^n$. If this dimension equals the dimension of $X$, then $X$ is “smooth” at $P$. For this reason we define regular local rings.</p><blockquote><p><strong>Definition 3.</strong> The ring $R$ is called <em>regular</em> if $\dim{R}=emb.\dim{R}$.</p></blockquote><p>The most immediate example of a regular local ring is a ring of the form</p><script type="math/tex; mode=display">K[[x_1,\dots,x_n]],</script><p>where $K$ is a field. Rings of this kind are regular local rings of Krull dimension $n$. As one would imagine, this ring contains much more information than $K[x_1,\dots,x_n]$, just as power series in complex analysis are much more powerful than polynomials.</p><p>But by working with regular local rings, we are not restricting ourselves to rings of power series over a field. For example, the ring $\mathbb{Z}[X]_{(2,X)}$ is also a regular local ring, but it does not even contain a field. </p><p>Nevertheless, our primary psychological model of regular local rings is still a ring of the form $A=K[[x_1,\dots,x_n]]$, which has maximal ideal $\mathfrak{m}=(x_1,\dots,x_n)$. 
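</p><p>For contrast with Definition 3, a non-regular example may help; the ring $B$ below is an illustration chosen here, not one discussed in the text above. Take the completed local ring of the cuspidal cubic at the origin, $B=K[[x,y]]/(y^2-x^3)$, with maximal ideal $\mathfrak{n}=(x,y)$. Since $y^2-x^3$ is a nonzero non-unit in the two-dimensional domain $K[[x,y]]$ and lies in $(x,y)^2$, one expects</p><script type="math/tex; mode=display">\dim B = 1, \qquad emb.\dim B=\dim_K(\mathfrak{n}/\mathfrak{n}^2)=2,</script><p>so $B$ is not regular, matching the picture of a singular point.</p><p>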
To study local rings in the flavour of $A$, we develop an analogue of the elements $\{x_1,\dots,x_n\}$.</p><blockquote><p><strong>Definition 4.</strong> A <em>regular sequence</em> of $R$, also written as $R$-sequence, is a sequence $[x_1,\dots,x_n]$ of elements in $\mathfrak{m}$ such that $x_1$ is a non-zero-divisor in $R$, and such that for each $i&gt;1$, $x_i$ is a non-zero-divisor in  $R/(x_1,\dots,x_{i-1})$. </p><p>The <em>grade</em> of $R$, $G(R)$, is the maximal length of a regular sequence. If $G(R)=\dim{R}$, then $R$ is called <em>Cohen-Macaulay</em>.</p></blockquote><p>It is quite intuitive that, for $A=K[[x_1,\dots,x_n]]$, the longest $A$-sequence has to be $[x_1,\dots,x_n]$, and therefore $A$ is Cohen-Macaulay. But such an argument does not bring us to the conclusion that quickly. We will show later, anyway, that every regular local ring is a Cohen-Macaulay ring.</p><h1 id="The-Sequence-That-Forms-a-Basis"><a href="#The-Sequence-That-Forms-a-Basis" class="headerlink" title="The Sequence That Forms a Basis"></a>The Sequence That Forms a Basis</h1><p>Amongst many sequences, we are in particular interested in the sequences that are mapped onto a basis of the $k$-vector space $\mathfrak{m}/\mathfrak{m}^2$. We will show later that such a “regular” sequence is indeed a <em>regular</em> sequence.</p><blockquote><p><strong>Proposition 1.</strong> Let $x_1,\dots,x_n$ be elements in $\mathfrak{m} \subset R$ whose images form a basis of $\mathfrak{m}/\mathfrak{m}^2$. Then $x_1,\dots,x_n$ generate the maximal ideal $\mathfrak{m}$.</p></blockquote><p><em>Proof.</em> <a href="https://stacks.math.columbia.edu/tag/07RC">Nakayama’s lemma (8)</a>. Notice that as $R$ is local, the Jacobson radical is $\mathfrak{m}$. Besides, we take $I=M=\mathfrak{m}$. 
$\square$</p><blockquote><p><strong>Proposition 2.</strong> If $R$ is a regular local ring of dimension $n$ and $x_1, \dots,x_n \in \mathfrak{m}$ map to a basis of $\mathfrak{m}/\mathfrak{m}^2$, then $R/(x_1,\dots,x_i)$ is a  regular local ring of dimension $n-i$.</p></blockquote><p><em>Proof.</em> By proposition 1, we have $\mathfrak{m}=(x_1,\dots,x_i,x_{i+1},\dots,x_n)$. The dimension of $R/(x_1,\dots,x_i)$ is suggested by the chain in $R$:</p><script type="math/tex; mode=display">(x_1,\dots,x_i,x_{i+1}) \subset \dots \subset (x_1,\dots,x_i,x_{i+1},\dots,x_n)</script><p>which has $n-i$ terms. More precisely, quotienting by one element of $\mathfrak{m}$ lowers the dimension by at most one, so $\dim R/(x_1,\dots,x_i) \ge n-i$, while the maximal ideal of the quotient is generated by $n-i$ elements, so the dimension is at most $n-i$ by Krull’s height theorem. That is, $\dim R/(x_1,\dots,x_i)=n-i$. On the other hand, the maximal ideal $\mathfrak{M}$ in $R/(x_1,\dots,x_i)$ is generated by the images of $x_{i+1},\dots,x_n$, so $\dim_k(\mathfrak{M}/\mathfrak{M}^2) \le n-i$. Since the embedding dimension is never smaller than the Krull dimension, equality holds, and the images of $x_{i+1},\dots,x_n$ form a basis of $\mathfrak{M}/\mathfrak{M}^2$. $\square$</p><p>It looks quite promising now that a sequence mapping to a basis can get everything down to earth, and we will show that in the following section.</p><h1 id="Regular-Local-Rings-Are-Integral-Domains-and-Cohen-Macaulay"><a href="#Regular-Local-Rings-Are-Integral-Domains-and-Cohen-Macaulay" class="headerlink" title="Regular Local Rings Are Integral Domains and Cohen-Macaulay"></a>Regular Local Rings Are Integral Domains and Cohen-Macaulay</h1><blockquote><p><strong>Proposition 3.</strong> If $R$ is regular, then $R$ is an integral domain.</p></blockquote><p><em>Proof.</em> We use induction on $\dim R$. When $\dim{R}=0$ and $R$ is regular, $R$ has to be a field, hence an integral domain by definition. Next we assume that $\dim{R}&gt;0$ and that the statement has been proved for dimension $\dim{R}-1$.</p><p>Pick $x \in \mathfrak{m} \setminus \mathfrak{m}^2$. Then this element maps to a nonzero element $\overline{x}$ in $\mathfrak{m}/\mathfrak{m}^2$. There exists a basis of $\mathfrak{m}/\mathfrak{m}^2$ that contains $\overline{x}$. 
Therefore by proposition 2, $R/(x)$ is a regular local ring of dimension $\dim{R}-1$, which is an integral domain by assumption. It follows that $(x)$ is prime. </p><p>We claim that there exists $x \in \mathfrak{m} \setminus \mathfrak{m}^2$ such that $(x)$ has height $1$. If not, then for all $x \in \mathfrak{m} \setminus \mathfrak{m}^2$, the prime ideal $(x)$ is a minimal prime. Since $R$ is Noetherian, there exist only finitely many minimal prime ideals $\mathfrak{p}_1,\dots,\mathfrak{p}_r$, so that</p><script type="math/tex; mode=display">\mathfrak{m}=\mathfrak{m}^2 \cup \mathfrak{p}_1 \cup \dots \cup \mathfrak{p}_r</script><p> and consequently (as $\mathfrak{m} \not\subset \mathfrak{m}^2$ by Nakayama’s lemma) $\mathfrak{m} \subset \mathfrak{p}_j$ for some $1 \le j \le r$. It follows that $\dim{R}=0$, contradicting our assumption that $\dim{R}&gt;0$. [Note: prime avoidance allows at most two of the ideals to be non-prime. See P. 90 of Eisenbud’s Commutative Algebra, with a View Toward Algebraic Geometry.]</p><p>Thus, as our claim is true, we can write $\mathfrak{p} \subsetneq (x)$ with $\mathfrak{p}$ prime and $x \in \mathfrak{m} \setminus \mathfrak{m}^2$. We see $\mathfrak{p} \subset (x^n)$ for all $n$ because if $p=rx^n \in \mathfrak{p}$, then $r \in \mathfrak{p} \subset (x)$ (as $x \notin \mathfrak{p}$) and therefore we can write $r=sx$, or equivalently $p = sx^{n+1} \in (x^{n+1})$. Hence, by Krull’s intersection theorem, $\mathfrak{p} \subset \bigcap_{n=1}^{\infty}(x^n)=0$. Therefore $R/\mathfrak{p}=R/0=R$ is an integral domain. $\square$</p><p>We now reach the conclusion of this post.</p><blockquote><p><strong>Proposition 4.</strong> If $R$ is regular and of Krull dimension $n$, any $x_1,\dots,x_n \in \mathfrak{m}$ mapping to a basis of $\mathfrak{m}/\mathfrak{m}^2$ gives rise to a regular sequence ($R$-sequence). Hence $G(R)=\dim{R}$ and therefore $R$ is Cohen-Macaulay.</p></blockquote><p><em>Proof.</em> As $G(R) \le \dim{R}$, once we have shown that $[x_1,\dots,x_n]$ is a regular sequence, we have $G(R) \ge \dim{R}$ and hence equality. 
To show that it is a regular sequence, first of all notice that $x_1$ is a non-zero-divisor (because $R$ is an integral domain). For any $1 \le i&lt;n$, we see $R/(x_1,\dots,x_i)$ is a regular local ring of dimension $n-i$ by proposition 2, hence again an integral domain. Therefore the (nonzero) image of $x_{i+1}$ is a non-zero-divisor in $R/(x_1,\dots,x_i)$. $\square$</p><h1 id="Reference-Further-Readings"><a href="#Reference-Further-Readings" class="headerlink" title="Reference / Further Readings"></a>Reference / Further Readings</h1><ul><li>Charles A. Weibel, <em>An Introduction to Homological Algebra</em>.</li><li>M. F. Atiyah,  I. G. Macdonald, <em>Introduction to Commutative Algebra</em>.</li><li>David Eisenbud, <em>Commutative Algebra: with a View Toward Algebraic Geometry</em>.</li><li>Winfried Bruns, Jürgen Herzog, <em>Cohen-Macaulay Rings</em>.</li></ul>]]></content>
    
    
    <summary type="html">In this post we show that the class of regular local rings (the abstract version of power series rings) is a subclass of Cohen-Macaulay rings.</summary>
    
    
    
    <category term="Algebra" scheme="https://desvl.xyz/categories/Algebra/"/>
    
    <category term="Commutative Algebra" scheme="https://desvl.xyz/categories/Algebra/Commutative-Algebra/"/>
    
    
    <category term="Study new content" scheme="https://desvl.xyz/tags/Study-new-content/"/>
    
    <category term="Cohen-Macaulay" scheme="https://desvl.xyz/tags/Cohen-Macaulay/"/>
    
  </entry>
  
  <entry>
    <title>The abc Theorem of Polynomials</title>
    <link href="https://desvl.xyz/2022/12/02/The-abc-Theorem-of-Polynomials/"/>
    <id>https://desvl.xyz/2022/12/02/The-abc-Theorem-of-Polynomials/</id>
    <published>2022-12-02T15:27:19.000Z</published>
    <updated>2025-05-10T22:19:10.584Z</updated>
    
    <content type="html"><![CDATA[<p>Let $K$ be an algebraically closed field of characteristic $0$. Instead of studying the polynomial ring $K[X]$ as a whole, we pay a little more attention to each polynomial. A reasonable thing to do is to count the number of distinct zeros. We define</p><script type="math/tex; mode=display">n_0(f)=\text{number of distinct roots of $f$.}</script><p>For example, if $f(X)=(X-1)^{100}$, we have $n_0(f)=1$. It seems we are diving into calculus but actually there is still a lot of algebra.</p><h1 id="The-abc-of-Polynomials"><a href="#The-abc-of-Polynomials" class="headerlink" title="The abc of Polynomials"></a>The abc of Polynomials</h1><blockquote><p><strong>Theorem 1 (Mason-Stothers).</strong> Let $a(X),b(X),c(X) \in K[X]$ be polynomials, not all constant, such that $(a,b,c)=1$ and $a+b=c$. Then</p><script type="math/tex; mode=display">\max\{\deg a,\deg b,\deg c\} \le n_0(abc)-1.</script></blockquote><p><em>Proof.</em> Putting $f=a/c$ and $g=b/c$, we have</p><script type="math/tex; mode=display">f+g=1.</script><p>This implies</p><script type="math/tex; mode=display">f'+g'=\frac{f'}{f}f+\frac{g'}{g}g=0 \implies \frac{g}{f}=\frac{b}{a}=-\frac{f'/f}{g'/g}.</script><hr><p>We interrupt the proof here for some good reasons. Rational functions of the form $f'/f$ remind us of the chain rule applied to $\log{x}$. In the context of calculus, we have $\left(\log{f(x)}\right)'=f'/f$. On the ring $K[x]$, we define $D:K[x] \to K[x]$ to be the formal derivative morphism. 
Then this endomorphism extends to $K(x)$ by</p><script type="math/tex; mode=display">D(f/g)=\frac{gDf-fDg}{g^2}.</script><p>On $K(x)^\ast$ (read: the multiplicative group of the rational function field $K(x)$), we define the logarithmic derivative</p><script type="math/tex; mode=display">L(f)=\frac{Df}{f}.</script><p>It follows that </p><script type="math/tex; mode=display">L(fg)=\frac{D(fg)}{fg}=\frac{fDg+gDf}{fg}=L(f)+L(g).</script><p>Also observe that, just as in calculus, if $f$ is a constant function, then $D(f)=0$. Now we write</p><script type="math/tex; mode=display">f(X)=c\prod(X-\alpha_i)^{m_i}.</script><p>Then it follows that</p><script type="math/tex; mode=display">\begin{aligned}f'/f=L(f)&=L(c)+\sum L\left((X-\alpha_i)^{m_i} \right) \\         &= \sum m_i L(X-\alpha_i) \\         &= \sum\frac{m_i}{X-\alpha_i}.\end{aligned}</script><p>Now we can go back to the proof.</p><hr><p><em>Proof (continued).</em> Since $K$ is algebraically closed,</p><script type="math/tex; mode=display">a(X)=c_1\prod(X-\alpha_i)^{m_i}, \quad b(X)=c_2\prod(X-\beta_j)^{n_j}, \quad c(X)=c_3\prod(X-\gamma_k)^{r_k}.</script><p>We see, for example</p><script type="math/tex; mode=display">f(X)=c_1 c_3^{-1}\prod(X-\alpha_i)^{m_i}\prod(X-\gamma_k)^{-r_k}.</script><p>Therefore</p><script type="math/tex; mode=display">f'/f=\sum\frac{m_i}{X-\alpha_i}-\sum\frac{r_k}{X-\gamma_k}.</script><p>Likewise</p><script type="math/tex; mode=display">g'/g=\sum\frac{n_j}{X-\beta_j}-\sum\frac{r_k}{X-\gamma_k}.</script><p>Combining both, we obtain</p><script type="math/tex; mode=display">\frac{b}{a}= -\frac{\sum\frac{m_i}{X-\alpha_i}-\sum\frac{r_k}{X-\gamma_k}}{\sum\frac{n_j}{X-\beta_j}-\sum\frac{r_k}{X-\gamma_k}}.</script><p>Next, we multiply $f'/f$ and $g'/g$ by</p><script type="math/tex; mode=display">N_0(X)=\prod(X-\alpha_i)\prod(X-\beta_j)\prod(X-\gamma_k),</script><p>which has degree $n_0(abc)$ (since $(a,b,c)=1$ and $a+b=c$, no two of these three polynomials share a root). 
Both $N_0f'/f$ and $N_0g'/g$ are polynomials of degrees at most $n_0(abc)-1$ (each term of the form $N_0/(X-\alpha_i)$ is a polynomial of degree $n_0(abc)-1$; moreover $f$ and $g$ are non-constant (why?), so $f'$ and $g'$ are nonzero; we assume $\operatorname{char} K=0$ for this reason). </p><p>Next we observe the degrees of $a,b$ and $c$. Since $a+b=c$, we actually have $\deg c \le \max\{\deg a,\deg b\}$. Therefore $\max\{\deg a,\deg b,\deg c\}=\max\{\deg a,\deg b\}$. From the relation</p><script type="math/tex; mode=display">\frac{b}{a}=-\frac{N_0f'/f}{N_0g'/g},</script><p>and the assumption that $(a,b)=1$, one can find a nonzero polynomial $h \in K[X]$ such that</p><script type="math/tex; mode=display">bh=-N_0f'/f,\quad ah = N_0g'/g.</script><p>Taking the degrees of both sides, we see</p><script type="math/tex; mode=display">\begin{aligned}\deg b \le \deg N_0f'/f \le n_0(abc)-1, \\\deg a \le \deg N_0g'/g \le n_0(abc)-1.\end{aligned}</script><p>This proves the theorem. $\square$</p><h1 id="Applications"><a href="#Applications" class="headerlink" title="Applications"></a>Applications</h1><p>We present some applications of this theorem.</p><blockquote><p><strong>Corollary 1 (Fermat’s theorem for polynomials).</strong> Let $a(X),b(X)$ and $c(X)$ be relatively prime polynomials in $K[X]$ such that not all of them are constant, and such that </p><script type="math/tex; mode=display">a(X)^n+b(X)^n=c(X)^n.</script><p>Then $n \le 2$.</p></blockquote><p>Alternatively, one can argue geometrically with the curve $x^n+y^n=1$ over $K(X)$.</p><p><em>Proof.</em> Since $a,b$ and $c$ are relatively prime, $a^n$, $b^n$ and $c^n$ are also relatively prime. 
By the Mason-Stothers theorem,</p><script type="math/tex; mode=display">\begin{aligned}\deg a^n = n\deg a &\le n_0(a^nb^nc^n)-1 \\                   &=   n_0(abc)-1 \\                   &\le \deg(abc)-1 \\                   &= \deg a + \deg b + \deg c - 1.\end{aligned}</script><p>Replacing $a$ by $b$ and $c$, we see</p><script type="math/tex; mode=display">\begin{cases}n\deg a \le \deg a + \deg b + \deg c - 1 \\n\deg b \le \deg a + \deg b + \deg c - 1 \\n\deg c \le \deg a + \deg b + \deg c - 1 \end{cases}</script><p>Adding the three inequalities, it follows that</p><script type="math/tex; mode=display">n(\deg a + \deg b + \deg c) \le 3(\deg a + \deg b + \deg c) - 3.</script><p>Since $\deg a+\deg b+\deg c&gt;0$, this forces $n&lt;3$. $\square$</p><blockquote><p><strong>Corollary 2 (Davenport’s inequality).</strong> Let $f,g \in K[X]$ be non-constant polynomials such that $f^3-g^2 \ne 0$. Then</p><script type="math/tex; mode=display">\deg (f^3-g^2) \ge \frac{1}{2}\deg f + 1.</script></blockquote><p>One may discuss the cases separately according to whether $f$ and $g$ are coprime, and try to apply the Mason-Stothers theorem in each case; many documents only record the proof of the coprime case, which is a shame. The case when $f$ and $g$ are not coprime can be a nightmare. Instead, for the sake of accessibility, we offer the elegant proof given by Stothers, starting with a lemma about the degree of the difference of two polynomials.</p><blockquote><p><strong>Lemma 1.</strong> Suppose $p,q \in K[X]$ are two distinct non-constant polynomials. Then</p><script type="math/tex; mode=display">\deg(p-q) \ge \deg p - n_0(p)-n_0(q)+1.</script></blockquote><p><em>Proof.</em> Let $k(f)$ be the leading coefficient of a polynomial $f$. If $\deg p \ne \deg q$ or $k(p) \ne k(q)$, then $\deg(p-q)\ge \deg p \ge \deg p - n_0(p)-n_0(q)+1$ because $n_0(p) \ge 1$ and $n_0(q) \ge 1$.</p><p>Next suppose $\deg p = \deg q$ and $k(p)=k(q)$. 
If $(p,q)=1$, then by Mason-Stothers,</p><script type="math/tex; mode=display">\begin{aligned}\deg p &\le n_0(pq(p-q))-1 \\       &=n_0(p)+n_0(q)+n_0(p-q)-1 \\       &\le n_0(p)+n_0(q)+\deg(p-q)-1.\end{aligned}</script><p>Otherwise, suppose $(p,q)=r$. Then $p/r$ and $q/r$ are coprime. Again by Mason-Stothers,</p><script type="math/tex; mode=display">\begin{aligned}\deg(p/r) &\le n_0\left\{\frac{p}{r}\cdot\frac{q}{r}\cdot\frac{p-q}{r}\right\}-1 \\          &\le n_0(p/r)+n_0(q/r)+\deg((p-q)/r)-1\end{aligned}</script><p>Therefore</p><script type="math/tex; mode=display">n_0(p/r)+n_0(q/r) \ge \deg(p/r)-\deg((p-q)/r)+1=\deg(p)-\deg(p-q)+1</script><p>On the other hand,</p><script type="math/tex; mode=display">n_0(p)+n_0(q) \ge n_0(p/r)+n_0(q/r)+n_0(r) \ge n_0(p/r)+n_0(q/r).</script><p>Combining all these inequalities, we obtain what we want. $\square$</p><hr><p><em>Proof (of corollary 2).</em> Put $\deg{f}=m$ and $\deg{g}=n$. If $3m \ne 2n$, then</p><script type="math/tex; mode=display">\deg(f^3-g^2) \ge \deg(f^3)=3m\ge \frac{1}{2}m+1</script><p>because $m \ge 1$. Next we assume that $3m=2n$, or in other words, $m=2r$ and $n=3r$ for some integer $r \ge 1$. By lemma 1, and since $n_0(f^3)=n_0(f) \le 2r$ and $n_0(g^2)=n_0(g) \le 3r$, we can write</p><script type="math/tex; mode=display">\begin{aligned}\deg(f^3-g^2) & \ge \deg(f^3)-n_0(f^3)-n_0(g^2)+1 \\              & \ge 6r - 2r - 3r + 1 \\              & = r+1 \\              & = \frac{1}{2}\deg f + 1.\end{aligned}</script><p>This proves the inequality. $\square$</p><p>One may also generalise the case to $f^m-g^n$. But we put down some more important remarks. First of all, Mason-Stothers is originally a generalisation of Davenport’s inequality (by Stothers). 
I personally do not think any mortal can find the original paper of Davenport’s inequality, but on [Shioda 04] there is a reproduced proof using linear algebra (lemma 3.1).</p><p>For more geometrical interpretation, one may be interested in [Zannier 95], where Riemann’s existence theorem is also discussed.</p><p>In Stothers’s paper [Stothers 81], the author discussed the condition where the equality holds. If you look carefully you will realise his theorem 1.1 is exactly the Mason-Stothers theorem.</p><h1 id="References-Further-Reading"><a href="#References-Further-Reading" class="headerlink" title="References / Further Reading"></a>References / Further Reading</h1><ul><li>[Davenport 65] H. Davenport, <em>On $f^3(t)-g^2(t)$</em>, 1965. (can someone find a digital copy of this paper?)</li><li>[Ma 84] R. C. Mason, <em>Diophantine Equations over Function Fields</em>, 1984.</li><li>[Shioda 04] Tetsuji Shioda, <em>The abc-theorem, Davenport’s inequality and elliptic surfaces</em>, 2004 (<a href="https://www2.rikkyo.ac.jp/web/shioda/papers/esdstadd.pdf">https://www2.rikkyo.ac.jp/web/shioda/papers/esdstadd.pdf</a>)</li><li>[Stothers 81] W. W. Stothers, <em>POLYNOMIAL IDENTITIES AND HAUPTMODULN</em>, 1981. (<a href="https://doi.org/10.1093/qmath/32.3.349">https://doi.org/10.1093/qmath/32.3.349</a>)</li><li>[Zannier 95] Umberto Zannier (Venezia), <em>On Davenport’s bound for the degree of $f^3-g^2$ and Riemann’s Existence Theorem</em>, 1995. (<a href="https://eudml.org/doc/206763">https://eudml.org/doc/206763</a>)</li></ul>]]></content>
    
    
    <summary type="html">In this post we show the Mason-Stothers theorem, the so-called $abc$ theorem for polynomials, and derive Fermat&#39;s Last theorem and Davenport&#39;s inequality for polynomials. These three theorems correspond to the $abc$ conjecture, Fermat&#39;s Last Theorem and Hall&#39;s conjecture in number theory.</summary>
    
    
    
    <category term="Algebra" scheme="https://desvl.xyz/categories/Algebra/"/>
    
    <category term="Polynomials" scheme="https://desvl.xyz/categories/Algebra/Polynomials/"/>
    
    
    <category term="Exercise solution" scheme="https://desvl.xyz/tags/Exercise-solution/"/>
    
    <category term="Study new content" scheme="https://desvl.xyz/tags/Study-new-content/"/>
    
    <category term="Serge Lang" scheme="https://desvl.xyz/tags/Serge-Lang/"/>
    
  </entry>
  
  <entry>
    <title>A Step-by-step of the Analytic Continuation of the Riemann Zeta Function</title>
    <link href="https://desvl.xyz/2022/11/24/riemann-zeta-continuation/"/>
    <id>https://desvl.xyz/2022/11/24/riemann-zeta-continuation/</id>
    <published>2022-11-24T14:56:23.000Z</published>
    <updated>2025-05-11T12:09:59.528Z</updated>
    
    <content type="html"><![CDATA[<h2 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h2><p>The <strong>Riemann zeta function</strong> is widely known (to be the analytic continuation of Euler’s zeta function):</p><script type="math/tex; mode=display">\zeta(s)=\sum_{n=1}^{\infty}\frac{1}{n^s}.</script><p>It is widely known mainly because of the celebrated hypothesis by Riemann that remains unsolved after more than a century’s attempts by mathematicians and 150 million attempts by computers:</p><blockquote><p><strong>Riemann Hypothesis:</strong> The non-trivial zeros of $\zeta(s)$ lie on the line $\Re(s)=\frac{1}{2}$.</p></blockquote><p>The audience are told by pop-science how important and mysterious this hypothesis is. Or how disastrous if this would be solved one day. We can put them aside. A question is, why would Riemann ever think about the zero set of <em>such</em> a function? Why not something else? According to Riemann, the distribution function of primes</p><script type="math/tex; mode=display">\pi(x)=\sum_{p\text{ prime}}^{p \le x}1</script><p>may be written as the series</p><script type="math/tex; mode=display">\pi(x)=R(x)-\sum_{\rho}R(x^\rho)</script><p>where</p><script type="math/tex; mode=display">R(x)=1+\sum_{n=1}^{\infty}\frac{1}{n\zeta(n+1)}\frac{(\log{x})^n}{n!}</script><p>and $\rho$ varies over all zeros of $\zeta(s)$. With these being said, once this <em>hypothesis</em> is proven true, we may have a much more concrete say of the distribution of prime numbers.</p><p>But this is not the topic of this post actually. The author of this post is not trying to prove the Riemann Hypothesis in a few pages, and nobody could. In this post, we investigate the analytic continuation of $\zeta(s)$ step-by-step, so that it will make sense to even think about evaluating the value at $\frac{1}{2}$. For the theory of analytic continuation, I recommend <em>Real and Complex Analysis</em> by Walter Rudin. 
Note, however, that his book goes into modular functions and Picard’s little theorem rather than the $\zeta$ function and related topics.</p><p>We will transfer the problem of the $\zeta$ function into the $\Gamma$ function and the $\theta$ function, and uncover what we want through the Mellin transform, in the sense that we will observe the so-called complete zeta function, which will bring us to THE continuation we are looking for.</p><p>We will spend more detail on the non-trivial steps than on basic complex analysis. The reader may skip our preparation if they are familiar with these contents.</p><h2 id="Gamma-Function"><a href="#Gamma-Function" class="headerlink" title="Gamma Function"></a>Gamma Function</h2><p>The Gamma function should be studied in an analysis course:</p><script type="math/tex; mode=display">\Gamma(s)=\int_0^\infty e^{-t}t^{s-1}dt,\quad s>0.</script><p>In an analysis course we have studied some of this function’s important properties:</p><ul><li><p>$\Gamma(1)=1$.</p></li><li><p>$\Gamma(s+1)=s\Gamma(s)$ (as a result $n!=\Gamma(n+1)$).</p></li><li><p>$\log\Gamma(s)$ is a convex function.</p></li></ul><p>In this section however, we will study it in the context of complex analysis.</p><blockquote><p><strong>Theorem 1.</strong> The Gamma function</p><script type="math/tex; mode=display"> \Gamma(s)=\int_0^\infty e^{-t}t^{s-1}dt</script><p> is well-defined as an analytic function in the half plane $\Re(s)&gt;0$. </p></blockquote><p><em>Proof.</em> If we write $s=u+iv$ with $u&gt;0$ and $t=e^c$, then</p><script type="math/tex; mode=display">\begin{aligned}|e^{-t}t^{s-1}|&=|e^{-t}t^{u-1}||t^{iv}| \\               &=|e^{-t}t^{u-1}||e^{icv}| \\               &=e^{-t}t^{u-1}.\end{aligned}</script><p>Therefore</p><script type="math/tex; mode=display">\begin{aligned}\int_{0}^{\infty}|e^{-t}t^{s-1}|dt &= \int_0^\infty e^{-t}t^{u-1}dt \\&=\Gamma(u) \\&<\infty.\end{aligned}</script><p>The desired properties then follow. 
$\square$</p><blockquote><p><strong>Theorem 2.</strong> If $\Re(s)&gt;0$, then </p><script type="math/tex; mode=display">\Gamma(s+1)=s\Gamma(s),</script><p>and as a consequence $\Gamma(n+1)=n!$ for $n=0,1,\dots$.</p></blockquote><p><em>Proof.</em> The second statement follows immediately because $\Gamma(1)=1$. For the first equation, we do an integration by parts:</p><script type="math/tex; mode=display">\int_{\varepsilon}^{1/\varepsilon}\frac{d}{dt}(e^{-t} t^s)dt=-\int_{\varepsilon}^{1/\varepsilon}e^{-t}t^sdt+s\int_{\varepsilon}^{1/\varepsilon}e^{-t}t^{s-1}dt.</script><p>Taking $\varepsilon \to 0$, the left hand side tends to $0$ (because $e^{-t}t^s$ vanishes at both ends), and we get what we want. $\square$</p><p>Now we are ready for the analytic continuation of the Gamma function, which builds a bridge to the analytic continuation of $\zeta$.</p><blockquote><p><strong>Theorem 3.</strong> The function $\Gamma(s)$ defined in theorem 1 admits an analytic continuation to a meromorphic function on the complex plane whose singularities are simple poles at $s=-n$ for $n=0,1,\dots$, with residue $\frac{(-1)^n}{n!}$ at $s=-n$.</p></blockquote><p><em>Proof.</em> It suffices to show that we can continue $\Gamma$ to $\Re(s)&gt;-m$, for all $m&gt;0$, which implies that we can extend it to the whole complex plane. For this reason, we put $\Gamma_0(s)=\Gamma(s)$, which is defined in theorem 1. Then</p><script type="math/tex; mode=display">\Gamma_1(s)=\frac{\Gamma_0(s+1)}{s}</script><p>is THE analytic continuation of $\Gamma_0(s)$ to $\Re(s)&gt;-1$, with the only singularity $s=0$. 
Then</p><script type="math/tex; mode=display">\operatorname{Res}_{s=0}\Gamma_1(s)=\lim_{s \to 0}s\Gamma_1(s)=\Gamma_0(1)=1.</script><p>Likewise, we can define</p><script type="math/tex; mode=display">\Gamma_2(s)=\frac{\Gamma_1(s+1)}{s}=\frac{\Gamma_0(s+2)}{s(s+1)}.</script><p>Overall, whenever $m \ge 1$ is an integer, we can define</p><script type="math/tex; mode=display">\Gamma_m(s)=\frac{\Gamma_0(s+m)}{\prod_{j=0}^{m-1}(s+j)}.</script><p>This function is meromorphic in $\Re(s)&gt;-m$ and has simple poles at $s=0,-1,\dots,-m+1$ with residues</p><script type="math/tex; mode=display">\operatorname{Res}_{s=-n}\Gamma_m(s)=\frac{\Gamma(-n+m)}{    (m-1-n)!(-1)(-2)\dots(-n)}=\frac{(-1)^n}{n!}.</script><p>Successive applications of theorem 2 show that $\Gamma_m(s)=\Gamma(s)$ for $\Re(s)&gt;0$. Therefore we have obtained the analytic continuation through this process. $\square$</p><p>Throughout, unless specified, we will call the function obtained in the proof of theorem 3 THE function $\Gamma$. 
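</p><p>As a sanity check of the residue formula, here is the case $m=3$, $n=2$ worked out by hand (a small illustration, using only the recursion $\Gamma_{m}(s)=\Gamma_{m-1}(s+1)/s$ from the proof):</p><script type="math/tex; mode=display">\Gamma_3(s)=\frac{\Gamma_0(s+3)}{s(s+1)(s+2)}, \qquad \operatorname{Res}_{s=-2}\Gamma_3(s)=\lim_{s \to -2}(s+2)\Gamma_3(s)=\frac{\Gamma_0(1)}{(-2)(-1)}=\frac{1}{2}=\frac{(-1)^2}{2!}.</script><p>This agrees with the value $\frac{(-1)^n}{n!}$ claimed in theorem 3.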
</p><p>For all $s \in \mathbb{C}$, this function satisfies $\Gamma(s+1)=s\Gamma(s)$ as it should be.</p><p>Before we proceed, we develop two relationships between the $\Gamma$ function and the $\zeta$ function, to convince the reader that we are not doing nothing.</p><p>If we perform a change of variable $t=nu$ in the definition of $\Gamma(s)$, we see</p><script type="math/tex; mode=display">\int_{0}^\infty e^{-nu}n^{s}u^{s-1}du=\Gamma(s).</script><p>This is to say,</p><script type="math/tex; mode=display">\begin{aligned}\frac{1}{n^s}\Gamma(s)&=\int_0^\infty e^{-nu}u^{s-1}du \\\end{aligned}</script><p>Taking the sum of all $n$, we see</p><script type="math/tex; mode=display">\begin{aligned}\Gamma(s)\sum_{n=1}^{\infty}\frac{1}{n^s}&=\Gamma(s)\zeta(s) \\&=    \sum_{n=1}^{\infty}\int_0^\infty e^{-nu}u^{s-1}du \\&=\int_0^\infty \sum_{n=1}^{\infty}e^{-nu}u^{s-1}du \\&=\int_0^{\infty}\frac{e^{-u}u^{s-1}}{1-e^{-u}}du \\&=\int_0^{\infty}\frac{u^{s-1}}{e^u-1}du.\end{aligned}</script><p>This relationship is beautiful, but may make our computation a little bit more complicated. However, if we get our hand dirty earlier, our study will be easier. 
Thus we will do an “uglier” change of variable $t \mapsto \pi n^2y$ to obtain</p><script type="math/tex; mode=display">\pi^{-s}\Gamma(s)\frac{1}{n^{2s}}=\int_0^\infty     e^{-\pi n^{2}y }y^{s-1}dy</script><p>which implies</p><script type="math/tex; mode=display">\pi^{-s}\Gamma(s)\zeta(2s)=\int_0^\infty \sum_{n=1}^{\infty}e^{-\pi n^2y}y^{s-1}dy.</script><p>In either case, it is legal to change the order of summation and integration, because of the <a href="https://en.wikipedia.org/wiki/Monotone_convergence_theorem">monotone convergence theorem</a>.</p><p>Before we proceed, we need some more properties of the Gamma function.</p><blockquote><p><strong>Theorem 4 (Euler’s reflection formula).</strong> For all $s \in \mathbb{C}$, </p><script type="math/tex; mode=display">\Gamma(s)\Gamma(1-s)=\frac{\pi}{\sin\pi s}.</script></blockquote><p>Observe that this identity makes sense at all poles: $\Gamma(s)$ has simple poles at $0,-1,\dots$, while $\Gamma(1-s)$ has simple poles at $1,2,\dots$. As a result, $\Gamma(s)\Gamma(1-s)$ has simple poles at all integers, a property also shared by $\pi/\sin\pi{s}$.</p><p>It suffices to prove the identity for real $0&lt;s&lt;1$, because the result then extends to all of $\mathbb{C}$ by analytic continuation.</p><p><em>Proof (real version).</em> We expand the left hand side first:</p><script type="math/tex; mode=display">\begin{aligned}\Gamma(x)\Gamma(1-x)&=B(x,1-x) \\                    &=\int_0^1 t^{x-1}(1-t)^{-x}dt \\                    &=\int_0^\infty \frac{1}{y^x(1+y)}dy\end{aligned}</script><p>by taking $t=\frac{1}{1+y}$. 
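</p><p>The substitution can be spelled out as follows (a routine check, written for $0&lt;x&lt;1$ and starting from the Beta integral $B(x,1-x)=\int_0^1 t^{x-1}(1-t)^{-x}dt$). With $t=\frac{1}{1+y}$ we have $1-t=\frac{y}{1+y}$ and $dt=-\frac{dy}{(1+y)^2}$, and $t$ running from $0$ to $1$ corresponds to $y$ running from $\infty$ to $0$, so</p><script type="math/tex; mode=display">\int_0^1 t^{x-1}(1-t)^{-x}dt=\int_0^\infty (1+y)^{1-x}\cdot y^{-x}(1+y)^{x}\cdot\frac{dy}{(1+y)^2}=\int_0^\infty\frac{1}{y^x(1+y)}dy.</script><p>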
Next we compute this integral over $(0,1]$ and $[1,\infty)$ separately.</p><script type="math/tex; mode=display">\begin{aligned}\int_0^1\frac{1}{y^x(1+y)}dy &= \int_0^1\frac{1}{y^x}    \sum_{n=0}^{\infty}(-y)^ndy \\                &= \sum_{n=0}^{\infty}(-1)^n\int_0^1 y^{n-x}dy \\                &= \sum_{n=0}^{\infty}\frac{(-1)^{n}}{n+1-x} \\                &= \sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n-x}.\end{aligned}</script><p>(The exchange of integration and infinite sum is justified by <a href="https://sites.math.washington.edu/~morrow/335_16/dominated.pdf">Arzelà’s dominated convergence theorem of Riemann integrals</a>.)</p><p>On the other hand, taking $y=\frac{1}{u}$, we see</p><script type="math/tex; mode=display">\begin{aligned}\int_1^\infty\frac{1}{y^x(1+y)}dy &= \int_0^1\frac{u^{x-1}}{1+u}du \\&=\frac{1}{x}+\sum_{n=1}^{\infty}\frac{(-1)^n}{n+x}\end{aligned}</script><p>Summing up, one has</p><script type="math/tex; mode=display">\Gamma(x)\Gamma(1-x)=\frac{1}{x}+\sum_{n=1}^{\infty}    (-1)^n\frac{2x}{x^2-n^2}.</script><p>It remains to show that $\pi\csc{\pi{x}}$ satisfies such an expansion as well, which is not straightforward because neither the Fourier series nor the Taylor series of the cosecant takes us there directly. One can start with the infinite product expansion of $\sin{x}$ but here we follow an alternative approach. Notice that for $\alpha \in \mathbb{R} \setminus \mathbb{Z}$ and $t \in [-\pi,\pi]$, the Fourier series of $\cos\alpha t$ reads</p><script type="math/tex; mode=display">\cos\alpha{t}=\frac{\sin\pi \alpha}{\pi \alpha}+\sum_{n=1}^{\infty}(-1)^n\frac{2\alpha}{\pi(\alpha^2-n^2)}\sin\alpha\pi\cos{nt}.</script><p>Taking $t=0$ and multiplying both sides by $\pi\csc\pi\alpha$, we obtain what we want. 
$\square$</p><p><em>Proof (complex version).</em> By definition,</p><script type="math/tex; mode=display">\begin{aligned}\Gamma(1-s)\Gamma(s) &= \int_0^\infty e^{-t}t^{s-1}\Gamma(1-s)dt \\      &= \int_0^\infty e^{-t}t^{s-1}\left(\int_0^\infty        e^{-v}v^{-s}dv\right)dt \\      &= \int_0^\infty e^{-t}t^{s-1}t         \left(         \int_0^\infty e^{-ut}(ut)^{-s}du         \right)dt \\      &= \int_0^\infty du \int_0^\infty e^{-t(u+1)}u^{-s}dt \\      &= \int_0^\infty \frac{u^{-s}}{1+u}du\end{aligned}</script><p>Here we performed the change of variable $v=tu$. To compute the last integral, we put $u=e^x$, and it follows that</p><script type="math/tex; mode=display">\begin{aligned}\int_0^\infty \frac{u^{-s}}{1+u}du &= \int_{-\infty}^\infty \frac{e^{(1-s)x}}{1+e^x}dx \\\end{aligned}</script><p>The integral on the right hand side can be computed to be $\frac{\pi}{\sin(1-s)\pi}=\frac{\pi}{\sin\pi s}$. This is an easy consequence of the residue formula (by considering a rectangle with centre $z=\pi i$, height $2\pi$ and one side being the real axis). $\square$</p><p>Here is a bonus of Euler’s reflection formula. 
Putting $t=u^2$ in the definition of the $\Gamma$ function, we also see</p><script type="math/tex; mode=display">\begin{aligned}\Gamma(s)&=\int_0^\infty e^{-u^2}u^{2s-2}2udu \\        &=2\int_0^\infty e^{-u^2}u^{2s-1}du.\end{aligned}</script><p>Therefore</p><script type="math/tex; mode=display">\Gamma(1/2)=2\int_0^\infty e^{-u^2}du=\int_{-\infty}^{\infty}e^{-u^2}du=\sqrt{\pi}.</script><p>To conclude this section, we also mention</p><blockquote><p><strong>Theorem 5 (Legendre duplication formula).</strong></p><script type="math/tex; mode=display">\Gamma(s)\Gamma(s+1/2)=\frac{2\sqrt\pi}{2^{2s}}\Gamma(2s).</script></blockquote><p>One can find a proof <a href="https://proofwiki.org/wiki/Legendre&#39;s_Duplication_Formula">here</a>.</p><h2 id="Mellin-and-Fourier-Transform-of-the-Jacobi-Theta-Function"><a href="#Mellin-and-Fourier-Transform-of-the-Jacobi-Theta-Function" class="headerlink" title="Mellin and Fourier Transform of the Jacobi Theta Function"></a>Mellin and Fourier Transform of the Jacobi Theta Function</h2><h3 id="Behaviour-of-the-Jacobi-Theta-Function"><a href="#Behaviour-of-the-Jacobi-Theta-Function" class="headerlink" title="Behaviour of the Jacobi Theta Function"></a>Behaviour of the Jacobi Theta Function</h3><p>Put $Z(s)=\pi^{-s/2}\Gamma(s/2)\zeta(s)$. It looks like we are pretty close to a great property of $\zeta(s)$, if we can understand $Z$ a little bit more, because $\pi^{-s/2}$ and $\Gamma(s/2)$ behave nicely. Therefore we introduce the Jacobi theta function</p><script type="math/tex; mode=display">\theta(s)=\sum_{n \in \mathbb{Z}}e^{-\pi n^2 s}, \quad \Re(s)>0</script><p>and try to deduce its relation with $Z(s)$. </p><p>To begin with, we first show that</p><blockquote><p><strong>Proposition 1.</strong> The theta function is holomorphic on the right half plane.</p></blockquote><p><em>Proof.</em> Let $C$ be a compact subset of the right half plane, and put $y_0=\inf_{s \in C}\Re(s)$, which is positive because $C$ is compact. Pick any integer $n_0\ge \frac{1}{y_0}$. 
For $s=u+iv \in C$, we have $u \ge y_0$ and therefore</p><script type="math/tex; mode=display">\begin{aligned}\sum_{|n|\ge n_0}|e^{-\pi n^2 s}| &= \sum_{|n| \ge n_0}e^{-\pi n^2 u} \\    &\le \sum_{|n| \ge n_0}e^{-\pi n^2 y_0} \\    &\le \sum_{|n| \ge n_0}e^{-\pi |n|}\end{aligned}</script><p>Therefore the series defining $\theta(s)$ converges absolutely and uniformly on any compact subset of the right half plane. (Note we have used the fact that $n^2y_0 \ge |n|n_0y_0 \ge |n|$ for $|n| \ge n_0$.) Since each term is holomorphic and the convergence is uniform on compact subsets, $\theta(s)$ itself is holomorphic. $\square$</p><p>Therefore it is safe to work with the theta function. Now we are ready to deduce a functional equation.</p><blockquote><p><strong>Theorem 6.</strong> The theta function satisfies the functional equation on $\{\Re(s)&gt;0\}$:</p><script type="math/tex; mode=display">\theta(s)=\frac{1}{\sqrt{s}}\theta\left(\frac{1}{s}\right)</script></blockquote><p>The square root is chosen to be in the branch with positive real part.</p><p><em>Proof.</em> Consider the function $f(x)=e^{-\pi x^2}$. We know that this is a fixed point of the Fourier transform (in this convenient form)</p><script type="math/tex; mode=display">\hat{g}(t)=\int_{-\infty}^{\infty}g(x)e^{-2\pi ixt}dx.</script><p>Now, for a real number $u&gt;0$, we put $g(x)=e^{-\pi u x^2}=f(\sqrt{u}x)$. The Fourier transform of $g$ is easy to deduce:</p><script type="math/tex; mode=display">\hat{g}(t)=\frac{1}{\sqrt{u}}\hat{f}\left(\frac{t}{\sqrt{u}}\right)= \frac{1}{\sqrt{u}}e^{-\pi t^2 / u}.</script><p>Since $g(x)$ is a <a href="https://en.wikipedia.org/wiki/Schwartz_space">Schwartz function</a>, by the <a href="https://proofwiki.org/wiki/Poisson_Summation_Formula">Poisson summation formula</a>, we have</p><script type="math/tex; mode=display">\sum_{n \in \mathbb{Z}}g(n)=\theta(u)=\sum_{n \in \mathbb Z}\hat{g}(n)=\frac{1}{\sqrt{u}}\theta\left(\frac{1}{u}\right).</script><p>Both sides are holomorphic on the right half plane and agree on the positive real axis, so the result follows by analytic continuation. 
$\square$</p><p>For Schwartz functions, also known as rapidly decreasing functions, we refer the reader to chapter 7 of W. Rudin’s <em>Functional Analysis</em>. </p><p>Next we will study the behaviour of $\theta(s)$ on the positive real half-line, especially near the origin and at infinity. By the functional equation above, once we have a better view around the origin, we can quickly know what will happen at infinity.</p><blockquote><p><strong>Proposition 2.</strong> When the real number $t \to 0^+$, the theta function is asymptotic to $\frac{1}{\sqrt{t}}$. More precisely, when $t&gt;0$ is small enough, the following inequality holds:</p><script type="math/tex; mode=display">\left|\theta(t)-\frac{1}{\sqrt{t}}\right|<e^{-(\pi-1)/t}.</script></blockquote><p><em>Proof.</em> Rewrite $\theta(t)$ in the form</p><script type="math/tex; mode=display">\theta(t)=1+2\sum_{n=1}^{\infty}e^{-\pi n^2 t}.</script><p>Therefore, by the functional equation,</p><script type="math/tex; mode=display">\begin{aligned}\left|\theta(t)-\frac{1}{\sqrt{t}}\right| &= \left|    \frac{1}{\sqrt{t}}\left(\theta\left(\frac{1}{t}\right)-1\right) \right| \\&= \frac{2}{\sqrt{t}}\sum_{n=1}^{\infty}e^{-\pi n^2/t}\end{aligned}</script><p>Pick $t&gt;0$ small enough so that</p><script type="math/tex; mode=display">e^{-1/t}<\frac{\sqrt{t}}{4}, \quad e^{-2\pi/t}<\frac{1}{2}.</script><p>It follows that</p><script type="math/tex; mode=display">\begin{aligned}\left|\theta(t)-\frac{1}{\sqrt{t}}\right| &= \frac{2}{\sqrt{t}}\sum_{n=1}^{\infty}e^{-\pi n^2/t} \\    &< \frac{1}{2}e^{1/t}\sum_{n=1}^{\infty}e^{-\pi/t}        e^{-\pi(n^2-1)/t} \\    &=\frac{1}{2}e^{-(\pi-1)/t}\sum_{n=1}^{\infty}e^{-\pi(n+1)(n-1)/t}\\    &\le\frac{1}{2}e^{-(\pi-1)/t}\sum_{n=1}^{\infty}        e^{-2\pi(n-1)/t} \\    &\le\frac{1}{2}e^{-(\pi-1)/t}\sum_{n=1}^{\infty}        2^{-(n-1)} \\    &=e^{-(\pi-1)/t}.\end{aligned}</script><p>$\square$</p><p>As a result, we also know how $\theta(t)$ behaves at infinity. 
To be precise, we have the following corollary.</p><blockquote><p><strong>Corollary 1.</strong> The limit of $\theta(t)$ at infinity is $1$ in the following sense: when $t$ is big enough,</p><script type="math/tex; mode=display">|\theta(t)-1| < e^{-(\pi-1)t}/\sqrt{t}.</script></blockquote><p><em>Proof.</em> Put $u=\frac{1}{t}$. When $u$ is small enough, we have $|\theta(u)-\frac{1}{\sqrt{u}}|&lt;e^{-(\pi-1)/u}$. As a result,</p><script type="math/tex; mode=display">\begin{aligned}|\theta(t)-1|&=\left|\frac{1}{\sqrt{t}}\theta\left(\frac{1}{t}\right)-1\right| \\             &=\left|\sqrt{u}\theta\left(u\right)-1\right| \\             &=\sqrt{u}\left|\theta(u)-\frac{1}{\sqrt{u}}\right| \\             &<\sqrt{u}e^{-(\pi-1)/u} \\             &=\frac{e^{-(\pi-1)t}}{\sqrt{t}}\end{aligned}</script><p>as expected. $\square$</p><h3 id="The-Mellin-Transform-of-the-Theta-Function-and-the-Zeta-Function"><a href="#The-Mellin-Transform-of-the-Theta-Function-and-the-Zeta-Function" class="headerlink" title="The Mellin Transform of the Theta Function and the Zeta Function"></a>The Mellin Transform of the Theta Function and the Zeta Function</h3><p>To begin with, we introduce the Mellin transform. In a manner of speaking, this transform can be understood as the multiplicative version of the two-sided Laplace transform.</p><blockquote><p><strong>Definition.</strong> Given a function $f:\mathbb{R}_+ \to \mathbb{C}$, the Mellin transform of $f$ is defined to be</p><script type="math/tex; mode=display">\mathcal{M}_f(s)=\int_0^\infty f(x)x^{s-1}dx,</script><p>provided that the integral converges.</p></blockquote><p>For example, $\Gamma(s)$ is the Mellin transform of $e^{-x}$. 
Moreover, for the two-sided Laplace transform</p><script type="math/tex; mode=display">\mathcal{B}_f(s)=\int_{-\infty}^{+\infty}e^{-sx}f(x)dx,</script><p>we actually have</p><script type="math/tex; mode=display">\mathcal{M}_f(s)=\mathcal{B}_{\tilde{f}}(s),</script><p>where $\tilde{f}(x)=f(e^{-x})$.</p><p>Our goal is to recover $Z(s)$ through the Mellin transform of $\theta(x)$. As we have proved earlier,</p><script type="math/tex; mode=display">\pi^{-s}\Gamma(s)\zeta(2s)=\int_0^\infty \sum_{n=1}^{\infty}e^{-\pi n^2x}x^{s-1}dx.</script><p>It seems we can get our result really quickly by studying $\frac{1}{2}(\theta(x)-1)$. However, $\theta(x)$ approaches $\frac{1}{\sqrt{x}}$ rapidly as $x \to 0$, and approaches $1$ rapidly as $x \to \infty$, so convergence has to be taken care of. Therefore we add error correction terms. For this reason, we study the function</p><script type="math/tex; mode=display">\phi(s)=\int_0^1\left(\theta(x)-\frac{1}{\sqrt{x}}\right)x^{s/2-1}dx + \int_1^\infty (\theta(x)-1)x^{s/2-1}dx.</script><p>We use $s/2$ in place of $s$ because we do not want $\zeta$ to be evaluated at $2s$ all the time.</p><p>The partition $(0,1) \cup (1,\infty)$ immediately inspires one to use the change of variable $y=\frac{1}{x}$ together with the functional equation $\theta\left(\frac{1}{y}\right)=\sqrt{y}\,\theta(y)$. As a result,</p><script type="math/tex; mode=display">\begin{aligned}\phi(s)&=\int_0^1\left(\theta(x)-\frac{1}{\sqrt{x}}\right)x^{s/2-1}dx + \int_1^\infty (\theta(x)-1)x^{s/2-1}dx \\&= -\int_1^\infty\left(    \theta\left(\frac{1}{y}\right)-\sqrt{y}\right)y^{1-s/2}(-y^{-2})dy -\int_0^1\left(    \theta\left(\frac{1}{y}\right)-1\right)y^{1-s/2}(-y^{-2})dy \\&=\int_1^\infty (\theta(y)-1)y^{(1-s)/2-1}dy + \int_0^1\left(\theta(y)-\frac{1}{\sqrt{y}}\right)y^{(1-s)/2-1}dy \\&= \phi(1-s).\end{aligned}</script><p>Now we are ready to compute $\phi(s)$. 
For the first part, assuming $\Re(s)&gt;1$,</p><script type="math/tex; mode=display">\begin{aligned}\int_0^1\left(\theta(x)-\frac{1}{\sqrt{x}}\right)x^{s/2-1}dx &= \int_0^1\theta(x)x^{s/2-1}dx -\frac{2}{s-1} \\&=\int_0^1 \sum_{n=-\infty}^{+\infty}e^{-\pi n^2 x}x^{s/2-1}dx -\frac{2}{s-1} \\&= \int_0^1 x^{s/2-1}dx + 2\sum_{n=1}^{\infty}\int_0^1e^{-\pi n^2 x}x^{s/2-1}dx - \frac{2}{s-1} \\&= 2\sum_{n=1}^{\infty}\int_0^1e^{-\pi n^2 x}x^{s/2-1}dx        +\frac{2}{s}-\frac{2}{s-1}.\end{aligned}</script><p>On the other hand,</p><script type="math/tex; mode=display">\int_1^\infty (\theta(x)-1)x^{s/2-1}dx =     2\sum_{n=1}^{\infty}\int_1^\infty e^{-\pi n^2 x}x^{s/2-1}dx.</script><p>Therefore</p><script type="math/tex; mode=display">\begin{aligned}\phi(s)&=2\sum_{n=1}^{\infty}\int_0^\infty e^{-\pi n^2 x}        x^{s/2-1}dx + \frac{2}{s}-\frac{2}{s-1} \\       &=2\left(Z(s)+\frac{1}{s}-\frac{1}{s-1}\right).\end{aligned}</script><p>Therefore</p><script type="math/tex; mode=display">Z(s)=\pi^{-s/2}\Gamma(s/2)\zeta(s)=\frac{1}{2}\phi(s)-\frac{1}{s}+\frac{1}{s-1}.</script><p>In particular,</p><script type="math/tex; mode=display">\begin{aligned}Z(1-s)&=\frac{1}{2}\phi(1-s)-\frac{1}{1-s}+\frac{1}{1-s-1} \\          &=\frac{1}{2}\phi(s)-\frac{1}{s}+\frac{1}{s-1} \\          &=Z(s)\end{aligned}</script><p>Expanding the equation above, we see</p><script type="math/tex; mode=display">\pi^{-(1-s)/2}\Gamma((1-s)/2)\zeta(1-s) = \pi^{-s/2}\Gamma(s/2)\zeta(s).</script><p>This gives</p><script type="math/tex; mode=display">\zeta(1-s)=\pi^{\frac{1}{2}-s}\frac{\Gamma\left(\frac{s}{2}\right)}{\Gamma\left(\frac{1-s}{2}\right)}\zeta(s).</script><p>Finally we try to simplify the quotient above. 
By Legendre’s duplication formula,</p><script type="math/tex; mode=display">\Gamma(s)=\frac{2^s}{2\sqrt{\pi}}\Gamma\left(\frac{s}{2}\right)\Gamma\left(\frac{s+1}{2}\right).</script><p>By Euler’s reflection formula,</p><script type="math/tex; mode=display">\Gamma\left(\frac{1-s}{2}\right)\Gamma\left(\frac{1+s}{2}\right)=\frac{\pi}{\sin\pi\left(\frac{1-s}{2}\right)}=\frac{\pi}{\cos\frac{\pi s}{2}}.</script><p>Inserting these two equations into the right hand side of the formula for $\zeta(1-s)$, we obtain</p><blockquote><p><strong>Proposition 3.</strong> The Riemann Zeta function $\zeta(s)$ admits an analytic continuation satisfying the functional equation</p><script type="math/tex; mode=display">\zeta(1-s)=2(2\pi)^{-s}\Gamma(s)\cos\frac{\pi s}{2}\zeta(s).</script></blockquote><p>In particular, since we also have </p><script type="math/tex; mode=display">\zeta(s)=\frac{\pi^{s/2}}{\Gamma(s/2)}\left(\frac{1}{2}\phi(s)-\frac{1}{s}+\frac{1}{s-1}\right),</script><p>it is immediate that $\zeta(s)$ has a simple pole at $s=1$ with residue $\pi^{1/2}/\Gamma(1/2)=1$. Another concern is $s=0$. Nevertheless, since we have</p><script type="math/tex; mode=display">\begin{aligned}\zeta(s)&= \frac{\pi^{s/2}}{2\Gamma(s/2)}\phi(s)-            \frac{\pi^{s/2}}{s\Gamma(s/2)}+\frac{\pi^{s/2}}{(s-1)\Gamma(s/2)} \\&=\frac{\pi^{s/2}}{2\Gamma(s/2)}\phi(s)-            \frac{\pi^{s/2}}{2\Gamma(s/2+1)}+\frac{\pi^{s/2}}{(s-1)\Gamma(s/2)}\end{aligned}</script><p>there is no pole at $s=0$ (notice that $\phi(s)$ is entire and $1/\Gamma(s/2)$ vanishes at $s=0$). 
We now know a little bit more about the analyticity of $\zeta(s)$.</p><blockquote><p><strong>Corollary 2.</strong> The Riemann zeta function $\zeta(s)$ has its analytic continuation defined on $\mathbb{C} \setminus \{1\}$, with a simple pole at $s=1$ with residue $1$.</p></blockquote><h2 id="What-is-1-2-…"><a href="#What-is-1-2-…" class="headerlink" title="What is 1+2+…?"></a>What is 1+2+…?</h2><p>Now we are safe to compute $\zeta(-1)$. Putting $s=2$ in Proposition 3 and using $\Gamma(2)=1$ and $\zeta(2)=\frac{\pi^2}{6}$, we get</p><script type="math/tex; mode=display">\zeta(-1)=2(2\pi)^{-2}\Gamma(2)\cos(\pi)\zeta(2) =\frac{2}{4\pi^2}\cdot 1 \cdot \cos(\pi)\cdot \frac{\pi^2}{6} = - \frac{1}{12}.</script><p>But I believe, after these long computations of the analytic continuation, we can be confident enough to say that, when $\Re(s) \le 1$, the Riemann zeta function $\zeta(s)$ absolutely cannot be immediately explained by its ordinary definition $\sum_{n=1}^{\infty}n^{-s}$. Claiming $1+2+\dots=-\frac{1}{12}$ is a ridiculous and unacceptable abuse of language.</p><p>We conclude this post by citing a criticism by <a href="https://skullsinthestars.com/2014/01/18/infinite-series-not-quite-as-weird-as-some-would-say/">Greg Gbur</a> on the infamous <a href="https://youtu.be/w-I6XTVZXww">Numberphile video</a>.</p><blockquote><p>So why is this important?  Part of what I’ve tried to show on this blog is that mathematics and physics can be extremely non-intuitive, even bizarre, but that they have their own rules and logic that make perfect sense once you get familiar with them.  The original video, in my opinion, acts more like a magic trick than an explanation: it shows a peculiar, non-intuitive result and tries to pass it off as absolute truth without qualification.  
Making science and math look like incomprehensible magic does not do any favors for the scientists who study it nor for the public who would like to understand it.</p></blockquote><h2 id="References-Further-Reading"><a href="#References-Further-Reading" class="headerlink" title="References / Further Reading"></a>References / Further Reading</h2><ul><li>Serge Lang, <em>Complex Analysis.</em></li><li>Elias M. Stein &amp; Rami Shakarchi, <em>Complex Analysis.</em></li><li>Jürgen Neukirch, <em>Algebraic Number Theory.</em></li><li>Jakob Glas &amp; Kevin Yeh, <em><a href="https://metaphor.ethz.ch/x/2019/fs/401-4110-19L/sc/talk4.pdf">The Classical Theta Function and the Riemann Zeta Function.</a></em></li></ul>]]></content>
    
    
    <summary type="html">We compute the analytic continuation of the Riemann Zeta function and after that the reader will realise that asserting $1+2+&#92;dots=-&#92;frac{1}{12}$ without enough caution is not a good idea.</summary>
    
    
    
    <category term="Analysis" scheme="https://desvl.xyz/categories/Analysis/"/>
    
    <category term="Complex Analysis" scheme="https://desvl.xyz/categories/Analysis/Complex-Analysis/"/>
    
    
    <category term="Analytic Continuation" scheme="https://desvl.xyz/tags/Analytic-Continuation/"/>
    
    <category term="Riemann" scheme="https://desvl.xyz/tags/Riemann/"/>
    
  </entry>
  
  <entry>
    <title>Properties of Cyclotomic Polynomials</title>
    <link href="https://desvl.xyz/2022/09/22/cyclotomic-polynomial-elementary/"/>
    <id>https://desvl.xyz/2022/09/22/cyclotomic-polynomial-elementary/</id>
    <published>2022-09-22T14:21:25.000Z</published>
    <updated>2025-05-16T23:39:45.854Z</updated>
    
    <content type="html"><![CDATA[<h1 id="Background-in-Basic-Field-Theory"><a href="#Background-in-Basic-Field-Theory" class="headerlink" title="Background in Basic Field Theory"></a>Background in Basic Field Theory</h1><p>Let $K$ be a field (in this post we mostly assume that $K \supset \mathbb{Q}$) and $n$ an integer $&gt;1$ which is not divisible by the characteristic of $K$. Then the polynomial</p><script type="math/tex; mode=display">X^n-1</script><p>is separable because its derivative is $nX^{n-1} \ne 0$, which has no root in common with $X^n-1$. Hence in the algebraic closure $\overline{K}$, the polynomial has $n$ distinct roots, which form a group $U$, and this group is cyclic. In fact, as an exercise, one can show that, for a field $k$, any finite subgroup $U$ of the multiplicative group $k^\ast$ is a cyclic group.</p><p>A generator $\zeta_n$ of $U$ is called a primitive $n$-th root of unity. Let $K=\mathbb{Q}$ and $L$ be the smallest extension that contains all elements of $U$, then we have $L=\mathbb{Q}(\zeta_n)$. As a matter of fact, $L/K$ is a Galois extension (to be shown later), and the cyclotomic polynomial $\Phi_n(X)$ is the irreducible polynomial of $\zeta_n$ over $\mathbb{Q}$. We first need to find the degree $[L:K]$.</p><blockquote><p><strong>Proposition 1.</strong> Notation being as above, $L/K$ is Galois, the Galois group $\operatorname{Gal}(L/K) \cong (\mathbb{Z}/n\mathbb{Z})^\ast$ (the group of units in $\mathbb{Z}/n\mathbb{Z}$) and $[L:K]=\varphi(n)$.</p></blockquote><p>Let’s first elaborate on the fact that $|(\mathbb{Z}/n\mathbb{Z})^\ast|=\varphi(n)$. Let $[0],[1],\dots,[n-1]$ be representatives of $\mathbb{Z}/n\mathbb{Z}$. An element $[x]$ in $\mathbb{Z}/n\mathbb{Z}$ is a unit if and only if there exists $[y]$ such that $[xy]=[1]$, which is to say, $xy \equiv 1 \mod n$. Notice that $xy \equiv 1 \mod n$ for some $y$ if and only if $xy+mn=1$ for some $y,m \in \mathbb{Z}$, if and only if $\gcd(x,n)=1$. 
Therefore $|(\mathbb{Z}/n\mathbb{Z})^\ast|=\varphi(n)$ is proved.</p><p>The proof of Proposition 1 rests on two lemmas, the first of which is independent of the characteristic of the field.</p><blockquote><p><strong>Lemma 1.</strong> Let $k$ be a field and $n$ be not divisible by the characteristic $p$. Let $\zeta=\zeta_n$ be a primitive $n$-th root of unity in $\overline{k}$, then $(\mathbb{Z}/n\mathbb{Z})^\ast \supset \operatorname{Gal}(k(\zeta)/k)$ (in the sense that the Galois group embeds into the group of units) and therefore $[k(\zeta):k] \le \varphi(n)$. Besides, $k(\zeta)/k$ is a normal abelian extension.</p></blockquote><p><em>Proof.</em> Let $\sigma$ be an embedding of $k(\zeta)$ in $\overline{k}$ over $k$, then</p><script type="math/tex; mode=display">(\sigma\zeta)^n=\sigma(\zeta^n)=\sigma(1)=1</script><p>so that $\sigma\zeta$ is also an $n$-th root of unity. Hence $\sigma\zeta=\zeta^i$ for some $i=i(\sigma)$, uniquely determined modulo $n$. It follows that $\sigma$ maps $k(\zeta)$ into itself. This is to say, $k(\zeta)$ is normal over $k$. Let $\tau$ be another automorphism of $k(\zeta)$ over $k$, then</p><script type="math/tex; mode=display">\sigma\tau\zeta=\zeta^{i(\sigma)i(\tau)}.</script><p>Moreover, $i(\sigma)$ and $i(\tau)$ are prime to $n$ (otherwise, $\sigma\zeta$ would have a period smaller than $n$, implying that the period of $\zeta$ is smaller than $n$, which is absurd). Since $\sigma$ is determined by $\sigma\zeta$, the map $\sigma \mapsto i(\sigma)$ is an injective homomorphism of $\operatorname{Gal}(k(\zeta)/k)$ into $(\mathbb{Z}/n\mathbb{Z})^\ast$; in particular the Galois group is abelian, thus proving our lemma. $\square$</p><p>It is easy to find an example with strict inclusion. One only needs to look at $k=\mathbb{R}$ or $\mathbb{C}$.</p><blockquote><p><strong>Lemma 2.</strong> Let $\zeta=\zeta_n$ be a primitive $n$-th root of unity over $\mathbb{Q}$, then for any prime $p \nmid n$, $\zeta^p$ is also a primitive $n$-th root of unity, and it is a root of the irreducible polynomial of $\zeta$ over $\mathbb{Q}$.</p></blockquote><p><em>Proof.</em> Let $f(X)$ be the irreducible polynomial of $\zeta$ over $\mathbb{Q}$, then $f(X)|(X^n-1)$ by definition. 
As a result we can write $X^n-1=f(X)h(X)$ where $h(X)$ has leading coefficient $1$. By Gauss’s lemma, both $f$ and $h$ have integral coefficients.</p><p>Suppose $\zeta^p$ is not a root of $f$. Since $(\zeta^p)^n-1=(\zeta^n)^p-1=0$, it follows that $\zeta^p$ is a root of $h$, and $\zeta$ is a root of $h(X^p)$. As a result, $f(X)$ divides $h(X^p)$ and we write</p><script type="math/tex; mode=display">h(X^p)=f(X)g(X).</script><p>Again by Gauss’s lemma, $g(X)$ has integral coefficients. </p><p>Next we reduce these equations in $\mathbf{F}_p=\mathbb{Z}/p\mathbb{Z}$. We firstly have</p><script type="math/tex; mode=display">\overline{f}(X)\overline{g}(X)=\overline{h}(X^p).</script><p>By Fermat’s little theorem $a^p=a$ for all $a \in \mathbf{F}_p$, we also have</p><script type="math/tex; mode=display">\overline{h}(X^p)=\overline{h}(X)^p.</script><p>Therefore</p><script type="math/tex; mode=display">\overline{f}(X)\overline{g}(X)=\overline{h}(X)^p,</script><p>which implies that $\overline{f}(X)|\overline{h}(X)^p$. Hence $\overline{f}$ and $\overline{h}$ must have a common factor. As a result, $X^n-\overline{1}=\overline{f}(X)\overline{h}(X)$ has multiple roots in an algebraic closure of $\mathbf{F}_p$, which is impossible because $p \nmid n$. $\square$</p><p>Now we are ready for Proposition 1.</p><p><em>Proof of Proposition 1.</em> Since $\mathbb{Q}$ is a perfect field, $\mathbb{Q}(\zeta)/\mathbb{Q}$ is automatically separable. This extension is Galois because of lemma 1. By lemma 1, it suffices to show that $[\mathbb{Q}(\zeta):\mathbb{Q}] \ge \varphi(n)$.</p><p>Recall from elementary group theory that if $G$ is a finite cyclic group of order $m$ and $x$ is a generator of $G$, then the set of generators consists of the elements of the form $x^\nu$ where $\nu$ is prime to $m$. In this occasion, if $\zeta$ generates $U$, then $\zeta^p$ also generates $U$ because $p \nmid n$. 
It follows that every primitive $n$-th root of unity can be obtained by raising $\zeta$ to a succession of prime numbers that do not divide $n$ (as a result we obtain exactly $\varphi(n)$ such primitive roots). By lemma 2, all these numbers are roots of $f$ in the proof of lemma 2. Therefore $\deg f = [L:K] \ge \varphi(n)$. Hence the proposition is proved. $\square$</p><p>We will show that $f$ in the proof of lemma 2 is actually the cyclotomic polynomial $\Phi_n(X)$ we are looking for. The following procedure works for all fields where the characteristic does not divide $n$, but we assume the characteristic to be $0$ for simplicity.</p><p>We have</p><script type="math/tex; mode=display">X^n-1=\prod_{\zeta}(X-\zeta),</script><p>where the product is taken over all $n$-th roots of unity. Collecting all roots with the same period $d$ (i.e., those $\zeta$ for which $d$ is the smallest positive integer such that $\zeta^d=1$), we put</p><script type="math/tex; mode=display">\Phi_d(X)=\prod_{\operatorname{period} \zeta=d}(X-\zeta).</script><p>Then</p><script type="math/tex; mode=display">X^n-1=\prod_{d|n}\Phi_d(X).</script><p>It follows that $\Phi_1(X)=X-1$ and</p><script type="math/tex; mode=display">\Phi_n(X)=\frac{X^n-1}{\prod_{d\mid n}^{d<n}\Phi_d(X)}.</script><p>This presentation makes our computation much easier. But to understand $\Phi_n$, we still should keep in mind that the $n$-th cyclotomic polynomial is defined to be</p><script type="math/tex; mode=display">\Phi_n(X)=\prod_{\operatorname{period}\zeta=n}(X-\zeta),</script><p>whose roots are all primitive $n$-th roots of unity. As stated in the proof of proposition 1, there are $\varphi(n)$ primitive $n$-th roots of unity, and therefore $\deg\Phi_n(X)=\varphi(n)$. Besides, $f|\Phi_n$. Since both are monic of the same degree, these two polynomials are equal. 
It also follows that $\sum_{d|n}\varphi(d)=n$.</p><blockquote><p><strong>Proposition 2.</strong> The cyclotomic polynomial is irreducible and is the irreducible polynomial of $\zeta$ over $\mathbb{Q}$, where $\zeta$ is a primitive $n$-th root of unity.</p></blockquote><p>We end this section with a problem in number fields, making use of what we have studied above.</p><blockquote><p><strong>Problem 0.</strong> A number field $F$ only contains finitely many roots of unity.</p></blockquote><p><em>Solution.</em> Let $\zeta \in F$ be a root of unity with period $n$. Then $\Phi_n(\zeta)=0$ and therefore $[\mathbb{Q}(\zeta):\mathbb{Q}]=\varphi(n)$. Since $\mathbb{Q}(\zeta)$ is also a subfield of $F$, we also have $\varphi(n) \le [F:\mathbb{Q}]$. Since $\{n:\varphi(n) \le [F:\mathbb{Q}]\}$ is certainly a finite set, the number of roots of unity lying in $F$ is finite. $\square$</p><h1 id="Technical-Computations"><a href="#Technical-Computations" class="headerlink" title="Technical Computations"></a>Technical Computations</h1><p>We will do some dirty computation in this section.</p><blockquote><p><strong>Problem 1.</strong> If $p$ is prime, then $\Phi_p(X)=X^{p-1}+X^{p-2}+\dots+1$, and for an integer $\nu \ge 1$, $\Phi_{p^\nu}(X)=\Phi_p(X^{p^{\nu-1}})$.</p></blockquote><p><em>Solution.</em> The only proper divisor of $p$ is $1$, so we can only have</p><script type="math/tex; mode=display">\Phi_p(X)=\frac{X^p-1}{\Phi_1(X)}=X^{p-1}+\dots+1.</script><p>For the second statement, we use induction on $\nu$. When $\nu=1$ we have nothing to prove. 
Suppose now</p><script type="math/tex; mode=display">\Phi_{p^\nu}(X)=\Phi_p(X^{p^{\nu-1}})=\frac{X^{p^{\nu}}-1}{X^{p^{\nu-1}}-1}=\frac{X^{p^{\nu}}-1}{\prod_{r=0}^{\nu-1}\Phi_{p^{r}}(X)}</script><p>is proved, then $X^{p^\nu}-1=\prod_{r=0}^{\nu}\Phi_{p^r}(X)$ and therefore</p><script type="math/tex; mode=display">\begin{aligned}\Phi_{p^{\nu+1}}(X)&=\frac{X^{p^{\nu+1}}-1}{\prod_{r=0}^{\nu}\Phi_{p^r}(X)} \\                   &=\frac{X^{p^{\nu+1}}-1}{X^{p^{\nu}}-1} \\                   &=\Phi_p(X^{p^\nu}).\end{aligned}</script><blockquote><p><strong>Problem 2.</strong> Let $p$ be a prime number. If $p \nmid n$, then</p><script type="math/tex; mode=display">\Phi_{pn}(X)=\frac{\Phi_n(X^p)}{\Phi_n(X)}.</script></blockquote><p><em>Solution.</em> Assume $p \nmid n$ first. It holds clearly for $n=1$. Suppose now the statement holds for all integers $&lt;n$ that are prime to $p$. We see</p><script type="math/tex; mode=display">\begin{aligned}\frac{\Phi_n(X^p)}{\Phi_n(X)} &= \frac{X^{pn}-1}{\prod_{d|n}^{d<n}\Phi_d(X^p)}\frac{\prod_{d|n}^{d<n}\Phi_d(X)}{X^n-1} \\&= \frac{X^{pn}-1}{(X^n-1)\prod_{d|n}^{d<n}\Phi_{dp}(X)} \\&= \frac{X^{pn}-1}{\prod_{d|n}\Phi_d(X)\prod_{d|n}^{d<n}\Phi_{dp}(X)} \\&=\Phi_{np}(X).\end{aligned}</script><blockquote><p><strong>Problem 3.</strong> If $n$ is an odd number $&gt;1$, then $\Phi_{2n}(X)=\Phi_n(-X)$.</p></blockquote><p><em>Solution.</em> By problem 2, $\Phi_{2n}(X)=\Phi_n(X^2)/\Phi_n(X)$. 
To show the identity it suffices to show that</p><script type="math/tex; mode=display">\Phi_n(X)\Phi_n(-X)=\Phi_n(X^2).</script><p>For $n=3$ we see</p><script type="math/tex; mode=display">\begin{aligned}\Phi_3(X)\Phi_3(-X) &= (X^2+X+1)(X^2-X+1) \\&=(X^2+1)^2-X^2 \\&=X^4+X^2+1 \\&=\Phi_3(X^2).\end{aligned}</script><p>Now suppose it holds for all odd numbers $3 \le d &lt; n$, then</p><script type="math/tex; mode=display">\begin{aligned}\Phi_n(X)\Phi_n(-X) &= \frac{(X^n-1)(-X^n-1)}{    (X-1)(-X-1)\prod_{3\le d < n}^{d|n}\Phi_d(X)\Phi_d(-X)} \\&=  \frac{-(X^{2n}-1)}{-(X^2-1)\prod_{3 \le d < n}^{d|n}    \Phi_d(X^2)} \\&= \Phi_n(X^2).\end{aligned}</script><p>The following problem would not be very easy without the Möbius inversion formula, so we will use it. The problems above can also be deduced from this formula. Let $f$ be a function from the positive integers to a multiplicative abelian group (for us, the nonzero rational functions in $X$) and $F(n)=\prod_{d|n}f(d)$, then the Möbius inversion formula states that</p><script type="math/tex; mode=display">f(n)=\prod_{d|n}F(n/d)^{\mu(d)}</script><p>with</p><script type="math/tex; mode=display">\mu(n)=\begin{cases}0 & \text{if $n$ is divisible by $p^2$ for some prime $p$}, \\(-1)^r & \text{if $n$ is a product of $r$ distinct primes,} \\1 & \text{if $n=1$.}\end{cases}</script><p>Putting $f(d)=\Phi_d(X)$, so that $F(n)=X^n-1$, we see</p><script type="math/tex; mode=display">\Phi_n(X)=\prod_{d|n}(X^{n/d}-1)^{\mu(d)}.</script><p>Now we proceed.</p><blockquote><p><strong>Problem 4.</strong> If $p|n$, then $\Phi_{pn}(X)=\Phi_n(X^p)$.</p></blockquote><p><em>Solution.</em> By the Möbius inversion formula, we see</p><script type="math/tex; mode=display">\begin{aligned}\Phi_{pn}(X) &= \prod_{d|pn}(X^{pn/d}-1)^{\mu(d)} \\             &= \left(\prod_{d|n}(X^{pn/d}-1)^{\mu(d)} \right)                 \left(\prod_{d|np}^{d\nmid n}(X^{pn/d}-1)^{\mu(d)}\right) \\             &= \Phi_n(X^p)\end{aligned}</script><p>because every $d$ that divides $np$ but not $n$ must be divisible by $p^2$, so that $\mu(d)=0$. 
Problem 2 can also be deduced in this way.</p><blockquote><p><strong>Problem 5.</strong> Let $n=p_1^{r_1}\dots p_s^{r_s}$, then</p><script type="math/tex; mode=display">\Phi_n(X)=\Phi_{p_1 \dots p_s}(X^{p_1^{r_1-1}\dots p_s^{r_s-1}}).</script></blockquote><p><em>Solution.</em> This problem can be solved by induction on the number of primes. For $s=1$ it is problem 1. Suppose it has been proved for $s-1$ primes, then for</p><script type="math/tex; mode=display">n_{s-1}=p_1^{r_1}\dots p_{s-1}^{r_{s-1}}</script><p>and a prime $p_s$ not dividing $n_{s-1}$, problem 2 gives</p><script type="math/tex; mode=display">\Phi_{n_{s-1}p_s}(X)=\frac{\Phi_{n_{s-1}}(X^{p_s})}{\Phi_{n_{s-1}}(X)}</script><p>On the other hand, by the induction hypothesis and problem 2 again,</p><script type="math/tex; mode=display">\begin{aligned}\frac{\Phi_{n_{s-1}}(X^{p_s})}{\Phi_{n_{s-1}}(X)}&=\frac{\Phi_{p_1\dots p_{s-1}}(X^{    p_1^{r_1-1}\dots p_{s-1}^{r_{s-1}-1}p_s})}{\Phi_{p_1\dots p_{s-1}}(X^{    p_1^{r_1-1}\dots p_{s-1}^{r_{s-1}-1}})} \\&=\Phi_{p_1 \dots p_{s-1}p_s}(X^{p_1^{r_1-1}\dots p_{s-1}^{r_{s-1}-1}p_s^{1-1}})\end{aligned}</script><p>if we put $Y=X^{p_1^{r_1-1}\dots p_{s-1}^{r_{s-1}-1}}$. When it comes to higher powers of $p_s$, it’s merely problem 4 applied repeatedly. Therefore we have shown what we want.</p><h1 id="Computing-the-Norm"><a href="#Computing-the-Norm" class="headerlink" title="Computing the Norm"></a>Computing the Norm</h1><p>Let $\zeta$ be a primitive $n$-th root of unity, put $K=\mathbb{Q}(\zeta)$ and let $G$ be the Galois group. We will compute the norm of $1-\zeta$ with respect to the extension $K/\mathbb{Q}$. Since this extension is separable, we have</p><script type="math/tex; mode=display">\begin{aligned}N_\mathbb{Q}^K(1-\zeta)&=\prod_{\sigma \in G}\sigma(1-\zeta) \\&=\prod_{\sigma \in G}(1-\sigma\zeta) \\\end{aligned}</script><p>Since $G$ acts on the set of primitive $n$-th roots of unity transitively, $\{\sigma\zeta\}_{\sigma \in G}$ is exactly the set of primitive $n$-th roots of unity, which are the roots of $\Phi_n(X)$. 
It follows that</p><script type="math/tex; mode=display">N_\mathbb{Q}^K(1-\zeta)=\Phi_n(1).</script><p>If $n=p^r$, then $N_\mathbb{Q}^K(1-\zeta)=\Phi_p(1^{p^{r-1}})=\Phi_p(1)=p$. On the other hand, if $n$ has $s \ge 2$ distinct prime factors, say</p><script type="math/tex; mode=display">n=p_1^{r_1}\dots p_s^{r_s},</script><p>then</p><script type="math/tex; mode=display">\Phi_n(1)=\Phi_{p_1\dots p_s}(1)=1.</script>]]></content>
    
    
    <summary type="html">In this post we study cyclotomic polynomials in field theory and deduce some basic properties of them. We will also use them to solve some problems in field theory.</summary>
    
    
    
    <category term="Algebra" scheme="https://desvl.xyz/categories/Algebra/"/>
    
    <category term="Field Theory" scheme="https://desvl.xyz/categories/Algebra/Field-Theory/"/>
    
    <category term="Galois Theory" scheme="https://desvl.xyz/categories/Algebra/Galois-Theory/"/>
    
    
    <category term="Exercise solution" scheme="https://desvl.xyz/tags/Exercise-solution/"/>
    
    <category term="Serge Lang" scheme="https://desvl.xyz/tags/Serge-Lang/"/>
    
  </entry>
  
  <entry>
    <title>Calculus on Fields - Heights of Polynomials, Mahler&#39;s Measure and Northcott&#39;s Theorem</title>
    <link href="https://desvl.xyz/2022/09/04/height-polynomial/"/>
    <id>https://desvl.xyz/2022/09/04/height-polynomial/</id>
    <published>2022-09-04T14:39:20.000Z</published>
    <updated>2025-05-11T12:08:36.541Z</updated>
    
    <content type="html"><![CDATA[<h2 id="Heights"><a href="#Heights" class="headerlink" title="Heights"></a>Heights</h2><blockquote><p><strong>Definition.</strong> For a polynomial with coefficients in a number field $K$</p><script type="math/tex; mode=display">f(t_1,\dots,t_n)=\sum_{j_1,\dots,j_n}a_{j_1\dots j_n}t_1^{j_1}\dots t_n^{j_n}=\sum_{\mathbf{j}}a_{\mathbf{j}}\mathbf{t}^{\mathbf{j}},</script><p>the <strong>height</strong> of $f$ is defined to be</p><script type="math/tex; mode=display">h(f)=\sum_{v \in M_K}\log|f|_v</script><p>where</p><script type="math/tex; mode=display">|f|_v=\max_{\mathbf{j}}|a_{\mathbf{j}}|_v</script><p>is the <strong>Gauss norm</strong> for any place $v$.</p></blockquote><p>Here, $M_K$ refers to the canonical set of non-equivalent places on $K$. See the first four pages of <a href="https://www.math.uh.edu/~minru/fields/dictionary.pdf">this document</a> for a reference.</p><p>As one can expect, the height measures the complexity of a polynomial, just as the height of an algebraic number measures its complexity. Let us compute some examples.</p><h3 id="Computing-Heights"><a href="#Computing-Heights" class="headerlink" title="Computing Heights"></a>Computing Heights</h3><p>Let us consider the simplest one </p><script type="math/tex; mode=display">f(x)=x^2-1 \in \mathbb{Q}[x]</script><p>first. Since $|x^2-1|_v=1$ for all places $v$, the height of $f$ is a sum of zeros, hence $0$.</p><p>Next, we take care of a polynomial that involves prime numbers:</p><script type="math/tex; mode=display">g(x)=\frac{1}{4}x^4+\frac{1}{3}x^3+\frac{1}{2}x^2+x+2</script><p>We see $|g(x)|_\infty=2$, $|g(x)|_2=2^{-(-2)}=4$, $|g(x)|_3=3^{-(-1)}=3$, and the Gauss norm is $1$ for all other primes. Therefore</p><script type="math/tex; mode=display">h(g)=\log{2}+\log{4}+\log{3}=3\log{2}+\log{3}.</script><p>Finally, put $u(x,y)=\sqrt{2}x^2 + 3\sqrt{2}xy+5y^2+7 \in \mathbb{Q}(\sqrt{2})[x,y]$ and compute its height carefully. 
Notice that $|\sqrt{2}|_v=\sqrt{|2|_v}$ for all places $v$. The Gauss norm is the maximum over the coefficients, so for instance $|u|_2=\max\{2^{-1/2},2^{-1/2},1,1\}=1$; likewise, since $5$ and $7$ have no common prime factor, $|u|_p=1$ for every prime $p$, while $|u|_\infty=7$. We therefore have</p><script type="math/tex; mode=display">\begin{aligned}h(u) &= \log|u|_\infty + \log|u|_2 + \log|u|_3 + \log|u|_5+\log|u|_7 \\     &= \log 7 + 0+0+0+0 \\     &= \log 7.\end{aligned}</script><h3 id="Height-and-Products"><a href="#Height-and-Products" class="headerlink" title="Height and Products"></a>Height and Products</h3><p>If $f \in K[s_1,\dots,s_n]$ and $g \in K[t_1,\dots,t_m]$ are two polynomials in different variables, then as a polynomial in $K[s_1,\dots,s_n;t_1,\dots,t_m]$, $fg$ has height $h(f)+h(g)$. This is immediately realised once we notice that the height of a polynomial is equal to the height of the vector of coefficients in an appropriate projective space. The identity $h(fg)=h(f)+h(g)$ follows from the Segre embedding.</p><p>But if variables coincide, things get different. For example, $h(x+1)=0$ but $h((x+1)^2)=\log 2$. This is because we do not have $|fg|_\infty=|f|_\infty|g|_\infty$ in general. Nevertheless, for non-Archimedean places, things are easier.</p><blockquote><p><strong>Gauss’s lemma.</strong> If $v$ is not Archimedean, then $|fg|_v=|f|_v|g|_v$.</p></blockquote><p><em>Proof.</em> First of all, it suffices to prove it in the univariate case. If $f$ and $g$ have multiple variables $x_1,\dots,x_n$, let $d$ be an integer greater than the degree of $fg$. Then the Kronecker substitution </p><script type="math/tex; mode=display">x_j=t^{d^{j-1}}</script><p>reduces the problem to $K[t]$. This is because, with such a $d$, this substitution gives a univariate polynomial with the same set of coefficients. </p><p>Therefore we only need to show that $|f(t)g(t)|_v=|f(t)|_v|g(t)|_v$. Dividing $f$ and $g$ by a coefficient of maximal absolute value, we may assume without loss of generality that $|f(t)|_v=|g(t)|_v=1$. Writing $f(t)=\sum a_k t^k$ and $g(t)=\sum b_k t^k$, we have $f(t)g(t)=\sum c_jt^j$ where $c_j=\sum_{k+l=j}a_kb_l$. 
</p><p>We suppose that $|fg|_v&lt;1$, i.e., $|c_j|_v&lt;1$ for all $j$, and derive a contradiction. Since $|f|_v=|g|_v=1$, all coefficients satisfy $|a_k|_v \le 1$ and $|b_l|_v \le 1$, and we can pick the smallest index $k_0$ with $|a_{k_0}|_v=1$ and the smallest index $l_0$ with $|b_{l_0}|_v=1$. In the sum</p><script type="math/tex; mode=display">c_{k_0+l_0}=a_{k_0}b_{l_0}+\sum_{\substack{k+l=k_0+l_0 \\ (k,l)\ne(k_0,l_0)}}a_kb_l,</script><p>every term other than $a_{k_0}b_{l_0}$ has $k&lt;k_0$ or $l&lt;l_0$, hence absolute value strictly less than $1$, whereas $|a_{k_0}b_{l_0}|_v=1$. By the ultrametric inequality, $|c_{k_0+l_0}|_v=1$, a contradiction. $\square$</p><p>So much for the non-Archimedean case. The Archimedean case is more complicated and we do not have enough space to cover it. Nevertheless, we have</p><blockquote><p><strong>Gelfond’s lemma.</strong> Let $f_1,\dots,f_m$ be complex polynomials in $n$ variables and set $f=f_1\cdots f_m$, then</p><script type="math/tex; mode=display">2^{-d}\prod_{j=1}^{m}\ell_\infty(f_j) \le \ell_\infty (f) \le 2^d\prod_{j=1}^{m}\ell_\infty(f_j),</script><p>where $d$ is the sum of the partial degrees of $f$, and $\ell_\infty(f)=\max_j|a_j|=|f|_\infty$. </p></blockquote><p>Combining Gelfond’s lemma and Gauss’s lemma, we obtain</p><script type="math/tex; mode=display">\left|h(f)-\sum_{j=1}^{m}h(f_j) \right| \le d\log{2}.</script><h2 id="Mahler-Measure"><a href="#Mahler-Measure" class="headerlink" title="Mahler Measure"></a>Mahler Measure</h2><p>The Mahler measure was not originally introduced by Mahler. It is named after him because he successfully extended it to the multivariable case in an elegant way. We will cover the original motivation anyway.</p><h3 id="Original-Version-and-Lehmer’s-Conjecture"><a href="#Original-Version-and-Lehmer’s-Conjecture" class="headerlink" title="Original Version and Lehmer’s Conjecture"></a>Original Version and Lehmer’s Conjecture</h3><p>Say we want to find large prime numbers. Pierce came up with the following idea. 
Consider $p(x) \in \mathbb{Z}[x]$, which is factored into</p><script type="math/tex; mode=display">p(x)=a\prod_i(x-\alpha_i).</script><p>Consider $\Delta_n=\prod_i(\alpha^n_i-1)$. By some Galois theory, this is an integer when $p$ is monic. So perhaps we may find some interesting integers among the factors of $\Delta_n$. Also, we would like $\Delta_n$ to grow slowly. Lehmer studied $\frac{\Delta_{n+1}}{\Delta_n}$ and observed that</p><script type="math/tex; mode=display">\lim_{n \to \infty}\frac{|\alpha^{n+1}-1|}{|\alpha^n-1|}=\begin{cases}|\alpha|, & |\alpha|>1, \\1, & |\alpha|<1.\end{cases}</script><p>So it makes sense to compare all roots of $p(x)$ with $1$. He therefore suggested the following function related to $p(x)$:</p><script type="math/tex; mode=display">M(p)=|a|\prod_i \max\{1,|\alpha_i|\}.</script><p>For monic $p$, this number appears if we consider $\lim_{n \to \infty}|\Delta_{n+1}/\Delta_n|$.</p><p>He also asked the following question, which is now known as <strong>Lehmer’s conjecture</strong>, although in his paper he addressed it as a problem instead of a conjecture:</p><blockquote><p>Is there a constant $c&gt;1$ such that $M(p)&gt;1 \implies M(p)&gt;c$?</p></blockquote><p>It remains open but we can mention some key bounds.</p><ul><li>Lehmer himself found that</li></ul><script type="math/tex; mode=display">M(x^{10}+x^9-x^7-x^6-x^5-x^4-x^3+x+1)=1.1762808\cdots</script><p>and this is still the smallest known value of $M(p)$ greater than $1$. It was because of this discovery that he gave his <em>problem</em>.</p><p>This polynomial has also led to the discovery of a large prime number $\sqrt{\Delta_{379}}=1{,}794{,}327{,}140{,}357$, although by studying $x^3-x-1$, we have found a bigger prime number $\Delta_{127}=3{,}233{,}514{,}251{,}032{,}733$.</p><ul><li>Breusch (and later Smyth) showed that if $p$ is monic, irreducible and nonreciprocal, i.e. 
it does not satisfy $p(x)=\pm x^{\deg p}p(1/x)$, then </li></ul><script type="math/tex; mode=display">M(p)\ge 1.3247179\cdots=\text{ real root of $x^3-x-1$.}</script><ul><li>E. Dobrowolski found that if $p(x)$ is monic, irreducible and noncyclotomic, and has degree $d$, then</li></ul><script type="math/tex; mode=display">M(p)>1+c\left( \frac{\log\log d}{\log d} \right)^3</script><p>for some $c&gt;0$.</p><h3 id="The-General-Version-and-Jensen’s-Formula"><a href="#The-General-Version-and-Jensen’s-Formula" class="headerlink" title="The General Version and Jensen’s Formula"></a>The General Version and Jensen’s Formula</h3><blockquote><p><strong>Definition.</strong> For $f \in \mathbb{C}[x_1,\dots,x_n]$, the <strong>Mahler measure</strong> is defined to be</p><script type="math/tex; mode=display">M(f)=\exp\left(\int_{\mathbb{T}^n}\log|f(    e^{i\theta_1},\dots,e^{i\theta_n})|d\mu_1\dots d\mu_n \right),</script><p>where $d\mu_i=\frac{1}{2\pi}d\theta_i$, i.e., $d\mu_1\dots d\mu_n$ corresponds to the (completion of) Haar measure on $\mathbb{T}^n$ with total measure $1$.</p></blockquote><p>We see through Jensen’s formula that when $n=1$ this coincides with what we have defined before. Observe first that $M(fg)=M(f)M(g)$. 
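</p><p>To spell this out (a short derivation added for convenience, assuming $f$ and $g$ are nonzero polynomials in the same variables and writing $d\mu=d\mu_1\dots d\mu_n$), the logarithm turns the product into a sum under the integral:</p><script type="math/tex; mode=display">\begin{aligned}\log M(fg) &= \int_{\mathbb{T}^n}\log|fg|\,d\mu \\           &= \int_{\mathbb{T}^n}\log|f|\,d\mu+\int_{\mathbb{T}^n}\log|g|\,d\mu \\           &= \log M(f)+\log M(g).\end{aligned}</script><p>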
Consider $f(t)=a\prod_{i=1}^{d}(t-\alpha_i)$, then</p><script type="math/tex; mode=display">M(f)=M(a)\prod_{i=1}^{d}M(t-\alpha_i)=|a|\prod_{i=1}^{d}M(t-\alpha_i).</script><p>On the other hand, as an exercise in complex analysis, one can show that</p><script type="math/tex; mode=display">\int_0^{2\pi}\log|e^{i\theta}-\alpha_i|d\mu=\log^+|\alpha_i|=\log\max\{1,|\alpha_i|\}.</script><p>Combining them, we see</p><script type="math/tex; mode=display">M(f)=|a|\prod_{i=1}^{d}\exp\left(\log^+|\alpha_i|\right) =|a|\prod_{i=1}^{d}\max\{1,|\alpha_i|\}.</script><p>Taking the logarithm we also obtain <strong>Jensen’s formula</strong></p><script type="math/tex; mode=display">\log M(f)=\log|a|+\sum_{i=1}^{d}\log^+|\alpha_i|.</script><p>We first give a reasonable and useful estimate of $M(f)$, which will be used to prove Northcott’s theorem.</p><blockquote><p><strong>Definition.</strong> For $f(t)=a_dt^d+\dots+a_0$, the $\ell_p$-norm of $f$ is naturally defined to be</p><script type="math/tex; mode=display">\ell_p(f)=\left(\sum_{j=0}^{d}|a_j|^p \right)^{\frac{1}{p}}.</script><p>For $p=\infty$, we have $\ell_\infty(f)=\max_j|a_j|$.</p><p><strong>Lemma 1.</strong> With the notation above, $M(f) \le \ell_1(f)$ and</p><script type="math/tex; mode=display">{d \choose \lfloor{d/2}\rfloor }^{-1} \ell_\infty(f) \le M(f) \le \ell_2(f) \le\sqrt{d+1}\ell_\infty(f).</script></blockquote><p><em>Proof.</em> To begin with, we handle the easy inequalities. 
First of all, </p><script type="math/tex; mode=display">\begin{aligned}|f(e^{i\theta})|&=|a_d e^{id\theta}+\dots+a_0| \\                &\le |a_de^{id\theta}|+\dots+|a_0| \\                &=|a_d|+\dots+|a_0| \\                &=\ell_1(f).\end{aligned}</script><p>Therefore</p><script type="math/tex; mode=display">\begin{aligned}M(f) &=\exp\left(\int_0^{2\pi}\log|f(e^{i\theta})|d\mu \right) \\     &\le \exp\left(\int_0^{2\pi}\log\ell_1(f)d\mu \right) \\     &=\ell_1(f).\end{aligned}</script><p>Next, by Jensen’s inequality</p><script type="math/tex; mode=display">\begin{aligned}M(f) &=\exp\left(\frac{1}{2}\int_0^{2\pi}\log|f(e^{i\theta}) |^2d\mu\right) \\     &=\sqrt{\exp\left(    \int_0^{2\pi} \log|f(e^{i\theta})|^2d\mu \right)} \\     &\le \sqrt{\int_0^{2\pi}|f(e^{i\theta})|^2d\mu} \\\end{aligned}</script><p>By Parseval’s formula, the last term equals</p><script type="math/tex; mode=display">\ell_2(f)=\sqrt{\sum_{j=0}^{d}|a_j|^2} \le \sqrt{d+1}\ell_\infty(f).</script><p>For the remaining inequality, we use Vieta’s formula</p><script type="math/tex; mode=display">\left|\frac{a_{d-r}}{a_d} \right| = \left|\sum_{j_1 < \dots< j_r}\alpha_{j_1} \cdots \alpha_{j_r} \right|</script><p>and therefore</p><script type="math/tex; mode=display">|a_{d-r}| \le {d \choose r}|a_d|\prod_{j=1}^{d}\max\{1,|\alpha_j|\}={d \choose r}M(f) \le {d \choose \lfloor{d/2}\rfloor }M(f)</script><p>for all $0 \le r \le d$. Since the right-hand side does not depend on $r$, taking the maximum over $r$ on the left gives $\ell_\infty(f) \le {d \choose \lfloor{d/2}\rfloor }M(f)$, which finishes the proof. $\square$</p><p>Before proving Northcott’s theorem, we show the connection between Mahler measure and heights.</p><blockquote><p><strong>Proposition 1.</strong> Let $\alpha \in \overline{\mathbb{Q}}$ and let $f$ be the minimal polynomial of $\alpha$ over $\mathbb{Z}$. 
Then</p><script type="math/tex; mode=display">\log M(f) = \deg(\alpha)h(\alpha)</script><p>and</p><script type="math/tex; mode=display">\log|N_{\mathbb{Q}(\alpha)/\mathbb Q}(\alpha)| \le \deg(\alpha)h(\alpha).</script></blockquote><p><em>Proof.</em> Put $d=\deg(\alpha)$ and write</p><script type="math/tex; mode=display">f(t)=a_dt^d+\dots+a_0.</script><p>Choose a number field $K$ that contains $\alpha$ and is a Galois extension of $\mathbb{Q}$, with Galois group $G$. Then $(\sigma\alpha:\sigma \in G)$ contains every conjugate of $\alpha$ exactly $[K:\mathbb{Q}]/d$ times. Since $a_0,\dots,a_d$ are coprime, for any non-Archimedean absolute value $v \in M_K$, we must have $\max_i|a_i|_v=|f|_v=1$. Combining with Gauss’s lemma and Galois theory, we see</p><script type="math/tex; mode=display">|a_d|_v\prod_{\sigma \in G}\max\{1,|\sigma\alpha|_v\}^{d/[K:\mathbb{Q}]}=1.</script><p>Now we are ready to compute the height of $\alpha$ and rediscover the Mahler measure. Notice that</p><script type="math/tex; mode=display">h(\alpha)=\sum_{v \in M_K}\log^+|\alpha|_v, \quad h(\sigma\alpha)=h(\alpha),\quad \forall \sigma \in G.</script><p>We therefore obtain</p><script type="math/tex; mode=display">\begin{aligned}h(\alpha)&=\frac{1}{[K:\mathbb{Q}]}\sum_{\sigma \in G}h(\sigma\alpha) \\         &=\frac{1}{[K:\mathbb{Q}]}\sum_{v \in M_K}\sum_{\sigma \in G}\log^+|\sigma\alpha|_v \\         &=\frac{1}{[K:\mathbb{Q}]}\sum_{v\mid\infty}\sum_{\sigma \in G}\log^+|\sigma\alpha|_v             + \frac{1}{[K:\mathbb{Q}]}\sum_{v\nmid \infty}\sum_{\sigma \in G}\log^+|\sigma\alpha|_v \\\end{aligned}</script><p>The last term corresponds to what we have computed above about non-Archimedean absolute values, so we break it down a little bit. For $v \nmid \infty$,</p><script type="math/tex; mode=display">\begin{aligned}\sum_{\sigma \in G}\log^+|\sigma\alpha|_v &=    \sum_{\sigma \in G}\log\max\{1,|\sigma\alpha|_v\} \\    &=\log\prod_{\sigma \in G}\max\{1,|\sigma\alpha|_v\} \\    &=\frac{[K:\mathbb{Q}]}{d}\log\prod_{\sigma \in G}        \max\{1,|\sigma\alpha|_v\}^{d/[K:\mathbb{Q}]} \\    &=-\frac{[K:\mathbb{Q}]}{d}\log|a_d|_v.\end{aligned}</script><p>Summing over all $v \nmid \infty$ and applying the product formula to $a_d$, we obtain</p><script type="math/tex; mode=display">\sum_{v\nmid\infty}\sum_{\sigma \in G}\log^+|\sigma\alpha|_v=-\frac{[K:\mathbb{Q}]}{d}\sum_{v\nmid\infty}\log|a_d|_v=\frac{[K:\mathbb{Q}]}{d}\sum_{v\mid\infty}\log|a_d|_v.</script>
<p>On the other hand, for $v \mid \infty$,</p><script type="math/tex; mode=display">\sum_{\sigma \in G}\log^+|\sigma\alpha|_v = \frac{[K:\mathbb{Q}]}{d}\sum_{j=1}^{d}\log^+|\alpha_j|_v.</script><p>All in all,</p><script type="math/tex; mode=display">\begin{aligned}h(\alpha)&=\frac{1}{[K:\mathbb{Q}]}\frac{[K:\mathbb{Q}]}{d}           \sum_{v \mid \infty}\left(            \log|a_d|_v+\sum_{j=1}^{d}\log^+|\alpha_j|_v       \right) \\         &=\frac{1}{d}\log M(f).\end{aligned}</script><p>The second assertion follows immediately because, $a_d$ being a non-zero integer,</p><script type="math/tex; mode=display">\log|N_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha)|=\sum_{j=1}^{d}\log|\alpha_j| \le \sum_{j=1}^{d}\log^+|\alpha_j| \le \log M(f).</script><h2 id="Northcott’s-Theorem"><a href="#Northcott’s-Theorem" class="headerlink" title="Northcott’s Theorem"></a>Northcott’s Theorem</h2><p>Non-zero algebraic integers of height $0$ lie on the unit circle, and by Kronecker’s theorem they are in fact roots of unity. However, keep in mind that algebraic integers on the unit circle are not necessarily roots of unity. See this short <a href="http://ramanujan.math.trinity.edu/rdaileda/research/papers/p1.pdf">paper</a>.</p><p>When it comes to algebraic integers of small heights, things may get complicated, but Northcott’s theorem assures that we will be studying a finite set.</p><blockquote><p><strong>Northcott’s Theorem.</strong> Given an integer $N&gt;0$ and a real number $H \ge1$, there are only a finite number of algebraic integers $\alpha$ satisfying $\deg(\alpha) \le N$ and $h(\alpha) \le \log H$.</p></blockquote><p><em>Proof.</em> Let $\alpha$ be an algebraic integer of degree $d \le N$ and height $h(\alpha) \le \log H$. 
Suppose $f(t)=a_dt^d+\dots+a_0 \in \mathbb{Z}[t]$ is the minimal polynomial of $\alpha$. Then Lemma 1 shows that</p><script type="math/tex; mode=display">\max|a_i|=\ell_\infty(f) \le {d \choose \lfloor d/2 \rfloor}          M(f) \le 2^dM(f).</script><p>On the other hand, by Proposition 1,</p><script type="math/tex; mode=display">\log M(f) =dh(\alpha) \le d\log{H}=\log H^d,</script><p>so in fact</p><script type="math/tex; mode=display">\max|a_i| \le (2H)^d.</script><p>This gives rise to no more than $(2\lfloor (2H)^d \rfloor+1)^{d+1}$ distinct polynomials $f$, which produce at most $d(2\lfloor (2H)^d \rfloor+1)^{d+1}&lt;\infty$ algebraic integers. Ranging through all $d \le N$ we get what we want. $\square$</p><p>We also have the <strong>Northcott property</strong>, where we do not care about degrees. A set $L$ of algebraic integers is said to satisfy the Northcott property if, for every $T&gt;0$, the set</p><script type="math/tex; mode=display">\{\alpha \in L:h(\alpha)<T\}</script><p>is finite. Such a set $L$ is said to satisfy the <strong>Bogomolov property</strong> if there exists $T&gt;0$ such that the set</p><script type="math/tex; mode=display">\{\alpha \in L:0<h(\alpha)<T\}</script><p>is empty. As a matter of elementary topology, the Northcott property implies the Bogomolov property. The case where $L$ is a field is particularly interesting; see this <a href="https://arxiv.org/abs/2005.10609">paper</a>.</p><h2 id="References-Further-Reading"><a href="#References-Further-Reading" class="headerlink" title="References / Further Reading"></a>References / Further Reading</h2><ul><li><p>Enrico Bombieri, Walter Gubler, <em>Heights in Diophantine Geometry</em>.</p></li><li><p>Michel Waldschmidt, <em>Diophantine Approximation on Linear Algebraic Groups, Transcendence Properties of the Exponential Function in Several Variables</em>.</p></li><li><p>Chris Smyth, <em>THE MAHLER MEASURE OF ALGEBRAIC NUMBERS: A SURVEY</em>.</p></li></ul>]]></content>
    
    
    <summary type="html">We study the height of polynomials and derive some important tools.</summary>
    
    
    
    <category term="Algebra" scheme="https://desvl.xyz/categories/Algebra/"/>
    
    <category term="Galois Theory" scheme="https://desvl.xyz/categories/Algebra/Galois-Theory/"/>
    
    <category term="Number Theory" scheme="https://desvl.xyz/categories/Number-Theory/"/>
    
    <category term="Algebraic Number Theory" scheme="https://desvl.xyz/categories/Number-Theory/Algebraic-Number-Theory/"/>
    
    
    <category term="Heights" scheme="https://desvl.xyz/tags/Heights/"/>
    
    <category term="Diophantus" scheme="https://desvl.xyz/tags/Diophantus/"/>
    
  </entry>
  
</feed>
