Calculus on Fields - Heights of Polynomials, Mahler's Measure and Northcott's Theorem
Heights
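Definition. For a polynomial
$$f(t_1,\dots,t_n)=\sum_{j_1,\dots,j_n}a_{j_1\dots j_n}t_1^{j_1}\cdots t_n^{j_n}$$
with coefficients in a number field $K$, the height of $f$ is defined to be
$$h(f)=\sum_{v \in M_K}\log|f|_v,$$
where
$$|f|_v=\max_{j_1,\dots,j_n}|a_{j_1\dots j_n}|_v$$
is the Gauss norm for any place $v$.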
Here, $M_K$ refers to the canonical set of non-equivalent places on $K$, with absolute values normalised so that the product formula holds. See the first four pages of this document for a reference.
As one can expect, the height measures the arithmetic complexity of a polynomial, just as the height of an algebraic number measures the complexity of that number. Let us compute some examples.
Computing Heights
Let us consider the simplest one,
$$f(x)=x^2-1,$$
first. Since $|x^2-1|_v=1$ for all places $v$, every term $\log|f|_v$ vanishes and $h(f)=0$.
Next, we take care of a polynomial that involves prime numbers
We see $|g(x)|_\infty=2$, $|g(x)|_2=2^{-(-2)}=4$, $|g(x)|_3=3^{-(-1)}=3$, and the Gauss norm is $1$ for all other primes. Therefore
$$h(g)=\log 2+\log 4+\log 3=\log 24.$$
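Put $u(x,y)=\sqrt{2}x^2 + 3\sqrt{2}xy+5y^2+7 \in \mathbb{Q}(\sqrt{2})[x,y]$. We can compute its height carefully. Notice that $|\sqrt{2}|_v=\sqrt{|2|_v}$ for all places $v$. At the unique place above $2$ the coefficients $5$ and $7$ are units, so the Gauss norm is $1$ there, and it is $1$ at every other finite place as well. At each of the two real places the largest coefficient is $7$, and with normalised absolute values these two places together contribute $\log 7$. We therefore have
$$h(u)=\log 7.$$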
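For polynomials with rational coefficients the sum over places can be evaluated exactly. The following is a minimal sketch, not taken from the original text: the function name `exp_height` is hypothetical, and it relies on the observation that for fractions $n_i/m_i$ in lowest terms the product of all $p$-adic Gauss norms equals $\operatorname{lcm}(m_i)/\gcd(n_i)$. The sample coefficients $2, \frac14, \frac13$ are chosen only because they have the Gauss norms $2$, $4$, $3$ discussed above; they are not claimed to be the polynomial $g$ of the text.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def exp_height(coeffs):
    """Return exp(h(f)) for a polynomial with rational coefficients.

    coeffs: the non-zero-or-zero coefficients of f in any order
    (ints or Fractions). Zero coefficients are ignored.
    """
    cs = [Fraction(c) for c in coeffs if c != 0]
    if not cs:
        raise ValueError("the zero polynomial has no height")
    # Archimedean Gauss norm: largest usual absolute value.
    arch = max(abs(c) for c in cs)
    # Product of all p-adic Gauss norms: lcm(denominators) / gcd(numerators).
    num_gcd = reduce(gcd, (abs(c.numerator) for c in cs))
    den_lcm = reduce(lambda a, b: a * b // gcd(a, b),
                     (c.denominator for c in cs))
    return arch * Fraction(den_lcm, num_gcd)


print(exp_height([1, 0, -1]))                          # x^2 - 1
print(exp_height([2, Fraction(1, 4), Fraction(1, 3)])) # hypothetical sample
```

The height itself is `math.log(float(exp_height(...)))`; keeping the exponential as a `Fraction` avoids rounding until the last step.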
Height and Products
If $f \in K[s_1,\dots,s_n]$ and $g \in K[t_1,\dots,t_m]$ are two polynomials in different variables, then as a polynomial in $K[s_1,\dots,s_n;t_1,\dots,t_m]$, $fg$ has height $h(f)+h(g)$. This is immediately realised once we notice that the height of a polynomial is equal to the height of the vector of coefficients in appropriate projective space. The identity $h(fg)=h(f)+h(g)$ follows from the Segre embedding.
But if the variables coincide, things are different. For example, $h(x+1)=0$ but $h((x+1)^2)=h(x^2+2x+1)=\log 2$. This is because we do not have $|fg|_\infty=|f|_\infty|g|_\infty$ in general. Nevertheless, for non-Archimedean places, things are easier.
Gauss’s lemma. If $v$ is not Archimedean, then $|fg|_v=|f|_v|g|_v$.
Proof. First of all, it suffices to prove it in the univariate case. If $f$ and $g$ have multiple variables $x_1,\dots,x_n$, let $d$ be an integer greater than the degree of $fg$. Then the Kronecker substitution
$$x_j \mapsto t^{d^{j-1}}, \quad j=1,\dots,n,$$
reduces our study to $K[t]$. This is because, with such a $d$, distinct monomials in $x_1,\dots,x_n$ are sent to distinct powers of $t$, so the substitution gives a univariate polynomial with the same set of coefficients.
Therefore we only need to show that $|f(t)g(t)|_v=|f(t)|_v|g(t)|_v$. Rescaling $f$ and $g$ by constants, we may assume without loss of generality that $|f(t)|_v=|g(t)|_v=1$. Write $f(t)=\sum a_k t^k$ and $g(t)=\sum b_k t^k$, so that $f(t)g(t)=\sum c_jt^j$ where $c_j=\sum_{j=k+l}a_kb_l$. The ultrametric inequality gives $|c_j|_v \le 1$ for all $j$, hence $|fg|_v \le 1$.
We suppose that $|fg|_v<1$, i.e., $|c_j|_v<1$ for all $j$, and derive a contradiction. Let $k$ be the smallest index with $|a_k|_v=1$, so that $|a_i|_v<1$ for $i<k$. We show by induction on $l$ that $|b_l|_v<1$ for all $l$. In
$$c_{k+l}=a_kb_l+\sum_{i<k}a_ib_{k+l-i}+\sum_{i>k}a_ib_{k+l-i},$$
every term of the first sum has absolute value less than $1$ because $|a_i|_v<1$, and every term of the second sum has absolute value less than $1$ by the induction hypothesis, since $k+l-i<l$. As $|c_{k+l}|_v<1$, the ultrametric inequality gives $|a_kb_l|_v=|b_l|_v<1$. It follows that $|g(t)|_v<1$, a contradiction. $\square$
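Gauss’s lemma is easy to test numerically over $\mathbb{Q}$. The sketch below is only an illustration under stated assumptions: the helper names `vp`, `gauss_norm` and `poly_mul` are hypothetical, polynomials are lists of rational coefficients with the constant term first, and the sample polynomials are arbitrary. It also shows the Archimedean failure with $(x+1)^2$.

```python
from fractions import Fraction


def vp(x, p):
    """p-adic valuation of a non-zero rational number x."""
    x = Fraction(x)
    n, m, v = x.numerator, x.denominator, 0
    while n % p == 0:
        n //= p
        v += 1
    while m % p == 0:
        m //= p
        v -= 1
    return v


def gauss_norm(coeffs, p):
    """p-adic Gauss norm: max over coefficients of p^(-v_p(a))."""
    return Fraction(p) ** (-min(vp(c, p) for c in coeffs if c != 0))


def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += Fraction(a) * Fraction(b)
    return out


f = [Fraction(1, 2), 3]   # 1/2 + 3x
g = [4, Fraction(1, 6)]   # 4 + x/6
for p in (2, 3, 5):
    # Multiplicativity at each non-Archimedean place tested.
    print(p, gauss_norm(poly_mul(f, g), p) == gauss_norm(f, p) * gauss_norm(g, p))

# Archimedean counterexample: (x+1)^2 has largest coefficient 2, not 1*1.
print(max(abs(c) for c in poly_mul([1, 1], [1, 1])))
```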
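So much for the non-Archimedean case. The Archimedean case is more complicated, and we do not have enough space to cover it in detail. Nevertheless, we have
Gelfond’s lemma. Let $f_1,\dots,f_m$ be complex polynomials in $n$ variables and set $f=f_1\cdots f_m$. Then
$$2^{-d}\prod_{j=1}^{m}\ell_\infty(f_j) \le \ell_\infty(f) \le 2^{d}\prod_{j=1}^{m}\ell_\infty(f_j),$$
where $d$ is the sum of the partial degrees of $f$, and $\ell_\infty(f)=\max_j|a_j|=|f|_\infty$.
Combining Gelfond’s lemma and Gauss’s lemma, we obtain, for polynomials $f_1,\dots,f_m$ over a number field and $f=f_1\cdots f_m$,
$$-d\log 2+\sum_{j=1}^{m}h(f_j) \le h(f) \le d\log 2+\sum_{j=1}^{m}h(f_j).$$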
Mahler Measure
The Mahler measure was not actually introduced by Mahler. It is named after him because he extended it to the multivariable case in an elegant way. We will cover the original motivation anyway.
Original Version and Lehmer’s Conjecture
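Say we want to find large prime numbers. Pierce came up with an idea. Consider $p(x) \in \mathbb{Z}[x]$, which is factored into
$$p(x)=a\prod_{i=1}^{d}(x-\alpha_i).$$
Consider $\Delta_n=\prod_i(\alpha^n_i-1)$. When $p$ is monic this is a symmetric expression in the roots, so by some Galois theory it is indeed an integer. So perhaps we may find some interesting primes among the factors of $\Delta_n$. Also, we expect it to grow slowly. Lehmer studied $\frac{\Delta_{n+1}}{\Delta_n}$ and observed that
$$\lim_{n \to \infty}\frac{|\alpha^{n+1}-1|}{|\alpha^n-1|}=\begin{cases}|\alpha| & \text{if } |\alpha|>1,\\ 1 & \text{if } |\alpha|<1.\end{cases}$$
So it makes sense to compare all roots of $p(x)$ with $1$. He therefore suggested the following function related to $p(x)$:
$$M(p)=|a|\prod_{i=1}^{d}\max\{1,|\alpha_i|\}.$$
For monic $p$ with no roots on the unit circle, this number appears if we consider $\lim_{n \to \infty}|\Delta_{n+1}/\Delta_n|$.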
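The integers $\Delta_n=\prod_i(\alpha_i^n-1)$ can be computed exactly without finding any roots. The sketch below is one possible approach, not necessarily the one Pierce or Lehmer used: for a monic integer polynomial with companion matrix $C$, the eigenvalues of $C^n-I$ are the numbers $\alpha_i^n-1$, so $\Delta_n=\det(C^n-I)$. All function names are hypothetical, and coefficients are listed with the leading coefficient $1$ first.

```python
from fractions import Fraction


def companion(coeffs):
    """Companion matrix of a monic polynomial [1, c_{d-1}, ..., c_0]."""
    d = len(coeffs) - 1
    C = [[0] * d for _ in range(d)]
    for i in range(1, d):
        C[i][i - 1] = 1                # ones on the subdiagonal
    for i in range(d):
        C[i][d - 1] = -coeffs[d - i]   # last column: -c_0, ..., -c_{d-1}
    return C


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(A, n):
    """n-th power of a square integer matrix by repeated squaring."""
    size = len(A)
    R = [[int(i == j) for j in range(size)] for i in range(size)]
    while n > 0:
        if n & 1:
            R = mat_mul(R, A)
        A = mat_mul(A, A)
        n >>= 1
    return R


def det(M):
    """Exact determinant of an integer matrix via rational elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    n = len(M)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            result = -result
        result *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= factor * M[col][c]
    return int(result)


def delta(coeffs, n):
    """Delta_n = det(C^n - I) for the monic polynomial given by coeffs."""
    Cn = mat_pow(companion(coeffs), n)
    d = len(Cn)
    return det([[Cn[i][j] - (i == j) for j in range(d)] for i in range(d)])


p = [1, 0, -1, -1]                     # x^3 - x - 1
print(delta(p, 101) / delta(p, 100))   # close to M(p) = 1.3247...
```

For $x^3-x-1$ only one root lies outside the unit circle, so the printed ratio approximates that root, which is $M(p)$.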
He also asked the following question, which is now known as Lehmer’s conjecture, although in his paper he posed it as a problem rather than a conjecture:
Is there a constant $c>1$ such that $M(p)>1 \implies M(p) \ge c$ for every $p \in \mathbb{Z}[x]$?
It remains open but we can mention some key bounds.
- Lehmer himself found that
$$M(x^{10}+x^9-x^7-x^6-x^5-x^4-x^3+x+1)=1.176280818\dots$$
and this is still the smallest value greater than $1$ that has ever been discovered. It was because of this discovery that he posed his problem.
This polynomial has also led to the discovery of a large prime number $\sqrt{\Delta_{379}}=1,794,327,140,357$, although by studying $x^3-x-1$, a bigger prime number $\Delta_{127}=3,233,514,251,032,733$ has been found.
- Breusch (and later Smyth, with the sharp constant) discovered that if $p(x) \ne x$ is monic, irreducible and nonreciprocal, i.e. it does not satisfy $p(x)=\pm x^{\deg p}p(1/x)$, then
$$M(p) \ge M(x^3-x-1)=1.3247\dots$$
- E. Dobrowolski found that if $p(x)$ is monic, irreducible and noncyclotomic, and has degree $d$, then
$$M(p)>1+c\left(\frac{\log\log d}{\log d}\right)^3$$
for some absolute constant $c>0$.
The General Version and Jensen’s Formula
Definition. For $f \in \mathbb{C}[x_1,\dots,x_n]$, the Mahler measure is defined to be
$$M(f)=\exp\left(\int_{\mathbb{T}^n}\log\left|f(e^{i\theta_1},\dots,e^{i\theta_n})\right|d\mu_1\cdots d\mu_n\right),$$
where $d\mu_i=\frac{1}{2\pi}d\theta_i$, i.e., $d\mu_1\dots d\mu_n$ corresponds to the (completion of) Haar measure on $\mathbb{T}^n$ with total measure $1$.
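We see through Jensen’s formula that when $n=1$ this coincides with what we have defined before. Observe first that $M(fg)=M(f)M(g)$. Consider $f(t)=a\prod_{i=1}^{d}(t-\alpha_i)$, then
$$\log M(f)=\log|a|+\sum_{i=1}^{d}\log M(t-\alpha_i).$$
On the other hand, as an exercise in complex analysis, one can show that
$$\log M(t-\alpha)=\frac{1}{2\pi}\int_0^{2\pi}\log|e^{i\theta}-\alpha|d\theta=\log\max\{1,|\alpha|\}=\log^+|\alpha|.$$
Combining them, we see
$$M(f)=|a|\prod_{i=1}^{d}\max\{1,|\alpha_i|\}.$$
Taking the logarithm we also obtain Jensen’s formula
$$\frac{1}{2\pi}\int_0^{2\pi}\log|f(e^{i\theta})|d\theta=\log|a|+\sum_{i=1}^{d}\log^+|\alpha_i|.$$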
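The agreement between the integral definition and the product $|a|\prod_i\max\{1,|\alpha_i|\}$ can be checked numerically in one variable. This is a rough sketch under assumptions: the names `mahler_from_roots` and `mahler_from_circle` are hypothetical, the integral is approximated by an average over equally spaced points of the unit circle, and the sample polynomial $2(t-3)(t-\tfrac12)=2t^2-7t+3$ is chosen so that no root lies on the circle, where the integrand has logarithmic singularities.

```python
import cmath
import math


def mahler_from_roots(a, roots):
    """|a| times the product of max(1, |alpha|) over the roots."""
    m = abs(a)
    for r in roots:
        m *= max(1.0, abs(r))
    return m


def mahler_from_circle(coeffs, samples=2048):
    """exp of the average of log|f| over sample points of the unit circle.

    coeffs lists the coefficients with the highest degree first.
    """
    total = 0.0
    for k in range(samples):
        z = cmath.exp(2j * math.pi * k / samples)
        value = 0
        for c in coeffs:          # Horner evaluation of f(z)
            value = value * z + c
        total += math.log(abs(value))
    return math.exp(total / samples)


print(mahler_from_roots(2, [3, 0.5]))     # product formula
print(mahler_from_circle([2, -7, 3]))     # integral definition
```

Both calls should agree up to floating point error for this example; for polynomials with roots on or very near the unit circle the plain average converges much more slowly.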
We first give a reasonable and useful estimate of $M(f)$, which will be used to prove Northcott’s theorem.
Definition. For $f(t)=a_dt^d+\dots+a_0$ and $1 \le p<\infty$, the $\ell_p$-norm of $f$ is naturally defined to be
$$\ell_p(f)=\left(\sum_{j=0}^{d}|a_j|^p\right)^{1/p}.$$
For $p=\infty$, we have $\ell_\infty(f)=\max_j|a_j|$.
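Lemma 1. Notation being as above, $M(f) \le \ell_1(f)$ and
$$\binom{d}{\lfloor d/2 \rfloor}^{-1}\ell_\infty(f) \le M(f) \le \ell_2(f) \le \sqrt{d+1}\,\ell_\infty(f).$$
Proof. To begin with, we observe the obvious ones. First of all,
$$|f(e^{i\theta})| \le \sum_{j=0}^{d}|a_j|=\ell_1(f), \qquad \ell_2(f)^2=\sum_{j=0}^{d}|a_j|^2 \le (d+1)\ell_\infty(f)^2.$$
Therefore
$$M(f) \le \ell_1(f), \qquad \ell_2(f) \le \sqrt{d+1}\,\ell_\infty(f).$$
Next, by Jensen’s inequality
$$M(f)^2=\exp\left(\int_{\mathbb{T}}\log|f(e^{i\theta})|^2d\mu\right) \le \int_{\mathbb{T}}|f(e^{i\theta})|^2d\mu.$$
However, by Parseval’s formula, the last term equals
$$\sum_{j=0}^{d}|a_j|^2=\ell_2(f)^2.$$
For the remaining inequality, we use Vieta’s formula
$$a_{d-r}=(-1)^ra_d\sum_{j_1<\dots<j_r}\alpha_{j_1}\cdots\alpha_{j_r}$$
and therefore
$$|a_{d-r}| \le \binom{d}{r}|a_d|\prod_{j=1}^{d}\max\{1,|\alpha_j|\}=\binom{d}{r}M(f) \le \binom{d}{\lfloor d/2 \rfloor}M(f)$$
for all $0 \le r \le d$. Choosing $r$ with $|a_{d-r}|=\ell_\infty(f)$, we have finished the proof. $\square$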
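The chain $\binom{d}{\lfloor d/2\rfloor}^{-1}\ell_\infty(f) \le M(f) \le \ell_2(f) \le \sqrt{d+1}\,\ell_\infty(f)$, together with $M(f) \le \ell_1(f)$, can be sanity-checked on a polynomial whose roots are known. The following sketch makes its own assumptions: the helper names `expand` and `lemma1_values` are hypothetical, the polynomial is given by its leading coefficient and roots, and the coefficients are recovered by multiplying out the linear factors (which is Vieta’s formula in disguise).

```python
import math


def expand(a, roots):
    """Coefficients (highest degree first) of a * prod (t - r)."""
    coeffs = [a]
    for r in roots:
        new = coeffs + [0]            # multiply by t
        for i, c in enumerate(coeffs):
            new[i + 1] -= r * c       # subtract r times the old polynomial
        coeffs = new
    return coeffs


def lemma1_values(a, roots):
    """Return (lower bound, M(f), l_2, sqrt(d+1) * l_inf, l_1)."""
    coeffs = expand(a, roots)
    d = len(coeffs) - 1
    m = abs(a)
    for r in roots:
        m *= max(1.0, abs(r))
    l_inf = max(abs(c) for c in coeffs)
    l_1 = sum(abs(c) for c in coeffs)
    l_2 = math.sqrt(sum(abs(c) ** 2 for c in coeffs))
    lower = l_inf / math.comb(d, d // 2)
    return lower, m, l_2, math.sqrt(d + 1) * l_inf, l_1


lower, m, l_2, upper, l_1 = lemma1_values(2, [3, 0.5])   # 2t^2 - 7t + 3
print(lower <= m <= l_2 <= upper, m <= l_1)
```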
Before proving Northcott’s theorem, we show the connection between Mahler measure and heights.
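Proposition 1. Let $\alpha \in \overline{\mathbb{Q}}$ and let $f$ be the minimal polynomial of $\alpha$ over $\mathbb{Z}$. Then
$$\log M(f)=\deg(\alpha)h(\alpha)$$
and
$$\log|N_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha)| \le \deg(\alpha)h(\alpha).$$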
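Proof. Put $d=\deg(\alpha)$ and write
$$f(t)=a_dt^d+\dots+a_0=a_d\prod_{j=1}^{d}(t-\alpha_j).$$
Choose a number field $K$ that contains $\alpha$ and is a Galois extension of $\mathbb{Q}$, with Galois group $G$. Then $(\sigma\alpha:\sigma \in G)$ contains every conjugate of $\alpha$ exactly $[K:\mathbb{Q}]/d$ times. Since $a_0,\dots,a_d$ are coprime, for any non-Archimedean absolute value $v \in M_K$, we must have $\max_i|a_i|_v=|f|_v=1$. Combining with Gauss’s lemma and Galois theory, we see
$$1=|f|_v=|a_d|_v\prod_{j=1}^{d}\max\{1,|\alpha_j|_v\}=|a_d|_v\prod_{\sigma \in G}\max\{1,|\sigma\alpha|_v\}^{d/[K:\mathbb{Q}]}.$$
Now we are ready to compute the height of $\alpha$ to rediscover the Mahler measure. Notice that
$$h(\sigma\alpha)=h(\alpha) \quad \text{for all } \sigma \in G.$$
We therefore obtain
$$[K:\mathbb{Q}]h(\alpha)=\sum_{\sigma \in G}h(\sigma\alpha)=\sum_{v \mid \infty}\sum_{\sigma \in G}\log^+|\sigma\alpha|_v+\sum_{v \nmid \infty}\sum_{\sigma \in G}\log^+|\sigma\alpha|_v.$$
The last term corresponds to what we have computed above about non-Archimedean absolute values so we break it down a little bit:
$$\sum_{v \nmid \infty}\sum_{\sigma \in G}\log^+|\sigma\alpha|_v=-\frac{[K:\mathbb{Q}]}{d}\sum_{v \nmid \infty}\log|a_d|_v=\frac{[K:\mathbb{Q}]}{d}\sum_{u \mid \infty}\log|a_d|_u=\frac{[K:\mathbb{Q}]}{d}\log|a_d|,$$
where the middle equality is the product formula and $|a_d|$ is the usual absolute value of the integer $a_d$. On the other hand, for $v \mid \infty$, each $\sigma\alpha$ is one of the conjugates $\alpha_j$, and summing the normalised absolute values over all Archimedean places recovers the usual complex absolute value of the conjugates, so
$$\sum_{v \mid \infty}\sum_{\sigma \in G}\log^+|\sigma\alpha|_v=\frac{[K:\mathbb{Q}]}{d}\sum_{j=1}^{d}\log^+|\alpha_j|.$$
All in all,
$$d\,h(\alpha)=\log|a_d|+\sum_{j=1}^{d}\log^+|\alpha_j|=\log M(f).$$
The second assertion follows immediately because
$$|N_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha)|=\prod_{j=1}^{d}|\alpha_j| \le |a_d|\prod_{j=1}^{d}\max\{1,|\alpha_j|\}=M(f). \quad \square$$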
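As a quick sanity check of the identity $\log M(f)=\deg(\alpha)h(\alpha)$, here are two worked examples; they are illustrations added for concreteness and use only the standard fact that $h(p/q)=\log\max\{|p|,|q|\}$ for a fraction in lowest terms. For $\alpha=\sqrt{2}$ the minimal polynomial is $f(t)=t^2-2$ with roots $\pm\sqrt{2}$, so
$$M(f)=\max\{1,\sqrt{2}\}\cdot\max\{1,\sqrt{2}\}=2, \qquad h(\sqrt{2})=\frac{1}{2}\log M(f)=\frac{1}{2}\log 2,$$
and the norm inequality is an equality here, since $\log|N_{\mathbb{Q}(\sqrt{2})/\mathbb{Q}}(\sqrt{2})|=\log 2=2h(\sqrt{2})$. For $\alpha=\frac{1}{2}$ the minimal polynomial over $\mathbb{Z}$ is $f(t)=2t-1$, so
$$M(f)=2\max\{1,\tfrac{1}{2}\}=2, \qquad h(\tfrac{1}{2})=\log 2=\log\max\{1,2\},$$
while $\log|N(\tfrac{1}{2})|=-\log 2<h(\tfrac{1}{2})$, so the inequality can be strict.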
Northcott’s Theorem
The non-zero algebraic integers of height $0$ all lie on the unit circle, and they are actually roots of unity, by Kronecker’s theorem. However, keep in mind that algebraic integers on the unit circle are not necessarily roots of unity. See this short paper.
When it comes to algebraic integers of small heights, things may get complicated, but Northcott’s theorem assures that we will be studying a finite set.
Northcott’s Theorem. Given an integer $N>0$ and a real number $H \ge1$, there are only a finite number of algebraic integers $\alpha$ satisfying $\deg(\alpha) \le N$ and $h(\alpha) \le \log H$.
Proof. Let $\alpha$ be an algebraic integer of degree $d \le N$ and height $h(\alpha) \le \log H$. Suppose $f(t)=a_dt^d+\dots+a_0 \in \mathbb{Z}[t]$ is the minimal polynomial of $\alpha$. Then Lemma 1 shows us that
$$\ell_\infty(f) \le \binom{d}{\lfloor d/2 \rfloor}M(f) \le 2^dM(f).$$
On the other hand, by Proposition 1,
$$M(f)=e^{dh(\alpha)} \le H^d,$$
so we have actually
$$\max_j|a_j|=\ell_\infty(f) \le (2H)^d.$$
This gives rise to no more than $(2\lfloor (2H)^d \rfloor+1)^{d+1}$ distinct polynomials $f$, which produce at most $d(2\lfloor (2H)^d \rfloor+1)^{d+1}<\infty$ algebraic integers. Ranging through all $d \le N$ we get what we want. $\square$
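The counting argument in the proof translates directly into an explicit, though very crude, upper bound. The sketch below only evaluates the bound $d(2\lfloor(2H)^d\rfloor+1)^{d+1}$ and sums it over $d \le N$; the function names are hypothetical, and the numbers it returns are far larger than the true number of algebraic integers of bounded degree and height.

```python
import math


def polynomial_count_bound(d, H):
    """Number of integer coefficient vectors with |a_j| <= (2H)^d."""
    box = 2 * math.floor((2 * H) ** d) + 1   # choices per coefficient
    return box ** (d + 1)                    # d + 1 coefficients


def northcott_bound(N, H):
    """Upper bound for the number of algebraic integers with
    degree <= N and height <= log H, following the proof."""
    return sum(d * polynomial_count_bound(d, H) for d in range(1, N + 1))


print(polynomial_count_bound(1, 1))
print(northcott_bound(2, 1))
```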
We also have the Northcott property, where we do not care about degrees. A set $L$ of algebraic integers is said to satisfy the Northcott property if, for every $T>0$, the set
$$\{\alpha \in L:h(\alpha)<T\}$$
is finite. Such a set $L$ is said to satisfy the Bogomolov property if there exists $T>0$ such that the set
$$\{\alpha \in L:0<h(\alpha)<T\}$$
is empty. As a matter of elementary topology, the Northcott property implies the Bogomolov property. The case where $L$ is a field is of particular interest. This paper can be quite interesting.
References / Further Reading
Enrico Bombieri, Walter Gubler, Heights in Diophantine Geometry.
Michel Waldschmidt, Diophantine Approximation on Linear Algebraic Groups: Transcendence Properties of the Exponential Function in Several Variables.
Chris Smyth, The Mahler Measure of Algebraic Numbers: A Survey.