Calculus on Fields - Heights of Polynomials, Mahler's Measure and Northcott's Theorem


Definition. For a polynomial with coefficients in a number field \(K\)

\[ f(t_1,\dots,t_n)=\sum_{j_1,\dots,j_n}a_{j_1\dots j_n}t_1^{j_1}\dots t_n^{j_n} =\sum_{\mathbf{j}}a_{\mathbf{j}}\mathbf{t}^{\mathbf{j}}, \]

the height of \(f\) is defined to be

\[ h(f)=\sum_{v \in M_K}\log|f|_v, \]

where

\[ |f|_v=\max_{\mathbf{j}}|a_{\mathbf{j}}|_v \]

is the Gauss norm for any place \(v\).

Here, \(M_K\) refers to the canonical set of non-equivalent places on \(K\). See the first four pages of this document for a reference.

As one might expect, the height measures the arithmetic complexity of a polynomial, just as the height of an algebraic number measures its complexity. Let us compute some examples.

Computing Heights

Let us consider the simplest one

\[ f(x)=x^2-1 \in \mathbb{Q}[x] \]

first. Since \(|x^2-1|_v=1\) for all places \(v\), the height of \(f\) is a sum of \(0\), which is still \(0\).

Next, we consider a polynomial whose coefficients have prime-power denominators

\[ g(x)=\frac{1}{4}x^4+\frac{1}{3}x^3+\frac{1}{2}x^2+x+2 \]

We see \(|g(x)|_\infty=2\), \(|g(x)|_2=2^{-(-2)}=4\), \(|g(x)|_3=3^{-(-1)}=3\), and the Gauss norm is \(1\) for all other primes. Therefore

\[ h(g)=\log{2}+\log{4}+\log{3}=3\log{2}+\log{3}. \]
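The two computations above can be reproduced mechanically. The following is a minimal Python sketch, not part of the original exposition. It relies on the observation that, for a nonzero polynomial with rational coefficients written in lowest terms, the product of all non-Archimedean Gauss norms equals the least common multiple of the denominators divided by the greatest common divisor of the numerators, so \(e^{h(f)}\) is an exact rational number. The name `exp_height` and the low-to-high ordering of coefficients are assumptions made for illustration.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def exp_height(coeffs):
    """Return exp(h(f)) for a nonzero polynomial f with rational coefficients.

    coeffs lists the coefficients of f (any order); zeros are ignored.
    """
    coeffs = [Fraction(c) for c in coeffs if c != 0]
    if not coeffs:
        raise ValueError("the zero polynomial has no height")
    # Archimedean Gauss norm: the largest ordinary absolute value.
    arch = max(abs(c) for c in coeffs)
    # Product of all p-adic Gauss norms: lcm(denominators) / gcd(numerators).
    num_gcd = reduce(gcd, (abs(c.numerator) for c in coeffs))
    den_lcm = reduce(lambda a, b: a * b // gcd(a, b),
                     (c.denominator for c in coeffs))
    return arch * Fraction(den_lcm, num_gcd)


# f(x) = x^2 - 1, so h(f) = log 1 = 0
print(exp_height([-1, 0, 1]))  # 1
# g(x) = x^4/4 + x^3/3 + x^2/2 + x + 2, so h(g) = log 24 = 3 log 2 + log 3
print(exp_height([2, 1, Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]))  # 24
```

Returning \(e^{h(f)}\) as a `Fraction` keeps the arithmetic exact; taking `math.log` of the result gives the height itself.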

Finally, put \(u(x,y)=\sqrt{2}x^2 + 3\sqrt{2}xy+5y^2+7 \in \mathbb{Q}(\sqrt{2})[x,y]\). The Gauss norm is the maximum over the coefficients, and the coefficients \(5\) and \(7\) are coprime, so at least one of them is a unit at every non-Archimedean place \(v\); hence \(|u|_v=1\) there. With normalized absolute values, the two Archimedean places, corresponding to \(\sqrt{2}\mapsto\pm\sqrt{2}\), each contribute \(\frac{1}{2}\log\max\{\sqrt{2},3\sqrt{2},5,7\}=\frac{1}{2}\log 7\). We therefore have

\[ h(u) = \sum_{v \mid \infty}\log|u|_v + \sum_{v \nmid \infty}\log|u|_v = 2\cdot\frac{1}{2}\log 7 + 0 = \log 7. \]

Height and Products

If \(f \in K[s_1,\dots,s_n]\) and \(g \in K[t_1,\dots,t_m]\) are two polynomials in different variables, then as a polynomial in \(K[s_1,\dots,s_n;t_1,\dots,t_m]\), \(fg\) has height \(h(f)+h(g)\). This is immediately realised once we notice that the height of a polynomial is equal to the height of the vector of coefficients in appropriate projective space. The identity \(h(fg)=h(f)+h(g)\) follows from the Segre embedding.

But if the variables coincide, things are different. For example, \(h(x+1)=0\) but \(h((x+1)^2)=h(x^2+2x+1)=\log 2\). This is because the identity \(|fg|_\infty=|f|_\infty|g|_\infty\) fails in general. Nevertheless, for non-Archimedean places, things are easier.

Gauss's lemma. If \(v\) is not Archimedean, then \(|fg|_v=|f|_v|g|_v\).

Proof. First of all, it suffices to prove the univariate case. If \(f\) and \(g\) are polynomials in several variables \(x_1,\dots,x_n\), let \(d\) be an integer greater than the degree of \(fg\). Then the Kronecker substitution

\[ x_j=t^{d^{j-1}} \]

reduces the problem to \(K[t]\): with such a \(d\), distinct monomials are sent to distinct powers of \(t\), so the substitution gives a univariate polynomial with the same set of coefficients.

Therefore we only need to show that \(|f(t)g(t)|_v=|f(t)|_v|g(t)|_v\). After scaling \(f\) and \(g\) by constants, we may assume that \(|f(t)|_v=|g(t)|_v=1\). Write \(f(t)=\sum a_k t^k\) and \(g(t)=\sum b_k t^k\), so that \(f(t)g(t)=\sum c_jt^j\) where \(c_j=\sum_{k+l=j}a_kb_l\). The ultrametric inequality gives \(|c_j|_v \le \max_{k+l=j}|a_k|_v|b_l|_v \le 1\), hence \(|fg|_v \le 1\).

Suppose that \(|fg|_v<1\), i.e., \(|c_j|_v<1\) for all \(j\), and let us derive a contradiction. Let \(r\) be the smallest index with \(|a_r|_v=1\) and \(s\) the smallest index with \(|b_s|_v=1\); both exist because \(|f|_v=|g|_v=1\). Then

\[ c_{r+s}=a_rb_s+\sum_{k+l=r+s,\ k \ne r}a_kb_l, \]

and every term in the remaining sum has \(k<r\) or \(l<s\), hence absolute value strictly less than \(1\). Since \(|a_rb_s|_v=1\), the ultrametric inequality forces \(|c_{r+s}|_v=1\), a contradiction. \(\square\)
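Gauss's lemma is easy to test on examples over \(\mathbb{Q}\). The following Python sketch is an illustration added here, not part of the original argument; the helper names `vp`, `gauss_norm` and `poly_mul` are hypothetical. It computes \(p\)-adic Gauss norms exactly and checks multiplicativity for a few primes, and it also shows that the Archimedean analogue fails for \((x+1)^2\).

```python
from fractions import Fraction


def vp(x, p):
    """p-adic valuation of a nonzero rational number x."""
    x = Fraction(x)
    n, d, v = x.numerator, x.denominator, 0
    while n % p == 0:
        n //= p
        v += 1
    while d % p == 0:
        d //= p
        v -= 1
    return v


def gauss_norm(coeffs, p):
    """|f|_p = p^(-min_j v_p(a_j)), returned as an exact Fraction."""
    m = min(vp(c, p) for c in coeffs if c != 0)
    return Fraction(p) ** (-m)


def poly_mul(f, g):
    """Product of two polynomials given as low-to-high coefficient lists."""
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += Fraction(a) * Fraction(b)
    return out


f = [2, 1, Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]  # g(x) from the text
g = [6, 2]                                                   # 2x + 6
for p in (2, 3, 5, 7):
    lhs = gauss_norm(poly_mul(f, g), p)
    rhs = gauss_norm(f, p) * gauss_norm(g, p)
    print(p, lhs, rhs, lhs == rhs)

# The Archimedean Gauss norm is not multiplicative: |(x+1)^2| = 2 but |x+1|^2 = 1.
print(max(abs(c) for c in poly_mul([1, 1], [1, 1])))
```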

So much for the non-Archimedean case. The Archimedean case is more complicated, and we do not have enough space to cover it here. Nevertheless, we have

Gelfond's lemma. Let \(f_1,\dots,f_m\) be complex polynomials in \(n\) variables and set \(f=f_1\cdots f_m\). Then

\[ 2^{-d}\prod_{j=1}^{m}\ell_\infty(f_j) \le \ell_\infty (f) \le 2^d\prod_{j=1}^{m}\ell_\infty(f_j), \]

where \(d\) is the sum of the partial degrees of \(f\), and \(\ell_\infty(f)=\max_j|a_j|=|f|_\infty\).

Combining Gelfond's lemma and Gauss's lemma, we obtain

\[ \left|h(f)-\sum_{j=1}^{m}h(f_j) \right| \le d\log{2}. \]

Mahler Measure

The Mahler measure was not originally introduced by Mahler. It was named after him because he successfully extended it to the multivariable case in an elegant way. We will cover the original motivation anyway.

Original Version and Lehmer's Conjecture

Say we want to find large prime numbers. Pierce came up with the following idea. Consider \(p(x) \in \mathbb{Z}[x]\), which factors as

\[ p(x)=a\prod_i(x-\alpha_i). \]

Consider \(\Delta_n=\prod_i(\alpha^n_i-1)\). When \(p\) is monic, Galois theory shows that this is indeed an integer, so we may hope to find interesting primes among the factors of \(\Delta_n\). We would also like \(\Delta_n\) to grow slowly. Lehmer studied \(\frac{\Delta_{n+1}}{\Delta_n}\) and observed that

\[ \lim_{n \to \infty}\frac{|\alpha^{n+1}-1|}{|\alpha^n-1|}= \begin{cases} |\alpha|, & |\alpha|>1, \\ 1, & |\alpha|<1. \end{cases} \]

So it makes sense to compare all roots of \(p(x)\) with \(1\). He therefore suggested the following function related to \(p(x)\):

\[ M(p)=|a|\prod_i \max\{1,|\alpha_i|\}. \]

This number appears as \(\lim_{n \to \infty}|\Delta_{n+1}/\Delta_n|\) when \(p\) is monic and has no roots on the unit circle.
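To see the growth rate of \(\Delta_n\) concretely, here is a small Python sketch added for illustration. It assumes \(p\) is monic and uses the standard fact that \(\Delta_n=\det(C^n-I)\), where \(C\) is the companion matrix of \(p\), so that everything stays in exact integer arithmetic. The function names are hypothetical, and the coefficient lists run from the constant term upwards.

```python
from fractions import Fraction


def companion(coeffs):
    """Companion matrix of a monic polynomial (low-to-high, coeffs[-1] == 1)."""
    d = len(coeffs) - 1
    C = [[0] * d for _ in range(d)]
    for i in range(1, d):
        C[i][i - 1] = 1           # ones on the subdiagonal
    for i in range(d):
        C[i][d - 1] = -coeffs[i]  # last column carries the coefficients
    return C


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det(M):
    """Determinant of an integer matrix by exact Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    n, result = len(M), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            result = -result
        result *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= factor * M[col][c]
    return int(result)


def pierce_delta(coeffs, n):
    """Delta_n = prod_i (alpha_i^n - 1), computed as det(C^n - I)."""
    C = companion(coeffs)
    d = len(C)
    P = [[int(i == j) for j in range(d)] for i in range(d)]
    for _ in range(n):
        P = mat_mul(P, C)
    return det([[P[i][j] - int(i == j) for j in range(d)] for i in range(d)])


cubic = [-1, -1, 0, 1]  # x^3 - x - 1
ratio = Fraction(pierce_delta(cubic, 201), pierce_delta(cubic, 200))
print(abs(float(ratio)))  # approaches the real root 1.3247... of x^3 - x - 1
```

For \(x^3-x-1\) the only root outside the unit circle is the real one, so the ratio settles at that root, in line with Lehmer's observation.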

He also asked the following question, now known as Lehmer's conjecture, although in his paper he posed it as a problem rather than a conjecture:

Is there a constant \(c>1\) such that \(M(p)>1 \implies M(p) \ge c\)?

It remains open, but we can mention some key results.

  • Lehmer himself found that

\[ M(x^{10}+x^9-x^7-x^6-x^5-x^4-x^3+x+1)=1.1762808\cdots \]

and this is still the smallest known value of \(M(p)\) greater than \(1\). It was this discovery that led him to pose his problem.

This polynomial has also led to the discovery of a large prime number \(\sqrt{\Delta_{379}}=1{,}794{,}327{,}140{,}357\), although by studying \(x^3-x-1\), an even bigger prime number \(\Delta_{127}=3{,}233{,}514{,}251{,}032{,}733\) was found.

  • Smyth, improving an earlier bound of Breusch, showed that if \(p\) is monic, irreducible and nonreciprocal, i.e. it does not satisfy \(p(x)=\pm x^{\deg p}p(1/x)\), then

\[ M(p) \ge 1.3247179\cdots=\text{ real root of $x^3-x-1$.} \]

  • E. Dobrowolski found that if \(p(x)\) is monic, irreducible and noncyclotomic, and has degree \(d\), then

\[ M(p)>1+c\left( \frac{\log\log d}{\log d} \right)^3 \]

for some \(c>0\).

The General Version and Jensen's Formula

Definition. For \(f \in \mathbb{C}[x_1,\dots,x_n]\), the Mahler measure is defined to be

\[ M(f)=\exp\left(\int_{\mathbb{T}^n}\log|f( e^{i\theta_1},\dots,e^{i\theta_n} )|d\mu_1\dots d\mu_n \right), \]

where \(d\mu_i=\frac{1}{2\pi}d\theta_i\), i.e., \(d\mu_1\dots d\mu_n\) corresponds to the (completion of) Haar measure on \(\mathbb{T}^n\) with total measure \(1\).

We see through Jensen's formula that when \(n=1\) this coincides with what we have defined before. Observe first that \(M(fg)=M(f)M(g)\). Consider \(f(t)=a\prod_{i=1}^{d}(t-\alpha_i)\), then

\[ M(f)=M(a)\prod_{i=1}^{d}M(t-\alpha_i)=|a|\prod_{i=1}^{d}M(t-\alpha_i). \]

On the other hand, as an exercise in complex analysis, one can show that

\[ \int_0^{2\pi}\log|e^{i\theta}-\alpha_i|d\mu=\log^+|\alpha_i|=\log\max\{1,|\alpha_i|\}. \]

Combining them, we see

\[ M(f)=|a|\prod_{i=1}^{d}\exp\left(\log^+|\alpha_i|\right) =|a|\prod_{i=1}^{d}\max\{1,|\alpha_i|\}. \]

Taking the logarithm we also obtain Jensen's formula

\[ \log M(f)=\log|a|+\sum_{i=1}^{d}\log^+|\alpha_i|. \]
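The integral identity used in this derivation can be checked numerically. The sketch below is an added illustration under the assumption that \(\alpha\) does not lie on the unit circle, where the integrand has a logarithmic singularity. It approximates the mean of \(\log|e^{i\theta}-\alpha|\) with the midpoint rule; the function names are hypothetical.

```python
import cmath
import math


def circle_mean_log(alpha, n=4096):
    """Midpoint-rule mean of log|e^{i theta} - alpha| over the unit circle."""
    total = 0.0
    for k in range(n):
        theta = 2 * math.pi * (k + 0.5) / n
        total += math.log(abs(cmath.exp(1j * theta) - alpha))
    return total / n


def log_plus(x):
    """log^+ x = log max(1, x) for real x >= 0."""
    return math.log(max(1.0, x))


# Compare the numerical mean with log^+ |alpha| for points inside and outside the circle.
for alpha in (2, 0.5, 1 + 1j, -0.3j):
    print(alpha, circle_mean_log(alpha), log_plus(abs(alpha)))
```

Because the integrand is smooth and periodic when \(|\alpha| \ne 1\), the midpoint rule converges very quickly, and the two printed columns agree to many digits.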

We first give a reasonable and useful estimate of \(M(f)\), which will be used to prove Northcott's theorem.

Definition. For \(f(t)=a_dt^d+\dots+a_0\), the \(\ell_p\)-norm of \(f\) is naturally defined to be

\[ \ell_p(f)=\left(\sum_{j=0}^{d}|a_j|^p \right)^{\frac{1}{p}}. \]

For \(p=\infty\), we have \(\ell_\infty(f)=\max_j|a_j|\).

Lemma 1. With the notation above, \(M(f) \le \ell_1(f)\) and

\[ {d \choose \lfloor{d/2}\rfloor }^{-1} \ell_\infty(f) \le M(f) \le \ell_2(f) \le \sqrt{d+1}\ell_\infty(f). \]

Proof. We begin with the easy inequalities. First of all,

\[ \begin{aligned} |f(e^{i\theta})|&=|a_d e^{id\theta}+\dots+a_0| \\ &\le |a_de^{id\theta}|+\dots+|a_0| \\ &=|a_d|+\dots+|a_0| \\ &=\ell_1(f). \end{aligned} \]


Therefore

\[ \begin{aligned} M(f) &=\exp\left(\int_0^{2\pi}\log|f(e^{i\theta})|d\mu \right) \\ &\le \exp\left(\int_0^{2\pi}\log\ell_1(f)d\mu \right) \\ &=\ell_1(f). \end{aligned} \]

Next, by Jensen's inequality

\[ \begin{aligned} M(f) &=\exp\left(\frac{1}{2}\int_0^{2\pi}\log|f(e^{i\theta}) |^2d\mu\right) \\ &=\sqrt{\exp\left( \int_0^{2\pi} \log|f(e^{i\theta})|^2d\mu \right)} \\ &\le \sqrt{\int_0^{2\pi}|f(e^{i\theta})|^2d\mu} \\ \end{aligned} \]

By Parseval's formula, the last term equals

\[ \ell_2(f)=\sqrt{\sum_{j=0}^{d}|a_j|^2} \le \sqrt{d+1}\ell_\infty(f). \]

For the remaining inequality, we use Vieta's formula

\[ \left|\frac{a_{d-r}}{a_d} \right| = \left|\sum_{j_1 < \dots< j_r} \alpha_{j_1} \cdots \alpha_{j_r} \right| \]

and therefore

\[ |a_{d-r}| \le {d \choose r}|a_d|\prod_{j=1}^{d} \max\{1,|\alpha_j|\}={d \choose r}M(f) \le {d \choose \lfloor{d/2}\rfloor }M(f) \]

for all \(0 \le r \le d\). Taking the maximum over \(r\) on the left-hand side gives \(\ell_\infty(f) \le {d \choose \lfloor{d/2}\rfloor }M(f)\), which finishes the proof. \(\square\)
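Lemma 1 also suggests a way to approximate \(M(f)\) for integer polynomials without finding roots. The following Python sketch is an addition that uses Graeffe root squaring, a technique not discussed in the text. The polynomial \(g\) with \(g(x^2)=(-1)^df(x)f(-x)\) has the squares of the roots of \(f\) as its roots, so \(M(g)=M(f)^2\). Applying Lemma 1 after \(m\) squarings and taking \(2^m\)-th roots squeezes \(M(f)\) between two computable numbers. The function names and the low-to-high coefficient ordering are assumptions for illustration.

```python
from math import comb, exp, log


def graeffe_step(a):
    """Coefficients (low to high) of g, where g(x^2) = (-1)^d f(x) f(-x)."""
    d = len(a) - 1
    b = [(-1) ** k * c for k, c in enumerate(a)]  # coefficients of f(-x)
    prod = [0] * (2 * d + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += ai * bj
    sign = (-1) ** d
    # Only even powers survive in f(x) f(-x).
    return [sign * prod[2 * k] for k in range(d + 1)]


def mahler_bounds(a, steps=16):
    """Lower and upper bounds for M(f), f given by integer coefficients with a[-1] != 0."""
    d = len(a) - 1
    for _ in range(steps):
        a = graeffe_step(a)
    n = 2 ** steps
    linf = max(abs(c) for c in a)
    l2_squared = sum(c * c for c in a)
    # Lemma 1 applied to the squared polynomial, then 2^steps-th roots.
    lower = exp((log(linf) - log(comb(d, d // 2))) / n)
    upper = exp(log(l2_squared) / (2 * n))
    return lower, upper


# Lehmer's polynomial x^10 + x^9 - x^7 - x^6 - x^5 - x^4 - x^3 + x + 1
LEHMER = [1, 1, 0, -1, -1, -1, -1, -1, 0, 1, 1]
print(mahler_bounds(LEHMER))  # both bounds are close to 1.17628
```

The coefficients are kept as exact Python integers throughout, and floating point enters only in the final logarithms; the gap between the two bounds shrinks like \(1/2^m\) in the number of squarings \(m\).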

Before proving Northcott's theorem, we show the connection between Mahler measure and heights.

Proposition 1. Let \(\alpha \in \overline{\mathbb{Q}}\) and let \(f\) be the minimal polynomial of \(\alpha\) over \(\mathbb{Z}\). Then

\[ \log M(f) = \deg(\alpha)h(\alpha) \]

and

\[ \log|N_{\mathbb{Q}(\alpha)/\mathbb Q}(\alpha)| \le \deg(\alpha)h(\alpha). \]

Proof. Put \(d=\deg(\alpha)\) and write

\[ f(t)=a_dt^d+\dots+a_0. \]

Choose a number field \(K\) that contains \(\alpha\) and is a Galois extension of \(\mathbb{Q}\), with Galois group \(G\). Then \((\sigma\alpha:\sigma \in G)\) contains every conjugate of \(\alpha\) exactly \([K:\mathbb{Q}]/d\) times. Since \(a_0,\dots,a_d\) are coprime, for any non-Archimedean absolute value \(v \in M_K\), we must have \(\max_i|a_i|_v=|f|_v=1\). Combining with Gauss's lemma and Galois theory, we see

\[ |a_d|_v\prod_{\sigma \in G}\max\{1,|\sigma\alpha|_v\}^{d/[K:\mathbb{Q}]}=1. \]

Now we are ready to compute the height of \(\alpha\) and recover the Mahler measure. Notice that

\[ h(\alpha)=\sum_{v \in M_K}\log^+|\alpha|_v, \quad h(\sigma\alpha)=h(\alpha),\quad \forall \sigma \in G. \]

We therefore obtain

\[ \begin{aligned} h(\alpha)&=\frac{1}{[K:\mathbb{Q}]}\sum_{\sigma \in G}h(\sigma\alpha) \\ &=\frac{1}{[K:\mathbb{Q}]}\sum_{v \in M_K}\sum_{\sigma \in G}\log^+|\sigma\alpha|_v \\ &=\frac{1}{[K:\mathbb{Q}]}\sum_{v\mid\infty}\sum_{\sigma \in G}\log^+|\sigma\alpha|_v + \frac{1}{[K:\mathbb{Q}]}\sum_{v\nmid \infty}\sum_{\sigma \in G}\log^+|\sigma\alpha|_v \\ \end{aligned} \]

The last term corresponds to what we have computed above about non-Archimedean absolute values so we break it down a little bit:

\[ \begin{aligned} \sum_{\sigma \in G}\log^+|\sigma\alpha|_v &= \sum_{\sigma \in G}\log\max\{1,|\sigma\alpha|_v\} \\ &=\log\prod_{\sigma \in G}\max\{1,|\sigma\alpha|_v\} \\ &=\frac{[K:\mathbb{Q}]}{d}\log\prod_{\sigma \in G} \max\{1,|\sigma\alpha|_v\}^{d/[K:\mathbb{Q}]} \\ &=-\frac{[K:\mathbb{Q}]}{d}\log|a_d|_v. \end{aligned} \]

Summing over all \(v \nmid \infty\) and using the product formula \(\sum_{v \nmid \infty}\log|a_d|_v=-\sum_{v \mid \infty}\log|a_d|_v\), we see that the non-Archimedean part of the sum equals \(\frac{[K:\mathbb{Q}]}{d}\sum_{v \mid \infty}\log|a_d|_v\). On the other hand, for \(v \mid \infty\),

\[ \sum_{\sigma \in G}\log^+|\sigma\alpha|_v = \frac{[K:\mathbb{Q}]}{d}\sum_{j=1}^{d}\log^+|\alpha_j|_v. \]

All in all,

\[ \begin{aligned} h(\alpha)&=\frac{1}{[K:\mathbb{Q}]}\frac{[K:\mathbb{Q}]}{d} \sum_{v \mid \infty}\left( \log|a_d|_v+\sum_{j=1}^{d}\log^+|\alpha_j|_v \right) \\ &=\frac{1}{d}\log M(f). \end{aligned} \]

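The second assertion follows immediately because, writing \(\alpha_1,\dots,\alpha_d\) for the conjugates of \(\alpha\),

\[ \log|N_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha)|= \sum_{j=1}^{d}\log|\alpha_j| \le \sum_{j=1}^{d}\log^+|\alpha_j| \le \log M(f). \quad \square \]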

Northcott's Theorem

The non-zero algebraic integers of height \(0\) lie on the unit circle, and they are in fact roots of unity, by Kronecker's theorem. However, keep in mind that algebraic integers on the unit circle are not necessarily roots of unity. See this short paper.

When it comes to algebraic integers of small heights, things may get complicated, but Northcott's theorem assures that we will be studying a finite set.

Northcott's Theorem. Given an integer \(N>0\) and a real number \(H \ge1\), there are only a finite number of algebraic integers \(\alpha\) satisfying \(\deg(\alpha) \le N\) and \(h(\alpha) \le \log H\).

Proof. Let \(\alpha\) be an algebraic integer of degree \(d \le N\) and height \(h(\alpha) \le \log H\). Suppose \(f(t)=a_dt^d+\dots+a_0 \in \mathbb{Z}[t]\) is the minimal polynomial of \(\alpha\). Then Lemma 1 shows that

\[ \max|a_i|=\ell_\infty(f) \le {d \choose \lfloor d/2 \rfloor} M(f) \le 2^dM(f). \]

On the other hand, by Proposition 1,

\[ \log M(f) =dh(\alpha) \le d\log{H}=\log H^d, \]

so in fact

\[ \max|a_i| \le (2H)^d. \]

This gives rise to no more than \((2\lfloor (2H)^d \rfloor+1)^{d+1}\) distinct polynomials \(f\), which produce at most \(d(2\lfloor (2H)^d \rfloor+1)^{d+1}<\infty\) algebraic integers. Ranging over all \(d \le N\), we get what we want. \(\square\)
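The proof is effective: it turns the search for algebraic integers of bounded degree and height into a finite enumeration. As a toy illustration (an added sketch, not from the original), the Python code below runs that search for \(d=2\) and \(H=1\). Since \(\alpha\) is an algebraic integer, its minimal polynomial is monic, so only monic quadratics \(x^2+bx+c\) with \(|b|,|c| \le (2H)^d\) are listed, and those with \(M(f) \le H^d\) are kept. Reducible polynomials are not filtered out, so some of the results have roots of degree \(1\). The function names and the floating-point tolerance are assumptions.

```python
import cmath


def mahler_quadratic(b, c):
    """Mahler measure of the monic quadratic x^2 + b x + c via the quadratic formula."""
    root_disc = cmath.sqrt(b * b - 4 * c)
    roots = ((-b + root_disc) / 2, (-b - root_disc) / 2)
    m = 1.0
    for r in roots:
        m *= max(1.0, abs(r))
    return m


def small_quadratics(H=1.0, tol=1e-9):
    """Monic integer quadratics x^2 + b x + c with Mahler measure at most H^2."""
    d = 2
    bound = int((2 * H) ** d)  # coefficient bound (2H)^d from the proof
    found = []
    for b in range(-bound, bound + 1):
        for c in range(-bound, bound + 1):
            # tol absorbs rounding when roots lie exactly on the unit circle
            if mahler_quadratic(b, c) <= H ** d + tol:
                found.append((b, c))
    return found


for b, c in small_quadratics():
    print(f"x^2 + ({b})x + ({c})")
```

For \(H=1\) the search returns nine polynomials, all of whose roots are \(0\) or roots of unity, as Kronecker's theorem predicts.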

There is also the Northcott property, where no bound on the degree is imposed. A set \(L\) of algebraic integers is said to satisfy the Northcott property if, for every \(T>0\), the set

\[ \{\alpha \in L:h(\alpha)<T\} \]

is finite. Such a set \(L\) is said to satisfy the Bogomolov property if there exists \(T>0\) such that the set

\[ \{\alpha \in L:0<h(\alpha)<T\} \]

is empty. Since a finite set of positive heights is either empty or has a positive minimum, the Northcott property implies the Bogomolov property. The case where \(L\) is a field is particularly interesting; see this paper.

References / Further Reading

  • Enrico Bombieri, Walter Gubler, Heights in Diophantine Geometry.

  • Michel Waldschmidt, Diophantine Approximation on Linear Algebraic Groups: Transcendence Properties of the Exponential Function in Several Variables.

