# Hensel's Lemma - A Fair Application of Newton's Method and 'Double Induction'

# Introduction

Let \(F\) be a non-Archimedean local field, meaning that \(F\) is complete under the metric induced by a non-Archimedean absolute value \(|\cdot|\). Consider the ring of integers

\[ \mathfrak{o}_F=\{\alpha \in F:|\alpha| \le 1\} \]

and its unique prime (hence maximal) ideal

\[ \mathfrak{p}=\{\alpha \in F:|\alpha| <1\}. \]

The residue field \(k=\mathfrak{o}_F/\mathfrak{p}\) is finite because it is compact and discrete. For compactness notice that \(\mathfrak{o}_F\) is compact, and the canonical projection \(\mathfrak{o}_F \to k\) is open. For discreteness, notice that \(\mathfrak{p}\) is open, connected and contains the unit.

Let \(f \in \mathfrak{o}_F[x]\) be a polynomial. Hensel's lemma states that, if \(\overline{f} \in k[x]\), the reduction of \(f\), has a simple root \(a\) in \(k\), then the root can be lifted to a root of \(f\) in \(\mathfrak{o}_F\) and hence \(F\). This blog post is intended to offer a well-organised proof of this lemma.

To do this, we need to use Newton's method of approximating roots of \(f(x)=0\), something like

\[ a_{n+1}=a_n-\frac{f(a_n)}{f'(a_n)}. \]

We know that \(a_n \to \zeta\) where
\(f(\zeta)=0\) at a \(A^{2^n}\) speed for some constant \(A\), in calculus (do Walter Rudin's
exercise 5.25 of *Principles of Mathematical Analysis* if you are
not familiar with it, I heartily recommend.). Now we will steal Newton's
method into number theory to find roots in a non-Archimedean field,
which is violently different from \(\mathbb{R}\), the playground of elementary
calculus.

We will also use induction, in the form of which I would like to call "double induction". Instead of claiming that \(P(n)\) is true for all \(n\), we claim that \(P(n)\) and \(Q(n)\) are true for all \(n\). When proving \(P(n+1)\), we may use \(Q(n)\), and vice versa.

This method is inspired by this lecture note, where actually a "quadra induction" is used, and everything is proved altogether. Nevertheless, I would like to argue that, the quadra induction is too dense to expose the motivation and intuition of this proof. Therefore, we reduce the induction into two arguments and derive the rest with more reasonings.

# Hensel's Lemma

Hensel's Lemma.Let \(F\) be a non-Archimedean local field with ring of integers \(\mathfrak{o}_F=\{\alpha \in F:|\alpha| \le 1\}\) and prime ideal \(\mathfrak{p}=\{\alpha \in F:|\alpha|<1\}\). Let \(f \in \mathfrak{o}_F[x]\) be a polynomial whose reduction \(\overline{f} \in k[x]\) has a simple root \(a \in k\), then \(a\) can be lifted to \(\alpha \equiv a \mod \mathfrak{p}\), such that \(f(\alpha)=0\).

By simple root we mean \(\overline{f}(a)=0\) but \(\overline{f}'(a) \ne 0\). Before we prove this lemma, we see some examples.

## Examples and Applications

### Square Root of 2 in 7-adic Numbers

Put \(F=\mathbb{Q}_7\). Then \(\mathfrak{o}_F=\mathbb{Z}_7\), \(\mathfrak{p}=7\mathbb{Z}_7\) and \(k=\mathbb{F}_7\). We show that square roots of \(2\) are in \(F\). Note \(\overline{f}(x)=x^2-2=(x-3)(x+3) \in k[x]\), we therefore two simple roots of \(\overline{f}\), namely \(3\) and \(-3\). Lifting to \(\mathfrak{o}_F\), we have two roots \(\alpha_1 \equiv 3 \mod 7\mathbb{Z}_7\) and \(\alpha_2 \equiv -3 \mod 7\mathbb{Z}_7\), of \(f\). For \(\alpha_1\), we have

\[ \begin{aligned} 2 &\equiv 3^2 \mod 7 \\ 2 &\equiv (3+7)^2 \mod 7^2 \\ &\cdots\cdots \end{aligned} \]

Hence we can put \(\alpha=\sqrt{2}=3+7+2\cdot 7^2+6\cdot 7^3\cdots\in\mathbb{Z}_7 \subset \mathbb{Q}_7\). Likewise \(\alpha_2\) can be understood as \(-\sqrt{2}\). This expansion is totally different from our understanding in \(\mathbb{Q}\) or \(\mathbb{R}\).

### Roots of Unity

Since \(k\) is a finite field, we see \(k^\times\) is a cyclic group of order \(q-1\) where \(q=p^n=|k|\) for some prime \(p\). It follows that \(x^{q-1}=1\) for all \(x \in k^\times\). Therefore \(f(x)=x^{q-1}-1\) has \(q-1\) distinct roots in \(k\). By Hensel's lemma, \(F\) contains all \((q-1)\)st roots of unity. It does not matter whether \(F\) is isomorphic to \(\mathbb{Q}_p\) or \(\mathbb{F}_q((t))\).

## Proof of Hensel's Lemma (with Explanation)

Pick any \(a_0 \in \mathfrak{o}_F\) that is a lift of \(a\mod\mathfrak{p}\). Define

\[ a_n=a_{n-1}-\frac{f(a_{n-1})}{f'(a_{n-1})},\quad n \ge 1, \]

then we claim that \(a_n\) converges to the root we are looking for.

### Step 1 - Establishing A Sequence by Newton's Method

First of all, we need to show that \(a_n \in \mathfrak{o}_F\), i.e., \(|a_n| \le 1\) for all \(n\). It suffices to show that \(|f(a_{n-1})/f'(a_{n-1})| \le 1\). We firstly observe the case when \(n=1\).

Since \(\overline{f}(a)=0\) but \(\overline{f}'(a) \ne 0\), we have \(f(a_0) \in \mathfrak{p}\) but \(f'(a_0)\not\in\mathfrak{p}\). As a result, \(|f(a_0)|<1\) but \(|f'(a_0)|=1\). As a result, \(|f(a_0)/f'(a_0)|<1\), which implies that \(f(a_0)/f'(a_0) \in \mathfrak{o}_F\) and therefore \(a_1 \in \mathfrak{o}_F\).

By Taylor's theorem.

\[ f'(a_n)=f'(a_{n-1})+f''(a_{n-1})(a_n-a_{n-1})+g_n(a_n)(a_n-a_{n-1})^2 \]

for some \(g_n \in \mathfrak{o}_F[x]\). When \(n=1\), we see \(g_1(a_1) \in \mathfrak{o}_F\) and as a result \(|g_1(a_1)| \le 1\). Therefore

\[ |f'(a_1)|=\max\{|f'(a_0)|,|f''(a_{0})(a_1-a_0)|,|g_1(a_1)(a_1-a_0)^2|\}=|f'(a_0)|=1. \]

Since \(a_1 \in \mathfrak{o}_F\), we also see that \(f(a_1) \in \mathfrak{o}_F\) hence its absolute value is not greater than \(1\). As a result \(|f(a_1)/f'(a_1)| \le 1\), which implies that \(a_2 \in \mathfrak{o}_F\).

This inspires us to claim the following *two* statements:

\(|f(a_n)| < 1\) for all \(n \ge 0\).

\(|f'(a_n)|=|f'(a_0)|=1\) for all \(n \ge 0\).

We have verified (a) and (b) for \(n=0\) and \(n=1\). Now assume that (a) and (b) are true for \(n-1\), then, for \(n\), we will verify as follows.

First of all, by (a) and (b) for \(n-1\), we see \(a_n \in \mathfrak{o}_F\).

Consider the Taylor's expansion

\[ \begin{aligned} f(a_n)&=f(a_{n-1})+f'(a_{n-1})(a_n-a_{n-1})+h_n(a_n)(a_n-a_{n-1})^2 \\ &=f(a_{n-1})-f'(a_{n-1})\frac{f(a_{n-1})}{f'(a_{n-1})}+h_n(a_n) \left(\frac{f(a_{n-1})}{f'(a_{n-1})}\right)^2 \\ &= h_n(a_n)\left(\frac{f(a_{n-1})}{f'(a_{n-1})}\right)^2 \end{aligned} \]

where \(h_n \in \mathfrak{o}_F[x]\). It follows that \(|h_n(a_n)| \le 1\). Since \(|f'(a_{n-1})|=1\), by (b) we actually have

\[ |f(a_n)| \le |f(a_{n-1})|^2<1. \]

To prove (b) for \(n\), we consider the Taylor's expansion

\[ f'(a_n)=f'(a_{n-1})+f''(a_{n-1})(a_n-a_{n-1})+g_n(a_n)(a_n-a_{n-1})^2 \]

Notice that since \(a_n \in \mathfrak{o}_F\), we have \(f''(a_{n-1}),g_n(a_n) \in \mathfrak{o}_F\). By (a) and (b) for \(n-1\), we see

\[ \begin{aligned} |f''(a_{n-1})(a_n-a_{n-1})|&=|f''(a_{n-1})||f(a_{n-1})|<1,\\ |g_n(a_n)(a_{n}-a_{n-1})^2|&=|g_n(a_n)||f(a_{n-1})|^2<1. \end{aligned} \]

Hence

\[ \begin{aligned} |f'(a_n)|&=\max\{|f'(a_{n-1}),|f''(a_{n-1})(a_n-a_{n-1})|,|g_n(a_n)(a_{n}-a_{n-1})^2|\}\\ &=|f'(a_{n-1})|, \end{aligned} \]

bearing in mind that for a non-Archimedean absolute value, \(|x+y|=\max\{|x|,|y|\}\) iff \(|x| \ne |y|\). Through this process we have also proved (b).

### Step 2 - Validating the Convergence

We need to show that \(\{a_n\}\) is a Cauchy sequence. To do this, it suffices to show that \(|f(a_n)| \to 0\) sufficiently quick. Recall in the proof of (a) we have shown that \(|f(a_n)| \le |f(a_{n-1})|^2\) for all \(n\). By applying this relation inductively, we see \(|f(a_n)| \le |f(a_0)|^{2^n}\). Since \(|f(a_0)|<1\), it follows that \(|f(a_n)| \to 0\) as \(n \to \infty\).

For any \(\varepsilon>0\), there exists \(N>0\) such that \(|f(a_n)| <\varepsilon\) for all \(n \ge N\). As a result, for all \(m>n>N\), we have

\[ \begin{aligned} |a_m-a_n|&=|(a_m-a_{m-1})+(a_{m-1}-a_{m-2})+\dots+(a_{n+1}-a_{n})| \\ &\le\max\{|a_m-a_{m-1}|,\dots,|a_{n+1}-a_n|\} \\ &\le |f(a_n)| \\ &< \varepsilon. \end{aligned} \]

Therefore \(\{a_n\}\) is Cauchy. Since \(F\) is complete, \(a_n\) converges to some \(\alpha \in \mathfrak{o}_F \subset F\) such that \(f(\alpha)=\lim_{n \to \infty}f(a_n)=0\).

### Step 3 - Validating the Congruence

In local fields, congruence is determined by inequality. In fact, we only need to show that \(|\alpha-a_0|<1\), which means that \(\alpha-a_0 \in \mathfrak{p}\), and therefore \(\alpha \equiv a \mod \mathfrak{p}\) as expected. To do this, we show by induction that \(|a_n-a_0|<1\). For \(n=1\) we see \(|a_1-a_0|=|f_0|<1\).

Suppose \(|a_{n-1}-a_0|<1\) then

\[ |a_n-a_0| \le \max\{|a_n-a_{n-1}|,|a_{n-1}-a_0|\}<1. \]

Therefore \(|\alpha-a_0|=\lim_{n \to \infty}|a_n-a_0|<1\), from which the result follows. \(\square\)

# Stronger Version

In fact we have not explicitly used the fact that \(a\) is a simple root. We only used the fact that \(|f(a_0)|<1\) but \(|f'(a_0)|=1\). Moreover, what really matters here is that \(|f(a_n)|\) converges to \(0\) quick enough. Therefore \(1\) may be replaced by a smaller constant. For this reason we introduce a stronger version of Hensel's lemma.

Hensel's lemma, stronger version.Let \(F\) be a non-Archimedean local field with ring of integers \(\mathfrak{o}_F\). Suppose there exists \(a \in \mathfrak{o}_F\) such that \(|f(a)|<|f'(a)|^2\), then there exists some \(b \in \mathfrak{o}_F\) such that \(f(b)=0\) and \(|b-a|<|f'(a)|\).

Instead of asserting \(|f'(a_n)|=1\) for all \(n\), we claim that \(|f'(a_n)|=|f'(a_0)|\) (as it should be!). Instead of asserting \(|f(a_n)|<1\), we claim that \(|f(a_n)| \le \lambda^{2^n}|f'(a_0)|\) where \(\lambda=|f(a_0)|/|f'(a_0)|^2\). The proof will be nearly the same.

For example, we can find a square root of \(257\) in \(\mathbb{Z}_2 \subset \mathbb{Q}_2\). The polynomial \(f(x)=x^2-257\) is reduced to \(\overline{f}(x)=x^2-1=(x-1)^2\) in \(\mathbb{F}_2[x]\), where \(1\) is not a simple root. Therefore we cannot apply the original version of Hensel's lemma to this polynomial. Nevertheless, we see \(f(1)=-256\) and \(f'(1)=2\). Therefore \(|f(1)|=\frac{1}{2^8}\) while \(|f'(1)|=\frac{1}{2}\). We can apply Newton's method here to find a square root of \(257\) without worrying about repeated roots.

# Ending Remarks

There are a lot of variants of Hensel's lemma, for example you can do exercise 10.9 of Atiyah-MacDonald. In fact, we later even have Henselian ring and Henselisation of a ring.

There are some other proofs of Hensel's lemma in this post, for example, since Newton's method can also be understood as a contraction mapping, we can also prove it using properties of contraction mapping (see K. Conrad's note).

Hensel's Lemma - A Fair Application of Newton's Method and 'Double Induction'