Shafarevich and Siegel's theorems
Posted by martin on Friday, 07 October 2011 at 09:00
In this post I will prove the Shafarevich conjecture for elliptic curves (also called Shafarevich’s theorem). The proof is by reducing it to the finiteness of the number of solutions of a certain Diophantine equation, and then applying Siegel’s theorem on integral points on curves.
Shafarevich’s Theorem. Let $K$ be a number field and $S$ a finite set of places of $K$. Then there are only finitely many isomorphism classes of elliptic curves over $K$ with good reduction outside $S$.
Siegel’s Theorem. Let $K$ be a number field and $S$ a finite set of places of $K$. An absolutely irreducible affine curve $C$ over $K$ of genus at least $1$ has only finitely many $S$-integral points.
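To get a feel for what Siegel's theorem asserts, here is a small Python sketch for the simplest case of ordinary integers over $\mathbb{Q}$. It searches a bounded range for integer points on a curve $y^2 = x^3 + k$ of genus 1. The choice of curve, the name `integral_points` and the search bound are assumptions made for this illustration and do not come from the post. A finite search proves nothing about finiteness; it only shows how sparse the integral points are.

```python
from math import isqrt

def integral_points(k, bound):
    """Integer points (x, y) on y^2 = x^3 + k with |x| <= bound.

    Illustrative brute force only: it cannot rule out points with |x| > bound.
    """
    points = []
    for x in range(-bound, bound + 1):
        rhs = x**3 + k
        if rhs < 0:
            continue  # y^2 cannot be negative
        y = isqrt(rhs)
        if y * y == rhs:
            points.extend([(x, 0)] if y == 0 else [(x, -y), (x, y)])
    return points

# Example curve (an assumption of this sketch): y^2 = x^3 - 2.
print(integral_points(-2, 1000))
```

For $k = -2$ the search finds only $(3, \pm 5)$, in agreement with the classical result that these are the only integer solutions of $y^2 = x^3 - 2$.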
Since the reduction of Shafarevich’s theorem to Siegel’s theorem is short, and Siegel’s theorem is of independent interest, most of the post will be about Siegel’s theorem.
Proof of Shafarevich’s Theorem
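We can enlarge $S$ until the ring of $S$-integers $\mathcal{O}_{K,S}$ is a principal ideal domain, and $S$ contains all primes above $2$ and $3$.
Then any elliptic curve $E/K$ with good reduction outside $S$ will have a Weierstrass equation
$$ y^2 = x^3 + Ax + B $$
where $A, B \in \mathcal{O}_{K,S}$ and the discriminant $\Delta = -16(4A^3 + 27B^2)$ is a unit in $\mathcal{O}_{K,S}$.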
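Replacing $(x, y)$ by $(u^2 x, u^3 y)$ for a unit $u \in \mathcal{O}_{K,S}^\times$ multiplies the discriminant by $u^{12}$, so we can multiply the discriminant by any 12th power in $\mathcal{O}_{K,S}^\times$.
So if we choose a set of coset representatives for $\mathcal{O}_{K,S}^\times / (\mathcal{O}_{K,S}^\times)^{12}$ (which is finite by Dirichlet’s unit theorem),
then we get a finite set $D \subset \mathcal{O}_{K,S}^\times$ such that every curve $E/K$ with good reduction outside $S$ has an $S$-integral Weierstrass equation with discriminant in $D$.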
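We just have to show that for each discriminant $\Delta$ there are finitely many curves, but this is precisely the claim that
$$ 4A^3 + 27B^2 = -\Delta/16 $$
has finitely many $S$-integral solutions $(A, B)$, which is true by Siegel’s theorem (for fixed $\Delta \neq 0$ this equation defines an affine curve of genus $1$).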
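The scaling of the discriminant by 12th powers, which is what makes the finite set of coset representatives enough, can be checked numerically. The following Python sketch assumes the short Weierstrass form $y^2 = x^3 + Ax + B$ with discriminant $-16(4A^3 + 27B^2)$ and the substitution $(x, y) \mapsto (u^2 x, u^3 y)$. The function names and sample values are hypothetical, chosen only for this check.

```python
from fractions import Fraction

def discriminant(A, B):
    """Discriminant of y^2 = x^3 + A*x + B (assumed short Weierstrass form)."""
    return -16 * (4 * A**3 + 27 * B**2)

def rescale(A, B, u):
    """Coefficients after substituting (x, y) -> (u^2 x, u^3 y).

    The new equation is y^2 = x^3 + (u^4 A) x + (u^6 B).
    """
    return u**4 * A, u**6 * B

# Sample values (assumptions for illustration only).
A, B = Fraction(-1), Fraction(1, 3)
u = Fraction(3, 2)

A2, B2 = rescale(A, B, u)
# The discriminant is multiplied by exactly u^12.
assert discriminant(A2, B2) == u**12 * discriminant(A, B)
```

Exact rational arithmetic is used so that the identity is verified without rounding error.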
Roth’s Theorem
The proof of Siegel’s theorem is based on Roth’s theorem on Diophantine approximation (also called the Thue-Siegel-Roth theorem). This limits how closely an algebraic number can be approximated by rationals, or more generally, by elements of a fixed number field $K$.
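Theorem (Roth, Ridout). Let $K$ be a number field and $|\cdot|_v$ a normalised absolute value on $K$. Let $\epsilon > 2$. If $\alpha \in K_v$ is algebraic over $K$, then there are only finitely many $x \in K$ satisfying
$$ |x - \alpha|_v < H(x)^{-\epsilon}. $$
Here the absolute values are normalised so that they satisfy the product formula
$$ \prod_v |x|_v = 1 \quad \text{for all } x \in K^\times, $$
and the height is the absolute multiplicative Weil height
$$ H(x) = \prod_v \max(1, |x|_v). $$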
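For $K = \mathbb{Q}$ and the usual absolute value, the Weil height of a fraction $p/q$ in lowest terms is $\max(|p|, |q|)$, and the exponent $2$ in Roth's theorem can be seen to be the threshold. The Python sketch below is only an illustration under assumptions of my own choosing: it takes $\alpha = \sqrt{2}$ and its continued fraction convergents, and the helper names are hypothetical. Along the convergents, $|x - \alpha| \cdot H(x)^2$ stays bounded, while $|x - \alpha| \cdot H(x)^{2.5}$ grows, as Roth's theorem predicts for any exponent above $2$.

```python
from fractions import Fraction
from math import sqrt

def weil_height(x: Fraction) -> int:
    """Multiplicative Weil height of a rational number in lowest terms."""
    return max(abs(x.numerator), x.denominator)

def sqrt2_convergents(count):
    """First `count` continued fraction convergents p/q of sqrt(2).

    Uses the recurrence (p, q) -> (p + 2q, p + q), which keeps p^2 - 2q^2 = +-1.
    """
    p, q = 1, 1
    out = []
    for _ in range(count):
        out.append(Fraction(p, q))
        p, q = p + 2 * q, p + q
    return out

def approximation_quality(x: Fraction, exponent: float) -> float:
    """Return |x - sqrt(2)| * H(x)^exponent for a convergent x of sqrt(2)."""
    p, q = x.numerator, x.denominator
    # Since p^2 - 2q^2 = +-1, |p/q - sqrt(2)| = 1 / (q * (p + q*sqrt(2))).
    error = 1.0 / (q * (p + q * sqrt(2)))
    return error * weil_height(x) ** exponent

for x in sqrt2_convergents(8):
    print(x, approximation_quality(x, 2), approximation_quality(x, 2.5))
```

This does not prove anything about approximations other than the convergents; it only shows the two regimes on one familiar sequence.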

Applying the theorem to the coordinates of points of a $K$-variety $V$, we get the following bound on the approximation of points of $V(K_v)$ by $K$-rational points.
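Theorem. Let $V$ be a variety over $K$ and $\epsilon > 2$. If $\alpha \in V(K_v)$ is algebraic over $K$, then there is no infinite sequence $(x_n)$ of points in $V(K)$ such that
$$ d_v(x_n, \alpha) < H(x_n)^{-\epsilon}. $$
Here $d_v(x, \alpha)$ is a measure of the $v$-adic distance of $x$ from $\alpha$.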
We only need to measure distances from the fixed point $\alpha$, so the following crude definition is adequate:
Let $U$ be a quasi-affine neighbourhood of $\alpha$, and $z_1, \dotsc, z_k$ coordinates on $U$ such that $z_1(\alpha) = \dotsb = z_k(\alpha) = 0$.
Then define
$$ d_v(x, \alpha) = \max_i |z_i(x)|_v. $$
If we change the neighbourhood or the coordinate system, the new $d_v$ is bounded by a constant multiple of the old one, which for our purposes does not matter.
The approximation theorem for abelian varieties
If $A$ is an abelian variety, then we can improve the previous approximation theorem to allow any $\epsilon > 0$, rather than just $\epsilon > 2$.
The idea of the proof is to pull back by the multiplication-by-$m$ map:
this makes heights much smaller ($m^2$-th root) while only multiplying distances by a constant.
Hence an approximation theorem with exponent $m^2 \epsilon$ on $A$ implies an approximation theorem with exponent $\epsilon$ on $A$.
And for large $m$, $m^2 \epsilon > 2$, so we can apply Roth’s theorem.
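The effect of the multiplication-by-$m$ map on heights can be observed numerically in the simplest case. The Python sketch below takes $m = 2$ on an elliptic curve over $\mathbb{Q}$ and uses the naive height of the $x$-coordinate as a stand-in for the height on the abelian variety. The curve $y^2 = x^3 - 2$, the point $(3, 5)$ and all function names are assumptions chosen for this illustration. Each doubling should multiply the logarithmic height by roughly $m^2 = 4$, up to a bounded error.

```python
from fractions import Fraction
from math import log

def double_point(P, A):
    """Return [2]P on y^2 = x^3 + A*x + B using the tangent-line formula."""
    x, y = P
    if y == 0:
        raise ValueError("P is a 2-torsion point, so [2]P is the point at infinity")
    slope = (3 * x * x + A) / (2 * y)
    x2 = slope * slope - 2 * x
    y2 = slope * (x - x2) - y
    return x2, y2

def log_height(x: Fraction) -> float:
    """Logarithmic naive height of a rational number in lowest terms."""
    return log(max(abs(x.numerator), x.denominator))

def doubling_ratios(P, A, steps):
    """Ratios log H(x([2]Q)) / log H(x(Q)) along Q = P, 2P, 4P, ..."""
    ratios = []
    for _ in range(steps):
        Q = double_point(P, A)
        ratios.append(log_height(Q[0]) / log_height(P[0]))
        P = Q
    return ratios

# Assumed example: P = (3, 5) on y^2 = x^3 - 2, so A = 0.
P = (Fraction(3), Fraction(5))
print(double_point(P, 0))        # x-coordinate 129/100
print(doubling_ratios(P, 0, 4))  # ratios approach 4
```

Read in reverse, this is the statement used in the proof: passing from a point to a point whose double it is replaces the height by roughly its 4th root.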
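Theorem. Let $A$ be an abelian variety over $K$ and $\epsilon > 0$. If $\alpha \in A(K_v)$ is algebraic over $K$, then there is no infinite sequence $(x_n)$ of points in $A(K)$ such that
$$ d_v(x_n, \alpha) < H(x_n)^{-\epsilon}. $$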
Proof. Let $m$ be a large integer, which we will choose later.
By the Mordell-Weil theorem, $A(K)/mA(K)$ is finite, so there is an infinite subsequence of $(x_n)$ contained in a single coset $b + mA(K)$. Pass to such a subsequence and choose $x'_n \in A(K)$ such that $x_n = [m]x'_n + b$. (We use Mordell-Weil, and the map $x \mapsto [m]x + b$ rather than just $[m]$, to ensure that the points $x'_n$ are defined over $K$.)
Because $A(K_v)$ is compact, there is a subsequence of $(x'_n)$ which converges, say to $\alpha'$. Then $[m]\alpha' + b = \alpha$; this implies that $\alpha'$ is algebraic over $K$.
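The map $x \mapsto [m]x + b$ is finite étale, so there is a constant $c$ such that
$$ d_v(x'_n, \alpha') \leq c \, d_v(x_n, \alpha) $$
for all large enough $n$. So by hypothesis,
$$ d_v(x'_n, \alpha') < c \, H(x_n)^{-\epsilon}. $$
Because the log of the height is a quadratic form on an abelian variety (up to constant error), we have that for large $n$,
$$ H(x_n) \geq H(x'_n)^{m^2/2}. $$
So
$$ d_v(x'_n, \alpha') < c \, H(x'_n)^{-m^2 \epsilon / 2}. $$
Choose $m$ large enough that $m^2 \epsilon / 2 > 2$ and we get a contradiction of Roth’s theorem.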
The same theorem also holds if $A$ is replaced by a nonsingular curve of genus at least $1$, by embedding the curve in its Jacobian.
Siegel’s Theorem
Theorem. Let $K$ be a number field and $S$ a finite set of places of $K$. An absolutely irreducible affine curve $C$ over $K$ of genus at least $1$ has only finitely many $S$-integral points.
Proof. Without loss of generality we may assume that $C$ is nonsingular (otherwise cover it by its desingularisation).
Suppose that $C$ contained an infinite sequence $(x_n)$ of integral points. Let $f$ be a coordinate function on $C$. Then $f(x_n)$ is an $S$-integer for all $n$.
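Let $S'$ be the union of $S$ with all the archimedean absolute values of $K$. Since $f(x_n)$ is an $S$-integer (i.e. $|f(x_n)|_v \leq 1$ for all $v \notin S'$), we have
$$ H(f(x_n)) = \prod_{v \in S'} \max(1, |f(x_n)|_v). $$
It follows that there is some $v \in S'$ such that
$$ |f(x_n)|_v \geq H(f(x_n))^{1/s} $$
for infinitely many $n$, where $s = \# S'$.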
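Replace $(x_n)$ by a subsequence satisfying this inequality, then by a subsequence converging to some $\alpha \in \bar{C}(K_v)$, where $\bar{C}$ is the projective closure of $C$.
By Northcott’s theorem, $H(f(x_n)) \to \infty$ as $n \to \infty$, so also $|f(x_n)|_v \to \infty$. Hence $\alpha$ is a pole of $f$, say of order $r$. So $|f(x_n)|_v \, d_v(x_n, \alpha)^r$ stays bounded as $n \to \infty$.
Since $H(f(x_n))$ is bounded below by a positive power of $H(x_n)$, we have
$$ d_v(x_n, \alpha) < c \, H(x_n)^{-\epsilon} $$
for large $n$ and some constants $c$, $\epsilon > 0$, which contradicts the approximation theorem.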
Historical note
The theorem as proved by Siegel applied only to ordinary integers, not $S$-integers, and likewise Roth’s theorem applied only to archimedean absolute values.
Furthermore, Siegel proved his theorem in 1929 while Roth’s theorem was not proved until 1955.
Siegel used a weaker version of Roth’s theorem, called the Thue-Siegel theorem, and so his proof of the theorem on integral points was more complicated.
Mahler extended the Thue-Siegel theorem to non-archimedean absolute values in 1935, allowing him to prove finiteness of $S$-integral points on curves of genus 1 over $\mathbb{Q}$.
After Roth proved his theorem, it was fairly straightforward using the earlier ideas of Mahler and Siegel to extend it to non-archimedean absolute values (Ridout 1958) and then to extend Siegel’s theorem to $S$-integral points for all curves (Lang 1960).
Broad Question: To what extent can the methods in this post be used to prove the Shafarevich conjecture for higher dimensional abelian varieties? Can it be descended to some analogue of Siegel’s theorem? After all, the approximation theorem works for any abelian variety, and you’re not using the full strength of this in your proof of Siegel’s theorem (in your penultimate section (before the historical note)). Maybe the approximation theorem will imply this (yet-to-be-determined) analogue of Siegel?
If the answer is no, then where is the hurdle? (I guess that the answer is ‘no’, because otherwise there would have been no need for Finiteness I and the bulk of Faltings’s work for Mordell; perhaps the methods of Finiteness I are of themselves a generalisation of the tools of this post?)
Other comments: In the “Proof of Shafarevich’s Theorem” section, is it worth highlighting that $\mathcal{O}_{K,S}^\times / (\mathcal{O}_{K,S}^\times)^{12}$ is finite as a consequence of Dirichlet’s unit theorem? Also, in the same paragraph line 2, should that read “…good reduction outside $S$” (you currently have “over $S$”, which could be taken to mean something you don’t).
What’s the next post about? Let me guess…either you start on Finiteness I, or you deduce Mordell from Shafarevich for curves. Either way, I’m really enjoying this series of posts (I daresay, more than the previous series!)
By the way, I’m going to Cambridge for that Part III prospects thing in October; are you going? If you’d like to come to Warwick, you’d be more than welcome to stay over. And I saw recently that in November there is a series of lectures in commemoration of Shiing-Shen Chern at IHES; will you go?
First of all the fact that the Shafarevich theorem can be deduced from Siegel’s theorem seems like a bit of a coincidence to me - it is an argument about Weierstrass equations rather than about elliptic curves. I don’t think that this can be generalised to higher dimensions, and Faltings’ proof of Finiteness I had nothing to do with Siegel’s theorem.
This is not the only way of deducing Shafarevich’s theorem from Siegel’s theorem. There is a more geometric one by applying Siegel to the modular curve $Y(N)$ - you have to extend the base field $K$ to $L$ so that all elliptic curves over $K$ with good reduction outside $S$ have their $N$-torsion defined over $L$, so that they give rise to points on $Y(N)$, and use Galois cohomology to show that there are finitely many $K$-isomorphism classes of elliptic curves in each $L$-isomorphism class.
In order to generalise this to higher abelian varieties, you run into the problem that the moduli spaces are no longer 1-dimensional.
In the proof of Siegel’s theorem, I should probably have pointed out the reason why it does not work for abelian varieties, since the approximation theorem does. It is when we talk about $\alpha$ being a pole of $f$, so that near $\alpha$, $f$ looks like a negative power of the distance from $\alpha$. In dimension greater than 1, poles of rational functions are varieties rather than just points.
Faltings proved finiteness of integral points on an affine subset of an abelian variety in 1990. This was related to a new proof of Finiteness I due to Vojta in 1989, which starts by unpacking the proof of Roth’s theorem, but involves much heavier machinery like the arithmetic Riemann-Roch theorem as well. For more on this, look at Chapter IX of Lang’s book “Number Theory III”, volume 60 in the Encyclopedia of Mathematical Sciences series; a later edition was published under the title “Survey of Diophantine Geometry”.
My next post will be on Siegel’s theorem for curves of genus zero with at least 3 points at infinity and Baker’s theorem, which gives effective bounds for some cases of Siegel’s theorem. This is not really an organised series of posts, and I certainly do not intend to go through the whole proof of Finiteness I. But I will look at some of the ideas involved, particularly in the Masser-Wüstholz proof, which gives an effective bound in Finiteness I.
Thanks for the correction. Yes I am going to the Part III afternoon (in November). I did not know about the Chern lectures until you told me.