Overconvergent modular forms and their explicit arithmetic

By Jan Vonk

Abstract

In these notes we aim to give a friendly introduction to the theory of overconvergent modular forms and some examples of recent arithmetic applications. The emphasis is on explicit examples and computations.

Introduction

The theory of $p$-adic modular forms has its origins in the work of Serre and Katz in the 1970s, and has seen a spectacular amount of development and applications in number theory since then. In this note, we aim to provide its context and sketch the rudiments of the theory, adopting an approach where we favour explicit examples and computations over proofs. This is done with the hope that the uninitiated reader may build up some intuition and working knowledge as a stepping stone to the literature on the subject, which can be somewhat daunting to outsiders, but for which there is no substitute if one wants to become a serious user. We have included references to many of the original texts.

We should warn the reader that by its very design, this article is doomed to be incomplete, and several crucial developments are not discussed in this text. One notable example is that we have omitted a discussion of the theory of overconvergent modular symbols, which often provides an alternative framework that is in its own way highly suited for explicit computation. The author wishes to apologise for this omission, and many others, with the added clarification that this is merely a reflection of his own lack of experience with this approach. Likewise, many recent exciting developments in the area, such as the burgeoning topic of higher Hida theory, will unfortunately not be discussed in any detail here. Finally, it is important to note that a great many excellent (expository) sources on the theory of overconvergent modular forms already exist, which include for instance the beautiful treatments of Emerton Reference Eme11 and Calegari Reference Cal13.

1. Congruences between modular forms

We start by recalling some basic definitions and motivate the theme of these notes by discussing some classical congruences for the Ramanujan $\Delta$-function, a weight-$12$ modular form of level $1$. These illustrate different types of phenomena, and we highlight the features we wish to explore in these notes.

1.1. Modular forms

Suppose $\Gamma \subseteq \operatorname {SL}_2(\mathbf{Z})$ is a finite index subgroup. Then $M_k(\Gamma )$ denotes the space of modular forms of weight $k \in \mathbf{Z}$, that is the space of holomorphic functions $f$ on the Poincaré upper half-plane $\mathcal{H} \coloneq \{ z \in \mathbf{C} \ | \ \mathrm{Im}(z) > 0 \}$ which satisfy the transformation law

$$\begin{equation} f \left( \frac{az + b}{cz+d}\right) = (cz+d)^{k} f(z) \qquad \text{for all } \left(\begin{matrix} a & b \\ c & d \end{matrix}\right) \in \Gamma \tag{1} \end{equation}$$

and are holomorphic at the cusps of $\Gamma$. The subspace of cuspforms consists of those functions which vanish at all the cusps and is denoted by $S_k(\Gamma )$. In these notes, $\Gamma$ will usually be given by the congruence subgroup

$$\begin{equation} \Gamma _0(N) \coloneq \left\{ \left(\begin{matrix} a & b \\ c & d \end{matrix}\right) \in \operatorname {SL}_2(\mathbf{Z}) \ : \ c \equiv 0 \pmod {N} \right\}. \tag{2} \end{equation}$$

Any modular form $f \in M_k(\Gamma _0(N))$ is invariant under translation and admits a Fourier expansion

$$\begin{equation} f(q) = a_0 + a_1 q + a_2 q^2 + \cdots , \qquad q = e^{2 \pi i z}. \cssId{texmlid2}{\tag{3}} \end{equation}$$

This will be referred to as its $q$-expansion, and the $a_i \in \mathbf{C}$ are called its Fourier coefficients. We refer to $a_0$ as the constant term of $f$ and the $a_i$ for $i \geq 1$ as its higher Fourier coefficients. When $a_1=1$, we say $f$ is normalised.

Classical examples of modular forms are given by the Eisenstein series, which are constructed as follows. For any even $k\geq 4$, we have the weight-$k$ normalised Eisenstein series

$$\begin{equation} \begin{array}{lll} \mathbf{G}_{k}(z) &=& \frac{(k-1)!}{2(2\pi i)^{k}} \ \sum _{(m,n) \, \in \, \mathbf{Z}^2 \!\backslash (0,0)} (mz+n)^{-k} \\[18.0pt] &=& \frac{-B_k}{2k} \ \ + \ \ \sum _{n \geq 1} \sigma _{k-1}(n) q^n ,\\\end{array} \cssId{texmlid3}{\tag{4}} \end{equation}$$

where $B_k$ is the $k$th Bernoulli number (see Equation 29) and where $\sigma _r(n) = \sum _{d|n} d^r$ is the divisor function. They define modular forms of weight $k$ for the full modular group $\operatorname {SL}_2(\mathbf{Z})$. We note that for $k=2$ the series defining $\mathbf{G}_k$ fails to converge absolutely, and indeed we have $M_2(\operatorname {SL}_2(\mathbf{Z})) = 0$. We will see in §2.5 that the $q$-expansion above still has meaning for $k=2$ as a $p$-adic modular form.

The dimensions of $M_k$ and $S_k$ may typically be calculated using the Riemann–Roch theorem, and the theory of modular symbols allows one to compute, to any desired $q$-adic accuracy, the $q$-expansions of a basis for each of them. For more details on these computations, see the detailed treatment of Stein Reference Ste07.

1.2. The Hecke algebra

Two central aspects of the theory of modular forms are the action of the Hecke algebra and their associated Galois representations, which we briefly discuss now.

The spaces of modular forms $M_k(\Gamma _0(N))$ and $S_k(\Gamma _0(N))$ are finite dimensional, and they are equipped with an action of the Hecke algebra, generated by operators $T_p$ for any prime $p$, where it is customary to use the notation $U_p$ whenever $p \mid N$. In terms of $q$-expansions Equation 3 they are given by the expressions

$$\begin{equation} \begin{array}{llll} T_{p} f(q) &=& \sum _{n \geq 0} a_{np}q^n \ + \ p^{k-1} \sum _{n \geq 0} a_nq^{pn} & \qquad p \nmid N, \\U_pf(q) &=& \sum _{n \geq 0} a_{np}q^n & \qquad p \mid N. \\\end{array} \cssId{texmlid4}{\tag{5}} \end{equation}$$

Any modular form that is an eigenvector for all these Hecke operators is called an eigenform. The Eisenstein series $\mathbf{G}_{k}$ defined in Equation 4 is a simple example of an eigenform, which satisfies

$$\begin{equation} T_p \mathbf{G}_k = (1+p^{k-1})\mathbf{G}_k. \tag{6} \end{equation}$$

Note that the eigenvalue $(1+p^{k-1})$ is equal to the divisor function $\sigma _{k-1}(p)$, and therefore also the $p$th Fourier coefficient of $\mathbf{G}_k$ displayed in Equation 4. One can verify from the expressions Equation 5 that this is always the case; if $f$ is a normalised eigenform, then its eigenvalue for $T_p$ or $U_p$ is equal to its $p$th Fourier coefficient. For an introduction to the basic properties of Hecke operators, see Diamond and Shurman Reference DS05.
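The formulas in Equation 5 are easy to experiment with on truncated $q$-expansions. The following Python sketch applies $T_p$ coefficient by coefficient and checks Equation 6 for $\mathbf{G}_4$, whose constant term is $-B_4/8 = 1/240$. The helper names `sigma`, `eisenstein`, and `hecke_T` are our own and do not come from any library, and the constant term is supplied by hand.

```python
from fractions import Fraction

def sigma(r, n):
    """Divisor power sum sigma_r(n): the sum of d^r over the divisors d of n."""
    return sum(d**r for d in range(1, n + 1) if n % d == 0)

def eisenstein(k, const, prec):
    """Coefficients a_0, ..., a_{prec-1} of G_k; the caller supplies a_0 = -B_k/(2k)."""
    return [Fraction(const)] + [Fraction(sigma(k - 1, n)) for n in range(1, prec)]

def hecke_T(p, k, a):
    """Apply T_p (p not dividing the level) to a truncated q-expansion, as in Equation 5.

    Only the coefficients of q^n with n*p < len(a) are determined by the input."""
    out = []
    for n in range((len(a) - 1) // p + 1):
        coeff = a[n * p]                       # contribution of sum a_{np} q^n
        if n % p == 0:
            coeff += p**(k - 1) * a[n // p]    # contribution of p^{k-1} sum a_n q^{pn}
        out.append(coeff)
    return out

# Illustrative check of T_2 G_4 = (1 + 2^3) G_4 on the first few coefficients.
G4 = eisenstein(4, Fraction(1, 240), 40)
T2_G4 = hecke_T(2, 4, G4)
print(all(T2_G4[n] == (1 + 2**3) * G4[n] for n in range(len(T2_G4))))
```

Working with exact rationals avoids any rounding in the constant term. The price of applying $T_p$ to a truncated series is that the output is known to roughly $1/p$ of the input precision.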

Many spectacular results in number theory revolve around the notion of Galois representations. In what follows, this will always mean a continuous representation

$$\begin{equation} \rho \, : \, G_{\mathbf{Q}} \ \longrightarrow \ \mathrm{GL}_2(k), \tag{7} \end{equation}$$

where $G_{\mathbf{Q}} = \mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$ is the absolute Galois group of $\mathbf{Q}$, and $k$ is either the field of complex numbers $\mathbf{C}$ (in which case $\rho$ is called an Artin representation) or a $p$-adic field such as $\mathbf{Q}_p$. Important examples of the latter arise from elliptic curves. Suppose $E$ is an elliptic curve defined over $\mathbf{Q}$, choose a prime $p$, and consider the $p$-adic Tate module, obtained from the inverse limit of the torsion points on $E$ of $p$-power order,

$$\begin{equation} \mathbf{Q}_p \otimes _{\mathbf{Z}_p} \left( \varprojlim _{n} \, E[p^n]\right), \qquad \text{where} \ \ E[p^n] = \mathrm{Ker}(E \stackrel{\times p^n}{\longrightarrow } E), \cssId{texmlid32}{\tag{8}} \end{equation}$$

which is a two-dimensional $\mathbf{Q}_p$-vector space. This space has a natural action of the Galois group $G_{\mathbf{Q}}$, given by the Galois action on the coordinates of the $p^n$-torsion points on $E$, which are algebraic numbers. Many important arithmetic properties of the elliptic curve $E$ can be recovered from this Galois representation. For instance, for any prime $\ell \neq p$ of good reduction for $E$, we have that

$$\begin{equation} \operatorname {Tr}\rho (\mathrm{Frob}_{\ell }) = \ell + 1 - |E(\mathbf{F}_{\ell })|. \cssId{texmlid5}{\tag{9}} \end{equation}$$

In other words, the trace of the matrix of Frobenius at $\ell$ is related to the number of points of $E$ over the finite field $\mathbf{F}_{\ell }$. It is striking that this representation depends on the choice of a prime $p$, but the traces of Frobenius at primes $\ell \neq p$ of good reduction for $E$ are integers and are independent of the choice of $p$.

Suppose $f$ is an eigenform of level $N$ and weight $k=2$, where $N$ is the conductor of $E$. We say that $f$ is attached to $E$ if the traces of Frobenius elements Equation 9 are equal to the Fourier coefficients $a_{\ell }$ of $f$. In other words, this means that for all but finitely many $\ell$ we have

$$\begin{equation} a_{\ell } = \ell + 1 - |E(\mathbf{F}_{\ell })|. \tag{10} \end{equation}$$

Important developments, culminating in the work of Wiles Reference Wil95, Taylor and Wiles Reference TW95, and Breuil, Conrad, Diamond, and Taylor Reference BCDT01 show that for any elliptic curve $E$ over $\mathbf{Q}$, there exists a modular form that is attached to it in this sense. This has led not just to a proof of Fermat’s last theorem, but subsequent developments continue to this day to settle long-standing conjectures in number theory.

Remark.

We briefly mention that the converse was known much earlier. That is, a construction of Eichler and Shimura attaches an elliptic curve $E$ to any eigenform of weight $k=2$ with integer Fourier coefficients. Generally, to any normalised cuspidal eigenform $f$ of weight $k$ one may attach a two-dimensional Galois representation $\rho$ of $G_{\mathbf{Q}}$, which is unramified at all primes $\ell$ away from a finite set, and which satisfies

$$\begin{equation} \mathrm{det} \left( 1- \rho (\mathrm{Frob}_{\ell })T \right) = 1 - a_{\ell } T + \ell ^{k-1} T^2 \tag{11} \end{equation}$$

where $a_{\ell }$ is the $\ell$th Fourier coefficient of $f$. When $k\geq 2$ this representation is valued in a nonarchimedean local field, and it was constructed by Deligne Reference Del71, though it is no longer attached to an elliptic curve as in the aforementioned construction for $k=2$ due to Eichler and Shimura. When $k=1$, it is an Artin representation, constructed by Deligne and Serre Reference DS74 from the representations in higher weight via congruences.

1.3. Some examples of congruences

The Ramanujan $\Delta$-function is the unique normalised cusp form of weight $12$ for the group $\Gamma = \operatorname {SL}_2(\mathbf{Z})$. Its $q$-expansion is given by the infinite product due to Jacobi,

$$\begin{equation} \Delta (q) \ = \ q \prod _{n = 1}^{\infty } (1-q^n)^{24} \ = \ \sum _{n = 1}^{\infty } \tau (n) q^n. \tag{12} \end{equation}$$

This explicit product allows us to easily establish a number of congruences between the Fourier coefficients of $\Delta$ and those of various other modular forms, going back to the early twentieth century. For the reader who would like to inspect these manually, we tabulate its first few Fourier coefficients $\tau (p)$ for $p$ prime:

$$\begin{equation} \begin{array}{|c||ccccc|} \hline p & 2 & 3 & 5 & 7 & 11 \\\tau (p) & -24 & 252 & 4830 & -16744 & 534612 \\\hline \hline p & 13 & 17 & 19 & 23 & 29 \\\tau (p) & -577738 & -6905934 & 10661420 & 18643272 & 128406630 \\\hline \hline p & 31 & 37 & 41 & 43 & \\\tau (p) & -52843168 & -182213314 & 308120442 & -17125708 & \\\hline \end{array} \cssId{texmlid7}{\tag{13}} \end{equation}$$
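For readers who prefer to recompute such values rather than trust a table, the product in Equation 12 can be expanded directly with integer arithmetic. The following Python sketch does this by multiplying in the factors $(1-q^n)$ one at a time; the function name `delta_coefficients` is our own choice.

```python
def delta_coefficients(prec):
    """Return [0, tau(1), ..., tau(prec - 1)] from Delta = q * prod_{n >= 1} (1 - q^n)^24."""
    m = prec - 1                      # we need prod (1 - q^n)^24 modulo q^m
    series = [0] * m
    series[0] = 1
    for n in range(1, m):             # factors with n >= m do not affect these coefficients
        for _ in range(24):
            # multiply in place by (1 - q^n), from high degree down to low degree
            for i in range(m - 1, n - 1, -1):
                series[i] -= series[i - n]
    return [0] + series               # the extra factor q shifts every index by one

tau = delta_coefficients(44)
for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43):
    print(p, tau[p])
```

This naive expansion is more than fast enough for the small range considered here. For large $n$ one would use a computer algebra system instead.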

Example 1.1.

We begin with a congruence that appears in the work of Ramanujan Reference Ram16. Consider the weight-$k$ Eisenstein series $\mathbf{G}_{k}$ introduced in Equation 4. For $k=12$, its constant term is equal to $\frac{691}{65520}$, whereas for $k=6$ the constant term is $\frac{-1}{504}$. Since the space $M_{12}(\operatorname {SL}_2(\mathbf{Z}))$ is two dimensional, spanned by $\mathbf{G}_{12}$ and $\Delta$, the form $\mathbf{G}_6^2$ must be a linear combination of the two. Computing the first two terms of all three $q$-expansions, we find that

$$\begin{equation} \frac{691}{65520} \cdot 504^2 \cdot \mathbf{G}_6(q)^2 = \mathbf{G}_{12}(q) - \frac{756}{65}\Delta (q), \tag{14} \end{equation}$$

and since all three modular forms involved have $691$-integral $q$-expansions, we obtain

$$\begin{equation} \Delta (q) \equiv \mathbf{G}_{12}(q) \pmod {691}. \cssId{texmlid11}{\tag{15}} \end{equation}$$

In particular, we see that for any prime $p$, we get the celebrated Ramanujan congruences

$$\begin{equation} \tau (p) \equiv 1 + p^{11} \pmod {691}. \cssId{texmlid9}{\tag{16}} \end{equation}$$

For a beautiful and very detailed expository discussion of this example in the broader context of ideal class groups of cyclotomic fields and Galois representations, see Mazur Reference Maz11.
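The congruence in Equation 15 says that $\tau (n) \equiv \sigma _{11}(n) \pmod {691}$ for every $n \geq 1$, not only for primes, and this is easy to test numerically. The sketch below expands the product in Equation 12 and compares coefficients; the helper names `delta_coefficients` and `sigma` are our own.

```python
def delta_coefficients(prec):
    """Return [0, tau(1), ..., tau(prec - 1)] from Delta = q * prod_{n >= 1} (1 - q^n)^24."""
    m = prec - 1
    series = [0] * m
    series[0] = 1
    for n in range(1, m):
        for _ in range(24):
            for i in range(m - 1, n - 1, -1):   # in-place multiplication by (1 - q^n)
                series[i] -= series[i - n]
    return [0] + series

def sigma(r, n):
    """Divisor power sum sigma_r(n)."""
    return sum(d**r for d in range(1, n + 1) if n % d == 0)

tau = delta_coefficients(60)
# Compare tau(n) with the n-th coefficient sigma_11(n) of G_12, modulo 691.
print(all((tau[n] - sigma(11, n)) % 691 == 0 for n in range(1, 60)))
```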

Example 1.2.

This example is due to Wilton Reference Wil30, and it establishes a congruence modulo $23$ between $\Delta$ and a certain form of weight $1$. The reader familiar with the Dedekind $\eta$-function, which is a modular form of weight $1/2$ for some character $\chi _{24}$ of the metaplectic double cover of $\operatorname {SL}_2(\mathbf{Z})$, will recognise the form on the right-hand side of the first congruence below as $\eta (q)\eta (q^{23})$. We have the following congruences for $\Delta$,

$$\begin{equation} \begin{array}{llll} q\prod \limits _{n = 1}^{\infty }(1-q^n)^{24} & \equiv & q^{1/24}\prod \limits _{n = 1}^{\infty }(1-q^n) \ \ \cdot \ \ q^{23/24} \prod \limits _{n = 1}^{\infty }(1-q^{23n}) & \pmod {23} \\[6.0pt] & \equiv & \frac{1}{2} \sum \limits _{u,v \in \mathbf{Z}} \left(q^{u^2 + uv + 6v^2} - q^{2u^2 + uv + 3v^2} \right) & \pmod {23} \end{array} \cssId{texmlid6}{\tag{17}} \end{equation}$$

where the first is a consequence of the fact that for any prime $p$, the binomial coefficient $\binom{p}{i}$ is divisible by $p$ for all $0 < i < p$, and the second follows from a calculation using the Euler identity

$$\begin{equation} \prod _{n=1}^{\infty } (1 - q^n) = \sum _{n \in \mathbf{Z}} (-1)^n q^{\frac{3n^2+n}{2}}. \cssId{texmlid33}{\tag{18}} \end{equation}$$

It is a classical result (see for instance Hecke Reference Hec26) that the right-hand side of Equation 17 is a modular form of weight $1$. It is in fact a Hecke eigenform, with an associated Artin representation that we can identify easily: the quadratic field $\mathbf{Q}(\sqrt {-23})$ has class number $3$, and its Hilbert class field $H$ is obtained by adjoining a root of the cubic polynomial

$$\begin{equation} f(x) = x^3 - x - 1, \tag{19} \end{equation}$$

which has discriminant $-23$. The natural quotient gives us

$$\begin{equation} G_{\mathbf{Q}} \ \longrightarrow \ \operatorname {Gal}(H/\mathbf{Q}) \simeq S_3 \ \longrightarrow \ \operatorname {GL}_2(\mathbf{C}) \tag{20} \end{equation}$$

from the unique two-dimensional irreducible representation of $S_3$. This is the two-dimensional Artin representation attached to the above weight-$1$ form. In particular, this means that the congruence class of $\tau (p)$ modulo $23$ may be worked out from the splitting behaviour of the prime $p$ in the extension $H / \mathbf{Q}$. The reader may enjoy verifying in general, or simply checking on a few small values of $p$ in the table Equation 13, that this boils down to the following statement for any prime $p \neq 23$, where $(a/p)$ denotes the Legendre symbol for $p \nmid a$, which equals $1$ if $a$ is a square modulo $p$, and $-1$ otherwise:

$$\begin{equation} \begin{array}{lll} \tau (p) \equiv \phantom {-}0 \pmod {23} & \qquad \text{if} \ \ \left(-23/p\right) = -1, \\\tau (p) \equiv \phantom {-}2 \pmod {23} & \qquad \text{if} \ \ \left(-23/p\right) = \phantom {-}1 \ \ \text{and} \ \ p = u^2 + 23v^2 ,\\\tau (p) \equiv -1 \pmod {23} & \qquad \text{if} \ \ \left(-23/p\right) = \phantom {-}1 \ \ \text{and} \ \ p \neq u^2 + 23v^2 .\\\end{array} \cssId{texmlid10}{\tag{21}} \end{equation}$$
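The three cases of Equation 21 can be checked by machine for odd primes $p \neq 23$. The sketch below computes the Legendre symbol via Euler's criterion and tests representability by $u^2 + 23v^2$ by brute force. All function names are our own, and the range of primes is an arbitrary choice that is just large enough to include $59 = 6^2 + 23 \cdot 1^2$ and $101 = 3^2 + 23 \cdot 2^2$.

```python
from math import isqrt

def delta_coefficients(prec):
    """Return [0, tau(1), ..., tau(prec - 1)] from Delta = q * prod_{n >= 1} (1 - q^n)^24."""
    m = prec - 1
    series = [0] * m
    series[0] = 1
    for n in range(1, m):
        for _ in range(24):
            for i in range(m - 1, n - 1, -1):   # in-place multiplication by (1 - q^n)
                series[i] -= series[i - n]
    return [0] + series

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))

def is_norm_form(p):
    """True if p = u^2 + 23 v^2 for some integers u, v."""
    v = 0
    while 23 * v * v <= p:
        r = p - 23 * v * v
        if isqrt(r) ** 2 == r:
            return True
        v += 1
    return False

def predicted_residue(p):
    """Residue of tau(p) mod 23 predicted by Equation 21, for an odd prime p != 23."""
    legendre = pow(-23 % p, (p - 1) // 2, p)   # Euler's criterion: equals 1 or p - 1
    if legendre == p - 1:                      # (-23/p) = -1
        return 0
    return 2 if is_norm_form(p) else 22        # 22 represents -1 modulo 23

tau = delta_coefficients(110)
for p in range(3, 110):
    if is_prime(p) and p != 23:
        print(p, tau[p] % 23, predicted_residue(p))
```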

Example 1.3.

As in the previous example, an elementary divisibility of binomial coefficients allows us to obtain from the infinite product expansion the following congruence for $\Delta (q)$,

$$\begin{equation} \Delta (q) \ = \ q\prod _{n = 1}^{\infty }(1-q^n)^{24} \equiv q \prod _{n = 1}^{\infty }(1-q^n)^2 (1-q^{11n})^2 \pmod {11}. \cssId{texmlid12}{\tag{22}} \end{equation}$$

The right-hand side is a weight-$2$ normalised cusp form of level $\Gamma _0(11)$. It is associated to the elliptic curve

$$\begin{equation} E : y^2 + y = x^3 - x^2 -10x -20 \cssId{texmlid8}{\tag{23}} \end{equation}$$

so that we obtain in particular the following congruences for $p \neq 11$:

$$\begin{equation} \tau (p) \equiv p + 1 - |E(\mathbf{F}_{p})| \quad \pmod {11}. \tag{24} \end{equation}$$

The reader may enjoy verifying this for a few small primes, using the table in Equation 13 and the curve in Equation 23. Unfortunately, the law governing the association $p \mapsto p + 1 - |E(\mathbf{F}_{p})|$ cannot be made explicit in the same elementary terms as in Equation 16 and Equation 21. The reason for this was explained by Shimura Reference Shi66: this law is governed by the traces of the $11$-adic representation attached to the elliptic curve Equation 23, and Shimura showed that its mod $11$ reduction has image $\mathrm{GL}_2(\mathbf{F}_{11})$. Since this group is not solvable, the law is equivalent to the splitting behaviour of primes $p$ in a nonsolvable extension of $\mathbf{Q}$, in contrast with Example 1.2, where the relevant group was $S_3$.
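The verification suggested in this example can be carried out by counting points on the curve Equation 23 by brute force. The sketch below counts the solutions of $y^2 + y = x^3 - x^2 - 10x - 20$ over $\mathbf{F}_p$, adds the point at infinity, and compares $p + 1 - |E(\mathbf{F}_p)|$ with $\tau (p)$ modulo $11$. The function names are our own, and the method is only sensible for small $p$.

```python
def delta_coefficients(prec):
    """Return [0, tau(1), ..., tau(prec - 1)] from Delta = q * prod_{n >= 1} (1 - q^n)^24."""
    m = prec - 1
    series = [0] * m
    series[0] = 1
    for n in range(1, m):
        for _ in range(24):
            for i in range(m - 1, n - 1, -1):   # in-place multiplication by (1 - q^n)
                series[i] -= series[i - n]
    return [0] + series

def count_points(p):
    """Number of points of E: y^2 + y = x^3 - x^2 - 10x - 20 over F_p, including infinity."""
    lhs_counts = [0] * p
    for y in range(p):
        lhs_counts[(y * y + y) % p] += 1        # how often each value of y^2 + y occurs
    total = 1                                   # the point at infinity
    for x in range(p):
        total += lhs_counts[(x**3 - x**2 - 10 * x - 20) % p]
    return total

def trace_of_frobenius(p):
    return p + 1 - count_points(p)

tau = delta_coefficients(44)
for p in (2, 3, 5, 7, 13, 17, 19, 23, 29, 31, 37, 41, 43):
    print(p, trace_of_frobenius(p), (tau[p] - trace_of_frobenius(p)) % 11 == 0)
```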

1.4. The context of this article

The three examples of congruences Equation 15, Equation 17, Equation 22 are of very different flavours, and they illustrate different but related phenomena that arise in the $p$-adic theory of modular forms:

•: The first is a congruence between a cusp form and an Eisenstein series, of the same weight. Such congruences are central in Iwasawa theory, and related to the notion of the Eisenstein ideal; see Mazur Reference Maz77. We will not discuss this theme, but we mention that this is a fascinating topic that remains today an active area of research; see for instance Reference Mer96, Reference CE05, Reference Lec18, Reference WWE20. A beautiful introduction to the ideas in this area can be found in Mazur Reference Maz11.
•: The second and third are both congruences between two cusp forms of different weights. This resonates with the framework of $p$-adic families of modular forms, as developed in Reference Hid86b, Reference Hid86a, Reference Col97b, Reference CM98 and by many others, and it is these types of congruences that form the focus of this document. We note that these two examples are of a very different nature. Example 1.2 exhibits a congruence between a modular form of weight $1$ and another of a higher weight, and it results in an elementary description of the congruence class of $\tau (p)$ modulo $23$. (The existence of such congruences is an important ingredient in the aforementioned work of Deligne and Serre Reference DS74 on the existence of Artin representations attached to modular forms of weight $1$.) Example 1.3 on the other hand exhibits a congruence with a form associated to an elliptic curve. There is no similarly elementary characterisation of the congruence class of $\tau (p)$ modulo $11$.

In these notes, we will focus primarily on the theme of congruences between modular forms of different weights and $p$-adic families. Traditionally, the theory was built around the prototypical example of the Eisenstein family, as in Coleman Reference Col97b. More recent advances due to Pilloni Reference Pil13 and Andreatta, Iovita, and Stevens Reference AIS14 on the geometric interpolation of line bundles allow us to develop the theory abstractly, without relying on the Eisenstein family. From a practical and computational point of view, this family remains of fundamental importance, so the next section will quickly review it, motivated by the strategy of Serre to show the existence of the Kubota–Leopoldt $p$-adic L-function.

2. Kummer congruences and Eisenstein series

We begin with a brief discussion of the Kummer congruences, and introduce Serre’s important idea of inferring the $p$-adic variation of the constant term of a modular form from that of its higher Fourier coefficients. This idea appeared in Serre Reference Ser73 and goes back to observations of Hecke Reference Hec24 and Siegel and Klingen Reference Kli62, Reference Sie68. It will make several appearances throughout these notes.

2.1. The Kummer congruences

Recall that the Riemann zeta function $\zeta (s)$ may be analytically continued to the entire complex plane, except for a simple pole with residue $1$ at the point $s=1$. It satisfies the functional equation

$$\begin{equation} \pi ^{-s/2} \Gamma \left(\frac{s}{2}\right) \zeta (s) = \pi ^{-(1-s)/2} \Gamma \left(\frac{1-s}{2}\right) \zeta (1-s). \tag{25} \end{equation}$$

Of special importance are its values at negative odd integers (or equivalently, by the functional equation, at positive even integers), which were computed first by Euler in 1734 and read on 5 December 1735 in the St. Petersburg Academy of Sciences. The starting point for Euler was the easily verified identity

$$\begin{equation} \sin (\pi z) = \pi z \prod _{n \geq 1} \left(1 - \frac{z^2}{n^2}\right). \tag{26} \end{equation}$$

By taking the logarithmic derivative, we obtain the identities

$$\begin{eqnarray} \pi z\cot (\pi z) &=& 1 -2\sum _{n = 1}^{\infty } \sum _{k = 1}^{\infty } \frac{z^{2k}}{n^{2k}} \tag{27}\\ &=& 1 -2\sum _{k=1}^{\infty } \zeta (2k)z^{2k}. \cssId{texmlid13}{\tag{28}} \end{eqnarray}$$

On the other hand, the Bernoulli numbers are defined via the generating series

$$\begin{equation} \frac{t}{e^t-1} = \sum _{k = 0}^{\infty } B_k \frac{t^k}{k!}, \cssId{texmlid1}{\tag{29}} \end{equation}$$

and hence we can formally extract the even part of this series as

$$\begin{eqnarray} \frac{1}{2} \left(\frac{t}{e^t-1} + \frac{-t}{e^{-t}-1} \right) &=& \frac{t}{2}\cdot \frac{e^{t/2} + e^{-t/2}}{e^{t/2} - e^{-t/2}} \tag{30}\\ &=& \frac{t}{2} \cdot \mathrm{coth} \left(\frac{t}{2} \right). \tag{31} \end{eqnarray}$$

Bearing in mind that $i \mathrm{coth}(iz) = \cot (z)$, we obtain the identity

$$\begin{equation} \cot (z) = \frac{1}{z} + \sum _{k=1}^{\infty } \frac{(-1)^k2^{2k}B_{2k}}{(2k)!}z^{2k-1}. \cssId{texmlid14}{\tag{32}} \end{equation}$$

It now follows formally from Equation 28 and Equation 32 that

$$\begin{equation} \zeta (2k) = \frac{(-1)^{k-1}(2\pi )^{2k}}{2(2k)!} B_{2k} \tag{33} \end{equation}$$

and hence by the functional equation

$$\begin{equation} \zeta (1-2k) = \frac{-B_{2k}}{2k}. \cssId{texmlid34}{\tag{34}} \end{equation}$$

The fact that the value of the zeta function at negative odd integers is a rational number is remarkable. We will revisit this in the more general setting of L-functions of totally real number fields, when we discuss the explicit formula for this rational number obtained by Klingen and Siegel Reference Kli62, Reference Sie68, following an idea of Hecke Reference Hec24 using diagonal restrictions of Hilbert Eisenstein series. This is discussed in §4.5.
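The rational numbers in Equation 34 are straightforward to compute exactly. The following Python sketch generates Bernoulli numbers from the standard recurrence $\sum _{j=0}^{n} \binom{n+1}{j} B_j = 0$ for $n \geq 1$, which is equivalent to the generating series in Equation 29, and then evaluates $\zeta (1-k) = -B_k/k$. The function names `bernoulli` and `zeta_at_negative` are our own.

```python
from fractions import Fraction
from math import comb

def bernoulli(m):
    """Return [B_0, ..., B_m] as exact fractions, with the convention B_1 = -1/2."""
    B = [Fraction(1)]
    for n in range(1, m + 1):
        # solve sum_{j=0}^{n} C(n+1, j) B_j = 0 for B_n
        B.append(-sum(comb(n + 1, j) * B[j] for j in range(n)) / (n + 1))
    return B

def zeta_at_negative(k, B):
    """The value zeta(1 - k) = -B_k / k for an even integer k >= 2."""
    return -B[k] / k

B = bernoulli(12)
print(B[12])                          # the Bernoulli number B_12
print(zeta_at_negative(2, B))         # zeta(-1)
print(zeta_at_negative(12, B) / 2)    # the constant term -B_12/24 of G_12
```

Halving $\zeta (1-k)$ recovers the constant term $-B_k/(2k)$ of $\mathbf{G}_k$ from Equation 4, so the same routine reproduces the constants used in Example 1.1.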

The Bernoulli numbers have interesting $p$-adic properties, as shown notably by two results established in the mid-nineteenth century which are the starting point for our investigations: the Clausen–von Staudt theorem Reference Cla40, Reference vS40 and the Kummer congruences Reference Kum51. For convenience, we assume henceforth that $p \neq 2$.

Lemma 2.1.

If $k,k'$ are two positive even integers such that $k\!\equiv \!k'\!\pmod {(p-1)p^n}$, then

$$\begin{equation} \begin{array}{lcrcl} \text{if} \ \ \ (p-1) \nmid k & : & (1-p^{k-1})B_k/k & \equiv & (1-p^{k'-1})B_{k'}/k' \pmod {p^{n+1}}, \\\text{if} \ \ \ (p-1) \mid k & : & v_p\left(B_k/k\right) & = & -1 -v_p(k). \end{array} \cssId{texmlid16}{\tag{35}} \end{equation}$$
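Both statements of the lemma can be tested numerically with exact rational arithmetic. The sketch below is only a sanity check on a small, arbitrarily chosen range of $p$, $n$, and $k$; the names `bernoulli`, `vp`, and `kummer_term` are our own.

```python
from fractions import Fraction
from math import comb

def bernoulli(m):
    """Return [B_0, ..., B_m] as exact fractions (convention B_1 = -1/2)."""
    B = [Fraction(1)]
    for n in range(1, m + 1):
        B.append(-sum(comb(n + 1, j) * B[j] for j in range(n)) / (n + 1))
    return B

def vp(x, p):
    """p-adic valuation of a rational number (infinity for zero)."""
    x = Fraction(x)
    if x == 0:
        return float("inf")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def kummer_term(k, p, B):
    """The quantity (1 - p^(k-1)) B_k / k appearing in Equation 35."""
    return (1 - p**(k - 1)) * B[k] / k

B = bernoulli(60)
p, n = 5, 1
step = (p - 1) * p**n
for k in range(2, 60 - step + 1, 2):
    if k % (p - 1) != 0:
        diff = kummer_term(k, p, B) - kummer_term(k + step, p, B)
        print(k, k + step, vp(diff, p) >= n + 1)
```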

The Kummer congruences show in particular that the quantity

$$\begin{equation} (1-p^{k-1})\frac{B_k}{k}, \qquad k \in 1 + (p-1)\mathbf{Z}_{\geq 0}, \cssId{texmlid15}{\tag{36}} \end{equation}$$

exhibits $p$-adic continuity as a function of $k$. To the optimistic reader this may already suggest that there may be an interesting $p$-adic analytic function $\zeta _p: \mathbf{Z}_p \to \mathbf{C}_p$ whose special values at arguments $s = 1-k$ of the above form equal the quantities Equation 36. The problem of finding this function is therefore vaguely akin to attempting to reconstruct the complex Riemann zeta function $\zeta (s)$ just from the knowledge of its special values $-B_{k}/k$ at negative odd arguments $s=1-k$. Kubota and Leopoldt Reference KL64 show the following.

Theorem 2.2 (Kubota and Leopoldt).

There is a unique $p$-adic analytic function $\zeta _p : \mathbf{Z}_p \to \mathbf{C}_p$ such that

$$\begin{equation} \zeta _p(1-k) = (1-p^{k-1})\frac{-B_k}{k} \qquad \text{for all} \ k \in 1 + (p-1)\mathbf{Z}_{\geq 0}. \tag{37} \end{equation}$$

What it means precisely for a $p$-adically continuous function $\mathbf{Z}_p \to \mathbf{C}_p$ to be analytic will not be important for now, and we will not define it precisely until §3.6 when we introduce the Iwasawa algebra.

2.2. The $p$-adic family of Eisenstein series

Serre observed that the congruences of Bernoulli numbers given in Equation 35 can be upgraded to congruences between $q$-expansions of modular forms. Notice first that we see from the Kummer congruences that the Bernoulli numbers need to be modified by a factor $(1-p^{k-1})$ in order to interpolate nicely as a function of $k$. Likewise, we need to adjust the Eisenstein series introduced above by setting

$$\begin{equation} \begin{array}{lll} \mathbf{G}_k^{(p)} &=& (1-p^{k-1}V_p)\mathbf{G}_k \\[3.0pt] &=& (1-p^{k-1})\frac{-B_k}{2k} \ + \ \sum _{n\geq 1} \Bigl (\sum _{p \nmid d \mid n} d^{k-1} \Bigr ) q^n, \end{array} \tag{38} \end{equation}$$

where $V_p$ is the operator $f(q) \mapsto f(q^p)$ (see Equation 50). This is a modular form for $\Gamma _0(p)$, often referred to as the (ordinary) $p$-stabilisation of $\mathbf{G}_k$. We define $\mathbf{E}^{(p)}_k$ to be its unique multiple with constant coefficient $1$. Observe that elementary congruences for the higher Fourier coefficients yield an upgraded version of the congruences Equation 35. More precisely, when

$$\begin{equation} k \equiv k' \pmod {(p-1)p^n}, \tag{39} \end{equation}$$

we have that $d^{k-1} \equiv d^{k'-1} \pmod {p^{n+1}}$ for any $d$ not divisible by $p$. Indeed, for $n = 0$ this is the statement of Fermat’s little theorem, and the general case follows by induction. Therefore we obtain

$$\begin{equation} \begin{array}{lcrcll} \text{if} \ \ \ (p-1) \nmid k & : & \mathbf{G}_k^{(p)}(q) & \equiv & \mathbf{G}_{k'}^{(p)}(q) & \pmod {p^{n+1}}, \\[3.0pt] \text{if} \ \ \ (p-1) \mid k & : & \mathbf{E}_k^{(p)}(q) & \equiv & \mathbf{E}_{k'}^{(p)}(q) & \pmod {p^{n+1}}. \end{array} \cssId{texmlid35}{\tag{40}} \end{equation}$$
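The elementary half of these congruences, concerning the higher Fourier coefficients, can be observed directly. The following sketch computes the $n$th coefficient of the $p$-stabilised series in two ways: as the restricted divisor sum $\sum _{p \nmid d \mid n} d^{k-1}$, and as the $n$th coefficient of $\mathbf{G}_k(q) - p^{k-1}\mathbf{G}_k(q^p)$. It then compares two weights that are congruent modulo $(p-1)p^n$. The function names and the sample values of $p$, $k$, $k'$ are our own choices.

```python
def sigma(r, n):
    """Divisor power sum sigma_r(n)."""
    return sum(d**r for d in range(1, n + 1) if n % d == 0)

def stabilised_coeff(k, p, n):
    """Sum of d^(k-1) over the divisors d of n that are prime to p (n >= 1)."""
    return sum(d**(k - 1) for d in range(1, n + 1) if n % d == 0 and d % p != 0)

def stabilised_coeff_via_shift(k, p, n):
    """n-th coefficient (n >= 1) of G_k(q) - p^(k-1) G_k(q^p)."""
    coeff = sigma(k - 1, n)
    if n % p == 0:
        coeff -= p**(k - 1) * sigma(k - 1, n // p)
    return coeff

# Sample check: p = 5 and k = 6, k' = 26 agree modulo (p - 1) p^1 = 20,
# so the higher coefficients should agree modulo p^2 = 25.
p, k, k_prime = 5, 6, 26
print(all(stabilised_coeff(k, p, n) == stabilised_coeff_via_shift(k, p, n) for n in range(1, 41)))
print(all((stabilised_coeff(k, p, n) - stabilised_coeff(k_prime, p, n)) % p**2 == 0 for n in range(1, 41)))
```

The constant terms are not touched by this check. Their congruence is exactly the content of the Kummer congruences.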

The observation of Serre Reference Ser73, which was inspired by earlier ideas of Hecke Reference Hec24 and Siegel Reference Sie68, was that in establishing these congruences of Eisenstein series, there is a striking dichotomy between the congruences between the constant terms (which are the Kummer congruences and hence are somewhat deep) and the higher coefficients (which follow trivially from Fermat’s little theorem and hence are not deep). His idea was to try and obtain the Kummer congruences and the construction of the Kubota–Leopoldt zeta function $\zeta _p(s)$ by inheriting congruences of a more elementary nature from the higher coefficients.

This idea, whereby information on the constant coefficient is transferred from the higher coefficients, will appear several times throughout these notes, and is very powerful and useful in a variety of contexts.

2.3. Serre’s theory of $p$-adic modular forms

We now follow Serre Reference Ser73 and establish some basic definitions of $p$-adic modular forms. We follow Serre in restricting to the case of level $1$ modular forms defined over $\mathbf{Q}_p$, but we reassure the reader who is nervous about this that these assumptions will eventually be lifted when we adopt the more geometric viewpoint due to Katz in the next lecture.

For any formal power series in the variable $q$ given by

$$\begin{equation} f(q) = a_0 + a_1q + a_2q^2 + \cdots \in \mathbf{Q}_p\lBrack q \rBrack , \tag{41} \end{equation}$$

we define $v_p(f) = \inf _n (v_p(a_n))$, where $v_p$ is the usual $p$-adic valuation on $\mathbf{Q}_p$. We define the space of $p$-adic modular forms to be the collection of $f(q) \in \mathbf{Q}_p\lBrack q \rBrack$ such that there is a sequence $f_i \in M_{k_i}(\operatorname {SL}_2(\mathbf{Z}))$ with rational Fourier coefficients satisfying

$$\begin{equation} v_p(f(q) - f_i(q)) \to \infty . \tag{42} \end{equation}$$

A $p$-adic modular form $f(q)$ therefore is obtained as a limit of $q$-expansions of classical modular forms. The following important proposition of Serre Reference Ser73, §1.3, Théorème 1 states that the sequence of their weights $k_i$ must tend to a limit $p$-adically. Its proof lies significantly deeper than the rest of the contents of Reference Ser73, which are otherwise largely established by more elementary means.

Proposition 2.3.

Let $f,g$ be two classical modular forms of weights $k,\ell$ on $\operatorname {SL}_2(\mathbf{Z})$, both nonzero and normalised such that $v_p(f) = 0$. Suppose that we have

$$\begin{equation} v_p(f-g) \geq m \tag{43} \end{equation}$$

for some positive integer $m$. Then it must be true that

$$\begin{equation} \begin{array}{lll} k \equiv \ell \quad \bmod {(p-1)p^{m-1}} & \text{if} & p \geq 3, \\k \equiv \ell \quad \bmod {2^{m-2}} & \text{if} & p = 2. \end{array} \tag{44} \end{equation}$$

As a consequence of this proposition, one easily checks that every $p$-adic modular form $f$ has a well-defined weight

$$\begin{equation} k \coloneq \varprojlim _i k_i \in \mathbf{Z}_p \times \mathbf{Z}/(p-1)\mathbf{Z} = \varprojlim _m \mathbf{Z}/(p-1)p^m\mathbf{Z}. \tag{45} \end{equation}$$

For instance, it is not difficult to see that the $q$-expansion

$$\begin{eqnarray*} \mathbf{E}_4(q)^{-1} &=& (1 + 240q + 2160q^2 + 6720q^3 + \cdots )^{-1} \\ &=& \phantom {(}1 - 240q + 55440q^2 - 12793920q^3 + 2952385680q^4 + \cdots \end{eqnarray*}$$

is a $2$-adic, $3$-adic, and $5$-adic modular form of weight $-4$.
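One way to see this concretely is to note that $\mathbf{E}_4 \equiv 1 \pmod {240}$, so that $\mathbf{E}_4^{p^m} \to 1$ and hence $\mathbf{E}_4^{p^m - 1} \to \mathbf{E}_4^{-1}$ $p$-adically for $p \in \{2,3,5\}$, while the weights $4(p^m-1)$ tend to $-4$. The sketch below inverts the truncated $q$-expansion and compares it with the classical forms $\mathbf{E}_4^{4}$ modulo $25$ and $\mathbf{E}_4^{2}$ modulo $9$. The helper names and the truncation order are our own choices.

```python
def sigma3(n):
    return sum(d**3 for d in range(1, n + 1) if n % d == 0)

def multiply(a, b):
    """Product of two power series truncated to the common length of a and b."""
    prec = len(a)
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(prec)]

def inverse(a):
    """Inverse of a power series with constant term 1, to the same precision."""
    b = [1] + [0] * (len(a) - 1)
    for n in range(1, len(a)):
        b[n] = -sum(a[i] * b[n - i] for i in range(1, n + 1))
    return b

def power(a, e):
    result = [1] + [0] * (len(a) - 1)
    for _ in range(e):
        result = multiply(result, a)
    return result

PREC = 12
E4 = [1] + [240 * sigma3(n) for n in range(1, PREC)]
E4_inv = inverse(E4)
print(E4_inv[:5])
# E_4^4 has weight 16, which is congruent to -4 modulo (5 - 1) * 5 = 20.
print(all((x - y) % 25 == 0 for x, y in zip(power(E4, 4), E4_inv)))
# E_4^2 has weight 8, which is congruent to -4 modulo (3 - 1) * 3 = 6.
print(all((x - y) % 9 == 0 for x, y in zip(power(E4, 2), E4_inv)))
```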

This is the point where Serre is able to realise the idea of “inheriting” congruences for the constant terms of Eisenstein series from the much more elementary congruences between their higher coefficients. The following result is a corollary of Proposition 2.3, and we leave the proof to the reader.

Theorem 2.4 (Serre).

Suppose we have a sequence of $p$-adic modular forms of weights $k_i$,

$$\begin{equation} f_i(q) = a_0^{(i)} + a_1^{(i)}q + a_2^{(i)}q^2 + \cdots , \tag{46} \end{equation}$$

which satisfy the two properties,

•: the sequences $a_n^{(i)}$ for $n \geq 1$ tend uniformly to a limit $a_n \in \mathbf{Q}_p$,
•: the weights $k_i$ tend to a limit $k \neq 0$.

Then the constant terms $a_0^{(i)}$ tend to a limit $a_0$, and the $q$-series

$$\begin{equation} f(q) = a_0 + a_1q + a_2q^2 + \cdots \in \mathbf{Q}_p\lBrack q \rBrack \tag{47} \end{equation}$$

is a $p$-adic modular form.

Notice that we may use the above theorem to show the existence of a continuous function interpolating the constant terms of the Eisenstein family! Indeed, when a sequence of integers $k$ tends to a limit in $\mathbf{Z}_p$, we already noticed that the higher Fourier coefficients of $\mathbf{G}_k^{(p)}$ tend uniformly to a limit for elementary reasons. This implies that its constant term, which is one half of

$$\begin{equation} \zeta _p(1-k) = (1- p^{k-1})\zeta (1-k), \tag{48} \end{equation}$$

also tends to a limit, and it extends to a continuous function of $k$ in $\mathbf{Z}_p \times \mathbf{Z}/(p-1)\mathbf{Z}$, which is precisely the Kubota–Leopoldt $p$-adic L-function. The paper of Serre pushes this idea further and strengthens this significantly by deducing also its analytic properties. The above arguments may be strengthened to give an effective version of the claimed convergence, whose rate may be controlled to truly recover the Kummer congruences for Bernoulli numbers from elementary congruences between the higher coefficients.

2.4. Hecke operators and their spectrum

The space of $p$-adic modular forms is equipped with actions of Hecke operators $T_{\ell }, U_p, V_p$, as was shown by Serre Reference Ser73, §2. Suppose

$$\begin{equation} f(q) = a_0 +a_1q + a_2q^2 + \cdots \tag{49} \end{equation}$$

is a $p$-adic modular form of weight $k$. Then $T_{\ell }f$ for $\ell \neq p$ and $U_pf$ are given by the expressions Equation 5, and

$$\begin{equation} V_pf(q) = \sum _{n \geq 0} a_nq^{np} . \cssId{texmlid36}{\tag{50}} \end{equation}$$

One may wonder what can be said about these operators from the point of view of $p$-adic spectral theory Reference Ser62, and what, if any, is the arithmetic significance of the eigenvalues. Despite its great successes on the Kummer congruences, this is a point where the theory of $p$-adic modular forms starts lacking. Its definition—based solely on $q$-expansions—lacks the rigidity to avoid capturing a tremendous amount of power series in the space of $p$-adic modular forms whose arithmetic significance is less apparent.

One way to see this is as follows. Let $f$ be any $p$-adic modular form, and choose any $\lambda \in p\mathbf{Z}_p$. Then

$$\begin{equation*} f_{\lambda } = (1 - \lambda V_p)^{-1}(1-V_pU_p)f \end{equation*}$$

exists as a $p$-adic modular form, has the same weight as $f$, and satisfies

$$\begin{equation} U_p f_{\lambda } = \lambda f_{\lambda }. \tag{51} \end{equation}$$

This shows that the operator $U_p$ has a spectrum that is far from discrete and contains an overwhelmingly large continuous spectrum. To discern a discrete spectrum, this suggests that one should seek a more rigid framework that excludes these pathological eigenforms. We will find such a framework in the subspace of overconvergent modular forms defined by Katz, which we discuss in §3. As we will see, the action of the operator $U_p$ on overconvergent forms has a very rich and interesting discrete spectrum.
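
The construction of $f_{\lambda }$ only involves formal manipulations of $q$-expansions, so it can be checked on truncated power series. The following sketch takes an arbitrary integer series in place of $f$ (it is not claimed to be a modular form; the point is only the formal identity) and $\lambda = p = 3$. The helpers `U` and `V` are our own names for the operators on coefficient lists.

```python
def U(f, p):
    """U_p on coefficient lists: keep the coefficients a_{pn}."""
    return f[::p]

def V(f, p, length):
    """V_p: substitute q -> q^p, truncated to `length` coefficients."""
    g = [0] * length
    for n, a in enumerate(f):
        if p * n < length:
            g[p * n] = a
    return g

p, lam, length = 3, 3, 81
f = [n * n + 1 for n in range(length)]       # arbitrary test series

# g = (1 - V_p U_p) f: kills every coefficient a_n with p | n
g = [a - b for a, b in zip(f, V(U(f, p), p, length))]

# f_lam = sum_{m >= 0} lam^m V_p^m g; the sum is finite after truncation
f_lam, term = g[:], g[:]
while any(term):
    term = [lam * c for c in V(term, p, length)]
    f_lam = [a + b for a, b in zip(f_lam, term)]

lhs = U(f_lam, p)
rhs = [lam * c for c in f_lam[:len(lhs)]]
print(lhs == rhs)
```

Since any $f$ and any $\lambda$ of positive valuation produce an eigenvector in this way, the check illustrates why the spectrum of $U_p$ on $p$-adic modular forms is so large.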

2.5. Eisenstein series of weight $2$

A celebrated borderline example is given by the Eisenstein series of weight $2$, and it is of great arithmetic importance. Serre shows that, for any prime number $p$, the formal power series

$$\begin{equation*} \mathbf{E}_2(q) = 1 - 24 \sum _{n \geq 1} \Bigl ( \sum _{d \mid n} \ d \Bigr ) \ q^n \end{equation*}$$

is a $p$-adic modular form. Indeed, the series $\mathbf{E}_2^{(p)}(q) \coloneq (1-pV_p)\mathbf{E}_2(q) = \mathbf{E}_2(q) - p\,\mathbf{E}_2(q^p)$ is a classical modular form of weight $2$ and level $\Gamma _0(p)$. Serre shows that any form of level $\Gamma _0(p)$ is a limit of modular forms of level $1$, and therefore defines a $p$-adic modular form in the above sense. It follows that

$$\begin{equation*} \begin{array}{lll} \mathbf{E}_2(q) &=& (1-pV_p)^{-1}\mathbf{E}_2^{(p)}\\&=& \mathbf{E}_2^{(p)} \ + \ \ pV_p\mathbf{E}_2^{(p)} \ + \ \ p^2V_p^2\mathbf{E}_2^{(p)} \ \ + \ \cdots \end{array} \end{equation*}$$

is then also a $p$-adic modular form. Note that this argument is valid for any prime $p$.
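
To see the $p$-adic convergence at work, here is a small sketch with $p = 5$. We take $\mathbf{E}_2(q) = 1 - 24\sum _{n \geq 1} \sigma _1(n)q^n$ and the stabilisation $\mathbf{E}_2(q) - p\,\mathbf{E}_2(q^p)$, and sum the first few terms of the geometric series in the substitution $q \mapsto q^p$. The names `sigma1`, `V`, and the truncation parameters are our own choices. The partial sum after $M$ steps differs from $\mathbf{E}_2$ only in the constant term, by exactly $p^{M+1}$.

```python
def sigma1(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

def V(f, p, length):
    """Substitute q -> q^p, truncated to `length` coefficients."""
    g = [0] * length
    for n, a in enumerate(f):
        if p * n < length:
            g[p * n] = a
    return g

p, length, M = 5, 50, 3
E2 = [1] + [-24 * sigma1(n) for n in range(1, length)]

# stabilisation E2(q) - p * E2(q^p), of level Gamma_0(p)
E2_p = [a - p * b for a, b in zip(E2, V(E2, p, length))]

# partial sum of the geometric series: sum_{m=0}^{M} p^m V^m E2_p
partial, term = E2_p[:], E2_p[:]
for m in range(1, M + 1):
    term = [p * c for c in V(term, p, length)]
    partial = [a + b for a, b in zip(partial, term)]

error = [a - b for a, b in zip(E2, partial)]
print(error[0], any(error[1:]))   # only the constant term is off, by p^(M+1)
```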

In the next section, we introduce the subspace of overconvergent modular forms according to Katz. It was shown by Coleman, Gouvêa, and Jochnowitz Reference CGJ95 that the form $\mathbf{E}_2$ is never overconvergent. This example is nonetheless of tremendous arithmetic importance, and we content ourselves with mentioning its role in the theory of $p$-adic heights on ordinary elliptic curves. A beautiful discussion, along with a precise quantification of its failure to overconverge, can be found in the article by Mazur, Stein, and Tate Reference MST06.

3. Overconvergent modular forms

We encountered the Kubota–Leopoldt $p$-adic zeta function, and explored an idea of Serre that uses the $p$-adic Eisenstein family to construct it. This involved the notion of $p$-adic modular forms, which therefore served a great purpose but otherwise seemed somewhat lacking in finer structural properties, as evidenced by the absence of an interesting discrete spectrum of Hecke operators. We now follow Katz in reinterpreting the viewpoint of Serre geometrically and in identifying much smaller, though still infinite-dimensional, subspaces of the space of $p$-adic modular forms. Though we recall much of what we need, we will assume some familiarity with the algebro-geometric theory of modular forms. Excellent expositions can be found for instance in Katz Reference Kat73, Calegari Reference Cal13, and Loeffler Reference Loe14.

3.1. The Hasse invariant

Suppose $S$ is a scheme over $\mathbf{F}_p$. Then there is an absolute Frobenius morphism

$$\begin{equation} F_{\mathrm{abs}}: \ S \ \longrightarrow \ S \tag{52} \end{equation}$$

given on affine opens by the map on functions $f \mapsto f^p$. If $X/S$ is an $S$-scheme, we define the scheme $X^{(p)} = X \times _S S$ where the fibre product is taken over $S$, viewed as an $S$-scheme via $F_{\mathrm{abs}}$. The relative Frobenius morphism $F = F_{X/S}$ is defined by the following commutative diagram, where the square is Cartesian.

$$\begin{equation} \vcenter{\img[][97pt][81pt][{\renewcommand{\arraystretch}{1} \setlength{\unitlength}{1.0pt} \begin{tikzpicture}[node distance=1.5cm, >=stealth, baseline=(current bounding box.center)] \node(Xp) {$X^{(p)}$}; \node(S1) [below of=Xp] {$S$}; \node(X) [right of=Xp] {$X$}; \node(S2)[right of=S1] {$S$}; \node(Frob) [left of=Xp, node distance=1.6cm, yshift = .6cm] {$X$}; \draw[->] (Xp) to node {} (X); \draw[->] (Xp) to node {} (S1); \draw[->,above] (S1) to node[above] {$\scriptstyle F_{\mathrm{abs}}$} (S2); \draw[->] (X) to node[above] {} (S2); \draw[->, bend left=30] (Frob) to node[above] {$\scriptstyle F_{\mathrm{abs}}$} (X); \draw[->, bend right=30] (Frob) to node {} (S1); \draw[->, dashed] (Frob) to node[below] {$\scriptstyle F_{X/S}$} (Xp); \end{tikzpicture}}]{Images/imgba27825df3d4babef0c1e4f0a3edf519.svg}} \tag{53} \end{equation}$$

Notice that the relative Frobenius is an $S$-linear morphism, whereas the absolute Frobenius is not! Also, the scheme $X^{(p)}$ is hardly a mysterious thing: Suppose $X$ is of finite type over $\mathbf{F}_q/\mathbf{F}_p$. Then $X^{(p)}$ is given by the same equations as $X$ but where all the coefficients are raised to the $p$th power. Note that if $q=p$, then we have $X^{(p)} = X$.

Now suppose that $E/S$ is an elliptic curve. Then the relative Frobenius $F = F_{E/S}$ is an isogeny, and hence has a dual isogeny $V$:

$$\begin{equation} \begin{array}{lllll} F \ : & E & \longrightarrow & E^{(p)} & \qquad \text{“Frobenius ”},\\V \ : & E^{(p)} & \longrightarrow & E & \qquad \text{“Verschiebung ”}. \end{array} \tag{54} \end{equation}$$

Suppose now that $S = \mathrm{Spec}(\overline{\mathbf{F}}_p)$. Then we say

$$\begin{equation} \left\{ \begin{array}{lll} E\ \text{is \textit{ordinary}} & \text{if} & E[p](\overline{\mathbf{F}}_p) \neq 1, \\E\ \text{is \textit{supersingular}} & \text{if} & E[p](\overline{\mathbf{F}}_p) = 1. \end{array}\right. \tag{55} \end{equation}$$

In general, we say $E/S$ is ordinary/supersingular if all its geometric fibres are.

Proposition 3.1.

Suppose $E/S$ is an elliptic curve and $S$ is an $\mathbf{F}_p$-scheme. Then we have

•: $E/S$ is ordinary if and only if $\ V: E^{(p)} \longrightarrow E$ is étale;
•: $E/\overline{\mathbf{F}}_p$ is supersingular, only if $E$ is defined over $\mathbf{F}_{p^2}$.

Proof.

We can factor the multiplication by $p$ map as

$$\begin{equation*} [p] : E \ \stackrel{F}{\longrightarrow } \ E^{(p)} \ \stackrel{V}{\longrightarrow } \ E. \end{equation*}$$

This implies that $V$ is separable if and only if $\mathrm{Ker}(V)(\overline{\mathbf{F}_p}) \neq 1$ on all geometric fibres. Since the kernel of Frobenius only has the trivial geometric point, this is equivalent to $\mathrm{Ker}([p])(\overline{\mathbf{F}_p}) \neq 1$. This proves the first statement. For the second statement, we have that $E/S$ is supersingular if and only if $V$ is inseparable, which means it must factor through Frobenius:

$$\begin{equation*} V \ : \ E^{(p)} \ \stackrel{F}{\longrightarrow } \ E^{(p^2)} \ \longrightarrow \ E. \end{equation*}$$

The latter map must be finite of degree $1$ and hence is an isomorphism. Thus $E$ is defined over $\mathbf{F}_{p^2}$.

■

Finally, we define the Hasse invariant of an elliptic curve $E/ R$ where $R$ is a ring of characteristic $p$. First, choose $\omega \in \mathrm{H}^0(E, \Omega _{E/R}^1)$ to be an $R$-basis, and let $\eta$ be the $R$-basis of $\mathrm{H}^1(E, \mathcal{O}_E)$ defined via Serre duality. The Hasse invariant $A(E,\omega )$ is the element of $R$ defined by

$$\begin{equation} F_{\mathrm{abs}}^*(\eta ) = A(E,\omega ) \cdot \eta . \tag{56} \end{equation}$$

Note that by the previous proposition, $E / \overline{\mathbf{F}}_p$ is ordinary if and only if $A(E,\omega ) \neq 0$ for any choice of $\omega$.
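
For explicit curves the Hasse invariant is easy to compute. The sketch below relies on a classical formula that is not derived in this text and which we take as an assumption: for $p \geq 5$ and $E : y^2 = x^3 + ax + b$ over $\mathbf{F}_p$, the Hasse invariant (for a suitable choice of $\omega$) is the coefficient of $x^{p-1}$ in $(x^3+ax+b)^{(p-1)/2}$. The function names are ours. As a sanity check, we compare with a naive point count, using the congruence $p + 1 - \# E(\mathbf{F}_p) \equiv A \pmod p$.

```python
def hasse_invariant(a, b, p):
    """Coefficient of x^(p-1) in (x^3 + a x + b)^((p-1)/2), modulo p."""
    f = [b % p, a % p, 0, 1]
    g = [1]
    for _ in range((p - 1) // 2):
        new = [0] * (len(g) + 3)
        for i, x in enumerate(g):
            for j, y in enumerate(f):
                new[i + j] = (new[i + j] + x * y) % p
        g = new
    return g[p - 1]

def count_points(a, b, p):
    """Number of points of y^2 = x^3 + a x + b over F_p, including infinity."""
    squares = [0] * p
    for y in range(p):
        squares[y * y % p] += 1
    return 1 + sum(squares[(x**3 + a * x + b) % p] for x in range(p))

# compare the Hasse invariant with the trace of Frobenius modulo p
mismatches = []
for p in (5, 7, 11, 13):
    for a in range(p):
        for b in range(p):
            if (4 * a**3 + 27 * b**2) % p == 0:
                continue          # singular cubic, not an elliptic curve
            trace = p + 1 - count_points(a, b, p)
            if (trace - hasse_invariant(a, b, p)) % p != 0:
                mismatches.append((p, a, b))
print(mismatches)

# y^2 = x^3 + 1 is supersingular at p = 5, whereas y^2 = x^3 + x is ordinary
print(hasse_invariant(0, 1, 5), hasse_invariant(1, 0, 5))
```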

3.2. Algebraic modular forms

We will now see how to interpret the Hasse invariant as an algebraic modular form over $\mathbf{F}_p$ of weight $p-1$. Over the field of complex numbers $\mathbf{C}$, we are used to thinking of modular forms in terms of their $q$-expansions, which directly describe them as a function on the upper half-plane in the variable $q$. Over other base fields, one adopts a more algebraic viewpoint, where a modular form becomes a section of a line bundle $\omega ^{\otimes k}$ over a moduli space of elliptic curves. The algebraically defined notion of $q$-expansion, defined by its evaluation on the Tate curve, no longer directly describes the form as a “function” on classes of elliptic curves. This is quite striking for the Hasse invariant, which vanishes at supersingular points, yet has $q$-expansion given by $1$. We now discuss these notions, briefly recalling the notion of algebraic modular forms, referring to Katz Reference Kat73, Chapter 1 for more details.

A weakly holomorphic modular form of weight $k \in \mathbf{Z}$ over a ring $A$ is a rule which assigns to any isomorphism class of pairs $(E/R, \, \omega )$, where

•: $E/R$ is an elliptic curve over an $A$-algebra $R$,
•: $\omega$ is an $R$-basis for $\mathrm{H}^0(E, \Omega _{E/R}^1)$,

an element $f(E/R,\, \omega ) \in R$ such that the following two properties are satisfied:

$$\begin{equation*} \begin{array}{llll} f\left( (E/R,\omega )\otimes _{\phi } R' \right) &=& \phi \left( f(E,\omega ) \right) & \text{for all} \ \ \phi : R \to R' \ \text{of }A\text{-algebras,} \\f(E, \lambda \omega ) &=& \lambda ^{-k}f(E,\omega ) & \text{for all} \ \ \lambda \in R^{\times } . \end{array} \end{equation*}$$

The $q$-expansion of a weakly holomorphic modular form $f$ is defined as

$$\begin{equation} f(q) \coloneq f\left( (\mathrm{Tate}(q)_{\mathbf{Z}\llparenthesis q \rrparenthesis }, \omega _{\mathrm{can}})\otimes R \right) \in R \llparenthesis q \rrparenthesis , \tag{57} \end{equation}$$

where $\mathrm{Tate}(q)$ is the Tate elliptic curve over $\mathbf{Z}\llparenthesis q \rrparenthesis$ defined by

$$\begin{equation} y^2 + xy = x^3 + B_qx + C_q, \quad \omega _{\mathrm{can}} = \frac{dx}{2y+x}, \tag{58} \end{equation}$$

with coefficients defined by the explicit $q$-series in $\mathbf{Z}\lBrack q \rBrack$,

$$\begin{equation} \begin{array}{lll} B_q &=& \sum _{n \geq 1} -5\sigma _3(n) q^n, \\[6.0pt] C_q &=& \sum _{n \geq 1} \frac{-5 \sigma _3(n) - 7 \sigma _5(n)}{12} q^n. \end{array} \tag{59} \end{equation}$$
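
These series can be explored directly. The sketch below (helper names are our own) checks that the coefficients of $C_q$ are indeed integers, and evaluates the standard Weierstrass invariants of the curve $y^2 + xy = x^3 + B_qx + C_q$, for which $c_4 = 1 - 48B_q$ and the discriminant is $-C_q + B_q^2 - 64B_q^3 - 432C_q^2 + 72B_qC_q$. We compare these with the $q$-expansions of $\mathbf{E}_4$ and of $\Delta = q\prod _{n \geq 1}(1-q^n)^{24}$.

```python
def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d**k for d in range(1, n + 1) if n % d == 0)

def mul(a, b, N):
    """Product of two power series, truncated to N coefficients."""
    c = [0] * N
    for i, x in enumerate(a[:N]):
        if x:
            for j, y in enumerate(b[:N - i]):
                c[i + j] += x * y
    return c

def delta(N):
    """q * prod_{n >= 1} (1 - q^n)^24, truncated to N coefficients."""
    s = [0] * N
    s[1] = 1
    for n in range(1, N):
        for _ in range(24):
            for i in range(N - 1, n - 1, -1):
                s[i] -= s[i - n]
    return s

N = 10
B = [0] + [-5 * sigma(3, n) for n in range(1, N)]
C_num = [0] + [-5 * sigma(3, n) - 7 * sigma(5, n) for n in range(1, N)]
integral = all(c % 12 == 0 for c in C_num)
C = [c // 12 for c in C_num]

B2, C2, BC = mul(B, B, N), mul(C, C, N), mul(B, C, N)
B3 = mul(B2, B, N)
disc = [-C[n] + B2[n] - 64 * B3[n] - 432 * C2[n] + 72 * BC[n] for n in range(N)]
c4 = [1] + [-48 * b for b in B[1:]]
E4 = [1] + [240 * sigma(3, n) for n in range(1, N)]

print(integral, c4 == E4, disc == delta(N))
print(disc[:5])
```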

We say a weakly holomorphic modular form is an algebraic (or holomorphic) modular form if its $q$-expansion, which a priori is an element of $R \llparenthesis q \rrparenthesis$, is in fact in $R \lBrack q \rBrack$.

Remark.

The Tate curve arose first in the work of Tate Reference Tat95 on $p$-adic uniformisation of elliptic curves. Over the complex numbers $\mathbf{C}$, any elliptic curve $E$ is isomorphic to $\mathbf{C} / \langle 1, \tau \rangle$ where $\langle 1, \tau \rangle$ denotes the lattice spanned by $1,\tau$ in $\mathbf{C}$. By exponentiation, this quotient can also be described as $\mathbf{C}^{\times } \!\! / q^{\mathbf{Z}}$ where as usual $q = \exp (2 \pi i \tau )$, and the isomorphism with $E$ involves explicit complex analytic functions which go back at least to Weierstraß. Tate showed that this admits a $p$-adic analogue; more precisely that for every $E$ whose $j$-invariant is not $p$-adically integral, there exists an isomorphism with $\mathrm{Tate}(q)$ for some $|q|<1$, the latter being isomorphic to $\mathbf{C}_p^{\times }\!/q^{\mathbf{Z}}$ by explicit $p$-adic analytic power series.

We now see from the definition of the Hasse invariant in the previous section that it is naturally an algebraic modular form of weight $p-1$ over $\mathbf{F}_p$. Indeed, it is a rule that attaches to $(E/R, \omega )$ an element of $R$ whose functoriality is clear by definition, and moreover for any $\lambda \in R^{\times }$ we see that

$$\begin{eqnarray*} A(E, \lambda \omega ) \cdot \lambda ^{-1}\eta &=& F_{\mathrm{abs}}^*(\lambda ^{-1}\eta ) \\ &=& \lambda ^{-p} F_{\mathrm{abs}}^*(\eta ) \\ &=& \lambda ^{1-p} \cdot A(E,\omega ) \cdot \lambda ^{-1}\eta . \end{eqnarray*}$$

We conclude that the Hasse invariant $A$ defines a weakly holomorphic modular form of weight $p-1$ and level $1$. It has the following important properties (which we will freely use in what follows):

•: The $q$-expansion of the Hasse invariant was computed in Reference Kat73, Reference KM85 and is given by $A(q) = 1$. The proof is a beautiful argument using the Cartier operator.
•: We already know that for $E/k$ over $k = \overline{\mathbf{F}}_p$, the Hasse invariant vanishes if and only if $E$ is supersingular. In fact, it has simple zeroes, in the sense that if $R$ is a local Artinian $k$-algebra and $E/R$ is such that $V: E^{(p)} \longrightarrow E$ induces the zero map on tangent spaces, then it must be true that there is a supersingular elliptic curve $E_0 /k$ such that$$\begin{equation} E_0 \times _k R \simeq E. \tag{60} \end{equation}$$

3.3. Overconvergent modular forms

We now come to the main spaces of interest, which were defined by Katz Reference Kat73. They revolve around the properties of the Hasse invariant $A$ over $\mathbf{F}_p$, but are spaces of forms over a $p$-adic field, and therefore involve liftings $\widetilde{A}$ of the Hasse invariant to characteristic $0$.

Remark.

We now come to a point where rigid geometry is most naturally used. The reader unfamiliar with this framework should not be alarmed, as we take a very pedestrian approach that should be digestible if one is willing to take a few things on faith. Moreover, the next section will show these notions in action in an extended example, and the uninitiated reader may prefer to skip ahead. The foundations of rigid geometry are beautifully summarised in Conrad Reference Con08.

Suppose⁠Footnote⁴ $N\geq 5$ and $p\nmid N$ prime. We let $\mathcal{X}/\mathbf{Z}_p$ be the modular curve over $\mathbf{Z}_p$ which classifies generalised elliptic curves with $\Gamma _1(N)$-level structure, universal curve $\pi : \mathcal{E} \longrightarrow \mathcal{X}$, and closed subscheme of cusps $\mathcal{I}_C$, and we denote its generic and special fibres by $X$ and $\mathcal{X}_s,$ respectively. We define the line bundle

⁴

For simplicity, we will choose some auxiliary level structure to rigidify the moduli problem of elliptic curves and work on modular curves. If desired, this can be avoided by working on the moduli stack.

$$\begin{equation} \omega \coloneq \pi _{*}\Omega ^1_{\mathcal{E}^{\mathrm{sm}}\!\!/\!\mathcal{X}}(\log \pi ^{-1}\mathcal{I}_C). \tag{61} \end{equation}$$

The Hasse invariant is the unique⁠Footnote⁵ element of $\mathrm{H}^0(\mathcal{X}_{s},\omega ^{\otimes p-1})$ with $q$-expansion $1$. Let $\mathbf{C}_p$ be the completion of the algebraic closure of $\mathbf{Q}_p$. Since the relative curve $\mathcal{X}/\mathbf{Z}_p$ is proper, every $\mathbf{C}_p$-point extends uniquely to an $\mathcal{O}_{\mathbf{C}_p}$-point, and we obtain a reduction map

⁵

The $q$-expansion principle states that a modular form is uniquely determined by its weight and its $q$-expansion.

$$\begin{equation} \mathrm{red} \ : \ \mathcal{X}(\mathbf{C}_p) \longrightarrow \mathcal{X}_s(\overline{\mathbf{F}}_p). \tag{62} \end{equation}$$

This map provides our first encounter with analytic geometry over the $p$-adic numbers. The inverse image $\mathrm{red}^{-1}(x)$ of a closed point of the special fibre is isomorphic to a rigid analytic open disk $D = \{ x \in \mathbf{C}_p \, : \, |x| < 1 \}$. We saw previously that the vanishing locus of the Hasse invariant is precisely the supersingular locus of $\mathcal{X}_s$, which consists of a finite set of closed points. The ordinary locus $X^{{\mathrm{ord}}}$ is the affinoid open whose set of $\mathbf{C}_p$-points corresponds to elliptic curves with ordinary reduction, which is therefore the complement of a finite number of rigid analytic open disks, indexed by the supersingular points; see Figure 1.

At this point, we are ready to give a geometric reinterpretation of the spaces of $p$-adic modular forms introduced by Serre, which were discussed in §2. The following theorem is due to Katz Reference Kat73.

Theorem 3.2 (Katz).

The space of $p$-adic modular forms of weight $k$ is isomorphic as a Hecke module to

$$\begin{equation} \mathrm{H}^0(X^{{\mathrm{ord}}}, \omega ^{\otimes k}). \tag{63} \end{equation}$$

In light of this theorem, we see in particular that if $\widetilde{A}$ is a lift of the Hasse invariant, which is any modular form of weight $p-1$ over $\mathbf{Z}_p$ whose $q$-expansion is congruent to $1$ modulo $p$, then it must be invertible as a $p$-adic modular form. This may be seen explicitly also in Serre’s language, since we have

$$\begin{equation} v_p\left(\widetilde{A}(q)^{p^n-1} - \widetilde{A}(q)^{-1}\right) \ \ \longrightarrow \ \ \infty , \tag{64} \end{equation}$$

so that the formal $q$-expansion $\widetilde{A}(q)^{-1}$ is a $p$-adic modular form in the sense of Serre.
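
As an illustration, take $p = 5$ and $\widetilde{A} = \mathbf{E}_4$, whose $q$-expansion is congruent to $1$ modulo $5$. The sketch below (with our own helper names, working with $q$-expansions truncated to a fixed number of terms) computes the minimal $5$-adic valuation of the coefficients of $\mathbf{E}_4^{5^n-1} - \mathbf{E}_4^{-1}$ for small $n$.

```python
def sigma3(n):
    return sum(d**3 for d in range(1, n + 1) if n % d == 0)

def mul(a, b, N):
    """Product of two power series, truncated to N coefficients."""
    c = [0] * N
    for i, x in enumerate(a[:N]):
        if x:
            for j, y in enumerate(b[:N - i]):
                c[i + j] += x * y
    return c

def power(a, e, N):
    """a^e by repeated squaring, truncated to N coefficients."""
    result, base = [1] + [0] * (N - 1), a[:N]
    while e:
        if e & 1:
            result = mul(result, base, N)
        base = mul(base, base, N)
        e >>= 1
    return result

def invert(a, N):
    """Inverse of an integer power series with a[0] == 1."""
    b = [1] + [0] * (N - 1)
    for n in range(1, N):
        b[n] = -sum(a[k] * b[n - k] for k in range(1, n + 1))
    return b

def vp(c, p):
    """p-adic valuation of a nonzero integer."""
    v = 0
    while c % p == 0:
        c //= p
        v += 1
    return v

p, N = 5, 8
E4 = [1] + [240 * sigma3(n) for n in range(1, N)]
E4_inv = invert(E4, N)

def min_valuation(n):
    """Smallest valuation among the coefficients of E4^(p^n - 1) - E4^(-1)."""
    diff = [x - y for x, y in zip(power(E4, p**n - 1, N), E4_inv)]
    return min(vp(c, p) for c in diff if c != 0)

print([min_valuation(n) for n in (1, 2, 3)])
```

The valuations increase by one at each step, in line with the congruence $\widetilde{A}^{p^n} \equiv 1 \pmod{p^{n+1}}$.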

Remark.

When we do explicit calculations later on, we will see that the fact that the $q$-expansion of the Hasse invariant is $1$ usually allows us to choose an Eisenstein series $\mathbf{E}_{p-1}$ as a lift of the Hasse invariant. Indeed, when $p \geq 5,$ this is a modular form of weight $p-1$ whose $q$-expansion is congruent to $1$ modulo $p$.

We saw previously that the space of $p$-adic modular forms is too large to have nice spectral properties, prompting Katz to consider subspaces of sections that extend⁠Footnote⁶ to affinoids strictly containing $X^{{\mathrm{ord}}}$. More precisely, let $0 \leq r \leq 1$, and define $X^{{\mathrm{ord}}} \subset X_r \subset X^{{\mathrm{rig}}}$ by

⁶

This definition may seem obscure to the uninitiated, but already at the time when Katz introduced this, $p$-adic overconvergence represented a dominant theme in the school of $p$-adic analysis surrounding Bernard Dwork and his disciples, and the idea continues to be hugely influential to this day.

$$\begin{equation} X_r(\mathbf{C}_p) \coloneq \{ x \ \in \ X(\mathbf{C}_p) \ : \ v_p(\widetilde{A}_x) \leq r \}, \tag{65} \end{equation}$$

where $\widetilde{A}_x$ is a local lift of the Hasse invariant $A$ at $x$. Note we do not require a global lift of the Hasse invariant to exist, which may fail when $p =2, 3$. We define the space of $r$-overconvergent modular forms of integer weight $k$ on $\Gamma _1(N)$ to be

$$\begin{equation} M^{\dagger }_k(r) \coloneq \mathrm{H}^0(X_r,\omega ^{\otimes k}). \tag{66} \end{equation}$$

These spaces come with an action of Hecke operators $T_{\ell }$ for $\ell \nmid Np$ and $U_{\ell }$ for $\ell \mid N$, defined by restricting the Hecke correspondences on $\mathcal{X}$. They have the usual effect on $q$-expansions.

In addition, the operators $U_p$ and $V_p$ defined on $p$-adic modular forms may be defined geometrically, and they preserve the subspace of overconvergent modular forms. More precisely, they are defined for every $r<1/(p+1)$ and have the following effect on the rate of overconvergence:

$$\begin{equation} \begin{array}{lllll} U_p & : & M^{\dagger }_k(r) & \longrightarrow & M^{\dagger }_k(pr), \\[2.0pt] V_p & : & M^{\dagger }_k(pr) & \longrightarrow & M^{\dagger }_k(r). \end{array} \cssId{texmlid19}{\tag{67}} \end{equation}$$

In particular, the operator $U_p$ improves the rate of overconvergence. The reason for the existence of the operators $U_p$ and $V_p$ is the canonical subgroup section $s$ of the natural forgetful map of modular curves, which exists for any $r < p/(p+1)$:

$$\begin{equation} \vcenter{\img[][151pt][39pt][{\renewcommand{\arraystretch}{1} \setlength{\unitlength}{1.0pt} \begin{tikzpicture}[node distance=3.5cm, >=stealth, baseline=(current bounding box.center)] \node(Xp) {$X(\Gamma_1(N) \cap\Gamma_0(p))^{{\mathrm{rig}}}$}; \node(X) [right of=Xp] {$\ \ X^{{\mathrm{rig}}}$}; \node(inc) [below of=X, node distance=.5cm] {\!\!\!\! \rotatebox{90}{$\subset$}}; \node(Xr) [below of=X, node distance=1cm] {$X_r\;.$}; \draw[->] (Xp) to node {} (X); \draw[->, dashed, bend left=15] (Xr) to node[below] { $\scriptstyle s$} (Xp); \end{tikzpicture}}]{Images/imgf0bc8c3c5423112ff2c705a591bc4c4f.svg}} \tag{68} \end{equation}$$

This yields two equivalent ways to view spaces of overconvergent modular forms:

•: as sections on affinoid opens of $X$, with no level at $p$ (the tame viewpoint);
•: as sections on affinoid opens of the modular curve obtained from $X$ by adding additional $\Gamma _0(p)$ level structure (the canonical subgroup viewpoint).

For theoretical questions the latter is frequently more convenient. For instance, it forms the natural setting for questions of analytic continuation and Coleman’s classicality results discussed in §3.7. On the other hand, for computational purposes the former often has advantages, since it allows one to compute with auxiliary classical spaces of modular forms, as we will see in §4.1.

3.4. Interlude: Extended example

Let us explore these abstract definitions in a particular case to get a feeling for the various objects involved. Consider the case where $p=2$ and $k=0$, in level $1$. In this case we can be very explicit about the spaces of $p$-adic and $r$-overconvergent modular forms, both from the tame viewpoint (in level $1$) and via the canonical subgroup section (on $X_0(2)$). This example is centred around the properties of the Klein $j$-invariant,

$$\begin{equation} j(q) = \frac{1}{q} + 744 + 196884\, q + 21493760\, q^2 + 864299970 \, q^3 + \cdots , \tag{69} \end{equation}$$

which is a level $1$ modular form of weight $0$ with a simple pole at the cusp.

3.4.1. The tame viewpoint

Consider the moduli stack $X$ of elliptic curves. Of the four values in $\mathbf{F}_4$ for the $j$-invariant, only $j=0$ is supersingular, so that its special fibre at $p=2$ has a unique supersingular point corresponding to the vanishing locus of $j$. It follows that the ordinary locus on $X$ is described by $| j^{-1} | \leq 1$, and hence the space of $2$-adic modular forms of weight $0$ is isomorphic to

$$\begin{equation} \mathbf{C}_2 \langle j^{-1} \rangle = \left\{a_0 + a_1j^{-1} + a_2j^{-2} + \cdots \ \mid \ a_n \to 0 \right\}. \tag{70} \end{equation}$$

For any $r$, the space $M_0^{\dagger }(r)$ defines a Banach space contained inside this Tate algebra, which we can explicitly identify through growth conditions on the coefficients $a_n$. Precisely, we use the observation that

$$\begin{equation} j = \frac{\mathbf{E}_4^{3}}{\Delta } \tag{71} \end{equation}$$

and $\mathbf{E}_4 = 1 + 240q + \cdots$ is the normalised Eisenstein series of weight $4$, which is a lift of the fourth power of the Hasse invariant $A^4$. In particular, we find that on the supersingular disk (where $\Delta$ is invertible, and hence $v_2(\Delta )=0$), we have that

$$\begin{equation} v_2(A) \leq r \quad \iff \quad v_2(j) \leq 12r, \tag{72} \end{equation}$$

and as a consequence, we get that the subspace of $r$-overconvergent forms is given by

$$\begin{equation} M_0^{\dagger }(r) = \left\{a_0 + a_1j^{-1} + a_2j^{-2} + \cdots \ : \ |a_n|p^{12nr} \to 0 \right\}. \cssId{texmlid17}{\tag{73}} \end{equation}$$

Finally, let us compute some Hecke operators, and see whether the obtained results make sense with what is said above. First, note that we can compute very rapidly the $q$-expansion of $j^{-1}$ (most serious computer algebra packages like Magma, PARI/GP, or Sage will already have a function implemented). Given any $2$-adic modular form of weight $0$, we can then compute its $j^{-1}$-expansion very rapidly by the simple observation that $j^{-1}$ vanishes to order $1$ at the cusp infinity, and hence we can inductively subtract powers of $j^{-1}$ until we are left with zero. Carrying out this procedure in Magma Reference BCP97, we obtain that

$$\begin{equation*} \begin{array}{lll} U_2j^{-1} &=& -744\ j^{-1} \\& & -140914688\ j^{-2} \\& & -16324041375744\ j^{-3} \\& & -1528926232501026816 \ j^{-4} \\& & + \cdots , \\[3.0pt] T_3j^{-1} &=& 356652\ j^{-1} \\& & -16114360320000 \ j^{-2} \\& & +1298216343568384000000/3 \ j^{-3} \\& & + \cdots , \\[3.0pt] T_5j^{-1} &=& 49336682190\ j^{-1} \\& & -122566701099729715200000\ j^{-2} \\& & +177278377115100363578123747328000000\ j^{-3} \\& & +\cdots , \end{array} \end{equation*}$$

where we calculated in reality hundreds of terms, which look rather unappetising. Things become very interesting when we look at the $2$-adic valuations of the coefficients $a_1, a_2, a_3, \ldots$ of $U_2j^{-1}$ and $T_{\ell }j^{-1}$ tabulated above, which give us the following sequences:

$$\begin{equation} \begin{array}{lll} U_2j^{-1} & : v_2(a_n) = 3, 12, 20, 28, 35, 46, 52, 60, 67, 76, 86, 94, \ldots , \\T_3j^{-1} & : v_2(a_n) = 2, 16, 32, 45, 60, 79, 91, 105, 120, 136, 154, 165, \ldots , \\T_5j^{-1} & : v_2(a_n) = 1, 18, 33, 47, 61, 80, 92, 107, 121, 138, 155, 167, \ldots , \end{array} \tag{74} \end{equation}$$

We see very clearly that the latter two sequences grow roughly at the same rate, whereas the first one grows significantly more slowly! In fact, if we plot these three sequences in red, green, and blue, respectively, for the first two hundred terms, we obtain Figure 2. They all look like linear functions! The green and blue plots are virtually indistinguishable at this scale and look roughly like a linear function of slope $15$. On the other hand, at this scale the red plot looks roughly like a linear function of slope $8$. This is precisely what we expected from the general theory, since $j^{-1}$ is $r$-overconvergent for any $r$ (indeed, it converges on the entire modular curve $X$ except for a simple pole at the point $j = 0$!), and its image under the $U_2$-operator is therefore only guaranteed to be $r$-overconvergent for any $r < p/(p+1) = 2/3$. With respect to the identification Equation 73, this shows that the valuation of the coefficients should grow at least like a linear function of slope $8 = (2/3)\cdot 12$.
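
The procedure described in this subsection is short enough to write out in full. The following is a sketch in plain Python rather than Magma (the helper names are ours): it computes the $q$-expansion of $j^{-1} = \Delta /\mathbf{E}_4^3$, applies $U_2$ by keeping the coefficients $a_{2n}$, and then subtracts successive powers of $j^{-1}$ to read off the $j^{-1}$-expansion and the $2$-adic valuations of its coefficients, which can be compared with the values tabulated above.

```python
def sigma3(n):
    return sum(d**3 for d in range(1, n + 1) if n % d == 0)

def mul(a, b, N):
    """Product of two power series, truncated to N coefficients."""
    c = [0] * N
    for i, x in enumerate(a[:N]):
        if x:
            for j, y in enumerate(b[:N - i]):
                c[i + j] += x * y
    return c

def invert(a, N):
    """Inverse of an integer power series with a[0] == 1."""
    b = [1] + [0] * (N - 1)
    for n in range(1, N):
        b[n] = -sum(a[k] * b[n - k] for k in range(1, n + 1))
    return b

def delta(N):
    """q * prod_{n >= 1} (1 - q^n)^24, truncated to N coefficients."""
    s = [0] * N
    s[1] = 1
    for n in range(1, N):
        for _ in range(24):
            for i in range(N - 1, n - 1, -1):
                s[i] -= s[i - n]
    return s

def v2(c):
    """2-adic valuation of a nonzero integer."""
    v = 0
    while c % 2 == 0:
        c //= 2
        v += 1
    return v

M = 6                  # number of coefficients of U_2(1/j) we want
N = 2 * M + 1          # q-adic precision needed before applying U_2
E4 = [1] + [240 * sigma3(n) for n in range(1, N)]
E4_cubed = mul(mul(E4, E4, N), E4, N)
t = mul(delta(N), invert(E4_cubed, N), N)    # t = 1/j = q - 744 q^2 + ...
rem = t[::2]                                 # q-expansion of U_2(1/j)

# t^n = q^n + ..., so the coefficient of t^n can be read off at q^n
coeffs, t_low = [], t[:M + 1]
t_power = t_low[:]
for n in range(1, M + 1):
    c = rem[n]
    coeffs.append(c)
    rem = [r - c * x for r, x in zip(rem, t_power)]
    t_power = mul(t_power, t_low, M + 1)

print(coeffs[:2])
print([v2(c) for c in coeffs])
```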

3.4.2. The canonical subgroup viewpoint

Even though we can compute things to our heart’s desire, it is hard to get any more specific information in the tame description (i.e., on $X = X_0(1)$). Following Buzzard and Calegari Reference BC05, we will now see that we can get a lot of mileage from working on $X_0(2)$ instead; we know we can do this by the theory of the canonical subgroup. Define the Hauptmodul

$$\begin{equation} h = \Delta (2z)/\Delta (z) = q \prod _{n \geq 1} (1+q^n)^{24}, \cssId{texmlid20}{\tag{75}} \end{equation}$$

which is a meromorphic function on $X_0(2)$ with a simple zero at the cusp $\infty$ and a pole at the cusp $0$. It is related to the $j$-function by

$$\begin{equation} \frac{h}{(1+2^8h)^3} = j^{-1}. \tag{76} \end{equation}$$
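
This relation can be confirmed on $q$-expansions. The sketch below (helper names are ours) computes $h$ from its product expansion and $j^{-1}$ as $\Delta /\mathbf{E}_4^3$, and compares $h$ with $j^{-1}(1+2^8h)^3$ to a fixed precision.

```python
def sigma3(n):
    return sum(d**3 for d in range(1, n + 1) if n % d == 0)

def mul(a, b, N):
    """Product of two power series, truncated to N coefficients."""
    c = [0] * N
    for i, x in enumerate(a[:N]):
        if x:
            for j, y in enumerate(b[:N - i]):
                c[i + j] += x * y
    return c

def invert(a, N):
    """Inverse of an integer power series with a[0] == 1."""
    b = [1] + [0] * (N - 1)
    for n in range(1, N):
        b[n] = -sum(a[k] * b[n - k] for k in range(1, n + 1))
    return b

def eta_product(sign, N):
    """q * prod_{n >= 1} (1 + sign * q^n)^24, truncated to N coefficients."""
    s = [0] * N
    s[1] = 1
    for n in range(1, N):
        for _ in range(24):
            for i in range(N - 1, n - 1, -1):
                s[i] += sign * s[i - n]
    return s

N = 12
h = eta_product(+1, N)            # Hauptmodul h = q + 24 q^2 + ...
Delta = eta_product(-1, N)        # Delta = q - 24 q^2 + ...
E4 = [1] + [240 * sigma3(n) for n in range(1, N)]
j_inv = mul(Delta, invert(mul(mul(E4, E4, N), E4, N), N), N)

u = [1] + [256 * c for c in h[1:]]            # 1 + 2^8 h
rhs = mul(j_inv, mul(mul(u, u, N), u, N), N)  # j^{-1} (1 + 2^8 h)^3
print(h[:5], rhs == h)
```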

Using a Newton polygon argument, we see that we can find a canonical section of the forgetful map whenever $v_p(j^{-1}) > -8$ exactly as predicted by the theory of canonical subgroups. Note also that in this case, we see that this section does not extend to any larger region, so the result was optimal! This means that we get an alternative description for Equation 73 of the form

$$\begin{equation} M_0^{\dagger }(r) = \left\{a_0 + a_1h + a_2h^2 + \cdots \ : \ |a_n|p^{12nr} \to 0 \right\}. \cssId{texmlid18}{\tag{77}} \end{equation}$$

The advantage is the following: The Hecke operators are defined as correspondences on $X_0(2)$, and hence we know that $U_2(h)$ and $T_{\ell }(h)$ are polynomials in $h$! This is in stark contrast with the tame situation, where we got a rather mysterious set of power series, which we could compute to any accuracy, but never exactly. In contrast, on $X_0(2)$ we can do the computation exactly, and we obtain

$$\begin{equation*} \begin{array}{lll} U_2(h) &=& 24h + 2048h^2,\\T_3(h) &=& 300h + 98304h^2 + (16777216/3)\,h^3, \\T_5(h) &=& 18126h + 40239104h^2 + 14696841216h^3\\&& +\ 1649267441664h^4 + (281474976710656/5)\,h^5. \end{array} \end{equation*}$$
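
The first two identities are easily checked on $q$-expansions. In the sketch below we assume the usual weight $0$ formula $T_{\ell }f = \sum _n (a_{\ell n} + \ell ^{-1}a_{n/\ell })q^n$, with the convention $a_{n/\ell } = 0$ when $\ell \nmid n$; the helper names are our own, and exact rational arithmetic is used because of the denominator $3$.

```python
from fractions import Fraction

def mul(a, b, N):
    """Product of two power series, truncated to N coefficients."""
    c = [0] * N
    for i, x in enumerate(a[:N]):
        if x:
            for j, y in enumerate(b[:N - i]):
                c[i + j] += x * y
    return c

def h_series(N):
    """h = q * prod_{n >= 1} (1 + q^n)^24, truncated to N coefficients."""
    s = [0] * N
    s[1] = 1
    for n in range(1, N):
        for _ in range(24):
            for i in range(N - 1, n - 1, -1):
                s[i] += s[i - n]
    return s

N = 8                       # precision of the comparison
h = h_series(3 * N)         # T_3 needs coefficients up to q^(3N - 3)
h2 = mul(h, h, 3 * N)
h3 = mul(h2, h, 3 * N)

# U_2(h) against 24 h + 2048 h^2
U2h = [h[2 * n] for n in range(N)]
U2_rhs = [24 * a + 2048 * b for a, b in zip(h[:N], h2[:N])]

# T_3(h) in weight 0 against 300 h + 98304 h^2 + (2^24 / 3) h^3
T3h = [Fraction(h[3 * n]) + (Fraction(h[n // 3], 3) if n % 3 == 0 else 0)
       for n in range(N)]
T3_rhs = [300 * a + 98304 * b + Fraction(16777216, 3) * c
          for a, b, c in zip(h[:N], h2[:N], h3[:N])]

print(U2h == U2_rhs, T3h == T3_rhs)
```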

Together with Equation 77, this can be seen as a complete description of the $U_2$-module $M_0^{\dagger }(r)$. This is what is used by Buzzard and Calegari Reference BC04 to determine the valuations of all the eigenvalues of $U_2$ on this space.

3.5. Spectral theory of $U_p$

We now discuss an important part of the subject, which is the spectral theory of the Hecke operator $U_p$, acting on the $\mathbf{C}_p$-Banach space of $r$-overconvergent forms.

We begin by defining a norm $\|\cdot \|_r$ on $M^{\dagger }_k(r)$. Pick a point $x \in X_r$, let $K$ be a finite extension of the residue field of $x$, and let $\mathrm{Spec}(K) \rightarrow \mathcal{X}_{\mathbf{Q}_p}$ be a point whose image corresponds to $x$. The properness of $\mathcal{X}$ implies that this extends uniquely to a point $\varphi : \mathrm{Spec}(\mathcal{O}_K) \rightarrow \mathcal{X}$. Now let $f \in M^{\dagger }_k(r)$. Then $\varphi ^*f = a_fs$ for some section $s$ generating the trivial line bundle $\varphi ^* \omega ^{\otimes k}$ and some $a_f \in K$. We set

$$\begin{equation} |f(x)| \ \coloneq \ |a_f|, \tag{78} \end{equation}$$

which is independent of the choice of $s$. The norm

$$\begin{equation} \|f \|_r \coloneq \mathrm{sup}\{ |f(x)| : x \in X_r \} \tag{79} \end{equation}$$

makes $M^{\dagger }_k(r)$ into a $p$-adic Banach space. This induces the structure of a $p$-adic Fréchet space on

$$\begin{equation} M^{\dagger }_k \coloneq \varinjlim _{r>0} M^{\dagger }_k(r), \tag{80} \end{equation}$$

which we call the space of overconvergent modular forms. The Banach spaces $M^{\dagger }_k(r)$ are infinite dimensional, and there is a priori no meaningful way to talk about the spectrum of an operator unless we know more.

Suppose $T$ is a continuous bounded operator on a separable $\mathbf{C}_p$-Banach space $B$. We say that $T$ is compact if it is the limit of operators of finite rank. Equivalently, $T$ is compact if and only if the image of the unit ball is relatively compact. There is a well-developed spectral theory for compact operators (see Reference Dwo62, Reference Ser62, Reference Col97b), which has the following pleasant consequences.

•

$T$ has a discrete spectrum of nonzero eigenvalues$$\begin{equation} | \lambda _1 | \geq |\lambda _2| \geq \cdots , \tag{81} \end{equation}$$

where $|\lambda _i| \to 0$ as $i \to \infty$, whose inverses are the roots of a well-defined characteristic series$$\begin{eqnarray*} P(t) &=& ``\mathrm{det}(1 - Tt)''\\ &=& a_0 + a_1t + a_2t^2 + \cdots , \qquad \text{where }\quad a_i \to 0\text{ as }i \to \infty . \end{eqnarray*}$$

•

For every $v \in B$ there are constants $c_i$ and generalised eigenvectors $v_i$ with eigenvalue $\lambda _i$ such that for any $\varepsilon >0$ we have (asymptotically in $n$) that$$\begin{equation} \varepsilon ^{-n} \left\| T^nv - T^n\sum _{|\lambda _i| \geq \varepsilon } c_iv_i \right\| \ \longrightarrow \ 0. \tag{82} \end{equation}$$

The constants $c_i$ are often called the coefficients of the asymptotic expansion of $v$.

We established that the operator $U_p$ exhibits a contractive nature, and improves overconvergence as described by Equation 67. This implies that $U_p$ is compact, and hence possesses a well-defined characteristic series. Here is one concrete way to think about this series (and indeed to compute it in examples!), as explained by Serre Reference Ser62 and Coleman Reference Col97b, Theorem A2.1: Suppose we have an orthonormal basis

$$\begin{equation} \{ f_1, f_2, f_3, \ldots \} \qquad \text{for} \quad M_k^{\dagger }(r). \tag{83} \end{equation}$$

Then we obtain an infinite matrix representation of $U_p$. In the example above, where $p=2$ and $k=0$, we already noted that we have an algorithm to compute this matrix exactly, or at least any finite submatrix of it. To see what compactness really means in practice, we compute the first $10 \times 10$ submatrix with respect to the basis $f_i = (2^8h)^i$ of the cuspidal subspace, and we look at the following $2$-adic valuations of its entries.

$$\begin{equation} v_2(U_2(i,j))_{i,j} = \begin{pmatrix} 3 & 8 & & & & & & & & \\ 3 & 7 & 11 & 16 & & & & & & \\ & 8 & 12 & 17 & 19 & 24 & & & & \\ & 7 & 11 & 15 & 21 & 23 & 27 & 32 & & \\ & & 11 & 19 & 20 & 25 & 27 & 35 & 35 & \cdots \\ & & 11 & 16 & 20 & 24 & 27 & 33 & 35 & \\ & & & 17 & 19 & 24 & 29 & 34 & 35 & \\ & & & 15 & 20 & 23 & 27 & 31 & 38 & \\ & & & & 19 & 24 & 27 & 37 & 36 & \\ & & & & \vdots & & & & & \ddots \\ \end{pmatrix} \tag{84} \end{equation}$$

Here, we omitted the entries of $U_2$ that were equal to zero. The compactness of $U_p$ in orthonormalisable situations like this one is equivalent to the statement that the column vectors converge uniformly to $0$ in the infinite matrix representation. In the above example, that certainly looks plausible, as the entries of the columns seem to have valuation which grows roughly at the same rate. To contrast this with what happens in general, let us compute with respect to the same basis the first $10 \times 10$ submatrix for $T_3$.

$$\begin{equation} v_2(T_3(i,j))_{i,j} = \begin{pmatrix} 2 & 12 & 16 & & & & & & & \\ 7 & 2 & 11 & 20 & 27 & 32 & & & & \\ 8 & 8 & 2 & 14 & 17 & 28 & 34 & 46 & 48 & \\ & 11 & 8 & 2 & 12 & 19 & 29 & 36 & 43 & \\ & 16 & 9 & 10 & 2 & 12 & 16 & 32 & 34 & \cdots \\ & 16 & 15 & 12 & 7 & 2 & 11 & 22 & 28 & \\ & & 18 & 19 & 8 & 8 & 2 & 16 & 18 & \\ & & 23 & 19 & 17 & 12 & 9 & 2 & 13 & \\ & & 24 & 25 & 18 & 17 & 10 & 12 & 2 & \\ & & & & \vdots & & & & & \ddots \\ \end{pmatrix} \tag{85} \end{equation}$$

Notice the stark contrast with the matrix of $U_2$. Whereas the general entry of every column seems like it tends to zero (as it should, since $T_3$ still defines an operator on the Banach space $M_0^{\dagger }(2/3)$ after all) it does not look like the general column tends uniformly to zero. Most strikingly, the diagonal entries all seem to have valuation $2$, suggesting this operator may not have a convergent trace.

For the operator $U_2$ we can also compute an approximation for its characteristic series $P(t)$, using the above matrix. One can easily analyse to which precision the given answer is correct, but we will ignore such issues here. We truncate the matrix for $U_2$ as above, and obtain a polynomial whose coefficients are $2$-adically close to those of $P(t)$. Looking at the Newton polygon, we see that the valuations of the eigenvalues of $U_2$ on the full space $M_0^{\dagger }(r)$ for any $r$ are

$$\begin{equation} \mathbf{0}_1, \, \mathbf{3}_1,\, \mathbf{7}_1, \, \mathbf{13}_1, \, \mathbf{15}_1, \, \mathbf{17}_1, \, \ldots . \cssId{texmlid37}{\tag{86}} \end{equation}$$

Here, we denote the valuations of the eigenvalues by bold type and the multiplicity of that valuation by a subscript. It is striking that these are all integers, since there is no a priori reason that they should be! In this particular example, there is an explicit expression for the general term in this sequence, found by Buzzard and Calegari Reference BC05. We give a brief overview of their arguments.

Let $h$ be the Hauptmodul defined in Equation 75. Then a basis for the cuspidal subspace $S_0^{\dagger }(r) \subset M_0^{\dagger }(r)$ is given by the powers $h, h^2, h^3, \ldots ,$ where a general element is an infinite sum of these forms whose coefficients decay in a controlled manner, depending on $r$. We may try to find an explicit description of the matrix for $U_2$ with respect to this basis. It is easily verified that for $n \geq 2,$ we have the recursion

$$\begin{equation*} U_2(h^n) = (48h + 4096h^2)U_2(h^{n-1}) + hU_2(h^{n-2}). \end{equation*}$$

We know from Equation 77 that the powers of $2^6h$ form an orthonormal basis of $M_0^{\dagger }(1/2)$, and the above recursion implies that the $(i,j)$-th entry in the matrix for $U_2$ with respect to this basis is given by

$$\begin{equation*} \frac{3j(i+j-1)!2^{2i+2j-1}}{(2i-j)!(2j-i)!}. \end{equation*}$$
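
The recursion and the closed formula can be cross-checked against each other by machine. The following is a minimal sketch, not the method of Buzzard and Calegari: it assumes the initial values $U_2(1) = 1$ and $U_2(h) = 24h + 2048h^2$ (these are not stated in this section and are taken as an assumption here), runs the recursion for $U_2(h^n)$, rescales to the basis $(2^6h)^i$, and compares with the displayed expression. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def u2_columns(size):
    """Coefficient lists of U_2(h^n), n = 0..size, as polynomials in h.

    Assumed initial values: U_2(1) = 1 and U_2(h) = 24h + 2048h^2.
    cols[n][d] is the coefficient of h^d in U_2(h^n).
    """
    cols = [[1], [0, 24, 2048]]
    for n in range(2, size + 1):
        prev, prev2 = cols[n - 1], cols[n - 2]
        new = [0] * (2 * n + 1)
        for d, c in enumerate(prev):    # (48h + 4096h^2) * U_2(h^{n-1})
            new[d + 1] += 48 * c
            new[d + 2] += 4096 * c
        for d, c in enumerate(prev2):   # h * U_2(h^{n-2})
            new[d + 1] += c
        cols.append(new)
    return cols


def recursion_entry(cols, i, j):
    """(i, j) entry of U_2 in the basis (2^6 h)^i, computed from the recursion."""
    c = cols[j][i] if i < len(cols[j]) else 0
    return Fraction(c) * Fraction(2) ** (6 * (j - i))


def closed_entry(i, j):
    """(i, j) entry according to the displayed closed formula (zero outside its range)."""
    if 2 * i - j < 0 or 2 * j - i < 0:
        return Fraction(0)
    numerator = 3 * j * factorial(i + j - 1) * 2 ** (2 * i + 2 * j - 1)
    return Fraction(numerator, factorial(2 * i - j) * factorial(2 * j - i))


cols = u2_columns(8)
agree = all(
    recursion_entry(cols, i, j) == closed_entry(i, j)
    for i in range(1, 9)
    for j in range(1, 9)
)
print("recursion and closed formula agree on the 8 x 8 block:", agree)
```

Replacing the factor $2^6$ by $2^8$ in `recursion_entry` gives the entries whose valuations are displayed in Equation 84.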

In spite of the matrix for $U_2$ being completely explicit, it is still no laughing matter to compute its slopes, and more ideas are required. It was shown by Buzzard and Calegari Reference BC05, Lemma 4, using a really intriguing direct computation using a hypergeometric summation formula, that there exist matrices $A,B$ with entries in $\mathbf{Z}_2$ which are both congruent to the identity matrix modulo $2$ and such that $ADB$ equals the matrix of $U_2$, where $D$ is the diagonal matrix with $(i,i)$-th entry given by

$$\begin{equation*} \frac{2^{4i+1}(3i)!^2i!^2}{3(2i)!^4}. \end{equation*}$$

From this, one may deduce that the matrix of $U_2$ has a characteristic series whose Newton polygon is the same as that for the matrix $D$, which implies the following.

Theorem 3.3 (Buzzard and Calegari).

The slope sequence of $U_2$ on $S_0^{\dagger }(r)$ for any $r >0$ is given by

$$\begin{equation*} \left\{ 1 + 2v_2 \left(\frac{(3n)!}{n!}\right) \right\}_{n =1, \ldots , \infty }. \end{equation*}$$
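
The slope sequence in the theorem is easy to tabulate. The sketch below uses Legendre's formula for $v_2(n!)$ and also checks that the $2$-adic valuation of the $(i,i)$-th entry of the diagonal matrix $D$ above agrees with the $i$th slope. The function names are hypothetical, chosen only for this illustration.

```python
def v2_factorial(n):
    """2-adic valuation of n!, via Legendre's formula."""
    total = 0
    while n:
        n //= 2
        total += n
    return total


def bc_slope(n):
    """n-th slope 1 + 2 * v_2((3n)! / n!)."""
    return 1 + 2 * (v2_factorial(3 * n) - v2_factorial(n))


def v2_diagonal(i):
    """2-adic valuation of 2^(4i+1) (3i)!^2 i!^2 / (3 (2i)!^4)."""
    return (4 * i + 1
            + 2 * v2_factorial(3 * i)
            + 2 * v2_factorial(i)
            - 4 * v2_factorial(2 * i))


print([bc_slope(n) for n in range(1, 9)])
# -> [3, 7, 13, 15, 17, 25, 29, 31]
print(all(bc_slope(i) == v2_diagonal(i) for i in range(1, 200)))
```

The first five values reproduce the positive slopes listed in Equation 86.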

The study of slopes was very popular in the early twenty-first century; see for instance Reference Buz05Reference BC04Reference BC05Reference BP16Reference BG16 and the references contained therein. A good knowledge of the spectrum, such as the example above, leads to a streamlined way to prove many classical congruences of modular forms, such as that of Lehner Reference Leh49 for the Fourier coefficients $a_n$ of the $j$-function, which states that

$$\begin{equation*} a_n \equiv 0 \pmod {2^{3m+8}} \qquad \text{whenever} \qquad n \equiv 0 \pmod {2^m}, \quad m \geq 1. \end{equation*}$$

The appearance of $3$ in the exponent is a reflection of the first positive slope being $3$ in the Buzzard–Calegari theorem, and it can be strengthened and refined using the higher slopes. This domain has in recent years shifted its fashions towards the boundary of weight space; see the works Reference BK05Reference Roe14Reference LWX17Reference AIP18 and many others. This is a fascinating notion that falls outside the narrative we take here, but we mention a spectacular recent application in the proof by Newton and Thorne Reference NT19 of modularity of $\mathrm{Sym}^n(f)$ when $f$ is a cuspidal eigenform satisfying certain conditions, including all forms of level $1$.
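
Returning to the congruence for the coefficients of the $j$-function stated above, it can be tested numerically in the form: if $2^m$ divides $n$ with $m \geq 1$, then $2^{3m+8}$ divides $a_n$. The sketch below computes $j = E_4^3/\Delta$ with exact integer power series; this is a naive approach chosen for transparency rather than speed, and the function names are hypothetical.

```python
def j_coefficients(num_terms):
    """Return [a_1, ..., a_num_terms], where j = 1/q + 744 + sum a_n q^n."""
    N = num_terms + 2  # number of terms of q*j that are needed

    def mul(a, b):
        out = [0] * N
        for i, x in enumerate(a):
            if x:
                for k, y in enumerate(b[:N - i]):
                    out[i + k] += x * y
        return out

    # E_4 = 1 + 240 * sum_{n >= 1} sigma_3(n) q^n
    e4 = [1] + [240 * sum(d ** 3 for d in range(1, n + 1) if n % d == 0)
                for n in range(1, N)]
    e4_cubed = mul(mul(e4, e4), e4)

    # Delta / q = prod_{n >= 1} (1 - q^n)^24, built one factor at a time
    eta24 = [1] + [0] * (N - 1)
    for n in range(1, N):
        for _ in range(24):
            for i in range(N - 1, n - 1, -1):
                eta24[i] -= eta24[i - n]

    # Power series inverse of Delta / q (constant term 1)
    inv = [1] + [0] * (N - 1)
    for k in range(1, N):
        inv[k] = -sum(eta24[i] * inv[k - i] for i in range(1, k + 1))

    qj = mul(e4_cubed, inv)  # q*j = 1 + 744 q + a_1 q^2 + ...
    return qj[2:2 + num_terms]


def v2(n):
    """2-adic valuation of a nonzero integer."""
    v = 0
    while n % 2 == 0:
        n //= 2
        v += 1
    return v


coeffs = j_coefficients(16)
for n in range(2, 17, 2):
    m = v2(n)
    print(n, m, v2(coeffs[n - 1]) >= 3 * m + 8)
```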

3.6. The eigencurve

The above constructions may be extended to incorporate families of modular forms, culminating in the existence of the eigencurve. This is a geometric object that provides a powerful picture when thinking about families of overconvergent modular forms. The theory is due mainly to Coleman Reference Col96Reference Col97b and Coleman and Mazur Reference CM98, and it was revisited more recently by Pilloni Reference Pil13 and Andreatta, Iovita, and Stevens Reference AIS14. We content ourselves with a very brief discussion in these notes.

Our desire is to explain congruences between modular forms by interpolating between different weights, as in the theory of Serre. The geometric theory of overconvergent forms described so far is restricted to integral weights $k\in \mathbf{Z}$. To overcome the lack of a sheaf $\omega ^{\kappa }$ for $p$-adic weights $\kappa \not \in \mathbf{Z}$, the idea of Coleman was to turn once more to the Eisenstein family, which is defined for any weight-character

$$\begin{equation} \kappa \in \mathcal{W} \coloneq \mathrm{Hom}_{\mathrm{cont}}(\mathbf{Z}_p^{\times }, \mathbf{C}_p^{\times }), \tag{87} \end{equation}$$

where we can view the pairs $(k,\chi )$, consisting of an integer $k\in \mathbf{Z}$ and a character $\chi : (\mathbf{Z}/p^n\mathbf{Z})^{\times } \to \mathbf{C}_p^{\times }$, as a subset of $\mathcal{W}$ via the embedding defined by the continuous homomorphism

$$\begin{equation} (k,\chi ) \ : \ \mathbf{Z}_p^{\times } \longrightarrow \mathbf{C}_p^{\times }, \quad a \longmapsto \chi (a)a^{k-1}, \tag{88} \end{equation}$$

where $\chi$ is now thought of as a character of $\mathbf{Z}_p^{\times }$ by composing with reduction modulo $p^n$. The subset of weight characters for which $\kappa$ induces the trivial character on $(\mathbf{Z}/p\mathbf{Z})^{\times }$ is denoted by $\mathcal{W}_0$.

The coefficients of Eisenstein series are naturally functions of $(k,\chi )$, and one can easily show that they extend to functions on $\mathcal{W}$. The only part that needs clarification is how to view the Kubota–Leopoldt zeta function $\zeta _p$ as a function of $\kappa \in \mathcal{W}$. Write $\Delta$ for the torsion subgroup of $\mathbf{Z}_p^{\times }$, which is cyclic of order $\phi (q)$, where $q = 4$ if $p=2$, and $q=p$ otherwise. There is an isomorphism

$$\begin{equation} \mathbf{Z}_p^{\times } \stackrel{\sim }{\longrightarrow } \Delta \times (1+q\mathbf{Z}_p), \quad a \longmapsto (\omega (a),\langle a \rangle ). \tag{89} \end{equation}$$

The character $\omega$ is called the Teichmüller character. Let $\Lambda = \mathbf{Z}_p \lBrack \mathbf{Z}_p^{\times }\rBrack$ be the Iwasawa algebra, which is the ring of functions on $\mathcal{W}$. Then we have an isomorphism

$$\begin{equation} \Lambda \simeq \mathbf{Z}_p [\Delta ]\lBrack T \rBrack , \qquad 1+q \ \longmapsto \ 1+T. \tag{90} \end{equation}$$

This way, the Kubota–Leopoldt zeta function $\zeta _p$ can be viewed as a function on $\mathcal{W}$, satisfying

$$\begin{equation} \zeta _p\left((1+q)^{k-1}-1\right) = (1-p^{k-1}) \zeta (1-k), \tag{91} \end{equation}$$

giving us the Eisenstein family

$$\begin{equation} \begin{array}{llll} \mathbf{G}_{\kappa }(q) &=& \frac{\zeta _p(\kappa )}{2} \ + \phantom {1} \sum _{n \geq 1} \Bigl ( \sum _{p \nmid d \mid n} \kappa (d)/d \Bigr ) q^n, & \qquad \kappa \not \in \mathcal{W}_0, \\\mathbf{E}_{\kappa }(q) &=& 1 \ + \frac{2}{\zeta _p(\kappa )} \sum _{n \geq 1} \Bigl ( \sum _{p \nmid d \mid n} \kappa (d)/d \Bigr ) q^n, & \qquad \kappa \in \mathcal{W}_0. \\\end{array} \tag{92} \end{equation}$$
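
The $p$-adic continuity of the constant term can be made tangible through the classical Kummer congruences: the right-hand side of Equation 91, namely $(1-p^{k-1})\zeta (1-k)$, takes congruent values at even integers $k$ that are $p$-adically close in weight space. The sketch below checks one instance with $p=5$, $k=2$ and $k=22$, which differ by $(p-1)p$ and should therefore agree modulo $p^2$. It uses $\zeta (1-k) = -B_k/k$ and exact rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(m_max):
    """Bernoulli numbers B_0, ..., B_{m_max} (with B_1 = -1/2)."""
    B = [Fraction(1)]
    for m in range(1, m_max + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B


def zeta_p_value(p, k, B):
    """(1 - p^(k-1)) * zeta(1 - k) for an even integer k >= 2."""
    return (1 - Fraction(p) ** (k - 1)) * (-B[k] / k)


def reduce_mod(x, modulus):
    """Reduce a rational number with denominator coprime to the modulus."""
    return x.numerator * pow(x.denominator, -1, modulus) % modulus


p = 5
B = bernoulli_numbers(22)
low = reduce_mod(zeta_p_value(p, 2, B), p ** 2)
high = reduce_mod(zeta_p_value(p, 22, B), p ** 2)
print(low, high)  # the two residues modulo 25 coincide
```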

The idea of Coleman was to define an overconvergent modular form of weight $\kappa$ to be any $q$-expansion with the property that its quotient by the Eisenstein series of weight $\kappa$ is an overconvergent modular function.⁠Footnote⁷ The weights of overconvergent modular forms are naturally parametrised by a geometric object: we define $\mathcal{W}_N$, the weight space of level $N$, as a rigid analytic variety via

$$\begin{equation} \mathcal{W}_N = \left(\mathrm{Spf} \ \Lambda _N \right)^{{\mathrm{rig}}} , \qquad \text{where} \quad \Lambda _N = \mathbf{Z}_p\lBrack (\mathbf{Z}/N\mathbf{Z})^{\times } \times \mathbf{Z}_p^{\times } \rBrack . \tag{93} \end{equation}$$

⁷

Since then, a more satisfactory definition, albeit one somewhat less immediately suited for explicit computations, has been given by Pilloni Reference Pil13 and Andreatta, Iovita, and Pilloni Reference AIP18, who gave a geometric construction of line bundles $\omega ^{\kappa }$ on the affinoids $X_r$ for some $r$ that depends on $\kappa$. Pilloni shows that the Eisenstein series of weight $\kappa$ is a section of his line bundle, therefore giving a completely geometric definition of the space of $r$-overconvergent forms $M_{\kappa }^{\dagger }(r)$ for any weight-character $\kappa$, as long as $r$ is sufficiently small.

This set of ideas culminated in the construction, due to Coleman and Mazur Reference CM98, of the eigencurve $\mathcal{C}_N$.

Theorem 3.4 (Coleman and Mazur).

There exists a rigid analytic curve $\mathcal{C}_N \to \mathcal{W}_N$, whose $\mathbf{C}_p$-points classify normalised overconvergent eigenforms $f$ which are not in the kernel ⁠Footnote⁸ of $U_p$.

⁸

In this case, we say $f$ is of finite slope, where the “slope” refers to the valuation of its $U_p$-eigenvalue.

The map $\pi : \mathcal{C}_N \longrightarrow \mathcal{W}_N$ simply associates to every overconvergent eigenform $f$ its weight character $\kappa$. The geometric properties of $\mathcal{C}_N$ therefore dictate all the possible $p$-adic variations of modular forms of finite slope in families. Relatively little is known about its geometry. Figure 3 is a free impression that attempts to depict some of its features. The weight space $\mathcal{W}_N$ decomposes as a finite union of open disks, whereas $\mathcal{C}_N$ contains a particularly well-behaved subspace $\mathcal{C}_N^{\mathrm{ord}}$ that is finite flat over every component of $\mathcal{W}_N$ (to be discussed in §3.7) and otherwise exhibits a striking contrast between its behaviour close to the boundary and deeper in the interior (these will be discussed in §4.2).

We briefly mention here one important property that has been established recently, and is often referred to as the “properness” of the eigencurve. More precisely, it was asked by Coleman and Mazur Reference CM98 whether the eigencurve can have any “holes”, in the sense of a $p$-adic analytic family of overconvergent eigenforms of finite slope parameterized by a punctured disc, which converges at the puncture to an overconvergent eigenform in the kernel of $U_p$ (such a form is typically said to have infinite slope). This is reminiscent of the valuative criterion for properness. It was proved by Buzzard and Calegari Reference BC06 that no such families exist (and hence the eigencurve is proper) when $p=2$ and $N=1$, and then also by Calegari Reference Cal08 at integer weights. The general case was established via an intricate, yet elegant, argument by Diao and Liu Reference DL16.

3.7. Hida theory

One part of the eigencurve that is fairly well understood is the ordinary part, whose discovery by Hida Reference Hid86bReference Hid86a predates that of the eigencurve by over a decade. An overconvergent form is called ordinary if it is a $U_p$-eigenvector with an eigenvalue that is a $p$-adic unit or, said differently, is of slope zero. Hida considered the ordinary projection operator

$$\begin{equation} e^{{\mathrm{ord}}} = \lim _{n \to \infty } U_p^{n!}, \tag{94} \end{equation}$$

whose limit exists as an operator on $M_{\kappa }^{\dagger }(r)$ for any $\kappa$ in $\mathcal{W}_N$. Then Hida showed the following.

Theorem 3.5 (Hida).

The image of $e^{{\mathrm{ord}}}$ on $M^{\dagger }_{\kappa }(r)$ is a finite-dimensional vector space, whose dimension depends only on the connected component of $\mathcal{W}_N$ containing $\kappa$.

This spectacular result shows that even though the slopes of the spectrum of $U_p$ can vary wildly, the dimension of the part of slope $0$ is locally constant on $\mathcal{W}_N$. Note that the connected components of $\mathcal{W}_N$ are indexed by the characters $(\mathbf{Z}/Nq\mathbf{Z})^{\times } \to \mathbf{C}_p^{\times }$, and the dimension of the ordinary subspace is constant over each component. Hida in fact proved the following statement. Suppose

$$\begin{equation} \pi ^{{\mathrm{ord}}}: \mathcal{C}^{{\mathrm{ord}}}_N \to \mathcal{W}_N \tag{95} \end{equation}$$

is the projection map from the ordinary part of the eigencurve to weight space. Then $\pi ^{{\mathrm{ord}}}$ is finite flat. The ordinary part $\mathcal{C}^{{\mathrm{ord}}}_N$ is often referred to, though usually only locally, as the Hida family.
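
To build some intuition for the projector $e^{{\mathrm{ord}}}$ of Equation 94, one can watch the limit stabilise on a toy example. The sketch below is only an illustration under stated assumptions: a hypothetical $2 \times 2$ integer matrix with one unit eigenvalue and one non-unit eigenvalue stands in for $U_p$, and all arithmetic is done modulo a fixed power of $p$. It uses the identity $U^{n!} = (U^{(n-1)!})^n$ and stops as soon as the power is idempotent, after which all further powers are equal.

```python
def mat_mul(A, B, mod):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % mod for j in range(n)]
            for i in range(n)]


def mat_pow(A, e, mod):
    """A^e modulo mod, by repeated squaring."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    base = [row[:] for row in A]
    while e:
        if e & 1:
            result = mat_mul(result, base, mod)
        base = mat_mul(base, base, mod)
        e >>= 1
    return result


def ordinary_projector(U, p, precision, max_n=50):
    """Approximate lim U^(n!) modulo p^precision for an integer matrix U."""
    mod = p ** precision
    current = [[x % mod for x in row] for row in U]  # U^(1!)
    for n in range(2, max_n + 1):
        current = mat_pow(current, n, mod)           # U^(n!) = (U^((n-1)!))^n
        if mat_mul(current, current, mod) == current:
            return current                           # idempotent, hence stable
    raise ValueError("no idempotent reached; increase max_n")


# Toy stand-in for U_p with p = 3: eigenvalues 2 (a unit) and 3 (not a unit).
toy_U = [[2, 1], [0, 3]]
e_ord = ordinary_projector(toy_U, 3, 5)
print(e_ord)
```

The result has rank one: it fixes the eigenvector of the unit eigenvalue and kills the eigenvector $(1,1)$ of the eigenvalue $3$, mirroring how $e^{{\mathrm{ord}}}$ cuts out the slope zero part.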

An extremely powerful tool is the fact that specialisations of Hida families at classical weights $k \geq 2$ are always classical modular forms. More generally, the following theorem was proved by Coleman Reference Col96.

Theorem 3.6 (Coleman).

Suppose that $k \geq 2$ is an integer weight and $f \in M^{\dagger }_k$ is a $U_p$-eigenform of slope strictly less than $k-1$. Then $f$ is classical, in the sense that it belongs to the finite-dimensional subspace

$$\begin{equation} M_k(\Gamma _0(Np)) \subset M^{\dagger }_k(\Gamma _0(N)). \tag{96} \end{equation}$$

It is difficult to overstate the importance of this powerful result, which often goes by the name of the Coleman classicality theorem. In the literature on overconvergent modular symbols, it is also commonly referred to as Coleman’s control theorem. It has far-reaching implications, and is used so frequently in the literature—as well as what follows—that it is often applied without explicit mention.

3.8. Leopoldt’s formula

We end this section with an application of this geometric viewpoint on $p$-adic modular forms, and we prove a classical result on the value at $s=1$ of $p$-adic L-functions attached to Dirichlet characters via an incarnation of Serre’s idea to investigate the constant coefficient via the higher Fourier coefficients. In this situation, it allows us to identify the L-value as an explicit combination of units. We follow the treatment in Reference BCD$^{+}$, which contains several more appearances of Serre’s idea in various guises. A different proof for Leopoldt’s formula for $L_p(1,\chi )$ can be found in Reference Was97, §5.4.

Suppose that $\chi : (\mathbf{Z}/N\mathbf{Z})^{\times } \to \mathbf{C}^{\times }$ is a primitive, even Dirichlet character with conductor $N>1$ coprime to $p$. Then we have the $p$-adic Eisenstein family of overconvergent forms

$$\begin{equation} \mathbf{E}_{k}^{(p)}(\chi )\ = \ L_p(1-k,\chi ) \ + \ 2\ \sum _{n \geq 1} \sigma ^{(p)}_{k,\chi }(n) \ q^n, \qquad \text{where} \quad \sigma ^{(p)}_{k,\chi }(n) = \sum _{p\, \nmid \, d \, \mid \, n } \ \ \chi (d) d^{k-1}. \tag{97} \end{equation}$$

This family specialises at $k=0$ to a rigid analytic function on $X^{{\mathrm{ord}}} = X_1(N)^{{\mathrm{ord}}}$, whose value at the cusp $\infty$ is the value $L_p(1,\chi )$. Now choose a primitive $N$th root of unity $\zeta$. Then there is a collection of Siegel units $g_a \in \mathcal{O}_{Y_1(N)}^{\times }$ whose $q$-expansions are given by

$$\begin{equation} g_a(q) = q^{1/12}(1-\zeta ^a) \prod _{n \geq 1} (1-q^n\zeta ^a)(1-q^n\zeta ^{-a}), \qquad 1 \leq a \leq N-1. \cssId{texmlid21}{\tag{98}} \end{equation}$$

Using the operator $V_p$ on $p$-adic modular forms, we define the rigid analytic function

$$\begin{equation} F_{\chi }^{(p)} = \frac{1}{p \mathfrak{g}(\chi ^{-1})} \sum _{a=1}^{N-1} \chi ^{-1}(a) \, \log _p \left( V_p(g_{pa})g_a^{-p} \right), \tag{99} \end{equation}$$

which is defined on the ordinary locus $X^{{\mathrm{ord}}}$. Here $\mathfrak{g}(\chi ^{-1})$ denotes the standard Gauß sum, obtained by summing $\chi ^{-1}(a)\zeta ^a$ over $a$. A direct computation, using Equation 98, shows that the higher coefficients of its $q$-expansion agree with those of $\mathbf{E}_{0}^{(p)}(\chi )$. Therefore the modular form

$$\begin{equation} \mathbf{E}_{0}^{(p)}(\chi ) - F_{\chi }^{(p)}, \tag{100} \end{equation}$$

which is a constant function, must be equal to zero, since it has nebentype $\chi$. We conclude that the constant terms of both series are equal, yielding Leopoldt’s formula,

$$\begin{equation} L_p(1,\chi ) = - \frac{(1-\chi (p)p^{-1})}{\mathfrak{g}(\chi ^{-1})} \sum _{a=1}^{N-1} \chi ^{-1}(a) \log _p(1-\zeta ^a). \tag{101} \end{equation}$$

4. Explicit computations and arithmetic applications

We now discuss how to compute explicitly with overconvergent modular forms, in more generality than was achieved in the extended example §3.4, by following the approach of Lauder Reference Lau11. We then look at a number of different arithmetic applications of this theory, illustrated with explicit examples.

4.1. Computing overconvergent forms

We first explain how to compute explicit bases for the $p$-adic Banach spaces of $r$-overconvergent forms, following Katz Reference Kat73 and Lauder Reference Lau11. Note that in the explicit example treated in §3.4, where $(p,N) = (2,1)$ and $k=0$, we were particularly lucky in the sense that the modular curve $X_0(2)$ had genus zero, and the overconvergent regions $X_r$ were isomorphic to a rigid analytic disk, for which we could identify an explicit parameter. This procedure can be repeated for any prime $p$ for which $X_0(p)$ has genus zero (i.e. for $p=2,3,5,7,13$), where one can likewise write down a power basis for the space of overconvergent modular forms, for any weight $k$. See Loeffler Reference Loe07 for a detailed discussion of this case, as well as many interesting results and computations.

For general values of $p$, we are faced with a more complicated geometric picture, as the overconvergent regions $X_r$ are isomorphic to the complement of a finite number of disks in $\mathbf{P}^1$ (see Figure 4). Moreover, in cases where we also have a nontrivial tame level $N$, the modular curve from which we remove these finitely many disks is no longer isomorphic to $\mathbf{P}^1$. Therefore, finding an explicit basis for the set of sections over the overconvergent regions $X_r$ becomes significantly more subtle. In his foundational paper on the subject, Katz Reference Kat73, Chapter 2 identifies an explicit basis for these spaces, such that any overconvergent form may be written uniquely as a linear combination of its elements, referred to as its Katz expansion.

Let $\mathcal{X}$ be the modular curve over $\mathbf{Z}_p$ with $\Gamma _1(N)$-level structure⁠Footnote⁹ for $p \nmid N\geq 5$. Let $n$ be the smallest power of $p$ such that the $n$th power of the Hasse invariant $A^n$ lifts to a level $1$ Eisenstein series $E$ of weight $k_E=n(p-1)$. Throughout this section, we assume $nr \leq 1$. Our notation is summarised in Table 1.

⁹

In practice, there is a lot of flexibility with the setup, and the computations below are usually for $\Gamma _0(N)$ instead of $\Gamma _1(N)$. To justify this, some additional analysis is required to deal with the lack of representability; see Reference BC05, Appendix.

We now describe an explicit basis for the spaces $M^{\dagger }_k(r)$. Suppose $r = v_p(s)$ for some $s \in \mathbf{C}_p$. Let $\mathcal{I}_r$ be the sheaf of ideals in $\mathrm{Sym}(\omega ^{\otimes k_E})$ generated by $E-s^n$, and define the line bundle

$$\begin{equation} \mathcal{L} = \mathrm{Spec}_{\mathcal{X}}\left(\mathrm{Sym}(\omega ^{\otimes k_E})/\mathcal{I}_r\right) \ \stackrel{\pi _{\mathcal{L}}}{\longrightarrow } \ \mathcal{X}. \tag{102} \end{equation}$$

Assuming that $k\neq 1$, we can apply the base change theorems from Reference Kat73, Theorem 1.7.1 to show that

$$\begin{eqnarray} M^{\dagger }_k(r) &=& \mathrm{H}^0\left(\mathcal{L}^{{\mathrm{rig}}}, \pi ^*_{\mathcal{L}} \omega ^{\otimes k}\right) \tag{103}\\ &=& \mathrm{H}^0\left(\mathcal{X},\omega ^{\otimes k} \otimes \text{Sym} (\omega ^{\otimes k_E})\right) / \mathrm{H}^0(\mathcal{X}, \mathcal{I}_r). \cssId{Quotient}{\tag{104}} \end{eqnarray}$$

Having this concrete description in hand, we now attempt to eliminate the relation $E = s^n$ by investigating the map given by multiplication by $E$ on modular forms as in Reference Kat73, Lemma 2.6.1 and Reference Von15, Lemma 1. More precisely, the injection given by multiplication by $E$,

$$\begin{equation} - \times E: \mathrm{H}^0\left(\mathcal{X},\omega ^{\otimes k}\right) \longrightarrow \mathrm{H}^0\left(\mathcal{X},\omega ^{\otimes k + k_E}\right) \cssId{texmlid22}{\tag{105}} \end{equation}$$

splits as a map of $\mathbf{Z}_p$-modules. This implies that for every $i \geq 0$, we may choose generators $\{a_{i,j} \}_{j}$ for a complement of the submodule

$$\begin{equation} \mathrm{Im} \left(- \times E\right) \subseteq \mathrm{H}^0(\mathcal{X},\omega ^{\otimes k + ik_E}). \tag{106} \end{equation}$$

This choice is not canonical, but we will fix it once and for all in what follows. As in Reference Kat73, Proposition 2.6.2, one obtains the following as a consequence of Equation 104 and the splitting of Equation 105.

Theorem 4.1.

The set $\{ e_{i,j} \}_{i,j}$ is an orthonormal basis for the $p$-adic Banach space $M^{\dagger }_k(r)$, where

$$\begin{equation} e_{i,j} = s^{ni}\frac{a_{i,j}}{E^i}. \tag{107} \end{equation}$$

Note that we have avoided the case $k = 1$, which we can still handle by appropriately twisting $U_p$, thereby reducing the computation to one in higher weight for which the results above hold. This technique is often referred to as Coleman’s trick (see Reference Col97b, Eqn. (3.3)) and is also frequently useful in other situations. It is based on the observation that multiplication by $E^j$ defines an isomorphism

$$\begin{equation} M_k^{\dagger }(r) \ \ \longrightarrow \ \ M_{k+jk_E}^{\dagger }(r), \tag{108} \end{equation}$$

as well as the fact that the $U_p$-operator is Frobenius linear in the sense that

$$\begin{equation} U_p( f V_p(E) ) = U_p(f) E. \tag{109} \end{equation}$$

It follows from these two simple facts that $P_{k+jk_E}(t)$ equals the characteristic series of $U_p \circ G^j$ on $M_k^{\dagger }(r)$, where we denote $G = E/V_pE$. This allows us to flexibly change the weights of the spaces of overconvergent forms we are interested in. In particular, we can compute overconvergent forms in weight $1$ by reducing the computation to, say, weight $p$. Likewise, if we would like to compute the operator $U_p$ on $M_k^{\dagger }(r)$ for some extremely large weight $k$, we can use Coleman’s trick to reduce the computation to a small weight.
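
For the reader who wants to see how the two facts combine, here is the short verification. It only uses Equation 109 applied to $E^j$ in place of $E$, which is permissible since $V_p$ is multiplicative, so that $V_p(E)^j = V_p(E^j)$. For $f \in M_k^{\dagger }(r)$ we have $E^j = V_p(E)^j G^j$, and hence

$$\begin{equation*} U_p\left(E^j f\right) = U_p\left( V_p(E^j) \, G^j f \right) = E^j \, U_p\left(G^j f\right). \end{equation*}$$

In other words, the isomorphism of Equation 108 intertwines the operator $U_p \circ G^j$ on $M_k^{\dagger }(r)$ with the operator $U_p$ on $M_{k+jk_E}^{\dagger }(r)$, so the two operators have the same characteristic series.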

Now that we know, by Theorem 4.1, an explicit basis $e_{i,j}$ for the Banach space $M_k^{\dagger }(r)$, we are in a position to compute approximations of the matrix of $U_p$ on $q$-expansions. Since we can only compute finitely many of its entries, we need a good estimate on the valuations of its entries, so we know how many elements of the basis we need to compute before we are guaranteed that the end result is correct up to some chosen $p$-adic precision. To do this, let us first fix some notation for these entries. We write

$$\begin{equation} U_p \circ G^j(e_{u,v}) = \sum _{w,z} A_{u,v}^{w,z}(j)\ e_{w,z}, \tag{110} \end{equation}$$

for some $A_{u,v}^{w,z}(j) \in \mathbf{C}_p$. Said differently, the numbers $A_{u,v}^{w,z}(j)$ are the entries of the infinite matrix of $U_p \circ G^j$ with respect to our chosen orthonormal basis for $M_k^{\dagger }(r)$. The following lemma estimates their $p$-adic valuations and is an easy extension of Wan Reference Wan98, Lemma 3.1; see Reference Von15.

Lemma 4.2.

We have

$$\begin{equation} v_p\left(A_{u,v}^{w,z}(j) \right) \geq wrk_E - 1 - r(n-1). \cssId{texmlid23}{\tag{111}} \end{equation}$$

The reader may have wondered why in the above precision estimate, we included the parameter $j$, corresponding to a twist of the $U_p$ operator by $G^j = (E/V_pE)^j$, rather than simply putting $j=0$. The reason is that this allows us to easily move between different weights and to perform the computation of $U_p$ in several weights at once. The examples below illustrate this by computing the $U_p$-operator in families.

Remark.

In what follows, we frequently drop the rate of overconvergence $r$ from the notation. This is justified by the fact that an overconvergent finite slope eigenform must be $r$-overconvergent for any $r < p/(p+1)$. The data below therefore does not depend on $r$ at all, though its computation does. It is clear from Equation 111 that it is helpful in practice to choose $r$ as large as possible to accelerate convergence.
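
The estimate of Equation 111 translates directly into a truncation bound: every entry $A_{u,v}^{w,z}(j)$ whose row index $w$ satisfies $wrk_E - 1 - r(n-1) \geq m$ vanishes modulo $p^m$. The following sketch computes the smallest such $w$ for a target precision $m$. It is a simplified illustration, not the precise bookkeeping of any published algorithm; the function name and the sample parameters are hypothetical.

```python
from fractions import Fraction
from math import ceil


def rows_needed(p, n, r, target):
    """Smallest integer w >= 0 with w*r*k_E - 1 - r*(n-1) >= target.

    Here k_E = n*(p-1) is the weight of the Eisenstein series E, and r is
    the rate of overconvergence, given as an exact rational number.
    """
    r = Fraction(r)
    k_E = n * (p - 1)
    bound = (target + 1 + r * (n - 1)) / (r * k_E)
    return max(0, ceil(bound))


# p = 5, n = 1, target precision 5^20, for two rates of overconvergence.
print(rows_needed(5, 1, Fraction(1, 2), 20))
print(rows_needed(5, 1, Fraction(4, 5), 20))
```

The second call returns a smaller number than the first, in line with the remark that a larger $r$ accelerates convergence.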

4.2. The spectral curve

We now have two crucial ingredients for a working algorithm to compute with spaces of overconvergent modular forms, since we have (a) an explicit basis due to Katz, provided by Theorem 4.1, and (b) a precision estimate for the concomitant entries of the matrix of $U_p$ due to Wan, provided by Equation 111. Lauder Reference Lau11 combines these two ingredients into an efficient algorithm for computing $U_p$ on $M_k^{\dagger }(r)$. We note that the estimate Equation 111 is independent of $j$, and hence the computation may be performed at several $p$-adic weights at once. In the examples below, we compute the resulting $2$-variable series $P(\kappa ,t)$. The curve in $\mathcal{W}_N\times \mathbf{G}_m$ cut out by the equation $P(\kappa ,t) = 0$ is often referred to as the spectral curve of $U_p$, which yields the eigencurve after an additional modification; see Reference CM98.

Let $f: \mathcal{W}_N \rightarrow \mathbf{C}_p$ be a function in the Iwasawa algebra, and let $\{\kappa _0, \kappa _1, \ldots , \kappa _n\}$ be a finite set of points. Then we denote $f[\kappa _0] = f(\kappa _0)$ and we inductively define the divided difference of order $n$ to be

$$\begin{equation*} f[\kappa _0, \kappa _1, \ldots , \kappa _n] \coloneq \frac{f[\kappa _1, \ldots , \kappa _n] - f[\kappa _0, \ldots , \kappa _{n-1}]}{\kappa _n-\kappa _0}. \end{equation*}$$

We now define the $n$th Newton series to be

$$\begin{equation} P_n(\kappa ,t) = \sum _{i=0}^n P[\kappa _0,\kappa _1,\ldots ,\kappa _i](t)\times (\kappa -\kappa _0)(\kappa -\kappa _1)\cdots (\kappa -\kappa _{i-1}), \cssId{FinDiff}{\tag{112}} \end{equation}$$

where the product is empty for $i=0$, and $P[\kappa _0,\ldots ,\kappa _i](t)$ is the power series in $t$ obtained by taking the corresponding divided differences of the coefficients of the powers of $t$ in $P(\kappa ,t)$, which are elements of the Iwasawa algebra by Coleman Reference Col97a. The theory of finite differences then shows that, as the number of interpolation points increases, the $n$th Newton series approaches the series $P(\kappa ,t)$ $p$-adically. This means that all we need to do to compute an approximation for $P(\kappa ,t)$ is choose our interpolation points carefully and estimate the error term.
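
The interpolation step can be sketched in a few lines. The code below implements divided differences and the standard Newton form, in which the $i$th term carries the product $(\kappa -\kappa _0)\cdots (\kappa -\kappa _{i-1})$, for a single scalar-valued function; in the setting above one would apply it to each coefficient of $t$ separately. Exact rational arithmetic is used instead of $p$-adic arithmetic, and the function names and sample nodes are hypothetical.

```python
from fractions import Fraction


def divided_differences(f, points):
    """Return [f[k0], f[k0,k1], ..., f[k0,...,kn]] for the given nodes."""
    table = [Fraction(f(x)) for x in points]
    coeffs = [table[0]]
    for order in range(1, len(points)):
        # After this step, table[i] = f[k_i, ..., k_{i+order}].
        table = [(table[i + 1] - table[i]) / (points[i + order] - points[i])
                 for i in range(len(table) - 1)]
        coeffs.append(table[0])
    return coeffs


def newton_series(coeffs, points, x):
    """Evaluate sum_i coeffs[i] * (x - k_0) ... (x - k_{i-1}) at x."""
    total, product = Fraction(0), Fraction(1)
    for c, node in zip(coeffs, points):
        total += c * product
        product *= x - node
    return total


# A cubic is recovered exactly from four interpolation nodes.
def sample(k):
    return 3 * k ** 3 - 2 * k + 7


nodes = [0, 4, 8, 12]
coeffs = divided_differences(sample, nodes)
print(newton_series(coeffs, nodes, 5) == sample(5))
```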

We explicitly compute some examples, starting by revisiting the example of Buzzard and Calegari Reference BC05 familiar from §3.4, and then venturing into more unfamiliar territory relating to situations that were considered in the literature by Buzzard and Kilford Reference BK05 and Roe Reference Roe14, as well as the work on boundary slopes and the spectral halo by Andreatta, Iovita, and Pilloni Reference AIP18 and Bergdall and Pollack Reference BP16. We note that an alternative approach using overconvergent modular symbols has been developed in Reference DHH$^{+}$16 for Hida families. Their algorithms yield explicit $q$-expansions of Hida families, where the coefficients are elements of $\Lambda$, but they are only equipped to handle the ordinary part of the spectrum of $U_p$.

Example 4.1.

We revisit the case of $p=2$ and tame level $N=1$, where we computed with the space for $k=0$ in §3.4. Using an interpolation as described above, we can compute the two variable power series $P(\kappa ,t)$, whose specialisation at $\kappa \in \mathcal{W}$ recovers the characteristic series $P_{\kappa }(t)$ of $U_2$ on the space of overconvergent modular forms $M_{\kappa }^{\dagger }(r)$. We obtain

$$\begin{equation*} \begin{split} P(\kappa ,t)= 1 &+\ (519736167t\!+\!413685912t^2\!+\!148708352t^3\!+\!1065353216t^4)\\ &+\kappa (36306799t\!+\!374998993t^2\!+\!380696768t^3\!+\!281739264t^4) \\ &+\ \kappa ^2(43984100t\!+\!481404364t^2\!+\!496002384t^3\!+\!387895296t^4\!+\!1811939328t^5)\\ &+\ \kappa ^3(874017364t\!+\!890496879t^2\!+\!487943741t^3\!+\!4077568t^4\!+\!964689920t^5) \\ &+\ \kappa ^4(392124398t\!+\!264203079t^2\!+\!839291211t^3\!+\!908503936t^4\!+\!817102848t^5)\\ &+ O(\kappa ^5,2^{30}). \end{split} \end{equation*}$$

We actually computed $P(\kappa ,t)$ to precision $O(\kappa ^{25},2^{70})$, which took about five minutes, but we truncated the result to get output that fits in this document. Let us now investigate various specialisations:

•: The computation we did in the previous section is contained in this one, and if we set $\kappa = 5^k - 1= 0$, which corresponds to weight $k=0$, we recover the same power series as before, up to the precision used. In particular, we can read off that the first few slopes are $\mathbf{0}_1, \mathbf{3}_1, \mathbf{7}_1, \ldots ,$ which agrees with the result of Buzzard and Calegari Reference BC05 that in weight $0$ the $n$th slope is equal to$$\begin{equation*} 1 + 2v_2 \left(\frac{(3n)!}{n!}\right). \end{equation*}$$
•: As for the other extreme, the main result of Buzzard and Kilford Reference BK05 states that the slopes on the boundary annulus $1/8 < |\kappa | < 1$ form an arithmetic progression with $n$th term $nv_2(\kappa )$, all with multiplicity $1$. Indeed, by substituting $\kappa = 2,$ we obtain the slope sequence $0,1,2,3,4,\ldots ,$ while for $\kappa = 4$ we recover $0,2,4,6,8,\ldots .$ Our computed power series $P(\kappa ,t)$ hence combines the best of both worlds, by describing the spectral curve over the inner regions of $\mathcal{W}$ as well as the outskirts. Notice the striking contrast between the nature of the slope sequence at $k=0$ and that close to the boundary! A folklore conjecture predicts that the same phenomenon happens in general, and a result of this flavour was obtained by Liu, Wan, and Xiao Reference LWX17.
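
The first specialisation above can be checked directly from the displayed truncation of $P(\kappa ,t)$: at $\kappa = 0$ only the terms without $\kappa$ survive, and the slopes are read off from the Newton polygon of the $2$-adic valuations of the coefficients. The sketch below does this with a small lower convex hull routine. It only sees the coefficients up to $t^4$ modulo $2^{30}$, so it can recover at most the first four slopes; the function names are hypothetical.

```python
from fractions import Fraction


def v2(n):
    """2-adic valuation of an integer, or None for n = 0."""
    if n == 0:
        return None
    v = 0
    while n % 2 == 0:
        n //= 2
        v += 1
    return v


def newton_slopes(valuations):
    """Slopes, with multiplicity, of the lower convex hull of (i, valuations[i]).

    Entries equal to None (infinite valuation) are skipped.
    """
    points = [(i, v) for i, v in enumerate(valuations) if v is not None]
    hull = []
    for px, py in points:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Drop hull[-1] if it lies on or above the segment hull[-2] -> (px, py).
            if (y2 - y1) * (px - x1) >= (py - y1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append((px, py))
    slopes = []
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        slopes += [Fraction(y2 - y1, x2 - x1)] * (x2 - x1)
    return slopes


# Coefficients of 1, t, ..., t^4 in P(0, t), copied from the display above.
weight_zero = [1, 519736167, 413685912, 148708352, 1065353216]
print(newton_slopes([v2(c) for c in weight_zero]))
```

The output consists of the slopes $0, 3, 7, 13$, each with multiplicity one, matching the start of the sequence in Equation 86.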

In the above computation, we focussed on the variation of $P_{\kappa }(t)$ with the weight $\kappa$, but we can interchange the variables $\kappa$ and $t$ and study instead the power series in $\kappa$ appearing as the coefficients of the above series in $t$. For instance, up to precision $(2^{21},\kappa ^7)$ we obtain

$$\begin{equation*} \begin{split} P(\kappa ,t)\equiv 1 &+ t(1739623\!+\!655215\kappa \!+\!2041060\kappa ^2 \!+\!1602132\kappa ^3 \!+\!2054126\kappa ^4 \!+\!779022\kappa ^5 \!+\!1634724\kappa ^6)\\ &+t^2(546968\!+\!1705937\kappa \!+\!1156556\kappa ^2 \!+\!1304431\kappa ^3 \!+\!2059079\kappa ^4 \!+\!1677821\kappa ^5 \!+\!644339\kappa ^6)\\ &+t^3(1907712\!+\!1112256\kappa \!+\!1074512\kappa ^2 \!+\!1404477\kappa ^3 \!+\!430411\kappa ^4 \!+\!51909\kappa ^5 \!+\!1261732\kappa ^6)\\ &+t^4(720896\kappa \!+\!2019328\kappa ^2 \!+\!1980416\kappa ^3 \!+\!437120\kappa ^4 \!+\!1161264\kappa ^5 \!+\!1648837\kappa ^6)\\ &+t^5(1310720\kappa ^4 \!+\!524288\kappa ^5 \!+\!1101824\kappa ^6)\\ &+O(2^{21},\kappa ^7). \end{split} \end{equation*}$$

Investigating the coefficients $a_i(\kappa )$ of $P(\kappa ,t)$ for small values of $i$, we see that their valuation on $\kappa \in \mathbf{Z}_2$ only seems to depend on $\kappa \pmod {2^6}$. This observation can be made rigorous by using the uniform estimates in Wan Reference Wan98 for the Newton polygon in $t$ of $P(\kappa ,t)$ recalled above. After possibly redoing the computation to a higher precision, to ensure that all the slopes are indeed correct, we recover the following theorem, which may be found in Emerton Reference Eme98, Theorem 1.1.

Theorem 4.3 (Emerton).

The minimal nonzero slope of $U_2$ on $M^{\dagger }_k$ in tame level $1$, along with its multiplicity, depends only on $k \pmod {16}$. More precisely, it is given by

$$\begin{equation*} \begin{array}{lll} \mathbf{3}_1 & \text{if} \ \ k \equiv 0 & \pmod 4, \\\mathbf{4}_1 & \text{if} \ \ k \equiv 2 & \pmod {8}, \\\mathbf{5}_1 & \text{if} \ \ k\equiv 6 & \pmod {16}, \\\mathbf{6}_2 & \text{if} \ \ k \equiv 14 & \pmod {16}. \end{array} \end{equation*}$$
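For quick experiments, the case distinction in the theorem can be encoded as a lookup. This is only a transcription of the statement above, assuming $k$ is an even integer (the only residues listed); the function name `minimal_nonzero_slope` is ours.

```python
def minimal_nonzero_slope(k: int):
    """Return (slope, multiplicity) of the minimal nonzero slope of U_2
    in tame level 1, following the four cases of the theorem above.

    Assumes k is an even integer.
    """
    if k % 2 != 0:
        raise ValueError("the statement above lists even weights only")
    if k % 4 == 0:
        return (3, 1)
    if k % 8 == 2:
        return (4, 1)
    if k % 16 == 6:
        return (5, 1)
    # The only remaining even residue class is k = 14 (mod 16).
    return (6, 2)


for k in (0, 2, 6, 14, -2):
    print(k, minimal_nonzero_slope(k))
```

Note that Python's `%` returns a non-negative residue for negative weights, so $k = -2$ falls in the class $14 \pmod {16}$.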

We note that the calculations of Emerton Reference Eme98 rely crucially on the explicit uniformisations of $2$-adic regions on the genus $0$ modular curves $X_0(2^n)$ for small values of $n$, which are hard to come by in higher levels and primes. Our algorithms do not rely on any specifics of the situation $(p,N)=(2,1)$, and therefore similar arguments work in more general settings.

Looking further into the above coefficients, let $\lambda (i)$ be the number of roots of $a_i(\kappa )$ in the open unit disk. Table 2 displays the $2$-adic valuations of these roots, along with their multiplicities. By inspecting the $2$-adic valuations of the coefficients we computed, we see that this output is provably correct and complete. Note that

$$\begin{equation*} \lambda (i) = \binom{i}{2}, \end{equation*}$$

which also follows from the main result of Buzzard and Kilford Reference BK05. In Bergdall and Pollack Reference BP16 precise conjectures are made about the location of the zeroes of $a_i$.

Example 4.2.

Let us set $(p,N) = (3,1)$ and compute $P(\kappa ,t)$ up to precision $O(3^{90},\kappa ^{60})$. With the same notation as above, we find¹⁰ the slopes of the zeroes of the coefficients $a_i(\kappa )$; see Table 3. Again, this output is complete and provably correct. Notice that

$$\begin{equation*} \lambda (i) = 2 \binom{i}{2}, \end{equation*}$$

which follows from the main result of Roe Reference Roe14, who showed that near the boundary, the slopes form an arithmetic progression with an explicit argument that depends on the valuation of $\kappa$. Roe tackled this more complicated situation using the same techniques as Buzzard and Kilford Reference BK05.

¹⁰

The motivated reader can try to recover this computation, for instance using an explicit basis similar to that used in §3.4, which is possible since $X_0(3)$ has genus $0$. There is a particularly nice basis, described by Loeffler Reference Loe07, which can be twisted by an Eisenstein series to obtain the computation in all weights.
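The two closed formulas for $\lambda (i)$ can be checked directly against the tabulated data. The sketch below simply transcribes the "Valuations" columns of Table 2 and Table 3 as dictionaries (valuation mapped to multiplicity) and sums the multiplicities; the names `TABLE_2`, `TABLE_3` and `root_count` are ours.

```python
from math import comb

# i -> {valuation: multiplicity}, transcribed from Table 2, (p, N) = (2, 1).
TABLE_2 = {
    2: {3: 1},
    3: {3: 2, 4: 1},
    4: {3: 4, 4: 1, 7: 1},
    5: {3: 6, 4: 2, 5: 1, 7: 1},
    6: {3: 9, 4: 3, 5: 2, 6: 1},
    7: {3: 12, 4: 5, 5: 2, 6: 1, 8: 1},
}

# i -> {valuation: multiplicity}, transcribed from Table 3, (p, N) = (3, 1).
TABLE_3 = {
    2: {1: 2},
    3: {1: 5, 3: 1},
    4: {1: 9, 2: 2, 3: 1},
    5: {1: 15, 2: 4, 3: 1},
    6: {1: 22, 2: 5, 3: 2, 4: 1},
    7: {1: 30, 2: 8, 3: 2, 4: 2},
    8: {1: 40, 2: 11, 3: 2, 4: 3},
}


def root_count(valuations):
    """Number of roots in the open unit disk, counted with multiplicity."""
    return sum(valuations.values())


for i, vals in TABLE_2.items():
    print("p=2", i, root_count(vals), comb(i, 2))
for i, vals in TABLE_3.items():
    print("p=3", i, root_count(vals), 2 * comb(i, 2))
```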

Example 4.3.

We now turn to $(p,N) = (2,3)$ and compute $P(\kappa ,t)$ up to precision $O(2^{60},\kappa ^{20})$. This computation took about 90 minutes on a standard laptop. In addition to the notation above, let $\mu (i)$ be the largest integer such that $p^{\mu (i)}$ divides $a_i(\kappa )$. The work of Bergdall and Pollack Reference BP16 uses Koike’s trace formula to prove that $\mu (i) = 0$ whenever $N=1$. However, in our situation $\mu$ appears to be larger for several $i$; see Table 4. Computing $P(\kappa ,t)$ up to precision $O(2,\kappa ^{30})$ takes about one minute. Extracting the degrees of the $t$-coefficients, our data suggest the boundary slope sequence

$$\begin{equation*} \mathbf{0}_2, \mathbf{1/2}_2, \mathbf{1}_2, \mathbf{3/2}_2,\mathbf{2}_2, \mathbf{5/2}_2, \mathbf{3}_2, \mathbf{7/2}_2, \ldots , \end{equation*}$$

which is indeed in accordance with the Newton polygon of $\lambda +\mu$ computed above, up to the chosen precisions. Notice the similarity with the slope sequence for $(p,N) = (2,1)$.
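The comparison with the Newton polygon of $\lambda +\mu$ can be reproduced from Table 4 alone. The following sketch takes the lower convex hull of the points $(i, \lambda (i)+\mu (i))$, skipping $i=1$ for which the table has no entry; the monotone-chain helper `lower_hull` and the other names are ours, and the data are transcribed by hand.

```python
from fractions import Fraction

# i -> (lambda(i), mu(i)), transcribed from Table 4; a_1 has no entry there.
TABLE_4 = {
    0: (0, 0), 2: (0, 0), 3: (0, 1), 4: (1, 0), 5: (2, 1), 6: (3, 0),
    7: (4, 1), 8: (6, 0), 9: (8, 1), 10: (10, 0), 11: (12, 1), 12: (15, 0),
}


def lower_hull(points):
    """Vertices of the lower convex hull, for points with distinct x."""
    hull = []
    for x, y in sorted(points):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Drop hull[-1] if it lies on or above the segment joining
            # hull[-2] to the new point (cross-multiplied slope test).
            if (y2 - y1) * (x - x1) >= (y - y1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append((x, y))
    return hull


def slopes_with_multiplicity(points):
    """List of (slope, horizontal length) of the lower hull segments."""
    hull = lower_hull(points)
    return [
        (Fraction(y2 - y1, x2 - x1), x2 - x1)
        for (x1, y1), (x2, y2) in zip(hull, hull[1:])
    ]


points = [(i, lam + mu) for i, (lam, mu) in TABLE_4.items()]
print(slopes_with_multiplicity(points))
```

Each segment has horizontal length $2$, and the slopes increase in steps of $1/2$ starting from $0$, as far as the tabulated range allows.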

Example 4.4.

As above, set $(p,N)= (11,1)$ and compute $P(\kappa ,t)$ up to precision $O(11,\kappa ^{60})$, which takes about two minutes. We compute the degrees of the $t$-coefficients, which suggest the boundary slope sequence,

$$\begin{equation*} \mathbf{0}_1, \mathbf{1}_1, \mathbf{2}_1, \mathbf{3}_1, \mathbf{4}_2, \mathbf{5}_1, \mathbf{6}_1, \mathbf{7}_1, \mathbf{9}_2, \ldots . \end{equation*}$$

4.3. The Gouvêa–Mazur conjecture

An enormous amount of arithmetic information is encoded in the slopes of overconvergent modular forms, which are the valuations of their $U_p$-eigenvalues. One of the consequences of the theory of Coleman Reference Col97b is that for any $\alpha > 0$, there exists a smallest integer $N_{\alpha }$ with the following property. If $k_1$ and $k_2$ are integers such that

$$\begin{equation} k_1 \equiv k_2 \mod p^{N_{\alpha }}(p-1), \tag{113} \end{equation}$$

then the collections of slopes $\leq \alpha$ in weights $k_1$ and $k_2$ agree, with multiplicities. Gouvêa and Mazur conjectured in Reference GM92 that $N_{\alpha } \leq \lfloor \alpha \rfloor$. Unconditionally, Wan Reference Wan98 exhibits an explicit quadratic upper bound for $N_{\alpha }$, depending on $p$ and the level.¹¹

¹¹

Strictly speaking, Wan assumes that $p\geq 5$, but his arguments easily extend to $p=2,3$ when using our basis described above.

The key observation for Wan is that the lower bound Equation 111 is independent of $j$. After taking determinants, we obtain a lower bound on the coefficients of the characteristic series of $U_p$ in weight $k+jk_E$, again independent of $j$. Wan then proceeds by proving a very general reciprocity lemma on Newton polygons, which allows him to transform the lower bound for those coefficients into an upper bound for $N_{\alpha }$.

Theorem 4.4.

There is an explicitly computable quadratic polynomial $P \in \mathbf{Q}[x]$, depending only on $p$ and the level, such that $N_{\alpha } \leq P(\alpha )$.

Since Gouvêa and Mazur conjectured in Reference GM92 that $N_{\alpha } \leq \lfloor \alpha \rfloor$, this is still an order of magnitude from what we expect. However, the Gouvêa–Mazur conjecture is known to be false, and a counterexample was given in Reference BC04. It should be noted that the counterexample of Buzzard and Calegari is only a very small violation of the conjecture, and generically it seems that in fact something much stronger than the Gouvêa–Mazur conjecture is true! Let us illustrate this with two examples.

The case $p=2$ is prolific soil for finding counterexamples to the Gouvêa–Mazur conjecture. As noted above, the first counterexample was given in Reference BC04 for $p=59$ and level $1$, and a further one for $p=79$ in Reference Lau11. For $p=2$, we obtain the following slope sequences in level $\Gamma _0(19)$:

$$\begin{eqnarray*} k=-2:& &\mathbf{0}_4, \mathbf{1/2}_2, \mathbf{1}_3, \mathbf{2}_{5}, \mathbf{9/4}_4, \mathbf{4}_3, \mathbf{5}_2, \mathbf{6}_{21}, \mathbf{15/2}_2, \ldots ,\\ k=0: & &\mathbf{0}_4, \mathbf{1/2}_2, \mathbf{1}_5, \mathbf{3}_{11}, \mathbf{13/4}_4, \mathbf{7}_{25}, \mathbf{25/2}_4, \mathbf{13}_{11},\ldots ,\\ k=2: & &\mathbf{0}_4, \mathbf{1/2}_2, \mathbf{1}_3, \mathbf{3/2}_2, \mathbf{2}_5, \mathbf{4}_{11}, \mathbf{17/4}_4, \mathbf{8}_{25}, \mathbf{27/2}_4,\ldots ,\\ k=4: & &\mathbf{0}_4, \mathbf{1/2}_2, \mathbf{1}_5, \mathbf{5/2}_2, \mathbf{3}_6, \mathbf{7/2}_2, \mathbf{4}_3, \mathbf{5}_5, \mathbf{21/4}_4, \ldots ,\\ k=6: & &\mathbf{0}_4, \mathbf{1/2}_2, \mathbf{1}_3, \mathbf{2}_{7}, \mathbf{5/2}_2, \mathbf{4}_3, \mathbf{9/2}_2, \mathbf{5}_6, \mathbf{11/2}_2, \ldots ,\\ k=8: & &\mathbf{0}_4, \mathbf{1/2}_2, \mathbf{1}_5, \mathbf{3}_{13}, \mathbf{7/2}_2, \mathbf{6}_5, \mathbf{13/2}_2, \mathbf{7}_6, \mathbf{15/2}_2, \ldots . \end{eqnarray*}$$

Notice the aberration in the dimensions of the slope-$1$ subspaces, as well as the slope-$3$ subspaces in weights $0$ and $8$. These are all near misses, in that the smallest slopes for which discrepancies arise are exactly equal to the valuation of the weight difference. By contrast, we note a two-dimensional slope-$3/2$ subspace in weight $2$, which is completely absent in weight $6$, even though $3/2 < v_2(6-2) = 2$. Similarly, the slope-$9/4$ subspace in weight $-2$ does not exist in weight $6 = -2 + 2^3$.
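These discrepancies can be located mechanically. The sketch below transcribes the displayed slope sequences for weights $-2$, $2$ and $6$, computes the largest $n$ with $k_1 \equiv k_2 \pmod {p^n(p-1)}$, and lists the slopes at most $n$ whose multiplicities differ. The names `gm_exponent` and `discrepancies` are ours, and the comparison is only meaningful for bounds below the last displayed slope, since the sequences are truncated.

```python
from fractions import Fraction as F


def gm_exponent(k1: int, k2: int, p: int):
    """Largest n with k1 = k2 mod p^n (p - 1); None if not congruent mod p - 1."""
    d = abs(k1 - k2)
    if d == 0:
        raise ValueError("weights must be distinct")
    if d % (p - 1) != 0:
        return None
    n = 0
    while d % p == 0:
        d //= p
        n += 1
    return n


# weight -> {slope: multiplicity}, from the display above (p = 2, level 19).
SLOPES = {
    -2: {F(0): 4, F(1, 2): 2, F(1): 3, F(2): 5, F(9, 4): 4, F(4): 3,
         F(5): 2, F(6): 21, F(15, 2): 2},
    2: {F(0): 4, F(1, 2): 2, F(1): 3, F(3, 2): 2, F(2): 5, F(4): 11,
        F(17, 4): 4, F(8): 25, F(27, 2): 4},
    6: {F(0): 4, F(1, 2): 2, F(1): 3, F(2): 7, F(5, 2): 2, F(4): 3,
        F(9, 2): 2, F(5): 6, F(11, 2): 2},
}


def discrepancies(k1: int, k2: int, bound):
    """Slopes <= bound whose multiplicities differ in weights k1 and k2."""
    s1, s2 = SLOPES[k1], SLOPES[k2]
    out = {}
    for s in sorted(set(s1) | set(s2)):
        if s <= bound and s1.get(s, 0) != s2.get(s, 0):
            out[s] = (s1.get(s, 0), s2.get(s, 0))
    return out


for k1, k2 in [(2, 6), (-2, 6)]:
    n = gm_exponent(k1, k2, 2)
    print(k1, k2, n, discrepancies(k1, k2, n))
```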

On the other hand, to see how the Gouvêa–Mazur conjecture is frequently much weaker than the truth, consider the first few slopes of $U_3$ acting on $M^{\dagger }_{278}\left(\Gamma _0(41)\right)$, which we computed using Lauder’s algorithm to be

$$\begin{equation} \mathbf{0}_{12},\mathbf{1}_{14},\mathbf{3}_{48}, \mathbf{6}_{14}, \mathbf{7}_{22}, \mathbf{8}_{6}, \mathbf{9}_{22}, \mathbf{10}_{14}, \mathbf{12}_{48}, \mathbf{14}_{14}, \mathbf{16}_{22}, \mathbf{17}_6, \mathbf{18}_{22},\ldots , \tag{114} \end{equation}$$

where the subscripts denote multiplicities. Repeating the same computation in weight $8$, we find the exact same slope sequence for all the terms we display here, whereas the Gouvêa–Mazur conjecture would only predict that the slopes up to $3$ agree. This behaviour seems rather typical in most examples we computed.

4.4. Chow–Heegner points

We now discuss how the computation of spaces of overconvergent forms, using the above algorithms, can be used to construct arithmeto-geometric invariants. We chose to discuss the Heegner-type point construction on elliptic curves, following Darmon and Rotger Reference DR14.

Let $p$ be a prime, and let $E/\mathbf{Q}$ be an elliptic curve of conductor $N$ associated to a $p$-ordinary form $f \in S_2^{\mathrm{new}}(\Gamma _0(N))$. Let $g$ be any other weight $2$ newform which is $p$-ordinary. It can be deduced from the work of Darmon and Rotger Reference DR14, Theorem 1.3 that there exists a global (rational) point $P_g \in E(\mathbf{Q})$ that satisfies the Gross–Zagier type formula,

$$\begin{equation} \log (P_g) = 2d_g\ \cdot \ \frac{\mathcal{E}_0(g) \ \mathcal{E}_1(g)}{\mathcal{E}(g,f,g)}\ \cdot \ \mathcal{L}_p (\mathbf{g},\mathbf{f}, \mathbf{g})(2,2,2), \tag{115} \end{equation}$$

where the quantities appearing in the formula are the following:

•: $\log$ is the formal $p$-adic logarithm on the elliptic curve $E$,
•: $d_g$ is an integer described in Reference DDLR15, Remark 3.1.3,
•: the $\mathcal{E}$-factors are quadratic numbers depending only on the $p$th coefficients of $f$ and $g$,
•: $\mathcal{L}_p(\mathbf{g},\mathbf{f}, \mathbf{g})$ is the Rankin triple product $p$-adic L-function of the Hida families $\mathbf{f},\mathbf{g}$ through $f,g$.

The last item in this list deserves some discussion. We will not define the Rankin triple product $p$-adic L-function here, as that would lead us too far from the topic of these notes, and the exposition in Darmon and Rotger Reference DR14 is excellent. We will however explain how one computes the special value appearing in formula Equation 115. As before, we let $e^{\mathrm{ord}} = \lim _{n}U_p^{n!}$ be Hida’s ordinary projector. Start by computing

$$\begin{equation} e^{\mathrm{ord}}(\theta ^{-1}f^{[p]}\times g), \tag{116} \end{equation}$$

where $f^{[p]}$ denotes the $p$-depletion $(1 - V_pU_p)f$ of $f$. Here, we have used Serre’s differential operator $\theta = qd/dq$, which is an important object in the theory of overconvergent forms and which would surely merit an entire article to do it justice. The inverse of this operator is defined by the $p$-adic limit

$$\begin{equation} \theta ^{-1} = \lim _{n \to \infty } \ \ \theta ^{(p-1)p^n-1}. \tag{117} \end{equation}$$

By Coleman’s criterion, we conclude that the overconvergent form Equation 116 is classical, and hence it can be written as a finite linear combination of Hecke eigenforms of weight $2$ and level $\Gamma _0(p)$. The special value $\mathcal{L}_p (\mathbf{g},\mathbf{f}, \mathbf{g})(2,2,2)$ is the coefficient of $g$ in this linear combination.
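On $q$-expansions, the first two ingredients of this recipe are elementary. The sketch below illustrates the $p$-depletion and the action of $\theta ^{-1}$ modulo $p^m$ on a truncated coefficient list; it does not attempt the ordinary projection, which requires a basis of the relevant space. All function names are ours, and the sample coefficients are those of the form $f$ from Example 4.5, used purely as test data. The last check relies on Euler's theorem: for $p \nmid n$ one has $n^{(p-1)p^{m-1}} \equiv 1 \pmod {p^m}$, so the exponent $(p-1)p^{m-1}-1$ acts on the $n$th coefficient as division by $n$.

```python
def p_deplete(coeffs, p):
    """(1 - V_p U_p) f on coefficients: set a_n to 0 whenever p divides n."""
    return [0 if n % p == 0 else a for n, a in enumerate(coeffs)]


def theta_power(coeffs, e, modulus):
    """theta^e = (q d/dq)^e modulo `modulus`: multiply a_n by n^e, e >= 0."""
    return [(pow(n, e, modulus) * a) % modulus for n, a in enumerate(coeffs)]


def theta_inverse(depleted, p, m):
    """theta^{-1} modulo p^m on a p-depleted expansion: divide a_n by n."""
    modulus = p ** m
    return [
        0 if n % p == 0 else (pow(n, -1, modulus) * a) % modulus
        for n, a in enumerate(depleted)
    ]


# Coefficients a_0, ..., a_11 of the form f from Example 4.5 (sample data).
f = [0, 1, -1, -3, 1, -3, 3, -2, -1, 6, 3, -1]
p, m = 2, 10

f_depleted = p_deplete(f, p)
direct = theta_inverse(f_depleted, p, m)
# Same operator, realised as a high power of theta (Euler's theorem).
via_limit = theta_power(f_depleted, (p - 1) * p ** (m - 1) - 1, p ** m)

print(f_depleted)
print(direct == via_limit)
```

The three-argument `pow` with exponent `-1` computes a modular inverse and requires Python 3.8 or later.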

Example 4.5.

Consider the elliptic curve

$$\begin{equation} E \ : \ y^2 + xy = x^3 - x^2 - x + 1, \tag{118} \end{equation}$$

which has rank $1$ and conductor $58$. Consider its associated newform $f$, and let $g$ be the unique newform on $\Gamma _0(58)$ different from $f$. Then

$$\begin{equation} \begin{array}{lll} f(q) &=& q - q^2 - 3q^3 + q^4 - 3q^5 + 3q^6 - 2q^7 - q^8 + 6q^9 + 3q^{10} - q^{11} + \cdots , \\g(q) &=& q + q^2 - q^3 + q^4 + q^5 - q^6 - 2q^7 + q^8 - 2q^9 + q^{10} - 3q^{11} + \cdots . \end{array} \tag{119} \end{equation}$$

Both $f$ and $g$ are $2$-ordinary. Letting $P = (0,1)$ be a generator for $E(\mathbf{Q})$, we compute that

$$\begin{equation} \mathcal{L}_2(\mathbf{g},\mathbf{f}, \mathbf{g})(2,2,2) \equiv 3\log _E(P) \pmod {2^{200}}, \tag{120} \end{equation}$$

as predicted by the theory in Reference DR14.

Let us end this discussion on a more speculative note. In the above it is important that $f$ is ordinary. Whereas it is conceivable that this may be extended to eigenforms of finite slope through the use of Coleman families, it is not clear that the Rankin triple product $p$-adic L-function, even if it can be constructed in cases where $f$ is of infinite slope, should be related to global points. Nonetheless, the computation of the special value above yields an explicit number even in those situations, and we now compute a few examples where the Tate module of $E_{\mathbf{Q}}$ is wildly ramified at $2$ or $3$, and $f$ is of infinite slope.

Example 4.5a.

Consider the elliptic curve

$$\begin{equation} E \ :\ y^2 + y = x^3 + 9x - 10, \tag{121} \end{equation}$$

which is of conductor $4617 = 3^5\cdot 19$ and rank $1$. Consider the newforms

$$\begin{equation} \begin{array}{lll} f(q) &=& q - 2q^2 + 2q^4 - 2q^5 -3q^7 + 4q^{10} - 6q^{11} + \cdots , \\g(q) &=& q - 2q^3 - 2q^4 +3q^5 - q^7 + q^9 + 3q^{11} + \cdots , \end{array} \tag{122} \end{equation}$$

where $f$ is associated to $E$, and $g$ is the unique cuspidal newform of weight $2$ on $\Gamma _0(19)$. Despite $f$ being of infinite $3$-adic slope, we can run the computation and find a numerical value for $\mathcal{L}_3(\mathbf{g},``f'', \mathbf{g})(2,2,2)$. We find that

$$\begin{equation} \mathcal{L}_3(\mathbf{g},``f'', \mathbf{g})(2,2,2) \equiv t\cdot \log _E(P) \pmod {3^{200}}\ \ \text{ where } \ \ 2t^2 + 48t + 729=0, \tag{123} \end{equation}$$

where $P = (4,9)$ is a generator of $E(\mathbf{Q})$. The fact that both quantities are related by a quadratic number $t$ of small height suggests that a more general analogue of the theory for ordinary forms in Reference DR14, and more specifically of Equation 115, might exist.

Example 4.5b.

Consider the elliptic curve

$$\begin{equation} E \ :\ y^2 = x^3 + x^2 - 62893x - 6091893, \tag{124} \end{equation}$$

which is of rank $1$ and conductor $15104 = 2^8\cdot 59$. Let $f$ be its associated newform, and let $g$ be the newform of level $118$ associated to the elliptic curve with Cremona label 118.a1. Then

$$\begin{equation} \begin{array}{lll} f(q) &=& q - 2q^3 - 3q^7 + q^9 +3q^{11} -3q^{13} + \cdots , \\g(q) &=& q - q^2 -q^3 + q^4 -3q^5 +q^6 -q^7 -q^8 -2q^9 + 3q^{10} -2q^{11} + \cdots . \end{array} \tag{125} \end{equation}$$

Note that $g$ is $2$-ordinary. We compute that

$$\begin{equation} \mathcal{L}_2(\mathbf{g},``f'', \mathbf{g})(2,2,2) \equiv 6\log _E(P) \pmod {2^{100}}. \tag{126} \end{equation}$$

4.5. $p$-Adic L-functions of real quadratic fields

We end this article with a discussion of a method to compute $p$-adic L-functions of totally real fields $F$, following Reference LV19. We closely mirror the approach to $p$-adic L-functions in §2 developed in Serre Reference Ser73 and Deligne and Ribet Reference DR80, which is rooted in an idea that goes back to Hecke Reference Hec24 and Siegel Reference Sie68. It should be noted that an alternative approach towards $p$-adic L-functions of Barsky and Cassou-Noguès Reference Bar78, Reference CN79 based on the explicit formula for zeta values of Shintani Reference Shi76 was recently used to develop an algorithm for their computation by Roblot Reference Rob15. Instead, here we take an approach using diagonal restrictions of Eisenstein series and $p$-adic interpolation, similar to that of Cohen Reference Coh76 and Cartier and Roy Reference CR72.

For simplicity, we restrict to the case where $F = \mathbf{Q}(\sqrt {D})$ is a real quadratic field. Let $\mathfrak{d}$ denote its different ideal. Suppose $\psi$ is a character of $F$. Then Hecke Reference Hec24 proposed studying the values $L(\psi ,1-k)$ by considering the diagonal restriction of a Hilbert Eisenstein series of weight $k$ over $F$. This was carried out by Klingen and Siegel Reference Kli62, Reference Sie68 to show the rationality of such special values and to give explicit closed formulae for some small values of $k$. For instance, their methods, which we review shortly, yield classical identities such as

$$\begin{equation} \zeta _F(-1) = \frac{1}{60} \sum _{\substack{|b| < \sqrt {D} \\ b \equiv D \pmod {2}}} \sigma _1\left( \frac{D - b^2}{4}\right). \tag{127} \end{equation}$$

To explain their arguments, we recall the definition of the Eisenstein series attached to a character $\psi$ of modulus $\mathfrak{m}$. Suppose $k \geq 1$ is such that $\psi$ has sign $(-1)^k$ at both infinite places. Shimura Reference Shi78 defines the space $M_k(\mathfrak{m},\psi )$ of Hilbert modular forms of (parallel) weight $k$, level $\mathfrak{m}$, and character $\psi$. We content ourselves with mentioning that the data includes a holomorphic function $f: \mathcal{H}^2 \to \mathbf{C}$ which satisfies

$$\begin{equation} ({\mathrm{c}}_1z_1+\mathrm{d}_1)^{-k}({\mathrm{c}}_2z_2+\mathrm{d}_2)^{-k}f\left( \frac{\mathrm{a}_1z_1+\mathrm{b}_1}{{\mathrm{c}}_1z_1+\mathrm{d}_1},\frac{\mathrm{a}_2z_2+\mathrm{b}_2}{{\mathrm{c}}_2z_2+\mathrm{d}_2}\right) = \psi (\mathrm{a}) f(z_1,z_2), \tag{128} \end{equation}$$

for all matrices

$$\begin{equation} \gamma = \left( \begin{matrix} \mathrm{a} & \mathrm{b} \\ {\mathrm{c}} & \mathrm{d} \\ \end{matrix} \right)\in \operatorname {SL}_2(\mathcal{O}_F) \quad \text{such that } {\mathrm{c}} \in \mathfrak{m}. \tag{129} \end{equation}$$

Here and in what follows, we have used the notation $x_i$ to denote the image of $x\in F$ under the $i$th embedding $\sigma _i : F \hookrightarrow \mathbf{R}$. The transformation law Equation 128 implies that every form has a $q$-expansion, indexed by the totally positive elements $\mathfrak{d}^{-1}_+$ of the inverse different. The case of interest to us is given by the Eisenstein series

$$\begin{equation} \mathbf{G}_{k,k}(\psi ) \in M_k(\mathfrak{m},\psi ), \tag{130} \end{equation}$$

whose $q$-expansion is given by¹²

$$\begin{equation} L(\psi ,1-k) + 4 \sum _{\nu \in \mathfrak{d}_+^{-1}} \sum _{\mathfrak{a} | (\nu )\!\mathfrak{d}} \psi (\mathfrak{a}) \operatorname {Nm}(\mathfrak{a})^{k-1}\exp \left( 2\pi i (\nu _1 z_1 + \nu _2 z_2) \right). \tag{131} \end{equation}$$

¹²

In the case where $k=1$ and $\mathfrak{m} = (1)$, the constant term of Equation 131 must be modified suitably. We refer the interested reader to the statements contained in the article by Darmon, Dasgupta, and Pollack Reference DDP11, Proposition 2.11 for more details.

The diagonal restriction of $\mathbf{G}_{k,k}(\psi )(z_1,z_2)$ is obtained by setting $z_1 = z_2$, and it is a modular form of weight $2k$ and level one, which has the $q$-expansion,

$$\begin{equation} L(\psi ,1-k) + 4 \sum _{n\geq 1} \left( \sum _{\substack{\nu \in \mathfrak{d}_+^{-1}\\ \operatorname {Tr}(\nu ) = n}} \sum _{\mathfrak{a} | (\nu )\!\mathfrak{d}} \psi (\mathfrak{a}) \operatorname {Nm}(\mathfrak{a})^{k-1} \right) q^n. \tag{132} \end{equation}$$

Setting $k=2$, note that $M_4(\operatorname {SL}_2(\mathbf{Z})) = \langle \mathbf{E}_4 \rangle$, so the above form must be a multiple of $\mathbf{E}_4$. One quickly determines this multiple, and we deduce the equality Equation 127. For general $k$, it is more difficult to describe the combination of modular forms we obtain, but we can always choose a basis of classical modular forms with rational Fourier coefficients, of which the diagonal restriction must be a rational linear combination due to the rationality¹³ of its higher Fourier coefficients. It then immediately follows that $L(\psi ,1-k)$ must also be rational. We illustrate this with a simple example.

¹³

Strictly speaking, here we mean rational over the smallest number field containing the values of $\psi$.
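The case $k=2$ with trivial $\psi$ is easy to make concrete. Assuming the normalisation $\mathbf{E}_4 = 1 + 240q + \cdots$ and a fundamental discriminant $D$, comparing the coefficients of $q$ in Equation 132 and in $\zeta _F(-1)\mathbf{E}_4$ gives $240\,\zeta _F(-1) = 4\sum \sigma _1\left((D-b^2)/4\right)$, the sum running over integers $b \equiv D \pmod 2$ with $|b| < \sqrt {D}$. The following sketch evaluates this expression with exact rational arithmetic; the function names are ours.

```python
from fractions import Fraction
from math import isqrt


def sigma1(n: int) -> int:
    """Sum of the positive divisors of a positive integer n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def zeta_at_minus_one(D: int) -> Fraction:
    """zeta_F(-1) for F = Q(sqrt(D)), assuming D is a positive
    fundamental discriminant (so D is 0 or 1 mod 4 and not a square)."""
    total = 0
    r = isqrt(D)
    for b in range(-r, r + 1):
        # Keep b with |b| < sqrt(D) and b congruent to D modulo 2.
        if b * b < D and (b - D) % 2 == 0:
            total += sigma1((D - b * b) // 4)
    return Fraction(total, 60)


for D in (5, 8, 12, 13, 17):
    print(D, zeta_at_minus_one(D))
```

For $D=5$ only $b = \pm 1$ contribute, each with $\sigma _1(1) = 1$, which gives the value $1/30$.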

Example 4.6.

Let us consider $F= \mathbf{Q}(\sqrt {5})$. The narrow ray class group attached to the prime $(3)$ is

$$\begin{equation} \operatorname {Cl}_{(3)}^+ \simeq \mathbf{Z} / 2 \mathbf{Z}, \tag{133} \end{equation}$$

and the unique quadratic character $\psi$ of conductor $(3)$ is totally odd. We compute the diagonal restriction in Equation 132 for $k=3$ and find it has $q$-expansion given by

$$\begin{equation} L(\psi ,-2) \ -1144q - 39696q^2 - 291448q^3 - 1261696q^4 + \cdots . \tag{134} \end{equation}$$

This is a modular form of weight $6$ and level $\Gamma _1(3)$, and the space $M_6(\Gamma _1(3))$ is three dimensional and has a basis of the form

$$\begin{equation} \left\{ \begin{array}{lrrrrrrrrrrl} f_1 = & 1 & & & & &-& 504 q^3 & & & + & \cdots , \\f_2 = & & & q & & &+& 45q^3 &+& 166q^4 & + & \cdots , \\f_3 = & & & & & q^2 & + & 6q^3 &+& 27q^4 & + & \cdots . \\\end{array} \right. \tag{135} \end{equation}$$

Using the first three higher Fourier coefficients, we easily determine that the diagonal restriction is equal to

$$\begin{equation*} \frac{32}{9} f_1 -1144 f_2 - 39696f_3, \end{equation*}$$

and we deduce that $L(\psi ,-2) = 32/9$.
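The linear algebra in this example is small enough to spell out. The sketch below uses only the coefficients printed in Equation 134 and Equation 135 up to $q^3$; since the basis is echelonised, the coefficients of $f_2$ and $f_3$ are read off from $q$ and $q^2$, and the coefficient of $f_1$ is forced by $q^3$. The names are ours.

```python
from fractions import Fraction

# Coefficients of q, q^2, q^3 of the diagonal restriction in Equation 134.
HIGHER = [-1144, -39696, -291448]

# Coefficients of q^0, ..., q^3 of the basis in Equation 135.
F1 = [1, 0, 0, -504]
F2 = [0, 1, 0, 45]
F3 = [0, 0, 1, 6]


def solve_combination(higher, f1, f2, f3):
    """Return (c1, c2, c3) with c1*f1 + c2*f2 + c3*f3 matching `higher`
    in the coefficients of q, q^2 and q^3."""
    c2, c3 = higher[0], higher[1]          # f2 = q + ..., f3 = q^2 + ...
    residual = higher[2] - c2 * f2[3] - c3 * f3[3]
    c1 = Fraction(residual, f1[3])         # the q^3 coefficient pins down c1
    return c1, c2, c3


c1, c2, c3 = solve_combination(HIGHER, F1, F2, F3)
print(c1, c2, c3)
```

Because $f_1$ is the only basis element with nonzero constant term, `c1` is the constant term of the diagonal restriction.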

For general weights and characters, our inability to easily describe the first few Fourier coefficients of a rational basis for the space of modular forms that contains the diagonal restriction is what stood in the way of giving a clean explicit formula of the sort in Equation 127. Computationally, we may easily determine such a basis as in the above example, and we obtain such a formula in any given case that merits our consideration. This effectively reduces the computation of the constant term to the efficient computation of the higher Fourier coefficients of Equation 132, which are of a much more elementary nature. For real quadratic fields, an efficient algorithm was presented in Reference LV19 in terms of the theory of reduced cycles of indefinite quadratic forms Reference Gau01, Reference BV07. The $p$-adic variation of the constant term is then computed by interpolation, and therefore we find an algorithmic incarnation of the idea of Serre of studying this variation by studying the corresponding variation of the higher Fourier coefficients.

Example 4.7.

Let $F= \mathbf{Q}(\sqrt {3\cdot 71})$, and let $\psi$ be the genus character attached to the extension

$$\begin{equation*} L = \mathbf{Q}(\sqrt {-3},\sqrt {-71}). \end{equation*}$$

The $p$-adic L-function is naturally an element in the Iwasawa algebra $\mathbf{Z}_p \lBrack \kappa \rBrack$ where for any positive integer $k \equiv 1 \pmod {p-1},$ we have the interpolation property

$$\begin{equation} L_p\left((1+p)^{k-1} - 1, \psi \right) = L(1-k,\psi ) \cdot \prod _{\mathfrak{p} | p}(1 - \psi (\mathfrak{p})\operatorname {Nm}(\mathfrak{p})^{k-1}), \tag{136} \end{equation}$$

which, together with the method described above, we use to compute numerically that

$$\begin{equation*} \begin{array}{lll} L_7(\kappa , \psi ) &=\ -103777561\cdot 7\, \kappa - 96435328 \, \kappa ^2 - 15935394\, \kappa ^3 + \cdots &\pmod {7^{10},\kappa ^4},\\L_{11}(\kappa ,\psi ) &=\ -8645808191 - 10894273842\, \kappa + 4315116763\, \kappa ^2 + \cdots &\pmod {11^{10},\kappa ^4}. \\\end{array} \end{equation*}$$

By inspection of the Newton polygon, we see that the $7$-adic L-function has precisely two zeroes in the open unit disk. One of the zeroes is a so-called exceptional zero, which is caused by the vanishing of the Euler factor at $\kappa =0$ (corresponding to $k=1$) in the equality Equation 136. Such a zero is not present in the $11$-adic L-function, since $11$ splits into two ideals in $F$, neither of which is in the kernel of $\psi$. The other zero of $L_7(\kappa ,\psi )$ is more interesting, and we compute its approximate value

$$\begin{equation*} \kappa = 2669714\cdot 7 \qquad \pmod {7^{10}}. \end{equation*}$$

The presence of this zero predicts linear growth of the $7$-part of the class number of the cyclotomic tower over $L$ relative to $F$. In fact, such a divisibility is already observed at the bottom layer, since

$$\begin{equation*} \operatorname {Cl}_L \simeq \mathbf{Z}\!/7 \mathbf{Z}. \end{equation*}$$
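The count of zeroes in this example can be read off from the truncated coefficients. For a power series with $p$-adically integral coefficients, the number of zeroes in the open unit disk is the index of the first coefficient that is a $p$-adic unit, by the Weierstrass preparation theorem. The sketch below applies this to the displayed truncations and also checks the splitting behaviour of $7$ and $11$ in $\mathbf{Q}(\sqrt {213})$ via Euler's criterion; the function names are ours.

```python
def zeros_in_open_unit_disk(coeffs, p):
    """Index of the first coefficient not divisible by p, or None if every
    displayed coefficient is divisible by p (more terms would be needed)."""
    for i, c in enumerate(coeffs):
        if c % p != 0:
            return i
    return None


def is_split(D: int, p: int) -> bool:
    """Euler's criterion for an odd prime p not dividing D:
    p splits in Q(sqrt(D)) exactly when D is a square modulo p."""
    return pow(D, (p - 1) // 2, p) == 1


# Coefficients of kappa^0, kappa^1, ... from the display above.
L7 = [0, -103777561 * 7, -96435328, -15935394]
L11 = [-8645808191, -10894273842, 4315116763]

print(zeros_in_open_unit_disk(L7, 7))
print(zeros_in_open_unit_disk(L11, 11))
print(is_split(213, 7), is_split(213, 11))
```

Whether a coefficient is a unit is already visible modulo $p$, so the working precision $p^{10}$ plays no role in this count.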

A celebrated feature of these $p$-adic L-functions is contained in the Gross–Stark conjecture, which occurs in situations where $p$ is inert in $F$, so there is an exceptional zero as in the example above. Indeed, in this case it is known by the work of Darmon, Dasgupta, and Pollack Reference DDP11 that

$$\begin{equation*} L_p(0, \psi ) = 0, \qquad \qquad L_p'(0,\psi ) = \log _p(u), \quad u \in \mathcal{O}_{H}[1/p] ^{\times }, \end{equation*}$$

where $H$ is the ray class field cut out by $\psi$. The numerical computation of the quantity $L_p'(0,\psi )$ may be done without first computing the series $L_p(\kappa ,\psi )$, using a more direct and efficient approach. To explain it, assume for simplicity that $\psi$ is unramified, and note the following (see Reference DPV19).

•: The $p$-adic family $\mathbf{G}_{\kappa ,\kappa }(\psi )(z,z)$ obtained from the diagonal restrictions of the Hilbert Eisenstein series specialising to the $p$-stabilisations of the Eisenstein series Equation 131 attached to $\psi$ vanishes at $\kappa = 0$ (which corresponds to weight $k=1$); i.e., we have $$\begin{equation*} \mathbf{G}_{0,0}(\psi )(z,z) = 0. \end{equation*}$$ In other words, the exceptional zero of the constant term propagates to the higher coefficients.
•: Its first order derivative with respect to $\kappa$ is the $q$-expansion of an overconvergent modular form of tame level one, whose constant coefficient is equal to $L_p'(0,\psi )$.

Since the higher Fourier coefficients are elementary even after taking this first order derivative, we may proceed with the strategy as above. We compute the higher Fourier coefficients, write the form in terms of a precomputed basis for the space $M_2^{\dagger }(\operatorname {SL}_2(\mathbf{Z}))$, and obtain a $p$-adic approximation for the constant term $L_p'(0,\psi )$. This leads to very significant speedups when compared to the naive approach that first computes the series $L_p(\kappa ,\psi )$. The following example appears in Reference LV19, §4.4.

Example 4.8.

Let $F = \mathbf{Q}(\sqrt {321})$, which has class number $6$. Then we compute in under five seconds that when $\psi$ is the unique unramified quadratic character, we have $L_7'(0,\psi ) = \log _7(u)$, where $u$ is a root of

$$\begin{equation*} 7^{16}u^6 - 20976 \cdot 7^8 u^5 - 270624 \cdot 7^4 u^4 + 526859689u^3 - 270624u^2 - 20976u + 7^4 = 0, \end{equation*}$$

which is a $7$-unit in the Hilbert class field of $F$.

Remark 1.

A beautiful alternative method for the computation of the Gross–Stark unit was developed for real quadratic fields by Dasgupta Reference Das07 and for cubic fields by Slavov Reference Sla07 based on the Shintani cone refinements of Reference Das08. We mention also the unpublished algorithm of Charollois, based on cocycle relations for $\mathrm{GL}_n,$ as in Reference CD14, Reference CDG15. These works are more closely related to the definition of the $p$-adic L-functions by Barsky and Cassou-Noguès, but they yield a suitable refinement of it. Such a refinement may also be obtained for our method above, where the $p$-adic family of Eisenstein series is replaced by a cuspidal family in the antiparallel weight direction. This is the subject of the forthcoming paper Reference DPV20.

Remark 2.

We note the striking parallel with the work of Gross and Zagier Reference GZ86 on singular moduli. They consider the setting of two imaginary quadratic fields $K_1 = \mathbf{Q}(\tau _1)$ and $K_2 = \mathbf{Q}(\tau _2)$, whose biquadratic compositum contains a unique real quadratic subfield $F$. The real analytic family $\mathbf{G}_s(z_1,z_2)$ over $F$ attached to the associated genus character is the principal actor in the analytic part of the arguments of Gross and Zagier. They consider its diagonal restriction $\mathbf{G}_s(z,z)$, and show the following.

•: When $s=0$, we have $\mathbf{G}_s(z,z) =0$.
•: The holomorphic projection of the first derivative $$\begin{equation*} \left(\frac{\partial }{\partial s} \mathbf{G}_s(z,z) \right) \biggm |_{s=0}^{\mathrm{hol}} \end{equation*}$$ has Fourier coefficients related to $\log \operatorname {Nm}\left( j(\tau _1) - j(\tau _2)\right)$.

Since the holomorphic projection is a holomorphic modular form of weight $2$ and level $1$, it must vanish! This vanishing gives Gross and Zagier their explicit formula for $\operatorname {Nm}\left( j(\tau _1) - j(\tau _2)\right)$.

In Reference DPV19 we observe a suitable $p$-adic analogue of these phenomena. More precisely, we show that

$$\begin{equation} \left(\frac{\partial }{\partial \kappa } \mathbf{G}_{\kappa ,\kappa }(\psi )(z,z) \right) \biggm |_{\kappa = 0}^{{\mathrm{ord}}} \tag{137} \end{equation}$$

is a classical modular form of weight $2$, whose component along the Eisenstein series $\mathbf{E}_2^{(p)}$ has constant term $L_p'(0,\psi )$. For Gross and Zagier, the higher Fourier coefficients were related to the norms of differences of singular moduli. In our setting, the higher Fourier coefficients of Equation 137 are related to the RM (real multiplication) values of certain rigid cocycles. These cocycles were introduced in Reference DV as a framework for singular moduli in the case of real quadratic fields, and their RM values are conjecturally algebraic.

In this case, the conjecture may be proved using the idea of Serre in the reverse direction, whereby information on a higher Fourier coefficient is inferred from the constant coefficient. Indeed, using the theory of $p$-adic deformation of Galois representations, it was proved by Darmon, Dasgupta, and Pollack in Reference DDP11 that the constant term is a rational multiple of the logarithm of a unit in $\mathcal{O}_H[1/p]^{\times }$. The same then follows for the higher Fourier coefficients, giving the algebraic nature of the RM values of rigid cocycles conjectured in Reference DV, at least in the special case of the so-called Dedekind–Rademacher cocycle.

Acknowledgments

This article is a reworked version of the notes for a mini-course taught by the author at a summer school on Iwasawa theory, held in Bordeaux 19–22 June 2019. We are grateful to the organisers of this conference for the invitation to give these lectures and to Mark Goresky for the invitation to publish them in the Bulletin of the AMS. All computations were performed using the Sage Reference S$^{+}$20 and Magma Reference BCP97 computer algebra systems.

Coefficient	Valuations	$\lambda$
$a_0(\kappa ) = 1$	$\emptyset$	$0$
$a_1(\kappa )$	$\emptyset$	$0$
$a_2(\kappa )$	$\mathbf{3}_1$	$1$
$a_3(\kappa )$	$\mathbf{3}_2, \mathbf{4}_1$	$3$
$a_4(\kappa )$	$\mathbf{3}_4, \mathbf{4}_1, \mathbf{7}_1$	$6$
$a_5(\kappa )$	$\mathbf{3}_{6}, \mathbf{4}_2, \mathbf{5}_1, \mathbf{7}_1$	$10$
$a_6(\kappa )$	$\mathbf{3}_{9}, \mathbf{4}_3, \mathbf{5}_2, \mathbf{6}_1$	$15$
$a_7(\kappa )$	$\mathbf{3}_{12}, \mathbf{4}_5, \mathbf{5}_2, \mathbf{6}_1, \mathbf{8}_1$	$21$

Coefficient	Valuations	$\lambda$
$a_0(\kappa ) = 1$	$\emptyset$	$0$
$a_1(\kappa )$	$\emptyset$	$0$
$a_2(\kappa )$	$\mathbf{1}_2$	$2$
$a_3(\kappa )$	$\mathbf{1}_5, \mathbf{3}_1$	$6$
$a_4(\kappa )$	$\mathbf{1}_9, \mathbf{2}_2, \mathbf{3}_1$	$12$
$a_5(\kappa )$	$\mathbf{1}_{15}, \mathbf{2}_4, \mathbf{3}_1$	$20$
$a_6(\kappa )$	$\mathbf{1}_{22}, \mathbf{2}_5, \mathbf{3}_2, \mathbf{4}_1$	$30$
$a_7(\kappa )$	$\mathbf{1}_{30}, \mathbf{2}_8, \mathbf{3}_2, \mathbf{4}_2$	$42$
$a_8(\kappa )$	$\mathbf{1}_{40}, \mathbf{2}_{11}, \mathbf{3}_2, \mathbf{4}_3$	$56$

Coefficient	Valuations	$\lambda$	$\mu$
$a_0(\kappa ) = 1$	$\emptyset$	$0$	$0$
$a_1(\kappa )$	$-$	$-$	$-$
$a_2(\kappa )$	$\emptyset$	$0$	$0$
$a_3(\kappa )$	$\emptyset$	$0$	$1$
$a_4(\kappa )$	$\mathbf{4}_1$	$1$	$0$
$a_5(\kappa )$	$\mathbf{3}_2$	$2$	$1$
$a_6(\kappa )$	$\mathbf{3}_2, \mathbf{4}_1$	$3$	$0$
$a_7(\kappa )$	$\mathbf{3}_2, \mathbf{4}_1, \mathbf{8}_1$	$4$	$1$
$a_8(\kappa )$	$\mathbf{3}_3, \mathbf{4}_1, \mathbf{5}_1, \mathbf{6}_1$	$6$	$0$
$a_9(\kappa )$	$\mathbf{3}_4, \mathbf{4}_3, \mathbf{6}_1$	$8$	$1$
$a_{10}(\kappa )$	$\mathbf{3}_5, \mathbf{4}_3, \mathbf{5}_1, \mathbf{8}_1$	$10$	$0$
$a_{11}(\kappa )$	$\mathbf{3}_6, \mathbf{4}_4, \mathbf{5}_2$	$12$	$1$
$a_{12}(\kappa )$	$\mathbf{3}_7, \mathbf{4}_5, \mathbf{5}_2, \mathbf{7}_1$	$15$	$0$
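
The three tables above can be checked for internal consistency. The sketch below assumes that an entry $\mathbf{v}_m$ in the Valuations column records $m$ zeros of valuation $v$, so that $\lambda$ should be the total number of zeros. This reading is our assumption here. The variable and function names are ours, and the data are copied from the tables. The row $a_1(\kappa )$ of the third table has no entries and is omitted.

```python
# Each row: (index n of a_n, {valuation: multiplicity}, lambda as printed).
# Assumption: an entry v_m in the tables means m zeros of valuation v.
TABLE_1 = [
    (0, {}, 0), (1, {}, 0), (2, {3: 1}, 1), (3, {3: 2, 4: 1}, 3),
    (4, {3: 4, 4: 1, 7: 1}, 6), (5, {3: 6, 4: 2, 5: 1, 7: 1}, 10),
    (6, {3: 9, 4: 3, 5: 2, 6: 1}, 15),
    (7, {3: 12, 4: 5, 5: 2, 6: 1, 8: 1}, 21),
]
TABLE_2 = [
    (0, {}, 0), (1, {}, 0), (2, {1: 2}, 2), (3, {1: 5, 3: 1}, 6),
    (4, {1: 9, 2: 2, 3: 1}, 12), (5, {1: 15, 2: 4, 3: 1}, 20),
    (6, {1: 22, 2: 5, 3: 2, 4: 1}, 30), (7, {1: 30, 2: 8, 3: 2, 4: 2}, 42),
    (8, {1: 40, 2: 11, 3: 2, 4: 3}, 56),
]
TABLE_3 = [
    (0, {}, 0), (2, {}, 0), (3, {}, 0), (4, {4: 1}, 1), (5, {3: 2}, 2),
    (6, {3: 2, 4: 1}, 3), (7, {3: 2, 4: 1, 8: 1}, 4),
    (8, {3: 3, 4: 1, 5: 1, 6: 1}, 6), (9, {3: 4, 4: 3, 6: 1}, 8),
    (10, {3: 5, 4: 3, 5: 1, 8: 1}, 10), (11, {3: 6, 4: 4, 5: 2}, 12),
    (12, {3: 7, 4: 5, 5: 2, 7: 1}, 15),
]


def lambda_matches_zero_count(table) -> bool:
    """True if lambda equals the sum of the multiplicities in every row."""
    return all(sum(zeros.values()) == lam for _, zeros, lam in table)


def lambda_values(table) -> list:
    """The printed lambda column, in row order."""
    return [lam for _, _, lam in table]


for name, table in [("1", TABLE_1), ("2", TABLE_2), ("3", TABLE_3)]:
    print("table", name, "consistent:", lambda_matches_zero_count(table))
```

Under this reading all three tables are consistent. In the tabulated range, the $\lambda$ column of the first table equals $n(n-1)/2$ and that of the second equals $n(n-1)$ for the coefficient $a_n(\kappa )$. This is an observation about the printed rows only, not a statement about rows that are not shown.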

$p$	$2$	$3$	$\geq 5$
$E$	$E_4$	$E_6$	$E_{p-1}$
$n$	$4$	$3$	$1$
$k_E$	$4$	$6$	$p-1$
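
The table above can be sanity-checked numerically. The sketch below assumes that $E$ is the Eisenstein series chosen for each prime $p$ as a lift of a power of the Hasse invariant, that $k_E$ is its weight and that $n$ is the exponent, so that $k_E = n(p-1)$ and the $q$-expansion of $E$ is congruent to $1$ modulo $p$. It uses the standard normalisations $E_4 = 1 + 240\sum \sigma _3(m)q^m$ and $E_6 = 1 - 504\sum \sigma _5(m)q^m$. The primes $5$ and $7$ stand in for the column $p \geq 5$, and all function names are ours.

```python
def sigma(power: int, m: int) -> int:
    """Sum of the `power`-th powers of the positive divisors of m."""
    return sum(d ** power for d in range(1, m + 1) if m % d == 0)


def eisenstein_q_expansion(k: int, terms: int) -> list:
    """First `terms` q-expansion coefficients of the normalised E_k, k in {4, 6}."""
    scale = {4: 240, 6: -504}[k]
    return [1] + [scale * sigma(k - 1, m) for m in range(1, terms)]


def is_one_mod_p(k: int, p: int, terms: int = 20) -> bool:
    """Check E_k = 1 mod p on the first `terms` coefficients only."""
    coeffs = eisenstein_q_expansion(k, terms)
    return coeffs[0] == 1 and all(c % p == 0 for c in coeffs[1:])


# (p, k_E, n) read off from the table; p = 5, 7 sample the column p >= 5.
ROWS = [(2, 4, 4), (3, 6, 3), (5, 4, 1), (7, 6, 1)]

for p, k_E, n in ROWS:
    print(p, k_E == n * (p - 1), is_one_mod_p(k_E, p))
```

The congruence is only tested on finitely many coefficients, so this is a plausibility check rather than a proof. For $E_4$ and $E_6$ it holds for all coefficients because $240$ is divisible by $2$ and $5$, and $504$ is divisible by $3$ and $7$.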