# Missing digits and good approximations

## Abstract

James Maynard has taken the analytic number theory world by storm in the last decade, proving several important and surprising theorems, resolving questions that had seemed far out of reach. He is perhaps best known for his work on small and large gaps between primes (which were discussed, hot off the press, in our 2015 *Bulletin of the AMS* article). In this article we will discuss two other Maynard breakthroughs:

— Mersenne numbers take the form and so appear as in base 2, having no digit “ It is a famous conjecture that there are infinitely many such primes. More generally it was, until Maynard’s work, an open question as to whether there are infinitely many primes that miss any given digit, in any given base. We will discuss Maynard’s beautiful ideas that went into his 2019 partial resolution of this question. ”.

— In 1926, Khinchin gave remarkable conditions for when real numbers can usually be “well approximated” by infinitely many rationals. However Khinchin’s theorem regarded 1/2, 2/4, 3/6 as distinct rationals and so could not be easily modified to cope, say, with approximations by fractions with prime denominators. In 1941 Duffin and Schaeffer proposed an appropriate but significantly more general analogy involving approximation only by reduced fractions (which is much more useful). We will discuss its 2020 resolution by Maynard and Dimitris Koukoulopoulos.

This year’s *Current Events Bulletin* highlights the work of the 2022 Fields medallists. In James Maynard’s case there are a surprising number of quite different breakthroughs that could be discussed.Footnote^{1} In my 2014 CEB lecture I described the work of Yitang Zhang Reference 37 on bounded gaps between primes and noted that a first-year postdoc, James Maynard, had taken a different, much simpler but related approach, to also get bounded gaps Reference 27 (and a similar proof had been found, independently, by Terry Tao, and given on his blog). Versions of both Zhang’s proof and the Maynard–Tao proof appear in my article Reference 15, where it is also announced that Maynard had within months made another spectacular breakthrough, this time on the largest known gaps between consecutive primes Reference 28 (and a rather different proof Reference 11 had been found by Ford, Green, Konyagin, and Tao, the two proofs combining to give an even better result Reference 12). It has been like this ever since with Maynard, many breakthrough results, some more suitable for a broad audience, some of primary importance for the technical improvements. Rather than attempt to summarize these all, I have selected two quite different topics, in both of which Maynard proved spectacular breakthroughs on questions that had long been stuck.

^{1}

In my forthcoming textbook about the distribution of primes, starting from the basics, about one-sixth of the book is dedicated to various Maynard theorems. This is one of the oldest and most venerable subjects of mathematics.

## Part 1. Primes missing digits

Most integers have many of each of the digits, 0 through 9, in their decimal expansion, so integers missing a given digit, or digits, are rare, making them hard to analyze. For example, there are integers up to having only , and , in their decimal expansion as there are three possibilities for each of the digits in the expansion.Footnote^{2} When we begin to explore we find the primes

having only the digits , and , in their decimal expansions. Are there infinitely many such primes? It seems likely given how many we have already found, but this question, and questions like it, have long been wide open, researchers finding it difficult to find a method to plausibly attack such problems (as we will discuss below). Indeed it was only recently that researchers succeeded in the following related but seemingly less difficult problems:

- •
In 2010 Mauduit and Rivat Reference 26 finally resolved Gelfond’s problem that the sum of the base- digits of prime numbers are equidistributed in arithmetic progressions, for all .

- •
In 2015 Bourgain Reference 3 showed that there are the expected number of primes with binary digits, for which of those digits have preassigned values (and see Swaenepoel Reference 33 for base- ).

Maynard simplified and (in some aspects) sharpened the tools used in these proofs but also added a perspective, and a technical confidence, that allowed him to surmount some of the established technical barriers. Here we will sketch his proof giving an asymptotic for the number of primes up to large missing ,*one* given digit in base (for sufficiently large), though his proof can be extended to counting the number of primes missing no more than base- digits (again, for sufficiently large). His proof works best if the allowed digits lie in an interval, and in that case he was able to count the number of primes whose digits come from any subinterval of of length .

We begin by discussing where we should expect to find primes, and how many there are:

### 1. Primes in arithmetic sequences

We believe that an arithmetically natural set of integers contains infinitely many primes unless there is an obvious reason why not (such as, say, if is the set of even integers, or the set of values of a reducible polynomial). Well known examples include,

- •
is the set of all integers;

- •
is the set of all integers in a given arithmetic progression (such as with );

- •
which is a way to ask for twin primes; ,

- •
.

The first two questions are resolved and we even know an asymptotic estimate for how many such primes there are up to a given while the second two questions are (wide) open. ,

#### 1.1. Guessing at the number of primes in

The prime number theorem asserts that there are primes (so roughly 1 in of the integers around are prime).Footnote^{3} As a first guess we might think that the primes are equidistributed amongst the arithmetic progressions mod and so the answer to the second question is however ; divides any element of and so if then this arithmetic progression contains at most one prime. Therefore we should restrict our attention to with There are . such progressions, and so we should adjust our guess so that if then there are primes that are This is the .*prime number theorem for arithmetic progressions*.Footnote^{4}

^{3}

To prove such a result it helps to include a weight at each prime and prove instead that since , is a more natural function to work with than (which is a more precise approximation than The prime number theorem can be deduced by the technique of ).*partial summation* which allows one to multiply or divide the summand by smooth weights.

^{4}

First claimed by de la Vallée Poussin in 1899 based on ideas from his proof of the prime number theorem, and Dirichlet’s proof of the infinitude of primes in arithmetic progressions. Thanks to Siegel and Walfisz this can be given, when is large enough compared to as follows. Fix reals , If . then the number of primes , up to ,

Let be the set of integers in up to and let , be the number of primes in If the elements of . are as likely to be prime as random integers (roughly 1 in around then we’d guess that ) This can be wrong since we have not accounted for any obvious biases in the set . for example, if ; is the set of integers in an arithmetic progression mod then much depends on whether the progression is coprime with , So we adjust our guess by a factor which is the probability that a random integer in . is coprime with divided by the probability, , that a random integer is coprime with This then yields the guess .

when (and when so we recover the correct prediction. This suggests a general strategy for guessing at ), .

#### 1.2. Sparse sets of primes

The first three questions above involve sets that are quite dense amongst the integers. Our well-worn methods usually have limited traction with sets that are *sparse* such as

- •
;

- •
for given integer and ;

- •
for a given real, irrational .

In each of these examples, a rather sparse set. Each was shown to have more-or-less the expected number of primes over 50 years ago (theorems of Hoheisel, Linnik, and Vinogradov, respectively), though all known proofs are rather difficult. Moreover if we change ,“ to an exponent ” then these questions are far beyond our current state of knowledge. ,Footnote^{5}

^{5}

The sparsest sets known in these questions to contain primes are and , due to Reference 1Reference 36Reference 25, respectively.

A family of sparse arithmetic sequences are given by the sets of values of polynomials (perhaps in several variables). Examples of sparse sets of values for which infinitely many primes have been found include

which has (see Reference 13); and

which has (see Reference 19).

This last set is an example of the set of values of a *norm-form* as is the norm an element, of the ring of integers of , In general: .

For a number field with ring of integers , the norm

is a degree polynomial in the variables …, , For example, . has norm The prime ideal theorem implies that norm-forms take on infinitely many prime values with the . all integers, provided that this polynomial has no fixed prime factor. These sequences are not so sparse (since they represent something like ’s different integers up to However, in the last example we know that there are roughly the expected number of prime values of the norm-form for ). even when we fix (in which case one obtains a sparse set of integer values).

There are infinitely many prime values of the norm, of , for integers but if we fix , we get the open question of primes of the form , In 2002 Heath-Brown and Moroz .Reference 20Reference 21 proved that any cubic norm-form with one of the variables equal to (as long as the new form is irreducible) takes roughly the expected number of prime values. Moreover in 2018, Maynard Reference 30 proved such a result for norms of

Other than primes in short intervals, in short arithmetic progressions, and amongst polynomial values, perhaps the best known questions involving primes are those without some explicitly named digit or digits in their decimal or binary expansion. We explore these below.

### 2. Primes with missing digits

How many primes only have the digits 1, 3, and 4 in their decimal expansions? When we start searching we find many:

and our guess is that there are infinitely many such primes. To guess how many up to we can follow the above recipe: Here , and so , where .Footnote^{6} We expect that the elements of are independently equidistributed modulo every prime except perhaps for those dividing the base since the last digit of an element of : is or it is coprime with , with probability whereas regular integers are coprime with , with probability and so we guess that ,

^{6}

Note that since no element of begins with a “ so ”, does not tend to a limit. The ratio ranges between and .

#### 2.1. General prediction

If is the set of integers which have only digits from in their base expansion, let and then we predict that

via the same reasoning. Maynard proved this Reference 29Reference 31 for certain general families of sparse sets His most spectacular result .Reference 29 yields (close to) the above with and that is, Maynard proved that there are roughly the expected number of primes that are missing ;*one* given digit in decimal.Footnote^{7} His methods give a lot more (as we will describe). His methods can’t handle sets as sparse as with that is for another day. ;Footnote^{8} We will sketch slightly more than the easier argument from Reference 31 which gives many results of this type though only for bases that are significantly larger than .

^{8}

Moreover there may be other, as yet undiscovered, reasons why there might not be any primes for a given The one obstruction I know about is when every element of . is divisibly be some given prime which implies that all the elements of , are also divisible by .

**Backlinks**: Reference 1, Reference 2.

#### 2.2. Who cares?

Is this a silly question? It is certainly diverting to wonder whether there are infinitely many primes with given missing digits, but how does that impact any other serious questions in mathematics? This is a case of “the proof of the pudding is in the eating”, that is, its real value can be judged only from the beautiful mathematics that unfolds. The story is two-fold. The relevant Fourier coefficients have an extraordinary structure that allows Maynard to import ideas from Markov processes, and so prove such theorems in bases To get the base down to . Maynard develops his ideas with a virtuosity in all sorts of deep techniques that spin an extraordinary (though technical) tale. ,

### 3. The circle method

#### 3.1. Fourier analysis

We use the identity

where for any real and its discrete analogue ,

obtained by summing the geometric series.

Let denote the set of primes and let be the set of integers missing some given digit or digits in base- To identify whether prime . equals some we can take the above identities with , and sum over all and to obtain, in the discrete case, ,

where, for a given set of integers we define the ,*exponential sum* (or the *Fourier transform* of by )

Similarly, in the continuous case,

One can work with either version, depending on whether discrete or continuous seems more convenient in the particular argument.Footnote^{9} Writing

^{9}

Some of this discussion will make more sense to the novice if they think about the continuous version (though the discussion also applies to the discrete version).

one can also obtain Equation 3.1 and the continuous analogue from the Parseval–Plancherel identity in Fourier analysis.

#### 3.2. The circle method

To establish a good estimate for using Equation 3.1, one needs to identify those for which the summand on the right-hand side is large; for example, and so the term in Equation 3.1 yields

which is the expected order of magnitude of our main term (though it may be out by a multiplicative constant). Other terms where is small, or is close to a rational with small denominator often also contribute to the main term, whereas we hope that the combined contribution of all of the other terms is significantly smaller. At first sight this seems unlikely since we only have the trivial bound for the other terms, but the trick is to use the Cauchy–Schwarz inequality followed by Parseval’s identity so that

This implies for example that a typical term in the sum on the right-hand side of Equation 3.1 has size which is a little bigger than the main term but certainly not so egregiously as would happen if we used the trivial bound.

We have just described the basic thinking behind the *circle method* used when one sums or integrates over the values of an exponential sum as the variable rotates around the unit circle (that is, for or , for When trying to estimate the sum on the right-hand side of ).Equation 3.1, we are most interested in those for which is *large*. Experience shows that with arithmetic problems, the exponential sums can typically only be large when is close to a rational with small denominators, and so we cut the circle up into these *major arcs*, usually those near to a rational with small denominator, and *minor arcs*, the remaining bounding the contribution from the minor arcs, and being as precise as possible with the major arcs to obtain the main terms. ,

Fourier analysis/the circle method is most successful when one has the product of at least three exponential sums to play with. For example the ternary Goldbach problem was more-or-less resolved by Vinogradov 85 years ago, whereas the binary Goldbach problem remains open.Footnote^{10}

^{10}

It is known that *almost all* integers can be written as the sum of two primes in the expected number of ways, since by counting over all integers one can estimate the variance via an integral involving three exponential sums. ,

#### 3.3. The ternary Goldbach problem

The number of representations of odd as the sum of three primes is given by

and the arc of width around yields a main term of size We have the trivial bound . and we will define here the minor arcs to be ,

(Since the typical size of is we expect that all but a tiny subset of the , belong to these minor arcs.) Then

which is significantly smaller than the main term. Thus if we can identify which

#### 3.4. Major and minor arcs

The usual way to dissect the circle is to pick a parameter

(and the right-hand side is

The arcs with

In our problem the partition of major and minor arcs will be a bit more complicated. The major arcs will be given by

and the main term will be obtained from those major arcs for which the prime factors of

Of course there is far more to say on the circle method than the brief discussion in this article. The reader should look into the two classic books on the subject Reference 6Reference 35 for much more detail and for applications to a wide variety of interesting questions.

### 4. The missing digit problem

Throughout let ^{11}

The major arcs are typically given by the points ^{12} If