Missing digits and good approximations

By Andrew Granville

Abstract

James Maynard has taken the analytic number theory world by storm in the last decade, proving several important and surprising theorems, resolving questions that had seemed far out of reach. He is perhaps best known for his work on small and large gaps between primes (which were discussed, hot off the press, in our 2015 Bulletin of the AMS article). In this article we will discuss two other Maynard breakthroughs:

— Mersenne numbers take the form and so appear as in base 2, having no digit ”. It is a famous conjecture that there are infinitely many such primes. More generally it was, until Maynard’s work, an open question as to whether there are infinitely many primes that miss any given digit, in any given base. We will discuss Maynard’s beautiful ideas that went into his 2019 partial resolution of this question.

— In 1926, Khinchin gave remarkable conditions for when real numbers can usually be “well approximated” by infinitely many rationals. However Khinchin’s theorem regarded 1/2, 2/4, 3/6 as distinct rationals and so could not be easily modified to cope, say, with approximations by fractions with prime denominators. In 1941 Duffin and Schaeffer proposed an appropriate but significantly more general analogy involving approximation only by reduced fractions (which is much more useful). We will discuss its 2020 resolution by Maynard and Dimitris Koukoulopoulos.

This year’s Current Events Bulletin highlights the work of the 2022 Fields medallists. In James Maynard’s case there are a surprising number of quite different breakthroughs that could be discussed.⁠Footnote1 In my 2014 CEB lecture I described the work of Yitang Zhang Reference 37 on bounded gaps between primes and noted that a first-year postdoc, James Maynard, had taken a different, much simpler but related approach, to also get bounded gaps Reference 27 (and a similar proof had been found, independently, by Terry Tao, and given on his blog). Versions of both Zhang’s proof and the Maynard–Tao proof appear in my article Reference 15, where it is also announced that Maynard had within months made another spectacular breakthrough, this time on the largest known gaps between consecutive primes Reference 28 (and a rather different proof Reference 11 had been found by Ford, Green, Konyagin, and Tao, the two proofs combining to give an even better result Reference 12). It has been like this ever since with Maynard, many breakthrough results, some more suitable for a broad audience, some of primary importance for the technical improvements. Rather than attempt to summarize these all, I have selected two quite different topics, in both of which Maynard proved spectacular breakthroughs on questions that had long been stuck.

1

In my forthcoming textbook about the distribution of primes, starting from the basics, about one-sixth of the book is dedicated to various Maynard theorems. This is one of the oldest and most venerable subjects of mathematics.

Part 1. Primes missing digits

Most integers have many of each of the digits, 0 through 9, in their decimal expansion, so integers missing a given digit, or digits, are rare, making them hard to analyze. For example, there are integers up to having only , , and in their decimal expansion as there are three possibilities for each of the digits in the expansion.⁠Footnote2 When we begin to explore we find the primes

2

So there are about integers up to having only , , and in their decimal expansion, where .

having only the digits , , and in their decimal expansions. Are there infinitely many such primes? It seems likely given how many we have already found, but this question, and questions like it, have long been wide open, researchers finding it difficult to find a method to plausibly attack such problems (as we will discuss below). Indeed it was only recently that researchers succeeded in the following related but seemingly less difficult problems:

In 2010 Mauduit and Rivat Reference 26 finally resolved Gelfond’s problem that the sum of the base- digits of prime numbers are equidistributed in arithmetic progressions, for all .

In 2015 Bourgain Reference 3 showed that there are the expected number of primes with binary digits, for which of those digits have preassigned values (and see Swaenepoel Reference 33 for base-).

Maynard simplified and (in some aspects) sharpened the tools used in these proofs but also added a perspective, and a technical confidence, that allowed him to surmount some of the established technical barriers. Here we will sketch his proof giving an asymptotic for the number of primes up to large , missing one given digit in base (for sufficiently large), though his proof can be extended to counting the number of primes missing no more than base- digits (again, for sufficiently large). His proof works best if the allowed digits lie in an interval, and in that case he was able to count the number of primes whose digits come from any subinterval of of length .

We begin by discussing where we should expect to find primes, and how many there are:

1. Primes in arithmetic sequences

We believe that an arithmetically natural set of integers contains infinitely many primes unless there is an obvious reason why not (such as, say, if is the set of even integers, or the set of values of a reducible polynomial). Well known examples include,

is the set of all integers;

is the set of all integers in a given arithmetic progression (such as with );

, which is a way to ask for twin primes;

.

The first two questions are resolved and we even know an asymptotic estimate for how many such primes there are up to a given , while the second two questions are (wide) open.

1.1. Guessing at the number of primes in

The prime number theorem asserts that there are primes (so roughly 1 in of the integers around are prime).⁠Footnote3 As a first guess we might think that the primes are equidistributed amongst the arithmetic progressions mod and so the answer to the second question is ; however divides any element of and so if then this arithmetic progression contains at most one prime. Therefore we should restrict our attention to with . There are such progressions, and so we should adjust our guess so that if then there are primes that are . This is the prime number theorem for arithmetic progressions.⁠Footnote4

3

To prove such a result it helps to include a weight at each prime and prove instead that , since is a more natural function to work with than (which is a more precise approximation than ). The prime number theorem can be deduced by the technique of partial summation which allows one to multiply or divide the summand by smooth weights.

4

First claimed by de la Vallée Poussin in 1899 based on ideas from his proof of the prime number theorem, and Dirichlet’s proof of the infinitude of primes in arithmetic progressions. Thanks to Siegel and Walfisz this can be given, when is large enough compared to , as follows. Fix reals . If , then the number of primes up to ,

Let be the set of integers in up to , and let be the number of primes in . If the elements of are as likely to be prime as random integers (roughly 1 in around ) then we’d guess that . This can be wrong since we have not accounted for any obvious biases in the set ; for example, if is the set of integers in an arithmetic progression mod , then much depends on whether the progression is coprime with . So we adjust our guess by a factor which is the probability that a random integer in is coprime with , divided by the probability, that a random integer is coprime with . This then yields the guess

when (and when ), so we recover the correct prediction. This suggests a general strategy for guessing at .

1.2. Sparse sets of primes

The first three questions above involve sets that are quite dense amongst the integers. Our well-worn methods usually have limited traction with sets that are sparse such as

;

for given integer and ;

for a given real, irrational .

In each of these examples, , a rather sparse set. Each was shown to have more-or-less the expected number of primes over 50 years ago (theorems of Hoheisel, Linnik, and Vinogradov, respectively), though all known proofs are rather difficult. Moreover if we change to an exponent , then these questions are far beyond our current state of knowledge.⁠Footnote5

5

The sparsest sets known in these questions to contain primes are , and due to Reference 1Reference 36Reference 25, respectively.

A family of sparse arithmetic sequences are given by the sets of values of polynomials (perhaps in several variables). Examples of sparse sets of values for which infinitely many primes have been found include

which has (see Reference 13); and

which has (see Reference 19).

This last set is an example of the set of values of a norm-form as is the norm an element, , of the ring of integers of . In general:

For a number field , with ring of integers the norm

is a degree polynomial in the variables , …, . For example, has norm . The prime ideal theorem implies that norm-forms take on infinitely many prime values with the ’s all integers, provided that this polynomial has no fixed prime factor. These sequences are not so sparse (since they represent something like different integers up to ). However, in the last example we know that there are roughly the expected number of prime values of the norm-form for even when we fix (in which case one obtains a sparse set of integer values).

There are infinitely many prime values of the norm, , of for integers , but if we fix , we get the open question of primes of the form . In 2002 Heath-Brown and Moroz Reference 20Reference 21 proved that any cubic norm-form with one of the variables equal to (as long as the new form is irreducible) takes roughly the expected number of prime values. Moreover in 2018, Maynard Reference 30 proved such a result for norms of

Other than primes in short intervals, in short arithmetic progressions, and amongst polynomial values, perhaps the best known questions involving primes are those without some explicitly named digit or digits in their decimal or binary expansion. We explore these below.

2. Primes with missing digits

How many primes only have the digits 1, 3, and 4 in their decimal expansions? When we start searching we find many:

and our guess is that there are infinitely many such primes. To guess how many up to , we can follow the above recipe: Here , and so where .⁠Footnote6 We expect that the elements of are independently equidistributed modulo every prime except perhaps for those dividing the base : since the last digit of an element of is or , it is coprime with with probability , whereas regular integers are coprime with with probability , and so we guess that

6

Note that since no element of begins with a ”, so does not tend to a limit. The ratio ranges between and .

2.1. General prediction

If is the set of integers which have only digits from in their base expansion, let and then we predict that

via the same reasoning. Maynard proved this Reference 29Reference 31 for certain general families of sparse sets . His most spectacular result Reference 29 yields (close to) the above with and ; that is, Maynard proved that there are roughly the expected number of primes that are missing one given digit in decimal.⁠Footnote7 His methods give a lot more (as we will describe). His methods can’t handle sets as sparse as with ; that is for another day.⁠Footnote8 We will sketch slightly more than the easier argument from Reference 31 which gives many results of this type though only for bases that are significantly larger than .

7

“Roughly” meaning “up to a multiplicative constant” rather than an asymptotic.

8

Moreover there may be other, as yet undiscovered, reasons why there might not be any primes for a given . The one obstruction I know about is when every element of is divisibly be some given prime , which implies that all the elements of are also divisible by .

Backlinks: Reference 1, Reference 2.

2.2. Who cares?

Is this a silly question? It is certainly diverting to wonder whether there are infinitely many primes with given missing digits, but how does that impact any other serious questions in mathematics? This is a case of “the proof of the pudding is in the eating”, that is, its real value can be judged only from the beautiful mathematics that unfolds. The story is two-fold. The relevant Fourier coefficients have an extraordinary structure that allows Maynard to import ideas from Markov processes, and so prove such theorems in bases . To get the base down to , Maynard develops his ideas with a virtuosity in all sorts of deep techniques that spin an extraordinary (though technical) tale.

3. The circle method

3.1. Fourier analysis

We use the identity

where for any real , and its discrete analogue

obtained by summing the geometric series.

Let denote the set of primes and let be the set of integers missing some given digit or digits in base-. To identify whether prime equals some , we can take the above identities with and sum over all and , to obtain, in the discrete case,

where, for a given set of integers , we define the exponential sum (or the Fourier transform of ) by

Similarly, in the continuous case,

One can work with either version, depending on whether discrete or continuous seems more convenient in the particular argument.⁠Footnote9 Writing

9

Some of this discussion will make more sense to the novice if they think about the continuous version (though the discussion also applies to the discrete version).

one can also obtain Equation 3.1 and the continuous analogue from the Parseval–Plancherel identity in Fourier analysis.

3.2. The circle method

To establish a good estimate for using Equation 3.1, one needs to identify those for which the summand on the right-hand side is large; for example, and so the term in Equation 3.1 yields

which is the expected order of magnitude of our main term (though it may be out by a multiplicative constant). Other terms where is small, or is close to a rational with small denominator often also contribute to the main term, whereas we hope that the combined contribution of all of the other terms is significantly smaller. At first sight this seems unlikely since we only have the trivial bound for the other terms, but the trick is to use the Cauchy–Schwarz inequality followed by Parseval’s identity so that

This implies for example that a typical term in the sum on the right-hand side of Equation 3.1 has size which is a little bigger than the main term but certainly not so egregiously as would happen if we used the trivial bound.

We have just described the basic thinking behind the circle method used when one sums or integrates over the values of an exponential sum as the variable rotates around the unit circle (that is, for , or for ). When trying to estimate the sum on the right-hand side of Equation 3.1, we are most interested in those for which is large. Experience shows that with arithmetic problems, the exponential sums can typically only be large when is close to a rational with small denominators, and so we cut the circle up into these major arcs, usually those near to a rational with small denominator, and minor arcs, the remaining , bounding the contribution from the minor arcs, and being as precise as possible with the major arcs to obtain the main terms.

Fourier analysis/the circle method is most successful when one has the product of at least three exponential sums to play with. For example the ternary Goldbach problem was more-or-less resolved by Vinogradov 85 years ago, whereas the binary Goldbach problem remains open.⁠Footnote10

10

It is known that almost all integers can be written as the sum of two primes in the expected number of ways, since by counting over all integers , one can estimate the variance via an integral involving three exponential sums.

3.3. The ternary Goldbach problem

The number of representations of odd as the sum of three primes is given by

and the arc of width around yields a main term of size . We have the trivial bound , and we will define here the minor arcs to be

(Since the typical size of is , we expect that all but a tiny subset of the belong to these minor arcs.) Then

which is significantly smaller than the main term. Thus if we can identify which belong to , then we can focus on evaluating on the major arcs . There are strong bounds known for , as we will see later, so these ambitions can all be achieved in practice.

3.4. Major and minor arcs

The usual way to dissect the circle is to pick a parameter and recall that, by Dirichlet’s theorem (see the discussion in Part 2), for every there exists a reduced fraction with for which

(and the right-hand side is ). Therefore we may cover (and so cover the circle, by mapping ) with the intervals (arcs),

The arcs with small are usually the major arcs, those with large are the minor arcs.

In our problem the partition of major and minor arcs will be a bit more complicated. The major arcs will be given by

and the main term will be obtained from those major arcs for which the prime factors of all divide . The minor arcs with be determined from the arcs above with , and then removing the major arcs.

Of course there is far more to say on the circle method than the brief discussion in this article. The reader should look into the two classic books on the subject Reference 6Reference 35 for much more detail and for applications to a wide variety of interesting questions.

4. The missing digit problem

Throughout let be the set of integers whose digits come from the set . Our aim is to estimate , and it will be convenient to let for some large even integer .⁠Footnote11

11

For other large the key ideas are the same, but dull technicalities arise.

The major arcs are typically given by the points for which the integrand is large.⁠Footnote12 If is large, then and must both individually be large. As we will see, Vinogradov proved that is only large when is near to a rational with small denominator. behaves differently; it is only large when there are many ’s and ’s in the decimal expansion of . The simplest that satisfy both criteria take the form for some small , perhaps with or, if , then , so that all the prime factors of must divide . We therefore split the major arcs into three parts: those with

12

That is the goal, but one may have to include other points that one cannot easily exclude.