The triangular theorem of eight and representation by quadratic polynomials

We investigate here the representability of integers as sums of triangular numbers, where the $n$-th triangular number is given by $T_n = n(n + 1)/2$. In particular, we show that $f(x_1,x_2,..., x_k) = b_1 T_{x_1} +...+ b_k T_{x_k}$, for fixed positive integers $b_1, b_2,..., b_k$, represents every nonnegative integer if and only if it represents 1, 2, 4, 5, and 8. Moreover, if `cross-terms' are allowed in $f$, we show that no finite set of positive integers can play an analogous role, in turn showing that there is no overarching finiteness theorem which generalizes the statement from positive definite quadratic forms to totally positive quadratic polynomials.


Introduction
In 1638 Fermat claimed that every number is a sum of at most three triangular numbers, four square numbers, and in general $k$ polygonal numbers of order $k$. Here the $n$-th polygonal number of order $k$ is $\frac{(k-2)n^2-(k-4)n}{2}$, so the $n$-th triangular number is $T_n := \frac{n(n+1)}{2}$, where we include $T_0 = 0$ for simplicity. For a more complete history of related questions about sums of figurate numbers and some new results, see Duke's survey paper [7]. The claim for four squares was shown by Lagrange.
Theorem (Lagrange, 1770). Every positive integer is the sum of four squares.
Theorem (Gauss, 1796). Every positive integer is the sum of three triangular numbers.
The first proof of the full assertion of Fermat was given by Cauchy in 1813 [3], cf. [11].
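As a numerical illustration of Fermat's claim (a sanity check only, not part of any proof), the following Python sketch computes polygonal numbers from the formula above and tests, for a small order $k$ and a finite bound, whether every integer up to the bound is a sum of at most $k$ polygonal numbers of order $k$. The function names and the bound are choices made here for illustration.

```python
def polygonal(k, n):
    """n-th polygonal number of order k: ((k-2)n^2 - (k-4)n) / 2."""
    return ((k - 2) * n * n - (k - 4) * n) // 2


def sums_of_at_most(k, limit):
    """Integers in [0, limit] that are sums of at most k polygonal numbers of order k."""
    gens = []
    n = 0
    while polygonal(k, n) <= limit:
        gens.append(polygonal(k, n))  # includes polygonal(k, 0) = 0
        n += 1
    reachable = {0}
    for _ in range(k):
        # adding 0 is allowed, so "exactly k summands" covers "at most k"
        reachable = {a + g for a in reachable for g in gens if a + g <= limit}
    return reachable


def fermat_holds(k, limit):
    """Check Fermat's claim for order k on the finite range [0, limit]."""
    return sums_of_at_most(k, limit) == set(range(limit + 1))
```

For example, `fermat_holds(3, 300)` tests Gauss's theorem and `fermat_holds(4, 300)` tests Lagrange's theorem on the range 0 to 300.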
This paper concerns questions of representability of integers by quadratic polynomials. If $f = f(x) = f(x_1, x_2, \dots, x_k)$ is a rational polynomial in $k$ variables, it represents the integer $n$ if there exist integers $n_i$ such that $n = f(n_1, n_2, \dots, n_k)$, and it oddly represents the integer $n$ if there exist odd integers $n_i$ such that $f(n_1, n_2, \dots, n_k) = n$. The polynomial $f$ is said to represent a set $Z$ of integers if it represents every element of $Z$.
If we let $S = S_x$ be the square polynomial $x^2$, and let $T = T_x$ denote the triangular polynomial $(x^2 + x)/2$, the theorems of Lagrange and Gauss state that the positive integers are represented by $S_w + S_x + S_y + S_z$, and by $T_x + T_y + T_z$.
In 1917, Ramanujan extended the question about four squares to ask for which choices of quadruples $b = (b_1, b_2, b_3, b_4)$ of integers the form $b_1 S_w + b_2 S_x + b_3 S_y + b_4 S_z$ represents every positive integer; we shall refer to such forms as universal diagonal forms. He gave a list of 55 possible choices of $b$ which he claimed to be the complete list of universal quaternary diagonal forms; 54 of these forms actually turned out to be universal, and this list of 54 is complete.
Recently, Conway and Schneeberger proved in unpublished work a nice classification for universal positive definite quadratic forms whose corresponding matrices have integer entries. This answers the question of representability by positive definite homogeneous quadratic polynomials with even off-diagonal coefficients.
Theorem (Conway-Schneeberger). A positive definite quadratic form $Q(x) = x^t A x$, where $A$ is a positive symmetric matrix with integer coefficients, represents every positive integer if and only if it represents the integers 1, 2, 3, 5, 6, 7, 10, 14, and 15.
Bhargava gave a simpler proof of the Conway-Schneeberger 15-Theorem in [1], and showed more generally that representability of any set $Z$ by such a form can always be checked on a finite subset $Y$. In addition, he exhibited $Y$ for $Z$ consisting of all odd integers and for $Z$ consisting of all primes.
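To make the criterion concrete, here is a small Python sketch (brute force, restricted to diagonal forms for simplicity; the names `CRITICAL`, `represents` and `failed_critical` are introduced here and are not from the paper) that tests which of the nine critical integers a diagonal form $\sum b_i x_i^2$ fails to represent. It can be tried on the sum of four squares and on the form with coefficients (1, 2, 5, 5).

```python
from itertools import product
from math import isqrt

# The nine integers of the Conway-Schneeberger criterion.
CRITICAL = (1, 2, 3, 5, 6, 7, 10, 14, 15)


def represents(coeffs, n):
    """True if n = sum(b_i * x_i^2) has an integer solution (diagonal form only)."""
    # Signs do not matter for squares, so nonnegative x_i suffice.
    ranges = [range(isqrt(n // b) + 1) for b in coeffs]
    return any(
        sum(b * x * x for b, x in zip(coeffs, xs)) == n
        for xs in product(*ranges)
    )


def failed_critical(coeffs):
    """Critical integers not represented by the diagonal form with these coefficients."""
    return [n for n in CRITICAL if not represents(coeffs, n)]
```

By the theorem, an empty result from `failed_critical` for a positive definite diagonal form certifies universality, while any non-empty result exhibits an explicit exception.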
More recently, Bhargava and Hanke [2] have shown the 290-Theorem, providing the necessary set (the largest element of which is 290) for universal forms when the corresponding matrix is half integral, that is, for totally positive integer quadratic forms.
In 1863, Liouville [10] proved the following generalization of Gauss's theorem, similar to Ramanujan's generalization of Lagrange's Four Squares Theorem.
We will first investigate whether finiteness theorems akin to the results of the Conway-Schneeberger 15-Theorem or the Bhargava-Hanke 290-Theorem occur for sums of triangular numbers. Since $8T_x + 1 = (2x+1)^2$, there is a close correspondence between representability by triangular polynomials and odd representability by diagonal quadratic forms. It is not so difficult to establish Theorem 1.1 with the escalator techniques of Bhargava (and Liouville). We will prove a stronger statement in Section 2: if the integers 1, 2, 4, 5, and 8 are represented by the triangular form, then $n$ is represented very many times unless $n + 1$ has high 3-divisibility. For an integer $n$, we will set $a_n := \frac{v_3(n+1)}{\log_3(n+1)}$, so that $3^{v_3(n+1)} = (n+1)^{a_n}$ gives the 3-part of $n + 1$ as a power of $n + 1$. We will abbreviate $t(x) = t(x_1, x_2, \dots, x_k) = \sum_i b_i T_{x_i}$, and call it a triangular sum.
Theorem 1.3. For $\epsilon > 0$, there is an absolute constant $c_\epsilon$ such that if the triangular sum $t(x)$ represents 1, 2, 4, 5, and 8, then $t(x)$ represents every nonnegative integer $n$ at least $\min\{c_\epsilon n^{\frac{1}{2}-\epsilon}, n^{1-a_n}\}$ times. In particular, if $n$ is sufficiently large and $a_n = \frac{v_3(n+1)}{\log_3(n+1)} < \frac{1}{2}$, then $t(x)$ represents $n$ at least $c_\epsilon n^{\frac{1}{2}-\epsilon}$ times.
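The correspondence between triangular sums and odd representations rests on the identity $8T_x + 1 = (2x+1)^2$: the substitution $X_i = 2x_i + 1$ is a bijection between integer solutions of $\sum b_i T_{x_i} = n$ and odd solutions of $\sum b_i X_i^2 = 8n + \sum b_i$. The following brute-force Python sketch compares the two counts for small $n$; the function names are illustrative and the search boxes are chosen large enough to contain all solutions.

```python
from itertools import product
from math import isqrt


def tri(x):
    """Triangular polynomial T_x = x(x+1)/2, defined for every integer x."""
    return x * (x + 1) // 2


def count_triangular(b, n):
    """Number of x in Z^k with sum(b_i * T_{x_i}) = n."""
    bound = isqrt(2 * n) + 1  # T_x <= n forces -bound - 1 <= x <= bound
    rng = range(-bound - 1, bound + 1)
    return sum(
        1
        for xs in product(rng, repeat=len(b))
        if sum(bi * tri(x) for bi, x in zip(b, xs)) == n
    )


def count_odd(b, target):
    """Number of X in Z^k, all X_i odd, with sum(b_i * X_i^2) = target."""
    bound = isqrt(target) + 1
    odd = [X for X in range(-bound, bound + 1) if X % 2]
    return sum(
        1
        for Xs in product(odd, repeat=len(b))
        if sum(bi * X * X for bi, X in zip(b, Xs)) == target
    )
```

For any coefficient tuple `b` and any `n >= 0`, `count_triangular(b, n)` should agree with `count_odd(b, 8 * n + sum(b))`.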
We now turn to more general quadratic polynomials. Let $f$ be a quadratic polynomial in $\mathbb{Q}[x_1, x_2, \dots, x_k]$; then $f$ is a normalized totally positive quadratic polynomial if the image of $\mathbb{Z}^k$ under $f$ consists of non-negative integers, while $f(x) = 0$ for some $x \in \mathbb{Z}^k$. Note that clearly $S_x = x^2$ is normalized totally positive, as is $T_x$: $T_0 = 0$, $T_1 = 1$, $T_2 = 3$ are the first of the increasing sequence of triangular numbers, and $T_{-m} = T_{m-1}$ for positive $m$. It turns out that no finiteness theorem will hold in general for normalized totally positive quadratic polynomials, and moreover that checking a proper subset of a given set never suffices.
The class of triangular sums with cross terms corresponds to integral quadratic forms with even off-diagonal terms, just as the ordinary triangular sums correspond to diagonal quadratic forms. We refer to Section 3 for a precise definition of this subclass of quadratic polynomials.
Finally, in Section 4 we construct a 'norm' $m$ on this class that restores finite representability.
Theorem 1.6. Fix an integer $m$ and a subset $Z$ of the positive integers. Then there is a finite subset $Y_m \subset Z$, depending only on $m$ and $Z$, such that every triangular sum $t$ with cross terms satisfying $m(t) \le m$ represents $Z$ if and only if it represents $Y_m$.
Moreover, for $Z$ equal to the positive integers, we find that $\max Y_m \gg m^2$.
It may be of interest to investigate the growth of $\max Y_m$ when $Z$ is the set of positive integers. A reasonable guess for $Y_m$ (for small fixed $m$) may be obtained after some computer computation, but a proof eludes us even in the case $m = 1$ due to a certain inherent ineffectivity. For further discussion and the guess obtained for $Y_1$, we refer the reader to Remark 4.3.

Theorem of Eight
We will assume throughout that the reader is familiar with genus theory for quadratic forms. For background information on quadratic forms, a good source is [8]. Here we prove Theorem 1.1, Corollary 1.2 and Theorem 1.3. We will proceed by using a standard argument to show that the theorem is equivalent to a statement about (diagonal) quadratic forms, and then prove the corresponding result for quadratic forms. We will only need some elementary results about quadratic forms and a theorem of Siegel to show the desired result.
Proof. Consider the generating function $\sum_{x \in \mathbb{Z}^k} q^{\sum_i b_i T_{x_i}} = \sum_{n \ge 0} s_b(n) q^n$. We will omit the subscript of $s_b(n)$ when it is clear from the context. Since $8T_x + 1 = (2x+1)^2$, one sees easily that $s_b(n) = r^o(8n + \sum_i b_i)$, where $r^o(N)$ is the number of representations of $N$ by the corresponding (diagonal) quadratic form with all $x_i$ odd. We proceed as with escalator lattices in [1]. Without loss of generality we have $b_1 \le b_2 \le \dots \le b_k$. Fixing $b = [b_1, \dots, b_{k-1}]$, we escalate to $[b_1, \dots, b_k]$ by choosing each $b_k \ge b_{k-1}$ for which it is possible to represent the next largest integer not already represented. We will then develop an escalator tree by forming an edge between $b$ and $[b_1, \dots, b_k]$, with $\emptyset$ as the root. If $\sum_i b_i T_{x_i}$ represents every integer, then $b$ will be a leaf of our tree. Therefore, if $s(n) > 0$ for every $n$, then we must have one of the leaves $b$ of this tree as a sublattice. By showing that each of these choices of $b$ satisfies $s(n) > 0$ for every $n$, we will see that this condition is both necessary and sufficient.
For ease of notation, we will denote the triangular sum corresponding to $b$ by $[b_1, b_2, \dots, b_k]$ and the corresponding quadratic form by $(b_1, \dots, b_k)$. All of the cases other than $[1, 1, 3, k]$ with $3 \le k \le 8$ are covered by Liouville's Theorem. However, to obtain the more precise version given in Theorem 1.3, we will use quadratic form genus theory.
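The escalation just described can be imitated numerically. The sketch below is a heuristic illustration, not a proof: it declares a form to be a leaf when it represents every integer up to a fixed bound `limit`, whereas the actual argument proves universality of each leaf. The names `represented`, `truant` and `escalate` are introduced here; since $T_{-m} = T_{m-1}$, only $x_i \ge 0$ is searched.

```python
def tri(x):
    """Triangular number T_x for x >= 0."""
    return x * (x + 1) // 2


def represented(b, limit):
    """Integers in [0, limit] of the form sum(b_i * T_{x_i}) with x_i >= 0."""
    values = {0}
    for bi in b:
        steps = []
        x = 0
        while bi * tri(x) <= limit:
            steps.append(bi * tri(x))
            x += 1
        values = {v + s for v in values for s in steps if v + s <= limit}
    return values


def truant(b, limit):
    """Smallest positive integer <= limit not represented, or None if there is none."""
    vals = represented(b, limit)
    return next((n for n in range(1, limit + 1) if n not in vals), None)


def escalate(limit=300):
    """Grow the escalator tree from the empty form; return (truants, leaves)."""
    truants, leaves, stack = set(), [], [()]
    while stack:
        b = stack.pop()
        t = truant(b, limit)
        if t is None:  # treated as universal (heuristically, up to `limit`)
            leaves.append(b)
            continue
        truants.add(t)
        low = b[-1] if b else 1  # keep coefficients non-decreasing
        # A new coefficient larger than the truant cannot help represent it.
        stack.extend(b + (c,) for c in range(low, t + 1))
    return truants, sorted(leaves)
```

Calling `escalate()` returns the set of truants met along the way together with the list of leaves, which can be compared with the forms discussed in this section.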
For the forms [1, 1, 1], [1, 1, 4], [1, 1, 5], [1, 2, 2], and [1, 2, 4], $r^o(n) = r(n)$, where $r(n)$ is the number of representations of $n$ without the restriction of $x_i$ odd. Each of the corresponding quadratic forms is a genus 1 quadratic form. Therefore, extending the classification of Jones [8, Theorem 86] to primitive representations when the integer is not square free, the number of representations is expressed in terms of $H(D)$, the Hurwitz class number for the order of discriminant $D < 0$.
For [1,1,5] we must be slightly more careful since 5 divides the discriminant. We will explain in some detail how to deal with this complication and then will henceforth ignore this difficulty when it arises. For 5 ∤ 8n + 7 we have s [1,1,5] ). Hence the only difficulty occurs with high divisibility by 5. For p = 5 the local densities are equal to those for bounded divisibility. Thus, entirely analogously to the result of Jones we have s [1,1,5] (n) = c n H(−5(8n + 7)) for some constant c n > 0 which only depends 5-adically on 8n + 7. We calculate the cases v 5 (8n + 7) ≤ 3 by hand. Denote 5-primitive representations (i.e., 5 ∤ gcd(x, y, z)) by r * (n). Checking locally, for 5 2 | m := 8n + 7, we will obtain the result inductively by showing r * (25m) and then summing = 5 by the class number formula (see [5,Corollary 7.28, page 148]) so that this is a quick local check at the prime 5.
Our proofs for [1, 1, 2], [1, 2, 3], and [1, 1, 3] will be essentially the same. For [1, 1, 2], we note that if $x^2 + y^2 + 2z^2 = 8n + 4$ has a solution with $x$, $y$, and $z$ not all odd, then taking each side modulo 8 leads us to the conclusion that $x$, $y$, and $z$ must all be even. Therefore, the solutions without $x$, $y$, and $z$ odd correspond to solutions of $4x^2 + 4y^2 + 8z^2 = 8n + 4$, or, $x^2 + y^2 + 2z^2 = 2n + 1$.
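The parity claim for [1, 1, 2] can be checked by exhausting residues. Under the substitution $X = 2x + 1$, the equation $T_x + T_y + 2T_z = n$ becomes $X^2 + Y^2 + 2Z^2 = 8n + 4$, so the relevant congruence is $x^2 + y^2 + 2z^2 \equiv 4 \pmod 8$. The short Python sketch below (the function name is illustrative) lists the parity patterns that occur among solutions of this congruence.

```python
from itertools import product


def parity_patterns(residue=4, modulus=8):
    """Parity triples (x % 2, y % 2, z % 2) occurring among solutions of
    x^2 + y^2 + 2 z^2 = residue (mod modulus)."""
    return {
        (x % 2, y % 2, z % 2)
        for x, y, z in product(range(modulus), repeat=3)
        if (x * x + y * y + 2 * z * z) % modulus == residue
    }
```

Only the all-even and all-odd patterns should appear, which is exactly the dichotomy used in the argument.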
Having seen that each of our choices of $b$ is indeed a leaf of the tree, we conclude that representing the integers 1, 2, 4, 5, and 8 suffices.
Remark 2.1. The constant $c_\epsilon$ in Theorem 1.3 is ineffective because it relies on Siegel's lower bound for the class number, but the bound of $c_\epsilon n^{\frac{1}{2}-\epsilon}$ may be replaced with the minimum of finitely many choices of a constant times a Hurwitz class number of a certain imaginary quadratic order whose discriminant is linear in $n$.
We have the following example. Using the explicit bound in terms of the Hurwitz class number, we obtain for instance that if 1, 2, 4, 5, and 8 are represented, then the integer 195727301431 is represented at least 270390 times and the integer 48291403767737750 is necessarily represented at least 90542761 times (here $a_n \approx 0.364$), while the integer $50031545098999706 = 3^{35} - 1$ is only necessarily represented once. All of the bounds listed in these examples are sharp.
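The exponent $a_n = v_3(n+1)/\log_3(n+1)$ from Theorem 1.3 is easy to evaluate. The Python sketch below (helper names `v3` and `a` are chosen here) computes it and can be used to confirm that $n = 3^{35} - 1$ is the extreme case $a_n = 1$, where the bound $n^{1-a_n}$ degenerates to 1.

```python
from math import log


def v3(m):
    """3-adic valuation of a positive integer m."""
    v = 0
    while m % 3 == 0:
        m //= 3
        v += 1
    return v


def a(n):
    """a_n = v_3(n+1) / log_3(n+1), for n >= 1 (floating point)."""
    return v3(n + 1) / log(n + 1, 3)
```

Because `a` uses floating-point logarithms, its output should be read as an approximation; the valuation `v3` itself is exact.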

Cross Terms
Every quadratic polynomial $f$ in $k$ variables (over $\mathbb{Q}$) can be written uniquely as $f(x) = Q(x) + \Lambda(x) + C$, where $Q(x)$ is a quadratic form in $k$ variables, $\Lambda(x)$ is a linear form, and $C$ is a constant. We will only consider quadratic polynomials such that $f(x) \in \mathbb{Z}$ for every $x \in \mathbb{Z}^k$. The quadratic form $Q(x)$ is positive definite if and only if $f(x)$ is bounded from below. As in the introduction, $f(x_1, x_2, \dots, x_k)$ is a normalized totally positive quadratic polynomial if $f$ is quadratic, and the image of $\mathbb{Z}^k$ is contained in the nonnegative integers while it contains 0. Clearly, for every positive definite quadratic form $Q(x)$ and linear form $\Lambda(x)$ there is a unique $C \in \mathbb{Z}$ such that $f(x) = Q(x) + \Lambda(x) + C$ is normalized totally positive.
If $C$ is the unique integer such that $aT_x + bT_y + cB_{xy} + C$ is normalized totally positive, then $aX^2 + bY^2 + cXY + (8C - a - b - c)$ will be the corresponding shifted quadratic form that is normalized totally positive on the odd integers.
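The normalizing constant $C$ is simply the negative of the minimum of $Q + \Lambda$ over $\mathbb{Z}^k$. A minimal Python sketch, assuming the minimum is attained inside a finite search box (true for a positive definite $Q$ once the box is large enough, but not verified by the code), could look as follows; the function name and the example polynomial are illustrative.

```python
from itertools import product


def normalizing_constant(f, k, box=20):
    """Return C = -min f over [-box, box]^k.

    Assumption: the global minimum of f over Z^k lies inside the box.
    """
    return -min(f(*xs) for xs in product(range(-box, box + 1), repeat=k))


def example(x, y):
    """Illustrative integer-valued polynomial Q + Lambda with Q = x^2 + y^2."""
    return x * x + y * y + 3 * x + y
```

Here `normalizing_constant(example, 2)` gives the shift that makes `example` normalized totally positive, and the triangular polynomial itself needs no shift.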
In order to describe our construction, we will say for simplicity that two quadratic polynomials f 1 and f 2 are (arithmetically) equivalent if the number of solutions to f 1 (x) = n equals the number of solutions to f 2 (x) = n for every integer n ≥ 0.
We will consider positive definite integral quadratic forms (in $k$ variables) for which all cross terms in the matrix have even coefficients, so the cross terms of the quadratic form are 0 mod 4. This restriction is natural if one keeps in mind that we are interested in the integers oddly represented by forms.
If $Q$ and $Q'$ are two equivalent quadratic forms such that the isomorphism preserves the condition that each $X_i$ is odd, then we shall refer to them as equivalently odd, and denote the equivalence class of such forms by $[Q]_o$.
For any positive definite quadratic form $Q$ with cross terms divisible by four, we define $f_Q = f_{[Q]_o}$ to be the unique normalized totally positive quadratic polynomial corresponding to $Q$. We will refer to $f_Q$ as a triangular sum with cross terms.
We are now ready to prove Theorem 1.5. We will first show that triangular sums with cross terms do not satisfy any finiteness theorem, and hence there is no overarching finiteness theorem for quadratic polynomials. To do so, for every positive integer $n$ we will construct a triangular sum with cross terms $f_n$ which represents precisely every non-negative integer other than $n$.
The following notation will be used. If f and g are polynomials in k and ℓ variables, we denote by f ⊕ g the sum of the two as a polynomial in k + ℓ variables (so f and g are assumed to share no variables).
Proof of Theorem 1.5. Let a proper subset $S_0$ of a given subset $S$ of the positive integers be given. Choose a positive integer $n \in S \setminus S_0$. We will proceed by explicit construction of the triangular sum with cross terms $f_n$ which represents every integer other than $n$.
First note that if the smallest positive integer not represented by $f$ is $n$, then, since the sum of three triangular numbers represents every non-negative integer, we have that $f \oplus (n+1)(T_x \oplus T_y \oplus T_z)$ represents all $m \not\equiv n \pmod{n+1}$. But then we can choose $f_n := f \oplus (n+1)(T_x \oplus T_y \oplus T_z) \oplus (2n+1)T_w$, since every $m > n$ with $m \equiv n \pmod{n+1}$ satisfies $m \ge 2n+1$ and $m - (2n+1) \equiv 0 \pmod{n+1}$. It therefore suffices to construct $f$ for which $n$ is the smallest positive integer not represented by $f$.
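The padding step can be tested on small cases. In the Python sketch below, $f$ is taken to be an ordinary (diagonal) triangular sum purely for illustration, whereas the proof needs triangular sums with cross terms to realize every $n$; the helper names are introduced here. Given $f$ whose smallest positive non-represented integer is $n$, `extend` builds the coefficients of $f \oplus (n+1)(T_x \oplus T_y \oplus T_z) \oplus (2n+1)T_w$, and `missing` lists the integers up to a bound that are not represented.

```python
def tri(x):
    """Triangular number T_x for x >= 0."""
    return x * (x + 1) // 2


def represented(b, limit):
    """Integers in [0, limit] of the form sum(b_i * T_{x_i}) with x_i >= 0."""
    values = {0}
    for bi in b:
        steps = []
        x = 0
        while bi * tri(x) <= limit:
            steps.append(bi * tri(x))
            x += 1
        values = {v + s for v in values for s in steps if v + s <= limit}
    return values


def missing(b, limit):
    """Integers in [0, limit] not represented by the diagonal triangular sum b."""
    vals = represented(b, limit)
    return [m for m in range(limit + 1) if m not in vals]


def extend(b, n):
    """Coefficients of f + (n+1)(T_x + T_y + T_z) + (2n+1) T_w in fresh variables."""
    return tuple(b) + (n + 1,) * 3 + (2 * n + 1,)
```

For instance, $T_x$ alone first misses 2 and $T_x + T_y$ first misses 5, so `missing(extend((1,), 2), 200)` and `missing(extend((1, 1), 5), 300)` show what remains unrepresented after padding.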
We first show that it is sufficient to determine that the generating function for $f^{(N)}$ is $2 + 2q + O(q^{N-12})$ (3.1). Assuming equation (3.1), the generating function for $g = g_n := \bigoplus_{i=1}^{n} f^{(N)}$ is $2^n\left(1 + \binom{n}{1} q + \dots + \binom{n}{n} q^n\right) + O(q^{N-12})$.
If we choose $N > n + 13$, then the first integer not represented by $g$ is $n + 1$. Therefore, we can take $f_n = g_{n-1}$; this also suffices for $n = 1$ (if we interpret the empty direct sum $g_0$ as 0).
It is important here to note how the above counterexamples differ from the proof when we only have diagonal terms, since this observation will lead us to the proof of Theorem 1.6 when $m_f$ is bounded. We will call a triangular sum with cross terms $f_Q$ (and also any corresponding $\tilde{f}_Q$) a block if the corresponding quadratic form $Q$ has an irreducible matrix. We will build an escalator lattice by escalating (as a direct sum) by a block at each step. In Section 2, the breadth each time we escalated was finite, so that the overall tree was finite. In the above proof, however, there were infinitely many inequivalent blocks which represent 1, so that the breadth is infinite. What was expressed in the above proof was that the supremum of these depths went to infinity as we chose $N$ increasing in terms of $n$ in the proof.
We will refer to the cross terms as a (cross term) configuration, and we will say that $f$ has configuration $c = (c_{ij})$. Since the matrix of $f$ is irreducible and hence the corresponding adjacency graph is connected, we can assume throughout (by a change of variables) that for each $j > 1$ there exists $i < j$ with $c_{ij} \ne 0$.

Bounded norm
We will now construct a natural norm on $f_Q$ such that restricting this norm will again give a finiteness result. Let $Q$ be a positive definite quadratic form with even cross terms in the corresponding matrix.
Remark 4.1. Note that the constant $c_{ij}$ is added every time $c_{ij} < 0$; this may not seem canonical at first, but notice that if $Q'$ is the equivalent quadratic form obtained by replacing $x_1$ with $-x_1$, then we find that this choice leads to $f_Q = f_{Q'}$.
We next define the constant $m_f$ which is added to obtain the unique (up to equivalence) normalized totally positive quadratic polynomial $f_Q = \tilde{f}_Q + m_f$ corresponding to $Q$. Thus, we can define the norm of $f_Q$. In a sense, this norm measures the distance between $f_Q$ and the closest $\tilde{f}_{Q'}$ in the equivalence class, where the distance is merely given by the absolute value of the normalization factor required. If $m_f$ is bounded, then we will again find that checking a finite subset will suffice. We may now state the following more precise version of Theorem 1.6.
A proof of the above identity using the techniques of Bhargava and Hanke [2] developed in the proof of the 290-Theorem may require a careful analysis of a possible Siegel zero. To exhibit this difficulty, consider the sum $f(x, y, z) = T_x + 2T_y + 6T_z$. In the construction of $N_{0,1}$ the computations imply that there are infinitely many $Q$ with $m_{f_Q} = 1$ for which $f \oplus f_Q$ represents every positive integer. Hence we cannot merely check each case individually and must know information about the integers represented by $f$ independently.
Although it seems that $f$ represents all odd integers, a proof of this appears to be beyond current techniques due to ineffective lower bounds for the class number (see [9]). However, since a possible Siegel zero for $L(\chi_d, s)$ would give a lower bound for the class number when $d' \ne d$ (both fundamental), one may be able to show that $f$ represents at least one of $n$ or $n - 1$ for every positive integer $n$, which would suffice for showing the above identity.
Proof. Fix a positive integer $m$. We will start with a small overview of the proof. As in the above remark, we will escalate with blocks. We will first show that when $m_f \le m$, the number of blocks that are not dimension 1 in any branch of the escalator tree is bounded, and that there are only finitely many choices for the configuration of each block. We will then proceed by defining $N(M_1, \dots, M_k, c)$ to be the smallest integer not represented by the totally positive quadratic polynomial corresponding to the diagonal terms $M_1, \dots, M_k$ and the configuration $c$. Our claim is then equivalent to showing that the supremum of $N(M_1, \dots, M_k, c)$ in the escalator tree is finite. To do so, we will effectively show that with the configurations of blocks of dimension greater than one fixed, the supremum with $M_i$ sufficiently large is finite and independent of the choice of $M_i$, and then fix $M_1 \le m_1$, and again show that the resulting supremum is independent of $M_2, \dots, M_k$, and so forth. Since there are only finitely many such choices of $c$, the result comes from taking the maximum of each of these suprema.
We begin with a lemma that will show that there are only finitely many choices of the cross term configuration.
Proof. First note that $m_{f \oplus g} = m_f + m_g$, so that we can have at most $m$ blocks $f$ with $m_f > 0$, while we will see that $m_f > 0$ unless $f$ is one dimensional (and hence the block is a constant times $T_x$). It therefore suffices to show that each block $f$ of dimension greater than one has $m_f > 0$ and those with the restriction $m_f \le m$ have bounded dimension and bounded coefficients in the configuration. Fix the configuration $c$ of a block $f$ with dimension $k$, where $f$ is a minimal element. We will recursively exhibit a particular choice of $x_i$ for which the integer represented is at most $-\max |c_{ij}|$ and at most $-(k-1)$, so that the max of the $c_{ij}$ is bounded by $m$, and the dimension is bounded by $m + 1$.
First set $x_1 = 0$. Since $f$ is a block, we know at step $j$ that there is some $i < j$ such that $c_{ij} \ne 0$. Choose $i < j$ such that $|c_{ij}|$ is maximal. If $x_i = 0$, then we set $x_j = -1$ if $c_{ij} > 0$ and $x_j = 0$ otherwise. If $x_i = -1$ then we set $x_j = 0$ if $c_{ij} > 0$ and $x_j = -1$ otherwise.
Since all of our choices of $x_i$ are 0 or $-1$ and $T_{-1} = T_0 = 0$, the integer represented is independent of the diagonal terms $M_i$. Now we note that if $x_i = x_j$, then from our definition of $f$, the cross term corresponding to $c_{ij}$ adds 0 if $c_{ij} \ge 0$ and adds $-|c_{ij}|$ otherwise. If $x_i = 0$ and $x_j = -1$, then the cross term adds $-|c_{ij}|$ if $c_{ij} \ge 0$ and adds 0 otherwise. Therefore by our construction above, we know that for $|c_{ij}|$ maximal, we have added $-|c_{ij}|$ to our sum, and we never add a positive integer, so the sum is at most $-|c_{ij}|$. Moreover, since the block is connected, we have added at most $-1$ at each inductive step, so that the sum is at most $-(k - 1)$.
For simplicity, in our escalator tree, we will "push" up all of the blocks to the top of the tree which are not dimension 1. To do so, we will first build the tree with all possible choices of blocks which are not dimension 1, and then escalate with only dimension 1 blocks from each of the nodes of the tree, including the root (the empty set). Thus, every possible form will show up in our representation. This tree (without the blocks of dimension 1) is depth at most m in the number of blocks, but is of infinite breadth. Henceforth, we can consider the configuration c to be fixed, and take the maximum over all choices of c.
We will now see that the subtree from each fixed node is of finite depth. Consider the corresponding quadratic form $Q$. First note that the generating function for $Q$ when all $x_i$ are odd is the generating function for $Q$ minus the generating function with some $x_i$ even, and the others arbitrary, which is simply another quadratic form without any restrictions, taking $x_i \to 2x_i$. Thus, we have the generating function of a difference of finitely many quadratic forms, and hence we have the Fourier expansion of a modular form. Now we simply note that the generating function of any quadratic form can be decomposed into an Eisenstein series and a cusp form (cf. [12]). Using the bounds of Tartakowsky [15] and Deligne [6], as long as the Eisenstein series is non-zero, the coefficients of the Eisenstein series can be shown to grow more quickly than the coefficients of the cusp form whenever the dimension is greater than or equal to 5, other than finitely many congruence classes for which the coefficients of both the Eisenstein series and the cusp form are zero.
Therefore, as long as the Eisenstein series is non-zero, there are only finitely many congruence classes and finitely many "sporadic" integers which are not represented by the quadratic form. Thus, after dimension 5, there are only finitely many congruence classes and finitely many sporadic integers not represented by the form $f$. If at any step of the escalation, any of the integers in these congruence classes is represented, then we have fewer congruence classes, and only finitely many more sporadic integers which are not represented, so that the resulting depth is bounded. For the dimension 1 blocks, it is clear that the breadth of each escalation is finite, so there are only finitely many escalators coming from this node. Therefore, it suffices to show that the Eisenstein series is non-zero.
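The passage from odd representations to unrestricted ones can be made explicit in two variables by inclusion-exclusion (our elaboration of the sentence above): the count with $x$ and $y$ both odd equals the count for $Q(x, y)$, minus the counts for $Q(2x, y)$ and $Q(x, 2y)$, plus the count for $Q(2x, 2y)$. The Python sketch below checks this by brute force for a positive definite binary form $ax^2 + bxy + cy^2$; the function names and the sample form are illustrative.

```python
from math import isqrt


def count(form, n, odd_only=False):
    """Number of (x, y) in Z^2 with a x^2 + b x y + c y^2 = n.

    `form` = (a, b, c) is assumed positive definite (4ac - b^2 > 0).
    With odd_only=True, only solutions with x and y both odd are counted.
    """
    a, b, c = form
    disc = 4 * a * c - b * b
    # Completing the square gives x^2 <= 4cn/disc and y^2 <= 4an/disc.
    bx = isqrt(4 * c * n // disc) + 1
    by = isqrt(4 * a * n // disc) + 1
    total = 0
    for x in range(-bx, bx + 1):
        for y in range(-by, by + 1):
            if odd_only and not (x % 2 and y % 2):
                continue
            if a * x * x + b * x * y + c * y * y == n:
                total += 1
    return total


def odd_by_inclusion_exclusion(form, n):
    """Odd-representation count expressed through unrestricted counts."""
    a, b, c = form
    return (
        count(form, n)
        - count((4 * a, 2 * b, c), n)       # x replaced by 2x
        - count((a, 2 * b, 4 * c), n)       # y replaced by 2y
        + count((4 * a, 4 * b, 4 * c), n)   # both replaced
    )
```

For example, with `form = (1, 2, 3)` the values `count(form, n, odd_only=True)` and `odd_by_inclusion_exclusion(form, n)` can be compared over a range of `n`.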
Again using Siegel's theorem [13], the Eisenstein series is simply a difference of the local densities. At every prime other than p = 2, the local densities of the quadratic forms, of which we are taking the difference, are equal, so we only need to show that the difference of the local densities at p = 2 is positive. However, the difference of the number of local representations at a fixed 2 power must be positive, since the integer is locally represented with x i odd, except possibly for finitely many congruence classes if a high 2-power divides the discriminant.
Therefore, we can define $N(M_1, \dots, M_k, c)$ to be the maximum of $N(M_1, \dots, M_k, M_{k+1}, \dots, M_l, c)$, where $M_{k+1}$ to $M_l$ are the dimension 1 blocks coming from the (finite) subtree of this node.
We will show that N (M 1 , . . . , M k , c) is independent of the choice of M i whenever M i is sufficiently large by showing that the resulting subtrees are identical. We need the following lemma to obtain this goal. We will need some notation before we proceed.
For a set $T$, define the formal power series in $q$ by $q^T := \sum_{t \in T} q^t$.
For fixed sets $S, T \subseteq \mathbb{N}$, we will say that a form $f(x) := \sum_{i=1}^{k} b_i T_{x_i}$ represents $S/T$ if for every $s \in S$ the coefficient of $q^s$ in $q^T g(q)$ is positive, where $g(q)$ is the generating function for $f(x)$ given by $g(q) := \sum_{x \in \mathbb{Z}^k} q^{f(x)}$.
Proof. We will escalate as in [1] with a slight deviation. At each escalation node, there is a least element $s \in S$ such that $S/T_1$ is not represented by the form $f$ corresponding to this node. As in [1], we shall refer to $s$ as the truant of $f$. To represent $\{s\}/T_1$, we must have some $t_1 \in T_1$ such that $s - t_1$ is represented by $f + bT_x$. Therefore, for each $t_1 < s$ we escalate with finitely many choices of $b$, and there are only finitely many choices of $t_1$. Thus, the breadth at each escalation is finite, and our argument above using modular forms shows that the depth is also finite, so there are only finitely many choices of $s \in S$ which are truants in the escalation tree. Take $S_0$ to be the set of truants in the escalation tree and define $M_{T_1,S} := \max_{s \in S_0} s + 1$. The argument above shows we may replace M i = ∞). We may now fix M 1 ≤ M (1) X 0 , since there are only finitely many such choices. With this $M_1$ fixed, we define $T_{1,1}$ as above, and again find bounds for the other $M_i$. Continuing recursively gives the desired result, since we know that $k \le m$, so there are only finitely many suprema that we take.
We finally would like to show that $\max N_{0,m} \gg m^2$. To do so, we consider again the construction of our counterexamples. Consider $f(x, y) := \bigoplus_{i=1}^{m} f^{(N)} \oplus T_y$. Since $T_r = \sum_{n=1}^{r} n$, for $N$ sufficiently large the smallest integer not represented by $f$ is clearly $T_{m+1} - 1 \gg m^2$.
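Since the coefficients of $g(q)$ are non-negative, the coefficient of $q^s$ in $q^T g(q)$ is positive exactly when $s - t$ is represented by $f$ for some $t \in T$. The following Python sketch implements this reading of "represents $S/T$" for finite sets and diagonal triangular sums; the function names are introduced here for illustration.

```python
def tri(x):
    """Triangular number T_x for x >= 0."""
    return x * (x + 1) // 2


def value_set(b, limit):
    """Integers in [0, limit] of the form sum(b_i * T_{x_i}) with x_i >= 0."""
    values = {0}
    for bi in b:
        steps = []
        x = 0
        while bi * tri(x) <= limit:
            steps.append(bi * tri(x))
            x += 1
        values = {v + s for v in values for s in steps if v + s <= limit}
    return values


def represents_quotient(b, S, T):
    """True if for every s in S there is t in T with s - t represented by b.

    S and T are finite sets of non-negative integers.
    """
    vals = value_set(b, max(S, default=0))
    return all(any((s - t) in vals for t in T) for s in S)
```

With $T = \{0\}$ this reduces to ordinary representation of $S$, and enlarging $T$ can only make the condition easier to satisfy.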