Pfister's theorem fails in the Hermitian case

We show that the Hermitian analogue of a famous result of Pfister fails. To do so we provide a Hermitian symmetric polynomial $r$ of total degree $2d$ such that no non-zero multiple of it can be written as a Hermitian sum of squares with fewer than $d+1$ squares.


Introduction
Artin's solution of Hilbert's 17th problem [A] includes the following statement. Let $r$ be a polynomial in $n$ real variables. Then $r \ge 0$ if and only if there is a polynomial $q$, not identically $0$, such that $q^2 r$ is a sum of squares of polynomials. See also [S] for recent developments. Pfister [Pf] proved that we may always choose $q$ such that the number of terms in the sum is at most $2^n$. This result is remarkable because the number of terms (the length of the sum of squares representing $q^2 r$) is independent of the degree of $r$.
See [Q], [D2], [D3] and their references for Hermitian analogues of Hilbert's problem. Related work by both authors ([D1], [DL], [L]) on applications of Hermitian symmetric polynomials to CR geometry has led us to a simple counterexample to the natural Hermitian analogue of Pfister's result.
Let $r(z,\bar w)$ be a polynomial on $\mathbb{C}^n \times \mathbb{C}^n$. Using multi-index notation we write $r(z,\bar w) = \sum_{\alpha,\beta} c_{\alpha,\beta} z^\alpha \bar w^\beta$.
We let $\mathbf{r}(r)$ denote the rank of the matrix $(c_{\alpha,\beta})$. The function $z \mapsto r(z,\bar z)$ is real-valued if and only if this matrix is Hermitian. The function $z \mapsto r(z,\bar z)$ is a squared norm, or Hermitian sum of squares, if and only if this matrix is non-negative definite.
In this case there are polynomials $p_1(z), \dots, p_k(z)$ for which
$$ r(z,\bar z) = \sum_{j=1}^{k} |p_j(z)|^2 = \|p(z)\|^2. \qquad (1) $$
By linear algebra it follows that $\mathbf{r}(r)$ is the minimum $k$ for which (1) holds. Hence one might call the rank of a squared norm its Hermitian length. Unlike in the real case, not every non-negative Hermitian polynomial $r$ divides a squared norm. Suppose however that $r$ does so; in other words, assume that there is a polynomial $s$, not identically $0$, such that $rs = \|p\|^2$. We naturally ask what bounds are possible on the rank of $\|p\|^2$. We prove below that there is no bound independent of the degree of $r$. In particular, the Hermitian analogue of Pfister's result fails.
We give the simple example now in Corollary 1.1; we also state Theorem 1.1, from which the result follows. We prove Theorem 1.1 in the next section. Corollary 1.1 concerns the one-variable polynomial $r(z,\bar z) = (1+|z|^2)^d$, the case $n = 1$ of Theorem 1.1. It shows that the Hermitian length of every nonzero squared norm divisible by $r$ is at least $d+1$, so no bound independent of the degree can hold.
Equality holds for $r$ itself: expanding by the binomial theorem writes $r$ as a squared norm of rank $d+1$. The following stronger result holds in arbitrary dimensions and immediately implies the Corollary. We write $\|z\|^2 = \sum_{j=1}^{n} |z_j|^2$.
Theorem 1.1. Set $r(z,\bar z) = (1+\|z\|^2)^d$ and let $g$ be a nonzero multiple of $r$. Then $\mathbf{r}(g) \ge \binom{n+d}{d}$, and equality is possible.
Here $\binom{n+d}{d} = M(n,d)$ equals the dimension of the vector space of polynomials of degree at most $d$ in $n$ variables.
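The one-variable case of Theorem 1.1 can be explored by direct computation. The following is a minimal sketch, not part of the original argument: it stores a polynomial in $z$ and $\bar z$ as a dictionary mapping exponent pairs $(a,b)$ to the coefficient of $z^a \bar z^b$, and computes the rank of the coefficient matrix by exact elimination. The helper names (`multiply`, `coefficient_matrix`, `rank`, `hermitian_rank`, `base`) and the sample multipliers are illustrative assumptions, and only rational coefficients are handled.

```python
from fractions import Fraction
from math import comb


def multiply(p, q):
    """Multiply polynomials in (z, conj z) stored as {(a, b): coefficient}."""
    out = {}
    for (a1, b1), c1 in p.items():
        for (a2, b2), c2 in q.items():
            key = (a1 + a2, b1 + b2)
            out[key] = out.get(key, 0) + c1 * c2
    return {k: c for k, c in out.items() if c != 0}


def coefficient_matrix(p):
    """Square matrix whose (a, b) entry is the coefficient of z^a conj(z)^b."""
    size = 1 + max(max(a, b) for a, b in p)
    m = [[Fraction(0)] * size for _ in range(size)]
    for (a, b), c in p.items():
        m[a][b] = Fraction(c)
    return m


def rank(matrix):
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [row[:] for row in matrix]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            if rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def hermitian_rank(p):
    """Rank of the coefficient matrix of a nonzero polynomial p."""
    return rank(coefficient_matrix(p))


def base(d):
    """(1 + |z|^2)^d in one variable, expanded by the binomial theorem."""
    return {(j, j): comb(d, j) for j in range(d + 1)}


d = 3
# Hypothetical sample multipliers s; any nonzero s could be used here.
multipliers = {
    "1": {(0, 0): 1},
    "z + conj(z)": {(1, 0): 1, (0, 1): 1},
    "1 - |z|^2": {(0, 0): 1, (1, 1): -1},
}
for name, s in multipliers.items():
    print(name, hermitian_rank(multiply(base(d), s)))
```

Each printed rank should be at least $d + 1 = 4$, in agreement with the theorem. The multiplier $1$ gives the equality case, since the coefficient matrix of $(1+|z|^2)^d$ is diagonal with entries $\binom{d}{j}$.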
Alternatively we can bihomogenize $r$ and use $\|Z\|^{2d}$, where $Z = (z_1, \dots, z_n, z_{n+1})$. The restated conclusion is then that a nonzero multiple of $\|Z\|^{2d}$ must have rank at least $N(n+1,d)$, where $N(n+1,d)$ denotes the dimension of the space of homogeneous polynomials of degree $d$ in $n+1$ variables. Note that $N(n,k)$ equals the binomial coefficient $\binom{n+k-1}{k}$, namely the rank of the function $\|z\|^{2k}$. Note also that $N(n+1,d) = M(n,d)$.
The homogenized version of Theorem 1.1 holds in the real-analytic case as well. See Theorem 2.1, which generalizes a well-known lemma of Huang. Huang [H] proved the following. Let $f_1, \dots, f_k, g_1, \dots, g_k$ be holomorphic functions defined near the origin in $\mathbb{C}^m$ and vanishing there, and suppose that the expression
$$ \sum_{j=1}^{k} f_j(z)\,\overline{g_j(z)} \qquad (2) $$
is divisible by $\|z\|^2$. If (2) is not identically zero, then $k \ge m$. Huang's lemma is equivalent to the special case of Theorem 2.1 when $d = 1$.
The authors acknowledge support from NSF grants DMS 07-53978 (JPD) and DMS 09-00885 (JL). They also wish to thank AIM for the workshop on CR Complexity Theory in 2010; preparing for that workshop helped lead us to this result. Finally we thank Martin Harrison and the referee for pointing out several places where the exposition needed improvement.

Proof of Theorem 1.1
In this paper we assume the polynomials used have complex coefficients. We note however that Proposition 2.1 below holds for polynomials over an arbitrary field of characteristic zero. First we prove the analogue of Theorem 1.1 when the matrix of coefficients is diagonal. To finish the proof of Theorem 1.1 we reduce the general case to the diagonal case.
Proposition 2.1. Put $s(x) = \sum_{j=1}^{n} x_j$. Let $p(x)$ be a homogeneous polynomial and suppose $p$ is a multiple of $s^d$. Then either $p = 0$ or $p$ has at least $N(n,d)$ monomials.
Proof. We first observe that the result is trivial when $n = 1$, as $N(1,d) = 1$ for all $d$. We next consider the case $n = 2$. Note that $N(2,d) = d+1$ for all $d$. After dehomogenizing, it suffices to prove the following statement in one variable, for which we have found two proofs. If the polynomial $p$ defined by
$$ p(x) = (1+x)^d q(x) $$
is not identically zero, then $p$ has at least $d+1$ terms.
The first proof is by the method of descent. Suppose that there is an integer $d$ and polynomials $q$ and $r$ such that $r(x) = (1+x)^d q(x)$, and such that $r$ is not identically zero and has at most $d$ terms. Then there is a smallest such $d$. If the resulting polynomial $q$ is divisible by $x$, then $r$ also is, and we divide both sides by $x$; this does not change the number of terms of $r$. We may therefore assume that either $q$ is identically zero, or that $q$ has a nonzero constant term. In the second situation $r$ also has a nonzero constant term, and we differentiate both sides to obtain
$$ r'(x) = (1+x)^{d-1}\bigl( d\,q(x) + (1+x)\,q'(x) \bigr). $$
Now $r'$ has at most $d-1$ terms, and it is divisible by $(1+x)^{d-1}$. It is not identically zero when $d \ge 1$, because $r$ vanishes at $-1$ but not at $0$ and hence is not constant. Hence there is an example with $d$ replaced by $d-1$. Since there is no example for $d = 0$ other than $r$ being identically $0$, we conclude that there is no $d$ at all for which $r$ is not identically $0$.
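The one-variable statement can be checked by brute force for small $d$. The sketch below is only an illustration under stated assumptions: polynomials are lists of integer coefficients in increasing degree, and the search box (degree of $q$ at most 4, coefficients between $-2$ and $2$) is an arbitrary choice. The function names are hypothetical.

```python
from itertools import product


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def one_plus_x_power(d):
    """Coefficient list of (1 + x)^d."""
    result = [1]
    for _ in range(d):
        result = poly_mul(result, [1, 1])
    return result


def term_count(p):
    """Number of nonzero coefficients of p."""
    return sum(1 for c in p if c != 0)


def min_terms(d, max_degree=4, coeff_range=range(-2, 3)):
    """Fewest terms of (1 + x)^d q(x) over all nonzero q in the search box."""
    base = one_plus_x_power(d)
    best = None
    for coeffs in product(coeff_range, repeat=max_degree + 1):
        if not any(coeffs):
            continue  # skip q = 0
        count = term_count(poly_mul(base, list(coeffs)))
        best = count if best is None else min(best, count)
    return best


# (1 + x)^3 (1 - x) = 1 + 2x - 2x^3 - x^4 has exactly d + 1 = 4 terms.
print(poly_mul(one_plus_x_power(3), [1, -1]))

for d in range(5):
    print(d, min_terms(d))
```

Within this search box the minimum equals $d+1$ for each $d$, since $q = 1$ already attains it. A finite search of this kind is of course no substitute for the descent argument.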
The second proof is more complicated. Let $p$ be an arbitrary polynomial of degree $m+d$. We write $p(x)$ in two ways:
$$ p(x) = \sum_{j=0}^{m+d} a_j x^j = \sum_{k=0}^{m+d} c_k (1+x)^k. $$
The two formulas amount to different choices of basis in the space of polynomials of degree at most $m+d$. The mapping $L : \mathbb{C}^{m+d+1} \to \mathbb{C}^{m+d+1}$ taking the column vector $c = (c_0, \dots, c_{m+d})$ into $(a_0, \dots, a_{m+d})$ is linear. The entry in the $j$-th row and $k$-th column of its matrix is the binomial coefficient $\binom{k}{j}$, for $0 \le j, k \le m+d$, where as usual we set $\binom{k}{j}$ equal to $0$ if $k < j$. We also set $\binom{k}{0} = 1$. The matrix of $L$ is upper triangular, and all diagonal entries are equal to $1$. Thus $L$ is invertible. The condition that $p$ be divisible by $(1+x)^d$ amounts to saying that $c_j = 0$ for $0 \le j < d$. We therefore consider the matrix $L'$ obtained from $L$ by deleting the first $d$ columns. We claim that any square submatrix of $L'$ of size $m+1$ is invertible. We omit the details of this claim; in fact the best proof of the claim is to use the first proof above.
If $p$ had fewer than $d+1$ terms, then at least $m+d+1-d = m+1$ of the $a_j$ would vanish. Hence there is an $(m+1) \times (m+1)$ submatrix of $L'$ annihilating the column vector $c' = (c_d, \dots, c_{m+d})$. Since such matrices are invertible, we obtain $c' = 0$. Because $c_j = 0$ for $j < d$, it follows that $c = 0$, and therefore $p = 0$. Therefore the only element divisible by $(1+x)^d$ with fewer than $d+1$ terms is $0$.
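The claim about square submatrices of $L'$ can be tested numerically for small $m$ and $d$. The following sketch assumes the convention that the entry of $L$ in row $j$ and column $k$ is $\binom{k}{j}$; the function names are illustrative, and the determinant is computed exactly with rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def binomial_matrix(size):
    """L[j][k] = C(k, j); math.comb returns 0 when j > k."""
    return [[comb(k, j) for k in range(size)] for j in range(size)]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n = len(rows)
    det = Fraction(1)
    for col in range(n):
        pivot = next((i for i in range(col, n) if rows[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det  # a row swap flips the sign
        det *= rows[col][col]
        for i in range(col + 1, n):
            f = rows[i][col] / rows[col][col]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[col])]
    return det


def all_square_submatrices_invertible(m, d):
    """Check every (m+1) x (m+1) submatrix of L' (L with first d columns deleted)."""
    size = m + d + 1
    L = binomial_matrix(size)
    L_prime = [row[d:] for row in L]
    for chosen_rows in combinations(range(size), m + 1):
        if determinant([L_prime[j] for j in chosen_rows]) == 0:
            return False
    return True


for m in range(4):
    for d in range(4):
        print(m, d, all_square_submatrices_invertible(m, d))
```

Every case in this small range should report `True`, which is consistent with the claim but does not prove it in general.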
We use this result as an induction step. Assume that we have proved the result in dimension $n$. Let $y = (y_1, \dots, y_n)$, and put $s = \sum_{k=1}^{n} y_k$. The induction hypothesis guarantees that a nonzero polynomial divisible by $s^j$ has at least $N(n,j)$ terms.
Let $p$ be homogeneous of degree $m+d$ in the $n+1$ variables $(y,x)$. Assume that $p$ is a multiple of $(x+s)^d$. We wish to show that the number of distinct monomials in $p$ is at least $N(n+1,d)$. By dehomogenization we have
$$ N(n+1,d) = \sum_{j=0}^{d} N(n,j). $$
We expand $p$ in two ways, writing
$$ p(x,y) = (x+s)^d \sum_{j=0}^{m} h_j(y)\,x^j = \sum_{j=0}^{m+d} A_j(y)\,x^j. \qquad (5) $$
First we note that, after dividing through by a power of $x$, we may assume without loss of generality that $h_0(y) \ne 0$, and hence that $A_0(y) = s^d h_0(y) \ne 0$.
Next we claim that at least $d+1$ of the $A_j$ in (5) are not zero. To verify the claim, replace $x$ by $ws$ in (5). Using homogeneity, we obtain a polynomial in the single variable $w$ that is divisible by $(1+w)^d$. Hence the claim follows from the one variable case proved above. Hence there exist $d+1$ integers such that
$$ 0 \le j_0 < j_1 < \cdots < j_d \qquad (6) $$
and for which $A_{j_k} \ne 0$. At each stage we choose the integers in (6) minimally. Next we note that $s^{d-k}$ divides $A_{j_k}$. This result holds by expanding the middle term in (5) by the binomial theorem, which yields an explicit formula for the $A_j(y)$, and then proceeding inductively using the minimality.
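Each expression $A_{j_k}(y)\,x^{j_k}$ is divisible by a different power of $x$ and hence all the resulting terms are distinct. By the induction hypothesis, $A_{j_k}$ has at least $N(n,d-k)$ terms. Therefore if $K$ is the number of terms in $p(x,y)$ then
$$ K \ge \sum_{k=0}^{d} N(n,d-k) = N(n+1,d). \qquad (7) $$
The inequality $K \ge N(n+1,d)$ from (7) completes the induction step.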
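The induction step rests on the counting identity $N(n+1,d) = \sum_{j=0}^{d} N(n,j) = M(n,d)$, and the conclusion of Proposition 2.1 can be sampled directly in several variables. The sketch below is illustrative only: polynomials are dictionaries from exponent tuples to integer coefficients, the multipliers $q$ are restricted to coefficients in $\{-1, 0, 1\}$, and all function names are hypothetical.

```python
from itertools import product
from math import comb


def N(n, d):
    """Number of monomials of degree exactly d in n variables."""
    return comb(n + d - 1, d)


def M(n, d):
    """Number of monomials of degree at most d in n variables."""
    return comb(n + d, d)


def exponents(n, degree):
    """All exponent tuples of length n with the given total degree."""
    if n == 1:
        return [(degree,)]
    return [(i,) + rest
            for i in range(degree + 1)
            for rest in exponents(n - 1, degree - i)]


def multiply(p, q):
    """Multiply polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            key = tuple(a + b for a, b in zip(e1, e2))
            out[key] = out.get(key, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}


def s_power(n, d):
    """(x_1 + ... + x_n)^d."""
    s = {tuple(1 if i == j else 0 for i in range(n)): 1 for j in range(n)}
    result = {(0,) * n: 1}
    for _ in range(d):
        result = multiply(result, s)
    return result


def min_monomials(n, d, q_degree, coeff_range=(-1, 0, 1)):
    """Fewest monomials of s^d q over nonzero homogeneous q in the search box."""
    base = s_power(n, d)
    monos = exponents(n, q_degree)
    best = None
    for coeffs in product(coeff_range, repeat=len(monos)):
        q = {e: c for e, c in zip(monos, coeffs) if c != 0}
        if not q:
            continue  # skip q = 0
        count = len(multiply(base, q))
        best = count if best is None else min(best, count)
    return best


# Counting identity used in the induction step.
print(all(N(n + 1, d) == sum(N(n, j) for j in range(d + 1)) == M(n, d)
          for n in range(1, 6) for d in range(6)))

# Three variables, d = 2: compare the sampled minimum with N(3, 2).
print(min_monomials(3, 2, 1), N(3, 2))
```

A monomial multiplier $q$ always attains the bound, because every monomial of degree $d$ appears in $s^d$ with a positive coefficient.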
In the proof we established the following statement. If $P(t)$ is a nonzero multiple of $(1+t)^d$, then $P$ has at least $d+1$ nonzero terms. We then used this result in the inductive step.
Proposition 2.1 is the special case of the general situation when the matrix of coefficients is diagonal. It is somewhat analogous to the degree estimates proved in [DLP]. To pass from the diagonal case to the general case we replace the number of monomials occurring in a polynomial with the rank of the polynomial. We recall the needed linear algebra.
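Let $R$ be a real-analytic function defined near the origin in $\mathbb{C}^n$. Near the origin we write
$$ R(z,\bar z) = \sum_{\alpha,\beta} c_{\alpha,\beta}\, z^\alpha \bar z^\beta. $$
Its rank $\mathbf{r}(R)$ is defined to be the rank of the possibly infinite matrix of coefficients $(c_{\alpha,\beta})$. The rank is the minimum number $k$ of linearly independent local holomorphic functions $f_j$ and $g_j$ for which we can write
$$ R(z,\bar z) = \sum_{j=1}^{k} f_j(z)\,\overline{g_j(z)}. \qquad (9) $$
We allow $k$ to take the value $\infty$.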
We clarify the connection with the diagonal case. Suppose, for each $j$, that there is a multi-index $\alpha_j$ such that $f_j(z) = c_j z^{\alpha_j}$ and $g_j(z) = z^{\alpha_j}$. After setting $x_j = |z_j|^2$ and using multi-index notation, we can rewrite (9) in the form
$$ R = \sum_{j} c_j\, x^{\alpha_j}, \qquad (10) $$
where the $c_j$ are non-zero constants and the $\alpha_j$ are distinct multi-indices. Then $\mathbf{r}(R)$ equals the number of nonzero monomials in (10).
The following result includes Theorem 1.1 as a special case when R is a polynomial.
Theorem 2.1. Let $R(z,\bar z)$ be a real-analytic function defined near the origin in $\mathbb{C}^n$. Suppose that $R$ is not identically zero, and that $R$ is a multiple of $\|z\|^{2d}$. Then the rank of $R$ is at least $N(n,d)$. Equality is possible: for example equality holds when $R = \|z\|^{2d}$.
Proof. When $n = 1$ the result is trivial, as $N(1,d) = 1$ for all $d$, and we are saying only that $R$ is not identically $0$. The special case $d = 1$ corresponds to Huang's lemma, but our argument is considerably different even in this case.
Write $R(z,\bar z) = \|z\|^{2d}\, r(z,\bar z)$, where $r$ is real-analytic in some neighborhood of the origin. Consider the lowest order part $u$ of the Taylor expansion of $r$ at the origin. Note that the lowest order part of $\|z\|^{2d} r$ is given by $\|z\|^{2d} u$. Since the matrix of coefficients of $\|z\|^{2d} u$ is a submatrix of the matrix of coefficients of $R$, we get $\mathbf{r}(\|z\|^{2d} r) \ge \mathbf{r}(\|z\|^{2d} u)$.
We may therefore assume that
$$ R(z,\bar z) = \|z\|^{2d}\, u(z,\bar z), $$
where $u$ is homogeneous in the $z$ and $\bar z$ variables. We write
Now that we are in the polynomial case it is convenient to dehomogenize. We use different notation. Assume that $z \in \mathbb{C}^{n-1}$ and put $p(z,\bar z) = r(z,\bar z)\,(1+\|z\|^2)^d$.
We must show that $p$ cannot be written as a squared norm with fewer terms than $M(n-1,d) = N(n,d)$. In other words, we wish to find a lower bound on the rank of the matrix of coefficients of $p$.
We decompose $p(z,\bar z)$ according to the following formula:
If only the first sum is nonzero (that is, if the matrix of coefficients is diagonal), then we are already in the case of Proposition 2.1, and the conclusion holds. Therefore we assume that one of the other two terms is nonzero, and without loss of generality we suppose the second term is non-zero. (In the Hermitian case the terms are conjugates of each other.) The above decomposition is invariant under multiplication by $1+\|z\|^2$ and hence by $(1+\|z\|^2)^d$. That is, if we decompose $r$ as above and multiply each term by $(1+\|z\|^2)^d$, we get the corresponding decomposition of $p$.
Next, impose a monomial order on the multi-indices $\alpha$. That is, we have a total well ordering on all monomials that respects multiplication: if $\alpha < \beta$ then $\alpha + \gamma < \beta + \gamma$. For example, lexicographical ordering will suffice. In this ordering, find the largest $\alpha$ such that $\sum_\beta c_{\beta+\alpha,\beta}\,|z^\beta|^2$ is nonzero. Let $\beta$ range over those indices for which $c_{\beta+\alpha,\beta}$ is nonzero. We note that the vectors $[c_{\beta+\alpha,*}]$ are linearly independent, because $c_{\beta+\alpha,\gamma}$ must be zero for $\gamma < \beta$ by the extremality of $\alpha$.
Therefore $\mathbf{r}(p)$, the rank of the matrix of coefficients, is bounded below by the number of nonzero terms in $\sum_\beta c_{\beta+\alpha,\beta}\,|z^\beta|^2$.