Keywords: canonical forms, higher rank numerical range, convexity, totally isotropic subspace, matrix equations

Results on matrix canonical forms are used to give a complete description of the higher rank numerical range of matrices arising from the study of quantum error correction. It is shown that the set can be obtained as the intersection of closed half planes in the complex plane. As a result, it is always a convex subset of $\mathbb{C}$. Moreover, the higher rank numerical range of a normal matrix is a convex polygon determined by the eigenvalues. These two consequences confirm the conjectures of Choi et al. on the subject. In addition, the results are used to derive a formula for the optimal upper bound for the dimension of a totally isotropic subspace of a square matrix, and to verify the solvability of certain matrix equations.


Introduction
Let M_n be the algebra of n × n complex matrices. In [3], the authors introduced the notion of the rank-k numerical range of A ∈ M_n, defined and denoted by Λ_k(A) = {λ ∈ C : PAP = λP for some rank-k orthogonal projection P}, in connection with the study of quantum error correction; see [4]. Evidently, λ ∈ Λ_k(A) if and only if there is a unitary matrix U ∈ M_n such that U*AU has λI_k as the leading principal submatrix. When k = 1, this concept reduces to the classical numerical range, which is well known to be convex by the Toeplitz-Hausdorff theorem; for example, see [8] for a simple proof. In [1] the authors conjectured that Λ_k(A) is convex, and reduced the convexity problem to the problem of showing that 0 ∈ Λ_k(T) for T = \begin{bmatrix} I_k & X \\ Y & -I_k \end{bmatrix} with arbitrary X, Y ∈ M_k. They further reduced this problem to the existence of a Hermitian matrix H satisfying the matrix equation (1.1) for arbitrary M ∈ M_k and positive definite P ∈ M_k. In [12], the author observed that equation (1.1) can be rewritten as the continuous Riccati equation (1.2), and existing results on Riccati equations ensure its solvability; for example, see [7, Theorem 4]. This establishes the convexity of Λ_k(A).
Denote by λ_k(H) the kth largest eigenvalue of the Hermitian matrix H ∈ M_n. We will use results on canonical forms of complex square matrices to show that
Λ_k(A) = ⋂_{ξ∈[0,2π)} {µ ∈ C : e^{iξ}µ + e^{-iξ}\bar{µ} ≤ λ_k(e^{iξ}A + e^{-iξ}A*)}.
Thus, Λ_k(A) is the intersection of closed half planes in the complex plane, and therefore a convex set. Furthermore, specializing our result to normal matrices confirms the conjecture in [2] asserting that
Λ_k(A) = ⋂_{1≤j_1<⋯<j_{n−k+1}≤n} conv{λ_{j_1}, …, λ_{j_{n−k+1}}}
if A ∈ M_n is a normal matrix with eigenvalues λ_1, …, λ_n. In addition, from our results one can derive a formula for the optimal upper bound for the dimension of a totally isotropic subspace of a square matrix. As shown in [1], the convexity of the higher rank numerical range is closely related to the study of the solvability of matrix equations. Following the idea in [1], we study the solvability of certain matrix equations, including those of the form (1.1), (1.2) and
I_k + RZ + Z*R* − Z*Z = 0_k   (1.3)
for a given k × k matrix R. In particular, it is shown that there is always a common solution Z satisfying a pair of equations of the form (1.3). In other words, given two matrices R, S ∈ M_k, the operator spheres {Z ∈ M_k : |Z − R*| = (I_k + RR*)^{1/2}} and {Z ∈ M_k : |Z − S*| = (I_k + SS*)^{1/2}} always have non-empty intersection; here |X| denotes the positive semidefinite square root of X*X.
The following results on canonical forms of matrices will be used in our discussion; for example, see [11] and [6].
I. QR decomposition: For every A ∈ M_n, there is a unitary matrix Q ∈ M_n and an upper triangular matrix R ∈ M_n such that A = QR.

II. CS decomposition: For every unitary U ∈ M_{2k}, there are unitary matrices V_1, V_2, W_1, W_2 ∈ M_k such that (V_1 ⊕ V_2)U(W_1 ⊕ W_2) = \begin{bmatrix} C & S \\ -S & C \end{bmatrix}, where C = diag(c_1, …, c_k) and S = diag(s_1, …, s_k) with 1 ≥ c_1 ≥ ⋯ ≥ c_k ≥ 0 and c_j^2 + s_j^2 = 1 for each j.

III. *-congruence canonical form: For every A ∈ M_n, there is an invertible S ∈ M_n such that S*AS is a direct sum of the following three types of matrices (denoted by Γ_{2r}(µ), J_s(0), and e^{iξ}Δ_t in Section 2).
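As a numerical aside (added for illustration, not part of the original text), the CS decomposition in II can be computed with SciPy's `scipy.linalg.cossin`; the random unitary, seed, and tolerances below are arbitrary illustrative choices.

```python
import numpy as np
from scipy.linalg import cossin

# Build a random unitary U in M_{2k} via QR of a random complex matrix.
rng = np.random.default_rng(7)
k = 3
G = rng.standard_normal((2 * k, 2 * k)) + 1j * rng.standard_normal((2 * k, 2 * k))
U, _ = np.linalg.qr(G)          # Q-factor is unitary

# CS decomposition with partition sizes p = q = k:
# U = (U1 (+) U2) @ cs @ (V1 (+) V2)^*, where cs is built from the
# diagonal cosine/sine blocks C and S with c_j^2 + s_j^2 = 1.
u, cs, vdh = cossin(U, p=k, q=k)

assert np.allclose(u @ cs @ vdh, U)                              # reconstruction
assert np.allclose(u[:k, k:], 0) and np.allclose(u[k:, :k], 0)   # block diagonal
c = np.abs(np.diag(cs[:k, :k]))
assert np.all(c <= 1 + 1e-12)                                    # cosines in [0, 1]
```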

Higher rank numerical range
Definition 2.1. For A ∈ M_n, let Ω_k(A) be the set of µ ∈ C such that for each ξ ∈ [0, 2π), the Hermitian matrix e^{iξ}(A − µI_n) + e^{-iξ}(A − µI_n)* has at least k nonnegative eigenvalues. In particular, if λ_k(H) denotes the kth largest eigenvalue of a Hermitian matrix H ∈ M_n, then
Ω_k(A) = ⋂_{ξ∈[0,2π)} {µ ∈ C : e^{iξ}µ + e^{-iξ}\bar{µ} ≤ λ_k(e^{iξ}A + e^{-iξ}A*)}.
When k = 1, it is well known that the classical numerical range Λ_1(A) can be obtained by intersecting the closed half planes
{µ ∈ C : e^{iξ}µ + e^{-iξ}\bar{µ} ≤ λ_1(e^{iξ}A + e^{-iξ}A*)},  ξ ∈ [0, 2π).
We will show that Λ_k(A) = Ω_k(A), which extends the classical result. In particular, one can easily write a computer program to draw the boundary ∂Ω_k(A) of Ω_k(A), and it is clear that for A ∈ M_n, the convex curve ∂Ω_k(A) lies inside the convex curve ∂Ω_{k−1}(A) if k > 1.
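A sketch of such a program (added for illustration; the grid size and test matrix are arbitrary choices): each direction ξ gives the support value λ_k(e^{iξ}A + e^{-iξ}A*)/2 for the half plane Re(e^{iξ}µ) ≤ λ_k(e^{iξ}A + e^{-iξ}A*)/2, and intersecting consecutive support lines approximates the boundary.

```python
import numpy as np

def support(A, k, xi):
    """Support value d(xi): k-th largest eigenvalue of e^{i xi} A + e^{-i xi} A^*, halved."""
    H = np.exp(1j * xi) * A + np.exp(-1j * xi) * A.conj().T
    w = np.linalg.eigvalsh(H)          # eigenvalues in ascending order
    return w[-k] / 2                   # k-th largest, halved

def boundary_points(A, k, m=360):
    """Approximate boundary of Omega_k(A) by intersecting consecutive support lines."""
    xis = np.linspace(0, 2 * np.pi, m, endpoint=False)
    d = np.array([support(A, k, xi) for xi in xis])
    pts = []
    for j in range(m):
        x1, x2 = xis[j], xis[(j + 1) % m]
        # Solve Re(e^{i x1} mu) = d1, Re(e^{i x2} mu) = d2 for mu = x + iy.
        M = np.array([[np.cos(x1), -np.sin(x1)], [np.cos(x2), -np.sin(x2)]])
        x, y = np.linalg.solve(M, [d[j], d[(j + 1) % m]])
        pts.append(x + 1j * y)
    return np.array(pts)

# Hermitian sanity check: for A = diag(4, 3, 2, 1) and k = 2 the set should
# be the interval [2, 3]: the supports at xi = 0 and xi = pi give
# Re(mu) <= 3 and Re(mu) >= 2, respectively.
A = np.diag([4.0, 3.0, 2.0, 1.0]).astype(complex)
assert np.isclose(support(A, 2, 0.0), 3.0)
assert np.isclose(support(A, 2, np.pi), -2.0)
```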
If A is Hermitian, then we have the nested intervals Ω_1(A) ⊇ Ω_2(A) ⊇ ⋯, where
Ω_k(A) = [λ_{n−k+1}(A), λ_k(A)]
(understood to be empty if λ_{n−k+1}(A) > λ_k(A)). If A is normal with eigenvalues λ_1, …, λ_n, then
Ω_k(A) = ⋂_{1≤j_1<⋯<j_{n−k+1}≤n} conv{λ_{j_1}, …, λ_{j_{n−k+1}}}.
Recall that we use λ_k(H) to denote the kth largest eigenvalue of a Hermitian matrix H ∈ M_n. Our main theorem is the following.

Theorem 2.2. Let A ∈ M_n and 1 ≤ k ≤ n. Then Λ_k(A) = Ω_k(A).
Since the intersection of half planes in C is a convex set, the following corollary is immediate.

Corollary 2.3. For every A ∈ M_n and 1 ≤ k ≤ n, the set Λ_k(A) is convex.
By the discussion on normal matrices before Theorem 2.2, we have the following corollary confirming the conjecture in [2].

Corollary 2.4. Let A ∈ M_n be a normal matrix with eigenvalues λ_1, …, λ_n. Then
Λ_k(A) = ⋂_{1≤j_1<⋯<j_{n−k+1}≤n} conv{λ_{j_1}, …, λ_{j_{n−k+1}}}.

To prove the theorem, we need the following lemma, which can be found in [1]. We give a short proof using the QR decomposition.

Lemma 2.5. Let A ∈ M_n. Then 0 ∈ Λ_k(A) if and only if there is an invertible S ∈ M_n such that S*AS has 0_k as the leading k × k principal submatrix.
Proof. The implication "⇒" is clear. Conversely, suppose there is an invertible S ∈ M n such that S * AS has 0 k as the leading k × k principal submatrix. By the QR decomposition, S = U R, where U is unitary and R is upper triangular. Since R −1 is also in upper triangular form, we see that U * AU = (R −1 ) * (S * AS)R −1 also has 0 k as the leading principal submatrix.
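A quick numerical check of this QR argument (added for illustration; the matrices and tolerance are arbitrary choices): starting from a matrix B with a zero leading block and an invertible congruence S, the unitary Q-factor of S produces a unitary congruence with the same zero block.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2

# B has a zero leading k x k block; A is congruent to B via the invertible S.
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B[:k, :k] = 0
S = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))  # invertible a.s.
Sinv = np.linalg.inv(S)
A = Sinv.conj().T @ B @ Sinv       # so that S^* A S = B

U, R = np.linalg.qr(S)             # S = UR, U unitary, R upper triangular
T = U.conj().T @ A @ U             # equals (R^{-1})^* B R^{-1}

# Since R^{-1} is upper triangular, the leading k x k block stays zero.
assert np.linalg.norm(T[:k, :k]) < 1e-8
```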
We divide the proof of Theorem 2.2 into three lemmas. In particular, the construction in Lemmas 2.7 and 2.8 can be done explicitly using the results in [5,6] and the QR decomposition (which involves only the Gram-Schmidt process). Thus, for every µ ∈ Λ_k(A), one can construct a unitary matrix U such that U*AU has µI_k as the leading principal submatrix.
Lemma 2.6. Let B ∈ M_n with 0 ∈ Λ_k(B). Then for each ξ ∈ [0, 2π), the Hermitian matrix e^{iξ}B + e^{-iξ}B* has at least k nonnegative eigenvalues.

Proof. Since 0 ∈ Λ_k(B), the matrix B is unitarily similar to a matrix with 0_k as the leading principal submatrix, and hence e^{iξ}B + e^{-iξ}B* is unitarily similar to a matrix with 0_k as the leading principal submatrix. By the interlacing inequalities (for example, see [5]), e^{iξ}B + e^{-iξ}B* has at least k nonnegative eigenvalues.
Lemma 2.7. Let B ∈ M_n be a normal matrix such that e^{iξ}B + e^{-iξ}B* has at least k nonnegative eigenvalues for each ξ ∈ [0, 2π). Then 0 ∈ Λ_k(B).

Proof. We prove the result by induction on k. If k = 1, then the given condition ensures that 0 lies in the convex hull of the eigenvalues of B. Suppose V ∈ M_n is unitary such that V*BV = diag(b_1, …, b_n), and p_1, …, p_n are nonnegative real numbers summing up to 1 such that Σ_{j=1}^n p_j b_j = 0. Then u = V(√p_1, …, √p_n)^t is a unit vector such that u*Bu = 0. Choose a unitary matrix U ∈ M_n with u as its first column. Then U*BU has zero as the (1, 1) entry. So, the result holds for k = 1. [One can also use the convexity of the classical numerical range to get the conclusion. We include the argument so that the proof is independent of other convexity results.] Assume that k > 1 and the result is valid for the rank-m numerical range of normal matrices whenever m < k. If B has an eigenvalue equal to 0, then there is a unitary V ∈ M_n such that V*BV = [0] ⊕ B_1, where B_1 ∈ M_{n−1} is normal and e^{iξ}B_1 + e^{-iξ}B_1* has at least k − 1 nonnegative eigenvalues for any ξ ∈ [0, 2π). By the induction assumption, there is a unitary U ∈ M_{n−1} such that U*B_1U has 0_{k−1} as the leading principal submatrix. Then 0_k will be a leading principal submatrix of ([1] ⊕ U)*V*BV([1] ⊕ U). Now, assume that B is invertible. Then k ≤ n/2. Suppose there is a pair of eigenvalues of B, say λ_1 and λ_2, satisfying λ_1/|λ_1| = e^{iθ} and λ_2/|λ_2| = −e^{iθ} = e^{i(θ+π)} for some θ ∈ [0, 2π). Then there is a unitary V ∈ M_n such that V*BV = B_1 ⊕ B_2 with B_1 = diag(λ_1, λ_2). Note that for each ξ ∈ [0, 2π), e^{iξ}B_1 + e^{-iξ}B_1* has at least 1 nonnegative eigenvalue and e^{iξ}B_2 + e^{-iξ}B_2* has at least k − 1 nonnegative eigenvalues. By the induction assumption, there are unitary U_1 ∈ M_2 and U_2 ∈ M_{n−2} such that U_1*B_1U_1 and U_2*B_2U_2 have 0_1 and 0_{k−1} as their leading principal submatrices, respectively. Let U = U_1 ⊕ U_2. Then 0_k will be a principal submatrix of U*V*BVU lying in rows and columns 1, 3, 4, …, k + 1. Thus, 0 ∈ Λ_k(B).
Continue to assume that B is invertible; assume in addition that no pair of eigenvalues of B have arguments θ and θ + π.
Claim There is an invertible S ∈ M n such that S * BS has 0 k as the leading principal submatrix.
Once the claim is proved, we see that 0 ∈ Λ k (B) by Lemma 2.5, and the induction proof will be complete.
To prove the claim, let ξ ∈ [0, 2π) be such that e^{iξ}B + e^{-iξ}B* has the smallest number of nonnegative eigenvalues, say, k′. Then k′ ≥ k. We may assume that k = k′. Furthermore, we may assume that ξ = 0; otherwise, replace B by e^{iξ}B. Apply a *-congruence to B and assume that B = H + iG with H = I_k ⊕ −I_{n−k} and G = diag(g_1, …, g_n), where g_1 ≥ ⋯ ≥ g_k and g_{k+1} ≥ ⋯ ≥ g_n. Note that the given assumption on B ensures that (i) for every straight line passing through the origin, there are at least k eigenvalues of B lying in each of the closed half planes determined by the line, and (ii) there is no pair of eigenvalues of B having arguments θ and θ + π. We claim that −g_n > g_1. Suppose not; then −g_n ≤ g_1, and since condition (ii) holds, in fact −g_n < g_1. The line L passing through 0 and the eigenvalue 1 + ig_1 of B then divides the plane into two parts so that k of the eigenvalues of B, namely, 1 + ig_1, …, 1 + ig_k, lie in the closed half plane below L, and all other eigenvalues lie in the open half plane above L. We may then rotate L in the clockwise direction by a very small angle so that at most k − 1 of the eigenvalues of B, namely, 1 + ig_2, …, 1 + ig_k, lie in the closed half plane below the resulting line, contradicting condition (i).
Similarly, we can argue that −g_{n−1} > g_2. Suppose not; then −g_{n−1} ≤ g_2, and by condition (ii), −g_{n−1} < g_2. We can then rotate the line passing through 0 and 1 + ig_2 in the clockwise direction by a very small angle so that at most k − 1 eigenvalues of B, namely, 1 + ig_3, …, 1 + ig_k and −1 + ig_n, lie in the closed half plane below the resulting line, contradicting condition (i).
Thus, the leading 2k × 2k submatrix of (I k ⊕ V ) * B(I k ⊕ V ) equals Then the leading 2k × 2k submatrix of W * (I k ⊕ V ) * B(I k ⊕ V )W equals So, the claim holds.

Lemma 2.8. For any matrix A ∈ M_n, Ω_k(A) ⊆ Λ_k(A).
Proof. Suppose A ∈ M_n and µ ∈ Ω_k(A). Let S ∈ M_n be invertible such that S*(A − µI_n)S is a direct sum of the following matrices, as defined in Section 1 (III).
(a) Γ_{2r_1}(µ_1), …, Γ_{2r_u}(µ_u).
(b) J_{s_1}(0), …, J_{s_v}(0), where s_1, …, s_p are odd and s_{p+1}, …, s_v are even.
(c) e^{iξ_1}Δ_{t_1}, …, e^{iξ_w}Δ_{t_w}, where t_1, …, t_q are odd and t_{q+1}, …, t_w are even.
Let B = S*(A − µI_n)S. For each ξ ∈ [0, 2π), consider e^{iξ}B + e^{-iξ}B*. Each type (a) direct summand has the form e^{iξ}Γ_{2r_j}(µ_j) + e^{-iξ}Γ_{2r_j}(µ_j)*, which will contribute r_j nonnegative (positive) eigenvalues to e^{iξ}B + e^{-iξ}B*. Consequently, these summands will contribute a total of Σ_{j=1}^u r_j nonnegative eigenvalues to e^{iξ}B + e^{-iξ}B*. Each type (b) direct summand has the form e^{iξ}J_{s_j}(0) + e^{-iξ}J_{s_j}(0)*, which will contribute [(s_j + 1)/2] nonnegative eigenvalues to e^{iξ}B + e^{-iξ}B*, where [x] denotes the integral part of the real number x. Consequently, these summands will contribute a total of (Σ_{j=1}^v s_j + p)/2 nonnegative eigenvalues to e^{iξ}B + e^{-iξ}B*. Each type (c) direct summand has the form (2.1) with a_j = cos(ξ + ξ_j) and b_j = −sin(ξ + ξ_j). Suppose t_j is even. Since there is a 0_{t_j/2} leading principal submatrix, the matrix has at least t_j/2 nonnegative eigenvalues; if ξ is chosen so that a_j ≠ 0, then there will be exactly t_j/2 nonnegative (positive) eigenvalues. Thus, the matrix in (2.1) will contribute t_j/2 nonnegative eigenvalues to e^{iξ}B + e^{-iξ}B*. Suppose t_j is odd. Then e^{i(ξ+ξ_j)}Δ_{t_j} + e^{-i(ξ+ξ_j)}Δ_{t_j}* is congruent to [e^{i(ξ+ξ_j)} + e^{-i(ξ+ξ_j)}] ⊕ D_j such that D_j has (t_j − 1)/2 nonnegative eigenvalues. Consequently, if ξ is chosen so that a_j ≠ 0 in (2.1) whenever t_j is even, then these summands will contribute a total of (Σ_{j=1}^w t_j − q)/2 + ℓ(ξ) nonnegative eigenvalues to e^{iξ}B + e^{-iξ}B*, where ℓ(ξ) is the number of nonnegative eigenvalues of e^{iξ}N + e^{-iξ}N* with N = diag(e^{iξ_1}, …, e^{iξ_q}). Denote by ν(H) the number of nonnegative eigenvalues of the Hermitian matrix H, and let ℓ′ = min{ν(e^{iξ}N + e^{-iξ}N*) : ξ ∈ [0, 2π)}. Then there are infinitely many choices of ξ which attain ℓ′.
So, we may choose ξ to attain ℓ′ with the additional property that a_j ≠ 0 in (2.1) whenever t_j is even. Let
k′ = Σ_{j=1}^u r_j + (Σ_{j=1}^v s_j + p)/2 + (Σ_{j=1}^w t_j − q)/2 + ℓ′.   (2.2)
Then k′ ≥ k. Hence, the conclusion that 0 ∈ Λ_k(A − µI_n) will follow once we show that 0 ∈ Λ_{k′}(A − µI_n). By our assumption, S*(A − µI_n)S is a direct sum of the matrices listed in (a)-(c). For each direct summand Γ_{2r_j}(µ_j) in (a), the leading principal submatrix is 0_{r_j}. Thus, these direct summands contain a zero principal submatrix of dimension Σ_{j=1}^u r_j. For each direct summand J_{s_j}(0) in (b), the principal submatrix lying in the rows and columns indexed by odd numbers is a zero principal submatrix. Thus, these direct summands contain a zero principal submatrix of dimension (Σ_{j=1}^v s_j + p)/2. For each direct summand e^{iξ_j}Δ_{t_j} in (c), if t_j is even then the leading principal submatrix is 0_{t_j/2}; if t_j is odd, then the leading principal submatrix is 0_{(t_j−1)/2}. Thus, these direct summands contain a zero principal submatrix of dimension (Σ_{j=1}^w t_j − q)/2. Moreover, these direct summands are permutationally similar to a matrix T with 0_t ⊕ N as the (t + q) × (t + q) leading principal submatrix, where t = (Σ_{j=1}^w t_j − q)/2 and N = diag(e^{iξ_1}, …, e^{iξ_q}). By Lemma 2.7, 0 ∈ Λ_{ℓ′}(N). Thus, there is a unitary matrix V ∈ M_q such that V*NV has 0_{ℓ′} as the leading principal submatrix. Then (I_t ⊕ V ⊕ I_{n−t−q})*T(I_t ⊕ V ⊕ I_{n−t−q}) has 0_{t+ℓ′} as the leading principal submatrix. Now combining all these zero principal submatrices yields a zero principal submatrix of dimension Σ_{j=1}^u r_j + (Σ_{j=1}^v s_j + p)/2 + t + ℓ′ = k′. The result follows.

Totally isotropic subspaces and matrix equations
Let A ∈ M_n. A subspace V of C^n is a totally isotropic subspace of A if x*Ay = 0 for any x, y ∈ V. Note that a unitary U ∈ M_n has its first k columns spanning a totally isotropic subspace of A if and only if U*AU has 0_k as its leading principal submatrix. One can also write A = H + iG and discuss the totally isotropic subspaces of the Hermitian matrix pair (H, G), i.e., subspaces V of C^n such that x*Hy = 0 = x*Gy for all x, y ∈ V. It is clear that A ∈ M_n has a totally isotropic subspace of dimension k if and only if 0 ∈ Λ_k(A). By Theorem 2.2, we have the following.

Corollary 3.1. Let A ∈ M_n. Then A has a totally isotropic subspace of dimension k if and only if ν(e^{iξ}A + e^{-iξ}A*) ≥ k for every ξ ∈ [0, 2π), where ν(H) denotes the number of nonnegative eigenvalues of a Hermitian matrix H. Consequently, the maximum dimension of a totally isotropic subspace of A equals min{ν(e^{iξ}A + e^{-iξ}A*) : ξ ∈ [0, 2π)}.
Note that the quantity min{ν(e^{iξ}A + e^{-iξ}A*) : ξ ∈ [0, 2π)} is equal to k′ in (2.2), where the quantities r_1, …, r_u, s_1, …, s_v, etc. are determined by the canonical form of A under *-congruence, as in the proof of Lemma 2.8 (with µ = 0). By the result in [6], one can obtain the canonical form S*AS by a finite algorithm using exact arithmetic.
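The minimum in this formula can be approximated numerically (an illustrative sketch added here, not the exact-arithmetic algorithm of [6]; the grid over ξ and the tolerance for counting "nonnegative" eigenvalues are numerical approximations):

```python
import numpy as np

def max_isotropic_dim(A, m=720, tol=1e-9):
    """Approximate min over xi of the number of nonnegative eigenvalues
    of e^{i xi} A + e^{-i xi} A^*, on a grid of m directions."""
    best = A.shape[0]
    for xi in np.linspace(0, 2 * np.pi, m, endpoint=False):
        H = np.exp(1j * xi) * A + np.exp(-1j * xi) * A.conj().T
        nu = int(np.sum(np.linalg.eigvalsh(H) >= -tol))  # nonnegative count
        best = min(best, nu)
    return best

# For the nilpotent Jordan block J_3(0), each e^{i xi} J + e^{-i xi} J^* has
# eigenvalues 0 and +/- sqrt(2), so the formula gives 2; indeed span{e_1, e_3}
# is a totally isotropic subspace of J_3(0).
J3 = np.diag([1.0, 1.0], k=1).astype(complex)
assert max_isotropic_dim(J3) == 2
```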
The authors of [1] showed that the study of the convexity of the higher rank numerical range can be reduced to verifying the following lemma, which follows readily from Corollary 2.3.

Lemma 3.2. Let A = \begin{bmatrix} I_k & X \\ Y & -I_k \end{bmatrix} ∈ M_{2k}, where X, Y ∈ M_k are arbitrary. Then there is a unitary U ∈ M_{2k} such that U*AU has 0_k as the leading k × k principal submatrix.
In [1], it was shown that the existence of U in Lemma 3.2 is equivalent to the solvability of some matrix equations; see [1, Theorem 2.12]. In the next theorem, we will use Lemma 3.2 and the CS decomposition of matrices to prove the solvability of a number of matrix equations and systems of matrix equations. The equations in (a), (d) and (f) have been considered in [1]; we give slightly different proofs of them.
We also consider other matrix equations. In particular, assertion (c) of the theorem can be restated as the assertion that the operator spheres {Z ∈ M_k : |Z − R*| = (I_k + RR*)^{1/2}} and {Z ∈ M_k : |Z − S*| = (I_k + SS*)^{1/2}} always have a common point; cf. (1.3). One can use the results in [5,6] and the QR decomposition to construct the unitary matrix U in Lemma 3.2. As a result, one can give an explicit construction of the solutions of the matrix equations (a)-(d) following our proof.
It is easy to check that solvability of the equations in the theorem is equivalent to the existence of a unitary U ∈ M 2k satisfying the conclusion of Lemma 3.2.
As suggested by Professor T. Ando, it is interesting and inspiring to consider the scalar case of the statements and proofs of the equations in the theorem.

Theorem 3.3. Let R, S, P, C ∈ M_k be such that P is positive definite and C is a strict contraction, and let γ ∈ R.
(a) There is a Z ∈ M_k such that I_k + RZ + Z*S* − Z*Z = 0_k.
(b) There is a Z ∈ M_k such that I_k + RZ + Z*R* − Z*Z = 0_k and SZ + Z*S* = 0_k.
(c) There is a Z ∈ M_k such that I_k + RZ + Z*R* − Z*Z = 0_k and I_k + SZ + Z*S* − Z*Z = 0_k.
(d) There is a Hermitian H ∈ M_k such that I_k + RH + HR* − HPH = γH.
(e) There is a unitary U ∈ M_k such that SU(I_k + RR*)^{1/2} + (I_k + RR*)^{1/2}U*S* = SR* + RS*.
(f) There is a unitary U ∈ M_k and a Hermitian H ∈ M_k such that

Proof. Consider the equation in (a). By Lemma 3.2, there is a unitary U ∈ M_{2k} such that By the CS decomposition, there are unitary matrices where C = diag(c_1, …, c_k) with 1 ≥ c_1 ≥ ⋯ ≥ c_k ≥ 0. Then W(U*BU)W* also has 0_k as the leading k × k principal submatrix. Equivalently, . Evidently, c_k > 0; otherwise, the (k, k) entry of the above matrix would be −1, a contradiction. Thus, we can multiply the above equation by V_1C^{−1} on the left and C^{−1}V_1* on the right to get Taking the Hermitian part and skew-Hermitian part of the above equation, we get the two equations in (b).
To prove (c), let S̃ = S − R. By (b), there is a Z ∈ M_k such that I_k + RZ + Z*R* − Z*Z = 0_k and S̃Z + Z*S̃* = 0_k. Adding the two equations, we get I_k + SZ + Z*S* − Z*Z = 0_k, so Z satisfies both equations in (c). To prove (d), we may assume that γ = 0; otherwise, replace R by R − γI_k/2. Let S̃ = iP^{−1/2} and R̃ = RP^{−1/2}. By (b), there is a Z ∈ M_k such that I_k + R̃Z + Z*R̃* − Z*Z = 0_k and S̃Z + Z*S̃* = 0_k.
The second equation implies that Z = P^{1/2}H for some Hermitian H ∈ M_k. Putting Z = P^{1/2}H into the first equation, we get I_k + RH + HR* − HPH = 0_k, as asserted.
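The equation I_k + RH + HR* − HPH = 0_k is a continuous algebraic Riccati equation, as observed in [12], and can be checked numerically with SciPy's `solve_continuous_are`, which finds X with X a + a^H X − X b r^{-1} b^H X + q = 0; taking a = R*, b = P^{1/2}, q = r = I_k recovers our equation. (An illustrative numerical verification with real data, not the constructive method of the proof above.)

```python
import numpy as np
from scipy.linalg import solve_continuous_are

rng = np.random.default_rng(1)
k = 4
R = rng.standard_normal((k, k))
M = rng.standard_normal((k, k))
P = M @ M.T + np.eye(k)                      # positive definite

# P^{1/2} via spectral decomposition.
w, V = np.linalg.eigh(P)
B = V @ np.diag(np.sqrt(w)) @ V.T

# With a = R^T (real case), b = P^{1/2}, q = r = I_k, SciPy's CARE
#   X a + a^H X - X b r^{-1} b^H X + q = 0
# becomes I + R X + X R^T - X P X = 0, i.e. our equation with H = X.
H = solve_continuous_are(R.T, B, np.eye(k), np.eye(k))

assert np.allclose(H, H.T, atol=1e-8)        # the solution is Hermitian
residual = np.eye(k) + R @ H + H @ R.T - H @ P @ H
assert np.linalg.norm(residual) < 1e-6
```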
To prove (e), note that the first equation in (b) can be written as (Z* − R)(Z − R*) = I_k + RR*. Thus, its solutions have the form Z = R* − U(I_k + RR*)^{1/2} for some unitary U ∈ M_k. Substituting this into the second equation in (b), we get the desired conclusion.
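Following the suggestion of Professor Ando recorded above, the scalar case k = 1 of (b) makes the geometry transparent (a worked example added here for illustration):

```latex
Given $r,s\in\mathbb{C}$, the scalar case of (b) asks for $z\in\mathbb{C}$ with
\[
  1 + rz + \bar r\bar z - \bar z z = 0 \qquad\text{and}\qquad sz + \bar s\bar z = 0.
\]
The first equation is equivalent to $|z-\bar r|^2 = 1+|r|^2$, a circle centered
at $\bar r$ of radius $\sqrt{1+|r|^2} > |\bar r|$, which therefore encloses the
origin.  The second equation says $\operatorname{Re}(sz)=0$, i.e., $z$ lies on
a line through the origin (every $z$ works when $s=0$).  Since a circle
enclosing the origin meets every line through the origin, a common solution
$z$ always exists.
```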

Infinite Dimensional Operators and Related results
One can easily extend the definition of Λ_k(A) to a bounded linear operator A acting on an infinite dimensional Hilbert space H; for example, see [12]. Results on Λ_k(A) for infinite dimensional operators have been obtained in [10], including Theorem 4.1 below. For a self-adjoint operator H, we let λ_k(H) = sup{λ_k(X*HX) : X : C^k → H, X*X = I_k}. An open question in [2] concerns the lower bound on dim H which ensures that Λ_k(A) is non-empty for every bounded linear operator A acting on H. The following result, proved in [9], answers this question.