Bounds for Entropy Numbers of Some Critical Operators

We provide upper bounds for entropy numbers for two types of operators: summation operators on binary trees and integral operators of Volterra type. Our efforts are concentrated on the critical cases where none of the known methods works. Therefore, we develop a method which seems to be completely new and probably merits further applications.


Introduction
We will investigate the entropy numbers of certain linear operators. Recall that for a set A in a metric space, its n-th dyadic entropy number is the infimum of ε > 0 such that A admits a covering by $2^{n-1}$ balls of radius ε. Moreover, given a compact linear operator $V : X \to Y$ acting from one normed space to another, its entropy numbers $e_n(V)$ are defined as those of the V-image of the unit ball of X. The entropy numbers, along with other measures of compactness such as approximation numbers and Kolmogorov numbers, play an extremely important role in operator theory and its applications. We refer to the classical monographs [5] and [7] for further details and references.
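To make the covering definition concrete, here is a minimal Python sketch that computes an upper bound for the n-th dyadic entropy number of a finite point set in Euclidean space. It is an illustration only: the function name `greedy_covering_radius` is hypothetical, and the farthest-point heuristic with centers restricted to the set itself is an assumption of this sketch, not a method used in the paper. The true entropy number may be smaller, since the definition allows arbitrary centers.

```python
import math

def greedy_covering_radius(points, n):
    """Upper bound for the n-th dyadic entropy number of a finite point set.

    Chooses at most 2**(n-1) centers among the points by the farthest-point
    heuristic and returns the largest distance from a point to its nearest
    chosen center.
    """
    k = 2 ** (n - 1)
    centers = [points[0]]
    # dist[i] = distance from points[i] to the nearest chosen center
    dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < k and max(dist) > 0:
        far = max(range(len(points)), key=dist.__getitem__)
        centers.append(points[far])
        dist = [min(d, math.dist(p, points[far])) for d, p in zip(dist, points)]
    return max(dist)

# Eight equally spaced points on a line, viewed as 1-dimensional vectors.
points = [(float(i),) for i in range(8)]
radii = [greedy_covering_radius(points, n) for n in (1, 2, 3, 4)]
```

The radii are non-increasing in n and vanish as soon as $2^{n-1}$ reaches the number of points, which mirrors the monotonicity of entropy numbers.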
This work originates from a question of M. Lacey and W. Linde. They investigated entropy numbers for linear Volterra operators with relatively bad compactness properties and discovered that two types of behavior of entropy numbers are possible [9], [11] (see more details in Section 3). On a certain boundary separating the two cases their methods did not apply and the problem remained open. Further efforts convinced us that the remaining case cannot be settled by the rich variety of traditional methods, so a new technique is required. It turned out that this new technique can be developed more cleanly and explained better if we replace the Volterra operator by an analogous summation operator on the binary tree. This class of operators is quite simple and natural, but it has not been investigated at all (its properties will be the subject of a separate work). Therefore, we start with summation operators and first prove our estimate in this case. Notice that trees appear naturally in the study of function spaces because the Haar basis and other similar wavelet bases have a structure close to that of a binary tree.
In the last section, we reproduce the same approach for the integral operator considered by Lacey and Linde. Here is our main point: in most classical methods for evaluating the entropy numbers $e_n(V)$ of an operator V, one approximates V with a finite rank operator depending on n. In contrast, we approximate V with a family of finite rank operators indexed by some finite set of "essential trees", a notion introduced in this article.

Introduction to tree summation operators
We consider a tree T and its levels $\{T_l\}$, $l = 0, 1, \dots$, such that the level $T_0$ consists of a single node (the tree root) and the level $T_{l+1}$ is the set of all direct offspring of nodes that belong to $T_l$.
We denote by $O_T(t) = O(t)$ the set of all direct and indirect offspring of a node $t \in T$, including t itself, and let $O_l(t) = O(t) \cap T_l$. If $t \in T_l$, we write $|t| = l$. If $u \in O(t)$ we write $u \succeq t$ and $t \preceq u$. The strict inequalities have the same meaning with the additional assumption $u \ne t$.
For any element $\mu \in \ell_1(T)$ and any $t \in T$ we denote the mass and variation at t as
Clearly, for any $t \in T$ we have $s_\mu(t) \le \|\mu\|(t)$, and for any $t \in T_l$ and any $m \ge l$ we have
Now assume that T is equipped with a non-negative weight $W = \{w(t)\}_{t \in T}$. The weight W gives rise to the following simple weighted summation operator, $\tilde V : \ell_2(T) \to \ell_\infty(T)$, given by
where the summation is actually taken over the branch leading from the root to the node t. For technical reasons, we will investigate a slightly different form of this operator. Namely, let us introduce a pair of dual tree-summation operators, $V : \ell_2(T, W) \to \ell_\infty(T)$ and $V^*$ :
and
respectively. It is easy to see that
It is also clear that the operators $V$ and $\tilde V$ are isomorphic. We have chosen the representation (3) because of the simple form of the operator $V^*$, the one we will really handle.
The entropy of a summation operator on the binary tree
In this section we consider a binary tree T with levels $\{T_l\}$, $l = 0, 1, \dots$, such that the level $T_0$ consists of a single node (the tree root) and every node of level $T_l$ has exactly two offspring in $T_{l+1}$. Note that $|T_l| = 2^l$. The weight $W = \{w(t)\}_{t \in T}$ is defined by

Regular case
Theorem 1 Let $\beta > 1$ and let the weight W be given by (5). Consider the linear operator $V^* : \ell_1(T) \to \ell_2(T, W)$ defined by (4). There exist numeric constants $C_1$, $C_2$ depending on β such that for all positive integers n we have the following bounds for its entropy numbers
Proof. Upper bound. Consider the set $D = \{V^*\delta_t,\ t \in T\} \subset \ell_{2,W}(T)$, where as usual $\delta_t$ denotes the delta-function at the point t, i.e. $\delta_t(u) = \mathbf{1}_{\{u=t\}}$. Recall that
It is easy to establish an upper bound for the dyadic entropy of D. Indeed, take the net $D_n = \{V^*\delta_t,\ t \in T,\ |t| \le n\}$. Then $|D_n| \le 2^{n+1}$ and for any $t \in T$ we have
We see that
Now recall that a polynomial upper bound $e_n(D) \le c\,n^{-\alpha}$ for a set D in a Hilbert space yields a bound on $e_n(\mathrm{aco}\,D)$, where aco D denotes the absolutely convex hull of D. Namely, as established in [3] for $\alpha \ne 1/2$ and in [8] for $\alpha = 1/2$, under this assumption we have
$$e_n(\mathrm{aco}\,D) \le \begin{cases} C n^{-\alpha}, & \alpha < 1/2,\\ C n^{-1/2}\ln n, & \alpha = 1/2,\\ C n^{-1/2}(\ln n)^{1/2-\alpha}, & \alpha > 1/2. \end{cases}$$
By letting here $\alpha = (\beta-1)/2$, we obtain the desired upper bounds in Theorem 1, because by the property of the unit ball in the $\ell_1$-space we have $e_n(V^*) = e_n(\mathrm{aco}\,D)$.
Lower bound. For any $n \in \mathbb{N}$ let $m = 2^n$ and denote $\{t : |t| = n\} := (t_j)_{1 \le j \le m}$. Take any $(s_j)_{1 \le j \le m}$ such that $|s_j| = 2n$ and $s_j$ is an (indirect) offspring of $t_j$. Let $\mu_j = \delta_{s_j} - \delta_{t_j}$. Then
These image vectors are orthogonal, since they have disjoint supports, and for appropriate $C_1$
We notice that we found $m = 2^n$ elements $\mu_j$ such that $\|\mu_j\|_1 = 2$ and for $i \ne j$
This is true for any $\beta > 1$ but it is optimal only for $1 < \beta \le 2$, while for $\beta > 2$ we need a refined argument. By using the same vectors, we see that the restriction of $V^*$ to the span of the vectors $(\mu_j)$ is isometric to the embedding $I_m : \ell_1^m \to \ell_2^m$ up to the coefficient
Recall that with appropriate numerical $c > 0$ we have a (sharp) estimate
see [12]. Choose $n = n(k)$ such that $2^{n/2} \le k \le 2^{(n+1)/2}$. Since $m = m(n) = 2^n$, the parameter k fits in the range and we obtain a bound that is sharp for $\beta \ge 2$,
There are many available proofs for the upper bounds in Theorem 1. The one presented here is probably the shortest. It is due to W. Linde. We refer to [2] for studies of other summation operators in probabilistic language.

Critical case
We see that in Theorem 1 the upper and lower estimates for $\beta \ne 2$ are of the same order and are "easy" to obtain modulo known results, although at the point $\beta = 2$ the behavior of entropy undergoes a striking change. In the case $\beta = 2$, however, the estimates of Theorem 1 do not fit together and leave a logarithmic gap. Apparently this gap is impossible to close just by combining the existing results. Therefore, we call the case $\beta = 2$ the critical one. We will show that the lower bound of Theorem 1 is in fact sharp, but the proof of the corresponding upper bound is far more complicated and requires a new method.
Theorem 2 Let the weight W be given by (5) with $\beta = 2$. Consider the linear operator $V^* : \ell_1(T) \to \ell_2(T, W)$ defined by (4). There exists a numeric constant C such that for all positive integers n we have the following upper bound for its entropy numbers
Proof. The proof of Theorem 2 is split into several steps, each with a clear meaning of its own. We keep the notation $O(t)$, $O_l(t)$, $|t|$, $s_\mu(t)$, $\|\mu\|(t)$ from the previous section.
Step 1: essential subtrees
In particular, the root is contained in any subtree.
The evaluation of entropy numbers will be based on the construction of a family of subtrees $\Upsilon_\mu$ defined by a stopping rule. Namely, let $\sigma_l = l/n$. For $\mu \in \ell_1(T)$ satisfying $\|\mu\|_1 \le 1$ we define the n-essential subtree $\Upsilon_\mu$ by starting from the root and including nodes t in $\Upsilon_\mu$ as long as $\|\mu\|(t) > \sigma_{|t|}$, and stopping the construction as soon as $\|\mu\|(t) \le \sigma_{|t|}$. We denote by $B_\mu$ the set of nodes where the construction was stopped.
Since $\sigma_{n+1} > 1$, we have $\Upsilon_\mu \cap T_{n+1} = \emptyset$, that is, we stop the construction no later than at level n. In particular, $\Upsilon_\mu$ is finite. Notice that we have a partition
Now we evaluate the size of $\Upsilon_\mu$ and will see that the size of $\Upsilon_\mu$ is dramatically small. This is a decisive step towards our goal. Let $N_l = |\Upsilon_\mu \cap T_l|$. We have
Proof of Lemma 3. By the definition of the n-essential tree we have
It follows that $\sum_{t \in Q} |t| \le n$, as required in (8). On the other hand, for any tree Υ and its terminal set Q it is true that
thus (9) follows. We can also easily evaluate the number of possible n-essential trees.
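The stopping rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's own formulation: nodes of the binary tree are encoded as strings of '0' and '1' (the root is the empty string), the variation $\|\mu\|(t)$ is taken to be the sum of $|\mu(u)|$ over t and all its offspring, and the function names `variation` and `essential_subtree` are hypothetical. Exact rational arithmetic avoids rounding issues in the strict comparison with $\sigma_l = l/n$.

```python
from fractions import Fraction

def variation(mu, t):
    """Assumed form of ||mu||(t): sum of |mu(u)| over t and all its offspring."""
    return sum(abs(v) for u, v in mu.items() if u.startswith(t))

def essential_subtree(mu, n):
    """Return (tree, stopped): the n-essential subtree and the stopping nodes.

    Assumes ||mu||_1 <= 1, which guarantees that no node below level n
    is ever included, so the loop terminates.
    """
    tree, stopped = {""}, set()          # the root is always included
    frontier = [""]
    while frontier:
        t = frontier.pop()
        for child in (t + "0", t + "1"):
            # compare with sigma_l = l / n, where l = |child|
            if variation(mu, child) > Fraction(len(child), n):
                tree.add(child)
                frontier.append(child)
            else:
                stopped.add(child)
    return tree, stopped

quarter = Fraction(1, 4)
mu = {"": quarter, "0": quarter, "01": quarter, "111": quarter}
tree, stopped = essential_subtree(mu, n=4)
```

In this small example the subtree consists of the root and the node "0" only, well within the size bound n + 1 that the text attributes to Lemma 3.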
Lemma 4 The number of subtrees of the binary tree whose terminal set Q satisfies (8) does not exceed $(4e)^n$.
Proof of Lemma 4. Since a subtree is entirely defined by its terminal set, we have to find out how many sets Q satisfy (8). Denote $q_l = |Q \cap T_l|$. Then (8) reads $\sum_l l\, q_l \le n$.
Since $q_l \le n/l$, the number of non-negative integer solutions of this inequality does not exceed
Moreover, for a given sequence $q_l$, while constructing a set Q, on each level l of the binary tree we have to choose $q_l$ elements from at most $2^l$ elements of this level. Therefore the number of possible sets does not exceed
We finish the discussion of essential subtrees by proving their useful approximation property. It follows from (1) and (2) that for any $t \in T$ it is true that
Moreover, (10) and the definition of
Therefore, as we will see soon, the part of the operator $V^*$ related to the complement of $\Upsilon_\mu$ is not essential at the precision level $n^{-1/2}$, which explains the name "essential" we gave to this family.
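For small n, the count in Lemma 4 can be checked by brute force. The following Python sketch assumes that a subtree is a set of nodes containing the root and closed under taking parents, and that condition (8) is $\sum_{t \in Q} |t| \le n$ for the terminal set Q, as stated in the proof of Lemma 3. The function name `count_subtrees` is hypothetical.

```python
import math

def count_subtrees(n):
    """Count rooted subtrees of the binary tree whose terminal nodes
    have depths summing to at most n."""
    def convolve(a, b):
        # combine independent choices for the left and right branches
        out = {}
        for s1, c1 in a.items():
            for s2, c2 in b.items():
                if s1 + s2 <= n:
                    out[s1 + s2] = out.get(s1 + s2, 0) + c1 * c2
        return out

    def below(depth):
        # maps "sum of terminal depths" -> number of subtrees hanging
        # from one fixed node of the given depth (the node included)
        if depth > n:
            return {}
        sub = below(depth + 1)
        counts = {depth: 1}                       # the node itself is terminal
        for s, c in sub.items():                  # exactly one child is kept
            counts[s] = counts.get(s, 0) + 2 * c
        for s, c in convolve(sub, sub).items():   # both children are kept
            counts[s] = counts.get(s, 0) + c
        return counts

    return sum(below(0).values())

counts = {n: count_subtrees(n) for n in range(1, 9)}
```

The exact counts grow exponentially in n but stay far below $(4e)^n$, which is all that the argument needs.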
Step 2: approximating operators
We are now going to construct a family of finite rank operators approximating the operator $V^*$. Each operator will correspond to an n-essential subtree. However, the construction is valid for any subtree of T. Given a subtree $\Upsilon \subset T$ we define three operators related to Υ. The operator $V^*_\Upsilon$ :
This is essentially the same operator as $V^*$ restricted to elements supported by Υ. Now define a mapping z from the complement of Υ to the boundary of Υ by letting $z(s)$ be the last node in Υ on the way from the root to s. We denote $Z(t) = z^{-1}(t)$. This set will be non-empty only if t belongs to the boundary of Υ.
Now the flush-projection operator $P_\Upsilon$ :
This operator projects measures supported by T onto the measures supported by Υ. It is clear that $\|P_\Upsilon\| \le 1$, i.e. $P_\Upsilon$ is a contraction. The main property of the operators introduced so far reads as
Finally, we will use the natural embedding $\iota_\Upsilon : \ell_2(\Upsilon, W) \to \ell_2(T, W)$ defined by
Combining all together we define the approximating operator $A_\Upsilon : \ell_1(T) \to \ell_2(T, W)$ by $A_\Upsilon = \iota_\Upsilon V^*_\Upsilon P_\Upsilon$. It follows from (12) that for any subtree Υ and any $\mu \in \ell_1(T)$ we have
Finally notice that since $\|P_\Upsilon\| \le 1$ and $\|\iota_\Upsilon\| \le 1$, we have for any $m \in \mathbb{N}$
So far we have not specified our subtree. Now we will use the n-essential subtrees constructed above. For any given n let $\Gamma = \{\Upsilon\}$ be the set of all subtrees $\Upsilon \subset T$ satisfying (8), hence (9). Recall that by Lemma 4 we have $|\Gamma| \le (4e)^n$ and that for any $\mu \in \ell_1(T)$ its n-essential subtree $\Upsilon_\mu$ belongs to Γ. By comparing inequality (11) with (13) we see that for any $\mu \in \ell_1(T)$ with
Recall that for every µ its own approximating operator is used. We will now show how properties like this one can be applied. This simple idea seems to be of independent interest, thus we state it as a separate statement.
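The mapping z and the flush-projection can be illustrated by a short Python sketch. It rests on an assumption, since the defining formula is not available here: the projection is taken to move the mass of every node s outside Υ onto z(s), that is, $(P_\Upsilon\mu)(t) = \mu(t) + \mu(Z(t))$ for $t \in \Upsilon$. Nodes are encoded as binary strings with the empty string as the root, the subtree is assumed to contain the root, and the names `last_node_in` and `flush_project` are hypothetical.

```python
def last_node_in(subtree, s):
    """z(s): the last node of `subtree` on the path from the root to s.

    Assumes the root "" belongs to `subtree`.
    """
    t = s
    while t not in subtree:
        t = t[:-1]            # step up to the parent node
    return t

def flush_project(mu, subtree):
    """Assumed flush-projection: move the mass at each node s onto z(s)."""
    out = {}
    for s, value in mu.items():
        t = last_node_in(subtree, s)
        out[t] = out.get(t, 0) + value
    return out

subtree = {"", "0"}
mu = {"": 0.5, "01": 0.25, "011": -0.25, "1": 0.5}
projected = flush_project(mu, subtree)
```

Under this assumption the total variation can only decrease (masses of opposite sign may cancel), which is consistent with the contraction property $\|P_\Upsilon\| \le 1$ stated above.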
Step 3: approximation lemma
The following lemma shows how a linear operator V can be approximated by a family of operators $(V_\gamma)_{\gamma \in \Gamma}$ in the sense that for every element x its image $Vx$ is approximated by $V_\gamma x$ with an appropriate γ depending on x.
Lemma 5 Let X, Y be normed spaces and let $V$, $(V_\gamma)_{\gamma \in \Gamma}$ be linear operators acting from X to Y. Then for any $n \in \mathbb{N}$ it is true that
where $B_X = \{x \in X : \|x\|_X \le 1\}$.
Proof: Denote by $S_1$ and $S_2$ the expressions in (16) and fix a small $\delta > 0$. For every γ we can choose an $(e_n(V_\gamma) + \delta)$-net $N_\gamma$ of size $2^{n-1}$ for the set $V_\gamma(B_X)$ in the space Y. Let $N = \bigcup_{\gamma \in \Gamma} N_\gamma$ be a global net. Clearly,
For any $x \in B_X$ we first find a γ such that
Then we find an element $y \in N_\gamma \subset N$ such that
By the triangle inequality, we have
Therefore, N is an $(S_1 + S_2 + 2\delta)$-net for the set $V(B_X)$ and its size does not exceed $2^{[\log_2 |\Gamma|] + n}$. The assertion of the lemma follows. We will apply Lemma 5 to
Step 4: evaluation of operators on short trees
With (17) at hand, it remains to evaluate $e_n(V^*_\Upsilon)$ for fixed $\Upsilon \in \Gamma$. In other words, we have to evaluate the entropy of the operator restricted to a tree of a very small size (due to the bound (9) for the size of Υ).
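The counting in the proof of Lemma 5 can be written out as follows. This is a sketch under the assumption that the two expressions in (16) are $S_1 = \sup_{\gamma} e_n(V_\gamma)$ and $S_2 = \sup_{x \in B_X} \inf_{\gamma} \|Vx - V_\gamma x\|$; this identification is inferred from the proof, not quoted from the lemma.

```latex
\begin{align*}
|N| &\le \sum_{\gamma\in\Gamma} |N_\gamma| \le |\Gamma|\, 2^{n-1}
      \le 2^{[\log_2|\Gamma|]+1}\, 2^{n-1} = 2^{[\log_2|\Gamma|]+n},\\
\|Vx - y\| &\le \|Vx - V_\gamma x\| + \|V_\gamma x - y\|
      \le (S_2+\delta) + \bigl(e_n(V_\gamma)+\delta\bigr)
      \le S_1 + S_2 + 2\delta .
\end{align*}
```

Since a net of size $2^{m-1}$ controls the m-th entropy number, taking $m = n + [\log_2|\Gamma|] + 1$ and letting $\delta \to 0$ would give $e_m(V) \le S_1 + S_2$ under the same assumption.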
Towards this aim, recall an important entropy bound from [4], Corollary 2.4 (i). There exists a constant $c > 0$ such that for any operator W acting from $\ell_1^m$ to a Hilbert space and any $k \in \mathbb{N}$ it is true that
We will apply this estimate to the particular situation of tree operators. Let Δ be the tree that consists of the first $[\ln n/4]$ levels of the binary tree. Let us split our operator into a sum $V^*_\Upsilon = V^+_\Upsilon + V^0_\Upsilon$, where $V^+_\Upsilon$ corresponds to the layers distant from the root,
The idea behind this splitting is simple: the operator $V^+_\Upsilon$ has a small norm while $V^0_\Upsilon$ has a small image dimension. We first study the operator $V^+_\Upsilon$. Notice that
For any tree Υ of size bounded by m, from (18) we get
and applying this with $k = n$, $m = n + 1$ we obtain $e_n(V^+_\Upsilon) \le c\, n^{-1/2}$. Now we have to consider the operator $V^0_\Upsilon$. Notice that since the weights on higher levels vanish, the operator $V^0_\Upsilon$ actually acts into $\ell_{2,W}(\Delta)$. The size of Δ is merely $2^{1+\ln n/4} \le 2n^{1/4}$, thus the estimation can be rather crude.
where $V^{00}_\Upsilon$ is the same operator as $V^0_\Upsilon$ but acting into $\ell_\infty(\Delta)$ and I is the embedding of $\ell_\infty(\Delta)$ in $\ell_{2,W}(\Delta)$. The operator $V^{00}_\Upsilon$ is a contraction, since
On the other hand, we can easily evaluate the entropy of I. The net $H_\Delta \subset \ell_{2,W}(\Delta)$ will consist of all possible functions h of the form
where $j(t)$ are odd integers satisfying $|j(t)| \le n$. Notice that there are no more than 2n choices for each $j(t)$. Now we provide the estimates for the approximation error and for the size of $H_\Delta$. We start with evaluating the approximation error. Let $x \in \ell_\infty(\Delta)$ be such that $\|x\|_\infty \le 1$. Then for any $t \in \Delta$ we have $|x(t)| \le 1$, hence there exists a function $h \in H_\Delta$ such that
Therefore,

The size of $H_\Delta$ is bounded by
We conclude that $e_{2n+3}(V^0_\Upsilon) \le e_{2n+3}(I) \le 2n^{-7/8}$, and we are done with the operator $V^0_\Upsilon$, too. Having the bounds for both $e_n(V^+_\Upsilon)$ and $e_n(V^0_\Upsilon)$, by standard entropy estimates we get a bound for the sum of operators, i.e.
for some numerical constants $c_1$, $c_2$, which is still a conjecture for general Banach spaces but is a proved statement in our situation (one of the spaces is a Hilbert space), see [1].
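The rounding step behind the net $H_\Delta$ in Step 4 can be illustrated in Python. The sketch assumes that the net functions take values of the form $j(t)/n$ with odd integers $|j(t)| \le n$, which matches the description of $j(t)$ in the text but is not quoted from a displayed formula; the function name `nearest_odd_multiple` is hypothetical.

```python
import math
from fractions import Fraction

def nearest_odd_multiple(x, n):
    """Return an odd integer j with |j| <= n and |x - j/n| <= 1/n.

    Assumes |x| <= 1 and a positive integer n.
    """
    y = Fraction(x) * n
    j = 2 * math.floor(y / 2) + 1   # the odd integer with j - 1 <= y < j + 1
    if j > n:                       # can only happen for even n and x = 1
        j -= 2
    return j

# Check the rounding on a grid of exact rationals in [-1, 1].
grid = [Fraction(k, 40) for k in range(-40, 41)]
worst = {
    n: max(abs(x - Fraction(nearest_odd_multiple(x, n), n)) for x in grid)
    for n in (1, 2, 5, 8)
}
```

Each coordinate is thus approximated within 1/n using at most 2n candidate values, so a tree Δ with $|\Delta|$ nodes needs at most $(2n)^{|\Delta|}$ net functions.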

Entropy of an integral operator
Let $r < e^{-2}$ be a small number. In this section $(\cdot, \cdot)$ and $\|\cdot\|$ denote the scalar product and the norm in $L_2[0, r]$, respectively. We denote by $M[0, r]$ the space of signed measures of finite variation and by $\|\cdot\|_1$ the respective variation norm. Moreover, $\|\mu\|_1(I)$ stands for the variation of $\mu \in M[0, r]$ on an interval I. Our aim is to study the critical integral operator $V$ :
and its adjoint $V^*$ :
where the critical kernel is
Before we start the study of K, let us explain why it is critical in our context. Consider the family of kernels
and the corresponding operators $V_\beta$. It is known from the works of Linde and Lacey [9], [11] that
Therefore, we see that the most interesting kernel $K = K^{(1)}$ lies on the boundary between two different regimes, and we observe a logarithmic gap between the lower and upper bounds. The situation is exactly the same as in Theorem 1.
The main property of the kernel K we need is its modulus of continuity. An elementary calculation shows that for all $0 \le t \le t + u \le r$ we have $\|K_{t+u} - K_t\|_2 \le 2|\ln u|^{-1/2}$.
Proof of Theorem 6. We repeat the ideas applied earlier to the summation operator on a binary tree. We first find a family of good finite rank approximations to $V^*$ by giving an interpretation of n-essential subtrees. We will construct the n-essential partition $\mathcal{I}^\mu_n$ of $[0, r]$ as follows. Given a positive integer n and an element $\mu \in M[0, r]$ we start dividing the interval $[0, r]$ in halves and continue dividing while a (binary) interval I with endpoints $ir/2^l$ and $(i+1)r/2^l$ subject to division satisfies
Once an interval does not satisfy (22) we do not divide it and include it in our partition $\mathcal{I}^\mu_n$. If $\|\mu\|_1 \le 1$, the condition (22) fails for $l > n$. Therefore, our construction provides a finite partition of $[0, r]$ into binary intervals of variable length.
The partition $\mathcal{I}^\mu_n$ depends on µ, but we will show now that the number of possible partitions and their size are rather limited.
Let D be the set of all binary intervals we divided during the construction of $\mathcal{I}^\mu_n$. Notice that D is a tree with respect to inclusion. Let Q be the set of all terminal intervals of D. In other words, $I \in Q$ iff I satisfies (22) but neither of its halves satisfies it. It is important for us that Q uniquely determines both D and $\mathcal{I}^\mu_n$. Indeed, any subtree of the binary tree is determined by the set of its terminal nodes. Thus Q determines D. Moreover, $\mathcal{I}^\mu_n$ consists of all direct offspring of elements of D that do not belong to D.
By Lemma 4, the number of possible terminal sets Q, and thus the number of possible n-essential partitions, does not exceed $(4e)^n$. It is also worthwhile to notice that the number of intervals in $\mathcal{I}^\mu_n$ satisfies $|\mathcal{I}^\mu_n| \le 2|D| \le 2(n + 1)$ by Lemma 3. Consider a finite dimensional approximation for $V^*$ generated by any partition $\mathcal{I}$, the operator $V^*_{\mathcal{I}}$ :
where $t_I$ is the left end of I. We evaluate the approximation error $\Delta_{\mathcal{I}} = V^* - V^*_{\mathcal{I}}$. By the definition,
We are going to show that the approximation error is particularly small when we use the n-essential partition.
Proposition 7 For any $n \in \mathbb{N}$ and any µ with $\|\mu\|_1 \le 1$ we have
Proof of Proposition 7. Let $\mu = \mu_+ - \mu_-$ be the Hahn decomposition of µ. It is enough to show that
and to prove the similar inequality for $\mu_-$. We start with
For the main (diagonal) terms of this sum we have
Unlike in the tree case, the summands in the definition of $\Delta_{\mathcal{I}}$ are not orthogonal, therefore we cannot stop here. We will show that the non-diagonal terms do not give a positive contribution to the quantity we evaluate. Let $g : \mathbb{R} \to \mathbb{R}$ be a function such that g vanishes on $(-\infty, 0]$ and g is a decreasing convex non-negative function on $(0, +\infty)$. Let $K_t(\cdot) = g(t - \cdot)$ for $t \ge 0$.
We continue the proof of Theorem 6. Let $J_n = \{\mathcal{I}^\mu_n : \|\mu\|_1 \le 1\}$ be the set of all possible n-essential partitions of $[0, r]$. Recall that
We claim that $\sup_{\mathcal{I} \in J_n} e_n(V^*_{\mathcal{I}}) \le C n^{-1/2}$.
Assuming this is obtained, the application of Lemma 5 to the family of operators $\{V^*_{\mathcal{I}},\ \mathcal{I} \in J_n\}$, along with the estimate of the approximation error (24) and the estimate for the number of operators (26), leads to $e_n(V^*) \le C n^{-1/2}$, as required by the assertion of Theorem 6. The same estimate for $e_n(V)$ follows by the duality argument (20). Now it only remains to prove (27). Let us fix a partition $\mathcal{I} \in J_n$. From now on, we do not need any particular properties of n-essential partitions, except for the size bound (23).
Consider an auxiliary partition $\mathcal{E}$ of $[0, r]$ constructed as follows. Take m such that $2^{-m} \le n^{-1/4} \le 2^{1-m}$. Divide $[0, r]$ into binary intervals of length $r2^{-m}$. If a union of such intervals belongs to $\mathcal{I}$, then replace them by this union. The result is a partition $\mathcal{E}$. Notice that $\mathcal{I}$ is a refinement of $\mathcal{E}$ and $|\mathcal{E}| \le 2^m \le 2n^{1/4}$. Write $V^*_{\mathcal{I}} = V^*_{\mathcal{E}} + (V^*_{\mathcal{I}} - V^*_{\mathcal{E}})$ and evaluate the entropy of both operators.
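The construction of the auxiliary partition can be sketched in Python. The encoding is an assumption of this illustration: a binary interval with endpoints $ir/2^l$ and $(i+1)r/2^l$ is represented by the pair (l, i), and the input is assumed to be a valid partition of $[0, r]$ into binary intervals. The function name `coarsen` is hypothetical.

```python
def coarsen(partition, m):
    """Auxiliary partition: intervals of level <= m are kept, and every
    interval finer than level m is replaced by the level-m binary
    interval that contains it."""
    coarse = set()
    for level, index in partition:
        if level <= m:
            coarse.add((level, index))
        else:
            # drop the last (level - m) binary digits of the index
            coarse.add((m, index >> (level - m)))
    # sort by left endpoint, measured in units of r
    return sorted(coarse, key=lambda li: li[1] / 2 ** li[0])

# [0, r/2], [r/2, 3r/4], [3r/4, 7r/8], [7r/8, r]
fine = [(1, 0), (2, 2), (3, 6), (3, 7)]
aux = coarsen(fine, m=2)
```

Here the two intervals of level 3 are merged into the single level-2 interval containing them, so the result has at most $2^m$ elements and the original partition refines it.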
On the other hand, for any µ with $\|\mu\|_1 \le 1$ find an $h = \sum_{I \in \mathcal{E}} K_{t_I}\, j_I/n$ such that $\max_{I \in \mathcal{E}} |\mu(I) - j_I/n| \le n^{-1}$. We have $\|V^*_{\mathcal{E}}\mu - h\|_2 \le \sum_{I \in \mathcal{E}} \|K_{t_I}\|_2 \cdot |j_I/n - \mu(I)| \le \max$
entropy numbers $e_n(A)$, what can we say about $e_n(\mathrm{aco}\,A)$? In (6) we already recalled some known relations. F. Gao [8] was the first to construct a critical set A with the properties $e_n(A) \le c\, n^{-1/2}$ and $e_n(\mathrm{aco}\,A) \ge C n^{-1/2} \ln n$.
We call any set satisfying (28) a Gao set. Later on, his arguments were streamlined and extended to the non-Hilbert case in [6]. The relation to our problem is the following. Consider the critical tree summation operator $V^*$ with the weight (5) and $\beta = 2$. Take the set $A = \{V^*(\mathbf{1}_{\{t\}}),\ t \in T\}$ in the Hilbert space $\ell_2(T, W)$. It is plain that $e_n(A) \le C n^{-1/2}$, hence $e_n(\mathrm{aco}\,A) \le C n^{-1/2} \ln n$. Since it is quite difficult to get a better upper bound, one could think of A as a candidate to be a Gao set, although of a nature very different from the known ones. On the other hand, aco A is the image of the unit ball under the operator $V^*$. In other words, $e_n(V^*) = e_n(\mathrm{aco}\,A)$. Therefore, Theorem 2 shows that A is not a Gao set.