Nonsmooth Hörmander vector fields and their control balls

We prove a ball-box theorem for nonsmooth Hörmander vector fields of step s.


Introduction
In this paper we give a self-contained proof of a ball-box theorem for a family {X_1, …, X_m} of nonsmooth vector fields satisfying the Hörmander condition. This is the third paper, after [M] and [MM], in which we investigate ideas of the classical article by Nagel, Stein and Wainger [NSW].
Our purpose is to prove a ball-box theorem using only elementary analysis techniques and at the same time to relax as much as possible the regularity assumptions on the vector fields. Roughly speaking, our results hold as soon as the commutators involved in the Hörmander condition are Lipschitz continuous. Moreover, our proof does not rely on algebraic tools, like formal series and the Campbell-Hausdorff formula.
To describe our work, we recall the basic ideas of [NSW]. Notation and language are described more precisely in Section 2. Any control ball B(x, r) associated with a family {X_1, …, X_m} of Hörmander vector fields in R^n satisfies, for x belonging to some compact set K and small radii r < r_0, the double inclusion (1.1) Φ_x(Q(C^{-1}r)) ⊂ B(x, r) ⊂ Φ_x(Q(Cr)).
Here, the map Φ_x is an exponential of the form (1.2) Φ_x(h) = exp(h_1 U_1 + · · · + h_n U_n)x, where the vector fields U_1, …, U_n are suitable commutators of lengths d_1, …, d_n and Q(r) = {h ∈ R^n : max_j |h_j|^{1/d_j} < r}. Usually, (1.1) is referred to as a ball-box inclusion. A control on the Jacobian matrix of Φ_x gives an estimate of the measure of the ball and ultimately it provides the doubling property. A remarkable achievement in [NSW] concerns the choice of the vector fields U_j which guarantees inclusions (1.1) for a given control ball B(x, r); see also the discussion in [Ste, p. 440]. Enumerate as Y_1, …, Y_q all commutators of length at most s and let ℓ_i be the length of Y_i. If the Hörmander condition of step s is fulfilled, then the vector fields Y_i span R^n at any point. Given a multi-index I = (i_1, …, i_n) ∈ {1, …, q}^n =: S and its corresponding n-tuple Y_{i_1}, …, Y_{i_n} of commutators, let (1.3) λ_I(x) = det(Y_{i_1}, …, Y_{i_n})(x) and ℓ(I) = ℓ_{i_1} + · · · + ℓ_{i_n}.
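To make (1.3) concrete, the following numerical sketch computes λ_I and ℓ(I) for an illustrative Heisenberg-type frame in R^3; the fields, the point and the radius are our own choices, introduced only for illustration.

```python
import itertools
import numpy as np

# Illustrative Heisenberg-type frame in R^3 (our choice, not from the text):
# X1 = d/dx, X2 = d/dy + x d/dz, and the commutator Y3 = [X1, X2] = d/dz.
def Y(i, p):
    x, y, z = p
    if i == 1:   # X1, length 1
        return np.array([1.0, 0.0, 0.0]), 1
    if i == 2:   # X2, length 1
        return np.array([0.0, 1.0, x]), 1
    if i == 3:   # [X1, X2], length 2
        return np.array([0.0, 0.0, 1.0]), 2

def lam_and_length(I, p):
    """lambda_I(p) = det(Y_{i1}, ..., Y_{in})(p) and ell(I) = sum of lengths."""
    cols, ell = [], 0
    for i in I:
        v, li = Y(i, p)
        cols.append(v)
        ell += li
    return np.linalg.det(np.array(cols).T), ell

p = (0.3, 0.0, 0.0)
r = 0.1
# Lambda(p, r) = max over n-tuples K of |lambda_K(p)| r^{ell(K)}
Lam = max(abs(lam_and_length(K, p)[0]) * r ** lam_and_length(K, p)[1]
          for K in itertools.product([1, 2, 3], repeat=3))
lamI, ellI = lam_and_length((1, 2, 3), p)
# Here lambda_I = 1 and ell(I) = 4, and (I, p, r) is eta-maximal with eta = 0.9
print(lamI, ellI, abs(lamI) * r ** ellI > 0.9 * Lam)
```

Only the triples that are permutations of (1, 2, 3) have nonvanishing determinant here, so the η-maximality condition is easy to inspect by hand in this example.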
In [NSW], the authors prove the following fact: given a ball B(x, r), inclusion (1.1) holds with U_1 = Y_{i_1}, …, U_n = Y_{i_n} if the n-tuple I ∈ S satisfies the η-maximality condition (1.4) |λ_I(x)| r^{ℓ(I)} > η max_{K∈S} |λ_K(x)| r^{ℓ(K)}, where η ∈ (0, 1) is greater than some absolute constant. Although the choice of the n-tuple I may depend on both the point and the radius, the constant C is uniform in x ∈ K and r ∈ (0, r_0). In [M] the second author proved that (1.1) also holds if one replaces the map Φ_x with the almost exponential map (1.5) E_x(h) = exp_*(h_1 U_1) · · · exp_*(h_n U_n)x, where h_j ↦ exp_*(h_j U_j) is the approximate exponential of the commutator U_j, whose main feature is that it can be factorized as a suitable composition of exponentials of the original vector fields X_1, …, X_m. See (2.3) for the definition of exp_*. Lanconelli and the second author proved in [LM] that, if inclusion (1.1) for the maps in (1.5), with pertinent estimates for the Jacobian of E_x, is known, then the Poincaré inequality follows (see [J] for the original proof). It is worth observing that all the results in [NSW] and [M] are proved for C^M vector fields, where M is much larger than the step s. This can be seen by carefully reading the proofs of Lemmas 2.10 and 2.13 in [NSW].
In [TW, Section 4], Tao and Wright gave a new proof of the ball-box theorem with a different approach, based on Gronwall's inequality. The authors in [TW] use scaling maps of the form Φ_{x,r}(t) := exp(t_1 r^{d_1} U_1 + · · · + t_n r^{d_n} U_n)x, which are naturally defined on a box |t| ≤ ε_0, where ε_0 > 0 is a small constant independent of x and r; see the discussion in Subsection 5.2. The arguments in [TW] do not rely on the Campbell-Hausdorff formula. Moreover, although the statement is phrased for C^∞ vector fields, one can see that their results hold under the assumption that the vector fields have C^M smoothness, with M = 2s for vector fields of step s. See Remark 5.10 for a more detailed discussion.
In [MM] we started to work under low regularity hypotheses and we obtained a ball-box theorem and the Poincaré inequality for Lipschitz continuous vector fields of step two with Lipschitz continuous commutators. We used the maps (1.5), but several aspects of the work [MM] are peculiar to the step-two situation, and until now it was not clear how to generalize those results to vector fields of higher step.
Recently, Bramanti, Brandolini and Pedroni [BBP] have proved a doubling property and the Poincaré inequality for nonsmooth Hörmander vector fields by an algebraic method. Informally speaking, they truncate the Taylor series of the coefficients of the vector fields and then apply the results in [NSW], [LM] and [M] to the polynomial approximations. The paper [BBP] also involves a study of the almost exponential maps in (1.5). The results in [BBP] and in the present paper were obtained independently and simultaneously.
In this paper we complete the program begun in [MM]: namely, we prove a ball-box theorem for general vector fields of arbitrary step s, requiring basically that all the commutators involved in the Hörmander condition are Lipschitz continuous. Our precise hypotheses are stated in Definition 2.1. We improve all previous results in terms of regularity; see Remark 5.10. As in [MM], we use the almost exponential maps (1.5), but we need to provide a very detailed study of such functions in the higher step case.
The scheme of the proof of our theorem is basically that of Nagel, Stein and Wainger, but there are some new tools that should be emphasized. Namely, we obtain some noncommutative calculus formulas, developed in order to show that, given a commutator Y, the derivative (d/dt) exp_*(tY) can be written precisely as a finite sum of higher order commutators plus an integral remainder. This is done in Section 3. These results are applied in Section 5 to the almost exponential maps E in (1.5). Our main structure theorem is Theorem 5.8. As in [MM], part of our computations will be carried out for smooth vector fields, namely the standard Euclidean regularizations X^σ_j of the vector fields X_j. We will keep all constants under control everywhere, in order to make sure that they are stable as σ goes to 0.
It is well known (see [LM, MM]) that the doubling property and the Poincaré inequality follow immediately from Theorem 5.8. Observe also that our ball-box theorem can be useful in all situations where integrals of the form ∫∫ |f(x) − f(y)| w(x, y) dx dy need to be estimated, for some weight w. See for example [M] or [MoM]. As an application, in Proposition 6.2 we prove a subelliptic Hörmander-type estimate for nonsmooth vector fields. We believe that the results in Section 3 may be useful in other, related situations.
Concerning the machinery developed in Section 3, it is worth mentioning the papers [RaS, RaS2], where noncommutative calculus formulas are used in the proof of a nonsmooth version of Chow's theorem for vector fields of step two.
Geometric analysis for nonsmooth vector fields started in the 1980s with the papers by Franchi and Lanconelli [FL1, FL2], who proved the Poincaré inequality for diagonal vector fields in R^n of the form X_j = λ_j(x)∂_j, j = 1, …, n. In the diagonal case completely different techniques are available. In the recent paper by Sawyer and Wheeden [SW], which probably contains the best results to date on diagonal vector fields, the reader can find a rich bibliography on the subject.
Plan of the paper. In Section 2 we introduce notation. In Section 3 we prove our noncommutative calculus formulas and in Section 4 we prove a stability property of the "almost-maximality" condition (1.4). These tools are applied in Subsection 5.1 to the maps E. In Subsection 5.2 we briefly discuss the "scaled version" of our maps E. Subsection 5.3 contains the ball-box theorem. In Section 6 we show some examples. Finally, Section 7 contains the smooth approximation result for the original vector fields.

Notation
Enumerate as Y_1, …, Y_q all the commutators X_w with length |w| ≤ s and denote by ℓ_i or ℓ(Y_i) the length of Y_i. We identify an ordered n-tuple of commutators Y_{i_1}, …, Y_{i_n} with the index I = (i_1, …, i_n) ∈ S := {1, …, q}^n. For x, y ∈ R^n, denote by d(x, y) the control distance, that is, the infimum of the r > 0 such that there is a Lipschitz path γ : [0, 1] → R^n with γ(0) = x, γ(1) = y and γ̇ = Σ_{j=1}^m b_j X_j(γ) for a.e. t ∈ [0, 1]. The measurable functions b_j must satisfy |b_j(t)| ≤ r for almost every t. Corresponding balls will be indicated as B(x, r).
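The definition of the control distance can be illustrated numerically: the minimal sketch below (with an illustrative Heisenberg-type frame of our own choosing) integrates a horizontal path γ̇ = b_1 X_1(γ) + b_2 X_2(γ) with controls |b_j| ≤ r, producing a point of the control ball B(0, r).

```python
import numpy as np

# Illustrative fields (our choice): X1 = d/dx, X2 = d/dy + x d/dz.
# We integrate gamma' = b1(t) X1(gamma) + b2(t) X2(gamma) with |b_j| <= r,
# so the endpoint gamma(1) belongs to the control ball B(gamma(0), r).
def X1(p): return np.array([1.0, 0.0, 0.0])
def X2(p): return np.array([0.0, 1.0, p[0]])

r, N = 0.2, 1000
h = 1.0 / N
p = np.zeros(3)
for i in range(N):
    t = i * h
    b1, b2 = (r, 0.0) if t < 0.5 else (0.0, r)   # admissible controls
    p = p + h * (b1 * X1(p) + b2 * X2(p))        # forward Euler
print(p)   # ≈ (0.1, 0.1, 0.01): this endpoint lies in B(0, 0.2)
```

Note that the vertical displacement 0.01 = r^2/4 is of order r^2, in accordance with the anisotropic box Q(r) with exponent 1/2 in the commutator direction.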
Corresponding balls will be denoted by B ̺ (x, r). The definition of ̺ is meaningful as soon as the vector fields Y j are at least continuous.
Definition 2.1 (Vector fields of class A_s). Let X_1, …, X_m be vector fields in R^n and let s ≥ 2. We say that the vector fields X_j are of class A_s if they are of class C^{s−2,1}_loc(R^n) and, for any word w with |w| = s − 1 and for every j, k ∈ {1, …, m}: (1) the derivative X_k f_w exists and is continuous; (2) the distributional derivative X_j(X_k f_w) exists and is locally essentially bounded. Recall that X_j ∈ C^{s−2,1}_loc means that all the Euclidean derivatives of order at most s − 2 of the functions f_1, …, f_m are locally Lipschitz continuous. In particular, all the commutators X_w with |w| ≤ s − 1 are locally Lipschitz continuous in the Euclidean sense and, by item (1), all commutators X_w of length |w| = s are pointwise defined. If we knew that d defines the Euclidean topology, condition (2) would be equivalent to the fact that X_w is locally d-Lipschitz for |w| = s; see [GN, FSSC].
Let {X_1, …, X_m} be in the class A_s and assume that they satisfy the Hörmander condition of step s. Fix once and for all a pair of bounded connected open sets Ω′ ⊂⊂ Ω and let K be the closure of Ω′. We denote by D Euclidean derivatives. If D = ∂_{j_1} · · · ∂_{j_p} for some j_1, …, j_p ∈ {1, …, n}, then |D| := p indicates the order of D. It is understood that a derivative of order 0 is the identity. Introduce the positive constant L defined in (2.2).
Remark 2.2. We will prove in Section 5 a ball-box theorem for vector fields of step s in the class A_s. This improves both the results in [TW] and those in [BBP] in terms of regularity. Indeed, in [TW] a C^M regularity with M = 2s must be assumed (see Remark 5.10). In [BBP] the authors assume that the vector fields belong to the Euclidean Lipschitz space C^{s−1,1}_loc(R^n), which requires the boundedness of the full Euclidean gradient ∇f_w of any commutator f_w of length s, while we only need to control the "horizontal" gradient of f_w.
Approximate commutators. For vector fields X_{j_1}, …, X_{j_ℓ} and for τ > 0 we define, as in [NSW], [M] and [MM], the approximate commutator C_τ, and then the approximate exponential exp_* in (2.3). By standard ODE theory, there is t_0, depending on ℓ, K, Ω, sup |f_j| and ess sup |∇f_j|, such that exp_*(t X_{j_1 j_2 … j_ℓ})x is well defined for any x ∈ K and |t| ≤ t_0. The approximate commutators C_t are quite natural (indeed, they make an appearance in the original paper [4]). Assuming that the vector fields are smooth and using the Campbell-Hausdorff formula, we have a formal expansion in which R_k denotes a linear combination of commutators of length k; see [NSW, Lemma 2.21]. A study of these maps in the smooth case, based on this formula, is carried out in [M]. Given I = (i_1, …, i_n) ∈ S, x ∈ K and h ∈ R^n with |h| ≤ C^{-1}, define the map E_I(x, h) and set Λ(x, r) := max_{K∈S} |λ_K(x)| r^{ℓ(K)}, where ℓ(K) = ℓ_{k_1} + · · · + ℓ_{k_n} and the determinants λ_K are defined in (1.3); moreover, we have the lower bound (2.5), a quantitative form of the Hörmander condition on K. The bound (2.5) will appear many times in the following sections. All the constants in our main theorem will depend on ν in (2.5) and on L in (2.2). In order to refer to the crucial condition (1.4), we give the following definition.
Definition 2.3 (η-maximal triple). Let η ∈ ]0, 1[, I ∈ S, x ∈ R^n and r > 0. We say that (I, x, r) is η-maximal if |λ_I(x)| r^{ℓ(I)} > ηΛ(x, r).
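The factorization behind exp_* can be checked by hand in a simple case. In the sketch below (illustrative Heisenberg fields of our own choosing, with exact flows) the group commutator C_t = e^{−tY} e^{−tX} e^{tY} e^{tX} reproduces the flow of [X, Y] at time t²; a convention of the type exp_*(s[X, Y]) := C_{√s} is stated here only for illustration — see (2.3) for the actual definition.

```python
import numpy as np

# Illustrative Heisenberg fields (our choice): X = d/dx, Y = d/dy + x d/dz,
# whose flows are known in closed form; C_t = e^{-tY} e^{-tX} e^{tY} e^{tX}.
def eX(t, p):
    x, y, z = p
    return np.array([x + t, y, z])

def eY(t, p):
    x, y, z = p
    return np.array([x, y + t, z + t * x])

def C(t, p):                       # rightmost factor acts first
    return eY(-t, eX(-t, eY(t, eX(t, p))))

p = np.array([0.3, -0.2, 0.5])
t = 0.05
# Since [X, Y] = d/dz and the structure is step two, C_t p = p + t^2 [X, Y](p)
# exactly here, so C_{sqrt(s)} reproduces the time-s flow of the commutator.
print(C(t, p) - p)   # ≈ (0, 0, t^2) = (0, 0, 0.0025)
```

In the general (non-nilpotent) case the identity holds only up to a remainder of order t^3, which is exactly what the formal expansion with the terms R_k expresses.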

Regularized vector fields.
Here we describe our procedure for smoothing the vector fields X_j of step s. For any function f, let f^{(σ)}(x) = ∫ f(x − σy) φ(y) dy, where φ ∈ C^∞_0 is a standard nonnegative averaging kernel supported in the unit ball. Define in (2.6) the regularized vector fields X^σ_j and the corresponding commutators X^σ_w for |w| > 1 (see Section 7). Then:
Proposition 2.4. Let X_1, …, X_m be vector fields in the class A_s. Then the following hold.
(2) There is σ_0 > 0 such that, if |w| = s and k = 1, …, m, then the corresponding estimate holds, with C depending on L in (2.2). (3) There is r_0, depending on K, Ω and the constant in (2.2), such that the following holds: for x ∈ K and r < r_0, the stated convergence takes place as σ → 0, uniformly in x ∈ K. As a consequence, for any I ∈ S, the convergence is uniform in x ∈ K, |h| ≤ C^{-1}. Proof. The proofs of items 1 and 2 are given in detail in Section 7. Item 3 follows from standard properties of ODEs.
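The effect of the mollification f ↦ f^{(σ)} on a Lipschitz function can be sketched in one dimension; the kernel and the test function below are our own illustrative choices. Since |f^{(σ)}(x) − f(x)| ≤ Lip(f) σ ∫ |y| φ(y) dy ≤ Lip(f) σ, the approximation is linear in σ.

```python
import numpy as np

# Mollification f_sigma(x) = \int f(x - sigma*y) phi(y) dy with a C^infty
# bump kernel phi supported in [-1, 1] (1-d illustrative sketch).
def mollify(f, x, sigma, npts=2001):
    y = np.linspace(-1, 1, npts)
    dy = y[1] - y[0]
    phi = np.exp(-1.0 / np.clip(1.0 - y**2, 1e-12, None))
    phi[np.abs(y) >= 1] = 0.0
    phi /= phi.sum() * dy                      # normalize: \int phi = 1
    return (f(x - sigma * y) * phi).sum() * dy

f = np.abs                                     # Lipschitz but not C^1 at 0
sigma = 0.05
err = max(abs(mollify(f, x, sigma) - f(x)) for x in np.linspace(-1, 1, 41))
print(err <= sigma)   # True: |f_sigma - f| <= Lip(f) * sigma = 0.05
```

This linear-in-σ stability for Lipschitz data is the kind of uniform control that makes the limit σ → 0 harmless in the arguments of Section 7.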
Remark 2.5. The approximation result contained in Proposition 2.4 is crucial for our subsequent arguments. Note that the class A_s requires a control on the Euclidean gradients of all commutators of length strictly less than s. However, it is natural to conjecture that a control along the horizontal directions only could suffice to ensure our main structure theorem in Section 5. Unfortunately, it seems quite difficult to obtain an approximation theorem like Proposition 2.4 for a class more general than A_s. On the other hand, working without mollified vector fields seems to raise some nontrivial new issues, which we plan to address in a further study.
Some more notation. Our notation for constants is the following: C, C_0 denote large absolute constants; ε_0, r_0, t_0, C^{-1} or C_0^{-1} denote positive small absolute constants. "Absolute constants" may depend on the dimension n, the number m of the fields, their step s, the constant L in (2.2) and possibly the constant ν in (2.5). We also use the notation ε_η (or C_η) to denote a small (or a large) constant depending also on η. The constants σ_0 or σ̄ appearing in the regularizing parameter σ may also depend on the Euclidean continuity moduli of the functions f_w with |w| = s, which are not controlled by L. Compositions of functions are shortened as follows: fg stands for f ∘ g. The notation u is always used for functions of the form exp(t_1 Z_1) · · · exp(t_ν Z_ν) for some t_j ∈ R, ν ≥ 1, Z_j ∈ {X_1, …, X_m}.

Approximate exponentials of commutators
The main result of this section is Theorem 3.6 in Subsection 3.3, where we prove an exact formula for the derivative (d/dt) u(e^{tX_w}_*(x)), where X_w is a commutator of length |w| ≤ s and e_* is the approximate exponential defined in (2.3). The whole section is written for smooth vector fields, namely the mollified X^σ_j, but all constants appearing in our computations are stable as σ goes to 0. We drop the superscript σ everywhere in this section.
The integral remainder is rather complicated, but we do not need its exact form. In order to understand what is needed to compute the derivative in (3.1), let us try to calculate, for example, the derivative (d/dt) u(e^{tX} e^{tY} x), where X, Y ∈ {±X_1, …, ±X_m} and u denotes the identity function on R^n. Since X and Y are C^1, we have (d/dt) u(e^{tX} e^{tY} x) = (Xu)(e^{tX} e^{tY} x) + Y(u e^{tX})(e^{tY} x).
In order to compare the terms on the right-hand side, we may rewrite their difference in integral form. Lemma 3.1 below shows that the derivative inside the integral can be written in an exact form in terms of the commutator of X and Y. The purpose of Subsection 3.1 is to establish a formalism to study precisely more general, related integral expressions.
Lemma 3.1. Let Z, X be smooth vector fields. Then formula (3.2) holds. Proof. The lemma is known, but we provide a proof for completeness. Observe first the splitting into the two terms (1) and (2). Obviously, (1) = XZ(u e^{−tX})(e^{tX} x). Write now (2) in integral form. The proof of formula (3.2) will be concluded as soon as we prove (3.3). To prove (3.3), start from the identity u(η) = u(e^{−tX} e^{tX} η), valid for small t, and differentiate. Then (3.3) is proved by letting e^{tX} η = ξ.
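The mechanism of Lemma 3.1 — differentiating a conjugated field in t produces a commutator — can be tested by finite differences on the classical identity (d/dt)|_{t=0} (e^{tX})^* Z = [X, Z]. The planar fields below are an illustrative choice of ours.

```python
import numpy as np

# Illustrative planar fields: X = d/dx (flow: translation), Z = x d/dy.
# Classical fact underlying Lemma 3.1:  d/dt|_{t=0} (e^{tX})^* Z = [X, Z] = d/dy.
def flow_X(t, p):            # exact flow of X = d/dx
    return p + np.array([t, 0.0])

def Z(p):
    return np.array([0.0, p[0]])

def pullback_Z(t, p):
    # (e^{tX})^* Z (p) = D(e^{tX})(p)^{-1} Z(e^{tX} p); here D(e^{tX}) = Id.
    return Z(flow_X(t, p))

p = np.array([0.7, -0.2])
t = 1e-6
deriv = (pullback_Z(t, p) - pullback_Z(0.0, p)) / t
print(deriv)   # ≈ [X, Z](p) = (0, 1)
```

For a nonconstant Jacobian the same computation requires the factor D(e^{tX})^{-1}, and the exact formula of Lemma 3.1 is precisely what replaces this finite-difference approximation.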
3.1. Notation for integral remainders. Let λ ∈ N and p ∈ {2, …, s + 1}. We denote, for y ∈ K and t ∈ [0, t_0] with t_0 small enough, the sum (3.4), where N is a suitable integer and u is the identity map or u = exp(tY_1) · · · exp(tY_µ) for some integer µ and suitable vector fields Y_j ∈ {±X_1, …, ±X_m}. Here X_{w_i} actually stands for a mollified X^σ_{w_i}, but we drop the superscript for simplicity. To describe the generic term of the sum above, we drop the dependence on i. Here X_w is a commutator of length |w| = p and X ∈ {±X_j}. The map φ is the identity map or φ = exp(tZ_1) · · · exp(tZ_ν) for some ν ∈ N, where Z_j ∈ {±X_1, …, ±X_m}.
Remark 3.2. All the numbers N, µ, ν, b, appearing in the computations of this paper will be bounded by absolute constants.
In order to explain how this formalism works, we give the main properties of our integral remainders.
Moreover, for p ≤ s + 1, estimate (3.8) holds, where t_0 and C depend on the constant L in (2.2) and on the numbers N, µ, ν, b appearing in the sum (3.4). Furthermore, if ℓ(Z) = 1 and p ≤ s + 1, we have (3.9). Finally, if p ≤ s, we may write (3.10), for suitable constants c_w, |w| = p. Proof. The proofs of (3.7) and (3.9) are rather easy and we leave them to the reader. We start with the proof of (3.8). A typical term in O_p(t^λ, u, y) has the form (3.11). Therefore, (3.8) follows from property (3.6) of ω.
Finally we establish the key property (3.10). Start from the generic term of O_p(t^λ, u, y) in (3.11), where we introduce the notation g_k := e^{tZ_k} · · · e^{tZ_ν} for k = 1, …, ν, and g_{ν+1} denotes the identity map. Recall also that ℓ(Y) ≤ s. Therefore we get the stated expansion. Recall that Y has length p ≤ s. The penultimate term can be written in commutator form. Observe that, as required, the function ω̄(t, σ) := ∫_0^t ω(τ, σ) dτ = b t^λ has the correct form. The proof is concluded.

3.2.
Higher order noncommutative calculus formulas. In order to prove Theorem 3.5, we first need to iterate formula (3.2). Start from smooth vector fields X := X^σ_j of length one and Z := X_w of length ℓ(Z) := |w|. Differentiating identity (3.2), we get, by the Taylor formula, the expansion (3.12), with the notation introduced there, etcetera. If we take r = s − ℓ(Z), we may write the expansion (3.13). In view of (3.8), this order of expansion is the highest which ensures that the remainder can be estimated by C t^{s−ℓ(Z)+1}, with a control on C in terms of the constant in (2.2), as soon as y ∈ K and |t| ≤ C^{-1}. Next, we seek a family of higher order formulas, in which we replace e^{tX} with an approximate exponential exp_*(tX_w). The coefficients of the expansion (3.12) are all explicit, but we do not need such accuracy in the higher order formulae. To explain what suffices for our purposes, start with the case of commutators of length two. Let C_t = C_t(X, Y) = e^{−tY} e^{−tX} e^{tY} e^{tX}, where X := X^σ_j and Y := X^σ_k are mollified vector fields of length one. Let Z := X^σ_v be a smooth commutator of length ℓ(Z) := |v|. Assume first that ℓ(Z) = s. Then, iterating (3.13), we can write the corresponding formula. If instead ℓ(Z) = s − 1, then some elementary computations based on (3.12) give the next one. Next, if ℓ(Z) = s − 2 (this can happen only if s ≥ 3), then we must expand more. Finally, if ℓ(Z) = s − 3 (this requires at least s ≥ 4), we must expand even more. If instead ℓ(Z) < s − 3, then we can expand up to the order O_{s+1}(t^{s+1−ℓ(Z)}, u, C_t x) by means of (3.10).
We have started to put tags of the form (F k,λ ) in our formulae. The number k indicates the length of the commutator we are approximating, while the number λ denotes the power of t which controls the remainder.
Note that in (F_{2,4}) the curly bracket changes sign if we exchange X with Y; briefly, we can write (3.15). Next we generalize the formulae (F_{2,λ}) above. The general statement we prove shows that this cancellation persists when the length of the commutator we are approximating with C_t is three or more.
Applying formula F_{ℓ,ℓ+2} twice, we obtain (3.16). Observe first that property (3.9) gives the corresponding simplification; later on, we will tacitly use this property many times. Recall that ℓ ≥ 2. By means of F_{ℓ,2} and F_{ℓ,1}, respectively, we obtain the required expansions. Inserting this information into (3.16) gives, after algebraic simplifications, the desired identity. To conclude the proof, recall (3.7), divide by t^{ℓ+1} and let t → 0.
To start, recall that we are assuming that F_{ℓ,1}, …, F_{ℓ,s} hold. Let, for t > 0, (3.17) C_t := C_t(X_{w_1}, …, X_{w_ℓ}) and C^0_t as indicated, where X = X_{w_0}. Let Z be a commutator with ℓ(Z) + ℓ + 2 ≤ s + 1. In the subsequent formulae we expand everywhere up to a remainder of the stated form, where we also used (3.9). Next we use F_{ℓ,ℓ+2} in (A).
Finally we consider (C). In the k-th term of the sum we use formula F_{ℓ,ℓ+2−k}. Collecting all the previous computations and making some simplifications (in particular, we need here the cancellation property (3.15)), we arrive at an expansion in which the Jacobi identity gives t^{ℓ+1}{· · ·} = t^{ℓ+1}[Z, [X, X_w]], which is the desired term.
Finally, we need to consider all the terms involving sums. Exchanging k and h in (2), we may rewrite them in the required form. Therefore Step 2, and with it the proof of Theorem 3.4, is concluded.
3.3. Derivatives of approximate exponentials. Here we give the formula for the derivative of an approximate exponential. The whole subsection is written for the mollified vector fields X^σ_j, but we drop the superscript everywhere.
Theorem 3.5. There is t_0 > 0 such that, for any ℓ ∈ {2, …, s} and w = (w_1, …, w_ℓ), letting C_t = C_t(X_{w_1}, …, X_{w_ℓ}), there are constants a_w, ā_w such that formulas (3.19) and (3.20) hold for any x ∈ K, t ∈ [0, t_0]. From Theorem 3.5 it is very easy to obtain the following result. Theorem 3.6. For any commutator X_w of length |w| = ℓ ≤ s we have, for x ∈ K and t ∈ [−t_0, t_0], formula (3.22). Example 5.7 shows that, even if the vector fields are smooth, the map exp_*(tX_w) is in general at most C^{1,α} for some α < 1.
Proof of Theorem 3.6. Formula (3.22) follows immediately from (3.19), (3.20) and the definition (2.3) of e * . We only need to show now that the map is C 1 in both variables t, x.
Recall that the vector fields X^σ_j are smooth and in particular C^1. By classical ODE theory (see [Ha, Chap. 5]), any map of the form (τ_1, …, τ_ν, x) ↦ e^{τ_1 X_{i_1}} · · · e^{τ_ν X_{i_ν}} x is C^1 if the τ_j belong to some neighborhood of the origin and x ∈ Ω′. This implies that, for any commutator X_w, the map ∇_x exp_*(tX_w)x is continuous on (t, x) ∈ I × Ω′, while (d/dt) exp_*(tX_w)x is continuous on (t, x) ∈ (I \ {0}) × Ω′.
Next we prove that (d/dt) exp_*(tX_w)x exists and is continuous also at all points of the form (0, x). Observe first that formula (3.22) gives (3.23). Now, (3.23) and l'Hôpital's rule imply that (d/dt) exp_*(tX_w)x|_{t=0} = 0 for all x ∈ Ω′. Finally, the uniformity of the limit ensures that the map (t, x) ↦ (d/dt) exp_*(tX_w)x is actually continuous on I × Ω′.
Proof of Theorem 3.5. We divide the proof into two steps.
Step 1 is concluded.
Step 2. We prove by induction that, if Theorem 3.5 holds for some ℓ ∈ {2, …, s − 1}, then it holds for ℓ + 1. To show the result for ℓ = 2, it suffices to follow the proof below, taking into account that formulas (3.19) and (3.20) are trivial if ℓ = 1. We use the notation in (3.17) for C_t and C^0_t. In view of (3.10) and of the already accomplished Step 1, it suffices to prove (3.25). We prove only the first line of (3.25); the second is similar. We know that the expansion (3.26) holds, with the remarkable cancellation (3.21). Observe that a_v = ā_v = 0 if ℓ = 1. Next we split the derivative into the terms A_1, …, A_4. First we study A_1 + A_3, by means of (3.12) and F_{ℓ,ℓ+1}.
Next we study A_2 + A_4, by means of (3.26).

Persistence of maximality conditions on balls
Here we establish a key stability property of the η-maximality condition. The argument, as in [TW], is based on Gronwall's inequality.
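Gronwall's inequality — if u′ ≤ a u on [0, T], then u(t) ≤ u(0)e^{at} — is the only analytic tool needed here. A quick numerical sanity check, with an illustrative coefficient of our own choosing:

```python
import math

# Gronwall's inequality: if u'(t) <= a*u(t) on [0, T], then u(t) <= u(0)*exp(a*t).
# Numeric check on u' = a*cos(t)^2 * u, for which u' <= a*u (illustrative choice).
a, T, n = 1.5, 2.0, 20000
h = T / n
u = 1.0
for i in range(n):
    t = i * h
    u += h * a * math.cos(t) ** 2 * u   # forward Euler
print(u, math.exp(a * T))   # u stays below the Gronwall bound e^{aT} ≈ 20.1
```

Forward Euler respects the bound here because each factor (1 + h a cos²(t)) ≤ e^{h a cos²(t)}, so the discrete solution is itself dominated by e^{aT}.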
Theorem 4.1. Let X_1, …, X_m be vector fields in A_s. Then there are r_0 > 0 and ε_0 > 0, depending on the constants L and ν in (2.2) and (2.5), such that, if for some η ∈ ]0, 1[, x ∈ K and r < r_0 the triple (I, x, r) is η-maximal, then for any y ∈ B(x, ηε_0 r) we have the estimates in (4.2). To prove Theorem 4.1 we need the following easy lemma.
Lemma 4.2. There is C > 0, depending on L and ν, such that, given y ∈ Ω and z ∈ R^n, the linear system Σ_{i=1}^q Y_i(y) ξ_i = z has a solution ξ ∈ R^q such that |ξ| ≤ C|z|.
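The content of Lemma 4.2 is a quantitative solvability statement for an underdetermined spanning system. A numerical sketch (random spanning columns, our own illustrative data) uses the minimal-norm solution, for which the constant C can be taken as the operator norm of the pseudoinverse:

```python
import numpy as np

# If the columns Y_1, ..., Y_q span R^n, the minimal-norm solution xi of
# Y xi = z satisfies |xi| <= C |z| with C = |Y^+| (norm of the pseudoinverse).
rng = np.random.default_rng(0)
n, q = 3, 7
Y = rng.normal(size=(n, q))           # generic, hence of full rank n
z = rng.normal(size=n)
xi = np.linalg.pinv(Y) @ z            # minimal-norm solution
C = np.linalg.norm(np.linalg.pinv(Y), 2)
print(np.allclose(Y @ xi, z),
      np.linalg.norm(xi) <= C * np.linalg.norm(z) + 1e-12)
```

In the lemma, the uniformity of C over y ∈ Ω is exactly the point: it comes from the lower bound ν in (2.5), which keeps the frame Y_1(y), …, Y_q(y) uniformly nondegenerate.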
Proof of Theorem 4.1. Observe that if (I, x, r) is η-maximal, then there is σ̄ > 0, which may also depend on I, x, r, such that (I, x, r) is η-maximal for the mollified X^σ_j for all σ ≤ σ̄. Therefore we will give the proof for smooth vector fields (without writing any superscript). The nonsmooth case will follow by passing to the limit as σ → 0 and taking into account that all constants are stable.
At this point we can prove the following statement.
Corollary 4.3. Assume that (I, x, r) is η-maximal for the vector fields X_1, …, X_m in A_s, for some x ∈ K and r ≤ r_0. Then for any y ∈ B(x, ε_0 ηr) and any j = 1, …, q, we may write Y_j(y) = Σ_{k=1}^n a^k_j Y_{i_k}(y), where |a^k_j| ≤ C_η r^{ℓ_{i_k} − ℓ_j}.
Proof. Write Y_i instead of Y_i(y). Consider the linear system Y_j = Σ_{k=1}^n a^k_j Y_{i_k}. Cramer's rule, together with (4.2), furnishes the stated bound, and the proof is concluded.
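The Cramer's-rule step can be written out explicitly. With I(k ↦ j) denoting the n-tuple obtained from I by replacing i_k with j (a notation we introduce only for this sketch), the coefficients are ratios of the determinants (1.3):

```latex
a^k_j \;=\; \frac{\det\bigl(Y_{i_1},\dots,Y_{i_{k-1}},\,Y_j,\,Y_{i_{k+1}},\dots,Y_{i_n}\bigr)(y)}
                 {\det\bigl(Y_{i_1},\dots,Y_{i_n}\bigr)(y)}
\;=\; \frac{\lambda_{I(k\mapsto j)}(y)}{\lambda_I(y)}\,.
```

The bound |a^k_j| ≤ C_η r^{ℓ_{i_k}−ℓ_j} then follows by estimating numerator and denominator through (4.2) and the η-maximality of (I, x, r), since replacing i_k by j changes the weight ℓ(I) exactly by ℓ_j − ℓ_{i_k}.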

Ball-box theorem
5.1. Derivatives of almost exponential maps. Here we take Hörmander vector fields X_1, …, X_m in A_s. When we choose an n-tuple I = (i_1, …, i_n) ∈ S and the n-tuple is understood, we write Y_{i_j} = U_j and ℓ(Y_{i_j}) = ℓ(U_j) = d_j for j = 1, …, n. Our first result is the following. Theorem 5.1. There are r_0, σ_0 and C > 0 such that, given I ∈ S, for any j = 1, …, n, σ ≤ σ_0, x ∈ K and h ∈ Q_I(r_0), the C^1 map E^σ_{I,x} satisfies the expansion stated below, where the sum is empty if d_j = ℓ(U_j) = s, and the following estimates hold. Theorem 5.1 holds without assuming η-maximality. If the triple (I, x, r) is η-maximal, we have more. To state the result, fix once and for all a dimensional constant χ > 0 such that det(I_n + A) ≥ 1/2 for all A ∈ R^{n×n} with norm |A| ≤ χ.
Theorem 5.2. Let r_0, σ_0 > 0 be as in Theorem 5.1. Given an η-maximal triple (I, x, r) for the vector fields X_i, with x ∈ K, r < r_0 and σ ≤ σ_0, then for any h ∈ Q_I(ε_0 ηr) and j = 1, …, n we may write the refined expansion, whose coefficients satisfy the estimates (5.5) and (5.6). Remark 5.3. Estimate (5.6) and the results of Section 4 imply that, under the hypotheses of Theorem 5.2, we have a lower bound on |det U(E_{I,x}(h))| in terms of |λ_I(x)|. Proof of Theorem 5.1. Without loss of generality, in order to simplify notation, we may work in R^2. We drop everywhere the superscript σ. Then E_I(x, h) = e^{h_1 U_1}_* e^{h_2 U_2}_* x. Denote by u the identity function.
To conclude the proof it suffices to write all the terms X_v(u e^{h_1 U_1}_*)(e^{h_2 U_2}_* x) in (5.7) in the form X_v u(E_{I,x}(h)) plus an appropriate remainder. The argument is the same as that used in equation (5.8) and we leave it to the reader.
Write briefly E instead of E_{I,x}(h). Looking at the right-hand side of (5.9), we need to study, for any word w of length |w| = ℓ with ℓ = d_j + 1, …, s, the corresponding term; here we also used (5.11). This gives the estimate of the terms in the sum in (5.9).
Next we look at the remainder ω_j. Fix j = 1, …, n. We know that |ω_j| ≤ C ‖h‖_I^{s+1−d_j} and we want to write ω_j = Σ_k b^k_j U_k(E) with estimate (5.6). It is convenient to multiply by r^{d_j}. Let r^{d_j} ω_j =: θ ∈ R^n and ξ_k = r^{d_j} b^k_j. Thus it suffices to show that we can write θ = Σ_k ξ_k U_k(E), where ξ_k satisfies the estimate |ξ_k| ≤ C_η (‖h‖_I/r) r^{d_k}. To estimate ξ_k, we follow a two-step argument. Step 1. Write θ, by Lemma 4.2, in terms of the fields Y_i(E).
Step 2. For any i = 1, …, q write Y_i(E) = Σ_{k=1}^n λ^k_i U_k(E). This can be done in a unique way, and the estimate |λ^k_i| ≤ C_η r^{d_k − ℓ(Y_i)} holds by Corollary 4.3. Collecting Steps 1 and 2, we conclude the required estimate. This ends the proof.
Next we pass to the limit as σ → 0 in both Theorems 5.1 and 5.2.
Theorem 5.4. If (I, x, r) is η-maximal for some x ∈ K, r ≤ r_0, then the map E_{I,x} restricted to Q_I(ε_0ηr) is locally biLipschitz in the Euclidean sense and satisfies, for a.e. h, the expansion stated below, where the sum is empty if d_j = ℓ(U_j) = s and otherwise the following estimates hold. Remark 5.5. If s ≥ 3, then vector fields of the class A_s are C^1. Then, as discussed at the beginning of the proof of Theorem 3.6, the map E_{I,x} is actually C^1 smooth. This is not ensured if s = 2.
Proof of Theorem 5.4. Look first at the C^1 map E^σ = E^σ_{I,x} defined on Q_I(r_0), and denote by E its pointwise limit as σ → 0. By Theorem 5.1, the map E^σ satisfies, for any σ < σ_0 and ‖h‖_I ≤ r_0, the expansion (5.13), where the a^w_j do not depend on σ, while |ω^σ_j(h)| ≤ C ‖h‖_I^{s+1−d_j} uniformly in σ ≤ σ_0. Let (E^{σ_k}) be a sequence weakly converging to E in W^{1,2}. Therefore, by (5.13), the remainder ω^{σ_k}_j has a weak limit in L^2; denote it by ω_j. Standard properties of weak convergence ensure that |ω_j(h)| ≤ C_0 ‖h‖_I^{s+1−d_j} for a.e. h. Therefore we have proved the first line of (5.9) and estimates (5.10) and (5.11). To prove the second line and (5.12), it suffices to repeat the argument of Theorem 5.2, taking into account that the main ingredient there, namely Corollary 4.3, holds for nonsmooth vector fields in A_s. Now we have to prove the local injectivity of E. Let σ be small enough to ensure that (I, x, r) is η-maximal for the vector fields X^σ_j. In view of Theorem 5.2, we can write dE^σ(h) = U^σ(E^σ(h))(I_n + B^σ(h)), where U^σ = [U^σ_1, …, U^σ_n] and the entries of the matrix B^σ satisfy |(b^k_j)^σ| ≤ C r^{d_k − d_j}, by (5.5). Fix now h_0 ∈ Q_I(ε_0 ηr), where ε_0 η comes from Theorem 5.2. We will show that E^σ is locally one-to-one around h_0, with a coercivity estimate stable as σ → 0. By Proposition 2.4 and by the continuity of the vector fields U_j, we may claim that for any δ > 0 there is ̺ > 0 such that |U^σ_j(ξ) − U^σ_j(ξ′)| < δ as soon as ξ, ξ′ ∈ K, |ξ − ξ′| < ̺ and σ < ̺. Recall also that E^σ is Lipschitz continuous, uniformly in σ; see (5.13). Then, for any δ > 0 there is ̺ > 0 such that B_Eucl(h_0, ̺) ⊂ Q_I(ε_0 ηr) and, if |h − h_0| ≤ ̺ and σ < ̺, then |U^σ(E^σ(h)) − U^σ(E^σ(h_0))| ≤ δ.
Take h, h′ ∈ B_Eucl(h_0, δ). By integrating along a path joining them, we obtain two lines to be estimated. To estimate the first line from below, recall the easy inequality |Ax| ≥ C^{-1} (|det A| / |A|^{n−1}) |x|. Observe also that |I + B^σ(γ)| ≤ C r^{1−s}. Moreover, in view of Remark 5.3, we must have |det U^σ(E^σ(h_0))| ≥ C^{-1} |λ_I(x)| for small σ. This suffices to estimate the first line from below. To estimate the second line we again need the inequality |I + B^σ(γ)| ≤ C r^{1−s}. Eventually we get the coercivity estimate. The proof is concluded as soon as we choose δ = δ(I, x, r) small enough and let σ → 0.
This argument shows that the map is locally biLipschitz, as desired.
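The elementary inequality |Ax| ≥ C^{-1}(|det A|/|A|^{n−1})|x| used above holds with C = 1 in the operator 2-norm, since σ_min(A) = |det A| / (σ_1 ⋯ σ_{n−1}) ≥ |det A| / σ_max^{n−1}. A randomized numerical check, with test data of our own choosing:

```python
import numpy as np

# Linear-algebra inequality used in the injectivity proof:
# |A x| >= (|det A| / |A|^{n-1}) |x| in the operator 2-norm, because
# sigma_min = |det A| / (sigma_1 ... sigma_{n-1}) >= |det A| / sigma_max^{n-1}.
rng = np.random.default_rng(1)

def check(A):
    s = np.linalg.svd(A, compute_uv=False)       # singular values, decreasing
    return s[-1] >= abs(np.linalg.det(A)) / s[0] ** (A.shape[0] - 1) - 1e-12

ok = all(check(rng.normal(size=(4, 4))) for _ in range(100))
print(ok)   # True
```

This is exactly the mechanism that converts the determinant lower bound of Remark 5.3, together with |I + B^σ| ≤ C r^{1−s}, into a quantitative lower bound for |dE^σ(h) v|.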

5.2.
Pullback of vector fields through scaling maps. Given an η-maximal triple (I, x, r) for vector fields of the class A_s, we can define, as in [TW], the "scaling map" (5.14) Φ_{I,x,r}(t) = exp(Σ_{j=1}^n t_j r^{ℓ_{i_j}} Y_{i_j})x, for small |t|. The dilation δ^I_r(t) := (t_1 r^{ℓ_{i_1}}, …, t_n r^{ℓ_{i_n}}) makes the natural domain of Φ_{I,x,r} independent of r. Observe the property ‖δ^I_r t‖_I = r ‖t‖_I. It turns out that, if X̃_k (k = 1, …, m) denotes the pullback of rX_k under Φ_{I,x,r}, then X̃_1, …, X̃_m satisfy the Hörmander condition in a uniform way. This fact enables the authors of [TW] to introduce several simplifications of the arguments in [NSW].
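The homogeneity ‖δ^I_r t‖_I = r ‖t‖_I is immediate from |t_j r^{ℓ_{i_j}}|^{1/ℓ_{i_j}} = r |t_j|^{1/ℓ_{i_j}}; a two-line check, where the lengths and test data are illustrative choices of ours:

```python
# Dilation attached to I: delta_r(t) = (t_1 r^{l_1}, ..., t_n r^{l_n}), and the
# anisotropic "norm" ||t||_I = max_j |t_j|^{1/l_j}; then ||delta_r t||_I = r ||t||_I.
lengths = [1, 1, 2]                    # e.g. lengths of Y1, Y2, [Y1, Y2]

def delta(r, t):
    return [tj * r ** lj for tj, lj in zip(t, lengths)]

def hnorm(t):
    return max(abs(tj) ** (1.0 / lj) for tj, lj in zip(t, lengths))

t = [0.5, -0.3, 0.2]
r = 0.25
print(abs(hnorm(delta(r, t)) - r * hnorm(t)) < 1e-12)   # True: homogeneity
```

This scaling property is what allows one to work on a fixed box in the t-variables, uniformly in r.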
We can also consider the scaling map associated with our exponentials. Namely, (5.15) S_{I,x,r}(t) := exp_*(t_1 r^{ℓ_{i_1}} Y_{i_1}) · · · exp_*(t_n r^{ℓ_{i_n}} Y_{i_n})x = E_{I,x}(δ^I_r t). It will be proved in Subsection 5.3 that, if (I, x, r) is η-maximal, then S is one-to-one on the set {‖t‖_I ≤ ε_0 η}. If we assume that the original vector fields are of class C^1 (see Remark 5.5), then we may define, for all i ∈ {1, …, q}, the vector fields Ỹ_i := S^{-1}_*(r^{ℓ_i} Y_i). Theorem 5.4 thus becomes: Proposition 5.6. Let X_1, …, X_m be vector fields in A_s. Let (I, x, r) be an η-maximal triple and let S := S_{I,x,r} be the associated scaling map. Then S restricted to Q_I(ε_0η) is a locally biLipschitz map and, for a.e. t ∈ Q_I(ε_0η), we may write (5.16), where the functions b^k_j satisfy (5.17). Moreover, if S is C^1 and we write Ỹ_{i_j} = ∂_{t_j} + Σ_{k=1}^n a^k_j(t) ∂_{t_k}, then (5.18) holds. Proof. Formula (5.16) is just Theorem 5.4. The proof of (5.18) is a consequence of (5.17) and of the following elementary fact: given a square matrix B ∈ R^{n×n} with norm |B| ≤ 1/2, we may write (I_n + B)^{-1} = I_n + A, where |A| = |Σ_{k≥1} (−B)^k| ≤ 2|B|.
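The elementary fact invoked in the proof is the Neumann series bound |A| ≤ Σ_{k≥1} |B|^k = |B|/(1 − |B|) ≤ 2|B| for |B| ≤ 1/2. A one-matrix numerical check, with random data of our own choosing:

```python
import numpy as np

# If |B| <= 1/2 in the operator norm, then (I + B)^{-1} = I + A with
# A = sum_{k>=1} (-B)^k, hence |A| <= |B| / (1 - |B|) <= 2 |B|.
rng = np.random.default_rng(2)
B = rng.normal(size=(3, 3))
B *= 0.4 / np.linalg.norm(B, 2)       # normalize so that |B| = 0.4 <= 1/2
A = np.linalg.inv(np.eye(3) + B) - np.eye(3)
print(np.linalg.norm(A, 2) <= 2 * np.linalg.norm(B, 2))   # True
```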
In the framework of our almost exponential maps, estimate (5.18) is sharp, even for smooth vector fields. The better estimate $\widetilde Y_{i_j}(t) = \partial_j + \sum_k a^k_j(t)\,\partial_k$ with $|a^k_j(t)| \le C|t|$, obtained in [TW] for maps of the form (5.14), generically fails for $S$, as the following example shows.
Theorem 5.8. Let $X_1, \ldots, X_m$ be Hörmander vector fields of step $s$ in the class $\mathcal{A}_s$. There are $r_0, \widehat r_0, C_0 > 0$, and for all $\eta \in (0, 1)$ there are $\varepsilon_\eta, C_\eta > 0$, such that: (A) if $(I, x, r)$ is $\eta$-maximal for some $x \in K$, $r \le r_0$, then, for any $\varepsilon \le \varepsilon_\eta$, inclusion (5.19) holds; (B) if $(I, x, r)$ is $\eta$-maximal for some $x \in K$, $r \le \widehat r_0$, then the map $E_{I,x}$ is one-to-one on the set $Q_I(\varepsilon_\eta r)$.
Remark 5.9. Observe that in the right-hand side of inclusion (5.19) we use the distance $d_\varrho$. Therefore, a standard consequence of (5.19) is the well-known property $B(x, r) \supset B_E(x, C^{-1} r^s)$, for any $x \in K$, $r < r_0$. See [FP].
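The Euclidean inclusion rests on the fact that, for $r < 1$, the box $Q(C^{-1}r) = \{h : \max_j |h_j|^{1/d_j} < C^{-1}r\}$ of the Introduction contains a full Euclidean ball of radius $(C^{-1}r)^s$, since every exponent satisfies $d_j \le s$; a sketch of this standard step:

```latex
|h| < (C^{-1}r)^s \le 1
\;\Longrightarrow\;
|h_j|^{1/d_j} \le |h|^{1/d_j} \le |h|^{1/s} < C^{-1}r
\qquad (j = 1, \dots, n),
```

so $h \in Q(C^{-1}r)$; here we used that $x^{1/d_j} \le x^{1/s}$ for $0 \le x \le 1$ and $d_j \le s$.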
Remark 5.10. In the paper [TW] the authors use the exponential maps in (1.2).
If the vector fields have step $s$, then their method requires that the commutators of length $2s$ be at least continuous. (Here we specialize [TW] to the case $\varepsilon = 1$ and we do not discuss the higher regularity estimate [TW, Eq. (2.1)].) This appears in the proof of (22) and (23) of [TW, Proposition 4.1]. Indeed, in equation (29), the commutator $[X_w, X_{w_j}]$ must be written as a linear combination of commutators $X_{w'}$, where for algebraic reasons it must be $|w'| = |w| + |w_j|$. If $|w| = |w_j| = s$, then commutators of length $2s$ appear. A similar issue arises for $[Y_{w_i}, Y_{w_j}]$ at the beginning of p. 619.
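A minimal instance of this length bookkeeping, with hypothetical words chosen only for illustration (assuming $m \ge 3$): take $s = 2$, $w = (1,2)$ and $w_j = (1,3)$, both of length $s$. Then

```latex
[X_w, X_{w_j}] = \bigl[\,[X_1, X_2],\; [X_1, X_3]\,\bigr],
\qquad
|w'| = |w| + |w_j| = 2s = 4,
```

so expanding the left-hand side produces commutators of length $4 = 2s$, whose coefficients must then be at least continuous for the argument of [TW] to apply.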
Remark 5.11. The reason why we introduce two different constants $r_0$ and $\widehat r_0$ is that $C_0$, $\varepsilon_0$ and $r_0$ depend only on $L$ and $\nu$ in (2.2) and (2.5) (together with universal constants, like $m$, $n$ and $s$). The constants $\varepsilon_\eta$ and $C_\eta$ depend on $\nu$, $L$ and $\eta$ also. We do not have a control of $\widehat r_0$ (which appears only in the injectivity statement) in terms of $L$ and $\nu$. This is a delicate question because of the covering argument implicitly contained in [NSW, p. 132] and described in [M, p. 230]. Below we give a constructive procedure yielding a lower bound for $\widehat r_0$ in terms of the functions $\lambda_I$; see p. 31. This can be of some interest in view of applications of our results to nonlinear problems.
Remark 5.12. The proof of the injectivity result would be considerably simplified if we could prove (uniformly in $x \in K$, $r < r_0$) an equivalence between the balls and their convex hulls, i.e. $\mathrm{co}\,B(x, r) \subset B(x, Cr)$, which is reasonable for diagonal vector fields (see [SW, Remark 5]), or a "contractibility" property of the ball $B(x, r)$ inside $B(x, Cr)$; see [Sem, Definition 1.7]. Unfortunately, in spite of their plausible appearance, both these conditions seem quite difficult to prove in our situation. It also seems that the clever argument in [TW, p. 622] cannot be adapted to our almost exponential maps.
In the proof of inclusion (5.19), we follow the argument in [NSW, M]. Before giving the proof, we need to show that some of the constants involved actually depend only on $L$ and $\nu$ in (2.2) and (2.5). Basically, what we need is contained in Corollary 4.3 and in the following lemma; see [NSW, p. 129].
Proof of Theorem 5.8, (A). Since the vector fields $Y_j$ with $\ell_j = s$ are not Euclidean Lipschitz continuous, we do not know whether any point in a $\varrho$-ball of the $Y_j$ can be approximated by points in the analogous ball of the mollified $Y_j^\sigma$. In order to avoid this problem, observe the inclusion $B_\varrho(x, r) \subset B_{\widetilde\varrho}(x, Cr)$, where $C$ is absolute and the distance $\widetilde\varrho$ is defined using the family $\{Y_j : \ell_j \le s-1\} \cup \{\partial_k : k = 1, \ldots, n\}$, where we assign to the vector fields $\partial_k$ the maximal weight $s$. Therefore, we will prove the inclusion using the distance $\widetilde\varrho$, which is defined by Lipschitz vector fields.
Once the claim is proved, the surjectivity statement follows.
To prove the claim, the key estimate we need is the following. Let $U \subset Q_I(\varepsilon_\eta r)$, $\sigma \le \bar\sigma$, and assume that a $C^1$-diffeomorphism $\psi = (\psi_1, \ldots, \psi_n)$ satisfies locally $\psi(E_\sigma(h)) = h$ for all $h \in U$, where, for some $t \in [0,1]$, $E_\sigma(U)$ is a neighborhood of $\gamma_\sigma(t)$. Then estimate (5.21) holds for $\mu = 1, \ldots, n$ and for all $\tau$ close to $t$. The constant $C_\eta$ will be chosen below, while $C$ depends on $L$, $\nu$, in force of Corollary 4.3 and Lemma 5.13. We used the estimate $\partial_i = \sum_k \tilde a^k_i U^\sigma_k$ with $|\tilde a^k_i| \le C_\eta r^{d_k - s}$, which follows from Lemma 4.2 and Corollary 4.3.
With estimate (5.21) in hand, we can prove the claim along the lines of [M, p. 228]. Here is a sketch of the argument.
Step 2. There exists a path $\theta_\sigma$ on $[0,1]$ satisfying (5.20). The proof of Step 2 can be carried out as in [M, p. 229] by a very classical argument, involving an upper bound "of Hadamard type" $|dE_\sigma(\theta_\sigma(t))^{-1}| \le C$, which holds uniformly in $t$.
The proof of the statement (A) is concluded.
Before proving part (B) of Theorem 5.8, we need the following rough injectivity statement.
Lemma 5.14. Let $x \in K$ and $I$ be such that $\lambda_I(x) \ne 0$. Then the function $E_{I,x}$ is one-to-one on the set $Q_I(C^{-1}|\lambda_I(x)|)$.
Proof. Observe first that, for all $j = 1, \ldots, n$ and small $\sigma$, the required bound follows from estimates (5.2), (5.3) and the $d$-Lipschitz continuity of $U^\sigma_j$.
As announced in Remark 5.11, we now provide a constructive procedure for the "injectivity radius" $\widehat r_0$ in Theorem 5.8 in terms of the functions $\lambda_I$. Compare [M, pp. 229-230].
Proposition 5.15. There is $C > 1$ such that, letting $r_x := C^{-1}|\lambda_{I_x}(x)|$ for all $x \in K$, then: (1) the triple $(I_x, y, r)$ is $C^{-1}$-maximal for any $r \le r_x$ and $y \in B(x, \varepsilon_0 r_x)$; (2) the map $h \mapsto E_{I_x}(y, h)$ is one-to-one on the set $Q_{I_x}(r_x)$, for any $y \in B(x, \varepsilon_0 r_x)$.
Observe that Proposition 5.15 is far from what we need, because it may happen that $\inf_K r_x = 0$ (for example, this occurs already in the elementary situation $X_1 = \partial_1$, $X_2 = \cdots$).

Proof. We first prove (1) for $y = x$. Namely, we show that (5.26) holds with $r_x = C^{-1}|\lambda_{I_x}(x)|$, as required. Let $J \in S$. If $\lambda_J(x) = 0$, then (5.26) holds for all $r > 0$. If instead $\lambda_J(x) \ne 0$, by the choice of $I_x$ it must be $\ell(J) = \ell(I_x)$ or $\ell(J) > \ell(I_x)$. If $\ell(J) = \ell(I_x)$, then (5.26) holds for any $r > 0$, because $|\lambda_{I_x}(x)|$ is maximal, by the construction above. If $\ell(J) > \ell(I_x)$, then (5.26) holds for any $r \le r_x$, where $r_x$ has the required form. The proof of (1) for $y \ne x$ follows from Theorem 4.1. Finally, to prove (2), observe that, in view of Lemma 5.14, the map $h \mapsto E_{I_x}(y, h)$ is one-to-one on $Q_{I_x}(C^{-1}|\lambda_{I_x}(y)|)$. But Theorem 4.1, in particular (4.1), shows that, if $d(x, y) \le \varepsilon_0 r_x$, then $|\lambda_{I_x}(y)|$ and $|\lambda_{I_x}(x)|$ are comparable. This concludes the proof.
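A plausible reconstruction of the elided computation in the case $\ell(J) > \ell(I_x)$, assuming $|\lambda_J| \le L$ on $K$ (as in (2.2)) and that (5.26) compares $|\lambda_J(x)|\,r^{\ell(J)}$ with $|\lambda_{I_x}(x)|\,r^{\ell(I_x)}$:

```latex
|\lambda_J(x)|\, r^{\ell(J)}
  \;\le\; L\, r^{\ell(I_x)+1}
  \;=\; L\, r \cdot r^{\ell(I_x)}
  \;\le\; |\lambda_{I_x}(x)|\, r^{\ell(I_x)}
\qquad \text{for } r \le r_x = C^{-1}|\lambda_{I_x}(x)|,\ C \ge L,
```

which indicates why the radius $r_x$ must be taken proportional to $|\lambda_{I_x}(x)|$.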
Proof of Theorem 5.8, (B). Let $p_1 \le p$ be the largest integer such that $\Sigma_{p_1} \ne \emptyset$. Then define the "injectivity radius" $r^{(p_1)}$, where the open set $\Omega'$ was introduced before (2.2). Recall that all metric balls $B(x, r)$ are open, by the already established Theorem 5.8, part (A). Then, by Proposition 5.15, for any $y \in \Omega_{p_1}$ there is $x \in \Sigma_{p_1}$ such that the map $h \mapsto E_{I_x}(y, h)$ is one-to-one on $Q_{I_x}(\varepsilon_0 r_x)$ and $(I_x, y, r_x)$ is $C^{-1}$-maximal. Recall that $r_x \ge r^{(p_1)}$ on $\Sigma_{p_1}$. Next, let $p_2 < p_1$ be the largest number such that $K_{p_2} := \Sigma_{p_2} \setminus \Omega_{p_1} \ne \emptyset$, and let $r^{(p_2)}$ be defined by the analogous minimum. We claim that for any $y \in \Omega_{p_2} := \bigcup_{x \in K_{p_2}} \Omega' \cap B(x, r^{(p_2)})$, there is $x \in K_{p_2}$ such that the map $h \mapsto E_{I_x}(y, h)$ is one-to-one on the set $Q_{I_x}(\varepsilon_0 r_x)$ and $(I_x, y, r_x)$ is $C^{-1}$-maximal.
Iterating the argument, and letting $\widehat r_0 = \min_k r^{(p_k)}$, we conclude that for any $x \in K$ there are an $n$-tuple $I_0 = I_0(x)$ and $\varrho_0 = \varrho_0(x) \ge \widehat r_0$ such that $E_{I_0}(x, \cdot)$ is one-to-one on the set $Q_{I_0}(\varepsilon_0 \varrho_0)$ and $(I_0, x, \varrho_0)$ is $C^{-1}$-maximal. Clearly, $I_0$ can be different from $I_x$. This is the starting point for the proof of the injectivity statement, Theorem 5.8, item (B).
From now on, $I$, $x \in K$ and $r < \widehat r_0$ are fixed and $(I, x, r)$ is $\eta$-maximal, as in the hypotheses of (B). Let $I_0$ and $\varrho_0$ be the $n$-tuple and the injectivity radius associated with $x$ by the argument above. Recall that $\varrho_0 \ge \widehat r_0$. Arguing as in [M, p. 230], see also [NSW, p. 133], we may find a sequence of $n$-tuples $I = I_N, I_{N-1}, \ldots, I_1, I_0$ and corresponding numbers $0 \le \varrho_{N+1} < \varrho_N < \cdots < \varrho_0$, with $\varrho_0 \ge \widehat r_0$ and $r \in [\varrho_{N+1}, \varrho_N]$, such that (5.28) holds for any $j = 0, 1, \ldots, N-1$. In order to show that $E_I = E_{I_N}$ is one-to-one on the set $Q_I(\varepsilon_\eta r)$, for some $\varepsilon_\eta > 0$, we start by showing that $E_{I_1}$ is one-to-one on the set $Q_{I_1}(\varepsilon'_\eta \varrho_1)$, for a suitable $\varepsilon'_\eta$. What we know is that $E_{I_0}$ is one-to-one on the set $Q_{I_0}(\varrho_0)$. We also know that (5.28) holds for $j = 0, 1$ and $\varrho = \varrho_1$. Therefore, applying (5.19) twice, we obtain the required chain of inclusions.
This proves that $E_{I_1}$ is one-to-one on $Q_{I_1}(\varepsilon'_\eta \varrho_1)$. Iterating the argument at most $N$ times, we obtain the proof of statement (B) of Theorem 5.8.

Examples
Example 6.1 (Levi vector fields). In order to illustrate the previous procedure to find $\widehat r_0$, we exhibit the following three-step example. In $\mathbb{R}^3$ consider the vector fields $X_1 = \partial_{x_1} + a_1 \partial_{x_3}$ and $X_2 = \partial_{x_2} + a_2 \partial_{x_3}$. Assume that the vector fields belong to the class $\mathcal{A}_3$. Let us define $f = X_1 a_2 - X_2 a_1$. Moreover, assume that $|f| + |X_1 f| + |X_2 f| \ne 0$ at every point of the closure of a bounded set $\Omega \supset K = \overline{\Omega'}$.
Assume also that $f$ has some zero inside $K$. This condition naturally arises in the regularity theory for graphs of the form $\{(z_1, z_2) \in \mathbb{C}^2 : \operatorname{Im}(z_2) = \varphi(z_1, \bar z_1, \operatorname{Re}(z_2))\}$ having some first-order zeros. See [CM], where the smoothness of $C^{2,\alpha}$ graphs with prescribed smooth Levi curvature is proved.
In the next example we show a subelliptic-type estimate for nonsmooth vector fields. The argument of the proof below is due to Ermanno Lanconelli (unpublished).
Proof. We just sketch the proof, leaving some details to the reader. For any $I \in S$, let $\Omega_I := \{x \in \Omega : I_0(x) = I\}$, where $I_0(x)$ comes from the proof of Theorem 5.8, together with $\varrho_0 = \varrho_0(x) \ge \widehat r_0$; see the discussion before equation (5.28). If $x \in \Omega_I$, we have $B(x, \varrho_0) \subset E_I(x, Q_I(C\varrho_0))$, where the biLipschitz map $E_I$ satisfies $C^{-1} \le |\det dE_I(x, h)| \le C$, for a.e. $h \in Q_I(C\varrho_0)$. Now observe that, arguing as in the proof of Lemma 5.14, we have $|E_I(x, h) - x| \ge C^{-1}|h|$, if $\|h\|_I \le C\varrho_0$. Let $\delta_0 = \max_{x \in K} \varrho_0(x)$. Next we follow the argument in [LM]. Write $E_I(x, h) = \gamma_I(x, h, T(h))$, where $\gamma_I(x, h, t)$, $t \in [0, T(h)]$, is a control function with the properties described in [LM]. Therefore the map defines a change of variable, by the estimate $T(h) \le \|h\|_I \le |h|^{1/s}$ and the strict inequality $\varepsilon < 1/s$.
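The second inequality used for the change of variable can be checked directly, assuming again the homogeneous norm $\|h\|_I = \max_j |h_j|^{1/\ell_{i_j}}$, with every length $\ell_{i_j} \le s$ and $|h| \le 1$:

```latex
\|h\|_I
  = \max_j |h_j|^{1/\ell_{i_j}}
  \;\le\; \max_j |h_j|^{1/s}
  \;\le\; |h|^{1/s},
```

since $x^{1/\ell} \le x^{1/s}$ for $0 \le x \le 1$ and $\ell \le s$; the strict inequality $\varepsilon < 1/s$ then makes the resulting singular factor integrable.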
The borderline inequality $\|f\|_{1/s} \le C \|Xf\|_{L^2}$, which cannot be obtained with the argument above, was proved in the smooth case by Rothschild and Stein [RoS].

Proof of Proposition 2.4
Here we prove Proposition 2.4. By definition, (2.1) means that for all $j, k \in \{1, \ldots, m\}$ and $|w| \le s-1$ there is a bounded function $X_j(X_k f_w)$ such that for any test function $\psi \in C_c^\infty(\mathbb{R}^n)$,
(7.1) $\displaystyle\int (X_k f_w)\,(X_j \psi)\, dx = -\int \{X_j(X_k f_w) + \operatorname{div}(X_j)\, X_k f_w\}\, \psi \, dx$.
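The divergence term in (7.1) comes from the usual integration by parts: writing $X_j = \sum_{i=1}^n b^i \partial_i$ (with Lipschitz coefficients $b^i$, so that the computation is justified a.e.) and setting $g = X_k f_w$, for every test function $\psi$,

```latex
\int g\,(X_j \psi)\, dx
  = \sum_{i=1}^n \int g\, b^i\, \partial_i \psi \, dx
  = -\sum_{i=1}^n \int \partial_i (g\, b^i)\, \psi \, dx
  = -\int \bigl\{ X_j g + \operatorname{div}(X_j)\, g \bigr\}\, \psi \, dx,
```

which is exactly (7.1), with $X_j g$ interpreted in the weak sense.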
If $D = \partial_{j_1} \cdots \partial_{j_p}$ for some $j_1, \ldots, j_p \in \{1, \ldots, n\}$ is a Euclidean derivative, denote by $|D| = p$ its order. It is understood that a derivative of order $0$ is the identity. The first item of Proposition 2.4 is a consequence of the following lemma.

Lemma 7.1. Let $X_1, \ldots, X_m$ be vector fields in $\mathcal{A}_s$. Then for any word $w$ with $|w| \le s$ and for any Euclidean derivative $D$ of order $|D| = p \in \{0, \ldots, s - |w|\}$, estimate (7.2) holds.

Note that the case $p = 0$ of (7.2) provides the proof of item 1 of Proposition 2.4.
Observe also that, if $|w| = s$, then we have $|f_w - f^\sigma_w| \le |f_w - (f_w)^\sigma| + |f^\sigma_w - (f_w)^\sigma|$. Lemma 7.1 gives the estimate of the second term. The first one is estimated by means of the modulus of continuity of $f_w$, which is not included in $L$ in (2.2).
Proof of Lemma 7.1. We argue by induction on $|w|$. If $|w| = 1$, then the left-hand side of (7.2) vanishes. Assume that for some $\ell \in \{1, \ldots, s-1\}$, (7.2) holds for any word $w$ of length $\ell$ and for each $D$ with $|D| \le s - \ell$. Let $v = kw$ be a word of length $|kw| = \ell + 1$. We must show that (7.2) holds for any Euclidean derivative $D$ of order $0 \le |D| \le s - |v|$. We have $f_v = X_k f_w - X_w f_k$ and $f^\sigma_v = X^\sigma_k f^\sigma_w - X^\sigma_w f^\sigma_k$. We first prove (7.2) when the order of $D$ satisfies $1 \le |D| \le s - |v| = s - \ell - 1$,