Global Rigidity of Higher Rank Anosov Actions on Tori and Nilmanifolds

We show that sufficiently irreducible Anosov actions of higher rank abelian groups on tori and nilmanifolds are smoothly conjugate to affine actions.


Introduction
An Anosov diffeomorphism $f$ on a torus $T^n$ is affine if $f$ lifts to an affine map on $\mathbb{R}^n$. By a classical result of Franks and Manning, any Anosov diffeomorphism $g$ on $T^n$ is topologically conjugate to an affine Anosov diffeomorphism. More precisely, there is a homeomorphism $\phi : T^n \to T^n$ such that $f = \phi \circ g \circ \phi^{-1}$ is an affine Anosov diffeomorphism. We call $\phi$ the Franks-Manning conjugacy. The linear part of $f$ is the map induced by $g$ on $H_1(T^n)$.
Anosov diffeomorphisms are rarely C 1 -conjugate to affine ones. For example, one can perturb a linear Anosov diffeomorphism locally around a fixed point p to change the conjugacy class of the derivative at p. The resulting diffeomorphism will still be Anosov but cannot be C 1 -conjugate to its linearization. The situation is radically different for Z k -actions with many Anosov diffeomorphisms. In other words, Anosov diffeomorphisms rarely commute with other Anosov diffeomorphisms.
It follows easily from the result for a single Anosov diffeomorphism that an Anosov $\mathbb{Z}^k$-action $\alpha$ on $T^n$ is topologically conjugate to a $\mathbb{Z}^k$-action by affine Anosov diffeomorphisms. We call this action the linearization of $\alpha$ and denote it by $\rho$. Again, for any $a \in \mathbb{Z}^k$ the linear part of $\rho(a)$ is the map induced by $\alpha(a)$ on $H_1(T^n)$. The logarithms of the moduli of the eigenvalues of these linear parts define additive maps $\lambda_i : \mathbb{Z}^k \to \mathbb{R}$, which extend to linear functionals on $\mathbb{R}^k$. A Weyl chamber of $\rho$ is a connected component of $\mathbb{R}^k - \cup_i \ker \lambda_i$.
Theorem 1.1. Let $\alpha$ be a $C^\infty$-action of $\mathbb{Z}^k$, $k \ge 2$, on a torus $T^n$ and let $\rho$ be its linearization. Suppose that there is a $\mathbb{Z}^2$ subgroup of $\mathbb{Z}^k$ such that $\rho(a)$ is ergodic for every nonzero $a \in \mathbb{Z}^2$. Further assume that there is an Anosov element for $\alpha$ in each Weyl chamber of $\rho$. Then $\alpha$ is $C^\infty$-conjugate to $\rho$.
Furthermore, for a linear $\mathbb{Z}^k$-action on $T^n$, having a $\mathbb{Z}^2$ subgroup acting by ergodic elements is equivalent to several other properties, in particular to being genuinely higher rank [37]. A linear $\mathbb{Z}^k$-action is called genuinely higher rank if for all finite index subgroups $Z$ of $\mathbb{Z}^k$, no quotient of the $Z$-action factors through a finite extension of $\mathbb{Z}$. Hence we obtain the following corollary.
Corollary 1.2. Let $\alpha$ be a $C^\infty$-action of $\mathbb{Z}^k$, $k \ge 2$, on a torus $T^n$. Suppose that the linearization $\rho$ of $\alpha$ is genuinely higher rank. Further assume that there is an Anosov element for $\alpha$ in each Weyl chamber of $\rho$. Then $\alpha$ is $C^\infty$-conjugate to $\rho$.
We can define Weyl chambers for the action α itself. In fact these Weyl chambers will turn out to be the same for α and ρ. Hence existence of Anosov elements for α in every Weyl chamber of ρ is equivalent to existence of Anosov elements for α in every Weyl chamber of α.
We refer to our paper [11] for a brief survey of other results and methods in the classification of higher rank Anosov actions. Our global rigidity results above are optimal except that we require an Anosov element in every Weyl chamber. Rodriguez Hertz in [34] classifies higher rank actions on tori assuming only one Anosov element. However, his work requires multiple additional hypotheses such as bunching conditions and low dimensionality of coarse Lyapunov spaces. In particular, the hypotheses in [34] require that the rank of the acting group has to grow linearly with the dimension of the torus. It is a conjecture due to Katok and the third author that global rigidity holds assuming α has one Anosov element. We discuss this conjecture in more detail at the end of this introduction.
Let us briefly describe our proof, which crucially uses the Franks-Manning conjugacy $\phi$ for some Anosov element of the action. As we noted, $\phi$ also conjugates any commuting diffeomorphism to an affine map. In consequence, each element of the action gives a functional equation for $\phi$. This yields explicit series representations for its projection $\phi_V$ to any generalized joint eigenspace $V$ of $\rho$. The existence of Anosov elements of $\alpha$ in every Weyl chamber allows us to define coarse Lyapunov foliations as finest nontrivial intersections of stable and unstable foliations of Anosov elements. Since the latter are continuous, so are the coarse Lyapunov foliations. It is precisely here that the existence of an Anosov element in each Weyl chamber is used. We then employ the continuity of the coarse Lyapunov foliations to obtain uniform estimates for contraction and expansion. Thus elements close to a Weyl chamber wall act almost isometrically along suitable coarse Lyapunov foliations, or more precisely, we can make their exponents in these estimates as close to 0 as we wish, and in particular smaller than the size of the exponent in the exponential decay we get from exponential mixing. We use such elements to study the regularity of $\phi_V$ along each coarse Lyapunov foliation $W$. Using exponential mixing for Hölder functions we show that the partial derivatives along $W$ exist as distributions dual to spaces of Hölder functions. Then we adapt ideas from a paper by Rauch and Taylor to show that $\phi$ is smooth. We emphasize that the rigidity of $\mathbb{Z}^k$-actions for $k \ge 2$ is due to the co-existence of (almost) isometric and hyperbolic behavior in the actions. This utterly fails for $\mathbb{Z}$-actions.
The paper is organized as follows. We first explain general definitions, constructions and results for higher rank Anosov actions in Section 2. In Section 3 we turn to actions on tori and nilmanifolds, and use the Franks-Manning conjugacy to derive special properties of such actions. Most importantly, we will develop uniform growth estimates for elements near the Weyl chamber walls of the action in Section 3.2. We then turn to the case of the torus as it is substantially more elementary than the nilmanifold case. In Section 4, we establish exponential mixing for Z k -actions by ergodic affine automorphisms on a torus.
For smooth actions on tori with the standard smooth structure we prove in Section 5 the existence of partial derivatives in all directions as distributions dual to Hölder functions. This concludes the proof for the case of standard tori using the general regularity result that we establish in Section 8. For exotic tori, i.e. manifolds that are homeomorphic to but not diffeomorphic to tori, in dimensions at least 5 we can pass to a finite cover with the standard smooth structure. For dimension 4 we give a special argument in Section 6.
Finally, we adapt our arguments to nilmanifolds. Let $N$ be a simply connected nilpotent Lie group. We call a diffeomorphism of $N$ affine if it is a composition of an automorphism of $N$ with a left translation by an element of $N$. If $\Gamma \subset N$ is a discrete subgroup, we call the quotient $N/\Gamma$ a nilmanifold. An infra-nilmanifold $M$ is a manifold finitely covered by a nilmanifold. Diffeomorphisms of $M$ covered by affine diffeomorphisms of $N$ are again called affine. The Franks-Manning conjugacy theorem generalizes to infra-nilmanifolds: Suppose $M'$ is a smooth manifold homeomorphic to an infra-nilmanifold $M$. Then every Anosov diffeomorphism of $M'$ is conjugate to an affine diffeomorphism of $M$ by a homeomorphism $\phi$. We call $\phi$ the Franks-Manning conjugacy. Given an action $\alpha$ of $\mathbb{Z}^k$ on $M'$ which contains an Anosov diffeomorphism, the Franks-Manning conjugacy of that diffeomorphism jointly conjugates all $\alpha(a)$, $a \in \mathbb{Z}^k$, to affine diffeomorphisms $\rho(a)$. We call $\rho$ the linearization of $\alpha$. Now we can state our main result for nilmanifolds:
Theorem 1.3. Let $\alpha$ be a $C^\infty$-action of $\mathbb{Z}^k$, $k \ge 2$, on a compact infra-nilmanifold $N/\Gamma$ and let $\rho$ be its linearization. Suppose that there is a $\mathbb{Z}^2$ subgroup of $\mathbb{Z}^k$ such that $\rho(a)$ is ergodic for every nonzero $a \in \mathbb{Z}^2$. Further assume that there is an Anosov element for $\alpha$ in each Weyl chamber of $\rho$. Then $\alpha$ is $C^\infty$-conjugate to $\rho$.
Our main result reduces to the case of standard nilmanifolds, i.e. nilmanifolds with the differentiable structure coming from the ambient Lie group. Indeed, there are no Anosov diffeomorphisms on non-toral nilmanifolds in dimensions 4 or less, and the result by J. Davis in the appendix shows that any nilmanifold of dimension at least 5 is finitely covered by a standard nilmanifold.
For standard nilmanifolds we proceed similarly to the toral case. We adapt arguments of Margulis and Qian [32] to reduce regularity of the conjugacy to regularity of the solution of a cohomology equation. The relevant cocycle however takes values in a nilpotent group, and is not directly amenable to our approach. Instead, we consider suitable factors of the cocycle in various abelian quotients of the derived series of N . Again we prove regularity of coboundaries for the resulting cocycles by exponential mixing of the Z k action, uniform expansion and contraction of elements close to Weyl chamber walls, and showing existence of derivatives via distributions dual to Hölder functions. Unlike in the toral case, exponential mixing of actions by affine automorphisms does not follow from elementary Fourier analysis. Rather this was established by Gorodnik and the third author in [14]. We remark that this approach yields the first rigidity results for higher rank actions on general nilmanifolds. Earlier cocycle and local rigidity results, by A. Katok and the third author, were only proved for actions which were higher rank both on the toral factor as well as the fibers (e.g. [27]). There, cocycles were straightened out separately on the base and the fibers. Exponential mixing of these actions thus allows for a much simpler and direct approach, and is also used in [15] to prove cocycle rigidity results.
Epilogue: We conclude this paper with some remarks about the conjecture by Katok and Spatzier that genuinely higher rank abelian Anosov actions are smoothly conjugate to affine actions. Using the arguments of our earlier paper [11], we can show that the conjugacy is always smooth along almost every leaf of each coarse Lyapunov foliation. However, we have no further evidence in support of this conjecture and in fact have some doubts about its truth. In [13] Gogolev constructed a diffeomorphism of a torus which is Hölder conjugate to an Anosov diffeomorphism but itself is not Anosov. Thus having one Anosov element may not imply that most elements are Anosov. In [7], Farrell and Jones constructed Anosov diffeomorphisms on exotic tori. In light of this construction it seems natural to ask:
Question 1.4. Are there genuinely higher rank Anosov $\mathbb{Z}^k$ actions on exotic tori?
As exotic tori are finitely covered by standard tori, such actions would lift to actions on standard tori. The latter could not be smoothly equivalent to their linearizations since a smoothness result for conjugacy would descend to the C 0 conjugacy between the exotic and standard torus. Thus such examples would also give counterexamples to the conjecture by Katok and Spatzier even when the underlying smooth structure on the torus is standard.
We remark here that the construction in [7], further explained and simplified in [8], does not adapt easily to the case of actions of higher rank abelian groups. Indeed because of the delicate cutting and pasting arguments used in their constructions, it would be hard to guarantee that different elements continue to commute. As a consequence of Theorem 1.1, a positive answer to Question 1.4 can only occur for an action where relatively few elements are Anosov. Furthermore, by the results in [34], a positive answer to Question 1.4 seems unlikely if the dynamically defined foliations for the action have dimensions 1 or 2. The Farrell-Jones construction proceeds by cutting and pasting exotic spheres into the torus. This suggests, in order to construct examples for Question 1.4, one would want to glue in the exotic sphere in a manner somehow subordinate to the dynamical foliations using their high dimension.
We are indebted to J. Rauch for discussions concerning his result with M. Taylor on regularity for distributions. A strengthening of one of their theorems is fundamental to our approach and multiple discussions with Rauch were a key to our first believing and then proving this result. We also thank A. Gorodnik and J. Conlon for various discussions. Finally, we are more than grateful to J. Davis for discussions concerning non-standard smooth structures and for writing the appendix on exotic differentiable structures on nilmanifolds.

Preliminaries
Throughout the paper, the smoothness of diffeomorphisms, actions, and manifolds is assumed to be C ∞ , even though all definitions and some of the results can be formulated in lower regularity.
Let $a$ be a diffeomorphism of a compact manifold $M$. We recall that $a$ is Anosov if there exist a continuous $a$-invariant decomposition of the tangent bundle $TM = E^s_a \oplus E^u_a$ and constants $K > 0$, $\lambda > 0$ such that for all $n \in \mathbb{N}$
(1) $\|Da^n(v)\| \le K e^{-\lambda n} \|v\|$ for all $v \in E^s_a$, and $\|Da^{-n}(v)\| \le K e^{-\lambda n} \|v\|$ for all $v \in E^u_a$.
The distributions $E^s_a$ and $E^u_a$ are called the stable and unstable distributions of $a$. Now we consider a $\mathbb{Z}^k$ action $\alpha$ on a compact manifold $M$ via diffeomorphisms. The action is called Anosov if there is an element which acts as an Anosov diffeomorphism. For an element $a$ of the acting group we denote the corresponding diffeomorphism by $\alpha(a)$ or simply by $a$ if the action is fixed.
The distributions E s a and E u a are Hölder continuous and tangent to the stable and unstable foliations W s a and W u a respectively [18]. The leaves of these foliations are C ∞ injectively immersed Euclidean spaces. Locally, the immersions vary continuously in the C ∞ topology. In general, the distributions E s and E u are only Hölder continuous transversally to the corresponding foliations.
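The contraction estimate in the definition of an Anosov diffeomorphism can be checked numerically in the simplest linear case. The following is a minimal sketch, assuming as an illustrative example (not taken from the paper) the automorphism of $T^2$ induced by the matrix with rows (2, 1) and (1, 1). Because this matrix is symmetric, its eigendirections are orthogonal, so the estimate holds with $K = 1$ and $\lambda = \log((3 + \sqrt{5})/2)$. All function names are hypothetical.

```python
import math

# Assumed example: the hyperbolic matrix A = [[2, 1], [1, 1]] in SL(2, Z),
# which induces a linear Anosov diffeomorphism of the torus T^2.
A = ((2, 1), (1, 1))

# Eigenvalues of A are (3 +/- sqrt(5)) / 2; the smaller one is contracting.
MU_STABLE = (3 - math.sqrt(5)) / 2
RATE = -math.log(MU_STABLE)  # plays the role of lambda in the Anosov estimate


def apply(matrix, v):
    """Multiply a 2x2 matrix by a vector in R^2."""
    (a, b), (c, d) = matrix
    return (a * v[0] + b * v[1], c * v[0] + d * v[1])


def norm(v):
    """Euclidean norm on R^2."""
    return math.hypot(v[0], v[1])


def stable_contraction(n):
    """Return |A^n v| / |v| for a stable eigenvector v of A."""
    v = (1.0, MU_STABLE - 2.0)  # solves 2x + y = mu * x
    w = v
    for _ in range(n):
        w = apply(A, w)
    return norm(w) / norm(v)


if __name__ == "__main__":
    for n in range(1, 9):
        bound = math.exp(-RATE * n)  # K = 1 because A is symmetric
        print(n, stable_contraction(n), bound)
```

Only small values of $n$ are used, since floating point error in the unstable direction grows at the same exponential rate at which the stable component shrinks.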

Lyapunov exponents and coarse Lyapunov distributions.
First we recall some basic facts from the theory of non-uniform hyperbolicity for a single diffeomorphism, see for example [2]. Let $a$ be a diffeomorphism of a compact manifold $M$ preserving an ergodic probability measure $\mu$. By Oseledec' Multiplicative Ergodic Theorem, there exist finitely many numbers $\chi_i$ and an invariant measurable splitting of the tangent bundle $TM = \bigoplus E_i$ on a set of full measure such that the forward and backward Lyapunov exponents of $v \in E_i$ are $\chi_i$. This splitting is called the Lyapunov decomposition. We define the stable distribution of $a$ with respect to $\mu$ as $E^-_a = \bigoplus_{\chi_i < 0} E_i$. The subspace $E^-_a(x)$ is tangent $\mu$-a.e. to the stable manifold $W^-_a(x)$. More generally, given any $\theta < 0$ we can define the strong stable distribution by $E^\theta_a = \bigoplus_{\chi_i \le \theta} E_i$, which is tangent $\mu$-a.e. to the strong stable manifold $W^\theta_a(x)$. The manifold $W^\theta_a(x)$ is a smoothly immersed Euclidean space. For a sufficiently small ball $B(x)$, the connected component of $W^\theta_a(x) \cap B(x)$, called the local manifold, can be characterized by the exponential contraction property: for any sufficiently small $\varepsilon > 0$ there exists $C = C(x)$ such that
(2) $W^{\theta,\mathrm{loc}}_a(x) = \{ y \in B(x) \mid \mathrm{dist}(a^n x, a^n y) \le C e^{(\theta + \varepsilon) n} \ \forall n \in \mathbb{N} \}$.
The unstable distributions and manifolds are defined similarly. In general, $E^-_a$ is only measurable and depends on the measure $\mu$. However, if $a$ is an Anosov diffeomorphism then $E^-_a$ for any measure always agrees with the continuous stable distribution $E^s_a$. Indeed, $E^s_a$ cannot contain a vector with a nontrivial component in some $E_j$ with $\chi_j \ge 0$ since such a vector does not satisfy (1). Hence $E^s_a \subset E^-_a$ and, similarly, $E^u_a \subset E^+_a$. Since $TM = E^s_a \oplus E^u_a$ and $E^-_a \cap E^+_a = 0$, both inclusions have to be equalities.
Now we consider the case of $\mathbb{Z}^k$ actions. Let $\mu$ be an ergodic probability measure for a $\mathbb{Z}^k$ action $\alpha$ on a compact manifold $M$. By commutativity, the Lyapunov decompositions for individual elements of $\mathbb{Z}^k$ can be refined to a joint invariant splitting for the action. The following proposition from [22] describes the Multiplicative Ergodic Theorem for this case. See [20] for more details on the Multiplicative Ergodic Theorem and related notions for higher rank abelian actions.
Proposition 2.1. There are finitely many linear functionals $\chi$ on $\mathbb{Z}^k$, a set of full measure $\mathcal{P}$, and an $\alpha$-invariant measurable splitting of the tangent bundle $TM = \bigoplus E_\chi$ over $\mathcal{P}$ such that for all $a \in \mathbb{Z}^k$ and $v \in E_\chi$, the Lyapunov exponent of $v$ is $\chi(a)$, i.e.
$\lim_{n \to \pm\infty} n^{-1} \log \|D\alpha(na)(v)\| = \chi(a)$,
where $\|\cdot\|$ is a continuous norm on $TM$.

The splitting $\bigoplus E_\chi$ is called the Lyapunov decomposition, and the linear functionals $\chi$, extended to linear functionals on $\mathbb{R}^k$, are called the Lyapunov exponents of $\alpha$. The hyperplanes $\ker \chi \subset \mathbb{R}^k$ are called the Lyapunov hyperplanes or Weyl chamber walls, and the connected components of $\mathbb{R}^k - \cup_\chi \ker \chi$ are called the Weyl chambers of $\alpha$. The elements in the union of the Lyapunov hyperplanes are called singular, and the elements in the union of the Weyl chambers are called regular.
Consider a $\mathbb{Z}^k$ action by automorphisms of a torus $M = T^d = \mathbb{R}^d/\mathbb{Z}^d$ or, more generally, a nilmanifold $M = N/\Gamma$, where $N$ is a simply connected nilpotent Lie group and $\Gamma \subset N$ is a (cocompact) lattice. In this case, the Lyapunov decomposition is determined by the eigenspaces of the $d \times d$ matrix that defines the toral automorphism or by the eigenspaces of the induced automorphism on the Lie algebra of $N$. In particular, every Lyapunov distribution is smooth and in the toral case integrates to a linear foliation. The Lyapunov exponents are given by the logarithms of the moduli of the eigenvalues. Hence they are independent of the invariant measure and give uniform estimates of expansion and contraction rates.
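To make the algebraic picture concrete, here is a minimal sketch for a hypothetical $\mathbb{Z}^2$-action on $T^3$ chosen purely for illustration (it is not an example from the paper). We assume the companion matrix $A$ of $p(x) = x^3 + x^2 - 2x - 1$, which has determinant 1, together with $B = A + I$, which has determinant $-1$ and commutes with $A$. The roots of $p$ are $2\cos(2\pi j/7)$, $j = 1, 2, 3$, which the code checks numerically rather than assumes. The Lyapunov exponents of the element $(m, n)$ are then the logarithms of the moduli of the eigenvalues of $A^m B^n$, and the sign pattern of the exponents labels the Weyl chamber containing $(m, n)$.

```python
import math

# Assumed example: the companion matrix A of p(x) = x^3 + x^2 - 2x - 1 lies in
# SL(3, Z), and B = A + I lies in GL(3, Z). They commute, so (m, n) -> A^m B^n
# is a Z^2-action by automorphisms of T^3. The roots of p are 2*cos(2*pi*j/7).
ROOTS = [2 * math.cos(2 * math.pi * j / 7) for j in (1, 2, 3)]


def p(x):
    """Characteristic polynomial of A."""
    return x**3 + x**2 - 2 * x - 1


def lyapunov_exponents(m, n):
    """Values log |theta^m (theta + 1)^n|, one for each eigenvalue theta of A."""
    return [m * math.log(abs(t)) + n * math.log(abs(t + 1)) for t in ROOTS]


def chamber(m, n):
    """Sign pattern of the exponents; it labels the Weyl chamber of (m, n)."""
    return tuple(1 if value > 0 else -1 for value in lyapunov_exponents(m, n))


def chambers_seen(radius):
    """Collect sign patterns over nonzero integer points in a square grid."""
    points = [(m, n) for m in range(-radius, radius + 1)
              for n in range(-radius, radius + 1) if (m, n) != (0, 0)]
    return {chamber(m, n) for m, n in points}


if __name__ == "__main__":
    print([round(p(t), 12) for t in ROOTS])   # each value should be near 0
    print(lyapunov_exponents(1, 0))           # exponents of the generator A
    print(sorted(chambers_seen(5)))           # sign patterns that occur
```

Three distinct Lyapunov lines through the origin of $\mathbb{R}^2$ cut the plane into six chambers, and since the exponents of each element sum to zero, the two patterns with all signs equal cannot occur.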
In the non-algebraic case, the individual Lyapunov distributions are in general only measurable and depend on the given measure. This can be already seen for a single diffeomorphism, even if Anosov. However, as we observed above, the full stable distribution $E^s_a$ of an Anosov element $a$ always agrees with $\bigoplus_{\chi(a) < 0} E_\chi$ on a set of full measure for any measure.
For higher rank actions, coarse Lyapunov distributions play a similar role to the stable and unstable distributions for an Anosov diffeomorphism. For any Lyapunov functional $\chi$ the coarse Lyapunov distribution is the direct sum of all Lyapunov spaces with Lyapunov exponents, as functionals, positively proportional to $\chi$:
$E^\chi = \bigoplus_{\chi' = c\chi,\ c > 0} E_{\chi'}$.
For an algebraic action such a distribution is a finest nontrivial intersection of the stable distributions of certain Anosov elements of the action. For nonalgebraic actions, however, this is not a priori clear. It was shown in [24, Proposition 2.4] that, in the presence of sufficiently many Anosov elements, the coarse Lyapunov distributions are well-defined, continuous, and tangent to foliations with smooth leaves. We quote the discrete time version [23, Proposition 2.2]. We denote the set of all Anosov elements in $\mathbb{Z}^k$ by $\mathcal{A}$.
Proposition 2.2. Let $\alpha$ be an Anosov action of $\mathbb{Z}^k$ and let $\mu$ be an ergodic probability measure for $\alpha$ with full support. Suppose that there exists an Anosov element in every Weyl chamber defined by $\mu$. Then for each Lyapunov exponent $\chi$ the coarse Lyapunov distribution can be defined as
$E^\chi(p) = \bigcap_{\{a \in \mathcal{A} \mid \chi(a) < 0\}} E^s_a(p) = \bigoplus_{\chi' = c\chi,\ c > 0} E_{\chi'}(p)$
on the set $\mathcal{P}$ of full measure where the Lyapunov splitting exists. Moreover, $E^\chi$ is Hölder continuous, and thus it can be extended to a Hölder distribution tangent to the foliation $W^\chi = \bigcap_{\{a \in \mathcal{A} \mid \chi(a) < 0\}} W^s_a$ with uniformly $C^\infty$ leaves.
Note that ergodic measures with full support always exist if a Z k action contains a transitive Anosov element. A natural example is given by the measure µ of maximal entropy for such an element, which is unique [25,Corollary 20.1.4] and hence is invariant under the whole action. We emphasize that it is precisely here where we use the assumption that every Weyl chamber contains an Anosov element. We will use Proposition 2.2 in the next section to get uniform estimates for elements close to Weyl chamber walls.
Since a coarse Lyapunov distribution is defined by a collection of positively proportional Lyapunov exponents, it can be uniquely identified with the subset of $\mathbb{R}^k$ where these functionals are positive (resp. negative). This subset is called the positive (resp. negative) Lyapunov half-space. Similarly, a coarse Lyapunov distribution can be identified with the oriented Lyapunov hyperplane that separates the corresponding positive and negative Lyapunov half-spaces.
3. Z k actions on tori and nilmanifolds and uniform estimates.
From now on we consider Anosov Z k actions on tori and nilmanifolds. In this section, we explore the special features we obtain thanks to the Franks-Manning conjugacy. This allows us to control invariant measures, Lyapunov exponents, and even upper bounds of expansion for elements close to a Weyl chamber wall (cf. Section 3.2). Now we consider an Anosov Z k action α on a nilmanifold M . Fix an Anosov element a for α. Then we have φ which conjugates α(a) to an automorphism A. By [39, Corollary 1] any homeomorphism of M commuting with A is an affine automorphism. Hence we conclude that φ conjugates α to an action ρ by affine automorphisms. We will call ρ an algebraic action and refer to it as the linearization of α. Now we describe the preferred invariant measure for α (cf. [21, Remark 1]). We denote by λ the normalized Haar measure on the nilmanifold M . Note that λ is invariant under any affine automorphism of M and is the unique measure of maximal entropy for any affine Anosov automorphism.
In the next proposition we show that the Lyapunov exponents of $(\alpha, \mu)$ and $(\rho, \lambda)$ are positively proportional and that the corresponding coarse Lyapunov foliations are mapped into each other by the conjugacy $\phi$. From now on, instead of indexing the coarse Lyapunov foliations by a representative of the class of positively proportional Lyapunov functionals, we index them numerically, i.e. we write $W^i$ instead of $W^\chi$, implicitly identifying the finite collection of equivalence classes of Lyapunov exponents with a finite set of integers.
Proposition 3.2. Assume there is an Anosov element in every Weyl chamber. Then
(1) The Lyapunov exponents of $(\alpha, \mu)$ and $(\rho, \lambda)$ are positively proportional, and thus the Lyapunov hyperplanes and Weyl chambers are the same.
(2) For any coarse Lyapunov foliation $W^i_\alpha$ of $\alpha$, $\phi(W^i_\alpha) = W^i_\rho$, where $W^i_\rho$ is the corresponding coarse Lyapunov foliation for $\rho$.
Remark. In fact, one can show that (1) holds for Lyapunov exponents and coarse Lyapunov foliations of $(\alpha, \nu)$ for any $\alpha$-invariant measure $\nu$ so, in particular, the Lyapunov exponents of all $\alpha$-invariant measures are positively proportional and the coarse Lyapunov splittings are consistent with the continuous one defined in Proposition 2.2.
Remark. We do not claim at this point that the Lyapunov exponents of (α, µ) and (ρ, λ) (or of different invariant measures for α) are equal. Of course, if α is shown to be smoothly conjugate to ρ then this is true a posteriori.
Proof : The proposition is the discrete time analogue of [11, Proposition 2.5]. We include the proof for the sake of completeness. First we observe that the conjugacy $\phi$ maps the stable manifolds of $\alpha$ to those of $\rho$. More precisely, for any $a \in \mathbb{Z}^k$ and for $\mu$-a.e. $x \in M$ we have
(3) $\phi(W^-_{\alpha(a)}(x)) = W^-_{\rho(a)}(\phi(x))$.
Indeed, it suffices to establish this for local manifolds, which are characterized by the exponential contraction as in (2). Since $\phi$ is bi-Hölder, it preserves the property that $\mathrm{dist}(x_n, y_n)$ decays exponentially, which implies (3). In particular, for any Anosov $a \in \mathbb{Z}^k$ and any $x \in M$ we have $\phi(W^s_{\alpha(a)}(x)) = W^s_{\rho(a)}(\phi(x))$. Hence the formula for $W^i_\alpha$ given in Proposition 2.2 implies (2) once we establish (1).
To establish (1) it suffices to show that the oriented Lyapunov hyperplanes of $(\alpha, \mu)$ and $(\rho, \lambda)$ are the same. Suppose that an oriented Lyapunov hyperplane $L$ of one action, say $\alpha$, is not an oriented Lyapunov hyperplane of the other action $\rho$. Then we can take $\mathbb{Z}^k$ elements $a \in L^+$ and $b \in L^-$ which are not separated by any Lyapunov hyperplane of either action other than $L$. Then,
3.2. Uniform estimates for elements near Lyapunov hyperplane.
The uniform estimates proved in this section will play a crucial role in the proof of the main theorem. They give us upper bounds with small exponents for the expansion in certain directions for elements close to the Weyl chamber walls. This almost isometric behavior together with strong hyperbolic behavior in other directions and exponential mixing will force the convergence of suitable series as distributions.
We first address estimates for the first derivatives of these elements. We fix a positive Lyapunov half-space $L^+ \subset \mathbb{R}^k$ and the corresponding Lyapunov hyperplane $L$. We denote the corresponding coarse Lyapunov distributions for $\alpha$ and $\rho$ by $E$ and $\bar{E}$ respectively. Recall that $\gamma > 0$ denotes a Hölder exponent of $\phi$ and $\phi^{-1}$.
The Lyapunov exponents of $\rho$ corresponding to $\bar{E}$ are functionals positive on $L^+$. Let $\bar\chi_m$ and $\bar\chi_M$ be the smallest and the largest of these on $L^+$. We will show that $\chi_m = \gamma \bar\chi_m$ and $\chi_M = \gamma^{-1} \bar\chi_M$ satisfy the conclusion of the lemma.
First we will prove the second inequality, which is slightly easier. Suppose that $\chi_\nu(b) > \chi_M(b)$ for some Lyapunov exponent $\chi_\nu$ of $(\alpha(b), \nu)$ corresponding to the distribution $E$. Let $E'$ be the distribution spanned by the Lyapunov subspaces of $(\alpha(b), \nu)$ corresponding to Lyapunov exponents greater than $\chi_M(b) + \varepsilon$. Then, for some $\varepsilon > 0$, $E'$ has nonzero intersection with the distribution $E$. The strong unstable distribution $E'(x)$ is tangent for $\nu$-a.e. $x$ to the corresponding local strong unstable manifold $W'(x)$. Hence the intersection $F(x)$ of $W'(x)$ with the leaf $W(x)$ of the coarse Lyapunov foliation corresponding to $E$ is a submanifold of positive dimension. We take $y \in F(x)$ and denote $y_n = \alpha(-nb)(y)$ and $x_n = \alpha(-nb)(x)$. Then $x_n$ and $y_n$ converge exponentially with the rate at least $\chi_M(b) + \varepsilon$. Since the conjugacy $\phi$ is $\gamma$ bi-Hölder it is easy to see that $\mathrm{dist}(\phi(x_n), \phi(y_n))$ decreases at a rate faster than $\gamma \chi_M(b)$. But this is impossible since $\phi$ maps $W(x)$ to $\bar{W}(\phi(x))$, the leaf of the corresponding Lyapunov foliation of $\rho$, which is contracted by $\rho(-b)$ at a rate at most $\bar\chi_M(b) = \gamma \chi_M(b)$.
The first inequality can be established similarly. Suppose that $\chi_\nu(b) < \chi_m(b)$ for some Lyapunov exponent $\chi_\nu$ of $(\alpha(b), \nu)$ corresponding to the distribution $E$. Let $E'' \subset E$ be the Lyapunov distribution corresponding to this exponent. We cannot assert that $E''$ is tangent to an invariant foliation, so we consider a curve $l$ tangent to a vector $0 \ne v \in E''(x)$ for some $\nu$-typical $x$. Then the exponent of $v$ with respect to $\alpha(-b)$ is $-\chi_\nu(b)$. However, since $\phi(l) \subset \bar{W}(\phi(x))$, we can obtain as above that $l$ is contracted by $\alpha(-b)$ at the rate at least $\chi_m(b)$. It is easy to see that this is impossible.
Proposition 3.4. Let $E$ be a coarse Lyapunov distribution and $L^+ \subset \mathbb{R}^k$ be the corresponding Lyapunov half-space for $\alpha$. Then for any element $b \in L^+$ and any $\varepsilon > 0$ there exists $C = C(b, \varepsilon)$ such that for all $v \in E$ and $n \in \mathbb{N}$
(4) $C^{-1} e^{(\chi_m(b) - \varepsilon) n} \|v\| \le \|D\alpha(nb)\, v\| \le C e^{(\chi_M(b) + \varepsilon) n} \|v\|$,
where $\chi_m$ and $\chi_M$ are as in Lemma 3.3.
Proof : In the proof we will abbreviate $\alpha(b)$ to $b$. Consider functions $a_n(x) = \log \|Db^n|_E(x)\|$, $n \in \mathbb{N}$. Since the distribution $E$ is continuous, so are the functions $a_n$. The sequence $a_n$ is subadditive, i.e. $a_{n+k}(x) \le a_n(b^k(x)) + a_k(x)$. The Subadditive and Multiplicative Ergodic Theorems imply that for every $b$-invariant ergodic measure $\nu$ the limit $\lim_{n \to \infty} a_n(x)/n$ exists for $\nu$-a.e. $x$ and equals the largest Lyapunov exponent of $(b, \nu)$ on the distribution $E$. The latter is at most $\chi_M(b)$ by Lemma 3.3. Thus the exponential growth rate of $\|Db^n|_E(x)\|$ is at most $\chi_M(b)$ for all $b$-invariant ergodic measures. Since $\|Db^n|_E(x)\|$ is continuous, this implies the uniform exponential growth estimate, as in the second inequality in (4) (see [36, Theorem 1] or [34, Proposition 3.4]). The first inequality in (4) follows similarly by observing that the exponential growth rate of $\|Db^{-n}|_E(x)\|$ is at most $-\chi_m(b)$.
Lemma 3.5. Assume that there is an Anosov element in every Weyl chamber. Then for any $a \in \mathbb{Z}^k$, $\alpha(a)$ is Anosov if and only if its linearization $\rho(a)$ is Anosov.
Proof : It is classical that if $\alpha(a)$ is Anosov then so is its linearization $\rho(a)$. So assume that $\rho(a)$ is Anosov. Then $a$ does not belong to any Lyapunov hyperplane of $\rho$ and hence of $\alpha$. Then Proposition 3.4 applied to $a$ or $-a$ implies that any coarse Lyapunov distribution of $\alpha$ is either uniformly contracted or uniformly expanded by $\alpha(a)$. This implies that $\alpha(a)$ is Anosov since the coarse Lyapunov distributions span $TM$.
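The subadditivity used in the proof of Proposition 3.4 can be seen in a toy linear case. The sketch below is only an illustration under simplifying assumptions: it takes a single hyperbolic matrix with rows (3, 1) and (2, 1) (a hypothetical example, not from the paper), so that the derivative is constant and $a_n = \log \|B^n\|$ does not depend on the point $x$. It checks $a_{n+k} \le a_n + a_k$ and lets one compare $a_n/n$ with the largest Lyapunov exponent $\log(2 + \sqrt{3})$.

```python
import math

# Assumed example: a non-symmetric hyperbolic matrix in SL(2, Z).
B = ((3, 1), (2, 1))


def matmul(x, y):
    """Product of two 2x2 integer matrices."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def power(matrix, n):
    """n-th power of a 2x2 matrix for n >= 0."""
    result = ((1, 0), (0, 1))
    for _ in range(n):
        result = matmul(result, matrix)
    return result


def operator_norm(matrix):
    """Largest singular value of a 2x2 matrix M, from the eigenvalues of M^T M."""
    (a, b), (c, d) = matrix
    s = a * a + b * b + c * c + d * d      # trace of M^T M
    det = a * d - b * c                    # det(M^T M) equals det^2
    return math.sqrt((s + math.sqrt(s * s - 4 * det * det)) / 2)


def a_n(n):
    """The sequence a_n = log ||B^n|| (constant in x for a linear map)."""
    return math.log(operator_norm(power(B, n)))


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(n, a_n(n) / n, math.log(2 + math.sqrt(3)))
```

In the nonlinear setting of the proposition the functions $a_n$ genuinely depend on $x$, and passing from almost everywhere limits to a uniform bound is exactly where continuity of $E$ enters.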

Higher derivatives and estimates on compositions.
In this subsection, we recall a basic estimate on higher derivatives of compositions of diffeomorphisms. The main point is that the exponential growth rate is entirely controlled by the first derivative.
Let $\psi$ be a diffeomorphism of a compact manifold $M$. Given a function $f$ on $M$, in local coordinates we have a vector valued function $f_k$ consisting of $f$ and its partial derivatives up to order $k$. Using a finite collection of charts and a subordinate partition of unity, one can define the $C^k$ norm of $f$ as $\sup_x \|f_k(x)\|$. It is easy to check that different choices of charts and/or partition of unity give rise to equivalent $C^k$ norms. We will also write $\|f(x)\|_k = \|f_k(x)\|$ for the corresponding norm at $x$. More generally, let $\mathcal{F}$ be a foliation of $M$ by smooth manifolds. Given a function $f$ which is continuous and differentiable along $\mathcal{F}$ we can again locally define a vector valued function $f_{k,\mathcal{F}}(x)$ consisting of $f$ and its partial derivatives up to order $k$ along $\mathcal{F}$ and let $\|f(x)\|_{k,\mathcal{F}} = \|f_{k,\mathcal{F}}(x)\|$. Fixing a finite collection of foliation charts and a subordinate partition of unity, this allows us to define $C^k$ norms corresponding to only taking derivatives along $\mathcal{F}$, by $\|f\|_{k,\mathcal{F}} = \sup_{x \in M} \|f(x)\|_{k,\mathcal{F}}$. Once again it is easy to check that different choices of charts and/or partition of unity give rise to equivalent norms. In this setting, for a homeomorphism $\psi$ of $M$ that is smooth along $\mathcal{F}$ with all derivatives continuous transversely,
Lemma 3.6. Let $\psi$ be a diffeomorphism of a manifold $M$ preserving a foliation $\mathcal{F}$ by smooth leaves. Let $N_k = \|\psi\|_{k,\mathcal{F}}$. Then there exists a polynomial $P$ depending only on $k$ and the dimension of the leaves of $\mathcal{F}$ such that for every $m \in \mathbb{N}$
This type of estimate is used frequently in the dynamics literature, particularly in KAM theory, and is usually referred to as an estimate on compositions. This lemma is essentially [9, Lemma 6.4] and a proof is contained in Appendix B of that paper. There are many other proofs of equation (5) in the literature, though mostly only in the case where the foliation $\mathcal{F}$ is trivial, i.e. when the only leaf of $\mathcal{F}$ is the manifold $M$. Most proofs should adapt easily to the foliated setting.

Exponential mixing for Z k -actions on tori
Consider a diffeomorphism $a$ on a manifold preserving a probability measure $\mu$. Given two Hölder functions $f, g$, we consider the matrix coefficients $\langle a^k f, g \rangle$, where the bracket refers to the standard inner product on $L^2(\mu)$. For an Anosov diffeomorphism $a$, the matrix coefficients of Hölder functions decay exponentially fast in $k$ for either an invariant volume or the measure of maximal entropy, as follows easily from symbolic dynamics. D. Lind established exponential decay for Hölder functions for ergodic toral automorphisms in [30]. This is considerably harder, as there is no suitable symbolic dynamics. Instead he shows that dual orbits of Fourier coefficients diverge fast as one has good lower bounds on the distances from integer points to neutral subspaces along stable and unstable subspaces. This precisely is Katznelson's lemma on rational approximation of invariant subspaces. We adapt Lind's argument to prove exponential decay of matrix coefficients of Hölder functions for $\mathbb{Z}^k$ actions by ergodic automorphisms with a bound depending on the norm of the element in $\mathbb{Z}^k$. Even if the $\mathbb{Z}^k$-action contains only Anosov elements this is not trivial since we seek a bound in terms of the norm of $a \in \mathbb{Z}^k$. In addition, some elements in $\mathbb{Z}^k$ will be arbitrarily close to the Lyapunov hyperplanes and thus have little, if any, expansion in certain directions. Thus one essentially has to deal with the partially hyperbolic case. We remark that Damjanović and Katok obtained estimates of exponential divergence of Fourier coefficients for the dual action induced by a $\mathbb{Z}^k$-action by ergodic toral automorphisms [4].
Finally, Gorodnik and the third author generalized exponential decay of matrix coefficients to ergodic automorphisms and Z k actions of such on nilmanifolds [14]. We will report on this development in more detail in Section 7 when we prove the nilmanifold version of our main result. The arguments required for the nilmanifold case are substantially more complicated, and rely on work by Green and Tao on equidistribution of polynomial sequences [16]. For this reason, and to keep our exposition for the case of toral automorphisms self contained and elementary, we present our adaptation of Lind's arguments.
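The role of dual orbits can be illustrated in the simplest hyperbolic case. The sketch below is an illustration only, assuming the matrix with rows (2, 1) and (1, 1) on $T^2$ and trigonometric polynomials in place of general Hölder functions; all names are hypothetical. It uses the identity $e_m \circ A^k = e_{(A^T)^k m}$ for the characters $e_m(x) = e^{2\pi i\, m \cdot x}$, so that $\langle f \circ A^k, g \rangle = \sum_m \hat{f}(m)\, \overline{\hat{g}((A^T)^k m)}$. Once the dual orbit of every frequency of $f$ has left the finite support of $\hat{g}$, the matrix coefficient vanishes.

```python
# Assumed example: the symmetric hyperbolic matrix A = [[2, 1], [1, 1]].
A = ((2, 1), (1, 1))


def dual_step(matrix, m):
    """Apply the transpose of a 2x2 integer matrix to a frequency m in Z^2."""
    (a, b), (c, d) = matrix
    return (a * m[0] + c * m[1], b * m[0] + d * m[1])


def dual_orbit(matrix, m, steps):
    """Frequencies m, A^T m, (A^T)^2 m, ... up to the given number of steps."""
    orbit = [m]
    for _ in range(steps):
        orbit.append(dual_step(matrix, orbit[-1]))
    return orbit


def matrix_coefficient(matrix, f_hat, g_hat, k):
    """Inner product of f composed with A^k against g, in L^2 of Haar measure.

    f_hat and g_hat map frequencies in Z^2 to Fourier coefficients of
    trigonometric polynomials f and g.
    """
    total = 0
    for m, coefficient in f_hat.items():
        target = dual_orbit(matrix, m, k)[-1]
        total += coefficient * complex(g_hat.get(target, 0)).conjugate()
    return total


if __name__ == "__main__":
    f_hat = {(1, 0): 1, (-1, 0): 1}      # f(x, y) = 2 cos(2 pi x)
    g_hat = {(2, 1): 1, (-2, -1): 1}     # g(x, y) = 2 cos(2 pi (2x + y))
    for k in range(5):
        print(k, dual_orbit(A, (1, 0), k)[-1],
              matrix_coefficient(A, f_hat, g_hat, k))
```

For Hölder functions the Fourier coefficients do not have finite support, so one needs quantitative growth of the dual orbit together with decay of the coefficients, which is where Katznelson's lemma enters.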
Let τ be a Z k -action by ergodic automorphisms of T n . We begin by recalling Katznelson's Lemma. For a proof see [4,Lemma 4.1].
Here $\|z\|$ denotes the Euclidean norm and $d$ the Euclidean distance.
Consider the finest decomposition $\mathbb{R}^n = \oplus_i E_i$ into $\tau(\mathbb{Z}^k)$-invariant subspaces $E_i$. All $E_i$ are subspaces of generalized eigenspaces of the elements of $\tau(\mathbb{Z}^k)$. Let $\lambda_i$ denote the Lyapunov exponent defined by the vectors in $E_i$. Then $e^{\lambda_i(a)}$ is the absolute value of the eigenvalue of $\tau(a)$ on $E_i$. It is well-known that the $\lambda_i(a)$ are the Lyapunov exponents of $\tau(a)$. Pick an inner product with respect to which the $E_j$ are mutually orthogonal. Let $|||v|||$ denote its norm. Since all norms on $\mathbb{R}^n$ are equivalent, we can pick $D > 0$ such that

Hence $\operatorname{tr} \tau(a)^l = \operatorname{tr} b^l$ is arbitrarily close to $n$ for suitable $l$. Since $\tau(a)^l \in SL(n, \mathbb{Z})$, $\operatorname{tr} \tau(a)^l$ is an integer, and thus $\operatorname{tr} \tau(a)^l = n$. On the other hand, however, $\operatorname{tr} \tau(a)^l < n$ since the eigenvalues of $b^l$ cannot be real. This is the final contradiction.
We will need a slightly stronger variant of this lemma. For $a \in \mathbb{Z}^k$, set $S(a) = \max_i \lambda_i(\tau(a))$. Then $S(a) \neq 0$ for $\tau(a)$ ergodic. Explicit lower bounds can be found in the literature, e.g. in [3]. We give an easy soft argument for a positive lower bound.
Proof: First suppose that all elements in $\mathbb{Z}^k$ are semisimple. If $\tau(a)$ is semisimple, then $\tau(a)$ expands each $E_j$ precisely by $e^{\lambda_j(a)}$ with respect to $|||\cdot|||$. Suppose $S(a_l) \to 0$ for a sequence of mutually distinct $1 \neq a_l \in \mathbb{Z}^k$. Then there are infinitely many $\tau(a_l)$ which expand distances w.r.t. $|||\cdot|||$ by at most $\frac{2}{D}$. Hence distances w.r.t. $\|\cdot\|$ get expanded by at most 2. Pick any integer vector $z \in \mathbb{Z}^n$. As the images $a_l(z)$ are integer vectors of norm at most $2\|z\|$, for some $a_l \neq a_j$, $a_l(z) = a_j(z)$. Hence $a_j^{-1} a_l$ cannot be ergodic.

Next consider the general case. Consider a generating set $a_1, \ldots, a_k$ of $\mathbb{Z}^k$. Suppose $a_1 \in \mathbb{Z}^k$ has a Jordan decomposition $\tau(a_1) = b_1 c_1$ with $b_1$ semisimple and $c_1$ unipotent. Since $\tau(a_1) \in SL(n, \mathbb{Z})$, both $b_1$ and $c_1$ are in $SL(n, \mathbb{Q})$. Since $c_1$ is unipotent, the subspace $W_1 = \{v \mid c_1 v = v\}$ of eigenvectors with eigenvalue 1 is nontrivial and is defined over $\mathbb{Q}$. Also, $W_1$ is $\tau(\mathbb{Z}^k)$-invariant, and $\tau(\mathbb{Z}^k)$ acts faithfully on $W_1$ since otherwise some element $\tau(a)$ for $a \in \mathbb{Z}^k$ has eigenvalue 1 and is not ergodic. Also $\tau(a_1)|_{W_1}$ is semisimple. Inductively, we define a descending sequence of rational $\tau(\mathbb{Z}^k)$-invariant subspaces $W_1 \supset W_2 \supset \ldots \supset W_k$ on which $\mathbb{Z}^k$ acts faithfully. In addition, $\tau(a_i)|_{W_i}$ is semisimple. Hence $\mathbb{Z}^k$ acts faithfully on $W_k$ and every element acts semisimply. By the special case above, $\inf\{S(a|_{W_k}) \mid 1 \neq a \in \mathbb{Z}^k\} > 0$. Since $\inf\{S(a) \mid 0 \neq a \in \mathbb{Z}^k\} \geq \inf\{S(a|_{W_k}) \mid 0 \neq a \in \mathbb{Z}^k\}$, the claim follows.
Note that the $\lambda_i$ and hence $S$ extend to continuous functions on $\mathbb{R}^k$.

Lemma 4.4. Suppose for all $0 \neq a \in \mathbb{Z}^k$, $\tau(a)$ acts ergodically. Then for all $0 \neq a \in \mathbb{R}^k$, $S(a) > 0$. Thus $0 < \sigma := \frac{1}{2} \inf\{S(a) \mid a \in \mathbb{R}^k, \|a\| = 1\}$.

Proof: Suppose $S(a) = 0$ for some $0 \neq a \in \mathbb{R}^k$. Since the line $ta$, $t \in \mathbb{R}$, comes arbitrarily close to integer points in $\mathbb{Z}^k$, we can find $t_l \in \mathbb{R}$ and $a_l \in \mathbb{Z}^k$ with $a_l - t_l a \to 0$ as $l \to \infty$. As $S(t_l a) = 0$ for all $l$, it follows readily that $S(a_l) \to 0$, in contradiction to the last lemma. The last claim follows as $S$ is continuous.
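The quantities $S$ and $\sigma$ can be illustrated numerically. The following is a minimal sketch under loudly stated assumptions: the three linear functionals on $\mathbb{R}^2$ are hypothetical, made-up coefficients (chosen to sum to zero and to span the dual space), not the Lyapunov functionals of any specific action, and the names `LYAPUNOV`, `S` and `sigma` are illustrative. In this toy model positivity of $S$ comes from the zero-sum and spanning conditions rather than from ergodicity.

```python
import math

# Hypothetical Lyapunov functionals on R^2, given by their coefficient vectors.
# They are made up for illustration and sum to zero; they are NOT taken from
# an actual action.
LYAPUNOV = [(1.0, 0.3), (-0.4, 0.9), (-0.6, -1.2)]


def S(a):
    """Largest exponent S(a) = max_i lambda_i(a) for a point a of R^2."""
    return max(c0 * a[0] + c1 * a[1] for c0, c1 in LYAPUNOV)


def sigma(samples=3600):
    """Approximate sigma = (1/2) * inf of S over the unit circle by sampling."""
    values = []
    for j in range(samples):
        angle = 2.0 * math.pi * j / samples
        values.append(S((math.cos(angle), math.sin(angle))))
    return 0.5 * min(values)
```

Since $S$ is positively homogeneous, a positive value of `sigma()` gives $S(a) \geq 2\sigma \|a\|$ on the sampled directions, which is the form in which the constant is used later.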
Let $B(d)$ denote the ball of radius $d$ in $\mathbb{Z}^k$.

Suppose that there is a sequence $l_m \to \infty$ and $a_{l_m} \in \mathbb{Z}^k$ with $\alpha_{l_m} := \|a_{l_m}\| \geq l_m$ such that $\tau(a_{l_m})(H_{l_m}) \cap H_{l_m} \neq \{0\}$. Passing to a subsequence we may assume that $\frac{a_{l_m}}{\alpha_{l_m}}$ converges to some $a \in \mathbb{R}^k$. Since $S(a) \geq 2\sigma$, $\lambda_i(a) \geq \sigma$ for some $i$. Hence we get for all large $m$ that $\lambda_i(a_{l_m}) \geq l_m \sigma$.
Let $E = \oplus_{j \neq i} E_j$. By Katznelson's Lemma applied to $E$, there is a constant $C > 0$ such that for $0 \neq z \in \mathbb{Z}^n$, the distance $d(z, E) > C \|z\|^{-n}$. Suppose $z_{l_m} \in H_{l_m}$ with $\tau(a_{l_m}) z_{l_m} \in H_{l_m}$. Then we get $\|z_{l_m}\| < b r^{l_m}$ and $\|\tau(a_{l_m}) z_{l_m}\| < b r^{l_m}$.
Denote by $\pi_i$ the projection to $E_i$ along $E$. Then $\|\pi_i(z_{l_m})\| = d(z_{l_m}, E) \geq C \|z_{l_m}\|^{-n} > C b^{-n} r^{-n l_m}$.
As $E$ and $E_i$ are transversal and have constant angle, there is a constant $M$ such that for all $v \in \mathbb{R}^n$, $\|\pi_i(v)\| \leq M \|v\|$. Hence $\|\tau(a_{l_m})(\pi_i(z_{l_m}))\| = \|\pi_i(\tau(a_{l_m}) z_{l_m})\| < M b r^{l_m}$. On the other hand, we will show below that

Indeed, this estimate is clear when $\tau(a)$ is semisimple but needs more care when $\tau(a)$ has nontrivial Jordan form. This estimate will yield a contradiction with the Lyapunov exponent $\lambda_i(a)$ of $a$ being at least $\sigma$. Here is the detail.
Set $v_{l_m} := \frac{\pi_i(z_{l_m})}{\|\pi_i(z_{l_m})\|}$. By the estimates above we get

For all large $m$, we may assume that $b_{l_m}$ expands vectors by a factor of at most $r$. Since $l_m \leq \alpha_{l_m}$ this implies

Find a basis $w_1, \ldots, w_s$ of $E_i$ which brings $a$ to Jordan form. Write $v_{l_m} = x^1_{l_m} w_1 + \ldots + x^s_{l_m} w_s$. Passing to a subsequence the $v_{l_m}$ converge. Suppose $j$ is the last coordinate such that $x^j_{l_m} \to x^j \neq 0$. Then $\tau(\alpha_{l_m} a)(v_{l_m})$ has $j$-coordinate of absolute value $x^j e^{\alpha_{l_m} \lambda_i(a)}$. Since the sup norm determined by the basis $w_1, \ldots, w_s$ is equivalent to the standard Euclidean norm, there is a constant $M'$ such that $\|\tau(\alpha_{l_m} a)(v_{l_m})\| > M' x^j e^{\alpha_{l_m} \lambda_i(a)}$. Hence

This is impossible for large $l_m$ by choice of $r$ and $\sigma$.
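The mechanism behind this argument, that a hyperbolic dual action pushes a bounded set of nonzero integer frequencies off itself, can be seen in a small experiment. The following is a minimal sketch and not the construction used in the proof: it assumes the single matrix $\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$ (symmetric, hence equal to its own dual action) in place of a $\mathbb{Z}^k$-action, and a hypothetical sup-norm box of frequencies in place of the sets $H_l$. All helper names are illustrative.

```python
A = ((2, 1), (1, 1))  # assumed example matrix; symmetric, so A equals its transpose


def apply(M, z):
    """Apply the 2x2 integer matrix M to the integer vector z."""
    return (M[0][0] * z[0] + M[0][1] * z[1], M[1][0] * z[0] + M[1][1] * z[1])


def box(R):
    """Nonzero integer frequencies with sup-norm at most R."""
    return [(p, q) for p in range(-R, R + 1) for q in range(-R, R + 1)
            if (p, q) != (0, 0)]


def meets_box(k, R):
    """True if A^k maps some nonzero frequency of the box back into the box."""
    for z in box(R):
        w = z
        for _ in range(k):
            w = apply(A, w)
        if max(abs(w[0]), abs(w[1])) <= R:
            return True
    return False


def first_disjoint_power(R, max_power=200):
    """Smallest k >= 1 such that A^k moves every nonzero frequency out of the box."""
    for k in range(1, max_power + 1):
        if not meets_box(k, R):
            return k
    raise ValueError("no such power found up to max_power")
```

For the box of radius 1 the first such power is 2, since $A(0,1) = (1,1)$ still lies in the box while $A^2$ sends every nonzero frequency to sup-norm at least 2. Once the frequency supports of two mean-zero trigonometric polynomials are separated in this way, their matrix coefficient vanishes.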
We will use the approximation by Fejér kernel functions $K_l(t) = \sum_{j=-l}^{l} \left(1 - \frac{|j|}{l+1}\right) e^{2\pi i j t}$, and refer to [28, Chapter I] for details.
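As a sanity check on the normalization, the Fejér kernel can be evaluated both from its defining sum and from the classical closed form $\frac{1}{l+1}\left(\frac{\sin((l+1)\pi t)}{\sin(\pi t)}\right)^2$. The sketch below is illustrative only; the function names `fejer_sum` and `fejer_closed` are not from the paper.

```python
import cmath
import math


def fejer_sum(l, t):
    """Fejer kernel K_l(t) computed from the trigonometric sum."""
    total = 0j
    for freq in range(-l, l + 1):
        weight = 1.0 - abs(freq) / (l + 1)
        total += weight * cmath.exp(2j * math.pi * freq * t)
    return total.real  # the imaginary parts cancel by symmetry in freq


def fejer_closed(l, t):
    """Closed form of K_l(t); equals l + 1 at integer t."""
    s = math.sin(math.pi * t)
    if abs(s) < 1e-12:
        return float(l + 1)
    return (math.sin((l + 1) * math.pi * t) / s) ** 2 / (l + 1)
```

The closed form makes the two properties used for approximation visible: the kernel is nonnegative, and its mean over a period is the constant Fourier coefficient, which is 1.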
Set $F_l(t_1, \ldots, t_n) = K_l(t_1) \cdots K_l(t_n)$. For continuous $f$:

As in [28, p. 21, Exercise 1], we get

Theorem 4.7. Suppose $\mathbb{Z}^k$ acts affinely on $\mathbb{T}^n$ such that for all $0 \neq a \in \mathbb{Z}^k$, $\tau(a)$ acts ergodically. Let $f$ and $g$ be two Hölder functions on $\mathbb{T}^n$ with Hölder exponent $\theta$. Then there exists $r > 1$ such that for any $a_l \in \mathbb{Z}^k$ with $\|a_l\| \geq l$ we can bound the matrix coefficients

In particular, the matrix coefficients decay exponentially fast.
Proof: We can assume that $\int_{\mathbb{T}^n} f = \int_{\mathbb{T}^n} g = 0$ by subtracting the constants $\int_{\mathbb{T}^n} f$ and $\int_{\mathbb{T}^n} g$ from $f$ and $g$ respectively.
We pick $1 < r < e^{\frac{\sigma}{n+2}}$ as in Lemma 4.5, where $\sigma$ is as in Lemma 4.4. Let $m = [r^l]$, the largest integer smaller than $r^l$. Set $f_l = K_m \star f$ and $g_l = K_m \star g$, with frequencies in $H_l$. Then $\int_{\mathbb{T}^n} f_l = \int_{\mathbb{T}^n} g_l = 0$, and $\|f - f_l\|_\infty \leq 2C(\theta) \|f\|_\theta (r^l)^{-\theta}$ and $\|g - g_l\|_\infty < 2C(\theta) \|g\|_\theta (r^l)^{-\theta}$, where the 2 accounts for the discrepancy coming from $m$ versus $r^l$. By the last lemma, we get $\langle a_l(f), g\rangle = \langle a_l f, (g - g_l)\rangle + \langle a_l(f - f_l), g_l\rangle + \langle a_l(f_l), g_l\rangle$.
The last term is eventually 0 since the constant term is 0 and $a_l$ moves $H_l$ off itself. The first term is bounded by

Take $l$ large enough so that $\|g - g_l\|_\infty < 2C(\theta) \|g\|_\theta (r^l)^{-\theta} < 2$. Then the second term is bounded by

This yields the desired estimate

Corollary 4.8. The same statement as above holds for any Anosov $\mathbb{Z}^k$-action with $k > 1$ where every element acts ergodically.
Proof : This combines Theorem 4.7, the existence of a Hölder conjugacy, and the fact that we define matrix coefficients with respect to the pushforward measure which is the unique smooth invariant measure by Proposition 3.1.

Regularity and the Proof of Theorem 1.1
In this section we complete the proof of Theorem 1.1 by showing that the Franks-Manning conjugacy φ between the Z k -actions α and ρ is smooth. We will use φ and the uniform exponential estimates along the coarse Lyapunov foliations of α from Section 3.2, but we will not use Anosov elements explicitly in this section. Instead, we will use the subgroup Z 2 consisting of ergodic elements that we postulated in Theorem 1.1. Theorem 4.7 gives exponential mixing with uniform estimates along this Z 2 . This allows us to define distributions on Hölder functions which correspond to the components of the conjugacy and their derivatives. First however, we will make some reductions to the general case.
By passing to a finite index subgroup of $\mathbb{Z}^k$ we can assume that the action $\alpha$ has a common fixed point. First we reduce the problem to the case when $\alpha$ acts on the torus with the standard differentiable structure. Note that a construction due to Farrell and Jones shows that there exist Anosov diffeomorphisms of exotic tori [7]. However, every exotic torus of dimension at least 5 has a finite cover which is diffeomorphic to the standard torus [38, Chapter 15 A, last unitalicized paragraph]. In this case we can consider the lifts of the actions and the conjugacy. Clearly, the smoothness of $\phi$ follows from the smoothness of its lift. We will give an independent argument in Section 6 for the case of 4-dimensional tori. Hence, without loss of generality, we can assume that $\alpha$ acts on the same standard torus as $\rho$. In dimensions 2 and 3, by Remark A.4 in the Appendix, there are no exotic differentiable structures, though this fact is not strictly needed here. In dimension 3, Theorem 1.1 follows from the main result of [34]. As explained in Section 6, there are no higher rank Anosov actions on tori in dimension 2.
By changing coordinates we can also assume that 0 is a common fixed point for both $\alpha$ and $\rho$. Then there exists a unique conjugacy $\phi$ in the homotopy class of identity satisfying $\phi(0) = 0$. We can lift $\phi$ to the map $\tilde\phi : \mathbb{R}^n \to \mathbb{R}^n$ satisfying $\tilde\phi(0) = 0$ and write it as $\tilde\phi(x) = x + h(x)$. Consider an element $a$ in $\mathbb{Z}^2$ and abbreviate $\alpha(a)$ to $a$ and $\rho(a)$ to $\bar{A}$. We denote their lifts to $\mathbb{R}^n$ that fix 0 by $\tilde{a}$ and $A$ respectively and note that $A$ is linear. Since $\phi$ is a conjugacy and the lifts fix 0, they satisfy $\tilde\phi \circ \tilde{a} = A \circ \tilde\phi$. Hence we obtain $\tilde{a}(x) + h(\tilde{a}(x)) = A(x) + A(h(x))$, which is equivalent to $h(x) = Q(x) + A^{-1}(h(\tilde{a}(x)))$, where $Q(x) = A^{-1}(\tilde{a}(x) - A(x))$. Note that $Q(x)$ is smooth since $a$ is smooth with respect to the standard differentiable structure (this will be crucial later). Since $h$ is $\mathbb{Z}^n$-periodic it is easy to see that $A^{-1}(h(\tilde{a}(x)))$ and hence $Q(x)$ are also $\mathbb{Z}^n$-periodic. For the remainder of this section we will view $h$ and $Q$ as functions from $\mathbb{T}^n$ to $\mathbb{R}^n$. The functional equation on $\mathbb{T}^n$ becomes (6) $h(x) = Q(x) + A^{-1}(h(ax))$.
Fix a coarse Lyapunov foliation $\mathcal{V}$ of $\alpha$ and the corresponding linear coarse Lyapunov foliation $\bar{\mathcal{V}}$ of $\rho$. Let $V$ be the subspace of $\mathbb{R}^n$ parallel to $\bar{\mathcal{V}}$ and $W$ be the complementary $A$-invariant subspace, which is parallel to the sum of all coarse Lyapunov foliations of $\rho$ different from $\bar{\mathcal{V}}$. Denote by $h_V : \mathbb{R}^n \to V$ the projection of $h$ to $V$ along $W$. Since $V$ is $A$-invariant, projecting equation (6) and letting $A_V$ denote the restriction of $A$ to $V$ we obtain (7) $h_V(x) = Q_V(x) + A_V^{-1}(h_V(ax)) =: F_V(h_V)(x)$, where $Q_V$ denotes the projection of $Q$ to $V$ along $W$.

We will use the functional equation (7) with well-chosen elements $a$ to study the derivatives of $h_V$ along the coarse Lyapunov foliations of $\alpha$. These derivatives exist, a priori, only in the sense of distributions on smooth functions. The crucial element of the proof is Lemma 5.1 below, which shows that these distributional derivatives extend to functionals on the spaces of Hölder functions. We emphasize that this lemma is quite general, and may be useful in other situations. The main ingredients are the uniform exponential estimates with arbitrarily small exponents along coarse Lyapunov foliations, and exponential mixing for Hölder functions. The key idea is that in our estimates for derivatives, the exponential decay coming from exponential mixing overcomes small exponential growth coming from derivatives.

Lemma 5.1. For any coarse Lyapunov foliation $\mathcal{V}'$ of $\alpha$, possibly equal to $\mathcal{V}$, and for any $\theta > 0$ the derivatives of $h_V$ of any order along $\mathcal{V}'$ exist as distributions on the space of $\theta$-Hölder functions.
Proof : Let L, L + , L − ⊂ R k be the Lyapunov hyperplane and the positive and negative Lyapunov half-spaces corresponding to V. Let L ′ be the Lyapunov hyperplane corresponding to V ′ . In this proof we will choose a in the Z 2 subgroup consisting of ergodic elements. We note that V and V ′ are coarse Lyapunov foliations for α-action of the full Z k and that we make no assumptions on the relative positions of Z 2 , L, and L ′ in R k . We will choose a in a narrow cone in Z 2 around L ′ ∩ Z 2 , so that a will expand V ′ at most slowly. In case Z 2 ⊂ L ′ , this automatically holds for all a in Z 2 . Since any such cone can not be contained entirely in L − , we can always choose such an a ∈ Z 2 in L + or L.
If $a \in L^+$ then $A_V^{-1}$ is a contraction. Then the operator $F_V$ in (7) is a contraction on the space $C^0(\mathbb{T}^n, V)$. Hence it has a unique fixed point $\lim F_V^m(0)$, which therefore has to coincide with $h_V$. Thus we obtain (8) $h_V(x) = \sum_{m=0}^{\infty} A_V^{-m} Q_V(a^m x)$. If $a \in L$ the series in (8) does not converge in the space of continuous functions. However, it converges in the space $D_0$ of distributions on smooth functions with zero average, and the equality in (8) holds in $D_0$. To see this we iterate (7) to get (9) $h_V(x) = \sum_{m=0}^{N-1} A_V^{-m} Q_V(a^m x) + A_V^{-N} h_V(a^N x)$. Since $\|A_V^{-m}\|$ grows at most polynomially in $m$ for $a \in L$, and since $h_V$ is Hölder, Corollary 4.8 implies that the pairing $\langle A_V^{-N} h_V(a^N x), f\rangle \to 0$ for any Hölder function $f$ with $\int_{\mathbb{T}^n} f = 0$. This establishes convergence and equality in (8) when both sides are considered as elements in $D_0$.
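The contraction argument can be tried out on a toy example. The sketch below is not the paper's setting: it assumes a single hypothetical perturbation $a(x, y) = (2x + y + \varepsilon \sin(2\pi x),\ x + y)$ of the linear map $A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$ on $\mathbb{T}^2$ with $\varepsilon = 0.05$, for which $Q(x, y) = A^{-1}(a - A)(x, y) = \varepsilon \sin(2\pi x)\,(1, -1)$. It takes $V$ to be the unstable eigenline of $A$, so that $A_V^{-1}$ is multiplication by $1/\lambda$ with $\lambda = (3 + \sqrt{5})/2$, and sums the geometric series obtained by iterating $h_V = Q_V + A_V^{-1} h_V \circ a$. All names are illustrative.

```python
import math

LAM = (3.0 + math.sqrt(5.0)) / 2.0       # unstable eigenvalue of A = [[2, 1], [1, 1]]
_LEN = math.hypot(LAM - 1.0, 1.0)
VU = ((LAM - 1.0) / _LEN, 1.0 / _LEN)    # unit unstable eigenvector of A
EPS = 0.05                               # size of the assumed perturbation


def a_map(p):
    """Hypothetical perturbed map on the torus, in coordinates mod 1."""
    x, y = p
    return ((2.0 * x + y + EPS * math.sin(2.0 * math.pi * x)) % 1.0,
            (x + y) % 1.0)


def q_v(p):
    """Coordinate of Q(p) = EPS*sin(2*pi*x)*(1, -1) along the unstable eigenvector.

    A is symmetric, so projecting along the stable line is orthogonal projection.
    """
    s = EPS * math.sin(2.0 * math.pi * p[0])
    return s * VU[0] - s * VU[1]


def h_v(p, terms=60):
    """Partial sum of sum_m LAM**(-m) * q_v(a^m p)."""
    total, scale = 0.0, 1.0
    for _ in range(terms):
        total += scale * q_v(p)
        p = a_map(p)
        scale /= LAM
    return total
```

Truncating after $N$ terms leaves a residual of size at most $\lambda^{-N} \sup |Q_V|$ in the functional equation, so the partial sums satisfy it to machine precision already for moderate $N$.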
We will use notations of Section 3.3 for derivatives. Given a smooth function $g : \mathbb{T}^n \to \mathbb{R}^l$, we write $g^{k,\mathcal{V}'}$ for the vector consisting of the derivatives of $g$ up to order $k$ along the foliation $\mathcal{V}'$. If $g$ is a vector valued function on $\mathbb{T}^n$ and $f$ is a scalar valued function, we write $gf$ for the vector function obtained by component-wise multiplication of $g$ by $f$. We then write $\langle g, f\rangle$ for the vector obtained by integrating $gf$ over $\mathbb{T}^n$. We will use the same notation $h_V^{k,\mathcal{V}'}$ for the vector of distributional derivatives of $h_V$ along $\mathcal{V}'$ (see Section 8 for a detailed description of distributional derivatives in the context of foliations).

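Differentiating (8) term-wise we obtain the formula for $h_V^{k,\mathcal{V}'}$: (10) $\langle h_V^{k,\mathcal{V}'}, f\rangle = \sum_{m=0}^{\infty} A_V^{-m} \langle (Q_V \circ a^m)^{k,\mathcal{V}'}, f\rangle$.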
Note that the derivative of a distribution is defined by its values on derivatives of test functions (16), and those have zero average. Thus convergence and equality in (10) hold in the space $D$ of distributions on smooth functions, even if equality in (8) holds only in $D_0$. Since $Q_V$ is smooth, the pairings in the series in (10) are simply given by integration.
To show that $h_V^{k,\mathcal{V}'}$ extends to a functional on the space of Hölder functions we will now estimate these pairings in terms of the Hölder norm of $f$.
We will use smooth approximations of $f$ by convolutions $f_\varepsilon = f \star \phi_\varepsilon$, where the kernel is given by rescaling $\phi_\varepsilon(x) = \varepsilon^{-n} \phi(\frac{x}{\varepsilon})$ of a fixed bump function $\phi$ and thus is supported on the ball of radius $\varepsilon$ and satisfies

Then it is easy to check the following estimates, where $\|\cdot\|_k$ denotes the $C^k$ norm for $k \geq 0$,

where $f$ is a $\theta$-Hölder function and $c_k$ is a constant depending only on $k$. First we estimate the pairings in (10) with $f_\varepsilon$. Note that $\|\cdot\|_l \leq \|\cdot\|_k$ if $l \leq k$. We have

Since $a$ is chosen in $L^+ \cup L$, $\|A_V^{-1}\|$ grows at most polynomially in $\|a\|$ and thus, for any $\eta > 0$, we can ensure that $\|A_V^{-1}\| < (1 + \eta)^{\|a\|}$ for all $a$ with sufficiently large norm. Thus we conclude from the two equations above that

Now we estimate the pairings in (10) with $f - f_\varepsilon$ using the supremum norm and estimating $\|A_V^{-m}\|$ as above
Here we used notations of Section 3.3. Denoting $N_k = \|a\|_{k,\mathcal{V}'}$, and using equation (5) from Lemma 3.6 we conclude that

Recall that we choose $a$ in a cone around $L' \cap \mathbb{Z}^2$. For any $\eta > 0$, by taking the cone sufficiently narrow and using Proposition 3.4, we can ensure that $N_1 = \|a\|_{1,\mathcal{V}'} < (1+\eta)^{\|a\|}$ for any such $a$ with sufficiently large norm. Then from the last equation we obtain that

For any fixed $\theta$, we have a fixed rate of exponential decay with respect to $m$ in (12), but the rate of exponential growth in (13) can be made arbitrarily slow. This allows us to choose $\varepsilon$ that gives exponentially decaying estimates for both (12) and (13). More precisely, we take $\varepsilon = r^{-\frac{m \|a\| \theta}{\theta+n+k+1}}$ and denote $\zeta = r^{\frac{\theta^2}{\theta+n+k+1}} > 1$. Then we obtain from (12) and (13) that

For any $k$ we can now choose $a$, and hence $\eta$, so that $\xi = \zeta \cdot (1 + \eta)^{-(k+2)} > 1$. Since the polynomial $P$ and constant $N_k$ depend only on $k$ and $a$, we can then estimate $P(mN_k) \leq K_3 (1 + \eta)^{m \|a\|}$. Finally, we obtain from the last two equations that

Thus for any $\theta$ and $k$ we obtain exponentially decreasing estimates for the terms in (10). We conclude that $\|\langle h_V^{k,\mathcal{V}'}, f\rangle\| \leq C \|f\|_\theta$ and hence $h_V^{k,\mathcal{V}'}$ extends to a functional on the space of $\theta$-Hölder functions.
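The balancing behind this choice of $\varepsilon$ can be checked by a short computation. The display below is a sketch under an assumption: up to constants and the factors controlled by $\eta$, the bound for the pairing with $f_\varepsilon$ is taken to scale like $\varepsilon^{-(n+k+1)} r^{-m\|a\|\theta}$ and the bound for the pairing with $f - f_\varepsilon$ like $\varepsilon^{\theta}$. These scalings are inferred from the stated values $\varepsilon = r^{-m\|a\|\theta/(\theta+n+k+1)}$ and $\zeta = r^{\theta^2/(\theta+n+k+1)}$, not quoted from (12) and (13).

```latex
\[
\varepsilon = r^{-\frac{m\|a\|\theta}{\theta+n+k+1}}
\quad\Longrightarrow\quad
\varepsilon^{\theta}
  = r^{-\frac{m\|a\|\theta^{2}}{\theta+n+k+1}}
  = \zeta^{-m\|a\|},
\]
\[
\varepsilon^{-(n+k+1)}\, r^{-m\|a\|\theta}
  = r^{-m\|a\|\theta\left(1-\frac{n+k+1}{\theta+n+k+1}\right)}
  = r^{-\frac{m\|a\|\theta^{2}}{\theta+n+k+1}}
  = \zeta^{-m\|a\|}.
\]
```

Under this assumption both contributions decay at the same rate $\zeta^{-m\|a\|}$, which is what leaves room for the slowly growing factors in $(1+\eta)$.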
Proof of Theorem 1.1: We discussed actions on two- and three-dimensional tori above, and will prove Theorem 1.1 for four-dimensional tori with an exotic smooth structure in the next section. When the dimension is greater than four, as explained above, we can pass to a finite cover by smoothing theory and assume that the smooth structure is standard. Passing to a subgroup of finite index, we can also assume that $\alpha$ has a common fixed point. By Lemma 5.1, for any coarse Lyapunov foliation $\mathcal{V}'$ of $\alpha$ and for any $\theta > 0$ the derivatives of $h_V$ of any order along $\mathcal{V}'$ exist as distributions on the space of $\theta$-Hölder functions. Hence by Corollary 8.4, all $h_V$ are $C^\infty$. Since the subspaces $V$ span $\mathbb{R}^n$, $h$ is determined by the projections $h_V$. It follows that $h$ is $C^\infty$ and hence so is $\phi$. It remains to show that $\phi$ is a diffeomorphism. Since $\phi$ is a homeomorphism, it suffices to show that the differential of $\phi$ is everywhere non-degenerate. This follows from Proposition 3.1 since we have $\lambda = \phi_*(\mu)$ and $\mu$ has smooth positive density.

Four Dimensional Exotic Tori
Now consider a higher rank Anosov action on a 4-dimensional torus with an exotic differentiable structure. Due to low dimension we are able to adapt arguments from [11] to obtain the result in this case.
By passing to a finite index subgroup of Z k we can assume that the linear part ρ acts by linear automorphisms from SL(4, Z). We begin by analyzing possibilities for such actions on T 4 . Let A ∈ SL(4, Z) be an Anosov element for ρ. First we claim that the characteristic polynomial of A is irreducible over Q. Indeed, the only possible splitting would be into a product of quadratic terms and would imply existence of a rational invariant subspace of dimension two. Such a subspace would be invariant with respect to a finite index subgroup of Z k . The restriction of ρ to the corresponding torus would still be Anosov and contain a Z 2 subgroup of ergodic elements, as ergodicity is equivalent to having no root of unity as eigenvalue. The latter however is impossible since Anosov actions on T 2 can only have rank one. More precisely, by the Dirichlet Unit Theorem the centralizer of an irreducible Anosov matrix in SL(n, Z) is a finite extension of Z d , where d is n − 1 minus the number of pairs of complex eigenvalues. Moreover, all nontrivial elements of this Z d are semisimple. We conclude that ρ(Z k ) is a subgroup of such Z d ⊂ SL(4, Z).
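The rank count from the Dirichlet Unit Theorem quoted above is simple arithmetic, and tabulating it for small $n$ makes the case distinction concrete. The helper below is a hypothetical illustration (the name `centralizer_rank` is not from the paper); it only encodes the formula $d = n - 1 - (\text{number of pairs of complex eigenvalues})$ stated in the text.

```python
def centralizer_rank(n, complex_pairs):
    """Rank d of the free abelian part of the centralizer, as quoted in the text.

    n             -- size of the irreducible integer matrix
    complex_pairs -- number of pairs of complex conjugate eigenvalues
    """
    if n < 1 or complex_pairs < 0 or 2 * complex_pairs > n:
        raise ValueError("need n >= 1 and 0 <= 2 * complex_pairs <= n")
    return n - 1 - complex_pairs
```

For $n = 4$ this gives rank 3, 2 or 1 according to whether there are zero, one or two pairs of complex eigenvalues, and for $n = 2$ with real eigenvalues it gives rank 1, in line with the remark that Anosov actions on $\mathbb{T}^2$ can only have rank one.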
We note that ρ has four Lyapunov exponents (counted with multiplicity) and χ 1 + χ 2 + χ 3 + χ 4 = 0 by volume preservation. If no two are negatively proportional then ρ, and hence α, are so called TNS (totally nonsymplectic) and smoothness of the conjugacy follows from [11,Theorem 1.1]. Now suppose that there are negatively proportional Lyapunov exponents. This case does not follow from any previous theorem but can still be handled using techniques from [11] and [22]. Note that in this case there are no positively proportional Lyapunov exponents, as otherwise for elements near the kernel of the negatively proportional ones all Lyapunov exponents will be close to zero by volume preservation, contradicting Lemma 4.3. This implies that ρ(Z k ) contains matrices with pure real spectrum and the coarse Lyapunov spaces for ρ are one-dimensional and totally irrational, so in particular the corresponding linear foliations of T 4 are ergodic.
For the nonlinear action α the coarse Lyapunov foliations are also one-dimensional and any pair W i , W j is jointly integrable in topological sense by the conjugacy to the linear action. By [22,Lemma 4.1] the joint foliation W ij has smooth leaves. For each W i consider a W j which does not correspond to negatively proportional exponents. Then one can see as in [11,Proposition 5.2] that there is an element that contracts W i faster than W j and conclude that W i and W j are C ∞ along the leaves of W ij . In place of measurable normal forms in [11], for one-dimensional foliations we can use the nonstationary linearization [26,Proposition A.1] which is continuous on M in the C ∞ topology. Hence a simple version of the holonomy argument [11, Proposition 8.1] works for any W i using the holonomy along such W j . The argument shows that the conjugacy φ is C ∞ along any W i (x) with the derivatives continuous on M . Then the smoothness of φ follows easily as in [11].

The Nilmanifold Case
In this section we will describe the adaptations of our arguments needed for the case of an Anosov action on an infranilmanifold $M$. Passing to finite covers, we can assume that $N/\Gamma$ is a nilmanifold. Next we reduce to the case when the differentiable structure on $N/\Gamma$ is standard, i.e. given by the ambient Lie group structure. First we note that there are no nilmanifolds of dimension at most 4 supporting an Anosov automorphism besides the torus. Hence we can employ the theorem of J. Davis, proved in the appendix, that every exotic nilmanifold in dimension at least 5 has a finite cover with standard differentiable structure. This allows us to lift the actions to ones smooth with respect to a standard differentiable structure, as in the beginning of Section 5. Thus the main theorem follows for nilmanifolds of dimension at least 5 provided it holds for actions on standard nilmanifolds. We will now give a proof of the main theorem in this set-up.
First note that the arguments from Section 3 allowing uniform control of exponents work verbatim. That certain distributions are dual to the space of Hölder functions will again be key to our arguments. This requires exponential mixing of the action, which no longer follows easily from Fourier analysis or, more generally, representation theory. Instead we invoke a recent result by Gorodnik and the third author [15]. This is far less elementary than the results in Section 4, and uses recent results of Green and Tao [16] on equidistribution of polynomial sequences.
Theorem 7.1 (Gorodnik-Spatzier). Consider a $\mathbb{Z}^k$-action $\alpha$ by ergodic affine diffeomorphisms on an infra-nilmanifold. Then for any $0 < \theta < 1$ there is $0 < \lambda < 1$ such that for any two $\theta$-Hölder functions $f, g : X \to \mathbb{R}$ we get

where $\|z\|$ denotes some fixed norm on $\mathbb{Z}^k$.
We need to establish regularity of the solutions to the cocycle equations employed in Section 5. We are inspired by the approach of Margulis and Qian in [32, Lemma 6.5]. However, while they write their equations in exponential coordinates and directly study the solutions in these coordinates, we will reduce the cocycle equation to a series of equations, one for each term of the derived series of $N$. This yields abelian valued cocycle equations to which we can apply the arguments from the toral case. Here are the details.
As in Section 5 we consider the lift $\tilde\phi : N \to N$ of the Franks-Manning conjugacy $\phi : N/\Gamma \to N/\Gamma$. We can write it as a product $\tilde\phi = h \cdot I$, where $h : N \to N$ satisfies (15) $(h \cdot I)(a(x)) = A\,(h \cdot I)(x)$ on $N$ and projects to a map from $N/\Gamma$ to $N$.
Let $N'$ be the commutator subgroup of $N$. Pick a splitting $\mathfrak{n} = \mathfrak{n}' \oplus \mathfrak{n}_0$ of the Lie algebra $\mathfrak{n}$ of $N$, where $\mathfrak{n}'$ is the Lie algebra of $N'$. Note that $\mathfrak{n}_0$ is not a Lie algebra. Let $N_0 = \exp \mathfrak{n}_0$, where $\exp$ is the exponential map. Now we decompose $h$ as a product $h = h_1 \cdot h_0$, where $h_0$ takes values in $N_0$ and $h_1$ takes values in $N'$, in the following way. We take $h_0$ to be the exponential of the $\mathfrak{n}_0$ component of $\exp^{-1} h$ and define $h_1 = h \cdot (h_0)^{-1}$. One can see that $h_1 \in N'$ from the Campbell-Hausdorff formula since all brackets are in $\mathfrak{n}'$. Note that $h_0$ and $h_1$ project to maps from $N/\Gamma$ to $N$.
Step 1: We first show that $h_0$ is smooth. Let $\bar{h} : N \to N' \backslash N$ be the composition of $h$ with the projection $N \to N' \backslash N$. Note that $h_0$ is smooth precisely when $\bar{h}$ is smooth, since by construction $\exp^{-1} h_0$ and $\exp^{-1} \bar{h}$ are just related by the identification of $\mathfrak{n}_0$ with the Lie algebra of $N' \backslash N$. Write the group operation in $N' \backslash N$ additively. Denote by $\bar{A}$ the induced automorphism of $N' \backslash N$. Then we get $(I + \bar{h})(a(x)) = \bar{A}\,(I + \bar{h})(x)$. Now we can use exactly the same arguments as in Section 5, and in particular exponential mixing, to show that $\bar{h}$ is smooth.
Step 2: We write out Equation (15) in terms of the decomposition $h = h_1 \cdot h_0$: $h_1(a(x))\, h_0(a(x))\, a(x) = A\big(h_1(x)\, h_0(x)\, x\big)$. This gives the formula $h_1(x) = A^{-1}(h_1(a(x)))\, A^{-1}(h_0(a(x)))\, A^{-1}(a(x))\, x^{-1}\, h_0(x)^{-1}$. Since the automorphism $A$ leaves $N'$ invariant it follows that both $h_1(x)$ and $A^{-1}(h_1(a(x)))$ belong to $N'$. Hence the function $Q_1(x) := A^{-1}(h_0(a(x)))\, A^{-1}(a(x))\, x^{-1}\, h_0(x)^{-1}$ also takes values in $N'$. In addition, $Q_1(x)$ is smooth by construction and satisfies the functional equation $h_1(x) = A^{-1}(h_1(a(x)))\, Q_1(x)$. Since $h_1$ projects to a map from $N/\Gamma$, so do $A^{-1}(h_1(a(x)))$ and, from the equation, $Q_1(x)$. Thus the equation holds in $C^0(N/\Gamma, N')$. Now mod out by the second derived group $N''$, and denote the projected maps by bars. Again we write multiplication in $N'' \backslash N'$ additively to get $\bar{h}_1(x) = \bar{A}^{-1}(\bar{h}_1(a(x))) + \bar{Q}_1(x)$. We can analyze the solution to this equation once again using the methods from the basic toral case, and in particular exponential mixing and uniqueness of solutions. We conclude that $\bar{h}_1$ is a smooth function. Continue this analysis by decomposing $N'$ in terms of $N''$ and a complement $N_1$ to $N''$ inside $N'$. Since the derived series of a nilpotent Lie group terminates, we see that $h$ is a smooth function.

Wavefront sets
We establish regularity properties of a distribution whose derivatives along a foliation F are dual to Hölder functions in a suitable fashion. While the definitions and concepts will be developed for foliations, the proof will be entirely local on an open subset of R n 1 × R n 2 and only use partial derivatives along the second factor. However, it will be important to develop the appropriate notions for foliations for our application to the conjugacy problem in the main part of the paper.
The main theorem is a variation of results of Rauch and Taylor in [35] who assume that derivatives of the distribution along a foliation belong to various function spaces. The novelty here is that the derivatives are allowed to be distributions, of a precise order less than 0. While we only deal with the particular case of distributions dual to certain Hölder functions, we expect this to be true much more generally.
We first lay out our assumptions on the foliation. Let x and y denote the coordinates of the first and second factor of a point in R n 1 × R n 2 . Suppose z = Γ(x, y) is a bi-Hölder homeomorphism of an open subset O ⊂ R n 1 x × R n 2 y into R n 1 +n 2 with the property that Γ has y-derivatives of all orders and these derivatives are Hölder in (x, y). We further assume that for fixed x, Γ(x, − ) is an immersion on each {x} × R n 2 y . Then we call Γ a foliation chart, or more precisely, a Hölder foliation chart with smooth leaves. On a manifold, Hölder foliations F with smooth leaves are defined by patching foliation charts. If F can be defined by using smooth foliation charts Γ, we call F smooth. Note that the x × R n 2 for x ∈ R n 1 define a smooth foliation Y of R n 1 +n 2 .
We will further assume $\mathcal{F}$ is strongly absolutely continuous, i.e. there is a continuous function $J(x, y) > 0$ such that all $y$-derivatives of $J$ exist and are Hölder in $x$ and $y$ and such that for any compactly supported continuous function $u$ on $\Gamma(O)$, $\int u(z)\,dz = \int u(\Gamma(x, y))\, J(x, y)\,dx\,dy$.
Note that if a function u(z) has partial derivatives along the foliation F, then u • Γ(x, y) has partial y-derivatives. In addition, the dependence of these latter derivatives on x is continuous or Hölder if the partial derivatives of u along F are continuous or Hölder. Thus the partials ∂ β y (u(Γ(x, y)) are well-defined, and it makes sense to discuss their regularity. We will now define derivatives along the foliation F on a manifold M defined by foliation charts Γ. Fix a standard basis for R n 2 y , parallel translate it over R n 1 +n 2 and consider the push forward under Γ. This defines vector fields V j tangent to F which are smooth along the leaves of F and whose derivatives along F of any order are Hölder transversely to F. We say that a function f has derivatives of order up to k along F if for any sequence V j 1 , . . . , V j k the derivatives V j 1 . . . V j k (f ) exist. If M is endowed with a Riemannian metric, equivalently we can require the following: consider any smooth vector fields X 1 , . . . , X k on M , and denote their orthogonal projections to the tangent spaces of F by Z 1 , . . . , Z k . Then f has derivatives up to order k along F if the derivatives Z 1 . . . Z k (f ) exist.
Lemma 8.1. Under the above assumptions, the derivatives of $\Gamma^{-1}$ along $\mathcal{F}$ are also Hölder.
Proof : This follows from the standard formulas for differentiating the inverse of immersions, and the assumptions on Hölderness of Γ and its derivatives along Y. Note that the correspondence of the Hölder coefficients, while complicated, is explicit.
In our main theorem below, we will allow the Hölder exponents of the higher order derivatives of both Γ and J to get worse with the order. In the following we will use a fixed non-increasing sequence α k such that all Y or F derivatives of both Γ, Γ −1 and J of order at most k are Hölder with Hölder exponent α k . This is possible by the last lemma.
Note that the vector fields $V_j$ defined above and their derivatives along $\mathcal{F}$ up to order $k$ depend $\alpha_k$-Hölder transversely to $\mathcal{F}$.
Fix a Riemannian metric on $M$. Next, we introduce the space $C^{\alpha,k}_{\mathcal{F}}$ of compactly supported $\alpha$-Hölder functions on $M$ which in addition have derivatives along $\mathcal{F}$ of all orders $\leq k$ and all such derivatives are $\alpha$-Hölder as functions on $M$. Then $C^{\alpha,k}_{\mathcal{F}}$ is a Banach space with the norm given by the finite sequence of $\alpha$-Hölder norms of the derivatives along $\mathcal{F}$ of order $\leq k$. If $M$ is compact, the norm is independent of the Riemannian metric chosen up to bi-Lipschitz equivalence. Note that $C^{\alpha,k}_{\mathcal{F}}$ is closed under multiplication. We let $(C^{\alpha,k}_{\mathcal{F}})^*$ be the dual space to $C^{\alpha,k}_{\mathcal{F}}$. Note that any compactly supported smooth function on $M$ naturally belongs to any $C^{\alpha,k}_{\mathcal{F}}$. Hence any element in $(C^{\alpha,k}_{\mathcal{F}})^*$ defines a distribution on smooth functions on $M$. Alternatively, $(C^{\alpha,k}_{\mathcal{F}})^*$ is the space of distributions (dual to smooth functions) which extend to continuous linear functionals on $C^{\alpha,k}_{\mathcal{F}}$. As for notation, we will also write the pairing $D(\phi) = \langle D, \phi\rangle$ for $D \in (C^{\alpha,k}_{\mathcal{F}})^*$ and $\phi \in C^{\alpha,k}_{\mathcal{F}}$. All of these notions apply to the special case of $\mathcal{F} = \mathcal{Y}$.
We will work with a foliation chart Γ and use the above notation for the case M = Γ(O).
under composition with $\Gamma^{-1}$. In consequence, we can also pull back distributions in $(C^{\beta\alpha_k,k}_{\mathcal{F}})^*$ by $\Gamma$ to get distributions in $(C^{\beta,k}_{\mathcal{Y}})^*$.

Proof: Both assertions are standard, and follow simply from the fact that Hölder exponents multiply under composition, and do not change under addition and multiplication. The last statement is obtained by taking duals. The pull back for distributions means push forward by $\Gamma^{-1}$.

Now we define distributional derivatives. Let us first consider partial derivatives along $y$-directions for the $\mathcal{Y}$ foliation. These are the derivatives we will use in the proof of the main theorem below. Fix a standard basis for $\mathbb{R}^{n_2}_y$ and parallel translate it over $\mathbb{R}^{n_1+n_2}$. Then the $\frac{\partial}{\partial y_i}$ derivative of a distribution $D \in (C^{\alpha,k}_{\mathcal{Y}})^*$ is defined by evaluating on $h \in C^{\alpha,k+1}_{\mathcal{Y}}$

Note that $\frac{\partial}{\partial y_i}(D)$ is only defined on $C^{\alpha,k+1}_{\mathcal{Y}}$, and hence, $\frac{\partial}{\partial y_i}(D) \in (C^{\alpha,k+1}_{\mathcal{Y}})^*$.

Similarly, we define distributional derivatives along $\mathcal{F}$. Fix a standard basis for $\mathbb{R}^{n_2}_y$, parallel translate it over $\mathbb{R}^{n_1+n_2}$ and consider the push forward under $\Gamma$. This defines vector fields $V_j$ tangent to $\mathcal{F}$ which are smooth along the leaves of $\mathcal{F}$ and whose derivatives along $\mathcal{F}$ of order up to $k$ depend $\alpha_k$-Hölder transversely for $\alpha_k$ as above. Assume in the following that $\alpha \leq \alpha_k$. Indeed the $V_i(h)$ involve the coefficients of $\Gamma$, and this assumption will ensure that taking derivatives along the $V_j$ does not affect Hölder exponents. More precisely, we have $V_i(h) \in C^{\alpha,k}_{\mathcal{F}}$ for $h \in C^{\alpha,k+1}_{\mathcal{F}}$, as the $V_i$ are $\alpha$-Hölder by assumption on $\alpha$. Hence we can define the derivative of a distribution $D \in (C^{\alpha,k}_{\mathcal{F}})^*$ by evaluating on $h \in C^{\alpha,k+1}_{\mathcal{F}}$

Note that $V_i(D)$ is only defined on $C^{\alpha,k+1}_{\mathcal{F}}$, and hence, $V_i(D) \in (C^{\alpha,k+1}_{\mathcal{F}})^*$. Note that pulling back derivatives $V_j(D)$ gives us $\frac{\partial}{\partial y_j}$ derivatives of the pull back of $D$ on the appropriate function spaces.
We conclude that gD ∈ (C α,k F ) * . If D is given by integration against a compactly supported L 1 -function u, then gD is given by integrating against gu.
Lemma 8.3. Let α ≤ α k and suppose that g ∈ C α,k+1
Proof : We check this by evaluating both sides on φ ∈ C α,k+1
Note: The inner product ⟨D, (V i g)φ⟩ is not defined unless g ∈ C α,k+1 F . Thus we need the higher regularity on g in the hypothesis of the previous lemma. This simple problem caused the introduction of the spaces of test functions C α,k F .
Let u be an L 1 function defined on a neighborhood of a point z 0 . A vector ζ 0 is called not singular for u at z 0 if there exist an open set U ∋ z 0 and an open cone Z ⊂ R n \ {0} around ζ 0 such that for any positive integer N and any C ∞ function χ with support in U there exists a constant C = C(N, χ) so that
$$|\widehat{\chi u}(\zeta)| = \Bigl|\int u(z)\,\chi(z)\,\exp(-iz\cdot\zeta)\,dz\Bigr| \le C|\zeta|^{-N} \quad \text{for all } \zeta \in Z \text{ with } |\zeta| > 1. \tag{19}$$
Otherwise, ζ 0 is called singular for u at z 0 . The wave front set W F (u) is defined as the set of all (z 0 , ζ 0 ) such that ζ 0 is singular for u at z 0 .
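To make the definition concrete, here is a standard illustrative example (ours, not taken from the text): the Heaviside function of the first coordinate, which is constant along the leaves of the foliation by hyperplanes {z_1 = const}. The computation is only a sketch of the upper bound on the wave front set.

```latex
% Example (ours): u(z) = H(z_1) on R^n, n \ge 2, with H the Heaviside function.
% Away from \{z_1 = 0\} the function u is locally constant, so no direction is singular there.
% Near a point with z_1 = 0, take \chi \in C_c^\infty and j \ge 2 with \zeta_j \ne 0, and
% integrate by parts N times in z_j, using
%   e^{-iz\cdot\zeta} = (i/\zeta_j)\,\partial_{z_j} e^{-iz\cdot\zeta}
% and the fact that H(z_1) does not depend on z_j:
\[
  \widehat{\chi u}(\zeta)
  = \int H(z_1)\,\chi(z)\,e^{-iz\cdot\zeta}\,dz
  = \Bigl(\frac{-i}{\zeta_j}\Bigr)^{N}
    \int H(z_1)\,\bigl(\partial_{z_j}^{N}\chi\bigr)(z)\,e^{-iz\cdot\zeta}\,dz ,
  \qquad
  |\widehat{\chi u}(\zeta)| \le \|\partial_{z_j}^{N}\chi\|_{L^1}\,|\zeta_j|^{-N}.
\]
% Any \zeta_0 that is not a multiple of e_1 has a conic neighborhood Z on which
% |\zeta_j| \ge c|\zeta| for some j \ge 2, giving |\widehat{\chi u}(\zeta)| \le C|\zeta|^{-N}. Hence
\[
  WF(u) \subset \{(z,\zeta) : z_1 = 0,\ \zeta = (t,0,\dots,0),\ t \neq 0\},
\]
% that is, only covectors conormal to the leaves \{z_1 = \mathrm{const}\} can be singular.
```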
Theorem 8.3.1. Suppose that u(z) is an L 1 function. Let F be a Hölder foliation with smooth leaves which is also strongly absolutely continuous. Consider the distribution D defined by integration against u(z). Assume that any derivative of D along F of any order belongs to (C α F ) * for all positive α. Then every (z 0 , ζ 0 ) in the wave front set W F (u) is conormal to F.
As an immediate corollary, we obtain the result needed in Section 5.
Corollary 8.4. Let F 1 , . . . , F r be Hölder foliations with smooth leaves on a manifold M which are also strongly absolutely continuous. Assume in addition that the tangent spaces to these foliations span the tangent spaces to M at all points. Now suppose that u(z) is an L 1 function. Consider the distribution D defined by integration against u(z). Assume that any derivative of D of any order along any F i , i = 1, . . . , r belongs to (C α F i ) * for all 1 ≤ i ≤ r and all positive α. Then u is C ∞ .
Proof : Since the tangent spaces to the foliations span the tangent bundle everywhere, no vector ζ ≠ 0 can be conormal to all F i . Now it follows from Theorem 8.3.1 that W F (u) is empty and hence u is smooth by e.g. [19, Section 8.1].
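The linear algebra behind the first sentence of this proof can be spelled out as follows; the symbol N^*_z F_i for the conormal space is our notation.

```latex
% Notation (ours): N^*_z F_i = \{\zeta \in T^*_z M : \zeta(v) = 0 \text{ for all } v \in T_z F_i\}.
\[
  \zeta \in \bigcap_{i=1}^{r} N^*_z F_i
  \quad\text{and}\quad
  T_z M = \sum_{i=1}^{r} T_z F_i
  \;\Longrightarrow\;
  \zeta\Bigl(\sum_{i} v_i\Bigr) = \sum_{i} \zeta(v_i) = 0
  \;\; (v_i \in T_z F_i)
  \;\Longrightarrow\;
  \zeta = 0 .
\]
% Applying the wave front theorem to each F_i separately then leaves no
% nonzero covector that could belong to WF(u).
```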
The main idea in the proof of Theorem 8.3.1 is a simple generalization of an argument of Rauch and Taylor in [35]. However, much more care has to be taken to make sure that the various operations undertaken are well defined and allowed. In particular, we use integration by parts for derivatives along the foliation. This requires that the test functions in question are differentiable along F up to a suitable order. This led to the definition of the function spaces above.
Remark: The proof of Theorem 8.3.1 becomes easier if the foliation F has derivatives of all orders of a fixed Hölder class and the distribution in question together with its derivatives along F are dual to a fixed Hölder class.
Proof : We fix (z 0 , ζ 0 ) which is not conormal to F. By the definition of the wave front set it suffices to show that there exist an open set U ∋ z 0 and an open cone Z ⊂ R n \ {0} around ζ 0 such that for any N > 0 and any χ ∈ C ∞ 0 (U ) there exists a constant C so that
$$|\widehat{\chi u}(\zeta)| \le C|\zeta|^{-N} \quad \text{for all } \zeta \in Z \text{ with } |\zeta| > 1. \tag{20}$$
We define φ(x, y, ζ) = −Γ(x, y) · ζ, and note that, for a fixed ζ, the function φ is in C α k ,k for all k by the choice of α k . Using a foliation chart and the strong absolute continuity of F we can write
$$\widehat{\chi u}(\zeta) = \int u(\Gamma(x,y))\,\chi(\Gamma(x,y))\,J(x,y)\,\exp(i\varphi(x,y,\zeta))\,dx\,dy.$$
We can expand
To describe the functions ψ m,N (x, y, ζ) we note that (g ∂/∂y 1 ) N is a sum of terms of the form P m (∂/∂y 1 ) m , where P m is a polynomial in g and its first (N − m) derivatives. Applying this to g = 1/(i ∂φ(x, y, ζ)/∂y 1 ), we see that each function ψ m,N (x, y, ζ) is a quotient of a polynomial in Γ(x, y) · ζ and its first (N − m + 1) derivatives divided by a power of i∂φ(x, y, ζ)/∂y 1 . Taking k derivatives of ψ m,N yields, by the product and quotient rules, a similar expression which involves derivatives of Γ(x, y) of order (N − m + 1 + k) and hence is Hölder with exponent α (k+N −m+1) . It follows that, for a fixed ζ and any m = 1, ..., N , the function ψ m,N belongs to C α (N+1) ,m . Moreover, there exists a constant C such that
$$\|\psi_{m,N}\|_{\alpha_{(N+1)},\,m} \le C\,|\zeta|^{-N} \quad \text{for all } \zeta \in Z \text{ with } |\zeta| > 1. \tag{24}$$
Indeed, since φ(x, y, ζ) is linear in ζ, both sides of (23) are homogeneous of degree −N in ζ, and hence so are the functions ψ m,N and their derivatives. We conclude that the functions in (24) are rational functions in ζ of homogeneous degree −N whose coefficients, as functions of (x, y), are Hölder on Γ −1 (U ). The Hölder norms of these coefficients are continuous in ζ and hence are uniformly bounded on Z ∩ {|ζ| = 1}. Finally, using equation (21) we can bound the denominators away from zero and obtain (24).
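For small N the structure of the coefficients P m and their homogeneity in ζ can be seen directly. The following worked expansion is only an illustration; the shorthand ∂ and g' is ours.

```latex
% Shorthand (ours): \partial = \partial/\partial y_1, \quad g' = \partial g, \quad g'' = \partial^2 g.
\begin{align*}
  (g\,\partial)^2 f &= g\,g'\,\partial f + g^2\,\partial^2 f, \\
  (g\,\partial)^3 f &= g\bigl((g')^2 + g\,g''\bigr)\,\partial f
                      + 3\,g^2 g'\,\partial^2 f + g^3\,\partial^3 f .
\end{align*}
% In each case P_m involves derivatives of g of order at most N - m, and every
% monomial of P_m contains exactly N factors of g or its derivatives.
% With g = 1/(i\,\partial\varphi) and \varphi(x,y,t\zeta) = t\,\varphi(x,y,\zeta) for t > 0:
\[
  g(x,y,t\zeta) = t^{-1}\,g(x,y,\zeta),
  \qquad
  P_m(x,y,t\zeta) = t^{-N}\,P_m(x,y,\zeta).
\]
```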
In the remainder of the proof we estimate each term of this sum. For this we denote A = u(Γ(x, y)) χ(Γ(x, y)) and A ζ m,N = u(Γ(x, y)) χ(Γ(x, y)) J(x, y) ψ m,N (x, y, ζ) and view A and A ζ m,N as the distributions given by integration, with a fixed ζ, against the corresponding functions. Since the functions u • Γ, χ • Γ, J and ψ m,N are in L 1 , A and A ζ m,N lie in (C α Y ) * = (C α,0 Y ) * for all positive α, and A ζ m,N = Jψ m,N A as elements of (C α Y ) * with multiplication of distributions defined as in equation (18). Recall that φ(x, y, ζ) is in C α m ,m .
Now we use the assumption that derivatives of u, and hence of the localization uχ, along F exist as elements in (C α F ) * for all positive α. Therefore, by Lemma 8.2, y-derivatives of the pull back A = (uχ) • Γ also exist as elements in (C α Y ) * for all positive α. Hence the pairing in (25) can be estimated by the (α N+1 )-Hölder norm of the product. Here $\|\psi_{m,N}\|_{\alpha_{N+1}} \le C|\zeta|^{-N}$ by (24), and the norm $\|\exp(i\varphi(x,y,\zeta))\|_{\alpha_{N+1}}$ can be estimated by C ′ |ζ|. We conclude that each pairing in (25) can be estimated by C ′′ |ζ| −N +1 , and hence the same estimate holds for $|\widehat{\chi u}(\zeta)|$. Since N is arbitrary, the desired estimate (20) now follows and shows that any (z 0 , ζ 0 ) which is not conormal to F is not in the wave front set of u.
Appendix A.

BY JAMES F. DAVIS
A nilmanifold is the quotient G/L of a simply connected nilpotent Lie group G by a discrete cocompact subgroup L. Two homeomorphisms f, g : X → Y are isotopic if they are homotopic through homeomorphisms.
Theorem A.0.1. Let h : M → G/L be a homeomorphism from a smooth manifold to a nilmanifold of dimension greater than four. Then there is a finite cover $\widehat{G/L} \to G/L$ so that the induced pullback homeomorphism $\hat M \to \widehat{G/L}$ is isotopic to a diffeomorphism.
Lemma A.2. Let h : M → N be a homeomorphism of smooth manifolds of dimension greater than four. Suppose N satisfies (*). Then there is a finite cover $\hat N \to N$ so that the induced pullback homeomorphism $\hat M \to \hat N$ is isotopic to a diffeomorphism.
In particular any two smooth structures on N become diffeomorphic after passing to a finite cover. An existence result can be proved using similar techniques: any topological manifold of dimension greater than four which satisfies (*) has a finite cover which admits a smooth structure.
Proof : Since a finite cover of a nilmanifold is a nilmanifold, it will be notationally simpler to show that any nilmanifold satisfies condition (**) defined below.
A space N satisfies condition (**) if for any i > 0, for any finite abelian group T , and for any x ∈ H i (N ; T ), there exists a finite cover $\hat p : \hat N \to N$ so that $\hat p^* x = 0$.
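As an illustration of condition (**), consider the simplest nilmanifold, the torus, where the condition can be verified directly in every degree. This example is ours and is not part of the argument in the text; it uses only the exterior algebra structure of the cohomology of the torus and the Universal Coefficient Theorem.

```latex
% Example (ours): N = T^n = R^n/Z^n, T a finite abelian group, m = |T|.
% The map p_m : T^n \to T^n, z \mapsto m z, is a finite cover of degree m^n.
% It acts by multiplication by m on H^1(T^n; Z), hence by m^i on
% H^i(T^n; Z) = \Lambda^i H^1(T^n; Z). Since H^*(T^n; Z) is free, the Universal
% Coefficient Theorem gives a natural isomorphism
\[
  H^i(T^n; T) \;\cong\; H^i(T^n; \mathbb{Z}) \otimes T ,
\]
% so for every x \in H^i(T^n; T) with i > 0:
\[
  p_m^{*}\,x \;=\; m^{i}\,x \;=\; 0 .
\]
```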
We first verify condition (**) when i = 1. Indeed, the Universal Coefficient Theorem gives an isomorphism H 1 (N ; T ) → Hom(H 1 (N ); T ) for all spaces N and the Hurewicz Theorem gives an isomorphism π 1 (N, n 0 ) ab → H 1 (N ) for a path-connected space N . Thus there is a natural isomorphism of contravariant functors from path-connected based spaces to abelian groups
$$\Phi(N, n_0) : H^1(N; T) \to \operatorname{Hom}(\pi_1(N, n_0), T).$$
Given x ∈ H 1 (N ; T ), there is a connected cover $\hat p : \hat N \to N$ and a base point $\hat n_0 \in \hat N$ so that $\hat p_*(\pi_1(\hat N, \hat n_0)) = \ker(\Phi(N, n_0)(x) : \pi_1(N, n_0) \to T)$.
Since T is a finite group, $\hat p$ is a finite cover. The commutative square shows that $\hat p^* x = 0$.
We now turn to the proof that any nilmanifold satisfies condition (**) when i > 1. The proof will be by induction on the dimension of the nilmanifold N = G/L, using the Gysin sequence of a principal S 1 -bundle $S^1 \to N \xrightarrow{\pi} N/S^1$ where N/S 1 is a nilmanifold. To obtain this principal bundle note that the center Z(G) is nontrivial since G is nilpotent. Furthermore, it can be shown that Z(L) = L ∩ Z(G) is a discrete cocompact subgroup of the real vector space Z(G) (see [33, Proposition 2.17]).
Choose a primitive element l ∈ L ∩ Z(G). Then S 1 = R · l/Z · l acts freely on N and the quotient N/S 1 is the nilmanifold (G/R · l)/(L/Z · l).
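For orientation, the construction above can be made explicit for the Heisenberg nilmanifold, and the Gysin sequence can be written in its standard form. Both are given as a sketch: the example and the symbol e for the Euler class of the circle bundle are ours, and the sequence is quoted in the usual textbook form rather than from this paper.

```latex
% Example (ours): the Heisenberg group and its integer lattice.
\[
  G = \left\{ \begin{pmatrix} 1 & a & c \\ 0 & 1 & b \\ 0 & 0 & 1 \end{pmatrix}
      : a, b, c \in \mathbb{R} \right\},
  \qquad
  L = G \cap GL_3(\mathbb{Z}),
  \qquad
  Z(G) = \{a = b = 0\} \cong \mathbb{R}.
\]
% L \cap Z(G) \cong Z is generated by the primitive element l with (a,b,c) = (0,0,1);
% the quotient N/S^1 is R^2/Z^2 = T^2, so N = G/L is a principal circle bundle over T^2.
%
% Standard Gysin sequence of a principal S^1-bundle \pi : N \to N/S^1, coefficients in T:
\[
  \cdots \to H^{i-2}(N/S^1; T) \xrightarrow{\;\cup e\;} H^{i}(N/S^1; T)
         \xrightarrow{\;\pi^*\;} H^{i}(N; T)
         \xrightarrow{\;\pi_!\;} H^{i-1}(N/S^1; T)
         \xrightarrow{\;\cup e\;} H^{i+1}(N/S^1; T) \to \cdots
\]
% Note that \pi_! lowers the degree by one, so \pi_! x lies in positive degree only when i > 1.
```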
Let N be a nilmanifold. Assume by induction that condition (**) holds for all nilmanifolds of strictly smaller dimension. The Gysin sequence (see [5]) is an exact sequence associated to a principal S 1 -fibration.
By the inductive hypothesis, there exists a finite cover $\hat p/S^1 : \hat N/S^1 \to N/S^1$ so that $(\hat p/S^1)^*(\pi_! x) = 0$. (Note, here is where we use that i > 1.) Define $\hat N$ as the pullback. We have a map of principal S 1 -bundles, hence a map of Gysin sequences (see the bottom two rows of the diagram below). By commutativity of the lower right square below and the exactness of the middle row, there is an $x' \in H^i(\hat N/S^1; T)$ so that $\hat\pi^* x' = \hat p^* x$. By the inductive hypothesis again, there is a finite cover $\tilde p/S^1 : \tilde N/S^1 \to \hat N/S^1$ so that $(\tilde p/S^1)^*(x') = 0$. Defining $\tilde N$ as a pullback, we have the diagram below. Hence our desired finite cover is $\hat p \circ \tilde p : \tilde N \to N$. This completes the proof of the lemma.
In preparation for the proof of Lemma A.2 we review a bit of smoothing theory. The two definitive treatments are the books [29] and [17]; see also the recent survey [6]. A smooth structure on a topological manifold Σ is a pair (M, h) where M is a smooth manifold and h : M → Σ is a homeomorphism. Two smooth structures (M 1 , h 1 ) and (M 2 , h 2 ) are isotopic if there is a diffeomorphism f : M 1 → M 2 so that h 1 is isotopic to h 2 • f . Let T O (Σ) be the set of isotopy classes of smooth structures on Σ.
The fundamental theorem of smoothing theory says that a topological manifold of dimension greater than four admits a smooth structure if and only if its topological tangent bundle admits the structure of a vector bundle. Furthermore, isotopy classes of smooth structures are in bijective correspondence with bundle reductions. It will be easier (and slicker) to express this in terms of maps to classifying spaces, as in Part 2 of [17].
Let T op(n) be the group of homeomorphisms of R n fixing the origin. Give