Entropy and dimension of disintegrations of stationary measures

By Pablo Lessa

Abstract

We extend a result of Ledrappier, Hochman, and Solomyak on exact dimensionality of stationary measures for $\mathrm{SL}_2(\mathbb{R})$ to disintegrations of stationary measures for $\mathrm{SL}_d(\mathbb{R})$ onto the one dimensional foliations of the space of flags obtained by forgetting a single subspace.

The dimensions of these conditional measures are expressed in terms of the gap between consecutive Lyapunov exponents and a certain entropy associated to the group action on the one dimensional foliation they are defined on. It is shown that the entropies thus defined are also related to simplicity of the Lyapunov spectrum for the given measure on $\mathrm{SL}_d(\mathbb{R})$.

1. Introduction

It was shown by Ledrappier [Led84], Hochman and Solomyak [HS17], that if $\nu$ is a probability on the projective space of $\mathbb{R}^2$ which is stationary with respect to a probability $\mu$ on $\mathrm{SL}_2(\mathbb{R})$ with finite Lyapunov exponents, then $\nu$ is exact dimensional and its dimension is $h_F(\mu,\nu)/(2\lambda_1)$, where $h_F(\mu,\nu)$ is the Furstenberg entropy and $\lambda_1$ is the largest Lyapunov exponent (hence $2\lambda_1$ is the gap between the two Lyapunov exponents).
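For orientation, note that the determinant-one normalization forces $\lambda_1 + \lambda_2 = 0$ in this case, so the formula can be written as

\[ \dim\nu = \frac{h_F(\mu,\nu)}{2\lambda_1} = \frac{h_F(\mu,\nu)}{\lambda_1 - \lambda_2}, \]

which is the form that generalizes below.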

Suppose now that $\mu$ is a probability on $\mathrm{SL}_3(\mathbb{R})$ and $\nu$ is a $\mu$-stationary probability on the space of flags in $\mathbb{R}^3$ (i.e. pairs $(V,W)$ where $V \subset W$, $V$ is a one dimensional subspace, and $W$ is a two dimensional subspace), which is a three-dimensional manifold.

We consider here the two foliations of the space of flags obtained by partitioning it into sets of flags sharing the same one dimensional subspace on the one hand, and flags sharing the same two dimensional subspace on the other. These are foliations by circles, and furthermore the action of any invertible linear self mapping of $\mathbb{R}^3$ preserves both foliations.

In this context we show that the conditional measures obtained by disintegrating $\nu$ with respect to these two foliations are exact dimensional. Furthermore we express the dimension of these disintegrations in terms of the gap between consecutive Lyapunov exponents as well as two entropies $h_1$ and $h_2$, one associated to each foliation. Before establishing the dimension formula we show that the entropies bound the gaps between exponents from below and therefore, in principle, yield a criterion for simplicity of the Lyapunov spectrum.

We prove our results in a slightly more general context, that of actions of $\mathrm{SL}_d(\mathbb{R})$ on the space of complete flags in $\mathbb{R}^d$. In this context there are $d-1$ associated one dimensional foliations which correspond to “forgetting” the $k$-dimensional subspace of all flags for some $k \in \{1,\ldots,d-1\}$.
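To make these one dimensional fibers explicit (a standard identification, with the conventions $x^0 = \{0\}$ and $x^d = \mathbb{R}^d$ used only in this remark): the fiber over a flag missing its $k$-dimensional subspace consists of all ways to reinsert that subspace, namely

\[ \left\{ V : x^{k-1} \subset V \subset x^{k+1},\ \dim V = k \right\} \cong \mathbb{P}\left(x^{k+1}/x^{k-1}\right) \cong \mathbb{RP}^1, \]

a circle; for $d = 3$ this recovers the two foliations by circles described above.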

1.1. Preliminaries

Let $\sigma_1(g) \ge \cdots \ge \sigma_d(g)$ denote the singular values of an element $g \in \mathrm{SL}_d(\mathbb{R})$ with respect to the standard inner product.
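For concreteness (standard facts recalled in this notation): the singular values are the square roots of the eigenvalues of $g^{\mathsf{T}}g$, and since $\det g = 1$ they satisfy

\[ \sigma_1(g) \ge \cdots \ge \sigma_d(g) > 0, \qquad \prod_{k=1}^{d} \sigma_k(g) = \left|\det g\right| = 1. \]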

We denote by $\mathcal{F}$ the space of complete flags in $\mathbb{R}^d$; an element $x \in \mathcal{F}$ is of the form $x = \left(x^1,\ldots,x^{d-1}\right)$ where $x^k$ is a $k$-dimensional subspace of $\mathbb{R}^d$ for each $k$, and $x^k \subset x^{k+1}$ for $k = 1,\ldots,d-2$.

Let $\mathcal{F}_k$ denote the space of flags missing their $k$-dimensional subspace. For a given complete flag $x$ we denote by $x_k$ its projection to $\mathcal{F}_k$ (i.e. the sequence obtained by removing $x^k$ from $x$).

We use the notation $X \overset{d}{=} Y$ for equality in distribution between random elements $X$ and $Y$, and $\mu \ll \nu$ to mean that the probability $\mu$ is absolutely continuous with respect to the probability $\nu$.

If $X$ and $Y$ are random elements taking values in complete separable metric spaces, (a version of) the conditional distribution of $X$ given $Y$ is a $\sigma(Y)$-measurable random probability $P_{X|Y}$ on the range of $X$ such that

\[ \int f(x)\,\mathrm{d}P_{X|Y}(x) = \mathbb{E}\left[f(X) \mid Y\right] \]

for all continuous bounded real functions $f$ (here the right-hand side is the conditional expectation of $f(X)$ with respect to the $\sigma$-algebra generated by $Y$). Such a conditional distribution is well defined up to sets of zero measure but we will abuse notation slightly by referring to ‘the conditional distribution’.

It is always the case that there exists a Borel mapping $y \mapsto P_{X|Y=y}$ from the range of $Y$ to the space of probabilities on the range of $X$ such that $P_{X|Y=Y}$ is a version of the conditional distribution of $X$ given $Y$. Fixing such a mapping one may speak of $P_{X|Y=y}$ for non-random $y$ in the range of $Y$.

The lower local dimension of a probability measure $\nu$ on a metric space at a point $x$ is defined by

\[ \underline{\dim}(\nu,x) = \liminf_{r \to 0^+} \frac{\log \nu\left(B_r(x)\right)}{\log r}, \]

while the upper local dimension is defined by

\[ \overline{\dim}(\nu,x) = \limsup_{r \to 0^+} \frac{\log \nu\left(B_r(x)\right)}{\log r}, \]

where $B_r(x)$ is the ball of radius $r$ centered at $x$.

If the lower and upper local dimensions of $\nu$ are equal to the same constant $\nu$-almost everywhere then we say that $\nu$ is exact dimensional and define its global dimension $\dim \nu$ as the given constant.
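For example, if $\nu$ is Lebesgue measure on $[0,1]$ then $\nu\left(B_r(x)\right) = 2r$ for every interior point $x$ and all small $r$, so

\[ \lim_{r \to 0^+} \frac{\log \nu\left(B_r(x)\right)}{\log r} = \lim_{r \to 0^+} \frac{\log 2r}{\log r} = 1, \]

and $\nu$ is exact dimensional with $\dim\nu = 1$.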

1.2. Statement of main results

Suppose that $g$ is a random element of $\mathrm{SL}_d(\mathbb{R})$ with distribution $\mu$ such that

\[ \mathbb{E}\left[\log \sigma_1(g)\right] < +\infty, \]

and let $F$ be a random element of $\mathcal{F}$ with distribution $\nu$ which is independent from $g$ and such that

\[ gF \overset{d}{=} F. \]

The existence of such a pair $(g,F)$ is equivalent to the fact that $\nu$ is a $\mu$-stationary probability, as first defined in [Fur63].
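Equivalently, in the convolution form going back to [Fur63], stationarity of $\nu$ means

\[ \nu = \mu * \nu := \int_{\mathrm{SL}_d(\mathbb{R})} g_*\nu \,\mathrm{d}\mu(g), \]

i.e. averaging the pushforwards $g_*\nu$ against $\mu$ returns $\nu$.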

The Lyapunov exponents $\lambda_1,\ldots,\lambda_d$ of $\mu$ relative to $\nu$ are defined by the equations

\[ \lambda_1 + \cdots + \lambda_k = \mathbb{E}\left[\log \operatorname{Jac}\left(g|_{F^k}\right)\right], \qquad k = 1,\ldots,d-1, \]

together with $\lambda_1 + \cdots + \lambda_d = 0$, where $\operatorname{Jac}(g|_V)$ is the Jacobian of the restriction of $g$ to the subspace $V$ (where the volume measure induced by the standard inner product is used on $V$ and its image). In the degenerate case where $V = \{0\}$ one has $\operatorname{Jac}(g|_V) = 1$, and if $V$ is one dimensional one has $\operatorname{Jac}(g|_V) = |gv|/|v|$ for any non-zero $v \in V$.
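Concretely, for any basis $v_1,\ldots,v_k$ of a $k$-dimensional subspace $V$ one has the standard formula

\[ \operatorname{Jac}\left(g|_V\right) = \frac{\left\|gv_1 \wedge \cdots \wedge gv_k\right\|}{\left\|v_1 \wedge \cdots \wedge v_k\right\|}, \]

so that for $k = 1$ the defining equation reads $\lambda_1 = \mathbb{E}\left[\log\left(|gv|/|v|\right)\right]$ with $v$ spanning $F^1$.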

The Lyapunov exponents given by the multiplicative ergodic theorem of [Ose68] for a product of i.i.d. random matrices with distribution $\mu$ are obtained by maximizing the sums $\lambda_1 + \cdots + \lambda_k$ over all $\mu$-stationary probabilities $\nu$, as shown in [FK83].
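For context, these exponents admit the standard almost sure description (the notation $\tilde\lambda_k$ is ours, used only in this remark): if $g_1, g_2, \ldots$ are i.i.d. with distribution $\mu$ then, for $k = 1,\ldots,d$,

\[ \tilde\lambda_1 + \cdots + \tilde\lambda_k = \lim_{n\to\infty} \frac{1}{n}\log\left(\sigma_1\left(g_n\cdots g_1\right)\cdots\sigma_k\left(g_n\cdots g_1\right)\right) \quad \text{almost surely.} \]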

Fix $k \in \{1,\ldots,d-1\}$, let $F_k$ be the projection of $F$ to $\mathcal{F}_k$, let $\nu_k$ denote the distribution of $F_k$, and let $\nu_{F_k}$ be the conditional distribution of $F$ given $F_k$.

Theorem 1 (Inequality between entropy and gap between exponents).

If $\nu$ is the unique $\mu$-stationary probability on $\mathcal{F}$ which projects to $\nu_k$ then almost surely,

\[ g\nu_{F_k} \ll \nu_{gF_k}, \]

the entropy

\[ h_k := \mathbb{E}\left[\log \frac{\mathrm{d}\,g\nu_{F_k}}{\mathrm{d}\,\nu_{gF_k}}(gF)\right] \]

satisfies $0 \le h_k \le \lambda_k - \lambda_{k+1}$, and $h_k = 0$ if and only if $g\nu_{F_k} = \nu_{gF_k}$ almost surely.

Theorem 2 (Dimension of conditional measures).

If $\nu$ is ergodic, $\nu$ is the unique $\mu$-stationary probability on $\mathcal{F}$ which projects to $\nu_k$, and $\lambda_k > \lambda_{k+1}$, then almost surely $\nu_{F_k}$ is exact dimensional and

\[ \dim \nu_{F_k} = \frac{h_k}{\lambda_k - \lambda_{k+1}}. \]

In the case $d = 2$ both theorems above are known. A proof of Theorem 1 in this case was first given in [Led84]. In the same work the formula for dimension in Theorem 2 is shown to hold for a slightly different notion of dimension. The exact dimensionality of stationary measures when $d = 2$ was first proved in [HS17], and this implies the formula above for the same notion of dimension we use here.

Theorem 1 implies that the Lyapunov spectrum is simple (i.e. all exponents are different) if for each $k$ there does not exist a family of conditional probabilities satisfying $g\nu_{F_k} = \nu_{gF_k}$ almost surely. This suggests a connection to criteria for simplicity dating back to [GdM89] and [GR89], though we do not explore this issue further here.

Part 1. Entropy, mutual information, and Lyapunov exponent gaps

2. Entropy and mutual information

We will define below the conditional mutual information $I\left(g; gF \mid gF_k\right)$ between $g$ and $gF$ given $gF_k$. This is a non-negative $\sigma(gF_k)$-measurable random variable which may take the value $+\infty$.

The purpose of this section is to prove the following:

Lemma 1 (Entropy and mutual information).

If $I\left(g; gF \mid gF_k\right) < +\infty$ almost surely then $g\nu_{F_k} \ll \nu_{gF_k}$ almost surely and

\[ h_k = \mathbb{E}\left[I\left(g; gF \mid gF_k\right)\right]. \]

Conversely, if $g\nu_{F_k} \ll \nu_{gF_k}$ almost surely then $h_k = \mathbb{E}\left[I\left(g; gF \mid gF_k\right)\right]$ whether $h_k$ is finite or not.

This result reduces the problem of showing that $g\nu_{F_k} \ll \nu_{gF_k}$ almost surely and that $h_k \le \lambda_k - \lambda_{k+1}$ to that of bounding the conditional mutual information between $g$ and $gF$ given $gF_k$.

A general reference covering mutual information, including Dobrushin’s theorem and the Gelfand-Yaglom-Perez theorem, is [Pin64].

2.1. Conditional mutual information

2.1.1. Mutual information

Let $X$ and $Y$ be random elements of two Polish spaces $\mathcal{X}$ and $\mathcal{Y}$, and denote by $P_X$, $P_Y$, and $P_{(X,Y)}$ the distribution of $X$, $Y$, and $(X,Y)$ respectively.

The mutual information between $X$ and $Y$ is defined by

\[ I(X;Y) = \sup \sum_{i,j} P_{(X,Y)}\left(A_i \times B_j\right) \log\frac{P_{(X,Y)}\left(A_i \times B_j\right)}{P_X(A_i)\,P_Y(B_j)}, \]

where the supremum is over all finite partitions $\{A_i\}$ of $\mathcal{X}$ and $\{B_j\}$ of $\mathcal{Y}$ into Borel sets.

Directly from the definition one sees that $I(X;Y) = I(Y;X)$.

By Jensen’s inequality $I(X;Y) \ge 0$, with equality to $0$ if and only if $X$ and $Y$ are independent. If $X$ takes countably many values and has finite entropy $H(X)$ in the sense of [Sha48] one has $I(X;Y) \le H(X)$.
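For example, when $(X,Y)$ takes finitely many values the supremum is attained at the partitions into points and one recovers the familiar identity

\[ I(X;Y) = \sum_{x,y} P_{(X,Y)}(x,y)\log\frac{P_{(X,Y)}(x,y)}{P_X(x)\,P_Y(y)} = H(X) + H(Y) - H(X,Y). \]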

It was shown in [Dob59] that $I(X;Y)$ is the supremum over any sequence of partitions which generate the Borel $\sigma$-algebras in $\mathcal{X}$ and $\mathcal{Y}$ (see also [Gra11, Lemma 7.3]). This has the following important corollary:

Proposition 1 (Semi-continuity of mutual information).

If $(X_n, Y_n) \to (X,Y)$ in the sense of distributions then $I(X;Y) \le \liminf\limits_{n\to\infty} I(X_n; Y_n)$.

It was shown in [GfY59] and [Per59] that if $I(X;Y) < +\infty$ then $P_{(X,Y)} \ll P_X \otimes P_Y$ and

\[ I(X;Y) = \int \log\frac{\mathrm{d}P_{(X,Y)}}{\mathrm{d}\left(P_X \otimes P_Y\right)}\,\mathrm{d}P_{(X,Y)}. \]

Conversely, if $P_{(X,Y)} \ll P_X \otimes P_Y$ then

\[ I(X;Y) = \int \log\frac{\mathrm{d}P_{(X,Y)}}{\mathrm{d}\left(P_X \otimes P_Y\right)}\,\mathrm{d}P_{(X,Y)}, \]

whether the right hand side is finite or not.

These results are usually called the Gelfand-Yaglom-Perez Theorem.

In our context, when the space $\mathcal{X}$ in which $X$ takes its values is compact, this yields the following result:

Proposition 2.

If $\mathcal{X}$ is compact and $I(X;Y) < +\infty$ then $P_{X|Y} \ll P_X$ almost surely and

\[ I(X;Y) = \mathbb{E}\left[\log\frac{\mathrm{d}P_{X|Y}}{\mathrm{d}P_X}(X)\right]. \]

Conversely, if $P_{X|Y} \ll P_X$ almost surely then the same equality holds whether the right hand side is finite or not.

Proof.

The marginal distributions of $(X,Y)$ are $P_X$ and $P_Y$ respectively, while the conditional distribution of $X$ given $Y$ is $P_{X|Y}$.

Therefore, letting $P_{(X,Y)}$ be the joint distribution of $(X,Y)$, one has

\[ \int f(x,y)\,\mathrm{d}P_{(X,Y)}(x,y) = \int\left(\int f(x,y)\,\mathrm{d}P_{X|Y=y}(x)\right)\mathrm{d}P_Y(y) \]

for all bounded measurable functions $f : \mathcal{X}\times\mathcal{Y} \to \mathbb{R}$.

If $P_{X|Y=y} \ll P_X$ for $P_Y$-almost every $y$ then

\[ \int f(x,y)\,\mathrm{d}P_{(X,Y)}(x,y) = \int\int f(x,y)\,\frac{\mathrm{d}P_{X|Y=y}}{\mathrm{d}P_X}(x)\,\mathrm{d}P_X(x)\,\mathrm{d}P_Y(y), \]

so that $P_{(X,Y)} \ll P_X \otimes P_Y$ and at $P_{(X,Y)}$-almost every point $(x,y)$ one has

\[ \frac{\mathrm{d}P_{(X,Y)}}{\mathrm{d}\left(P_X \otimes P_Y\right)}(x,y) = \frac{\mathrm{d}P_{X|Y=y}}{\mathrm{d}P_X}(x). \]

On the other hand, if $P_{(X,Y)} \ll P_X \otimes P_Y$ then setting $\rho = \frac{\mathrm{d}P_{(X,Y)}}{\mathrm{d}\left(P_X \otimes P_Y\right)}$ one has

\[ \int f(x,y)\,\mathrm{d}P_{(X,Y)}(x,y) = \int\int f(x,y)\,\rho(x,y)\,\mathrm{d}P_X(x)\,\mathrm{d}P_Y(y) \]

for all bounded measurable functions $f$.

Letting $f(x,y) = \varphi(x)\mathbf{1}_B(y)$, where $\mathbf{1}_B$ is the indicator of an arbitrary Borel subset $B$ of $\mathcal{Y}$ and $\varphi$ is continuous on the compact space $\mathcal{X}$, one obtains that

\[ \int \varphi(x)\,\mathrm{d}P_{X|Y=y}(x) = \int \varphi(x)\,\rho(x,y)\,\mathrm{d}P_X(x) \]

for $P_Y$-almost every $y$. Intersecting the $P_Y$-full measure sets where this holds over a countable dense set of functions $\varphi$ in $C(\mathcal{X})$, one obtains a full measure set of $y$ for which $P_{X|Y=y} \ll P_X$.

Hence, the distribution of $(X,Y)$ is absolutely continuous with respect to $P_X \otimes P_Y$ if and only if $P_{X|Y=y} \ll P_X$ for $P_Y$-almost every $y$, and in this case the Radon-Nikodym derivative between the two at $(x,y)$ is given by $\frac{\mathrm{d}P_{X|Y=y}}{\mathrm{d}P_X}(x)$. The proposition now follows from the Gelfand-Yaglom-Perez Theorem.

2.1.2. Conditional mutual information

Let $\mathcal{A}$ be a $\sigma$-algebra of measurable sets in the probability space on which the random elements $X$ and $Y$ are defined.

The mutual information between $X$ and $Y$ conditioned on $\mathcal{A}$, denoted $I(X;Y \mid \mathcal{A})$, is the random variable, unique up to modifications on null sets, obtained as above but using the conditional distribution of $(X,Y)$ given $\mathcal{A}$. In the case $\mathcal{A} = \sigma(Z)$ for some random element $Z$ we use the notation $I(X;Y \mid Z)$.

One still has $I(X;Y \mid \mathcal{A}) \ge 0$ almost surely. Almost sure equality to zero occurs if and only if $X$ and $Y$ are conditionally independent given $\mathcal{A}$.

In general there is no relation between $I(X;Y)$ and $I(X;Y \mid \mathcal{A})$, or even between $I(X;Y)$ and $\mathbb{E}\left[I(X;Y \mid \mathcal{A})\right]$.

To see this suppose for example that $X$ and $Y$ are i.i.d. taking the values $\pm 1$ with probability $\frac{1}{2}$ and $Z = XY$; then one has $I(X;Y) = 0$ while $I(X;Y \mid Z) = \log 2$ almost surely.
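Indeed, conditioned on $Z$ one has $Y = ZX$ with $X$ still uniform on $\{\pm 1\}$, so the finite-alphabet identity above gives

\[ I(X;Y \mid Z) = H(Y \mid Z) - H(Y \mid X, Z) = \log 2 - 0 = \log 2 \quad\text{almost surely.} \]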

On the other hand for any Markov chain $X, Y, Z$ one has $I(X;Z \mid Y) = 0$ almost surely, and one may construct examples with $I(X;Z) > 0$. For example, setting $Y = X + \xi_1$ and $Z = Y + \xi_2$, where $X, \xi_1, \xi_2$ are i.i.d. with $\mathbb{P}\left(X = \pm 1\right) = \frac{1}{2}$, suffices.

The following semi-continuity property holds:

Proposition 3 (Semi-continuity of conditional mutual information).

If the conditional distribution of $(X_n, Y_n)$ given $\mathcal{A}$ converges almost surely to the conditional distribution of $(X,Y)$ given $\mathcal{A}$, then $I(X;Y \mid \mathcal{A}) \le \liminf\limits_{n\to\infty} I(X_n; Y_n \mid \mathcal{A})$ almost surely.

Proof.

This is a direct consequence of Proposition 1.

The following monotonicity property follows immediately from the definition of mutual information:

\[ I\left(X; (Y,Z) \mid \mathcal{A}\right) \ge I\left(X; Y \mid \mathcal{A}\right). \]

A more precise version of monotonicity is the following:

Proposition 4 (Chain rule for conditional mutual information).

If $X$, $Y$, and $Z$ are random elements and $\mathcal{A}$ is a $\sigma$-algebra of events of the probability space on which they are defined, then

\[ I\left(X; (Y,Z) \mid \mathcal{A}\right) = I\left(X; Y \mid \mathcal{A}\right) + \mathbb{E}\left[\,I\left(X; Z \mid \mathcal{A} \vee \sigma(Y)\right) \mid \mathcal{A}\,\right]. \]

Proof.

When $\mathcal{A}$ is trivial this is [Gra11, Corollary 7.14] (notice that what said reference denotes by $I(X;Z \mid Y)$ is $\mathbb{E}\left[I(X;Z \mid Y)\right]$ in our notation). The general case follows by applying this to the conditional distributions given $\mathcal{A}$.

2.2. Proof of Lemma 1

We will calculate the marginal distributions and the joint distribution of $(g, gF)$ conditioned on $gF_k$, and apply the Gelfand-Yaglom-Perez Theorem as in Proposition 2.

To begin we simply let $\nu_{F_k}$ be the conditional distribution of $F$ given $F_k$.

By stationarity of $\nu$, the conditional distribution of $gF$ given $gF_k$ is $\nu_{gF_k}$.

For the joint distribution, notice that the distribution of $gF$ conditioned on $(g, gF_k)$ is the same as conditioned on $(g, F_k)$, and therefore it is $g\nu_{F_k}$.

Hence the joint conditional distribution of $(g, gF)$ given $gF_k$ satisfies (and is determined by the equation)

\[ \mathbb{E}\left[f\left(g, gF\right) \mid gF_k\right] = \mathbb{E}\left[\int f(g,x)\,\mathrm{d}\left(g\nu_{F_k}\right)(x) \,\middle|\, gF_k\right] \]

for all bounded continuous functions $f$.