Wiener-Wintner Ergodic Theorem, in Brief

Idris Assani

Communicated by Notices Associate Editor Steven Sam

In 1941, N. Wiener and A. Wintner introduced a strengthening of the Birkhoff-Khinchin Ergodic Theorem, which initiated the study of a general phenomenon in ergodic theory where samplings are “good” for an uncountable number of systems. While this result was interesting on its own, it took mathematicians a few decades to realize that these “good” samplings play key roles in various other types of ergodic averages.

In this short note, we will briefly introduce the theorem and discuss how it played a key role in the study of Furstenberg averages, averages along the cubes, and the Return Times Theorem.

1. What is Ergodic Theory?

The results that we are presenting here are part of ergodic theory. We do not pretend to present an exhaustive list on the topic defined in the introduction. Our goal is to introduce the interested reader and particularly graduate students to this topic without requiring extensive background.¹

¹ We will use an asterisk (*) to indicate references which are not listed at the end of the article. The complete list of references is available at https://idrisassani.web.unc.edu/wp-content/uploads/sites/21419/2021/09/Complete-bibliography-WWET.pdf.

The word “ergodic,” which was coined by L. Boltzmann, comes from the Greek words ἔργον (energy) and ὁδός (path). Ergodic theory is a branch of mathematics which has its origin in statistical mechanics, and can be dated back to the 1870s when J. C. Maxwell and Boltzmann were working on formulating an “ergodic hypothesis” which studied the conditions under which the “time average” of the system equals the “space average.”

For instance, consider a box with gas molecules inside it. We can think of the box as a unit cube in $\mathbb{R}^3$. Suppose one is interested in studying how often these molecules, which are moving freely, visit the first octant of the box on average over a sufficiently long time interval. One could study the path of each molecule, but that becomes hopeless as soon as we realize that the number of variables is enormous (perhaps on the order of Avogadro’s number, which is around $6 \times 10^{23}$). But if we assume that the molecules move freely enough so that they are not trapped in a certain part of the box, then the frequency of their visits should equal the proportion of the entire space that the octant occupies (if we are correct in the hypothesis that the time average equals the space average). As long as we can make this assumption, we conclude that the molecules are in the first octant of the box $1/8$ of the time on average. We recommend C. Moore’s note [17] for those interested in the historical aspect of the initial development of ergodic theory.

The mathematical formulation of ergodic theory starts with a measure space $(X, \mathcal{F}, \mu)$ such that there exists an action by a measurable map $T : X \to X$. The set $(X, \mathcal{F}, \mu, T)$ is then called a dynamical system. If the underlying $\sigma$-algebra is clear, we sometimes omit it and simply write $(X, \mu, T)$. Ergodic theory is a subfield of the field of dynamical systems which studies the statistical behavior of such systems. The map $T$ is said to be measure preserving if for each $A \in \mathcal{F}$ we have $\mu(T^{-1}A) = \mu(A)$. The dynamical system $(X, \mathcal{F}, \mu, T)$ is then called a measure-preserving system. Measure-preserving systems are one of the central objects of study of ergodic theory. Here are some examples of measure-preserving systems:

a) Rotations on $\mathbb{T}$, the one-dimensional torus, with Lebesgue measure $m$. They are defined by the map $R_\alpha(x) = x + \alpha \pmod 1$, where $\alpha \in [0, 1)$. The measure-preserving property is a simple consequence of the fact that Lebesgue measure is invariant under translations. This makes $(\mathbb{T}, m, R_\alpha)$ a measure-preserving system.

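b) The map $T_2 : \mathbb{T} \to \mathbb{T}$ is defined by $T_2(x) = 2x \pmod 1$, where $m$ is again the Lebesgue measure. We sketch the graph of this map in Figure 1. Here the measure-preserving property can be derived by checking it on open intervals $(a, b)$. The slope of the lines shows that $m(T_2^{-1}(a, b)) = m(I_1) + m(I_2) = b - a$, where $I_1 = (a/2, b/2)$ and $I_2 = ((a+1)/2, (b+1)/2)$.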

Figure 1.

The doubling map defined by $T_2(x) = 2x \pmod 1$.

We remark that the map $T_2$ has some properties quite distinct from the rotation map $R_\alpha$ above. First, $R_\alpha$ is invertible, while $T_2$ is not. Second, $R_\alpha$ is forward measure preserving, i.e., $m(R_\alpha(A)) = m(A)$, while $T_2$ is not. Lastly, the system under the map $T_2$ displays more chaotic behavior, in the sense that the trajectories of two points in $\mathbb{T}$ may differ quite a bit, no matter how close the original points are, whereas the trajectories under $R_\alpha$ always stay at a fixed distance. The former phenomenon is called sensitivity to initial conditions, or the butterfly effect. While this notion is not too important in this article, interested readers can learn more about it in any textbook on dynamical systems.

c) One can obtain a measure-preserving system in a fairly general situation. Let $X$ be a compact metric space and $T$ a continuous function from $X$ to $X$. By a theorem of Krylov-Bogoliubov*, there exists a measure $\mu$ defined on the Borel subsets of $X$ which makes the system $(X, \mu, T)$ measure preserving.
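The interval computation behind example b) can be checked with exact rational arithmetic. The following is a minimal sketch, assuming the doubling map $x \mapsto 2x \pmod 1$ on $[0, 1)$; the helper names `doubling_preimage` and `total_length` are illustrative and do not come from the article.

```python
from fractions import Fraction


def doubling_preimage(a, b):
    """Return the two intervals whose union is the preimage of (a, b)
    under the doubling map x -> 2x mod 1, for 0 <= a < b <= 1."""
    a, b = Fraction(a), Fraction(b)
    if not (0 <= a < b <= 1):
        raise ValueError("expected 0 <= a < b <= 1")
    left = (a / 2, b / 2)                # preimage under the branch on [0, 1/2)
    right = ((a + 1) / 2, (b + 1) / 2)   # preimage under the branch on [1/2, 1)
    return left, right


def total_length(intervals):
    """Sum of the lengths of a collection of disjoint intervals (lo, hi)."""
    return sum(hi - lo for lo, hi in intervals)


if __name__ == "__main__":
    a, b = Fraction(1, 5), Fraction(7, 10)
    pre = doubling_preimage(a, b)
    # The preimage has the same Lebesgue measure as (a, b).
    print(pre, total_length(pre) == b - a)
```

Each branch of the map has slope 2, so each piece of the preimage has half the length of the original interval, and the two halves add up to $b - a$. The forward image behaves differently: the image of $(0, 1/4)$ is $(0, 1/2)$, which has twice the measure.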

We give now the definition of an ergodic measure-preserving system.

Definition 1.1.

The measure-preserving system $(X, \mathcal{F}, \mu, T)$ is said to be ergodic if $T^{-1}A = A$ implies $\mu(A) = 0$ or $\mu(A^c) = 0$.

We will discuss the importance of this notion later, but an intuitive way to describe this is that an ergodic system contains no nontrivial invariant set (modulo a set of measure zero).

It is simple to see that if in example a) the number $\alpha$ is rational, then $R_\alpha$ is periodic. This periodicity prevents the system from being ergodic. However, if $\alpha$ is irrational this system is ergodic. This can be seen by using Weyl’s criterion for uniform distribution. Example b) is also ergodic.

Furthermore, $(X, \mathcal{F}, \mu, T)$ is ergodic if and only if every invariant function is a.e. constant (i.e., if $f \circ T = f$ for some measurable function $f$, then $f$ is a constant function a.e.).

We will assume in this paper that $\mu(X) = 1$. In other words, $(X, \mathcal{F}, \mu)$ is a probability space.

One of the nice things about studying ergodic theory is that all measure-preserving systems enjoy valuable statistical properties, including those that display more chaotic behavior (like the doubling map example above) and are typically challenging to study at the level of individual orbits. Here we present three classical results that demonstrate these statistical qualities.

We will first discuss possibly the oldest result in ergodic theory, which is due to Poincaré.

Theorem 1.2 (Poincaré Recurrence Theorem).

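Let $(X, \mathcal{F}, \mu, T)$ be a measure-preserving system where we assume that $\mu(X) = 1$. Consider $A \in \mathcal{F}$ with positive measure. Then
$$\mu\left(\{x \in A : T^n x \in A \text{ for infinitely many } n\}\right) = \mu(A).$$
Furthermore, if the system is ergodic, then if $B$ is another set with positive measure we have
$$\mu\left(\{x \in A : T^n x \in B \text{ for infinitely many } n\}\right) = \mu(A).$$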

In essence this theorem says that for “most” points $x \in A$, the iterates of $x$ under $T$, namely $T^n x$, return infinitely often to the set $A$. When the system is ergodic, then for most points the iterates will “visit” any other set $B$ with positive measure infinitely often (an intuitive explanation for this follows from the fact that an ergodic system lacks nontrivial invariant sets).

The following two classical results are considered to be the “heart” of ergodic theory. In particular, the results by von Neumann and Birkhoff (later reformulated by Khinchin) from the early 1930s provided a rationale for the hypothesis that time averages can be equal to space averages, which was a fundamental problem in statistical mechanics.

Figure 2.

G. Birkhoff (left), A. Khinchin (middle), and J. von Neumann (right).

First we discuss the result by von Neumann.

Theorem 1.3 (von Neumann’s mean ergodic theorem).

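Let $(X, \mathcal{F}, \mu, T)$ be a measure-preserving system, with $\mu(X) = 1$. Let $f$ denote a function in $L^p(\mu)$ for some $p$, $1 \le p < \infty$. Then the averages $\frac{1}{N}\sum_{n=1}^{N} f \circ T^n$ converge in $L^p$ norm to a function $f^*$ such that $f^* \circ T = f^*$. If the system is ergodic, then $f^* = \int f \, d\mu$.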

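One of the consequences of this theorem is the decomposition of the space $L^2(\mu)$ into the sum of two subspaces: $\{f \in L^2(\mu) : f \circ T = f\}$, and the closure of the set of functions of the form $g - g \circ T$. Functions of this form are called coboundaries. Thus we have
$$L^2(\mu) = \{f : f \circ T = f\} \oplus \overline{\{g - g \circ T : g \in L^2(\mu)\}}$$
(see Yosida-Kakutani* or [3, §1.1] for more details).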

Next we discuss the result by Birkhoff and Khinchin. Birkhoff originally obtained this result for continuous flows on manifolds. Later Khinchin extended that result to hold for abstract measure-preserving dynamical systems.

Theorem 1.4 (Birkhoff-Khinchin pointwise ergodic theorem).

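Let $(X, \mathcal{F}, \mu, T)$ be a measure-preserving system, with $\mu(X) = 1$. Let $f$ denote a function in $L^p(\mu)$ for some $p$, $1 \le p < \infty$. Then the averages $\frac{1}{N}\sum_{n=1}^{N} f(T^n x)$ converge a.e. to $f^*(x)$, where the function $f^*$ satisfies $f^* \circ T = f^*$. If the system is ergodic, then $f^* = \int f \, d\mu$.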

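One classical way to prove Theorem 1.4 is by using the following maximal inequality: for each $f \in L^1(\mu)$ and for each $\lambda > 0$ we have
$$\mu\left\{x : \sup_{N} \left|\frac{1}{N}\sum_{n=1}^{N} f(T^n x)\right| > \lambda\right\} \le \frac{1}{\lambda}\|f\|_1.$$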

By the Banach principle, this inequality shows that the set of functions for which pointwise convergence holds is closed in $L^1(\mu)$. This reduces the study of the pointwise convergence to a dense set which is given by
$$\{f + (g - g \circ T) : f \circ T = f,\ g \in L^\infty(\mu)\}.$$

Perhaps some readers may have noticed that Birkhoff-Khinchin’s pointwise result implies that of von Neumann (on a finite measure space, a.e. convergence and boundedness imply norm convergence by the Lebesgue Dominated Convergence Theorem). While that is indeed the case, it should be noted that von Neumann studied these averages from an operator theory perspective. We should also point out some history regarding Birkhoff and von Neumann, which is nicely summarized in the short note by C. Moore [17]:

According to Birkhoff and Koopman, von Neumann communicated his result personally to both of them on October 22, 1931, and pointed out to them that his result raised the important question of whether a pointwise result might be valid. Birkhoff then went to work and, by different methods, quickly established his pointwise ergodic theorem. He submitted his paper to PNAS on December 1, 1931, for appearance in the December 1931 issue. One presumes that he sent copies to Koopman and von Neumann, who would have noticed that Birkhoff had not given von Neumann adequate credit and recognition for his result. von Neumann evidently planned to include his ergodic theorem and its proof in a much longer paper he was writing for the Annals of Mathematics, but he then apparently quickly drafted a short paper for PNAS with his proof of the mean ergodic theorem and submitted it to PNAS on December 10, 1931. It appeared in the January 1932 issue. One suspects that these events led Koopman and Birkhoff to write and publish their paper in PNAS 2 months later, which set matters straight and clearly acknowledged von Neumann’s priority. It should also be noted that E. Hopf presented a slightly different proof of the mean ergodic theorem and some improvements on the Birkhoff theorem in a paper, which appeared in the January 1932 issue of PNAS. For whatever reason, the Birkhoff paper and its result has over time become the better known of the two papers, but in light of these historical details, the von Neumann paper deserves at least equal billing.

The averages $\frac{1}{N}\sum_{n=1}^{N} f(T^n x)$ are called “time averages” and $\int f \, d\mu$ is called the space average. Another statement of the pointwise ergodic theorem for ergodic systems is that for almost any point $x$, the limit of the time averages is the space average.
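As a numerical illustration of time averages against space averages, the sketch below computes how often the orbit of a rotation on $[0, 1)$ visits an interval. It is only a floating-point experiment under assumptions that are not in the article: the rotation number is the golden-ratio conjugate, the set is $[0, 1/4)$, and the function name `visit_frequency` is hypothetical.

```python
import math


def visit_frequency(alpha, x0, a, b, n_steps):
    """Fraction of n in 1..n_steps for which (x0 + n*alpha) mod 1 lies in [a, b)."""
    hits = 0
    for n in range(1, n_steps + 1):
        x = (x0 + n * alpha) % 1.0   # n-th point of the orbit of x0 under the rotation
        if a <= x < b:
            hits += 1
    return hits / n_steps


if __name__ == "__main__":
    golden = (math.sqrt(5) - 1) / 2   # an irrational rotation number (assumed for the demo)
    # Irrational rotation: the time average should be close to the length of [0, 1/4).
    print(visit_frequency(golden, 0.1, 0.0, 0.25, 100_000))
    # Rational rotation by 1/2: the orbit of 0.1 only alternates between two points.
    print(visit_frequency(0.5, 0.1, 0.0, 0.25, 100_000))
```

For the irrational rotation the frequency approaches the measure of the interval, while for rotation by $1/2$ the orbit of $0.1$ is periodic and the frequency depends on the starting point, which is consistent with the failure of ergodicity for rational rotations.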

Given a set $A$ of positive measure, the sequence $\mathbb{1}_A(T^n x)$ is composed of zeros and ones. By the pointwise ergodic theorem the averages $\frac{1}{N}\sum_{n=1}^{N} \mathbb{1}_A(T^n x)$ converge a.e. to $\mu(A)$ if $T$ is ergodic.

Each time $T^n x \in A$ we have $\mathbb{1}_A(T^n x) = 1$. We introduce the following definition that we will use later.

Definition 1.5.

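Let $(X, \mathcal{F}, \mu, T)$ be a measure-preserving system. Given $A \in \mathcal{F}$ with $\mu(A) > 0$, we denote by $r_k(x)$ the $k$th visit time of $x$ to the set $A$ (see Figure 3). More precisely, we define $r_1(x) = \inf\{n \ge 1 : T^n x \in A\}$ and, for $k \ge 2$, $r_k(x) = \inf\{n > r_{k-1}(x) : T^n x \in A\}$.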

Due to the Poincaré Recurrence Theorem, we know that $r_k(x)$ is defined for every $k$ for $\mu$-a.e. $x \in A$. If $T$ is ergodic, the return times can be defined on $\mu$-a.e. $x \in X$.
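The visit times can be computed directly for a concrete system. The sketch below assumes a rotation on $[0, 1)$ and an interval $[a, b)$ as the set being visited; the function name `visit_times` and the `max_steps` safeguard are illustrative choices, not part of the article.

```python
import math


def visit_times(alpha, x0, a, b, count, max_steps=10_000_000):
    """First `count` times n >= 1 at which (x0 + n*alpha) mod 1 lies in [a, b).

    Raises RuntimeError if fewer than `count` visits occur within `max_steps`
    steps (for instance when the orbit never enters the set).
    """
    times = []
    n = 0
    while len(times) < count:
        n += 1
        if n > max_steps:
            raise RuntimeError("not enough visits within max_steps")
        if a <= (x0 + n * alpha) % 1.0 < b:
            times.append(n)
    return times


if __name__ == "__main__":
    golden = (math.sqrt(5) - 1) / 2   # assumed irrational rotation number
    times = visit_times(golden, 0.1, 0.0, 0.25, 2000)
    print(times[:5])
    # k / (k-th visit time) is the visit frequency up to that time.
    print(len(times) / times[-1])
```

Since exactly $k$ visits have occurred by the $k$th visit time, the ratio of $k$ to that time is a time average of the indicator of the set, so for an ergodic system it should approach the measure of the set.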

Figure 3.

Visit times to the set $A$.

2. Wiener-Wintner Ergodic Theorem

We would like to discuss an important strengthening of the Birkhoff-Khinchin Pointwise Ergodic Theorem, which was announced by N. Wiener and A. Wintner*. This result will be central to our discussion.

Figure 4.

N. Wiener (left) and A. Wintner (right).

Theorem 2.1 (Wiener-Wintner Ergodic Theorem).

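Let $(X, \mathcal{F}, \mu, T)$ be a probability measure-preserving system, and let $f \in L^1(\mu)$. Then there exists a set $X_f \subset X$ with $\mu(X_f) = 1$, such that for every $x \in X_f$, the averages
$$\frac{1}{N}\sum_{n=1}^{N} f(T^n x)\, e^{2\pi i n t}$$
converge for every $t \in \mathbb{R}$.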

Though the original proof had a gap, fortunately the gap has been filled, and now there are at least three different proofs of this theorem [3].

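We note that if the value of $t$ is fixed, one can show that the convergence of the averages above is an immediate corollary of the Birkhoff-Khinchin Theorem. Indeed, let $(X, \mathcal{F}, \mu, T)$ be any measure-preserving system, and let $(\mathbb{T}, m, R_t)$ be the rotation system on the $1$-torus by $t$, i.e., $R_t(y) = y + t \pmod 1$, $y \in \mathbb{T}$. Let $f \in L^1(\mu)$, and let $g$ be such that $g(y) = e^{2\pi i y}$. Then
$$\frac{1}{N}\sum_{n=1}^{N} f(T^n x)\, g(R_t^n y) = e^{2\pi i y}\, \frac{1}{N}\sum_{n=1}^{N} f(T^n x)\, e^{2\pi i n t},$$
and these averages converge by applying the Birkhoff-Khinchin Theorem on the product space $X \times \mathbb{T}$.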

The novelty of the Wiener-Wintner result, however, is that the set of full measure $X_f$ is independent of the value of $t$. Since $\mathbb{R}$ is uncountable, one cannot simply apply one of the standard tricks of measure and integration where we intersect countably many sets of full measure to show that the result holds.
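To make the Wiener-Wintner averages concrete, the following sketch evaluates them for one explicit case: a rotation by $\alpha$ on $[0, 1)$ and the function $f(y) = e^{2\pi i y}$. Both the choice of system and the name `ww_average` are assumptions made for illustration. In this case the average is a geometric sum, so its behavior in $t$ can be seen directly.

```python
import cmath
import math


def ww_average(alpha, x, t, n_terms):
    """(1/N) * sum_{n=1}^{N} f(x + n*alpha) * exp(2*pi*i*n*t), with f(y) = exp(2*pi*i*y).

    Only the fractional part of the phase matters, so the rotation's
    reduction mod 1 is left implicit.
    """
    total = 0j
    for n in range(1, n_terms + 1):
        phase = x + n * alpha + n * t
        total += cmath.exp(2j * math.pi * phase)
    return total / n_terms


if __name__ == "__main__":
    alpha = (math.sqrt(5) - 1) / 2   # assumed irrational rotation number
    x = 0.2
    # At t = -alpha the oscillation of f along the orbit is cancelled exactly.
    print(ww_average(alpha, x, -alpha, 10_000))
    # For another value of t the terms keep rotating and the average is small.
    print(abs(ww_average(alpha, x, 0.3, 10_000)))
```

In this example the averages tend to $f(x)$ when $t = -\alpha$ modulo 1 and to $0$ for every other $t$, for every starting point $x$. The content of the theorem is that, in general, a single set of good points $x$ works for all $t$ at once.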

It appears that the Wiener-Wintner Theorem is of interest to some applied mathematicians as well, as communicated to the author by I. Mezić. See a paper, for instance, by M. Budišić, R. Mohr, and I. Mezić*.

More can be said about the Wiener-Wintner Ergodic Theorem if we assume the system to be ergodic. In particular, it becomes easier to identify the limit of Wiener-Wintner averages.

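For an ergodic measure-preserving transformation $T$, we say that $f \in L^2(\mu)$ is an eigenfunction for $T$ if $f \circ T = \lambda f$ $\mu$-a.e. for some $\lambda \in \mathbb{C}$ and $f \neq 0$. The Kronecker factor $\mathcal{K}$ is the sub-$\sigma$-algebra of $\mathcal{F}$ generated by the eigenfunctions of $T$ (i.e., the smallest $\sigma$-algebra that makes all of the eigenfunctions of $T$ measurable). In this situation, $L^2(X, \mathcal{K}, \mu)$ is the closed linear span of eigenfunctions of $T$. We note that if $\lambda$ is an eigenvalue of $T$, then $|\lambda| = 1$ since $T$ is measure preserving.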

For example, let $\mathbb{T}^2$ be a $2$-torus, and let $\alpha$ be an irrational number. Let $S : \mathbb{T}^2 \to \mathbb{T}^2$ be a map such that $S(x, y) = (x + \alpha, x + y)$. The system $(\mathbb{T}^2, m^2, S)$ is called the skew-product system (where $m^2$ is the normalized Lebesgue measure). We consider functions of the form $f_{k,l}(x, y) = e^{2\pi i (kx + ly)}$, where $k, l \in \mathbb{Z}$. We note that $f_{k,l} \circ S = f_{k,l}$ if and only if $k = l = 0$, or in other words $f_{k,l}$ is constant. This fact can be used to show that the system $(\mathbb{T}^2, m^2, S)$ is an ergodic system.² Furthermore, if $f_{k,l}$ is an eigenfunction of $S$, i.e., $f_{k,l} \circ S = \lambda f_{k,l}$ for some $\lambda \in \mathbb{C}$, then we must have $l = 0$. This implies that $\mathcal{K}$ is generated by the set of functions that only depend on the first coordinate.

² A formal proof can be derived using the Fourier expansion of a function in $L^2(\mathbb{T}^2)$. Furthermore, on a topological system like the skew-product one, ergodicity can be demonstrated by showing every $S$-invariant continuous function is a.e. constant.

We are now ready to state the uniform version of Theorem 2.1. The uniform version was announced in the work of Bourgain on the Double Recurrence Theorem [6], where it was applied in the proof of that theorem (we will discuss this later). A detailed proof can be found in, for instance, [3, Theorem 2.4].

Theorem 2.2 (Uniform Wiener-Wintner Ergodic Theorem).

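Let $(X, \mathcal{F}, \mu, T)$ be an ergodic measure-preserving system, $\mathcal{K}$ the Kronecker factor, and suppose that $f \in L^2(\mu)$ is orthogonal to $L^2(X, \mathcal{K}, \mu)$. Then for $\mu$-a.e. $x \in X$, we have
$$\lim_{N \to \infty} \sup_{t \in \mathbb{R}} \left|\frac{1}{N}\sum_{n=1}^{N} f(T^n x)\, e^{2\pi i n t}\right| = 0.$$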

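To illustrate this theorem, we present the following example which shows convergence of averages that are fairly difficult to prove directly. Recall the skew-product example above, given by $S(x, y) = (x + \alpha, x + y)$ on the $2$-torus with $\alpha$ irrational. By the definition of $S$, we have
$$S^n(x, y) = \left(x + n\alpha,\ y + nx + \binom{n}{2}\alpha\right).$$

We also know from the discussion above that the orthogonal complement of the closed linear span of the eigenfunctions is spanned by the set $\{e^{2\pi i (kx + ly)} : k, l \in \mathbb{Z},\ l \neq 0\}$. The Uniform Wiener-Wintner Theorem asserts that for a.e. $(x, y)$, we have
$$\lim_{N \to \infty} \sup_{t \in \mathbb{R}} \left|\frac{1}{N}\sum_{n=1}^{N} e^{2\pi i \left(k(x + n\alpha) + l\left(y + nx + \binom{n}{2}\alpha\right)\right)}\, e^{2\pi i n t}\right| = 0,$$

or equivalently, after removing the factor $e^{2\pi i (kx + ly)}$ of modulus one and absorbing the terms linear in $n$ into $t$,
$$\lim_{N \to \infty} \sup_{t \in \mathbb{R}} \left|\frac{1}{N}\sum_{n=1}^{N} e^{2\pi i \left(l\binom{n}{2}\alpha + nt\right)}\right| = 0.$$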

The Uniform Wiener-Wintner Theorem tells us about the limiting behavior of these averages. As a consequence of this result, one can show that if $(X, \mathcal{F}, \mu, T)$ is a separable ergodic system and $f \in L^2(\mu)$, the limit of the Wiener-Wintner averages is a “projection” of $f$ onto the eigenspace of $e^{-2\pi i t}$ if $e^{-2\pi i t}$ is one of the eigenvalues of $T$ (the averages converge to $0$ otherwise).
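The uniform statement can be probed numerically for a skew-product map of the form $(x, y) \mapsto (x + \alpha, x + y)$ on the 2-torus and the function $e^{2\pi i y}$, for which the Wiener-Wintner averages reduce, up to a factor of modulus one and a shift in $t$, to quadratic exponential sums. The sketch below is an experiment under stated assumptions rather than a proof: the supremum over $t$ is replaced by a maximum over a finite grid, $\alpha$ is taken to be the golden-ratio conjugate, and `sup_over_grid` is a hypothetical helper name.

```python
import cmath
import math


def sup_over_grid(alpha, n_terms, grid_size):
    """Max over t = j/grid_size of |(1/N) * sum_{n=1}^{N} exp(2*pi*i*(C(n,2)*alpha + n*t))|.

    C(n,2) = n*(n-1)/2 is the coefficient of alpha in the second coordinate
    of the n-th iterate of the skew product.
    """
    # Precompute the quadratic phases, which do not depend on t.
    quad = [
        cmath.exp(2j * math.pi * (((n * (n - 1)) // 2 * alpha) % 1.0))
        for n in range(1, n_terms + 1)
    ]
    best = 0.0
    for j in range(grid_size):
        t = j / grid_size
        total = 0j
        for n, q in enumerate(quad, start=1):
            total += q * cmath.exp(2j * math.pi * ((n * t) % 1.0))
        best = max(best, abs(total) / n_terms)
    return best


if __name__ == "__main__":
    alpha = (math.sqrt(5) - 1) / 2   # assumed irrational number
    for n_terms in (25, 100, 400):
        print(n_terms, sup_over_grid(alpha, n_terms, 2 * n_terms))
```

The grid maximum only bounds the true supremum from below, so the printed values are indicative only. A Parseval argument on the grid $t = j/N$ shows the maximum is at least $1/\sqrt{N}$, so any decay one observes cannot be faster than that rate.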

We remark however that the Uniform Wiener-Wintner Theorem is false without the assumption of ergodicity. Let be the same skew-product system that we saw earlier. Then is a measure-preserving system that is not ergodic (for instance, is invariant under , even though it is not a constant function). The map can be explicitly written as

It is known that the closed linear span of eigenfunctions for is given by the closed linear span of products of the characters . In other words, it is given by the functions depending only on and .

Consider the functions , where . First, note that is orthogonal to the closed linear span³ of the eigenfunctions of . But the Wiener-Wintner averages for this function are

³ While we do not discuss the notion of factor here, it may be worth mentioning that if the system is not ergodic, then the closed linear span of eigenfunctions does not form a factor. This is why we are not calling this the Kronecker factor.

Thus,

(the supremum is obtained when ), which shows that

so the uniformity result does not hold.

In the next few sections, we will see some of the applications of the Uniform Wiener-Wintner Theorem in other ergodic averages.

3. The Double Recurrence Theorem

Let $(X, \mathcal{F}, \mu, T)$ be a measure-preserving system. Consider a set $A$ with positive measure. The sequence $\mathbb{1}_A(T^n x)\,\mathbb{1}_A(T^{2n} x)$ takes the value 1 if $T^n x$ and $T^{2n} x$ both belong to $A$ (see Figure 5). One can observe that the gap between $n$ and $2n$ goes to infinity as $n$ itself goes to infinity, which adds another level of difficulty.

Figure 5.

Double recurrence to the set $A$: $T^n x$ and $T^{2n} x$ are in $A$.

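The Furstenberg averages for the pair of functions $(\mathbb{1}_A, \mathbb{1}_A)$ are equal to
$$\frac{1}{N}\sum_{n=1}^{N} \mathbb{1}_A(T^n x)\, \mathbb{1}_A(T^{2n} x).$$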
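Double-recurrence averages of the form $\frac{1}{N}\sum_{n=1}^{N} \mathbb{1}_A(T^n x)\,\mathbb{1}_A(T^{2n} x)$ are easy to experiment with. The sketch below assumes, purely for illustration, a rotation by the golden-ratio conjugate on $[0, 1)$ and the set $A = [0, 1/4)$; the function name `double_recurrence_average` is hypothetical.

```python
import math


def double_recurrence_average(alpha, x, a, b, n_terms):
    """(1/N) * #{1 <= n <= N : both x + n*alpha and x + 2n*alpha (mod 1) lie in [a, b)}."""
    hits = 0
    for n in range(1, n_terms + 1):
        first = (x + n * alpha) % 1.0        # position at time n
        second = (x + 2 * n * alpha) % 1.0   # position at time 2n
        if a <= first < b and a <= second < b:
            hits += 1
    return hits / n_terms


if __name__ == "__main__":
    alpha = (math.sqrt(5) - 1) / 2   # assumed irrational rotation number
    print(double_recurrence_average(alpha, 0.0, 0.0, 0.25, 100_000))
```

For this rotation with starting point $x = 0$, both conditions hold exactly when $n\alpha \pmod 1$ lies in $[0, 1/8)$, so by equidistribution the averages approach $1/8$. This differs from $\mu(A)^2 = 1/16$, a reminder that the limit of double-recurrence averages need not be a product of space averages, even for an ergodic system.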

More generally, for functions $f_1, \dots, f_k$ that are in $L^\infty(\mu)$, we have the following definition.

Definition 3.1.

The Furstenberg averages of the functions