Rudolph's Two-Step Coding Theorem and Alpern's Lemma for R^d Actions

Rudolph showed that the orbits of any measurable, measure preserving $\mathbb R^d$ action can be measurably tiled by $2^d$ rectangles and asked if this number of tiles is optimal for $d>1$. In this paper, using a tiling of $\mathbb R^d$ by notched cubes, we show that $d+1$ tiles suffice. Furthermore, using a detailed analysis of the set of invariant measures on tilings of $\mathbb R^2$ by two rectangles, we show that while for $\mathbb R^2$ actions with completely positive entropy this bound is optimal, there exist mixing $\mathbb R^2$ actions whose orbits can be tiled by 2 tiles.


1. Introduction
The work in this paper completes a project started by the third author in collaboration with Daniel Rudolph, shortly before his death. The project started with a question posed by Rudolph about representations of measurable $\mathbb R^d$ actions. The classical representation theorem for continuous group actions is the Ambrose-Kakutani Theorem [2], which states that every free, probability measure preserving action of $\mathbb R$ on a Lebesgue space is measurably isomorphic to a flow built under a function. In [10] Rudolph proved his Two Step Coding Theorem, showing that given any irrational $\alpha > 0$, there is in fact a representation where the ceiling function takes only the two values $1$ and $1 + \alpha$. Rudolph proves his result by showing that the flow can be factored onto the translation action of $\mathbb R$ on the space of tilings of $\mathbb R$ by two tiles. Since a constant ceiling function results in a flow with a non-ergodic time, it is clear that two is the smallest number of tiles for which such a general statement can be true.
In [11] and [12] Rudolph generalized his one dimensional result to show that every free, measure preserving $\mathbb R^d$ action can be factored onto the translation action of $\mathbb R^d$ on a space of tilings of $\mathbb R^d$ by $2^d$ rectangles. His result shows more: in particular the base points of the tiles form a section of the $\mathbb R^d$ action and the constraints on the tilings give rise to a natural $\mathbb Z^2$ action on the return times to the base points of the tiles. Thus Rudolph's theorem gives a representation of the $\mathbb R^d$ action as a suspension flow.
The first author was partially supported by NSF grant 1200971 and the second author was partially supported by NSERC.
The Rohlin Lemma can be viewed as the discrete analog of the Ambrose-Kakutani Theorem. In its standard formulation it states that given a free, measure preserving $\mathbb Z$ action on a Lebesgue probability space $X$, $N \in \mathbb N$, and $\epsilon > 0$, the space $X$ can be measurably decomposed into a tower of height $N$ and a set of measure less than $\epsilon$, usually called the error set. The union of the base of the tower and the error set is the analog of the section in the continuous representation theorems, and the Rohlin Lemma can be restated as an orbit tiling result. Alpern [1] generalized this result to show that in fact the space can be decomposed into any collection of towers provided the lengths of the towers have no non-trivial common divisor. In particular, he established the discrete analog of Rudolph's Two Step Coding Theorem for $\mathbb Z$ actions.
Alpern's result was generalized to $\mathbb Z^d$ actions by Prikhodko [8] and Şahin [15] for rectangular towers, but with no restriction on the number of tiles as a function of dimension. Notably, two tiles suffice in any dimension. Further, the proof in [15] is a generalization of ideas introduced in [12], suggesting that the continuity of the group imposes some restrictions on the number of tiles necessary for a general representation theorem. Motivated by the $\mathbb Z^d$ Alpern Theorem, Rudolph asked whether the number of tiles in his $\mathbb R^d$ result, $d > 1$, was sharp.
In December 2009 Rudolph and Şahin started working on this project with the aim of first exploring the case $d = 2$, and got as far as establishing the first few structural results about tilings of $\mathbb R^2$ with two rectangles (up to Lemma 3.6 in Section 3).
In this paper we continue this work and develop new techniques to analyze the geometric structure of tilings of $\mathbb R^2$ with two rectangles. This enables us to give a complete description of these tilings, as well as the set of probability measures on the space of such tilings invariant under the standard translation action. We show, in particular, that the standard translation action on this space as a topological dynamical system has entropy zero. As a consequence, the analog of Rudolph's statement in two dimensions using two rectangular tiles is false.
In the positive direction we prove that any free, ergodic and measure preserving $\mathbb R^d$ action can be factored onto an $\mathbb R^d$ tiling dynamical system with $d + 1$ rectangular tiles. The approach we use is essentially that introduced by Rudolph in [12]. His construction of the tilings relies on the existence of a periodic tiling by a large rectangle composed of the $2^d$ rectangles of the theorem (referred to as a supertile in our paper) and the ability to move large fundamental domains of the periodic tiling to approximations of arbitrary locations in $\mathbb R^d$. Moving the fundamental domains relies on performing a series of local perturbations in each coordinate direction, and the perturbations in turn require two irrationally related dimensions of the rectangles, thus requiring $2^d$ rectangles. The new ingredient in our work is that we construct a periodic tiling by supertiles which are not rectangular. In this tiling we are able to move fundamental domains by perturbing in only one direction, and this allows us to reduce the number of rectangular tiles required to $d + 1$. In what follows we give more precise statements of our results and provide an outline for the paper.
1.1. Tiling orbits of $\mathbb R^d$ actions. In Section 2 we prove an Alpern Lemma for $\mathbb R^d$ actions by providing a set of tiles with which almost every orbit of any free, ergodic, measure preserving $\mathbb R^d$ action can be tiled. To state the theorem formally we need to establish some notation. Given a collection $\mathcal T = \{\tau_1, \tau_2, \ldots, \tau_k\}$ of $d$-dimensional tiles, a $\mathcal T$-tiling of $\mathbb R^d$ is a covering of $\mathbb R^d$ by translates of tiles from $\mathcal T$ that overlap only on the boundaries of the tiles. The set of all tilings of $\mathbb R^d$ by $\mathcal T$ is denoted by $Y_{\mathcal T}$, and the standard translation action of $\mathbb R^d$ on $Y_{\mathcal T}$ is denoted by $S = \{S^v\}_{v \in \mathbb R^d}$. We say that a dynamical system $(X, \mu, T)$ is $\mathcal T$-tileable if there exists a factor map $\phi$ from $(X, \mu, T)$ onto the translation action $S$ on $Y_{\mathcal T}$. We show:
Theorem 1.1. There exists a set $\mathcal T = \{\tau_1, \ldots, \tau_{d+1}\}$ of $d + 1$ rectangular tiles in $\mathbb R^d$ such that any free, probability measure preserving $\mathbb R^d$-action $(X, \mu, T)$ is $\mathcal T$-tileable.
The collection of tiles, called the basic tiles, that satisfy the conclusion of Theorem 1.1 is obtained by sub-dividing a single tile given by a $d$-dimensional rectangle with a smaller one removed from one corner. We call this the supertile and it tiles $\mathbb R^d$ periodically. The supertile in dimension two is an amalgamation of tiles used by Rudolph in [12]. The periodic tiling using the higher-dimensional supertile was previously constructed by Stein [16] and also studied by Kolountzakis [5]. In these papers, a tiling that is essentially the periodic tiling that we use (up to scaling in one coordinate) is called the "notched cube" tiling, but they do not concern themselves with the decomposition of the notched cube into the smaller tiles. In keeping with the terminology of the earlier references, we also refer to the periodic tiling in this work as the notched cube tiling. Section 2 contains the definitions of the basic tiles, the supertile and the notched cube tiling. In Section 2.2 we study the topological mixing properties of the notched cube tiling and we prove Theorem 1.1 in Section 2.3.
1.2. Structure of 2-tiling measures. In Section 3 we study $\mathcal T$-tilings of $\mathbb R^2$ where $\mathcal T$ consists of two rectangular tiles. The main result of the section is:
Theorem 1.2. Let $\mathcal T = \{\tau_1, \tau_2\}$, where the tiles $\tau_i$ are rectangles whose corresponding dimensions are irrationally related. Then $(Y_{\mathcal T}, \{S^v\}_{v \in \mathbb R^2})$ has zero topological entropy.
Since there are obvious dynamical obstructions to proving Theorem 1.1 if the corresponding dimensions of the tiles are not irrationally related, we obtain as a corollary that three tiles is a sharp result for tileability of $\mathbb R^2$ actions: There exists a measure preserving $\mathbb R^2$ action that is not $\mathcal T$-tileable for any collection $\mathcal T$ of two rectangular tiles.
We conjecture that the analogous result holds for all $d \ge 1$, meaning that there exists a measure preserving $\mathbb R^d$ action that is not $\mathcal T$-tileable for any collection $\mathcal T$ of $d$ rectangular tiles, but our proof does not readily generalize.
The proof of Theorem 1.2 depends on the fact that the geometric structure of tilings of the plane with tiles from such a collection $\mathcal T$ is quite rigid. As a consequence of this rigidity, we show that the set of ergodic, invariant measures on $Y_{\mathcal T}$ can be characterized via a dichotomy: either the measure is supported on tilings with infinitely many bi-infinite shears, or the measure is supported on tilings with a staircase geometric structure, which we call staircase tilings.
Deferring the precise definitions to Section 3, shears are horizontal or vertical boundaries between non-aligned tiles. The analysis of the infinite shear case is straightforward and is completed in Section 3.3. The detailed analysis of the structure of staircase tilings and a complete description of the measures supported on them is the content of Sections 3.4-3.7. In the absence of infinite shears we identify a particular feature of the tilings, namely a pair of families of staircases constructed from terminating shears. These tilings have a $\mathbb Z^2$ lattice type structure that allows us to represent a measure-preserving $\mathbb R^2$ action on a staircase tiling as a suspension over a $\mathbb Z^2$ action.
In Section 3.8, we compute the entropy of invariant measures on $Y_{\mathcal T}$ and use this to show that there exist $\mathbb R^2$ systems that are not tileable with only two tiles. On the other hand, in Section 4, we show that some classes of measure preserving actions are tileable with exactly two tiles. Finally, in Section 5 we show that for any $\mathcal T$ consisting of a pair of rectangles whose corresponding dimensions are incommensurable, there exists a mixing $\mathbb R^2$ measure preserving system that is $\mathcal T$-tileable.
We are grateful to Valery Ryzhikov for bringing his work in [13] and [14] to our attention. In [13] he proves that there is no "epsilon-free" Rohlin Lemma for $\mathbb Z^2$-actions of completely positive entropy by proving that the factor of the $\mathbb Z^2$ action obtained from the partition consisting of an epsilon-free tower and its error set has to have zero entropy. The proof of this claim is strikingly similar to our argument in Section 3.8. It is also interesting to note that in spite of the similarities in their proofs, neither result can be obtained from the other.
Note that the unit cube can be tiled by the basic tiles. In particular, if the $i$th coordinate of a point $x \in [0, 1)^d$ is the first coordinate in the range $[0, 1 - \alpha)$, then $x$ lies in $\tau_i$. Otherwise it lies in the unique translate of $\tau_{d+1}$ that occurs in this tiling. See Figure 1 for a picture of the four tiles used in three dimensions tiling the unit cube in $\mathbb R^3$. The supertile consists of two stacked copies of the unit cube, with a copy of $\tau_{d+1}$ removed from a corner (see Figure 2).
Definition 2.2. For $0 < \alpha < 1$ and associated $\mathcal T_\alpha$, the supertile $\tau^*_\alpha$ for $\mathcal T_\alpha$ is defined to be the union of two stacked copies of the unit cube with a copy of $\tau_{d+1}$ removed from a corner, as in Figure 2. We let $\mathcal T^*_\alpha = \mathcal T_\alpha \cup \{\tau^*_\alpha\}$ denote the union of the basic tiles and the supertile.
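The rule for assigning points of the unit cube to basic tiles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `tile_index`, `region_volume` and `total_volume` are hypothetical, and the volumes are read off from the rule itself (the first $i-1$ coordinates lie in $[1-\alpha, 1)$, the $i$th in $[0, 1-\alpha)$, the rest are free), not from the formal definition of the tiles.

```python
def tile_index(x, alpha):
    """Index of the basic tile containing x in [0,1)^d, following the rule in the text.

    Returns i (1-based) if coordinate i is the first one in [0, 1 - alpha),
    and d + 1 if no coordinate lies in that range.
    """
    for i, coord in enumerate(x, start=1):
        if 0.0 <= coord < 1.0 - alpha:
            return i
    return len(x) + 1


def region_volume(i, d, alpha):
    """Volume of the part of [0,1)^d assigned to index i by the rule above.

    Assumption of this sketch: the region for i <= d is
    [1-alpha,1)^(i-1) x [0,1-alpha) x [0,1)^(d-i), and the region for
    d + 1 is the cube [1-alpha,1)^d.
    """
    if i == d + 1:
        return alpha ** d
    return alpha ** (i - 1) * (1.0 - alpha)


def total_volume(d, alpha):
    """Sum of the region volumes; should equal the volume of the unit cube."""
    return sum(region_volume(i, d, alpha) for i in range(1, d + 2))


if __name__ == "__main__":
    d, alpha = 3, 0.3
    for point in [(0.1, 0.9, 0.9), (0.8, 0.2, 0.9), (0.8, 0.9, 0.95)]:
        print(point, "-> tile", tile_index(point, alpha))
    print("total volume:", total_volume(d, alpha))
```

The volume sum telescopes, $(1-\alpha)(1 + \alpha + \cdots + \alpha^{d-1}) + \alpha^d = 1$, which is a quick sanity check that the $d + 1$ regions account for the whole unit cube.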
Notation. In what follows we fix an $\alpha \in (0, 1)$, its associated sets of tiles $\mathcal T_\alpha$ and $\mathcal T^*_\alpha$, and the supertile $\tau^*_\alpha$. For notational convenience we suppress the subscripts $\alpha$ for the remainder of the section and simply write $\mathcal T$, $\mathcal T^*$ and $\tau^*$.
Since the supertile can be decomposed into basic tiles, we immediately have:
Lemma 2.3. Given any element $y^* \in Y_{\mathcal T^*}$, there is a corresponding tiling $y \in Y_{\mathcal T}$ obtained by decomposing each translate of the supertile into its basic tile components.
We are left with showing that $Y_{\mathcal T^*}$ is not empty.
Proposition 2.4. Let $A$ be the matrix defined in (1). Then $\tau^*$ is a fundamental domain for the translation action of $A\mathbb Z^d$ on $\mathbb R^d$.
Before proving the proposition, we give a characterization of the fundamental domain of a translation action on $\mathbb R^d$:
Proof. Set $Q = \mathbb R^d / A\mathbb Z^d$. Consider the projection map $\pi \colon \mathbb R^d \to Q$ given by $\pi(x) = x + A\mathbb Z^d$. Since $A[0, 1)^d$ is a fundamental domain for the action of $A\mathbb Z^d$ on $\mathbb R^d$ by translation and has volume $\det A$, the natural measure $m$ induced on the quotient has total mass $\det A$. By assumption, $\pi$ is a bijection between $B$ and $\pi(B)$, meaning that $\operatorname{vol}(B) = m(\pi(B)) \le \det A$.
If equality holds, then setting $N$ to be the null set $\pi^{-1}(Q \setminus \pi(B)) \cap A[0, 1)^d$, we have that $B \cup N$ is a fundamental domain for the action of $A\mathbb Z^d$ on $\mathbb R^d$ by translation.
We use this to complete the proof of the proposition:
Proof of Proposition 2.4. Expanding across the bottom row and using induction, one can check that $\det A = 2 - \alpha^d$. By Lemma 2.5, it suffices to show that $(\tau^* + Av) \cap \tau^* = \emptyset$ for $v \in \mathbb Z^d \setminus \{0\}$. By symmetry, it suffices to establish this for $v$ satisfying $v_d \ge 0$.
We prove the disjointness case by case: By symmetry it suffices to assume $v_i > 0$. and so $(Av)_i \ge 1$.
Since the $i$th coordinates of points of $\tau^*$ lie in $[0, 1)$, it follows that $\tau^*$ and $\tau^* + Av$ are disjoint.
is at least 1 and we are done as in Case 1.
We can assume that $i$ is the greatest index up to $d - 1$ for which $v_i < 0$. Then as above we have $(Av)_{i+1} \ge 1$ and the disjointness follows.
Then $\tau^*$ is contained in the union of the half spaces i<d {x :
Translating the statement of Proposition 2.4 into tilings, we have:
Corollary 2.6. There is a periodic tiling of $\mathbb R^d$ by the supertile $\tau^*$.
Let $\mathcal N$ denote the periodic tiling of $\mathbb R^d$ by the supertile given by Corollary 2.6. We call $\mathcal N$ the notched cube tiling.
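As a consistency check on Proposition 2.4, the volume of the supertile can be compared with $\det A$, since by Lemma 2.5 a fundamental domain must have volume exactly $\det A$. The sketch below assumes, as the tiling rule for the unit cube suggests, that $\tau_{d+1}$ is a cube of side $\alpha$; it is not a substitute for the case analysis in the proof.

```latex
\begin{align*}
\operatorname{vol}(\tau_{d+1}) &= \alpha^d, \\
\operatorname{vol}(\tau^*) &= 2\,\operatorname{vol}\bigl([0,1)^d\bigr) - \operatorname{vol}(\tau_{d+1})
  = 2 - \alpha^d = \det A .
\end{align*}
```

So the volume condition of Lemma 2.5 holds with equality, and what remains is exactly the disjointness of the translates $\tau^* + Av$.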

2.2. $\mathcal N$ is a uniform filling set. As before, we fix $0 < \alpha < 1$ and the associated tiles and tilings. In the proof of Theorem 1.1 the tiling $\mathcal N$ plays the role of a uniform filling set as introduced in [9]. In particular, we use $\mathcal N$ as a canvas, using a sequence of Rohlin towers to inductively tile larger and larger pieces of orbits. The filling property of $\mathcal N$ is used to achieve agreement between tilings from different stages and thus is key to ensuring that the sequence of tilings converges. The filling property established below in Proposition 2.10 differs from the one in [9] because we are working with a continuous group action.
The filling properties of $\mathcal N$ are a consequence of the way faces of the translates of the supertile intersect. We call the faces of $\tau^*$ with extremal $e_d$ coordinates the top and bottom faces, where $e_d$ denotes the $d$th standard basis element of $\mathbb R^d$.
Lemma 2.7. (Tile Intersection Property) For any $0 < \alpha < 1$, let $\tau^*$ be the supertile for the set of tiles $\mathcal T$. Then for any $v \in \mathbb Z^d$, if the top boundary of the translate $\tau^* + Av$ intersects the bottom boundary of any other translate of $\tau^*$, then this intersection exactly agrees with the top boundary of a basic tile of $\tau^* + Av$.
Proof. It suffices to consider the case that $v = 0$. For $1 \le j \le d$, let
Then by direct calculation, we have
The top of $\tau^*$ is
Notice that by (2), for $j < d$, $\tau^* + Au_j$ has base
Intersecting these with the top of $\tau^*$, we obtain
which is exactly the top of τ j + (
The base of
Hence we have found a collection of $d$ translates of the supertile at the origin, whose bases intersect the supertile at the origin exactly in the tops of all the $d$ basic tiles forming the top of the supertile. Thus the $\tau^* + Au_j$ are the only translates of the supertile whose bottoms can intersect the top of $\tau^*$.
and denote $R_{0,M} = R_M$.
We use the Tile Intersection Property to shift any grid patch in $\mathcal N$ down one unit and tile $\mathbb R^d$ while only changing $\mathcal N$ slightly:
Proposition 2.10. For any $M \in \mathbb N$ and $w \in A\mathbb Z^d$, there exists a tiling $y \in Y_{\mathcal T^*}$ such that
Proof. Fix $M$ and $w$. It suffices to show that if the patches $(R_{1,M} + w)^c$ and $R_M + w - e_d$ are both grid tiled, then the region $(R_{1,M} + w) \setminus (R_M + w - e_d)$ can be tiled using tiles of $\mathcal T^*$. By the Tile Intersection Property (Lemma 2.7) we know that if $\tau^* + Av$ lies in the set $(R_{1,M} + w) \setminus (R_M + w)$ and $\tau^* + Av$ intersects $R_M + w - e_d$, then the part of $\tau^* + Av$ that is covered by more than one tile consists of a complete union of basic tiles. The bottom of the untiled region can thus be tiled by decomposing these $\tau^* + Av$ in $\mathcal N$ into their basic tile components and removing the basic tiles that lie in the intersection. The top of the untiled region can be decomposed into $d$-dimensional rectangles with their $d$th dimension 1 and the remaining dimensions corresponding to tops of basic tiles $\tau_1, \ldots, \tau_d$. The tiling $y$ is completed by tiling each such rectangle either by $\tau_1, \ldots, \tau_{d-1}$ or by the pair $\tau_d, \tau_{d+1}$ stacked on top of one another.
By restricting our choice of $\alpha$, the ability to move a grid patch down in the $e_d$ direction suffices to guarantee that there is a tiling which sees a grid patch arbitrarily close to any desired location:
Corollary 2.11. ($\mathcal N$ is a uniform filling set) Suppose $\alpha$ is an irrational which is not algebraic of degree $d$ or lower. Then given $\epsilon > 0$, there exists $K \in \mathbb N$ such that for any $v \in \mathbb R^d$ and any $M$, there exists $u \in \mathbb R^d$ with $\|v - u\| < \epsilon$ and a tiling $y \in Y_{\mathcal T^*}$ such that
Proof. Fix $\epsilon > 0$. The condition on $\alpha$ guarantees that $n \cdot A^{-1}e_d$ is irrational for all non-zero $n \in \mathbb Z^d$, and so multiples of $A^{-1}e_d$ are dense in $\mathbb R^d/\mathbb Z^d$. By compactness of $\tau^*$, there exists $K > 0$ depending only on $\epsilon$ such that any vector in $\mathbb R^d$ can be approximated within $\epsilon$ (mod $A\mathbb Z^d$) by $-ke_d$ for some $k$ with $0 \le k \le K$.
Choosing such $k$ with $0 \le k \le K$ and $-ke_d$ approximating $v$ within $\epsilon$ (mod $A\mathbb Z^d$), we have that there exists $w \in A\mathbb Z^d$ satisfying $\|v - u\| < \epsilon$, where $u = w - ke_d$. Applying Proposition 2.10 inductively, there exists a sequence of tilings $(y^{(j)})_{1 \le j \le k}$ such that
and
In particular, $y = y^{(k)}$ satisfies y[ u ]. An immediate consequence of the Tile Intersection Property (Lemma 2.7) is that for any $j, M \in \mathbb N$, ,M ], as required.
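The density step in the proof of Corollary 2.11 has a simple one-dimensional analog that can be checked numerically. The sketch below is only an illustration: it replaces the torus $\mathbb R^d / A\mathbb Z^d$ and the direction $e_d$ by the circle $\mathbb R/\mathbb Z$ and a single irrational rotation number `theta` (a hypothetical stand-in chosen here), and the function names are not from the paper. It searches for a bound $K$, depending only on $\epsilon$, such that every point of the circle lies within $\epsilon$ of some $k\theta$ with $0 \le k \le K$.

```python
import math


def max_gap(theta, K):
    """Largest gap on the circle R/Z between consecutive points k*theta mod 1, 0 <= k <= K."""
    points = sorted((k * theta) % 1.0 for k in range(K + 1))
    gaps = [b - a for a, b in zip(points, points[1:])]
    gaps.append(1.0 - points[-1] + points[0])  # gap that wraps around 0
    return max(gaps)


def sufficient_K(theta, eps, limit=1 << 20):
    """Return some K (found by doubling) with max_gap(theta, K) <= 2 * eps.

    When the largest gap is at most 2 * eps, every point of the circle is
    within eps of one of the points k*theta mod 1 with 0 <= k <= K.
    """
    K = 1
    while K <= limit:
        if max_gap(theta, K) <= 2 * eps:
            return K
        K *= 2
    raise ValueError("no suitable K found below the search limit")


if __name__ == "__main__":
    theta = math.sqrt(2) - 1  # hypothetical irrational rotation number
    for eps in (0.1, 0.01, 0.001):
        print("eps =", eps, "-> K =", sufficient_K(theta, eps))
```

The point mirrored from the proof is that `sufficient_K` depends on `eps` alone and not on the target point, which is the role played by $K$ in the corollary.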
To prove Theorem 1.1, we need two generalizations of Corollary 2.11. First, we shift patches that are not perfect grid patches but are surrounded by a sufficiently large layer of supertiles. Further, we shift two patches independently provided they and their supertile collars do not intersect.
Definition 2.12. Let $y \in Y_{\mathcal T^*}$ and $K, M \in \mathbb N$. We say the patch $y[R_M]$ has a grid collar of size $K$ if $y[R_{K,M} \setminus R_M]$ is grid tiled.
Corollary 2.13. Suppose $\alpha$ is an irrational which is not algebraic of degree $d$ or lower. For all $\epsilon > 0$ there exists $K \in \mathbb N$ such that for any
and a tiling $y \in Y_{\mathcal T^*}$ with
Proof. Let $w_i \in A\mathbb Z^d$ be such that $v_i \in \tau^* + w_i$. By (4) we can apply the proof of Corollary 2.11 independently to the patches
2.3. Proof of Theorem 1.1. The strategy that we follow in this section is a minor variation of strategies that have appeared in works of many authors, notably Rudolph [11], [12] (in which rectangular tilings of $\mathbb R^d$ appeared) and Robinson and Şahin [9] (in which uniform filling sets were introduced). The tiling is constructed using a sequence of Rohlin towers. For ease of notation we abbreviate an
Definition 2.14. Let $(X, \mu, T)$ be a measurable, free, and measure preserving $\mathbb R^d$ action on a Lebesgue probability space.
is called the error set, and $\mu(E)$ is called the error of the tower. For $B' \subset B$, the set $\bigcup_{v \in R} T^v(B')$ is called the slice of the tower based at $B'$.
Lind extended the classical Rohlin Lemma to $\mathbb R^d$ actions [6], showing that given a $d$-dimensional rectangle of any size in $\mathbb R^d$, any $\epsilon > 0$, and any free, measure preserving $\mathbb R^d$ action, there is a Rohlin tower of that shape with error $\epsilon$. The following result can be obtained from Lind's $\mathbb R^d$ Rohlin Lemma, or from Ornstein and Weiss's generalization of the Rohlin Lemma to amenable group actions [7]:
Lemma 2.15. Let $(X, \mu, T)$ be a measurable, free, and measure preserving $\mathbb R^d$ action on a Lebesgue probability space. Let $A$ be the matrix defined in (1) and $R_M \subset \mathbb Z^d$ be as defined in (3). Then for all $\epsilon > 0$ and $M \in \mathbb N$, there exists a Rohlin tower for $T$ of shape $R_M$ with error at most $\epsilon$.
Proof of Theorem 1.1. Fix an $\alpha$ satisfying the hypotheses of Corollary 2.11 and a measure preserving system $(X, \mu, T)$ that satisfies the hypotheses of the theorem. We prove that $T$ is $\mathcal T^*_\alpha$-tileable. The result then follows from Lemma 2.3.
We give the proof by inductively constructing a sequence of Rohlin towers and a sequence of measurable maps $\Phi_n$ mapping elements of the tower to elements of $Y_{\mathcal T^*_\alpha}$. Let $K_n$ be an increasing sequence obtained by applying Corollary 2.13 with $\epsilon = 2^{-n}$ for each $n$. Let $M_n$ be an increasing sequence with $M_n \to \infty$ and $\epsilon_n$ a decreasing sequence with $\epsilon_n \to 0$, chosen such that
Construct a sequence of Rohlin towers with base sets $B_n$ of shape $R_{2K_n, M_n}$ and error $\epsilon_n$. Then by (6)
,Mn denotes those elements $v \in R_{M_n}$ such that an entire ball whose diameter equals that of $R_{2K_{n-1}, M_{n-1}}$ lies in $R_{M_n}$. If (7) holds, then by the easy direction of the Borel-Cantelli Lemma we have that
(8) $\mu$-a.e. $x$ belongs to $\bigcup_{v \in J_n} T^v B_n$ for all but finitely many $n$.
The maps $\Phi_n$ are constructed such that if $x \in B_n$ then the following conditions hold:
and finally if the slice of the Rohlin tower based at $x$ lies in $C_{n+1}$, the core of the $(n + 1)$ stage tower, then both $\Phi_n$ and $\Phi_{n+1}$ tile this slice and we require that these tilings agree up to a small translation. More formally, suppose $x = T^u y$ for some $y \in B_{n+1}$ and $u \in J_{n+1}$. Then
Assuming that these properties are established, we finish the proof by observing that by (8), (9), and (11), for $\mu$-almost every $x$ the sequence $\Phi_n(x)$ converges to a tiling $\Phi(x) \in Y_{\mathcal T^*_\alpha}$. By (10), $\Phi$ is a factor map from $T$ onto the translation action $S$ on $Y_{\mathcal T^*_\alpha}$. It therefore suffices to show we can build a sequence of maps satisfying these conditions. For $x \in B_1$ we set $\Phi_1(x) = \mathcal N$ and we extend the definition to (9) and (10) with $n = 1$. Now suppose that a sequence of maps satisfying (9), (10), and (11) has been defined up to stage $n$ and fix
Since these are slices of stage $n$ towers, the regions R 2Kn+1,Mn $+ v_i$ satisfy the conditions of Corollary 2.13 with $y_i = \Phi_n(x)[R_{M_n}]$. The tiling $\Phi_{n+1}(x)$ is then given by Corollary 2.13 and (5) yields that (11) is satisfied at this stage.
3. The structure of 2-tilings
3.1. Invariant measures on tilings of $\mathbb R^2$: basic observations. In this section, we consider tilings of $\mathbb R^2$ by rectangles. Let $\mathcal T$ be a finite collection of rectangular tiles and for $A \in \mathcal T$ let $w_A$ denote the width of $A$ and $h_A$ denote its height. Once $\mathcal T$ is fixed, we omit it from the notation and refer to $Y_{\mathcal T}$ as $Y$. By an invariant measure on $Y$, we mean a measure invariant under the standard action of $\mathbb R^2$ on $Y$ by translation.
We start with some general properties of invariant measures on $Y_{\mathcal T}$, and then specialize to the two tile case where the widths and heights of the two tiles are rationally independent.
This lemma is essentially the Poincaré recurrence theorem. As usual, we let $e_1, \ldots, e_d$ denote the standard basis elements of $\mathbb R^d$.
Proof. It suffices to show that $\mu$-almost every element of $Y$ contains no semi-infinite horizontal shear that is infinite to the right (the other directions being established identically). Let $r < \min_{A \in \mathcal T} h_A$ and let $E$ be the set of tilings such that a semi-infinite right-pointing horizontal shear has its left endpoint in $[0, 1) \times [0, r)$.
The sets $S^{ne_1}E$ are disjoint: if a tiling lay in two of these sets, it would have distinct semi-infinite rightward shears (necessarily at different heights) with a vertical separation less than $r$, which is impossible since all tiles have height exceeding $r$. Since the sets have the same measure, they must all be of measure 0. We conclude the proof by taking a union of countably many translates of $E$.
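The passage from "disjoint sets of equal measure" to "measure 0" uses only invariance and finiteness of the measure. Written out, under the assumption (made explicit in Lemma 3.3 but taken here as a hypothesis of this sketch) that $\mu$ is an invariant probability measure, and with $N$ an arbitrary positive integer introduced for illustration:

```latex
\[
1 \;\ge\; \mu\Bigl(\,\bigcup_{n=0}^{N-1} S^{n e_1}E\Bigr)
  \;=\; \sum_{n=0}^{N-1} \mu\bigl(S^{n e_1}E\bigr)
  \;=\; N\,\mu(E)
  \qquad \text{for every } N \ge 1,
\]
so that $\mu(E) \le 1/N$ for all $N$, and hence $\mu(E) = 0$.
```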
We say that an invariant measure $\mu$ on $Y$ is ergodic in direction $v \in \mathbb R^2$ if $\mu$ is ergodic with respect to the action of $(S^{tv} : t \in \mathbb R)$. A measure that is ergodic in the coordinate directions $e_1$ and $e_2$ has no infinite shears:
Lemma 3.3. Let $\mu$ be an invariant probability measure on $Y$. If $\mu$ is ergodic in the two coordinate directions, then $\mu$-almost every element of $Y$ contains no infinite shears.
Proof. We have already ruled out semi-infinite shears so it remains to rule out bi-infinite shears. By symmetry, it suffices to exclude bi-infinite horizontal shears. Let $r < \min_{A \in \mathcal T} h_A$. The subset $W$ of $Y$ consisting of points such that a bi-infinite horizontal shear enters $[0, r)^2$ is disjoint from $S^{re_2}W$, but has the same measure, which must therefore be less than 1. Since $W$ is invariant under the horizontal action, it follows by ergodicity in the direction $e_1$ that $W$ has measure 0. Taking countably many translates shows the set of points with bi-infinite horizontal shears has measure 0.

3.2. Independent heights and widths.
For the remainder of this section, we specialize to the case that $\mathcal T$ consists of two basic tiles, $A$ and $B$, of dimensions $w_A \times h_A$ and $w_B \times h_B$ respectively. We write $Y_{A,B}$ for the collection of tilings using $A$ and $B$, and as before omit the subscripts on $Y$ when the context is clear.
Lemma 3.4. Suppose that $A$ and $B$ are tiles such that $Y_{A,B}$ supports a measure that is ergodic in the two coordinate directions. Then the heights of $A$ and $B$ are incommensurable, as are the widths of the tiles.
Proof. Suppose for a contradiction that the widths of $A$ and $B$ are rationally related, so that $w_A = p\alpha$ and $w_B = q\alpha$ with $p, q \in \mathbb N$. For this proof, we refer to the left and right boundaries of a tile $T$ as the vertical boundary $\partial_v(T)$. Fix a tiling $z \in Y$. We say that two tiles $T$ and $T'$ are adjacent if their vertical boundaries $\partial_v(T)$ and $\partial_v(T')$ intersect. (Notice that this could happen in many ways: the left boundary of $T$ could overlap with the right boundary of $T'$; the lower left corner of $T$ could be the upper left corner of $T'$, etc.) We let $\sim$ be the transitive closure of the adjacency relation. If two tiles are related, then the $x$-coordinates of their left edges differ by an integer multiple of $\alpha$. For a tile $T$, let $\pi_2(T)$ be its projection onto the $y$-axis.
We claim that if $\operatorname{int}(\pi_2(T)) \cap \operatorname{int}(\pi_2(T')) \ne \emptyset$, then $T \sim T'$. To see this, suppose $\operatorname{int}(\pi_2(T)) \cap \operatorname{int}(\pi_2(T'))$ is non-empty. The intersection is then an open (hence uncountable) set. Let $W$ denote the (countable) set of $y$ coordinates of tops and bottoms of tiles. Thus there exists $y_0 \in \bigl(\operatorname{int}(\pi_2(T)) \cap \operatorname{int}(\pi_2(T'))\bigr) \setminus W$. Hence consecutive tiles on the line $y = y_0$ between $T$ and $T'$ are adjacent, so that $T$ and $T'$ lie in the same equivalence class as required.
Fix a tile $T_0$ in the tiling $z$ and let $J$ be the convex hull of $\bigcup_{T \sim T_0} \operatorname{int}(\pi_2(T))$. Then $J$ is an open interval (possibly all of $\mathbb R$). We claim that the equivalence class $[T_0]$ of $T_0$ consists precisely of those tiles $T$ such that
Conversely if $T$ is a tile such that $\pi_2(T)$ intersects $J$, then the intersection is uncountable since $J$ is open, and so we can pick $y \in (J \cap \pi_2(T)) \setminus W$. By definition of $J$, there exists $T_1 \in [T_0]$ whose projection includes points below $y$ and $T_2 \in [T_0]$ whose projection includes points above $y$. Since $T_1$ and $T_2$ lie in the same equivalence class, there is a sequence of adjacencies connecting $T_1$ to $T_2$. The sequence of adjacencies must include a tile $T'$ whose projection includes $y$. Notice that since $y \notin W$, $y$ necessarily lies in $\operatorname{int}(\pi_2(T'))$. Now by the previous paragraph, we
If $J$ is all of $\mathbb R$, then all tiles have left boundaries differing by multiples of $\alpha$. If $J$ is a proper sub-interval $(a, b)$, then the full lines $y = a$ and $y = b$ form bi-infinite horizontal shears. Now let $\mu$ be an invariant probability measure on $Y$ that is ergodic in the direction of the coordinate axes. By Lemma 3.3, almost every point has no bi-infinite shears, so that almost every element of $Y$ has a single equivalence class. Then the collection of elements of $Y$ whose left endpoints lie in $\alpha\mathbb Z + [0, \alpha/2)$ is invariant under the vertical action, and has measure $1/2$, contradicting ergodicity in the directions of the coordinate axes.
3.3. Patterns along shears. For the remainder of the section, we specialize to the case that the rectangular basic tiles A and B are incommensurable, meaning that A and B have incommensurable widths and heights.
Definition 3.5. Define the staircase tilings to be the collection of tilings using rectangular basic tiles $A$ and $B$ that contain no semi-infinite and no bi-infinite shears. We denote the collection of staircase tilings by $Y_{A,B}$.
The motivation for this name will be apparent once we have described these tilings.
consisting entirely of tile boundaries and consider the tiles that touch from one side. These tiles can have at most one transition between tile types along the segment: from type $A$ to type $B$, or from type $B$ to type $A$.
Proof. Without loss of generality, we consider the tiles lying above a horizontal line segment. Suppose that an element of a tiling contains a block of $A$'s followed by a block of $B$'s followed by a block of $A$'s, all of whose bottoms lie along the horizontal line segment. Note that while this segment may be a shear, we also include the case that it is not a shear (this is important in Corollary 3.9 and Proposition 3.7). We show inductively that the border between the $A$ and $B$ blocks is forced to propagate vertically to infinity, producing a semi-infinite shear and contradicting $x \in Y_{A,B}$.
By hypothesis, there is at least a single row of $B$'s with $A$'s on either side sitting on the line segment. Assume there are $m$ consecutive $B$'s in the row. Suppose that we have established that there are $n$ consecutive rows of $B$'s one on top of another (the base case $n = 1$ being satisfied by assumption). This is illustrated in Figure 3. Whichever tiles in $x$ are at the left and right ends of the top row of $B$'s must meet the top of the $B$'s in the interior of the left and right boundaries of the tile. To see this, note that the vertical distance between the top of the $n$th row of $B$ tiles and the top of the $A$ tile at the bottom is $nh_B - h_A$. By the incommensurability assumption, this is not a non-negative integer combination of $h_A$ and $h_B$. Thus there is no collection of tiles that can be put on top of the $A$ tile so as to have the same top boundary as the $n$th row of $B$ tiles. The row of tiles that sits on top of this row is then forced to be of width $mw_B$. Using the incommensurability again, the only collection of tiles that can exactly fill this in is $m$ $B$'s, so that one is forced to have $n + 1$ consecutive rows of $B$'s, completing the induction.
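The incommensurability step can be spelled out as a short computation. This is a sketch; the integers $a$ and $b$ are names introduced here for the numbers of $A$ and $B$ tiles in a hypothetical stack placed on top of the $A$ tile.

```latex
Suppose $n h_B - h_A = a\,h_A + b\,h_B$ with integers $a, b \ge 0$. Then
\[
(a + 1)\,h_A = (n - b)\,h_B .
\]
Since $a + 1 \ge 1$ and $h_A, h_B > 0$, we get $n - b > 0$ and
\[
\frac{h_A}{h_B} = \frac{n - b}{a + 1} \in \mathbb{Q},
\]
contradicting the incommensurability of the heights.
```

The same computation with widths in place of heights gives the second use of incommensurability: a row of width $m w_B$ can only be filled exactly by $m$ copies of $B$.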
Proposition 3.7. Let $A$ and $B$ be incommensurable rectangular basic tiles. If $\mu$ is an ergodic invariant measure on $Y_{A,B}$ such that $\mu$-almost every point has bi-infinite horizontal shears, then $\mu$-almost every point consists of complete rows of $A$ tiles and complete rows of $B$ tiles.
Proof. Consider the set of tilings in which a bi-infinite horizontal shear enters $[0, 1)^2$. This set has positive measure; otherwise by taking translates we contradict the hypothesis. Applying the Birkhoff Ergodic Theorem, we deduce that $\mu$-almost every tiling has infinitely many bi-infinite horizontal shears.
By the ABA Lemma (Lemma 3.6), in the row immediately above each horizontal shear, there can be at most one change of tile type.
As in Lemma 3.2, the set of tilings in which there is a point in $[0, 1)^2$ such that all tiles to the left are $A$'s and all tiles to the right are $B$'s has measure 0 by the Poincaré recurrence theorem applied to the horizontal action. Taking a countable union of translates of this set, we see that in $\mu$-almost every tiling, there is no $a \in \mathbb R$ such that the tiles intersecting the line $y = a$ have exactly one type transition.
In particular, we deduce that in the row immediately above each horizontal shear, there is exactly one tile type. The top of this row is another infinite line segment consisting entirely of tile boundaries, so we can apply the ABA Lemma again to this, and inductively deduce that the tiling consists of full rows of A tiles and full rows of B tiles only.
If $\mathcal T = \{A, B\}$ consists of incommensurable rectangular basic tiles, this completes the description of the ergodic invariant probability measures on $Y_{\mathcal T}$ that have infinite shears.
3.4. Shears in staircase tilings. We start with a characterization of the shears:
Proof. Let $x \in Y_{A,B}$ and consider a shear in $x$, which we assume without loss of generality to be horizontal (and finite). By maximality, the continuation of the line segment making up a shear either enters the interior of a tile at the endpoint, or continues as a shared edge (in fact this latter possibility is eliminated below). In particular the top side of the shear is completely filled out by tiles, as is the bottom side.
In particular, shears never cross.
Proof. It suffices to rule out that the line segment making up the shear is extended by shared edges (as occurs at the right of Figure 4). Suppose without loss of generality that the shear is continued by a shared edge between two A's. From the Shear Structure Lemma, we see that one side of the shear has A's followed by B's; while the other side has B's followed by A's. Along the first side of the shear as extended by the shared edges, one sees an A followed by B's followed by more A's. This contradicts the ABA Lemma.

3.5. Block structures and basic units in staircase tilings.
We show that the staircases divide a tiling into rectangular patches of perfectly aligned A tiles and perfectly aligned B tiles with shears surrounding them:  Proof. Let x ∈ Y A,B and consider a shear in x, which we assume without loss of generality to be horizontal (and finite), and further suppose that the tiles lying above it consist of m A tiles followed by some sequence of B's. We consider the collection of A tiles accessible from these tiles by crossing only shared edges and claim they form a rectangular patch. Applying symmetric versions of the argument completes the proof of the lemma. The interface between the last A tile and the first B tile forms part of a vertical shear which ends where it meets the horizontal shear. Since we already know the bottom tile on the left of this shear is an A, applying the Shear Structure Lemma, we see that the left side of the vertical shear consists of some number n of A's followed by some B's. Where the A's transition to B's along the vertical shear is the rightmost endpoint of a new horizontal shear going leftwards (see Figure 5 for an illustration).
Since this shear is exactly nh A above the lower shear, the space between the shears can only be filled with n vertically stacked A tiles. It follows that the transition to B's along the underside of the upper shear can only take place after at least m A tiles have been placed.
By Corollary 3.9, we see that a vertical shear goes through the left endpoint of the lower horizontal shear. Since a shear cannot be continued by a shared edge, this shear is forced to continue until it meets the upper shear. Thus the m × n block of A's is surrounded by shears on all four sides, as required.
As a consequence of the Rectangular Blocks Proposition, staircase tilings are extremely rigid. An example of a piece of tiling is illustrated in Figure 6.
We call a configuration of a rectangular block of A's and a rectangular block of B's sitting on top of a horizontal shear a basic unit. These come in two types: AB basic units in which the A block sits to the We show that for tilings in Y A,B , the tiling is completely covered by basic units, disjointly up to their boundary points, and each basic unit is of the same type. Furthermore, we show that the basic units have a lattice structure, so that they may be naturally indexed by Z 2 . This is illustrated in Figure 7. Let x ∈ Y A,B and consider a basic unit, which we assume to be of AB type. Notice that the upper left corner of the B block necessarily lies on a vertical shear, and at the left endpoint of a right-pointing horizontal shear. Since on the underside of this horizontal shear, there is a B block to the left, it must be the case that above it there is an A block to the left. Thus the upper left corner of the B block is the lower left corner of another AB basic unit and vice versa. Similarly the lower right corner of an AB basic unit is the upper right corner of the A block of another AB basic unit and vice versa.
Given a pair of basic units (of AB type) such that the upper left corner of the B block of the first is the lower left corner of the second basic unit, we say that the second is the northeast neighbor of the first and conversely, the first is the southwest neighbor of the second. Similarly given a pair of basic units such that the upper right corner of the A block of the first one is the lower right corner of the second basic unit, the second is said to be the northwest neighbor of the first; and the first is the southeast neighbor of the second.
We record some basic facts in a lemma. Proof. The neighbors are distinct, as the vectors from the lower left corner of a basic unit to the lower left corners of its northeast, northwest, southwest, and southeast neighbors lie in the interiors of the first, second, third and fourth quadrants respectively.
The fact that the northeast neighbor of the southwest neighbor of a basic unit is the basic unit itself follows from the definition, and similarly in the other directions.
The equality of A block heights between northeast-southwest neighbors follows from the Shear Structure Lemma, since they lie along the same vertical shear. Similar arguments establish the equality of B block widths along northeast-southwest neighbors and of A block widths and B block heights along the northwest-southeast neighbors.
3.6. Basic unit lattice structure. We claim that the basic units are arranged in a lattice that can be indexed in a natural way. We start by showing that for each basic unit, the identification of the neighboring units has been done in such a way that the directions commute: This gives us a natural labeling of basic units by Z 2 . Labeling an arbitrary tile (below, we choose the tile containing the origin) with (0, 0), if a tile is labeled (i, j), we label its northeast neighbor by (i + 1, j), its northwest neighbor by (i, j + 1), its southwest neighbor by (i − 1, j) and its southeast neighbor by (i, j − 1). By Lemmas 3.12 and 3.11, this labeling is consistent. The Rectangular Block Proposition implies that two basic units are equal or have disjoint interiors. We shall show that each basic unit is uniquely labeled, and that the basic units cover the plane. Lemma 3.13. Let x ∈ Y A,B be as above. No basic unit is assigned more than one label by the scheme described above.
Proof. Given two distinct labels (i, j) and (i′, j′), without loss of generality assume that i′ ≥ i. If j′ ≥ j, then the translation vector from the bottom left corner of the (i, j) basic unit to that of the (i′, j) basic unit lies strictly in the first quadrant (or is zero when i′ = i), while the translation vector from the bottom left corner of the (i′, j) basic unit to that of the (i′, j′) basic unit lies strictly in the second quadrant (or is zero when j′ = j). Since the labels are distinct, at least one of these vectors is nonzero, so their sum (that is, the translation vector from the (i, j) basic unit to the (i′, j′) basic unit) lies strictly in the upper half plane. It follows that the basic units are distinct. Similarly if j′ ≤ j, the translation vector lies strictly in the right half plane, so that in either case, we see that distinctly labeled basic units are distinct.
We can now define the staircases that give these tilings their name. For x ∈ Y A,B , let the basic units be labeled as above. We define the jth northeast staircase to be the union of the bottoms of A tiles and left sides of B tiles in basic units labeled (i, j) for some i. We also let the jth expanded northeast staircase be the union of all of the basic units labeled (i, j) for some i. We make the analogous definitions for northwest staircases: the ith northwest staircase is the union of the left sides of the A tiles in basic units labeled (i, j) for some j and tops of the B tiles in basic units labeled (i − 1, j) for some j. The ith expanded northwest staircase is the union of the A tiles in basic units labeled (i, j) for some j and the B tiles in basic units labeled (i − 1, j) for some j. We call such a labeling the standard labeling of a staircase. This is illustrated in Figure 8.
We summarize the basic properties of staircases: Lemma 3.14. Let A and B be incommensurable rectangular tiles and x ∈ Y A,B . A staircase is uniquely determined by the bottom left corners of the basic units lying on it. The heights of A blocks and the widths of B blocks are constant along any northeast staircase, as are the widths of A blocks and heights of B blocks along any northwest staircase.
In the standard labelings of the staircases in x, the (j + 1)st northeast staircase is a rigid translation of the jth staircase through (−α, β), where α is the width of the B tiles in the (j + 1)st staircase and β is the height of the A tiles in the jth staircase. The (i + 1)st northwest staircase is a rigid translation of the ith staircase through (γ, δ) where γ is the width of the A tiles in the ith northwest staircase and δ is the height of the B tiles in the (i + 1)st northwest staircase.
Figure 8. A patch of a staircase tiling with two northeast and one northwest staircase indicated.
Proof. The first statement is clear. The second statement follows from Lemma 3.11. For the last statement we deal with the northeast staircases, the northwest staircases being similar. The translation vector from the bottom left corner of the (i, j) basic unit to the bottom left corner of the (i, j + 1) basic unit is (−α, β), where α is the width of the B block in the (i, j + 1) basic unit and β is the height of the A block in the (i, j) basic unit. By the second statement, this translation vector is conserved under passing to a northeast or southwest neighbor, so that by the first statement, the (j + 1)st northeast staircase is a rigid translation of the jth by (−α, β).
We use this to finish the description of the lattice structure of basic units: Proof. The lattice structure of basic units was described above in Lemmas 3.11, 3.12 and 3.13. We are left with showing that the expanded northeast staircases fill out the entire plane. Let (−kw B , h A ) denote the translation vector described in Lemma 3.14, connecting the jth northeast staircase and (j + 1)st. The area between the jth northeast staircase and its translate through (0, h A ) is the union of the A blocks belonging to the jth expanded staircase. The area between that intermediate translate and its translate through (−kw B , 0) (that is the (j +1)st northeast staircase) is the union of the B tiles belonging to the expanded staircase of the (j + 1)st staircase. Hence the area between the 0th and jth northeast staircases is contained in the union of kth expanded staircases for k running from 0 to j.
For j > 0, let E j denote the jth expanded northeast staircase, let Ē 0 be the 0th expanded northeast staircase together with all points below it, and let $F_j = \bar E_0 \cup \bigcup_{0<k\le j} E_k$.
Since the jth staircase is a translate of the (j − 1)st through a vector of the form (−kw B , h A ), we see that F j includes a min(w B , h A )-neighborhood of F j−1 . It follows that all points above the initial staircase belong to some F j . A similar argument applies below the initial staircase.
With this convention, we have that the (i, j) basic block consists of an a i w A × b j h A A block and a c j w B × d i h B B block.
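Using this notation, we define a map $\Phi : Y_{A,B} \to (\mathbb N\times\mathbb N)^{\mathbb Z} \times (\mathbb N\times\mathbb N)^{\mathbb Z} \times \mathbb R^2$ by
(12) $\Phi(x) = \big((a_i, d_i)_{i\in\mathbb Z},\ (b_j, c_j)_{j\in\mathbb Z},\ y\big)$,
where y is the position of the origin relative to the bottom left corner of the (0, 0) basic unit.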
In fact, we can specify the range of the map more precisely as Lemma 3.17. The map Φ defined in Equation (12) is a bijection between Y A,B and Q.
Proof. To see that Φ is injective, notice that the principal northeast staircase is determined by the sequence (a i , d i ). The translation from the jth northeast staircase to the (j +1)st is (−c j+1 w B , b j h A ). Knowing the jth staircase and the translation vectors to the (j−1)st and (j+1)st northeast staircases determines the dimensions of the A and B blocks (from the formulae above). This means that if two tilings give rise to the same sequences, then they have the same staircases. If they have the same y value in addition, then the origin is at the same point in the (0, 0) tile. To see that Φ is surjective, let r = ((a i , d i ) i∈Z , (b j , c j ) j∈Z , y) ∈ Q. As described, starting from the sequences one can build the staircases and the blocks. They always cover all of R 2 as previously noted. Hence one can always find x ∈ Y A,B such that Φ(x) = r.
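The surjectivity step, building the staircases and blocks from the sequences, can be illustrated numerically. The following is a minimal sketch, not the authors' construction verbatim: the function names, the sample sequences and the tile dimensions are all hypothetical choices made for illustration. It places the (i, j) basic unit using the translation rules stated above (a northeast step moves by (a_i w_A, d_i h_B), a northwest step by (−c_{j+1} w_B, b_j h_A)) and then counts how many basic units contain each of a set of random test points.

```python
import math
import random

def unit_corners(a, d, b, c, wA, hA, wB, hB, R):
    """Bottom-left corners of the basic units labeled (i, j) with |i|, |j| <= R.

    a, d, b, c are functions from the integers to the positive integers.
    The (0, 0) unit is placed with its bottom-left corner at the origin.
    """
    ne = {0: (0.0, 0.0)}  # offsets accumulated along northeast steps (index i)
    for i in range(R):
        x, y = ne[i]
        ne[i + 1] = (x + a(i) * wA, y + d(i) * hB)
    for i in range(0, -R, -1):
        x, y = ne[i]
        ne[i - 1] = (x - a(i - 1) * wA, y - d(i - 1) * hB)
    nw = {0: (0.0, 0.0)}  # offsets accumulated along northwest steps (index j)
    for j in range(R):
        x, y = nw[j]
        nw[j + 1] = (x - c(j + 1) * wB, y + b(j) * hA)
    for j in range(0, -R, -1):
        x, y = nw[j]
        nw[j - 1] = (x + c(j) * wB, y - b(j - 1) * hA)
    return {(i, j): (ne[i][0] + nw[j][0], ne[i][1] + nw[j][1])
            for i in range(-R, R + 1) for j in range(-R, R + 1)}

def cover_count(p, corners, a, d, b, c, wA, hA, wB, hB):
    """Number of basic units (A block plus B block) containing the point p."""
    count = 0
    for (i, j), (x0, y0) in corners.items():
        x, y = p[0] - x0, p[1] - y0
        a_w, a_h = a(i) * wA, b(j) * hA   # A block: a_i w_A by b_j h_A
        b_w, b_h = c(j) * wB, d(i) * hB   # B block: c_j w_B by d_i h_B
        in_A = 0 <= x < a_w and 0 <= y < a_h
        in_B = a_w <= x < a_w + b_w and 0 <= y < b_h
        if in_A or in_B:
            count += 1
    return count

# Hypothetical sample data: periodic sequences with values in {1, 2}.
def seq_a(i): return 1 + (i % 2)
def seq_d(i): return 1 + (i % 3 == 0)
def seq_b(j): return 1 + (j % 2 == 0)
def seq_c(j): return 1 + (j % 3 == 1)

DIMS = (1.0, 1.3, math.sqrt(2), 1.1)  # (wA, hA, wB, hB), chosen for illustration

random.seed(0)
corners = unit_corners(seq_a, seq_d, seq_b, seq_c, *DIMS, R=10)
counts = [cover_count((random.uniform(-2, 2), random.uniform(-2, 2)),
                      corners, seq_a, seq_d, seq_b, seq_c, *DIMS)
          for _ in range(200)]
print(set(counts))
```

If the construction is as described, every test point should lie in exactly one basic unit. The range R = 10 is chosen large enough for the window [−2, 2]² with these particular dimensions; a larger window would need a larger range.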
We use Φ to give a complete description of the invariant measures on Y A,B . Let Z = (N × N) Z × (N × N) Z , so that Q as defined in (13) becomes $Q = \{(z, y) \in Z \times \mathbb R^2 : y \in \tau_{0,0}(z)\}$, where τ 0,0 is defined as in (14). Write Z = W 1 × W 2 , where each W i is a copy of (N × N) Z . Let S 1 (w, w′) = (σ(w), w′) and S 2 (w, w′) = (w, σ(w′)) where σ is the left shift. These maps generate a natural Z 2 action on Z. Since Φ is a bijection between Y A,B and Q, the R 2 action on Y A,B induces an R 2 action on Q, preserving the measure $\nu = \Phi_*\mu$. For x ∈ Y A,B and v ∈ R 2 , S v x has the same basic units, but relabeled by adding a constant (the label given to the previous (0, 0) block). Of course, the new y coordinate is the position of the origin in the new (0, 0) block after the tiling is shifted. It follows that for (z, y) ∈ Q, S v (z, y) is of the form $(S_1^i S_2^j z, y')$, where (i, j) is the label of the block in x that contains the point with coordinates v.
Since the R 2 action on Q locally acts as a translation, we deduce that on the R 2 fibers in Q, the conditional measure of ν is just Lebesgue measure restricted to τ 0,0 . We then define maps between measures on Q and measures on Z and claim that they establish (up to normalization) a bijection between the R 2 -invariant measures on Q and certain explicitly-described Z 2 -invariant measures on Z.
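Given an R 2 invariant measure ν on Q, we define a measure Ψ 1 (ν) on Z by
(15) $\Psi_1(\nu)(E) = \int_{(E\times\mathbb R^2)\cap Q} \frac{1}{\operatorname{Area}(\tau_{0,0}(z))}\, d\nu(z, y)$.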
Conversely, given a Z 2 invariant measure λ on Z, we define a measure on Q by specifying it on product sets as The maps Ψ 1 and Ψ 2 are inverses of each other.
Proof. Since the areas are bounded below in the definition (15), it follows that Ψ 1 (ν) is finite. To show that Ψ 1 (ν) is invariant, let C × D be a cylinder set in Z, where C and D are cylinder sets specifying at least the 0th and 1st coordinates of W 1 and W 2 . Maintaining the notation of 3.16, note that for all z ∈ C × D, Area(τ 0,0 (z)) is $A_{00} = a_0 b_0 w_A h_A + c_0 d_0 w_B h_B$, the area of the (0, 0) basic unit in z, while for z ∈ S 1 (C × D), Area(τ 0,0 (z)) is $A_{10} = a_1 b_0 w_A h_A + c_0 d_1 w_B h_B$. Hence we have that Ψ 1 (ν)(C × D) = ν(E)/A 00 and Ψ 1 (ν)(S 1 (C × D)) = ν(E′)/A 10 , where E = C × D × R 2 and E′ = σ(C) × D × R 2 . However, by considering the picture in tiling space, we see that where m is Lebesgue measure on R 2 . Integrating over ν, exchanging the order of integration, dividing by N 2 and taking the limit, we see that ν(E)/A 00 = ν(E′)/A 10 and thus Ψ 1 (ν) is S 1 -invariant. A similar argument establishes S 2 -invariance.
To show that Ψ 2 (λ) is invariant, let C × D ⊂ Q where C is a cylinder set and let v ∈ R 2 . By decomposing C further into cylinder sets if necessary, we may assume that C determines the staircases out to a radius of at least ‖v‖ + 2(w A + w B + h A + h B ). Let z ∈ C and let y = Φ −1 (z, 0). Let g 1 , g 2 , . . . , g n be an enumeration of the basic units in the tiling z that are determined by the cylinder set C. Let u 1 , . . . , u n be the sequence of their bottom left vertices and let (i 1 , j 1 ), . . . , (i n , j n ) be the labels of the basic units as given in Proposition 3.15.
, allowing us to deduce the invariance of Ψ 2 (λ) from the shift-invariance of ν and the translation invariance of Lebesgue measure. Now suppose that ν is ergodic and that E is a Z 2 -invariant subset of Z. Then (E × R 2 ) ∩ Q is an R 2 -invariant subset of Q. It follows that (E × R 2 ) ∩ Q has zero measure or full measure with respect to ν. It follows from the definition (15) of Ψ 1 that Ψ 1 (ν)(E) is also of zero measure or full measure and so Ψ 1 (ν) is ergodic.
Similarly, suppose λ is ergodic and let F be an R 2 -invariant subset of Q. Since F is R 2 -invariant, we deduce that F = {(z, y) ∈ Q : z ∈ E} for a subset E of Z. By invariance of F , it follows that E is Z 2 -invariant, and hence has full measure or zero measure. The same applies to F .
We now identify the ergodic Z 2 -invariant measures on Z. Proof. Let λ 1 and λ 2 be ergodic shift-invariant measures on W 1 and W 2 . Then λ = λ 1 ⊗ λ 2 , defined on cylinder sets by λ(C × D) = λ 1 (C)λ 2 (D), is clearly invariant under each of the S i for i = 1, 2. If A is an invariant set under the Z 2 -action, then letting A w = {v ∈ W 2 : (w, v) ∈ A}, we see that A w is an S 2 -invariant set, and hence of λ 2 -measure 0 or 1 for λ 1 -almost every w. Letting B = {w : λ 2 (A w ) = 1}, this is an S 1 -invariant set and hence of λ 1 -measure 0 or 1. Thus λ(A) is 0 or 1 and so λ is ergodic.
Conversely, let λ be an ergodic Z 2 -invariant measure on Z. Let λ 1 and λ 2 be the (necessarily invariant and ergodic) marginals of λ. Let C and D be cylinder sets in W 1 and W 2 . Then for λ-almost every (w, w ) ∈ Z, we have and so λ = λ 1 ⊗ λ 2 .
Maintaining the notation of 3.16, this leads to a precise criterion for an ergodic invariant probability measure on Z corresponding to a finite invariant probability measure on Q:
Lemma 3.20. Let λ = λ 1 ⊗ λ 2 be an ergodic Z 2 -invariant probability measure on Z. Then Ψ 2 (λ) is a finite measure if and only if a 0 (w) and d 0 (w) are λ 1 -integrable and b 0 (w′) and c 0 (w′) are λ 2 -integrable.
Proof. A necessary and sufficient condition for Ψ 2 (λ) to be a finite measure is $\int \operatorname{Area}(\tau_{0,0}(z))\, d\lambda(z) < \infty$. Since $\operatorname{Area}(\tau_{0,0}(w, w')) = a_0(w)\, b_0(w')\, w_A h_A + c_0(w')\, d_0(w)\, w_B h_B$ and λ is the product measure λ 1 ⊗ λ 2 , we see that $\int \operatorname{Area}(\tau_{0,0})\, d\lambda = w_A h_A \int a_0\, d\lambda_1 \int b_0\, d\lambda_2 + w_B h_B \int d_0\, d\lambda_1 \int c_0\, d\lambda_2$. Hence, the λ 1 -integrability of a 0 and d 0 and the λ 2 -integrability of b 0 and c 0 is a necessary and sufficient condition for finiteness of Ψ 2 (λ).
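The finiteness criterion rests on the fact that, under a product measure, the expected area of the (0, 0) basic unit splits into products of one-dimensional expectations. A minimal sketch with two small, made-up finite distributions (the names `DIST1`, `DIST2` and all numerical values are hypothetical) compares the direct sum over the product measure with the factored form:

```python
from itertools import product

def expected_area_direct(dist1, dist2, wA, hA, wB, hB):
    """Sum the area a0*b0*wA*hA + c0*d0*wB*hB over the product measure.

    dist1 maps (a0, d0) to a probability; dist2 maps (b0, c0) to a probability.
    """
    total = 0.0
    for ((a0, d0), p1), ((b0, c0), p2) in product(dist1.items(), dist2.items()):
        total += p1 * p2 * (a0 * b0 * wA * hA + c0 * d0 * wB * hB)
    return total

def expected_area_factored(dist1, dist2, wA, hA, wB, hB):
    """Same quantity written as products of one-dimensional expectations."""
    mean_a = sum(p * a0 for (a0, _), p in dist1.items())
    mean_d = sum(p * d0 for (_, d0), p in dist1.items())
    mean_b = sum(p * b0 for (b0, _), p in dist2.items())
    mean_c = sum(p * c0 for (_, c0), p in dist2.items())
    return wA * hA * mean_a * mean_b + wB * hB * mean_d * mean_c

# Hypothetical distributions of (a0, d0) and (b0, c0).
DIST1 = {(1, 1): 0.5, (2, 1): 0.3, (1, 3): 0.2}
DIST2 = {(1, 2): 0.6, (4, 1): 0.4}

direct = expected_area_direct(DIST1, DIST2, 1.0, 1.3, 1.4, 1.1)
factored = expected_area_factored(DIST1, DIST2, 1.0, 1.3, 1.4, 1.1)
print(direct, factored)
```

Since all four coordinate functions take values at least 1, each product of means is finite exactly when both of its factors are, which is how the integrability conditions arise.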
Combining Proposition 3.18 and Lemmas 3.19 and 3.20, we have shown the structure for staircase measures:
3.8. Entropy of invariant measures. We maintain the notation of Section 3.7, and let H(P) denote the entropy of a partition P and h S (λ) denote the measure theoretic entropy of S relative to the measure λ. Recall that Z = (N × N) Z × (N × N) Z and that τ 0,0 is defined as in (14).
Proof. We prove that h S 1 (λ 1 ) < ∞, the other estimate being identical. Consider the countable partitions of W 1 , P 1 and P 2 according to the values taken by a 0 and d 0 . Clearly P 1 ∨ P 2 is a generating partition for S 1 . We have h S 1 (λ 1 ) ≤ h S 1 (λ 1 , P 1 ) + h S 1 (λ 1 , P 2 ) and so it suffices to show these terms are finite. We consider h S 1 (λ 1 , P 1 ), the other case being similar.
Write $C_n = a_0^{-1}\{n\}$. By Lemma 3.20, we have that $\sum_n n\lambda_1(C_n) < \infty$. Then where we used for the fourth line the fact that −x log x is an increasing function on the interval [0, 1/e].
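One standard way to obtain a bound of this kind is sketched below. It is offered as an illustration under the stated hypotheses and is not claimed to reproduce the chain of inequalities in the original display. Write $p_n = \lambda_1(C_n)$ (a notation introduced here) and split the entropy sum according to whether $p_n \ge e^{-n}$. In the first case $-\log p_n \le n$; in the second case $p_n < e^{-n} \le 1/e$, so monotonicity of $-x\log x$ on $[0, 1/e]$ gives $-p_n\log p_n \le n e^{-n}$.

```latex
\begin{align*}
h_{S_1}(\lambda_1, P_1) \le H(P_1)
  &= -\sum_{n\ge 1} p_n \log p_n \\
  &= -\sum_{n:\, p_n \ge e^{-n}} p_n \log p_n
     \;-\; \sum_{n:\, p_n < e^{-n}} p_n \log p_n \\
  &\le \sum_{n\ge 1} n\, p_n \;+\; \sum_{n\ge 1} n\, e^{-n} \;<\; \infty .
\end{align*}
```

The first sum is finite by the integrability of $a_0$, and the second is a convergent series independent of the measure.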
The following is Theorem 1.2 of the introduction stated in the notation of this section. Proof. Using the variational principle, it suffices to show that all of the ergodic invariant measures on Y A,B have zero measure-theoretic entropy. This entropy is, by definition, the measure-theoretic entropy of the Z 2 -subaction (S v ) v∈Z 2 on Y . We make the simplifying assumption that all of w A , h A , w B and h B are greater than 1, so that no unit square can contain the bottom left corner of two tiles. (If this were not satisfied, we would instead compute the entropy with respect to a finer subaction (Z/N ) 2 for an appropriate N .) We define a refining sequence of partitions on Y A,B , namely $P^{(n)} = \{A^{(n)}_{i,j},\, B^{(n)}_{i,j} : 0 \le i, j < 2^n\} \cup \{C^{(n)}\}$, where $A^{(n)}_{i,j}$ is the set of tilings such that the region [i/2 n , (i + 1)/2 n ) × [j/2 n , (j + 1)/2 n ) contains the lower left corner of an A tile, $B^{(n)}_{i,j}$ is the set of tilings such that the region [i/2 n , (i + 1)/2 n ) × [j/2 n , (j + 1)/2 n ) contains the lower left corner of a B tile, and $C^{(n)}$ is the event that [0, 1) 2 does not contain the lower left corner of any tile.
Notice that while P (n) is not a generating partition with respect to the Z 2 action, it is the case that for any two distinct points y, y′ ∈ Y A,B , there exist m ∈ Z 2 and n ∈ N such that S m y and S m y′ lie in different elements of P (n) . Thus the σ-algebra generated by the Z 2 -translates of the partitions P (n) , n ≥ 0, agrees up to sets of measure 0 with the Borel σ-algebra. By standard facts in entropy theory (see [18] or [3]), we obtain that h(µ) = lim n→∞ h(µ, P (n) ) for an R 2 -invariant measure µ. We therefore show that for any n ∈ N and any R 2 -invariant measure µ on Y , h(µ, P (n) ) = 0.
We first deal with the case where µ has infinite shears, restricting to the case that µ-almost every point has infinite horizontal shears, the case with vertical shears being similar. By Proposition 3.7, we have a description of the points on which the measure is supported.
Let N > 0 be chosen. We compute the number of non-empty elements in the partition P (n) Notice that the element of P For any point x ∈ Y , each row consists either entirely of A tiles or entirely of B tiles. Considering the first N rows whose bottoms are above the origin, we obtain a sequence of N A's and B's corresponding to the order in which they occur. There are 2 N such possibilities.
Given such a sequence, we first address possible positions of the bottoms of the rows relative to the 2 −n vertical grid implicit in the partition. Notice that there are at most N rows of tiles that could affect the element of P (n) N in which a point lies. Let y 0 be the y-coordinate of the lowest row of tiles whose bottom edge lies above the x-axis, so that 0 ≤ y 0 < max(h A , h B ). Letting y 0 vary, we have that a row bottom crosses the grid at most 2 n max(h A , h B ) times and this can happen in each of the first N rows of tiles of the grid. This leads to a total of 2 n max(h A , h B )N as an upper bound on the number of possible configurations.
Finally, by a similar argument, in each row whose bottom has a y-coordinate in the range [0, N ), the number of possible configurations where the left edges of the tiles lie relative to a 2 −n grid is at most 2 n max(w A , w B )N .
Multiplying these quantities, we see that where N (Q) denotes the number of non-empty elements of the partition Q. In particular, (for fixed n) the number of non-empty partition elements grows slower than e aN 2 for any a > 0. Thus h(µ, P (n) ) = 0 for any n and hence h(µ) = 0.
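To see why a count of this shape forces zero entropy, one can compare its logarithm with N². The sketch below assumes, purely for illustration, that the product of the three quantities above has the form 2^N · (2^n H N) · (2^n W N)^N, with H = max(h_A, h_B) and W = max(w_A, w_B) and with one horizontal-offset factor per row; the exact displayed bound is not reproduced here, and the function names are hypothetical.

```python
import math

def log_count_bound(N, n, H, W):
    """Logarithm of the assumed bound 2^N * (2^n * H * N) * (2^n * W * N)^N."""
    row_types = N * math.log(2)                    # which rows are A rows or B rows
    vertical = math.log(2 ** n * H * N)            # vertical offset relative to the grid
    horizontal = N * math.log(2 ** n * W * N)      # horizontal offset in each of N rows
    return row_types + vertical + horizontal

def entropy_rate(N, n, H, W):
    """log(bound) / N^2, the quantity whose limit bounds h(mu, P^(n))."""
    return log_count_bound(N, n, H, W) / N ** 2

for N in (10 ** 2, 10 ** 4, 10 ** 6):
    print(N, entropy_rate(N, 3, 2.0, 2.0))
```

Under this assumption the logarithm grows like N log N, so the ratio to N² tends to 0 for every fixed n, consistent with the conclusion h(µ, P (n) ) = 0.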
On the other hand, if the ergodic measure µ is supported on staircase tilings, let λ 1 ⊗ λ 2 be the corresponding measure on W 1 × W 2 . By Theorem 3.21, the measures λ 1 and λ 2 are ergodic. By Lemma 3.22, both of these measures have finite entropy. Let ε > 0 be given. There exists D > 0 such that with probability at least 1 − ε, the (0, 0) basic unit has both dimensions less than D. Let E 1 be the set where this occurs. Let $\vec u_k = \sum_{i=0}^{k-1} (a_i w_A,\, d_i h_B)$. This is the relative position of the (k, 0) basic tile to the (0, 0) basic tile. There exist K > 1, n 0 > 0, and a subset E 2 of W 1 of measure at least 1 − ε such that $\|\vec u_N\| \le KN$ for all N ≥ n 0 . Notice also (by the assumption that w A > 1 and h B > 1) that $(\vec u_N)_i > N$ for i = 1, 2. This defines a segment of the 0th northeast staircase. Since the successive northeast staircases lie at least 1 unit above and below their neighbors, we deduce that the northeast staircases between the (−KN )th and the (KN )th cover [0, N ) 2 . This ensures that $w_0^{N-1}$ and $w'^{\,KN-1}_{-KN}$ determine a patch of the tiling that includes [0, N ) 2 .
Given this, there are at most N 2 tiles in the region, whose boundaries lie on at most N 2 y-coordinates and at most N 2 x-coordinates. The above argument shows that as one moves the origin around τ 0,0 (w, w ), the horizontal boundaries cross the 2 −n gridlines at most 2 n HN 2 times and similarly the vertical boundaries cross the 2 −n gridlines at most 2 n W N 2 times, where W and H are the width and height of the (0, 0) basic unit.
Let h 1 be the entropy of λ 1 and h 2 be the entropy of λ 2 . Then for sufficiently large N , W 1 may be covered up to a set of measure ε by $e^{(h_1+\varepsilon)N}$ length N cylinder sets and W 2 may be covered up to a set of measure ε by $e^{(h_2+\varepsilon)2KN}$ length 2KN cylinder sets. Let E 3 be the union of the cylinder sets in W 1 and E 4 be the union of the cylinder sets in W 2 . Hence The condition on the columns of M ensures that the one dimensional actions (S (t,0) ) t∈R and (S (0,t) ) t∈R are free.
Proof. Let L = M −1 . From the formula for the inverse of a 2×2 matrix, we see that the entries of the first row of L are rationally independent, as are the entries of the second row. We claim that LZ 2 has a pair of generators lying in the first and fourth quadrants. To see this, consider the primitive lattice vectors in LZ 2 lying in a vertical strip [0, t) × R and take the two vectors lying closest to the horizontal axis from above and below.
Let these generators be (a, b) and (c, −d). Then let T consist of the rectangle A with w A = a and h A = d and the rectangle B with w B = c and h B = b. Forming a basic unit by putting the A tile next to the B tile on top of a common horizontal segment, one can check (as in Section 3.6) that the basic units can be arranged to tile R 2 periodically (the displacement vector (a, b) being the translation between consecutive northwest staircases and the displacement vector (c, −d) being the translation between consecutive northeast staircases). Let x ∈ Y A,B be the resulting periodic tiling of the plane. This is illustrated in Figure 9. Notice that To finish the proof, let Φ : T 2 → Y A,B be defined by $u \mapsto S_{Lu}\, x$. The above observation ensures that Φ is well-defined and it is straightforward to see that $\Phi \circ S^{u}_{\mathbb T^2} = S^{u}_{Y} \circ \Phi$, where $S_{\mathbb T^2}$ denotes the action on the
If the matrix M is invertible, but has the property that there is rational dependence between the entries of one of the columns, then it turns out that the action is T -tileable for a set T consisting of a single suitably chosen basic tile. Suppose the first column of M is rationally dependent. Then the second row of L is rationally dependent. There exists a matrix H in SL 2 (Z) such that (LH) 21 = 0. Note that LZ 2 = LHZ 2 . Let $LH = \begin{pmatrix} a & b \\ 0 & c \end{pmatrix}$, so that the lattice has generators (a, 0) and (b, c). We then take T to consist of an a × c rectangle, A, and take x to be the configuration of rows of A's, each row shifted rightwards from the row below by b. The remainder of the previous argument works as before.
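The claim that the basic units tile the plane periodically can be checked numerically. The sketch below is an illustration with hypothetical parameter values and function names. It takes the A tile to be a × d and the B tile to be c × b, so that the area ad + bc of the basic unit equals the covolume of the lattice generated by (a, b) and (c, −d), and counts how many lattice translates of the basic unit contain a given point.

```python
import math
import random

def covering_count(p, a, b, c, d):
    """Count translates of the basic unit by m*(a, b) + n*(c, -d) that contain p.

    The basic unit is the A tile [0, a) x [0, d) together with the
    B tile [a, a + c) x [0, b), sitting on a common horizontal segment.
    """
    area = a * d + b * c
    # Real coordinates (s, t) of p in the lattice basis u = (a, b), v = (c, -d).
    s = (d * p[0] + c * p[1]) / area
    t = (b * p[0] - a * p[1]) / area
    # Search radius: a bound on the basis coordinates of points of the basic unit.
    width, height = a + c, max(b, d)
    radius = math.ceil((d * width + c * height + b * width + a * height) / area) + 1
    count = 0
    for m in range(math.floor(s) - radius, math.floor(s) + radius + 1):
        for n in range(math.floor(t) - radius, math.floor(t) + radius + 1):
            x = p[0] - (m * a + n * c)
            y = p[1] - (m * b - n * d)
            in_A = 0 <= x < a and 0 <= y < d
            in_B = a <= x < a + c and 0 <= y < b
            if in_A or in_B:
                count += 1
    return count

# Hypothetical generators (a, b) and (c, -d) in the first and fourth quadrants.
PARAMS = (1.5, 0.4, 1.1, 0.9)

random.seed(1)
lattice_counts = [covering_count((random.uniform(-5, 5), random.uniform(-5, 5)), *PARAMS)
                  for _ in range(300)]
print(set(lattice_counts))
```

Each random point should be covered exactly once, which is the numerical counterpart of the basic unit being a fundamental domain for the lattice.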
Proposition 4.2. Let (S v ) v∈R 2 be an ergodic measure-preserving action on a space X. Suppose further that there exist linearly independent vectors h j and eigenfunctions f j for j = 1, 2 satisfying f j (S v x) = exp(2πih j · v)f j (x). Then there exists a set T consisting of a pair of rectangular tiles such that (S v ) : X → X is T -tileable.
Proof. By changing variables, we may regard the f j 's as maps from X to T (regarded as an additive group), so that f j (S v x) = f j (x) + h j · v. Letting F (x) = (f 1 (x), f 2 (x)), we have $F(S_v x) = F(x) + Mv \pmod{\mathbb Z^2}$, where M is the matrix with rows h 1 and h 2 . Hence F is a factor map from X onto the action on the torus appearing in Lemma 4.1. Since that lemma produced a factor map onto the translation action on a tiling space with two rectangular tile types, composing the factor maps completes the proof.

5. Existence of mixing systems that are tileable
5.1. Statement of the result. We show: Theorem 5.1. Assume that T consists of two basic tiles A and B whose heights and widths are incommensurable and further satisfying w A = h B and w B = h A . Then there exists a mixing measure preserving system (X, B, µ, The condition w A = h B and w B = h A is an artifact of the proof that allows us to simplify the computations. With significantly more work in the estimates and careful choice of bounds in each direction, this condition could be removed.
We produce such a mixing example by showing that there exists a mixing measure on Y A,B , and the proof of the mixing property reduces to estimating the distance between certain measures on R 2 with respect to suitable metrics. Before turning to the proof of the theorem in Section 5.3, we describe these metrics.
To estimate the Prokhorov distance, we make use of a special case of a theorem of Zaitsev. Recall that given a probability measure η on R, its Fourier transform is given by $\hat\eta(t) = \int e^{ixt}\, d\eta(x)$. Theorem 5.5 (Zaitsev [19]). Let η 1 and η 2 be measures on R and let ∆(t) be the difference between their Fourier transforms. If $\int |x|\, d\eta_1(x)$ and $\int |x|\, d\eta_2(x)$ are finite, then for any M ≥ e, where $W = 2 + \int |x|\, d\eta_1(x) + \int |x|\, d\eta_2(x)$ and c 1 and c 2 are universal constants.
(13)), and furthermore Q is a subset of W 1 × W 2 × R 2 , where each W i is a copy of (N × N) Z . Restricting the measure λ 1 × λ 2 × λ to Q, we obtain a measure ν. We show that this measure ν is mixing.
We choose the natural distance function on the space of tilings defined by declaring that two tilings are ε-close if they agree up to an ε-translation on a set of diameter 1/ε. Let f and g be Lipschitz (with respect to this distance on tilings) local functions on Q with Lipschitz norm 1, where local means that the definitions of f and g only depend on a fixed neighborhood of the origin. Thus, the value of f (or g) is determined by the finite sequences $w_{-K}^{K}$, $w'^{\,K}_{-K}$, for some sufficiently large K, and by the position in the tile. Notice also that for any x ∈ Q, v → f (S v x) is a Lipschitz function on R 2 with Lipschitz norm 1.
Let F denote the smallest σ-algebra on Q with respect to which, (w 1 ) (−(N +K),N +K) c ∪[−K,K] , (w 2 ) (−(N +K),N +K) c ∪[−K,K] and z, the R 2 component, are measurable (taking K and N to be sufficiently large). Thus, We are left with showing that for sufficiently large v, this last integral is close to the product f (x) dν(x) g(x) dν(x).
Let Ω denote the information not captured in F, meaning that For ω = (z, z ) ∈ Ω and x = (w, w , y) ∈ Q, let ωx denote the point (w,w , y), wherew i = z i if K < |i| ≤ N + K and w i otherwise, and similarly forw . We can then rewrite Choose r g > 0 such that if two tilings x and x agree in the ball of radius r g around the origin, then g(x) = g(x ). Notice that the tilings x and ωx agree completely up to translation off the central spine consisting of the northeast and northwest staircases with labels j and i satisfying |i| ≤ N + K and |j| ≤ N + K. The complement of the spine consists of four regions (oriented approximately along the four coordinate directions).
We denote the region below the −(N + K) northeast staircase and above the (N + K) northwest staircase as region 1; above the (N + K) northeast staircase and the (N + K) northwest staircase as region 2; above the (N + K) northeast staircase but below the −(N + K) northwest staircase as region 3; and finally below the −(N + K) northeast staircase and the −(N + K) northwest staircase as region 4. Note that all of these regions depend on choice of the tiling x.
Let x = (w, w , y) be a point in Q, where w = (a i , d i ) i∈Z , w = (b j , c j ) j∈Z . Define  We further define As a consequence, provided that B( v, r g ) lies in the same region (say the ith) for x and ωx, we deduce that g(S v (ωx)) = g(S v− h x) where h = u i (ωx) − u i (x) is the difference in the relative translation vector of the region between x and ωx.
Let Ω 0 be the collection of x = (w, w′, z) such that the number of 1's and 2's in each coordinate of (w, w′) in each of the ranges (K, N + K] and [−(N + K), −K) is within 2| log ε| standard deviations of the mean.
Define E i to be the collection of x ∈ Ω 0 such that v lies in the ith region of the tiling corresponding to x, more than 100N D away from either of the central staircases where D is the maximal diameter of the basic unit.
Notice that provided ‖v‖ > 100N D/ε, v lies more than 100N D away from either of the central staircases with probability 1 − O(ε). By the central limit theorem and standard properties of the normal distribution, x ∈ Ω 0 with probability 1 − O(ε).
For x ∈ E i , we have that B( v, r g ) also lies in the ith region of ωx and g(S v (ωx)) = g(S v−( u i (ωx)− u i (x)) x) for all ω ∈ Ω.
Hence for x ∈ E i , we have where η is the measure on R 2 given by We see that the x-coordinates on which η is supported are just a translation by ( v) 1 by a weighted sum of binomial random variables. The y-coordinates have a similar description and are independent of the x-coordinates.
By Lemma 5.4, it suffices to show that d M T (η, σ) < ε where σ is a measure on R 2 such that for x's belonging to a set of measure close to 1, one has: (18) $\left| \int g(S_v x)\, d\sigma(v) - \int g(x)\, d\mu(x) \right| < \varepsilon$. We do this in two steps, first approximating η by a normal distribution and then approximating the normal distribution by rectangular pieces with constant density. where the ε i are independent random variables taking the values ±1 with probability 1/2 each. Let ζ be the distribution of a normal random variable with mean 0 and variance $N(w_A^2 + w_B^2)/4$. Let N = M 5 , where M is to be chosen later, and note that the quantity W appearing in (17) is O(N ). This guarantees that the first term of (17) is O(log M/M ).
Without loss of generality, assume that $w_A < w_B$. For the second term in (17), $\eta_1(t) = (\cos(w_A t/2)\cos(w_B t/2))^N$ and $\zeta(t) = \exp(-Nt^2(w_A^2 + w_B^2)/8)$. Taking $\Delta(t)$ to be the difference of these quantities and using the Taylor expansion, we have that $\Delta(t)$ and $\Delta'(t)$ are $O(N^{-1/2})$ for $|t| \le 1/w_B$. Thus for sufficiently large $N$, this range gives a negligible contribution to the second term.
For $|t| > 1/w_B$, we have that $\zeta(t)$ and $\zeta'(t)$ are $O(Nte^{-aNt^2})$ for some constant $a > 0$ that is independent of $N$, and so the integrals of their squares contribute a total of $O(e^{-aN/w_A^2})$ over the range $|t| > 1/w_B$.
We are left with controlling $\eta_1(t)$ and its derivative in the range $|t| > 1/w_B$. Since $\eta_1(t) = (\cos(w_A t/2)\cos(w_B t/2))^N$, the values of $t$ giving significant contributions are those for which both $w_A t/(2\pi)$ and $w_B t/(2\pi)$ are close to integers, and these depend on the continued fraction expansion of $w_A/w_B$. Specifically, if $p_n/q_n \approx w_A/w_B$, then around $t = 2\pi p_n/w_A$ both terms are close to $\pm 1$. Letting $M = \pi p_n/w_A$, we see that the closest distance between a peak of $|\cos(w_A t/2)|$ and one of $|\cos(w_B t/2)|$ is $\Omega(1/M^2)$. The contribution when $|\cos(w_A t/2)\cos(w_B t/2)| < 0.9$ is clearly negligible. One can check that the contribution near a peak is $O(N^{1/2}e^{-ah^2N} + N^{3/2}h^2e^{-ah^2N})$, where $h$ is the distance between peaks of the cosine functions. Thus each peak contributes $O(M^{7/2}e^{-aM})$, and since there are $M$ peaks, for sufficiently large $M$ (and thus also $N$), the total contribution is arbitrarily small.
Thus by Theorem 5.5, for $M$ sufficiently large, $\eta_1$ lies close to a normal distribution in the Prokhorov metric, meaning that $d_{Prok}(\eta_1, \zeta)$ can be taken to be arbitrarily small. By Theorem 5.4, the same holds for $d_{BL}(\eta_1, \zeta)$. Since $\eta_2$ has the same distribution, we also have that $d_{BL}(\eta_1 \times \eta_2, \zeta \times \zeta)$ can be taken to be arbitrarily small.
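The estimate on the central range $|t| \le 1/w_B$ can be checked numerically. The sketch below evaluates the two expressions quoted above for $\eta_1(t)$ and $\zeta(t)$ on a grid and reports the largest gap between them. The widths `w_a = 1` and `w_b = sqrt(2)`, the grid size, and the function names are illustrative assumptions, not values from the paper.

```python
import math

def cf_eta1(t, w_a, w_b, n):
    """The expression (cos(w_a t / 2) cos(w_b t / 2))^n quoted in the text."""
    return (math.cos(w_a * t / 2) * math.cos(w_b * t / 2)) ** n

def cf_zeta(t, w_a, w_b, n):
    """The Gaussian expression exp(-n t^2 (w_a^2 + w_b^2) / 8) quoted in the text."""
    return math.exp(-n * t * t * (w_a**2 + w_b**2) / 8)

def max_gap(w_a, w_b, n, samples=2001):
    """Largest |eta1 - zeta| over an evenly spaced grid on [-1/w_b, 1/w_b]."""
    t_max = 1 / w_b
    grid = [-t_max + 2 * t_max * k / (samples - 1) for k in range(samples)]
    return max(abs(cf_eta1(t, w_a, w_b, n) - cf_zeta(t, w_a, w_b, n)) for t in grid)

w_a, w_b = 1.0, math.sqrt(2)    # hypothetical tile widths with w_a < w_b
for n in (100, 10000):
    print(n, max_gap(w_a, w_b, n), n ** -0.5)
```

For these sample parameters the gap sits well below $N^{-1/2}$ and shrinks as $N$ grows, which is in line with the Taylor expansion argument. The sketch does not examine the derivative $\Delta'(t)$ or the range $|t| > 1/w_B$.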

5.3.2. Approximating by pieces with constant density.
Choose $L_0$ such that for all $L \ge L_0$, $\left|(1/L^2)\int_{[0,L)^2} g(S_{\vec v}x)\,d\lambda(\vec v) - \int g\,d\nu\right| < \epsilon^4$ for all $x$ in a subset of $Q$ of $\nu$-measure at least $1 - \epsilon$ (recall that $\lambda$ denotes Lebesgue measure). Notice that for a (one-dimensional) normal distribution, the part of the space at least $2|\log\epsilon|$ standard deviations from the mean has measure $o(\epsilon)$. Divide the part of the range that is at most $2|\log\epsilon|$ standard deviations from the mean into pieces of length $\epsilon/|\log\epsilon|$ standard deviations. On each piece in this central region, the ratio of the maximum density to the minimum density is at most $1 + O(\epsilon)$. Hence the distance to a distribution that is piecewise constant on intervals of length $\epsilon/|\log\epsilon|$ times the standard deviation is $O(\epsilon)$.
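The two facts used in this step, the small tail mass and the nearly constant density on each piece, can be checked for a standard normal distribution with a short computation. This is a minimal sketch under the assumption that lengths are measured in units of the standard deviation; the function name and the sample value of `eps` are hypothetical.

```python
import math

def piecewise_constant_check(eps):
    """Return (tail mass, worst max/min density ratio) for a standard normal.

    The central region is |z| <= 2|log eps| and is cut into pieces of
    length eps/|log eps|, both measured in standard deviations.
    """
    log_eps = abs(math.log(eps))
    radius = 2 * log_eps
    piece = eps / log_eps
    # P(|Z| > radius) for a standard normal Z.
    tail_mass = math.erfc(radius / math.sqrt(2))
    worst_ratio = 1.0
    # By symmetry it is enough to scan the pieces on the positive half-line.
    for k in range(int(radius / piece)):
        left = k * piece
        right = left + piece
        # The density is decreasing on [0, inf): maximum at left, minimum at right.
        ratio = math.exp((right**2 - left**2) / 2)
        worst_ratio = max(worst_ratio, ratio)
    return tail_mass, worst_ratio

tail, ratio = piecewise_constant_check(0.01)
print(tail, ratio)
```

On the outermost piece the ratio is roughly $e^{2\epsilon}$, so the density varies by a factor $1 + O(\epsilon)$ across any single piece, while the discarded tail mass is many orders of magnitude smaller than $\epsilon$.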
Taking $M$ sufficiently large, as above, such that $\epsilon\sqrt{N(w_A^2 + w_B^2)/4}/|\log\epsilon| > L_0$, we have that $\zeta \times \zeta$ is $\epsilon$-close to a measure $\sigma$ on $\mathbb R^2$ that is piecewise constant on $L \times L$ pieces, where $L = \epsilon\sqrt{N(w_A^2 + w_B^2)/4}/|\log\epsilon|$, and for which $1 - \epsilon$ of the mass is concentrated on $|\log\epsilon|^4/\epsilon^2$ pieces. Given $x \in E_i$ and $\vec v$, the probability that all $|\log\epsilon|^4/\epsilon^2$ of the $L \times L$ tiles around $\vec v$ in $x$ are good is at least $1 - \epsilon$, ensuring that $\int g(S_{\vec v-\vec h_i(x)+\vec u}x)\,d\sigma(\vec u)$ is $\epsilon$-close to $\int g\,d\nu$ on a set of large measure. Thus $E_\nu(g \circ S_{\vec v})$ is $\epsilon$-close to $\int g\,d\nu$ on a set of large measure, completing the proof of (18) and the proof of the theorem.