Immersed surfaces in the modular orbifold

A hyperbolic conjugacy class in the modular group PSL(2,Z) corresponds to a closed geodesic in the modular orbifold. Some of these geodesics virtually bound immersed surfaces, and some do not; the distinction is related to the polyhedral structure in the unit ball of the stable commutator length norm. We prove the following stability theorem: for every hyperbolic element of the modular group, the product of this element with a sufficiently large power of a parabolic element is represented by a geodesic that virtually bounds an immersed surface.


Introduction
In many areas of geometry, it is important to understand which immersed curves on a surface bound immersed subsurfaces. Such questions arise (for example) in topology, complex analysis, contact geometry and string theory. In [5] it was shown that studying isometric immersions between hyperbolic surfaces with geodesic boundary gives insight into the polyhedral structure of the second bounded cohomology of a free group (through its connection to the stable commutator length norm, defined in § 2.2).
If Σ is a (complete) noncompact oriented hyperbolic orbifold, a (hyperbolic) conjugacy class g in $\pi_1(\Sigma)$ is represented by a unique geodesic γ on Σ. The immersion problem asks when there is an oriented surface S and an orientation-preserving immersion i : S → Σ taking ∂S to γ in an orientation-preserving way. If the problem has a positive solution one says that γ bounds an immersed surface. The immersion problem is complicated by the fact that there are examples of curves γ that do not bound immersed surfaces, but have finite (possibly disconnected) covers that do bound; i.e. there is an immersion i : S → Σ as above for which ∂S → γ factors through a covering map of some positive degree. In this case we say γ virtually bounds an immersed surface.
One can extend the virtual immersion problem in a natural way to finite rational formal sums of geodesics representing zero in (rational) homology. Formally, one defines the real vector space $B_1^H(G)$ of homogenized 1-boundaries (see § 2.2), where $G = \pi_1(\Sigma)$. It is shown in [5] that the set of rational chains in $B_1^H$ for which the virtual immersion problem has a positive solution is precisely the set of rational points in a closed convex rational polyhedral cone with nonempty interior. This fact is connected in a deep way with Thurston's characterization (see [15]) of the set of classes in $H_2(M;\mathbb{Z})$ represented by fibers of fibrations of a given 3-manifold M over $S^1$. It can be used to give a new proof of symplectic rigidity theorems of Burger-Iozzi-Wienhard [3], and has been used by Wilton [16] in his work on Casson's conjecture. Evidently understanding the structure of the set of solutions of the virtual immersion problem is important, with potential applications to many areas of mathematics. Unfortunately, this set is apparently very complicated, even for very simple surfaces Σ.
The purpose of this paper is to prove the following Stability Theorem:
Stability Theorem 3.1. Let v be any hyperbolic conjugacy class in PSL(2, Z), represented by a string X of positive R's and L's. Then for all sufficiently large n, the geodesic in the modular orbifold M corresponding to the stabilization $R^nX$ virtually bounds an immersed surface in M.
This theorem proves the natural analogue of Conjecture 3.16 from [5], with PSL(2, Z) in place of the free group F 2 .
It follows from the main theorems of [5] that the elements $v_n$ ∈ PSL(2, Z) corresponding to the stabilizations of v as above satisfy $\mathrm{scl}(v_n) = \mathrm{rot}(v_n)/2$, where rot is the rotation quasimorphism on PSL(2, Z), and scl denotes the stable commutator length (see § 2 for details). Under the natural central extension $\phi : B_3 \to \mathrm{PSL}(2,\mathbb{Z})$, where $B_3$ denotes the 3 strand braid group, there is an equality $\mathrm{scl}(b) = \mathrm{scl}(\phi(b))$ for all $b \in [B_3, B_3]$; consequently we derive an analogous stability theorem for stable commutator length in $B_3$.
We give the necessary background and motivation in § 2. Theorem 3.1 is proved in § 3. In § 4 we generalize our main theorem to (2, p, ∞)-orbifolds for any p ≥ 3, and discuss some related combinatorial problems and a connection to a problem in theoretical computer science. Finally, in § 5 we describe the results of some computer experiments.

Background
We recall here some standard definitions and facts for the convenience of the reader.
2.1. The modular group. The modular group PSL(2, Z) acts discretely and with finite covolume on H 2 , and the quotient is the modular orbifold M, which can be thought of as a triangle orbifold of type (2, 3, ∞).
Every element of PSL(2, Z) either has finite order, or is parabolic, or is conjugate to a product of the form $R^{a_1}L^{b_1}R^{a_2}\cdots L^{b_m}$ where the $a_i$ and $b_i$ are positive integers, and where L and R are represented by the matrices
$$L = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \qquad R = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$$
(the parabolic elements are conjugate into the form $R^a$ or $L^b$). The group PSL(2, Z) is abstractly isomorphic to the free product Z/2Z * Z/3Z.
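As a quick sanity check on the normal form, the following minimal sketch multiplies out a positive word in R and L. It assumes the standard upper and lower unipotent representatives for R and L (an assumption that agrees with the matrix quoted for $R^7L^2RL$ in Example 3.2); the function names are illustrative only.

```python
# Sketch: expand a positive word in R and L as a 2x2 integer matrix.
# Assumption: R = [[1,1],[0,1]] and L = [[1,0],[1,1]] (standard choice,
# consistent with the matrix given for R^7 L^2 R L in Example 3.2).

R = ((1, 1), (0, 1))
L = ((1, 0), (1, 1))

def mat_mul(a, b):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def word_to_matrix(word):
    """Multiply the letters of a string such as 'RRLRL' from left to right."""
    result = ((1, 0), (0, 1))
    for letter in word:
        if letter not in "RL":
            raise ValueError("word must consist of the letters R and L")
        result = mat_mul(result, R if letter == "R" else L)
    return result

# The word R^7 L^2 R L used as the running example in Section 3.
print(word_to_matrix("R" * 7 + "LL" + "RL"))  # ((37, 22), (5, 3))
```

A positive word containing both letters has trace greater than 2, which is one way to see that it is hyperbolic.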

2.2. Stable commutator length. For a basic introduction to stable commutator length, see [6], especially Chapters 2 and 4. If G is a group, and g ∈ [G, G], the commutator length of g (denoted cl(g)) is the smallest number of commutators in G whose product is g, and the stable commutator length scl(g) is the limit $\mathrm{scl}(g) := \lim_{n\to\infty} \mathrm{cl}(g^n)/n$.
Stable commutator length extends in a natural way to a function on $B_1(G)$, the space of real group 1-boundaries (i.e. real group 1-chains representing 0 in homology with real coefficients) and descends to a pseudo-norm (which can be thought of as a kind of relative Gromov-Thurston norm) on the quotient $B_1^H(G) := B_1(G)/\langle g^n - ng,\ g - hgh^{-1} \rangle$. The dual of this space (up to scaling by a factor of 2) is $Q(G)/H^1(G)$, i.e. homogeneous quasimorphisms on G modulo homomorphisms, with the defect norm. This duality theorem, known as Generalized Bavard Duality, is proved in [6], § 2.6. A special case of this theorem, proved by Bavard in [2], says that for any g ∈ [G, G] there is an equality $\mathrm{scl}(g) = \sup_\phi \phi(g)/2$ where the supremum is taken over all homogeneous quasimorphisms φ ∈ Q(G) normalized to have defect $D(\phi) = 1$.
The theory of stable commutator length has deep connections to dynamics, group theory, geometry, and topology; however, although in principle this function contains a great deal of information, it is notoriously difficult to extract this information, and to interpret it geometrically. It is a fundamental question in any group G to calculate scl on chains in $B_1^H(G)$, and to determine extremal quasimorphisms for such elements. Conversely, given a homogeneous quasimorphism φ ∈ Q(G) (especially one with some geometric "meaning"), it is a fundamental question to determine the (possibly empty) cone in $B_1^H(G)$ on which φ is extremal.

2.3. Rotation quasimorphism. If G is a word-hyperbolic group, scl is a genuine norm on $B_1^H(G)$. Moreover, it is shown in [4] and [5] that if G is virtually free, the unit ball in this norm is a rational polyhedron, and there are codimension one faces associated to realizations of G as the fundamental group of a complete oriented hyperbolic orbifold.
Dual to each such codimension one face is a unique extremal vertex of the unit ball in $Q/H^1$. In our case, PSL(2, Z) may be naturally identified with the fundamental group of the modular orbifold. The unique homogeneous quasimorphism dual to this realization, scaled to have defect 1, is the rotation quasimorphism, denoted rot. This rotation function is very closely related to the Rademacher $\varphi$ function, which arises in connection with Dedekind's η function, and is studied by many authors, e.g. [1,10,12,14,11], and so on (see especially [10], § 3.2 for a discussion most closely connected to the point of view of this paper). In fact, up to a constant, the rotation quasimorphism is the homogenization of the Rademacher function; i.e. $\mathrm{rot}(g) = \lim_{n\to\infty} \varphi(g^n)/6n$.
The simplest way to define this function (at least on hyperbolic elements of PSL(2, Z)) is as follows. Associated to a hyperbolic conjugacy class g ∈ PSL(2, Z) is a geodesic γ on M. The geodesic γ cuts M up into complementary regions $R_i$ (see Figure 1 for an example). Join each region $R_i$ to the cusp by a proper ray $\alpha_i$, and define $n_i$ to be the signed intersection number $n_i = \alpha_i \cap \gamma$. Then $\mathrm{rot}(g) = \frac{1}{2\pi}\sum_i n_i\,\mathrm{area}(R_i)$. In other words, up to a factor of 2π, the rotation number is the algebraic area enclosed by γ. Algebraically, if the conjugacy class of g has a factorization of the form $R^{a_1}L^{b_1}\cdots L^{b_n}$ then $\mathrm{rot}(g) = \frac{1}{6}\left(\sum a_i - \sum b_i\right)$; see [12] 1.5-6, or [14] equation 70 (Rademacher denotes $\varphi$ and $6\,\mathrm{rot}$ by Φ and Ψ respectively).
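The algebraic formula for rot is easy to evaluate by hand, but a short sketch makes the bookkeeping explicit. It takes the exponents of a word $R^{a_1}L^{b_1}\cdots R^{a_n}L^{b_n}$ as two lists; the function name and the list encoding are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction

def rot(a_exponents, b_exponents):
    """Rotation number (sum of R exponents minus sum of L exponents) / 6."""
    return Fraction(sum(a_exponents) - sum(b_exponents), 6)

# The running example R^7 L^2 R L has a = (7, 1) and b = (2, 1).
value = rot([7, 1], [2, 1])
print(value, value / 2)  # 5/6 5/12
```

The second printed number is rot/2, the lower bound on scl supplied by Bavard duality; for $R^7L^2RL$ it agrees with the value scl = 5/12 quoted in Example 3.2.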
By Bavard duality, one has scl(g) ≥ rot(g)/2 for every g. Moreover, it is shown in [5] (for arbitrary free groups, though the proof easily generalizes to virtually free groups) that equality is achieved if and only if the geodesic representative γ of g virtually bounds an immersed surface. That is, if and only if there is a hyperbolic surface S and an isometric immersion S → M wrapping ∂S some (positive) number n times around γ. Topologically, one can think of the problem of constructing such an immersed surface as a kind of jigsaw puzzle: one takes n · n i copies of each region R i , where n, n i and R i are as above, and tries to glue them up compatibly with their tautological embeddings in M in such a way as to produce a smooth orbifold with geodesic boundary. Evidently, a necessary condition is that the n i are all non-negative. However, this necessary condition is not sufficient.
In [5] it was observed experimentally that for many words $w \in [F_2, F_2]$, geodesics on a hyperbolic once-punctured torus corresponding to conjugacy classes of the form $[a,b]^n w$ all virtually bound immersed surfaces for sufficiently large n, and it was conjectured (Conjecture 3.16) that this holds in general. Our main theorem (Theorem 3.1 below) proves the natural analogue of this conjecture with the free group $F_2$ replaced by the virtually free group PSL(2, Z).

2.4. Braid group. The braid group $B_3$ is a central extension of PSL(2, Z). Under this projection, the standard braid generators $\sigma_1$ and $\sigma_2$ are taken to $R^{-1}$ and L respectively. It is straightforward to show that for any $b \in [B_3, B_3]$ the (stable) commutator length of b is equal to the (stable) commutator length of its image in PSL(2, Z). Consequently, our main theorem shows that the rotation quasimorphism is extremal for a sufficiently large stabilization of any element of $[B_3, B_3]$.

Proof of theorem
The purpose of this section is to prove the following theorem:
Theorem 3.1. Let v be any hyperbolic conjugacy class in PSL(2, Z), represented by a string X of positive R's and L's. Then for all sufficiently large n, the geodesic in the modular orbifold M corresponding to the stabilization $R^nX$ virtually bounds an immersed surface in M.
The proof will occupy the remainder of the section.
The conjugacy class of v has a representative of the form $R^{a_1}L^{b_1}R^{a_2}\cdots L^{b_n}$ where the $a_i$, $b_i$ are all positive integers. We will show that the geodesic γ corresponding to v virtually bounds an immersed surface providing $a_1$ is sufficiently big compared to $\sum_{i\neq 1} a_i + \sum_i b_i$. Evidently the theorem follows from this. We fix the notation $N = a_1$ and $N' = \sum_{i\neq 1} a_i + \sum_i b_i$ in the sequel, and we prove the theorem under the hypothesis $N \ge 3N' + 11n + 3$ (note that there is no suggestion that this inequality is sharp).
In the modular orbifold M, let σ denote the embedded geodesic segment running between the orbifold points of orders 2 and 3. The preimage of σ in the universal cover $\mathbb{H}^2$ is a regular 3-valent tree, which we denote $\tilde\sigma$; see Figure 2. Let W denote the closure of the complementary component of $\mathbb{H}^2 - \tilde\sigma$ stabilized by the translation z → z + 1. Then ∂W is a collection of circular arcs, whose vertices are the complex numbers $e^{2\pi i/3} + n$ for n ∈ Z. We call these arcs the segments of ∂W. Every segment of ∂W (orbifold) double covers the interval σ in M. In the sequel we use the abbreviation $\omega = e^{2\pi i/3}$.
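To see how demanding the hypothesis is, the following sketch evaluates $N$, $N'$ and the bound $3N' + 11n + 3$ for a word given by its exponent lists. The function name and the encoding of a word as two equal-length lists are assumptions made for illustration.

```python
def stability_hypothesis(a, b):
    """Return (holds, bound) for the hypothesis N >= 3N' + 11n + 3.

    a and b are the exponent lists of R^{a_1} L^{b_1} ... R^{a_n} L^{b_n};
    N = a_1 and N' is the sum of all the other exponents.
    """
    if len(a) != len(b) or not a:
        raise ValueError("expected two nonempty exponent lists of equal length")
    n = len(a)
    big_n = a[0]
    n_prime = sum(a[1:]) + sum(b)
    bound = 3 * n_prime + 11 * n + 3
    return big_n >= bound, bound

print(stability_hypothesis([7, 1], [2, 1]))   # (False, 37)
print(stability_hypothesis([37, 1], [2, 1]))  # (True, 37)
```

For the running example $R^7L^2RL$ the sufficient condition fails, since it would require a leading exponent of at least 37. The hypothesis is only sufficient, so failing it says nothing about whether the geodesic virtually bounds.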

3.1. Arcs and subwords. The arc σ cuts γ into a collection of geodesic segments which correspond approximately to the $R^{a_i}$ and $L^{b_i}$ terms in the expression of v, in the following way.
After choosing a base point and an orientation, a word v in the L's and R's determines a simplicial path in $\tilde\sigma$, where L indicates a "left turn", and R indicates a "right turn", reading the word from right to left. The letters R or L correspond to the vertices of this path, and a string of the form $RL^bR$ (resp. $LR^aL$) corresponds to a segment of length b + 1 (resp. a + 1) which, after translation by some element of PSL(2, Z), we can arrange to be contained in ∂W as a string of b + 1 consecutive arcs moving to the right (resp. a + 1 consecutive arcs moving to the left).
The bi-infinite power $\cdots vvv \cdots$ of v determines a bi-infinite path P(v) in $\tilde\sigma$, which is a quasigeodesic in $\mathbb{H}^2$ a bounded distance from the geodesic representative of an axis of (some conjugate of) v. We may thus crudely associate lifts of segments $\gamma_j$ to W with such subwords.
If we translate P(v) so that the segment corresponding to $L^b$ starts at the vertex ω of $\tilde\sigma$, then this segment ends at ω + b + 1. Moreover, the endpoints of P(v) on the real axis are contained in the intervals (−1, 0) and (b, b + 1). Let $\tilde\gamma$ be the infinite geodesic with the same endpoints as P(v). Then the intersection of $\tilde\gamma$ with W is either empty (for which b = 1 is necessary but not sufficient) or consists of two points, one in the segment of $\tilde\sigma$ from ω to ω ± 1, the other in the segment of $\tilde\sigma$ from ω + b to ω + b ± 1, where the ±1 depends in each case on the rest of the word v (the degenerate case that $\tilde\gamma$ passes through one or two vertices of $\tilde\sigma$ is allowed).
In our case of interest, this intersection $\tilde\gamma \cap W$ projects to the segment $\beta_i$ of γ corresponding to an $L^{b_i}$ subword. Similarly, $\alpha_i$ segments of γ correspond to $R^{a_i}$ subwords, with the caveat that some $L^{b_i}$ or $R^{a_i}$ subwords with $a_i$ or $b_i = 1$ may not correspond to a segment of γ at all.
Example 3.2 ($R^7L^2RL$, part 1). We will illustrate the main points of our construction in a particular case. Let v correspond to the conjugacy class $R^7L^2RL$, which satisfies scl(v) = 5/12 and rot(v) = 5/6. A matrix representative in PSL(2, Z) for v is $\begin{pmatrix} 37 & 22 \\ 5 & 3 \end{pmatrix}$. A bi-infinite path P(v) is illustrated in Figure 3 and the corresponding axis $\tilde\gamma$ in Figure 4. The geodesic γ is cut by σ into three segments $\alpha_1$, $\beta_1$, $\alpha_2$, corresponding to the subwords $R^7$, $L^2$ and R.
Figure 3. At each vertex, P turns left or right according to the letters of the bi-infinite power of v (read from right to left). The path P(v) is chosen so that a segment corresponding to $R^7$ is contained in ∂W.

3.2. Lifts and surfaces. For each $\alpha_i$ or $\beta_i$ we choose lifts $\tilde\alpha_i$ and $\tilde\beta_i$ properly contained in W subject to the following lifting conditions:
(1) the lifts are disjoint, and no two lifts intersect the same segment of ∂W;
(2) there are exactly five consecutive segments of ∂W between consecutive $\tilde\beta_i$;
(3) the $\tilde\alpha_i$ and $\tilde\beta_i$ are not nested (i.e. they cobound disjoint disks with ∂W) except that the $\tilde\beta_i$ are all contained "under" the lift $\tilde\alpha_1$, so that the leftmost vertex of $\tilde\alpha_1$ and the leftmost vertex of $\tilde\beta_1$ intersect segments of ∂W separated by exactly five other segments of ∂W.
For each i ≠ 1 let $S_i$ be the subsurface of W bounded by $\tilde\alpha_i$ and ∂W. Let T be the subsurface of W "above" all the $\tilde\beta_i$ and "below" $\tilde\alpha_1$. We will build our immersed surface from the $S_i$ and T, glued up suitably along their intersection with ∂W. Let S denote the (disjoint) union $S = \bigcup_i S_i \cup T$. The boundary of S comes in two parts: the part of the boundary along the $\tilde\alpha_i$ and $\tilde\beta_i$ (we denote this part of the boundary $\partial_\gamma S$) and the part along ∂W (we denote this part by $\partial_W S$).
Furthermore, ∂ W S decomposes naturally into segments which are the intersection with the segments of ∂W . There are two kinds of such segments: entire segments (those corresponding to an entire segment of ∂W contained in ∂S), and partial segments (those corresponding to a segment of ∂W containing an endpoint of some α i or β i in its interior). We also allow the case of a degenerate partial segment, consisting of a single vertex of ∂W ∩ α i or ∂W ∩ β i .
By construction, the partial segments of ∂ W S come in oppositely oriented pairs, ending on pairs of points of ∂ γ S ∩ ∂ W S projecting to the same point in γ. We glue up such pairs of partial segments, producing a surface S ′ . Under the covering projection H 2 → M, the surface S immerses in M in such a way that the immersion extends to an immersion of S ′ . The ∂ γ components glue up to produce a smooth boundary component ∂ γ S ′ which wraps once around γ in M. The other part of ∂S ′ , which by abuse of notation we denote ∂ W S ′ , is a union of connected components, each of which is tiled by entire segments of ∂W . At each vertex corresponding to an end vertex of a partial segment, the segments of ∂ W S ′ meet at an angle of 4π/3. At every other vertex the segments meet at an angle of 2π/3. We say such vertices are of type 2 and type 1 respectively.
To complete the construction of an immersed surface virtually bounding γ (and thereby completing the proof of Theorem 3.1) we must show how to glue up S′ by identifying segments of $\partial_W$ to produce a smooth surface. Such a surface immerses in M, and is extremal for γ. In fact, technically it is easier to glue up S′ to produce a smooth orbifold, containing orbifold points of order 2 and 3 that map to the corresponding orbifold points in M. Such an orbifold is finitely covered by a smooth surface virtually bounding γ.
Example 3.3 ($R^7L^2RL$, part 2). With notation as in part 1, we choose lifts $\tilde\alpha_1$, $\tilde\beta_1$ and $\tilde\alpha_2$ as indicated in Figure 5. Notice that the exponent 7 is too small: there is not enough room for $\tilde\beta_1$ under $\tilde\alpha_1$ without putting two endpoints ($c_1$ and $c_2$) on adjacent partial segments. The result of gluing up these adjacent segments produces an orbifold point of order 3 in the interior of S′.
Furthermore, there are a pair of "degenerate" partial segments, consisting of the points a 1 and a 2 . Identifying these points produces a non-manifold point in S ′ , where the two components ∂ γ S ′ and ∂ W S ′ meet, at a smooth point on ∂ γ S ′ , and at a point with angle 4π/3 (i.e. a point of type 2) on ∂ W S ′ . This non-manifold point will become an ordinary manifold point after we glue up ∂ W S ′ .
The a i and the endpoints of the partial segments containing the b i glue up into two adjacent type 2 points on ∂ W S ′ , and the remaining three vertices of ∂ W S give rise to three adjacent type 1 points on ∂ W S ′ . Thus the vertices on ∂ W S ′ are of type 11221 in cyclic order.
The interior of the 12 segment can be folded up, creating an orbifold point of order 2, and the other four segments identified in pairs (pairing 11 with 22), creating another orbifold point of order 3 corresponding to the "unpaired" 1 vertex.
The result is a smooth orbifold S ′′ with three orbifold points of orders 2, 3, 3, which immerses in M with boundary wrapping once around γ. A finite (orbifold) cover of S ′′ is a genuine surface, which γ virtually bounds.

3.3. Combinatorics. We have seen in general that $\partial_W S'$ is determined by combinatorial data consisting of a finite collection of circularly ordered sequences of 1's and 2's, which we call circles. We write such a circle as an ordered list of 1's and 2's, where two such ordered lists define the same circle (denoted ∼) if they differ by a cyclic permutation. Hence 211 ∼ 121 ∼ 112 and so on. A consecutive string of 1's and 2's contained in a circle is bracketed by a dot on either side; hence ·12· is a sequence in the circle 122 and so on.
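The circle notation is simple to model. In the sketch below a circle is a nonempty string over the characters 1 and 2, equality is taken up to cyclic rotation, and a dotted sequence is a substring read cyclically. The string encoding and the function names are assumptions made for illustration.

```python
def canonical(circle):
    """Lexicographically smallest rotation, used as a representative of the circle."""
    if not circle:
        return circle
    return min(circle[i:] + circle[:i] for i in range(len(circle)))

def same_circle(x, y):
    """True if x and y differ by a cyclic permutation."""
    return len(x) == len(y) and canonical(x) == canonical(y)

def contains_sequence(circle, seq):
    """True if seq occurs as a consecutive string in the circle, read cyclically."""
    if not seq or len(seq) > len(circle):
        return False
    # Appending a prefix lets substrings wrap around the end of the list.
    return seq in circle + circle[: len(seq) - 1]

print(same_circle("211", "121"), same_circle("121", "112"))  # True True
print(contains_sequence("122", "12"))                        # True
```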
The 2's correspond to vertices of ∂ W S which are also vertices of ∂W , and are contained in partial edges; whereas the 1's correspond to the other vertices of ∂ W S which are also vertices of ∂W . Hence the total number of 2's is equal to the number of segments of γ − σ, which is at most n. The total number of 1's is at most N + N ′ + n.
The only combinatorial properties of the circles we need are the following: (1) each circle contains at least one 2, and consequently there are at most n circles (this is immediate from the construction); (2) some circle contains a sequence of at least N − N ′ − 7n consecutive 1's, where N is large compared to N ′ and n (this follows from lifting conditions (2) and (3)); and (3) each circle contains a string of at least two consecutive 1's (this follows from lifting condition (2)).
We refer to the sequence of at least N − N′ − 7n consecutive 1's informally as the big 1 sequence. Providing N is sufficiently big compared to N′ and n (equivalently, providing the big 1 sequence is sufficiently long), we can completely glue up $\partial_W S'$ as an orbifold. We now explain how to do this.
The argument consists of a sequence of reductions to simpler and simpler combinatorial configurations. These reductions are described using a pictorial calculus whose meaning should be self-evident.
The first reduction consists of taking a pair of 11's and identifying the segments they bound. If the 11's are on different circles, these two circles become amalgamated into a single circle. We call this the 1-handle move; see Figure 6.
Hence, by bullets (2) and (3), by applying the 1-handle move at most n times, using up at most 2n of the 1's in the big 1 sequence in the process, we can reduce to the case of a single circle. This circle has the form $1^m\nu$ where $1^m$ is the big 1 sequence, and ν stands for some sequence of 1's and 2's. Since N ≥ 3N′ + 11n + 3, the big 1 sequence in ∂S′ has length at least 2N′ + 6n + 3. Using up at most 2n of these 1's gives m ≥ 2N′ + 4n + 3. On the other hand, the length of ν is at most N′ + 2n by construction.
Figure 6. The 1-handle move.
The first two moves are special cases of the 1-handle move, where the two ·11· segments being glued are not disjoint. By means of repeated applications of the ·11· → ·2· move, a string of at most 2k consecutive 1's can be reduced to any string of length k.
Associated to any sequence of 1's and 2's is its complement, obtained by reversing the order of the sequence and replacing each 1 by a 2 and conversely. We denote the complement of a sequence ν by $\nu^c$. We transform a subset of the big 1 sequence into the complement of ν by such ·11· → ·2· moves. This gives the reduction $1^m\nu \to 1^{m'}\nu^c\nu$.
The ν sequence and its complement can be glued up in an obvious way, folding up the segment between the last letter of $\nu^c$ and the first letter of ν and producing an interior orbifold point of order 2. This amounts to the reduction $1^{m'}\nu^c\nu \to 1^{m''}2$ where m′′ ≥ 1. After a finite sequence of ·1121· → ·2· moves, we reduce to one of the cases 12, 112 or 1112.
Folding up each edge of the 12 circle glues up the boundary completely, producing two interior orbifold points of order 2. Folding up the ·12· edge and identifying the other pair of edges in the 112 circle glues up the boundary completely, producing two interior orbifold points of orders 2 and 3. Identifying the ·12· and ·21· edges with the succeeding pair of ·11· edges of the 1112 circle glues up the boundary completely, producing two interior orbifold points of order 3. See Figure 7. In every case therefore $\partial_W S'$ can be completely glued up, and the proof of Theorem 3.1 is complete.
Remark 3.4. The proof generalizes in an obvious way to formal sums. Let $\gamma_n$ denote the geodesic corresponding to the conjugacy class $R^nX$ for some fixed string X. Let $\delta_1, \cdots, \delta_m$ be any finite collection of geodesics in M. Then for sufficiently large n, the 1-manifold $\gamma_n \cup \bigcup_i \delta_i$ virtually bounds an immersed surface in M.
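The complement operation, and the number of 1's consumed in manufacturing a given target string, can be sketched as follows. The sketch assumes that each 2 in the target is produced from two adjacent 1's by a single ·11· → ·2· move and that each 1 is left untouched; the string encoding and function names are illustrative.

```python
def complement(seq):
    """Reverse the sequence and exchange the symbols 1 and 2."""
    swap = {"1": "2", "2": "1"}
    return "".join(swap[c] for c in reversed(seq))

def ones_needed(target):
    """Number of consecutive 1's used to produce target (two per 2, one per 1)."""
    return sum(2 if c == "2" else 1 for c in target)

nu = "112"
nu_c = complement(nu)
print(nu_c)                                  # 122
print(ones_needed(nu_c), 2 * len(nu_c))      # 5 6
```

The count never exceeds twice the length of the target, which matches the statement that at most 2k consecutive 1's suffice to produce any string of length k.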
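The endgame depends only on the residue of the number of 1's modulo 3. A minimal sketch, assuming that each ·1121· → ·2· move removes exactly three 1's from a circle of the form $1^m2$ and that we stop as soon as at most three 1's remain (the function name is illustrative):

```python
def endgame(m):
    """Reduce the circle 1^m 2 by .1121. -> .2. moves.

    Returns the terminal circle (one of 12, 112, 1112) and the number of moves.
    """
    if m < 1:
        raise ValueError("the circle 1^m 2 needs at least one 1")
    moves = 0
    while m > 3:
        m -= 3      # one .1121. -> .2. move removes three 1's
        moves += 1
    return "1" * m + "2", moves

for m in (5, 6, 7):
    print(m, endgame(m))
# 5 ('112', 1)
# 6 ('1112', 1)
# 7 ('12', 2)
```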

Generalizations
In this section we discuss some generalizations of our main theorem.
Proof. The proof is very similar to the proof of Theorem 3.1, and for the sake of brevity we use language and notation as in § 3 by analogy.
There is an embedded geodesic segment $\sigma_p$ running between the orbifold points of orders 2 and p in $M_p$, covered by an infinite p-valent tree $\tilde\sigma_p$ in $\mathbb{H}^2$. Let $W_p$ be the closure of the complementary component of $\mathbb{H}^2 - \tilde\sigma_p$ stabilized by a translation. Then $\partial W_p$ is a sequence of circular arcs meeting at an angle of 2π/p.
A conjugacy class R n v represented by a geodesic γ is decomposed into α i and β i arcs by σ p , where we use α i to denote the arcs whose lifts to W p move to the left, and the β i to denote the arcs whose lifts to W p move to the right. For sufficiently large n, there is one arc α 1 with a lift α 1 which intersects segments of ∂W p approximately n apart, whereas the combinatorial types of the α i (for i > 1) and β i are eventually constant.
As in § 3.2 we choose disjoint lifts α i , β i with the β i under α 1 and with five segments of ∂W p between successive β i . These lifts cobound surfaces T and S i which can be glued up to produce S ′ with ∂ W S ′ a collection of circles with vertices labeled by numbers from 1 to (p−1). As in § 3.2, we are guaranteed that each circle contains a ·11·, and one circle contains a big 1 sequence (which may be assumed to be as long as we like, by making n large).
The 1-handle move still makes sense, so after performing a sequence of such moves we can reduce to the case of a single circle with a big 1 sequence; i.e. we have reduced to the case of $1^m\nu$ for ν an arbitrary (but fixed independent of all large n) sequence, and m as big as we like. Folding a ·1k· segment where k < p − 1 produces an interior orbifold point of order 2, and a single (k + 1) vertex. So we can turn a sequence of at most (p − 1)|ν| consecutive 1's into the complement $\nu^c$ and reduce $1^m\nu \to 1^{m'}\nu^c\nu$. Folding $\nu^c$ into ν reduces to $1^{m''}2$ as before.
We must be a bit careful with the endgame depending on the value of m′′ mod p. We would like to reduce to a $1^{p-k}k$ circle for some k, since $1^{p-k}k \to 1^{p-k-1}(k+1) \to \cdots \to 1(p-1)$ by successive folds, and then 1(p − 1) can be completely folded up, producing two orbifold 2 points (much as we completely folded up the 12 circle in the case p = 3).
Fortunately this reduction can be accomplished if m′′ is sufficiently big (whatever its residue mod p). Folding an edge gives ·11· → ·2·, but folding at a vertex gives ·111· → ·2·, in either case producing an interior orbifold point of orders 2 and p respectively. By judicious application of some number of these moves (together with folds of the kind ·ab· → ·(a + b)· if a + b < p or ·1ab1· → ·2· if a + b = p) we can reduce $1^{m''}2$ for any sufficiently large m′′ to $1^{p-k}k$ for some k, and thence glue up completely as above. This completes the proof of the theorem.
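The chain of successive folds used in this endgame can be listed mechanically. The sketch below records each circle $1^{j}k$ as a pair (number of 1's, label of the remaining vertex), assuming, as in the text, that folding a ·1k· edge with k < p − 1 consumes one 1 and raises the label to k + 1. The function name and the pair encoding are illustrative.

```python
def fold_chain(p, k):
    """Circles visited by the folds 1^{p-k} k -> 1^{p-k-1} (k+1) -> ... -> 1 (p-1).

    Each circle is recorded as (number_of_ones, label).
    """
    if p < 3 or not 1 <= k <= p - 1:
        raise ValueError("need p >= 3 and 1 <= k <= p - 1")
    return [(p - label, label) for label in range(k, p)]

print(fold_chain(5, 2))  # [(3, 2), (2, 3), (1, 4)]
print(fold_chain(3, 2))  # [(1, 2)]
```

For p = 3 the chain starting at k = 2 is the single circle 12, which is the case folded up completely in the proof of Theorem 3.1.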
It seems very plausible that some generalization of our methods should prove an analogous theorem for (p, q, ∞) orbifolds with p, q > 2, or even arbitrary noncompact hyperbolic orbifolds with underlying topological space a disk, but the combinatorial endgame becomes progressively more complicated, and we have not pursued this.

4.2. Combinatorics of 12 circles. Returning to the case of p = 3, we describe a slightly different method for producing immersed surfaces. Suppose we have a word $X = R^{a_1}L^{b_1}R^{a_2}\cdots L^{b_n}$ for which the $b_i$ can be partitioned into subsets $B_i = \{b_{i,1}, b_{i,2}, \cdots\}$ (possibly empty) with the property that $a_i \gg \sum_j b_{i,j}$. In this case, we can choose lifts $\tilde\alpha_i$ and $\tilde\beta_j$ such that for each i, all the $\tilde\beta_{i,j}$ are contained "under" the single arc $\tilde\alpha_i$.
In this case, we can still produce a surface S′ for which $\partial_W S'$ is a collection of circles with 1 and 2 vertices. We are thus naturally led to the question: which 12 circles can be glued up completely? This seems like a hard combinatorial question; nevertheless, we describe some interesting necessary conditions and sufficient conditions (though we don't know a simple condition which is both necessary and sufficient).
Example 4.3. Each 2 vertex must be glued to some unique 1 vertex, therefore the total number of 2's must be no more than the total number of 1's. So, for example, the circles 2 and 221 can't be glued up.
Example 4.4. The length of a consecutive sequence of 1's can never be increased. So consecutive sequences of 2's must be associated to disjoint consecutive sequences of 1's of at least the same length. So, for example, the circle 22211211211 can't be glued up, even though it has more 1's than 2's.
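The two obstructions in Examples 4.3 and 4.4 can be tested mechanically. The sketch below checks the counting condition, together with a weaker consequence of the run condition: the longest cyclic run of 2's cannot exceed the longest cyclic run of 1's. Passing both tests is necessary but by no means sufficient; the string encoding and function names are assumptions for illustration.

```python
def max_cyclic_run(circle, symbol):
    """Length of the longest run of symbol in the circle, read cyclically."""
    if symbol not in circle:
        return 0
    if all(c == symbol for c in circle):
        return len(circle)
    best = current = 0
    for c in circle + circle:   # doubling the string handles wrap-around runs
        current = current + 1 if c == symbol else 0
        best = max(best, current)
    return best

def passes_necessary_conditions(circle):
    """Counting test of Example 4.3 plus a weak form of the run test of Example 4.4."""
    count_ok = circle.count("2") <= circle.count("1")
    run_ok = max_cyclic_run(circle, "2") <= max_cyclic_run(circle, "1")
    return count_ok and run_ok

for circle in ("2", "221", "22211211211", "1212"):
    print(circle, passes_necessary_conditions(circle))
# 2 False
# 221 False
# 22211211211 False
# 1212 True
```

The circle 22211211211 passes the counting test (five 2's against six 1's) and is rejected only by the run test, as in Example 4.4.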
Example 4.5. A circle with alternating 1's and 2's can be completely glued up. Any circle of the form $21^{c_1}21^{c_2}\cdots 21^{c_m}$ where each $c_i \ge 7$ can be completely glued up, since we can use the reductions ·111· → ·2· or ·11· → ·2· to replace $\cdot 1^c \cdot$ by ·121···21· whenever c ≥ 7. Hence in particular, rot is extremal for any conjugacy class of the form $R^{a_1}L^{b_1}R^{a_2}\cdots R^{a_n}L^{b_n}$ whenever the $b_i$ can be partitioned into subsets $B_i = \{b_{i,1}, b_{i,2}, \cdots\}$ (possibly empty) with the property that $a_i \ge 10 + \sum_j (b_{i,j} + 10)$ for all i.
Given two finite sets of positive numbers $\{a_i\}$ and $\{b_i\}$ (possibly with multiplicity), the problem of partitioning the $b_i$ into subsets $B_i = \{b_{i,1}, b_{i,2}, \cdots\}$ (possibly empty) with the property that $a_i \ge \sum_j b_{i,j}$ is familiar in computer science, where the $b_i$ denote the lengths of a family of files, and the $a_i$ denote the lengths of a family of empty consecutive blocks in memory. See e.g. [13], § 2.2 and § 2.5 for a discussion. The performance of dynamic memory allocation algorithms is very well studied, with respect to many different kinds of statistical distributions for the numbers $\{a_i\}$ and $\{b_i\}$, and it would be intriguing to pursue this connection further.
Example 4.6. In this paper we have discussed various sufficient conditions on the exponents $a_i$ and $b_i$ for a word γ to virtually bound an immersed surface. However, these conditions have only depended on the sets of values $\{a_i\}$ and $\{b_i\}$, and not their order. A complete understanding must necessarily take this order into account. For example, $\mathrm{scl}(R^3LRLR^2LRL^2) = 1/6 = \mathrm{rot}(R^3LRLR^2LRL^2)/2$ whereas $\mathrm{scl}(R^3LRLRLR^2L^2) = 4/15 > 1/6 = \mathrm{rot}(R^3LRLRLR^2L^2)/2$.
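As an illustration of the allocation point of view, here is a sketch of the first-fit decreasing heuristic applied to this partition problem. It is a standard heuristic, not a method taken from the paper, and a return value of None only means that this heuristic failed, not that no partition exists. The optional pad parameter is a hypothetical convenience: pad=0 tests $a_i \ge \sum_j b_{i,j}$, while pad=10 tests the condition $a_i \ge 10 + \sum_j (b_{i,j} + 10)$ of Example 4.5.

```python
def first_fit_partition(a, b, pad=0):
    """Try to assign each b to some a_i with a_i >= pad + sum of (b_ij + pad).

    Returns a list of subsets B_i (one per a_i), or None if the heuristic fails.
    """
    remaining = [x - pad for x in a]          # free room left under each a_i
    subsets = [[] for _ in a]
    for size in sorted(b, reverse=True):      # place the largest pieces first
        cost = size + pad
        for i, room in enumerate(remaining):
            if room >= cost:
                subsets[i].append(size)
                remaining[i] -= cost
                break
        else:
            return None                       # no block had room for this piece
    return subsets

print(first_fit_partition([3, 5], [5, 3]))            # [[3], [5]]
print(first_fit_partition([40, 12], [5, 5], pad=10))  # [[5, 5], []]
print(first_fit_partition([20], [5, 5], pad=10))      # None
```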
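The two words in Example 4.6 can be compared directly. The sketch below parses a compact word such as R3LRL2 (a hypothetical string encoding in which a digit string after a letter is its exponent) and confirms that both words have the same multisets of exponents and the same value of rot/2; the scl values themselves are those quoted in the example and are not computed here.

```python
import re
from fractions import Fraction

def parse(word):
    """Split a compact word like 'R3LRL2' into lists of R and L exponents."""
    a, b = [], []
    for letter, exponent in re.findall(r"([RL])(\d*)", word):
        (a if letter == "R" else b).append(int(exponent) if exponent else 1)
    return a, b

def half_rot(word):
    """rot/2 computed from exponent sums: (sum a_i - sum b_i) / 12."""
    a, b = parse(word)
    return Fraction(sum(a) - sum(b), 12)

w1, w2 = "R3LRLR2LRL2", "R3LRLRLR2L2"
a1, b1 = parse(w1)
a2, b2 = parse(w2)
print(sorted(a1) == sorted(a2), sorted(b1) == sorted(b2))  # True True
print(half_rot(w1), half_rot(w2))                          # 1/6 1/6
```

Since the exponent data agree, any criterion that looks only at the sets of exponents cannot separate these two words, although their scl values differ.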

Experimental results
In this section we describe the results of some computer experiments, comparing the functions scl(·) and rot(·)/2 in general. The function rot can be computed by an exponent sum for a conjugacy class expressed as a product of R's and L's, and scl can be computed using the algorithms described in [4] (implemented in the program scallop, available from [7]) and [6], § 4.2.5.

5.1. Distribution of n(X). For each word X in L and R, define n(X) to be the smallest negative number such that rot is extremal for $L^{-n}X$ (if one exists), or the smallest non-negative number such that rot is extremal for $R^nX$ otherwise. Figure 8 shows a histogram of the frequency distribution of n(X), for all words X of length 10. It is a fact ([9]) that in an arbitrary word-hyperbolic group, most rationally null-homologous words of length n have scl ∼ n/ log n. On the other hand, rot is an example of a bicombable quasimorphism, and therefore by [8], the distribution of values on words of length n satisfies a central limit theorem; in particular, one has $|\mathrm{rot}| \sim \sqrt{n}$ for most words of length n. It follows that n(X) is at least of size n/ log n for most words X of length n, at least when n is large.

5.2. Stuttering. One might imagine from the discussion above that if rot is extremal for $R^nX$, then it must also be extremal for $R^mX$ for all m > n; however, this is not the case. We call this phenomenon stuttering.
Example 5.1. The quasimorphism rot is extremal for $R^3LRL^2$ but not for $R^4LRL^2$. It is extremal for $R^2LRL^2RL$ but not for $R^3LRL^2RL$ or $R^4LRL^2RL$. It is extremal for $RLR^2L^2RLRL^2R^2L$ but not for $R^iLR^2L^2RLRL^2R^2L$ for 1 < i < 5. We refer to these examples colloquially as stuttering sequences of length 1, 2 and 3 respectively. We do not know of any examples of stuttering sequences of length > 3, but do not know any reason why such examples should not exist.

Acknowledgments
Danny Calegari was supported by NSF grant DMS 0707130. We would like to thank Benson Farb and Eric Rains for some useful conversations about this material.