Combinatorial and metric properties of Thompson's group T

We discuss metric and combinatorial properties of Thompson's group T, including normal forms for elements and unique tree pair diagram representatives. We relate these properties to those of Thompson's group F when possible, and highlight combinatorial differences between the two groups. We define a set of unique normal forms for elements of T arising from minimal factorizations of elements into natural pieces. We show that the number of carets in a reduced representative of an element of T estimates the word length, and that F is undistorted in T. We describe how to recognize torsion elements in T.


Introduction
Thompson's groups F , T and V are a remarkable family of infinite, finitely-presentable groups studied for their own properties as well as for their connections with questions in logic, homotopy theory, geometric group theory and the amenability of discrete groups.
Cannon, Floyd and Parry give an excellent introduction to these groups in [5]. These three groups can be viewed algebraically, combinatorially, or analytically. Algebraically, each has both finite and infinite presentations. Geometrically, an element in each group can be viewed as a tree pair diagram; that is, as a pair of finite binary rooted trees with the same number of leaves, with a numbering system pairing the leaves in the two trees. Analytically, an element of each group can be viewed as a piecewise-linear self map of the unit interval:
• in F as a piecewise-linear homeomorphism,
• in T as a homeomorphism of the unit interval with the endpoints identified, and thus of $S^1$,
• in V as a right-continuous bijection which is locally orientation preserving.
Thompson's group F in particular has been studied extensively. The group F has a standard infinite presentation in which every element has a unique normal form, and a standard two-generator finite presentation. Fordham [8] presented a method of computing the word length of w ∈ F with respect to the standard finite generating set directly from a tree pair diagram representing w. Regarding F as a diagram group, Guba [9] also obtained an effective geometric method for computing the word metric with respect to the standard finite generating set. Belk and Brown [1] have similar results which arise from viewing elements of F as forest diagrams.
Date: March 10, 2022. The first, second and fourth authors acknowledge support from NSF International Collaboration grant DMS-0305545 and are grateful for the hospitality of the Centre de Recerca Matemàtica.
In this paper, we discuss analogues for T of some properties of F, using all three of the descriptions of T: algebraic, geometric and analytic. We begin by describing unique normal forms for elements which arise from their reduced tree pair descriptions. We consider metrically how F is contained as a subgroup of T, and show that the number of carets in a reduced tree pair diagram representing w ∈ T estimates the word length of w with respect to a particular generating set. Thus F is quasi-isometrically embedded in T. Furthermore, we show that there are families of words in F which are isometrically embedded in T with respect to an alternate finite generating set. The groups T and V, unlike F, contain torsion elements, and we describe how to recognize these torsion elements from their tree pair diagrams. Finally, we show that every torsion element of T is conjugate to a power of a generator of T and that the subgroup of rotations in T is quasi-isometrically embedded.

2. Background on Thompson's groups F and T
2.1. Presentations and tree pair diagrams. Thompson's groups F and T both have representations as groups of piecewise-linear homeomorphisms. The group F is the group of orientation-preserving homeomorphisms of the interval [0, 1] in which each homeomorphism has only finitely many discontinuities of slope, called breakpoints, all slopes are powers of two, and the coordinates of all breakpoints are dyadic rationals. Similarly, the group T consists of orientation-preserving homeomorphisms of the circle $S^1$ satisfying the same conditions, where we represent the circle $S^1$ as the unit interval [0, 1] with the two endpoints identified.
Cannon, Floyd and Parry give an excellent introduction to Thompson's groups F, T and V in [5]. We refer the reader to this paper for full details on results mentioned in this section. Since more readers have some familiarity with F than with T, we first give a very brief review of the group F, and then a slightly more detailed review of T. Algebraically, F has well known infinite and finite presentations. With respect to the infinite presentation for F, group elements have simple normal forms which are unique. It is easy to see that F can be generated by $x_0$ and $x_1$, which form the standard finite generating set for F, and yield the finite presentation
A geometric representation for an element w in F is a tree pair diagram, as discussed in [5]. A tree pair diagram is a pair of finite rooted binary trees with the same number of leaves. By convention, the leaves of each tree are thought of as being numbered from 0 to n reading from left to right. A node of the tree together with its two downward directed edges is called a caret. The left side of the tree consists of the root caret, and all carets connected to the root by a path of left edges; the right side of the tree is defined analogously. A caret is called a left caret if its left leaf lies on the left side of the tree. A caret is called a right caret if it is not the root caret and its right leaf lies on the right side of the tree. All other carets are called interior. A caret is called exposed if it contains two leaves of the tree.
For w ∈ F, we write $w = (T_-, T_+)$ to express w as a tree pair diagram, and refer to $T_-$ as the source tree and $T_+$ as the target tree. These trees arise naturally from the interpretation of F as a group of homeomorphisms. Thinking of w as a homeomorphism of the unit interval, the source tree represents a subdivision of the domain into subintervals of width $1/2^n$ for varying values of n, and the target tree represents another such subdivision of the range. The homeomorphism then maps the i-th subinterval in the domain linearly to the i-th subinterval in the range.
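The passage from a tree to a dyadic subdivision, and from a tree pair to a piecewise-linear map, can be illustrated with a minimal sketch. It assumes a representation that does not appear in the text: each leaf is recorded by its address, a string of edge labels read from the root ('0' for a left edge, '1' for a right edge), and a tree is the left-to-right list of its leaf addresses. The function names and the sample pair are hypothetical.

```python
from fractions import Fraction

def leaf_intervals(leaves):
    """Return the dyadic interval (left, right) of each leaf address, in order."""
    intervals = []
    for addr in leaves:
        width = Fraction(1, 2 ** len(addr))
        # Reading the address as a binary numerator gives the left endpoint.
        left = Fraction(int(addr, 2), 2 ** len(addr)) if addr else Fraction(0)
        intervals.append((left, left + width))
    return intervals

def apply_tree_pair(source, target, x):
    """Evaluate at x the map sending the i-th source interval linearly
    onto the i-th target interval (an element of F)."""
    x = Fraction(x)
    pairs = zip(leaf_intervals(source), leaf_intervals(target))
    for (a, b), (c, d) in pairs:
        if a <= x <= b:
            slope = (d - c) / (b - a)  # ratio of dyadic widths: a power of two
            return c + slope * (x - a)
    raise ValueError("x must lie in [0, 1]")

# Hypothetical sample pair: both trees have two carets and three leaves.
source = ["0", "10", "11"]
target = ["00", "01", "1"]
print(apply_tree_pair(source, target, Fraction(1, 2)))
```

In this sample the three pieces have slopes 1/2, 1 and 2, and the breakpoints 1/2 and 3/4 are dyadic, as the definition of F requires.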
A tree pair diagram representing w in F is not unique. A new diagram can always be produced from a given tree pair diagram representing w simply by adding carets to the ith leaf of both trees. We impose a natural reduction condition: if w = (T − , T + ) and both trees contain a caret with two exposed leaves numbered i and i + 1, then we remove these carets, thus forming a representative for w with fewer carets and leaves. A tree pair diagram which admits no such reductions is called a reduced tree pair diagram, and any element of F is represented by a unique reduced tree pair diagram. When we write w = (T − , T + ) below, we are assuming that the tree pair diagram is reduced unless otherwise specified.
The group T also has both a finite and an infinite presentation. The infinite presentation is given by two families of generators, $\{x_i, i \ge 0\}$, the same generators as in the infinite presentation of F, and a family $\{c_i, i \ge 0\}$ of torsion elements, and the following relators:
This new family of generators $c_n$ (of order n + 2) is simple to describe. The generator $c_n$ corresponds to the homeomorphism of the circle obtained as follows. Both domain and range can be thought of as the unit interval with the endpoints identified. We subdivide the interval into n + 2 subintervals by successively halving the rightmost subinterval; in other words, by inserting endpoints at $\frac{1}{2}, \frac{3}{4}, \ldots, \frac{2^{n+1}-1}{2^{n+1}}$. Then the homeomorphism maps $[0, 1/2]$ linearly to $\left[\frac{2^n-1}{2^n}, \frac{2^{n+1}-1}{2^{n+1}}\right]$, and so on around each circle. For example, the element c corresponds to the homeomorphism of $S^1$ given by
Figure 1 shows the graphs of the homeomorphisms corresponding to $c_1$ and $c_2$.
Using the first three relators, we see that only the generators $x_0$, $x_1$ and $c_1$ are required to generate the group, since the other generators can be obtained from these three. In the following, we will use c to denote the generator $c_1$. The group T is finitely presented using the following relators, with respect to the finite generating set $\{x_0, x_1, c\}$:
As with Thompson's group F, we will frequently work with the more convenient infinite set of generators when constructing normal forms for elements and performing computations in the group. We will need to express elements with respect to a finite generating set when discussing word length. There are two natural finite generating sets for T, both extending the standard finite generating set for F. The first, and the one that we use primarily below, is the generating set $\{x_0, x_1, c_1\}$ used in the finite presentation above. In Section 5.2, for the purposes of counting carets carefully, we also use the generating set $\{x_0, x_1, c_0\}$, which has the advantage that the tree pair diagram for $c_0$ has only one caret, as opposed to $c_1$, which has two carets, at the expense of slightly more complicated relators.
Just as for F, tree pair diagrams serve as efficient representations for elements of T. However, since elements of T represent homeomorphisms of the circle rather than the interval, the tree pair diagram must also include a bijection between the leaves of the source tree and the leaves of the target tree to fully encode the homeomorphism. Since this bijection can at most cyclically shift the leaves, it is determined by the image of the leftmost leaf in the source tree. Since by convention this leaf in the source tree is already thought of as leaf 0, this information is recorded by writing a 0 under the image leaf in the target tree. Hence, for w ∈ T, a marked tree pair diagram representing w is a pair of finite rooted binary trees with the same number of leaves, together with a mark (the numeral 0) on one leaf of the second tree. As usual, we write $w = (T_-, T_+)$ to express w as a tree pair diagram, and refer to $T_-$ as the source tree and $T_+$ (the one with the mark) as the target tree. We remark that to extend this to V, since now the bijection of the subintervals may permute the order in any way, the marking required on the target tree to record the bijection consists of a number on every leaf of the target tree.
Just as for F, there are many possible tree pair diagrams for each element of T, which can be obtained by adding carets to the corresponding leaves in the source and target trees in the diagrams. However, when adding the carets, placement is guided by the marking. The leaves of the source tree are thought of as numbered from 0 to n reading from left to right, whereas the marking of the target tree specifies where leaf number 0 of that tree is, and the other leaves are numbered from 1 to n reading from left to right, cyclically wrapping back to the left once the rightmost leaf is reached. With this numbering in mind, carets can be added as before to leaf i of both trees. If i ≠ 0, the mark stays where it is. Otherwise, if i = 0, the mark on the new target tree is placed on the left leaf of the added caret. So for T, we have a similar reduction condition: if $w = (T_-, T_+)$ and both trees contain a caret with two exposed leaves numbered i and i + 1, then we remove these carets and renumber the leaves, moving the mark if needed, thus forming a representative for w with fewer carets and leaves.
A tree pair diagram which admits no such reductions is again called a reduced tree pair diagram, and any element of T is represented by a unique reduced tree pair diagram. In T as well as in F , when we write w = (T − , T + ) below, we are assuming that the tree pair diagram is reduced unless otherwise specified. Checking whether or not a tree pair diagram is reduced is slightly more difficult in T than in F . The process of checking for possible reductions is illustrated in Figure 2. A marked tree pair diagram for an element of T is shown on the top left of Figure 2. In the top right tree pair diagram of Figure  2, the underlying numbering of the leaves of both trees determined by the marking is written explicitly, revealing the reducible carets. The bottom tree pair diagram shows the resulting reduced diagram.
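The check for reducible carets described above can be sketched in code. This is a minimal illustration under assumptions that are not in the text: a tree is the left-to-right list of its leaf addresses ('0' for a left edge, '1' for a right edge), and the mark is stored as the left-to-right position of the marked leaf in the target tree. The function names are hypothetical.

```python
def is_caret(left, right):
    """True if the two leaf addresses are the left and right leaves of one caret."""
    return (len(left) == len(right) and left[:-1] == right[:-1]
            and left.endswith("0") and right.endswith("1"))

def reduce_marked_diagram(source, target, mark):
    """Remove reducible caret pairs until the marked diagram is reduced.

    Leaf i of the source is paired with the target leaf whose cyclic
    label is i, which sits at left-to-right position (mark + i) mod #leaves.
    """
    source, target = list(source), list(target)
    changed = True
    while changed:
        changed = False
        n_leaves = len(source)
        for i in range(n_leaves - 1):
            p = (mark + i) % n_leaves
            if p + 1 >= n_leaves:
                continue  # labels i and i+1 wrap around, so they are not one caret
            if is_caret(source[i], source[i + 1]) and is_caret(target[p], target[p + 1]):
                source[i:i + 2] = [source[i][:-1]]   # replace the caret by its parent leaf
                target[p:p + 2] = [target[p][:-1]]
                if p < mark:
                    mark -= 1  # a leaf to the left of the mark was removed
                changed = True
                break
    return source, target, mark

# An unreduced diagram for c_0: a caret was added to leaf 0 of both trees.
print(reduce_marked_diagram(["00", "01", "1"], ["0", "10", "11"], 1))
```

The sample call returns the single-caret diagram with the mark on the rightmost leaf of the target tree, which is the diagram of $c_0$ described in the text.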
Note that the torsion generators $c_i$ have particularly simple tree pair diagrams. In the diagram for $c_i$, both source and target trees consist of the root caret plus i right carets. The mark 0 is placed on the rightmost leaf of the target tree. Figure 3 shows the tree pair diagrams of the first three generators $c_1$, $c_2$, and $c_3$, together with a general $c_n$. The generator $c_0$ is merely a pair of single caret trees, with the mark on the rightmost leaf of the target tree.
Whether w ∈ F or w ∈ T , we denote the number of carets in either tree of a tree pair diagram representing w by N (w). When p is a word in the generators of F or T , then p represents an element w in either F or T , and we write N (p) interchangeably with N (w).

2.2. Group Multiplication in F and T. Group multiplication in F and T corresponds to composition of homeomorphisms, which we can interpret on the level of tree pair diagrams as well. First, we consider u, v ∈ F, where $u = (T_-, T_+)$ and $v = (S_-, S_+)$. To compute the tree pair diagram corresponding to the product vu, we create unreduced representatives $(T'_-, T'_+)$ and $(S'_-, S'_+)$ of the two elements in which $T'_+ = S'_-$. Then the product is represented by the possibly unreduced tree pair diagram $(T'_-, S'_+)$. The multiplication is written following the conventions on composition of homeomorphisms, so the product vu has as a source diagram that of u, and as a target diagram that of v. That is, the diagram on the left is the source of u and the diagram on the right is the target of v.
To multiply tree pair diagrams representing elements of T we follow a similar procedure. We let u, v ∈ T, where $u = (T_-, T_+)$ and $v = (S_-, S_+)$. To compute the tree pair diagram corresponding to the product vu, we create unreduced representatives $(T'_-, T'_+)$ and $(S'_-, S'_+)$ of the two elements in which $T'_+ = S'_-$ as trees. The product vu will be represented by the pair $(T'_-, S'_+)$ of trees. To decide which leaf in $S'_+$ to mark with the zero, we just note that it should be the leaf which is paired with the zero leaf in $T'_-$. To identify this leaf, we find the zero leaf in $T'_+$. Since $T'_+ = S'_-$ as trees, this leaf viewed as a leaf in $S'_-$ will be labelled m. Then the leaf labelled m in $S'_+$ will be the new zero leaf in the tree pair diagram $(T'_-, S'_+)$ for vu. Alternately, we can follow the composition in both pairs of trees to see how the leaves are paired. This newly constructed tree pair diagram will represent vu and is not necessarily reduced. For an example of this multiplication, see Figures 4, 5 and 6.
Figure 6. The tree pair diagram representing the product vu obtained from Figure 5. The dotted carets must be erased to find the reduced diagram.
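The multiplication procedure can be sketched by following the leaf pairing directly, which is the alternative mentioned at the end of the paragraph. The sketch assumes a representation that is not in the text: an element is stored as a dictionary sending each source leaf address to the address of the target leaf it is paired with ('0' for a left edge, '1' for a right edge), so that adding carets to paired leaves amounts to appending the same suffix to both addresses. All function names are hypothetical.

```python
def marked_diagram_to_map(source, target, mark):
    """Pair source leaf i with the target leaf at position (mark + i) mod #leaves."""
    n = len(source)
    return {source[i]: target[(mark + i) % n] for i in range(n)}

def compose(v, u):
    """Leaf pairing of the product vu (u is applied first, then v)."""
    product = {}
    for s, t in u.items():
        for s2, t2 in v.items():
            if s2.startswith(t):
                # v's source leaf lies below u's target leaf: add carets to u.
                product[s + s2[len(t):]] = t2
            elif t.startswith(s2):
                # u's target leaf lies below v's source leaf: add carets to v.
                product[s] = t2 + t[len(s2):]
    return product

def torsion_generator(n):
    """Diagram of c_n: all-right trees with n + 1 carets, mark on the rightmost leaf."""
    leaves = ["1" * k + "0" for k in range(n + 1)] + ["1" * (n + 1)]
    return marked_diagram_to_map(leaves, leaves, n + 1)

def is_identity(g):
    return all(s == t for s, t in g.items())

c0, c1 = torsion_generator(0), torsion_generator(1)
print(compose(c1, c0))
```

Composing $c_n$ with itself in this way returns the identity pairing after exactly n + 2 steps, in agreement with the stated order of $c_n$. The resulting pairing is in general unreduced.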

Words and diagrams
3.1. Normal forms and tree pair diagrams in F. With respect to the infinite presentation for F given above, every element of F has a unique normal form. Any w in F can be written in the form
$$w = x_{i_1}^{r_1} x_{i_2}^{r_2} \cdots x_{i_k}^{r_k} \, x_{j_l}^{-s_l} \cdots x_{j_2}^{-s_2} x_{j_1}^{-s_1},$$
where all exponents $r_a$ and $s_b$ are positive, $0 \le i_1 < i_2 < \cdots < i_k$ and $0 \le j_1 < j_2 < \cdots < j_l$. However, this expression is not unique. Uniqueness is guaranteed by the addition of the following condition: when both $x_i$ and $x_i^{-1}$ occur in the expression, so does at least one of $x_{i+1}$ or $x_{i+1}^{-1}$, as discussed by Brown and Geoghegan [2]. When we refer to elements of F in normal form, we mean this unique normal form.
If the normal form for w ∈ F contains no generators with negative exponents, we refer to w as a positive word and similarly, we say a normal form represents a negative word if there are no generators with positive exponents.
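We call any word which has the form $x_{i_1}^{r_1} \cdots x_{i_k}^{r_k} \, x_{j_l}^{-s_l} \cdots x_{j_1}^{-s_1}$, with $0 \le i_1 < \cdots < i_k$ and $0 \le j_1 < \cdots < j_l$, a word in pq form, where p is the positive part of the word and q the negative part. The normal form for an element of F is the shortest word among all words in pq form representing the given element.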
To any (not necessarily reduced) tree pair diagram $(T_-, T_+)$ for an element of F we may associate a word in pq form representing the element, using the leaf exponents in the target and source trees. When the leaves of a finite rooted binary tree are numbered from left to right, beginning with zero, the leaf exponent of leaf k is the integer length of the longest path consisting only of left edges of carets which originates at leaf k and does not reach the right side of the tree. A tree pair diagram then gives the word
$$x_{i_1}^{r_1} x_{i_2}^{r_2} \cdots x_{i_n}^{r_n} \, x_{j_m}^{-s_m} \cdots x_{j_2}^{-s_2} x_{j_1}^{-s_1}$$
precisely when leaf $i_k$ in $T_+$ has leaf exponent $r_k$, leaf $j_k$ in $T_-$ has leaf exponent $s_k$, and generators which do not appear in the word correspond to leaves with exponent zero. We think of this word as the pq factorization of the element given by the particular tree pair diagram. We call a tree an all-right tree if it consists of a root caret together with only right carets. Note that if we let R be the all-right tree with the same number of carets as $T_-$ or $T_+$, then $(T_-, R)$ is a diagram for the word q and $(R, T_+)$ is a diagram for the word p.
On the other hand, any word in pq form can be translated into a tree pair diagram. It can be obtained by taking diagrams for p (respectively q), which will have all-right source (respectively target) trees. Then, if one diagram has fewer carets, one adds right carets to its all-right tree, and a corresponding path of right carets to its other tree, to make both diagrams have exactly the same all-right tree. Furthermore, under this correspondence for F, reduced tree pair diagrams correspond exactly to normal forms. Figure 7 is an example of this correspondence, and more details can be found in [5,6,7].
If an exposed caret has leaves numbered i and i + 1, then leaf i + 1 must have leaf exponent zero, since it is a right leaf. If both trees in a tree pair diagram have exposed carets with leaves numbered i and i + 1, then the corresponding normal form, computed via leaf exponents, contains the generator $x_i$ to both positive and negative powers, but no instances of the generator $x_{i+1}$. This is precisely the situation when the normal form can be reduced by a relator of F. Thus the condition that the normal form is unique is exactly the condition that the tree pair diagram is reduced. This correspondence will be extended to elements of T in the next section.
Figure 7. Computing leaf exponents. The thick edges indicate edges which contribute to non-zero leaf exponents. If a leaf labelled i has $r_i$ thick edges (a path of $r_i$ left edges going up without reaching the right side of the tree) then the i-th leaf exponent is $r_i$ and the generator appearing in the normal form is $x_i^{r_i}$. The single tree T pictured is the target tree of the tree pair diagram (R, T), where R is the all-right tree with 12 leaves, and has leaf exponents 1, 0, 3, 0, 1, 0, 0, 0, 2, 0, 0 and 0 for the leaves 0 to 11 in order. The tree pair diagram (R, T) represents the element $x_0 x_2^3 x_4 x_8^2$.
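The leaf exponent computation can be sketched as follows. The sketch assumes a representation that is not in the text: a tree is the left-to-right list of its leaf addresses, with '0' for a left edge and '1' for a right edge, so that the right side of the tree consists of the root and the nodes whose address contains only '1'. The function names and the two small sample trees are hypothetical, and the trees are not those of any figure in the paper.

```python
def leaf_exponent(addr):
    """Length of the longest path of left edges going up from the leaf
    that does not reach the right side of the tree."""
    top = addr.rstrip("0")
    run = len(addr) - len(top)             # consecutive left edges above the leaf
    top_on_right_side = set(top) <= {"1"}  # the root (empty address) counts as well
    if run > 0 and top_on_right_side:
        run -= 1                           # the last edge would reach the right side
    return run

def positive_word(target_leaves):
    """Pairs (i, r_i) for the leaves with non-zero exponent, read as the
    positive word with factors x_i^{r_i} in increasing order of index."""
    exponents = [leaf_exponent(a) for a in target_leaves]
    return [(i, r) for i, r in enumerate(exponents) if r > 0]

print(positive_word(["000", "001", "01", "1"]))  # one factor, with index 0 and exponent 2
print(positive_word(["0", "100", "101", "11"]))  # one factor, with index 1 and exponent 1
```

Every leaf of an all-right tree has exponent zero under this computation, which matches the remark that right carets contribute no generators.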
3.2. Tree pair diagrams for elements of T . We now discuss the relationship between words in T and tree pair diagrams. This relationship is more complicated in T than it is in F . The representation of elements of T by marked tree pair diagrams suggests a way to decompose an element of T into a product of three elements: the positive and negative parts together with a torsion part in the middle, as described in [5].
Definition 3.1. Let the marked tree pair diagram $(T_-, T_+)$ represent g ∈ T. If $T_-$ and $T_+$ each have i + 1 carets, then we let R be the all-right tree which has i + 1 carets. We can write g as a product $p c_i^j q$, where:
(1) p, a positive word in the generators of F, is the normal form for the element of F with tree pair diagram $(R, T_+)$, ignoring the marking on $T_+$,
(2) $c_i^j$ is a cyclic permutation of the leaves of R, with $1 \le j < i + 2$, and
(3) q, a negative word, is the normal form for the element of F represented by $(T_-, R)$.
Figure 8. Three tree pair diagrams, labelled p, c and q, representing the factors of a word.
Then the word g = pc j i q is called the pcq factorization of g associated to the marked tree pair diagram (T − , T + ). In the special case where g ∈ F ⊂ T , the pcq factorization will just be the usual pq factorization, as we consider the c part of the word to be empty (or equivalently, we can allow the exponent j in the torsion part to be zero.) Figure 8 illustrates an example of an element of T decomposed in this way.
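The exponent j of the torsion part can be read off from the position of the mark. The following sketch derives it from the convention, stated in Section 2, that the diagram of $c_i$ has its mark on the rightmost leaf of an all-right tree with i + 2 leaves. It assumes that the mark is recorded as a left-to-right leaf position starting at 0. The function names are hypothetical.

```python
def mark_of_power(i, j):
    """Left-to-right position of the mark in the diagram (R, R) of c_i^j."""
    num_leaves = i + 2
    position = 0
    for _ in range(j):
        # One application of c_i moves every leaf i + 1 places cyclically.
        position = (position + i + 1) % num_leaves
    return position

def torsion_exponent(num_carets, mark):
    """Exponent j of c_i^j, with i = num_carets - 1, for a marked diagram
    whose target tree carries the mark at position `mark`.
    The value 0 corresponds to an element of F (empty c part)."""
    num_leaves = num_carets + 1
    return (num_leaves - mark) % num_leaves

# The diagram of c_1 has 2 carets and the mark on leaf position 2.
print(torsion_exponent(2, 2))
```

Under these assumptions, source leaf k is paired with the target leaf at position k - j modulo i + 2, which is the pairing of indices that appears in the reduction conditions of Section 4.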
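The following theorem follows from the existence of these decompositions, and an algebraic proof of this result is found in [5].
Theorem 3.2. Any element x ∈ T admits an expression of the form
$$x_{i_1}^{r_1} x_{i_2}^{r_2} \cdots x_{i_n}^{r_n} \, c_i^j \, x_{j_m}^{-s_m} \cdots x_{j_2}^{-s_2} x_{j_1}^{-s_1},$$
where $0 \le i_1 < i_2 < \cdots < i_n$ and $0 \le j_1 < j_2 < \cdots < j_m$ and either $1 \le j < i + 2$ or $c_i^j$ is not present.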
We refer to any word satisfying the hypotheses of Theorem 3.2 as a word in pcq form for an element of T (just as words of this form with no $c_i^j$ term are called words in pq form in the group F). Neither proof of the existence of pcq forms gives an easy explicit method for transforming a general word in the generators $x_i^{\pm 1}$, $c_i$ into pcq form without resorting to drawing tree pair diagrams, so we will outline an algebraic method below. We recall that the five types of relators we are using in T are:
by an application of relator of type (4). Now, several repeated applications of relator (2) allow the $x_n$ to switch with the $c_n^{m-1}$ to obtain the desired result. The second identity follows from the first by taking inverses; since $c_n$ has order n + 2, we can avoid negative exponents for the c.
We consider a word w ∈ T written in the generators $\{x_i, c_j\}$, and we describe explicitly an algebraic method of rewriting it in pcq form. The idea is to first combine occurrences of multiple $c_i$ generators into a power of a single one, and then to move the $x_n$ generators to the appropriate side of it. Consider first a subword of the original word w of type $c_n^m \, w(x_i) \, c_k^l$, where $w(x_i)$ is a word in the generators $x_n$ only, and which may possibly be empty. We will apply relators to reduce this subword to a word of the form $w_1(x_i) \, c_j^h \, w_2(x_i)$, where $w_1$ has only positive powers of $x_n$ generators and $w_2$ consists of only negative ones. By the relators of type (1), we can assume that $w(x_i)$ is of the form pq, that is, with all positive powers of generators on the left and in increasing order of index, and all negative ones on the right and in decreasing order of index. The goal is to move all the positive powers of $x_n$ generators to the left of $c_n^m$ and all negative ones to the right of $c_k^l$. Although these moves may change the indices and powers of the $c_i$ generators, they merely change a power of a single $c_j$ generator to another power of a different single $c_k$ generator.
To move all the positive powers of generators to the left of $c_n^m$, we only need to use relators of type (2), assuming the index of c is high enough. If it is not, by repeated applications of the first identity of the pumping lemma, the index can be increased arbitrarily, adding only positive powers of generators to the left of $c_n^m$. When the subindex is high enough, we can use relators of type (2) to move all positive powers of generators of w past $c_n^m$. We note that a relator of type (3) may allow us to eliminate an occurrence of $x_0$ to the immediate right of $c_n^m$. It may be necessary to combine the $c_i$ and $c_j$ generators obtained into a single term after this elimination of $x_0$, as we see in an example:
with the last equality being an application of the pumping lemma to $c_4$. We have achieved the goal of moving a positive power of a generator to the left of $c_n^m$.
Moving the negative powers of the $x_n$ generators to the right is comparable. Using the second identity in the pumping lemma, we can increase the index in $c_k^l$ as much as necessary to be able to move all negative powers of the $x_n$ generators in w to the right of $c_k^l$ using the relators (2) rewritten as $c_{n+1} x_{k+1}^{-1} = x_k^{-1} c_n$. After this process, we will have a word consisting of positive powers of $x_n$ generators, two powers of $c_i$ generators, and negative powers of $x_n$ generators. We now combine the powers of the two $c_i$ generators into a power of a single generator, by increasing the smaller index to reach the larger. To do this, if the smaller is on the left, we can use the first identity in the pumping lemma, and if it is on the right, we can use the second one. This way no $x_n$ generator will be added in between the two $c_i$ generators, and after they have the same index they can be combined into a power of a single generator. The positive powers of the $x_n$ generators now appear only to the left of the single power of the $c_i$ generator, and negative powers of $x_n$ generators only to the right.
After repeated applications of this process to subwords of the type cwc, we will have all occurrences of the $c_i$ generators combined into a power of a single one. Our original word is now of the type $w_1(x_i) \, c_n^m \, w_2(x_i)$, and $w_1$ and $w_2$ may again be assumed, after using relators (1), to be in pq form. We only need to move the positive powers of generators in $w_2$ to the left of $c_n^m$ and the negative powers of $x_n$ generators of $w_1$ to the right of $c_n^m$, still maintaining a power of a single $c_i$ generator in the middle. The first step of our algorithm, described above, shows precisely how to do this. Furthermore, if the pumping lemma is needed to move a positive power of a generator to the left, recall that new positive powers of generators may appear in the word, but only to the left of the power of the $c_i$ generator. Hence, after moving each positive power of a generator, all positive powers of generators in the word are to the left of $c_n^m$. We now move each negative power of a generator to the right, and notice that the only cost of this is to add more negative powers of $x_n$ generators to the right of $c_n^m$. When this is finished, the word has only positive powers of generators to the left of a power of a single c and negative ones to the right. Once the positive powers of the generators are together on the left side of the single c term, we can reorder them if necessary using relators of type (1), and similarly we can reorder the negative part as well.
We will work an example as an illustration. Consider the word
The process starts by trying to move the $x_3$ to the left of $c_1$. Since the index of $x_3$ exceeds the index of $c_1$, we cannot apply a relator of type (2) directly. Using the pumping lemma, we write $c_1 = x_1 c_2 = x_1 x_2 c_3$. Hence our word is now the following, and we can apply the relator of type (2) to $c_3 x_3$, obtaining:
The last step is to move the initial $x_0^{-1}$ to the right side, using several relators of type (2) to obtain $x_0^{-1} c_4^4 = c_5^4 x_4^{-1}$. There is no need this time to increase the index of $c_4^4$. The final result is
which is in pcq form.
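The identity used in the last step of the example can be checked by hand. The derivation below uses only relator (2) in the rewritten form $x_k^{-1} c_n = c_{n+1} x_{k+1}^{-1}$ quoted earlier, applied four times with n = 4 and k = 0, 1, 2, 3. It assumes, as the surrounding discussion suggests, that the relator is available whenever k < n.

```latex
\begin{align*}
x_0^{-1} c_4^{4}
  &= c_5 \, x_1^{-1} c_4^{3} \\
  &= c_5^{2} \, x_2^{-1} c_4^{2} \\
  &= c_5^{3} \, x_3^{-1} c_4 \\
  &= c_5^{4} \, x_4^{-1}.
\end{align*}
```

Each step raises the index of one factor $c_4$ to $c_5$ and raises the index of the inverse generator by one, so no change of the index of $c_4^4$ through the pumping lemma is needed before starting.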
The relationship between words in pq form and tree pair diagrams in F is different than the relationship between pcq forms and tree pair diagrams in T . In F , every tree pair diagram has a pq factorization associated to it, and any word in pq form is in fact the pq factorization associated to a (not necessarily unique) tree pair diagram. Given any word in F in pq form, then we can form a tree pair diagram for this element as follows. We consider reduced tree pair diagrams for p and q, and construct a tree pair diagram for the product pq as described in Section 2.2. The middle trees of the four trees involved in the product are all-right trees. The all-right trees in this decomposition may not have the same number of carets, so in forming the diagram for pq we simply enlarge the smaller of the two of these all-right trees (as well as the other tree in that diagram). Since only right carets are ever added during this process, all of whose leaves have leaf exponent zero, this results in a tree pair diagram whose pq factorization is precisely the word pq we began with.
In T, the correspondence between pcq factorizations and general pcq words is not as straightforward as in F. There is a difference between pcq factorization and pcq algebraic form. Though every element has a tree pair diagram corresponding to a pcq factorization associated to it, there are words in algebraic pcq form which are not the pcq factorizations associated to a tree pair diagram. The difficulty arises when the tree pair diagram for c does not have as many carets as those for p or q, as adding right carets to enlarge c appropriately necessitates adding generators to the normal forms for p and q, so the tree pair diagram one obtains by multiplying as in F will not necessarily have the original word as its factorization. For example, the word $x_1 c_1$ is in algebraic pcq form, yet it is not the pcq factorization associated to any tree pair diagram. There is a different representative for this element of T which is the pcq factorization associated to the reduced tree pair diagram for this group element: $x_1 c_2 x_1^{-1}$. We prefer to work with words which are pcq factorizations associated to tree pair diagrams, which will lead us to unique normal forms.
We can algebraically characterize the words of type pcq which are pcq factorizations associated to tree pair diagrams. The important condition is that the reduced tree pair diagram for c should have at least as many carets as those for p and q. We say that words in T with this property satisfy the factorization condition.
Theorem 3.4. For elements in T which are not in F, the word
$$x_{i_1}^{r_1} x_{i_2}^{r_2} \cdots x_{i_n}^{r_n} \, c_i^j \, x_{j_m}^{-s_m} \cdots x_{j_2}^{-s_2} x_{j_1}^{-s_1},$$
where $0 \le i_1 < i_2 < \cdots < i_n$, $0 \le j_1 < j_2 < \cdots < j_m$, and $1 \le j < i + 2$, is the pcq factorization associated to a tree pair diagram if and only if the number of carets in the reduced tree pair diagram for $c_i^j$ is greater than or equal to the number of carets in the reduced tree pair diagram for each of the words $x_{i_1}^{r_1} \cdots x_{i_n}^{r_n}$ and $x_{j_m}^{-s_m} \cdots x_{j_1}^{-s_1}$.
Proof. Given a tree pair diagram, by construction, the pcq factorization associated to it satisfies the factorization condition. Given a word that satisfies the factorization condition, we can easily construct the corresponding tree pair diagram as described above. The factorization condition ensures that to perform the multiplication p · c · q as tree pair diagrams, it is only necessary to (possibly) add carets to the tree pair diagrams for the words p and q. This will not alter the normal form, and thus the diagram constructed will indeed have the original word as its pcq factorization.
We can compute the number of carets of a reduced tree pair diagram for a word w ∈ F algebraically from the normal form of w, as described by Burillo, Cleary and Stein in [4].
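Proposition 3.5. If $w = x_{i_1}^{r_1} x_{i_2}^{r_2} \cdots x_{i_n}^{r_n}$ is a positive word in normal form, then the number of carets N(w) in either tree of a reduced tree diagram representing w is
$$N(w) = \max\{\, i_k + r_k + \cdots + r_n + 1 \,\}, \quad \text{for } k = 1, 2, \ldots, n.$$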
We can always decide algebraically whether w ∈ T , written in pcq form, corresponds to a tree pair diagram. We use Proposition 3.5 to count the carets for the positive and negative parts of the word. The number of carets in a tree pair diagram for c j i is equal to i + 1.
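The algebraic test just described can be sketched in a few lines. The sketch applies the caret formula stated above to the positive part, and to the inverse of the negative part, which is again a positive word and has the same number of carets because inverting an element swaps the two trees of its diagram. Words are given as lists of pairs (index, exponent) in increasing order of index. This encoding and the function names are hypothetical.

```python
def carets_positive(word):
    """N(w) for a positive normal form [(i_1, r_1), ..., (i_n, r_n)]:
    the maximum over k of i_k + r_k + ... + r_n + 1."""
    best, tail = 0, 0
    for index, exponent in reversed(word):
        tail += exponent                  # r_k + ... + r_n
        best = max(best, index + tail + 1)
    return best

def satisfies_factorization_condition(p, i, q_inverse):
    """Check that c_i^j, whose diagram has i + 1 carets, has at least as
    many carets as the diagrams for p and for q."""
    c_carets = i + 1
    return carets_positive(p) <= c_carets and carets_positive(q_inverse) <= c_carets

# The two words discussed in the text: x_1 c_1 fails, x_1 c_2 x_1^{-1} passes.
print(satisfies_factorization_condition([(1, 1)], 1, []))
print(satisfies_factorization_condition([(1, 1)], 2, [(1, 1)]))
```

For the positive word read from the 12-leaf tree of Figure 7, with exponents 1, 3, 1 and 2 at indices 0, 2, 4 and 8, the same function returns 11, which is the number of carets of a tree with 12 leaves.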

Normal forms in T
In T, we will declare the words in pcq form which are pcq factorizations associated to reduced diagrams to be the normal forms for elements of T, similar to the approach used in F. However, it is no longer true that these words cannot be shortened by applying a relator. As we saw with the normal form $x_1 c_2 x_1^{-1}$ in T, a word may be the shortest word representing an element which satisfies the factorization condition, yet there may be shorter words we can obtain by applying a relator which do not satisfy the factorization condition.
Thus, when algebraically characterizing the normal form for elements of T, we restrict ourselves to words of pcq form which satisfy the factorization condition, regardless of whether or not a relator may reduce the length of the word. We next specify algebraic conditions which characterize the pcq forms that correspond to normal forms, since we have given geometric conditions in Theorem 3.4.
Theorem 4.1. Let w be a pcq factorization for an element g ∈ T associated to a marked tree pair diagram in which each tree has i + 1 carets, where the c part of the word is $c_i^j$ with $1 \le j < i + 2$. A reduction of a pair of carets from the tree pair diagram occurs only if the word w satisfies one of the following conditions:
(1) The pair of generators $x_{k-j}$ and $x_k^{-1}$ appear, with $j \le k < i$, and neither of the two generators $x_{k-j+1}$ and $x_{k+1}^{-1}$ appear. The reduction corresponds to applying the relator
after applying relators from F in the p and q parts of the word, if necessary, to make $x_{k-j}$ and $x_k^{-1}$ adjacent to $c_i^j$.
Proof. Let g ∈ T be represented by a marked tree pair diagram $(T_-, T_+)$. If both trees have an exposed caret whose leaves are identically numbered, then we call that a reducible caret pair, as it must be removed in order to obtain the reduced tree pair diagram representing g. We now consider algebraic conditions corresponding to a reducible caret in a tree pair diagram.
In the tree pair diagram $(T_-, T_+)$ for $g \in T$, there are two ways of labelling the leaves in the target tree $T_+$. The first labelling corresponds to the order in which the intervals in the subdivisions determined by these trees are paired in the homeomorphism, and is called the cyclic labelling. The cyclic labelling gives the marked leaf in the target tree the number zero, and the other leaves are given increasing labels from left to right around the leaves of the tree. The second labelling ignores the marking and puts the leaves in increasing order from left to right, beginning with zero. The first labelling is used to determine which leaves in $T_-$ are paired with which leaves in $T_+$, and the second labelling is used in the computation of leaf exponents to determine the powers of the generators that appear in the word. Figure 9 shows the labellings for the four cases of the theorem.
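The two labellings can be illustrated with a minimal sketch. It assumes the marked leaf is given by its left-to-right position, and that "around the leaves of the tree" means the cyclic labels wrap from the rightmost leaf back to the leftmost one. The function name `leaf_labellings` is hypothetical and chosen only for this illustration.

```python
def leaf_labellings(num_leaves, marked):
    """Return (cyclic, linear) labels of the target tree's leaves.

    Both lists are indexed by left-to-right leaf position.
    `marked` is the left-to-right position of the marked leaf (assumed input).
    """
    if not 0 <= marked < num_leaves:
        raise ValueError("marked leaf must be a valid leaf position")
    # Second labelling: ignore the marking, count from the left starting at 0.
    linear = list(range(num_leaves))
    # First (cyclic) labelling: the marked leaf gets 0, labels increase
    # to the right and wrap around to the leftmost leaf.
    cyclic = [(pos - marked) % num_leaves for pos in range(num_leaves)]
    return cyclic, linear


cyclic, linear = leaf_labellings(5, 2)
print(cyclic)  # [3, 4, 0, 1, 2]
print(linear)  # [0, 1, 2, 3, 4]
```

When the marked leaf is the leftmost one, the two labellings coincide.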
Suppose that the tree pair diagram for g ∈ T is not reduced. The four cases above correspond to the following four possible locations of a reducible caret relative to the marked leaf in the target tree.
• Case (1) of the theorem corresponds to the case when the left leaf of the reducible caret is to the left of the marked leaf in $T_+$, but the reducible caret is not the rightmost caret in $T_-$.
• Case (2) corresponds to the special case when the reducible caret is a right caret in $T_-$, in which case necessarily its left leaf is to the left of the marked leaf in $T_+$. Leaf exponents from leaves of right carets will always be zero and thus right carets cannot contribute generators to the normal form. They may still result in an exposed reducible caret, which occurs exactly in this case, and the reduction will only affect the $q$ part of the normal form.
• Case (3) corresponds to the case when the left leaf of the reducible caret is either to the right of or coincides with the marked leaf in $T_+$, but the reducible caret is not the rightmost caret in $T_+$.
• Case (4) corresponds to the special case when the reducible caret is a right caret in $T_+$, in which case it cannot be to the left of the marked leaf in $T_+$. As in Case (2), the exposed caret in this case is a right caret and does not contribute a generator to the normal form, but may still be reduced. This cancellation affects only the $p$ part of the normal form.
To see that these are all the possibilities, we note that $k$, the number in the cyclic numbering of the left leaf of the reducible caret in $T_-$, achieves all possible values in the cases above:
• If $0 \le k < j - 2$ we are in case (3).
• If $k = j - 2$ we are in case (4).
• The case $k = j - 1$ is impossible because the leaves $j - 1$ and $j$ are at the two ends of the tree. With a cyclic ordering the last and first leaves do not form a caret.
• If $j \le k < i$ we are in case (1).
• If $k = i$ we are in case (2).
Figure 9 illustrates that these are all the possibilities.
The conditions in Theorem 4.1 together with the factorization condition algebraically characterize our normal forms. The normal forms for elements in F have already been characterized, so we restrict to elements not in F in our description.
Theorem 4.2. Any element $g \in T$ which is not an element of F admits an expression of the form pcq,
$$x_{i_1}^{r_1} x_{i_2}^{r_2} \cdots x_{i_n}^{r_n} \, c_i^j \, x_{j_m}^{-s_m} \cdots x_{j_2}^{-s_2} x_{j_1}^{-s_1},$$
where $0 \le i_1 < i_2 < \cdots < i_n$, $0 \le j_1 < j_2 < \cdots < j_m$, and $1 \le j < i + 2$. Among all the words in this form representing an element, there is a unique one satisfying the following conditions, which we call the normal form.
• The word satisfies the factorization condition, which we now state as $i + 1 \ge \max\{N(p), N(q)\}$.
• The word does not admit any reductions, and thus satisfies the following conditions:
- If the generators $x_{k-j}$ and $x_k^{-1}$ both appear, for $j \le k < i$, then one of the generators $x_{k-j+1}$ or $x_{k+1}^{-1}$ must appear as well.
- If the generator $x_{i-j}$ appears, then $x_{i-j+1}$ must appear too.
- If the generators $x_{k+i-j+2}$ and $x_k^{-1}$ both appear, for $0 \le k < j - 2$, then one of the generators $x_{k+i-j+1}$ or $x_{k+1}^{-1}$ must appear as well.
- If the generator $x_{j-2}^{-1}$ appears, then the generator $x_{j-1}^{-1}$ must also appear.
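The four conditions ruling out reductions can be checked mechanically. The sketch below is a literal transcription of the conditions as printed above, with the index conventions taken at face value. It assumes the word is supplied as two index sets (the indices of generators in the $p$ part and of inverse generators in the $q$ part) together with $i$ and $j$. The function name is hypothetical, and the factorization condition is not checked because it needs the caret counts $N(p)$ and $N(q)$ from Proposition 3.5.

```python
def admits_no_reduction(p_idx, q_idx, i, j):
    """Literal check of the four no-reduction conditions.

    p_idx: indices k such that x_k appears in the p part.
    q_idx: indices k such that x_k^{-1} appears in the q part.
    i, j:  parameters of the c part c_i^j, with 1 <= j < i + 2.
    """
    P, Q = set(p_idx), set(q_idx)
    # First condition: x_{k-j} and x_k^{-1} present, j <= k < i.
    for k in range(j, i):
        if (k - j) in P and k in Q:
            if (k - j + 1) not in P and (k + 1) not in Q:
                return False
    # Second condition: x_{i-j} present requires x_{i-j+1}.
    if (i - j) in P and (i - j + 1) not in P:
        return False
    # Third condition: x_{k+i-j+2} and x_k^{-1} present, 0 <= k < j - 2.
    for k in range(0, max(j - 2, 0)):
        if (k + i - j + 2) in P and k in Q:
            if (k + i - j + 1) not in P and (k + 1) not in Q:
                return False
    # Fourth condition: x_{j-2}^{-1} present requires x_{j-1}^{-1}.
    if (j - 2) in Q and (j - 1) not in Q:
        return False
    return True


# Hypothetical index data, chosen only to exercise the first condition.
print(admits_no_reduction({0}, {1}, 3, 1))     # False
print(admits_no_reduction({0}, {1, 2}, 3, 1))  # True
```

In the first call, $x_0$ and $x_1^{-1}$ appear with $k = 1$ but neither $x_1$ nor $x_2^{-1}$ does, so the first condition fails. Adding $x_2^{-1}$ in the second call removes the violation.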
Proof. We claim that the conditions above precisely describe the set of unique normal forms for T. A pcq word satisfying the factorization condition is the pcq factorization associated to a marked tree pair diagram. If, in addition, the word satisfies all four conditions ruling out reductions, then by Theorem 4.1 the diagram admits no reduction. It is therefore the unique reduced diagram, and hence the word is a normal form.
We remark that the Pumping Lemma together with the reductions in Theorem 4.1 gives an explicit way of algebraically transforming any word in the generators of T into a normal form. Given any word, we rewrite it in pcq form using the process described following the Pumping Lemma. If the resulting word does not satisfy the factorization condition, then we iterate the Pumping Lemma until we obtain a word for which the factorization condition is satisfied. The Pumping Lemma increases the number of carets for $c$ and the number of carets for one of the words $p$ and $q$. Once a word is obtained which satisfies the factorization condition, there must be a corresponding tree pair diagram for the element. Now, if the word satisfies any of the reduction conditions in Theorem 4.1, we apply the corresponding reductions successively using the relators described there. This method thus produces the unique normal form.

5. The word metric in T

5.1. Estimating the word metric. For metric questions concerning T, we must consider a finite generating set instead of the one used to obtain the normal form for elements. We now approximate the word length of an element of T with respect to the generating set $\{x_0, x_1, c_1\}$, using information contained in the normal form and the reduced tree pair diagram. These estimates are similar to the estimates of the word metric in F with respect to the generating set $\{x_0, x_1\}$ described in [3], [4].
Theorem 5.1. Let $w \in T$ have normal form
$$w = x_{i_1}^{r_1} \cdots x_{i_n}^{r_n} \, c_i^j \, x_{j_m}^{-s_m} \cdots x_{j_1}^{-s_1},$$
and define
$$D(w) = \sum_{k=1}^{n} r_k + \sum_{l=1}^{m} s_l + i_n + j_m + i.$$
Let $|w|$ denote the word length in T with respect to the generating set $\{x_0, x_1, c_1\}$. There exists a constant $C > 0$ so that for every $w \in T$,
$$\frac{D(w)}{C} \le |w| \le C \, D(w),$$
and similarly, for $N(w)$ the number of carets in the reduced tree pair diagram representing $w$,
$$\frac{N(w)}{C} \le |w| \le C \, N(w).$$

Proof. These inequalities follow from the correspondence between the normal form and the tree pair diagram for an element $w \in T$. It is clear, from Proposition 3.5, that $N(w) \ge \sum_{k=1}^{n} r_k$, $N(w) \ge \sum_{l=1}^{m} s_l$, $N(w) \ge i_n$, and $N(w) \ge j_m$. The inequality $N(w) \ge i$ is clear from the fact that $c_i$ has $i + 1$ carets. These inequalities prove that
$$D(w) \le 5 N(w).$$
We rewrite the generators $x_i$ and $c_j$ in terms of $x_0$, $x_1$ and $c_1$ and look at the lengths of the resulting words to obtain the inequality $|w| \le C \, D(w)$ for some constant $C > 0$. Combining the two inequalities above, we have
$$|w| \le C \, D(w) \le 5C \, N(w).$$
To obtain a lower bound on the word length, we consider the fact that the tree pair diagram for each generator has either two or three carets. If $u$ is a word in $x_0$, $x_1$ and $c_1$ with length $n$, then as these generators are multiplied together, each product may add at most 3 carets to the tree pair diagram. Thus the diagram for $u$ will have at most $3n$ carets. It then follows that $N(w) \le 3|w|$.
Combining this with the above inequality, we obtain the desired bounds.
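As a small numerical illustration of the lower bound in the proof, the inequality $N(w) \le 3|w|$ can be turned into a floor on word length read off from a caret count. The sketch below only performs this arithmetic. The function name is hypothetical, and the caret count is assumed to be supplied by the reader from a reduced tree pair diagram.

```python
def min_word_length_from_carets(num_carets):
    """Smallest integer L with num_carets <= 3 * L.

    Uses only the bound N(w) <= 3|w| from the proof, so any word
    representing an element with `num_carets` carets in its reduced
    diagram has length at least this value.
    """
    if num_carets < 0:
        raise ValueError("caret count must be non-negative")
    return (num_carets + 2) // 3  # integer ceiling of num_carets / 3


print(min_word_length_from_carets(10))  # 4
print(min_word_length_from_carets(9))   # 3
```

This gives only the lower half of the estimate; the upper half depends on the unspecified constant $C$ and cannot be evaluated from the caret count alone.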
We use Theorem 5.1 to show that the inclusion of F in T is a quasi-isometric embedding. This means that there are constants $K > 0$ and $C$ so that for any $w, z \in F$ we have
$$\frac{1}{K} d_F(w, z) - C \le d_T(w, z) \le K \, d_F(w, z) + C,$$
where $d_F$ and $d_T$ represent the word metric in F and T respectively, with regard to the generating set $\{x_0, x_1\}$ of F and $\{x_0, x_1, c_1\}$ of T.
When considering whether the inclusion of a finitely generated subgroup $H$ into a finitely generated group $G$ is a quasi-isometric embedding, we can equivalently show that the distortion function is bounded. The distortion function is defined by
$$h(r) = \frac{1}{r} \max\{|x|_H : x \in H \text{ and } |x|_G \le r\}.$$
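To make the definition concrete, here is a minimal sketch that evaluates the distortion function on a toy example that is not taken from this paper: the subgroup $H = 2\mathbb{Z}$ of $G = \mathbb{Z}$, with generating sets $\{2\}$ and $\{1\}$. Word lengths in this toy case are $|x|_H = |x|/2$ and $|x|_G = |x|$. All function names are hypothetical.

```python
from fractions import Fraction


def distortion(r, elements, len_H, len_G):
    """h(r) = (1/r) * max{ len_H(x) : x in elements, len_G(x) <= r }."""
    return Fraction(max(len_H(x) for x in elements if len_G(x) <= r), r)


def h(r):
    # Toy example: H = even integers inside G = Z.
    # The range covers every even integer of absolute value at most 2r,
    # which contains all elements of H in the ball of radius r in G.
    subgroup = range(-2 * r, 2 * r + 1, 2)
    return distortion(r, subgroup, lambda x: abs(x) // 2, abs)


print([str(h(r)) for r in (1, 2, 3, 4)])  # ['0', '1/2', '1/3', '1/2']
```

In this toy case $h(r) \le 1/2$ for every $r$, so the distortion function is bounded, which is the behaviour the argument for F inside T needs to establish.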
Word length in F is comparable to the number of carets in the reduced tree pair diagram representing the word, by Theorem 3 of [4] or more directly by Fordham's method [8]. This, combined with Theorem 5.1, shows that the distortion function is bounded, and thus proves the following corollary with respect to the generating sets $\{x_0, x_1\}$ and $\{x_0, x_1, c_1\}$, and hence with respect to all pairs of finite generating sets.

Corollary 5.2. The inclusion of F in T is a quasi-isometric embedding.

5.2. Comparing word length in F and T. Although Corollary 5.2 shows that F is quasi-isometrically embedded in T, in fact the word length of many elements of F does not change at all when these elements are considered as elements of T, with respect to natural finite generating sets. As an example of this phenomenon, we characterize one type of element of F whose word length is unchanged when viewed as an element of T, using the generating set $\{x_0, x_1\}$ for F and $\{x_0, x_1, c_0\}$ for T. These are elements $w \in F$ for which $N(w)$ exceeds the word length $|w|_F$. Fordham [8] computes $|w|_F$ by assigning an integer weight between zero and four to each pair of carets in the tree pair diagram representing $w$. In a given word there are at most two weights of zero. Here we investigate words in which most weights are one. Such words, for example, are represented by tree pair diagrams with no interior carets having right children.

This theorem is proved by taking a word in the generators of T, and analyzing how each generator changes the intermediate tree pair diagram as one builds up the final tree pair diagram for $w$. Carefully controlling the process allows one to obtain an upper bound on $N(w)$ in terms of the length of the word. If the word is actually shorter than $|w|_F$, then this bound, considered together with the lower bound given by the hypothesis, yields a contradiction.

We immediately obtain the following corollary, since $|x_0^n|_F = |x_1^n|_F = n$, while $N(x_0^n) = n + 1$ and $N(x_1^n) = n + 3$.
Corollary 5.4. The elements $x_0^n$ and $x_1^n$ have word length $n$ in both F and T with respect to the finite generating sets $\{x_0, x_1\}$ and $\{x_0, x_1, c_0\}$ respectively.

Torsion elements
Although the group F is torsion free, both T and V contain torsion elements. It is easy to construct torsion elements in T or V by choosing any binary tree $S$ and making any marked tree pair diagram with $S$ as both source and target tree. If the labelling of the target tree is the same as the labelling of the source tree, we get an unreduced representative of the identity; otherwise, we get a non-trivial torsion element. If this is an element of T, the tree pair diagram has a pcq factorization in which $q = p^{-1}$. In fact, any torsion element can be represented by such a tree pair diagram, though its reduced marked tree pair diagram may well not have the same source and target trees. This corresponds to the fact that, although the element has a pcq word with $q = p^{-1}$, its normal form may well not have this special balanced appearance.
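For elements of T built this way, the order can be computed directly. A diagram with the same tree $S$ as source and target, with $n$ leaves and the marking shifted by $s$ leaves, permutes the $n$ leaf intervals cyclically. Once a power returns every interval to itself, it acts on each interval as an increasing affine bijection and so is the identity. The sketch below computes the resulting order $n / \gcd(n, s)$ and cross-checks it by iterating the permutation. The function names and the input convention (a shift counted in leaves) are assumptions made for this illustration.

```python
from math import gcd


def rotation_order(num_leaves, shift):
    """Order of the element of T given by the same tree on both sides,
    with the cyclic labelling shifted by `shift` leaves."""
    shift %= num_leaves
    return num_leaves // gcd(num_leaves, shift)


def rotation_order_by_iteration(num_leaves, shift):
    """Same order, found by composing the leaf permutation with itself."""
    identity = list(range(num_leaves))
    current = identity[:]
    power = 0
    while True:
        current = [(leaf + shift) % num_leaves for leaf in current]
        power += 1
        if current == identity:
            return power


print(rotation_order(3, 1))  # 3
print(rotation_order(6, 4))  # 3
print(rotation_order(5, 0))  # 1
```

A shift of zero gives the unreduced representative of the identity described above, with order 1. Since a tree with $i + 1$ carets has $i + 2$ leaves, a shift by one leaf on such a tree gives an element of order $i + 2$.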