PDFLINK |

# Proofs Without Words

Communicated by *Notices* Associate Editor Steven Sam

Ecce! .

In this ‘proof,’ terms in the equation correspond to objects in the figure, and the identity is ‘proved’ by appealing to geometric symmetries of the figure. Symmetries mean a group acting on suitable objects in some space , .

Many proofs without words take the following form: an identity is ‘proved’ by exhibiting an object in ‘representing’ which may be cut into pieces which can be rearranged (i.e., translated by elements of the group , to form a new object ) ‘representing’ Such proofs often work best when . and take values in an abelian group (which need bear no direct relation to and the pieces into which ), and are decomposed also represent elements in The group . should be abelian because in a (planar) figure composed of several pieces there is no obvious order in which to ‘multiply’ the pieces. Pictorial identities in such groups are called *scissors congruences*. An example is the Pythagorean identity for the sides of a right-angled triangle; see Figure 2.

In this example is (where the areas , and of the three squares take values) and is the group of isometries of the (Euclidean) plane.

Everyone knows the ‘proof without words’ for the commutative law for multiplication. There is another commutative law which is a bit trickier to see, one involving *information*. Let’s play twenty questions, and to save time let’s reduce the play to two questions. Let’s further assume that the questions allow only yes-no answers. You think of an object drawn from some class of possible objects and I ask two yes-no questions and Whatever we may mean by ‘information,’ it stands to reason that the amount of information I obtain from asking my questions does not depend on the order I ask them: if . denotes the information I gain from asking a series of questions, and means I first ask and then then I should have ,

On the other hand, I can compute another way. Evidently is equal to (the information I get by asking plus the information I get from ) *conditioned on having already obtained an answer to * (this is the *chain rule* for information). This second quantity is denoted and now my commutative law takes the form ,

Our goal is to give a ‘proof without words’ for this equation.

To translate this into more traditional mathematics, is a space with a probability measure on it and is a random variable. For concreteness let’s take to be the unit interval and to be a random number drawn from this interval. A question about amounts to choosing some (measurable) subset of and asking whether is in this subset or not. Again, for concreteness, let’s take to be the question ‘is and ?’ to be the question ‘is for some fixed ?’ .

*Entropy* is a numerical measure of information; it is given by a (rather opaque) formula

which we write as a function of since it depends only on One heuristic way to think about this formula is to observe that probabilities of independent events multiply, whereas we want to think of information as .*additive*. Thus we define the entropy of an event as the log of its probability, and then the entropy of a question is the *expectation* of the entropy of an answer to that question. The conditional information is either zero if (because in that case we already know and otherwise is equal to ) because if we already know that the question ‘is , has a ‘no’ answer with probability ?’ Thus our entropy equation becomes .

and this is the form in which we shall give a proof without words.

OK. To present our proof without words we need three things: a class of geometric objects which will represent allowable terms in our equation, a group of symmetries so that geometric objects that differ by an element of will be considered to represent ‘the same’ term, and a list of allowable dissections of our objects that will represent addition of terms in the equation.

The objects in our figure will be certain plane polygons assembled from pieces called * -trapezoids*. A *trapezoid* is a quadrilateral with two parallel sides. A trapezoid is *horizontal* if the two parallel sides are horizontal. Unless it is a parallelogram, the other two sides, when extended, intersect at a point. An * -trapezoid* is a horizontal trapezoid whose non-horizontal sides intersect at a point on the (for simplicity we’ll only consider -axis which lie on the positive side of the -trapezoids -axis).

What about the group It is generated by three kinds of transformations which permute ? -trapezoids:

- (1)
horizontal translations for ;

- (2)
horizontal shears for ;

- (3)
dilations for .

Translations and dilations together generate the affine group of the line Shears commute with both translations and dilations; thus . Figure .3 shows four different which are congruent under the -trapezoids -action.

Finally we must say what sorts of ‘dissections’ are permitted; this is the geometric avatar of ‘addition’ in the abelian group where information will take its values. We shall define an abelian group generated by classes of -equivalence modulo the relation -trapezoids whenever is an decomposed into two -trapezoid -trapezoids and by a straight line of one of the following two sorts:

- (1)
(horizontal cut): is any horizontal line; or

- (2)
(vertical cut): is any non-horizontal line concurrent with the non-horizontal edges of .

These two decompositions are illustrated in Figure 4.

It turns out that as an abelian group, is isomorphic to where the first factor is with its additive structure, and the second factor is with its multiplicative structure. To see this observe first that every may be scaled by a dilation so that its top edge has height -trapezoid and then it is determined up to the , by two numbers—the width -action of the top edge and the height of the bottom edge. Let’s denote this class of -equivalence by -trapezoid A horizontal cut decomposes . into a vertical cut decomposes ; into Finally, . for any (positive) integer as can be seen by repeated shears, dilations, and the addition laws. ,

Identifying with shows how to make into an -module: acts by (scalar) multiplication on the left factor. In other words, for any and any we define Geometrically this is achieved by stretching an . in the horizontal direction by a factor of -trapezoid (or, what is equivalent as classes in stretching it in the vertical direction by a factor of , ).

Where does entropy come in? To see this we must take an unexpected detour into hyperbolic geometry. Poincaré’s *upper half-space model* of the hyperbolic plane is the subset consisting of points where except that at points of height , we scale the Euclidean metric infinitesimally by the factor .

The action of on the upper half-space does not preserve the hyperbolic metric; however it *does* preserve the hyperbolic *area form*. It follows that *hyperbolic area* defines a (surjective) homomorphism from to .

The connection to entropy becomes clearer once we compute that the hyperbolic area of is To see this, observe that . has a representative of its class with vertices at -equivalence Then we may compute .

In particular, is the area of the expression and the entropy equation will follow from an analogous identity in , :

At long last we may give the proof of this identity without words; this is Figure 5. On the left is in blue and is in yellow while on the right is in blue and is in yellow. These are evidently equidecomposable in .

One may think of this picture in several ways. The map is a diffeomorphism from the upper half space to if we think of ; as with its tautological symplectic form then , is the hyperbolic area form and the group acts by symplectomorphisms.

In fact, one may reinterpret Figure 5 as a scissors congruence in a somewhat different group. In place of the group one can consider the group generated by horizontal translations, dilations, and reflection in the -axis and then we may define ; to be the abelian group generated by classes of -equivalence modulo horizontal or vertical cuts. Now -trapezoids really is a group of hyperbolic isometries (it is the ‘parabolic’ subgroup of hyperbolic isometries fixing a point at infinity) and is a rather interesting refinement of The decomposition in Figure .5 proves a valid identity in which is now harder to interpret because of the more complicated structure of as an abelian group.

The space spanned by formal symbols -vector for modulo the entropy equation is denoted and is a kind of ‘additive linearization’ of the ,*Bloch group*, which arises in the geometry of 3-dimensional hyperbolic polyhedra. There is a short exact sequence called the *Cathelineau complex*

where is the (additive) group of Kähler differentials of over After our identification of . with the first map is , and the second map is .

This sequence is closely related to the famous ‘Dehn–Sydler–Jessen’ sequence which encodes (among other things) the fact that volume and the Dehn invariant together are a complete invariant of scissors congruence for 3-dimensional *Euclidean* polyhedra. What is the deeper connection between 3-dimensional polyhedra, hyperbolic and symplectic geometry, Kähler differentials, and the chain rule for conditional entropy? I am at a loss for words. -trapezoids,

## Author’s Note

Cathelineau’s paper (which contains much more than is discussed here) is “Remarques sur les différentielles des polylogarithmes uniformes” and appeared in Ann. Inst. Fourier (Grenoble) **46** (1996), no. 5, 1327–1347. I learned about this paper, and its connection to the Dehn–Sydler–Jessen theorem, from Daniil Rudenko. I’d like to thank Rich Schwarz, Amie Wilkinson and the anonymous referee for comments on earlier versions of this note.

## Credits

Figure 1, Figures 3–5, and author photo are courtesy of Danny Calegari.

Figure 2 is courtesy of Scott Sheffield.