The Polish Attack on Enigma II: Zygalski sheets
I'll say a little bit about how the mathematics was applied to reading German messages, up until a drastic change in procedure by the Germans, and I'll say more about how the Poles recovered after this change...
Bill Casselman
University of British Columbia, Vancouver, Canada
Email Bill Casselman
Introduction
In an earlier Feature Column I attempted to explain some of the mathematics behind the remarkable success of Polish mathematicians in breaking German Enigma codes during the years 1932-1939. What I wrote about there was only about the very beginning of the Polish work, and I said nothing about subsequent events. In this Column I'll say a little bit about how the mathematics was applied to reading German messages, up until a drastic change in procedure by the Germans, and I'll say more about how the Poles recovered after this change. Finally, I'll add a few remarks at the end about the Polish contribution to British cryptanalysis.
My main sources for this Column are two essays by Marian Rejewski, one of the three Polish mathematicians who worked on decrypting Enigma code before the War, to be found in a valuable but somewhat rare book. In parts of the exposition I follow the highly readable account by Gordon Welchman of how he rediscovered Polish techniques while working in late 1939 at Bletchley Park.
The Enigma machine
The German messages I am concerned with are those encrypted on the Enigma machine, used by German military forces from about 1930 on. Before explaining how these messages were read by the Poles, I must recall how the machine worked.
It looked like a German typewriter, and had several visible components---a keyboard, lamps (both labeled A-Z), rotors, and some wires called in German-English "steckers" that were plugged into the front of the machine.
An overall view
A closer view of the top
The rotors uncovered
A message to be transmitted was typed one letter at a time on the keys, and when a letter was typed the corresponding letter of the encrypted message lit up. The electric current that caused this passed through the stecker wiring and the rotors. The rotors moved in a manner something like an odometer, and cycled around in $26$ steps. Schematically, this is the layout:
The current passed through the rotors from right to left, through the reflecting rotor at the far left, then back again along the same path. This guaranteed that if $A$ was encrypted to $B$ then $B$ was encrypted to $A$. This meant that the setup for reading messages was identical to that for sending them. Also, no letter was encrypted as itself. In other words, the transformation from plain text to cipher text at any moment can be written as a product of $13$ transpositions $(X\, Y)$, for example $$ (A\,B)(C\,D)(E\,F)(G\,H)(I\,J)(K\,L)(M\,N)(O\,P)(Q\,R)(S\,T)(U\,V)(W\,X)(Y\,Z) $$ In other words, it was a reciprocal transformation.
There were up until late 1938 three rotors, which could be removed and replaced in arbitrary order. The steckers could also be rearranged. These changes were made regularly, increasingly often as time went on.
There was another, more subtle, setting on the machine. The numerals on the rotors were on a ring around the rotor's core, and this could be rotated relative to the core (i.e. relative to the configuration of five points marked on the core in the photograph above). The connections linking one side of a rotor to the other were embedded in its core, not the ring, and the ring position had limited effect: (1) it determined what was visible to the operator through some small windows just above the rotors and (2) it also determined when the advance of one rotor caused the one to its left also to advance, because of some notches on it. (I tried to explain this in a previous Feature Column.) The first of these played an extremely important role in making life difficult for the Poles.
There is a simple equation involving the ring shift that is fundamental to Polish work. I have said that the encryption alphabet depended on the position of the core of the rotors, not the rings surrounding their cores. Suppose a rotor position is assigned to be the same as the ring position displayed in its window, when the ring is fixed so its knob aligns with the funny mark on the core, and that its window position is what appears in the window. Then the rotor position is equal to the sum of the window position and the ring shift (modulo 26).
In summary, at any given moment the machine implemented an encryption alphabet that depended on (1) the order in which the rotors were placed in the machine, (2) the exact position of the rotors in their cycles, and (3) the placement of the steckers in the board at the front of the machine. It was important to separate the encryption into two components, the encryption transformation $C$ due entirely to the rotors, and the transformation $S$ performed by the steckers. The complete transformation could be expressed as $S^{-1}CS$. Both $C$ and $S^{-1}CS$ were reciprocal transformations.
The encryption alphabet did not depend on the ring shift, which was normally the only part of the rotors visible. Instead, the major role of the ring was in setting up the machine for use.
Comments?
How the machines were used
Every Enigma belonged to a network of machines. All machines on a given network at any given moment had the same configuration. Configurations were changed regularly, all machines on the network in unison. This configuration consisted of (1) the rotor order, (2) the stecker arrangement, and (3) the ring shift (i.e. the rotational displacement of the rotor rings from the distinguished point on the cores).
Every message included at the beginning some unenciphered information about origin and time (which all by itself was useful to the British through traffic analysis), but then added on a short enciphered indicator segment, followed by the message itself (also, of course, enciphered). The indicator segment was what cryptographers called the key of the message. It was well known that if a large number of messages were encrypted with a rotary machine starting in the same state then the system could be broken easily, so every message had to be assigned an essentially unique key, and this had to be transmitted to recipients to enable them to read the message. How was this done?
In 1932, when the Poles first began to break into German messages, part of the configuration of the network included a Grundstellung ('basic position'), which was a sequence of three numbers (it is $4 \> 20 \> 19$ in the photographs above). When a message was to be sent: (1) the operator would rotate the rotors until these three numbers were visible. (2) He would next pick on his own responsibility a sequence of three letters, say $ABC$, and then type it into the machine twice. This was called the indicator of the message. The repetition was to make it unlikely that the indicator would be inaccurately transmitted. In principle, the indicator was to be randomly chosen--essentially unique for each message--but in practice, inevitably, it was chosen according to some relatively simple rule. Indicators like $AAA$, $ABC$, $QUE$, $QAY$ were common in the early days. (3) He would then turn the rotors to display his indicator, and type in the message itself.
The steps taken by the Poles to break into German code in these early days took place in several steps. I need some mathematical notation to describe how things went: let $\sigma_{i}$ be the cipher transformation at step $i$ in typing the message.
-
(1) From a sufficient number of messages on a net, the products $\sigma_{i+3}\sigma_{i}$ could be calculated almost trivially, taking advantage that the indicator was repeated.
-
(2) Because of operators' bad habits in choosing an indicator, each factor $\sigma_{i}$ and $\sigma_{i+3}$ could then be found (but with a fair amount of work).
-
(3) From this information together with data donated by the French, the rotor wirings for two of the rotors were deduced. (This was perhaps the most remarkable Polish accomplishment.)
-
(4) By some process not spelled out in Polish sources, the remaining rotor and the reflection rotor were also figured out.
-
(5) For each machine state one then got a characteristic cycle pattern for the products $\sigma_{i+3}\sigma_{i}$. This did not depend on the stecker arrangement, but only on the rotor configuration $C$. This is because the steckers only modify $C$ to its conjugate $S^{-1}CS$ and this characteristic pattern is conjugation-invariant. All one had to do, then, was to step through all $6 \times 26^3 = 105,576$ such configurations and make a catalogue that assigned to every possible initial segment of $6$ characters a list of possible rotor configurations associated to it. A characteristic did not determine $C$ completely, but it did reduce the number of possibilities drastically. Thus from enough messages on a given net in a given period we can make a relatively small list of possible $C$.
-
(6) Given $C$, one could then solve for the stecker permutation $S$, again applying some simple mathematics of permutations. Knowing plain and cipher text for indicators, we can find both $C$ and $S^{-1}CS$. This determines $S$ up to factors commuting with $C$, usually a small number. At the end, candidate values for all data to be guessed at could be tested on intercepted messages. (This is all explained well in the article by Chris Christensen mentioned at the end.)
This worked astonishingly well for several years, enabling the Poles to read almost all German code they intercepted. But on September 15, 1938 the procedure for making up the indicator segment was changed by the Germans. Whereas previously all machines on a network were assigned a fixed Grundstellung, now this too was chosen by the operator. So now he began every message with three letters he chose himself, which were in effect the Grundstellung of this particular message. He would type this in unencrypted text, and and then proceed as before. It might seem at first that this would make things a lot easier, but in fact the Poles' method up to this for reading German messages had relied completely on having at hand a large number of messages sent in the same key. This no longer happened. However, recovery by the Poles was remarkably and admirably quick.
Comments?
From September 15, 1938 to the beginning of the war
The Poles responded with two new techniques to the change in procedure. On the one hand, they invented machines they called bomby (with etymological origin unknown). On the other, they figured out how to make up certain graphical devices they called perforated sheets to shorten some potentially very nasty computations. The British eventually adopted these, for a short time, but called them Jeffreys sheets after the British mathematician who was responsible for producing them at Bletchley Park
Both the old and new routines depended heavily on the fact that the indicator triple chosen by the operator was enciphered and repeated. The old scheme had been able to figure out what the initial encryption substitutions were, and from this start Marian Rejewski had been able to deduce--miraculously--the wiring of the rotors. As a consequence, one now knew those wirings, and the Poles had in fact by 1938 assembled several simulated Enigma machines.
The new scheme used a fragment of this reasoning. Whereas the old scheme had largely used only the first $6$ enciphered latters in a message, the new one used only the first nine--the 'local Grundstellung' followed by the repeated indicator. It depended on the not infrequent occurrence of what the British later on called imaginatively females among the indicators (but for which the Poles had no name).
What is a female indicator? Recall that the first enciphered letters amount to the indicator enciphered twice. A female is an indicator in which one of the repeated characters is also repeated in the enciphered text. In the following examples, the rows marked by $\bullet$ are females:
In the third row $F E N \; A R S U R F$, for example, the $R$ is repeated in positions $2$ and $5$ od the indicator segment. What is the significance of this? Well, each Enigma alphabet is a product of two-cycles. If the indicator is $XYZ$, then the second encryption $\sigma_{1}$ swaps $Y$ and $R$, the fifth $\sigma_{5}$ also swaps $Y$ and $R$. That means that the product $\sigma_{5}\sigma_{1}$ takes $Y$ and $R$ to themselves. In other words, females tell you about certain fixed characters of products Enigma substitutions.
In Rejewski's original break into Enigma (in 1932), what was crucial was that the permutations $\sigma_{i}\sigma_{i+3}$ could be computed completely, and from them in turn the factors $\sigma_{i}$, $\sigma_{i+3}$. That depended on the existence of enough messages sent with a fixed Grundstellung. This is no longer possible, but it turns out that having enough females at hand together with the 'local Grundstellungen' is a satisfactory replacement for the lost data.
What information can be obtained from one female? Say $F E N \; A R S U R F$? What we want to know is the state of the machine when this was sent: (1) the rotor order, (2) the position of the rotors, and (3) the ring shifts. But the initial $FEN$ relates (2) to (3).
So we now ask, what can the female $F E N \; A R S U R F$ tell us about (1) and (2)? Well, we can scan through all possible rotor orders and rotor positions and calculate for each pair what possible females exist. There are $6 \times 26^3 = 105,456$ such pairs, so this is not a trivial task. It took the Poles a very long time to finish it, and just when they had finished the Germans made a further change that made their task almost useless--the Germans added two new rotors to the collection, so instead of a list of $6 \times 26^3$ they required a list of $60 \times 26^3$ (because $60$ is the number of ordered triples chosen from $5$). The new task was in fact completed only by the British after the Poles explained to them what they had been doing.
Comments?
The sheets
The task at hand was to produce a list of females for every possible choice of rotor order and rotor positions. The idea was that if one was given several (roughly, a dozen) initial female segments of nine characters, one could read through the lists eliminating more and more impossible configurations until one had only a very small number left.
There were several extremely clever aspects of the way in which the list was made and used. In trying to explain this, I follow Rejewski's account (pages 179-181 of On the 100th anniversary ... in the second of two memoirs), and also the account by Gordon Welchman in Chapter 4 of his book. Welchman's account is relevant because he rediscovered the Polish technique in the first few months he worked at Bletchley Park, in late 1940.
The lists were put onto sheets of cardboard. There were $6 \times 26$ sheets, each storing $26 \times 26$ bits of data. These were organized into series of $26$ sheets each. For technical reasons, each was blocked off into $51 \times 51$ squares, rather than $26 \times 26$, but this larger area just repeated the basic $26 \times 26$ pattern. The alphabet was laid out across the top, bottom, and sides of each sheet. Thus each possible rotor configuration corresponded to one of the $26 \times 26$ squares, and a hole was punched in all replicas of this square if it was a female for that configuration. In the process of looking at the list of females in a given network on a given day, sheets were laid out on top of each other, aligned in a way determined by the initial $3$ characters of the female. As more and more sheets were added, the number of holes through which light was visible decreased, and if one was reasonably lucky at the end was left exactly one possible rotor configuration still lit.
Each sheet corresponded to one position of the left rotor, and the square on this sheet at position $(i,j)$ corresponded to the positions of the middle and right rotors. Here is a sample sheet (with females in white to indicate transparency):
It was not necessary to make up separate sheets for other females $2,5$ or $3,6$---these can be taken care of by a shift in reading rows and columns. Also, one only needed to make the set of sheets for Grundstellung $AAA$ (i.e $0\,0\,0$), because other Grundstellungen are also taken care of by shifts. This is explained well by Welchman. These shifts are all explained by the equation rotor position = window position + ring shift.
The percentage of rotor configurations with a female at any positions $n$ and $n+3$ is about $40\%$ (essentially the same as if they were randomly chosen reciprocal substitutions). Thus the existence of one of these in a set of messages eliminates about $60\%$ of the possibilities. More females keep on cutting down this factor, so if we can get a reasonable number of females (during a period in which rotor order and ring shift are fixed), we can reduce drastically the number of possibilities. For example, if we find $12$ females, we can reduce the probable number to about $1$ or $2$. This tells us the probable rotor configuration, and puts us in exactly the same situation as the Poles were before September 15, 1938. Deducing the stecker arrangement from this point was now a familiar and solved problem.
Comments?
Final remarks
I have left out here a whole class of techniques and tools used by the Poles, particularly what Rejewski calls the grid method and the bomby they constructed. These would require another story.
We have seen that in September 1938 the Germans made a change of procedure that was not too difficult foir the Poles to recover from. But in December 1938 they introduced two new rotors from which the three actually placed in the machine were chosen. This threw the Poles off badly since, for example, they now had to allow for $60$ rotor orders instaed of $6$. They started to deal with the workload forced on them, since after all the change was one of quantity not quality, but events overtook their efforts.
Just before the Germans invaded Poland, the Poles met with the French and British in a suburb of Warsaw. They explained what they had been doing since 1932, and handed over two Enigma replicas they had built as well as much documentation. Until much later, these were the only Enigma machines the British possessed. The intriguing question is, among the documentation, what was new to the British? This has been much discussed for many years, but there is not yet a definitive and non-controversial answer. It is certainly clear that the British did not know at this time what the internal structure of a military Enigma was, as both Britsuh and Poles testify. But why not? What, exactly, was the obstacle? Rejewski and British witnesses testify that Knox was astonished to learn that the connections between the key board and the 'entry rotor' (the diagonal) was simply the identity transformation. This was in contrast to commercial models of the Enigma, whose transformation was according to the German typewriter keyboard. It is widely asserted that this was the only obstacle to British success---very recently by Mavis Batey in her biography of Dilwyn Knox, who was in 1938-1939 working on Enigma intercepts. She says (at the bottom of p. xi "Dilly now only needs to know the diagonal ... in order to break the German Army and Air Force Enigma." On the following page, she adds to this that Knox sent back to Britain the information that the correct transformation was the identity, and that "the wheel wiring is broken within two hours." I am not aware that there is good evidence for either of these claims, and there is much against them.
Where to learn more
-
Marian Rejewski, Mathematical solution of the Enigma cipher, in Number 1, Volume 6 of Cryptologia (1982).
-
Chris Christensen, Polish mathematicians finding patterns in Enigma messages, in Number 4, Volume 80 of Mathematics Magazine (2007).
-
Mavis Batey, Dilly: the man who broke Enigmas, Biteback, 2009. Mavis Batey worked with Knox at Bletchley Park, and was herself very skillful at decoding, but she was not present in the summer of 1939, and her account of Knox's accomplishments in that period does not seem so reliable. This contains as appendices two important essays by Frank Carter on 'Rodding' and 'Buttoning-up' which are otherwise relatively inaccessible. These two techniques seem to be Dilly Knox's main contributions to cryptanalysis in World War II. She died only recently (in November 2013) and might have been the last BP crypotographer alive.
-
Bill Casselman, Marian Rejewski and the First Break into Enigma (the first feature column on this topic).
-
Gordon Welchman, The Hut 6 story (first edition), McGraw-Hill, 1982.
-
On the 100th anniversary of Marian Rejewski's birth, Perzeglad Historyczno-Wojskowy, Warszawa, 2005.
This contains two very valuable memoirs by Rejewski in parallel Polish and English translation. (The English translation has a few errors, however, and there were also errors added in copying Rejewski's mathematics.) The Polish half can be found at the Polish site spybooks, which also has other, untranslated, memoirs by Rejewski. He is an extraordinary clear writer, and these deserve to be translated.
-
Marian Rejewski: living with the Enigma secret, published by the Bydgoszcz City Council, 2005.
A collection of essays, mostly in English, about Rejewski. Only 1500 copies were printed. Both this and the previous book are probably very difficult to obtain.
Bill Casselman
University of British Columbia, Vancouver, Canada
Email Bill Casselman