Mathematics and Internet Security
Posted April 2006.
When money is transferred electronically, when an email is sent, when a purchase is made online, the users of such systems want to know that the transactions went as planned and were not "hijacked"...
When a transit strike paralyzed the New York metropolitan area in December, 2005, just a few days before the end-of-the-year holidays, many New Yorkers chose to shop over the Internet rather than brave the hours of delays, long walks, and lack of available cabs to carry out in-the-store shopping. However, some people are still "spooked" by making purchases online, fearing for the security of their transactions.
Mathematics is doing its best to address these concerns, with the goal of making online transactions totally secure and making the email environment an unalloyed pleasurable addiction. The mathematical tools which are making all this possible are ones that in previous times some thought inapplicable! New techniques are being found for manufacturing faster and smaller chips, which affects the cat-and-mouse game between those who try to create a secure Internet environment and those who want to steal money without standing on a long line in a bank.
(An illustration of a new lithography technique being developed by NIST for the manufacture of better and faster chips.
Photo courtesy of the National Institute of Standards and Technology.)
The Internet draws heavily on a wide variety of mathematical tools, ranging from data compression and error correction techniques to methods for routing messages and handling security issues. In honor of Mathematics Awareness Month, whose theme for 2006 is Mathematics and Internet Security, a few of the many issues involved will be highlighted here.
A military commander wants some assurance that the information sent to field commanders does not fall into the hands of opponents. Hence, written communications which can be easily read if intercepted by an enemy are dangerous. (Asking the messenger to memorize secret messages is not practical, and if one can believe the spy and counter-terrorism thrillers currently on TV, not secure.) Julius Caesar is often credited with one of the earlier attempts at using a cryptological system with a mathematical flavor to disguise messages. It is claimed that he used a system in which each letter of the alphabet in a "plaintext," the original message, is replaced by the next letter of the alphabet, with the last letter of the alphabet cycling around to be represented by the first letter of the alphabet. Thus, the phrase Caesar Cipher would be replaced by Dbftbs Djqifs. Coming across a message such as this, one is faced with the tremendous range of possible systems that might have been used to disguise the original message. It might confuse the "enemy" for a while. Within the range of what today have come to be called Caesar Ciphers, one could shift the replacement alphabet by r places, rather than by 1 place (r = 1) as in the example above. When r = 5 the phrase caesar cipher becomes hfjxfw hnumjw.
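The shift described above can be sketched in a few lines of Python. This is only an illustration: the function name `caesar_shift` is made up for this sketch, and it assumes the 26-letter English alphabet, leaving spaces and punctuation untouched.

```python
import string

def caesar_shift(text: str, r: int) -> str:
    """Shift each letter of text forward by r places, wrapping around after z."""
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + r) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + r) % 26 + ord("A")))
        else:
            result.append(ch)  # spaces and punctuation are left unchanged
    return "".join(result)

print(caesar_shift("Caesar Cipher", 1))   # Dbftbs Djqifs
print(caesar_shift("caesar cipher", 5))   # hfjxfw hnumjw
print(caesar_shift("Dbftbs Djqifs", -1))  # Caesar Cipher (shifting back decodes)
```

Decoding is the same operation with the shift reversed, which is why knowing r is all that is needed to read the message.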
However, if a decoder hits on the idea that the way the plaintext is being disguised is to shift each letter by the same number of positions, it is not very much work, with the English alphabet of 26 letters, to try them all. This simple example already shows the interesting connection between "complexity" issues and security, whether it be Internet use of cryptography or military use. In some situations slowing down one's opponent is good enough. If it takes an hour to decode a message whose information content is valueless after an hour, then the coding system has done its job. However, if a message is decoded, then the next time a similar message is intercepted, the time to recover the hidden information could go down from an hour to 3 minutes. Then I must keep finding ways to stay a step ahead of my "enemy." In this discussion I will use the words code and cipher and decode and decipher interchangeably. However, usually a cipher refers to replacing a symbol of a plaintext alphabet by another single symbol from some other alphabet or the same alphabet. By contrast, a code refers to replacing blocks of symbols in the plaintext, by another block of symbols.
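To see how little work "trying them all" is, here is a minimal Python sketch of the exhaustive search over shifts. The helper names are hypothetical, and the sketch assumes a human (or a dictionary check, not shown) picks out the candidate that reads as English.

```python
import string

def caesar_shift(text: str, r: int) -> str:
    """Shift each letter of text forward by r places (negative r shifts back)."""
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + r) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + r) % 26 + ord("A")))
        else:
            result.append(ch)
    return "".join(result)

def all_shifts(ciphertext: str) -> list:
    """Return (r, candidate plaintext) for every possible shift r = 0..25."""
    return [(r, caesar_shift(ciphertext, -r)) for r in range(26)]

# Only 26 candidates need to be inspected; the readable one reveals r.
for r, candidate in all_shifts("hfjxfw hnumjw"):
    print(r, candidate)
```

Among the 26 printed lines, the one for r = 5 reads "caesar cipher", so the search space is far too small to slow down a determined reader for long.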
Another simple idea is the use of a polyalphabetic cipher, where the alphabet used to encode the plaintext changes with each letter in accordance with some key. Each plaintext letter is matched to a letter in the key, and the key letter determines which alphabet is used to encode that plaintext letter. This idea was pioneered by Leone Alberti (1404-1472), who was also a pioneer of projective geometry. It is tempting to believe that such a "complex" system would be unbreakable. However, if the key length is short and there is lots of ciphertext available using the same system, then statistical methods can be used to break the cipher. If a key is generated at random, is as long as the message, and is used only once, the so-called one-time pad, then the cipher is not breakable. However, key exchange and the generation of large amounts of random key present a significant problem for the volume of communications that we want to secure in modern times.
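One common way to realize a keyed polyalphabetic cipher is the scheme usually named after Vigenère, in which each key letter selects a Caesar shift. The sketch below uses that scheme as an assumption (it is not Alberti's original device), handles only the capital letters A to Z, and uses a made-up message and key. The same function also serves for a one-time pad when the key is random and as long as the message.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase

def polyalphabetic(text: str, key: str, decrypt: bool = False) -> str:
    """Shift the i-th letter of text by the amount given by the i-th key letter.

    The key is repeated as often as needed. Only the letters A-Z are handled.
    """
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text):
        shift = ALPHABET.index(key[i % len(key)])
        out.append(ALPHABET[(ALPHABET.index(ch) + sign * shift) % 26])
    return "".join(out)

message = "ATTACKATDAWN"  # illustrative plaintext

# Short repeated key: the same shifts recur every len(key) letters, which is
# the regularity that statistical attacks exploit.
short_key = "LEMON"
ciphertext = polyalphabetic(message, short_key)
print(ciphertext)                                            # LXFOPVEFRNHR
print(polyalphabetic(ciphertext, short_key, decrypt=True))   # ATTACKATDAWN

# One-time pad: a random key as long as the message, to be used once only.
pad = "".join(secrets.choice(ALPHABET) for _ in message)
otp_ciphertext = polyalphabetic(message, pad)
print(polyalphabetic(otp_ciphertext, pad, decrypt=True))     # ATTACKATDAWN
```

The practical difficulty mentioned above shows up directly: the pad is as long as the message, and both parties must somehow hold the same pad before any message is sent.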
Skipping to more modern times, we come to a very important period and figure for the development of mathematics' role in security issues: Alan Turing (1912-1954).
After Hitler's forces took over countries such as Holland and France, Germany wanted to extend its conquests to include Britain. Britain began an elaborate effort to try to prevent invasion of the British Isles by taking advantage of communications and signals intelligence (patterns in levels of communication traffic that might indicate some special military operation) and information obtained by deciphering coded communication. A special group of individuals was assembled at Bletchley Park, experts in languages and mathematics, to try to glean as much usable information from the enemy as possible. The first successes in breaking German codes were due to the work of Polish mathematicians, including Marian Rejewski, who were able to hand over what they had achieved to the British.
(Photo courtesy of NSA, the caption reads "Marian Rejewski, the Polish mathematician
who made the initial breakthrough against the Enigma machine.")
Using the materials from Poland, Turing and his team undoubtedly changed the course of the war (and history) with their efforts in deciphering materials being generated by the German Enigma machine, seen below, as well as other German cryptographic systems.
(Photo courtesy of NSA)
Turing worked with the mathematician Gordon Welchman to develop a specialized "computer" to help break the Enigma-generated codes.
(Bombe courtesy of NSA)
In the United States too, many men and women, including many linguists and mathematicians, were involved with the war effort. Among the most famous of these individuals were William Friedman (who used mathematical techniques though not trained as a mathematician) and his wife Elizebeth (a linguist). The United States' team broke many Japanese codes and thereby was able to change the course of the war.
(Elizebeth Friedman and William Friedman: Photos courtesy of NSA)
Modern cryptography is no longer primarily the concern of the military and diplomats. Cryptography is increasingly applied to maintain capitalism's infrastructure: telecommunications and the Internet. When money is transferred electronically, when an email is sent, when a purchase is made online, the users of such systems want to know that the transactions went as planned and were not "hijacked." The issues here are staggeringly complex and varied. When one party sends a message to another party, one would like to think:
a. the message sent was the message that arrived, rather than having some other message substituted
b. the message sent cannot be used in a way that compromises the security of messages sent in the future
c. the way that the message was sent does not allow someone in the future to pretend that a message sent by them came from you.
The world of "intrigue" that one associates with spies and spying plays out with no less complexity in the world of Internet security.
Hashing is the process of taking a long string and replacing it by a much shorter string in some systematic way. For example, if one started with a poem, one could replace it by the string which gives the number of letters in the poem. At first glance it might appear that hashing involves issues of data compression rather than data security. However, a hashed string can have cryptological importance, because the hashing system can be designed so that it is hard to figure out what the original string was. The complication with hashing is that ideally one wants to avoid having different strings hash to the same string. When this happens one says that a collision has occurred. In the system described above, two poems with exactly the same number of letters would create a collision. Thus, one has to build into one's system what to do about collisions, should they occur.
Another desirable feature of hashing designed for security purposes is that when two similar strings are hashed, the results are extremely different. Thus, if someone tries to modify a "secure document" in some small way, the "intrusion" can be detected because the hashes of the two documents, the original and the forged one, would be very different. This is why hashing is closely tied to passwords and digital signatures. A digital signature is an electronic identification system analogous to the handwritten signatures commonly used for letters and checks. One wants to have a system which minimizes the dangers of forgery. In the mid-1990s one of the popular hashing systems was MD-5 (Message-Digest Algorithm 5), which was designed by Ronald Rivest of MIT. However, beginning in 1996, in a series of developments led by mathematicians and computer scientists, it became clear that MD-5 had problems with its security, so it was commonly replaced by SHA-1.
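Both properties discussed above, collisions and sensitivity to small changes, can be seen in a short Python sketch. The letter-counting hash is the toy example from the text. For the cryptographic hash, SHA-256 from Python's standard `hashlib` module is used purely as an illustrative assumption; it is not one of the systems named above. The two "documents" are made up.

```python
import hashlib

def letter_count_hash(text: str) -> str:
    """The toy hash from the text: the number of letters, as a string."""
    return str(sum(1 for ch in text if ch.isalpha()))

def sha256_hex(text: str) -> str:
    """Hex digest of the SHA-256 hash of the UTF-8 encoding of text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Number of bit positions in which two hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

original = "Pay Alice 100 dollars"   # hypothetical document
forged = "Pay Alice 900 dollars"     # the same document with one character changed

# The toy hash cannot tell the two apart: this is a collision.
print(letter_count_hash(original), letter_count_hash(forged))   # 15 15

# A hash designed for security gives very different outputs.
h1, h2 = sha256_hex(original), sha256_hex(forged)
print(h1)
print(h2)
print(differing_bits(h1, h2), "of 256 bits differ")
```

For a well-designed hash one expects roughly half of the output bits to change, which is what makes a small forgery easy to detect by comparing hashes.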
Public Key Systems
A major revolution in cryptology occurred in the mid-1970s with the discovery (some say rediscovery) of a new paradigm for codes. In traditional cryptography the two parties that wanted to share secret information arranged a system which typically required a "key exchange" of a single key. Intuitively, think of a key as a way to "lock" a message that is being sent in secret. If the receiver has a key identical to the sender's, then the message can be unlocked. Thus, single key systems rely on a common key that both the person who encrypts a message and the person who decrypts it use to operate the crypto-system.
Whitfield Diffie (Courtesy of Dr. Diffie and Sun Microsystems), Prof. Martin Hellman (Courtesy of Dr. Hellman),
and Dr. Ralph Merkle (Courtesy of Dr. Merkle)
Public Key Cryptology is based on having two keys. One key, used to send a secret message to a particular person X, is publicly available, like a phone number in a telephone directory. The second key, which is not made public, is held by X to be used in conjunction with the public key. Another aspect of public key cryptography and modern private key systems is having a method of allowing strangers to confidently exchange keys with each other. Such a system was devised by Diffie and Hellman (using ideas from Merkle) and it is designed to work over a nonsecure communication system. This system was patented, though the patent has now expired. Below we will discuss in a bit more detail the best known of the public key systems, known as RSA. It is named for its developers Ronald Rivest, Adi Shamir, and Leonard Adleman.
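The key exchange of Diffie and Hellman is not spelled out here, but its usual textbook form is short enough to sketch. Everything below is an assumption made for illustration: the party names, the variable names, and above all the tiny public values p = 23 and g = 5, which are far too small to offer any security.

```python
import secrets

# Public parameters, agreed on in the open (toy values, illustration only).
p = 23   # a prime modulus
g = 5    # a generator modulo p

# Each party picks a private exponent and never transmits it.
a = secrets.randbelow(p - 2) + 1   # Alice's secret, in the range 1..p-2
b = secrets.randbelow(p - 2) + 1   # Bob's secret, in the range 1..p-2

# These two values travel over the nonsecure channel.
A = pow(g, a, p)   # sent by Alice
B = pow(g, b, p)   # sent by Bob

# Each side combines its own secret with the value it received.
shared_alice = pow(B, a, p)   # (g^b)^a mod p
shared_bob = pow(A, b, p)     # (g^a)^b mod p

print(shared_alice == shared_bob)   # True: both arrive at the same key
```

An eavesdropper sees p, g, A and B but not a or b; recovering either secret from what was sent is an instance of the discrete logarithm problem.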
(Adi Shamir, Ronald Rivest, and Leonard Adleman. Their initial work on RSA dates from the period when they were at MIT.)
Another popular public key system is due to Taher Elgamal (1984). This method uses ideas about group theory and complexity issues for its security.
(Photo courtesy of Dr. Taher Elgamal)
For both RSA and ElGamal (and other systems) it is useful to know about the congruence concept. Two integers a and b are said to be congruent modulo m (a positive integer, which is at least 2), written:
$$a \equiv b \pmod{m}$$
if a and b leave the same remainder when divided by m or, equivalently, if b-a is exactly divisible by m. Here are some examples:
$$17 \equiv 2 \pmod{5}, \qquad 29 \equiv 5 \pmod{12}, \qquad 36 \equiv 23 \pmod{13}$$
Note that we can always arrange the number on the right-hand side of a congruence to be a number between 0 and m-1, where m is the modulus. Thus, in a congruence modulo 13 with 23 on the right-hand side, we could replace 23 by 10. It is not very difficult to find the value for the "?" in the congruence below:
$$5^{72} \equiv \; ? \pmod{19}$$
The idea for doing this is to compute the values of $5$, $5^2$, $5^4$, $5^8$, etc. modulo 19 and then use the binary representation of the exponent (in this case 72) to help compute the answer. However, the problem of finding the value of k for which a congruence of the form
$$g^k \equiv h \pmod{m}$$
is valid, where g, h and m are given integers, is much less straightforward.
The problem of finding k in a situation such as this is known as the discrete logarithm problem. When the modulus is very large, no efficient method for solving it is currently known, although there are algorithms that do considerably better than brute force. The complexity of finding discrete logarithms (for various m, in particular, when m is prime), and of many other problems that have been used to try to design public key systems, is not fully understood. It turns out that some systems based on NP-complete problems have been "broken" while other systems which depend on problems whose complexity is still not understood fully seem to be holding their own. Next we will give a simplified discussion of the popular RSA system, which is widely used as a security measure on the Internet.
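The contrast between the two directions, computing a power modulo m versus recovering the exponent, can be sketched as follows. The function names are made up for this sketch, and the base 2 with target 13 in the second example is a hypothetical instance chosen only because it is small. Python's built-in `pow(base, exponent, modulus)` does the first job already; the loop is written out to show the repeated squaring.

```python
def power_mod(base: int, exponent: int, modulus: int) -> int:
    """Repeated squaring, reading the exponent's binary digits from the right."""
    result = 1
    square = base % modulus          # holds base^(2^i) mod modulus
    while exponent > 0:
        if exponent & 1:             # this binary digit of the exponent is 1
            result = (result * square) % modulus
        square = (square * square) % modulus
        exponent >>= 1
    return result

def discrete_log_brute_force(g: int, h: int, m: int):
    """Smallest k >= 0 with g^k congruent to h modulo m, or None if there is none."""
    value = 1 % m
    for k in range(m):
        if value == h % m:
            return k
        value = (value * g) % m
    return None

# 72 = 64 + 8, and 5^64 is 5 and 5^8 is 4 modulo 19, so the answer is 5 * 4 = 20, i.e. 1.
print(power_mod(5, 72, 19))                  # 1

# Going backwards means trying exponents one at a time.
print(discrete_log_brute_force(2, 13, 19))   # 5, since 2^5 = 32 = 19 + 13
```

The first function needs about as many steps as the exponent has binary digits, while the second may need about as many steps as the modulus itself, which is hopeless when the modulus has hundreds of digits.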
This is a Test
One of the ideas that inspired workers in public key crypto-systems is the notion of a "one-way function." This is a task which is easy to perform but which is hard to "reverse" unless one has additional special information.
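Here, in outline, is how RSA works. The receiver of messages selects two large primes p and q, which are kept secret.
Now, compute the number n = pq, the product of our secret primes. Next, select a positive integer e which is relatively prime to (p-1)(q-1). To send a message, which we take to be a positive integer M less than n, the sender computes the encrypted message C:
$$C \equiv M^e \pmod{n}$$
The values of e and n constitute the public key. Note that knowledge of n, which is the product of our two large primes, is believed to be of no value to decipher C without being able to factor n.
The receiver computes an integer s for which
$$es \equiv 1 \pmod{(p-1)(q-1)}$$
The values of s and n constitute the private key of the receiver of the message. These values are used to decrypt back to the original message M:
$$M \equiv C^s \pmod{n}$$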
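A toy run of RSA with very small numbers may help fix the roles of the public and private keys. The primes 61 and 53, the exponent 17 and the message 65 are assumptions chosen for illustration; real systems use primes hundreds of digits long and add padding, which this sketch omits. Computing s with `pow(e, -1, phi)` requires Python 3.8 or later.

```python
from math import gcd

# Secret primes (toy values; real ones are hundreds of digits long).
p, q = 61, 53
n = p * q                     # 3233, part of both keys
phi = (p - 1) * (q - 1)       # 3120, kept secret

e = 17                        # public exponent, relatively prime to phi
assert gcd(e, phi) == 1

# Private exponent s, chosen so that e * s leaves remainder 1 modulo phi.
s = pow(e, -1, phi)           # 2753

M = 65                        # the message, encoded as an integer less than n
C = pow(M, e, n)              # encryption uses the public key (e, n)
recovered = pow(C, s, n)      # decryption uses the private key (s, n)

print("public key:", (e, n))
print("private key:", (s, n))
print("ciphertext:", C)       # 2790
print("recovered:", recovered)  # 65
```

Anyone can compute C from the public key, but computing s requires (p-1)(q-1), and that in turn seems to require knowing the factors of n.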
The reason the system works rests on two theorems from number theory. The first is Fermat's Little Theorem:
$$a^{p-1} \equiv 1 \pmod{p}$$
where a is an integer relatively prime to p, where p is a prime. The other theorem needed is due to Euler (1707-1783) and is related to the function φ(x), where φ(x) denotes the number of positive integers not exceeding x which are relatively prime to x. This function is sometimes called Euler's phi function or the totient function. For any prime p,
$$\varphi(p) = p - 1$$
It follows that for distinct primes p and q:
$$\varphi(pq) = (p-1)(q-1)$$
Euler proved a lovely generalization of Fermat's Little Theorem which involves the phi function φ(x):
$$a^{\varphi(x)} \equiv 1 \pmod{x}$$
where x is a positive integer which is relatively prime to the integer a.
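These statements are easy to check numerically for small cases. The sketch below computes the phi function by direct counting, which is only feasible for small x, and uses the primes 11 and 13 as arbitrary illustrative choices.

```python
from math import gcd

def phi(x: int) -> int:
    """Count the integers k with 1 <= k <= x that are relatively prime to x."""
    return sum(1 for k in range(1, x + 1) if gcd(k, x) == 1)

p, q = 11, 13   # small illustrative primes

# phi of a prime, and of a product of two distinct primes.
print(phi(p), p - 1)                    # 10 10
print(phi(p * q), (p - 1) * (q - 1))    # 120 120

# Fermat's Little Theorem: a^(p-1) leaves remainder 1 modulo p.
fermat_holds = all(pow(a, p - 1, p) == 1 for a in range(1, p))

# Euler's generalization: a^phi(x) leaves remainder 1 modulo x when gcd(a, x) = 1.
x = p * q
phi_x = phi(x)
euler_holds = all(pow(a, phi_x, x) == 1 for a in range(1, x) if gcd(a, x) == 1)

print(fermat_holds, euler_holds)        # True True
```

Euler's theorem is what lies behind RSA decryption: when es leaves remainder 1 on division by (p-1)(q-1), raising M to the power es gives back M modulo n, at least for M relatively prime to n.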
Not surprisingly, the literature devoted to Internet security is rapidly changing. Practices which were secure with slower machines and the best known algorithms at a given time often become insecure on quite short notice. Information about the past and current trends can be found in a variety of places on the web.
Many graduate mathematics departments sponsor research seminars related to cryptography (which often have implications for the Internet). There are also several research journals that deal with cryptology and Internet security. Websites have emerged (supporting the hypothesis that there is a website for anything and everything) that deal with computer security and cryptology.
Beutelspacher, A., Cryptology, Mathematical Association of America, Washington, 1994.
Bidgoli, H. (ed.), The Internet Encyclopedia, John Wiley, Hoboken, 2004.
Bidgoli, H. (ed.), The Handbook of Information Security, John Wiley, Hoboken, 2006.
Bidgoli, H. (ed.), The Handbook of Computer Networks (to appear).
Cheswick, W. and S. Bellovin, A. Rubin, Firewalls and Internet Security: Repelling the Wily Hacker, 2nd edition. Addison-Wesley, 2003.
Churchouse, R., Codes and Ciphers, Cambridge U. Press, Cambridge, 2002.
Clark, R., The Man Who Broke Purple, Weidenfeld & Nicolson, London, 1977.
Coutinho, S., The Mathematics of Ciphers, A.K. Peters, Natick, 1999.
Davies, D. and W. Price, Security for Computer Networks, John Wiley, Chichester, 1984.
Diffie, W. and M. Hellman, New Directions in Cryptography, IEEE Transactions on Information Theory, vol. IT-22, Nov. 1976, p. 644-654.
Diffie, W. and S. Landau, Privacy On the Line, MIT Press, Cambridge, 1998.
Elgamal, T., A public-key cryptosystem and a signature scheme based on discrete logarithms, IEEE Trans. on Info. Theory, vol. IT-31 (1985) 469-472.
Galbraith, S., Elliptic curve public key cryptography, Mathematics Today, 35 (1999) 76-79.
Garfinkel, S. and G. Spafford, Practical Unix and Internet Security. O'Reilly and Associates, Inc., 1996.
Golomb, S., Shift Register Sequences, Holden-Day, San Francisco, 1967.
Hellman, M., An Overview of Public Key Cryptography, IEEE Communications Magazine, May 2002, p. 42-49.
Kahn, D., The Codebreakers, Macmillan, New York, 1967.
Kahn, D., Seizing the Enigma, Souvenir Press, London, 1991.
Koblitz, N., A Course in Number Theory and Cryptography, Springer-Verlag, New York, 1987.
Koblitz, N., Elliptic curve cryptosystems, Math. of Comp., 48 (1987) 203-209.
Koblitz, N and A. Menezes, A survey of public-key cryptosystems, 2004 (available on the web).
Lewand, R., Cryptological Mathematics, Mathematical Association of America, Washington, 2000.
Lidl, R. and G. Pilz, Applied Abstract Algebra, Springer-Verlag, New York, 1984.
Mason, John. Online Privacy Guide: 19 Actionable Steps to Protect Online Privacy, 2017.
Menezes, A. and P. van Oorschot, S. Vanstone, Handbook of Applied Cryptography, CRC, Boca Raton, 1997.
Merkle, R. and M. Hellman, Hiding information and signatures in trapdoor knapsacks, IEEE Trans. Inform. Theory, 24 (1978) 525-530.
Mollin, R., An Introduction to Cryptography, Chapman & Hall/CRC, Boca Raton, 2001.
Rivest, R., and A. Shamir, L. Adleman, A method for obtaining digital signatures and public key cryptosystems, Communications of the ACM, 21 (1978) 120-126.
Schneier, B., Applied Cryptography, Wiley, New York, 1994.
Shor, P., Algorithms for quantum computation: discrete logarithms and factoring, Proc. 35th IEEE Annual Symp. on Foundations of Computer Science, 1994, p. 124-134.
Stallings, W., Cryptography and Network Security, 4th edition, Pearson-Prentice-Hall, Upper Saddle River, 2006.
Welsh, D., Codes and Cryptography, Oxford U. Press, Oxford, 1986.
NOTE: Those who can access JSTOR can find some of the papers mentioned above there. For those with access, the American Mathematical Society's MathSciNet can be used to get additional bibliographic information and reviews of some of these materials. Some of the items above can be accessed via the ACM Portal, which also provides bibliographic services.