
Primes


4. Complexity and algorithms

Given a large positive integer M, how can one tell whether it is prime? There are two considerations here. Not only does one want to take M and get a yes or no answer (a so-called decision problem) as to whether it is prime, but one also wants to get the answer in a reasonable amount of time. It would not be surprising if getting the answer is easier when checking whether 377 is prime than when checking 10000000000000001. However, if it takes two years to test the second number, it may not be cost-effective to make the test. The area of mathematics and computer science designed to answer questions such as this has come to be called complexity theory. Complexity theory leads researchers into many interesting areas. For example, an interesting question is how to measure how much work must be done to solve a problem. The number 377 is smaller than 10000000000000001, but it has more nonzero digits. Might there be a method for testing primality whose cost is independent of numerical size and depends only on the number of digits in the number? A complexity theorist could try to prove a theorem that there can be no algorithm for testing whether a number is prime which uses, for example, only the count of how many times each digit appears in the number. (Think about 112, 121, and 211: they are made of the same digits, yet 211 is prime while 112 and 121 are not.) In many cases there are different ways to measure the complexity of a problem, and these different measures lead to interesting mathematical and computing questions.

There are also issues concerning the nature of the process used to test whether a given integer is prime. The standard algorithmic approach is to design a procedure which in no way involves chance. Researchers are interested in such deterministic algorithms that give an answer in an amount of time which is "polynomial" in the size of the problem; the problems solvable this way are said to be in the complexity class P. Such an algorithm was found by Gary Miller in 1976, assuming that the Extended Riemann Hypothesis is true. Unfortunately, the Riemann Hypothesis and the Extended Riemann Hypothesis remain unproved!

One can also imagine a procedure which uses numbers chosen at random. Suppose there is a randomized procedure which, on each run, correctly reports whether a certain property T holds with probability at least 1/2. By carrying out this procedure many times independently, one can decide whether T holds with a very high probability of reaching the right conclusion. Such procedures do run the risk of giving an incorrect answer, even though the probability of this happening can be made as small as one wishes. A test for primality of this kind, known as the Miller-Rabin test (named for Gary Miller and Michael Rabin), is often used in practice. Does it surprise you that the Miller-Rabin test makes critical use of Fermat's Little Theorem? The Miller-Rabin algorithm lies in the class BPP, which refers to bounded-error probabilistic polynomial time: it is a randomized algorithm, meaning that random choices are made as it is carried out, and it determines in polynomial time, with an exponentially small probability of error, whether a number is prime.
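
To make this concrete, here is a minimal sketch in Python of a Miller-Rabin-style test. The function names, the choice of 20 random rounds, and the example numbers at the end are illustrative assumptions rather than anything prescribed by the column; the mathematical content is the standard strong-probable-prime test, which rests on Fermat's Little Theorem.

    import random

    def is_strong_witness(a, n, d, s):
        """Return True if base a proves that n is composite."""
        # Fermat's Little Theorem: if n is prime, then a^(n-1) = 1 (mod n).
        # The strong test also watches the square roots of 1 encountered
        # while building a^(n-1) from a^d by repeated squaring.
        x = pow(a, d, n)                  # a^d mod n
        if x == 1 or x == n - 1:
            return False                  # n survives this base
        for _ in range(s - 1):
            x = (x * x) % n
            if x == n - 1:
                return False
        return True                       # a is a witness: n is composite

    def is_probable_prime(n, rounds=20):
        """Randomized test: a False answer is certain; a True answer is wrong
        with probability at most 4**(-rounds) when n is composite."""
        if n < 2:
            return False
        for p in (2, 3, 5, 7):
            if n % p == 0:
                return n == p
        d, s = n - 1, 0                   # write n - 1 = 2^s * d with d odd
        while d % 2 == 0:
            d //= 2
            s += 1
        for _ in range(rounds):
            a = random.randrange(2, n - 1)
            if is_strong_witness(a, n, d, s):
                return False
        return True

    print(is_probable_prime(377))                 # False: 377 = 13 * 29
    print(is_probable_prime(10000000000000001))   # False: divisible by 353
    print(is_probable_prime(2**61 - 1))           # True: a Mersenne prime

Each round costs only a modular exponentiation, so the running time is polynomial in the number of digits of n, which is what places a test of this kind in the class described above.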

Finally, there are procedures which will answer for certain whether an object has property T, but which sometimes give no answer at all: there is a positive probability that the procedure terminates without giving an answer even when property T holds. Procedures of this kind that check for primality are known.

Returning to the problem of testing whether a number M is prime, one approach is to divide M by 2 and then by the odd numbers 3, 5, 7, and so on, checking whether any division leaves a zero remainder. One only has to carry out this process up to the square root of M, because if M is composite it has a prime factor which is at most the square root of M. However, this process takes a lot of time when M is very large. Since it is not practical to do these calculations by hand, a large number must be entered into a computer in some representation, usually decimal. A large number has many digits, and the number of digits one needs to enter is approximately log M (base 10) when M is written in decimal. (Log 1,000,000 = 6, though the number of digits needed to write the number down is really 7; in general M has floor(log M) + 1 decimal digits.) The complexity of methods involving M is often expressed in terms of log M because this is the number of digits that computer input will require.
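
Here is a short Python sketch of this trial-division approach (the function name and the printed report are illustrative choices, not part of the column). It also reports the digit count, which equals floor(log M) + 1, to underline why complexity is measured against log M: the loop below can take roughly the square root of M steps, and that quantity grows exponentially with the number of digits of M.

    def is_prime_by_trial_division(m):
        """Decide primality by dividing m by 2 and then by odd numbers up to sqrt(m)."""
        if m < 2:
            return False
        if m % 2 == 0:
            return m == 2
        divisor = 3
        # A composite m has a prime factor no larger than sqrt(m),
        # so it suffices to try divisors d with d * d <= m.
        while divisor * divisor <= m:
            if m % divisor == 0:
                return False              # a divisor was found, so m is composite
            divisor += 2                  # only odd candidates remain to try
        return True

    for m in (377, 10000000000000001, 2**31 - 1):
        digits = len(str(m))              # equals floor(log10(m)) + 1
        print(m, "has", digits, "digits; prime?", is_prime_by_trial_division(m))

The 17-digit number above is detected as composite quickly only because it happens to have the small factor 353; a prime of that size would require tens of millions of trial divisions, and for the hundred-digit numbers that arise in cryptography the method is hopeless, which is why polynomial-time tests like the one sketched earlier matter.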


  1. Introduction
  2. Basic ideas
  3. Distribution of the primes
  4. Complexity and algorithms
  5. Primes and cryptography
  6. The right stuff
  7. References