Barcodes: Zip Codes
3. Zip Codes
America has relied on the mail since its earliest years. As the country has grown, so too has the volume of mail that it generates. The United States Postal Service delivers 200 billion pieces of mail a year (110 billion of them first class) to potentially 134 million addresses. Would this be practical if every piece of mail had to be looked at many, many times by a human being? Most mail that comes into your home has been routed with the assistance of the pattern of short and long bars imprinted on the envelope. How did this come about?
The zip code that citizens add to an address to send Aunt Cindy a birthday card typically consists of five handwritten decimal digits. The first 3 digits designate a region of the country (low numbers in the East, higher numbers in the West), and the next 2 digits designate a zone within that region. If you are really serious, you can put a nine-digit code on the envelope. This system, known as Zip+4, was instituted in 1983. It added 4 digits to the traditional zip code and made it possible in most cases to locate a specific private home. The value of this additional information for mail processing induced the postal service to offer reduced postage to customers who provided machine-readable versions of Zip+4.
In the discussion above we have concentrated on the use of barcodes to help speed the flow of mail. However, there is a related problem that has only an indirect connection to the fact that barcodes are codes: the routing of mail. Imagine a letter that has been placed into a mailbox in San Francisco, California, bound for a house in Tivoli, New York. Should the letter be trucked, transported by railroad, flown, or moved by some combination of these? Choosing a wise routing scheme can not only speed the mail but also save money in running the postal system. The zip code information on the letter can be used to help design and automate the system of routing mail.
Many clever information systems piggyback on the relationship of geography to zip code. For example, some companies allow one to look for a hotel in an area where one is planning to vacation by entering a zip code, and then to search for all the hotels with that zip code and in neighboring zip codes. Similarly, there are systems which estimate distance between locations based on their zip codes.
In a general way, the decimal digits on the envelope are translated into a binary representation: short and long bars printed on the envelope. This binary coding has three components: framing information that tells the scanning machinery it has located the code to be scanned; a translation of the decimal digits into binary; and error detection information, represented in binary but constructed from the decimal digits. The framing is provided by two long guard bars, one at the start and one at the end of the binary code. These guard bars are disregarded in interpreting the remaining bars. The remaining long and short bars are grouped into blocks of 5, and each block is translated into a decimal digit using the dictionary below. For example, a block consisting of long, short, long, short, short represents the decimal digit 9. Note that the dictionary does not correspond to the way one counts in the binary system (where, e.g., 4 = 00100) and that each code word in the dictionary has exactly 3 short bars and 2 long bars.
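The block-by-block translation can be sketched in a few lines of Python. This is only an illustration: it assumes that the dictionary referred to is the standard postal (POSTNET) table, writes a long bar as 1 and a short bar as 0, and uses the hypothetical helper names encode_digits and decode_bars. The check digit discussed later is not added here.

```python
# Assumed dictionary: the standard POSTNET "2-of-5" table.
# 1 stands for a long bar, 0 for a short bar.
POSTNET = {
    "0": "11000", "1": "00011", "2": "00101", "3": "00110", "4": "01001",
    "5": "01010", "6": "01100", "7": "10001", "8": "10010", "9": "10100",
}
BARS_TO_DIGIT = {bars: digit for digit, bars in POSTNET.items()}


def encode_digits(digits: str) -> str:
    """Translate decimal digits into bars and frame them with two long guard bars."""
    return "1" + "".join(POSTNET[d] for d in digits) + "1"


def decode_bars(bars: str) -> str:
    """Strip the guard bars, then read the remaining bars in blocks of 5."""
    if len(bars) < 2 or bars[0] != "1" or bars[-1] != "1":
        raise ValueError("missing guard bar")
    body = bars[1:-1]
    if len(body) % 5 != 0:
        raise ValueError("bars do not split into blocks of 5")
    blocks = [body[i:i + 5] for i in range(0, len(body), 5)]
    # A KeyError here means a block is not one of the ten valid code words.
    return "".join(BARS_TO_DIGIT[block] for block in blocks)


print(encode_digits("9"))      # 1101001 (guard, long short long short short, guard)
print(decode_bars("1101001"))  # 9
```

Because every code word has exactly 2 long bars, a block with any other number of long bars cannot be looked up, which is how a single misread bar shows up in this sketch.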
The final five bars before the closing guard bar code the error detection information. The number of 5-bar blocks varies with the amount of information being coded (e.g. sometimes only a 5-digit zip code is available, sometimes Zip+4 is coded, sometimes additional carrier route information is part of the code). However, nearly all the time there are an extra 5 bars that code a check digit.
How is the last block of five long and short bars generated? The method for generating the check digit is surprisingly easy: add the decimal digits involved and choose the decimal digit which, when added to this sum, gives a number that ends in zero (that is, choose the check digit so that the total is congruent to 0 mod 10).
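As a small worked illustration of this rule, here is a Python sketch. The function name check_digit and the sample digits 12583 are chosen only for the example; they do not come from any postal specification.

```python
def check_digit(digits: str) -> int:
    """Return the digit that brings the digit sum up to a multiple of 10."""
    total = sum(int(d) for d in digits)
    # The outer "% 10" handles the case where the sum already ends in zero.
    return (10 - total % 10) % 10


sample = "12583"            # arbitrary example digits
print(check_digit(sample))  # 1, since 1+2+5+8+3 = 19 and 19 + 1 = 20
```

The same function applies unchanged to a nine-digit Zip+4 string, since the rule only depends on the sum of the digits.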
Recall that a ≡ b mod m (read "a is congruent to b mod m") means that a and b leave the same remainder when divided by the positive integer m or, equivalently, that b - a is divisible by m with a zero remainder. For example, 18 ≡ 8 mod 10 and 23 ≡ 1 mod 11. Many of the check digit systems can be described using modular arithmetic.
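The definition can be checked mechanically. The following Python sketch uses a hypothetical helper named congruent, which tests whether b - a is divisible by m, and confirms the two examples just given.

```python
def congruent(a: int, b: int, m: int) -> bool:
    """True when a and b leave the same remainder on division by m."""
    if m <= 0:
        raise ValueError("the modulus m must be a positive integer")
    return (b - a) % m == 0


print(congruent(18, 8, 10))  # True: 8 - 18 = -10 is divisible by 10
print(congruent(23, 1, 11))  # True: 1 - 23 = -22 is divisible by 11
print(congruent(18, 7, 10))  # False: the remainders are 8 and 7
```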
As an experiment you might wish to check the digits printed in decimal as part of the address on the business reply sheet on page 626 of the May, Notices of the American Mathematical Society. Are the digits coded in the binary the same as those printed? This is not uncommon, especially for business reply situations. You should also find a piece of mail addressed to your home that has the zip code handwritten and decode the bars that have been added at the bottom. The postal system hires individuals to read the handwritten zip code and add the machine-readable version at the bottom. Attempts are also being made to have computerized optical scanning systems do this work whenever possible.
The zip code check digit system we have described is especially easy but not especially powerful. If a single digit is replaced by a different digit (a substitution error), the sum is no longer congruent to 0 mod 10 and the error is detected; but if two digits are interchanged (a transposition error), the sum is unchanged and the error goes undetected. Of course, in the context we have here, transposition errors are not that likely; what might happen is that the envelope is damaged and the code is not read properly in some part.
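The difference between the two kinds of error can be seen in a short Python sketch. The helper passes_check and the sample digits are made up for the illustration; the string 125831 is the arbitrary digits 12583 followed by the check digit 1.

```python
def passes_check(digits_with_check: str) -> bool:
    """A string passes when the sum of all its digits, check digit included, ends in 0."""
    return sum(int(d) for d in digits_with_check) % 10 == 0


good = "125831"         # digit sum 20
substituted = "125931"  # the 8 was misread as a 9: digit sum 21
transposed = "152831"   # the 2 and the 5 were swapped: digit sum still 20

print(passes_check(good))         # True
print(passes_check(substituted))  # False: the substitution is detected
print(passes_check(transposed))   # True: the transposition slips through
```

Addition does not care about the order of the digits, which is why no swap of two digits can ever change the outcome of this test.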
In everyday life we rely on the decimal number system, which uses the digits 0, 1, 2, ..., 9, for purposes ranging from calculation to giving number names to things. Yet computers do not work directly with decimal numbers. Rather, they work with the binary number system, which uses only the digits 0 and 1. In using decimal numbers we sometimes limit the ease with which computers can interact with the systems that humans feel more comfortable using. This explains the hybrid situation in which a barcode serves as the computer implementation of a system built on decimal digits, with which humans are more familiar. We will see this at work in the UPC (Universal Product Code) and in the ISBN system, used for naming books.