Mathematics and DNA
Dr. De Witt Sumners, Florida State University
DNA (deoxyribonucleic acid) is the blueprint for life. It can be viewed as two very long curves that are intertwined millions of times, linked to other curves, and subjected to four or five successive orders of coiling to convert it into a compact form for information storage. If one scales the cell nucleus up to the size of a basketball, the DNA inside scales to the size of thin fishing line, and 200 km of that fishing line is packed inside the nuclear basketball. Most cellular DNA is double-stranded (duplex), consisting of two linear backbones of alternating sugar and phosphate groups. Attached to each sugar molecule is one of four bases: A = adenine, T = thymine, C = cytosine, G = guanine. Hydrogen bonding between paired bases forms a ladder whose sides are the backbones and whose rungs are the hydrogen-bonded base pairs, with A bonding only with T, and C bonding only with G. This ladder is twisted in a right-handed helical fashion, with an average and nearly constant pitch of approximately 10.5 base pairs per full helical twist. The local helical pitch of duplex DNA is determined by both the local base pair sequence and the cellular environment in which the DNA lives; if a DNA molecule is under stress, constrained to lie on the surface of a protein, or being acted upon by an enzyme, the helical pitch can change.
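The pairing rule and the average pitch quoted above lend themselves to a small worked example. The following is a minimal Python sketch, not a model of real DNA: the names `PAIRING`, `BP_PER_TURN`, `partner_strand`, and `helical_turns` are hypothetical, the partner strand is read position by position without regard to strand direction, and the pitch is treated as exactly constant at 10.5 base pairs per twist, which the text notes is only approximately true.

```python
# Pairing rule from the text: A bonds only with T, and C bonds only with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Average helical pitch from the text, in base pairs per full helical twist.
# Assumption: treated here as exactly constant along the molecule.
BP_PER_TURN = 10.5


def partner_strand(strand: str) -> str:
    """Return the base sitting opposite each base of `strand`, rung by rung.

    Simplification: the result is listed in the same order as the input,
    ignoring the direction in which each backbone is conventionally read.
    """
    return "".join(PAIRING[base] for base in strand.upper())


def helical_turns(num_base_pairs: int, bp_per_turn: float = BP_PER_TURN) -> float:
    """Estimate the number of full helical twists in a duplex of given length."""
    return num_base_pairs / bp_per_turn


if __name__ == "__main__":
    print(partner_strand("ATCG"))
    print(helical_turns(2100))
```

Under these assumptions, `partner_strand("ATCG")` gives `"TAGC"`, and a 2100 base pair duplex contains `2100 / 10.5 = 200` full helical twists. A character other than A, T, C, or G raises a `KeyError`, which is a deliberate choice in this sketch rather than a silent substitution.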
Taken together, the packing, twisting, and topological constraints mean that topological entanglement poses serious functional problems for DNA. This entanglement would interfere with (and be magnified by) the vital cellular life processes of replication, transcription, and recombination. For information retrieval and cell viability, some geometric and topological features must be introduced into the DNA, and others quickly removed. For example, the Watson-Crick helical twist of duplex DNA may require local unwinding in order to make room for a protein involved in transcription to attach to the DNA. The DNA sequence in the vicinity of a gene may need to be altered to include a promoter or repressor. During replication, the daughter duplex DNA molecules become entangled and must be disentangled in order for replication to proceed to completion. After the process is finished, the original DNA conformation must be restored. Some enzymes maintain proper geometry and topology by passing one strand of DNA through another by means of a transient enzyme-bridged break in one of the DNA strands. Other enzymes break the DNA apart and recombine the ends by exchanging them. The description and quantification of the three-dimensional structure of DNA, and of the changes in DNA structure due to the action of these enzymes, have required the serious use of geometry and topology in molecular biology. This use of mathematics as an analytical and computational tool is essential because there are few ways to directly observe an enzyme in action.
In the experimental study of DNA structure and enzyme mechanism, biologists developed the topological approach to enzymology shown schematically in Figure 1 of the article "Lifting the Curtain: Using Topology to Probe the Hidden Action of Enzymes," Notices of the AMS, May 1995. In this approach, one performs experiments on circular substrate DNA molecules; enzymes act on these DNA circles, creating an enzymatic signature by changing the coiling of the DNA and by making and breaking knots and links in the DNA. By observing the changes in geometry (supercoiling) and topology (knotting and linking) that an enzyme causes in DNA, the enzyme mechanism can be described and quantified.
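One standard way to put a number on linking is the linking number of two oriented closed curves: half the sum of the signs (+1 or -1) of the crossings at which one curve passes over the other in a planar diagram. The Python sketch below illustrates how a single strand passage changes that number. It is an idealization under stated assumptions, not a description of any particular enzyme: the function names `linking_number` and `strand_passage` are hypothetical, the crossing signs are assumed to have been read off a diagram already, and a strand passage is modeled simply as flipping the sign of one crossing between the two curves.

```python
from typing import List


def linking_number(crossing_signs: List[int]) -> int:
    """Linking number of two oriented closed curves.

    `crossing_signs` lists the sign (+1 or -1) of every crossing in a
    planar diagram where one curve crosses the other. Crossings of a
    curve with itself are not included.
    """
    if any(sign not in (1, -1) for sign in crossing_signs):
        raise ValueError("each crossing sign must be +1 or -1")
    # Two closed curves cross each other an even number of times in a diagram.
    if len(crossing_signs) % 2 != 0:
        raise ValueError("two closed curves must cross an even number of times")
    return sum(crossing_signs) // 2


def strand_passage(crossing_signs: List[int], index: int) -> List[int]:
    """Model one strand passage as flipping the sign of a single crossing."""
    flipped = list(crossing_signs)
    flipped[index] = -flipped[index]
    return flipped


if __name__ == "__main__":
    hopf_link = [1, 1]                      # two circles linked once
    print(linking_number(hopf_link))        # linking number before passage
    after = strand_passage(hopf_link, 0)
    print(linking_number(after))            # linking number after passage
```

In this sketch, two circles linked once (signs `[1, 1]`) have linking number 1, and one strand passage yields signs `[-1, 1]` with linking number 0. Each such passage changes the linking number by exactly one, which is the kind of integer change that can be tracked between substrate and product.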
The topological approach to enzymology poses an interesting challenge for mathematics: from the observed changes in DNA geometry and topology, how can one deduce enzyme mechanisms? This requires the construction of mathematical models for enzyme action and the use of these models to analyze the results of topological enzymology experiments. The entangled form of the product DNA knots and links contains information about the enzymes that made them. Beyond their utility in analyzing experimental results, mathematical models force all of the background assumptions about the biology to be carefully laid out. Once stated explicitly, these assumptions can be examined and dissected, and their influence on the biological conclusions drawn from experimental results can be determined.