The fundamental determinants of many biological phenomena are now known to be molecular. The most biologically important molecules (proteins, RNA, and DNA) are composed of linear chains of building blocks. The order of these blocks within a specific molecule is its sequence, which can be determined experimentally. Each protein or RNA molecule commonly folds into a specific structure that depends sensitively on its sequence. DNA in living systems is topologically constrained, so its structure depends on the nature of that constraint as well as on its sequence.
There are many mathematically challenging problems related to molecular sequences and structures. The shapes of proteins may be described using differential geometry. The topological constraints on DNA commonly involve the regulation of its linking number by enzymes that transiently cut the DNA strands. The activities of DNA, including gene expression and replication, depend sensitively on the imposed linking number. This topological invariant can be decomposed into the sum of two geometric quantities, twist and writhe, whose analysis involves integral geometry. The stable structures of a topologically constrained DNA molecule are those conformations that minimize a conformational energy while the linking number is held fixed. This gives rise to a range of variational problems. Experiments show that the stable structures of proteins likewise minimize a governing energy. Thus, to predict a protein's structure from its sequence, one must solve an optimization problem. This is commonly very difficult, because a single molecule may have many thousands of degrees of freedom, so its configuration space is high-dimensional. Reliable methods for solving these problems are still not available.
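The decomposition of the linking number and the resulting variational problem can be written compactly. The following is a sketch in standard notation, not taken from the text above: Lk is the linking number, Tw the twist, Wr the writhe, C the closed axis curve of the DNA with position vector r, E an unspecified conformational energy, and Lk_0 the imposed linking number. The symbols E and Lk_0 are assumed names introduced here for illustration.

```latex
% Linking number as the sum of twist and writhe
\[
  Lk = Tw + Wr
\]

% Writhe of the closed axis curve C, as a Gauss-type double integral
\[
  Wr = \frac{1}{4\pi} \oint_C \oint_C
       \frac{(d\mathbf{r}_1 \times d\mathbf{r}_2) \cdot (\mathbf{r}_1 - \mathbf{r}_2)}
            {\lvert \mathbf{r}_1 - \mathbf{r}_2 \rvert^{3}}
\]

% Stable conformations: minimize the energy at fixed linking number
\[
  \min_{\text{conformations}} E
  \quad \text{subject to} \quad Lk = Lk_0
\]
```

Because Lk is fixed while Tw and Wr can each vary, a conformation can trade twist for writhe, which is one reason the constrained minimization has a rich set of solutions.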
Many other problems in discrete mathematics and computer science arise in the analysis and comparison of molecular sequences. Here again, the inherent difficulty of the problems means that current analytic methods are approximate and not always satisfactory. Nevertheless, every biologist who works with molecular sequences urgently needs good tools for comparing those sequences with the sequences of others and for extracting the information they contain. In this way, modern biology is incorporating the methodology of information science.
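As one concrete illustration of sequence comparison as a discrete optimization problem, the sketch below scores a global alignment of two sequences by dynamic programming (the Needleman-Wunsch recurrence). It is a minimal example chosen here for illustration, not a method named in the text; the function name and the scoring values for match, mismatch, and gap are assumptions.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the best global alignment score of sequences a and b.

    Uses the Needleman-Wunsch recurrence with a linear gap penalty.
    The scoring values are illustrative assumptions, not recommended defaults.
    Time is O(len(a) * len(b)); memory is O(len(b)).
    """
    # Row 0: aligning the empty prefix of a against each prefix of b costs only gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of b costs i gaps.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # align a[i-1] with a gap
                curr[j - 1] + gap,  # align b[j-1] with a gap
            )
        prev = curr
    return prev[len(b)]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "ACT"))
```

With these scores, aligning "ACGT" against "ACT" yields three matches and one gap. Practical tools differ mainly in their scoring schemes and in the heuristics they use to avoid the quadratic cost on long sequences.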