3. Genetic code
Even more remarkable is that non-overlapping triples of the letters A, C, G,
and T, often referred to as codons (especially when the triples are part of a
gene), may spell out, when read in the 5' to 3' direction, the order of the
amino acids that form the long linear molecules known as proteins. There are
20 amino acids which serve as the building blocks for proteins. Since there are
4 x 4 x 4 = 64 possible triples (repeated letters within a triple are allowed)
using the 4 letters A, C, G, and T, several different triples can represent
the same amino acid. There are also 3 triples (TAA, TAG, TGA), known as stop
codons, that do not represent (a common phrase being "do not code for") an amino
acid, but instead signal the termination of the protein production process.
One particular codon (ATG), which does represent one of the amino acids
(methionine), indicates the start of the production of a protein. This system for
coding proteins has come to be known as the genetic code. Before the genetic
code was fully understood through the work of Marshall Nirenberg, Heinrich Matthaei,
Har Gobind Khorana, and others, various mathematical ideas were invoked to suggest
what kind of code might be involved. A diagram of the RNA version of the genetic
code, where U takes the place of T, is shown below.
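The counting argument and the lookup described above can be sketched in a few lines of Python. This is only an illustration: the names `BASES`, `AMINO_ACIDS`, `CODON_TABLE`, and `translate` are made up for this example, and the table is the standard genetic code written with one-letter amino acid abbreviations, with `*` standing for a stop codon.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, with codons listed in
# TCAG order (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Pair each of the 4 * 4 * 4 triples with its amino acid letter.
CODON_TABLE = {
    "".join(triple): aa
    for triple, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Read non-overlapping triples from the start of dna (5' to 3')
    and return the amino acid letters up to the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(len(CODON_TABLE))                      # 64
print(len(set(AMINO_ACIDS) - {"*"}))         # 20
print(sorted(c for c, aa in CODON_TABLE.items() if aa == "*"))
# ['TAA', 'TAG', 'TGA']
print(translate("ATGGCCTGGTAA"))             # MAW
```

The sample string ATGGCCTGGTAA is invented for the example. It reads as ATG (methionine, M), GCC (alanine, A), TGG (tryptophan, W), and then the stop codon TAA.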
One approach to finding a gene is to look for a stretch of DNA which starts
with ATG (the start codon) and, read in non-overlapping triples from there, ends
with one of the stop or termination codons. Such a stretch of DNA is referred to
as an open reading frame (ORF). However, not all ORFs correspond to a stretch of
DNA which will initiate the production of a protein, so locating an ORF is not
equivalent to finding a gene. A major problem facing biologists is how to locate
those stretches of DNA that are genes in the large amounts of DNA found in a
chromosome.
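A naive ORF search along these lines might look like the following sketch. It is not a gene finder: the function name `find_orfs` and the sample string are invented for illustration, and the code scans only the three reading frames of the strand it is given, ignoring the complementary strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna):
    """Return sorted (start, end) index pairs such that dna[start:end]
    begins with ATG, ends with the first in-frame stop codon, and has
    a length that is a multiple of 3."""
    orfs = []
    for frame in range(3):              # three possible reading frames
        start = None                    # index of the pending ATG, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orfs.append((start, i + 3))
                start = None            # resume looking for the next ATG
    return sorted(orfs)


sample = "CCATGAAATGGTAGCC"
for start, end in find_orfs(sample):
    print(start, end, sample[start:end])   # 2 14 ATGAAATGGTAG
```

In the sample, TGA occurs at index 3 but is not preceded by a start codon in its own frame, and the ATG at index 7 is never followed by an in-frame stop, so only one ORF is reported. Even a hit of this kind is only a candidate, since an ORF need not be a gene.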
The way that DNA is involved in the production of proteins is not direct. The
mechanism involves another helical molecule called RNA, of which there are a
variety of types. (Among the types of RNA are messenger RNA, or mRNA, ribosomal
RNA, or rRNA, and transfer RNA, or tRNA.) When a protein is to be produced,
the two strands of the DNA separate and a copy of the gene is transcribed (molecular
biologists refer to this process as transcription) into a strand of RNA (ribonucleic
acid). RNA, like DNA, uses an alphabet of 4 nucleotides: A, C, G, and U (for
uracil, a pyrimidine like thymine, which takes the place of T). The RNA then
directs, in a somewhat complex series of steps, the production of a protein.
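At the level of letters, transcription can be sketched as a simple substitution. The example below rests on an assumption not spelled out in the text: the RNA is built by base pairing against one DNA strand (commonly called the template strand), so it reads the same as the other strand (the coding strand) with U in place of T. The function names and the sample sequence are hypothetical.

```python
# RNA base that pairs with each DNA base on the template strand.
PAIRING = {"A": "U", "C": "G", "G": "C", "T": "A"}


def transcribe_coding(coding_strand):
    """Given the coding strand written 5' to 3', return the RNA
    (also 5' to 3') by replacing every T with U."""
    return coding_strand.replace("T", "U")


def transcribe_template(template_strand):
    """Given the template strand written 3' to 5', return the RNA
    (5' to 3') by pairing each DNA base with its RNA partner."""
    return "".join(PAIRING[base] for base in template_strand)


coding = "ATGGCCTGGTAA"      # 5' to 3'
template = "TACCGGACCATT"    # complementary strand, written 3' to 5'

print(transcribe_coding(coding))       # AUGGCCUGGUAA
print(transcribe_template(template))   # AUGGCCUGGUAA
```

Both routes give the same RNA string, which is why codon tables can be written either with T (DNA version) or with U (RNA version).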
A gene in the DNA includes sections (introns) that are cut out before the RNA, through a complex process, directs the manufacture of a protein (polypeptide chain).
(This image is used with permission from the National Human Genome Research Institute (NHGRI).)