The DNA Code

The basic unit of life on Earth is the cell. Although some cells are large enough to see, most are microscopic. For example, human red blood cells are about 7-8 micrometers (millionths of a meter) in diameter. The functioning of cells, both individually and as part of a larger organism, is controlled through deoxyribonucleic acid, or DNA. DNA is a carbon-based, long-chain molecule that resembles a twisted ladder ("double helix") and resides primarily in the nucleus of cells. For all living things, DNA molecules carry the information about the heritable traits of the organism.

The Double Helix
The structure of DNA is illustrated in the top right figure. The rungs of the ladder are a sequence of nitrogen-containing molecular pairs. The individual members of these molecular pairs are called bases. The two strands of the DNA molecule are held together by weak hydrogen bonds between the opposing bases on each strand (recall that we discussed hydrogen bonding in conjunction with the properties of water and Earth's oceans in Chapter 6). The average spacing between base pairs along the double helix is only 0.34 nm, so there are many hydrogen bonds in even a short segment of DNA. Although each individual bond is weak, the combined set of hydrogen bonds between all base pairs is strong enough to hold the double helix together.

The Base Pairs in DNA
Only four bases appear in DNA: adenine (A), cytosine (C), guanine (G), and thymine (T). The information encoded in DNA corresponds to the order in which these four bases appear in the ladder. The adjacent diagram illustrates schematically a short segment of DNA with four examples of base pairs bonded to each other through hydrogen bonds. As we shall see below, there are strict rules for which base can pair with which base through this hydrogen bonding. The spiral sides of the ladder are composed of alternating phosphate groups (a chemical grouping of a phosphorus atom with four oxygen atoms) and five-carbon sugar molecules called deoxyribose. These are labeled "Sugar-Phosphate Backbones" in the diagram (the orange pentagons represent the deoxyribose molecules). Each strand of the double helix may be viewed as composed of a sequence of nucleotides, where each nucleotide consists of a phosphate group, a deoxyribose molecule, and one of the four nucleotide bases (A, C, G, or T). The DNA double helix goes through one complete twist for about every 10.4 base pairs.
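
The two numbers quoted so far (0.34 nm between successive rungs and roughly 10.4 rungs per complete twist) are enough for some simple arithmetic on the size of the helix. The short Python sketch below is only an illustration: the function names are invented for this example, and it treats the average values from the text as if they were exact.

```python
# Average values quoted in the text; real DNA varies slightly around them.
RISE_PER_BASE_PAIR_NM = 0.34   # spacing between successive base pairs, in nm
BASE_PAIRS_PER_TURN = 10.4     # rungs of the ladder per complete twist


def helix_length_nm(n_base_pairs: int) -> float:
    """Approximate length along the helix axis for a given number of rungs."""
    return n_base_pairs * RISE_PER_BASE_PAIR_NM


def helix_turns(n_base_pairs: int) -> float:
    """Approximate number of complete twists for a given number of rungs."""
    return n_base_pairs / BASE_PAIRS_PER_TURN


# Length of one complete twist of the ladder.
pitch_nm = BASE_PAIRS_PER_TURN * RISE_PER_BASE_PAIR_NM

print(f"One twist spans about {pitch_nm:.2f} nm")
print(f"1000 base pairs: about {helix_length_nm(1000):.1f} nm, "
      f"{helix_turns(1000):.1f} twists")
```

Under these assumptions one twist of the ladder is about 3.5 nm long, and a 1000-rung segment is about 340 nm long, still far too small to see with an ordinary light microscope.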

The Rules of Base Pairing
Because of the shapes of the four possible bases (A, C, G, T) in DNA, there are rather strict chemical rules for which bases can join in a pair by hydrogen bonding. For DNA, the base pairings are always A with T and G with C. Thus, there are only four possible "rungs" (base pairs) of the DNA ladder: AT, TA, GC, and CG. We say that A is the complement of T (and G is the complement of C). This means that the nucleotide sequences on the two strands of DNA are themselves complementary. For example, if in a segment of DNA the base sequence on one strand is GTCA, then it must be CAGT on the corresponding segment of the opposite strand, as illustrated in the above left figure. This pattern of base pairing (A with T and G with C) is constant for all species, which provides a strong unifying principle for all life. The diversity of life comes, as noted above, through the different possible orderings of base pairs on the double helix.
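
The pairing rule is simple enough to state as a lookup table. The Python sketch below is a minimal illustration, not a bioinformatics tool: the name `complement` is chosen just for this example, and the function pairs bases position by position, exactly as in the GTCA / CAGT example above.

```python
# Pairing rules for DNA: A pairs with T, and G pairs with C.
PAIRING = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    strand = strand.upper()
    invalid = set(strand) - set("ACGT")
    if invalid:
        raise ValueError(f"not DNA bases: {sorted(invalid)}")
    return strand.translate(PAIRING)


print(complement("GTCA"))              # CAGT
print(complement(complement("GTCA")))  # GTCA
```

Taking the complement twice returns the original sequence, which is another way of saying that each strand fully determines the other.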

Unique Nucleotide Sequences
Large numbers of nucleotides are arranged one after the other in the DNA strands (Note: the top right figure shows only a tiny portion of a DNA double helix). The enormous number of possibilities for arranging the order of these nucleotides accounts for the great diversity of life, while the limited number of nucleotide building blocks accounts for its unity. The order in which one kind of nucleotide follows another in a strand is unique for a species in at least some parts of its DNA. These unique sequences encode the hereditary information for the species.
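
To get a feeling for the "enormous number of possibilities," note that each position along a strand can hold any one of four bases, so a strand of length n can be written in 4^n different ways. The Python sketch below simply does this counting; `count_sequences` is a name made up for the illustration, and the count ignores any biological constraints on which sequences actually occur.

```python
from itertools import product


def count_sequences(length: int) -> int:
    """Number of distinct base sequences of the given length (4 choices per position)."""
    return 4 ** length


# Check the formula by listing every sequence of length 2 explicitly.
all_pairs = ["".join(bases) for bases in product("ACGT", repeat=2)]
print(len(all_pairs), count_sequences(2))   # 16 16

# The count grows very quickly with length.
for n in (10, 100):
    print(n, count_sequences(n))
```

Even a strand only 100 nucleotides long has about 10^60 possible orderings, and real genes are typically much longer than that.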

Genes and Chromosomes
Each chromosome consists of a single DNA molecule along with many proteins attached to it. The number of chromosomes differs with the species. A normal human cell contains 46 chromosomes in 23 pairs (one member of each pair coming from one parent and one from the other), which together define the entire genetic makeup of an individual. Genes consist of particular sequences of nucleotides in a DNA molecule. Each gene has a particular location on a particular chromosome of a species. The genes carry coded information about how to make proteins, and are the basic units of information for heritable traits. These coded pieces of information are passed from parent to offspring during reproduction. The smallest genes have only a few hundred base pairs, while the largest have more than a million. The human genome consists of the total set of genetic information that describes human beings. There are about 3 billion base pairs in the human genome, corresponding to roughly 20,000 protein-coding genes that code for different specific features of human beings (such as brown hair or dimpled chins). A gene mutation occurs when one or more of the nucleotide bases in a gene are deleted, added, or replaced. Such gene mutations will be critical to the process of evolution that we shall discuss shortly.
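
The three kinds of mutation just described (a base deleted, added, or replaced) can be pictured as simple edits to a string of letters. The Python sketch below is purely schematic: the function names are invented for this example, positions are counted from 0, and the four-letter "gene" is far shorter than any real one.

```python
def replace_base(gene: str, position: int, new_base: str) -> str:
    """Replace the base at the given position with a different base."""
    return gene[:position] + new_base + gene[position + 1:]


def delete_base(gene: str, position: int) -> str:
    """Remove the base at the given position."""
    return gene[:position] + gene[position + 1:]


def add_base(gene: str, position: int, new_base: str) -> str:
    """Insert a new base just before the given position."""
    return gene[:position] + new_base + gene[position:]


gene = "GTCA"
print(replace_base(gene, 1, "A"))  # GACA
print(delete_base(gene, 1))        # GCA
print(add_base(gene, 1, "T"))      # GTTCA
```

A replacement leaves the length of the sequence unchanged, while a deletion or addition shifts every base that follows it.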