Yeast genome As mentioned earlier, Saccharomyces cerevisiae (yeast) was the first eukaryote whose genome was completely sequenced. Its genome consists of 12.1 million base pairs of DNA and 6100 potential genes, of which about 5900 encode proteins (Figure 19.23a), giving a gene density of about one gene for every 2000 bp of DNA. The distribution of gene functions in yeast is displayed in Figure 19.23b. The yeast genome contains considerable redundancy: there are a number of blocks of repeated sequences in the genome, and 30% of the genes exist in two or more copies.

Worm genome Caenorhabditis elegans, a roundworm, has a genome consisting of 97 million base pairs of DNA (Figure 19.24). More than 18,000 protein-encoding genes have been identified in the C. elegans genome, of which more than 40% are homologous with genes found in other organisms. There is one gene for about every 5000 bp of DNA, and gene density is more uniform across chromosomes than it is in most eukaryotes.

Plant genome The genome of Arabidopsis thaliana, a small mustardlike plant, consists of 167 million base pairs of DNA (Figure 19.25a), encoding 25,706 predicted genes. Although Arabidopsis has many proteins in common with yeast, worm, fly, and humans, it has roughly 150 protein families not seen in other eukaryotes, including structural proteins, transcription factors, enzymes, and proteins of unknown function. Figure 19.25b shows the distribution of gene functions in Arabidopsis.
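The gene densities quoted for yeast and worm follow from dividing genome size by gene count. The short Python sketch below repeats that arithmetic using the figures given in the text; the names GENOMES and bp_per_gene are illustrative only. The worm gene count is the 18,266 listed with Figure 19.24, and the Arabidopsis density is not stated in the text, so the value here is simply computed from the quoted numbers.

```python
# Genome size (bp) and gene count as quoted in the text and figures.
# The dictionary name and layout are illustrative assumptions.
GENOMES = {
    "Saccharomyces cerevisiae": (12_100_000, 6_100),
    "Caenorhabditis elegans": (97_000_000, 18_266),
    "Arabidopsis thaliana": (167_000_000, 25_706),
}


def bp_per_gene(genome_size_bp, gene_count):
    """Average number of base pairs per gene (genome size / gene count)."""
    return genome_size_bp / gene_count


for species, (size_bp, genes) in GENOMES.items():
    density = bp_per_gene(size_bp, genes)
    print(f"{species}: about one gene per {density:,.0f} bp")
```

The yeast and worm results land close to the rounded values of 2000 bp and 5000 bp given above. These are averages over the whole genome and say nothing about how evenly the genes are spread along each chromosome.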

Figure 19.24 Genomic characteristics of the roundworm, Caenorhabditis elegans. Six pairs of linear chromosomes. Genome size: 97 million bp. Number of genes: 18,266. G + C content: 49%.

Gene duplication has played an important role in the evolution of Arabidopsis, with 60% of its genome consisting of duplicated segments. Seventeen percent of the genes exist in tandem arrays, which are multiple copies of the same gene positioned one after another. One of the processes that produce tandem arrays of duplicated genes is unequal crossing over (see p. 000 in Chapter 9). A number of large duplicated regions, encompassing hundreds of thousands or millions of base pairs of DNA, also are present. The large extent of duplication in the Arabidopsis genome suggests that this species had a tetraploid (4N) ancestor (see Chapter 9) and that all genes were duplicated in the past, followed by extensive gene rearrangement and divergence. Thus, at least two different mechanisms seem to have led to the large number of duplications seen in the Arabidopsis genome: (1) duplication of the whole genome through polyploidy; and

Figure 19.25 (a) Arabidopsis thaliana (mustard-like weed). Five pairs of linear chromosomes. Genome size: 167 million bp. Number of genes: 25,706. (b) Distribution of gene functions in Arabidopsis.
