Size of internal exon: 145 bp
Size of intron: 3365 bp
Size of 5' untranslated region: 300 bp
Size of 3' untranslated region: 770 bp
Size of coding region: 1340 bp
Total length of gene: 27,000 bp

1. Miscellaneous

2. Viral protein

3. Transfer/carrier protein

4. Transcription factor

5. Nucleic acid enzyme

6. Signaling molecule

7. Receptor

8. Kinase

9. Select regulatory molecule

10. Transferase

11. Synthase and synthetase

12. Oxidoreductase

13. Lyase

14. Ligase

15. Isomerase

16. Hydrolase

17. Molecular function unknown

18. Transporter

19. Intracellular transporter

20. Select calcium-binding protein

21. Protooncogene

22. Structural protein of muscle

23. Motor

24. Ion channel

25. Immunoglobulin

26. Extracellular matrix

27. Cytoskeletal structural protein

28. Chaperone

29. Cell adhesion

... chromosomes X, 4, 18, 13, and Y have the lowest density. Some proteins encoded by the human genome that are not found in other animals include those affecting immune function; neural development, structure, and function; intercellular and intracellular signaling pathways in development; hemostasis; and apoptosis.

Transposable elements are much more common in the human genome than in worm, plant, and fruit-fly genomes (Table 19.4). The density of transposable elements varies, depending on chromosome location. In one region of the X chromosome, 89% of the DNA is made up of transposable elements, whereas other regions are largely devoid of these elements. There are a variety of types of transposable elements in the human genome, including LINEs, SINEs, retrotransposons, and DNA transposons (see Chapter 11). Most appear to be evolutionarily old and are defective, containing mutations and deletions so that they are no longer capable of transposition. Information on the human genome and other genome-sequencing projects, including animal, plant, protozoan, fungal, and bacterial genomes

The Future of Genomics

The genomes of numerous organisms are in the process of being sequenced. These sequencing efforts, combined with the large amount of known DNA sequence that now exists, provide information that is tremendously useful for agriculture, human health, and biotechnology. The complete genome sequences of the mouse and the chimpanzee will serve as important sources of insight into the function and evolution of the human genome, inasmuch as these organisms are related to humans and are often used in studies of human health. Having complete genome sequences of crop plants and domestic animals will make it easier to identify genes that affect yield, disease and pest resistance, and other agriculturally important traits, which can then be manipulated by traditional breeding or genetic engineering to produce greater quantities and more-nutritious foods.

In the future, whole or partial genomic sequence information will be used in individual patient care. Currently, newborn babies are screened for a few treatable genetic diseases, such as phenylketonuria, which can be identified with the use of simple biochemical tests. In the future, newborns may be screened for a large number of variations in genetic sequence that confer a high risk of treatable diseases, such as coronary artery disease, hypertension, asthma, and certain types of cancer. For those persons who are identified as genetically at risk, preventive treatment may be started early. In what has been called "personalized medicine," a person's DNA sequence may be used to predict responses to different treatment regimes, and drug therapy may then be fine-tuned to a person's genetic background. Genetic testing of both patients and pathogens will allow faster and more-precise diagnoses of many diseases.

Along with the many potential benefits of having complete sequence information are concerns about the misuse of this information. With the knowledge gained from genomic sequencing, many more genes for diseases, disorders, and behavioral and physical traits will be identified, increasing the number of genetic tests that can be performed to make predictions about the future phenotype and health of a person. There is concern that information from genetic testing might be used to discriminate against people who are carriers of disease-causing genes or who might be at risk for some future disease. Questions arise about who owns a person's genome sequence. Should employers and insurance companies have access to this information? What about relatives, who have similar genomes and who might also be at risk for some of the same diseases? There are also questions about the use of this information to select for specific traits in future offspring. All of these concerns are legitimate and must be addressed if we are to use the information from genome sequencing responsibly. Ethical issues associated with the Human Genome Project and genomics in general

Connecting Concepts Across Chapters

Genomics, the focus of this chapter, uses many of the techniques described in Chapter 18 for studying individual genes and applies them to the entire genome. What is different about genomics is the tremendous amount of information that is produced by using these techniques, requiring special computational tools. Although the details of many of these methods are beyond the scope of this book, an understanding of the underlying principles of genomics and the general trends emerging from the results of genomic studies is important to a student in a general genetics course. Genomics holds great potential for understanding biological processes and for applications in health, agriculture, and biotechnology. It will undoubtedly be one of the most important areas of future genetic research.

A surprising result to emerge from the study of genomics is the finding that organisms that differ greatly in phenotype and complexity may possess many similar genes and, in fact, may not differ greatly in the total number of genes that they possess. This finding suggests that differences in phenotype are often due more to differing patterns of gene expression than to differences in the protein-coding information of their genomes.

Much of what has already been covered in this book is relevant to the study of genomics. Information on gene mapping (Chapter 7), DNA structure (Chapter 10), chromosome organization (Chapter 11), transcription (Chapter 13), protein synthesis (Chapter 15), and recombinant DNA (Chapter 18) is particularly critical for understanding the concepts presented in this chapter. Comprehension of some of the topics covered in subsequent chapters will be facilitated by an understanding of the information in this chapter; such topics include organelle DNA in Chapter 20 and evolutionary genetics in Chapter 23.


• Genomics is the field of genetics that attempts to understand the content, organization, and function of genetic information contained in whole genomes.

• Structural genomics concerns the organization and sequence of the genome. Functional genomics studies the biological function of genomic information. Comparative genomics compares the genomic information in different organisms.

• Genetic maps position genes relative to other genes by determining rates of recombination and are measured in percent recombination. Physical maps are based on the physical distances between genes and are measured in base pairs.

• The location of sites recognized by restriction enzymes can be determined by cutting the DNA with each restriction enzyme separately and in combinations and then comparing the restriction fragments produced.

• DNA sequencing determines the base sequence of nucleotides along a stretch of DNA. The Sanger (dideoxy) method uses special substrates for DNA synthesis (dideoxynucleoside triphosphates, ddNTPs) that terminate synthesis after they are incorporated into the newly made DNA. Four reactions, each with a different ddNTP, are set up. In each reaction, DNA fragments of varying length are produced, all of which terminate in nucleotides with the same base. The products of the four reactions are separated by gel electrophoresis, and the sequence of the DNA synthesized is read from the pattern of bands on the gel.

• Sequencing a whole genome requires breaking the genome into small overlapping fragments whose DNA sequence can be determined in sequencing reactions. The individual sequences can be ordered into a whole genome sequence with the use of a map-based approach, in which fragments are assembled in order by using previously created genetic and physical maps, or with the use of a whole-genome shotgun approach, in which overlap between fragments is used to assemble them into a whole-genome sequence.

• The Human Genome Project is an effort to determine the entire sequence of the human genome. The project began officially in 1990; rough drafts of the human genome sequence were completed in 2000.

• Single-nucleotide polymorphisms are single-base differences in DNA between individuals and are valuable as markers in linkage studies.

• Expressed-sequence tags are markers associated with expressed (transcribed) DNA sequences. RNA from a cell is subjected to reverse transcription, producing cDNA molecules. A short stretch of the cDNA is then sequenced, which provides a marker that tags (identifies) the DNA fragment. Expressed-sequence tags can be used to find the genes expressed in a genome.

• Bioinformatics is a synthesis of molecular biology and computer science that develops tools to store, retrieve, and analyze DNA, cDNA, and protein sequence data.

• A transcriptome is the set of all RNA molecules transcribed from a genome; a proteome is the set of all the proteins encoded by the genome.

• Computer programs can identify genes by looking for characteristic features of genes within a sequence.

• Homologous genes are evolutionarily related. Orthologs are homologous sequences found in different organisms, whereas paralogs are homologous sequences found in the same organism. Gene function may be determined by looking for homologous sequences (both orthologs and paralogs) whose function has been previously determined.

• Functions of unknown genes may be inferred by searching databases for protein domains in genes that have been previously characterized.

• The functions of unknown genes can be inferred by using methods that compare DNA sequences, including phylogenetic profiling, protein fusion patterns, and linkage arrangements of genes in different organisms.

• A microarray consists of DNA fragments fixed in an orderly pattern to a solid support, such as a nylon filter or glass slide. When a solution containing a mixture of DNA or RNA is applied to the array, any nucleic acid that is complementary to the probe being used will bind to the probe. Microarrays can be used to monitor the expression of thousands of genes simultaneously.

• Genes affecting a particular function or trait can be identified through whole-genome mutagenesis screens. In this process, a group of organisms is screened for abnormal phenotypes subsequent to mutagenesis, and the mutated genes causing the abnormal phenotypes are identified by positional cloning.

• The genomes of many prokaryotic organisms have been determined. Most species have between 1 million and 3 million base pairs of DNA and from 1000 to 2000 genes. Compared with that of eukaryotic genomes, the density of genes in prokaryotic genomes is relatively uniform, with about one gene per 1000 bp. There is relatively little noncoding DNA between prokaryotic genes. Horizontal gene transfer (the movement of genes between different species) has been an important evolutionary process in prokaryotes.

• Eukaryotic genomes are larger and more variable in size than prokaryotic genomes. There is no clear relation between organismal complexity and the amount of DNA or number of genes among multicellular organisms. Much of the genomes of eukaryotic organisms consist of repetitive DNA. Transposable elements are very common in most eukaryotic genomes.

• Genomics is making important contributions to human health, agriculture, biotechnology, and our understanding of evolution.
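
The Sanger (dideoxy) logic summarized in the list above can be illustrated with a small Python sketch. It is a deliberately simplified model, not a description of the real chemistry: it assumes that termination occurs at every position of the newly made strand and it ignores the primer. The function names `sanger_reactions` and `read_gel` and the example strand are hypothetical, chosen only for illustration.

```python
def sanger_reactions(new_strand):
    """Map each ddNTP reaction (A, C, G, T) to the lengths of its terminated fragments.

    Simplifying assumption: a fragment ends at every position where the
    newly synthesized strand carries the base matching that reaction's ddNTP.
    """
    reactions = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        reactions[base].append(length)
    return reactions


def read_gel(reactions):
    """Read the sequence by ordering all bands from shortest to longest fragment."""
    bands = [(length, base)
             for base, lengths in reactions.items()
             for length in lengths]
    return "".join(base for _, base in sorted(bands))


new_strand = "GATTACCA"               # hypothetical newly synthesized strand
reactions = sanger_reactions(new_strand)
sequence = read_gel(reactions)

print(reactions)   # fragment lengths seen in each of the four reactions
print(sequence)    # sequence recovered from the band pattern
```

Each fragment length appears in exactly one of the four reactions, so sorting the bands by length and noting which reaction each came from recovers the sequence of the synthesized strand.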

Important Terms

genomics
structural genomics
functional genomics
comparative genomics
genetic map
physical map
restriction mapping
contig
DNA sequencing
dideoxyribonucleoside triphosphate (ddNTP)
map-based sequencing
whole-genome shotgun sequencing
single-nucleotide polymorphism (SNP)
expressed-sequence tag (EST)
bioinformatics
open reading frame
transcriptome
proteome
homologous genes
orthologous genes
paralogous genes
protein domain
phylogenetic profile
fusion pattern
gene neighbor analysis
microarray
mutagenesis screen
positional cloning
horizontal gene exchange

Worked Problems

1. A linear piece of DNA that is 30 kb long is first cut with BamHI, then with HpaII, and finally with both BamHI and HpaII together. Fragments of the following sizes were obtained from these reactions.

BamHI: 20-kb, 6-kb, and 4-kb fragments

HpaII: 21-kb and 9-kb fragments

BamHI and HpaII: 20-kb, 5-kb, 4-kb, and 1-kb fragments

Draw a restriction map of the 30-kb piece of DNA, indicating the locations of the BamHI and HpaII restriction sites.

This problem can be solved correctly through a variety of approaches; this solution applies one possible approach.

When cut by BamHI alone, the linear piece of DNA is cleaved into three fragments; so there must be two BamHI restriction sites. When cut with HpaII alone, a clone of the same piece of DNA is cleaved into only two fragments; so there is a single HpaII site.

Let's begin to determine the location of these sites by examining the HpaII fragments. Notice that the 21-kb fragment produced when the DNA is cut by HpaII is not present in the fragments produced when the DNA is cut by BamHI and HpaII together (the double digest); this result indicates that the 21-kb HpaII fragment has within it a BamHI site. If we examine the fragments produced by the double digest, we see that the 20-kb and 1-kb fragments sum to 21 kb; so a BamHI site must be 20 kb from one end of the fragment and 1 kb from the other end.

21-kb HpaII fragment: 20 kb | BamHI site | 1 kb

Similarly, we see that the 9-kb HpaII fragment does not appear in the double digest and that the 5-kb and 4-kb fragments in the double digest add up to 9 kb; so another BamHI site must be 5 kb from one end of this fragment and 4 kb from the other end.

9-kb HpaII fragment: 5 kb | BamHI site | 4 kb

Now, let's examine the fragments produced when the DNA is cut by BamHI alone. The 20-kb and 4-kb fragments are also present in the double digest; so neither of these fragments contains an HpaII site. The 6-kb fragment, however, is not present in the double digest, and the 5-kb and 1-kb fragments in the double digest sum to 6 kb; so this fragment contains an HpaII site that is 5 kb from one end and 1 kb from the other end.

6-kb BamHI fragment: 5 kb | HpaII site | 1 kb

We have accounted for all the restriction sites, but we must still determine the order of the sites on the original 30-kb fragment.

Notice that the 5-kb fragment must be adjacent to both the 1-kb and 4-kb fragments; so it must be in between these two fragments.

1 kb | HpaII site | 5 kb | BamHI site | 4 kb

We have also established that the 1-kb and 20-kb fragments are adjacent; because the 5-kb fragment is on one side, the 20-kb fragment must be on the other, completing the restriction map:

20 kb | BamHI site | 1 kb | HpaII site | 5 kb | BamHI site | 4 kb
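
The reasoning in this solution can be checked with a short Python sketch that simulates the three digests. The sketch assumes that positions are measured in whole kilobases from one end of the 30-kb molecule; the helper names `digest` and `consistent_maps` are hypothetical. It searches every placement of two BamHI sites and one HpaII site and keeps those that reproduce the observed fragment sizes.

```python
from itertools import combinations


def digest(length, cut_sites):
    """Fragment sizes (kb, largest first) from cutting a linear molecule at cut_sites."""
    positions = [0] + sorted(cut_sites) + [length]
    return sorted((b - a for a, b in zip(positions, positions[1:])), reverse=True)


def consistent_maps(length, bam_obs, hpa_obs, double_obs):
    """All ((BamHI site 1, BamHI site 2), HpaII site) placements matching the three digests.

    Assumption: sites fall on whole-kb positions strictly inside the molecule.
    """
    maps = []
    for b1, b2 in combinations(range(1, length), 2):
        if digest(length, [b1, b2]) != bam_obs:
            continue
        for h in range(1, length):
            if h in (b1, b2):
                continue  # the two enzymes cannot cut at the same position here
            if (digest(length, [h]) == hpa_obs
                    and digest(length, [b1, b2, h]) == double_obs):
                maps.append(((b1, b2), h))
    return maps


maps = consistent_maps(30, [20, 6, 4], [21, 9], [20, 5, 4, 1])
print(maps)  # → [((4, 10), 9), ((20, 26), 21)]
```

The two placements found are mirror images of each other, that is, the same map read from opposite ends: BamHI sites 20 kb and 26 kb from one end with the HpaII site at 21 kb, which gives fragments of 20, 1, 5, and 4 kb in order.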
