With the completion of the sequence of the S. cerevisiae genome in April 1996 came a new challenge, namely functional analysis of the genome. Methods needed to be developed on a genome-wide scale to provide the tools for understanding the roles of the approximately 6000 gene products, their expression patterns, and how they interact to create a eukaryotic organism capable of complex processes like growth, cell division, and the response to extracellular signals. Several groups from throughout the world are working to meet this challenge and develop databases and research resources for the Saccharomyces scientific community. The most important of these is the Saccharomyces Genome Database (SGD) at Stanford University. Others include the European Functional Analysis Network (EUROFAN), the Yeast Technology Resource Center at the University of Washington in Seattle, Yale Genome Analysis Center at Yale University Medical School in New Haven, and the Center for Molecular Medicine and Therapeutics of the University of British Columbia. In addition, reagents specific for working with Saccharomyces are becoming available through commercial sources.
Why such attention to a simple microorganism? Despite the absence of multicellular development, Saccharomyces is not so very different from other eukaryotes at the genetic level. Comparative genomic analysis of Drosophila melanogaster and Caenorhabditis elegans found that the numbers and types of proteins (termed protein sets) in these organisms are similar, and that each protein set is only about twice the size of that of Saccharomyces when redundancy is taken into account (Rubin et al., 2000). Clearly, Saccharomyces is an excellent model for the more complex systems and one that is far more amenable to genetic analysis, at least for the near future. It is important to keep in mind that the gene functions identified in Saccharomyces define the components needed to 'build' a eukaryotic cell. Gene functions involved in the assembly of cells into multicellular organisms will be missing from the Saccharomyces protein set. Nonetheless, if the reader is interested in basic eukaryotic cell functions, Saccharomyces is a very valuable experimental tool.
Saccharomyces can be used to analyze the function of genes from other systems by complementing yeast mutations. The heterologous expression of human genes in Saccharomyces as well as genes from other organisms including plants has already proved to be a valuable tool for functional analysis and a rapidly expanding literature of such studies already exists. Heterologous expression in Saccharomyces will also be used for drug development and testing and for the commercial production of pharmaceuticals and other agents. Thus, completing the functional analysis of Saccharomyces will be a major step for the functional analysis of eukaryotes with larger genomes, like humans.
This chapter is not intended as an exhaustive study of Saccharomyces genome functional analysis. The approaches are many and are constantly being refined. New approaches are conceived and put into practice all the time. A brief survey of the most important and/or widely used methods is presented. The reader might also refer to Kumar & Snyder (2001).
The most valuable site, and one with which the reader should be familiar, is the Saccharomyces Genome Database (SGD), maintained by Stanford University (http://genome-www.stanford.edu/Saccharomyces). At the site one can access the complete genomic sequence of S. cerevisiae, the complete physical and genetic maps of all of the chromosomes, and information on all of the known genes including references to the literature. Meetings and other items of importance to the yeast community are listed at the site. One can get help with technical questions or look up the address of a colleague. This site also has links to other major sites for yeast protein analysis including YPD and MIPS.
The Yeast Proteome Database (YPD) (http://www.proteome.com/YPDhome.html) is a comprehensive site for information on the approximately 6000 Saccharomyces proteins. It is maintained by Proteome, Inc. (http://proteome.com/) and provides several products and services including a detailed curation of the scientific literature from a wide array of research publications and precalculated sequence alignments for comparative genomic analysis. A scheme for protein classification is available that summarizes the function and role in the cell of specific proteins. The goal of YPD is to provide a framework for functional analysis. The BioKnowledge Library found at the Proteome site is a relational database. The website compiles published information about individual proteins, including function, subcellular location, expression patterns, and interactions with other proteins, and presents it in an easy-to-use format. Information on the protein sets from S. cerevisiae (YPD), Schizosaccharomyces pombe (PombePD), Caenorhabditis elegans (WormPD), and Candida albicans (CalPD™) is available (Costanzo et al., 2001).
Another important site is the Munich Information Center for Protein Sequences (MIPS) (http://mips.gsf.de/). MIPS is maintained by the bioinformatics section of the National Research Center for Environment and Health of the Max-Planck Institute for Biochemistry and is a member of PIR International (Protein Identification Resource) and of the European Molecular Biological Network (EMBNET). The MIPS Yeast Genome Database contains extensive information regarding the Saccharomyces ORFs. Other projects include genome analysis of Arabidopsis thaliana, the model dicot plant, and Neurospora crassa, another fungal genetic model organism. MIPS is involved in the development of active database systems for the efficient use of sequence data, particularly for human genome analysis. MIPS's Protfam project is for protein classification into families and superfamilies, for motif searches, and identification of homology domains.
A site for the prediction of protein function based on sequence and structural information has been established at UCLA (http://www.doe-mbi.ucla.edu). The site includes a database for yeast proteins and has many other interesting features.
Martzen et al. (1999) reported a genome-wide strategy for identifying genes encoding products with specific biochemical activities (reviewed in Carlson, 2000). This method fuses the complete library of yeast ORFs to glutathione S-transferase (GST), expresses these GST fusions from the high-expression, copper-regulated CUP1 promoter, and introduces the constructions into an appropriate yeast host where abundant expression can be induced. The GST fusion proteins can then be partially purified using standard methods and tested for a specific biochemical activity, such as cAMP-activated kinase activity.
A collection of over 6000 yeast strains, each expressing a different yeast ORF, is available from Research Genetics (Huntsville, AL) (http://resgen.com/) for those interested in this biochemical screening method. Using a pooling approach, one can screen large numbers of transformants expressing the GST fusions. When an activity is detected in a pool, the pool is deconvoluted and the identity of the specific transformant expressing the GST fusion protein with the desired activity is determined. The usefulness of this method was demonstrated by the identification of the genes encoding cyclic phosphodiesterase and cytochrome c methyltransferase (Martzen et al., 1999).
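The logic of pooling and deconvolution can be illustrated with a minimal sketch. It assumes a row-and-column pooling layout, which is only one of several possible designs and is not necessarily the scheme used by Martzen et al. (1999). The strain names, the grid dimensions, and the `toy_assay` function are hypothetical stand-ins for real strains and a real biochemical assay.

```python
from itertools import product


def make_pools(strains, n_cols):
    """Place strains on a grid and build one pool per row and one per column."""
    row_pools, col_pools = {}, {}
    for index, strain in enumerate(strains):
        row, col = divmod(index, n_cols)
        row_pools.setdefault(row, []).append(strain)
        col_pools.setdefault(col, []).append(strain)
    return row_pools, col_pools


def deconvolute(row_pools, col_pools, assay):
    """Return the strains found at intersections of active row and column pools."""
    active_rows = [r for r, members in row_pools.items() if assay(members)]
    active_cols = [c for c, members in col_pools.items() if assay(members)]
    candidates = []
    for r, c in product(active_rows, active_cols):
        candidates.extend(sorted(set(row_pools[r]) & set(col_pools[c])))
    return candidates


# Hypothetical example: 96 strains on an 8 x 12 grid, one of which is active.
strains = [f"ORF{i:04d}" for i in range(96)]
active = {"ORF0042"}


def toy_assay(members):
    """Stand-in for a biochemical assay: positive if the pool holds an active strain."""
    return any(member in active for member in members)


row_pools, col_pools = make_pools(strains, n_cols=12)
hits = deconvolute(row_pools, col_pools, toy_assay)
print(hits)  # ['ORF0042']
```

In this layout 20 pool assays (8 rows plus 12 columns) replace 96 individual assays. If several strains in the grid are active, the intersections can include extra candidates, so each candidate would still need to be retested individually.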
DNA microarray analysis is essentially a method for carrying out thousands of hybridizations at one time using small samples. DNA probes representing each of the Saccharomyces ORFs are irreversibly attached to a solid substrate such as a glass slide or a nylon membrane. The unique sequence fragments are made by PCR using carefully selected primer pairs internal to the transcribed regions. Detailed information on the production of DNA arrays for Saccharomyces can be found in Eisen & Brown (1999). Research Genetics is a commercial source of the complete set of Saccharomyces primer pairs. The DNA fragments are spotted onto the substrate using specialized devices capable of producing an 80 × 80 array of 6400 samples in an area of about 18 mm². Newer technology that synthesizes the single-stranded oligonucleotide on the substrate is now available and will be most useful for organisms with larger genomes and protein sets than Saccharomyces (Ramsay, 1998).
The DNA sample under analysis is incubated with the DNA chip under conditions that allow hybridization. Initially, the sample DNA was radioactively labeled and phosphorimaging was used to detect positions in the array at which hybridization had occurred. Currently, the sample DNA is labeled with a fluorescent tag and laser scanning or a fluorescent confocal microscope detects positions of hybridization. The results are expressed quantitatively relative to a control condition and changes of twofold or more are considered significant. Various methods for displaying the results of genome-wide analysis are in development, since the wealth of information produced by microarray analysis can be quite daunting (Eisen et al., 1998; Zhang, 1999; Aach et al., 2000; Brown et al., 2000). To carry out a DNA microarray analysis, the researcher must affiliate with a facility that has the equipment to prepare the DNA chips and to obtain and analyze the data.
Sample DNA preparation depends on the experiment. Typically, DNA microarray methods are used to compare transcript expression patterns under different growth conditions, in different mutant backgrounds, or in different cell and tissue types. RNA samples are purified from cells grown in the experimental and control conditions. cDNA is made using a fluorescently tagged oligo(dT) primer that anneals to the poly(A) tail at the 3' end of mRNA. The cDNA sample produced is then used to hybridize to the DNA microarray chip. If the mRNA for a particular gene is represented in the RNA extract, then this method should detect a hybridization signal. The expression level in the experimental culture is compared with that in the control condition and changes in expression level are quantified.
DNA microarray analysis is being used for an increasing variety of studies. Spellman et al. (1998) used this technique to identify and characterize the expression of cell-cycle-regulated genes. Genome-wide comparisons of glucose repression of transcription in mig1 and mig2 null mutant strains identified many genes controlled by these repressors (Lutfiyya et al., 1998). Similar studies of snf/swi mutants and multidrug-resistant yeast mutants have been reported (Sudarsanam et al., 2000; DeRisi et al., 2000). A method referred to as comparative genomic hybridization has been developed to study genomic copy-number changes using the DNA microarray methodology (Pollack et al., 1999). These and other techniques for the study of genomic changes in tumor cells will be valuable tools for the design of specific treatment therapies. Information on microarray analysis data sets can be found at the SGD database (http://genome-www4.stanford.edu/MicroArray/SMD/).
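The comparison of experimental and control signals can be sketched as follows. This is a minimal illustration that assumes background-corrected spot intensities are already available as plain numbers. The gene names are real yeast genes, but the intensity values are invented for the example, and the function name `fold_changes` is hypothetical. The twofold threshold is the one quoted above.

```python
import math


def fold_changes(experimental, control, threshold=2.0):
    """Return {gene: (ratio, log2_ratio)} for genes changing by >= threshold-fold.

    Both arguments map gene names to signal intensities. Genes with a missing
    or non-positive signal in either condition are skipped.
    """
    significant = {}
    for gene, exp_signal in experimental.items():
        ctl_signal = control.get(gene)
        if ctl_signal is None or ctl_signal <= 0 or exp_signal <= 0:
            continue
        ratio = exp_signal / ctl_signal
        # Induced genes have ratio >= threshold, repressed genes <= 1/threshold.
        if ratio >= threshold or ratio <= 1 / threshold:
            significant[gene] = (ratio, math.log2(ratio))
    return significant


# Invented intensities, for illustration only.
experimental = {"GAL1": 8000.0, "ACT1": 1050.0, "SUC2": 200.0}
control = {"GAL1": 500.0, "ACT1": 1000.0, "SUC2": 900.0}

result = fold_changes(experimental, control)
for gene, (ratio, log_ratio) in sorted(result.items()):
    print(f"{gene}: {ratio:.2f}-fold (log2 = {log_ratio:+.2f})")
```

Reporting the log2 ratio alongside the raw ratio treats induction and repression symmetrically: a twofold increase is +1 and a twofold decrease is -1.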
The basics of two-hybrid analysis were described in Chapter 10. Stan Fields and colleagues have expanded upon this method to develop reagents for improved genome-wide studies of protein-protein interaction. These methods are described in detail in Uetz et al. (2000) and more information is available on the University of Washington Yeast Technology Resource Center website (http://depts.washington.edu/yeastrc), at the Stan Fields laboratory website (http://depts.washington.edu/sfields/projects/YPLM/), and at the Curagen Corporation site (http://portal.curagen.com/).
Each of the approximately 6000 Saccharomyces ORFs has been cloned into a Gal4 transcription activation domain vector and a Gal4 DNA-binding domain vector using PCR-based approaches. Each of these bait and prey fusion sets is available as transformants of an appropriate pair of yeast host strains of different mating types. Interaction is tested by mating the appropriate transformant pairs and screening/selecting for a positive protein-protein interaction. HIS3, URA3, and lacZ reporters are available and often more than one is used in each test.
Methods for genome-wide analysis of all potential test pairs (6000 baits by 6000 fish) are under development and the efficacy of two of these approaches has been reported (Uetz et al., 2000). Researchers interested in using this method to explore interactions involving a specific protein of interest should contact Stan Fields to set up a collaboration.
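The scale of an all-against-all screen can be made concrete with a short calculation. The sketch below assumes that preys are arrayed in 96-well plates and that an interaction is called positive only when at least two reporters respond. Both are illustrative assumptions, not details taken from Uetz et al. (2000), and the function names are hypothetical.

```python
import math

N_ORFS = 6000          # approximate ORF count quoted in the text
WELLS_PER_PLATE = 96   # assumption: a standard 96-well array format


def screen_scale(n_baits=N_ORFS, n_preys=N_ORFS, wells=WELLS_PER_PLATE):
    """Return (number of bait-prey pairs, plates needed to array all preys once)."""
    pairs = n_baits * n_preys
    plates_per_bait = math.ceil(n_preys / wells)
    return pairs, plates_per_bait


def is_positive(reporter_results, min_reporters=2):
    """Call an interaction positive if enough reporters respond.

    reporter_results maps a reporter name (e.g. 'HIS3') to True or False.
    The two-reporter cutoff is an assumption made for this sketch.
    """
    return sum(reporter_results.values()) >= min_reporters


pairs, plates_per_bait = screen_scale()
print(pairs)            # 36000000
print(plates_per_bait)  # 63
print(is_positive({"HIS3": True, "URA3": True, "lacZ": False}))  # True
```

Requiring agreement between reporters is one simple way to reduce false positives, at the cost of missing weak interactions that activate only one reporter.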
GENOME-WIDE GENERATION OF NULL MUTATIONS
The most straightforward method of creating null mutations in a gene uses one-step gene disruption similar to the methods described in Chapter 1 (Winzeler et al., 1999). With the complete Saccharomyces sequence in hand it became possible to use PCR-based methods to delete each of the approximately 6000 ORFs. A consortium of European and American research laboratories is in the process of generating the complete set of knock-out strains in a number of isogenic haploid strains and in an a/α diploid. The complete list of available knock-out strains can be found at the SGD website (sequence-www.stanford.edu/group/yeast_deletion_project/deletions3.html) and strains can be obtained from Research Genetics (check their website for pricing).
For researchers interested in obtaining null mutations in specialized strain backgrounds, another method, based on transposon mutagenesis and developed by Michael Snyder of Yale University, might be better suited (Burns et al., 1994; Ross-Macdonald et al., 1997, 1999). Specific details of the method and protocols are available on the Yale Genome Analysis Center website (http://www.ygac.med.yale.edu/). Pools of mutagenized plasmid DNA are provided upon request along with amplification procedures.
The transposon used to randomly mutagenize the yeast genome is a Tn3 derivative that contains the lacZ gene, yeast LEU2, and the E. coli ampicillin-resistance gene. The lacZ gene is at one end of the Tn3 sequence just after the inverted repeat, and transposition into an ORF will create a lacZ fusion. Theoretically, one in six transposition events will create a gene fusion to lacZ in the proper orientation and reading frame. The yeast genomic library used for the mutagenesis contained 10⁵ recombinant plasmids (about 20 genomes) with the yeast insert fragment bounded by NotI sites so that the fragment may be released from the vector by simple digestion.
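The one-in-six figure follows from counting the possible outcomes of an insertion into an ORF: the transposon can land in either of two orientations and in any of three reading-frame phases, and only one of the six combinations places lacZ in the same orientation and frame as the ORF. The enumeration below is a minimal sketch of that arithmetic. It assumes all six outcomes are equally likely, and the labels used for orientation and frame are hypothetical.

```python
from fractions import Fraction
from itertools import product

ORIENTATIONS = ("same_as_orf", "opposite_to_orf")
FRAMES = (0, 1, 2)  # insertion position modulo 3, relative to the ORF reading frame


def in_frame_fusion(orientation, frame):
    """True if lacZ is read in the ORF's orientation and reading frame.

    Assumption: frame 0 denotes the single junction phase that keeps lacZ in frame.
    """
    return orientation == "same_as_orf" and frame == 0


outcomes = list(product(ORIENTATIONS, FRAMES))
productive = [outcome for outcome in outcomes if in_frame_fusion(*outcome)]
fraction = Fraction(len(productive), len(outcomes))
print(fraction)  # 1/6
```

The remaining five-sixths of insertions within ORFs are still expected to disrupt the gene. They simply do not produce a fusion protein that can be followed by its lacZ activity.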
Insertional mutagenesis of the library is achieved by a shuttle approach. A plasmid carrying the transposon is introduced into pools of E. coli host cells carrying the library plasmids. Transposition of the specially constructed Tn3 is activated, and transposition into the yeast insert will occur. Library plasmid DNA is then prepared separately from each pool. All insertions into the ORF of the yeast gene should create a null mutation with very rare exceptions.
Mutagenized library DNA is digested with NotI to release the Tn3-mutagenized yeast fragment and transformed into the yeast host cell, and Leu+ transformants are selected. The Tn3-containing yeast fragment replaces the genomic copy of the region by homologous recombination, i.e. one-step gene replacement. The Leu+ transformants can be selected or screened for additional phenotypes, such as suppression of a mutant phenotype of interest. Using this method, one can create null mutations in genes throughout the Saccharomyces genome. The site of the Tn insertion can be readily identified by genomic PCR using a primer internal to the transposon. In addition, included among the null mutations will be some that create lacZ fusions, and these can be used for subcellular localization of the fusion product or studies of its expression pattern. Analysis has found that the Tn3-based system is not as random as suggested and a new method is under development.
REFERENCES AND FURTHER READING
Aach, J., W. Rindone, & G.M. Church (2000) Systematic management and analysis of yeast gene expression data. Genome Res. 10: 431-445.
Brent, R. (1999) Functional genomics: learning to think about gene expression data. Curr. Biol. 9: R338-R341.
Brown, P.O., W.N. Grundy, D. Lin, N. Cristianini, C.W. Sugnet, R.S. Furey, M. Ares Jr, & D. Haussler (2000) Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc. Natl Acad. Sci. USA 97: 262-267.
Burns, N., B. Grimwade, P.B. Ross-Macdonald, E.Y. Choi, K. Finberg, G.S. Roeder, & M. Snyder (1994) Large-scale analysis of gene expression, protein localization, and gene disruption in Saccharomyces cerevisiae. Genes Dev. 8: 1087-1105.
Carlson, M. (2000) The awesome power of yeast biochemical genomics. Trends Genet. 16: 49-51.
Coelho, P.S., A. Kumar, & M. Snyder (2000) Genome-wide collections: toolboxes for functional genomics. Curr. Opin. Microbiol. 3: 309-315.
Costanzo, M.C., M.E. Crawford, J.E. Hirschman, J.E. Kranz, P. Olsen, L.S. Robertson, M.S. Skrzypek, B.R. Braun, K.L. Hopkins, P. Kondu, C. Lengieza, J.E. Lew-Smith, M. Tillberg, & J.I. Garrels (2001) YPD™, PombePD™ and WormPD™: model organism volumes of the BioKnowledge™ Library, an integrated resource for protein information. Nucleic Acids Res. 29: 75-79.
DeRisi, J., B. van den Hazel, P. Marc, E. Balzi, P. Brown, C. Jacq, & A. Goffeau (2000) Genome microarray analysis of transcriptional activation in multidrug resistant yeast mutants. FEBS Lett. 470: 156-160.
Dujon, B. (1998) European Functional Analysis Network (EUROFAN) and the functional analysis of the Saccharomyces cerevisiae genome. Electrophoresis 19: 617-624.
Eisen, M.B. & P.O. Brown (1999) DNA arrays for analysis of gene expression. Meth. Enzymol. 303: 179-205.
Eisen, M.B., P.T. Spellman, P.O. Brown, & D. Botstein (1998) Cluster analysis and display of genome-wide expression patterns. Proc. Natl Acad. Sci. USA 95: 14863-14868.
Fields, S., Y. Kohara, & D.J. Lockhart (1999) Functional genomics. Proc. Natl Acad. Sci. USA 96: 8825-8826.
Fromont-Racine, M., J.C. Rain, & P. Legrain (1997) Toward a functional analysis of the yeast genome through exhaustive two-hybrid screens. Nature Genet. 16: 277-282.
Goffeau, A., B.G. Barrell, H. Bussey, R.W. Davis, et al. (1996) Life with 6000 genes. Science 274: 546-567.
Hieter, P. & M. Boguski (1997) Functional genomics: it's all how you read it. Science 278: 601-602.
Hodges, P.E., A.H. McKee, B.P. Davis, W.E. Payne, & J.I. Garrels (1999) The Yeast Proteome Database (YPD): a model for the organization and presentation of genome-wide functional data. Nucleic Acids Res. 27: 69-73.
Hudson, J.R. Jr, E.P. Dawson, K.L. Rushing, C.H. Jackson, D. Lockshon, D. Conover, C. Lanciault, F.R. Harris, S.J. Simmons, R. Rothstein, & S. Fields (1997) The complete set of predicted genes from Saccharomyces cerevisiae in a readily usable form. Genome Res. 7: 1169-1173.
Johnston, M. & S. Fields (2000) Grass-roots genomics. Nature Genet. 24: 5-6.
Kumar, A. & M. Snyder (2001) Emerging technologies in yeast genomics. Nature Rev. Genet. 2: 302-312.
Lutfiyya, L.L., V.R. Iyer, J. DeRisi, M.J. DeVit, P.O. Brown, & M. Johnston (1998) Characterization of three related glucose repressors and genes they regulate in Saccharomyces cerevisiae. Genetics 150: 1377-1391.
Martzen, M.R., S.M. McCraith, S.L. Spinelli, F.M. Torres, S. Fields, E.J. Grayhack, & E.M. Phizicky (1999) A biochemical genomics approach for identifying genes by the activity of their products. Science 286: 1153-1155.
Mewes, H.W., K. Albermann, M. Bahr, D. Frishman, A. Gleissner, J. Hani, K. Heumann, K. Kleine, A. Maierl, S.G. Oliver, F. Pfeiffer, & A. Zollner (1997) Overview of the yeast genome. Nature 387: 7-65.
Mewes, H.W., K. Albermann, K. Heumann, S. Liebl, & F. Pfeiffer (1997) MIPS: a database for protein sequences, homology data and yeast genome information. Nucleic Acids Res. 25: 28-30.
Oliver, S.G., M.K. Winson, D.B. Kell, & F. Baganz (1998) Systematic functional analysis of the yeast genome. Trends Biotechnol. 16: 373-378.
Pollack, J.R., C.M. Perou, A.A. Alizadeh, M.B. Eisen, A. Pergamenschikov, C.F. Williams, S.S. Jeffrey, D. Botstein, & P.O. Brown (1999) Genome-wide analysis of DNA copy-number changes using cDNA microarrays. Nature Genet. 23: 41-46.
Ramsay, G. (1998) DNA chips: state of the art. Nature Biotechnol. 16: 40-44.
Ross-Macdonald, P., A. Sheehan, G.S. Roeder, & M. Snyder (1997) A multipurpose transposon system for analyzing protein production, localization, and function in Saccharomyces cerevisiae. Proc. Natl Acad. Sci. USA 94: 190-195.
Ross-Macdonald, P., P.S. Coelho, T. Roemer, S. Agarwal, et al. (1999) Large-scale analysis of the yeast genome by transposon tagging and gene disruption. Nature 402: 413-418.
Rubin, G.M., M.D. Yandell, J.R. Wortman, G.L. Miklos, et al. (2000) Comparative genomics of the eukaryotes. Science 287: 2204-2215.
Spellman, P.T., G. Sherlock, M.Q. Zhang, V.R. Iyer, K. Anders, M.B. Eisen, P.O. Brown, D. Botstein, & B. Futcher (1998) Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol. Biol. Cell 9: 3273-3297.
Sudarsanam, P., V.R. Iyer, P.O. Brown, & F. Winston (2000) Whole-genome expression analysis of snf/swi mutants of Saccharomyces cerevisiae. Proc. Natl Acad. Sci. USA 97: 3364-3369.
Uetz, P., L. Giot, G. Cagney, et al. (2000) A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 403: 623-627.
Winzeler, E.A., D.D. Shoemaker, A. Astromoff, H. Liang, et al. (1999) Functional characterization of the Saccharomyces cerevisiae genome by deletion and parallel analysis. Science 285: 901-906.
Zhang, M.Q. (1999) Large-scale gene expression data analysis: a new challenge to computational biologists. Genome Res. 9: 681-688.
Genetic Techniques for Biological Research Corinne A. Michels Copyright © 2002 John Wiley & Sons, Ltd ISBNs: 0-471-89921-6 (Hardback); 0-470-84662-3 (Electronic)