Saccharomyces Genome And Nomenclature


Saccharomyces cerevisiae has a haploid chromosome number of 16. The entire genome of strain S288C has been sequenced and is available from the Saccharomyces Genome Database (SGD). The site offers a variety of sequence analysis tools that are particularly useful to the Saccharomyces researcher, including gene and restriction maps of the chromosomes. It is linked to genome databases for other genetic model organisms and to sites for protein analysis. It also provides literature guides for the known Saccharomyces genes, announcements of interest to the yeast research community, and contact information for yeast researchers. SGD is well worth a visit.

The Saccharomyces genome contains more than 13 million base pairs (13 Mbp), including the rDNA, and more than 6000 open reading frames (ORFs). Each ORF has a systematic name that begins with Y (for yeast) and then indicates the chromosome by letter (A for chromosome I through P for chromosome XVI), whether the ORF lies on the right (R) or left (L) arm of the chromosome (that is, to the right or left of the centromere), and the ORF number. ORFs are numbered on each chromosome arm starting at the centromere and proceeding toward the telomere, regardless of which strand is the coding strand. Finally, the coding strand is indicated by a W (for Watson, the upper strand) or C (for Crick, the lower strand). Thus, ORF YBR288C is found on the right arm of chromosome II. It is the 288th ORF from the centromere, and the lower strand is the coding strand; that is, it is transcribed from right to left, which for the right arm means toward the centromere.
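The naming scheme can be made concrete with a short Python sketch that decodes a systematic ORF name such as YBR288C. The function and type names here are hypothetical, chosen only for illustration. The pattern assumes a leading Y, a chromosome letter A to P, an arm letter, a three-digit ORF number, and a strand letter; names that carry extra suffixes are deliberately not handled.

```python
import re
from typing import NamedTuple

# Assumed shape: Y + chromosome letter (A-P) + arm (L/R) + 3 digits + strand (W/C).
ORF_PATTERN = re.compile(r"Y([A-P])([LR])(\d{3})([WC])")


class OrfName(NamedTuple):
    chromosome: int  # 1..16, i.e. chromosome I..XVI
    arm: str         # "left" or "right" of the centromere
    index: int       # position counted outward from the centromere
    strand: str      # "Watson" (upper strand) or "Crick" (lower strand)


def parse_orf_name(name: str) -> OrfName:
    """Decode a systematic ORF name into its four components."""
    match = ORF_PATTERN.fullmatch(name.upper())
    if match is None:
        raise ValueError(f"not a recognized systematic ORF name: {name!r}")
    letter, arm, number, strand = match.groups()
    return OrfName(
        chromosome=ord(letter) - ord("A") + 1,  # A -> 1, ..., P -> 16
        arm="right" if arm == "R" else "left",
        index=int(number),
        strand="Watson" if strand == "W" else "Crick",
    )


def transcribed_toward_centromere(orf: OrfName) -> bool:
    """Crick is read right to left and Watson left to right, so the direction
    relative to the centromere depends on which arm the ORF is on."""
    return (orf.arm == "right") == (orf.strand == "Crick")


orf = parse_orf_name("YBR288C")
print(orf, transcribed_toward_centromere(orf))
```

For YBR288C this yields chromosome 2, the right arm, ORF number 288, and the Crick strand, with transcription running toward the centromere, matching the reading given in the text.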


Saccharomyces gene names consist of three letters and a number (usually 1-3 digits). The letters chosen are most often based on the phenotype or function of the gene. Note that the number follows immediately after the letters with no space. For example, a gene encoding one of the enzymes of histidine biosynthesis is referred to as HIS3. As in all organisms, gene names are italicized. Often, several mutant alleles of a particular gene have been identified. These can be distinguished by placing a suffix after the gene name; frequently a hyphen followed by the allele number is used, all with no spaces.

In Saccharomyces, dominant alleles of a gene are capitalized and recessive alleles are in lower case. For example, mutant allele #52 is a recessive mutation of URA3 and is written ura3-52. Mutant strains resistant to the toxic effects of the arginine analogue canavanine carry dominant alterations in CAN1 encoding the arginine permease and are written CAN1-R. It is important that you do not confuse the concept of a wild-type allele versus a mutant allele with capital letters (for dominant alleles) versus lower case letters (for recessive alleles). There are many examples of mutant alleles that are dominant. The interpretation of dominant versus recessive will be discussed in Chapter 4 in more detail.
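The case convention can be illustrated with a small Python sketch that splits an allele designation into gene name, allele suffix, and dominance. The names `Allele` and `parse_allele` are hypothetical, and the pattern assumes the simple form described above: three letters, a number, and an optional hyphenated suffix. Note that the function reports only dominance; as stressed above, it cannot say whether an allele is wild-type or mutant.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape: three letters, a gene number, then an optional "-suffix".
ALLELE_PATTERN = re.compile(r"([A-Za-z]{3})(\d+)(?:-(\w+))?")


class Allele(NamedTuple):
    gene: str               # gene name in upper case, e.g. "URA3"
    allele: Optional[str]   # allele suffix, e.g. "52" or "R"; None if absent
    dominant: bool          # True for capitals, False for lower case


def parse_allele(designation: str) -> Allele:
    """Read dominance from the case of the letters in an allele designation."""
    match = ALLELE_PATTERN.fullmatch(designation)
    if match is None:
        raise ValueError(f"unrecognized allele designation: {designation!r}")
    letters, number, suffix = match.groups()
    if letters.isupper():
        dominant = True
    elif letters.islower():
        dominant = False
    else:
        raise ValueError(f"mixed-case gene letters in {designation!r}")
    return Allele(gene=letters.upper() + number, allele=suffix, dominant=dominant)


print(parse_allele("ura3-52"))  # recessive allele 52 of URA3
print(parse_allele("CAN1-R"))   # dominant allele of CAN1
```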


Descriptive words or abbreviations derived from the gene name can be used when discussing the phenotype of a strain. For example, a strain carrying a lys2 mutant allele (genotype lys2) will not grow in the absence of added lysine. This phenotype can be referred to as lysine minus, lysine⁻, or lys⁻. Note that the letters are not italicized and that no gene number is given. Lysine synthesis requires several enzymatic steps, and therefore mutations in any of several genes encoding these enzymes can cause a lysine minus phenotype. When observing the phenotype of a strain, one has no information as to genotype. Therefore it is inappropriate to use the gene number. Genotype can only be determined by doing appropriate crosses to known genetic tester strains.


In journal articles on Saccharomyces it will be noted that researchers name their strains in a wide variety of ways. There are certain standard strains, like S288C, W303, or YPH500, that are commonly used in research laboratories and these will be referenced in the Materials and Methods section of an article. If the authors have done some genetic manipulations with these strains, then they will rename the strain often using their initials. For example, strain YPH500 was constructed by Phil Hieter and coworkers, and the letters stand for Yeast Phil Hieter. The article will state that the new strain is a derivative of the original strain and a literature reference to the original strain will be given. Often a strain list is presented with the relevant genotype of the strains used in the study along with information on the derivation of the strain. The genotype will indicate all of the genes that are mutant. If a gene is not listed it is assumed to be the wild-type allele found in the strain from which the mutant was derived, such as S288C. While all of these strains are highly similar at the sequence level they are not identical. Strain differences may be very few but could potentially be significant for the particular research project being described. Geneticists pay very careful attention to strain backgrounds and do their best to keep them constant.


The protein product of a Saccharomyces gene can be named based on the gene name or the function, if it is known. For example, GAL1 encodes galactokinase, the first enzyme in the catabolism of the sugar galactose. The product of the GAL1 gene is referred to as galactokinase, Gal1 protein, or Gal1p. Note that only the first letter is capitalized and that the protein name is not italicized.
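As a minimal sketch of this convention, the Python function below derives the two gene-based protein names from a gene name. The function name `protein_names` is hypothetical, and the pattern assumes a gene name of three letters followed by a number. It cannot produce a function-based name such as galactokinase, which requires knowing what the protein does.

```python
import re

# Assumed shape of a gene name: three letters followed by a number.
GENE_PATTERN = re.compile(r"([A-Za-z]{3})(\d+)")


def protein_names(gene: str) -> tuple[str, str]:
    """Return the long and short protein names derived from a gene name."""
    match = GENE_PATTERN.fullmatch(gene)
    if match is None:
        raise ValueError(f"unrecognized gene name: {gene!r}")
    letters, number = match.groups()
    base = letters.capitalize() + number  # only the first letter is capitalized
    return (f"{base} protein", f"{base}p")


print(protein_names("GAL1"))
```

The same result is returned for the dominant and recessive spellings of a gene (GAL1 or gal1), since the protein name does not carry the case convention used for alleles.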
