Population genetics is a rigorous formalism describing the evolution of genes. It can be done with no consideration of phenotypes, and indeed many population geneticists specifically avoid dealing with phenotypes or with the connections between the evolution of genes and that of phenotypes. Population genetics is essentially a mathematical theory and hence is as rigorous as mathematics. Its usefulness, however, depends on the values of various parameters, the degree to which they can be accurately known, and the degree to which the assumptions and formulas realistically reflect what goes on in life.
The basic dynamics of variation over time can be described without being too specific about what is meant by "gene" beyond an identifiable, discrete, heritable unit. We can assign a relative frequency to each of the alternative states, or alleles, found at a gene in some specified population of inference; the latter can be a local deme or an entire species, as long as we realize that the analysis depends on this choice. Variation requires at least two alleles in our specified population.
Conceptually, an allele frequency can be viewed alternatively as the fraction of copies of the genes in the reference population that are of the specified type or as the probability that a randomly sampled gene from that population will be of that allele. Because the discussion is usually framed in terms of relative variation for a specific genetic unit in a specified population unit, by definition the frequencies of the alternative alleles in our population must sum to 1.0.
To present the logic of genetic theory, a simple situation is usually envisioned, in which the gene in the population has only two alleles that, ever since Mendel, have been conventionally labeled as A and a. We denote the frequency of A as $p_A$; in addition, because there are only two alleles, a must comprise the rest of the population, so its frequency must be $p_a = 1.0 - p_A$. That is, $p_A + p_a = 1.0$. This can be seen schematically in Figure 3-1. It is important to keep in mind that this refers to the proportion rather than the number of copies of each allele in the population of inference. When working only with samples from that population, all of this refers to the characteristics of the sample, which if one is careful can be used as estimates of the true situation in the entire population.
Another important concept is the genotype frequency, again with regard to a reference population or a sample from it. Many species, including most animals, are diploid, meaning that (like humans) they inherit a copy of each gene from each of two parents. This means that what happens to alleles in the population happens to them as they are carried around in pairs. The variants in a given individual are referred to as its genotype. (Some species like bacteria are haploid and have only a single copy of each gene; for them their genotype is the same as their allele. There are other ploidies in nature, but these need not be considered here to understand the basic principles.) In a diploid species in the simple case (Figure 3-1), the genotype must be a homozygote, AA or aa, with two copies of the same allele, or a heterozygote, Aa, with two different alleles. The relative frequencies of these genotypes can be denoted as $p_{AA}$, $p_{Aa}$, and $p_{aa}$. Again, because these are the only possible genotypes for this gene, $p_{AA} + p_{Aa} + p_{aa} = 1.0$.
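The relationship between genotype counts, genotype frequencies, and allele frequencies can be illustrated with a small sketch. It assumes a diploid sample with two alleles; the function name `frequencies` and the example counts are hypothetical and chosen only for illustration. The allele frequency is obtained by counting copies: each AA individual carries two copies of A and each Aa individual carries one.

```python
from fractions import Fraction


def frequencies(n_AA, n_Aa, n_aa):
    """Return (genotype frequencies, allele frequencies) from diploid genotype counts."""
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("need at least one individual")
    genotype = {
        "AA": Fraction(n_AA, n),
        "Aa": Fraction(n_Aa, n),
        "aa": Fraction(n_aa, n),
    }
    # n diploid individuals carry 2n gene copies in total;
    # AA contributes two copies of A, Aa contributes one.
    freq_A = Fraction(2 * n_AA + n_Aa, 2 * n)
    allele = {"A": freq_A, "a": 1 - freq_A}
    return genotype, allele


# Hypothetical sample of 100 individuals.
genotype, allele = frequencies(30, 50, 20)
print(genotype["AA"], genotype["Aa"], genotype["aa"])  # 3/10 1/2 1/5
print(allele["A"], allele["a"])                        # 11/20 9/20
```

Exact fractions are used here so that the two sum-to-one constraints hold exactly rather than up to floating-point rounding.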
Because a diploid genotype represents the contributions to the individual from each of two parents, the genotype distribution in a given generation is the product of the mate choice and reproductive pattern in the previous generation. Of specific interest is a baseline condition called panmixia, in which individuals mate at random with respect to genotype. When this occurs, the frequency of each genotype is just the product of the frequencies of its two alleles. Thus, the probability that the first allele carried by a random individual is A is $p_A$, and the probability that the second is also A is again $p_A$, so that $p_{AA} = p_A p_A = p_A^2$. By similar reasoning, $p_{aa} = p_a^2$, and $p_{Aa} = 2 p_A p_a$, where the factor of 2 arises because a heterozygote can receive A from either parent and a from the other. These are known as the Hardy-Weinberg genotype frequencies, and when they characterize a population it is said to be in Hardy-Weinberg Equilibrium (HWE) because, as has been shown mathematically, under idealized conditions the genotype frequencies will not change over time.
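The Hardy-Weinberg expectations can be computed directly from a single allele frequency. The following is a minimal sketch for the two-allele case; the function name `hardy_weinberg` and the example frequency 0.55 are hypothetical. It also checks that the frequency of A recovered from the expected genotypes equals the frequency it started from, which is the sense in which the frequencies are at equilibrium under the idealized conditions.

```python
def hardy_weinberg(freq_A):
    """Expected genotype frequencies under random mating, two alleles A and a."""
    if not 0.0 <= freq_A <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    freq_a = 1.0 - freq_A
    return {
        "AA": freq_A * freq_A,
        "Aa": 2.0 * freq_A * freq_a,  # A from either parent, a from the other
        "aa": freq_a * freq_a,
    }


expected = hardy_weinberg(0.55)

# Frequency of A among the gene copies of the next generation:
# all copies in AA individuals plus half of those in Aa individuals.
next_freq_A = expected["AA"] + expected["Aa"] / 2.0
print(expected)
print(next_freq_A)
```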
Mating is in fact typically strikingly close to random with respect to most genes, because mate choice is unaffected by a mate's specific variants at most genes, and HWE therefore serves as a baseline against which to judge observed genotype frequencies. If these differ in a statistically significant way from what is expected under random mating, there may be reason to investigate why. There are many possible reasons, including mate choice that is affected by the gene in question and natural selection.
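One common way to judge whether observed genotype counts depart from the Hardy-Weinberg expectation is a Pearson chi-square goodness-of-fit test. The text does not prescribe a particular test, so the sketch below is an assumption about method, and the function name and both sets of counts are hypothetical. With two alleles there are three genotype classes and one allele frequency is estimated from the data, leaving one degree of freedom, for which the 5% critical value is 3.841.

```python
CRITICAL_5PCT_1DF = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Pearson chi-square statistic comparing genotype counts with HWE expectations."""
    n = n_AA + n_Aa + n_aa
    freq_A = (2 * n_AA + n_Aa) / (2 * n)  # estimated from the sample itself
    freq_a = 1.0 - freq_A
    if freq_A in (0.0, 1.0):
        raise ValueError("sample is monomorphic; the test is undefined")
    observed = [n_AA, n_Aa, n_aa]
    expected = [n * freq_A**2, 2 * n * freq_A * freq_a, n * freq_a**2]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical sample close to HWE proportions.
stat_close = hwe_chi_square(30, 50, 20)
# Hypothetical sample with the same allele frequency but a strong heterozygote deficit.
stat_deficit = hwe_chi_square(50, 10, 40)

for label, stat in [("close", stat_close), ("deficit", stat_deficit)]:
    verdict = "departs from" if stat > CRITICAL_5PCT_1DF else "is consistent with"
    print(f"{label}: chi-square = {stat:.3f}, sample {verdict} HWE at the 5% level")
```

A significant result indicates only that the genotype frequencies differ from the random-mating baseline; it does not by itself identify which of the possible causes is responsible.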
These are the basic frequency measures needed in order to describe the essential concepts of evolutionary change from a gene's eye view.
Mutation: Change of State in the Genome
All genetic change ultimately comes about through mutation. Mutation can do many things that will be discussed in Chapter 4 and beyond. These changes can alter a gene's function, expression pattern, or structure. In some instances, new genes can be inserted from outside the individual, as for example when viral particles integrate into the genome. Such a change, if it occurs in the germ line and is transmitted to the next generation, constitutes the horizontal transmission described earlier.
Each new variant arises as a single copy, so its initial allele frequency is 1/N, where N is the number of copies of the gene in the population of reference. (If N instead denotes the number of individuals in a diploid species, the initial frequency is 1/(2N), since each of the N individuals has two copies of the gene.)
Over time, alleles experience changes in their frequency. From one generation to the next, the proportion of a given allele may differ for several reasons, the primary one being chance. In a diploid species, for example, one of the two alleles that came together to form the parent randomly segregates into the germ cell to be transmitted to any given offspring. This is mendelian segregation. Which of these two possible outcomes occurs in any given case is inherently probabilistic.
Mendelian segregation introduces a fundamental element of chance in allele frequency change when individuals reproduce, and many other aspects of chance in survival, mate acquisition, or fertility affect whether that reproduction will occur in the first place. Under various assumptions, there are ways to quantify the relative probabilities of various outcomes of mendelian segregation in any family or in an entire population, and hence the distribution of possible allele frequencies in an offspring generation, if the frequencies in the parental generation are known.
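As one illustration of such a calculation, suppose that each gene copy in the offspring generation is drawn independently and at random from the parental gene pool, with a constant number of copies. This is an assumption introduced here for the sketch (it is the idealization usually called the Wright-Fisher model), not something stated in the text. Under it, the number of copies of A in the offspring generation follows a binomial distribution. The function name and parameter values are hypothetical.

```python
from math import comb


def offspring_copy_distribution(freq_A, n_copies):
    """P(k copies of A among n_copies offspring gene copies), for k = 0..n_copies.

    Assumes each offspring copy is an independent random draw from the
    parental gene pool, in which A has frequency freq_A.
    """
    freq_a = 1.0 - freq_A
    return [
        comb(n_copies, k) * freq_A**k * freq_a ** (n_copies - k)
        for k in range(n_copies + 1)
    ]


# Ten diploid individuals carry 20 gene copies; parental frequency of A is 0.5.
n_copies = 20
dist = offspring_copy_distribution(0.5, n_copies)

# Expected offspring frequency of A, averaged over all possible outcomes.
mean_freq = sum(k * prob for k, prob in enumerate(dist)) / n_copies
# Probability that the frequency is exactly unchanged in the offspring generation.
prob_unchanged = dist[n_copies // 2]
print(mean_freq, prob_unchanged)
```

On average the frequency does not change, yet in this small example the probability that it stays at exactly 0.5 is under one in five, which is why chance alone moves allele frequencies from generation to generation.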
The phenomenon of allele frequency change due strictly to chance is known as random genetic drift. We encountered the metaphor of "drift" earlier in discussing random changes in the distribution of phenotypes in a population. Frequency "drifts" randomly up or down over time, until a variant is eventually either fixed in or lost from the population, as illustrated in Figure 3-2.
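The behavior described above can be sketched with a short simulation. It assumes a constant population size and independent random sampling of gene copies each generation (the Wright-Fisher idealization, an assumption made here rather than one taken from the text), with no mutation or selection. The function name, seed, and parameter values are hypothetical.

```python
import random


def simulate_drift(freq_A, n_copies, rng, max_generations=10_000):
    """Follow the frequency of A under pure drift until fixation, loss, or the cap.

    Each generation, every one of the n_copies gene copies is drawn
    independently from the previous generation's gene pool.
    """
    count = round(freq_A * n_copies)
    trajectory = [count / n_copies]
    for _ in range(max_generations):
        if count == 0 or count == n_copies:
            break  # allele lost or fixed; no further change without mutation
        p = count / n_copies
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        trajectory.append(count / n_copies)
    return trajectory


rng = random.Random(1)  # fixed seed so the run is repeatable
trajectory = simulate_drift(0.5, 20, rng)
outcome = "fixed" if trajectory[-1] == 1.0 else "lost"
print(f"A was {outcome} after {len(trajectory) - 1} generations")
```

Rerunning with different seeds gives different trajectories and different outcomes, which is the point: the path is random even though the rules generating it are fixed.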
Genetic drift is inevitable. Based on assumptions about the population, the expected (average, over many replicates could they occur) change in allele frequency from one generation to the next and the probabilities of any particular