What is population genetics?
The branch of genetics called population genetics is based on the application of nineteenth-century Austrian botanist Gregor Mendel’s principles of inheritance to genes in a population. (Although, for some species, “population” can be difficult to define, the term generally refers to a geographic group of interbreeding individuals of the same species.) Mendel’s principles can be used to predict the expected proportions of offspring in a cross between two individuals of known genotypes, where the genotype describes the genetic content of an individual for one or more genes. An individual carries two copies of all chromosomes (except perhaps for the sex chromosomes, as in human males) and therefore has two copies of each gene. These two copies may be identical or somewhat different. Different forms of the same gene are called alleles. A genotype in which both alleles are the same is called a homozygote, while one in which the two alleles are different is a heterozygote. Although a single individual can carry no more than two alleles for a particular gene, there may be many alleles of a gene present in a population.
It would be essentially impossible to track the inheritance patterns of every single mating pair in a population, in essence tracking all the alleles in the gene pool. However, by making some simplifying assumptions about a population, it is possible to predict what will happen to the gene pool over time. Working independently in 1908, the British mathematician Godfrey Hardy and the German physiologist Wilhelm Weinberg were the first to formulate a simple mathematical model describing the behavior of a gene (locus) with two alleles in a population. In this model, the numbers of each allele and of each genotype are not represented as actual numbers but as proportions (known as allele frequencies and genotype frequencies, respectively) so that the model can be applied to any population regardless of its size. By assuming Mendelian inheritance of alleles, Hardy and Weinberg showed that allele frequencies in a population do not change over time and that genotype frequencies will change to specific proportions, determined by the allele frequencies, within one generation and remain at those proportions in future generations. This result is known as the Hardy-Weinberg law, and the stable genotype proportions predicted by the law are known as Hardy-Weinberg equilibrium. It was shown in subsequent work by others that the Hardy-Weinberg law remains true in more complex models with more than two alleles and more than one locus.
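The behavior described by the Hardy-Weinberg law can be illustrated with a short calculation. The following is a minimal sketch for one locus with two alleles; the allele labels A and a, the symbols p and q, and the function names are illustrative assumptions, not notation taken from the original papers.

```python
def genotype_frequencies(p):
    """Expected genotype frequencies after one round of random mating.

    p is the frequency of allele A; the frequency of allele a is q = 1 - p.
    """
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2.0 * p * q, "aa": q * q}


def allele_frequency(genotypes):
    """Frequency of allele A: every allele in AA plus half of those in Aa."""
    return genotypes["AA"] + 0.5 * genotypes["Aa"]


# A hypothetical starting population with no heterozygotes at all.
start = {"AA": 0.5, "Aa": 0.0, "aa": 0.5}
p = allele_frequency(start)

# One generation of random mating produces the equilibrium proportions.
next_gen = genotype_frequencies(p)
print(next_gen)

# The allele frequency itself has not changed.
print(allele_frequency(next_gen))
```

Applying genotype_frequencies again to the new allele frequency returns the same proportions, which is the sense in which the equilibrium is stable.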
In order for the Hardy-Weinberg law to work, certain assumptions about a population must be true: The gene pool must be infinite in size. Mating among individuals (or the fusion of gametes) must be completely random. There must be no new mutations. There must be no gene flow (that is, no alleles should enter or leave the population). There should be no natural selection.
Since real populations cannot meet these conditions, it may seem that the Hardy-Weinberg model is too unrealistic to be useful, but, in fact, it can be useful. First, the conditions of a natural population may be very close to Hardy-Weinberg assumptions, so the Hardy-Weinberg law may be approximately true for at least some populations. Second, if genotypes in a population are not in Hardy-Weinberg equilibrium, it is an indication that one or more of these assumptions is not met. The Hardy-Weinberg law has been broadly expanded, using sophisticated mathematical modeling, and with adequate data can be used to determine why a population’s allele and genotype frequencies are out of Hardy-Weinberg equilibrium.
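One common way to check whether sampled genotypes depart from Hardy-Weinberg proportions is a chi-square goodness-of-fit comparison. The sketch below is an illustration under stated assumptions: the genotype counts are invented, the function name is hypothetical, and the article itself does not prescribe this particular test.

```python
def hardy_weinberg_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic comparing observed genotype counts with
    the counts expected under Hardy-Weinberg proportions.

    Assumes both alleles are present in the sample, so that no
    expected count is zero.
    """
    total = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * total)   # frequency of allele A
    q = 1.0 - p
    expected = (p * p * total, 2.0 * p * q * total, q * q * total)
    observed = (n_AA, n_Aa, n_aa)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Critical value of chi-square at the 5 percent level with 1 degree of freedom.
CRITICAL_5_PERCENT_1_DF = 3.841

# Invented sample of 100 individuals with a shortage of heterozygotes.
chi_square = hardy_weinberg_chi_square(50, 20, 30)
print(round(chi_square, 2))                    # 34.03
print(chi_square > CRITICAL_5_PERCENT_1_DF)    # True: proportions deviate
```

A significant result shows only that some assumption is violated; it does not by itself identify which one.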
Sampling and genetic analyses of real populations of many different types of organisms reveal that there is usually a substantial amount of genetic variation, meaning that for a fairly large proportion of genes (loci) that are analyzed, there are multiple alleles, and therefore multiple genotypes, within populations. For example, in the common fruit fly Drosophila melanogaster (an organism that has been well studied genetically since the very early 1900s), between one-third and two-thirds of the genes that have been examined by protein electrophoresis have been found to be variable. Genetic variation can be measured as allele frequencies (allelic variation) or genotype frequencies (genotypic variation). A major task of population geneticists has been to describe such variation, to try to explain why it exists, and to predict its behavior over time.
The Hardy-Weinberg law predicts that if genetic variation exists in a population, it will remain constant over time, with genotypes in specific proportions. However, the law cannot begin to explain natural variation, since genotypes are not always found in Hardy-Weinberg proportions, and studies that involve sampling populations over time often show that genetic variation can be changing. The historical approach to explaining these observations has been to formulate more complex mathematical models based on the simple Hardy-Weinberg model that violate one or more of the implicit Hardy-Weinberg conditions.
Beginning in the 1920s and 1930s, a group of population geneticists, working independently, began exploring the effects of violating Hardy-Weinberg assumptions on genetic variation in populations. In what has become known as the “modern synthesis,” Ronald A. Fisher, J. B. S. Haldane, and Sewall Wright merged Charles Darwin’s theory of natural selection with Mendel’s theory of genetic inheritance to create a field of population genetics that allows for genetic change. They applied mathematics to the problem of variation in populations and were eventually able to incorporate what happens when each, or combinations, of the Hardy-Weinberg assumptions are violated.
One of the implicit conditions of the Hardy-Weinberg model is that genotypes form mating pairs at random. In most cases mates are not selected based on genotype. Unless the gene in question has some direct effect on mate choice, mating with respect to that gene is random. However, there are conditions in natural populations in which mating is not random. For example, if a gene controls fur color and mates are chosen by appropriate fur color, then the genotype of an individual with respect to that gene will determine mating success. For this gene, then, mating is not random but rather “assortative.” Positive assortative mating means that individuals tend to choose mates with genotypes like their own, while negative assortative mating means that individuals tend to choose genotypes different from their own.
Variation in a population for a gene subject to assortative mating is altered from Hardy-Weinberg expectations. Although allele frequencies do not change, genotype frequencies are altered. With positive assortative mating, the result is higher proportions of homozygotes and fewer heterozygotes, while the opposite is true when assortative mating is negative. Sometimes random mating in a population is not possible because of the geographic organization of the population or general mating habits. Truly random mating would mean that any individual can mate with any other, but this is nearly impossible because of gender differences and practical limitations. In natural populations, it is often the case that mates are somewhat related, even closely related, because the population is organized into extended family groups whose members do not (or cannot, as in plants) disperse to mate with members of other groups. Mating between relatives is called inbreeding. Because related individuals tend to have similar genotypes for many genes, the effects of inbreeding are much like those of positive assortative mating for many genes. The proportions of homozygotes for many genes tend to increase. Again, this situation has no effect on allelic variation, only genotypic variation. Clearly, the presence of nonrandom mating patterns cannot by itself explain the majority of patterns of genetic variation in natural populations but can contribute to the action of other forces, such as natural selection.
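The shift toward homozygotes under inbreeding is often summarized with an inbreeding coefficient, usually written F, that measures the fractional loss of heterozygotes relative to Hardy-Weinberg expectations. The sketch below assumes that standard parameterization; F, the allele labels, and the function names are not part of the discussion above and are introduced only for illustration.

```python
def genotype_frequencies_with_inbreeding(p, f):
    """Genotype frequencies for allele frequency p and inbreeding coefficient f.

    f = 0 gives Hardy-Weinberg proportions; f = 1 gives no heterozygotes.
    """
    q = 1.0 - p
    return {
        "AA": p * p + f * p * q,
        "Aa": 2.0 * p * q * (1.0 - f),
        "aa": q * q + f * p * q,
    }


def allele_frequency(genotypes):
    """Frequency of allele A recovered from genotype frequencies."""
    return genotypes["AA"] + 0.5 * genotypes["Aa"]


# Heterozygotes decline as f rises, but the allele frequency stays at 0.5.
for f in (0.0, 0.25, 0.5):
    genotypes = genotype_frequencies_with_inbreeding(0.5, f)
    print(f, genotypes, allele_frequency(genotypes))
```

The loop shows the point made in the text: genotypic variation changes while allelic variation does not.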
In the theoretical Hardy-Weinberg population, there are no sources of new genetic variation. In real populations, alleles may enter or leave the population, a process called migration or “gene flow” (a more accurate term, since migration in this context means not only movement between populations but also successful reproduction to introduce alleles in the new population). Also, new alleles may be introduced by mutation, the change in the DNA sequence of an existing allele to create a new one, as a result of errors during DNA replication or the inexact repair of DNA damage from environmental influences such as radiation or mutagenic chemicals. Both of these processes can change both genotype frequencies and allele frequencies in a population. If the tendency to migrate is associated with particular genotypes, a long period of continued migration tends to push genotype and allele frequencies toward higher proportions of one type (in general, more homozygotes) so that the overall effect is to reduce genetic variation. However, in the short term, migration may enhance genetic variation by allowing new alleles and genotypes to enter. The importance of migration depends on the particular population. Some populations may be relatively isolated from others so that migration is a relatively weak force affecting genetic variation, or there may be frequent migration among geographic populations. There are many factors involved, not the least of which is the ability of members of the particular species to move over some distance.
Mutation, because it introduces new alleles into a population, acts to increase genetic variation. Before the modern synthesis, one school of thought was that mutation might be the driving force of evolution, since genetic change over time coming about from continual introduction of new forms of genes seemed possible. In fact, it is possible to develop simple mathematical models of mutation that show resulting patterns of genetic variation that resemble those found in nature. However, to account for the rates of evolution that are commonly observed, very high rates of mutation are required. In general, mutation tends to be quite rare, making the hypothesis of evolution by mutation alone unsatisfactory.
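A rough calculation shows why mutation alone is a slow force. The sketch assumes the simplest possible model, one-way recurrent mutation from allele A to allele a at a constant rate per generation; the rate of one in one hundred thousand is an illustrative assumption, not a measured value for any particular gene.

```python
import math


def frequency_after_mutation(p0, mu, generations):
    """Frequency of allele A after repeated one-way mutation A -> a.

    Each generation a fraction mu of the remaining A alleles mutates.
    """
    return p0 * (1.0 - mu) ** generations


def generations_to_halve(mu):
    """Generations needed for mutation alone to halve the frequency of A."""
    return math.log(0.5) / math.log(1.0 - mu)


mu = 1e-5  # assumed mutation rate per allele per generation

# After 1,000 generations the frequency has fallen by only about 1 percent.
print(frequency_after_mutation(1.0, mu, 1000))

# Halving the frequency takes tens of thousands of generations.
print(round(generations_to_halve(mu)))
```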
The action of mutation in conjunction with other forces, such as selection, may account for the low-frequency persistence of clearly harmful alleles in populations. For example, one might expect that alleles that can result in genetic diseases (such as cystic fibrosis) would be quickly eliminated from human populations by natural selection. However, low rates of mutation can continually introduce these alleles into populations. In this “mutation-selection balance,” mutation tends to introduce alleles while selection tends to eliminate them, with a net result of continuing low frequencies in the population.
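Mutation-selection balance can be sketched numerically. The example assumes a fully recessive harmful allele a, a selection coefficient s against aa homozygotes, and a mutation rate mu from A to a. All three assumptions and both numerical values are hypothetical and do not describe cystic fibrosis or any other specific condition. Under these assumptions the standard approximation for the equilibrium frequency is the square root of mu divided by s.

```python
import math


def next_frequency(q, s, mu):
    """One generation of selection against aa, followed by mutation A -> a.

    q is the frequency of the harmful recessive allele a.
    """
    p = 1.0 - q
    mean_fitness = 1.0 - s * q * q            # only aa has reduced fitness
    q_selected = (q * q * (1.0 - s) + p * q) / mean_fitness
    return q_selected + mu * (1.0 - q_selected)


s = 0.5     # assumed fitness cost to aa homozygotes
mu = 1e-5   # assumed mutation rate per generation

q = 0.0     # start with the harmful allele absent
for _ in range(20000):
    q = next_frequency(q, s, mu)

print(q)                    # frequency reached by iteration
print(math.sqrt(mu / s))    # predicted balance point
```

The allele is never eliminated; it settles at a low frequency where the losses to selection match the gains from mutation.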
Real populations are not, of course, infinite in size, though some are large enough that this Hardy-Weinberg condition is a useful approximation. However, many natural populations are small, and any population with fewer than about one thousand individuals will vary randomly in the pattern of genetic variation from generation to generation. These random changes in allele and genotype frequencies are called genetic drift. The situation is analogous to coin tossing. With a fair coin, the expectation is that half of the tosses will result in heads and half in tails. On average, this will be true, but in practice a small sample will not show the expectation. For example, if a coin is tossed ten times, it is unlikely that the result will be exactly five heads and five tails. On the other hand, with a thousand tosses, the results will be closer to half and half. This higher deviation from the expected result in small samples is called a sampling error. In a small population, there is an expectation of the pattern of genetic variation based on the Hardy-Weinberg law, but sampling error during the union of sex cells to form offspring genotypes will result in random deviations from that expectation. The effect is that allele frequencies increase or decrease randomly, with corresponding changes in genotype frequencies. The smaller the population, the greater the sampling error and the more pronounced genetic drift will be.
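The coin-tossing analogy and its genetic counterpart can be sketched as follows. The simulation assumes the simplest textbook scheme, in which each generation is formed by drawing 2N allele copies at random from the previous generation; the population sizes, the seed, and the function names are illustrative choices.

```python
import math
import random


def prob_exactly_half_heads(tosses):
    """Probability that a fair coin gives exactly tosses/2 heads."""
    return math.comb(tosses, tosses // 2) / 2 ** tosses


def sampling_sd(p, n_individuals):
    """Standard deviation of the allele frequency after one generation
    of sampling 2N allele copies."""
    return math.sqrt(p * (1.0 - p) / (2 * n_individuals))


def drift(p0, n_individuals, generations, rng):
    """Allele frequency trajectory under random sampling alone."""
    copies = 2 * n_individuals
    p = p0
    trajectory = [p]
    for _generation in range(generations):
        # Each allele copy in the offspring is A with probability p.
        count = sum(1 for _copy in range(copies) if rng.random() < p)
        p = count / copies
        trajectory.append(p)
    return trajectory


print(prob_exactly_half_heads(10))   # 0.24609375
print(sampling_sd(0.5, 10))          # about 0.11 for 10 individuals
print(sampling_sd(0.5, 1000))        # about 0.011 for 1,000 individuals

rng = random.Random(1)               # fixed seed so a run is repeatable
small = drift(0.5, 10, 50, rng)
large = drift(0.5, 1000, 50, rng)
print(small[-1], large[-1])
```

In repeated runs with different seeds, the small population typically wanders much farther from its starting frequency than the large one.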
Genetic drift has an effect on genetic variation that is similar to that of other factors. Over the long term, allele frequencies will drift until all alleles have been eliminated but one, eliminating variation. (For the moment, ignore the action of other forces that increase variation.) Over a period of dozens of generations, however, drift can allow variation to be maintained, especially in larger populations in which drift is minimal.
In the early days of population genetics, the possibility of genetic drift was recognized but often considered to be a minor consideration, with natural selection as a dominant force. Fisher in particular dismissed the importance of genetic drift, engaging over a number of years in a published debate with Wright, who always felt that drift would be important in small populations. Beginning in the 1960s with the acquisition of data on DNA-level population variation, the role of drift in natural populations became more recognized. It appears to be an especially strong force in cases in which a small number of individuals leave the population and migrate to a new area where they establish a new population. Large changes can occur, especially if the number of migrants is only ten or twenty. This type of situation is now referred to as a founder effect.
Natural selection in a simple model of a gene with two alleles in a population can be easily represented by assuming that genotypes differ in their ability to survive and produce offspring. This ability is called fitness. In applying natural selection to a theoretical population, each genotype is assigned a fitness value between zero and one. Typically, the genotype in a population that is best able to survive and can, on average, produce more offspring than other genotypes is assigned a fitness value of one, and genotypes with lower fitness are assigned fitnesses with fractional values relative to the high-fitness genotype.
The study of this simple model of natural selection has revealed that it can alter genetic variation in different ways, depending on which genotype has the highest fitness. In the simple one-gene, two-allele model, there are three possible genotypes: two homozygotes and one heterozygote. If one homozygote has the highest fitness, it will be favored, and the genetic composition of the population will gradually shift toward more of that genotype (and its corresponding allele). This is called directional selection. If both homozygotes have higher fitness than the heterozygote (disruptive selection), one or the other will be favored, depending on the starting conditions. Both of these situations will decrease genetic variation in the population, because eventually one allele will prevail. Although each of these types of selection (particularly directional) may be found for genes in natural populations, they cannot explain why genetic variation is present, and is perhaps increasing, in nature.
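Directional and disruptive selection can both be illustrated with the same one-gene, two-allele calculation. In this minimal sketch the fitness values are invented, genotypes are assumed to be in Hardy-Weinberg proportions before selection acts, and the function names are hypothetical.

```python
def select_one_generation(p, w_AA, w_Aa, w_aa):
    """Frequency of allele A after one generation of selection.

    Each genotype frequency is weighted by its relative fitness and
    divided by the mean fitness of the population.
    """
    q = 1.0 - p
    mean_fitness = p * p * w_AA + 2.0 * p * q * w_Aa + q * q * w_aa
    return (p * p * w_AA + p * q * w_Aa) / mean_fitness


def run(p, fitnesses, generations):
    """Apply selection repeatedly and return the final frequency of A."""
    for _ in range(generations):
        p = select_one_generation(p, *fitnesses)
    return p


# Directional selection: homozygote AA has the highest fitness.
print(run(0.1, (1.0, 0.9, 0.8), 500))   # approaches 1

# Disruptive selection: both homozygotes are fitter than the heterozygote,
# so the outcome depends on the starting frequency.
print(run(0.4, (1.0, 0.8, 1.0), 500))   # approaches 0
print(run(0.6, (1.0, 0.8, 1.0), 500))   # approaches 1
```

In all three runs one allele ends up essentially fixed, which is why neither form of selection can account for persistent variation.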
Heterozygote advantage, in which the heterozygote has higher fitness than either homozygote, is the other possible situation in this model. In this case, because the heterozygote carries both alleles, both are expected to be favored together and therefore maintained. This is the only condition in this simple model in which genetic variation may be maintained or increased over time. Although this seems like a plausible explanation for the observed levels of natural variation, studies in which fitness values are measured almost never show heterozygote advantage in genes from natural populations. As a general explanation for the presence of genetic variation, this simple model of selection is unsatisfactory.
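The way heterozygote advantage maintains both alleles can be sketched with the same kind of calculation. Here the heterozygote is given a fitness of one and the two homozygotes fitnesses of 1 - s and 1 - t; the values of s and t are invented for illustration. Under these assumptions the standard result is a stable frequency of t / (s + t) for allele A.

```python
def select_one_generation(p, w_AA, w_Aa, w_aa):
    """Frequency of allele A after one generation of selection."""
    q = 1.0 - p
    mean_fitness = p * p * w_AA + 2.0 * p * q * w_Aa + q * q * w_aa
    return (p * p * w_AA + p * q * w_Aa) / mean_fitness


def run(p, fitnesses, generations):
    """Apply selection repeatedly and return the final frequency of A."""
    for _ in range(generations):
        p = select_one_generation(p, *fitnesses)
    return p


s, t = 0.2, 0.3                      # assumed fitness costs to AA and aa
fitnesses = (1.0 - s, 1.0, 1.0 - t)

# Very different starting points converge on the same intermediate value.
from_low = run(0.05, fitnesses, 500)
from_high = run(0.95, fitnesses, 500)
print(from_low, from_high)
print(t / (s + t))                   # predicted stable frequency
```

Neither allele is lost, in contrast to the directional and disruptive cases.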
Studies of more complex theoretical models of selection (for example, those with many genes and different forms of selection) have revealed conditions that allow patterns of variation very similar to those observed in natural populations, and in some cases it seems clear that natural selection is a major factor determining patterns of genetic change. However, in many cases, selection does not seem to be the most important factor or even a factor at all.
Population genetics has always been a field in which the understanding of theory is ahead of empirical observation and experimental testing, but these have not been neglected. Although Fisher, Haldane, and Wright were mainly theorists, there were other architects of the modern synthesis who concentrated on testing theoretical predictions in natural populations. Beginning in the 1940s, for example, Theodosius Dobzhansky showed in natural and experimental populations of Drosophila species that frequency changes and geographic patterns of variation in chromosome variants are consistent with the effects of natural selection.
Natural selection was the dominant hypothesis for genetic changes in natural populations for the first several decades of the modern synthesis. In the 1960s, new techniques of molecular biology allowed population geneticists to examine molecular variation, first in proteins and later, with the use of restriction enzymes in the 1970s and DNA sequencing in the 1980s and 1990s, in DNA sequences. These types of studies only confirmed that there is a large amount of genetic variation in natural populations, much more than can be attributed to only natural selection. As a result, Motoo Kimura proposed the “neutral theory of evolution,” the idea that most DNA sequence differences do not have fitness differences and that population changes in DNA sequences are governed mainly by genetic drift, with selection playing a minor role. This view, although still debated by some, was mostly accepted by the 1990s, although it was recognized that evolution of proteins and physical traits may be governed by selection to a greater extent.
The field of population genetics is a fundamental part of the current field of evolutionary biology. One possible definition of evolution would be “genetic change in a population over time,” and population geneticists try to describe patterns of genetic variation, document changes in variation, determine their theoretical causes, and predict future patterns. These types of research have been valuable in studying the evolutionary histories of organisms for which there are living representatives, including humans.
In addition to the scientific value of understanding evolutionary history better, there are more immediate applications of such work. In conservation biology, data about genetic variation in a population can help to assess its ability to survive in the future. Data on genetic similarities between populations can aid in decisions about whether they can be considered as the same species or are unique enough to merit preservation.
Population genetics has had an influence on medicine, particularly in understanding why “disease genes,” while clearly harmful, persist in human populations. The field has also affected the planning of vaccination protocols to maximize their effectiveness against parasites, since a vaccine-resistant strain is a result of a rare allele in the parasite population. In the 1990s it began to be recognized that effective treatments for medical conditions would need to take into account genetic variation in human populations, since different individuals might respond differently to the same treatment.