Definition (Genetics & Inherited Conditions)
A genome comprises all of the DNA that is present in each cell of an organism. For prokaryotes, which are always single-celled, it comprises all of the DNA within the bacterial cell that is specific to that species. Other DNA molecules may also reside in a bacterial cell, such as plasmids (small extra pieces of circular DNA) and bacteriophage DNA (bacterial virus DNA). In eukaryotes, the genome typically refers to the DNA in the nucleus, which is composed of linear chromosomes. All eukaryotic cells also have DNA in their mitochondria, the organelles responsible for cellular respiration. This DNA is a circular molecule and is referred to as the mitochondrial genome or simply mitochondrial DNA (mtDNA). Plants and some single-celled organisms have, in addition to mitochondria, another type of organelle called a chloroplast, which also has a circular DNA molecule. This DNA is called the chloroplast genome, or simply chloroplast DNA (cpDNA).
Because the genome includes all of the genes that are expressed in an organism, knowing its nucleotide sequence is considered the first step in a complete understanding of the genetics of an organism. However, much more work follows this first step, because knowing just the nucleotide sequence of all the genes does not identify their function or how they interact with other genes. One important benefit of having the complete genome sequence is that it can greatly speed the discovery of...
Sequencing Whole Genomes (Genetics & Inherited Conditions)
A number of complementary strategies are involved in sequencing a genome. One approach is the shotgun sequencing of mapped clones. Large sections of DNA are cloned into vectors such as bacterial artificial chromosomes (BACs). A physical map of each BAC is made using techniques such as restriction mapping, or the assignment of previously known sequence elements. The BAC maps are compared to identify overlapping clones, forming a map of long contiguous regions of the genome. BACs are selected from this map and the inserts are randomly fragmented into short pieces, 1-2 kilobase pairs (kb), and subcloned into vectors. Subclones are selected at random and sequenced. Many subclones are sequenced (often enough to provide sevenfold coverage of the clone) and then assembled to yield the contiguous sequence of the original BAC insert. The sequences from overlapping BACs are then assembled. In the finishing stage, additional bridging sequences are obtained to close gaps where there were no overlapping clones.
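The arithmetic behind "sevenfold coverage" can be sketched as follows. This is a minimal illustration, not the procedure of any particular sequencing center: the 150 kb insert size and 600-base read length are hypothetical figures that do not come from the text, and the estimate of uncovered bases assumes reads land uniformly at random (a simple Poisson model).

```python
import math


def reads_needed(insert_bp, read_bp, coverage):
    """Reads required so that total sequenced bases = coverage x insert length."""
    return math.ceil(coverage * insert_bp / read_bp)


def expected_uncovered_fraction(coverage):
    """Poisson estimate of the fraction of bases that no read happens to cover."""
    return math.exp(-coverage)


# Hypothetical figures (assumptions, not from the text).
insert_bp = 150_000   # length of one BAC insert, in base pairs
read_bp = 600         # usable bases obtained from each subclone read

n_reads = reads_needed(insert_bp, read_bp, coverage=7)
missed_bp = expected_uncovered_fraction(7) * insert_bp
print(f"{n_reads} reads; roughly {missed_bp:.0f} bases expected to be uncovered")
```

Even at sevenfold coverage the model leaves a small number of bases unsequenced, which is one reason a finishing stage is needed to close gaps.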
Whole genome shotgun sequencing involves randomly fragmenting the whole genome and sequencing clones without an initial map. Small clones (up to around 2 kb) are sequenced and assembled into contiguous regions with the help of sequences from larger (10-50 kb) clones that form a scaffold. The sequence is then linked to a physical map of the organism’s chromosomes. This method works effectively on bacterial genomes because...
Annotation (Genetics & Inherited Conditions)
The annotation process involves gathering and presenting information about the location of genes, regulatory elements, structural elements, repetitive DNA, and other factors of the genome. It is important to integrate any previously known information regarding the genome, such as location of ESTs, at this stage. A powerful approach to identifying genes is to map ESTs and mRNAs to the genome. This will identify many of the protein-coding genes and can reveal the intron-exon structure plus possible alternative splicing of the gene. It will not identify most functional RNA genes, and how to do so effectively is an open question. Indeed, how many functional RNA genes there may be in eukaryotic genomes is unclear. For example, in humans approximately twenty-five thousand protein-coding genes have been identified, but there is evidence of many more transcribed sequences, and exactly what these are is unknown.
Some genes can be identified in the genomic sequence by the comparative approach, that is, by showing significant sequence similarity (for example, via BLAST algorithms) with annotated genes from other organisms. Such an approach becomes more powerful as the genomes of more organisms are published.
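The idea of scoring sequence similarity can be illustrated with a small local-alignment function. This is not BLAST itself, which uses fast heuristics; it is a sketch of the exact Smith-Waterman scoring that such tools approximate. The scoring values (match +2, mismatch -1, gap -2) and the example sequences are arbitrary assumptions chosen for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score between sequences a and b.

    Uses a linear gap penalty and keeps only two rows of the scoring matrix.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical sequences: a new genomic fragment and an annotated gene fragment.
query = "TTACGTTT"
annotated = "GGACGTGG"
print(local_alignment_score(query, annotated))
```

A high score relative to what unrelated sequences produce suggests a shared region; real tools also report a statistical significance for each hit, which this sketch omits.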
Computational methods can also be used to predict regions of the sequence that may represent genes. These rely on identifying patterns in the genomic sequence that resemble known properties of...
Functional Genomics (Genetics & Inherited Conditions)
Functional genomics aims to assign a functional role to each gene and identify the tissue type and developmental stage at which it is expressed. Identifying all genes in a genome makes it possible to determine the effect of altering the expression of each gene, through the use of knockouts, gene silencing, or transgenic experiments. Technologies such as microarray analysis allow mRNA expression levels to be measured for tens of thousands of genes simultaneously, while proteomic methods such as mass spectrometry are beginning to allow high-throughput measurements of proteins. In these areas genomics overlaps with transcriptomics, proteomics, and specialties such as glycomics.
Structural Genomics (Genetics & Inherited Conditions)
Structural genomics aims to define the three-dimensional folding of all protein products that an organism produces. The structure of a protein can provide insights into its function and mode of action. Structural genomics touches upon proteomics in the need to consider structural changes when there are post-translational changes or binding with other molecules. Identifying all the genes in a genome allows the amino acid sequence of each protein to be inferred from the DNA, and comparisons between them allow proteins (or characteristic sections of a protein, called folds or domains) to be identified and classified into families.
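Inferring an amino acid sequence from DNA can be sketched with the standard genetic code. This is a minimal example that assumes the input is an uninterrupted coding sequence read from its first base (for eukaryotic genes, that means introns have already been removed); the function name and the example sequence are illustrative only.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna):
    """Translate a coding DNA sequence into one-letter amino acids.

    Translation stops at the first stop codon ('*'); a trailing partial
    codon is ignored.
    """
    protein = []
    for i in range(0, len(coding_dna) - 2, 3):
        amino_acid = CODON_TABLE[coding_dna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical coding sequence: Met-Lys-Trp followed by a stop codon.
print(translate("ATGAAATGGTGA"))
```

Once proteins are represented as strings like this, they can be compared with one another to group them into families.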
Structure determination typically proceeds by cloning and then expressing each gene. The protein product is then purified, and its structure is experimentally determined using methods such as X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy. Computational methods of structural prediction, either ab initio (from the beginning) or guided by the known structure of a related protein, are generally inferior to direct experimental approaches, but these fields are rapidly advancing and are the key to the future.
Comparative Genomics (Genetics & Inherited Conditions)
Comparative genomics expands our knowledge by comparing the genomes of different organisms. This is essential to the annotation of genomic sequences. For example, otherwise unknown genes, and regulatory elements in particular, were first revealed in humans and mice by identifying conserved regions of their genomic sequences. Such comparisons can identify genes homologous (similar by descent) to those in other species or identify a new member of a gene family. Comparing genomes can give insights into evolutionary questions about a particular gene or the organisms themselves. Important information can also be discovered about the regulation of different genes, the effects of different gene expression patterns between different species, and how the genome of each species came to be the way it is. Comparative genomics essentially relies upon phylogenetic methodology to describe the pattern and process of molecular evolution (phylogenomics).
To date, more than 180 genomes of various organisms have been sequenced, including those of the cow and dog in 2004, five different domesticated pig breeds in 2005, and the domesticated cat in 2007. Other strategically selected organisms are in the pipeline. The view of the National Human Genome Research Institute (NHGRI) is that the way to most effectively study essential functional and structural components of the human genome is to...
Epigenetics (Genetics & Inherited Conditions)
Epigenetics refers to changes in gene expression that are caused by mechanisms other than changes in the underlying DNA sequence. These are heritable changes in the function of genes that occur without a change in the sequence of DNA base pairs. Examples of epigenetic mechanisms include DNA methylation, histone acetylation, imprinting, and RNA interference, all of which affect the differential activation and inactivation of genes. When genes are not needed for the functioning of a particular cell, they can be biochemically “labeled” with methyl groups, a process called DNA methylation. This essentially signals that the gene should be “turned off,” so it is not transcribed and no protein product is made. Conversely, histones can be acetylated, which signals the activation of gene transcription. These mechanisms alter the structure of chromatin, the DNA-protein complex that folds DNA in different ways, thereby altering the expression of genes without altering the DNA sequence.
Further Reading (Genetics & Inherited Conditions)
International Human Genome Sequencing Consortium. “Initial Sequencing and Analysis of the Human Genome.” Nature 409, no. 6822 (2001): 860-921. The publication of the first draft of the Human Genome Project. The whole journal issue contains many other papers considering the structure, function, and evolution of the human genome.
Venter, J. C., et al. “The Sequence of the Human Genome.” Science 291, no. 5507 (2001): 1304-1351. Report on the Celera Genomics human genome project.
Web Sites of Interest (Genetics & Inherited Conditions)
Department of Energy. Joint Genome Institute. http://www.jgi.doe.gov. A collaboration between the Department of Energy’s Lawrence Berkeley, Lawrence Livermore, and Los Alamos National Laboratories. Includes an introduction to genomics, a research time line that starts with Darwin’s work in 1859, and links.
Ensembl Project: “Browse a Genome”. http://www.ensembl.org. A joint project between the EMBL-EBI and the Wellcome Trust Sanger Institute. This project “produces genome databases for vertebrates and other eukaryotic species, and makes this information freely available online.”
Genome News Network: “What’s a Genome?” and “A Quick Guide to Sequenced Genomes”. http://www.genomenewsnetwork.org. A publication of the J. Craig Venter Institute.
Human Genome Sequencing Center. http://www.hgsc.bcm.tmc.edu. Baylor College of Medicine. Posts an ongoing “counter” of human genome sequencing completed worldwide.
National Center for Biotechnology Information. http://www.ncbi.nlm.nih.gov. A central repository for biological information, including links to genome projects and genomic science. Maintains GenBank, a comprehensive, annotated collection of publicly available DNA sequences.
A New Scientific Discipline (Magill’s Medical Guide, Sixth Edition)
Genomics grew out of the field of genetics, the study of heredity. Until the late twentieth century, it had not been possible to study the complete set of hereditary information in a living organism. Thus, while the field of genetics traces its roots to the 1860’s, when the Austrian monk Gregor Mendel performed experiments on the mechanism of heredity in pea plants, the field of genomics is much younger, dating from the 1980’s. It was in this decade that American geneticist Thomas Roderick used the term “genomics” to name a new scientific journal that dealt with the analysis of genomic information. In Mendel’s time, while organisms were seen to exhibit certain traits, it was not known how these traits were determined. By the early twentieth century, it was recognized that traits are inherited in units of information called genes, although the chemical nature of the gene was still unknown. It took until the middle of that century to recognize that genes were made up of deoxyribonucleic acid (DNA), the structure of which was first identified by American biologist James Watson and British biophysicist Francis Crick in 1953.
DNA is made up of four different deoxyribonucleotides, commonly referred to as bases: adenine (A), cytosine (C), guanine (G), and thymine (T). Together, they spell out a chemical code that is used by the cell to make proteins. Since it is the set of proteins contained within a cell that...
Divisions Within the Field (Magill’s Medical Guide, Sixth Edition)
In addition to producing the “-omics” disciplines, the field of genomics can itself be subdivided. Although these divisions are somewhat artificial, they do help illustrate the different goals of genomics research. The main divisions of genomics are structural genomics, functional genomics, and comparative genomics.
Structural genomics is concerned with the structure of hereditary information. The determination of the number, location, and order of genes on a particular chromosome is one pursuit of this field. While bacterial genomes are typically contained within a single circular chromosome, humans have twenty-three pairs of chromosomes and some organisms have even more. Studying regions of DNA between genes, or intergenic regions, is also the realm of structural genomics. Intergenic regions are often composed of a highly repetitive DNA sequence that does not code for any type of protein. In the early twenty-first century, the precise function of these regions still eluded scientists. Although noncoding sequence is relatively rare in bacteria, it makes up a major portion of many multicellular organisms, including about 98 percent of the human genome. A separate goal of structural biology is the determination of the three-dimensional structure of all the proteins encoded by a genome, an endeavor called structural proteomics.
Functional genomics is less concerned with the structure of a genome and more...
Perspective and Prospects (Magill’s Medical Guide, Sixth Edition)
Genomics can trace its origins to the development of techniques used to determine the sequence of DNA. In 1977, British biochemist Frederick Sanger and colleagues published a sequencing method based on the principle of chain termination. In this method, the sequence of target DNA is determined by enzymatically producing a complementary strand of DNA. In Sanger sequencing, as it is now called, a molecular “poison” is included in a given reaction mixture so that the newly synthesized complementary chain is terminated at specific bases. Sanger’s method was then modified in the 1990’s to include fluorescent dyes on the chain-terminating bases so that the DNA sequence could be read using a scanner and recorded directly into a computer. Some have claimed that Sanger and colleagues were actually the first group to sequence a genome, since they published the sequence of a viral genome in the same year that they described their revolutionary technique. Viruses, however, are not free-living organisms, and their genomes are thousands of times smaller than the typical bacterial genome.
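The chain-termination principle described above can be sketched as a toy simulation. It is deliberately simplified: strand orientation is ignored, every possible terminated fragment is assumed to be produced, and the function names and template sequence are hypothetical. Each of four reactions stops synthesis at one kind of base, and sorting the resulting fragments by length reveals the order of bases in the newly made strand.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def terminated_fragments(template):
    """Map each terminator base to the lengths of fragments that end at that base.

    The new strand is the base-by-base complement of the template; a fragment
    of length n exists in the reaction for base X if position n of the new
    strand is X.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template)
    lengths = {base: [] for base in "ACGT"}
    for position, base in enumerate(new_strand, start=1):
        lengths[base].append(position)
    return lengths


def read_sequence(lengths):
    """Order all fragments by length, as a gel would, and read the end bases."""
    by_length = sorted(
        (n, base) for base, positions in lengths.items() for n in positions
    )
    return "".join(base for _, base in by_length)


# Hypothetical template; the read recovers the strand complementary to it.
fragments = terminated_fragments("TACGGT")
print(fragments)
print(read_sequence(fragments))
```

The later fluorescent-dye variant replaces the four separate reactions with four dye colors, but the logic of reading bases in order of fragment length is the same.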
The Human Genome Project was first proposed in 1986 and was funded two years later at an expected cost of three billion dollars. The project officially got under way in 1990 as sequencing began in earnest on some of the smaller model genomes. In 1995, as some of these sequencing efforts were nearing completion, American pharmacologist...
For Further Information: (Magill’s Medical Guide, Sixth Edition)
Brown, Terence A. Genomes 3. New York: Garland Science, 2007. A comprehensive and sophisticated study of genomics and all applicable techniques and applications.
Campbell, A. Malcolm, and Laurie J. Heyer. Discovering Genomics, Proteomics, and Bioinformatics. 2d ed. San Francisco: Pearson/Benjamin Cummings, 2007. An introduction to genomics and the techniques used to study genomic data. Contains many Web-based exercises in bioinformatics.
Clark, M. S. “Comparative Genomics: The Key to Understanding the Human Genome Project.” BioEssays 21 (1999): 121-130. An in-depth discussion of comparative genomics and its significance to the field as a whole.
Collins, Francis S., et al. “A Vision for the Future of Genomics Research.” Nature 422 (April, 2003): 835-847. Upon the completion of the Human Genome Project, the director of the public consortium and his colleagues wrote this article, which presents a plan for the future of genomics.
DeRisi, Joseph L., and Vishwanath R. Iyer. “Genomics and Array Technology.” Current Opinion in Oncology 11 (1999): 76-79. A review of how DNA microarrays are used in genomic research.
Klug, William S., and Michael R. Cummings. “Genomics, Bioinformatics, and Proteomics.” In Concepts of Genetics. 8th ed. Upper Saddle River, N.J.: Prentice Hall, 2007. A chapter that details the differences between...