What are genomic libraries?
Scientists often need to search through all the genetic information present in an organism to find a specific gene. It is thus convenient to have collections of genetic sequences stored so that such information is readily available. These collections are known as genomic libraries.
The library metaphor is useful in explaining both the structure and function of these information-storage centers. If one were interested in finding a specific literary phrase, one could go to a conventional library and search through the collected works. In such a library, the information is made up of letters organized in a linear fashion to form words, sentences, and chapters. It would not be useful to store this information as individual words or letters or as words collected in a random, jumbled fashion, as the information’s meaning could not then be determined. The more books a library has, the closer it can come to having the complete literary collection, although no collection can guarantee that it has every written work. The same is true of a genomic library. The stored pieces of genetic information cannot be individual bits but must be ordered sequences that are long enough to define a gene. The longer the string of information, the easier it is to make sense of the gene it makes up, or “encodes.” The more pieces of genetic information a library has, the more likely it is to contain all the information present in a cell. Even a large collection of sequences, however, cannot guarantee that it contains every piece of genetic information.
For a genomic library to be practical, an entire genome must be divided into discrete units, each of which contains enough information to be useful yet is easily replicated and studied. The method must also generate fragments that overlap one another for short stretches. The information exists in the form of chromosomes composed of millions of units known as base pairs. If the information were fragmented in a regular fashion (for example, if it were cut every ten thousand base pairs), the fragments would share no sequence with one another, and there would be no way to identify each fragment’s immediate neighbors. It would be like owning a huge, multivolume novel without any numbering system: it would be almost impossible to determine with which book to start and which to proceed to next. Similarly, without some way of tracking the order of the genetic information, it would be impossible to assemble the sequences of the individual fragments into the continuous sequence of the entire chromosome. The fragments are thus cut so that their ends overlap. With even a few hundred base pairs of overlap, the shared sequences at the ends of the fragments can be used to determine the relative positions of the different fragments. The different pieces can then be connected into one long unit, or sequence.
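The idea of ordering fragments by their shared ends can be sketched in a few lines of Python. This is a toy illustration, not a real assembler: the function names (`overlap_length`, `assemble`), the short made-up sequence, and the four-base overlap are all assumptions chosen for readability, whereas actual libraries rely on overlaps of hundreds of base pairs and must cope with repeats and sequencing errors.

```python
def overlap_length(left, right, min_overlap=4):
    """Length of the longest end of `left` that matches the start of `right`."""
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def assemble(fragments, min_overlap=4):
    """Order fragments by their overlapping ends and join them into one sequence."""
    remaining = list(fragments)
    # The leftmost fragment is the one whose start overlaps no other fragment's end.
    start_index = next(
        i for i, frag in enumerate(remaining)
        if not any(
            overlap_length(other, frag, min_overlap)
            for j, other in enumerate(remaining) if j != i
        )
    )
    contig = remaining.pop(start_index)
    while remaining:
        # Choose the fragment whose start best matches the current end of the contig.
        best = max(remaining, key=lambda frag: overlap_length(contig, frag, min_overlap))
        size = overlap_length(contig, best, min_overlap)
        if size == 0:
            raise ValueError("no overlapping fragment found; the library has a gap")
        contig += best[size:]  # append only the part that is not already present
        remaining.remove(best)
    return contig


# Hypothetical 31-base "chromosome" cut into three fragments with 4-base overlaps.
chromosome = "ATGGCGTACGTTAGCCTAGGATCCGATTACA"
fragments = [chromosome[18:31], chromosome[0:12], chromosome[8:22]]  # stored out of order
print(assemble(fragments) == chromosome)
```

If the same sequence were instead cut into non-overlapping pieces, `assemble` would raise an error at the first join, which mirrors the point made above about regularly cut fragments having no identifiable neighbors.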
There are two common ways to fragment DNA, the molecule that carries genetic information, to generate a library. The first is to disrupt the long strands of DNA by forcing them rapidly through a narrow hypodermic needle, creating forces that tear, or shear, the strands into short fragments. The advantage of this method is that the fragment ends are completely random. The disadvantage is that the sheared ends must be modified for easy joining, or ligation. The other method is to use restriction endonucleases, enzymes that recognize specific short stretches of DNA and cleave the DNA at specific positions. To create a library, scientists employ restriction enzymes that recognize four-base-pair sequences for cutting. Normally, the result of cleavage with such an enzyme would be fragments with an average size of 256 base pairs, because any given four-base sequence is expected to occur by chance about once every 256 (4 × 4 × 4 × 4) base pairs. If the amount of enzyme in the reaction is limited, however, only some of the sites will be cut, and much longer fragments can be generated. The ends created by this cleavage are usable for direct ligation into vectors, but the distribution of cleavage sites is not as random as that produced by shearing.
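The 256-base-pair figure and the effect of limiting the enzyme can be illustrated with a small simulation. The sketch below rests on several assumptions that are not in the text: the sequence is random with all four bases equally likely, the recognition site is GATC (the site of enzymes such as Sau3AI, used here only as an example), every cut falls at the start of the site, and limiting the enzyme is modeled as each site being cut independently with a fixed probability. The function names are hypothetical.

```python
import random


def expected_fragment_length(site_length, bases=4):
    """Average spacing of a given site in random sequence with equal base frequencies."""
    return bases ** site_length


def find_sites(sequence, site):
    """Positions at which the recognition site begins."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def digest(sequence, site, cut_probability, rng):
    """Cut at each site with the given probability (1.0 models a complete digest)."""
    cuts = [pos for pos in find_sites(sequence, site) if rng.random() < cut_probability]
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


rng = random.Random(0)  # fixed seed so the run is repeatable
genome = "".join(rng.choice("ACGT") for _ in range(100_000))  # made-up random sequence

complete = digest(genome, "GATC", 1.0, rng)
partial = digest(genome, "GATC", 0.1, rng)

print("expected average for a 4-base site:", expected_fragment_length(4))
print("complete digest, mean length:", len(genome) / len(complete))
print("partial digest, mean length:", len(genome) / len(partial))
```

In this model the partial digest always produces fragments that are at least as long on average as those of the complete digest, because its cuts are a subset of the available sites. Repeating a partial digest on many copies of the genome cuts a different subset each time, which is what yields overlapping fragments.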
In a conventional library, information is imprinted on paper pages that can be easily replicated by a printing press and easily bound into a complete unit such as a book. Genetic information is stored in the form of DNA. How can the pieces of a genome be stored in such a way that they can be easily replicated and maintained in identical units? The answer is to take the DNA fragments and attach, or ligate, them into the DNA of lambda phage, a virus that infects bacteria. When the phage infects bacteria, it makes copies of itself. If a genomic fragment is inserted into the phage DNA, then the fragment is replicated as well, yielding many exact copies (or clones) of the fragment.
To make an actual library, DNA is isolated from an organism and fragmented as described. Each fragment is then randomly ligated into lambda phage DNA. The pool of lambda phage containing the inserts is then spread onto an agar plate coated with a “lawn,” or confluent layer, of bacteria. Wherever a phage lands, it begins to infect and kill bacteria, leaving a clear spot, or “plaque,” in the lawn. Each plaque contains millions of phages with millions of identical copies of one fragment from the original genome. If enough plaques are generated on the plate, each one containing some random piece of the genome, then the entire genome may be represented in the sum of the DNA present in all the plaques. Since the fragment generation is random, however, the completeness of the genomic library can only be estimated. It takes 800,000 plaques containing an average genomic fragment of 17,000 base pairs to give a 99-percent probability that the total will contain a specific human gene. While this may sound like a large number, it takes only fifteen teacup-sized agar plates to produce this many plaques. A genomic library pool of phage can be stored in a refrigerator and plated out onto agar petri dishes whenever needed.
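The estimate of roughly 800,000 plaques is consistent with a standard calculation, commonly attributed to Clarke and Carbon: N = ln(1 - P) / ln(1 - f), where P is the desired probability of including a given sequence and f is the fraction of the genome carried by one insert. The sketch below applies it, assuming a human genome of about three billion base pairs; that figure and the function names are assumptions, not values given in the text.

```python
import math

HUMAN_GENOME_BP = 3_000_000_000  # assumed haploid genome size, not stated in the text


def clones_needed(probability, insert_bp, genome_bp=HUMAN_GENOME_BP):
    """Number of random clones N such that a given sequence is present with probability P."""
    fraction = insert_bp / genome_bp  # f: share of the genome in one insert
    return math.ceil(math.log(1 - probability) / math.log1p(-fraction))


def coverage_probability(clones, insert_bp, genome_bp=HUMAN_GENOME_BP):
    """Probability that a given sequence is present among `clones` random inserts."""
    fraction = insert_bp / genome_bp
    return 1 - (1 - fraction) ** clones


print(clones_needed(0.99, 17_000))           # a little over 800,000 under these assumptions
print(coverage_probability(800_000, 17_000)) # just under 0.99
print(800_000 / 15)                          # plaques per plate if spread over fifteen plates
```

The formula treats every clone as an independent random draw from the genome, so it gives an estimate rather than a guarantee, which matches the caution above that completeness can only be estimated.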
Once the entire genome is spread out as a collection of plaques, it is necessary to isolate the one plaque containing the desired sequence from the large collection. To accomplish this, a dry filter paper is laid onto the agar dish covered with plaques. As the moisture from the plate wicks into the paper, it carries with it some of the phage. An ink-dipped needle is pushed through the filter at several spots on the edge, marking the same spots on the filter and the agar. These will serve as common reference points. The filter is treated with a strong base that releases the DNA from the phage and denatures it into single-stranded form. The base is neutralized, and the filter is incubated in a salt buffer containing radioactive single-stranded DNA. The radioactive DNA, or “probe,” is a short stretch of sequence from the gene to be isolated. If the matching sequence from the gene is present on the filter, the probe will hybridize with it and become attached to the filter. The filter is washed, removing all the radioactivity except where the probe has hybridized. The filter is then exposed to film, and a dark spot develops over the location of the positive plaque. The ink spots on the filter can then be used to align the spot on the film with the positive plaque on the plate. The plaque can be purified, and the genomic DNA can then be isolated for further study.
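The logic of the screen (a probe sticks only where its complementary sequence is present) can be mimicked on sequence data. The following sketch is an analogy rather than a model of the laboratory procedure: it assumes exact matching, whereas real hybridization tolerates some mismatches depending on the washing conditions, and the plaque names and sequences are invented for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Sequence of the opposite strand, read in the conventional direction."""
    return sequence.translate(COMPLEMENT)[::-1]


def screen_library(clones, probe):
    """Names of clones whose insert matches the probe on either strand."""
    targets = (probe, reverse_complement(probe))
    return [
        name for name, insert in clones.items()
        if any(target in insert for target in targets)
    ]


# Hypothetical inserts; each plaque carries one fragment.
clones = {
    "plaque_1": "ATGGCGTACGTTAGCC",
    "plaque_2": "TTAGCCTAGGATCCGATTACA",
    "plaque_3": "CATTCGGATCCTTA",  # carries the probe sequence on the opposite strand
}
probe = "GGATCCGA"

print(screen_library(clones, probe))  # ['plaque_2', 'plaque_3']
```

Both strands are checked because the denatured DNA on the filter is single-stranded, and an insert may have been ligated in either orientation.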
It may turn out that the entire gene is not contained in the fragment isolated from one phage. Since the library was designed so that the ends of one fragment overlap with the adjacent fragments, the ends can be used as probes to isolate neighboring fragments that contain the rest of the gene. This process of extending the isolated region of the genome step by step is called genomic walking, also known as chromosome walking.
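A single direction of such a walk can be sketched as a loop: take the end of the current insert as a probe, find a clone that contains it and extends beyond it, and repeat. The sketch below assumes exact matches on one strand, a four-base probe, and invented clone names; the function `walk` is hypothetical and ignores the repeated sequences that complicate real walks.

```python
def walk(clones, start_name, probe_length=4, max_steps=10):
    """Follow overlapping clones in one direction, probing with each insert's end."""
    path = [start_name]
    current = start_name
    for _ in range(max_steps):
        end_probe = clones[current][-probe_length:]
        # A useful neighbor contains the probe and carries sequence beyond it.
        candidates = [
            name for name, insert in clones.items()
            if name not in path
            and end_probe in insert
            and insert.find(end_probe) + probe_length < len(insert)
        ]
        if not candidates:
            break  # no clone extends the region any farther
        # Prefer the clone that reaches farthest past the probe.
        current = max(
            candidates,
            key=lambda name: len(clones[name]) - clones[name].find(end_probe),
        )
        path.append(current)
    return path


# Hypothetical library: three inserts from one region, overlapping by four bases.
region = "ATGGCGTACGTTAGCCTAGGATCCGATTACA"
clones = {
    "clone_A": region[0:12],
    "clone_B": region[8:22],
    "clone_C": region[18:31],
}

print(walk(clones, "clone_A"))  # ['clone_A', 'clone_B', 'clone_C']
```

Walking in the other direction would use the start of each insert as the probe instead of its end.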