Sequencing

Molecular techniques of analysis are a vital part of forensic science. Analysis of genetic material is instrumental in detecting pathogenic (disease-causing) microorganisms and in identifying a victim or implicating a suspect.

These molecular examinations rely on determining the arrangement of the building blocks of the genetic material. This determination is called sequencing.

Sequencing refers to the techniques used to determine the order of the constituent bases of deoxyribonucleic acid (DNA) or ribonucleic acid (RNA), or of the constituent amino acid building blocks of protein. The bases of DNA are adenine, thymine, guanine, and cytosine; in RNA, uracil takes the place of thymine.

DNA is typically sequenced for several reasons: to determine the sequence of the protein encoded by the DNA, to locate sites at which restriction enzymes can cut the DNA, to locate DNA sequence elements that regulate the production of messenger RNA, or to detect alterations in the DNA.
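As a small illustration of one of these uses, once a sequence is known, locating the sites cut by a restriction enzyme amounts to a text search. The following Python sketch assumes the sequence is available as a plain string; the function name and the example sequence are invented for illustration, while GAATTC is the recognition site of the enzyme EcoRI.

```python
def find_sites(sequence, site):
    """Return the 0-based start positions of every occurrence of `site`."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        # Resume one position later so overlapping sites are also found.
        start = sequence.find(site, start + 1)
    return positions

# Hypothetical example sequence; GAATTC is the EcoRI recognition site.
example = "TTGAATTCAGGCTAGAATTCAA"
ecori_positions = find_sites(example, "GAATTC")
print(ecori_positions)  # [2, 14]
```

A real analysis would also search the complementary strand, which matters for recognition sites that are not palindromic.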

The sequencing of DNA is accomplished by stopping the lengthening of a DNA chain at a known base and at a known location in the DNA. Practically, this can be done in two ways. In the first method, called the Sanger-Coulson procedure, a small amount of a specific altered base, a so-called dideoxynucleotide, is included in a mixture with the four normal bases. This base is slightly different from the normal base and is radioactively labeled. When the radioactive base becomes incorporated into the growing DNA chain instead of the normal base, growth of the chain stops. The reaction is run four times, each time using a different one of the four dideoxynucleotides. This generates four collections of DNA molecules. Because replication of the DNA always begins at the same point, and because the amount of altered base added is low, each reaction generates many DNA pieces of different lengths. When the samples are subjected to gel electrophoresis, the different-sized pieces can be resolved as radioactive bands in the gel. The position of each band gives the length of a piece, and the reaction it came from identifies the base at which that piece ends, so the sequence of the DNA can be deduced by reading the bands in order of size.
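The logic of reading a chain-termination gel can be sketched in a few lines of Python. This is a simplified model, not a simulation of the chemistry: it assumes every possible termination product is present and perfectly resolved, and the function names and the example strand are invented for illustration.

```python
def sanger_lanes(new_strand):
    """For each of the four reactions, list the fragment lengths produced.

    A fragment of length n appears in the lane for base b when the n-th
    base of the newly synthesized strand is b (the chain stopped there).
    """
    lanes = {base: [] for base in "ACGT"}
    for position, base in enumerate(new_strand, start=1):
        lanes[base].append(position)
    return lanes

def read_gel(lanes):
    """Deduce the sequence by reading bands from shortest to longest."""
    length_to_base = {}
    for base, lengths in lanes.items():
        for length in lengths:
            length_to_base[length] = base
    return "".join(length_to_base[n] for n in sorted(length_to_base))

strand = "GATTACA"  # hypothetical newly synthesized strand
lanes = sanger_lanes(strand)
print(lanes)            # {'A': [2, 5, 7], 'C': [6], 'G': [1], 'T': [3, 4]}
print(read_gel(lanes))  # GATTACA
```

Each lane holds only the lengths at which its own base occurs, which is why the four reactions together cover every position exactly once.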

The second DNA sequencing technique is known as the Maxam-Gilbert technique, after its co-developers. In this technique, both strands of double-stranded DNA are radioactively labeled using radioactive phosphorus. Upon heating, the DNA strands separate and can be physically distinguished from each other, as one strand is heavier than the other. Each strand is then cleaved at specific bases using chemical reagents, and the different-sized fragments of DNA are separated by gel electrophoresis. Based on the pattern of fragments, the DNA sequence is determined.

Several decades ago, sequencing involved scrutinizing the gels by eye. However, the marriage of powerful computational hardware and software to the sequencing process has automated the procedure.

The Sanger-Coulson approach is the more popular method. Various modifications have been developed, and it has been automated for very large-scale sequencing. During the sequencing of the human genome, a method called shotgun sequencing was employed with great success. Shotgun sequencing uses enzymes to cut DNA into hundreds or thousands of random fragments. So many fragments are necessary because automated sequencing machines can only decipher relatively short fragments of DNA, about 500 bases long. The many sequences are then pieced back together using computers to generate the entire DNA genome sequence.
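The piecing-together step can be illustrated with a toy greedy approach in Python: repeatedly merge the two fragments whose ends overlap the most. This is only a sketch under strong assumptions (error-free reads, a single strand, no repeated regions) and is not the software used for the human genome; the function names and the short example reads are invented for illustration.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for size in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:size]):
            return size
    return 0

def greedy_assemble(fragments):
    """Repeatedly merge the two fragments with the largest overlap."""
    reads = list(fragments)
    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps; leave the rest as separate pieces
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

# Hypothetical reads, listed out of order.
contigs = greedy_assemble(["AGCTTGAC", "ATGCCGTA", "CGTAAGCT"])
print(contigs)  # ['ATGCCGTAAGCTTGAC']
```

Greedy merging can go wrong when a genome contains repeated stretches longer than the overlaps, which is one reason practical assembly software is far more elaborate.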

Protein sequencing involves determining the arrangement of the amino acid building blocks of the protein. It is common to sequence a protein by determining the DNA sequence encoding the protein. This, however, is only possible if a cloned gene is available. It is still often the case that chemical protein sequencing, as described subsequently, must be performed in order to manufacture an oligonucleotide probe that can then be used to locate the target gene. The most popular direct chemical protein sequencing technique in use today is the Edman degradation procedure. This is a series of chemical reactions that remove one amino acid at a time from one end of the protein (the amino terminus). Each amino acid that is released has been chemically modified in the release reaction, allowing the released product to be detected using a technique called reverse-phase chromatography. The identity of each released amino acid is determined in turn, producing the amino acid sequence of the protein.
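Deducing a protein sequence from the DNA that encodes it is a lookup in the standard genetic code, three bases at a time. The Python sketch below assumes the input is the coding strand, already in the correct reading frame, using only the letters A, C, G, and T; the function name and the example sequence are invented for illustration.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
# One-letter amino acid codes; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}

def translate(coding_dna):
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for start in range(0, len(coding_dna) - 2, 3):
        amino_acid = CODON_TABLE[coding_dna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("ATGGCCAAATGA"))  # MAK
```

Here ATG, GCC, and AAA encode methionine, alanine, and lysine, and TGA is a stop codon.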

Another protein sequencing technique is called fast atom bombardment mass spectrometry, or FABMS. This is a powerful technique in which the sample is bombarded with a stream of fast atoms, such as argon. The protein becomes charged and fragmented in a sequence-specific manner. The fragments can be detected and their identity determined. However, the expense and relative scarcity of the necessary equipment can limit the use of the technique.

Still another protein sequencing strategy is the digestion of the protein with specialized protein-degrading enzymes called proteases. The shorter fragments that are generated, called peptides, can then be sequenced. The problem then is to put the peptides in order. This is done by using two proteases that cut the protein at different points, generating overlapping peptides. The peptides are separated and sequenced, and from the patterns of overlap the complete protein sequence can be deduced.
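The ordering problem can be sketched in Python as follows. The example assumes two simplified cutting rules, one protease cutting after lysine (K) or arginine (R), as trypsin does, and the other cutting after phenylalanine (F), tryptophan (W), or tyrosine (Y), as chymotrypsin does; real enzymes have exceptions to these rules. The function names and the example protein are invented for illustration, and trying every ordering is only practical for a handful of peptides.

```python
from itertools import permutations

def digest(protein, cut_after):
    """Cut the protein after every residue listed in `cut_after`."""
    peptides, current = [], ""
    for residue in protein:
        current += residue
        if residue in cut_after:
            peptides.append(current)
            current = ""
    if current:
        peptides.append(current)
    return peptides

def order_peptides(set_a, set_b):
    """Return each ordering of set_a whose joined sequence contains all of set_b."""
    candidates = set()
    for ordering in permutations(set_a):
        candidate = "".join(ordering)
        if all(peptide in candidate for peptide in set_b):
            candidates.add(candidate)
    return sorted(candidates)

protein = "MAFKGLYRSEWKTDA"                  # hypothetical protein
first_digest = sorted(digest(protein, "KR"))   # sorting discards the true order
second_digest = digest(protein, "FWY")
print(first_digest)    # ['GLYR', 'MAFK', 'SEWK', 'TDA']
print(second_digest)   # ['MAF', 'KGLY', 'RSEW', 'KTDA']
print(order_peptides(first_digest, second_digest))  # ['MAFKGLYRSEWKTDA']
```

Each peptide from the second digest spans a junction between two peptides from the first, which is what pins down their order.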

SEE ALSO Analytical instrumentation; DNA sequences, unique; PCR (polymerase chain reaction).
