What are noncoding RNA molecules?
Noncoding RNAs (ncRNAs) include any RNA that is not messenger RNA (mRNA), ribosomal RNA (rRNA), or transfer RNA (tRNA). The first ncRNAs were discovered in the 1960s because they were expressed in such high numbers. At the time, RNA was considered to function only as a means to express a gene, with all three of the main types of RNA being intimately involved in this process. Many of the ncRNAs discovered over the next twenty years were also found fortuitously, before their possible functions had even been considered. Once the transcription and processing of mRNAs had been elucidated, many of the ncRNAs were considered leftover fragments representing the introns that had been cut out of pre-mRNAs. At the same time, it was discovered that some of the ncRNAs were involved in the process of intron removal and exon splicing. Systematic searches for ncRNAs did not begin until the late 1990s and, once undertaken, revealed a veritable universe of ncRNAs, ranging from very short sequences of fewer than 100 nucleotides to some of around 100,000 nucleotides, and possibly more. Researchers have now identified ncRNAs in essentially all organisms, from bacteria to humans. For a system considered so well understood, the entry of so many new players has added a whole new layer of complexity to the study of genetics.
In almost all eukaryotic genes the coding sequence is interrupted by intervening sequences called introns. Therefore, when a precursor mRNA (pre-mRNA) is transcribed, it cannot be translated until the introns have been removed and the remaining fragments joined together (spliced). These remaining fragments, which contain the coding sequence of the gene, are termed exons (for expressed regions); they exit into the cytoplasm of the cell, unlike the introns, which are eventually degraded in the nucleus. The cellular “machine” that removes the introns and joins together the exons is the spliceosome. It is a complex assemblage of proteins and several particles called small nuclear ribonucleoproteins (snRNPs, pronounced “snurps” by geneticists). Each snRNP is made up of one or more small nuclear RNAs (snRNAs), the most common ones being the U1, U2, U4, U5, and U6 snRNAs, and a characteristic set of proteins bound to the snRNA.
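The outcome of splicing can be sketched as a simple string operation. The following Python example is only an illustration, not a model of the spliceosome: the sequence, the intron coordinates, and the function name splice are all hypothetical, and the intron positions are supplied directly, whereas the spliceosome must locate them through its snRNPs.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove intron intervals (0-based, end-exclusive) and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # exon lying before this intron
        position = end                          # skip over the intron itself
    exons.append(pre_mrna[position:])           # final exon after the last intron
    return "".join(exons)


# Hypothetical transcript: exons in upper case, introns in lower case.
pre_mrna = "AUGGCC" + "guaaguuuuag" + "UUUGAA" + "guaccccag" + "UAA"
introns = [(6, 17), (23, 32)]  # assumed coordinates of the two introns

mature_mrna = splice(pre_mrna, introns)
print(mature_mrna)  # AUGGCCUUUGAAUAA
```

Only the upper-case exon segments remain in the result, joined in their original order.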
Polyadenylation, another mRNA processing event, is the addition of adenine nucleotides to the 3′ end of mRNAs to make what is called a poly-A tail. A complex made up of several proteins is responsible for recognizing the polyadenylation signals in the mRNA transcript and adding the adenine nucleotides. Replication-dependent histone mRNAs are not polyadenylated, but instead a specific snRNP, and thus an snRNA called U7, are involved in forming a unique stem-loop structure at the 3′ end of the mRNA.
Three of the rRNAs found in eukaryotic ribosomes (28S, 5.8S, and 18S) are cleaved from a long 45S primary transcript. About half of the original transcript is removed in processing, mature rRNAs have some of their ribose sugars methylated, and some uridine nucleotides in rRNA are converted to pseudouridine (a modified nucleotide) in a process called pseudouridylation. The specific sites for modification and cleavage of the rRNA are determined by small nucleolar RNAs (snoRNAs) acting as guide RNAs. The snoRNAs bind transiently to rRNA in regions where they have complementary base sequences and direct methylation (C/D snoRNPs), pseudouridylation (H/ACA snoRNPs), or cleavage (U3, U8, U22, and MRP snoRNPs in vertebrates) at a set distance on the rRNA from the binding site of the snoRNA. For example, Cbf5 is the pseudouridine synthase enzyme in H/ACA snoRNPs; it is recruited to the site of rRNA modification because it is in a complex with the guide snoRNA in the snoRNP. SnoRNA homologs (a homolog is a molecule that is similar to another) have been found in Archaea, but in Bacteria rRNA modifications do not appear to involve guide RNAs.
A complex related to snRNPs was first found in bacteria and has now been found in all groups of organisms. It contains proteins and RNA and is called ribonuclease P (RNase P); it is involved in the processing of tRNA and some rRNAs. Experiments have shown that the RNA component can catalyze the required reactions, even without the protein component, making it the first clear-cut “ribozyme,” an RNA with catalytic properties. Several types of ncRNA are now known to act as ribozymes, and this ability prompted the evolutionary community to propose that early “life” was RNA-based rather than protein and DNA-based.
Another type of ncRNA is involved in RNA editing. These are guide RNAs (gRNAs), discovered in some protists. They guide the insertion or deletion of uracil nucleotides in the transcripts of mitochondrial genes. The details of the process are not well understood, but the mechanism involves complementary base pairing between the mRNA and a gRNA, much like that seen with snoRNAs. RNA editing has since been found in other organisms, and with rRNA and tRNA as well.
Finally, rRNA and tRNA, like mRNA, can contain introns, but removing them and splicing together the remaining fragments does not rely on the spliceosome. Instead, some contain self-splicing introns; that is, the introns catalyze their own removal (self-splicing introns are also found in some protein-coding genes in mitochondria). The splicing of tRNA introns occurs by yet another mechanism, which does not involve ncRNAs.
One of the most exciting discoveries in the area of ncRNAs was the realization that short (20 to 30 nucleotides) double-stranded RNAs (dsRNAs) trigger RNA silencing, a previously unknown but ubiquitous mechanism of controlling gene expression. The 2006 Nobel Prize in Physiology or Medicine was awarded to Andrew Fire and Craig Mello for their discovery of this phenomenon, termed RNA interference or RNAi. The intensive and ongoing research effort that followed this initial discovery identified two major groups of small RNAs involved in RNA silencing: small interfering RNAs (siRNAs) and microRNAs (miRNAs). Both originate from longer double-stranded RNAs, which can be thousands of base pairs long in the case of siRNAs but are usually an RNA hairpin structure about 70 base pairs long in the case of miRNAs. The small RNAs, 20 to 30 nucleotides long, are cleaved from their dsRNA precursors by an enzyme called Dicer, which is a ribonuclease. The small RNAs generated by Dicer bind to the RNA-induced silencing complex (RISC). A nuclear form of RISC is called RITS, for RNA-induced transcriptional silencing. At the core of each complex is a protein called Argonaute, which binds to the small RNA. RISC or RITS is targeted to a particular mRNA by the small RNA bound to Argonaute, which serves as a guide because it has a base sequence complementary to the coding, or “sense,” region of an mRNA. When “guided” to an mRNA by the bound siRNA or miRNA, Argonaute stops translation by sequestering or cleaving the target mRNA.
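The guiding role of the small RNA rests on base complementarity, which can be illustrated with a short Python sketch. It assumes perfect complementarity across the whole guide, which is a simplification, and the mRNA sequence and the function names reverse_complement and find_target_sites are hypothetical, chosen only for illustration.

```python
# Watson-Crick pairing for RNA: A pairs with U, G pairs with C.
COMPLEMENT = str.maketrans("AUGC", "UACG")


def reverse_complement(rna: str) -> str:
    """Return the strand that would base-pair with `rna`, written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def find_target_sites(mrna: str, guide: str) -> list[int]:
    """Return 0-based start positions in `mrna` that pair perfectly with `guide`."""
    site = reverse_complement(guide)
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)
    return positions


# Hypothetical mRNA, with a 21-nucleotide guide designed against positions 8-28.
mrna = "GGAUCCAUGGCUAGCUAGGAUCGAUCGUACGUAGCUAGGCAUAA"
guide = reverse_complement(mrna[8:29])

print(find_target_sites(mrna, guide))  # [8]
```

In this toy model the guide alone determines where the complex would act; an mRNA lacking the complementary stretch returns an empty list and would be left untouched.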
Another type of small RNA is PIWI-interacting RNA (piRNA); it is involved mainly in protecting the genome from parasitic DNA elements and is thought to work through complexes similar to RISC and RITS.
RNAi was initially thought to exert a type of genetic control called post-transcriptional gene silencing, whereby silencing occurs by targeting mRNA translation or stability. Control of gene expression at the earlier stage of transcription is determined in part by the state of the DNA in the transcribed region. Heterochromatin is a more tightly packed form of DNA associated with repressed transcription and subsequent silencing of gene expression. To a large extent modifications of histones around which the DNA is wrapped determine the packing state of the DNA and subsequently the level of gene expression. Increasing evidence points to RNA silencing acting during transcription as well, and even linking to alterations in DNA packing through interactions with histone modifying agents, as well as affecting DNA methylation, which is also associated with transcriptional silencing.
In bacteria, sRNAs (generally 100 nucleotides long) also target specific mRNAs for degradation, but a protein called Hfq, which is of a different type from Argonaute, plays the role of mediator and effector in the sRNA and target mRNA interaction. Other sRNAs in bacteria activate certain mRNAs by preventing formation of an inhibitory structure in the mRNA. Another ncRNA, simply called OxyS RNA, represses translation by interfering with ribosome binding.
A variety of other ncRNAs carry out more specialized functions, some just beginning to be understood. Gene silencing is a very important component of normal development. As cells become differentiated and specialized, they must express certain genes, and the remaining genes must be silenced. A form of silencing different from RNAi is called imprinting, whereby certain alleles from an allele pair are silenced, often those received from only one sex. A large ncRNA (a little longer than 100,000 nucleotides) called Air is responsible for silencing the paternal alleles in a small autosomal gene cluster. The mechanism underlying Air RNA action is beginning to be elucidated, and involves interaction with the DNA at the region to be silenced and recruitment of histone-modifying activities, leading to transcriptional silencing of the DNA in that region.
In human females, one of the X chromosomes (females have two) must be inactivated so the genes on it will not be expressed. This inactivation, called Lyonization after the discoverer of the phenomenon, Mary Lyon, occurs during development on a random basis in each cell, so that the X chromosome subjected to deactivation is randomly determined. An ncRNA called XIST plays a central part in this process. It is a large RNA of 16,500 nucleotides and it is initially transcribed from genes on both X chromosomes. When X inactivation begins, the active X chromosome ceases to express XIST, whereas the future inactive X chromosome has increased XIST expression and the XIST transcript binds all over the inactivated X chromosome. The X chromosome that gets coated with XIST is then silenced, and the only gene it transcribes thereafter is the XIST gene.
A type of ncRNA called transfer messenger RNA (tmRNA) is involved in rescuing ribosomes that have stalled during translation. When a stalled ribosome is encountered, a tmRNA first acts as a tRNA charged with the amino acid alanine. The stalled polypeptide is transferred to the alanine on the tmRNA. Translation then continues, but now the tmRNA acts as the mRNA, replacing the mRNA the ribosome was initially translating. A termination codon is soon reached, and the amino acids that were added based on the tmRNA code act as a tag that marks the released polypeptide for breakdown by enzymes in the cytoplasm. This allows ribosomes that would otherwise remain tied up with an mRNA they cannot finish translating to be recycled for translating another mRNA.
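The sequence of events in tmRNA rescue can be followed in a small Python sketch. It is a toy model only: the Ribosome class, the function rescue_with_tmrna, the nascent peptide, and the tag sequence are hypothetical placeholders rather than measured values, and peptides are written in one-letter amino acid codes.

```python
from dataclasses import dataclass


@dataclass
class Ribosome:
    nascent_peptide: str  # one-letter amino acid codes
    stalled: bool


def rescue_with_tmrna(ribosome: Ribosome, tag_peptide: str) -> str:
    """Return the tagged polypeptide released when a tmRNA rescues the ribosome."""
    if not ribosome.stalled:
        raise ValueError("tmRNA acts only on stalled ribosomes")
    # Step 1: the tmRNA acts as a tRNA and donates its alanine (A).
    peptide = ribosome.nascent_peptide + "A"
    # Step 2: the tmRNA acts as an mRNA; its short coding region is
    # translated up to the termination codon, adding the tag.
    peptide += tag_peptide
    # Step 3: the polypeptide is released and the ribosome is free again.
    ribosome.nascent_peptide = ""
    ribosome.stalled = False
    return peptide


HYPOTHETICAL_TAG = "GSGSAA"  # placeholder, not a real tmRNA-encoded tag
ribosome = Ribosome(nascent_peptide="MKTLLV", stalled=True)

released = rescue_with_tmrna(ribosome, HYPOTHETICAL_TAG)
print(released)          # MKTLLVAGSGSAA
print(ribosome.stalled)  # False
```

The released polypeptide carries the tag at its end, which is what would mark it for breakdown, while the ribosome returns to an unstalled state.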
Telomerase is an enzyme responsible for maintaining the ends of chromosomes, called telomeres. It is a large RNP containing the RNA called TER, which is a few hundred (and in some species more than a thousand) nucleotides long. TER contains a template sequence used to synthesize the repeat sequences normally found in telomeres.
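How an RNA template can specify a DNA repeat is shown in the following Python sketch. The six-nucleotide template used here is an assumption, chosen so that it yields the TTAGGG repeat found in vertebrate telomeres; real TER templates are somewhat longer, and the function names are hypothetical.

```python
# Pairing of RNA template bases with the DNA bases synthesized against them.
RNA_TO_DNA = str.maketrans("AUGC", "TACG")


def repeat_from_template(template_rna: str) -> str:
    """Return the DNA repeat (5' to 3') copied from an RNA template (5' to 3')."""
    return template_rna.translate(RNA_TO_DNA)[::-1]


def extend_telomere(chromosome_end: str, template_rna: str, rounds: int) -> str:
    """Append `rounds` copies of the templated repeat to a DNA 3' end."""
    return chromosome_end + repeat_from_template(template_rna) * rounds


TEMPLATE = "CCCUAA"  # assumed minimal template, for illustration only

print(repeat_from_template(TEMPLATE))           # TTAGGG
print(extend_telomere("TTAGGG", TEMPLATE, 2))   # TTAGGGTTAGGGTTAGGG
```

Each round of copying adds one more repeat to the chromosome end, so the same short stretch of RNA can be reused many times.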
Most of the ncRNAs described above were unknown until the 1980s, and some of them were discovered only in the 1990s. What appeared to be a relatively simple picture of genetic control in cells has now gained many previously hidden layers involving all manner of RNAs, ranging from a mere 20 nucleotides to 100,000 nucleotides or so in length. Some suggest that this glimpse is just the tip of the iceberg and that continued research will reveal increasingly complex interactions among RNAs and between RNAs and proteins. Genomics, the study of the DNA sequence of genomes, has been a hot field for some time and is now often focused on the discovery of ncRNAs.
Initially, cDNA libraries were surveyed for ncRNA sequences, especially some of the smaller ones that were long thought merely to be leftover scraps from other processes. For example, one study in 2001, which included a survey of a mouse-brain cDNA library, revealed 201 potential novel, small ncRNAs. In a 2003 survey of a cDNA library from Drosophila melanogaster (fruit fly), sixty-six potential novel ncRNAs were discovered. Judging by the large numbers of candidate ncRNAs showing up in what are essentially first-time surveys, many more may remain to be found, and methods for generating small RNA libraries are continually improving. There could potentially be thousands of ncRNA genes. What is surprising is that many of these ncRNA genes are being found in spacer regions and introns, places that were once considered useless junk. With so much now being found in these regions, many geneticists have become ever more cautious in calling any DNA sequence junk DNA.
Because the field of ncRNAs is in its infancy and the functions of many ncRNAs are only barely understood, it may be premature to predict specific medical applications, but the potential is certainly there. The population of ncRNAs in a cell, in some sense, resembles a complex set of switches that turn genes on and off: before they are transcribed, while they are being transcribed, or even once translation has begun. Once these switches are better understood, researchers may be able to exploit the system with artificially produced RNAs. Geneticists will probably also discover that a number of diseases with previously unexplained genetic behavior can be explained by ncRNAs.