Modification And Processing Of mRNA
Most eukaryotic mRNAs undergo a quite complicated series of modifications and processing events before translation occurs. These include:
- Chemical modification of the 5′ end of the mRNA molecule (capping).
- Addition of a poly(A) tail to the 3′ end (polyadenylation).
- Removal of introns (splicing).
- In a few special cases, alterations to the nucleotide sequence of the mRNA (RNA editing).
These events occur only in eukaryotes. The first clue that eukaryotic mRNAs undergo modification and processing came when RNA fractions present in the nucleus and cytoplasm were compared.
- The nucleus can be divided into two distinct regions: the nucleolus, in which rRNA genes are transcribed, and the nucleoplasm, where other genes, including those for mRNA, are transcribed.
- The nucleoplasmic RNA fraction is called heterogeneous nuclear RNA (hnRNA), the name indicating that it is made up of a complex mixture of RNA molecules, some over 20 kb in length.
- The mRNA in the cytoplasm is also heterogeneous, but its average length is only 2 kb.
- If the mRNA in the cytoplasm is derived from hnRNA, then modification and processing events, including a reduction in the length of the primary transcripts, must occur before the mRNA leaves the nucleus.
- The hnRNA fraction includes mRNAs at various stages of modification and processing, making it very difficult to study the exact order of events that occur within a single molecule.
- There is evidence that the end modifications (capping and polyadenylation) are completed before all the introns are removed (splicing), but this may not be true for all mRNAs.
Role of CTD of RNA Polymerase II in mRNA Processing (RNA Factory): RNA processing steps are tightly coupled to transcription elongation by an ingenious mechanism.
- As discussed previously, a key step in the transition of RNA polymerase II to the elongation mode of RNA synthesis is an extensive phosphorylation of the RNA polymerase II tail, called the CTD (C-terminal domain).
- This C-terminal domain of the largest subunit consists of a long tandem array of a repeated seven-amino-acid sequence, containing two serines per repeat that can be phosphorylated.
- Because there are 52 repeats in the CTD of human RNA polymerase II, its complete phosphorylation would add 104 negatively charged phosphate groups to the polymerase.
- This phosphorylation step not only dissociates RNA polymerase II from other proteins present at the start point of transcription, but also allows a new set of proteins, which function in transcription elongation and RNA processing, to associate with the RNA polymerase tail.
- As discussed next, some of these processing proteins seem to “jump” from the polymerase tail onto the nascent RNA molecule to begin processing it as it emerges from RNA polymerase.
- Thus, RNA polymerase II in its elongation mode can be viewed as an RNA factory that both transcribes DNA into RNA and processes the RNA it produces.
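The charge arithmetic for the CTD can be checked with a very small sketch. The repeat count (52) and the two phosphorylatable serines per repeat are taken from the text; the function name `ctd_phosphate_count` is only an illustrative choice.

```python
def ctd_phosphate_count(repeats: int, serines_per_repeat: int = 2) -> int:
    """Maximum number of phosphate groups added when every
    phosphorylatable serine in every CTD repeat is modified."""
    return repeats * serines_per_repeat


# Human RNA polymerase II: 52 repeats, two serines each (values from the text).
print(ctd_phosphate_count(repeats=52))  # 104
```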
RNA Capping: As soon as RNA polymerase II has produced about 25 nucleotides of RNA, the 5′ end of the new RNA molecule is modified by the addition of a “cap” that consists of a modified guanine nucleotide (7-methylguanosine).
- The capping reaction is performed by three enzymes acting in succession: a phosphatase removes one phosphate from the 5′ end of the nascent RNA, a guanylyl transferase adds a GMP in a reverse linkage (5′ to 5′ instead of 5′ to 3′), and a methyl transferase adds a methyl group to the guanosine.
- Because all three enzymes bind to the phosphorylated RNA polymerase tail, they are ready to modify the 5′ end of the nascent transcript as soon as it emerges from the polymerase.
- The 5′-methyl cap marks the 5′ end of eukaryotic mRNA, and this landmark helps the cell to distinguish mRNA from the other types of RNA molecules present in the cell.
- For example, RNA polymerases I and III produce uncapped RNAs during transcription, in part because these polymerases lack the tail.
- In the nucleus, the cap binds a protein complex called CBC (cap binding complex) which helps the RNA to be properly processed and exported.
- The 5′ methyl cap also has an important role in the translation of mRNA in the cytosol.
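As a rough illustration of the three enzymatic steps of capping, the sketch below tracks the 5′ end of a toy transcript written as a string. The string notation ('ppp' for the 5′ triphosphate, "m7G(5')ppp" for the finished cap) and the function name `cap_transcript` are assumptions made for this example only.

```python
def cap_transcript(nascent: str) -> str:
    """Apply the three capping steps to a toy transcript written as 'pppNNN...'."""
    if not nascent.startswith("ppp"):
        raise ValueError("expected a 5' triphosphate end written as 'ppp'")
    body = nascent[3:]

    # Step 1 (phosphatase): remove one phosphate from the 5' end.
    after_phosphatase = "pp" + body

    # Step 2 (guanylyl transferase): add GMP in a reverse 5'-to-5' linkage.
    after_guanylyl_transferase = "G(5')p" + after_phosphatase

    # Step 3 (methyl transferase): methylate the added guanosine.
    return "m7" + after_guanylyl_transferase


print(cap_transcript("pppAUGC"))  # m7G(5')pppAUGC
```

The sketch records only the order of the steps; it does not model the point at which capping begins (about 25 nucleotides into the transcript).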
Polyadenylation: At the other end, the 3′ end of polymerase II transcripts, a sequence of twenty to two hundred adenine-containing nucleotides, known as a poly-A tail, is added by the enzyme poly-A polymerase.
- This step, called polyadenylation, takes place after the 3′ end of the transcript is removed by a nuclease that cuts about twenty nucleotides downstream from the signal 5′-AAUAAA-3′.
- The tail adds stability to the molecule and aids in its transportation from the nucleus.
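The cleavage-then-tailing order described above can be sketched on a toy RNA string. This is a minimal illustration under stated assumptions: the function name `polyadenylate` and its parameters `cut_offset` and `tail_length` are hypothetical, the cut is placed exactly 20 nucleotides after the signal (the text says "about twenty"), and only the first AAUAAA in the string is used.

```python
def polyadenylate(transcript: str, cut_offset: int = 20, tail_length: int = 200) -> str:
    """Cut a toy transcript downstream of AAUAAA and append a poly-A tail."""
    signal = "AAUAAA"
    pos = transcript.find(signal)  # first occurrence only, an assumption of this sketch
    if pos == -1:
        return transcript  # no signal found: left unprocessed here

    # Nuclease step: cut `cut_offset` nucleotides downstream of the signal.
    cut_site = pos + len(signal) + cut_offset
    cleaved = transcript[:cut_site]

    # Poly-A polymerase step: add the tail to the new 3' end.
    return cleaved + "A" * tail_length


toy = "GGC" + "AAUAAA" + "C" * 20 + "UUUUU"
print(polyadenylate(toy, tail_length=5))
```

In the toy input, the five trailing U residues lie beyond the cut site, so they are removed before the tail is added.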
Introns and RNA Splicing: The protein-coding sequences of eukaryotic genes are typically interrupted by noncoding intervening sequences (introns).
- Discovered in 1977, this feature of eukaryotic genes came as a surprise to molecular biologists, who until that time had been familiar only with bacterial genes, which typically consist of a continuous stretch of coding DNA that is directly transcribed into mRNA.
- In marked contrast, eukaryotic genes were found to be broken up into small pieces of coding sequence (expressed sequences or exons) interspersed with much longer intervening sequences or introns; thus, the coding portion of a eukaryotic gene is often only a small fraction of the length of the gene.
Types of introns: Molecular biologists have recognized seven distinct types of introns in eukaryotes, and additional forms in the archaea.
- Two of these types, namely GU-AG and AU-AC introns, are found in eukaryotic protein-coding genes.
In protein-coding genes, introns are less common in lower eukaryotes: the 6000 genes in the yeast genome contain only 239 introns in total, whereas many individual mammalian genes contain 50 or more introns.
- A eukaryotic pre-mRNA may contain many introns, perhaps over 100, taking up a considerable length of the transcript.
- These introns must be excised and the exons joined together in the correct order before the transcript can function as a mature mRNA.
Introns In Human Genes: (table not preserved)
Note: Dystrophin is part of a transmembrane glycoprotein complex of skeletal muscle cells and is absent in Duchenne muscular dystrophy and deficient in Becker muscular dystrophy.
Evolution Of Introns: When the same gene is compared in related species, molecular biologists find that some of the introns are in identical positions but that each species has one or more unique introns.
- This implies that some introns remain in place for millions of years, retaining their positions while species diversify, whereas others appear or disappear during this same period.
This leads to two opposing hypotheses for the evolution of introns:
- ‘Introns late’ hypothesis: This hypothesis holds that introns evolved relatively recently and are gradually accumulating in eukaryotic genomes.
- ‘Introns early’ hypothesis: This hypothesis suggests that introns are very ancient and are gradually being lost from eukaryotic genomes.
GU-AG Introns and Protein-Mediated Splicing: In most pre-mRNA introns, the first two nucleotides of the intron sequence are 5′-GU-3′ and the last two are 5′-AG-3′.
- They are therefore called ‘GU-AG’ introns and all members of this class are spliced in the same way.
The consensus sequences surrounding the GU and AG motifs vary in different types of eukaryotes; in vertebrates, they can be described as:
- 5′ splice site: 5′-AG↓GUAAGU-3′
- 3′ splice site: 5′-PyPyPyPyPyPyNCAG↓-3′
In these designations, ‘Py’ is one of the two pyrimidine nucleotides (U or C), ‘N’ is any nucleotide, and the downward arrow (↓) indicates the exon-intron boundary.
- The 5′ splice site is also known as the donor site and the 3′ splice site as the acceptor site.
- Other conserved sequences are present in some but not all eukaryotes. Introns in higher eukaryotes usually have a polypyrimidine tract, a pyrimidine-rich region located just upstream of the 3′ end of the intron sequence.
- This tract is not common in yeast. However, yeast introns contain a 5′-UACUAAC-3′ sequence, located between 18 and 140 nucleotides upstream of the 3′ splice site; this sequence is absent in higher eukaryotes.
- The polypyrimidine tract and the 5′-UACUAAC-3′ sequence have different functions.
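The vertebrate consensus sequences given above can be written as regular expressions, which makes the position of each exon-intron boundary explicit. This is a sketch only: it demands an exact match to the consensus, whereas real splice sites deviate from it, and the function names `donor_sites` and `acceptor_sites` are illustrative.

```python
import re

# 5' splice site: exon ends in AG, intron begins with GUAAGU.
# The lookahead keeps the match end at the exon-intron boundary.
FIVE_PRIME = re.compile(r"AG(?=GUAAGU)")

# 3' splice site: six pyrimidines, any nucleotide, then CAG;
# the boundary lies immediately after the AG.
THREE_PRIME = re.compile(r"[CU]{6}[ACGU]CAG")


def donor_sites(rna: str) -> list[int]:
    """Index of the first intron nucleotide for each 5' consensus match."""
    return [m.end() for m in FIVE_PRIME.finditer(rna)]


def acceptor_sites(rna: str) -> list[int]:
    """Index of the first downstream-exon nucleotide for each 3' consensus match."""
    return [m.end() for m in THREE_PRIME.finditer(rna)]


toy = "CCAG" + "GUAAGU" + "AAAA" + "UCUCUC" + "A" + "CAG" + "GGA"
start, end = donor_sites(toy)[0], acceptor_sites(toy)[0]
print(toy[start:end])  # the intron, beginning with GU and ending with AG
```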
Two steps of the splicing pathway:
Cleavage of the 5′ splice site: This step occurs by a transesterification reaction promoted by the hydroxyl group attached to the 2′ carbon of an adenosine nucleotide located within the intron sequence.
- In yeast, this adenosine is the last one in the conserved 5′-UACUAAC-3′ sequence.
- The result of the hydroxyl attack is the cleavage of the phosphodiester bond at the 5′ splice site, accompanied by the formation of a new 5′-2′ phosphodiester bond linking the first nucleotide of the intron (the G of the 5′-GU-3′ motif) with the internal adenosine.
- This means that the intron has now been looped back on itself to create a lariat structure.
- A lariat is a rope with a running noose, for catching horses and cattle.
Cleavage of the 3′ splice site and joining of exons: These two reactions occur due to the second transesterification reaction, which is promoted by the 3′-OH group attached to the end of the upstream exon.
- This hydroxyl group attacks the phosphodiester bond at the 3′ splice site, cleaving it and so releasing the intron as the lariat structure.
- The free lariat structure is converted back to a linear RNA and degraded.
- At the same time, the 3′ end of the upstream exon joins the newly formed 5′ end of the downstream exon, completing the splicing process.
The Splicing apparatus of GU-AG introns: The central components of the splicing apparatus for GU-AG introns are the small nuclear RNAs (snRNAs) called U1, U2, U4, U5, and U6.
- These are short molecules (between 106 nucleotides, for example U6, and 185 nucleotides, for example U2, in vertebrates) that associate with proteins to form small nuclear ribonucleoprotein particles (snRNPs) in the nucleus of eukaryotic cells.
- The snRNPs, together with accessory proteins, attach to the transcript and form a series of complexes, the last one of which is the spliceosome, the structure within which the actual splicing reactions occur (Smith and Valcarcel, 2000).
The role of the splicing apparatus in splicing: Splicing involves the formation of the following three transition structures.
Formation of commitment complex: The splicing activity is initiated by the formation of the commitment complex.
This complex comprises U1-snRNP, which binds to the 5′ splice site, partly by RNA-RNA base pairing, and the protein factors SF1, U2AF65, and U2AF35, which make protein-RNA contacts with the branch site, the polypyrimidine tract, and the 3′ splice site, respectively.
Formation of the pre-spliceosome complex: The pre-spliceosome complex comprises the commitment complex plus U2-snRNP, the latter attached to the branch site. At this stage, an association between U1-snRNP and U2-snRNP brings the 5′ splice site into proximity with the branch point.
Formation of the spliceosome: It is formed when U4/U6-snRNP (a single snRNP containing two snRNAs) and U5-snRNP attach to the pre-spliceosome complex.
- This results in additional interactions that bring the 3′ splice site close to the 5′ splice site and the branch point.
- All three key positions in the intron are now in proximity, and the two transesterifications occur as a linked reaction, possibly catalyzed by U6-snRNP, completing the splicing process.
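The order in which components join the three complexes can be summarized as a small data structure. The sketch follows the cumulative description in the text and does not model any later rearrangement or release of components; the names `ASSEMBLY_STEPS` and `assemble` are illustrative.

```python
# Each step lists only the components that join at that stage (from the text).
ASSEMBLY_STEPS = [
    ("commitment complex", ["U1-snRNP", "SF1", "U2AF35", "U2AF65"]),
    ("pre-spliceosome complex", ["U2-snRNP"]),
    ("spliceosome", ["U4/U6-snRNP", "U5-snRNP"]),
]


def assemble() -> dict[str, list[str]]:
    """Return the components present in each complex, in assembly order."""
    bound: list[str] = []
    stages: dict[str, list[str]] = {}
    for name, newcomers in ASSEMBLY_STEPS:
        bound = bound + newcomers  # each complex builds on the previous one
        stages[name] = list(bound)
    return stages


for complex_name, components in assemble().items():
    print(f"{complex_name}: {', '.join(components)}")
```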
Selection Of Correct Splice Sites: A set of splicing factors called SR Proteins are found to be important in splice site selection.
The SR proteins, so called because their C-terminal domains contain a region rich in serine (abbreviated S) and arginine (R), were first implicated in splicing when it was discovered that they are components of the spliceosome.
SR Proteins Have The Following Functions:
- They establish a connection between bound U1-snRNP and bound U2AF in the commitment complex (Valcarcel and Green, 1996).
- This indicates that SR proteins have a role in splice-site selection, since the formation of the commitment complex is the critical stage of the splicing process: it is the event that identifies which sites will be linked.
- They interact with exonic splicing enhancers (ESEs), which are purine-rich sequences located in the exon regions of a transcript (Blencowe, 2000).
- Our understanding of the role of ESEs and their counterparts, the exonic splicing silencers (ESSs) (Del Gatto-Konczak et al., 1999), is still in its infancy, but their importance in controlling splicing is clear from the discovery that several human diseases, including one type of muscular dystrophy, are caused by mutations in ESE sequences.
- The sites of ESEs and ESSs indicate that assembly of the spliceosome is driven not simply by contacts within the intron but also by interactions with adjacent exons.
- It is possible that an individual commitment complex is not assembled within an intron, but initially bridges an exon.
This Model Holds Two Positive Points:
- It provides a means by which contact between an ESE or ESS and an SR protein could influence splicing;
- It takes account of the large disparity between the lengths of exons and introns in vertebrate genes. In the human genome, for example, the exons have an average length of 145 bp compared with 3365 bp for introns (IHGS, 2001).
Initial assembly of a commitment complex across an exon might therefore be a less difficult task than assembly across longer introns.
- Certain SR proteins, called CASPs (CTD-associated SR-like proteins) or SCAFs (SR-like CTD-associated factors), form a physical connection between the spliceosome and the CTD of the RNA polymerase II transcription complex, and hence provide a link between transcript elongation and processing.
- As already discussed, like some of the polyadenylation proteins, these splicing factors probably ride with RNA polymerase II as this enzyme synthesizes the transcript, and are deposited at their appropriate positions at intron splice sites as soon as these are transcribed.
- Electron microscopy studies have shown that transcription and splicing occur together.
Alternative Splicing Pathways: When introns were first discovered, the prevailing concept was that each gene always gives rise to the same mRNA: in other words, that there is a single splicing pathway for each primary transcript.
- This assumption was found to be incorrect in the 1980s, when it was shown that the primary transcripts of some genes can follow two or more alternative splicing pathways, enabling a single transcript to be processed into related but different mRNAs and hence to direct synthesis of a range of proteins.
- Alternative splicing is most common in higher animals.
- So while there are only three examples of alternative splicing in the yeast Saccharomyces cerevisiae, there are many more examples in Drosophila and humans.
- The extent of alternative splicing became apparent when the draft Drosophila sequence was compared with that of the microscopic nematode Caenorhabditis elegans and the fruit fly was found to have fewer genes than the nematode.
- Thus, the apparently greater physical complexity of Drosophila is not reflected in a larger number of genes.
- The most likely explanation for the lack of parallelism between the number of genes in the Drosophila genome and the number of proteins in its proteome is that a substantial number of the genes give rise to multiple proteins via alternative splicing. (Note: the proteome is the collection of functioning proteins synthesized by a living cell.)
- Similarly, when the first human chromosome sequences were obtained, it was recognized that, rather than having the 80 000 to 100 000 genes suggested by the size of the human proteome, humans have only 35 000 or so genes.
- It is now believed that at least 35% of the genes in the human genome undergo alternative splicing (Graveley, 2001). With this realization, the principle of ‘one gene, one protein’, which had served as biological dogma since the 1940s, has been overthrown.
- Currently, alternative splicing is regarded as a crucial feature of the genome expression pathway. This viewpoint is supported by the following two examples:
Sex determination in Drosophila: In Drosophila, sex is determined by an alternative splicing cascade (Chabot, 1996).
- The first gene in this cascade is SXL, whose transcript contains an optional exon which, when spliced to the one preceding it, results in an inactive version of protein SXL.
- In females, the splicing pathway is such that this exon is skipped so that functional SXL protein is made.
- The SXL protein promotes the selection of a cryptic splice site in a second transcript, tra, by directing U2AF65 away from its normal 3′ splice site to a second site further downstream.
- The resulting female-specific TRA protein is again involved in alternative splicing, this time by interacting with SR proteins to form a multifactor complex that attaches to an ESE within an exon of a third pre-mRNA, dsx, promoting the selection of a secondary, female-specific splice site in this transcript. The male and female versions of the DSX proteins are the primary determinants of Drosophila sex.
The human slo gene: In humans, the slo gene codes for a membrane protein that regulates the entry and exit of potassium ions into and out of cells (Graveley, 2001). This gene has 35 exons, eight of which are involved in alternative splicing events.
- The alternative splicing pathways involve different combinations of the eight optional exons, leading to over 500 distinct mRNAs, each specifying a membrane protein with slightly different functional properties. What are the biological consequences of this example of multiple splicing? The slo gene is active in the hair cells on the basilar membrane of the cochlea.
- Different hair cells respond to different sound frequencies between 20 and 20 000 Hz, their capabilities determined in part by the properties of their Slo proteins. Alternative splicing of the slo gene in cochlear hair cells therefore helps to determine the auditory range of humans.
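How a single transcript can yield several mRNAs is easy to see by enumerating exon combinations for a small hypothetical gene. The sketch models only independent inclusion or skipping of optional exons, so it is not meant to reproduce the number of mRNAs reported for slo; the gene layout and the function name `isoforms` are invented for illustration.

```python
from itertools import product


def isoforms(exons: list[tuple[str, bool]]) -> list[list[str]]:
    """List every exon combination for a gene.

    exons: (name, optional) pairs in transcript order. Constitutive exons
    are always kept; each optional exon is either included or skipped.
    """
    choices = [(True, False) if optional else (True,) for _, optional in exons]
    result = []
    for included in product(*choices):
        result.append([name for (name, _), keep in zip(exons, included) if keep])
    return result


# Hypothetical four-exon gene with two optional exons.
gene = [("E1", False), ("E2", True), ("E3", True), ("E4", False)]
for mrna in isoforms(gene):
    print("-".join(mrna))
```

Exon order is preserved in every combination, reflecting the requirement that exons be joined in the correct order.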
AU-AC Introns: The AU-AC introns are reported in approximately 20 genes in organisms as diverse as humans, plants, and Drosophila (Nilsen, 1996; Tarn and Steitz, 1997).
- The AU-AC introns have a conserved branch site sequence with the consensus 5′-UCCUUAA-3′, the last adenosine in this motif being the one that participates in the first transesterification reaction.
- Thus, the splicing pathway of AU-AC introns is very similar to that of GU-AG introns but involves a different set of splicing factors.
- Only the U5-snRNP is involved in the splicing mechanisms of both types of intron.
- The roles of U1-snRNP and U2-snRNP are taken by U11/U12-snRNP, a previously discovered complex that had never been assigned a function, and those of U4/U6-snRNP by U4atac/U6atac-snRNP.