Relatively Detailed Explanation of Gene Expression

Gene expression is the process by which the information in a gene is used to make a functional product, usually a protein. Proteins are the molecules that enable us to live. They carry out the majority of the processes that are important for life, as they include enzymes, transcription factors and much of the cell's machinery. Indeed, without proteins, life as we know it would not be possible.

Gene expression cannot be carried out without a nucleic acid sequence, which in humans is DNA. DNA is a double-stranded molecule composed of a sugar-phosphate backbone and nitrogenous bases. On a smaller scale, each strand is a sequence of nucleotides, and the two strands are complementary to each other. There are four different bases: adenine (A), cytosine (C), guanine (G) and thymine (T), with A pairing with T and C pairing with G. Humans have approximately 3 billion of these base pairs. However, not all of them are coding sequences. Approximately 1.3% of the DNA codes for proteins; the rest has many varied functions that are still essential to the overall function of the human system. Our main focus here, though, is the 1.3%. What exactly goes on in there that enables us to have the various proteins we need for our daily functions?
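To make the base-pairing rule and the 1.3% figure concrete, here is a minimal Python sketch. The function name `complement_strand` and the constants are illustrative choices, not anything standard, and the genome size is the rounded 3 billion quoted above.

```python
# Watson-Crick pairing rules for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_strand(strand: str) -> str:
    """Return the partner strand, written 5' to 3'.

    The two strands of DNA run in opposite directions, so the
    complement is built from the input read in reverse.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand))


GENOME_BASE_PAIRS = 3_000_000_000  # approximate figure from the text
CODING_FRACTION = 0.013            # the 1.3% quoted in the text

coding_base_pairs = round(GENOME_BASE_PAIRS * CODING_FRACTION)

print(complement_strand("ATGC"))  # GCAT
print(coding_base_pairs)          # 39000000
```

So, on these rounded numbers, only about 39 million of the 3 billion base pairs code for protein.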

Gene expression in humans (and other eukaryotes) comprises three ‘steps’: transcription, RNA processing and translation. Transcription and RNA processing occur in the nucleus of the cell, before translation occurs in the cytoplasm.


Transcription

In transcription, the template DNA (the strand of the gene that codes for the protein needed) directs the synthesis of new RNA. This RNA is a complementary copy of the template DNA, containing everything ‘mentioned’ by the DNA. It is something like copying down everything the teacher says, without processing the information or removing the irrelevant parts.

In both eukaryotic and prokaryotic transcription, the process has three stages. The first is initiation, where RNA polymerase binds to the promoter region. This starts the unwinding of the DNA strands, and the polymerase begins RNA synthesis. Like DNA polymerase, RNA polymerase can only assemble a polynucleotide in the 5′ to 3′ direction. Unlike DNA polymerase, however, it needs no primer to start the chain.
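The copying step can be sketched in a few lines of Python. This is a simplified illustration only: the function name `transcribe` is made up for this example, and it assumes the template strand is supplied in the 3′ to 5′ direction so that the RNA comes out 5′ to 3′, the direction in which the polymerase builds it.

```python
# Pairing rules used when RNA is built on a DNA template.
# RNA uses uracil (U) where DNA would use thymine (T).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_3_to_5: str) -> str:
    """Build the RNA transcript (5' to 3') from a template strand given 3' to 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


print(transcribe("TACGGT"))  # AUGCCA
```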

In eukaryotic cells, proteins known as transcription factors bind to the promoter, which often includes a TATA box (a nucleotide sequence containing TATA, about 25 nucleotides upstream from the transcription start point). After that, more transcription factors bind to the DNA, together with RNA polymerase II, forming the transcription initiation complex.
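As a rough illustration of what ‘about 25 nucleotides upstream’ means, the sketch below checks whether TATA appears in a window before the transcription start point. The function `has_tata_box`, the window bounds of 20 to 35 nucleotides and the example promoter are all assumptions made up for this example; real promoter recognition relies on transcription factors, not a string search.

```python
def has_tata_box(dna: str, start_point: int,
                 nearest: int = 20, farthest: int = 35) -> bool:
    """Report whether 'TATA' lies in a window upstream of start_point.

    The window bounds are illustrative assumptions that bracket the
    roughly 25 nucleotides mentioned in the text.
    """
    window_start = max(0, start_point - farthest)
    window_end = max(0, start_point - nearest)
    return "TATA" in dna[window_start:window_end]


# Hypothetical promoter: TATA begins 30 nucleotides before index 35.
promoter = "GGGGG" + "TATAAA" + "C" * 24 + "ATGCCC"

print(has_tata_box(promoter, start_point=35))  # True
print(has_tata_box("C" * 60, start_point=35))  # False
```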

Elongation comes next. The polymerase moves downstream, unwinding the DNA and elongating the RNA transcript in the 5′ to 3′ direction. As RNA synthesis proceeds, the newly transcribed RNA peels away from the DNA template, and the double helix re-forms. A single gene can be transcribed by several RNA polymerase molecules at the same time. This increases the amount of RNA transcribed from it, enabling the cell to make the encoded protein in large amounts.

The final stage of transcription is termination. This process differs between prokaryotes and eukaryotes. In prokaryotes, transcription stops when a terminator sequence is transcribed, causing the polymerase to detach from the DNA and release the transcript, which is available for immediate use as mRNA. In fact, translation of the mRNA strand sometimes begins while transcription is still taking place!

In eukaryotes, the polymerase transcribes a sequence on the DNA that gives rise to the polyadenylation signal (AAUAAA) in the pre-mRNA. The pre-mRNA is cleaved about 10-25 nucleotides downstream of this signal, while the polymerase continues transcribing until it falls off the DNA by a mechanism that is not yet fully understood.
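The cleavage step can be pictured as ‘find the signal, then cut a little further along’. Here is a hedged Python sketch; the function `cleave_pre_mrna` and the default offset of 15 nucleotides are assumptions for illustration, with the offset simply picked from inside the range given above.

```python
POLY_A_SIGNAL = "AAUAAA"


def cleave_pre_mrna(transcript: str, offset: int = 15) -> str:
    """Cut the transcript `offset` nucleotides past the end of the signal.

    `offset` is an illustrative value; the cell does not cut at a
    single fixed distance.
    """
    start = transcript.find(POLY_A_SIGNAL)
    if start == -1:
        return transcript  # no signal found: left uncut in this sketch
    cut = start + len(POLY_A_SIGNAL) + offset
    return transcript[:cut]


# Hypothetical transcript: 4 nucleotides, the signal, then 30 more.
transcript = "GGCC" + POLY_A_SIGNAL + "C" * 30

print(len(transcript))                   # 40
print(len(cleave_pre_mrna(transcript)))  # 25
```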

RNA processing

In eukaryotes, there is one additional step before translation: RNA processing. Here, introns (non-coding sequences) are removed and exons (coding sequences) are spliced together. In addition, a 5′ cap and a poly-A tail are added to the 5′ and 3′ ends of the pre-mRNA respectively. Neither the 5′ cap and poly-A tail nor the regions known as the 5′ untranslated region (UTR) and 3′ UTR are translated into protein.
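A toy version of RNA processing might look like the following. It assumes the intron positions are already known and passed in as (start, end) index pairs, which the cell of course works out through its splicing machinery, and it stands in for the 5′ cap with a plain `[cap]` marker. The function name and the tail length are illustrative only.

```python
from typing import List, Tuple


def process_pre_mrna(pre_mrna: str, introns: List[Tuple[int, int]],
                     tail_length: int = 5) -> str:
    """Remove introns, join the exons, then add a cap marker and poly-A tail.

    Each intron is a (start, end) pair of string indices, with start
    inclusive and end exclusive.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the final exon
    return "[cap]" + "".join(exons) + "A" * tail_length


# Hypothetical pre-mRNA: exon, intron (indices 5 to 10), exon.
pre_mrna = "AUGCC" + "GUAAAG" + "GGUAA"

print(process_pre_mrna(pre_mrna, [(5, 11)]))  # [cap]AUGCCGGUAAAAAAA
```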


Translation

Now comes the complicated part: translation. After the RNA is processed and becomes mRNA, it has to be translated into protein, something that the cell can use. This process involves a few key players: mRNA, tRNA (transfer RNA), ribosomes and amino acids.

First, a small ribosomal subunit binds to a molecule of mRNA at the mRNA binding site. An initiator tRNA carrying the amino acid methionine base-pairs (via hydrogen bonds) with the start codon AUG. This is an important step, as it establishes the reading frame for the mRNA. If the reading frame is wrong, the wrong protein will be produced and many complications can arise. The large subunit of the ribosome then attaches, completing the translation initiation complex. Proteins known as initiation factors are responsible for bringing everything together. Once the initiation complex is complete, the initiator tRNA sits in the P site, while the A site remains empty for the next tRNA.
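To see why the start codon fixes the reading frame, consider this small Python sketch. It makes the simplifying assumption that the first AUG in the sequence is the start codon, and the function name `codons_from_start` is invented for the example.

```python
def codons_from_start(mrna: str) -> list:
    """Split an mRNA into codons, beginning at the first AUG.

    Leftover nucleotides that do not fill a whole codon are dropped.
    """
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon: nothing to read in this sketch
    frame = mrna[start:]
    whole = len(frame) - len(frame) % 3
    return [frame[i:i + 3] for i in range(0, whole, 3)]


print(codons_from_start("GGAUGCCAUAGC"))  # ['AUG', 'CCA', 'UAG']
```

Reading the same sequence from its very first nucleotide would give GGA, UGC, CAU, AGC instead, a completely different set of codons.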

Elongation then occurs, beginning with codon recognition. The anticodon of an incoming tRNA base-pairs with the complementary codon on the mRNA in the A site. GTP hydrolysis increases the accuracy and efficiency of this step. After that, an rRNA molecule of the large subunit catalyses the formation of a peptide bond between the newly arrived amino acid in the A site and the carboxyl end of the growing polypeptide chain in the P site. This step leaves the polypeptide attached to the tRNA in the A site. The ribosome then translocates the tRNA in the A site to the P site. The empty tRNA that was in the P site is moved to the E site, where it is released. The mRNA moves through the ribosome, 5′ end first, bringing the next codon to be translated into the A site.

The final stage is termination. Elongation continues until a stop codon reaches the A site of the ribosome. UAG, UAA and UGA are the stop codons that signal the end of translation. A release factor then binds directly to the stop codon, causing the addition of a water molecule instead of an amino acid to the peptide chain. This hydrolyses the bond between the tRNA in the P site and the last amino acid of the polypeptide chain, freeing the polypeptide. Everything else then dissociates.
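Elongation and termination together amount to ‘keep adding amino acids until a stop codon turns up’. The sketch below shows that loop in Python. The codon table is deliberately partial, holding just five entries from the standard genetic code, and the function name `translate` is an illustrative choice.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Partial codon table: only a handful of entries from the standard genetic code.
CODON_TABLE = {
    "AUG": "Met",
    "CCA": "Pro",
    "GGU": "Gly",
    "UUU": "Phe",
    "AAA": "Lys",
}


def translate(codons: list) -> list:
    """Add one amino acid per codon until a stop codon is reached."""
    peptide = []
    for codon in codons:
        if codon in STOP_CODONS:
            break  # termination: the polypeptide is released
        peptide.append(CODON_TABLE[codon])
    return peptide


print(translate(["AUG", "CCA", "GGU", "UAA", "UUU"]))  # ['Met', 'Pro', 'Gly']
```

Note that the UUU after the stop codon is never read, just as the ribosome never translates anything past the stop codon.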

Gene regulation

While all this may be fine and dandy, the cell must also regulate gene expression so that it can deal successfully with environmental changes. One form this takes is metabolic control, which occurs on two levels: adjusting the activity of enzymes already in the cell (feedback inhibition), which is the faster response, or regulating the expression of the genes that code for those enzymes (the operon model).
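The two levels of control can be contrasted with a toy model. Everything here is an assumption made for illustration: the formula, the constants and the function names are invented, and the operon is assumed to be a repressible one that is switched off by its own end product, as in the commonly taught tryptophan example.

```python
def enzyme_activity(product_level: float,
                    inhibition_constant: float = 1.0) -> float:
    """Feedback inhibition: activity falls smoothly as end product builds up.

    A toy formula, not a measured rate law.
    """
    return 1.0 / (1.0 + product_level / inhibition_constant)


def operon_is_transcribed(product_level: float,
                          threshold: float = 2.0) -> bool:
    """Operon control: transcription is on only while the product is scarce.

    The threshold is a made-up number for this sketch.
    """
    return product_level < threshold


print(enzyme_activity(0.0))        # 1.0
print(enzyme_activity(1.0))        # 0.5
print(operon_is_transcribed(1.0))  # True
print(operon_is_transcribed(3.0))  # False
```

The first function changes what existing enzymes do straight away, while the second decides whether new enzyme gets made at all, which is why feedback inhibition is the faster of the two responses.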

The operon model is just one of many ways a cell can regulate gene expression. Throughout the various steps of gene expression, gene regulation occurs, in an effort to ensure that the proteins produced in the end will be beneficial to our system.