The Transcriptome

Only a very small percentage (1.2% in humans) of the DNA in vertebrate genomes encodes proteins (collectively, the "proteome") because

  • the exons of most genes are separated by much-longer introns
  • between our genes lie vast amounts of DNA, much of which appears to regulate the expression of our genes but is not transcribed and translated into a protein product.

So even when the complete sequence of a genome is known, it is often difficult to spot particular genes (open reading frames or ORFs).
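
To illustrate why introns make genes hard to spot, here is a minimal sketch of a naive ORF scan. The function name `find_orfs`, the `min_codons` threshold, and the toy exon and intron sequences are all invented for illustration; real gene finders are far more elaborate. The scan looks only at the three forward reading frames of one strand.

```python
# Naive ORF scan: report ATG...stop runs in the three forward reading frames.
# All sequences below are made-up toy data, not real genes.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return sorted (start, end) index pairs of ORFs, stop codon included."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)

# Hypothetical two-exon gene interrupted by one intron.
exon1 = "ATGGCT"
intron = "GTAAGTTGACAG"   # happens to contain an in-frame TGA
exon2 = "GCTGCTGCTTAA"

genomic = exon1 + intron + exon2   # what the genome sequence shows
spliced = exon1 + exon2            # what the mature mRNA encodes

print("genomic ORFs:", find_orfs(genomic))
print("spliced ORFs:", find_orfs(spliced))
```

In this toy case the scan of the genomic sequence stops at a stop codon inside the intron, so the reported ORF is not the one the spliced message actually encodes.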

One approach to solving the problem is to examine a transcriptome of the organism. Most commonly this is defined as all the messenger RNA (mRNA) molecules transcribed from the genome.


(Strictly speaking, one would define the transcriptome as all the RNA molecules transcribed from the DNA of the genome, which includes a wide variety of untranslated, non-protein-encoding RNAs. It is now thought that 76% of our DNA is transcribed into RNA, although only 1.5% of this is messenger RNA for protein synthesis.)

It is "a" transcriptome, not "the" transcriptome, because what genes are transcribed in a cell depends on

  • the kind of cell (e.g., liver cell vs. lymphocyte)
  • what the cell is doing at that time, e.g.,
    • getting ready to divide by mitosis;
    • responding to the arrival of a hormone or cytokine;
    • getting ready to secrete a protein product.

Expressed Sequence Tags (ESTs)

ESTs are short (200–500 nucleotides) DNA sequences that can be used to identify a gene that is being expressed in a cell at a particular time.

The Procedure:

  • Isolate the messenger RNA (mRNA) from a particular tissue (e.g., liver)
  • Treat it with reverse transcriptase. Reverse transcriptase is a DNA polymerase that uses RNA as its template. Thus it is able to make genetic information flow in the reverse (RNA -> DNA) of its normal direction (DNA -> RNA).
  • This produces complementary DNA (cDNA). Note that cDNA differs from the normal gene in lacking the intron sequences.
  • Sequence 200–500 nucleotides at both the 5′ and 3′ ends of each cDNA.
  • Examine the database of the organism's genome to find a matching sequence.
  • That is the gene that was expressed.
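
The procedure above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a laboratory or bioinformatics pipeline: the function names, the invented sequences, and the 8-nucleotide tag length (standing in for the 200–500 nucleotides read in practice) are all hypothetical, and matching is done by exact string search rather than by a real alignment tool.

```python
# Toy EST workflow: mRNA -> cDNA -> end tags -> lookup in the genome.
# Every sequence here is made up for illustration.
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_transcribe(mrna):
    """First-strand cDNA: the DNA complement of the mRNA, written 5' to 3'."""
    return mrna.upper().translate(RNA_TO_DNA_COMPLEMENT)[::-1]

def reverse_complement(dna):
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]

def end_tags(cdna, tag_len):
    """Return the reads from the 5' end and the 3' end of a cDNA."""
    return cdna[:tag_len], cdna[-tag_len:]

def locate(tag, genome):
    """Exact-match search on both strands; returns (strand, position) or None."""
    pos = genome.find(tag)
    if pos != -1:
        return ("+", pos)
    pos = genome.find(reverse_complement(tag))
    if pos != -1:
        return ("-", pos)
    return None

# Hypothetical genome region: flank, exon 1, intron, exon 2, flank.
exon1, intron, exon2 = "ATGGCTGCTAAC", "GTAAGTTGACAG", "GGTCCTGATTAA"
genome = "TTTTCC" + exon1 + intron + exon2 + "CCGGAA"

# The mature mRNA carries the exons only (with U in place of T).
mrna = (exon1 + exon2).replace("T", "U")
cdna = reverse_transcribe(mrna)

tag_5prime, tag_3prime = end_tags(cdna, tag_len=8)
print("5' tag hit:", locate(tag_5prime, genome))
print("3' tag hit:", locate(tag_3prime, genome))
print("full cDNA hit:", locate(cdna, genome))
```

In this sketch each short tag falls within a single exon and so can be placed on the genome, whereas the full-length cDNA has no contiguous match because the genomic copy is interrupted by the intron.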

26 September 2012