Genome Sizes

The genome of an organism is the complete set of genes specifying how its phenotype will develop (under a certain set of environmental conditions). In this sense, then, diploid organisms (like ourselves) contain two genomes, one inherited from our mother, the other from our father.

The table below presents a selection of representative genome sizes from the rapidly growing list of organisms whose genomes have been sequenced.

Table of Genome Sizes (haploid)
Organism  Base pairs  Genes  Notes
φX174 5,386 11 virus of E. coli
Human mitochondrion 16,569 37  
Nasuia deltocephalinicola 112,091 137 smallest genome yet found in a bacterium. This β-proteobacterium lives in a mutualistic relationship within a special organ of an insect (a leafhopper), which it supplies with essential amino acids.
Epstein-Barr virus (EBV) 172,282 80 causes mononucleosis
nucleomorph of Guillardia theta 551,264 511 all that remains of the nuclear genome of a red alga (a eukaryote) engulfed long ago by another eukaryote
Mycoplasma genitalium 580,073 525 this species and M. pneumoniae (next row) are two of the smallest true organisms
Mycoplasma pneumoniae 816,394 679
Rickettsia prowazekii 1,111,523 834 bacterium that causes epidemic typhus
Treponema pallidum 1,138,011 1,039 bacterium that causes syphilis
Pelagibacter ubique 1,308,759 1,354 smallest genome yet found in a free-living organism (marine α-proteobacterium)
Helicobacter pylori 1,667,867 1,589 chief cause of stomach ulcers (not stress and diet)
Methanocaldococcus jannaschii 1,664,970 1,783 This organism and the two listed next are unicellular microbes that look like typical bacteria, but their genes are so different from those of either bacteria or eukaryotes that they are classified in a third kingdom: Archaea.
Aeropyrum pernix 1,669,695 1,885
Methanothermobacter thermoautotrophicus 1,751,377 2,008
Streptococcus pneumoniae 2,160,837 2,236 the pneumococcus
Pandoravirus 2,473,870 2,556 A virus (of an amoeba) with a genome larger than those of the bacteria and archaea above and about the same size as those of some parasitic eukaryotes.
Listeria monocytogenes 2,944,528 2,926 2,853 of these encode proteins; the rest RNAs
Synechocystis 3,573,470 4,003 a marine cyanobacterium ("blue-green alga")
E. coli K-12 4,639,221 4,377 4,290 of these genes encode proteins; the rest RNAs
E. coli O157:H7 5.44 x 10^6 5,416 strain that is pathogenic for humans; has 1,346 genes not found in E. coli K-12
Schizosaccharomyces pombe 12,462,637 4,929 Fission yeast. A eukaryote with fewer genes than the three bacteria below.
Agrobacterium tumefaciens 4,674,062 5,419 Useful vector for making transgenic plants; shares many genes with Sinorhizobium meliloti
Pseudomonas aeruginosa 6.3 x 10^6 5,570 Increasingly common cause of opportunistic infections in humans.
Sinorhizobium meliloti 6,691,694 6,204 The rhizobial symbiont of alfalfa. Genome consists of one chromosome and 2 large plasmids.
Saccharomyces cerevisiae 12,495,682 5,770 Budding yeast. A eukaryote.
Neurospora crassa 38,639,769 10,082 Plus 498 RNA genes.
Thalassiosira pseudonana 34.5 x 10^6 11,242 A diatom. Plus 144 chloroplast and 40 mitochondrial genes encoding proteins
Naegleria gruberi 41 x 10^6 15,727 This free-living unicellular organism lives as both an amoeboid and a flagellated form. 4,133 of its genes are also found in other eukaryotes, suggesting that they were present in the common ancestor of all eukaryotes. The great variety of functions encoded by these genes also suggests that the common ancestor of all eukaryotes was itself as complex as many of the present-day unicellular members.
Drosophila melanogaster 122,653,977 ~17,000 the "fruit fly"
Caenorhabditis elegans 100,258,171 21,733  
Humans 3.3 x 10^9 ~21,000
Tetraodon nigroviridis (a pufferfish) 3.42 x 10^8 27,918 Although Tetraodon seems to have more protein-encoding genes than we do, it has much less non-coding DNA so its total genome is about a tenth the size of ours.
Mouse 2.8 x 10^9 ~23,000
Amphibians 10^9–10^11 ?
Arabidopsis thaliana 0.135 x 10^9 27,416 a flowering plant (angiosperm) with one of the smallest genomes known in the plant kingdom.
Picea abies 19.6 x 10^9 28,354 the Norway spruce, a conifer (gymnosperm). Even though it has only ~900 more genes than Arabidopsis, it has 145 times as much DNA. Most of this appears to be derived from transposons.
Psilotum nudum 2.5 x 10^11 ? see the note below

Even though Psilotum nudum (sometimes called the "whisk fern") is a far simpler plant than Arabidopsis (it has no true leaves, flowers, or fruit), it has nearly 2,000 times as much DNA by the figures in the table (2.5 x 10^11 vs. 0.135 x 10^9 base pairs). No one knows why, but 80% or more of it is repetitive DNA containing no genetic information. This is also the case for some amphibians, which contain 30 times as much DNA as we do but certainly are not 30 times as complex.

The total amount of DNA in the haploid genome is called its C value. The lack of a consistent relationship between the C value and the complexity of an organism (e.g., amphibians vs. mammals) is called the C value paradox.
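The paradox can be made concrete with a little arithmetic on the table. The Python sketch below is only an illustration: it copies a handful of rows from the table (reading entries such as 3.3 x 10^9 as scientific notation and taking approximate gene counts like ~21,000 at face value) and computes the average number of base pairs per gene along with a couple of size ratios. The names GENOMES, base_pairs_per_gene, and size_ratio are made up for this example.

```python
# Haploid genome size (base pairs) and gene count, copied from the table.
# Approximate entries (e.g. ~21,000 genes for humans) are used as given.
GENOMES = {
    "Mycoplasma genitalium": (580_073, 525),
    "E. coli K-12": (4_639_221, 4_377),
    "Saccharomyces cerevisiae": (12_495_682, 5_770),
    "Tetraodon nigroviridis": (3.42e8, 27_918),
    "Arabidopsis thaliana": (0.135e9, 27_416),
    "Humans": (3.3e9, 21_000),
    "Picea abies": (19.6e9, 28_354),
}


def base_pairs_per_gene(size_bp, genes):
    """Average stretch of DNA per gene: a crude measure of gene density."""
    return size_bp / genes


def size_ratio(a, b):
    """How many times larger the genome of organism a is than that of b."""
    return GENOMES[a][0] / GENOMES[b][0]


if __name__ == "__main__":
    # List organisms from smallest to largest genome.
    by_size = sorted(GENOMES.items(), key=lambda item: item[1][0])
    for name, (size_bp, genes) in by_size:
        density = base_pairs_per_gene(size_bp, genes)
        print(f"{name:26s} {size_bp:>16,.0f} bp {genes:>7,} genes "
              f"{density:>10,.0f} bp/gene")

    ratio = size_ratio("Picea abies", "Arabidopsis thaliana")
    print(f"Picea abies vs. Arabidopsis: {ratio:.0f} times as much DNA")
```

On these figures the bacteria and yeast carry a gene every one to two thousand base pairs, while the human and spruce genomes average well over a hundred thousand base pairs per gene, even though their gene counts differ from Arabidopsis and Tetraodon by far less than their genome sizes do.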

Not all genes are indispensable.

The scientists at The Institute for Genomic Research (now known as the J. Craig Venter Institute) who determined the Mycoplasma genitalium sequence have followed this work by systematically destroying its genes (by mutating them with insertions) to see which ones are essential to life and which are dispensable. Of the 485 protein-encoding genes, they have concluded that only 381 of them are essential to life. In other words, the loss of any one of the 381 is lethal; the loss of any one of the others is not. (This is not to say that all the organism needs are those 381 — see "A Minimal Genome?" below.)

Using similar techniques, three groups have recently found that only about 10% of the genes in the human genome (~2000 of them) must be present for human cells to grow successfully in culture. These genes encode proteins for such essential functions as controlling the cell cycle, DNA replication, DNA transcription, and RNA translation. The cells can tolerate the loss of any one of the other ~18,000 genes. Thus the human genome appears to have redundant pathways that can often compensate for the loss of a single gene, at least for cells growing in culture. Some of these other genes will probably turn out to be essential for the development and functioning of the various types of differentiated cells in the intact body.
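The contrast between the two organisms is easier to see as a fraction. The short Python sketch below simply divides the essential-gene counts quoted above by the corresponding totals; the human total is taken as the sum of the two rounded figures in the text (~2000 + ~18,000), which is an assumption made for illustration, and essential_fraction is a name invented for this example.

```python
def essential_fraction(essential, total):
    """Fraction of genes whose individual loss is lethal."""
    return essential / total


# Mycoplasma genitalium: 381 essential out of 485 protein-encoding genes.
m_genitalium = essential_fraction(381, 485)

# Human cells in culture: ~2000 essential, ~18,000 individually dispensable.
human = essential_fraction(2_000, 2_000 + 18_000)

print(f"M. genitalium: {m_genitalium:.0%} of genes essential")  # 79%
print(f"Human cells:   {human:.0%} of genes essential")         # 10%
```

A genome already pared down to a few hundred genes has little redundancy left, so most of its genes are essential; the much larger human genome can lose most genes one at a time, at least in cultured cells.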

A Minimal Genome?

In March of 2016, workers at the J. Craig Venter Institute reported that they had created a strain of mycoplasma containing only 473 genes. This synthetic organism, which grows vigorously in culture, now holds the record for the smallest genome of a free-living organism.


19 October 2016