2.4 Proteins

Looking for patterns, trends and discrepancies—most but not all organisms assemble proteins from the same amino acids.

Part of the universality of life is the observation that all living organisms construct proteins out of the same pool of 20 amino acids. These 20 were identified in a rapid era of discovery after the development of partition chromatography in 1943. However, the list has now been expanded to include two additional amino acids, selenocysteine and pyrrolysine, giving a total of 22 amino acids.

Selenocysteine, as the name suggests, is similar to the amino acid cysteine, but it has a selenium atom in place of the sulfur atom in its side-chain (R-group). It is a highly reactive and potentially dangerous substance – cells have to use some tricky metabolic pathways in order to prevent it from moving freely and building up in the cytoplasm. It is used in certain redox reactions and has been found in all three domains of life, although it is not universal amongst them.

Pyrrolysine is rarer, having only been found in some species of archaea and bacteria. It is structurally similar to lysine, but with the addition of a pyrroline ring to the side-chain. Its role and possible existence in other organisms are the focus of many ongoing studies.

Neither of these amino acids has a dedicated codon of its own in the DNA. Instead, each is specified by a codon that normally acts as a stop signal, UGA for selenocysteine and UAG for pyrrolysine, which is read by a specific tRNA molecule so that the amino acid is inserted while the protein is being translated (cotranslational insertion). The biochemistry involved is fairly complex and difficult to summarise for IB biology purposes, but if you are interested the references below are a good place to start.
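To make the idea of "re-reading" a stop codon more concrete, here is a minimal Python sketch. It is only an illustration, not a model of the real biochemistry: the function name `translate`, its `recoded` parameter, the tiny codon table and the mRNA sequences are all invented for this example. The one-letter symbols U (selenocysteine) and O (pyrrolysine) are the standard ones. In real cells a stop codon is only recoded when additional signals and dedicated machinery are present, which the sketch ignores.

```python
# Tiny subset of the standard genetic code (mRNA codon -> one-letter amino acid).
# Only the codons used in the examples are listed; "*" marks a stop codon.
CODON_TABLE = {
    "AUG": "M",  # methionine (start)
    "GGU": "G",  # glycine
    "UGU": "C",  # cysteine
    "AAA": "K",  # lysine
    "UGA": "*",  # stop
    "UAG": "*",  # stop
    "UAA": "*",  # stop
}


def translate(mrna, recoded=None):
    """Translate an mRNA string codon by codon, ending at the first stop codon.

    `recoded` is a hypothetical dict that maps a stop codon to the amino acid
    inserted in its place, e.g. {"UGA": "U"} for selenocysteine.
    """
    recoded = recoded or {}
    protein = []
    usable_length = len(mrna) - len(mrna) % 3  # ignore an incomplete final codon
    for i in range(0, usable_length, 3):
        codon = mrna[i:i + 3]
        # A recoded stop codon takes priority over the standard table.
        residue = recoded.get(codon, CODON_TABLE[codon])
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


message = "AUGGGUUGAUGUAAAUAA"  # made-up sequence: AUG GGU UGA UGU AAA UAA

print(translate(message))                # UGA read as stop
print(translate(message, {"UGA": "U"}))  # UGA read as selenocysteine
```

With the standard reading, translation ends at UGA and only a short fragment is produced. When UGA is recoded as selenocysteine, the ribosome in this toy model carries on until the next stop codon (UAA), giving a longer chain that contains U. Passing `{"UAG": "O"}` would do the same for pyrrolysine in a sequence containing UAG.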

In summary, then:

  • all living organisms use the traditional 20 amino acids to construct proteins and code for these amino acids in their DNA
  • all three domains of life (though not every species in them) also use a 21st amino acid, selenocysteine, in some proteins (humans included)
  • some archaea and bacteria have developed a mechanism to use a 22nd amino acid, pyrrolysine
  • neither of these amino acids has a codon of its own; instead each is expressed through the use of a stop codon and a specific tRNA
  • the presence of selenocysteine in all three domains strongly suggests it was present in the last universal common ancestor, and is thus a very ancient biochemical pathway


Selenocysteine and pyrrolysine are powerful examples of the versatility inherent in the genetic code (Rother and Krzycki).

Like so many aspects of biology, once a rule is determined, the incredible variety of life shows us an exception.

For an interesting TOK-linked discussion, consider this quote, also from the Rother/Krzycki article:

They further provide examples of how precedent, though valuable, is not always the best predictor in scientific investigation…

What do the authors mean by this? Does this mean that inductive reasoning is not always a reliable form of reasoning? What other examples from science can you think of to illustrate this quote?


Das, G. & Mandal, S. (2013). Nearest-Neighbor Interactions and Their Influence on the Structural Aspects of Dipeptides. Biochemistry Research International. ResearchGate. Accessed on 2 October, 2018.

Dinman, J. (2012). Control of gene expression by translational recoding. Advances in Protein Chemistry and Structural Biology via ScienceDirect. Accessed on 2 October, 2018. https://www.sciencedirect.com/topics/neuroscience/pyrrolysine

Gutiérrez-Preciado, A., Romero, H. & Peimbert, M. (2010). An Evolutionary Perspective on Amino Acids. Nature Education. Accessed on 2 October, 2018.

Rother, M. & Krzycki, J. A. (2010). Selenocysteine, Pyrrolysine, and the Unique Energy Metabolism of Methanogenic Archaea. Archaea, 2010, 453642. PMC. Accessed on 2 October, 2018.