Base pair

A base pair is a pair of opposite, complementary nucleobases in the two strands of a double-stranded nucleic acid (DNA or RNA, then also referred to as dsDNA or dsRNA) that are held together by hydrogen bonds.

While the length of single-stranded nucleic acids (ssRNA or ssDNA) is given as the number of nucleotides (nt) or bases (b), and for synthetic molecules occasionally with the general suffix -mer for the number of units in a chain molecule (for example, a 20-mer), the size of double-stranded DNA segments is usually given in base pairs, abbreviated bp:

  • 1 nt = one nucleotide
  • 1 bp = one base pair
  • 1 kbp or kb (kilo base pairs) = 1000 (10³) base pairs
  • 1 Mbp or Mb (mega base pairs) = 1,000,000 (10⁶) base pairs
  • 1 Gbp or Gb (giga base pairs) = 1,000,000,000 (10⁹) base pairs

The unreplicated haploid human genome in the nucleus of a germ cell comprises over 3 billion base pairs, about 3.2 Gbp, distributed over 23 chromosomes (1n; 1c). A somatic cell of the human body usually contains a diploid (double) nuclear chromosome set, about 6.4 Gbp, on 46 chromosomes (2n; 2c). This set is replicated before cell division, so that each of the 46 chromosomes consists of two chromatids, identical copies carrying the same genetic information, before nuclear division (mitosis) begins, giving about 13 Gbp (2n; 4c). In addition to this nuclear DNA (nDNA), most human cells, as in all eukaryotes, contain a further genome (the mitogenome) in each mitochondrion, of about 16.6 kbp each (mitochondrial DNA, mtDNA). An exception is mature red blood cells, which, as in all mammals, have neither nucleus nor mitochondria. Plant cells additionally contain the plastid genome (plastome) of their chloroplasts (abbreviated ctDNA or cpDNA).

The number of base pairs is also an important measure of the amount of information stored in a gene. Since each base pair represents a choice among 4 possible forms, 1 bp corresponds to an information content of 2 bits (log2 4 = 2). However, in the human genome only a small proportion of the DNA carries the genetic information for building proteins; over 95% is non-coding and often consists of repetitive elements.
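The 2-bits-per-base-pair estimate can be turned into a rough storage figure. The following is a minimal sketch, not a statement about how cells or sequencing software store data; the function and variable names are illustrative, and the genome size is the approximate 3.2 Gbp value quoted above.

```python
import math

# Each position holds one of 4 possible forms, so it carries log2(4) bits.
BITS_PER_BP = math.log2(4)  # 2.0


def information_bits(base_pairs: float) -> float:
    """Upper bound on the information content of a sequence, in bits."""
    return base_pairs * BITS_PER_BP


# Approximate size of the haploid human nuclear genome (value from the text).
haploid_genome_bp = 3.2e9

genome_bits = information_bits(haploid_genome_bp)
genome_megabytes = genome_bits / 8 / 1e6  # 8 bits per byte, 10^6 bytes per MB

print(f"{genome_bits:.2e} bits = {genome_megabytes:.0f} MB")
# prints "6.40e+09 bits = 800 MB"
```

This is an upper bound: it treats every position as independent and equally likely, which repetitive sequence does not satisfy.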

Structural formula of an AT base pair with two hydrogen bonds shown in dashed blue.

Structural formula of a GC base pair with three hydrogen bonds shown in dashed blue.

Significance

Base pairing plays an essential role in DNA replication, in transcription and translation during protein biosynthesis, and in the many secondary and tertiary structures that nucleic acids adopt.

  • During replication, the DNA double strand is unwound and each of the two complementary single strands is completed by base pairing with deoxyribonucleotides, yielding two DNA double strands.
  • In transcription, a segment of the template (codogenic) DNA strand is used to build a single RNA strand with a complementary base sequence by base pairing with ribonucleotides, where A in the DNA is paired with U. The RNA strands formed serve various functions as mRNA, tRNA or rRNA.
  • During translation, the base sequence of an mRNA segment is read in steps of three by pairing the three bases of a tRNA anticodon with the complementary base triplet (codon) of the mRNA. The base sequence stored in DNA and transcribed into mRNA is thus translated, via the amino acids carried by the tRNAs, into the amino acid sequence that forms the primary structure of a protein. This is also where wobble pairings occur, when the 3rd base of an mRNA codon pairs with the 1st base of the tRNA anticodon.
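The three processes in the list above can be sketched as string operations. This is a simplified illustration only: the function names, the example template and the four-entry codon table are assumptions made for the example, and all sequences are written 5'→3'.

```python
# Pairing partners used when a DNA template strand is copied.
DNA_PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PARTNER = {"A": "U", "T": "A", "G": "C", "C": "G"}  # DNA base -> RNA base

# Deliberately partial codon table, just enough for the example below.
CODON_TABLE = {"AUG": "Met", "UUU": "Phe", "GGC": "Gly", "UAA": "Stop"}


def replicate(template: str) -> str:
    """Build the complementary DNA strand (antiparallel, so reversed)."""
    return "".join(DNA_PARTNER[base] for base in reversed(template))


def transcribe(template: str) -> str:
    """Build the complementary RNA strand; A in the template pairs with U."""
    return "".join(RNA_PARTNER[base] for base in reversed(template))


def translate(mrna: str) -> list[str]:
    """Read the mRNA in steps of three until a stop codon is reached."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        peptide.append(amino_acid)
    return peptide


template = "TTAGCCAAACAT"        # hypothetical template strand
new_strand = replicate(template)  # "ATGTTTGGCTAA"
mrna = transcribe(template)       # "AUGUUUGGCUAA"
peptide = translate(mrna)         # ["Met", "Phe", "Gly"]
```

In the cell, translation pairs each codon with a tRNA anticodon; the lookup table here stands in for that pairing step.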

Pairing rules

A base pair is formed by hydrogen bonding between two nucleobases. In this process, one of the purine bases guanine or adenine is joined to one of the pyrimidine bases cytosine, thymine or uracil to form a pair. In the complementary base pairings between two strand segments of nucleic acids, guanine pairs with cytosine and adenine pairs with thymine or with uracil. This can result in the following pairings:

DNA/DNA

  • Guanine with cytosine: G-C and C-G respectively
  • Adenine with thymine: A-T and T-A respectively

DNA/RNA

  • Guanine with cytosine: G-C and C-G respectively
  • Thymine (DNA) with adenine (RNA): T-A
  • Adenine (DNA) with uracil (RNA): A-U

RNA/RNA

  • Guanine with cytosine: G-C and C-G respectively
  • Adenine with uracil: A-U and U-A respectively
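The three pairing lists above can be written as a lookup table and used to check whether two strands are complementary. This is a minimal sketch; the names `ALLOWED_PAIRS` and `is_complementary` are illustrative, both strands are assumed to be written 5'→3', and only the canonical pairings listed above are accepted.

```python
# Allowed pairings per duplex type, written as (first strand, second strand).
ALLOWED_PAIRS = {
    ("DNA", "DNA"): {("G", "C"), ("C", "G"), ("A", "T"), ("T", "A")},
    ("DNA", "RNA"): {("G", "C"), ("C", "G"), ("T", "A"), ("A", "U")},
    ("RNA", "RNA"): {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")},
}


def is_complementary(strand_a: str, strand_b: str, kinds=("DNA", "DNA")) -> bool:
    """Check that every base of strand_a pairs with the opposite base of strand_b.

    The strands run antiparallel, so strand_b is read in reverse.
    """
    if len(strand_a) != len(strand_b):
        return False
    allowed = ALLOWED_PAIRS[kinds]
    return all((a, b) in allowed for a, b in zip(strand_a, reversed(strand_b)))


dna_dna = is_complementary("GATC", "GATC")                     # True
dna_rna = is_complementary("GATTC", "GAAUC", ("DNA", "RNA"))   # True
rna_rna = is_complementary("GATTC", "GAATC", ("RNA", "RNA"))   # False, T is not an RNA base
```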

Watson-Crick pairings

As early as 1949, the Austrian biochemist Erwin Chargaff established that in DNA the bases adenine (A) and thymine (T) are always present in a 1 : 1 ratio, as are the bases guanine (G) and cytosine (C). In contrast, the ratios A : G and C : T vary greatly (Chargaff's rules).

From this, James D. Watson and Francis Harry Compton Crick concluded that A-T and G-C each form complementary base pairs.
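The link between complementary pairing and Chargaff's 1 : 1 ratios can be checked by counting bases over both strands of a duplex. The sketch below uses an arbitrary made-up sequence, and the helper name is an assumption for illustration.

```python
from collections import Counter

# Translation table mapping each DNA base to its pairing partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def duplex_base_counts(strand: str) -> Counter:
    """Count the bases over both strands of the duplex formed by `strand`."""
    partner = strand.translate(COMPLEMENT)
    return Counter(strand + partner)


# An A-rich single strand: on its own, A and T are far from 1 : 1.
counts = duplex_base_counts("AAAAGC")

print(counts["A"], counts["T"], counts["G"], counts["C"])
# prints "4 4 2 2"
```

Because every A faces a T and every G faces a C, the duplex totals are forced to A = T and G = C, while the A : G ratio (here 2 : 1) depends on the sequence.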

In tRNA and rRNA, base pairing also occurs when the nucleotide strand folds back into loops so that complementary base sequences face each other. Since RNA contains uracil instead of thymine, the pairings are A-U and G-C.

Unusual pairings

Unusual pairings occur mainly in tRNAs and in triple helices. Although they pair the same bases as the Watson-Crick scheme, they form different hydrogen bonds. Examples include reverse Watson-Crick pairings, Hoogsteen pairings (named after Karst Hoogsteen, born 1923), and reverse Hoogsteen pairings.

Non-Watson-Crick base pairs with Watson-Crick-like geometry

As early as the late 20th century, several studies provided evidence for non-Watson-Crick base pairs with Watson-Crick-like geometry in the interaction between tRNA and mRNA when these contain pseudouridine (Ψ) or inosine (I).

In this representation, the tRNA residue is always located at position 34, the mRNA counterpart at position +3. For the A-Ψ binding, these values differ and are marked accordingly.

"■" indicates the use of the Hoogsteen site (cis), "⬤" that of the Watson-Crick site (cis). A mediation of base pairing by water is indicated by a "W" in the pairing. A "~" indicates the need for a tautomeric base. "*" indicates modified bases.

Non-Watson-Crick base pairs with Watson-Crick-like geometry

tRNA residue | mRNA residue | Type of base pairing
Ψsyn         | A            | ■―⬤
Gsyn         | G            | ■W⬤
Gsyn         | A+           | ■―⬤
Gsyn         | A+           | ■―⬤
G            | Gsyn         | ■~⬤
G            | Gsyn         | ⬤W■
G            | Asyn         | ⬤―■
I            | Asyn         | ⬤―■
I            | Gsyn         | ⬤~■
I            | Gsyn         | ⬤W■
Ψ            | A            | Watson-Crick
U*           | G            | Watson-Crick (U~C)
C*           | A            | Watson-Crick (C~A)
A (36)       | Ψ (+1)       | Watson-Crick
A (36)       | Ψsyn (+1)    | ⬤―■

For the U⬤-■A and C⬤-■G bonds to be formed, C must either be in the imino form or protonated.

Wobble pairings

Main article: Wobble hypothesis

The term refers to the wobble hypothesis of Francis Crick (1966). Wobble pairings are the non-Watson-Crick pairings G-U (or G-T) and A-C.
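How wobble at the third codon position lets one anticodon read several codons can be sketched as follows. The wobble table is an assumption taken from the pairings Crick proposed in 1966 for the first anticodon base (it is not given in this article and does not cover the A-C pairing mentioned above); the function name and the example anticodon are illustrative.

```python
# Strict Watson-Crick partners for RNA, used at codon positions 1 and 2.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Assumed wobble table: first anticodon base -> third codon bases it can read.
# "I" is inosine.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}


def anticodon_reads(anticodon: str, codon: str) -> bool:
    """Return True if the anticodon can pair with the codon (both 5'->3').

    The strands are antiparallel: anticodon base 1 faces codon base 3.
    """
    a1, a2, a3 = anticodon
    c1, c2, c3 = codon
    return (
        WATSON_CRICK.get(a3) == c1
        and WATSON_CRICK.get(a2) == c2
        and c3 in WOBBLE.get(a1, "")
    )


reads_uuc = anticodon_reads("GAA", "UUC")  # True, G-C Watson-Crick pair
reads_uuu = anticodon_reads("GAA", "UUU")  # True, G-U wobble pair
reads_uua = anticodon_reads("GAA", "UUA")  # False, G does not pair with A here
```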

Pairings of synthetic bases

In synthetic biology, nucleic acids containing synthetic bases are generated and studied, sometimes with the aim of having these bases pair with one another. One example is Hachimoji DNA.

Reverse A-C wobble pairing

A-C wobble pairing

Reverse G-U wobble pairing

G-U wobble pairing

Reverse A-U Hoogsteen pairing

A-U Hoogsteen pairing

Base pairs in the double strand of a DNA

Reverse A-U pairing

Reverse G-C pairing


AlegsaOnline.com - 2020 / 2023 - License CC3