Overview
The molecular clock is a technique for estimating when two lineages diverged from the differences that have accumulated between their biomolecules. It treats those accumulated differences as a proxy for elapsed evolutionary time and is applied at many taxonomic levels, from populations to higher taxa. For a general introduction see divergence timing resources.
Principles and data
The method rests on the observation that substitutions accrue at an approximately steady pace in some genes or genomic regions. Researchers quantify the differences between lineages by counting base substitutions in aligned nucleotide sequences or amino acid replacements in aligned protein sequences. High-throughput and genome-scale studies use large datasets; examples of such data types are described under high data approaches and in comprehensive genome analysis overviews. More detailed discussions of molecule choice appear in molecular structure guides.
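As an illustration of how sequence differences are quantified, the following minimal sketch computes the proportion of differing sites (often called the p-distance) between two aligned nucleotide sequences. The function name `p_distance` and the rule of skipping any column that contains a gap or ambiguous base are assumptions made for this example, not a prescribed standard.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of compared sites at which two aligned sequences differ.

    Assumes both sequences come from the same alignment (equal length).
    Columns where either sequence has a gap or an ambiguous base are
    skipped; this is one common convention, chosen here for simplicity.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    valid_bases = set("ACGT")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in valid_bases and base_b in valid_bases:
            compared += 1
            if base_a != base_b:
                differences += 1

    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    return differences / compared


if __name__ == "__main__":
    # Toy sequences invented for illustration only.
    print(p_distance("ACGTACGTAC", "ACGTTCGTAA"))
```

The same idea carries over to amino acid sequences by changing the set of valid characters.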
Calibration and analytical methods
Raw genetic distances must be converted into time by calibration. Common calibrations use fossils, dated biogeographic events, or samples of known age. Models range from strict clocks, which assume a single constant rate, to relaxed-clock models that allow the rate to vary among branches. Modern implementations frequently use likelihood-based and Bayesian frameworks to estimate rates and divergence times simultaneously.
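The arithmetic behind a strict clock can be sketched as follows. Under a constant rate r (substitutions per site per unit time), two lineages that split t time units ago are separated by a distance of roughly d = 2rt, because changes accumulate along both branches. A calibration pair with a known split age gives r, which then dates other pairs. The function names and the numbers below are hypothetical and purely illustrative; real analyses estimate rates and times jointly and report uncertainty.

```python
def calibrate_rate(distance: float, calibration_age: float) -> float:
    """Per-lineage rate implied by a pair whose divergence age is known.

    distance: substitutions per site between the two calibration lineages.
    calibration_age: their divergence age (e.g. from a fossil), in any
    time unit; the returned rate is per site per that same unit.
    """
    if calibration_age <= 0:
        raise ValueError("calibration age must be positive")
    # Factor of 2: both lineages have been evolving since the split.
    return distance / (2.0 * calibration_age)


def divergence_time(distance: float, rate: float) -> float:
    """Strict-clock divergence time for a pair separated by `distance`."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate)


if __name__ == "__main__":
    # Hypothetical calibration: distance 0.10 for a split dated to 5 Myr.
    rate = calibrate_rate(0.10, 5.0)
    # Hypothetical query pair with distance 0.04, dated with that rate.
    print(rate, divergence_time(0.04, rate))
```

Relaxed-clock models replace the single `rate` with branch-specific rates drawn from a distribution, which this sketch does not attempt to represent.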
History and development
The molecular clock concept emerged in the early 1960s, notably in work by Emile Zuckerkandl and Linus Pauling, as comparisons of proteins revealed a roughly linear accumulation of differences between some lineages. Early studies compared haemoglobin and other proteins, and the idea has since been extended to DNA sequences and whole genomes. For classic protein examples, see discussions of haemoglobin-based comparisons.
Uses and examples
Molecular clocks are used to date speciation events, study the timing of pathogen outbreaks, and place evolutionary events on an absolute timescale. They can be applied to species-level questions, population splits, or rapidly evolving viruses. Typical molecule choices include mitochondrial DNA, selected nuclear genes, and proteins; practitioners select loci according to the timescale and selective constraints of interest.
Limitations and cautions
Clocks are approximations: rates vary across genes, across lineages and through time. Natural selection, saturation of sites (repeated substitutions at the same position that hide earlier changes), recombination and horizontal gene transfer can all distort estimates. Careful model selection, multiple calibrations and cross-checking against independent evidence are essential for producing robust divergence time estimates.
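To show why saturation matters, the sketch below applies the Jukes-Cantor (JC69) correction, one of the simplest substitution models, to an observed proportion of differing sites p: d = -(3/4) ln(1 - 4p/3). It is offered only as an example of a saturation correction; it assumes equal base frequencies and equal rates among all substitution types, which real data often violate. The function name is an assumption made for this example.

```python
import math


def jukes_cantor_distance(p: float) -> float:
    """JC69-corrected distance (substitutions per site) from an observed
    proportion of differing sites p.

    The correction grows without bound as p approaches 0.75, the expected
    difference between two unrelated random sequences under this model,
    so values of p at or above 0.75 are rejected.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75 under JC69")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


if __name__ == "__main__":
    # The gap between observed and corrected distance widens as p grows,
    # which is how uncorrected distances come to underestimate divergence.
    for observed in (0.05, 0.30, 0.60):
        print(observed, round(jukes_cantor_distance(observed), 4))
```

For small p the corrected and observed values nearly coincide, so the correction matters mainly for deep divergences or fast-evolving loci.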