Most common words in English — frequency, methods, and uses

Overview of how frequency lists are made, what 'most common words' means (lemmas vs forms), historical sources, notable patterns, and practical uses in teaching and technology.

Author: Leandro Alegsa Created: November 13, 2022 Updated: April 18, 2026

The phrase "most common words in English" usually refers to frequency lists compiled from large collections of real language use. Such lists are built by counting tokens (running words) in a corpus, and they are often reported by lemma or headword rather than by individual inflected form. For example, the lemma "be" is counted together with forms like "is", "are", "was" and "were". Major publishers and research groups compile these lists from very large collections of texts; one notable study, by Oxford University Press, analyzed a corpus of over a billion words.
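The difference between counting forms and counting lemmas can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular corpus project: the small LEMMA_OF table and the count_lemmas helper are hypothetical, and a real project would rely on a full lemmatizer.

```python
from collections import Counter

# Hypothetical hand-written lemma table (an assumption for illustration);
# a real project would use a full lemmatizer instead.
LEMMA_OF = {
    "is": "be", "are": "be", "was": "be", "were": "be",
    "am": "be", "been": "be", "being": "be",
    "has": "have", "had": "have",
}

def count_lemmas(tokens):
    """Count tokens after mapping each lowercased form to its lemma."""
    return Counter(LEMMA_OF.get(t.lower(), t.lower()) for t in tokens)

tokens = "It is here and they were there but I was not".split()

form_counts = Counter(t.lower() for t in tokens)   # each form counted separately
lemma_counts = count_lemmas(tokens)                # forms grouped under lemmas

print(form_counts["is"], form_counts["were"], form_counts["was"])
print(lemma_counts["be"])
```

In the form-based count, "is", "were" and "was" each appear once and "be" does not appear at all; in the lemma-based count they are pooled under "be", which is why lemma lists tend to rank such headwords higher.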

How the lists are compiled

Creating a frequency list requires choices that affect the result. Compilers must decide what counts as a word, whether to count every occurrence (tokens) or only distinct words (types), whether to group forms under a single lemma, how to treat contractions and punctuation, and which texts to include. A representative corpus aims to include diverse registers, from formal writing and journalism to informal chat, emails and blogs, so that the counts reflect broad usage rather than a single genre. Dictionaries and lexicographers typically treat headwords, the forms under which dictionary entries are listed, as the unit of analysis.
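How much these choices matter can be seen even on a tiny text. The sketch below is an assumption-laden illustration: the tokenize function, its keep_contractions flag and the regular expressions are hypothetical simplifications that handle only unaccented lowercase letters and apostrophes.

```python
import re
from collections import Counter

def tokenize(text, keep_contractions=True):
    """Lowercase the text and return word tokens, dropping punctuation.

    With keep_contractions=True, "don't" is one token; otherwise the
    apostrophe acts as a separator and it becomes "don" and "t".
    """
    pattern = r"[a-z]+(?:'[a-z]+)*" if keep_contractions else r"[a-z]+"
    return re.findall(pattern, text.lower())

text = "Don't stop. The cat saw the dog, and the dog didn't move."

whole = tokenize(text)
split = tokenize(text, keep_contractions=False)

# Tokens are running words; types are distinct words.
print(len(whole), len(set(whole)))
print(len(split), len(set(split)))
print(Counter(whole).most_common(2))
```

The same sentence yields different token and type totals depending on one tokenization decision, which is one reason published lists from different corpora do not agree exactly.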

Historical background and sources

Frequency studies are not new. Earlier influential corpora include the Brown Corpus and the British National Corpus; more recent efforts build much larger electronic collections and use automated processing. Oxford University Press and associated projects have produced widely cited lists derived from large online corpora and the Oxford English Corpus (OEC). These modern corpora benefit from automated tagging and lemmatization but remain subject to editorial choices about which texts to include.

Patterns and notable facts

Several robust patterns appear across corpora. Function words (articles, prepositions, pronouns, auxiliary verbs) dominate the top ranks, and a small number of words account for a large share of tokens. For example, pedagogical sources have long noted that the first 25 words make up roughly one-third of printed English and that the first 100 approach one-half. Frequency distributions also follow predictable mathematical shapes such as Zipf's law, under which a word's frequency is roughly inversely proportional to its rank: the most frequent item is about twice as common as the second and about three times as common as the third.
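Both ideas, coverage by the top-ranked words and a Zipf-like fall-off, are easy to compute from a table of counts. The sketch below uses invented toy counts rather than real corpus data, and the helper names coverage and zipf_expected are hypothetical. The exponent of exactly 1 in zipf_expected is an idealization; real corpora only approximate it.

```python
from collections import Counter

def coverage(counts, k):
    """Share of all tokens accounted for by the k most frequent items."""
    total = sum(counts.values())
    top = sum(n for _, n in counts.most_common(k))
    return top / total

def zipf_expected(top_count, rank):
    """Idealized Zipf count at a given rank (exponent 1 is an assumption)."""
    return top_count / rank

# Toy counts chosen to follow the idealized curve exactly; not real data.
toy = Counter({"the": 600, "be": 300, "and": 200, "of": 150, "a": 120})

for k in (1, 2, 5):
    print(k, round(coverage(toy, k), 3))

# Compare observed counts with the idealized prediction by rank.
for rank, (word, n) in enumerate(toy.most_common(), start=1):
    print(rank, word, n, zipf_expected(600, rank))
```

Applied to a real frequency list, the same coverage function could be used to check claims such as the share of text covered by the first 25 or 100 words.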

Practical uses and examples

Language teaching: prioritizing high-frequency vocabulary gives learners early communicative payoff.
NLP and search: frequency informs language models, stopword lists, and text-compression schemes.
Lexicography and reading research: lists guide basic dictionaries and graded readers.
Stylistics and corpus linguistics: comparing frequencies highlights register and genre differences.
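As one illustration of the NLP use listed above, a rough stopword list can be derived directly from frequency counts. This is a sketch under simple assumptions: the helper names stopword_candidates and drop_stopwords are hypothetical, and production stopword lists are usually curated by hand rather than taken straight from the top of a frequency ranking.

```python
from collections import Counter

def stopword_candidates(tokens, n):
    """Return the n most frequent tokens as candidate stopwords."""
    return {w for w, _ in Counter(tokens).most_common(n)}

def drop_stopwords(tokens, stopwords):
    """Keep only tokens that are not in the stopword set."""
    return [t for t in tokens if t not in stopwords]

tokens = "the cat and the dog and the bird saw the fish".split()

stop = stopword_candidates(tokens, 2)
content = drop_stopwords(tokens, stop)

print(sorted(stop))
print(content)
```

On this toy sentence the two most frequent tokens are function words, so removing them leaves the content words that a search index would typically care about.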

Typical high-ranking lemmas are short function words and common verbs and nouns — for example, words such as "the", "be", "and", "of", "a", "in", "to", "have", "it" and pronouns like "I" and "you" frequently occur near the top of many lists. Exact order and percentages vary by corpus composition and whether counts collapse inflected forms into lemmas or treat each form separately.

Distinctions and caveats

When using frequency lists remember: (1) Type vs. token — a list of types does not show how often each type appears; (2) Lemma vs. wordform — combining forms can inflate the apparent importance of a headword; (3) Corpus composition — spoken language and social media change rankings compared with formal writing. For reliable interpretation, consult the documentation for the corpus used and, where possible, examine frequency by register or medium rather than relying on a single overall ranking.
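Comparing registers requires normalizing counts, because sub-corpora differ in size. The sketch below converts raw counts to occurrences per million tokens, a common convention in corpus work. The two miniature "registers" are invented examples and far too small to be meaningful, and the per_million helper is hypothetical.

```python
from collections import Counter

def per_million(tokens):
    """Convert raw counts to occurrences per million tokens."""
    total = len(tokens)
    return {w: n * 1_000_000 / total for w, n in Counter(tokens).items()}

# Invented miniature samples standing in for two registers.
registers = {
    "formal": "the committee has approved the report".split(),
    "chat": "i think you saw it and i liked it".split(),
}

rates = {name: per_million(toks) for name, toks in registers.items()}

# A word can rank highly in one register and be absent from another.
for word in ("the", "i", "it"):
    print(word, {name: round(r.get(word, 0.0)) for name, r in rates.items()})
```

Reporting rates side by side in this way makes it visible when an overall ranking hides large differences between media.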

Further reading and detailed lists are available from corpus projects and lexicographic resources, which also provide introductory definitions of terms such as lemma and corpus and descriptions of individual corpora such as the OEC.

Questions and answers

Q: Who produced the list of the most common words in English?

A: Oxford University Press produced the list of the most common words in English.

Q: What is meant by "words" in the list?

A: By "words" in the list, dictionary head words or lemmas are meant.

Q: How many words were analyzed to come up with the list?

A: The list was based on an analysis of a body of over a billion words.

Q: Who conducted the study that led to the creation of the list?

A: Oxford Online, which is associated with the Oxford English Dictionary, conducted the study that led to the creation of the list.

Q: What sources were used to create the list?

A: The sources used to create the list included writings of all sorts from "literary novels and specialist journals to everyday newspapers and magazines and from Hansard to the language of chatrooms, emails, and weblogs".

Q: How much of all printed material in English do the first 25 words represent?

A: The first 25 words make up about one-third of all printed material in English.

Q: What percentage of all the words in the Oxford English Corpus do the top 100 lemmas account for?

A: The top 100 lemmas account for about 50% of all the words in the Oxford English Corpus.

Author

AlegsaOnline.com Most common words in English — frequency, methods, and uses Leandro Alegsa

URL: https://en.alegsaonline.com/art/66864


Sources

askoxford.com : AskOxford.com: Language Facts
bckelk.ukfsn.org : Top 1000 words in UK English
duboislc.org : The First 100 Most Commonly Used English Words
itre.cis.upenn.edu : Time after time after time...