A diphthong (pronounced DIF-thong or DIP-thong) is a vowel sound within a single syllable during which the speaker's articulatory posture moves from one vowel quality to another. In phonetic terms a diphthong is not a sequence of two vowels belonging to separate syllables but a glide: the tongue, lips, or jaw shift continuously, so that listeners perceive a change of vowel quality. For discussion of basic terms, see vowel.
Characteristics and classification
Phonetically, diphthongs are characterized by a dynamic formant transition: the first formant (F1) and second formant (F2), the resonance frequencies that largely determine perceived vowel quality, change as the articulation moves. Linguists commonly classify diphthongs by direction and prominence. A falling diphthong has a more prominent initial element that glides toward a less prominent off-glide (as in English eye or go), while a rising diphthong begins on a less prominent glide and moves toward a more prominent nucleus (found in some Romance and Slavic languages). A centering diphthong moves toward a central vowel such as schwa (for example, some pronunciations of near and sure in certain accents).
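The idea of a dynamic formant transition can be sketched in a few lines of Python. This is a minimal illustration, not a standard analysis procedure: the function names, the 300 Hz threshold, and the formant values are all hypothetical, chosen only to show the contrast between a moving and a steady vowel. Real studies would take measurements from recordings and often use a perceptual frequency scale rather than raw hertz.

```python
import math

def formant_movement(onset, offset):
    """Return the straight-line distance in Hz between two (F1, F2) points.

    `onset` and `offset` are (F1, F2) tuples sampled near the start and
    end of the vowel. Raw Hz distance is a simplifying assumption.
    """
    return math.hypot(offset[0] - onset[0], offset[1] - onset[1])

def is_dynamic(onset, offset, threshold_hz=300.0):
    """Flag a vowel as diphthong-like when its formants move far enough.

    The threshold is a hypothetical cut-off chosen for illustration.
    """
    return formant_movement(onset, offset) >= threshold_hz

# Illustrative values only, not measurements:
# a vowel gliding from an open quality toward a close front quality ...
gliding_vowel = ((750.0, 1300.0), (400.0, 2000.0))
# ... and a vowel whose formants barely change.
steady_vowel = ((300.0, 2300.0), (310.0, 2280.0))

for label, (onset, offset) in [("gliding", gliding_vowel), ("steady", steady_vowel)]:
    print(label, round(formant_movement(onset, offset), 1), is_dynamic(onset, offset))
```

With these made-up numbers the gliding vowel is flagged as dynamic and the steady one is not; a different threshold or frequency scale could move the boundary.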
Common examples in English include the vowels of time (/aɪ/), face (/eɪ/), goat (/oʊ/), mouth (/aʊ/) and choice (/ɔɪ/). Realizations vary widely by region and sociolect: the same spelling can represent a monophthong, a diphthong, or a triphthong depending on accent. See also accent-related variation.
History of the term
The English word diphthong derives from the Greek elements di- meaning "two" and phthongos meaning "sound" or "tone." Historically, grammarians used the term to describe vowel pairs that behaved as a single phonological unit. Over time phonetics and phonology refined the concept: modern descriptions emphasize continuous spectral movement rather than strictly two discrete segments.
Role in phonology and orthography
In a given language a diphthong may function as a single phoneme (one contrastive sound) or as a sequence of two phonemes, depending on the phonological analysis. Some orthographies mark diphthongs with digraphs (two letters representing one sound), others with diacritics. Linguists often treat a vowel glide as a phonemic diphthong when it contrasts meaningfully with monophthongs; in other cases they analyze it as a combination of a vowel and a glide (e.g., /j/ or /w/). The notion of the lexeme arises in related discussions when a sound sequence contributes to the identity of a word; compare lexeme for how a diphthong may behave within a word in a particular language.
Comparison with monophthongs and triphthongs
A syllable can contain a monophthong (a steady-state vowel), a diphthong (a glide between two vowel qualities), or a triphthong (a glide passing through three vowel qualities). Monophthongs require relatively little movement of the articulators during their duration, while diphthongs and triphthongs involve rapid, coordinated motion. In many dialects, sounds that were historically diphthongs have become monophthongs (monophthongization), and historical monophthongs have become diphthongs (diphthongization); these shifts play a central role in sound-change studies.
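The three-way distinction above amounts to counting the vowel-quality targets in a syllable nucleus, which the following Python sketch makes explicit. The function names and the representation of a nucleus as a list of IPA symbols are assumptions made for illustration; deciding how many targets a real vowel has is itself an analytical judgment that the code does not attempt.

```python
def classify_nucleus(targets):
    """Name a syllable nucleus by its number of vowel-quality targets.

    `targets` is a list of IPA vowel symbols, one per target
    (a hypothetical representation chosen for this sketch).
    """
    names = {1: "monophthong", 2: "diphthong", 3: "triphthong"}
    if len(targets) not in names:
        raise ValueError(f"unsupported number of targets: {len(targets)}")
    return names[len(targets)]

def is_centering(targets, central_vowels=("ə",)):
    """Return True when a multi-target nucleus ends on a central vowel."""
    return len(targets) >= 2 and targets[-1] in central_vowels

print(classify_nucleus(["i"]))            # monophthong
print(classify_nucleus(["a", "ɪ"]))       # diphthong, as in time
print(classify_nucleus(["a", "ɪ", "ə"]))  # triphthong
print(is_centering(["ɪ", "ə"]))           # True, as in some pronunciations of near
```

Because the count depends on the transcription supplied, the same word may be classified differently for different accents.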
Practical importance and teaching
Understanding diphthongs is useful for language teaching, speech therapy, and singing. Teachers of pronunciation often focus on the onset and offset qualities of a diphthong and on the duration of the glide. Narrow phonetic transcription uses detailed symbols and diacritics to capture subtle differences, while broad phonemic transcription may represent several phonetic variants with a single symbol. Because the realization of a diphthong is sensitive to surrounding consonants and prosody, practice in varied contexts helps learners achieve intelligible, natural-sounding speech.
- Key points: dynamic vowel quality, single syllabic nucleus, variable across accents.
- Related topics: monophthong, triphthong, vowel harmony, phonemic analysis.