Overview
ISO 639-1 is the part of the ISO 639 family that defines two-letter (alpha-2) codes for many of the world's principal languages. The purpose of these short, stable identifiers is to provide a compact, human- and machine-readable way to label the language of a resource, user preference, or user interface. Because of their brevity and recognizability, ISO 639-1 codes are commonly embedded in software locales, HTTP headers, metadata fields and content-management systems.
Code format and examples
Codes are written in lowercase and consist of two ASCII letters, for example en (English), fr (French), es (Spanish), zh (Chinese), de (German), ru (Russian), ja (Japanese) and pt (Portuguese). In practice these codes are often combined with region or script subtags to indicate variants, for example en-US, en-GB, zh-Hant, zh-Hans, pt-BR or sr-Cyrl. Most systems compose such tags according to the IETF language tag specification, BCP 47 (currently RFC 5646).
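The format rule above can be checked mechanically. The following is a minimal sketch in Python; the function names looks_like_alpha2 and primary_language are hypothetical and chosen only for illustration. It tests the shape of a code, not whether the code is actually assigned in the standard.

```python
import re

# Two lowercase ASCII letters: the surface form of an ISO 639-1 code.
_ALPHA2 = re.compile(r"[a-z]{2}")


def looks_like_alpha2(code: str) -> bool:
    """Return True if `code` has the shape of an ISO 639-1 code.

    This is a syntax check only; it does not confirm that the code
    is actually assigned to a language.
    """
    return _ALPHA2.fullmatch(code) is not None


def primary_language(tag: str) -> str:
    """Return the lowercased first subtag of a hyphen-separated tag."""
    return tag.split("-", 1)[0].lower()
```

For example, looks_like_alpha2("en") is true while looks_like_alpha2("eng") is false, and primary_language("pt-BR") yields "pt". Checking membership against the published code list would require loading that list separately.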
Scope and limitations
ISO 639-1 intentionally covers a subset of languages, generally those most frequently used in international communication, so not every language has a two-letter code. For additional and less widely used languages, the ISO 639 family provides three-letter codes (ISO 639-2 and ISO 639-3) and a separate part, ISO 639-5, that covers language families and groups. Implementations that need comprehensive coverage or fine-grained distinctions should therefore use the three-letter parts of the standard or an established language-tagging scheme that can combine codes and subtags.
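One common way to cope with the partial coverage is to use the two-letter code when one exists and to keep the three-letter code otherwise. The sketch below assumes a deliberately tiny, hand-written mapping table (ALPHA3_TO_ALPHA2) and a hypothetical helper named shortest_code; a real application would load the complete published code tables instead.

```python
# Illustrative, deliberately tiny table: ISO 639-3 code -> ISO 639-1 code.
# This is NOT the full standard; load the published tables in real use.
ALPHA3_TO_ALPHA2 = {
    "eng": "en",
    "fra": "fr",
    "deu": "de",
    "spa": "es",
    "jpn": "ja",
}


def shortest_code(alpha3: str) -> str:
    """Return the two-letter code if the table has one, else the input.

    The input is lowercased first, so "ENG" and "eng" behave the same.
    """
    key = alpha3.lower()
    return ALPHA3_TO_ALPHA2.get(key, key)
```

With this table, shortest_code("eng") gives "en", while shortest_code("haw") stays "haw" because Hawaiian has no two-letter code.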
Uses in computing and the web
ISO 639-1 codes appear throughout web and software standards, including the HTML lang attribute, HTTP Accept-Language negotiation, content metadata, translation files, operating-system locales and internationalized user interfaces. In these contexts the language subtag is usually normalized to lowercase and paired with region or script subtags whose casing conventions (uppercase region codes such as US, title-case script codes such as Hant) are recommended by BCP 47. Using a two-letter code as part of a fuller language tag expresses regional variants while keeping the base language compact.
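The casing conventions described above can be sketched as a small normalizer. This is a simplified illustration, not a full BCP 47 implementation: the function name canonical_case is hypothetical, and the code classifies subtags purely by length, ignoring the special handling that the specification gives to subtags following a single-letter extension or private-use marker.

```python
def canonical_case(tag: str) -> str:
    """Apply conventional casing to a hyphen-separated language tag.

    Simplified rules (an assumption of this sketch):
      - the first subtag (the language) is lowercased;
      - a later four-letter alphabetic subtag is treated as a script
        and written in Title case;
      - a later two-letter alphabetic subtag is treated as a region
        and written in UPPERCASE;
      - anything else is lowercased.
    """
    parts = tag.split("-")
    out = []
    for i, part in enumerate(parts):
        if i == 0:
            out.append(part.lower())
        elif len(part) == 4 and part.isalpha():
            out.append(part.title())
        elif len(part) == 2 and part.isalpha():
            out.append(part.upper())
        else:
            out.append(part.lower())
    return "-".join(out)
```

Under these rules, "EN-us" becomes "en-US" and "zh-hant" becomes "zh-Hant". Language tags are case-insensitive for matching purposes, so this normalization affects presentation and storage consistency rather than meaning.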
Maintenance and practical guidance
The ISO 639 family is maintained through ISO processes; entries are reviewed and occasionally added, revised, or deprecated to reflect linguistic and practical needs. Implementers should take care when converting legacy data and should use mapping tables between the different parts of ISO 639 where necessary. For example, older data may still contain the withdrawn codes iw, in and ji, which were replaced by he (Hebrew), id (Indonesian) and yi (Yiddish). Good practice includes providing a fallback when no two-letter code is available, storing the language code together with any region or script subtags, and following the canonicalization rules of the prevailing language-tagging recommendations so that tags remain interoperable across systems.
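The fallback advice above can be illustrated with a truncation-based lookup, in the spirit of the lookup scheme used for language tags: subtags are dropped from the right until a supported tag is found. The names fallback_chain and pick_supported, and the default of "en", are assumptions made for this sketch; it also omits refinements such as special treatment of single-letter subtags.

```python
from typing import Iterable


def fallback_chain(tag: str) -> list[str]:
    """Return the tag followed by progressively shorter prefixes.

    Example: "zh-Hant-TW" -> ["zh-Hant-TW", "zh-Hant", "zh"]
    """
    parts = tag.split("-")
    return ["-".join(parts[:n]) for n in range(len(parts), 0, -1)]


def pick_supported(requested: str, supported: Iterable[str],
                   default: str = "en") -> str:
    """Pick the most specific supported tag for `requested`.

    Matching is case-insensitive; the supported tag is returned in the
    spelling given by the caller. `default` is returned if nothing in
    the fallback chain is supported.
    """
    available = {s.lower(): s for s in supported}
    for candidate in fallback_chain(requested):
        match = available.get(candidate.lower())
        if match is not None:
            return match
    return default
```

For instance, a request for "pt-BR" against the supported set ["en", "pt"] resolves to "pt", and a request for an unsupported language falls through to the default.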
Related standards
- ISO 639-2 and ISO 639-3 supply three-letter identifiers covering many more individual languages.
- Language tags used on the web and in many protocols (IETF BCP 47) combine ISO 639 codes with other subtags to express script and regional forms.
Practical notes
When designing software or metadata schemes, prefer ISO 639-1 codes for common languages if compactness is important, but plan for three-letter codes or extended language tags for comprehensive language support. Keep codes in lowercase and document any local mappings or fallbacks to avoid ambiguity in multilingual applications.