Overview
Text is a sequence of written or printed symbols used to record and communicate language. It is organised according to linguistic and cultural conventions so readers can derive meaning. Texts range from single words and labels to extended works such as novels, legal codes and technical manuals. Both humans and automated systems read and interpret text, and its persistence distinguishes it from ephemeral spoken communication.
Characteristics and structure
Text combines characters, words, sentences and larger discourse units. Key aspects include orthography, grammar, punctuation and layout. Typography and formatting influence readability and interpretation, while structural elements such as headings, paragraphs and lists guide navigation. Text may be linear (read sequentially) or hypertextual (with links and structure that permit non‑linear reading).
Forms and media
Historically, texts were produced on stone, clay, parchment and paper; today they also appear in digital form. Handwritten and printed formats include manuscripts, newspapers and books. Digital forms include plain text files, formatted documents, web pages and messages. The distinction between plain text and rich text reflects whether presentation data (fonts, styles, layout) accompanies the character data.
Digital representation and encoding
In computing, character encodings map symbols to numeric values so text can be stored and processed. Most modern systems use Unicode, a universal character set that covers many writing systems and is usually serialised as UTF-8. Beyond raw characters, digital text can carry metadata about structure and semantics, which search engines, accessibility tools and language processing software rely on.
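The mapping from symbols to numeric values can be illustrated with a short Python sketch. It assumes Unicode code points and the UTF-8 encoding as the example scheme; the helper name `describe` and the sample string are hypothetical and chosen only for illustration.

```python
def describe(text: str, encoding: str = "utf-8") -> list[tuple[str, int, bytes]]:
    """Return (character, code point, encoded bytes) for each character.

    `describe` is an illustrative helper, not part of any library.
    """
    return [(ch, ord(ch), ch.encode(encoding)) for ch in text]


# Sample string: Latin capital A, e with acute accent, euro sign.
sample = "A\u00e9\u20ac"

for ch, code_point, raw in describe(sample):
    # Show the character, its Unicode code point, and its stored bytes in hex.
    print(f"{ch!r}  U+{code_point:04X}  {raw.hex(' ')}")

# Decoding the stored bytes with the same encoding recovers the original text.
assert sample.encode("utf-8").decode("utf-8") == sample
```

The three characters occupy one, two and three bytes respectively in UTF-8, which shows that the number a character maps to and the bytes used to store it are separate layers.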
Uses, analysis and preservation
- Communication and record keeping: letters, contracts, reports and manuals.
- Creative and scholarly work: literature, criticism and research publications.
- Computation and indexing: source code, data interchange formats and searchable corpora.
Fields such as linguistics, literary studies and digital humanities examine style, context and meaning, while librarianship and archives focus on long‑term preservation, provenance and access. Accessibility features like clear structure and semantic markup help make text usable by diverse readers, including those relying on assistive technology.
Distinctions and practical notes
- Text vs speech: text is persistent and revisable; speech conveys tone and prosody.
- Plain vs rich text: plain text contains only characters; rich text includes presentation and layout.
- Text vs data: text encodes human language directly; structured data such as tables or records may need interpretation before it can be read as prose.
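The plain versus rich distinction above can be sketched in Python. The example treats a small HTML fragment as a stand-in for rich text (an assumption, since rich text takes many forms) and uses the standard library's `html.parser` to keep only the character data. The names `TextExtractor` and `to_plain_text` are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect character data and ignore tags (illustrative helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        # Called for the text found between tags.
        self.parts.append(data)


def to_plain_text(rich: str) -> str:
    """Return only the characters of an HTML fragment, dropping presentation."""
    parser = TextExtractor()
    parser.feed(rich)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts)


rich = "<p>Text is <b>persistent</b> and <i>revisable</i>.</p>"
print(to_plain_text(rich))  # prints: Text is persistent and revisable.
```

The bold and italic markers are presentation data; removing them leaves the same sentence as plain text, at the cost of the emphasis they conveyed.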