Comma-separated values (CSV) is a plain-text format used to represent tabular data such as rows and columns from a spreadsheet or database. Each record appears on its own line, and fields within a record are typically separated by commas. Because CSV files are human-readable and software-friendly, they are widely used for data interchange, quick exports, and small datasets.

Format and conventions

A basic CSV record is a sequence of fields separated by commas. A simple file might begin with the header line Name,Occupation,Age followed by the record "Alice","Engineer",30. When a field contains a comma, newline, or double quote, it is enclosed in double quotes, and each double quote inside the field is written as two double quotes. There is no single universally enforced formal standard; however, RFC 4180 is a commonly referenced specification, many programs follow its recommendations, and it is the usual starting point for more detailed guidance.
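The quoting rules above can be illustrated with a short sketch using Python's standard csv module. The sample rows (including the "Bob" record) are made up for illustration; the writer's default settings are assumed, which quote only the fields that need it.

```python
import csv
import io

# Illustrative rows; the third one contains both a comma and double quotes.
rows = [
    ["Name", "Occupation", "Age"],
    ["Alice", "Engineer", 30],
    ['Bob "Bobby" Ray', "Writer, editor", 41],
]

# Write to an in-memory buffer with the default dialect
# (comma delimiter, minimal quoting, CRLF line endings).
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(rows)
encoded = buffer.getvalue()

# Read the text back; quoted fields are unwrapped and doubled quotes collapsed.
decoded = list(csv.reader(io.StringIO(encoded, newline="")))

print(encoded)
print(decoded)
```

The third record is written as "Bob ""Bobby"" Ray","Writer, editor",41 and reads back with its original text. Note that the number 30 comes back as the string "30", since CSV itself carries no type information.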

Common variants and dialects

CSV implementations vary by locale and application. Some use a different delimiter (semicolon, tab, or pipe) to avoid conflicts with commas in the data. Line endings, character encodings (UTF-8 versus legacy encodings), and the handling of headers also differ. These variations are often called "dialects", and the parties exchanging files between systems must agree on them.

Typical features and escape rules include:

  • Optional header row naming columns.
  • Quoted fields to contain delimiters or newlines.
  • Doubling quotes to escape a quotation mark inside a quoted field.
  • Whitespace around unquoted fields, which parsers handle inconsistently: some trim it, while others keep it as part of the value.
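As a rough illustration of why the dialect matters, the sketch below parses the same semicolon-delimited text twice with Python's csv module, once with the default comma delimiter and once with the correct one, and then shows one way whitespace after a delimiter can be handled. The sample data is invented for the example.

```python
import csv
import io

# Invented sample: semicolon-delimited, with decimal commas in the last column.
semicolon_data = "name;city;amount\nAlice;Paris;1,5\nBob;Berlin;2,25\n"

# Default dialect (comma delimiter): the structure is misread.
wrong = list(csv.reader(io.StringIO(semicolon_data)))

# Explicit semicolon delimiter: three fields per record, as intended.
right = list(csv.reader(io.StringIO(semicolon_data), delimiter=";"))

# Whitespace after a delimiter is kept by default, or dropped on request.
spaced = "a, b, c\n"
kept = next(csv.reader(io.StringIO(spaced)))
trimmed = next(csv.reader(io.StringIO(spaced), skipinitialspace=True))

print(wrong[1], right[1])
print(kept, trimmed)
```

With the wrong delimiter the parser raises no error; it silently splits "1,5" into two fields, which is why dialect mismatches can go unnoticed.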

Historically, CSV-like formats predate modern spreadsheets and emerged as an easy way to serialize tables in plain text. Over time, support was added to spreadsheet programs, databases, and scripting languages, making CSV a de facto lingua franca for simple data exchange.

Uses, advantages and limitations

CSV is convenient for quick import/export, human inspection, and interoperability: nearly every spreadsheet, statistical package, and programming language can read or write CSV. Its advantages are simplicity, compactness, and broad tool support. Limitations include lack of typed values or schema, ambiguity in parsing due to dialects, and weak support for complex structures (nested data or metadata). For structured needs, formats like JSON, XML, or dedicated table formats with explicit schemas are often preferable.
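The lack of typed values can be seen in a minimal sketch like the following, which assumes a made-up three-column file. Every field arrives as a string, so any schema has to be applied by the reader; the parse_row helper and its conventions (an empty field meaning "missing", the literal "true" meaning a true value) are hypothetical choices, not part of CSV.

```python
import csv
import io

# Made-up data: Bob's age is missing.
data = "name,age,active\nAlice,30,true\nBob,,false\n"


def parse_row(row):
    """Apply a hypothetical schema: age is an optional int, active a bool."""
    return {
        "name": row["name"],
        "age": int(row["age"]) if row["age"] else None,
        "active": row["active"] == "true",
    }


# DictReader uses the header row as keys; all values are strings.
raw_rows = list(csv.DictReader(io.StringIO(data)))
typed_rows = [parse_row(row) for row in raw_rows]

print(raw_rows[0])
print(typed_rows)
```

Formats with an explicit schema would carry these conventions in the file itself; with CSV they must be communicated separately.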

Because of its ubiquity, CSV remains a practical choice for many everyday data tasks, but users should document and agree upon dialect details (delimiter, quoting, encoding) to avoid interoperability problems.
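One way to make such an agreement concrete is to keep the dialect details in a single place and use them for both writing and reading. The sketch below assumes a hypothetical agreed dialect (semicolon delimiter, minimal quoting, CRLF line endings, UTF-8); the names DIALECT, ENCODING, write_rows, and read_rows are illustrative.

```python
import csv
import os
import tempfile

# Hypothetical agreed-upon dialect, recorded once and shared by both sides.
DIALECT = {
    "delimiter": ";",
    "quotechar": '"',
    "quoting": csv.QUOTE_MINIMAL,
    "lineterminator": "\r\n",
}
ENCODING = "utf-8"


def write_rows(path, rows):
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding=ENCODING, newline="") as f:
        csv.writer(f, **DIALECT).writerows(rows)


def read_rows(path):
    with open(path, "r", encoding=ENCODING, newline="") as f:
        return list(csv.reader(f, **DIALECT))


rows = [["name", "note"], ["Zoë", "likes; semicolons"]]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.csv")
    write_rows(path, rows)
    round_tripped = read_rows(path)

print(round_tripped)
```

Because writer and reader share the same settings, the non-ASCII name and the field containing the delimiter both survive the round trip unchanged.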