What is CSV?
Basics of the comma-separated text format and tips for using it in your workflow.
CSV basics
CSV (Comma-Separated Values) is a plain-text format in which values are separated by commas. Each line normally holds one record, columns are split at the delimiter, and the first line is often a header containing the column names.
Example
id,name,email
1,Alice,alice@example.com
2,Bob,bob@example.com
This makes it easy to exchange table-like data between Excel, databases, and web apps.
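As a rough illustration of how the header row and records map onto program data, the sample above can be read with Python's standard `csv` module. This is a minimal sketch; the `read_records` helper and the `SAMPLE` constant are names made up for this example.

```python
import csv
import io

# The sample from the example above; io.StringIO stands in for an open file.
SAMPLE = "id,name,email\n1,Alice,alice@example.com\n2,Bob,bob@example.com\n"


def read_records(text):
    """Return each data row as a dict keyed by the header's column names."""
    return list(csv.DictReader(io.StringIO(text)))


records = read_records(SAMPLE)
print(records[0]["name"])  # Alice
```

Every value comes back as a string, so numeric columns such as `id` need an explicit conversion if you want numbers.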
Why use CSV
- Portable: Opens in almost any OS or app; a common format for system integration.
- Lightweight: Plain text with no formatting overhead; well suited to email attachments and batch jobs.
- Readable: You can open and edit it in a text editor.
Delimiters
Although the name “CSV” implies a comma (,), some regions and apps use a semicolon (;) or a tab instead. The CSV Checker can auto-detect the delimiter or let you choose it.
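If you want to guess the delimiter in your own scripts, one possible approach is Python's `csv.Sniffer`. The sketch below is not how the CSV Checker works internally; the `detect_delimiter` helper and its comma fallback are assumptions made for illustration.

```python
import csv


def detect_delimiter(sample, candidates=",;\t", default=","):
    """Guess the delimiter from a text sample; fall back to `default`."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # The sniffer could not decide, for example on single-column data.
        return default


semicolon_sample = "id;name;email\n1;Alice;alice@example.com\n2;Bob;bob@example.com\n"
print(detect_delimiter(semicolon_sample))  # ;
```

Sniffing is a heuristic, so it is worth passing a sample of several lines and confirming the result on files that matter.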
Character encoding
CSV is text, so encoding matters. Common options:
- UTF-8: The international standard. Saving with a byte order mark (BOM) often helps Excel open the file correctly.
- Windows-1252 / ISO-8859-1: Common in older or regional systems.
Wrong encoding causes garbled text. See Encoding issues for more.
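A common defensive pattern is to try UTF-8 first and fall back to a legacy encoding only if decoding fails. The following is a minimal sketch under that assumption; `decode_csv_bytes` and its candidate list are hypothetical, and the fallback is only a guess, because Windows-1252 will decode almost any byte sequence without complaint.

```python
def decode_csv_bytes(raw, candidates=("utf-8-sig", "cp1252")):
    """Return (text, encoding) using the first candidate that decodes cleanly."""
    for encoding in candidates:
        try:
            # "utf-8-sig" reads plain UTF-8 and also strips a leading BOM.
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this last resort never raises.
    return raw.decode("latin-1"), "latin-1"


text, used = decode_csv_bytes("id,name\n1,José\n".encode("cp1252"))
print(used)  # cp1252
```

The order of candidates matters: put the strictest encoding first, since a permissive one would accept the bytes and hide the mismatch.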
CSV vs Excel
Excel (.xlsx) is a zipped, XML-based workbook format that stores cells, formatting, formulas, and multiple sheets. CSV holds “plain table data” only: no formatting, and one table per file. CSV is often used for data transfer; we compare them in CSV vs Excel.
Things to watch in CSV
- Duplicate IDs: Same ID in multiple rows can cause DB or matching errors. See Duplicate data guide.
- Invisible characters: Copy-paste or imported data may contain hidden control characters. CSV errors guide explains how to fix them.
- Column count mismatch: Rows with different column counts can break imports. Use format check to find them first.
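Two of the checks above, duplicate IDs and column count mismatch, can be approximated in a few lines of Python. This is a sketch rather than the format check tool itself; `check_rows` and the assumption that the key column is named `id` are illustrative.

```python
import csv
import io
from collections import Counter


def check_rows(text, id_column="id"):
    """Report rows whose width differs from the header, plus repeated IDs."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    id_index = header.index(id_column)  # raises ValueError if the column is missing

    # Row numbers are 1-based and count the header as row 1.
    bad_width_rows = [
        number for number, row in enumerate(data, start=2)
        if len(row) != len(header)
    ]
    id_counts = Counter(row[id_index] for row in data if len(row) > id_index)
    duplicate_ids = sorted(value for value, count in id_counts.items() if count > 1)
    return {"bad_width_rows": bad_width_rows, "duplicate_ids": duplicate_ids}


report = check_rows("id,name\n1,Alice\n2,Bob,extra\n1,Carol\n")
print(report)  # {'bad_width_rows': [3], 'duplicate_ids': ['1']}
```

The reported numbers are record positions, which can differ from physical line numbers when a quoted field contains a line break.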
A brief history: RFC 4180
CSV has been in use since the early days of computing. The format was formally documented in 2005 as RFC 4180, which defined how commas, quotation marks, and line breaks should be handled inside a CSV file. Despite its age, CSV remains one of the most widely used data exchange formats because of its simplicity and near-universal software support. You do not need a special application to open, read, or create a CSV file.
Where CSV is used in practice
CSV appears across virtually every industry:
- E-commerce: Product catalogs, order exports, and inventory lists are commonly shared as CSV between platforms like Shopify, Amazon marketplaces, and warehouse management systems.
- Finance & accounting: Bank transaction exports, payroll data, and general ledger entries are routinely formatted as CSV for import into accounting software such as QuickBooks or Freee.
- Healthcare: Patient lists, appointment records, and lab results are often exchanged in CSV between hospital systems and third-party tools, because the format can be opened without specialized software.
- Government & research: Open data portals and academic datasets frequently publish in CSV so that anyone can download and analyze the data without paying for a proprietary tool.
- Marketing: Email subscriber lists, CRM exports, and ad campaign performance reports are managed and transferred in CSV between platforms.
CSV vs JSON vs XML: which should you use?
CSV is not the only plain-text format for data. Here is a quick comparison to help you choose:
- CSV: Best for flat table data (rows and columns). Easy for non-technical users to open and edit in any spreadsheet app. Limited to a single table with no hierarchy or nesting.
- JSON: Ideal for hierarchical or nested data structures. Common in REST APIs and web development. Not easy to open and edit without developer tools or a code editor.
- XML: Supports complex structures with schemas and validation rules. Common in enterprise systems and legacy integrations. Verbose and harder to read or edit manually.
When your data is a straightforward table with rows and columns — like a product list or a customer export — CSV is almost always the right choice. When you need nested objects, multiple related tables, or tight API integration, JSON is usually more appropriate.
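To make the comparison concrete, here is a small sketch that converts a flat CSV table into the equivalent JSON, a list of objects with one object per row. The `csv_to_json` helper is a hypothetical name used only for this example.

```python
import csv
import io
import json


def csv_to_json(text):
    """Convert CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return json.dumps(rows, indent=2)


print(csv_to_json("id,name\n1,Alice\n2,Bob\n"))
```

The JSON version repeats the column names in every record, which is part of why CSV stays compact for flat tables, while JSON can go on to nest objects inside each record.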
Three mistakes that cause the most CSV problems
Most CSV errors in practice trace back to a small number of recurring mistakes:
- Encoding mismatch: Saving a file in Shift_JIS or Windows-1252 and then opening it in a system that expects UTF-8 produces garbled text. Always confirm the character encoding before sending or importing a CSV file. The format check tool shows the detected encoding automatically.
- Unquoted fields containing the delimiter: A value like Smith, John breaks the column structure if it is not wrapped in double quotes. This is one of the most common causes of column-count mismatch errors and can be hard to spot manually in a large file.
- Invisible characters from copy-paste: Copying data from a web page, PDF, or chat tool often introduces zero-width spaces, non-breaking spaces, or other control characters that are invisible in Excel but cause import failures in databases and APIs. The CSV check tool detects and removes them in one click.
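The last two mistakes can often be avoided when a file is generated by code rather than by hand. The sketch below is one way to do it in Python, not the CSV check tool's implementation: `csv.writer` adds double quotes wherever a field contains the delimiter, and the hypothetical `clean_field` helper strips a small, deliberately incomplete set of invisible characters.

```python
import csv
import io
import unicodedata

# Characters treated as invisible in this sketch; an illustrative, incomplete list.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def clean_field(value):
    """Drop zero-width and control characters; turn non-breaking spaces into spaces."""
    value = value.replace("\u00a0", " ")
    kept = [
        ch for ch in value
        if ch not in ZERO_WIDTH and unicodedata.category(ch) != "Cc"
    ]
    return "".join(kept).strip()


def to_csv_line(fields):
    """Serialize one record, quoting any field that contains the delimiter."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([clean_field(field) for field in fields])
    return buffer.getvalue()


print(to_csv_line(["1", "Smith, John", "a\u200bb@example.com"]), end="")
# 1,"Smith, John",ab@example.com
```

Note that the control-character filter also removes tabs and line breaks inside a field, so adjust it if your data legitimately contains them.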