CSV data cleaning

The basics of cleaning CSV data, and what you can do with this tool.

What is data cleaning?

Data cleaning (cleansing) means finding and fixing or removing bad or noisy data so it’s ready for analysis or integration. For CSV, that often means removing invisible characters, trimming spaces, and finding duplicates.

Why it matters

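Uncleaned CSV can cause failed matches between values that look identical, database errors from duplicate IDs or emails, and wrong merges. Cleaning first reduces errors and rework. See also the CSV errors guide.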

Removing invisible characters

Pasted or imported data may contain zero-width spaces, control characters, or non-standard space characters. The single-file check detects these “invisible characters” and can remove them with “Fix all issues”. All processing stays in your browser.
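If you want to do the same kind of cleanup in a script, the idea can be sketched in Python. This is not the tool's implementation: the helper name strip_invisible and the choice of which Unicode categories count as “invisible” are assumptions made for illustration. Note that the format category also includes joiners used in some scripts and emoji sequences, so review the rule before applying it to multilingual text.

```python
import unicodedata


def strip_invisible(value: str) -> str:
    """Hypothetical helper: drop invisible characters, normalize odd spaces."""
    cleaned = []
    for ch in value:
        category = unicodedata.category(ch)
        if category == "Cf":
            # Format characters such as zero-width space (U+200B) or BOM (U+FEFF).
            continue
        if category == "Cc" and ch not in "\t\n\r":
            # Control characters other than tab and line breaks.
            continue
        if category == "Zs":
            # Any space separator (no-break space, ideographic space, ...)
            # becomes a plain ASCII space.
            cleaned.append(" ")
            continue
        cleaned.append(ch)
    return "".join(cleaned)


print(repr(strip_invisible("A\u200bB")))   # zero-width space removed
print(repr(strip_invisible("a\u00a0b")))   # no-break space replaced
```

Replacing odd spaces with a plain space, rather than deleting them, keeps words separated; whether that is the right choice depends on your data.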

Trimming spaces

Leading and trailing spaces make “ A ” and “A” compare as different values, which breaks matching. The single-file check can trim them in one click and, optionally, normalize full-width and half-width characters per column.
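A minimal Python sketch of trimming with optional per-column width normalization might look like the following. The function name trim_row and the width_columns parameter are hypothetical, and using NFKC for width folding is an assumption: NFKC also rewrites other compatibility characters (for example circled digits), so it is broader than a pure full-width/half-width conversion.

```python
import unicodedata


def trim_row(row: dict[str, str],
             width_columns: frozenset[str] = frozenset()) -> dict[str, str]:
    """Hypothetical helper: trim every cell, fold width in selected columns."""
    cleaned = {}
    for column, value in row.items():
        if column in width_columns:
            # NFKC maps full-width Latin letters and digits to ASCII and
            # half-width katakana to full-width katakana.
            value = unicodedata.normalize("NFKC", value)
        # strip() removes leading and trailing whitespace, including
        # Unicode whitespace such as the ideographic space.
        cleaned[column] = value.strip()
    return cleaned


row = {"id": " A ", "name": "\u3000ＡＢＣ１２３ "}
print(trim_row(row, frozenset({"name"})))
```

Normalizing only the listed columns leaves free-text columns untouched, which mirrors the per-column option described above.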

Finding duplicates

Duplicate IDs or emails can cause DB errors or wrong merges. The duplicate data guide explains how to find and handle them. This tool detects and lists duplicates; you then edit the downloaded CSV to remove or merge as needed.
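To illustrate what detecting and listing duplicates involves, here is a rough Python sketch. It is an assumption-laden example, not the tool's logic: find_duplicates is a hypothetical name, and comparing keys after trimming and case-folding is a choice that suits emails but may be wrong for case-sensitive IDs.

```python
import csv
import io
from collections import defaultdict


def find_duplicates(csv_text: str, key_column: str) -> dict[str, list[int]]:
    """Hypothetical helper: map each repeated key to the rows where it occurs."""
    rows_by_key = defaultdict(list)
    reader = csv.DictReader(io.StringIO(csv_text))
    # Rows are numbered from 2 because row 1 is the header.
    for row_number, row in enumerate(reader, start=2):
        # Assumption: keys match after trimming and case-folding.
        key = (row.get(key_column) or "").strip().casefold()
        if key:
            rows_by_key[key].append(row_number)
    # Keep only keys that appear more than once.
    return {key: rows for key, rows in rows_by_key.items() if len(rows) > 1}


sample = "id,email\n1,a@example.com\n2,b@example.com\n3, A@Example.com \n"
print(find_duplicates(sample, "email"))
```

The function only reports row numbers; deciding whether to remove or merge the rows is left to you, as with the downloaded CSV.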

Cleaning workflow (this tool)

  1. Format check: encoding, delimiter, column count, empty lines.
  2. If needed, convert the encoding to UTF-8 with BOM.
  3. Single-file check: upload the CSV to detect invisible characters, duplicate IDs, and extra spaces.
  4. Apply “Fix all issues” and download the cleaned CSV.
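
The first two steps of the workflow above can be approximated in a script. The Python sketch below is only an illustration under stated assumptions: format_check and to_utf8_bom are hypothetical names, the delimiter is guessed by counting candidates in the header line, and cp1252 is an assumed fallback for files that are not valid UTF-8. The tool's own checks may work differently.

```python
import csv
import io

# Assumption: these are the only delimiters worth considering.
CANDIDATE_DELIMITERS = [",", ";", "\t", "|"]


def format_check(raw: bytes) -> dict:
    """Hypothetical helper: report encoding, delimiter, empty lines, columns."""
    try:
        # utf-8-sig decodes UTF-8 and drops a leading BOM if present.
        text = raw.decode("utf-8-sig")
        encoding = "utf-8"
    except UnicodeDecodeError:
        # Assumed legacy fallback; undecodable bytes are replaced.
        text = raw.decode("cp1252", errors="replace")
        encoding = "cp1252"

    lines = text.splitlines()
    header = lines[0] if lines else ""
    # Simple heuristic: the candidate seen most often in the header line.
    delimiter = max(CANDIDATE_DELIMITERS, key=header.count)
    empty_lines = [n for n, line in enumerate(lines, start=1) if not line.strip()]

    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    rows = [row for row in reader if row]
    # More than one distinct value means the column count is inconsistent.
    column_counts = sorted({len(row) for row in rows})
    return {
        "encoding": encoding,
        "delimiter": delimiter,
        "empty_lines": empty_lines,
        "column_counts": column_counts,
    }


def to_utf8_bom(text: str) -> bytes:
    """Encode text as UTF-8 with a leading byte order mark."""
    return text.encode("utf-8-sig")


print(format_check("id;name\n1;Ann\n\n2;Bob;extra\n".encode("utf-8")))
```

Counting delimiters in the header is a deliberately simple heuristic; it can misfire when header names themselves contain a candidate character.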
