Finding and handling duplicate data

Why duplicate IDs and rows are a problem and how to detect and handle them.

Why duplicates are a problem

When importing a CSV into a database, duplicate primary or unique keys cause constraint errors. When merging two CSVs, duplicate keys make it unclear which row is "correct," leading to wrong overwrites or duplicated rows. It is therefore important to find duplicates before importing or merging, and to handle them according to your own rules.

Choosing the key column

You need to decide which column is the unique key. Examples: ID, email, product code, or a combination of columns. The single-file check suggests key columns or lets you pick one, so you can quickly see which rows are duplicates.
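As a rough sketch of what "a combination of columns" means in practice (this is an illustration, not the tool's implementation), a composite key can be built by treating several column values as one tuple and counting how often each tuple appears:

```python
import csv
from collections import Counter

def duplicate_keys(path, key_columns):
    """Count each (possibly composite) key; return keys seen in 2+ rows."""
    with open(path, newline="", encoding="utf-8") as f:
        counts = Counter(
            tuple(row[c] for c in key_columns)  # composite key from chosen columns
            for row in csv.DictReader(f)
        )
    return [key for key, n in counts.items() if n > 1]
```

With `key_columns=["email"]` this checks a single column; with `key_columns=["email", "product_code"]` a row is a duplicate only when both values repeat together.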

How detection works

This tool flags “duplicate ID” when the same value appears in two or more rows for the chosen column. It lists row numbers and values so you can fix or remove duplicates. You can export the report for use in other tools.
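The detection logic described above can be sketched roughly as follows (a minimal stand-in, not the tool's actual code): group row numbers by the value in the chosen key column, then report every value that appears in two or more rows.

```python
import csv
from collections import defaultdict

def find_duplicates(path, key_column):
    """Map each key value to its row numbers; return values seen in 2+ rows."""
    seen = defaultdict(list)
    with open(path, newline="", encoding="utf-8") as f:
        # Start at 2 because row 1 is the header line.
        for row_num, row in enumerate(csv.DictReader(f), start=2):
            seen[row[key_column]].append(row_num)
    return {value: rows for value, rows in seen.items() if len(rows) > 1}
```

The returned mapping (value → row numbers) is essentially the report the tool lets you export: each duplicate value together with every row it occurs on.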

How to handle duplicates

Duplicates in two-file compare

When comparing an old vs. new CSV with two-file compare, duplicate keys can make diffs show as "changed" instead of "added/removed," because row alignment breaks. Cleaning key-column duplicates with the single-file check first makes the diff result clearer. See the CSV errors guide for more.
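To illustrate why deduplication helps (a simplified sketch under the assumption that "keep the last occurrence" is the chosen rule, not the tool's actual diff algorithm): once each key maps to exactly one row, rows split cleanly into added, removed, and changed.

```python
def dedupe(rows, key_column):
    """Keep one row per key value; later occurrences overwrite earlier ones."""
    result = {}
    for row in rows:
        result[row[key_column]] = row
    return result

def diff_by_key(old_rows, new_rows, key_column):
    """Classify keys as added, removed, or changed after deduplication."""
    old = dedupe(old_rows, key_column)
    new = dedupe(new_rows, key_column)
    added = sorted(k for k in new if k not in old)
    removed = sorted(k for k in old if k not in new)
    changed = sorted(k for k in new if k in old and new[k] != old[k])
    return added, removed, changed
```

Without the dedupe step, two old rows sharing a key could each pair against a different new row, so a plain row-by-row comparison would report spurious "changed" entries.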