UTF-8 for CSV Files

By Online CSV Editor · Last updated: 2026-03-23

If your CSV shows strange characters like Ã, ’, –, or the replacement symbol �, the problem is usually text encoding, not your actual data values. In most modern CSV workflows, UTF-8 is the safest default because it preserves accented names, symbols, and non-English text across browsers and import tools.

The fix is to identify whether the file was saved with the wrong encoding, opened with the wrong encoding, or exported by a tool that adds special markers like BOM. Once you know that, you can re-save the file correctly without damaging delimiters, quoted fields, or row structure.

Quick answer: how to fix weird CSV characters

  1. Confirm the issue is encoding, not delimiter parsing.
  2. Re-open the file using UTF-8 if your tool allows encoding selection.
  3. Inspect a few rows with accented letters, apostrophes, symbols, or multilingual text.
  4. Re-save or export as UTF-8 using a CSV-aware tool.
  5. Test-import the new file before replacing the original workflow file.
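The quick-answer steps can be sketched in Python. The file names and the cp1252 source encoding are assumptions for illustration; the first block simulates a bad legacy export so the example is self-contained:

```python
# Simulate a hypothetical legacy export saved as cp1252 (Windows-1252).
with open("legacy.csv", "w", encoding="cp1252", newline="") as f:
    f.write("name,city\nJosé,München\n")

# Step 2: reopen the bytes with the encoding they were actually written in.
with open("legacy.csv", encoding="cp1252", newline="") as f:
    text = f.read()

# Step 4: re-save as UTF-8 without touching delimiters or row structure.
with open("legacy-utf8.csv", "w", encoding="utf-8", newline="") as f:
    f.write(text)

# Step 5: test-read the repaired file as UTF-8 before replacing anything.
with open("legacy-utf8.csv", encoding="utf-8", newline="") as f:
    print(f.read().splitlines()[1])   # → José,München
```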

What UTF-8 means in CSV files

UTF-8 is a text encoding standard that lets one CSV file represent ordinary English text, accented names, currency symbols, and multilingual content consistently. It does not change the delimiter, quoting, or number of columns. It only controls how text bytes are interpreted as readable characters.

If the same bytes are opened using the wrong encoding, names like José, München, or curly apostrophes can become broken text such as JosÃ© or MÃ¼nchen. That is why CSV files can look correct in one app and corrupted in another.
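Two lines of Python show why the same bytes read differently: the UTF-8 bytes for "José" decoded as Latin-1 produce the classic mojibake seen in broken imports.

```python
# The UTF-8 bytes for "José" are b'Jos\xc3\xa9'.
raw = "José".encode("utf-8")

print(raw.decode("latin-1"))   # wrong decoder → JosÃ© (mojibake)
print(raw.decode("utf-8"))     # right decoder → José
```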

Common signs your CSV has an encoding problem

  • Accented letters appear as é, ñ, or similar combinations.
  • Smart quotes and apostrophes become ’ or “.
  • Some characters appear as �, which means the original byte could not be decoded properly.
  • The file imports, but names, addresses, or product titles look corrupted afterward.
  • The same CSV looks different depending on which spreadsheet or import tool opens it.
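A quick triage for the signs above: try a strict UTF-8 decode of the raw bytes. A UnicodeDecodeError proves the file is not valid UTF-8; note that the reverse is weaker, since some single-byte encodings happen to produce byte sequences that also decode as UTF-8. The function name here is a hypothetical helper, not a standard API:

```python
def looks_like_utf8(path: str) -> bool:
    """Return True if the file's bytes decode cleanly as strict UTF-8."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        data.decode("utf-8")   # strict mode raises on any invalid byte
        return True
    except UnicodeDecodeError:
        return False
```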

Encoding issue or delimiter issue? Check this first

If the file opens as one giant column, that usually points to a delimiter mismatch. If the rows and columns look structurally correct but the text itself looks wrong, that usually points to encoding.

Many people troubleshoot the wrong thing because both problems can happen at once. If your separators also look inconsistent, review how to change a CSV delimiter safely and then come back to encoding checks.
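One way to separate the two failure modes is a rough automated check, assuming a comma-delimited file: a decode error points at encoding, while a first row that parses as a single field points at the delimiter. The function name and the comma assumption are illustrative, not a fixed rule:

```python
import csv

def triage(path: str) -> str:
    """Rough first guess at whether a CSV fails on encoding or delimiter."""
    try:
        with open(path, encoding="utf-8", newline="") as f:
            first_row = next(csv.reader(f))   # assumes comma delimiter
    except UnicodeDecodeError:
        return "encoding"                     # bytes are not valid UTF-8
    if len(first_row) == 1:
        return "delimiter"                    # "one giant column" symptom
    return "looks ok"
```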

Step-by-step: safe UTF-8 repair workflow

  1. Make a backup copy before editing the original file.
  2. Open the CSV in a tool that lets you verify rows, columns, and sample text values.
  3. Compare known fields such as customer names, city names, apostrophes, and symbols.
  4. Re-import or reopen the source with UTF-8 selected if your editor supports encoding choice.
  5. Export a fresh CSV in UTF-8 and keep the original delimiter unless the destination requires a change.
  6. Run a small test import in the target system and confirm the repaired characters remain correct.
  7. Only replace the original file after the test import passes.
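The full workflow, including the backup and the test import, might look like the sketch below. The file name "orders.csv" and the cp1252 source encoding are assumptions; the first write only simulates the damaged input so the example runs on its own:

```python
import shutil

src = "orders.csv"
with open(src, "w", encoding="cp1252", newline="") as f:   # simulated input
    f.write("name,total\nZoë,9.50\n")

shutil.copy(src, src + ".bak")                 # step 1: backup before editing

with open(src, encoding="cp1252", newline="") as f:        # step 4: reopen
    text = f.read()
with open("orders-utf8.csv", "w", encoding="utf-8", newline="") as f:
    f.write(text)                              # step 5: fresh UTF-8 export

# Step 6: small test import — confirm the repaired characters survived.
with open("orders-utf8.csv", encoding="utf-8", newline="") as f:
    assert "Zoë" in f.read()
```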

Example of a healthy UTF-8 CSV row

UTF-8 matters most when your file contains characters outside basic ASCII, such as accented letters and currency symbols:

name,city,note
José Álvarez,München,"Paid in € and confirmed by Zoë"

If that row displays broken accents or symbols after opening, the underlying text bytes are probably being decoded with the wrong character set.
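A round trip of that sample row through Python's csv module, written and read back as UTF-8, shows what "healthy" looks like: every accented letter and the € symbol survive unchanged.

```python
import csv

row = ["José Álvarez", "München", "Paid in € and confirmed by Zoë"]

with open("sample.csv", "w", encoding="utf-8", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "city", "note"])
    writer.writerow(row)

with open("sample.csv", encoding="utf-8", newline="") as f:
    rows = list(csv.reader(f))

print(rows[1])   # all characters intact after the round trip
```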

UTF-8 with BOM vs UTF-8 without BOM

Some spreadsheet tools and legacy importers prefer UTF-8 with BOM, which adds a short marker at the start of the file. Many web apps and APIs are perfectly happy with plain UTF-8 without BOM.

If a destination system specifically asks for UTF-8 BOM, follow that requirement. If it does not, standard UTF-8 is usually the cleaner default. The important part is consistency between export settings and the importer’s expectations.
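In Python, the difference is just the codec name: "utf-8-sig" writes the three-byte BOM marker EF BB BF at the start of the file, while plain "utf-8" does not.

```python
text = "name,city\nJosé,München\n"

with open("plain.csv", "w", encoding="utf-8", newline="") as f:
    f.write(text)                 # no BOM
with open("bom.csv", "w", encoding="utf-8-sig", newline="") as f:
    f.write(text)                 # BOM prepended automatically

print(open("plain.csv", "rb").read(3))   # → b'nam'
print(open("bom.csv", "rb").read(3))     # → b'\xef\xbb\xbf'
```

When reading, opening either file with encoding="utf-8-sig" strips the BOM if present, which is a safe choice for files of unknown origin.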

Mistakes that make encoding fixes worse

  • Saving over the only copy before checking whether the characters were actually repaired.
  • Changing delimiter and encoding at the same time without testing which issue caused the failure.
  • Opening the CSV in a spreadsheet that auto-converts values before you verify the raw text.
  • Assuming every corrupted file should use BOM when the destination tool never asked for it.
  • Running find-and-replace on broken characters instead of fixing the underlying encoding mismatch.

Quick QA checklist before export

  • Rows and columns still parse correctly
  • Names with accents or apostrophes display correctly
  • Delimiter stayed consistent during the repair
  • Destination system encoding requirement confirmed
  • Test import completed on the repaired file

FAQ

Why does UTF-8 matter for CSV imports?

Because CSV is just text plus separators. If the text encoding is wrong, the file may still import, but names, product titles, and symbols can become unreadable or permanently altered in the destination system.

What does the replacement character � mean?

It usually means the app could not decode one or more bytes using the selected encoding. In practice, that is a sign the file was opened or saved with the wrong character set somewhere in the workflow.
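This is easy to reproduce: cp1252 bytes decoded as UTF-8 with replacement enabled substitute U+FFFD (�) for each undecodable sequence.

```python
# "José" saved as cp1252 is b'Jos\xe9'; \xe9 is not valid UTF-8 here.
raw = "José".encode("cp1252")

print(raw.decode("utf-8", errors="replace"))   # → Jos�
```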

Should I use UTF-8 for multilingual CSV data?

Yes, in most cases. UTF-8 is the best default for multilingual text because it handles a wide range of characters without needing different regional encodings for different languages.

Canonical: https://csveditoronline.com/docs/csv-utf8-encoding-fix