Some applications, like Microsoft Excel for Windows, get a bit “confused” when reading UTF-8-encoded data. They’ll assume a file is ASCII-encoded, and then garble or reject it when the file turns out to contain non-ASCII characters encoded as UTF-8.
Many clients now either assume UTF-8 from the start, or allow you to override their chosen encoding when loading a file. For those that do not, you can use the $$bom flag to request that the output include a “Byte Order Mark”, or “BOM”, at the beginning of the file. The BOM is a special character (U+FEFF) that, when it appears at the very start of a file (as the three bytes EF BB BF in UTF-8), signals that the file should be parsed as UTF-8.
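To make the BOM less abstract, here is a minimal Python sketch of what it looks like at the byte level. The CSV payload is made up for illustration and is not taken from any real dataset; only the standard library is used.

```python
import codecs

# U+FEFF is the byte order mark; encoded as UTF-8 it becomes three bytes.
bom_bytes = "\ufeff".encode("utf-8")
assert bom_bytes == codecs.BOM_UTF8 == b"\xef\xbb\xbf"

# Hypothetical CSV payload with a BOM in front, as a client might receive it.
payload = codecs.BOM_UTF8 + "name,city\nZoë,Seattle\n".encode("utf-8")

# The plain "utf-8" codec keeps the BOM as a leading U+FEFF character,
# while "utf-8-sig" strips it if it is present.
with_bom = payload.decode("utf-8")
without_bom = payload.decode("utf-8-sig")
```

If your own code reads the output, decoding with "utf-8-sig" is a convenient way to avoid a stray U+FEFF at the front of the first column name.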
If you encounter errors parsing UTF-8 data in your client, add the $$bom=true flag to your SoQL query to force the inclusion of the BOM:
https://data.seattle.gov/resource/tqh5-8vm2.csv?$limit=5&$$bom=true
Note that the BOM is invisible, so you’ll just have to believe me that it’s in there.
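If you would rather check than take it on faith, a short script can look at the first bytes of the response. This is only a sketch: the helper names (build_url, starts_with_bom, fetch_and_check) are made up for this example, and it assumes the endpoint above is reachable and still serves this dataset.

```python
import codecs
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://data.seattle.gov/resource/tqh5-8vm2.csv"


def build_url(base, limit=5, bom=True):
    """Build the request URL, adding $$bom=true only when requested."""
    params = {"$limit": limit}
    if bom:
        params["$$bom"] = "true"
    # safe="$" leaves the dollar signs unescaped so the URL stays readable.
    return base + "?" + urlencode(params, safe="$")


def starts_with_bom(data: bytes) -> bool:
    """Return True if the bytes begin with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def fetch_and_check(url):
    """Download the response body and report whether it starts with a BOM."""
    with urlopen(url) as resp:
        return starts_with_bom(resp.read())
```

Calling fetch_and_check(build_url(BASE_URL)) performs a real network request, so it is left out of the block itself; the two pure helpers can be tried without network access.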