Can qsv handle encodings other than UTF-8? #1450
-
It would be great if it was possible to add a |
-
Can’t you filter the data into UTF-8 before piping it into qsv?
-
qsv is written in Rust, which requires UTF-8 encoding for its string data type.
Therefore, qsv commands that manipulate input data as strings require UTF-8 encoding (denoted by 🔣, the Input Symbols emoji).
However, for commands that manipulate input data as bytes, qsv is encoding-agnostic and can work with any encoding.
To check if your CSV is UTF-8 encoded, you can use the `validate` command without a JSON schema, i.e. `qsv validate mydata.csv`, and it will emit an error if it's NOT UTF-8 encoded.
You can also use the `input` command to handle non-UTF-8 encoded CSV files with its `--encoding-errors` option. This option has three modes.
However, neither command will infer the current encoding.
To get the current encoding, you'll need the `file` command. For example, `file -i mydata.csv` prints `mydata.csv: application/csv; charset=iso-8859-1`.
To convert it to UTF-8 encoding, you'll need the `iconv` command.
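As a minimal sketch of such a conversion, assuming the source encoding is the ISO-8859-1 reported by `file -i` above: the sample data and the `mydata_utf8.csv` output name are hypothetical and only there to make the example self-contained. The `-f` and `-t` options are the standard `iconv` flags for the source and target encodings.

```shell
# Create a small ISO-8859-1 sample file (hypothetical data).
# \351 is the octal escape for the single Latin-1 byte of "é".
printf 'name,city\nRen\351,Montr\351al\n' > mydata.csv

# Convert from the source encoding (-f) to UTF-8 (-t).
# iconv writes to standard output, so redirect it to a new file.
iconv -f ISO-8859-1 -t UTF-8 mydata.csv > mydata_utf8.csv

# Sanity check: decoding the result as UTF-8 only succeeds if it is valid UTF-8.
iconv -f UTF-8 -t UTF-8 mydata_utf8.csv > /dev/null && echo "valid UTF-8"
```

Because `iconv` writes to standard output, the converted data can also be piped onward instead of saved, as suggested earlier in the thread. The converted file can then be checked with `qsv validate mydata_utf8.csv`.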