Add a utfbom encoding that handles UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE #33

Open
tahonermann opened this issue May 21, 2018 · 0 comments

Comments

@tahonermann
Owner

text_view currently defines utf8bom, utf16bom, and utf32bom encodings, each of which detects a BOM and dispatches to the appropriate non-BOM encoding to consume the remaining input. However, a single utfbom encoding would be useful for consuming UTF-8, UTF-16, and UTF-32 files that contain a BOM, without having to know in advance which of the three is in use.
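
For illustration, the detection step could look roughly like the sketch below. This is not text_view code: `bom_encoding`, `bom_match`, and `detect_bom` are hypothetical names, and the sketch assumes C++17 and a contiguous byte buffer rather than text_view's iterator-based interface. Only the BOM byte sequences themselves come from the Unicode Standard.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical names for illustration; none of these exist in text_view.
enum class bom_encoding { utf8, utf16be, utf16le, utf32be, utf32le };

struct bom_match {
    bom_encoding encoding;  // encoding indicated by the BOM
    std::size_t  bom_size;  // number of leading bytes to skip
};

// Returns the encoding indicated by a leading BOM, or std::nullopt if the
// input does not start with one.
inline std::optional<bom_match> detect_bom(std::string_view bytes) {
    using namespace std::string_view_literals;
    auto starts_with = [&](std::string_view prefix) {
        return bytes.size() >= prefix.size()
            && bytes.compare(0, prefix.size(), prefix) == 0;
    };
    // The UTF-32LE BOM (FF FE 00 00) begins with the UTF-16LE BOM (FF FE),
    // so the 4-byte patterns have to be tested first.
    if (starts_with("\x00\x00\xFE\xFF"sv)) return bom_match{bom_encoding::utf32be, 4};
    if (starts_with("\xFF\xFE\x00\x00"sv)) return bom_match{bom_encoding::utf32le, 4};
    if (starts_with("\xEF\xBB\xBF"sv))     return bom_match{bom_encoding::utf8,    3};
    if (starts_with("\xFE\xFF"sv))         return bom_match{bom_encoding::utf16be, 2};
    if (starts_with("\xFF\xFE"sv))         return bom_match{bom_encoding::utf16le, 2};
    return std::nullopt;
}
```

One known ambiguity: the bytes FF FE 00 00 are also a valid UTF-16LE BOM followed by U+0000. The sketch resolves this in favor of UTF-32LE; whether utfbom should do the same is a design decision.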

An open question is what to do if the input lacks a BOM. The options are to fail or to fall back to an assumed encoding. A policy class could give the programmer control over this choice; e.g., fail, fall back to UTF-8, etc.
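
A minimal sketch of what such a policy could look like, assuming C++17. All names here (`utf_encoding`, `fail_on_missing_bom`, `assume_on_missing_bom`, `select_encoding`) are hypothetical and chosen for illustration only; how a policy would actually plug into text_view's encoding types is left open.

```cpp
#include <optional>
#include <stdexcept>

// Hypothetical names for illustration; none of these exist in text_view.
enum class utf_encoding { utf8, utf16be, utf16le, utf32be, utf32le };

// Policy: treat a missing BOM as an error.
struct fail_on_missing_bom {
    [[noreturn]] static utf_encoding on_missing_bom() {
        throw std::runtime_error("input does not begin with a BOM");
    }
};

// Policy: assume a fixed encoding when no BOM is present.
template <utf_encoding Fallback>
struct assume_on_missing_bom {
    static constexpr utf_encoding on_missing_bom() noexcept { return Fallback; }
};

using assume_utf8_on_missing_bom = assume_on_missing_bom<utf_encoding::utf8>;

// Picks the encoding used for the remaining input: the detected one if a
// BOM was found, otherwise whatever the policy decides.
template <typename MissingBomPolicy>
utf_encoding select_encoding(std::optional<utf_encoding> detected) {
    return detected ? *detected : MissingBomPolicy::on_missing_bom();
}
```

With this shape, `select_encoding<assume_utf8_on_missing_bom>(std::nullopt)` yields `utf_encoding::utf8`, while `select_encoding<fail_on_missing_bom>(std::nullopt)` throws. Making the policy a template parameter keeps the choice at compile time; a runtime option would work as well if the encoding type needs to stay non-templated.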
