LineReader stops reading when it hits a character like "É" or "ñ" #5
Comments
Maybe the file is not encoded using UTF-8? I use |
Hi, I still have this problem. |
Have you verified which character encoding is used by the file you are trying to read? |
Hi, it's Unicode (UTF-8) |
Could you upload a zipped sample somewhere? Then I will find time to take a look at it in a few days. |
I think you can create a new document with some characters like í, é, ñ... Or I will upload some sample data. |
I think you should really upload an example file somewhere. I can write an |
This is the file's info: Non-ISO extended-ASCII English text, with very long lines, with CRLF line terminators. This is the file: http://www.mediafire.com/?1cwr4if28w504md It has an "î" character. |
Agreed. As I suspected, the file is not encoded as UTF-8. I converted the file to UTF-8 using Notepad++ (the options are visible in the menu), so you can try again with this file. |
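For anyone who would rather script the conversion than use Notepad++, here is a minimal sketch in Python (used only for illustration, it is not part of LineReader). The helper name `convert_to_utf8` is hypothetical, and the source encoding `cp1252` is an assumption: the thread only establishes that the file is some non-UTF-8 extended-ASCII encoding.

```python
def convert_to_utf8(data, source_encoding="cp1252"):
    """Re-encode raw bytes from source_encoding to UTF-8.

    source_encoding defaults to Windows-1252, which is a guess; pass the
    real encoding if you know it.
    """
    text = data.decode(source_encoding)  # bytes -> str using the old encoding
    return text.encode("utf-8")          # str -> bytes as UTF-8


# "é" is the single byte 0xE9 in Windows-1252 and the two bytes 0xC3 0xA9 in UTF-8.
converted = convert_to_utf8(b"caf\xe9")
```

To convert a whole file, read it in binary mode, pass the bytes through the helper, and write the result back out in binary mode. |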
Maybe we should automatically convert every file to UTF-8 before starting to read its content
I suggest that you look for a way to recognize the character encoding up front. Feel free to add it to the LineReader. |
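One possible way to recognize the encoding up front, sketched in Python for illustration only: check for a byte order mark, then try a strict UTF-8 decode, and fall back to a single-byte encoding if that fails. The function name `guess_encoding` and the `cp1252` fallback are assumptions, not anything LineReader provides. It is a heuristic that cannot distinguish between different single-byte encodings, and it does not handle UTF-32.

```python
import codecs


def guess_encoding(data, fallback="cp1252"):
    """Return a best-guess codec name for the given bytes (heuristic)."""
    # A byte order mark, when present, is the strongest hint.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # Valid UTF-8 is very unlikely to occur by accident in legacy text.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Not UTF-8: assume the fallback single-byte encoding.
        return fallback


utf8_guess = guess_encoding("café".encode("utf-8"))
legacy_guess = guess_encoding("café".encode("latin-1"))
```

The reader could run this on the first chunk of the file and then decode with the returned codec, though a chunk boundary that splits a multi-byte character would need extra care. |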
So you have a textfile such as:
diner
restaurant
lunch-spot
greasy spoon
café // "é" character
coffee shop
cafeteria
LineReader stops reading when it hits the "café" line above. Never gets to "coffee shop".
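The behaviour can be reproduced outside LineReader. The Python sketch below is an illustration, not LineReader's actual code: it assumes the word list was saved in a single-byte encoding (ISO-8859-1 here) and uses a hypothetical `read_lines_strict_utf8` helper that, like a strict UTF-8 reader, gives up at the first line it cannot decode.

```python
# The word list from the report, encoded as ISO-8859-1 with CRLF line endings.
# The encoding is an assumption; the thread only shows the file is not UTF-8.
words = ["diner", "restaurant", "lunch-spot", "greasy spoon",
         "café", "coffee shop", "cafeteria"]
raw = "\r\n".join(words).encode("latin-1")


def read_lines_strict_utf8(data):
    """Decode line by line as UTF-8 and stop at the first invalid line."""
    lines = []
    for chunk in data.split(b"\r\n"):
        try:
            lines.append(chunk.decode("utf-8"))
        except UnicodeDecodeError:
            break  # mimics a reader that gives up on undecodable bytes
    return lines


lines = read_lines_strict_utf8(raw)
# lines ends at "greasy spoon": the byte 0xE9 for "é" is not valid UTF-8 here.
```

Decoding the same bytes with the correct encoding, for example `raw.decode("latin-1").splitlines()`, returns all seven entries.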