CSV parse error while generating report. #1
Comments
I dug into this problem a bit, and here's a minimal code snippet that reproduces it.
The content of bug.csv seems to be a valid CSV file, but for some reason the prettytable module is having a hard time parsing it.
Hi, thanks for letting us know about this. I don't know exactly what's going wrong with this dataset. There seems to ... By the way, may I ask why you are specifying all of your fields as ... Cheers, Florian 2016-01-12 19:59 GMT+01:00 Kangkook Jee [email protected]:
Thanks a lot, Florian, for your prompt follow-up. First of all, to answer your question about why we are specifying all of the fields: we do this for the following reasons.
Thanks again for your help! Regards, Kangkook @roxanageambasu @francislan Please have a look and let me know if you have any ideas or comments on this.
Ah okay, this makes sense. However, for performance reasons, it might be a good idea to preprocess the ... Cheers, Florian
Hi, I traced this bug for quite some time, and it's a bug in the csv module that prettytable uses to parse CSVs. The csv module has a Sniffer class that tries to guess the delimiter (part of what the module calls the dialect) of the CSV based on frequency analysis of characters per line. That is, the character that appears with the same frequency on all lines of the CSV is selected as the delimiter. This complicated scheme is failing to guess the delimiter properly, and there is no way to shortcut this idiotic process and set the delimiter yourself. Kangkook: if Florian's suggestion to remove some funky fields from the sensitive attributes is not working for you, we may consider using some other library for presenting the reports, because I suspect that this bug will appear again.
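For anyone following along, here is a minimal sketch of the guessing step described above, using only the standard library's `csv.Sniffer`. The helper name `guess_delimiter` and the sample text are made up for illustration and are not taken from bug.csv or from prettytable; the code assumes Python 3.

```python
import csv


def guess_delimiter(sample, candidates=None):
    """Return the delimiter csv.Sniffer guesses for `sample`.

    Returns None when the sniffer cannot settle on a delimiter.
    `candidates` optionally restricts which characters may be chosen.
    (Hypothetical helper, for illustration only.)
    """
    try:
        dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    except csv.Error:
        # Raised when no character is consistent enough across lines.
        return None
    return dialect.delimiter


# A small, well-behaved sample: the comma occurs once on every line.
sample = "name,age\nalice,30\nbob,25\n"
print(guess_delimiter(sample))
```

On real data with free-text fields, some other character can end up looking more consistent than the true delimiter, or the sniffer can give up altogether, which seems to match the failure described in this thread.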
Thanks @vatlidak for your work. I reached a similar point while debugging the issue, and I wanted to find a way to supply the delimiter character from our side so that we bypass the problematic code path. Unfortunately, I haven't succeeded yet. At this point, removing the funky fields with some kind of pre-processing doesn't seem to be an option.
I encountered the following exception while processing an input from the Taintmark experiment.
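One possible way around the guessing step, sketched here under the assumption that the reporting code can accept already-parsed rows, is to read the file with `csv.reader` and an explicit delimiter, skipping the sniffer altogether. The helper `read_rows` and the sample data are hypothetical; the code assumes Python 3.

```python
import csv
import io


def read_rows(fp, delimiter=","):
    """Parse CSV from the file-like object `fp` with a fixed delimiter.

    No sniffing takes place: the delimiter is whatever the caller passes.
    Returns (header, body), where body is a list of rows.
    (Hypothetical helper, for illustration only.)
    """
    reader = csv.reader(fp, delimiter=delimiter)
    rows = [row for row in reader if row]  # drop blank lines
    if not rows:
        return [], []
    return rows[0], rows[1:]


# Quoted fields containing ';' and ',' are the kind of content that
# can confuse a frequency-based guess; a fixed delimiter is unaffected.
sample = io.StringIO('id,note\n1,"a; b; c"\n2,"x, y"\n')
header, body = read_rows(sample)
print(header)
print(body)
```

The resulting header and rows could then be handed to the table directly, for example through PrettyTable's `field_names` attribute and `add_row` method, rather than asking the library to parse the file itself. Whether that fits the way the reports are generated here is an open question.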
To reproduce the problem, you can use the attached file (test0.csv) as an input to fairtest with the following setting.