Is your feature request related to a problem?
When using CodableCSV to load user-provided CSV files, one currently needs to ask the user which field delimiter is used in their file.
Describe the solution you'd like
It would be nice if CodableCSV had an option to automatically infer the field delimiter from the provided file.
I saw that this feature is on the roadmap, along with row delimiter detection and header detection. There are also already some references to it in the code, with the idea to use auto-detection when the field delimiter is set to nil in the reader's configuration.
I'd be happy to contribute this feature. My idea was to port the dialect detection code from the CleverCSV Python library to Swift.
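To make the idea concrete, here is a minimal sketch of what field delimiter inference could look like. It is a naive frequency heuristic, not CleverCSV's algorithm and not existing CodableCSV code: the function name `guessFieldDelimiter`, the candidate list, and the 50-line sample size are all assumptions made for illustration. It also ignores quoted fields, which a real implementation would have to handle.

```swift
/// Hypothetical helper (not part of CodableCSV): guesses the field delimiter
/// by checking which candidate occurs the same non-zero number of times
/// on the largest number of sampled lines.
func guessFieldDelimiter(in sample: String,
                         candidates: [Character] = [",", ";", "\t", "|"]) -> Character? {
    // Only look at the first 50 non-empty lines (an arbitrary sample size).
    let lines = sample.split(whereSeparator: { $0.isNewline }).prefix(50)
    guard !lines.isEmpty else { return nil }

    var best: (delimiter: Character, score: Int)? = nil
    for candidate in candidates {
        // Count how often the candidate appears on each line.
        let counts = lines.map { line in line.filter { $0 == candidate }.count }

        // Tally how many lines share each non-zero count.
        var histogram: [Int: Int] = [:]
        for count in counts where count > 0 {
            histogram[count, default: 0] += 1
        }

        // Score: number of lines agreeing on the most common non-zero count.
        let score = histogram.values.max() ?? 0
        if score > (best?.score ?? 0) {
            best = (delimiter: candidate, score: score)
        }
    }
    // nil means no candidate appeared at all; ties go to the earlier candidate.
    return best?.delimiter
}
```

With something like this in place, a `nil` field delimiter in the configuration could fall back to the guess, for example `let delimiter = configuredDelimiter ?? guessFieldDelimiter(in: sample)`, where `configuredDelimiter` is a placeholder name.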
Describe alternatives you've considered
An alternative would be to use the CleverCSV library directly. However, that would add a dependency to the project and, more importantly, I'm not sure how well Swift supports calling Python code. I guess it wouldn't work on iOS, for example?
Delimiter inference is indeed something I have always wanted to do and planned for, but I never really got around to it. To be honest, I haven't even begun to think about how to approach the problem. So, if you want to research it and come up with a solution, I will be more than happy to review it.
The inference code is supposed to live here. You would probably want to expand the switch statement to indicate which delimiters the user has provided; i.e.:
you know the field delimiter, but not the row delimiter, or
you know the row delimiter, but not the field delimiter, or
you know neither.
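The three cases above, plus the case where both delimiters are known, could be sketched as a switch over a pair of settings. The types `DelimiterSetting` and `InferenceTask` and the function `inferenceTask` are hypothetical names used only for illustration; they are not CodableCSV's actual types.

```swift
/// Hypothetical stand-in for a delimiter setting (not CodableCSV's real type).
enum DelimiterSetting: Equatable {
    case known(String)  // the user supplied the delimiter
    case infer          // the delimiter must be detected from the data
}

/// Hypothetical description of what still needs to be inferred.
enum InferenceTask: Equatable {
    case none                    // both delimiters are known
    case rowOnly(field: String)  // field delimiter known, row delimiter unknown
    case fieldOnly(row: String)  // row delimiter known, field delimiter unknown
    case both                    // neither delimiter is known
}

/// Maps the user's configuration onto the work the reader has to do.
func inferenceTask(field: DelimiterSetting, row: DelimiterSetting) -> InferenceTask {
    switch (field, row) {
    case (.known, .known):
        return .none
    case (.known(let fieldDelimiter), .infer):
        return .rowOnly(field: fieldDelimiter)
    case (.infer, .known(let rowDelimiter)):
        return .fieldOnly(row: rowDelimiter)
    case (.infer, .infer):
        return .both
    }
}
```

Keeping the known delimiter as an associated value means the inference step for the other delimiter can use it, for instance to split rows before counting field delimiter candidates.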
I don't want to add dependencies to the project, so if you want to take this over, I would ask you to write Swift code directly.
Hello @PoshAlpaca and @dehesa. I recently developed a simpler mechanism for CSV dialect detection in data ingestion pipelines. The approach is described in a research paper and also has a Python implementation.
@dehesa what do you think?