Support truncate_ragged_lines option for reading CSV files #53

Merged 3 commits into ankane:master on Mar 24, 2024

Conversation

jvdp
Contributor

@jvdp jvdp commented Mar 22, 2024

Hi,

Thanks for the library! I'm getting good use out of it.

Polars can give an error message like:

found more fields than defined in 'Schema' (RuntimeError)

Consider setting 'truncate_ragged_lines=true'.

This PR adds support for it in the bindings.
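To make the option's effect concrete, here is a minimal plain-Ruby sketch of what truncating ragged lines means. It is an illustration only and not the Polars implementation: `parse_rows` is a hypothetical helper, and the naive comma split ignores quoting.

```ruby
# Illustration only: a hypothetical helper, not part of Polars or its bindings.
# Rows with more fields than the header either raise (the default) or have
# the extra fields dropped when truncate_ragged_lines is true.
def parse_rows(text, truncate_ragged_lines: false)
  # Naive split on commas; real CSV parsing also handles quoting.
  header, *rows = text.lines(chomp: true).map { |line| line.split(",") }
  rows.map do |fields|
    if fields.size > header.size
      unless truncate_ragged_lines
        raise "found more fields than defined in 'Schema'"
      end
      # Keep only as many fields as there are columns.
      fields = fields.first(header.size)
    end
    header.zip(fields).to_h
  end
end

csv_text = "a,b\n1,2\n3,4,5\n" # the last row has one field too many
p parse_rows(csv_text, truncate_ragged_lines: true)
```

With the bindings from this PR, the equivalent call would presumably be `Polars.read_csv("data.csv", truncate_ragged_lines: true)`, where the file name is a placeholder.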

@ankane
Owner

ankane commented Mar 22, 2024

Hi @jvdp, thanks for the PR. Happy to include, but please keep the arguments and code in the same order as py-polars for maintainability.

Review comment on lib/polars/lazy_frame.rb (outdated, resolved)
and set to true for LazyFrame._scan_csv
@ankane
Owner

ankane commented Mar 24, 2024

Thanks for updating. However, a lot of the code is still out of order (Rust code, Ruby code, @param docs).

Review comment on ext/polars/src/batched_csv.rs (outdated, resolved)
Review comment on lib/polars/data_frame.rb (outdated, resolved)
Review comment on lib/polars/io.rb (outdated, resolved)
@jvdp
Contributor Author

jvdp commented Mar 24, 2024

Fixed now! Some notes:

@ankane ankane merged commit 10bff31 into ankane:master Mar 24, 2024
1 check passed
@ankane
Owner

ankane commented Mar 24, 2024

Looks great, thanks! Will sync the existing code in a follow-up commit.
