feat: add dataset-level stats #2297

Merged
merged 11 commits into from
Nov 19, 2024

Commits on Nov 18, 2024

  1. feat: add dataset-level stats

    resolves #2288
    jqnatividad committed Nov 18, 2024
    d9bc2a5
  2. 6ca0100
  3. chore: apply clippy suggestions

    warning: usage of `FromIterator::from_iter`
       --> src/cmd/stats.rs:789:27
        |
    789 |                 work_br = csv::ByteRecord::from_iter(vec![&*header].into_iter().chain(stat));
        |                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: use `.collect()` instead of `::from_iter()`: `vec![&*header].into_iter().chain(stat).collect::<csv::ByteRecord<_>>()`
        |
        = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#from_iter_instead_of_collect
        = note: `-W clippy::from-iter-instead-of-collect` implied by `-W clippy::pedantic`
        = help: to override `-W clippy::pedantic` add `#[allow(clippy::from_iter_instead_of_collect)]`
    
    warning: implicitly cloning a `ByteRecord` by calling `to_owned` on its dereferenced type
       --> src/cmd/stats.rs:803:31
        |
    803 |             stats_br_vec.push(dataset_stats_br.to_owned());
        |                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: consider using: `dataset_stats_br.clone()`
        |
        = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#implicit_clone
        = note: `-W clippy::implicit-clone` implied by `-W clippy::pedantic`
        = help: to override `-W clippy::pedantic` add `#[allow(clippy::implicit_clone)]`
    
    warning: implicitly cloning a `ByteRecord` by calling `to_owned` on its dereferenced type
       --> src/cmd/stats.rs:812:31
        |
    812 |             stats_br_vec.push(dataset_stats_br.to_owned());
        |                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: consider using: `dataset_stats_br.clone()`
        |
        = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#implicit_clone
    
    warning: implicitly cloning a `ByteRecord` by calling `to_owned` on its dereferenced type
       --> src/cmd/stats.rs:825:31
        |
    825 |             stats_br_vec.push(dataset_stats_br.to_owned());
        |                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: consider using: `dataset_stats_br.clone()`
        |
        = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#implicit_clone
    jqnatividad committed Nov 18, 2024
    8fae48f
  4. 68f3830
  5. 3bb256d
  6. 85c330b
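
The two clippy lints quoted in commit 3 can be illustrated with a minimal sketch. It assumes a hypothetical `Record` type standing in for `csv::ByteRecord`, so that it compiles with the standard library alone; `with_header` and `push_copy` are illustrative names, not functions from `src/cmd/stats.rs`.

```rust
/// Hypothetical stand-in for `csv::ByteRecord`: an ordered list of byte fields.
#[derive(Clone, Debug, PartialEq)]
struct Record {
    fields: Vec<Vec<u8>>,
}

impl<'a> FromIterator<&'a [u8]> for Record {
    fn from_iter<I: IntoIterator<Item = &'a [u8]>>(iter: I) -> Self {
        Record {
            fields: iter.into_iter().map(|field| field.to_vec()).collect(),
        }
    }
}

/// Build a record whose first field is `header`, followed by `stats`.
fn with_header<'a>(header: &'a [u8], stats: &[&'a [u8]]) -> Record {
    // clippy::from_iter_instead_of_collect prefers `.collect()` here
    // over spelling out `Record::from_iter(...)`.
    std::iter::once(header)
        .chain(stats.iter().copied())
        .collect()
}

/// Append a copy of `record` to `out`.
fn push_copy(out: &mut Vec<Record>, record: &Record) {
    // clippy::implicit_clone prefers `clone()` over `record.to_owned()`;
    // both yield an owned `Record`, but `clone()` states the intent.
    out.push(record.clone());
}
```

Both rewrites are behavior-preserving: `collect()` dispatches to the same `FromIterator` implementation, and `to_owned()` on a `Clone` type simply calls `clone()`.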

Commits on Nov 19, 2024

  1. refactor: get_stats_records helper to ignore dataset-level stats and use simd_json instead of serde_json
    jqnatividad committed Nov 19, 2024
    12bdfff
  2. refactor: get_stats_records - reduce cloning; align stats.jsonl loading approach
    
    also refactor csv_to_jsonl to pass output_jsonl by reference instead of by value
    jqnatividad committed Nov 19, 2024
    642fd74
  3. 0b072c8
  4. refactor: use qsv__ instead of _qsv_ as prefix for qsv dataset level objects
    
    so as not to trigger select, where a name starting with _ is a sentinel for the last column
    jqnatividad committed Nov 19, 2024
    b05c576
  5. tests: adjust index and json tests to account for new dataset-level stats
    
    also add assert_success to select, frequency and tojsonl tests to help in debugging
    jqnatividad committed Nov 19, 2024
    672796e
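
The Nov 19 commits describe ignoring dataset-level stats in `get_stats_records` and marking dataset-level objects with the `qsv__` prefix. As a rough sketch of how such a prefix check might look, the helpers below filter field names by that prefix. `is_dataset_level` and `column_level_only` are hypothetical names, and the sample field names in the usage note are invented; none of this is taken from the PR's actual code.

```rust
/// Prefix named in the commit title for dataset-level objects.
const DATASET_PREFIX: &str = "qsv__";

/// Hypothetical helper: true when a field name marks a dataset-level entry
/// rather than a per-column one.
fn is_dataset_level(field_name: &str) -> bool {
    field_name.starts_with(DATASET_PREFIX)
}

/// Hypothetical helper: keep only per-column field names, preserving order.
fn column_level_only<'a>(field_names: &[&'a str]) -> Vec<&'a str> {
    field_names
        .iter()
        .copied()
        .filter(|name| !is_dataset_level(name))
        .collect()
}
```

For example, `column_level_only(&["name", "qsv__example", "age"])` keeps `"name"` and `"age"`. A name such as `_qsv_example` would not match the new prefix, which is consistent with the stated reason for the rename: a leading `_` has its own meaning in select.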