Add fnames_per_batch argument to HDF5Dataset #191

Merged on Jan 31, 2025 (5 commits)


Collaborator

@EthanMarx EthanMarx commented Jan 30, 2025

Adds an argument for specifying the number of files to sample per batch. If left as None, it defaults to all available fnames.
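
To make the described behaviour concrete, here is a minimal sketch of per-batch file selection. It is not the code from this PR: `sample_fnames` is a hypothetical helper, and sampling without replacement is an assumption, since the description above only states the None default.

```python
import random


def sample_fnames(fnames, fnames_per_batch=None, rng=None):
    """Hypothetical helper: choose the files that one batch is drawn from."""
    if rng is None:
        rng = random.Random()
    # None means "use every available file", as the PR description states
    if fnames_per_batch is None:
        fnames_per_batch = len(fnames)
    # Assumption: files are drawn without replacement, so each is distinct
    return rng.sample(list(fnames), fnames_per_batch)
```

With three files and fnames_per_batch=2, each call returns two distinct file names; with the default of None, all three are returned in random order.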

@wbenoit26
Contributor

Would be good to have a check that fnames_per_batch isn't greater than len(fnames)
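
A check along these lines could be sketched as follows. The function name `validate_fnames_per_batch` and the error message are hypothetical, chosen for illustration, and are not taken from the merged code.

```python
def validate_fnames_per_batch(fnames, fnames_per_batch):
    """Hypothetical check: resolve the None default and reject oversized values."""
    # None falls back to all available files
    if fnames_per_batch is None:
        return len(fnames)
    # Cannot sample more distinct files than exist
    if fnames_per_batch > len(fnames):
        raise ValueError(
            f"fnames_per_batch ({fnames_per_batch}) cannot be greater than "
            f"the number of files ({len(fnames)})"
        )
    return fnames_per_batch
```

Raising at construction time, rather than when the first batch is sampled, would surface a misconfiguration before any data is loaded.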

Contributor

@deepchatterjeeligo deepchatterjeeligo left a comment

Looks good. One (nitpick) comment: rather than fnames_per_batch, use n_files_per_batch or num_files_per_batch. However, it's up to you.

@EthanMarx
Collaborator Author

Both comments addressed

@wbenoit26
Contributor

You've got a few print statements in there. We should add that print-removal pre-commit hook to this repo.

@EthanMarx
Collaborator Author

Done. Added that hook as well.


Coverage report

Columns: File, Statements, Missing, Coverage, Coverage (new stmts), Lines missing
ml4gw/dataloading/hdf5_dataset.py: lines missing: 89
Project Total

This report was generated by python-coverage-comment-action

@EthanMarx EthanMarx merged commit b0548be into ML4GW:main Jan 31, 2025
6 checks passed

3 participants