bug
Something isn't working
corpus
Corpus-level filtering (i.e., a sample cannot be processed in isolation)
data catalog
Gathering data from data sources
data format
Convert data format
documentation
Improvements or additions to documentation
duplicate
This issue or pull request already exists
enhancement
New feature or request
evaluation
filter
good first issue
Good for newcomers
help wanted
Extra attention is needed
invalid
This doesn't seem right
language modeling script
Needs a language modeling loading script
metadata
need custodian permission
need data sourcing feedback
question
Further information is requested
tokenizer
wontfix
This will not be worked on