Closes part of #682 #684
Updated to `bigbio_pairs` schema. Passes all tests.
@phlobo Please have a look at this dataset. I refactored the implementation to the new HF-hub-based style. I'm not completely happy with the modeling: essentially it's aspect-based text classification rather than text similarity. However, the former isn't properly supported by any of the BigBio schemas.
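To illustrate the modeling concern, here is a minimal sketch of how an aspect-labelled text might be squeezed into a pairs-style record. The field names (`id`, `document_id`, `text_1`, `text_2`, `label`) reflect the `bigbio_pairs` schema as commonly described and should be checked against the BigBio schema definitions. The helper `to_pairs_example`, the `PairsExample` class, and the sample values are hypothetical and are not taken from this PR.

```python
from dataclasses import asdict, dataclass


@dataclass
class PairsExample:
    # Assumed field names for a pairs-style record; verify against BigBio.
    id: str
    document_id: str
    text_1: str
    text_2: str
    label: str


def to_pairs_example(idx, text, aspect, label):
    """Hypothetical helper: put the text in text_1 and the aspect in text_2."""
    example = PairsExample(
        id=str(idx),
        document_id=str(idx),
        text_1=text,    # the passage being classified
        text_2=aspect,  # the aspect the label refers to
        label=label,    # class label for this (text, aspect) combination
    )
    return asdict(example)


# Placeholder values, invented purely for illustration.
record = to_pairs_example(0, "Example sentence mentioning drug_a.", "drug_a", "example_label")
print(record)
```

The second slot carries an aspect rather than a comparable text, which is why the result reads as classification rather than similarity.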
I would pick a different task, but apart from that, the PR looks good!
@mariosaenger I realized the old version of the dataset is already on the hub: https://huggingface.co/datasets/bigbio/medical_data
medical_data updated to `bigbio_pairs` schema. Passes all tests.
- Dataloader script at `biodatasets/my_dataset/my_dataset.py` (please use only lowercase and underscore for dataset naming).
- Values for the `_CITATION`, `_DATASETNAME`, `_DESCRIPTION`, `_HOMEPAGE`, `_LICENSE`, `_URLs`, `_SUPPORTED_TASKS`, `_SOURCE_VERSION`, and `_BIGBIO_VERSION` variables.
- `_info()`, `_split_generators()` and `_generate_examples()` in dataloader script.
- `BUILDER_CONFIGS` class attribute is a list with at least one `BigBioConfig` for the source schema and one for a bigbio schema.
- Dataloader script works with the `datasets.load_dataset` function.
- Dataloader script passes `python -m tests.test_bigbio biodatasets/my_dataset/my_dataset.py`.
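The `BUILDER_CONFIGS` requirement can be sketched as a small self-check. This is only an illustration under stated assumptions: `ConfigStub` is a hypothetical stand-in for `BigBioConfig` (the real class lives in the BigBio codebase and takes more arguments), the convention that bigbio schema names start with `bigbio_` is assumed from the `bigbio_pairs` name used in this PR, and the config names are placeholders.

```python
from dataclasses import dataclass


@dataclass
class ConfigStub:
    """Hypothetical stand-in for BigBioConfig, reduced to two fields."""
    name: str
    schema: str


def has_required_configs(configs):
    """Return True if there is a source config and at least one bigbio config."""
    schemas = [config.schema for config in configs]
    has_source = "source" in schemas
    # Assumption: bigbio schema names share the "bigbio_" prefix.
    has_bigbio = any(schema.startswith("bigbio_") for schema in schemas)
    return has_source and has_bigbio


# Placeholder config names following the my_dataset naming in the checklist.
BUILDER_CONFIGS = [
    ConfigStub(name="my_dataset_source", schema="source"),
    ConfigStub(name="my_dataset_bigbio_pairs", schema="bigbio_pairs"),
]

print(has_required_configs(BUILDER_CONFIGS))
```

The actual validation is done by the test suite invoked with `python -m tests.test_bigbio`; the sketch only mirrors the one checklist item in plain Python.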