Add dataset: odeuropa_benchmarks_and_corpora #54

davanstrien · 2022-07-14T13:23:52Z

A URL for this dataset

https://github.com/Odeuropa/benchmarks_and_corpora

Dataset description

This dataset contains the annotations related to olfactory information from the benchmark created for the ODEUROPA project.
For 7 languages we selected a pool of documents covering different time periods (from 1620 to 1925) and topics (e.g. medicine, law, literature).

This is an exciting dataset of annotations related to olfactory (smell) information in historical documents. The dataset is interesting because it covers a range of time periods and also offers the possibility of utilising ML for a task other than standard entity recognition.

Dataset modality

Text

Dataset licence

Other license

Other licence

No response

How can you access this data

As a download from a repository/website

Confirm the dataset has an open licence

To the best of my knowledge, this dataset is accessible via an open licence

Contact details for data custodian

No response

davanstrien · 2022-07-14T13:26:39Z

I am clarifying the licence for this (see Odeuropa/benchmarks_and_corpora#3), so I would hold off working on this until we've got that info back.

davanstrien added the candidate-dataset (Proposed dataset to be added) label Jul 14, 2022
