Commit

Merge pull request #7 from jon-fuller-ukhsa/main
fix hyperlink
harrygcoppock authored Jun 12, 2023
2 parents cb1e107 + 639dd3b commit a4efeb3
Showing 1 changed file with 1 addition and 2 deletions.
3 changes: 1 addition & 2 deletions README.md
@@ -41,8 +41,7 @@ If you are on macOS please add the flag ```--platform=linux/amd64```

### The UK COVID-19 Vocal Audio Dataset
The full UK COVID-19 Vocal Audio Dataset is not publicly available because it is classed as 'Special Category Personal Data'. Access may be requested from UKHSA ([email protected]) and will be granted subject to approval and a data sharing contract. To learn how to apply for UKHSA data, visit:
-[https://www.gov.uk/government/publications/accessing-ukhsa-protected-data/accessing-ukh]{https://www.gov.uk/government/publications/accessing-ukhsa-protected-data/accessing-ukhsa-protected-data}
-
+[https://www.gov.uk/government/publications/accessing-ukhsa-protected-data/accessing-ukhsa-protected-data](https://www.gov.uk/government/publications/accessing-ukhsa-protected-data/accessing-ukhsa-protected-data)

We understand that this might not be practical for many users interested in our work, so we have created a new curated dataset which has been classed as 'Open Access' data (it will be available through a download link which anyone can use, without needing to register). To achieve this, the 'sentence' modality has been removed, leaving the 'cough', 'three cough' and 'exhalation' modalities. In addition, to meet open access requirements, selected metadata attributes have been aggregated (to prevent groups of fewer than 3 individuals being singled out by a selection of attributes). This means that neither the 'sentence' modality results nor the creation of the train-test splits can be replicated with the open access data. This applies only to the open access version of the data; our full stack is replicable with the original dataset, which can be accessed by following the instructions above. We provide the train-test splits in _.csv_ form so that the machine learning experiments can be replicated with the open access data. The open access dataset has been created but is awaiting final UKHSA approval before we upload it to Zenodo.
