Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More additional tests on data in regional data #312

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

RichardMN
Copy link
Collaborator

Implementation of more tests from #302, intended to be merged into #307

Checks that the number of regions at level 2 >= the number of regions at level 1
Checks that there is at most 1 level 1 region coded to NA
Checks that there is at most 1 level 2 region coded to NA
Checks that the number of level 1 regions is identical in download of level 1 and level 2 dat

This is a rough implementation.

It currently has a stand-alone wrapper to apply purrr because it only is run once for each country class, but then needs to be told what is the maximum level available from that country class. Notionally, this could allow us to go beyond level 2, but the code doesn't have that recursive flexibility yet. It may be possible to merge the wrapper with the existing file. I've removed the download option because these tests only make sense with download = TRUE.

The files and possibly the functions probably should be renamed.

It calls get_regional_data and I think it doing this since this is the "front end" for most users and there could still be glitches between the data class and the output of get_regional_data.

It's not quick.

joseph-palmer and others added 3 commits April 23, 2021 11:27
Checks that the number of regions at level 2 >= the number of regions at level 1
Checks that there is at most 1 level 1 region coded to NA
Checks that there is at most 1 level 2 region coded to NA
Checks that the number of level 1 regions is identical in download of level 1 and level 2 dat
@seabbs seabbs mentioned this pull request Jun 4, 2021
@RichardMN
Copy link
Collaborator Author

It's now hard to see from the comparison against the older version of the branch what this changes, but the changes are fairly minor. I think it probably useful to have roughly three separate strands of tests (which may not all be framed as tests):

  1. unit tests - does the code do what we expect it to do under various (mostly canned) inputs, working mostly at small unit levels (we have this)
  2. data availability tests - does our system successfully extract "some data" from the sources (we have this and run it nightly and I like the dashboard)
  3. data sanity checks - does the data that our system pulls make some sort of sense, and has there possibly been some change to the underlying source which doesn't break the data availability test but may mean the information we are providing isn't reliable, whether for new data, or for older data, or for both

I think this PR and #307 are both trying to provide something like 3, and that it's useful to have. It's sort of a canary.

It's not necessary and we've done without it for a long time. I can tinker with my PR and am happy to try to work with @joseph-palmer on #307, but also content if the feeling right now is that this can be put aside in favour of other efforts.

@github-actions
Copy link

github-actions bot commented Aug 7, 2021

This PR has been flagged as stale due to lack of activity

@seabbs seabbs changed the base branch from additional_tests to master February 4, 2022 15:03
@Bisaloo Bisaloo self-assigned this Feb 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants