add_data.sh - verify checksum of downloaded files #83

Open
anthonyfok opened this issue Apr 20, 2021 · 2 comments
Labels
Enhancement New feature or request Task

@anthonyfok
Member

As a follow-up to Issue #66 / PR #82, it would be nice if add_data.sh could:

  • Verify checksum of downloaded files
  • Check if a pre-compressed git repo exists, and if not, fall back to the original Git LFS files
  • Allow repeated runs of add_data.sh, i.e. try to resume an interrupted download. (Do we need to think about clearing and re-importing the PostGIS database? Probably outside the scope of this issue.)
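A minimal sketch of how the checksum and repeated-run items might fit together, assuming GNU coreutils `sha256sum` and `curl` are available. The helper names `verify_sha256` and `fetch_verified` are hypothetical and are not part of add_data.sh; the expected digest would come from the Git LFS pointer.

```shell
# Hypothetical helpers, for illustration only (not part of add_data.sh).

# Return 0 if FILE exists and its SHA-256 digest equals EXPECTED (hex).
verify_sha256() {
  local file="$1" expected="$2" actual
  [ -f "$file" ] || return 1
  # Reading from stdin keeps the file name out of sha256sum's output.
  actual=$(sha256sum < "$file" | cut -d' ' -f1) || return 1
  # Accept upper-case digests by normalizing to lower case.
  expected=$(printf '%s' "$expected" | tr 'A-F' 'a-f')
  [ "$actual" = "$expected" ]
}

# Download URL to FILE unless a verified copy is already present.
fetch_verified() {
  local url="$1" file="$2" expected="$3"
  if verify_sha256 "$file" "$expected"; then
    echo "already verified: $file"
    return 0
  fi
  # --continue-at - asks curl to resume from the size of any partial file.
  curl --fail --location --continue-at - --output "$file" "$url" || return 1
  verify_sha256 "$file" "$expected"
}
```

Checking the digest before calling curl means a second run skips files that are already complete instead of asking the server to resume past the end of the file. A partial file that is corrupt rather than merely truncated would still fail the final check and would need to be deleted and fetched again.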

Previous notes (copied from #66):

  • Check for type: file vs dir
  • --no-act or --dry-run or something similar for add_data.sh to "predict" how many files (and how much data) would be downloaded before downloading.
  • Get sha256 checksum (Git LFS only)
$ jq -r '.content' response.json | base64 -d
version https://git-lfs.github.com/spec/v1
oid sha256:7d2487268dad05f83a0c5be6dcf90419e0bb53cf34706362baae21ac7841c2b5
size 234453739
  • Use a repo for compressed cached copies?
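As a rough illustration of the "Get sha256 checksum" note, the pointer text printed by the `jq | base64 -d` pipeline above could be parsed with awk. The function names `lfs_oid` and `lfs_size` are hypothetical; both read a Git LFS pointer on standard input.

```shell
# Hypothetical helpers: read a Git LFS pointer on stdin.

# Print the SHA-256 digest from the "oid sha256:<hex>" line.
lfs_oid() {
  awk '$1 == "oid" { sub(/^sha256:/, "", $2); print $2; exit }'
}

# Print the object size in bytes from the "size <n>" line.
lfs_size() {
  awk '$1 == "size" { print $2; exit }'
}
```

Usage might look like `jq -r '.content' response.json | base64 -d | lfs_oid`. Summing the `lfs_size` values over all pointers would be one way to estimate the total download for a --dry-run style option.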
@anthonyfok anthonyfok added this to the Sprint 32 milestone Apr 20, 2021
@anthonyfok anthonyfok self-assigned this Apr 20, 2021
@anthonyfok
Member Author

For (my) reference, curl author Daniel Stenberg explains the retry options in depth:
https://daniel.haxx.se/blog/2020/03/24/curl-ootw-retry-max-time/

--retry-connrefused may be a good candidate to add too. It was added in curl 7.52.0, so the curl included in Ubuntu 18.04 and above is new enough.
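A sketch of how these retry options might be combined in one call. The wrapper name `download_with_retry` and the numeric values (5 retries, 5-second delay, 120-second cap) are assumptions for illustration, not settings taken from add_data.sh.

```shell
# Hypothetical wrapper; the retry counts and times are placeholder values.
download_with_retry() {
  local url="$1" out="$2"
  # --retry:             retry transient failures up to 5 times
  # --retry-delay:       wait 5 seconds between attempts
  # --retry-max-time:    stop retrying once 120 seconds have passed
  # --retry-connrefused: also treat "connection refused" as transient
  curl --fail --location \
       --retry 5 \
       --retry-delay 5 \
       --retry-max-time 120 \
       --retry-connrefused \
       --output "$out" "$url"
}
```

Without --retry-connrefused, a refused connection is not counted as a transient error, so curl would give up on the first attempt.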

@anthonyfok
Member Author

Idea: Use git partial clone and sparse-checkout to get just the pointers to *.csv in the original repo to quickly get the SHA256 checksums. Something like the following:

git clone --filter=blob:none --sparse git@github.com:OpenDRR/openquake-inputs.git
cd openquake-inputs
git sparse-checkout add '*.csv'
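Once the pointers are checked out, a checksum list could be built from them along these lines. This is only a sketch: `lfs_manifest` is a hypothetical name, and it assumes the working tree holds LFS pointer files rather than the real data (for example, if git-lfs is installed, by cloning with `GIT_LFS_SKIP_SMUDGE=1` set so that the large files are not fetched).

```shell
# Hypothetical helper: print "<sha256>  <path>" for each Git LFS pointer
# among the *.csv files under DIR (default: current directory).
lfs_manifest() {
  local dir="${1:-.}"
  find "$dir" -name .git -prune -o -type f -name '*.csv' -print |
  while IFS= read -r path; do
    # A pointer file starts with the LFS spec version line; skip real CSVs.
    if head -n 1 "$path" | grep -q '^version https://git-lfs.github.com/spec/v1$'; then
      oid=$(awk '$1 == "oid" { sub(/^sha256:/, "", $2); print $2; exit }' "$path")
      printf '%s  %s\n' "$oid" "$path"
    fi
  done | sort -k2
}
```

The output uses the same two-space layout that `sha256sum -c` reads, so it could be used to check downloaded files, provided the paths in the list match where the files are saved.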

@anthonyfok anthonyfok modified the milestones: Sprint 34, Sprint 35 May 25, 2021
anthonyfok added commits that referenced this issue on May 25, Jun 3 (twice), Jun 7, and Jun 14, 2021, each with the same message:
Also, upgrade git to the latest version (2.32.0.rc0 as of this writing)
because "git checkout" for model-inputs got stuck with git 2.28.

See #83
@anthonyfok anthonyfok modified the milestones: Sprint 35, Sprint 36 Jun 14, 2021
@anthonyfok anthonyfok modified the milestones: Sprint 36, Sprint 39 Jul 15, 2021
@anthonyfok anthonyfok modified the milestones: Sprint 39, Sprint 40 Aug 5, 2021
@anthonyfok anthonyfok modified the milestones: Sprint 40, Sprint 41, Sprint 42 Sep 9, 2021
@anthonyfok anthonyfok modified the milestones: Sprint 42, Sprint 43 Sep 23, 2021
@anthonyfok anthonyfok modified the milestones: Sprint 43, Sprint 44 Oct 13, 2021
@anthonyfok anthonyfok modified the milestones: Sprint 44, Sprint 45 Oct 25, 2021
@anthonyfok anthonyfok modified the milestones: Sprint 45, Sprint 46 Nov 8, 2021
@anthonyfok anthonyfok modified the milestones: Sprint 46, Sprint 47 Nov 22, 2021
@anthonyfok anthonyfok modified the milestones: Sprint 47, Sprint 50 Jan 17, 2022
@anthonyfok anthonyfok modified the milestones: Sprint 50, Sprint 52 Feb 15, 2022
@anthonyfok anthonyfok modified the milestones: Sprint 52, Sprint 53 Feb 28, 2022
@anthonyfok anthonyfok modified the milestones: Sprint 53, Sprint 54 Mar 14, 2022
@anthonyfok anthonyfok modified the milestones: Sprint 54, Sprint 55 Mar 25, 2022
@drotheram drotheram modified the milestones: Sprint 55, Sprint 56 Apr 29, 2022