-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check MD5 hash before downloading #267
base: main
Are you sure you want to change the base?
Conversation
component-download-omim.owl: | $(TMPDIR) | ||
echo couldn't find a way that worked | ||
|
||
# failed attempts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
edit: got something working; see below
I had a feeling this first attempt using ShellScript wouldn't work out. I'm not great at ShellScript, much less nesting commands, much less within make
. I spent a little over an hour trying many variations.
I think I could do this if I was able to refactor the way we're downloading all of our components, perhaps (a) by splitting into multiple goals / introducing additional variables and/or temp files. Even easier if (b) leveraging Python. But, especially if that involves changing ODK, I can't see those methods being approved.
Probably best to close this and not worry about it for now, unless you want to take a stab or have me try out one of the above proposals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@matentzn Some success!
I should have realized earlier that this was the perfect use case for ChatGPT. Although what it suggested was not different, than one variation I saw on StackOverflow, for whatever reason I copied what it had literally this time. My problem when tried this earlier was that I included [
brackets]
around my if
statement, copying what we are doing with with a lot of our component goals. When I removed the brackets as ChatGPT (and StackOverflow) suggested, though, it worked.
That is, this failed:
if [ diff $(TMPDIR)/tmp.md5 $(TMPDIR)/omim.ttl.md5 >/dev/null 2>&1 ]; then ...
But this succeeded:
if diff $(TMPDIR)/tmp.md5 $(TMPDIR)/omim.ttl.md5 >/dev/null 2>&1; then ...
More specifically, this works:
component-download-omim.owl: | $(TMPDIR)
if [ $(MIR) = true ] && [ $(COMP) = true ]; then wget $(OMIM_URI).md5 -O $(TMPDIR)/tmp.md5; \
if diff $(TMPDIR)/tmp.md5 $(TMPDIR)/omim.ttl.md5 >/dev/null 2>&1; \
then echo same; else echo different; fi; fi
This is why I don't like ShellScript for anything more than a few lines; it's so arcane. Even something as short as this trips me up.
Now that we have a working proof of concept, need to decide what to do next.
One thing we need to do is keep a local copy of these raw downloads, e.g. omim.ttl
, or at least keep its MD5 the last time it ran, e.g. $(TMPDIR)/omim.ttl.md5
. Thus, we would need to change it so that robot
doesn't take as input the URLs of these artefacts (e.g. $(ROBOT) merge -I https://github.com/monarch-initiative/omim/releases/latest/download/omim.ttl
; we'd have to download it first, if that's alright with you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets discuss in next call! the principle sounds great, but I would like it to be explained once face2face
…l copy of the hash the last time the file was downloaded, then doing something if they are different.
a5040b3
to
b3517e4
Compare
747fceb
to
01b3bbc
Compare
@twhetzel FYI this is something I'd like to do at some point that would help reduce |
Overview
Proof of concept: downloading an MD5 file and checking against a local copy of the hash the last time the file was downloaded, then doing something if they are different.
Addresses
Addresses no GitHub issue yet created. However, addresses the problem of our pipeline needlessly spending time downloading files even when they have not changed.