Fetching file from GitHub
Anthony Fok edited this page May 14, 2021 · 2 revisions
(TODO)
Related to:
- fetch_csv, fetch_csv_xz and fetch_csv_lfs in python/add_data.sh
- Issue #91: Create XZ-compressed Git repos and download from them
fetch_csv_xz:
- Non-LFS files: large blobs that are stored within the Git repo itself.
- Use the GitHub API to fetch the directory listing rather than the file itself, then take the download_url link from the listing.
- FIXME https://raw.githubusercontent.com/userName/repository/master/file.mp4
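The approach in the bullets above can be sketched in shell as follows. This is a minimal sketch under stated assumptions, not the actual fetch_csv_xz from python/add_data.sh: the function names `get_download_url` and `fetch_via_download_url` and their argument order are hypothetical, and the sketch assumes `jq`, `curl` and a `GITHUB_TOKEN` environment variable are available.

```shell
#!/usr/bin/env bash
# Sketch only: function names and argument order are hypothetical.

# Print the download_url of NAME from a saved directory listing (a JSON array).
# Prints nothing if NAME is absent or has a null download_url (e.g. a directory).
get_download_url() {
  local listing=$1 name=$2
  jq -r --arg name "$name" \
    '.[] | select(.name == $name) | .download_url // empty' "$listing"
}

# Fetch the directory listing for OWNER/REPO/DIR at REF, then download NAME
# through its download_url instead of requesting the file via the API.
fetch_via_download_url() {
  local owner=$1 repo=$2 dir=$3 ref=$4 name=$5
  local listing url
  listing=$(mktemp) || return 1
  curl -fsSL -H "Authorization: token ${GITHUB_TOKEN}" \
    -o "$listing" \
    "https://api.github.com/repos/${owner}/${repo}/contents/${dir}?ref=${ref}" \
    || { rm -f "$listing"; return 1; }
  url=$(get_download_url "$listing" "$name")
  rm -f "$listing"
  [ -n "$url" ] || { echo "no download_url for ${name}" >&2; return 1; }
  # For private repos the download_url already carries its own ?token= parameter.
  curl -fsSL -o "$name" "$url"
}
```

A call might look like `fetch_via_download_url OpenDRR model-inputs-xz social-vulnerability develop sovi_thresholds_2021.csv.xz`. The download yields the raw file bytes, so no base64 decoding is involved.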
fetch_csv_lfs:
Can we use the directory listing for fetch_csv_lfs as well? That would cover repos that keep some *.csv files on Git LFS and others directly in the repo.
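One possible way to handle such mixed repos, offered as an assumption rather than as what add_data.sh does: download every file the same way, then check whether the result is a Git LFS pointer file, and only then fall back to an LFS-specific download. A pointer file is a small text file whose first line starts with `version https://git-lfs.github.com/spec/v1`. The helper name `is_lfs_pointer` is hypothetical, and the sketch assumes the plain download returns the pointer for LFS-tracked files.

```shell
#!/usr/bin/env bash
# Sketch only: is_lfs_pointer is a hypothetical helper name.

# Return 0 if FILE looks like a Git LFS pointer rather than real content.
is_lfs_pointer() {
  local file=$1 first_line=""
  # Read only the first line; tolerate a missing trailing newline.
  IFS= read -r first_line < "$file" || [ -n "$first_line" ] || return 1
  case $first_line in
    "version https://git-lfs.github.com/spec/v1"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A caller could then branch with `if is_lfs_pointer "$file"; then ...; fi` to decide which download path a given file needs.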
{
"message": "This API returns blobs up to 1 MB in size. The requested blob is too large to fetch via the API, but you can use the Git Data API to request blobs up to 100 MB in size.",
"errors": [
{
"resource": "Blob",
"field": "data",
"code": "too_large"
}
],
"documentation_url": "https://developer.github.com/v3/repos/contents/#get-contents"
}
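A script can recognize the error response above and switch to another download path. The sketch below assumes the API response has been saved to a file. The function name `is_too_large_error` is hypothetical, and the check relies only on the `errors[].code` field shown in the response.

```shell
#!/usr/bin/env bash
# Sketch only: is_too_large_error is a hypothetical helper name.

# Return 0 if the saved API response FILE is a "too_large" error,
# non-zero for any other JSON (for example a normal directory listing).
is_too_large_error() {
  local response=$1
  # jq -e sets the exit status from the last output (false or null -> non-zero).
  jq -e 'try (.errors | any(.[]; .code == "too_large")) catch false' \
    "$response" >/dev/null 2>&1
}
```

For example, `is_too_large_error "$response" && echo "blob too large for the contents API"`.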
Deprecated: the git blob approach below. There is a better, more direct way to download the file without dealing with base64 decoding.
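Suppose the directory listing has been downloaded as github-api/social-vulnerability.dir.json. The commands after the following excerpt look up a file's git blob URL with jq, then fetch the blob and base64-decode it: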
[
{
"name": "sovi_thresholds_2021.csv.xz",
"path": "social-vulnerability/sovi_thresholds_2021.csv.xz",
"sha": "1e57fa65a807041a8fdd81793dc82965c7f873a3",
"size": 1020,
"url": "https://api.github.com/repos/OpenDRR/model-inputs-xz/contents/social-vulnerability/sovi_thresholds_2021.csv.xz?ref=develop",
"html_url": "https://github.com/OpenDRR/model-inputs-xz/blob/develop/social-vulnerability/sovi_thresholds_2021.csv.xz",
"git_url": "https://api.github.com/repos/OpenDRR/model-inputs-xz/git/blobs/1e57fa65a807041a8fdd81793dc82965c7f873a3",
"download_url": "https://raw.githubusercontent.com/OpenDRR/model-inputs-xz/develop/social-vulnerability/sovi_thresholds_2021.csv.xz?token=AAJXHDGVU75KUM5OMM6EIVTATF4D6",
"type": "file",
"_links": {
"self": "https://api.github.com/repos/OpenDRR/model-inputs-xz/contents/social-vulnerability/sovi_thresholds_2021.csv.xz?ref=develop",
"git": "https://api.github.com/repos/OpenDRR/model-inputs-xz/git/blobs/1e57fa65a807041a8fdd81793dc82965c7f873a3",
"html": "https://github.com/OpenDRR/model-inputs-xz/blob/develop/social-vulnerability/sovi_thresholds_2021.csv.xz"
}
}
]
$ jq -r '.[] | select(.name == "sovi_thresholds_2021.csv.xz") | ._links.git' github-api/social-vulnerability.dir.json
https://api.github.com/repos/OpenDRR/model-inputs-xz/git/blobs/1e57fa65a807041a8fdd81793dc82965c7f873a3
or
$ jq -r '.[] | select(.name == "sovi_thresholds_2021.csv.xz") | .git_url' github-api/social-vulnerability.dir.json
https://api.github.com/repos/OpenDRR/model-inputs-xz/git/blobs/1e57fa65a807041a8fdd81793dc82965c7f873a3
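# Inside a shell function: look up the git blob URL in the saved listing file ($response)
local blob=$(jq -r '.[] | select(.name == "sovi_thresholds_2021.csv.xz") | .git_url' "$response")
# The blob API returns JSON with base64-encoded content; decode it to recover the file
curl -H "Authorization: token ${GITHUB_TOKEN}" -L "$blob" | \
jq -r '.content' | base64 -d > sovi_thresholds_2021.csv.xz
ls -l sovi_thresholds_2021.csv.xz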