forked from ooni/ooni-sync
-
Notifications
You must be signed in to change notification settings - Fork 0
Fast downloader of measurements from the OONI measurements API
License
anadahz/ooni-sync
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
ooni-sync now supports downloading OONI reports from the AWS S3 bucket [ooni-data](https://registry.opendata.aws/ooni/). Please use the option `-s3-direct` to download OONI reports from the AWS S3 bucket. ATTENTION: OONI now offers a full database to quicky access all the data and the use of the OONI Sync tool is no longer recommended. For all the options available to access OONI data see: https://ooni.org/data/. Fast downloader of OONI reports using the OONI API. It works by downloading an index of available files, and only downloading the files that are not already present locally. You can run it again and again to keep a local directory up to date with newly published reports. Example usage: ooni-sync -s3-direct -xz -directory reports test_name=tcp_connect This command will create the directory "reports" if it doesn't exist, download all tcp_connect reports that are not already present in the directory, and compress the downloaded reports with xz. The query is composed of name=value pairs. Possible query parameters include: test_name=[name] probe_cc=[cc] probe_asn=AS[num] since=[yyyy-mm-dd] until=[yyyy-mm-dd] The value of "test_name" can be "tcp_connect", "web_connectivity", "ndt", etc. The query parameters "order_by", "order", "offset", and "limit" are used internally by the program and will be overridden if present. More documentation on the API is available from https://api.ooni.io/api/. Because the program only uses filename comparison to know whether a report file has already been downloaded, it is careful not to store a file under its final filename until it has been completely downloaded. It is safe to interrupt the program or run two copies simultaneously. To make your Python scripts capable of processing xz-compressed files, just call this function intead of open(filename): def open_magic(filename): if filename.endswith(".xz"): p = subprocess.Popen(["xz", "-d", "-c", "--", filename], stdout=subprocess.PIPE) return p.stdout else: return open(filename) Tip: to generate a URL that links to a specific measurement OONI Explorer, use this jq syntax: "https://explorer.ooni.torproject.org/measurement/\(.report_id|@uri)?input=\(.input|join(":")|@uri)" For example, here is how to generate a table from a directory full of reports: cat reports/*.json | jq -r '[.measurement_start_time,.probe_cc,.probe_asn,.test_name,"https://explorer.ooni.torproject.org/measurement/\(.report_id|@uri)?input=\(.input|join(":")|@uri)"]|@tsv' Or, if you used the -xz option: xz -dc reports/*.json.xz | jq -r '[.measurement_start_time,.probe_cc,.probe_asn,.test_name,"https://explorer.ooni.torproject.org/measurement/\(.report_id|@uri)?input=\(.input|join(":")|@uri)"]|@tsv' Contact: David Fifield <[email protected]>
About
Fast downloader of measurements from the OONI measurements API
Resources
License
Stars
Watchers
Forks
Packages 0
No packages published
Languages
- Go 98.6%
- Makefile 1.4%