OpenAQ Toolkit

Collection of user guides, tools, and links to resources for working with OpenAQ data.

- Resources
  - User Guides
  - Tools
  - Links
- Download OpenAQ archive data from S3 using awscli
- How big is the OpenAQ S3 bucket?
- Convert ndjson to InfluxDB line protocol format
- Convert CSV to InfluxDB line protocol format
- Contributing

Resources

User Guides

Tools

- openaq.org - The main OpenAQ website, with CSV download pages and the world pollutant map
- ropensci/ropenaq - R package for the OpenAQ API
- nickolasclarke/openaq - JavaScript client for the OpenAQ API
- dhhagan/py-openaq - Python wrapper for the OpenAQ API
- openaq-postman - Postman collections for working with the OpenAQ API
- jackkoppa/cityaq - Compare air quality for cities
- dolugen/openaq-browser - A web client for the OpenAQ API
- barronh/scrapenaq - Download and convert OpenAQ archived data with Pandas
- dolugen/openaq-swagger - OpenAPI v3 spec of the OpenAQ API
- dolugen/sns-s3-influxdb - Populate InfluxDB with air quality data

Links

OpenAQ on AWS - Information about OpenAQ's publicly available S3 bucket and SNS topic.

Download OpenAQ archive data from S3 using `awscli`

OpenAQ stores its measurement data in a publicly available S3 bucket. One way to download from the archive is with the `aws s3` command.

Prerequisites: you need a free AWS account, and awscli must be installed and configured.

Download a single file:

aws s3 cp s3://openaq-fetches/realtime-gzipped/2020-06-06/1591476667.ndjson .

Download all files for one day:

aws s3 cp s3://openaq-fetches/realtime-gzipped/2020-06-06/ . --recursive

To download the entire archive, go up one level and copy s3://openaq-fetches/realtime-gzipped/ with --recursive.

If you prefer not to use awscli, take a look at barronh/scrapenaq, which takes a scraping approach instead.
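If you need more than one day but less than the whole archive, the per-day command can be scripted. The following is a minimal sketch, not part of this repo: it assumes the date-named folder layout shown in the commands above, and the function names and the per-day local subfolder are illustrative choices.

```python
import subprocess
from datetime import date, timedelta

# Bucket and prefix taken from the awscli commands in this guide.
ARCHIVE_ROOT = "s3://openaq-fetches/realtime-gzipped"


def day_download_command(day, dest="."):
    """Build the `aws s3 cp` argument list that copies one day's folder."""
    source = f"{ARCHIVE_ROOT}/{day.isoformat()}/"
    # Assumption: keep each day in its own local subfolder under dest.
    target = f"{dest.rstrip('/')}/{day.isoformat()}/"
    return ["aws", "s3", "cp", source, target, "--recursive"]


def date_range(start, end):
    """Yield every date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def download_range(start, end, dest="."):
    """Run one recursive copy per day; needs awscli installed and configured."""
    for day in date_range(start, end):
        subprocess.run(day_download_command(day, dest), check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for day in date_range(date(2020, 6, 5), date(2020, 6, 7)):
        print(" ".join(day_download_command(day)))
```

Calling `download_range(date(2020, 6, 5), date(2020, 6, 7))` would run the three copies for real; the dry run is a cheap way to check the paths first.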

How big is the OpenAQ S3 bucket?

aws s3 ls --summarize --human-readable --recursive s3://openaq-fetches

As of June 2020, it's 323 GB.

Convert ndjson to InfluxDB line protocol format

The archive files in the S3 bucket are in ndjson (newline-delimited JSON) format: each line of a file is a separate, complete JSON object.
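Because every line stands alone, an ndjson file can be read one record at a time without loading the whole file. A minimal sketch of that idea follows; the sample records are made up for illustration, and their field names are assumptions rather than a copy of a real archive file.

```python
import io
import json


def read_ndjson(lines):
    """Yield one parsed object per non-empty line of an ndjson stream."""
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank lines, such as a trailing newline
            yield json.loads(line)


# Illustrative sample only: field names and values are assumptions.
SAMPLE = (
    '{"parameter": "pm25", "value": 12.5, "country": "MN"}\n'
    '{"parameter": "o3", "value": 0.03, "country": "US"}\n'
)

if __name__ == "__main__":
    for record in read_ndjson(io.StringIO(SAMPLE)):
        print(record["parameter"], record["value"])
```

Any iterable of text lines works as input, so an open file can be passed directly. If the downloaded files are gzip-compressed, `gzip.open(path, "rt")` from the standard library returns a suitable text stream.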

To convert these files to InfluxDB's line protocol, use the ndjson2lineprotocol.py script in this repo.

cat *.ndjson | ./ndjson2lineprotocol.py

The script outputs to standard output, so you may want to redirect it to a file.
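For readers unfamiliar with line protocol, here is a rough sketch of what such a conversion involves. It is not the code of ndjson2lineprotocol.py: the mapping (measurement from `parameter`, tags from `country` and `location`, a single `value` field) and the record field names, including a `date.utc` timestamp like `2020-06-06T20:00:00.000Z`, are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def escape(text, chars):
    """Backslash-escape each of the given characters in text."""
    for char in chars:
        text = text.replace(char, "\\" + char)
    return text


def to_nanoseconds(utc_text):
    """Convert a timestamp such as 2020-06-06T20:00:00.000Z to epoch nanoseconds."""
    parsed = datetime.strptime(utc_text, "%Y-%m-%dT%H:%M:%S.%fZ")
    elapsed = parsed.replace(tzinfo=timezone.utc) - EPOCH
    # Integer division avoids float rounding on large timestamps.
    return (elapsed // timedelta(microseconds=1)) * 1000


def to_line(record):
    """Render one record as: measurement,tags fields timestamp."""
    # Measurements escape commas and spaces; tags also escape equals signs.
    measurement = escape(record["parameter"], ", ")
    # Assumed schema: these two fields become tags, sorted by key.
    tags = {"country": record["country"], "location": record["location"]}
    tag_text = ",".join(
        f"{key}={escape(value, ',= ')}" for key, value in sorted(tags.items())
    )
    timestamp = to_nanoseconds(record["date"]["utc"])
    return f"{measurement},{tag_text} value={float(record['value'])} {timestamp}"


if __name__ == "__main__":
    # Made-up record for illustration only.
    sample = {
        "parameter": "pm25",
        "value": 12.5,
        "country": "MN",
        "location": "Station A",
        "date": {"utc": "2020-06-06T20:00:00.000Z"},
    }
    print(to_line(sample))
```

The space in the tag value is escaped because line protocol uses unescaped spaces to separate the tag set, field set, and timestamp.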

Convert CSV to InfluxDB line protocol format

In addition to the S3 option, you can filter and download data as CSV from the openaq.org website.

After downloading the CSV, feed the file to csv2lineprotocol.py like so:

cat openaq.csv | ./csv2lineprotocol.py
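The CSV route differs mainly in how rows are read. The sketch below shows the general shape using `csv.DictReader`; it is not the code of csv2lineprotocol.py, and the column names (`location`, `utc`, `parameter`, `value`), the timestamp format, and the chosen tag are assumptions about the export rather than a documented schema.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def escape_tag(text):
    """Backslash-escape commas, equals signs and spaces for use in a tag."""
    for char in (",", "=", " "):
        text = text.replace(char, "\\" + char)
    return text


def csv_to_lines(handle):
    """Yield one line protocol string per CSV row, keyed by header names."""
    for row in csv.DictReader(handle):
        parsed = datetime.strptime(row["utc"], "%Y-%m-%dT%H:%M:%S.%fZ")
        elapsed = parsed.replace(tzinfo=timezone.utc) - EPOCH
        nanoseconds = (elapsed // timedelta(microseconds=1)) * 1000
        yield (
            f"{escape_tag(row['parameter'])},location={escape_tag(row['location'])} "
            f"value={float(row['value'])} {nanoseconds}"
        )


# Made-up sample; the header names are an assumption about the export.
SAMPLE_CSV = (
    "location,utc,parameter,value\n"
    "Station A,2020-06-06T20:00:00.000Z,pm25,12.5\n"
)

if __name__ == "__main__":
    for line in csv_to_lines(io.StringIO(SAMPLE_CSV)):
        print(line)
```

Reading by header name rather than column position keeps the sketch working if the export adds or reorders columns.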

Contributing

Is something missing or in need of fixing? Please use the issues page to submit requests and ask questions. You can also create a pull request with your changes.
