Skip to content
This repository has been archived by the owner on Feb 1, 2022. It is now read-only.

Join API

Jonathan Speiser edited this page Oct 5, 2016 · 17 revisions

Contents

Let's breakdown an example join API data call:

https://api.datausa.io/api/join?required=income,adult_obesity&show=geo

In its most basic form, join API calls required two parameters: a show variable and a required variable. In this example we are showing "geo" (aka geographic locations) across all sumlevels (in the join API by default if you do not specify sumlevels it will default to "all"). Note that in the results not all variables are available at all sumlevels across tables.

To see only county-level data we could do:

https://api.datausa.io/api/join?required=income,adult_obesity&show=geo&sumlevel=county
parameter accepted values description
limit integer Limits the number of rows returned by the query. (default is 100)
order any available column name Column name to use for ordering the resulting data array.
show (required) any available [attribute](Attribute API) A comma-separated list of attributes to show in the query.
sort desc or asc Changes the sort order of the returned data array.
sumlevel any available sumlevel for the given attribute This restricts the data fetched to only display the specified sumlevel(s). If more than one "show" attribute is specified, sumlevel must be a comma-separated list with a value for each attribute. (default is "all" levels)
required (required) any available column name A comma-separated list of column names to be returned in the query.
where see documentation Advanced filtering of columns, similar to the WHERE clause on SQL.
year latest, oldest, all, 4-digit year Filters the returned data to the given year.
display_names 0 or 1 Show attribute names in response
auto_crosswalk 0 or 1 Perform an advanced auto-crosswalk join

Previously we showed an example Join API call.

https://api.datausa.io/api/join?required=income,adult_obesity&show=geo

You can also filter the results to only show data from the latest year from each table:

https://api.datausa.io/api/join?required=income,adult_obesity&show=geo&year=latest

This will pull the latest data available for each variable.

If you wanted to filter based on associated information (say, you didn't only want the latest year) you can also use the .: syntax as a filter in the where clause. For example:

https://api.datausa.io/api/join?required=income,adult_obesity&show=geo&where=income.year:2013,adult_obesity.year:2016

This query selects the data looking at where income in 2013 and adult obesity in 2016. For a full list of table variables see the variables endpoint.

Advanced Auto Crosswalking

By default, joins between tables (with the exception of years columns) look for exact matches. So if one dataset has data only at the county level and another has data only the place level, you would not be able to join counties and places. There are instances where you may want to do something like this. For these instances you may use the "auto_crosswalk=1" flag