Pushing actual hexagons page
bpstewar committed Dec 20, 2024
1 parent 84c0a5e commit 2635561
Showing 3 changed files with 122 additions and 0 deletions.
Binary file added docs/images/hexagon_neighbours.jpg
29 changes: 29 additions & 0 deletions docs/why_hexagons.md
@@ -0,0 +1,29 @@
# Space2Stats and Hexagons
Space2Stats aggregates data to a globally consistent hexagon grid, leveraging [Uber’s Hexagonal Hierarchical Spatial Index](https://www.uber.com/blog/h3/) ([GitHub link here](https://github.com/uber/h3)). Hexagons have several benefits over traditional grids, but they are not a common tool and are not an obvious choice to many geographers, let alone non-geospatial experts. This page explains the decision process that led us to hexagons.

## Consistent shapes over administrative divisions
The goal of the Space2Stats program is to produce a global database of geospatial aggregates at administrative level 2, so why do we first aggregate to a consistent grid before attaching the variables to the administrative boundaries? Simply put, administrative boundaries change frequently, and aggregating directly to them would require recalculating the entire database with every change. This changing landscape is well known, and many projects are attempting to collect updated administrative divisions:

1. [UN-SALB](https://salb.un.org/en): The Second Administrative Level Boundaries (SALB) programme objective is to promote accessible, interoperable and global data and information on subnational units and boundaries, or common geographies, for better decisions, stronger support to people and planet and to monitor the Sustainable Development Goals.
2. [GeoBoundaries](https://www.geoboundaries.org/): The geoBoundaries Global Database of Political Administrative Boundaries is an online, open license (CC BY 4.0) resource of information on administrative boundaries (i.e., state, county) for every country in the world. Since 2016, the project has tracked approximately 1 million boundaries within over 200 entities, including all UN member states.
3. [FAO GAUL](https://developers.google.com/earth-engine/datasets/catalog/FAO_GAUL_2015_level0): While this is a commonly used dataset, it is no longer being updated.
4. [GADM](https://gadm.org/index.html): GADM provides maps and spatial data for all countries and their sub-divisions.

The Space2Stats program will also publish a global database of administrative boundaries at levels 0, 1, and 2, in order to comply with the World Bank's strict legal requirements on international boundaries.
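
As a minimal sketch of why a fixed grid insulates the database from boundary changes, the example below sums per-hexagon values into administrative units through a lookup table. The hexagon ids, district names, and the `aggregate_to_admin` helper are hypothetical and are not part of the Space2Stats codebase. When boundaries change, only the lookup table is replaced; the per-hexagon values are left untouched.

```python
from collections import defaultdict

# Hypothetical per-hexagon values (hexagon id -> population); ids are made up.
hex_population = {"hex_a": 120, "hex_b": 80, "hex_c": 200, "hex_d": 50}


def aggregate_to_admin(hex_values, hex_to_admin):
    """Sum hexagon values into administrative units using a lookup table."""
    totals = defaultdict(int)
    for hex_id, value in hex_values.items():
        totals[hex_to_admin[hex_id]] += value
    return dict(totals)


# Boundary vintage 1: two districts.
lookup_v1 = {
    "hex_a": "district_1",
    "hex_b": "district_1",
    "hex_c": "district_2",
    "hex_d": "district_2",
}
# Boundary vintage 2: district_2 has been split. Only the lookup changes.
lookup_v2 = {
    "hex_a": "district_1",
    "hex_b": "district_1",
    "hex_c": "district_2",
    "hex_d": "district_3",
}

totals_v1 = aggregate_to_admin(hex_population, lookup_v1)
totals_v2 = aggregate_to_admin(hex_population, lookup_v2)
print(totals_v1)  # {'district_1': 200, 'district_2': 250}
print(totals_v2)  # {'district_1': 200, 'district_2': 200, 'district_3': 50}
```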

## Hexagons vs Grids
Once we acknowledge the necessity of a standard grid for aggregation, why did we choose hexagons over a square grid? There are two principal advantages to using hexagons:
1. Hexagons have a roughly consistent area across the globe, unlike latitude/longitude grids, whose cells narrow as you move north or south from the equator.
2. Hexagons have more consistent neighbour calculations than squares or triangles: all six neighbours of a hexagon share an edge with it and lie at the same distance from its centre.
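
To make the first point concrete, here is a rough sketch of how the east-west width of a latitude/longitude grid cell shrinks away from the equator. It assumes a spherical Earth with a mean radius of 6371 km; the `cell_width_km` function is illustrative only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)


def cell_width_km(latitude_deg, cell_size_deg=1.0):
    """East-west width of a lat/lon grid cell at the given latitude."""
    arc_at_equator = math.radians(cell_size_deg) * EARTH_RADIUS_KM
    return arc_at_equator * math.cos(math.radians(latitude_deg))


# Width of a 1-degree cell at the equator, 30 degrees, and 60 degrees latitude.
for lat in (0, 30, 60):
    print(lat, round(cell_width_km(lat), 1))
```

Under this approximation a 1-degree cell is a little over 111 km wide at the equator and half that at 60 degrees latitude, so a statistic such as population per cell is not comparable across latitudes without correcting for area.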

```{figure} images/hexagon_neighbours.jpg
---
alt: Example of neighbour calculations for various shape tessellations
---
Example of neighbour calculations for tessellations of triangles, hexagons, and squares. Image taken from the [Uber H3 website](https://www.uber.com/blog/h3/).
```
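
The neighbour argument can be checked with a small sketch on idealised planar grids (H3 cells on the globe only approximate this). It compares centre-to-centre distances from a cell to each of its neighbours, using the common axial `(q, r)` coordinate convention for hexagons; the names below are illustrative and do not come from the H3 library.

```python
import math

# Offsets to the 8 neighbours of a square cell (edge and corner neighbours).
SQUARE_OFFSETS = [
    (dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)
]

# Offsets to the 6 neighbours of a hexagon in axial (q, r) coordinates.
HEX_AXIAL_OFFSETS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def hex_distance(dq, dr):
    """Centre-to-centre distance for an axial offset (adjacent centres are 1 apart)."""
    # Axial to Cartesian for pointy-top hexagons.
    x = dq + dr / 2
    y = dr * math.sqrt(3) / 2
    return math.hypot(x, y)


# Collect the distinct neighbour distances for each tessellation.
square_distances = sorted({round(math.hypot(dx, dy), 6) for dx, dy in SQUARE_OFFSETS})
hex_distances = sorted({round(hex_distance(dq, dr), 6) for dq, dr in HEX_AXIAL_OFFSETS})

print(square_distances)  # [1.0, 1.414214]
print(hex_distances)     # [1.0]
```

Squares have two classes of neighbour at different distances, while every hexagon neighbour is equidistant, which simplifies smoothing, buffering, and other neighbourhood operations.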

Additionally, several World Bank projects are leveraging hexagons in their data indexing and calculations:

1. [World Ex](https://worldbank.github.io/worldex/): A Python package for indexing geospatial data with the H3 index
2. Food security (Link TBD): TBD
93 changes: 93 additions & 0 deletions notebooks/MP_SCRIPTS/testing_metadata_stac.ipynb
@@ -0,0 +1,93 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"import os"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['C:\\\\WBG\\\\Work\\\\Code\\\\DECAT_Space2Stats\\\\space2stats_api\\\\src\\\\space2stats_ingest\\\\METADATA\\\\stac\\\\catalog.json',\n",
" 'C:\\\\WBG\\\\Work\\\\Code\\\\DECAT_Space2Stats\\\\space2stats_api\\\\src\\\\space2stats_ingest\\\\METADATA\\\\stac\\\\space2stats-collection\\\\collection.json',\n",
" 'C:\\\\WBG\\\\Work\\\\Code\\\\DECAT_Space2Stats\\\\space2stats_api\\\\src\\\\space2stats_ingest\\\\METADATA\\\\stac\\\\space2stats-collection\\\\flood_exposure_15cm_1in100\\\\flood_exposure_15cm_1in100.json',\n",
" 'C:\\\\WBG\\\\Work\\\\Code\\\\DECAT_Space2Stats\\\\space2stats_api\\\\src\\\\space2stats_ingest\\\\METADATA\\\\stac\\\\space2stats-collection\\\\nighttime_lights\\\\nighttime_lights.json',\n",
" 'C:\\\\WBG\\\\Work\\\\Code\\\\DECAT_Space2Stats\\\\space2stats_api\\\\src\\\\space2stats_ingest\\\\METADATA\\\\stac\\\\space2stats-collection\\\\space2stats_population_2020\\\\space2stats_population_2020.json',\n",
" 'C:\\\\WBG\\\\Work\\\\Code\\\\DECAT_Space2Stats\\\\space2stats_api\\\\src\\\\space2stats_ingest\\\\METADATA\\\\stac\\\\space2stats-collection\\\\urbanization_ghssmod\\\\urbanization_ghssmod.json']"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Collect every JSON file under the STAC metadata folder\n",
"stac_folder = r\"C:\\WBG\\Work\\Code\\DECAT_Space2Stats\\space2stats_api\\src\\space2stats_ingest\\METADATA\\stac\"\n",
"json_files = []\n",
"for root, dirs, files in os.walk(stac_folder):\n",
"    for file in files:\n",
"        if file.endswith(\".json\"):\n",
"            json_files.append(os.path.join(root, file))\n",
"json_files"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"# Check to make sure all JSON files are valid\n",
"def is_valid_json(filepath):\n",
"    \"\"\"Return True if the file parses as JSON; otherwise print the error.\"\"\"\n",
"    try:\n",
"        with open(filepath, 'r', encoding='utf-8') as f:\n",
"            json.load(f)\n",
"        return True\n",
"    except ValueError as e:\n",
"        print(f\"Invalid JSON: {e}\")\n",
"        return False\n",
"\n",
"# Print the path of every file that fails to parse\n",
"for file in json_files:\n",
"    if not is_valid_json(file):\n",
"        print(file)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "s2s",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.10"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
