first update of episode 5 #104

Merged
merged 24 commits into from
May 8, 2024
24 commits
303aa07
updated readme added narrative
Morrizzzzz Feb 26, 2024
225041d
first update of episode 5
fnattino Feb 28, 2024
34d1eaa
updated index.md with narrative
Morrizzzzz Feb 28, 2024
48a3d75
Update index.md
Morrizzzzz Feb 28, 2024
ecdfafb
minor updates on material
Morrizzzzz Mar 4, 2024
8c2c6b0
minor updates in text
Morrizzzzz Mar 5, 2024
f8b2748
minor update text
Morrizzzzz Mar 5, 2024
939cb5a
updated access-data
Morrizzzzz Mar 5, 2024
306610a
splitted some of the steps, added clarifications on what we are doing…
Morrizzzzz Apr 9, 2024
b0dc34b
minor fixes
fnattino Apr 18, 2024
801614e
updated episode 5 with red band and more explantion about pystac
Morrizzzzz Apr 18, 2024
91595cc
spaces after opening of blocks
fnattino Apr 18, 2024
519a9af
Merge branch 'issue-90' of github.com:esciencecenter-digital-skills/g…
Morrizzzzz Apr 19, 2024
6e6544c
Merge branch 'issue-90' of github.com:esciencecenter-digital-skills/g…
Morrizzzzz Apr 19, 2024
78e521e
updated ep 6, but not finished yet.
Morrizzzzz Apr 29, 2024
caab30f
upadted epi 6
Morrizzzzz Apr 30, 2024
4a03fe4
Merge branch 'issue-90' into update_index_and_ep05
fnattino Apr 30, 2024
31b5645
Merge pull request #106 from esciencecenter-digital-skills/update_ind…
fnattino Apr 30, 2024
8e1f022
updated epi 6 not finshed yet end of working day
Morrizzzzz Apr 30, 2024
567c1ec
updated episode 6 into narrative rhodes. still there are quite some l…
Morrizzzzz May 2, 2024
49f1ac3
minor updates
Morrizzzzz May 2, 2024
7d0238b
updated epi6
Morrizzzzz May 2, 2024
52beb32
Merge branch 'draft_30042024' into issue-90
fnattino May 3, 2024
e4430e0
remove the notebook of ep6
rogerkuou May 8, 2024
3 changes: 3 additions & 0 deletions .gitignore
@@ -1,3 +1,6 @@
# notebooks update narrative NleSc
notebooks/

# sandpaper files
episodes/*html
site/*
5 changes: 3 additions & 2 deletions episodes/01-intro-raster-data.md
@@ -15,22 +15,23 @@ exercises: 5
- Describe the strengths and weaknesses of storing data in raster format.
- Distinguish between continuous and categorical raster data and identify types of datasets that would be stored in each format.
:::


## Introduction

This episode introduces the two primary types of data models that are used to digitally represent the earth's surface: raster and vector. After briefly introducing these data models, this episode focuses on the raster representation, describing some major features and types of raster data. This workshop will focus on how to work with both raster and vector data sets, therefore it is essential that we understand the basic structures of these types of data and the types of phenomena that they can represent.

## Data Structures: Raster and Vector

The two primary data models that are used to represent the earth's surface digitally are the raster and vector. **Raster data** are stored as a grid of values which are rendered on the map as pixels—also known as cells—where each pixel—or cell—represents a value of the earth's surface. Examples of raster data are satellite images or aerial photographs. Data stored according to the **vector data** model are represented by points, lines, or polygons. Examples of vector representation are points of interest, buildings—often represented as building footprints—or roads.
The two primary data models that are used to represent the earth's surface digitally are the raster and vector. **Raster data** is stored as a grid of values which are rendered on a map as pixels—also known as cells—where each pixel—or cell—represents a value of the earth's surface. Examples of raster data are satellite images or aerial photographs. Data stored according to the **vector data** model are represented by points, lines, or polygons. Examples of vector representation are points of interest, buildings—often represented as building footprints—or roads.

Representing phenomena as vector data allows you to add attribute information to them. For instance, a polygon of a house can contain multiple attributes containing information about the address like the street name, zip code, city, and number. More explanations about vector data will be discussed in the [next episode](02-intro-vector-data.md).

When working with spatial information, you will experience that many phenomena can be represented as vector data and raster data. A house, for instance, can be represented by a set of cells in a raster having all the same value or by a polygon as vector containing attribute information (figure 1). It depends on the purpose for which the data is collected and intended to be used which data model it is stored in. But as a rule of thumb, you can apply that discrete phenomena like buildings, roads, trees, signs are represented as vector data, whereas continuous phenomena like temperature, wind speed, elevation are represented as raster data. Yet, one of the things a spatial data analyst often has to do is to transform data from vector to raster or the other way around. Keep in mind that this can cause problems in the data quality.

### Raster Data

Raster data are any pixelated (or gridded) data where each pixel has a value and is associated with a specific geographic location. The value of a pixel can be continuous (e.g., elevation, temperature) or categorical (e.g., land-use type). If this sounds familiar, it is because this data structure is very common: it's how we represent any digital image. A geospatial raster is only different from a digital photo in that it is accompanied by spatial information that connects the data to a particular location. This includes the raster's extent and cell size, the number of rows and columns, and its Coordinate Reference System (CRS), which will be explained in [episode 3](03-crs.md) of this workshop.
Raster data is any pixelated (or gridded) data where each pixel has a value and is associated with a specific geographic location. The value of a pixel can be continuous (e.g., elevation, temperature) or categorical (e.g., land-use type). If this sounds familiar, it is because this data structure is very common: it's how we represent any digital image. A geospatial raster is only different from a digital photo in that it is accompanied by spatial information that connects the data to a particular location. This includes the raster's extent and cell size, the number of rows and columns, and its Coordinate Reference System (CRS), which will be explained in [episode 3](03-crs.md) of this workshop.

![Raster Concept (Source: National Ecological Observatory Network (NEON))](fig/E01/raster_concept.png){alt="raster concept"}
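
As a minimal sketch of the spatial information described above (not part of the lesson material), the following shows how a grid of values, an extent origin, a cell size and a CRS together tie each pixel to a location. The `RasterGrid` class, its field names and the sample coordinates are hypothetical and chosen only for illustration; the lesson itself relies on libraries such as rioxarray for this.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RasterGrid:
    """Hypothetical minimal raster: pixel values plus georeferencing."""

    values: List[List[float]]  # rows of pixel values; row 0 is the northern edge
    x_min: float               # west edge of the extent, in CRS units
    y_max: float               # north edge of the extent, in CRS units
    cell_size: float           # width and height of one square pixel
    crs: str                   # identifier of the Coordinate Reference System

    @property
    def shape(self) -> Tuple[int, int]:
        """Number of rows and columns."""
        return len(self.values), len(self.values[0])

    def extent(self) -> Tuple[float, float, float, float]:
        """Bounding box as (x_min, y_min, x_max, y_max)."""
        rows, cols = self.shape
        return (
            self.x_min,
            self.y_max - rows * self.cell_size,
            self.x_min + cols * self.cell_size,
            self.y_max,
        )

    def cell_center(self, row: int, col: int) -> Tuple[float, float]:
        """Map coordinates of the centre of the pixel at (row, col)."""
        x = self.x_min + (col + 0.5) * self.cell_size
        y = self.y_max - (row + 0.5) * self.cell_size
        return x, y


# Illustrative values only: a 2 x 3 raster with 10 m pixels.
grid = RasterGrid(
    values=[[10.0, 12.0, 11.0], [9.0, 13.0, 14.0]],
    x_min=500000.0,
    y_max=4000000.0,
    cell_size=10.0,
    crs="EPSG:32635",
)
print(grid.shape, grid.extent(), grid.cell_center(0, 0))
```

Without the origin, cell size and CRS, the same `values` would be an ordinary image with no link to a place on the earth's surface.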

299 changes: 209 additions & 90 deletions episodes/05-access-data.md

Large diffs are not rendered by default.

572 changes: 411 additions & 161 deletions episodes/06-raster-intro.md

Large diffs are not rendered by default.

Binary file added episodes/fig/E05/STAC-s2-preview-after.jpg
Binary file added episodes/fig/E05/STAC-s2-preview-before.jpg
Binary file removed episodes/fig/E05/STAC-s2-preview.jpg
Binary file not shown.
Binary file not shown.
Binary file removed episodes/fig/E06/overview-plot-B09-robust.png
Binary file not shown.
Binary file removed episodes/fig/E06/overview-plot-B09.png
Binary file not shown.
Binary file not shown.
Binary file removed episodes/fig/E06/overview-plot-true-color.png
Binary file not shown.
Binary file added episodes/fig/E06/rhodes_multiband_80.png
Binary file added episodes/fig/E06/rhodes_red_80_B04.png
Binary file added episodes/fig/E06/rhodes_red_80_B04_robust.png
Binary file added episodes/fig/E06/rhodes_red_B04.png
22 changes: 15 additions & 7 deletions index.md
@@ -3,22 +3,30 @@ site: sandpaper::sandpaper_site
---

## Data Carpentries
[Data Carpentry’s](https://datacarpentry.org/) teaching is hands-on. Participants are encouraged to use their own computers to ensure the proper setup of tools for an efficient workflow.

[Data Carpentry’s](https://datacarpentry.org/) teaching is hands-on. Participants are encouraged to use their own computers to ensure the proper setup of tools for an efficient workflow.

## Geospatial Raster and Vector Data with Python
In this lesson you will learn how to work with geospatial data and how to process these with python. Python is one of the most popular programming languages for data science and analytics, with a large and steadily growing community in the field of Earth and Space Sciences. The lesson is meant for participants with a working basic knowledge of Python and allow them to to familiarize with the world of geospatial raster and vector data. (If you are unfamiliar to python we recommend you to follow [this course](https://swcarpentry.github.io/python-novice-inflammation/) or study [this book](https://greenteapress.com/thinkpython2/thinkpython2.pdf) ). In the *Introduction to Geospatial Raster and Vector Data with Python* lesson you will be introduced to a set of tools from the Python ecosystem and learn how these can be used to carry out geospatial data analysis tasks. In particular, you will learn to work with satellite images (i.e. [the Copernicus Sentinel-2 mission][sentinel-2] ) and open topographical geo-datasets (i.e. [OpenStreetmap][osm]). You will learn how these spatial datasets can be accessed, explored, manipulated and visualized using Python.
In this lesson you will learn how to work with geospatial data and how to process it with Python. Python is one of the most popular programming languages for data science and analytics, with a large and steadily growing community in the field of Earth and Space Sciences. The lesson is meant for participants with a basic working knowledge of Python and allows them to familiarize themselves with the world of geospatial raster and vector data. (If you are unfamiliar with Python, we recommend that you follow [this course](https://swcarpentry.github.io/python-novice-inflammation/) or have a look [here](https://greenteapress.com/thinkpython2/thinkpython2.pdf).) In the *Introduction to Geospatial Raster and Vector Data with Python* lesson you will be introduced to a set of tools from the Python ecosystem and learn how these can be used to carry out geospatial data analysis tasks. In particular, you will learn to work with satellite images (i.e. [the Copernicus Sentinel-2 mission][sentinel-2]) and open topographical geo-datasets (i.e. [OpenStreetMap][osm]). You will learn how these spatial datasets can be accessed, explored, manipulated and visualized using Python.

## Case study - Wildfires
As a case study for this lesson we will focus on wildfires. According to the IPCC assessment report, the wildfire seasons are lengthening as a result of changes in temperature and increasing drought conditions [IPCC](https://www.ipcc.ch/report/ar6/wg2/about/frequently-asked-questions/keyfaq1/). To analyse the impact of these wildfires, we will focus on the wildfire that occured on the Greek island [Rhodes in the summer of 2023](https://news.sky.com/story/wildfires-on-rhodes-force-hundreds-of-holidaymakers-to-flee-their-hotels-12925583) which had a devastating effect and led to the evacuation of [19.000 people](https://en.wikipedia.org/wiki/2023_Greece_wildfires). In this lesson we are going analyse the effect of this disaster by estimating which built-up areas were affected by these wildfires. Furthermore, we will analyse which vegetation and land-use types have been affected the most by the wildfire in order to get an understanding of which areas are more vulnerable to wildfires. Finally we are going to estimate which locations would be most suitable for placing watchtowers in the region. The analysis that we set up provides insights in the effect of the wildfire and generates input for wildfire mitigation strategies.
As a case study for this lesson we will focus on wildfires. According to the IPCC assessment report, wildfire seasons are lengthening as a result of changes in temperature and increasing drought conditions ([IPCC](https://www.ipcc.ch/report/ar6/wg2/about/frequently-asked-questions/keyfaq1/)). To analyse the impact of these wildfires, we will focus on the wildfire that occurred on the Greek island of [Rhodes in the summer of 2023](https://news.sky.com/story/wildfires-on-rhodes-force-hundreds-of-holidaymakers-to-flee-their-hotels-12925583), which had a devastating effect and led to the evacuation of [19,000 people](https://en.wikipedia.org/wiki/2023_Greece_wildfires). In this lesson we are going to analyse the effect of this disaster by estimating which built-up areas were affected by the wildfire. Furthermore, we will analyse which vegetation and land-use types were affected the most, in order to understand which areas are more vulnerable to wildfires. Finally, we are going to estimate which locations would be most suitable for placing watchtowers in the region. The analysis that we set up provides insights into the effect of the wildfire and generates input for wildfire mitigation strategies.

*Note, that the analyses presented in this lesson are developed for educational purposes. Therefore in some occasions the analysis steps have been simplified and assumptions have been made.*

The data used in this lesson includes optical satellite images from [the Copernicus Sentinel-2 mission][sentinel-2] and topgraphical data from [OpenStreetmap (OSM)][osm]. These datasets are real-world open data sets that entail sufficient complexity to teach many aspects of data analysis and management. The datasets have been selected to allow participants to focus on the core ideas and skills being taught while offering the chance to encounter common challenges with geospatial data. Furthermore, we have selected datasets which are available anywhere on earth.
The data used in this lesson includes optical satellite images from [the Copernicus Sentinel-2 mission][sentinel-2] and topographical data from [OpenStreetMap (OSM)][osm]. These datasets are real-world open data sets that entail sufficient complexity to teach many aspects of data analysis and management. The datasets have been selected to allow participants to focus on the core ideas and skills being taught while offering the chance to encounter common challenges with geospatial data. Furthermore, we have selected datasets which are available anywhere on earth.

During this lesson we will set up an analysis pipeline which identifies scorched areas based on bands of satellite images collected after the disaster in July 2023. Next, we will analyse the vegetation type that was present in these areas before the wildfire by calculating the [NDVI](https://en.wikipedia.org/wiki/Normalized_difference_vegetation_index) from satellite images taken before the disaster and comparing it with the scorched areas. To identify the affected built-up areas and most important roads, we will use OSM vector data and compare it with the scorched areas identified above. Finally, we will use elevation data to perform viewshed analyses in order to find the best locations for (hypothetical) watchtowers, which in theory would allow a wildfire to be identified earlier and thus allow a quicker response.
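
As a rough illustration of the NDVI step mentioned above, the sketch below applies the standard definition, NDVI = (NIR - Red) / (NIR + Red), pixel by pixel to two small bands. The function name `ndvi` and the sample values are assumptions made for this example; in the lesson the bands come from Sentinel-2 images and are handled as arrays rather than nested lists.

```python
from typing import List, Optional


def ndvi(
    red: List[List[float]], nir: List[List[float]]
) -> List[List[Optional[float]]]:
    """Per-pixel NDVI = (NIR - Red) / (NIR + Red); None where the sum is zero."""
    result = []
    for red_row, nir_row in zip(red, nir):
        row = []
        for r, n in zip(red_row, nir_row):
            total = n + r
            # Avoid division by zero for pixels without signal in either band.
            row.append((n - r) / total if total != 0 else None)
        result.append(row)
    return result


# Made-up values for a 2 x 2 patch of red and near-infrared pixels.
red_band = [[1000, 3000], [0, 2000]]
nir_band = [[5000, 3000], [0, 6000]]

ndvi_values = ndvi(red_band, nir_band)
print(ndvi_values)
```

Values close to 1 indicate dense green vegetation, while values around 0 or below indicate bare soil, built-up surfaces or water.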

To use these materials most effectively, make sure to download the data and install everything before working through this lesson (this applies especially to learners who follow this lesson in a workshop).

[sentinel-2]: https://sentinel.esa.int/web/sentinel/missions/sentinel-2
[osm]: https://www.openstreetmap.org/#map=14/45.2935/18.7986
[workbench]: https://carpentries.github.io/sandpaper-docs
## Python libraries used in this lesson
The main Python libraries that are used in this lesson are:
- [geopandas](https://geopandas.org/en/stable/)
- [rioxarray](https://github.com/corteva/rioxarray)
- [xarray-spatial](https://xarray-spatial.org/)
- [dask](https://www.dask.org/)
- [pystac](https://pystac.readthedocs.io/en/stable/)
- [sentinel-2]: https://sentinel.esa.int/web/sentinel/missions/sentinel-2
- [osm]: https://www.openstreetmap.org/#map=14/45.2935/18.7986
- [workbench]: https://carpentries.github.io/sandpaper-docs