From 303aa072697ccf73f96a6a7075c9fe11594aa812 Mon Sep 17 00:00:00 2001 From: Maurice de Kleijn Date: Mon, 26 Feb 2024 17:13:35 +0100 Subject: [PATCH 1/7] updated readme added narrative --- .gitignore | 3 +++ index.md | 40 ++++++++++++++++++++++++++-------------- 2 files changed, 29 insertions(+), 14 deletions(-) diff --git a/.gitignore b/.gitignore index 9cad1421..8024a9a2 100644 --- a/.gitignore +++ b/.gitignore @@ -1,3 +1,6 @@ +# notebooks update narrative NleSc +notebooks/ + # sandpaper files episodes/*html site/* diff --git a/index.md b/index.md index 02d54801..fea2191c 100644 --- a/index.md +++ b/index.md @@ -2,19 +2,31 @@ site: sandpaper::sandpaper_site --- -Data Carpentry’s teaching is hands-on, so participants are encouraged to use -their own computers to ensure the proper setup of tools for an efficient -workflow. To most effectively use these materials, please make sure to download -the data and install everything before working through this lesson. - -The data used in this lesson includes optical satellite images from -[the Copernicus Sentinel-2 mission][sentinel-2] and public geographical datasets -from [the dedicated distribution platform of the Dutch government][pdok]. -These are real-world data sets that entail sufficient complexity to teach many -aspects of data analysis and management. They have been selected to allow students -to focus on the core ideas and skills being taught while offering the chance -to encounter common challenges with geospatial data. +## Data Carpentries +[Data Carpentry’s](https://datacarpentry.org/) teaching is hands-on. Participants are encouraged to use their own computers to ensure the proper setup of tools for an efficient workflow. To most effectively use these materials, please make sure to download the data and install everything before working through this lesson. + +## Geospatial Raster and Vector Data with Python +In this lesson you will learn how to work with geospatial data and how to process these with python. Python is one of the most popular programming languages for data science and analytics, with a large and steadily growing community in the field of Earth and Space Sciences. The lesson is meant for participants with a working basic knowledge of Python and allow them to to familiarize with the world of geospatial raster and vector data. We will introduce a set of tools from the Python ecosystem and show how these can be used to carry out geospatial data analysis tasks. In particular, we will consider satellite images (i.e. [the Copernicus Sentinel-2 mission][sentinel-2] ) and open topographical geo-datasets (i.e. [OpenStreetmap][osm]). We will demonstrate how these spatial datasets can be accessed, explored, manipulated and visualized using Python. + +## Case study - Wildfires +As a case study for this lesson we will focus on wildfires. According to the IPCC assessment report, the wildfire seasons are lengthening as a result of changes in temperature and increasing drought conditions [IPCC](https://www.ipcc.ch/report/ar6/wg2/about/frequently-asked-questions/keyfaq1/). To analyse the impact of these wildfires, we will focus on the wildfire that occured on the Greek island [Rhodes in the summer of 2023](https://news.sky.com/story/wildfires-on-rhodes-force-hundreds-of-holidaymakers-to-flee-their-hotels-12925583), which led to the evacuation of [19.000 people](https://en.wikipedia.org/wiki/2023_Greece_wildfires). + + +The analysis that you are going to work on is to estimate which built-up areas were affected by these wildfires. Furthermore, we will analyse which vegetation and land-use types have been affected the most by the wildfire in order to get an understanding of which areas are more vulnerable to wildfires. The latter will generate insights which can be used as input for wildfire mitigation strategies. + +The data used in this lesson includes optical satellite images from [the Copernicus Sentinel-2 mission][sentinel-2] and topgraphical data from [OpenStreetmap][osm]. These datasets are real-world open data sets that entail sufficient complexity to teach many aspects of data analysis and management. The datasets have been selected to allow participants to focus on the core ideas and skills being taught while offering the chance to encounter common challenges with geospatial data. Furthermore, we have selected datasets which are available anywhere on earth. + +Note, that the analyses presented in this lesson are developed for educational purposes. Therefore in some occasions the analysis steps have been simplified and assumptions have been made. + +## Python libraries used in this lesson +The main python libraries that are used in this lesson are: + +**geopandas** +**rioxarray** +**xarray-spatial** +**dask** +**pystac** [sentinel-2]: https://sentinel.esa.int/web/sentinel/missions/sentinel-2 -[pdok]: https://www.pdok.nl -[workbench]: https://carpentries.github.io/sandpaper-docs +[osm]: https://www.openstreetmap.org/#map=14/45.2935/18.7986 +[workbench]: https://carpentries.github.io/sandpaper-docs \ No newline at end of file From 34d1eaa7995da5b8998aa4fbc6d2f1d17aa9fde9 Mon Sep 17 00:00:00 2001 From: Maurice de Kleijn Date: Wed, 28 Feb 2024 15:32:47 +0100 Subject: [PATCH 2/7] updated index.md with narrative --- index.md | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/index.md b/index.md index fea2191c..f406d2aa 100644 --- a/index.md +++ b/index.md @@ -6,26 +6,25 @@ site: sandpaper::sandpaper_site [Data Carpentry’s](https://datacarpentry.org/) teaching is hands-on. Participants are encouraged to use their own computers to ensure the proper setup of tools for an efficient workflow. To most effectively use these materials, please make sure to download the data and install everything before working through this lesson. ## Geospatial Raster and Vector Data with Python -In this lesson you will learn how to work with geospatial data and how to process these with python. Python is one of the most popular programming languages for data science and analytics, with a large and steadily growing community in the field of Earth and Space Sciences. The lesson is meant for participants with a working basic knowledge of Python and allow them to to familiarize with the world of geospatial raster and vector data. We will introduce a set of tools from the Python ecosystem and show how these can be used to carry out geospatial data analysis tasks. In particular, we will consider satellite images (i.e. [the Copernicus Sentinel-2 mission][sentinel-2] ) and open topographical geo-datasets (i.e. [OpenStreetmap][osm]). We will demonstrate how these spatial datasets can be accessed, explored, manipulated and visualized using Python. +In this lesson you will learn how to work with geospatial data and how to process these with python. Python is one of the most popular programming languages for data science and analytics, with a large and steadily growing community in the field of Earth and Space Sciences. The lesson is meant for participants with a working basic knowledge of Python and allow them to to familiarize with the world of geospatial raster and vector data. (If you are unfamiliar to python we recommend you to follow [this course](https://swcarpentry.github.io/python-novice-inflammation/) or have a look [here](https://greenteapress.com/thinkpython2/thinkpython2.pdf) ). In the *Introduction to Geospatial Raster and Vector Data with Python* lesson you will be introduced to a set of tools from the Python ecosystem and learn how these can be used to carry out geospatial data analysis tasks. In particular, you will learn to work with satellite images (i.e. [the Copernicus Sentinel-2 mission][sentinel-2] ) and open topographical geo-datasets (i.e. [OpenStreetmap][osm]). You will learn how these spatial datasets can be accessed, explored, manipulated and visualized using Python. ## Case study - Wildfires As a case study for this lesson we will focus on wildfires. According to the IPCC assessment report, the wildfire seasons are lengthening as a result of changes in temperature and increasing drought conditions [IPCC](https://www.ipcc.ch/report/ar6/wg2/about/frequently-asked-questions/keyfaq1/). To analyse the impact of these wildfires, we will focus on the wildfire that occured on the Greek island [Rhodes in the summer of 2023](https://news.sky.com/story/wildfires-on-rhodes-force-hundreds-of-holidaymakers-to-flee-their-hotels-12925583), which led to the evacuation of [19.000 people](https://en.wikipedia.org/wiki/2023_Greece_wildfires). - The analysis that you are going to work on is to estimate which built-up areas were affected by these wildfires. Furthermore, we will analyse which vegetation and land-use types have been affected the most by the wildfire in order to get an understanding of which areas are more vulnerable to wildfires. The latter will generate insights which can be used as input for wildfire mitigation strategies. The data used in this lesson includes optical satellite images from [the Copernicus Sentinel-2 mission][sentinel-2] and topgraphical data from [OpenStreetmap][osm]. These datasets are real-world open data sets that entail sufficient complexity to teach many aspects of data analysis and management. The datasets have been selected to allow participants to focus on the core ideas and skills being taught while offering the chance to encounter common challenges with geospatial data. Furthermore, we have selected datasets which are available anywhere on earth. -Note, that the analyses presented in this lesson are developed for educational purposes. Therefore in some occasions the analysis steps have been simplified and assumptions have been made. +Note, that the analyses presented in this lesson are developed for educational purposes. Therefore in some occasions the analysis steps have been simplified and assumptions have been made. ## Python libraries used in this lesson The main python libraries that are used in this lesson are: -**geopandas** -**rioxarray** -**xarray-spatial** -**dask** -**pystac** + **geopandas** + **rioxarray** + **xarray-spatial** + **dask** + **pystac** [sentinel-2]: https://sentinel.esa.int/web/sentinel/missions/sentinel-2 [osm]: https://www.openstreetmap.org/#map=14/45.2935/18.7986 From 48a3d75fdb2efb6cb8652b49bbdd599b2e33f36e Mon Sep 17 00:00:00 2001 From: Maurice de Kleijn Date: Wed, 28 Feb 2024 15:37:57 +0100 Subject: [PATCH 3/7] Update index.md minor updates --- index.md | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/index.md b/index.md index f406d2aa..b10d3d34 100644 --- a/index.md +++ b/index.md @@ -19,13 +19,12 @@ Note, that the analyses presented in this lesson are developed for educational p ## Python libraries used in this lesson The main python libraries that are used in this lesson are: - - **geopandas** - **rioxarray** - **xarray-spatial** - **dask** - **pystac** +- [geopandas](https://geopandas.org/en/stable/) +- [rioxarray](https://github.com/corteva/rioxarray) +- [xarray-spatial](https://xarray-spatial.org/) +- [dask](https://www.dask.org/) +- [pystac](https://pystac.readthedocs.io/en/stable/) [sentinel-2]: https://sentinel.esa.int/web/sentinel/missions/sentinel-2 [osm]: https://www.openstreetmap.org/#map=14/45.2935/18.7986 -[workbench]: https://carpentries.github.io/sandpaper-docs \ No newline at end of file +[workbench]: https://carpentries.github.io/sandpaper-docs From ecdfafb09e720b8fe4e48b574f0c4d4e3ff6bf6d Mon Sep 17 00:00:00 2001 From: Maurice de Kleijn Date: Mon, 4 Mar 2024 11:06:10 +0100 Subject: [PATCH 4/7] minor updates on material --- episodes/01-intro-raster-data.md | 28 +++++----------------------- index.md | 2 +- 2 files changed, 6 insertions(+), 24 deletions(-) diff --git a/episodes/01-intro-raster-data.md b/episodes/01-intro-raster-data.md index c56b9dd5..ecfff622 100644 --- a/episodes/01-intro-raster-data.md +++ b/episodes/01-intro-raster-data.md @@ -15,39 +15,21 @@ exercises: 5 - Describe the strengths and weaknesses of storing data in raster format. - Distinguish between continuous and categorical raster data and identify types of datasets that would be stored in each format. ::: + ## Introduction -This episode introduces the two primary types of geospatial -data: rasters and vectors. After briefly introducing these -data types, this episode focuses on raster data, describing -some major features and types of raster data. +This episode introduces the two primary types of geospatial data: rasters and vectors. After briefly introducing these data types, this episode focuses on raster data, describing some major features and types of raster data. ## Data Structures: Raster and Vector -The two primary types of geospatial data are raster -and vector data. Raster data is stored as a grid of values which are rendered on a -map as pixels. Each pixel value represents an area on the Earth's surface. Vector data structures represent specific features on the -Earth's surface, and -assign attributes to those features. Vector data structures -will be discussed in more detail in [the next episode](02-intro-vector-data.md). +The two primary types of geospatial data are raster and vector data. Raster data is stored as a grid of values which are rendered on a map as pixels. Each pixel value represents an area on the Earth's surface. Vector data structures represent specific features on the Earth's surface, and assign attributes to those features. Vector data is stored as points, lines or polygons and can contain multiple attributes for every feature. Vector data will be discussed in more detail in [the next episode](02-intro-vector-data.md). -This workshop will focus on how to work with both raster and vector -data sets, therefore it is essential that we understand the -basic structures of these types of data and the types of data -that they can be used to represent. +This lesson will focus on how to work with both raster and vector data sets, therefore it is essential to understand the basic structures of these data models and the types of data that they can be used to represent. ### About Raster Data -Raster data is any pixelated (or gridded) data where each pixel is associated -with a specific geographic location. The value of a pixel can be -continuous (e.g. elevation) or categorical (e.g. land use). If this sounds -familiar, it is because this data structure is very common: it's how -we represent any digital image. A geospatial raster is only different -from a digital photo in that it is accompanied by spatial information -that connects the data to a particular location. This includes the -raster's extent and cell size, the number of rows and columns, and -its coordinate reference system (or CRS). +Raster data is any pixelated (or gridded) data where each pixel, also called a cell, has the same size and is associated with a specific geographic location. The value of a pixel or cell can be continuous, typically stored as a float number, (e.g. temperature, precipitation, elevation) or categorical, stored as an integer (e.g. land-use types, building footprints, roads). If this sounds familiar, it is because this data structure is very common: it's how we represent any digital image. A geospatial raster is only different from a digital photo in that it is accompanied by spatial information that connects the data to a particular location. This includes the raster's extent and cell size, the number of rows and columns, and its coordinate reference system (or CRS). ![Raster Concept (Source: National Ecological Observatory Network (NEON))](fig/E01/raster_concept.png){alt="raster concept"} diff --git a/index.md b/index.md index b10d3d34..8d148aa6 100644 --- a/index.md +++ b/index.md @@ -15,7 +15,7 @@ The analysis that you are going to work on is to estimate which built-up areas w The data used in this lesson includes optical satellite images from [the Copernicus Sentinel-2 mission][sentinel-2] and topgraphical data from [OpenStreetmap][osm]. These datasets are real-world open data sets that entail sufficient complexity to teach many aspects of data analysis and management. The datasets have been selected to allow participants to focus on the core ideas and skills being taught while offering the chance to encounter common challenges with geospatial data. Furthermore, we have selected datasets which are available anywhere on earth. -Note, that the analyses presented in this lesson are developed for educational purposes. Therefore in some occasions the analysis steps have been simplified and assumptions have been made. +*Note, that the analyses presented in this lesson are developed for educational purposes. Therefore in some occasions the analysis steps have been simplified and assumptions have been made.* ## Python libraries used in this lesson The main python libraries that are used in this lesson are: From 8c2c6b051d26f97c7af9d7405782d8f273da6f59 Mon Sep 17 00:00:00 2001 From: Maurice de Kleijn Date: Tue, 5 Mar 2024 07:47:20 +0100 Subject: [PATCH 5/7] minor updates in text --- episodes/05-access-data.md | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/episodes/05-access-data.md b/episodes/05-access-data.md index f0fee95d..a117d0bf 100644 --- a/episodes/05-access-data.md +++ b/episodes/05-access-data.md @@ -90,14 +90,14 @@ satellite images, one for each band captured by the Multispectral Instrument on ::: When opening a catalog with the STAC browser, you can access the API URL by clicking on the "Source" button on the top -right of the page. By using this URL, we have access to the catalog content and, if supported by the catalog, to the +right of the page. By using this URL, you have access to the catalog content and, if supported by the catalog, to the functionality of searching its items. For the Earth Search STAC catalog the API URL is: ```python api_url = "https://earth-search.aws.element84.com/v1" ``` -You can query a STAC API endpoint from Python using the `pystac_client` library: +You can query a STAC API endpoint from Python using the `pystac_client` library. To do so we will import `Client` from `pystac_client`: ```python from pystac_client import Client @@ -105,8 +105,8 @@ from pystac_client import Client client = Client.open(api_url) ``` -In the following, we ask for scenes belonging to the `sentinel-2-l2a` collection. This dataset includes Sentinel-2 -data products pre-processed at level 2A (bottom-of-atmosphere reflectance) and saved in Cloud Optimized GeoTIFF (COG) +Next, we ask for scenes belonging to the `sentinel-2-l2a` collection. This dataset includes Sentinel-2 +data products pre-processed at level 2A (bottom-of-atmosphere reflectance). This data is stored as Cloud Optimized GeoTIFF (COG) format: ```python @@ -128,7 +128,8 @@ by a high-resolution raster can directly access the lower-resolution versions of on the downloading time. More information on the COG format can be found [here](https://www.cogeo.org). ::: -We also ask for scenes intersecting a geometry defined using the `shapely` library (in this case, a point): +In order to get data for a specific location you can add longitude latitude coordinates (World Geodetic System 1984 EPSG:4326) in your request. In order to do so we are using the `shapely` library to define a geometrical point. +Below we have included a center point for the City of Amsterdam in the Netherlands (i.e. Longitude: 4.89 | Latitude 52.37) . You can change that to the region of your preference. ```python from shapely.geometry import Point @@ -136,7 +137,7 @@ point = Point(4.89, 52.37) # AMS coordinates ``` Note: at this stage, we are only dealing with metadata, so no image is going to be downloaded yet. But even metadata can -be quite bulky if a large number of scenes match our search! For this reason, we limit the search result to 10 items: +be quite bulky if a large number of scenes match our search! For this reason, we limit the search result to 10 items using `max_items`: ```python search = client.search( @@ -146,7 +147,7 @@ search = client.search( ) ``` -We submit the query and find out how many scenes match our search criteria (please note that this output can be different as more data is added to the catalog): +Now we submit the query in order te find out how many scenes match our search criteria (please note that this output can be different as more data is added to the catalog): ```python print(search.matched()) From f8b274888fa7adf29328762b846e11f1f2f219ce Mon Sep 17 00:00:00 2001 From: Maurice de Kleijn Date: Tue, 5 Mar 2024 14:32:17 +0100 Subject: [PATCH 6/7] minor update text --- episodes/05-access-data.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/episodes/05-access-data.md b/episodes/05-access-data.md index a117d0bf..88884373 100644 --- a/episodes/05-access-data.md +++ b/episodes/05-access-data.md @@ -129,7 +129,7 @@ on the downloading time. More information on the COG format can be found [here]( ::: In order to get data for a specific location you can add longitude latitude coordinates (World Geodetic System 1984 EPSG:4326) in your request. In order to do so we are using the `shapely` library to define a geometrical point. -Below we have included a center point for the City of Amsterdam in the Netherlands (i.e. Longitude: 4.89 | Latitude 52.37) . You can change that to the region of your preference. +Below we have included a center point for the City of Amsterdam in the Netherlands (i.e. Longitude: 4.89 | Latitude 52.37) . You can change that to the region of your preference. ```python from shapely.geometry import Point @@ -174,7 +174,7 @@ print(len(items)) ``` which is consistent with the maximum number of items that we have set in the search criteria. We can iterate over -the returned items and print these to show their IDs: +the returned items and print these to show their unique IDs: ```python for item in items: @@ -197,7 +197,8 @@ for item in items: Each of the items contains information about the scene geometry, its acquisition time, and other metadata that can be accessed as a dictionary from the `properties` attribute. -Let's inspect the metadata associated with the first item of the search results: +Now let us inspect the metadata associated with the first item of the search results: + ```python item = items[0] print(item.datetime) From 939cb5a42cc725dfc2a540ede883333bdbbfabbd Mon Sep 17 00:00:00 2001 From: Maurice de Kleijn Date: Tue, 5 Mar 2024 17:33:01 +0100 Subject: [PATCH 7/7] updated access-data --- episodes/05-access-data.md | 59 +++++++++++++++++++++++++------------- 1 file changed, 39 insertions(+), 20 deletions(-) diff --git a/episodes/05-access-data.md b/episodes/05-access-data.md index 88884373..e2cdfb3b 100644 --- a/episodes/05-access-data.md +++ b/episodes/05-access-data.md @@ -129,15 +129,15 @@ on the downloading time. More information on the COG format can be found [here]( ::: In order to get data for a specific location you can add longitude latitude coordinates (World Geodetic System 1984 EPSG:4326) in your request. In order to do so we are using the `shapely` library to define a geometrical point. -Below we have included a center point for the City of Amsterdam in the Netherlands (i.e. Longitude: 4.89 | Latitude 52.37) . You can change that to the region of your preference. +Below we have included a center point for the island of Rhodos, which is the location of interest for our case study (i.e. Longitude: 27.95 | Latitude 36.20) . You can change that to the region of your preference. ```python from shapely.geometry import Point -point = Point(4.89, 52.37) # AMS coordinates +point = Point(27.95, 36.20) # coordinates of Rhodos ``` Note: at this stage, we are only dealing with metadata, so no image is going to be downloaded yet. But even metadata can -be quite bulky if a large number of scenes match our search! For this reason, we limit the search result to 10 items using `max_items`: +be quite bulky if a large number of scenes match our search! For this reason, we limit the search by the intersect of the point (by setting the parameter `intersects`), assign the collection (by setting the parameter `collections`) and provide a maximum to the input of 10 (by setting the parameter `max_items_`). Information about the possible parameters to be set can be found at the [stac search method](https://pystac-client.readthedocs.io/en/stable/api.html#pystac_client.Client.search) documentation. ```python search = client.search( @@ -147,7 +147,7 @@ search = client.search( ) ``` -Now we submit the query in order te find out how many scenes match our search criteria (please note that this output can be different as more data is added to the catalog): +Now we submit the query in order te find out how many scenes match our search criteria with the parameters assigned above (please note that this output can be different as more data is added to the catalog to when this episode was created): ```python print(search.matched()) @@ -157,13 +157,15 @@ print(search.matched()) 840 ``` -Finally, we retrieve the metadata of the search results: +Finally, we retrieve the metadata of the search results by calling the method `item_collection`: ```python items = search.item_collection() ``` -The variable `items` is an `ItemCollection` object. We can check its size by: +The variable `items` is an `ItemCollection` object. More information can be found at the [pystac documentation](https://pystac-client.readthedocs.io/en/stable/api.html#pystac_client.Client.search) + +Now let us check the size using `len`: ```python print(len(items)) @@ -173,7 +175,7 @@ print(len(items)) 10 ``` -which is consistent with the maximum number of items that we have set in the search criteria. We can iterate over +This is consistent with the maximum number of items that we have set in the search criteria. We can iterate over the returned items and print these to show their unique IDs: ```python @@ -194,14 +196,22 @@ for item in items: ``` -Each of the items contains information about the scene geometry, its acquisition time, and other metadata that can be -accessed as a dictionary from the `properties` attribute. +Each of the items contains information about the scene geometry, its acquisition time, and other metadata. To see which information each item contains you can have a look at the [item documentation of pystac](https://pystac.readthedocs.io/en/latest/api/item.html). -Now let us inspect the metadata associated with the first item of the search results: +Now let us inspect the metadata associated with the first item of the search results. Let us have a look at collection date of the first item: ```python item = items[0] print(item.datetime) +``` + +```output +2023-07-01 10:46:30.262000+00:00 +``` +Let us now add the geometry and properties as well. + +```python +item = items[0] print(item.geometry) print(item.properties) ``` @@ -212,13 +222,20 @@ print(item.properties) {'created': '2023-07-02T01:49:17.191Z', 'platform': 'sentinel-2a', 'constellation': 'sentinel-2', 'instruments': ['msi'], 'eo:cloud_cover': 99.952936, 'proj:epsg': 32631, 'mgrs:utm_zone': 31, 'mgrs:latitude_band': 'U', 'mgrs:grid_square': 'FU', 'grid:code': 'MGRS-31UFU', 'view:sun_azimuth': 154.716674921261, 'view:sun_elevation': 58.4960054056685, 's2:degraded_msi_data_percentage': 0.0346, 's2:nodata_pixel_percentage': 33.00232, 's2:saturated_defective_pixel_percentage': 0, 's2:dark_features_percentage': 0, 's2:cloud_shadow_percentage': 0.030847, 's2:vegetation_percentage': 0, 's2:not_vegetated_percentage': 0.004947, 's2:water_percentage': 0.011271, 's2:unclassified_percentage': 0, 's2:medium_proba_clouds_percentage': 5.838514, 's2:high_proba_clouds_percentage': 94.035202, 's2:thin_cirrus_percentage': 0.07922, 's2:snow_ice_percentage': 0, 's2:product_type': 'S2MSI2A', 's2:processing_baseline': '05.09', 's2:product_uri': 'S2A_MSIL2A_20230701T103631_N0509_R008_T31UFU_20230701T200058.SAFE', 's2:generation_time': '2023-07-01T20:00:58.000000Z', 's2:datatake_id': 'GS2A_20230701T103631_041904_N05.09', 's2:datatake_type': 'INS-NOBS', 's2:datastrip_id': 'S2A_OPER_MSI_L2A_DS_2APS_20230701T200058_S20230701T104159_N05.09', 's2:granule_id': 'S2A_OPER_MSI_L2A_TL_2APS_20230701T200058_A041904_T31UFU_N05.09', 's2:reflectance_conversion_factor': 0.967641353116838, 'datetime': '2023-07-01T10:46:30.262000Z', 's2:sequence': '0', 'earthsearch:s3_path': 's3://sentinel-cogs/sentinel-s2-l2a-cogs/31/U/FU/2023/7/S2A_31UFU_20230701_0_L2A', 'earthsearch:payload_id': 'roda-sentinel2/workflow-sentinel2-to-stac/7b1a81ed3fb8d763a0cecf8d9edd4d4a', 'earthsearch:boa_offset_applied': True, 'processing:software': {'sentinel2-to-stac': '0.1.0'}, 'updated': '2023-07-02T01:49:17.191Z'} ``` +Now if we want to access one item in the dictionary, for instance the EPSG code of the projected coordinate system, you need to access the item in the dictionary as usual. For instance: + +```python +item = items[0] +print(item.properties['proj:epsg']) +``` + :::challenge ## Exercise: Search satellite scenes using metadata filters Search for all the available Sentinel-2 scenes in the `sentinel-2-l2a` collection that satisfy the following criteria: - intersect a provided bounding box (use ±0.01 deg in lat/lon from the previously defined point); -- have been recorded between 20 March 2020 and 30 March 2020; -- have a cloud coverage smaller than 10% (hint: use the `query` input argument of `client.search`). +- have been recorded between 1st of July 2023 and 31st of August 2023 (the timespan in which the wildfire took place); +- have a cloud coverage smaller than 1% (Hint: generic metadata filters can be implemented via the `query` input argument of `client.search`, which requires the following syntax (see [docs](https://pystac-client.readthedocs.io/en/stable/usage.html#query-extension)): `query=['']`). How many scenes are available? Save the search results in GeoJSON format. @@ -231,8 +248,8 @@ bbox = point.buffer(0.01).bounds search = client.search( collections=[collection], bbox=bbox, - datetime="2020-03-20/2020-03-30", - query=["eo:cloud_cover<15"] + datetime="2023-07-01/2023-08-31", + query=["eo:cloud_cover<1"] ) print(search.matched()) ``` @@ -307,7 +324,7 @@ visual-jp2: True color image wvp-jp2: Water vapour (WVP) ``` -Among the others, assets include multiple raster data files (one per optical band, as acquired by the multi-spectral +Among the other data files, assets include multiple raster data files (one per optical band, as acquired by the multi-spectral instrument), a thumbnail, a true-color image ("visual"), instrument metadata and scene-classification information ("SCL"). Let's get the URL links to the actual asset: @@ -323,8 +340,10 @@ This can be used to download the corresponding file: ![Overview of the true-color image ("thumbnail")](fig/E05/STAC-s2-preview.jpg){alt="thumbnail of the sentinel-2 scene"} -Remote raster data can be directly opened via the `rioxarray` library. We will -learn more about this library in the next episodes. +Remote raster data can be directly opened via the `rioxarray` library. We will learn more about this library in the next episodes. + +For now we are using [rioxarray to open the raster file](https://corteva.github.io/rioxarray/stable/rioxarray.html#rioxarray.raster_array.RasterArray.to_raster). + ```python import rioxarray nir_href = assets["nir"].href @@ -348,14 +367,14 @@ Attributes: add_offset: 0.0 ``` -We can then save the data to disk: +Now we want to save the data to our local machine using the [to_raster](https://corteva.github.io/rioxarray/stable/rioxarray.html#rioxarray.raster_array.RasterArray.to_raster) function : ```python -# save whole image to disk +# save the image to disk nir.rio.to_raster("nir.tif") ``` -Since that might take a while, given there are over 10000 x 10000 = a hundred million pixels in the 10 meter NIR band, you can take a smaller subset before downloading it. Becuase the raster is a COG, we can download just what we need! +Since that might take a while, given there are over 10000 x 10000 = a hundred million pixels in the 10 meter NIR band, you can take a smaller subset before downloading it. However, since the raster is a COG, we can download just what we need! Here, we specify that we want to download the first (and only) band in the tif file, and a slice of the width and height dimensions.