re-organised folders
annefou committed Jun 17, 2024
1 parent d03b1e3 commit 50c5fa0
Showing 14 changed files with 195 additions and 33 deletions.
23 changes: 16 additions & 7 deletions docs/_toc.yml
@@ -4,19 +4,28 @@ parts:
- caption: About
chapters:
- file: agenda
- file: users-getting-started
- caption: Getting started with Pangeo@EOSC
chapters:
- file: object-storage-minio-test
- file: setup/eosc-pangeo
- file: setup/users-getting-started
- caption: Learning to master the Pangeo ecosystem
chapters:
- file: xarray_introduction
- file: pangeo/xarray_introduction
title: Handling multi-dimensional arrays with xarray
- file: visualization
- file: pangeo/visualization
title: Interactive plotting with holoviews
- file: data_discovery
- file: pangeo/data_discovery
title: Data access and discovery
- file: chunking_introduction
- file: pangeo/chunking_introduction
title: Chunking
- file: dask_introduction
- file: pangeo/dask_introduction
title: Parallel computing with dask
- caption: Beyond the workshop
chapters:
- file: setup/object-storage-minio-test
- file: afterword/resources
title: Resources
- file: afterword/pythia
title: Project Pythia
- file: afterword/envds-book
title: Environmental Data Science Book
25 changes: 25 additions & 0 deletions docs/afterword/envds-book.md
@@ -0,0 +1,25 @@
# Environmental Data Science Book

## About
The Environmental Data Science Book (or EDS book for short), https://the-environmental-ds-book.netlify.app/, is a living, open and community-driven online resource to showcase and support the publication of data, research and open-source tools for collaborative, reproducible and transparent Environmental Science.

## Who is the book for?
While the scientific community is broad, the target audience of the EDS book is:
* Researchers with some background in environmental science interested in AI and data science methods.
* Researchers with some background in computer science interested in environmental data science.
* Anyone else interested in reproducible, inclusive, shareable and collaborative AI and data science for environmental applications.

## How to contribute?
The EDS book welcomes contributions from anyone, not only those listed in the target audience.
The core GitHub repository is public and open source licensed (see [here](https://github.com/alan-turing-institute/environmental-ds-book)).
The executable notebooks are hosted in the [EDS book organization](https://github.com/Environmental-DS-Book).
Please see the EDS [contributor’s guide](https://github.com/alan-turing-institute/environmental-ds-book/blob/master/CONTRIBUTING.md) for details on how you can get involved.

<style>
.responsive-wrap iframe{ max-width: 100%;}
</style>
<div class="responsive-wrap">
<!-- this is the embed code provided by Google -->
<iframe src="https://docs.google.com/presentation/d/1IdKnE5jRPR3rPaKkzUtw-5YaUhpsgjsonRlt05d8SgQ/embed?start=false&loop=false&delayms=3000" frameborder="0" width="960" height="569" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"></iframe>
<!-- Google embed ends -->
</div>
35 changes: 35 additions & 0 deletions docs/afterword/pythia.md
@@ -0,0 +1,35 @@
# Project Pythia

:::{tip}
Project Pythia, https://projectpythia.org/, is the education working group for Pangeo and is an educational resource for the entire geoscience community.
The information below highlights some notions of the initiative. We also include a recent presentation authored by Clyne et al. (2022).

For further details, we suggest visiting Pythia's [About section](https://projectpythia.org/about.html#presentations-about-project-pythia).
:::

## About
Project Pythia is a home for Python-centered learning resources that are open-source, community-owned, geoscience-focused, and high-quality.

## Who is Project Pythia?
The current core Pythia team can be found [here](https://projectpythia.org/index.html#the-project-pythia-team). Pythia is an open and inclusive community! Look [here](https://projectpythia.org/index.html#join-us) for info on how to get involved.

## Project Pythia Goals
1. _The Pythia Portal:_ A searchable online portal that
provides scientists at any point in their career with educational
content and real-world examples needed to learn how to navigate and
integrate the myriad packages within the Python ecosystem for the
geosciences.

2. _Cloud-Deployable Pythia Platforms:_ A light-weight,
Binder-based platform that will make it possible to launch portal
content in customizable executable environments in the Cloud with
only a “single click.”

<style>
.responsive-wrap iframe{ max-width: 100%;}
</style>
<div class="responsive-wrap">
<!-- this is the embed code provided by Google -->
<iframe src="https://docs.google.com/presentation/d/1js9iR2bmNj7rkJSHU9kvB5037KZnK4mlmWF8Twnggis/embed?start=false&loop=false&delayms=3000" frameborder="0" width="960" height="569" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"></iframe>
<!-- Google embed ends -->
</div>
50 changes: 50 additions & 0 deletions docs/afterword/resources.md
@@ -0,0 +1,50 @@

![Pangeo logo](.././figures/pangeo_name_logo.png)

**A community platform for Big Data geoscience**

### Join the community!

| Information | Links |
| :--- |:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **Website** | https://pangeo.io/ |
| **GitHub** | [![GitHub](https://img.shields.io/badge/GitHub-Pangeo--data-blue?logo=github)](https://github.com/pangeo-data) |
| **Examples** | [![Gallery](https://img.shields.io/badge/Pangeo-Gallery-orange)](http://gallery.pangeo.io/) |
| **Chat** | [![Gitter](https://img.shields.io/badge/Gitter-Chat-yellow?logo=gitter)](https://gitter.im/pangeo-data/Lobby) [![Pangeo - Discourse](https://img.shields.io/discourse/users?server=https%3A%2F%2Fdiscourse.pangeo.io%2F&style=flat-square&logo=discourse)](https://discourse.pangeo.io) |
| **News** | [![Medium - Blog](https://img.shields.io/badge/Medium-Blog-2ea44f?logo=medium)](https://medium.com/pangeo) [![Follow](https://img.shields.io/twitter/follow/pangeo_data?style=social)](https://twitter.com/pangeo_data) |

### Meetings
- [General](https://pangeo.io/meeting-notes.html#community-meeting): Pangeo holds community meetings every Wednesday. The meetings alternate between 12 PM and 4 PM US Eastern Time to encourage participation from a wider range of time zones.
- [Continental meetings](https://pangeo.io/meeting-notes.html#continental-community-meetings): to accommodate different time zones around the globe, continental meetings have been organized in Europe/Africa and Oceania.
- [Showcases](https://pangeo.io/pangeo-showcase.html#pangeo-showcase): 15-minute talks that give anyone an opportunity to meet other members of the Pangeo community and let them know what they are working on. The talks are recorded, given a DOI, and made available on the [Pangeo YouTube Channel](https://youtube.com/playlist?list=PLuQQBBQFfpgq0OvjKbjcYgTDzDxTqtwua). If you are interested in giving a talk, [fill out this short form](https://forms.gle/QwxKusVvrvDakSNs8).

### Cloud infrastructure
- [2i2c JupyterHub](https://us-central1-b.gcp.pangeo.io/hub/login?next=%2Fhub%2F): serves Pangeo on open source infrastructure. It is designed and operated by [2i2c](https://2i2c.org/), and funded by the [NSF EarthCube Program (Award ICER-2026932)](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2026932). The service is **open to anyone** whose application form (see [here](https://docs.google.com/forms/d/e/1FAIpQLSeqKncKG-s365pC_Lfe4_UetJ-wcFfjOSyHhYYQjXbKRHzswQ/viewform)) has been approved by a hub administrator. A blog post announcing the 2i2c/Pangeo partnership is available [here](https://2i2c.org/blog/2021/pangeo-goes-live/).

### Data life cycle
- [Pangeo Forge](https://pangeo-forge.org/): a tool designed to aid the extraction, transformation, and loading of datasets.

### Most recent trainings (2021/22)
- [Galaxy training in climate data](https://training.galaxyproject.org/training-material/topics/climate/): contains two modules introducing Pangeo, _Pangeo ecosystem 101 for everyone_ and _Pangeo Notebook in Galaxy - Introduction to Xarray_, showcasing how the Pangeo stack assists in processing and analysing big climate datasets.
- [BIOGEOMON 2022 Python Pangeo Workshop](https://github.com/LandscapeGeoinformatics): led by [Landscape Geoinformatics](https://github.com/LandscapeGeoinformatics), it includes Jupyter notebooks demonstrating Xarray for working with labeled multi-dimensional arrays of data. The material also shows a few basic steps to improve reproducibility and proactively apply FAIR principles when sharing and archiving data and code online for publishing via [GitHub](https://github.com/) and [Zenodo](https://zenodo.org/).
- [FOSS4G 2021](https://github.com/pangeo-data/foss4g-2021): focuses on data discovery with SpatioTemporal Asset Catalogs (STAC), data loading with Cloud-optimized formats (Cloud-Optimized Geotiff, ZARR), and scalable analysis with Xarray and Dask libraries.

## Additional resources/initiatives consuming Pangeo stack
_List of some active initiatives. Find more in https://github.com/pangeo-data_.

- [CarbonPlan](https://carbonplan.org/): _non-profit initiative_, analyzes climate solutions based on the best available science and data. The team works collaboratively with the Pangeo community to build open tools and resources for the evaluation and deployment of robust climate programs.
- [CliMetLab](https://github.com/ecmwf/climetlab): _package_, aims at simplifying access to climate and meteorological datasets, allowing users to focus on science instead of technical issues such as data access and data formats.
- [climpred](https://github.com/pangeo-data/climpred): _package_, aims to be the primary package used to analyze output from initialized dynamical forecast models, ranging from short-term weather forecasts to decadal climate forecasts.
- [Digital Earth Africa Sandbox](https://sandbox.digitalearth.africa/): _platform_, a cloud-based computational platform that operates through a Jupyter Lab environment. It provides a limited, but free compute resource for technical users and data scientists to explore DE Africa data and products. The platform consumes `xarray` and `dask` to optimize the processing and analysis of the curated datasets.
- [EOOffshore](https://eooffshore.github.io/): _research project_, presents a case study that demonstrates the utility of the Pangeo software ecosystem to address these issues in the development of offshore wind speed and power density estimates, increasing wind measurement coverage of offshore renewable energy assessment areas in the [Irish Continental Shelf](https://www.marine.ie/Home/site-area/irelands-marine-resource/real-map-ireland) region.
- [Fastscape LEM](https://fastscape.org/): _software stack_, aims at making landscape evolution models and topographic analysis algorithms readily accessible to a wide range of users, from experts in landscape evolution modelling to scientists, researchers and teachers in the broader Earth science community.
- [flox](https://github.com/xarray-contrib/flox): _package_, explores strategies for fast GroupBy reductions with `dask.array`. It used to be called dask_groupby.
- [NetCarbon](https://www.netcarbon.fr/home): _startup company_, offering farmers a free solution for measuring and monetizing their sequestered carbon to contribute towards carbon neutrality.
- [Planetary Computer](https://planetarycomputer.microsoft.com/): _platform_, a cloud-based computational platform aiming to combine a petabyte catalog of analysis-ready geospatial data, an API that facilitates spatiotemporal querying over that data and a computing environment that simplifies distributed computing workloads.
- [PyGMT](https://github.com/GenericMappingTools/pygmt): _package_, facilitates processing geospatial and geophysical data and making publication quality maps and figures.
- [scivision](https://github.com/alan-turing-institute/scivision): _package_, aims to connect computer vision model developers to image data providers from diverse scientific fields. The project builds upon existing libraries to create and manipulate data catalogues e.g. `intake`, and `xarray` to handle N-dimensional data for exploring CV models.
- [Urban Grammar AI research project](https://urbangrammarai.xyz/): _research project_, proposes a conceptual framework to characterize urban structure through the notions of spatial signatures and urban grammar. In addition to consuming the Pangeo stack, the resource demonstrates some notebooks using [`dask_geopandas`](https://github.com/geopandas/dask-geopandas) to optimize the processing and analysis of spatial operations on geometric types.
- [verde](https://github.com/fatiando/verde): _package_, aims at processing spatial data (bathymetry, geophysics surveys, etc) and interpolating it on regular grids (i.e., gridding).
- [xarray-sentinel](https://github.com/bopen/xarray-sentinel): _package_, facilitates access and exploration of the SAR data products of the Copernicus Sentinel-1 satellite mission.
- [xESMF](https://github.com/pangeo-data/xESMF): _package_, a regridding tool suited for non-orthogonal grids. xESMF tries to be simple and intuitive.
- [xMIP](https://github.com/jbusecke/xMIP): _package_, facilitates the cleaning, organization and interactive analysis of Model Intercomparison Projects (MIPs) within the Pangeo software stack.
Binary file modified docs/images/pangeo_name_logo.png
28 changes: 21 additions & 7 deletions docs/intro.md
@@ -41,27 +41,40 @@ tags:
thumbnail: images/pangeo-logo.png
---

# GEO-OPEN-HACK-2024: Big Geospatial Data Hackathon with Open Infrastructure and Tools
# GEO-OPEN-HACK-2024

**Big Geospatial Data Hackathon with Open Infrastructure and Tools**

+++ {"part":"abstract"}

% The article should include an abstract block at the beginning. The block is delimited by `+++` before and after, and you must specify `"part": "abstract"` as JSON metadata on the block opener. This metadata is required for recognizing the content of this cell as the abstract.
% The abstract should begin with a short description of the problem addressed, briefly describe the new data or analyses, then briefly state the main conclusion(s) and how they are supported, and address any uncertainty.

This tutorial will provide a comprehensive introduction along with hands-on examples to help you understand how these technologies can be used for Earth science data analysis and interpretation.
[GEO-OPEN-HACK-2024](https://iiasa.ac.at/events/jun-2024/geo-open-hack-2024-big-geospatial-data-hackathon-with-open-infrastructure-and-tools) is a comprehensive and informative event designed for advanced geo-coders to explore various open tools and approaches for upscaling geospatial analysis on open High-Performance Computing (HPC) infrastructure.

The event is organised by the [International Institute for Applied Systems Analysis (IIASA)](https://iiasa.ac.at) in collaboration with [Spatial Ecology](https://spatial-ecology.net/).

+++

# Overview
## Overview

In this tutorial, participants will learn how to 1) navigate the Pangeo ecosystem for scalable Earth Science workflows and 2) exploit Earth Observation (EO) data.
This Pangeo tutorial is part of GEO-OPEN-HACK-2024 and will provide a comprehensive introduction along with hands-on examples to help you understand how these technologies can be used for Earth science data analysis and interpretation.

In this tutorial, participants will learn how to 1) navigate the Pangeo ecosystem for scalable Earth Science workflows on Pangeo@EOSC, and 2) exploit Earth Observation (EO) data.

:::{tip}
[Pangeo-EOSC](https://github.com/pangeo-data/pangeo-eosc/) has benefited from services and resources provided by the [EGI-ACE project](https://www.egi.eu/project/egi-ace/) (funded by the European Union’s Horizon 2020 research and innovation programme under Grant Agreement no. 101017567), and the [C-SCALE project](https://c-scale.eu/) (funded by the European Union's Horizon 2020 research and innovation programme under grant agreement no. 101017529), with the dedicated support of [CESNET](https://www.cesnet.cz/en/).
:::

## Tutorial Learning Objectives

By the end of this tutorial, learners will be able to:

- Understand the Pangeo ecosystem
- Learn to access, load, and analyse data using Xarray, visualising data with Hvplot, and scaling workflows with Dask.
- Learn about the European Open Science Cloud (EOSC)
- Understand the Pangeo ecosystem and Pangeo@EOSC;
- Learn to access, load, and analyse data using Xarray, visualise data with hvPlot, and scale workflows with Dask;
- Understand how to interface the Pangeo ecosystem (Xarray & Dask) with the most common Python machine learning frameworks ([PyTorch](https://pytorch.org) & [TensorFlow](https://www.tensorflow.org/)).


## Prerequisites

@@ -72,4 +85,5 @@ Before starting this tutorial, learners should have:

## Set up

If you are participating in this training as part of the GEO-OPEN-HACK-2024, you will need to register to [Pangeo@EOSC](https://pangeo-data.github.io/pangeo-eosc/) register yourself following the instructions given at [getting started for users](https://pangeo-data.github.io/geo-open-hack-2024/users-getting-started.html).
If you are participating in this training as part of the GEO-OPEN-HACK-2024, you will need to register with [Pangeo@EOSC](https://pangeo-data.github.io/pangeo-eosc/). The setup instructions are given at [getting started with Pangeo@EOSC](https://pangeo-data.github.io/geo-open-hack-2024/users-getting-started.html).

File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.