deploy: 25026cb
annefou committed Jun 27, 2024
0 parents commit 7e9cd5f
Showing 213 changed files with 146,432 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .buildinfo
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 72908d5a332b0fcf08ee113f743676a0
tags: 645f666f9bcd5a90fca523b33c5a78b7
Empty file added .nojekyll
Empty file.
Binary file added _images/EOSC_logo-small.png
Binary file added _images/chunked.png
Binary file added _images/dashboardlink.png
Binary file added _images/dask-xarray-explained.png
Binary file added _images/dasklab.png
Binary file added _images/datasize.png
Binary file added _images/exampledasklab.png
Binary file added _images/flavors.png
Binary file added _images/minIO_buckets.png
Binary file added _images/minIO_keys.png
Binary file added _images/minIO_login.png
Binary file added _images/notchunked.png
Binary file added _images/pangeo_name_logo.png
25 changes: 25 additions & 0 deletions _sources/afterword/envds-book.md
@@ -0,0 +1,25 @@
# Environmental Data Science Book

## About
The Environmental Data Science Book (EDS book for short), https://the-environmental-ds-book.netlify.app/, is a living, open and community-driven online resource that showcases and supports the publication of data, research and open-source tools for collaborative, reproducible and transparent Environmental Science.

## Who is the book for?
While the scientific community is broad, the target audience of the EDS book is:
* Researchers with some background in environmental science interested in AI and data science methods.
* Researchers with some background in computer science interested in environmental data science.
* Anyone else interested in reproducible, inclusive, shareable and collaborative AI and data science for environmental applications.

## How to contribute?
The EDS book welcomes contributions from anyone, not only those listed in the target audience.
The core GitHub repository is public and open source licensed (see [here](https://github.com/alan-turing-institute/environmental-ds-book)).
The executable notebooks are hosted in the [EDS book organization](https://github.com/Environmental-DS-Book).
Please see the EDS [contributor’s guide](https://github.com/alan-turing-institute/environmental-ds-book/blob/master/CONTRIBUTING.md) for details on how you can get involved.

<style>
.responsive-wrap iframe{ max-width: 100%;}
</style>
<div class="responsive-wrap">
<!-- this is the embed code provided by Google -->
<iframe src="https://docs.google.com/presentation/d/1IdKnE5jRPR3rPaKkzUtw-5YaUhpsgjsonRlt05d8SgQ/embed?start=false&loop=false&delayms=3000" frameborder="0" width="960" height="569" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"></iframe>
<!-- Google embed ends -->
</div>
35 changes: 35 additions & 0 deletions _sources/afterword/pythia.md
@@ -0,0 +1,35 @@
# Project Pythia

:::{tip}
Project Pythia, https://projectpythia.org/, is the education working group for Pangeo and is an educational resource for the entire geoscience community.
The information below highlights some notions of the initiative. We also include a recent presentation authored by Clyne et al. (2022).

For further details, we suggest visiting Pythia's [About section](https://projectpythia.org/about.html#presentations-about-project-pythia).
:::

## About
Project Pythia is a home for Python-centered learning resources that are open-source, community-owned, geoscience-focused, and high-quality.

## Who is Project Pythia?
The current core Pythia team can be found [here](https://projectpythia.org/index.html#the-project-pythia-team). Pythia is an open and inclusive community! Look [here](https://projectpythia.org/index.html#join-us) for info on how to get involved.

## Project Pythia Goals
1. _The Pythia Portal:_ A searchable online portal that
provides scientists at any point in their career with educational
content and real-world examples needed to learn how to navigate and
integrate the myriad packages within the Python ecosystem for the
geosciences.

2. _Cloud-Deployable Pythia Platforms:_ A light-weight,
Binder-based platform that will make it possible to launch portal
content in customizable executable environments in the Cloud with
only a “single click.”

<style>
.responsive-wrap iframe{ max-width: 100%;}
</style>
<div class="responsive-wrap">
<!-- this is the embed code provided by Google -->
<iframe src="https://docs.google.com/presentation/d/1js9iR2bmNj7rkJSHU9kvB5037KZnK4mlmWF8Twnggis/embed?start=false&loop=false&delayms=3000" frameborder="0" width="960" height="569" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"></iframe>
<!-- Google embed ends -->
</div>
50 changes: 50 additions & 0 deletions _sources/afterword/resources.md
@@ -0,0 +1,50 @@

![Pangeo logo](../images/pangeo_name_logo.png)

**A community platform for Big Data geoscience**

### Join the community!

| Information | Links |
| :--- |:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **Website** | https://pangeo.io/ |
| **GitHub** | [![GitHub](https://img.shields.io/badge/GitHub-Pangeo--data-blue?logo=github)](https://github.com/pangeo-data) |
| **Examples** | [![Gallery](https://img.shields.io/badge/Pangeo-Gallery-orange)](http://gallery.pangeo.io/) |
| **Chat** | [![Gitter](https://img.shields.io/badge/Gitter-Chat-yellow?logo=gitter)](https://gitter.im/pangeo-data/Lobby) [![Pangeo - Discourse](https://img.shields.io/discourse/users?server=https%3A%2F%2Fdiscourse.pangeo.io%2F&style=flat-square&logo=discourse)](https://discourse.pangeo.io) |
| **News** | [![Medium - Blog](https://img.shields.io/badge/Medium-Blog-2ea44f?logo=medium)](https://medium.com/pangeo) [![Follow](https://img.shields.io/twitter/follow/pangeo_data?style=social)](https://twitter.com/pangeo_data) |

### Meetings
- [General](https://pangeo.io/meeting-notes.html#community-meeting): Pangeo holds community meetings every Wednesday. The meetings alternate between 12 PM and 4 PM US Eastern Time to encourage participation from a wider range of time zones.
- [Continental meetings](https://pangeo.io/meeting-notes.html#continental-community-meetings): to accommodate different time zones across the globe, continental meetings have been organized for Europe/Africa and Oceania.
- [Showcases](https://pangeo.io/pangeo-showcase.html#pangeo-showcase): 15-minute talks that give anyone an opportunity to meet other members of the Pangeo community and let them know what they are working on. The talks are recorded, given a DOI, and made available on the [Pangeo YouTube Channel](https://youtube.com/playlist?list=PLuQQBBQFfpgq0OvjKbjcYgTDzDxTqtwua). If you are interested in giving a talk, [fill out this short form](https://forms.gle/QwxKusVvrvDakSNs8).

### Most recent trainings (2021-2024)
- **Digital Scholarship Days 2024**: [Unlocking the Power of EOSC](https://www.ub.uio.no/english/courses-events/events/dsc/2024/digital-scholarship-days/21-unlocking-eosc.html): Navigating Services for Research Visibility and Impact - The European Open Science Cloud (EOSC) in a nutshell. Jan. 12, 2024 9:00 AM – 12:00 PM at the University of Oslo, Norway.
- **BiDS 2023**: [PANGEO & OpenEO one day Training event](https://openeo.cloud/2023/09/05/bids23-satellite-event-monday-6th-november/)
- **[Reproducibility Challenge 2023](https://eds-book.github.io/reproducibility-challenge-2023/intro.html)**
- **FOSS4G 2022**: [Pangeo Training event](https://pangeo-data.github.io/foss4g-2022)
- **[Pangeo 101 Tutorial for CLIVAR CMIP6 Bootcamp 2022](https://pangeo-data.github.io/clivar-2022/)**
- **[eScience 2022 course on Tools in Climate Science: Linking Observations with Modelling](https://pangeo-data.github.io/escience-2022/)**
- [Galaxy training in climate data](https://training.galaxyproject.org/training-material/topics/climate/): contains two modules introducing Pangeo, _Pangeo ecosystem 101 for everyone_ and _Pangeo Notebook in Galaxy - Introduction to Xarray_ showcasing how the Pangeo stack assists processing and analysing big climate datasets.
- [BIOGEOMON 2022 Python Pangeo Workshop](https://github.com/LandscapeGeoinformatics): led by [Landscape Geoinformatics](https://github.com/LandscapeGeoinformatics), it includes Jupyter notebooks demonstrating Xarray for working with labeled multi-dimensional arrays of data. The material also shows a few basic steps to improve reproducibility and proactively apply FAIR principles when sharing and archiving data and code online for publishing via [GitHub](https://github.com/) and [Zenodo](https://zenodo.org/).
- [FOSS4G 2021](https://github.com/pangeo-data/foss4g-2021): focuses on data discovery with SpatioTemporal Asset Catalogs (STAC), data loading with cloud-optimized formats (Cloud-Optimized GeoTIFF, Zarr), and scalable analysis with the Xarray and Dask libraries.

## Additional resources/initiatives consuming Pangeo stack
_List of some active initiatives. Find more at https://github.com/pangeo-data_.

- [CarbonPlan](https://carbonplan.org/): _non-profit initiative_, analyzes climate solutions based on the best available science and data. The team works collaboratively with the Pangeo community to build open tools and resources for the evaluation and deployment of robust climate programs.
- [CliMetLab](https://github.com/ecmwf/climetlab): _package_, aims at simplifying access to climate and meteorological datasets, allowing users to focus on science instead of technical issues such as data access and data formats.
- [climpred](https://github.com/pangeo-data/climpred): _package_, aims to be the primary package used to analyze output from initialized dynamical forecast models, ranging from short-term weather forecasts to decadal climate forecasts.
- [Digital Earth Africa Sandbox](https://sandbox.digitalearth.africa/): _platform_, a cloud-based computational platform that operates through a Jupyter Lab environment. It provides a limited, but free compute resource for technical users and data scientists to explore DE Africa data and products. The platform consumes `xarray` and `dask` to optimize the processing and analysis of the curated datasets.
- [EOOffshore](https://eooffshore.github.io/): _research project_, presents a case study that demonstrates the utility of the Pangeo software ecosystem to address these issues in the development of offshore wind speed and power density estimates, increasing wind measurement coverage of offshore renewable energy assessment areas in the [Irish Continental Shelf](https://www.marine.ie/Home/site-area/irelands-marine-resource/real-map-ireland) region.
- [Fastscape LEM](https://fastscape.org/): _software stack_, aims at making landscape evolution models and topographic analysis algorithms readily accessible to a wide range of users, from experts in landscape evolution modelling to scientists, researchers and teachers in the broader Earth science community.
- [flox](https://github.com/xarray-contrib/flox): _package_, explores strategies for fast GroupBy reductions with `dask.array`. It used to be called dask_groupby.
- [NetCarbon](https://www.netcarbon.fr/home): _startup company_, offering farmers a free solution for measuring and monetizing their sequestered carbon to contribute towards carbon neutrality.
- [Planetary Computer](https://planetarycomputer.microsoft.com/): _platform_, a cloud-based computational platform aiming to combine a petabyte catalog of analysis-ready geospatial data, an API that facilitates spatiotemporal querying over that data and a computing environment that simplifies distributed computing workloads.
- [PyGMT](https://github.com/GenericMappingTools/pygmt): _package_, facilitates processing geospatial and geophysical data and making publication quality maps and figures.
- [scivision](https://github.com/alan-turing-institute/scivision): _package_, aims to connect computer vision model developers to image data providers from diverse scientific fields. The project builds upon existing libraries to create and manipulate data catalogues e.g. `intake`, and `xarray` to handle N-dimensional data for exploring CV models.
- [Urban Grammar AI research project](https://urbangrammarai.xyz/): _research project_, proposes a conceptual framework to characterize urban structure through the notions of spatial signatures and urban grammar. In addition to consuming the Pangeo stack, the resource includes notebooks that use [`dask_geopandas`](https://github.com/geopandas/dask-geopandas) to optimize the processing and analysis of spatial operations on geometric types.
- [verde](https://github.com/fatiando/verde): _package_, aims at processing spatial data (bathymetry, geophysics surveys, etc.) and interpolating it on regular grids (i.e., gridding).
- [xarray-sentinel](https://github.com/bopen/xarray-sentinel): _package_, facilitates access and exploration of the SAR data products of the Copernicus Sentinel-1 satellite mission.
- [xESMF](https://github.com/pangeo-data/xESMF): _package_, a regridding tool suited for non-orthogonal grids. xESMF tries to be simple and intuitive.
- [xMIP](https://github.com/jbusecke/xMIP): _package_, facilitates the cleaning, organization and interactive analysis of Model Intercomparison Projects (MIPs) within the Pangeo software stack.
11 changes: 11 additions & 0 deletions _sources/agenda.md
@@ -0,0 +1,11 @@
# Agenda

(draft)

- Introduction of Pangeo
- Xarray, chunks and dask
- Visualisation with hvplot
- Data access and STAC
- Scaling


87 changes: 87 additions & 0 deletions _sources/intro.md
@@ -0,0 +1,87 @@
---
# File metadata may be provided as frontmatter YAML
title: Big Geospatial Data Hackathon with Open Infrastructure and Tools
subtitle: GEO-OPEN-HACK-2024
description: GEO-OPEN-HACK-2024 is a comprehensive and informative event designed for advanced geo-coders to explore various open tools and approaches for upscaling geospatial analysis on open High-Performance Computing (HPC) infrastructure.
date: 2024-05-28
authors:
  - id: annefou
    name: Anne Fouilloux
    orcid: 0000-0002-1784-2920
    corresponding: false
    roles:
      - Pangeo
    affiliations:
      - simula
  - id: tinaok
    name: Tina Erica Odaka
    orcid: 0000-0002-1500-0156
    corresponding: false
    roles:
      - Pangeo
    affiliations:
      - ifremer
affiliations:
  - id: simula
    name: Simula Research Laboratory
    city: Oslo
    country: Norway
    url: https://www.simula.no
    ror: https://ror.org/00vn06n10
  - id: ifremer
    name: IFREMER
    city: Brest
    country: France
    url: https://www.ifremer.fr
    ror: https://ror.org/044jxhp58
tags:
  - pangeo
  - stac
  - machine-learning
thumbnail: images/pangeo-logo.png
---

# GEO-OPEN-HACK-2024

**Big Geospatial Data Hackathon with Open Infrastructure and Tools**

+++ {"part":"abstract"}

% The article should include an abstract block at the beginning. The block is delimited by `+++` before and after, and you must specify `"part": "abstract"` as JSON metadata on the block opener. This metadata is required for recognizing the content of this cell as the abstract.
% The abstract should begin with a short description of the problem addressed, briefly describe the new data or analyses, then briefly state the main conclusion(s) and how they are supported, and address any uncertainty.

[GEO-OPEN-HACK-2024](https://iiasa.ac.at/events/jun-2024/geo-open-hack-2024-big-geospatial-data-hackathon-with-open-infrastructure-and-tools) is a comprehensive and informative event designed for advanced geo-coders to explore various open tools and approaches for upscaling geospatial analysis on open High-Performance Computing (HPC) infrastructure.

The event is organised by the [International Institute for Applied Systems Analysis (IIASA)](https://iiasa.ac.at) in collaboration with [Spatial Ecology](https://spatial-ecology.net/).

+++

## Overview

This Pangeo tutorial is part of GEO-OPEN-HACK-2024. It provides a comprehensive introduction, along with hands-on examples, to help you understand how the Pangeo ecosystem can be used for Earth science data analysis and interpretation.

In this tutorial, participants will learn how to 1) navigate the Pangeo ecosystem for scalable Earth Science workflows on Pangeo@EOSC, and 2) exploit Earth Observation (EO) data.

:::{tip}
[Pangeo-EOSC](https://github.com/pangeo-data/pangeo-eosc/) has benefited from services and resources provided by the [EGI-ACE project](https://www.egi.eu/project/egi-ace/) (funded by the European Union’s Horizon 2020 research and innovation programme under Grant Agreement no. 101017567), and the [C-SCALE project](https://c-scale.eu/) (funded by the European Union's Horizon 2020 research and innovation programme under grant agreement no. 101017529), with the dedicated support of [CESNET](https://www.cesnet.cz/en/).
:::

## Tutorial Learning Objectives

By the end of this tutorial, learners will be able to:

- Describe the European Open Science Cloud (EOSC);
- Explain the Pangeo ecosystem and Pangeo@EOSC;
- Access, load, and analyse data using Xarray, visualise data with hvPlot, and scale workflows with Dask.
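
The idea behind scaling a workflow with chunks can be sketched in plain Python. The example below is only an illustration under stated assumptions: it does not use Dask or Xarray, and the helper names `iter_chunks` and `chunked_mean` are hypothetical. It splits a sequence into fixed-size chunks, reduces each chunk independently, and then combines the partial results, which is roughly the pattern that chunked array libraries automate.

```python
# Plain-Python sketch of a chunked reduction (illustrative; not Dask's API).
from typing import Iterator, List, Sequence, Tuple


def iter_chunks(values: Sequence[float], chunk_size: int) -> Iterator[Sequence[float]]:
    """Yield consecutive slices holding at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]


def chunked_mean(values: Sequence[float], chunk_size: int) -> float:
    """Compute the mean from per-chunk (sum, count) pairs."""
    # Step 1: reduce every chunk on its own; these reductions are independent.
    partials: List[Tuple[float, int]] = [
        (sum(chunk), len(chunk)) for chunk in iter_chunks(values, chunk_size)
    ]
    # Step 2: combine the small partial results into the final answer.
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(partial_count for _, partial_count in partials)
    if count == 0:
        raise ValueError("cannot take the mean of an empty sequence")
    return total / count


if __name__ == "__main__":
    data = [float(i) for i in range(10)]
    print(chunked_mean(data, chunk_size=4))
```

Because each per-chunk step depends only on its own chunk, those steps could in principle run in parallel or on data read piece by piece, which is what makes the pattern attractive for large datasets.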

## Prerequisites

Before starting this tutorial, learners should have:

- Good knowledge of Python or another programming language;
- Good knowledge of geospatial data structures.

## Set up

If you are participating in this training as part of GEO-OPEN-HACK-2024, you will need to register with [Pangeo@EOSC](https://pangeo-data.github.io/pangeo-eosc/). The setup instructions are given at [getting started with Pangeo@EOSC](https://pangeo-data.github.io/geo-open-hack-2024/setup/users-getting-started.html).
