From 6106eb945a02311c1d1c3fde66f6bc1a2e930357 Mon Sep 17 00:00:00 2001 From: juannat7 Date: Fri, 26 Jan 2024 23:48:19 +0000 Subject: [PATCH] deploy: 291aaefed443e0ca7d13e0ef6105d62e4d5915df --- README.html | 218 ++++----------- _sources/README.md | 99 +++---- _sources/baseline.md | 57 ++++ _sources/dataset.md | 17 ++ _sources/evaluation.md | 32 +++ _sources/leaderboard.md | 15 ++ _sources/quickstart.md | 31 +++ _sources/task.md | 21 ++ _sources/training.md | 24 ++ baseline.html | 575 +++++++++++++++++++++++++++++++++++++++ dataset.html | 582 ++++++++++++++++++++++++++++++++++++++++ evaluation.html | 574 +++++++++++++++++++++++++++++++++++++++ genindex.html | 10 +- leaderboard.html | 490 +++++++++++++++++++++++++++++++++ objects.inv | Bin 323 -> 410 bytes quickstart.html | 506 ++++++++++++++++++++++++++++++++++ search.html | 10 +- searchindex.js | 2 +- task.html | 501 ++++++++++++++++++++++++++++++++++ training.html | 498 ++++++++++++++++++++++++++++++++++ 20 files changed, 4025 insertions(+), 237 deletions(-) create mode 100644 _sources/baseline.md create mode 100644 _sources/dataset.md create mode 100644 _sources/evaluation.md create mode 100644 _sources/leaderboard.md create mode 100644 _sources/quickstart.md create mode 100644 _sources/task.md create mode 100644 _sources/training.md create mode 100644 baseline.html create mode 100644 dataset.html create mode 100644 evaluation.html create mode 100644 leaderboard.html create mode 100644 quickstart.html create mode 100644 task.html create mode 100644 training.html diff --git a/README.html b/README.html index f4d5092..941276f 100644 --- a/README.html +++ b/README.html @@ -9,7 +9,7 @@ - ChaosBench - A benchmark for long-term forecasting of chaotic systems — ChaosBench + ChaosBench: A Multi-Channel, Physics-Based Benchmark for Subseasonal-to-Seasonal Climate Prediction — ChaosBench @@ -62,12 +62,10 @@ const thebe_selector_output = ".output, .cell_output" - - - + @@ -163,12 +161,18 @@ @@ -354,7 +358,7 @@
-

ChaosBench - A benchmark for long-term forecasting of chaotic systems

+

ChaosBench: A Multi-Channel, Physics-Based Benchmark for Subseasonal-to-Seasonal Climate Prediction

@@ -364,10 +368,10 @@

Contents

@@ -379,168 +383,40 @@

Contents

-
-

ChaosBench - A benchmark for long-term forecasting of chaotic systems#

-

ChaosBench is a benchmark project to improve long-term forecasting of chaotic systems, in particular subseasonal-to-seasonal (S2S) weather. Current features include:

-
-

1. Benchmark and Dataset#

-
    -
  • Input: ERA5 Reanalysis (1979-2022)

  • -
  • Target: The following table indicates the 48 variables (channels) that are available for Physics-based models. Note that the Input ERA5 observations contains ALL fields, including the unchecked boxes:

    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

    Parameters/Levels (hPa)

    1000

    925

    850

    700

    500

    300

    200

    100

    50

    10

    Geopotential height, z (\(gpm\))

    Specific humidity, q (\(kg kg^{-1}\))

     

     

     

    Temperature, t (\(K\))

    U component of wind, u (\(ms^{-1}\))

    V component of wind, v (\(ms^{-1}\))

    Vertical velocity, w (\(Pas^{-1}\))

     

     

     

     

     

     

     

     

     

    -
  • -
  • Baselines:

    -
      -
    • Physics-based models:

      -
        -
      • UKMO: UK Meteorological Office

      • -
      • NCEP: National Centers for Environmental Prediction

      • -
      • CMA: China Meteorological Administration

      • -
      • ECMWF: European Centre for Medium-Range Weather Forecasts

      • -
      -
    • -
    • Data-driven models:

      -
        -
      • Lagged-Autoencoder

      • -
      • Fourier Neural Operator (FNO)

      • -
      • ResNet

      • -
      • UNet

      • -
      • ViT/ClimaX

      • -
      • PanguWeather

      • -
      • Fourcastnetv2

      • -
      -
    • -
    -
  • -
+
+

ChaosBench: A Multi-Channel, Physics-Based Benchmark for Subseasonal-to-Seasonal Climate Prediction#

+

ChaosBench is a benchmark project to improve long-term forecasting of chaotic systems, in particular subseasonal-to-seasonal (S2S) climate, using ML approaches.

+

Homepage 🔗: https://leap-stc.github.io/ChaosBench

+

Paper 📚: https://arxiv.org/

+

Dataset 🤗: https://huggingface.co/datasets/juannat7/ChaosBench

+
+

Features#

+

Overview of ChaosBench

+

1️⃣ Extended Observations. Spanning over 45 years (1979 - 2023) of ERA5 reanalysis

+

2️⃣ Diverse Baselines. Wide selection of physics-based forecasts from leading national agencies in Europe, the UK, America, and Asia

+

3️⃣ Differentiable Physics Metrics. Introduces two differentiable physics-based metrics to minimize the decay of power spectra (i.e., blurriness) at long forecasting horizons

+

4️⃣ Large-Scale Benchmarking. Systematic evaluation of state-of-the-art ML-based weather models such as PanguWeather, FourcastNetV2, ViT/ClimaX, and GraphCast

-
-

2. Metrics#

-

We divide our metrics into 2 classes: (1) ML-based, which cover evaluation used in conventional computer vision and forecasting tasks, (2) Physics-based, which are aimed to construct a more physically-faithful and explainable data-driven forecast.

+
+

Getting Started#

    -
  • Vision-based:

    -
      -
    • RMSE

    • -
    • Bias

    • -
    • Anomaly Correlation Coefficient (ACC)

    • -
    • Multiscale Structural Similarity Index (MS-SSIM)

    • -
    -
  • -
  • Physics-based:

    -
      -
    • Spectral Divergence (SpecDiv)

    • -
    • Spectral Residual (SpecRes)

    • -
    -
  • +
  • Quickstart

  • +
  • Dataset Overview

  • +
  • Task Overview

-
-

3. Tasks#

-

We presented two task, where the model still takes as inputs the FULL 60 variables, but the benchmarking is done on either ALL or a SUBSET of target variable(s).

+
+

Build Your Own Model#

    -
  • Task 1: Full Dynamics Prediction. -It is aimed at ALL target channels simultaneously. This task is generally harder to perform but is useful to build a model that emulates the entire weather conditions.

  • -
  • Task 2: Sparse Dynamics Prediction. -It is aimed at a SUBSET of target channel(s). This task is useful to build long-term forecasting model for specific variables, such as near-surface temperature (t-1000) or near-surface humidity (q-1000).

  • +
  • Training

  • +
  • Evaluation

-
-

4. Getting Started#

-

You can learn more about how to use our benchmark product through the following Jupyter notebooks under the notebooks directory. It covers topics ranging from:

+
+

Benchmarking#

diff --git a/_sources/README.md b/_sources/README.md index 3f87ead..4be9024 100644 --- a/_sources/README.md +++ b/_sources/README.md @@ -1,61 +1,38 @@ -# ChaosBench - A benchmark for long-term forecasting of chaotic systems -ChaosBench is a benchmark project to improve long-term forecasting of chaotic systems, in particular subseasonal-to-seasonal (S2S) weather. Current features include: - -## 1. Benchmark and Dataset - -- __Input:__ ERA5 Reanalysis (1979-2022) - -- __Target:__ The following table indicates the 48 variables (channels) that are available for Physics-based models. Note that the __Input__ ERA5 observations contains __ALL__ fields, including the unchecked boxes: - - Parameters/Levels (hPa) | 1000 | 925 | 850 | 700 | 500 | 300 | 200 | 100 | 50 | 10 - :---------------------- | :----| :---| :---| :---| :---| :---| :---| :---| :--| :-| - Geopotential height, z ($gpm$) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - Specific humidity, q ($kg kg^{-1}$) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |   |   |   | - Temperature, t ($K$) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - U component of wind, u ($ms^{-1}$) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - V component of wind, v ($ms^{-1}$) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - Vertical velocity, w ($Pas^{-1}$) |   |   |   |   | ✓ |   |   |   |   |   | - -- __Baselines:__ - - Physics-based models: - - [x] UKMO: UK Meteorological Office - - [x] NCEP: National Centers for Environmental Prediction - - [x] CMA: China Meteorological Administration - - [x] ECMWF: European Centre for Medium-Range Weather Forecasts - - Data-driven models: - - [x] Lagged-Autoencoder - - [x] Fourier Neural Operator (FNO) - - [x] ResNet - - [x] UNet - - [x] ViT/ClimaX - - [x] PanguWeather - - [x] Fourcastnetv2 - -## 2. Metrics -We divide our metrics into 2 classes: (1) ML-based, which cover evaluation used in conventional computer vision and forecasting tasks, (2) Physics-based, which are aimed to construct a more physically-faithful and explainable data-driven forecast. - -- __Vision-based:__ - - [x] RMSE - - [x] Bias - - [x] Anomaly Correlation Coefficient (ACC) - - [x] Multiscale Structural Similarity Index (MS-SSIM) -- __Physics-based:__ - - [x] Spectral Divergence (SpecDiv) - - [x] Spectral Residual (SpecRes) - - -## 3. Tasks -We presented two task, where the model still takes as inputs the __FULL__ 60 variables, but the benchmarking is done on either __ALL__ or a __SUBSET__ of target variable(s). - -- __Task 1: Full Dynamics Prediction.__ -It is aimed at __ALL__ target channels simultaneously. This task is generally harder to perform but is useful to build a model that emulates the entire weather conditions. - -- __Task 2: Sparse Dynamics Prediction.__ -It is aimed at a __SUBSET__ of target channel(s). This task is useful to build long-term forecasting model for specific variables, such as near-surface temperature (t-1000) or near-surface humidity (q-1000). - -## 4. Getting Started -You can learn more about how to use our benchmark product through the following Jupyter notebooks under the `notebooks` directory. It covers topics ranging from: -- `01*_dataset_exploration` -- `02*_modeling` -- `03*_training` -- `04*_evaluation` +# ChaosBench: A Multi-Channel, Physics-Based Benchmark for Subseasonal-to-Seasonal Climate Prediction + + +ChaosBench is a benchmark project to improve long-term forecasting of chaotic systems, in particular subseasonal-to-seasonal (S2S) climate, using ML approaches. 
+ +Homepage 🔗: https://leap-stc.github.io/ChaosBench + +Paper 📚: https://arxiv.org/ + +Dataset 🤗: https://huggingface.co/datasets/juannat7/ChaosBench + + +## Features + +![Overview of ChaosBench](docs/scheme/chaosbench_scheme.jpg) + +1️⃣ __Extended Observations__. Spanning over 45 years (1979 - 2023) of ERA5 reanalysis + +2️⃣ __Diverse Baselines__. Wide selection of physics-based forecasts from leading national agencies in Europe, the UK, America, and Asia + +3️⃣ __Differentiable Physics Metrics__. Introduces two differentiable physics-based metrics to minimize the decay of power spectra at long forecasting horizon (blurriness) + +4️⃣ __Large-Scale Benchmarking__. Systematic evaluation for state-of-the-art ML-based weather models like PanguWeather, FourcastNetV2, ViT/ClimaX, and Graphcast + + +## Getting Started +- [Quickstart](https://leap-stc.github.io/ChaosBench/quickstart.html) +- [Dataset Overview](https://leap-stc.github.io/ChaosBench/dataset.html) +- [Task Overview](https://leap-stc.github.io/ChaosBench/task.html) + + +## Build Your Own Model +- [Training](https://leap-stc.github.io/ChaosBench/training.html) +- [Evaluation](https://leap-stc.github.io/ChaosBench/evaluation.html) + +## Benchmarking +- [Baseline Models](https://leap-stc.github.io/ChaosBench/baseline.html) +- [Leaderboard](https://leap-stc.github.io/ChaosBench/leaderboard.html) \ No newline at end of file diff --git a/_sources/baseline.md b/_sources/baseline.md new file mode 100644 index 0000000..9077477 --- /dev/null +++ b/_sources/baseline.md @@ -0,0 +1,57 @@ +# Baseline Models +We differentiate between physics-based and data-driven models. The former is succintly illustrated as in the figure below. + +
+ +
+ +## Model Definition +- __Physics-Based Models__: + - [x] UKMO: UK Meteorological Office + - [x] NCEP: National Centers for Environmental Prediction + - [x] CMA: China Meteorological Administration + - [x] ECMWF: European Centre for Medium-Range Weather Forecasts + +- __Data-Driven Models__: + - [x] Lagged-Autoencoder + - [x] Fourier Neural Operator (FNO) + - [x] ResNet + - [x] UNet + - [x] ViT/ClimaX + - [x] PanguWeather + - [x] Fourcastnetv2 + - [x] GraphCast + +## Model Checkpoints +Checkpoints for data-driven models are accessible from [here](https://huggingface.co/datasets/juannat7/ChaosBench/tree/main/logs) + +- Data-driven models are indicated by the `_s2s` suffix (e.g., `unet_s2s`). + +- The hyperparameter specifications are located in `version_xx/lightning_logs/hparams.yaml`. The hyperparameters encode the following: + + - `lead_time` (default: 1): arbitrary delta_t to finetune the model, for direct approach + - `n_step` (default: 1): number of autoregressive step, s, for autoregressive approach + - `only_headline`: if false, optimize for task 1; if true for task 2 + - `batch_size`: the batch size used for training + - `train_years`: list of years used for training + - `val_years`: list of years used for validation + - `epochs`: number of epoch + - `input_size`: number of input channel + - `learning_rate`: update step at each iteration + - `model_name`: the name of the model used for consistency + - `num_workers`: number of workers used in dataloader + - `output_size`: number of output channel + - `t_max`: number of cosine learning rate scheduler cycle + +__NOTE__: You will notice that for each data-driven model, there are 4 checkpoints. + +1. Version 0 - Task 1; autoregressive up to 1-day ahead +2. Version 1 - Task 1; autoregressive up to 5-day ahead +3. Version 2 - Task 2; autoregressive up to 1-day ahead +4. Version 3 - Task 2; autoregressive up to 5-day ahead + +Only for `unet_s2s` do we have many more checkpoints. This is to check for the effect of `direct` vs. `autoregressive` training approach described in the paper. In particular, the `direct` models have the following version numbers, +1. Version {0, 4, 5, 6, 7, 8, 9, 10, 11, 12} - Task 1 +2. Version {2, 13, 14, 15, 16, 17, 18, 19, 20, 21} - Task 2 + +Each element in the array corresponds to checkpoints optimized for each $\Delta T \in \{1, 5, 10, 15, 20, 25, 30, 35, 40, 44\}$. \ No newline at end of file diff --git a/_sources/dataset.md b/_sources/dataset.md new file mode 100644 index 0000000..25a907f --- /dev/null +++ b/_sources/dataset.md @@ -0,0 +1,17 @@ +# Dataset Information + +> __NOTE__: Hands-on exploration of the ChaosBench dataset in `notebooks/01a_s2s_data_exploration.ipynb` + +1. __Input:__ ERA5 Reanalysis (1979-2023) + +2. __Target:__ The following table indicates the 48 variables (channels) that are available for Physics-based models. 
Note that the __Input__ ERA5 observations contains __ALL__ fields, including the unchecked boxes: + + Parameters/Levels (hPa) | 1000 | 925 | 850 | 700 | 500 | 300 | 200 | 100 | 50 | 10 + :---------------------- | :----| :---| :---| :---| :---| :---| :---| :---| :--| :-| + Geopotential height, z ($gpm$) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | + Specific humidity, q ($kg kg^{-1}$) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |   |   |   | + Temperature, t ($K$) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | + U component of wind, u ($ms^{-1}$) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | + V component of wind, v ($ms^{-1}$) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | + Vertical velocity, w ($Pas^{-1}$) |   |   |   |   | ✓ |   |   |   |   |   | + \ No newline at end of file diff --git a/_sources/evaluation.md b/_sources/evaluation.md new file mode 100644 index 0000000..387e1f0 --- /dev/null +++ b/_sources/evaluation.md @@ -0,0 +1,32 @@ +# Evaluation + +After training your model, you can simply perform evaluation by running: + +1. __Autoregressive__ +``` +python eval_iter.py --model_name _s2s --eval_years 2023 --version_num +``` + +2. __Direct__ +``` +python eval_direct.py --model_name _s2s --eval_years 2023 --version_nums --task_num +``` + +Where `` corresponds to the version(s) that `pytorch_lightning` generated during training. + +__For example__, in our `unet_s2s` baseline model, we can run: + +- Autoregressive: `python eval_iter.py --model_name unet_s2s --eval_years 2023 --version_num 0` + +- Direct: `python eval_direct.py --model_name unet_s2s --eval_years 2023 --version_nums 0 4 5 6 7 8 9 10 11 12 --task_num 1` + + +## Accessing Baseline Scores +You can access the complete scores (in `.csv` format) for data-driven, physics-based models, climatology, and persistence [here](https://huggingface.co/datasets/juannat7/ChaosBench/tree/main/logs). Below is a snippet from `logs/climatology/eval/rmse_climatology.csv`, where each row represents ``, such as `RMSE`, at each future timestep. + +| z-10 | z-50 | z-100 | z-200 | z-300 | ... | w-1000 | +|----------|----------|----------|----------|----------|-----|----------| +| 539.7944 | 285.9499 | 215.14742| 186.43161| 166.28784| ... | 0.07912156| +| 538.9591 | 285.43832| 214.82317| 186.23743| 166.16902| ... | 0.07907272| +| 538.1366 | 284.96063| 214.51791| 186.04941| 166.04732| ... | 0.07903882| +| ... | ... | ... | ... | ... | ... | ... | diff --git a/_sources/leaderboard.md b/_sources/leaderboard.md new file mode 100644 index 0000000..8f2bb67 --- /dev/null +++ b/_sources/leaderboard.md @@ -0,0 +1,15 @@ +# Leaderboard + +We divide our metrics into 2 classes: (1) ML-based, which cover evaluation used in conventional computer vision and forecasting tasks, (2) Physics-based, which are aimed to construct a more physically-faithful and explainable data-driven forecast. + +1. __Vision-based:__ + - [x] RMSE + - [x] Bias + - [x] Anomaly Correlation Coefficient (ACC) + - [x] Multiscale Structural Similarity Index (MS-SSIM) +2. __Physics-based:__ + - [x] Spectral Divergence (SpecDiv) + - [x] Spectral Residual (SpecRes) + + +For all models (data-driven, physics-based, etc), there is a folder named `eval/`. This contains individual `.csv` files for each metric (e.g., SpecDiv, RMSE). Within each file, it contains scores for all channels in question (e.g., the entire 60 for task 1, arbitrary n for task 2, or 48 for physics-based models) across 44-day lead time. 
\ No newline at end of file diff --git a/_sources/quickstart.md b/_sources/quickstart.md new file mode 100644 index 0000000..3b7c96a --- /dev/null +++ b/_sources/quickstart.md @@ -0,0 +1,31 @@ +# Quickstart + +**Step 1**: Clone the [ChaosBench](https://github.com/leap-stc/ChaosBench) Github repository + +**Step 2**: Create local directory to store your data, e.g., +``` +cd ChaosBench +mkdir data +``` + +**Step 3**: Navigate to `chaosbench/config.py` and change the field `DATA_DIR = //ChaosBench/data` (_Provide absolute path_) + +**Step 4**: Initialize the space by running +``` +cd //ChaosBench/data/ +wget https://huggingface.co/datasets/juannat7/ChaosBench/blob/main/process.sh +chmod +x process.sh +``` +**Step 5**: Download the data + +``` +# NOTE: you can also run each line one at a time to retrieve individual dataset + +./process.sh era5 # Required: For input ERA5 data +./process.sh climatology # Required: For climatology +./process.sh ukmo # Optional: For simulation from UKMO +./process.sh ncep # Optional: For simulation from NCEP +./process.sh cma # Optional: For simulation from CMA +./process.sh ecmwf # Optional: For simulation from ECMWF +``` + \ No newline at end of file diff --git a/_sources/task.md b/_sources/task.md new file mode 100644 index 0000000..946338a --- /dev/null +++ b/_sources/task.md @@ -0,0 +1,21 @@ +# Task Overview + +We presented __TWO__ task, where the model still takes as __inputs the FULL__ 60 variables, but the benchmarking __targets ALL or SUBSET__ of variable(s). + +1. __Task 1️⃣: Full Dynamics Prediction.__ +It is aimed at __ALL__ target channels simultaneously. This task is generally harder to perform but is useful to build a model that emulates the entire weather conditions. + +2. __Task 2️⃣: Sparse Dynamics Prediction.__ +It is aimed at a __SUBSET__ of target channel(s). This task is useful to build long-term forecasting model for specific variables, such as near-surface temperature (t-1000) or near-surface humidity (q-1000). + +__NOTE__: Before training your own model [instructions here](https://leap-stc.github.io/ChaosBench/training.html), you can specify the Task you are optimizing for by changing `only_headline` field in `chaosbench/configs/_s2s.yaml` file: + +- Task 1️⃣: `only_headline: False` + +- Task 2️⃣: `only_headline: True`. By default, it is going to optimize on {t-850, z-500, q-700}. To change this, modify the `HEADLINE_VARS` field in `chaosbench/config.py` + +In addition, we also provide flags to train the model either __autoregressively__ or __directly__. + +- Autoregressive: Using current output as the next model input. The number of iterative steps is defined in the `n_step: ` field. For our baselines, we set `N_STEP = 5`. + +- Direct: Directly targeting specific time in the future. The lead time can be specified in the `lead_time: ` field. Ensure that `n_step: 1` for this case. For our baselines, we set `` $\in \{1, 5, 10, 15, 20, 25, 30, 35, 40, 44\}$ \ No newline at end of file diff --git a/_sources/training.md b/_sources/training.md new file mode 100644 index 0000000..2d062fe --- /dev/null +++ b/_sources/training.md @@ -0,0 +1,24 @@ +# Training + +> __NOTE__: Hands-on modeling and training workflow in `notebooks/02a_s2s_modeling.ipynb` and `notebooks/03a_s2s_train.ipynb` + +We will outline how one can implement their own data-driven models. Several examples, including ED, FNO, ResNet, and UNet have been provided. + +**Step 1**: Define your model class in `chaosbench/models/.py`. 
At present, we only support models built with `PyTorch` + +**Step 2**: Initialize your model in `chaosbench/models/model.py` under `__init__` method in `S2SBenchmarkModel` class + +**Step 3**: Write a configuration file in `chaosbench/configs/_s2s.yaml`. We recommend reading the details on the definition of [hyperparameters](https://leap-stc.github.io/ChaosBench/baseline.html) and the different [task]((https://leap-stc.github.io/ChaosBench/task.html) before training. Also change the `model_name: _s2s` to ensure correct pathing + +- Task 1️⃣ (autoregressive): `only_headline: False ; n_step: ` +- Task 1️⃣ (direct): `only_headline: False ; n_step: 1 ; lead_time: ` + +- Task 2️⃣ (autoregressive): `only_headline: True ; n_step: ` +- Task 2️⃣ (direct): `only_headline: True ; n_step: 1 ; lead_time: ` + + +**Step 4**: Train by running `python train.py --config_filepath chaosbench/configs/_s2s.yaml` + +**Step 5**: Done! + +__NOTE__: Remember to replace `` with your own model name, e.g., `unet`. Checkpoints and logs would be automatically generated in `logs/_s2s/`. \ No newline at end of file diff --git a/baseline.html b/baseline.html new file mode 100644 index 0000000..966db77 --- /dev/null +++ b/baseline.html @@ -0,0 +1,575 @@ + + + + + + + + + + + + Baseline Models — ChaosBench + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + +
+
+
+
+
+ + + + +
+
+ + + + + +
+ + + +
+ +
+
+ +
+
+ +
+ +
+ +
+ + +
+ +
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+
+ + + +
+

Baseline Models

+ +
+
+ +
+

Contents

+
+ +
+
+
+ + + + +
+ +
+

Baseline Models#

+

We differentiate between physics-based and data-driven models. The former is succinctly illustrated in the figure below.

+
+ +
+
+

Model Definition#

+
    +
  • Physics-Based Models:

    +
      +
    • UKMO: UK Meteorological Office

    • +
    • NCEP: National Centers for Environmental Prediction

    • +
    • CMA: China Meteorological Administration

    • +
    • ECMWF: European Centre for Medium-Range Weather Forecasts

    • +
    +
  • +
  • Data-Driven Models:

    +
      +
    • Lagged-Autoencoder

    • +
    • Fourier Neural Operator (FNO)

    • +
    • ResNet

    • +
    • UNet

    • +
    • ViT/ClimaX

    • +
    • PanguWeather

    • +
    • Fourcastnetv2

    • +
    • GraphCast

    • +
    +
  • +
+
+
+

Model Checkpoints#

+

Checkpoints for data-driven models are accessible from here

+
    +
  • Data-driven models are indicated by the _s2s suffix (e.g., unet_s2s).

  • +
  • The hyperparameter specifications are located in version_xx/lightning_logs/hparams.yaml. The hyperparameters encode the following (a short loading sketch follows this list):

    +
      +
    • lead_time (default: 1): arbitrary delta_t used to fine-tune the model for the direct approach

    • +
    • n_step (default: 1): number of autoregressive steps, s, for the autoregressive approach

    • +
    • only_headline: if false, optimize for Task 1; if true, for Task 2

    • +
    • batch_size: the batch size used for training

    • +
    • train_years: list of years used for training

    • +
    • val_years: list of years used for validation

    • +
    • epochs: number of epochs

    • +
    • input_size: number of input channels

    • +
    • learning_rate: update step at each iteration

    • +
    • model_name: the name of the model used for consistency

    • +
    • num_workers: number of workers used in dataloader

    • +
    • output_size: number of output channels

    • +
    • t_max: cycle length of the cosine learning rate scheduler

    • +
    +
  • +
+
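As referenced above, a minimal sketch for inspecting one of these hyperparameter files with PyYAML; the exact directory nesting under logs/ is an assumption based on the paths quoted on this page:

```
import yaml

# Illustrative path: logs/<model>_s2s/version_xx/lightning_logs/hparams.yaml
hparams_path = 'logs/unet_s2s/version_0/lightning_logs/hparams.yaml'

with open(hparams_path) as f:
    hparams = yaml.safe_load(f)

# Keys follow the list above (lead_time, n_step, only_headline, ...)
print(hparams['model_name'], hparams['lead_time'], hparams['n_step'])
```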

NOTE: You will notice that for each data-driven model, there are 4 checkpoints.

+
    +
  1. Version 0 - Task 1; autoregressive up to 1-day ahead

  2. +
  3. Version 1 - Task 1; autoregressive up to 5-day ahead

  4. +
  5. Version 2 - Task 2; autoregressive up to 1-day ahead

  6. +
  7. Version 3 - Task 2; autoregressive up to 5-day ahead

  8. +
+

Only for unet_s2s do we provide many more checkpoints. These are used to study the effect of the direct vs. autoregressive training approaches described in the paper. In particular, the direct models have the following version numbers:

+
    +
  1. Version {0, 4, 5, 6, 7, 8, 9, 10, 11, 12} - Task 1

  2. +
  3. Version {2, 13, 14, 15, 16, 17, 18, 19, 20, 21} - Task 2

  4. +
+

Each element in the array corresponds to checkpoints optimized for each \(\Delta T \in \{1, 5, 10, 15, 20, 25, 30, 35, 40, 44\}\).

+
+
+ + + + +
+ + + + + + + + +
+ + + +
+ + +
+
+ + +
+ + +
+
+
+ + + + + +
+
+ + \ No newline at end of file diff --git a/dataset.html b/dataset.html new file mode 100644 index 0000000..381ffaa --- /dev/null +++ b/dataset.html @@ -0,0 +1,582 @@ + + + + + + + + + + + + Dataset Information — ChaosBench + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + +
+
+
+
+
+ + + + +
+
+ + + + + +
+ + + +
+ +
+
+ +
+
+ +
+ +
+ +
+ + +
+ +
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+
+ + + +
+

Dataset Information

+ +
+
+ +
+
+
+ + + + +
+ +
+

Dataset Information#

+
+

NOTE: Hands-on exploration of the ChaosBench dataset in notebooks/01a_s2s_data_exploration.ipynb

+
+
    +
  1. Input: ERA5 Reanalysis (1979-2023)

  2. +
  3. Target: The following table indicates the 48 variables (channels) that are available for Physics-based models. Note that the Input ERA5 observations contain ALL fields, including the unchecked boxes (a short sketch enumerating these channels follows the table):

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

    Parameters/Levels (hPa) | 1000 | 925 | 850 | 700 | 500 | 300 | 200 | 100 | 50 | 10
    :---------------------- | :----| :---| :---| :---| :---| :---| :---| :---| :--| :-|
    Geopotential height, z ($gpm$) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
    Specific humidity, q ($kg kg^{-1}$) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |   |   |   |
    Temperature, t ($K$) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
    U component of wind, u ($ms^{-1}$) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
    V component of wind, v ($ms^{-1}$) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
    Vertical velocity, w ($Pas^{-1}$) |   |   |   |   | ✓ |   |   |   |   |   |
    +
  4. +
+
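As mentioned above, here is a small sketch that enumerates these 48 channels using the <param>-<level> naming that appears in the score files (e.g., z-500, t-850); it is derived purely from the table, not from ChaosBench code:

```
# Enumerate the 48 channels available to the physics-based models (from the table above)
LEVELS = [1000, 925, 850, 700, 500, 300, 200, 100, 50, 10]   # hPa

PARAM_LEVELS = {
    'z': LEVELS,        # geopotential height: all 10 levels
    'q': LEVELS[:7],    # specific humidity: 1000 ... 200 hPa only
    't': LEVELS,        # temperature: all 10 levels
    'u': LEVELS,        # u component of wind: all 10 levels
    'v': LEVELS,        # v component of wind: all 10 levels
    'w': [500],         # vertical velocity: 500 hPa only
}

CHANNELS = [f'{param}-{level}' for param, levels in PARAM_LEVELS.items() for level in levels]
assert len(CHANNELS) == 48
```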
+ + + + +
+ + + + + + + + +
+ + + +
+ + +
+
+ + +
+ + +
+
+
+ + + + + +
+
+ + \ No newline at end of file diff --git a/evaluation.html b/evaluation.html new file mode 100644 index 0000000..1a4dfea --- /dev/null +++ b/evaluation.html @@ -0,0 +1,574 @@ + + + + + + + + + + + + Evaluation — ChaosBench + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + +
+
+
+
+
+ + + + +
+
+ + + + + +
+ + + +
+ +
+
+ +
+
+ +
+ +
+ +
+ + +
+ +
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+
+ + + +
+

Evaluation

+ +
+
+ +
+

Contents

+
+ +
+
+
+ + + + +
+ +
+

Evaluation#

+

After training your model, you can simply perform evaluation by running:

+
    +
  1. Autoregressive

  2. +
+
python eval_iter.py --model_name <YOUR_MODEL>_s2s --eval_years 2023 --version_num <VERSION_NUM>
+
+
+
    +
  1. Direct

  2. +
+
python eval_direct.py --model_name <YOUR_MODEL>_s2s --eval_years 2023 --version_nums <VERSION_NUM> --task_num <TASK_NUM>
+
+
+

Here, <VERSION_NUM(S)> corresponds to the version(s) that pytorch_lightning generated during training.

+

For example, in our unet_s2s baseline model, we can run:

+
    +
  • Autoregressive: python eval_iter.py --model_name unet_s2s --eval_years 2023 --version_num 0

  • +
  • Direct: python eval_direct.py --model_name unet_s2s --eval_years 2023 --version_nums 0 4 5 6 7 8 9 10 11 12 --task_num 1

  • +
+
+

Accessing Baseline Scores#

+

You can access the complete scores (in .csv format) for the data-driven and physics-based models, climatology, and persistence here. Below is a snippet from logs/climatology/eval/rmse_climatology.csv, where each row holds the <METRIC> values (here, RMSE) for every channel at one future timestep.

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

| z-10 | z-50 | z-100 | z-200 | z-300 | ... | w-1000 |
|----------|-----------|-----------|-----------|-----------|-----|------------|
| 539.7944 | 285.9499  | 215.14742 | 186.43161 | 166.28784 | ... | 0.07912156 |
| 538.9591 | 285.43832 | 214.82317 | 186.23743 | 166.16902 | ... | 0.07907272 |
| 538.1366 | 284.96063 | 214.51791 | 186.04941 | 166.04732 | ... | 0.07903882 |
| ...      | ...       | ...       | ...       | ...       | ... | ...        |

+
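A minimal pandas sketch for reading a score file like the one above; the columns selected are those visible in the snippet:

```
import pandas as pd

# Each row is one lead-time step; each column is a channel (e.g., z-10, w-1000)
scores = pd.read_csv('logs/climatology/eval/rmse_climatology.csv')
print(scores[['z-10', 'z-50', 'w-1000']].head())
```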
+
+ + + + +
+ + + + + + + + +
+ + + +
+ + +
+
+ + +
+ + +
+
+
+ + + + + +
+
+ + \ No newline at end of file diff --git a/genindex.html b/genindex.html index 1bd07d8..a27f04b 100644 --- a/genindex.html +++ b/genindex.html @@ -159,12 +159,18 @@
diff --git a/leaderboard.html b/leaderboard.html new file mode 100644 index 0000000..b38c34b --- /dev/null +++ b/leaderboard.html @@ -0,0 +1,490 @@ + + + + + + + + + + + + Leaderboard — ChaosBench + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + +
+
+
+
+
+ + + + +
+
+ + + + + +
+ + + +
+ +
+
+ +
+
+ +
+ +
+ +
+ + +
+ +
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+
+ + + +
+

Leaderboard

+ +
+
+ +
+
+
+ + + + +
+ +
+

Leaderboard#

+

We divide our metrics into two classes: (1) ML-based, which covers evaluations used in conventional computer vision and forecasting tasks, and (2) Physics-based, which aims to construct more physically faithful and explainable data-driven forecasts.

+
    +
  1. Vision-based:

    +
      +
    • RMSE

    • +
    • Bias

    • +
    • Anomaly Correlation Coefficient (ACC)

    • +
    • Multiscale Structural Similarity Index (MS-SSIM)

    • +
    +
  2. +
  3. Physics-based:

    +
      +
    • Spectral Divergence (SpecDiv)

    • +
    • Spectral Residual (SpecRes)

    • +
    +
  4. +
+

For every model (data-driven, physics-based, etc.), there is a folder named eval/ containing one .csv file per metric (e.g., SpecDiv, RMSE). Each file holds scores for all channels in question (the entire 60 for Task 1, an arbitrary n for Task 2, or 48 for physics-based models) across the 44-day lead time.

+
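For reference, a minimal numpy sketch of plain (unweighted) RMSE and Bias between a forecast field and its target; ChaosBench's own implementation (e.g., any spatial weighting) may differ:

```
import numpy as np

def rmse(forecast: np.ndarray, target: np.ndarray) -> float:
    """Root-mean-square error over all grid points."""
    return float(np.sqrt(np.mean((forecast - target) ** 2)))

def bias(forecast: np.ndarray, target: np.ndarray) -> float:
    """Mean error (forecast minus target) over all grid points."""
    return float(np.mean(forecast - target))
```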
+ + + + +
+ + + + + + + + +
+ + + +
+ + +
+
+ + +
+ + +
+
+
+ + + + + + + + \ No newline at end of file diff --git a/objects.inv b/objects.inv index c4a711603ec2ff71cf401409f157f2ad8e9b2eec..24642efbf80ea91196eff9dd3dcf59e6dd7752b9 100644 GIT binary patch delta 302 zcmV+}0nz@$0-6Jmb$^k;Zi6rkhVOZbJiwT=Yj36HP-#_J*$bG&gNTr9Nnqpc7f2{U znzWn!`TuRlxfjRtwV>AV1J{gZFZ8~U=hpR-o?6g0N5$F$J<{tSoya^>3gr_m+sO)M zvm7jPx~O_E9sSUTRzqzOtX3d1r?V=?^dvy$kVpiM zrq*D1+B?BP{kj3tYyL;~&na6A!Qvl0NQ`rm>OZz}a;<;p&iyqA)>#K;cejlO=i4JE z{{IYX_mct((kCj^OK6bzdv|HuIS;D$xpEuPi~_%jGa&o{o1HeYsE-->1w^In!&Wnw AAOHXW delta 214 zcmV;{04e{P1H%H4b$^h}3c@fDgztHZeSr#Iz4;3YLeYcgn5=Cu$x5bD(LNK~E?b7qp5T0uveFFGbQiglrl+5>qgL?ac(F4Uqj02ubb6#}G(HsR&f=%t#QlTSiCgmDbVM2YKYoEl QE6ln0wQNwl0r3=%8ZO&u{Qv*} diff --git a/quickstart.html b/quickstart.html new file mode 100644 index 0000000..8759f61 --- /dev/null +++ b/quickstart.html @@ -0,0 +1,506 @@ + + + + + + + + + + + + Quickstart — ChaosBench + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + +
+
+
+
+
+ + + + +
+
+ + + + + +
+ + + +
+ +
+
+ +
+
+ +
+ +
+ +
+ + +
+ +
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+
+ + + +
+

Quickstart

+ +
+
+ +
+
+
+ + + + +
+ +
+

Quickstart#

+

Step 1: Clone the ChaosBench Github repository

+

Step 2: Create a local directory to store your data, e.g.,

+
cd ChaosBench
+mkdir data
+
+
+

Step 3: Navigate to chaosbench/config.py and change the field DATA_DIR = /<YOUR_WORKING_DIR>/ChaosBench/data (Provide absolute path)

+
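For reference, a minimal sketch of what the edited field in chaosbench/config.py might look like; the path shown is a placeholder and the rest of that file is omitted:

```
# chaosbench/config.py (sketch -- only the field mentioned in Step 3)
DATA_DIR = '/home/<YOUR_USERNAME>/ChaosBench/data'  # absolute path to the data directory created in Step 2
```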

Step 4: Initialize the space by running

+
cd /<YOUR_WORKING_DIR>/ChaosBench/data/
+wget https://huggingface.co/datasets/juannat7/ChaosBench/blob/main/process.sh
+chmod +x process.sh
+
+
+

Step 5: Download the data

+
# NOTE: you can also run each line one at a time to retrieve individual dataset
+
+./process.sh era5            # Required: For input ERA5 data
+./process.sh climatology     # Required: For climatology
+./process.sh ukmo            # Optional: For simulation from UKMO
+./process.sh ncep            # Optional: For simulation from NCEP
+./process.sh cma             # Optional: For simulation from CMA
+./process.sh ecmwf           # Optional: For simulation from ECMWF
+
+
+
+ + + + +
+ + + + + + + + +
+ + + +
+ + +
+
+ + +
+ + +
+
+
+ + + + + +
+
+ + \ No newline at end of file diff --git a/search.html b/search.html index b79c1b0..6c42c0d 100644 --- a/search.html +++ b/search.html @@ -161,12 +161,18 @@ diff --git a/searchindex.js b/searchindex.js index e79a337..e90c29d 100644 --- a/searchindex.js +++ b/searchindex.js @@ -1 +1 @@ -Search.setIndex({"docnames": ["README", "intro", "markdown"], "filenames": ["README.md", "intro.md", "markdown.md"], "titles": ["ChaosBench - A benchmark for long-term forecasting of chaotic systems", "Welcome to your Jupyter Book", "Markdown Files"], "terms": {"project": 0, "improv": 0, "particular": [0, 1], "subseason": 0, "season": 0, "s2": 0, "weather": 0, "current": 0, "featur": 0, "includ": 0, "input": 0, "era5": 0, "reanalysi": 0, "1979": 0, "2022": 0, "target": 0, "The": 0, "follow": 0, "tabl": 0, "indic": 0, "48": 0, "variabl": 0, "channel": 0, "ar": 0, "avail": 0, "physic": 0, "base": 0, "model": 0, "note": 0, "observ": 0, "contain": 0, "all": 0, "field": 0, "uncheck": 0, "box": 0, "paramet": 0, "level": 0, "hpa": 0, "1000": 0, "925": 0, "850": 0, "700": 0, "500": 0, "300": 0, "200": 0, "100": 0, "50": 0, "10": 0, "geopotenti": 0, "height": 0, "z": 0, "gpm": 0, "specif": 0, "humid": 0, "q": 0, "kg": 0, "temperatur": 0, "t": 0, "k": 0, "u": 0, "compon": 0, "wind": 0, "ms": 0, "v": 0, "vertic": 0, "veloc": 0, "w": 0, "pa": 0, "baselin": 0, "ukmo": 0, "uk": 0, "meteorolog": 0, "offic": 0, "ncep": 0, "nation": 0, "center": 0, "environment": 0, "predict": 0, "cma": 0, "china": 0, "administr": 0, "ecmwf": 0, "european": 0, "centr": 0, "medium": 0, "rang": 0, "data": 0, "driven": 0, "lag": 0, "autoencod": 0, "fourier": 0, "neural": 0, "oper": 0, "fno": 0, "resnet": 0, "unet": 0, "vit": 0, "climax": 0, "panguweath": 0, "fourcastnetv2": 0, "we": 0, "divid": 0, "our": 0, "class": 0, "ml": 0, "which": 0, "cover": 0, "evalu": 0, "us": 0, "convent": 0, "comput": 0, "vision": 0, "aim": 0, "construct": 0, "more": [0, 1], "faith": 0, "explain": 0, "rmse": 0, "bia": 0, "anomali": 0, "correl": 0, "coeffici": 0, "acc": 0, "multiscal": 0, "structur": [0, 1], "similar": 0, "index": 0, "ssim": 0, "spectral": 0, "diverg": 0, "specdiv": 0, "residu": 0, "specr": 0, "present": 0, "two": 0, "where": 0, "still": 0, "take": 0, "full": 0, "60": 0, "done": 0, "either": 0, "subset": 0, "s": [0, 2], "dynam": 0, "It": [0, 1], "simultan": 0, "thi": [0, 1, 2], "gener": 0, "harder": 0, "perform": 0, "build": 0, "emul": 0, "entir": 0, "condit": 0, "spars": 0, "surfac": 0, "you": [0, 1, 2], "can": [0, 2], "learn": 0, "about": 0, "how": [0, 1], "product": 0, "through": 0, "jupyt": [0, 2], "notebook": [0, 2], "under": 0, "directori": 0, "topic": [0, 1], "from": 0, "01": 0, "_dataset_explor": 0, "02": 0, "_model": 0, "03": 0, "_train": 0, "04": 0, "_evalu": 0, "small": 1, "sampl": 1, "give": 1, "feel": 1, "content": [1, 2], "show": [1, 2], "off": [1, 2], "few": 1, "major": 1, "file": 1, "type": 1, "well": 1, "some": [1, 2], "doe": 1, "go": 1, "depth": 1, "ani": 1, "check": 1, "out": 1, "document": 1, "inform": 1, "acheck": 1, "page": 1, "bundl": 1, "see": 1, "whether": 2, "write": 2, "your": 2, "book": 2, "ipynb": 2, "regular": 2, "md": 2, "ll": 2, "same": 2, "flavor": 2, "call": 2, "myst": 2, "simpl": 2, "help": 2, "get": 2, "start": 2, "syntax": 2, "just": 2, "starter": 2, "lot": 2, "jupyterbook": 2, "org": 2}, "objects": {}, "objtypes": {}, "objnames": {}, "titleterms": {"chaosbench": 0, "A": 0, "benchmark": 0, "long": 0, "term": 0, "forecast": 0, "chaotic": 0, "system": 0, "1": 0, "dataset": 0, "2": 0, "metric": 0, "3": 0, "task": 0, 
"4": 0, "get": 0, "start": 0, "welcom": 1, "your": 1, "jupyt": 1, "book": 1, "markdown": 2, "file": 2, "learn": 2, "more": 2}, "envversion": {"sphinx.domains.c": 2, "sphinx.domains.changeset": 1, "sphinx.domains.citation": 1, "sphinx.domains.cpp": 6, "sphinx.domains.index": 1, "sphinx.domains.javascript": 2, "sphinx.domains.math": 2, "sphinx.domains.python": 3, "sphinx.domains.rst": 2, "sphinx.domains.std": 2, "sphinx.ext.intersphinx": 1, "sphinx": 56}}) \ No newline at end of file +Search.setIndex({"docnames": ["README", "baseline", "dataset", "evaluation", "leaderboard", "quickstart", "task", "training"], "filenames": ["README.md", "baseline.md", "dataset.md", "evaluation.md", "leaderboard.md", "quickstart.md", "task.md", "training.md"], "titles": ["ChaosBench: A Multi-Channel, Physics-Based Benchmark for Subseasonal-to-Seasonal Climate Prediction", "Baseline Models", "Dataset Information", "Evaluation", "Leaderboard", "Quickstart", "Task Overview", "Training"], "terms": {"project": 0, "improv": 0, "long": [0, 6], "term": [0, 6], "forecast": [0, 1, 4, 6], "chaotic": 0, "system": 0, "particular": [0, 1], "s2": 0, "us": [0, 1, 4, 6], "ml": [0, 4], "approach": [0, 1], "homepag": 0, "http": [0, 5, 7], "leap": [0, 7], "stc": [0, 7], "github": [0, 5, 7], "io": [0, 7], "paper": [0, 1], "arxiv": 0, "org": 0, "dataset": [0, 5], "huggingfac": [0, 5], "co": [0, 5], "juannat7": [0, 5], "1": [0, 1, 2, 3, 4, 5, 6, 7], "extend": 0, "observ": [0, 2], "span": 0, "over": 0, "45": 0, "year": [0, 1], "1979": [0, 2], "2023": [0, 2, 3], "era5": [0, 2, 5], "reanalysi": [0, 2], "2": [0, 1, 4, 5, 6, 7], "divers": 0, "baselin": [0, 6], "wide": 0, "select": 0, "from": [0, 1, 3, 5], "lead": [0, 4, 6], "nation": [0, 1], "agenc": 0, "europ": 0, "uk": [0, 1], "america": 0, "asia": 0, "3": [0, 1, 5, 7], "differenti": [0, 1], "metric": [0, 3, 4], "introduc": 0, "two": [0, 6], "minim": 0, "decai": 0, "power": 0, "spectra": 0, "horizon": 0, "blurri": 0, "4": [0, 1, 3, 5, 7], "larg": 0, "scale": 0, "systemat": 0, "evalu": [0, 4], "state": 0, "art": 0, "weather": [0, 1, 6], "like": 0, "panguweath": [0, 1], "fourcastnetv2": [0, 1], "vit": [0, 1], "climax": [0, 1], "graphcast": [0, 1], "quickstart": 0, "overview": 0, "task": [0, 1, 4, 7], "train": [0, 1, 3, 6], "leaderboard": 0, "we": [1, 3, 4, 6, 7], "between": 1, "physic": [1, 2, 3, 4], "base": [1, 2, 3, 4], "data": [1, 3, 4, 5, 7], "driven": [1, 3, 4, 7], "The": [1, 2, 6], "former": 1, "succintli": 1, "illustr": 1, "figur": 1, "below": [1, 3], "ukmo": [1, 5], "meteorolog": 1, "offic": 1, "ncep": [1, 5], "center": 1, "environment": 1, "predict": [1, 6], "cma": [1, 5], "china": 1, "administr": 1, "ecmwf": [1, 5], "european": 1, "centr": 1, "medium": 1, "rang": 1, "lag": 1, "autoencod": 1, "fourier": 1, "neural": 1, "oper": 1, "fno": [1, 7], "resnet": [1, 7], "unet": [1, 7], "ar": [1, 2, 4, 6], "access": 1, "here": [1, 3, 6], "indic": [1, 2], "_s2": [1, 3, 6, 7], "suffix": 1, "e": [1, 4, 5, 7], "g": [1, 4, 5, 7], "unet_s2": [1, 3], "hyperparamet": [1, 7], "specif": [1, 2, 6], "locat": 1, "version_xx": 1, "lightning_log": 1, "hparam": 1, "yaml": [1, 6, 7], "encod": 1, "follow": [1, 2], "lead_tim": [1, 6, 7], "default": [1, 6], "arbitrari": [1, 4], "delta_t": 1, "finetun": 1, "direct": [1, 3, 6, 7], "n_step": [1, 6, 7], "number": [1, 6], "autoregress": [1, 3, 6, 7], "step": [1, 5, 6, 7], "s": [1, 3, 6], "only_headlin": [1, 6, 7], "fals": [1, 6, 7], "optim": [1, 6], "true": [1, 6, 7], "batch_siz": 1, "batch": 1, "size": 1, "train_year": 1, "list": 1, "val_year": 1, "valid": 1, 
"epoch": 1, "input_s": 1, "input": [1, 2, 5, 6], "channel": [1, 2, 4, 6], "learning_r": 1, "updat": 1, "each": [1, 3, 4, 5], "iter": [1, 6], "model_nam": [1, 3, 7], "name": [1, 4, 7], "consist": 1, "num_work": 1, "worker": 1, "dataload": 1, "output_s": 1, "output": [1, 6], "t_max": 1, "cosin": 1, "learn": 1, "rate": 1, "schedul": 1, "cycl": 1, "note": [1, 2, 5, 6, 7], "you": [1, 3, 5, 6], "notic": 1, "version": [1, 3], "0": [1, 3], "up": 1, "dai": [1, 4], "ahead": 1, "5": [1, 3, 5, 6, 7], "onli": [1, 7], "do": 1, "have": [1, 7], "mani": 1, "more": [1, 4], "thi": [1, 4, 6], "check": 1, "effect": 1, "vs": 1, "describ": 1, "In": [1, 6], "6": [1, 3], "7": [1, 3], "8": [1, 3], "9": [1, 3], "10": [1, 2, 3, 6], "11": [1, 3], "12": [1, 3], "13": 1, "14": 1, "15": [1, 6], "16": 1, "17": 1, "18": 1, "19": 1, "20": [1, 6], "21": 1, "element": 1, "arrai": 1, "correspond": [1, 3], "delta": 1, "t": [1, 2, 6], "25": [1, 6], "30": [1, 6], "35": [1, 6], "40": [1, 6], "44": [1, 4, 6], "hand": [2, 7], "explor": 2, "chaosbench": [2, 5, 6, 7], "notebook": [2, 7], "01a_s2s_data_explor": 2, "ipynb": [2, 7], "target": [2, 6], "tabl": 2, "48": [2, 4], "variabl": [2, 6], "avail": 2, "model": [2, 3, 4, 6, 7], "contain": [2, 4], "all": [2, 4, 6], "field": [2, 5, 6], "includ": [2, 7], "uncheck": 2, "box": 2, "paramet": 2, "level": 2, "hpa": 2, "1000": [2, 3, 6], "925": 2, "850": [2, 6], "700": [2, 6], "500": [2, 6], "300": [2, 3], "200": [2, 3], "100": [2, 3], "50": [2, 3], "geopotenti": 2, "height": 2, "z": [2, 3, 6], "gpm": 2, "humid": [2, 6], "q": [2, 6], "kg": 2, "temperatur": [2, 6], "k": 2, "u": 2, "compon": 2, "wind": 2, "ms": [2, 4], "v": 2, "vertic": 2, "veloc": 2, "w": [2, 3], "pa": 2, "after": 3, "your": [3, 5, 6, 7], "can": [3, 5, 6, 7], "simpli": 3, "perform": [3, 6], "run": [3, 5, 7], "python": [3, 7], "eval_it": 3, "py": [3, 5, 6, 7], "your_model": [3, 6, 7], "eval_year": 3, "version_num": 3, "eval_direct": 3, "task_num": 3, "where": [3, 6], "pytorch_lightn": 3, "gener": [3, 6, 7], "dure": 3, "For": [3, 4, 5, 6], "exampl": [3, 7], "our": [3, 4, 6], "complet": 3, "csv": [3, 4], "format": 3, "climatolog": [3, 5], "persist": 3, "snippet": 3, "log": [3, 7], "eval": [3, 4], "rmse_climatolog": 3, "row": 3, "repres": 3, "rmse": [3, 4], "futur": [3, 6], "timestep": 3, "539": 3, "7944": 3, "285": 3, "9499": 3, "215": 3, "14742": 3, "186": 3, "43161": 3, "166": 3, "28784": 3, "07912156": 3, "538": 3, "9591": 3, "43832": 3, "214": 3, "82317": 3, "23743": 3, "16902": 3, "07907272": 3, "1366": 3, "284": 3, "96063": 3, "51791": 3, "04941": 3, "04732": 3, "07903882": 3, "divid": 4, "class": [4, 7], "which": 4, "cover": 4, "convent": 4, "comput": 4, "vision": 4, "aim": [4, 6], "construct": 4, "faith": 4, "explain": 4, "bia": 4, "anomali": 4, "correl": 4, "coeffici": 4, "acc": 4, "multiscal": 4, "structur": 4, "similar": 4, "index": 4, "ssim": 4, "spectral": 4, "diverg": 4, "specdiv": 4, "residu": 4, "specr": 4, "etc": 4, "folder": 4, "individu": [4, 5], "file": [4, 6, 7], "within": 4, "score": 4, "question": 4, "entir": [4, 6], "60": [4, 6], "n": 4, "across": 4, "time": [4, 5, 6], "clone": 5, "repositori": 5, "creat": 5, "local": 5, "directori": 5, "store": 5, "cd": 5, "mkdir": 5, "navig": 5, "config": [5, 6, 7], "chang": [5, 6, 7], "data_dir": 5, "your_working_dir": 5, "provid": [5, 6, 7], "absolut": 5, "path": [5, 7], "initi": [5, 7], "space": 5, "wget": 5, "blob": 5, "main": 5, "process": 5, "sh": 5, "chmod": 5, "x": 5, "download": 5, "also": [5, 6, 7], "line": 5, "one": [5, 7], "retriev": 5, "requir": 5, "option": 
5, "simul": 5, "present": [6, 7], "still": 6, "take": 6, "full": 6, "benchmark": 6, "subset": 6, "dynam": 6, "It": 6, "simultan": 6, "harder": 6, "build": 6, "emul": 6, "condit": 6, "spars": 6, "surfac": 6, "befor": [6, 7], "own": [6, 7], "instruct": 6, "specifi": 6, "By": 6, "go": 6, "To": 6, "modifi": 6, "headline_var": 6, "addit": 6, "flag": 6, "either": 6, "directli": 6, "current": 6, "next": 6, "defin": [6, 7], "set": 6, "ensur": [6, 7], "case": 6, "workflow": 7, "02a_s2s_model": 7, "03a_s2s_train": 7, "outlin": 7, "how": 7, "implement": 7, "sever": 7, "ed": 7, "been": 7, "At": 7, "support": 7, "built": 7, "pytorch": 7, "under": 7, "__init__": 7, "method": 7, "s2sbenchmarkmodel": 7, "write": 7, "configur": 7, "recommend": 7, "read": 7, "detail": 7, "definit": 7, "differ": 7, "html": 7, "correct": 7, "config_filepath": 7, "done": 7, "rememb": 7, "replac": 7, "checkpoint": 7, "would": 7, "automat": 7}, "objects": {}, "objtypes": {}, "objnames": {}, "titleterms": {"chaosbench": 0, "A": 0, "multi": 0, "channel": 0, "physic": 0, "base": 0, "benchmark": 0, "subseason": 0, "season": 0, "climat": 0, "predict": 0, "featur": 0, "get": 0, "start": 0, "build": 0, "your": 0, "own": 0, "model": [0, 1], "baselin": [1, 3], "definit": 1, "checkpoint": 1, "dataset": 2, "inform": 2, "evalu": 3, "access": 3, "score": 3, "leaderboard": 4, "quickstart": 5, "task": 6, "overview": 6, "train": 7}, "envversion": {"sphinx.domains.c": 2, "sphinx.domains.changeset": 1, "sphinx.domains.citation": 1, "sphinx.domains.cpp": 6, "sphinx.domains.index": 1, "sphinx.domains.javascript": 2, "sphinx.domains.math": 2, "sphinx.domains.python": 3, "sphinx.domains.rst": 2, "sphinx.domains.std": 2, "sphinx.ext.intersphinx": 1, "sphinx": 56}}) \ No newline at end of file diff --git a/task.html b/task.html new file mode 100644 index 0000000..bcbd08a --- /dev/null +++ b/task.html @@ -0,0 +1,501 @@ + + + + + + + + + + + + Task Overview — ChaosBench + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + +
+
+
+
+
+ + + + +
+
+ + + + + +
+ + + +
+ +
+
+ +
+
+ +
+ +
+ +
+ + +
+ +
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+
+ + + +
+

Task Overview

+ +
+
+ +
+
+
+ + + + +
+ +
+

Task Overview#

+

We present TWO tasks, where the model still takes as input the FULL 60 variables, but the benchmarking targets ALL or a SUBSET of the variables.

+
    +
  1. Task 1️⃣: Full Dynamics Prediction. +It is aimed at ALL target channels simultaneously. This task is generally harder to perform but is useful for building a model that emulates the entire set of weather conditions.

  2. +
  3. Task 2️⃣: Sparse Dynamics Prediction. +It is aimed at a SUBSET of target channel(s). This task is useful for building long-term forecasting models for specific variables, such as near-surface temperature (t-1000) or near-surface humidity (q-1000).

  4. +
+

NOTE: Before training your own model (instructions here), you can specify the task you are optimizing for by changing the only_headline field in the chaosbench/configs/<YOUR_MODEL>_s2s.yaml file:

+
    +
  • Task 1️⃣: only_headline: False

  • +
  • Task 2️⃣: only_headline: True. By default, it will optimize on {t-850, z-500, q-700}. To change this, modify the HEADLINE_VARS field in chaosbench/config.py (see the sketch after this list)

  • +
+
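As mentioned above, this is what the HEADLINE_VARS field in chaosbench/config.py might look like; the default set comes from the docs, while the exact formatting in the real file is an assumption:

```
# chaosbench/config.py (sketch, hypothetical formatting)
# Channels targeted by Task 2 when only_headline: True
HEADLINE_VARS = ['t-850', 'z-500', 'q-700']
```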

In addition, we also provide flags to train the model either autoregressively or directly.

+
    +
  • Autoregressive: Using current output as the next model input. The number of iterative steps is defined in the n_step: <N_STEP> field. For our baselines, we set N_STEP = 5.

  • +
  • Direct: Directly targeting specific time in the future. The lead time can be specified in the lead_time: <LEAD_TIME> field. Ensure that n_step: 1 for this case. For our baselines, we set <LEAD_TIME> \(\in \{1, 5, 10, 15, 20, 25, 30, 35, 40, 44\}\)

  • +
+
+ + + + +
+ + + + + + + + +
+ + + +
+ + +
+
+ + +
+ + +
+
+
+ + + + + +
+
+ + \ No newline at end of file diff --git a/training.html b/training.html new file mode 100644 index 0000000..2a60fa2 --- /dev/null +++ b/training.html @@ -0,0 +1,498 @@ + + + + + + + + + + + + Training — ChaosBench + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + +
+
+
+
+
+ + + + +
+
+ + + + + +
+ + + +
+ +
+
+ +
+
+ +
+ +
+ +
+ + +
+ +
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ +
+
+ + + +
+

Training

+ +
+
+ +
+
+
+ + + + +
+ +
+

Training#

+
+

NOTE: Hands-on modeling and training workflow in notebooks/02a_s2s_modeling.ipynb and notebooks/03a_s2s_train.ipynb

+
+

We outline how you can implement your own data-driven model. Several examples, including ED, FNO, ResNet, and UNet, have been provided.

+

Step 1: Define your model class in chaosbench/models/<YOUR_MODEL>.py. At present, we only support models built with PyTorch

+
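A minimal sketch of such a model class, assuming only a standard torch.nn.Module that maps (batch, input_size, H, W) to (batch, output_size, H, W); the file name and architecture are placeholders:

```
# chaosbench/models/mymodel.py (hypothetical file name)
import torch
import torch.nn as nn

class MyModel(nn.Module):
    """Toy fully-convolutional model: (B, C_in, H, W) -> (B, C_out, H, W)."""

    def __init__(self, input_size: int, output_size: int, hidden_size: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(input_size, hidden_size, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(hidden_size, output_size, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)
```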

Step 2: Initialize your model in chaosbench/models/model.py under __init__ method in S2SBenchmarkModel class

+
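A hypothetical sketch of how the new model might be registered inside S2SBenchmarkModel.__init__; apart from the class name S2SBenchmarkModel and the hyperparameter names, the surrounding attribute and argument names are assumptions about the repository's internals:

```
# Inside S2SBenchmarkModel.__init__ in chaosbench/models/model.py (sketch)
# `model_args` stands in for however the class receives its hyperparameters.
if model_args['model_name'] == 'mymodel_s2s':
    self.model = MyModel(
        input_size=model_args['input_size'],
        output_size=model_args['output_size'],
    )
```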

Step 3: Write a configuration file in chaosbench/configs/<YOUR_MODEL>_s2s.yaml. We recommend reading the details on the definition of hyperparameters and the different tasks (https://leap-stc.github.io/ChaosBench/task.html) before training. Also change model_name: <YOUR_MODEL>_s2s to ensure correct pathing. A minimal config sketch is shown after the list below.

+
    +
  • Task 1️⃣ (autoregressive): only_headline: False ; n_step: <N_STEP>

  • +
  • Task 1️⃣ (direct): only_headline: False ; n_step: 1 ; lead_time: <LEAD_TIME>

  • +
  • Task 2️⃣ (autoregressive): only_headline: True ; n_step: <N_STEP>

  • +
  • Task 2️⃣ (direct): only_headline: True ; n_step: 1 ; lead_time: <LEAD_TIME>

  • +
+
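As referenced above, a minimal sketch of such a configuration file (here, Task 1 with autoregressive training); the field names follow the hyperparameter list on the Baseline Models page, while the values and any omitted fields are placeholders:

```
# chaosbench/configs/mymodel_s2s.yaml (sketch; values are placeholders)
model_name: mymodel_s2s
only_headline: False   # Task 1
n_step: 5              # autoregressive: number of rollout steps
lead_time: 1           # direct: set n_step: 1 and choose lead_time instead
batch_size: 32
learning_rate: 0.001
input_size: 60
output_size: 60
```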

Step 4: Train by running python train.py --config_filepath chaosbench/configs/<YOUR_MODEL>_s2s.yaml

+

Step 5: Done!

+

NOTE: Remember to replace <YOUR_MODEL> with your own model name, e.g., unet. Checkpoints and logs will be automatically generated in logs/<YOUR_MODEL>_s2s/.

+
+ + + + +
+ + + + + + + + +
+ + + +
+ + +
+
+ + +
+ + +
+
+
+ + + + + + + + \ No newline at end of file