diff --git a/README.html b/README.html
index 26e7826..82c3595 100644
--- a/README.html
+++ b/README.html
@@ -65,7 +65,7 @@
@@ -166,6 +166,7 @@
@@ -402,6 +402,7 @@
@@ -463,11 +457,11 @@
@@ -491,7 +485,6 @@
[hunks over generated Sphinx HTML (navigation sidebar, "Getting Started"/"Motivation"/"Quickstart" links, and the README body); the rendered markup did not survive extraction and is omitted here. The corresponding source changes appear in the _sources/*.md diffs below.]
diff --git a/_sources/README.md b/_sources/README.md
index ee346d4..02d3a72 100644
--- a/_sources/README.md
+++ b/_sources/README.md
@@ -22,6 +22,7 @@ Dataset 🤗: https://huggingface.co/datasets/LEAP/ChaosBench
 4️⃣ __Large-Scale Benchmarking__. Systematic evaluation for state-of-the-art ML-based weather models like PanguWeather, FourcastNetV2, ViT/ClimaX, and Graphcast
 
 ## Getting Started
+- [Motivation](https://leap-stc.github.io/ChaosBench/motivation.html)
 - [Quickstart](https://leap-stc.github.io/ChaosBench/quickstart.html)
 - [Dataset Overview](https://leap-stc.github.io/ChaosBench/dataset.html)
 - [Task Overview](https://leap-stc.github.io/ChaosBench/task.html)
@@ -33,12 +34,4 @@ Dataset 🤗: https://huggingface.co/datasets/LEAP/ChaosBench
 
 ## Benchmarking
 - [Baseline Models](https://leap-stc.github.io/ChaosBench/baseline.html)
-- [Leaderboard](https://leap-stc.github.io/ChaosBench/leaderboard.html)
-
-
-## Motivation
-1️⃣ __Collapse to Climatology__. Performing comparable or worse than climatology renders these state-of-the-art-models operationally unusable
-![Collapse](docs/all_rmse_sota.png)
-
-2️⃣ __Blurring Artifact__. Averaged-out forecasts is of little use when one attempts to predict extreme events requiring high-fidelity on the S2S scale (e.g., droughts, hurricanes)
-![Blurring](docs/preds_climax_q700_direct_Task1.png)
\ No newline at end of file
+- [Leaderboard](https://leap-stc.github.io/ChaosBench/leaderboard.html)
\ No newline at end of file
diff --git a/_sources/motivation.md b/_sources/motivation.md
new file mode 100644
index 0000000..2463a74
--- /dev/null
+++ b/_sources/motivation.md
@@ -0,0 +1,8 @@
+# Motivation
+Our benchmark is among the first to perform a large-scale evaluation of existing state-of-the-art models, and finds that methods originally developed for weather-scale applications _fail_ on S2S tasks, including:
+
+1️⃣ __Collapse to Climatology__. Performing comparably to, or worse than, climatology renders these state-of-the-art models operationally unusable
+![Collapse](../docs/all_rmse_sota.png)
+
+2️⃣ __Blurring Artifact__. Averaged-out forecasts are of little use when one attempts to predict extreme events requiring high fidelity on the S2S scale (e.g., droughts, hurricanes)
+![Blurring](../docs/preds_climax_q700_direct_Task1.png)
\ No newline at end of file
diff --git a/_sources/quickstart.md b/_sources/quickstart.md
index 3779cf5..7559b4a 100644
--- a/_sources/quickstart.md
+++ b/_sources/quickstart.md
@@ -13,7 +13,7 @@ mkdir data
 **Step 4**: Initialize the space by running
 ```
 cd ChaosBench/data/
-wget https://huggingface.co/datasets/juannat7/ChaosBench/blob/main/process.sh
+wget https://huggingface.co/datasets/LEAP/ChaosBench/blob/main/process.sh
 chmod +x process.sh
 ```
 **Step 5**: Download the data
diff --git a/baseline.html b/baseline.html
index d98852f..4aad8fc 100644
--- a/baseline.html
+++ b/baseline.html
@@ -167,6 +167,7 @@
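One note for anyone running the patched Step 4 of `_sources/quickstart.md` by hand: the patch keeps a `/blob/` URL, but huggingface.co typically serves raw file bytes from `/resolve/`, while `/blob/` returns the HTML viewer page. The sketch below (not part of the patch; the network steps are left commented out) shows the URL rewrite:

```shell
# Sketch: derive the raw-download form of the patched Step 4 URL.
# /blob/<rev>/<path> is the Hugging Face web viewer; /resolve/<rev>/<path>
# serves the file content itself, which is what wget needs.
blob_url="https://huggingface.co/datasets/LEAP/ChaosBench/blob/main/process.sh"
raw_url=$(printf '%s\n' "$blob_url" | sed 's|/blob/|/resolve/|')
echo "$raw_url"   # -> https://huggingface.co/datasets/LEAP/ChaosBench/resolve/main/process.sh
# cd ChaosBench/data/
# wget "$raw_url"
# chmod +x process.sh
```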