Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement climate projections #929

Open
wants to merge 56 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 11 commits
Commits
Show all changes
56 commits
Select commit Hold shift + click to select a range
73601f9
Initial commit
ekatef Nov 19, 2023
c68abbe
Move a projection script into script directory
ekatef Nov 28, 2023
2beb879
Restructure into Snakemake workflow
ekatef Nov 29, 2023
1a24bad
Organize code
ekatef Nov 29, 2023
174b0b1
Add inputs
ekatef Nov 30, 2023
4594541
Fix Snakemake inputs
ekatef Nov 30, 2023
83c6b0b
Improve documenting
ekatef Nov 30, 2023
b196e5a
Code clean-up
ekatef Nov 30, 2023
369bc9b
Remove hardcoding and add debug plotting
ekatef Nov 30, 2023
d1fd454
Add parameters to the tacked configs
ekatef Nov 30, 2023
21a1751
Merge branch 'main' into implement_climate_projections
ekatef Nov 30, 2023
33225c7
Merge branch 'main' into implement_climate_projections
ekatef Dec 26, 2023
72272d5
Fix wildcards for the projection
ekatef Dec 26, 2023
b288ce7
Merge branch 'pypsa-meets-earth:main' into implement_climate_projections
ekatef Dec 26, 2023
7d3de0f
Merge branch 'implement_climate_projections' of https://github.com/ek…
ekatef Dec 26, 2023
bc5575f
Adjust inputs-outputs in the script docstring
ekatef Dec 26, 2023
d77af95
Add methodology
ekatef Dec 26, 2023
51c8a09
Add docstrings
ekatef Dec 26, 2023
cebf6a3
Add TODOs
ekatef Dec 26, 2023
0d7ec1d
Fix format
ekatef Dec 26, 2023
0e507c1
Fix input
ekatef Dec 27, 2023
47c628a
Remove hardcoding for a parameter name in a projection dataset
ekatef Dec 27, 2023
dc735ba
Remove hardcoding for a parameter name in a cutout dataset
ekatef Dec 27, 2023
4500f45
Wrap plots into functions
ekatef Dec 27, 2023
7d8e17e
Update docstrings
ekatef Dec 27, 2023
42da0b8
Draft stretch implementation
ekatef Dec 27, 2023
e4780a9
Remove a TODO
ekatef Dec 28, 2023
dbe81ac
Use numpy functions
ekatef Dec 28, 2023
5fa9712
Clarify a docstring
ekatef Dec 28, 2023
b4e1893
Improve inputs
ekatef Dec 28, 2023
a292e14
Fix names of the inputs
ekatef Dec 28, 2023
3715f9f
Improve naming
ekatef Dec 28, 2023
a33b9ec
Add a stretch-reference to the docstring
ekatef Dec 28, 2023
d061ef3
Wrap preparation of cmip6-xarrays into a function
ekatef Dec 28, 2023
1c46763
Add stretch
ekatef Dec 28, 2023
f3825f2
Implement stretch
ekatef Dec 28, 2023
989ae50
Fix names
ekatef Dec 28, 2023
f68bf21
Test stretch
ekatef Dec 28, 2023
ba3d1a7
Fix plot parameters
ekatef Dec 28, 2023
c1bfe0c
Improve comments to the config parameters
ekatef Jan 20, 2024
9d21fde
Add comments to the projection calculations
ekatef Jan 20, 2024
58ec438
Merge branch 'main' into implement_climate_projections
ekatef Mar 8, 2024
a8f8636
Fix formatting
ekatef Mar 8, 2024
33d90d2
Enhance docstrings
ekatef Mar 8, 2024
f5cb70e
Improve naming
ekatef Mar 8, 2024
2177bb1
Fix parameters in a function call
ekatef Mar 8, 2024
49117f2
Merge branch 'main' into implement_climate_projections
ekatef Mar 8, 2024
19b65fc
Improve explanation for stretch calculations
ekatef Mar 8, 2024
7ca6216
Improve naming and add enable parameters to the configs
ekatef Mar 9, 2024
39fd2d1
Fix typo
ekatef Mar 9, 2024
517e422
Replace hard-coding
ekatef Mar 9, 2024
1f3520d
Improve naming
ekatef Mar 9, 2024
dd4b060
Add a check for time match
ekatef Mar 9, 2024
f71390f
Fix typo
ekatef Mar 9, 2024
14707c8
Update the docstring
ekatef Mar 9, 2024
47e7946
Remove TODOs
ekatef Mar 9, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 25 additions & 0 deletions Snakefile
Original file line number Diff line number Diff line change
Expand Up @@ -363,6 +363,31 @@ if config["enable"].get("build_cutout", False):
"scripts/build_cutout.py"


if config["enable"].get("modify_cutout", False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could call the option climate_projection_cutout or something like that?

I was considering if adding an option different from build_cutout may be necessary, but I believe so to avoid overwriting existing "original" cutouts

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, agree that it would be nice to keep the main workflow as safe as possible from interactions with these additions.

The option name revised. Have I understood you idea correctly? :)


rule build_climate_projections:
params:
snapshots=config["snapshots"],
climate_scenario=config["projection"]["climate_scenario"],
present_year=config["projection"]["present_year"],
future_year=config["projection"]["future_year"],
years_window=config["projection"]["years_window"],
input:
cutout="cutouts/" + CDIR + "{cutout}.nc",
cmip6_avr="/Users/ekaterina/Documents/_github_/cmip6/Global Atlas/tas_ssp245/t_CMIP6_ssp245_mon_201501-210012.nc",
output:
"cutouts/" + CDIR + "{cutout}_{future_year}.nc",
log:
"logs/" + RDIR + "build_climate_projections/{cutout}_{future_year}.log",
benchmark:
"benchmarks/" + RDIR + "build_climate_projections_{cutout}_{future_year}"
threads: ATLITE_NPROCESSES
resources:
mem_mb=ATLITE_NPROCESSES * 1000,
script:
"scripts/build_climate_projections.py"


if config["enable"].get("build_natura_raster", False):

rule build_natura_raster:
Expand Down
8 changes: 8 additions & 0 deletions config.default.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -204,6 +204,14 @@ atlite:
# y: [33., 72] # manual set cutout range


projection:
climate_scenario: ssp245
present_year: 2020
future_year: 2050
years_window: 5 # a half of width currently
gcm_selection: false # false to use all the available global climate models


renewable:
onwind:
cutout: cutout-2013-era5
Expand Down
9 changes: 9 additions & 0 deletions config.tutorial.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -218,6 +218,15 @@ atlite:
# x: [-12., 35.] # set cutout range manual, instead of automatic by boundaries of country
# y: [33., 72] # manual set cutout range


projection:
climate_scenario: ssp245
present_year: 2020
future_year: 2050
years_window: 5 # a half of width currently
gcm_selection: false # false to use all the available global climate models


renewable:
onwind:
cutout: cutout-2013-era5-tutorial
Expand Down
210 changes: 210 additions & 0 deletions scripts/build_climate_projections.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,210 @@
# -*- coding: utf-8 -*-
# SPDX-FileCopyrightText: PyPSA-Earth and PyPSA-Eur Authors
#
# SPDX-License-Identifier: AGPL-3.0-or-later

# -*- coding: utf-8 -*-
"""
Creates a dataset corresponding to the projected climate in a requested year.
davide-f marked this conversation as resolved.
Show resolved Hide resolved

Relevant Settings
-----------------

.. code:: yaml

projection:
climate_scenario:
present_year:
future_year:
years_window:
gcm_selection:

Inputs
------

- ``cutouts/cutout.nc``: confer :ref:`cutout`, a cutout file produced by altile
- ``data/cmip6_avr.cn``:

Outputs
-------

- ``cutouts/{cutout}_{future_year}.nc"``: A cutout modified to account for future climate conditions

Description
-----------

The rule :mod:`build_climate_projections` creates a cutout file which corresponds to a requested year in the future. Temperature projectons are being calculated combining data for cutout.nc which is assumed to be representative of the past climate and an ensemble of CMIP6 globale climate models to account for the future climate
"""


import datetime as dt
import os

import atlite
import geopandas as gpd
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pypsa
import xarray as xr
from _helpers import configure_logging, create_logger
from shapely.geometry import LineString as Line
from shapely.geometry import Point

logger = create_logger(__name__)


def crop_cmip6(cmip6_xr, cutout_xr):
davide-f marked this conversation as resolved.
Show resolved Hide resolved
# spartial margin needed to avoid data losses during further interpolation
d_pad = 1

cmip6_region = cmip6_xr.sel(
lat=slice(
min(cutout_xr.coords["y"].values) - d_pad,
max(cutout_xr.coords["y"].values) + d_pad,
),
lon=slice(
min(cutout_xr.coords["x"].values) - d_pad,
max(cutout_xr.coords["x"].values) + d_pad,
),
)
Comment on lines +111 to +121
Copy link
Member

@davide-f davide-f Jan 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very nice! I'm wondering, instead of doing min/max, may it be that the first element of the list is also the lowest?
so like cutout_xr.coords["y"].values[0] is the lowest and cutout_xr.coords["y"].values[-1] is the max value?

if so we could avoid the max/mins in the whole document.
Maybe we could use some utility functions for that.

I'm wondering, but is this cmip6 compatible with the object cutout in atlite?
As it is an xarray maybe?

is we load it as a cutout, we can use the bounds function of the cutout object and this facilitates a lot the whole style.
This is a guess though.

Copying here the link to the bounds function and the atlite file, that can give some ideas for the coding style :)
https://github.com/PyPSA/atlite/blob/a71bca2f6f7221b70dbbc371752aef56d937b1ff/atlite/cutout.py#L314

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! A really great idea to align with atlite approaches.

As for the particular bounds calculation, I have investigated this idea, but would say that it doesn't feel like a clear approach to use "magic" index numbers. Agree that it's quite common. But personally, I'm not capable to keep in mind which integer means what, and that is usually quite tricky to deal with the code using such conventions. Not sure it's justifiable but performance gains, as currently performance bottlenecks seem to be in spatial calculations.

What would potentially be interesting to adopt from atlite is work with data chunks. It may be probably a good idea to investigate it for increasing the performance. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmm maybe it would be good to at least create here an auxiliary function that, having as input an xarray, returns the boundaries like (xmin, xmax, ymin, ymax) or alike.
having that function may help clarifying the readability regardless of their implementation; although, I believe the xmin may be the first value of x and the last one the maximum.
Likewise, an auxiliary function that calculate the dx and dy may be useful although trivial: the advantage may be readability and manutentibility.

What do you think?

return cmip6_region


def interpolate_cmip6_to_cutout_grid(cmip6_xr, cutout_xr):
# TODO read from the cutout file instead of hardcoding
dx_new = 0.3

newlon = np.arange(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be good to use np.max and alike but unsure performences increas significantly

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the hint! Definitely worth investigation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have done some profiling, and np.min( ), np.max( ) have been 5-10% faster. Agree that it's not breakthrough in performance, but it is consistent improvement which doesn't affect code readability. So, I'd opt for it. Thanks for the suggestion! :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking, by looking at the atlite code it raised few ideas:

  1. is it possible to load the cmip file as an atlite cutout? this doesn't mean to use the cutout for atlite but simply leverage on the utility functions it has
  2. if not, the cutout.py file gives quite some hints to improve the functionalities if the task is actually time consuming. For example, the minimum value may always be the first of the list while the maximum the last one (to verify though)

Link to the cutout.py file : https://github.com/PyPSA/atlite/blob/a71bca2f6f7221b70dbbc371752aef56d937b1ff/atlite/cutout.py#L314

round(min(cutout_xr.coords["x"].values), 1),
round(max(cutout_xr.coords["x"].values) + dx_new, 1),
dx_new,
)
newlat = np.arange(
round(min(cutout_xr.coords["y"].values), 1),
round(max(cutout_xr.coords["y"].values) + dx_new, 1),
dx_new,
)
cmip6_interp = cmip6_xr.interp(time=cmip6_xr.time, lat=newlat, lon=newlon)

return cmip6_interp


# TODO fix years_window
def subset_by_time(cmip6_xr, month, year, years_window):
# TODO add warning in case there are no enough years
cmip6_in_period = cmip6_xr.where(
(
(cmip6_xr["time.month"] == month)
& (cmip6_xr["time.year"] >= year - years_window)
& (cmip6_xr["time.year"] < year + years_window - 1)
),
drop=True,
)
return cmip6_in_period


def calculate_proj_of_average(cmip6_xr, month, year0, year1, years_window):
cmip6_interp_year0 = subset_by_time(
cmip6_xr,
month,
year=year0,
)
cmip6_interp_year1 = subset_by_time(cmip6_xr, month=month, year=year1)
dt_interp = cmip6_interp_year1["t"].mean("member").mean(
"time"
) - cmip6_interp_year0["t"].mean("member").mean("time")
return dt_interp


def build_projection_for_month(cutout_xr, dt_xr, month):
k_time = cutout_xr.time.dt.month.isin(month)

for i in range(0, len(cutout_xr.y)):
for j in range(0, len(cutout_xr.x)):
cutout_xr.temperature[k_time, i, j] = np.add(
cutout_xr.temperature[k_time, i, j],
np.array([dt_xr[i, j].item()] * k_time.sum().item()),
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first sight, it feels like this code may take long time to compute.
What are you trying to do here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to modify a temperature value at each spatial point according to a projection for an increase in the monthly temperature. You are absolutely right that it may be quite time-consuming. Do you have any ideas on possible approaches to increase performance?


return cutout_xr


def build_cutout_future(cutout_xr, cmip6_xr, months, year0, year1, years_window):
for k_month in [months]:
dt_k = calculate_proj_of_average(
cmip6_xr=cmip6_xr, month=k_month, year0=year0, year1=year1, years_window=5
)

build_projection_for_month(cutout_xr, dt_k, k_month)

cutout_xr = cutout_xr.where(cutout_xr.time.dt.month.isin(months), drop=True)

return cutout_xr


if __name__ == "__main__":
if "snakemake" not in globals():
from _helpers import mock_snakemake

os.chdir(os.path.dirname(os.path.abspath(__file__)))
snakemake = mock_snakemake(
"build_climate_projections",
climate_scenario="ssp2-2.6",
present_year=2020,
future_year=2050,
years_window=5,
cutout="africa-2013-era5",
)
configure_logging(snakemake)

climate_scenario = snakemake.params.climate_scenario
present_year = snakemake.params.present_year
future_year = snakemake.params.future_year
years_window = snakemake.params.years_window
cutout = snakemake.input.cutout
cmip6 = snakemake.input.cmip6_avr

snapshots = pd.date_range(freq="h", **snakemake.params.snapshots)
season_in_focus = snapshots.month.unique().to_list()

cmip6_xr = xr.open_dataset(cmip6)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How to load this dataset?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, the dataset is available via Copernicus and licensed as CC-BY 4.0, being IPCC product 🙏🏽

So, it can be downloaded manually or included into a dedicated databundle or loaded via Copernicus API with a request as simple as

c.retrieve(
    'projections-climate-atlas',
    {
        'format': 'zip',
        'origin': 'CMIP6',
        'experiment': 'ssp2_4_5',
        'domain': 'global',
        'period': '2015-2100',
        'variable': 'monthly_mean_of_daily_mean_temperature',
    },
    'download.zip')

cutout_xr = xr.open_dataset(cutout)
cmip6_region = crop_cmip6(cmip6_xr, cutout_xr)

cmip6_region_interp = interpolate_cmip6_to_cutout_grid(
cmip6_xr=cmip6_region, cutout_xr=cutout_xr
)

# -----------------------------------------------------------------
# to be replaced after debug
# -----------------------------------------------------------------
# graphical test of interpolation
fig = plt.figure()
cmip6_region_interp["t"].mean("time").mean("member").plot()
fig.savefig("test_cmip6_interp.png", dpi=700)

fig = plt.figure()
(cutout_xr["temperature"] - 273.15).mean("time").plot(cmap="OrRd")
fig.savefig("results/test_present.png", dpi=700)
# -----------------------------------------------------------------

# TODO Add a check for time match
cutout_future = build_cutout_future(
cutout_xr=cutout_xr,
cmip6_xr=cmip6_region_interp,
months=season_in_focus,
year0=present_year,
year1=future_year,
years_window=years_window,
)

cutout_future.to_netcdf(snakemake.output[0])

# -----------------------------------------------------------------
# to be replaced after debug
# -----------------------------------------------------------------
# graphical test of projection
fig = plt.figure()
(cutout_future["temperature"] - 273.15).mean("time").plot(cmap="OrRd")
fig.savefig("results/test_cmip6_project.png", dpi=700)
Loading