📊 energy: Correct energy prices for inflation #3919

Merged
merged 21 commits into from
Feb 11, 2025
25225dd
📊 energy: Correct energy prices for inflation
pabloarosado Feb 3, 2025
daba38e
Update Eurostat and Ember energy prices data, with some metadata impr…
pabloarosado Feb 3, 2025
25985df
Archive unused steps
pabloarosado Feb 3, 2025
6b2d04c
Add steps for HICP and combine with gas and electricity prices garden…
pabloarosado Feb 3, 2025
6007560
Adjust PPS for inflation
pabloarosado Feb 3, 2025
3a99554
Improve metadata
pabloarosado Feb 3, 2025
372b284
Add description processing
pabloarosado Feb 3, 2025
7afbdb4
Add description key for PPS
pabloarosado Feb 3, 2025
6779464
Adjust consumer prices in euros for inflation
pabloarosado Feb 3, 2025
ccb018a
Add producer prices in industry data from Eurostat
pabloarosado Feb 6, 2025
434aa1d
Deflate wholesale energy prices
pabloarosado Feb 6, 2025
d933f52
Explain deflation of wholesale prices in metadata
pabloarosado Feb 7, 2025
e50378e
Add key descriptions of deflation process
pabloarosado Feb 7, 2025
7a87963
Remove months where only a few countries are informed
pabloarosado Feb 7, 2025
937b220
Improve deflation calculation for consumer prices
pabloarosado Feb 7, 2025
b682635
Use 2021 base to avoid losing Ember data
pabloarosado Feb 10, 2025
ee9df83
Add sanity checks to price indexes
pabloarosado Feb 10, 2025
1d6fc26
Rebase HICP prices from 2015 to 2021
pabloarosado Feb 10, 2025
813b345
Improve metadata
pabloarosado Feb 10, 2025
c1198dc
Improve metadata and code, and fix issue with description processing
pabloarosado Feb 10, 2025
8ba3dcf
Improve dropdown names
pabloarosado Feb 11, 2025
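
The commits "Use 2021 base to avoid losing Ember data" and "Rebase HICP prices from 2015 to 2021" change the reference year of a price index. The HICP garden step is not part of the files shown here, so the following is only a minimal sketch of one common way to rebase an index with pandas. It assumes a hypothetical table with columns `country`, `date` (YYYY-MM) and `hicp` (2015 = 100); `rebase_index` and the numbers are illustrative, not taken from the PR.

```python
import pandas as pd


def rebase_index(df: pd.DataFrame, column: str, base_year: int) -> pd.DataFrame:
    """Rescale an index so that, for each country, its mean over `base_year` is 100."""
    df = df.copy()
    year = df["date"].str[:4].astype(int)
    # Mean of the index over the new base year, per country.
    base_mean = df[year == base_year].groupby("country")[column].mean()
    # Countries without data in the base year end up with missing values.
    df[column] = 100 * df[column] / df["country"].map(base_mean)
    return df


# Made-up monthly values for a single country.
df = pd.DataFrame(
    {
        "country": ["France"] * 4,
        "date": ["2015-01", "2015-02", "2021-01", "2021-02"],
        "hicp": [99.0, 101.0, 108.0, 112.0],
    }
)
rebased = rebase_index(df, column="hicp", base_year=2021)
print(rebased)
```

Rebasing only rescales the series, so relative changes between any two months are unchanged.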
31 changes: 31 additions & 0 deletions dag/archive/energy.yml
@@ -381,3 +381,34 @@ steps:
#
data://grapher/energy/2024-11-01/photovoltaic_cost_and_capacity:
- data://garden/energy/2024-11-01/photovoltaic_cost_and_capacity
#
# Eurostat - Energy statistics, prices of natural gas and electricity
#
data://meadow/eurostat/2024-11-05/gas_and_electricity_prices:
- snapshot://eurostat/2024-11-05/gas_and_electricity_prices.zip
#
# Ember - European wholesale electricity prices
#
data://meadow/ember/2024-11-20/european_wholesale_electricity_prices:
- snapshot://ember/2024-11-20/european_wholesale_electricity_prices.csv
#
# Eurostat - Energy statistics, prices of natural gas and electricity
#
data://garden/eurostat/2024-11-05/gas_and_electricity_prices:
- data://meadow/eurostat/2024-11-05/gas_and_electricity_prices
#
# Ember - European wholesale electricity prices
#
data://garden/ember/2024-11-20/european_wholesale_electricity_prices:
- data://meadow/ember/2024-11-20/european_wholesale_electricity_prices
#
# Energy prices
#
data://garden/energy/2024-11-20/energy_prices:
- data://garden/ember/2024-11-20/european_wholesale_electricity_prices
- data://garden/eurostat/2024-11-05/gas_and_electricity_prices
#
# Energy prices
#
data://grapher/energy/2024-11-20/energy_prices:
- data://garden/energy/2024-11-20/energy_prices
94 changes: 58 additions & 36 deletions dag/energy.yml
@@ -229,26 +229,6 @@ steps:
data://grapher/energy/2024-11-15/photovoltaic_cost_and_capacity:
- data://garden/energy/2024-11-15/photovoltaic_cost_and_capacity
#
# Eurostat - Energy statistics, prices of natural gas and electricity
#
data://meadow/eurostat/2024-11-05/gas_and_electricity_prices:
- snapshot://eurostat/2024-11-05/gas_and_electricity_prices.zip
#
# Eurostat - Energy statistics, prices of natural gas and electricity
#
data://garden/eurostat/2024-11-05/gas_and_electricity_prices:
- data://meadow/eurostat/2024-11-05/gas_and_electricity_prices
#
# Ember - European wholesale electricity prices
#
data://meadow/ember/2024-11-20/european_wholesale_electricity_prices:
- snapshot://ember/2024-11-20/european_wholesale_electricity_prices.csv
#
# Ember - European wholesale electricity prices
#
data://garden/ember/2024-11-20/european_wholesale_electricity_prices:
- data://meadow/ember/2024-11-20/european_wholesale_electricity_prices
#
# IEA - Fossil fuel subsidies
#
data://meadow/iea/2024-11-20/fossil_fuel_subsidies:
@@ -264,22 +244,6 @@ steps:
data://grapher/iea/2024-11-20/fossil_fuel_subsidies:
- data://garden/iea/2024-11-20/fossil_fuel_subsidies
#
# Energy prices
#
data://garden/energy/2024-11-20/energy_prices:
- data://garden/eurostat/2024-11-05/gas_and_electricity_prices
- data://garden/ember/2024-11-20/european_wholesale_electricity_prices
#
# Energy prices
#
data://grapher/energy/2024-11-20/energy_prices:
- data://garden/energy/2024-11-20/energy_prices
#
# Energy prices explorer
#
export://multidim/energy/latest/energy_prices:
- data://grapher/energy/2024-11-20/energy_prices
#
# Benchmark Mineral Intelligence - Battery cell prices.
#
data-private://meadow/benchmark_mineral_intelligence/2024-11-29/battery_cell_prices:
@@ -295,3 +259,61 @@ steps:
#
data-private://grapher/benchmark_mineral_intelligence/2024-11-29/battery_cell_prices:
- data-private://garden/benchmark_mineral_intelligence/2024-11-29/battery_cell_prices
#
# Eurostat - Harmonised index of consumer prices (HICP)
#
data://meadow/eurostat/2025-02-03/harmonised_index_of_consumer_prices:
- snapshot://eurostat/2025-02-03/harmonised_index_of_consumer_prices.gz
#
# Eurostat - Harmonised index of consumer prices (HICP)
#
data://garden/eurostat/2025-02-03/harmonised_index_of_consumer_prices:
- data://meadow/eurostat/2025-02-03/harmonised_index_of_consumer_prices
#
# Eurostat - Producer prices in industry
#
data://meadow/eurostat/2025-02-03/producer_prices_in_industry:
- snapshot://eurostat/2025-02-03/producer_prices_in_industry.gz
#
# Eurostat - Producer prices in industry
#
data://garden/eurostat/2025-02-03/producer_prices_in_industry:
- data://meadow/eurostat/2025-02-03/producer_prices_in_industry
#
# Eurostat - Energy statistics, prices of natural gas and electricity
#
data://meadow/eurostat/2025-02-03/gas_and_electricity_prices:
- snapshot://eurostat/2025-02-03/gas_and_electricity_prices.zip
#
# Eurostat - Energy statistics, prices of natural gas and electricity
#
data://garden/eurostat/2025-02-03/gas_and_electricity_prices:
- data://meadow/eurostat/2025-02-03/gas_and_electricity_prices
- data://garden/eurostat/2025-02-03/harmonised_index_of_consumer_prices
#
# Ember - European wholesale electricity prices
#
data://meadow/ember/2025-02-03/european_wholesale_electricity_prices:
- snapshot://ember/2025-02-03/european_wholesale_electricity_prices.csv
#
# Ember - European wholesale electricity prices
#
data://garden/ember/2025-02-03/european_wholesale_electricity_prices:
- data://meadow/ember/2025-02-03/european_wholesale_electricity_prices
- data://garden/eurostat/2025-02-03/producer_prices_in_industry
#
# Energy prices
#
data://garden/energy/2025-02-03/energy_prices:
- data://garden/eurostat/2025-02-03/gas_and_electricity_prices
- data://garden/ember/2025-02-03/european_wholesale_electricity_prices
#
# Energy prices
#
data://grapher/energy/2025-02-03/energy_prices:
- data://garden/energy/2025-02-03/energy_prices
#
# Energy prices explorer
#
export://multidim/energy/latest/energy_prices:
- data://grapher/energy/2025-02-03/energy_prices
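
Each entry added to dag/energy.yml maps a step to the steps it depends on, so the new HICP and PPI garden steps must run before the steps that deflate prices with them. As a rough illustration of the resulting execution order (not how the ETL itself resolves the DAG), the sketch below feeds the new entries to `graphlib.TopologicalSorter` from the Python standard library (3.9+). Snapshot dependencies are omitted, and the `GARDEN`, `MEADOW` and `V` shorthands are introduced here only for brevity.

```python
from graphlib import TopologicalSorter

GARDEN = "data://garden"
MEADOW = "data://meadow"
V = "2025-02-03"

# Step -> set of steps it depends on (snapshot dependencies omitted).
dag = {
    f"{GARDEN}/eurostat/{V}/harmonised_index_of_consumer_prices": {
        f"{MEADOW}/eurostat/{V}/harmonised_index_of_consumer_prices",
    },
    f"{GARDEN}/eurostat/{V}/producer_prices_in_industry": {
        f"{MEADOW}/eurostat/{V}/producer_prices_in_industry",
    },
    f"{GARDEN}/eurostat/{V}/gas_and_electricity_prices": {
        f"{MEADOW}/eurostat/{V}/gas_and_electricity_prices",
        f"{GARDEN}/eurostat/{V}/harmonised_index_of_consumer_prices",
    },
    f"{GARDEN}/ember/{V}/european_wholesale_electricity_prices": {
        f"{MEADOW}/ember/{V}/european_wholesale_electricity_prices",
        f"{GARDEN}/eurostat/{V}/producer_prices_in_industry",
    },
    f"{GARDEN}/energy/{V}/energy_prices": {
        f"{GARDEN}/eurostat/{V}/gas_and_electricity_prices",
        f"{GARDEN}/ember/{V}/european_wholesale_electricity_prices",
    },
    f"data://grapher/energy/{V}/energy_prices": {f"{GARDEN}/energy/{V}/energy_prices"},
    "export://multidim/energy/latest/energy_prices": {f"data://grapher/energy/{V}/energy_prices"},
}

# static_order() yields every step only after all of its dependencies.
order = list(TopologicalSorter(dag).static_order())
for step in order:
    print(step)
```

The exact order of independent steps may vary, but the explorer export always comes last because every other step is one of its ancestors.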
@@ -0,0 +1,33 @@
{
"Austria": "Austria",
"Belgium": "Belgium",
"Bulgaria": "Bulgaria",
"Croatia": "Croatia",
"Czechia": "Czechia",
"Denmark": "Denmark",
"Estonia": "Estonia",
"Finland": "Finland",
"France": "France",
"Germany": "Germany",
"Greece": "Greece",
"Hungary": "Hungary",
"Ireland": "Ireland",
"Italy": "Italy",
"Latvia": "Latvia",
"Lithuania": "Lithuania",
"Luxembourg": "Luxembourg",
"Montenegro": "Montenegro",
"Netherlands": "Netherlands",
"North Macedonia": "North Macedonia",
"Norway": "Norway",
"Poland": "Poland",
"Portugal": "Portugal",
"Romania": "Romania",
"Serbia": "Serbia",
"Slovakia": "Slovakia",
"Slovenia": "Slovenia",
"Spain": "Spain",
"Sweden": "Sweden",
"Switzerland": "Switzerland",
"United Kingdom": "United Kingdom"
}
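
The mapping above goes from the country names used in the source to harmonized names (here they happen to be identical). The garden step applies it through `geo.harmonize_countries`; the sketch below is not that function, only a simplified pandas illustration of the idea, with a hypothetical `harmonize` helper that fails on names missing from the mapping.

```python
import pandas as pd

# A few entries copied from the mapping above (source name -> harmonized name).
COUNTRY_MAPPING = {
    "Czechia": "Czechia",
    "North Macedonia": "North Macedonia",
    "United Kingdom": "United Kingdom",
}


def harmonize(df: pd.DataFrame, mapping: dict) -> pd.DataFrame:
    """Rename countries using `mapping`, failing loudly on names that are not mapped."""
    unknown = sorted(set(df["country"]) - set(mapping))
    if unknown:
        raise ValueError(f"Countries missing from the mapping: {unknown}")
    return df.assign(country=df["country"].map(mapping))


# Made-up prices, only to exercise the helper.
df = pd.DataFrame({"country": ["Czechia", "United Kingdom"], "price": [95.2, 110.4]})
harmonized = harmonize(df, COUNTRY_MAPPING)
print(harmonized)
```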
@@ -0,0 +1,37 @@
definitions:
  common:
    presentation:
      topic_tags:
        - Energy
      grapher_config:
        note: |-
          This data is expressed in constant {EUROS_YEAR} euros, deflated using the Producer Price Index for energy.
    processing_level: minor
    description_key:
      - Wholesale electricity prices are the average spot prices in the day-ahead market, where electricity is traded for delivery to consumers the following day. These prices fluctuate based on supply and demand and are typically set on an hourly basis.
      - Prices are measured in euros per [megawatt-hour](#dod:watt-hours).
      - These are the prices paid to electricity producers and do not represent the final cost for households or businesses, which also includes additional costs like distribution, transmission, and taxes.
      - To account for inflation, prices have been adjusted using the Producer Price Index (PPI) for energy, with {EUROS_YEAR} as the reference year.
    description_processing: |-
      - To account for inflation, prices have been divided by the Producer Price Index (PPI) for energy (and multiplied by 100), using {EUROS_YEAR} as the reference year. This adjusts for changes in producer costs over time, providing a more consistent measure of price trends.

dataset:
  update_period_days: 365

tables:
  european_wholesale_electricity_prices_monthly:
    variables:
      price:
        title: Electricity wholesale monthly price
        unit: 'constant {EUROS_YEAR} euros per megawatt-hour'
        short_unit: "€/MWh"
        description_short: |-
          Monthly average wholesale price of electricity sold, in euros per [megawatt-hour](#dod:watt-hours). Prices have been adjusted for inflation but not for differences in living costs between countries.
  european_wholesale_electricity_prices_annual:
    variables:
      price:
        title: Electricity wholesale annual price
        unit: "constant {EUROS_YEAR} euros per megawatt-hour"
        short_unit: "€/MWh"
        description_short: |-
          Annual average wholesale price of electricity sold, in euros per [megawatt-hour](#dod:watt-hours). Prices have been adjusted for inflation but not for differences in living costs between countries.
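
The `description_processing` text above amounts to: constant price = nominal price × 100 / PPI, where the PPI equals 100 in the reference year. A minimal sketch with made-up numbers follows; the `deflate` helper and all values are illustrative and assume an energy PPI with 2021 = 100.

```python
import pandas as pd


def deflate(nominal: pd.Series, price_index: pd.Series) -> pd.Series:
    """Convert nominal prices to constant prices of the index's reference year (index = 100)."""
    return nominal * 100 / price_index


prices = pd.DataFrame(
    {
        "date": ["2020-06-01", "2021-06-01", "2022-06-01"],
        "price": [30.0, 75.0, 220.0],  # Nominal €/MWh (made up).
        "ppi": [80.0, 100.0, 160.0],  # Energy PPI, 2021 = 100 (made up).
    }
)
prices["price_real"] = deflate(prices["price"], prices["ppi"])
print(prices)
```

In this toy example, the nominal price rises more than sevenfold between 2020 and 2022, while the deflated price rises by a factor of about 3.7, because part of the nominal increase reflects the general rise in energy producer prices.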
@@ -0,0 +1,136 @@
"""Load a meadow dataset and create a garden dataset."""

import pandas as pd
from owid.catalog import Table
from structlog import get_logger

from etl.data_helpers import geo
from etl.helpers import PathFinder, create_dataset

# Initialize logger.
log = get_logger()

# Get paths and naming conventions for current step.
paths = PathFinder(__file__)

# Select and rename columns.
COLUMNS = {
    "country": "country",
    "date": "date",
    "price__eur_mwhe": "price",
}

# Minimum number of countries that must have PPI data in a given month.
# Months with fewer countries than this are removed from the PPI data.
# We do this to avoid very sparse data (especially in the most recent month).
MIN_NUM_COUNTRIES_INFORMED_PER_MONTH = 10

# Allow PPI data to lag a certain number of months behind Ember data, and warn if the lag is larger than that.
# NOTE: We will also assert that the minimum date in Ember is fully covered by PPI data.
PPI_ALLOWED_MONTHS_OF_LAG = 2

# Base year for the PPI data.
PPI_EUROS_YEAR = 2021


def adjust_prices_for_inflation(tb_monthly: Table, tb_ppi: Table) -> Table:
    import re

    # Select Producer Prices Index for classification "[MIG_NRG] MIG - energy".
    tb_ppi = tb_ppi[tb_ppi["classification"] == "MIG_NRG"].drop(columns=["classification"]).reset_index(drop=True)

    # Adapt dates in PPI dataset to match the monthly electricity prices.
    tb_ppi["date"] = tb_ppi["date"].str.strip() + "-01"
    assert tb_ppi["date"].str.len().eq(10).all(), "Unexpected date format in PPI dataset."

    # Remove months for which we don't have enough countries.
    # This happens at least for the most recent month, for which only a few countries have data.
    tb_ppi = tb_ppi[
        tb_ppi.groupby(["date"])["country"].transform("count") >= MIN_NUM_COUNTRIES_INFORMED_PER_MONTH
    ].reset_index(drop=True)

    # Combine energy prices table with PPI table.
    tb_monthly = tb_monthly.merge(tb_ppi, on=["country", "date"], how="left")

    # Sanity checks.
    # Check that the maximum date of PPI is only a certain number of months behind Ember data.
    ember_first_month = tb_monthly[tb_monthly["price"].notnull()]["date"].min()
    ember_latest_month = tb_monthly[tb_monthly["price"].notnull()]["date"].max()
    ppi_first_month = tb_monthly[tb_monthly["ppi"].notnull()]["date"].min()
    ppi_latest_month = tb_monthly[tb_monthly["ppi"].notnull()]["date"].max()
    if pd.to_datetime(ppi_latest_month) < pd.to_datetime(ember_latest_month) - pd.DateOffset(
        months=PPI_ALLOWED_MONTHS_OF_LAG
    ):
        log.warning(f"PPI data is lagging more than {PPI_ALLOWED_MONTHS_OF_LAG} months behind Ember's data.")
    # Check that the minimum date of PPI fully covers the data in the energy prices table.
    error = "PPI data does not cover the minimum date of energy prices."
    assert ember_first_month >= ppi_first_month, error
    # Check that the base year stated in the PPI metadata is the expected one.
    error = "Base year is not as expected."
    match = re.search(r"\b(20\d{2}|19\d{2})\b", tb_ppi["ppi"].metadata.description_short)
    assert match is not None and match.group(0) == str(PPI_EUROS_YEAR), error

    # Adjust monthly prices for inflation.
    # NOTE: Prices for country-months without PPI data become missing (e.g. UK data).
    tb_monthly["price"] = tb_monthly["price"] * 100 / tb_monthly["ppi"]

    return tb_monthly


def prepare_annual_data(tb_monthly: Table) -> Table:
    # Ember provides monthly data, so we can create a monthly table of wholesale electricity prices.
    # But we also need to create an annual table of average wholesale electricity prices.
    tb_annual = tb_monthly.copy()
    tb_annual["year"] = tb_annual["date"].str[:4].astype("Int64")
    # NOTE: We will include only complete years. This means that the latest year will not be included.
    # But also, we will disregard country-years like Ireland 2022, which only has data for a few months, for some reason.
    n_months = tb_annual.groupby(["country", "year"], observed=True, as_index=False)["date"].transform("count")
    tb_annual = (
        tb_annual[n_months == 12].groupby(["country", "year"], observed=True, as_index=False).agg({"price": "mean"})
    )

    return tb_annual


def run(dest_dir: str) -> None:
    #
    # Load inputs.
    #
    # Load meadow dataset, and read its main table.
    ds_meadow = paths.load_dataset("european_wholesale_electricity_prices")
    tb_monthly = ds_meadow.read("european_wholesale_electricity_prices")

    # Load Eurostat Producer Prices in Industry dataset, and read its main table.
    ds_ppi = paths.load_dataset("producer_prices_in_industry")
    tb_ppi = ds_ppi.read("producer_prices_in_industry")

    #
    # Process data.
    #
    # Select and rename columns.
    tb_monthly = tb_monthly[list(COLUMNS)].rename(columns=COLUMNS, errors="raise")

    # Harmonize country names.
    tb_monthly = geo.harmonize_countries(df=tb_monthly, countries_file=paths.country_mapping_path)

    # Adjust prices for inflation.
    tb_monthly = adjust_prices_for_inflation(tb_monthly=tb_monthly, tb_ppi=tb_ppi)

    # Prepare annual data.
    tb_annual = prepare_annual_data(tb_monthly=tb_monthly)

    # Improve table formats.
    tb_monthly = tb_monthly.format(["country", "date"], short_name="european_wholesale_electricity_prices_monthly")
    tb_annual = tb_annual.format(short_name="european_wholesale_electricity_prices_annual")

    #
    # Save outputs.
    #
    # Create a new garden dataset.
    ds_garden = create_dataset(
        dest_dir,
        tables=[tb_monthly, tb_annual],
        check_variables_metadata=True,
        yaml_params={"EUROS_YEAR": PPI_EUROS_YEAR},
    )

    # Save changes in the new garden dataset.
    ds_garden.save()
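
Two of the filters in the step above can be tried in isolation: dropping months covered by too few countries, and averaging only country-years that have all 12 months. The sketch below uses plain pandas DataFrames instead of `owid.catalog.Table`, with made-up data; `drop_sparse_months` and `annual_mean_of_complete_years` are hypothetical names, and keeping months with exactly the minimum number of countries is an assumption about the intended behavior.

```python
import pandas as pd


def drop_sparse_months(df: pd.DataFrame, min_countries: int) -> pd.DataFrame:
    """Keep only the months that have at least `min_countries` rows."""
    n_countries = df.groupby("date")["country"].transform("count")
    return df[n_countries >= min_countries].reset_index(drop=True)


def annual_mean_of_complete_years(df: pd.DataFrame) -> pd.DataFrame:
    """Average monthly prices per country and year, keeping only years with 12 months."""
    df = df.copy()
    df["year"] = df["date"].str[:4].astype(int)
    n_months = df.groupby(["country", "year"])["date"].transform("count")
    return df[n_months == 12].groupby(["country", "year"], as_index=False).agg({"price": "mean"})


# Index values for three countries in January, but only one country in February.
ppi = pd.DataFrame(
    {
        "country": ["A", "B", "C", "A"],
        "date": ["2024-01-01", "2024-01-01", "2024-01-01", "2024-02-01"],
        "ppi": [101.0, 99.0, 100.0, 102.0],
    }
)
ppi_filtered = drop_sparse_months(ppi, min_countries=2)

# A complete year (2023) followed by an incomplete one (2024).
dates = [f"2023-{m:02d}-01" for m in range(1, 13)] + [f"2024-{m:02d}-01" for m in range(1, 4)]
monthly = pd.DataFrame({"country": "A", "date": dates, "price": [float(m) for m in range(1, 13)] + [50.0] * 3})
annual = annual_mean_of_complete_years(monthly)
print(ppi_filtered)
print(annual)
```

Counting rows per group with `transform("count")` returns a series aligned with the original rows, which is why it can be used directly as a boolean mask.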
11 changes: 11 additions & 0 deletions etl/steps/data/garden/energy/2025-02-03/energy_prices.meta.yml
@@ -0,0 +1,11 @@
definitions:
  common:
    processing_level: major
    presentation:
      topic_tags:
        - Energy

dataset:
  title: European Energy Prices
  update_period_days: 365
