Aligning PModel and Subdaily PModel #401

Draft
wants to merge 7 commits into
base: develop
from

Conversation

davidorme
Collaborator

@davidorme davidorme commented Jan 27, 2025

Description

This is a draft of a new structure for the PModel and SubdailyPModel:

  • It centralises a lot of the shared attributes and docstrings in the PModelABC, which includes a single abstract method _fit_model.
  • Each of the PModel and SubdailyPModel subclasses then:
    • Defines the specific _fit_model for the class.
    • Defines an __init__ method that:
      • calls super().__init__(...) to initialise the superclass attributes
      • adds any specific attributes for the subclass, and
      • lastly, calls self._fit_model()
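
As a rough illustration of that layout, here is a minimal sketch. The class names ModelBase and StandardModel, the scale argument and the arithmetic in _fit_model are placeholders invented for this example; they are not the pyrealm classes or the P Model equations.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class ModelBase(ABC):
    """Hypothetical stand-in for PModelABC: shared attributes live here."""

    def __init__(self, env: dict[str, float], fapar: float, ppfd: float) -> None:
        # Shared inputs, stored once for every subclass.
        self.env = env
        self.fapar = fapar
        self.ppfd = ppfd
        # Shared output, declared here and populated by _fit_model.
        self.gpp: float | None = None

    @abstractmethod
    def _fit_model(self) -> None:
        """Populate the shared output attributes."""


class StandardModel(ModelBase):
    """Hypothetical stand-in for one of the concrete subclasses."""

    def __init__(
        self, env: dict[str, float], fapar: float, ppfd: float, scale: float = 1.0
    ) -> None:
        super().__init__(env=env, fapar=fapar, ppfd=ppfd)  # 1. shared attributes
        self.scale = scale  # 2. subclass-specific attributes
        self._fit_model()  # 3. fit last, once everything is set

    def _fit_model(self) -> None:
        # Placeholder arithmetic only, not the P Model equations.
        self.gpp = self.scale * self.fapar * self.ppfd


model = StandardModel(env={"tc": 20.0}, fapar=1.0, ppfd=300.0)
print(model.gpp)  # 300.0
```

Because _fit_model is abstract, the base class cannot be instantiated directly, so every concrete model has to supply its own fitting step.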

The result aligns the internal attribute names and shares a lot of docstrings and boilerplate. It also drops the separate PModel.estimate_productivity step and moves fapar and ppfd into the signature of both models.

Warning

This is very much a prototype. However, running the code below works and duplicates the results of the notebook that this was copied from.

Prototype code
from importlib import resources

import matplotlib.pyplot as plt
import numpy as np
import pandas

from pyrealm.pmodel import (
    PModelEnvironment,
    SubdailyScaler,
)
from pyrealm.pmodel.new_pmodel import PModelNew, SubdailyPModelNew

env = PModelEnvironment(
    tc=np.array([20]),
    vpd=np.array([1000]),
    co2=np.array([400]),
    patm=np.array([101325.0]),
)

p = PModelNew(env=env, fapar=np.array([1]), ppfd=np.array([300]))
print(p)
p.summarize()

data_path = resources.files("pyrealm_build_data.subdaily") / "subdaily_BE_Vie_2014.csv"

data = pandas.read_csv(str(data_path))

# Extract the key half hourly timestep variables as numpy arrays
temp_subdaily = data["ta"].to_numpy()
vpd_subdaily = data["vpd"].to_numpy()
co2_subdaily = data["co2"].to_numpy()
patm_subdaily = data["patm"].to_numpy()
ppfd_subdaily = data["ppfd"].to_numpy()
fapar_subdaily = data["fapar"].to_numpy()
datetime_subdaily = pandas.to_datetime(data["time"]).to_numpy()


subdaily_env = PModelEnvironment(
    tc=temp_subdaily,
    vpd=vpd_subdaily,
    co2=co2_subdaily,
    patm=patm_subdaily,
)

# Fit the standard P Model
pmodel_standard = PModelNew(
    subdaily_env,
    method_kphio="fixed",
    reference_kphio=1 / 8,
    ppfd=ppfd_subdaily,
    fapar=fapar_subdaily,
)
pmodel_standard._fit_model()
pmodel_standard.summarize()


# Create the fast slow scaler

fsscaler = SubdailyScaler(datetime_subdaily)

# Set the acclimation window as the values within a one hour window centred on noon
fsscaler.set_window(
    window_center=np.timedelta64(12, "h"),
    half_width=np.timedelta64(30, "m"),
)

# Fit the P Model with fast and slow responses
pmodel_subdaily = SubdailyPModelNew(
    env=subdaily_env,
    fs_scaler=fsscaler,
    allow_holdover=True,
    ppfd=ppfd_subdaily,
    fapar=fapar_subdaily,
    reference_kphio=1 / 8,
)

pmodel_subdaily._fit_model()
pmodel_subdaily.summarize()


idx = np.arange(48 * 120, 48 * 130)
plt.figure(figsize=(10, 4))
plt.plot(datetime_subdaily[idx], pmodel_standard.gpp[idx], label="Instantaneous model")
plt.plot(datetime_subdaily[idx], pmodel_subdaily.gpp[idx], "r-", label="Slow responses")
plt.ylabel("GPP")
plt.legend(frameon=False)
plt.show()

Things still to do:

  • Not all attributes defined on each model are populated. For the moment, we can move these into the model specific __init__ and move them back into the base class once we know how to populate on both models.
  • Replace the old implementation and check the tests still work!
  • It would be nice to use @cached_property (from functools) on some of these attributes to keep overheads down, but that seems like a separate PR, and it shouldn't break the API. For example, PModel.rd might always be calculated or only calculated on request.
  • The signatures of the two classes will likely change: fapar and ppfd will move into PModelEnvironment and an AcclimationModel class will bundle a lot of the args to SubdailyPModel (see discussion PModel API for 2.0.0 #394), but again this can be two separate PRs.
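
For the "calculated on request" option mentioned in the caching item above, a minimal sketch using the standard library's functools.cached_property might look like the following. The LazyModel class, its factor argument and the n_rd_calls counter are hypothetical and exist only to show that the calculation runs once.

```python
from functools import cached_property


class LazyModel:
    """Hypothetical model with one attribute that is computed on first access."""

    def __init__(self, vcmax: float, factor: float) -> None:
        self.vcmax = vcmax
        self.factor = factor  # placeholder coefficient, not a pyrealm constant
        self.n_rd_calls = 0  # counts how often the calculation actually runs

    @cached_property
    def rd(self) -> float:
        # Runs on first access only; the result is then stored on the instance.
        self.n_rd_calls += 1
        return self.factor * self.vcmax


m = LazyModel(vcmax=8.0, factor=0.25)
print(m.n_rd_calls)  # 0: nothing computed at construction
print(m.rd, m.rd)  # 2.0 2.0
print(m.n_rd_calls)  # 1: the second access reused the cached value
```

Attribute access stays the same for callers (m.rd either way), which is why a change like this should not alter the public API.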

Fixes #386
Fixes #385

Type of change

  • New feature (non-breaking change which adds functionality)
  • Optimization (back-end change that speeds up the code)
  • Bug fix (non-breaking change which fixes an issue)

Key checklist

  • Make sure you've run the pre-commit checks: $ pre-commit run -a
  • All tests pass: $ poetry run pytest

Further checks

  • Code is commented, particularly in hard-to-understand areas
  • Tests added that prove fix is effective or that feature works

@j-emberton
Collaborator

Hi @davidorme.

Just taken a look at this. Overall I think it looks really good and is a big improvement on what we had before.

Areas for improvement:

  • I feel the constructor of the PModelABC base class is a bit cluttered and contains a lot of similar repeated operations when evaluating and retrieving different methods and models. There's potential to rationalise these operations into class methods and make them more testable. But this isn't a showstopper.
  • Do we want to add explicit unit checking? This could be as simple as checking feasible physical values for inputs such as ppfd or something more complex such as using Pint to enforce unit consistency when writing a longer analysis script that uses multiple Pyrealm classes.
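
The simpler of the two options, checking for physically feasible values, could be sketched roughly as below. The check_bounds helper is hypothetical and the PPFD upper limit of 3000 is an assumed placeholder, not a value taken from pyrealm; only the 0 to 1 range for fapar follows from it being a fraction.

```python
import math
from typing import Iterable


def check_bounds(name: str, values: Iterable[float], lower: float, upper: float) -> None:
    """Raise ValueError if any non-missing value lies outside [lower, upper]."""
    # NaN is treated as missing data here and skipped, which is an assumption.
    bad = [v for v in values if not math.isnan(v) and not (lower <= v <= upper)]
    if bad:
        raise ValueError(
            f"{name}: {len(bad)} value(s) outside [{lower}, {upper}], e.g. {bad[0]}"
        )


check_bounds("fapar", [0.2, 0.9, float("nan")], 0.0, 1.0)  # passes silently

try:
    check_bounds("ppfd", [300.0, -5.0], 0.0, 3000.0)  # assumed placeholder limits
except ValueError as err:
    print(err)
```

A range check like this can catch unit mix-ups that shift values by orders of magnitude, but it cannot replace real unit tracking of the kind Pint provides.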

@davidorme
Collaborator Author

> Areas for improvement:
>
>   • I feel the constructor of the PModelABC base class is a bit cluttered and contains a lot of similar repeated operations when evaluating and retrieving different methods and models. There's potential to rationalise these operations into class methods and make them more testable. But this isn't a showstopper.

Yup - I agree that could be packaged more cleanly. I'll have a look at this but might park some of this for later polishing. I think the big picture shape of the new API and keeping the tests working is key on this PR.
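
One possible shape for that packaging, sketched under assumptions: the repeated "look up a named method and fail consistently" step moves into a single helper that can be tested on its own. ModelSketch, KPHIO_METHODS, _resolve and the "halved" entry are all invented for this example; only the "fixed" method name and the 1 / 8 reference value appear in the prototype code.

```python
from __future__ import annotations

from typing import Callable

# Hypothetical registry: the names and callables are placeholders, not pyrealm's.
KPHIO_METHODS: dict[str, Callable[[float], float]] = {
    "fixed": lambda reference: reference,
    "halved": lambda reference: reference / 2,
}


class ModelSketch:
    def __init__(self, method_kphio: str = "fixed", reference_kphio: float = 1 / 8) -> None:
        # Each lookup in the constructor becomes a single call to the helper.
        kphio_func = self._resolve("kphio", method_kphio, KPHIO_METHODS)
        self.kphio = kphio_func(reference_kphio)

    @staticmethod
    def _resolve(kind: str, name: str, registry: dict[str, Callable]) -> Callable:
        """Look up a named method, failing with one consistent error message."""
        try:
            return registry[name]
        except KeyError:
            raise ValueError(
                f"Unknown {kind} method {name!r}; expected one of {sorted(registry)}"
            ) from None


print(ModelSketch().kphio)  # 0.125
```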

>   • Do we want to add explicit unit checking? This could be as simple as checking feasible physical values for inputs such as ppfd or something more complex such as using Pint to enforce unit consistency when writing a longer analysis script that uses multiple Pyrealm classes.

On the bounds checking - yes, but not here. Once this is done, PPFD and FAPAR will move into PModelEnvironment, and that adds bounds checks already (no issue yet, but see #394)

@j-emberton
Collaborator

Hi @davidorme , is this ready to go now?

@davidorme
Collaborator Author

@j-emberton Nope - this is going to sprawl a lot more, but there are a couple of incoming PRs that will make this switch cleaner.

@j-emberton
Collaborator

> @j-emberton Nope - this is going to sprawl a lot more, but there are a couple of incoming PRs that will make this switch cleaner.

ah cool. I thought I saw a refreshed request for review come in so thought I'd double check.

Labels
None yet
Projects
Status: Review
2 participants