Should forecast_fitted_values also work for fitted models in addition to when forecast(fitted=True) is called? #835

Open
quant5 opened this issue Apr 30, 2024 · 1 comment

quant5 commented Apr 30, 2024

Description

Please correct me if my understanding is off; I'm relatively new to the library!

  • The point of sf.forecast is to minimize memory overhead and remain friendly to parallel/distributed execution.
  • sf.fit + sf.predict, by contrast, lets us examine the fitted models closely.
  • If the user wants to examine the in-sample fit, there's a convenience method, sf.forecast_fitted_values().
  • However, this only works if sf.forecast(..., fitted=True) was called. It doesn't work on models fit using sf.fit.
  • So, if the user would like to examine the in-sample fit of models created with sf.fit, there are currently two choices, both suboptimal:
    1. Fit the models again (related to [Models] Return and store models parameters during forecast and CV #639).
    2. Iterate across all models and call predict_in_sample, e.g., sf.fitted_[0, 0].predict_in_sample(); this requires a deeper understanding of the architecture, plus an additional step of converting the result to a dataframe (see the sketch after this list).
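
For concreteness, here is a minimal sketch of workaround 2. It assumes sf is a StatsForecast instance on which .fit(df) has already been called, that sf.fitted_ is a 2D array of fitted model objects indexed [series, model], that sf.uids holds the series identifiers, and that each model's predict_in_sample() returns a dict whose "fitted" entry holds the in-sample predictions:

```python
import pandas as pd

# Illustrative only; timestamps ("ds") are omitted for brevity.
frames = []
for i, uid in enumerate(sf.uids):           # one row of fitted_ per series
    data = {"unique_id": uid}
    for model in sf.fitted_[i]:             # one fitted object per model
        # repr(model) gives the model's alias, e.g. "AutoARIMA"
        data[repr(model)] = model.predict_in_sample()["fitted"]
    frames.append(pd.DataFrame(data))
in_sample = pd.concat(frames, ignore_index=True)
```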

My proposal would involve one or both of the following:

  • Surface sf.forecast_fitted_values() on any StatsForecast object on which .fit() has been called, in addition to ones on which sf.forecast(..., fitted=True) was called. Unless there's something in the code I missed, the implementation would simply be workaround 2 above.
  • Add a parameter to the .fit() method that does the same thing as fitted=True, i.e., stores the in-sample predictions in a "fcst_fitted_values_" attribute.
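
Under the second option, the user-facing flow might look something like this (a hedged sketch: the fitted argument to StatsForecast.fit does not exist yet, and its name and semantics are exactly what this proposal is about):

```python
from statsforecast import StatsForecast
from statsforecast.models import Naive

# `df` is assumed to be a long-format frame with unique_id/ds/y columns.
sf = StatsForecast(models=[Naive()], freq="D")
sf.fit(df, fitted=True)                  # proposed flag: store in-sample predictions
insample = sf.forecast_fitted_values()   # would then also work after .fit()
```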

I am happy to work on this if there's interest. Let me know your thoughts.

Use case

The primary reason one would use .fit() is to examine the models more closely, including their in-sample fit. I think the use case in #639 (comment) encapsulates the utility of this function well.

@jmoralez (Member) commented

Hey @quant5, thanks for the proposal. I've been meaning to do this. I think the first step would be to add a fitted argument to the models' fit method, because we currently set it internally to handle the case where the user calls predict_in_sample afterwards, except for models where that's too expensive, so we end up with a mix:

```python
self.model_ = _croston_classic(y=y, h=1, fitted=True)
self.model_ = _croston_optimized(y=y, h=1, fitted=False)
self.model_ = _croston_sba(y=y, h=1, fitted=True)
self.model_ = _imapa(y=y, h=1, fitted=False)
```

Once the models have that argument, we could pass it through from StatsForecast.fit, and forecast_fitted_values would then retrieve the stored values.
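
A rough sketch of what that plumbing could look like (illustrative only: _croston_classic is the internal helper from the snippets above, and the fitted parameters are the proposed additions, not an existing API):

```python
# Model side: expose `fitted` instead of hardcoding it per model.
class CrostonClassic:
    def fit(self, y, X=None, fitted=False):
        # Compute and store in-sample predictions only when requested.
        self.model_ = _croston_classic(y=y, h=1, fitted=fitted)
        return self

# StatsForecast side: propagate the flag to every model's fit().
class StatsForecast:
    def fit(self, df, fitted=False):
        ...
        for model in self.models:
            model.fit(y, fitted=fitted)
```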
