Cross validation with other forecasts as regressors #442
I was trying to understand this one a bit more. Would an example of another time series as a regressor be the sequence of dates of all Super Bowls going back several years? Then, assuming we don't already have dates for future Super Bowls, should we forecast those dates as well and incorporate them into the holidays dataframe?
One might benefit from including a time series as an extra regressor if 1) that time series is likely correlated with the one of interest, and 2) you expect its forecast to be more accurate than that of the series of interest. That way you can expect including it to increase the accuracy of your main forecast. A more natural example for the documentation would be some other Wikipedia page that we expect to be correlated with Manning's, and which also has less uncertainty (e.g. more traffic). Another natural example: if we wanted to forecast the number of weekly Prophet issues, we might include as an extra regressor the forecast of the number of weekly downloads, something that is likely correlated and has less variance.
It need not always be the case that external regressors must themselves be forecast. We need to differentiate two types of regressors. First, those whose future values can always be determined, so the data in the dataframe can be reused while cross-validating, as Prophet does today. Second, those whose future value could be taken as the last known value (we use such regressors for what-if analysis), or could come from an external method given the past values, or be forecast from past values. I think this issue needs to be prioritized, because it leaks future values for some models and people might be overestimating the accuracy of their predictions.
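The two regressor types can be sketched with a small helper. This is an illustrative sketch only; `future_regressor_values`, its arguments, and the example numbers are hypothetical and not part of Prophet's API.

```python
# Illustrative sketch of the two regressor types described above.
# `future_regressor_values` is a hypothetical helper, not part of Prophet.

def future_regressor_values(history, horizon, plan=None):
    """Fill in future values for an extra regressor.

    Type 1: future values are known in advance (e.g. price, marketing
            budget); pass them explicitly via `plan`.
    Type 2: future values are unknown; carry the last known value
            forward, as is common for what-if analysis.
    """
    if plan is not None:
        if len(plan) != horizon:
            raise ValueError("plan must cover the full forecast horizon")
        return list(plan)
    return [history[-1]] * horizon

# Type 1: next quarter's ad budget is already decided.
print(future_regressor_values([10, 12, 11], 3, plan=[20, 20, 25]))  # [20, 20, 25]
# Type 2: future temperature is unknown; hold the last observation flat.
print(future_regressor_values([18.0, 19.5, 21.0], 2))  # [21.0, 21.0]
```

For type 2, the last-known-value fill is only the simplest choice; any external forecast of the regressor could be substituted for the carried-forward value.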
Bumping this as it'll be one of the next priorities for me :) Agree with @skannan-maf -- examples of the first case would be things like price or marketing spend budget, which we would know in advance for the forecast horizon; examples of the second case would be anything else we think is predictive of the target and easier to forecast than the target itself.

```python
# adds regressor where we assume future values are always known
m.add_regressor(
    'digital_ad_spend',
    mode='additive',
    standardize=True,
)
```

One way to integrate the second case into the current API would be to add an additional argument:
```python
# adds regressor where future values are uncertain

# option 1: fit the regressor model first
weather_model = Prophet().fit(weather_df)
m = Prophet()
m.add_regressor(
    'avg_temperature',
    mode='multiplicative',
    standardize=True,
    model=weather_model,  # proposed argument, not in the current API
)
m.fit(df)

# option 2: define all models upfront, then fit once
weather_model = Prophet()
m = Prophet()
m.add_regressor(
    'avg_temperature',
    mode='multiplicative',
    standardize=True,
    model=weather_model,
)
m.fit(df, regressor_dfs={'avg_temperature': weather_df})  # proposed argument
```

Next we'd need to tweak the […] Finally, we could also incorporate uncertainty from the regressor model into the main model. In the […]
If we have other time series as regressors that are themselves being forecast with Prophet, then cross-validation should also forecast those regressors. Right now it uses the true future values, which would typically underestimate forecast error.
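The leakage described here can be shown with a toy rolling-origin cross-validation that re-forecasts the regressor at each cutoff. All names and models below are illustrative (a target that is exactly twice the regressor, a naive last-value regressor forecast); this is a sketch of the idea, not Prophet's `cross_validation` implementation.

```python
# Toy rolling-origin CV illustrating why leaking true future regressor
# values understates forecast error. All names here are illustrative.

def rolling_origin_mae(y, x, horizon, forecast_y, forecast_x):
    """Cross-validate, re-forecasting the extra regressor at each cutoff."""
    errors = []
    for cutoff in range(1, len(y) - horizon + 1):
        x_future = forecast_x(x[:cutoff], horizon)   # only past x is used
        y_hat = forecast_y(y[:cutoff], x_future)
        actual = y[cutoff:cutoff + horizon]
        errors += [abs(a - p) for a, p in zip(actual, y_hat)]
    return sum(errors) / len(errors)

# Toy model: y is exactly twice the regressor x.
forecast_y = lambda y_hist, x_future: [2 * v for v in x_future]
naive_x = lambda x_hist, h: [x_hist[-1]] * h   # honest: forecast x from its past

x = [1, 2, 3, 4, 5]
y = [2 * v for v in x]

honest = rolling_origin_mae(y, x, 1, forecast_y, naive_x)
# Leaky variant: peek at the true future x, as CV effectively does today.
leaky = rolling_origin_mae(y, x, 1, forecast_y,
                           lambda hist, h: x[len(hist):len(hist) + h])
print(honest, leaky)  # honest MAE is 2.0; leaking true x gives 0.0
```

The leaky variant reports zero error because the regressor carries the answer; the honest variant surfaces the error that the regressor's own forecast would contribute in production.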