- Fixes issue of incorrect order of forecasts #142
- Modeltime now has a Spark Backend
- NEW Vignette - Modeltime Spark Backend describing how to set up Modeltime with the Spark Backend.
If users install smooth, the following models become available:
- adam_reg(): Interfaces with the ADAM forecasting algorithm in smooth.
- exp_smoothing(): A new engine "smooth_es" connects to the Exponential Smoothing algorithm in smooth::es(). This algorithm has several advantages, most importantly that it can use x-regs (unlike the "ets" engine).
- New extractor: extract_nested_modeltime_table() - Extracts a nested modeltime table by row id.
- extract_nested_train_split and extract_nested_test_split: Changed parameter from .data to .object for consistency with other "extract" functions.
- Added a new logged feature to modeltime_nested_fit() to track the attribute "metric_set", which is needed for ensembles. Old nested modeltime objects will need to be re-run to get this new attribute.
Nested (Iterative) Forecasting is aimed at making it easier to perform forecasting that is traditionally done in a for-loop with models like ARIMA, Prophet, and Exponential Smoothing. Functionality has been added to:
- Data Preparation Utilities: extend_timeseries(), nest_timeseries(), and split_nested_timeseries().
- modeltime_nested_fit(): Fits many models to nested time series data and organizes in a "Nested Modeltime Table". Logs Accuracy, Errors, and Test Forecasts.
- control_nested_fit(): Used to control the fitting process including verbosity and parallel processing.
- Logging Extractors: Functions that retrieve logged information from the initial fitting process. extract_nested_test_accuracy(), extract_nested_error_report(), and extract_nested_test_forecast().
- modeltime_nested_select_best(): Selects the best model for each time series ID.
- Logging Extractors: Functions that retrieve logged information from the model selection process. extract_nested_best_model_report()
- modeltime_nested_refit(): Refits to the .future_data. Logs Future Forecasts.
- control_nested_refit(): Used to control the re-fitting process including verbosity and parallel processing.
- Logging Extractors: Functions that retrieve logged information from the re-fitting process. extract_nested_future_forecast().
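For context, the for-loop pattern that nested forecasting is meant to replace can be sketched in base R. This is an illustration only and not modeltime's implementation: the `sales` data, the `lm()` trend model, and the 18/6 train/test split are all assumptions made for the example.

```r
# Hypothetical data: two series identified by `id`, 24 periods each
set.seed(1)
sales <- data.frame(
  id    = rep(c("store_1", "store_2"), each = 24),
  t     = rep(1:24, times = 2),
  value = c(100 + 2 * (1:24) + rnorm(24), 50 + 0.5 * (1:24) + rnorm(24))
)

# Fit one model per series in a loop and log a test-set error for each
results <- list()
for (series_id in unique(sales$id)) {
  series <- sales[sales$id == series_id, ]
  train  <- series[series$t <= 18, ]   # first 18 periods for fitting
  test   <- series[series$t > 18, ]    # last 6 periods held out
  fit    <- lm(value ~ t, data = train)
  pred   <- predict(fit, newdata = test)
  results[[series_id]] <- data.frame(
    id  = series_id,
    mae = mean(abs(test$value - pred))
  )
}

accuracy_by_id <- do.call(rbind, results)
print(accuracy_by_id)
```

The nested functions listed above take over the bookkeeping that this loop does by hand: splitting per series, fitting, and logging accuracy and forecasts.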
- Forecasting with Global Models: Added more complete steps in the forecasting process so users can now see how to forecast each step from start to finish, including future forecasting.
extended_forecast_accuracy_metric_set(): Adds the new MAAPE metric for handling intermittent data when MAPE returns Inf.
maape(): New yardstick metric that calculates "Mean Arctangent Absolute Percentage Error" (MAAPE). Used when MAPE returns Inf, typically due to intermittent data.
modeltime_fit_workflowset(): Improved handling of Workflowset Descriptions, which now match the wflow_id.
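To see why MAAPE helps with intermittent data, here is a minimal base R sketch of the two calculations. It does not call maape() itself; the `actual` and `prediction` vectors are made up for the example.

```r
# Hypothetical intermittent series: the first actual value is zero
actual     <- c(0, 10, 20)
prediction <- c(5, 8, 20)

# Absolute percentage error per observation; division by zero gives Inf
ape <- abs((actual - prediction) / actual)

mape_value  <- mean(ape) * 100   # Inf, because one term is Inf
maape_value <- mean(atan(ape))   # finite, because atan(Inf) is pi / 2

print(mape_value)
print(maape_value)
```

The arctangent bounds each term at pi / 2, so a single zero in the actuals no longer makes the metric infinite. If an actual value and its prediction are both zero, the ratio is 0 / 0 (NaN) and needs separate handling.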
We've expanded Panel Data functionality to produce model accuracy and confidence interval estimates by a Time Series ID (#114). This is useful when you have a Global Model that produces forecasts for more than one time series. You can more easily obtain grouped accuracy and confidence interval estimates.
- modeltime_calibrate(): Gains an id argument that is a quoted column name. This identifies that the residuals should be tracked by a time series identifier feature that indicates the time series groups.
- modeltime_accuracy(): Gains an acc_by_id argument that is TRUE/FALSE. If the data has been calibrated with id, then the user can return local model accuracy by the identifier column. The accuracy data frame will return a row for each combination of Model ID and Time Series ID.
- modeltime_forecast(): Gains a conf_by_id argument that is TRUE/FALSE. If the data has been calibrated with id, then the user can return local model confidence by the identifier column. The forecast data frame will return an extra column indicating the identifier column. The confidence intervals will be adjusted based on the local time series ID variance instead of the global model variance.
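The difference between global and local (per-ID) variance can be illustrated with a small base R sketch. This is not modeltime's code: the `calib` data frame is invented, and `spread_about_zero()` is a hypothetical helper that measures residual spread around zero.

```r
# Hypothetical calibration data for one global model and two series
calib <- data.frame(
  id          = rep(c("A", "B"), each = 4),
  .actual     = c(10, 12, 11, 13, 100, 140, 90, 130),
  .prediction = c(11, 11, 12, 12, 110, 120, 110, 120)
)
calib$.resid <- calib$.actual - calib$.prediction

# Hypothetical helper: root-mean-square of the residuals
spread_about_zero <- function(r) sqrt(mean(r^2))

global_sd <- spread_about_zero(calib$.resid)                   # one value for all series
local_sd  <- tapply(calib$.resid, calib$id, spread_about_zero) # one value per id

# Half-width of an approximate 95% interval under a normal assumption
z <- qnorm(0.975)
print(z * global_sd)
print(z * local_sd)
```

Series A is far less noisy than series B here, so a single global interval would be too wide for A and too narrow for B. Grouping by the identifier gives each series its own width.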
temporal_hierarchy(): Implements the thief package by Rob Hyndman and Nikolaos Kourentzes for "Temporal HIErarchical Forecasting". #117
- Issue #111: Fix bug with modeltime_fit_workflowset() where the workflowset (wflw_id) order was not maintained.
Parallel Processing
- New Vignette: Parallel Processing
- parallel_start() and parallel_stop(): Helpers for setting up multicore processing.
- create_model_grid(): Helper to generate model specifications with filled-in parameters from a parameter grid (e.g. dials::grid_regular()).
- control_refit() and control_fit_workflowset(): Better printing.
Bug Fixes
- Issue #110: Fix bug with cores > cores_available.
modeltime_fit_workflowset() (#85) makes it easy to convert workflow_set objects to Modeltime Tables (mdl_time_tbl). Requires a refitting process that can now be performed in parallel or in sequence.
- CROSTON (#5, #98) - This is a new engine that has been added to exp_smoothing().
- THETA (#5, #93) - This is a new engine that has been added to exp_smoothing().
exp_smoothing() gained 3 new tunable parameters:
- smooth_level(): This is often called the "alpha" parameter used as the base level smoothing factor for exponential smoothing models.
- smooth_trend(): This is often called the "beta" parameter used as the trend smoothing factor for exponential smoothing models.
- smooth_seasonal(): This is often called the "gamma" parameter used as the seasonal smoothing factor for exponential smoothing models.
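As a reminder of what the level smoothing factor does, here is a minimal base R sketch of simple exponential smoothing. `smooth_level_sketch()` is a hypothetical helper written for this example and is not part of modeltime; initialising the level at the first observation is also an assumption.

```r
# Simple exponential smoothing of the level only (no trend, no seasonality)
smooth_level_sketch <- function(y, alpha) {
  level <- numeric(length(y))
  level[1] <- y[1]  # assumption: start the level at the first observation
  for (t in seq_along(y)[-1]) {
    # New level is a weighted blend of the latest value and the previous level
    level[t] <- alpha * y[t] + (1 - alpha) * level[t - 1]
  }
  level
}

y <- c(10, 20, 30)
print(smooth_level_sketch(y, alpha = 0.5))  # 10.0 15.0 22.5
print(smooth_level_sketch(y, alpha = 1))    # tracks the data exactly: 10 20 30
```

The trend ("beta") and seasonal ("gamma") factors play the same role for their own components. Base R's stats::HoltWinters() exposes the same three factors as its alpha, beta, and gamma arguments.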
modeltime_refit(): Supports parallel processing. See control_refit().
modeltime_fit_workflowset(): Supports parallel processing. See control_fit_workflowset().
boost_tree(mtry): Mapping switched from colsample_bytree to colsample_bynode. prophet_boost() and arima_boost() have been updated to reflect this change. tidymodels/parsnip#499
- Improve Model Description of Recursive Models (#96)
- We've added new parameters to Exponential Smoothing Models. exp_smoothing() models produced in prior versions may require refitting with modeltime_refit() to upgrade their internals with the new parameters.
- Add support for recursive() for ensembles. The new recursive ensemble functionality is in modeltime.ensemble >= 0.3.0.9000.
recursive() (#71) - Received a full upgrade to work with Panel Data.
- New Vignette: "Autoregressive Forecasting with Recursive"
- Deprecating modeltime::metric_tweak() for yardstick::metric_tweak(). The yardstick::metric_tweak() has a required .name argument in addition to .fn, which is needed for tuning.
Baseline algorithms (#5, #37) have been created for comparing high-performance methods with simple forecasting methods.
window_reg: Window-based methods such as mean, median, and even more complex seasonal models based on a forecasting window. The main tuning parameter is window_size.
naive_reg: NAIVE and Seasonal NAIVE (SNAIVE) Regression Models
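The ideas behind these baselines are simple enough to sketch in base R. The three helper functions below are hypothetical and written only for illustration; they are not the window_reg or naive_reg implementations.

```r
# NAIVE: repeat the last observed value
naive_forecast <- function(y, h) rep(tail(y, 1), h)

# SNAIVE: repeat the last full season
snaive_forecast <- function(y, h, seasonal_period) {
  rep_len(tail(y, seasonal_period), h)
}

# Window-based: apply a summary function to the last `window_size` values
window_forecast <- function(y, h, window_size, fn = mean) {
  rep(fn(tail(y, window_size)), h)
}

y <- 1:8
print(naive_forecast(y, h = 4))                            # 8 8 8 8
print(snaive_forecast(y, h = 4, seasonal_period = 4))      # 5 6 7 8
print(window_forecast(y, h = 4, window_size = 2))          # 7.5 7.5 7.5 7.5
print(window_forecast(y, h = 2, window_size = 3, median))  # 7 7
```

Any high-performance model should at least beat forecasts like these on the test set.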
metric_tweak() - Can modify yardstick metrics like mase(), which have seasonal parameters.
default_forecast_accuracy_metric_set() - Gets a ... parameter that allows us to add more metrics beyond the defaults.
A new function, modeltime_residuals_test(), has been added (#62, #68). The following tests are implemented:
- Shapiro Test - Test for Normality of residuals
- Box-Pierce, Ljung-Box, and Durbin-Watson Tests - Test for Autocorrelation of residuals
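For readers unfamiliar with these tests, the sketch below runs them directly on a residual vector in base R. It does not call modeltime_residuals_test(); the simulated residuals and the choice of lag = 1 are assumptions, and the Durbin-Watson statistic is computed by hand from its textbook formula.

```r
# Hypothetical residuals: simulated white noise
set.seed(123)
residuals_vec <- rnorm(100)

# Normality of residuals
shapiro <- shapiro.test(residuals_vec)

# Autocorrelation of residuals at lag 1
box_pierce <- Box.test(residuals_vec, lag = 1, type = "Box-Pierce")
ljung_box  <- Box.test(residuals_vec, lag = 1, type = "Ljung-Box")

# Durbin-Watson statistic: values near 2 suggest little lag-1 autocorrelation
durbin_watson <- sum(diff(residuals_vec)^2) / sum(residuals_vec^2)

print(c(
  shapiro_p     = shapiro$p.value,
  box_pierce_p  = box_pierce$p.value,
  ljung_box_p   = ljung_box$p.value,
  durbin_watson = durbin_watson
))
```

Small p-values would point to non-normal or autocorrelated residuals, which usually means the model is missing some structure in the data.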
plot_modeltime_forecast() - When plotting a single point forecast, plot_modeltime_forecast() now uses geom_point() instead of geom_line(). Fixes #66.
Fixes
recursive() & modeltime_refit(): Now able to refit a recursive workflow or recursive fitted parsnip object.
New Functions
recursive(): Turn a fitted model into a recursive predictor. (#49, #50)
update_modeltime_model(): New function to update a modeltime model inside a Modeltime Table.
Breaking Changes
- Removed arima_workflow_tuned dataset.
as_modeltime_table(): New function to convert one or more fitted models stored in a list to a Modeltime Table.
Bug Fixes
- Update m750_models: Fixes error "R parsnip Error: Internal error: Unknown composition type."
Panel Data
modeltime_forecast() upgrades:
- keep_data: Gains a new argument keep_data. This is useful when the new_data and actual_data have important information needed in analyzing the forecast.
- arrange_index: Gains a new argument arrange_index. By default, the forecast keeps the rows in the same order as the incoming data. Prior versions arranged Model Predictions by .index, which impacts the user's ability to match to Panel Data, which is not likely to be arranged by date. Prediction best practices are to keep the original order of the data, which will be preserved by default. To get the old behavior, simply toggle arrange_index = TRUE.
modeltime_calibrate(): Can now handle panel data.
modeltime_accuracy(): Can now handle panel data.
plot_modeltime_forecast(): Can handle panel data provided the data is grouped by an ID column prior to plotting.
Error Messaging
- Calibration: Improve error messaging during calibration. Provide warnings if models fail. Provide report with modeltime_calibrate(quiet = FALSE).
Compatibility
- Compatibility with parsnip >= 0.1.4. Uses set_encodings() new parameter allow_sparse_x.
Ensembles
modeltime_refit() - Changes to improve fault tolerance and error handling / messaging when making ensembles.
Ensembles
- Integrates modeltime.ensemble, a new R package designed for forecasting with ensemble models.
New Workflow Helper Functions
add_modeltime_model() - A helper function making it easy to add a fitted parsnip or workflow object to a modeltime table
pluck_modeltime_model() & pull_modeltime_model() - A helper function making it easy to extract a model from a modeltime table
Improvements
- Documentation - Algorithms now identify default parameter values in the "Engine Details" Section in their respective documentation. E.g. ?prophet_boost
- prophet_reg() can now have regressors controlled via set_engine() using the following parameters:
  - regressors.mode - Set to seasonality.mode by default.
  - regressors.prior.scale - Set to 10,000 by default.
  - regressors.standardize - Set to "auto" by default.
Data Sets
Modeltime now includes 4 new data sets:
- m750 - M750 Time Series Dataset
- m750_models - 3 Modeltime Models made on the M750 Dataset
- m750_splits - An rsplit object containing Train/test splits of the M750 data
- m750_training_resamples - A Time Series Cross Validation time_series_cv object made from the training(m750_splits)
Bug Fix
plot_modeltime_forecast(): Fixes issue with "ACTUAL" data being shown at the bottom of the legend list. It should be the first item.
Forecast without Calibration/Refitting
Sometimes it's important to make fast forecasts without calculating out-of-sample accuracy and refitting (which requires 2 rounds of model training). You can now bypass the modeltime_calibrate() and modeltime_refit() steps and jump straight into forecasting the future. Here's an example with h = "3 years". Note that you will not get confidence intervals with this approach because calibration data is needed for this.
    library(modeltime)
    library(tidyverse)  # provides %>%

    # Make forecasts without calibration/refitting (No Confidence Intervals)
    # - This assumes the models have been trained on m750
    modeltime_table(
        model_fit_prophet,
        model_fit_lm
    ) %>%
        modeltime_forecast(
            h = "3 years",
            actual_data = m750
        ) %>%
        plot_modeltime_forecast(.conf_interval_show = FALSE)
Residual Analysis & Diagnostics
Analyzing residuals is a common diagnostic step when forecasting, where residuals are .resid = .actual - .prediction. The residuals may have autocorrelation or nonzero mean, which can indicate model improvement opportunities. In addition, users may wish to inspect in-sample and out-of-sample residuals, which can display different results.
modeltime_residuals() - A new function used to extract out residual information
plot_modeltime_residuals() - Visualizes the output of modeltime_residuals(). Offers 3 plots:
- Time Plot - Residuals over time
- ACF Plot - Residual Autocorrelation vs Lags
- Seasonality - Residual Seasonality Plot
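The residual definition and the two warning signs mentioned above (nonzero mean and autocorrelation) can be checked by hand in base R. This sketch uses made-up `.actual` and `.prediction` values and does not call modeltime_residuals().

```r
# Hypothetical forecast results that alternate between over- and under-shooting
fc <- data.frame(
  .actual     = c(10, 12, 14, 16, 18, 20),
  .prediction = c(9, 13, 13, 17, 17, 21)
)

# Residual definition used in this section
fc$.resid <- fc$.actual - fc$.prediction

resid_mean <- mean(fc$.resid)                   # 0: no systematic bias
resid_acf1 <- acf(fc$.resid, plot = FALSE)$acf[2]  # lag-1 autocorrelation

print(resid_mean)
print(resid_acf1)
```

The mean is zero here, but the lag-1 autocorrelation is strongly negative, which means the errors follow a pattern the model has not captured.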
TBATS Model
Use seasonal_reg() and set engine to "tbats".

    seasonal_reg(
        seasonal_period_1 = "1 day",
        seasonal_period_2 = "1 week"
    ) %>%
        set_engine("tbats")
NNETAR Model
Use nnetar_reg() and set engine to "nnetar".

    model_fit_nnetar <- nnetar_reg() %>%
        set_engine("nnetar")
Prophet Model - Logistic Growth Support
prophet_reg() and prophet_boost():
- Now supports logistic growth. Set growth = 'logistic' and one or more of logistic_cap and logistic_floor to valid saturation boundaries.
- New arguments making it easier to modify the changepoint_num, changepoint_range, seasonality_yearly, seasonality_weekly, seasonality_daily, logistic_cap, logistic_floor
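To show what saturation boundaries mean for a trend, here is a generic logistic curve in base R. This is a simplified illustration and not Prophet's actual trend model, which also handles changepoints. The function `logistic_trend_sketch()` and its arguments are hypothetical.

```r
# Generic S-shaped trend bounded below by floor_value and above by cap_value
logistic_trend_sketch <- function(t, cap_value, floor_value = 0,
                                  rate = 1, midpoint = 0) {
  floor_value + (cap_value - floor_value) / (1 + exp(-rate * (t - midpoint)))
}

t     <- -10:10
trend <- logistic_trend_sketch(t, cap_value = 100, floor_value = 20, rate = 0.8)

print(round(range(trend), 2))  # stays strictly between the floor and the cap
print(trend[t == 0])           # halfway between floor and cap at the midpoint: 60
```

However far the curve is extended, it approaches the cap and the floor without crossing them, which is the behavior the saturation bounds are meant to enforce.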
combine_modeltime_tables() - A helper function making it easy to combine multiple modeltime tables.
update_model_description() - A helper function making it easier to update model descriptions.
- modeltime_refit(): When modeltime model parameters update (e.g. when Auto ARIMA changes to a new model), the Model Description now alerts the user (e.g. "UPDATE: ARIMA(0,1,1)(1,1,1)[12]").
- modeltime_calibrate(): When training data is supplied in a time window that the model has previously been trained on (e.g. training(splits)), the calibration calculation first inspects whether the "Fitted" data exists. If it exists, it returns the "Fitted" data. This helps prevent sequence-based models (e.g. ARIMA, ETS, TBATS) from displaying odd results because these algorithms can only predict sequences directly following the training window. If "Fitted" data is being used, the .type column will display "Fitted" instead of "Test".
- modeltime_forecast():
  - Implement actual_data reconciliation strategies when recipe removes rows. Strategy attempts to fill predictors using "downup" strategy to prevent NA values from removing rows.
  - More descriptive errors when external regressors are required.
- modeltime_accuracy(): Fix issue with new_data not recalibrating.
- prophet_reg() and prophet_boost() - Can now perform logistic growth (growth = 'logistic'). The user can supply "saturation" bounds using logistic_cap and/or logistic_floor.
seasonal_decomp() has changed to seasonal_reg() and now supports both TBATS and Seasonal Decomposition Models.
prophet_reg() & prophet_boost(): Argument changes:
- num_changepoints has become changepoint_num
modeltime_forecast(): Now estimates confidence intervals using centered standard deviation. The mean is assumed to be zero and residuals deviate from mean = 0.
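One reading of "centered standard deviation" is a standard deviation taken about zero rather than about the sample mean. The base R sketch below illustrates that reading only. It is an assumption made for the example and is not taken from modeltime's source; the residuals, the point prediction, and the 95% level are all made up.

```r
# Hypothetical calibration residuals with a positive bias
residuals_vec <- c(2, 3, 4)

# Usual standard deviation: spread around the sample mean (3)
sd_about_mean <- sd(residuals_vec)

# Spread around zero: root-mean-square of the residuals
sd_about_zero <- sqrt(mean(residuals_vec^2))

# Approximate 95% interval around a hypothetical point prediction
prediction <- 50
z <- qnorm(0.975)
conf_lo <- prediction - z * sd_about_zero
conf_hi <- prediction + z * sd_about_zero

print(c(sd_about_mean = sd_about_mean, sd_about_zero = sd_about_zero))
print(c(conf_lo = conf_lo, conf_hi = conf_hi))
```

When the residuals are biased, measuring spread around zero gives a wider interval than measuring it around the residual mean, so the bias is reflected in the reported uncertainty.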
- Updates to work with parsnip 0.1.2.
- prophet_boost(): Set nthreads = 1 (default) to ensure parallelization is thread safe.
- Initial Release