
[FIX] Unify API #1023

Open · wants to merge 77 commits into base: main
Conversation

@elephaint (Contributor) commented May 31, 2024

This is a large refactoring PR, open for discussion. The main goal is to unify the API across different model types and to unify loss functions across different loss types.

Refactoring:

  • Fuses BaseWindows, BaseMultivariate and BaseRecurrent into BaseModel, removing the need for separate classes and unifying the model API across model types. Instead, this PR introduces two model attributes, RECURRENT (True/False) and MULTIVARIATE (True/False), yielding four possible model options. We currently have a model for every combination except a recurrent multivariate model (e.g. a multivariate LSTM), which is now relatively simple to add. In addition, a model can now be recurrent or not, or multivariate or not, on the fly based on the user's input, which simplifies modelling going forward (see the sketch after this list).
  • Unifies model API across all models, adding missing input variables to all model types.
  • Refactors losses, among other things removing unnecessary domain_map functions.
  • Moves loss.domain_map outside of the models into BaseModel.
  • Moves the RevINMultivariate used by TSMixer, TSMixerx and RMoK to common.modules.
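
As an illustration of the attribute-based design described in the first bullet, here is a minimal, hypothetical sketch; the attribute names follow the PR description, and the class names are made up for illustration:

```python
# Hypothetical sketch of the unified BaseModel design described above.
# Attribute names follow the PR description; the real code may differ,
# and the *Like class names are made up for illustration.


class BaseModel:
    RECURRENT = False      # forecast step by step (True) or in one shot (False)
    MULTIVARIATE = False   # model all series jointly (True) or each series separately (False)


class NHITSLike(BaseModel):       # direct, univariate
    RECURRENT = False
    MULTIVARIATE = False


class TSMixerLike(BaseModel):     # direct, multivariate
    RECURRENT = False
    MULTIVARIATE = True


class LSTMLike(BaseModel):        # recurrent, univariate
    RECURRENT = True
    MULTIVARIATE = False


# The missing combination, a recurrent multivariate model (e.g. a multivariate
# LSTM), would simply set RECURRENT = True and MULTIVARIATE = True once added.
```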

Features:

  • All losses are now compatible with all types of models (e.g. univariate/multivariate, direct/recurrent), or an appropriate protection has been added.
  • DistributionLoss now supports the use of quantiles in predict, allowing for easy quantile retrieval for all DistributionLosses.
  • Mixture losses (GMM, PMM and NBMM) now support learned weights for weighted mixture distribution outputs.
  • Mixture losses now support the use of quantiles in predict, allowing for easy quantile retrieval.
  • Improved stability of ISQF by adding softplus protection around some parameters instead of using .abs.
  • Unified API for any quantile or any confidence level during predict for both point and distribution losses (see the sketch after this list).
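
A hedged usage sketch of the unified quantile/level API described above; the exact keyword names accepted by predict (level, quantiles) follow the PR description and may differ in the merged code, and the model and hyperparameter choices are illustrative:

```python
# Minimal sketch, assuming the unified predict API described above; the
# level/quantiles keywords follow the PR description and may differ in the
# merged code, and the hyperparameters are illustrative.
from neuralforecast import NeuralForecast
from neuralforecast.models import NHITS
from neuralforecast.losses.pytorch import DistributionLoss
from neuralforecast.utils import AirPassengersDF

nf = NeuralForecast(
    models=[NHITS(h=12, input_size=24,
                  loss=DistributionLoss(distribution='StudentT'),
                  max_steps=100)],
    freq='M',
)
nf.fit(df=AirPassengersDF)

# Retrieve prediction intervals / quantiles directly at predict time.
preds_levels = nf.predict(level=[80, 95])                   # confidence levels
# preds_quantiles = nf.predict(quantiles=[0.1, 0.5, 0.9])   # or explicit quantiles
```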

Bug fixes:

  • MASE loss now works.
  • Added various protections around parameter combinations that are invalid (e.g. regarding losses).
  • StudentT: increased the default DoF to 3 to reduce unbounded-variance issues.
  • All models are now tested using a test function on the AirPassengers dataset; most models previously had eval: false on their examples while having no other tests, so most models were effectively not tested at all.
  • IQLoss previously did not give monotonic quantiles; now it does (by quantiling the quantiles, see the sketch after this list).
  • When training with both a conformal and a non-conformal method, the latter was also cross-validated to compute conformity scores. This redundant training step is removed in this PR.
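
To illustrate the IQLoss fix, a minimal sketch of the general idea behind "quantiling the quantiles", i.e. rearranging predicted quantiles so they are non-decreasing; this is not the PR's implementation:

```python
# Minimal illustration of enforcing monotonic quantiles ("quantiling the
# quantiles"); this is not the PR's implementation.
import torch


def monotonic_quantiles(quantile_preds: torch.Tensor) -> torch.Tensor:
    """Sort along the last (quantile) dimension so lower quantiles
    never exceed higher ones."""
    return torch.sort(quantile_preds, dim=-1).values


preds = torch.tensor([[10.0, 9.5, 11.0]])    # crossing quantiles: q10 > q50
print(monotonic_quantiles(preds))            # tensor([[ 9.5000, 10.0000, 11.0000]])
```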

Breaking changes:

  • Rewrite of all recurrent models to get rid of the quadratic (in the sequence dimension) space complexity. As a result, it is impossible to load a recurrent model from a previous version into this version.
  • Recurrent models now require an input_size to be given (see the example after this list).
  • TCN and DRNN are now windows models, not recurrent models.
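
A small, hedged example of the new requirement on recurrent models; the hyperparameter values are illustrative only:

```python
# Hedged sketch: under this PR, recurrent models need an explicit input_size.
# Hyperparameter values are illustrative only.
from neuralforecast.models import LSTM

model = LSTM(
    h=12,            # forecast horizon
    input_size=24,   # now required for recurrent models
    max_steps=100,
)
```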

Tests:

  • Added common._model_checks.py, which includes a model testing function. This function runs on every model, ensuring that every model is tested on push (see the sketch below).
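
A hedged sketch of what such a per-model check might look like; the helper name check_model and the hyperparameters are assumptions, and the actual function in common._model_checks.py may differ:

```python
# Hypothetical sketch of a per-model smoke test on AirPassengers; the real
# check in common._model_checks.py may differ.
from neuralforecast import NeuralForecast
from neuralforecast.utils import AirPassengersDF


def check_model(model_cls, horizon: int = 12, **model_kwargs):
    model = model_cls(h=horizon, input_size=2 * horizon, max_steps=2, **model_kwargs)
    nf = NeuralForecast(models=[model], freq='M')
    nf.fit(df=AirPassengersDF)
    preds = nf.predict()
    assert len(preds) == horizon, f"{model_cls.__name__} returned {len(preds)} rows"


if __name__ == '__main__':
    from neuralforecast.models import MLP, NHITS
    for cls in (NHITS, MLP):
        check_model(cls)
```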

Todo:

  • Test models on speed/scaling as compared to current implementation across a set of datasets.
  • Make sure the docstrings of all multivariate models are updated to reflect the additional inputs.


@elephaint elephaint marked this pull request as ready for review July 15, 2024 18:48
@elephaint elephaint linked an issue Jul 22, 2024 that may be closed by this pull request
@marcopeix (Contributor) commented:

I ran my own small experiment, and models are slightly faster, with no degradation in performance. The refactoring looks good to me, but a second opinion would be great on this!

@elephaint (Contributor, Author) replied:

> I ran my own small experiment, and models are slightly faster, with no degradation in performance. The refactoring looks good to me, but a second opinion would be great on this!

Thanks!

@elephaint changed the title from [FIX] Code refactoring to [FIX] Unify API on Oct 17, 2024
Member commented:

i noticed that this file and action_files/test_models/src/evaluation2.py are quite similar. i have a couple of suggestions:

  • this might be a good opportunity to use the utilsforecast evaluation features. we could replace the mae and smape from the losses module and the evaluate function with their utilsforecast counterparts (a sketch follows at the end of this comment).
  • also, it looks like the only difference between this file and the second one is the list of models, correct? if that’s the case, we could combine them into a single file and use fire to pass the list of models. you could then call it in the .github/workflows/ci.yaml file.

the idea would be to abstract the code in the if __name__ == '__main__': clause, something like this:

# suggested refactor: abstract the evaluation into a main() function
from itertools import product

import pandas as pd

# `evaluate` is the existing evaluation helper defined in this module
def main(models: list):
    groups = ['Monthly']
    datasets = ['M3']
    # run every (model, group, dataset) combination and drop empty results
    evaluation = [evaluate(model, dataset, group) for model, group in product(models, groups) for dataset in datasets]
    evaluation = [eval_ for eval_ in evaluation if eval_ is not None]
    evaluation = pd.concat(evaluation)
    evaluation = evaluation[['dataset', 'model', 'time', 'mae', 'smape']]
    evaluation['time'] /= 60  # seconds -> minutes
    # reshape into a (dataset, metric) x model table
    evaluation = evaluation.set_index(['dataset', 'model']).stack().reset_index()
    evaluation.columns = ['dataset', 'model', 'metric', 'val']
    evaluation = evaluation.set_index(['dataset', 'metric', 'model']).unstack().round(3)
    evaluation = evaluation.droplevel(0, 1).reset_index()
    evaluation['AutoARIMA'] = [666.82, 15.35, 3.000]  # reference results
    evaluation.to_csv('data/evaluation.csv')
    print(evaluation.T)

and then you could use fire inside the main clause:

if __name__ == '__main__':
    import fire
    # expose main() as a command-line interface
    fire.Fire(main)

this way, we can run it for different models inside .github/workflows/ci.yaml: python -m action_files.test_models.src.evaluation --models <list of models>. wdyt?
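
For reference, a hedged sketch of the utilsforecast-based evaluation suggested in this comment; the wrapper name evaluate_forecasts and the input column layout are assumptions:

```python
# Hedged sketch of evaluating forecasts with utilsforecast, as suggested above.
# Assumes a long-format DataFrame with unique_id, ds, y and one column per model;
# the wrapper name evaluate_forecasts is made up for illustration.
import pandas as pd
from utilsforecast.evaluation import evaluate
from utilsforecast.losses import mae, smape


def evaluate_forecasts(forecasts_df: pd.DataFrame) -> pd.DataFrame:
    # one row per (unique_id, metric), one column per model
    return evaluate(forecasts_df, metrics=[mae, smape])
```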

Member commented:

this also could apply to action_files/test_models/src/multivariate_evaluation.py. since we are changing models and datasets, we could define main(models: list, dataset: str).

@elephaint (Contributor, Author) commented Nov 6, 2024:

Good idea, one remark - why favour ci over circleci? (I'm ambivalent; I don't know why we would prefer one over the other.)

Member commented:

maybe we could also merge action_files/test_models/src/models.py, action_files/test_models/src/models2.py, and action_files/test_models/src/multivariate_models.py. from what i can tell, the only real difference between them is the list of models. if that’s the case, we could add a new parameter to main—maybe config: str or something similar—and then have a dictionary with different models based on the config. for example: {"multivariate": , ...}, and then we could call it like this: python -m action_files.test_models.src.models --config multivariate.

let me know what you think.
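
A hedged sketch of the config-based model selection suggested above; the dictionary contents, module layout, and model groupings are illustrative only:

```python
# Hedged sketch of selecting model lists via a --config flag, as suggested above.
# The groupings and module layout are illustrative only.
from neuralforecast.models import MLP, NHITS, TSMixer

MODEL_CONFIGS = {
    'univariate': [NHITS, MLP],
    'multivariate': [TSMixer],
}


def main(config: str):
    models = MODEL_CONFIGS[config]
    ...  # run the shared evaluation for the selected models


if __name__ == '__main__':
    import fire
    # e.g. python -m action_files.test_models.src.models --config multivariate
    fire.Fire(main)
```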

@elephaint (Contributor, Author) replied:

Makes sense; then we can still fire up multiple runners (to keep test time under control it makes sense to split the tests).

Comment on lines +14 to +21
# from neuralforecast.models.rnn import RNN
# from neuralforecast.models.tcn import TCN
from neuralforecast.models.lstm import LSTM
from neuralforecast.models.dilated_rnn import DilatedRNN
from neuralforecast.models.deepar import DeepAR
from neuralforecast.models.mlp import MLP
from neuralforecast.models.nhits import NHITS
from neuralforecast.models.nbeats import NBEATS
# from neuralforecast.models.deepar import DeepAR
# from neuralforecast.models.mlp import MLP
# from neuralforecast.models.nhits import NHITS
# from neuralforecast.models.nbeats import NBEATS
Member commented:

is the commented code going to be restored in the future? if this change is permanent, maybe we could delete those lines instead.

@elephaint (Contributor, Author) replied:

I'm kind of treating this file as a local testing file too; we can delete it (it's mainly so that testing locally is faster and you don't need to type all of that every time).

Comment on lines +173 to +174
" n_series = 1\n",
" self.n_series = n_series \n",
Member commented:

should we set n_series = None in this case? using n_series = 1 might lead to unexpected bugs, though i'm not entirely sure if that's the case here.

@elephaint (Contributor, Author) replied:

We need n_series=1 for the univariate models in a few other places. I could introduce additional helper variables to deal with that situation, but that feels a bit unnecessary perhaps.


Successfully merging this pull request may close these issues.

[Recurrent Models] do not support step_size > 1; in-sample prediction (predict_insample) cannot be used