
[FEAT] Conformal Predictions in NeuralForecast #1171

Merged: 27 commits from feat/conformal-prediction into Nixtla:main on Oct 11, 2024

Conversation

JQGoh
Contributor

@JQGoh JQGoh commented Oct 3, 2024

Rationale and Changes

Caveats

  • This does not support Spark DataFrames (SparkDataFrame)
  • Quantile-type losses are not supported (the various quantile outputs are not conformalized)


@elephaint
Contributor

This is great. Let me know if I can help you with this.

@JQGoh JQGoh changed the title Feat/conformal prediction [FEAT] Conformal Predictions in NeuralForecast Oct 3, 2024
@JQGoh
Contributor Author

JQGoh commented Oct 3, 2024

@elephaint @marcopeix @jmoralez Please review

cc: @valeman This may be of interest to you

@JQGoh JQGoh marked this pull request as ready for review October 3, 2024 15:21
Contributor

@elephaint elephaint left a comment


Awesome work! I did a first pass, I'll clone the branch tomorrow to take a deeper dive!

Review comments (resolved) on: nbs/core.ipynb, nbs/docs/tutorials/20_conformal_prediction.ipynb, nbs/utils.ipynb, neuralforecast/core.py
@JQGoh
Contributor Author

JQGoh commented Oct 7, 2024

@elephaint Please check the revised PR: I have removed the UNSUPPORTED_LOSSES_CONFORMAL variable and introduced an optional argument to conformalize quantiles.

Review comment (resolved) on: neuralforecast/core.py
Contributor

@elephaint elephaint left a comment


Great new stuff! I think most of the elements are there. At a higher level, I'm still pondering whether this is the best way of including this in the API (I'll get back to that).

Couple of points for now:

  • Please remove enable_quantiles everywhere; I think it's unnecessary, and the distinction between the various point losses is arbitrary;
  • Please remove the -conformal tag in the output names, so that the output names are identical to the normal DistributionLoss output names;
  • cross_validation needs the option for conformal intervals too;
  • The example needs a bit more work; for example, below is a code snippet for creating a somewhat nicer plot.

Example code in the tutorial (this already assumes the -conformal tag will be removed from the output name):

import pandas as pd
import matplotlib.pyplot as plt

from neuralforecast import NeuralForecast
from neuralforecast.models import NHITS
from neuralforecast.losses.pytorch import DistributionLoss
# ConformalIntervals was introduced in this PR (import location assumed)
from neuralforecast.utils import AirPassengersPanel, ConformalIntervals

# Hold out the last 12 months for forecasting
AirPassengersPanel_train = AirPassengersPanel[AirPassengersPanel['ds'] < AirPassengersPanel['ds'].values[-12]]

horizon = 12
input_size = 24

conformal_intervals = ConformalIntervals()

models = [
    NHITS(h=horizon, input_size=input_size, max_steps=100),
    NHITS(h=horizon, input_size=input_size, max_steps=100, loss=DistributionLoss("Normal", level=[90])),
]
nf = NeuralForecast(models=models, freq='ME')
nf.fit(AirPassengersPanel_train, conformal_intervals=conformal_intervals)
preds = nf.predict(level=[90])  # conformal intervals for the point-loss model (level argument per this PR)

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(20, 7))
plot_df = pd.concat([AirPassengersPanel_train, preds])
plot_df = plot_df[plot_df['unique_id'] == 'Airline1'].drop(['unique_id', 'trend', 'y_[lag12]'], axis=1).iloc[-50:]

ax1.plot(plot_df['ds'], plot_df['y'], c='black', label='True')
ax1.plot(plot_df['ds'], plot_df['NHITS1'], c='blue', label='median')
ax1.fill_between(x=plot_df['ds'][-12:],
                 y1=plot_df['NHITS1-lo-90'][-12:].values,
                 y2=plot_df['NHITS1-hi-90'][-12:].values,
                 alpha=0.4, label='level 90')
ax1.set_title('AirPassengers Forecast', fontsize=18)
ax1.set_ylabel('Monthly Passengers', fontsize=15)
ax1.legend(prop={'size': 10})
ax1.grid()

ax2.plot(plot_df['ds'], plot_df['y'], c='black', label='True')
ax2.plot(plot_df['ds'], plot_df['NHITS'], c='blue', label='median')
ax2.fill_between(x=plot_df['ds'][-12:],
                 y1=plot_df['NHITS-lo-90'][-12:].values,
                 y2=plot_df['NHITS-hi-90'][-12:].values,
                 alpha=0.4, label='level 90')
ax2.set_ylabel('Monthly Passengers', fontsize=15)
ax2.set_xlabel('Timestamp [t]', fontsize=15)
ax2.legend(prop={'size': 10})
ax2.grid()

Review comments (resolved) on: nbs/core.ipynb
@JQGoh
Contributor Author

JQGoh commented Oct 7, 2024

@elephaint Thanks for the detailed review.

cross_validation needs the option for conformal intervals too;

If I understand correctly, we want to enable users to call cross_validation and have the prediction outputs include conformal prediction intervals.

However, my understanding is that generating conformal predictions takes two steps:

  1. First, compute conformity scores and store them in _cs_df
  2. Second, during prediction, compute prediction intervals from the stored conformity scores and the given level

It seems counter-intuitive to me that we could get conformal predictions by directly executing cross_validation. I hope to hear your elaboration on this.

PS: Even if we want to revise this, could we do so in a subsequent PR?
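The two-step procedure described above can be sketched as follows. This is a minimal NumPy illustration of split-conformal intervals, not the PR's actual implementation; both function names are hypothetical:

```python
import numpy as np

def conformity_scores(y_calib, y_hat_calib):
    # Step 1: conformity scores = absolute residuals on a held-out
    # calibration window (what the PR stores in _cs_df)
    return np.abs(y_calib - y_hat_calib)

def conformal_interval(y_hat, scores, level=90):
    # Step 2: at prediction time, widen the point forecast by the
    # empirical (1 - alpha) quantile of the stored scores
    alpha = 1 - level / 100
    q = np.quantile(scores, 1 - alpha)
    return y_hat - q, y_hat + q

scores = conformity_scores(np.array([10., 12., 9., 11.]),
                           np.array([11., 11., 10., 10.]))
lo, hi = conformal_interval(np.array([12.0]), scores, level=90)
# all scores are 1.0 here, so the 90% interval is [11.0, 13.0]
```

This matches the split: the scores are computed once and cached, while the interval itself only depends on the requested level, which is why prediction can reuse stored scores.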

@elephaint
Contributor

elephaint commented Oct 8, 2024

@JQGoh I pushed most of the suggested fixes already, saving us some time.

I think we still need:

  • Cross validation prediction intervals (we should be able to follow MLForecast example)
  • Some more tests (I need to think about the relevant tests)
  • Spark DataFrame support (?)

Other than that I'm really happy with what you made and I think we're almost there.
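For the cross-validation item above, the conformity scores can come from out-of-sample residuals of rolling windows, in the spirit of the MLForecast example mentioned. A small self-contained sketch of that idea (the helper and the naive forecaster are hypothetical, not code from this PR):

```python
import numpy as np

def rolling_cv_residuals(y, horizon, n_windows, forecast_fn):
    # Collect out-of-sample absolute residuals from n_windows rolling
    # splits; these serve as conformity scores for interval calibration.
    residuals = []
    for w in range(n_windows, 0, -1):
        cutoff = len(y) - w * horizon
        train, test = y[:cutoff], y[cutoff:cutoff + horizon]
        y_hat = forecast_fn(train, horizon)
        residuals.append(np.abs(test - y_hat))
    return np.concatenate(residuals)

# naive last-value forecaster as a stand-in model
naive = lambda train, h: np.repeat(train[-1], h)
y = np.arange(20, dtype=float)
scores = rolling_cv_residuals(y, horizon=2, n_windows=3, forecast_fn=naive)
# each window contributes residuals [1.0, 2.0] for this linear series
```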

@JQGoh
Contributor Author

JQGoh commented Oct 8, 2024


@elephaint Thanks for your help with the revision. I will think about the items you mentioned. By the way, it appears that for now we only introduce conformal predictions for point losses, but not for the quantile outputs? If that is the case, I think we should mention this in the channel, since I previously said I wanted to introduce an optional parameter that supports this.

@elephaint
Contributor


Correct; users can still get prediction intervals over an arbitrary quantile output by fitting a model with e.g. QuantileLoss(q=0.8). Maybe we'll change this in the future, but for now it feels a bit too meta to include prediction intervals over prediction intervals.
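As background on why fitting a single QuantileLoss already yields an interval bound: the quantile (pinball) loss penalizes errors asymmetrically, so its minimizer is the q-th conditional quantile. A plain-NumPy illustration of the loss itself (not NeuralForecast's QuantileLoss class):

```python
import numpy as np

def pinball_loss(y, y_hat, q):
    # Asymmetric penalty: under-forecasting costs q per unit of error,
    # over-forecasting costs (1 - q) per unit
    diff = y - y_hat
    return float(np.mean(np.maximum(q * diff, (q - 1) * diff)))

# With q = 0.8, under-forecasting is penalized 4x more than
# over-forecasting, pushing the fitted output up to the 80th percentile
loss_under = pinball_loss(np.array([10.0]), np.array([8.0]), q=0.8)   # 1.6
loss_over = pinball_loss(np.array([10.0]), np.array([12.0]), q=0.8)   # 0.4
```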

@JQGoh
Contributor Author

JQGoh commented Oct 8, 2024


That is indeed neater than providing conformal predictions on the various quantile outputs (I also agree that would be too "meta"). A great improvement suggested by you 👍

Review comment (resolved) on: nbs/core.ipynb
@elephaint
Contributor

elephaint commented Oct 10, 2024

@JQGoh Thanks again! I added some protections and moved the conformity score calculation into the parts where we are able to compute it. I think it will work like this; the only situation I haven't fully covered is a stored dataset (df=None), although I'm not sure that's a major issue.

Let me know what you think.

@JQGoh
Contributor Author

JQGoh commented Oct 10, 2024


@elephaint Thanks for adding the changes regarding the protection measures, LGTM.

@elephaint
Contributor

Ok, thanks again @JQGoh - I'm happy to merge

@JQGoh
Contributor Author

JQGoh commented Oct 11, 2024

@elephaint Thanks for your help with the user experience improvement 🙏

@elephaint elephaint merged commit d0549b6 into Nixtla:main Oct 11, 2024
14 checks passed
@JQGoh JQGoh deleted the feat/conformal-prediction branch October 14, 2024 13:30
Successfully merging this pull request may close these issues.

Conformal Prediction in NeuralForecast
4 participants