[FEAT] Conformal Predictions in NeuralForecast #1171
Conversation
This is great, let me know if I can help you with this.
@elephaint @marcopeix @jmoralez Please review. cc: @valeman This will be of interest to you.
Awesome work! I did a first pass, I'll clone the branch tomorrow to take a deeper dive!
Force-pushed from ec6ea78 to d514058: add argument to conformalize quantiles if desired
@elephaint Please check the revised PR, in which I have omitted the […]
Great new stuff! I think most of the elements are there. On a more high level I'm still pondering whether this is the best way of including this in the API (I'll get back to that).
Couple of points for now:
- Please remove `enable_quantiles` everywhere; I think it's unnecessary and the distinction between various point losses is arbitrary.
- Please remove the `-conformal` tag in the output names; this way the output names will be identical to normal `DistributionLoss` output names.
- `cross_validation` needs the option for conformal intervals too (a sketch of what that call could look like follows the example below).
- The example needs a bit more work; for example, below is a code snippet for creating a somewhat nicer plot.

Example code in the tutorial (this already assumes the `-conformal` tag will be removed from the output name):
```python
import pandas as pd
import matplotlib.pyplot as plt

from neuralforecast import NeuralForecast
from neuralforecast.models import NHITS
from neuralforecast.losses.pytorch import DistributionLoss
from neuralforecast.utils import AirPassengersPanel
# Assumed import path for the ConformalIntervals class introduced in this PR.
from neuralforecast.utils import ConformalIntervals

# Hold out the last 12 months of each series for evaluation.
AirPassengersPanel_train = AirPassengersPanel[AirPassengersPanel['ds'] < AirPassengersPanel['ds'].values[-12]]

horizon = 12
input_size = 24
conformal_intervals = ConformalIntervals()
models = [
    NHITS(h=horizon, input_size=input_size, max_steps=100),  # point loss -> conformal intervals ('NHITS')
    NHITS(h=horizon, input_size=input_size, max_steps=100, loss=DistributionLoss("Normal", level=[90])),  # 'NHITS1'
]
nf = NeuralForecast(models=models, freq='ME')
nf.fit(AirPassengersPanel_train, conformal_intervals=conformal_intervals)
preds = nf.predict(level=[90])  # assumption: the interval level is requested at predict time

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(20, 7))
plot_df = pd.concat([AirPassengersPanel_train, preds])
plot_df = plot_df[plot_df['unique_id'] == 'Airline1'].drop(['unique_id', 'trend', 'y_[lag12]'], axis=1).iloc[-50:]

# Top panel: NHITS trained with DistributionLoss (native probabilistic intervals).
ax1.plot(plot_df['ds'], plot_df['y'], c='black', label='True')
ax1.plot(plot_df['ds'], plot_df['NHITS1'], c='blue', label='median')
ax1.fill_between(x=plot_df['ds'][-12:],
                 y1=plot_df['NHITS1-lo-90'][-12:].values,
                 y2=plot_df['NHITS1-hi-90'][-12:].values,
                 alpha=0.4, label='level 90')
ax1.set_title('AirPassengers Forecast', fontsize=18)
ax1.set_ylabel('Monthly Passengers', fontsize=15)
ax1.legend(prop={'size': 10})
ax1.grid()

# Bottom panel: NHITS trained with a point loss, intervals from conformal prediction.
ax2.plot(plot_df['ds'], plot_df['y'], c='black', label='True')
ax2.plot(plot_df['ds'], plot_df['NHITS'], c='blue', label='median')
ax2.fill_between(x=plot_df['ds'][-12:],
                 y1=plot_df['NHITS-lo-90'][-12:].values,
                 y2=plot_df['NHITS-hi-90'][-12:].values,
                 alpha=0.4, label='level 90')
ax2.set_ylabel('Monthly Passengers', fontsize=15)
ax2.set_xlabel('Timestamp [t]', fontsize=15)
ax2.legend(prop={'size': 10})
ax2.grid()
```
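For the `cross_validation` point above, a possible shape of that call could be the following. This is only a hedged sketch: the `conformal_intervals` and `level` arguments mirror the `fit`/`predict` pattern in the example and are assumptions about the in-progress API, not a finalized interface.

```python
# Sketch only: assumes cross_validation accepts the same conformal_intervals
# argument as fit, and that the interval level is requested per call.
cv_df = nf.cross_validation(
    df=AirPassengersPanel_train,
    n_windows=3,
    step_size=horizon,
    conformal_intervals=ConformalIntervals(),  # assumed argument name
    level=[90],                                # assumed, mirroring predict()
)
# Expected (assumed) columns: unique_id, ds, cutoff, y, NHITS, NHITS-lo-90, NHITS-hi-90, ...
```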
@elephaint Thanks for the detailed review.
If I got this right, we want to enable users to call […]. However, my understanding is that to generate conformal predictions: […]
It seems counter-intuitive to me that we can get conformal predictions by directly executing […]. PS: Even if we want to revise this, I suggest we do it in a subsequent PR.
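For context on why a separate calibration step comes up here, below is a minimal, framework-independent sketch of the split-conformal recipe: fit on a training split, score residuals on a held-out calibration split, then widen the point forecasts by a quantile of those scores. The function and variable names are illustrative only and not part of the NeuralForecast API.

```python
import numpy as np

def split_conformal_interval(y_cal, yhat_cal, yhat_test, level=90):
    """Symmetric split-conformal interval around point forecasts.

    y_cal, yhat_cal: actuals and point forecasts on a held-out calibration set.
    yhat_test: point forecasts we want intervals for.
    """
    alpha = 1 - level / 100
    scores = np.abs(y_cal - yhat_cal)  # absolute-residual conformity scores
    n = len(scores)
    # Finite-sample corrected empirical quantile of the conformity scores.
    q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))
    return yhat_test - q, yhat_test + q

# Toy usage with synthetic numbers.
rng = np.random.default_rng(0)
y_cal = rng.normal(size=200)
yhat_cal = y_cal + rng.normal(scale=0.5, size=200)
lo, hi = split_conformal_interval(y_cal, yhat_cal, yhat_test=np.zeros(12), level=90)
```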
@JQGoh I pushed most of the suggested fixes already, saving us some time. I think we still need: […]
Other than that, I'm really happy with what you made and I think we're almost there.
@elephaint Thanks for your help with the revision. I will think about the mentioned items. By the way, it appears that for now we only introduce conformal predictions for point losses, but not for quantile outputs? If that is the case, I think we should mention this in the channel, as I previously said that I wanted to introduce an optional parameter supporting this.
Correct: users can still get prediction intervals over an arbitrary quantile output by fitting a model with e.g. […]
That is indeed neater than providing conformal predictions on various quantile outputs (I also agree that would be too "meta"). Great improvement suggested by you 👍
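As a hedged illustration of the two routes discussed above (reusing `AirPassengersPanel_train` and `ConformalIntervals` from the earlier example; `MQLoss` is just one example of a quantile loss, and the `conformal_intervals`/`level` arguments follow the in-progress API of this PR):

```python
from neuralforecast import NeuralForecast
from neuralforecast.models import NHITS
from neuralforecast.losses.pytorch import MQLoss

# 1) Quantile outputs straight from the loss -- no conformalization involved.
nf_q = NeuralForecast(
    models=[NHITS(h=12, input_size=24, max_steps=100, loss=MQLoss(level=[90]))],
    freq='ME',
)
nf_q.fit(AirPassengersPanel_train)
preds_q = nf_q.predict()  # interval columns such as NHITS-lo-90 / NHITS-hi-90

# 2) Point loss + conformal prediction intervals, as in this PR (argument name assumed).
nf_c = NeuralForecast(models=[NHITS(h=12, input_size=24, max_steps=100)], freq='ME')
nf_c.fit(AirPassengersPanel_train, conformal_intervals=ConformalIntervals())
preds_c = nf_c.predict(level=[90])
```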
@JQGoh Thanks again! I added some protections and moved the conformity score calculation to within the parts where we are able to do it. I think it will work like this, only the situation with a stored dataset ([…]). Let me know what you think.
@elephaint Thanks for adding the changes regarding the protection measures, LGTM.
Ok, thanks again @JQGoh - I'm happy to merge.
@elephaint
Rationale and Changes
Caveats
SparkDataFrame