
How to keep the best performance model #23

Open
JunqiZhao opened this issue Aug 17, 2018 · 3 comments

Comments

@JunqiZhao

Hi Guillaume,
Thanks for your great post, it helped me a lot.
When training the RNN model, performance peaks somewhere in the middle of training, and the model left after all the iterations is not the best one. Is this normal, and is there a way to keep the best-performing model from the training process instead of the final model after all the iterations?
Thanks!

@deadskull7

Hi JunqiZhao! In Keras you can use the ModelCheckpoint callback to checkpoint the model and save its weights while monitoring a quantity such as val_loss or val_acc. Later you can load the weights saved in .hdf5 format with load_weights, passing the path of the weights file as an argument.
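
A minimal sketch of the callback usage described above, assuming a compiled Keras model in a variable named `model` and training/validation arrays `X_train`, `y_train`, `X_val`, `y_val` (those names, the file path, and the epoch count are illustrative, not from this thread):

```python
from keras.callbacks import ModelCheckpoint

# Save the weights only when the monitored quantity improves.
# Older Keras logs accuracy as "val_acc"; newer versions use "val_accuracy".
checkpoint = ModelCheckpoint(
    "best_weights.hdf5",     # hypothetical output path
    monitor="val_acc",       # quantity to monitor (val_loss also works)
    save_best_only=True,     # keep only the best checkpoint seen so far
    save_weights_only=True,
    mode="max",              # higher accuracy is better
    verbose=1,
)

model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=50,
    callbacks=[checkpoint],
)

# After training, restore the best weights instead of keeping the final ones.
model.load_weights("best_weights.hdf5")
```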

@JunqiZhao
Author

Hi @deadskull7 ,
Thanks for your reply and suggestion, I will give it a try!
By the way, I record the best performance obtained each time I train the model, and it varies from run to run, e.g. accuracy = [0.91, 0.92, 0.89, ...]. Right now I am using the average of these accuracy values to evaluate different network architectures; I was wondering how you would suggest quantifying model performance in such a situation?
Best,
Junqi
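
As a small illustration of the averaging idea mentioned above (the variable names and the extra statistic are assumptions, not part of the thread), reporting the mean together with the standard deviation makes the run-to-run spread visible when comparing architectures:

```python
import numpy as np

# Best accuracies recorded over repeated training runs (values taken from
# the comment above; the list is only illustrative).
accuracies = [0.91, 0.92, 0.89]

mean_acc = np.mean(accuracies)
std_acc = np.std(accuracies)

# The mean captures typical performance, the std captures run-to-run variability.
print(f"accuracy: {mean_acc:.3f} ± {std_acc:.3f}")
```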

@deadskull7

deadskull7 commented Sep 28, 2018

I usually evaluate model performance by first plotting the learning curves, and each time I plot them I check which of the following patterns the plot matches (a plotting sketch follows at the end of this comment):

  1. Underfitting – validation and training error both high
  2. Overfitting – validation error high, training error low
  3. Good fit – validation error low, only slightly higher than the training error
  4. Unknown fit – validation error low, training error high

The 4th one is the odd case: it means your model performs better on data it has never seen than on the data it was trained on, which is suspicious and should prompt you to re-evaluate how you split your data. I hope I answered you.
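
A minimal sketch of the learning-curve plot this diagnosis relies on, assuming a Keras History object returned by model.fit with validation_data supplied (the function name and plotting choices are illustrative):

```python
import matplotlib.pyplot as plt

def plot_learning_curves(history):
    """Plot per-epoch training and validation loss from a Keras History."""
    plt.plot(history.history["loss"], label="training loss")
    plt.plot(history.history["val_loss"], label="validation loss")
    plt.xlabel("epoch")
    plt.ylabel("loss")
    plt.legend()
    plt.show()

# Reading the plot per the list above: both curves high -> underfitting;
# validation high while training is low -> overfitting; validation low and
# slightly above training -> good fit; validation below training -> re-check
# the data split.
```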
