Selection of features staying in v1.0, please comment if you like a feature already in keras-contrib! #415
Comments
The Yogi optimizer will be owned by @MarcoAndreaBuchmann if you're still ok with it.
@gabrieldemarmiesse: Yes, that's fine with me.
Thanks, @MarcoAndreaBuchmann!
@gabrieldemarmiesse Most of the keras-contrib applications module, which hosts the DenseNet, DenseNet-FCN, NASNet, Residual-of-Residual Network (which IMO should be scrapped, it's almost never used), and Wide ResNet modules, was added by me, so I could take charge of maintaining them. ResNets, on the other hand, should be assigned to their original author. In addition, my other contributions are:
@titu1994, could you make PRs to remove what you think is not needed anymore? In each PR description, you can explain why we're removing it and, if there is a workaround, what it is. Thanks a lot for the help!
@gabrieldemarmiesse Sure, but the only one I recommend removing is the broken BatchRenormalization. It was an issue at the time of implementation, and it stands to this date: BatchRenormalization's internal counter is updated per batch, but the original paper states that its internal counter needs to be updated per epoch. There was no reasonable hook to obtain the epoch index inside a Layer, and many alternatives were discussed, but no fully correct one was found. In the end, at the time, I decided to simply keep the counter at the batch level, as evaluations on CIFAR-10 showed no significant differences compared to the paper's own plots at small batch sizes. Another issue is that it has been marked as incompatible with TF. I have not tested it on TF, but the code was written using the standard Keras backend, so I don't understand the underlying issue as to why it was marked as incompatible. To address both issues, I personally think it is better to remove it in its entirety, since any user can simply use the layer without being aware of the incorrectness of the implementation and obtain wrong results after expending significant resources.
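For context, the "no hook for the epoch index inside a Layer" problem above is usually worked around with a callback that pushes the epoch into the layer from outside. This is not the keras-contrib implementation, just a minimal pure-Python sketch of the pattern; `EpochAwareLayer` and `EpochCounterCallback` are hypothetical names, though `on_epoch_begin` mirrors the real `keras.callbacks.Callback` hook.

```python
# Hypothetical sketch of the "epoch hook" workaround: a callback injects the
# current epoch index into layer attributes, so the layer could advance its
# internal counter per epoch (as the paper requires) instead of per batch.
# Names are illustrative, not keras-contrib API.

class EpochAwareLayer:
    """Stand-in for a layer whose statistics should update once per epoch."""
    def __init__(self):
        self.epoch = 0  # internal counter the paper says to advance per epoch

class EpochCounterCallback:
    """Mimics the keras.callbacks.Callback.on_epoch_begin hook."""
    def __init__(self, layers):
        self.layers = layers

    def on_epoch_begin(self, epoch, logs=None):
        for layer in self.layers:
            layer.epoch = epoch  # push the epoch index into each tracked layer

layer = EpochAwareLayer()
cb = EpochCounterCallback([layer])
for epoch in range(3):       # a training loop would invoke this once per epoch
    cb.on_epoch_begin(epoch)
print(layer.epoch)  # → 2
```

The drawback, as noted above, is that this couples the layer to an external callback the user must remember to register, which is part of why removal was preferred.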
@titu1994, do you think we should remove some models too? It seems several of them are already in keras.applications.
@gabrieldemarmiesse I don't think we should. The primary reason is that the Keras versions are built to be used for inference only: they are pre-built models which have weights available and need not be trained from scratch. However, the ones here are built to be completely configurable as well as trainable from scratch, including regularization, which is explicitly dropped in Keras Applications (either l2 regularization or Dropout in the case of VGG). Edit: I'm not suggesting the Keras applications can't be used for fine-tuning, where normally one would keep the weights of the original model frozen in the first round of fine-tuning, and then unfreeze and train again with a very low learning rate. However, if one tries to train a Keras Application NASNet from scratch without proper regularization, they will probably overfit fast. I have personally seen it happen for NASNet and MobileNet frequently, so it is important to keep the contrib versions as well.
I see. So none of them come with pre-trained weights? It's fine by me to keep them, but they should all have tests running on Travis at every commit.
Certain contrib application models do come with pre-trained weights.
As for tests, when they were built we had no tests for applications in general. It was after Keras 2, I believe, that we had solid shape tests for applications (or maybe before and I wasn't aware). In any case, once I submit the PR for removing renorm and ROR, I'll get started on the remaining application tests.
@gabrieldemarmiesse Yes, sure, I am fine with being the owner of Padam.
Thanks, @MFreidank!
@titu1994 Thanks a lot for that!
@gabrieldemarmiesse, as we talked about previously, I did not write the code for the CRF layer, just a few changes to persist and restore models using it. IMHO the code is very convoluted and hard to read. What I can do is try to help people who want to use this layer and add/update some docstrings if necessary.
@lzfelix sure, thanks for the help. We'll just have to find a solution for CRF, because it seems that it's popular, but we don't have anyone to maintain it.
@gabrieldemarmiesse Speaking of tests for applications, a scenario can occur where a user imports both keras and keras_contrib (which is what we want), and both try to inject keras_applications into themselves. This is highly redundant. On top of that, keras_applications tests whether the classifier can accurately predict the class of a test image of an elephant. While we can do something similar for DenseNet and NASNet, ResNets and Wide ResNets do not have weights trained on ImageNet. What would be a reasonable middle ground? Skip the elephant classification check for those two? Not have the test for any of the models?
We should test that we are able to train on a single batch. Keras is going to run shape checks automatically, which is what we want. Let's not test the weights; it's not feasible on Travis anyway (it takes too long).
@gabrieldemarmiesse Hmm, if shape tests are all we need, then there is no need to train on any data at all. Backprop doubles the amount of memory needed, and for enormous models like NASNet and DenseNet, it isn't feasible to train a model on Travis on CPU.
True. Let's just run a single prediction on random pixels.
Yes, that should be fast enough (hopefully).
This latter solution seems fine.
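The single-prediction smoke test agreed on above could look roughly like this. `smoke_test` and `StubModel` are illustrative names, not keras-contrib code; a real test would build an actual application model instead of the stub, but the point is the same: one forward pass on random pixels, checking only the output shape, with no weights loaded and no training.

```python
import numpy as np

def smoke_test(model_fn, input_shape, num_classes):
    """Run one prediction on random pixels and check only the output shape.

    model_fn is any zero-argument callable returning an object with a
    .predict method. No weights are validated and no backprop happens,
    which keeps memory and CI time low for large models like NASNet.
    """
    x = np.random.rand(1, *input_shape).astype("float32")  # random pixels
    model = model_fn()
    preds = model.predict(x)
    assert preds.shape == (1, num_classes), preds.shape
    return preds.shape

# Stub standing in for an application model (illustrative only):
class StubModel:
    def __init__(self, num_classes):
        self.num_classes = num_classes
    def predict(self, x):
        return np.zeros((x.shape[0], self.num_classes), dtype="float32")

shape = smoke_test(lambda: StubModel(10), (32, 32, 3), 10)
print(shape)  # → (1, 10)
```

In a real suite, each application model (DenseNet, NASNet, Wide ResNet, ...) would get one such call with its expected input and class count, which is cheap enough to run on every commit.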
Hi, I commit to maintaining and answering questions regarding the SineReLu activation function. Cheers,
Thanks, @wilderrodrigues!
@titu1994 Could you add yourself to the codeowners when you have the time? Thanks!
@gabrieldemarmiesse Just to confirm, I'll be in charge of the Capsule layer and squash activation.
@SriRangaTarun you're already in the
Hello everyone, to give you all a better experience overall, we'll move features that have a maintainer to tensorflow/addons; see the announcement here: #519. As such, I'll close this issue. We won't have a 1.0 of keras-contrib, but we'll be better off with tensorflow/addons (they have a PyPI package and all!).
Given our manpower (spoiler alert: it's not much), we need to choose carefully which features will stay in the repo.
Given that this repo will host a lot of features (hopefully more than in keras!), the maintainers cannot do everything. That is why each layer/optimizer/callback/etc. should be assigned to a contributor or a maintainer who will commit to answering issues and reviewing pull requests about it. The assignee will also commit to having proper unit tests and docstrings for his/her feature.
The maintainers from the keras team will be in charge of the general repo health and coding standards.
About the things that I will commit to maintain myself:
If you care about one of the features which is in this repo and think that you can take care of issues, PRs, unit tests and docstrings related to it, then feel free to drop a message here.
All features without someone assigned to them will be removed in v1.0.