CutMix is an image data augmentation technique that is increasingly used in training pipelines to improve performance. It is among the best-performing augmentation methods on CIFAR.
A patch of the image is cut out and filled with a patch from another image in the dataset. The ground-truth labels are mixed in proportion to the number of pixels each image contributes to the combined image.
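To make the patch and label mixing concrete, here is a minimal, framework-free sketch of how a CutMix box and the matching label weight could be sampled. The function names `sample_cutmix_box` and `mix_labels` are hypothetical and are not taken from this PR; the sketch assumes the mixing ratio is drawn from a Beta(beta, beta) distribution and that the patch is clipped at the image border.

```python
import math
import random


def sample_cutmix_box(height, width, beta=1.0, rng=random):
    """Sample a CutMix patch and the resulting label weight (illustrative helper).

    Returns (top, bottom, left, right, lam), where lam is the fraction of
    pixels that still belong to the original image after pasting.
    """
    lam = rng.betavariate(beta, beta)   # initial mixing ratio in [0, 1]
    cut_ratio = math.sqrt(1.0 - lam)    # patch side as a fraction of the image side
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)

    # Pick the patch centre uniformly, then clip the patch to the image.
    cy, cx = rng.randrange(height), rng.randrange(width)
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)

    # Recompute lam from the clipped patch so the labels match real pixel counts.
    lam = 1.0 - (bottom - top) * (right - left) / (height * width)
    return top, bottom, left, right, lam


def mix_labels(label_a, label_b, lam):
    """Blend two one-hot (or soft) label vectors by pixel share."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]


# Example on a CIFAR-sized 32x32 image with a seeded generator.
top, bottom, left, right, lam = sample_cutmix_box(32, 32, beta=1.0, rng=random.Random(0))
print((top, bottom, left, right), round(lam, 3))
```

In a training loop, the pixels inside the sampled box would be copied from a second image, and the loss would be weighted by `lam` for the first label and `1 - lam` for the second.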
I use CutMix along with the random crop, horizontal flip and normalisation transforms used by the original code, and trained four models with a CutMix probability of 0.5 and a beta of 1.0. The test results are as follows:
These models were trained using MultiStepLR with milestones at epochs 50 and 100, and a gamma of 0.1.
I noticed that MultiStepLR reaches the same accuracy earlier in training than CosineAnnealingLR. With a maximum of 200 epochs, the learning rate under CosineAnnealingLR is still around 0.08 at epochs 50-60. This seems a bit high: accuracy and loss fluctuate and training progresses slowly. MultiStepLR drops the learning rate to 0.01 after 50 epochs and to 0.001 after 100 epochs, which helps reach higher accuracy faster. Hence, I have also added a parser argument to choose the scheduler, either CosineAnnealingLR or MultiStepLR.
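The learning-rate figures above can be checked with the closed-form schedules. This is a sketch, not the training code: it assumes an initial learning rate of 0.1 (inferred from the 0.01 and 0.001 values with a gamma of 0.1) and a minimum learning rate of 0 for cosine annealing, and the helper names are hypothetical.

```python
import math


def cosine_annealing_lr(epoch, base_lr=0.1, max_epochs=200, eta_min=0.0):
    """Closed-form cosine annealing schedule without restarts."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + math.cos(math.pi * epoch / max_epochs))


def multistep_lr(epoch, base_lr=0.1, milestones=(50, 100), gamma=0.1):
    """Multiply the learning rate by gamma once for each milestone already reached."""
    drops = sum(epoch >= m for m in milestones)
    return base_lr * gamma ** drops


# Compare both schedules at a few epochs (assumed base_lr = 0.1).
for epoch in (0, 50, 60, 100, 150):
    print(epoch, round(cosine_annealing_lr(epoch), 4), round(multistep_lr(epoch), 4))
```

Under these assumptions the cosine schedule gives roughly 0.085 at epoch 50 and 0.079 at epoch 60, while the step schedule is already at 0.01 from epoch 50 onward.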