
CUDA out of memory when training ViT-B architecture on ImageNet 224x224 with a batch size of 1024 #5

Open
jiachenlei opened this issue Dec 19, 2024 · 0 comments

jiachenlei commented Dec 19, 2024

Hi, thanks for your great work and for open-sourcing your code!

I am working on a project of mine that uses a contrastive loss. My code computes the contrastive loss three times, each with a different input. I followed the instructions in README.md and simply replaced the code snippets that compute the contrastive loss with inf_cl_loss. However, when training a ViT-B model on ImageNet 224x224 with a batch size of 1024 on 8xA800 GPUs, an out-of-memory exception occurred.
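For context, here is a hedged sketch of the kind of naive contrastive (InfoNCE-style) loss being replaced. The function name and shapes are illustrative, not taken from this repo. The point is that the full (b, b) logits matrix is materialized, so activation memory grows as O(b²) per loss term, and computing three such losses per step triples that cost, which is one plausible contributor to the OOM at large batch sizes:

```python
import numpy as np

def naive_info_nce_loss(q, k, temperature=0.07):
    """Naive InfoNCE loss over a batch of paired embeddings.

    q, k: (b, d) arrays of query/key features; row i of q is the
    positive for row i of k. Materializes the full (b, b) logits
    matrix, which is the usual memory hot spot at large batch sizes.
    """
    # L2-normalize features so logits are cosine similarities
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    k = k / np.linalg.norm(k, axis=1, keepdims=True)

    logits = q @ k.T / temperature  # (b, b): O(b^2) memory

    # Numerically stable cross-entropy with the diagonal as positives
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))
```

A memory-efficient replacement (as this repo provides) avoids materializing the full (b, b) matrix at once; the question above is why OOM still occurs after that substitution.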

In addition to this repo, I also use the xformers library for efficient attention computation in the ViT.

Do you have any clues why this is happening? Thanks!

jiachenlei changed the title from "CUDA out of memory when training ViT-B architecture on ImageNet 224x224 with a batch size of 16384" to "CUDA out of memory when training ViT-B architecture on ImageNet 224x224 with a batch size of 1024" on Dec 19, 2024.