
CUDA out of memory when training ViT-B architecture on ImageNet 224x224 with a batch size of 1024 #5

Open
jiachenlei opened this issue Dec 19, 2024 · 0 comments

jiachenlei commented Dec 19, 2024

Hi, thanks for your great work and for open-sourcing your code!

I am working on a project of mine that uses a contrastive loss. My code computes the contrastive loss three times, each with a different input. I followed the instructions in README.md and simply replaced the code snippets that compute the contrastive loss with inf_cl_loss. However, when training a ViT-B model on ImageNet 224x224 with a batch size of 1024 on 8xA800 GPUs, an out-of-memory exception occurred.
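For context, here is a hedged sketch of the kind of naive contrastive (InfoNCE-style) loss being replaced. The function name and shapes are illustrative, not taken from this repo. The point is that the full (b, b) logits matrix is materialized, so activation memory grows as O(b²) per loss term, and computing three such losses per step triples that cost, which is one plausible contributor to the OOM at large batch sizes:

```python
import numpy as np

def naive_info_nce_loss(q, k, temperature=0.07):
    """Naive InfoNCE loss over a batch of paired embeddings.

    q, k: (b, d) arrays of query/key features; row i of q is the
    positive for row i of k. Materializes the full (b, b) logits
    matrix, which is the usual memory hot spot at large batch sizes.
    """
    # L2-normalize features so logits are cosine similarities
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    k = k / np.linalg.norm(k, axis=1, keepdims=True)

    logits = q @ k.T / temperature  # (b, b): O(b^2) memory

    # Numerically stable cross-entropy with the diagonal as positives
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))
```

A memory-efficient replacement (as this repo provides) avoids materializing the full (b, b) matrix at once; the question above is why OOM still occurs after that substitution.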

In addition to this repo, I also use the xformers library for efficient attention computation in the ViT.

Do you have any clues why this is happening? Thanks!

jiachenlei changed the title from "CUDA out of memory when training ViT-B architecture on ImageNet 224x224 with a batch size of 16384" to "CUDA out of memory when training ViT-B architecture on ImageNet 224x224 with a batch size of 1024" on Dec 19, 2024.