Training failed for resolutions 768x768, 512x512, with error == cudaSuccess (2 vs. 0) on TitanX GPU #132

Open
manogna-s opened this issue Sep 12, 2019 · 0 comments

Comments

@manogna-s

The second stage of training, at resolution 768x768, fails with the following error:

F0903 14:31:26.106397 92421 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
@ 0x7fd08aa5c5cd google::LogMessage::Fail()
@ 0x7fd08aa5e433 google::LogMessage::SendToLog()
@ 0x7fd08aa5c15b google::LogMessage::Flush()
@ 0x7fd08aa5ee1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7fd08b2290e0 caffe::SyncedMemory::to_gpu()
@ 0x7fd08b2280a9 caffe::SyncedMemory::mutable_gpu_data()
@ 0x7fd08b390282 caffe::Blob<>::mutable_gpu_data()
@ 0x7fd08b363928 caffe::BaseConvolutionLayer<>::forward_gpu_gemm()
@ 0x7fd08b3eb296 caffe::ConvolutionLayer<>::Forward_gpu()
@ 0x7fd08b1f15f2 caffe::Net<>::ForwardFromTo()
@ 0x7fd08b1f1717 caffe::Net<>::Forward()
@ 0x7fd08b3a6eca caffe::Solver<>::Solve()
@ 0x7fd08b226604 caffe::P2PSync<>::Run()
@ 0x40ada0 train()
@ 0x407590 main
@ 0x7fd0899cc830 __libc_start_main
@ 0x407db9 _start
@ (nil) (unknown)
Aborted (core dumped)
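
For context, the failing check is in `caffe::SyncedMemory::to_gpu()`, and CUDA error code 2 is `cudaErrorMemoryAllocation`, which matches the "out of memory" text in the log. Below is a rough sketch of how the size of a single dense float32 blob grows with input resolution. The batch size and channel count are hypothetical values chosen for illustration; neither is taken from the actual training configuration, and real usage also depends on the network's depth and its intermediate buffers.

```python
def blob_bytes(n, c, h, w, bytes_per_elem=4):
    """Size in bytes of one dense N x C x H x W blob (float32 by default)."""
    return n * c * h * w * bytes_per_elem


# Hypothetical batch size and channel count, for illustration only.
BATCH, CHANNELS = 1, 64

for side in (512, 768):
    mib = blob_bytes(BATCH, CHANNELS, side, side) / 2**20
    print(f"{side}x{side}: {mib:.0f} MiB per blob")

# Ratio between the two resolutions for the same batch and channel count.
ratio = blob_bytes(BATCH, CHANNELS, 768, 768) / blob_bytes(BATCH, CHANNELS, 512, 512)
print(f"768x768 vs 512x512: {ratio:.2f}x")
```

Under these assumptions, each full-resolution blob is 2.25 times larger at 768x768 than at 512x512, so a configuration that barely fits at the lower resolution may not fit at the higher one.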

Has anyone come across this error and found a fix for it?
