Inference time, after loading the weights, is slower than ./build/tools/caffe time #573

Open
jazzseow opened this issue Jul 16, 2019 · 4 comments

Comments

@jazzseow

When I ran ./build/tools/caffe time, I got
I0716 10:14:52.669873 18718 caffe.cpp:656] Average Forward pass: 11.4608 ms.

When I ran ./build/examples/ssd/ssd_detect.bin and timed the forward function, I got timings like these:
time: 23.809 ms
time: 22.517 ms
time: 23.631 ms
time: 22.49 ms
time: 23.481 ms
time: 21.887 ms
time: 23.696 ms
time: 22.322 ms
time: 23.026 ms
time: 23.716 ms
time: 22.506 ms
time: 22.152 ms
time: 23.222 ms
time: 21.964 ms
time: 23.871 ms
time: 22.715 ms
time: 23.888 ms
time: 22.232 ms
time: 23.315 ms

Here is my code:
https://drive.google.com/drive/folders/1cAhF9wBNjBpO9Ykoh80Sv5eBqBZJwDYQ?usp=sharing
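For reference, the 19 per-call timings listed above average about 22.97 ms, roughly twice the 11.46 ms that caffe time reports. Below is a minimal sketch of a timing harness that makes the two numbers easier to compare: it discards a few warm-up calls and then averages the rest. The helper names `time_calls_ms` and `mean_ms` are hypothetical and not part of Caffe, and the sketch assumes the timed callable blocks until its work has finished (for GPU code this usually means synchronizing the device before the clock is stopped).

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of Caffe): calls `fn` `warmup` times without
// timing it, then times `iters` further calls and returns each duration in ms.
// Assumption: `fn` only returns once its work is complete.
inline std::vector<double> time_calls_ms(const std::function<void()>& fn,
                                         std::size_t warmup,
                                         std::size_t iters) {
  // Warm-up calls are discarded so one-time setup costs do not skew the samples.
  for (std::size_t i = 0; i < warmup; ++i) fn();

  std::vector<double> samples;
  samples.reserve(iters);
  for (std::size_t i = 0; i < iters; ++i) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    samples.push_back(
        std::chrono::duration<double, std::milli>(stop - start).count());
  }
  return samples;
}

// Arithmetic mean of the samples; returns 0 for an empty vector.
inline double mean_ms(const std::vector<double>& samples) {
  if (samples.empty()) return 0.0;
  return std::accumulate(samples.begin(), samples.end(), 0.0) /
         static_cast<double>(samples.size());
}
```

To use it, wrap whatever call performs the forward pass in a lambda, pass it to `time_calls_ms` with a handful of warm-up iterations, and compare `mean_ms` of the result with the average printed by caffe time.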

@drnikolaev

@jazzseow could you upload the commands, their outputs and prototxt files used?

@jazzseow
Author

@drnikolaev Thank you for your reply.
I have uploaded the required files to https://drive.google.com/open?id=1cAhF9wBNjBpO9Ykoh80Sv5eBqBZJwDYQ

I have also uploaded the modifications required to run the RefineDet model, under the include/ and src/ folders.

@drnikolaev

A-ha, this looks like a bug: when you run caffe time, the convolution algorithms get optimized like this:

I0719 11:48:43.798629 11106 cudnn_conv_layer.cpp:857] [n0.d0.r0] Conv Algos (F,BD,BF): 'conv3_1' with space 0.08G 63/1 6 1 0 	(avail 9.72G, req 0.08G)	t: 0 0 0.6
I0719 11:48:44.033376 11106 cudnn_conv_layer.cpp:857] [n0.d0.r0] Conv Algos (F,BD,BF): 'conv3_2' with space 0.09G 233/1 6 1 5 	(avail 9.7G, req 0.09G)	t: 0 0 1.11
I0719 11:48:44.282755 11106 cudnn_conv_layer.cpp:857] [n0.d0.r0] Conv Algos (F,BD,BF): 'conv3_3' with space 0.09G 233/1 6 1 5 	(avail 9.68G, req 0.09G)	t: 0 0 1.15

But caffe test doesn't do this.
Could you try commenting out these lines
https://github.com/NVIDIA/caffe/blob/caffe-0.17/src/caffe/layers/cudnn_conv_layer.cpp#L450
https://github.com/NVIDIA/caffe/blob/caffe-0.17/src/caffe/layers/cudnn_conv_layer.cpp#L456
and retry caffe test?

@jazzseow
Author

jazzseow commented Jul 19, 2019

@drnikolaev
So I tried this:

if (!use_modest_workspace()) {
    // if (this->phase_ == TRAIN) {
    // Now taking the rest for running FindEx calls
    // We'll release what's possible in BW pass
    LOG(INFO); // line 453
    AllocateFindExWorkspace();
    // Also used by Test Net but based on shared space taken by Train:
    LOG(INFO); // line 456
    FindExConvAlgo(bottom, top);
    LOG(INFO); // line 458
    // }
    use_algo_seeker_ = false;
}

caffe time works fine.
But when I run ssd_detect, it crashes with a segmentation fault at the FindExConvAlgo(bottom, top) call:

I0719 16:18:22.243350 23616 cudnn_conv_layer.cpp:453] 
I0719 16:18:22.248224 23616 cudnn_conv_layer.cpp:456] 
Segmentation fault (core dumped)
