train PNet is so slow #61

tzhang2014 · 2018-01-31T06:20:25Z

When I run python example/train_P_net.py --gpus 0 on a GTX 1070, training is very slow:
INFO:root:Epoch[0] Batch [200] Speed: 123.25 samples/sec Train-Accuracy=0.697969
INFO:root:Epoch[0] Batch [200] Speed: 123.25 samples/sec Train-LogLoss=0.617246
INFO:root:Epoch[0] Batch [200] Speed: 123.25 samples/sec Train-BBOX_MSE=0.103584
Can you help me? Is something wrong? Where is the mistake? Thanks.

xiaoxiongli · 2018-02-05T08:03:35Z

You need to put your data on an SSD.

tzhang2014 · 2018-02-05T13:32:23Z

@xiaoxiongli Thank you. How long does training take on your PC, and what is its configuration? Thanks.

linsoncvw · 2018-04-24T06:51:47Z

@tzhang2014 I also ran into this problem. How did you fix it?

INFO:root:Epoch[0] Batch [200] Speed: 126.56 samples/sec Train-Accuracy=0.697195
INFO:root:Epoch[0] Batch [200] Speed: 126.56 samples/sec Train-LogLoss=0.614800
INFO:root:Epoch[0] Batch [200] Speed: 126.56 samples/sec Train-BBOX_MSE=0.106309

linsoncvw · 2018-04-24T09:22:15Z

Only the first epoch is slow; the later ones are very fast.

Qidian213 · 2018-04-27T13:10:29Z

You can change MXNet's environment variables to speed up training, for example: export MXNET_GPU_WORKER_NTHREADS=4 (default = 2) and export MXNET_GPU_COPY_NTHREADS=4 (default = 1). After I did this, everything got better.

For example, on an i7-7700 with a GTX 1060:
INFO:root:Epoch[0] Batch [3780] Speed: 8343.78 samples/sec Accuracy=0.898810 LogLoss=0.270442 BBOX_MSE=0.015827
INFO:root:Epoch[0] Batch [3800] Speed: 9112.26 samples/sec Accuracy=0.891901 LogLoss=0.282063 BBOX_MSE=0.015802
INFO:root:Epoch[0] Batch [3820] Speed: 10172.07 samples/sec Accuracy=0.883745 LogLoss=0.303172 BBOX_MSE=0.015691
INFO:root:Epoch[0] Batch [3840] Speed: 10388.03 samples/sec Accuracy=0.878459 LogLoss=0.288958 BBOX_MSE=0.015310
INFO:root:Epoch[0] Batch [3860] Speed: 9720.13 samples/sec Accuracy=0.885983 LogLoss=0.310603 BBOX_MSE=0.015680
INFO:root:Epoch[0] Batch [3880] Speed: 9980.33 samples/sec Accuracy=0.879565 LogLoss=0.300225 BBOX_MSE=0.016198
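
The same two variables can also be set from Python instead of the shell. This is only a minimal sketch: the helper name apply_mxnet_thread_settings is hypothetical (it is not part of MXNet or this repo), the values are the ones suggested above, and it assumes the variables are set before mxnet is imported so the engine sees them at startup.

```python
import os

# Values suggested in this thread; tune them for your own machine.
MXNET_THREAD_SETTINGS = {
    "MXNET_GPU_WORKER_NTHREADS": "4",
    "MXNET_GPU_COPY_NTHREADS": "4",
}


def apply_mxnet_thread_settings(settings=MXNET_THREAD_SETTINGS, override=False):
    """Hypothetical helper: export the settings into this process's environment.

    Values already present in the environment are kept unless override=True.
    Returns the values that are in effect afterwards.
    """
    for name, value in settings.items():
        if override or name not in os.environ:
            os.environ[name] = value
    return {name: os.environ[name] for name in settings}


effective = apply_mxnet_thread_settings()
print(effective)

# Import MXNet only after the variables are in place (assumption: they are
# read when the engine starts), e.g.:
# import mxnet as mx
```

Keeping values that are already exported means a shell-level export still takes precedence over the defaults in the script.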

tzhang2014 · 2018-06-06T06:53:04Z

@linsoncvw After the first epoch, the speed is very fast. I don't understand why.

geoffzhang · 2018-06-14T02:45:23Z

Did you get the error "Cannot find argument 'out_grad'" when using train_P_net.py?

EmiPark · 2018-07-03T06:49:20Z

@geoffzhang I hit the same problem. Did you fix it?

zuoqing1988 · 2018-10-10T06:02:07Z

@geoffzhang @EmiPark Delete every 'out_grad=True' in core\symbol.py.
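
If editing by hand is tedious, the deletion can be scripted. This is a rough sketch, not something from the repo: strip_out_grad and patch_file are hypothetical helpers, and it assumes out_grad=True only ever appears as a keyword argument inside a call. Back up the file first and review the diff afterwards.

```python
import re

# Matches ", out_grad=True" when the argument follows another argument.
_AFTER_COMMA = re.compile(r",\s*out_grad\s*=\s*True\b")
# Matches "out_grad=True," (or a lone "out_grad=True") when it comes first.
_LEADING = re.compile(r"\bout_grad\s*=\s*True\b\s*,?\s*")


def strip_out_grad(source):
    """Return source text with every out_grad=True keyword argument removed."""
    source = _AFTER_COMMA.sub("", source)
    return _LEADING.sub("", source)


def patch_file(path):
    """Rewrite the file in place; return True if anything changed."""
    with open(path, encoding="utf-8") as f:
        original = f.read()
    patched = strip_out_grad(original)
    if patched != original:
        with open(path, "w", encoding="utf-8") as f:
            f.write(patched)
    return patched != original


# Illustrative call only; the real lines in symbol.py may look different.
print(strip_out_grad('loss(data=x, out_grad=True, name="cls_prob")'))
```

To apply it, call patch_file("core/symbol.py") from the repository root.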

cuiyong127 · 2019-09-05T03:45:00Z

@geoffzhang @EmiPark delete all 'out_grad=True' in core\symbol.py
Does deleting "out_grad = True" have any impact on training?
