Possible performance drop in -dev branch #1006

Closed
ozancaglayan opened this issue Aug 29, 2014 · 4 comments
Comments

@ozancaglayan (Contributor)

Hi,

I was testing several BLAS implementations to compare their performance. I'm using the MNIST dataset as described in its tutorial, but with max_iter set to 1000.

I just discovered that training (using train_lenet.sh) is significantly slower than on the master branch. I tested on two different machines; the results below are from an Intel Xeon W3530 (Nehalem) CPU, training in CPU mode. Is this an expected slow-down caused by some implementation change?
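
For reference, the steps were roughly the standard MNIST tutorial ones (a sketch; paths assume a stock Caffe checkout):

    # prepare the MNIST data as in the tutorial
    ./data/mnist/get_mnist.sh
    ./examples/mnist/create_mnist.sh
    # edit examples/mnist/lenet_solver.prototxt: max_iter: 1000, solver_mode: CPU
    # then time the training run on each branch
    time ./examples/mnist/train_lenet.sh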

atlas-sse3 - fedora 19 x86_64 (dev branch)
-------------------------------------------------------
I0828 17:59:20.025907 20321 solver.cpp:302]     Test net output #0: accuracy = 0.9788
I0828 17:59:20.025959 20321 solver.cpp:302]     Test net output #1: loss = 0.0642497 (* 1 = 0.0642497 loss)
I0828 17:59:20.025979 20321 solver.cpp:237] Optimization Done.
I0828 17:59:20.025987 20321 caffe.cpp:113] Optimization Done.

real    6m11.887s
user    6m31.207s
sys     0m1.324s

atlas-sse3 - fedora 19 x86_64 (master branch)
-----------------------------------------------------------
I0828 18:06:28.892992 11738 solver.cpp:270] Test score #0: 0.9776
I0828 18:06:28.893049 11738 solver.cpp:270] Test score #1: 0.0670089
I0828 18:06:28.893060 11738 solver.cpp:218] Optimization Done.
I0828 18:06:28.893131 11738 caffe.cpp:113] Optimization Done.

real    4m6.125s
user    4m5.772s
sys     0m0.140s
@kloudkl (Contributor) commented Aug 31, 2014

Try the fixes in #1008, time the other examples, or benchmark the network with the "caffe time" tool to find the layer-wise speed gaps.
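
For example, something along these lines should print per-layer forward/backward timings on the LeNet training net (a sketch; the model path assumes the stock MNIST example):

    # run from the Caffe root; compare the per-layer lines on master vs. dev
    ./build/tools/caffe time \
        --model=examples/mnist/lenet_train_test.prototxt \
        --iterations=50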

@shelhamer (Member)

I can't reproduce this; in fact, my dev build is faster than master according to caffe time. Are you sure there weren't interfering tasks increasing CPU or disk load during the dev test?

@shelhamer (Member)

It could be the switch from inline data transformation to the DataTransformer in dev, since you are running purely on the CPU. You could try your evaluation at commit aee9cd3, before it was merged in #954, to test whether it makes a difference.
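
For example, roughly (a sketch; assumes the Makefile build and that the MNIST data is already prepared):

    # check out dev as it was just before the DataTransformer merge (#954)
    git checkout aee9cd3
    make clean && make -j8
    # re-run the same CPU benchmark as before
    time ./examples/mnist/train_lenet.sh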

@longjon (Contributor) commented May 8, 2015

Closing as this seems to have expired; we don't know of any current relevant performance issue.

longjon closed this as completed May 8, 2015