Is your setup of Caffe-Greentea optimal? #70
Comments
@NH89 However, to be really up to speed there need to be vendor- and hardware-specific convolution libraries such as cuDNN. AMD has an OpenCL branch (https://github.com/amd/OpenCL-caffe) where they instead unroll the batch into one large matrix-matrix multiplication. This is very memory-inefficient compared to cuDNN, but almost as fast. In the same technical report you can read that interleaved, pixelwise classification data (which produces large matrix-matrix multiplications, and therefore higher efficiency without batching) is comparably fast to CUDA.
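For readers unfamiliar with the scheme, here is a minimal C++ sketch of the "unroll to one large matrix-matrix multiplication" idea described above. It is an illustration only, not the actual AMD OpenCL-caffe or Greentea code; the `im2col` and `gemm` helpers are hypothetical stand-ins for the library routines, and the comments note where the memory overhead comes from.

```cpp
// Illustrative sketch of im2col followed by a single GEMM (plain C++,
// not the actual AMD/Greentea kernels; helper names are hypothetical).
#include <vector>
#include <cstddef>

// Expand an input image (channels x height x width, stride 1, no padding)
// into a column matrix of shape
//   (channels * ksize * ksize) x (out_h * out_w).
// Memory use grows by roughly a factor of ksize * ksize, which is the
// inefficiency compared to cuDNN mentioned above.
std::vector<float> im2col(const std::vector<float>& img,
                          int channels, int height, int width, int ksize) {
  const int out_h = height - ksize + 1;
  const int out_w = width - ksize + 1;
  std::vector<float> cols((size_t)channels * ksize * ksize * out_h * out_w);
  for (int c = 0; c < channels; ++c)
    for (int ky = 0; ky < ksize; ++ky)
      for (int kx = 0; kx < ksize; ++kx) {
        const int row = (c * ksize + ky) * ksize + kx;
        for (int y = 0; y < out_h; ++y)
          for (int x = 0; x < out_w; ++x)
            cols[(size_t)row * out_h * out_w + y * out_w + x] =
                img[(size_t)c * height * width + (y + ky) * width + (x + kx)];
      }
  return cols;
}

// Naive GEMM: C(MxN) = A(MxK) * B(KxN). In practice this single call is
// what clBLAS / cuBLAS / ViennaCL accelerate, so convolution speed largely
// follows the speed of the vendor's GEMM.
void gemm(const float* A, const float* B, float* C, int M, int N, int K) {
  for (int m = 0; m < M; ++m)
    for (int n = 0; n < N; ++n) {
      float acc = 0.f;
      for (int k = 0; k < K; ++k) acc += A[m * K + k] * B[k * N + n];
      C[m * N + n] = acc;
    }
}

// Convolution then becomes one matrix product:
//   output(num_filters x out_h*out_w) =
//     weights(num_filters x channels*ksize*ksize) * cols.
// Unrolling a whole batch stacks these column matrices side by side,
// giving the "one large matrix-matrix multiplication" described above.
```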
@naibaf7 Thanks, you saved me from making an expensive error :-) Thank you also for creating Greentea.
No problem. The OpenCL approaches will probably catch up with the CUDA solutions during Q2/Q3 next year, as major development is under way at both AMD and Intel. For my projects in biomedical image segmentation, though, the OpenCL solution is already competitive in speed.
I see you are getting markedly slow results with Caffe-Greentea. Which backends are you using, and do you know if they are the best available?
In Fabian Tschopp's ( @naibaf7 ) tech report, http://arxiv.org/pdf/1509.03371.pdf, Table 6.10 shows a 20x variation in performance depending on which manufacturer's libraries are used.
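As a first diagnostic step, something like the small, self-contained sketch below (standard OpenCL C API, linked with -lOpenCL) can list the OpenCL platforms and devices visible on the machine. It does not reveal which BLAS backend Caffe-Greentea was compiled against, but it is a quick sanity check that the vendor driver you expect (AMD, Intel, NVIDIA) is actually the one being picked up.

```cpp
// Enumerate OpenCL platforms and their devices (diagnostic sketch only;
// it does not inspect the Caffe-Greentea build configuration).
#include <CL/cl.h>
#include <cstdio>
#include <vector>

int main() {
  cl_uint num_platforms = 0;
  clGetPlatformIDs(0, nullptr, &num_platforms);
  std::vector<cl_platform_id> platforms(num_platforms);
  clGetPlatformIDs(num_platforms, platforms.data(), nullptr);

  for (cl_uint p = 0; p < num_platforms; ++p) {
    char name[256] = {0};
    clGetPlatformInfo(platforms[p], CL_PLATFORM_NAME, sizeof(name), name, nullptr);
    std::printf("Platform %u: %s\n", p, name);

    cl_uint num_devices = 0;
    clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, 0, nullptr, &num_devices);
    std::vector<cl_device_id> devices(num_devices);
    clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, num_devices,
                   devices.data(), nullptr);

    for (cl_uint d = 0; d < num_devices; ++d) {
      char dev_name[256] = {0};
      clGetDeviceInfo(devices[d], CL_DEVICE_NAME, sizeof(dev_name), dev_name, nullptr);
      std::printf("  Device %u: %s\n", d, dev_name);
    }
  }
  return 0;
}
```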