Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speeding up ADAM (and maybe other solvers too) #3519

Merged
merged 1 commit into from
Jan 6, 2016

Conversation

philkr
Copy link
Contributor

@philkr philkr commented Jan 5, 2016

It has bothered me for a while that ADAM is quite a bit slower than SGD. This PR speeds up the GPU implementation of ADAM (and if desired I can do the same for other solvers too).

I created a simple python benchmark for solvers, using random input data and a couple of large inner products: solver_bench.zip. Before this PR:

Benchmarking forward              ...      11.96 ms / it
Benchmarking backward             ...       0.32 ms / it
Benchmarking forward-backward     ...      15.98 ms / it
Benchmarking SGD-solver           ...      25.17 ms / it
Benchmarking Adam-solver          ...      40.01 ms / it
Benchmarking AdaDelta-solver      ...      50.09 ms / it
Benchmarking AdaGrad-solver       ...      35.00 ms / it
Benchmarking Nesterov-solver      ...      30.95 ms / it
Benchmarking RMSProp-solver       ...      36.77 ms / it

After this PR:

...
Benchmarking SGD-solver           ...      21.98 ms / it
Benchmarking Adam-solver          ...      23.97 ms / it
Benchmarking AdaDelta-solver      ...      24.00 ms / it
Benchmarking AdaGrad-solver       ...      21.98 ms / it
Benchmarking Nesterov-solver      ...      21.98 ms / it
Benchmarking RMSProp-solver       ...      21.99 ms / it

@philkr philkr added the JD label Jan 5, 2016
@jeffdonahue
Copy link
Contributor

Thanks @philkr, this is a nice simple speedup for GPU training, and passes the TestGradientBasedSolver checks indicating all the solvers should work as they did before besides the performance improvement.

Looks like Travis is failing due to some minor lint issues. Also, could you CamelCase the new GPU kernel names and remove the _device suffixes to match existing ones (e.g. adadelta_update_device -> AdaDeltaUpdate)? I can merge this after those fixes.

@philkr
Copy link
Contributor Author

philkr commented Jan 6, 2016

Lint and CamelCase should be fixed now, and also squashed into a single commit.

@philkr
Copy link
Contributor Author

philkr commented Jan 6, 2016

ok, SGD is now slightly faster too (about 10%). Still don't quite understand why this worked, but I guess it has to do with minimizing the number of kernel calls (2 before, 1 now). Ran all tests again and passed.

@jeffdonahue
Copy link
Contributor

Sweet, thanks again!

jeffdonahue added a commit that referenced this pull request Jan 6, 2016
Speeding up ADAM (and maybe other solvers too)
@jeffdonahue jeffdonahue merged commit b23e9b1 into BVLC:master Jan 6, 2016
@philkr philkr deleted the faster_solver branch January 15, 2016 18:26
@seanbell seanbell mentioned this pull request Mar 23, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants