Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CuDNNConvolutionLayer accumulate gradients #3254

Merged
merged 1 commit into from
Oct 28, 2015

Conversation

ronghanghu
Copy link
Member

After #1977, all layers should accumulate gradients for parameters. These two lines introduced in #3160 seems to be a bug.

@shelhamer @slayton58 can you take a look?

@ronghanghu ronghanghu added the bug label Oct 28, 2015
@longjon
Copy link
Contributor

longjon commented Oct 28, 2015

Good catch @ronghanghu -- this definitely looks wrong. I'll leave open for the moment since I haven't read this code before, but we should take care of this immediately.

We should really have better testing for this situation... if this is bugged as it appears, this breaks weight sharing and gradient accumulation.

@slayton58
Copy link
Contributor

Yep, this does look like a bug -- when I wrote the initial support on our fork and merged it across, it looks like this got dragged in (it's still present in our code)

@shelhamer
Copy link
Member

Good catch @ronghanghu and sorry for missing this is my review. While there
are cuDNN layer tests and weight sharing + gradient accumulation tests
there are not weight sharing + gradient accumulation tests for cuDNN.
On Wed, Oct 28, 2015 at 05:25 slayton58 notifications@github.com wrote:

Yep, this does look like a bug -- when I wrote the initial support on our
fork and merged it across, it looks like this got dragged in (it's still
present in our code)


Reply to this email directly or view it on GitHub
#3254 (comment).

@ronghanghu
Copy link
Member Author

We need to adapt the gradient checker to gradient accumulation, in order to prevent future mistakes like this one and #2532.

ronghanghu added a commit that referenced this pull request Oct 28, 2015
CuDNNConvolutionLayer accumulate gradients
@ronghanghu ronghanghu merged commit ca4e342 into BVLC:master Oct 28, 2015
@ronghanghu ronghanghu deleted the cudnn3_accum_grad branch October 28, 2015 17:19
@seanbell
Copy link

👍 I agree that the gradient checker needs to be updated. A simple fixed was proposed earlier by @tnarihi : tnarihi@7d45526

See also #3036.

antran89 added a commit to antran89/my-very-deep-caffe that referenced this pull request Oct 29, 2016
antran89 added a commit to antran89/my-very-deep-caffe that referenced this pull request Nov 9, 2016
antran89 added a commit to antran89/my-very-deep-caffe that referenced this pull request Apr 20, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants