Cannot explicitly "name" BatchNorm parameters for sharing (Siamese network) #5171
Batch norm statistics are not learnable parameters subject to solver updates, so they must be shielded from the solver. The `BatchNorm` layer now masks its statistics by itself, zeroing the parameter learning rates instead of relying on the layer definition. N.b. declaring `param`s for batch norm layers is no longer allowed.
Old batch norm layer definitions, including their `param` messages, are now stripped automatically. The batch norm layer used to require manually masking its state from the solver by setting a `param { lr_mult: 0 }` message for each of its statistics; this is now handled automatically by the layer.
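For reference, a sketch of the kind of pre-upgrade definition this refers to (the layer and blob names below are illustrative, not taken from this issue):

```
layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  # BatchNorm keeps three internal blobs (mean, variance, moving-average factor);
  # each one previously had to be masked from the solver by hand:
  param { lr_mult: 0 }
  param { lr_mult: 0 }
  param { lr_mult: 0 }
}
```

With the change above, these `param` messages are stripped on upgrade and the masking happens inside the layer.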
I have the same question as #5120.
@D-X-Y I made an attempt at fixing this issue: PR #5184. Hope they'll approve it.
Thanks for raising this issue. I was so focused on protecting the batch statistics from accidental gradients that I made sharing impossible across layers that could sensibly share statistics, as in Siamese nets. I'll double-check #5184 and come up with an auto-upgrade rule that handles both the `lr_mult` and the `name`.
Fixed by #5184 |
TL;DR
It seems like commit a8ec123c00723df0d0ad897e1eea32a29201c81b broke something here: one can no longer pass `param` messages to a "BatchNorm" layer (they are ignored).
I suppose auto-upgrade should not remove `param` completely from "BatchNorm" layers, but rather make sure `lr_mult` and `decay_mult` are zero.
Issue summary
I am trying to share BatchNorm internal parameters in a Siamese network: That is, having two BatchNorm layers sharing the same internal parameters.
I explicitly `name` the parameters using: `param { lr_mult: 0 name: "bn_a" }`
However, when building the net, the whole `param` section is missing, and Caffe completely ignores the `name` of the parameters. No sharing is actually happening.
Steps to reproduce
Here is the relevant part of the prototxt:
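(The original prototxt is not reproduced in this extraction. Below is a hypothetical sketch of the kind of definition described above; the layer and parameter names `conv1_a`/`conv1_b`, `bn1_a`/`bn1_b`, and `bn_a0`–`bn_a2` are illustrative, not copied from the report.)

```
# Two BatchNorm layers, one per Siamese branch, intended to share their
# internal statistics by giving the corresponding blobs the same names.
layer {
  name: "bn1_a"
  type: "BatchNorm"
  bottom: "conv1_a"
  top: "conv1_a"
  # name each of the three internal blobs so the other branch can share them
  param { lr_mult: 0 name: "bn_a0" }
  param { lr_mult: 0 name: "bn_a1" }
  param { lr_mult: 0 name: "bn_a2" }
}
layer {
  name: "bn1_b"
  type: "BatchNorm"
  bottom: "conv1_b"
  top: "conv1_b"
  # same param names, so the underlying blobs should be shared --
  # but after commit a8ec123 these param messages are stripped on upgrade
  param { lr_mult: 0 name: "bn_a0" }
  param { lr_mult: 0 name: "bn_a1" }
  param { lr_mult: 0 name: "bn_a2" }
}
```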
The relevant part of the log:
Consequently, when building the layers the log reports:
There are no `Sharing parameters 'bn_m' owned by layer 'bn1_a', param index 0` log entries as I would have expected.
It seems like BatchNorm ignores `param` entries completely! Am I missing something?
How can I share the internal parameters of BatchNorm?
Your system configuration
Operating system: Ubuntu 14.04
Compiler: gcc
CUDA version (if applicable):
CUDNN version (if applicable):
BLAS: MKL
Python or MATLAB version (for pycaffe and matcaffe respectively): Python 2.7
Caffe: updated to commit 365ac88