Fix behaviour of unfrozen BatchNormalization layer (resolves #46) #47

Merged (1 commit) on Nov 28, 2018

Conversation

@Callidior Callidior (Contributor) commented Nov 6, 2018

Previously, if `BatchNormalization` was initialized with `BatchNormalization(freeze=False)`, its behaviour was not equivalent to the standard `BatchNormalization` layer, as one would expect. Instead, it was always forced to be in training mode, providing wrong validation results.

This PR does not change the behaviour for `freeze=True`, but makes the layer equivalent to the standard `BatchNormalization` layer from Keras for `freeze=False`.

@Callidior Callidior changed the title from "Fix behaviour of unfrozen BatchNormalization layer" to "Fix behaviour of unfrozen BatchNormalization layer (resolves #46)" on Nov 6, 2018
@hgaiser hgaiser (Contributor) commented Nov 6, 2018

Doesn't this only change the behaviour if `freeze=True`?

Also, what accuracy are you getting now?

@Callidior Callidior (Contributor, Author)

No, the behaviour for `freeze=True` is not changed. Previously, we called the method of the superclass with `training=(not self.freeze)`, which would evaluate to `training=False`. Now, if `self.freeze` is `True`, we set `training=False`, as before.

If `self.freeze` is `False`, however, we now have `training=None` (the default) instead of `training=True`.
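
For reference, a minimal sketch of the idea (not the exact code in this PR; the class layout and names are illustrative):

```python
import keras


class BatchNormalization(keras.layers.BatchNormalization):
    """BatchNormalization layer that can optionally be frozen in inference mode."""

    def __init__(self, freeze=False, *args, **kwargs):
        super(BatchNormalization, self).__init__(*args, **kwargs)
        self.freeze = freeze

    def call(self, inputs, training=None, **kwargs):
        if self.freeze:
            # Frozen: always use the stored moving statistics
            # (this behaviour is unchanged by the PR).
            return super(BatchNormalization, self).call(inputs, training=False, **kwargs)

        # Unfrozen: defer to Keras' default handling of the learning phase.
        # Before this PR, training=True was passed here, forcing training
        # mode even during validation and inference.
        return super(BatchNormalization, self).call(inputs, training=training, **kwargs)
```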

It's still training, but I now already have 12% validation accuracy after the first epoch and 40% after 4 epochs, which is already higher than anything I got without the modifications made in this PR.

@Callidior Callidior (Contributor, Author) commented Nov 7, 2018

By the way, I would question the example in the README. The model is initialized there with `freeze_bn=True` (the default), which fixes the `BatchNormalization` layers in test mode using their initialization parameters. This should be equivalent to using no batch normalization at all.

I also tried this first for my ImageNet training, since the README does so, but it didn't work.
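
For training from scratch, the model would presumably need to be built with unfrozen batch normalization, along the lines of the README example. A sketch, assuming the argument names used in this repository (the exact signature may differ):

```python
import keras
import keras_resnet.models

shape, classes = (224, 224, 3), 1000

x = keras.layers.Input(shape)

# freeze_bn=False lets the BatchNormalization layers update their moving
# statistics during training instead of keeping the initialization values.
model = keras_resnet.models.ResNet50(x, classes=classes, freeze_bn=False)

model.compile("adam", "categorical_crossentropy", ["accuracy"])
```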

@Callidior Callidior (Contributor, Author)

I now finally obtained 68% validation accuracy, which is much closer to what I got with the bundled ResNet-50 than before.

@0x00b1 0x00b1 (Contributor) commented Nov 28, 2018

Awesome. Thanks, @Callidior.

@0x00b1 0x00b1 merged commit 7e2e67b into broadinstitute:master Nov 28, 2018