I think MVNLayer with across_channels = true is broken #1938
Comments
This is because ParseFromString expects the binary wire format of a serialized message, not the text format. google::protobuf::TextFormat::ParseFromString is the call that parses prototxt-style text.
Thanks @seanbell. As I read your comment I realized I left a bunch of MVNLayer tests still using the wrong ParseFromString. Will go fix.
It turns out one of the tests (MVNLayerTest::TestGradientMeanOnly) is affected too. If I change

    layer_param.ParseFromString("mvn_param{normalize_variance: false}");

to this

    CHECK(google::protobuf::TextFormat::ParseFromString("mvn_param{normalize_variance: false}", &layer_param));

and add the include

    #include "google/protobuf/text_format.h"

I see the test fail. MVNLayer::Backward just copies the top diff to the bottom diff in the mean-only case. The numerical gradient checker disagrees that this is correct: when it adds and subtracts stepsize to an input in order to approximate the derivative as delta output over delta input, the mean also changes, so it ends up with somewhat different gradients than the layer, which assumes the mean is a constant in its gradient calculation. So to get the test to pass, it seems we'd need a way to leave mean_ unchanged across the gradient checker's two forward calls. Hmmm. I tried fiddling with stepsize and threshold to no avail. If the right way forward is obvious to anyone, please share.
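For reference, the disagreement can be made precise. In the mean-only case the layer computes y_i = x_i - mu, where mu is the mean over all N inputs, so perturbing any single input also moves the mean, which the plain copy ignores. A short derivation (my own sketch, not from the PR; L is the loss and delta_ij the Kronecker delta):

    y_i = x_i - \mu, \qquad \mu = \frac{1}{N} \sum_{k=1}^{N} x_k
    \frac{\partial y_i}{\partial x_j} = \delta_{ij} - \frac{1}{N}
    \frac{\partial L}{\partial x_j}
        = \sum_{i=1}^{N} \frac{\partial L}{\partial y_i}
          \left( \delta_{ij} - \frac{1}{N} \right)
        = \frac{\partial L}{\partial y_j}
          - \frac{1}{N} \sum_{i=1}^{N} \frac{\partial L}{\partial y_i}

That is, the exact bottom diff is the top diff minus its own mean. The correction term shrinks as 1/N, which is why the plain copy looks nearly right on large blobs but trips the numerical checker on the small blobs used in the tests.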
OK, #1979 has a resolution for the issues above. As mentioned in the previous note, fixing the ParseFromString issue revealed that MVNLayerTest::TestGradientMeanOnly was failing. My "fix" was to increase the tolerance on the gradient checker in that case, which is a hack. I think the failing test actually reveals that Backward is not exactly correct. It just copies the top diff to the bottom diff:

    caffe_copy(temp_.count(), top_diff, bottom_diff);

which is premised on the mean being a constant and thus not contributing to the derivative. That isn't exactly true: as an input changes, it has a small effect on the mean. For most real cases, where the blobs are larger, that contribution is small and the copy is a good enough approximation. I think. Maybe. I could work out the exact correct derivative (go learn some matrix calculus), but: 1) I'm not sure the (small) accuracy improvement is worth the extra runtime overhead, and 2) all signs so far are that the PR is going to be ignored anyway, so I'm not particularly motivated to invest the time. If I were to do it, one of you bright young people would need to logic-check my thinking above. My foggy old brain is a quarter century removed from its linear algebra schooling.
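For concreteness, here is a hypothetical scalar sketch of what an exact mean-only backward could look like, following the derivation above. The function name is invented for illustration, and plain loops stand in for the BLAS calls Caffe actually uses:

    #include <cstddef>

    // Sketch: exact backward for mean-only MVN, where y_i = x_i - mean(x).
    // Per the Jacobian above, bottom_diff_j = top_diff_j - mean(top_diff),
    // rather than a plain copy of top_diff.
    void mvn_mean_only_backward(const float* top_diff, float* bottom_diff,
                                std::size_t n) {
      // Mean of the incoming gradient.
      float sum = 0.0f;
      for (std::size_t i = 0; i < n; ++i) sum += top_diff[i];
      const float mean_diff = sum / static_cast<float>(n);
      // Subtract it from each element instead of copying verbatim.
      for (std::size_t i = 0; i < n; ++i) {
        bottom_diff[i] = top_diff[i] - mean_diff;
      }
    }

The correction term is just the mean of the incoming gradient, so the extra cost over the plain copy is one reduction and one subtraction per element.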
Closing this, since there is a PR that fixes this issue. |
Fix the MVNLayer tests so they actually test what they claim. MVNLayer fixes: sum_multiplier_ sized correctly; backward gradient calculation. Gradient calculation per analysis of seanbell, found here: BVLC/caffe#1938. Fixes according to review comments.
I think MVNLayer is broken when across_channels: true is specified in MVNParameter, even though the unit test is passing.

In MVNLayerTest::TestForwardAcrossChannels, we create the layer parameter with:

    layer_param.ParseFromString("mvn_param{across_channels: true}");

ParseFromString is returning false, and layer_param.mvn_param().across_channels() is false after the call, even though we specified true in the prototxt. So the test is running with the default value of across_channels, which is false.

If instead in MVNLayerTest::TestForwardAcrossChannels we call:

    CHECK(google::protobuf::TextFormat::ParseFromString("mvn_param{across_channels: true}", &layer_param));

it succeeds, and layer_param.mvn_param().across_channels() is true. And then you will see that MVNLayerTest::TestForwardAcrossChannels fails, as Forward_* produces NaN values.

I have no idea why layer_param.ParseFromString fails whereas google::protobuf::TextFormat::ParseFromString succeeds. I don't have the protobuf source installed and would rather debug Caffe than Protocol Buffers. If anyone knows, please educate me.

It at first seems surprising that TestForwardAcrossChannels succeeds even though across_channels is false, when you'd think it would need to be true in order to pass. I think it's because the values in the bottom blob are drawn from a normal distribution using GaussianFiller, and the only difference between across_channels=true and false is whether the mean and variance are computed over 20 values or 60. So you still get mean zero and variance 1.0 in the top blob, which is what the test checks for, regardless of whether you compute them per channel or across channels.

I'm working on this in connection with #1895. I will say that trying to understand the vectorization approach using BLAS calls to perform these ops is making my head hurt : /.
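For anyone hitting this later: the cause is that Message::ParseFromString parses the protobuf binary wire format, so handing it prototxt text fails and leaves every field at its default, while google::protobuf::TextFormat::ParseFromString parses the human-readable text format. A minimal self-contained sketch, assuming Caffe's generated caffe.pb.h is on the include path:

    #include <cassert>
    #include <string>

    #include "google/protobuf/text_format.h"
    #include "caffe/proto/caffe.pb.h"  // generated LayerParameter definition

    int main() {
      caffe::LayerParameter layer_param;

      // Binary-format parse of text input: returns false and the field stays
      // at its default -- the silent failure described in this issue.
      bool ok = layer_param.ParseFromString("mvn_param{across_channels: true}");
      assert(!ok);
      assert(!layer_param.mvn_param().across_channels());

      // Text-format parse: succeeds, and the field is actually set.
      ok = google::protobuf::TextFormat::ParseFromString(
          "mvn_param { across_channels: true }", &layer_param);
      assert(ok);
      assert(layer_param.mvn_param().across_channels());
      return 0;
    }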