Loss layer for Siamese neural network #775

Closed
arntanguy opened this issue Jul 24, 2014 · 3 comments
@arntanguy
Contributor

Hi,
First, I should say that I have very little experience with neural networks and minimisation algorithms. I implemented a fully connected perceptron last year, but that's as far as my experience goes.

I've created a Siamese Convolutional Neural Network using the recently introduced weight sharing feature to try and recognise similarities between pairs of images.

I created my own LMDB dataset based on RGB(D) images, and an appropriate data layer for feeding the labelled pairs of images to the network. This works perfectly, and with a couple of small improvements it could probably be sent as a pull request.

Now, I am trying to write a loss function for this, based on [1].
The idea is to minimise the energy, defined as the L1 norm of the difference between the feature descriptors computed from two input images X1 and X2. To do so, the loss function from the paper has a contrastive term that makes sure that the energy is low when the inputs are similar, and high when they aren't.

The loss function is defined for a pair of input images X1, X2 with label Y (1 if similar, 0 otherwise) as

L(W, Y, X1, X2) = Y * Lg(Ew(X1, X2)) + (1 - Y) * Li(Ew(X1, X2))

Lg (for a genuine pair of images) and Li (for an impostor pair) are designed so that the minimisation of L decreases the energy of genuine pairs and increases the energy of impostors.

More specifically, in the paper they used

L(W, Y, X1, X2) = 2/Q * Y * (Ew)^2 + 2*Q * (1-Y) * exp(-2.77/Q * Ew)

Where Q is a constant set to the upper bound of Ew.
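
To make the formula concrete, here is a minimal NumPy sketch of the forward computation; the descriptor values, the value of Q and the variable names are illustrative assumptions, not taken from the linked code:

```python
import numpy as np

# Illustrative descriptors from the two weight-shared branches for one pair;
# the values, shapes and Q are assumptions, not taken from the actual network.
f1 = np.array([0.2, -1.3, 0.7])   # descriptor of X1
f2 = np.array([0.1, -0.9, 1.1])   # descriptor of X2
y = 1.0                           # label: 1 = genuine pair, 0 = impostor pair
Q = 10.0                          # assumed upper bound of Ew

Ew = np.sum(np.abs(f1 - f2))      # energy: L1 norm of the descriptor difference

# Loss from [1], using the label convention above (Y = 1 for genuine pairs)
loss = (2.0 / Q) * y * Ew**2 + 2.0 * Q * (1.0 - y) * np.exp(-2.77 / Q * Ew)
print(Ew, loss)
```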

I implemented the forward propagation of this loss function without any problems, but I am struggling to figure out what to do for the backpropagation. I am very confused as to what needs to be computed there.

Can someone give me some pointers on how to properly implement that?

  • What do I need to compute for the backward propagation? (see the gradient sketch after this list)
  • Also, how can I know the range in which the network's feature descriptors will lie (needed to compute the constant Q)?
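
For what it's worth, here is a hand-derived sketch of the gradients a backward pass would need under the formula above; it is a plain NumPy illustration using the same assumed names as the forward sketch, not a verified Caffe Backward() implementation:

```python
import numpy as np

def siamese_loss_backward(f1, f2, y, Q):
    """Gradients of L = 2/Q*Y*Ew^2 + 2*Q*(1-Y)*exp(-2.77/Q*Ew)
    with respect to the two descriptors, where Ew = ||f1 - f2||_1."""
    diff = f1 - f2
    Ew = np.sum(np.abs(diff))
    # dL/dEw: differentiate each term of the loss with respect to the energy
    dL_dEw = (4.0 / Q) * y * Ew - 2.0 * 2.77 * (1.0 - y) * np.exp(-2.77 / Q * Ew)
    # Chain through the L1 norm: dEw/df1 = sign(f1 - f2), dEw/df2 = -sign(f1 - f2)
    # (a subgradient is used where the difference is exactly zero)
    dL_df1 = dL_dEw * np.sign(diff)
    dL_df2 = -dL_df1
    return dL_df1, dL_df2
```

In a Caffe loss layer, values like these would be what the backward pass writes into the bottom blobs' diffs, scaled by the loss weight passed down in the top diff.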

I am open to other suggestions for the loss function, so your answer doesn't necessarily have to be based on the loss function from [1].

My current code is on my github: geenux/tum_siamese@afaeff85f793f0bbf9f73909d55c936ecb95a23e

Thanks a lot for your help!

[1] Chopra, Sumit, Raia Hadsell, and Yann LeCun. "Learning a similarity metric discriminatively, with application to face verification." Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vol. 1, 2005. (http://yann.lecun.com/exdb/publis/pdf/chopra-05.pdf)

@shelhamer
Member

See #959 for a contrastive loss function for Siamese networks.
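
(For readers arriving later: the layer added in #959 implements the margin-based contrastive loss of Hadsell, Chopra & LeCun, 2006. A minimal NumPy sketch of that formulation, assuming y = 1 for similar pairs and an arbitrary margin, is below; it is an illustration, not the Caffe code itself.)

```python
import numpy as np

def contrastive_loss(f1, f2, y, margin=1.0):
    """Margin-based contrastive loss (Hadsell et al., 2006): similar pairs
    (y = 1) are pulled together; dissimilar pairs (y = 0) are pushed apart
    until their distance exceeds the margin."""
    d = np.linalg.norm(f1 - f2)   # Euclidean distance between the two descriptors
    return y * d**2 + (1.0 - y) * max(margin - d, 0.0)**2
```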

@aimaaonline

@shelhamer, can you please help: if I change my loss function to Mean Squared Error for regression, how can I verify the numerical and analytical gradients? I cannot get the gradients calculated by the Siamese network at the last layer.
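
A generic way to compare the two is a centred finite-difference check on the loss; the sketch below uses a stand-alone MSE example in NumPy, where the loss, shapes and names are assumptions for illustration, not the actual network:

```python
import numpy as np

def mse_loss(pred, target):
    # Mean squared error used as a regression loss (illustrative)
    return 0.5 * np.mean((pred - target) ** 2)

def analytical_grad(pred, target):
    # Hand-derived gradient of the MSE above with respect to pred
    return (pred - target) / pred.size

def numerical_grad(pred, target, eps=1e-5):
    # Centred finite differences, one coordinate at a time
    grad = np.zeros_like(pred)
    for i in range(pred.size):
        bump = np.zeros_like(pred)
        bump[i] = eps
        grad[i] = (mse_loss(pred + bump, target) - mse_loss(pred - bump, target)) / (2 * eps)
    return grad

pred, target = np.random.randn(4), np.random.randn(4)
diff = np.max(np.abs(analytical_grad(pred, target) - numerical_grad(pred, target)))
print(diff)  # should be on the order of 1e-9 or smaller if both gradients agree
```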

@Cluoyao

Cluoyao commented Apr 1, 2019

Hi @arntanguy, I want to ask you one question: when you train the model with this loss function, have you ever seen the loss be very large at the first iteration and then much smaller at the second? How did you solve it? Thanks.
