
Training a network using an existing model for weights - do I need to rename the last layer if data is similar? #4787

Open
GeorgiAngelov opened this issue Sep 30, 2016 · 3 comments


@GeorgiAngelov

So I've read in a few places that the last layer of the network needs to be renamed to re-initialize its weights. However, do I still need to do that if my data is a subset of the classes the network was trained on?
Say the model was trained on 'ducks', 'dogs', 'cats', 'mice' and I am now further training it on 'cats' so it can do a better job of detecting cats in unusual scenarios. Do I still need to rename the last layer(s)?

@wk910930
Contributor

wk910930 commented Oct 4, 2016

If the last layer here is a fully-connected layer whose output is fed to a loss function, then you have to remove (or rename, to re-initialize) this fully-connected layer, since the number of classes (labels) changes (4 to 1 in your case). Of course, if you still have 4 classes and just significantly increase the number of cat samples, it is okay to keep the previously trained weights.
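In Caffe, this renaming is done in the train prototxt. A minimal sketch of a renamed final layer, assuming a CaffeNet-style model where the last fully-connected layer is called `fc8` and feeds `fc7` (your layer names, fillers, and `lr_mult` values may differ):

```protobuf
# The pretrained net defined something like:
#   layer { name: "fc8" type: "InnerProduct" ... num_output: 4 ... }
#
# Because "fc8_cats" matches no layer name in the pretrained .caffemodel,
# Caffe initializes it from the fillers instead of copying old weights.
layer {
  name: "fc8_cats"          # new name -> weights are NOT copied from the snapshot
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8_cats"
  param { lr_mult: 10 decay_mult: 1 }   # let the fresh layer learn faster
  param { lr_mult: 20 decay_mult: 0 }   # than the inherited layers
  inner_product_param {
    num_output: 1           # new number of classes
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
```

The boosted `lr_mult` values are a common fine-tuning convention: the re-initialized layer needs larger updates than the layers that keep pretrained weights.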

@GeorgiAngelov
Author

@wk910930, yeah, technically it should now only contain 1 class. I wasn't sure whether it would make sense to reduce the number of classes the network can recognize, even though I will no longer provide training data for them. Basically, what would give me the best recognition for a single class:

Option #1. Modify the last fully-connected layer to output my number of classes, and rename it.
Option #2. Keep the existing class count, even though I'll now only use and train for one of the classes going forward.
Option #3. Keep the existing class count, but rename the last fully-connected layer to re-initialize its weights.

What are your thoughts?

@wk910930
Contributor

wk910930 commented Oct 5, 2016

I prefer option 1. This is akin to the pretrain + fine-tune approach: just as we use an ImageNet-pretrained model (1000 classes) as the starting point to train a more specific model (200 classes) for object detection.
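Caffe's `-weights` option copies parameters by layer name, which is why renaming the last layer is enough to re-initialize it while everything else is inherited. A minimal numpy sketch of that copy-by-name behavior (the layer names, shapes, and the `transfer_weights` helper are illustrative, not part of Caffe's API):

```python
import numpy as np

def transfer_weights(pretrained, new_net, rng=None):
    """Copy params whose layer name (and shape) matches; re-init the rest.

    Mirrors the effect of `caffe train -weights model.caffemodel`: layers
    that keep their name inherit pretrained weights, renamed layers start
    from a fresh random initialization.
    """
    rng = rng or np.random.default_rng(0)
    out = {}
    for name, shape in new_net.items():
        if name in pretrained and pretrained[name].shape == shape:
            out[name] = pretrained[name]               # inherited as-is
        else:
            out[name] = rng.normal(0.0, 0.01, shape)   # freshly initialized
    return out

# The pretrained net had a 4-way classifier "fc8"; the fine-tuned net
# renames it to "fc8_cats" with one output, so only that layer is re-drawn.
pretrained = {"fc7": np.ones((4096, 4096)), "fc8": np.ones((4, 4096))}
new_net = {"fc7": (4096, 4096), "fc8_cats": (1, 4096)}
weights = transfer_weights(pretrained, new_net)
print(weights["fc7"][0, 0])        # copied from the pretrained net
print(weights["fc8_cats"].shape)   # re-initialized at the new size
```

Note that even an unrenamed layer would fail to load if its shape changed, which is why option 1 (resize and rename together) is the clean path.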
