
Training a network using an existing model for weights - do I need to rename the last layer if data is similar? #4787

Open
GeorgiAngelov opened this issue Sep 30, 2016 · 3 comments


@GeorgiAngelov

So I've read in a few places that the last layer of the network needs to be renamed to re-initialize its weights. However, do I still need to do that if my data is a subset of the classes the network was trained on?
Say the model was trained on 'ducks', 'dogs', 'cats', 'mice' and I am now further training it on 'cats' so it can do a better job of detecting cats in unusual scenarios. Do I still need to rename the last layer(s)?

@wk910930
Contributor

wk910930 commented Oct 4, 2016

If the last layer here is a fully-connected layer whose output is fed to a loss function, then you have to remove (or rename, to re-initialize) this fully-connected layer, since the number of classes (labels) changes (4 to 1 in your case). Of course, if you still have 4 classes and just significantly increase the number of cat samples, it is okay to keep the previously trained weights.
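In Caffe, this renaming is done in the train prototxt. A minimal sketch of a renamed final layer, assuming a CaffeNet-style model where the last fully-connected layer is called `fc8` and feeds `fc7` (your layer names, fillers, and `lr_mult` values may differ):

```protobuf
# The pretrained net defined something like:
#   layer { name: "fc8" type: "InnerProduct" ... num_output: 4 ... }
#
# Because "fc8_cats" matches no layer name in the pretrained .caffemodel,
# Caffe initializes it from the fillers instead of copying old weights.
layer {
  name: "fc8_cats"          # new name -> weights are NOT copied from the snapshot
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8_cats"
  param { lr_mult: 10 decay_mult: 1 }   # let the fresh layer learn faster
  param { lr_mult: 20 decay_mult: 0 }   # than the inherited layers
  inner_product_param {
    num_output: 1           # new number of classes
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
```

The boosted `lr_mult` values are a common fine-tuning convention: the re-initialized layer needs larger updates than the layers that keep pretrained weights.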

@GeorgiAngelov
Author

@wk910930, yeah, technically it should now only contain 1 class. I wasn't sure whether it would make sense to reduce the number of classes the network can recognize, even though I will no longer provide training data for them. Basically, what would give me the best recognition for a single class:

Option #1. Modify the last fully-connected layer to output my number of classes, and rename it.
Option #2. Keep the existing class count, even though I'll now only use and train for one of the classes going forward.
Option #3. Keep the existing class count, but rename the last fully-connected layer to re-initialize its weights.

What are your thoughts?

@wk910930
Contributor

wk910930 commented Oct 5, 2016

I prefer option 1. This is akin to the pretrain + fine-tune approach: just as we use an ImageNet-pretrained model (1000 classes) as the starting point to train a more specific model (200 classes) for object detection.
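Caffe's `-weights` option copies parameters by layer name, which is why renaming the last layer is enough to re-initialize it while everything else is inherited. A minimal numpy sketch of that copy-by-name behavior (the layer names, shapes, and the `transfer_weights` helper are illustrative, not part of Caffe's API):

```python
import numpy as np

def transfer_weights(pretrained, new_net, rng=None):
    """Copy params whose layer name (and shape) matches; re-init the rest.

    Mirrors the effect of `caffe train -weights model.caffemodel`: layers
    that keep their name inherit pretrained weights, renamed layers start
    from a fresh random initialization.
    """
    rng = rng or np.random.default_rng(0)
    out = {}
    for name, shape in new_net.items():
        if name in pretrained and pretrained[name].shape == shape:
            out[name] = pretrained[name]               # inherited as-is
        else:
            out[name] = rng.normal(0.0, 0.01, shape)   # freshly initialized
    return out

# The pretrained net had a 4-way classifier "fc8"; the fine-tuned net
# renames it to "fc8_cats" with one output, so only that layer is re-drawn.
pretrained = {"fc7": np.ones((4096, 4096)), "fc8": np.ones((4, 4096))}
new_net = {"fc7": (4096, 4096), "fc8_cats": (1, 4096)}
weights = transfer_weights(pretrained, new_net)
print(weights["fc7"][0, 0])        # copied from the pretrained net
print(weights["fc8_cats"].shape)   # re-initialized at the new size
```

Note that even an unrenamed layer would fail to load if its shape changed, which is why option 1 (resize and rename together) is the clean path.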
