This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

How to finetune from pretrained detectron models with different number of classes? #15

Closed
wangg12 opened this issue Oct 25, 2018 · 43 comments
Labels
enhancement New feature or request

Comments

@wangg12
Contributor

wangg12 commented Oct 25, 2018

❓ Questions and Help

Is there a config option to load pretrained COCO models for finetuning? The number of classes in the last layers may be different, so those weights should not be loaded.

@fmassa
Contributor

fmassa commented Oct 25, 2018

Hi,

There currently isn't an off-the-shelf option in the config for that.
I see two easy options:
1 - from a python interpreter, load the pre-trained files that you want to use, and delete from the state_dict the keys corresponding to the last layer. The exact naming depends on the model architecture, but for boxes the names will end with cls_score and bbox_pred, and for masks they will end with mask_fcn_logits.
2 - Clone the code-base and rename the two variables that I pointed out to something else, like cls_score_mine etc. This will work out of the box, and you can modify NUM_CLASSES in the config without clashes.
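
For instance, option 1 could look roughly like this (a minimal sketch; "pretrained.pth" is a placeholder path, and the checkpoint is assumed to store its weights under a 'model' key):

import torch

weights = torch.load("pretrained.pth")["model"]
# Drop the class-dependent last layers: box classification/regression and mask logits.
last_layer_keys = [k for k in weights
                   if "cls_score" in k or "bbox_pred" in k or "mask_fcn_logits" in k]
for k in last_layer_keys:
    del weights[k]
torch.save(dict(model=weights), "trimmed.pth")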

I think we could provide functionality to perform 1 for the users, given a cfg file and a path to the model weights. That could be a possible improvement on top of what we currently have.

What do you think?

@fmassa fmassa added the enhancement New feature or request label Oct 25, 2018
@wangg12
Contributor Author

wangg12 commented Oct 25, 2018

@fmassa I think option 1 is more user-friendly. We could add a config option like PRETRAINED_DETECTRON_WEIGHTS, and if it is given, all the weights except those of the last layer would be loaded to initialize the model.

@fmassa
Contributor

fmassa commented Oct 25, 2018

Yeah, option 1 is definitely simpler for the user (even if there are only a few lines to change here and there ;-) )

I'll prepare a PR adding support for this functionality, but I'm not 100% sure of what the API should look like, nor the best fix for it.

API

Should we have a function that acts on the weights and creates a new weights file? Or should we add an extra config argument to make it a single-step process? If we add an argument (which seems simpler for the user), would it be ambiguous?

Implementation

For the possible fixes, we could hard-code the possible names for the layers that shouldn't be loaded (as I mentioned before). But this is not super robust if the user changes their module names (which they can, if they want).

Another possible implementation is to not load the weights for the entire predictor. This is effectively the most robust way, as the predictor was designed to be only the "last layer".
This works nicely for boxes, but for masks we would also lose the initialization of one ConvTranspose2d layer, which might not be that bad in the end.
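
A rough sketch of that second approach, assuming the predictor parameters all live under a ".predictor." prefix (the exact prefix depends on the module layout):

import torch

# model: a freshly built detector (e.g. from build_detection_model(cfg))
# with the new NUM_CLASSES, so its predictor keeps its random initialization.
state_dict = torch.load("pretrained.pth")["model"]
# Keep everything except parameters belonging to a predictor module.
filtered = {k: v for k, v in state_dict.items() if ".predictor." not in k}
# strict=False tolerates the missing predictor weights.
model.load_state_dict(filtered, strict=False)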

Thoughts?

@wangg12
Contributor Author

wangg12 commented Oct 25, 2018

I would prefer the former way. As for possible module name changes by users, I think they should also be careful about weight loading, handling it either by name remapping or by random initialization.

@fmassa
Contributor

fmassa commented Oct 25, 2018

@wangg12 could you expand on why you'd prefer the first approach? I was actually leaning more towards the second one, as it is more robust, and we have a clear contract with the user when we add an option to the config: "load every weight possible, except those in the predictor".

@wangg12
Contributor Author

wangg12 commented Oct 25, 2018

@fmassa There are two situations where the first one may be more suitable.

  1. I just want to finetune the trained COCO model on COCO datasets.
  2. I want to use as many of the pretrained weights as I can, so losing the ConvTranspose2d weights may be unexpected.

For other situations, I think the second way is also OK.

@fmassa
Contributor

fmassa commented Oct 25, 2018

So, I've discussed with a few people here and it seems that the best way of handling this would be to actually perform model surgery on the model files.

For example, the best results on CityScapes come from taking a COCO-trained detector and removing most of the classification and mask weights, but retaining those that correspond to the categories common to both COCO and CityScapes.
Detectron does something like the following: https://github.com/facebookresearch/Detectron/blob/master/tools/convert_coco_model_to_cityscapes.py , so maybe the most generic thing to do is to provide a few helper functions for users to decide which layers to trim.
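
As an illustration of that kind of surgery, here is a hedged sketch (the index mapping, key names and paths below are made up for the example; the real correspondence lives in the Detectron conversion script):

import torch

# Hypothetical mapping from target (CityScapes-style) class index to the
# matching source (COCO) class index for the categories the datasets share.
SOURCE_INDEX = {0: 0, 1: 1, 2: 3, 3: 6}

weights = torch.load("coco_model.pth")["model"]
old_w = weights["cls_score.weight"]  # [num_source_classes, feature_dim]
old_b = weights["cls_score.bias"]
new_w = old_w.new_zeros((len(SOURCE_INDEX), old_w.shape[1]))
new_b = old_b.new_zeros(len(SOURCE_INDEX))
for tgt, src in SOURCE_INDEX.items():
    new_w[tgt] = old_w[src]  # copy the row of the shared category
    new_b[tgt] = old_b[src]
weights["cls_score.weight"] = new_w
weights["cls_score.bias"] = new_b
torch.save(dict(model=weights), "cityscapes_init.pth")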

@wangg12
Contributor Author

wangg12 commented Oct 25, 2018

Yes, this way is more general.

@xuanyuzhou98

"load the pre-trained files that you want to use, and delete from the state_dict"

Hi,

There currently isn't an off-the-shelf option in the config for that.
I see two easy options:
1 - from a python interpreter, load the pre-trained files that you want to use, and delete from the state_dict the keys corresponding to the last layer. The exact naming depends on the model architecture, but for boxes the name will end with a cls_score and bbox_pred, and for masks it will end with mask_fcn_logits.
2 - Clone the code-base and modify the names of the two variables that I pointed out to be something else, like cls_score_mine etc. This will work out of the box, and you can modify the NUM_CLASSES in the config without clashes.

I think we could provide a functionality to perform 1 for the users, given a cfg file and a path to a model weight. That could be a possible improvement on top of what we currently have.

What do you think?

Where are the pretrained files located? For example, if I want to use a net pretrained on ImageNet, where can we find those files and load them?

@fmassa
Contributor

fmassa commented Oct 31, 2018

By default, they are stored in ~/.torch/models. The exact name of the file is printed during training, just before the loaded weights are printed.

@steve-goley

steve-goley commented Oct 31, 2018

I added this function to train_net.py with an additional input arg. Note that the loaded models had an additional "module." prefix that had to be removed. After I removed it, this worked great.

import torch

def _transfer_pretrained_weights(model, pretrained_model_pth):
    # Load the checkpoint and keep everything except the class-dependent layers.
    pretrained_weights = torch.load(pretrained_model_pth)['model']
    # Strip the 'module.' prefix left by DataParallel while filtering.
    new_dict = {k.replace('module.', ''): v for k, v in pretrained_weights.items()
                if 'cls_score' not in k and 'bbox_pred' not in k}
    this_state = model.state_dict()
    this_state.update(new_dict)
    model.load_state_dict(this_state)
    return model

I don't think this is the solution that @fmassa wants to implement but it'll work in a pinch for now.

@cppntn

cppntn commented Nov 5, 2018

Hello @steve-goley @fmassa , I've tried to load the pretrained model in this way:
w = torch.load("X-101-32x8d.pkl")

however, an error occurred: UnicodeDecodeError: 'ascii' codec can't decode byte 0xad in position 2: ordinal not in range(128)
I am able to get past this error by using pickle directly:
with open("X-101-32x8d.pkl", "rb") as f: w = pickle.load(f, encoding='latin1')

But there seems to be no "model" key in the dict, just a "blobs" dict, and I can't find 'cls_score' and 'bbox_pred'.

Could you tell me how to overcome this issue?

Thanks

@fmassa
Contributor

fmassa commented Nov 5, 2018

@antocapp the .pkl files are generally from the Detectron codebase, which is written in Caffe2.

What I'd recommend doing is the following:
1 - create a cfg object similar to what is present in the demo, for that particular model
2 - use the load_c2_format function, which will give you a dict containing the model field. There you can perform the model surgery that you want, by removing fields etc.
3 - save the object using pytorch's torch.save, keeping the dict(model=state_dict) structure.
4 - change MODEL.WEIGHT to point to this saved file.
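
Put together, a minimal sketch of those steps (the config path, model path and key names are placeholders):

import torch
from maskrcnn_benchmark.config import cfg
from maskrcnn_benchmark.utils.c2_model_loading import load_c2_format

# Step 1: config for the particular model.
cfg.merge_from_file("configs/caffe2/e2e_mask_rcnn_X_101_32x8d_FPN_1x_caffe2.yaml")
# Step 2: load the Caffe2 .pkl and perform the surgery.
loaded = load_c2_format(cfg, "model_final.pkl")  # a dict with a 'model' field
for key in list(loaded["model"].keys()):
    if "cls_score" in key or "bbox_pred" in key or "mask_fcn_logits" in key:
        del loaded["model"][key]
# Step 3: save, keeping the dict(model=state_dict) structure.
torch.save(loaded, "trimmed.pth")
# Step 4: point MODEL.WEIGHT in the config at "trimmed.pth".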

Let me know if it doesn't work, I might have missed a step here.

@cppntn

cppntn commented Nov 5, 2018

Hi @fmassa, thanks for your support.
I wrote this:

from maskrcnn_benchmark.config import cfg
from maskrcnn_benchmark.utils.c2_model_loading import load_c2_format

cfg.merge_from_file("configs/caffe2/e2e_mask_rcnn_X_101_32x8d_FPN_1x_caffe2.yaml")
path = '/home/antonio/.torch/models/X-101-32x8d.pkl'
_d = load_c2_format(cfg, path)

keys = [k for k in _d['model'].keys()]
print(sorted(keys))

But I can't find 'cls_score' and 'bbox_pred' in the keys.

@fmassa
Contributor

fmassa commented Nov 5, 2018

@antocapp you are loading the ImageNet-trained model (X-101-32x8d.pkl), not the detection model that has already been trained on COCO (which is probably what you want). The model file that you are looking for has a long name, starts with _, and parts of it are here.

@cppntn

cppntn commented Nov 6, 2018

Thanks @fmassa, so where can I find that model? When I performed inference with that model it worked very well (I just want to fine-tune it on one class on a specific dataset), but in .torch/models/ I see that only "X-101-32x8d.pkl" has been downloaded. Where can I find the detection model?

Thanks for your help, I really appreciate it.

EDIT: I launched inference again and it started downloading the file 36761843/12_2017_baselines/e2e_mask_rcnn_X-101-32x8d-FPN_1x.yaml.06_35_59.RZotkLKI/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl once more; maybe I accidentally deleted the previous model from the models/ folder. Thanks again!
I was able to prune the 'cls_score' and 'bbox_pred' layers in the model, then saved it keeping the key 'model' in a .pth with torch.save. Then I changed MODEL.WEIGHT to point to this file and ROI_BOX_HEAD.NUM_CLASSES to 2 (background and the single class that I want to fine-tune the model for). Is this correct?

One last question: how should I organize my dataset in order to fine-tune the model?

@BelhalK

BelhalK commented Nov 23, 2018

Hi @antocapp,
Could you please share the chunk of code that takes the pre-trained Mask R-CNN model (beginning with _) and returns the modified one (pruning the relevant fields)?
I am running into the same issues you mentioned in:

> Hello @steve-goley @fmassa, I've tried to load the pretrained model with torch.load... But there seems to be no "model" key in the dict, just a "blobs" dict, and I can't find 'cls_score' and 'bbox_pred'. [...]

Thank you very much

@fmassa
Contributor

fmassa commented Nov 23, 2018

@BelhalK the weights are inside blobs, but they have some pretty different names.
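
For reference, the blobs are flat Caffe2 arrays whose names typically end in _w / _b, e.g. cls_score_w, cls_score_b, bbox_pred_w, bbox_pred_b (this naming is an assumption based on common Detectron/Caffe2 conventions), so a direct filter on the blobs dict might look like:

import pickle

with open("model_final.pkl", "rb") as f:
    data = pickle.load(f, encoding="latin1")

# Drop the class-dependent blobs by their Caffe2 names.
data["blobs"] = {k: v for k, v in data["blobs"].items()
                 if not k.startswith(("cls_score", "bbox_pred", "mask_fcn_logits"))}

with open("model_trimmed.pkl", "wb") as f:
    pickle.dump(data, f)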

@BelhalK

BelhalK commented Nov 23, 2018

Got it. So the working function should be

def _transfer_pretrained_weights(model, pretrained_model_pth):
    pretrained_weights = torch.load(pretrained_model_pth)['blobs']
    new_dict = {k.replace('module.', ''): v for k, v in pretrained_weights.items()
                if 'somethingelse' not in k and 'somethingelse' not in k}
    this_state = model.state_dict()
    this_state.update(new_dict)
    model.load_state_dict(this_state)
    return model

Where somethingelse should be different from cls_score and bbox_pred, right?

@fmassa
Contributor

fmassa commented Nov 23, 2018

Almost, you'll probably need to plug it somewhere in utils/c2_loading

@BelhalK

BelhalK commented Nov 23, 2018

You may be right.
I initially wanted to insert it in tools/train_net.py, like this:

def _transfer_pretrained_weights(model, pretrained_model_pth):
    pretrained_weights = torch.load(pretrained_model_pth)['model']
    new_dict = {k.replace('module.',''):v for k, v in pretrained_weights.items()
                if 'cls_score' not in k and 'bbox_pred' not in k}
    this_state = model.state_dict()
    this_state.update(new_dict)
    model.load_state_dict(this_state)
    return model


def train(cfg, local_rank, distributed):
    old_model = build_detection_model(cfg)
    pretrained_model_pth = "/home/belhal/.torch/models/_detectron_35858933_12_2017_baselines_e2e_mask_rcnn_R-50-FPN_1x.yaml.01_48_14.DzEQe4wC_output_train_coco_2014_train%3Acoco_2014_valminusminival_generalized_rcnn_model_final.pkl"
    model = _transfer_pretrained_weights(old_model, pretrained_model_pth)
    device = torch.device(cfg.MODEL.DEVICE)
    model.to(device)
    ...

But it may need to go in some other scripts instead.

@BelhalK

BelhalK commented Nov 24, 2018

I have been using the various tips and tricks from this thread to modify a pre-trained model.
I am having an issue saving the modified dict into a new model.
I am using the following code:

from maskrcnn_benchmark.config import cfg
from maskrcnn_benchmark.utils.c2_model_loading import load_c2_format

path = '/Users/belhal/.torch/models/_detectron_35858933_12_2017_baselines_e2e_mask_rcnn_R-50-FPN_1x.yaml.01_48_14.DzEQe4wC_output_train_coco_2014_train%3Acoco_2014_valminusminival_generalized_rcnn_model_final.pkl'

cfg.merge_from_file("../configs/e2e_mask_rcnn_X_101_32x8d_FPN_1x.yaml")
_d = load_c2_format(cfg, path)
newdict = _d

def removekey(d, listofkeys):
    r = dict(d)
    for key in listofkeys:
        del r[key]
    return r

newdict['model'] = removekey(_d['model'], ['cls_score.bias', 'cls_score.weight', 'bbox_pred.bias', 'bbox_pred.weight'])

How should I use torch.save(??, 'mymodel.pkl') to save a new model named mymodel.pkl with the resulting dict newdict?

Thanks a lot for your help!

@fmassa
Contributor

fmassa commented Nov 27, 2018

You can just save it using torch.save(newdict, 'mymodel.pth'). Note the .pth extension, not .pkl.

@fmassa
Contributor

fmassa commented Dec 14, 2018

@jbitton I addressed your question in #273

Also, given that the current issues were not enough to give you full context on how to add new datasets, could you perhaps improve the documentation a bit in https://github.com/facebookresearch/maskrcnn-benchmark/blob/master/maskrcnn_benchmark/data/README.md (maybe adding a link from the main README as well) with the points that were missing, and send a PR?

It would be a very welcome contribution!

@jbitton

jbitton commented Dec 14, 2018

@fmassa For sure! Do you mind if I get the PR out mid-next week? I'd like to first verify that I was able to go through the training/eval scripts successfully.

@fmassa
Contributor

fmassa commented Dec 14, 2018

@jbitton sure, no worries! thanks a lot!

@mattans

mattans commented Dec 14, 2018

What's the meaning of %3A in the saved path? It's the URL percent-encoding for a colon, but why do we want it in a path?

@fmassa
Contributor

fmassa commented Dec 14, 2018

@mattans we don't necessarily want it in the path, but this might be specific to which characters Windows allows in a path.

@wangg12
Contributor Author

wangg12 commented Dec 18, 2018

To summarize, I've created a script tools/trim_detectron_model.py here.
You can decide which keys are removed and which are kept by modifying the script.

Then you can simply point MODEL.WEIGHT in the config file at the converted model.
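
After trimming, the config change can also be done from Python; a small sketch (the config path, weight path and class count are illustrative):

from maskrcnn_benchmark.config import cfg

cfg.merge_from_file("configs/e2e_mask_rcnn_R_50_FPN_1x.yaml")
# Point MODEL.WEIGHT at the trimmed file and set the new number of classes
# (e.g. 2 = background + 1 class).
cfg.merge_from_list(["MODEL.WEIGHT", "trimmed_model.pth",
                     "MODEL.ROI_BOX_HEAD.NUM_CLASSES", 2])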

@wangg12 wangg12 closed this as completed Dec 18, 2018
@fmassa
Contributor

fmassa commented Dec 18, 2018

@wangg12 could you maybe add a section in the TROUBLESHOOTING or in the README pointing to your snippet and send a PR?

Thanks!

@wangg12
Contributor Author

wangg12 commented Dec 18, 2018

@fmassa I've created a PR #286

@xiaohai12

xiaohai12 commented May 29, 2019

I had a question about using trim_detectron_model.py.
If I understand correctly, when we load a model using load_c2_format(cfg, path), the function only works with .pkl files. However, what we save from training is a .pth file, so I get an error when I try to use trim_detectron_model.py on a .pth file.

Is there any solution for this?
Thanks.

@christopherbate
Copy link

@xiaohai12 I believe you can just replace the call to load_c2_format with a simple torch.load, but I have not tested.

@xiaohai12

xiaohai12 commented May 30, 2019

> @xiaohai12 I believe you can just replace the call to load_c2_format with a simple torch.load, but I have not tested.

Thanks. I will try it.

@xiaohai12

> @xiaohai12 I believe you can just replace the call to load_c2_format with a simple torch.load, but I have not tested.

It worked in my case once I replaced load_c2_format with torch.load and changed the parameters in removekey from cls_score to roi_heads.box.predictor.cls_score (and similarly for the other parameters).
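
Putting the thread together, a sketch for trimming a .pth checkpoint saved by this codebase (the full key names below assume the default module layout and no DataParallel "module." prefix):

import torch

checkpoint = torch.load("last_checkpoint.pth", map_location="cpu")
weights = checkpoint["model"]
for key in ["roi_heads.box.predictor.cls_score.weight",
            "roi_heads.box.predictor.cls_score.bias",
            "roi_heads.box.predictor.bbox_pred.weight",
            "roi_heads.box.predictor.bbox_pred.bias"]:
    weights.pop(key, None)  # ignore keys that are not present
torch.save(dict(model=weights), "trimmed_checkpoint.pth")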
