pruned model size no change and inference time is even longer #2225
Comments
@misslibra thanks for reporting this issue. It is expected that the pruned model is also 1.7M, because the pruners are only responsible for finding weight masks under which the model still performs reasonably well. ModelSpeedup is responsible for making the model smaller based on the generated masks. For your case, could you tell us how you measured the 1.5ms? With the pruner applied, or after loading the saved model weight checkpoint into the original model (i.e., before pruning)? If the former, inference latency should be higher, because the weights are multiplied by the masks in forward. If the latter, the inference latency should not be different. For ModelSpeedup, it would be great if you could share your code with us, so that we can check whether your model is really compressed.
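To make the distinction between masking and actual compaction concrete, here is a minimal pure-Python sketch. It does not use NNI or PyTorch, and every function name in it is hypothetical, chosen only for illustration. It shows why a masked model keeps the same parameter count (and pays for an extra multiply), while physically dropping the masked filters is what shrinks it.

```python
# Toy illustration only: plain Python lists stand in for conv filters.
# None of these helpers are NNI APIs; the names are hypothetical.

def l1_filter_mask(filters, sparsity):
    """Return one 0/1 entry per filter, zeroing the `sparsity` fraction
    of filters with the smallest L1 norm."""
    norms = [sum(abs(w) for w in f) for f in filters]
    n_prune = int(len(filters) * sparsity)
    by_norm = sorted(range(len(filters)), key=lambda i: norms[i])
    pruned = set(by_norm[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(filters))]

def apply_mask(filters, mask):
    """Masked weights: same shape as before, plus one extra multiply per weight."""
    return [[w * m for w in f] for f, m in zip(filters, mask)]

def compact(filters, mask):
    """Conceptual speedup step: physically drop the masked filters."""
    return [f for f, m in zip(filters, mask) if m == 1]

def param_count(filters):
    return sum(len(f) for f in filters)

filters = [
    [0.9, -0.8, 0.7],
    [0.01, 0.02, -0.01],
    [0.5, 0.4, -0.6],
    [0.03, -0.02, 0.01],
]
mask = l1_filter_mask(filters, sparsity=0.5)
masked = apply_mask(filters, mask)
compacted = compact(filters, mask)

print(mask)                    # which filters survive
print(param_count(masked))     # unchanged by masking
print(param_count(compacted))  # smaller only after compaction
```

In a real network, dropping output filters of one layer also means dropping the matching input channels of the next layer, which is why a separate speedup step is needed rather than a simple weight export.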
Thanks for your support!
`if __name__ == "__main__":`
This is my code for using speedup.
I tried loading the new model exported by pruner.export_model and using the use_mask logic, but the inference time is still not reduced.
And how do I use ModelSpeedup to get a smaller model?
/home/cindy/Documents/3D/training/camera/geely_yolo3D_02/models.py:291: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
@misslibra there are two issues in your code. First, after calling
@QuanluZhang Thank you so much! I will try now and update the result.
BTW, the following two examples might help:
My torch version is 1.3.1.
Looks like a bug in torch.jit; some related issues in PyTorch:
This error can be solved by this notice (from source code):
@misslibra thanks for sharing the cause.
@QuanluZhang hi, when I try to apply ModelSpeedup to the PyTorch resnet50 model:
model = models.resnet50(pretrained=False)
m_speedup = ModelSpeedup(model, input_imgs, masks_file)
m_speedup.speedup_model()
I get an error:
File "/home/cindy/anaconda3/envs/python36_env/lib/python3.6/site-packages/nni/compression/speedup/torch/compressor.py", line 496, in infer_module_mask
I also tried adding "aten::_convolution" to the infer_from_inshape map in infer_shape.py.
@misslibra ModelSpeedup relies on shape inference of operations to figure out which modules should be replaced and how. In the current alpha release, we only support a limited set of operations/modules for shape inference. We are working on simplifying the process and interface for adding new operation/module support, which will be included in a future release. The specific error you encountered seems to be caused by a bug that has already been fixed. Could you pull the latest master branch, install from source, and try ModelSpeedup again?
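As a rough picture of what shape inference means here, the following pure-Python sketch propagates channel masks through a simple chain of layers. It is not NNI's implementation; the function names, the layer tuple format, and the union policy for element-wise additions are all assumptions made for illustration.

```python
# Toy shape inference over a linear chain of conv-like layers.
# Hypothetical helpers, not part of NNI.

def infer_shapes(layers, out_masks):
    """layers: list of (name, in_channels, out_channels) in forward order.
    out_masks: dict mapping a layer name to a 0/1 list over its output
    channels (layers without an entry keep all channels).
    Returns a dict mapping name -> (new_in_channels, new_out_channels)."""
    new_shapes = {}
    kept_from_prev = None  # channels surviving from the previous layer
    for name, in_ch, out_ch in layers:
        mask = out_masks.get(name, [1] * out_ch)
        if len(mask) != out_ch:
            raise ValueError(f"mask length mismatch for layer {name!r}")
        new_in = in_ch if kept_from_prev is None else kept_from_prev
        new_out = sum(mask)
        new_shapes[name] = (new_in, new_out)
        kept_from_prev = new_out
    return new_shapes

def merge_add_masks(mask_a, mask_b):
    """Two tensors that are added element-wise must agree on which channels
    exist. One possible policy (an assumption here): keep a channel if
    either branch keeps it."""
    if len(mask_a) != len(mask_b):
        raise ValueError("branches must have the same channel count")
    return [1 if a or b else 0 for a, b in zip(mask_a, mask_b)]

layers = [("conv1", 3, 4), ("conv2", 4, 8)]
shapes = infer_shapes(layers, {"conv1": [1, 0, 1, 0]})
merged = merge_add_masks([1, 0, 1, 0], [1, 1, 0, 0])
print(shapes)
print(merged)
```

The point of the sketch is that pruning the outputs of one layer changes the expected input shape of the next, so every operation between them needs a known rule. An operation without such a rule is where a speedup pass would have to stop.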
Hello, I also encountered problems when I tried to speed up ResNet. It seems like some conflicts occur between 2 shortcuts. For example,
@TangChangcheng could you provide an executable Python script along with the mask file you use, so that we can reproduce the problem?
@TangChangcheng @misslibra your issue may be resolved by PR #2579.
Hi @misslibra, which PyTorch version were you using for L1FilterPruner? I am getting an import error with PyTorch 1.8.1: ImportError: cannot import name 'L1FilterPruner' from 'nni.compression.pytorch'
NNI environment: PyTorch
I ran the example code: model_prune_torch.py
The pretrain_naive model is 1.7M, and the pruned model is also 1.7M, the same as the mask.
Inference time with the pretrained model is 0.4ms, but with the pruned model it increases to 1.5ms.
I am confused about what the example is meant to do. Isn't it supposed to shrink the model and speed it up?
I also tried the speedup method, following the example, on my own model based on YOLOv3, with the same result.
Please help me figure out what is going wrong.
Thx!
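When comparing numbers like 0.4ms and 1.5ms, it helps to time both models the same way. Below is a minimal sketch of a latency helper using only the Python standard library. The helper name and its parameters are hypothetical, and the dummy workload stands in for a real model call.

```python
import statistics
import time

def measure_latency_ms(fn, warmup=10, runs=100):
    """Return the median wall-clock latency of fn() in milliseconds.
    Warm-up calls are discarded so one-time setup costs do not skew the result."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)

# Dummy workload; with a real model this would be something like
# `lambda: model(dummy_input)` run under inference mode.
# On a GPU, calls are asynchronous, so the timed function should also
# synchronize (e.g. torch.cuda.synchronize()) before returning.
latency = measure_latency_ms(lambda: sum(range(1000)), warmup=5, runs=20)
print(f"median latency: {latency:.4f} ms")
```

Using the median over many runs rather than a single call makes it easier to tell whether a difference between the original and the pruned model is real or just noise.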