
Has not supported infering output shape from input shape for module/function: prim::TupleUnpack, .prim::TupleUnpack.152 #3645

Closed
aa12356jm opened this issue May 14, 2021 · 17 comments


@aa12356jm

Describe the issue:

Environment:

  • NNI version: 2.2
  • Training service (local|remote|pai|aml|etc): local
  • Client OS: Ubuntu 16.04
  • Server OS (for remote mode only):
  • Python version: 3.7
  • PyTorch/TensorFlow version: torch 1.8.1
  • Is conda/virtualenv/venv used?: conda
  • Is running in Docker?: no
@ultmaster ultmaster assigned QuanluZhang and J-shang and unassigned QuanluZhang May 14, 2021
@kvartet kvartet added the "user raised", "ModelSpeedup", and "question" (Further information is requested) labels May 14, 2021
@J-shang
Contributor

J-shang commented May 14, 2021

Thank you @aa12356jm. Model speedup is being refactored (#3462); we will use a new approach to auto-infer shapes. I think it will be supported in the next version.

@aa12356jm
Author

> Thank you @aa12356jm. Model speedup is being refactored (#3462); we will use a new approach to auto-infer shapes. I think it will be supported in the next version.

Can you explain the reason for this problem? My model consists of three parts: backbone (MobileNetV2) + FPN + head. Speedup works fine on the backbone alone, but it gives the above error when I speed up backbone + FPN, so the error comes from the FPN. But why? Thanks.

@J-shang
Contributor

J-shang commented May 17, 2021

Hello @aa12356jm, `infer_from_inshape` and `infer_from_outshape` below contain all of the shape-inference rules we support for now. If you want to speed up your model, it seems you need to add an implementation for `prim::TupleUnpack` to `infer_from_outshape`.

https://github.com/microsoft/nni/blob/dddf0b9efa310443a52467854dea053854f6a134/nni/compression/pytorch/speedup/infer_shape.py
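For intuition, here is a minimal sketch of what such a rule boils down to; the actual tables and function signatures in `infer_shape.py` may differ, and `tuple_unpack_infer_outshape` is a hypothetical name. Since `prim::TupleUnpack` only splits a tuple apart, shape inference for it is an identity pass-through per element:

```python
import torch

# Hypothetical sketch (not NNI's real API): prim::TupleUnpack only splits
# a tuple, so each unpacked output keeps the shape of the corresponding
# tuple element.
def tuple_unpack_infer_outshape(element_shapes):
    """Identity propagation: the unpacked outputs have exactly the
    shapes of the tuple elements fed into TupleUnpack."""
    return list(element_shapes)

shapes = [torch.Size([1, 8, 40, 40]), torch.Size([1, 24, 20, 20])]
print(tuple_unpack_infer_outshape(shapes))
# [torch.Size([1, 8, 40, 40]), torch.Size([1, 24, 20, 20])]
```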

@zheng-ningxin
Contributor

@aa12356jm Hi, basically I guess it is caused by a function whose return value is complex (such as a tuple or list of tensors). Could you please show us a code snippet of the model definition so we can reproduce and fix this problem? Thanks.

@aa12356jm
Author

> @aa12356jm Hi, basically I guess it is caused by a function whose return value is complex (such as a tuple or list of tensors). Could you please show us a code snippet of the model definition so we can reproduce and fix this problem? Thanks.

Yes, the return value of my model is a tuple/list. Here is the code.

The backbone:

```python
import torch.nn as nn
from torchvision.models import mobilenet_v2

class MobileNetV2_torchvision(nn.Module):

    def __init__(self, width):
        super(MobileNetV2_torchvision, self).__init__()
        self.model_backbone = mobilenet_v2(pretrained=False, width_mult=width, num_classes=3)

    def forward(self, x):
        # Tap the feature extractor at three depths to build a multi-scale pyramid
        x1 = self.model_backbone.features[:7](x)
        x2 = self.model_backbone.features[7:14](x1)
        x3 = self.model_backbone.features[14:18](x2)
        return [x1, x2, x3]
```

The FPN:

```python
import torch.nn as nn
from mmcv.cnn import xavier_init  # weight-init helper used by the original code

class FPN(nn.Module):
    def __init__(self):
        super(FPN, self).__init__()
        # 1x1 lateral convs projecting each backbone level to 48 channels
        self.conv1 = nn.Conv2d(8, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.conv2 = nn.Conv2d(24, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.conv3 = nn.Conv2d(80, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)

        self.init_weights()

    def init_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                xavier_init(m, distribution='uniform')

    def forward(self, inputs):
        """Forward function."""
        laterals = []

        laterals.append(self.conv1(inputs[0]))  # inputs[0] == x1
        laterals.append(self.conv2(inputs[1]))  # inputs[1] == x2
        laterals.append(self.conv3(inputs[2]))  # inputs[2] == x3

        return laterals
```
The above is the simplified code; the output of the backbone is taken as the input of the FPN, so the model's final output is the FPN's list of tensors (see the sketch below).
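For context, roughly how the two parts compose; `BackboneFPN` is a hypothetical wrapper, and `width=0.25` is an assumption inferred from the 8/24/80 lateral channel counts:

```python
import torch
import torch.nn as nn

class BackboneFPN(nn.Module):
    """Hypothetical wrapper: the model's final output is a list of
    tensors, which leaves a trailing prim::TupleUnpack in the traced graph."""
    def __init__(self):
        super().__init__()
        # width=0.25 is an assumption: it yields 8/24/80 channels at the
        # three tapped depths, matching the FPN's lateral convs.
        self.backbone = MobileNetV2_torchvision(width=0.25)
        self.fpn = FPN()

    def forward(self, x):
        return self.fpn(self.backbone(x))  # list output, no downstream consumer

model = BackboneFPN().eval()
outs = model(torch.randn(1, 3, 320, 320))
print([o.shape for o in outs])  # three [1, 48, H, W] feature maps
```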

@aa12356jm
Author

@zheng-ningxin @J-shang Is there any progress?

@zheng-ningxin
Contributor

@aa12356jm Yes. If a middle layer of a model returns a tuple of tensors, there is no problem: the TupleUnpack will be manually expanded. But if the entire model returns a tuple of tensors, the TupleUnpack has no subsequent nodes, so it cannot be manually expanded and is retained in the graph. I'll submit a PR to fix this ASAP.
FYI,
issue3645
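A minimal illustration of the two cases (hypothetical modules, not from this thread):

```python
import torch
import torch.nn as nn

class Inner(nn.Module):
    def __init__(self):
        super().__init__()
        self.a = nn.Conv2d(3, 8, 1)
        self.b = nn.Conv2d(3, 8, 1)

    def forward(self, x):
        return self.a(x), self.b(x)  # a tuple built inside the graph

class TupleConsumed(nn.Module):
    """Case 1: the tuple is unpacked and consumed inside the model, so the
    traced prim::TupleUnpack has successor nodes and can be expanded."""
    def __init__(self):
        super().__init__()
        self.inner = Inner()

    def forward(self, x):
        y0, y1 = self.inner(x)
        return y0 + y1  # single-tensor output

class TupleReturned(nn.Module):
    """Case 2: the tuple is the whole model's output, so the trailing
    prim::TupleUnpack has no successors and triggered the error."""
    def __init__(self):
        super().__init__()
        self.inner = Inner()

    def forward(self, x):
        return self.inner(x)  # tuple output

x = torch.randn(1, 3, 16, 16)
print(TupleConsumed()(x).shape)              # torch.Size([1, 8, 16, 16])
print([t.shape for t in TupleReturned()(x)])
```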

@aa12356jm
Author

@zheng-ningxin I got it, thanks. Looking forward to your PR.

@aa12356jm
Author

Code for debugging:

```python
import torch, time
from nni.compression.pytorch.utils.counter import count_flops_params

import nni
from nni.compression.pytorch import apply_compression_results, ModelSpeedup

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model_dir = '/home/notebook/code/personal/study_deeplearning/nanodet-main-new-fa45577/nanodet-main-fpn/tools/debug_for_issue/'
checkpoint_native = torch.load(model_dir + "model_native_net.pkl", map_location=device)  # the network before pruning
model_net_native = checkpoint_native["net"]

model_path = model_dir + "/pruned_model_last.pth"
mask_model_path = model_dir + "/mask_model_last.pth"
dummy_input = torch.randn([1, 3, 320, 320]).to(device)

model_net_native.load_state_dict(torch.load(model_path))
model_net_native.eval()
model_speedup = model_net_native

apply_compression_results(model_speedup, mask_model_path, device)
m_speedup = ModelSpeedup(model_speedup, dummy_input, mask_model_path, device)
m_speedup.speedup_model()

flops, params, results = count_flops_params(model_speedup, dummy_input)
print(f"FLOPs: {flops}, params: {params}")

start = time.time()
for _ in range(32):
    use_speedup_out = model_speedup(dummy_input)
print('elapsed time when use speedup: ', time.time() - start)
```

@johnnylili

Have you solved this problem? My model structure is very similar to yours. I switched to the branch named speedup_v2, but there are still problems with inference.

@zheng-ningxin
Contributor

@johnnylili Could you please show the code snippet of your model definition, so that I can run it with speedup_v2 and check what's wrong? Thanks.

@johnnylili

@zheng-ningxin Hi, may I add you on WeChat? That would make it easier for me to send you the code and discuss it with you.

@zheng-ningxin
Contributor

@johnnylili OK, please send me your WeChat ID through the email Ningxin.Zheng@microsoft.com.

@zheng-ningxin
Contributor

@johnnylili Hi, I have added support for the model you sent me. Please pull the latest version of speedup_v2 and try again; if you run into any problems, feel free to contact me through email or WeChat, thanks~

@JohnsenJiang

JohnsenJiang commented Jul 7, 2021

"RuntimeError: Has not supported infering output shape from input shape for module/function: prim::TupleUnpack, .prim::TupleUnpack.46"

multiple outputs of a neural network still arise the error above. thanks for share the solution. I had tried the newest nni sourcecode.

```python
import torch.nn as nn
from torchvision import models

class SubNet(nn.Module):
    def __init__(self):
        super(SubNet, self).__init__()
        self.conv1 = nn.Conv2d(512, 1024, 3, 1)
        self.conv2 = nn.Conv2d(512, 1024, 5, 1)

    def forward(self, x):
        out0 = self.conv1(x)
        out1 = self.conv2(x)
        return out0, out1

class MultiOutputsNet(nn.Module):
    def __init__(self):
        super(MultiOutputsNet, self).__init__()
        self.backbone = models.vgg16_bn(pretrained=True).features
        self.subnet0 = SubNet()
        # self.subnet1 = SubNet()

    def forward(self, x):
        x = self.backbone(x)
        a0, a1 = self.subnet0(x)
        # b0, b1 = self.subnet1(x)
        return a0, a1
```
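For completeness, a minimal sketch of how this net would be fed to speedup, mirroring the debug script earlier in the thread; `mask.pth` is a placeholder for a mask file exported by an NNI pruner, not something produced here:

```python
import torch
from nni.compression.pytorch import apply_compression_results, ModelSpeedup

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = MultiOutputsNet().to(device).eval()
dummy_input = torch.randn(1, 3, 224, 224).to(device)

# 'mask.pth' is a hypothetical mask file exported beforehand by a pruner.
apply_compression_results(model, 'mask.pth', device)
ModelSpeedup(model, dummy_input, 'mask.pth', device).speedup_model()
```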

@zheng-ningxin
Contributor

zheng-ningxin commented Jul 7, 2021

@JohnsenJiang Have you tried this branch: https://github.com/microsoft/nni/pull/3462? Please let me know if you still fail on that branch.

@JohnsenJiang

The net I mentioned above works on your speedup_v2 branch. Thanks for your help! It was my fault for misunderstanding your earlier reply. @zheng-ningxin
