(Upsample) How can I use onnx parser with opset 11? #284
Comments
Hi @dhkim0225, how are you parsing the ONNX model?
@rmccorm4 Thank you for the reply (really, thanks a lot). I tried both of them. Here's my code.

PyTorch to ONNX

```python
import torch
from torch import onnx  # note: `onnx` here is torch.onnx, not the onnx package

def main(cfg):
    net = get_model(cfg['model'], 0, weight_file=None, verbose=cfg['eval']['verbose'])
    net.eval()
    with torch.no_grad():
        dummy_input = torch.randn(1, 3, 1920, 1920, device='cuda')
        torch_out = net(dummy_input)
        onnx.export(net, dummy_input, "./my_trt/model.onnx",
                    export_params=True,
                    verbose=False,
                    training=False,
                    input_names=None,
                    output_names=None,
                    operator_export_type=onnx.OperatorExportTypes.ONNX,
                    opset_version=11,
                    do_constant_folding=True,
                    example_outputs=torch_out,
                    strip_doc_string=True,
                    dynamic_axes=None,
                    keep_initializers_as_inputs=True)
```

trtexec
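(The exact trtexec command isn't preserved in the thread; a typical invocation for an explicit-batch ONNX model on TensorRT 7 would look something like the following, where the flags are my assumption rather than a quote from this issue:)

```bash
# Assumed invocation; not quoted from the thread.
trtexec --onnx=./my_trt/model.onnx --explicitBatch
```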
It failed with the following error messages:
Convert with API

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def get_engine(mode):
    EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    with trt.Builder(TRT_LOGGER) as builder, \
            builder.create_network(EXPLICIT_BATCH) as network, \
            trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_batch_size = 1
        builder.fp16_mode = (mode == 'fp16')
        builder.int8_mode = (mode == 'int8')
        builder.max_workspace_size = 1 << 32  # 4 GiB (1 GiB would be 1 << 30)
        with open(onnx_file_path, 'rb') as model:
            parser.parse(model.read())
        print(len(network))  # Printed output == 0. Something is wrong.
        engine = builder.build_cuda_engine(network)
        with open(engine_file_path, "wb") as f:
            f.write(engine.serialize())
        return engine
```

The output messages are the following:
trtexec reports an assertion error at ModelImporter.cpp #L124. I found a similar issue here. When I use onnxruntime, the model works well without any error. Best regards,
Hi @dhkim0225, the appreciation is appreciated 🙂
Re: trtexec errors, I'll look into it, hopefully tomorrow.
Re: Python API, I noticed the comment about the network having 0 layers; that's because parsing failed. For future reference, you can get better output about that by checking the return value of parser.parse(), something like the snippet below.
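(The snippet was cut off here; presumably it's the same error-reporting loop posted in full a few comments below:)

```python
# Check the return value of parser.parse() and dump any parser errors.
with open(onnx_file_path, 'rb') as f:
    if not parser.parse(f.read()):
        print('ERROR: Failed to parse the ONNX file.')
        for error in range(parser.num_errors):
            print(parser.get_error(error))
```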
The appreciation for the appreciation is appreciated :p I made a new clean Docker image and built TRT from scratch. Then len(network) returns the right value. Maybe this is an issue with PyTorch. With the onnxruntime package this model works well, but when I call the following test code, a segmentation fault occurs. It's really strange.

```python
import onnx

onnx_model = onnx.load(args.onnx_model_path)
onnx.checker.check_model(onnx_model)  # segmentation fault here
```

I made my ONNX model with the NGC container (nvcr.io/nvidia/pytorch:19.12-py3), which contains PyTorch 1.4.0a. This is a part of my model. We can see the 'Constant' module and can find quite a similar structure in the PyTorch issue above. Please let me know if you find anything about this issue. Best regards,
Hey @dhkim0225, as a potential workaround in the meantime while I look into it: just curious, what happens when you use PyTorch 1.3 or 1.2 to export the model? 1.4 is pretty bleeding edge, so I'm not sure if it introduced anything that might be causing issues.
Well, I didn't test with PyTorch 1.2 since it only supports opset versions up to 10. For PyTorch 1.3.0 and 1.3.1, not only did the ONNX model fail the checker, it also didn't work with onnxruntime. Netron's visualizations of those models are the same as the model from torch 1.4.0. I'll check len(network) after parser.parse() if you want! :)
Sure, I'm curious what's different with 1.3 / 1.3.1:

```python
if not parser.parse(f.read()):
    print('ERROR: Failed to parse the ONNX file.')
    for error in range(parser.num_errors):
        print(parser.get_error(error))
```
@rmccorm4 Sorry for the late comment. I made 12 ONNX models with PyTorch 1.3.0 and 1.3.1 (6 ONNX models per PyTorch version). You can see my toy code here (a reconstruction follows below). All 12 ONNX models produce the following error. All ONNX models are invalid_graph (checked with onnx.checker.check_graph). The results of parser.get_error(error) and len(network) follow.

network 0: nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)
network 1: nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
network 2: nn.Upsample(scale_factor=2, mode='nearest')
network 3: nn.Upsample((256, 256), mode='bilinear', align_corners=False)
network 4: nn.Upsample((256, 256), mode='bilinear', align_corners=True)
network 5: nn.Upsample((256, 256), mode='nearest')
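(The linked toy code isn't preserved in this thread; reconstructed from the six configurations above, the export loop presumably looked roughly like this, with the input shape and file names being my assumptions:)

```python
import torch
import torch.nn as nn

# The six Upsample variants listed above, exported as 0.onnx ... 5.onnx.
configs = [
    nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
    nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True),
    nn.Upsample(scale_factor=2, mode='nearest'),
    nn.Upsample((256, 256), mode='bilinear', align_corners=False),
    nn.Upsample((256, 256), mode='bilinear', align_corners=True),
    nn.Upsample((256, 256), mode='nearest'),
]
dummy_input = torch.randn(1, 3, 128, 128)  # assumed input shape
for i, net in enumerate(configs):
    torch.onnx.export(net, dummy_input, '{}.onnx'.format(i), opset_version=11)
```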
Hi @dhkim0225, regarding the onnx.checker.check_model(model) or onnx.checker.check_graph(model.graph) failures:
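(For reference, a minimal way to run both checks being discussed; the model path is a placeholder:)

```python
import onnx

model = onnx.load('model.onnx')        # placeholder path
onnx.checker.check_model(model)        # whole-model check
onnx.checker.check_graph(model.graph)  # graph-only check, as used above
```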
Regarding the real issue reported by TensorRT when trying to parse the model, I'm guessing it's coming from the Upsample op. I've seen a few other users experience similar difficulties, which I had hoped was fixed in TRT 7, but it seems not. Although I'm not sure if this is a TRT issue or a PyTorch/ONNX issue. I was hoping that using
Many people are on holiday this week and next week, but hopefully I'll be able to find something useful in the next couple of weeks.
Thank you for taking the trouble to help me. I'll be waiting for new comments. Please tell me if there's anything I can do to help. Sincerely yours,
Per ONNX, this seems to be a limitation in the supported parameters of the Upsample (or, indirectly, Resize) op:
Got the same problem here. The opset_version here is 11, and I get a TensorRT error:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TestModel(nn.Module):
    def __init__(self):
        super(TestModel, self).__init__()

    def forward(self, x):
        x = F.interpolate(x, (256, 256), mode='bilinear')
        return x

torch_model = TestModel()
dummy_input = torch.randn((1, 3, 256, 256))
torch_out = torch.onnx.export(torch_model,
                              dummy_input,
                              'test_model.onnx',
                              verbose=True,
                              opset_version=11)
```

@lara-hdr Hi, I saw you worked with Torch and ONNX in other issues; could you please help analyze this problem? It has bothered me for days :(
@ksnzh @dhkim0225 @rmccorm4 What's your onnx version? Mine was 1.6.0; then I installed 1.4.0 via pip install onnx==1.4.0 with PyTorch 1.3.1, and the constant magically disappeared!
I use onnx version 1.6.0 with PyTorch 1.2.0.
Hey, glad to find someone in the same boat. I found that the root of this problem is onnx 1.6.0; after reinstalling 1.4.0, bilinear upsampling works without any problem.
@qizhen816, I tested your code with PyTorch master (nightly) and onnx 1.6.0, and the issue seems fixed; could you confirm?
@lara-hdr It's not working... I just found out that the reason my code worked previously is that I used opset_version==10, so the issue may not be caused by onnx. And TensorRT shows no bug.
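(For anyone following along, a minimal sketch of the opset-10 export just mentioned, reusing the TestModel repro above; the output file name is a placeholder:)

```python
import torch

# Exporting with opset 10 emits the older Upsample op instead of the
# opset-11 Resize, which sidestepped the parser error in this case.
torch.onnx.export(torch_model, dummy_input, 'test_model_opset10.onnx',
                  verbose=True,
                  opset_version=10)
```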
@lara-hdr Sorry, I tested the code as well as my project code with Torch and ONNX these days; they are all good. The Upsample issue remains with ONNX to TensorRT. Thanks for the help :)
Hi @dhkim0225, so after looking into this, the original problem below:

Thanks to @kevinch-nv, it looks like the root cause of the issue was how PyTorch exports opset-11 resizes. Looking at the export code (https://github.com/pytorch/pytorch/blob/master/torch/onnx/symbolic_opset11.py#L177), PyTorch inserts empty "constant" layers for optional inputs, and the ONNX parser did not accept this case. It should now be fixed by this PR: onnx/onnx-tensorrt#369. To apply those changes, you can build the OSS components (https://github.com/rmccorm4/tensorrt-utils/blob/20.01/OSS/build_OSS.sh) on top of your TRT install / container like so:

```bash
wget https://raw.githubusercontent.com/rmccorm4/tensorrt-utils/20.01/OSS/build_OSS.sh
source build_OSS.sh
```

But there is still another issue applying to the following models:

This is because TensorRT only supports asymmetric resizing at the moment:

Your model 5.onnx -

Lastly, 2.onnx -

It seems like PyTorch generates a pretty complex graph for 2.onnx even though it should be similar to 5.onnx. I tried using
I'm the author of both the ONNX Resize op in opset 11 and onnx-simplifier. Could you please send your model to my email daquexian566@gmail.com so that I can try to help? Thanks!
@daquexian Thank you for your attention to this problem. Here is my environment: Here are the details when exporting the model to onnx: This is the onnx model file (opset 11): Hope this information is helpful for you.
Per the original issue "(Upsample) How can I use onnx parser with opset 11?", this has been solved in the upstream ONNX parser per the posts above. However, another open issue from this thread is:

If you need this, please open a separate RFE and comment there on your use cases; the more info, the better.
@rmccorm4 I ran into this error:
I had the same error as in @rmccorm4's post. Here is my environment:
@qizhen816 Hi, I have the same problem as you. Have you solved it? Would you mind telling me how you solved it?
@ycchanau Sorry for the delay; my ONNX problem ended with torch 1.4.0. At first the error occurred in onnx, so I tried to make the node look normal. But after testing with onnx-runtime it seemed OK. In the end, it turned out that TensorRT didn't support the ONNX bilinear upsample op, so I gave up and used other inference libraries.
@qizhen816 Thanks so much for your reply. May I know which inference libraries you are using now? I have been stuck on TensorRT for a week.
@ycchanau lol, I'm from China, so I tried two inference engines from Alibaba and Tencent: [MNN](https://github.com/alibaba/MNN) and [NCNN](https://github.com/Tencent/ncnn). Both are rapidly developing, with full support for Windows, Linux, and mobile devices. In my opinion the first one is better; the most important reason is that bilinear upsample works perfectly, haha.
You can try onnx-simplifier. The simplified ONNX file should work for you.
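(For reference, typical onnx-simplifier usage; the file names are placeholders:)

```bash
# Install the simplifier, then simplify the exported model in place.
pip install onnx-simplifier
python3 -m onnxsim model.onnx model_sim.onnx
```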
If I understand correctly, even the latest TensorRT (7.0 and 7.1) does not support bilinear upsampling?
Raised this issue under onnx-tensorrt. Maybe it will help if you are having issues with interpolation.
Description
The onnx parser is basically built with ir_version 3, opset 7 (https://github.com/onnx/onnx-tensorrt/blob/master/onnx_trt_backend.cpp).
Is there any way to use the onnx parser with opset 11 support?
I mean, the parser works only with opset 7.
The parser works well if I use an ir4/opset7 ONNX model, but it doesn't work if I use an ir4/opset11 ONNX model.
It also cannot parse opset 8 and 9.
My ONNX models are made by PyTorch 1.4.0a.
Can I rebuild the parser by changing only the BACKEND_OPSET constant inside onnx_trt_backend.cpp? (See the build sketch below.)
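(For context, a standard out-of-tree build of the OSS parser, against which such a change could be tested; the TensorRT install path is an assumption:)

```bash
# Sketch of a standard onnx-tensorrt OSS build; adjust TENSORRT_ROOT to your install.
git clone --recursive https://github.com/onnx/onnx-tensorrt.git
cd onnx-tensorrt
mkdir build && cd build
cmake .. -DTENSORRT_ROOT=/usr/local/TensorRT  # assumed install path
make -j$(nproc)
```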
Environment
TensorRT Version: 7.0.0
GPU Type: T4
Nvidia Driver Version: 440.33.01
CUDA Version: 10.2.89
CUDNN Version: 7.6.5
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable): 1.4.0
PyTorch Version (if applicable): 1.4.0a