
Change broadcast Add/Mul to element-wise Add/Mul in Detect layer #4811

Closed
jebastin-nadar opened this issue Sep 15, 2021 · 17 comments · Fixed by #4833 or #5136
Labels: enhancement (New feature or request)

Comments

@jebastin-nadar
Contributor

🚀 Feature

Motivation

The ONNX model produced by export.py is not compatible with OpenCV's DNN module for inference (even with --simplify), as mentioned in #4471 and opencv/opencv#20072.

The problematic nodes are two broadcast Add and Mul nodes in the final Detect layer. OpenCV's DNN module currently cannot handle these broadcast operations, which leads to errors.

[Screenshot: ONNX graph with the problematic broadcast Add and Mul nodes highlighted]

Pitch

The add node comes from the broadcast add of self.grid

xy = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * self.stride[i] # xy

and the mul node from the broadcast mul of self.anchor_grid

wh = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i].view(1, self.na, 1, 1, 2) # wh

Both grid and anchor_grid are constant, so I suggest expanding these tensors to their respective input sizes using PyTorch's expand or repeat operation, so that an element-wise operation is exported (see the sketch below). I have tried modifying the Detect layer to expand these tensors, but additional nodes end up in the final ONNX model.
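
A minimal sketch of the idea, using the shapes of the yolov5s P3 output (na=3 anchors on an 80×80 grid); the names mirror the Detect layer but the snippet is only illustrative, not the actual patch:

import torch

bs, na, ny, nx, no = 1, 3, 80, 80, 85
y = torch.rand(bs, na, ny, nx, no)                 # sigmoid output of one Detect scale

# current code: grid is (1, 1, ny, nx, 2), so the Add is exported as a broadcast op
grid = torch.rand(1, 1, ny, nx, 2)
xy_broadcast = y[..., 0:2] * 2. - 0.5 + grid

# proposed: pre-expand the constant grid to the full (1, na, ny, nx, 2) shape,
# so the same Add no longer needs any broadcasting
grid_full = grid.expand(1, na, ny, nx, 2).contiguous()
xy_elementwise = y[..., 0:2] * 2. - 0.5 + grid_full

print(torch.allclose(xy_broadcast, xy_elementwise))  # True - the numerics are unchanged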

I request @glenn-jocher or another contributor to take a look at this so that the exported YOLOv5 ONNX model can be used in OpenCV for faster inference.

@jebastin-nadar added the enhancement label on Sep 15, 2021
@github-actions
Contributor

github-actions bot commented Sep 15, 2021

👋 Hello @SamFC10, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.

Requirements

Python>=3.6.0 with all requirements.txt dependencies installed, including PyTorch>=1.7. To get started:

$ git clone https://github.com/ultralytics/yolov5
$ cd yolov5
$ pip install -r requirements.txt

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

CI CPU testing

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), validation (val.py), inference (detect.py) and export (export.py) on macOS, Windows, and Ubuntu every 24 hours and on every commit.

@glenn-jocher
Member

glenn-jocher commented Sep 15, 2021

@SamFC10 thanks for the explanation! Expanding these may seem simple, but be advised that input shapes are constantly changing, so if you expand you must check shapes and redefine the grids for every batch (see the sketch below), which will slow things down and is not required for PyTorch inference or training.
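
For context, a standalone sketch of the kind of per-forward check this implies (illustrative names, not the exact yolov5 code):

import torch

def ensure_grid(grid, ny, nx, na=3):
    # rebuild the constant grid only when the feature-map size actually changes
    if grid is None or grid.shape[2:4] != (ny, nx):
        yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])
        grid = torch.stack((xv, yv), 2).expand(1, na, ny, nx, 2).float()
    return grid

grid = None
for ny, nx in [(80, 80), (80, 80), (48, 80)]:   # e.g. different input shapes per batch
    grid = ensure_grid(grid, ny, nx)            # rebuilt only when the shape changes
    print(grid.shape)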

The fastest and easiest way to incorporate your ideas into the official codebase is to submit a Pull Request (PR) implementing your idea, and if applicable providing before and after profiling/inference/training results to help us understand the improvement your feature provides. This allows us to directly see the changes in the code and to understand how they affect workflows and performance.

Please see our ✅ Contributing Guide to get started.

@glenn-jocher
Member

@SamFC10 can you verify that expanding works with DNN?

self.grid[i] = self.grid[i].expand(bs, self.na, -1, -1, -1)
self.anchor_grid[i] = self.anchor_grid[i].expand(bs, -1, ny, nx, -1)

@jebastin-nadar
Contributor Author

self.anchor_grid[i] = self.anchor_grid[i].expand(bs, -1, ny, nx, -1)

This causes an error during creation of the ONNX model:

File "/content/yolov5/models/yolo.py", line 62, in forward
    self.anchor_grid[i] = self.anchor_grid[i].expand(bs, -1, ny, nx, -1)
RuntimeError: The expanded size of the tensor (1) must match the existing size (80) at non-singleton dimension 3. 
              Target sizes: [1, 3, 1, 1, 2].  Tensor sizes: [3, 80, 80, 2]

@jebastin-nadar
Contributor Author

slow things down and is not required for PyTorch inference or training

This was my concern as well. I have tried a few methods myself to expand these grids to the correct shape while minimizing any computation overhead.

  • For the broadcast Add node, I expanded the grid in _make_grid() itself and modified the call to _make_grid():
- def _make_grid(nx=20, ny=20):
+ def _make_grid(nx=20, ny=20, na=3):
        yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])
-        return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float()
+        return torch.stack((xv, yv), 2).expand((1, na, ny, nx, 2)).float()

This makes the addition element-wise without any significant overhead (I think!).

  • Expanding self.anchor_grid seems to be more problematic; every solution I have tried adds extra nodes to the final ONNX model (with --simplify, these nodes are removed):
- wh = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i].view(1, self.na, 1, 1, 2)  # wh
+ wh = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i].view(1, self.na, 1, 1, 2).expand(1, self.na, ny, nx, 2)  # wh

Without --simplify (notice the extra nodes on the right):
[Screenshot: ONNX graph exported without --simplify, with the extra nodes visible]

With --simplify:
[Screenshot: simplified ONNX graph, with the extra nodes removed]


This new ONNX model exported with --simplify runs perfectly in OpenCV DNN. The model exported without --simplify has some extra nodes, so I wonder if there is a better way of expanding anchor_grid, or of constructing anchor_grid from anchors differently.
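
One possible direction, sketched below, is to build anchor_grid at the target spatial size alongside the grid, so the Mul in the decode is already element-wise. The anchor values and stride are illustrative and this is only a sketch of the idea, not necessarily what the eventual PR does; whether the construction ops themselves fold away in the exported graph still depends on constant folding or --simplify:

import torch

def make_grids(anchors, stride, ny, nx):
    # returns (grid, anchor_grid), both shaped (1, na, ny, nx, 2)
    na = anchors.shape[0]
    yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])
    grid = torch.stack((xv, yv), 2).expand(1, na, ny, nx, 2).float()
    # scale anchors to input pixels, then tile them over the full grid
    anchor_grid = (anchors * stride).view(1, na, 1, 1, 2).expand(1, na, ny, nx, 2).float()
    return grid, anchor_grid

anchors = torch.tensor([[10., 13.], [16., 30.], [33., 23.]])   # P3 anchors (illustrative)
grid, anchor_grid = make_grids(anchors, stride=8, ny=80, nx=80)

y = torch.rand(1, 3, 80, 80, 85)
wh = (y[..., 2:4] * 2) ** 2 * anchor_grid        # element-wise, no broadcasting needed
print(grid.shape, anchor_grid.shape, wh.shape)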

@glenn-jocher
Member

@SamFC10 got it! Please submit a PR with the fix that works for DNN and we will take a look at it there. There will probably always be added ops that must be balanced against improved exportability; maybe we can introduce an --expand flag for Detect() that is true only for ONNX export.

@GioFic95

GioFic95 commented Oct 4, 2021

This new ONNX model exported with --simplify runs perfectly in OpenCV DNN.

Hi @SamFC10, I tried your fix, but I still have some issues integrating YOLOv5 with OpenCV DNN. Could you please share the code you use for inference? Thank you very much in advance.

@jebastin-nadar
Contributor Author

@GioFic95 Make sure you are using my fork of yolov5, as my fixes haven't been merged yet. Check out the export-dnn-simple branch for the fix in the comment above, or the export-dnn branch for the fix in the PR.

git clone --single-branch --branch export-dnn-simple https://github.com/SamFC10/yolov5.git
# git clone --single-branch --branch export-dnn https://github.com/SamFC10/yolov5.git
cd yolov5
python3 export.py --weights yolov5s.pt --include onnx --simplify

Code:

import numpy as np
import cv2

inp = np.random.rand(1, 3, 640, 640).astype(np.float32)
net = cv2.dnn.readNetFromONNX('yolov5s.onnx')
net.setInput(inp)
out = net.forward()
print(out.shape)

This returns (1, 25200, 85) with both branches.
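
For reference, each of the 25200 rows of that output is [cx, cy, w, h, objectness, 80 class scores] in pixels of the 640×640 network input, so a minimal post-processing pass looks roughly like this (the thresholds and the omission of letterbox rescaling are illustrative simplifications):

import numpy as np
import cv2

def postprocess(pred, conf_thres=0.25, iou_thres=0.45):
    # pred: (1, 25200, 85) array from net.forward()
    p = pred[0]
    scores = p[:, 4] * p[:, 5:].max(1)            # objectness * best class score
    classes = p[:, 5:].argmax(1)
    keep = scores > conf_thres
    p, scores, classes = p[keep], scores[keep], classes[keep]
    # convert cx,cy,w,h -> top-left x,y,w,h for cv2.dnn.NMSBoxes
    boxes = np.stack([p[:, 0] - p[:, 2] / 2, p[:, 1] - p[:, 3] / 2, p[:, 2], p[:, 3]], 1)
    idx = np.array(cv2.dnn.NMSBoxes(boxes.tolist(), scores.tolist(), conf_thres, iou_thres))
    idx = idx.astype(int).flatten()
    return boxes[idx], scores[idx], classes[idx]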

@GioFic95

GioFic95 commented Oct 5, 2021

@SamFC10 I checked that I get the same output shape as you, but nonetheless the results obtained via OpenCV aren't the same as those I obtain via PyTorch.

That is, applying this code to the OpenCV DNN output (in C++), I get these results:
[Image: detections obtained via OpenCV DNN]

While the original results, obtained directly with the trained PyTorch model, are the following:
[Image: detections obtained directly from the trained PyTorch model]

Moreover, these are the results obtained using detect.py with the model exported from your repo (the same model used with OpenCV DNN):
[Image: detections from detect.py with the exported ONNX model]

Do you have any advice on how to solve the issue, or a hypothesis about its cause?
Thank you very much again.

@jebastin-nadar
Contributor Author

@GioFic95
To verify that the new ONNX model works with OpenCV DNN, I did the following:

git clone --single-branch --branch export-dnn-simple https://github.com/SamFC10/yolov5.git
cd yolov5
python3 export.py --weights yolov5s.pt --include onnx --simplify

1. Inference using ONNXRuntime

python3 detect.py --weights yolov5s.onnx

[Image: zidane.jpg detections with ONNXRuntime]

2. Inference using OpenCV DNN

To use OpenCV DNN instead of onnxruntime in detect.py, make these changes.
Line 89:

- check_requirements(('onnx', 'onnxruntime'))
- import onnxruntime
- session = onnxruntime.InferenceSession(w, None)
+ net = cv2.dnn.readNetFromONNX(w)

Line 147:

- pred = torch.tensor(session.run([session.get_outputs()[0].name], {session.get_inputs()[0].name: img}))
+ net.setInput(img)
+ pred = torch.tensor(net.forward())

Again, running:

python3 detect.py --weights yolov5s.onnx

[Image: zidane.jpg detections with OpenCV DNN]

No visible difference


hypothesis about its cause

I suspect there is something wrong with the post-processing in your C++ code (see #708 (comment)). I'm not an expert in C++, so I can't point out exactly where the mistake is. To check this, maybe try the opposite of what I did: in your C++ code, use onnxruntime instead of OpenCV with the ONNX model exported from the master repository. If the outputs are still wrong, then the post-processing steps have a bug.
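
A quick way to isolate this is to compare the raw network outputs from both backends on an identical input before any post-processing; if they agree, the bug is downstream. A sketch, assuming onnxruntime and opencv-python are installed and yolov5s.onnx is the exported model:

import numpy as np
import cv2
import onnxruntime

inp = np.random.rand(1, 3, 640, 640).astype(np.float32)

net = cv2.dnn.readNetFromONNX('yolov5s.onnx')
net.setInput(inp)
out_dnn = net.forward()

session = onnxruntime.InferenceSession('yolov5s.onnx', None)
out_ort = session.run(None, {session.get_inputs()[0].name: inp})[0]

# if the two backends agree, any difference in detections comes from post-processing
print(np.abs(out_dnn - out_ort).max())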

@glenn-jocher
Member

@SamFC10 it might be nice to have a --use-dnn flag in detect.py to simplify this comparison.

@msly

msly commented Oct 9, 2021

@SamFC10 DNN (0.554 s) is slower than onnxruntime (0.318 s) on the same ONNX file.

@jebastin-nadar
Contributor Author

@msly Yes, OpenCV inference is slower than onnxruntime (a difference of around 50-100 ms on my device).

The goal of this issue and the related PR is not to improve inference speed, but to make the ONNX export of YOLOv5 compatible with other backends instead of limiting it to onnxruntime.
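
For anyone who wants to reproduce the comparison, a rough timing sketch (default CPU backends, no tuning; absolute numbers will vary by machine):

import time
import numpy as np
import cv2
import onnxruntime

inp = np.random.rand(1, 3, 640, 640).astype(np.float32)
net = cv2.dnn.readNetFromONNX('yolov5s.onnx')
session = onnxruntime.InferenceSession('yolov5s.onnx', None)
inp_name = session.get_inputs()[0].name

def bench(fn, n=10):
    fn()                                 # warm-up run
    t0 = time.time()
    for _ in range(n):
        fn()
    return (time.time() - t0) / n

def run_dnn():
    net.setInput(inp)
    return net.forward()

print('cv2.dnn     :', bench(run_dnn))
print('onnxruntime :', bench(lambda: session.run(None, {inp_name: inp})))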

@glenn-jocher
Member

Removed TODO after PR #4833 was merged.

@glenn-jocher linked a pull request on Oct 11, 2021 that will close this issue
@glenn-jocher
Member

glenn-jocher commented Oct 11, 2021

@SamFC10 I've opened a new PR #5136 to add DNN inference to detect.py using your example here:

@GioFic95 Make sure you are using my fork of yolov5, as my fixes haven't been merged yet. Check out the export-dnn-simple branch for the fix in the comment above, or the export-dnn branch for the fix in the PR.

git clone --single-branch --branch export-dnn-simple https://github.com/SamFC10/yolov5.git
# git clone --single-branch --branch export-dnn https://github.com/SamFC10/yolov5.git
cd yolov5
python3 export.py --weights yolov5s.pt --include onnx --simplify

Code:

import numpy as np
import cv2

inp = np.random.rand(1, 3, 640, 640).astype(np.float32)
net = cv2.dnn.readNetFromONNX('yolov5s.onnx')
net.setInput(inp)
out = net.forward()
print(out.shape)

This returns (1, 25200, 85) with both branches.

But I am running into an error on net = cv2.dnn.readNetFromONNX(w):

(venv) glennjocher@Glenns-iMac yolov5 % python detect.py --weights yolov5s.onnx --dnn
detect: weights=['yolov5s.onnx'], source=data/images, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=True
YOLOv5 🚀 v5.0-509-g9d75e42 torch 1.9.1 CPU

[ERROR:0] global /private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/pip-req-build-vy_omupv/opencv/modules/dnn/src/onnx/onnx_importer.cpp (2127) handleNode DNN/ONNX: ERROR during processing node with 2 inputs and 1 outputs: [Unsqueeze]:(390)
Traceback (most recent call last):
  File "/Users/glennjocher/PycharmProjects/yolov5/detect.py", line 306, in <module>
    main(opt)
  File "/Users/glennjocher/PycharmProjects/yolov5/detect.py", line 301, in main
    run(**vars(opt))
  File "/Users/glennjocher/PycharmProjects/yolov5/venv/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/Users/glennjocher/PycharmProjects/yolov5/detect.py", line 92, in run
    net = cv2.dnn.readNetFromONNX(w)
cv2.error: OpenCV(4.5.3) /private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/pip-req-build-vy_omupv/opencv/modules/dnn/src/onnx/onnx_importer.cpp:2146: error: (-2:Unspecified error) in function 'handleNode'
> Node [Unsqueeze]:(390) parse error: OpenCV(4.5.3) /private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/pip-req-build-vy_omupv/opencv/modules/dnn/src/onnx/onnx_importer.cpp:1551: error: (-215:Assertion failed) node_proto.input_size() == 1 in function 'handleNode'

I created the ONNX model simply with python export.py --weights yolov5s.pt --include onnx. Any idea what might be happening?

@jebastin-nadar
Contributor Author

OpenCV(4.5.3)

You need to use the latest version, i.e. 4.5.4, which was released just a few days ago. I have added a fix for this exact error in OpenCV (opencv/opencv#20713), and it will be included in the 4.5.4 release.

opencv-python is still on 4.5.3, so we need to wait until the release containing the fix is published. Once it is released, pip install -U opencv-python should solve the issue.

Meanwhile, models exported with either of the following commands should also work with OpenCV 4.5.3:

python export.py --weights yolov5s.pt --include onnx --opset 11
or
python export.py --weights yolov5s.pt --include onnx --simplify
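
Until the 4.5.4 wheels are published, a small runtime guard can make the failure mode clearer; this is only an illustrative check (yolov5 itself handles this via check_requirements, as noted in the next comment):

import cv2

# default-opset exports need the ONNX importer fix that ships in OpenCV >= 4.5.4;
# older versions fail on the Unsqueeze node shown in the traceback above
version = tuple(int(v) for v in cv2.__version__.split('.')[:3])
if version < (4, 5, 4):
    raise RuntimeError(
        f'OpenCV {cv2.__version__} detected; upgrade with `pip install -U opencv-python`, '
        'or re-export with --opset 11 or --simplify as a workaround.')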

@glenn-jocher
Member

@SamFC10 thanks! I've added your comments to the PR along with a new commented-out line to check the >=4.5.4 requirement; I will uncomment it once that version is released.

            # check_requirements(('opencv-python>=4.5.4',))
