Option to return the depth map for just the ROI of the detected object. #125

Open · Luxonis-Brandon opened this issue Jun 1, 2020 · 0 comments
Labels: enhancement (New feature or request)

Luxonis-Brandon (Contributor) commented Jun 1, 2020

Start with the why:

As it stands now, DepthAI returns the XYZ location of the detected object using a distance (z) averaged over the sub-region defined by the padding_factor (i.e. some subset of the overall bounding-box area).
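A rough host-side re-creation of that averaging, for illustration only (the real computation happens on-device; the function name, the normalized bounding-box convention, and the shrink-toward-center reading of padding_factor are assumptions here):

```python
import numpy as np

def average_depth_over_padded_roi(depth, bbox, padding_factor=0.3):
    # depth: HxW uint16 depth frame (mm); bbox: normalized (x1, y1, x2, y2).
    h, w = depth.shape
    x1, y1, x2, y2 = bbox
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    # Shrink the box toward its center by padding_factor (assumed semantics).
    half_w = (x2 - x1) / 2.0 * padding_factor
    half_h = (y2 - y1) / 2.0 * padding_factor
    xs, xe = int((cx - half_w) * w), int((cx + half_w) * w)
    ys, ye = int((cy - half_h) * h), int((cy + half_h) * h)
    sub = depth[ys:ye, xs:xe]
    valid = sub[sub > 0]              # zero means "no depth measured"
    return float(valid.mean()) if valid.size else 0.0
```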

This works for a bunch of use cases, and the padding_factor allows tuning it for custom objects.

But for some objects, there may be more information of interest on the host side. An example is when the padding factor ends up averaging the depth of the object together with something else that falls within this sub-region of interest.

An example of such an issue is shown below, where the padding factor (shown in blue) is partially over both the person's neck and the wall in the background, so the location of the person and the wall is averaged:

[Image: example detection in which the padding-factor region (blue) overlaps both the person's neck and the wall behind them]

To correct the above issue, it is possible to run the inference directly on the right camera with the -cam right option, which aligns the depth output and the object detector exactly. But in some cases this is undesirable, for example when the neural inference requires color information.

So as it stands now, if the user on the host wants the depth information corresponding to the region of interest, they need to (see the sketch after this list):

  • request the meta_out to get the region of interest
  • request the whole depth frame (1280 × 720 × 2 bytes ≈ 1.84 MB/frame)
  • pull out the ROI from the whole depth frame.
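As a concrete illustration of that third step, a minimal host-side sketch, assuming the full depth frame has already arrived as a 1280x720 uint16 numpy array and the bounding box came from meta_out in normalized coordinates (the names here are hypothetical, not DepthAI API calls):

```python
import numpy as np

def crop_roi_depth(depth_frame: np.ndarray, bbox) -> np.ndarray:
    """Cut the detection's ROI out of the full depth frame on the host.

    depth_frame: 1280x720 uint16 depth image (mm), pulled over USB
    bbox: normalized (x_min, y_min, x_max, y_max) from the detection metadata
    """
    h, w = depth_frame.shape
    x_min, y_min, x_max, y_max = bbox
    # Convert normalized coordinates to pixel indices, clamped to the frame.
    x1, x2 = max(0, int(x_min * w)), min(w, int(x_max * w))
    y1, y2 = max(0, int(y_min * h)), min(h, int(y_max * h))
    return depth_frame[y1:y2, x1:x2]

# Usage: roi_depth = crop_roi_depth(depth_frame, detection_bbox)
```

The crop itself is cheap; the cost is that the entire 1.84 MB frame has to cross the USB link first, which is exactly the problem described next.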

The disadvantages of this are twofold:

  • Some hosts are USB2-only, so 1.84 MB/frame is too much to handle at 30 FPS (that's 55 MB/s, and USB2 can only sustain ~30 MB/s), and some hosts simply can't keep up with that much data regardless.
  • So on these sorts of hosts, the frame rate drops dramatically if the user wants the depth information from the ROI in order to run more sophisticated algorithms for extracting the object location (e.g. keeping only the front-most depth results above a threshold; see the sketch after this list).
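As one example of such an algorithm, a minimal sketch of the front-most-depth idea, operating on the roi_depth crop from the earlier snippet (the fixed depth band is just one illustrative choice):

```python
import numpy as np

def front_most_depth(roi_depth: np.ndarray, band_mm: float = 150.0) -> float:
    """Estimate object distance from only the nearest depth samples in the ROI.

    Keeps samples within band_mm of the closest valid reading, discarding
    background (e.g. the wall behind the person) before averaging.
    """
    valid = roi_depth[roi_depth > 0]           # zero means "no measurement"
    if valid.size == 0:
        return 0.0
    nearest = float(valid.min())
    front = valid[valid <= nearest + band_mm]  # keep the front-most band only
    return float(front.mean())
```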

So for the example of a person detector, the ROI is often only ~10% of the frame (or less), which brings the bandwidth down to 5.5 MB/s (instead of 55 MB/s), fitting easily within USB2 and using far less host-side CPU. The arithmetic is worked below.
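For reference, the bandwidth arithmetic behind those figures:

```python
FPS = 30
FULL_FRAME_BYTES = 1280 * 720 * 2         # uint16 depth: ~1.84 MB/frame
full_rate = FULL_FRAME_BYTES * FPS / 1e6  # ~55.3 MB/s, over USB2's ~30 MB/s
roi_rate = full_rate * 0.10               # ROI at ~10% of the frame: ~5.5 MB/s
print(f"full frame: {full_rate:.1f} MB/s, ROI only: {roi_rate:.1f} MB/s")
```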

So if we allowed the option to return the actual depth data for -only- the region of interest (ROI) output from the neural network, the host could process this depth data without the high bandwidth, high host load (and potentially low frame rate) associated with pulling the whole depth frame.

The what:

  • Implement an API option which returns the depth for the region(s) of interest (ROIs) from the object detector.
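Purely as an illustration of what that option might look like from the host side (the stream name and config keys below are hypothetical, invented for this sketch, not existing DepthAI API):

```python
# Hypothetical configuration -- illustrative only, not the current API.
config = {
    'streams': ['metaout', 'depth_roi'],   # 'depth_roi' is a made-up stream
    'ai': {'blob_file': 'mobilenet-ssd.blob'},
    'depth': {'return_roi_depth': True},   # made-up flag for this proposal
}

# Per detection, the host would then receive a small packet holding the
# bounding box plus only the uint16 depth crop for that box, instead of
# the full 1280x720 frame.
```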