This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Extensions for Mask R-CNN for improved performance #379

Closed

Conversation

seryilmaz

This PR adds C++/CUDA extensions for some functions used for Mask R-CNN (and for some other models as well). Extensions are for generating mask targets for loss computation in mask head, box iou computation, box encode function, and proposal matcher function.
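
For context, a minimal sketch of how one of these extensions might be wired in on the Python side (the _C.box_iou name follows the example used later in this thread; the pure-Python fallback and the (N, 4) xyxy layout are illustrative assumptions, not the PR's exact code):

import torch
from maskrcnn_benchmark import _C  # compiled C++/CUDA extension module

def box_iou(boxes1, boxes2):
    # boxes are (N, 4) and (M, 4) fp32 tensors in (x1, y1, x2, y2) format
    if boxes1.is_cuda and boxes2.is_cuda:
        return _C.box_iou(boxes1, boxes2)  # optimized CUDA kernel
    # pure-Python fallback: pairwise intersection over union
    area1 = (boxes1[:, 2] - boxes1[:, 0]) * (boxes1[:, 3] - boxes1[:, 1])
    area2 = (boxes2[:, 2] - boxes2[:, 0]) * (boxes2[:, 3] - boxes2[:, 1])
    lt = torch.max(boxes1[:, None, :2], boxes2[:, :2])  # (N, M, 2)
    rb = torch.min(boxes1[:, None, 2:], boxes2[:, 2:])  # (N, M, 2)
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, :, 0] * wh[:, :, 1]
    return inter / (area1[:, None] + area2 - inter)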

@facebook-github-bot

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please sign up at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need the corporate CLA signed.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

Contributor

@fmassa left a comment

Thanks for the PR!

I have an initial question: does this PR work with fp16 on the GPU? I see that the kernels use float, but when dispatching to them there is no check that the inputs are fp32 (only is_cuda).

@seryilmaz
Author

We keep coordinate and IoU values in fp32 even in mixed-precision training, so these kernels work in our mixed-precision setup. If you cast the coordinates (bounding-box or mask-polygon coordinates as fetched from the dataset) or the IoU matrix (the input to the proposal matcher) to fp16, some of these kernels won't work.
Do you want us to extend these functions to accept fp16 inputs?
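
For illustration, the kind of dispatch guard being asked about might look like this (a sketch only, assuming the same _C extension module as above; the function name box_iou_cuda is hypothetical):

import torch

def box_iou_cuda(boxes1, boxes2):
    # the kernels are written for float, so reject fp16 explicitly
    # rather than failing inside the kernel
    if boxes1.dtype != torch.float32 or boxes2.dtype != torch.float32:
        raise TypeError("box_iou kernels only support fp32 inputs; "
                        "cast with .float() before calling")
    assert boxes1.is_cuda and boxes2.is_cuda
    return _C.box_iou(boxes1, boxes2)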

@fmassa
Contributor

fmassa commented Jan 25, 2019

@seryilmaz thanks for the explanation!

I think it's OK to leave these kernels fp32-only for now, but it could be a bit confusing if users move the IoU values to fp16.

I'll do a pass on the implementation later today and let you know.

@fmassa
Contributor

fmassa commented Jan 31, 2019

I was thinking about a way of being less intrusive to the Python implementation. What about the following:

1 - all the parts of the Python implementation that you have currently wrapped inside a conditional become functions.
2 - each function body contains only the Python implementation.
3 - each function gets a decorator, which points to the optimized C implementation and dispatches to it if the criteria are satisfied (input types are right, the global flag is enabled).

Here is an example implementation:

def criterion(b1, b2):
    # only take the optimized path when it is globally enabled
    # and both inputs are on the GPU
    if not enable_optim:
        return False
    return b1.bbox.is_cuda and b2.bbox.is_cuda

@optimized_codepath(
    _C.box_iou,
    lambda boxlist1, boxlist2: (boxlist1.bbox, boxlist2.bbox),
    criterion)
def boxlist_iou(boxlist1, boxlist2):
    # all the python stays here
    return iou

The basic idea is that if the criterion is satisfied for the inputs (and the global flag enabling the optimizations is set), the optimized implementation is used; otherwise, the pure Python one is.

This has the benefit of reducing the number of places where the user needs to change something if they want to disable the optimizations, and it also keeps a clean separation between the two implementations.

Thoughts?
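
For concreteness, a minimal sketch of what such an optimized_codepath decorator could look like, built from the description above (the decorator itself is not implemented anywhere in this thread, so the body below is an assumption):

import functools

enable_optim = True  # global flag to turn the optimized code paths on/off

def optimized_codepath(optimized_fn, extract_args, criterion):
    # wraps a pure-Python function and dispatches to `optimized_fn`
    # when `criterion` accepts the inputs; otherwise falls through
    def decorator(python_fn):
        @functools.wraps(python_fn)
        def wrapper(*args, **kwargs):
            if criterion(*args, **kwargs):
                return optimized_fn(*extract_args(*args, **kwargs))
            return python_fn(*args, **kwargs)
        return wrapper
    return decorator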

@slayton58
Contributor

That could work; I'd be a little concerned about how verbose it gets as the number of arguments increases, though.
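
One possible way to keep call sites short as the argument count grows (an assumption, not something settled in this thread) is to give the decorator defaults, so the argument-extraction hook is only spelled out when the C signature differs from the Python one:

def optimized_codepath(optimized_fn, extract_args=None, criterion=None):
    # defaults: forward the original arguments unchanged and gate only
    # on the global flag, so simple call sites pass just `optimized_fn`
    extract = extract_args or (lambda *args, **kwargs: args)
    check = criterion or (lambda *args, **kwargs: enable_optim)
    def decorator(python_fn):
        @functools.wraps(python_fn)
        def wrapper(*args, **kwargs):
            if check(*args, **kwargs):
                return optimized_fn(*extract(*args, **kwargs))
            return python_fn(*args, **kwargs)
        return wrapper
    return decorator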

@botcs
Contributor

botcs commented Feb 25, 2019

Hi @slayton58,

This PR seems really valuable, since the CPU cropping / resizing is causing some serious overhead.
Are you planning to implement @fmassa's review?

Thank you :)

@slayton58
Contributor

@botcs Well, this is @seryilmaz's PR, so...

FWIW, I feel that a concrete implementation framework to handle this kind of optimisation would be a big help.

@botcs
Contributor

botcs commented Feb 25, 2019

@slayton58 oh sorry, I forgot to mention @seryilmaz as well.

BTW, what do you mean by:

a concrete implementation framework

@slayton58
Contributor

We have a concept from Francisco above for a framework to handle these optimizations but no actual implementation to work from.

@zimenglan-sysu-512
Contributor

Hi @seryilmaz,
Curious, what's the current state of this PR?

@botcs botcs closed this Sep 17, 2019
@gottbrath

@fmassa -- I got a ping on this the other day from NVIDIA. Can you look at this again? It looks like @slayton58 didn't think he had enough detail from your comments above to carry this forward.

@fmassa
Contributor

fmassa commented Oct 21, 2019

@gottbrath @slayton58 maskrcnn-benchmark is now deprecated in favor of detectron2. I'd recommend opening an issue to discuss those optimizations in detectron2.
