Generate Mask from detection #1764

Open
marcfielding1 opened this issue Sep 26, 2019 · 12 comments

@marcfielding1

Hey Everyone,

So first up great work on this.

I have a perhaps somewhat basic question: I want to use Mask_RCNN to generate masks which I can then relabel. For example, COCO detects a can as a bottle; I want to output the mask for the can and then retrain with the new label "can".

I had a read on this (down the bottom).

In results['masks'] I get that each True/False value indicates whether an object is detected at that pixel. Are the pixel coordinates relative to the ROI of the object or to the whole image?

Could anyone point me in the direction of some reading, or a way of extracting the mask so that I can reuse it for additional training, please?

Thanks.

@marcfielding1 (Author)

Aha, I might have found it. Let me test it out!

cocodataset/cocoapi#131

@marcfielding1 (Author)

So I think this is the answer. I need to test it more; any feedback would be very welcome!

Note that some of these values are hard-coded; I'm just trying to output one mask and one bounding box for testing!

import json
import numpy as np
from pycocotools import mask
from skimage import measure

# r is the results dict from model.detect(); r['masks'] has shape (H, W, N)
temp_mask = r['masks'].astype(int)
# Here we just take the first mask to see if it works
ground_truth_binary_mask = np.array(temp_mask[:, :, 0], dtype=np.uint8)
# pycocotools expects a Fortran-ordered uint8 array for RLE encoding
fortran_ground_truth_binary_mask = np.asfortranarray(ground_truth_binary_mask)
encoded_ground_truth = mask.encode(fortran_ground_truth_binary_mask)
ground_truth_area = mask.area(encoded_ground_truth)
ground_truth_bounding_box = mask.toBbox(encoded_ground_truth)
# Trace the outline of the binary mask as sub-pixel contours
contours = measure.find_contours(ground_truth_binary_mask, 0.5)
annotation = {
    "segmentation": [],
    "area": ground_truth_area.tolist(),
    "iscrowd": 0,
    "image_id": 123,
    "bbox": ground_truth_bounding_box.tolist(),
    "category_id": 1,
    "id": 1
}

for contour in contours:
    # find_contours returns (row, col) = (y, x); flip to COCO's (x, y) order
    contour = np.flip(contour, axis=1)
    segmentation = contour.ravel().tolist()
    annotation["segmentation"].append(segmentation)

print(json.dumps(annotation, indent=4))

This outputs:

{
    "segmentation": [
        [
            309.0,
            1840.5,
            308.0,
            1840.5,
            307.0,
            1840.5,
            # LOTS AND LOTS more
        ]
    ],
    "area": 1198102,
    "iscrowd": 0,
    "image_id": 123,
    "bbox": [
        7.0,
        680.0,
        1068.0,
        1161.0
    ],
    "category_id": 1,
    "id": 1
}
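
To sanity-check the encoding, the RLE can be decoded back and compared against the original mask; a minimal sketch, assuming the variables from the snippet above are still in scope:

# Decode the RLE back to an (H, W) uint8 mask and confirm it round-trips
decoded = mask.decode(encoded_ground_truth)
assert np.array_equal(decoded, ground_truth_binary_mask)
print(mask.area(encoded_ground_truth))  # matches the "area" field above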

I'm guessing there are so many points in the segmentation because I'm using the mask that Mask R-CNN produced rather than one that was hand-drawn?

Anywho I'll go test it some more and come back!

@STASYA00

STASYA00 commented Oct 7, 2019

@marcfielding1 no, the number of points is always like that, because this array represents a polygon with many vertices (every pair of values is the coordinate of one vertex). So unless your object consists only of perfectly straight edges (like a simple box), it will have a pair of coordinates for every smallest angle the contour tracer finds.
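
If the vertex count becomes a problem, the contours can be thinned with skimage's Douglas-Peucker helper before flattening; a minimal sketch, where the tolerance (in pixels) is an assumed knob to tune:

import numpy as np
from skimage import measure

# Simplify one (N, 2) contour from measure.find_contours before converting
# it to a COCO segmentation; larger tolerance means fewer vertices
simplified = measure.approximate_polygon(contour, tolerance=1.0)
segmentation = np.flip(simplified, axis=1).ravel().tolist()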

@marcfielding1 (Author)

marcfielding1 commented Oct 7, 2019

Yeah, what I meant was that hand-crafted masks have straight lines, whereas the ones that come from inference tend to be a bit jagged. Thanks a lot! I might even do a PR to add this functionality to utils; it effectively lets you auto-label detected objects. In our use case we discard tables, chairs, etc. and keep the thing on the turntable, and it's saving us hours and lots of money.

@nikilkumar9

Hi Marc, can I use the output JSON directly to train my model? Otherwise I'll need to convert it to a VGG-style JSON.

@marcfielding1 (Author)

marcfielding1 commented Nov 15, 2019

@nikilkumar9 Yeah, so I take those segmentation masks and output them as labels for other training jobs. The basic assumption is: as long as an object gets detected as SOMETHING, you can reuse that mask to label it as whatever you want.
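
For the VGG question above, a minimal sketch of converting one of these annotations into a VIA-style record (the filename and "can" label here are hypothetical, and VIA versions differ on whether "regions" is a list or a dict):

# Convert a flattened COCO polygon [x1, y1, x2, y2, ...], as built above,
# into a VIA-style region record
def coco_poly_to_via(segmentation, filename="image.jpg", label="can"):
    xs = segmentation[0::2]  # the list alternates x, y, x, y, ...
    ys = segmentation[1::2]
    return {
        "filename": filename,
        "regions": [{
            "shape_attributes": {
                "name": "polygon",
                "all_points_x": xs,
                "all_points_y": ys,
            },
            "region_attributes": {"name": label},
        }],
    }

via_record = coco_poly_to_via(annotation["segmentation"][0])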

@nikilkumar9

@marcfielding1 Thanks, I'm actually using the output as labels for the SAME model.

For the segmentation array in the output, what order are the coordinates in? Is it x first or y first?

@marcfielding1
Copy link
Author

Err, you know, I can't remember. I'll ask our ML guy. See STASYA00's comment above: each pair is a coordinate, I think. I shall check for ya!

@nikilkumar9

@marcfielding1, did you get a chance to check with your ML guy?

STASYA00's comment says that every pair of values in the list is a coordinate, but I don't know if the pair is [x, y] or [y, x].

@JavierClearImageAI

JavierClearImageAI commented Nov 21, 2019

@marcfielding1 gave a nice solution. Just one comment to make polygons work properly when masks touch the edges of the image (copied from visualize.py: lines 155-160):

from skimage.measure import find_contours

# Mask
mask = masks[:, :, i]

Here the mask is the object drawn into the image, with 1s for object pixels and 0s for background pixels.

# Mask Polygon
# Pad to ensure proper polygons for masks that touch image edges
padded_mask = np.zeros(
    (mask.shape[0] + 2, mask.shape[1] + 2), dtype=np.uint8)
padded_mask[1:-1, 1:-1] = mask
contours = find_contours(padded_mask, 0.5)

find_contours transforms the binary mask into a list of pixel coordinates tracing the edges of the mask.
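
One detail the padding introduces: every contour vertex is shifted by one pixel, so visualize.py subtracts 1 after flipping to (x, y). A minimal sketch feeding this back into the annotation built earlier:

# Undo the one-pixel shift from the padding while flipping (y, x) to (x, y),
# then flatten each contour into a COCO-style segmentation list
for contour in contours:
    contour = np.flip(contour, axis=1) - 1
    annotation["segmentation"].append(contour.ravel().tolist())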

@marcfielding1 (Author)

marcfielding1 commented Nov 21, 2019

@nikilkumar9 Hey mate, sorry, I haven't had a chance to dive back into this, and he's cut off at the moment due to some severe weather in his part of the world. I think it's x, y; the best way to be sure is to import one of the masks into a labelling program and see what it comes out like! @JavierClearImageAI nice, thanks for that, I'll investigate so I understand it properly!

@STASYA00

STASYA00 commented Dec 8, 2019

> @marcfielding1, did you get a chance to check with your ML guy?
>
> STASYA00's comment says that every pair of points in the list is a coordinate, but I don't know if the pair is [x, y] or [y, x].

@nikilkumar9 it's [x,y], the traditional representation
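
This matches the code above: find_contours returns vertices in (row, col), i.e. (y, x), order, and np.flip(contour, axis=1) swaps each pair to (x, y) before flattening. A tiny check:

import numpy as np

# One vertex as find_contours would return it: (row, col) = (y, x)
contour = np.array([[1840.5, 309.0]])
print(np.flip(contour, axis=1))  # [[309.0, 1840.5]] -> x comes first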
