Distance Transform #17

Open
bluesky314 opened this issue Jun 29, 2020 · 6 comments

Comments

@bluesky314

bluesky314 commented Jun 29, 2020

Hey, I do not understand how the distance map is being used here or what the clicks variable is supposed to represent:

import cv2
import numpy as np


def dt(a):
    # Euclidean distance from each nonzero pixel of `a` to the nearest zero pixel
    return cv2.distanceTransform((a * 255).astype(np.uint8), cv2.DIST_L2, 0)


def trimap_transform(trimap):
    h, w = trimap.shape[0], trimap.shape[1]

    clicks = np.zeros((h, w, 6))
    for k in range(2):
        if np.count_nonzero(trimap[:, :, k]) > 0:
            # Negative squared distance to the nearest pixel marked in channel k
            dt_mask = -dt(1 - trimap[:, :, k])**2
            L = 320
            clicks[:, :, 3*k] = np.exp(dt_mask / (2 * ((0.02 * L)**2)))
            clicks[:, :, 3*k+1] = np.exp(dt_mask / (2 * ((0.08 * L)**2)))
            clicks[:, :, 3*k+2] = np.exp(dt_mask / (2 * ((0.16 * L)**2)))

    return clicks

Can you please explain this?

@MarcoForte
Owner

Instead of feeding the network only the binary trimap, we also feed it the distance-transformed version. The distance from the definite foreground and background regions is a strong indicator of what the alpha could be.

The clicks variable represents the transformed trimap. The variable name is not the most accurate and will eventually be fixed.

@bluesky314
Author

OK, but it does not seem like a simple distance transform. What do 3*k, 3*k+1, 3*k+2 and 2 * ((0.02 * L)**2), 2 * ((0.08 * L)**2) ... mean in the for loop? What is the whole loop doing? And why does clicks have 6 channels?

@99991

99991 commented Jul 16, 2020

The distance transform is used to compute an approximate alpha matte based on the trimap.

The first function used here falls to approximately 0 at a distance of about 25 pixels, the second at about 100 pixels, and the third at about 200 pixels.

Here is a plot of the distance vs. the approximate alpha matte value for the first function:

https://www.wolframalpha.com/input/?i=plot+e%5E%28-%28%281-x%29%5E2%29+%2F+%282+*+%28%280.02+*+320%29%5E2%29%29%29
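
The cutoff distances quoted above can be checked numerically. The following is a minimal sketch, assuming each channel is a Gaussian of the distance with sigma = factor * L, which is how the exponent in the snippet reads. The names `falloff` and `SCALES` are made up for this illustration and are not part of the repository.

```python
import math

L = 320  # reference length hard-coded in trimap_transform
SCALES = (0.02, 0.08, 0.16)  # the three factors used in the loop


def falloff(distance, scale, length=L):
    """Gaussian of the distance with sigma = scale * length (illustrative helper)."""
    sigma = scale * length
    return math.exp(-distance ** 2 / (2 * sigma ** 2))


# sigma for each of the three channels, in pixels
sigmas = [scale * L for scale in SCALES]

# value of each channel at the distance quoted for it (25, 100, 200 pixels)
values = [falloff(d, s) for d, s in zip((25, 100, 200), SCALES)]
for sigma, d, v in zip(sigmas, (25, 100, 200), values):
    print(f"sigma={sigma:.1f}px  distance={d}px  value={v:.5f}")
```

In each case the quoted distance is a little under four sigma, which is why all three values come out well below 0.001.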

clicks has 6 channels because the three functions are computed for the distance to both the fixed foreground and the fixed background of the trimap.
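
To make the 3*k + j channel layout concrete, here is a small pure-Python sketch on a toy 1 x 8 image. It is an illustration under stated assumptions, not the repository code: a brute-force nearest-pixel search stands in for cv2.distanceTransform, the output is a channel-first nested list instead of an (h, w, 6) array, and the names `nearest_distance` and `toy_trimap_transform` are hypothetical.

```python
import math

L = 320
SCALES = (0.02, 0.08, 0.16)


def nearest_distance(mask, y, x):
    """Euclidean distance from (y, x) to the nearest nonzero pixel in mask."""
    return min(
        math.hypot(y - my, x - mx)
        for my, row in enumerate(mask)
        for mx, value in enumerate(row)
        if value
    )


def toy_trimap_transform(trimap_channels):
    """trimap_channels: two 2-D 0/1 masks (the k = 0 and k = 1 slices).

    Returns 6 channel maps; channel 3*k + j holds scale j for mask k.
    """
    h, w = len(trimap_channels[0]), len(trimap_channels[0][0])
    out = [[[0.0] * w for _ in range(h)] for _ in range(6)]
    for k, mask in enumerate(trimap_channels):
        if not any(any(row) for row in mask):
            continue  # empty mask: its three channels stay zero, as in the snippet
        for y in range(h):
            for x in range(w):
                d = nearest_distance(mask, y, x)
                for j, scale in enumerate(SCALES):
                    sigma = scale * L
                    out[3 * k + j][y][x] = math.exp(-d ** 2 / (2 * sigma ** 2))
    return out


# 1 x 8 toy image: mask 0 marks the left edge, mask 1 marks the right edge
mask0 = [[1, 0, 0, 0, 0, 0, 0, 0]]
mask1 = [[0, 0, 0, 0, 0, 0, 0, 1]]
channels = toy_trimap_transform([mask0, mask1])
for index, channel in enumerate(channels):
    print(index, [round(v, 3) for v in channel[0]])
```

Channels 0 to 2 are 1.0 at the pixel marked in mask 0 and decay away from it at three different rates; channels 3 to 5 do the same for mask 1.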

@bluesky314
Author

bluesky314 commented Aug 8, 2020

Thanks @99991 and @MarcoForte, but I don't see how the distance transform makes sense when images are all at different scales. Distance in pixel space, which is what the distance transform uses, may not mean much when the image is a close-up of a person's face, because all points are close to the object of interest.
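
A small numeric sketch of this concern, assuming the Gaussian form from the snippet and looking only at the middle channel (factor 0.08). The `response` helper and the variant that ties L to the image width are hypothetical; the width-scaled variant is shown purely for comparison and is not something the repository is known to do.

```python
import math


def response(distance_px, scale=0.08, length=320):
    """Gaussian of the pixel distance with sigma = scale * length (illustrative)."""
    sigma = scale * length
    return math.exp(-distance_px ** 2 / (2 * sigma ** 2))


# The same relative offset (10% of the image width) at two resolutions.
for width in (320, 1280):
    d = 0.10 * width
    fixed = response(d)                 # L fixed at 320, as in the snippet
    scaled = response(d, length=width)  # hypothetical: L tied to image width
    print(f"width={width}  d={d:.0f}px  fixed L: {fixed:.6f}  scaled L: {scaled:.6f}")
```

With L fixed, a point that sits at the same relative position gets a very different value once the image is four times wider, whereas the width-scaled variant returns the same value at both resolutions.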

@bluesky314
Author

@99991, @MarcoForte Can either of you clarify whether I am getting something wrong about the appropriateness of distance maps?

@bluesky314
Author

@99991, @MarcoForte Did you get a chance to think about the above point, that the distance transform does not take the scale of the image into account?


3 participants