
Human segmentation on Berkeley MHAD depth data #9

Open
amarmhrjn opened this issue Aug 12, 2018 · 4 comments

Comments

@amarmhrjn

Please let me know which technique you are using for human segmentation on Berkeley MHAD depth data.

@ashafaei
Owner

That ground-truth segmentation is not included in the MHAD dataset, but you can generate it from the annotations that are available. I don't remember exactly how I did it, but I remember that several heuristics made it possible. For instance, the actors are always more or less at the same distance from the camera. You could also use the skeleton information to separate the actor point cloud from the background. You can then use the corresponding pixels in the depth-image space as the segmentation that goes into the pipeline. You can do it quickly in one evening. Note that in our pipeline we assume the segmentation has already been taken care of.
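For reference, a minimal numpy sketch of the distance/skeleton heuristic. The function name, the millimeter units, and the assumption that the skeleton joints are already projected into the depth image are all illustrative, not taken from the repository:

```python
import numpy as np

def segment_actor(depth, joint_pixels, margin_mm=400):
    """Rough actor mask from a depth frame and projected skeleton joints.

    depth        : HxW array of depth values in millimeters (0 = invalid).
    joint_pixels : Nx2 integer array of (row, col) joint locations in the image.
    margin_mm    : how far in front of / behind the skeleton depth to keep.
    """
    # Median depth of the skeleton joints approximates the actor's distance to the camera.
    rows, cols = joint_pixels[:, 0], joint_pixels[:, 1]
    joint_depths = depth[rows, cols]
    joint_depths = joint_depths[joint_depths > 0]        # drop invalid readings
    actor_depth = np.median(joint_depths)

    # Keep pixels whose depth is close to the actor's depth; everything farther
    # away (walls, background clutter) is discarded.
    mask = (depth > 0) & (np.abs(depth - actor_depth) < margin_mm)
    return mask
```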

@amarmhrjn
Author

Thanks for the information. I used a threshold to remove the background (by clearing points farther than the threshold) and it works well. However, I cannot remove the floor data (and noise) close to the feet of the person in the depth image. I see you mentioned using the Kinect SDK for human segmentation in your thesis, but I am not able to use the SDK on offline depth data. Correct me if I am wrong. Any thoughts?

@ashafaei
Owner

For the floor, you can estimate the plane parameters (it's a flat surface around the actors) and then use them to filter out any point that lies within a small distance of the plane. If you fit the plane equation w·X = d, then w is the plane normal (direction) and d is the offset. Any point X with |w·X - d| < epsilon can be removed as part of the plane surface; you'd have to see which epsilon works best. You can estimate the parameters from 3 points on the surface (the Kinects are fixed, so the extrinsic parameters don't change), or you could do something fancier (like a Hough transform or MLE).
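A minimal numpy sketch of the three-point fit and the epsilon filter; the function names, the N x 3 point-cloud layout, and the millimeter epsilon are assumptions for illustration:

```python
import numpy as np

def plane_from_points(p0, p1, p2):
    """Plane parameters w, d (w . x = d) from three non-collinear 3-D points."""
    w = np.cross(p1 - p0, p2 - p0)
    w = w / np.linalg.norm(w)   # unit normal direction
    d = np.dot(w, p0)           # offset along the normal
    return w, d

def remove_floor(points, w, d, epsilon=30.0):
    """Drop points within epsilon (same units as the cloud) of the plane."""
    dist = np.abs(points @ w - d)   # points is an N x 3 array
    return points[dist > epsilon]
```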
Yeah, you can't use the SDK on offline data. In my thesis, I'm using multiple Kinect 2's in an online setting.
If you need to implement a human segmentation method, you could train a CNN on actors taken from my data overlaid on depth data of random backgrounds from datasets such as NYU Depth Data. That's an easier problem than general human segmentation. I think there are open-source implementations for this out there too.
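A rough sketch of the overlay idea, assuming you already have a per-frame actor mask (e.g. from the heuristic above) and a background depth frame of matching size from something like NYU Depth; the names are illustrative:

```python
import numpy as np

def composite(actor_depth, actor_mask, background_depth):
    """Paste a masked actor depth crop over a background depth frame."""
    out = background_depth.copy()
    out[actor_mask] = actor_depth[actor_mask]   # actor occludes the background
    label = actor_mask.astype(np.uint8)         # 1 = person, 0 = background
    return out, label
```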

@amarmhrjn
Author

Thanks for your suggestions. I computed the floor plane parameters from 3 points on the surface and then removed the nearby points.
I used the classification_demo.m code to test classification on MHAD depth data and found that some body parts are incorrectly classified. I think it's mainly due to the normalization applied to the MHAD input depth image before classification.
I tried changing the input depth range for MHAD before classification, and classification then works for some body parts, but not all.
Could you please let me know how you normalized the MHAD depth images when you tested?
