A Mask R-CNN Keras implementation with Modanet annotations on the Paperdoll dataset
My bachelor's thesis project.
In short, I built a program that lets you quickly train any model with fizyr's keras-maskrcnn, and in particular train it on ModaNet. Getting it to work took around a month.
While doing so I found flaws in the ModaNet annotations, particularly for the footwear and boots labels, where bounding boxes overlap each other.
You can see them by running maskrcnn-modanet viewimage --all-set --original in one terminal tab or window and the same command without the --original option in another, then comparing the two side by side.
So I fixed them (although help refining the fix is much appreciated).
I then ran some tests to check the results, and recognition of footwear and boots improved dramatically.
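To illustrate the kind of overlap described above, here is a minimal sketch that flags heavily overlapping boxes in COCO-style annotations. It is not the fix used in this project: the function names, the 0.5 threshold, and the category ids in the usage note are assumptions for illustration, and the boxes are assumed to be in COCO `[x, y, width, height]` format.

```python
from itertools import combinations


def iou_xywh(box_a, box_b):
    """Intersection over union of two COCO-style [x, y, width, height] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the intersection rectangle (non-positive if disjoint).
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union


def overlapping_pairs(annotations, category_ids, threshold=0.5):
    """Return (id_a, id_b, iou) for boxes in the same image that overlap heavily.

    `annotations` is a list of COCO-style dicts with "id", "image_id",
    "category_id" and "bbox" keys. Only the given categories are checked.
    `threshold` is an illustrative value, not one taken from this project.
    """
    by_image = {}
    for ann in annotations:
        if ann["category_id"] in category_ids:
            by_image.setdefault(ann["image_id"], []).append(ann)

    pairs = []
    for anns in by_image.values():
        for a, b in combinations(anns, 2):
            score = iou_xywh(a["bbox"], b["bbox"])
            if score >= threshold:
                pairs.append((a["id"], b["id"], score))
    return pairs
```

For example, `overlapping_pairs(annotations, {3, 4})` would list suspicious pairs if 3 and 4 were the ids of the boots and footwear labels in your annotation file; check the file's `categories` section for the real ids.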
I then built a simple application that counts how many shoes, skirts, or items with any of the other labels (13 in total) appear in a user's Instagram account, analyzing only images with a single person in the frame. More details are in the release notes for v1.0.
Below is the home screen of the program.
Usage: maskrcnn-modanet [OPTIONS] COMMAND [ARGS]...
Main CLI.
Options:
--help Show this message and exit.
Commands:
datasets Manage your datasets run 1 -> maskrcnn-modanet datasets...
evaluate Evaluate any trained model, average precision and recall.
instagram Simple implementation to track instagram metrics per...
processimage View and save processed image and annotations from input...
savedvars Show and edit saved variables
train Train using the dataset downloaded usage: maskrcnn-modanet...
viewannotation View and (not yet needed) save dataset images, plain (not...
viewimage View and (not yet needed) save dataset images, plain (not...
I'll be very happy to merge your pull requests that add new implementations, or link to them in a section here!
For the Instagram analyzer, I started from the Instaloader classes and overrode some methods so that they return the URLs of the posts instead of downloading them.
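As a rough sketch of the idea of collecting URLs rather than downloading files, the snippet below reads the `url` and `is_video` attributes that Instaloader's `Post` objects expose. This is a simpler route than the method overrides used in this project; the helper names `collect_image_urls` and `profile_posts` are hypothetical.

```python
from itertools import islice


def collect_image_urls(posts, limit=None):
    """Return display URLs of non-video posts without downloading anything.

    `posts` is any iterable of objects exposing `.url` and `.is_video`,
    which Instaloader's Post objects provide.
    """
    urls = (post.url for post in posts if not post.is_video)
    return list(islice(urls, limit))


def profile_posts(username):
    """Return an iterator over a public profile's posts.

    Needs network access and `pip install instaloader`.
    """
    import instaloader  # imported lazily so collect_image_urls works without it

    loader = instaloader.Instaloader()
    profile = instaloader.Profile.from_username(loader.context, username)
    return profile.get_posts()
```

A call such as `collect_image_urls(profile_posts("some_username"), limit=20)` would then return up to 20 image URLs that can be passed to the models.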
Each image then runs through the COCO model to find the images that contain only one person who takes up more than 10% of the image. On those images I run the ModaNet model to show statistics about what type of apparel the user is wearing and, if you request it, display the detected instances.
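The single-person filter and the per-label statistics described above could look roughly like the sketch below. It assumes detections arrive as `(label, score, (x1, y1, x2, y2))` tuples, measures person size by bounding-box area, and ignores people below the 10% threshold instead of rejecting the image. The function names and the `min_score` cutoff are hypothetical and not taken from this project.

```python
from collections import Counter


def is_single_person_image(detections, image_width, image_height,
                           person_label="person", min_area_ratio=0.10,
                           min_score=0.5):
    """True when exactly one confident person box covers more than
    `min_area_ratio` of the image.

    `detections` is a list of (label, score, (x1, y1, x2, y2)) tuples.
    `min_score` is an assumed confidence cutoff.
    """
    image_area = image_width * image_height
    large_people = 0
    for label, score, (x1, y1, x2, y2) in detections:
        if label != person_label or score < min_score:
            continue
        box_area = max(0, x2 - x1) * max(0, y2 - y1)
        if box_area / image_area > min_area_ratio:
            large_people += 1
    return large_people == 1


def label_counts(per_image_labels):
    """Count how often each apparel label appears across the kept images."""
    return Counter(label for labels in per_image_labels for label in labels)
```

Images that pass `is_single_person_image` would go to the apparel model, and `label_counts` then summarizes its predictions, for instance how many images contain footwear or a skirt.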
Say you want to quickly find which skirt (or footwear) your Instagram star always wears. With this tool you can! You can also see how often the Instagram user appears alone in their images, and what they usually show of themselves (always pictures with shoes? always only the top part?).
Link to the Thesis Presentation
This project is written in Python 3, so it runs on all major OSes, although only Linux and macOS are fully supported. Remember to use pip or pip3 depending on your setup.
UPDATE: My suggestion is to run it using Google Colab. Just make a copy of this notebook and click play on the code snippets.
The code has been optimized to run well on Colab.
Clone this repo
Run pip install maskrcnn-modanet
Or go to the repo you just cloned on the terminal and run pip install -e .
If you see any errors, install the dependencies manually, for example: pip3 install --upgrade cython
Now that you've installed it, run maskrcnn-modanet datasets download the/folder/you/want/to/put/data/in
It will take a while: the download was originally about 40 GB. EDIT: it is now reduced to just 2-3 GB. See the release notes for v1.0 for details on this and on the Instagram application.
Then you can explore its features and commands by running maskrcnn-modanet
Install Python and Keras
Install Git LFS (Large File Storage) to get all the files!
- Sublime Text - The text editor used
- Python 3 - Language utilized
- GitHub Desktop - To manage development
- Sublime Merge - To manage development
The sections below follow PurpleBooth's README template.
Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.
For the versions available, see the releases on this repository.
- Pier Carlo Cadoppi - Initial work
See also the list of contributors who participated in this project.
This project is licensed under the MIT License - see the LICENSE.md file for details
- Hat tip to anyone whose code was used
- Billie Thompson - README Template - PurpleBooth
- Inspiration
- etc lol