The purpose of this project is to create an app that helps people learn about reptiles from around the world. It is also kid friendly. Just input an image of a reptile (such as a crocodile, lizard, snake, or turtle) and the app will show you its `Name`, `Scientific Name`, `Type`, where the reptile is `Commonly Found In`, `Color`, `Food Habit`, `Conservation Status`, and `Habitat` in a very basic way.
My goal is for the user to provide one image as input and receive 8 outputs. This is why I chose the `Multi-Target` image classification method as the backbone.
If I used data structures, algorithms, or explicit hand-written rules, I would need a huge amount of data for each individual species. Image analysis is also much more complex without ML. This is why I chose machine learning for image classification.
I collected my data with the help of `Google` and `ChatGPT`.
- Collected reptile `Type` names from around the world. Currently I have chosen Crocodile, Lizard, Snake, and Turtle.
- For each individual `Type`, I collected the `Name` of every species, along with its `Scientific Name`, `Conservation Status`, `Habitat`, `Found In`, `Color`, and `Diet`. I checked whether my data was correct using Wikipedia, the National Geographic website, and ChatGPT. All data was collected manually. Information for `108` species was collected.
- For each `Name`, I collected images. Basically, I scraped images from Google Images using `Selenium`. Almost 50 images were collected for each `Name`.
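
The species table described above can be pictured as one record per species with eight fields. The following is a minimal sketch, not the project's actual code: the class name, field names, and the example values are assumptions chosen for illustration.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class SpeciesRecord:
    """One row of the hand-collected species table (field names are hypothetical)."""
    name: str
    scientific_name: str
    reptile_type: str
    found_in: str
    color: str
    diet: str
    conservation_status: str
    habitat: str


def to_labels(record: SpeciesRecord) -> list[str]:
    """Flatten a record into the list of target labels attached to each image."""
    return list(astuple(record))


# Illustrative values only; the real table covers 108 species.
nile_crocodile = SpeciesRecord(
    name="Nile Crocodile",
    scientific_name="Crocodylus niloticus",
    reptile_type="Crocodile",
    found_in="Africa",
    color="Olive green",
    diet="Carnivore",
    conservation_status="Least Concern",
    habitat="Freshwater",
)

labels = to_labels(nile_crocodile)
print(len(labels), labels)
```

Every image scraped for a given `Name` would share the same eight labels, which is what makes the task multi-target.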
Because my dataset is small and Google Images has only a few images for some species, I generated more images from the `6539` images I collected, using 8 types of `Augmentation` (rotate, brightness, flip, etc.) to enlarge the dataset. After all of this, my current dataset size is `45767` images.
Dataset
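
To make the augmentation idea concrete, here is a toy sketch that applies three of the named transforms (flip, rotate, brightness) to a tiny grayscale image stored as nested lists. It is an assumption-laden illustration: the project presumably used an image library on real photos, and the function names and the brightness factor below are hypothetical.

```python
def flip_horizontal(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def rotate_90(img):
    """Rotate clockwise by 90 degrees: reverse the row order, then transpose."""
    return [list(col) for col in zip(*img[::-1])]


def adjust_brightness(img, factor):
    """Scale every pixel and clamp the result to the 0..255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]


def augment(img):
    """Return one new variant per transform; the original image is kept separately."""
    return [
        flip_horizontal(img),
        rotate_90(img),
        adjust_brightness(img, 1.5),  # hypothetical factor
    ]


image = [[10, 20],
         [30, 200]]
variants = augment(image)
for v in variants:
    print(v)
```

Each source image yields one extra image per transform, which is how a few thousand scraped images can grow into tens of thousands.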
For the remaining tasks, such as training, testing, and compression, I chose `Fastai` and `PyTorch` for better performance and shorter training time.
For training, I used 3 models, `resnet50`, `resnet34`, and `xresnet18_deeper`, and I saved the best model as a `pkl` file for future use.
I also ran some experiments with `batch size`. I tried 3 different batch sizes to see whether they had any effect on the model's accuracy. `Batch size = 16` made training very slow and gave low accuracy. `Batch size = 32` made training faster and gave the highest accuracy. `Batch size = 64` gave the fastest training but was not as accurate as `bs = 32`.
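
One reason batch size affects training speed is the number of weight updates per epoch. The sketch below computes that count for the three batch sizes tried. It assumes, purely for illustration, that all `45767` images are used for training; the real train/validation split is not stated, so the actual numbers would be smaller.

```python
import math


def steps_per_epoch(num_images: int, batch_size: int) -> int:
    """Number of batches needed to see every image once (last batch may be partial)."""
    return math.ceil(num_images / batch_size)


NUM_IMAGES = 45767  # assumption: the whole dataset is treated as the training set

steps = {bs: steps_per_epoch(NUM_IMAGES, bs) for bs in (16, 32, 64)}
for bs, n in steps.items():
    print(f"batch size {bs}: {n} steps per epoch")
```

Fewer, larger steps usually mean faster epochs, but accuracy does not have to improve along with speed, which matches what was observed for `bs = 64`.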
As I used 3 models, here are their best results.

| Model | Accuracy | F1_score | Precision | Time |
|---|---|---|---|---|
| Resnet50 | 0.9981 | 0.930426 | 0.970514 | 08:20 |
| Resnet34 | 0.998095 | 0.927186 | 0.956563 | 07:31 |
| xresnet18_deeper | 0.977500 | 0.023208 | 0.068224 | 05:40 |

Both `resnet50` and `resnet34` reach `99%` accuracy and their other scores are very close, but `resnet34` is a little faster. So I chose `resnet34` for further work.
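
The selection reasoning can be written down as a small rule: keep the models whose F1 score is close to the best one, then take the fastest. This is only a sketch of that reasoning. The tolerance of `0.01` is a hypothetical value, and the `Time` column is assumed to be in `mm:ss`.

```python
# Values copied from the results table.
results = {
    "resnet50": {"accuracy": 0.9981, "f1": 0.930426, "precision": 0.970514, "time": "08:20"},
    "resnet34": {"accuracy": 0.998095, "f1": 0.927186, "precision": 0.956563, "time": "07:31"},
    "xresnet18_deeper": {"accuracy": 0.977500, "f1": 0.023208, "precision": 0.068224, "time": "05:40"},
}


def to_seconds(mmss: str) -> int:
    """Convert a 'mm:ss' string to seconds (assumes the Time column uses mm:ss)."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def pick_model(results: dict, f1_tolerance: float = 0.01) -> str:
    """Among models within f1_tolerance of the best F1, return the fastest one."""
    best_f1 = max(r["f1"] for r in results.values())
    candidates = [
        name for name, r in results.items()
        if best_f1 - r["f1"] <= f1_tolerance
    ]
    return min(candidates, key=lambda name: to_seconds(results[name]["time"]))


chosen = pick_model(results)
print(chosen)
```

With these numbers `xresnet18_deeper` is excluded by its low F1 score even though it is the fastest, and `resnet34` wins on time.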
To analyze what is going on while the model trains, I used `Weights & Biases` and created a dashboard. Here it is.
Model compression is essential because, as developers, we want to reduce or manage app size. For that, I used `onnx` to compress the model. It also makes the model faster.
`HuggingFace` is the platform where I deployed my project. You can see it from here.
As this is an educational app, I decided to make it free for all. So I made a simple `UI` using `Flask` and deployed it on the `render.com` website. This website is free with conditions.
Here is my app link.
- The `background image` of this website was generated using `Stable Diffusion`, an AI that generates unseen images from a "prompt". The prompt is:
- Here, I also used `Google Text To Speech` AI for kids who cannot read yet but have a curious mind and want to know about `Reptile`. Audio is generated in `Real Time`.
Of course, I faced many obstacles while building this website.
First of all, setting up the `Dataloader`. I found only a few resources for a `Multi-Target` image classifier. I experimented in a lot of ways and finally settled on `MultiCategoryBlock()` with the parameter `add_na = True`. This is for `Multi-Target` classification, but I believe it will also work for regression.
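
To show what a multi-category target looks like, here is a plain-Python sketch of multi-hot encoding with an extra placeholder class. It does not call fastai and is not its implementation. The placeholder name `#na#` and its position at the start of the vocabulary are assumptions of this sketch, as are the helper names and sample labels.

```python
NA = "#na#"  # assumed placeholder label for "not available"


def build_vocab(label_lists, add_na=True):
    """Collect every distinct label, sorted, optionally preceded by the placeholder."""
    distinct = sorted({label for sample in label_lists for label in sample})
    return ([NA] if add_na else []) + distinct


def multi_hot(sample_labels, vocab):
    """Encode one sample as a 0/1 vector with a 1 for each label it carries."""
    present = set(sample_labels)
    return [1 if label in present else 0 for label in vocab]


# Hypothetical samples, each with two of the eight targets for brevity.
samples = [
    ["Crocodile", "Carnivore"],
    ["Turtle", "Herbivore"],
]

vocab = build_vocab(samples)
encoded = [multi_hot(s, vocab) for s in samples]
print(vocab)
print(encoded)
```

The model then predicts one score per vocabulary entry, and the labels whose scores pass a threshold form the outputs shown to the user.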
Secondly, I struggled to choose the right metric(s), and finally I used `accuracy_multi`. This metric works well not only for `multi-label` classification but also for `Multi-Target` classification. I also used `F1_score` and `Precision` for a better understanding of my model.
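
The sketch below illustrates why it helps to look at F1 alongside a per-label accuracy. It is a plain-Python approximation of the idea behind a thresholded multi-label accuracy, not fastai's code: the function names, the sigmoid step, and the `0.5` threshold are assumptions, and it works on a single sample instead of batched tensors.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def binarize(logits, thresh=0.5):
    """Turn raw scores into True/False predictions, one per label."""
    return [sigmoid(x) > thresh for x in logits]


def accuracy_multi_sketch(logits, targets, thresh=0.5):
    """Fraction of label slots (positive or negative) predicted correctly."""
    preds = binarize(logits, thresh)
    hits = sum(p == bool(t) for p, t in zip(preds, targets))
    return hits / len(targets)


def f1_sketch(logits, targets, thresh=0.5):
    """F1 over the label slots; defined as 0.0 when there are no true positives."""
    preds = binarize(logits, thresh)
    tp = sum(p and bool(t) for p, t in zip(preds, targets))
    fp = sum(p and not t for p, t in zip(preds, targets))
    fn = sum((not p) and bool(t) for p, t in zip(preds, targets))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Ten label slots, two of them positive; the model predicts "absent" everywhere.
targets = [0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
logits = [-3.0] * 10

acc = accuracy_multi_sketch(logits, targets)
f1 = f1_sketch(logits, targets)
print(acc, f1)
```

Because most label slots are negative, a model that predicts almost nothing can still score a high accuracy while its F1 stays near zero. That is one possible reading of the `xresnet18_deeper` row in the results table.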
I have completed this project the way I wanted. If you want to develop it further, just send me a pull request. Thanks.