Semanitic-Segmentation-with-U-Net

Overview of Project:

We are given Helen Dataset which contains images of faces of different persons. Our target is to classify each pixel as

bg (background)
face
lb (left brow)
rb (right brow)
le (left eye)
re (right eye)
nose
ulip
imouth
llip
hair

For this task we will be using the famous U-Net Architecture. U-Net Paper.

Data Description:

For this project , the Helen Dataset used can be downloaded from Helen Dataset.
For each image. It has 3 types of files. One is image.jpg which has the file which will be loaded to the model. Second is the label.png file which has all pixel by pixel classications of the image. The viz.jpg file is just for demonstation purpose and is not of any use to the model.
Following is the directory structure

.
└── helenstar_release
     ├── train
     │   ├── image.jpg
     │   ├── label.png
     ├   ├── viz.jpg
     │   └── ... (1999 sets of 3 images i.e 5997 images total)           
     └── test
         ├── image.jpg
         ├── label.png
         ├── viz.jpg
         └── ... (100 sets of 3 images i.e 300 images total)

Total number of images in dataset : 2099
Number of images in train set : 1999
Number of images in test set : 100

For convenience I will be performing some shiftings to put all image.jpg files in one folders , label.png in other. I will be doing this using the shutil module of python

The final directory strucutre will be as follows

.
└── splitted_Data
        ├── train
        │   ├── images
        │   │   ├── image1.jpg
        │   │   ├── image2.jpg
        │   │   └── ... (1999 files)
        │   └── labels
        │       ├── label1.jpg
        │       ├── label2.jpg
        │       └── ... (1999 files)       
        │           
        └── test
            ├── images
            │   ├── image1.jpg
            │   ├── image2.jpg
            │   └── ... (100 files)
            └── labels
                ├── label1.jpg
                ├── label2.jpg
                └── ... (100 files)

Libraries used:

Numpy
Matplotlib
torch
torchvision
PIL
os module of python
tqdm
shutil

Steps Followed

1. Importing Necessary Libraries

Getting all the required python libraries required for the implementation of the project

2. Looking at directory structure and making desired shiftings

As shown above, the directory strucutre in the link is changed so that it is easy to execute in the later part

3. Data Preprocessing

Now all the images have different dimensions. But to feed them into the model, all the images need to be of the same size. I resized all the images to 256x256. Also when we load images , they are usually loaded in the form of numpy array with dtype = uint8 . They need to be converted to tensors with dtype = torch.float32

4. Defining Train and Test Dataloaders

Now the train dataset has 1999 images which can not be fed in one go. I use Mini Batch Gradient Descent. with a batch size of 10.

5. Defining model

The model architecture is shown in the picture below

It has a encoding path and a decoding path. The architecture is difficult to code in one class UNet(nn.Module. So I define some classes before hand which can help to make our code concise and simple to read

The input to the model is of shape 10x3x256x256 and output is 10x11x256x256

6. Defining Dice Loss and optimizer

For this problem we will be defining the DiceLoss. As there is not pre defined Diceloss in pytorch, We will be defining it on our own. The code is inspired from An overview of semantic image segmentation.

7. Performing Forward Propagation

I perform forward propagation for 30 epochs and print losses. Based on the trend of losses, I have occationally interrupted execution and reduced learning rate

8. Visualizing train loss

9. Visualizing predictions

These are predictions on train set, We'll see predictions on test sets in conclusion part

How to replicate on your device

Just store the data as per the directory structures shown in the code. Run the code You can get the pre trained weights for the model at this link Pre-Trained weights.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
F1_Score.py		F1_Score.py
F1_score_file.ipynb		F1_score_file.ipynb
README.md		README.md
Unet.ipynb		Unet.ipynb
f1_score(1).py		f1_score(1).py
label_names.txt		label_names.txt
list_test.txt		list_test.txt
weights_200_epoch_0.9971_RGB_256.h5py		weights_200_epoch_0.9971_RGB_256.h5py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Semanitic-Segmentation-with-U-Net

Table of Contents:

Overview of Project:

Data Description:

Libraries used:

Steps Followed

1. Importing Necessary Libraries

2. Looking at directory structure and making desired shiftings

3. Data Preprocessing

4. Defining Train and Test Dataloaders

5. Defining model

6. Defining Dice Loss and optimizer

7. Performing Forward Propagation

8. Visualizing train loss

9. Visualizing predictions

How to replicate on your device

About

Releases

Packages

Languages

Tanmay-21/Semanitic-Segmentation-with-U-Net

Folders and files

Latest commit

History

Repository files navigation

Semanitic-Segmentation-with-U-Net

Table of Contents:

Overview of Project:

Data Description:

Libraries used:

Steps Followed

1. Importing Necessary Libraries

2. Looking at directory structure and making desired shiftings

3. Data Preprocessing

4. Defining Train and Test Dataloaders

5. Defining model

6. Defining Dice Loss and optimizer

7. Performing Forward Propagation

8. Visualizing train loss

9. Visualizing predictions

How to replicate on your device

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages