
SqueezeNext TensorFlow: A TensorFlow Implementation of SqueezeNext

This repository contains an unofficial TensorFlow implementation of SqueezeNext, a hardware-aware neural network design.

@article{DBLP:journals/corr/abs-1803-10615,
  author    = {Amir Gholami and
               Kiseok Kwon and
               Bichen Wu and
               Zizheng Tai and
               Xiangyu Yue and
               Peter H. Jin and
               Sicheng Zhao and
               Kurt Keutzer},
  title     = {SqueezeNext: Hardware-Aware Neural Network Design},
  journal   = {CoRR},
  volume    = {abs/1803.10615},
  year      = {2018},
  url       = {http://arxiv.org/abs/1803.10615},
  archivePrefix = {arXiv},
  eprint    = {1803.10615},
  timestamp = {Wed, 11 Apr 2018 17:54:17 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/abs-1803-10615},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Pretrained model:

Using the data from the paper, the original Caffe version on GitHub, and other sources, I tried to recreate the 1.0-SqueezeNext-23 model as closely as possible. The model achieved 56% top-1 accuracy and 80% top-5 accuracy on the validation set, about 3% below the reported results. A likely cause is that the network was trained with a batch size of 256 instead of 1024, which also increased the number of steps required for 120 epochs fourfold. The learning rate schedule was modified to account for the lower batch size and the increased number of steps.

This configuration (stored in the v_1_0_SqNxt_23 config) can be downloaded here: v_1_0_SqNxt_23_mod.
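
As a rough illustration of the schedule adjustment described above (a sketch only; the exact values live in the v_1_0_SqNxt_23 config, and base_lr below is a hypothetical placeholder):

    # Linear-scaling sketch: shrinking the batch size scales the learning
    # rate proportionally and multiplies the step count inversely.
    base_lr = 0.04                # hypothetical rate tuned for batch 1024
    paper_batch, used_batch = 1024, 256
    scaled_lr = base_lr * used_batch / float(paper_batch)  # 0.01
    steps_per_epoch = 1281167 // used_batch  # ImageNet train images / batch
    total_steps = steps_per_epoch * 120      # 4x the batch-1024 step count
    print(scaled_lr, total_steps)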

Installation:

This implementation was built against version 1.8 of the TensorFlow API. Earlier versions are untested and may not work, due to the use of some recently added functions for data loading and processing. The code was written for Python 2.7.

  • Make sure TensorFlow 1.8 or higher is installed by running the command below (a programmatic check is sketched after this list):

    python -c 'import tensorflow as tf; print(tf.__version__)'  # for Python 2

    And verifying the output is 1.8.0 or above.

  • Clone this repository:

    git clone https://github.com/Timen/squeezenext-tensorflow.git
  • Install requirements:

    pip install -r requirements.txt
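
If you prefer to fail fast inside your own scripts, a minimal version guard might look like this (a sketch, not part of the repository):

    from distutils.version import LooseVersion
    import tensorflow as tf

    # Abort early if the installed TensorFlow is older than 1.8.
    assert LooseVersion(tf.__version__) >= LooseVersion('1.8.0'), \
        'squeezenext-tensorflow requires TensorFlow 1.8 or newer'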

Preparing the Dataset:

SqueezeNext, like most other classifiers, is trained on the ImageNet dataset (http://www.image-net.org/). One can download the data from the aforementioned website; however, this can be rather slow, so I recommend downloading the dataset using the torrents available on http://academictorrents.com/, namely:

Training Images, Validation Images and Bounding Box Annotations

Please note one should still abide by the original license agreement of the ImageNet dataset. After downloading these files, perform the following steps to prepare the dataset.

  • Create a directory used for processing and storing the dataset. Please note you should have at least around 500 GB of free space available on the drive where you process the dataset. Once this directory is created, copy the 3 downloaded files to its root. From that directory, execute the following command:

       export DATA_DIR=$(pwd)
       
  • Execute the following command from this project's root folder:

    bash datasets/process_downloaded_imagenet.sh $DATA_DIR
    

    Where $DATA_DIR is the root of the directory created to hold the 3 downloaded files.

  • Wait for processing to finish. The script process_downloaded_imagenet.sh will automatically extract the tarballs and process all the data into tf-records. The whole process can take between 2 and 5 hours, depending on how fast the hard drive and CPU are.
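
Once processing finishes, the records can be read back with the tf.data API. The sketch below assumes the conventional ImageNet TFRecord layout; the feature keys ('image/encoded', 'image/class/label') and the shard name are assumptions, so check the generated files for the exact names:

    import tensorflow as tf

    def parse_example(serialized):
        # Feature keys follow the common ImageNet TFRecord convention;
        # they are assumptions, not verified against this repository.
        features = tf.parse_single_example(serialized, {
            'image/encoded': tf.FixedLenFeature([], tf.string),
            'image/class/label': tf.FixedLenFeature([], tf.int64),
        })
        image = tf.image.decode_jpeg(features['image/encoded'], channels=3)
        return image, features['image/class/label']

    # 'train-00000-of-01024' is a hypothetical shard name.
    dataset = tf.data.TFRecordDataset(['train-00000-of-01024'])
    dataset = dataset.map(parse_example).batch(32)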

Training:

After installation and dataset preparation, one only needs to execute the run_train.sh script to start training, by running the following command from the project's root folder:

bash run_train.sh

This will start training the 1.0 v1 version of SqueezeNext for 120 epochs with batch size 256. With a GTX 1080 Ti this training can take up to 4 days. If your GPU has less memory than a GTX 1080 Ti, you will probably need to lower the batch size to be able to run the training.
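
Lowering the batch size can be done by passing it to train.py directly; the --batch_size flag name below is an assumption (check train.py or the config files for the exact argument), while --configuration is the documented way to select a config:

    python train.py --batch_size 128 --configuration v_1_0_SqNxt_23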

Prediction:

Prediction is done using the predict.py script. To run it, give it a path to a JPEG image and pass the directory containing a trained model via the --model_dir argument:

python predict.py ./tabby_cat.jpg --model_dir <TRAIN_DIR from run_train.sh, or the pretrained model directory>

This script will load the image and run the classifier on it; the output is the top 5 human-readable class labels.
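
Internally, turning class probabilities into the top-5 readable labels boils down to a sort. A minimal sketch, where probs and class_names are hypothetical stand-ins for the classifier output and the label map:

    import numpy as np

    def top5(probs, class_names):
        # Indices of the five highest-probability classes, best first.
        best = np.argsort(probs)[::-1][:5]
        return [(class_names[i], float(probs[i])) for i in best]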

Modifying the hyper parameters:

The batch size, number of epochs, and some other settings regarding epoch size, file locations, etc. can be passed as command-line arguments to the train.py script.

Switching between specific configurations, such as the grouped and non-grouped convolution versions of SqueezeNext, is done by selecting which config file from the configs folder to use. This is done by passing the file name, without the .py extension, as the --configuration command-line argument. It is easy to add your own configuration: just copy one of the other configs and rename the file to something new (keep in mind it will be imported in Python, so stick to numbers, letters, and underscores). You can then change the parameters in the file to customize your own config and pass the new file name as the --configuration parameter. (The Python scripts in configs are automatically imported.)
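
For illustration, a custom config might start out like this; the parameter names below are assumptions, so mirror the real ones from an existing config such as v_1_0_SqNxt_23.py:

    # configs/my_sqnxt_experiment.py -- hypothetical example config.
    # Parameter names are illustrative; copy them from v_1_0_SqNxt_23.py.
    num_classes = 1000
    training_batch_size = 128
    num_epochs = 120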
