The inference speed of the ResNet26-based model is improved with PAI-Blade, a high-efficiency model deployment framework developed by Alibaba Cloud PAI. The model can be benchmarked with the Blade Benchmark Suite (BBS). In addition, optimized INT8 conv2d operators are generated through the TVM TensorCore AutoCodeGen.
- Clone this repo.
- Put the ImageNet validation set (50,000 images) in imagenet/val_data/:
./imagenet/val_data/
├── ILSVRC2012_val_00000001.JPEG
├── ILSVRC2012_val_00000002.JPEG
├── ILSVRC2012_val_00000003.JPEG
├── ILSVRC2012_val_00000004.JPEG
├── ILSVRC2012_val_00000005.JPEG
├── ILSVRC2012_val_00000006.JPEG
...
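Before running the evaluation, the image count can be sanity-checked with a short shell snippet. This is a convenience sketch, not part of the repo: the VAL_DIR variable name is an assumption, and the 50,000 expectation comes from the validation-set size above.

```shell
# Sanity-check the validation set before evaluating.
# VAL_DIR is an assumed name; point it at your copy of the data.
VAL_DIR=${VAL_DIR:-./imagenet/val_data}
count=$(find "$VAL_DIR" -name '*.JPEG' 2>/dev/null | wc -l)
echo "Found $count JPEG images in $VAL_DIR"
if [ "$count" -ne 50000 ]; then
  echo "Warning: expected 50000 images" >&2
fi
```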
- Pull nvcr.io/nvidia/tensorrt:19.09-py3 from NGC.
- Start the nvcr.io/nvidia/tensorrt:19.09-py3 container and install the dependencies:
# start docker container
sh run.sh
# Install TensorFlow
pip install tensorflow==1.13.1
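For reference, run.sh presumably wraps a docker invocation along these lines. This is a sketch under assumptions: the flags, the /app mount point, and the dry-run echo are illustrative, not the repo's actual script.

```shell
# Hypothetical expansion of run.sh; printed as a dry run so nothing is started.
IMAGE=nvcr.io/nvidia/tensorrt:19.09-py3
CMD="docker run --gpus all -it --rm -v $(pwd):/app -w /app $IMAGE"
echo "$CMD"  # drop the echo (execute $CMD directly) to actually start the container
```

Note that `--gpus all` requires Docker 19.03+; on older setups the equivalent was `--runtime=nvidia` via nvidia-docker2.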
- Set up the environment on the host:
sh set_env.sh
- In the container workspace, run the following commands:
# Assuming the workspace is mounted as /app
cd /app
# Run the inference evaluation
sh eval.sh