
enable inference benchmark #5933

Merged: 6 commits into PaddlePaddle:develop on Dec 1, 2017

Conversation

tensor-tang (Contributor):

fix #5911

--use_gpu=False \
--trainer_count=$thread \
--log_period=10 \
--config_args="batch_size=${bs},layer_num=${layer_num},is_test=True" \
@luotao1 (Contributor) commented Nov 29, 2017:

The is_test=True parameter does not exist in any of the three networks. The inference network differs from the training network, so please adjust them accordingly.


@luotao1 (Contributor):

At inference time there is no cost layer, so all the networks need to be adjusted.
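
For reference, a flag-level sketch of how the two jobs might differ, reconstructed from the paddle_trainer command line visible in the log later in this thread; invoking the binary directly like this, rather than through the benchmark script's wrapper, is an assumption:

# Training job: cost layer present, parameters saved for later reuse.
paddle_trainer --job=train \
  --config=vgg.py \
  --use_gpu=False \
  --trainer_count=$thread \
  --save_dir="models/${topology}-${layer_num}" \
  --config_args="batch_size=${bs},layer_num=${layer_num}"

# Inference job: no cost layer; the config is expected to branch on
# is_infer=True, and the saved parameters are loaded explicitly.
paddle_trainer --job=test \
  --config=vgg.py \
  --use_gpu=False \
  --trainer_count=$thread \
  --config_args="batch_size=${bs},layer_num=${layer_num},is_infer=True" \
  --init_model_path="models/${topology}-${layer_num}/pass-00000/"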

--save_dir="models/${topology}-${layer_num}" \
--config_args="batch_size=128,layer_num=${layer_num}" \
> /dev/null 2>&1
echo "Done"
@luotao1 (Contributor) commented Nov 29, 2017:

  • Do not run inference right after training; measuring inference performance that way is too slow.
  • The network used for inference does not need to be trained very well; since we only measure performance, a model saved from any batch will do.

@tensor-tang (Contributor, Author):

Currently, training is triggered only when no trained model is found locally, just to produce a model for inference.

That model is trained for only one num_pass; since the dummy data contains just 1024 images, training is not time-consuming, and it happens only once. Later inference runs of the same network all reuse the same model, so the overall impact is small.
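
A minimal sketch of the reuse logic described above, assuming the model directory layout shown elsewhere in this PR; train is the script's existing training function, and the exact existence check is an assumption:

# Train a throwaway model only if none exists for this configuration.
model_dir="models/${topology}-${layer_num}"
if [ ! -d "${model_dir}/pass-00000" ]; then
  # One pass over the 1024-image dummy set is enough: the model only
  # needs to exist, not to be accurate, since we measure speed only.
  train
fi
# Every later inference run of this network loads ${model_dir}/pass-00000/.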

@luotao1 (Contributor) left a comment:

Could the inference be written in a separate script?

@@ -30,13 +30,74 @@ function train() {
2>&1 | tee ${log}
}

if [ ! -d "train.list" ]; then
function test() {
@luotao1 (Contributor):

test->inference
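
Under the requested rename, the function might look roughly like the sketch below, assembled from the paddle_trainer command line shown in the log further down; the log filename and the argument order are assumptions, not the PR's actual code:

function inference() {
  topology=$1
  layer_num=$2
  bs=$3
  # Hypothetical log path; the real script may name its logs differently.
  log="logs/infer-${topology}-${layer_num}-bs${bs}.log"
  paddle_trainer --job=test \
    --config="${topology}.py" \
    --use_mkldnn=True \
    --use_gpu=False \
    --trainer_count=1 \
    --log_period=32 \
    --config_args="batch_size=${bs},layer_num=${layer_num},is_infer=True" \
    --init_model_path="models/${topology}-${layer_num}/pass-00000/" \
    2>&1 | tee ${log}
}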

@tensor-tang (Contributor, Author):

OK, no problem.

@tensor-tang (Contributor, Author):

Done

@luotao1 (Contributor) commented Nov 30, 2017:

I pulled the script down and ran it; the printed log is as follows:

Training model vgg_19
Done
I1130 09:54:24.775768   634 Util.cpp:166] commandline: /usr/local/bin/paddle_trainer --job=test --config=vgg.py --use_mkldnn=True --use_gpu=False --trainer_count=1 --log_period=32 --config_args=batch_size=1,layer_num=19,is_infer=True --init_model_path=models/vgg-19/pass-00000/ 
[INFO 2017-11-30 09:54:24,892 layers.py:2670] output for __conv_0__: c = 64, h = 224, w = 224, size = 3211264
[INFO 2017-11-30 09:54:24,893 layers.py:2670] output for __conv_1__: c = 64, h = 224, w = 224, size = 3211264
[INFO 2017-11-30 09:54:24,894 layers.py:2803] output for __pool_0__: c = 64, h = 112, w = 112, size = 802816
[INFO 2017-11-30 09:54:24,894 layers.py:2670] output for __conv_2__: c = 128, h = 112, w = 112, size = 1605632
[INFO 2017-11-30 09:54:24,895 layers.py:2670] output for __conv_3__: c = 128, h = 112, w = 112, size = 1605632
[INFO 2017-11-30 09:54:24,895 layers.py:2803] output for __pool_1__: c = 128, h = 56, w = 56, size = 401408
[INFO 2017-11-30 09:54:24,896 layers.py:2670] output for __conv_4__: c = 256, h = 56, w = 56, size = 802816
[INFO 2017-11-30 09:54:24,897 layers.py:2670] output for __conv_5__: c = 256, h = 56, w = 56, size = 802816
[INFO 2017-11-30 09:54:24,897 layers.py:2670] output for __conv_6__: c = 256, h = 56, w = 56, size = 802816
[INFO 2017-11-30 09:54:24,898 layers.py:2670] output for __conv_7__: c = 256, h = 56, w = 56, size = 802816
[INFO 2017-11-30 09:54:24,899 layers.py:2803] output for __pool_2__: c = 256, h = 28, w = 28, size = 200704
[INFO 2017-11-30 09:54:24,899 layers.py:2670] output for __conv_8__: c = 512, h = 28, w = 28, size = 401408
[INFO 2017-11-30 09:54:24,900 layers.py:2670] output for __conv_9__: c = 512, h = 28, w = 28, size = 401408
[INFO 2017-11-30 09:54:24,900 layers.py:2670] output for __conv_10__: c = 512, h = 28, w = 28, size = 401408
[INFO 2017-11-30 09:54:24,901 layers.py:2670] output for __conv_11__: c = 512, h = 28, w = 28, size = 401408
[INFO 2017-11-30 09:54:24,902 layers.py:2803] output for __pool_3__: c = 512, h = 14, w = 14, size = 100352
[INFO 2017-11-30 09:54:24,902 layers.py:2670] output for __conv_12__: c = 512, h = 14, w = 14, size = 100352
[INFO 2017-11-30 09:54:24,903 layers.py:2670] output for __conv_13__: c = 512, h = 14, w = 14, size = 100352
[INFO 2017-11-30 09:54:24,903 layers.py:2670] output for __conv_14__: c = 512, h = 14, w = 14, size = 100352
[INFO 2017-11-30 09:54:24,904 layers.py:2670] output for __conv_15__: c = 512, h = 14, w = 14, size = 100352
[INFO 2017-11-30 09:54:24,905 layers.py:2803] output for __pool_4__: c = 512, h = 7, w = 7, size = 25088
[INFO 2017-11-30 09:54:24,906 networks.py:1724] The input order is [image]
[INFO 2017-11-30 09:54:24,906 networks.py:1730] The output order is [__fc_layer_2__]
I1130 09:54:24.910554   634 Trainer.cpp:145] trainer: in testing mode
I1130 09:54:24.910567   634 Trainer.cpp:152] trainer mode: Testing
I1130 09:54:25.223145   634 PyDataProvider2.cpp:243] loading dataprovider provider::process
I1130 09:54:25.223551   634 GradientMachine.cpp:83] Loading parameters from models/vgg-19/pass-00000/
I1130 09:54:30.908305   634 Tester.cpp:143]  Batch=32 samples=32 AvgCost=1
I1130 09:54:31.867846   634 Tester.cpp:143]  Batch=64 samples=64 AvgCost=1
I1130 09:54:32.345593   634 Tester.cpp:143]  Batch=96 samples=96 AvgCost=1
I1130 09:54:32.801933   634 Tester.cpp:143]  Batch=128 samples=128 AvgCost=1
I1130 09:54:33.265005   634 Tester.cpp:143]  Batch=160 samples=160 AvgCost=1
I1130 09:54:33.743268   634 Tester.cpp:143]  Batch=192 samples=192 AvgCost=1
I1130 09:54:34.210225   634 Tester.cpp:143]  Batch=224 samples=224 AvgCost=1
I1130 09:54:34.666363   634 Tester.cpp:143]  Batch=256 samples=256 AvgCost=1
I1130 09:54:35.130930   634 Tester.cpp:143]  Batch=288 samples=288 AvgCost=1
I1130 09:54:35.564308   634 Tester.cpp:143]  Batch=320 samples=320 AvgCost=1
I1130 09:54:35.989562   634 Tester.cpp:143]  Batch=352 samples=352 AvgCost=1
I1130 09:54:36.428697   634 Tester.cpp:143]  Batch=384 samples=384 AvgCost=1
I1130 09:54:36.858831   634 Tester.cpp:143]  Batch=416 samples=416 AvgCost=1
I1130 09:54:37.290719   634 Tester.cpp:143]  Batch=448 samples=448 AvgCost=1
I1130 09:54:37.718937   634 Tester.cpp:143]  Batch=480 samples=480 AvgCost=1
I1130 09:54:38.143246   634 Tester.cpp:143]  Batch=512 samples=512 AvgCost=1
I1130 09:54:38.567258   634 Tester.cpp:143]  Batch=544 samples=544 AvgCost=1
I1130 09:54:38.992146   634 Tester.cpp:143]  Batch=576 samples=576 AvgCost=1
I1130 09:54:39.428045   634 Tester.cpp:143]  Batch=608 samples=608 AvgCost=1
I1130 09:54:39.852782   634 Tester.cpp:143]  Batch=640 samples=640 AvgCost=1
I1130 09:54:40.277169   634 Tester.cpp:143]  Batch=672 samples=672 AvgCost=1
I1130 09:54:40.701373   634 Tester.cpp:143]  Batch=704 samples=704 AvgCost=1
I1130 09:54:41.140897   634 Tester.cpp:143]  Batch=736 samples=736 AvgCost=1
I1130 09:54:41.642194   634 Tester.cpp:143]  Batch=768 samples=768 AvgCost=1
I1130 09:54:42.160156   634 Tester.cpp:143]  Batch=800 samples=800 AvgCost=1
I1130 09:54:42.675549   634 Tester.cpp:143]  Batch=832 samples=832 AvgCost=1
I1130 09:54:43.191017   634 Tester.cpp:143]  Batch=864 samples=864 AvgCost=1
I1130 09:54:43.706593   634 Tester.cpp:143]  Batch=896 samples=896 AvgCost=1
I1130 09:54:44.222618   634 Tester.cpp:143]  Batch=928 samples=928 AvgCost=1
I1130 09:54:44.743834   634 Tester.cpp:143]  Batch=960 samples=960 AvgCost=1
I1130 09:54:45.259887   634 Tester.cpp:143]  Batch=992 samples=992 AvgCost=1
I1130 09:54:45.780547   634 Tester.cpp:143]  Batch=1024 samples=1024 AvgCost=1
I1130 09:54:45.780632   634 Tester.cpp:245]  Pass=0 samples=1024 AvgCost=1 Eval: 

The timing statistics are missing; training used to print them:

I1122 10:19:50.843370   167 Stat.cpp:102] ======= StatSet: [GlobalStatInfo] status ======
I1122 10:19:50.843457   167 Stat.cpp:105] Stat=FwdBwd                         TID=167    total=48332.6    avg=483.325    max=636.896    min=456.854    count=100  

--log_period=32 could be increased a bit; 100 would work.

@tensor-tang (Contributor, Author):

Yes, there are statistics during training because they are hard-coded.

If needed, I can find a way to compute the timing when inference finishes, though the first few batches would have to be dropped as burn-in time.

The log period can be adjusted, but it has to follow the batch-size setting; I can change it together with the point above.
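
One hedged way to couple the log period to the batch size so that every case prints the same number of lines (the follow-up below settles on 10 per case); this formula is an assumption, not necessarily the PR's actual change:

# log_period counts batches, so divide the per-case batch count by the
# desired number of progress lines.
log_period=$(( num_samples / bs / 10 ))
# e.g. 2560 samples at batch_size=1 gives log_period=256, which matches
# the Batch=256,512,...,2560 lines in the log below.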

@tensor-tang (Contributor, Author):

Done.

The final result will look like the following, with each case printing only 10 log lines:

I1201 00:26:19.730397 146749 GradientMachine.cpp:83] Loading parameters from models/googlenet-v1/pass-00000/
I1201 00:26:26.132701 146749 Tester.cpp:143] Batch=256 samples=256 AvgCost=1
I1201 00:26:27.385949 146749 Tester.cpp:143] Batch=512 samples=512 AvgCost=1
I1201 00:26:28.974287 146749 Tester.cpp:143] Batch=768 samples=768 AvgCost=1
I1201 00:26:30.627349 146749 Tester.cpp:143] Batch=1024 samples=1024 AvgCost=1
I1201 00:26:32.402112 146749 Tester.cpp:143] Batch=1280 samples=1280 AvgCost=1
I1201 00:26:33.676046 146749 Tester.cpp:143] Batch=1536 samples=1536 AvgCost=1
I1201 00:26:34.764509 146749 Tester.cpp:143] Batch=1792 samples=1792 AvgCost=1
I1201 00:26:35.846726 146749 Tester.cpp:143] Batch=2048 samples=2048 AvgCost=1
I1201 00:26:36.928786 146749 Tester.cpp:143] Batch=2304 samples=2304 AvgCost=1
I1201 00:26:38.007845 146749 Tester.cpp:143] Batch=2560 samples=2560 AvgCost=1
I1201 00:26:38.007884 146749 Tester.cpp:245] Pass=0 samples=2560 AvgCost=1 Eval:
Last 1280 samples start: 00:26:32.402112 (1592.402112 sec), end: 00:26:38.007845 (1598.007845 sec)
FPS: 228.33 images/sec

An FPS value is printed at the end.
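
The arithmetic checks out: the last 1280 of 2560 samples span 1598.007845 - 1592.402112 = 5.605733 seconds, and 1280 / 5.605733 is about 228.33 images/sec. A small sketch of that computation, with the timestamps hard-coded for illustration (the PR's real implementation presumably extracts them from the log):

start=1592.402112   # seconds, from the Batch=1280 log line above
end=1598.007845     # seconds, from the Batch=2560 log line above
samples=1280        # second half of the run; the first half is burn-in
fps=$(echo "scale=2; ${samples} / (${end} - ${start})" | bc)
echo "FPS: ${fps} images/sec"   # prints: FPS: 228.33 images/sec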

@luotao1 (Contributor) left a comment:

Looks nice!

@luotao1 merged commit 000c1f7 into PaddlePaddle:develop on Dec 1, 2017.
@tensor-tang deleted the inference branch on December 1, 2017 at 12:05.