enable inference benchmark #5933

Merged: 6 commits, Dec 1, 2017
2 changes: 1 addition & 1 deletion benchmark/paddle/image/googlenet.py
@@ -9,7 +9,7 @@

 args = {'height': height, 'width': width, 'color': True, 'num_class': num_class}
 define_py_data_sources2(
-    "train.list", None, module="provider", obj="process", args=args)
+    "train.list", "test.list", module="provider", obj="process", args=args)

 settings(
     batch_size=batch_size,
2 changes: 1 addition & 1 deletion benchmark/paddle/image/resnet.py
@@ -10,7 +10,7 @@

 args = {'height': height, 'width': width, 'color': True, 'num_class': num_class}
 define_py_data_sources2(
-    "train.list", None, module="provider", obj="process", args=args)
+    "train.list", "test.list", module="provider", obj="process", args=args)

 settings(
     batch_size=batch_size,
69 changes: 65 additions & 4 deletions benchmark/paddle/image/run_mkldnn.sh
@@ -8,13 +8,13 @@ function train() {
   use_mkldnn=$4
   if [ $4 == "True" ]; then
     thread=1
-    log="logs/${topology}-${layer_num}-mkldnn-${bs}.log"
+    log="logs/train-${topology}-${layer_num}-mkldnn-${bs}.log"
   elif [ $4 == "False" ]; then
     thread=`nproc`
     # each trainer_count use only 1 core to avoid conflict
-    log="logs/${topology}-${layer_num}-${thread}mklml-${bs}.log"
+    log="logs/train-${topology}-${layer_num}-${thread}mklml-${bs}.log"
   else
-    echo "Wrong input $3, use True or False."
+    echo "Wrong input $4, use True or False."
     exit 0
   fi
   args="batch_size=${bs},layer_num=${layer_num}"
@@ -30,13 +30,74 @@ function train() {
     2>&1 | tee ${log}
 }

-if [ ! -d "train.list" ]; then
+function test() {
Review comment (Contributor):

test -> inference

+  unset OMP_NUM_THREADS MKL_NUM_THREADS OMP_DYNAMIC KMP_AFFINITY
+  topology=$1
+  layer_num=$2
+  bs=$3
+  use_mkldnn=$4
+  if [ $4 == "True" ]; then
+    thread=1
+    log="logs/test-${topology}-${layer_num}-mkldnn-${bs}.log"
+  elif [ $4 == "False" ]; then
+    thread=`nproc`
+    if [ $thread -gt $bs ]; then
+      thread=$bs
+    fi
+    log="logs/test-${topology}-${layer_num}-${thread}mklml-${bs}.log"
+  else
+    echo "Wrong input $4, use True or False."
+    exit 0
+  fi
+
+  models_in="models/${topology}-${layer_num}/pass-00000/"
+  if [ ! -d $models_in ]; then
+    echo "Training model ${topology}_${layer_num}"
+    paddle train --job=train \
+      --config="${topology}.py" \
+      --use_mkldnn=True \
+      --use_gpu=False \
+      --trainer_count=1 \
+      --num_passes=1 \
+      --save_dir="models/${topology}-${layer_num}" \
+      --config_args="batch_size=128,layer_num=${layer_num}" \
+      > /dev/null 2>&1
+    echo "Done"
Review comment (luotao1, Contributor, Nov 29, 2017):

  • Do not measure inference right after training; that makes benchmarking inference performance too slow.
  • The network used for inference does not need to be well trained. Since this only measures performance, a model saved after any batch is sufficient.

Reply (Contributor Author):

Currently, training runs only when no trained model is found locally, in order to produce a model for inference.

That model is trained for only one pass (num_passes=1). Because the dummy data contains only 1024 images, training does not take long, and it happens only once: later inference runs of the same network all reuse the same model, so the overall impact should be small.
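
The caching behaviour described in this reply (train once, only when no saved model exists, then reuse it) can be sketched as a standalone helper. This is a minimal sketch, not code from the PR: the name `ensure_model` is hypothetical, and in the usage line `mkdir -p` merely stands in for the real `paddle train --job=train ...` invocation.

```shell
# Hypothetical helper (not part of the PR): create a model directory only
# when it is missing, so a timed inference run never includes training.
# $1 is the directory to check; the remaining arguments are the command
# that produces it.
ensure_model() {
  model_dir=$1
  shift
  if [ -d "$model_dir" ]; then
    echo "Reusing model in ${model_dir}"
    return 0
  fi
  echo "Generating model in ${model_dir}"
  "$@"
}

# Usage sketch; `mkdir -p` is a stand-in for the real training command:
# ensure_model "models/resnet-50/pass-00000/" mkdir -p "models/resnet-50/pass-00000/"
```

Keeping model generation in its own function would also allow it to run as a separate step before any timed inference run, which is one possible way to address the reviewer's concern.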

+  fi
+  paddle train --job=test \
+    --config="${topology}.py" \
+    --use_mkldnn=$use_mkldnn \
+    --use_gpu=False \
+    --trainer_count=$thread \
+    --log_period=10 \
+    --config_args="batch_size=${bs},layer_num=${layer_num},is_test=True" \
Review comment (luotao1, Contributor, Nov 29, 2017):

None of the three network configs defines the is_test=True argument.
The inference network differs from the training network; please adjust the configs accordingly.


Reply (Contributor):

There is no cost at inference time, so all of the networks need to be adjusted.
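
One way to catch a mismatch like this before a benchmark run is to check that the config file actually references the argument being passed. The sketch below is an illustration under assumptions, not part of the PR: `config_mentions_arg` is a hypothetical helper, and it assumes only that a config names its arguments as quoted strings.

```shell
# Hypothetical pre-flight check (not part of the PR): succeed only if the
# config file contains the argument name as a quoted string.
# $1 is the config file, $2 is the argument name.
config_mentions_arg() {
  config_file=$1
  arg_name=$2
  # Match the name in single or double quotes, e.g. 'is_test' or "is_test".
  grep -q "[\"']${arg_name}[\"']" "$config_file"
}

# Usage sketch:
# config_mentions_arg resnet.py is_test || echo "resnet.py never reads is_test" >&2
```

This only confirms that the name appears in the file; it does not verify that the config actually drops the cost when `is_test` is set.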

+    --init_model_path=$models_in \
+    2>&1 | tee ${log}
+}
+
+if [ ! -f "train.list" ]; then
   echo " " > train.list
 fi
+if [ ! -f "test.list" ]; then
+  echo " " > test.list
+fi
 if [ ! -d "logs" ]; then
   mkdir logs
 fi
+if [ ! -d "models" ]; then
+  mkdir -p models
+fi

+# inference benchmark
+for use_mkldnn in True False; do
+  for batchsize in 1 2 4 8 16; do
+    test googlenet v1 $batchsize $use_mkldnn
+    test resnet 50 $batchsize $use_mkldnn
+    test vgg 19 $batchsize $use_mkldnn
+  done
+done

 # training benchmark
 for use_mkldnn in True False; do
   for batchsize in 64 128 256; do
     train vgg 19 $batchsize $use_mkldnn
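
The thread selection inside `test()` (one thread with MKL-DNN, otherwise the core count capped at the batch size) can be isolated into a small function for easier checking. This is a sketch under assumptions rather than the PR's code: `pick_threads` is a hypothetical name, the core count is passed in as an argument instead of being read from `nproc`, and invalid input returns a non-zero status, whereas the script itself calls `exit 0`.

```shell
# Sketch only: mirrors the thread selection in test() as a standalone function.
# $1 = use_mkldnn ("True" or "False"), $2 = batch size, $3 = available cores
# (passed explicitly here so the function is easy to test; the script uses `nproc`).
pick_threads() {
  case "$1" in
    True)
      # MKL-DNN runs use a single trainer thread.
      echo 1
      ;;
    False)
      # Never use more trainer threads than there are samples in a batch.
      if [ "$3" -gt "$2" ]; then
        echo "$2"
      else
        echo "$3"
      fi
      ;;
    *)
      echo "Wrong input $1, use True or False." >&2
      return 1
      ;;
  esac
}

# Usage sketch:
# thread=$(pick_threads "$use_mkldnn" "$bs" "$(nproc)") || exit 1
```

Returning a non-zero status on bad input lets a calling script or CI job detect the failure, which an `exit 0` would hide.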
2 changes: 1 addition & 1 deletion benchmark/paddle/image/vgg.py
@@ -9,7 +9,7 @@

 args = {'height': height, 'width': width, 'color': True, 'num_class': num_class}
 define_py_data_sources2(
-    "train.list", None, module="provider", obj="process", args=args)
+    "train.list", "test.list", module="provider", obj="process", args=args)

 settings(
     batch_size=batch_size,