fix ut error of test_recognize_digits, test=develop #27791
Merged
PR types
Others
PR changes
Others
Describe
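Problem description:
The test_recognize_digits unit test randomly throws a "broken" error for the model parameter file during inference. The error log is as follows:
Root cause analysis:
The failure is related to the PR that fixed test_train_recognize_digits_mlp and test_train_recognize_digits_conv: https://github.com/PaddlePaddle/Paddle/pull/27475.
After the fix in PR #27475, the paddle_build.sh script runs the unit tests with the following command:
These two unit tests are run in parallel. Because PR #27475 made both of them depend on test_recognize_digits, test_recognize_digits is run twice concurrently. The log is as follows:
When two instances of test_recognize_digits run in parallel, they read and write the same param file in the same directory, so the param file that the infer func in test_recognize_digits loads can be corrupted.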
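The collision can be illustrated with a minimal Python sketch. It does not reproduce the actual test code, and it is not the fix this PR applies: the helper names and the directory naming are hypothetical. It only shows why two concurrent runs that derive their save path from the same fixed name end up on the same file, and how a per-run temporary directory would keep them apart.

```python
import os
import tempfile


def shared_param_path(nn_type):
    # Hypothetical: the path depends only on nn_type, so two concurrent
    # runs of the same case compute the same file and can clobber it.
    save_dir = "recognize_digits_%s.inference.model" % nn_type
    return os.path.join(save_dir, "params")


def isolated_param_path(nn_type):
    # Each call creates a fresh directory, so concurrent runs never
    # read or write the same param file.
    run_dir = tempfile.mkdtemp(prefix="recognize_digits_%s_" % nn_type)
    return os.path.join(run_dir, "params")


if __name__ == "__main__":
    # Same path twice: this is the collision.
    print(shared_param_path("mlp") == shared_param_path("mlp"))
    # Different paths: no shared file between runs.
    print(isolated_param_path("mlp") != isolated_param_path("mlp"))
```

Isolating the output directory is an alternative way to avoid the race; whether it fits the real test layout is an assumption.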
Solution:
Merge test_train_recognize_digits_mlp and test_train_recognize_digits_conv into a single unit test case. Run sequentially, they no longer hit the problem above.
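As a rough sketch of the idea (not the actual change in this PR), the merged case can be pictured as one entry point that runs the two network types one after the other. `run_case` is a hypothetical stand-in for a single train-then-infer run, and the param file is reduced to a plain text file for illustration.

```python
import os
import tempfile


def run_case(nn_type, work_dir):
    """Hypothetical stand-in for one train-then-infer case."""
    param_path = os.path.join(work_dir, "params")
    # "Train": save the parameters to the shared file.
    with open(param_path, "w") as f:
        f.write(nn_type)
    # "Infer": load the parameters back from the same file.
    with open(param_path) as f:
        return f.read()


def run_all_sequentially(work_dir):
    # The mlp case finishes completely before the conv case starts, so
    # the shared file is never read while another case is writing it.
    return [run_case(nn_type, work_dir) for nn_type in ("mlp", "conv")]


if __name__ == "__main__":
    print(run_all_sequentially(tempfile.mkdtemp()))
```

Because both cases live in one process and run in order, each infer step only ever sees the file written by its own train step.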
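I also tried changing the dependencies set through set_tests_properties in the cmake file. However, when paddle_build.sh runs the two unit tests in parallel in the background, test_recognize_digits is still executed twice at the same time, even if the two tests are made to depend directly on each other. The only way to avoid this is for one of the tests not to depend on test_recognize_digits, which in turn causes test_train_recognize_digits_mlp or test_train_recognize_digits_conv to fail.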