Merge pull request PaddlePaddle#62 from jiweibo/windows_cpu_demo
add xpu, arm ZHAOXIN and SW.
Showing 5 changed files with 576 additions and 0 deletions.
# Running Inference on Kunlun XPU

Baidu's Kunlun chip is a high-performance AI SoC that supports both inference and training. Built on Baidu's advanced AI architecture, it is well suited to the cloud computing needs of common deep learning and machine learning algorithms, and can also serve the computing needs of many edge scenarios such as natural language processing, large-scale speech recognition, autonomous driving, and large-scale recommendation.

Paddle Inference integrates the [Paddle-Lite engine](https://paddle-lite.readthedocs.io/zh/latest/demo_guides/baidu_xpu.html) for inference deployment on Kunlun XPU.

## Build Notes

Make sure the build is configured with WITH_LITE=ON and that XPU_SDK_ROOT points to the correct path.

## Usage

When using a Predictor, enable execution on XPU through the Config interface:
```c++
config->EnableLiteEngine(
    /*precision_mode=*/PrecisionType::kFloat32,
    /*zero_copy=*/false,
    /*passes_filter=*/{},
    /*ops_filter=*/{});
```
- **`precision_mode`**: type `enum class PrecisionType {kFloat32 = 0, kHalf, kInt8,};`, default `PrecisionType::kFloat32`. The precision used to run the Lite subgraph.
- **`zero_copy`**: type `bool`. Whether data is passed between the Lite subgraph and Paddle in zero-copy mode.
- **`passes_filter`**: type `std::vector<std::string>`, empty by default. Extension interface, currently unused.
- **`ops_filter`**: type `std::vector<std::string>`, empty by default. Explicitly lists the ops that must not run in the Lite subgraph.

The corresponding Python interface:
```python
config.enable_lite_engine(
    precision_mode=PrecisionType.Float32,
    zero_copy=False,
    passes_filter=[],
    ops_filter=[]
)
```
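The C++ and Python interfaces share the same precision values. As a standalone illustration of the numeric values declared in the C++ enum above (this mirror class is our own sketch, not the actual `paddle.inference.PrecisionType`):

```python
from enum import IntEnum

# Illustrative mirror of `enum class PrecisionType {kFloat32 = 0, kHalf, kInt8,};`
# from the C++ interface; NOT the real Paddle class.
class PrecisionTypeMirror(IntEnum):
    kFloat32 = 0
    kHalf = 1
    kInt8 = 2

print(PrecisionTypeMirror.kFloat32.value)  # 0
print(PrecisionTypeMirror.kInt8.value)     # 2
```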
### Python demo

Because Paddle Inference does not yet bundle the XPU SDK into the wheel package, you need to download the SDK yourself and add it to your environment variables; this may be addressed in a later release.

Download [xpu_tool_chain](https://paddle-inference-dist.bj.bcebos.com/inference_demo/xpu_tool_chain.tgz), extract it, and add its shared libraries to LD_LIBRARY_PATH:
```
tar xzf xpu_tool_chain.tgz
```
```
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PWD/output/XTDK/shlib/:$PWD/output/XTDK/runtime/shlib/
```
Download the [resnet50](https://paddle-inference-dist.bj.bcebos.com/inference_demo/python/resnet50/ResNet50.tar.gz) model, extract it, and run the following command to invoke the inference engine:

```bash
python resnet50_subgraph.py --model_file ./ResNet50/model --params_file ./ResNet50/params
```
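The demo script feeds a synthetic all-ones batch rather than real images. That input-preparation step can be sketched standalone (assuming only numpy is installed):

```python
import numpy as np

# Build the same synthetic NCHW batch the demo passes to the predictor:
# batch_size x 3 channels x 224 x 224 pixels, float32, all ones.
batch_size = 1
fake_input = np.ones((batch_size, 3, 224, 224)).astype("float32")

print(fake_input.shape)  # (1, 3, 224, 224)
print(fake_input.dtype)  # float32
```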
The content of resnet50_subgraph.py:
```python
import argparse
import time

import numpy as np

from paddle.inference import Config, PrecisionType
from paddle.inference import create_predictor


def main():
    args = parse_args()
    config = set_config(args)
    predictor = create_predictor(config)

    input_names = predictor.get_input_names()
    input_handle = predictor.get_input_handle(input_names[0])

    fake_input = np.ones((args.batch_size, 3, 224, 224)).astype("float32")
    input_handle.reshape([args.batch_size, 3, 224, 224])
    input_handle.copy_from_cpu(fake_input)

    for i in range(args.warmup):
        predictor.run()

    start_time = time.time()
    for i in range(args.repeats):
        predictor.run()
        output_names = predictor.get_output_names()
        output_handle = predictor.get_output_handle(output_names[0])
        output_data = output_handle.copy_to_cpu()
    end_time = time.time()

    print(output_data[0, :10])
    print('time is: {}'.format((end_time - start_time) / args.repeats * 1000))


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_dir", type=str, help="model dir")
    parser.add_argument("--model_file", type=str, help="model filename")
    parser.add_argument("--params_file", type=str, help="parameter filename")
    parser.add_argument("--batch_size", type=int, default=1, help="batch size")
    parser.add_argument("--warmup", type=int, default=0, help="warmup")
    parser.add_argument("--repeats", type=int, default=1, help="repeats")
    parser.add_argument("--math_thread_num", type=int, default=1, help="math_thread_num")
    return parser.parse_args()


def set_config(args):
    config = Config(args.model_file, args.params_file)
    config.enable_lite_engine(PrecisionType.Float32, True)
    # use lite xpu subgraph
    config.enable_xpu(10 * 1024 * 1024)
    # use lite cuda subgraph
    # config.enable_use_gpu(100, 0)
    config.set_cpu_math_library_num_threads(args.math_thread_num)
    return config


if __name__ == "__main__":
    main()
```
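The script above reports average latency as total wall time over `repeats` runs, converted to milliseconds. That timing arithmetic can be shown in isolation, with a stand-in workload replacing `predictor.run()`:

```python
import time

def average_latency_ms(run, warmup, repeats):
    """Average wall-clock time of `run()` in milliseconds, after `warmup` untimed calls."""
    for _ in range(warmup):
        run()
    start_time = time.time()
    for _ in range(repeats):
        run()
    end_time = time.time()
    return (end_time - start_time) / repeats * 1000

# Stand-in workload instead of predictor.run(): sleep ~10 ms per call.
latency = average_latency_ms(lambda: time.sleep(0.01), warmup=1, repeats=5)
print("time is: {:.2f} ms".format(latency))
```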
# **Building from Source on Phytium/Kunpeng**

## Environment Preparation

* **Processor: FT2000+/Kunpeng 920 2426SK**
* **OS: Kylin v10/UOS**
* **Python version: 2.7.15+/3.5.1+/3.6/3.7/3.8 (64 bit)**
* **pip or pip3 version: 9.0.1+ (64 bit)**

The Phytium FT2000+ and Kunpeng 920 processors are both ARMv8 architecture, and Paddle is built the same way on either. This document uses FT2000+ as the example to describe building Paddle from source.
## Installation Steps

At present, installing Paddle on an FT2000+ processor with a domestic operating system (Kylin or UOS) is only supported by building from source. The steps are described in detail below.

<a name="arm_source"></a>
### **Building from Source**

1. Paddle uses cmake for its build and requires cmake >= 3.10. If your OS repositories provide a suitable version, install it directly; otherwise [build it from source](https://github.com/Kitware/CMake):
```
wget https://github.com/Kitware/CMake/releases/download/v3.16.8/cmake-3.16.8.tar.gz
```

```
tar -xzf cmake-3.16.8.tar.gz && cd cmake-3.16.8
```

```
./bootstrap && make && sudo make install
```
2. Paddle uses patchelf internally to modify the rpath of dynamic libraries. If your OS repositories provide patchelf, install it directly; otherwise build it from source as described in the [patchelf documentation](https://github.com/NixOS/patchelf). This dependency may be removed on ARM in a later release.
```
./bootstrap.sh
```

```
./configure
```

```
make
```

```
make check
```

```
sudo make install
```
3. Install the Python dependencies listed in [requirements.txt](https://github.com/PaddlePaddle/Paddle/blob/develop/python/requirements.txt). On Phytium with a domestic operating system, pip installation may fail or misbehave, so install the main dependencies from OS repositories or from source; using the system repositories is recommended.

4. Clone the Paddle source code into a Paddle directory under the current directory, then enter it:

```
git clone https://github.com/PaddlePaddle/Paddle.git
```

```
cd Paddle
```

5. Switch to a stable release branch to build:

```
git checkout [branch name]
```

For example:

```
git checkout release/2.0-rc1
```

6. Create a directory named build and enter it:

```
mkdir build && cd build
```
7. The link step opens many files and may exceed the system default limit, causing build errors. Raise the maximum number of open files allowed for the process:

```
ulimit -n 4096
```
8. Run cmake:

> See the [build options table](../Tables.html#Compile) for the meaning of each option.

For Python2:
```
cmake .. -DPY_VERSION=2 -DPYTHON_EXECUTABLE=`which python2` -DWITH_ARM=ON -DWITH_TESTING=OFF -DCMAKE_BUILD_TYPE=Release -DON_INFER=ON -DWITH_XBYAK=OFF
```

For Python3:
```
cmake .. -DPY_VERSION=3 -DPYTHON_EXECUTABLE=`which python3` -DWITH_ARM=ON -DWITH_TESTING=OFF -DCMAKE_BUILD_TYPE=Release -DON_INFER=ON -DWITH_XBYAK=OFF
```
9. Build with the following command. Note: because the processor is ARM, the build will fail unless `TARGET=ARMV8` is passed.

```
make TARGET=ARMV8 -j$(nproc)
```

10. After a successful build, find the generated `.whl` package under `Paddle/build/python/dist`.
11. Install the built `.whl` package on the current machine or the target machine:

```
pip install -U [name of the whl package]
```
or
```
pip3 install -U [name of the whl package]
```

Congratulations! You have now built and installed PaddlePaddle in the FT environment.
## **Verify Installation**

After installation, start the Python interpreter with `python` or `python3`, enter `import paddle.fluid as fluid`, then enter
`fluid.install_check.run_check()`

If `Your Paddle Fluid is installed succesfully!` appears, the installation succeeded.
Test on the mobilenetv1 and resnet50 models:

```
wget -O profile.tar https://paddle-cetc15.bj.bcebos.com/profile.tar?authorization=bce-auth-v1/4409a3f3dd76482ab77af112631f01e4/2020-10-09T10:11:53Z/-1/host/786789f3445f498c6a1fd4d9cd3897ac7233700df0c6ae2fd78079eba89bf3fb
tar xf profile.tar && cd profile
python resnet.py --model_file ResNet50_inference/model --params_file ResNet50_inference/params
# Expected output: [0.0002414 0.00022418 0.00053661 0.00028639 0.00072682 0.000213
#  0.00638718 0.00128127 0.00013535 0.0007676 ]
python mobilenetv1.py --model_file mobilenetv1/model --params_file mobilenetv1/params
# Expected output: [0.00123949 0.00100392 0.00109539 0.00112206 0.00101901 0.00088412
#  0.00121536 0.00107679 0.00106071 0.00099605]
python ernie.py --model_dir ernieL3H128_model/
# Expected output: [0.49879393 0.5012061 ]
```
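The expected outputs above are compared by eye; a tolerance-based comparison using `numpy.allclose` makes the check mechanical. The reference values below are the resnet ones quoted above; the helper function name is our own:

```python
import numpy as np

# Reference values quoted for resnet.py above.
expected = np.array([0.0002414, 0.00022418, 0.00053661, 0.00028639, 0.00072682,
                     0.000213, 0.00638718, 0.00128127, 0.00013535, 0.0007676])

def outputs_match(actual, reference, rtol=1e-3, atol=1e-6):
    """True when the model output agrees with the reference within tolerance."""
    return bool(np.allclose(actual, reference, rtol=rtol, atol=atol))

# In practice `actual` would be output_data[0, :10] from the predictor;
# here the reference is checked against itself as a smoke test.
print(outputs_match(expected, expected))  # True
```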
## **Uninstall**

Uninstall PaddlePaddle with:

```
pip uninstall paddlepaddle
```
or
```
pip3 uninstall paddlepaddle
```
## **Notes**

The resnet50, mobilenetv1, ernie, and ELMo models have been tested under the ARM architecture, which largely verifies the correctness of the operators used for inference. If you encounter wrong results, build failures, or other problems, please leave a message in an [issue](https://github.com/PaddlePaddle/Paddle/issues) and we will address it promptly.

Inference documentation is available at [doc](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/guides/05_inference_deployment/inference/native_infer.html); usage examples are in [Paddle-Inference-Demo](https://github.com/PaddlePaddle/Paddle-Inference-Demo).