Paddle on Mobile #5782

hedaoyuan · 2017-11-20T12:44:18Z

Paddle on Mobile

Based on previous work and issues, I've listed the things Paddle needs to do on mobile and embedded devices.

Build

The Paddle mobile inference library needs to support a variety of computing platforms, including Linux, Android, and iOS, on CPUs and other processors. So we need to keep refining the build system (especially the Android and iOS builds). In addition, the binary size of the inference library needs further optimization.

Inference API

The C-API design did not consider mobile scenarios. The existing C-API is also insufficient on the mobile side (Android needs a Java API). We need to decide whether to refactor or refine the C-API.
It would also be more reasonable to rename the C-API to Inference API. In addition, we need to improve the inference programming model on mobile.

Low Precision

Low-precision computation allows smaller models and faster inference. Hardware support for low-precision computing is improving on many platforms. Next year, there will be chips that support the ARMv8.2 instruction set architecture, so we can use float16 computation on mobile to speed up model inference. Issue #4853 covers support for float16 computation.
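To make the size and precision trade-off concrete, here is a minimal sketch using only the Python standard library. It is not Paddle code: the helper names and the sample `weights` values are made up for illustration, and it only shows the storage side of float16 (IEEE 754 half precision), not the ARMv8.2 arithmetic speedup.

```python
import struct

def to_float16_bytes(values):
    """Pack floats as little-endian IEEE 754 half precision (2 bytes each)."""
    return struct.pack(f"<{len(values)}e", *values)

def to_float32_bytes(values):
    """Pack floats as little-endian IEEE 754 single precision (4 bytes each)."""
    return struct.pack(f"<{len(values)}f", *values)

def from_float16_bytes(blob):
    """Unpack half-precision bytes back into Python floats."""
    return list(struct.unpack(f"<{len(blob) // 2}e", blob))

# Hypothetical weight values, chosen only for illustration.
weights = [0.1, -1.5, 3.14159, 1000.0]

half = to_float16_bytes(weights)      # 2 bytes per value
single = to_float32_bytes(weights)    # 4 bytes per value
restored = from_float16_bytes(half)   # values after the float16 round trip

# Largest relative error introduced by rounding to float16.
max_rel_err = max(abs(a - b) / abs(a) for a, b in zip(weights, restored))

print(len(half), len(single), max_rel_err)
```

The float16 buffer is half the size of the float32 one, at the cost of a rounding error per value. Whether that error is acceptable depends on the model, which is why low-precision inference usually needs an accuracy check on real data.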

Multi-Thread

Multi-threaded computing can speed up some computationally intensive operations. However, because of the big.LITTLE architecture and power consumption constraints, multi-threading on mobile rarely achieves the expected speedup. #4678 explains the difficulties of multi-threaded computing on mobile in more detail.
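One way to see why heterogeneous cores hurt scaling is a toy model, sketched below. It assumes a perfectly parallel workload, and the relative core speeds (1.0 for big cores, 0.4 for little cores) are hypothetical numbers, not measurements. It ignores power and thermal throttling, and the linked issue may describe other causes.

```python
def speedup_even_split(core_speeds):
    """Speedup over one core of speed 1.0 when work is split evenly.

    Every thread gets the same share, so the run finishes only when the
    slowest core finishes: speedup = number_of_cores * slowest_speed.
    """
    return len(core_speeds) * min(core_speeds)

def speedup_proportional_split(core_speeds):
    """Speedup when each core gets work in proportion to its speed.

    All cores finish at the same time, so the speedup is the sum of speeds.
    """
    return sum(core_speeds)

# Hypothetical 4 big + 4 little configuration.
big_only = [1.0] * 4
big_little = [1.0] * 4 + [0.4] * 4

print(speedup_even_split(big_only))            # 4 threads on big cores only
print(speedup_even_split(big_little))          # 8 threads, even split
print(speedup_proportional_split(big_little))  # 8 threads, speed-aware split
```

Under these assumptions, an even split over all eight cores is slower than using the four big cores alone, because the little cores become the bottleneck. A speed-aware split recovers the loss, but it requires knowing which core each thread runs on.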

Mobile GPU

Mobile GPU performance has improved greatly in recent years. For some computationally intensive operations, the mobile GPU can be an order of magnitude faster than the CPU. We need to add GPU computing to Paddle Mobile. #5469 explains in more detail why Paddle needs to support the mobile GPU.

Hardware Acceleration

On mobile, hardware acceleration for model inference is a trend. We need to study libraries such as Android NN, SNPE, and ARM NN that can be used for hardware acceleration, and work out how Paddle can use them for model inference. Here is a project for this work.

Convolution optimization

Matrix multiplication optimization

Document

Benchmark

Demo

wangkuiyi · 2017-11-21T20:07:05Z

Good job! Please make this issue a GitHub Project ASAP to make our project management and progress tracking transparent. @hedaoyuan, for more details about the motivation, please refer to my recent email.

hedaoyuan · 2017-11-22T02:45:44Z

@wangkuiyi This is the Paddle Mobile project: https://github.com/PaddlePaddle/Paddle/projects/12

Xreki · 2018-04-26T03:38:56Z

Closed because we won't add new features to Paddle v2.

hedaoyuan mentioned this issue Nov 20, 2017

Work on Embedded #2025

Closed

hedaoyuan self-assigned this Nov 20, 2017

hedaoyuan mentioned this issue Nov 20, 2017

Make a general framework for doc in how_to_use_capi. PaddlePaddle/models#385

Closed

Xreki added the mobile label Nov 21, 2017

Xreki closed this as completed Apr 26, 2018
