Based on previous work and issues, I've listed the things Paddle needs to do on mobile and embedded devices.
Build
The Paddle mobile inference library needs to support a variety of computing platforms, including Linux, Android, and iOS, as well as CPUs and other processors. We therefore need to keep refining the build system, especially the Android and iOS builds. The binary size of the inference library also needs to be reduced further.
Inference API
The C-API was not designed with mobile use cases in mind, and the existing C-API is not sufficient on mobile (Android needs a Java API). We need to decide whether to refactor or refine the C-API.
It would also be more reasonable to rename the C-API to Inference API, and we need to improve the inference programming model on mobile.
Low Precision
Low-precision computation allows for smaller models and faster inference. Many hardware vendors are improving their support for low-precision computing. Next year, there will be chips that support the ARMv8.2 instruction set architecture, so we can use float16 computation on mobile devices to speed up model inference. Issue #4853 covers support for float16 computation.
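To make the size and accuracy trade-off concrete, here is a minimal sketch in plain Python. It is not Paddle code: the helper names and the sample weights are made up for illustration. It only shows the storage saving and rounding error of IEEE 754 half precision, using the standard `struct` module's `e` format; it says nothing about the speedup from native float16 arithmetic on ARMv8.2 hardware.

```python
import struct


def to_float16_bytes(values):
    """Pack floats as little-endian IEEE 754 half precision (2 bytes each)."""
    return struct.pack(f"<{len(values)}e", *values)


def to_float32_bytes(values):
    """Pack floats as little-endian IEEE 754 single precision (4 bytes each)."""
    return struct.pack(f"<{len(values)}f", *values)


def from_float16_bytes(blob):
    """Unpack half-precision bytes back into Python floats."""
    return list(struct.unpack(f"<{len(blob) // 2}e", blob))


# Hypothetical weights, chosen only for illustration.
weights = [0.1, -1.5, 3.14159, 1024.0]

half = to_float16_bytes(weights)
single = to_float32_bytes(weights)
restored = from_float16_bytes(half)

# Largest rounding error introduced by storing the weights as float16.
max_abs_error = max(abs(a - b) for a, b in zip(weights, restored))

print(len(single), len(half))  # float16 storage is half the size of float32
print(max_abs_error)
```

Values that are exactly representable in half precision (such as -1.5 and 1024.0) survive the round trip unchanged, while the others pick up a small rounding error. Whether that error is acceptable for a given model would have to be checked case by case.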
Multi-Thread
Multi-threaded computing can speed up some computationally intensive operations. However, because of the big.LITTLE architecture and power consumption constraints, multi-threading on mobile devices rarely achieves the expected speedup. Issue #4678 explains the difficulties of multi-threaded computing on mobile in more detail.
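One way to see why heterogeneous cores can hurt is a toy scheduling model. The sketch below is an illustration under strong assumptions, not a description of Paddle's threading or of the analysis in the linked issue: the core speeds are hypothetical, every task costs the same, and power, thermal throttling, and synchronization overhead are ignored. It compares splitting work equally across threads with letting each core pull tasks from a shared queue.

```python
import heapq


def static_split_time(num_tasks, core_speeds):
    """Finish time when tasks are divided equally among cores up front."""
    base, extra = divmod(num_tasks, len(core_speeds))
    times = []
    for i, speed in enumerate(core_speeds):
        share = base + (1 if i < extra else 0)
        times.append(share / speed)
    # The whole job is done only when the slowest core finishes its share.
    return max(times)


def dynamic_queue_time(num_tasks, core_speeds):
    """Finish time when each idle core pulls the next task from a shared queue."""
    # Min-heap of (time at which the core becomes free, core index).
    free_at = [(0.0, i) for i in range(len(core_speeds))]
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(num_tasks):
        now, core = heapq.heappop(free_at)
        done = now + 1.0 / core_speeds[core]
        finish = max(finish, done)
        heapq.heappush(free_at, (done, core))
    return finish


# Hypothetical speeds in tasks per time unit: two big cores, two little cores.
cores = [4.0, 4.0, 1.0, 1.0]
tasks = 40

one_big_core = tasks / cores[0]
static_time = static_split_time(tasks, cores)
dynamic_time = dynamic_queue_time(tasks, cores)

print(one_big_core, static_time, dynamic_time)
```

In this model, four threads with an equal split finish no sooner than a single big core, because the little cores become the bottleneck. Pulling work dynamically recovers the speedup here, but real devices add costs that the model leaves out.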
Mobile GPU
Mobile GPU performance has improved greatly in recent years. For some computationally intensive operations, the mobile GPU can be an order of magnitude faster than the CPU. We need to add GPU computing to Paddle Mobile. Issue #5469 explains in more detail why Paddle needs to support the mobile GPU.
Hardware Acceleration
On mobile devices, hardware acceleration for model inference is a growing trend. We need to study libraries such as Android NN, SNPE, and ARM NN that can be used for hardware acceleration, and work out how Paddle can use them for model inference. Here is a project for this work.
Convolution optimization
Matrix multiplication optimization
Document
Benchmark
Demo
Good job! Please turn this issue into a GitHub Project ASAP so that our project management and progress tracking are transparent. @hedaoyuan, for more details about the motivation, please refer to my recent email.