A curated list of awesome edge machine learning resources, including research papers, inference engines, challenges, books, meetups and others.
- Papers
- Datasets
- Inference Engines
- MCU and MPU Software Packages
- AI Chips
- Books
- Challenges
- Other Resources
- Contribute
- License
There are countless possible edge machine learning applications. Here, we collect papers that describe specific solutions.
Automated machine learning (AutoML) is the process of automating the end-to-end process of applying machine learning to real-world problems (Wikipedia). AutoML is used, for example, to design new efficient neural architectures under a constraint on the computational budget (defined either as a number of FLOPs or as inference time measured on a real device) or on the size of the architecture.
Efficient architectures are neural networks with a small memory footprint and fast inference time when measured on edge devices.
Federated Learning enables mobile phones to collaboratively learn a shared prediction model while keeping all the training data on device, decoupling the ability to do machine learning from the need to store the data in the cloud (Google AI blog: Federated Learning).
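A minimal sketch of one way this can work, assuming a weighted-averaging scheme in the style of Federated Averaging and a toy one-parameter model `y ≈ w * x`. The function names, learning rate, and client data below are illustrative assumptions, not taken from the blog post.

```python
def local_update(w, data, lr=0.1, epochs=5):
    """Run a few gradient-descent steps for y ≈ w * x on one client's private data."""
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(global_w, clients):
    """Each client trains locally; the server sees only weights, never raw data."""
    updates = [(local_update(global_w, data), len(data)) for data in clients]
    total = sum(n for _, n in updates)
    # Average the client weights, weighted by the number of local samples.
    return sum(w * n for w, n in updates) / total


# Hypothetical on-device datasets, all generated by y = 2 * x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0), (1.0, 2.0)],
]

w = 0.0
for _ in range(20):
    w = federated_round(w, clients)
```

After the loop, `w` should be close to the shared slope of 2 even though no client ever uploaded its `(x, y)` pairs.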
Standard machine learning algorithms are not always able to run on edge devices due to large computational requirements and space complexity. This section introduces optimized machine learning algorithms.
Pruning is a common method to derive a compact network: after training, some structural portion of the parameters is removed, along with its associated computations (Importance Estimation for Neural Network Pruning).
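As a rough illustration of removing a structural portion of the parameters, the sketch below drops whole output units (rows of a weight matrix) with the smallest L1 norm. The L1-norm criterion is a simple stand-in chosen for this example and is not the importance estimate proposed in the cited paper; `prune_rows` and `keep_ratio` are hypothetical names.

```python
def prune_rows(weight_rows, keep_ratio=0.5):
    """Keep the rows (output units) with the largest L1 norm and drop the rest."""
    n_keep = max(1, round(len(weight_rows) * keep_ratio))
    norms = [sum(abs(w) for w in row) for row in weight_rows]
    # Rank row indices from most to least important under the L1 criterion.
    ranked = sorted(range(len(weight_rows)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:n_keep])  # restore the original row order
    return [weight_rows[i] for i in kept], kept


# A toy 4x3 layer: rows 1 and 3 contribute almost nothing.
layer = [
    [0.9, -0.8, 0.7],
    [0.01, 0.02, -0.01],
    [-0.5, 0.4, 0.6],
    [0.0, 0.05, 0.0],
]

pruned, kept = prune_rows(layer, keep_ratio=0.5)
```

Because entire rows disappear, the computations for those units disappear too, which is what distinguishes structured pruning from zeroing individual weights.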
This section contains papers that are related to edge machine learning but are not part of any major group. These papers often deal with deployment issues (e.g., optimizing inference on the target platform).
Quantization is the process of reducing the precision of weights and/or activations in a neural network (from 32-bit floating point to lower bit-depth representations). The advantages of this method are a smaller model size and faster inference on hardware that supports arithmetic operations in lower precision.
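To make the idea concrete, here is a minimal sketch of affine (scale and zero-point) quantization to 8-bit unsigned integers, written in plain Python. This is one common scheme among several; the function names and the sample weights are assumptions made for illustration.

```python
def quantize(values, num_bits=8):
    """Map floats to unsigned integers using a scale and a zero point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # The representable range must include 0.0 so that zero maps exactly.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(qmin - lo / scale)
    # Round to the nearest step, shift by the zero point, clamp to the integer range.
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from the integer representation."""
    return [(x - zero_point) * scale for x in q]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight now occupies one byte instead of four, at the cost of a reconstruction error bounded by roughly one quantization step (`scale`).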
Visual Wake Words represents a common microcontroller vision use-case of identifying whether a person is present in the image or not, and provides a realistic benchmark for tiny vision models. Within a limited memory footprint of 250 KB, several state-of-the-art mobile models achieve accuracy of 85-90% on the Visual Wake Words dataset.
List of machine learning inference engines and APIs that are optimized for execution and/or training on edge devices.
- Source code: https://github.com/ARM-software/ComputeLibrary
- Arm
- Source code: https://github.com/xmartlabs/Bender
- Documentation: https://xmartlabs.github.io/Bender/
- Xmartlabs
- Source code: https://github.com/pytorch/pytorch/tree/master/caffe2
- Documentation: https://caffe2.ai/
- Documentation: https://developer.apple.com/documentation/coreml
- Apple
- Documentation: https://deeplearning4j.org/docs/latest/deeplearning4j-android
- Skymind
- Source code: https://github.com/Microsoft/ELL
- Documentation: https://microsoft.github.io/ELL
- Microsoft
- Source code: https://github.com/Tencent/FeatherCNN
- Tencent
- Source code: https://github.com/XiaoMi/mace
- Documentation: https://mace.readthedocs.io/
- XiaoMi
- Source code: https://github.com/alibaba/MNN
- Alibaba
- Documentation: https://mxnet.incubator.apache.org/versions/master/faq/smart_device.html
- Amazon
- Source code: https://github.com/tencent/ncnn
- Tencent
- Documentation: https://developer.android.com/ndk/guides/neuralnetworks/
- Source code: https://github.com/PaddlePaddle/paddle-mobile
- Baidu
- Source code: https://developer.qualcomm.com/software/qualcomm-neural-processing-sdk
- Qualcomm
- Source code: https://github.com/OAID/Tengine
- OAID
- Source code: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite
- Documentation: https://www.tensorflow.org/lite/
- Source code: https://github.com/JDAI-CV/dabnn
- JDAI Computer Vision
List of software packages for AI development on MCU and MPU
STM32Cube function pack for ultra-low power IoT node with artificial intelligence (AI) application based on audio and motion sensing
FP-AI-VISION1 is an STM32Cube function pack featuring examples of computer vision applications based on Convolutional Neural Network (CNN)
TIDL software framework leverages a highly optimized neural network implementation on TI’s Sitara AM57x processors, making use of hardware acceleration on the device
X-LINUX-AI-CV is an STM32 MPU OpenSTLinux Expansion Package that targets Artificial Intelligence for computer vision applications based on Convolutional Neural Network (CNN)
Based on the output result from the translator, the ROM/RAM mounting size and the inference execution processing time are calculated while referring to the information of the selected MCU/MPU
Tool for converting Caffe and TensorFlow models to MCU/MPU development environment
The NXP eIQ™ Auto deep learning (DL) toolkit enables developers to introduce DL algorithms into their applications and to continue satisfying automotive standards
The NXP® eIQ™ machine learning software development environment enables the use of ML algorithms on NXP MCUs, i.MX RT crossover MCUs, and i.MX family SoCs. eIQ software includes inference engines, neural network compilers and optimized libraries
List of resources about AI Chips
A list of ICs and IPs for AI, Machine Learning and Deep Learning
List of books with a focus on on-device (e.g., edge or mobile) machine learning.
- Authors: Pete Warden, Daniel Situnayake
- Published: 2020
- Author: Matthijs Hollemans
- Published: 2019
- Author: Matthijs Hollemans
- Published: 2018
- Author: Pete Warden
- Published: 2017
A competition focused on the best vision solutions that simultaneously achieve high accuracy in computer vision and energy efficiency. LPIRC has been held regularly at computer vision conferences (CVPR, ICCV and others) since 2015, and the winners’ solutions have already improved 24-fold in the ratio of accuracy to energy.
Embedded and mobile deep learning research resources
A curated list of neural network pruning resources
Collection of recent methods on DNN compression and acceleration
Machine learning tutorials targeted for iOS devices
Unlike other awesome lists, we store data in YAML format and generate the markdown files with the awesome.py script.
Every directory contains data.yaml, which stores the data we want to display, and config.yaml, which stores its metadata (e.g., how the data is sorted). How the data is presented is defined in renderer.py.
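The data-plus-renderer split can be pictured roughly as follows. This is a hypothetical sketch only: the field names (`name`, `url`, `year`), the config keys (`sort_by`, `descending`), and the `render` function are assumptions for illustration and do not reflect the actual schema of data.yaml, config.yaml, or the code in renderer.py. Plain dictionaries stand in for the parsed YAML.

```python
def render(entries, config):
    """Turn parsed entries into markdown bullets, ordered as the config asks."""
    ordered = sorted(
        entries,
        key=lambda e: e[config["sort_by"]],
        reverse=config.get("descending", False),
    )
    return "\n".join(f"- [{e['name']}]({e['url']}), {e['year']}" for e in ordered)


# Stand-in for the contents of a data.yaml file (hypothetical fields).
entries = [
    {"name": "Paper B", "url": "https://example.com/b", "year": 2018},
    {"name": "Paper A", "url": "https://example.com/a", "year": 2019},
]
# Stand-in for the contents of a config.yaml file (hypothetical keys).
config = {"sort_by": "year", "descending": True}

markdown = render(entries, config)
```

Keeping the entries as data and the layout in a renderer means a contributor only edits the YAML, and the markdown is regenerated rather than hand-edited.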
To the extent possible under law, Bisonai has waived all copyright and related or neighboring rights to this work.