A collection of recent methods for DNN compression and acceleration. Work on efficient DNNs falls into roughly five categories (a toy sketch of two of them, pruning and quantization, follows this list):
- neural architecture re-design or search
  - maintain accuracy at lower cost (e.g., fewer #Params, #FLOPs): MobileNet, ShuffleNet, etc.
  - maintain cost at higher accuracy: Inception, ResNeXt, Xception, etc.
- pruning (structured and unstructured)
- quantization
- matrix decomposition
- knowledge distillation
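
To make the taxonomy concrete, below is a minimal sketch (in PyTorch, not taken from any specific paper in this list) of two of the five families: unstructured magnitude pruning and symmetric uniform weight quantization. The toy layer, sparsity level, and bit-width are illustrative assumptions only.

```python
# Toy sketch: unstructured magnitude pruning + uniform weight quantization.
# Not the method of any particular paper below; hyperparameters are illustrative.
import torch
import torch.nn as nn

def magnitude_prune(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Zero out the smallest-magnitude entries so roughly `sparsity` of them are zero."""
    k = int(sparsity * weight.numel())
    if k == 0:
        return weight
    threshold = weight.abs().flatten().kthvalue(k).values
    mask = (weight.abs() > threshold).float()
    return weight * mask

def uniform_quantize(weight: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Symmetric uniform quantization: snap weights to 2^num_bits levels, then de-quantize."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = weight.abs().max() / qmax
    return torch.round(weight / scale).clamp(-qmax - 1, qmax) * scale

if __name__ == "__main__":
    layer = nn.Linear(64, 32)  # stand-in for one layer of a real model
    with torch.no_grad():
        layer.weight.copy_(magnitude_prune(layer.weight, sparsity=0.7))
        layer.weight.copy_(uniform_quantize(layer.weight, num_bits=8))
    print((layer.weight == 0).float().mean())  # ~0.7 of the weights are now zero
```

Structured pruning, low-rank decomposition, and distillation operate in the same spirit but on filters/channels, factorized weight matrices, and teacher outputs, respectively.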
About abbreviations: in the list below, `o` stands for oral, `w` for workshop, `s` for spotlight, and `b` for best paper.
- 2011-JMLR-Learning with Structured Sparsity
- 2011-NIPSw-Improving the speed of neural networks on CPUs
- 2013-NIPS-Predicting Parameters in Deep Learning
- 2013.08-Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation
- 2014-BMVC-Speeding up convolutional neural networks with low rank expansions
- 2014-INTERSPEECH-1-Bit Stochastic Gradient Descent and its Application to Data-Parallel Distributed Training of Speech DNNs
- 2014-NIPS-Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation
- 2014-NIPS-Do deep nets really need to be deep?
- 2014.12-Memory bounded deep convolutional networks
- 2015-ICLR-Speeding-up convolutional neural networks using fine-tuned cp-decomposition
- 2015-ICML-Compressing neural networks with the hashing trick
- 2015-INTERSPEECH-A Diversity-Penalizing Ensemble Training Method for Deep Learning
- 2015-BMVC-Data-free parameter pruning for deep neural networks
- 2015-BMVC-Learning the structure of deep architectures using l1 regularization
- 2015-NIPS-Learning both Weights and Connections for Efficient Neural Network
- 2015-NIPS-Binaryconnect: Training deep neural networks with binary weights during propagations
- 2015-NIPS-Structured Transforms for Small-Footprint Deep Learning
- 2015-NIPS-Tensorizing Neural Networks
- 2015-NIPSw-Distilling Intractable Generative Models
- 2015-NIPSw-Federated Optimization: Distributed Optimization Beyond the Datacenter
- 2015-CVPR-Efficient and Accurate Approximations of Nonlinear Convolutional Networks [2016 TPAMI version: Accelerating Very Deep Convolutional Networks for Classification and Detection]
- 2015-CVPR-Sparse Convolutional Neural Networks
- 2015-ICCV-An Exploration of Parameter Redundancy in Deep Networks with Circulant Projections
- 2015.11-Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications
- 2015.12-Exploiting Local Structures with the Kronecker Layer in Convolutional Networks
- 2016-ICLR-Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding (best paper!)
- 2016-ICLR-All you need is a good init [Code]
- 2016-ICLR-Convolutional neural networks with low-rank regularization [Code]
- 2016-ICLR-Diversity networks
- 2016-ICLR-Neural networks with few multiplications
- 2016-ICLRw-Randomout: Using a convolutional gradient norm to win the filter lottery
- 2016-CVPR-Fast algorithms for convolutional neural networks
- 2016-CVPR-Fast ConvNets Using Group-wise Brain Damage
- 2016-BMVC-Learning neural network architectures using backpropagation
- 2016-ECCV-Less is more: Towards compact cnns
- 2016-EMNLP-Sequence-Level Knowledge Distillation
- 2016-NIPS-Learning Structured Sparsity in Deep Neural Networks
- 2016-NIPS-Dynamic Network Surgery for Efficient DNNs [Code]
- 2016-NIPS-Learning the Number of Neurons in Deep Neural Networks
- 2016-NIPS-Memory-Efficient Backpropagation Through Time
- 2016-NIPS-PerforatedCNNs: Acceleration through Elimination of Redundant Convolutions
- 2016-NIPS-LightRNN: Memory and Computation-Efficient Recurrent Neural Networks
- 2016-NIPS-CNNpack: packing convolutional neural networks in the frequency domain
- 2016-NIPSw-Federated Learning: Strategies for Improving Communication Efficiency
- 2016-ISCA-Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks
- 2016-ICASSP-Learning compact recurrent neural networks
- 2016-CoNLL-Compression of Neural Machine Translation Models via Pruning
- 2016.03-Adaptive Computation Time for Recurrent Neural Networks
- 2016.06-Structured Convolution Matrices for Energy-efficient Deep learning
- 2016.06-Deep neural networks are robust to weight binarization and other non-linear distortions
- 2016.06-Hypernetworks
- 2016.07-IHT-Training skinny deep neural networks with iterative hard thresholding methods
- 2016.08-Recurrent Neural Networks With Limited Numerical Precision
- 2016.10-Deep model compression: Distilling knowledge from noisy teachers
- 2016.10-Federated Optimization: Distributed Machine Learning for On-Device Intelligence
- 2017-ICLR-Pruning Convolutional Neural Networks for Resource Efficient Inference
- 2017-ICLR-Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights [Code]
- 2017-ICLR-Do Deep Convolutional Nets Really Need to be Deep and Convolutional?
- 2017-ICLR-DSD: Dense-Sparse-Dense Training for Deep Neural Networks (Closely related work: SFP and IHT)
- 2017-ICLR-Faster CNNs with Direct Sparse Convolutions and Guided Pruning
- 2017-ICLR-Towards the Limit of Network Quantization
- 2017-ICLR-Loss-aware Binarization of Deep Networks
- 2017-ICLR-Trained Ternary Quantization [Code]
- 2017-ICLR-Exploring Sparsity in Recurrent Neural Networks
- 2017-ICLR-Soft Weight-Sharing for Neural Network Compression [Reddit discussion]
- 2017-ICLR-Variable Computation in Recurrent Neural Networks
- 2017-ICLR-Training Compressed Fully-Connected Networks with a Density-Diversity Penalty
- 2017-ICML-Variational dropout sparsifies deep neural networks
- 2017-ICML-Theoretical Properties for Neural Networks with Weight Matrices of Low Displacement Rank
- 2017-ICML-Deep Tensor Convolution on Multicores
- 2017-ICML-Delta Networks for Optimized Recurrent Network Computation
- 2017-ICML-Beyond Filters: Compact Feature Map for Portable Deep Model
- 2017-ICML-Combined Group and Exclusive Sparsity for Deep Neural Networks
- 2017-ICML-MEC: Memory-efficient Convolution for Deep Neural Network
- 2017-ICML-Deciding How to Decide: Dynamic Routing in Artificial Neural Networks
- 2017-ICML-ZipML: Training Models with End-to-End Low Precision: The Cans, the Cannots, and a Little Bit of Deep Learning
- 2017-ICML-Analytical Guarantees on Numerical Precision of Deep Neural Networks
- 2017-ICML-Adaptive Neural Networks for Efficient Inference
- 2017-ICML-SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization
- 2017-ICMLw-Bayesian Sparsification of Recurrent Neural Networks
- 2017-CVPR-Learning deep CNN denoiser prior for image restoration
- 2017-CVPR-Deep roots: Improving cnn efficiency with hierarchical filter groups
- 2017-CVPR-More is less: A more complicated network with less inference complexity
- 2017-CVPR-All You Need is Beyond a Good Init: Exploring Better Solution for Training Extremely Deep Convolutional Neural Networks with Orthonormality and Modulation
- 2017-CVPR-ResNeXt-Aggregated Residual Transformations for Deep Neural Networks
- 2017-CVPR-Xception: Deep learning with depthwise separable convolutions
- 2017-CVPR-Designing Energy-Efficient CNN using Energy-aware Pruning
- 2017-CVPR-Spatially Adaptive Computation Time for Residual Networks
- 2017-CVPR-Network Sketching: Exploiting Binary Structure in Deep CNNs
- 2017-CVPR-A Compact DNN: Approaching GoogLeNet-Level Accuracy of Classification and Domain Adaptation
- 2017-ICCV-Channel pruning for accelerating very deep neural networks [Code]
- 2017-ICCV-Learning efficient convolutional networks through network slimming [Code]
- 2017-ICCV-ThiNet: A filter level pruning method for deep neural network compression [Project] [Code] [2018 TPAMI version]
- 2017-ICCV-Interleaved group convolutions
- 2017-ICCV-Coordinating Filters for Faster Deep Neural Networks [Code]
- 2017-ICCV-Performance Guaranteed Network Acceleration via High-Order Residual Quantization
- 2017-NIPS-Net-trim: Convex pruning of deep neural networks with performance guarantee [Code]
- 2017-NIPS-Runtime neural pruning
- 2017-NIPS-Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon
- 2017-NIPS-Federated Multi-Task Learning
- 2017-NIPS-Bayesian Compression for Deep Learning
- 2017-NIPS-Structured Bayesian Pruning via Log-Normal Multiplicative Noise
- 2017-NIPS-Towards Accurate Binary Convolutional Neural Network
- 2017-NIPS-Soft-to-Hard Vector Quantization for End-to-End Learning Compressible Representations
- 2017-NIPS-TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning
- 2017-NIPS-Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks
- 2017-NIPS-Training Quantized Nets: A Deeper Understanding
- 2017-NIPS-The Reversible Residual Network: Backpropagation Without Storing Activations [Code]
- 2017-NIPS-Compression-aware Training of Deep Networks
- 2017-FPGA-ESE: efficient speech recognition engine with compressed LSTM on FPGA
- 2017-AISTATS-Communication-Efficient Learning of Deep Networks from Decentralized Data
- 2017-ICASSP-Accelerating Deep Convolutional Networks using low-precision and sparsity
- 2017-NNs-Nonredundant sparse feature extraction using autoencoders with receptive fields clustering
- 2017.02-The Power of Sparsity in Convolutional Neural Networks
- 2017.07-Stochastic, Distributed and Federated Optimization for Machine Learning
- 2017.05-Structural Compression of Convolutional Neural Networks Based on Greedy Filter Pruning
- 2017.07-Extremely Low Bit Neural Network: Squeeze the Last Bit Out with ADMM
- 2017.11-GPU Kernels for Block-Sparse Weights [Code] (OpenAI)
- 2017.11-Block-sparse recurrent neural networks (Baidu)
- 2018-AAAI-Auto-balanced Filter Pruning for Efficient Convolutional Neural Networks
- 2018-AAAI-Deep Neural Network Compression with Single and Multiple Level Quantization
- 2018-AAAI-Dynamic Deep Neural Networks: Optimizing Accuracy-Efficiency Trade-offs by Selective Execution
- 2018-ICML-On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization
- 2018-ICML-Weightless: Lossy Weight Encoding For Deep Neural Network Compression
- 2018-ICMLw-Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures
- 2018-ICML-Understanding and simplifying one-shot architecture search
- 2018-ICLRo-Training and Inference with Integers in Deep Neural Networks
- 2018-ICLR-Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers
- 2018-ICLR-N2N learning: Network to Network Compression via Policy Gradient Reinforcement Learning
- 2018-ICLR-Model compression via distillation and quantization
- 2018-ICLR-Towards Image Understanding from Deep Compression Without Decoding
- 2018-ICLR-Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training
- 2018-ICLR-Mixed Precision Training of Convolutional Neural Networks using Integer Operations
- 2018-ICLR-Mixed Precision Training
- 2018-ICLR-Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy
- 2018-ICLR-Loss-aware Weight Quantization of Deep Networks
- 2018-ICLR-Alternating Multi-bit Quantization for Recurrent Neural Networks
- 2018-ICLR-Adaptive Quantization of Neural Networks
- 2018-ICLR-Variational Network Quantization
- 2018-ICLR-Espresso: Efficient Forward Propagation for Binary Deep Neural Networks
- 2018-ICLR-Learning to share: Simultaneous parameter tying and sparsification in deep learning
- 2018-ICLR-Learning Sparse Neural Networks through L0 Regularization
- 2018-ICLR-WRPN: Wide Reduced-Precision Networks
- 2018-ICLR-Deep rewiring: Training very sparse deep networks
- 2018-ICLR-Efficient sparse-winograd convolutional neural networks [Code]
- 2018-ICLR-Learning Intrinsic Sparse Structures within Long Short-term Memory
- 2018-ICLR-Multi-scale dense networks for resource efficient image classification
- 2018-ICLR-Compressing Word Embedding via Deep Compositional Code Learning
- 2018-ICLR-Large scale distributed neural network training through online distillation
- 2018-ICLR-Learning Discrete Weights Using the Local Reparameterization Trick
- 2018-ICLR-Training wide residual networks for deployment using a single bit for each weight
- 2018-ICLR-The High-Dimensional Geometry of Binary Neural Networks
- 2018-ICLRw-To Prune, or Not to Prune: Exploring the Efficacy of Pruning for Model Compression (Similar topic: 2018-NIPSw-nip in the bud, 2018-NIPSw-rethink)
- 2018-ICLRw-Systematic Weight Pruning of DNNs using Alternating Direction Method of Multipliers
- 2018-ICLRw-Weightless: Lossy weight encoding for deep neural network compression
- 2018-ICLRw-Variance-based Gradient Compression for Efficient Distributed Deep Learning
- 2018-ICLRw-Stacked Filters Stationary Flow For Hardware-Oriented Acceleration Of Deep Convolutional Neural Networks
- 2018-ICLRw-Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks
- 2018-ICLRw-Accelerating Neural Architecture Search using Performance Prediction
- 2018-ICLRw-Nonlinear Acceleration of CNNs
- 2018-ICLRw-Attention-Based Guided Structured Sparsity of Deep Neural Networks [Code]
- 2018-CVPR-Context-Aware Deep Feature Compression for High-Speed Visual Tracking
- 2018-CVPR-NISP: Pruning Networks using Neuron Importance Score Propagation
- 2018-CVPR-Deep Image Prior [Code]
- 2018-CVPR-Condensenet: An efficient densenet using learned group convolutions [Code]
- 2018-CVPR-Shift: A zero flop, zero parameter alternative to spatial convolutions
- 2018-CVPR-Explicit Loss-Error-Aware Quantization for Low-Bit Deep Neural Networks
- 2018-CVPR-Interleaved structured sparse convolutional neural networks
- 2018-CVPR-Towards Effective Low-bitwidth Convolutional Neural Networks
- 2018-CVPR-CLIP-Q: Deep Network Compression Learning by In-Parallel Pruning-Quantization
- 2018-CVPR-Blockdrop: Dynamic inference paths in residual networks
- 2018-CVPR-Nestednet: Learning nested sparse structures in deep neural networks
- 2018-CVPR-Stochastic downsampling for cost-adjustable inference and improved regularization in convolutional networks
- 2018-CVPR-Wide Compression: Tensor Ring Nets
- 2018-CVPR-Learning Compact Recurrent Neural Networks With Block-Term Tensor Decomposition
- 2018-CVPR-Learning Time/Memory-Efficient Deep Architectures With Budgeted Super Networks
- 2018-CVPR-HydraNets: Specialized Dynamic Architectures for Efficient Inference
- 2018-CVPR-SYQ: Learning Symmetric Quantization for Efficient Deep Neural Networks
- 2018-CVPR-Two-Step Quantization for Low-Bit Neural Networks
- 2018-CVPR-Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
- 2018-CVPR-"Learning-Compression" Algorithms for Neural Net Pruning
- 2018-CVPR-PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning [Code]
- 2018-CVPR-MorphNet: Fast & Simple Resource-Constrained Structure Learning of Deep Networks
- 2018-CVPR-ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
- 2018-CVPRw-Squeezenext: Hardware-aware neural network design
- 2018-ICML-Compressing Neural Networks using the Variational Information Bottleneck
- 2018-ICML-DCFNet: Deep Neural Network with Decomposed Convolutional Filters
- 2018-ICML-Deep k-Means Re-Training and Parameter Sharing with Harder Cluster Assignments for Compressing Deep Convolutions
- 2018-ICML-Error Compensated Quantized SGD and its Applications to Large-scale Distributed Optimization
- 2018-ICML-High Performance Zero-Memory Overhead Direct Convolutions
- 2018-ICML-Kronecker Recurrent Units
- 2018-ICML-StrassenNets: Deep learning with a multiplication budget
- 2018-ICML-Learning Compact Neural Networks with Regularization
- 2018-ICML-WSNet: Compact and Efficient Networks Through Weight Sampling
- 2018-IJCAI-Efficient DNN Neuron Pruning by Minimizing Layer-wise Nonlinear Reconstruction Error
- 2018-IJCAI-SFP-Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks [Code]
- 2018-IJCAI-Where to Prune: Using LSTM to Guide End-to-end Pruning
- 2018-IJCAI-Accelerating Convolutional Networks via Global & Dynamic Filter Pruning
- 2018-IJCAI-Optimization based Layer-wise Magnitude-based Pruning for DNN Compression
- 2018-IJCAI-Progressive Blockwise Knowledge Distillation for Neural Network Acceleration
- 2018-IJCAI-Complementary Binary Quantization for Joint Multiple Indexing
- 2018-ECCV-A Systematic DNN Weight Pruning Framework using Alternating Direction Method of Multipliers
- 2018-ECCV-Coreset-Based Neural Network Compression
- 2018-ECCV-Data-Driven Sparse Structure Selection for Deep Neural Networks [Code]
- 2018-ECCV-Training Binary Weight Networks via Semi-Binary Decomposition
- 2018-ECCV-Learning Compression from Limited Unlabeled Data
- 2018-ECCV-Constraint-Aware Deep Neural Network Compression
- 2018-ECCV-Deep Expander Networks: Efficient Deep Networks from Graph Theory [Code]
- 2018-ECCV-SparseNet-Sparsely Aggregated Convolutional Networks [Code]
- 2018-ECCV-Ask, acquire, and attack: Data-free uap generation using class impressions
- 2018-ECCV-Netadapt: Platform-aware neural network adaptation for mobile applications
- 2018-ECCV-Clustering Convolutional Kernels to Compress Deep Neural Networks
- 2018-ECCV-Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm
- 2018-ECCV-Extreme Network Compression via Filter Group Approximation
- 2018-ECCV-Convolutional Networks with Adaptive Inference Graphs
- 2018-ECCV-SkipNet: Learning Dynamic Routing in Convolutional Networks [Code]
- 2018-ECCV-Value-aware Quantization for Training and Inference of Neural Networks
- 2018-ECCV-LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks
- 2018-ECCV-AMC: AutoML for Model Compression and Acceleration on Mobile Devices
- 2018-BMVCo-Structured Probabilistic Pruning for Convolutional Neural Network Acceleration
- 2018-BMVC-Efficient Progressive Neural Architecture Search
- 2018-BMVC-Igcv3: Interleaved lowrank group convolutions for efficient deep neural networks
- 2018-NIPS-Discrimination-aware Channel Pruning for Deep Neural Networks
- 2018-NIPS-Frequency-Domain Dynamic Pruning for Convolutional Neural Networks
- 2018-NIPS-ChannelNets: Compact and Efficient Convolutional Neural Networks via Channel-Wise Convolutions
- 2018-NIPS-DropBlock: A regularization method for convolutional networks
- 2018-NIPS-Constructing fast network through deconstruction of convolution
- 2018-NIPS-Learning Versatile Filters for Efficient Convolutional Neural Networks [Code]
- 2018-NIPS-Moonshine: Distilling with cheap convolutions
- 2018-NIPS-HitNet: hybrid ternary recurrent neural network
- 2018-NIPS-FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network
- 2018-NIPS-Training DNNs with Hybrid Block Floating Point
- 2018-NIPS-Reversible Recurrent Neural Networks
- 2018-NIPS-Synaptic Strength For Convolutional Neural Network
- 2018-NIPS-Learning sparse neural networks via sensitivity-driven regularization
- 2018-NIPS-Multi-Task Zipping via Layer-wise Neuron Sharing
- 2018-NIPS-A Linear Speedup Analysis of Distributed Deep Learning with Sparse and Quantized Communication
- 2018-NIPS-Gradient Sparsification for Communication-Efficient Distributed Optimization
- 2018-NIPS-GradiVeQ: Vector Quantization for Bandwidth-Efficient Gradient Aggregation in Distributed CNN Training
- 2018-NIPS-ATOMO: Communication-efficient Learning via Atomic Sparsification
- 2018-NIPS-Norm matters: efficient and accurate normalization schemes in deep networks
- 2018-NIPS-Sparsified SGD with memory
- 2018-NIPS-Pelee: A Real-Time Object Detection System on Mobile Devices
- 2018-NIPS-Scalable methods for 8-bit training of neural networks
- 2018-NIPS-TETRIS: TilE-matching the TRemendous Irregular Sparsity
- 2018-NIPS-Training deep neural networks with 8-bit floating point numbers
- 2018-NIPS-Multiple instance learning for efficient sequential data classification on resource-constrained devices
- 2018-NIPSw-Pruning neural networks: is it time to nip it in the bud?
- 2018-NIPSwb-Rethinking the Value of Network Pruning [2019 ICLR version]
- 2018-NIPSw-Structured Pruning for Efficient ConvNets via Incremental Regularization [2019 IJCNN version] [Code]
- 2018-NIPSw-Adaptive Mixture of Low-Rank Factorizations for Compact Neural Modeling
- 2018-NIPSw-Learning Sparse Networks Using Targeted Dropout [OpenReview] [Code]
- 2018-WACV-Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks
- 2018.05-Compression of Deep Convolutional Neural Networks under Joint Sparsity Constraints
- 2018.05-AutoPruner: An End-to-End Trainable Filter Pruning Method for Efficient Deep Model Inference
- 2018.10-A Closer Look at Structured Pruning for Neural Network Compression [Code]
- 2018.11-Second-order Optimization Method for Large Mini-batch: Training ResNet-50 on ImageNet in 35 Epochs
- 2018.11-PydMobileNet: Improved Version of MobileNets with Pyramid Depthwise Separable Convolution
- 2019-SysML-Towards Federated Learning at Scale: System Design
- 2019-ICLRo-The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks (best paper!)
- 2019-ICLR-Slimmable Neural Networks [Code]
- 2019-ICLR-Defensive Quantization: When Efficiency Meets Robustness
- 2019-ICLR-Minimal Random Code Learning: Getting Bits Back from Compressed Model Parameters [Code]
- 2019-ICLR-ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware [Code]
- 2019-ICLR-SNIP: Single-shot Network Pruning based on Connection Sensitivity
- 2019-ICLR-Non-vacuous Generalization Bounds at the ImageNet Scale: a PAC-Bayesian Compression Approach
- 2019-ICLR-Dynamic Channel Pruning: Feature Boosting and Suppression
- 2019-ICLR-Energy-Constrained Compression for Deep Neural Networks via Weighted Sparse Projection and Layer Input Masking
- 2019-ICLR-RotDCF: Decomposition of Convolutional Filters for Rotation-Equivariant Deep Networks
- 2019-ICLR-Dynamic Sparse Graph for Efficient Deep Learning
- 2019-ICLR-Big-Little Net: An Efficient Multi-Scale Feature Representation for Visual and Speech Recognition
- 2019-ICLR-Data-Dependent Coresets for Compressing Neural Networks with Applications to Generalization Bounds
- 2019-ICLR-Learning Recurrent Binary/Ternary Weights
- 2019-ICLR-Double Viterbi: Weight Encoding for High Compression Ratio and Fast On-Chip Reconstruction for Deep Neural Network
- 2019-ICLR-Relaxed Quantization for Discretized Neural Networks
- 2019-ICLR-Integer Networks for Data Compression with Latent-Variable Models
- 2019-ICLR-Analysis of Quantized Models
- 2019-ICLR-DARTS: Differentiable Architecture Search [Code]
- 2019-ICLR-Graph HyperNetworks for Neural Architecture Search
- 2019-ICLR-Learnable Embedding Space for Efficient Neural Architecture Compression [Code]
- 2019-ICLR-Efficient Multi-Objective Neural Architecture Search via Lamarckian Evolution
- 2019-ICLR-SNAS: stochastic neural architecture search (SenseTime)
- 2019-AAAIo-A layer decomposition-recomposition framework for neuron pruning towards accurate lightweight networks
- 2019-AAAI-Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons [Code]
- 2019-AAAI-Balanced Sparsity for Efficient DNN Inference on GPU [Code]
- 2019-AAAI-CircConv: A Structured Convolution with Low Complexity
- 2019-AAAI-Regularized Evolution for Image Classifier Architecture Search
- 2019-ASPLOS-Packing Sparse Convolutional Neural Networks for Efficient Systolic Array Implementations: Column Combining Under Joint Optimization
- 2019-CVPRo-HAQ: hardware-aware automated quantization
- 2019-CVPRo-Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration [Code]
- 2019-CVPR-All You Need is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification
- 2019-CVPR-Importance Estimation for Neural Network Pruning [Code]
- 2019-CVPR-HetConv Heterogeneous Kernel-Based Convolutions for Deep CNNs
- 2019-CVPR-Fully Learnable Group Convolution for Acceleration of Deep Neural Networks
- 2019-CVPR-Towards Optimal Structured CNN Pruning via Generative Adversarial Learning
- 2019-CVPR-Searching for A Robust Neural Architecture in Four GPU Hours [Code]
- 2019-CVPR-FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search
- 2019-CVPR-RENAS: Reinforced Evolutionary Neural Architecture Search
- 2019-CVPR-ChamNet: Towards Efficient Network Design through Platform-Aware Model Adaptation
- 2019-CVPR-Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search [Code]
- 2019-CVPR-Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation [Code]
- 2019-CVPR-MnasNet: Platform-Aware Neural Architecture Search for Mobile [Code]
- 2019-CVPR-MFAS: Multimodal Fusion Architecture Search
- 2019-CVPR-A Neurobiological Evaluation Metric for Neural Network Model Search
- 2019-CVPR-Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells
- 2019-CVPR-Efficient Neural Network Compression [Code]
- 2019-CVPR-T-Net: Parametrizing Fully Convolutional Nets with a Single High-Order Tensor
- 2019-CVPR-Centripetal SGD for Pruning Very Deep Convolutional Networks with Complicated Structure [Code]
- 2019-CVPR-DSC: Dense-Sparse Convolution for Vectorized Inference of Convolutional Neural Networks
- 2019-CVPR-DupNet: Towards Very Tiny Quantized CNN With Improved Accuracy for Face Detection
- 2019-CVPR-ECC: Platform-Independent Energy-Constrained Deep Neural Network Compression via a Bilinear Regression Model
- 2019-CVPR-Variational Convolutional Neural Network Pruning
- 2019-CVPR-Neural Rejuvenation: Improving Deep Network Training by Enhancing Computational Resource Utilization [Code]
- 2019-CVPR-Accelerating Convolutional Neural Networks via Activation Map Compression
- 2019-CVPR-Compressing Convolutional Neural Networks via Factorized Convolutional Filters
- 2019-CVPR-Deep Virtual Networks for Memory Efficient Inference of Multiple Tasks
- 2019-CVPR-Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression
- 2019-CVPR-MBS: Macroblock Scaling for CNN Model Reduction
- 2019-CVPR-On Implicit Filter Level Sparsity in Convolutional Neural Networks
- 2019-CVPR-Structured Pruning of Neural Networks With Budget-Aware Regularization
- 2019-ICML-Approximated Oracle Filter Pruning for Destructive CNN Width Optimization [Code]
- 2019-ICML-EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis [Code]
- 2019-ICML-Zero-Shot Knowledge Distillation in Deep Networks [Code]
- 2019-ICML-LegoNet: Efficient Convolutional Neural Networks with Lego Filters [Code]
- 2019-ICML-EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks [Code]
- 2019-ICML-Collaborative Channel Pruning for Deep Networks
- 2019-ICML-Training CNNs with Selective Allocation of Channels
- 2019-ICML-NAS-Bench-101: Towards Reproducible Neural Architecture Search [Code]
- 2019-ICMLw-Towards Learning of Filter-Level Heterogeneous Compression of Convolutional Neural Networks [Code] (AutoML workshop)
- 2019-IJCAI-Play and Prune: Adaptive Filter Pruning for Deep Model Compression
- 2019-BigComp-Towards Robust Compressed Convolutional Neural Networks
- 2019-ICCV-Rethinking ImageNet Pre-training
- 2019-ICCV-Universally Slimmable Networks and Improved Training Techniques (related: 2019 ICLR Slimmable)
- 2019-ICCV-MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning [Code]
- 2019-PR-Filter-in-Filter: Improve CNNs in a Low-cost Way by Sharing Parameters among the Sub-filters of a Filter
- 2019-PRL-BDNN: Binary Convolution Neural Networks for Fast Object Detection
- 2019-TNNLS-Towards Compact ConvNets via Structure-Sparsity Regularized Filter Pruning [Code]
- 2019-JMLR-Neural Architecture Search: A Survey
- 2019.03-Network Slimming by Slimmable Networks: Towards One-Shot Architecture Search for Channel Numbers [Code]
- 2019.03-Single Path One-Shot Neural Architecture Search with Uniform Sampling
- 2019.04-Resource Efficient 3D Convolutional Neural Networks
- 2019.04-Meta Filter Pruning to Accelerate Deep Convolutional Neural Networks
- 2019.04-Knowledge Squeezed Adversarial Network Compression
- 2019.04-Progressive Differentiable Architecture Search: Bridging the Depth Gap between Search and Evaluation [Code]
- 2019.05-Dynamic Neural Network Channel Execution for Efficient Training
- 2019.05-Network Pruning via Transformable Architecture Search [Code]
- 2019.06-AutoGrow: Automatic Layer Growing in Deep Convolutional Networks
- 2019.06-BasisConv: A method for compressed representation and learning in CNNs
- 2019.06-BlockSwap: Fisher-guided Block Substitution for Network Compression
- 2019.06-Data-Free Quantization through Weight Equalization and Bias Correction
- 2019.06-Separable Layers Enable Structured Efficient Linear Substitutions [Code]
- 2019.06-Butterfly Transform: An Efficient FFT Based Neural Architecture Design
- 2019.06-A Taxonomy of Channel Pruning Signals in CNNs
- 2010-JMLR-How to explain individual classification decisions
- 2015-PLOS ONE-On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation
- 2015-CVPR-Learning to generate chairs with convolutional neural networks
- 2015-CVPR-Understanding deep image representations by inverting them [2016 IJCV version: Visualizing deep convolutional neural networks using natural pre-images]
- 2016-CVPR-Inverting Visual Representations with Convolutional Networks
- 2016-KDD-"Why Should I Trust You?": Explaining the Predictions of Any Classifier (LIME)
- 2016-ICMLw-The Mythos of Model Interpretability
- 2017-NIPSw-The (Un)reliability of saliency methods
- 2017-DSP-Methods for interpreting and understanding deep neural networks
- 2018-ICML-Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)
- 2018-NIPSs-Sanity Checks for Saliency Maps
- 2018-NIPSs-Human-in-the-Loop Interpretability Prior
- 2018-NIPS-To Trust Or Not To Trust A Classifier [Code]
- 2019-AISTATS-Interpreting Black Box Predictions using Fisher Kernels
- 2019.05-Luck Matters: Understanding Training Dynamics of Deep ReLU Networks
- 2019.05-Adversarial Examples Are Not Bugs, They Are Features
- 2019.06-The Generalization-Stability Tradeoff in Neural Network Pruning
- 2019.06-One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers
- 2019-Book-Interpretable Machine Learning
- 1996-Born again trees (proposed compressing neural networks and multiple-tree predictors by approximating them with a single tree)
- 2006-SIGKDD-Model compression
- 2010-ML-A theory of learning from different domains
- 2014-NIPS-Do deep nets really need to be deep?
- 2014-NIPSw-Distilling the Knowledge in a Neural Network (coined the terms "knowledge distillation" and "dark knowledge"; a toy soft-target loss sketch follows this list) [Code]
- 2015-NIPS-Bayesian dark knowledge
- 2016-ICLR-Net2net: Accelerating learning via knowledge transfer (Tianqi Chen and Goodfellow)
- 2016-ECCV-Accelerating convolutional neural networks with dominant convolutional kernel and knowledge pre-regression
- 2017-ICLR-Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer
- 2017-ICLR-Do deep convolutional nets really need to be deep and convolutional?
- 2017-CVPR-A gift from knowledge distillation: Fast optimization, network minimization and transfer learning
- 2017-NIPS-Sobolev training for neural networks
- 2017-NIPSw-Data-Free Knowledge Distillation for Deep Neural Networks [Code]
- 2017.07-Like What You Like: Knowledge Distill via Neuron Selectivity Transfer
- 2017.10-Knowledge Projection for Deep Neural Networks
- 2017.11-Distilling a Neural Network Into a Soft Decision Tree
- 2017.12-Data Distillation: Towards Omni-Supervised Learning (Kaiming He)
- 2018.03-Interpreting Deep Classifier by Visual Distillation of Dark Knowledge
- 2018.11-Dataset Distillation [Code]
- 2018.12-Learning Student Networks via Feature Embedding
- 2018.12-Few Sample Knowledge Distillation for Efficient Network Compression
- 2018-AAAI-DarkRank: Accelerating Deep Metric Learning via Cross Sample Similarities Transfer
- 2018-AAAI-Dynamic deep neural networks: Optimizing accuracy-efficiency trade-offs by selective execution
- 2018-AAAI-Rocket Launching: A Universal and Efficient Framework for Training Well-performing Light Net
- 2018-AAAI-Adversarial Learning of Portable Student Networks
- 2018-ICML-Born-Again Neural Networks
- 2018-IJCAI-Better and Faster: Knowledge Transfer from Multiple Self-supervised Learning Tasks via Graph Distillation for Video Classification
- 2018-SIGKDD-Towards Evolutionary Compression
- 2018-NIPS-KDGAN: knowledge distillation with generative adversarial networks
- 2018-NIPS-Knowledge Distillation by On-the-Fly Native Ensemble
- 2018-NIPSw-Transparent Model Distillation
- 2019-AAAI-Knowledge Distillation with Adversarial Samples Supporting Decision Boundary
- 2019-AAAI-Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons
- 2019-AAAI-Learning to Steer by Mimicking Features from Heterogeneous Auxiliary Networks [Code]
- 2019-CVPR-Knowledge Representing: Efficient, Sparse Representation of Prior Knowledge for Knowledge Distillation
- 2019-CVPR-Knowledge Distillation via Instance Relationship Graph
- 2019-CVPR-Variational Information Distillation for Knowledge Transfer
- 2019-ICCV-Similarity-Preserving Knowledge Distillation
- 2019-ICCV-Correlation Congruence for Knowledge Distillation
- 2019-ICCV-Data-Free Learning of Student Networks
- 2019-ICCV-Learning Lightweight Lane Detection CNNs by Self Attention Distillation [Code]
- 2019.05-DistillHash: Unsupervised Deep Hashing by Distilling Data Pairs
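
For readers new to this area, here is a minimal sketch of the temperature-scaled soft-target loss popularized by the Hinton et al. entry above ("Distilling the Knowledge in a Neural Network"). The toy teacher/student modules and the mixing weight `alpha` are illustrative assumptions, not taken from any particular implementation listed here.

```python
# Toy sketch of a Hinton-style distillation loss: KL on temperature-softened
# logits blended with the usual cross-entropy. Modules and hyperparameters
# below are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Blend cross-entropy on hard labels with KL on temperature-softened logits."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # T^2 rescaling keeps gradient magnitudes comparable
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

if __name__ == "__main__":
    teacher = nn.Linear(128, 10)  # stand-ins for a large teacher / small student
    student = nn.Linear(128, 10)
    x, y = torch.randn(8, 128), torch.randint(0, 10, (8,))
    with torch.no_grad():
        t_logits = teacher(x)     # teacher is frozen
    loss = distillation_loss(student(x), t_logits, y)
    loss.backward()               # gradients flow only into the student
    print(loss.item())
```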
- Been Kim @ Google Brain (Interpretability)
- Elliot Crowley @ Edinburgh
- Gao Huang @ Tsinghua
- Mingjie Sun @ BUAA
- Mohsen Imani @ UCSD
- Naiyan Wang @ TuSimple
- Jianguo Li @ Intel
- Miguel Carreira-Perpinan @ UC Merced
- Pavlo Molchanov @ NVIDIA
- Song Han @ MIT
- Wei Wen @ Duke
- Yang He @ University of Technology Sydney
- Yihui He @ CMU
- Yunhe Wang @ Huawei
- Zhuang Liu @ UC Berkeley
- OpenReview
- ICLR
- CVPR & ICCV
- ECCV
- 2017-ICML Tutorial: interpretable machine learning
- 2018-AAAI
- 2018-ICLR
- 2018-ICML
- 2018-ICML Workshop: Efficient Credit Assignment in Deep Learning and Reinforcement Learning
- 2018-IJCAI
- 2018-BMVC
- 2018-NIPS
- COLT: 2019
- SysML: 2018, 2019, 2020
- CDNNRIA Workshop (Compact Deep Neural Network Representation with Industrial Applications): 1st-2018-NIPSw, 2nd-2019-ICMLw
- LLD Workshop (Learning with Limited Data): 1st-2017-NIPSw, 2nd-2019-ICLRw
- WHI (Workshop on Human Interpretability in Machine Learning): 1st-2016-ICMLw, 2nd-2017-ICMLw, 3rd-2018-ICMLw
- NIPS-18 Workshop on Systems for ML and Open Source Software
- MLPCD Workshop (Machine Learning on the Phone and other Consumer Devices): 2nd-2018-NIPSw
- NNPACK
- DMLC: Tensor Virtual Machine (TVM): Open Deep Learning Compiler Stack
- Tencent: NCNN
- Xiaomi: MACE, Mobile AI Benchmark
- Alibaba: MNN blog (in Chinese)
- Baidu: Paddle-Slim, Paddle-Mobile, Anakin
- Microsoft: ELL
- Facebook: Caffe2/PyTorch
- Apple: CoreML (iOS 11+)
- Google: ML-Kit, NNAPI (Android 8.1+), TF-Lite
- Qualcomm: Snapdragon Neural Processing Engine (SNPE), Adreno GPU SDK
- Huawei: HiAI
- ARM: Tengine
- Related: DAWNBench: An End-to-End Deep Learning Benchmark and Competition