\section{Spiking extensions}
\label{sec:spiking}
State-of-the-art DNNs are compute-intensive since they operate on real-valued inputs, leading to high energy consumption in resource-constrained edge devices with limited power budgets. In this regard, Spiking Neural Networks (SNNs) offer a promising solution for enabling energy-efficient neuromorphic computing in the edge nodes. SNNs process and encode input information temporally using sparse spiking events. The intrinsic sparse event-driven computing capability of SNNs can be exploited to achieve higher energy efficiency in hardware implementations, as shown in \cite{sengupta2019going, blouw2018benchmarking}. The inherent computational efficiency of SNNs stems from the fact that every layer of an SNN needs to compute the weighted sum of the synaptic weights only for those input neurons that fire a spike. Furthermore, computing the weighted input sum requires only adders since the inputs are encoded as spikes represented by logic states \{0, 1\}. In contrast, DNNs operate on real-valued inputs and therefore require multipliers, which consume at least an order of magnitude more energy than adders, to compute the weighted input sum.
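To make the contrast concrete, the following Python sketch illustrates the adder-only, event-driven update of an SNN layer against the dense multiply-accumulate of a conventional DNN layer; the integrate-and-fire dynamics with a soft reset and all variable names are our own illustrative assumptions rather than details from the cited works.
\begin{verbatim}
import numpy as np

def snn_layer_step(weights, in_spikes, membrane, threshold=1.0):
    # Event-driven update: accumulate synaptic weights only for the
    # input neurons that fired a spike (additions only, no multiplies).
    active = np.flatnonzero(in_spikes)
    membrane = membrane + weights[:, active].sum(axis=1)
    out_spikes = (membrane >= threshold).astype(np.uint8)
    membrane = membrane - threshold * out_spikes   # soft reset on firing
    return out_spikes, membrane

def dnn_layer(weights, real_inputs):
    # Conventional layer: dense multiply-accumulate over all real-valued inputs.
    return weights @ real_inputs
\end{verbatim}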
SNNs are naturally suited for splitting and conditional evaluation among different edge nodes for two reasons. First, they improve the energy efficiency per edge node owing to the inherent sparse event-driven processing capability mentioned in the previous paragraph. Furthermore, researchers in \cite{sengupta2019going} demonstrated that sparsity in spiking activity increases substantially across successive layers of a deep SNN. Hence, energy efficiency would be further enhanced for edge nodes evaluating the deeper layers of an SNN. Second, SNNs minimize the communication overhead since only the sparse spiking events need to be transmitted between the edge nodes. Despite the computational and communication energy benefits offered by SNNs, the challenges relating to the training complexity of deep SNNs need to be addressed to enable their widespread adoption for real-world applications. Needless to say, efficient training strategies for deep SNNs are an active area of research.
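As a rough illustration of the communication savings, a node evaluating one partition of the network can transmit only the indices of the neurons that spiked in a given time step, in the spirit of an address-event representation; the encoding below and its function names are hypothetical, not part of the systems cited above.
\begin{verbatim}
import numpy as np

def encode_spikes(out_spikes):
    # Send only the indices of neurons that fired (sparse payload)
    # instead of a full activation vector.
    return np.flatnonzero(out_spikes).astype(np.uint16)

def decode_spikes(spike_indices, num_neurons):
    # Reconstruct the binary spike vector at the receiving edge node.
    spikes = np.zeros(num_neurons, dtype=np.uint8)
    spikes[spike_indices] = 1
    return spikes
\end{verbatim}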
The training methodologies for deep SNNs can be broadly divided into the following three categories:
\begin{enumerate}
\item DNN-to-SNN conversion
\item Spiking backpropagation
\item Spike Timing Dependent Plasticity (STDP)
\end{enumerate}
In the DNN-to-SNN conversion approaches, a DNN is trained using artificial rate-based neurons with state-of-the-art backpropagation algorithms and then mapped to a deep SNN by substituting the artificial neurons with appropriate spiking neuron models and suitably normalizing the trained weights \cite{cao2015spiking, hunsberger2015spiking, diehl2015fast, rueckauer2017conversion, sengupta2019going}. The conversion approaches have been shown to be nearly lossless for deep VGG and ResNet architectures on complex vision datasets. However, the conversion approaches typically incur higher inference latency to achieve the best accuracy. The inference latency can be minimized by using the proposed splitting and conditional evaluation approach. Alternatively, the inference latency can also be reduced by training deep SNNs with spike-based backpropagation algorithms that use differentiable approximations of the spiking neurons for error backpropagation \cite{lee2016training, panda2016unsupervised, wu2018spatio, lee2018training, jin2018hybrid, shrestha2018slayer, neftci2019surrogate}. However, the SNN backpropagation algorithms incur longer training time than the DNN-to-SNN conversion approaches while offering the potential to lower the inference latency under iso-accuracy conditions. The final approach uses bio-plausible layer-wise STDP-based local learning rules to self-learn hierarchical input representations in an unsupervised manner \cite{diehl2015unsupervised, masquelier2007unsupervised, srinivasan2018stdp, tavanaei2018training, kheradpisheh2018stdp, ferre2018unsupervised, thiele2018event, lee2018deep, mozafari2018combining}. The STDP-based training rules are appealing for edge devices since they can be implemented with minimal hardware overhead compared to backpropagation algorithms. However, the STDP-trained SNNs proposed to date are only a few (two to three) layers deep, and it is not yet clear whether STDP-based learning rules are effective for the later layers of deeper SNNs.
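The following sketch conveys the flavor of the data-based weight normalization used in conversion, assuming the per-layer maximum ReLU activations have already been collected on calibration data; the helper and the exact scaling are simplifying assumptions rather than the precise procedure of any cited work.
\begin{verbatim}
def normalize_converted_weights(layer_weights, max_activations):
    # max_activations[l]: largest ReLU output of layer l on calibration data.
    # Rescale each layer so unit-threshold spiking neurons rarely saturate.
    normalized = []
    prev_max = 1.0
    for w, a in zip(layer_weights, max_activations):
        a = max(a, 1e-12)
        normalized.append(w * prev_max / a)
        prev_max = a
    return normalized
\end{verbatim}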
Finally, we note that binary SNNs, which use binary weights \{$-$1, 1\} and spiking activations \{0, 1\}, can be used to achieve both memory- and energy-efficiency in the edge nodes. Binary DNNs, in contrast, use weights of either \{$-$1, 1\} \cite{courbariaux2015binaryconnect} or \{$-\alpha$, $\alpha$\} \cite{rastegari2016xnor}, where $\alpha$ is a layer-specific scaling factor, together with binary neuronal activations \{$-$1, 1\}. Hence, every layer of a binary DNN still needs to compute the weighted input sum (XNOR operation followed by population count) for all the input neurons and transmit the binary activations of all the output neurons. Binary SNNs, however, need to compute the weighted input sum (AND operation followed by population count) only in the event of a spike fired by the corresponding input neurons and transmit only the sparse spiking activations of the output neurons, leading to potentially much higher energy efficiency. Binary SNNs can be trained off-line using binarization algorithms proposed for DNNs, which require the full-precision weights to be stored during training \cite{courbariaux2015binaryconnect, rastegari2016xnor, hubara2017quantized}. Alternatively, binary SNNs can also be trained on-chip using stochastic-STDP based learning rules that achieve plasticity by probabilistically switching the binary weights based on spike timing \cite{suri2013bio, querlioz2015bioinspired, srinivasan2016magnetic, srinivasan2019restocnet}. Stochastic-STDP trained binary SNNs, which eliminate the need for storing the full-precision weights, are attractive for memory- and energy-efficient learning as well as inference in the edge nodes.
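A bit-level sketch of the difference is given below, where weights and binary activations of $+1$ are packed as 1-bits and $-1$ as 0-bits; the packing scheme and function names are illustrative assumptions.
\begin{verbatim}
def binary_dnn_neuron(w_bits, x_bits, n):
    # Binary DNN: XNOR + popcount over all n inputs every time step.
    xnor = ~(w_bits ^ x_bits) & ((1 << n) - 1)
    return 2 * bin(xnor).count("1") - n        # dot product over {-1, +1}

def binary_snn_neuron(w_bits, spike_bits):
    # Binary SNN: AND + popcount only over the inputs that spiked.
    active = bin(spike_bits).count("1")
    pos = bin(w_bits & spike_bits).count("1")  # spiking inputs with weight +1
    return 2 * pos - active                    # (+1 count) minus (-1 count)
\end{verbatim}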