- search for solar axions, hypothetical pseudoscalar particles solving the strong $\mathcal{CP}$ problem
- potential dark matter candidate
- coupling to transverse $B$ fields, production in the Sun!
- expected signal rates: $≤ \num{0.1}$ $γ$ \si{\per\hour}
- background rate: $∼ \SI{0.1}{\per\s}$
- need very good background suppression
- a type of multivariate analysis object providing highly non-linear, multidimensional representations of the input data
- simplest type: the feed-forward multilayer perceptron
Neuron output:
\[
y_k = \varphi\left( ∑_{j = 0}^m w_{kj} x_j \right)
\]
\( \varphi \): activation function, \( \mathbf{w}_k \): weight vector of neuron \( k \)
Training minimizes the error function
\[ E(\mathbf{x}_1, …, \mathbf{x}_N | \mathbf{w}) = ∑_{a=1}^N \frac{1}{2}\left(y_{\text{ANN},a} - \hat{y}_a\right)^2 \]
using gradient descent
\[ \mathbf{w}^{n+1} = \mathbf{w}^n - η ∇_{\mathbf{w}} E \]
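A minimal numpy sketch of the two formulas above: one neuron output \( y_k = \varphi(∑_j w_{kj} x_j) \) and a single gradient-descent weight update. The sizes, the \( \tanh \) activation, and the learning rate are illustrative assumptions, not values from the talk.

import numpy as np

def phi(z):
    # activation function, here tanh (an assumption for this sketch)
    return np.tanh(z)

rng = np.random.default_rng(0)
x = rng.normal(size=5)        # one input vector (x_0 ... x_m)
w = rng.normal(size=5)        # weight vector w_k
y_hat = 0.5                   # desired output

y = phi(w @ x)                # neuron output y_k
E = 0.5 * (y - y_hat)**2      # error function for a single example

# gradient descent step: w <- w - eta * dE/dw (chain rule, tanh' = 1 - tanh^2)
eta = 0.1
w = w - eta * (y - y_hat) * (1 - y**2) * x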
- convolutional and pooling layers alternating
- a convolutional layer performs a discrete 2D convolution of its input, e.g.:
import numpy as np
from scipy.signal import convolve2d

# convolving with a scaled delta kernel returns the scaled input
A = np.identity(6)
B = np.array([[0,0,0],[0,5,0],[0,0,0]])
C = convolve2d(A, B, 'same')
print(C)

[[5. 0. 0. 0. 0. 0.]
 [0. 5. 0. 0. 0. 0.]
 [0. 0. 5. 0. 0. 0.]
 [0. 0. 0. 5. 0. 0.]
 [0. 0. 0. 0. 5. 0.]
 [0. 0. 0. 0. 0. 5.]]
import numpy as np
from scipy.signal import convolve2d

# an X-shaped kernel sums each pixel with its four diagonal neighbors
A = np.identity(6)
B = np.array([[1,0,1],[0,1,0],[1,0,1]])
C = convolve2d(A, B, 'same')
print(C)

[[2. 0. 1. 0. 0. 0.]
 [0. 3. 0. 1. 0. 0.]
 [1. 0. 3. 0. 1. 0.]
 [0. 1. 0. 3. 0. 1.]
 [0. 0. 1. 0. 3. 0.]
 [0. 0. 0. 1. 0. 2.]]
\tiny source: http://www.songho.ca/dsp/convolution/convolution2d_example.html
- MNIST: a dataset of \num{70000} handwritten digits, size normalized to $\num{28}×\num{28}$ pixels and centered
- in the past used to benchmark image classification; nowadays good accuracies $\geq\SI{90}{\percent}$ are fast to achieve
- network layout (a numpy sketch follows after this list):
  - input layer: $\num{28}×\num{28}$ neurons (note: flattened to \num{1}D!)
  - 1 hidden layer: \num{1000} neurons
  - output layer: \num{10} neurons (\num{1} for each digit)
  - activation function: rectified linear unit (ReLU):
    \[ f(x) = \max(0, x) \]
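The sketch mentioned above: a plain numpy forward pass through the described layout, with the $\num{28}×\num{28}$ input flattened to \num{784} values, \num{1000} hidden ReLU neurons, and \num{10} outputs. The weight initialization and the random input digit are stand-ins; the actual demo code is written in Nim using Arraymancer.

import numpy as np

def relu(x):
    return np.maximum(0, x)   # f(x) = max(0, x)

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.01, size=(784, 1000))   # input -> hidden weights
W2 = rng.normal(scale=0.01, size=(1000, 10))    # hidden -> output weights

digit = rng.random((28, 28))            # stand-in for one MNIST digit
hidden = relu(digit.reshape(784) @ W1)  # hidden layer activations
scores = hidden @ W2                    # one score per digit class 0..9
print(scores.argmax())                  # predicted digit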
- Program 1: trains the multilayer perceptron (MLP)
  - written in Nim (C backend), using Arraymancer, a linear algebra + neural network library
  - trains on \num{60000} digits, performs validation on \num{10000} digits
  - after every 10 batches (1 batch: 64 digits) sends to Program 2 (a sketch of such a message follows below):
    - a random test digit
    - the predicted output
    - the current error
- Program 2 plots the data live: written in Nim (JS backend), plots using plotly.js
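For illustration, a hypothetical sketch of such an update message in Python; the real demo is written in Nim, and the JSON encoding and field names here are assumptions, only the three contents follow the list above.

import json
import random

def make_update(test_digits, predictions, current_error, n_batches):
    # build one update for the plotting program (hypothetical format)
    i = random.randrange(len(test_digits))
    return json.dumps({
        "batches": n_batches,          # batches seen so far (64 digits each)
        "digit": test_digits[i],       # a random test digit (pixel values)
        "prediction": predictions[i],  # predicted output for that digit
        "error": current_error,        # current value of the error function
    })

msg = make_update([[0.0] * 784], [7], 0.42, 10)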
- CAST is a very low rate experiment!
- detectors should reach: $f_{\text{Background}} ≤ \SI{e-6}{\per\keV\per\cm\squared\per\s}$
- background-to-signal ratio: $\frac{f_{\text{Background}}}{f_{\text{Signal}}} > \num{e5}$
- need very good signal / background classification!
- events (as on previous slides) can be interpreted as images
- Convolutional Neural Networks are extremely good at image classification
- comparing background to X-ray events shows that their geometric shapes are very different
- utilize that to remove as much background as possible
- energy range: \SIrange{0}{10}{\kilo \electronvolt}
- split into 8 unequal bins of distinct event properties
- based only on properties of X-rays
- set a cut on the likelihood distribution such that \SI{80}{\percent} of X-rays are recovered (a sketch follows below)
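A minimal sketch of such a cut, assuming the likelihood values of calibration X-ray events are available as an array; the variable names, the toy data, and the direction of the cut (keep events below the cut value) are illustrative, not taken from the CAST analysis code.

import numpy as np

def likelihood_cut(xray_likelihoods, efficiency=0.80):
    # cut value such that `efficiency` of X-ray events pass, assuming
    # more X-ray-like events have *smaller* likelihood values
    return np.percentile(xray_likelihoods, efficiency * 100)

cut = likelihood_cut(np.random.exponential(size=10000))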
- now: use an artificial neural network to classify events as X-ray or background
- approach 1: calculate properties of each event, use the properties as input neurons
- approach 2: use whole events (\(\num{256} × \num{256}\) pixels) as the input layer
- approach 1:
  - small layout \( ⇒ \) fast to train
  - potentially biased, not all information usable
- approach 2:
  - huge layout \( ⇒ \) only trainable on GPU
  - all information available
- input size: $\num{256}×\num{256}$ neurons
- 3 convolutional and pooling layers alternating w/ 30, 70, 100 kernels using $\num{15} × \num{15}$ filters
- pooling layers perform $\num{2}×\num{2}$ max pooling
- $\tanh$ activation function
- 1 fully connected feed-forward layer: (1800, 30) neurons
- logistic regression layer: \num{2} output neurons (see the sketch after this list)
- training w/ \num{12000} events per type on Nvidia GTX 1080
- training time: $∼$ \SIrange{1}{10}{\hour}
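A hedged sketch of this layout in PyTorch, purely for illustration: the actual network is implemented in Nim with Arraymancer, the padding here is an assumption, and the flattened size (quoted as 1800 in the layout above) is computed from a dummy input instead of hard-coded.

import torch
import torch.nn as nn

# three conv/pool blocks with 30, 70, 100 kernels of 15x15, 2x2 max pooling
# and tanh activations, as listed above; unpadded convolutions are an assumption
features = nn.Sequential(
    nn.Conv2d(1, 30, kernel_size=15), nn.Tanh(), nn.MaxPool2d(2),
    nn.Conv2d(30, 70, kernel_size=15), nn.Tanh(), nn.MaxPool2d(2),
    nn.Conv2d(70, 100, kernel_size=15), nn.Tanh(), nn.MaxPool2d(2),
)

# determine the flattened size from a dummy 256x256 single-channel event
with torch.no_grad():
    n_flat = features(torch.zeros(1, 1, 256, 256)).numel()

model = nn.Sequential(
    features,
    nn.Flatten(),
    nn.Linear(n_flat, 30), nn.Tanh(),  # fully connected feed-forward layer
    nn.Linear(30, 2),                  # logistic regression layer: 2 outputs
)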
- I hope I could teach you something new, or that it was still interesting regardless :)
- if you’re interested: this talk and the code for the live demo can be found on my GitHub: https://github.com/vindaar/NeuralNetworkLiveDemo