This repository has been archived by the owner on Sep 1, 2023. It is now read-only.

Added anomaly detection guide #3521

Merged
merged 2 commits into from
Apr 7, 2017

1 change: 0 additions & 1 deletion docs/source/api/index.rst
@@ -12,7 +12,6 @@ allows users to find the best model parameters for a particular data set using a
particle swarm optimization algorithm. This API is not always needed, however.
You can always use existing model parameters and tweak them to your needs.


The `Network API <network/>`_ allows users to create a network structure with
each node performing a different task, making for a very flexible experiment
framework and a future foundation for hierarchy.
8 changes: 7 additions & 1 deletion docs/source/conf.py
@@ -14,6 +14,7 @@

import os
import sys
import datetime

sourcePath = os.path.abspath(os.path.join('..', 'src'))
rootPath = os.path.abspath('../..')
@@ -69,11 +70,16 @@
# The full version, including alpha/beta/rc tags.
release = devVersion

buildDate = datetime.datetime.now().strftime(
"Documentation built on %B %d, %Y at %H:%M:%S"
)

# Ensures release version is included in output
rst_epilog_pre = """
.. |release| replace:: {}
.. |buildDate| replace:: {}
"""
-rst_epilog = rst_epilog_pre.format(release)
+rst_epilog = rst_epilog_pre.format(release, buildDate)

# Adds markdown support (pip install recommonmark)
source_parsers = {
80 changes: 80 additions & 0 deletions docs/source/guides/anomaly-detection.md
@@ -0,0 +1,80 @@
# Anomaly Detection

This technical note describes how the anomaly score is implemented and incorporated into the CLA (Cortical Learning Algorithm).

The anomaly score enables the CLA to provide a metric representing the degree to which each record is predictable. For example, if you have a temporal anomaly model that is predicting the energy consumption of a building, each record will have an anomaly score between zero and one. A zero represents a completely predicted value whereas a one represents a completely anomalous value.

The anomaly score feature of the CLA is implemented on top of the core spatial and temporal poolers, and doesn’t require any changes to the spatial pooler or temporal pooler algorithms.

## TemporalAnomaly model

### Description

The user must specify the model as a TemporalAnomaly type to have the model report the anomaly score. The anomaly score uses the temporal pooler to detect novel points in sequences. This will detect both novel input patterns (because they have not been seen in any sequence) as well as old spatial patterns that occur in a novel context.

### Computation

A TemporalAnomaly model calculates the anomaly score based on the correctness of the previous prediction. The score is the fraction of active spatial pooler columns that were not predicted by the temporal pooler.

The algorithm for the anomaly score is as follows:

![equation](http://latex.codecogs.com/gif.latex?anomalyScore%3D%5Cfrac%7B%5Clvert%20A_t-%28P_%7Bt-1%7D%5Cbigcap%20A_t%29%5Crvert%20%7D%7B%5Clvert%20A_t%5Crvert%20%7D)

![equation](http://latex.codecogs.com/gif.latex?P_%7Bt-1%7D%3D%5Ctext%7BPredicted%20columns%20at%20time%20t%7D)

![equation](http://latex.codecogs.com/gif.latex?A_%7Bt%7D%3D%5Ctext%7BActive%20columns%20at%20time%20t%7D)

__Note__: Here, a "predicted column" is a column with a non-zero confidence value. This is not exactly the same as having a cell in the predicted state. For more information, refer to the "Confidences vs. Predicted Cells" section below.

Thus, an anomaly score of 1 means that none of the active columns were predicted, which represents a completely anomalous record. A score of 0 means that every active column was predicted, which represents a completely predicted record.
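
As a minimal sketch of the formula above, assuming the active and predicted columns are available as collections of column indices. The function name `compute_anomaly_score` and the convention of returning 0 when no columns are active are illustrative assumptions, not NuPIC's API:

```python
def compute_anomaly_score(active_columns, prev_predicted_columns):
    """Fraction of the columns active at time t that were not predicted at t-1."""
    active = set(active_columns)
    predicted = set(prev_predicted_columns)
    if not active:
        # Assumption: with no active columns, treat the record as fully predicted.
        return 0.0
    # A_t minus (P_{t-1} intersect A_t) is simply the active columns not predicted.
    unpredicted = active - predicted
    return len(unpredicted) / float(len(active))
```

For example, with active columns `[1, 2, 3, 4]` and previously predicted columns `[3, 4, 5]`, two of the four active columns were not predicted, so the score is 0.5.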

### Rationale

The reasoning behind this formulation of the anomaly score was that any record that is not predicted is a novel record. This holds if we have built the best predictive model possible, which we assume we have done via training/swarming.

### Results

This anomaly score has been applied to many datasets. It is the core mechanism used in Numenta's commercial product Grok. In some cases you need to take a moving average of the anomaly score rather than just looking at the raw anomaly score. In NuPIC, the example `examples/opf/clients/hotgym_anomaly` provides a good starting point for anomaly detection. See also [this set of examples](https://github.com/subutai/nupic.subutai/tree/master/swarm_examples) for swarming with anomaly detection models.
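
One way to smooth the raw score is a simple windowed moving average. The sketch below is illustrative only: the class name `AnomalyMovingAverage` and the window size are assumptions and are not taken from the NuPIC codebase.

```python
from collections import deque


class AnomalyMovingAverage(object):
    """Keeps the mean of the most recent `window_size` anomaly scores."""

    def __init__(self, window_size):
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        # A deque with maxlen silently drops the oldest score when full.
        self._scores = deque(maxlen=window_size)

    def update(self, raw_score):
        """Add a raw anomaly score and return the current windowed mean."""
        self._scores.append(raw_score)
        return sum(self._scores) / float(len(self._scores))
```

Feeding each record's raw score through `update` yields a smoothed series in which a single unpredicted record has less influence than a sustained run of them.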


### Confidences vs. Predicted Cells

#### Description

To compute the temporal anomaly score, the intention was to compute a normalized count of how many columns were active and not predicted. As an implementation shortcut, the set of predicted columns was computed by looking at columns with non-zero column "confidences."

However, it was later discovered that columns with non-zero confidences don’t necessarily have any predicted cells in them. To figure out if a cell is in the predicted state, we use the hard match count (the number of active synapses, after taking into account the permanence threshold). However, to compute the confidences for a cell, the Temporal Pooler uses the soft match count (the number of active synapses, regardless of the permanence values). Therefore, the set of columns with non-zero confidences will always be a superset of the columns containing predicted cells.

When this difference was discovered (~April 2013), an option was added to the CLA to compute the anomaly score based on the predicted cells rather than using confidences.
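
The distinction between the hard and soft match counts can be sketched as follows. This is a simplified illustration: the representation of a segment as `(presynaptic_cell, permanence)` pairs, the function name `match_counts`, the `>=` comparison, and the default threshold value are all assumptions made for the example, not the Temporal Pooler's actual data structures.

```python
def match_counts(segment_synapses, active_cells, connected_perm=0.5):
    """Return (hard_count, soft_count) for a single segment.

    segment_synapses: iterable of (presynaptic_cell, permanence) pairs.
    active_cells: set of currently active cell identifiers.
    connected_perm: hypothetical permanence threshold for a connected synapse.
    """
    hard = 0
    soft = 0
    for cell, permanence in segment_synapses:
        if cell in active_cells:
            soft += 1  # soft count ignores the permanence value
            if permanence >= connected_perm:
                hard += 1  # hard count requires a connected synapse
    return hard, soft
```

Because every synapse counted in the hard match is also counted in the soft match, the hard count can never exceed the soft count, which is why a column can have a non-zero confidence without containing a predicted cell.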

#### Results

Some experiments were run using the predicted cells to compute the anomaly score. However, because these predictions are a subset of the columns with non-zero confidences, the results necessarily had more false positives. As of the time of writing, no change has been made to the computation of the anomaly score based on these results. The anomaly score is still computed using column confidences.

## Non-Temporal Anomaly Detection

### Description

There were also some attempts at adding anomaly detection that is "non-temporal" in nature by using the state of the spatial pooler. A non-temporal anomaly is defined as a combination of fields that doesn’t usually occur, independent of the history of the data.

### Computation

Since NontemporalAnomaly models have no temporal pooler, the anomaly score is based on the state within the spatial pooler.

To compute the nontemporal anomaly score, we first compute the "match" score for each winning column after inhibition:

![equation](http://latex.codecogs.com/gif.latex?match_%7Bcol_%7Bk%7D%7D%20%3D%20overlap_%7Bcol_%7Bi%7D%7D*dutyCycle_%7Bcol_%7Bi%7D%7D)

Review comment (Contributor): Is there any chance we can cache and host these images ourselves?

Reply (Member Author): Yeah, good idea. I can do that with my next docs push.


Then, to get the anomaly score (how unusual the data is), we take the inverse of the total matches:

![equation](http://latex.codecogs.com/gif.latex?anomalyScore%3D%28%5Csum%20_%7Bcol_%7Bi%7D%5Cepsilon%20winningCols%7Dmatch_%7Bcol_%7Bi%7D%7D+1%29%5E%7B-1%7D)

The addition of 1 avoids division-by-zero errors.
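
The two steps above can be sketched as follows, assuming per-column overlaps and duty cycles are available as mappings keyed by column index. The function name `nontemporal_anomaly_score` and its argument layout are hypothetical and chosen only for illustration:

```python
def nontemporal_anomaly_score(overlaps, duty_cycles, winning_columns):
    """Inverse of (sum of overlap * dutyCycle over winning columns, plus 1)."""
    total_match = sum(
        overlaps[col] * duty_cycles[col]  # per-column "match" score
        for col in winning_columns
    )
    # Adding 1 keeps the denominator non-zero when nothing matches.
    return 1.0 / (total_match + 1.0)
```

With no matches at all the score is 1, and it shrinks toward 0 as the winning columns overlap the input more strongly and have higher duty cycles.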

### Rationale

The purpose of this anomaly score was to detect input records that represented novel or rare input patterns (independent of the rest of the sequence). If an input pattern has a low overlap score with the winning columns, none of the columns match the input very well, indicating that the CLA has not seen a similar pattern before and this pattern is novel. Conversely, if the duty cycles for a given pattern are generally low, this indicates that the pattern has not been seen for a long time, and is therefore rare.

### Results

This algorithm was run on some artificial datasets. However, the results were not very promising, and this approach was abandoned. From a theoretical perspective the temporal anomaly detection technique is a superset of this technique. If a static pattern by itself is novel, by definition the temporal pooler won't make good predictions and hence the temporal anomaly score should be high. As such there was not too much interest in pursuing this route.
1 change: 1 addition & 0 deletions docs/source/guides/index.rst
@@ -7,3 +7,4 @@ Guides
opf
network
swarming/index
anomaly-detection
2 changes: 2 additions & 0 deletions docs/source/index.rst
@@ -31,3 +31,5 @@ Indices and tables
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

This documentation was built on |buildDate|.