FIF : Functional Isolation Forest

This repository hosts Python code of the Functional Isolation Forest algorithms and its extension to Multivariate functional data. Here we provide the source code for the algorithms as well as example notebooks to help get started.

Installation

To get the latest version of the code:

$ git clone https://github.com/Gstaerman/FIF.git

Algorithm

Functional Isolation Forest is an anomaly detection (and anomaly ranking) algorithm for functional data. It shows a great flexibility to distinguish most of anomaly types of functional data.

Some parameters have to be set by the user :

innerproduct (one can fix 'auto' and vary alpha parameter to play with FIF)
D (Dictionary)
time (vector time of discretization points)

See the documentation of FIF.py to get more informations on innerproduct and dictionary possibilities.

Quick Start :

Create a toy dataset :

import numpy as np
np.random.seed(42)
m =100
n =100
tps = np.linspace(0,1,m)
v = np.linspace(1,1.4,n)
X = np.zeros((n,m))
for i in range(n):
    X[i] = 30 * ((1-tps) ** v[i]) * tps ** v[i]


Z1 = np.zeros((m))
for j in range(m):
    if (tps[j]<0.2 or tps[j]>0.8):
        Z1[j] = 30 * ((1-tps[j]) ** 1.2) * tps[j] ** 1.2
    else:
        Z1[j] = 30 * ((1-tps[j]) ** 1.2) * tps[j] ** 1.2 + np.random.normal(0,0.3,1)
Z1[0] = 0
Z1[m-1] = 0


Z2 = 30 * ((1-tps) ** 1.6) * tps ** 1.6


Z3 = np.zeros((m))
for j in range(m):
    Z3[j] = 30 * ((1-tps[j]) ** 1.2) * tps[j] ** 1.2 + np.sin(2*np.pi*tps[j])

Z4 = np.zeros((m))
for j in range(m):
    Z4[j] = 30 * ((1-tps[j]) ** 1.2) * tps[j] ** 1.2

for j in range(70,71):
    Z4[j] += 2

Z5 = np.zeros((m))
for j in range(m):
    Z5[j] = 30 * ((1-tps[j]) ** 1.2) * tps[j] ** 1.2 + 0.5*np.sin(10*np.pi*tps[j])

X = np.concatenate((X,Z1.reshape(1,-1),Z2.reshape(1,-1),
                     Z3.reshape(1,-1), Z4.reshape(1,-1), Z5.reshape(1,-1)), axis = 0)

And then use FIF to ranking functional dataset :

From FIF import *
np.random.seed(42)
F  = FIForest(X, D="gaussian_wavelets", time=tps, innerproduct="auto", alpha=0.5)
S  = F.compute_paths()

The simulated dataset with the five introduced anomalies (top). The sorted dataset (middle), the darker the color, the more the curves are considered anomalies. The sorted anomaly score of the dataset (bottom).

Dependencies

These are the dependencies to use FIF:

numpy

Cite

If you use this code in your project, please cite:

Functional Isolation Forest
Guillaume Staerman, Pavlo Mozharovskyi, Stéphan Clémençon, Florence d'Alché-Buc.
(submitted), 2019.
https://arxiv.org/abs/1904.04573

Name		Name	Last commit message	Last commit date
Latest commit History 56 Commits
Code_FIF		Code_FIF
README.rst		README.rst
anomaly_example-1.png		anomaly_example-1.png
anomaly_example_rank-1.png		anomaly_example_rank-1.png
anomaly_example_score-1.png		anomaly_example_score-1.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

FIF : Functional Isolation Forest

Installation

Algorithm

Quick Start :

Dependencies

Cite

About

Releases

Packages

Languages

eborobert/FIF

Folders and files

Latest commit

History

Repository files navigation

FIF : Functional Isolation Forest

Installation

Algorithm

Quick Start :

Dependencies

Cite

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages