An open source project from Data to AI Lab at MIT.
Using Large Language Models (LLMs) for time series anomaly detection.
- Homepage: https://github.com/sintel-dev/sigllm
SigLLM is an extension of the Orion library, built to detect anomalies in time series data using LLMs. We provide two types of pipelines for anomaly detection:
- Prompter: directly prompting LLMs to find anomalies in time series.
- Detector: using LLMs to forecast time series and finding anomalies by comparing the real and forecasted signals.
For more details on our pipelines, please read our paper.
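The Detector idea described above (forecast the signal, then compare it with the real values) can be illustrated with a small sketch. This is not SigLLM's implementation: the function name `flag_anomalies`, the parameter `k`, and the mean-plus-`k`-standard-deviations threshold are assumptions chosen for illustration, and the forecast here is hand-written rather than produced by an LLM.

```python
from statistics import mean, pstdev


def flag_anomalies(actual, forecast, k=2.0):
    """Return indices whose absolute forecast error exceeds mean + k * std.

    Hypothetical helper for illustration only; the real pipeline uses
    Orion primitives to score errors and pick thresholds.
    """
    # Point-wise absolute difference between real and forecasted values.
    errors = [abs(a - f) for a, f in zip(actual, forecast)]
    # Simple static threshold derived from the error distribution.
    threshold = mean(errors) + k * pstdev(errors)
    return [i for i, e in enumerate(errors) if e > threshold]


# Toy signal with one obvious spike at index 3.
actual = [10.0, 11.0, 10.5, 30.0, 10.2, 10.8, 11.1, 10.4, 10.9, 10.6]
forecast = [10.1, 10.9, 10.6, 10.7, 10.3, 10.7, 11.0, 10.5, 10.8, 10.7]

anomalous = flag_anomalies(actual, forecast)
print(anomalous)  # [3]
```

A Prompter pipeline skips the forecasting step and asks the LLM for the anomalous positions directly, so no error signal is computed.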
The easiest and recommended way to install SigLLM is using pip:
pip install sigllm
This will pull and install the latest stable release from PyPI.
In the following example, we show how to use one of the SigLLM pipelines.
We will load the demo data located in tutorials/data.csv for this example:
import pandas as pd
data = pd.read_csv('data.csv')
data.head()
which should show a signal with timestamp and value.
    timestamp      value
0  1222840800   6.357008
1  1222862400  12.763547
2  1222884000  18.204697
3  1222905600  21.972602
4  1222927200  23.986643
5  1222948800  24.906765
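If your own data uses datetime values instead of integer timestamps, it can be brought into the same two-column layout. The sketch below assumes the timestamps shown above are Unix epoch seconds (they are spaced 21600 seconds, or six hours, apart); the helper `to_signal` and the column names `time` and `reading` are hypothetical and not part of SigLLM.

```python
import pandas as pd


def to_signal(df, time_column, value_column):
    """Build a timestamp/value frame with epoch-second timestamps.

    Hypothetical helper: assumes `time_column` holds timezone-naive
    datetimes expressed in UTC.
    """
    epoch = pd.Timestamp('1970-01-01')
    # Whole seconds elapsed since the Unix epoch.
    timestamps = (df[time_column] - epoch) // pd.Timedelta(seconds=1)
    return pd.DataFrame({
        'timestamp': timestamps,
        'value': df[value_column].astype(float),
    })


raw = pd.DataFrame({
    'time': pd.to_datetime([
        '2008-10-01 06:00', '2008-10-01 12:00', '2008-10-01 18:00',
    ]),
    'reading': [6.357008, 12.763547, 18.204697],
})

signal = to_signal(raw, 'time', 'reading')
print(signal)
```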
In this example, we use the gpt_detector pipeline and set some hyperparameters. In this case, we set the thresholding strategy to dynamic by setting fixed_threshold to False. The hyperparameters are optional and can be removed.
In addition, the SigLLM object takes a decimal argument that determines how many digits after the decimal point of each float value to include. Here, we don't want to keep any decimal digits, so we set it to zero.
from sigllm import SigLLM
hyperparameters = {
    "orion.primitives.timeseries_anomalies.find_anomalies#1": {
        "fixed_threshold": False
    }
}
sigllm = SigLLM(
    pipeline='gpt_detector',
    decimal=0,
    hyperparameters=hyperparameters
)
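To give a feel for what the decimal argument controls, here is a rough sketch of turning float values into a digit string for an LLM. It assumes the conversion keeps `decimal` digits after the decimal point and then drops the point; the helper `values_to_text` is hypothetical, and the transformation actually applied by the library (scaling, tokenizer-specific formatting) may differ.

```python
def values_to_text(values, decimal=0):
    """Round each value to `decimal` digits and render it as an integer.

    Hypothetical helper for illustration; not SigLLM's own conversion.
    """
    scale = 10 ** decimal
    # Multiplying by 10**decimal and rounding removes the decimal point.
    integers = [int(round(v * scale)) for v in values]
    return ",".join(str(i) for i in integers)


values = [6.357008, 12.763547, 18.204697]

print(values_to_text(values, decimal=0))  # 6,13,18
print(values_to_text(values, decimal=2))  # 636,1276,1820
```

A smaller decimal value produces shorter strings and therefore fewer tokens per value, at the cost of precision.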
Now that we have initialized the pipeline, we are ready to use it to detect anomalies:
anomalies = sigllm.detect(data)
⚠️ Depending on the length of your time series, this might take time to run.
The output of the previous command will be a pandas.DataFrame containing a table of detected anomalies:
        start         end  severity
0  1225864800  1227139200  0.625879
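The start and end columns are easier to read as dates. The sketch below rebuilds the table shown above by hand (so it runs without calling an LLM) and assumes the values are Unix epoch seconds, as in the input data; the `readable` and `duration` names are illustrative only.

```python
import pandas as pd

# Hand-built copy of the output table above, standing in for `anomalies`.
anomalies = pd.DataFrame({
    'start': [1225864800],
    'end': [1227139200],
    'severity': [0.625879],
})

readable = anomalies.copy()
# Interpret the integer timestamps as seconds since the Unix epoch.
readable['start'] = pd.to_datetime(readable['start'], unit='s')
readable['end'] = pd.to_datetime(readable['end'], unit='s')
# Length of each anomalous interval.
readable['duration'] = readable['end'] - readable['start']

print(readable)
```

Under that assumption, the detected interval runs from 2008-11-05 06:00 to 2008-11-20 00:00.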
Additional resources that might be of interest:
If you use SigLLM for your research, please consider citing the following paper:
Sarah Alnegheimish, Linh Nguyen, Laure Berti-Equille, Kalyan Veeramachaneni. Can Large Language Models be Anomaly Detectors for Time Series?
@inproceedings{alnegheimish2024sigllm,
title={Can Large Language Models be Anomaly Detectors for Time Series?},
author={Alnegheimish, Sarah and Nguyen, Linh and Berti-Equille, Laure and Veeramachaneni, Kalyan},
booktitle={2024 IEEE International Conference on Data Science and Advanced Analytics (IEEE DSAA)},
organization={IEEE},
year={2024}
}