xLSTM-Mixer

(Figure: xLSTM-Mixer architecture overview)

This repository provides the supplemental code for the paper "xLSTM-Mixer: Multivariate Time Series Forecasting by Mixing via Scalar Memories".


Abstract

Time series data is prevalent across numerous fields, necessitating the development of robust and accurate forecasting models. Capturing patterns both within and between temporal and multivariate components is crucial for reliable predictions. We introduce xLSTM-Mixer, a model designed to effectively integrate temporal sequences, joint time-variate information, and multiple perspectives for robust forecasting. Our approach begins with a linear forecast shared across variates, which is then refined by xLSTM blocks. These blocks serve as key elements for modeling the complex dynamics of challenging time series data. xLSTM-Mixer ultimately reconciles two distinct views to produce the final forecast. Our extensive evaluations demonstrate xLSTM-Mixer's superior long-term forecasting performance compared to recent state-of-the-art methods. A thorough model analysis provides further insights into its key components and confirms its robustness and effectiveness. This work contributes to the resurgence of recurrent models in time series forecasting.

Table of Contents

  1. Used Datasets
  2. Setup
  3. Running the Project
  4. Experiments
  5. Citation

Used Datasets

We used commonly available benchmark datasets, which can be downloaded from Google Drive. The environment expects the benchmark datasets to be located in the /common-ts directory on a Linux-based system or container.
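Before launching any experiments, it can save time to confirm that the benchmark data is actually in place. The sketch below is a minimal, hypothetical helper (not part of the repository) that reports which expected entries are missing from the data root; the entry names `ETT-small` and `weather` are assumptions based on the datasets referenced by the experiment scripts, so adjust them to the layout of your download.

```python
from pathlib import Path


def check_data_root(root: str = "/common-ts",
                    expected: tuple[str, ...] = ("ETT-small", "weather")) -> list[str]:
    """Return the expected dataset entries that are missing from the benchmark root.

    An empty list means every expected entry was found.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        # Nothing is present if the root itself does not exist.
        return list(expected)
    present = {p.name for p in root_path.iterdir()}
    return [name for name in expected if name not in present]
```

Running `check_data_root()` on the default `/common-ts` path before a long training run makes a missing or mis-mounted dataset directory obvious immediately.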

Setup

Note: This package is based on PyTorch 2.4 and was tested with Python 3.11.

Non-Docker Installation

The code was tested on an Ubuntu 22.04 machine with CUDA 12.1. The Python requirements are listed in requirements.txt and lighting_requirements.txt. Both can be installed using pip:

pip install -r requirements.txt
pip install -r lighting_requirements.txt

Docker Installation

The repository contains a Dockerfile that sets up the required environment. You can build the Docker image and run a container to ensure consistency across setups.

Build the Container

docker build -t xlstm-mixer-code -f .docker/Dockerfile .

Attach to the Container

docker run --gpus device=0 -it -v /storage-01/common-ts:/common-ts -v $(pwd):/workspaces/xlstm_mixer xlstm-mixer-code bash

VSCode Environment

The repository also includes a devcontainer.json, allowing you to attach to the project using VSCode for a consistent development environment. The source path must be adapted to your system.

CUDA Requirements

For the CUDA version of sLSTM, a Compute Capability of 8.0 or higher is required. For more details on CUDA support, check this list (e.g., A100 GPUs are supported).
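The capability floor above can be checked programmatically. The comparison itself is a plain tuple comparison; the snippet below is a small illustrative helper (not from the repository) that applies it, with the PyTorch call for reading the capability of a real GPU shown in a comment.

```python
def meets_capability(capability: tuple[int, int],
                     minimum: tuple[int, int] = (8, 0)) -> bool:
    """Check a (major, minor) Compute Capability against the sLSTM CUDA floor.

    (8, 0) corresponds to the Ampere generation, e.g. the A100.
    """
    # Python compares tuples element-wise, so (8, 6) >= (8, 0) is True
    # and (7, 5) >= (8, 0) is False.
    return capability >= minimum


# On a machine with PyTorch and CUDA available, the capability of GPU 0
# can be read via:
#   import torch
#   meets_capability(torch.cuda.get_device_capability(0))
```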

Running the Project

The project provides a Lightning-based CLI, which is also used for the experiments in the scripts folder. For general help on using the CLI, run:

python -m xlstm_mixer --help

For specific model or data argument configurations, use:

python -m xlstm_mixer fit --help
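Lightning's CLI conventionally also accepts a YAML configuration passed via `--config`. The fragment below is only an illustrative sketch of that layout; the actual model and data argument names are not documented here and must be taken from the `fit --help` output above.

```yaml
# Hypothetical config sketch for `python -m xlstm_mixer fit --config config.yaml`.
trainer:
  max_epochs: 10
  accelerator: auto
model:
  # model arguments as reported by `python -m xlstm_mixer fit --help`
data:
  # dataset arguments; the environment expects data under /common-ts
```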

Experiments

Predefined scripts with hyperparameters are available in the scripts folder.

Quick Experiment Check

To verify that all requirements are met, pass the --dev flag to any script, e.g.,

bash ./scripts/long_term_forecasting/ett/m1.sh --dev

This will run a test batch to confirm the setup is functioning correctly. Full seed runs are organized by dataset in the scripts folder.

Example Commands

Run the following scripts for specific experiments:

bash ./scripts/long_term_forecasting/weather.sh
bash ./scripts/long_term_forecasting/ett/m1.sh

Citation

If you find this work helpful, please consider citing our paper using the following BibTeX entry:

@article{krausxLSTM-Mixer2024,
  author    = {Maurice Kraus and Felix Divo and Devendra Singh Dhami and Kristian Kersting},
  title     = {xLSTM-Mixer: Multivariate Time Series Forecasting by Mixing via Scalar Memories},
  journal   = {arXiv preprint},
  year      = {2024},
  url       = {https://arxiv.org/abs/2410.16928},
}