ProbTS: Benchmarking Point and Distributional Forecasting across Diverse Prediction Horizons

News 🎉

🚩 Dec 2024: ProbTS now supports GIFT-EVAL benchmark datasets! Visit this page for detailed instructions. Please note that this feature is still in beta and may contain bugs or inconsistencies. We will continue to update and improve it.

🚩 Dec 2024: Added quick guides for benchmarking foundation models. Visit this page for detailed instructions.

🚩 Oct 2024: ProbTS now includes the ElasTST model! Check out the ElasTST branch to reproduce all results reported in the paper, or run bash scripts/run_elastst.sh for a quick start.

🚩 Oct 2024: The camera-ready version of ProbTS is now available, with more in-depth analyses on the impact of normalization.

About ProbTS 💡

A wide range of industrial applications require precise point and distributional forecasts over diverse prediction horizons. ProbTS is a benchmarking tool for understanding how well advanced time-series models meet these forecasting needs. It also sheds light on their advantages and disadvantages in addressing different challenges and highlights directions for future research.

To achieve these objectives, ProbTS provides a unified pipeline that implements cutting-edge models from different research threads, including:

Supervised long-term point forecasting models, such as PatchTST, iTransformer, etc.
Supervised short-term probabilistic forecasting models, such as TimeGrad, CSDI, etc.
Pre-trained time-series foundation models for zero-shot forecasting, such as TimesFM, MOIRAI, etc.

Specifically, ProbTS emphasizes the differences in their primary methodological designs, including:

Supporting point or distributional forecasts
Using autoregressive or non-autoregressive decoding schemes for multi-step outputs

Available Models 🧩

ProbTS includes both classical time-series models, specializing in long-term point forecasting or short-term distributional forecasting, and recent time-series foundation models that offer zero-shot and arbitrary-horizon forecasting capabilities for new time series.

Classical Time-series Models

| Model | Original Eval. Horizon | Estimation | Decoding Scheme | Class Path |
| --- | --- | --- | --- | --- |
| Linear | - | Point | AR / NAR | `probts.model.forecaster.point_forecaster.LinearForecaster` |
| GRU | - | Point | AR / NAR | `probts.model.forecaster.point_forecaster.GRUForecaster` |
| Transformer | - | Point | AR / NAR | `probts.model.forecaster.point_forecaster.TransformerForecaster` |
| Autoformer | Long | Point | NAR | `probts.model.forecaster.point_forecaster.Autoformer` |
| N-HiTS | Long | Point | NAR | `probts.model.forecaster.point_forecaster.NHiTS` |
| NLinear | Long | Point | NAR | `probts.model.forecaster.point_forecaster.NLinear` |
| DLinear | Long | Point | NAR | `probts.model.forecaster.point_forecaster.DLinear` |
| TSMixer | Long | Point | NAR | `probts.model.forecaster.point_forecaster.TSMixer` |
| TimesNet | Short / Long | Point | NAR | `probts.model.forecaster.point_forecaster.TimesNet` |
| PatchTST | Long | Point | NAR | `probts.model.forecaster.point_forecaster.PatchTST` |
| iTransformer | Long | Point | NAR | `probts.model.forecaster.point_forecaster.iTransformer` |
| ElasTST | Long | Point | NAR | `probts.model.forecaster.point_forecaster.ElasTST` |
| GRU NVP | Short | Probabilistic | AR | `probts.model.forecaster.prob_forecaster.GRU_NVP` |
| GRU MAF | Short | Probabilistic | AR | `probts.model.forecaster.prob_forecaster.GRU_MAF` |
| Trans MAF | Short | Probabilistic | AR | `probts.model.forecaster.prob_forecaster.Trans_MAF` |
| TimeGrad | Short | Probabilistic | AR | `probts.model.forecaster.prob_forecaster.TimeGrad` |
| CSDI | Short | Probabilistic | NAR | `probts.model.forecaster.prob_forecaster.CSDI` |
| TSDiff | Short | Probabilistic | NAR | `probts.model.forecaster.prob_forecaster.TSDiffCond` |

Foundation Models

| Model | Any Horizon | Estimation | Decoding Scheme | Class Path |
| --- | --- | --- | --- | --- |
| Lag-Llama | ✔ | Probabilistic | AR | `probts.model.forecaster.prob_forecaster.LagLlama` |
| ForecastPFN | ✔ | Point | NAR | `probts.model.forecaster.point_forecaster.ForecastPFN` |
| TimesFM | ✔ | Point | AR | `probts.model.forecaster.point_forecaster.TimesFM` |
| TTM | ✘ | Point | NAR | `probts.model.forecaster.point_forecaster.TinyTimeMixer` |
| Timer | ✔ | Point | AR | `probts.model.forecaster.point_forecaster.Timer` |
| MOIRAI | ✔ | Probabilistic | NAR | `probts.model.forecaster.prob_forecaster.Moirai` |
| UniTS | ✔ | Point | NAR | `probts.model.forecaster.point_forecaster.UniTS` |
| Chronos | ✔ | Probabilistic | AR | `probts.model.forecaster.prob_forecaster.Chronos` |
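
The Class Path column lists dotted Python paths. As a rough illustration of how such a path maps to a class object, the sketch below resolves one with `importlib`. The helper `resolve_class_path` is a hypothetical name, and how ProbTS itself consumes these paths (presumably through its configuration files) is not shown here. A standard-library class stands in so the snippet runs without ProbTS installed.

```python
import importlib


def resolve_class_path(class_path: str):
    """Import `package.module.ClassName` and return the class object."""
    module_name, _, class_name = class_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'package.module.ClassName', got {class_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Standard-library stand-in so the sketch runs anywhere.
ordered_dict_cls = resolve_class_path("collections.OrderedDict")
print(ordered_dict_cls)

# With ProbTS importable, the same call would take a path from the tables, e.g.
# resolve_class_path("probts.model.forecaster.point_forecaster.PatchTST")
```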

Stay tuned for more models to be added in the future.

Setup 🔧

Environment

ProbTS is developed with Python 3.10 and relies on PyTorch Lightning. To set up the environment:

# Create a new conda environment
conda create -n probts python=3.10
conda activate probts

# Install required packages
pip install .
pip uninstall -y probts # recommended to uninstall the root package (optional)

Optional for TSFMs reproducibility

For time-series foundation models, you need to install basic packages and additional dependencies:

1. Set Up Environment

# Create a new conda environment
conda create -n probts_fm python=3.10
conda activate probts_fm

# Git submodule
git submodule update --init --recursive

# Install additional packages for foundation models
pip install ".[tsfm]"
pip uninstall -y probts # recommended to uninstall the root package (optional)

2. Initialize Submodules

# Run these from the repository root; each subshell leaves the working directory unchanged.

# For MOIRAI, pin the package version for better performance
(cd submodules/uni2ts && git reset --hard fce6a6f57bc3bc1a57c7feb3abc6c7eb2f264301)

# For TimesFM, pin the version for reproducibility (optional)
(cd submodules/timesfm && git reset --hard 5c7b905)

# For Lag-Llama, pin the version for reproducibility (optional)
(cd submodules/lag_llama && git reset --hard 4ad82d9)

# For TinyTimeMixer, pin the version for reproducibility (optional)
(cd submodules/tsfm && git reset --hard bb125c14a05e4231636d6b64f8951d5fe96da1dc)

Datasets

For a complete dataset list, refer to the Datasets Overview.

Short-Term Forecasting: We use datasets from GluonTS. Configure the dataset using --data.data_manager.init_args.dataset {DATASET_NAME}. You can choose multivariate or univariate datasets as needed.
```
['exchange_rate_nips', 'electricity_nips', 'traffic_nips', 'solar_nips', 'wiki2000_nips']
```
Long-Term Forecasting: To download the long-term forecasting datasets, please follow these steps:
```
bash scripts/prepare_datasets.sh "./datasets"
```
Configure the datasets using --data.data_manager.init_args.dataset {DATASET_NAME} with the following list of available datasets:
```
['etth1', 'etth2', 'ettm1', 'ettm2', 'traffic_ltsf', 'electricity_ltsf', 'exchange_ltsf', 'illness_ltsf', 'weather_ltsf', 'caiso', 'nordpool']
```
Note: When utilizing long-term forecasting datasets, you must explicitly specify the context_length and prediction_length parameters. For example, to set a context length of 96 and a prediction length of 192, use the following command-line arguments:
```
--data.data_manager.init_args.context_length 96 \
--data.data_manager.init_args.prediction_length 192
```
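
To make the two parameters concrete, the sketch below shows how a context length and a prediction length split a series into (context, target) pairs. It is an illustration only: `make_windows` and its `stride` argument are hypothetical and do not describe how ProbTS samples its training or test windows.

```python
from typing import List, Tuple


def make_windows(series: List[float],
                 context_length: int,
                 prediction_length: int,
                 stride: int = 1) -> List[Tuple[List[float], List[float]]]:
    """Slide a window over the series and split it into context and target."""
    total = context_length + prediction_length
    windows = []
    for start in range(0, len(series) - total + 1, stride):
        context = series[start:start + context_length]
        target = series[start + context_length:start + total]
        windows.append((context, target))
    return windows


# Synthetic series of 300 steps, using the lengths from the example above.
series = [float(t) for t in range(300)]
windows = make_windows(series, context_length=96, prediction_length=192)
first_context, first_target = windows[0]
print(len(windows), len(first_context), len(first_target))
```

Each window needs context_length + prediction_length consecutive observations, so a series shorter than that sum yields no windows at all.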
Using Datasets from Monash Time Series Forecasting Repository: To use datasets from the Monash Time Series Forecasting Repository, follow these steps:
1. Download the Dataset:
- Navigate to the target dataset, such as the Electricity Hourly Dataset.
- Download the .tsf file and place it in your local datasets directory (e.g., ./datasets).
2. Configure the Dataset:
- Use the following configuration to specify the dataset, file path, and frequency:
```
--data.data_manager.init_args.dataset {DATASET_NAME} \
--data.data_manager.init_args.data_path /path/to/data_file.tsf \
--data.data_manager.init_args.freq {FREQ} 
```
- Example Configuration:
```
--data.data_manager.init_args.dataset monash_electricity_hourly \
--data.data_manager.init_args.data_path ./datasets/electricity_hourly_dataset.tsf \
--data.data_manager.init_args.freq H \
--data.data_manager.init_args.context_length 96 \
--data.data_manager.init_args.prediction_length 96 \
--data.data_manager.init_args.multivariate true
```
Note 1: Refer to the Pandas Time Series Offset Aliases for the correct frequency values ({FREQ}) to use in your configuration.

Note 2: You can adjust the test instance sampling using the --data.data_manager.init_args.test_rolling_length parameter.
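
When choosing {FREQ} for a Monash file, the header of the .tsf file may help. The sketch below assumes the usual .tsf layout, in which `@key value` lines (including `@frequency`) precede an `@data` marker. The sample header, the helper `read_tsf_metadata`, and the small frequency mapping are assumptions for illustration, so check the mapping against the Pandas offset aliases before relying on it.

```python
from typing import Dict

# Assumed mapping from a few .tsf frequency names to pandas offset aliases.
TSF_TO_PANDAS_FREQ = {"hourly": "H", "daily": "D", "weekly": "W", "monthly": "M"}


def read_tsf_metadata(text: str) -> Dict[str, str]:
    """Collect single-valued `@key value` header lines that appear before `@data`."""
    metadata = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.lower() == "@data":
            break  # the series themselves start here
        if line.startswith("@"):
            key, _, value = line[1:].partition(" ")
            if key != "attribute":  # attributes repeat; this sketch ignores them
                metadata[key] = value.strip()
    return metadata


# Hypothetical header, written inline so the sketch needs no downloaded file.
sample_header = """\
# Hypothetical header for illustration
@relation Electricity
@attribute series_name string
@attribute start_timestamp date
@frequency hourly
@missing false
@equallength true
@data
"""

metadata = read_tsf_metadata(sample_header)
freq_alias = TSF_TO_PANDAS_FREQ.get(metadata.get("frequency", ""))
print(metadata, freq_alias)
```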

Checkpoints for Foundation Models

Download the checkpoints with the following command (details can be found here):

bash scripts/prepare_tsfm_checkpoints.sh # By downloading, you agree to the original licenses

Quick Start 🚀

Specify --config with a specific configuration file to reproduce results of point or probabilistic models on commonly used long- and short-term forecasting datasets. Configuration files are included in the config folder.

To run models:

bash run.sh

Experimental results reproduction:

Long-term Forecasting:
```
bash scripts/reproduce_ltsf_results.sh
```
Short-term Forecasting:
```
bash scripts/reproduce_stsf_results.sh
```
Time Series Foundation Models:
```
bash scripts/reproduce_tsfm_results.sh
```

Short-term Forecasting Configuration

For short-term forecasting scenarios, datasets and corresponding context_length and prediction_length are automatically obtained from GluonTS. Use the following command:

python run.py --config config/path/to/model.yaml \
                --data.data_manager.init_args.path /path/to/datasets/ \
                --trainer.default_root_dir /path/to/log_dir/ \
                --data.data_manager.init_args.dataset {DATASET_NAME}

See full DATASET_NAME list:

from gluonts.dataset.repository import dataset_names
print(dataset_names)

Long-term Forecasting Configuration

For long-term forecasting scenarios, context_length and prediction_length must be explicitly assigned:

python run.py --config config/path/to/model.yaml \
                --data.data_manager.init_args.path /path/to/datasets/ \
                --trainer.default_root_dir /path/to/log_dir/ \
                --data.data_manager.init_args.dataset {DATASET_NAME} \
                --data.data_manager.init_args.context_length {CTX_LEN} \
                --data.data_manager.init_args.prediction_length {PRED_LEN}

DATASET_NAME options:

['etth1', 'etth2', 'ettm1', 'ettm2', 'traffic_ltsf', 'electricity_ltsf', 'exchange_ltsf', 'illness_ltsf', 'weather_ltsf', 'caiso', 'nordpool']
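
Because each prediction length needs its own run, it can be convenient to generate the commands programmatically. The sketch below only assembles the argument lists from the flags shown above and prints them; it does not launch anything. The helper name `long_term_command`, the `./datasets` and `./logs` directories, and the chosen lengths are placeholders for illustration.

```python
import shlex
from typing import List


def long_term_command(config: str, data_dir: str, log_dir: str, dataset: str,
                      context_length: int, prediction_length: int) -> List[str]:
    """Assemble the run.py invocation for one long-term forecasting setting."""
    return [
        "python", "run.py", "--config", config,
        "--data.data_manager.init_args.path", data_dir,
        "--trainer.default_root_dir", log_dir,
        "--data.data_manager.init_args.dataset", dataset,
        "--data.data_manager.init_args.context_length", str(context_length),
        "--data.data_manager.init_args.prediction_length", str(prediction_length),
    ]


# Placeholder paths and lengths; substitute your own config file and directories.
commands = [
    long_term_command("config/path/to/model.yaml", "./datasets", "./logs",
                      "etth1", 96, pred_len)
    for pred_len in (96, 192)
]

for cmd in commands:
    print(shlex.join(cmd))  # shell-quoted form, ready to paste into a terminal
```

Keeping each command as a list of arguments avoids quoting problems if a path contains spaces, and the same lists could later be passed to `subprocess.run`.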

Forecasting with Varied Prediction Lengths

Conventional forecasting models typically require specific training and deployment for each prediction horizon. However, with the growing importance of varied-horizon forecasting, there is a need for models that can deliver robust predictions across multiple inference horizons after a single training phase.

ProbTS has been updated to support varied-horizon forecasting by enabling the specification of distinct context and prediction lengths for the training, validation, and testing phases.

Quick Start

To quickly train and evaluate ElasTST:

bash scripts/run_elastst.sh

To quickly set up varied-horizon training:

bash scripts/run_varied_hor_training.sh

For detailed information on the configuration, refer to the documentation.

Note: Currently, this feature is only supported by ElasTST, Autoformer, and foundation models.
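
The idea of judging one trained model at several inference horizons can be illustrated with a toy metric computation. This is a sketch under simplifying assumptions, not the ProbTS evaluation code: `mae_by_horizon` is a hypothetical helper that scores the first h steps of a single forecast for each requested horizon h.

```python
from typing import Dict, Sequence


def mae_by_horizon(forecast: Sequence[float],
                   target: Sequence[float],
                   horizons: Sequence[int]) -> Dict[int, float]:
    """Mean absolute error over the first h steps, for each horizon h."""
    available = min(len(forecast), len(target))
    scores = {}
    for h in horizons:
        if h < 1 or h > available:
            raise ValueError(f"horizon {h} is outside 1..{available}")
        errors = [abs(f - y) for f, y in zip(forecast[:h], target[:h])]
        scores[h] = sum(errors) / h
    return scores


# Toy values: the forecast is exact for two steps and then drifts.
forecast = [1.0, 2.0, 3.0, 4.0]
target = [1.0, 2.0, 4.0, 6.0]
scores = mae_by_horizon(forecast, target, horizons=[2, 4])
print(scores)
```

Reporting the error per horizon, rather than one aggregate number, shows where a model trained for one length starts to degrade.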

Benchmarking ⚖️

Using ProbTS, we systematically compare studies that focus on point forecasting with those aimed at distributional estimation, across various forecasting horizons and evaluation metrics.
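
To compare point and distributional forecasts on a common scale, one commonly used option is to pair the absolute error with the continuous ranked probability score (CRPS). The metrics ProbTS actually reports are not listed in this section, so treat the choice of CRPS here as an assumption. The sketch uses the standard sample-based estimate, CRPS ≈ mean|x_i - y| - 0.5 · mean|x_i - x_j|, and the function names are hypothetical.

```python
from typing import Sequence


def absolute_error(point_forecast: float, observed: float) -> float:
    """Error of a single point forecast."""
    return abs(point_forecast - observed)


def sample_crps(samples: Sequence[float], observed: float) -> float:
    """Sample-based CRPS estimate for one observation (lower is better)."""
    n = len(samples)
    if n == 0:
        raise ValueError("at least one sample is required")
    # Average distance between the samples and the observation.
    term_obs = sum(abs(x - observed) for x in samples) / n
    # Average distance between all pairs of samples (rewards honest spread).
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term_obs - 0.5 * term_spread


spread_score = sample_crps([0.0, 2.0], observed=1.0)
point_mass_score = sample_crps([3.0], observed=1.0)
print(spread_score, point_mass_score, absolute_error(3.0, 1.0))
```

When all samples coincide, the spread term vanishes and the estimate reduces to the absolute error, which is why the two quantities can be read side by side.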

Documentation 📖

For detailed information on configuration parameters and model customization, please refer to the documentation.

To print the full pipeline configuration to a file:

python run.py --print_config > config/pipeline_config.yaml

Acknowledgement 🌟

Special thanks to the following repositories for their open-sourced code bases and datasets.

Tools/Packages

Official Implementations

Classical Time-series Models

Time-series Foundation Models

Citing ProbTS 🍻

If you have used ProbTS for research or production, please cite it as follows.

@inproceedings{zhang2024probts,
  title={{ProbTS}: Benchmarking Point and Distributional Forecasting across Diverse Prediction Horizons},
  author={Zhang, Jiawen and Wen, Xumeng and Zhang, Zhenwei and Zheng, Shun and Li, Jia and Bian, Jiang},
  booktitle={NeurIPS Datasets and Benchmarks Track},
  year={2024}
}
