catch22 is a collection of 22 time-series features, coded in C, that can be run from Python, R, Matlab, and Julia. It is licensed under the GNU GPL v3 (or later). The catch22 features are a high-performing subset of the over 7000 features in hctsa.
The features were selected based on their classification performance across a collection of 93 real-world time-series classification problems, as described in our open-access paper, 📗 Lubba et al. (2019). catch22: CAnonical Time-series CHaracteristics.
There is comprehensive documentation for catch22, including:
- Installation instructions (across C, Python, R, Julia, and Matlab)
- Information about the theory behind and behavior of each of the features
- A list of publications that have used or extended catch22
- And more 😋
There are also native versions of this code for other programming languages:
- Rcatch22 (R): `install.packages("Rcatch22")`
- pycatch22 (Python): `pip install pycatch22`
- Catch22.jl (Julia): `Pkg.add("Catch22")`
You can also use the C-compiled features directly, or in Matlab, following the detailed installation instructions on the wiki.
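As a minimal sketch of what a call looks like from Python, the snippet below assumes that pycatch22 is installed and that `pycatch22.catch22_all` returns a dictionary with `"names"` and `"values"` entries. The helper `feature_table` and the sine-wave input are illustrative only and are not part of the package; the wiki remains the authoritative usage reference.

```python
import math


def feature_table(ts, catch24=False):
    """Return {feature_name: value} for a single time series.

    Hypothetical helper: it assumes pycatch22 is installed and that
    catch22_all returns a dict with "names" and "values" entries.
    """
    import pycatch22  # imported lazily so this file loads without the package

    result = pycatch22.catch22_all(list(ts), catch24=catch24)
    return dict(zip(result["names"], result["values"]))


# Illustrative input: an evenly sampled sine wave (values only, no time stamps).
ts = [math.sin(0.1 * t) for t in range(500)]

try:
    table = feature_table(ts)
except ImportError:
    print("pycatch22 is not installed; run `pip install pycatch22` first")
else:
    for name, value in table.items():
        print(name, value)
```

Returning a name-to-value mapping keeps each number attached to its feature name, which helps when comparing outputs across catch22 versions.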
If you use this software, please read and cite this open-access article:
- 📗 Lubba et al. catch22: CAnonical Time-series CHaracteristics, Data Min Knowl Disc 33, 1821 (2019).
[Figure: Summary of the performance of the catch22 feature set across 93 classification problems, and a comparison to the hctsa feature set (cf. Fig. 4 from our paper).]
- When presenting results using catch22, you must identify the version used to allow clear reproduction of your results. For example, `CO_f1ecac` was altered from an integer-valued output to a linearly interpolated real-valued output from v0.3.
- catch22 features only evaluate dynamical properties of time series and do not respond to basic differences in the location (e.g., mean) or spread (e.g., variance).
- If the location and spread of the raw time-series distribution may be important for your application, pass the argument `catch24 = true` (`TRUE` in R, `True` in Python) in your call to the catch22 function in the language of your choice. This computes 24 features: the catch22 features plus the mean and standard deviation.
- Time series are z-scored internally (for features other than mean and standard deviation), which means that, e.g., constant time series will lead to `NaN` outputs.
- Time-series data are taken as an ordered sequence of values (without time stamps). We assume an evenly sampled time series.
- See language-specific usage information in the wiki.
- The computational pipeline used to generate the catch22 feature set is in the `op_importance` repository.
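The notes above on z-scoring can be illustrated with a short, standard-library-only sketch. It is not the catch22 C implementation: the `zscore` helper is hypothetical and uses the population standard deviation as an assumption. It shows why a shifted and rescaled copy of a series looks identical after z-scoring (so the 22 features cannot respond to mean or variance), and why a constant series has no usable z-score.

```python
import math
from statistics import fmean, pstdev


def zscore(ts):
    """Rescale a series to zero mean and unit spread (illustrative only).

    Assumption: population standard deviation; the library's own
    normalisation may differ in detail.
    """
    mu = fmean(ts)
    sigma = pstdev(ts)
    if sigma == 0:
        # A constant series has zero spread, so the z-score is undefined.
        return [math.nan] * len(ts)
    return [(x - mu) / sigma for x in ts]


raw = [1.0, 3.0, 2.0, 5.0, 4.0]
shifted = [10.0 + 3.0 * x for x in raw]  # different location and spread

# After z-scoring, the two series agree up to floating-point error.
same = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(zscore(raw), zscore(shifted))
)
print("identical after z-scoring:", same)

# The two extra catch24 quantities are computed on the raw values instead.
print("mean and standard deviation of raw:", fmean(raw), pstdev(raw))

# A constant series yields NaN everywhere.
print("constant series:", zscore([2.0] * 5))
```

This is the reason the `catch24` option exists: mean and standard deviation are the information that z-scoring discards.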