InceptionTime SageMaker Algorithm

The Time Series Classification (Inception) Algorithm from AWS Marketplace performs time series classification with the InceptionTime model. It implements both training and inference from CSV data and supports both CPU and GPU instances. The training and inference Docker images were built by extending the PyTorch 2.1.0 Python 3.10 SageMaker containers. The algorithm can be used for binary, multiclass and multilabel classification of both univariate and multivariate time series.

Model Description

InceptionTime is an ensemble model. Each model in the ensemble has the same architecture and uses the same hyperparameters. The only difference between the models is in the initial values of the weights, which are sampled from the Glorot uniform distribution.

Each model consists of a stack of blocks, where each block includes three convolutional layers with kernel sizes of 10, 20 and 40 and a max pooling layer. The block input is processed by the four layers in parallel, and the four outputs are concatenated before being passed to a batch normalization layer followed by a ReLU activation.
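
As a rough PyTorch sketch of one such block (the class name is illustrative, and the 1x1 convolution on the pooling branch is an assumption taken from the original paper, where it keeps the filter counts of the four branches consistent):

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Illustrative sketch of one block, not the repository's actual code."""

    def __init__(self, in_channels: int, filters: int) -> None:
        super().__init__()
        # Three convolutional layers with kernel sizes of 10, 20 and 40;
        # "same" padding keeps the time dimension unchanged so that the
        # outputs can be concatenated.
        self.convs = nn.ModuleList([
            nn.Conv1d(in_channels, filters, kernel_size=k, padding="same", bias=False)
            for k in (10, 20, 40)
        ])
        # Max pooling branch, followed by a 1x1 convolution (as in the paper)
        # so that all four branches output the same number of channels.
        self.pool = nn.Sequential(
            nn.MaxPool1d(kernel_size=3, stride=1, padding=1),
            nn.Conv1d(in_channels, filters, kernel_size=1, bias=False),
        )
        self.bn = nn.BatchNorm1d(4 * filters)
        self.relu = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Process the input with the four layers in parallel, concatenate the
        # outputs along the channel dimension, then apply batch norm and ReLU.
        out = torch.cat([conv(x) for conv in self.convs] + [self.pool(x)], dim=1)
        return self.relu(self.bn(out))
```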

A residual connection is applied between the input time series and the output of the second block, and after that between every three blocks. The residual connection processes the inputs using an additional convolutional layer with a kernel size of 1 followed by a batch normalization layer. The processed inputs are then added to the block output, and the sum is passed through a ReLU activation.

The output of the last block is passed to an average pooling layer which removes the time dimension, and then to a final linear layer.

At inference time, the class probabilities predicted by the different models are averaged to obtain a single predicted probability, and therefore a single predicted label, for each class.
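
A minimal sketch of this averaging step, assuming a multiclass task where each model outputs logits (a sigmoid would take the place of the softmax for binary and multilabel tasks):

```python
import torch

def ensemble_predict(models, x):
    # Average the class probabilities predicted by the individual models.
    with torch.no_grad():
        probabilities = torch.stack(
            [model(x).softmax(dim=-1) for model in models]
        ).mean(dim=0)
    # The predicted label is the class with the highest averaged probability.
    return probabilities, probabilities.argmax(dim=-1)
```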

InceptionTime architecture (source: doi: 10.1007/s10618-020-00710-y)

Model Resources: [Paper] [Code]

SageMaker Algorithm Description

The algorithm implements the model as described above, with one difference: the initial values of the weights are not sampled from the Glorot uniform distribution, but are determined by PyTorch's default initialization method.
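
For reference, PyTorch initializes convolutional and linear weights with the Kaiming uniform scheme by default; restoring the paper's Glorot scheme would amount to something like:

```python
import torch.nn as nn

layer = nn.Conv1d(1, 32, kernel_size=10)  # weights start out Kaiming-uniform (PyTorch default)
nn.init.xavier_uniform_(layer.weight)     # the paper's Glorot uniform initialization
```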

Training

The training algorithm has two input data channels: training and validation. The training channel is mandatory, while the validation channel is optional.

The training and validation datasets should be provided as CSV files. The column names of the one-hot encoded class labels should start with "y" (e.g. "y1", "y2", ...), while the column names of the time series values should start with "x" (e.g. "x1", "x2", ...).

The CSV file should contain unique sample identifiers in a column named "sample", and unique feature identifiers in a column named "feature". The feature identifiers are used to determine the different dimensions of multivariate time series. When using univariate time series, the feature identifiers can be set to a constant value.

All the time series should have the same length and the same number of dimensions, and should not contain missing values. The time series are scaled internally by the algorithm; there is no need to scale them beforehand.

See the sample input files train.csv and valid.csv.
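
For illustration, a tiny univariate dataset in this layout could be built with pandas as follows (all identifiers and values are made up):

```python
import pandas as pd

# Two univariate time series of length 3 belonging to two classes.
# Each row holds one feature (dimension) of one sample: the "y*" columns
# contain the one-hot encoded labels, the "x*" columns the time series values.
train = pd.DataFrame({
    "sample": [0, 1],
    "feature": ["f0", "f0"],  # constant identifier, since the series are univariate
    "y1": [1, 0],
    "y2": [0, 1],
    "x1": [0.1, 0.9],
    "x2": [0.3, 0.7],
    "x3": [0.2, 0.8],
})
train.to_csv("train.csv", index=False)
```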

See notebook.ipynb for an example of how to launch a training job.
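
A hedged sketch of such a job using the SageMaker Python SDK, with "<algorithm-arn>" standing in for the AWS Marketplace algorithm ARN (the sample notebook contains the exact code):

```python
import sagemaker
from sagemaker.algorithm import AlgorithmEstimator

session = sagemaker.Session()

estimator = AlgorithmEstimator(
    algorithm_arn="<algorithm-arn>",
    role=sagemaker.get_execution_role(),
    instance_count=1,
    instance_type="ml.m5.xlarge",
    hyperparameters={"task": "multiclass", "epochs": 100},  # see "Hyperparameters" below
)

# The training channel is mandatory; the validation channel is optional.
estimator.fit({
    "training": session.upload_data("train.csv", key_prefix="data"),
    "validation": session.upload_data("valid.csv", key_prefix="data"),
})
```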

Distributed Training

The algorithm supports multi-GPU training on a single instance, which is implemented through torch.nn.DataParallel. It does not support distributed training across multiple instances.

Incremental Training

The algorithm supports incremental training. The model artifacts generated by a previous training job can be used to continue training the model on the same dataset or to fine-tune the model on a different dataset.
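
One possible shape for this with the SDK, assuming the artifacts of the previous job are passed through the "model_uri" parameter (a hedged sketch, not the repository's documented procedure):

```python
import sagemaker
from sagemaker.algorithm import AlgorithmEstimator

estimator = AlgorithmEstimator(
    algorithm_arn="<algorithm-arn>",
    role=sagemaker.get_execution_role(),
    instance_count=1,
    instance_type="ml.m5.xlarge",
    # Artifacts of a previous training job, added as an extra input channel
    # so that training resumes from the saved weights.
    model_uri="s3://<bucket>/<previous-job>/output/model.tar.gz",
)
estimator.fit({"training": "s3://<bucket>/data/train.csv"})
```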

Hyperparameters

The training algorithm takes as input the following hyperparameters:

  • filters: int. The number of filters of each model in the ensemble.
  • depth: int. The number of blocks of each model in the ensemble.
  • models: int. The number of models in the ensemble.
  • lr: float. The learning rate used for training.
  • batch-size: int. The batch size used for training.
  • epochs: int. The number of training epochs.
  • task: str. The type of classification task, either "binary", "multiclass" or "multilabel".

All the hyperparameters are tunable, excluding the type of classification task, which needs to be defined beforehand.
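
For example (illustrative values, not the algorithm's defaults):

```python
hyperparameters = {
    "filters": 32,         # filters per convolutional layer
    "depth": 6,            # blocks per model
    "models": 5,           # ensemble size
    "lr": 0.001,
    "batch-size": 64,
    "epochs": 100,
    "task": "multiclass",  # fixed beforehand, not tunable
}
```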

Metrics

The training algorithm logs the following metrics:

  • train_loss: float. Training loss.
  • train_accuracy: float. Training accuracy.

If the validation channel is provided, the training algorithm also logs the following additional metrics:

  • valid_loss: float. Validation loss.
  • valid_accuracy: float. Validation accuracy.

See notebook.ipynb for an example of how to launch a hyperparameter tuning job.
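
A hedged sketch of a tuning job built on the estimator from the training example above, maximizing the logged validation accuracy:

```python
from sagemaker.tuner import ContinuousParameter, HyperparameterTuner, IntegerParameter

tuner = HyperparameterTuner(
    estimator=estimator,  # the AlgorithmEstimator from the training sketch
    objective_metric_name="valid_accuracy",
    objective_type="Maximize",
    hyperparameter_ranges={
        "lr": ContinuousParameter(1e-4, 1e-2),
        "batch-size": IntegerParameter(16, 128),
    },
    max_jobs=10,
    max_parallel_jobs=2,
)

# Both channels are needed here, since the objective is a validation metric.
tuner.fit({
    "training": "s3://<bucket>/data/train.csv",
    "validation": "s3://<bucket>/data/valid.csv",
})
```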

Inference

The inference algorithm takes as input a CSV file containing the time series values. The column names of the time series values should start with "x" (e.g. "x1", "x2", ...).

The CSV file should contain unique sample identifiers in a column named "sample", and unique feature identifiers in a column named "feature". The feature identifiers are used to determine the different dimensions of multivariate time series. When using univariate time series, the feature identifiers can be set to a constant value. The feature identifiers used for inference should match the ones used for training.

All the time series should have the same length and the same number of dimensions, and should not contain missing values. The time series are scaled internally by the algorithm; there is no need to scale them beforehand.

See the sample input file test_data.csv.

The inference algorithm outputs the predicted class labels and the predicted class probabilities, which are returned in CSV format.

See the sample output files batch_predictions.csv and real_time_predictions.csv.

See notebook.ipynb for an example of how to launch a batch transform job.
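
A hedged sketch of a batch transform job over the test file uploaded to S3 (bucket and paths are placeholders):

```python
# Reuse the trained estimator from the training sketch above.
transformer = estimator.transformer(
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

transformer.transform(
    data="s3://<bucket>/data/test_data.csv",
    content_type="text/csv",
)
transformer.wait()
# The predictions are written in CSV format to transformer.output_path.
```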

Endpoints

The algorithm supports only real-time inference endpoints. The inference image is too large to be deployed to a serverless inference endpoint.

See notebook.ipynb for an example of how to deploy the model to an endpoint, invoke the endpoint and process the response.
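
A hedged sketch of deploying and invoking a real-time endpoint with CSV payloads (payload handling may differ from the sample notebook):

```python
from sagemaker.serializers import CSVSerializer
from sagemaker.deserializers import CSVDeserializer

# Deploy the trained estimator from the training sketch above.
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    serializer=CSVSerializer(),
    deserializer=CSVDeserializer(),
)

# Invoke the endpoint with the contents of the test file; the response
# contains the predicted class labels and probabilities as parsed CSV rows.
with open("test_data.csv") as f:
    predictions = predictor.predict(f.read())

# Clean up the endpoint when done.
predictor.delete_endpoint()
```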

Additional Resources: [Sample Notebook] [Blog post]

References

  • H. Ismail Fawaz, B. Lucas, G. Forestier, C. Pelletier, D.F. Schmidt, J. Weber, G.I. Webb, L. Idoumghar, P.A. Muller and F. Petitjean, "InceptionTime: Finding AlexNet for Time Series Classification," Data Mining and Knowledge Discovery, vol. 34, no. 6, pp. 1936-1962, 2020, doi: 10.1007/s10618-020-00710-y.