Jean Bresson edited this page Oct 14, 2018 · 41 revisions

Welcome to the om-xmm wiki!

OM-XMM is an external library for OM (OpenMusic). It integrates an XMM object which allows users to train Hierarchical Hidden Markov Models (HHMMs) with a few samples of labelled data, and then use the trained model for classification. HHMMs are well suited to being trained on a few examples of temporal data. Any type of data can be used as input, as long as it consists of a list of temporal descriptors (time series).

The XMM object in OpenMusic

Create a new XMM object by typing xmm-model in the new box editor.

Inputs:

Dataset: (list)

The dataset input provides the data used to train the model. The data must be formatted as follows:

  • A list of labelled samples.
  • Each labelled sample is a list of 2 elements:
    • A list of descriptors, each descriptor being a list of floats (corresponding to the time series)
    • A string containing the label of the class of the sample (20 characters maximum)

The descriptors can be seen as a matrix, with the descriptor id as the first dimension and time as the second dimension. Across all samples used for training, the number of descriptors (first dimension) must stay the same. The length in time (second dimension) can vary from one sample to another, but all descriptors within one sample must have the same length.
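The format rules above can be sketched outside OpenMusic as a small Python check. This is only an illustration of the constraints, not part of OM-XMM: the function name `check_dataset` and the example values are hypothetical, and nested Python lists stand in for the OM lists.

```python
def check_dataset(dataset, max_label_len=20):
    """Check the format rules and return the shared number of descriptors.

    Hypothetical helper for illustration; raises ValueError on a violation.
    """
    n_descriptors = None
    for i, sample in enumerate(dataset):
        # Each labelled sample is a list of 2 elements: descriptors and label.
        if len(sample) != 2:
            raise ValueError(f"sample {i}: expected (descriptors, label)")
        descriptors, label = sample
        if not isinstance(label, str) or len(label) > max_label_len:
            raise ValueError(f"sample {i}: label must be a string of at most "
                             f"{max_label_len} characters")
        # Within one sample, every descriptor must have the same length in time.
        lengths = {len(series) for series in descriptors}
        if len(lengths) != 1:
            raise ValueError(f"sample {i}: descriptors differ in length")
        # Across samples, the number of descriptors must stay the same.
        if n_descriptors is None:
            n_descriptors = len(descriptors)
        elif len(descriptors) != n_descriptors:
            raise ValueError(f"sample {i}: expected {n_descriptors} descriptors")
    return n_descriptors


# Two samples with 2 descriptors each; their durations differ (3 vs. 2 frames).
dataset = [
    [[[0.1, 0.2, 0.3], [1.0, 1.1, 1.2]], "up"],
    [[[0.3, 0.2], [1.2, 1.1]], "down"],
]
n_descriptors = check_dataset(dataset)
```

In OM the same structure would be written as nested lists, with the label as a string in second position.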

Here is an example of how to format the data for one sample, given as input an IAE object, a list describing the sample (t1 t2 label), and a list of descriptors to keep.

States: (integer)

The states input sets the number of hidden states of the Hidden Markov Model. The default value is 10. The more complex the data (e.g. more descriptors), the more hidden states are needed.

Gaussians: (integer)

XMM is a combination of Gaussian Mixture Models (GMMs) and HHMMs: the observations associated with the states of the HHMM are modelled using GMMs. The gaussians input sets the number of Gaussians for these GMMs. The default value is 1. More Gaussians might be needed to recognize complex and sparse data.

Regularization: (list)

The regularization input sets the regularization coefficients (offsets added to the covariance matrices of the Gaussian distributions at each re-estimation in the algorithm). It consists of two values:

  • Relative regularization: offset relative to the data variance.
  • Absolute regularization: minimum offset value.

The input expects a list of 2 floats in that order, each strictly between 0 and 1. The default value is (0.05 0.01).
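As a rough illustration of how the two coefficients could interact, here is a minimal Python sketch. It is an assumption, not taken from the XMM source: it reads "relative" as a fraction of each dimension's data variance and "absolute" as a floor on the resulting offset, then adds that offset to the diagonal of a covariance matrix. The function name `regularize_covariance` and the example values are hypothetical.

```python
def regularize_covariance(cov, data_variance, relative=0.05, absolute=0.01):
    """Return a copy of cov with a per-dimension offset added to the diagonal.

    Assumed rule (not confirmed by the library):
    offset[d] = max(relative * data_variance[d], absolute)
    """
    out = [row[:] for row in cov]  # copy so the input matrix is left untouched
    for d, variance in enumerate(data_variance):
        out[d][d] += max(relative * variance, absolute)
    return out


# A degenerate 2x2 covariance: dimension 0 has collapsed to zero variance.
cov = [[0.0, 0.0],
       [0.0, 0.5]]
data_variance = [0.1, 4.0]  # variance of each descriptor over the training data

regularized = regularize_covariance(cov, data_variance)
```

Under this reading, the absolute term keeps a Gaussian from collapsing when a descriptor barely varies, while the relative term scales the offset with the spread of the data.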

Normalize-data

This option normalizes your dataset before training and running your model. The mean and standard deviation of the dataset are computed, then a standard score normalization is applied to the training data as well as to the data input for classification (running).
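Standard score normalization can be sketched in Python as follows. The sketch makes two assumptions that the text does not confirm: statistics are computed per descriptor over all frames of all training samples, and the population standard deviation is used. The function names `fit_normalizer` and `normalize` are hypothetical and are not part of OM-XMM.

```python
from statistics import fmean, pstdev


def fit_normalizer(dataset):
    """Compute one (mean, std) pair per descriptor over the whole dataset."""
    n_descriptors = len(dataset[0][0])
    stats = []
    for d in range(n_descriptors):
        # Pool descriptor d across every sample, whatever each sample's length.
        values = [v for descriptors, _label in dataset for v in descriptors[d]]
        std = pstdev(values)
        # Guard against a constant descriptor to avoid dividing by zero.
        stats.append((fmean(values), std if std > 0 else 1.0))
    return stats


def normalize(descriptors, stats):
    """Apply z = (x - mean) / std to each descriptor of one sample."""
    return [[(v - mean) / std for v in series]
            for series, (mean, std) in zip(descriptors, stats)]


training = [
    [[[1.0, 3.0]], "a"],
    [[[5.0, 7.0]], "b"],
]
stats = fit_normalizer(training)

# The statistics fitted on the training data are reused for new input.
normalized_training = [[normalize(desc, stats), label] for desc, label in training]
normalized_input = normalize([[4.0, 6.0]], stats)
```

The point of the sketch is that the statistics come from the training data only and are then reused unchanged when running the model on new data.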

Finding the right parameters for your model can be quite hard and confusing, but don't worry, tools are there to help you in that process (see the hyperparameter optimization section).

Outputs

The outputs of the XMM object correspond to the inputs. The self output can be used to run and test the model. The Dataset output can be used to test the model (see the Test a model section).