Digital Signal Processing (DSP) functions for creating Automatic Speech Recognition (ASR) features in MATLAB. Equations to compute the features can be found in [1]. Features included:
Feature | Description |
---|---|
STMS | Short-Time Magnitude Spectrum |
PSD | Power Spectral Density |
MFCC | Mel-Frequency Cepstral Coefficients |
LSSE | Log-Spectral Subband Energies |
SSC | Spectral Subband Centroids |
%% SINGLE-CHANNEL SEQUENCE PATH
seq_path = 'path_to_sequence'; % path to the location of the audio sequence.
%% PARAMETERS
x.fs = 16000; % sampling frequency (Hz).
x.Nw = 512; % window length (samples). This is for a 32 ms window.
x.Ns = 256; % window shift (samples). This is for a 16 ms shift.
x.NFFT = 2^nextpow2(x.Nw); % FFT length (samples): smallest power of two >= x.Nw.
%% SINGLE-CHANNEL SEQUENCE
[x.wav, ~] = audioread(seq_path); % waveform.
%% MEL-SCALED FILTER BANK
H = melfbank(26, x.NFFT/2 + 1, x.fs); % mel filter bank.
%% MFCC
x = mfcc(x, H); % compute Mel Frequency Cepstral Coefficients (MFCC).
x.frm - framed & windowed sequence.
x.STMS - single-sided short-time magnitude spectrum.
x.PSD - single-sided short-time power spectral density.
x.SSE - spectral subband energies.
x.LSSE - log-spectral subband energies.
x.MFCC - mel-frequency cepstral coefficients.
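
To make the first few fields listed above more concrete, the following is a minimal sketch of how a framed and windowed sequence, a single-sided magnitude spectrum, and a power spectrum estimate could be computed in base MATLAB. It is not the repository's `mfcc` implementation: the Hamming window, the `1/NFFT` power scaling, the dropping of trailing samples, and the placeholder noise signal are all assumptions made for illustration, and the local variable names (`wav`, `frm`, `STMS`, `PSD`) only mirror the field names. The exact definitions used by the toolbox are the ones given in [1].

```matlab
% Illustrative sketch only; assumptions are marked in the comments.
fs   = 16000;               % sampling frequency (Hz)
Nw   = 512;                 % window length (samples), 32 ms at 16 kHz
Ns   = 256;                 % window shift (samples), 16 ms at 16 kHz
NFFT = 2^nextpow2(Nw);      % FFT length (samples)

wav = randn(fs, 1);         % placeholder: 1 s of noise standing in for audioread output

% Hamming window built by hand so that no toolbox is needed (assumed window type).
n = (0:Nw-1)';
w = 0.54 - 0.46*cos(2*pi*n/(Nw - 1));

% Number of complete frames; trailing samples that do not fill a frame are dropped.
N = floor((length(wav) - Nw)/Ns) + 1;

% Index matrix: row i holds the sample indices of frame i.
idx = repmat(1:Nw, N, 1) + repmat((0:N-1)'*Ns, 1, Nw);
frm = wav(idx) .* repmat(w', N, 1);   % framed and windowed sequence (N x Nw)

X    = fft(frm, NFFT, 2);             % FFT of each frame (along rows)
STMS = abs(X(:, 1:NFFT/2 + 1));       % single-sided magnitude spectrum (N x NFFT/2+1)
PSD  = (1/NFFT) * STMS.^2;            % power spectrum estimate (assumed scaling)
```

With these parameters, each row of `STMS` has `NFFT/2 + 1 = 257` bins, which matches the second argument passed to `melfbank` in the usage example.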