DeepXi: Residual Bidirectional Long Short-Term Memory (ResBLSTM) Network A Priori SNR Estimator

DeepXi (where the Greek letter 'xi', or ξ, is pronounced /zaɪ/) is a residual bidirectional long short-term memory (ResBLSTM) network a priori SNR estimator that was proposed in [1]. Its a priori SNR estimate can be used by minimum mean-square error (MMSE) approaches to speech enhancement, such as the MMSE short-time spectral amplitude (MMSE-STSA) estimator, the MMSE log-spectral amplitude (MMSE-LSA) estimator, and the Wiener filter (WF) approach. It can also be used to estimate the ideal ratio mask (IRM) and the ideal binary mask (IBM). DeepXi is implemented in TensorFlow and is trained to estimate the a priori SNR for single-channel noisy speech with a sampling frequency of 16 kHz.
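To illustrate how an a priori SNR estimate feeds these approaches, the sketch below maps a linear-scale a priori SNR value ξ to a Wiener filter gain, an IRM value, and an IBM value using common textbook definitions. The function names, the IRM exponent of 0.5, and the 0 dB IBM threshold are assumptions made for this example; they are not taken from deepxi.py, which may compute these quantities differently.

```python
import math


def wiener_gain(xi):
    """Wiener filter gain for a linear-scale a priori SNR value xi."""
    return xi / (1.0 + xi)


def ideal_ratio_mask(xi, beta=0.5):
    """IRM under the common definition (xi / (1 + xi)) ** beta (beta is assumed)."""
    return (xi / (1.0 + xi)) ** beta


def ideal_binary_mask(xi, threshold_db=0.0):
    """IBM: 1 where the SNR in dB exceeds the (assumed) threshold, else 0."""
    xi_db = 10.0 * math.log10(xi) if xi > 0.0 else float("-inf")
    return 1 if xi_db > threshold_db else 0


# Hypothetical a priori SNR estimates for three time-frequency bins.
xi_hat = [0.1, 1.0, 10.0]
for xi in xi_hat:
    print(xi, wiener_gain(xi), ideal_ratio_mask(xi), ideal_binary_mask(xi))
```

In each case the enhanced magnitude spectrum would be obtained by multiplying the noisy magnitude spectrum by the gain or mask, bin by bin.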

Prerequisites

TensorFlow (installed in a virtual environment)
Python3
MATLAB

Installation

It is recommended to use a virtual environment.

git clone https://github.com/anicolson/DeepXi.git
cd DeepXi
pip install -r requirements.txt

Download the Model

A trained model can be downloaded from here. Unzip it and place its contents in the model directory. The model was trained on speech with a sampling rate of 16 kHz.

How to Perform Speech Enhancement

Run the script (python3 deepxi.py) from within the virtual environment in which TensorFlow is installed. The script offers several inference options and can also perform training if required.
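Because the model was trained on single-channel speech sampled at 16 kHz, it may be worth checking the input files before running the script. The following is a minimal sketch that uses only the Python standard library; the helper names check_wav and check_directory are hypothetical and are not part of DeepXi. It assumes PCM-encoded .wav files, since the wave module raises wave.Error for other encodings.

```python
import wave
from pathlib import Path

EXPECTED_RATE = 16000   # Hz, the sampling frequency stated in this README
EXPECTED_CHANNELS = 1   # single-channel (mono) speech


def check_wav(path):
    """Return a list of problems found in one .wav file (empty if none)."""
    problems = []
    with wave.open(str(path), "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE:
            problems.append(f"sampling rate is {wav.getframerate()} Hz")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"file has {wav.getnchannels()} channels")
    return problems


def check_directory(directory="noisy_speech"):
    """Map each offending file name in the directory to its problems."""
    report = {}
    for path in sorted(Path(directory).glob("*.wav")):
        problems = check_wav(path)
        if problems:
            report[path.name] = problems
    return report


if __name__ == "__main__":
    for name, problems in check_directory().items():
        print(name, "->", "; ".join(problems))
```

Files reported here would need to be resampled or downmixed with a separate tool before enhancement.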

Directory Description

Directory	Description
lib	Functions for deepxi.py.
model	The directory for the model (the model must be downloaded).
noisy_speech	Noisy speech. Place noisy speech .wav files to be enhanced here.
output	DeepXi outputs, including the enhanced speech .wav output files.
stats	Statistics of a sample from the training set. The mean and standard deviation of the a priori SNR for the sample are used to compute the training target.
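This README does not spell out how the statistics are turned into a training target. One mapping consistent with the description above, assumed here purely for illustration, passes the a priori SNR in dB through the cumulative distribution function of a normal distribution with the sample mean and standard deviation, which compresses the target into the interval [0, 1]. The function names and the values of mu_db and sigma_db below are hypothetical.

```python
import math
from statistics import NormalDist


def snr_to_target(xi, mu_db, sigma_db):
    """Map a linear-scale a priori SNR (xi > 0) to a value in [0, 1]."""
    xi_db = 10.0 * math.log10(xi)
    return NormalDist(mu_db, sigma_db).cdf(xi_db)


def target_to_snr(target, mu_db, sigma_db):
    """Invert the mapping; target must lie strictly between 0 and 1."""
    xi_db = NormalDist(mu_db, sigma_db).inv_cdf(target)
    return 10.0 ** (xi_db / 10.0)


# Hypothetical statistics; real values would come from the stats directory.
mu_db, sigma_db = -5.0, 10.0
target = snr_to_target(2.0, mu_db, sigma_db)
print(target, target_to_snr(target, mu_db, sigma_db))
```

Under this assumption, a network output would be mapped back to an a priori SNR estimate with the inverse function at inference time.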

References

[1] A. Nicolson and K. K. Paliwal, "Deep Learning For Minimum Mean-Square Error Approaches to Speech Enhancement", Submitted to Speech Communication.
