Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
A Python package to build AI-powered real-time audio applications
Speaker embedding (d-vector) trained with GE2E loss
Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)
PyTorch implementation of Densely Connected Time Delay Neural Network
A pipeline that reads lips and generates speech for the read content, i.e., lip-to-speech synthesis.
Companion repository for the paper "A Comparison of Metric Learning Loss Functions for End-to-End Speaker Verification" published at SLSP 2020
A curated list of speaker-embedding, speaker-verification, and speaker-identification resources.
VoxCeleb1 i-vector based speaker recognition system
Luigi pipeline to download VoxCeleb(2) audio from YouTube and extract speaker segments
Trained speaker embedding deep learning models and evaluation pipelines in PyTorch and TensorFlow for speaker recognition.
On-device speaker recognition engine powered by deep learning
Speaker embedding for VI-SVC and VI-SVS, and also for VITS; use it in place of the speaker ID to implement voice cloning.
PyTorch implementation of the 1D-Triplet-CNN neural network model described in "Fusing MFCC and LPC Features using 1D Triplet CNN for Speaker Recognition in Severely Degraded Audio Signals" by A. Chowdhury and A. Ross.
DropClass and DropAdapt - repository for the paper accepted to Speaker Odyssey 2020
Official implementation of the ICASSP 2024 paper: Emphasized Non-Target Speaker Knowledge in Knowledge Distillation for Speaker Verification
A curated list of awesome speaker recognition/verification papers, projects, datasets, and competitions.
Create speaker voiceprints from a few seconds of audio, and identify individuals in real-time streaming or recorded conversations.
Angular triplet center loss implementation in PyTorch.
A simple version of our Torch Kaldi toolkit, developed at the LIA by two apprentices (@Chaanks and @vbrignatz).
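Most of the projects listed above produce a fixed-length speaker embedding per utterance, and a common way to use such embeddings for verification is to score two of them with cosine similarity. The following is a minimal sketch of that idea in plain Python, not code from any of the listed repositories. The function names, the toy vectors, and the 0.7 threshold are all hypothetical; a real system would tune the threshold on held-out trials.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(enrolled: Sequence[float],
                 test: Sequence[float],
                 threshold: float = 0.7) -> bool:
    """Accept the trial if the score reaches the threshold.

    The default threshold is an illustrative assumption, not a
    recommended value.
    """
    return cosine_similarity(enrolled, test) >= threshold


if __name__ == "__main__":
    # Toy 3-dimensional "embeddings"; real ones typically have
    # a few hundred dimensions.
    enrolled = [1.0, 2.0, 3.0]
    close = [1.1, 2.0, 2.9]
    far = [3.0, -1.0, 0.0]
    print(same_speaker(enrolled, close))
    print(same_speaker(enrolled, far))
```

Cosine scoring ignores vector length and compares direction only, which is one reason it pairs naturally with the angular and metric-learning losses several of these repositories implement.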