GitHub - IvanEvan/speaker-diarization: speaker diarization in phone recordings

1. Overview

This repo separates the two speakers in a telephone recording.
If your recording has more than two speakers, I can't guarantee that this method will work.
In addition, to get good results, try to make sure that both speakers talk for roughly the same amount of time.

2. Implementation

1. Split the waveform into audio clips by removing silence
2. Compute each clip's id-vector (speaker embedding) using a pre-trained speaker recognition model
3. Cluster the clips' id-vectors with K-means, using K=2
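
Steps 1 and 3 above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code in speaker_diarization.py: `split_on_silence` and `kmeans_two` are hypothetical helper names, the energy gate with its `frame_len` and `threshold` values is an assumed stand-in for the repo's silence removal, and the farthest-point initialization is an assumption chosen to keep the example deterministic. Step 2 is left out because it depends on the pre-trained model.

```python
import math


def split_on_silence(samples, frame_len=4, threshold=0.1):
    """Return (start, end) sample ranges whose frame RMS exceeds threshold.

    A simple energy gate standing in for step 1 (assumed, not the repo's splitter).
    """
    clips, start = [], None
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms > threshold and start is None:
            start = i                     # a voiced region begins
        elif rms <= threshold and start is not None:
            clips.append((start, i))      # the voiced region ends
            start = None
    if start is not None:                 # audio ended while still voiced
        clips.append((start, len(samples)))
    return clips


def kmeans_two(vectors, iters=20):
    """Cluster vectors into K=2 groups with plain Lloyd iterations (step 3)."""
    # Assumed initialization: the first vector and the vector farthest from it.
    first = vectors[0]
    centers = [first, max(vectors, key=lambda v: math.dist(v, first))]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign every vector to its nearest center.
        labels = [min((0, 1), key=lambda k: math.dist(v, centers[k]))
                  for v in vectors]
        # Move each center to the mean of its members.
        for k in (0, 1):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:                   # keep the old center if a cluster is empty
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


if __name__ == "__main__":
    # Toy signal: silence, speech, silence, speech.
    samples = [0.0] * 8 + [0.5] * 8 + [0.0] * 8 + [0.6] * 4
    print(split_on_silence(samples))
    # Toy 2-D "id-vectors"; real embeddings would come from the pre-trained model.
    print(kmeans_two([[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]))
```

In practice, each clip returned by the splitter would be passed through the speaker recognition model, and the resulting label (0 or 1) per clip gives the speaker assignment.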

3. Result

4. Appendix

1. The pre-trained speaker recognition model comes from WeidiXie's repo VGG-Speaker-Recognition. Thanks for open-sourcing it!
2. My method is essentially unsupervised, so you can also try a supervised method, or even an end-to-end one. You can find more information about speaker diarization here.

Folders and files:

pretrained
README.md
backbone.py
model.py
result.png
speaker_diarization.py
toolkits.py