Skip to content

Mel cepstral distortion (MCD) computations in python. Use Merlin toolkit to convert .wav files to .gcm files. Work in all form of .wav files

Notifications You must be signed in to change notification settings

Lukelluke/MCD-MEL-CEPSTRAL-DISTANCE-MCD-application

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

MCD-toolkit-polished-useful

Use Merlin toolkit to convert .wav files to .gcm files, and modified original MCD implement.

  1. install Merlin toolkit as https://github.com/CSTR-Edinburgh/merlin taught.(to extract some useful features like .bap/.lf0/.mgc)
  2. cd into merlin/tool , and install as new script under that file
  3. cd into merlin/tools/SPTK*3.9, follow INSTALL md taught( because we mainly use sptk toolkit to extract features)
  4. cd merlin/egs/voice_conversion/s1
  5. ./01_setup.sh speakerA speakerB (it will mkdir to files named database/speakerA && apeakerB)

  1. ./02_prepare_acoustic_features.sh <path_to_wav_dir> <path_to_feat_dir> (you need to mkdir two new folder to contain the features to be extracted, recommend to build under database folder)

  2. image-20200718155047663

  3. then u got two .mgc files


  1. install MCD tools as https://github.com/MattShannon/mcd/tree/c86266a2caf6a7cb248ea89ea56f90fd161a297e taguht

10 modify some codes in htk_io/vecseq.py, as picture show:

image-20200718155142402

    def readFile(self, vecSeqFile):
        """Reads a raw vector sequence file.

        The dtype of the returned numpy array is always the numpy default
        np.float, which may be 32-bit or 64-bit depending on architecture, etc.
        """
        Vec = np.fromfile(vecSeqFile, dtype=self.dtypeFile)
        lengthOfVec = len(Vec)
        misLenToPad = lengthOfVec % self.vecSize
        means = np.mean(Vec)

        for i in range(misLenToPad):
            Vec = np.insert(Vec, lengthOfVec, means)

        return np.reshape(
            Vec,
            (-1, self.vecSize)
        ).astype(np.float)

        # return np.reshape(
        #     np.fromfile(vecSeqFile, dtype=self.dtypeFile),
        #     (-1, self.vecSize)
        # ).astype(np.float)

up modified code is to solve reshape problem during read .mgc files data: original auther's algorithm here make me always counter with error: image-20200718155712646 (if this pic can't perform normal, use this site: https://blog-1301959139.cos.ap-beijing.myqcloud.com/picGo/20200718155714.png

  1. Here my solution is pad the mean number of .mgc into source .mgc so as to make it could be % by 40 dimension.

  2. this change work well on all kind of wav files (Notice that : Merlin only accept 16bit format of wav, you can change this parameter by Audition or 'sox' toolkit)

  3. Last but not least: we would better use 16k sample rate as our wav sr. In exp, 44100hz's MCD would be 16+ when it just 13+ in 16k hz


14 Ps. one more thing puzzled me: i use stargan-vc2's original demo .wav files for testing, which authority told that MCD is only 6.+ in their paper, however i calculate then as 13+.

In another hand, i use my personal VoiceConversion tast final result for testing, can get 7.1+score, which makes sense (MCD should be among [4, 8], the less the better )

  1. Any question, or any improment suggestion, welcome to make issues!

16.one more details to notice: under MCD folder, should copy all files under bin out: image-20200718162036091

and then can put ur original .mgc in test_data/ref-examples/ folder, converted/synthesized .mgc in test_data/synth-examples/ folder 🌟These two files name should be the same!

modified contents in corpus.lst under test_data/ folder as : <your .mgc files name, without .mgc>

then use cammand: ** cat test_data/corpus.lst | xargs dtw_synth test_data/ref-examples test_data/synth-examples out **


☑️ 🌟 To do , release some examples

About

Mel cepstral distortion (MCD) computations in python. Use Merlin toolkit to convert .wav files to .gcm files. Work in all form of .wav files

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published