
TTS bug #1081

Open
janjanusek opened this issue Jul 6, 2024 · 27 comments


@janjanusek

Hello, I'm using your TTS library and I must say it's very good, but I have no idea why, when I instantiate an OfflineTts model and run Generate, it returns a result the first time, yet the second call on the same instance throws an error. This is the config I'm using for the model:

new OfflineTtsConfig()
{
    Model = new OfflineTtsModelConfig()
    {
        Vits = new OfflineTtsVitsModelConfig()
        {
            Tokens = Description.Tokens.FullName,
            Model = Description.OnnxModel.FullName,
            DataDir = Description.NgDataDir.FullName,
            NoiseScale = 0,
            NoiseScaleW = 0
        },
        Provider = "cpu",
        NumThreads = 1,
        Debug = 1
    },
    MaxNumSentences = 1
}

I'm using the org.k2fsa.sherpa.onnx package and have tried all available versions. My project also references ONNX Runtime (for my other models), but that should not affect this package. My app runs on .NET 8, and I'm on Windows 11 with an EU-based language setting.

This is the error I get on the second run, for any input (even the same one):

D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx/csrc/offline-tts-vits-impl.h:Generate:165 Raw text: John
2024-07-06 19:33:02.2001917 [E:onnxruntime:, sequential_executor.cc:516 onnxruntime::ExecuteKernel] Non-zero status code returned while running GatherElements node. Name:'/dp/flows.7/GatherElements_3' Status Message: C:\a\_work\1\s\onnxruntime\core\providers\cpu\tensor\gather_elements.cc:154 onnxruntime::core_impl GatherElements op: Out of range value in index tensor

It's worth noting that when I create another instance, it works again, but only for a single use.

@csukuangfj
Collaborator

Could you please tell us which model you are using?

@janjanusek
Author

I've tried almost all the piper models; no matter which one I use, the problem seems to persist. What is the difference between them?

@janjanusek
Author

Also, about a month ago this was working for me with no problems. I changed nothing and it stopped working. The only thing that came to mind was library version changes, but as I said, I'm using the onnxruntime and ml.llm packages alongside the sherpa packages.

@csukuangfj
Collaborator

Could you post the complete code for reproducing?

We have never encountered such an issue before.

@janjanusek
Author

I'll post it in about 12 hours, thanks

@janjanusek
Author

Okay, I found a way to replicate the issue: apparently it starts happening once Microsoft.ML.OnnxRuntimeGenAI is installed in your project. You can replicate the problem by adding the latest Microsoft.ML.OnnxRuntimeGenAI (v0.3.0) to the example offline-tts project and replacing the audio-generation line with:

    OfflineTtsGeneratedAudio audio = tts.Generate(options.Text, speed, sid);
    OfflineTtsGeneratedAudio audio1 = tts.Generate(options.Text, speed, sid);
    OfflineTtsGeneratedAudio audio2 = tts.Generate(options.Text, speed, sid);
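For reference, the package setup described above might look like the following project-file fragment. This is a sketch: the sherpa-onnx version number is an assumption, not taken from the report.

```xml
<!-- Hypothetical csproj fragment reproducing the reported setup;
     the org.k2fsa.sherpa.onnx version is illustrative only. -->
<ItemGroup>
  <PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI" Version="0.3.0" />
  <PackageReference Include="org.k2fsa.sherpa.onnx" Version="1.10.*" />
</ItemGroup>
```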

@csukuangfj
Collaborator

apparently Microsoft.ML.OnnxRuntimeGenAI once installed to your project it starts to happen.

If you uninstall it, will it fix the issue or not?

@janjanusek
Author

janjanusek commented Jul 7, 2024

Absolutely. But how can it be that installing another library makes yours stop working? Apparently there is some dynamic binding in sherpa-onnx causing this, I suppose. In my solution I need to use both libraries.

@csukuangfj
Collaborator

sherpa-onnx also links to onnxruntime.dll

Please search onnxruntime.dll inside the sherpa-onnx package directory.

You can use sherpa-onnx's onnxruntime.dll to replace the one from Microsoft.ML.OnnxRuntimeGenAI
and see if it works.

Otherwise, there are conflicts between different versions of onnxruntime.dll
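To see which copies of onnxruntime.dll actually end up in a build output directory, a small helper script can enumerate them. This is a sketch; the directory you point it at (e.g. a .NET `bin/Debug/net8.0` folder) is an assumption.

```python
import os


def find_dlls(root, name="onnxruntime.dll"):
    """Recursively collect every copy of `name` under `root`.

    Useful for spotting conflicting onnxruntime.dll copies brought in by
    different NuGet packages in the same build output.
    """
    hits = []
    for dirpath, _, files in os.walk(root):
        for f in files:
            if f.lower() == name.lower():
                hits.append(os.path.join(dirpath, f))
    return hits


if __name__ == "__main__":
    # Hypothetical usage: scan the current directory.
    for path in find_dlls("."):
        print(path)
```

If more than one copy turns up, whichever DLL the loader resolves first wins for both libraries, which is consistent with the conflict described above.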

@janjanusek
Author

Okay, so I tried replacing the onnxruntime at build time with the one used by sherpa-onnx. I got sherpa-onnx working again, but the PHI3 model using Microsoft.ML.OnnxRuntimeGenAI threw the following exception:
The requested API version [18] is not available, only API versions [1, 17] are supported in this build. Current ORT Version is: 1.17.1

So in theory the approach you proposed is dirty but valid. I saw the PR for onnxruntime 1.18.1; is there any expected date when it could be available? I believe that would solve all the issues.

I understand that updating your library to the newest onnxruntime must be an unpleasant job, but in order to maintain this awesome project it is worthwhile.

@csukuangfj
Collaborator

We also want to update to the latest onnxruntime. However, onnxruntime > 1.17.1 causes issues with sherpa-onnx.

Please see #906

Basically, you can use onnxruntime 1.18.1 to compile sherpa-onnx from source and then use the DLL you generated to replace the one you downloaded into .Net.

If you have any issues after doing this at runtime, please see

microsoft/onnxruntime#20808 (comment)

We have supported so many models that we have not found time to use the above code to convert existing models one by one. That is why we have not updated onnxruntime to the latest version.

@janjanusek
Author

I'll try that today and let you know, thanks

@janjanusek
Author

I tried to build it from your fork with onnxruntime version 1.18, but I got tons of errors while doing so. I believe my environment may not have the right dependencies? Dunno, have a look:

24>onnxruntime.lib(onnxruntime_c_api.obj): Error LNK2038 : mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '0' doesn't match value '2' in sherpa-onnx-offline-language-identification.obj
24>onnxruntime.lib(onnxruntime_c_api.obj): Error LNK2038 : mismatch detected for 'RuntimeLibrary': value 'MT_StaticRelease' doesn't match value 'MTd_StaticDebug' in sherpa-onnx-offline-language-identification.obj
24>onnxruntime.lib(error_code.obj): Error LNK2038 : mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '0' doesn't match value '2' in sherpa-onnx-offline-language-identification.obj
24>onnxruntime.lib(error_code.obj): Error LNK2038 : mismatch detected for 'RuntimeLibrary': value 'MT_StaticRelease' doesn't match value 'MTd_StaticDebug' in sherpa-onnx-offline-language-identification.obj
24>onnxruntime.lib(allocator.obj): Error LNK2038 : mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '0' doesn't match value '2' in sherpa-onnx-offline-language-identification.obj
24>onnxruntime.lib(allocator.obj): Error LNK2038 : mismatch detected for 'RuntimeLibrary': value 'MT_StaticRelease' doesn't match value 'MTd_StaticDebug' in sherpa-onnx-offline-language-identification.obj
24>onnxruntime.lib(onnxruntime_typeinfo.obj): Error LNK2038 : mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '0' doesn't match value '2' in sherpa-onnx-offline-language-identification.obj
24>onnxruntime.lib(onnxruntime_typeinfo.obj): Error LNK2038 : mismatch detected for 'RuntimeLibrary': value 'MT_StaticRelease' doesn't match value 'MTd_StaticDebug' in sherpa-onnx-offline-language-identification.obj
24>onnxruntime.lib(tensor_type_and_shape.obj): Error LNK2038 : mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '0' doesn't match value '2' in sherpa-onnx-offline-language-identification.obj
24>onnxruntime.lib(tensor_type_and_shape.obj): Error LNK2038 : mismatch detected for 'RuntimeLibrary': value 'MT_StaticRelease' doesn't match value 'MTd_StaticDebug' in sherpa-onnx-offline-language-identification.obj
24>onnxruntime.lib(abi_session_options.obj): Error LNK2038 : mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '0' doesn't match value '2' in sherpa-onnx-offline-language-identification.obj
24>onnxruntime.lib(abi_session_options.obj): Error LNK2038 : mismatch detected for 'RuntimeLibrary': value 'MT_StaticRelease' doesn't match value 'MTd_StaticDebug' in sherpa-onnx-offline-language-identification.obj
24>onnxruntime.lib(onnxruntime_map_type_info.obj): Error LNK2038 : mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '0' doesn't match value '2' in sherpa-onnx-offline-language-identification.obj
24>onnxruntime.lib(onnxruntime_map_type_info.obj): Error LNK2038 : mismatch detected for 'RuntimeLibrary': value 'MT_StaticRelease' doesn't match value 'MTd_StaticDebug' in sherpa-onnx-offline-language-identification.obj
24>onnxruntime.lib(onnxruntime_sequence_type_info.obj): Error LNK2038 : mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '0' doesn't match value '2' in sherpa-onnx-offline-language-identification.obj
24>onnxruntime.lib(onnxruntime_sequence_type_info.obj): Error LNK2038 : mismatch detected for 'RuntimeLibrary': value 'MT_StaticRelease' doesn't match value 'MTd_StaticDebug' in sherpa-onnx-offline-language-identification.obj
22>LINK: Warning LNK4044 : unrecognized option '/Wl,-rpath,$ORIGIN'; ignored
24>onnxruntime.lib(run_options.obj): Error LNK2038 : mismatch detected for '_ITERATOR_DEBUG_LEVEL': value '0' doesn't match value '2' in sherpa-onnx-offline-language-identification.obj
24>onnxruntime.lib(run_options.obj): Error LNK2038 : mismatch detected for 'RuntimeLibrary': value 'MT_StaticRelease' doesn't match value 'MTd_StaticDebug' in sherpa-onnx-offline-language-identification.obj
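As an aside, the LNK2038 messages above indicate a configuration mismatch rather than missing dependencies: the prebuilt onnxruntime.lib was compiled against the Release static CRT (MT_StaticRelease), while the sherpa-onnx objects were compiled as Debug (MTd_StaticDebug). Building sherpa-onnx in a Release configuration should make the runtime libraries match. A build-configuration sketch, assuming the usual CMake workflow (exact flags and generator are assumptions):

```
# Hypothetical Release configure/build; a Debug build mixes MTd objects
# with the Release-only onnxruntime.lib and triggers LNK2038.
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release
```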

The list goes on for a long time, but for simplicity I trimmed it here. I converted all the models with this simple script, but still could not make all of them work (some of them actually did, although when I changed the speed to anything other than 1 it generated extra-short audio).

import os

import onnx
from onnx import version_converter


def convert_model(model_path):
    print(model_path)
    old_model = onnx.load(model_path)
    upgraded_model = version_converter.convert_version(old_model, 21)
    onnx.save(upgraded_model, model_path)


def apply_function_to_onnx_files(directory, function):
    """
    Recursively search for all ONNX files in the given directory and apply a function to each file's full path.

    :param directory: The directory path to search in.
    :param function: The function to apply to each ONNX file path.
    """
    for root, _, files in os.walk(directory):
        for file in files:
            if file.endswith('.onnx'):
                print(f'converting: "{file}"')
                full_path = os.path.join(root, file)
                function(full_path)


apply_function_to_onnx_files('your root path', convert_model)

If you want, I can update the script to process the archives at https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models, so it would pull the onnx file out, convert it, and replace it within the archive.

My question to you is: since you already have this done, would you mind building the appropriate DLLs and publishing them here once you have time? My plan is to target linux 64-bit, osx, and win 64-bit platforms, and I have already spent way too much time on this issue myself and have to move ahead with development.

Thanks and looking forward to your response.

@janjanusek
Author

I just now managed to get the vctk model with 109 speakers working, and I can work with that. So if you just prepare an onnxruntime 1.18 build of sherpa-onnx, that will be good enough.

I can update the Python script so you can automatically convert all your models to the newest opset.

@janjanusek
Author

Some additional information:
the coqui models work on onnxruntime 1.18, as do models using only a lexicon instead of an ng-data dir, but those are really not robust if the user makes a typo. So in the meantime I'll use the coqui EN model with 109 speakers until you fully adapt to onnxruntime 1.18.

If you want that script, let me know; if not, you can close the ticket, and hopefully the new version will be supported within 1-2 months.

I would really like to use all the models with 6+ speakers, since the size-to-value ratio there is good.

@csukuangfj
Collaborator

The latest NuGet package supports onnxruntime 1.18.0. Please re-try.

@csukuangfj
Collaborator

so I'll use meanwhile coqiu model EN with 109

I suggest that you also try
vits-piper-en_US-libritts_r-medium.tar.bz2

It has more than 900 speakers!

@janjanusek
Author

janjanusek commented Jul 12, 2024

To make it absolutely transparent how to replicate the problem, please run the sherpa offline-tts example on 1.10.13.

I tried to run the model 'vits-piper-en_GB-vctk-medium' with and without conversion to opset 21; it changed nothing.

I got this error (opset 21):

Wrote to ./generated.wav succeeded!
2024-07-12 08:39:29.7787850 [E:onnxruntime:, sequential_executor.cc:516 onnxruntime::ExecuteKernel] Non-zero status code returned while running Expand node. Name:'/dp/flows.5/Expand_25' Status Message: invalid expand shape
Unhandled exception. System.Runtime.InteropServices.SEHException (0x80004005): External component has thrown an exception.
   at SherpaOnnx.OfflineTts.SherpaOnnxOfflineTtsGenerate(IntPtr handle, Byte[] utf8Text, Int32 sid, Single speed)
   at SherpaOnnx.OfflineTts.Generate(String text, Single speed, Int32 speakerId)
   at OfflineTtsDemo.Run(Options options) in C:\Users\codeNET\RiderProjects\sherpa-onnx\dotnet-examples\offline-tts\Program.cs:line 161
   at OfflineTtsDemo.Main(String[] args) in C:\Users\codeNET\RiderProjects\sherpa-onnx\dotnet-examples\offline-tts\Program.cs:line 82

I got this error (opset unchanged):

Wrote to ./generated.wav succeeded!
2024-07-12 08:42:15.4418995 [E:onnxruntime:, sequential_executor.cc:516 onnxruntime::ExecuteKernel] Non-zero status code returned while running Reshape node. Name:'/Reshape_1' Status Message: C:\a\_work\1\s\onnxruntime\core\providers\cpu\tensor\reshape_helper.h:30 onnxruntime::ReshapeHelper::ReshapeHelper i < input_shape.NumDimensions() was false. The dimension with value zero exceeds the dimension size of the input tensor.

Unhandled exception. System.Runtime.InteropServices.SEHException (0x80004005): External component has thrown an exception.
   at SherpaOnnx.OfflineTts.SherpaOnnxOfflineTtsGenerate(IntPtr handle, Byte[] utf8Text, Int32 sid, Single speed)
   at SherpaOnnx.OfflineTts.Generate(String text, Single speed, Int32 speakerId)
   at OfflineTtsDemo.Run(Options options) in C:\Users\codeNET\RiderProjects\sherpa-onnx\dotnet-examples\offline-tts\Program.cs:line 161
   at OfflineTtsDemo.Main(String[] args) in C:\Users\codeNET\RiderProjects\sherpa-onnx\dotnet-examples\offline-tts\Program.cs:line 82

To replicate: change the single Generate call to a loop like the following.

for (int i = 0; i < 10; i++)
{
    OfflineTtsGeneratedAudio audio = tts.Generate(options.Text, speed, sid);
    bool ok = audio.SaveToWaveFile(options.OutputFilename);
    if (ok)
    {
        Console.WriteLine($"Wrote to {options.OutputFilename} succeeded!");
    }
    else
    {
        Console.WriteLine($"Failed to write {options.OutputFilename}");
    }
}

The first iteration will pass and the others will fail; the same thing happens when you generate speaker 0 and then speaker 1 in sequence.

@csukuangfj
Collaborator

Thank you for reporting it. Will look into it during the weekend.

@janjanusek
Author

Alright, the tts demo works now. I appreciate the effort, but I was hoping you would figure out the problem, not roll back to the old onnxruntime version.

So how do we resolve this onnxruntime issue? Is it on the list any time soon?

By the way, I don't believe it was ever a problem with the opset, because I'm also running another model on opset 14 (due to a tf conversion bug) with onnxruntime 1.18.1.

I really do want to go with sherpa in a PROD environment. If you have any ideas I could try, I'm eager to hear them.

I also believe the problem can be simulated by installing Microsoft.ML.OnnxRuntimeGenAI 0.3.0 into the tts example project.

@janjanusek
Author

Don't forget, the first run always works. So it looks to me like some data from the first run remains and causes the issue on the reshape node, because creating a new instance makes it work again.

@csukuangfj
Collaborator

Don't forget, the first run always works. So it looks to me like some data from the first run remains and causes the issue on the reshape node, because creating a new instance makes it work again.

That is unexpected. I cannot understand it.

To me the model should be stateless.

@janjanusek
Author

I guess the model itself is, yes, but I would suggest checking the idempotency of the C pipeline around the model. I don't know why it's happening either, but all the traces suggest exactly that.
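A check of the kind suggested above could look like the following sketch, assuming a hypothetical `generate(text)` callable wrapping the TTS pipeline; the callable and its return type are illustrative, not the sherpa-onnx API.

```python
def check_idempotent(generate, text, runs=3):
    """Call `generate` repeatedly with the same input and report whether
    every run produced the same result; a stateless pipeline should."""
    results = [generate(text) for _ in range(runs)]
    return all(r == results[0] for r in results)
```

Comparing raw sample buffers run-to-run (or against a fresh instance per call) would localize whether state leaks between Generate calls on the same instance.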

@janjanusek
Author

Temporarily, I resolved the issue with a packaged console application that I execute in a separate process with the required dependencies. It works, but it also adds extra size to the app package, so I'm really eager to see support for onnxruntime 1.18.1+.

@csukuangfj
Collaborator

so I'm really eager to see support for onnxruntime 1.18.1+

Sorry, I have no idea how to fix it to support onnxruntime 1.18.1.

@janjanusek
Author

I know, no blame, dude. It's really difficult to tackle, as it makes no sense for something stateless to crash. We've already spent a long time on this topic. By the way, is it possible to create more models with many speakers in other languages? Could you give me a hint on where to start? I built a desktop app, but when I want to support many languages it can get really big with single-speaker models. 🤷🏼‍♂️

@csukuangfj
Collaborator

We support models from https://github.com/rhasspy/piper

Would you be able to take a look at the piper documentation?

Once you have a model from piper, it is straightforward to convert it to sherpa-onnx.
