
[BUG🐛] Size mismatch in converted xttsv2 models #43

Open
scruffynerf opened this issue Dec 18, 2024 · 2 comments
Labels
bug Something isn't working

Comments


## Bug Description

```
[rank0]:   File "mypath/lib/python3.10/site-packages/auralis/core/tts.py", line 85, in _load_model
[rank0]:     return MODEL_REGISTRY[config['model_type']].from_pretrained(model_name_or_path, **kwargs)
[rank0]:   File "mypath/lib/python3.10/site-packages/auralis/models/xttsv2/XTTSv2.py", line 299, in from_pretrained
[rank0]:     model.load_state_dict(hifigan_state)
[rank0]:   File "mypath/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2584, in load_state_dict
[rank0]:     raise RuntimeError(
[rank0]: RuntimeError: Error(s) in loading state_dict for XTTSv2Engine:
[rank0]: 	size mismatch for text_embedding.weight: copying a param with shape torch.Size([6153, 1024]) from checkpoint, the shape in current model is torch.Size([6681, 1024]).
[rank0]: 	size mismatch for text_head.weight: copying a param with shape torch.Size([6153, 1024]) from checkpoint, the shape in current model is torch.Size([6681, 1024]).
[rank0]: 	size mismatch for text_head.bias: copying a param with shape torch.Size([6153]) from checkpoint, the shape in current model is torch.Size([6681]).
```

## Minimal Reproducible Example

Use the current converter script with either
HF's drewThomasson/Morgan_freeman_xtts_model
or
HF's scruffynerf/xtts-vincent

(both of these work, and were trained using https://github.com/daswer123/xtts-finetune-webui ),

then try to use/load the resulting converted files.

@scruffynerf scruffynerf added the bug Something isn't working label Dec 18, 2024

scruffynerf commented Dec 18, 2024

Ah ha, figured it out.

Coqui XTTS v2.0.2 differs from v2.0.3 in the number of tokens:

https://huggingface.co/coqui/XTTS-v2/commit/6b8036b35d787cf43d18d640587956b9db8fd1b8

The above models were trained on v2.0.2.

The converter script needs to be aware of this, since any vocab-size difference will make the converted model fail to load: the config no longer matches the actual trained GPT section of the model.

Correct me if I'm wrong, but basically this means either the GPT config must be adjusted in this case, since it no longer matches the stock config, OR the converter should just fail and complain that only v2.0.3 models can be converted.
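The check proposed above could be sketched roughly like this: read the vocab size off the checkpoint's `text_embedding.weight` shape before loading, then either refuse non-v2.0.3 checkpoints or report the detected size so the converter can patch the GPT config. This is a hypothetical illustration, not Auralis' actual API; the function name `check_vocab_size`, the constant `EXPECTED_VOCAB_V203`, and the plain dict of shape tuples standing in for a real state dict are all assumptions.

```python
# Hypothetical pre-load check for the converter. The 6681/6153 values come
# from the size-mismatch traceback above (v2.0.3 vs. the older vocab).
EXPECTED_VOCAB_V203 = 6681  # text-token count in Coqui XTTS v2.0.3

def check_vocab_size(state_shapes: dict, strict: bool = True) -> int:
    """Return the checkpoint's vocab size.

    state_shapes maps parameter names to shape tuples (stand-in for a
    torch state_dict). With strict=True, raise if the checkpoint does
    not use the v2.0.3 vocabulary, mirroring the "fail and complain"
    option; with strict=False, just report the size so the caller can
    adjust the GPT config instead.
    """
    vocab = state_shapes["text_embedding.weight"][0]
    if strict and vocab != EXPECTED_VOCAB_V203:
        raise ValueError(
            f"checkpoint vocab size {vocab} != {EXPECTED_VOCAB_V203}; "
            "only XTTS v2.0.3 checkpoints can be converted as-is"
        )
    return vocab
```

In non-strict mode the returned size could be written into the converted model's GPT config (e.g. its text-token count field) so v2.0.2-trained checkpoints load with matching embedding shapes.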


C00reNUT commented Dec 19, 2024

Same issue here with a model trained on version 2.0.0. This might also explain the difference in quality/output in #27 when I convert a Coqui 2.0.0 model using the provided script.
