Following the README produces only electrical buzzing noise #400

Open
BUG1989 opened this issue Jun 22, 2024 · 2 comments
Labels
bug, help wanted, question


BUG1989 commented Jun 22, 2024

commit id: e58fe48
Steps to reproduce:

git clone https://github.com/2noise/ChatTTS
cd ChatTTS
conda create -n chattts
conda activate chattts
pip install -r requirements.txt
python examples/cmd/run.py "chat T T S is a text to speech model designed for dialogue applications."

The generated output_audio_0.wav is attached:
output_audio_0.zip

BUG1989 (Author) commented Jun 22, 2024

However, the following code works fine:

import ChatTTS
from IPython.display import Audio
import torch
import torchaudio

from dotenv import load_dotenv
load_dotenv()

chat = ChatTTS.Chat()
chat.load_models(compile=False) # Set to True for better performance

###################################
# Sample a speaker from Gaussian.

rand_spk = chat.sample_random_speaker()

params_infer_code = {
  'spk_emb': rand_spk, # add sampled speaker 
  'temperature': .3, # using custom temperature
  'top_P': 0.7, # top P decode
  'top_K': 20, # top K decode
}

inputs_en = """
chat T T S is a text to speech model designed for dialogue applications. 
[uv_break]it supports mixed language input [uv_break]and offers multi speaker 
capabilities with precise control over prosodic elements [laugh]like like 
[uv_break]laughter[laugh], [uv_break]pauses, [uv_break]and intonation. 
[uv_break]it delivers natural and expressive speech,[uv_break]so please
[uv_break] use the project responsibly at your own risk.[uv_break]
""".replace('\n', '') # English is still experimental.

params_refine_text = {
  'prompt': '[oral_2][laugh_0][break_4]'
} 
# audio_array_cn = chat.infer(inputs_cn, params_refine_text=params_refine_text)
audio_array_en = chat.infer(inputs_en, params_refine_text=params_refine_text)
torchaudio.save("output3.wav", torch.from_numpy(audio_array_en[0]), 24000)

fumiama (Member) commented Jun 22, 2024

Cannot reproduce. Please provide more details, such as the OS version, Python version, torch version, GPU model, and CUDA version.
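
One possible way to gather the details requested above is a small script like the following. This is only a sketch: the helper name collect_env_info is made up for illustration and is not part of ChatTTS, and it assumes torch may or may not be importable in the active environment.

```python
import platform
import sys


def collect_env_info():
    """Gather OS, Python, torch, CUDA, and GPU details into a plain dict.

    Hypothetical helper for filling in a bug report; not part of ChatTTS.
    """
    info = {
        "os": platform.platform(),
        "python": sys.version.split()[0],
    }
    try:
        import torch
    except ImportError:
        # torch is missing from this environment; report that and stop.
        info["torch"] = "not installed"
        return info

    info["torch"] = torch.__version__
    info["cuda_available"] = torch.cuda.is_available()
    # torch.version.cuda is None on CPU-only builds of torch.
    info["cuda"] = torch.version.cuda
    if info["cuda_available"]:
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in collect_env_info().items():
        print(f"{key}: {value}")
```

Running it inside the same conda environment used for the reproduction steps and pasting the output into the issue should cover most of the requested fields. PyTorch also ships `python -m torch.utils.collect_env`, which prints a more complete report.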

fumiama added the bug, help wanted, and question labels Jun 22, 2024
fumiama added a commit that referenced this issue Jun 22, 2024
when use_decoder=False
introduced in #383
maybe related to #400