Cleanup client.speaker and add additional tts speakers #155

Merged: 32 commits merged into jasperproject:master from the new-speakers branch on Sep 27, 2014

Conversation

Holzhaus
Member

@Holzhaus Holzhaus commented Sep 6, 2014

This is the second chunk of split up pull request #124.

I've added svox-pico-tts and google-tts. Pico TTS is Google's old open-source TTS engine. Try it, it's much better than espeak!

This also gets rid of aplay and uses pyaudio instead (pyaudio is a dependency anyway, so why rely on an external tool that might not be installed?).

@@ -56,7 +195,7 @@ def newSpeaker():
ValueError if no speaker implementation is supported on this platform
"""

-    for cls in [eSpeakSpeaker, saySpeaker]:
+    for cls in [googleSpeaker, picoSpeaker, eSpeakSpeaker, saySpeaker]:

Do we want to add a profile option to configure the speaker?

@Holzhaus
Member Author

Holzhaus commented Sep 9, 2014

OK. I'll do that tomorrow.
What if the selected speaker is not available? Fall back to another one, or just sys.exit(1) and display an error message?

@charliermarsh

Error message, IMO. I don't think that will happen very often and, when it does, you probably want to know that the speaker failed.

@Holzhaus
Member Author

Parsing profile.yml and getting the config value for tts_engine will be added tomorrow.
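
For illustration, the behaviour agreed on above (use the configured tts_engine, and report an error instead of silently falling back) could look roughly like the sketch below. This is not the code from this pull request: the class names, the SLUG values and the new_speaker signature are hypothetical, and the profile is assumed to be an already-parsed dict.

```python
class Speaker:
    """Hypothetical base class; SLUG is the name used for tts_engine."""
    SLUG = None

    @classmethod
    def is_available(cls):
        return False


class AlwaysSpeaker(Speaker):
    """Stand-in for an engine whose dependencies are installed."""
    SLUG = "always"

    @classmethod
    def is_available(cls):
        return True


class NeverSpeaker(Speaker):
    """Stand-in for an engine whose dependencies are missing."""
    SLUG = "never"


def new_speaker(profile, candidates):
    """Return a speaker instance for the given (already parsed) profile.

    If the profile names a tts_engine, only that engine is considered and a
    ValueError is raised when it is unknown or unavailable. Without a
    tts_engine entry, the first available candidate is used.
    """
    slug = profile.get("tts_engine")
    if slug is None:
        for cls in candidates:
            if cls.is_available():
                return cls()
        raise ValueError("No speaker implementation is supported "
                         "on this platform")
    for cls in candidates:
        if cls.SLUG == slug:
            if not cls.is_available():
                raise ValueError("Speaker '%s' is configured but not "
                                 "available" % slug)
            return cls()
    raise ValueError("Unknown tts_engine '%s'" % slug)
```

A caller could catch the ValueError at startup, print the message and exit with status 1, which matches the "error message" option discussed above.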

@Holzhaus
Member Author

Done.

@Holzhaus Holzhaus self-assigned this Sep 11, 2014
@Holzhaus
Member Author

Can I merge this in?

@charliermarsh

I will test this shortly. Hopefully we can merge it in today.

@charliermarsh

On the Pi, for whatever reason, the output is incredibly choppy. It must have something to do with using pyaudio vs. aplay. Will dig a little deeper. (Interestingly, I see this reported elsewhere around the web, e.g., http://stackoverflow.com/questions/21903597/pyaudio-sound-quality-when-playing-a-file.)

"""
import os
import platform
import re
import sys
import json

Unused import.

@charliermarsh

I've tried a bunch of different chunk sizes (12000, 44100) to no avail.

@Holzhaus
Member Author

Have you tried the example from the documentation?

@Holzhaus
Copy link
Member Author

The PyAudio example code runs without problems. I've added testing code to speaker.py. Just run speaker.py directly and it tries to say "This is a test" with all available speakers.

This also works fine on my Raspberry Pi (running ArchLinuxARM). Can you confirm that?

@Holzhaus
Member Author

If that works for you, too, the problem could be caused by:

  • Slow SD card (simultaneous reading of wave file and writing to stream)
  • High CPU load
  • Misconfigured Sound System (likely, because you needed to specify the sound device explicitly when using aplay)
  • We're initializing PyAudio/PortAudio multiple times (in mic.py and in speaker.py) simultaneously. Maybe we're allowed to do this only once?

If the latter is true, we should create a dedicated audio class as a wrapper for pyaudio (which is probably a good idea anyway, so that we can easily switch from pyaudio to something else, if we need to.)
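
A minimal sketch of what such a wrapper might look like, assuming the goal is simply to create one shared PyAudio instance and hand it to both the microphone and the speaker code. The AudioEngine name and its methods are hypothetical and not part of this pull request; the only PyAudio calls used are pyaudio.PyAudio() and its terminate() method.

```python
import threading


def _default_backend():
    # Imported lazily so this module can be loaded without PyAudio installed.
    import pyaudio
    return pyaudio.PyAudio()


class AudioEngine:
    """Hypothetical wrapper that holds exactly one audio backend instance."""

    _instance = None
    _lock = threading.Lock()

    def __init__(self, backend):
        self.backend = backend

    @classmethod
    def get(cls, backend_factory=_default_backend):
        """Return the shared engine, creating the backend on first use."""
        with cls._lock:
            if cls._instance is None:
                cls._instance = cls(backend_factory())
            return cls._instance

    @classmethod
    def shutdown(cls):
        """Terminate the backend and forget the shared instance."""
        with cls._lock:
            if cls._instance is not None:
                cls._instance.backend.terminate()
                cls._instance = None
```

With something like this, mic.py and speaker.py would both call AudioEngine.get() instead of constructing pyaudio.PyAudio() themselves, and swapping PyAudio for another library would only mean passing a different backend factory.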

@charliermarsh

Thanks @Holzhaus. Will test again tonight and look into a few of these suggestions.

@charliermarsh

I still get the same poor audio quality. I tried flushing the file as well, but to no avail. Not nearly familiar enough with the Pi audio settings to figure out what's wrong with my config--I'm using the primary disk image that we ship. I suspect that it's a buffering problem, perhaps related to the SD card.

Is there a reason why we need to do this play indirection for the eSpeakSpeaker? Why not just call eSpeak directly, like we did in the past?

@Holzhaus
Member Author

@crm416 What tests did you run exactly? Please run these tests in this order to isolate the problem.

1. ALSA config

Does playing a wave file with aplay work without any other arguments (i.e. aplay /path/to/file.wav)?
If that works fine, continue with 2. If not, your alsa config needs to be fixed. Try putting something like this in your ~/.asoundrc (or /etc/asound.conf):

pcm.rpiaudio
{
    type hw
    card 0
}
pcm.usbmic
{
    type hw
    card 1
}

pcm.!default
{
    type asym
    playback.pcm
    {
        type plug
        slave.pcm "rpiaudio"
    }
    capture.pcm 
    {
        type plug
        slave.pcm "usbmic"
    }
}

You may need to edit the card numbers. You can look them up by using cat /proc/asound/cards.
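
If you would rather look the card numbers up from Python, a small parser along these lines should do. It is only a sketch: it assumes the usual /proc/asound/cards layout, where each card starts with a line like " 0 [ALSA           ]: bcm2835 - bcm2835 ALSA" followed by an indented description line, and the function names are made up for this example.

```python
import re

# Matches the first line of each card entry: index, [id], then "driver - name".
CARD_LINE = re.compile(r"^\s*(\d+)\s+\[([^\]]*)\]:\s*(.*)$")


def parse_cards(text):
    """Map card index -> (card id, description) from /proc/asound/cards text."""
    cards = {}
    for line in text.splitlines():
        match = CARD_LINE.match(line)
        if match:
            index = int(match.group(1))
            cards[index] = (match.group(2).strip(), match.group(3).strip())
    return cards


def read_cards(path="/proc/asound/cards"):
    """Read and parse the card list of the running system (Linux only)."""
    with open(path) as f:
        return parse_cards(f.read())
```

The indices returned by read_cards() are the values to use for "card" in the pcm.rpiaudio and pcm.usbmic sections above.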

2. PyAudio Init

Did you execute the speaker.py module directly (without running jasper)? If that works fine, the problem is probably with PyAudio being initialized multiple times. If not, continue with 3.

3. Slow SD card

Replace lines 52-55 with this:

frame_num = f.getnframes()
data = f.readframes(frame_num)
# The whole file has now been read into memory
stream.write(data)

If that works fine, the problem is probably a slow SD card.
If not, it might be some other strange issue. Maybe a bug in PyAudio/Portaudio (which version are you using anyway?)
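
To make the difference between the two variants in test 3 explicit, here is a self-contained sketch. It is not the code from speaker.py: the function names and the chunk size are assumptions, and the stream argument is any object with a write() method (for example a stream returned by PyAudio's open()).

```python
import wave

CHUNK = 1024  # frames per write; an arbitrary value chosen for illustration


def play_chunked(wav_path, stream, chunk=CHUNK):
    """Read and write chunk by chunk, so disk reads interleave with playback."""
    with wave.open(wav_path, "rb") as f:
        data = f.readframes(chunk)
        while data:
            stream.write(data)
            data = f.readframes(chunk)


def play_preloaded(wav_path, stream):
    """Read the whole file into memory first, then write it in one call."""
    with wave.open(wav_path, "rb") as f:
        data = f.readframes(f.getnframes())
    stream.write(data)


def open_output_stream(wav_path):
    """Open a PyAudio output stream matching the wave file's format."""
    import pyaudio  # imported lazily; only needed for real playback
    with wave.open(wav_path, "rb") as f:
        p = pyaudio.PyAudio()
        stream = p.open(format=p.get_format_from_width(f.getsampwidth()),
                        channels=f.getnchannels(),
                        rate=f.getframerate(),
                        output=True)
    return p, stream
```

If play_preloaded sounds clean while play_chunked is choppy, reading from the SD card during playback is the likely culprit. If both are choppy, the cause is somewhere else.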

@Holzhaus
Member Author

@crm416

Is there a reason why we need to do this play indirection for the eSpeakSpeaker? Why not just call eSpeak directly, like we did in the past?

The original code was also using the play() method:

def say(self, phrase, OPTIONS=" -vdefault+m3 -p 40 -s 160 --stdout > say.wav"):
    os.system("espeak " + json.dumps(phrase) + OPTIONS)
    self.play("say.wav")

def play(self, filename):
    os.system("aplay -D hw:1,0 " + filename)

Anyway, we can't rely on espeak detecting the correct audio setup. Furthermore, we might want to switch to the python-espeak module in the future.

Also, we want to use platform-independent output, so using plain aplay is not an option, either.

And last but not least: We need code to play wave files anyway, e.g. the beeps in Mic.activeListen() and the output of pico2wave (if the user does not like the sound of espeak).

@Holzhaus
Member Author

If we fail to fix the problem, I can live with changing the play() method to use aplay for the moment, but this is something we should definitely get rid of.

@charliermarsh

Okay. aplay /path/to/file.wav sounds great. espeak "some text" sounds okay too. Solutions 2 and 3 didn't seem to have any effect. I'm on PyAudio version 0.2.8.

When I do run python speaker.py, I get:

ALSA lib pcm_dmix.c:1018:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm_dmix.c:957:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:1018:(snd_pcm_dmix_open) unable to open slave

I do not get these errors when I run aplay /path/to/file.

@Holzhaus
Member Author

This is kind of normal, as PyAudio likes to spam stderr.

I'll revert the play method to aplay for now.
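
As an aside on the stderr noise: the ALSA messages are written by the C library directly to file descriptor 2, so replacing sys.stderr in Python does not hide them. A common workaround is to point the descriptor at the null device for a short time. The helper below is a sketch with a made-up name, not something this pull request adds.

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def suppress_stderr():
    """Temporarily send everything written to fd 2 to the null device."""
    sys.stderr.flush()
    devnull = os.open(os.devnull, os.O_WRONLY)
    saved = os.dup(2)  # keep a copy of the real stderr descriptor
    try:
        os.dup2(devnull, 2)
        yield
    finally:
        os.dup2(saved, 2)  # restore the real stderr
        os.close(saved)
        os.close(devnull)
```

It would typically wrap only the noisy call, for example "with suppress_stderr(): p = pyaudio.PyAudio()", so that genuine errors elsewhere stay visible.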

@Holzhaus
Member Author

I rebased onto the latest upstream version to make this mergeable again.

@Holzhaus
Member Author

If nobody has comments or finds a problem, I'll merge this tomorrow.

Holzhaus added a commit that referenced this pull request Sep 27, 2014
Cleanup client.speaker and add additional tts speakers
@Holzhaus Holzhaus merged commit 5167f9f into jasperproject:master Sep 27, 2014
@Holzhaus Holzhaus deleted the new-speakers branch October 1, 2014 17:41