Chatting with Alexa
In a previous post I mentioned my idea of using different triggers for Alexa. In particular, I didn’t want to set up a hardware “push-to-talk” button. Now that I had code to interact with Alexa Voice Service (AVS), I needed a way to generate the audio commands for the requests (instead of recording my voice for each command).
So I ended up looking for a text-to-speech (TTS) tool that could run on Linux. After a bit of searching on the Internet, I found that Festival seemed to be the best option for TTS on Linux.
I am using Ubuntu for my development and testing, and Festival was very straightforward to install:
sudo apt-get install festival
As usual, the Arch Linux wiki also has a great entry for Festival that offers more details.
What I needed was the text2wave command that comes with the festival package. Given a text file as input, it writes the synthesized speech to a WAV file:
text2wave -o <output_wav_file> <input_text_file>
I wrote a simple Python wrapper to execute the command:
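import os
import subprocess
import uuid

# Default output directory, as described in the docstring below.
TEMP_DIR = '/tmp/simple-tts'


def tts(text, save_to=None):
    """Converts text to speech (WAV) file.

    Args:
        text (str): Text to convert
        save_to (str): File path for saving the WAV file. If not
            provided, saves to a file under `/tmp/simple-tts/`

    Returns:
        Path (str) where the WAV file is saved.
    """
    os.makedirs(TEMP_DIR, exist_ok=True)
    temp_file = os.path.join(TEMP_DIR, '{}.txt'.format(uuid.uuid4()))
    if not save_to:
        save_to = os.path.join(TEMP_DIR, '{}.wav'.format(uuid.uuid4()))
    # text2wave reads its input from a file, so write the text out first.
    with open(temp_file, 'w') as f:
        f.write(text)
    # Pass arguments as a list so paths with spaces are handled safely;
    # check_call raises if text2wave exits with an error.
    subprocess.check_call(['text2wave', '-o', save_to, temp_file])
    return save_to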
After some testing, I found that Alexa had some trouble understanding the default voice that is used by Festival. After some more Google-Fu, I was able to find some information about installing additional voice packs for Festival. I tested several of the voice packs and found that the CMU Arctic clb (US English female voice) gave the best results.
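Once a voice pack is installed, one way to select it from the command line is the -eval option of text2wave, which evaluates a Scheme expression before synthesis. The sketch below is only an illustration, not the code from this post: the helper names are made up for the example, and the voice name `voice_cmu_us_clb_arctic_clunits` is an assumption about how the CMU Arctic clb pack registers itself, so check the names that `(voice.list)` reports at the festival prompt on your system. It also feeds the text through stdin rather than a temporary file.

```python
import subprocess

# Assumed voice name for the CMU Arctic clb pack; verify it with
# `(voice.list)` inside the festival prompt before relying on it.
CLB_VOICE = 'voice_cmu_us_clb_arctic_clunits'


def build_text2wave_cmd(out_path, voice=None):
    """Builds the text2wave argument list, optionally selecting a voice."""
    cmd = ['text2wave', '-o', out_path]
    if voice:
        # -eval runs a Scheme expression before synthesis; calling the
        # voice function switches Festival to that voice.
        cmd += ['-eval', '({})'.format(voice)]
    return cmd


def tts_with_voice(text, out_path, voice=CLB_VOICE):
    """Synthesizes text to out_path; text2wave reads the text from stdin."""
    subprocess.run(build_text2wave_cmd(out_path, voice),
                   input=text.encode('utf-8'), check=True)
    return out_path
```

Calling `tts_with_voice('What time is it?', '/tmp/ask.wav')` would then produce a WAV file spoken with the selected voice, provided Festival and the voice pack are installed.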
Here’s the TTS code, and a simple example that shows how to tie it together with alexa-client.