Chatting with Alexa
In a previous post I mentioned my idea of using different triggers for Alexa. In particular, I didn’t want to set up a hardware “push-to-talk” button. Now that I had code to interact with Alexa Voice Service (AVS), I needed a way to generate the audio commands for the requests instead of recording my voice for each one.
So I ended up looking for a text-to-speech (TTS) tool that could run on Linux. After a bit of searching on the Internet, I found that Festival seemed to be the best option for TTS on Linux.
I am using Ubuntu for my development and testing, and Festival was very straightforward to install:
sudo apt-get install festival
As usual, the Arch Linux wiki also has a great entry for Festival that offers more details.
What I needed was the text2wave command that is installed with the festival package. It takes a text file as input and writes the synthesized speech to a WAV file.
text2wave -o <output_wav_file> <input_text_file>
I wrote a simple Python wrapper to execute the command:
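A wrapper along these lines can be sketched as follows. This is a minimal sketch, not the original code: it assumes text2wave is on the PATH, and the function names `build_text2wave_command` and `text_to_wav` are hypothetical names chosen for illustration.

```python
import os
import subprocess
import tempfile


def build_text2wave_command(text_path, wav_path):
    """Return the argument list for: text2wave -o <wav_path> <text_path>."""
    return ["text2wave", "-o", wav_path, text_path]


def text_to_wav(text, wav_path):
    """Render `text` to a WAV file at `wav_path` using Festival's text2wave.

    text2wave reads its input from a file, so the text is first written
    to a temporary file that is removed once the command has finished.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write(text)
        text_path = handle.name
    try:
        # check=True raises CalledProcessError if text2wave exits non-zero.
        subprocess.run(build_text2wave_command(text_path, wav_path), check=True)
    finally:
        os.remove(text_path)
    return wav_path
```

With Festival installed, calling `text_to_wav("What time is it?", "question.wav")` should leave a WAV file that can then be sent to AVS. Keeping the command construction in its own function makes it easy to check without running Festival.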
After some testing, I found that Alexa had some trouble understanding the default voice that is used by Festival. After some more Google-Fu, I was able to find some information about installing additional voice packs for Festival. I tested several of the voice packs and found that the CMU Arctic clb (US English female voice) gave the best results.
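One way to select a different voice is text2wave's `-eval` option, which evaluates a Scheme expression before synthesis. The sketch below is an assumption-laden illustration: it assumes the voice pack is already installed and that the clb voice is registered under the name `cmu_us_clb_arctic_clunits` (running `(voice.list)` inside festival shows the names actually available). The function names are hypothetical.

```python
import subprocess


def build_voice_command(text_path, wav_path, voice="cmu_us_clb_arctic_clunits"):
    """Return a text2wave argument list that selects `voice` before synthesis.

    The default voice name is an assumption; pass voice=None to use
    Festival's default voice instead.
    """
    command = ["text2wave", "-o", wav_path]
    if voice:
        # Festival switches voices by calling the Scheme function (voice_<name>).
        command += ["-eval", "(voice_{})".format(voice)]
    command.append(text_path)
    return command


def synthesize_with_voice(text_path, wav_path, voice="cmu_us_clb_arctic_clunits"):
    """Run text2wave with the chosen voice; raises if the command fails."""
    subprocess.run(build_voice_command(text_path, wav_path, voice), check=True)
    return wav_path
```

Passing the voice as a parameter makes it simple to render the same command with several voice packs and compare how well Alexa understands each one.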
Here’s the TTS code, and a simple example that shows how to tie it together with alexa-client.