Alexa Voice Service
TL;DR: Amazon’s Alexa is cool. I wrote some code to talk to it.
Updated: March 27, 2016
Amazon has released a very good demo and guide on Github for turning a Raspberry Pi into an Alexa Voice Service device. In particular, the README file has very detailed sections on how to set up your Raspberry Pi and how to get started with Alexa Voice Service, neither of which I cover in detail in the post below. I encourage everyone interested in AVS to read through their demo repository (I did!).
Preamble
So my last post was well over a month ago (oops). There were a lot of late nights and weekend hacking sessions (as my beautiful wife can attest to, sorry babe :D), some frustration, a lot of trial-and-error, and many iterations of code. Long story short… I can’t really remember everything that I did (oops again).
Fortunately, I kept a bunch of code repositories that’ll help me distill my experiences into what I hope will be some good posts. I’ll also be cleaning up those repos and making them available on my Github.
Alexa
When Amazon announced the Echo back in 2014, I remember dismissing it as another gimmicky attempt by Amazon to sell you more stuff.
At the time, it could only do a limited number of things:
- You can talk to it (a la Siri, Cortana, and Google Now) and ask for simple things like the weather or Wikipedia references
- You can tell it to play music or tell you a joke
- And of course, you can ask it to buy things from Amazon
But over the course of 2015, my opinion of it (Alexa, the service; not necessarily the Echo device itself) changed from dismissive to impressed.
More people got their hands on an Echo and the reviews were unanimously positive. Unlike some of the other voice assistants out there, Alexa seems to work really well. So well in fact, that most people refer to “her” as Alexa and the “Echo” name has pretty much been forgotten by all except the techies. I mean, how many people refer to their iPhones as Siri or their PC as Cortana?
Alexa’s potential also became more apparent as Amazon opened up the service to developers. In particular, it quickly earned its place as the center of the smart home as an increasing number of third party integrations became available.
As mentioned in my last post, all of this was floating in the back of my mind while I was trying to figure out what to build with my Raspberry Pi.
Getting Started with Alexa
So my first step was to sign up for the developer preview. It was fairly straightforward, especially since I already had an Amazon account. IIRC, I just needed to agree to the Terms of Service. Amazon’s Getting Started Guide is pretty good. In particular, the excerpt below was what I needed.
Register with the Alexa Voice Service
The following steps describe how to register your device or application with the Alexa Voice Service:
- Get a free Amazon developer account if you do not already have one.
- Sign into the Alexa developer portal.
- Select Get Started in the Alexa Voice Service button.
- In the Register a Product Type menu, select Device or Application.
- Enter the Device or Application Type Information.
- Select or create a Security Profile to allow Amazon to identify and authenticate your device or application.
- Enter Device or Application Details.
- To enable Amazon Music on your device, complete the questionnaire.
- Select Submit to complete the registration process.
The steps in bold are the ones needed to get the ID values needed for interacting with Alexa Voice Service.
Next, I started writing a standalone Python client to interact with Alexa Voice Service (AVS) based on the code from AlexaPi.
In particular, I extracted the code for interacting with AVS and abstracted it into a few methods:
get_token()
This is more-or-less unchanged from the gettoken() method in AlexaPi. It’s responsible for getting the client token needed to send requests to AVS.
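As a rough illustration (not the exact code from the repo), a token helper along these lines exchanges a long-lived refresh token for a short-lived access token. The sketch assumes the Login with Amazon token endpoint at https://api.amazon.com/auth/o2/token and uses the standard library's urllib instead of the requests package so it has no dependencies. The `post` parameter and the `_post_form` helper are hypothetical names added here so the network call can be swapped out.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the Login with Amazon OAuth2 token endpoint.
TOKEN_URL = "https://api.amazon.com/auth/o2/token"


def _post_form(url, fields):
    """POST form-encoded fields and return the response body as text."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    request = urllib.request.Request(url, data=body)  # data present => POST
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def get_token(client_id, client_secret, refresh_token, post=_post_form):
    """Exchange a refresh token for an access token.

    `post` is a hypothetical hook (url, fields) -> response text, so the
    HTTP call can be replaced in tests.
    """
    fields = {
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    }
    reply = json.loads(post(TOKEN_URL, fields))
    return reply["access_token"]
```

Access tokens expire, so a real client would typically cache the token and call this again once it goes stale.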
get_request_params()
This is a utility method that returns a tuple of parameters needed for every AVS request. In particular, it will set up the Authorization HTTP header and the request payload parameters that are always needed.
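To make that concrete, here is a minimal sketch of such a helper. The endpoint URL and the metadata fields (profile, locale, format) are assumptions recalled from the AlexaPi-era v1 API and may not match the current AVS API, so treat them as placeholders rather than a reference.

```python
import json

# Assumption: v1 speech recognizer endpoint as used by AlexaPi at the time.
AVS_RECOGNIZE_URL = (
    "https://access-alexa-na.amazon.com/v1/avs/speechrecognizer/recognize"
)
# Assumption: 16 kHz, 16-bit, mono PCM request audio.
AUDIO_FORMAT = "audio/L16; rate=16000; channels=1"


def get_request_params(token):
    """Return (url, headers, metadata_json) shared by every request."""
    headers = {"Authorization": "Bearer " + token}
    metadata = {
        "messageHeader": {},
        "messageBody": {
            "profile": "alexa-close-talk",  # assumed field values
            "locale": "en-us",
            "format": AUDIO_FORMAT,
        },
    }
    return AVS_RECOGNIZE_URL, headers, json.dumps(metadata)
```

Returning a plain tuple keeps the call site short: `url, headers, metadata = get_request_params(token)`.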
save_response_audio()
This is another utility method for extracting and saving the audio file returned in the AVS response.
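The sketch below shows one way to do that extraction, assuming the response is a multipart body in which one part carries a `Content-Type: audio/...` header. The function names mirror the method above, but the parsing approach is an illustration under that assumption, not necessarily what the repo does.

```python
from pathlib import Path


def extract_audio(content_type, body):
    """Return the bytes of the first audio part in a multipart body, or None."""
    boundary = None
    for field in content_type.split(";"):
        name, _, value = field.strip().partition("=")
        if name.lower() == "boundary":
            boundary = value.strip('"')
    if boundary is None:
        raise ValueError("no multipart boundary in Content-Type header")

    delimiter = b"--" + boundary.encode("ascii")
    for part in body.split(delimiter):
        # Each part is: CRLF, headers, blank line, payload, CRLF.
        headers, separator, payload = part.partition(b"\r\n\r\n")
        if separator and b"content-type: audio/" in headers.lower():
            if payload.endswith(b"\r\n"):
                payload = payload[:-2]  # drop the CRLF before the next delimiter
            return payload
    return None


def save_response_audio(content_type, body, path):
    """Write the audio part to `path`; return the path, or None if no audio."""
    audio = extract_audio(content_type, body)
    if audio is None:
        return None
    Path(path).write_bytes(audio)
    return path
```

Splitting on the raw delimiter is simple but assumes the audio bytes never happen to contain the boundary string, which is what the boundary is chosen to guarantee.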
ask()
The primary method used to interact with AVS. It takes an audio file as input, sends it to AVS and saves the response audio so it can be played back.
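The overall flow can be sketched as below. To keep the example runnable offline, the network call is passed in as `send`, a hypothetical callable that takes the request audio bytes and returns the reply audio bytes (or None); in a real client that is where the HTTP request to AVS would happen. The `demo` function and its fake transport are only there to show the round trip.

```python
import tempfile
from pathlib import Path


def ask(audio_path, response_path, send):
    """Send one recorded question and save the spoken reply.

    `send` is a hypothetical transport: bytes in, reply audio bytes (or None) out.
    Returns the path of the saved reply, or None if there was no audio.
    """
    request_audio = Path(audio_path).read_bytes()
    reply_audio = send(request_audio)
    if reply_audio is None:
        return None
    Path(response_path).write_bytes(reply_audio)
    return response_path


def demo():
    """Round trip with a fake transport; returns the bytes that were saved."""
    def fake_send(request_audio):
        return b"reply-to:" + request_audio

    with tempfile.TemporaryDirectory() as workdir:
        question = Path(workdir) / "question.wav"
        answer = Path(workdir) / "answer.mp3"
        question.write_bytes(b"hello")
        saved = ask(question, answer, fake_send)
        return Path(saved).read_bytes()
```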
ask_multiple()
In my testing, I found that the latency for each AVS request is around 3-5 seconds. This is OK if you’re interacting with Alexa like you would with an Amazon Echo (i.e. speaking each request one after the other). However, if you’re interacting programmatically (more on that in the next post), where you want to send multiple requests at a time, that latency quickly adds up.
So I added code using the requests_futures package that allows sending multiple requests concurrently. This way I get the responses for all the requests in roughly the time of a single request (about 3-5 seconds) instead of 3-5 seconds per request.
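The idea can be illustrated with the standard library alone. The sketch below uses `concurrent.futures.ThreadPoolExecutor` rather than requests_futures (which is itself built on a thread pool), and `fake_ask` is a stand-in that just sleeps to simulate request latency; both are assumptions for the sake of a runnable example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def ask_multiple(inputs, ask, max_workers=8):
    """Call ask(item) for every item concurrently.

    Results come back in the same order as `inputs`.
    """
    if not inputs:
        return []
    workers = min(max_workers, len(inputs))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(ask, inputs))


def fake_ask(question):
    """Stand-in for a real request: waits, then returns a canned reply."""
    time.sleep(0.2)  # simulated network latency
    return "reply to " + question
```

With four questions, running them one by one would take about 0.8 seconds here, while `ask_multiple(questions, fake_ask)` finishes in roughly the time of one call. The same reasoning applies to the 3-5 second latency of a real request.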
And there you have it, a simple Python client to interact with Alexa Voice Service. You can find all the code on my Github.
Thanks for reading!