Fundamentals of programming for Amazon Echo Show
As a lead agency for Connected Commerce, we have been heavily involved in voice interfaces, whether Google Home, Amazon Echo or one of the many other voice products available, since December of last year. The new Echo Show is the result of what Amazon has learned so far, and it is designed to overcome the obstacles that all voice interfaces face when communicating information: it compensates for the limitations of a voice interface with a classic display. A no-interface device is thus transformed into a full-service touchpoint in the digital ecosystem, opening up entirely new possibilities. In this article we examine how to approach this new component, the screen, which is addressed by delegation via the Amazon endpoint. Reverse engineering is the key.
Echo Show does not look particularly exciting and is reminiscent of something from a 1970s sci-fi movie. The front surface is sloped, with the display at the center, a camera at the top and the speaker grille at the bottom. The device won't look out of place in your living room. We had to use an address in the United States to buy our Echo Show, as it is not yet available in Germany. Amazon has yet to announce a release date for the German market.
Currently no developer guidelines
Developer documentation for building skills for Echo Show is also lacking. The Amazon Developer Portal does describe how the JSON communication between the skill and the endpoint must look for the new functionality: parameters are described, templates are shown and callbacks are explained. However, all of this information exists solely in the form of communication protocols and not as guidelines. As traditional users of the Alexa Skills Kit framework for Java, we feel left out in the cold. A note in the latest framework version on GitHub tells us that version 1.4 was prepared for Echo Show, but neither documentation nor code examples are available.
What display options does Echo Show offer?
We have an Echo Show, and we want to develop a skill for it. So let's take a deep breath and dive into the framework code that was committed in the latest release. We must begin by asking ourselves what we actually expect to find. Echo Show can present information in a variety of ways. It must therefore be possible to describe a layout and send it as part of the response to the Alexa endpoint. If we take a look at the response object, we see that virtually nothing has changed since the previous release. The only place where we could transmit dynamic data is the so-called directives. If we search the framework a little, we find the RenderTemplateDirective class, and it is here that we can pass in a template.
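To make the role of directives more concrete, here is a minimal Java sketch of what such a response could look like on the wire. It deliberately avoids the Alexa Skills Kit classes, since their Echo Show API is undocumented, and builds the JSON with the standard library only. The field names (Display.RenderTemplate, BodyTemplate1, textContent, primaryText) reflect our reading of the protocol description on the Amazon Developer Portal and should be treated as assumptions. The class name RenderTemplateSketch and its helper methods are purely illustrative and are not part of the framework.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch only: not part of the Alexa Skills Kit for Java.
final class RenderTemplateSketch {

    // Builds a directive carrying a single-content template.
    // Field names are assumptions based on the published JSON protocol.
    static Map<String, Object> bodyTemplateDirective(String title, String text) {
        Map<String, Object> primaryText = new LinkedHashMap<>();
        primaryText.put("type", "PlainText");
        primaryText.put("text", text);

        Map<String, Object> textContent = new LinkedHashMap<>();
        textContent.put("primaryText", primaryText);

        Map<String, Object> template = new LinkedHashMap<>();
        template.put("type", "BodyTemplate1");
        template.put("title", title);
        template.put("textContent", textContent);

        Map<String, Object> directive = new LinkedHashMap<>();
        directive.put("type", "Display.RenderTemplate");
        directive.put("template", template);
        return directive;
    }

    // Wraps speech output and the directive in a response envelope.
    static Map<String, Object> response(String speech, Map<String, Object> directive) {
        Map<String, Object> outputSpeech = new LinkedHashMap<>();
        outputSpeech.put("type", "PlainText");
        outputSpeech.put("text", speech);

        Map<String, Object> response = new LinkedHashMap<>();
        response.put("outputSpeech", outputSpeech);
        response.put("directives", Collections.singletonList(directive));
        response.put("shouldEndSession", true);

        Map<String, Object> envelope = new LinkedHashMap<>();
        envelope.put("version", "1.0");
        envelope.put("response", response);
        return envelope;
    }

    // Very small JSON writer for maps, lists, strings, booleans and numbers.
    // It escapes only backslashes and double quotes, which is enough here.
    static String toJson(Object value) {
        if (value instanceof Map) {
            return ((Map<?, ?>) value).entrySet().stream()
                    .map(e -> toJson(String.valueOf(e.getKey())) + ":" + toJson(e.getValue()))
                    .collect(Collectors.joining(",", "{", "}"));
        }
        if (value instanceof List) {
            return ((List<?>) value).stream()
                    .map(RenderTemplateSketch::toJson)
                    .collect(Collectors.joining(",", "[", "]"));
        }
        if (value instanceof String) {
            String s = (String) value;
            return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
        }
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        Map<String, Object> directive =
                bodyTemplateDirective("Greeting", "Hello from Echo Show");
        System.out.println(toJson(response("Hello", directive)));
    }
}
```

Running main prints a single-line JSON document in which the template travels inside the directives array, next to the usual speech output. In the framework itself, the RenderTemplateDirective class would presumably take the place of the hand-built map.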
We already have access to the templates on the Amazon Developer pages. At present there are six fixed templates: two for displaying content lists and four for displaying a single content item. The difference between the two content list templates is that one is intended for horizontal lists, while the other is for vertical lists. The four templates for single content items differ in their display options as follows (see Fig. 1):