Human Interface Design

Human interface design considerations contribute significantly to a customer’s sense of privacy and understanding. These considerations involve the customer interacting with or receiving information from the device itself, not from a voice agent.

Products that support more than one simultaneous voice agent have unique challenges both in designing usable controls, such as buttons, and in representing their current state with attention system displays and audio cues.

Physical User Interface

The following best practices are designed to ensure that your customer always knows when a device is active and detecting wake words. These recommendations are vitally important to maintaining customer trust.

  • Microphone: All products that allow hands-free voice activation (not push-to-talk only) must have a microphone on/off control. Please refer to AVS Security Requirements for more information.
  • Action: Where appropriate, products should include an “Action” button that functions for each active agent. The Action button should afford the following functions:
    • Initiate a new voice interaction
    • Interrupt responses and media output from any and all agents
    • Stop a sounding Alert
  • Volume Adjust: All devices that output sound should include a physical control to adjust the universal volume, affecting all agents.

[Figure: Physical User Interface]

Button Interactions

Buttons or other controls that interact with agents, such as Play and Pause buttons, should behave consistently across agents. Devices should not implement separate sets of similar controls for different agents. Note that this requires the device to direct each button press command to the proper agent.

These best practices apply whether the buttons are physical or virtual, and they also inform the decisions about which commands map to Universal Device Commands.
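
As an illustration of directing a shared button press to the proper agent, here is a minimal sketch. It assumes the device tracks a single current activity and the agent that owns it; `Activity`, `ButtonRouter`, and the agent identifiers are hypothetical names, not part of any AVS API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Activity:
    agent: str      # hypothetical agent identifier, e.g. "agent_a"
    kind: str       # e.g. "media"
    playing: bool   # True while the activity is producing output


class ButtonRouter:
    """Routes one shared Play/Pause button to whichever agent owns the activity."""

    def __init__(self) -> None:
        self._current: Optional[Activity] = None

    def set_current(self, activity: Optional[Activity]) -> None:
        # Called by the device whenever the current activity changes.
        self._current = activity

    def press_play_pause(self) -> Optional[Tuple[str, str]]:
        """Return (agent, command) for this press, or None if nothing is active."""
        if self._current is None:
            return None
        command = "pause" if self._current.playing else "play"
        self._current.playing = not self._current.playing
        return (self._current.agent, command)
```

With this shape, the same physical or virtual button behaves identically regardless of which agent started playback; only the routing target changes.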

Overloading

Overloaded buttons, or other controls, may have more than one function or be able to invoke more than one agent. They may behave differently depending on the mode the device is in, or on the length or pattern of the button press. Overloaded buttons are intrinsically difficult for customers to use, because customers must remember both the extra functions of the control and more than one method of interaction.

With multiple agents on a device, there is a risk of even more complicated interactions. Overloaded buttons should be avoided when possible. If you must overload a button, you should:

  • Provide clear and repeatable instructions about the function and use of the button
  • Group similar functions to a single control
  • Use a label, icon, or some other indication that the button has multiple functions
  • Keep the interaction patterns simple and easy to remember, such as a long or short button press
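
If a button must be overloaded, a simple short/long press pattern could be sketched as follows. The one-second threshold and the mapping of presses to functions are assumptions chosen for illustration; both functions belong to the same media group, in keeping with the advice to group similar functions on a single control.

```python
LONG_PRESS_SECONDS = 1.0  # assumed threshold; tune for your hardware

# Hypothetical mapping: both functions are media controls.
PRESS_FUNCTIONS = {"short": "play_pause", "long": "next_track"}


def classify_press(pressed_at: float, released_at: float) -> str:
    """Classify a press as "short" or "long" from its timestamps in seconds."""
    if released_at < pressed_at:
        raise ValueError("release time precedes press time")
    duration = released_at - pressed_at
    return "long" if duration >= LONG_PRESS_SECONDS else "short"


def handle_press(pressed_at: float, released_at: float) -> str:
    """Return the function name assigned to this press pattern."""
    return PRESS_FUNCTIONS[classify_press(pressed_at, released_at)]
```

Limiting the control to two patterns keeps the interaction easy to remember; adding double or triple presses would increase what the customer has to learn.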

[Figure: Overloading]
Important: The microphone on/off button must not be overloaded. Amazon recommends that the Action button not be overloaded.

Universal Device Commands

Universal Device Commands (UDCs) are those commands and controls that a customer may use with any compatible agent to control certain device functions, even if that agent was not used to initiate the experience. UDCs can broadly be classified into two categories:

  • Device global commands (e.g. changing the device’s volume) that are implemented by each agent separately
  • Cross-agent commands (e.g. stop a sounding timer that was started by another agent) that may require state information to be shared from the device to enable agents to properly interpret the request.

UDCs are a necessary feature for devices with multiple active agents. They aim to satisfy customer expectations and resolve the most common frustration points.

For example: Imagine one person sets an alarm and then leaves the room. Then another person enters the room, hears the alarm sound, and wants to turn it off. The interaction should be possible using any compatible agent, and should not result in an agent telling the person that there are no alarms set. Similarly, customers should be able to use any active agent to control the device’s volume, much like using the volume control buttons on the device.

Baseline Guidance
  • Devices with multiple simultaneous agents should give an agent, when the customer invokes it, access to the device state information it needs to implement Universal Device Commands.
  • The data sent by the device to an invoked agent about ongoing activity states on the device should be minimal and specific to actions that UDCs allow the agent to take. For example, if a customer uses one agent to begin a timer but then invokes a second agent to stop that timer when it rings, the only information the second agent should receive about the timer is that there is a stoppable sounding timer on the device (and not, for example, details about the duration of the timer or which agent originally set it).
  • Agents invoked to take action on an ongoing activity should not use the device state information provided for any other purpose than to fulfill the UDC request.
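
The data-minimization guidance above can be sketched as a filter on the device side. This is only an illustration under assumed names: `DeviceActivity`, its fields, and the shape of the shared dictionaries are hypothetical, not a defined AVS schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DeviceActivity:
    kind: str             # e.g. "timer", "alarm", "media"
    owner_agent: str      # agent that started it; never shared
    stoppable_now: bool   # True while sounding or playing
    details: Dict[str, str] = field(default_factory=dict)  # never shared


def udc_state_for_invoked_agent(
    activities: List[DeviceActivity],
) -> List[Dict[str, object]]:
    """Build the minimal state an invoked agent needs to fulfill a UDC.

    Only the activity kind and the fact that it can be stopped are shared.
    The owning agent and details such as timer duration are omitted.
    """
    return [
        {"kind": activity.kind, "stoppable": True}
        for activity in activities
        if activity.stoppable_now
    ]
```

For a sounding timer set through one agent, a second invoked agent would receive only `{"kind": "timer", "stoppable": True}`, which is enough to interpret "Stop" without learning who set the timer or for how long.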

Recommended Universal Device Commands

A recommended set of UDCs is listed below. Your product may include other UDCs depending upon the experience and agent capabilities. When considering implementing additional commands, keep in mind:

  • Customers may initiate long-running activities that will be stopped later, and the customer may not remember which agent to use (e.g. media playback).
  • Unprompted activities that require customer interaction may begin, and the customer may not know which agent to use in order to respond (e.g. sounding timer).
  • Customers may want any agent to be able to control global device settings (e.g. volume).
  • A command to one agent should not bypass authentication or other security requirements for any other agent.

In this version of the Design Guide, we include the following categories of Universal Device Commands that multi-agent devices should consistently support:

Stop Timer

As a user, I want to invoke any compatible agent on device and say ‘Stop’ (and variants like ‘Cancel’), or press the Action button, to stop a sounding (i.e. completed) timer, regardless of which agent created the timer.

Dismiss Alarm

As a user, I want to invoke any compatible agent on device and say ‘Stop’ (and variants like ‘Cancel’, ‘Quiet’), or press the Action button, to dismiss a sounding alarm, regardless of which agent set the alarm.

Dismiss Reminder

As a user, I want to invoke any compatible agent on device and say ‘Stop’ (and variants like ‘Cancel’, ‘Dismiss’) to dismiss a sounding reminder, regardless of which agent set the reminder.

Stop Media

As a user, I want to invoke any compatible agent on device and say ‘Stop’ to stop any active media playback including music, radio, long-form audio (ebooks, podcasts, news, etc.) and videos.

Stop Camera Feed

As a user, I want to invoke any compatible agent on device and say ‘Stop’ (and variants like ‘End’) to stop any active streaming smart home camera feeds.

Reject Calls

As a user, I want to invoke any compatible agent on device and say ‘Reject’ (and variants like ‘Stop’, ‘Cancel’) to reject an incoming phone, audio, or video call, regardless of which agent provides the service.

Stop Speech

As a user, I want to invoke any compatible agent on device and say ‘Stop’ (and supported variants like ‘End’, ‘Cancel’, etc.) to stop any ongoing agent speech activity, regardless of which agent is speaking.

Global Foreground Stop

As a user, I want to invoke any compatible agent on device and say ‘Stop’ (and supported variants like ‘End’, ‘Shut up’, etc.) to stop the intended foreground activity when there is more than one active session (e.g. Timer over Music, Weather TTS over Music), regardless of which agent initiated the activities.
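
One way a device might decide which activity is in the foreground is a fixed priority order. The sketch below assumes such an order (alerts, then speech, then media), which is consistent with the "Timer over Music" and "Weather TTS over Music" examples but is otherwise an assumption; the function names and the `(kind, agent)` tuples are hypothetical.

```python
from typing import List, Optional, Tuple

Activity = Tuple[str, str]  # (kind, owning agent)

# Assumed foreground order, highest priority first.
FOREGROUND_ORDER = ["alert", "speech", "media"]


def pick_foreground(active: List[Activity]) -> Optional[Activity]:
    """Return the highest-priority active activity, or None if none match."""
    for kind in FOREGROUND_ORDER:
        for activity in active:
            if activity[0] == kind:
                return activity
    return None


def global_foreground_stop(active: List[Activity]) -> List[Activity]:
    """Stop only the foreground activity and return what keeps running."""
    remaining = list(active)
    target = pick_foreground(active)
    if target is not None:
        remaining.remove(target)  # removes the first matching entry only
    return remaining
```

Note that the owning agent plays no part in the selection, so a ‘Stop’ spoken to one agent can end a timer that another agent started while the music underneath continues.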

Volume Control (up/down)

As a user, I want to invoke any compatible agent on device and say ‘Set volume up/down’ (and variants like ‘turn it up/down’) to change the global volume setting, regardless of which agent set it last.

Volume Control (to level N)

As a user, I want to invoke any compatible agent on device and say ‘Set volume to N’ (where N is a value from 0 to 10) to change the global volume setting, regardless of which agent set it last.
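
Because every agent writes to the same global volume, the spoken 0 to 10 level needs a single, shared mapping onto the device's own volume range. A minimal sketch, assuming a hypothetical device range of 0 to `device_max` (100 by default) and rounding half up:

```python
def level_to_device_volume(level: int, device_max: int = 100) -> int:
    """Map a spoken level (0 to 10) onto the device range 0 to device_max."""
    if not 0 <= level <= 10:
        raise ValueError("level must be between 0 and 10")
    if device_max <= 0:
        raise ValueError("device_max must be positive")
    # Integer arithmetic with round-half-up, so results are deterministic.
    return (level * device_max + 5) // 10
```

For example, level 3 maps to 30 on a 0 to 100 device, and level 5 maps to 8 on a device with 15 volume steps. Keeping this mapping on the device, not in each agent, helps all agents produce the same result for the same request.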

Volume Mute

As a user, I want to invoke any compatible agent on device and say ‘Mute’ to mute the global volume setting, regardless of which agent set it last.
