Step 3: Understand the Alexa Directives and Lambda Responses (VSK Echo Show)
In the previous section, Step 2: Enable your Video Skill on a Multimodal Device and Test, you initiated a request from the Alexa cloud to your Lambda and observed the requests and responses logged in CloudWatch. Now that you can see those logs, let's unpack the directives Alexa sends and the responses your Lambda returns, and explain what's going on in greater detail.
- Basic Workflow
- An Analogy of the Interaction
- Stepping Section by Section through the Sample Lambda
- Next Steps
Basic Workflow
The workflow (described in Architecture Overview) begins when a user says a phrase, such as "Play [media title]," to the multimodal device. Alexa's natural language processing intelligence does the work of parsing the user's utterances and figuring out the user's intent. Alexa then packages up the user's intent into a directive. The following directives are used with multimodal device interactions:
GetPlayableItems
GetPlayableItemsMetadata
GetBrowseNodeItems
GetDisplayableItems
GetDisplayableItemsMetadata
GetNextPage
Different scenarios (search, play, channel change, etc.) prompt different directives to be sent. These scenarios and the sent directives are defined in Directives Alexa Sends and in the reference documentation.
When users say, "Play the movie Manchester by the Sea", Alexa identifies all catalogs that contain the media, and then sends a GetPlayableItems directive to the appropriate Lambda functions for those catalogs. The GetPlayableItems directive looks like this:
Alexa Request: GetPlayableItems
{
"directive": {
"header": {
"correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
"messageId": "9f4803ec-4c94-4fdf-89c2-d502d5e52bb4",
"name": "GetPlayableItems",
"namespace": "Alexa.VideoContentProvider",
"payloadVersion": "3"
},
"endpoint": {
"scope": {
"type": "BearerToken",
"token": "access-token-from-skill"
},
"endpointId": "videoDevice-001",
"cookie": {
}
},
"payload": {
"entities": [
{
"type": "Video",
"value": "Manchester by the Sea",
"externalIds": {
"imdb": "tt4574334"
}
}
],
"contentType": "RECORDING",
"locale": "en-US",
"minResultLimit": 8,
"maxResultLimit": 25,
"timeWindow": {
"end": "2016-09-07T23:59:00+00:00",
"start": "2016-09-01T00:00:00+00:00"
}
}
}
}
The directive name appears in the header block. You can see that this is a GetPlayableItems directive.
Your Lambda receives this directive as an event. Your Lambda code needs to perform whatever lookups are necessary to identify which media titles match the request (based on the payload in the directive). As part of the lookup, your Lambda might identify additional media titles relevant to the user's request.
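As a rough illustration of reading the payload, the sketch below pulls out the fields a catalog lookup would probably need. The helper name extractSearchCriteria and the shape of the object it returns are assumptions made for this example; they are not part of the sample Lambda or the Alexa API.

```javascript
// Hypothetical helper (not part of the sample Lambda): collects the fields
// from a GetPlayableItems directive that a catalog lookup would likely use.
function extractSearchCriteria(event) {
  const { header, payload } = event.directive;
  return {
    directiveName: header.name,
    // Each entity carries a type and the value Alexa resolved from the utterance.
    titles: (payload.entities || [])
      .filter((entity) => entity.type === 'Video')
      .map((entity) => entity.value),
    locale: payload.locale,
    maxResults: payload.maxResultLimit,
  };
}

// Trimmed copy of the directive shown above.
const sampleEvent = {
  directive: {
    header: { name: 'GetPlayableItems', namespace: 'Alexa.VideoContentProvider' },
    payload: {
      entities: [{ type: 'Video', value: 'Manchester by the Sea' }],
      locale: 'en-US',
      maxResultLimit: 25,
    },
  },
};

const criteria = extractSearchCriteria(sampleEvent);
console.log(criteria);
```

What you do with these values (a database query, a search API call, and so on) depends entirely on your backend.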
Then your Lambda returns a response to Alexa that conforms to the requirements for responses to that directive type. For GetPlayableItems directives, the GetPlayableItemsResponse looks like this:
Lambda Response: GetPlayableItemsResponse
{
"event": {
"header": {
"correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
"messageId": "5f0a0546-caad-416f-a617-80cf083a05cd",
"name": "GetPlayableItemsResponse",
"namespace": "Alexa.VideoContentProvider",
"payloadVersion": "3"
},
"payload": {
"nextToken": "fvkjbr20dvjbkwOpqStr",
"mediaItems": [
{
"mediaIdentifier": {
"id": "recordingId://provider1.dvr.rp.1234-2345-63434-asdf"
}
},
{
"mediaIdentifier": {
"id": "recordingId://provider1.dvr.rp.1234-2345-63434-asdf"
}
}
]
}
}
}
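In this code, the mediaItems array contains two entries, each with a mediaIdentifier. (This sample repeats the same ID in both entries; in a real response, each entry would normally carry a distinct ID.) The values for these mediaIdentifier properties correspond with the content IDs in your catalog.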
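A minimal sketch of assembling this response from a list of catalog content IDs follows. The helper buildGetPlayableItemsResponse is hypothetical. Copying the correlationToken from the request and leaving out nextToken (the sketch does no pagination) are assumptions for illustration; check the reference documentation for the exact requirements.

```javascript
const { randomUUID } = require('crypto');

// Hypothetical helper: wraps catalog content IDs in the
// GetPlayableItemsResponse shape shown above.
function buildGetPlayableItemsResponse(directive, contentIds) {
  return {
    event: {
      header: {
        // Assumption: the response reuses the token from the incoming request.
        correlationToken: directive.header.correlationToken,
        messageId: randomUUID(), // a fresh ID for each response message
        name: 'GetPlayableItemsResponse',
        namespace: 'Alexa.VideoContentProvider',
        payloadVersion: '3',
      },
      payload: {
        // One mediaItems entry per matching content ID.
        mediaItems: contentIds.map((id) => ({ mediaIdentifier: { id } })),
      },
    },
  };
}

const response = buildGetPlayableItemsResponse(
  { header: { correlationToken: 'token-123' } },
  ['tt1254207', 'tt0807840']
);
console.log(JSON.stringify(response, null, 2));
```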
After Alexa receives the GetPlayableItemsResponse, Alexa might ask the user to clarify which media title the user wants to play, or which video provider the user wants to play the media from (in cases where multiple providers have the same media).
After resolving the media the user wants to play, Alexa then sends another directive to your Lambda called GetPlayableItemsMetadata. This directive asks your Lambda for more details about the chosen media title. Alexa needs these details in order to show information about the media title, because the value for the mediaIdentifier (for example, recordingId://provider1.dvr.rp.1234-2345-63434-asdf) means nothing to Alexa. You need to supply information related to this mediaIdentifier that indicates what to show on the user's screen, such as the title, thumbnail, duration, rating, etc.
The GetPlayableItemsMetadata directive that Alexa sends might look like this:
Alexa Request: GetPlayableItemsMetadata
{
"directive": {
"header": {
"correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
"messageId": "0f918d6e-ebae-48f1-a237-13c6f5b9f5da",
"name": "GetPlayableItemsMetadata",
"namespace": "Alexa.VideoContentProvider",
"payloadVersion": "3"
},
"endpoint": {
"scope": {
"type": "BearerToken",
"token": "access-token-from-skill"
},
"endpointId": "videoDevice-001",
"cookie": {
}
},
"payload": {
"locale": "en-US",
"mediaIdentifier": {
"id": "recordingId://provider1.dvr.rp.1234-2345-63434-asdf"
}
}
}
}
You can see here that the payload identifies a specific mediaIdentifier that the user wants to play.
Your Lambda then retrieves the needed information about this mediaIdentifier and returns a GetPlayableItemsMetadataResponse with more information about it. The response might look as follows:
Lambda Response: GetPlayableItemsMetadataResponse
{
"event": {
"header": {
"correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
"messageId": "38ce5b22-eeff-40b8-a84f-979446f9b27e",
"name": "GetPlayableItemsMetadataResponse",
"namespace": "Alexa.VideoContentProvider",
"payloadVersion": "3"
},
"payload": {
"searchResults": [
{
"name": "Interstellar",
"contentType": "ON_DEMAND",
"series": {
"seasonNumber": "1",
"episodeNumber": "1",
"seriesName": "The Big Bang Theory",
"episodeName": "Pilot"
},
"playbackContextToken": "{\"streamUrl\": \"http:\/\/samplemediasite.com\/sample\/video.mp4\", \"title\": \"Some Video Title\"}",
"parentalControl": {
"pinControl": "REQUIRED"
},
"absoluteViewingPositionMilliseconds": 1232340
}
]
}
}
}
Alexa then passes the playbackContextToken to your web player, which then converts this identifier into a media playback URL and loads the media.
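In the sample response above, the playbackContextToken happens to be a JSON string containing a streamUrl. Under that assumption, a web player could recover the URL roughly as sketched here. The function name parsePlaybackContextToken is hypothetical, and your own token format may be entirely different (for example, an opaque ID that you resolve on your server).

```javascript
// Hypothetical web player helper. Assumes the token is a JSON string with a
// "streamUrl" field, as in the sample response above.
function parsePlaybackContextToken(token) {
  try {
    const playbackContext = JSON.parse(token);
    // Reject tokens that parse but carry no usable URL.
    return typeof playbackContext.streamUrl === 'string' ? playbackContext : null;
  } catch (err) {
    return null; // not valid JSON
  }
}

// Build a token equivalent to the one in the sample response.
const token = JSON.stringify({
  streamUrl: 'http://samplemediasite.com/sample/video.mp4',
  title: 'Some Video Title',
});

const playback = parsePlaybackContextToken(token);
const invalid = parsePlaybackContextToken('not-json');
console.log(playback, invalid);
```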
An Analogy of the Interaction
To put this interaction more concretely, consider this analogy. A customer walks into a video store and asks the clerk, "I want to watch To Kill a Mockingbird." The clerk passes on the request to a backroom worker who looks through the media library and locates the Mockingbird media section. He finds that there are multiple matches for this media, with different editions and variations, some matches with Gregory Peck and other matches with Mockingjay in Hunger Games.
The backroom worker relays the info back to the clerk, who then asks the customer, "Which of these media titles do you actually want?" The customer says, "I want the first one (with Gregory Peck)." The clerk then turns to the backroom worker and says the customer wants the Gregory Peck media title. The backroom worker retrieves all the details about the Gregory Peck title and returns this info to the clerk. The clerk loads the media into a player and plays it for the customer.
In short, the user's request gets converted to a GetPlayableItems directive sent to your Lambda. Your Lambda responds with a GetPlayableItemsResponse listing the matching titles. Alexa follows up with a GetPlayableItemsMetadata directive for the title the user selects, and your Lambda replies with a GetPlayableItemsMetadataResponse containing all the details for playback.
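The two-step exchange in the analogy can be sketched with an in-memory stand-in for the backroom worker's media library. Everything here (the catalog Map, its IDs, and both function names) is hypothetical and only illustrates the search-then-details pattern.

```javascript
// Hypothetical in-memory catalog standing in for a real backend service.
const catalog = new Map([
  ['mockingbird-1962', { name: 'To Kill a Mockingbird' }],
  ['mockingjay-2014', { name: 'The Hunger Games: Mockingjay - Part 1' }],
]);

// Step 1 (GetPlayableItems): return the IDs of every title matching the query.
function findPlayableItems(query) {
  const needle = query.toLowerCase();
  return [...catalog.entries()]
    .filter(([, item]) => item.name.toLowerCase().includes(needle))
    .map(([id]) => ({ mediaIdentifier: { id } }));
}

// Step 2 (GetPlayableItemsMetadata): return details for the one chosen ID.
function getPlayableItemMetadata(id) {
  const item = catalog.get(id);
  return item ? { name: item.name, contentType: 'ON_DEMAND' } : null;
}

const matches = findPlayableItems('mocking'); // the clerk gets two candidates
const chosenId = matches[0].mediaIdentifier.id; // the customer picks the first
const metadata = getPlayableItemMetadata(chosenId);
console.log(matches.length, metadata);
```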
Exactly how you code your Lambda and call the backend services that retrieve the right data is up to you. This documentation does not provide tutorials on interacting with your backend services to gather and generate the needed responses, because code and backend services differ considerably from partner to partner.
Stepping Section by Section through the Sample Lambda
The following sections explain the logic in the sample Lambda function. The sample Lambda function, provided in Step 1: Create Your Video Skill and Lambda Function (specifically the section Step 1.3: Create the Lambda Function for Your Video Skill), demonstrates the required responses for several directive types. Let's unpack the Lambda function section by section.
The hardCodedResponse function here bypasses the lookups that your backend service will need to perform. To handle the incoming events arriving at your Lambda, your code will need to interface with some backend service to perform lookups, searches, etc., and return the needed information. To avoid getting lost in detailed code that might include a lot of logic foreign to your own implementation, we've simply hard-coded the responses here to show you the requirements of the response.
Also, note that you can provide your Lambda code in a variety of languages. See the following AWS Lambda documentation topics for instructions on working with other languages:
- Building Lambda Functions with Node.js
- Building Lambda Functions with Python
- Building Lambda Functions with Ruby
- Building Lambda Functions with Java
- Building Lambda Functions with Go
- Building Lambda Functions with C#
- Building Lambda Functions with PowerShell
This tutorial uses Node.js. Here's the full sample Lambda. After the code, we'll step through it section by section.
// section 1 begin
var AWS = require('aws-sdk');
exports.handler = (event, context, callback) => {
console.log("Interaction starts");
hardCodedResponse(event, context);
};
// section 1 end
// section 2 begin
function hardCodedResponse(event, context) {
var name = event.directive.header.name;
console.log("Alexa Request: ", name, JSON.stringify(event));
// section 2 end
// section 3 begin
var DiscoverResultResponse = {
"event": {
"header": {
"namespace": "Alexa.Discovery",
"name": "Discover.Response",
"payloadVersion": "3",
"messageId": "ff746d98-ab02-4c9e-9d0d-b44711658414"
},
"payload": {
"endpoints": [{
"endpointId": "ALEXA_VOICE_SERVICE_EXTERNAL_MEDIA_PLAYER_VIDEO_PROVIDER",
"endpointTypeId": "TEST_VSK_MM",
"manufacturerName": "TEST_VSK_MM",
"friendlyName": "TEST_VSK_MM",
"description": "TEST_VSK_MM",
"displayCategories": ["APPLICATION"],
"cookie": {},
"capabilities": [{
"type": "AlexaInterface",
"interface": "Alexa.RemoteVideoPlayer",
"version": "1.0"
}, {
"type": "AlexaInterface",
"interface": "Alexa.PlaybackController",
"version": "1.0"
}, {
"type": "AlexaInterface",
"interface": "Alexa.SeekController",
"version": "1.0"
}, {
"type": "AlexaInterface",
"interface": "Alexa.ChannelController",
"version": "1.0"
},
{
"type": "AlexaInterface",
"interface": "Alexa.MultiModalLandingPage",
"version": "1.0"
}]
}]
}
}
};
var GetPlayableItemsResponse = {
"event": {
"header": {
"correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
"messageId": "5f0a0546-caad-416f-a617-80cf083a05cd",
"name": "GetPlayableItemsResponse",
"namespace": "Alexa.VideoContentProvider",
"payloadVersion": "3"
},
"payload": {
"nextToken": "fvkjbr20dvjbkwOpqStr",
"mediaItems": [{
"mediaIdentifier": {
"id": "tt1254207"
}
}]
}
}
};
var GetPlayableItemsMetadataResponse = {
"event": {
"header": {
"correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
"messageId": "38ce5b22-eeff-40b8-a84f-979446f9b27e",
"name": "GetPlayableItemsMetadataResponse",
"namespace": "Alexa.VideoContentProvider",
"payloadVersion": "3"
},
"payload": {
"searchResults": [{
"name": "Big Buck Bunny",
"contentType": "ON_DEMAND",
"series": {
"seasonNumber": "1",
"episodeNumber": "1",
"seriesName": "Blender Foundation Videos",
"episodeName": "Pilot"
},
"playbackContextToken": "{\"streamUrl\": \"http:\/\/commondatastorage.googleapis.com\/gtv-videos-bucket\/sample\/BigBuckBunny.mp4\", \"title\": \"Big Buck Bunny\"}",
"parentalControl": {
"pinControl": "REQUIRED"
},
"absoluteViewingPositionMilliseconds": 1232340
}]
}
}
};
var GetDisplayableItemsResponse = {
"event": {
"header": {
"correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
"messageId": "5f0a0546-caad-416f-a617-80cf083a05cd",
"name": "GetDisplayableItemsResponse",
"namespace": "Alexa.VideoContentProvider",
"payloadVersion": "3"
},
"payload": {
"nextToken": "fvkjbr20dvjbkwOpqStr",
"mediaItems": [{
"mediaIdentifier": {
"id": "tt1254207"
}
}, {
"mediaIdentifier": {
"id": "tt0807840"
}
}]
}
}
};
var GetNextPageResponse = {
"event": {
"header": {
"correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
"messageId": "9f4803ec-4c94-4fdf-89c2-d502d5e52bb4",
"name": "GetNextPageResponse",
"namespace": "Alexa.VideoContentProvider",
"payloadVersion": "3"
},
"endpoint": {
"scope": {
"type": "BearerToken",
"token": "Alexa-access-token"
},
"endpointId": "appliance-001"
},
"payload": {
"nextToken": "qefjrfiugef74",
"mediaItems": [{
"mediaIdentifier": {
"id": "tt0807840"
}
},
{
"mediaIdentifier": {
"id": "tt1254207"
}
},
{
"mediaIdentifier": {
"id": "tt7993892"
}
},
{
"mediaIdentifier": {
"id": "tt2285752"
}
},
{
"mediaIdentifier": {
"id": "tt4957236"
}
}
]
}
}
};
var GetDisplayableItemsMetadataResponse = {
"event": {
"header": {
"correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
"messageId": "38ce5b22-eeff-40b8-a84f-979446f9b27e",
"name": "GetDisplayableItemsMetadataResponse",
"namespace": "Alexa.VideoContentProvider",
"payloadVersion": "3"
},
"payload": {
"resultsTitle": "SearchResults",
"searchResults": [{
"name": "Big Buck Bunny",
"contentType": "ON_DEMAND",
"itemType": "VIDEO",
"releaseYear": "2014",
"selectionAction": "PLAY",
"thumbnailImage": {
"contentDescription": "Big Buck Bunny image",
"sources": [{
"url": "https:\/\/devportal-reference-docs.s3-us-west-1.amazonaws.com\/video-skills-kit\/bigbuckbunnythumb.png",
"size": "X_LARGE",
"widthPixels": 1920,
"heightPixels": 1280
}]
},
"runtime": {
"runTimeInMilliseconds": 5400000,
"displayString": "9m"
},
"closedCaption": {
"status": "AVAILABLE",
"displayString": "CC"
},
"absoluteViewingPositionMilliseconds": 0,
"parentalControl": {
"pinControl": "REQUIRED"
},
"viewingDisplayString": "PurchaseOptions",
"reviews": [{
"totalReviewCount": 1897,
"type": "FIVE_STAR",
"ratingDisplayString": "4.06"
}],
"rating": {
"category": "G"
},
"mediaIdentifier": {
"id": "tt1254207"
}
}]
}
}
};
// section 3 end
// section 4 begin
if (name === 'Discover') {
console.log("Lambda Response: DiscoverResultResponse", JSON.stringify(DiscoverResultResponse));
context.succeed(DiscoverResultResponse);
} else if (name === 'GetPlayableItems') {
console.log("Lambda Response: GetPlayableItemsResponse", JSON.stringify(GetPlayableItemsResponse));
context.succeed(GetPlayableItemsResponse);
} else if (name === 'GetPlayableItemsMetadata') {
console.log("Lambda Response: GetPlayableItemsMetadataResponse", JSON.stringify(GetPlayableItemsMetadataResponse));
context.succeed(GetPlayableItemsMetadataResponse);
} else if (name === 'GetDisplayableItems') {
console.log("Lambda Response: GetDisplayableItemsResponse", JSON.stringify(GetDisplayableItemsResponse));
context.succeed(GetDisplayableItemsResponse);
} else if (name === 'GetDisplayableItemsMetadata') {
console.log("Lambda Response: GetDisplayableItemsMetadataResponse", JSON.stringify(GetDisplayableItemsMetadataResponse));
context.succeed(GetDisplayableItemsMetadataResponse);
}
else if (name === 'GetNextPage') {
console.log("Lambda Response: GetNextPageResponse", JSON.stringify(GetNextPageResponse));
context.succeed(GetNextPageResponse);
}
};
// section 4 end
Section 1 Explanation
// section 1 begin
var AWS = require('aws-sdk');
exports.handler = (event, context, callback) => {
console.log("Interaction starts");
hardCodedResponse(event, context);
...
}
// section 1 end
First we declare a dependency on the AWS SDK for JavaScript in Node.js. This SDK allows your Node.js code to perform a number of functions inside of AWS, which you can read about in the AWS SDK documentation. (The sample itself does not use the AWS variable after declaring it.)
The handler method is explained in AWS Lambda Function Handler in Node.js. When a Lambda function is invoked, AWS Lambda starts executing your code by calling the handler function. AWS Lambda passes any event data to this handler as the first parameter. The runtime passes three arguments to the handler method: event, context, and callback:
- event: The first argument is the event object, which contains information from the invoker. In this case the invoker is Alexa, and the event is the directive. Alexa passes this directive as a JSON-formatted string when it calls Invoke.
- context: The second argument is the context object, which contains information about the invocation, function, and execution environment.
- callback: The third argument, callback, is a function that you can call in non-async functions to send a response. The callback function takes two arguments: an Error and a response. The response object must be compatible with JSON.stringify.
Your handler should process the incoming event data and may invoke any other functions/methods in your code.
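To make the three arguments concrete, here is a small sketch of a non-async handler that responds through callback rather than through context.succeed as the sample does. The response shape ({ received: ... }) and the stand-in event are made up for the example; the local invocation at the bottom simply calls the function the way the runtime would.

```javascript
// Sketch of a non-async handler that responds through the callback argument.
const handler = (event, context, callback) => {
  try {
    const name = event.directive.header.name;
    // First argument is the error (null on success), second is the response.
    callback(null, { received: name });
  } catch (err) {
    // A malformed event ends up here and is reported as an error.
    callback(err);
  }
};

// Local invocation with a stand-in event and an empty context object.
let result = null;
let failure = null;
handler({ directive: { header: { name: 'Discover' } } }, {}, (err, response) => {
  failure = err;
  result = response;
});
console.log(failure, result);
```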
To see the event and context, you can log these to the console:
console.log("Alexa Request: " + JSON.stringify(event, null, 2));
console.log("Context: " + JSON.stringify(context, null, 2));
The event is logged as part of the hardCodedResponse function; the context isn't logged at all in the sample Lambda.
The event and context are JavaScript objects. JSON.stringify renders an object as a JSON string. The JSON.stringify method takes up to three parameters: the object, a replacer (not used here, so null is passed), and a spacing value that controls indentation.
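The effect of each JSON.stringify parameter is easiest to see side by side. The sketch below uses a made-up event object; the replacer that masks the token field is just one possible use, shown here as an assumption about what you might not want in your logs.

```javascript
// Made-up event, trimmed to a few fields for illustration.
const event = {
  directive: {
    header: { name: 'GetPlayableItems' },
    endpoint: { scope: { type: 'BearerToken', token: 'secret' } },
  },
};

// Object only: a single-line string.
const compact = JSON.stringify(event);

// Object, no replacer, spacing of 2: indented, easier to read in logs.
const pretty = JSON.stringify(event, null, 2);

// Object, replacer function, spacing of 2: the replacer sees every key/value
// pair and can substitute a value, here masking the access token.
const redacted = JSON.stringify(
  event,
  (key, value) => (key === 'token' ? '[REDACTED]' : value),
  2
);

console.log(compact);
console.log(pretty);
console.log(redacted);
```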
The context isn't necessarily important here, but it shows the name of the Lambda function invoked, the log stream, the memory limit, and other details. In CloudWatch, the context for an event in our workflow looks as follows:
Context:
{
"callbackWaitsForEmptyEventLoop": true,
"functionVersion": "$LATEST",
"functionName": "hawaii_echo_lambda",
"memoryLimitInMB": "128",
"logGroupName": "/aws/lambda/hawaii_echo_lambda",
"logStreamName": "2019/06/21/[$LATEST]2eaa24e01fff497187f6d0fcc2230e8d",
"invokedFunctionArn": "arn:aws:lambda:us-east-1:458179560631:function:hawaii_echo_lambda",
"awsRequestId": "1b0c6361-bbe5-440b-95cb-3024f3abfa53"
}
You can see that hawaii_echo_lambda is the Lambda invoked, and logs in CloudWatch are grouped in /aws/lambda/hawaii_echo_lambda.
The handler method then calls the hardCodedResponse function, passing it the event and context as arguments. This function is explained in the next section.
Section 2 Explanation
// section 2 begin
function hardCodedResponse(event, context) {
var name = event.directive.header.name;
console.log("Alexa Request: ", name, JSON.stringify(event));
// section 2 end
...}
The sample Lambda code contains a function called hardCodedResponse that takes the event and context as parameters. This shows how you can get information from the incoming event (the Alexa request) and set variables for the information contained in that event.
For example, the sample Lambda code sets a variable called name to store the value of event.directive.header.name.
As needed, you can set any properties in the directive to some variable that you use in your lookups. The information you need (and how you manipulate it) depends on the task you're trying to perform. For example, you might want to perform a lookup based on a movie title. As such, you might need certain details to feed into your lookup functions with your backend services.
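One thing to be aware of: an expression such as event.directive.header.name throws a TypeError if the event is not shaped as expected. The sketch below reads a few properties defensively using optional chaining. The helper name readDirective and the fields it returns are assumptions for this example, and which properties you actually need depends on your lookups.

```javascript
// Hypothetical helper: reads properties from the directive without throwing
// when parts of the event are missing.
function readDirective(event) {
  const header = event?.directive?.header;
  if (!header || typeof header.name !== 'string') {
    return null; // not a directive this code understands
  }
  const payload = event.directive.payload ?? {};
  return {
    name: header.name,
    locale: payload.locale ?? null,
    // Present on GetPlayableItemsMetadata directives, absent on others.
    mediaId: payload.mediaIdentifier?.id ?? null,
  };
}

const parsed = readDirective({
  directive: {
    header: { name: 'GetPlayableItemsMetadata' },
    payload: {
      locale: 'en-US',
      mediaIdentifier: { id: 'recordingId://provider1.dvr.rp.1234-2345-63434-asdf' },
    },
  },
});
const malformed = readDirective({});
console.log(parsed, malformed);
```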
Section 3 Explanation
At this point, the sample Lambda code simply defines a variable for each pre-defined response that it can return for a directive. For example:
//section 3 begin
var DiscoverResultResponse = {
...
};
var GetPlayableItemsResponse = {
...
};
var GetPlayableItemsMetadataResponse = {
...
};
var GetDisplayableItemsResponse = {
...
};
var GetNextPageResponse = {
...
};
var GetDisplayableItemsMetadataResponse = {
...
};
//section 3 end
As noted earlier, the sample Lambda hard-codes these responses. In a real implementation, you need to retrieve the needed information dynamically through your backend service. For example, based on the incoming payload, you would take the information and plug this into your own logic for resolving the user's request.
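As a rough idea of what "dynamic" could mean, the sketch below builds the payload of a GetPlayableItemsMetadataResponse from a record looked up by ID. The titles object stands in for a backend, the function names are hypothetical, and the lookup is synchronous only to keep the sketch short; a real backend call would typically be asynchronous. Returning null for an unknown ID is also an assumption, since how to report a miss to Alexa is not covered here.

```javascript
// Hypothetical stand-in for a backend lookup; a real one would query a service.
const titles = {
  tt1254207: {
    name: 'Big Buck Bunny',
    streamUrl: 'http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4',
  },
};

function lookUpTitle(id) {
  return Object.prototype.hasOwnProperty.call(titles, id) ? titles[id] : null;
}

// Builds the payload portion of a GetPlayableItemsMetadataResponse.
function buildMetadataPayload(mediaIdentifier) {
  const title = lookUpTitle(mediaIdentifier.id);
  if (!title) {
    return null; // the caller decides how to report an unknown ID
  }
  return {
    searchResults: [{
      name: title.name,
      contentType: 'ON_DEMAND',
      // Same token format as the sample: a JSON string the web player parses.
      playbackContextToken: JSON.stringify({ streamUrl: title.streamUrl, title: title.name }),
    }],
  };
}

const payload = buildMetadataPayload({ id: 'tt1254207' });
const unknown = buildMetadataPayload({ id: 'no-such-id' });
console.log(JSON.stringify(payload, null, 2), unknown);
```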
The sample Lambda here doesn't have a backend service with media information and such, so the responses are simply pre-defined. As a result, if you're using this sample Lambda in a test to explore how video skills work on multimodal devices, you're limited to queries for the content defined here.
In the future, a more dynamic sample Lambda might be made available in this documentation. However, since the lookup process will vary drastically from partner to partner based on their differing backend services and programming languages, this extra code to query a backend service might not be that instructive.
Section 4 Explanation
// section 4 begin
if (name === 'Discover') {
console.log("Lambda Response: DiscoverResultResponse", JSON.stringify(DiscoverResultResponse));
context.succeed(DiscoverResultResponse);
} else if (name === 'GetPlayableItems') {
console.log("Lambda Response: GetPlayableItemsResponse", JSON.stringify(GetPlayableItemsResponse));
context.succeed(GetPlayableItemsResponse);
} else if (name === 'GetPlayableItemsMetadata') {
console.log("Lambda Response: GetPlayableItemsMetadataResponse", JSON.stringify(GetPlayableItemsMetadataResponse));
context.succeed(GetPlayableItemsMetadataResponse);
} else if (name === 'GetDisplayableItems') {
console.log("Lambda Response: GetDisplayableItemsResponse", JSON.stringify(GetDisplayableItemsResponse));
context.succeed(GetDisplayableItemsResponse);
} else if (name === 'GetDisplayableItemsMetadata') {
console.log("Lambda Response: GetDisplayableItemsMetadataResponse", JSON.stringify(GetDisplayableItemsMetadataResponse));
context.succeed(GetDisplayableItemsMetadataResponse);
}
else if (name === 'GetNextPage') {
console.log("Lambda Response: GetNextPageResponse", JSON.stringify(GetNextPageResponse));
context.succeed(GetNextPageResponse);
}
};
// section 4 end
The final section of the function returns the appropriate response based on the directive name. If the directive was GetPlayableItems, then the GetPlayableItemsResponse is passed back to Alexa. The context.succeed method ends the invocation and returns the given object as the Lambda's result. It is an older alternative to the callback argument, which the sample accepts but does not use.
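Note that the if/else chain in the sample returns nothing when the directive name matches none of the branches. One alternative, sketched here as a design suggestion rather than as part of the sample, is a lookup table keyed by directive name with an explicit error for unknown names. The responses object below holds trimmed placeholders, not complete responses.

```javascript
// Placeholder responses keyed by the directive name they answer.
const responses = {
  Discover: { event: { header: { name: 'Discover.Response' } } },
  GetPlayableItems: { event: { header: { name: 'GetPlayableItemsResponse' } } },
  GetPlayableItemsMetadata: { event: { header: { name: 'GetPlayableItemsMetadataResponse' } } },
};

// Returns the response for a directive name, or throws for unknown names so
// that unsupported directives show up in the logs instead of failing silently.
function selectResponse(name) {
  if (!Object.prototype.hasOwnProperty.call(responses, name)) {
    throw new Error(`Unsupported directive: ${name}`);
  }
  return responses[name];
}

const selected = selectResponse('GetPlayableItems');
console.log('Lambda Response:', JSON.stringify(selected));
```

In a handler, the selected response would then be passed to context.succeed or to the callback, and the thrown error would be reported as a failed invocation.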
In the CloudWatch logs, look for a line that begins with Lambda Response to see the response that your Lambda sends back to Alexa.
Next Steps
Go on to Step 4: Understand How Your Web Player Gets the Media Playback URL.
Last updated: Nov 02, 2020