AWS January 2016 Webinar Series - Building Smart Applications with Amazon Machine Learning

Real-World Smart Applications with Amazon Machine Learning

Alex Ingerman, Sr. Manager, Tech. Product Management, Amazon Machine Learning

1/28/2016

© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.





• Why social media + machine learning = happy customers

• Using Amazon ML to find important social media conversations

• Building an end-to-end application to act on these conversations


Application details

Goal: build a smart application for social media listening in the cloud

Full source code and documentation are on GitHub:

Amazon Kinesis


Amazon Machine Learning


Amazon Mechanical Turk

Motivation for listening to social media

• Customer is reporting a possible service issue

• Customer is making a feature request

• Customer is angry or unhappy

• Customer is asking a question

Why do we need machine learning for this?

The social media stream is high-volume, and most of the messages are not CS-actionable


Amazon Machine Learning in one slide

• Easy to use, managed machine learning service built for developers

• Robust, powerful machine learning technology based on Amazon’s internal systems

• Create models using your data already stored in the AWS cloud

• Deploy models to production in seconds


Formulating the problem

We would like to…

Instantly find new tweets mentioning @awscloud, ingest and analyze each one to predict whether a customer service agent should act on it, and, if so, send that tweet to customer service agents.

Twitter API Amazon Kinesis


Amazon Machine Learning


Building smart applications

1. Pick the ML strategy
2. Create ML model
3. Write and configure code
4. Try it out!


Picking the machine learning strategy

Question we want to answer: Is this tweet customer service-actionable, or not?

Our dataset: Text and metadata from past tweets mentioning @awscloud

Machine learning approach: Create a binary classification model to answer a yes/no question, and provide a confidence score
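To make the "yes/no answer plus confidence score" idea concrete, here is a minimal sketch of turning a score into a decision. The function name, the sample scores, and the 0.5 cut-off are illustrative assumptions, not values taken from the webinar.

```python
def is_actionable(score, threshold=0.5):
    """Turn a confidence score in [0, 1] into a yes/no answer.

    The threshold is an assumed, adjustable cut-off.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    return score >= threshold


# Hypothetical scores for three tweets.
scores = [0.91, 0.48, 0.07]
print([is_actionable(s) for s in scores])                 # default cut-off
print([is_actionable(s, threshold=0.4) for s in scores])  # more permissive cut-off
```

Lowering the threshold sends more tweets to agents (fewer missed issues, more noise); raising it does the opposite.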


Retrieve past tweets

Twitter API can be used to search for tweets containing our company’s handle (e.g., @awscloud)

import twitter

twitter_api = twitter.Api(**twitter_credentials)
twitter_handle = 'awscloud'
search_query = '@' + twitter_handle + ' -from:' + twitter_handle
results = twitter_api.GetSearch(term=search_query, count=100, result_type='recent')
# We can go further back in time by issuing additional search requests

Good news: data is well-structured and clean
Bad news: tweets are not categorized (labeled) for us
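The comment about "additional search requests" could be sketched as follows. This is one possible approach, assuming the search call accepts a max_id argument (as python-twitter's GetSearch does) and returns objects with an integer id attribute; the helper name fetch_older_tweets and its parameters are hypothetical.

```python
def fetch_older_tweets(search, query, pages=3, page_size=100):
    """Collect several pages of results, walking backwards in time.

    `search` is any callable that accepts the keyword arguments term,
    count, result_type and max_id and returns a list of objects with an
    integer `id` attribute (for example, twitter_api.GetSearch).
    """
    collected = []
    max_id = None
    for _ in range(pages):
        batch = search(term=query, count=page_size,
                       result_type='recent', max_id=max_id)
        if not batch:
            break
        collected.extend(batch)
        # Next request: only tweets strictly older than the oldest one seen.
        max_id = min(tweet.id for tweet in batch) - 1
    return collected
```

A call might look like fetch_older_tweets(twitter_api.GetSearch, search_query, pages=10), subject to the API's rate limits.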


Labeling past tweets

Why label tweets?
(Many) machine learning algorithms work by discovering patterns connecting data points and labels

How many tweets need to be labeled?
Several thousand to start with

Can I pay someone to do this?
Yes! Amazon Mechanical Turk is a marketplace for tasks that require human intelligence
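The slides do not say how worker answers were combined into a single label. As one hedged possibility, assuming each tweet is shown to several workers, a simple majority vote could produce the training label; the function name and the "1"/"0" answer encoding below are assumptions for illustration.

```python
from collections import Counter


def majority_label(answers):
    """Return the most common answer, or None if there is no clear winner."""
    counts = Counter(answers).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie: leave the tweet unlabeled or send it for another round
    return counts[0][0]


# Hypothetical answers from three workers for one tweet ("1" = actionable).
print(majority_label(["1", "1", "0"]))
```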


Creating the Mechanical Turk task


Publishing the task


Preview labeling results

Sample tweets from our previously collected dataset + their labels

This column was created from Mechanical Turk responses

Sample tweets and labels (most metadata fields removed for clarity)

Amazon ML process, in a nutshell

1. Create your datasources
Two API calls to create your training and evaluation data
Sanity-check your data in service console

2. Create your ML model
One API call to build a model, with smart defaults or custom settings

3. Evaluate your ML model
One API call to compute your model’s quality metric

4. Adjust your ML model
Use console to align performance trade-offs to your business goals


Create the data schema string

{
  "dataFileContainsHeader": true,
  "dataFormat": "CSV",
  "targetAttributeName": "trainingLabel",
  "attributes": [
    { "attributeName": "description", "attributeType": "TEXT" },
    <additional attributes here>,
    { "attributeName": "trainingLabel", "attributeType": "BINARY" }
  ]
}

Schemas communicate metadata about your dataset:

• Data format
• Attributes’ names, types, and order
• Names of special attributes
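Because the schema is passed to the service as a string, it can be convenient to build it from Python data and serialize it. The sketch below is one way to do that; build_schema is a hypothetical helper, and the followers_count attribute with type NUMERIC is an assumed stand-in for the attributes the slide elides.

```python
import json


def build_schema(attributes, target):
    """Build a schema string from (name, type) pairs given in CSV column order."""
    names = [name for name, _ in attributes]
    if target not in names:
        raise ValueError("target attribute must be one of the attributes")
    schema = {
        "dataFileContainsHeader": True,
        "dataFormat": "CSV",
        "targetAttributeName": target,
        "attributes": [
            {"attributeName": name, "attributeType": kind}
            for name, kind in attributes
        ],
    }
    return json.dumps(schema)


# followers_count is an assumed example attribute, not taken from the slide.
data_schema = build_schema(
    [("description", "TEXT"),
     ("followers_count", "NUMERIC"),
     ("trainingLabel", "BINARY")],
    target="trainingLabel")
print(data_schema)
```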


Create the training datasource

import boto

ml = boto.connect_machinelearning()

data_spec = {
    'DataLocationS3': s3_uri,    # E.g.: s3://my-bucket/dir/data.csv
    'DataSchema': data_schema,   # Schema string (previous slide)
}

# Use only the first 70% of the datasource for training.
data_spec['DataRearrangement'] = '{"splitting": {"percentBegin": 0, "percentEnd": 70}}'

ml.create_data_source_from_s3(
    data_source_id="ds-tweets-train",
    data_source_name="Tweet training data (70%)",
    data_spec=data_spec,
    compute_statistics=True)


Create the evaluation datasource

import boto

ml = boto.connect_machinelearning()

data_spec = {
    'DataLocationS3': s3_uri,    # E.g.: s3://my-bucket/dir/data.csv
    'DataSchema': data_schema,   # Schema string (previous slide)
}

# Use the last 30% of the datasource for evaluation.
data_spec['DataRearrangement'] = '{"splitting": {"percentBegin": 70, "percentEnd": 100}}'

ml.create_data_source_from_s3(
    data_source_id="ds-tweets-eval",
    data_source_name="Tweet evaluation data (30%)",
    data_spec=data_spec,
    compute_statistics=True)
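The two DataRearrangement strings must describe complementary ranges, so generating both from a single number avoids accidental overlap or gaps. A minimal sketch, with split_rearrangements as a hypothetical helper name:

```python
import json


def split_rearrangements(train_percent=70):
    """Return (training, evaluation) DataRearrangement strings for one split point."""
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 0 and 100, exclusive")
    train = {"splitting": {"percentBegin": 0, "percentEnd": train_percent}}
    evaluation = {"splitting": {"percentBegin": train_percent, "percentEnd": 100}}
    return json.dumps(train), json.dumps(evaluation)


train_rearrangement, eval_rearrangement = split_rearrangements(70)
print(train_rearrangement)
print(eval_rearrangement)
```

Each string would then be assigned to data_spec['DataRearrangement'] before the corresponding datasource is created.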


Visually inspecting training data


Create the ML model

import boto

ml = boto.connect_machinelearning()

ml.create_ml_model(
    ml_model_id="ml-tweets",
    ml_model_name="Tweets screening model",
    ml_model_type="BINARY",
    training_data_source_id="ds-tweets-train")

Input data location is looked up from the training datasource ID

Default model parameters and automatic data transformations are used, or you can provide your own


Evaluate the ML model

import boto

ml = boto.connect_machinelearning()

ml.create_evaluation(
    evaluation_id="ev-tweets",
    evaluation_name="Evaluation of tweet screening model",
    ml_model_id="ml-tweets",
    evaluation_data_source_id="ds-tweets-eval")

Input data location is looked up from the evaluation datasource ID

Amazon ML automatically selects and computes an industry-standard evaluation metric based on your ML model type
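For binary models the metric reported is AUC (area under the ROC curve). The service computes it for you; the sketch below only illustrates what the number means, using a straightforward pairwise definition and made-up labels and scores.

```python
def auc(labels, scores):
    """Area under the ROC curve.

    Equivalent to the probability that a randomly chosen positive example
    is scored higher than a randomly chosen negative one (ties count half).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical evaluation data: 1 = actionable, 0 = not actionable.
print(auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1]))
```

A value near 1.0 means actionable tweets are almost always ranked above non-actionable ones; 0.5 is no better than random ordering.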


Visually inspecting and adjusting the ML model
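Adjusting a binary model typically means moving the score threshold and watching how precision and recall trade off. The console does this interactively; the following is only a rough offline sketch of the same calculation, with hypothetical labels and scores.

```python
def precision_recall(labels, scores, threshold):
    """Compute (precision, recall) when scores >= threshold count as 'yes'."""
    tp = fp = fn = 0
    for y, s in zip(labels, scores):
        predicted = s >= threshold
        if predicted and y == 1:
            tp += 1
        elif predicted and y == 0:
            fp += 1
        elif not predicted and y == 1:
            fn += 1
    precision = tp / float(tp + fp) if tp + fp else 0.0
    recall = tp / float(tp + fn) if tp + fn else 0.0
    return precision, recall


labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
for threshold in (0.5, 0.3):
    print(threshold, precision_recall(labels, scores, threshold))
```

A lower threshold catches more actionable tweets (higher recall) at the cost of more false alarms (lower precision).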


Reminder: Our data flow

Twitter API Amazon Kinesis


Amazon Machine Learning



Create an Amazon ML endpoint for retrieving real-time predictions

import boto

ml = boto.connect_machinelearning()

ml.create_realtime_endpoint("ml-tweets")
# Endpoint information can be retrieved using the get_ml_model() method. Sample output:
# "EndpointInfo": {
#     "CreatedAt": 1424378682.266,
#     "EndpointStatus": "READY",
#     "EndpointUrl": "",
#     "PeakRequestsPerSecond": 200}


Create an Amazon Kinesis stream for receiving tweets

import boto

kinesis = boto.connect_kinesis()

kinesis.create_stream(stream_name='tweetStream', shard_count=1)
# Each open shard can support up to 5 read transactions per second, up to a
# maximum total of 2 MB of data read per second. Each shard can support up to
# 1000 records written per second, up to a maximum total of 1 MB data written
# per second.


Set up AWS Lambda to coordinate the data flow

The Lambda function is our application’s backbone. We will:

1. Write the code that will process and route tweets
2. Configure the Lambda execution policy (what is it allowed to do?)
3. Add the Kinesis stream as the data source for the Lambda function


Create Lambda functions


// These are our function’s signatures and globals only. See GitHub repository for full source.
var ml = new AWS.MachineLearning();
var endpointUrl = '';
var mlModelId = 'ml-tweets';
var snsTopicArn = 'arn:aws:sns:{region}:{awsAccountId}:{snsTopic}';
var snsMessageSubject = 'Respond to tweet';
var snsMessagePrefix = 'ML model ' + mlModelId + ': Respond to this tweet:';

var processRecords = function() {…}                 // Base64 decode the Kinesis payload and parse JSON
var callPredict = function(tweetData) {…}           // Call Amazon ML real-time prediction API
var updateSns = function(tweetData) {…}             // Publish CS-actionable tweets to SNS topic
var checkRealtimeEndpoint = function(err, data) {…} // Get Amazon ML endpoint URI
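The application's Lambda function is written in JavaScript and its bodies are elided here. Purely to illustrate the first step ("Base64 decode the Kinesis payload and parse JSON"), here is a Python sketch of the same idea. It assumes the standard Kinesis event shape delivered to Lambda, where each record carries its payload under record["kinesis"]["data"]; the function name is hypothetical and this is not the author's code.

```python
import base64
import json


def decode_kinesis_records(event):
    """Return the JSON payloads carried by a Kinesis-triggered Lambda event."""
    tweets = []
    for record in event.get("Records", []):
        payload = base64.b64decode(record["kinesis"]["data"])
        tweets.append(json.loads(payload.decode("utf-8")))
    return tweets


# A hand-built event with one record, for illustration only.
sample_payload = base64.b64encode(json.dumps({"text": "hello"}).encode("utf-8"))
sample_event = {"Records": [{"kinesis": {"data": sample_payload.decode("ascii")}}]}
print(decode_kinesis_records(sample_event))
```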


Configure Lambda execution policy


{
  "Statement": [
    {
      "Action": [ "logs:*" ],
      "Effect": "Allow",
      "Resource": "arn:aws:logs:{region}:{awsAccountId}:log-group:/aws/lambda/{lambdaFunctionName}:*"
    },
    {
      "Action": [ "sns:publish" ],
      "Effect": "Allow",
      "Resource": "arn:aws:sns:{region}:{awsAccountId}:{snsTopic}"
    },
    {
      "Action": [ "machinelearning:GetMLModel", "machinelearning:Predict" ],
      "Effect": "Allow",
      "Resource": "arn:aws:machinelearning:{region}:{awsAccountId}:mlmodel/{mlModelId}"
    },
    {
      "Action": [ "kinesis:ReadStream", "kinesis:GetRecords", "kinesis:GetShardIterator", "kinesis:DescribeStream", "kinesis:ListStreams" ],
      "Effect": "Allow",
      "Resource": "arn:aws:kinesis:{region}:{awsAccountId}:stream/{kinesisStream}"
    }
  ]
}

What each statement in the policy allows, in order:

• Allow request logging (logs:*)
• Allow publication of notifications to SNS topic (sns:publish)
• Allow calls to Amazon ML real-time prediction APIs (machinelearning:GetMLModel, machinelearning:Predict)
• Allow reading of data from Kinesis stream (kinesis:* actions listed above)

Connect Kinesis stream and Lambda function

import boto

aws_lambda = boto.connect_awslambda()

aws_lambda.add_event_source(
    event_source='arn:aws:kinesis:' + region + ':' + aws_account_id + ':stream/' + "tweetStream",
    function_name="process_tweets",
    role='arn:aws:iam::' + aws_account_id + ':role/' + lambda_execution_role)


Amazon ML real-time predictions test

Here is a tweet:


Here is the same tweet… as a JSON blob:

{
  "statuses_count": "8617",
  "description": "Software Developer",
  "friends_count": "96",
  "text": "`scala-aws-s3`  A Simple Amazon #S3 Wrapper for #Scala 1.10.20 available :",
  "verified": "False",
  "geo_enabled": "True",
  "uid": "3800711",
  "favourites_count": "36",
  "screen_name": "turutosiya",
  "followers_count": "640",
  "": "Toshiya TSURU",
  "sid": "647222291672100864"
}
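Every value in the blob above is a string, which matches the real-time prediction API's record format (a flat map of strings to strings). A small sketch of preparing such a record from mixed-type tweet fields follows; to_ml_record and the sample field values are hypothetical.

```python
def to_ml_record(tweet_fields):
    """Convert a flat dict of tweet fields into a string-to-string record."""
    record = {}
    for key, value in tweet_fields.items():
        if value is None:
            continue  # omit missing values rather than sending the text "None"
        record[key] = str(value)
    return record


# Hypothetical raw fields with native Python types.
raw = {"friends_count": 96, "verified": False, "description": "Software Developer"}
print(to_ml_record(raw))
```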



Let’s use the AWS Command Line Interface to request a prediction for this tweet:

aws machinelearning predict \
    --ml-model-id ml-tweets \
    --record '<json_blob>'

{
  "Prediction": {
    "predictedLabel": "0",
    "predictedScores": {
      "0": 0.012336540967226028
    },
    "details": {
      "PredictiveModelType": "BINARY",
      "Algorithm": "SGD"
    }
  }
}
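An application consuming this response needs to pull out the label and score and decide whether to notify an agent. A minimal sketch based only on the response shape shown above; the function name and the convention that label "1" means "actionable" are assumptions.

```python
def interpret_prediction(response):
    """Return (label, score, notify) from a real-time prediction response."""
    prediction = response["Prediction"]
    label = prediction["predictedLabel"]
    # In the sample output, the score is keyed by the predicted label.
    score = prediction["predictedScores"].get(label)
    notify = (label == "1")  # assumed convention: "1" means CS-actionable
    return label, score, notify


sample_response = {
    "Prediction": {
        "predictedLabel": "0",
        "predictedScores": {"0": 0.012336540967226028},
        "details": {"PredictiveModelType": "BINARY", "Algorithm": "SGD"},
    }
}
print(interpret_prediction(sample_response))
```

For the sample tweet the label is "0", so under this convention no notification would be sent.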


Recap: Our application’s data flow

Twitter API Amazon Kinesis


Amazon Machine Learning



End-to-end application demo


Generalizing to more feedback channels

[Diagram components: Amazon Kinesis; Model 1, Model 2, Model 3; Amazon SNS]


What’s next?

Try the service:

Download the Social Media Listening application code:

Get in touch: [email protected]


Thank you!