amazon machine learning #awsloft berlin

49
Amazon Machine Learning Barbara Pogorzelska, Technical Program Manager

Upload: aws-germany

Post on 14-Jan-2017

1.043 views

Category:

Technology


2 download

TRANSCRIPT

Page 1: Amazon Machine Learning #AWSLoft Berlin

Amazon Machine Learning

Barbara Pogorzelska,

Technical Program Manager

Page 2: Amazon Machine Learning #AWSLoft Berlin

Agenda

• Machine learning and the data ecosystem

• Smart applications by example (and counter-example)

• Amazon Machine Learning features and benefits

• Developing with Amazon ML

• Q&A

Page 3: Amazon Machine Learning #AWSLoft Berlin

Three types of data-driven development

Retrospectiveanalysis and

reporting

Amazon RedshiftAmazon RDS Amazon S3

Amazon EMR

Page 4: Amazon Machine Learning #AWSLoft Berlin

Three types of data-driven development

Retrospectiveanalysis and

reporting

Here-and-nowreal-time processing

and dashboards

Amazon Kinesis Amazon EC2 AWS Lambda

Amazon Redshift, Amazon RDS Amazon S3

Amazon EMR

Page 5: Amazon Machine Learning #AWSLoft Berlin

Three types of data-driven development

Retrospectiveanalysis and

reporting

Here-and-nowreal-time processing

and dashboards

Predictionsto enable smart

applications

Amazon Kinesis Amazon EC2 AWS Lambda

Amazon Redshift, Amazon RDS Amazon S3

Amazon EMR

Page 6: Amazon Machine Learning #AWSLoft Berlin

Machine learning and smart applications

Machine learning is the technology that automatically finds

patterns in your data and uses them to make predictions for

new data points as they become available

Page 7: Amazon Machine Learning #AWSLoft Berlin

Machine learning and smart applications

Machine learning is the technology that automatically finds

patterns in your data and uses them to make predictions for

new data points as they become available

Your data + machine learning = smart applications

Page 8: Amazon Machine Learning #AWSLoft Berlin

Smart applications by example

Based on what you know about the user:

Will they use your product?

Page 9: Amazon Machine Learning #AWSLoft Berlin

Smart applications by example

Based on what you know about the user:

Will they use your product?

Based on what you know about an order:

Is this order fraudulent?

Page 10: Amazon Machine Learning #AWSLoft Berlin

Smart applications by example

Based on what you know about the user:

Will they use your product?

Based on what you know about an order:

Is this order fraudulent?

Based on what you know about a news article:

What other articles are interesting?

Page 11: Amazon Machine Learning #AWSLoft Berlin

And a few more examples…

Fraud detection Detecting fraudulent transactions, filtering spam emails, flagging suspicious reviews, …

Personalization Recommending content, predictive content loading, improving user experience, …

Targeted marketing Matching customers and offers, choosing marketing campaigns, cross-selling and up-selling, …

Content classification Categorizing documents, matching hiring managers and resumes, …

Churn prediction Finding customers who are likely to stop using the service, free-tier upgrade targeting, …

Customer support Predictive routing of customer emails, social media listening, …

Page 12: Amazon Machine Learning #AWSLoft Berlin

Building smart applications – a counter-pattern

Dear Alex,

This awesome quadcopter is on sale

for just $49.99!

Page 13: Amazon Machine Learning #AWSLoft Berlin

Smart applications by counter-example

SELECT c.ID

FROM customers c

LEFT JOIN orders o

ON c.ID = o.customer

GROUP BY c.ID

HAVING o.date > GETDATE() – 30

We can start by sending the

offer to all customers who

placed an order in the last 30

days

Page 14: Amazon Machine Learning #AWSLoft Berlin

Smart applications by counter-example

SELECT c.ID

FROM customers c

LEFT JOIN orders o

ON c.ID = o.customer

GROUP BY c.ID

HAVING

AND o.date > GETDATE() – 30

… let’s narrow it down to just

customers who bought toys

Page 15: Amazon Machine Learning #AWSLoft Berlin

Smart applications by counter-example

SELECT c.ID

FROM customers c

LEFT JOIN orders o

ON c.ID = o.customer

GROUP BY c.ID

HAVING o.category = ‘toys’

AND

(COUNT(*) > 2

AND SUM(o.price) > 200

AND o.date > GETDATE() – 30)

)

… and expand the query to customers who purchased other toy helicopters recently

Page 16: Amazon Machine Learning #AWSLoft Berlin

Smart applications by counter-example

SELECT c.ID

FROM customers c

LEFT JOIN orders o

ON c.ID = o.customer

LEFT JOIN products p

ON p.ID = o.product

GROUP BY c.ID

HAVING o.category = ‘toys’

AND ((p.description LIKE ‘% %’

AND o.date > GETDATE() - 60)

OR (COUNT(*) > 2

AND SUM(o.price) > 200

AND o.date > GETDATE() – 30)

)

… but what about

quadcopters?

Page 17: Amazon Machine Learning #AWSLoft Berlin

Smart applications by counter-example

SELECT c.ID

FROM customers c

LEFT JOIN orders o

ON c.ID = o.customer

LEFT JOIN products p

ON p.ID = o.product

GROUP BY c.ID

HAVING o.category = ‘toys’

AND ((p.description LIKE ‘%copter%’

AND o.date > GETDATE() - )

OR (COUNT(*) > 2

AND SUM(o.price) > 200

AND o.date > GETDATE() – 30)

)

… maybe we should go back further in time

Page 18: Amazon Machine Learning #AWSLoft Berlin

Smart applications by counter-example

SELECT c.ID

FROM customers c

LEFT JOIN orders o

ON c.ID = o.customer

LEFT JOIN products p

ON p.ID = o.product

GROUP BY c.ID

HAVING o.category = ‘toys’

AND ((p.description LIKE ‘%copter%’

AND o.date > GETDATE() - 120)

OR (COUNT(*) > 2

AND SUM(o.price) > 200

AND o.date > GETDATE() – )

)

… tweak the query more

Page 19: Amazon Machine Learning #AWSLoft Berlin

Smart applications by counter-example

SELECT c.ID

FROM customers c

LEFT JOIN orders o

ON c.ID = o.customer

LEFT JOIN products p

ON p.ID = o.product

GROUP BY c.ID

HAVING o.category = ‘toys’

AND ((p.description LIKE ‘%copter%’

AND o.date > GETDATE() - 120)

OR (COUNT(*) > 2

AND SUM(o.price) >

AND o.date > GETDATE() – 40)

)

… again

Page 20: Amazon Machine Learning #AWSLoft Berlin

Smart applications by counter-example

SELECT c.ID

FROM customers c

LEFT JOIN orders o

ON c.ID = o.customer

LEFT JOIN products p

ON p.ID = o.product

GROUP BY c.ID

HAVING o.category = ‘toys’

AND ((p.description LIKE ‘%copter%’

AND o.date > GETDATE() - )

OR (COUNT(*) > 2

AND SUM(o.price) > 150

AND o.date > GETDATE() – 40)

)

… and again

Page 21: Amazon Machine Learning #AWSLoft Berlin

Smart applications by counter-example

SELECT c.ID

FROM customers c

LEFT JOIN orders o

ON c.ID = o.customer

LEFT JOIN products p

ON p.ID = o.product

GROUP BY c.ID

HAVING o.category = ‘toys’

AND ((p.description LIKE ‘%copter%’

AND o.date > GETDATE() - )

OR (COUNT(*) > 2

AND SUM(o.price) > 150

AND o.date > GETDATE() – 40)

)

Use machine learning

technology to learn

your business rules

from data!

Page 22: Amazon Machine Learning #AWSLoft Berlin

Why aren’t there more smart applications?

1. Machine learning expertise is rare

2. Building and scaling machine learning technology is hard

3. Closing the gap between models and applications is time-consuming and

expensive

Page 23: Amazon Machine Learning #AWSLoft Berlin

Building smart applications today

Expertise Technology Operationalization

Limited supply of data scientists

Many choices, few mainstays

Complex and error-prone data workflows

Expensive to hire or outsource

Difficult to use and scale Custom platforms and APIs

Many moving pieces lead to custom solutions every time

Reinventing the model lifecycle management wheel

Page 24: Amazon Machine Learning #AWSLoft Berlin

What if there were a better way?

Page 25: Amazon Machine Learning #AWSLoft Berlin

Introducing Amazon ML

Easy to use, managed machine learning service built

for developers

Robust, powerful machine learning technology

based on Amazon’s internal systems

Create models using your data already stored in the

AWS cloud

Deploy models to production in seconds

Page 26: Amazon Machine Learning #AWSLoft Berlin

Easy to use and developer-friendly

Use the intuitive, powerful service console to build and

explore your initial models

• Data retrieval

• Model training, quality evaluation, fine-tuning

• Deployment and management

Automate model lifecycle with fully featured APIs and SDKs

• Java, Python, .NET, JavaScript, Ruby, PHP

Easily create smart iOS and Android applications with AWS

Mobile SDK

Page 27: Amazon Machine Learning #AWSLoft Berlin

Powerful machine learning technology

Based on Amazon’s battle-hardened internal systems

Not just the algorithms:

• Smart data transformations

• Input data and model quality alerts

• Built-in industry best practices

Grows with your needs

• Train on up to 100 GB of data

• Generate billions of predictions

• Obtain predictions in batches or real-time

Page 28: Amazon Machine Learning #AWSLoft Berlin

Integrated with AWS Data Ecosystem

Access data that is stored in S3, Amazon Redshift, or

MySQL databases in RDS

Output predictions to S3 for easy integration with your data

flows

Use AWS Identity and Access Management (IAM) for fine-

grained data-access permission policies

Page 29: Amazon Machine Learning #AWSLoft Berlin

Fully-managed model and prediction services

End-to-end service, with no servers to provision and

manage

One-click production model deployment

Programmatically query model metadata to enable

automatic retraining workflows

Monitor prediction usage patterns with Amazon

CloudWatch metrics

Page 30: Amazon Machine Learning #AWSLoft Berlin

Pay-as-you-go and inexpensive

Data analysis, model training, and evaluation:

$0.42/instance hour

Batch predictions: $0.10/1000

Real-time predictions: $0.10/1000

+ hourly capacity reservation charge

Page 31: Amazon Machine Learning #AWSLoft Berlin

Three Supported Types of Predictions

Binary Classification: predict the answer to a yes/no question

• Is this order fraudulent?

• Will this customer convert?

• Which article should I show next?

Multi-class classification: predict the correct category from list:

• What is the genre of this movie?

• What is the root cause of this customer contact?

Regression: predict the value of a numeric variable

• How many units of this item will sell next week?

• How long will this user session last?

Page 32: Amazon Machine Learning #AWSLoft Berlin

Buildmodel

Evaluate andoptimize

Retrieve predictions

1 2 3

Building smart applications with Amazon ML

Page 33: Amazon Machine Learning #AWSLoft Berlin

Trainmodel

Evaluate andoptimize

Retrieve predictions

1 2 3

Building smart applications with Amazon ML

• Create a Datasource object pointing to your data• Explore and understand your data• Transform data and train your model

Page 34: Amazon Machine Learning #AWSLoft Berlin

Create a Datasource object

>>> import boto

>>> ml = boto.connect_machinelearning()

>>> ds = ml.create_data_source_from_s3(

data_source_id = ’my_datasource',

data_spec= {

'DataLocationS3':'s3://bucket/input/',

'DataSchemaLocationS3':'s3://bucket/input/.schema'},

compute_statistics = True)

Page 35: Amazon Machine Learning #AWSLoft Berlin

Explore and understand your data

Page 36: Amazon Machine Learning #AWSLoft Berlin

Train your model

>>> import boto

>>> ml = boto.connect_machinelearning()

>>> model = ml.create_ml_model(

ml_model_id=’my_model',

ml_model_type='REGRESSION',

training_data_source_id='my_datasource')

Page 37: Amazon Machine Learning #AWSLoft Berlin

Trainmodel

Evaluate andoptimize

Retrieve predictions

1 2 3

Building smart applications with Amazon ML

• Understand model quality• Adjust model interpretation

Page 38: Amazon Machine Learning #AWSLoft Berlin

Explore model quality

Page 39: Amazon Machine Learning #AWSLoft Berlin

Fine-tune model interpretation

Page 40: Amazon Machine Learning #AWSLoft Berlin

Fine-tune model interpretation

Page 41: Amazon Machine Learning #AWSLoft Berlin

Trainmodel

Evaluate andoptimize

Retrieve predictions

1 2 3

Building smart applications with Amazon ML

• Batch predictions• Real-time predictions

Page 42: Amazon Machine Learning #AWSLoft Berlin

Batch predictions

Asynchronous, large-volume prediction generation

Request through service console or API

Best for applications that deal with batches of data records

>>> import boto

>>> ml = boto.connect_machinelearning()

>>> model = ml.create_batch_prediction(

batch_prediction_id = 'my_batch_prediction’

batch_prediction_data_source_id = ’my_datasource’

ml_model_id = ’my_model',

output_uri = 's3://examplebucket/output/’)

Page 43: Amazon Machine Learning #AWSLoft Berlin

Real-time predictions

Synchronous, low-latency, high-throughput prediction generation

Request through service API or server or mobile SDKs

Best for interaction applications that deal with individual data records

>>> import boto

>>> ml = boto.connect_machinelearning()

>>> ml.predict(

ml_model_id=’my_model',

predict_endpoint=’example_endpoint’,

record={’key1':’value1’, ’key2':’value2’})

{

'Prediction': {

'predictedValue': 13.284348,

'details': {

'Algorithm': 'SGD',

'PredictiveModelType': 'REGRESSION’

}

}

}

Page 44: Amazon Machine Learning #AWSLoft Berlin

Architecture patterns for smart applications

Page 45: Amazon Machine Learning #AWSLoft Berlin

Batch predictions with EMR

Query for predictions with Amazon ML batch API

Process data with EMR

Raw data in S3Aggregated data

in S3

Predictions in S3 Your application

Page 46: Amazon Machine Learning #AWSLoft Berlin

Batch predictions with Amazon Redshift

Structured dataIn Amazon Redshift

Load predictions into Amazon Redshift

-or-Read prediction results directly

from S3

Predictions in S3

Query for predictions with Amazon ML batch API

Your application

Page 47: Amazon Machine Learning #AWSLoft Berlin

Real-time predictions for interactive applications

Your application

Query for predictions with Amazon ML real-time API

Page 48: Amazon Machine Learning #AWSLoft Berlin

Adding predictions to an existing data flow

Your applicationAmazon

DynamoDB

+

Trigger event with Lambda+

Query for predictions with Amazon ML real-time API

Page 49: Amazon Machine Learning #AWSLoft Berlin

aws.amazon.com/de/activate

Everything and Anything Startups

Need to Get Started on AWS