Sports Category prediction with Amazon ML @nafu003 ookami, Inc. 2015/04/26

Upload: fumiya-nakamura

Post on 17-Jan-2017


Page 1: Sports Category Prediction with Amazon ML

Sports Category prediction with Amazon ML

@nafu003 ookami, Inc. 2015/04/26

Page 2: Sports Category Prediction with Amazon ML

What is Amazon ML?

“Amazon Machine Learning (Amazon ML) is a robust AWS machine learning platform in the cloud that allows software developers to train predictive models and use them to create powerful predictive applications.”

Amazon Machine Learning Developer Guide

Page 3: Sports Category Prediction with Amazon ML

Amazon ML Key Concepts

• Datasources contain metadata associated with data inputs to Amazon ML

• ML models generate predictions using the patterns extracted from the input data

• Evaluations measure the quality of ML models

• Batch predictions asynchronously generate predictions for multiple input data observations

• Real-time predictions synchronously generate predictions for individual data observations

Amazon Machine Learning Developer Guide

Page 4: Sports Category Prediction with Amazon ML

ML models

• Binary

• Multiclass

• Regression
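
Which of the three model types applies depends on the target column. The sketch below is only an illustration of that choice: `choose_model_type` is a hypothetical helper, the category names are made up, and the float-means-regression rule is a simplifying assumption rather than anything Amazon ML does for you.

```python
from typing import Sequence


def choose_model_type(targets: Sequence[object]) -> str:
    """Pick one of the three model types from the target column values.

    Assumption: float targets mean a numeric quantity (regression),
    exactly two distinct labels mean binary, anything else is multiclass.
    """
    distinct = set(targets)
    if distinct and all(isinstance(t, float) for t in distinct):
        return "REGRESSION"
    if len(distinct) == 2:
        return "BINARY"
    return "MULTICLASS"


# Hypothetical sports categories: more than two labels, so multiclass.
print(choose_model_type(["soccer", "baseball", "tennis", "soccer"]))
```

Predicting a sports category from a news item has many possible labels, so under these assumptions it falls under the multiclass type.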

Page 6: Sports Category Prediction with Amazon ML

Datasources

• Extracted from the ookami news database

• 99999 records

• 69730 training data

• 30433 evaluation data

• Each row has

  • news title

  • news summary

  • news url

  • sports category name
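
A split into training and evaluation data like the one above could be produced with a shuffled cut. This is a minimal sketch, not the method used for the talk: `NewsRow`, `split_rows`, the 0.7 fraction, and the example URLs are all assumptions introduced for illustration.

```python
import random
from typing import List, NamedTuple, Sequence, Tuple


class NewsRow(NamedTuple):
    """One record with the four fields listed on the slide."""
    title: str
    summary: str
    url: str
    category: str


def split_rows(
    rows: Sequence[NewsRow], train_fraction: float = 0.7, seed: int = 0
) -> Tuple[List[NewsRow], List[NewsRow]]:
    """Shuffle a copy of the rows and cut it into training and evaluation parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical rows; real data would come from the news database.
rows = [
    NewsRow(f"title {i}", f"summary {i}", f"http://example.com/{i}", "soccer")
    for i in range(10)
]
train, evaluation = split_rows(rows)
print(len(train), len(evaluation))
```

Shuffling before the cut matters when the export is ordered by date or category, because an unshuffled cut would then give the two parts different category mixes.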

Page 7: Sports Category Prediction with Amazon ML

Better than the baseline

• Average F1 score: 0.49

• Baseline F1 score: 0.02
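
To make the comparison concrete, here is a small sketch of how an average F1 and a baseline could be computed. It rests on two assumptions the slide does not state: that "average" means the unweighted (macro) mean of the per-category F1 scores, and that the baseline is a model that always predicts the most frequent category. The function names and the toy labels are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def per_class_f1(actual: Sequence[str], predicted: Sequence[str], label: str) -> float:
    """F1 for one category from its counts: 2*TP / (2*TP + FP + FN)."""
    tp = sum(a == label and p == label for a, p in zip(actual, predicted))
    fp = sum(a != label and p == label for a, p in zip(actual, predicted))
    fn = sum(a == label and p != label for a, p in zip(actual, predicted))
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def macro_f1(actual: Sequence[str], predicted: Sequence[str]) -> float:
    """Unweighted mean of the per-category F1 scores."""
    labels = sorted(set(actual) | set(predicted))
    return sum(per_class_f1(actual, predicted, label) for label in labels) / len(labels)


def majority_baseline(actual: Sequence[str]) -> List[str]:
    """Assumed baseline: predict the most frequent category for every record."""
    most_common = Counter(actual).most_common(1)[0][0]
    return [most_common] * len(actual)


# Toy data: 4 soccer, 2 baseball, 2 tennis records.
actual = ["soccer"] * 4 + ["baseball"] * 2 + ["tennis"] * 2
print(round(macro_f1(actual, majority_baseline(actual)), 3))  # 0.222
```

Under these assumptions the baseline scores zero on every category except the most frequent one, which is why a baseline macro F1 shrinks as the number of categories grows.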

Page 8: Sports Category Prediction with Amazon ML

F1 score: a measure of a test's accuracy

$$F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$

The F1 score equals two times precision times recall, divided by precision plus recall.
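
As a quick worked example of the formula, the sketch below computes F1 from a precision and a recall value. The function name `f1_score` and the zero-division convention (returning 0.0 when both inputs are zero) are assumptions made for this illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall: 2 * P * R / (P + R)."""
    if precision + recall == 0:
        return 0.0  # assumed convention when there are no correct positives
    return 2 * precision * recall / (precision + recall)


print(round(f1_score(0.5, 1.0), 3))  # 0.667
print(round(f1_score(0.9, 0.1), 3))  # 0.18
```

The second call shows why F1 is preferred over a plain average here: high precision cannot compensate for very low recall.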

Page 9: Sports Category Prediction with Amazon ML

http://en.wikipedia.org/wiki/Precision_and_recall

• True positive

  • Predicted true, and it is actually true

• False negative

  • Predicted false, but it is actually true
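
The outcomes above, together with their two counterparts (false positive and true negative), can be counted for a single category as sketched below. `confusion_counts`, `precision_recall`, and the toy labels are hypothetical names and data used only for illustration.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(
    actual: Sequence[str], predicted: Sequence[str], positive: str
) -> Dict[str, int]:
    """Count the four outcomes, treating one category as the 'true' class."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            counts["tp"] += 1  # predicted true, actually true
        elif p == positive:
            counts["fp"] += 1  # predicted true, actually false
        elif a == positive:
            counts["fn"] += 1  # predicted false, actually true
        else:
            counts["tn"] += 1  # predicted false, actually false
    return counts


def precision_recall(counts: Dict[str, int]) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


actual = ["soccer", "soccer", "tennis", "tennis", "soccer"]
predicted = ["soccer", "tennis", "soccer", "tennis", "soccer"]
counts = confusion_counts(actual, predicted, positive="soccer")
print(counts)  # {'tp': 2, 'fp': 1, 'fn': 1, 'tn': 1}
```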

Page 10: Sports Category Prediction with Amazon ML

71.65% is predicted as Soccer; 39.50% actually is Soccer

Page 11: Sports Category Prediction with Amazon ML

We can improve further

• Collect more data

• Choose the right categories

• Make sure each category has a similar number of records
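
One possible way to give each category a similar number of records is to downsample the larger categories to the size of the smallest one. The slide does not say how the balancing would be done, so this is only a sketch: `downsample_to_smallest` and the toy rows are hypothetical.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, List, Sequence, Tuple

Row = Tuple[str, str]  # (news text, sports category name)


def downsample_to_smallest(rows: Sequence[Row], seed: int = 0) -> List[Row]:
    """Randomly keep the same number of rows per category (the smallest count)."""
    if not rows:
        return []
    by_category: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        by_category[row[1]].append(row)
    target = min(len(group) for group in by_category.values())
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    balanced: List[Row] = []
    for group in by_category.values():
        balanced.extend(rng.sample(group, target))
    return balanced


# Toy data: 5 soccer rows and 2 baseball rows.
rows = [(f"soccer news {i}", "soccer") for i in range(5)]
rows += [(f"baseball news {i}", "baseball") for i in range(2)]
balanced = downsample_to_smallest(rows)
print(Counter(category for _, category in balanced))
```

Downsampling throws records away, which works against collecting more data, so gathering additional records for the rare categories is the alternative when that is feasible.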

Page 12: Sports Category Prediction with Amazon ML

References

• Amazon Machine Learning Developer Guide

• http://en.wikipedia.org/wiki/F1_score

• http://en.wikipedia.org/wiki/Precision_and_recall

• http://www.ees.dendai.ac.jp/Lectures/TechEng-Horio/HowToRead.pdf