Intro to Machine Learning in Python

INTRO TO MACHINE LEARNING IN PYTHON Russel Mahmud @PyCon Dhaka 2014

Upload: russel-mahmud

Post on 11-Aug-2014

DESCRIPTION

Machine Learning Basics, Introduction to Scikit-learn, Web traffic prediction, Cross-validation

TRANSCRIPT

Page 1: Intro To Machine Learning in Python

INTRO TO MACHINE LEARNING IN PYTHON

Russel Mahmud @PyCon Dhaka 2014

Page 2: Intro To Machine Learning in Python

Who am I?

Machine Learning in Bangladesh

Software Engineer @NewsCred

Passionate about Big Data, Analytics and ML

https://github.com/livewithpython/sklearn-pycon-2014#LiveWithPython

Page 3: Intro To Machine Learning in Python

Agenda

Machine Learning Basics
Introduction to Scikit-learn
A simple example
Conclusion
Q&A

Page 4: Intro To Machine Learning in Python

Story 1 : PredPol (Predictive Policing)

Predict crime in real time.


Page 5: Intro To Machine Learning in Python

Story 2 : YouTube Neuron

Google’s artificial brain learns to find cats

Page 6: Intro To Machine Learning in Python

What is Machine Learning?

Field of study that gives computers the ability to learn without being explicitly programmed.

- Arthur Samuel

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

- Tom M. Mitchell

Page 7: Intro To Machine Learning in Python

Algorithm types

Supervised Learning
Unsupervised Learning
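
As a rough illustration of the two types, the sketch below uses scikit-learn on small made-up data: a supervised model learns from inputs paired with known targets, while an unsupervised model only sees the inputs. The data, the variable names, and the choice of LinearRegression and KMeans are assumptions for illustration, not taken from the talk.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

# Supervised learning: inputs X come with known targets y (here y = 2x + 1).
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
reg = LinearRegression().fit(X, y)
prediction = reg.predict(np.array([[4.0]]))  # estimate the target for an unseen input

# Unsupervised learning: only inputs, no targets; the model looks for structure.
points = np.array([[0.0], [0.2], [0.1], [10.0], [10.2], [10.1]])
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
labels = km.labels_  # cluster index assigned to each point
```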

Page 8: Intro To Machine Learning in Python

Python Tools for Machine Learning

Scikit-learn
Statsmodels
PyMC
Shogun
Orange
...

Page 9: Intro To Machine Learning in Python

Scikit-learn

Simple and efficient for data mining and data analysis

Open source, commercially usable

It’s much faster than other libraries

It’s built on NumPy, SciPy and matplotlib

Page 10: Intro To Machine Learning in Python

Scikit-learn

Simple and consistent API

Instantiate the model
    m = Model()

Fit the model
    m.fit(train_data, target) or m.fit(train_data)

Predict
    m.predict(test_data)

Evaluate
    m.score(train_data, target)
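
The four steps above can be sketched with a concrete estimator. This is a minimal example, assuming LinearRegression as the model and a tiny synthetic dataset; neither comes from the slides, which use the placeholder name Model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration): target = 3 * x + 2.
train_data = np.arange(10, dtype=float).reshape(-1, 1)
target = 3.0 * train_data.ravel() + 2.0
test_data = np.array([[10.0], [11.0]])

m = LinearRegression()            # instantiate the model
m.fit(train_data, target)         # fit the model
predicted = m.predict(test_data)  # predict for unseen inputs
r2 = m.score(train_data, target)  # evaluate: R^2 on the given data
```

Estimators that learn without targets, such as clustering models, are fitted with m.fit(train_data) alone.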

Page 11: Intro To Machine Learning in Python

Example : Web Traffic Prediction

Current limit : 100,000 hits/hour

Predict the right time to allocate sufficient resources

Page 12: Intro To Machine Learning in Python

Reading in the data

Page 13: Intro To Machine Learning in Python

Preparing the data
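
A minimal sketch of the reading and preparing steps, assuming the traffic data is a tab-separated table of (hour, hits) in which some hit counts are missing. The inline sample values and the column layout are hypothetical stand-ins, not the talk's dataset.

```python
import io
import numpy as np

# Hypothetical sample: hour index <TAB> hits in that hour; "nan" marks a missing value.
raw = io.StringIO("1\t120\n2\tnan\n3\t135\n4\t150\n5\tnan\n6\t160\n")

# Reading in the data: one row per hour, two columns.
data = np.genfromtxt(raw, delimiter="\t")

# Preparing the data: split the columns and drop rows with missing hit counts.
x = data[:, 0]
y = data[:, 1]
valid = ~np.isnan(y)
x, y = x[valid], y[valid]
```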

Page 14: Intro To Machine Learning in Python

Taking a peek

Page 15: Intro To Machine Learning in Python

Model Selection

Page 16: Intro To Machine Learning in Python

Simple Model

Page 17: Intro To Machine Learning in Python

Playing around

Model             Residual Score
Linear            0.4163
RandomForest      0.952
RidgeRegression   0.7665
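
A comparison like this one can be sketched as follows, assuming the scores come from calling score on the same data the models were trained on. The synthetic data and the hyperparameters are assumptions, so the printed numbers will not match the slide.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge

# Synthetic, noisy, non-linear data (an assumption; not the talk's dataset).
rng = np.random.RandomState(0)
X = np.linspace(0.0, 10.0, 200).reshape(-1, 1)
y = np.sin(X.ravel()) * 5.0 + X.ravel() + rng.normal(scale=1.0, size=200)

models = {
    "Linear": LinearRegression(),
    "RandomForest": RandomForestRegressor(n_estimators=50, random_state=0),
    "RidgeRegression": Ridge(alpha=1.0),
}

# Score each model on the same data it was trained on (R^2 for regressors).
train_scores = {}
for name, model in models.items():
    model.fit(X, y)
    train_scores[name] = model.score(X, y)
    print(f"{name:16s} {train_scores[name]:.4f}")
```

A high score here only says the model reproduces its training data well; it says nothing yet about unseen data.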

Page 18: Intro To Machine Learning in Python

Taking a closer look

Page 19: Intro To Machine Learning in Python

Underfitting and Overfitting

Underfitting (aka high bias): the model is very simple

Overfitting (aka high variance): the model is excessively complex
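
One way to see both effects is to fit polynomials of increasing degree to noisy data and compare the error on the training points with the error on held-out points. The sketch below is an illustration on synthetic data; the degrees, the noise level, and the helper name mse are assumptions.

```python
import numpy as np

# Synthetic data (an assumption): one period of a sine wave plus noise.
rng = np.random.RandomState(0)
x = rng.uniform(0.0, 1.0, size=40)
y = np.sin(2.0 * np.pi * x) + rng.normal(scale=0.2, size=40)
x_train, y_train = x[:20], y[:20]
x_test, y_test = x[20:], y[20:]


def mse(coeffs, xs, ys):
    """Mean squared error of the polynomial given by coeffs on (xs, ys)."""
    return float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))


# errors[degree] = (training error, held-out error)
errors = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    errors[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(f"degree {degree}: train={errors[degree][0]:.4f} test={errors[degree][1]:.4f}")
```

Typically the degree-1 fit has a high error on both sets (underfitting), while a high degree keeps pushing the training error down without a matching improvement on held-out points (overfitting). The exact numbers depend on the random data.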

Page 20: Intro To Machine Learning in Python

Evaluation

Measure performance using cross-validation

Model             Cross Validation Score
Linear            0.4450
RandomForest      0.6519
RidgeRegression   0.7256
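
Cross-validated scores of this kind can be computed with scikit-learn's cross_val_score, which fits the model on part of the data and scores it on the held-out part, once per fold. This is a sketch on synthetic data with 5 shuffled folds; the data, the fold count, and the hyperparameters are assumptions, so the numbers will differ from the slide.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_score

# Synthetic, noisy, non-linear data (an assumption; not the talk's dataset).
rng = np.random.RandomState(0)
X = np.linspace(0.0, 10.0, 200).reshape(-1, 1)
y = np.sin(X.ravel()) * 5.0 + X.ravel() + rng.normal(scale=1.0, size=200)

models = {
    "Linear": LinearRegression(),
    "RandomForest": RandomForestRegressor(n_estimators=50, random_state=0),
    "RidgeRegression": Ridge(alpha=1.0),
}

# Shuffle before splitting because the rows are ordered by x.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

cv_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv)  # one R^2 per held-out fold
    cv_scores[name] = scores.mean()
    print(f"{name:16s} {cv_scores[name]:.4f}")
```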

Page 21: Intro To Machine Learning in Python

Example : Solution
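
One possible way to answer the example's question is to fit a trend to the traffic and solve for the time at which it reaches the 100,000 hits/hour limit. The sketch below assumes a degree-2 polynomial trend and synthetic, noise-free traffic; both are illustrative assumptions rather than the method shown in the talk.

```python
import numpy as np

LIMIT = 100_000  # current capacity from the example: 100,000 hits/hour

# Synthetic traffic (an assumption): hits/hour grows quadratically with time.
hours = np.arange(0, 500, dtype=float)
hits = 1000.0 + 2.0 * hours + 0.05 * hours ** 2

# Fit a degree-2 polynomial, then solve model(t) = LIMIT for t.
model = np.poly1d(np.polyfit(hours, hits, 2))
roots = (model - LIMIT).roots

# Keep only real roots that lie after the last observed hour.
future = [r.real for r in roots if abs(r.imag) < 1e-9 and r.real > hours[-1]]
hour_at_limit = min(future)
print(f"Predicted to reach {LIMIT} hits/hour around hour {hour_at_limit:.0f}")
```

Extrapolating this far beyond the observed range is only as trustworthy as the chosen trend, which is why the model comparison and cross-validation steps come first.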

Page 22: Intro To Machine Learning in Python

Conclusion

Python is Awesome

Scikit-learn makes it more Awesome

Page 24: Intro To Machine Learning in Python

Q&A