Think Machine Learning with Scikit-learn (Python)

Post on 18-Jan-2017

By: Chetan Khatri, Principal Big Data Engineer, Nazara Technologies.

Data Science Lab, The Department of Computer Science, University of Kachchh.

About me

- Principal Big Data Engineer, Nazara Technologies.
- Technical Reviewer – Packt Publication.
- Ex. Developer - Eccella Corporation.
- Alumni, The Department of Computer Science, KSKV Kachchh University.

Outline

- An Introduction to Machine Learning
- Hello World in Machine Learning with 6 lines of code
- Visualizing a Decision Tree
- Classifying Images
- Supervised Learning: Pipeline
- Writing Our First Classifier

Early AI Programs: Deep Blue

AI Programs Today

- AlphaGo is the best example: it was written to play the game of Go, but similar software can also learn to play Atari games.

Machine Learning

- Machine learning makes this possible. It is the study of algorithms that learn from examples and experience instead of relying on hard-coded rules.

- “Learns from examples and experience”

Let's take a problem

- It seems easy, but it is difficult to solve without machine learning.

Open Source Libraries

Classifier

Scikit-learn

Test: no error! Yay!
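The "no error" test presumably amounts to importing the library. A minimal sketch, assuming scikit-learn has already been installed (for example with `pip install scikit-learn`):

```python
# If this import raises no error, scikit-learn is installed and usable.
import sklearn

installed_version = sklearn.__version__
print("scikit-learn version:", installed_version)
```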

Supervised Learning

Collect Training Data -> Train Classifier -> Make Predictions

Training Data

Weight  Texture  Label
150g    Bumpy    Orange
170g    Bumpy    Orange
140g    Smooth   Apple
130g    Smooth   Apple

Features: the Weight and Texture columns

Examples: the rows of the table

Training Data: the whole table
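The three supervised-learning steps can be sketched on the fruit table in a few lines. This is an illustrative sketch, not necessarily the code shown in the talk: the numeric encoding of texture and label is an assumption (scikit-learn needs numbers), and `random_state=0` is added only to make the run repeatable.

```python
from sklearn import tree

# Assumed encoding (not given on the slide):
# texture: 0 = bumpy, 1 = smooth; label: 0 = apple, 1 = orange
features = [[150, 0], [170, 0], [140, 1], [130, 1]]  # [weight in grams, texture]
labels = [1, 1, 0, 0]

# Train a decision tree classifier on the four examples.
clf = tree.DecisionTreeClassifier(random_state=0)
clf = clf.fit(features, labels)

# Predict a fruit that was not in the training data: 160 g and bumpy.
prediction = clf.predict([[160, 0]])
label_names = {0: "apple", 1: "orange"}
print(label_names[int(prediction[0])])
```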

Important Concepts

- How does this work in the real world?
- How much training data do you need?
- How is the tree created?
- What makes a good feature?

Many Types of Classifier

- Artificial Neural Network (ANN)
- Support Vector Machine (SVM)
- k-Nearest Neighbours classifier (KNN)
- Random Forest (RF)
- Gradient Boosting Machine (GBM)
- Etc.
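In scikit-learn these classifier types share the same `fit` / `predict` interface, so one can be swapped for another without changing the rest of the code. A hedged sketch on the bundled iris dataset; the dataset, the 50/50 split, and the default hyperparameters are choices made here for illustration, and a neural network is left out to keep the run short:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.5, random_state=0
)

# Four of the classifier types listed above, all with default settings.
classifiers = {
    "SVM": SVC(),
    "KNN": KNeighborsClassifier(),
    "RF": RandomForestClassifier(random_state=0),
    "GBM": GradientBoostingClassifier(random_state=0),
}

# The training and evaluation code is identical for every classifier.
scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, clf.predict(X_test))
    print(name, round(scores[name], 3))
```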

Demo

2. Visualizing a Decision Tree
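One way to look inside a trained tree without extra tooling is a plain-text dump of its rules. The sketch below assumes the iris dataset and uses `export_text` (available in scikit-learn 0.21 and later); the talk's demo may have rendered a graphical tree instead.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
clf = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

# Each indented line is a yes/no question about one feature;
# the leaves show the predicted class index.
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)
```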

3. What Makes a Good Feature?

- Imagine we want to write a classifier to tell two types of dogs apart.

Variation in the world!

- Hands-on

About 80% of dogs at this height are labs

About 95% of dogs at this height are greyhounds

- Each feature captures a different type of information.
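The height percentages above can be explored with simulated data. This is a sketch under stated assumptions: the mean heights (28 and 24 inches), the spread (4 inches), and the sample size are illustrative values chosen here, not figures from the slides, so the printed shares will not match the slide's percentages exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
n_dogs = 5000  # per breed; arbitrary sample size

# Assumed, normally distributed heights in inches (illustrative only).
greyhound_heights = 28 + 4 * rng.standard_normal(n_dogs)
lab_heights = 24 + 4 * rng.standard_normal(n_dogs)


def lab_fraction(low, high):
    """Fraction of dogs with height in [low, high) that are Labradors."""
    labs = np.sum((lab_heights >= low) & (lab_heights < high))
    greys = np.sum((greyhound_heights >= low) & (greyhound_heights < high))
    return labs / (labs + greys)


short_lab_share = lab_fraction(18, 20)   # short dogs: mostly Labs
middle_lab_share = lab_fraction(25, 27)  # middle heights: close to a coin flip
tall_lab_share = lab_fraction(34, 36)    # tall dogs: mostly greyhounds

print("Lab share at 18-20 in:", round(float(short_lab_share), 2))
print("Lab share at 25-27 in:", round(float(middle_lab_share), 2))
print("Lab share at 34-36 in:", round(float(tall_lab_share), 2))
```

Height alone is informative at the extremes but says little in the middle of the range, which is why a second feature that captures different information helps.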

Thought Experiment

Avoid useless features

Independent features are best

- Height in inches
- Height in centimeters

- Avoid redundant features
- Features should be easy to understand
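Height in inches and height in centimeters are the same measurement in different units, so the second one adds no information. A small check, using made-up sample heights:

```python
import numpy as np

# Hypothetical sample of dog heights.
height_inches = np.array([22.0, 24.0, 26.0, 28.0, 30.0])
height_cm = height_inches * 2.54  # same information, different unit

# A correlation of 1 means one feature is fully determined by the other.
correlation = np.corrcoef(height_inches, height_cm)[0, 1]
print(correlation)
```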

- Simpler relationships are easier to learn.
- Ideal features are:
  - Informative
  - Independent
  - Simple

4. Pipeline - Machine Learning

- http://playground.tensorflow.org/
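A supervised-learning pipeline usually holds back part of the data so the classifier is scored on examples it has not seen. A minimal sketch, assuming the iris dataset, a 50/50 split, and a decision tree; none of these choices are confirmed by the slides:

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target

# 1. Split: half for training, half held back for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# 2. Train on the training half only.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

# 3. Predict on the unseen half and compare with the true labels.
predictions = clf.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print("accuracy:", round(accuracy, 3))
```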

5. Writing our first classifier

- Measure distance
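A common choice of distance is the Euclidean (straight-line) distance between two feature vectors; the slides do not say which measure the demo uses, so treat this as one possible option. The helper name is hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# A 3-4-5 right triangle: the distance from (0, 0) to (3, 4) is 5.
distance = euclidean_distance([0.0, 0.0], [3.0, 4.0])
print(distance)  # 5.0
```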

Demo

- Implement the nearest neighbor algorithm
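A bare-bones version of the idea: to classify a new point, find the closest training example and copy its label. The class name and the toy fruit data below are illustrative assumptions, and this sketch handles only the single nearest neighbor (k = 1) with Euclidean distance.

```python
import numpy as np


class NearestNeighborClassifier:
    """1-nearest-neighbor: predict the label of the closest training example."""

    def fit(self, X_train, y_train):
        # "Training" just memorizes the examples.
        self.X_train = np.asarray(X_train, dtype=float)
        self.y_train = np.asarray(y_train)
        return self

    def predict(self, X_test):
        X_test = np.asarray(X_test, dtype=float)
        predictions = []
        for row in X_test:
            # Euclidean distance from this row to every training example.
            distances = np.linalg.norm(self.X_train - row, axis=1)
            predictions.append(self.y_train[np.argmin(distances)])
        return np.array(predictions)


# Toy data: [weight in grams, texture] with texture 0 = bumpy, 1 = smooth.
X_train = [[150, 0], [170, 0], [140, 1], [130, 1]]
y_train = ["orange", "orange", "apple", "apple"]

knn = NearestNeighborClassifier().fit(X_train, y_train)
result = knn.predict([[165, 0], [135, 1]])
print(result)
```

Because the class exposes `fit` and `predict`, it can stand in for a scikit-learn classifier in a train/test pipeline.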

Next Step

Thank you
