
DESCRIPTION

Simple presentation explaining what machine learning is.

TRANSCRIPT

Machine Learning

LW/OB presentation

Machine learning ( ML ) is a field concerned with studying and developing algorithms that perform better at a task as they gain experience.

( but mostly I wanted to use this cool picture )

WARNING: this presentation is seriously lacking slides, preparation and cool running examples.

That being said, I know what I’m talking about ;)

What ML is really about…

• ML is about data, and modeling its distribution

• ML is about the tradeoff between accuracy on the training data and predictive power on unseen data ( sketch below )

• ML is about finding simple yet expressive classes of distributions

• ML is about using approximate numerical methods to perform a Bayesian update on the training data
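Not in the original slides: a minimal sketch of the accuracy-vs-predictive-power tradeoff mentioned above ( the sine-plus-noise data and the polynomial degrees are chosen purely for illustration ) — higher-degree fits keep lowering the training error while the held-out error eventually gets worse.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: a smooth function plus noise.
x = np.linspace(0, 1, 40)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

# Hold out half the points to measure predictive power.
train, test = x[::2], x[1::2]
y_train, y_test = y[::2], y[1::2]

for degree in (1, 3, 9, 15):
    coeffs = np.polyfit(train, y_train, degree)  # fit on training data only
    mse_train = np.mean((np.polyval(coeffs, train) - y_train) ** 2)
    mse_test = np.mean((np.polyval(coeffs, test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {mse_train:.3f}, test MSE {mse_test:.3f}")
```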

ML = intersection of… ( diagram slide )

Data sizes vary…

From a couple of kilobytes to petabytes

Type of problems solved

• Supervised

– Classification

– Regression

• Unsupervised

– Clustering

– Discovering causal links

• Reinforcement learning

– Learn to perform a task from the final result only

• ( Transduction )

– Not discussed; improves supervised learning with unlabeled samples
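Not part of the original slides: a rough scikit-learn sketch of the supervised/unsupervised split ( the blob dataset and model choices are mine, purely for illustration ) — a classifier needs the labels y, while clustering only ever sees X.

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

# Three labeled Gaussian blobs in 2D.
X, y = make_blobs(n_samples=300, centers=3, random_state=0)

# Supervised: learn a mapping X -> y from labeled examples.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("classification accuracy:", clf.score(X, y))

# Unsupervised: group the same points without ever seeing y.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("cluster sizes:", [int((km.labels_ == k).sum()) for k in range(3)])
```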

Typical applications

• Image, speech, pattern recognition

• Collaborative filtering

• Time series forecasting

• Game playing

• Denoising

• Any task where experience is valuable

Common ML techniques

• Linear regression ( sketch below )

• Factor models

• Decision trees

• Neural networks

( perceptron, multilayer perceptron with backpropagation, Hebbian autoassociative memory, Boltzmann machines, spiking neurons… )

• SVMs

• Bayesian networks, white-box models…
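Not in the slides: a minimal sketch of the first item on the list, ordinary least-squares linear regression solved directly with numpy ( the synthetic data and the true coefficients are made up for illustration ).

```python
import numpy as np

rng = np.random.default_rng(1)

# y = 2 + 3*x + noise; recover the coefficients by least squares.
x = rng.uniform(0, 10, size=100)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=x.size)

# Design matrix with an intercept column.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("estimated intercept and slope:", beta)
```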

Meta-Methods

– Ensemble forecasting

– Bootstrapping, bagging, model averaging

– Boosting

– Inductive bias through

• Out-of-sample testing

• Minimum description length

Neural networks demystified

• Perceptron ( 1957 ) ( sketch below )

THIS IS… LINEAR ALGEBRA!

• Linear separability

8 binary inputs => only ~1/2^212 of all possible classifications are linearly separable

• Multilayer perceptron + backpropagation ( 1969 ~ 1986 )

• Smooth interpolation

Many more types…
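Not in the original deck: a minimal perceptron sketch ( the linearly separable toy data is my own making ). The learning rule really is just linear algebra: nudge the weight vector by y·x on every misclassified point.

```python
import numpy as np

rng = np.random.default_rng(0)

# Linearly separable toy data: label = sign of (x1 + x2 - 1), with a visible gap.
X = rng.uniform(-1, 2, size=(300, 2))
X = X[np.abs(X[:, 0] + X[:, 1] - 1.0) > 0.2]
y = np.where(X[:, 0] + X[:, 1] - 1.0 > 0, 1, -1)

# Add a bias column so the threshold is learned as an ordinary weight.
Xb = np.column_stack([X, np.ones(len(X))])
w = np.zeros(3)

# Perceptron learning rule: update w on every misclassified point.
for _ in range(20):
    errors = 0
    for xi, yi in zip(Xb, y):
        if yi * np.dot(w, xi) <= 0:
            w += yi * xi
            errors += 1
    if errors == 0:
        break

print("learned weights:", w, "errors in last pass:", errors)
```

On data that is not linearly separable ( e.g. XOR ) the loop above never reaches zero errors, which is exactly the limitation the linear-separability bullet alludes to.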

SVM in a nutshell

• Maximize the margin

• Embed in a high-dimensional space
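Again not in the slides, a hedged scikit-learn sketch of both bullets ( the concentric-circles dataset is my choice ): a linear SVM maximizes the margin but cannot separate the circles, while the RBF kernel implicitly embeds the points in a high-dimensional space where they become separable.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: not linearly separable in the original 2D space.
X, y = make_circles(n_samples=300, factor=0.4, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)  # maximum-margin hyperplane in 2D
rbf = SVC(kernel="rbf").fit(X, y)        # implicit high-dimensional embedding

print("linear-kernel accuracy:", linear.score(X, y))
print("rbf-kernel accuracy:", rbf.score(X, y))
```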

Ensemble learning

• Combine predictions through voting ( with classifiers ) or regression to improve prediction

• Train on random subsets of the data, drawn with replacement ( bootstrapping )

• Or weight the data according to the quality of prediction, and train new weak classifiers accordingly ( boosting )
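A minimal sketch of the bootstrapping idea ( my own example; the dataset, tree depth and number of trees are arbitrary ): train many small trees on resampled-with-replacement copies of the training set and combine them by majority vote.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
votes = np.zeros((len(X_te), 2))

# Bootstrap: each tree sees a random sample of the training set, with replacement.
for _ in range(25):
    idx = rng.integers(0, len(X_tr), size=len(X_tr))
    tree = DecisionTreeClassifier(max_depth=3).fit(X_tr[idx], y_tr[idx])
    preds = tree.predict(X_te)
    votes[np.arange(len(X_te)), preds] += 1

ensemble_pred = votes.argmax(axis=1)
print("bagged accuracy:", (ensemble_pred == y_te).mean())
```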

Numerical tricks

• Optimization of fit with standard operational search techniques

• EM algorithm ( sketch below )

• MCMC methods ( Gibbs sampling, Metropolis algorithm… )
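Not in the slides: a rough sketch of the EM algorithm on a two-component 1D Gaussian mixture ( the data and initial guesses are made up ). The E-step computes responsibilities, the M-step re-estimates the parameters from them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Data drawn from two Gaussians; EM must recover their parameters.
data = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 200)])

# Initial guesses for the means, standard deviations and mixing weights.
mu = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])
weights = np.array([0.5, 0.5])

def gauss(x, m, s):
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

for _ in range(50):
    # E-step: responsibility of each component for each data point.
    dens = weights * gauss(data[:, None], mu, sigma)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate the parameters from the weighted points.
    nk = resp.sum(axis=0)
    mu = (resp * data[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (data[:, None] - mu) ** 2).sum(axis=0) / nk)
    weights = nk / len(data)

print("means:", mu, "stds:", sigma, "weights:", weights)
```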

A fundamental Bayesian model: the Hidden Markov Model

• Hidden states produce observed states

• Billions of applications

– Finance

– Speech recognition

– Swype

– Kinect

– Open-heart surgery

– Airplane navigation
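As a hedged illustration of "hidden states produce observed states" ( all transition and emission numbers below are invented ), the forward algorithm keeps a running belief over the hidden state as observations arrive.

```python
import numpy as np

# Toy HMM: 2 hidden states, 2 possible observations (all numbers made up).
A = np.array([[0.9, 0.1],   # hidden-state transition probabilities
              [0.2, 0.8]])
B = np.array([[0.8, 0.2],   # emission probabilities P(obs | hidden state)
              [0.3, 0.7]])
init = np.array([0.5, 0.5])

observations = [0, 0, 1, 1, 1]

# Forward algorithm: propagate and renormalize the belief over hidden states.
belief = init * B[:, observations[0]]
belief /= belief.sum()
for obs in observations[1:]:
    belief = (belief @ A) * B[:, obs]
    belief /= belief.sum()

print("P(hidden state | observations so far):", belief)
```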

Questions I was asked

• How does boosting work?

• What is the No Free Lunch Theorem?

• Writing style recognition

• Signature recognition

• Rule extraction

• Moving odds in response to informed gamblers

• BellKor’s Pragmatic Chaos and the Netflix prize

Writing style recognition

• Naïve Bayes ( similar to spam filtering, bag-of-words approach ) ( sketch below )

• Clustering of HMM model parameters

• Simple statistics on the text corpus ( sentence-length distribution, word-length distribution, density of punctuation )

• Combine with a logistic regression
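Not in the slides: a toy sketch of the first bullet — a bag-of-words naive Bayes classifier, the same recipe as spam filtering, applied to telling two "authors" apart ( the four-sentence corpus is made up for illustration ).

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny toy corpus: two "authors" with different word habits.
texts = [
    "it is a truth universally acknowledged",
    "a single man in possession of a good fortune",
    "the whale swam ever onward through the dark sea",
    "call me anything but do not call me late to sea",
]
authors = ["austen", "austen", "melville", "melville"]

# Bag of words: each text becomes a vector of word counts.
vec = CountVectorizer()
X = vec.fit_transform(texts)

clf = MultinomialNB().fit(X, authors)
print(clf.predict(vec.transform(["the dark sea and the whale"])))
```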

Signature recognition

• Depends on whether the input is raster or vector

• The post office uses neural networks, but its corpus is gigantic

• Dimensionality reduction is key ( sketch below )

• Wavelets on the raster image for feature extraction

• Path following, then learning on path features ( total variation, average curvature, etc. )
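The deck shows no code; as a rough stand-in for the dimensionality-reduction point ( the random "images" below are placeholders, not real signatures ), PCA compresses 256 raw pixels into 10 learned features before any classifier sees them.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Stand-in for raster signatures: 200 flattened 16x16 "images".
images = rng.normal(size=(200, 16 * 16))

# Reduce 256 raw pixels to 10 learned features.
pca = PCA(n_components=10).fit(images)
features = pca.transform(images)

print("input dimension:", images.shape[1])
print("reduced dimension:", features.shape[1])
print("variance explained:", pca.explained_variance_ratio_.sum().round(3))
```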

Rule extraction

• Hard: the hypothesis space is not smooth

• Decision tree regression

• Genetic Programming ( Koza )
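Not in the slides: a minimal sketch of the decision-tree route to rule extraction ( the hidden rule and data are my own toy example ) — fit a shallow regression tree and print it back as human-readable if/else rules.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)

# Hidden rule to recover: y is large when x0 > 0.5, small otherwise.
X = rng.uniform(0, 1, size=(300, 2))
y = np.where(X[:, 0] > 0.5, 10.0, 1.0) + rng.normal(scale=0.1, size=300)

tree = DecisionTreeRegressor(max_depth=2).fit(X, y)

# export_text renders the fitted tree as nested if/else rules.
print(export_text(tree, feature_names=["x0", "x1"]))
```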

Netflix prize

• The base ( Cinematch ) = a latent semantic model

• The defining characteristic of the winners: ensemble prediction, with neural networks to combine the predictors

• The best teams were mergers of good teams

Latent semantic model

• There is a set of K “features”: each movie has a score for each feature, and each user has a weight for each feature.

• The features are latent; we only assume the value of K.

• Equivalent to representing the rating matrix as the product of a score matrix and a preference matrix; the SVD minimizes RMSE.
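A hedged sketch of the latent-factor idea with K = 2 ( the 4×4 rating matrix is made up, and unlike the real Netflix data it has no missing entries ): the truncated SVD factors the ratings into a user-preference part and a movie-score part, and is the rank-K approximation that minimizes RMSE.

```python
import numpy as np

# Toy rating matrix: rows = users, columns = movies (every rating observed
# here, to keep the sketch simple).
R = np.array([[5, 4, 1, 1],
              [4, 5, 1, 2],
              [1, 1, 5, 4],
              [2, 1, 4, 5]], dtype=float)

# Truncated SVD with K = 2 latent features.
U, s, Vt = np.linalg.svd(R, full_matrices=False)
K = 2
R_hat = U[:, :K] * s[:K] @ Vt[:K, :]

rmse = np.sqrt(np.mean((R - R_hat) ** 2))
print("rank-2 reconstruction RMSE:", round(rmse, 3))
```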

Poker is hard…

• Gigantic, yet not continuous, state space

• Dimensionality reduction isn’t easy

• High variance

• Possible to build parametric strategies and optimize them with ML

• Inputs such as pot odds are trivial to compute
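A tiny sketch of the last bullet ( my own numbers ): pot odds are simply the call amount as a fraction of the pot after the call, i.e. the break-even winning probability.

```python
def pot_odds(pot: float, to_call: float) -> float:
    """Fraction of the final pot you must invest to call."""
    return to_call / (pot + to_call)

# Example: a 100-chip pot and a 25-chip bet to call.
print(pot_odds(pot=100, to_call=25))  # 0.2 -> calling is profitable if win prob > 20%
```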

Uhuh, slides end here

Sort of… Questions?
