Lecture 10: SVM and MIRA



DESCRIPTION

Outline: margin, maximizing the margin, the norm, Support Vector Machines (SVM), Margin Infused Relaxed Algorithm (MIRA)

TRANSCRIPT

Page 1: Lecture 10: SVM and MIRA

Machine Learning for Language Technology, Lecture 10: SVM and MIRA

Marina Santini, Department of Linguistics and Philology, Uppsala University, Uppsala, Sweden

Autumn 2014

Acknowledgement: Thanks to Prof. Joakim Nivre for course design and materials


Page 2: Margin

Page 3: Maximizing Margin (i)

Page 4: Maximizing Margin (ii)

Page 5: Maximizing Margin (iii)

Page 6: Max Margin = Min Norm

Page 7: Maximizing the Margin


• The notion of margin: a way of predicting what will be a good separation on the test set.

• Intuitively, if we make the margin between the opposite groups as wide as possible, our chances of guessing correctly on the test set should increase.

• The generalization error on unseen test data is proportional to the inverse of the margin: the larger the margin, the smaller the generalization error (a minimal sketch of the margin computation follows below).
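To make the notion concrete (this sketch is not part of the original slides): the geometric margin of a linear classifier with weight vector w is the smallest distance from any training point to the separating hyperplane, min_i y_i (w · x_i) / ||w||. A minimal Python sketch, assuming labels in {-1, +1} and a bias-free classifier:

```python
import numpy as np

def geometric_margin(w, X, y):
    """Smallest distance from any training point to the hyperplane
    w . x = 0; it is positive only if w separates the data."""
    functional = y * (X @ w)          # functional margins y_i * (w . x_i)
    return functional.min() / np.linalg.norm(w)

# Toy check on three separable points; a larger value means a wider margin.
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0]])
y = np.array([1.0, 1.0, -1.0])
w = np.array([1.0, 1.0])
print(geometric_margin(w, X, y))      # about 2.12 for this toy data
```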

Page 8: Support Vector Machines (SVM) (i)

Page 9: Support Vector Machines (SVM) (ii)

Page 10: Margin Infused Relaxed Algorithm (MIRA)

Page 11: MIRA

Page 12: Perceptron vs. SVMs/MIRA


Perceptron: If the training set is separable by some margin, the perceptron will find a weight vector that separates the data, but it will not necessarily pick the vector that maximizes the margin. If we are lucky, it will be a vector with the largest margin, but there is no guarantee.

SVMs/MIRA: SVMs and MIRA look for a weight vector that maximizes the margin. Here the margin is normalized to 1: we put a constraint on the weight vector saying that every training example must be classified with a functional margin of at least 1. We keep the margin fixed and minimize the norm; that is, we want the smallest weight vector that gives us margin 1.

Strictly speaking, we do not minimize the norm itself: we minimize the norm squared, divided by 2, to make the math easier (trust the people who suggested this :-) ). The two update rules are contrasted in the sketch below.
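As an illustration of this contrast (not from the original slides), here is a minimal sketch of the two update rules for binary classification with labels in {-1, +1}, using the standard single-constraint MIRA update; the multiclass version covered in the lecture is analogous:

```python
import numpy as np

def perceptron_update(w, x, y):
    """Perceptron: on a mistake, simply add y * x to the weights.
    Finds *some* separating vector, with no margin guarantee."""
    if y * np.dot(w, x) <= 0:              # example is misclassified
        w = w + y * x
    return w

def mira_update(w, x, y):
    """MIRA: make the smallest change to w (in Euclidean norm) that
    classifies (x, y) with a functional margin of at least 1:
        minimize ||w' - w||^2  subject to  y * (w' . x) >= 1
    Closed-form step size: tau = max(0, 1 - y * (w . x)) / ||x||^2."""
    loss = max(0.0, 1.0 - y * np.dot(w, x))
    if loss > 0.0:
        tau = loss / np.dot(x, x)
        w = w + tau * y * x
    return w
```

The perceptron takes a fixed-size step whenever it errs; MIRA moves just far enough to restore margin 1, which is the online counterpart of the SVM's keep-the-margin-fixed, minimize-the-norm objective.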

Page 13: Summary

Page 14: The end