Logistic Regression
Page 1: Logistic Regression

Machine learning workshop  [email protected]

• Machine learning introduction
• Logistic regression
• Feature selection
• Boosting, tree boosting

See more machine learning posts: http://dongguo.me

Page 2: Logistic Regression

Overview of machine learning

Machine Learning
• Unsupervised Learning
• Semi-supervised Learning
• Supervised Learning
  – Classification (e.g. logistic regression)
  – Regression

Page 3: Logistic Regression

How to choose a suitable model?

| Characteristic | Naïve Bayes | Trees | K Nearest Neighbor | Logistic Regression | Neural Networks | SVM |
|---|---|---|---|---|---|---|
| Computational scalability | 3 | 3 | 1 | 3 | 1 | 1 |
| Interpretability | 2 | 2 | 1 | 2 | 1 | 1 |
| Predictive power | 1 | 1 | 3 | 2 | 3 | 3 |
| Natural handling of data of "mixed" type | 1 | 3 | 1 | 1 | 1 | 1 |
| Robustness to outliers in input space | 3 | 3 | 3 | 3 | 1 | 1 |

(3 = good, 2 = fair, 1 = poor)

<Elements of Statistical Learning>, 2nd edition, p. 351

Page 4: Logistic Regression

Why can't a model perform perfectly on unseen data?

• Expected risk

• Empirical risk

• Choosing a function family for the prediction function

• Error
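The bullets above compress a standard argument; a sketch in notation not taken from the slides (loss L, data distribution P, function family F):

```latex
% Expected risk: average loss over the (unknown) data distribution
R(f) = \int L\big(f(x), y\big)\, \mathrm{d}P(x, y)

% Empirical risk: average loss over the n training samples
R_{\mathrm{emp}}(f) = \frac{1}{n} \sum_{i=1}^{n} L\big(f(x_i), y_i\big)

% Error decomposition: choosing f from a restricted family F and
% minimizing R_emp instead of R leaves approximation error (the best
% member of F need not be the best possible predictor) plus
% estimation error (on finite data, R_emp only approximates R).
```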

Page 5: Logistic Regression

Logistic regression

Page 6: Logistic Regression

Outline

• Introduction
• Inference
• Regularization
• Experiments
• More
  – Multinomial LR
  – Generalized linear model
• Application

Page 7: Logistic Regression

Logit function and logistic function

• Logit function

• Logistic function: inverse of the logit
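A minimal Python sketch of the two functions (names assumed, not from the slides), showing that each inverts the other:

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def logistic(x):
    """Logistic (sigmoid) function: the inverse of the logit."""
    return 1.0 / (1.0 + math.exp(-x))
```

For example, `logistic(logit(0.3))` recovers `0.3`, and `logit(0.5)` is `0.0` (even odds).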

Page 8: Logistic Regression

Logistic regression

• Prediction function
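In the usual notation (weights w and feature vector x, assumed here rather than recovered from the slide), the prediction function is:

```latex
P(y = 1 \mid x; w) = \sigma(w^{\top} x) = \frac{1}{1 + e^{-w^{\top} x}},
\qquad
P(y = 0 \mid x; w) = 1 - \sigma(w^{\top} x)
```

Equivalently, the logit of P(y = 1 | x) is linear in x, which is what ties the model to the logit link.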

Page 9: Logistic Regression

Inference with maximum likelihood (1)

•  Likelihood  

•  Inference  
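For samples (x_i, y_i) with y_i ∈ {0, 1} and p_i = σ(wᵀx_i), the likelihood and its gradient take the standard form (notation assumed):

```latex
L(w) = \prod_{i=1}^{n} p_i^{\,y_i} (1 - p_i)^{\,1 - y_i}

\ell(w) = \log L(w) = \sum_{i=1}^{n} \Big[ y_i \log p_i + (1 - y_i)\log(1 - p_i) \Big]

\frac{\partial \ell}{\partial w} = \sum_{i=1}^{n} (y_i - p_i)\, x_i
```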

Page 10: Logistic Regression

Inference with maximum likelihood (2)

• Inference (cont.)

• Use gradient descent

• Stochastic gradient descent
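The stochastic gradient step can be sketched as follows (a minimal Python illustration, not the workshop's implementation; the function names, toy data, and default parameters are assumptions):

```python
import math

def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def sgd_logistic(samples, n_features, step=0.1, epochs=100):
    """Stochastic gradient ascent on the log-likelihood.

    samples: list of (x, y) pairs, x a feature list, y in {0, 1}.
    Per-sample update: w += step * (y - p) * x, where p = sigmoid(w . x).
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for x, y in samples:
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
            error = y - p  # per-sample gradient of log-likelihood w.r.t. w . x
            for j in range(n_features):
                w[j] += step * error * x[j]
    return w
```

On a tiny separable set (first feature acting as a bias), the learned weights push the predicted probabilities toward the labels.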

Page 11: Logistic Regression

Regularization

• Penalize large weights to avoid overfitting
  – L2 regularization
  – L1 regularization
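In the same notation as the log-likelihood ℓ(w), with λ assumed as the regularization strength, the two penalized objectives are:

```latex
% L2: penalize the squared Euclidean norm of the weights
\ell_{L2}(w) = \ell(w) - \frac{\lambda}{2}\, \lVert w \rVert_2^2

% L1: penalize the sum of absolute weights
% (this is what drives weights to exactly zero)
\ell_{L1}(w) = \ell(w) - \lambda\, \lVert w \rVert_1
```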

Page 12: Logistic Regression

Regularization: Maximum a posteriori

•  MAP  
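The MAP view of regularization, in assumed notation (data D, prior p(w)):

```latex
w_{\mathrm{MAP}} = \arg\max_{w}\, \log p(w \mid D)
                 = \arg\max_{w}\, \big[ \log p(D \mid w) + \log p(w) \big]
```

The regularizer is the negative log prior: different priors on w yield different penalty terms.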

Page 13: Logistic Regression

L2 regularization: Gaussian prior

• Gaussian prior

• MAP

• Gradient descent step
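A sketch of the three bullets in assumed notation (per-weight variance σ², step size η, p_i = σ(wᵀx_i)):

```latex
% Zero-mean Gaussian prior on each weight
p(w_j) = \mathcal{N}(w_j;\, 0,\, \sigma^2)

% MAP objective (lambda = 1 / sigma^2, additive constants dropped)
\ell_{L2}(w) = \ell(w) - \frac{\lambda}{2}\, \lVert w \rVert_2^2

% Per-sample gradient ascent step
w \leftarrow w + \eta \big[ (y_i - p_i)\, x_i - \lambda w \big]
```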

Page 14: Logistic Regression

L1 regularization: Laplace prior

• Laplace prior

• MAP

• Gradient descent step
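The L1 counterpart, again in assumed notation (scale b, step size η):

```latex
% Zero-mean Laplace prior on each weight
p(w_j) = \frac{1}{2b} \exp\!\big( -\lvert w_j \rvert / b \big)

% MAP objective (lambda = 1 / b, additive constants dropped)
\ell_{L1}(w) = \ell(w) - \lambda\, \lVert w \rVert_1

% Per-sample step; the penalty has constant magnitude lambda, so
% implementations typically clip w_j at zero when an update would
% flip its sign -- that clipping is what yields exact sparsity
w_j \leftarrow w_j + \eta \big[ (y_i - p_i)\, x_{ij} - \lambda\, \mathrm{sgn}(w_j) \big]
```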

Page 15: Logistic Regression

Implementation

• L2 LR

// L2: shrink the weight toward zero in proportion to its magnitude
_weightOfFeatures[fea] += step * (feaValue * error - reguParam * _weightOfFeatures[fea]);

• L1 LR

// L1: pull the weight toward zero by a constant amount each step,
// clipping at zero if the update would flip its sign -- this is
// what produces exact zeros (sparsity)
if (_weightOfFeatures[fea] > 0) {
    _weightOfFeatures[fea] += step * (feaValue * error) - step * reguParam;
    if (_weightOfFeatures[fea] < 0)
        _weightOfFeatures[fea] = 0;
} else if (_weightOfFeatures[fea] < 0) {
    _weightOfFeatures[fea] += step * (feaValue * error) + step * reguParam;
    if (_weightOfFeatures[fea] > 0)
        _weightOfFeatures[fea] = 0;
} else {
    _weightOfFeatures[fea] += step * (feaValue * error);
}

Page 16: Logistic Regression

L2 vs. L1

• L2 regularization
  – Almost all weights end up non-zero
  – Less suitable when training samples are scarce

• L1 regularization
  – Produces sparse parameter vectors
  – More suitable when most features are irrelevant
  – Handles scarce training samples better

Page 17: Logistic Regression

Experiments

• Dataset
  – Goal: gender prediction
  – Dataset: training samples (431k), test samples (167k)

• Comparison algorithms
  – A: gradient descent with L1 regularization
  – B: gradient descent with L2 regularization
  – C: OWL-QN (L-BFGS-based optimization with L1 regularization)

• Parameter choices
  – Regularization value
  – Step (learning rate)
  – Decay ratio
  – Iteration stop condition

• Max iterations (50) || AUC change <= 0.0005

Page 18: Logistic Regression

Experiments (cont.)

• Experiment results

| Parameters and metrics | Gradient descent with L1 | Gradient descent with L2 | OWL-QN |
|---|---|---|---|
| 'Best' regularization term | 0.001~0.005 | 0.0002~0.001 | 1 |
| Best step | 0.05 | 0.02~0.05 | - |
| Best decay ratio | 0.85 | 0.85 | - |
| Iteration times | 26 | 20~26 | 48 |
| Non-zero features / all features | 10492/10938 | 10938/10938 | 6629/10938 |
| AUC | 0.8470 | 0.8463 | 0.8467 |

Page 19: Logistic Regression

Multinomial logistic regression

• Prediction function

• Inference with maximum likelihood

• Gradient descent step (L2)
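The multinomial prediction function is the softmax over per-class linear scores; a hedged Python sketch (names assumed, not from the slides):

```python
import math

def softmax_predict(W, x):
    """Class probabilities for multinomial logistic regression.

    W: list of K weight vectors (one per class); x: feature vector.
    P(y = k | x) = exp(w_k . x) / sum_j exp(w_j . x)
    """
    scores = [sum(wki * xi for wki, xi in zip(wk, x)) for wk in W]
    m = max(scores)                    # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]
```

The gradient step mirrors the binary case: for each class k, w_k is moved by step * (1{y = k} - p_k) * x, with the L2 term subtracting reguParam * w_k as before.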

Page 20: Logistic Regression

More link functions

• Inference with maximum likelihood

• Link function

• Link functions for the binomial distribution
  – Logit function
  – Probit function
  – Log-log function
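The three links map the mean p ∈ (0, 1) to the linear predictor; the standard forms are (Φ denotes the standard normal CDF):

```latex
% Logit (logistic regression)
g(p) = \log \frac{p}{1 - p}

% Probit
g(p) = \Phi^{-1}(p)

% Log-log, and its complementary variant
g(p) = -\log(-\log p), \qquad g(p) = \log\!\big(-\log(1 - p)\big)
```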

Page 21: Logistic Regression

Generalized linear model

• What is a GLM?
  – A generalization of linear regression
  – Connects the linear model to the response variable through a link function
  – Allows more distributions for the response variable

• Typical GLMs
  – Linear regression, logistic regression, Poisson regression

• Overview

   
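The overview figure did not survive extraction; a sketch of the standard GLM picture it presumably summarized (not recovered from the slide):

```latex
% GLM template: response y drawn from an exponential-family
% distribution; its mean mu = E[y | x] is tied to the linear
% predictor by a link function g
g(\mu) = w^{\top} x

% Typical instances:
%   Gaussian response,  identity link g(mu) = mu            -> linear regression
%   Bernoulli response, logit link    g(mu) = log(mu/(1-mu)) -> logistic regression
%   Poisson response,   log link      g(mu) = log(mu)        -> Poisson regression
```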

Page 22: Logistic Regression

Application

• Yahoo
  – <Personalized Click Prediction in Sponsored Search> WSDM'10

• Microsoft
  – <Scalable Training of L1-Regularized Log-Linear Models> ICML'07

• Baidu
  – Contextual ads CTR prediction
  – http://www.docin.com/p-376254439.html

• Hulu
  – Demographic targeting
  – Other ad-targeting projects
  – Customer churn prediction
  – More…

Page 23: Logistic Regression

Reference

• 'Scalable Training of L1-Regularized Log-Linear Models', ICML'07
  – http://www.docin.com/p-376254439.html#

• 'Generative and Discriminative Classifiers: Naive Bayes and Logistic Regression' by Mitchell

• 'Feature Selection, L1 vs. L2 Regularization, and Rotational Invariance', ICML'04

Page 24: Logistic Regression

Recommended resources

• Machine Learning open class by Andrew Ng
  – //10.20.0.130/TempShare/Machine-Learning Open Class

• http://www.cnblogs.com/vivounicorn/archive/2012/02/24/2365328.html

• Logistic regression implementation [link]
  – //10.20.0.130/TempShare/guodong/Logistic regression Implementation/
  – Supports binomial and multinomial LR with L1 and L2 regularization

• OWL-QN
  – //10.20.0.130/TempShare/guodong/OWL-QN/

Page 25: Logistic Regression

Thanks  

