Introduction to Active Learning

Post on 12-Nov-2014

TRANSCRIPT

  • 1. Introduction to Active Learning, 2013/9/1, DSIRNLP #4 / @shuyo
  • 2. Agenda Active Learning Active Learning
  • 3. References
      • (Settles 2009) Active Learning Literature Survey
      • (Schein+ 2007) Active Learning for Logistic Regression [AL with LR]
      • (Olsson 2009) A Literature Survey of Active Machine Learning in the Context of Natural Language Processing
      • (Guo+ 2007) Optimistic Active Learning using Mutual Information [one Expected Error Reduction AL method, MM+M]
      • (Tong+ 2000) Support Vector Machine Active Learning with Application to Text Classification [AL with SVM]
  • 4. Active Learning
  • 5. () 10 (Zhu 2005) 11 (Settles+ 2008) 1970 (5)
  • 6. (semi-supervised learning) (active learning)
  • 7. Active Learning () Oracle : (query) via (Settles 2009)
  • 8.
  • 9.
  • 10.
  • 11.
  • 12. Active Learning
  • 13. Active Learning
  • 14. Active Learning (Settles 2009)
      • membership query synthesis
      • stream-based selective sampling
      • pool-based sampling
  • 15. Active Learning membership query synthesis stream-based selective sampling 11 Oracle pool-based sampling pool 1
  • 16. query
      1. Uncertainty Sampling
      2. Query-By-Committee
      3. Expected Model Change
      4. Expected Error Reduction
      5. Variance Reduction
      6. Density-Weighted Methods
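  • 17. 1. Uncertainty Sampling: query the instance the model is least certain about. In binary classification this is the instance whose predicted probability is closest to 0.5.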
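  • 18. Uncertainty Sampling
      Least Confident: $x^* = \arg\min_x \max_y P(y \mid x)$
      Margin Sampling (difference between the 1st and 2nd most probable labels): $x^* = \arg\min_x \left[ P(y_1 \mid x) - P(y_2 \mid x) \right]$, where $y_1$ and $y_2$ are the most and second most probable labels for $x$
  • 19. Uncertainty Sampling
      Entropy-based Approach: $x^* = \arg\max_x -\sum_y P(y \mid x) \log P(y \mid x)$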
  • 20.
          category 1  category 2  category 3  least conf.  margin  entropy
      A   0.50        0.35        0.15        0.50         0.15    0.43
      B   0.55        0.45        0.00        0.55         0.10    0.30
      C   0.51        0.25        0.24        0.51         0.26    0.45
      via (Settles 2009) 3
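The three uncertainty measures on slides 18 to 20 can be checked numerically. The following is a minimal numpy sketch, not the author's code; the function names are hypothetical, and the base-10 logarithm is an assumption chosen because it reproduces the entropy column listed on slide 20.

```python
import numpy as np


def least_confident(proba):
    """max_y P(y|x) per row; the query is the row with the SMALLEST value."""
    return proba.max(axis=1)


def margin(proba):
    """P(y1|x) - P(y2|x) for the two most probable labels; query the SMALLEST."""
    top2 = np.sort(proba, axis=1)[:, -2:]
    return top2[:, 1] - top2[:, 0]


def entropy(proba, base=10.0):
    """-sum_y P(y|x) log P(y|x); query the LARGEST. 0 log 0 is treated as 0."""
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(proba > 0, proba * np.log(proba), 0.0)
    return -terms.sum(axis=1) / np.log(base)


# Rows A, B, C from slide 20.
proba = np.array([[0.50, 0.35, 0.15],
                  [0.55, 0.45, 0.00],
                  [0.51, 0.25, 0.24]])
names = ["A", "B", "C"]

lc, mg, en = least_confident(proba), margin(proba), entropy(proba)
print("least confident picks", names[int(np.argmin(lc))])
print("margin sampling picks", names[int(np.argmin(mg))])
print("entropy-based picks  ", names[int(np.argmax(en))])
```

With these inputs the three measures each select a different instance, which appears to be the point of the table.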
  • 21.
  • 22. Python / numpy / scipy / scikit-learn
      • sklearn.linear_model.LogisticRegression (L1)
      • sklearn.linear_model.LogisticRegression (L2)
      • sklearn.naive_bayes.MultinomialNB
      https://github.com/shuyo/iir/tree/master/activelearn
  • 23. 20NewsGroups (sklearn.datasets.fetch_20newsgroups_vectorized): 56436 tf-idf, L2; training: 11314, test: 7532; 20 34
  • 24. LogisticRegression : C MultinomialNB : alpha test data Active Learning
  • 25.
      Classifier              Parameter         test data
      L1 LogisticRegression   C=89443           0.8340
      L2 LogisticRegression   C=1193.9          0.8196
      MultinomialNB           alpha=0.0054781   0.8359
      (3)                                       0.8497
      (82 to 85%)
  • 26. 1 pool pool query oracle random sampling baseline (Schein+ 2007) straw men MultinomialNB
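Slide 26 names a pool, queries answered by an oracle, a random sampling baseline, and MultinomialNB. A pool-based loop along those lines might look like the sketch below. It is an illustration under stated assumptions, not the code from the linked repository: `run_pool_based` and `margin_query` are hypothetical names, the data is synthetic rather than 20NewsGroups, the labeled set is seeded with one instance per class, and `alpha=0.01` is an arbitrary choice.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB


def margin_query(model, X_pool):
    """Pool position with the smallest gap between the top two class probabilities."""
    proba = model.predict_proba(X_pool)
    top2 = np.sort(proba, axis=1)[:, -2:]
    return int(np.argmin(top2[:, 1] - top2[:, 0]))


def run_pool_based(X, y, X_test, y_test, n_queries, strategy, seed=0):
    """Return the test accuracy measured before each query and the labeled indices."""
    rng = np.random.default_rng(seed)
    # Assumption: start from one randomly chosen labeled instance per class.
    labeled = [int(rng.choice(np.flatnonzero(y == c))) for c in np.unique(y)]
    pool = [i for i in range(X.shape[0]) if i not in labeled]
    scores = []
    for _ in range(n_queries):
        model = MultinomialNB(alpha=0.01).fit(X[labeled], y[labeled])
        scores.append(model.score(X_test, y_test))
        if strategy == "margin":
            pos = margin_query(model, X[pool])
        else:  # random sampling baseline
            pos = int(rng.integers(len(pool)))
        # The "oracle" here is just the known label y[i] of the queried instance.
        labeled.append(pool.pop(pos))
    return scores, labeled


# Hypothetical synthetic word-count data standing in for a real corpus.
rng = np.random.default_rng(0)
theta = rng.dirichlet(np.ones(30), size=3)   # one word distribution per class
y_all = rng.integers(3, size=400)
X_all = np.vstack([rng.multinomial(40, theta[c]) for c in y_all])
X_train, y_train = X_all[:300], y_all[:300]
X_test, y_test = X_all[300:], y_all[300:]

margin_scores, margin_labeled = run_pool_based(X_train, y_train, X_test, y_test, 20, "margin")
random_scores, random_labeled = run_pool_based(X_train, y_train, X_test, y_test, 20, "random")
```

Plotting `margin_scores` against `random_scores` gives the kind of learning-curve comparison the following slides refer to; on toy data like this the gap between strategies may be small.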
  • 27. 2000 (/) 2000
  • 28. 500
  • 29.
  • 30. 30050 20
  • 31. margin sampling
  • 32. 300
  • 33. margin sampling entropy-based random sampling random sampling
  • 34. 2. Query-By-Committee. Uncertainty Sampling. (Settles 2009) gives 2 disagreement measures: Vote Entropy and Average Kullback-Leibler Divergence
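  • 35. Vote Entropy: $x^* = \arg\max_x -\sum_y \frac{V(y)}{C} \log \frac{V(y)}{C}$, where C: committee size; V(y): number of committee votes for label y
  • 36. Average Kullback-Leibler Divergence: $x^* = \arg\max_x \frac{1}{C} \sum_{c=1}^{C} D\left(P_{\theta^{(c)}} \,\|\, P_{\mathcal{C}}\right)$, where $D$ is the KL-divergence between committee member $c$'s posterior $P_{\theta^{(c)}}(y \mid x)$ and the committee average $P_{\mathcal{C}}(y \mid x) = \frac{1}{C} \sum_{c=1}^{C} P_{\theta^{(c)}}(y \mid x)$ (Settles 2009)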
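Both committee disagreement measures named on slides 35 and 36 reduce to a few array operations. The sketch below follows the definitions in (Settles 2009); the function names and the toy committee outputs are hypothetical.

```python
import numpy as np


def vote_entropy(votes, n_labels):
    """votes[c, i] is the label committee member c predicts for instance i.
    Returns one score per instance; the query is the instance with the LARGEST score."""
    n_members = votes.shape[0]
    counts = np.stack([(votes == y).sum(axis=0) for y in range(n_labels)], axis=1)
    frac = counts / n_members                      # V(y) / C
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(frac > 0, frac * np.log(frac), 0.0)
    return -terms.sum(axis=1)


def average_kl(probas):
    """probas[c, i, y] is member c's posterior P(y | x_i).
    Returns the mean KL-divergence of each member from the committee average."""
    consensus = probas.mean(axis=0)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(probas > 0, probas * np.log(probas / consensus), 0.0)
    return terms.sum(axis=2).mean(axis=0)


# Three members, three instances: full agreement, full disagreement, 2-vs-1 split.
votes = np.array([[0, 0, 0],
                  [0, 1, 0],
                  [0, 2, 1]])
ve = vote_entropy(votes, n_labels=3)

# Two members, two instances: opposite opinions on the first, identical on the second.
probas = np.array([[[0.9, 0.1], [0.5, 0.5]],
                   [[0.1, 0.9], [0.5, 0.5]]])
kl = average_kl(probas)
```

Vote entropy only needs hard predictions, so it works with any classifier, whereas average KL divergence needs each member to output probabilities.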
  • 37. Uncertainty Sampling MultinomialNB, LogisticRegression (L1 / L2) random sampling margin sampling baseline
  • 38. (300)
  • 39. 300 20
  • 40. Uncertainty sampling random margin sampling Vote Entropy OK SVM, Random Forest, ... Average KL Divergence random NB LR
  • 41. 3. Density-Weighted Methods Uncertainty Sampling QBC query A B B () A A B
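  • 42. Information Density (Settles+ 2008)
      $\phi_A(x)$: the score of x under base strategy A, a criterion selected by argmax () or argmin ()
      $x^* = \arg\max_x$ (or $\arg\min_x$, depending on A) $\phi_A(x) \times \left( \frac{1}{U} \sum_{u=1}^{U} \mathrm{sim}(x, x^{(u)}) \right)^{\beta}$
      U: pool size; $x^{(u)}$: the u-th instance in the pool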
  • 43. Uncertainty Sampling 3 Information Density $\beta$=1
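The information density weighting of (Settles+ 2008) multiplies a base informativeness score by the average similarity of an instance to the rest of the pool. A minimal sketch follows, assuming cosine similarity for sim, non-negative features such as tf-idf, and a base score where larger means more informative (for example entropy). The function name and the toy data are hypothetical.

```python
import numpy as np


def information_density(base_scores, X_pool, beta=1.0):
    """Weight each base score by (mean cosine similarity to the pool) ** beta.
    Assumes non-negative features so that the mean similarity is non-negative."""
    norms = np.linalg.norm(X_pool, axis=1, keepdims=True)
    unit = X_pool / np.where(norms > 0, norms, 1.0)
    sim = unit @ unit.T                 # (U, U) cosine similarities
    density = sim.mean(axis=1)          # (1/U) * sum_u sim(x, x_u)
    return base_scores * density ** beta


# Three instances in a dense cluster and one outlier with the highest base score.
X_pool = np.array([[1.00, 0.00],
                   [0.90, 0.10],
                   [0.95, 0.05],
                   [0.00, 1.00]])
base_scores = np.array([0.6, 0.6, 0.6, 0.7])

weighted = information_density(base_scores, X_pool, beta=1.0)
print("base strategy picks", int(np.argmax(base_scores)))
print("density-weighted picks", int(np.argmax(weighted)))
```

The outlier wins on the base score alone but loses after weighting, because it resembles few other pool instances. Computing the full similarity matrix costs O(U^2) memory, so a real pool would likely need sparse matrices or precomputed densities.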
  • 44. 300
  • 45. Information Density entropy-based least confident + ID entropy-based margin sampling + ID
  • 46.
  • 47. Active Learning random sampling query query test data
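  • 48. Oracle Sampling. Err(X): the error of the model trained on X. L: labeled, U: unlabeled. query: $\arg\min_{u} \mathrm{Err}\left(L \cup \{(x_u, y_u)\}\right)$, where $y_u$ is the true label of $x_u$. Active Learning oracle sampling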
  • 49.
  • 50. pool 22451494
  • 51. =2 Active Learning Greedy
  • 52.
  • 53. 4. Expected Error Reduction test set development set active learning (risk) or
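  • 54. P(y|x;L): the probability that the model trained on L classifies x as y
      Expected log-loss after adding $(x_i, y_i)$: $R(x_i, y_i; L) = -\sum_u \sum_y P\left(y \mid x_u; L \cup \{(x_i, y_i)\}\right) \log P\left(y \mid x_u; L \cup \{(x_i, y_i)\}\right)$
      The $y_i$ in $R(x_i, y_i; L)$ is unknown, so take the expectation: $\mathbb{E}_y\left[R(x_i, y; L)\right] = \sum_y P(y \mid x_i; L)\, R(x_i, y; L)$
      The $x_i$ that minimizes it becomes the query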
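  • 55. MM+M (Guo+ 2007)
      Expected Error Reduction with log-loss (), but optimistic about $y_i$: the $x_i$ that minimizes $\min_y R(x_i, y; L)$ becomes the query (MCMI[min])
      When the $y_i$ returned by the oracle differs from the optimistic guess, the next query $x_j$ is chosen by Most Uncertain, that is, the Entropy-based Approach of Uncertainty Sampling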
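The selection rule on slides 54 and 55 can be sketched as one function with a switch between the expectation over labels and the optimistic minimum (MCMI[min]). This is an illustration only: `select_query` and `pool_log_loss` are hypothetical names, MultinomialNB with `alpha=0.01` and the synthetic data are assumptions, only labels already seen in L are tried, and the fallback to the most uncertain instance after a wrong guess is not implemented.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB


def pool_log_loss(model, X_pool):
    """-sum_u sum_y P(y|x_u) log P(y|x_u) over the pool, with 0 log 0 = 0."""
    p = model.predict_proba(X_pool)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(p > 0, p * np.log(p), 0.0)
    return float(-terms.sum())


def select_query(X_lab, y_lab, X_pool, candidates, optimistic=False, alpha=0.01):
    """Return (pool index, risk) of the candidate with the lowest risk.
    optimistic=False: expectation over y weighted by P(y | x_i; L).
    optimistic=True : min over y, as in MCMI[min]."""
    base = MultinomialNB(alpha=alpha).fit(X_lab, y_lab)
    p_cand = base.predict_proba(X_pool[candidates])   # columns follow base.classes_
    best_i, best_risk = None, np.inf
    for row, i in enumerate(candidates):
        # Retrain once per hypothetical label y and measure the resulting pool loss.
        risks = np.array([
            pool_log_loss(
                MultinomialNB(alpha=alpha).fit(
                    np.vstack([X_lab, X_pool[i:i + 1]]), np.append(y_lab, y)),
                X_pool)
            for y in base.classes_
        ])
        risk = risks.min() if optimistic else float(p_cand[row] @ risks)
        if risk < best_risk:
            best_i, best_risk = i, float(risk)
    return best_i, best_risk


# Hypothetical synthetic word-count data; the first 9 rows form the labeled set.
rng = np.random.default_rng(0)
theta = rng.dirichlet(np.ones(20), size=3)
y_all = np.arange(60) % 3
X_all = np.vstack([rng.multinomial(30, theta[c]) for c in y_all])
X_lab, y_lab, X_pool = X_all[:9], y_all[:9], X_all[9:]
candidates = list(range(10))   # assumption: score only a subset of the pool

eer_i, eer_risk = select_query(X_lab, y_lab, X_pool, candidates)
opt_i, opt_risk = select_query(X_lab, y_lab, X_pool, candidates, optimistic=True)
```

Each candidate costs one retraining per label, so the total is (number of labels) × (number of candidates) model fits per query, which is why restricting the candidate set matters in practice.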
  • 56. (4)
  • 57. MM-MS () Expected Error Reduction ()(pool size) 4 100 margin T(=30) MCMI[min] query MM+M 60 MM+M margin sampling
  • 58. 4
  • 59. 4100
  • 60. margin sampling entropy-based + information density Expected Error Reduction Variance Reduction O(()^3 ) RFTL active learning regret