Source: infolab.stanford.edu/~echang/ICME03-UCSB.pdf
7/6/2003 ICME Tutorial, Baltimore 1
Statistical Methods for Learning Multimedia Semantics
Edward Chang
Associate Professor, Electrical Engineering, UC Santa Barbara
CTO, VIMA Technologies
Outline
Statistical Learning
Multimedia Applications’ Data Characteristics
Classical Models
Kernel Methods
  Linear Model View
  Nearest Neighbor View
  Geometric View
Dimension Reduction Methods
Statistical Learning
Program the computers to learn!
Computers improve performance with experience at some task
Example:
  Task: classify images
  Performance: prediction accuracy
  Experience: labeled images
Definition
X: Data pool
  U: Unlabeled pool
  L: Labeled pool
G: Labels
  Regression: G → R
  Classification: G → {+1, -1}
H: Learning algorithm
Statistical Learning
Experience
  Characterized by training data L
Training
  f = H(L)
Task (e.g., prediction)
  ŷ = f(u), u ∈ U
Performance
  Measured by some error function
  e.g., maximizing y f(u)
Learning Algorithms (H)
Linear Regression
K-NN
Bayesian Analysis
Neural Networks
Decision Trees
Kernel Methods
Etc.
H
  Having a hypothesis space
  Find the “best” hypothesis based on the training data (L) efficiently
Best solution
  Fitting L well? Predicting U accurately!
Efficiency
  Computational complexity and resource requirements
Classical Model [Donoho 2000]
N: Number of training instances
  N = |U|
  N+, N-
D: Dimensionality
N >> D
  N → ∞
  E.g., PAC learnability
N- ≈ N+
Emerging MM Applications
N < D
N+ << N-
Examples
  Information retrieval with relevance feedback
  K-class classification
    Image classification
    Gene profiling
Gene Profiling Example
N = 59 cases, D = 4026 genes
Image Retrieval Demo
N < D
  N < 50
  D = 150
N+ << N-
ACM SIGMOD 01; ACM MM 01, 02; IEEE CVPR 03
Also see my Web site
SVMactive
Ranking
Solution Summary
N < D
  ACM MM 2001 (SVM Active)
    Make each u in U most informative
  PCM 2002, ICIP 2003
    Increase N- through co-training
  ACM MM 2002 (DPF)
    Reduce D
N+ << N-
  ACM MM 2003, ICML 2003
    Conformal transformation
    Kernel boundary alignment
Outline
Statistical Learning
MM Applications’ Data Characteristics
Classical Models (Classification)
Kernel Methods
  Linear Model View
  Nearest Neighbor View
  Geometric View
Dimension Reduction Methods
Classical Methods
Linear Model
  Least Square
  Maximum Likelihood
  Naïve Bayesian
  LDA
  Maximum Margin Hyperplane
Nearest Neighbor
Linear Regression
Least Square
$Y = \beta_0 + \sum_{j=1}^{D} \beta_j X_j$
$Y = X^T \beta$
$\mathrm{RSS}(\beta) = (Y - X\beta)^T (Y - X\beta)$
  RSS: Residual Sum of Squares
$\beta = (X^T X)^{-1} X^T Y$
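The closed-form least-squares solution can be sketched for the simplest case: one input feature plus an intercept, so that X^T X is a 2×2 matrix that can be inverted by hand. The function names and the toy data below are illustrative assumptions, not part of the tutorial.

```python
def least_squares_1d(xs, ys):
    """Solve beta = (X^T X)^{-1} X^T Y where each row of X is (1, x_i)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # X^T X = [[n, sx], [sx, sxx]] and X^T Y = [sy, sxy]
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("X^T X is singular; all x values are identical")
    beta0 = (sxx * sy - sx * sxy) / det
    beta1 = (n * sxy - sx * sy) / det
    return beta0, beta1


def rss(beta0, beta1, xs, ys):
    """Residual sum of squares of the fitted line."""
    return sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys))


# Toy data lying exactly on y = 1 + 2x (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
beta0, beta1 = least_squares_1d(xs, ys)
print(beta0, beta1, rss(beta0, beta1, xs, ys))
```

For D features the same formula needs a general linear solver; the 2×2 case is used here only to keep the sketch dependency-free.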
Maximum Likelihood
$Y = \beta_0 + \sum_{j=1}^{D} \beta_j X_j$
$Y = X^T \beta$
$Y = X^T \beta + \varepsilon$
  ε (noise signals) are independent
  $\varepsilon \sim N(0, \sigma^2)$
$P(y \mid \beta, x)$ has a normal distribution with
  Mean at $y = \beta x$
  Variance $\sigma^2$
Maximum Likelihood
$P(y \mid \beta, x) \sim N(\beta x, \sigma^2)$
Training
  Given $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$
  Infer $P(\beta \mid x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n)$
    By Bayes rule, or
    Maximum Likelihood Estimate
Maximum Likelihood
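For what β is
  $P(y_1, y_2, \ldots, y_n \mid x_1, x_2, \ldots, x_n, \beta)$ maximized?
  $\prod_i P(y_i \mid \beta, x_i)$ maximized?
  $\prod_i \exp\left(-\tfrac{1}{2}\left(\frac{y_i - \beta x_i}{\sigma}\right)^2\right)$ maximized?
  $\sum_i -\tfrac{1}{2}\left(\frac{y_i - \beta x_i}{\sigma}\right)^2$ maximized?
  $\sum_i (y_i - \beta x_i)^2$ minimized?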
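The chain of questions above says that, under independent Gaussian noise, the maximum likelihood β is the least-squares β. A small numerical check can make this concrete. It assumes a one-feature model without intercept, σ = 1, and made-up data; the function names are hypothetical.

```python
import math


def log_likelihood(beta, xs, ys, sigma=1.0):
    """Gaussian log-likelihood of y_i = beta * x_i + eps_i, eps_i ~ N(0, sigma^2)."""
    n = len(xs)
    const = -n * math.log(sigma * math.sqrt(2.0 * math.pi))
    squared_error = sum((y - beta * x) ** 2 for x, y in zip(xs, ys))
    return const - squared_error / (2.0 * sigma ** 2)


def least_squares_beta(xs, ys):
    """Minimizer of sum_i (y_i - beta * x_i)^2 for the no-intercept model."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Illustrative data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
beta_ls = least_squares_beta(xs, ys)

# Moving beta away from the least-squares value lowers the likelihood.
for beta in (beta_ls - 0.1, beta_ls, beta_ls + 0.1):
    print(round(beta, 3), round(log_likelihood(beta, xs, ys), 4))
```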
Least Square Linear Model
Solution Method #1
  $\mathrm{RSS}(\beta) = (Y - X\beta)^T (Y - X\beta)$
  $\beta = (X^T X)^{-1} X^T Y$
Solution Method #2 (for D > N)
  Gradient descent
  Perceptron
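Solution Method #2 can be sketched as plain batch gradient descent on RSS(β). When D > N the matrix X^T X has rank at most N and cannot be inverted, which is one reason to prefer an iterative method. The learning rate, step count, and toy data below are assumptions chosen so that this small example converges; they are not values from the tutorial.

```python
def gradient_descent_ls(X, Y, lr=0.01, steps=5000):
    """Minimize RSS(beta) = sum_i (y_i - x_i . beta)^2 by batch gradient descent."""
    d = len(X[0])
    beta = [0.0] * d
    for _ in range(steps):
        grad = [0.0] * d
        for x, y in zip(X, Y):
            residual = y - sum(b * xj for b, xj in zip(beta, x))
            for j in range(d):
                # d/d(beta_j) of (y - x . beta)^2 is -2 * residual * x_j
                grad[j] += -2.0 * residual * x[j]
        beta = [b - lr * g for b, g in zip(beta, grad)]
    return beta


# Each row is (1, x) so beta[0] plays the role of the intercept.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
Y = [1.0, 3.0, 5.0, 7.0]
beta = gradient_descent_ls(X, Y)
print(beta)
```

The toy problem has N > D so the result can be compared with the closed form (intercept 1, slope 2); the same loop runs unchanged when D > N.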
Other Linear Models
LDA
  Find the projection direction that minimizes the overlap of two Gaussian distributions
Separating Hyperplane
LDA
Separating Hyperplane
Maximum Margin Hyperplane
Only support vectors are involved in class prediction!
Linear Models
N ≥ D
  Least Square
  LDA
D > N
  Perceptron (using gradient descent)
  Maximum Margin Hyperplane
Generative vs. Discriminative Model
Linear Model Fits All Data?
How about Joining the Dots?
$Y(x) = \frac{1}{k} \sum_{x_i \in N_k(x)} y_i$
k = 1
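The nearest-neighbor average above can be sketched for one-dimensional inputs with absolute difference as the distance. Both choices, and the toy data, are assumptions made for illustration.

```python
def knn_predict(x, train_x, train_y, k=1):
    """Average the labels of the k training points closest to x."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    neighbors = order[:k]  # indices of N_k(x)
    return sum(train_y[i] for i in neighbors) / k


# Illustrative training set: y = x^2 sampled at integer points.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [0.0, 1.0, 4.0, 9.0, 16.0]

print(knn_predict(2.2, train_x, train_y, k=1))  # copies the single nearest label
print(knn_predict(2.2, train_x, train_y, k=3))  # averages the three nearest labels
```

With k = 1 the prediction reproduces every training label exactly, which is the "joining the dots" behavior the slide title refers to.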
Linear Model Fits All?
NN with k = 1
Nearest Neighbor
Four Things Make a Memory Based Learner
A distance function?
K: number of neighbors to consider?
A weighting function (optional)?
How to fit with the local points?
Problems of K=1
Fitting Noise
Jagged Boundaries
Solutions
Fitting Noise
  Pick a larger K?
NN with k = 15
NN
Solutions
Fitting Noise
  Pick a larger K?
Jagged Boundaries
  Introducing Kernel as a weighting function
Nearest Neighbor → Kernel Method
Four Things Make a Memory Based Learner
A distance function
K: number of neighbors to consider? All
A weighting function: RBF kernels
How to fit with the local points? Predict weights
Kernel Method
RBF Weighting Function
  Kernel width holds the key
    Implying K
  Use cross validation to find the “optimal” width
Fitting with the Local Points
  Where NN meets Linear Model
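One way to see how the kernel width "implies K" is a kernel-weighted average over all training points, with an RBF weight that decays with distance. This is a simple smoother used here for illustration, not the SVM formulation developed later in the tutorial; the function names, width values, and data are assumptions.

```python
import math


def rbf_weight(x, xi, width):
    """RBF (Gaussian) weight of training point xi as seen from query x."""
    return math.exp(-((x - xi) ** 2) / (2.0 * width ** 2))


def kernel_smoother(x, train_x, train_y, width=1.0):
    """Weighted average of all labels; every training point is a neighbor."""
    weights = [rbf_weight(x, xi, width) for xi in train_x]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all weights underflowed; width too small for this query")
    return sum(w * y for w, y in zip(weights, train_y)) / total


train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [0.0, 1.0, 4.0, 9.0, 16.0]

# A narrow width behaves like 1-NN; a very wide one approaches the global mean.
print(kernel_smoother(2.0, train_x, train_y, width=0.05))
print(kernel_smoother(2.0, train_x, train_y, width=1.0))
print(kernel_smoother(2.0, train_x, train_y, width=1e6))
```

In practice the width would be chosen by cross validation, as the slide suggests, rather than fixed by hand.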
LM vs. NN
Linear Model
  f(x) is approximated by a global linear function
  More stable, less flexible
Nearest Neighbor
  K-NN assumes f(x) is well approximated by a locally constant function
  Less stable, more flexible
Between LM and NN
  The other models…
Decision Theories
Bias & Variance Tradeoff
Bayes Prediction
VC Dimensionality
PAC Learnability
Variance vs. Bias
$\mathrm{MSE}(x) = E_T[f(x) - \hat{y}]^2$
$= E_T[\hat{y} - E_T(\hat{y})]^2 + [E_T(\hat{y}) - f(x)]^2$
$\mathrm{Error} = \mathrm{Var}_T(\hat{y}) + \mathrm{Bias}^2(\hat{y})$
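The step between the first and second lines of the decomposition can be written out as follows. This sketch assumes f(x) is a fixed (noise-free) target at x, so that the expectation E_T is taken only over training sets T.

```latex
\begin{aligned}
E_T\big[f(x) - \hat{y}\big]^2
  &= E_T\big[(\hat{y} - E_T\hat{y}) + (E_T\hat{y} - f(x))\big]^2 \\
  &= E_T\big[\hat{y} - E_T\hat{y}\big]^2
     + 2\,\big(E_T\hat{y} - f(x)\big)\,E_T\big[\hat{y} - E_T\hat{y}\big]
     + \big[E_T\hat{y} - f(x)\big]^2 \\
  &= \mathrm{Var}_T(\hat{y}) + \mathrm{Bias}^2(\hat{y})
\end{aligned}
```

The cross term vanishes because $E_T[\hat{y} - E_T\hat{y}] = 0$.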
Variance vs. Bias
Outline
Statistical Learning
Emerging Applications Data Characteristics
Classical Models (Classification)
Kernel Methods
Dimension Reduction Methods
Where Are We and Where Am I Heading To?
LM and NN
Kernel Method of Three Views
  LM view
  NN view
  Geometric view
Linear Model View
$Y = \beta_0 + \sum_j \beta_j X_j$
Separating Hyperplane
  $\max_{\|\beta\| = 1} C$
  subject to $y_i f(x_i) \ge C$, or
  $y_i (\beta_0 + \beta x_i) \ge C$
Classifier Margin
Margin
  Defined as the width of the boundary before hitting a data object
Maximum Margin
  Tends to minimize classification variance
  No formal theory for this yet
Separating Hyperplane
M’s Mathematical Representation
Plus-plane
  $\{x : w \cdot x + b = +1\}$
Minus-plane
  $\{x : w \cdot x + b = -1\}$
w ⊥ Plus-plane
  $w \cdot (u - v) = 0$, if u and v are on the plus-plane
w ⊥ Minus-plane
Separating Hyperplane
M
Let $x^-$ be any point on the minus-plane
Let $x^+$ be the closest plus-plane point to $x^-$
$x^+ = x^- + \lambda w$. Why?
  The line $(x^+ x^-)$ ⊥ minus-plane
$M = |x^+ - x^-|$
M
1. $w \cdot x^- + b = -1$
2. $w \cdot x^+ + b = 1$
3. $x^+ = x^- + \lambda w$
4. $M = |x^+ - x^-|$
5. $w \cdot (x^- + \lambda w) + b = 1$ (from 2 & 3)
6. $w \cdot x^- + b + \lambda\, w \cdot w = 1$
7. $\lambda\, w \cdot w = 2$
M
1. $\lambda\, w \cdot w = 2$
2. $\lambda = 2 / (w \cdot w)$
3. $M = |x^+ - x^-| = |\lambda w| = \lambda |w| = 2/|w|$
4. Max M
   Gradient descent, simulated annealing, EM, Newton’s method…
Max M
Max $M = 2/|w|$
Min $|w|/2$
Min $|w|^2/2$
  subject to $y_i (x_i \cdot w + b) \ge 1$, $i = 1, \ldots, N$
Quadratic criterion with linear inequality constraints
Max M
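Min $|w|^2/2$
  subject to $y_i (x_i \cdot w + b) \ge 1$, $i = 1, \ldots, N$
$L_P = \min_{w,b} \tfrac{1}{2}|w|^2 - \sum_{i=1}^{N} \alpha_i [y_i (x_i \cdot w + b) - 1]$
Setting the derivatives with respect to w and b to zero:
$w = \sum_{i=1}^{N} \alpha_i y_i x_i$
$0 = \sum_{i=1}^{N} \alpha_i y_i$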
Wolfe Dual
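$L_D = \sum_{i=1}^{N} \alpha_i - \tfrac{1}{2} \sum_{i,j=1}^{N} \alpha_i \alpha_j y_i y_j \, x_i \cdot x_j$
Subject to
  $\alpha_i \ge 0$
  $\alpha_i [y_i (x_i \cdot w + b) - 1] = 0$
  KKT conditions
    $\alpha_i > 0$, $y_i (x_i \cdot w + b) = 1$ (Support Vectors)
    $\alpha_i = 0$, $y_i (x_i \cdot w + b) > 1$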
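The Wolfe dual follows from the primal Lagrangian by substitution. The sketch below takes the primal with the multiplier term subtracted (the sign convention under which the stationarity condition $w = \sum_i \alpha_i y_i x_i$ holds) and uses $\sum_i \alpha_i y_i = 0$.

```latex
\begin{aligned}
L_P &= \tfrac{1}{2}\, w \cdot w - \sum_{i=1}^{N} \alpha_i \big[ y_i (x_i \cdot w + b) - 1 \big] \\
    &= \tfrac{1}{2}\, w \cdot w - w \cdot \sum_{i=1}^{N} \alpha_i y_i x_i
       - b \sum_{i=1}^{N} \alpha_i y_i + \sum_{i=1}^{N} \alpha_i \\
    &= \tfrac{1}{2}\, w \cdot w - w \cdot w - 0 + \sum_{i=1}^{N} \alpha_i \\
L_D &= \sum_{i=1}^{N} \alpha_i
       - \tfrac{1}{2} \sum_{i,j=1}^{N} \alpha_i \alpha_j y_i y_j \, x_i \cdot x_j
\end{aligned}
```

The data enter $L_D$ only through dot products $x_i \cdot x_j$, which is what later allows a kernel function to replace them.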
Class Prediction
$y_q = w \cdot x_q + b$
$w = \sum_{i=1}^{N} \alpha_i y_i x_i$
$y_q = \mathrm{sign}\left(\sum_{i=1}^{N} \alpha_i y_i (x_i \cdot x_q) + b\right)$
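The prediction rule can be sketched directly from the dual coefficients. The two-point training set below, and its values α = 0.25 and b = 0, were worked out by hand for this toy problem (they give w = (0.5, 0.5) and satisfy Σ αᵢyᵢ = 0); they are illustrative and do not come from the tutorial or from a solver.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def decision_value(x_q, support_x, support_y, alphas, b):
    """sum_i alpha_i * y_i * (x_i . x_q) + b, using only the support vectors."""
    return sum(a * y * dot(x, x_q)
               for a, y, x in zip(alphas, support_y, support_x)) + b


def predict(x_q, support_x, support_y, alphas, b):
    return 1 if decision_value(x_q, support_x, support_y, alphas, b) >= 0 else -1


# Hand-worked toy problem: one positive and one negative support vector.
support_x = [[1.0, 1.0], [-1.0, -1.0]]
support_y = [1, -1]
alphas = [0.25, 0.25]
b = 0.0

print(predict([2.0, 0.0], support_x, support_y, alphas, b))
print(predict([-3.0, 1.0], support_x, support_y, alphas, b))
```

Points with αᵢ = 0 contribute nothing to the sum, so only the support vectors need to be stored.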
Non-separable Classes
Soft Margin Hyperplane
Basis Expansion
Non-separating Case
Soft Margin SVMs
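Min $|w|^2/2$
  subject to $y_i (x_i \cdot w + b) \ge 1$, $i = 1, \ldots, N$
Min $|w|^2/2 + C \sum_i \varepsilon_i$
  $x_i \cdot w + b \ge 1 - \varepsilon_i$ if $y_i = 1$
  $x_i \cdot w + b \le -1 + \varepsilon_i$ if $y_i = -1$
  $\varepsilon_i \ge 0$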
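For a fixed w and b, the smallest slack that satisfies the constraints above is εᵢ = max(0, 1 − yᵢ(xᵢ·w + b)). The sketch below evaluates the slacks and the soft-margin objective for a hand-picked hyperplane; the data, w, b, and C are illustrative assumptions, and no optimization is performed.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def slacks(X, Y, w, b):
    """Smallest eps_i >= 0 with y_i (x_i . w + b) >= 1 - eps_i."""
    return [max(0.0, 1.0 - y * (dot(x, w) + b)) for x, y in zip(X, Y)]


def soft_margin_objective(X, Y, w, b, C):
    """|w|^2 / 2 + C * sum_i eps_i."""
    return 0.5 * dot(w, w) + C * sum(slacks(X, Y, w, b))


X = [[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0], [0.5, 0.0]]
Y = [1, 1, -1, -1]
w = [1.0, 0.0]
b = 0.0

eps = slacks(X, Y, w, b)
print(eps)  # zero outside the margin, positive inside it or when misclassified
print(soft_margin_objective(X, Y, w, b, C=10.0))
```

A larger C penalizes margin violations more heavily; a smaller C tolerates them in exchange for a wider margin.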
Non-separating Case
Wolfe Dual
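$L_D = \sum_{i=1}^{N} \alpha_i - \tfrac{1}{2} \sum_{i,j=1}^{N} \alpha_i \alpha_j y_i y_j \, x_i \cdot x_j$
Subject to
  $C \ge \alpha_i \ge 0$
  $\sum_i \alpha_i y_i = 0$
  KKT conditions
$y_q = \mathrm{sign}\left(\sum_{i=1}^{N} \alpha_i y_i (x_i \cdot x_q) + b\right)$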
Basis Function
Harder 1D Example
Basis Function
$\Phi(x) = (x, x^2)$
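The effect of this basis function can be sketched on a made-up 1D data set in which the positive class sits on both sides of the negative class. No threshold on x separates the classes, but after mapping to (x, x²) a single line does. The data, the separating line x² = 2.5, and the helper names are all assumptions for illustration.

```python
def phi(x):
    """Basis expansion of a scalar input: (x, x^2)."""
    return (x, x * x)


def classify(x, w=(0.0, 1.0), b=-2.5):
    """Linear classifier in the expanded space: sign(w . phi(x) + b)."""
    z = phi(x)
    return 1 if w[0] * z[0] + w[1] * z[1] + b > 0 else -1


def threshold_separates(data):
    """True if some single cut on the raw x axis classifies every point."""
    xs = sorted(x for x, _ in data)
    cuts = [xs[0] - 1.0] + [(a + c) / 2 for a, c in zip(xs, xs[1:])] + [xs[-1] + 1.0]
    for t in cuts:
        for side in (1, -1):
            if all((side if x > t else -side) == y for x, y in data):
                return True
    return False


# Positives on the outside, negatives in the middle.
data = [(-3.0, 1), (-2.0, 1), (-1.0, -1), (0.0, -1), (1.0, -1), (2.0, 1), (3.0, 1)]

print(threshold_separates(data))               # no linear separator on raw x
print(all(classify(x) == y for x, y in data))  # separable after expansion
```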
Harder 1D Example
Some Basis Functions
$\Phi(X) = \sum_m \gamma_m h_m(X)$, $h_m : \mathbb{R}^p \to \mathbb{R}$
Common Functions
  Polynomial
  Radial basis functions
  Sigmoid functions
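Kernel Function
$L_D = \sum_{i=1}^{N} \alpha_i - \tfrac{1}{2} \sum_{i,j=1}^{N} \alpha_i \alpha_j y_i y_j \, \Phi(x_i) \cdot \Phi(x_j)$
Subject to
  $C \ge \alpha_i \ge 0$
  $\sum_i \alpha_i y_i = 0$
  KKT conditions
$y_q = \mathrm{sign}\left(\sum_{i=1}^{N} \alpha_i y_i (\Phi(x_i) \cdot \Phi(x_q)) + b\right)$
$K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$
  Kernel function!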
![Page 74: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/74.jpg)
Quadratic Basis Functions
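$\Phi(a) = \{1, a_i, a_i a_j\}$, $i, j = 1..D$
  $(D+1)(D+2)/2$ terms, i.e. on the order of $D^2$ terms
  $O(D^2)$ computational cost per dot product
It is equivalent (up to constant scaling of the terms) to $(a \cdot b + 1)^2$
  $O(D)$ computational cost
Total computational cost
  $O(N^2 D)$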
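A short numerical check of the equivalence between the explicit quadratic basis and $(a \cdot b + 1)^2$. The $\sqrt{2}$ scaling of the linear and cross terms is the standard choice that makes the identity exact; the slide does not spell it out, so treat it as an assumption of this sketch, as are the example vectors.

```python
import numpy as np
from itertools import combinations

def quadratic_features(a):
    """Explicit basis: 1, sqrt(2)*a_i, a_i^2, sqrt(2)*a_i*a_j for i < j."""
    a = np.asarray(a, dtype=float)
    cross = [np.sqrt(2.0) * a[i] * a[j]
             for i, j in combinations(range(len(a)), 2)]
    return np.concatenate([[1.0], np.sqrt(2.0) * a, a ** 2, cross])

def quadratic_kernel(a, b):
    """Same inner product computed in O(D) without building the features."""
    return (np.dot(a, b) + 1.0) ** 2

a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([0.5, -1.0, 2.0, 0.0])

explicit = float(np.dot(quadratic_features(a), quadratic_features(b)))
implicit = float(quadratic_kernel(a, b))
n_terms = len(quadratic_features(a))  # (D+1)(D+2)/2 for D = 4
```

The explicit route builds $(D+1)(D+2)/2$ terms per vector, while the kernel route needs only one $D$-dimensional dot product.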
![Page 75: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/75.jpg)
Dot Product Saves the Day
With the dot product (kernel), any degree: $O(N^2 D)$
With explicit basis functions:
  Quadratic: $O(N^2 D^2)$
  Cubic: $O(N^2 D^3)$
  Quartic: $O(N^2 D^4)$
![Page 76: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/76.jpg)
Quiz
What is the signature of a degree-$d$ polynomial kernel function?
$(a \cdot b + 1)^d$
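As a sketch, the degree-$d$ polynomial kernel $(a \cdot b + 1)^d$ and its Gram matrix over $N$ points can be written as below. The function names and the sample data are illustrative, not from the slides.

```python
import numpy as np

def polynomial_kernel(a, b, d):
    """K(a, b) = (a . b + 1)^d for a single pair of vectors."""
    return (np.dot(a, b) + 1.0) ** d

def gram_matrix(X, d):
    """All N^2 kernel values at once; each entry costs one D-dim dot product."""
    X = np.asarray(X, dtype=float)
    return (X @ X.T + 1.0) ** d

X = np.array([[1.0, 2.0], [3.0, 4.0], [0.0, -1.0]])
K = gram_matrix(X, d=3)
single = polynomial_kernel(X[0], X[1], d=2)  # (1*3 + 2*4 + 1)^2
```

The cost of `gram_matrix` is $O(N^2 D)$ regardless of `d`, which is the point of the kernel formulation.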
![Page 77: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/77.jpg)
Outline
LM and NN
Kernel Method of Three Views
  LM view
  NN view
  Geometric view
![Page 78: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/78.jpg)
Nearest Neighbor View
$Z$, a set of zero-mean jointly Gaussian random variables
Each $z_i$ corresponds to one example $x_i$
$\mathrm{Cov}(z_i, z_j) = K(x_i, x_j)$
$y_i$, the label of $z_i$, is $+1$ or $-1$
$P(y_i \mid z_i) = \sigma(y_i z_i)$
![Page 79: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/79.jpg)
Training Data
![Page 80: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/80.jpg)
General Kernel Classifier [Jaakkola et al. 99]
MAP classification for $x_t$
  $y_t = \mathrm{sign}\left(\sum_i \alpha_i y_i K(x_t, x_i)\right)$
  $K(x_i, x_j) = \mathrm{Cov}(z_i, z_j)$ (some similarity function)
Supervised training: compute $\alpha_i$, given
  $X$ and $y$, and
  an error function such as $J(\alpha) = -\frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j) + \sum_i F(\alpha_i)$
![Page 81: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/81.jpg)
Leave One Out
![Page 82: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/82.jpg)
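SVMs
$y_t = \mathrm{sign}\left(\sum_i \alpha_i y_i K(x_t, x_i)\right)$
$(x_i, y_i)$ training data, $\alpha_i$ nonnegative, and kernel $K$ positive definite
$\alpha_i$ is obtained by maximizing
  $J(\alpha) = -\frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j) + \sum_i F(\alpha_i)$, with $F(\alpha_i) = \alpha_i$
subject to
  $\alpha_i \ge 0$, $\sum_i y_i \alpha_i = 0$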
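To make the SVM objective concrete, the sketch below evaluates $J(\alpha)$ with $F(\alpha_i) = \alpha_i$ and checks the two constraints. It is only an evaluator, not a solver; the two-point dataset and the linear kernel are hypothetical choices for which the maximizer can be found by hand.

```python
import numpy as np

def dual_objective(alpha, y, K):
    """J(alpha) = sum_i alpha_i - 1/2 sum_ij alpha_i alpha_j y_i y_j K_ij."""
    v = alpha * y
    return float(np.sum(alpha) - 0.5 * v @ K @ v)

def is_feasible(alpha, y, tol=1e-9):
    """Check alpha_i >= 0 and sum_i y_i alpha_i = 0."""
    return bool(np.all(alpha >= -tol) and abs(np.dot(alpha, y)) <= tol)

# Hypothetical toy problem: x_1 = +1, x_2 = -1 with a linear kernel.
X = np.array([[1.0], [-1.0]])
y = np.array([1.0, -1.0])
K = X @ X.T

# With alpha = (a, a) the objective is 2a - 2a^2, maximized at a = 0.5.
best = dual_objective(np.array([0.5, 0.5]), y, K)
lower = dual_objective(np.array([0.4, 0.4]), y, K)
higher = dual_objective(np.array([0.6, 0.6]), y, K)
```

In practice the maximization is handed to a quadratic programming routine; this sketch only shows what that routine is maximizing.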
![Page 83: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/83.jpg)
Important Insight
$K(x_i, x_j) = \mathrm{Cov}(z_i, z_j)$
To design a kernel is to design a similarity function that produces a positive definite covariance matrix on the training instances.
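One practical way to test a candidate similarity function is to build its matrix on the training instances and inspect the eigenvalues. The sketch below does this for an RBF similarity and for negative squared distance, a similarity that fails the test; both choices, and the tolerance, are assumptions made for illustration.

```python
import numpy as np

def rbf_gram(X, gamma=0.5):
    """Matrix of exp(-gamma * ||x_i - x_j||^2) over all pairs of rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def is_positive_semidefinite(K, tol=1e-10):
    """A symmetric matrix is a valid covariance if no eigenvalue is negative."""
    eigvals = np.linalg.eigvalsh((K + K.T) / 2.0)
    return bool(eigvals.min() >= -tol)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
good = rbf_gram(X)

# Negative squared distance between the points 0 and 1: eigenvalues are +1, -1.
bad = np.array([[0.0, -1.0], [-1.0, 0.0]])
```

Only `good` could serve as the covariance of the Gaussian variables $z_i$; `bad` has a negative eigenvalue.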
![Page 84: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/84.jpg)
Basis Function Selection
Three General Approaches
Restriction methods
  Limit the class of functions
Selection methods
  Scan the dictionary adaptively (Boosting)
Regularization methods
  Use the entire dictionary but restrict coefficients (Ridge Regression)
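As an illustration of the regularization approach, the sketch below fits ridge regression over a full polynomial dictionary and shows that a larger penalty shrinks the coefficients. The dictionary, the synthetic data, and the penalty values are hypothetical.

```python
import numpy as np

def polynomial_dictionary(x, degree):
    """Columns 1, x, x^2, ..., x^degree: a dictionary of basis functions."""
    return np.vander(x, degree + 1, increasing=True)

def ridge_fit(H, y, lam):
    """Minimize ||y - H c||^2 + lam * ||c||^2 in closed form."""
    p = H.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(p), H.T @ y)

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 30)
y = np.sin(3.0 * x) + 0.1 * rng.normal(size=x.size)
H = polynomial_dictionary(x, degree=8)

weak = ridge_fit(H, y, lam=1e-3)     # nearly unrestricted coefficients
strong = ridge_fit(H, y, lam=10.0)   # heavily restricted coefficients
```

All nine basis functions stay in the model in both fits; only the size of their coefficients changes.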
![Page 85: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/85.jpg)
Overfitting?
Probably not, because
  N free parameters (not D)
  Maximizing margin
![Page 86: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/86.jpg)
Geometrical View
$S = w \cdot X + b$
$|w| = 1$, $b = 0$
$V = \{\, w \mid S_i f(x_i) > 0;\ i = 1..n,\ |w| = 1 \,\}$
SVM is the center of the largest sphere contained in $V$
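A small sketch of the version space for linear classifiers through the origin ($b = 0$). Labels are written `y` here rather than $S_i$, and the normalized margin is used as a stand-in for the radius of the sphere around `w` that stays inside $V$; that reading and the toy data are assumptions of this example.

```python
import numpy as np

def in_version_space(w, X, y):
    """True if the unit-normalized w classifies every training point correctly."""
    w = np.asarray(w, dtype=float)
    w = w / np.linalg.norm(w)
    return bool(np.all(y * (X @ w) > 0))

def margin(w, X, y):
    """Smallest normalized distance from w to any constraint boundary."""
    w = np.asarray(w, dtype=float)
    w = w / np.linalg.norm(w)
    return float(np.min(y * (X @ w) / np.linalg.norm(X, axis=1)))

X = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
y = np.array([1.0, 1.0, -1.0])

centered = margin([1.0, 1.0], X, y)   # well inside the version space
skewed = margin([1.0, 0.1], X, y)     # close to one boundary
```

Both weight vectors are consistent with the data, but `centered` sits farther from every boundary, which is the property the SVM solution maximizes.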
![Page 87: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/87.jpg)
SVMs
![Page 88: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/88.jpg)
BPMs
Bayes Objective Function
$\hat{S}_t = \mathrm{Bayes}_Z(X_t) = \arg\min_{S_i \in S} E_{H|Z=x}\left[\, l(H(x), S_i) \,\right]$
BPMs [Herbrich et al. 2001]
$A_{bp} = \arg\min_{h \in H} E_x\left[\, E_{H|Z=x}\left[\, l(H(x), h(x)) \,\right] \right]$
![Page 89: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/89.jpg)
BPMs
Linear classifier
Input X possesses a spherical Gaussian density
BP is the center of mass of the version space
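The center-of-mass idea can be sketched with plain rejection sampling: draw unit weight vectors at random, keep those consistent with the training data, and average them. This is not the billiard or perceptron algorithm of Herbrich et al.; it is a naive stand-in that assumes a uniform prior over directions and only works in low dimensions. The toy data are hypothetical.

```python
import numpy as np

def estimate_bayes_point(X, y, n_samples=20000, seed=0):
    """Average the sampled unit vectors that lie in the version space."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_samples, X.shape[1]))
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    consistent = np.all((W @ X.T) * y > 0, axis=1)
    if not consistent.any():
        raise ValueError("no sampled classifier is consistent with the data")
    center = W[consistent].mean(axis=0)
    return center / np.linalg.norm(center)

# Constraints: w1 > 0, w2 > 0, w1 > 0.5 * w2, i.e. angles between 0 and atan(2).
X = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.5]])
y = np.array([1.0, 1.0, -1.0])

bayes_point = estimate_bayes_point(X, y)
angle_deg = float(np.degrees(np.arctan2(bayes_point[1], bayes_point[0])))
```

For this toy problem the version space is an arc of the unit circle, so the estimate should land near the bisector of that arc.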
![Page 90: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/90.jpg)
BPMs vs. SVMs
![Page 91: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/91.jpg)
BPMs
Use SVMs to find a good h in H
Find the BP
  Billiard Algorithm [Herbrich et al. 2001]
  Perceptron Algorithm [Herbrich et al. 2001]
![Page 92: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/92.jpg)
Billiard Ball Algorithm (R. Herbrich)
![Page 93: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/93.jpg)
Outline
Statistical Learning
Emerging Applications Data Characteristics
Classical Models (Classification)
Kernel Methods
Dimension Reduction Methods
![Page 94: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/94.jpg)
Similarity Measurement
![Page 95: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/95.jpg)
Perceptual Distance Function
Two Monumental Challenges
  Formulating a perceptual feature space
  Formulating a perceptual distance function
![Page 96: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/96.jpg)
Dimensionality Curse
D: data dimension
When D increases
  Nearest neighbors are not local
  All points become nearly equidistant
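The equidistance effect can be observed numerically. The sketch below draws uniform random points (an assumption; real feature data need not be uniform) and measures the relative contrast between the farthest and nearest neighbor of a random query.

```python
import numpy as np

def relative_contrast(n_points, dim, seed=0):
    """(d_max - d_min) / d_min for one random query against random data."""
    rng = np.random.default_rng(seed)
    data = rng.random((n_points, dim))
    query = rng.random(dim)
    dist = np.linalg.norm(data - query, axis=1)
    return float((dist.max() - dist.min()) / dist.min())

low_dim = relative_contrast(1000, 2)
high_dim = relative_contrast(1000, 1000)
```

In two dimensions the farthest point is many times farther than the nearest one; in a thousand dimensions the two distances differ by only a small fraction.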
![Page 97: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/97.jpg)
Sparse High-D Space [C. Aggarwal et al. ICDT 2001]
Hyper-cube Range Queries
$P^{d}[s] = s^{d}$
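A quick calculation of the hyper-cube case, reading the formula as the fraction $s^d$ of a unit cube covered by a cube query of side $s$. Uniformly distributed data is an assumption of this sketch, and the function names are illustrative.

```python
def hypercube_fraction(s, d):
    """Fraction of the unit cube covered by a cube range query of side s."""
    return s ** d

def side_for_fraction(p, d):
    """Side length needed for a cube query to cover a fraction p of the data."""
    return p ** (1.0 / d)

covered = hypercube_fraction(0.5, 10)
needed_side = side_for_fraction(0.01, 100)
```

In 10 dimensions a query of side 0.5 covers less than 0.1% of the space, and in 100 dimensions a query must span more than 95% of each axis to capture just 1% of uniformly distributed points.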
![Page 98: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/98.jpg)
![Page 99: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/99.jpg)
Sparse High-D Space
Spherical Range Queries
![Page 100: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/100.jpg)
$P[R_d \in sp(Q, 0.5)] = \dfrac{\sqrt{\pi^d} \cdot (0.5)^d}{\Gamma(d/2 + 1)}$
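The spherical case uses the volume of a $d$-dimensional ball of radius 0.5, $\pi^{d/2} (0.5)^d / \Gamma(d/2 + 1)$, relative to the unit cube that encloses it. The sketch below evaluates that expression for a few dimensions; the function name is illustrative.

```python
import math

def inscribed_sphere_fraction(d, radius=0.5):
    """Volume of a d-dimensional ball of the given radius (unit cube has volume 1)."""
    return math.pi ** (d / 2) * radius ** d / math.gamma(d / 2 + 1)

fraction_2d = inscribed_sphere_fraction(2)    # pi / 4
fraction_3d = inscribed_sphere_fraction(3)    # pi / 6
fraction_20d = inscribed_sphere_fraction(20)  # vanishingly small
```

The fraction falls from roughly 79% in two dimensions to well under one in a million by twenty dimensions, so a spherical range query centered in the cube captures almost nothing.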
![Page 101: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/101.jpg)
![Page 102: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/102.jpg)
Dimensionality Curse
![Page 103: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/103.jpg)
So?
Is the nearest neighbor estimate cursed in high-D spaces?
Yes! When D is large and N is relatively small, the estimate is off.
![Page 104: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/104.jpg)
Are We Doomed?
How does the curse affect classification?
  Similar objects tend to cluster together
  Classification makes binary prediction
![Page 105: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/105.jpg)
Distribution of Distances
![Page 106: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/106.jpg)
Some Solutions to High-D
Restricted Estimators: specifying the nature of the local neighborhood
Adaptive Feature Reduction: PCA, LDA
Dynamic Partial Function
![Page 107: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/107.jpg)
Three Major Paradigms
Preserve data description in a lower dimensional space
  PCA
Maximize discriminability in a lower dimensional space
  LDA
Activate only similar channels
  DPF
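For the first paradigm, a minimal PCA sketch via the singular value decomposition is shown below. The synthetic data and the choice of `k` are hypothetical; LDA and DPF are not covered by this example.

```python
import numpy as np

def pca_project(X, k):
    """Project centered data onto its k directions of largest variance."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

rng = np.random.default_rng(0)
# Hypothetical 10-D data whose variance is concentrated in a few directions.
X = rng.normal(size=(200, 10)) * np.array([5.0, 3.0] + [0.1] * 8)

Z = pca_project(X, k=2)
component_variance = Z.var(axis=0)
```

The projection keeps the directions that describe the data best, without using any class labels.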
![Page 108: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/108.jpg)
Minkowski Distance
Objects P and Q
$D = \left(\sum_{i=1}^{M} |p_i - q_i|^n\right)^{1/n}$
Similar images are similar in all M features
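A direct sketch of the Minkowski distance over $M$ features, with the absolute value made explicit. The feature vectors are hypothetical.

```python
import numpy as np

def minkowski_distance(p, q, n=2):
    """D = (sum_i |p_i - q_i|^n)^(1/n); n=1 is city-block, n=2 is Euclidean."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(np.abs(p - q) ** n) ** (1.0 / n))

euclidean = minkowski_distance([0.0, 0.0], [3.0, 4.0], n=2)
city_block = minkowski_distance([0.0, 0.0], [3.0, 4.0], n=1)
```

Every one of the $M$ features contributes to the sum, which is why this distance treats two images as similar only when they agree in all features.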
![Page 109: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/109.jpg)
[Figure: two plots of Frequency (y-axis, log scale from 1.0E-06 to 1.0E-01) against Feature Distance (x-axis)]
![Page 110: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/110.jpg)
Weighted Minkowski Distance
$D = \left(\sum_{i=1}^{M} w_i \, |p_i - q_i|^n\right)^{1/n}$
Similar images are similar in the same subset of the M features
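The weighted variant can be sketched the same way. The weights below are hypothetical; setting a weight to zero removes that feature from the comparison for every pair of objects.

```python
import numpy as np

def weighted_minkowski_distance(p, q, w, n=2):
    """D = (sum_i w_i * |p_i - q_i|^n)^(1/n) with fixed per-feature weights."""
    p, q, w = (np.asarray(v, dtype=float) for v in (p, q, w))
    return float(np.sum(w * np.abs(p - q) ** n) ** (1.0 / n))

# Ignore the second feature entirely by giving it zero weight.
first_only = weighted_minkowski_distance([0.0, 0.0], [3.0, 4.0], w=[1.0, 0.0])
both = weighted_minkowski_distance([0.0, 0.0], [3.0, 4.0], w=[1.0, 1.0])
```

Because the weights are fixed in advance, the same subset of features is emphasized for all pairs, which matches the assumption stated on the slide.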
![Page 111: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/111.jpg)
[Figure: four panels (GIF, Scale up/down, Cropping, Rotation), each plotting Average Distance (y-axis) against Feature Number (x-axis)]
![Page 112: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/112.jpg)
Similarity Theories
Objects are similar in all respects (Richardson 1928)
Objects are similar in some respects (Tversky 1977)
Similarity is a process of determining respects, rather than using predefined respects (Goldstone 94)
![Page 113: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/113.jpg)
DPF
Which Place is Similar to DC?
Partial
Dynamic
Dynamic Partial Function
See ACM MM 2002, ICIP 2002, ACM MM Journal
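The slide does not give a formula for DPF. The sketch below assumes one common reading of "partial" and "dynamic": for each pair of objects, only the `m` smallest per-feature differences are summed, so the set of active features changes from pair to pair. The parameter names `m` and `r` and the example vectors are assumptions of this illustration.

```python
import numpy as np

def dpf_distance(p, q, m, r=2):
    """Minkowski-style distance over only the m most similar features."""
    delta = np.abs(np.asarray(p, dtype=float) - np.asarray(q, dtype=float))
    smallest = np.sort(delta)[:m]  # the active features for this pair
    return float(np.sum(smallest ** r) ** (1.0 / r))

p = [0.0, 0.0, 0.0, 0.0]
q = [1.0, 1.0, 10.0, 0.0]

partial = dpf_distance(p, q, m=3)  # drops the outlying third feature
full = dpf_distance(p, q, m=4)     # same as the Euclidean distance
```

With `m` equal to the number of features the function reduces to the Minkowski distance; with a smaller `m`, a single badly mismatched feature no longer dominates the result.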
![Page 114: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/114.jpg)
Precision/Recall
![Page 115: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/115.jpg)
Summary
Statistical Learning
Emerging Applications Data Characteristics
Classical Models (Classification)
Kernel Methods
  Linear Model View
  Nearest Neighbor View
  Geometric View
Dimension Reduction Methods
![Page 116: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/116.jpg)
Advanced Topics
Imbalanced Data Learning
  $N^- \gg N^+$
  See our ICML 2003 papers
Sequence-data Kernel
Kernel Alignment & Boosting
![Page 117: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/117.jpg)
Useful Links
Related Publications
  http://www-db.stanford.edu/~echang/
Online Demo
VIMA Technologies
  Six deployments as of July 2003
  www.vimatech.com
![Page 118: Statistical Methods for Learning Multimedia …infolab.stanford.edu/~echang/ICME03-UCSB.pdf7/6/2003 ICME Tutorial, Baltimore 1 Statistical Methods for Learning Multimedia Semantics](https://reader031.vdocuments.mx/reader031/viewer/2022030720/5b05a5087f8b9ad1768ba9db/html5/thumbnails/118.jpg)
References
1. The Elements of Statistical Learning, T. Hastie, R. Tibshirani, and J. Friedman, Springer, N.Y., 2001
2. Machine Learning, T. Mitchell, 1997
3. High-dimensional Data Analysis, D. Donoho, American Math. Society Lecture, 2000
4. Support Vector Machine Active Learning for Image Retrieval, S. Tong and E. Chang, ACM MM, 2001
5. Dynamic Partial Function, B. Li and E. Chang, ACM Multimedia Journal, 2003
6. Pattern Discovery in Sequences under a Markov Assumption, D. Chudova and P. Smyth, ACM KDD 2002
7. Bayes Point Machines, R. Herbrich, T. Graepel and C. Campbell, Journal of Machine Learning Research, 2001
8. The Nature of Statistical Learning Theory, V. Vapnik, Springer, N.Y., 1995
9. Probabilistic Kernel Regression Models, T. Jaakkola and D. Haussler, Conference of AI and Statistics, 1999
10. Support Vector Machines, Lecture Notes, A. Moore, CMU
11. On the Surprising Behavior of Distance Metrics in High-dimensional Space, C. Aggarwal, A. Hinneburg, and D. Keim, ICDT 2001
12. Adaptive Conformal Transformation for Learning Imbalanced Data, G. Wu, E. Chang, International Conference on Machine Learning, August 2003