Nearest Neighbor Classification
svivek.com/teaching/machine-learning/fall2018/...
TRANSCRIPT
Machine Learning
Nearest Neighbor Classification
This lecture

• K-nearest neighbor classification
  – The basic algorithm
  – Different distance measures
  – Some practical aspects
• Voronoi Diagrams and Decision Boundaries
  – What is the hypothesis space?
• The Curse of Dimensionality
How would you color the blank circles (A, B, C)?

If we based it on the color of their nearest neighbors, we would get:
A: Blue, B: Red, C: Red
Training data partitions the entire instance space (using labels of nearest neighbors).
Nearest Neighbors: The basic version

• Training examples are vectors xi associated with a label yi
  – E.g. xi = a feature vector for an email, yi = SPAM
• Learning: Just store all the training examples
• Prediction for a new example x
  – Find the training example xi that is closest to x
  – Predict the label of x to be the label yi associated with xi
K-Nearest Neighbors

• Training examples are vectors xi associated with a label yi
  – E.g. xi = a feature vector for an email, yi = SPAM
• Learning: Just store all the training examples
• Prediction for a new example x
  – Find the k closest training examples to x
  – Construct the label of x using these k points. How?
  – For classification: every neighbor votes on the label. Predict the most frequent label among the neighbors.
  – For regression: predict the mean value of the k neighbors.
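The prediction rule above can be sketched in a few lines (a minimal illustration, not code from the slides; the function and variable names are mine):

```python
import math
from collections import Counter

def knn_predict(train, x, k=3, task="classification"):
    """train: list of (feature_vector, label) pairs. 'Learning' is just
    storing train; all the work happens at prediction time."""
    # Find the k training examples closest to x under Euclidean distance.
    neighbors = sorted(train, key=lambda pair: math.dist(pair[0], x))[:k]
    labels = [y for _, y in neighbors]
    if task == "classification":
        # Every neighbor votes; predict the most frequent label.
        return Counter(labels).most_common(1)[0][0]
    # Regression: predict the mean of the neighbors' values.
    return sum(labels) / k

train = [((0.0, 0.0), "red"), ((0.0, 1.0), "red"),
         ((5.0, 5.0), "blue"), ((5.0, 6.0), "blue")]
label = knn_predict(train, (0.5, 0.5), k=3)  # two red neighbors outvote one blue
```

Note that with k = 1 this reduces to the basic nearest-neighbor rule on the previous slide.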
Instance based learning

• A class of learning methods
  – Learning: storing examples with labels
  – Prediction: when presented a new example, classify it using similar stored examples
• The K-nearest neighbors algorithm is an example of this class of methods
• Also called lazy learning, because most of the computation (in the simplest case, all computation) is performed only at prediction time

Questions?
Distance between instances

• In general, a good place to inject knowledge about the domain
• The behavior of this approach can depend heavily on the choice of distance
• How do we measure distances between instances?
Distance between instances

Numeric features, represented as n-dimensional vectors
– Euclidean distance: d(x, z) = √(Σᵢ (xᵢ − zᵢ)²)
– Manhattan distance: d(x, z) = Σᵢ |xᵢ − zᵢ|
– Lp-norm: d(x, z) = (Σᵢ |xᵢ − zᵢ|ᵖ)^(1/p)
  • Euclidean = L2
  • Manhattan = L1
  • Exercise: What is L∞?
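The Lp family above can be written as one function (a quick sketch; the name is mine):

```python
def lp_distance(x, z, p=2.0):
    """L_p distance between two equal-length numeric vectors.

    p=2 gives Euclidean, p=1 gives Manhattan, p=inf gives the
    largest single-coordinate gap (the L-infinity norm)."""
    if p == float("inf"):
        return max(abs(xi - zi) for xi, zi in zip(x, z))
    return sum(abs(xi - zi) ** p for xi, zi in zip(x, z)) ** (1.0 / p)
```

For the points (0, 0) and (3, 4), this gives 5.0 for p=2, 7.0 for p=1, and 4.0 for p=∞.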
Distance between instances

What about symbolic/categorical features?
Distance between instances

Symbolic/categorical features

The most common distance is the Hamming distance
– Number of bits that are different
– Or: the number of features that have a different value
– Also called the overlap
– Example:
  X1: {Shape=Triangle, Color=Red, Location=Left, Orientation=Up}
  X2: {Shape=Triangle, Color=Blue, Location=Left, Orientation=Down}
  Hamming distance = 2
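The example above can be checked with a one-line implementation (a sketch; the function name is mine):

```python
def hamming_distance(x1, x2):
    """Number of features whose values differ between two instances."""
    return sum(v1 != v2 for v1, v2 in zip(x1, x2))

x1 = ("Triangle", "Red", "Left", "Up")
x2 = ("Triangle", "Blue", "Left", "Down")
d = hamming_distance(x1, x2)  # Color and Orientation differ, so d = 2
```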
Advantages

• Training is very fast
  – Just adding labeled instances to a list
  – More complex indexing methods can be used, which slow down learning slightly to make prediction faster
• Can learn very complex functions
• We always have the training data
  – For other learning algorithms, after training, we don't store the data anymore. What if we want to do something with it later…
Disadvantages

• Needs a lot of storage
  – Is this really a problem now?
• Prediction can be slow!
  – Naïvely: O(dN) for N training examples in d dimensions
  – More data will make it slower
  – Compare to other classifiers, where prediction is very fast
• Nearest neighbors are fooled by irrelevant attributes
  – Important and subtle

Questions?
Summary: K-Nearest Neighbors

• Probably the first "machine learning" algorithm
  – Guarantee: if there are enough training examples, the error of the nearest neighbor classifier will converge to at most twice the error of the optimal (i.e. best possible) predictor
• In practice, use an odd K. Why?
  – To break ties
• How to choose K? Using a held-out set or by cross-validation
• Feature normalization could be important
  – Often a good idea to center the features to make them zero mean and unit standard deviation. Why?
  – Because different features could have different scales (weight, height, etc.), but the distance weights them equally
• Variants exist
  – Neighbors' labels could be weighted by their distance
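The distance-weighted variant mentioned in the last bullet can be sketched as follows (weights of 1/distance are one common choice; the function name is mine):

```python
import math
from collections import defaultdict

def weighted_knn_predict(train, x, k=3):
    """Classify x by a distance-weighted vote among the k nearest neighbors."""
    neighbors = sorted(train, key=lambda p: math.dist(p[0], x))[:k]
    votes = defaultdict(float)
    for xi, yi in neighbors:
        # Closer neighbors get a larger vote; the tiny constant avoids
        # division by zero for an exact match.
        votes[yi] += 1.0 / (math.dist(xi, x) + 1e-12)
    return max(votes, key=votes.get)

train = [((0.0, 0.0), "a"), ((1.0, 0.0), "b"), ((1.1, 0.0), "b")]
label = weighted_knn_predict(train, (0.1, 0.0), k=3)
```

Here the single very close "a" point outweighs the two farther "b" points, so the weighted vote can disagree with a plain majority vote.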
Where are we?

• K-nearest neighbor classification
  – The basic algorithm
  – Different distance measures
  – Some practical aspects
• Voronoi Diagrams and Decision Boundaries
  – What is the hypothesis space?
• The Curse of Dimensionality
The decision boundary for KNN

Is the K nearest neighbors algorithm explicitly building a function?
– No, it never forms an explicit hypothesis

But we can still ask: given a training set, what is the implicit function that is being computed?
The Voronoi Diagram

For any point x in a training set S, the Voronoi cell of x is a polyhedron consisting of all points closer to x than to any other point in S.

The Voronoi diagram is the union of all Voronoi cells
• Covers the entire space
Voronoi diagrams of training examples

Points in the Voronoi cell of a training example are closer to it than to any other training example.

(The figure shows the Voronoi diagram under Euclidean distance with 1-nearest neighbor.)

What about K-nearest neighbors? It also partitions the space, but with a much more complex decision boundary.
What about points on the boundary? What label will they get?
Exercise

If you have only two training points, what will the decision boundary for 1-nearest neighbor be?
– A line bisecting the two points (the perpendicular bisector)
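A tiny check of this answer (illustrative; the two training points and the labels are made up):

```python
import math

a, b = (0.0, 0.0), (4.0, 0.0)  # two training points

def nn_label(x):
    """1-NN with two training points: label by whichever point is closer."""
    return "A" if math.dist(x, a) < math.dist(x, b) else "B"

# The boundary is the perpendicular bisector (the vertical line x = 2 here):
# any point on it is equidistant from a and b, and points on either side
# of it get different labels.
mid = (2.0, 1.0)
equidistant = math.isclose(math.dist(mid, a), math.dist(mid, b))
```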
This lecture

• K-nearest neighbor classification
  – The basic algorithm
  – Different distance measures
  – Some practical aspects
• Voronoi Diagrams and Decision Boundaries
  – What is the hypothesis space?
• The Curse of Dimensionality
Why your classifier might go wrong

Two important considerations with learning algorithms:

• Overfitting: we have already seen this
• The curse of dimensionality
  – Methods that work with low dimensional spaces may fail in high dimensions
  – What might be intuitive for 2 or 3 dimensions does not always apply to high dimensional spaces

Check out the 1884 book Flatland: A Romance of Many Dimensions for a fun introduction to the fourth dimension.
Of course, irrelevant attributes will hurt

Suppose we have 1000 dimensional feature vectors
– But only 10 features are relevant
– Distances will be dominated by the large number of irrelevant features

But even with only relevant attributes, high dimensional spaces behave in odd ways.
The Curse of Dimensionality

Example 1: What fraction of the points in a cube lie outside the sphere inscribed in it?

Intuitions that are based on 2 or 3 dimensional spaces do not always carry over to high dimensional spaces.

In two dimensions: what fraction of the square (i.e. the cube) is outside the inscribed circle (i.e. the sphere)? (Figure: a circle of radius r inscribed in a square of side 2r.)

But distances do not behave the same way in high dimensions.
In three dimensions: what fraction of the cube is outside the inscribed sphere? (Figure: a sphere of radius r inscribed in a cube of side 2r.)
As the dimensionality increases, this fraction approaches 1!

In high dimensions, most of the volume of the cube is far away from the center!
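This fraction can be computed exactly from the standard volume formula for a d-dimensional ball, V_d(r) = π^(d/2) r^d / Γ(d/2 + 1) (a quick sketch; the function name is mine):

```python
import math

def fraction_outside_inscribed_sphere(d):
    """Fraction of the volume of the cube [-r, r]^d outside the inscribed d-ball.

    The ratio is independent of r: ball volume pi^(d/2) r^d / gamma(d/2 + 1)
    divided by cube volume (2r)^d."""
    inside = math.pi ** (d / 2) / (math.gamma(d / 2 + 1) * 2 ** d)
    return 1 - inside
```

For d = 2 this gives 1 − π/4 ≈ 0.21, for d = 3 it gives 1 − π/6 ≈ 0.48, and by d = 20 essentially all of the cube's volume lies outside the sphere.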
The Curse of Dimensionality

Example 2: What fraction of the volume of a unit sphere lies between radius 1 − ε and radius 1?

Intuitions that are based on 2 or 3 dimensional spaces do not always carry over to high dimensional spaces.
In two dimensions: what fraction of the area of the circle is in the blue region (the ring between radius 1 − ε and radius 1)?

In d dimensions, the fraction is 1 − (1 − ε)^d.

As d increases, this fraction goes to 1!

In high dimensions, most of the volume of the sphere is far away from the center!

Questions?
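The formula 1 − (1 − ε)^d follows because volume scales as r^d, so the inner ball of radius 1 − ε holds (1 − ε)^d of the total volume. A quick sketch (the function name is mine):

```python
def shell_fraction(eps, d):
    """Fraction of a unit d-ball's volume lying within eps of the surface."""
    return 1 - (1 - eps) ** d

thin_2d = shell_fraction(0.01, 2)       # a thin 1% shell holds ~2% of the area
thin_high_d = shell_fraction(0.01, 1000)  # ...but nearly everything in 1000-d
```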
The Curse of Dimensionality

• Most of the points in high dimensional spaces are far away from the origin!
  – In 2 or 3 dimensions, most points are near the center
  – Need more data to "fill up the space"
• Bad news for nearest neighbor classification in high dimensional spaces
  – Even if most/all features are relevant, in high dimensional spaces most points are equally far from each other!
  – The "neighborhood" becomes very large
  – This presents computational problems too
Dealing with the curse of dimensionality

• Most "real-world" data is not uniformly distributed in the high dimensional space
  – There are different ways of capturing the underlying dimensionality of the space
  – E.g.: dimensionality reduction techniques, manifold learning
• Feature selection is an art
  – Different methods exist
  – Select features, maybe by information gain
  – Try out different feature sets of different sizes and pick a good set based on a validation set
• Prior knowledge or preferences about the hypotheses can also help

Questions?
Summary: Nearest neighbors classification

• Probably the oldest and simplest learning algorithm
  – Prediction is expensive
  – Efficient data structures help. k-d trees are the most popular and work well in low dimensions
  – Approximate nearest neighbors may be good enough sometimes; hashing-based algorithms exist
• Requires a distance measure between instances
  – Metric learning: learn the "right" distance for your problem
• Partitions the space into a Voronoi diagram
• Beware the curse of dimensionality

Questions?
Exercises

1. What will happen when you choose K equal to the number of training examples?

2. Suppose you want to build a nearest neighbors classifier to predict whether a beverage is a coffee or a tea using two features: the volume of the liquid (in milliliters) and the caffeine content (in grams). You collect the following data:

   | Volume (ml) | Caffeine (g) | Label  |
   |-------------|--------------|--------|
   | 238         | 0.026        | Tea    |
   | 100         | 0.011        | Tea    |
   | 120         | 0.040        | Coffee |
   | 237         | 0.095        | Coffee |

   What is the label for a test point with Volume = 120, Caffeine = 0.013? Why might this be incorrect? How would you fix the problem?

Answers:
1. The label will always be the most common label in the training data.
2. Coffee, because Volume will dominate the distance. Fix: rescale the features, maybe to zero mean and unit variance.
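The effect of rescaling on this exercise can be checked with a small script (a sketch; it standardizes each feature using the training data's statistics, then applies 1-nearest-neighbor):

```python
import math

train = [((238.0, 0.026), "Tea"),
         ((100.0, 0.011), "Tea"),
         ((120.0, 0.040), "Coffee"),
         ((237.0, 0.095), "Coffee")]
test_point = (120.0, 0.013)

def nn_label(points, x):
    """1-nearest-neighbor label under Euclidean distance."""
    return min(points, key=lambda p: math.dist(p[0], x))[1]

# Unscaled: Volume dominates the distance, so the 120 ml Coffee is nearest.
label_raw = nn_label(train, test_point)

# Standardize each feature to zero mean / unit variance using training stats.
cols = list(zip(*(x for x, _ in train)))
means = [sum(c) / len(c) for c in cols]
stds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5
        for c, m in zip(cols, means)]
scale = lambda x: tuple((v - m) / s for v, m, s in zip(x, means, stds))
train_scaled = [(scale(x), y) for x, y in train]
label_scaled = nn_label(train_scaled, scale(test_point))
```

On the raw features the prediction is Coffee, but after standardization the low-caffeine 100 ml Tea becomes the nearest neighbor and the prediction flips to Tea, matching the answer above.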