L1. State of the Art in Machine Learning

Download L1. State of the Art in Machine Learning

Post on 18-Jan-2017

1.433 views

Category:

Data & Analytics

1 download

Embed Size (px)

TRANSCRIPT

<ul><li><p>State of the Art in Machine Learning</p><p>Poul Petersen</p><p>BigML</p></li><li><p>2</p><p>What is ML?</p><p>a field of study that gives computers the ability to learn without being explicitly </p><p>programmed</p><p>Professor Arthur Samuel, 1959</p><p> What ability to learn do computers have?</p><p> What does explicitly programmed mean?</p></li><li><p>Square Feet</p><p>Price</p><p>Sacramento Real Estate Prices by sq_ft</p><p>3</p><p>Ability to Learn</p></li><li><p>4</p><p>Ability to Learn Provide computer with examples of the relationship between </p><p>square footage X and price Y. </p><p> The computer learns the equation of a line F() which fits the examples.</p><p> The computer can now predict the price of any home knowing only the square footage: F( x ) = y</p><p> This model is known as Linear Regression. There are other types of models.</p><p> You may have noticed there was some points that did not fit. This is important!</p></li><li><p>5</p><p>Learning Problems (fit)</p><p>Under-fitting Over-fitting</p><p> Model does not fit well enough</p><p> Does not capture the underlying trend of the data</p><p> Change algorithm or features</p><p> Model fits too well does not generalize</p><p> Captures the noise or outliers of the data</p><p> Change algorithm or filter outliers</p></li><li><p>6</p><p>Learning Problems (missing)</p><p> Missing values at training/prediction time</p><p> Some algorithms can handle missing values, some no</p><p> Missing data is sometimes important</p><p> Replace missing values</p><p> Predict missing values</p></li><li><p>7</p><p>Learning Problems (missing)</p><p>Missing@ Decision Trees KNNLogistic </p><p>RegressionNaive Bayes</p><p>Neural Networks</p><p>Training Yes No No Yes Yes*</p><p>Prediction Yes No No Yes No</p></li><li><p>8</p><p>Not Explicitly Programmed</p><p>Control System</p><p>Customers go on vacation We need a program that predicts if the light will come on when the control </p><p>system flips the switch.</p></li><li><p>9</p><p>Not Explicitly Programmed</p><p>@iLoveRuby</p><p>Switch Light?</p><p>on TRUE</p><p>off FALSE</p><p> @iLoveRuby reasoned the rules by experience</p><p> Then programmed the rules explicitly</p></li><li><p>10</p><p>Not Explicitly ProgrammedSwitch Light?</p><p>on TRUE</p><p>off FALSE</p><p> The ML System reasoned the rules from the data and created a Model</p><p> Functionally the Model is the same as @iLoveRubys explicit model</p><p>@iLoveML</p><p>question</p><p>answer</p><p>ML System Model</p></li><li><p>11</p><p>Not Explicitly Programmed</p><p> how long has the bulb has been in service</p><p> reliability of brand</p><p> rated power: higher power shorter life</p><p> duty cycle</p><p> room conditions: temperature, humidity</p><p>Goal is to predict if the bulb will come on</p><p>but the switch is not the important variable:</p><p>Even worse: None of these conditions is absolute. </p></li><li><p>12</p><p>Not Explicitly Programmedbrand power age duty temp humidity FAIL?koala 45 338 1 16 0.03 FALSEotter 15 140 1 27 0.27 FALSEkoala 15 315 1 19 0.37 TRUEotter 45 338 1 29 0.27 TRUEkoala 45 211 1 23 0.85 TRUEotter 15 328 1 17 0.56 FALSEkoala 15 318 2 22 0.45 TRUEkoala 15 273 1 27 0.18 FALSEkoala 45 102 1 21 0.48 FALSEkoala 15 110 2 15 0.99 TRUEotter 45 355 2 15 0.01 FALSEotter 15 69 1 24 0.70 FALSEkoala 15 69 1 24 0.70 FALSEkoala 15 337 2 27 0.83 TRUE</p></li><li><p>13</p><p>Not Explicitly Programmed</p><p>@iLoveRuby</p><p> Multi-dimensional data is much harder to find rules</p><p> Explicit program requires modification</p><p>brand power age duty temp humidity FAIL?</p><p>koala 45 338 1 16 0.03 FALSE</p><p>otter 15 140 1 27 0.27 FALSE</p><p>koala 15 315 1 19 0.37 TRUE</p><p>otter 45 338 1 29 0.27 TRUE</p><p>koala 45 211 1 23 0.85 TRUE</p><p>otter 15 328 1 17 0.56 FALSE</p><p>koala 15 318 2 22 0.45 TRUE</p><p>koala 15 273 1 27 0.18 FALSE</p><p>koala 45 102 1 21 0.48 FALSE</p><p>koala 15 110 2 15 0.99 TRUE</p><p>otter 45 355 2 15 0.01 FALSE</p><p>otter 15 69 1 24 0.70 FALSE</p><p>koala 15 69 1 24 0.70 FALSE</p><p>koala 15 337 2 27 0.83 TRUE</p></li><li><p>@iLoveML ML System Model</p><p>14</p><p>Not Explicitly Programmed</p><p> ML System easily re-trains on new data</p><p>question</p><p>answer</p><p>brand power age duty temp humidity FAIL?</p><p>koala 45 338 1 16 0.03 FALSE</p><p>otter 15 140 1 27 0.27 FALSE</p><p>koala 15 315 1 19 0.37 TRUE</p><p>otter 45 338 1 29 0.27 TRUE</p><p>koala 45 211 1 23 0.85 TRUE</p><p>otter 15 328 1 17 0.56 FALSE</p><p>koala 15 318 2 22 0.45 TRUE</p><p>koala 15 273 1 27 0.18 FALSE</p><p>koala 45 102 1 21 0.48 FALSE</p><p>koala 15 110 2 15 0.99 TRUE</p><p>otter 45 355 2 15 0.01 FALSE</p><p>otter 15 69 1 24 0.70 FALSE</p><p>koala 15 69 1 24 0.70 FALSE</p><p>koala 15 337 2 27 0.83 TRUE</p></li><li><p>@iLoveML ML System Model</p><p>15</p><p>Terminology</p><p>input</p><p>prediction</p><p>brand power age duty temp humidity FAIL?</p><p>koala 45 338 1 16 0.03 FALSE</p><p>otter 15 140 1 27 0.27 FALSE</p><p>koala 15 315 1 19 0.37 TRUE</p><p>otter 45 338 1 29 0.27 TRUE</p><p>koala 45 211 1 23 0.85 TRUE</p><p>otter 15 328 1 17 0.56 FALSE</p><p>koala 15 318 2 22 0.45 TRUE</p><p>koala 15 273 1 27 0.18 FALSE</p><p>koala 45 102 1 21 0.48 FALSE</p><p>koala 15 110 2 15 0.99 TRUE</p><p>otter 45 355 2 15 0.01 FALSE</p><p>otter 15 69 1 24 0.70 FALSE</p><p>koala 15 69 1 24 0.70 FALSE</p><p>koala 15 337 2 27 0.83 TRUE</p><p>datasource</p><p>training</p><p>putting intoproduction</p></li><li><p>16</p><p>Supervised Learning</p><p>brand power age duty temp humidity FAIL?</p><p>koala 45 338 1 16 0.03 FALSE</p><p>otter 15 140 1 27 0.27 FALSE</p><p>koala 15 315 1 19 0.37 TRUE</p><p>otter 45 338 1 29 0.27 TRUE</p><p>koala 45 211 1 23 0.85 TRUE</p><p>otter 15 328 1 17 0.56 FALSE</p><p>koala 15 318 2 22 0.45 TRUE</p><p>features</p><p>instances</p><p>label</p><p>Labeled Data</p></li><li><p>17</p><p>Supervised Learning</p><p>animal state proximity actiontiger hungry close run</p><p>elephant happy far take picture</p><p>Classification</p><p>animal state proximity min_kmhtiger hungry close 70hippo angry far 10</p><p>Regression</p><p>label</p><p>animal state proximity action1 action2tiger hungry close run look untasty</p><p>elephant happy far take picture call friends</p><p>Multi-Label Classification</p></li><li><p>18</p><p>Unsupervised Learning</p><p>date customer account auth class zip amount</p><p>Mon Bob 3421 pin clothes 46140 135</p><p>Tue Bob 3421 sign food 46140 401</p><p>Tue Alice 2456 pin food 12222 234</p><p>Wed Sally 6788 pin gas 26339 94</p><p>Wed Bob 3421 pin tech 21350 2459</p><p>Wed Bob 3421 pin gas 46140 83</p><p>The Sally 6788 sign food 26339 51</p><p>features</p><p>instances</p><p>Unlabeled Data</p></li><li><p>19</p><p>Unsupervised Learning</p><p>date customer account auth class zip amountMon Bob 3421 pin clothes 46140 135Tue Bob 3421 sign food 46140 401Tue Alice 2456 pin food 12222 234Wed Sally 6788 pin gas 26339 94Wed Bob 3421 pin tech 21350 2459Wed Bob 3421 pin gas 46140 83The Sally 6788 sign food 26339 51</p><p>Clustering</p><p>date customer account auth class zip amountMon Bob 3421 pin clothes 46140 135Tue Bob 3421 sign food 46140 401Tue Alice 2456 pin food 12222 234Wed Sally 6788 pin gas 26339 94Wed Bob 3421 pin tech 21350 2459Wed Bob 3421 pin gas 46140 83The Sally 6788 sign food 26339 51</p><p>Anomaly Detection</p><p>similar</p><p>unusual</p></li><li><p>20</p><p>Semi-Supervised Learning</p><p>brand power age duty temp humidity FAIL?</p><p>koala 45 338 1 16 0.03 FALSE</p><p>otter 15 140 1 27 0.27</p><p>koala 15 315 1 19 0.37</p><p>otter 45 338 1 29 0.27 TRUE</p><p>koala 45 211 1 23 0.85 TRUE</p><p>otter 15 328 1 17 0.56</p><p>koala 15 318 2 22 0.45 TRUE</p><p>label</p><p>Labeled and Unlabeled Data</p><p>} infer</p></li><li><p>21</p><p>Data Types</p><p>numeric</p><p>1 2 3</p><p>1, 2.0, 3, -5.4 categoricaltrue, yes, red, mammal categoricalcategorical</p><p>A B C</p><p>DATE-TIME2013-09-25 10:02DATE-TIME</p><p>YEAR</p><p>MONTH</p><p>DAY-OF-MONTH</p><p>YYYY-MM-DD</p><p>DAY-OF-WEEK</p><p>HOUR</p><p>MINUTE</p><p>YYYY-MM-DD</p><p>YYYY-MM-DD</p><p>M-T-W-T-F-S-D</p><p>HH:MM:SS</p><p>HH:MM:SS</p><p>2013</p><p>September</p><p>25</p><p>Wednesday</p><p>10</p><p>02</p><p>textBe not afraid of greatness: some are born great, some achieve greatness, and some have greatness thrust upon 'em.</p><p>text</p><p>greatafraidbornsome</p><p>appears 2 timesappears 1 timeappears 1 timeappears 2 times</p></li><li><p>22</p><p>Text Analysis</p><p>Be not afraid of greatness: some are born great, some achieve greatness, and some have greatnessthrust upon 'em.</p></li><li><p>22</p><p>Text Analysis</p><p>Be not afraid of greatness: some are born great, some achieve greatness, and some have greatnessthrust upon 'em.</p><p>great: appears 4 times</p></li><li><p>23</p><p>Text Analysisgreat afraid born achieve</p><p>4 1 1 1</p><p>Be not afraid of greatness: some are born great, some achieve greatness, and some have greatnessthrust upon em.</p><p>Model</p><p>The token great </p><p>does not occur</p><p>The token afraid </p><p>occurs more than once</p></li><li><p>24</p><p>Topic Modeling</p></li><li><p>25</p><p>Brief History of ML</p><p>1950 1960 1970 1980 1990 2000 2010</p><p>PerceptronNeural </p><p>Networks</p><p>Ensembles</p><p>Support Vector Machines</p><p>Boosting</p><p>Inte</p><p>rpre</p><p>tabi</p><p>lity</p><p>Rosenblatt, 1957</p><p>Quinlan, 1979 (ID3),</p><p>Minsky, 1969</p><p>Vapnik, 1963 Corina &amp; Vapnik, 1995</p><p>Schapire, 1989 (Boosting) Schapire, 1995 (Adaboost) </p><p>Breiman, 2001 (Random Forests)Breiman, 1994 (Bagging)</p><p>Deep LearningHinton, 2006Fukushima, 1989 (ANN)</p><p>Breiman, 1984 (CART)</p><p>2020</p><p>+</p><p>-</p><p>Decision Trees</p></li><li><p>26</p><p>Why ML Now?</p><p> Decreasing cost of data</p><p> Abundant computing power, especially cloud</p><p> Machine Learning APIs</p><p> Abundance of APIs + internet to combine easily</p></li><li><p>27</p><p>Composability</p></li><li><p>28</p><p>The Stages of a ML App</p><p>State the problem</p><p>Data Wrangling</p><p>Feature EngineeringLearning</p><p>Deploying</p><p>Predicting</p><p>Measuring Impact</p></li></ul>