Introduction to Data Science - ESCP Europe
Introduction to Data Science
MS - ESCP Europe - 14/02/15
Martin Daniel @martindaniel4
Source : www.d3js.org
Entrepreneurship
Media solutions
Head of Data Science
Founder (100 million texts processed)
Lecturer, 1-week data science bootcamp
Founder
Organizer d3js / Data For Good
2010
2011
2015
Source : http://www.mercialfred.com/infographie-sms-dechiffres.html
Data is « eating the world »
trifacta.com, 2013
Source : www.trifacta.com
Source : www.raremaps.com
Matthew Fontaine Maury, 1806 - 1873
More Data
Better Algorithms
More users
Better Service
Data Startups Virtuous Cycle
Percentage of proprietary statin prescribing by CCG, UK
NHS, prescribinganalytics.com, 2011 - 2012
> £200 million in savings / year
# Healthcare
Google Flu Tracker vs CDC Official Report
Google.org, 2011 - 2012
t(Google) = t(CDC) - 2 weeks
# Healthcare
Source : www.google.org/flutrends/
Uber e-hailing map in NYC
uberdata, June 2013
# Travel
t(Uber) < t(Taxi)
Source : http://blog.uber.com/tag/uberdata/
« Machine Learning is the field of study that gives computers the ability to learn without being explicitly programmed. »
Arthur Samuel, 1959
More Data Usually Beats Better Algorithms
Banko & Brill, Microsoft Research, 2001
e.g.: Choose between {to, two, too}
For breakfast I ate __ eggs
Source : Michele Banko, Eric Brill, Scaling to Very Very Large Corpora for Natural Language Disambiguation (2001)
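To make the {to, two, too} task concrete, here is a minimal sketch of a count-based confusion-set classifier. It is a toy illustration only, not the learners evaluated by Banko & Brill; the corpus, the function names and the add-one scoring rule are all assumptions made for this example.

```python
from collections import Counter, defaultdict

CONFUSION_SET = {"to", "two", "too"}

# Hypothetical toy corpus; a real experiment would use millions of sentences.
TOY_CORPUS = [
    "I ate two eggs",
    "she bought two apples",
    "I want to eat",
    "he went to school",
    "that is too much",
    "me too",
]

def train(sentences):
    """Count the previous and next word seen around each candidate word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i, tok in enumerate(tokens):
            if tok in CONFUSION_SET:
                counts[tok][("prev", tokens[i - 1])] += 1
                counts[tok][("next", tokens[i + 1])] += 1
    return counts

def predict(counts, prev_word, next_word):
    """Pick the candidate whose observed neighbours best match the blank."""
    def score(candidate):
        seen = counts[candidate]
        # Add-one smoothing so unseen neighbours do not zero out the score.
        return (seen[("prev", prev_word.lower())] + 1) * (
            seen[("next", next_word.lower())] + 1
        )
    # Sort candidates so ties are broken deterministically.
    return max(sorted(CONFUSION_SET), key=score)

model = train(TOY_CORPUS)
print(predict(model, "ate", "eggs"))  # two
```

With only six sentences the model knows very few contexts; adding more training text is what lets the same simple counting scheme cover more blanks, which is the point of the slide.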
In what order should products be displayed in a list?
Source : fifty-five.com
Click-through rate by position and by list
June 2013
• List = audience crossroads
• Manual or generic merchandising
• Broad problem, across all sectors
Source : fifty-five.com
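As a rough illustration of the metric on this slide, the click-through rate per position and per list can be computed from impression logs as clicks divided by impressions. The event format and the function name below are assumptions for the sake of the example, not the format used by fifty-five.

```python
from collections import defaultdict

def ctr_by_position(events):
    """Compute click-through rate for each (list_id, position) pair.

    events: iterable of (list_id, position, clicked) tuples,
    one tuple per product impression (hypothetical log format).
    """
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for list_id, position, was_clicked in events:
        key = (list_id, position)
        shown[key] += 1
        if was_clicked:
            clicked[key] += 1
    # Every key in `shown` has at least one impression, so no division by zero.
    return {key: clicked[key] / shown[key] for key in shown}

# Made-up impressions for a single list named "dresses".
sample_events = [
    ("dresses", 1, True),
    ("dresses", 1, False),
    ("dresses", 1, True),
    ("dresses", 1, False),
    ("dresses", 2, False),
    ("dresses", 2, False),
]
rates = ctr_by_position(sample_events)
```

Comparing these rates across positions shows how much of a product's performance comes from where it sits in the list rather than from the product itself.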
Cumulative add-to-cart rate, dresses 1 / dresses 2
3S – Nov. 2013 – Feb. 2014
Cumulative pass-through rate, dresses 1 / dresses 2
3S – Nov. 2013 – Feb. 2014
Source : fifty-five.com
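A cumulative rate of this kind can be sketched as the running total of events divided by the running total of visits, computed separately for each list variant. The daily figures and the function name are invented for illustration and do not come from the study.

```python
from itertools import accumulate

def cumulative_rate(daily_events, daily_visits):
    """Return sum(events so far) / sum(visits so far) for each day."""
    cum_events = accumulate(daily_events)
    cum_visits = accumulate(daily_visits)
    # Guard against days before any visit has been recorded.
    return [e / v if v else 0.0 for e, v in zip(cum_events, cum_visits)]

# Hypothetical daily add-to-cart counts and visits for two list variants.
variant_1 = cumulative_rate([1, 3, 2], [10, 10, 20])
variant_2 = cumulative_rate([2, 2, 4], [10, 10, 20])
```

Plotting the two resulting series over time shows whether the gap between variants stabilises as data accumulates, which is harder to see from noisy daily rates.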
« Husky » {freq: 15, aff: 0.4, cat: abc, res: 10, loc: 1, typo: What, etc.}
Modeling
Ok
What > Who
Who > What
Logs
Processing
Visualisation
Training
Training
Production