Introduction to Data Science MS - ESCP Europe - 14/02/15 Martin Daniel @martindaniel4 Source : www.d3js.org

Upload: martin-daniel

Post on 21-Apr-2017



Entrepreneurship

Media solutions

Head of Data Science

Founder (100 million texts processed)

Lecturer, 1-week data science bootcamp

Founder

Organizer d3js / Data For Good

2010

2011

2015


Source : http://www.mercialfred.com/infographie-sms-dechiffres.html

Data is « eating the world » (trifacta.com, 2013)

Source : www.trifacta.com

Source : www.raremaps.com

Matthew Fontaine Maury, 1806 - 1873

More Data

Better Algorithms

More users

Better Service


Data Startups Virtuous Cycle


E-commerce Healthcare Travel Sports Government


Percentage of proprietary statin prescribing by CCG, UK NHS (prescribinganalytics.com, 2011 - 2012)

> £200 million in savings per year

# Healthcare


Google Flu Trends vs CDC Official Report (Google.org, 2011 - 2012)

t(Google) = t(CDC) - 2 weeks

# Healthcare

Source : www.google.org/flutrends/
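The relation t(Google) = t(CDC) - 2 weeks says that one weekly series leads the other by two weeks. A minimal sketch of how such a lead could be estimated is below, assuming two aligned weekly series and a simple lagged Pearson correlation. The data is synthetic and the names `pearson`, `best_lead`, `search_based` and `official` are illustrative, not taken from Google or the CDC.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def best_lead(early, late, max_lag):
    """Return (k, r): the lag k in weeks that maximises corr(early[t], late[t + k])."""
    best_k, best_r = 0, -2.0
    for k in range(max_lag + 1):
        a = early[:len(early) - k]  # drop the last k weeks of the early series
        b = late[k:]                # drop the first k weeks of the late series
        r = pearson(a, b)
        if r > best_r:
            best_k, best_r = k, r
    return best_k, best_r

# Synthetic data (assumption): one seasonal bump over 52 weeks.
signal = [math.exp(-((week - 20) / 5) ** 2) for week in range(52)]
search_based = signal                 # stands in for the search-based estimate
official = [0.0, 0.0] + signal[:-2]   # same curve, reported two weeks later

lead, r = best_lead(search_based, official, max_lag=6)
print(f"estimated lead: {lead} weeks (correlation {r:.3f})")
```

On real data the two curves would differ in shape and noise, so the peak correlation would be lower and the estimated lead less clear-cut.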


Uber e-hailing map in NYC (uberdata, June 2013)

# Travel

t(Uber) < t(Taxi)

Source : http://blog.uber.com/tag/uberdata/

« Machine Learning is the field of study that gives computers the ability to learn without being explicitly programmed. »

Arthur Samuel, 1959


More Data Usually Beats Better Algorithms (Banko & Brill, Microsoft Research, 2001)

e.g.: Choose between {to, two, too}

For breakfast I ate __ eggs

Source : Michele Banko, Eric Brill, Scaling to Very Very Large Corpora for Natural Language Disambiguation (2001)
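To make the {to, two, too} task concrete, here is a minimal sketch of a count-based disambiguator that votes using the words immediately before and after the blank. It is a toy illustration, not the learners evaluated by Banko & Brill. The six training sentences are made up, and the names `train` and `predict` are hypothetical.

```python
from collections import Counter, defaultdict

CONFUSION_SET = ("to", "two", "too")

def train(sentences):
    """Count which confusion-set word appears with each neighbouring word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            word = tokens[i]
            if word in CONFUSION_SET:
                counts[("prev", tokens[i - 1])][word] += 1
                counts[("next", tokens[i + 1])][word] += 1
    return counts

def predict(counts, prev_word, next_word):
    """Pick the candidate with the most votes from the two context words."""
    votes = Counter()
    votes.update(counts.get(("prev", prev_word), Counter()))
    votes.update(counts.get(("next", next_word), Counter()))
    if not votes:
        return None  # context never seen in training
    return votes.most_common(1)[0][0]

# Toy corpus (assumption): invented sentences, far too small for real use.
corpus = [
    "I bought two eggs",
    "She ate two apples",
    "I want to eat",
    "He ate too much",
    "We went to school",
    "two eggs are enough",
]
model = train(corpus)

# "For breakfast I ate __ eggs"
print(predict(model, prev_word="ate", next_word="eggs"))
```

With so little text, most contexts are unseen and `predict` returns `None`. A larger corpus fills in those counts without any change to the algorithm, which is the effect the slide title refers to.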

Rank

In what order should products be displayed in a list?

Source : fifty-five.com

Click-through rate by position and by list (June 2013)

• List = audience crossroads

• Manual or generic merchandising

• A broad problem, across all sectors

Source : fifty-five.com
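Because click-through rate depends on position in the list, ranking products by raw clicks favours whatever already sits at the top. One possible correction is sketched below: each product's clicks are divided by the clicks expected from the positions it was shown in. This is an assumed approach for illustration, not the method fifty-five.com used. The log format, the `prior` smoothing term, and the product names are hypothetical.

```python
from collections import defaultdict

def position_baseline(log):
    """Average click-through rate observed at each list position."""
    shown, clicked = defaultdict(int), defaultdict(int)
    for _product, position, was_clicked in log:
        shown[position] += 1
        clicked[position] += was_clicked
    return {pos: clicked[pos] / shown[pos] for pos in shown}

def rank_products(log, prior=1.0):
    """Order products by observed clicks relative to position-expected clicks."""
    baseline = position_baseline(log)
    clicks, expected = defaultdict(float), defaultdict(float)
    for product, position, was_clicked in log:
        clicks[product] += was_clicked
        expected[product] += baseline[position]
    # The prior pulls products with very few impressions toward a score of 1.
    score = {p: (clicks[p] + prior) / (expected[p] + prior) for p in clicks}
    return sorted(score, key=score.get, reverse=True)

# Hypothetical log of (product, position, clicked) impressions.
log = (
    [("dress_a", 1, 1)] * 4 + [("dress_a", 1, 0)] * 6    # 4 clicks / 10 at position 1
    + [("dress_b", 1, 1)] * 2 + [("dress_b", 1, 0)] * 8  # 2 clicks / 10 at position 1
    + [("dress_b", 2, 1)] * 1 + [("dress_b", 2, 0)] * 9  # 1 click  / 10 at position 2
    + [("dress_c", 2, 1)] * 3 + [("dress_c", 2, 0)] * 7  # 3 clicks / 10 at position 2
)
print(rank_products(log))
```

In this made-up log, `dress_a` has the highest raw click-through rate, but `dress_c` ranks first once its lower position is taken into account.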


Cumulative add-to-cart rate, dresses 1 / dresses 2, 3S – Nov 2013 – Feb 2014

Cumulative pass-through rate, dresses 1 / dresses 2, 3S – Nov 2013 – Feb 2014

Source : fifty-five.com

Search Analytics

Automatically detect miscategorized queries.

Pipeline stages: Logs, Processing, Modelling, Visualisation

« Husky » {freq: 15, aff: 0.4, cat: abc, res: 10, loc: 1, typo: Quoi, etc.}

Ok

Quoi > Qui (What > Who)

Qui > Quoi (Who > What)
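The modelling step above can be sketched as a classifier that maps a query's feature record to one of the three labels. The slides do not say which model was used, so the nearest-centroid classifier below is only a stand-in. The training examples and their labels are invented. The meaning of each feature is not given in the source, so the numeric fields `freq`, `aff`, `res` and `loc` are treated as opaque numbers.

```python
import math

FEATURES = ("freq", "aff", "res", "loc")  # numeric fields from the slide

def to_vector(query):
    """Turn a feature record into a list of floats in a fixed order."""
    return [float(query[name]) for name in FEATURES]

def fit_centroids(examples):
    """examples: list of (feature_dict, label). Returns label -> mean vector."""
    sums, counts = {}, {}
    for feats, label in examples:
        acc = sums.setdefault(label, [0.0] * len(FEATURES))
        for i, value in enumerate(to_vector(feats)):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def classify(centroids, feats):
    """Return the label whose centroid is closest in Euclidean distance."""
    vec = to_vector(feats)
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))

# Invented training data (assumption): values and labels are illustrative only.
training = [
    ({"freq": 20, "aff": 0.9, "res": 40, "loc": 1}, "Ok"),
    ({"freq": 12, "aff": 0.8, "res": 30, "loc": 1}, "Ok"),
    ({"freq": 14, "aff": 0.3, "res": 8, "loc": 1}, "Quoi > Qui"),
    ({"freq": 16, "aff": 0.5, "res": 12, "loc": 1}, "Quoi > Qui"),
    ({"freq": 3, "aff": 0.2, "res": 1, "loc": 0}, "Qui > Quoi"),
    ({"freq": 5, "aff": 0.1, "res": 2, "loc": 0}, "Qui > Quoi"),
]
centroids = fit_centroids(training)

husky = {"freq": 15, "aff": 0.4, "res": 10, "loc": 1}
print(classify(centroids, husky))
```

The features here are left unscaled, so fields with large ranges dominate the distance. A real pipeline would likely standardise them and use a model trained on logged, labelled queries.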


Training

Training

Production

New Marketing Skills

Strategy

Data

Marketing

• Learn to code!

• 9-week coding bootcamp

• 1-week data science kit

Thank you!