intro data science at nyt 2015-01-22
![Page 1: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/1.jpg)
data science @ The New York Times
and how a 163-year-old content company became data-driven
[email protected] @nytimes.com @chrishwiggins
![Page 2: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/2.jpg)
1. the path
![Page 3: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/3.jpg)
biology: 1892 vs. 1995
biology changed for good.
![Page 4: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/4.jpg)
genetics: 1837 vs. 2012
ML toolset; data science mindset
![Page 5: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/5.jpg)
becoming a data science culture
- drew conway, 2010
![Page 6: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/6.jpg)
data science: web scale
![Page 7: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/7.jpg)
example:
163 yr old
![Page 8: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/8.jpg)
bit.ly/nyt-interactive-2013
![Page 9: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/9.jpg)
![Page 10: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/10.jpg)
![Page 11: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/11.jpg)
example:
millions of views per hour
![Page 12: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/12.jpg)
![Page 13: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/13.jpg)
![Page 14: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/14.jpg)
data science: the web
![Page 15: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/15.jpg)
data science: the web
is your “online presence”
![Page 16: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/16.jpg)
data science: the web
is a microscope
![Page 17: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/17.jpg)
data science: the web
is an experimental tool
![Page 18: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/18.jpg)
data science: the web
is an optimization tool
![Page 19: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/19.jpg)
1. the path
![Page 20: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/20.jpg)
learnings
- supervised learning
- unsupervised learning
- reinforcement learning
![Page 21: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/21.jpg)
supervised learning, e.g.,
![Page 22: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/22.jpg)
supervised learning, e.g.,
“the funnel”
![Page 23: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/23.jpg)
interpretable supervised learning
super cool stuff
![Page 24: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/24.jpg)
supervised learning, e.g.,
“logistics”
![Page 25: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/25.jpg)
unsupervised learning, e.g.,
“segments”
![Page 26: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/26.jpg)
unsupervised learning, e.g.,
“segments”
![Page 27: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/27.jpg)
unsupervised learning, e.g.,
“segments”
argmax_z p(z|x)=14
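A minimal sketch of what argmax_z p(z|x) means in practice: each reader's features x get a posterior over latent segments z, and the reader is assigned to the most probable one. The model here (a 1-D Gaussian mixture) and every number in it are assumptions for illustration only; the slides do not say which model produced segment 14.

```python
import math

def posterior(x, priors, means, stds):
    """Return p(z | x) for each segment z under a 1-D Gaussian mixture."""
    # Unnormalized log joint: log p(z) + log p(x | z).
    # The shared -0.5*log(2*pi) term is dropped because it cancels on normalizing.
    logs = [
        math.log(pi) - math.log(s) - 0.5 * ((x - m) / s) ** 2
        for pi, m, s in zip(priors, means, stds)
    ]
    top = max(logs)  # subtract the max before exponentiating, for stability
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]

def assign_segment(x, priors, means, stds):
    """Return argmax_z p(z | x): the index of the most probable segment."""
    post = posterior(x, priors, means, stds)
    return max(range(len(post)), key=post.__getitem__)

# Hypothetical three-segment model over one feature (say, visits per week).
PRIORS = [0.5, 0.3, 0.2]
MEANS = [1.0, 5.0, 12.0]
STDS = [1.0, 2.0, 3.0]

segment = assign_segment(13.0, PRIORS, MEANS, STDS)
```

The output is only an index. A human-readable name such as “baby boomer” has to be attached afterwards by someone inspecting what the members of that segment have in common.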
![Page 28: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/28.jpg)
unsupervised learning, e.g.,
“segments”
“baby boomer”
![Page 29: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/29.jpg)
reinforcement learning
![Page 30: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/30.jpg)
reinforcement learning
aka “A/B testing”; RCT
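One common way to read out an A/B test (a randomized controlled trial with two arms) is a two-proportion z-test on conversion rates. The sketch below is an assumption about how such a test could be analyzed, not a description of the method used at the Times; the function name and the counts are hypothetical.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 100/1000 conversions in A, 150/1000 in B.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
```

Random assignment is what licenses a causal reading of the difference; the test only says whether the observed gap is larger than chance would plausibly produce.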
![Page 31: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/31.jpg)
reinforcement learning
![Page 32: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/32.jpg)
reinforcement learning
e.g., multi-armed bandits
img: MSR SV (RIP)
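A multi-armed bandit shifts traffic toward the better-performing option while the experiment is still running, instead of waiting for a fixed-length test to end. The sketch below uses Thompson sampling with Beta posteriors, which is one standard bandit strategy chosen here for illustration; the slides do not name a specific algorithm, and the click rates are made up for the simulation.

```python
import random

def thompson_sampling(true_rates, rounds, seed=0):
    """Simulate a Bernoulli bandit; return (pulls, wins) per arm."""
    rng = random.Random(seed)
    k = len(true_rates)
    wins = [0] * k
    losses = [0] * k
    for _ in range(rounds):
        # Draw one plausible rate per arm from its Beta(wins+1, losses+1) posterior.
        samples = [rng.betavariate(wins[i] + 1, losses[i] + 1) for i in range(k)]
        # Play the arm whose sampled rate is highest.
        arm = max(range(k), key=samples.__getitem__)
        # Simulated feedback; in production this would be a real click or no click.
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    pulls = [w + l for w, l in zip(wins, losses)]
    return pulls, wins

# Hypothetical click-through rates for three variants.
pulls, wins = thompson_sampling([0.1, 0.2, 0.6], rounds=2000)
```

Arms that look bad early still get occasional plays because their posteriors stay wide, which is the explore/exploit trade-off in miniature.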
![Page 33: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/33.jpg)
data science:
- people
- process
- technology
![Page 34: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/34.jpg)
2. data science:
![Page 35: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/35.jpg)
“data”:
![Page 36: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/36.jpg)
“data”:
“metrics”
“business analytics”
“Excel”
“reporting”
![Page 37: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/37.jpg)
Reporting
![Page 38: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/38.jpg)
Reporting
business as usual
![Page 39: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/39.jpg)
Reporting
Learning
business as usual
![Page 40: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/40.jpg)
Reporting
Learning
(esp. supervised)
business as usual
![Page 41: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/41.jpg)
supervised learning, e.g.,
“the funnel”
![Page 42: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/42.jpg)
Reporting
Learning
Test
business as usual
(esp. supervised)
![Page 43: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/43.jpg)
Reporting
Learning
Test
aka “A/B testing”;
business as usual
(esp. supervised)
![Page 44: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/44.jpg)
Reporting
Learning
Test
aka “A/B testing”;
business as usual
(esp. supervised)
Some of the most recognizable personalization in our service is the collection of “genre” rows. … Members connect with these rows so well that we measure an increase in member retention by placing the most tailored rows higher on the page instead of lower.
![Page 45: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/45.jpg)
Reporting
Learning
Test
aka “A/B testing”;
business as usual
(esp. supervised)
![Page 46: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/46.jpg)
Reporting
Learning
Test
Optimizing
aka “A/B testing”;
(esp. supervised)
business as usual
![Page 47: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/47.jpg)
Reporting
Learning
Test
Optimizing
aka “A/B testing”;
(i.e. reinforcement learning)
(esp. supervised)
business as usual
![Page 48: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/48.jpg)
Reporting
Learning
Test
Optimizing
Explore
aka “A/B testing”;
(i.e. reinforcement learning)
(esp. supervised)
business as usual
![Page 49: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/49.jpg)
Reporting
Learning
Test
Optimizing
Explore
aka “A/B testing”;
aka “segmenting”
(i.e. reinforcement learning)
(esp. supervised)
business as usual
![Page 50: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/50.jpg)
Reporting
Learning
Test
Optimizing
Explore
aka “A/B testing”;
aka “segmenting”
(i.e. reinforcement learning)
(esp. supervised)
business as usual
![Page 51: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/51.jpg)
“segments”
Explore
aka “segmenting”
![Page 52: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/52.jpg)
“segments”
“z=14”
Explore
aka “segmenting”
![Page 53: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/53.jpg)
“segments”
“baby boomer”
Explore
aka “segmenting”
![Page 54: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/54.jpg)
Reporting
Learning
Optimizing
tech company
![Page 55: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/55.jpg)
Reporting
“model” company
![Page 56: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/56.jpg)
Reporting
fake company
![Page 57: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/57.jpg)
Reporting
Learning
Test
Optimizing
Explore
startups:
![Page 58: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/58.jpg)
“a startup is a temporary organization in search of a repeatable and scalable business model” —Steve Blank
![Page 59: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/59.jpg)
every publisher is now a startup
![Page 60: intro data science at NYT 2015-01-22](https://reader030.vdocuments.mx/reader030/viewer/2022032618/55b7c3f2bb61eb0d068b462d/html5/thumbnails/60.jpg)
data science:
- people
- process
- technology