Data science in the age of AI: the roles of automation, augmentation, human judgment
Jim Guszcza, PhD, FCAS
MCFAM Distinguished Lecture
University of Minnesota
March 1, 2018
The original data scientist
“An approximate answer to the right
question is worth a good deal more than
an exact answer to an approximate
problem.”
-- John Tukey
1993: “greater statistics”
2001: “data science”
Today
We need a concept of
“Greater data science”
The “last mile problem” of predictive analytics
MODEL
Predictive models can point us in the right direction …
… but they provide no value unless they are followed by the right actions or desired behavior change
Act 1: Big data
Three definitions of big data
1. Data sets with sizes beyond the capability of standard IT tools to capture, process, and analyze in reasonable time frames.
2. Data with high Volume, Velocity, Variety
• Huge datasets
• … emanating continuously from smart phones, sensors, cameras, GPS devices, computers, TVs, …
• … involving all manner of numeric, text, photographic data
3. “Anything that doesn’t fit in Excel”
The city of New York does actuarial prediction big data
Actuarial vs clinical prediction – the motion picture
Human judges are not merely worse than optimal regression equations;
they are worse than almost any regression equation.
— Richard Nisbett and Lee Ross
A story I don’t like
“There is now a better way. Petabytes allow us to say: ‘Correlation is enough.’ We can
stop looking for models. We can analyze the data without hypotheses about what it
might show. We can throw the numbers into the biggest computing clusters the world
has ever seen and let statistical algorithms find patterns where science cannot.”
Copyright © 2017 Deloitte Development LLC. All rights reserved.
Google Flu Trends
Google Flu Trends – from poster child to parable
The end of the end of theory
Act 2: Reframing big data
Big data – the classic example
(This we know)
A more striking correlation
(!)
More food for thought
(!!)
Hard to swallow
Digital breadcrumbs and personalization
Our lives are digitally mediated.
We continually leave behind digital breadcrumbs about:
• Who we email, call
• Our communication style
• How we drive
• What we buy
• What we eat
• What we read, watch
• How we sleep
• How we exercise
• What we think
B is for Behavioral
I believe that the power of Big Data is that it is information about people's behavior instead of information about their beliefs… It's not about the things you post [online] … which is what most people think about, and it's not data from internal company processes and RFIDs.
This sort of Big Data comes from things like location data off of your cell phone or credit card, it's the little data breadcrumbs that you leave behind you as you move around in the world.
Those breadcrumbs tell… the story of your life... Big data is increasingly about real behavior, and by analyzing this sort of data, scientists can tell an enormous amount about you. They can tell whether you are the sort of person who will pay back loans. They can tell you if you're likely to get diabetes.
—Sandy Pentland, MIT Media Lab “Reinventing Society in the Wake of Big Data”
edge.org conversation
Data science is now a societal issue
Since this data is mostly about people, there are enormous issues about privacy, data ownership, and data control. You can imagine using Big Data to make a world that is incredibly invasive, incredibly 'Big Brother'… George Orwell was not nearly creative enough when he wrote 1984...
—Sandy Pentland, MIT Media Lab “Reinventing Society in the Wake of Big Data”
edge.org conversation
Like, you know
Researchers at Cambridge University Psychometrics Centre built predictive models of personal details based purely on social network “Likes” of a sample of 58,000 people.
• Relationship status, substance abuse: 65-73% accurate
• Political leanings (Democrat vs Republican): 85% accurate
• Religion (Christian vs Muslim): 82% accurate
• Male sexual orientation: 88% accurate
• Ethnicity (African-American vs Caucasian): 95% accurate
“Observation of Likes alone is roughly as informative as using an individual’s actual personality test score.”
“Similar predictions could be made from all manner of digital data, with this kind of secondary ‘inference’ made with remarkable accuracy”
-- “Digital Records Could Expose Intimate Details and Personality Traits of Millions,” University of Cambridge Research News
http://www.cam.ac.uk/research/news/digital-records-could-expose-intimate-details-and-personality-traits-of-millions
http://web.media.mit.edu/~yva/InfographicPersonality.png
Customer-centric uses of big data
The role of applied behavioral economics
Customer-centricity
Brand Centric
Customer Centric
The three bounds
Bounded rationality: we are terrible natural statisticians. We need help from data science.
Bounded selfishness: we are driven by fairness, and social norms – not just economic benefits.
Bounded self-control: we make short-term decisions at odds with our long-term goals.
“Nudge” is human-centered design
Design considerations
• Present bias
• Loss aversion
• Social proof
• “Social Physics”
• Framing effects
• Intuitive language / infoVis
• Status quo bias
• Mental accounting
• Cognitive load / “Scarcity”
• Pre-commitment
• Lotteries (overweighting small probabilities)
• Unit bias (“mindless eating”)
• Removing bottlenecks
“While Cass and I were capable of recognizing good nudges when we came across them, we were still missing an organizing principle for how to devise effective nudges.
We had a breakthrough… when I reread Don Norman’s classic book The Design of Everyday Things.”
– Richard Thaler, Misbehaving
Customer-centric data products
Data: Use telematics data to calculate risk factors
Digital: Periodic driver feedback reports
Design: Employ Opower-style peer comparisons
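As a rough illustration of the Opower-style peer comparison mentioned above, the sketch below turns one telematics-derived risk factor into a feedback sentence. The metric (hard-braking events per 100 miles), the function names, and the message wording are all hypothetical and are not taken from the lecture.

```python
def peer_comparison(driver_rate, peer_rates):
    """Share of peers whose event rate is higher (worse) than the driver's.

    driver_rate and peer_rates are assumed to be hard-braking events
    per 100 miles; this metric is a hypothetical example.
    """
    if not peer_rates:
        raise ValueError("need at least one peer to compare against")
    worse = sum(1 for rate in peer_rates if rate > driver_rate)
    return worse / len(peer_rates)


def feedback_message(driver_rate, peer_rates):
    """Format the comparison as a short line for a periodic feedback report."""
    share = peer_comparison(driver_rate, peer_rates)
    return f"You braked hard less often than {share:.0%} of similar drivers."


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(feedback_message(2.0, [1.0, 3.0, 4.0, 5.0]))
```

Framing the result relative to similar drivers, rather than as a raw score, is the design element borrowed from social proof.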
Prosocial applications of big data
Tapping into bounded selfishness
Data: Statistical fraud detection using web click data
Digital: Interactions mediated by web site
Design: Optimize behavioral nudge pop-up messages (use A/B testing)
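One common way to run the A/B test mentioned above is a two-proportion z-test comparing how often each pop-up message leads to the desired behavior. The sketch below assumes that approach; the lecture does not say which test was used, and the function name and counts are hypothetical.

```python
import math


def two_proportion_z(success_a, n_a, success_b, n_b):
    """Return (z, two_sided_p) for H0: variants A and B share one success rate."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis of no difference.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Every outcome was identical, so there is no evidence of a difference.
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up counts: users who corrected an entry after seeing each message.
    z, p = two_proportion_z(120, 1000, 150, 1000)
    print(f"z = {z:.2f}, p = {p:.3f}")
```

A positive z means message B outperformed message A; whether the difference justifies switching messages remains a judgment call.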
Uberizing insurance?
Creating new customer-centric business models
Act 3: The rebirth of AI
The second machine age
What is AI? (Answer in 1956)
An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves…
We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it for a summer.
-- John McCarthy, Marvin Minsky, Nathan Rochester, Claude Shannon
“A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence”
What is AI? (Answer today)
The same family of algorithms we’ve been using for 20 years are doing a better job because they have access to big data.
— Daniel Levitin
Invariably, simple models and a lot of data trump more elaborate models based on less data… Currently, statistical translation models consist mostly of large memorized phrase tables…
— Peter Norvig
Neural networks in the 1990s
Neural networks today
Marvin [Minsky] was advocating what’s called “commonsense reasoning”.
Machines have shown essentially no examples of doing that.
Therefore, they are complements to people. People are actually not so bad at that.
However, they are somewhat lousy at tuning things and keeping exact accounts of stuff. Machines are good at that.
That gives the idea that there could be a human-machine partnership…
— Sandy Pentland, Deloitte Review 2017
AI = Augmented Intelligence
AI != human intelligence
Google Translate: memorized phrase tables
Feel the Bern
May, 2017
The prequel
… with a twist ending
Their skill at manipulating and “coaching” their computers to look very deeply into positions effectively counteracted the superior chess understanding of their grandmaster opponents and the greater computational power of other participants.
Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.
— Garry Kasparov
Human plus computer
Getting this process right involves more than math + stats. We also need concepts from psychology, human-centered design, ethics, …
Men will set the goals, formulate the hypotheses, determine the criteria, and perform the evaluations.
Computing machines will do the routinizable work that must be done to prepare the way for insights and decisions in technical and scientific thinking…
The symbiotic partnership will perform intellectual operations much more effectively than man alone can perform them.
— JCR Licklider, “Man-Computer Symbiosis”
JCR Licklider (1960)
The drive to automation
What about underwriting more complex risks?
A false comparison
Models are a form of “artificial intelligence” that augment (but do not replace) human expertise.
Equations > experts
(Equations + experts) > experts
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n$
Eyeglasses for the mind’s eye
Algorithms are “cognitive prostheses” that augment
(but – in general – do not replace) human expertise.
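The "equation" on these slides is an ordinary linear model: an intercept plus a weighted sum of inputs. The sketch below scores a case that way, then shows one hypothetical reading of "equations + experts" in which the expert's own rating is supplied as one more input. The weights and feature values are invented for illustration and come from no real model.

```python
def linear_score(intercept, weights, features):
    """Compute intercept + sum(weight_i * feature_i) for a single case."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return intercept + sum(w * x for w, x in zip(weights, features))


if __name__ == "__main__":
    # Equation alone: two made-up risk factors.
    print(linear_score(1.0, [2.0, 0.5], [3.0, 4.0]))

    # Equation + expert: the expert's rating (hypothetically on a 0-1 scale)
    # enters as an extra feature with its own weight.
    print(linear_score(1.0, [2.0, 0.5, 1.0], [3.0, 4.0, 0.5]))
```

Treating expert judgment as an input keeps the human in the loop while the equation supplies consistency.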
Coda: A few final thoughts
http://economix.blogs.nytimes.com/2011/04/14/time-and-judgment/
Noise and bias can affect important judgments
n.b. These findings have been questioned!
Actuarial vs clinical prediction – the motion picture
Human judges are not merely worse than optimal regression equations;
they are worse than almost any regression equation.
— Richard Nisbett and Lee Ross
Algorithms can be biased too
Many jobs will continue to be lost to intelligent automation…
but if you’re looking for a field that will be booming for many years, get into human-machine collaboration and process architecture and design.
– Garry Kasparov, Deep Thinking
The future of work is Freestyle x
The problems that we face with technology are fundamental… We need a calmer, more reliable, more humane approach.
We need augmentation, not automation.
– Don Norman
Data + human judgment / empathy: decisions that are consistent, de-biased, informed, meaningful
“Freestyle x” in medicine
… Machine learning will displace much of the work of radiologists and anatomical pathologists.
These physicians focus largely on interpreting digitized images, which can easily be fed directly to algorithms instead.
– Ziad Obermeyer and Ezekiel Emanuel, NEJM
“Freestyle x” in medicine
… Machine learning will become an indispensable tool for clinicians seeking to truly understand their patients.
As patients’ conditions and medical technologies become more complex, the role of machine learning will grow, and clinical medicine will be challenged to grow with it.
– Ziad Obermeyer and Ezekiel Emanuel, NEJM
“Freestyle x” in medicine
… As AI gets further incorporated… we have to make some tougher decisions.
We underpay teachers, despite the fact that it’s a really hard job and a really hard thing for a computer to do well.
So for us to reexamine what we value, what we are collectively willing to pay for—whether it’s teachers, nurses, caregivers, moms or dads who stay at home, artists, all the things that are incredibly valuable to us right now but don’t rank high on the pay totem pole—that’s a conversation we need to begin to have.
– Barack Obama, Oct 2016
Copies available in the lobby
For more discussion see:
“The Last Mile Problem: how data science and behavioral science can work together” Deloitte Review, January 2015
http://dupress.com/articles/behavioral-economics-predictive-analytics/
“The Importance of Misbehaving: a conversation with Richard Thaler” Deloitte Review, January 2016
https://dupress.deloitte.com/dup-us-en/deloitte-review/issue-18/behavioral-economics-richard-thaler-interview.html
“Cognitive collaboration: why humans and computers think better together” Deloitte Review, January 2017
https://dupress.deloitte.com/dup-us-en/deloitte-review/issue-20/augmented-intelligence-human-computer-collaboration.html