data science in the age of ai mcfam distinguished lecture · deeply into positions effectively...

Data science in the age of AI

The roles of automation, augmentation, human judgment

Jim Guszcza, PhD, FCAS

MCFAM Distinguished Lecture

University of Minnesota

March 1, 2018

The original data scientist

“An approximate answer to the right question is worth a good deal more than an exact answer to an approximate problem.”

-- John Tukey

1993: “greater statistics”

2001: “data science”

Today

We need a concept of

“Greater data science”

The “last mile problem” of predictive analytics

MODEL

Predictive models can point us in the right direction …

… but they provide no value unless they are followed by the right actions or desired behavior change

Act 1: Big data

Three definitions of big data

1. Data sets with sizes beyond the capability of standard IT tools to capture, process, and analyze in reasonable time frames.



2. Data with high Volume, Velocity, Variety

• Huge datasets

• … emanating continuously from smart phones, sensors, cameras, GPS devices, computers, TVs, …

• … involving all manner of numeric, text, photographic data




3. “Anything that doesn’t fit in Excel”

The city of New York does actuarial prediction big data

Actuarial vs clinical prediction – the motion picture

Human judges are not merely worse than optimal regression equations; they are worse than almost any regression equation.

— Richard Nisbett and Lee Ross

A story I don’t like

“There is now a better way. Petabytes allow us to say: ‘Correlation is enough.’ We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.”

-- Chris Anderson, “The End of Theory,” Wired (2008)

Copyright © 2017 Deloitte Development LLC. All rights reserved.


Google Flu Trends

Google Flu Trends – from poster child to parable

The end of the end of theory

Act 2: Reframing big data

Big data – the classic example

(This we know)

A more striking correlation

(!)

More food for thought

(!!)

Hard to swallow

Digital breadcrumbs and personalization

Our lives are digitally mediated.

We continually leave behind digital breadcrumbs about:

• Who we email, call

• Our communication style

• How we drive

• What we buy

• What we eat

• What we read, watch

• How we sleep

• How we exercise

• What we think

B is for Behavioral

I believe that the power of Big Data is that it is information about people's behavior instead of information about their beliefs… It's not about the things you post [online] … which is what most people think about, and it's not data from internal company processes and RFIDs.

This sort of Big Data comes from things like location data off of your cell phone or credit card, it's the little data breadcrumbs that you leave behind you as you move around in the world.

Those breadcrumbs tell… the story of your life... Big data is increasingly about real behavior, and by analyzing this sort of data, scientists can tell an enormous amount about you. They can tell whether you are the sort of person who will pay back loans. They can tell you if you're likely to get diabetes.

-- Sandy Pentland, MIT Media Lab, “Reinventing Society in the Wake of Big Data,” edge.org conversation


Data science is now a societal issue

Since this data is mostly about people, there are enormous issues about privacy, data ownership, and data control. You can imagine using Big Data to make a world that is incredibly invasive, incredibly 'Big Brother'… George Orwell was not nearly creative enough when he wrote 1984...

-- Sandy Pentland, MIT Media Lab, “Reinventing Society in the Wake of Big Data,” edge.org conversation


Like, you know

Researchers at Cambridge University Psychometrics Centre built predictive models of personal details based purely on social network “Likes” of a sample of 58,000 people.

• Relationship status, substance abuse: 65-73% accurate

• Political leanings (Democrat vs Republican): 85% accurate

• Religion (Christian vs Muslim): 82% accurate

• Male sexual orientation: 88% accurate

• Ethnicity (African-American vs Caucasian): 95% accurate

“Observation of Likes alone is roughly as informative as using an individual’s actual personality test score.”

“Similar predictions could be made from all manner of digital data, with this kind of secondary ‘inference’ made with remarkable accuracy”

-- “Digital Records Could Expose Intimate Details and Personality Traits of Millions,” University of Cambridge Research News

http://www.cam.ac.uk/research/news/digital-records-could-expose-intimate-details-and-personality-traits-of-millions


http://web.media.mit.edu/~yva/InfographicPersonality.png


Customer-centric uses of big data

The role of applied behavioral economics

Customer-centricity

Brand Centric

Customer Centric

The three bounds

Bounded rationality: we are terrible natural statisticians. We need help from data science.

Bounded selfishness: we are driven by fairness and social norms – not just economic benefits.

Bounded self-control: we make short-term decisions at odds with our long-term goals.

“Nudge” is human-centered design

Design considerations

• Present bias

• Loss aversion

• Social proof

• “Social Physics”

• Framing effects

• Intuitive language / infoVis

• Status quo bias

• Mental accounting

• Cognitive load / “Scarcity”

• Pre-commitment

• Lotteries (overweighting small probabilities)

• Unit bias (“mindless eating”)

• Removing bottlenecks

“While Cass and I were capable of recognizing good nudges when we came across them, we were still missing an organizing principle for how to devise effective nudges.

We had a breakthrough… when I reread Don Norman’s classic book The Design of Everyday Things.”

– Richard Thaler, Misbehaving

Customer-centric data products

Data: Use telematics data to calculate risk factors

Digital: Periodic driver feedback reports

Design: Employ Opower-style peer comparisons
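One way a peer comparison in a driver feedback report might be computed is sketched below. This is only an illustration under stated assumptions: the function names, the "hard-braking events per 100 miles" metric, and the message wording are all hypothetical and do not come from the lecture.

```python
def share_of_peers_worse(driver_rate, peer_rates):
    """Fraction of peers whose event rate is higher (worse) than the driver's.

    driver_rate and peer_rates are hypothetical telematics metrics,
    e.g. hard-braking events per 100 miles.
    """
    if not peer_rates:
        raise ValueError("peer_rates must not be empty")
    worse = sum(1 for rate in peer_rates if rate > driver_rate)
    return worse / len(peer_rates)


def feedback_message(driver_rate, peer_rates):
    """Build a peer-comparison line for a periodic feedback report."""
    share = share_of_peers_worse(driver_rate, peer_rates)
    return f"You braked hard less often than {share:.0%} of similar drivers."


# Hypothetical example: one driver compared with four peers.
print(feedback_message(2.0, [1.0, 3.0, 4.0, 5.0]))
```

The comparison is framed relative to similar drivers rather than as a raw score, which is the social-proof element the design line points to.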


Prosocial applications of big data

Tapping into bounded selfishness

Data: Statistical fraud detection using web click data

Digital: Interactions mediated by web site

Design: Optimize behavioral nudge pop-up messages (use A/B testing)
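A minimal sketch of how an A/B test of two pop-up messages might be evaluated, assuming each visitor sees one message at random and the outcome is a yes/no response. It uses a standard two-proportion z-test; the function name and the counts in the example are hypothetical, not results from the lecture.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference in response rates between A and B.

    Returns (z, p_value). Assumes independent samples large enough for
    the normal approximation to hold.
    """
    rate_a = success_a / n_a
    rate_b = success_b / n_b
    # Pooled rate under the null hypothesis that both messages perform equally.
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: message A vs message B, 1,000 visitors each.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between messages is unlikely to be chance alone; whether the difference matters in practice is still a judgment call.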

Uberizing insurance?

Creating new customer-centric business models

Act 3: The rebirth of AI

The second machine age

An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves…

We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it for a summer.

What is AI? (Answer in 1956)

-- John McCarthy, Marvin Minsky, Nathan Rochester, Claude Shannon

“A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence”

What is AI? (Answer today)

The same family of algorithms we’ve been using for 20 years are doing a better job because they have access to big data.

— Daniel Levitin

Invariably, simple models and a lot of data trump more elaborate models based on less data… Currently, statistical translation models consist mostly of large memorized phrase tables…

— Peter Norvig

Neural networks in the 1990s

Neural networks today

Marvin [Minsky] was advocating what’s called “commonsense reasoning”.

Machines have shown essentially no examples of doing that.

Therefore, they are complements to people. People are actually not so bad at that.

However, they are somewhat lousy at tuning things and keeping exact accounts of stuff. Machines are good at that.

That gives the idea that there could be a human-machine partnership…

— Sandy Pentland, Deloitte Review 2017

AI = Augmented Intelligence

AI != human intelligence

Google Translate: memorized phrase tables

Feel the Bern

May, 2017

The prequel

… with a twist ending

Their skill at manipulating and “coaching” their computers to look very deeply into positions effectively counteracted the superior chess understanding of their grandmaster opponents and the greater computational power of other participants.

Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.

— Garry Kasparov

Human plus computer


Getting this process right involves more than math + stats. We also need concepts from psychology, human-centered design, ethics, …

Men will set the goals, formulate the hypotheses, determine the criteria, and perform the evaluations.

Computing machines will do the routinizable work that must be done to prepare the way for insights and decisions in technical and scientific thinking…

The symbiotic partnership will perform intellectual operations much more effectively than man alone can perform them.

— JCR Licklider, “Man-Computer Symbiosis”

JCR Licklider (1960)

The drive to automation

What about underwriting more complex risks?


A false comparison

Models are a form of “artificial intelligence” that augment (but do not replace) human expertise.

Equations > experts

(Equations + experts) > experts

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n$

Eyeglasses for the mind’s eye

Equations > experts


$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n$


Algorithms are “cognitive prostheses” that augment (but – in general – do not replace) human expertise.
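The linear equation on these slides, an intercept plus a weighted sum of predictors, can be sketched in a few lines of Python. This is only an illustration: the function name, the weights, and the inputs are hypothetical and are not taken from the lecture.

```python
def linear_score(intercept, weights, features):
    """Compute intercept + sum of weight_i * feature_i for one case."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return intercept + sum(w * x for w, x in zip(weights, features))


# Hypothetical example with two predictors: 1.0 + 2.0*4.0 + 3.0*5.0
print(linear_score(1.0, [2.0, 3.0], [4.0, 5.0]))  # 24.0
```

The equation applies the same weights to every case, which is the consistency that an expert's case-by-case judgment can lack. The expert still decides which predictors belong in the model and when its output should be overridden.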

Coda: A few final thoughts

http://economix.blogs.nytimes.com/2011/04/14/time-and-judgment/

Noise and bias can affect important judgments

N.B. These findings have been questioned!


Actuarial vs clinical prediction – the motion picture

Human judges are not merely worse than optimal regression equations; they are worse than almost any regression equation.

— Richard Nisbett and Lee Ross

Algorithms can be biased too


Many jobs will continue to be lost to intelligent automation…

but if you’re looking for a field that will be booming for many years, get into human-machine collaboration and process architecture and design.

– Garry Kasparov, Deep Thinking


The future of work is Freestyle x

The problems that we face with technology are fundamental… We need a calmer, more reliable, more humane approach.

We need augmentation, not automation.

– Don Norman

data + human judgment / empathy → decisions that are… consistent, de-biased, informed, meaningful


“Freestyle x” in medicine

… Machine learning will displace much of the work of radiologists and anatomical pathologists.

These physicians focus largely on interpreting digitized images, which can easily be fed directly to algorithms instead.

– Ziad Obermeyer and Ezekiel Emanuel, NEJM


… Machine learning will become an indispensable tool for clinicians seeking to truly understand their patients.

As patients’ conditions and medical technologies become more complex, the role of machine learning will grow, and clinical medicine will be challenged to grow with it.

– Ziad Obermeyer and Ezekiel Emanuel, NEJM

… As AI gets further incorporated… we have to make some tougher decisions.

We underpay teachers, despite the fact that it’s a really hard job and a really hard thing for a computer to do well.

So for us to reexamine what we value, what we are collectively willing to pay for—whether it’s teachers, nurses, caregivers, moms or dads who stay at home, artists, all the things that are incredibly valuable to us right now but don’t rank high on the pay totem pole—that’s a conversation we need to begin to have.

– Barack Obama, Oct 2016

Copies available in the lobby

For more discussion see:

“The Last Mile Problem: how data science and behavioral science can work together,” Deloitte Review, January 2015

http://dupress.com/articles/behavioral-economics-predictive-analytics/

“The Importance of Misbehaving: a conversation with Richard Thaler,” Deloitte Review, January 2016

https://dupress.deloitte.com/dup-us-en/deloitte-review/issue-18/behavioral-economics-richard-thaler-interview.html

“Cognitive collaboration: why humans and computers think better together,” Deloitte Review, January 2017

https://dupress.deloitte.com/dup-us-en/deloitte-review/issue-20/augmented-intelligence-human-computer-collaboration.html
