Jeff Jonas - IBM - Presentation at the Chief Data Officer Forum, Government
TRANSCRIPT
© 2016 IBM Corporation
Context Computing
And the Rise of Sensemaking Systems
CDO Government
June 8-9, 2016
Jeff Jonas, IBM Fellow
Chief Scientist, Context Computing
http://www.twitter.com/jeffjonas
www.jeffjonas.typepad.com
Jeff Jonas
IBM Fellow
Chief Scientist, Context Computing
Founded Systems Research & Development (SRD) in 1985
Architected, designed, developed roughly 100 systems over the last three decades
– Defense, intelligence
– Financial services
– Gaming
– Law enforcement
Acquired by IBM in 2005
Currently focused on Context Computing, Sensemaking and Privacy by Design (PbD)
“The data must find the data and the relevance must find you.”
Trend: Organizations Are Getting Dumber
[Chart: axes "Time" and "Increasing Compute Power"; labels: "Sensemaking Algorithms", "Available Observation Space", "Context", "Enterprise Amnesia"]
“Every two days now we create as much information as we did from the dawn of civilization up until 2003.”
~ Eric Schmidt, CEO Google
Trend: Organizations Are Getting Dumber
[Chart: axes "Time" and "Increasing Compute Power"; labels: "Sensemaking Algorithms", "Available Observation Space", "Context"]
WHY?
Algorithms at Dead End.
You Can’t Squeeze Knowledge
Out of a Pixel.
Context
“Better understanding something by taking into account the things around it.”
I ducked as the bat flew my way.
Another exciting baseball game.
In Context
Vendor
High Value Asset
Job Applicant
Former Employee
Bad Guy
Context Accumulating
Context Accumulation
Contextualized Observations
Observation (Any kind of data from any kind of sensor)
Context Informs Decisioning
Context Accumulation
Contextualized Observations
Observation In Context
Decisioning
Act
Data Finds Data
Relevance Finds You
The data is the question!
Observation (Any kind of data from any kind of sensor)
The Puzzle Metaphor
Imagine an ever-growing pile of puzzle pieces of varying sizes, shapes, colors
What it represents is unknown – there is no picture on hand
Is it one puzzle, 15 puzzles, or 1,500 different puzzles?
Some pieces are duplicates, missing, incomplete or have errors
Some pieces may even be professionally fabricated lies
Until you take the pieces to the table, it is nearly impossible to assess the scene
Puzzling Images: Courtesy Ravensburger © 2011
270 pieces, 90%
200 pieces, 66%
150 pieces, 50%
6 pieces, 2%
30 pieces, 10% (duplicates)
First Discovery
More Data Finds Data
Duplicates in Front Of Your Eyes
First Duplicate Found Here
Incremental Context – Incremental Discovery
6:40pm START
22min “Hey, this one is a duplicate!”
35min “I think some pieces are missing.”
37min “Looks like a bunch of hillbillies on a porch.”
44min “Hillbillies, playing guitars, sitting on a porch, near a barber sign and a banjo!”
150 pieces, 50%
Incremental Context – Incremental Discovery
47min “We should take the sky and grass off the table.”
2hr “Let’s switch sides, and see if we can make sense of this from different perspectives.”
2hr10m “Wait, there are three … no, four puzzles.”
2hr18m “I think you threw in a few random pieces.”
How Context Accumulates
With each new observation, one asserts one of three things: 1) un-associated; 2) near neighbors; or 3) connected
Must favor the false negative
New observations sometimes reverse earlier assertions
Some observations produce novel discovery
The emerging picture helps focus collection interests
As the working space expands, computational effort increases
Then given sufficient observations there comes a tipping point whereby decision certainty increases while compute effort decreases!
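The three-way assertion above can be sketched in a few lines. This is only an illustration, not the method used in the systems described in the talk: the feature-count score, the thresholds CONNECT_AT and NEIGHBOR_AT, and the Context class are all hypothetical. Setting a high bar for "connected" is one simple way to favor the false negative.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds: a high bar for "connected" favors the false negative.
CONNECT_AT = 2   # shared features needed to assert "connected"
NEIGHBOR_AT = 1  # shared features needed to assert "near neighbor"


@dataclass
class Context:
    """Accumulated observations, kept so later evidence can be compared to earlier ones."""
    observations: list = field(default_factory=list)

    def assert_relation(self, new_obs: dict) -> list:
        """Return (existing_index, assertion) pairs for one new observation."""
        results = []
        for i, old in enumerate(self.observations):
            # Count features that carry the same value in both observations.
            shared = sum(1 for k, v in new_obs.items() if old.get(k) == v)
            if shared >= CONNECT_AT:
                results.append((i, "connected"))
            elif shared >= NEIGHBOR_AT:
                results.append((i, "near neighbor"))
            else:
                results.append((i, "un-associated"))
        self.observations.append(new_obs)
        return results


ctx = Context()
ctx.assert_relation({"name": "Bob Jones", "phone": "555-0100"})
second = ctx.assert_relation({"name": "Bob Jones", "city": "Reno"})
third = ctx.assert_relation({"name": "Bob Jones", "phone": "555-0100", "city": "Reno"})
print(second)
print(third)
```

In this toy run the second observation is only a near neighbor of the first, while the third shares enough with both earlier observations to be asserted as connected to each.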
Big Data [in context]. New Physics.
More data: better the predictions
– Lower false positives
– Lower false negatives
More data: bad data good
– Suddenly glad your data is not perfect
More data: less compute
Big Data
Pile of ______
Information In Context
One Essential Form of Context: “Entity Resolution”
Is it 5 people each with 1 account or is it 1 person with 5 accounts?
Is it 20 cases of SARS in 20 cities or one case reported 20 times?
If one cannot count, one cannot estimate vector or velocity (direction, speed).
Without vector and velocity, prediction is nearly impossible.
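The counting question ("5 people each with 1 account, or 1 person with 5 accounts?") can be illustrated with a small union-find sketch. It makes a deliberately naive assumption that is not from the talk: any two records sharing any identical feature value belong to the same entity. The function name count_entities and the sample records are hypothetical.

```python
def count_entities(records):
    """Count distinct entities, merging records that share any identifying value."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    seen = {}  # (feature, value) -> index of the first record carrying it
    for i, rec in enumerate(records):
        for item in rec.items():
            if item in seen:
                parent[find(i)] = find(seen[item])  # same value: merge the two groups
            else:
                seen[item] = i
    return len({find(i) for i in range(len(records))})


accounts = [
    {"account": "A1", "email": "fw@example.com"},
    {"account": "A2", "email": "fw@example.com", "phone": "555-0100"},
    {"account": "A3", "phone": "555-0100"},
    {"account": "A4", "email": "other@example.com"},
]
print(count_entities(accounts))  # 2
```

Four accounts resolve to two entities here because A1, A2, and A3 are chained together by a shared email and a shared phone. A real resolver would weigh features rather than merge on any single match.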
Who is Fang Wong?
Fang Wong - Top 100 Customer
F A Wong, Seattle, DOB: 6/12/82 - Former Customer
@FangWong - 2.5M Followers
[email protected] - Newsletter Subscriber
Fang [email protected] Department’s Prospect List
Resolving the Fang Wong
Fang Wong - Top 100 Customer
F A Wong, Seattle, DOB: 6/12/82 - Former Customer
@FangWong - 2.5M Followers
[email protected] - Newsletter Subscriber
Fang [email protected] Department’s Prospect List
Resolving the Fang Wong
Fang Wong - Top 100 Customer, 2.5M Followers, Newsletter Subscriber
Graphing the (resolved) Fang Wong
Bill Smith - Member of the Board
Employee
Customer
Customer
Fraudster
Fang Wong - Top 100 Customer, 2.5M Followers, Newsletter Subscriber
Contextualizing Sandy Maden
Bill Smith - Member of the Board
Sandy Maden - Job Applicant
Employee
Lives With
Co-signer
Former Employee (term no rehire)
Customer
Customer
Customer
Fraudster
Fang Wong - Top 100 Customer, 2.5M Followers, Newsletter Subscriber
“Entities”
Bill Smith - Member of the Board
Sandy Maden - Job Applicant
Employee
Lives With
Co-signer
Former Employee (term no rehire)
Customer
Customer
Customer
Fraudster
Fang Wong - Top 100 Customer, 2.5M Followers, Newsletter Subscriber
Company
Boat
Plane
Router
Car
Asteroid
ENTITY RESOLUTION: NEW THINK
Entity Resolution: Different Degrees of Difficulty
Exactly Same: Bob Jones, 123455 vs. Bob Jones, 123455
Fuzzy: Bob Jones, 123455 vs. Robert T Jonnes, 000123455
Incompatible Features: Bob Jones, 123455 vs. Bob@TheCo
Deceit: Bob Jones, 123455 vs. Ken Wells, 550119
Key Features Enable Entity Resolution
People: Name, Address, Date of Birth, Phone, Passport, Nationality, Biometric, Etc.
Cars: License Plate No., VIN, Make, Model, Year, Color, Etc.
Router: Serial Number, MAC Address, IP Address, Make, Model, Firmware Version, Etc.
Consider Lying Identical Twins
PASSPORT: #123, Sue, 3/3/84, Uberstan, Exp 2011
PASSPORT: #123, Sue, 3/3/84, Uberstan, Exp 2011
Fingerprint
DNA
Most Trusted Authority
“Same person – trust me.”
Most Trusted Authority
The same thing cannot be in two places … at the same time.
Two different things cannot occupy the same space … at the same time.
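The first principle can be turned into a simple feasibility test: if two observations would require an entity to move faster than is physically plausible, they cannot be the same entity. The sketch below is an illustration under stated assumptions, not something shown in the talk. The function names, the 900 km/h speed ceiling, and the 0.1 km tolerance are all hypothetical choices.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def could_be_same(obs_a, obs_b, max_kmh=900.0, tolerance_km=0.1):
    """Each obs is (lat, lon, unix_seconds). False if the implied speed is implausible."""
    dist = haversine_km(obs_a[0], obs_a[1], obs_b[0], obs_b[1])
    hours = abs(obs_a[2] - obs_b[2]) / 3600.0
    if hours == 0:
        # Same instant: only the same place (within tolerance) is possible.
        return dist <= tolerance_km
    return dist / hours <= max_kmh


seattle = (47.6062, -122.3321)
tampa = (27.9506, -82.4572)
print(could_be_same((*seattle, 0), (*tampa, 0)))          # same instant, two cities
print(could_be_same((*seattle, 0), (*tampa, 10 * 3600)))  # ten hours apart
```

The first call describes one thing in two places at the same time, so it is rejected; the second leaves enough time for a flight, so the two sightings remain compatible.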
Space & Time Enables Absolute Disambiguation
People: Name, Address, Date of Birth, Phone, Passport, Nationality, Biometric, Etc., When, Where
Cars: License Plate No., VIN, Make, Model, Year, Color, Etc., When, Where
Router: Serial Number, MAC Address, IP Address, Make, Model, Firmware Version, Etc., When, Where
“Life Arcs” Are Also Telling
Bill Smith, 4/13/67, Salem, Oregon
Bill Smith, 4/13/67, Seattle, Washington
Address History: Tampa, FL 2008-2016; Biloxi, MS 2005-2008; NY, NY 1996-2005; Tampa, FL 1984-1996
Address History: San Diego, CA 2005-2016; San Fran, CA 2005-2005; Phoenix, AZ 1990-2005; San Jose, CA 1982-1990
OMG
Space-Time-Travel
Cell phones are generating a staggering amount of geo-locational data – 600B transactions per day being created in the US alone
This data is being “de-identified” and shared with third parties – in volume and in real-time
Your movement quickly reveals where you spend your time
Re-identification (figuring out who is who) is somewhat trivial
And, oh so powerful predictions …
The 10 People I Spend the Most Time With (Not at Home and Not at Work)
Michelle Pfeiffer Greg Bob Amanda Ivan Shelby Lindsey Adam Brooke
He must be following me!
Unfair Advantage?
The Uberstan intelligence service preempts the next mass protest in real-time
A political opponent is crushed and resigns two days after announcing their candidacy
Consequences
Space-time-travel data is the ultimate biometric
Adoption is now accelerating at a blistering pace
It will enable enormous opportunity
It will unravel one’s secrets
It will challenge existing notions of privacy
Toying with Publicly Available Cell Phone Data
35,831 Call Data Records (CDRs)
– 6 months: From 08-31-2009 through 02-27-2010
18,391 Total Number of Usable CDRs
– Excluded CDRs with missing latitude, longitude, time, flow, or accuracy > 250 meters
2,444 Hangouts
– Minimum of 2 events, spanning at least 15 minutes, in a 610m STB
The Pattern of Life
– 130 Hangouts total
– 64 Hangouts 3 or more times
Ummm … seems we are living in habitrails!
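The hangout rule on this slide (minimum of 2 events, spanning at least 15 minutes, in a 610m STB) can be sketched as follows. Several things here are assumptions rather than details from the talk: that STB stands for a space-time box, that a flat 610 m grid is an acceptable stand-in for it, that a hangout is a run of consecutive events inside one cell, and the reference latitude ref_lat. The function names are hypothetical.

```python
import math
from itertools import groupby

CELL_M = 610             # box edge length taken from the slide
MIN_EVENTS = 2           # minimum events per hangout, from the slide
MIN_SPAN_S = 15 * 60     # minimum duration in seconds, from the slide


def cell_of(lat, lon, ref_lat=52.5):
    """Map a coordinate to a flat-grid cell; ref_lat is a hypothetical reference latitude."""
    y = lat * 111_320.0                                     # meters per degree of latitude
    x = lon * 111_320.0 * math.cos(math.radians(ref_lat))   # shrink longitude by latitude
    return (math.floor(x / CELL_M), math.floor(y / CELL_M))


def hangouts(events):
    """events: iterable of (lat, lon, unix_seconds). Returns [(cell, start, end, count)]."""
    ordered = sorted(events, key=lambda e: e[2])
    found = []
    # groupby yields runs of consecutive events that fall in the same cell.
    for cell, run in groupby(ordered, key=lambda e: cell_of(e[0], e[1])):
        run = list(run)
        span = run[-1][2] - run[0][2]
        if len(run) >= MIN_EVENTS and span >= MIN_SPAN_S:
            found.append((cell, run[0][2], run[-1][2], len(run)))
    return found


events = [
    (52.5200, 13.4050, 0), (52.5200, 13.4050, 600), (52.5200, 13.4050, 1200),
    (52.6000, 13.5000, 2000),
    (52.5200, 13.4050, 3000), (52.5200, 13.4050, 3300),
]
print(hangouts(events))
```

Only the first run qualifies: three events over 20 minutes in one cell. The later return to the same cell lasts five minutes, so it is not counted.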
Hangouts
Getting to Know Malte Spitz
Six months of my life in 35,000 records
http://www.malte-spitz.de/blog/4103927.html
ARCHITECTURAL CONSIDERATIONS
Action
Red Analytics
Green Analytics
Blue Analytics
Observation Space
Old School: Isolated Analytics
Observation Space
Action
Information In Context
Next: General Purpose Sensemaking
Data Finds Data
Relevance Finds You
Sensemaking
Observation Space
Action
Information In Context
Data Finds Data
Relevance Finds You
Helping Focus Human Attention
Sensemaking
General Purpose
• Insider Threat
• Marketing
• Next Best Action
• Anti-Money Laundering
• Asteroid Hunting
Simultaneously!
Sensemaking Architecture
Deep Reflection
Discovered Patterns
Context Accumulation
Contextualized Observations
Observation In Context
Decisioning
Act
Observation (Any kind of data from any kind of sensor)
Data Finds Data
Relevance Finds You
Data Mining
Deep Learning
Feature Extraction
Transformation
Scoring & Predictive Models
Event Processing
Context Computing
The most competitive organizations
are going to make sense of what they are observing
fast enough to do something about it
while they are observing it.
Related Blog Posts
www.jeffjonas.typepad.com
Data Finds Data
Puzzling: How Observations Are Accumulated Into Context
Big Data. New Physics.
G2 is 4
Fantasy Analytics
“No one writes bomb on manifest!”
Google: [IBM CDO Lookbook]
...including Context Computing, which will be available first through Watson Analytics
Email: [email protected]
Blog: www.jeffjonas.typepad.com
Twitter: http://www.twitter.com/jeffjonas
Questions?
WIDENING OBSERVATION SPACES
A SNEAK PEEK INTO MY CURRENT WORK
Dealing with Probabilities
Deep Reflection
Discovered Patterns
Context Accumulation
Contextualized Observations
Observation In Context
Decisioning
Act
Observation (Any kind of data from any kind of sensor)
Certainty 6.25%
Dealing with Probabilities
Additional Data
Original Certainty
Mark Smith, 123 Main Street, Santa Rosa, CA, DOB: 5/12/1974
Mark Smith, Santa Rosa, CA, 702.433.8871
Confirmed across 3 credit bureaus: Mark Smith, 123 Main Street, Santa Rosa, CA, DOB: 5/12/1974, 702.433.8871
Confirmed across two data aggregators:
Mark Smith, Santa Rosa, 05/12/74
- Only one observed
123 Main Street, Santa Rosa, CA
- No other Marks
- No other Smiths
702.433.8871
- Exclusive to Mark Smith
(*) 16 Mark Smiths live in Santa Rosa, CA [ref: http://www.intelius.com/results.php?trackit=63&ReportType=1&qf=Mark&qmi=&qn=Smith&qs=CA&qc=Santa+Rosa]
Certainty 6.25%*
Decision Certainty
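The 6.25% figure is consistent with a uniform guess across the 16 Mark Smiths noted in the footnote: 1 in 16. The small sketch below works that arithmetic and shows what happens if corroborating data narrows the field to a single candidate. The uniform assumption and the function name are illustrative choices, not something stated on the slide.

```python
def certainty_percent(candidates: int) -> float:
    """Chance that a record refers to one specific candidate, assuming all are equally likely."""
    if candidates < 1:
        raise ValueError("need at least one candidate")
    return 100.0 / candidates


print(certainty_percent(16))  # 6.25: sixteen Mark Smiths in Santa Rosa
print(certainty_percent(1))   # 100.0: additional data leaves only one candidate
```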
Using Curiosity to Increase Decision Certainty
Deep Reflection
Discovered Patterns
Context Accumulation
Contextualized Observations
Observation In Context
Decisioning
Act
Observation (Any kind of data from any kind of sensor)
Selective Curiosity
Figure Out Who to Ask
Yes
Make Request(s)
Assembly of Responses into Observations
Certainty 6.25%
Is it worth being curious about?
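The loop on this slide (decide whether curiosity is worthwhile, figure out who to ask, make requests, assemble the responses into new observations) can be sketched roughly as below. This is a toy illustration only and does not represent the patented method or any IBM implementation. The function name, the threshold and max_requests parameters, and the idea of modeling each source as a callable that returns the candidates it can confirm are all assumptions.

```python
def selective_curiosity(candidates, sources, threshold=0.99, max_requests=3):
    """Narrow a candidate set by asking sources until certainty clears the threshold."""
    requests_made = 0
    certainty = 1.0 / len(candidates)
    for source in sources:                  # "figure out who to ask"
        if certainty >= threshold or requests_made >= max_requests:
            break                           # no longer worth being curious
        confirmed = source(candidates)      # "make request(s)"
        requests_made += 1
        if confirmed:                       # "assembly of responses into observations"
            candidates = confirmed
            certainty = 1.0 / len(candidates)
    return candidates, certainty, requests_made


# Sixteen hypothetical candidates, identified by number.
people = set(range(16))
credit_bureau = lambda c: c & {3, 7}   # hypothetical source: confirms two candidates
aggregator = lambda c: c & {7}         # hypothetical source: confirms one candidate
never_asked = lambda c: c              # stays unused once certainty is high enough

print(selective_curiosity(people, [credit_bureau, aggregator, never_asked]))
```

Starting from 1 in 16, two requests are enough to reach a single candidate, so the third source is never contacted. Stopping early is the "selective" part of the idea as sketched here.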
Before
Deep Reflection
Discovered Patterns
Context Accumulation
Contextualized Observations
Observation In Context
Decisioning
Act
Observation (Any kind of data from any kind of sensor)
Certainty 6.25%
After
Deep Reflection
Discovered Patterns
Context Accumulation
Contextualized Observations
Observation In Context
Decisioning
Act
Observation (Any kind of data from any kind of sensor)
Decision Certainty
SELECTIVE CURIOSITY IN ACTION
A TRUE STORY
Why Selective Curiosity Matters
Patent US8620927
There are many domains where 99% accuracy is just not good enough, e.g.,
– Elections
– Healthcare
– National security
– Police investigations
– Self-driving cars
In the coming era of the Internet of Things, robots, and cognitive computing, “decision certainty” is going to make or break these technologies.
Selective Curiosity will make this possible …