exploring emergent consumer experience: a topological data analysis approach
TRANSCRIPT
© Novak and Hoffman 2015 | http://postsocial.gwu.edu
Exploring Emergent Consumer
Experience: A Topological Data
Analysis Approach
Tom Novak
Donna Hoffman
The George Washington University
Center for the Connected Consumer
CCB 2015 Annual Complexity in Business Conference, November 12-13, 2015
Ready or Not, The Smart Home is Coming Soon!
http://www.jklossner.com/TechToons/
› Setting the Stage
› A New Framework
› Uncovering the Possibility Space with Topological Data Analysis (TDA)
› TDA Application: Smart Home Sensors and Activities
› TDA Application: IFTTT Rules
Agenda
Internet Phase 1: Internet of Information (Web)
Research Focus: online experience
Catchphrase: “Nobody knows you’re a dog”
Internet Phase 2: Internet of People (Social)
Research Focus: social media
Catchphrase: “On the Internet, everybody knows you’re a dog”
Internet Phase 3: Internet of Things (Post-Social)
Research Focus: consumer experience of the assemblage
Catchphrase: “On the Internet of Things, nobody knows you’re a fridge”
From the Internet to IoT
The Evolution of Interactivity
Internet Phase 1: Internet of Information (Web)
Internet Phase 2: Internet of People (Social)
Internet Phase 3: Internet of Things (Post-Social)
Nature of Interactivity
Many to many interaction between people and content via Web interfaces.
Many to many interaction directly between people via social networks.
C2M, M2M, M2P, C2C interactions; device interactions autonomous; Digital → physical. Interactions highly heterogeneous, ongoing and evolve over time.
??
Identity of the Internet
Shopping online, browsing web pages, and searching for information are all largely static.
Global Collective Identity
Shift to smarter apps to enable more sophisticated and complex interactions. Balance of power shifts from marketer to consumer.
Global and Personal Collective Identity
C2C interactions recede compared to evolving heterogeneous interactions of C2M, M2M, M2P in overlapping assemblages.
New Personalized Consumer Experiences Will Emerge in the Context of the Unique Identities of These New Assemblages
Why Now? The 7 Technology Laws
Technology Law Description
#1 Moore’s Law Processing power. Transistor density on integrated circuits doubles every 12-24 months.
#2 Kryder’s Law Storage power. The density of information on hard drives doubles every 13 months.
#3 Gilder’s Law Communications power. Total bandwidth of communication systems doubles every 6 months.
#4 Kurzweil’s Law Accelerating returns. The time interval between salient technology events grows shorter as time passes.
#5 Weiser’s Law Instant adaptation. As technology becomes ubiquitous, people instantly adapt to new technology and take it for granted.
#6 Meeker’s Law 10x Multiplier Effect. With each new technology cycle, the number of devices increases tenfold.
#7 Metcalfe’s Law Network power. The value of a network is proportional to the square of the number of users.
The Consumer Internet of Things (IoT)
The collection of everyday objects in the physical environment, embedded with technology including sensors, actuators and the ability to communicate wirelessly with the Internet. These devices interact and communicate with themselves and each other – and with humans.
-- (Hoffman and Novak 2015)
The Internet of Things is Going to Be Huge
250 Million Connected Cars by 2020 (Telefonica)
$19 Trillion opportunity by 2020
(Cisco)
“100% IoT” by 2020 (CES)
Intel’s IoT group had a 19% increase and $2.1 billion in revenue in 2014
6.4 billion connected things by 2016 and 21 billion by 2020 (Gartner)
A lot of hype, but so far not a lot of adoption
Industry research suggests that consumer adoption of smart devices, the “Internet of Things,” is inevitable (Acquity Group 2014; Affinova 2014), with ⅔ of consumers saying they plan to buy at least one smart home device in the next five years.
Adoption rates are low:
16% own one device and 4% own two or more (Gartner)
6% use smart home tech (Nielsen)
4% own one device (Acquity)
Even the most aggressive projections suggest that only 30% are expected to purchase a smart thermostat (one of the most obvious smart home applications) 5 years from now, with much lower rates of adoption for other smart home devices.
But the Consumer IoT Has an Adoption Problem
Even Early Adopters are Having Trouble
Awareness. 87% of consumers have never heard of the “Internet of Things” and if they have, they don’t really know what it is.
Price, Security, Privacy and a Loss of Control. Oh, and smart home device prices are too high, there are serious concerns about security and privacy, and consumers worry that smart devices may develop scary minds of their own.
Product Value and Performance. Industry research shows that the main barriers are a lack of awareness and a perception that the value proposition is missing. Many current products simply aren’t ready for prime time.
3 Barriers to Adoption of Smart Devices
For the consumer IoT to expand beyond upscale consumers and techy DIYers willing to suffer, we need to understand the value it offers to consumers.
The current focus is on individual products (thermostat, light bulb, refrigerator) and specific “use cases” (turn on the lights when I get home).
Many consumers are having a hard time seeing the value from that focus.
We believe that value is embedded in what smart devices mean to consumers and want to shift the conversation to smart device identity and consumer experience.
For this, we need a new framework...
Cracking the Value Code
What kind of insights can we derive about emergent
consumer experience in the IoT from all these
heterogeneous interactions?
These interactions create a whole that is more than
the sum of the parts – a set of recurrent
“assemblages” (Hoffman and Novak 2015).
Just as the web needed new frameworks for
understanding consumer experience (Hoffman and
Novak 1996), the IoT will need new frameworks to
understand the consumer experience that emerges
from these interactions.
Interactivity is Evolving and New Consumer
Experiences are Emerging
The Smart Home Consumer Starts with A Device or Two
The Consumer’s Focus Shifts Once There Are 3-4 Devices
People Start to Want the Devices to Talk to Each Other
Assemblage Theory. A comprehensive theory from the neo-realist school of philosophy which explains the processes by which the identity of a whole - a whole that is more than the sum of the parts - emerges from on-going interactions among its parts.
(Deleuze and Guattari 1988; DeLanda 2002, 2006, 2011; Harman 2008).
Smart Devices - An Assemblage of Heterogeneous Parts
Interaction is the Fundamental Unit of Analysis in the Post-Social Internet of Things (IoT)
From assemblage theory, interaction involves paired capacities (Hoffman and Novak 2015; DeLanda 2006):
capacity to affect (IF “trigger”) + capacity to be affected (THEN “action”)
The entities in an IoT interaction are either consumers or devices (sensors, beacons, smart products, hubs, wearables, etc.).
CONSUMER → DEVICE: IF I trigger a beacon by entering a room THEN Philips Hue turns on the lights
DEVICE → CONSUMER: IF my leak detector is triggered by a broken pipe THEN my smart home hub sends me a text
DEVICE → DEVICE: IF a motion sensor detects activity THEN my security camera starts recording
Hoffman and Novak (2015) “Emergent Experience and the Connected Consumer in the Smart Home Assemblage and the IoT”
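The paired capacities on this slide can be written down as a small data structure. The sketch below is only an illustration: the `Interaction` class and its field names are hypothetical and do not come from the slides or from any IoT platform. It encodes the slide's three examples as IF/THEN pairs between a source entity (capacity to affect) and a target entity (capacity to be affected).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """One IF/THEN interaction (hypothetical structure, for illustration only)."""
    source: str   # entity exercising its capacity to affect: "consumer" or "device"
    trigger: str  # the IF part
    target: str   # entity exercising its capacity to be affected
    action: str   # the THEN part

    def kind(self) -> str:
        # Direction of the interaction, e.g. "consumer->device"
        return f"{self.source}->{self.target}"

    def describe(self) -> str:
        return f"IF {self.trigger} THEN {self.action} [{self.kind()}]"


# The three examples shown on the slide
EXAMPLES = [
    Interaction("consumer", "I trigger a beacon by entering a room",
                "device", "Philips Hue turns on the lights"),
    Interaction("device", "my leak detector is triggered by a broken pipe",
                "consumer", "my smart home hub sends me a text"),
    Interaction("device", "a motion sensor detects activity",
                "device", "my security camera starts recording"),
]

for interaction in EXAMPLES:
    print(interaction.describe())
```

Keeping the source and target entity types explicit makes it easy to count how many interactions in a rule set are consumer-to-device, device-to-consumer, or device-to-device.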
Some Interactions Define “Use Cases”
When the garage door opens or closes -
Then turn the kitchen lights red.
“Safety and Security”
In a 2010 White Paper, IBM introduced four areas of smart home capabilities demonstrating early adoption: 1) entertainment and convenience, 2) energy management, 3) safety and security, and 4) health and welfare
In 2015, Lowe’s markets an Iris “Safe & Secure” kit, and a “Comfort & Control” kit.
In 2015, SmartThings markets the benefits of its system as “Home Security,” “Peace of Mind,” and “Limitless Possibilities.”
These benefits derive from specific use cases.
Use Cases Dominate Current Smart Home Marketing
But - Use Cases Don’t Explain What Can Emerge
Safety and Security
Awareness of Energy Usage
Control and Convenience
“My house cares
about me”
The Home and Consumer Develop Emergent
Capacities From Interactions
time 1
Consumer: remotely control camera; view video live stream
Smart Home: move camera; stream live video
New Capacities Emerge From Interactions
time 1
Consumer: remotely control camera; view video live stream
Smart Home: move camera; stream live video
time 2
Consumer: feel secure
Smart Home: enable security
New Capacities Emerge From Interactions
time 1
Consumer: remotely control camera; view video live stream
Smart Home: move camera; stream live video
time 2
Consumer: feel secure
Smart Home: enable security
time 3
Consumer: grant delivery person entry
Smart Home: open front door
A Framework for Consumer Experience in the IoT
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2648786
Assemblage theory assumes an underlying
topological space of possibilities.
An actual assemblage is an individual singularity
that has been realized in this possibility space.
The possibility space contains universal
singularities (topological invariants) that structure
the possibility space and influence the population of
individual singularities that will emerge (DeLanda 2006, p. 30).
Thus, consumer experience is not only concerned
with “what is” for a given consumer (a single
consumer-IoT assemblage), but also with the
underlying structure of “what could be” for all
consumers (the larger population of all consumer IoT
assemblages).
Uncovering the Possibility Space of Consumer Experience
from Interactions in the IoT
Uncovering the Possibility Space with TDA
Interactivity is Evolving and New Consumer Experiences
are Emerging
What kind of insights can we derive
about emergent consumer experience
in the IoT - the “possibility space” -
from actual interactions?
These interactions represent a lot of
digital “big data” – very high
dimensionality data consisting of often
millions of ongoing interactions among
complex, heterogeneous component
devices (and consumers!).
Predictive Analytics approach: Fit predictive models to the data. But the complexity of the data means
hypothesis testing is often challenging. We need to know what questions to ask. Are we asking the right
questions? With big data, insights can be slow.
Conventional approaches for reduction and visualization: Use linear and nonlinear dimension
reduction techniques such as PCA, MCA, and MDS. But even when they work, these techniques are sensitive
to distance metrics and do not preserve the topological structure of the data.
Data-Driven Discovery Approach: Hypothesis-free approach based on computational topology to
qualitatively analyze functions on very high-dimensional data and visualize the data structure in low-
dimensional topological spaces. Topological data analysis (TDA) reveals structures in the data that have
invariant properties and can propel insight and improve hypothesis-generation and predictive modeling;
“digital serendipity” (Singh 2013).
132 million data points
Making Sense of Digital Big Data from the IoT
TDA Framework for Generating Topological Networks
Data
Metric and Lenses
Cover Image
Cluster Inverse Image
Nodes and Edges
Network Visualization
Computational topology technique on complex high-dimensional data using unsupervised machine learning and network visualization.
The result is a 3-dimensional topology of simplicial complexes (discrete, combinatorial objects) in which groups of data are represented as nodes containing rows that are similar to each other in the high-dimensional space, and edges connect nodes that share rows (Carlsson 2009; Lum et al. 2012).
TDA draws on theory of topological spaces and simplicial complexes (algebraic topology); implementation invokes computational topology (computational geometry, computational complexity theory, and computer science) – e.g. see Carlsson 2009; Lum et al. 2012; Singh, Memoli and Carlsson 2007.
TDA applies a function (lens) to a data set and builds a compressed summary of the data.
A visual network of nodes (each representing a group of similar data points) connected by edges is created using four types of parameters:
Metric (measure of similarity)
Lenses (functions on the data)
Bin resolution
Bin overlap
How TDA Works
Slide image courtesy of Ayasdi, Inc. http://ayasdi.com/
Implementation: Ayasdi 3.0 Software Platform (ayasdi.com, Ayasdi, Inc., Menlo Park, CA) and Python Mapper (Singh, Memoli and Carlsson 2007; http://danifold.net/mapper/).
Basic Steps of Implementation: 1. Specify a Metric and Real-
Valued Functions to Calculate an Image of the Data
We choose a metric to represent the distance or similarity among rows.
We choose real-valued functions (lenses) to apply to the data; their values form the image of the data.
Here, we map data points to their y-coordinate value.
y-coordinate function
Slide image courtesy of Ayasdi, Inc. http://ayasdi.com/
Basic Steps of Implementation: 2. Cover Image
We put the data into overlapping bins based on a chosen resolution (number of groups for each function) and gain (degree of overlap of each group within each function).
The functions use the metric to transform each row in the dataset into a single point that falls into overlapping bins.
y-coordinate function
Slide image courtesy of Ayasdi, Inc. http://ayasdi.com/
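The cover step can be sketched in a few lines. This is a minimal illustration and not the Ayasdi implementation: the slides do not say how gain is converted into interval overlap, so the hypothetical `overlap` parameter below is simply the fraction of an interval's length by which each interval is widened. The function names are also assumptions.

```python
import numpy as np


def overlapping_cover(lens_values, resolution, overlap):
    """Return `resolution` overlapping (low, high) intervals covering the lens range.

    `overlap` is an assumed stand-in for gain: each interval is widened by
    overlap/2 of its base length on both sides.
    """
    lo, hi = float(np.min(lens_values)), float(np.max(lens_values))
    base = (hi - lo) / resolution   # length of each interval before widening
    pad = base * overlap / 2.0      # amount added on each side
    return [(lo + k * base - pad, lo + (k + 1) * base + pad)
            for k in range(resolution)]


def bin_members(lens_values, intervals):
    """For each interval, return the indices of the rows whose lens value falls in it."""
    values = np.asarray(lens_values, dtype=float)
    return [np.flatnonzero((values >= a) & (values <= b)) for a, b in intervals]


# Toy lens: the y-coordinate of five points
y = [0.0, 1.0, 2.0, 3.0, 4.0]
intervals = overlapping_cover(y, resolution=2, overlap=0.5)
members = bin_members(y, intervals)
print(intervals)                     # two intervals that overlap around y = 2
print([m.tolist() for m in members])
```

Because the intervals overlap, the point with y = 2 lands in both bins, which is what later allows edges to form between nodes.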
Basic Steps of Implementation: 3. Cluster Inverse Image
Cluster the data in the cover based on the metric. Rows that fall within each bin are independently clustered, using a measure of similarity (the metric) on the original data (i.e. the inverse image of the functions).
Ayasdi uses single-linkage clustering with a fixed heuristic for the choice of the resolution parameter.
Each within-bin cluster becomes a node in the network. Nodes represent sets of data points that are similar with respect to the metric.
Slide image courtesy of Ayasdi, Inc. http://ayasdi.com/
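The within-bin clustering step can be sketched as follows. Cutting a single-linkage dendrogram at a distance threshold gives the same groups as the connected components of a graph that links any two rows closer than that threshold, so the sketch uses a small union-find. The Euclidean metric and the `cutoff` value are assumptions for illustration; the heuristic Ayasdi uses to choose the cut is not described in the slides and is not reproduced here.

```python
import numpy as np


def single_linkage_clusters(points, cutoff):
    """Group rows so that any two rows within `cutoff` end up in the same cluster.

    Equivalent to cutting a single-linkage dendrogram at height `cutoff`.
    Returns a sorted list of clusters, each a list of row indices.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path on the way
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            # Euclidean distance is an assumed metric for this toy example
            if np.linalg.norm(points[i] - points[j]) <= cutoff:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Rows falling in one bin: two well-separated groups
bin_rows = [[0, 0], [0, 1], [5, 5], [5, 6], [0, 2]]
print(single_linkage_clusters(bin_rows, cutoff=1.5))
```

Each returned cluster would become one node in the network.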
Basic Steps of Implementation: 4. Network Visualization
Edges connect nodes with rows in common.
Since the data were divided into overlapping bins, a data point can be in multiple nodes.
The resulting nodes and edges construct a topological network of the simplicial complex.
A visualization is created whose patterns can be interpreted for meaning.
Slide image courtesy of Ayasdi, Inc. http://ayasdi.com/
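The edge rule on this slide (connect nodes with rows in common) is easy to state in code. In the sketch below, the node identifiers and row sets are made-up toy values, and `build_edges` is a hypothetical helper name.

```python
from itertools import combinations


def build_edges(nodes):
    """Connect every pair of nodes that share at least one row.

    `nodes` maps a node id to the set of row indices in that within-bin cluster.
    """
    edges = []
    for a, b in combinations(sorted(nodes), 2):
        if nodes[a] & nodes[b]:  # non-empty intersection means shared rows
            edges.append((a, b))
    return edges


# Toy nodes: rows 2, 3 and 4 each sit in two nodes because the bins overlap
nodes = {
    "n0": {0, 1, 2},
    "n1": {2, 3},
    "n2": {4, 5},
    "n3": {3, 4},
}
print(build_edges(nodes))
```

The resulting node and edge lists can be handed to any graph layout tool for visualization.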
Metrics: correlation, Euclidean distance, cosine, hamming, categorical cosine, user-defined…
Functions: mean, variance, density, centrality, PCA, MDS, user-defined…
Supervised Machine Learning Models: TDA of machine learning outputs with outcome variables can enhance models through discovery of systematic error and construction of local models as opposed to a single global model
Metrics and Functions
Slide image courtesy of Ayasdi, Inc. http://ayasdi.com/
TDA Application: Smart Home Sensors and Activities
For a smart home to be smart, it needs to be able to understand what the people in the home are doing.
TDA Example #1: Smart Home Sensors and Activities
[Figure: timeline of sensor events for participants 1 and 2 (ID, day/time, activity, sensor), each performing Phone Call, Wash Hands, Cook, Eat, and Clean in sequence]
24 participants were asked to perform 5 activities in a fixed sequence, in an apartment equipped with motion and item sensors:
1. Make a phone call to retrieve a voicemail with a recipe
2. Wash their hands
3. Cook a pot of oatmeal according to the directions in the voicemail
4. Eat the oatmeal
5. Clean the dishes
A total of 24 sensors recorded motion (11 sensors), items (10 sensors), and water and burner usage (3 sensors) as the participants performed the 5 activities in sequence. Time was recorded for each sensor event.
Data source: Dataset 1, ADL Activities, CASAS, Washington State University, http://casas.wsu.edu/datasets/
D. Cook and M. Schmitter-Edgecombe, Assessing the quality of activities in a smart environment.
Methods of Information in Medicine, 2009.
Map of Smart Home Sensor Locations
[Figure: apartment floor plan with sensor locations. Legend: Motion Sensors, Item Sensors, Gas & Water Sensors. Labeled sensors: phone book, phone, oatmeal, raisins, brown sugar, bowl, spoon, medicine container, cabinet, pot, burner, hot water, cold water]
Data source: Dataset 1, ADL Activities, CASAS, Washington State University,
http://casas.wsu.edu/datasets/
Sample of Smart Home Event Stream Data
ID DATE TIME SENSOR ACTIVITY (“ground truth”)
1 2008-02-27 12:50:10.25482 AD1-B wash hands
1 2008-02-27 12:50:25.21543 AD1-B wash hands
1 2008-02-27 12:50:38.708635 M17 wash hands
1 2008-02-27 12:50:40.5204 AD1-B wash hands
1 2008-02-27 12:50:42.521531 M17 wash hands
1 2008-02-27 12:50:43.533263 M17 cook
1 2008-02-27 12:50:56.242 I07 cook
1 2008-02-27 12:51:01.601718 D01 cook
1 2008-02-27 12:51:03.797739 I02 cook
1 2008-02-27 12:51:05.950145 I03 cook
1 2008-02-27 12:51:09.800081 I01 cook
1 2008-02-27 12:51:10.4779 M17 cook
1 2008-02-27 12:51:11.473605 I04 cook
1 2008-02-27 12:51:12.497614 I05 cook
1 2008-02-27 12:51:16.729926 D01 cook
There are a total of 6,425 sensor events from 24 participants.
Data source: Dataset 1, ADL Activities, CASAS, Washington State University,
http://casas.wsu.edu/datasets/
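Rows laid out like the sample above can be parsed into timestamps and converted to elapsed seconds per participant. The sketch assumes the five-column layout shown on this slide (ID, DATE, TIME, SENSOR, ACTIVITY) and that every time has fractional seconds; the raw CASAS files may be laid out differently. The function names are hypothetical.

```python
from datetime import datetime


def parse_event(line):
    """Split one row laid out as: ID DATE TIME SENSOR ACTIVITY."""
    # maxsplit=4 keeps multi-word activities such as "wash hands" together
    pid, date, time, sensor, activity = line.split(maxsplit=4)
    stamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S.%f")
    return int(pid), stamp, sensor, activity


def seconds_since_first(rows):
    """Elapsed seconds from the first row seen for each participant ID."""
    first = {}
    elapsed = []
    for pid, stamp, _sensor, _activity in rows:
        first.setdefault(pid, stamp)
        elapsed.append((stamp - first[pid]).total_seconds())
    return elapsed


# Three rows copied from the sample above
sample = [
    "1 2008-02-27 12:50:10.25482 AD1-B wash hands",
    "1 2008-02-27 12:50:25.21543 AD1-B wash hands",
    "1 2008-02-27 12:50:43.533263 M17 cook",
]
rows = [parse_event(s) for s in sample]
elapsed = seconds_since_first(rows)
print(elapsed)
```

Here the elapsed times are relative to the first row of the excerpt, not to the participant's true first event, since the excerpt starts partway through the session.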
Columns: ID, activity, sensor, current event time (seconds), window start time (seconds), window duration (seconds); then one binary indicator per sensor (AD1_A AD1_B AD1_C D01 E01 I01 I02 I03 I04 I05 I06 I07 I08 M01 M07 M08 M09 M13 M14 M15 M16 M17 M18 M23 asterisk)
1 cook I07 166 133 33 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
1 cook M17 167 135 32 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
1 cook M17 170 137 33 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
1 cook AD1-A 173 137 36 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 cook M17 173 140 33 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
1 cook AD1-A 176 151 25 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 cook AD1-B 176 153 23 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 cook M17 178 153 25 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
1 cook AD1-A 180 154 26 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 cook AD1-B 180 154 26 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 cook AD1-A 182 156 26 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 cook AD1-B 185 156 29 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 cook AD1-A 188 158 30 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 cook AD1-B 188 160 28 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 cook M17 190 161 29 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
1 cook M17 194 161 33 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
1 cook I05 199 162 37 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 cook AD1-A 200 163 37 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 cook M17 207 164 43 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
1 cook I05 207 166 41 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 cook M17 213 167 46 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
1 cook AD1-A 215 170 45 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 cook M17 219 173 46 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
1 cook M17 221 173 48 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
TIME SENSOR FOR CURRENT EVENT
Step 1: Create Binary Variables for Each of 24 Sensors
Sensor AD1-A (burner) was triggered at 215 seconds for participant ID = 1
Time begins with 0 for each participant.
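The binary coding in Step 1 amounts to a one-hot encoding of the sensor that fired. The sketch below uses the column names from the table above; the helper name `one_hot` and the hyphen-to-underscore normalization (the event rows write "AD1-A" while the columns write "AD1_A") are assumptions for illustration.

```python
# Column order taken from the table header on this slide
SENSORS = [
    "AD1_A", "AD1_B", "AD1_C", "D01", "E01",
    "I01", "I02", "I03", "I04", "I05", "I06", "I07", "I08",
    "M01", "M07", "M08", "M09", "M13", "M14", "M15", "M16", "M17", "M18", "M23",
    "asterisk",
]


def one_hot(sensor, sensors=SENSORS):
    """Return a 0/1 row with a single 1 in the column of the triggered sensor."""
    name = sensor.replace("-", "_")  # assumed normalization of sensor labels
    if name not in sensors:
        raise ValueError(f"unknown sensor: {sensor}")
    return [1 if s == name else 0 for s in sensors]


# First four events from the table: I07, M17, M17, AD1-A
events = ["I07", "M17", "M17", "AD1-A"]
indicators = [one_hot(e) for e in events]
for event, row in zip(events, indicators):
    print(event, row)
```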
Step 2: Define Sliding Window of the Past 20 Events
A fixed window from time t(i-19) to t(i) sums over the last 20 events, for each of the 24 sensors.
Columns: ID, activity, sensor, current event time (seconds), window start time (seconds), window duration (seconds); then one binary indicator per sensor (AD1_A AD1_B AD1_C D01 E01 I01 I02 I03 I04 I05 I06 I07 I08 M01 M07 M08 M09 M13 M14 M15 M16 M17 M18 M23 asterisk); then the sliding-window sum for each sensor (AD1_As AD1_Bs AD1_Cs D01s E01s I01s I02s I03s I04s I05s I06s I07s I08s M01s M07s M08s M09s M13s M14s M15s M16s M17s M18s M23s asterisk_s)
1 cook I07 166 133 33 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 2 1 2 1 1 0 1 0 0 0 0 0 0 0 0 0 9 1 0 0
1 cook M17 167 135 32 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 2 0 2 1 2 1 1 0 1 0 0 0 0 0 0 0 0 0 9 1 0 0
1 cook M17 170 137 33 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 2 0 2 1 2 1 1 0 1 0 0 0 0 0 0 0 0 0 10 0 0 0
1 cook AD1-A 173 137 36 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 2 1 2 1 1 0 1 0 0 0 0 0 0 0 0 0 10 0 0 0
1 cook M17 173 140 33 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 2 1 2 1 1 0 1 0 0 0 0 0 0 0 0 0 10 0 0 0
1 cook AD1-A 176 151 25 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 1 0 2 1 2 1 1 0 1 0 0 0 0 0 0 0 0 0 9 0 0 0
1 cook AD1-B 176 153 23 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 1 0 0 0 2 1 2 1 1 0 1 0 0 0 0 0 0 0 0 0 9 0 0 0
1 cook M17 178 153 25 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 2 1 0 0 0 2 1 2 1 1 0 1 0 0 0 0 0 0 0 0 0 9 0 0 0
1 cook AD1-A 180 154 26 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 1 0 0 0 1 1 2 1 1 0 1 0 0 0 0 0 0 0 0 0 9 0 0 0
1 cook AD1-B 180 154 26 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 2 0 0 0 0 1 2 1 1 0 1 0 0 0 0 0 0 0 0 0 9 0 0 0
1 cook AD1-A 182 156 26 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 2 0 0 0 0 1 2 1 1 0 1 0 0 0 0 0 0 0 0 0 8 0 0 0
1 cook AD1-B 185 156 29 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 3 0 0 0 0 0 2 1 1 0 1 0 0 0 0 0 0 0 0 0 8 0 0 0
1 cook AD1-A 188 158 30 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 3 0 0 0 0 0 2 0 1 0 1 0 0 0 0 0 0 0 0 0 8 0 0 0
1 cook AD1-B 188 160 28 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 4 0 0 0 0 0 2 0 0 0 1 0 0 0 0 0 0 0 0 0 8 0 0 0
1 cook M17 190 161 29 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 5 4 0 0 0 0 0 2 0 0 0 1 0 0 0 0 0 0 0 0 0 8 0 0 0
1 cook M17 194 161 33 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 5 4 0 0 0 0 0 2 0 0 0 1 0 0 0 0 0 0 0 0 0 8 0 0 0
1 cook I05 199 162 37 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 4 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 8 0 0 0
1 cook AD1-A 200 163 37 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 4 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 7 0 0 0
1 cook M17 207 164 43 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 6 4 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 8 0 0 0
1 cook I05 207 166 41 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 4 0 0 0 0 0 0 0 2 0 1 0 0 0 0 0 0 0 0 0 7 0 0 0
1 cook M17 213 167 46 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 6 4 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0
1 cook AD1-A 215 170 45 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 7 4 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 7 0 0 0
1 cook M17 219 173 46 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 7 4 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 7 0 0 0
1 cook M17 221 173 48 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 6 4 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0
TIME SENSOR FOR CURRENT EVENT SENSOR SUMS IN SLIDING WINDOW
Step 3: Create Sums of Sensor Events in Sliding Window
Sum of the number of times each sensor was triggered in the sliding window
See: Cook, Krishnan and Rashidi (2013), Krishnan and Cook (2014), Cook and Schmitter-Edgecombe (2009)
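The sliding-window sums in Steps 2 and 3 can be sketched with a cumulative sum over the binary indicator rows. Two points are assumptions rather than details from the slides: the window is truncated for the first events of a sequence (fewer than 20 events available), and the function is meant to be applied to one participant's events at a time. The name `sliding_window_sums` is hypothetical.

```python
import numpy as np


def sliding_window_sums(indicators, window=20):
    """For each event i, sum the indicator rows of events i-window+1 .. i.

    Early events use a shorter window (an assumption; the slides do not say
    how the start of a sequence is handled).
    """
    x = np.asarray(indicators, dtype=int)
    # Cumulative sums with a leading row of zeros, so any window sum is a difference
    csum = np.vstack([np.zeros((1, x.shape[1]), dtype=int), np.cumsum(x, axis=0)])
    out = np.empty_like(x)
    for i in range(len(x)):
        start = max(0, i - window + 1)
        out[i] = csum[i + 1] - csum[start]
    return out


# Toy example: 4 events, 2 sensors, window of 3 events
toy = [[1, 0], [0, 1], [1, 0], [1, 0]]
print(sliding_window_sums(toy, window=3))
```

With full windows, each output row sums to the window length, which matches the table above, where every row of sensor sums adds up to 20.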
Step 4: Use the Data for Topological Data Analysis (TDA)
1) TDA uses a rectangular matrix of events (rows) by 3 time variables and 24 sensor-sum variables (columns).
2) Actual activity is “ground truth” (i.e., what we know and want the TDA to be able to detect).
3) The specific sensor triggered and ID are explanatory variables.
Rectangular data matrix of processed sensor events using a simple baseline approach:
› Rows: current sensor event is the unit of analysis
› Columns: sums of 24 sensor events over a 20 event sliding window, time of current
event, time of first event in window, duration time of window
Distance among sensor events (rows) was obtained using the normalized correlation
metric. Columns were normalized to a mean of 0 and variance of 1, and Pearson
correlation of each pair of rows was obtained.
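The normalized correlation metric described above can be sketched with NumPy. Standardizing the columns and correlating the rows follows the slide; turning the correlation into a distance as 1 minus the correlation is an assumption, since the slides do not state how Ayasdi does this. The function name is hypothetical.

```python
import numpy as np


def normalized_correlation_distance(X):
    """Distance between rows: standardize columns, then 1 - Pearson correlation of rows."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0              # constant columns stay at zero after centering
    Z = (X - X.mean(axis=0)) / std   # each column now has mean 0 and variance 1
    R = np.corrcoef(Z)               # rows are treated as the variables
    return 1.0 - R                   # assumed conversion from similarity to distance


# Toy matrix: rows 0 and 3 are identical, so their distance should be 0
X = [
    [1, 10, 3],
    [2, 20, 1],
    [3, 30, 8],
    [1, 10, 3],
]
D = normalized_correlation_distance(X)
print(np.round(D, 3))
```

A row whose standardized values are all equal has zero variance, so its correlations are undefined (NaN); real data would need a rule for such rows.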
The Ayasdi TDA lens was the x and y coordinates of a 2-dimensional embedding of a
k-nearest neighbors graph of the data.
TDA of Smart Home Sensor and Activity Data
Objective: Use TDA to produce a topological summary of the stream of sensor events.
Apartment Activities (Colored by Time of Events)
Metric: Normalized Correlation
Lens: K-nearest neighbors (resolution 30, gain 3)
*We used the Ayasdi 3.0 software platform
(ayasdi.com, Ayasdi, Inc., Menlo Park, CA) to
perform the TDA on the CASAS data
Blue early times
Green intermediate times
Red later times
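To give a feel for what the resolution and gain settings control, the sketch below builds an overlapping interval cover for a single lens coordinate, as in a generic Mapper-style construction. It is not Ayasdi's implementation. It assumes that resolution is the number of intervals and that adjacent intervals overlap by a fraction 1 - 1/gain of their length. The slides do not define either parameter, so both conventions are assumptions.

```python
def interval_cover(lo, hi, resolution, gain):
    """Cover [lo, hi] with `resolution` equal-length overlapping intervals.

    Assumed convention: neighbouring intervals overlap by a fraction
    1 - 1/gain of their length, so each start is length/gain after the last.
    """
    length = (hi - lo) / ((resolution - 1) / gain + 1)
    step = length / gain
    return [(lo + i * step, lo + i * step + length) for i in range(resolution)]


# Hypothetical usage with the settings quoted on the slide.
cover = interval_cover(0.0, 1.0, resolution=30, gain=3)
```

With two lens coordinates, as in the analysis above, the cover would be the grid of products of two such interval lists, and the points falling in each grid cell would be clustered separately.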
Apartment Activities (Colored by # of Phone Call Events)
88% of Phone Call
events are in Group 1
12% of Phone Call
events are in Group 2
Look in phone book,
start phone call
M13 (dining) 48%
M14 (dining) 14%
M08 (dining) 6%
M07 (dining) 6%
M01 (dining) 6%
M09 (dining) 5%
Phone book 4%
Phone on 3%
Hang up phone,
approach kitchen
M13 (dining) 51%
M14 (dining) 29%
Phone off 21%
Note: individual sensors are shown that are triggered significantly more
often for this activity in a circled group (hypergeometric p-value <.05)
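The hypergeometric test mentioned in the note can be sketched as an upper-tail probability. The mapping of counts to arguments is one plausible reading and is an assumption: the population is all events for the activity, the marked items are events from one sensor, and the draws are the events in the circled group.

```python
from math import comb


def hypergeom_upper_tail(k, population, marked, draws):
    """P(X >= k) when `draws` items are taken without replacement from
    `population` items, of which `marked` are of the type being counted."""
    total = comb(population, draws)
    upper = min(marked, draws)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible terms drop out of the sum automatically.
    favourable = sum(
        comb(marked, i) * comb(population - marked, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total
```

A sensor would then be reported for a group when this probability falls below .05.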
Apartment Activities (Colored by # of Wash Hands Events)
59% of Wash Hands
events are in Group 2
20% of Wash Hands
events are in Group 1
21% of Wash Hands
events are in Group 3
Near Sink
Hot Water 19%
M16 (kitchen) 13%
M15 (kitchen) 12%
M14 (dining) 12%
Near Sink
Hot Water 13%
M14 (dining) 13%
M15 (kitchen) 12%
M16 (kitchen) 10%
At Sink
Hot Water 20%
M17 (stove) 46%
M18 (sink) 26%
Apartment Activities (Colored by # of Cook Events)
98% of Cook
events are in Group 3
TDA Shows There are Two Parts to Cooking
Preparing Oatmeal
Burner 22%
Oatmeal 3%
Raisins 3%
Brown sugar 3%
Bowl 3%
Measuring spoon 3%
M17(stove) 45%
Cooking Oatmeal
Burner 33%
M17 (stove) 54%
Meal Preparation
Meal Cooking
Apartment Activities (Colored by # of Eat Events)
24% of Eat
events are in Group 4
28% of Eat
events are in Group 5
47% of Eat
events are in Group 3
Get Medicine and Water
Cold Water 13%
Medicine container 6%
Cabinet 9%
Brown sugar 3%
M16 (kitchen) 10%
M15 (kitchen) 8%
Walk to Table and Eat
Medicine container 3%
M13 (dining) 26%
M14 (dining) 23%
M15 (kitchen) 11%
M16 (kitchen) 11%
Eat
M14 (dining) 45%
M13 (dining) 40%
M15 (kitchen) 5%
M16 (kitchen) 5%
Apartment Activities (Colored by # of Clean Events)
68% of Clean
events are in Group 5
17% of Clean
events are in Group 3
15% of Clean
events are in Group 4
Put items away and start cleaning
Hot Water 24%
Cabinet 6%
Oatmeal 5%
Bowl 4%
Measuring spoon 3%
Raisins 3%
M18 (sink) 24%
Cleaning
Hot Water 32%
Cold Water 8%
Medicine container 6%
M18 (sink) 11%
Cleaning
Hot Water 29%
Cold Water 15%
Medicine container 6%
M18 (sink) 11%
M15 (kitchen) 4%
Implications of TDA Analysis of Smart Home Data
TDA recovers known activities based on a simple baseline method for processing event streams using sliding windows of counts of sensors, plus start, stop and duration time of the window.
TDA provides clear visual input on the performance of a given approach to processing event stream data. For example, cooking has a clear structure with 2 stages.
The next step is to improve activity recognition by modifying how the event stream is processed, for example by using alternative window sizes, fixed vs. variable windows, exponential time decay on event sums, sensor dependency, and past contextual information (e.g., Krishnan and Cook 2014).
Longer term: Use machine learning to predict the activities from the sensor sums, and use TDA to understand the errors in prediction.
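As one illustration of the next steps listed above, exponential time decay on event sums could replace the plain counts, so that older events in the window contribute less. The weighting form exp(-rate * age) and the `rate` parameter are assumptions for this sketch, not a description of the method in Krishnan and Cook (2014).

```python
import math


def decayed_sums(window_events, sensors, current_time, rate):
    """Per-sensor sums where each event is weighted by exp(-rate * age).

    window_events : list of (time, sensor_id) tuples in the window
    rate          : hypothetical decay rate; rate = 0 gives plain counts
    """
    totals = {s: 0.0 for s in sensors}
    for time, sensor in window_events:
        if sensor in totals:  # ignore sensors outside the chosen set
            totals[sensor] += math.exp(-rate * (current_time - time))
    return [totals[s] for s in sensors]
```

Setting `rate` to 0 recovers the simple baseline, which makes the two approaches easy to compare in the same topological summary.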
TDA Application: IFTTT Rules
What is the structure of the possibility space underlying the way people use IFTTT?
Most Popular IFTTT Trigger and Action Channels
IFTTT Trigger Channel          IFTTT Action Channel
Feed (RSS feeds)   25%         Twitter          15%
Instagram           9%         Email            11%
Date & Time         8%         SMS               8%
Weather             8%         Evernote          7%
Facebook            5%         Facebook          7%
Gmail               4%         Dropbox           6%
YouTube             2%         Facebook Pages    4%
Twitter             2%         Google Drive      4%
WordPress           2%         Tumblr            3%
Tumblr              2%         Pocket            3%
Total of top 10 trigger channels: 68%
Total of top 10 action channels: 74%
IFTTT Data: 120,253 Rules and 1101 Variables
“Turn on my lights when I get close to my home” (Title Words)
iOS Location (Trigger Channel)
You enter an area (Trigger Words)
Philips Hue (Action Channel)
Turn on the lights (Action Words)
IF
THEN
86 binary variables
69 binary variables 103 binary variables
280 binary variables
563 binary variables
Data Description
120,253 IFTTT rules were created by 60,230 IFTTT users over a 3-year period from mid-2011 to mid-2014 (8,404 unique rules).
› 68% of users created 1 rule
› 22% of users created 2-3 rules
› 8% of users created 4-9 rules
› 2% of users created 10+ rules (23,496 total rules, or 20% of the data)
1101 binary variables for trigger channels, action channels, trigger words, action words & title words.
132 million data points
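The figures on this slide can be cross-checked with a few lines of arithmetic. All numbers below are taken from the slide; only the variable names are introduced here.

```python
rules = 120_253
variables = 1_101
block_sizes = [86, 69, 103, 280, 563]  # the five binary-variable blocks shown

# The five blocks of binary variables add up to the stated total.
total_variables = sum(block_sizes)

# Rules times variables gives the size of the binary data matrix.
data_points = rules * variables

# Share of all rules created by users with 10 or more rules.
heavy_user_share = 23_496 / rules

print(total_variables, data_points, round(heavy_user_share * 100, 1))
```

The product is 132,398,553, which matches the stated 132 million data points, and the heavy-user share is about 19.5%, which rounds to the 20% quoted.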
1. What IFTTT rules are people creating?
2. Are there interesting and important patterns that underlie the IFTTT rules that have been created?
3. Do these patterns suggest emergent themes?
Let’s take a look at some conventional data reduction and visualization approaches first…
Preliminary Research Questions
Network Graph of 553 Nodes for 120,253 IFTTT Rules Using All IFTTT Trigger Channels/Triggers and Action Channels/Actions
Network graph cannot reveal clear patterns in these complex data.
Only option would be to significantly reduce the number of nodes shown.
PCA and K-Means Clustering on the Component Scores
Some general features, but complexity is not revealed:
Component 1 identifies groups of rules about photos (+) versus email and SMS (-)
Component 2 separates weather and smart home rules (+) and new rules and feeds (-)
Multiple Correspondence Analysis of the Burt Matrix
Some gross features, but no complexity:
Dimension 1 separates weather (+) and photos (-), while dimension 2 contrasts reading (+) with smart home devices and Android devices (-).
Conventional dimension reduction and visualization approaches have difficulty revealing clear patterns in these complex data of 1,100+ variables.
Need about 400 dimensions to account for most of the variance in the data. Took Stata 6 hours to compute the MCA solution.
General features that grossly separate rules are apparent, but it’s difficult to see the more subtle behavioral patterns underlying rule creation, let alone potential emergent themes.
Topological data analysis offers an approach to visualization of network structure that is more revealing and useful.
TDA Provides a Different Approach
Ayasdi* TDA Solution of All 120,253 IFTTT Rules
Each node is a cluster of rules.
Nodes connect if they have
rules in common.
Color indicates the number of
rules in each node:
2790 nodes containing
118,226 rules
(98.3% of all rules)
1305 nodes containing 2028 rules
(1.7% of all rules)
Color scale: 1 rule to 500+ rules
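The rule that nodes connect if they have rules in common can be illustrated directly. This is a toy sketch with hypothetical node names and rule ids, not output from the Ayasdi platform.

```python
from itertools import combinations


def overlap_edges(nodes):
    """Return an edge for every pair of nodes sharing at least one rule.

    nodes : dict mapping a node id to the set of rule ids in that cluster
    """
    return [
        (a, b)
        for a, b in combinations(sorted(nodes), 2)
        if nodes[a] & nodes[b]  # non-empty intersection means a shared rule
    ]


# Hypothetical clusters: n1 and n2 share rule 3, n3 is isolated.
toy_nodes = {"n1": {1, 2, 3}, "n2": {3, 4}, "n3": {5}}
toy_edges = overlap_edges(toy_nodes)
```

Overlap arises because the lens values are covered by overlapping bins, so the same rule can fall into clusters from neighboring bins.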
Metric: Hamming
Lenses: MDS 1 and MDS 2 (resolution 60, gain 1.6)
*We used the Ayasdi 3.0 software platform
(ayasdi.com, Ayasdi, Inc., Menlo Park, CA) to
perform the TDA on the IFTTT data
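For binary variables such as these, the Hamming metric counts the positions where two rules differ. The sketch below returns the raw count. Whether the platform normalizes by the number of variables is not stated in the slides, so that detail is left open.

```python
def hamming(u, v):
    """Number of positions at which two equal-length binary vectors differ."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a != b for a, b in zip(u, v))


# Hypothetical 4-variable rules that differ in two positions.
example_distance = hamming([1, 0, 1, 1], [1, 1, 0, 1])
```

Two rules that share a trigger channel but differ in action channel and wording would be closer under this metric than two rules with nothing in common.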
TDA Reveals the Possibility Space of IFTTT Rules
8
Cluster 1: Send me feed items (13,437 rules)
IF a new feed item
THEN send me an SMS or email notification
Cluster 7: Share or save feed items (22,227 rules)
IF a new feed item
THEN tweet/post it or save it for later
Cluster 5: Store my likes (4,235 rules)
IF you like/tag content
THEN upload it to the cloud
Cluster 4: Internet of Things/QS (12,668 rules)
IF time/weather/SMS/email/location event
THEN control a device or log event
Cluster 3: Send me event info (14,265 rules)
IF time/weather/location event
THEN send me an SMS or email notification
Cluster 2: Send me content I’m looking for (7,128 rules)
IF a new email or Craigslist post meets search conditions
THEN send me an SMS or email notification
Cluster 8: Save content for later (5,869 rules)
IF new email, article, video
THEN save it for later
Cluster 6: Share or save social media (35,882 rules)
IF new social media content
THEN tweet/post it or save it for later
IFTTT Feed Rules:
Located in Two Distinct Flares in Full Network
Cluster 1: Send me feed items (13,437 rules)
IF a new feed item
THEN send me an SMS or email notification
Cluster 7: Share or save feed items (22,227 rules)
IF a new feed item
THEN tweet/post it or save it for later
News feeds are used as triggers in two very different ways.
What happens if we only analyze the 29,990 IFTTT rules that have a news feed as a trigger?
Immediately consume news
Distribute or archive news
IFTTT Feed Rules:
Three Different Types in Subset Network
Cluster 1: Send me feed items I’m looking for (10,758 rules)
IF a new feed item matches my search
THEN send me an SMS or email notification
Cluster 7b: Share feed items with others (9,530 rules)
IF a new feed item
THEN share it on social media
Cluster 7a: Save feed items for later (6,997 rules)
IF a new feed item
THEN save it in Pocket, Buffer, Delicious etc.
Metric: Hamming
Lenses: MDS 1 and MDS 2 (resolution 30, gain 1.6)
Cluster 1 remains the same (send me).
Cluster 7 splits into 7a (save) and 7b (share).
Immediately consume news
Distribute news
Archive news
In the full topological summary of 120,253 IFTTT rules, we can use
categorical variables as data lenses to separate the IFTTT rules into
groups with distinct structures.
› Social media channel lens - whether the IFTTT rule included a
social media channel (e.g. Facebook, Twitter, YouTube, etc.).
Does the structure of IFTTT rules vary based on whether the rule
includes a social media channel?
› Time lens - the year the IFTTT rule was created.
Does the structure of IFTTT rules change over time?
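The effect of a categorical data lens can be pictured as partitioning the rules before any network is built, so that each group gets its own topological summary. The sketch below is illustrative only: the field names, the year coding, and the short list of social channels are hypothetical.

```python
from collections import defaultdict

# Illustrative subset only; the slides name these three channels as examples.
SOCIAL_CHANNELS = {"Facebook", "Twitter", "YouTube"}


def split_by_lens(rules, use_year=False):
    """Group rules by social/non-social status and, optionally, by year."""
    groups = defaultdict(list)
    for rule in rules:
        social = (rule["trigger_channel"] in SOCIAL_CHANNELS
                  or rule["action_channel"] in SOCIAL_CHANNELS)
        key = (social, rule["year"]) if use_year else social
        groups[key].append(rule)
    return dict(groups)
```

With the social media lens alone this gives 2 groups, and combining it with a 3-level year lens gives up to 6 groups.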
Using Data Lenses to Include Known Structure
Using a Data Lens to Separate Networks:
Social and Non-Social IFTTT Rules
54,300 Non-Social Rules
(45% of rules do not use a Social Media channel)
65,953 Social Rules
(55% of rules use a Social Media channel)
Metric: Hamming
Lens: MDS 1 & 2 (resolution 60, gain 1.9)
Data Lens: Social Media (2 groups)
Non-Social Rules Social Rules
Send an SMS or email if
a new feed item
Save new feed
item to read later
Save new
feed item to
cloud
Save new photo
or file to cloud
Add event to cloud from a
location or SMS/email trigger
Save tagged
or favorited
content to
cloud
Smart Home and Wearables
Day, Time and Weather
Create a
note from
Send email to
notify me when
something
happens
Send me email or SMS if
Craigslist or Reddit post
Send me
email or
SMS if
tagged in
Schedule
tweet/update by
day and time
Save Facebook
or Instagram
content to cloud
Save Instagram pic
to cloud, or send to
FB or Flickr
Send Instagram
content to Twitter or
blog
Send content from
one social platform
to another
Tweet, blog,
post new
feed item
Using a Data Lens to Separate Networks:
Social and Non-Social IFTTT Rules
Non-Social Rules Social Rules
Smart Home and
Automation
Collect and
Save
Notify Me
Notify Me
Social
Automation
Collect and
Save
Broadcast
and Share
Using a Data Lens to Separate Networks:
Social and Non-Social IFTTT Rules
Non-Social
IFTTT
Rules
Social
IFTTT
Rules
Year 1: 2011-12
Year 2: 2012-13
Year 3: 2013-14
Metric: Hamming
Lens: MDS 1 & 2 (resolution 50, gain 2.1)
Data Lens: Social Media (2 groups), Year (3 groups)
Using a Data Lens to Separate Networks:
Three Years of IFTTT Rules
Non-Social
IFTTT
Rules
Social
IFTTT
Rules
Year 1: 2011-12
Year 2: 2012-13
Year 3: 2013-14
Collect & Save
Notify Me
Automation
The basic structure of IFTTT emerged in year 1, but became more organized and
interconnected in years 2 and 3.
Broadcast & Share
TDA provides a way to visualize what emerges from the IoT using
interaction events as the unit of analysis.
IFTTT is an assemblage that emerges from the capacities of the
components (i.e. triggers and actions) exercised in the interaction over
time between apps and devices that are connected through the individual
rules.
Based on our assemblage theory framework (Hoffman and Novak 2015),
the topology represents the possibility space (DeLanda 2006, 2011)
underlying the potential capacities of the IFTTT assemblage.
Supports productive hypothesis generation and subsequent predictive
modeling.
Implications of TDA Analysis of IFTTT Rules
Exploring Emergent Consumer Experience: A Topological Data Analysis Approach
Tom Novak and Donna Hoffman
postsocial.gwu.edu
Acknowledgments
The Ayasdi 3.0 software platform for
topological data analysis (ayasdi.com)
was used to construct all networks of
the IFTTT data. The authors
acknowledge the support of Devi
Ramanan, Global Head – Product
Collaborations, Ayasdi Inc., Menlo
Park, CA.
IFTTT data were collected from a crawl
during May - June 2014 and are used
with permission of IFTTT.com, San
Francisco, CA.
Smart apartment data are from the
Center for Advanced Studies in
Adaptive Systems (CASAS), School of
Electrical Engineering and Computer
Science, Washington State University.
http://ailab.wsu.edu/casas/