design for x: exploring product design with apache spark and graphlab
DESIGN FOR X
exploring data science product design with apache spark + graphlab {create}
@amcasari @Concur
data science summit 2016, san francisco
data science via random walks
nasa
senior product mgr + data scientist @ Concur Labs
control systems engineering + robotics + legos
officer in USN
operations research analyst
wandering dirtbag + conservation volunteer
EE + applied math + complex systems
underwater robotics engineer
technology consultant
SAHM
INSANELY QUICK INTRO TO +
➤ Concur Accelerator Team
➤ Concur Labs
➤ Incubator (still brewing)
Typical Day at Concur
850K users log into Concur
300K expense reports processed
120K trips booked
170M trips & expense reports warehoused
How do we encourage a culture of innovation while delivering quality service to our existing 33,000 business clients and 40M users?
DESIGN SPRINTS FOR DATA SCIENCEY PROTOTYPES
courtesy google ventures {we iterated…because data}
INSANELY QUICK INTRO TO
➤ “fast and general engine for large-scale data processing”
➤ advanced cyclic data flow and in-memory computing > runs 10x-100x faster than Hadoop MR
➤ interactive shells in several languages (incl. SQL)
➤ performant + scalable
courtesy databricks
ALMOST AS INSANELY QUICK INTRO TO +
➤ graphlab create is based on a python data science library developed + (some) os’d by turi
➤ SFrame <<>> Spark DataFrame | Spark RDD
➤ (yes it works with Open Source SFrame and GLC)
courtesy turi
DESIGN FOR KNOWLEDGE GAPS
➤ “We could {build this} {answer this better} if….”
➤ Reciprocal Data Applications
[diagram labels: rda (×3) · choose your data storage (×3) · the app you really want to make]
DESIGN FOR IOT NETWORKS
➤ “Can we trust our sensors?”
➤ “Has our network been hacked?”
[diagram labels: device (×3) · alerts, notifications, monitoring dashboards · data services]
Anomaly Detection Toolkit
TimeSeries <<>> SFrame
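one way to picture the “can we trust our sensors?” question: flag any reading that sits far from the recent average. The sketch below is plain standard-library Python, not the GraphLab Create Anomaly Detection Toolkit API. The function name `flag_anomalies`, the window size, the threshold, and the sample readings are all assumptions made up for illustration.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that sit more than `threshold`
    standard deviations from the mean of the preceding `window` readings.

    Hypothetical helper for illustration only.
    """
    flagged = []
    for i in range(window, len(readings)):
        recent = readings[i - window:i]
        mu = mean(recent)
        sigma = stdev(recent)
        if sigma == 0:
            # Flat window: treat any change at all as a deviation.
            if readings[i] != mu:
                flagged.append(i)
            continue
        if abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Made-up temperature readings with one obvious spike at index 6.
sensor = [20.1, 20.3, 19.9, 20.0, 20.2, 20.1, 35.0, 20.0, 20.2]
anomalies = flag_anomalies(sensor)
print(anomalies)
```

a trailing window keeps the check cheap enough to run per device; the trade-off is that a spike inflates the next few windows, so follow-up readings are judged against a noisier baseline.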
DESIGN FOR BOTS
➤ “How do we create a conversational interface?”
….nothing new, just the burning question since Turing, 1950
what NOT to do….
non-creepy unisex animal mascot
conversational ui
choose or create your framework
choose your data storage
Advanced Deep Learning
Text Analysis Toolkit
Graph Analytics Toolkit
DESIGN FOR FAIRNESS
➤ know your biases + limitations
➤ in your data, their data, all the data
➤ in your feature selection
➤ in your algorithm
…..because ethics (these ALL bias your results + communications)
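a first, very rough check for bias “in your data” is to compare how often an outcome occurs in each group. The sketch below is an illustration only and is not from the talk: the helper `positive_rate_by_group`, the field names, and the toy records are all hypothetical.

```python
from collections import defaultdict


def positive_rate_by_group(records, group_key, outcome_key):
    """Return {group: share of records with a truthy outcome}.

    Hypothetical helper for illustration only.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in records:
        group = row[group_key]
        totals[group] += 1
        if row[outcome_key]:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


# Made-up toy records; not real data.
records = [
    {"region": "a", "approved": True},
    {"region": "a", "approved": True},
    {"region": "a", "approved": False},
    {"region": "a", "approved": True},
    {"region": "b", "approved": False},
    {"region": "b", "approved": True},
    {"region": "b", "approved": False},
    {"region": "b", "approved": False},
]

rates = positive_rate_by_group(records, "region", "approved")
print(rates)
```

a gap between groups is a prompt to look closer at how the data was collected and labeled, not proof of bias on its own.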
learn more at data & society’s case studies
open source. reproducible. transparent.
{THANKS MUCH}
➤ Concur is hiring!
➤ SAP + SAP Ariba are hiring!
concurlabs.com
github.com/concurlabs
➤ example notebooks will be posted on our github in the future
@amcasari