Data science, self learning algorithms (by Alexander Frimout & Max Nie)


Upload: verhaert

Post on 12-Jan-2017

Category: Data & Analytics


TRANSCRIPT

Page 1: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

3.2 Simple sensors: case ‘smart metering’

Template presentation Innovation Day 2016 CONFIDENTIAL

Max Nie, Coordinator digital lab & project office, [email protected]

Alexander Frimout, Consultant InnoLab, [email protected]

TRACK 3: EVOLVING ARCHITECTURES

DATA SCIENCE: SELF LEARNING ALGORITHMS

Page 2: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

EVOLVING ARCHITECTURES

Learning machines in a new data world

Page 3: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

A TRADITIONAL HARDWARE PRODUCT IS “MATURE”

Everything you need (and will ever need) in one handy box

Page 4: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

WHAT HAPPENS TO THE BOX ONCE IT LEAVES THE COMPANY?

“We have no idea”

Sounds familiar?

Maintenance?

How is it used?

How long does it last?

What goes wrong?

Are people happy with our product?

Page 5: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

THE PRODUCT IS EXPECTED TO ALWAYS PERFORM TO ITS STANDARD

Sometimes an “error” with a product doesn’t show until later…

…or users do not use a product as intended…

…and the only option is to fix/improve it in a next generation

Page 6: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

SOFTWARE DEVELOPMENT TAKES A DIFFERENT APPROACH

Page 7: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

EVEN SO CALLED “MATURE” SOFTWARE IS NEVER TRULY FINISHED

Page 8: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

NEW, CONNECTED PRODUCTS ALSO HAVE THIS POSSIBILITY

Self learning machines can add enormous value!

• Personal experience tailored to the user

• Evolving products that promise more

• Better understanding of your own product

• Reduced costs for user & manufacturer

Page 9: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

ML CAN DO VERY COOL STUFF (BUT WE DON’T FULLY UNDERSTAND WHY)

Page 10: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

• The difference between 95% and 99% accuracy in speech recognition is game changing

• Training a speech recognition app requires $100 of electricity

• 1 supercomputer to run a neural net with 100 billion connections

• 10^19 floating point operations on thousands of parallel GPUs

• 4 TB training data.

THE CATCH: ML REQUIRES TRULY MASSIVE AMOUNTS OF TRAINING DATA

Page 11: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

ML IS NOT NEW TECHNOLOGY, THE BREAKTHROUGH IS IN THE SCALE

Page 12: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

Can you ignore this?

How do you play this game?

ML IS DRIVEN BY VERY BIG TECH WITH VERY BIG DATA

Page 13: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

A new class of software interfaces that interact at our own messy level

• Pictures

• Speech

• Text

• Expressions

• Behavior

ML ENABLES PRODUCTS THAT UNDERSTAND AND INTERACT WITH US

Page 14: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

IMAGE RECOGNITION

Page 15: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

SPEECH AND NATURAL LANGUAGE RECOGNITION

Page 16: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

1. In this ecosystem, volumes of training data are the currency

• Your added value is determined by how much you really know

2. Artificial Intelligence is the next computing platform

• New value chains and classes of products will emerge

3. Software and CPUs are cheap; training data is not

• Algorithms and hardware are not a source of differentiation

• Building training data is the basis for ROI

4. The performance of smart products grows continuously, driven by the flywheel of user-generated data feedback

• Through machine learning

• Through superior user insights

IMPLICATIONS

Page 17: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

1. What data assets do you own?

2. What data assets could you create?

YOUR DATA ASSETS

Page 18: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

HOW TO PLAY

Principles of smart product innovation

Page 19: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

Buying not just a product, but a promise

Some services are only possible after a sufficiently large data set/user base

Our world is evolving fast, we expect our products to evolve with us

Making the most out of data to improve your product

INCREASING VALUE WITH LEARNING & ADAPTIVE PRODUCTS

Tesla cars have driven over 150 million miles autonomously

Page 20: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

Don’t just start collecting data without first knowing WHY

Build the use cases for your product/service:

• What is the added value of this solution? What advantages or improvements am I offering my users?

• Is this a good fit with my product? Can I do this technically?

• What market am I targeting? Can I make a profit with this?

BUILDING THE RIGHT USE CASES

Page 21: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

Involve experts from all fields

Typically during a pressure cooker or sprint session

Keep an open mind and build a wide range of diverging cases

Select the right ones by objectively criticizing all aspects

BUILDING USE CASES REQUIRES A MULTIDISCIPLINARY APPROACH!

Page 22: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

SOME EXAMPLES OF USE CASES

IF … What is my trigger?

• I detect the performance of my elevator dropping

• I can monitor the heat profile and exact hotspot of a transformer

THEN … What action can I perform?

• I want to dispatch a technician early

• I can set up cooling much more rapidly and efficiently

BECAUSE … What is the underlying driver?

• I want to prevent it from malfunctioning later. Getting stuck in an elevator causes huge dissatisfaction with my hotel guests

• Better heat management can increase operating life by several years. Uniform cooling is inefficient
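The IF / THEN / BECAUSE template maps naturally onto a small rule structure. The following is only a sketch of that idea: the `UseCase` class, the `actions_for` helper, the field names `ride_seconds` and `baseline_seconds`, and the 20% threshold are all hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class UseCase:
    trigger: Callable[[Dict[str, float]], bool]  # IF: what is my trigger?
    action: str                                  # THEN: what action can I perform?
    driver: str                                  # BECAUSE: what is the underlying driver?


def actions_for(use_cases: List[UseCase], reading: Dict[str, float]) -> List[str]:
    """Return the action of every use case whose trigger fires on this reading."""
    return [uc.action for uc in use_cases if uc.trigger(reading)]


# Hypothetical rule: a ride taking 20% longer than the baseline counts as a performance drop.
elevator = UseCase(
    trigger=lambda r: r["ride_seconds"] > 1.2 * r["baseline_seconds"],
    action="dispatch a technician early",
    driver="getting stuck in an elevator causes huge dissatisfaction with hotel guests",
)

print(actions_for([elevator], {"ride_seconds": 30.0, "baseline_seconds": 24.0}))
```

Keeping the driver next to the trigger and the action makes it easy to review, in a multidisciplinary session, whether each rule still serves its business reason.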

Page 23: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

3 THINGS ARE NEEDED FOR MACHINE LEARNING

1. Training Data which has been tagged, categorized, or otherwise sorted by humans.

2. Software libraries which build the machine learning models by evaluating training data.

3. Hardware CPUs and GPUs which run the software’s calculations.

More and more becoming commodities

• Computation in the cloud

• Low powered networking

• Low powered CPU

• Minimal storage

Page 24: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

1. Product performance increases as more training data is fed

2. New user growth from ever increasing performance

3. Unique insights from product data drive product evolution and revolution

ONGOING PRODUCT PERFORMANCE IMPROVEMENT DRIVEN BY DATA

More users

More data

Better product/service

Better algorithms

1. Training data

Page 25: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

Unique data can allow you to provide a unique service or product

1. Many different factors can make your data unique!

2. You don’t have to generate all data yourself

3. Putting together all the right pieces of the puzzle is important

DATA IS NOT A COMMODITY!

Unique location

Established base

Always-on machinery

Product data

User data

Infrastructure access

Pre-installed sensor

You have more access to unique data than you think!

Product usage

Unique technology

Financial data

IntelligentX: The beer that’s continuously getting better

Market data

R&D testing

Page 26: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

• Hardware to collect & process data can & should be cheap!

• Cheap sensors

• Computation in the cloud is mandatory to exploit big data assets

• Low powered networking

• Low powered CPU

• Minimal storage

• Hardware design must enable data collection for the right use cases and contexts

• Think beyond mobile apps to wearables and other devices

• Form factor and price will drive hardware innovation, not performance

IMPACT ON HARDWARE

Page 27: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

• Hiring people to produce training data is too expensive

• So you must acquire an audience and let them create your training data

• The ideal data driven application creates training data and delivers value, powered by the data captured

Offer value or meaning in return for data

THE IDEAL DATA DRIVEN APPLICATION

Page 28: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

1. It tracks your steps

2. It tracks your distance run by GPS

3. It loads your Spotify playlists

4. It connects to online services

5. It syncs data with fitness apps

6. Its SDK allows 3rd party development

7. It’s an Alexa powered PA:

• “Alexa, play my workout list”

• “Alexa, what will the weather/traffic be?”

• “Alexa, what’s the latest news?”

• “Alexa, add milk to my shopping list”

• “Alexa, set the house temperature to 22°”

• “Alexander, you have one meeting today”

PEBBLE CORE: A $69 SMARTPHONE REPLACEMENT (FOR RUNNING)

Page 29: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

Hardware suppliers can become service providers!

Transformation process for organization

Requires you to consider alternative business models!

THIS WILL IMPACT YOUR ORGANIZATION & BUSINESS MODEL!

From … Buying a car

To … Subscribing to a flexible transportation service

Who will handle user communication? Do we need an IT department? Who are our new stakeholders?

How will we handle data?

Map your new ecosystem!

Page 30: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

Try to think of use cases for improving a product with data for…

EXERCISE

…a pillow

Hint: you can include actuators!

Page 31: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

WHAT DOES IT TAKE

Developing self learning algorithms

Page 32: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

DATA DRIVEN INNOVATION PROCESS

Create smart concept (use case)

Solve the data science problem

Develop & introduce product

INNOVATE

Have training data

INSIGHTS, PERFORMANCE

Page 33: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

DATA SCIENCE PROCESS

DATA QUESTION

• Answerable with data

• Data is obtainable

• Business & user validated

TIDY DATA

DATA PROCESSING

• Explore

• Clean

• Transform

• Combine

DATA ANALYSIS

• Descriptive analysis

• Exploratory analysis

• Inferential analysis

• Predictive analysis

• Prescriptive analysis

DATA PRODUCT

• Post

• Visualization

• App

Diagram labels: Sneakernet, Manual download, Scraping, Custom scripts; Descriptive, Exploratory, Predictive, Inferential, Causal, Mechanistic; Audience analysis, Premises, Conclusion(s)

Page 34: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

‘SUPERVISED LEARNING’ BASED ON TRAINING DATA SETS

Regression problems

Classification problems

Hypothesis Function

Cost Function

Source: http://www.andrewng.org/
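The formulas shown on this slide did not survive in the transcript. As a reminder of what a hypothesis function and a cost function typically look like, here is the standard single-feature formulation commonly used in the cited Andrew Ng course material. The symbols \theta, m, x^{(i)} and y^{(i)} are assumptions for illustration and are not taken from the slide itself.

```latex
% Regression: linear hypothesis and squared-error cost over m training examples
\begin{align}
h_\theta(x) &= \theta_0 + \theta_1 x \\
J(\theta_0, \theta_1) &= \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
\end{align}

% Classification: logistic hypothesis and cross-entropy cost
\begin{align}
h_\theta(x) &= \frac{1}{1 + e^{-(\theta_0 + \theta_1 x)}} \\
J(\theta_0, \theta_1) &= -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right]
\end{align}
```

In both cases, training means choosing the parameters \theta that make J as small as possible on the training data.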

Page 35: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

THE OPTIMAL HYPOTHESIS MINIMIZES THE COST FUNCTION

Iterative convergence

Source: http://www.andrewng.org/

Page 36: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

GRADIENT DESCENT ALGORITHM FOR COST FUNCTION MINIMIZATION
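A minimal sketch of batch gradient descent, assuming a one-feature linear hypothesis h(x) = theta0 + theta1 * x and a mean squared-error cost. The function name `gradient_descent`, the learning rate `alpha`, the iteration count and the toy data are illustrative choices, not values from the slide.

```python
def gradient_descent(xs, ys, alpha=0.05, iterations=2000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent on the squared-error cost."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        # Prediction error of the current hypothesis on every training example.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of J = 1/(2m) * sum(error^2) with respect to each parameter.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: step against the gradient.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


# Toy training set generated from y = 1 + 2x, so the fit should approach (1, 2).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(xs, ys)
print(round(theta0, 3), round(theta1, 3))
```

If `alpha` is too large the updates overshoot and diverge; if it is too small convergence becomes very slow, which is why the learning rate is usually tuned per problem.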

Page 37: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

THE JOY OF CONVEX COST FUNCTIONS

Page 38: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

UNFORTUNATELY REAL PROBLEMS ARE NONLINEAR

Page 39: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

THIS REQUIRES A MORE FLEXIBLE APPROACH

Page 40: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

NON LINEAR CLASSIFICATION

Page 41: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

MODELING THE XNOR FUNCTION WITH A NEURAL NETWORK
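The network diagram for this slide is not in the transcript. A common textbook construction, sketched here under the assumption that the slide follows it, uses two hidden sigmoid units (one computing AND, one computing NOT x1 AND NOT x2) and an output unit that ORs them. The specific weights below are the usual illustrative values and are an assumption, not read from the slide.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def xnor_network(x1, x2):
    """Two-layer network with hand-set weights that approximates XNOR on binary inputs."""
    a1 = sigmoid(-30 + 20 * x1 + 20 * x2)    # hidden unit 1: x1 AND x2
    a2 = sigmoid(10 - 20 * x1 - 20 * x2)     # hidden unit 2: (NOT x1) AND (NOT x2)
    return sigmoid(-10 + 20 * a1 + 20 * a2)  # output unit: a1 OR a2


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(xnor_network(x1, x2)))
```

No single linear unit can separate the XNOR cases, so the hidden layer is what makes this nonlinear classification possible.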

Page 42: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

BUILDING COMPLEXITY AND SCALE WITH NEURAL NETWORKS

Page 43: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

TRAINING STEP 1: DEFINE NEURAL NETWORK ARCHITECTURE

Page 44: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

TRAINING STEP 2: EVALUATE COST AND PARTIAL DERIVATIVE FUNCTIONS

Page 45: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

TRAINING STEP 3: MINIMIZE NON CONVEX COST FUNCTION

Page 46: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

HOW DEEP LEARNING OVERCOMES THE BIAS VARIANCE TRADE-OFF

1. Example: set benchmark for speech recognition at human error rate of 1%

2. If the training error is too high, e.g. 5%, then you have a bias issue: run a bigger neural network

3. If the validation set error is too high, e.g. 6%, then you have a variance issue: get more data

4. Otherwise you’re done.
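The recipe above can be written as a small decision function. This is a sketch only: the name `diagnose` and the `tolerance` parameter (how far above the reference an error may sit before it counts as "too high") are hypothetical, since the slide gives example percentages but no explicit thresholds.

```python
def diagnose(train_error, validation_error, target_error=0.01, tolerance=0.01):
    """Suggest the next step from training and validation error rates (as fractions)."""
    # Training error well above the benchmark: the model underfits (bias).
    if train_error - target_error > tolerance:
        return "bias: run a bigger neural network"
    # Validation error well above training error: the model overfits (variance).
    if validation_error - train_error > tolerance:
        return "variance: get more data"
    return "done"


print(diagnose(train_error=0.05, validation_error=0.06))
print(diagnose(train_error=0.015, validation_error=0.06))
print(diagnose(train_error=0.012, validation_error=0.015))
```

The bias check runs first on purpose: more data does not help a network that cannot even fit its training set.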

Page 47: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

• Machine learning libraries: Theano, Keras, NumPy,…

• Big data tooling: Hadoop, MapReduce, Spark,…

• MLaaS by Amazon, Google, IBM, Microsoft,…

• And their cloud APIs for Speech, Vision, Natural Language, Translation

TAKE ADVANTAGE OF OPEN SOURCE AND CLOUD

Page 48: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

EXAMPLE CODE TO MODEL AND FIT A NN USING KERAS AND NUMPY

from keras import Input
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import set_random_seed
import numpy

# fix random seed for reproducibility (seeds Python, NumPy and the Keras backend)
seed = 7
set_random_seed(seed)

# load pima indians dataset
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")

# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]

# create model: 8 inputs, two hidden ReLU layers, one sigmoid output
model = Sequential()
model.add(Input(shape=(8,)))
model.add(Dense(12, kernel_initializer='random_uniform', activation='relu'))
model.add(Dense(8, kernel_initializer='random_uniform', activation='relu'))
model.add(Dense(1, kernel_initializer='random_uniform', activation='sigmoid'))

# compile model
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['accuracy'])

# fit the model
model.fit(X, Y, epochs=150, batch_size=10)

# evaluate the model (on the training data, so this is not a held-out score)
loss, accuracy = model.evaluate(X, Y)
print("accuracy: %.2f%%" % (accuracy * 100))

Page 49: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

1. Commercial data partners

• Big tech companies

• Data brokers in all industry domains

• High resolution satellite data

2. Public open data

• Government agencies

• Academic institutions

• International organizations

• NGOs

• Space agencies

ENRICH YOUR DATA ASSETS WITH OPEN DATA FOR UNIQUE INSIGHTS

Page 50: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

1. Large scale supervised machine learning enables adaptive, self learning products.

2. Large volumes of training data are a key competitive advantage: find or make your own data assets!

3. Finding the right use cases and answering the right data questions is critical, and requires a multidisciplinary effort.

4. Algorithms and computing are becoming commoditized. Leverage open source and cloud computing and focus on strategic differentiation based on unique data.

CONCLUSIONS

Page 51: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

VERHAERT CONNECT / CWI CASE: ILLEGAL PARKING PREDICTION

• Training data asset: several years of scan car data

• Application concept: a heat map showing illegal parking probabilities

• Data question: predict illegal parking probabilities for each city neighborhood

• Modeling approach: discrete choice regression model

• Cost function: TBD

• Algorithms: maximum likelihood estimators

• R&D plan: 3 months

• Product development plan: 3 months
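To illustrate what "discrete choice regression" fitted by "maximum likelihood" can look like in its simplest form, here is a sketch of a binary logit model estimated by gradient ascent on the log-likelihood. It is not the project's actual model: the function `fit_logit`, the single made-up feature and the toy data are all hypothetical, and the slide itself leaves the cost function as TBD.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(xs, ys, alpha=0.1, iterations=5000):
    """Maximum likelihood fit of P(y = 1 | x) = sigmoid(b0 + b1 * x) by gradient ascent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        # Observed outcome minus predicted probability for every example.
        residuals = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        # Gradient of the mean log-likelihood; step uphill to increase the likelihood.
        b0 += alpha * sum(residuals) / n
        b1 += alpha * sum(r * x for r, x in zip(residuals, xs)) / n
    return b0, b1


# Hypothetical toy data: x is a made-up neighborhood feature, y = 1 if a scan found a violation.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 1, 0, 1, 1, 0, 1]
b0, b1 = fit_logit(xs, ys)
print(round(sigmoid(b0 + b1 * 0), 2), round(sigmoid(b0 + b1 * 7), 2))
```

The fitted probabilities per neighborhood are the kind of values a heat map could display; with a likelihood-based fit, the negative log-likelihood is one natural candidate for the cost function.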

Page 52: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

VERHAERT CONNECT: A NATURAL PROPOSITION

SENSOR FUSION

TECHNOLOGY

INTEGRATION

ALGORITHMS

CONTEXT SENSITIVE

USER CENTRIC

BIG DATA

ADDED VALUE

MULTIDISCIPLINARY

Page 53: Data science, self learning algorithms (by Alexander Frimout & Max Nie)

Innovation Day is an initiative of Masters in Innovation, the umbrella brand of the Verhaert Group which aims to connect, train and accelerate professional innovators.

Kruibeke, Belgium: Hogenakkerhoekstraat 21, B-9150 Kruibeke, T +32 3 250 19 00, E [email protected], www.verhaert.com

Nivelles, Belgium: Av. Robert Schuman 102, B-1400 Nivelles, T +32 67 47 57 10, E [email protected], www.lambda-x.com

Noordwijk, Netherlands: Kapteynstraat 1, 2201 BB Noordwijk, T +31 71 760 05 50, E [email protected], connect.verhaert.com

Aveiro, Portugal: Av. Dr. Lourenço Peixinho 96D 4o, 3800-159 Aveiro, T +351 234 604 088, E [email protected], www.load-interactive.com

Gentbrugge, Belgium: Bruiloftstraat 55-57, B-9050 Gentbrugge, T +32 9 330 27 90, E [email protected], www.moebiusdesign.com

INDUSTRY, TECHXFER, MEDICAL, AEROSPACE, FMCG, CONNECT, ON SITE CONSULTANCY