data mining to real-time processing - tibco community · introduction to machine learning. ......

38
© Copyright 2000-2016 TIBCO Software Inc. Mike Alperin Kai Waehner August, 2016 Machine Learning in Manufacturing: Data Mining to Real-time Processing

Upload: vanhuong

Post on 29-May-2018

221 views

Category:

Documents


0 download

TRANSCRIPT

© Copyright 2000-2016 TIBCO Software Inc.

Mike Alperin

Kai Waehner

August, 2016

Machine Learning in Manufacturing:

Data Mining to Real-time Processing

© Copyright 2000-2016 TIBCO Software Inc.

• Introduction to Machine Learning

• Machine Learning in Manufacturing

• Methods and Architecture

• Data Mining - Demo

• Real-time Processing of Streaming Data - Demo

• Q&A

Agenda

© Copyright 2000-2016 TIBCO Software Inc.

Introduction to Machine Learning

Machine Learning

Machine learning is a method of data analysis that automates analytical

model building. Using algorithms that iteratively learn from data, machine

learning allows computers to find hidden insights without being explicitly

programmed where to look.

http://www.sas.com

Real World Examples of Machine Learning

Spam Detection Search Results +Product Recommendation

Picture Detection(Friends, Locations, Products)

Machine Learning is already present in your daily life…

Now, every enterprise is beginning to leverage it!

© Copyright 2000-2016 TIBCO Software Inc.

• Supervised – Solve known problems

• What factors are driving manufacturing defects?

• Decision Trees, Random Forest, Gradient Boosting Machine

• Unsupervised – Identify new patterns, Detect anomalies

• Are there new failure modes emerging?

• Clustering, Principle Components, Neural Networks, Support Vector

Machines

• Optimization – Support Decision-making

• What is the optimum scheduling of operators or equipment maintenance?

• Genetic Algorithm

Types of Machine Learning

© Copyright 2000-2016 TIBCO Software Inc.

Decision Tree – Titanic Survival Rate

family size

Wikipedia

© Copyright 2000-2016 TIBCO Software Inc.

Classical Statistics – Fit parameters to a well-defined model

Decision Tree – Product Pass / Fail by Process & Equipment

Bad Product

Good Product

Clearcoat Bake Temperature>= 132 C

Sanding Station1, 2, 4 3 Basecoat Thickness

Peeling Clearcoat

< 132 C

… … … …

Automobile Paint Process

Decision Tree – Training and Test Data Sets

© Copyright 2000-2016 TIBCO Software Inc.

Ensemble Tree Algorithms

• Random Forest, Gradient Boosting Machine (GBM)

• Method – Average many simple trees

• Sample the data: fit a simple tree

• Re-sample the data; up-weighting the observations that weren’t fitted well in

previous model

• Continue adding trees until fit is good

• Save all the trees and average them

• Better fit + prediction than single trees

12

Gradient Boosting Machine (GBM)

• Machine Learning• Machine learning algorithms + Big Data sets can produce models

that accurately fit complex data patterns.

• GBM: Better results than classic statistical methods• Ideal for data-mining / predictions for complex processes &

products• Performs well for variable reduction• Can fit complex nonlinear relationships & interactions• Scales to Big Data

• Easier to use• No need to specify the data model• Accommodates continuous and categorical predictors & responses• Handles missing data and outliers well

• Simple user interface • Model complexity can be hidden from user• Results presented with easy-to-understand visualizations• Interface easily customized for your use case and users

© Copyright 2000-2016 TIBCO Software Inc.

Advanced Analytics and Big Data Tools

Many more ….

© Copyright 2000-2016 TIBCO Software Inc.

Machine Learning in Manufacturing

Correlate Product or Equipment Results to Process & Supplier Data

• Supplier - Incoming Materials and Components• measured electrical, chemical, physical characteristics

• batch-id, lot_id

• Manufacturing Process• Physical, chemical or electrical measurements

• WIP / MES: track-in / track-out date, process equipment id, recipe, operator, …

• Process equipment sensor data

• Equipment Maintenance logs

• Defect Inspections

• Cost of labor, materials, machines and facilities

• Product Quality and Reliability Test• Measured product functional and performance characteristics

• Accelerated life test results

• Product Field Returns• Failure mode, unit / batch / lot ID

• Failure analysis root cause results

• Warranty / Repair claim, call center and cost – structured & unstructured

• Problem

• Product & Equipment problems difficult to accurately diagnose for complex manufacturing processes

• Big Data problem – millions of units, hundreds / thousands of predictors

• Response: Product, Process or Equipment Fail data

• Predictors: in-process equipment, process and product measurements or attributes

• Value

• Being used by customers to find previously undetected problems. Reduces time-to-market and increases profit.

• Method

• GBM analysis template to identify significant predictors, interactions and nonlinearities

• For large datasets, hybrid data access used to perform variable reduction step in-DB

• Simple interface – easy for business analyst to run and interpret results

GBM results for semiconductor yield as a function of in-process equipment & product measurements

Machine Learning to Predict Equipment or Product Fails

© Copyright 2000-2016 TIBCO Software Inc.

Method and Architecture

INSIGHT ACTION

Insight – Action Loop

© Copyright 2000-2016 TIBCO Software Inc.

Fast Data Reference Architecture

Operational Analytics

OperationsOperationsLive UI

SENSOR DATA

TRANSACTIONS

MESSAGE BUS

MACHINE DATA

SOCIAL DATA

Streaming AnalyticsAction

AggregateAggregate

RulesRules

Stream Processing

AnalyticsAnalytics

CorrelateCorrelate

Live Monitoring

Continuous query processing

Continuous query processing

AlertsAlerts

Manual action, escalation

Manual action, escalation

HISTORICAL ANALYSIS

Data SheetsData

Sheets

BIBI

Data Scientists

Data Scientists

CleansedData

History

Data Discovery

Enterprise Service BusEnterprise Service Bus

ERPERP MDMMDM DBDB WMSWMS

SOASOA

Data Storage

Intern

al Data

Integratio

n B

us

APIAPI

Complex Event Processing

Machine LearningMachine Learning

Big Data

© Copyright 2000-2016 TIBCO Software Inc.

Demo: Data Mining

© Copyright 2000-2016 TIBCO Software Inc.

• Model Set up – Parameter Selection

• Model Configuration

• Run Model

• Evaluate model: ROC Curve, AUC

• Visualize Model Results

• Variable Importance

• Interactions

Demo Outline

© Copyright 2000-2016 TIBCO Software Inc.

Demo Screenshot – Model Configuration and Evaluation

© Copyright 2000-2016 TIBCO Software Inc.

Demo Screenshot – Model Results

© Copyright 2000-2016 TIBCO Software Inc.

Real-time Processing

Of Streaming Data

Predictive Analytics for Manufacturing

Goal: Scrap parts as early as possible to reduce costs in a manufacturing process.

Question: When to scrap a part in Station 1 instead of sending it to Station 2?

Station 1 Station 2

Cost Before9€

7€ 13€Total Cost

29€(or more)

Scrap? Scrap?

Fast Data Architecture for Predictive Maintenance

Operational Analytics

OperationsOperationsLive UI

CSV Batch

JSON Real Time

XML Real Time

Streaming AnalyticsAction

AggregateAggregate

RulesRules

AnalyticsAnalytics

CorrelateCorrelate

Live Datamart

Continuous query processing

Continuous query processing

AlertsAlerts

Manual action, escalation

Manual action, escalation

HISTORICAL ANALYSIS Data Scientists

Data Scientists

FlumeHDFS

Spotfire

R / TERRR / TERRHDFS

Hadoop (Cloudera)

StreamBase

TIBCO Fast Data Platform

H2OH2O

Oracle RDBMS

Avro Parquet … PMMLPMML

Inte

rnal D

ata

TIBCO Spotfire with H2O Integration

Data Discovery / Data Mining (“Are parts that repeat a station more likely scrap parts?”)

TIBCO Spotfire with H2O Integration

Advanced Analytics (“Scrap parts as early as possible!”)

TIBCO Spotfire with H2O Integration

Advanced Analytics (“Scrap parts as early as possible!”)

TIBCO StreamBase + R / TERR

TIBCO StreamBase + H20

TIBCO StreamBase + PMML

TIBCO Live Datamart

Operational Intelligence (“Monitor the manufacturing process and change rules in real time!”)

Live Dartmart Desktop Client

TIBCO Live Datamart

Operational Intelligence (“Monitor the manufacturing process and change rules in real time!”)

Live Dartmart Web API

© Copyright 2000-2016 TIBCO Software Inc.

Demo:

Real Time Processing

(Predictive Scrapping of Parts

in an Assembly Line)

© Copyright 2000-2016 TIBCO Software Inc.

Learn & Do More: Machine Learning on the TIBCO Community

Wiki page

Component Exchange:• Data functions• Accelerators• Templates

https://community.tibco.com/wiki/machine-learning-tibco-spotfire-and-streambase

https://community.tibco.com/exchange

© Copyright 2000-2016 TIBCO Software Inc.

Learn & Do More: Accelerators on the TIBCO Community

https://community.tibco.com/wiki/accelerators https://community.tibco.com/exchange

Component Exchange Accelerators:• Apache Spark• Intelligent Equipment• Connected Vehicles

Wiki page

© Copyright 2000-2016 TIBCO Software Inc.

Q & A