Data Analytics for Semiconductor Manufacturing

© 2016 The MathWorks, Inc.

Post on 08-Sep-2019


Page 1: Data Analytics for Semiconductor Manufacturing

Page 2: What do we mean by “Data Analytics”?

Analytics uses data, rather than gut feel or intuition, to drive decision making. It derives insights from data; those insights drive business actions, which in turn produce business outcomes.

Analytics maturity (level of sophistication versus competitive advantage), from lowest to highest:

• Descriptive: What happened?
• Diagnostic: Why did it happen?
• Predictive: What will happen?
• Prescriptive: What should happen?

Copyright of Shell Global Solutions

Page 3: Data Analytics 3.0

• Descriptive analysis based on structured data sets
• Decision making from historical data
• Big data (structured + unstructured)
• IoT (Internet of Things)
• 3Vs (Volume, Variety, Velocity) + Value
• Machine learning and deep learning
• Optimized workflow and decision making with prescriptive analytics
• In-memory processing with streaming data
• Smart factory
• Enterprise system integration

Matrix of percentages between the rating grades AAA to D (reconstructed from the flattened slide text and oriented so that each row sums to about 100%):

        AAA      AA       A        BBB      BB       B        CCC      D
AAA     93.68%   5.55%    0.59%    0.18%    0.00%    0.00%    0.00%    0.00%
AA      2.44%    92.60%   4.03%    0.73%    0.15%    0.00%    0.00%    0.06%
A       0.14%    4.18%    91.02%   3.90%    0.60%    0.08%    0.00%    0.08%
BBB     0.03%    0.23%    7.49%    87.86%   3.78%    0.39%    0.06%    0.16%
BB      0.03%    0.12%    0.73%    8.27%    86.74%   3.28%    0.18%    0.64%
B       0.00%    0.00%    0.11%    0.82%    9.64%    85.37%   2.41%    1.64%
CCC     0.00%    0.00%    0.00%    0.37%    1.84%    6.24%    81.88%   9.67%
D       0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    100.00%

Page 4: Data Analytics for Semiconductor Manufacturing Process

Data is being logged everywhere.

Data comes from many disparate sources.

Get insight from aggregated data and make actionable decisions for:

– Yield improvement
– Error detection
– Manufacturing cost reduction
– Process optimization
– Prognostics
– Equipment health and maintenance

Process steps that feed actionable insight: Wafer, Oxidation, Photolithography, Etching, Thin Film Deposition, Metallization, EDS

Page 5: Workflow of Analytics for Integrated Wafer Inspection System

Data sources: process and feature data; business and transactional data

Data aggregator / reducer and automated / enterprise system

1. Big Data Access: Big/Diverse Data Sources
2. Analytics: Machine Learning
3. Integrate to Enterprise System: Automation, Business

Workflow stages: Data Exploration, Analytics Modeling & Evaluation, Integration

Page 6: Big Data

1. Big Data Access: Big/Diverse Data Sources

“Any collection of data sets so large and complex that it becomes difficult to process using … traditional data processing applications.” (Wikipedia)

“Any collection of data sets so large that it becomes difficult to process using traditional MATLAB functions, which assume all of the data is in memory.” (MATLAB)

The four Vs of big data: Volume, Variety, Velocity, Value

• Various data sources from a complex process: image, sensor, numeric, text, etc.
• Rapid data exploration across an enormous amount of manufacturing equipment
• Scalable data processing with scalable hardware
• Identifying the value from uncertainties: statistics, clustering, classification, regression
• Ease of deployment: support for common development environments (C/C++, .NET, Java, Excel, Hadoop)
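The MATLAB definition quoted on this slide turns on data that does not fit in memory. As a language-neutral illustration (plain Python, not MATLAB code and not the datastore API), the sketch below reads a table a few rows at a time and keeps only a running sum and count. The column names and the sample values are made up for the example.

```python
import csv
import io
from itertools import islice


def read_chunks(lines, chunk_size):
    """Yield lists of at most chunk_size parsed rows from an iterable of CSV lines."""
    reader = csv.DictReader(lines)
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


def streaming_mean(lines, column, chunk_size=2):
    """Mean of one numeric column without holding all rows in memory at once."""
    total = 0.0
    count = 0
    for chunk in read_chunks(lines, chunk_size):      # load one chunk
        values = [float(row[column]) for row in chunk]
        total += sum(values)                           # analyze it
        count += len(values)
        # the chunk is not referenced again, so it can be discarded
    return total / count if count else float("nan")


# Hypothetical sensor log; an open file object could be passed the same way.
sample = io.StringIO(
    "wafer_id,thickness_nm\nW1,100.0\nW2,102.0\nW3,98.0\nW4,101.0\nW5,99.0\n"
)
print(streaming_mean(sample, "thickness_nm"))
```

Only the current chunk and two accumulators are alive at any time, so memory use depends on the chunk size rather than on the size of the log.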

Page 7: “Reading Big Data in MATLAB” and “Techniques for Big Data in MATLAB”

1. Big Data Access: Big/Diverse Data Sources

Techniques shown, arranged by problem complexity (embarrassingly parallel to non-partitionable) and by whether the data is processed out-of-memory or in-memory:

• Load, analyze, discard: parfor, datastore
• MapReduce
• Distributed memory: SPMD and distributed arrays

Other label on the slide: Researchers
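MapReduce is one of the techniques listed on this slide. The following is a minimal single-process sketch of the idea in plain Python, not MATLAB's mapreduce function: each chunk is mapped to partial (sum, count) pairs per key, and the reduce step merges them into per-key means. The tool names and values are invented for the example.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit one (key, (sum, count)) pair per tool for a chunk of (tool, value) records."""
    partial = defaultdict(lambda: [0.0, 0])
    for tool, value in chunk:
        partial[tool][0] += value
        partial[tool][1] += 1
    return [(tool, tuple(acc)) for tool, acc in partial.items()]


def reduce_phase(mapped):
    """Reduce: merge the partial (sum, count) pairs per key and return the per-key means."""
    merged = defaultdict(lambda: [0.0, 0])
    for tool, (partial_sum, partial_count) in mapped:
        merged[tool][0] += partial_sum
        merged[tool][1] += partial_count
    return {tool: s / n for tool, (s, n) in merged.items()}


def mapreduce_mean(chunks):
    """Run the map step chunk by chunk, then a single reduce step over all intermediate pairs."""
    mapped = []
    for chunk in chunks:
        mapped.extend(map_phase(chunk))
    return reduce_phase(mapped)


# Hypothetical measurements, already split into two chunks.
chunks = [
    [("etcher_A", 1.0), ("etcher_B", 4.0)],
    [("etcher_A", 3.0), ("etcher_B", 6.0), ("etcher_A", 2.0)],
]
print(mapreduce_mean(chunks))
```

Because each map call sees only its own chunk, the chunks could in principle be processed on different workers; this sketch runs them one after another for simplicity.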

Page 8: Why use MATLAB for Machine Learning?

2. Analytics: Machine Learning

• Complete environment for machine learning, from prototyping to production
• Verified and trusted machine learning algorithms
• Classification Learner app that lets you perform common machine learning tasks interactively
• Tight integration with signal processing, image processing, computer vision, and other workflows
• Open and flexible architecture that enables a customized workflow

[Figures: a two-class scatter plot with the support vectors marked; a scatter plot of x1 versus x2 with the points colored by group (group1 to group8)]

Page 9: ASML Develops Virtual Metrology Technology for Semiconductor Manufacturing with Machine Learning

2. Analytics: Machine Learning

Challenge: Apply machine learning techniques to improve overlay metrology in semiconductor manufacturing.

Solution: Use MATLAB to create and train a neural network that predicts overlay metrology from alignment metrology.

Results:

• Industry leadership established
• Potential manufacturing improvements identified
• Maintenance overhead minimized

“As a process engineer I had no experience with neural networks or machine learning. I worked through the MATLAB examples to find the best machine learning functions for generating virtual metrology. I couldn’t have done this in C or Python; it would’ve taken too long to find, validate, and integrate the right packages.”

Emil Schmitt-Weaver, ASML

[Figure: Cutaway of a TWINSCAN and Track as wafers receive alignment and overlay metrology]
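The idea behind virtual metrology, predicting a measurement that is expensive to take from one that is already available, can be shown with a much simpler model than the neural network described in the user story. The sketch below swaps in ordinary least squares on a single input purely for illustration; it is not ASML's method, and the alignment and overlay numbers are synthetic.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_overlay(model, alignment_values):
    """Apply the fitted line to new alignment readings."""
    slope, intercept = model
    return [slope * a + intercept for a in alignment_values]


# Synthetic training pairs: alignment readings and the overlay measured afterwards.
alignment = [0.0, 1.0, 2.0, 3.0, 4.0]
overlay = [1.0, 3.0, 5.0, 7.0, 9.0]  # constructed as 2 * alignment + 1

model = fit_line(alignment, overlay)
virtual_overlay = predict_overlay(model, [5.0, 6.0])  # no physical overlay measurement needed
print(model, virtual_overlay)
```

A real virtual metrology model would take many alignment features as input and would be validated against measured overlay before being trusted.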

Page 10: Machine Learning Workflow

2. Analytics: Machine Learning

Model types:

• Unsupervised learning: k-means, autoencoder, PCA, GMM
• Supervised learning: classification, regression

Step 1: Training (iterate): Load data → Preprocessing → Training → Model

Step 2: Evaluation: New data → Model → Prediction
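The two steps of this workflow can be sketched end to end in a few lines. The example below is plain Python rather than MATLAB, uses a nearest-centroid classifier as a deliberately simple stand-in for the supervised model, and runs on synthetic two-feature data with made-up "pass" and "fail" labels.

```python
import random
import statistics


def fit_scaler(rows):
    """Preprocessing: learn per-feature mean and standard deviation from the training rows."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard against zero spread
    return means, stds


def apply_scaler(scaler, rows):
    means, stds = scaler
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def train_centroids(rows, labels):
    """Training: store one mean vector (centroid) per class."""
    grouped = {}
    for row, label in zip(rows, labels):
        grouped.setdefault(label, []).append(row)
    return {label: [statistics.fmean(c) for c in zip(*members)]
            for label, members in grouped.items()}


def predict_labels(model, rows):
    """Prediction: assign each row to the class with the closest centroid."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(model, key=lambda label: sq_dist(row, model[label])) for row in rows]


# Load data: synthetic, well-separated classes (an assumption made for the sketch).
rng = random.Random(0)
rows = ([[rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)] for _ in range(50)]
        + [[rng.gauss(3.0, 0.5), rng.gauss(3.0, 0.5)] for _ in range(50)])
labels = ["pass"] * 50 + ["fail"] * 50
order = list(range(len(rows)))
rng.shuffle(order)
train_idx, test_idx = order[:70], order[70:]

# Step 1: training on 70 rows.
scaler = fit_scaler([rows[i] for i in train_idx])
model = train_centroids(apply_scaler(scaler, [rows[i] for i in train_idx]),
                        [labels[i] for i in train_idx])

# Step 2: evaluation on the 30 rows the model has not seen.
predicted = predict_labels(model, apply_scaler(scaler, [rows[i] for i in test_idx]))
actual = [labels[i] for i in test_idx]
accuracy = sum(p == a for p, a in zip(predicted, actual)) / len(actual)
print(f"held-out accuracy: {accuracy:.2f}")
```

The scaler is fitted on the training rows only and then reused on the new data, so the evaluation step sees the same preprocessing that the model was trained with.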

Page 11: Overview of Machine Learning in MATLAB

2. Analytics: Machine Learning

Unsupervised learning

• Clustering: k-means, fuzzy k-means, hierarchical, neural network, Gaussian mixture, hidden Markov model

Supervised learning

• Regression: decision trees, ensemble methods, neural network, non-linear regression (GLM, logistic), linear regression
• Classification: naive Bayes, nearest neighbors
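As a concrete instance of the clustering branch of this overview, here is a minimal k-means (Lloyd's algorithm) sketch in plain Python. It is not MATLAB's kmeans function; the initialization from the first k points is a simplifying assumption, and the sample points are invented.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centroids[j])))


def kmeans(points, k, iterations=10):
    """Alternate between assigning points to centroids and recomputing the centroids."""
    centroids = [list(p) for p in points[:k]]  # naive initialization: the first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]        # assignment step
        for j in range(k):                                       # update step
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(axis) / len(members) for axis in zip(*members)]
    return centroids, labels


# Hypothetical 2-D measurements forming two obvious groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.2), (0.8, 1.2), (9.2, 8.8)]
centroids, labels = kmeans(points, k=2)
print(labels)
print(centroids)
```

No labels are supplied to the algorithm, which is what makes this unsupervised: the grouping comes only from the distances between points.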

Page 12: Integrate Analytics Application to Enterprise System

3. Integrate to Enterprise System: Automation, Business

Typical workflow

• Separated storage: local drive, network drive, database
• Process control system / reporting system
• Re-coding
• Increase: time consumption, operation overhead, errors, pitfalls…
• Decrease: performance, yield rate, efficiency…

Teams involved

• Business units (two shown): head of BU, domain experts, (data scientists)
• Business units with an analytics group: head of BU, head of analytics, domain experts, data scientists
• IT: IT manager, enterprise architect, programmers

Page 13: Deployment for Analytics Integration

3. Integrate to Enterprise System: Automation, Business

Products shown: MATLAB, MATLAB Compiler, MATLAB Compiler SDK, MATLAB Production Server

Deployment targets shown: standalone application, Excel add-in, Hadoop, C/C++, Java, .NET

• Integrated into production IT environments without recoding or creating custom infrastructure
• Packaged as components compatible with a wide range of systems (Java, .NET, Excel, Python, and C/C++)
• Shared as standalone applications or run from web servers, database servers, desktops, enterprise applications, etc.

Page 14: Solutions for Web/Enterprise Application Servers: MATLAB Production Server

3. Integrate to Enterprise System: Automation, Business

Most efficient path for creating enterprise applications

Deploy MATLAB programs into production

• Manage multiple MATLAB programs and versions
• Update programs without server restarts
• Reliably service large numbers of concurrent requests

Integrate with web, database, and application servers
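Two of the capabilities listed on this slide, managing several program versions and updating a program without a restart, can be pictured with a toy in-process registry. The sketch below is plain Python and purely hypothetical: the class, its methods, and the yield functions are invented for illustration and are not the MATLAB Production Server API.

```python
class AnalyticsRegistry:
    """Toy stand-in for a server that hosts versioned analytics functions."""

    def __init__(self):
        self._programs = {}  # program name -> {version number: callable}

    def deploy(self, name, version, func):
        """Add or replace one version while the registry object keeps running."""
        self._programs.setdefault(name, {})[version] = func

    def call(self, name, *args, version=None):
        """Run the requested version, or the highest deployed version by default."""
        versions = self._programs[name]
        chosen = max(versions) if version is None else version
        return versions[chosen](*args)


def yield_rate_v1(good, total):
    """First release: yield as a fraction."""
    return good / total


def yield_rate_v2(good, total):
    """Second release: yield as a percentage with one decimal."""
    return round(100.0 * good / total, 1)


registry = AnalyticsRegistry()
registry.deploy("yield_rate", 1, yield_rate_v1)
print(registry.call("yield_rate", 45, 50))             # served by version 1

registry.deploy("yield_rate", 2, yield_rate_v2)        # update without restarting anything
print(registry.call("yield_rate", 45, 50))             # now served by version 2
print(registry.call("yield_rate", 45, 50, version=1))  # older clients can still pin version 1
```

A production server additionally has to handle concurrent requests, isolation between programs, and network access, none of which this sketch attempts.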

Page 15: Integrate Analytics Application to Enterprise System

3. Integrate to Enterprise System: Automation, Business

Suggested workflow

• Data from the manufacturing process: wafer, oxidation, photolithography, etching, overlay metrology, scanner / SEM images, fab / probe / assembly / test yields, economic market data, etc.
• Out-of-memory data access for massive data (Hadoop, DB)
• Prototyping the analytics solution
• Test deployment in the R&D part
• Analytics integration without re-coding: C/C++, Java, .NET, Excel add-in, Hadoop, standalone application
• MATLAB Production Server (request broker) connected to the enterprise system / process automation
• Results / reporting system (C# and Java)

Outcomes listed: process optimization, yield improvement, rapid processing, increased efficiency, IP management

Page 16: Summary: MATLAB Data Analytics in the Semiconductor Industry

1. Big Data Access: Big/Diverse Data Sources
2. Analytics: Machine Learning
3. Integrate to Enterprise System: Automation, Business

Data Exploration / Visualization

• Products: MATLAB, Database Toolbox
• Big data access: datastore, MapReduce, HDFS support, DB and NoSQL support

Scale Up Computation

• Products: Parallel Computing Toolbox, MATLAB Distributed Computing Server
• Multicore desktop, cluster / cloud computing, GPU computing, task parallel / data parallel

Analytics / A.I.

• Products: Statistics and Machine Learning, (Global) Optimization, Neural Network Toolbox
• Machine learning / deep learning, convolutional neural network, artificial neural network, linear / non-linear regression …

System Integration

• Products: MATLAB Compiler, MATLAB Compiler SDK, MATLAB Production Server
• Standalone application, multi-language target library, Hadoop integration, web service
• Targets: C/C++, Java, .NET, Excel add-in, Hadoop, Spark, standalone application

[Figure: two-class scatter plot with the support vectors marked]