

Motivation

Conclusion

Two original EMSE-CMP contributions to variable ranking and selection were implemented, together with the LS-SVM regression method, in a Matlab prototype for the design of Virtual Metrology (VM) models. The prototype was validated using real data from two case studies:

• Austriamicrosystems: prediction of PECVD (Plasma Enhanced Chemical Vapor Deposition) oxide thickness for Inter Metal Dielectric (IMD) layers.
• STMicroelectronics Rousset: prediction of overlay in the photolithography process.

Variable ranking and selection

The main scientific contributions of EMSE-CMP are filter and wrapper methods for variable ranking and selection. Some manufacturing processes have a very large number of input variables, which results in complex predictive models with poor generalization capabilities: confidence in a model is higher when it uses a small number of adjustable parameters. In addition, including irrelevant variables introduces noise, which leads to overfitting and, in turn, to poor generalization. The goal of variable ranking and selection is to determine the smallest subset of variables that carries as much information as possible about the dependent variable, while discarding redundant and irrelevant (i.e., poorly informative) variables. EMSE-CMP makes two main contributions to variable ranking and selection:

1) Contribution to the filter method: Mutual Information-based Variable Selection using a Probe Feature.

2) Contribution to the wrapper method: a wrapper with a meta-heuristic approach, namely a Tabu search algorithm (TabuWrap).
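As an illustration of the first contribution, the following is a minimal Python sketch of a mutual-information filter with a probe feature. It is not the EMSE-CMP implementation: the histogram estimator, the Gaussian random probes, the 95% quantile threshold, the synthetic data, and all function and variable names are assumptions made for this example. The idea shown is that a candidate variable is kept only if its estimated mutual information with the target exceeds what pure-noise probes typically reach.

```python
import math
import random


def mutual_information(x, y, bins=8):
    """Plug-in histogram estimate of I(X;Y) in nats, using equal-width bins."""
    def to_bins(v):
        lo, hi = min(v), max(v)
        width = (hi - lo) / bins or 1.0  # guard against constant columns
        return [min(int((a - lo) / width), bins - 1) for a in v]

    bx, by = to_bins(x), to_bins(y)
    n = len(x)
    joint, px, py = {}, {}, {}
    for i, j in zip(bx, by):
        joint[(i, j)] = joint.get((i, j), 0) + 1
        px[i] = px.get(i, 0) + 1
        py[j] = py.get(j, 0) + 1
    # sum over occupied cells of p(i,j) * log(p(i,j) / (p(i) * p(j)))
    return sum((c / n) * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())


def select_with_probe(columns, y, n_probes=200, quantile=0.95, seed=0):
    """Rank variables by MI with y and keep those that beat random probes."""
    rng = random.Random(seed)
    probe_scores = []
    for _ in range(n_probes):
        probe = [rng.gauss(0.0, 1.0) for _ in y]  # noise unrelated to y
        probe_scores.append(mutual_information(probe, y))
    probe_scores.sort()
    threshold = probe_scores[int(quantile * (n_probes - 1))]

    scores = {name: mutual_information(col, y) for name, col in columns.items()}
    ranking = sorted(scores, key=scores.get, reverse=True)
    selected = [name for name in ranking if scores[name] > threshold]
    return selected, scores, threshold


# Synthetic data (hypothetical): y depends on x1 and x2, not on "noise".
rng = random.Random(1)
n = 400
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [2.0 * a + b + rng.gauss(0.0, 0.1) for a, b in zip(x1, x2)]

selected, scores, threshold = select_with_probe(
    {"x1": x1, "x2": x2, "noise": noise}, y)
print("probe threshold:", round(threshold, 3))
print("selected variables (ranked):", selected)
```

The probe plays the role of a calibrated "known irrelevant" variable: because the plug-in estimator is biased upward on finite samples, comparing against probes gives a data-driven cut-off instead of a fixed one.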

Design of Virtual Metrology Models by Machine Learning: A Matlab Prototype

A. Ferreira (1), G. Pages (1), Y. Oussar (2)

(1) Ecole Nat. Sup. des Mines de Saint-Etienne, (2) ESPCI Paristech

We propose a methodology for building VM models using machine learning techniques. After standard data pre-processing, a variable ranking and selection procedure is applied to determine the most relevant variables for predicting the metrology variables. Different techniques from statistical learning theory are implemented, such as PLS regression and Least Squares Support Vector Machine (LS-SVM) regression. For each technique, the model parameters are estimated using a training algorithm. A k-fold (or leave-one-out) cross-validation procedure is used to select the model that exhibits the best generalization capabilities. Its performance is then estimated on a test dataset. The EMSE-CMP methodology was implemented in a Matlab prototype dedicated to the design of Virtual Metrology models.
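The model-selection loop described above can be sketched in a few lines of Python. This is an illustrative sketch, not the Matlab prototype: it uses the standard LS-SVM regression formulation (one linear system in the bias b and the dual weights alpha), and the RBF kernel, the hyperparameter grid, the synthetic sine data, and all names are assumptions made for the example.

```python
import math
import random


def rbf(a, b, sigma):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / (2 * sigma ** 2))


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def lssvm_fit(X, y, gamma, sigma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        A.append([1.0] + [rbf(X[i], X[j], sigma) + (1.0 / gamma if i == j else 0.0)
                          for j in range(n)])
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]  # bias b, dual weights alpha


def lssvm_predict(model, X_train, x, sigma):
    b, alpha = model
    return b + sum(a * rbf(x, xi, sigma) for a, xi in zip(alpha, X_train))


def kfold_mse(X, y, gamma, sigma, k=5):
    """Mean squared validation error over k interleaved folds."""
    n = len(X)
    err = 0.0
    for fold in range(k):
        val = set(range(fold, n, k))
        Xt = [X[i] for i in range(n) if i not in val]
        yt = [y[i] for i in range(n) if i not in val]
        model = lssvm_fit(Xt, yt, gamma, sigma)
        err += sum((lssvm_predict(model, Xt, X[i], sigma) - y[i]) ** 2 for i in val)
    return err / n


# Synthetic data (hypothetical): noisy sine, 45 training and 15 test points.
rng = random.Random(0)
X = [[rng.uniform(0.0, 6.0)] for _ in range(60)]
y = [math.sin(x[0]) + rng.gauss(0.0, 0.1) for x in X]
X_train, y_train, X_test, y_test = X[:45], y[:45], X[45:], y[45:]

# Model selection: keep the (gamma, sigma) pair with the lowest CV error.
grid = [(g, s) for g in (1.0, 10.0, 100.0) for s in (0.3, 1.0, 3.0)]
best_cv, best_gamma, best_sigma = min(
    (kfold_mse(X_train, y_train, g, s), g, s) for g, s in grid)

# Final estimate of performance on data never used for training or selection.
model = lssvm_fit(X_train, y_train, best_gamma, best_sigma)
test_mse = sum((lssvm_predict(model, X_train, x, best_sigma) - t) ** 2
               for x, t in zip(X_test, y_test)) / len(X_test)
print("selected gamma, sigma:", best_gamma, best_sigma)
print("CV MSE:", round(best_cv, 4), "test MSE:", round(test_mse, 4))
```

Keeping the test set out of the grid search is what makes the final figure an honest estimate: the cross-validation error is used only to choose among candidate models.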

Figure: Basic diagram for the design of VM models based on machine learning techniques

Approach | Pros | Cons
Filter | Model free; low computational cost | Fairly irregular; may degrade performance
Wrapper | Consistent; high accuracy; improves performance | Computational burden

Table: Filter and wrapper approaches
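To make the wrapper side of this comparison concrete, here is a minimal Python sketch of a wrapper driven by a tabu search. It is not TabuWrap itself, whose details are not given here: the single-variable flip neighbourhood, the tabu tenure, the aspiration rule, and the synthetic data are assumptions, and a leave-one-out k-nearest-neighbour regressor stands in for LS-SVM to keep the example short. It also shows where the computational burden comes from, since every candidate subset requires a full model evaluation.

```python
import random


def loo_knn_mse(X, y, subset, k=3):
    """Leave-one-out MSE of a k-nearest-neighbour regressor on the chosen columns."""
    subset = list(subset)
    if not subset:
        return float("inf")
    n = len(X)
    total = 0.0
    for i in range(n):
        dists = sorted(
            (sum((X[i][c] - X[j][c]) ** 2 for c in subset), j)
            for j in range(n) if j != i)
        pred = sum(y[j] for _, j in dists[:k]) / k
        total += (pred - y[i]) ** 2
    return total / n


def tabu_wrapper(X, y, n_iter=20, tenure=3, seed=0):
    """Search variable subsets by flipping one variable per iteration."""
    rng = random.Random(seed)
    d = len(X[0])
    current = frozenset(c for c in range(d) if rng.random() < 0.5)
    best, best_score = current, loo_knn_mse(X, y, current)
    tabu = {}  # variable index -> last iteration at which flipping it is forbidden
    for it in range(n_iter):
        candidates = []
        for c in range(d):
            neighbour = current ^ {c}  # add c if absent, remove it if present
            score = loo_knn_mse(X, y, neighbour)
            is_tabu = tabu.get(c, -1) >= it
            # Aspiration: a tabu move is allowed if it beats the best score so far.
            if not is_tabu or score < best_score:
                candidates.append((score, c, neighbour))
        if not candidates:
            continue
        # Move to the best admissible neighbour, even if it is worse than current.
        score, c, current = min(candidates, key=lambda t: (t[0], t[1]))
        tabu[c] = it + tenure
        if score < best_score:
            best, best_score = current, score
    return sorted(best), best_score


# Synthetic data (hypothetical): only variables 0 and 3 drive the target.
rng = random.Random(42)
X = [[rng.uniform(0.0, 1.0) for _ in range(6)] for _ in range(60)]
y = [3.0 * row[0] - 2.0 * row[3] + rng.gauss(0.0, 0.05) for row in X]

selected, best_score = tabu_wrapper(X, y)
print("selected variables:", selected, "LOO MSE:", round(best_score, 4))
```

The tabu list prevents the search from immediately undoing a recent flip, which lets it leave a local optimum instead of cycling around it; the best subset seen so far is stored separately so that such exploratory moves cost nothing.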

Matlab Prototype

Case study: STMicroelectronics Rousset site

Prediction of overlay in the photolithography process

43 of the 169 variables were selected.