Transcript
Page 1: Data mining and machine learning algorithms for soft ...konyvtar.uni-pannon.hu/doktori/2016/Kulcsar_Tibor_theses_en.pdf · Data mining and machine learning algorithms for soft sensor

Theses of the doctoral (PhD) dissertation

Data mining and machine learning algorithms for soft sensor development

Tibor Kulcsár

University of Pannonia

Doctoral School in Chemical Engineering and Material Sciences

Supervisor: János Abonyi, DSc

Department of Process Engineering

Veszprém, 2016

1. Introduction and the aim of the work

Industry, and the process industry in particular, relies more and more on solutions from information theory and artificial intelligence that help to improve the technology and reduce the cost of instrumentation, automation and maintenance. Software sensors can extend or even replace the classical instrumentation of a technology. I developed techniques and tools to support the identification and maintenance of the data-driven models of soft sensors used for product quality and energy usage estimation.

The dissertation describes three different aspects of soft-sensor development. The first two chapters focus on parametric and non-parametric modelling, while the third chapter deals with the selection and preprocessing of the data.

Regarding non-parametric models, I proposed a genetic programming based algorithm to generate dimension reduction mappings. I showed the applicability of these mappings in spectroscopic modelling and in solving the classical Wine classification benchmark problem. Finally, I presented tools for selecting the input variables and the data segments used to build these models. I demonstrated the capability of the methods in spectroscopic modelling and energy monitoring of chemical processes.

2. Experimental tools and technologies

We developed the dimension reduction mapping algorithms to extract useful information from data originating from the Diesel Blending Unit and the product development laboratory of MOL Duna Refinery. The datasets contained spectra recorded by ABB and Bruker spectrometers. For the processing of the spectra, I used the software of ABB and Bruker. We used MATLAB to implement our algorithms. For the development of the algorithms related to parametric modelling, the time series of process variables were extracted from the OSIsoft PI central data collection system of MOL Duna Refinery.

3. Theses

1. I have developed a genetic programming based solution to visualise high dimensional datasets. The method can explore the operating regimes of online NIR analysers and can support the identification of classifier models.

(Related publications: [3, 5, 15, 18, 19, 14, 13])

(a) I developed a multi-chromosome representation based genetic algorithm to find explicit multi-dimensional projections of high-dimensional input spaces into lower dimensions. I applied the method to visualise spectral databases of soft sensors by preserving the neighbourhood and distance relations of NIR spectra. [3, 18, 19]

(b) I modified the cost function of the genetic programming to support the visualisation of classification problems. The results confirm that the performance of traditional classifiers improves when they are applied to goal-oriented projected data. Additionally, I have defined a new classifier that uses convex polygons and also generates an informative view for the user. I have shown that the algorithm can separate the operational regimes of a technology and hence helps to define local models of soft sensors. [3, 5, 14, 13]
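The idea in Thesis 1 of scoring an explicit mapping by how well it preserves distance relations can be illustrated with a small sketch. The cost function actually used in the dissertation is not given in this booklet; the sketch assumes Sammon stress, a common distance-preservation measure, and the names sammon_stress, good_mapping and poor_mapping are hypothetical. A genetic programming search would evaluate such a function as the fitness of each candidate mapping expression and try to minimise it.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sammon_stress(points, mapping):
    """Sammon stress of an explicit mapping (assumed cost, for illustration).

    Returns 0.0 when every pairwise distance is preserved exactly and
    grows as the projection distorts the original distances.
    """
    projected = [mapping(p) for p in points]
    total = 0.0
    error = 0.0
    for i, j in combinations(range(len(points)), 2):
        d_high = euclidean(points[i], points[j])
        d_low = euclidean(projected[i], projected[j])
        if d_high == 0.0:
            continue  # identical points carry no distance information
        total += d_high
        error += (d_high - d_low) ** 2 / d_high
    return error / total if total else 0.0


# Toy data: four 3-D points that all lie on the plane z = 0.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (3.0, 1.0, 0.0)]


def good_mapping(p):
    """Candidate mapping that keeps both informative coordinates."""
    return (p[0], p[1])


def poor_mapping(p):
    """Candidate mapping that discards the informative y coordinate."""
    return (p[0], p[2])


stress_good = sammon_stress(points, good_mapping)
stress_poor = sammon_stress(points, poor_mapping)
print(stress_good, stress_poor)
```

In this toy case the first candidate preserves all distances, so a search that minimises the stress would prefer it over the second.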

2. I developed parametric models for spectrometric applications and target calculation in energy monitoring systems. I worked out methods that accelerate the modelling process and generate informative visualisations to show the hidden structure and the validity range of the models.

(Related publications: [7, 1, 4, 6, 16, 8, 5, 12, 9])

(a) I developed a validation process that can qualify the modelling performance and determine the validity range of spectrometric models. I presented models that can visualise and explore the hidden structures in the training dataset. I compared data from a flow-through cell and a fibre optic spectrometer and proved that the more cost-efficient fibre optic system performs similarly to the flow-through cell system. [7, 6, 12, 9]

(b) I proved that Self-Organizing Maps (SOM) can separate the operating modes in energy monitoring (EM) systems, and I worked out a SOM based feature ranking and selection tool. [1, 4, 6]

(c) I implemented a Random Forest (RF) regression based feature selection algorithm as an extension of the developed framework. I used the RF to select relevant input variables for EM targeting models. [4]

3. I developed goal-oriented multivariate time series analysis methods to support the identification of parametric and non-parametric models used in NIR analyser based soft sensors and energy monitoring systems.

(Related publications: [1, 4, 2, 12, 9, 14])

(a) I demonstrated that Principal Component Analysis based time series segmentation can be used to find consistent operating periods of production and events affecting the dynamical behaviour of the process. [1, 6]

(b) I developed a novel regression-based time series algorithm to detect homogeneous periods of operation based on the prediction accuracy of energy targeting models. The method can identify events when energy efficiency differs significantly. [1, 2]

(c) I demonstrated that Self-Organizing Maps can be used not only to isolate operating modes and to define local models for the individual operating regimes, but also for feature selection. The results illustrate that all of these functionalities support the building of compact models used in energy monitoring. [2]
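Thesis 3(b) describes detecting homogeneous periods of operation from the prediction accuracy of energy targeting models. The following is a minimal sketch of that idea, not the algorithm of the dissertation: it assumes a one-input linear targeting model (energy as a function of load), a single change point, and the sum of squared prediction errors as the accuracy measure. The names fit_line and best_split and the toy data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit y = a*x + b; returns (a, b, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx > 0 else 0.0
    b = mean_y - a * mean_x
    sse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def best_split(xs, ys, min_len=3):
    """Find the index k that minimises the summed squared prediction error
    of two separate linear models fitted on samples [0, k) and [k, n)."""
    best_k, best_cost = None, float("inf")
    for k in range(min_len, len(xs) - min_len + 1):
        cost = fit_line(xs[:k], ys[:k])[2] + fit_line(xs[k:], ys[k:])[2]
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost


# Toy time series (assumed data): load varies from 1 to 10 in two periods;
# the energy demand follows a different linear relation in each period.
load = [float(v) for v in range(1, 11)] * 2
energy = [2.0 * v + 1.0 for v in load[:10]] + [3.0 * v + 5.0 for v in load[10:]]

single_sse = fit_line(load, energy)[2]          # one model for the whole series
split_index, split_cost = best_split(load, energy)
print(split_index, split_cost, single_sse)
```

A single model fits the whole series poorly, while two models separated at the detected index fit their own periods well; applying the split recursively would extend the sketch to several segments.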

4. Utilization of results

A part of the results presented in the dissertation has already been utilised. A new practice has been introduced to maintain and upgrade the model running in the ABB spectrometer of the Diesel Blending Unit in MOL Duna Refinery. This method includes the generation of new explicit mapping equations (aggregates). A fibre optic spectrometer was installed in one of the experimental reactors of the product development department, and it is being used to trace the experiments. In this system, partial least squares models provide the estimates of the material properties of the product. The feature selection methods are being utilised to build new targeting models into the energy monitoring systems.

The dissertation presented the application of the genetic algorithm to generate dimension reduction mappings of spectral datasets. However, besides chemical applications, these tools could be used effectively in any other data mining problem.

The presented time series evaluation methods have already been utilised in telecommunication, more specifically in the platforms supplied by I-New Unified Mobile Solutions. These practices can detect several fault and incident patterns before they cause a complete service outage. The time series analysis has been included in the monitoring systems of the Mobile Virtual Network Operators (MVNOs) Virgin Mobile Colombia, Virgin Mobile Chile, Compass and other providers.

5. Publications related to theses

Articles in international journals

[1] Janos Abonyi, Tibor Kulcsar, Miklos Balaton, and Laszlo Nagy. Historical process data based energy monitoring - model based time-series segmentation to determine target values. Chemical Engineering Transactions, 35:931–936, 2013.

[2] Janos Abonyi, Tibor Kulcsar, Miklos Balaton, and Laszlo Nagy. Energy monitoring of process systems: time-series segmentation-based targeting models. Clean Technologies and Environmental Policy, 16(7):1245–1253, 2014.

[3] Tibor Kulcsar and Janos Abonyi. Development of a modelling framework for NIR spectroscopy based on-line analyzers using dimensional reduction techniques and genetic programming. Chemical Engineering Transactions, 32, 2013.

[4] Tibor Kulcsar, Miklos Balaton, Laszlo Nagy, and Janos Abonyi. Feature selection based root cause analysis for energy monitoring and targeting. Chemical Engineering Transactions, 39:709–714, 2014.

[5] Tibor Kulcsar, Barbara Farsang, Sandor Nemeth, and Janos Abonyi. Multivariate statistical and computational intelligence techniques for quality monitoring of production systems. In Cengiz Kahraman and Seda Yanık, editors, Intelligent Decision Making in Quality Management, volume 97 of Intelligent Systems Reference Library, pages 237–263. Springer International Publishing, 2016.

Articles in Hungarian journals

[6] Tibor Kulcsar and Janos Abonyi. Statistical process control based performance evaluation of on-line analysers. Hungarian Journal of Industry and Chemistry, 41(1):77–82, 2013.

[7] Tibor Kulcsar, Gabor Sarossy, Gabor Bereznai, Robert Auer, and Janos Abonyi. Partial least squares model based process monitoring using near infrared spectroscopy. Periodica Polytechnica Chemical Engineering, 57(1-2):15–20, 2013.

Conferences

[8] J. Abonyi, B. Farsang, and T. Kulcsar. Data-driven development and maintenance of soft-sensors. In Applied Machine Intelligence and Informatics (SAMI), 2014 IEEE 12th International Symposium on, pages 239–244. IEEE, Jan 2014.

[9] J. Abonyi, B. Farsang, and T. Kulcsar. Data-driven development and maintenance of soft-sensors. In 2014 IEEE 12th International Symposium on Applied Machine Intelligence and Informatics (SAMI), Herľany, Slovakia, 2014.

[10] Janos Abonyi, Tibor Kulcsar, Miklos Balaton, and Laszlo Nagy. Historical process data based energy monitoring - model based time-series segmentation to determine target values. In PRES'13 - Conference Process Integration, Modelling and Optimisation for Energy Saving and Pollution Reduction, Rhodes, Greece, 2013.

[11] Janos Abonyi, Tibor Kulcsar, Miklos Balaton, and Laszlo Nagy. Feature selection based root cause analysis for energy monitoring and targeting. In PRES 2014 - Conference Process Integration, Modelling and Optimisation for Energy Saving and Pollution Reduction, Prague, Czech Republic, 2014.

[12] Tibor Kulcsar and Janos Abonyi. Partial least squares model based process monitoring using near infrared spectroscopy. In Chemical Engineering Days '12, Veszprem, Hungary, 2012.

[13] Tibor Kulcsar and Janos Abonyi. Development of a modeling framework for NIR spectroscopy based on-line analysers using dimensional reduction techniques and genetic programming. In ICheaP12 - International Conference on Chemical and Process Engineering, 2014.

[14] Tibor Kulcsar, Gabor Bereznai, Gabor Sarossy, Robert Auer, and Janos Abonyi. Data-driven development and maintenance of soft-sensors. In Visualisation of High Dimensional Data by Use of Genetic Programming: Application to On-line Infrared Spectroscopy Based Process Monitoring, 2012.

[15] Tibor Kulcsar, Gabor Bereznai, Gabor Sarossy, Robert Auer, and Janos Abonyi. Visualisation of high dimensional data by use of genetic programming: Application to on-line infrared spectroscopy based process monitoring. In Soft Computing in Industrial Applications, volume 223 of Advances in Intelligent Systems and Computing, pages 223–231. Springer International Publishing, 2014.

[16] Tibor Kulcsar, Peter Koncz, Miklos Balaton, Laszlo Nagy, and Janos Abonyi. Statistical process control based energy monitoring of chemical processes. In Petar Sabev Varbanov, Jiří Jaromír Klemeš, and Peng Yen Liew, editors, 24th European Symposium on Computer Aided Process Engineering, volume 33 of Computer Aided Chemical Engineering, pages 397–402. Elsevier, 2014.

[17] Tibor Kulcsar, Peter Koncz, Miklos Balaton, Laszlo Nagy, and Janos Abonyi. Statistical process control based energy monitoring of chemical processes. In ESCAPE 24 - 24th European Symposium on Computer Aided Process Engineering, Budapest, Hungary, 2014.

[18] Tibor Kulcsar, Gabor Sarossy, Gabor Bereznai, Robert Auer, and Janos Abonyi. Visualization and indexing of spectral databases. In Proceedings of International Conference on Computational and Statistical Sciences, volume 6, pages 860–865. World Academy of Science, Engineering and Technology, 2012.

[19] Tibor Kulcsar, Gabor Sarossy, Gabor Bereznai, Robert Auer, and Janos Abonyi. Visualization and indexing of spectral databases. In International Conference on Computational and Statistical Sciences, Zurich, Switzerland, 2012.
