

Research Article

Multivariate Methods Based Soft Measurement for Wine Quality Evaluation

Shen Yin,1 Lei Liu,2 Xin Gao,1 and Hamid Reza Karimi3

1 College of Engineering, Bohai University, Liaoning 121013, China
2 College of Mathematics and Physics, Bohai University, Liaoning 121013, China
3 Department of Engineering, Faculty of Engineering and Science, University of Agder, 4898 Grimstad, Norway

Correspondence should be addressed to Lei Liu bhushxllgmailcom

Received 22 April 2014; Accepted 13 May 2014; Published 3 June 2014

Academic Editor: Josep M. Rossell

Copyright © 2014 Shen Yin et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Soft measurement is a new, developing, and promising industrial technology that is now widely used in industry. It plays a significant role especially where key variables are difficult to measure by traditional measurement methods. In this paper, the quality of wine is evaluated from the wine physicochemical indexes using multivariate methods based on soft measurement. The multivariate methods used are ordinary least squares regression (OLSR), principal component regression (PCR), partial least squares regression (PLSR), and modified partial least squares regression (MPLSR). A comparison of the four methods shows that the MPLSR prediction model gives better results than the others. In general, wine quality is determined by hiring experienced wine tasters to taste the wine and make a decision. However, since the physicochemical indexes of a wine reflect its quality to some extent, soft measurement based on multivariate statistical methods can help the oenologist in wine evaluation.

1. Introduction

The evaluation of wine quality is highly significant because of its effects on wine classification and target marketing. The quality of wine reflects several major influences, including quality ratings, the reputation of the winery, and profitability [1]. Therefore, it is imperative to evaluate wine quality for both the food industry and the wine science community [2, 3]. The traditional methods of evaluating wine quality are manual inspection and analysis of the chemical compounds. These methods are costly in both money and time. Studies of wine quality evaluation are abundant internationally. The support vector machine (SVM) can be used to build a wine quality classification model. The probabilistic neural network can evaluate wine quality based on the mineral element content of red wine. A proposed visualization method can evaluate wine quality according to physicochemical properties. In this paper, however, multivariate methods based on soft measurement are used as appropriate tools for chemical and physical measurements based on the wine physicochemical indexes.

Soft measurement technology is widely used [4–6]. Soft measurement techniques combine the production process with automatic control theory. In industrial processes it is common that some key control variables cannot be measured for technical or economic reasons [7]. With soft measurement, however, a mathematical model is built from the relations among the detectable process variables, and in this way the undetectable process variable can be estimated [8]. With both the detectable and the estimated variables available, the production process becomes controllable.

As online analysis methods are applied, investment and instrument maintenance costs continue to increase, and manual inspection is time-consuming and complex work. With the development of computer technology, soft measurement is the superior option. The primary methods for building a soft measurement model are mechanism based modelling, knowledge based modelling, and data based modelling, according to the type of prediction model. Mechanism based modelling requires known physical and chemical laws and sometimes a deep understanding of the mechanism of the production process. Moreover, its application is limited by the difficulty of modelling and the length of the modelling period. Knowledge based modelling is simple, easy to understand, and convenient, but it is not suitable for high precision applications and is limited by the knowledge rules that can be extracted. The process data based modelling method builds a statistical regression model from a large amount of production data using multivariate statistical analysis theory. This method is superior in modelling, maintenance, and precision. In this paper, data based modelling is utilized and the model is built by multivariate methods based on soft measurement.

Hindawi Publishing Corporation, Abstract and Applied Analysis, Volume 2014, Article ID 740754, 7 pages, http://dx.doi.org/10.1155/2014/740754

In this paper, multivariate methods based on soft measurement are used to evaluate red wine quality. These algorithms help construct the fitted wine quality model and predict wine quality. We compare the models established by OLSR, PCR, PLSR, and MPLSR and then select the best one in order to improve model accuracy [9]. The methods are superior to manual measurement for wine quality evaluation. Because OLSR suffers from a serious stability problem, the PCR, PLSR, and MPLSR models are built. These models can also help improve the production process. Furthermore, they are useful in target marketing: they can help identify the most relevant factors and classify wines, for example premium brands (useful for setting prices) [2]. The physicochemical indexes that influence wine quality are also identified in this paper. There are 87 physicochemical indexes collected; a set of 20 samples is used as the calibration set and a set of 7 samples is used as the verification set. We predict the wine quality when only these physicochemical indexes are known.

This paper is organized as follows. Section 2 reviews the multivariate statistical methods, including OLSR, PCR, PLSR, and MPLSR. In Section 3, the data sets are introduced and analyzed, the wine prediction models are established with the multivariate statistical methods, and a comparison among these models is provided. Finally, the conclusions are presented in the last section.

2. Modeling Algorithms

In this section, multivariate statistical analysis is applied to the wine problem, and OLSR, PCR, PLSR, and MPLSR are introduced briefly. OLSR loses its effectiveness under multicollinearity or when there are fewer samples than variables. PCR is used to solve the collinearity problem; it retains the important information in the data set according to the cumulative percent variance (CPV). Unlike PCR, PLSR focuses especially on the highly linearly dependent variables within the data. MPLSR has the advantages of both PCR and PLSR and performs better than either.

2.1. Ordinary Least Squares Regression (OLSR). The least squares method is a mathematical optimization technique. The function that best matches the data is found by minimizing the sum of squared errors, and unknown values are then obtained simply from the fitted model. Because the sum of squared differences between the actual data and the calculated data is made as small as possible, the least squares method can also be used for curve fitting. In general, the OLSR algorithm is used to solve problems involving only a one-dimensional response variable. However, OLSR can also solve problems with response variables of more than one dimension. The algorithm is briefly introduced as follows.

Step 1. Collect $n$ samples of the measurable variables and response variables and then normalize them to zero mean and unit variance, denoted as $U = [u_1 \cdots u_m] \in \mathbb{R}^{n \times m}$ and $Y = [y_1 \cdots y_p] \in \mathbb{R}^{n \times p}$. The following steps will be executed only when $U^{T}U$ is a full rank matrix.

Step 2. Minimize the sum of errors $e_i$:

$$\beta_i = \arg\min e_i^{T} e_i = \arg\min\, (y_i - U\beta_i)^{T} (y_i - U\beta_i), \quad i = 1, \ldots, p. \tag{1}$$

Because of the invertibility of $U^{T}U$, the $\beta_i$ can also be calculated by

$$\beta_i = (U^{T}U)^{-1} U^{T} y_i. \tag{2}$$

Step 3. Use the $e_i$ and $\beta_i$ to form $E \in \mathbb{R}^{n \times p}$ and $M \in \mathbb{R}^{m \times p}$, respectively, and the final model can be indicated as

$$Y = UM + E. \tag{3}$$
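The three OLSR steps can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `normalize` and `olsr_fit` and the synthetic data are assumptions made here, and the rank check stands in for the full-rank condition stated in Step 1.

```python
import numpy as np

def normalize(A):
    """Scale every column to zero mean and unit variance (Step 1)."""
    return (A - A.mean(axis=0)) / A.std(axis=0, ddof=1)

def olsr_fit(U, Y):
    """Return (M, E) with M = (U^T U)^{-1} U^T Y as in (2) and E = Y - U M as in (3)."""
    gram = U.T @ U
    if np.linalg.matrix_rank(gram) < U.shape[1]:
        raise ValueError("U^T U is not full rank, so OLSR is not applicable")
    M = np.linalg.solve(gram, U.T @ Y)  # solves all p columns of Y at once
    E = Y - U @ M
    return M, E

# Synthetic illustration data (hypothetical, not the wine data set)
rng = np.random.default_rng(0)
U = normalize(rng.normal(size=(50, 3)))
Y = normalize(U @ np.array([[1.0], [-2.0], [0.5]]) + 0.1 * rng.normal(size=(50, 1)))
M, E = olsr_fit(U, Y)
print(M.ravel())
```

Solving the normal equations with `np.linalg.solve` avoids forming the explicit inverse in (2); the residual `E` is orthogonal to the columns of `U`, which is a quick way to check a fit.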

2.2. Principal Component Regression (PCR). Principal component analysis (PCA), invented by Pearson in 1901 [10], can be used to analyze data and establish a mathematical model. The method obtains the principal components (i.e., the characteristic vectors) and their weights (i.e., the eigenvalues [11]) by decomposing the covariance matrix of the data [12].

Since the 1980s, PCA has been successfully applied in numerous areas, including data compression, image processing, feature extraction, pattern recognition, and process monitoring [13, 14]. Because PCR is effective at dimensionality reduction, it can solve the variable multicollinearity encountered in practice and the stability problem of OLSR [15]. The collinearity problem is solved effectively. In PCR, uncorrelated aggregate indicators, formed as combinations of the original indicators, take the place of the original ones and are used to represent them. PCR has been widely used in many fields [16–18].

Step 1. Gather the measurable variables and the response variables and normalize them to zero mean and unit variance, presented as $U = [u_1 \cdots u_m] \in \mathbb{R}^{n \times m}$ and $Y = [y_1 \cdots y_p] \in \mathbb{R}^{n \times p}$.

Step 2. Perform singular value decomposition (SVD) on the covariance matrix of the independent variable set:

$$\frac{1}{n-1} U^{T}U = P \Lambda P^{T}, \quad \Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_m), \quad \lambda_1 > \cdots > \lambda_m > 0, \tag{4}$$

in which $P$ is the so-called loading matrix of the covariance matrix.

Step 3. Determine the number of principal components with an appropriate criterion [19] and calculate the score matrix $T$:

$$T = U P_{\mathrm{pc}} \in \mathbb{R}^{n \times l}, \tag{5}$$

where

$$P_{\mathrm{pc}} \subset P = [P_{\mathrm{pc}} \;\; P_{\mathrm{res}}] \in \mathbb{R}^{m \times m}, \quad P_{\mathrm{pc}} = [p_1 \cdots p_l] \in \mathbb{R}^{m \times l}, \quad P_{\mathrm{res}} = [p_{l+1} \cdots p_m] \in \mathbb{R}^{m \times (m-l)}, \tag{6}$$

and $l$ is the number of principal components.

Step 4. Perform OLS regression between the score matrix $T$ and the dependent matrix $Y$ and obtain the final model:

$$\beta_i = (T^{T}T)^{-1} T^{T} y_i, \quad i = 1, \ldots, p, \qquad B = [\beta_1 \cdots \beta_p] \in \mathbb{R}^{l \times p}, \qquad Y = TB + E. \tag{7}$$

Namely, $Y = UM + E$ with $M = P_{\mathrm{pc}} B$.
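A hedged sketch of the PCR steps follows. It assumes that the number of components is chosen by a cumulative-variance threshold (the `cpv` argument, set here to 0.98), and it uses `numpy.linalg.eigh` for the decomposition in (4), which is equivalent to the SVD for a symmetric covariance matrix. The names `pcr_fit` and `normalize` and the test data are illustrative assumptions.

```python
import numpy as np

def normalize(A):
    """Scale every column to zero mean and unit variance (Step 1)."""
    return (A - A.mean(axis=0)) / A.std(axis=0, ddof=1)

def pcr_fit(U, Y, cpv=0.98):
    """Return (M, l): coefficient matrix M = P_pc B and number of components l."""
    n = U.shape[0]
    # Step 2: eigendecomposition of the covariance matrix, largest eigenvalue first
    eigvals, P = np.linalg.eigh(U.T @ U / (n - 1))
    order = np.argsort(eigvals)[::-1]
    eigvals, P = eigvals[order], P[:, order]
    # Step 3: smallest l whose cumulative variance share reaches the threshold
    ratio = np.cumsum(eigvals) / eigvals.sum()
    l = min(int(np.searchsorted(ratio, cpv)) + 1, len(eigvals))
    P_pc = P[:, :l]
    T = U @ P_pc                                # score matrix, as in (5)
    # Step 4: OLS between scores and responses, as in (7)
    B = np.linalg.solve(T.T @ T, T.T @ Y)
    return P_pc @ B, l

# Hypothetical data with two nearly identical (collinear) columns
rng = np.random.default_rng(1)
x, z = rng.normal(size=(2, 40))
U = normalize(np.column_stack([x, x + 0.01 * rng.normal(size=40), z]))
Y = normalize((x + z).reshape(-1, 1))
M, l = pcr_fit(U, Y)
print(l, M.ravel())
```

In this example the near-duplicate column contributes almost no variance of its own, so the threshold keeps two components and the regression in score space stays well conditioned.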

2.3. Partial Least Squares Regression (PLSR). The partial least squares (PLS) method can be applied when the number of explanatory variables is very high. PLS generalizes and combines features of principal component analysis and multiple regression. Partial least squares regression and principal component regression extract different factor scores. The purpose of PCR is to extract the information that represents the matrix $U$ and then predict the value of the variable $Y$; the extracted components are then used to improve the quality of the prediction model. When the correlation of some useful variables is low, the reliability of the final PCR prediction model goes down, and PCR has difficulty overcoming this defect. PLS, in contrast, decomposes the variables $U$ and $Y$ and extracts components from both at the same time. The latent variables (LVs) of $U$ and $Y$ are strongly related, which ensures that a model based on the PLSR algorithm can make predictions from the measurable variables [20]. Only a few factors are selected to participate in the model. PLSR is widely used in many areas, such as fault detection, wine analysis, and chemistry [9, 21, 22]. The standard PLSR method is as follows.

Step 1. Collect the measurable variables $U$ and the response variables $Y$ and normalize the data sets, expressed as $U = [u_1 \cdots u_m] \in \mathbb{R}^{n \times m}$ and $Y = [y_1 \cdots y_p] \in \mathbb{R}^{n \times p}$, in which $u_i$ and $y_i$ are normalized to zero mean and unit variance.

Step 2. Calculate the following equations $\gamma$ times iteratively, starting from $U_1 = U$ and $Y_1 = Y$:

$$
\begin{aligned}
(p_k, q_k) &= \arg\max_{\|p_k\| = 1,\ \|q_k\| = 1} p_k^{T} U_k^{T} Y_k q_k, \quad k = 1, \ldots, \gamma, \\
a_k &= U_k p_k, \qquad b_k = Y_k q_k, \\
U_k &= a_k c_k^{T} + U_{k+1}, \qquad Y_k = a_k r_k^{T} + Y_{k+1}, \\
c_k &= \frac{U_k^{T} a_k}{\|a_k\|^{2}}, \qquad r_k = \frac{Y_k^{T} a_k}{\|a_k\|^{2}}, \\
U_{k+1} &= U_k - a_k c_k^{T}, \qquad Y_{k+1} = Y_k - a_k r_k^{T},
\end{aligned}
\tag{8}
$$

where $p_k$, $q_k$ and $a_k$, $b_k$ are the loading vectors and score vectors of $U$ and $Y$, respectively, and $\gamma$ is the number of latent variables (LVs), which is usually determined by the cross validation criteria [22].

Step 3. Store $p_k$, $a_k$, $c_k$, $r_k$ into $P$, $A$, $C$, $R$; the result of the standard PLSR algorithm can be expressed as

$$A = UP, \qquad U = AC^{T} + E, \qquad Y = AR^{T} + F. \tag{9}$$

Namely, $Y = UM + F$ with $M = PR^{T}$.
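The iteration in (8) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: `p_k` is taken as the dominant left singular vector of `U_k^T Y_k`, which maximizes the bilinear form in (8) over unit vectors, and the coefficient matrix is assembled as `P (C^T P)^{-1} R^T`. That rotation is the usual way to express deflated-block scores in terms of the original `U`; it coincides with `M = P R^T` only when `C^T P` is the identity, so it should be read as one interpretation of (9). The function name `plsr_fit` and the data are hypothetical.

```python
import numpy as np

def normalize(A):
    """Scale every column to zero mean and unit variance (Step 1)."""
    return (A - A.mean(axis=0)) / A.std(axis=0, ddof=1)

def plsr_fit(U, Y, n_lv):
    """Return the coefficient matrix M of a PLS model with n_lv latent variables."""
    Uk, Yk = U.copy(), Y.copy()
    P, C, R = [], [], []
    for _ in range(n_lv):
        # Dominant left singular vector of U_k^T Y_k maximizes p^T U_k^T Y_k q
        left, _, _ = np.linalg.svd(Uk.T @ Yk, full_matrices=False)
        p = left[:, 0]
        a = Uk @ p                        # score vector a_k
        c = Uk.T @ a / (a @ a)            # loading of U_k on a_k
        r = Yk.T @ a / (a @ a)            # loading of Y_k on a_k
        Uk = Uk - np.outer(a, c)          # deflation, last line of (8)
        Yk = Yk - np.outer(a, r)
        P.append(p); C.append(c); R.append(r)
    P, C, R = (np.column_stack(v) for v in (P, C, R))
    # Rotate the weights so that the scores are expressed through the original U
    return P @ np.linalg.solve(C.T @ P, R.T)

# Hypothetical data for illustration
rng = np.random.default_rng(2)
U = normalize(rng.normal(size=(40, 3)))
Y = normalize(U @ np.array([[0.8], [-0.5], [0.3]]) + 0.2 * rng.normal(size=(40, 1)))
M1 = plsr_fit(U, Y, n_lv=1)
M3 = plsr_fit(U, Y, n_lv=3)
```

With as many latent variables as there are columns in a full-rank `U`, the PLS coefficients agree with the OLS solution, which gives a simple sanity check; fewer latent variables give a more strongly regularized model.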

2.4. Modified Partial Least Squares Regression (MPLSR). In practice, multicollinearity and having fewer samples than variables are two common phenomena. When OLSR is applied, multicollinearity leads to a serious stability problem. PCR and PLSR can both handle highly collinear data. The PCR algorithm solves the collinearity problem efficiently by introducing principal components (PCs). PLSR considers both the outer relations (the $U$ and $Y$ blocks individually) and the inner relation (linking both blocks), whereas PCR considers only the outer relation of the $U$ block. Because they reduce dimension, PCR and PLSR both run the risk of losing useful information when the PCs or LVs are selected. To solve these problems, Yin et al. proposed the MPLSR [23]. MPLSR has been validated on the industrial benchmark of the Tennessee Eastman process for fault detection, and a good result has been obtained.

Step 1. Gather all the $n$ samples of measurable variables and response variables and stack them into $U$ and $Y$, respectively. Normalize them to zero mean and unit variance, denoted as $U = [u_1 \cdots u_m] \in \mathbb{R}^{n \times m}$ and $Y = [y_1 \cdots y_p] \in \mathbb{R}^{n \times p}$.

Step 2. Calculate the regression coefficient matrix $M$:

$$
\frac{1}{n} Y^{T}U \approx M^{T} \frac{U^{T}U}{n}, \qquad
M =
\begin{cases}
(U^{T}U)^{-1} U^{T}Y, & \text{if } \operatorname{rank}(U^{T}U) = m, \\
(U^{T}U)^{\dagger} U^{T}Y, & \text{if } \operatorname{rank}(U^{T}U) < m,
\end{cases}
\tag{10}
$$

in which $(U^{T}U)^{\dagger}$ is the pseudoinverse of $U^{T}U$.


Table 1: The wine quality of one sample given by human experts.

Human expert                   1    2    3    4    5    6    7    8    9    10   Mean points
Presentation (20 points)       9    10   10   10   10   9    10   10   10   7    9.5
Fragrance (30 points)          25   20   21   21   23   18   23   25   24   23   22.3
Mouth-feel (40 points)         31   26   29   25   28   32   28   32   27   26   28.4
Overall feeling (10 points)    9    8    8    9    9    8    9    9    8    9    8.6
Sum points (100 points)        74   64   68   65   70   67   70   76   69   65   68.8

Table 2: The wine quality of one sample given by human experts.

Group                      1        2
The standard deviation     7.3426   3.978

Step 3. Obtain the final model of MPLSR:

$$Y = UM + E_y, \tag{11}$$

where $E_y$ is the residue part of $Y$, which is uncorrelated with $U$.
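A minimal sketch of the MPLSR coefficient calculation in (10) and the model in (11) is given below. The function name `mplsr_fit`, the explicit pseudoinverse cutoff `rcond`, and the random data are assumptions made for illustration; the rank test is applied to `U`, which has the same rank as `U^T U`.

```python
import numpy as np

def normalize(A):
    """Scale every column to zero mean and unit variance (Step 1)."""
    return (A - A.mean(axis=0)) / A.std(axis=0, ddof=1)

def mplsr_fit(U, Y, rcond=1e-10):
    """Return M as in (10): inverse if U^T U is full rank, pseudoinverse otherwise."""
    gram = U.T @ U
    m = U.shape[1]
    if np.linalg.matrix_rank(U) == m:      # rank(U) equals rank(U^T U)
        return np.linalg.solve(gram, U.T @ Y)
    # Rank-deficient case: singular values below rcond * largest are treated as zero
    return np.linalg.pinv(gram, rcond=rcond, hermitian=True) @ U.T @ Y

# Hypothetical case with fewer samples (5) than variables (8)
rng = np.random.default_rng(3)
U_small = normalize(rng.normal(size=(5, 8)))
Y_small = normalize(rng.normal(size=(5, 1)))
M_small = mplsr_fit(U_small, Y_small)
E_y = Y_small - U_small @ M_small          # residue part of Y, as in (11)

# Hypothetical full-rank case, where (10) reduces to ordinary least squares
U_tall = normalize(rng.normal(size=(30, 3)))
Y_tall = normalize(rng.normal(size=(30, 1)))
M_tall = mplsr_fit(U_tall, Y_tall)
```

When there are fewer samples than variables, the centred responses lie in the column space of `U`, so the calibration residual is numerically zero. This is in line with the zero RMSEC reported for MPLSR in Table 3, and it is also a reminder that a perfect calibration fit says little about prediction error.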

3. Results and Discussion

To classify and identify red wine, [24] proposed a technology based on spectra and pattern recognition. This paper, in contrast, is devoted to comparing the methods of Section 2 and predicting wine quality for the purpose of wine classification and target marketing.

In this section, we use three multivariate statistical methods based on soft measurement, namely principal component regression (PCR), partial least squares regression (PLSR), and modified partial least squares regression (MPLSR), to find the best method for modelling the relationship between the red wine physicochemical indexes and the wine quality. We leave out ordinary least squares regression (OLSR), which is not suited to this situation. The fitting efficiency and prediction ability of the methods are compared by means of the relevant figures and indexes, and the best model is identified.

3.1. Data Preprocessing. The data set (mathematical modeling official site, http://www.mcm.edu.cn) used in this paper includes the red wine qualities, physicochemical indexes, and aroma substances; the last two are collectively referred to as physicochemical indexes in the following.

The scores given by the professional tasters in the two groups differ noticeably, so processing the scores is a primary task. We analyze the two sets of scores from four aspects: presentation (20 points), fragrance (30 points), mouth-feel (40 points), and overall feeling (10 points), for a total of 100 points. For example, the scores for one sample are given in Table 1.

In the data set, the same 27 wine samples are scored by both groups. There are 10 tasters of different levels in each group. Missing data in the first group are replaced by the average value (AV). The standard deviation (SD) of the first group is 7.3426 and that of the second group is 3.978.

The average value of the samples in both groups and the standard deviation of the two groups (Table 2) are calculated as follows:

$$\mathrm{AV} = \frac{1}{N} \sum_{i=1}^{N} x_i, \qquad \mathrm{SD} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mathrm{AV})^{2}}. \tag{12}$$
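As a worked example, the average value and standard deviation can be computed for the ten "Sum points" scores of the sample in Table 1. The sketch assumes the population form of the standard deviation, dividing by N inside the square root; the function names are illustrative.

```python
import math

def average_value(x):
    """AV: arithmetic mean of the scores."""
    return sum(x) / len(x)

def standard_deviation(x):
    """SD: population standard deviation (divides by N, an assumption made here)."""
    av = average_value(x)
    return math.sqrt(sum((xi - av) ** 2 for xi in x) / len(x))

# Sum points (100 points) given by the ten experts in Table 1
scores = [74, 64, 68, 65, 70, 67, 70, 76, 69, 65]
av = average_value(scores)
sd = standard_deviation(scores)
print(round(av, 2), round(sd, 2))  # prints: 68.8 3.71
```

The mean matches the 68.8 listed in Table 1. The group-level values in Table 2 are computed over all samples of a group, so they are not expected to equal this single-sample figure.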

The standard deviation of the second group is clearly smaller than that of the first group, so the second group provides the more reliable result. We therefore take the second group's scores as the actual wine quality to compare with the fitted wine quality. To construct the prediction models, a set of 20 randomly selected samples is used as the calibration set. The remaining samples are used as the verification set to verify the prediction ability of the models.

3.2. Modeling and Comparison. To isolate the influence of the indexes on the quality of wine, the following assumptions are made.

(1) In this paper, the vinification processes are identical and the vinification environment is the same.

(2) The scores given by the expert tasters approximate the real quality scores.

(3) The grapes used to make a given wine are of a single species.

The purpose of the assumptions above is to ensure that only the wine physicochemical indexes affect the quality of the wine. We focus on the relations between the physicochemical indexes and the quality of red wine and ignore other factors.

OLSR is not applicable when there is multicollinearity in the calibration set, and it cannot be used when the number of samples is smaller than the number of variables. In this paper, therefore, OLSR is not employed to establish a model.

The other multivariate statistical methods based on soft measurement, namely PCR, PLSR, and MPLSR, are used to establish models. We use the root mean squared error of calibration (RMSEC) and the root mean squared error of prediction (RMSEP): RMSEC evaluates the fitting efficiency and RMSEP demonstrates the prediction ability of the different methods [9]. RMSEC and RMSEP are defined as follows:

$$\mathrm{RMSEC} = \sqrt{\frac{\sum_{i=1}^{N} (y_i - y_{\mathrm{cal},i})^{2}}{N}}, \qquad \mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{M} (y_i - y_{\mathrm{pred},i})^{2}}{M}}, \tag{13}$$

where $y_i$ is the measured value for the $i$th sample, $y_{\mathrm{cal},i}$ and $y_{\mathrm{pred},i}$ are the model fitted and model predicted response values for the $i$th sample, respectively, and $N$ and $M$ are the numbers of calibration samples and verification samples, respectively.

[Figure 1 has three panels: actual wine quality plotted against PCR fitted wine quality; actual quality and fitted quality plotted over the wine samples; and actual wine quality plotted against PCR predicted wine quality.]

Figure 1: PCR fitted quality versus actual quality and predicted quality versus actual quality.
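Both error measures in (13) are the same computation applied to different sample sets, as the following sketch shows. The helper name `rmse` and the three quality scores are hypothetical and serve only to illustrate the formula.

```python
import numpy as np

def rmse(y_measured, y_model):
    """Root mean squared error between measured and model values, as in (13)."""
    y_measured = np.asarray(y_measured, dtype=float)
    y_model = np.asarray(y_model, dtype=float)
    return float(np.sqrt(np.mean((y_measured - y_model) ** 2)))

# Hypothetical quality scores, not taken from the wine data set
y_cal_actual = [70.0, 65.0, 72.0]
y_cal_fitted = [69.0, 66.0, 72.0]
rmsec = rmse(y_cal_actual, y_cal_fitted)   # calibration set gives RMSEC
print(rmsec)
```

Calling the same function on the verification samples and their predicted values gives RMSEP.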

The CPV (cumulative percent variance) criterion [19] is first employed to select the number of PCs: with a CPV of 98%, 17 PCs are obtained for PCR. With the help of cross validation [22, 25], 4 LVs are chosen for PLSR. MPLSR does not need to select PCs or LVs; this selection is the main difficulty for PCR and PLSR in preserving the integrity of the information in the original data set.
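The criterion itself is only cited, not restated, in the text. The usual definition of cumulative percent variance, written with the eigenvalues of (4), is sketched below as an assumption about what is meant; $l^{*}$ denotes the selected number of PCs.

```latex
\mathrm{CPV}(l) = \frac{\sum_{i=1}^{l} \lambda_i}{\sum_{i=1}^{m} \lambda_i} \times 100\%,
\qquad
l^{*} = \min \{\, l : \mathrm{CPV}(l) \ge 98\% \,\}
```

Under this reading, the 17 PCs reported for PCR are the smallest number of leading components whose eigenvalues account for at least 98% of the total variance.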

According to Figures 1, 2, and 3 and Table 3, the choice of multivariate statistical method significantly affects the prediction models. Because of multicollinearity and the smaller number of samples than variables in the calibration set, OLSR is not suited to establishing the prediction model. MPLSR reaches the minimal RMSEC and thus obtains a satisfying fitting efficiency. However, PCR, PLSR, and MPLSR have similar prediction ability in this paper.

3.3. Contribution Ratio of Wine Physicochemical Indexes. In this paper, 87 wine physicochemical indexes are collected in the calibration set, and these indexes provide adequate information. Measuring such a large number of physicochemical indexes is time-consuming and complex work, and some of them do not contribute to the wine quality. To avoid lengthy and complex calculations, it is necessary to analyze the contribution of every wine physicochemical index and select the main ones.

In order to analyze the contribution of every physicochemical index to the wine quality, a contribution ratio (CR) is introduced, formulated as follows:

$$
y - \beta_0 = BX, \qquad \mathrm{CR}_i = \frac{\beta_i x_i}{y - \beta_0}, \quad i = 1, \ldots, m, \\
B = [\beta_1 \;\; \beta_2 \cdots \beta_m] \in \mathbb{R}^{1 \times m}, \qquad X = [x_1 \;\; x_2 \cdots x_m]^{T} \in \mathbb{R}^{m \times 1},
\tag{14}
$$

where $B$ is the coefficient matrix of the regression equation and $\beta_0$ is a constant. The benefit of this definition of CR is that the sum of all $\mathrm{CR}_i$ ($i = 1, \ldots, m$) equals 1.
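The contribution ratio in (14) can be illustrated with a short sketch. The coefficients and index values are made-up numbers, and the function name `contribution_ratio` is an assumption; the only property relied on is that `y - beta_0` equals the sum of the products `beta_i * x_i`.

```python
import numpy as np

def contribution_ratio(beta, x):
    """CR_i = beta_i * x_i / (y - beta_0), where y - beta_0 = sum_i beta_i * x_i."""
    terms = np.asarray(beta, dtype=float) * np.asarray(x, dtype=float)
    total = terms.sum()
    if total == 0.0:
        raise ValueError("y - beta_0 is zero, so the ratio is undefined")
    return terms / total

# Hypothetical regression coefficients and index values for three indexes
beta = [0.5, -0.2, 1.0]
x = [2.0, 1.0, 0.5]
cr = contribution_ratio(beta, x)
print(cr, cr.sum())
```

Individual ratios can be negative when an index pulls the predicted quality down, while the ratios still sum to 1.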

The contribution of the wine physicochemical indexes is shown in Figure 4. As the figure shows, the indexes with the greatest influence on wine quality are cis-resveratrol, color, acetic acid 2-methylpropyl ester, butanedioic acid diethyl ester, and so forth. In contrast, some physicochemical indexes can be ignored, such as transresveratrol TR, limonene, 1-butanol 3-methyl, styrene, and so forth. Since some variables can be controlled in the production process, this information can be used to improve the wine quality. Detailed information is given in [9].

4. Conclusion

In this paper, multivariate methods based on soft measurement, namely ordinary least squares regression (OLSR), principal component regression (PCR), partial least squares regression (PLSR), and modified partial least squares regression (MPLSR), are reviewed. With these methods and real data obtained in practice, wine quality prediction models have been constructed. One purpose of this work is to choose the best method among OLSR, PLSR, PCR, and MPLSR. The model built by MPLSR performs best among the four, with superior fitting efficiency as represented by RMSEC. The models built by PCR, PLSR, and MPLSR have similar prediction ability as represented by RMSEP. Several wine physicochemical indexes contribute significantly to the final wine quality, while others are insignificant. The efficiency of the MPLSR model is validated on a larger data set, and because the method is based on objective tests, it aids the speed and quality of the oenologist's work. The outcome of the work is useful for the wine industry: from the industry's perspective, the method saves manpower and financial resources.

[Figure 2 has three panels: actual wine quality plotted against PLSR fitted wine quality; actual quality and fitted quality plotted over the wine samples; and actual wine quality plotted against PLSR predicted wine quality.]

Figure 2: PLSR fitted quality versus actual quality and predicted quality versus actual quality.

[Figure 3 has three panels: actual wine quality plotted against MPLSR fitted wine quality; actual quality and fitted quality plotted over the wine samples; and actual wine quality plotted against MPLSR predicted wine quality.]

Figure 3: MPLSR fitted quality versus actual quality and predicted quality versus actual quality.

As science and technology are transformed into productivity, advanced technology is widely used in industry, and soft measurement is developing at the same time. Soft measurement has achieved comprehensive results in process control research. The technology breaks the traditional pattern of single input single output (SISO). Computer technology benefits soft measurement by improving the availability of soft-sensing techniques and reducing the difficulty of applying them. We believe that soft measurement will have wide application prospects with the integration of computer technology and advanced technology.

Table 3: RMSEC and RMSEP values.

          PCR    PLSR   MPLSR
RMSEC     0.62   0.60   0
RMSEP     3.07   3.20   3.01

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Figure 4: Contribution ratio of wine physicochemical indexes in MPLSR.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (no. 61304102) and the Natural Science Foundation of Liaoning Province, China (no. 2013020002).



production process needs to be understood deeply. Moreover, its application is limited by the difficulty of modelling and the length of the modelling period. Although knowledge-based modelling is simple, easy to understand, and convenient, it is not suitable when high precision is required, and it depends on the knowledge rules that can be extracted. The process-data-based modelling method builds a statistical regression model from a large amount of production data using multivariate statistical analysis theory. This method is superior in modelling, maintenance, and precision. In this paper, data-based modelling is utilized, and the model is built by the multivariate methods based on soft measurement.

In this paper, multivariate methods based on soft measurement are used to evaluate red wine quality. These algorithms help construct a fitted wine quality model and predict wine quality. We compare the models established by OLSR, PCR, PLSR, and MPLSR and then select the best one in order to improve model accuracy [9]. The methods are superior to manual measurement for wine quality evaluation. Owing to the serious stability problem of OLSR, only the PCR, PLSR, and MPLSR models are built. These models can also help improve the production process. Furthermore, they can be useful in target marketing: they can help identify the most relevant factors and help classify wines, for example into premium brands (useful for setting prices) [2]. The physicochemical indexes that affect the quality of wine are also identified in this paper. In total, 87 physicochemical indexes are collected; a set containing 20 samples is used as the calibration set and one containing 7 samples is used as the verification set. We predict the wine quality when only these physicochemical indexes are known.

This paper is organized as follows. Section 2 reviews the multivariate statistical methods, including OLSR, PCR, PLSR, and MPLSR. In Section 3, the data sets are introduced and analyzed; the wine prediction models are established based on the multivariate statistical methods, and a comparison among these models is provided. Finally, the conclusions are presented in the last section.

2. Modeling Algorithms

In this section, multivariate statistical analysis is used to address the wine quality problem, and OLSR, PCR, PLSR, and MPLSR are introduced briefly. OLSR loses its effectiveness when the variables are multicollinear or when there are fewer samples than variables. PCR is utilized to solve the collinearity problem; the components it retains are chosen according to the information they carry, measured by the cumulative percent variance (CPV). Unlike PCR, PLSR focuses especially on highly linearly dependent variables. MPLSR combines the advantages of PCR and PLSR and is designed to improve on both.

2.1. Ordinary Least Squares Regression (OLSR). The least squares method is a mathematical optimization technique. It finds the best match to the data by minimizing the sum of squared errors, and unknown quantities are then obtained directly from the fitted relation. Because the sum of squared differences between the actual data and the calculated data is made as small as possible, the least squares method can also be used for curve fitting. In general, the OLSR algorithm is used for problems with a one-dimensional response variable. However, OLSR can also handle response variables of more than one dimension. The algorithm is briefly introduced as follows.

Step 1. Collect $n$ samples of the measurable variables and response variables and normalize them to zero mean and unit variance, denoted as $U = [u_1 \cdots u_m] \in R^{n \times m}$ and $Y = [y_1 \cdots y_p] \in R^{n \times p}$. The following steps are executed only when $U^T U$ is a full-rank matrix.

Step 2. Minimize the sum of squared errors $e_i^T e_i$:

$$\beta_i = \arg\min e_i^T e_i = \arg\min\, (y_i - U\beta_i)^T (y_i - U\beta_i), \quad i = 1, \ldots, p. \tag{1}$$

Because $U^T U$ is invertible, $\beta_i$ can also be calculated by

$$\beta_i = (U^T U)^{-1} U^T y_i. \tag{2}$$

Step 3. Use the $e_i$ and $\beta_i$ to form $E \in R^{n \times p}$ and $M \in R^{m \times p}$, respectively; the final model can be written as

$$Y = UM + E. \tag{3}$$
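The three OLSR steps can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names `zscore` and `olsr_fit` are hypothetical, and the data are synthetic rather than the wine data set.

```python
import numpy as np

def zscore(A):
    """Normalize each column to zero mean and unit variance (Step 1)."""
    return (A - A.mean(axis=0)) / A.std(axis=0, ddof=1)

def olsr_fit(U, Y):
    """Return M with Y ~ U M, using the normal equations of Eq. (2)."""
    m = U.shape[1]
    G = U.T @ U
    if np.linalg.matrix_rank(G) < m:
        # The steps are only defined when U^T U has full rank.
        raise ValueError("U^T U is rank deficient; OLSR is not applicable")
    # Solve (U^T U) M = U^T Y for all p response columns at once.
    return np.linalg.solve(G, U.T @ Y)

# Synthetic data, for illustration only (not the wine data set).
rng = np.random.default_rng(0)
U = zscore(rng.normal(size=(50, 3)))
M_true = np.array([[1.0], [-2.0], [0.5]])
Y = U @ M_true                 # noise-free response, so E should vanish
M = olsr_fit(U, Y)
E = Y - U @ M                  # residual matrix of Eq. (3)
```

The rank check mirrors the condition in Step 1: with fewer samples than variables, `olsr_fit` raises an error instead of returning an unstable solution.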

2.2. Principal Component Regression (PCR). Principal component analysis (PCA), invented by Pearson in 1901 [10], can be used to analyze data and establish a mathematical model. The method obtains the principal components of the data (i.e., the characteristic vectors) and their weights (i.e., the eigenvalues [11]) by decomposing the covariance matrix [12].

Since the 1980s, PCA has been successfully applied in numerous areas, including data compression, image processing, feature extraction, pattern recognition, and process monitoring [13, 14]. Because PCR is strong at dimensionality reduction, it can resolve variable multicollinearity in practice and thereby the stability problem of OLSR [15]. In PCR, uncorrelated aggregate indicators, formed as combinations of the original indicators, take the place of the original ones and are used to represent them. PCR has been widely used in many fields [16–18].

Step 1. Gather the measurable variables and the response variables and normalize them to zero mean and unit variance, denoted as $U = [u_1 \cdots u_m] \in R^{n \times m}$ and $Y = [y_1 \cdots y_p] \in R^{n \times p}$.

Step 2. Perform singular value decomposition (SVD) on the covariance matrix of the independent variable set:

$$\frac{1}{n-1} U^T U = P \Lambda P^T, \quad \Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_m), \quad \lambda_1 > \cdots > \lambda_m > 0, \tag{4}$$

in which $P$ is the so-called loading matrix of the covariance matrix.

Step 3. Determine the number of principal components with an appropriate criterion [19] and calculate the score matrix $T$:

$$T = U P_{pc} \in R^{n \times l}, \tag{5}$$

where

$$P_{pc} \subset P = [P_{pc} \;\; P_{res}] \in R^{m \times m}, \quad P_{pc} = [p_1 \cdots p_l] \in R^{m \times l}, \quad P_{res} = [p_{l+1} \cdots p_m] \in R^{m \times (m-l)}, \tag{6}$$

and $l$ is the number of principal components.

Step 4. Perform OLS regression between the score matrix $T$ and the dependent matrix $Y$ and obtain the final model:

$$\beta_i = (T^T T)^{-1} T^T y_i, \quad i = 1, \ldots, p, \qquad B = [\beta_1 \cdots \beta_p] \in R^{l \times p}, \qquad Y = TB + E. \tag{7}$$

Namely, $Y = UM + E$ with $M = P_{pc} B$.
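As a rough sketch of the PCR steps, the following NumPy code regresses on the leading principal components. It assumes `U` and `Y` are already normalized, uses a symmetric eigendecomposition (equivalent to the SVD for a covariance matrix), and runs on synthetic data with an exactly collinear column. The function name `pcr_fit` is hypothetical.

```python
import numpy as np

def pcr_fit(U, Y, l):
    """PCR with l principal components; U and Y are assumed normalized."""
    n = U.shape[0]
    cov = U.T @ U / (n - 1)                 # covariance matrix of Eq. (4)
    eigvals, P = np.linalg.eigh(cov)        # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # reorder to descending eigenvalues
    P_pc = P[:, order[:l]]                  # retained loading vectors, Eq. (6)
    T = U @ P_pc                            # score matrix, Eq. (5)
    B = np.linalg.solve(T.T @ T, T.T @ Y)   # OLS on the scores, Eq. (7)
    return P_pc @ B                         # M = P_pc B

# Synthetic data, for illustration only: the third column is the sum of
# the first two, so U^T U is singular and plain OLSR would not apply.
rng = np.random.default_rng(0)
base = rng.normal(size=(40, 2))
U = np.column_stack([base, base.sum(axis=1)])
U = (U - U.mean(axis=0)) / U.std(axis=0, ddof=1)
Y = 2.0 * U[:, [0]]
M = pcr_fit(U, Y, l=2)
```

Because only two directions carry variance here, two components are enough to reproduce `Y` even though $U^T U$ is not invertible.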

2.3. Partial Least Squares Regression (PLSR). The partial least squares (PLS) method can be applied when the number of explanatory variables is very high. PLS generalizes and combines features of principal component analysis and multiple regression. PLSR and PCR extract different factor scores. The purpose of PCR is to extract the relevant information contained in matrix $U$ and use it to predict the value of $Y$, so that the extracted components improve the quality of the prediction model. However, when some useful variables are only weakly correlated, the reliability of the final PCR prediction model goes down, and PCR has no good remedy for this problem. PLS, by contrast, decomposes $U$ and $Y$ and extracts components from both at the same time. The latent variables (LVs) of $U$ and $Y$ are strongly related, which ensures that a model based on the PLSR algorithm can make predictions from the measurable variables [20]. Only a few factors are selected to participate in the model. PLSR is widely used in many areas, such as fault detection, wine analysis, and chemistry [9, 21, 22]. We consider the standard PLSR method as follows.

Step 1. Collect the measurable variables $U$ and the response variables $Y$ and normalize them, expressed as $U = [u_1 \cdots u_m] \in R^{n \times m}$ and $Y = [y_1 \cdots y_p] \in R^{n \times p}$, in which $u_i$ and $y_i$ are normalized to zero mean and unit variance.

Step 2. Calculate the following equations $\gamma$ times iteratively, starting from $U_1 = U$ and $Y_1 = Y$:

$$
\begin{aligned}
(p_k, q_k) &= \arg\max_{\|p_k\| = 1,\ \|q_k\| = 1} p_k^T U_k^T Y_k q_k, \quad k = 1, \ldots, \gamma, \\
a_k &= U_k p_k, \qquad b_k = Y_k q_k, \\
U_k &= a_k c_k^T + U_{k+1}, \qquad Y_k = a_k r_k^T + Y_{k+1}, \\
c_k &= \frac{U_k^T a_k}{\|a_k\|^2}, \qquad r_k = \frac{Y_k^T a_k}{\|a_k\|^2}, \\
U_{k+1} &= U_k - a_k c_k^T, \qquad Y_{k+1} = Y_k - a_k r_k^T,
\end{aligned} \tag{8}
$$

where $p_k$, $q_k$ and $a_k$, $b_k$ are the loading vectors and score vectors of $U$ and $Y$, respectively, and $\gamma$ is the number of latent variables (LVs), which is usually determined by cross validation [22].

Step 3. Store $p_k$, $a_k$, $c_k$, $r_k$ into $P$, $A$, $C$, $R$; the result of the standard PLSR algorithm can be expressed as

$$A = UP, \qquad U = AC^T + E, \qquad Y = AR^T + F. \tag{9}$$

Namely, $Y = UM + F$ with $M = PR^T$.
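The PLSR iteration can be sketched as follows. This is an illustrative NumPy version under stated assumptions: the maximization in Step 2 is solved by taking the leading left singular vector of $U_k^T Y_k$, and the regression matrix is formed with the standard rotation $P (C^T P)^{-1} R^T$ so that the scores computed from the deflated $U_k$ map back to the original $U$. That rotation is an assumption about how the compact form $M = PR^T$ is meant to be applied. The name `plsr_fit` and the data are hypothetical.

```python
import numpy as np

def plsr_fit(U, Y, gamma):
    """PLSR with gamma latent variables; U and Y are assumed normalized."""
    Uk, Yk = U.copy(), Y.copy()
    P, C, R = [], [], []
    for _ in range(gamma):
        # The leading left singular vector of U_k^T Y_k maximizes
        # p^T U_k^T Y_k q subject to ||p|| = ||q|| = 1.
        p = np.linalg.svd(Uk.T @ Yk, full_matrices=False)[0][:, 0]
        a = Uk @ p                      # score vector a_k
        c = Uk.T @ a / (a @ a)          # c_k
        r = Yk.T @ a / (a @ a)          # r_k
        Uk = Uk - np.outer(a, c)        # deflation: U_{k+1}
        Yk = Yk - np.outer(a, r)        # deflation: Y_{k+1}
        P.append(p)
        C.append(c)
        R.append(r)
    P, C, R = np.array(P).T, np.array(C).T, np.array(R).T
    # Rotate the weights so that the scores satisfy A = U @ W exactly.
    W = P @ np.linalg.inv(C.T @ P)
    return W @ R.T                      # M with Y ~ U M

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
U = rng.normal(size=(30, 4))
Y = rng.normal(size=(30, 2))
M_full = plsr_fit(U, Y, gamma=4)        # as many LVs as variables
M_one = plsr_fit(U, Y, gamma=1)         # a single LV
```

With as many latent variables as there are (linearly independent) columns in `U`, the result coincides with the ordinary least squares solution; with fewer, the fit is a reduced-rank approximation.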

2.4. Modified Partial Least Squares Regression (MPLSR). In practice, multicollinearity and having fewer samples than variables are two common phenomena. When OLSR is applied, multicollinearity leads to a serious stability problem. PCR and PLSR can handle highly collinear data. The PCR algorithm solves the collinearity problem efficiently by introducing principal components (PCs). PLSR considers both the outer relations (the $U$ and $Y$ blocks individually) and the inner relation (linking both blocks), whereas PCR considers only the outer relation of the $U$ block. However, because they reduce dimension, PCR and PLSR run the risk of losing useful information that is not captured by the selected PCs or LVs. To address these problems, Yin et al. proposed MPLSR [23]. MPLSR has been validated on the Tennessee Eastman industrial benchmark process for fault detection, where a good result was obtained.

Step 1. Gather all $n$ samples of the measurable variables and response variables and stack them into $U$ and $Y$, respectively. Normalize them to zero mean and unit variance, denoted as $U = [u_1 \cdots u_m] \in R^{n \times m}$ and $Y = [y_1 \cdots y_p] \in R^{n \times p}$.

Step 2. Calculate the regression coefficient matrix $M$:

$$
\frac{1}{n} Y^T U \approx M^T \frac{U^T U}{n}, \qquad
M = \begin{cases}
(U^T U)^{-1} U^T Y, & \text{if } \operatorname{rank}(U^T U) = m, \\
(U^T U)^{\dagger} U^T Y, & \text{if } \operatorname{rank}(U^T U) < m,
\end{cases} \tag{10}
$$

in which $(U^T U)^{\dagger}$ is the pseudoinverse of $U^T U$.

Table 1: The wine quality of one sample given by human experts.

Human expert                 1   2   3   4   5   6   7   8   9   10  Mean points
Presentation (20 points)     9   10  10  10  10  9   10  10  10  7   9.5
Fragrance (30 points)        25  20  21  21  23  18  23  25  24  23  22.3
Mouth-feel (40 points)       31  26  29  25  28  32  28  32  27  26  28.4
Overall feeling (10 points)  9   8   8   9   9   8   9   9   8   9   8.6
Sum points (100 points)      74  64  68  65  70  67  70  76  69  65  68.8

Table 2: The standard deviation of the scores given by the two groups of human experts.

Group                    1       2
The standard deviation   7.3426  3.978

Step 3. Obtain the final model of MPLSR:

$$Y = UM + E_y, \tag{11}$$

where $E_y$ is the residual part of $Y$, which is uncorrelated with $U$.
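The MPLSR steps reduce to a single regression-matrix computation, sketched below in NumPy. The name `mplsr_fit`, the singular-value cutoff `rcond=1e-10`, and the synthetic data are assumptions made for illustration; only the shape (20 samples, 87 variables) follows the calibration set described in this paper.

```python
import numpy as np

def mplsr_fit(U, Y, rcond=1e-10):
    """Regression matrix of Eq. (10); U and Y are assumed normalized."""
    m = U.shape[1]
    G = U.T @ U
    if np.linalg.matrix_rank(G) == m:
        return np.linalg.solve(G, U.T @ Y)          # (U^T U)^{-1} U^T Y
    # Rank-deficient case: pseudoinverse of the symmetric matrix U^T U.
    # The cutoff rcond is an assumed tolerance, not a value from the paper.
    return np.linalg.pinv(G, rcond=rcond, hermitian=True) @ U.T @ Y

# Synthetic stand-in with 20 samples and 87 variables (not the wine data).
rng = np.random.default_rng(1)
U = rng.normal(size=(20, 87))
Y = rng.normal(size=(20, 1))
M = mplsr_fit(U, Y)
E_y = Y - U @ M                                     # residual of Eq. (11)
```

When there are fewer samples than variables and the rows of `U` are linearly independent, the pseudoinverse solution reproduces the calibration responses up to rounding, so `E_y` is numerically zero. A zero calibration error therefore reflects the dimensions of the problem and says little about prediction on new samples.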

3. Results and Discussion

To classify and identify red wine, Wu et al. [24] proposed a technique based on spectra and pattern recognition. This paper, in contrast, is devoted to comparing the multivariate methods of Section 2 and predicting wine quality for the purpose of wine classification and target marketing.

In this section, we use three multivariate statistical methods based on soft measurement, namely principal component regression (PCR), partial least squares regression (PLSR), and modified partial least squares regression (MPLSR), to find which best captures the relationship between the red wine physicochemical indexes and the wine quality. We leave out ordinary least squares regression (OLSR), which is not suited to this situation. The fitting efficiency and prediction ability of these methods are compared using the relevant figures and indexes, from which the best model can be identified.

3.1. Data Preprocessing. The data set used in this paper (from the official mathematical modeling site, http://www.mcm.edu.cn) contains complete records of the red wine quality scores, physicochemical indexes, and aroma substances; the last two are collectively referred to as physicochemical indexes in the following.

The scores given by the two groups of professional tasters differ noticeably, so reconciling the scores is the first task. We analyze the two sets of scores in four aspects: presentation (20 points), fragrance (30 points), mouth-feel (40 points), and overall feeling (10 points), for a total of 100 points. As an example, the scores for one sample are given in Table 1.

In the data set, the same 27 wine samples are scored by both groups, and each group has 10 tasters of different levels. The missing data in the first group are replaced by the average value (AV). The standard deviation (SD) of the first group is 7.3426 and that of the second group is 3.978.

The average value of the samples in both groups is calculated as follows, and the standard deviations of the two groups are listed in Table 2:

$$AV = \frac{1}{N} \sum_{i=1}^{N} x_i, \qquad SD = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - AV)^2}. \tag{12}$$
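As a small worked example of the average value and standard deviation, the snippet below applies them to the ten sum scores of the sample in Table 1. It takes SD to be the population standard deviation (division by $N$ inside the square root), which is an assumption about the intended normalization.

```python
import numpy as np

# Sum points given by the ten experts to one sample (last row of Table 1).
scores = np.array([74, 64, 68, 65, 70, 67, 70, 76, 69, 65], dtype=float)

av = scores.mean()                              # average value AV
sd = np.sqrt(np.mean((scores - av) ** 2))       # SD, assumed population form
```

The computed `av` reproduces the mean of 68.8 listed in Table 1. The group-level values in Table 2 are computed over all samples of a group and cannot be reproduced from this single sample.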

The standard deviation calculated for the second group is clearly smaller than that of the first group, so the second group provides the more reliable result. We therefore take the scores of the second group as the actual wine quality, against which the fitted wine quality is compared. To construct the predicting models, 20 randomly selected samples are used as the calibration set. The remaining samples are used as the verification set to verify the prediction ability of the models.

3.2. Modeling and Comparison. To ensure that the relation studied is the one between the indexes and the quality of wine, the following assumptions are made.

(1) The vinification processes are identical, and the environment of vinification is the same.

(2) The scores given by the expert tasters approach the real quality scores.

(3) The grapes used to vinify a given wine are of one species.

The purpose of these assumptions is that only the wine physicochemical indexes affect the quality of wine. We focus on the relations between the physicochemical indexes and the quality of red wine and ignore other factors.

OLSR is not applicable when there is multicollinearity in the calibration set, nor when the number of samples is smaller than the number of variables. In this paper, therefore, OLSR is not employed to establish a model.

Figure 1: PCR fitted quality versus actual quality and predicted quality versus actual quality.

The other multivariate statistical methods based on soft measurement, namely PCR, PLSR, and MPLSR, are utilized to establish models. We use the root mean squared error of calibration (RMSEC) and the root mean squared error of prediction (RMSEP): RMSEC evaluates the fitting efficiency, and RMSEP demonstrates the prediction ability of the different methods [9]. RMSEC and RMSEP can be described as follows:

$$\mathrm{RMSEC} = \sqrt{\frac{\sum_{i=1}^{N} (y_i - y_{\mathrm{cal},i})^2}{N}}, \qquad \mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{M} (y_i - y_{\mathrm{pred},i})^2}{M}}, \tag{13}$$

where $y_i$ is the measured value for the $i$th sample, $y_{\mathrm{cal},i}$ and $y_{\mathrm{pred},i}$ are the model-fitted and model-predicted response values for the $i$th sample, respectively, and $N$ and $M$ are the numbers of calibration samples and verification samples, respectively.
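Both error measures share one formula and differ only in the sample set they are evaluated on, which a short helper makes explicit. The function name `rmse` and the three scores below are hypothetical and are not taken from the wine data.

```python
import numpy as np

def rmse(y_measured, y_model):
    """Root mean squared error: RMSEC on the calibration set,
    RMSEP on the verification set."""
    y_measured = np.asarray(y_measured, dtype=float)
    y_model = np.asarray(y_model, dtype=float)
    return float(np.sqrt(np.mean((y_measured - y_model) ** 2)))

# Hypothetical quality scores, for illustration only.
y_true = [70.0, 65.0, 72.0]
y_fit = [71.0, 65.0, 70.0]
rmsec = rmse(y_true, y_fit)     # errors -1, 0, 2 give sqrt(5/3)
```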

The cumulative percent variance (CPV) criterion [19] is first employed to select the number of PCs: with a CPV threshold of 98%, 17 PCs are obtained for PCR. With the help of cross validation [22, 25], 4 LVs are chosen for PLSR. MPLSR requires no selection of PCs or LVs, a step that is the main difficulty for PCR and PLSR in preserving the information in the original data set.
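A CPV-based selection rule can be sketched as below: keep the smallest number of components whose eigenvalues account for at least the chosen fraction of the total variance. The function name and the example eigenvalues are hypothetical; only the 98% threshold comes from the text.

```python
import numpy as np

def n_components_by_cpv(eigvals, threshold=0.98):
    """Smallest l such that the l largest eigenvalues reach the CPV threshold."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    cpv = np.cumsum(lam)
    cpv = cpv / cpv[-1]                      # cumulative fraction of variance
    # Index of the first entry with cpv >= threshold, converted to a count.
    l = int(np.searchsorted(cpv, threshold)) + 1
    return min(l, lam.size)

# Hypothetical eigenvalues: cumulative fractions 0.50, 0.80, 0.95, 0.99, 1.00.
l = n_components_by_cpv([5.0, 3.0, 1.5, 0.4, 0.1], threshold=0.98)
```

For these example eigenvalues, four components are needed to pass the 98% threshold.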

According to Figures 1, 2, and 3 and Table 3, the choice of multivariate statistical method significantly affects the performance of the predicting models. Owing to multicollinearity and the smaller number of samples than variables in the calibration set, OLSR is entirely unsuitable for establishing the prediction model. MPLSR reaches the minimal value of RMSEC and thus obtains a satisfying result for fitting efficiency. However, PCR, PLSR, and MPLSR have similar prediction ability in this paper.

3.3. Contribution Ratio of Wine Physicochemical Indexes. In this paper, 87 wine physicochemical indexes are collected in the calibration set, and these indexes provide adequate information. Measuring such a large number of indexes, however, is time-consuming and complex work, and it is well known that some wine physicochemical indexes do not contribute to the wine quality. To avoid lengthy and complex calculations, it is necessary to analyze the contribution of every wine physicochemical index and select the main ones.

To analyze the contribution of every wine physicochemical index to the wine quality, a contribution ratio (CR) is introduced, formulated as follows:

$$y - \beta_0 = BX, \qquad \mathrm{CR}_i = \frac{\beta_i x_i}{y - \beta_0}, \quad i = 1, \ldots, m, \qquad B = [\beta_1 \; \beta_2 \cdots \beta_m] \in R^{1 \times m}, \qquad X = [x_1 \; x_2 \cdots x_m]^T \in R^{m \times 1}, \tag{14}$$

where $B$ is the coefficient matrix of the regression equation and $\beta_0$ is a constant. The benefit of this definition of CR is that the sum of all $\mathrm{CR}_i$ ($i = 1, \ldots, m$) equals 1.
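The contribution ratio can be computed directly from the regression coefficients and the index values, as in the sketch below. The function name and the numbers are hypothetical and serve only to illustrate the definition.

```python
import numpy as np

def contribution_ratio(beta, x):
    """CR_i = beta_i * x_i / (y - beta_0), where y - beta_0 = sum_i beta_i * x_i."""
    beta = np.asarray(beta, dtype=float)
    x = np.asarray(x, dtype=float)
    total = beta @ x                # y - beta_0 = B X
    return beta * x / total         # undefined when total is zero

# Hypothetical coefficients and index values, for illustration only.
cr = contribution_ratio(beta=[0.5, -0.2, 1.0], x=[2.0, 1.0, 0.5])
```

The ratios sum to 1 by construction. Individual ratios can be negative when a term $\beta_i x_i$ has the opposite sign to $y - \beta_0$, which is consistent with a contribution plot whose vertical axis extends below zero.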

The contribution of the wine physicochemical indexes is shown in Figure 4. As can be seen from the figure, the indexes with great influence on the wine quality are cis-resveratrol, color, acetic acid 2-methylpropyl ester, butanedioic acid diethyl ester, and so forth. On the contrary, some physicochemical indexes can be ignored, such as trans-resveratrol, TR, limonene, 1-butanol 3-methyl, styrene, and so forth. Since some variables can be controlled in the production process, this information can be used to improve the wine quality. Detailed information is given in [9].

Figure 2: PLSR fitted quality versus actual quality and predicted quality versus actual quality.

Figure 3: MPLSR fitted quality versus actual quality and predicted quality versus actual quality.

Table 3: RMSEC and RMSEP values.

         PCR    PLSR   MPLSR
RMSEC    0.62   0.60   0
RMSEP    3.07   3.20   3.01

4. Conclusion

In this paper, multivariate methods based on soft measurement, namely ordinary least squares regression (OLSR), principal component regression (PCR), partial least squares regression (PLSR), and modified partial least squares regression (MPLSR), are reviewed. With these methods and real data obtained in practice, wine quality prediction models have been constructed. One purpose of this work is to choose the best method among OLSR, PLSR, PCR, and MPLSR. The model built by MPLSR performs the best among the models compared, with superior fitting efficiency as measured by RMSEC. The models built by PCR, PLSR, and MPLSR have similar prediction ability as measured by RMSEP. Several wine physicochemical indexes contribute significantly to the final wine quality, while others are insignificant. The efficiency of the MPLSR model is validated on a larger data set, and because the method is based on objective tests, it can improve the speed and quality of the oenologist's work. The outcome of this work is useful for the wine industry: from the perspective of producers, the method can save manpower and financial resources.

As science and technology are transformed into productivity, advanced technology is widely used in industry. At the same time, soft measurement continues to develop and has achieved comprehensive results in process control research. This technology breaks the traditional pattern of single input, single output (SISO). Computer technology is conducive to soft measurement, improving the availability of soft-sensing techniques and reducing the difficulty of application. We believe that soft measurement will have wide application prospects with the integration of computer technology and advanced technology.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Figure 4: Contribution ratio of wine physicochemical indexes in MPLSR.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (no. 61304102) and the Natural Science Foundation of Liaoning Province, China (no. 2013020002).

References

[1] G. Schamel and K. Anderson, "Wine quality and varietal, regional and winery reputations: hedonic prices for Australia and New Zealand," Economic Record, vol. 79, no. 246, pp. 357–369, 2003.

[2] P. Cortez, A. Cerdeira, F. Almeida, T. Matos, and J. Reis, "Modeling wine preferences by data mining from physicochemical properties," Decision Support Systems, vol. 47, no. 4, pp. 547–553, 2009.

[3] S. Buratti, S. Benedetti, M. Scampicchio, and E. C. Pangerod, "Characterization and classification of Italian Barbera wines by using an electronic nose and an amperometric electronic tongue," Analytica Chimica Acta, vol. 525, no. 1, pp. 133–139, 2004.

[4] A. van den Bos, "Application of statistical parameter estimation methods to physical measurements," Journal of Physics E: Scientific Instruments, vol. 10, no. 8, article 753, 1977.

[5] J. Liang and J. Qian, "Multivariate statistical process monitoring and control: recent developments and applications to chemical industry," Chinese Journal of Chemical Engineering, vol. 11, no. 2, pp. 191–203, 2003.

[6] J.-M. Riviere, M. Bayart, J.-M. Thiriet, A. Boras, and M. Robert, "Intelligent instruments: some modelling approaches," Measurement and Control, vol. 29, no. 6, pp. 179–186, 1996.

[7] J. Yu and C. Zhou, "Soft-sensing techniques in process control," Control Theory and Applications, vol. 13, no. 2, pp. 137–144, 1996.

[8] J. Yu, "Soft-sensing technology and its application," Process Automation Instrumentation, vol. 29, no. 1, pp. 1–7, 2008.

[9] S. Yin, X. Zhu, and H. R. Karimi, "Quality evaluation based on multivariate statistical methods," Mathematical Problems in Engineering, vol. 2013, Article ID 639652, 10 pages, 2013.

[10] K. Pearson, "On lines and planes of closest fit to systems of points in space," Philosophical Magazine, vol. 2, no. 6, pp. 559–572, 1901.

[11] P. J. A. Shaw, Multivariate Statistics for the Environmental Sciences, Hodder Arnold, London, UK, 2003.

[12] H. Abdi and L. J. Williams, "Principal component analysis," Wiley Interdisciplinary Reviews: Computational Statistics, vol. 2, no. 4, pp. 433–459, 2010.

[13] J. E. Jackson, A User's Guide to Principal Components, vol. 587, John Wiley & Sons, New York, NY, USA, 2005.

[14] I. Jolliffe, "Principal component analysis," in Encyclopedia of Statistics in Behavioral Science, John Wiley & Sons, New York, NY, USA, 2005.

[15] S. Weisberg, Applied Linear Regression, vol. 528, John Wiley & Sons, New York, NY, USA, 2005.

[16] C. Xie, D. Chen, R. Zheng, and Z. Xie, "Principal component analysis on aroma constituents of seven high-aroma pattern oolong teas," in Journal of South China Agricultural University, vol. 20, pp. 113–117, 1998.

[17] J. Yang, H. Tong, and L. Jia, "Principal composition analysis of sensory and physiochemical quality of fermented bean curd," in Transactions of the Chinese Society of Agricultural Engineering, vol. 2, 2002.

[18] Y. Bao, Z. Hu, Y. Bai, and R. Guo, "Application of principal component analysis and cluster analysis to evaluating ecological safety of land use," Transactions of the Chinese Society of Agricultural Engineering, vol. 22, no. 8, pp. 87–90, 2006.

[19] S. Valle, W. Li, and S. J. Qin, "Selection of the number of principal components: the variance of the reconstruction error criterion with a comparison to other methods," Industrial & Engineering Chemistry Research, vol. 38, no. 11, pp. 4389–4401, 1999.

[20] P. Geladi and B. R. Kowalski, "Partial least-squares regression: a tutorial," Analytica Chimica Acta, vol. 185, pp. 1–17, 1986.

[21] S. Yin, S. X. Ding, A. Haghani, H. Hao, and P. Zhang, "A comparison study of basic data-driven fault diagnosis and process monitoring methods on the benchmark Tennessee Eastman process," Journal of Process Control, vol. 22, no. 9, pp. 1567–1581, 2012.

[22] S. Wold, M. Sjostrom, and L. Eriksson, "PLS-regression: a basic tool of chemometrics," Chemometrics and Intelligent Laboratory Systems, vol. 58, no. 2, pp. 109–130, 2001.

[23] S. Yin, S. X. Ding, P. Zhang, A. Hagahni, and A. Naik, "Study on modifications of PLS approach for process monitoring," Threshold, vol. 2, pp. 12389–12394, 2011.

[24] G.-F. Wu, Y.-H. Jiang, Y.-Y. Wang, and Y. He, "Discrimination of varieties of dry red wines based on independent component analysis and BP neural network," Spectroscopy and Spectral Analysis, vol. 29, no. 5, pp. 1268–1271, 2009.

[25] X. Su, Z. Li, Y. Feng, and L. Wu, "New global exponential stability criteria for interval-delayed neural networks," Journal of Systems and Control Engineering, vol. 225, no. 1, pp. 125–136, 2011.


Abstract and Applied Analysis 3

in which $P$ is the so-called loading matrix of the covariance matrix.

Step 3. Determine the number of principal components with an appropriate criterion [19] and calculate the score matrix $T$:

$$T = U P_{\mathrm{pc}} \in \mathbb{R}^{n \times l}, \tag{5}$$

where

$$\begin{aligned}
P_{\mathrm{pc}} &\subset P = [P_{\mathrm{pc}} \;\; P_{\mathrm{res}}] \in \mathbb{R}^{m \times m}, \\
P_{\mathrm{pc}} &= [p_1 \cdots p_l] \in \mathbb{R}^{m \times l}, \\
P_{\mathrm{res}} &= [p_{l+1} \cdots p_m] \in \mathbb{R}^{m \times (m-l)},
\end{aligned} \tag{6}$$

and $l$ is the number of principal components.

Step 4. Perform OLS regression between the score matrix $T$ and the dependent matrix $Y$ and obtain the final model:

$$\begin{aligned}
\beta_i &= (T^T T)^{-1} T^T y_i, \quad i = 1, \ldots, p, \\
B &= [\beta_1 \cdots \beta_p] \in \mathbb{R}^{l \times p}, \\
Y &= TB + E.
\end{aligned} \tag{7}$$

Namely, $Y = UM + E$ with $M = P_{\mathrm{pc}} B$.
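Steps 3 and 4 can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes that $U$ and $Y$ are already normalized, that the covariance matrix is estimated as $U^T U/(n-1)$, and that the number of components $l$ has already been chosen. The function name `pcr_fit` and the random data are hypothetical.

```python
import numpy as np

def pcr_fit(U, Y, n_components):
    """Principal component regression on normalized data (Steps 3 and 4)."""
    n = U.shape[0]
    # Assumed covariance estimate of the normalized measurable variables.
    cov = U.T @ U / (n - 1)
    eigvals, P = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # reorder to descending
    P_pc = P[:, order[:n_components]]       # loading vectors p_1 .. p_l
    T = U @ P_pc                            # score matrix, eq. (5)
    B = np.linalg.solve(T.T @ T, T.T @ Y)   # OLS on the scores, eq. (7)
    M = P_pc @ B                            # so that Y is approximated by U M
    return M, T, B

# Hypothetical data: 30 samples, 5 measurable variables, 2 responses.
rng = np.random.default_rng(0)
U = rng.normal(size=(30, 5))
U = (U - U.mean(axis=0)) / U.std(axis=0)
Y = U @ rng.normal(size=(5, 2)) + 0.1 * rng.normal(size=(30, 2))
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)

M, T, B = pcr_fit(U, Y, n_components=3)
M_full, _, _ = pcr_fit(U, Y, n_components=5)
M_ols = np.linalg.lstsq(U, Y, rcond=None)[0]
```

When all $m$ components are retained and $U$ has full column rank, the PCR coefficients coincide with the OLS solution, which is why `M_full` and `M_ols` agree here; dropping components trades some fit for stability.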

2.3. Partial Least Squares Regression (PLSR). The partial least squares method can be applied when the number of explanatory variables is very high. PLS generalizes and combines features of principal component analysis and multiple regression. Least squares regression and principal component regression extract different factor scores. PCR extracts the relevant information from the matrix $U$ alone and uses the resulting components to predict $Y$. When useful variables are only weakly correlated with the retained components, the reliability of the final PCR model goes down, and PCR has no means of correcting this. PLS, in contrast, decomposes $U$ and $Y$ and extracts components from both at the same time. The latent variables (LVs) of $U$ and $Y$ are strongly related, which ensures that the PLSR-based model can make predictions from measurable variables [20]. Only a few factors are selected to participate in the model. PLSR is widely used in many areas, such as fault detection, wine analysis, and chemistry [9, 21, 22]. We consider the standard PLSR method as follows.

Step 1. Collect the measurable variables $U$ and the response variables $Y$ and normalize the data sets, expressed as $U = [u_1 \cdots u_m] \in \mathbb{R}^{n \times m}$ and $Y = [y_1 \cdots y_p] \in \mathbb{R}^{n \times p}$, in which $u_i$ and $y_i$ are normalized to zero mean and unit variance.

Step 2. Calculate the following equations $\gamma$ times iteratively, starting from $U_1 = U$ and $Y_1 = Y$:

$$\begin{aligned}
(p_k, q_k) &= \arg\max_{\|p_k\| = 1,\ \|q_k\| = 1} p_k^T U_k^T Y_k q_k, \quad k = 1, \ldots, \gamma, \\
a_k &= U_k p_k, \qquad b_k = Y_k q_k, \\
U_k &= a_k c_k^T + U_{k+1}, \qquad Y_k = a_k r_k^T + Y_{k+1}, \\
c_k &= \frac{U_k^T a_k}{\|a_k\|^2}, \qquad r_k = \frac{Y_k^T a_k}{\|a_k\|^2}, \\
U_{k+1} &= U_k - a_k c_k^T, \qquad Y_{k+1} = Y_k - a_k r_k^T,
\end{aligned} \tag{8}$$

where $p_k$, $q_k$ and $a_k$, $b_k$ are the loading vectors and score vectors of $U$ and $Y$, respectively, and $\gamma$ is the number of latent variables (LVs), which is usually determined by the cross validation criterion [22].

Step 3. Store $p_k$, $a_k$, $c_k$, $r_k$ into $P$, $A$, $C$, $R$; the result of the standard PLSR algorithm can be expressed as

$$A = UP, \qquad U = AC^T + E, \qquad Y = AR^T + F. \tag{9}$$

Namely, $Y = UM + F$ with $M = PR^T$.
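The iteration of Step 2 and the final model of Step 3 can be sketched with NumPy as follows. The sketch rests on two assumptions that are ours rather than the text's: the unit vectors maximizing $p_k^T U_k^T Y_k q_k$ are taken as the leading singular vectors of $U_k^T Y_k$, and, because each $p_k$ acts on the deflated $U_k$, the matrix $P$ in $A = UP$ is computed as $W(C^T W)^{-1}$, where $W$ collects the vectors $p_k$ (a standard PLS identity). The function name `plsr_fit` and the random data are hypothetical; $q_k$ and $b_k$ are not needed for the regression matrix and are omitted.

```python
import numpy as np

def plsr_fit(U, Y, n_lv):
    """PLS regression with deflation of both blocks, following eq. (8)."""
    Uk, Yk = U.copy(), Y.copy()
    W, A, C, R = [], [], [], []
    for _ in range(n_lv):
        # Leading left singular vector of U_k^T Y_k gives the unit vector p_k.
        left, _sv, _right_t = np.linalg.svd(Uk.T @ Yk, full_matrices=False)
        p = left[:, 0]
        a = Uk @ p                    # score vector a_k
        c = Uk.T @ a / (a @ a)        # c_k
        r = Yk.T @ a / (a @ a)        # r_k
        Uk = Uk - np.outer(a, c)      # U_{k+1}
        Yk = Yk - np.outer(a, r)      # Y_{k+1}
        W.append(p)
        A.append(a)
        C.append(c)
        R.append(r)
    W, A, C, R = (np.column_stack(x) for x in (W, A, C, R))
    # Assumption: express the scores in terms of the original U, A = U P.
    P = W @ np.linalg.inv(C.T @ W)
    M = P @ R.T                       # regression matrix, Y approx. U M
    return M, A, P

# Hypothetical normalized data: 25 samples, 6 variables, 2 responses.
rng = np.random.default_rng(1)
U = rng.normal(size=(25, 6))
U = (U - U.mean(axis=0)) / U.std(axis=0)
Y = rng.normal(size=(25, 2))
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)

M, A, P = plsr_fit(U, Y, n_lv=3)
```

The score vectors produced this way are mutually orthogonal, so each latent variable adds information not already captured by the previous ones.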

2.4. Modified Partial Least Squares Regression (MPLSR). In practice, multicollinearity and having fewer samples than variables are two common phenomena. When OLSR is applied, multicollinearity leads to a serious stability problem. PCR and PLSR can both handle highly collinear data. PCR solves the collinearity problem efficiently by introducing principal components (PCs). PLSR considers both the outer relations ($U$ and $Y$ blocks individually) and the inner relation (linking both blocks), whereas PCR considers only the outer relation of the $U$ block. However, because both are dimension reduction methods, PCR and PLSR run the risk of losing useful information that is not captured by the selected PCs or LVs. To solve these problems, Yin et al. proposed MPLSR [23]. MPLSR has been validated on the industrial benchmark of the Tennessee Eastman process for fault detection, and a good result has been obtained.

Step 1. Gather all $n$ samples of the measurable variables and response variables and stack them into $U$ and $Y$, respectively. Normalize them to zero mean and unit variance, denoted as $U = [u_1 \cdots u_m] \in \mathbb{R}^{n \times m}$ and $Y = [y_1 \cdots y_p] \in \mathbb{R}^{n \times p}$.

Step 2. Calculate the regression coefficient matrix $M$:

$$\begin{aligned}
\frac{1}{n} Y^T U &\approx M^T \frac{U^T U}{n}, \\
M &= \begin{cases}
(U^T U)^{-1} U^T Y, & \text{if } \operatorname{rank}(U^T U) = m, \\
(U^T U)^{\dagger} U^T Y, & \text{if } \operatorname{rank}(U^T U) < m,
\end{cases}
\end{aligned} \tag{10}$$

in which $(U^T U)^{\dagger}$ is the pseudoinverse of $U^T U$.


Table 1: The wine quality of one sample given by human experts.

Human expert                   1   2   3   4   5   6   7   8   9  10   Mean points
Presentation (20 points)       9  10  10  10  10   9  10  10  10   7    9.5
Fragrance (30 points)         25  20  21  21  23  18  23  25  24  23   22.3
Mouth-feel (40 points)        31  26  29  25  28  32  28  32  27  26   28.4
Overall feeling (10 points)    9   8   8   9   9   8   9   9   8   9    8.6
Sum points (100 points)       74  64  68  65  70  67  70  76  69  65   68.8

Table 2: The standard deviation of the scores given by the two groups.

Group                     1        2
The standard deviation    7.3426   3.978

Step 3. Obtain the final model of MPLSR:

$$Y = UM + E_y, \tag{11}$$

where $E_y$ is the residual part of $Y$, which is uncorrelated with $U$.
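The regression matrix of equation (10) can be sketched directly with NumPy. This is an illustration under stated assumptions rather than the authors' code: the function names, the rank test via `numpy.linalg.matrix_rank`, the cutoff `tol` passed to the pseudoinverse, and the random data (20 samples and 87 variables, mirroring the sizes reported later in the paper) are all ours.

```python
import numpy as np

def standardize(X):
    """Normalize each column to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def mplsr_fit(U, Y, tol=1e-10):
    """Regression matrix M of eq. (10) for normalized U and Y."""
    m = U.shape[1]
    G = U.T @ U
    if np.linalg.matrix_rank(G) == m:
        # Full rank: ordinary inverse, computed by solving the normal equations.
        return np.linalg.solve(G, U.T @ Y)
    # Rank deficient (for example fewer samples than variables): pseudoinverse.
    return np.linalg.pinv(G, rcond=tol) @ U.T @ Y

# Hypothetical data with fewer samples than variables.
rng = np.random.default_rng(2)
U = standardize(rng.normal(size=(20, 87)))
Y = standardize(rng.normal(size=(20, 1)))
M = mplsr_fit(U, Y)
residual = Y - U @ M      # E_y on the calibration samples

# Full-rank case for comparison: 30 samples, 4 variables.
U_small = standardize(rng.normal(size=(30, 4)))
Y_small = standardize(rng.normal(size=(30, 1)))
M_small = mplsr_fit(U_small, Y_small)
M_ols = np.linalg.lstsq(U_small, Y_small, rcond=None)[0]
```

With fewer samples than variables, the pseudoinverse solution reproduces the calibration responses essentially exactly, so `residual` is numerically zero. A calibration error of zero is therefore expected in that setting and says little about prediction ability.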

3. Results and Discussion

To classify and identify red wine, Wu et al. [24] proposed a technique based on spectroscopy and pattern recognition. This paper, in contrast, is devoted to comparing the methods introduced above and predicting wine quality for the purposes of wine classification and target marketing.

In this section, we use three multivariate statistical methods based on soft measurement, namely principal component regression (PCR), partial least squares regression (PLSR), and modified partial least squares regression (MPLSR), to find which method best captures the relationship between the red wine physicochemical indexes and the wine quality. We leave out ordinary least squares regression (OLSR), which is not suited to this situation. The fitting efficiency and prediction ability of the methods are compared by means of the relevant figures and indexes, and the best model is identified.

3.1. Data Preprocessing. The data set used in this paper (mathematical modeling official site, http://www.mcm.edu.cn) includes red wine quality scores, physicochemical indexes, and aroma substances; the last two are collectively referred to as physicochemical indexes in the following.

The scores given by the professional tasters in the two groups differ noticeably, so reconciling the scores is the first task. We analyze the two sets of scores from four aspects: presentation (20 points), fragrance (30 points), mouth-feel (40 points), and overall feeling (10 points), for a total of 100 points. As an example, the scores for one sample are given in Table 1.

In the data set, the same 27 wine samples are scored by the first group and by the second group. There are 10 tasters of different levels in each group. The missing data in the first group are replaced by the average value (AV). The first group's standard deviation (SD) is 7.3426 and the other one's is 3.978.

The average value of the samples and the standard deviation of each group are calculated as follows; the standard deviations of the two groups are listed in Table 2:

$$AV = \frac{1}{N} \sum_{i=1}^{N} x_i, \qquad SD = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - AV)^2}. \tag{12}$$
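As a small worked example, the two statistics can be computed for the sum points of the single sample in Table 1. The sketch assumes the population form of the standard deviation (division by $N$ inside the square root); the function names are hypothetical, and the result is not expected to reproduce the group values in Table 2, which are computed over all samples.

```python
import math

def average_value(xs):
    """Arithmetic mean of the scores."""
    return sum(xs) / len(xs)

def standard_deviation(xs):
    """Population standard deviation (divides by N, assumed form)."""
    av = average_value(xs)
    return math.sqrt(sum((x - av) ** 2 for x in xs) / len(xs))

# Sum points of the ten experts for the sample in Table 1.
sum_points = [74, 64, 68, 65, 70, 67, 70, 76, 69, 65]
av = average_value(sum_points)        # 68.8, matching the table's mean
sd = standard_deviation(sum_points)   # about 3.71
```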

The standard deviation of the second group is clearly smaller than that of the first group, so the second group provides a more reliable result. We therefore choose the second group's scores as the actual wine quality against which the fitted wine quality is compared. To construct the prediction models, a set of 20 randomly selected samples is used as the calibration set. The remaining 7 samples are used as the verification set to verify the prediction ability of the models.

3.2. Modeling and Comparison. To isolate the influence of the physicochemical indexes on the quality of wine, the following assumptions are made:

(1) In this paper, the vinification processes are identical and the environment of vinification is the same.

(2) The scores given by the expert tasters approach the real quality scores.

(3) The grapes used to vinify a kind of wine are of one species.

The purpose of the assumptions above is to ensure that only the wine physicochemical indexes affect the quality of wine. We focus on the relations between the physicochemical indexes and the quality of red wine while ignoring other factors.

OLSR is not applicable when there is multicollinearity in the calibration set, and it cannot be used when the number of samples is smaller than the number of variables. In this paper, OLSR is therefore not employed to establish the model.

Figure 1: PCR fitted quality versus actual quality and predicted quality versus actual quality. (Panels: actual wine quality against PCR fitted wine quality; actual and fitted quality over the wine samples; actual wine quality against PCR predicted wine quality.)

The other multivariate statistical methods based on soft measurement, namely PCR, PLSR, and MPLSR, are used to establish the models. We use the root mean squared error of calibration (RMSEC) and the root mean squared error of prediction (RMSEP) as criteria: RMSEC evaluates the fitting efficiency and RMSEP demonstrates the prediction ability of the different methods [9]. RMSEC and RMSEP are defined as follows:

$$\mathrm{RMSEC} = \sqrt{\frac{\sum_{i=1}^{N} (y_i - y_{\mathrm{cal},i})^2}{N}}, \qquad \mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{M} (y_i - y_{\mathrm{pred},i})^2}{M}}, \tag{13}$$

where $y_i$ is the measured value for the $i$th sample, $y_{\mathrm{cal},i}$ and $y_{\mathrm{pred},i}$ are the model fitted and model predicted response values for the $i$th sample, respectively, and $N$ and $M$ are the numbers of calibration samples and verification samples, respectively.
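Both criteria are the same root mean squared error applied to different sample sets, which the following sketch makes explicit. The function name `rmse` and the example scores are hypothetical and are not taken from the data set.

```python
import math

def rmse(measured, estimated):
    """Root mean squared error between measured and estimated values."""
    if len(measured) != len(estimated):
        raise ValueError("both sequences must have the same length")
    squared = sum((y - e) ** 2 for y, e in zip(measured, estimated))
    return math.sqrt(squared / len(measured))

# Hypothetical quality scores, for illustration only.
y_calibration = [70.0, 72.0, 68.0]
y_fitted = [71.0, 70.0, 68.0]
rmsec = rmse(y_calibration, y_fitted)       # fitting efficiency

y_verification = [65.0, 74.0]
y_predicted = [68.0, 70.0]
rmsep = rmse(y_verification, y_predicted)   # prediction ability
```

RMSEC is evaluated on the samples used to build the model, so it can be driven to zero by a sufficiently flexible model; RMSEP on held-out samples is the more informative of the two for comparing methods.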

The cumulative percent variance (CPV) [19] is first employed to select the number of PCs; with a CPV of 98%, 17 PCs are obtained for PCR. With the help of cross validation [22, 25], 4 LVs are chosen for PLSR. MPLSR does not need to select PCs or LVs, a selection that is the main difficulty for PCR and PLSR in maintaining the integrity of the information in the original data set.
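The CPV rule can be sketched as follows, assuming it means retaining the smallest number of components whose eigenvalues account for at least the chosen fraction of the total variance. The function name and the eigenvalues are hypothetical.

```python
def n_components_by_cpv(eigenvalues, threshold=0.98):
    """Smallest number of components whose cumulative variance share
    reaches the threshold (assumed reading of the CPV criterion)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return count
    return len(ordered)

# Hypothetical covariance eigenvalues, for illustration only.
eigenvalues = [5.0, 3.0, 1.0, 0.5, 0.5]
l_98 = n_components_by_cpv(eigenvalues)        # all 5 are needed for 98%
l_90 = n_components_by_cpv(eigenvalues, 0.90)  # 3 components reach 90%
```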

According to Figures 1, 2, and 3 and Table 3, the choice of multivariate statistical method affects the resulting prediction models. Because of multicollinearity and because the number of samples is smaller than the number of variables in the calibration set, OLSR is not suited to establishing the prediction model. MPLSR reaches the minimal value of RMSEC and thus obtains a satisfying result for the fitting efficiency. However, PCR, PLSR, and MPLSR have similar prediction ability in this paper.

3.3. Contribution Ratio of Wine Physicochemical Indexes. In this paper, 87 wine physicochemical indexes are collected in the calibration set, and these indexes provide adequate information. Measuring such a large number of physicochemical indexes is time-consuming and complex work, and some of the indexes contribute little to the wine quality. To avoid lengthy and complex calculations, it is necessary to analyze the contribution of every wine physicochemical index and select the main ones.

To analyze the contribution of every wine physicochemical index to the wine quality, a contribution ratio (CR) is introduced, formulated as follows:

$$\begin{aligned}
y - \beta_0 &= BX, \qquad \mathrm{CR}_i = \frac{\beta_i x_i}{y - \beta_0}, \quad i = 1, \ldots, m, \\
B &= [\beta_1 \; \beta_2 \cdots \beta_m] \in \mathbb{R}^{1 \times m}, \\
X &= [x_1 \; x_2 \cdots x_m]^T \in \mathbb{R}^{m \times 1},
\end{aligned} \tag{14}$$

where $B$ is the coefficient matrix of the regression equation and $\beta_0$ is a constant. The benefit of this definition of CR is that the sum of all $\mathrm{CR}_i$ ($i = 1, \ldots, m$) equals 1.
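A short sketch shows why the ratios sum to 1: the denominator $y - \beta_0$ is itself the sum of the terms $\beta_i x_i$. The function name, coefficients, and index values below are hypothetical.

```python
def contribution_ratios(beta, x):
    """CR_i = beta_i * x_i / (y - beta_0), where y - beta_0 = sum_i beta_i * x_i."""
    terms = [b * xi for b, xi in zip(beta, x)]
    total = sum(terms)          # equals y - beta_0 by eq. (14)
    if total == 0:
        raise ValueError("y - beta_0 is zero, so the ratios are undefined")
    return [t / total for t in terms]

# Hypothetical regression coefficients and index values.
beta = [0.5, -0.2, 0.1]
x = [2.0, 1.0, 3.0]
cr = contribution_ratios(beta, x)
```

Individual ratios can be negative when a term $\beta_i x_i$ has the opposite sign to $y - \beta_0$, which is consistent with the negative range on the vertical axis of Figure 4.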

The contribution of the wine physicochemical indexes is shown in Figure 4. As can be seen from the figure, the indexes with the greatest influence on the wine quality are cis-resveratrol, color, acetic acid 2-methylpropyl ester, butanedioic acid diethyl ester, and so forth. In contrast, some physicochemical indexes can be ignored, such as trans-resveratrol, TR, limonene, 1-butanol 3-methyl, styrene, and so forth. Since some variables can be controlled in the production process, this information can be used to improve the wine quality. Detailed information is given in [9].

Figure 2: PLSR fitted quality versus actual quality and predicted quality versus actual quality. (Panels: actual wine quality against PLSR fitted wine quality; actual and fitted quality over the wine samples; actual wine quality against PLSR predicted wine quality.)

Figure 3: MPLSR fitted quality versus actual quality and predicted quality versus actual quality. (Panels: actual wine quality against MPLSR fitted wine quality; actual and fitted quality over the wine samples; actual wine quality against MPLSR predicted wine quality.)

4. Conclusion

In this paper, multivariate methods based on soft measurement, namely ordinary least squares regression (OLSR), principal component regression (PCR), partial least squares regression (PLSR), and modified partial least squares regression (MPLSR), are reviewed. With these methods and real data obtained in practice, wine quality prediction models have been constructed. One purpose of this work is to choose the best method among OLSR, PLSR, PCR, and MPLSR. The model built by MPLSR performs best, with superior fitting efficiency as measured by RMSEC. The models built by PCR, PLSR, and MPLSR have similar prediction ability as measured by RMSEP. Several wine physicochemical indexes contribute significantly to the final wine quality, while others are insignificant. The efficiency of the MPLSR model is validated on a larger data set, and because the method is based on objective tests, it aids the speed and quality of the oenologist's work. The outcome of the work is useful for the wine industry: with this method, producers can save manpower and financial resources.

As science and technology are transformed into productivity, advanced technology is widely used in industry. At the same time, soft measurement continues to develop and has achieved comprehensive results in process control research. This technology breaks the traditional pattern of single input single output (SISO). Computer technology is conducive to soft measurement, improving the availability of soft-sensing techniques and reducing the difficulty of application. We believe that soft measurement will have wide application prospects with the integration of computer technology and advanced technology.

Table 3: RMSEC and RMSEP values.

         PCR    PLSR   MPLSR
RMSEC    0.62   0.60   0
RMSEP    3.07   3.20   3.01

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Figure 4: Contribution ratio of wine physicochemical indexes in MPLSR.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (no. 61304102) and the Natural Science Foundation of Liaoning Province, China (no. 2013020002).

References

[1] G. Schamel and K. Anderson, "Wine quality and varietal, regional and winery reputations: hedonic prices for Australia and New Zealand," Economic Record, vol. 79, no. 246, pp. 357–369, 2003.

[2] P. Cortez, A. Cerdeira, F. Almeida, T. Matos, and J. Reis, "Modeling wine preferences by data mining from physicochemical properties," Decision Support Systems, vol. 47, no. 4, pp. 547–553, 2009.

[3] S. Buratti, S. Benedetti, M. Scampicchio, and E. C. Pangerod, "Characterization and classification of Italian Barbera wines by using an electronic nose and an amperometric electronic tongue," Analytica Chimica Acta, vol. 525, no. 1, pp. 133–139, 2004.

[4] A. van den Bos, "Application of statistical parameter estimation methods to physical measurements," Journal of Physics E: Scientific Instruments, vol. 10, no. 8, article 753, 1977.

[5] J. Liang and J. Qian, "Multivariate statistical process monitoring and control: recent developments and applications to chemical industry," Chinese Journal of Chemical Engineering, vol. 11, no. 2, pp. 191–203, 2003.

[6] J.-M. Riviere, M. Bayart, J.-M. Thiriet, A. Boras, and M. Robert, "Intelligent instruments: some modelling approaches," Measurement and Control, vol. 29, no. 6, pp. 179–186, 1996.

[7] J. Yu and C. Zhou, "Soft-sensing techniques in process control," Control Theory and Applications, vol. 13, no. 2, pp. 137–144, 1996.

[8] J. Yu, "Soft-sensing technology and its application," Process Automation Instrumentation, vol. 29, no. 1, pp. 1–7, 2008.

[9] S. Yin, X. Zhu, and H. R. Karimi, "Quality evaluation based on multivariate statistical methods," Mathematical Problems in Engineering, vol. 2013, Article ID 639652, 10 pages, 2013.

[10] K. Pearson, "On lines and planes of closest fit to systems of points in space," Philosophical Magazine, vol. 2, no. 6, pp. 559–572, 1901.

[11] P. J. A. Shaw, Multivariate Statistics for the Environmental Sciences, Hodder Arnold, London, UK, 2003.

[12] H. Abdi and L. J. Williams, "Principal component analysis," Wiley Interdisciplinary Reviews: Computational Statistics, vol. 2, no. 4, pp. 433–459, 2010.

[13] J. E. Jackson, A User's Guide to Principal Components, vol. 587, John Wiley & Sons, New York, NY, USA, 2005.

[14] I. Jolliffe, "Principal component analysis," in Encyclopedia of Statistics in Behavioral Science, John Wiley & Sons, New York, NY, USA, 2005.

[15] S. Weisberg, Applied Linear Regression, vol. 528, John Wiley & Sons, New York, NY, USA, 2005.

[16] C. Xie, D. Chen, R. Zheng, and Z. Xie, "Principal component analysis on aroma constituents of seven high-aroma pattern oolong teas," Journal of South China Agricultural University, vol. 20, pp. 113–117, 1998.

[17] J. Yang, H. Tong, and L. Jia, "Principal composition analysis of sensory and physiochemical quality of fermented bean curd," Transactions of the Chinese Society of Agricultural Engineering, vol. 2, 2002.

[18] Y. Bao, Z. Hu, Y. Bai, and R. Guo, "Application of principal component analysis and cluster analysis to evaluating ecological safety of land use," Transactions of the Chinese Society of Agricultural Engineering, vol. 22, no. 8, pp. 87–90, 2006.

[19] S. Valle, W. Li, and S. J. Qin, "Selection of the number of principal components: the variance of the reconstruction error criterion with a comparison to other methods," Industrial & Engineering Chemistry Research, vol. 38, no. 11, pp. 4389–4401, 1999.

[20] P. Geladi and B. R. Kowalski, "Partial least-squares regression: a tutorial," Analytica Chimica Acta, vol. 185, pp. 1–17, 1986.

[21] S. Yin, S. X. Ding, A. Haghani, H. Hao, and P. Zhang, "A comparison study of basic data-driven fault diagnosis and process monitoring methods on the benchmark Tennessee Eastman process," Journal of Process Control, vol. 22, no. 9, pp. 1567–1581, 2012.

[22] S. Wold, M. Sjostrom, and L. Eriksson, "PLS-regression: a basic tool of chemometrics," Chemometrics and Intelligent Laboratory Systems, vol. 58, no. 2, pp. 109–130, 2001.

[23] S. Yin, S. X. Ding, P. Zhang, A. Hagahni, and A. Naik, "Study on modifications of PLS approach for process monitoring," Threshold, vol. 2, pp. 12389–12394, 2011.

[24] G.-F. Wu, Y.-H. Jiang, Y.-Y. Wang, and Y. He, "Discrimination of varieties of dry red wines based on independent component analysis and BP neural network," Spectroscopy and Spectral Analysis, vol. 29, no. 5, pp. 1268–1271, 2009.

[25] X. Su, Z. Li, Y. Feng, and L. Wu, "New global exponential stability criteria for interval-delayed neural networks," Journal of Systems and Control Engineering, vol. 225, no. 1, pp. 125–136, 2011.


Page 4: Research Article Multivariate Methods Based Soft Measurement …downloads.hindawi.com/journals/aaa/2014/740754.pdf · 2019-07-31 · Research Article Multivariate Methods Based Soft

4 Abstract and Applied Analysis

Table 1 The wine quality of one sample given by human experts

Human expert 1 2 3 4 5 6 7 8 9 10 Mean pointsPresentation (20 points) 9 10 10 10 10 9 10 10 10 7 95Fragrance (30 points) 25 20 21 21 23 18 23 25 24 23 223Mouth-feel (40 points) 31 26 29 25 28 32 28 32 27 26 284Overall feeling (10 points) 9 8 8 9 9 8 9 9 8 9 86Sum points (100 points) 74 64 68 65 70 67 70 76 69 65 688

Table 2 The wine quality of one sample given by human experts

Group 1 2The standard deviation 73426 3978

Step 3 Obtain the final model of MPLSR

119884 = 119880119872 + 119864119910 (11)

where 119864119910is the residue part of 119884 which is uncorrelated with

119880

3 Results and Discussion

In order to classify and identify red wine [24] proposed thetechnology based on the spectrum and pattern recognitionNevertheless this paper is devoted to comparing thesemethods and predicting the wine quality for the purpose ofwine classification and target marketing

In this section we use three multivariate statisticalmethods based on soft measurement including principalcomponent regression (PCR) partial least squares regres-sion (PLSR) and modified partial least squares regression(MPLSR) to find which is the best method according to therelationship between the red wine physicochemical indexesand the wine quality We ignore ordinary least squaresregression (OLSR) which is not for this situation The fittingefficiency and prediction ability of these methods providedare compared by relevant figures or indexes and the bestmodel can be searched

31 Data Preprocessing Thedata set (mathematicalmodelingofficial site httpwwwmcmeducn) in this paper includesthe red wine qualities physicochemical indexes and aromasubstances perfectly of which the last two are collectivelyknown as physicochemical indexes in the following

The scores given by professional tasters in two groupshave obvious differences Solving the scores is a primary taskWe analyze two sets of scores from four aspects presentation(20 points) fragrance (30 points)mouth-feel (40 points) andoverall feeling (10 points) and the total is 100 For example theimportant information in one sample can be found in Table 1

In the data set 27 samples of wine in the first group areequal to those in the second group There are 10 tasters ondifferent levels in each group The missing data in the firstgroup is replaced by average value (AVG) The first grouprsquosstandard deviation (SD) is 73426 and the other onersquos is 3978

The average value of samples in both groups and standarddeviation of two groups are calculated in Table 2 respectively

119860119881 =

1

119873

119873

sum

119894=1

119909119894

119878119863 =

1

119873

radic

119873

sum

119894=1

(119909119894minus 119860119881)

2

(12)

Obviously the standard deviation of the second groupcalculated is smaller than the one of the first group Thesecond group provides a more reliable result than the firstone Sowe choose the second one as the actual wine quality tocompare with fitted wine quality To construct the predictingmodels the set of 20 random samples collected is usedas calibration set The remainder samples are used as theverification set to verify the prediction ability of the models

32 Modeling and Comparison In order to guarantee theinfluence between indexes and quality of wine some assump-tions are made as follows

(1) In this paper the vinification processes are identicaland the environment of vinification is the same

(2) The scores given by the expert tasters approach thereal quality scores

(3) The grapes used to vinify a kind of wine are of onespecies

The purpose of all assumptions above is that only thewinephysicochemical indexes can affect the quality of wine Weemphasize the relations between the physicochemical indexesand the quality of red wine while ignoring others

OLSR is not applicable to the situation ofmulticollinearityin the calibration set It is not utilized when the number ofsamples is smaller than the number of variables In this paperOLSR is not employed to establish the model

The other multivariate statistic methods based on softmeasurement including PCR PLSR andMPLSR are utilizedto establish models We propose the root mean squared errorof calibration (RMSEC) and root mean error of prediction(RMSEP) RMSEC evaluates the fitting efficiency andRMSEP

Abstract and Applied Analysis 5

60 80 10050

60

70

80

90

100

PCR fitted wine quality

Actu

al w

ine q

ualit

y

0 10 2060

65

70

75

80

Wine samplesW

ine q

ualit

y

Actual quality Fitted quality

60 80 10050

60

70

80

90

100

PCR predicted wine quality

Actu

al w

ine q

ualit

y

Figure 1 PCR fitted quality versus actual quality and predicted quality versus actual quality

demonstrates the prediction ability of different methods [9]RMSEC and RMSEP can be described as follows

RMSEC =radicsum119873

119894=1(119910119894minus 119910cal119894)

2

119873

RMSEP =radicsum119872

119894=1(119910119894minus 119910pred119894)

2

119872

(13)

where 119910119894is the measured value for the 119894th sample 119910cal119894 and

119910pred119894 are the method fitted and model predicted responsevalues for the 119894th sample respectively and 119873 and 119872 arethe number of calibration samples and verification samplesrespectively

The CPV2 (cumulative percent variance) [19] is firstlyemployed to select the number of PCs with 98 as CPV and17 PCs are obtained for PCRWith the help of cross validation[22 25] 4 LVs are chosen for PLSR There is no need forMPLSR to select the PCs or LVs that are the main difficultiesfor PCR andPLSR inmaintaining the integrity of informationfrom the origin data set

According to Figures 1 2 and 3 and Table 3 thesedifferent multivariate statistical methods will significantlyimpact the effect of the predicting models Due to theproblems of multicollinearity and the less number of samplesthan variables in the calibration set the OLSR is totally notfitted to establish the prediction model In order to reach theminimal values of RMSECMPLSR obtains a satisfying resultfor the fitting efficiency However PCR PLSR and MPLSRhave a similar prediction ability in this paper

3.3. Contribution Ratio of Wine Physicochemical Indexes. In this paper, 87 wine physicochemical indexes are collected in the calibration set, and these indexes provide adequate information. However, measuring such a large number of wine physicochemical indexes is time-consuming and complex work, and it is well known that some wine physicochemical indexes do not contribute to the wine quality. In order to avoid lengthy and complex calculations, it is necessary to analyze the contribution of every wine physicochemical index and to select the main ones.

In order to analyze the contribution of every wine physicochemical index to the wine quality, a contribution ratio (CR) is introduced, which can be formulated as follows:

$$y - \beta_0 = BX, \qquad \mathrm{CR}_i = \frac{\beta_i x_i}{y - \beta_0}, \quad i = 1, \ldots, m,$$
$$B = \begin{bmatrix} \beta_1 & \beta_2 & \cdots & \beta_m \end{bmatrix} \in R^{1 \times m}, \qquad X = \begin{bmatrix} x_1 & x_2 & \cdots & x_m \end{bmatrix}^{T} \in R^{m \times 1}, \tag{14}$$

where $B$ is the coefficient matrix of the regression equation and $\beta_0$ is a constant. The benefit of this definition of CR is that the sum of all $\mathrm{CR}_i$ ($i = 1, \ldots, m$) equals 1.
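The definition in (14) can be illustrated with a short sketch in plain Python. The function name `contribution_ratios`, the coefficients, and the index values are hypothetical, and the text does not state which values of $x_i$ (for example, a single sample or mean values) were used to produce Figure 4.

```python
def contribution_ratios(beta, x):
    """CR_i = beta_i * x_i / (y - beta_0), where y - beta_0 = sum_i beta_i * x_i."""
    if len(beta) != len(x):
        raise ValueError("beta and x must have the same length")
    terms = [b * xi for b, xi in zip(beta, x)]
    total = sum(terms)  # equals y - beta_0 = B X
    if total == 0:
        raise ValueError("y - beta_0 is zero, so CR is undefined")
    return [t / total for t in terms]


# Hypothetical regression coefficients and index values (not from the paper).
beta = [0.5, -0.25, 1.0]
x = [4.0, 4.0, 3.0]
cr = contribution_ratios(beta, x)
print(cr, sum(cr))
```

An individual ratio can be negative when the corresponding term lowers the predicted quality, while the ratios still sum to 1 by construction.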

The contribution of the wine physicochemical indexes is shown in Figure 4. As can be seen from the figure, the indexes with the greatest influence on the wine quality are cis-resveratrol, color, acetic acid 2-methylpropyl ester, butanedioic acid diethyl ester, and so forth. On the contrary, some physicochemical indexes can be ignored, such as trans-resveratrol, TR, limonene, 1-butanol 3-methyl, styrene, and so forth. Since some variables can be controlled in the production process, this information can be used to improve the wine quality. Detailed information is given in [9].

Figure 2: PLSR fitted quality versus actual quality and predicted quality versus actual quality. (Panels: actual wine quality against PLSR fitted wine quality; actual and fitted quality against wine samples; actual wine quality against PLSR predicted wine quality.)

Figure 3: MPLSR fitted quality versus actual quality and predicted quality versus actual quality. (Panels: actual wine quality against MPLSR fitted wine quality; actual and fitted quality against wine samples; actual wine quality against MPLSR predicted wine quality.)

4. Conclusion

In this paper, multivariate methods based on soft measurement, namely, ordinary least squares regression (OLSR), principal component regression (PCR), partial least squares regression (PLSR), and modified partial least squares regression (MPLSR), are reviewed. With these methods and real data obtained in practice, wine quality prediction models have been constructed. One purpose of this work is to choose the best method among OLSR, PCR, PLSR, and MPLSR. The model built by MPLSR performs best among the four models, with the superior fitting efficiency represented by RMSEC. The models built by PCR, PLSR, and MPLSR have a similar prediction ability represented by RMSEP. Several wine physicochemical indexes contribute significantly to the final wine quality, while others are insignificant. The efficiency of the MPLSR model is validated on a larger data set, and, because the method is based on objective tests, it aids the speed and quality of the oenologist's work. The outcome of the work is useful for the wine industry: from the industry's perspective, this method can save manpower and financial resources.

As science and technology are transformed into productivity, advanced technology is widely used in industry. At the same time, soft measurement is developing and has achieved comprehensive results in process control research. This technology breaks the traditional pattern of single input single output (SISO). Computer technology is conducive to soft measurement, improving the availability of soft-sensing techniques and reducing the difficulty of application. We believe that soft measurement will have wide application prospects with the integration of computer technology and advanced technology.

Table 3: RMSEC and RMSEP values.

         PCR    PLSR   MPLSR
RMSEC    0.62   0.60   0
RMSEP    3.07   3.20   3.01

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Figure 4: Contribution ratio of wine physicochemical indexes in MPLSR.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (no. 61304102) and the Natural Science Foundation of Liaoning Province, China (no. 2013020002).

References

[1] G. Schamel and K. Anderson, “Wine quality and varietal, regional and winery reputations: hedonic prices for Australia and New Zealand,” Economic Record, vol. 79, no. 246, pp. 357–369, 2003.

[2] P. Cortez, A. Cerdeira, F. Almeida, T. Matos, and J. Reis, “Modeling wine preferences by data mining from physicochemical properties,” Decision Support Systems, vol. 47, no. 4, pp. 547–553, 2009.

[3] S. Buratti, S. Benedetti, M. Scampicchio, and E. C. Pangerod, “Characterization and classification of Italian Barbera wines by using an electronic nose and an amperometric electronic tongue,” Analytica Chimica Acta, vol. 525, no. 1, pp. 133–139, 2004.

[4] A. van den Bos, “Application of statistical parameter estimation methods to physical measurements,” Journal of Physics E: Scientific Instruments, vol. 10, no. 8, article 753, 1977.

[5] J. Liang and J. Qian, “Multivariate statistical process monitoring and control: recent developments and applications to chemical industry,” Chinese Journal of Chemical Engineering, vol. 11, no. 2, pp. 191–203, 2003.

[6] J.-M. Riviere, M. Bayart, J.-M. Thiriet, A. Boras, and M. Robert, “Intelligent instruments: some modelling approaches,” Measurement and Control, vol. 29, no. 6, pp. 179–186, 1996.

[7] J. Yu and C. Zhou, “Soft-sensing techniques in process control,” Control Theory and Applications, vol. 13, no. 2, pp. 137–144, 1996.

[8] J. Yu, “Soft-sensing technology and its application,” Process Automation Instrumentation, vol. 29, no. 1, pp. 1–7, 2008.

[9] S. Yin, X. Zhu, and H. R. Karimi, “Quality evaluation based on multivariate statistical methods,” Mathematical Problems in Engineering, vol. 2013, Article ID 639652, 10 pages, 2013.

[10] K. Pearson, “On lines and planes of closest fit to systems of points in space,” Philosophical Magazine, vol. 2, no. 6, pp. 559–572, 1901.

[11] P. J. A. Shaw, Multivariate Statistics for the Environmental Sciences, Hodder Arnold, London, UK, 2003.

[12] H. Abdi and L. J. Williams, “Principal component analysis,” Wiley Interdisciplinary Reviews: Computational Statistics, vol. 2, no. 4, pp. 433–459, 2010.

[13] J. E. Jackson, A User’s Guide to Principal Components, vol. 587, John Wiley & Sons, New York, NY, USA, 2005.

[14] I. Jolliffe, “Principal component analysis,” in Encyclopedia of Statistics in Behavioral Science, John Wiley & Sons, New York, NY, USA, 2005.

[15] S. Weisberg, Applied Linear Regression, vol. 528, John Wiley & Sons, New York, NY, USA, 2005.

[16] C. Xie, D. Chen, R. Zheng, and Z. Xie, “Principal component analysis on aroma constituents of seven high-aroma pattern oolong teas,” in Journal of South China Agricultural University, vol. 20, pp. 113–117, 1998.

[17] J. Yang, H. Tong, and L. Jia, “Principal composition analysis of sensory and physiochemical quality of fermented bean curd,” in Transactions of the Chinese Society of Agricultural Engineering, vol. 2, 2002.

[18] Y. Bao, Z. Hu, Y. Bai, and R. Guo, “Application of principal component analysis and cluster analysis to evaluating ecological safety of land use,” Transactions of the Chinese Society of Agricultural Engineering, vol. 22, no. 8, pp. 87–90, 2006.

[19] S. Valle, W. Li, and S. J. Qin, “Selection of the number of principal components: the variance of the reconstruction error criterion with a comparison to other methods,” Industrial & Engineering Chemistry Research, vol. 38, no. 11, pp. 4389–4401, 1999.

[20] P. Geladi and B. R. Kowalski, “Partial least-squares regression: a tutorial,” Analytica Chimica Acta, vol. 185, pp. 1–17, 1986.

[21] S. Yin, S. X. Ding, A. Haghani, H. Hao, and P. Zhang, “A comparison study of basic data-driven fault diagnosis and process monitoring methods on the benchmark Tennessee Eastman process,” Journal of Process Control, vol. 22, no. 9, pp. 1567–1581, 2012.

[22] S. Wold, M. Sjöström, and L. Eriksson, “PLS-regression: a basic tool of chemometrics,” Chemometrics and Intelligent Laboratory Systems, vol. 58, no. 2, pp. 109–130, 2001.

[23] S. Yin, S. X. Ding, P. Zhang, A. Hagahni, and A. Naik, “Study on modifications of PLS approach for process monitoring,” Threshold, vol. 2, pp. 12389–12394, 2011.

[24] G.-F. Wu, Y.-H. Jiang, Y.-Y. Wang, and Y. He, “Discrimination of varieties of dry red wines based on independent component analysis and BP neural network,” Spectroscopy and Spectral Analysis, vol. 29, no. 5, pp. 1268–1271, 2009.

[25] X. Su, Z. Li, Y. Feng, and L. Wu, “New global exponential stability criteria for interval-delayed neural networks,” Journal of Systems and Control Engineering, vol. 225, no. 1, pp. 125–136, 2011.


Page 5: Research Article Multivariate Methods Based Soft Measurement …downloads.hindawi.com/journals/aaa/2014/740754.pdf · 2019-07-31 · Research Article Multivariate Methods Based Soft

Abstract and Applied Analysis 5

60 80 10050

60

70

80

90

100

PCR fitted wine quality

Actu

al w

ine q

ualit

y

0 10 2060

65

70

75

80

Wine samplesW

ine q

ualit

y

Actual quality Fitted quality

60 80 10050

60

70

80

90

100

PCR predicted wine quality

Actu

al w

ine q

ualit

y

Figure 1 PCR fitted quality versus actual quality and predicted quality versus actual quality

demonstrates the prediction ability of different methods [9]RMSEC and RMSEP can be described as follows

RMSEC =radicsum119873

119894=1(119910119894minus 119910cal119894)

2

119873

RMSEP =radicsum119872

119894=1(119910119894minus 119910pred119894)

2

119872

(13)

where 119910119894is the measured value for the 119894th sample 119910cal119894 and

119910pred119894 are the method fitted and model predicted responsevalues for the 119894th sample respectively and 119873 and 119872 arethe number of calibration samples and verification samplesrespectively

The CPV2 (cumulative percent variance) [19] is firstlyemployed to select the number of PCs with 98 as CPV and17 PCs are obtained for PCRWith the help of cross validation[22 25] 4 LVs are chosen for PLSR There is no need forMPLSR to select the PCs or LVs that are the main difficultiesfor PCR andPLSR inmaintaining the integrity of informationfrom the origin data set

According to Figures 1 2 and 3 and Table 3 thesedifferent multivariate statistical methods will significantlyimpact the effect of the predicting models Due to theproblems of multicollinearity and the less number of samplesthan variables in the calibration set the OLSR is totally notfitted to establish the prediction model In order to reach theminimal values of RMSECMPLSR obtains a satisfying resultfor the fitting efficiency However PCR PLSR and MPLSRhave a similar prediction ability in this paper

33 Contribution Ratio of Wine Physicochemical Indexes Inthis paper 87 wine physicochemical indexes are collected inthe calibration set and these indexes can provide adequateinformation The large wine physicochemical indexes to bemeasured are a time consuming and complex work We allknow some wine physicochemical indexes do not contribute

to the wine quality In order to avoid doing lengthy and com-plex calculations it is necessary to analyze the contribution ofevery wine physicochemical index and select the main winephysicochemical indexes

In order to analyze the contribution of every grapephysicochemical index to the wine quality a contributionratio (CR) is introduced and it can be formulated as follows

119910 minus 1205730= 119861119883 CR

119894=

120573119894119909119894

119910 minus 1205730

119894 = 1 sdot sdot sdot 119898

119861 = [1205731 1205732sdot sdot sdot 120573119898] isin 1198771times119898

119883 = [1199091 1199092sdot sdot sdot 119909119898]119879

isin 1198771times119898

(14)

where 119861 is the coefficient matrix of regression equation and1205730is a constant The benefit of this definition for CR is that

the sum of all CR119894(119894 = 1 sdot sdot sdot 119898) equals 1

Thefigure contribution ofwine physicochemical indexesis shown in Figure 4 As can be seen from the figure the greatinfluence on the wine quality is cis-resveratrol color aceticacid 2-methylpropyl ester butanedioic acid diethyl esterand so forth On the contrary some physicochemical indexescan be ignored such as transresveratrol TR limonene 1-butanol3-methyl styrene and so forth Since some variablescan be controlled in the production process this informationcan be used to improve the wine quality The detailedinformation is shown in [9]

4 Conclusion

In this paper multivariate methods based on soft measure-ment ordinary least squares regression (OLSR) principalcomponent regression (PCR) partial least squares regres-sion (PLSR) and modified partial least squares regression(MPLSR) are reviewedWith these methods and the real dataobtained in practice wine quality prediction models havebeen constructed One purpose of this work is to choosethe best method among OLSR PLS PCR and MPLSR The

6 Abstract and Applied Analysis

Actual qualityFitted quality

60 80 10050

60

70

80

90

100

PLSR fitted wine quality

Actu

al w

ine q

ualit

y

0 10 2060

65

70

75

80

Wine samplesW

ine q

ualit

y60 80 100

50

60

70

80

90

100

PLSR predicted wine quality

Actu

al w

ine q

ualit

y

Figure 2 PLSR fitted quality versus actual quality and predicted quality versus actual quality

Actual qualityFitted quality

60 80 10050

60

70

80

90

100

MPLSR fitted wine quality

Actu

al w

ine q

ualit

y

0 10 2060

65

70

75

80

Wine samples

Win

e qua

lity

60 80 10050

60

70

80

90

100

MPLSR predicted wine quality

Actu

al w

ine q

ualit

y

Figure 3 MPLSR fitted quality versus actual quality and predicted quality versus actual quality

model built by MPLSR performs the best among the fourmodels with the superior fitting efficiency represented byRMSECThe models built by PCR PLSR and MPLSR have asimilar prediction ability represented by RESEP Several winephysicochemical indexes have significant contributions to thefinal wine quality while others are insignificantThe efficiencyof the MPLSR model is validated on a larger data set whilethis method is based on the objective tests aiding the speedand quality of the oenologist performance The outcome ofthe work is useful for the wine industry From the perspectiveof industries they can savemanpower and financial resourcesin this method

With the science and technology transformed into pro-ductivity the advanced technology is widely used in industryAt the same time soft measurement is developing Softmeasurement has achieved comprehensive results in the pro-cess control research This technology breaks the traditionalpattern of single input single output (SISO) The computer

Table 3 RMSEC and RMSEP values

PCR PLSR MPLSRRMSEC 062 060 0RMSEP 307 320 301

technology is conducive to soft measurement improvingthe availability of soft-sensing technique and reducing thedifficulty of application We believe that soft measurementwill have a wide application prospect with the integration ofcomputer technology and advanced technology

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Abstract and Applied Analysis 7

0 10 20 30 40 50 60 70 80 90

0

01

02

03

04

05

minus02

minus01

Figure 4 Contribution ratio of wine physicochemical indexes inMPLSR

Acknowledgments

This work was supported by the National Natural ScienceFoundation of China (no 61304102) and the Natural ScienceFoundation of Liaoning Province China (no 2013020002)

References

[1] G Schamel and K Anderson ldquoWine quality and varietalregional and winery reputations hedonic prices for Australiaand New Zealandrdquo Economic Record vol 79 no 246 pp 357ndash369 2003

[2] P Cortez A Cerdeira F Almeida T Matos and J Reis ldquoMod-eling wine preferences by data mining from physicochemicalpropertiesrdquoDecision Support Systems vol 47 no 4 pp 547ndash5532009

[3] S Buratti S Benedetti M Scampicchio and E C PangerodldquoCharacterization and classification of Italian Barbera winesby using an electronic nose and an amperometric electronictonguerdquo Analytica Chimica Acta vol 525 no 1 pp 133ndash1392004

[4] A van den Bos ldquoApplication of statistical parameter estimationmethods to physical measurementsrdquo Journal of Physics EScientific Instruments vol 10 no 8 article 753 1977

[5] J Liang and J Qian ldquoMultivariate statistical process monitoringand control recent developments and applications to chemicalindustryrdquoChinese Journal of Chemical Engineering vol 11 no 2pp 191ndash203 2003

[6] J-M Riviere M Bayart J-M Thiriet A Boras and MRobert ldquoIntelligent instruments some modelling approachesrdquoMeasurement and Control vol 29 no 6 pp 179ndash186 1996

[7] J Yu and C Zhou ldquoSoft-sensing techniques in process controlrdquoControlTheory and Applications vol 13 no 2 pp 137ndash144 1996

[8] J Yu ldquoSoft-sensing technology and its applicationrdquo ProcessAutomation Instrumentation vol 29 no 1 pp 1ndash7 2008

[9] S Yin X Zhu and H R Karimi ldquoQuality evaluation basedon multivariate statistical methodsrdquo Mathematical Problems inEngineering vol 2013 Article ID 639652 10 pages 2013

[10] K Pearson ldquoOn lines and planes of closest fit to systems ofpoints in spacerdquo Philosophical Magazine vol 2 no 6 pp 559ndash572 1901

[11] P J A Shaw Multivariate Statistics for the EnvironmentalSciences Hodder Arnold London UK 2003

[12] H Abdi and L J Williams ldquoPrincipal component analysisrdquoWiley Interdisciplinary Reviews Computational Statistics vol 2no 4 pp 433ndash459 2010

[13] J E Jackson A Userrsquos Guide to Principal Components vol 587John Wiley amp Sons New York NY USA 2005

[14] I Jolliffe ldquoPrincipal component analysisrdquo in Encyclopedia ofStatistics in Behavioral Science John Wiley amp Sons New YorkNY USA 2005

[15] S Weisberg Applied Linear Regression vol 528 John Wiley ampSons New York NY USA 2005

[16] C Xie D Chen R Zheng and Z Xie ldquoPrincipal componentanalysis on aroma constituents of seven higharoma patternoolong teasrdquo in Journal of South China Agricultural Universityvol 20 pp 113ndash117 1998

[17] J Yang H Tong and L Jia ldquoPrincipal composition analysis ofsensory and physiochemical quality of fermented bean curdrdquo inTransactions of the Chinese Society of Agricultural Engineeringvol 2 2002

[18] Y Bao Z Hu Y Bai and R Guo ldquoApplication of principalcomponent analysis and cluster analysis to evaluating ecologicalsafety of land userdquo Transactions of the Chinese Society ofAgricultural Engineering vol 22 no 8 pp 87ndash90 2006

[19] S Valle W Li and S J Qin ldquoSelection of the number ofprincipal components the variance of the reconstruction errorcriterion with a comparison to other methodsrdquo Industrial ampEngineering Chemistry Research vol 38 no 11 pp 4389ndash44011999

[20] P Geladi and B R Kowalski ldquoPartial least-squares regression atutorialrdquo Analytica Chimica Acta vol 185 pp 1ndash17 1986

[21] S Yin S X Ding A Haghani H Hao and P Zhang ldquoAcomparison study of basic data-driven fault diagnosis andprocess monitoring methods on the benchmark TennesseeEastman processrdquo Journal of Process Control vol 22 no 9 pp1567ndash1581 2012

[22] S Wold M Sjostrom and L Eriksson ldquoPLS-regression a basictool of chemometricsrdquoChemometrics and Intelligent LaboratorySystems vol 58 no 2 pp 109ndash130 2001

[23] S Yin S X Ding P Zhang A Hagahni and A Naik ldquoStudyon modifications of pls approach for process monitoringrdquoThreshold vol 2 pp 12389ndash12394 2011

[24] G-F Wu Y-H Jiang Y-Y Wang and Y He ldquoDiscriminationof varieties of dry red wines based on independent componentanalysis and BP neural networkrdquo Spectroscopy and SpectralAnalysis vol 29 no 5 pp 1268ndash1271 2009

[25] X Su Z Li Y Feng and L Wu ldquoNew global exponentialstability criteria for interval-delayed neural networksrdquo Journalof Systems and Control Engineering vol 225 no 1 pp 125ndash1362011

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 6: Research Article Multivariate Methods Based Soft Measurement …downloads.hindawi.com/journals/aaa/2014/740754.pdf · 2019-07-31 · Research Article Multivariate Methods Based Soft

6 Abstract and Applied Analysis

Actual qualityFitted quality

60 80 10050

60

70

80

90

100

PLSR fitted wine quality

Actu

al w

ine q

ualit

y

0 10 2060

65

70

75

80

Wine samplesW

ine q

ualit

y60 80 100

50

60

70

80

90

100

PLSR predicted wine quality

Actu

al w

ine q

ualit

y

Figure 2 PLSR fitted quality versus actual quality and predicted quality versus actual quality

Actual qualityFitted quality

60 80 10050

60

70

80

90

100

MPLSR fitted wine quality

Actu

al w

ine q

ualit

y

0 10 2060

65

70

75

80

Wine samples

Win

e qua

lity

60 80 10050

60

70

80

90

100

MPLSR predicted wine quality

Actu

al w

ine q

ualit

y

Figure 3 MPLSR fitted quality versus actual quality and predicted quality versus actual quality

model built by MPLSR performs the best among the fourmodels with the superior fitting efficiency represented byRMSECThe models built by PCR PLSR and MPLSR have asimilar prediction ability represented by RESEP Several winephysicochemical indexes have significant contributions to thefinal wine quality while others are insignificantThe efficiencyof the MPLSR model is validated on a larger data set whilethis method is based on the objective tests aiding the speedand quality of the oenologist performance The outcome ofthe work is useful for the wine industry From the perspectiveof industries they can savemanpower and financial resourcesin this method

With the science and technology transformed into pro-ductivity the advanced technology is widely used in industryAt the same time soft measurement is developing Softmeasurement has achieved comprehensive results in the pro-cess control research This technology breaks the traditionalpattern of single input single output (SISO) The computer

Table 3 RMSEC and RMSEP values

PCR PLSR MPLSRRMSEC 062 060 0RMSEP 307 320 301

technology is conducive to soft measurement improvingthe availability of soft-sensing technique and reducing thedifficulty of application We believe that soft measurementwill have a wide application prospect with the integration ofcomputer technology and advanced technology

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Abstract and Applied Analysis 7

0 10 20 30 40 50 60 70 80 90

0

01

02

03

04

05

minus02

minus01

Figure 4 Contribution ratio of wine physicochemical indexes inMPLSR

Acknowledgments

This work was supported by the National Natural ScienceFoundation of China (no 61304102) and the Natural ScienceFoundation of Liaoning Province China (no 2013020002)

References

[1] G Schamel and K Anderson ldquoWine quality and varietalregional and winery reputations hedonic prices for Australiaand New Zealandrdquo Economic Record vol 79 no 246 pp 357ndash369 2003

[2] P Cortez A Cerdeira F Almeida T Matos and J Reis ldquoMod-eling wine preferences by data mining from physicochemicalpropertiesrdquoDecision Support Systems vol 47 no 4 pp 547ndash5532009

[3] S Buratti S Benedetti M Scampicchio and E C PangerodldquoCharacterization and classification of Italian Barbera winesby using an electronic nose and an amperometric electronictonguerdquo Analytica Chimica Acta vol 525 no 1 pp 133ndash1392004

[4] A van den Bos ldquoApplication of statistical parameter estimationmethods to physical measurementsrdquo Journal of Physics EScientific Instruments vol 10 no 8 article 753 1977

[5] J Liang and J Qian ldquoMultivariate statistical process monitoringand control recent developments and applications to chemicalindustryrdquoChinese Journal of Chemical Engineering vol 11 no 2pp 191ndash203 2003

[6] J-M Riviere M Bayart J-M Thiriet A Boras and MRobert ldquoIntelligent instruments some modelling approachesrdquoMeasurement and Control vol 29 no 6 pp 179ndash186 1996

[7] J Yu and C Zhou ldquoSoft-sensing techniques in process controlrdquoControlTheory and Applications vol 13 no 2 pp 137ndash144 1996

[8] J Yu ldquoSoft-sensing technology and its applicationrdquo ProcessAutomation Instrumentation vol 29 no 1 pp 1ndash7 2008

[9] S Yin X Zhu and H R Karimi ldquoQuality evaluation basedon multivariate statistical methodsrdquo Mathematical Problems inEngineering vol 2013 Article ID 639652 10 pages 2013

[10] K Pearson ldquoOn lines and planes of closest fit to systems ofpoints in spacerdquo Philosophical Magazine vol 2 no 6 pp 559ndash572 1901

[11] P J A Shaw Multivariate Statistics for the EnvironmentalSciences Hodder Arnold London UK 2003

[12] H Abdi and L J Williams ldquoPrincipal component analysisrdquoWiley Interdisciplinary Reviews Computational Statistics vol 2no 4 pp 433ndash459 2010

[13] J E Jackson A Userrsquos Guide to Principal Components vol 587John Wiley amp Sons New York NY USA 2005

[14] I Jolliffe ldquoPrincipal component analysisrdquo in Encyclopedia ofStatistics in Behavioral Science John Wiley amp Sons New YorkNY USA 2005

[15] S Weisberg Applied Linear Regression vol 528 John Wiley ampSons New York NY USA 2005

[16] C Xie D Chen R Zheng and Z Xie ldquoPrincipal componentanalysis on aroma constituents of seven higharoma patternoolong teasrdquo in Journal of South China Agricultural Universityvol 20 pp 113ndash117 1998

[17] J Yang H Tong and L Jia ldquoPrincipal composition analysis ofsensory and physiochemical quality of fermented bean curdrdquo inTransactions of the Chinese Society of Agricultural Engineeringvol 2 2002

[18] Y Bao Z Hu Y Bai and R Guo ldquoApplication of principalcomponent analysis and cluster analysis to evaluating ecologicalsafety of land userdquo Transactions of the Chinese Society ofAgricultural Engineering vol 22 no 8 pp 87ndash90 2006

[19] S Valle W Li and S J Qin ldquoSelection of the number ofprincipal components the variance of the reconstruction errorcriterion with a comparison to other methodsrdquo Industrial ampEngineering Chemistry Research vol 38 no 11 pp 4389ndash44011999

[20] P Geladi and B R Kowalski ldquoPartial least-squares regression atutorialrdquo Analytica Chimica Acta vol 185 pp 1ndash17 1986

[21] S Yin S X Ding A Haghani H Hao and P Zhang ldquoAcomparison study of basic data-driven fault diagnosis andprocess monitoring methods on the benchmark TennesseeEastman processrdquo Journal of Process Control vol 22 no 9 pp1567ndash1581 2012

[22] S Wold M Sjostrom and L Eriksson ldquoPLS-regression a basictool of chemometricsrdquoChemometrics and Intelligent LaboratorySystems vol 58 no 2 pp 109ndash130 2001

[23] S Yin S X Ding P Zhang A Hagahni and A Naik ldquoStudyon modifications of pls approach for process monitoringrdquoThreshold vol 2 pp 12389ndash12394 2011

[24] G-F Wu Y-H Jiang Y-Y Wang and Y He ldquoDiscriminationof varieties of dry red wines based on independent componentanalysis and BP neural networkrdquo Spectroscopy and SpectralAnalysis vol 29 no 5 pp 1268ndash1271 2009

[25] X Su Z Li Y Feng and L Wu ldquoNew global exponentialstability criteria for interval-delayed neural networksrdquo Journalof Systems and Control Engineering vol 225 no 1 pp 125ndash1362011

Figure 4: Contribution ratio of wine physicochemical indexes in MPLSR.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (no. 61304102) and the Natural Science Foundation of Liaoning Province, China (no. 2013020002).

References

[1] G. Schamel and K. Anderson, "Wine quality and varietal, regional and winery reputations: hedonic prices for Australia and New Zealand," Economic Record, vol. 79, no. 246, pp. 357–369, 2003.

[2] P. Cortez, A. Cerdeira, F. Almeida, T. Matos, and J. Reis, "Modeling wine preferences by data mining from physicochemical properties," Decision Support Systems, vol. 47, no. 4, pp. 547–553, 2009.

[3] S. Buratti, S. Benedetti, M. Scampicchio, and E. C. Pangerod, "Characterization and classification of Italian Barbera wines by using an electronic nose and an amperometric electronic tongue," Analytica Chimica Acta, vol. 525, no. 1, pp. 133–139, 2004.

[4] A. van den Bos, "Application of statistical parameter estimation methods to physical measurements," Journal of Physics E: Scientific Instruments, vol. 10, no. 8, article 753, 1977.

[5] J. Liang and J. Qian, "Multivariate statistical process monitoring and control: recent developments and applications to chemical industry," Chinese Journal of Chemical Engineering, vol. 11, no. 2, pp. 191–203, 2003.

[6] J.-M. Riviere, M. Bayart, J.-M. Thiriet, A. Boras, and M. Robert, "Intelligent instruments: some modelling approaches," Measurement and Control, vol. 29, no. 6, pp. 179–186, 1996.

[7] J. Yu and C. Zhou, "Soft-sensing techniques in process control," Control Theory and Applications, vol. 13, no. 2, pp. 137–144, 1996.

[8] J. Yu, "Soft-sensing technology and its application," Process Automation Instrumentation, vol. 29, no. 1, pp. 1–7, 2008.

[9] S. Yin, X. Zhu, and H. R. Karimi, "Quality evaluation based on multivariate statistical methods," Mathematical Problems in Engineering, vol. 2013, Article ID 639652, 10 pages, 2013.

[10] K. Pearson, "On lines and planes of closest fit to systems of points in space," Philosophical Magazine, vol. 2, no. 6, pp. 559–572, 1901.

[11] P. J. A. Shaw, Multivariate Statistics for the Environmental Sciences, Hodder Arnold, London, UK, 2003.

[12] H. Abdi and L. J. Williams, "Principal component analysis," Wiley Interdisciplinary Reviews: Computational Statistics, vol. 2, no. 4, pp. 433–459, 2010.

[13] J. E. Jackson, A User's Guide to Principal Components, vol. 587, John Wiley & Sons, New York, NY, USA, 2005.

[14] I. Jolliffe, "Principal component analysis," in Encyclopedia of Statistics in Behavioral Science, John Wiley & Sons, New York, NY, USA, 2005.

[15] S. Weisberg, Applied Linear Regression, vol. 528, John Wiley & Sons, New York, NY, USA, 2005.

[16] C. Xie, D. Chen, R. Zheng, and Z. Xie, "Principal component analysis on aroma constituents of seven high-aroma pattern oolong teas," in Journal of South China Agricultural University, vol. 20, pp. 113–117, 1998.

[17] J. Yang, H. Tong, and L. Jia, "Principal composition analysis of sensory and physiochemical quality of fermented bean curd," in Transactions of the Chinese Society of Agricultural Engineering, vol. 2, 2002.

[18] Y. Bao, Z. Hu, Y. Bai, and R. Guo, "Application of principal component analysis and cluster analysis to evaluating ecological safety of land use," Transactions of the Chinese Society of Agricultural Engineering, vol. 22, no. 8, pp. 87–90, 2006.

[19] S. Valle, W. Li, and S. J. Qin, "Selection of the number of principal components: the variance of the reconstruction error criterion with a comparison to other methods," Industrial & Engineering Chemistry Research, vol. 38, no. 11, pp. 4389–4401, 1999.

[20] P. Geladi and B. R. Kowalski, "Partial least-squares regression: a tutorial," Analytica Chimica Acta, vol. 185, pp. 1–17, 1986.

[21] S. Yin, S. X. Ding, A. Haghani, H. Hao, and P. Zhang, "A comparison study of basic data-driven fault diagnosis and process monitoring methods on the benchmark Tennessee Eastman process," Journal of Process Control, vol. 22, no. 9, pp. 1567–1581, 2012.

[22] S. Wold, M. Sjostrom, and L. Eriksson, "PLS-regression: a basic tool of chemometrics," Chemometrics and Intelligent Laboratory Systems, vol. 58, no. 2, pp. 109–130, 2001.

[23] S. Yin, S. X. Ding, P. Zhang, A. Haghani, and A. Naik, "Study on modifications of PLS approach for process monitoring," Threshold, vol. 2, pp. 12389–12394, 2011.

[24] G.-F. Wu, Y.-H. Jiang, Y.-Y. Wang, and Y. He, "Discrimination of varieties of dry red wines based on independent component analysis and BP neural network," Spectroscopy and Spectral Analysis, vol. 29, no. 5, pp. 1268–1271, 2009.

[25] X. Su, Z. Li, Y. Feng, and L. Wu, "New global exponential stability criteria for interval-delayed neural networks," Journal of Systems and Control Engineering, vol. 225, no. 1, pp. 125–136, 2011.
