


Computers and Electronics in Agriculture 80 (2012) 54–62


Comparative analysis of learning and meta-learning algorithms for creating models for predicting the probable alcohol level during the ripening of grape berries

Roberto Fernandez Martinez, Ruben Lostado Lorza, Julio Fernandez Ceniceros, F. Javier Martinez-de-Pison Ascacibar
EDMANS Group, University of La Rioja, Logroño, Spain

Article history: Received 24 February 2011; Received in revised form 10 October 2011; Accepted 16 October 2011.

Keywords: Grape sugar concentration; Learning and meta-learning algorithms; Feature selection; Probable alcohol level; Bagging algorithm.

0168-1699/$ - see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.compag.2011.10.009

Corresponding author. Tel.: +34 941299274; fax: +34 941299794. E-mail address: [email protected] (R. Fernandez Martinez). URL: http://www.mineriadatos.com (R. Fernandez Martinez).

Abstract

The changes occurring in the dynamics of sugar concentration in grape berries are fairly significant during maturation, whereby they are commonly used as a marker of their development. In view of the importance this parameter has for wine producers, this paper designs several models for predicting the must's probable alcohol level using both meteorological variables and those specific to the vineyard. A comparative analysis is presented of learning and meta-learning algorithms for the selection of variables and the design of useful predictive models for estimating this level. The models are designed according to data gathered at different locations within the Rioja Qualified Designation of Origin (DOC Rioja, Spain) under different climate conditions, as well as involving different grape varieties. The models designed in this study provide very good results, and following their validation by experts, they have been proven to make a major contribution to decision-making in vine growing. Finally, considering the indices of analysis studied, it has been observed that the ensemble-type model based on the Bagging algorithm with REPTree decision trees records the best results, with a root mean squared error (RMSE) of 8.1% and a correlation of 84.9%.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

The criteria for determining the quality and best date for the harvest are informed by the development observed in different parameters. Some of the more common ones are: the dynamics of maturation, the berries' physical and chemical features, several indices such as the stock's development and stress, or the phytosanitary condition of the bunch. By considering all these criteria, determination may be made of the degree of ripening reached in a vineyard and, therefore, the date for harvesting the fruit (Ribèreau-Gayon et al., 2006; Jackson, 2008).

Amongst these criteria, one of the most widely used indicators is the berry's composition during the ripening period (Coombe, 1992) and, specifically, its concentration of sugars (Jackson and Lombard, 1993; Davies and Robinson, 1996; Kliewer, 1996). As reported by Sadras and McCarthy (2007), sugar concentration varies throughout the ripening process in step with the grape's development, influenced by both weather conditions and the vineyard's husbandry. This dependence is readily explained, as the sugars are produced largely by photosynthesis, a process that takes place mainly in the leaves and depends on ambient conditions (Champagnol, 1984; Blouin and Guimberteau, 2000).

A vineyard's characteristics and the particular weather conditions prevailing mean that the ripening process differs from one year to the next (Coombe, 1987; Barbeau et al., 2004; Chuine et al., 2004) and, therefore, so does the amount of sugar stored in the grape (Greer and Weston, 2010). The trend in this feature, thereby, is not only altered as a result of maturation, as it also undergoes variations depending on the climate (Robinson and Davies, 2000; Gaudillère et al., 2002). In terms of all the environmental factors, the ones most widely studied for discovering how the climate affects vineyards are temperature (Buttrose et al., 1971; Kliewer, 1977; Bindi et al., 1996; Jones et al., 2005) and the rainfall or irrigation they have received (López et al., 2007; Keller et al., 2008; Leeuwen et al., 2009).

One way of knowing beforehand the berries' sugar level is through the use of models and the relationships they provide. Using them for predicting certain variables may help to understand their evolution and take decisions on the future harvest (Du Plessis, 1984; Sinclair and Seligman, 1996). Considering the usefulness of these predictions, and given that the amount of sugar is one of the most widely analyzed variables and is used in numerous maturity indicators, some prior papers develop models for this feature (Sadras and McCarthy, 2007; Sadras et al., 2008; Dai et al., 2009). This study complements the work mentioned, obtaining very good estimates of the probable alcohol content during the ripening period and relating it to the influence of the climatic variables.


Table 1
Data on the plots used in this study, showing the number of plots for each variety used in each area, as well as the variation in the age of the vines and their altitude.

Area            Number of plots   Variety       Year planted   Altitude (m)
Rioja Alta      2                 Garnacha      1992–1998      530–590
                2                 Graciano      1989–1996      411–430
                2                 Mazuelo       1980–1989      428–455
                12                Tempranillo   1981–2000      420–650
                5                 Viura         1972–1990      400–652
Rioja Baja      9                 Garnacha      1975–1998      310–565
                3                 Graciano      1984–1996      350–565
                2                 Mazuelo       1960–1986      410–460
                10                Tempranillo   1984–2001      325–565
Rioja Alavesa   10                Tempranillo   1992–2001      415–620
                4                 Viura         1975–1988      410–650


Various models are developed for predicting the amount of sugar present in the grape through its probable alcohol level during the final weeks of maturation. This is achieved by using the data log available for past seasons, corresponding to several locations in the DOC Rioja vine-growing region with a range of grape varieties and different weather conditions. Although it is not a very large area, it is noticeable that the variation in environmental factors has a major impact on the development and composition of the berry, which is furthermore influenced by the ban on irrigation during this period (Smart et al., 1980; Martinez de Toda, 1991).

The main objective of this work is to provide an advanced model for predicting the probable alcohol level during ripening in order to establish a strategic decision-making process. As there are currently a large number of learning algorithms that allow a value to be predicted on the basis of a historical dataset, the aim has focused on determining the method that would provide the most accurate model for the problem posed.

To achieve this goal, the process for creating the models has been divided into several steps. Firstly, determination is made of the variables with the highest influence on the berry's development process, using different attribute selection techniques to choose the most significant ones (John et al., 1994).

Secondly, a battery of different learning and meta-learning algorithms is validated in order to identify those that generate the best predictive models. This identification is based on several coefficients indicative of the error obtained when evaluating the accuracy of the predictions. The algorithms used have been classified into five groups according to their operating methodology, whereby the model with the best performance in each of the groups, and overall, can be identified.

• Functions: calculated according to mathematical models, such as linear regressors, neural networks, etc.
• Nearest neighbor methods: using information learnt beforehand to obtain values according to the proximity of nearby points.
• Ensemble methods: methods that allow combining different kinds of learning algorithms.
• Decision trees for regression: these generate regression models by dividing the instance-space according to the methodology used for creating decision trees.
• Rule-based learning: based on techniques that seek decision rules.

Finally, the models created are tested with new data so as toidentify their degree of generalization.

2. Materials and methods

2.1. Study area

The area where the study was conducted involved the wine-producing region of DOC Rioja, in Northern Spain, which has vineyards extending over 63,593 ha. The different sites, 61 plots, were selected bearing in mind the degree of representativeness of the zone in which they were located in order to encompass the region's entire range of climate conditions. The grape varieties selected correspond to a high percentage of those allowed to be grown within DOC Rioja (Table 1).

2.2. Measurement methods

The method for measuring the alcohol level was determined according to Regulation No 2676/90/EEC (E.C. Regulations No 2676/90/EEC, 1990), which regulates the different methods of analysis applicable to the wine sector. Regarding the method for measuring the alcohol level in musts, the chosen method was refractometry, in accordance with chapter two of this regulation.

The measuring process for each plot involved collecting 100 grape berries from different vine stocks within an approximate radius of 20 m. On each vine stock, two berries were picked from the shoulders, two berries from the middle and one berry from the tip, varying the direction and location of the bunch on the vine.

The berries were sampled on a weekly basis during the grape ripening process, with this period running from the end of véraison through to the beginning of the harvest, and lasting between 6 and 9 weeks depending on the plot's harvest date. All the vines used in this study were more than 9 years old at the time of sampling.

2.3. Meteorological data

The data for these variables were collected by the network of weather stations distributed throughout the region of La Rioja as part of its agroclimate information system (SIAR). Readings were taken every 30 min at 24 weather stations and stored in a database along with all the other information used to conduct this study. Table 2 provides a summary of the main parameters measured during the sampling years.

2.4. Feature selection

Once the data had been pre-processed by detecting and discarding missing and spurious values (Zhang et al., 2003), the variables were generated as shown in Table 3. The process of defining the variables was based on the work by Andrades Rodríguez (1991) and Ribèreau-Gayon et al. (2006), on expert advice and on the correlation analysis of the data obtained.

In view of the large quantity of variables obtained, use was made of algorithms for the selection of variables that allowed singling out the most important ones for improving the predictive quality of the models (Liu and Yu, 2002; Peng et al., 2009). These algorithms helped to weed out irrelevant variables that did not enhance the quality of the models, thereby improving their efficacy (Blum and Langley, 1997).

Myriad methods were used, divided into two types: filter and wrapper. Filter methods operate independently; they do not use learning algorithms and they perform an assessment according to the general characteristics of the data. On the other hand, wrapper methods require predetermined learning algorithms, thereby imbuing them with greater accuracy, albeit entailing a greater computational cost.
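The filter/wrapper distinction can be illustrated with a minimal sketch. The paper used WEKA's implementations; the toy data, the feature names `temp` and `noise`, and the leave-one-out 1-nearest-neighbour evaluator below are hypothetical stand-ins chosen only to make the contrast concrete.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0

def filter_rank(features, target):
    """Filter method: score each variable from the data alone, no learner involved."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)

def loo_1nn_error(names, features, target):
    """Leave-one-out 1-nearest-neighbour MSE over the selected columns."""
    n = len(target)
    rows = [[features[f][i] for f in names] for i in range(n)]
    total = 0.0
    for i in range(n):
        j = min((k for k in range(n) if k != i),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(rows[i], rows[k])))
        total += (target[i] - target[j]) ** 2
    return total / n

def wrapper_select(features, target, evaluate):
    """Wrapper method: greedy forward search guided by a learner's error."""
    selected, remaining, best_err = [], set(features), float("inf")
    while remaining:
        cand = min(remaining, key=lambda f: evaluate(selected + [f], features, target))
        err = evaluate(selected + [cand], features, target)
        if err >= best_err:            # no candidate improves the error: stop
            break
        selected.append(cand)
        remaining.discard(cand)
        best_err = err
    return selected

# Toy data: 'temp' drives the target, 'noise' is irrelevant.
feats = {"temp": [10, 12, 14, 16, 18, 20], "noise": [5, 1, 4, 2, 6, 3]}
sugar = [20, 24, 28, 32, 36, 40]

print(filter_rank(feats, sugar))                    # 'temp' ranked first
print(wrapper_select(feats, sugar, loo_1nn_error))  # only 'temp' selected
```

Both routes discard the irrelevant variable here, but note the difference in cost: the filter scores each column once, while the wrapper retrains and re-evaluates the learner for every candidate subset.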

The methods for selecting variables, as used in this work, can bedivided into three groups:


Table 2
Summary of the main meteorological parameters measured during the years this work has been undertaken. It shows the average values for the different weather stations grouped according to the different areas of influence.

Location        Parameter                        2002     2003     2004     2005     2006     2007     2008
Rioja Alta      Temperature (°C)        Max      36.90    40.00    36.80    38.10    37.00    37.00    35.00
                                        Min      −2.40    −3.80    −3.10    −9.00    −5.10    −5.70    −4.20
                                        Mean     12.86    13.88    12.69    12.64    13.65    12.67    12.78
                Relative humidity (%)   Max      95.00    93.16    94.00    92.80    92.48    89.11    90.97
                                        Min      24.96    30.17    40.77    17.88    25.11    37.71    45.49
                                        Mean     68.95    65.50    67.53    64.52    65.48    66.50    68.82
                Wind speed (km/h)       Mean     7.60     4.60     7.11     7.36     7.20     6.55     5.62
                Precipitation (mm)      Total    409.50   409.30   433.00   427.40   419.30   366.50   603.40
Rioja Baja      Temperature (°C)        Max      35.30    37.30    34.70    37.60    36.30    38.80    34.80
                                        Min      −0.01    −4.80    −1.60    −9.10    −6.40    −4.30    −5.30
                                        Mean     13.59    14.24    13.12    13.11    14.16    13.47    13.23
                Relative humidity (%)   Max      98.00    96.44    97.27    97.00    93.65    91.58    93.54
                                        Min      22.00    17.55    34.20    24.67    25.39    15.77    32.42
                                        Mean     62.22    61.72    63.19    59.53    60.23    59.21    62.94
                Wind speed (km/h)       Mean     8.86     8.94     8.85     8.81     7.85     8.50     7.46
                Precipitation (mm)      Total    235.20   439.60   501.10   398.00   606.20   351.90   469.10
Rioja Alavesa   Temperature (°C)        Max      36.60    39.20    38.90    38.20    36.80    36.40    34.40
                                        Min      −1.20    −4.40    −8.30    −8.10    −5.10    −6.20    −4.10
                                        Mean     13.82    14.41    13.05    13.29    14.24    13.31    12.70
                Relative humidity (%)   Max      97.00    94.80    97.12    97.09    95.17    94.53    93.01
                                        Min      27.65    27.43    39.50    22.30    30.76    31.14    37.03
                                        Mean     66.60    66.07    67.32    63.24    64.41    63.84    65.34
                Wind speed (km/h)       Mean     8.00     10.76    16.00    9.97     8.35     8.85     7.61
                Precipitation (mm)      Total    569.40   907.80   784.80   481.30   394.40   408.60   558.30


• Methods based on attribute evaluation filters: ChiSquaredAttributeEval (Chan and Wong, 1991), FilteredAttributeEval (Witten and Frank, 2005), GainRatioAttributeEval (Witten and Frank, 2005), InfoGainAttributeEval (Witten and Frank, 2005), OneRAttributeEval (Witten and Frank, 2005), ReliefFAttributeEval (Robnik-Sikonja and Kononenko, 1997), SignificanceAttributeEval (Ahmad and Dey, 2005), SymmetricalUncertAttributeEval (Witten and Frank, 2005).
• Methods based on subset evaluation filters: CfsSubsetEval (Hall, 1998), ConsistencySubsetEval (Liu and Setiono, 1996), FilteredSubsetEval (Witten and Frank, 2005).
• Methods based on wrappers: ClassifierSubsetEval (Witten and Frank, 2005), WrapperSubsetEval (Kohavi and John, 1997).

All these methods are linked to search procedures responsible for generating the possible groups of features and selecting the best ones. The following search procedures were used: BestFirst (Rich and Knight, 1991), ExhaustiveSearch (Witten and Frank, 2005), GeneticSearch (Goldberg, 1989), GreedyStepwise (Kittler, 1978), LinearForwardSelection (Gutlein et al., 2009), RandomSearch (Liu and Setiono, 1996), Ranker (Witten and Frank, 2005), ScatterSearchV1 (Garcia Lopez et al., 2006), SubsetSizeForwardSelection (Gutlein et al., 2009) and TabuSearch (Hedar et al., 2006).

2.5. Selecting the best data learning and meta-learning algorithms

Once the main features of the stored data had been identified, a comparison was made between the learning and meta-learning algorithms that produced the best predictive models. The algorithms used are divided into five groups, the first being called functions:

• Gaussian Processes (GP) (Mackay, 1998; Williams, 1998): a form of supervised learning that is a generalization of the Gaussian probability distribution. A Gaussian process is specified by a mean and a covariance function, and the output function in any one data modelling problem is assumed to be a single sample from this Gaussian distribution.
• LeastMedSq (LMSQ) (Portnoy and Koenker, 1997): an implementation of least median squared linear regression that minimizes the median squared error. Linear regression algorithms are used to form predictions.
• Linear Regression (LINREG) (Wilkinson and Rogers, 1973): although requiring a major restriction of the model's linearity, this algorithm is used to view the data behavior using a linear model. Furthermore, it uses the Akaike criterion for model selection and is able to deal with weighted instances.
• Multilayer Perceptron (MLP) (Haykin, 1999): a feed-forward artificial neural network model that maps sets of input data onto a set of appropriate outputs. A supervised learning technique called back propagation is used for training the network, connecting many simple perceptron-like models in a hierarchical structure, which can distinguish data that are not linearly separable.
• RBF Network (RBFN) (Haykin, 1999): the main characteristic of the radial basis function neural network is the use of a normalized distance between the input points and the hidden nodes to define the activation of each node. The closer the two points, the stronger the activation.
• SMOreg (SMO) (Smola and Schölkopf, 1998): an implementation of the sequential minimal optimization algorithm for training a support vector regression model.
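As a minimal illustration of the functions group, ordinary least squares with a single predictor has a closed form. This is only a sketch, not WEKA's LinearRegression (which additionally applies Akaike-based attribute selection and handles many predictors); the example numbers are made up.

```python
def fit_linreg(x, y):
    """Closed-form ordinary least squares for one predictor: y ≈ a*x + b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    a = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    b = my - a * mx
    return a, b

# Hypothetical data lying exactly on y = 2x + 1.
a, b = fit_linreg([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)   # slope 2.0, intercept 1.0
```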

The second involves nearest neighbor methods:

• IBk (IBk) (Aha et al., 1991): instance-based learning that works as a k-nearest-neighbor classifier. A variety of different search algorithms are used to speed up the task of finding the nearest neighbors.
• K* (Kstar) (Cleary and Trigg, 1995): an instance-based classifier used for regression; that is, the class of a test instance is based upon the class of those training instances similar to it, as determined by some similarity function based on an entropy-based distance function.
• Locally weighted learning (LWL) (Atkeson et al., 1997; Frank et al., 2003): uses an instance-based algorithm to assign instance weights that are then used by a specified weighted system using linear regression.
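The idea shared by this group, predicting from the closest stored instances, can be sketched as follows. This is a hypothetical toy stand-in for IBk-style regression (averaging the targets of the k nearest neighbours), not the WEKA implementation.

```python
def knn_predict(train_x, train_y, query, k=3):
    """k-nearest-neighbour regression: average the targets of the k
    training points closest (in squared Euclidean distance) to the query."""
    order = sorted(range(len(train_x)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)))
    return sum(train_y[i] for i in order[:k]) / k

# Hypothetical 1-D training set; the outlier at x=10 is too far to matter.
train_x = [[0.0], [1.0], [2.0], [10.0]]
train_y = [0.0, 1.0, 2.0, 10.0]
print(knn_predict(train_x, train_y, [1.2], k=2))   # (1.0 + 2.0) / 2 = 1.5
```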


Table 3
Initial variables used for calculating the models.

Vineyard variables
  Variety                                                             Var
  Vineyard age (year)                                                 Age
  Altitude (m)                                                        Altit

Environmental variables related to the amount of rainfall
  Total rainfall over the preceding week (mm)                         RFW
  Total rainfall over the preceding two weeks (mm)                    RF2W
  Total rainfall over the preceding three weeks (mm)                  RF3W
  Total rainfall since the beginning of the year (mm)                 RFY
  Total rainfall since bud break (mm)                                 RFBB
  Total rainfall during the penultimate week (mm)                     RFW2
  Total rainfall during the penultimate and antepenultimate week (mm) RF2W2
  Total rainfall between bud break and flowering (mm)                 RFBBF
  Total rainfall between flowering and setting (mm)                   RFFS
  Total rainfall between setting and véraison (mm)                    RFSV
  Total rainfall between véraison and harvest (mm)                    RFVH

Environmental variables related to wind, humidity and weight
  Prevailing wind direction over the preceding week (N,S,E,W)         Dir
  Average relative humidity over the preceding week (%)               Hum
  Minimum relative humidity over the preceding week (%)               HumMin
  Maximum relative humidity over the preceding week (%)               HumMax
  Average wind speed over the preceding week (km/h)                   Speed
  Maximum wind speed over the preceding week (km/h)                   SpeedMax
  Weight of 100 berries                                               W100B

Environmental variables related to temperature
  Average temperature over the preceding week (°C)                    Temp
  Minimum temperature over the preceding week (°C)                    TempMin
  Maximum temperature over the preceding week (°C)                    TempMax
  Aggregate of average daily temperatures since the beginning of the year (°C)  STemp
  Days with maximum temperatures above 40 °C                          D40
  Days with average temperatures above 18 °C during maturation        DM18
  Days with maximum temperatures above 30 °C during maturation        DM30
  Average difference between maximum and minimum daily temperature during maturation (°C)  DDN


The third, ensemble methods:

• Additive Regression (AR) (Friedman, 1999): a meta classifier that enhances the performance of a regression-based classifier. Each iteration fits a model to the residuals left by the classifier on the previous iteration. Prediction is accomplished by adding the predictions of each classifier. Reducing the shrinkage parameter helps prevent overfitting and has a smoothing effect, although it increases the learning time.
• Bagging (BAG) (Breiman, 1996): Bagging (Bootstrap AGGregatING) is an algorithm that generates ensemble-type meta-models based on the concepts of bootstrapping and aggregating. This method involves generating a series of sub-models, with each one being trained using a dataset obtained by sampling with replacement from the original dataset. The meta-model's output is obtained as the mean of the predictions of the models generated. This algorithm tends to be highly effective when used with trees, neural networks and support vector machines, as it allows each sub-model to specialize with a subset of the dataset instances.
• Ensemble Selection (ES) (Caruana et al., 2004): this algorithm evolves a population of weight vectors for the regressors in the ensemble in order to minimize a function of the generalization error. When the algorithm outputs the best evolved weight vector, the models of the ensemble that do not reach a predefined threshold are dropped.
• Random Sub Space (RSS) (Ho, 1998): this method constructs a decision-tree-based classifier that maintains the highest accuracy on training data and improves on generalization accuracy as it grows in complexity. The classifier consists of multiple trees constructed systematically by pseudo-randomly selecting subsets of components of the feature vector; that is, trees constructed in randomly chosen subspaces.
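The bootstrap-and-average procedure described for Bagging can be sketched as follows. This is an illustrative stand-in, not the WEKA implementation used in the paper; the 1-nearest-neighbour base learner and the toy data are hypothetical.

```python
import random

def bagging_fit(train, base_fit, n_models=25, seed=1):
    """Train each sub-model on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        boot = [train[rng.randrange(len(train))] for _ in train]
        models.append(base_fit(boot))
    return models

def bagging_predict(models, x):
    """The meta-model's output is the mean of the sub-models' predictions."""
    return sum(m(x) for m in models) / len(models)

def fit_1nn(data):
    """Hypothetical weak base learner: predict the y of the nearest stored x."""
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]

# Toy data on y = 2x; each sub-model sees a different bootstrap resample.
data = [(float(x), 2.0 * x) for x in range(10)]
models = bagging_fit(data, fit_1nn)
print(bagging_predict(models, 4.5))   # close to 2 * 4.5 = 9
```

Averaging over resamples is what smooths the high-variance base learner, which is why the technique pairs well with trees such as REPTree.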

The fourth, regression decision tree based methods:

• M5P algorithm (M5P) (Quinlan, 1992): generates M5 model trees using the M5' algorithm, which was introduced in Wang and Witten (1997) and enhances the original M5 algorithm by Quinlan.
• REPTree (REPT) (Witten and Frank, 2005): builds a regression tree using information gain/variance reduction and prunes it using reduced-error pruning.
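The variance-reduction criterion used to grow such regression trees can be illustrated with a single-split sketch (hypothetical toy data; a real REPTree additionally recurses on each side and prunes with reduced-error pruning).

```python
def best_split(x, y):
    """Find the threshold on x that most reduces the within-node squared
    error, i.e. the variance-reduction split criterion."""
    def sse(vals):
        if not vals:
            return 0.0
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals)

    best_t, best_score = None, sse(y)          # score before splitting
    for t in sorted(set(x))[1:]:
        left = [yi for xi, yi in zip(x, y) if xi < t]
        right = [yi for xi, yi in zip(x, y) if xi >= t]
        score = sse(left) + sse(right)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

# Two clearly separated regimes: the split lands between them.
print(best_split([1, 2, 3, 10, 11, 12], [1, 1, 1, 9, 9, 9]))   # (10, 0.0)
```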

Finally, methods that build the model using rules:

• Decision Table (DT) (Kohavi, 1995): a model is built based on data where the class is known, and then that model is used to classify new data where the class is unknown. The entropy-based attribute selection used for classification is replaced by another criterion, as in the common regression tree predictions explained by Breiman et al. (1984).
• M5 Rules (M5R) (Wang and Witten, 1997; Holmes et al., 1999): this algorithm generates a decision list for regression problems using separate-and-conquer. A model tree is built in each iteration using M5 and the best leaf is made into a rule.

The models are trained using cross-validation, as their calculation times are not very high, which allows the entire training dataset, with 3387 entries, to be used for creating and validating the models. This method involves dividing the initial database into 10 subsets. To calculate the error, 9 subsets were chosen to train the model, with the one subset omitted from training being used to calculate the error of the partial sample. This procedure is repeated ten times, each time using a different test subset. Finally, the error is calculated as the arithmetic mean of the ten errors of the partial samples.
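The validation scheme just described can be sketched as follows (a minimal stand-alone sketch, not the WEKA procedure used in the paper; the noiseless toy dataset and the perfect model passed to it are hypothetical):

```python
def cross_val_rmse(data, fit, k=10):
    """k-fold cross-validation: train on k-1 folds, evaluate RMSE on the
    held-out fold, and return the arithmetic mean of the k partial errors."""
    folds = [data[i::k] for i in range(k)]     # simple deterministic partition
    partial = []
    for i in range(k):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        model = fit(train)
        mse = sum((y - model(x)) ** 2 for x, y in folds[i]) / len(folds[i])
        partial.append(mse ** 0.5)
    return sum(partial) / k

# A model that always predicts y = 3x scores a mean RMSE of 0 on noiseless data.
data = [(x, 3.0 * x) for x in range(20)]
print(cross_val_rmse(data, lambda train: (lambda x: 3.0 * x)))   # 0.0
```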

Once the different algorithms had been trained, a selection was made of those with the best predictive performance. Different coefficients indicative of the error made were used to evaluate the accuracy of the predictions.

– Mean absolute error (MAE)

$$\mathrm{MAE} = \frac{1}{n}\sum_{k=1}^{n} |m_k - p_k| \qquad (1)$$

– Root relative squared error (RRSE)

$$\mathrm{RRSE} = \sqrt{\frac{\sum_{k=1}^{n}(p_k - m_k)^2}{\sum_{k=1}^{n}(m_k - \bar{m})^2}} \qquad (2)$$

– Root mean squared error (RMSE)

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{k=1}^{n}(m_k - p_k)^2} \qquad (3)$$

– Relative absolute error (RAE)

$$\mathrm{RAE} = \frac{\sum_{k=1}^{n}|p_k - m_k|}{\sum_{k=1}^{n}|m_k - \bar{m}|} \qquad (4)$$

– Correlation coefficient (CORR)

$$\mathrm{CORR} = \frac{\frac{1}{n-1}\sum_{k=1}^{n}(p_k - \bar{p})(m_k - \bar{m})}{\sqrt{\frac{1}{n-1}\sum_{k=1}^{n}(p_k - \bar{p})^2 \cdot \frac{1}{n-1}\sum_{k=1}^{n}(m_k - \bar{m})^2}} \qquad (5)$$



where $m_k$ and $p_k$ are, respectively, the measured and predicted outputs, $n$ is the number of points of the database used to validate the models, $\bar{m} = \frac{1}{n}\sum_{k=1}^{n} m_k$ and $\bar{p} = \frac{1}{n}\sum_{k=1}^{n} p_k$. Furthermore, a study is made of the p-value associated with the results of the models as an indicator of the reliability of the correlation obtained in the results.
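The five indices translate directly into code. The sketch below is an illustration, not the WEKA implementation; note that the $(n-1)$ factors in the correlation formula, Eq. (5), cancel and are therefore omitted.

```python
def evaluation_indices(m, p):
    """Compute MAE, RMSE, RRSE, RAE and CORR, Eqs. (1)-(5), for measured
    outputs m and predicted outputs p (equal-length sequences)."""
    n = len(m)
    mbar = sum(m) / n
    pbar = sum(p) / n
    mae = sum(abs(mk - pk) for mk, pk in zip(m, p)) / n
    rmse = (sum((mk - pk) ** 2 for mk, pk in zip(m, p)) / n) ** 0.5
    rrse = (sum((pk - mk) ** 2 for mk, pk in zip(m, p))
            / sum((mk - mbar) ** 2 for mk in m)) ** 0.5
    rae = (sum(abs(pk - mk) for mk, pk in zip(m, p))
           / sum(abs(mk - mbar) for mk in m))
    corr = (sum((pk - pbar) * (mk - mbar) for mk, pk in zip(m, p))
            / (sum((pk - pbar) ** 2 for pk in p)
               * sum((mk - mbar) ** 2 for mk in m)) ** 0.5)
    return {"MAE": mae, "RMSE": rmse, "RRSE": rrse, "RAE": rae, "CORR": corr}

# A prediction offset by a constant: perfect correlation, non-zero errors.
print(evaluation_indices([1, 2, 3, 4], [1.5, 2.5, 3.5, 4.5]))
```

The example shows why several indices are reported together: a constant bias leaves CORR at 1.0 while MAE and RMSE expose the systematic error.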

Finally, AIC and BIC:

– Akaike Information Criterion (AIC) (Akaike, 1973)

$$\mathrm{AIC} = 2k - 2\ln(L) \qquad (6)$$

– Bayesian Information Criterion (BIC) (Schwarz, 1978)

$$\mathrm{BIC} = -2\ln(L) + k\ln(n) \qquad (7)$$

where k is the number of parameters in the statistical model, L is the maximized value of the likelihood function for the estimated model, and n is the sample size.
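Both criteria are simple functions of the fitted model's log-likelihood, penalizing parameter count so that a more complex model must improve the likelihood enough to justify itself. A direct sketch (the log-likelihood values in the example are made up):

```python
import math

def aic(k, log_likelihood):
    """Eq. (6): AIC = 2k - 2 ln(L), with ln(L) supplied directly."""
    return 2 * k - 2 * log_likelihood

def bic(k, log_likelihood, n):
    """Eq. (7): BIC = -2 ln(L) + k ln(n); the penalty grows with sample size."""
    return -2 * log_likelihood + k * math.log(n)

# Hypothetical comparison: a 5-parameter model gains little likelihood over a
# 3-parameter one, so both criteria prefer the simpler model (lower is better).
print(aic(3, -10.0), aic(5, -9.5))          # 26.0 vs 29.0
print(bic(3, -10.0, 100), bic(5, -9.5, 100))
```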

The WEKA suite (Witten and Frank, 2005) was used to develop the different models.

3. Results and discussion

3.1. Variations in the probable alcohol level

According to the samples studied, the amount of sugar concentrates very quickly as from véraison, with the growth slope levelling off as ripening progresses until the highest concentration is reached when the grape reaches the moment of maximum development. These variations in the probable alcohol level depend on numerous variables: the vineyard's location, the type of grape, weather conditions, etc. From this moment onwards, an over-ripening process begins, recording a slight drop in sugar levels (Table 4).

Furthermore, the study has identified certain relationships between these variables and the probable alcohol level of the grape berries in the area under study, as well as the nature of their dependence on weather conditions, such as those shown in Fig. 1, for example.

3.2. Feature selection

Once the data had been pre-processed, analyzing and discarding any anomalous or spurious points, a selection was made of the most important variables. The application of these methods is of interest because the performance of certain learning algorithms deteriorates when the input variables contain irrelevant or redundant attributes. Table 5 shows the results of using the different algorithms. These were applied by groups of variables in order to improve their effectiveness, separating temperature-related variables on the one hand and rainfall-related variables on the other. The selection

Table 4
Data on the probable alcohol level of the grape berries obtained in this study, grouped by sampling weeks during the ripening process with a confidence interval of 95%.

       Rioja Alta                                                             Rioja Baja                                             Rioja Alavesa
Week   Tempranillo   Garnacha      Viura         Graciano      Mazuelo        Tempranillo   Garnacha      Graciano      Mazuelo       Tempranillo   Viura
1      8.78 ± 0.33   6.37 ± 0.69   6.03 ± 0.38   7.87 ± 0.98   6.96 ± 1.18    10.37 ± 0.28  10.16 ± 0.36  8.10 ± 0.60   7.21 ± 0.85   9.98 ± 0.29   10.10 ± 0.65
2      10.01 ± 0.23  7.87 ± 0.55   7.80 ± 0.33   9.84 ± 0.79   7.80 ± 0.80    11.33 ± 0.23  11.35 ± 0.29  9.21 ± 0.53   8.27 ± 0.74   10.10 ± 0.30  8.84 ± 0.78
3      11.02 ± 0.22  9.53 ± 0.50   9.22 ± 0.36   11.10 ± 0.74  8.93 ± 0.93    12.00 ± 0.22  12.50 ± 0.30  10.39 ± 0.67  9.55 ± 0.94   10.63 ± 0.31  9.87 ± 0.87
4      11.57 ± 0.19  10.52 ± 0.56  10.04 ± 0.34  11.57 ± 0.78  9.58 ± 0.95    12.36 ± 0.19  12.83 ± 0.27  11.04 ± 0.59  10.14 ± 0.75  11.18 ± 0.30  10.30 ± 0.86
5      12.05 ± 0.18  11.18 ± 0.53  10.73 ± 0.31  11.80 ± 0.84  10.32 ± 0.84   12.64 ± 0.17  13.29 ± 0.25  11.44 ± 0.56  10.69 ± 0.83  11.72 ± 0.26  10.68 ± 0.69
6      12.54 ± 0.17  12.10 ± 0.61  11.36 ± 0.30  12.20 ± 0.79  10.83 ± 0.91   12.69 ± 0.17  13.42 ± 0.24  12.11 ± 0.55  11.27 ± 0.69  12.23 ± 0.25  11.16 ± 0.70
7      12.67 ± 0.16  12.61 ± 0.50  11.75 ± 0.28  12.27 ± 0.85  11.15 ± 0.81   12.37 ± 0.09  13.07 ± 0.23  11.81 ± 0.54  10.74 ± 0.56  12.41 ± 0.24  11.60 ± 0.66
8      12.19 ± 0.31  12.20 ± 0.65  11.43 ± 0.29  11.00 ± 0.38  10.00 ± 1.46   12.20 ± 0.05  13.37 ± 0.19  11.35 ± 0.32  10.70 ± 0.43  12.24 ± 0.34  11.40 ± 0.75

was made by choosing the variables used in at least two of the three methods. The following variables were selected from the calculations made: Var, Age, Altit, Hum, Speed, W100B, Temp, Stemp, D40, DDN, RF2W, RF3W, RFBB, RFW2, RFVH. To do so, account was taken of the position in the ranking and the percentage of times that each variable was chosen.
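The "chosen by at least two of the three methods" rule amounts to simple voting across the selection methods. A sketch (the helper `vote_select` and its inputs are illustrative; the actual procedure also weighed ranking positions and selection percentages):

```python
from collections import Counter

def vote_select(selections, min_votes=2):
    """Keep the features that appear in at least min_votes of the given
    per-method selections (each a collection of feature names)."""
    votes = Counter(f for sel in selections for f in set(sel))
    return sorted(f for f, v in votes.items() if v >= min_votes)
```

For example, with three methods selecting {"Temp", "Hum"}, {"Temp", "RF3W"} and {"RF3W", "Speed"}, only "Temp" and "RF3W" reach two votes and survive.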

Following the application of these methods, the 29 variables selected initially were reduced by almost half, to 15. This reduction in variables proved to be effective, as shown in Table 6, where a substantial improvement can be seen in the testing errors when use is made of the input variables obtained after applying the reduction methods.

3.3. Models developed

Once the main variables had been selected, the next process involved the training, validation and testing of the models.

3.3.1. Model training
The training and validation of the models involved selecting the data for all the years in the dataset except one, that is, 86%. The remaining 14%, corresponding to 2006, was used for the final testing of the models generated.

The models were validated using 10-fold cross-validation. The entire process was repeated ten times for each model. Finally, the mean and the standard deviation of the 10 cross-validation errors were calculated. Table 7 shows the cross-validation errors for each one of the algorithms studied.
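The validation scheme above — repeated 10-fold cross-validation, reporting the mean and standard deviation of the error — can be sketched as follows in pure Python, with a trivial mean predictor standing in for the actual learners (WEKA's implementations are not reproduced here):

```python
import random
import statistics

def repeated_kfold_rmse(ys, k=10, repeats=10, seed=0):
    """Repeated k-fold cross-validation of a stand-in model that predicts
    the mean of its training targets; returns the mean and standard
    deviation of the per-repeat RMSE."""
    rng = random.Random(seed)
    n = len(ys)
    rmses = []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]  # k disjoint folds
        sq_errs = []
        for fold in folds:
            held_out = set(fold)
            train = [ys[i] for i in idx if i not in held_out]
            y_hat = sum(train) / len(train)  # "model": training-set mean
            sq_errs.extend((ys[i] - y_hat) ** 2 for i in fold)
        rmses.append((sum(sq_errs) / len(sq_errs)) ** 0.5)
    return statistics.mean(rmses), statistics.stdev(rmses)
```

Reporting the standard deviation alongside the mean, as in Table 7, makes it possible to judge whether differences between algorithms exceed the run-to-run variability of the resampling itself.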

For each one of the groups, the models with the best mean correlation were: the GP model in the functions group, with a correlation of 80.2%; the BAG model in the ensemble methods group, with a correlation of 84.4%; the IBk model in the neighbors group, with a correlation of 76.5%; the M5R model in the rules group, with a correlation of 80.2%; and the M5P model in the trees group, with a correlation of 80.8%. They all recorded a p-value below 2.2E-16, which indicates a high level of statistical significance for the results obtained.

In fact, the majority of the models in the ensemble methods group recorded very good results. The AIC index for the best of them, the BAG model, was 12.1% lower than that of the GP, 11% lower than that of the IBk, 21.8% lower than that of the M5R and 18.6% lower than that of the M5P.

The RMSE-MEAN and MAE-MEAN errors for the BAG were 8.7% and 6.8%, as opposed to 9.7% and 7.5% for the GP; 10.5% and 8.0% for the IBk; 9.6% and 7.5% for the M5R; and 9.4% and 7.3% for the M5P.

Other significant results obtained from the analysis were that the linear regressions within the functions group recorded poorer results than Gaussian processes or neural networks, as the latter adapt better to non-linear systems, as is the case here.

It is also significant that the tree-based algorithms recorded good results, as they divide the instance space, classifying the data into several leaves, and perform a linear regression with each leaf's group of cases.



Fig. 1. Relationship between weight and the probable alcohol level for the berries of the Garnacha variety.

Table 5
Results of the selection of variables obtained according to different attribute evaluator and search methods.

Variable   Filters: Attribute Subset Evaluator (%)   Filters: Single-Attribute Evaluators   Wrappers (%)
Var        71                                        9                                      50
Age        68                                        8                                      50
Altit      89                                        7                                      75
Dir        0                                         10                                     50
Hum        75                                        2                                      0
HumMin     52                                        4                                      50
HumMax     28                                        3                                      0
Speed      88                                        1                                      10
SpeedMax   4                                         6                                      0
W100B      89                                        5                                      75
Temp       37                                        3                                      0
TempMin    4                                         5                                      50
TempMax    4                                         8                                      60
Stemp      93                                        1                                      50
D40        42                                        4                                      0
DM18       21                                        6                                      50
DM30       0                                         7                                      0
DDN        40                                        2                                      0
RFW        1                                         1                                      0
RF2W       34                                        6                                      50
RF3W       94                                        5                                      100
RFY        23                                        4                                      50
RFBB       72                                        2                                      0
RFW2       25                                        8                                      50
RF2W2      3                                         9                                      0
RFBBF      0                                         4                                      0
RFFS       23                                        7                                      0
RFSV       7                                         10                                     0
RFVH       34                                        3                                      50


3.3.2. Model testing
From amongst all these models, the best ones were chosen after verifying their degree of generalization with previously separated data, corresponding to the 2006 season.

From amongst all the methods applied, and according to the coefficients studied, a check was made to identify those that recorded the best performance. The models tested were the best in each group: GP (functions), BAG (ensemble), IBk (neighbors), M5R (rules) and M5P (trees).

The best models obtained in the training were tested with data from the 2006 season, which had not been used before this stage (Table 8).

According to these results, the Bagging method recorded the best performance. Regarding the different configurations studied for this method, the best results were recorded by the one using a size of 100% for each one of the bags and a REPTree-type decision tree as numerical predictor, with a minimum variance proportion of 0.001.
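The Bagging configuration described — bags the size of the training set, each base learner fit on a bootstrap resample, predictions averaged — can be sketched generically. The base learner below is a stand-in closure that predicts its bag's mean target; it is not WEKA's REPTree, merely a placeholder illustrating the resample-and-average structure:

```python
import random

def bagging_fit(train, base_fit, n_bags=10, bag_frac=1.0, seed=0):
    """Fit base_fit on n_bags bootstrap resamples (each of size
    bag_frac * n, drawn with replacement) and return a predictor that
    averages the base models' outputs."""
    rng = random.Random(seed)
    n = len(train)
    m = max(1, int(bag_frac * n))
    models = [base_fit([train[rng.randrange(n)] for _ in range(m)])
              for _ in range(n_bags)]
    return lambda x: sum(model(x) for model in models) / len(models)

# Stand-in base learner: ignores x and predicts the mean target of its bag.
def mean_learner(bag):
    avg = sum(y for _, y in bag) / len(bag)
    return lambda x: avg
```

In practice the base learner would be a regression tree; averaging over bootstrap resamples is what smooths out the high variance of individual trees.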

The correlation of the BAG method was 84.9%, higher than the 75% of the GP (11.7% higher), the 82.7% of the IBk (2.5% higher), the 80.9% of the M5R (4.7% higher) and the 82.3% of the M5P (3.0% higher). The p-value in all the results showed high statistical significance (p-value < 2.2E-16).

The RMSE-MEAN and MAE-MEAN errors of the BAG model were 8.1% and 6.1%, respectively, as opposed to the 10.1% (19.8% lower) and 7.9% (22.8% lower) of the GP; the 8.5% (4.7% lower) and 6.7% (8.9% lower) of the IBk; the 8.9% (9.0% lower) and 6.7% (8.9% lower) of the M5R; and the 8.6% (5.8% lower) and 6.4% (4.7% lower) of the M5P.


Table 6
Errors obtained based on the same model, using all the initial variables and using only the variables selected by the methods for reducing variables. The model used was Bagging in conjunction with the REPTree classifier.

                              CORR      RMSE   MAE    RAE    RRSE   AIC     BIC
All variables                 0.796***  0.099  0.076  60.03  62.55  −6220  −6203
Reduced number of variables   0.844***  0.087  0.068  53.23  54.56  −6659  −6642

Significant differences on CORR are indicated: *** P < 0.001; ** P < 0.01; * P < 0.05.

Table 7
List of data obtained in the training for the cross-validation of models.

Group      Model  CORR      CORR   RMSE   RMSE   MAE    MAE    RAE    RAE   RRSE   RRSE   AIC     BIC     Time (s)
                  mean      SD     mean   SD     mean   SD     mean   SD    mean   SD
Function   GP     0.808***  0.002  0.097  0.000  0.075  0.000  58.33  0.27  60.14  0.31   −5850   −5833   15,085
           MLP    0.797***  0.007  0.108  0.005  0.086  0.005  67.18  3.97  67.46  3.18   −4674   −4734   2391
           SMO    0.637***  0.001  0.124  0.000  0.097  0.000  75.36  0.13  77.18  0.09   −4151   −4540   2550
           LR     0.557***  0.001  0.134  0.000  0.105  0.000  81.83  0.13  83.13  0.12   −4257   −4657   16
           RBFN   0.303***  0.019  0.153  0.001  0.122  0.000  95.18  0.65  95.20  0.60   −1160   −1140   85
Ensemble   BAG    0.844***  0.001  0.087  0.000  0.068  0.000  53.23  0.72  54.56  0.60   −6659   −6642   45,033
           RSS    0.831***  0.002  0.095  0.000  0.074  0.000  58.03  0.33  59.00  0.31   −6550   −6532   104
           AR     0.819***  0.005  0.092  0.001  0.072  0.000  55.89  0.73  57.32  0.77   −4563   −4545   2268
           ES     0.799***  0.002  0.097  0.000  0.075  0.000  58.62  0.39  60.19  0.35   −5815   −5797   3710
Neighbors  IBk    0.765***  0.006  0.105  0.001  0.080  0.000  62.81  0.51  65.10  0.74   −5922   −5904   1
           LWL    0.745***  0.007  0.110  0.001  0.085  0.000  66.23  0.74  68.34  0.88   −5055   −5037   1
           Kstar  0.701***  0.003  0.119  0.000  0.093  0.000  72.51  0.36  74.14  0.33   −4730   −4712   1
Rules      M5R    0.802***  0.003  0.096  0.000  0.075  0.000  58.24  0.49  59.67  0.44   −5205   −5187   985
           DT     0.712***  0.003  0.113  0.000  0.088  0.000  68.95  0.37  70.30  0.43   −4852   −4835   2250
Trees      M5P    0.808***  0.003  0.094  0.000  0.073  0.000  57.35  0.44  58.77  0.47   −5416   −5398   360

The errors are ordered according to their correlation, showing for the different configurations of each algorithm only the data for the model that recorded the best results. Significant differences on CORR are indicated: *** P < 0.001; ** P < 0.01; * P < 0.05.

Table 8
Results for the testing of the different models selected in the training, showing the best performances of the configurations.

Group      Model  CORR      RMSE   MAE    RAE     RRSE    AIC       BIC
Function   GP     0.750***  0.101  0.079  63.147  65.953  −869.14  −857.76
Ensemble   BAG    0.849***  0.081  0.061  48.541  52.308  −992.36  −980.98
Neighbors  IBk    0.827***  0.085  0.067  53.172  55.185  −848.88  −837.50
Rules      M5R    0.809***  0.089  0.067  53.324  58.098  −779.37  −767.99
Trees      M5P    0.823***  0.086  0.064  51.147  55.840  −809.11  −797.73

Significant differences on CORR are indicated: *** P < 0.001; ** P < 0.01; * P < 0.05.

Fig. 2. Prediction for several of the locations studied, one for each variety. The predicted data are obtained with the algorithms with the best results (GP, BAG, IBk, M5R and M5P).


In sum, BAG was the best according to all seven metrics: it had the smallest value for each error measure, the largest correlation coefficient, and the best AIC and BIC coefficients.

Authors such as Breiman (1996), Freund and Schapire (1996), Bauer and Kohavi (1999) and, more recently, Seni and Elder (2010) have shown that in many cases the Bagging method is capable of creating better models for generalizing the problem than the base models individually, as it can break down the variance or bias of the error and so reduce it. In fact, in certain examples with real datasets, this method has shown it can reduce both of them.

Fig. 3. Scatter plots of a sample of predicted and observed values corresponding to the Bagging model, showing the algorithms selected as the ones most suitable for resolving the problem (GP, BAG, IBk, M5R and M5P).

Fig. 2 shows the prediction values for the probable alcohol level for the best models obtained, alongside the real values obtained at different sites during the 2006 season.

In addition, Fig. 3 shows different scatter plots revealing the correlation between the measured and calculated values for the best models selected. The correlations of the models confirm the difficulty of the problem and the existence of different types of performance depending on the grape variety, finally proving the good predictive ability of the models obtained.

The results show that the best models record better results than other models reported in the literature, although it should be noted that the input variables are not the same. In the case of Dai et al. (2009), they obtain an RMSE of 0.23 when quantifying the goodness-of-fit, whereas the best model in this work recorded a testing RMSE of 0.08.

4. Conclusions

The dynamics of sugar concentration in grape berries during the period running from post-véraison through to harvest is one of the most widely used variables for determining how they are ripening and thus deciding on the best time for their harvesting. As it is very important for vine-growers to know this value beforehand, this work provided consistent models taking into account variables that were specific to each vineyard, as well as the weather conditions affecting each one.

The calculation of the models was based on a large number of variables that complicated the generation process. In order to simplify and improve them, the use of feature selection techniques was proposed. These techniques, used before the models were built, proved to be suitable for improving predictions. Different filter and wrapper algorithms were used within these techniques, thereby reducing the significant variables by almost half; different tests revealed that discarding the non-significant variables reduced both the error and the computation time spent on working with the models.

Once the variables had been obtained for use with the models, a wide array of models was developed for predicting the probable alcohol level of a vineyard during the grape ripening process. Several learning algorithms, grouped according to their operation, were compared, identifying the models that recorded the best results within each group; those deemed the best were subsequently tested with new data that had not been used in the training. The conclusion reached is that the best model is the one generated with ensemble-type meta-algorithms. In particular, the Bagging algorithm in conjunction with REPTree revealed a better predictive ability than the other models developed. Furthermore, it improved on the results of other studies reported in the literature that sought to predict this same parameter. Consequently, the model developed for this study may be used as part of the strategy for vineyard husbandry during the berry ripening process and the harvesting period.

Acknowledgement

The authors thank the University of La Rioja for its FPI fellowships and the La Rioja Autonomous Government for its support through the FOMENTA program 2010/13.

References

Aha, D.W., Kibler, D., Albert, M.K., 1991. Instance-based learning algorithms. Machine Learning 6, 37–66.

Ahmad, A., Dey, L., 2005. A feature selection technique for classificatory analysis. Pattern Recognition Letters 26 (1), 43–56.

Akaike, H., 1973. Information theory and an extension of the maximum likelihood principle. In: Petrov, B.N., Csaki, F. (Eds.), Second International Symposium on Information Theory, pp. 267–281.

Andrades Rodríguez, M., 1991. Influencias Climáticas Sobre el Proceso de Maduración del Fruto de Vitis Vinifera: Diferenciación Varietal. Gobierno de La Rioja, Consejería de Agricultura y Alimentación (in Spanish).

Atkeson, C., Moore, A., Schaal, S., 1997. Locally weighted learning. Artificial Intelligence Review 11, 11–73.

Barbeau, G., Bournand, S., Champenois, R., Bouvet, M.-H., Blin, A., Cosneau, M., 2004. The behaviour of four red grapevine varieties of Val de Loire according to climatic variables. Journal International des Sciences de la Vigne et du Vin 38 (1), 35–40.

Bauer, E., Kohavi, R., 1999. An empirical comparison of voting classification algorithms: bagging, boosting, and variants. Machine Learning 35, 1–38.

Bindi, M., Fibbi, L., Gozzini, B., Orlandini, S., Miglietta, F., 1996. Modelling the impact of future climate scenarios on yield and yield variability of grapevine. Climate Research 7 (3), 213–224.

Blouin, J., Guimberteau, G., 2000. Maturation et Maturité des Raisins. Ed. Féret (in French).

Blum, A.L., Langley, P., 1997. Selection of relevant features and examples in machine learning. Artificial Intelligence 97 (1–2), 245–271.

Breiman, L., 1996. Bagging predictors. Machine Learning 24 (2), 123–140.

Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J., 1984. Classification and Regression Trees. Wadsworth International Group.

Buttrose, M.S., Hale, C.R., Kliewer, W.M., 1971. Effect of temperature on the composition of 'Cabernet-Sauvignon' berries. American Journal of Enology and Viticulture 22, 71–75.

Caruana, R., Niculescu, A., Crew, G., Ksikes, A., 2004. Ensemble selection from libraries of models. In: Twenty-First International Conference on Machine Learning, ICML 2004, pp. 137–144.

Champagnol, F., 1984. Éléments de Physiologie de la Vigne et de Viticulture Générale. Déhan, Saint-Gély-du-Fesc, Montpellier (in French).

Chan, K.C., Wong, A.K., 1991. A statistical technique for extracting classificatory knowledge from databases. In: Piatetsky-Shapiro, G., Frawley, W.J. (Eds.), Knowledge Discovery in Databases.

Chuine, I., Yiou, P., Viovy, N., Seguin, B., Daux, V., Ladurie, E.L.R., 2004. Grape ripening as a past climate indicator. Nature 432 (7015), 289–290.

Cleary, J.G., Trigg, L.E., 1995. K*: an instance-based learner using an entropic distance measure. In: 12th International Conference on Machine Learning, pp. 108–114.

Coombe, B.G., 1987. Influence of temperature on composition and quality of grapes. Acta Horticulturae 206, 23–36.

Coombe, B.G., 1992. Research on development and ripening of the grape berry. American Journal of Enology and Viticulture 43, 101–110.

Dai, Z.W., Vivin, P., Robert, T., Milin, S., Li, S.H., Génard, M., 2009. Model-based analysis of sugar accumulation in response to source–sink ratio and water supply in grape (Vitis vinifera) berries. Functional Plant Biology 36, 527–540.



Davies, C., Robinson, S.P., 1996. Sugar accumulation in grape berries (cloning of two putative vacuolar invertase cDNAs and their expression in grapevine tissues). Plant Physiology 111, 275–283.

Du Plessis, C.S., 1984. Optimum maturity and quality parameters in grapes: a review. South African Journal of Enology and Viticulture 5, 35–42.

E.C. Regulation No. 2676/90/EEC, 1990. E.C. Regulation No. 2676/90/EEC of 17 September 1990 'determining Community methods for the analysis of wines' and amendments: (a) No. 822/97/EC of 6 May 1997 and (b) No. 440/2003/EC of 10 March 2003.

Frank, E., Hall, M., Pfahringer, B., 2003. Locally weighted Naive Bayes. In: 19th Conference on Uncertainty in Artificial Intelligence, pp. 249–256.

Freund, Y., Schapire, R.E., 1996. Experiments with a new boosting algorithm. In: Machine Learning: Proceedings of the Thirteenth International Conference, pp. 325–332.

Friedman, J.H., 1999. Stochastic Gradient Boosting (Tech. Rep.). Stanford University, Statistics Department, Palo Alto, CA.

Garcia Lopez, F., Garcia Torres, M., Melian Batista, B., Moreno Perez, J.A., Moreno-Vega, J.M., 2006. Solving the feature subset selection problem by a parallel scatter search. European Journal of Operational Research 169 (2), 477–489.

Gaudillère, J.P., Van Leeuwen, C., Ollat, N., 2002. Carbon isotope composition of sugars in grapevine, an integrated indicator of vineyard water status. Journal of Experimental Botany 53 (369), 757–763.

Goldberg, D.E., 1989. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley.

Greer, D.H., Weston, C., 2010. Heat stress affects flowering, berry growth, sugar accumulation and photosynthesis of Vitis vinifera cv. Semillon grapevines grown in a controlled environment. Functional Plant Biology 37, 206–214.

Gutlein, M., Frank, E., Hall, M., Karwath, A., 2009. Large-scale attribute selection using wrappers. In: IEEE Symposium on Computational Intelligence and Data Mining, pp. 332–339.

Hall, M.A., 1998. Correlation-based Feature Subset Selection for Machine Learning. Department of Computer Science, University of Waikato, Hamilton, New Zealand.

Haykin, S., 1999. Neural Networks: A Comprehensive Foundation, second ed. Prentice Hall, New Jersey.

Hedar, A., Wang, J., Fukushima, M., 2006. Tabu search for attribute reduction in rough set theory. Soft Computing – A Fusion of Foundations, Methodologies and Applications 12 (9), 909–918.

Ho, T.K., 1998. The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence 20 (8), 832–844.

Holmes, G., Hall, M., Frank, E., 1999. Generating rule sets from model trees. In: Twelfth Australian Joint Conference on Artificial Intelligence, pp. 1–12.

Jackson, R.S., 2008. Wine Science: Principles and Applications, third ed. Elsevier Inc.

Jackson, D.I., Lombard, P.B., 1993. Environmental and management practices affecting grape composition and wine quality – a review. American Journal of Enology and Viticulture 44, 409–430.

John, G.H., Kohavi, R., Pfleger, K., 1994. Irrelevant features and the subset selection problem. In: International Conference on Machine Learning, pp. 121–129.

Jones, G.V., White, M.A., Cooper, O.R., Storchmann, K., 2005. Climate change and global wine quality. Climatic Change 73 (3), 319–343.

Keller, M., Smithyman, R.P., Mills, L.J., 2008. Interactive effects of deficit irrigation and crop load on Cabernet Sauvignon in an arid climate. American Journal of Enology and Viticulture 59 (3), 221–234.

Kittler, J., 1978. Feature set search algorithms. In: Pattern Recognition and Signal Processing. Sijthoff and Noordhoff, Alphen aan den Rijn, Netherlands, pp. 41–60.

Kliewer, W.M., 1977. Influence of temperature, solar radiation and nitrogen on coloration and composition of Emperor grapes. American Journal of Enology and Viticulture 28, 96–103.

Kliewer, W.M., 1996. Sugars and organic acids of Vitis vinifera. Plant Physiology 41, 923–931.

Kohavi, R., 1995. The power of decision tables. In: 8th European Conference on Machine Learning, pp. 174–189.

Kohavi, R., John, G.H., 1997. Wrappers for feature subset selection. Artificial Intelligence 97 (1–2), 273–324.

Leeuwen, C.V., Tregoat, O., Choné, X., Bois, B., Pernet, D., Gaudillère, J.-P., 2009. Vine water status is a key factor in grape ripening and vintage quality for red Bordeaux wine. How can it be assessed for vineyard management purposes? Journal International des Sciences de la Vigne et du Vin 43 (3), 121–134.

Liu, H., Setiono, R., 1996. A probabilistic approach to feature selection – a filter solution. In: 13th International Conference on Machine Learning, pp. 319–327.

Liu, H., Yu, L., 2002. Feature Selection for Data Mining. Research Technical Report, Arizona State University.

López, M.I., Sánchez, T., Díaz, A., Ramírez, P., Morales, J., 2007. Influence of a deficit irrigation regime during ripening on berry composition in grapevines (Vitis vinifera L.) grown in semi-arid areas. International Journal of Food Sciences and Nutrition 58 (7), 491–507.

Mackay, D.J.C., 1998. Introduction to Gaussian processes. Neural Networks and Machine Learning 168, 133–165.

Martinez de Toda, F., 1991. Biología de la Vid. Fundamentos Biológicos de la Viticultura. Ediciones Mundi-Prensa (in Spanish).

Peng, Y., Wu, Z., Jiang, J., 2009. A novel feature selection approach for biomedical data classification. Journal of Biomedical Informatics 43 (1), 15–23.

Portnoy, S., Koenker, R., 1997. The Gaussian hare and the Laplacian tortoise: computability of squared-error versus absolute-error estimators. Statistical Science 12 (4), 279–300.

Quinlan, R.J., 1992. Learning with continuous classes. In: 5th Australian Joint Conference on Artificial Intelligence, Singapore, pp. 343–348.

Ribéreau-Gayon, P., Dubourdieu, D., Donèche, B., Lonvaud, A., 2006. Handbook of Enology, vol. 1 – The Microbiology of Wine and Vinifications, second ed. John Wiley & Sons Ltd. (Chapter 10: The Grape and its Maturation).

Rich, E., Knight, K., 1991. Artificial Intelligence. McGraw-Hill.

Robinson, S.P., Davies, C., 2000. Molecular biology of grape berry ripening. Australian Journal of Grape and Wine Research 6 (2), 168–174.

Robnik-Sikonja, M., Kononenko, I., 1997. An adaptation of Relief for attribute estimation in regression. In: Fourteenth International Conference on Machine Learning, pp. 296–304.

Sadras, V.O., McCarthy, M.G., 2007. Quantifying the dynamics of sugar concentration in berries of Vitis vinifera cv. Shiraz: a novel approach based on allometric analysis. Australian Journal of Grape and Wine Research 13, 66–71.

Sadras, V.O., Collins, M., Soar, C.J., 2008. Modelling variety-dependent dynamics of soluble solids and water in berries of Vitis vinifera. Australian Journal of Grape and Wine Research 14 (3), 250–259.

Schwarz, G., 1978. Estimating the dimension of a model. Annals of Statistics 6, 461–464.

Seni, G., Elder, J.F., 2010. Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions. Morgan & Claypool.

Sinclair, T.R., Seligman, N.G., 1996. Crop modeling: from infancy to maturity. Agronomy Journal 88, 698–704.

Smart, R.E., Alcorso, C., Hornsby, D.A., 1980. A comparison of winegrape performance at the present limits of Australian viticultural climates – Alice Springs and Hobart. Aust. Grapegrower and Winemaker 196, 28–30.

Smola, A.J., Schölkopf, B., 1998. On a kernel-based method for pattern recognition, regression, approximation and operator inversion. Algorithmica 22, 211–231.

Wang, Y., Witten, I.H., 1997. Induction of model trees for predicting continuous classes. In: Poster Papers of the 9th European Conference on Machine Learning. University of Economics, Faculty of Informatics and Statistics, Prague, pp. 128–137.

Wilkinson, G.N., Rogers, C.E., 1973. Symbolic descriptions of factorial models for analysis of variance. Applied Statistics 22, 392–399.

Williams, C.K.I., 1998. Prediction with Gaussian processes: from linear regression to linear prediction and beyond. In: Jordan, M.I. (Ed.), Learning and Inference in Graphical Models. Kluwer Academic Press, pp. 599–621.

Witten, I.H., Frank, E., 2005. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco.

Zhang, S., Zhang, C., Yang, Q., 2003. Data preparation for data mining. Applied Artificial Intelligence 17 (5–6), 375–381.