economic regime identiﬂcation and prediction in tac scm...

Economic Regime Identification and Prediction in TAC SCM UsingSales and Procurement Information

Frederik HogenboomErasmus Sch. of Econ.

Erasmus University [email protected]

Wolfgang KetterRotterdam Sch. of Mgmt.


Jan van DalenRotterdam Sch. of Mgmt.


Uzay KaymakErasmus Sch. of Econ.


John CollinsComputer Science and Engr.

University of [email protected]

Alok GuptaCarlson Sch. of Mgmt.University of [email protected]

Abstract

Our research is focused on the effects of the additionof procurement information (offer prices) to a sales-based economic regime model. This model is usedfor strategic, tactical, and operational decision mak-ing in dynamic supply chains. We evaluate the perfor-mance of the regime model through experiments withthe MinneTAC trading agent, which competes in theTAC SCM game. The new regime model has an over-all predictive performance which is equal to the per-formance of the existing model. Regime switches arepredicted more accurately, whereas the prediction ac-curacy of dominant regimes does not improve. How-ever, because procurement information has been addedto the model, the model has been enriched, which givesnew opportunities for applications in the procurementmarket, such as procurement reserve pricing.

IntroductionNowadays, markets are extremely competitive, and thusit is important to gain insight into the dynamics ofsupply chains and to research supply chain optimiza-tion possibilities, both for individual elements in thechain, as well as for the chain as a whole. For instance,in correctly predicting future market conditions, i.e.,economic regimes (Ketter 2007), lie competitive advan-tages, e.g., one can anticipate on upcoming scarcitiesin the sales market by adjusting procurement policiesand sales prices in advance. This can save for instancestorage costs, and increase profits. Thus, one couldbenefit greatly from being able to make tactical andstrategic decisions in an uncertain market, based onidentified and predicted economic regimes. Combiningtechniques from computer science with economic theoryto solve problems in economic environments contributesto novel approaches to existing problems.

Ketter introduced an economic regime model, whichis based on sales information (Ketter 2007). This modelcan be applied to any real or simulated market situa-tion. However, procurement information has not been

Copyright c© 2009, Association for the Advancement of Ar-tificial Intelligence (www.aaai.org). All rights reserved.

used sofar for identifying and predicting regimes. Thisinformation is truly valuable for determining economicregimes, since it captures specific market characteris-tics. For instance, an increase in the amount of compo-nents sold in the procurement market could indicate anexpected scarcity in the sales market, as manufacturersare building stocks.

We present an extension to the regime model as in-troduced by Ketter, which is implemented in the Min-neTAC trading agent (Collins, Ketter, and Gini 2009).We investigate the effects of adding procurement infor-mation to this model. The performance of the regimemodel is evaluated through experiments on the qualityof regime probability predictions, and checking correla-tions with existing market conditions.

The MinneTAC trading agent has competed for sev-eral years in the Trading Agent Competition for Sup-ply Chain Management (TAC SCM). TAC SCM is anannual international competition for designing trad-ing agents for a simulated personal computer supplychain (Collins et al. 2005). The supply chain includescustomers, traders, and suppliers and the computermarket is divided into market segments. Traders haveto procure components from suppliers and sell assem-bled computers to customers. Only limited informationis available to traders, such as yesterday’s maximumand minimum market prices, which makes predictingthe market a non-trivial task.

TAC SCM has attracted researchers from all over theworld, because its environment is designed in such away that it contains many characteristics that can befound in real-life supply chains, such as the behaviorof unpredictable opponents and interdependent chainentities. The simulated supply chain of the TAC SCMgame offers many research opportunities into varioussubjects, such as price setting strategies and predictionstrategies for competitor behavior or market character-istics and developments.

The paper is organized as follows. First, we con-tinue with introducing the existing economic sales-based regime model. Subsequently, we define and eval-uate a new economic regime framework based on both

sales and procurement information. Finally, conclu-sions are drawn and future work is suggested.

Background and Related Work

Regime identification and prediction using the economicregime model are used for making sales decisions in theMinneTAC agent (Ketter et al. 2009). Every day, foreach individual product, the probabilities of current andfuture economic regimes are determined, which are usedfor forecasting price densities. With the help of thecurrent and future price density we are able to predictmarket prices, market price trends, and the customeracceptance probability for specific offers. Regimes areused for both tactical and strategic decisions, such asproduct pricing and production planning.

The regimes in the TAC SCM game can be consid-ered as a set of characteristics which apply to a certainperiod of days. The identification and prediction ofregimes is done so that different behavior can be mod-eled for different situations, which is also referred to asa switching model. The agent’s problems can be solveddifferently depending on the state of its environment,which influences the accuracy of the agent’s predictionsand the amount of profit made.

Regimes are used in multiple contexts, such as polit-ical and economic contexts. In general, a regime refersto a set of conditions. In economic context, regimes arealso referred to as business cycle phases. These phasesare commonly used in macro-economic environments,as is the case in (Osborn and Sensier 2002). How-ever, in (Ketter 2007), regimes are applied in the micro-economic environment of the TAC SCM game. Thismakes sense, since an economic environment is simu-lated and one can capture (economical) characteristicsin economic regimes, enabling an agent to reason basedon certain market conditions.

Other applications of regimes can be found in elec-tricity markets. These markets often are oligopolies,which is also the case in TAC SCM. Becker et al.and Mount et al. both model spikes in electricityprices as a regimes switching model, based on Markovswitching models (Becker, Hurn, and Pavlov 2007;Mount, Ning, and Cai 2006).

Over the past decades, research has been into iden-tifying and predicting regimes, but also into regimechanges. Regime changes are important events in timeseries, in which one can obtain strategic advantage ifthey are identified or predicted correctly. Recently,Massey and Wu (2005) emphasize this importance ofthe ability to detect and respond to regime shifts, as itis critical to economic success. If these shifts are not de-tected, this could lead to lower profits. Also, Massey etal. elaborate on the causes of under- and overreactionto (predicted) regime shifts.

The foundation of research into regime shifts lies in1989, when Hamilton published a paper about model-ing regime changes using postwar U.S. real GNP as in-put (Hamilton 1989). Hamilton used Markov matrices

to observe regime shifts, by drawing probabilistic infer-ence about whether and when they may have occurredbased on the observed behavior of series.

The algorithms of the regime model implementedby the MinneTAC agent, which identifies and predictseconomic regimes, are based on economic theory andincorporate some adapted techniques. The model’sregimes are identified as extreme scarcity, scarcity, abalanced situation, oversupply, and extreme oversup-ply (Ketter 2007). We define the regime set as R ={ES, S, B, O,EO}. Each day of the game, the regimeprobability distribution is determined and a regimeprediction is made. Thus, each regime Rk in set R∀k = 1, . . . , M (where M = 5) has a certain likelihoodto hold as dominant regime. The regime probabilitiesin set R sum up to 1 and the regime with the highestprobability is considered to be dominant.

Currently, the regime model identifies and predictseconomic regimes based on yesterday’s normalizedmean (mid-range) sales price (Ketter et al. 2007) for anarbitrary day d, which is also referred to as npd−1, aswell as on quantities. Also, for some predictions, the en-tire history of normalized mean sales prices is used. Theproblem is that in supply chains, data for the day itselfis not available, and thus identification is done based onhistorical information, whereas predictions are made forday d up to planning horizon h (usually twenty days).

The data the model is based on is normalized, asnormalization allows for machine learning across mar-kets, i.e., over different products, which can be com-pared qualitatively, and it enables whole market fore-casts. Also, the range of the variable is fixed and thusknown beforehand, which simplifies computations.

Regime identification is currently done by offlineand online machine learning. Offer acceptance prob-abilities associated with given product prices (approx-imated using a Gaussian Mixture Model (Tittering-ton, Smith, and Makov 1985)), derived from observ-able historical and current sales market data, are clus-tered offline using the K-Means algorithm (MacQueen1967), which yields distinguishable statistical patterns(clusters). These clusters are labeled with the properregimes after correlation analyses. Regime probabili-ties, which are indicative of how market conditions are,are determined by calculating the normalized price den-sity of all clusters, given sales prices.

There are several techniques for predicting regimes,each of which is most suitable for a specific time span.Today’s regimes can be predicted based on exponen-tially smoothed price predictions, as extensively elabo-rated in (Ketter et al. 2008). Short-term regime pre-diction for tactical decision making is done by using aMarkov prediction process. This process is based onthe last normalized smoothed mid-range price. To thisend, Markov transition matrices, which are created of-fline by a counting process over past games, are beingused. Long-term regime prediction is done by using aMarkov correction-prediction process. This process isalmost equal to the short-term regime prediction, but

is based on all normalized smoothed mid-range pricesup and until the previous day, instead of just the lastnormalized smoothed mid-range price.

Extending the Regime ModelWe extend the regime model as introduced by Ketteret al. (2009) with procurement information. Our modeldifferentiates from the model introduced in (Ketter etal. 2009) in the fact that it is based on procurementinformation, and not solely on sales information. In or-der to be able to select the most promising procurementvariable, we apply the information gain metric (An-drews et al. 2007) to a data set containing procure-ment information on prices, quantities, offers, orders,and requests for quotation (RFQs) gathered from his-torical game data1. We now continue with discussingsome details of the information gain metric.

Information GainThe information gain is an entropy-based metric thatindicates how much better we can predict a specific tar-get by knowing certain features. In our case, the targetis the dominant regime and the features are procure-ment variables. According to Mitchell (Mitchell 1997),the entropy is a commonly used measure in informa-tion theory, which characterizes the purity of an arbi-trary collection of examples. Let W be a collection ofgame results, numW the number of possible values of W(i.e., regimes), and P (w) the probability that W takeson value w. Assuming a uniform probability distribu-tion, the latter probability is equal to the proportion ofW belonging to class w. The entropy of a collection ofgame results, entropy (W ), is defined as

entropy (W ) =numW∑w=1

−P (w) log2 P (w) . (1)

Now let V be an attribute (procurement variable),numV the number of possible values of V , P (v) theprobability that V takes on value v, and P (w|v) theprobability that W takes on value w, given v. Theentropy of a collection of game results W given an at-tribute V , entropy (W |V ), is defined as

entropy (W |V ) =numV∑v=1

P (v)

(numW∑w=1

−P (w|v) log2 P (w|v)

). (2)

Using (1) and (2), the information gained on outcomeW from attribute V , gain (W,V ), can be calculated us-ing

gain (W,V ) = entropy (W )− entropy (W |V ) . (3)1Data set contains 2007 Semi-Finals games played on

the SICS tac5 server (IDs: 9321–9328), 2007 Finals gamesplayed on the SICS tac3 server (IDs: 7306–7313), 2008Semi-Finals games played on the University of Minnesota’s(UMN) tac02 server (IDs: 761–769), and 2008 Finals gamesplayed on the UMN tac01 server (IDs: 792–800).

Here, the entropy of a collection of game results Wgiven an attribute V is subtracted from the entropyof W . For more information on the calculation of theinformation gain, see (Andrews et al. 2007). Apply-ing the information gain metric to several possible pro-curement variables using our data set results in Table 1.The higher the score, the more information the variableadds to the model.

Variable GainOffer price 0.7393Order price 0.5400RFQ lead time 0.5106RFQ reserve price 0.4909Ratio orders / offers 0.4555Order quantity 0.4310RFQ quantity 0.3901Demand 0.3833Offer quantity 0.3174

Table 1: Information gain scores for several procure-ment variables.

The latter table shows that component offer prices(recalculated on a per-product basis) are most likelyto improve the predictive capabilities of the regimemodel. Therefore, we add these component offer prices,to which we refer to as op, to the existing regime model.These prices result from all requests for quotation in aTAC SCM game. These requests include requests forshort-term delivery (e.g., 5 days) as well as long-termdelivery (e.g., 50 days).

Regime Model VariablesAs both regime identification and prediction are basedon normalized sales prices, this section introduces amathematical formulation of the normalized price npfor product g, npg, on day d, which is calculated as

npg =priceg

asmCostg +∑numPartsg

j=1 nomPartCostg,j

, (4)

using the product price priceg and the nominal manu-facturing costs for each component j belonging to prod-uct g, nomPartCostg,j , respectively.

The estimated normalized mean price can be volatileand lacks information on trends. Therefore, (exponen-tial) smoothing can be applied, resulting in yesterday’ssmoothed normalized minimum and maximum pricesnpmin

d−1 and npmaxd−1 . Equations (5) through (7) show the

smoothing process using a Brown linear exponentialsmoother (Brown, Meyer, and D’Esopo 1961), whereα is a smoothing factor determined by a hill-climbingprocedure which minimizes the variance of npmin

d−1.

npmin′d−1 = α · npmin

d−1 + (1− α) · npmin′d−2 , (5)

npmin′′d−1 = α · npmin′

d−1 + (1− α) · npmin′′d−2 , (6)

npmind−1 = 2 · npmin′

d−1 − npmin′′d−1 . (7)

Here, two price components are smoothed separately,after which they are combined. This way, changes inthe mean and trend can be captured. Brown linear ex-ponential smoothing is applied, since the trend as wellas the mean vary over time. The calculation of npmax

d−1is done by analogy with (5) through (7). Now, yester-day’s exponentially smoothed normalized price on anarbitrary day d can be calculated by averaging yester-day’s exponentially smoothed normalized minimum andmaximum prices, i.e., npd−1 = 0.5 · npmin

d−1 +0.5 · npmaxd−1 .

We extend the regime model with the mean compo-nent offer price for product g, opg, on day d, such that

opg =∑numS

s=1

∑numCg

c=1 opg,s,c

numOpg

, (8)

where numS refers to the number of suppliers, numCg

refers to the number of components for product g, andnumOpg represents the number of entries of the pro-curement variable. Thus, the mean component offerprice is calculated by means of a counting process overall component prices. These prices are extracted fromall requests for quotation related to a specific productsend on an arbitrary day.

Because we would like the variable to include someinformation about other preceding days as well, sothat it represents a trend instead of an event, we ap-ply an exponential smoother to the variable. Thesmoothed value of yesterday’s product-based compo-nent offer price, opd−1, is calculated as shown in (9),where β represents a smoothing factor and is deter-mined using a hill-climbing procedure which minimizesthe variance of opd−1:

opd−1 = β · opd−1 + (1− β) · opd−2 . (9)

Smoothing is done by taking a certain percentage (β)of yesterday’s value of op. Then, the remaining per-centage is taken of the previous (smoothed) value ofvariable op, i.e., the day before yesterday’s value, af-ter which both values are added. This is a less complexway of smoothing than applies for the normalized meansales price, though it still smoothes out the variable’spossible volatility.

Regime IdentificationThe existing regime model is based on a Gaussian Mix-ture Model (GMM) (Titterington, Smith, and Makov1985) with a fixed number (N) of Gaussian compo-nents. A GMM is used, since it is able to approximatearbitrary density functions. Also, a GMM is a semi-parametric approach which allows for fast computingand uses less memory than other approaches (Ketter etal. 2009). In the current regime model, fixed means,µi, which are equally distributed, and variances, σ2

i ,where i is used as an index to point to a component,∀i = 1, . . . , N , are used. The fixed means and variancesare chosen so that adjacent Gaussians are two standarddeviations apart (Ketter et al. 2008), are used. Thismight lead to good results when fitting a model on one

dimension, but after adding a dimension to the model,fixed means and variances might prevent the GMM toreach a good fit. Therefore, we do not constrain themeans and variances for now.

As is the case with the current model, we apply theExpectation-Maximization algorithm to determine theGaussian components of the GMM and their prior prob-abilities, P (ζi). The Gaussian components are, unlikethe components of the current model, based on both npand op. For now, the number of Gaussian components,N , is equal to 3, because this helps visualizing and ex-plaining the main concepts of the model. We definethe bivariate density of the sales and procurement offerprices as

p (np ∩ op) =N∑

i=1

P (ζi) p (np ∩ op|ζi) . (10)

This density is equal to the sum of all Gaussian com-ponents p (np ∩ op|ζi) multiplied by their prior prob-abilities P (ζi). We define a typical two-dimensionalGaussian component asp (np ∩ op|ζi) = p

(np ∩ op|{µnpi

∩ µopi∩ σnpi

∩ σopi

})

= Ae−

((np−µnpi )

2σ2npi

+(op−µopi )

2σ2opi

)

, (11)where A is the amplitude of the Gaussian density, µnpi

and µopiare the means of the i-th Gaussian on the

normalized mean price and mean offer price axes, andσnpi

and σopiare their respective standard deviations.

Figure 1 shows a two-dimensional GMM created us-ing a training set2 and the equations discussed above.For sake of illustration, the model contains three Gaus-sian components, and is trained with a maximum of fif-teen hundred iterations on data on the high market seg-ment. Experiments show that using less iterations doesnot guarantee a well fit model. Figures 1(a) and 1(c)show projections of the individual Gaussians and thedensity of the normalized mean sales price and meanoffer price onto the axes of both variables, whereas thedensity is shown as a surface in Figure 1(b).

In order to find patterns in these probabilities, weneed to calculate to which extent each of the proba-bilities is a member of each Gaussian component. Theposterior probability for each component P (ζi|np ∩ op)follows from (10) after applying Bayes’ rule:

P (ζi|np ∩ op) =P (ζi) p (np ∩ op|ζi)∑N

j=1 P (ζj) p (np ∩ op|ζj),

∀i = 1, . . . , N . (12)Equation (12) applies for each Gaussian, and thus

the vector of posterior probabilities for the two-dimensional Gaussian Mixture Model can be described

2Training set contains 2007 Semi-Finals games played onthe SICS tac5 server (IDs: 9323–9327), 2007 Finals gamesplayed on the SICS tac3 server (IDs: 7308–7312), 2008Semi-Finals games played on the UMN tac02 server (IDs:763–768), and 2008 Finals games played on the UMN tac01server (IDs: 794–799).

(a) (b) (c)

Figure 1: A two-dimensional GMM based on np and op, using three Gaussian components, where (a) and (c) showprojections of the Gaussian components used in the model demonstrated in (b). Each individual Gaussian has itsown density function, and combining these densities results into a single price density.

as η (np ∩ op) = [P (ζ1|np ∩ op) , . . . , P (ζN |np ∩ op)].For each combination of normalized mean prices andcomponent offer prices, we can calculate η (np ∩ op) us-ing the fitted Gaussian Mixture Model.

We need to find clusters within the posterior prob-abilities, as they can be linked to regimes, becausethey each describe certain conditions and characteris-tics. Clustering the posterior probabilities in M clus-ters is done using K-Means clustering. We tested differ-ent clustering algorithms, such as spectral clustering,which resulted in similar clusters. Clustering is done infifteen replicates, using a maximum of one hundred it-erations. Experiments show that this maximum allowsthe algorithm to converge nicely on our data set. Thesquared Euclidean distance measure is used to measuredistances to the cluster centers for each data point. Fig-ure 2 shows results of applying the K-Means clusteringalgorithm to the GMM we have fit to our data on high-range products with three clusters.

We link the cluster centers P (ζ|Rk) to regimes, butthese clusters do not tell us which cluster representswhich regime. Let us assume for now we know how to

Figure 2: Three identified clusters in the posterior prob-abilities, P (ζi|np ∩ op), of a two-dimensional GMM.

assign the proper regime label to each cluster. Thenwe can rewrite p (np ∩ op|ζi) by analogy with (10) in aform that shows the dependence of the normalized salesprice and mean component offer price on the regime Rk.

p (np ∩ op|Rk) =N∑

i=1

P (ζi|Rk) p (np ∩ op|ζi) ,

∀k = 1, . . . ,M . (13)

In (13), P (ζi|Rk) refers to the N by M matrix re-sulting from the K-Means algorithm, and p (np ∩ op|ζi)refers to the individual Gaussians. When applyingBayes’ rule, we obtain the probability of regime Rk de-pendent on the sales and offer prices, as defined in (14).

P (Rk|np ∩ op) =P (Rk) p (np ∩ op|Rk)∑Mj=1 P (Rj) p (np ∩ op|Rj)

,

∀k = 1, . . . , M . (14)

Figure 3 shows a plot of the regime probabilities(given normalized sales price and component offer price)

Figure 3: Regime probabilities, P (Rk|np ∩ op), forproducts of the high segment.

for products of the low segment, resulting from a Gaus-sian Mixture Model and clustering its posterior proba-bilities in three clusters. We observe that under differ-ent conditions, different regimes are dominant, as differ-ent clusters have high probabilities for different combi-nations of component offer and sales order prices. Thus,each identified regime is dominant for certain combina-tions of both variables the model is based on.

Online, regime probabilities can be calculated (inter-polated) for different combinations of normalized meanprices and procurement-side offer prices using the datathat is displayed in Figure 3. Then, the dominantregime Rr on an arbitrary game day d is

Rr s.t. r = argmax ~P (Rk|npd−1 ∩ opd−1) . (15)1≤k≤M

We conclude that in general, the regime identificationstill works similar to the existing appoach. However,a dimension has been added to the Gaussian MixtureModel, causing differently structured probability densi-ties as well as regime clusters. This requires reformu-lating the entire regime identification model.

Regime PredictionWe already introduced three techniques for regime pre-diction, each of which has its own characteristics andoptimal time span to predict regimes for. This sectioncontinues with discussing these techniques. First, wewill elaborate on the exponential smoother process, af-ter which we will discuss Markov processes.

The exponential smoother regime prediction processis more reactive to the current market condition thanany other method, because the exponential smootherprocess takes yesterday’s information as input. Thisinformation is smoothed with information on precedingdays to reduce volatility.

The prediction process calculates a trend, trmin

d−1, inthe minimum normalized mean sales price by using (5)and (6), and smoothing factor γ: tr

min

d−1 = γ1−γ ·(npmin′

d−1 −npmin′′

d−1 ). The exponentially smoothed maximum nor-malized trend, tr

max

d−1 , is calculated in a similar way. Us-ing the minimum and maximum trends, the mid-rangetrend of the sales price can be calculated by averagingboth values, and thus tr

np

d−1 = 0.5 · trmin

d−1 + 0.5 · trmax

d−1 .Using yesterday’s value and the mid-range trend of

sales prices, one can estimate the value of sales pricesn days in the future as shown in (16), where h is theplanning horizon:

npd+n = npd−1+(1 + n)· trnp

d−1, ∀n = 0, . . . , h . (16)

Now that we have defined a way to predict future val-ues of np, we can also formulate a way to predict futurevalues of op. This is also done using a trend, as the ap-proach performs good in the current model and we pre-fer to keep things similar to the current approach. Also,it is beyond our scope to look into alternatives. How-ever, the calculation is somewhat different from what

we have discussed for np, because of the fact that theprocurement variable (i.e., component offer prices) rep-resents a mean value and we do not have minimum andmaximum values at hand. Also, different smoothing isapplied to offer prices than to sales prices, which meansthe two Brown linear exponential smoothing compo-nents used for calculating the sales price trend are notavailable for our procurement variable.

We calculate the trend of opd−1, by computing thedifference between yesterday’s smoothed component of-fer price and the smoothed offer price of the day beforeyesterday, i.e., tr

op

d−1 = opd−1 − opd−2. Then, futurevalues for n days into the future up to planning horizonh are calculated similar to future values of np, as shownin (17). Here, the calculated trend is added 1+n timesto the last known value of the component offer prices.We express the future values of op mathematically as

opd+n = opd−1 +(1 + n) · tropd−1, ∀n = 0, . . . , h . (17)

Using the future values for day d + n of np and op,the probability for each regime (given both prices) forn days in the future can be calculated similar to (14),where the density of the variables dependent on regimeRk is calculated using (13) by marginalizing over theindividual Gaussians and cluster centers:

P(Rk|npd+n ∩ opd+n

)=

p(npd+n ∩ opd+n|Rk

)P (Rk)

∑Mj=1 p

(npd+n ∩ opd+n|Rj

)P (Rj)

,

∀k = 1, . . . , M . (18)

Making short-term and long-term regime predictionscan be done using Markov prediction and correction-prediction processes (Isaacson and Madsen 1976). Incontrast to the exponential smoother process where fu-ture prices are predicted, resulting indirectly in predic-tions of future regimes, regimes are predicted directly.Markov processes are less responsive to current marketsituations, as they make use of a history of events.

For short-term regime predictions, a Markov predic-tion process is used. This process is based on the lastprice measurement and on a Markov transition matrixT (rd+n|rd). This matrix is created by means of a count-ing process on offline data and contains posterior prob-abilities of transitioning to regime rd+n on day d + n(i.e., n days into the future), given rd, which is thecurrent regime. Note that we introduce the symbol rfor denoting regimes, to emphasize a focus shift, as weare not looking at the individual regimes in the way wewere looking at them until now. Now, the regimes rep-resent rows and columns in a Markov transition matrixand we use probability vectors combined with transitionmatrices, instead of single regime probabilities.

Ketter et al. distinguish between two types of Markovpredictions: n-day prediction and repeated one-day pre-diction. The first type is an interval prediction, where

for each day n up to planning horizon h a Markov tran-sition matrix is computed offline (per product, or atwhatever level of detail the regime model is defined),whereas the second type only needs one Markov tran-sition matrix (per product). This matrix is repeated ntimes up to planning horizon h.

The calculation of the n-day prediction for n daysahead is performed recursively as

~P (rd+n|npd−1 ∩ opd−1) =∑rd+n

. . .∑rd−1

{~P (rd−1|npd−1 ∩ opd−1)·

Tn (rd+n|rd−1)}

,

∀n = 0, . . . , h , (19)

where the previous (identified or predicted) poste-rior regime probabilities dependent on the normalizedmean sales prices and normalized mean component offerprices are multiplied with the applicable Markov tran-sition matrix. The calculation of the repeated one-dayprediction is done similarly, as shown in (20). However,only one transition matrix is used, T0 (rd|rd−1), which isrepeated n times for predictions of n days in the future.

~P (rd+n|npd−1 ∩ opd−1) =∑rd+n

. . .∑rd−1

{~P (rd−1|npd−1 ∩ opd−1)·

n∏t=0

T0 (rd|rd−1)}

,

∀n = 0, . . . , h . (20)

Long-term predictions are made using a Markov cor-rection-prediction process. The latter process is almostsimilar to the Markov prediction process we already dis-cussed. The difference is that the long-term predictionprocess is modeled based on the entire history of prices,instead of just the last price measurement. Hence, weneed to extend the probability of regime rd−1 dependenton npd−1 and opd−1 so that it incorporates the entirehistory of the values of np and op. This correction isdone by applying a recursive Bayesian update to theidentified regime probabilities. Then, predictions fortoday are based on the corrected regime probabilitieson day d− 1, while predictions for n days in the futureare done recursively. Equation (21) defines the n-dayvariant of the Markov correction-prediction process.

~P(rd+n|

{np1, . . . , npd−1

} ∩ {op1, . . . , opd−1

})=

∑rd+n

. . .∑rd−1

{~P

(rd−1|

{np1, . . . , npd−1

}∩

{op1, . . . , opd−1

}) · Tn (rd+n|rd−1)}

,

∀n = 0, . . . , h . (21)

The same principles apply to the calculation of therepeated one-day correction-prediction. Again, the dif-ference is the usage of the Markov transition matrices.

~P(rd+n|

{np1, . . . , npd−1

} ∩ {op1, . . . , opd−1

})=

∑rd+n

. . .∑rd−1

{~P

(rd−1|

{np1, . . . , npd−1

}∩

{op1, . . . , opd−1

}) ·n∏

t=0

T0 (rd|rd−1)}

,

∀n = 0, . . . , h . (22)

Note that the Markov transition matrices whichare being used in (21) and (22) (Tn (rd+n|rd−1) ∀n =0, . . . , h and T0 (rd|rd−1), respectively) are the samematrices as used in (19) and (20).

Performance EvaluationInitial offline experiments show that a five-regime modelis preferred over a three-regime model, as the regimesare identified and predicted more accurately. In theseexperiments, we train regime models using our trainingdata, after which the performance is evaluated using atest set3.

Regime Identification EvaluationWe evaluate each model on several aspects. Correla-tions between identified dominant regime and economicregime identifiers, such as factory utilization and fin-ished goods, are evaluated to test the feasibility of theclusters that have been found. Finally, the feasibility ofthe course of regime probabilities is evaluated.

The best performing regime model is configured withfive regimes and ten Gaussian components, with whichwe continue our experiments. Figure 5 shows an exam-ple of the course of its identified regime probabilitiesthrough an arbitrary TAC SCM game of a typical agentin the mid-range product segment. Regimes are clearlydominant for a certain time and regimes do not switchtoo often, which is also visible in Figure 4. Here, thedaily dominant regimes of the same game are displayed,together with the course of the normalized mean salesprice and some other economic identifiers.

Regime labels are assigned to the cluster centers bymeans of correlation studies, of which the results areshown in Figure 6. In these studies, seventy-five hun-dred data points drawn from the training set are used tocalculate the Pearson correlation (Fisher 1925) betweenthe identified dominant regime and economic regimeidentifiers. This number of samples is large enough toensure p-values below 0.01. Characteristics of the clus-ters are in line with the human interpretation of the

3Test set contains 2007 Semi-Finals games played on theSICS tac5 server (IDs: 9321, 9322, 9328), 2007 Finals gamesplayed on the SICS tac3 server (IDs: 7306, 7307, 7313), 2008Semi-Finals games played on the UMN tac02 server (IDs:761, 762, 769), and 2008 Finals games played on the UMNtac01 server (IDs: 792, 793, 800).

Figure 4: Overview of the dominant regimes during a TAC SCM game, together with the course of the normalizedmean sales price and other economic identifiers, i.e., finished goods and factory utilization.

regime definitions. For example, during scarcity, thereis a shortage of finished goods, and sales prices are high.

Regime Prediction Evaluation

We evaluate how the agent would predict the proba-bilities for five regimes with our ten-Gaussian GMMusing our test set that contains historical data, and wecompare these results to those of the current imple-mentation of the regime model on the same test set. Inour experiments, we evaluate three product segments,i.e., the low-, mid-, and high-range segment, to get arough indication of the performance. This performanceis measured in terms of the percentage of correctly pre-dicted regimes and regime switch occurrences. We de-fine the correct regime as the identified regime.

Figure 5: Course of identified regime probabilities overgame time in an arbitrary TAC SCM game.

Regimes are predicted using a combination of theprediction methods we have discussed. Today’sregime probabilities are predicted using an exponen-tial smoother process. Short-term predictions (up toten days into the future) are done using a Markov pre-diction process. We choose to use the n-day variant,since repeated one-day prediction tends to converge toa stationary distribution sooner. Finally, for long-termpredictions up to twenty days into the future, we ap-ply a Markov correction-prediction process. The upperbound (or planning horizon h) is set to twenty days,because production scheduling is set up every day forthe next twenty days. This possibly leads to new ormore accurate insights in future developments.

EO O B S ES−1

−0.5

0

0.5

1

Regime

Co

rrel

atio

n C

oef

fici

ents

Correlation Coefficients Regimes

Finished GoodsFactory UtilizationRatio Offer / Demand (sales)Order Price (sales)Offer Price (procurement)

Figure 6: Correlation coefficients of five identifiedregime clusters, resulting from a two-dimensional Gaus-sian Mixture Model with ten individual Gaussians cre-ated based on mid-range product data.

Looking at the prediction performance of the selectedregime model (compared to the performance of the cur-rent model), one can observe small improvements, aswell as small deteriorations. This observation is sup-ported by Table 2. Here, the accuracy measured in apercentage of correctly predicted regimes and regimeswitches (within plus or minus two days). The tableshows the performance of the new model compared tothe current model for three market segments (i.e., low-range, mid-range, and high-range products). The scoreof the best performing model is printed bold.

Correct Segment New model Existing modelRegime Low 46.43% 51.86%

Mid 40.63% 52.93%High 40.48% 41.91%

Time Low 48.68% 52.78%Mid 53.00% 43.44%High 50.93% 46.30%

Table 2: Prediction performances of a two-dimensionalGMM with five clusters and ten individual Gaussians(new model) compared to the performance of a one-dimensional five-cluster GMM with sixteen individualGaussians (existing model).

The differences between the scores of the existingmodel as presented in Table 2 and the results presentedin (Ketter et al. 2006) can be caused by the fact thatKetter et al. only apply a Markov prediction process foreach prediction. Furthermore, we experiment on 2007and 2008 data, which contains different games than theones used in (Ketter et al. 2006). This may result inmarket conditions which are harder to predict, becauseagents are getting more advanced and more competitiveevery year, which can cause other decisions to be madeunder the same conditions and thus games could havedifferent characteristics.

On average, our model predicts regime switches moreaccurately than the current model. This indicates thatthe addition of procurement information does affect theprediction performance positively. However, regimesare predicted with a lower accuracy than the currentmodel. Though overall, the differences between the per-formances are small, and therefore we conclude that theaddition of procurement information in the proposedway does not affect the prediction performance greatly.

Conclusions and Future WorkWe have added procurement information (componentoffer prices) to a sales-based regime model, which isused for predicting price density probabilities in a sim-ulated supply chain. The regime model has been ex-tended by adding a dimension to the Gaussian Mix-ture Model which is at the core of the economic regimemodel. After evaluating the performance of our modelthrough experiments with the MinneTAC agent, whichcompetes in the TAC SCM game for several years, wefind that the new regime model has a similar overall

predictive performance as the existing model. Regimeswitches are predicted more accurately, whereas theprediction accuracy of dominant regimes is slightlyworse.

However, by adding procurement information, wehave enriched the model and we expect the new regimemodel to yield good results once implemented in theMinneTAC agent. The agent will be able to make de-cisions based on more information that is indicative ofhow market conditions are. Therefore, the agent willclassify market conditions differently and more well-considered, which could lead to improvements. Also,our model seems to be as robust as the current model,and thus, maybe new types of decision making comewithin reach. Furthermore, we see opportunities forapplications in the procurement market, which is worthfurther research.

For further research, we suggest to run tests againstother competitors in the TAC SCM game to verify themodel. We are currently looking into the use of time-delayed procurement information, as procurement in-formation could be a leading indicator for the salesmarket. Subsequently, more or different procurementinformation can serve as a basis to the model. Not onlyusing other data (e.g., offer quantities) or time-delayeddata, but also applying other smoothing techniques tooffer prices and normalizing data fall within the scopeof the meaning of different procurement information.

References

Andrews, J.; Benisch, M.; Sardinha, A.; and Sadeh,N. 2007. What Differentiates a Winning Agent: AnInformation Gain Based Analysis of TAC-SCM. InAAAI Workshop on Trading Agent Design and Anal-ysis (TADA’07).Becker, R.; Hurn, S.; and Pavlov, V. 2007. ModellingSpikes in Electricity Prices. The Economic Record82(263):371–382.Brown, R. G.; Meyer, R. F.; and D’Esopo, D. A. 1961.The fundamental theorem of exponential smoothing.Operations Research 9(5):673–687.Collins, J.; Arunachalam, R.; Sadeh, N.; Eriksson,J.; Finne, N.; and Janson, S. 2005. The SupplyChain Management Game for the 2006 Trading AgentCompetition. Technical Report CMU-ISRI-05-132,Carnegie Mellon University, Pittsburgh, PA.Collins, J.; Ketter, W.; and Gini, M. 2009. Flex-ible Decision Control in an Autonomous TradingAgent. Electronic Commerce Research and Applica-tions (ECRA).Fisher, R. A. 1925. Statistical Methods for ResearchWorkers. Oliver & Boyd.Hamilton, J. D. 1989. A New Approach to the Eco-nomic Analysis of Nonstationary Time Series and theBusiness Cycle. Econometrica 57(2):357–384.Isaacson, D. L., and Madsen, R. W. 1976. Markov

Chains – Theory and Applications. John Wiley &Sons.Ketter, W.; Collins, J.; Gini, M.; Gupta, A.; andSchrater, P. 2006. Identifying and Forecasting Eco-nomic Regimes in TAC SCM. In Poutre, H. L.; Sadeh,N.; and Janson, S., eds., AMEC and TADA 2005,LNAI 3937. Springer Verlag Berlin Heidelberg. 113–125.Ketter, W.; Collins, J.; Gini, M.; Gupta, A.; andSchrater, P. 2007. A predictive empirical model forpricing and resource allocation decisions. In Proc. of9th Int’l Conf. on Electronic Commerce, 449–458.Ketter, W.; Collins, J.; Gini, M.; Gupta, A.; andSchrater, P. 2008. Tactical and Strategic Sales Man-agement for Intelligent Agents Guided By EconomicRegimes. ERIM Working paper, ERS-2008-061-LIS.Ketter, W.; Collins, J.; Gini, M.; Gupta, A.; andSchrater, P. 2009. Detecting and Forecasting Eco-nomic Regimes in Multi-Agent Automated Exchanges.Decision Support Systems.Ketter, W. 2007. Identification and Prediction of Eco-nomic Regimes to Guide Decision Making in Multi-Agent Marketplaces. Ph.D. Dissertation, University ofMinnesota, Twin-Cities, USA.MacQueen, J. B. 1967. Some methods of classificationand analysis of multivariate observations. In Proceed-ings of the Fifth Berkeley Symposium on MathematicalStatistics and Probability (MSP’67), 281–297.Massey, C., and Wu, G. 2005. Detecting regime shifts:The causes of under- and overestimation. ManagementScience 51(6):932–947.Mitchell, T. M. 1997. Mchine Learning. McGraw-Hill.ISBN: 0071154671.Mount, T. D.; Ning, Y.; and Cai, X. 2006. PredictingPrice Spikes in Electricity Markets using a Regime-Switching model with Time-Varying Parameters. En-ergy Economics 28(1):62–80.Osborn, D. R., and Sensier, M. 2002. The Predictionof Business Cycle Phases: Financial Variables and In-ternational Linkages. National Institute Econ. Rev.182(1):96–105.Titterington, D.; Smith, A.; and Makov, U. 1985. Sta-tistical Analysis of Finite Mixture Distributions. NewYork: John Wiley and Sons.

economic regime identiﬂcation and prediction in tac scm...

Documents