Time Series Models for Forecasting Construction Costs Using Time Series Indexes

Seokyon Hwang, Ph.D., A.M.ASCE1

Abstract: Construction often involves considerable time gaps between cost estimation and on-site operations. In addition, many operations are performed over a considerable period of time. Accordingly, estimating construction costs must consider the trend of costs in the market, where construction costs normally change over time. Insight into the trend of construction costs in the market, therefore, is beneficial, even critical, to the effective cost management of construction projects. To support the development of such insight, the present study built two time series models by analyzing time series index data and compared them with existing methods. The developed time series models accurately predict construction cost indexes. In particular, the models respond sensitively and swiftly to a quick, large change of costs, which allows for accurate forecasting over the short- and long-term periods. Overall, the models are effective for understanding the trend of construction costs. DOI: 10.1061/(ASCE)CO.1943-7862.0000350. © 2011 American Society of Civil Engineers.

CE Database subject headings: Construction costs; Time series analysis; Forecasting; Predictions; Models.

Author keywords: Construction costs; Time series analysis; Forecasting models; Continual prediction.

Introduction

Accurate cost estimating has been a challenge in the construction industry, in which cost estimates are prepared under conditions of high uncertainty (Rao and Grobler 1997; Flood 1997). Attempting to improve the accuracy of cost estimates, researchers have pursued, developed, and implemented various methods. The approaches of these methods to cost estimating can be categorized loosely as factor analysis or pattern analysis. Factor analysis is concerned with the analysis of the relationships among costs and factors that are believed to affect construction costs, thereby accounting for the impact of such factors on construction costs. The majority of existing methods for estimating construction costs fall into this category. Some of these methods are deterministic (Trost and Oberlender 2003; Attalla and Hegazy 2003), and others are stochastic (Touran 2003; Doğan et al. 2006).

Pattern analysis is concerned with the identification of the behaviors of construction costs over time in the markets. This approach often uses cost indexes that either represent or are closely relevant to prices of labor, materials, and equipment for construction. Estimating costs based on such indexes has been adopted widely in the construction industry (Diekman 1983). Methods in this category generally estimate costs by the following two approaches (Diekman 1983): (1) associating the total cost of a facility with several major parameters of the facility, such as size, system, and location; and (2) analyzing the trend of indexes relevant to construction costs over time (Koppula 1981; Williams 1994; Wilmot and Cheng 2003). The present study is concerned with the latter approach.

Prices of resources for construction projects change over time; some experience drastic change, and others change gradually. Under the circumstances, it is critical for a contractor to predict future resource prices as accurately as possible (Issa 2000). Noting the significance of such prediction, two time series models are presented in this paper, aiming to support prediction of the change of construction costs due to economic conditions in the markets. The models predict the trend of construction cost change in the markets by analyzing the pattern of a few economic indexes over time. This paper also includes a discussion of the effectiveness of the selected time series economic indexes for predicting construction costs. The analytical procedure for model development presented in this paper illustrates the analysis of autocorrelated and/or cross-correlated time series indexes for estimating construction costs. The presented procedure will serve as a guideline for practitioners to develop prediction models specific to their own data or publicized market data, either at the project level or for individual cost items.

Time Series Analysis

This section briefly describes time series analysis techniques; additional details about time series analysis can be found in many statistics textbooks, including Box and Jenkins (1976) and Brockwell and Davis (2002). Time series data that are observed periodically have been known to be useful for forecasting processes, and the time series analysis technique has been adopted in a variety of disciplines such as economics, medical science, and other natural sciences and engineering fields (Box and Jenkins 1976; Brockwell and Davis 2002). Various cost indexes in construction are typical products of such time series data. Analysis of such time series data generally results in a time series model. The resulting models are used either to explain or to project the behaviors of subjects such as products of a process, effect of treatments, or change of market conditions.

Based on the number of series involved in a model, time series models are classified either as autoregressive moving average (ARMA) models or as multivariate autoregressive (VAR) models. The following briefly describes a fundamental difference between a

1Assistant Professor, Dept. of Civil Engineering, Lamar Univ., Room 2622 Cherry Engineering Building, PO Box 10024, Beaumont, TX 77710. E-mail: [email protected]

Note. This manuscript was submitted on June 1, 2009; approved on January 6, 2011; published online on January 8, 2011. Discussion period open until February 1, 2012; separate discussions must be submitted for individual papers. This paper is part of the Journal of Construction Engineering and Management, Vol. 137, No. 9, September 1, 2011. © ASCE, ISSN 0733-9364/2011/9-656–662/$25.00.

656 / JOURNAL OF CONSTRUCTION ENGINEERING AND MANAGEMENT © ASCE / SEPTEMBER 2011

J. Constr. Eng. Manage. 2011.137:656-662.

Downloaded from ascelibrary.org by UNIVERSITY OF NEW ORLEANS on 06/27/14. Copyright ASCE. For personal use only; all rights reserved.

http://dx.doi.org/10.1061/(ASCE)CO.1943-7862.0000350




univariate model ARMA(p, q) and a multivariate model VAR(p): “The analysis of multivariate time series $\{X_t\}$ is concerned with not only serial dependence within each component series $\{X_{ti}\}$ but also interdependence between the different component series $\{X_{ti}\}$ and $\{X_{tj}\}$, where $i \neq j$” (Brockwell and Davis 2002). In other words, an ARMA(p, q) model, a univariate stationary time series model, is used to represent the time-lagged relationship of autocorrelated observations within a single series. On the other hand, a VAR(p) model simultaneously accounts for both the autocorrelation within a single series and the time-lagged relationship of correlated observations among interrelated multiple series; the latter relationship is called cross correlation. Most of the properties of univariate stationary time series are extended to multivariate series (Brockwell and Davis 2002); thus, developing a VAR(p) model follows the same procedure as modeling an ARMA(p, q).

Regardless of the type of model, time series analysis generally follows five steps in order (Box and Jenkins 1976; Brockwell and Davis 2002): (1) examine the main features of the data series; (2) check for dependency in the data series; (3) choose a model to fit the data series; (4) diagnose the constructed model; and (5) forecast and update. In many cases, the original data series needs a transformation process between Steps (1) and (2): if a series contains a trend component and/or a seasonality component, then the nonstationary series must be transformed into a stationary series by removing the components. The transformed stationary series then can be represented by one of the typical time series models, such as the ARMA(p, q) model or the VAR(p) model (Brockwell and Davis 2002). Additional details about each step are provided in the following section.

The time series analysis technique, or simply, time series analysis, is a well-established method that has been utilized successfully in many domains for forecasting processes. However, it has rarely been used in the construction domain (Sparkes and McHugh 1984). Only a few previous research efforts in the domain have utilized the method. In the context of cost prediction, previous applications can be found in Koppula (1981), Williams (1994), and Wilmot and Cheng (2003). Abdelhamid and Everett (1999) also applied time series analysis to experimental data to predict construction productivity improvement.

Time Series Models for Forecasting Construction Costs

Variables and Data

The trend of costs for goods or services that are commonly used for construction can provide insight into the construction cost changes that are caused by changing demand, prices, and other economic conditions in the markets (Ostwald 1984; Williams 1994). To track and analyze the trends, various periodic cost data are collected and reported over time. Collected periodic cost data are processed to develop cost indexes that represent relative scales of costs for fixed quantities of goods or services between different periods. As noted previously, such periodic cost indexes often serve well as useful sources for estimating future construction costs. For instance, by knowing the trend of cost changes, engineers can take cost fluctuations over time into consideration as they prepare budgets at the planning stage and, if necessary, adjust existing budgets during execution.

Periodic construction cost data are available either as aggregated indexes or as price reports about individual resources. In the present study, the construction cost index (CCI) reported by Engineering News-Record (ENR) was selected as a representative periodic construction cost data source. The CCI represents a weighted aggregate index of the prices of constant quantities of labor and materials that are used commonly in most construction projects; thus, it can provide a guideline about cost change that is applicable to most construction projects (Grogan 1992). For this study, the CCI was selected to develop a more generic method that is applicable to most construction projects. There is another index similar to CCI, the building cost index (BCI), which is reported by the same publisher. Although the two indexes are developed by different calculation methods, they exhibit high collinearity (Hwang 2009). The monthly CCI data were collected for the January 1960 to December 2006 period [Fig. 1(a)].

Notable characteristics of the CCI series (Hwang 2009) are (1) at the macro level, the series grows steadily while showing a relatively rapid increase at some periods [Fig. 1(a)]; (2) at the micro level, the series exhibits small variations between adjacent observations while following a fairly deterministic linear trend [Fig. 1(c)]; (3) seasonal variations are sinusoidal with a period of 12 months [Fig. 1(d)]; and (4) the series exhibits correlation between sequential observations (detailed subsequently in the subsection “Checking Autocorrelations and Cross Correlations in Data”). These qualities suggest that the CCI is a nonstationary seasonal time series, which may be explained effectively by the time-lagged relationships when the CCI is appropriately transformed into a stationary series.

As discussed previously, construction costs are likely to be influenced by a variety of economic factors. Therefore, tracking and analyzing the trend of significant factors can be helpful in predicting construction costs. Williams (1994) argued, “The construction price is affected by many factors, including inflationary trends in the economy, the current level of construction activity, seasonal effects, and interest rates.” Three factors (prime rate (PR), housing starts (HS), and consumer price index (CPI)) were analyzed in the previous study; PR and HS were found to have no significant interdependence with CCI (Hwang 2009). Thus, the two indexes were dropped in the present study because no additional value for predicting the CCI was expected by incorporating the two variables. Consequently, only the CPI was selected as an influential factor.

The CPI, one of the key economic indicators, represents the average change in prices for goods and services in the United States (U.S. Dept. of Labor 2006). Although the CPI functions similarly to the CCI, the goods and services used to calculate the CPI are completely different from those referenced to calculate the CCI because of their different contexts. Nevertheless, a significant advantage of the CPI for estimating construction costs is that it provides a measure of inflation over time, which can influence construction costs. For instance, a change in the price of fuel may significantly affect operation costs of construction activities that use heavy equipment intensively. CPI data were collected for the same period as that of the CCI. The behavior of the CPI over time was observed to be similar to the behavior of the CCI over time: the four characteristics of the CCI series were observed in the CPI series, but the pattern of the series was not congruent over time [Fig. 1(b)].

Defining CCI and CPI as $X_1$ and $X_2$, respectively, the two series can be represented as components of a vector-valued time series by $X_t = [X_{t1}, X_{t2}]^T$, $t = 0, \pm 1, \ldots$, where t = observation order. With the two series, two types of time series models (univariate ARMA(p, q) model or multivariate VAR(p) model) can be considered. The univariate model can be used to represent the time-lagged influence of past observations of CCI, $X_1$, on the series itself in the future. It is also worthwhile to determine whether incorporating CPI, $X_2$, in the process of predicting CCI would improve the accuracy of the predictions. For this purpose, the multivariate model can serve well by investigating how $X_1$ is influenced not only by its


previous observations, but also by the series $X_2$, which is potentially interrelated with $X_1$.

Transformation of Nonstationary Series CCI and CPI

Because both the CCI and CPI series have the linear trend and yearly seasonality, the two series may be represented as a realization of a process $X_t = m_t + s_t + N_t$, where $m_t$ = trend component; $s_t$ = seasonal component; and $N_t$ = random noise. The nonstationary series is transformed into a stationary series by removing $m_t$ and $s_t$ from the original series; then the transformed series can be fitted by one of the typical time series models. Both $m_t$ and $s_t$ are first estimated, and then values fitted by the estimated components are subtracted from the original observations. If the resulting residuals are autocorrelated, they can be fitted by a time series model. In this sense, modeling the transformed time series is in fact to find the pattern of the autocorrelated residuals. The pattern of such autocorrelated residuals often can be represented by time series models (Pankratz 1991).

In this analysis, both series were transformed by two methods, TM-1 and TM-2. TM-1 consists of two steps: remove the overall trend and then eliminate seasonality. Because the CCI series grows in a fairly linear pattern, its trend could be estimated by simple linear regression (Hwang 2009). To generalize, assume a trend component as $m_t = at + b$. Subtracting the trend from the original series yields a new series, a so-called detrended series. Remaining seasonality with a period of 12 months in the detrended series was then removed by lag-12 differencing, $\nabla_{12} X_t = X_t - X_{t-12}$. More details about the lag-time differencing method can be found in Box and Jenkins (1976). If the resulting series transformed by TM-1 is defined as ${}_1R_t$, it can be written without the noise term as

$${}_1R_t = \{X_{t+12} - [a(t+12) + b]\} - [X_t - (at + b)] = X_{t+12} - X_t - 12a$$

In a different context, TM-2, a simple differencing method, was implemented without estimating the trend component $m_t$ because a simple differencing method often results in the removal of a trend and a seasonal component at the same time (Brockwell and Davis 2002). Noting the pattern of seasonality in the original series, the lag-12 differencing was applied directly to the original data. Referring to the resulting series as ${}_2R_t$, it can be written without the noise term as ${}_2R_t = X_{t+12} - X_t$.

The resulting series ${}_2R_t$ did not show any deterministic trend; thus, it was not necessary to estimate a trend component. The only difference between ${}_1R_t$ and ${}_2R_t$ is a constant term $12a$. This indicates that the two transformed CCI series follow the same pattern with a constant difference. The two new series remain nonstationary after the one-time application of TM-2. Thus, another lag-1 differencing was applied to both new series, which resulted in a new series, ${}_iR'_t = {}_iR_t - {}_iR_{t-1}$, $i = 1, 2$. The double-transformed series are identical because the second transformation canceled the constant term $12a$. The same procedure was applied to the CPI series, except for one additional application of the lag-1 differencing. In this way, the two original nonstationary seasonal time series were successfully transformed into stationary series: both transformed series oscillate around zero without exhibiting a particular trend or cyclic seasonality.
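The two transformations can be sketched in a few lines of Python. This is a minimal illustration on synthetic data, not the computation used in the study: the function names `lag_diff`, `tm1`, and `tm2`, the synthetic series, and the use of NumPy's `polyfit` for the linear trend are all assumptions made for the example. It checks numerically that the TM-1 and TM-2 outputs differ only by the constant $12a$ and coincide after one further lag-1 differencing.

```python
import numpy as np

def lag_diff(x, lag):
    """Return the lag-`lag` difference x[t + lag] - x[t]."""
    x = np.asarray(x, dtype=float)
    return x[lag:] - x[:-lag]

def tm1(x, period=12):
    """TM-1: subtract a fitted linear trend, then apply lag-`period` differencing."""
    t = np.arange(len(x))
    a, b = np.polyfit(t, x, deg=1)      # least-squares fit of m_t = a*t + b
    detrended = x - (a * t + b)
    return lag_diff(detrended, period), a

def tm2(x, period=12):
    """TM-2: apply lag-`period` differencing directly to the original series."""
    return lag_diff(x, period)

# Synthetic stand-in for a monthly index (illustrative values only):
# linear trend + yearly sinusoidal cycle + random noise.
rng = np.random.default_rng(0)
t = np.arange(240)
x = 2000.0 + 8.0 * t + 20.0 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 5, t.size)

r1, a = tm1(x)
r2 = tm2(x)

# The two transformed series differ only by the constant 12a ...
assert np.allclose(r2 - r1, 12 * a)
# ... so one further lag-1 differencing makes them identical.
assert np.allclose(lag_diff(r1, 1), lag_diff(r2, 1))
```

Because the constant offset cancels under lag-1 differencing, estimating the trend slope is unnecessary when the series is differenced twice, which is consistent with the observation in the text.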

Checking Autocorrelations and Cross Correlations in Data

A time series is meaningful only when dependency exists among observations within the series (Brockwell and Davis 2002). Therefore, it is necessary to investigate dependency in the series of interest prior to further modeling. A number of methods are available for the test. A simple yet systematic way is to examine the sample autocorrelation function (SACF) and sample partial autocorrelation function (SPACF). If there is more than one series, the sample cross-correlation functions (SCCFs) of those series, in addition to SACF and SPACF, must be examined. If dependency does not exist among observations, then approximately 95% of the autocorrelations of the series and the cross correlations fall inside of the bounds $\pm 1.96/\sqrt{n}$ for the lags larger than zero, where n = number of observations in the series.
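The bound check described above can be illustrated with a short Python sketch. The helper names `sample_acf` and `share_inside_bounds` and the two synthetic series are assumptions for this example; the sample autocorrelation uses the common estimator that divides the lagged sum of products by the lag-0 sum of squares.

```python
import numpy as np

def sample_acf(x, max_lag=20):
    """Sample autocorrelations at lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    denom = np.dot(d, d)
    return np.array([np.dot(d[: n - h], d[h:]) / denom for h in range(max_lag + 1)])

def share_inside_bounds(x, max_lag=20):
    """Fraction of autocorrelations at lags 1..max_lag within +/-1.96/sqrt(n)."""
    bound = 1.96 / np.sqrt(len(x))
    acf = sample_acf(x, max_lag)[1:]    # skip lag 0, which is always 1
    return float(np.mean(np.abs(acf) <= bound))

rng = np.random.default_rng(1)
noise = rng.normal(size=500)    # independent observations
walk = np.cumsum(noise)         # strongly dependent observations

print(share_inside_bounds(noise))   # most lags are expected to fall inside the bounds
print(share_inside_bounds(walk))    # few or no lags are expected to fall inside
```

A series whose share is well below 0.95 shows dependency worth modeling; the same check applied to model residuals is the diagnostic used later in the paper.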

Fig. 1. (a) Monthly index of CCI (January 1960 through December 2006); (b) monthly index of CPI (January 1960 through December 2006); (c) trend and seasonality in CCI (January 1975 through December 1978); (d) seasonality in CCI after detrending (January 1975 through December 1978)




For model building, a window of data, January 1975 to February 1991, was selected. Such selection is possible because the properties of a stationary process allow for using any window of a series to develop a time series model (Brockwell and Davis 2002). A practical reason for selecting the specific window was to directly compare the performance of new models with the existing methods. Fig. 2(a) presents the SACF and SPACF of the CCI series: a few of the sample autocorrelations appear to be significant because they are out of the bounds. Similarly, Fig. 2(b) depicts the SACFs and SCCFs of the CCI and CPI series. The dependency check identified that both the single series of CCI and the vector series of CCI and CPI were meaningful for performing univariate and multivariate time series analysis, respectively.

Model Selection

Box and Jenkins (1976) stated that a stationary series can be described by its SACF and SPACF: the functions reflect the nature of dependence existing in the series; thus, the patterns of the functions over time lags provide a guideline for choosing the tentative type of a model. However, identification of the model type based on those functions is not always feasible and straightforward; it is confusing in many cases. This was true with the present study, as shown in Fig. 2. Nevertheless, the plots at least suggest that the series may not be well fitted by either the AR(p) process or the MA(q) process. In this case, various models with different orders are constructed and tested to select the best fitting model based on statistical criteria. The bias-corrected Akaike information criterion (AICC), $\mathrm{AICC} = \mathrm{AIC} + [2k(k+1)]/(n-k-1)$ with $\mathrm{AIC} = 2k + n\ln(\mathrm{RSS}/n)$, is often used as a major criterion in the model selection process, where k = number of parameters; n = number of observations; and RSS = residual sum of squares (Brockwell and Davis 2002). A model that minimizes AICC is preferred. The following Eqs. (1) and (2) represent the two selected best-fitting models (ARMA(5,5) and VAR(12) for the univariate model and the multivariate model, respectively), of which the estimates of parameters are presented in Table 1. The AICCs of the models suggest that ARMA(5,5) is slightly more accurate than VAR(12):

$$X_t = \sum_{i=1}^{5} \phi_i X_{t-i} + \sum_{j=1}^{5} \theta_j Z_{t-j} + Z_t, \quad \{Z_t\} \sim \mathrm{WN}(0, \sigma^2) \qquad (1)$$
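The AICC-based selection described before Eq. (1) can be sketched as follows. The functions implement the AIC and AICC formulas exactly as written in the text; the candidate list, its labels, and its RSS values are hypothetical numbers chosen only to show the mechanics, not results from the study.

```python
import math

def aic(rss, n, k):
    """AIC = 2k + n ln(RSS/n), as defined in the text."""
    return 2 * k + n * math.log(rss / n)

def aicc(rss, n, k):
    """AICC = AIC + 2k(k + 1)/(n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICC requires n > k + 1")
    return aic(rss, n, k) + 2 * k * (k + 1) / (n - k - 1)

# Hypothetical candidates: (label, residual sum of squares, number of parameters).
candidates = [("small", 5200.0, 3), ("medium", 4100.0, 10), ("large", 4050.0, 24)]
n = 180  # hypothetical number of observations

# The preferred model is the one that minimizes AICC.
best = min(candidates, key=lambda c: aicc(c[1], n, c[2]))
print(best[0])  # medium
```

In this made-up example the largest model has the smallest RSS but is penalized for its 24 parameters, so the medium model is preferred.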

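Table 1. Summary of Selected Time Series Models

| Model | Parameter | Estimate | | AICC |
| --- | --- | --- | --- | --- |
| ARMA(5,5) | ϕ1 | -0.2333 | | 1700 |
| | ϕ2 | 0.3346 | | |
| | ϕ3 | 0.3575 | | |
| | ϕ4 | -0.1844 | | |
| | ϕ5 | -0.5306 | | |
| | θ1 | 0.2666 | | |
| | θ2 | -0.4622 | | |
| | θ3 | -0.4906 | | |
| | θ4 | 0.2095 | | |
| | θ5 | 0.9712 | | |
| VAR(12) | Φ0 | -0.5962 | | 1755 |
| | | -0.0077 | | |
| | Φ1 | -0.0168 | 1.3135 | |
| | | -0.0004 | -0.3428 | |
| | Φ2 | -0.1900 | 2.5723 | |
| | | 0.0004 | -0.2353 | |
| | Φ3 | 0.0283 | -13.5706 | |
| | | 0.0014 | -0.3508 | |
| | Φ4 | -0.0181 | -6.1062 | |
| | | 0.0016 | -0.3955 | |
| | Φ5 | 0.1667 | -13.9622 | |
| | | -0.0006 | -0.2494 | |
| | Φ6 | 0.0750 | -9.6228 | |
| | | 0.0005 | -0.2417 | |
| | Φ7 | 0.0902 | -6.1325 | |
| | | 0.0005 | -0.0682 | |
| | Φ8 | 0.1127 | 2.3074 | |
| | | -0.0007 | -0.1710 | |
| | Φ9 | 0.0748 | -7.5695 | |
| | | 0.0007 | -0.0341 | |
| | Φ10 | -0.0713 | -8.8808 | |
| | | -0.0009 | 0.0700 | |
| | Φ11 | -0.0404 | -4.0376 | |
| | | -0.0001 | 0.0961 | |
| | Φ12 | -0.3702 | 3.9137 | |
| | | -0.0005 | -0.3554 | |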

Fig. 2. Dependency among observations up to lag 20: (a) SACF and SPACF of CCI series; (b) SACFs and SCCFs of CCI and CPI series




$$X_t = \phi_0 + \sum_{h=1}^{12} \Phi_h X_{t-h} + Z_t, \quad \{Z_t\} \sim \mathrm{WN}(0, \Sigma) \qquad (2)$$
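To make the structure of Eq. (2) concrete, the sketch below computes a one-step-ahead forecast from a VAR(p) model by setting the future noise term to its mean of zero. The function name `var_one_step`, the VAR(2) order, and all coefficient values are hypothetical and chosen for readability; they are not the Table 1 estimates, and the inputs are assumed to be the transformed (stationary) series.

```python
import numpy as np

def var_one_step(history, phi0, Phi):
    """One-step-ahead forecast phi0 + sum_h Phi[h-1] @ X_{t-h}.

    history: array of shape (T, m), most recent observation last.
    phi0:    intercept vector of shape (m,).
    Phi:     list of p coefficient matrices, each of shape (m, m).
    """
    history = np.asarray(history, dtype=float)
    p = len(Phi)
    if history.shape[0] < p:
        raise ValueError("need at least p past observations")
    forecast = np.array(phi0, dtype=float)
    for h in range(1, p + 1):
        # history[-h] is the observation h steps before the forecast time.
        forecast = forecast + np.asarray(Phi[h - 1], dtype=float) @ history[-h]
    return forecast

# Hypothetical VAR(2) on two transformed series (illustrative values only).
phi0 = [0.1, 0.0]
Phi = [np.array([[0.5, 0.2], [0.0, 0.3]]),
       np.array([[-0.1, 0.0], [0.1, 0.1]])]
history = np.array([[1.0, 2.0],    # X_{t-2}
                    [3.0, 4.0]])   # X_{t-1}

print(var_one_step(history, phi0, Phi))  # approximately [2.3, 1.5]
```

The off-diagonal entries of each coefficient matrix carry the cross-series influence, which is how the second series can contribute to the forecast of the first.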

Forecasts by Models

Given the data set of CCI and CPI for the period of January 1975 to February 1991, one-step-ahead forecasts of CCI were first generated by each model. Comparison with the actual observations shows that both of the models are very effective (Fig. 3), but ARMA(5,5) is slightly better than VAR(12), as indicated previously by the AICC. The averages of the absolute forecast errors were also calculated and compared: ARMA(5,5) produced 19.72, whereas VAR(12) produced 38.86. This again confirms that both models are very accurate, considering the magnitude of the original observations during the predicted period.
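One-step-ahead forecasting with an ARMA model, together with the average of absolute forecast errors, can be sketched as below. This is a simplified conditional recursion that sets pre-sample values to zero and approximates each innovation by the previous forecast error; it is an assumption made for illustration, not necessarily the prediction algorithm used in the study, and it operates on the stationary series rather than on the CCI itself. The ARMA(1,1) coefficients and simulated data are hypothetical.

```python
import numpy as np

def arma_one_step_forecasts(x, phi, theta):
    """One-step-ahead ARMA forecasts with pre-sample values set to zero.

    The innovation Z_t is approximated by the forecast error
    x[t] - forecast[t] (a conditional approximation).
    """
    x = np.asarray(x, dtype=float)
    forecasts = np.zeros_like(x)
    errors = np.zeros_like(x)
    for t in range(x.size):
        ar = sum(phi[i] * x[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(theta[j] * errors[t - 1 - j] for j in range(len(theta)) if t - 1 - j >= 0)
        forecasts[t] = ar + ma
        errors[t] = x[t] - forecasts[t]
    return forecasts

def mean_absolute_error(observed, predicted):
    """Average of absolute forecast errors."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(observed - predicted)))

# Simulate a hypothetical ARMA(1,1) series and forecast it with the true coefficients.
rng = np.random.default_rng(2)
z = rng.normal(size=300)
x = np.zeros(300)
x[0] = z[0]
for t in range(1, 300):
    x[t] = 0.6 * x[t - 1] + z[t] + 0.3 * z[t - 1]

forecasts = arma_one_step_forecasts(x, phi=[0.6], theta=[0.3])
print(mean_absolute_error(x, forecasts))
```

With the true coefficients, the forecast errors reproduce the simulated noise, so the printed value is simply the mean absolute size of the noise.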

Checking Model Results

As stated previously, residuals from an appropriate model should be uncorrelated; this can be verified by the autocorrelations of the residuals. Figs. 4(a) and 4(b) illustrate the SACF of residuals from ARMA(5,5) and the SACFs and SCCFs of residuals from VAR(12), respectively. As the graphs show, the residuals from both models are not significantly autocorrelated or cross correlated over time lags greater than zero: at least 95% of the autocorrelations and cross correlations fall in between $\pm 1.96/\sqrt{n}$. This indicates that there is no reason to reject the constructed models.

Model Performance Comparisons

The ultimate criterion of performance of the models is the accuracy of forecasts by the models. This section compares the accuracy of the two time series models developed in this study to the accuracy of existing methods.

New Models versus Common Practices and Dynamic Regression Models

Practitioners often project a serial index into the future by averaging the change rate of the index in the recent past, assuming that the change rate in the future will be the same as its recent average. New models were first compared with such industry practices. In this comparison, two practices were considered: the average monthly change rates of CCI for the past 12 and 24 months were used to forecast CCI. For this comparison, serial observations were selected from two periods: January 1975 to February 1991 and January 2003 to December 2004. Given the selected periods, each method was implemented to predict the index for two periods: March 1991 to February 1993 and January 2005 to December 2006. The latter period, January 2005 to December 2006, was intentionally selected because of the relatively larger growth and more dynamic fluctuations of CCI during the period, as illustrated in Fig. 1(a). Given the two selected windows of data, each method was implemented to produce forecasts up to 24 months from the last month of each selected period. For the purpose of determining how far each method can predict accurately, the average of absolute errors (or deviations) of forecasts for the predicted period was measured for each model in two dimensions: over 12 months and 24 months. Table 2 summarizes the results of this comparison by presenting the average of absolute errors of forecasts of each model for the two periods. ARMA(5,5) and VAR(12) outperformed the industry practices, yielding much smaller errors than those of the

Fig. 4. Dependency among residuals up to 20 lags: (a) autocorrelations of residuals from ARMA(5,5); (b) autocorrelations and cross correlations of residuals from VAR(12)

Fig. 3. Comparison of one-step-ahead forecasts and true observations: (a) ARMA(5,5); (b) VAR(12). Axes: observed CCI (x) versus one-step-ahead forecast of CCI (y). Fitted lines, in panel order: y = 1.0002x, R² = 0.9994; y = 1.0003x, R² = 0.9974




industry practices compared. For the longer prediction period (24 months), the industry practices yielded much larger errors when the series involved fast growth and large fluctuations (January 2005 to December 2006) than when the series experienced a steady growth (March 1991 to February 1993). Meanwhile, the new models were relatively consistent in the scale of error for each period regardless of the length of prediction period.
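The industry practice used as a baseline in this comparison can be sketched as follows. The text specifies only that the average monthly change rate over the past 12 or 24 months is projected forward; compounding that rate month by month, the function name `average_rate_forecast`, and the synthetic index are assumptions made for this illustration.

```python
def average_rate_forecast(index, window, horizon):
    """Project an index forward using its mean month-over-month change rate.

    index:   past observations, oldest first (needs window + 1 values).
    window:  number of recent monthly changes to average (e.g. 12 or 24).
    horizon: number of months to forecast.
    """
    if len(index) < window + 1:
        raise ValueError("need window + 1 observations")
    recent = index[-(window + 1):]
    rates = [(recent[i + 1] - recent[i]) / recent[i] for i in range(window)]
    mean_rate = sum(rates) / window
    forecasts = []
    level = index[-1]
    for _ in range(horizon):
        level *= 1.0 + mean_rate    # assumption: the average rate compounds monthly
        forecasts.append(level)
    return forecasts

# Synthetic index growing 1% per month (illustrative values only).
history = [100.0 * 1.01 ** i for i in range(25)]
print(average_rate_forecast(history, window=12, horizon=3))
print(average_rate_forecast(history, window=24, horizon=3))
```

Because the projection extrapolates a single averaged rate, it cannot react to a turn in the series within the forecast horizon, which is consistent with the larger long-horizon errors reported for the volatile period.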

In the same manner, new models were compared with two dynamic regression (DR) models, DR-4 and DR-5 (Hwang 2009). Again, the results of the comparison are presented in Table 2. For each case, the two DR models yielded slightly smaller errors than those of new models.

New Models versus Williams’ Neural Network Model

The new models then were compared with the neural network model that was developed using the CCI series from July 1967 to February 1991 by Williams (1994). The model was implemented to generate 1-month-ahead monthly percent changes of CCI from March 1991 to October 1992. The neural network model yielded an RSS of 5.31, whereas the new models produced much smaller RSSs for the same period: 2.20 by ARMA(5,5) and 3.29 by VAR(12); this is strong evidence in favor of the two new models.

New Models versus Koppula’s Models

Another comparison was conducted between the new models and existing time series models (the Box-Jenkins ARIMA process model and the Holt-Winters smoothing model) developed by Koppula (1981) by using monthly CCI from January 1962 to December 1976. As in the comparison presented above, 24 forecasts of CCI for the period of January 1977 to December 1978 were generated. The average of absolute percent errors was calculated by $100 \times |(\text{observed value} - \text{forecasted value})/\text{observed value}|$ for the periods of 12 months and 24 months. Table 3 presents the results of this comparison. The Holt-Winters smoothing model was excluded from the comparison of the 24-month prediction because the model used the actual values observed during the initial 12 months to predict from the 13th month. Overall, the new models are either better than or as good as the existing models.
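The error measure used in this comparison can be written as a small Python function. The function name `mean_absolute_percent_error`, the optional `horizon` argument, and the sample values are assumptions for this example; the formula itself follows the definition given in the text.

```python
def mean_absolute_percent_error(observed, forecast, horizon=None):
    """Average of 100 * |(observed - forecast) / observed| over the first `horizon` points."""
    if len(observed) != len(forecast):
        raise ValueError("observed and forecast must have the same length")
    if horizon is None:
        horizon = len(observed)
    if horizon <= 0 or horizon > len(observed):
        raise ValueError("horizon must be between 1 and the number of forecasts")
    terms = [100.0 * abs((o - f) / o)
             for o, f in zip(observed[:horizon], forecast[:horizon])]
    return sum(terms) / horizon

# Illustrative values only (not data from the study).
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
print(mean_absolute_percent_error(observed, forecast))             # about 5.0
print(mean_absolute_percent_error(observed, forecast, horizon=2))  # about 7.5
```

Calling the function with `horizon=12` and `horizon=24` on the same 24 forecasts gives the two columns reported in Table 3.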

Limitations and Future Research

An effective way to monitor and predict the trend of construction costs is to analyze, frequently and regularly, serial indexes of construction cost data or periodic cost reports about materials, equipment, and labor of trades. By tracking quantitative serial cost data, project personnel can develop a deeper understanding of the trend of cost change over time. This research inevitably involves a few limitations when it comes to implementation in reality, which are due to the nature of data. A limitation of the proposed models may derive from the nature of the construction cost index, in which the trend of the index may not necessarily agree with the trend of cost of the specific resource. A project may use a particular resource heavily. Project personnel then may need to pay more attention to the price of the specific resource in the market. In this case, an aggregated index of weighted costs of a few selected resources may provide limited help for estimating costs. Another potential limitation is the accuracy of long-term prediction when rapid changes in demand of resources in the market are introduced, which is more or less an inevitable situation associated with time series analysis. Nevertheless, by the nature of time series analysis, the time series models can respond to such changes to a certain extent because the models allow adjustment to existing predictions over time as new observed values are fed into the models.

There are a few future research opportunities. The forecasting approach presented in this paper may also be useful at the level of individual resources if there are many cost reports available about the prices of individual resources used for construction, including materials, labor of trades, and equipment. Thus, the presented approach can be extended to predicting prices of individual resources. More potential future research includes integration of the models with existing project control methods. For example, earned value management and cash flow analysis can take advantage of more accurate forecasts. Similarly, procurement management also can be integrated with the price forecasting models. Finally, it is also worthwhile to incorporate the impact of project-specific factors attributable to the change of construction costs along with economic conditions in analysis.

Conclusions

Construction operations on-site usually commence months or even years after cost estimates of the operations are prepared. Furthermore, the operations are then conducted over a considerable period of time. In such construction projects, it is often difficult to estimate and manage construction costs because construction costs are not static but change dynamically over time. Under these circumstances, it is important to be able to develop insights about the trends of construction costs by continually monitoring and analyzing changes of costs over time in the market. For this, effective

Table 3. Results of Accuracy Comparison: New Models versus Koppula’s Models

                     Average prediction percent |error|^a
Prediction method    12-month    24-month
ARMA(5,5)            0.36        1.04
VAR(12)              0.98        1.01
ARIMA^b              0.80        1.08
Holt-Winter^b        0.70        N/A

^a The average of absolute percent errors.
^b Existing Koppula’s model.

Table 2. Results of Accuracy Comparison: New Models versus Industry Practices and Dynamic Regression Models

                                                           Average prediction |errors|^a
Predicted period                      Prediction method    12-month    24-month
March 1991 through February 1993      ARMA(5,5)            17.42       51.75
                                      VAR(12)              47.74       49.93
                                      12-month average     77.33       76.76
                                      DR-4                 14.96       13.00
                                      DR-5                 16.29       13.16
January 2005 through December 2006    ARMA(5,5)            34.56       33.02
                                      VAR(12)              39.84       36.88
                                      DR-4                 19.21       20.70
                                      DR-5                 22.92       22.50

^a The average of absolute errors of forecasts.




tracking and analysis of time series construction cost data is very beneficial to cost management in construction. This study presented two time series models, ARMA(5,5) and VAR(12), to support estimating construction costs by means of forecasting construction cost indexes. A series of comparisons showed that the new models are more accurate than existing models previously developed by others. In particular, the new models responded sensitively and swiftly to quick, large changes when predicting the series for the periods following the change. Comparing the accuracy of the two models, the univariate model ARMA(5,5) was slightly more accurate than VAR(12). Although it is rational to believe that overall price inflation affects construction costs in the market, adding CPI did not produce a significantly better prediction. Instead, the trend of CCI was explained more effectively by its previous values only. Therefore, considering the cost of analysis and prediction, the univariate model ARMA(5,5) is slightly preferred because it involves only one variable, whereas the multivariate model involves two variables. A notable advantage of the models is that, using only quantitative data, they produce objective predictions without involving additional subjective judgment. The proposed models are envisioned to serve the following purposes well: preparing the initial budget for a new project, taking advantage of short-term fluctuations of prices of resources for the activities, and determining the level of contingency due to price inflation.
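For readers who wish to reproduce accuracy measures of the kind reported in Tables 2 and 3, the following sketch computes the average of absolute errors and the average of absolute percent errors, following the definitions given in the table footnotes. The function names and the index values are hypothetical and are not taken from the data analyzed in this study.

```python
def mean_abs_error(actual, predicted):
    """Average of absolute forecast errors, in index points."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_abs_percent_error(actual, predicted):
    """Average of absolute percent errors, in percent of the actual value."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical observed and forecast index values (illustrative only).
actual = [4800.0, 4820.0, 4850.0, 4900.0]
predicted = [4790.0, 4830.0, 4870.0, 4880.0]

mae = mean_abs_error(actual, predicted)           # index points
mape = mean_abs_percent_error(actual, predicted)  # percent
```

The percent form is scale-free, which makes it the more convenient measure when comparing results across series with different index levels.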

References

Abdelhamid, T. S., and Everett, J. G. (1999). “Time series analysis for construction productivity experiments.” J. Constr. Eng. Manage., 125(2), 87–95.

Attalla, M., and Hegazy, T. (2003). “Predicting cost deviation in reconstruction projects: Artificial neural networks versus regression.” J. Constr. Eng. Manage., 129(4), 405–411.

Box, G. E. P., and Jenkins, G. M. (1976). Time series analysis: Forecasting and control, Holden-Day, CA.

Brockwell, P. J., and Davis, R. A. (2002). Introduction to time series and forecasting, Springer, New York.

Diekmann, J. E. (1983). “Probabilistic estimating: Mathematics and applications.” J. Constr. Eng. Manage., 109(3), 297–308.

Doğan, S. Z., Arditi, D., and Günaydın, H. M. (2006). “Determining attribute weights in a CBR model for early cost prediction of structural systems.” J. Constr. Eng. Manage., 132(10), 1092–1098.

Flood, I. (1997). “Modeling uncertainty in cost estimates: A universal extension of the central limit theorem.” Proc., 4th Congress on Computing in Civil Engineering, ASCE, New York, 551–558.

Grogan, T. (1992). “Keeping track of a moving target.” Eng. News-Rec., 228(13), 42–47.

Hwang, S. (2009). “Dynamic regression models for prediction of construction costs.” J. Constr. Eng. Manage., 135(5), 360–367.

Issa, R. R. A. (2000). “Application of artificial neural networks to predicting construction material prices.” Proc., 8th Int. Conf. on Computing in Civil and Building Engineering, ASCE, Reston, VA, 1129–1132.

Koppula, S. D. (1981). “Forecasting engineering costs: Two case studies.” J. Constr. Div., 107(4), 733–743.

Ostwald, P. E. (1984). Cost estimating, Prentice-Hall, Englewood Cliffs, NJ.

Pankratz, A. (1991). Forecasting with dynamic regression models, Wiley, New York.

Rao, G. N., and Grobler, F. (1997). “Integrated analysis of cost risk and schedule risk.” Proc., 4th Congress on Computing in Civil Engineering, ASCE, New York, 1404–1411.

Sparkes, J. R., and McHugh, A. K. (1984). “Awareness and use of forecast techniques in British industry.” J. Forecast., 3(1), 37–42.

Touran, A. (2003). “Probabilistic model for cost contingency.” J. Constr. Eng. Manage., 129(3), 280–284.

Trost, S. M., and Oberlender, G. D. (2003). “Predicting accuracy of early cost estimates using factor analysis and multivariate regression.” J. Constr. Eng. Manage., 129(2), 198–204.

U.S. Dept. of Labor (2006). “Consumer price indexes.” ⟨http://www.bls.gov/cpi/home.htm⟩.

Williams, T. P. (1994). “Predicting changes in construction cost indexes using neural networks.” J. Constr. Eng. Manage., 120(2), 306–320.

Wilmot, C. G., and Cheng, G. (2003). “Estimating future highway construction costs.” J. Constr. Eng. Manage., 129(3), 272–279.









