

Transportation Research Part C 18 (2010) 120–139

Contents lists available at ScienceDirect

Transportation Research Part C

journal homepage: www.elsevier.com/locate/trc

Multi-agent model predictive control of signaling split in urban traffic networks ☆

Lucas Barcelos de Oliveira, Eduardo Camponogara *

Department of Automation and Systems Engineering, Federal University of Santa Catarina, Cx.P. 476, 88040-900 Florianópolis, SC, Brazil

Article info

Article history:
Received 15 October 2008
Received in revised form 3 March 2009
Accepted 29 April 2009

Keywords:
Urban traffic networks
Split control
Distributed agents
Distributed optimization
Model predictive control

Abstract

The operation of large dynamic systems such as urban traffic networks remains a challenge in control engineering, to a great extent due to their sheer size, intrinsic complexity, and nonlinear behavior. Recently, control engineers have looked for unconventional means for modeling and control of complex dynamic systems, in particular the technology of multi-agent systems, whose appeal stems from their composite nature, flexibility, and scalability. This paper contributes to this evolving technology by proposing a framework for multi-agent control of linear dynamic systems, which decomposes a centralized model predictive control problem into a network of coupled but small sub-problems that are solved by the distributed agents. Theoretical results ensure convergence of the distributed iterations to a globally optimal solution. The framework is applied to the signaling split control of traffic networks. Experiments conducted with simulation software indicate that the multi-agent framework attains performance comparable to conventional control. The main advantages of the multi-agent framework are its graceful extension and localized reconfiguration, which require adjustments only in the control strategies of the agents in the vicinity.

© 2009 Elsevier Ltd. All rights reserved.

1. Introduction

The steady advances in communications and computer technology are shaping the way traffic control systems are designed. Today, operating centers can receive data from remote sensors and apply control policies that respond to the prevailing traffic conditions. Among the existing real-time control systems, the Traffic-responsive Urban Control (TUC) framework (Diakaki et al., 2002) has drawn interest for its simplicity, robustness, and good performance corroborated by field applications in Munich, Southampton, and Chania (Bielefeldt et al., 2004; Diakaki and Papageorgiou, 1997; Kosmatopoulos et al., 2006). TUC uses a modified store-and-forward model of traffic flow (Gazis and Potts, 1963) with purely continuous state and control variables, which greatly simplifies the synthesis of a control strategy. In its baseline form, TUC has an off-line and an on-line module (Diakaki et al., 2002). The off-line module solves an unconstrained linear-quadratic-regulator (LQR) problem that minimizes a quadratic cost function on queue lengths and deviations from nominal split signals. The on-line module produces feasible split signals, which satisfy green time bounds and add up to cycle time, by solving a quadratic program that minimizes the distance from the infeasible signals obtained with the LQR policy. Such a framework does not necessarily reach optimal solutions to the underlying constrained control problem (Camacho and Bordons, 2004). To this end, model predictive control (MPC) approaches have been proposed to explicitly handle constraints and thereby improve solution quality of the TUC framework (Aboudolas et al., 2007; de Oliveira and Camponogara, 2007).

☆ This research was supported in part by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) under Grant #473841/2007-0.
* Corresponding author. Tel.: +55 48 3721 7688; fax: +55 48 3721 9934.
E-mail address: [email protected] (E. Camponogara).

0968-090X/$ - see front matter © 2009 Elsevier Ltd. All rights reserved.
doi:10.1016/j.trc.2009.04.022


The technology of multi-agent systems has also advanced in the past decades, particularly so in artificial intelligence and software engineering (Jennings, 2000; Maturana et al., 2005). This evolving technology aims to arrange agents of limited perception and expertise in an organization to perform tasks that are beyond the abilities of the agents. The problem-solving ability of a multi-agent system emerges from the interactions of the agents, which employ some form of reasoning to cooperate with others and resolve conflicts when driven by the interests of the organization.

Intelligent agents and multi-agent systems have been successful in solving unstructured problems (for which adequate models are not known), in the replacement of and assistance to humans (Tomás and Garcia, 2005; Rigolli and Brady, 2005; Nguyen-Duc et al., 2008), in solving high abstraction problems (Pechoucek et al., 2006; Tumer and Agogino, 2007), and in handling discrete decisions (Yamashita et al., 2005; de Oliveira et al., 2005; Balan and Luke, 2006). The nature of these problems contrasts with dynamic control problems, which are typically structured (for which good models based on differential equations are known) and where the aim is to control machines, the decisions are of low level and demand guarantees of stability and convergence, and the control variables are continuous.

While multi-agent systems are very adaptive in unstructured problems, they have been mostly used as a software engineering paradigm in the field of dynamic control systems (Maturana et al., 2005; Srinivasan and Choy, 2006; Tatara et al., 2007). Control engineers and computer scientists are bridging the gap between these disciplines by developing multi-agent systems to cope with the sheer size and complexity of large dynamic control systems (Li et al., 2005; Manikonda et al., 2001; Tatara et al., 2005; Negenborn et al., 2008). The appeal for multi-agent technology stems from the composite nature, flexibility, and scalability.

Aligned with these efforts, this paper proposes a framework for a network of distributed agents to control linear dynamic systems, which are put together by interconnecting linear sub-systems with local input constraints. Our framework decomposes the optimization problem arising from the MPC approach into a network of coupled, but small sub-problems to be solved by the agent network. Each agent senses and controls the variables of its sub-system, while communicating with agents in the vicinity to obtain neighborhood variables and coordinate their actions. A well-crafted problem decomposition and coordination protocol ensure convergence of the agents’ iterates to a global optimum of the MPC problem.

The work reported here builds upon preceding work on distributed control (Camponogara et al., 2002; Camponogara and Talukdar, 2007) by exploiting the linear dynamic structure to develop simpler models and algorithms. The paper focuses on the development of the multi-agent MPC framework and its application to the control of signaling split in urban traffic networks. While being able to attain performance comparable to centralized MPC, the multi-agent MPC framework is more robust in that the failure of a control agent compromises only its local sub-system. It also supports a plug-in technology that allows graceful expansion and reconfiguration to be performed locally, rather than having to be coordinated at the control center.

The remaining sections are structured as follows. Section 2 presents basic concepts of urban traffic networks and describes the store-and-forward model used by the TUC strategy. Section 3 formulates split control as an MPC problem for a network of dynamically coupled sub-systems, one for each intersection. The section also develops a decomposition of the MPC problem into a set of sub-problems and outlines a distributed algorithm for the agent network to reach an optimal solution. Section 4 reports results from computational experiments that compare the TUC LQR strategy with the multi-agent MPC approach. Section 5 draws some final remarks and suggests directions for future work.

2. Urban traffic control

The origin of urban traffic control dates back to the early 20th century with the appearance of traffic lights. The first attempts at real-time traffic control began in the 1980s with the implementation of the SCOOT (Robertson and Bretherton, 1991; Hunt et al., 1981) and SCATS (Lowrie, 1982) strategies. Nevertheless, despite the continuous research in the past decades, most control strategies, such as the acclaimed TRANSYT (Robertson, 1969), still rely on heuristics to compute the signaling split.

Urban traffic control is usually divided into several modules, each responsible for a different aspect of traffic control. These modules include ramp metering, dynamic message signaling, signaling split control, and public transport. By split we mean the green light time assigned to each street or road of an intersection. This is one of the four control factors that most influence traffic (Diakaki, 1999; Papageorgiou, 2004), with the others being stage specification, cycle duration, and offset between intersections. The signaling split control module of the TUC strategy is of particular interest to this paper.

The traffic-responsive urban control framework uses a store-and-forward model which represents traffic flow with continuous variables, thereby facilitating the synthesis of multi-variable control algorithms such as LQR and MPC. The underlying assumption of this store-and-forward model is shown in Fig. 1. The bold full line in green¹ and red represents the cycle of a junction. The square wave in full line represents the usual traffic flow model of a single stream of vehicles, using integer variables to differentiate periods with right of way and saturated flow, associated with the green portion of the cycle line, from periods with no flow, where the cycle line is red. The dashed line, on the other hand, represents the same flow of vehicles as seen by the model proposed by Gazis and Potts (1963). From this illustration, one can view the store-and-forward model as the mean flow crossing the stop line of an intersection during the control interval, meaning that this interval has to be greater than the intersection’s cycle time. The TUC traffic model does not try to realistically model the complex and rapidly evolving dynamics of traffic, such as driver reaction times, acceleration, and deceleration, but rather it is concerned with the long-term evolution of the inflows and outflows of the network.

Fig. 1. Store-and-forward flow (dashed line) and flow modeled with binary variables (full line).

¹ For interpretation of color in Fig. 1, the reader is referred to the web version of this article.

2.1. Urban traffic network modeling

The models for traffic network and traffic flow presented below are from Diakaki (1999). An urban traffic network consists of intersections or junctions joined by links which represent streets, avenues, roads or any other infrastructure connecting them. A junction comprises a set of approaches ending at a common crossing area. An approach is a subset of the lanes of a link from which vehicles are able to cross the intersection simultaneously, being defined by the topology and stages of the network. A stage, or phase, is the period of time during which the traffic light signals are held constant at the intersection. Approaches may also be further divided into one or more streams. The maximum flow that can cross the stop line of an intersection when a stream has the right of way (r.o.w.) is the saturation flow, which is usually expressed in vehicles per hour. The yellow time introduced between consecutive phases to ensure safety is known as lost time. The time frame until the repetition of stages is called cycle time or cycle. These concepts are the building blocks for traffic modeling.

Fig. 2 shows an urban traffic network with two roads, each of which has 4 lanes. Taking the horizontal link in the west–east direction as the reference, one notices two distinct approaches: one bundling vehicles willing to make a left turn and the other bundling the vehicles wishing to go straight ahead. The arrows show all streams of this network. The figure also illustrates the three stages of the intersection, which are repeated in each cycle.

Fig. 2. Basic concepts for traffic modeling.

An urban traffic network is therefore viewed as a directed graph whose nodes are the junctions $j \in J$ and whose arcs correspond to the links $z \in Z$. The sets $I_j$ and $O_j$ have the incoming and outgoing links of junction $j$, respectively. The routes of vehicles entering the network are assumed to follow statistical patterns that are modeled by turning rates. Specifically, the turning rate $s_{z,w}$ gives the rate of vehicles that reach a junction $j$ from link $z \in I_j$ and turn into a link $w \in O_j$. For the purpose of the traffic control analysis herein, turning rates $s_{z,w}$, cycle times $C_j$ and lost times $L_j$ at the junctions, and saturation flows $S_z$ for the links are all known constants.

Let $F_j$ be the set of phases at junction $j$, while $u_{j,i}$ denotes the green time of phase $i \in F_j$. It is typical to have all intersections operating with a common cycle time $C$, which is enforced by the constraint $\sum_{i \in F_j} u_{j,i} + L_j = C$. An additional constraint is $u_{j,i} \in [u_{j,i}^{\min}, u_{j,i}^{\max}]$, where $u_{j,i}^{\min}$ ($u_{j,i}^{\max}$) is the minimum (maximum) allowable green time. Also, let $V_z \subseteq F_j$ be the subset of phases for which link $z$ has the r.o.w. at junction $j$.

The traffic flow dynamics of the network link $z$ in Fig. 3 is given by

$$\Delta x_z(t+1) = \Delta T\,[q_z(t) + d_z(t) - p_z(t) - c_z(t)] \tag{1}$$

where $t = 1, 2, \ldots$ is a discrete time index and $\Delta T$ is the control interval; $x_z$ denotes the number of vehicles in link $z$; $q_z$ ($p_z$) is the inflow (outflow) of link $z$ during the time window $\Delta T[t, (t+1)]$; $d_z$ is the demand, that is, the vehicles not originating from adjacent links that enter the network; and $c_z$ is the exit flow.

Because turning rates are known, the traffic flow into link $z$ is expressed as

$$q_z(t) = \sum_{w \in I_{j_1}} s_{w,z}\, p_w(t)$$

where $s_{w,z}$ is the turning rate towards link $z \in O_{j_1}$ coming from link $w \in I_{j_1}$. Demand and exit rates are lumped together as a single disturbance, say $e_z(t)$. Assuming that inflows and outflows of link $z$ with r.o.w. are equal to their saturation flow, $S_z$, Eq. (1) becomes

$$x_z(t+1) = x_z(t) + \Delta T \left[ \sum_{w \in I_{j_1}} \frac{s_{w,z} S_w}{C} \sum_{i \in V_w} u_{j_1,i}(t) - \frac{S_z}{C} \sum_{i \in V_z} u_{j_2,i}(t) + e_z(t) \right] \tag{2}$$

where the control signal $u_{j_1,i}(t)$ is the green time for vehicles going through junction $j_1$ during phase $i$, whereas $\sum_{i \in V_z} u_{j_2,i}(t)$ is the green time for vehicles leaving link $z$. Notice that link $z$ starts at junction $j_1$ and ends at $j_2$. Generalizing Eq. (2) for all network links leads to the matrix equation

$$x(t+1) = A x(t) + B u(t) + e(t) \tag{3}$$

where $x(t)$ is the state vector; $u(t)$ is the control vector containing signals $u_{j,i}(t)$, $\forall i \in F_j$, $\forall j \in J$; $e(t)$ is the vector with the disturbances; and $A = I$ is the state matrix, whereas $B$ is the control input matrix.
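As a minimal sketch of how Eq. (2) advances the queue of a single link over one control interval, the following Python function evaluates the bracketed term directly. The function name, argument layout, and all numerical values are hypothetical and chosen only for illustration; they do not come from the paper.

```python
def store_and_forward_step(x_z, inflow_links, outflow_green, S_z, C, dT, e_z=0.0):
    """One step of Eq. (2) for a single link z.

    inflow_links: list of (s_wz, S_w, green_w), one tuple per upstream link w,
        where green_w is the total green time given to w at junction j1.
    outflow_green: total green time given to link z at junction j2.
    """
    # Mean inflow: turning rate times saturation flow, scaled by green/cycle.
    inflow = sum(s_wz * S_w / C * green_w for s_wz, S_w, green_w in inflow_links)
    # Mean outflow of link z over the cycle.
    outflow = S_z / C * outflow_green
    return x_z + dT * (inflow - outflow + e_z)


# Hypothetical numbers (not taken from the paper).
x_next = store_and_forward_step(
    x_z=20.0,                                   # vehicles currently in link z
    inflow_links=[(0.5, 0.5, 40.0), (0.25, 0.4, 30.0)],
    outflow_green=50.0,                         # seconds of green for link z
    S_z=0.5,                                    # saturation flow, vehicles/s
    C=100.0,                                    # cycle time, seconds
    dT=100.0,                                   # control interval, seconds
)
```

With these numbers the mean inflow is 0.13 vehicles/s and the mean outflow is 0.25 vehicles/s, so the queue shrinks by 12 vehicles over the interval.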

2.2. Split control

Traffic-responsive control systems adjust split signals according to the demands of the involved streams. In standard form, the TUC strategy uses the LQR technique to find a time-invariant gain matrix, which is simpler than optimizing a performance criterion (Diakaki et al., 2002) but potentially delivers a sub-optimal control law. To apply the LQR technique, the disturbances are disregarded and the dynamic system (3) becomes:

$$x(t+1) = A x(t) + B u(t) \tag{4}$$

Fig. 3. Traffic flow dynamics in a link.

Such an assumption is plausible since the goal is to attain a satisfactory gain matrix. The control aims to minimize the proportional occupancy of the links, $x_z / x_z^{\max}$, where $x_z^{\max}$ is the link capacity, in order to reduce the risk of oversaturation and spillback. To this end, the following quadratic function is used:

$$J = \frac{1}{2} \sum_{t=0}^{\infty} \left( \|x(t)\|_Q^2 + \|u(t)\|_R^2 \right) \tag{5}$$

where $Q$ and $R$ are diagonal matrices, with the first being positive definite and the second being positive semi-definite². According to the LQR theory, an infinite time horizon is used in (5) to achieve a time-invariant control law. As matrix $Q$ weighs the states (the number of vehicles in the roads), the minimization of the average occupancy is achieved by making its diagonal elements equal to $1/(x_z^{\max})^2$ for the corresponding links $z \in Z$. Matrix $R$ reflects the penalty imposed on control effort, usually defined as $R = rI$ where $r$ is found experimentally. Minimizing criterion (5) leads to the control law

$$u(t) = -L x(t) \tag{6}$$

where $L$ is the Riccati gain matrix, which depends on $A$, $B$, $Q$, and $R$ but is only weakly sensitive to variations in these matrices (Diakaki et al., 2002). The feedback control law (6) does not account for the constraints on the control signals, which are imposed in an ad hoc manner by solving the following problem at each sample time $t$ and for each junction $j \in J$:

$$Q_j(t): \quad \min_{U_{j,i}(t)} \sum_{i \in F_j} [u_{j,i}(t) - U_{j,i}(t)]^2 \tag{7a}$$

$$\text{s. to:} \quad \sum_{i \in F_j} U_{j,i}(t) + L_j = C_j \tag{7b}$$

$$U_{j,i}(t) \in [u_{j,i}^{\min}, u_{j,i}^{\max}], \quad \forall i \in F_j \tag{7c}$$

where $U_{j,i}(t)$ is the feasible green time closest in Euclidean distance to $u_{j,i}(t)$. $Q_j(t)$ is a quadratic program which can be solved in real time with an efficient algorithm (Diakaki, 1999) that converges in at most $|F_j|$ steps. Although this approach gives a feasible split, the resulting solution does not necessarily satisfy the optimality conditions for the dynamic system defined by Eq. (4) subject to the constraints on control signals. Actually, this multi-variable regulator behaves in a purely reactive way to unknown disturbances because no predictions on disturbances are made. On the other hand, the structure of matrix $L$ provides the regulator with a gating effect, that is, the splits of highly loaded links on peripheral junctions are reduced to preclude saturation in upstream links and thereby avoid gridlocks.
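To make problem (7a)–(7c) concrete, the sketch below projects infeasible green times onto the feasible set by bisection on the multiplier of the equality constraint. This is a generic technique for box-constrained projections with one sum constraint, not the algorithm of Diakaki (1999) mentioned above; the function name, the iteration count, and the sample data are assumptions made for illustration.

```python
def project_splits(u, lo, hi, total, iters=100):
    """Closest point to u (Euclidean) with lo[i] <= U[i] <= hi[i] and sum(U) == total.

    The optimum has the form U[i] = clip(u[i] - lam, lo[i], hi[i]) for a scalar
    lam, and sum(U) is non-increasing in lam, so lam is found by bisection.
    """
    if not (sum(lo) <= total <= sum(hi)):
        raise ValueError("infeasible: total green time outside the bounds")

    def clipped(lam):
        return [min(max(ui - lam, l), h) for ui, l, h in zip(u, lo, hi)]

    lam_lo = min(ui - h for ui, h in zip(u, hi))   # every phase at its upper bound
    lam_hi = max(ui - l for ui, l in zip(u, lo))   # every phase at its lower bound
    for _ in range(iters):
        mid = 0.5 * (lam_lo + lam_hi)
        if sum(clipped(mid)) > total:
            lam_lo = mid
        else:
            lam_hi = mid
    return clipped(0.5 * (lam_lo + lam_hi))


# Hypothetical junction: cycle C_j = 100 s, lost time L_j = 10 s, three phases.
U = project_splits(u=[70.0, 40.0, -5.0], lo=[10.0] * 3, hi=[60.0] * 3, total=100.0 - 10.0)
```

For this data the third phase is pushed up to its minimum of 10 s and the remaining 80 s are shared by shifting the first two signals equally, giving green times of roughly 55 s, 25 s, and 10 s.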

Previous works (Aboudolas et al., 2007; de Oliveira and Camponogara, 2007; de Oliveira, 2008) report that significant improvements may be induced by replacing the standard LQR control law with a procedure that accounts for system constraints, such as model predictive control. Generally speaking, the MPC approach is composed of (Camacho and Bordons, 2004; Kühne, 2005)

• a prediction model satisfactorily describing the process dynamics in a finite-time horizon;
• a cost function which gives the control signals when minimized; and
• a sliding horizon of prediction and control, which is translated a step forward at each sample period, requiring the computation of new control actions from which only that of the current time is implemented.

Model predictive control minimizes the same cost function as LQR control, except that it covers a limited time frame given by the prediction horizon. MPC is regarded as a feed-forward control strategy because a disturbance model can be embedded in its prediction model. Nevertheless, the use of a disturbance model may mask the benefits of the computation of a better control signal under equal circumstances. Put another way, the dynamic model for traffic flow should be the same for the TUC and MPC strategies when comparing their performances. Following these principles, the MPC problem for signaling split control at time $t$ is cast as

$$P(t): \quad \min \sum_{k=1}^{K} \frac{1}{2} x(t+k|t)' Q\, x(t+k|t) + \sum_{k=0}^{K-1} \frac{1}{2} u(t+k|t)' R\, u(t+k|t) \tag{8a}$$

$$\text{s. to:} \quad x(t|t) = x(t) \tag{8b}$$

For $k = 0, \ldots, K-1$:

$$x(t+k+1|t) = A x(t+k|t) + B u(t+k|t) \tag{8c}$$

$$C u(t+k|t) \geq c \tag{8d}$$

$$D u(t+k|t) = d \tag{8e}$$

where $K$ is the length of the prediction horizon; $x(t)$ is the current state of the traffic network at time $t$; $x(t+k|t)$ is the state prediction for time $t+k$; $u(t+k|t)$ is the control prediction for time $t+k$, but only $u(t|t)$ is implemented with $u(t) = u(t|t)$; $C$ and $c$ define the inequality constraints; and $D$ and $d$ define the equality constraints.

² A semi-positive matrix $M$ induces a vector norm $\|x\|_M = \sqrt{x' M x}$.


3. Multi-agent model predictive control

This section introduces the concept of linear dynamic network (LDN), which models the traffic flow dynamics and split control problem shown above. It presents a distributed formulation $P(t)$ for LDNs which generalizes the MPC formulation for split control given in Eqs. (8a)–(8e). Further, this section develops a decomposition of $P(t)$ into a set $\{P_m(t)\}$ of sub-problems and proposes a distributed algorithm for an agent network to reach a solution to $P(t)$ by iteratively solving $\{P_m(t)\}$.

3.1. MPC formulation

A dynamic network consists of the interconnection of $M$ sub-systems that forms a graph $G = (V, E)$, where each sub-system is a node in $V$ and each arc $(i, j) \in E$ defines a coupling between sub-systems $i$ and $j$. Vector $x_m \in \mathbb{R}^{n_m}$ has the local state and $u_m \in \mathbb{R}^{p_m}$ has the local controls of sub-system $m$. The state of sub-system $m$ evolves in time depending on its local state, local control signals, and the control signals at the up-stream sub-systems. For discrete-time dynamics, the state equation for sub-system $m$ is:

$$x_m(t+1) = A_m x_m(t) + \sum_{i \in I(m)} B_{mi} u_i(t) \tag{9}$$

where $t \in \mathbb{N}$ is the discrete sample time and $I(m) = \{m\} \cup \{i : (i, m) \in E\}$ is the set of input neighbors of sub-system $m$, which includes $m$ and the up-stream sub-systems. The network state is $x = (x_1, \ldots, x_M)$, whereas its control vector is $u = (u_1, \ldots, u_M)$. Clearly, the dynamic Eqs. (9) are collectively given by $x(t+1) = A x(t) + B u(t)$ for suitable matrices $A$ and $B$. Hereafter, the network dynamic system is assumed to be controllable³.
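As a brief worked illustration of the controllability assumption, consider the traffic model of Section 2.1, for which $A = I$. Under the standard rank test on the controllability matrix, with $A$ being $n \times n$,

$$[\,B \;\; AB \;\; \cdots \;\; A^{n-1}B\,] = [\,B \;\; B \;\; \cdots \;\; B\,], \qquad \operatorname{rank}[\,B \;\; B \;\; \cdots \;\; B\,] = \operatorname{rank} B,$$

so the pair $(I, B)$ is controllable exactly when $B$ has full row rank $n$. Whether the traffic network's $B$ satisfies this condition depends on its topology and is not checked here.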

Given the network state $x(t)$, the MPC framework obtains the control signals for time $t$ by solving the following quadratic programming problem:

$$P(t): \quad \min \sum_{m=1}^{M} \phi_m(t) = \sum_{m=1}^{M} \sum_{k=1}^{K} \frac{1}{2} \left[ x_m(t+k|t)' Q_m x_m(t+k|t) + u_m(t+k-1|t)' R_m u_m(t+k-1|t) \right] \tag{10a}$$

$$\text{s. to:} \quad x_m(t|t) = x_m(t), \quad m \in \mathcal{M} \tag{10b}$$

For all $m \in \mathcal{M}$, $k \in \mathcal{K}$:

$$x_m(t+k+1|t) = A_m x_m(t+k|t) + \sum_{i \in I(m)} B_{mi} u_i(t+k|t) \tag{10c}$$

$$C_m u_m(t+k|t) \geq c_m \tag{10d}$$

$$D_m u_m(t+k|t) = d_m \tag{10e}$$

where $x_m(t+k|t)$ is sub-system $m$’s state prediction for time $t+k$ calculated at time $t$, whereas $u_m(t+k|t)$ is its predicted control signal; $Q_m$ is positive semi-definite and $R_m$ is positive definite; $C_m$ and $c_m$ ($D_m$ and $d_m$) define the inequality (equality) constraints; and $\mathcal{M} = \{1, \ldots, M\}$ is the set with the indices of the sub-systems and $\mathcal{K} = \{0, \ldots, K-1\}$ defines the prediction horizon.

Only the control signals predicted for time $t$ are implemented, namely $u_m(t) = u_m(t|t)$. The other control signals are calculated merely to predict the long-term effects of the present control actions and thereby avoid actions that have poor long-term performance. Because of this predictive feature, the framework is called model predictive control. At the next sample time, $t+1$, the prediction horizon is rolled forward: the current state $x(t+1)$ is measured, $P(t+1)$ is solved, and new control signals $u_m(t+1)$ are obtained and implemented. The process continues indefinitely, with the horizon receding toward infinity. This is why such a control framework is also known as rolling, sliding, or receding horizon control.
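The receding-horizon loop can be sketched on the smallest possible case: a single scalar queue with $A = 1$, one bounded control, and a horizon of $K = 1$, for which the constrained problem has a closed-form solution. This is an illustrative reduction under those assumptions, not the multi-agent algorithm of the paper; the function names and parameter values are hypothetical.

```python
def one_step_mpc(x, b, q, r, u_min, u_max):
    """Minimise 0.5*q*(x + b*u)**2 + 0.5*r*u**2 subject to u_min <= u <= u_max."""
    u_free = -q * b * x / (q * b * b + r)      # stationary point of the convex cost
    return min(max(u_free, u_min), u_max)      # clipping is exact in one dimension


def simulate(x0, steps, b, q, r, u_min, u_max, disturbance=0.0):
    """Measure the state, solve the K = 1 problem, apply the control, repeat."""
    x, history = x0, []
    for _ in range(steps):
        u = one_step_mpc(x, b, q, r, u_min, u_max)   # solve P(t) for the measured state
        x = x + b * u + disturbance                  # plant moves one step (A = 1)
        history.append((u, x))
    return history


# Hypothetical data: b < 0 because green time discharges the queue.
history = simulate(x0=30.0, steps=5, b=-0.5, q=1.0, r=0.01, u_min=0.0, u_max=40.0)
```

In the first step the unconstrained minimizer exceeds the upper bound, so the control saturates at 40; in later steps the bound is inactive and the queue decays geometrically.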

The test bed is the traffic network depicted in Fig. 4 with 13 one-way roads and six junctions. The state $\mathbf{x}_3 = [x_6 \; x_7]'$ of sub-system three has the number of vehicles in roads 6 and 7, while its control vector $\mathbf{u}_3 = [u_6 \; u_7]'$ has the green time for each road. The coupling graph $G$ appears in Fig. 5. The set of input neighbors to sub-system three is $I(3) = \{1, 3, 4\}$. Matrix $B_{33}$ expresses the discharge of queues $\mathbf{x}_3$ as a function of green times $\mathbf{u}_3$, while $B_{31}$ ($B_{34}$) expresses how queues $\mathbf{x}_3$ build up as $\mathbf{x}_1$ ($\mathbf{x}_4$) are emptied. For the purpose of illustration,

$$B_{33} = T \begin{pmatrix} -\frac{S_6}{C} & 0 \\ 0 & -\frac{S_7}{C} \end{pmatrix}, \quad B_{34} = T \begin{pmatrix} 0 & 0 \\ s_{8,7}\frac{S_8}{C} & s_{9,7}\frac{S_9}{C} \end{pmatrix}, \quad B_{31} = T \begin{pmatrix} s_{1,6}\frac{S_1}{C} & s_{2,6}\frac{S_2}{C} & s_{3,6}\frac{S_3}{C} \\ 0 & 0 & 0 \end{pmatrix}$$

where $T$ (seconds) is the control interval, $s_{i,j}$ is the conversion rate from road $i$ into $j$, $S_i$ (vehicles/s) is the saturation flow of road $i$, and $C$ (seconds) is the cycle time. The inequality constraints impose minimum and maximum green times on the phases. The equalities guarantee that the total green time plus lost time (yellow time) add up to cycle time.
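The coupling matrices of sub-system 3 shown above can be assembled and applied as in the following sketch, which evaluates Eq. (9) with $A_3 = I$. The helper names and every numerical value are hypothetical, and the ordering of the entries of the neighbors' control vectors (roads 1, 2, 3 for sub-system 1 and roads 8, 9 for sub-system 4) is an assumption inferred from the column layout of $B_{31}$ and $B_{34}$.

```python
def coupling_matrices(T, C, S, s):
    """Return B33, B34, B31 as nested lists, following the layout shown above.

    S maps a road index to its saturation flow (vehicles/s); s maps (i, j) to
    the conversion rate from road i into road j.
    """
    B33 = [[-T * S[6] / C, 0.0],
           [0.0, -T * S[7] / C]]
    B34 = [[0.0, 0.0],
           [T * s[8, 7] * S[8] / C, T * s[9, 7] * S[9] / C]]
    B31 = [[T * s[i, 6] * S[i] / C for i in (1, 2, 3)],
           [0.0, 0.0, 0.0]]
    return B33, B34, B31


def matvec(B, u):
    return [sum(b * v for b, v in zip(row, u)) for row in B]


def next_state_3(x3, u3, u4, u1, B33, B34, B31):
    """Eq. (9) for sub-system 3, assuming A_3 = I."""
    terms = [matvec(B33, u3), matvec(B34, u4), matvec(B31, u1)]
    return [x + sum(t[row] for t in terms) for row, x in enumerate(x3)]


# Hypothetical data (not from the paper): control interval of two cycles.
T, C = 240.0, 120.0
S = {i: 0.5 for i in (1, 2, 3, 6, 7, 8, 9)}
s = {(1, 6): 0.2, (2, 6): 0.3, (3, 6): 0.5, (8, 7): 0.4, (9, 7): 0.6}
B33, B34, B31 = coupling_matrices(T, C, S, s)
x3_next = next_state_3([40.0, 50.0], [30.0, 40.0], [20.0, 25.0],
                       [30.0, 30.0, 30.0], B33, B34, B31)
```

With these numbers road 6 discharges 30 vehicles and receives 30 from sub-system 1, while road 7 discharges 40 and receives 23 from sub-system 4.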

³ With $A$ being $n \times n$ and $B$ being $n \times m$, the pair $(A, B)$ is said to be controllable if the $n \times nm$ matrix $[B \;\; AB \;\; A^2B \;\; \cdots \;\; A^{n-1}B]$ has full row rank. For a controllable plant $x(t+1) = A x(t) + B u(t)$, there exist control vectors $u(0), u(1), \ldots, u(n-1)$ that force $x(n)$ to the origin regardless of the initial state $x(0)$.


Fig. 4. Traffic network. The shaded area indicates sub-system 3 whose incoming queues are modeled by state variables x6 and x7.

Fig. 5. Dynamic coupling graph.


3.2. Compact formulation

The elimination of linear dependencies and the aggregation of variables over the prediction horizon lead to an equivalent form of $P(t)$ that simplifies the design of algorithms. Note that sub-system $m$’s state prediction for time $t+k$ is a function of its state at time $t$ and the control signals prior to time $t+k$:

$$x_m(t+k|t) = A_m^k x_m(t) + \sum_{l=1}^{k} \sum_{i \in I(m)} A_m^{l-1} B_{mi} u_i(t+k-l|t) \tag{11}$$

Let vector $\hat{u}_m(t) = (u_m(t|t), \ldots, u_m(t+K-1|t))$ collect the control variables and $\hat{x}_m(t) = (x_m(t+1|t), \ldots, x_m(t+K|t))$ be the state variables predicted over the time horizon. By defining matrices

$$\bar{A}_m = \begin{bmatrix} A_m \\ A_m^2 \\ \vdots \\ A_m^K \end{bmatrix} \quad \text{and} \quad \bar{B}_{mi} = \begin{bmatrix} B_{mi} & 0 & \cdots & 0 \\ A_m B_{mi} & B_{mi} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ A_m^{K-1} B_{mi} & A_m^{K-2} B_{mi} & \cdots & B_{mi} \end{bmatrix}$$

where $0$ denotes a matrix with zeros of suitable dimension, the state predictions are calculated as

$$\hat{x}_m(t) = \bar{A}_m x_m(t) + \sum_{i \in I(m)} \bar{B}_{mi} \hat{u}_i(t) \tag{12}$$
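A quick way to check the stacked form of Eq. (12) is to compare it with the step-by-step recursion on a toy case. The sketch below does this for a single scalar sub-system whose only input neighbor is itself, so that the stacked state matrix is a column of powers of $a$ and the stacked input matrix is lower triangular. The function names and the numbers are hypothetical.

```python
def stacked_matrices(a, b, K):
    """Scalar version of the stacked matrices: A_bar is K x 1, B_bar is K x K."""
    A_bar = [a ** (k + 1) for k in range(K)]
    # Row k predicts x(t+k+1); column l multiplies u(t+l); entry is a^(k-l) * b.
    B_bar = [[a ** (k - l) * b if l <= k else 0.0 for l in range(K)]
             for k in range(K)]
    return A_bar, B_bar


def predict_stacked(a, b, x0, controls):
    """State predictions from the stacked form, as in Eq. (12)."""
    K = len(controls)
    A_bar, B_bar = stacked_matrices(a, b, K)
    return [A_bar[k] * x0 + sum(B_bar[k][l] * controls[l] for l in range(K))
            for k in range(K)]


def predict_recursive(a, b, x0, controls):
    """State predictions obtained by iterating x <- a*x + b*u."""
    x, out = x0, []
    for u in controls:
        x = a * x + b * u
        out.append(x)
    return out


# Hypothetical data for a horizon of K = 3.
stacked = predict_stacked(0.9, -0.5, 10.0, [1.0, 2.0, 3.0])
recursive = predict_recursive(0.9, -0.5, 10.0, [1.0, 2.0, 3.0])
```

Both lists hold the predictions for times $t+1$, $t+2$, and $t+3$ and agree up to rounding, which is the content of Eq. (12) in this scalar setting.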

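Let $I_n$ denote the identity matrix of dimension $n$. By defining $\bar{Q}_m = I_K \otimes Q_m$ and $\bar{R}_m = I_K \otimes R_m$ in terms of the Kronecker product⁴, and using Eq. (12), the objective term $\phi_m(t)$ becomes

$$\phi_m(t) = \frac{1}{2} x_m(t)' \bar{A}_m' \bar{Q}_m \bar{A}_m x_m(t) + \sum_{i \in I(m)} x_m(t)' \bar{A}_m' \bar{Q}_m \bar{B}_{mi} \hat{u}_i(t) + \frac{1}{2} \sum_{i \in I(m)} \sum_{j \in I(m)} \hat{u}_i(t)' \bar{B}_{mi}' \bar{Q}_m \bar{B}_{mj} \hat{u}_j(t) + \frac{1}{2} \hat{u}_m(t)' \bar{R}_m \hat{u}_m(t) \tag{13}$$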
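The step from Eq. (12) to Eq. (13) can be spelled out as follows, writing $\hat{x}_m(t)$ and $\hat{u}_m(t)$ for the stacked state predictions and controls of sub-system $m$ and dropping the time argument for brevity. In stacked form the objective term of (10a) reads

$$\phi_m = \frac{1}{2} \hat{x}_m' \bar{Q}_m \hat{x}_m + \frac{1}{2} \hat{u}_m' \bar{R}_m \hat{u}_m, \qquad \hat{x}_m = \bar{A}_m x_m + \sum_{i \in I(m)} \bar{B}_{mi} \hat{u}_i .$$

Substituting the second expression into the first and expanding the quadratic form gives

$$\frac{1}{2} \hat{x}_m' \bar{Q}_m \hat{x}_m = \frac{1}{2} x_m' \bar{A}_m' \bar{Q}_m \bar{A}_m x_m + \sum_{i \in I(m)} x_m' \bar{A}_m' \bar{Q}_m \bar{B}_{mi} \hat{u}_i + \frac{1}{2} \sum_{i \in I(m)} \sum_{j \in I(m)} \hat{u}_i' \bar{B}_{mi}' \bar{Q}_m \bar{B}_{mj} \hat{u}_j ,$$

where the two cross terms have been merged into one, which assumes that $\bar{Q}_m$ is symmetric. Adding $\frac{1}{2} \hat{u}_m' \bar{R}_m \hat{u}_m$ yields Eq. (13).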

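Now, define the following vectors, matrices, and constant

$$g_{mi}(t) = \bar{B}_{mi}' \bar{Q}_m \bar{A}_m x_m(t) \quad \text{for } i \in I(m) \tag{14a}$$

$$H_{mij} = \bar{B}_{mi}' \bar{Q}_m \bar{B}_{mj} \quad \text{for } i, j \in I(m),\; i \neq m \text{ or } j \neq m \tag{14b}$$

$$H_{mmm} = \bar{B}_{mm}' \bar{Q}_m \bar{B}_{mm} + \bar{R}_m \tag{14c}$$

$$c(t) = \sum_{m \in \mathcal{M}} \frac{1}{2} x_m(t)' \bar{A}_m' \bar{Q}_m \bar{A}_m x_m(t) \tag{14d}$$

Then, problem $P(t)$ becomes

$$P(t): \quad \min \frac{1}{2} \sum_{m \in \mathcal{M}} \sum_{i \in I(m)} \sum_{j \in I(m)} \hat{u}_i(t)' H_{mij} \hat{u}_j(t) + \sum_{m \in \mathcal{M}} \sum_{i \in I(m)} g_{mi}(t)' \hat{u}_i(t) + c(t) \tag{15a}$$

$$\text{s. to:} \quad \bar{C}_m \hat{u}_m(t) \geq \bar{c}_m, \quad m \in \mathcal{M} \tag{15b}$$

$$\bar{D}_m \hat{u}_m(t) = \bar{d}_m, \quad m \in \mathcal{M} \tag{15c}$$

where $\bar{C}_m = I_K \otimes C_m$, $\bar{D}_m = I_K \otimes D_m$, and $\bar{c}_m = [c_m' \cdots c_m']'$ and $\bar{d}_m = [d_m' \cdots d_m']'$ have appropriate dimensions.

Here, the issue is how a network of distributed agents solves $P(t)$ instead of a centralized agent. In what follows, we develop a decomposition of $P(t)$ into a set of coupled sub-problems $\{P_m(t)\}$ and outline a distributed solution protocol.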

3.3. Problem decomposition

For the distribution of decision-making, an agent $m$ decides upon the values of the local control variables of sub-system $m$. The values $u_m(t)$ are obtained by solving a local optimization problem $P_m(t)$ at each sample time. The design of the sub-problem set $\{P_m(t)\}$ and the couplings among the agents is the so-called problem decomposition. The decomposition is said to be perfect if each sub-problem $P_m(t)$ encompasses all of the objective terms and constraints of $P(t)$ that depend on $u_m(t)$. Models and algorithms for perfect and approximate decomposition are found in (Camponogara and Talukdar, 2004, 2005). For a perfect decomposition, let:

• $\bar{I}(m) = \{i : m \in I(i),\ i \neq m\}$ be the set of output neighbors of sub-system $m$, that is, any sub-system $i$ whose state $x_i(t)$ is affected by $u_m(t)$;

• $C(m) = \{(i,j) \in I(m) \times I(m) : i = m \text{ or } j = m\}$ be the sub-system pairs of quadratic terms in $\phi_m$ that depend on $u_m(t)$; and

• $C(m,k) = \{(i,j) \in I(k) \times I(k) : i = m \text{ or } j = m\}$ be the pairs of quadratic terms in $\phi_k(t)$, $k \in \bar{I}(m)$, that depend on $u_m(t)$.

In the sample traffic network (Figs. 4 and 5), $I(1) = \{1\}$, $\bar{I}(1) = \{2,3,5,6\}$, $C(1) = \{(1,1)\}$, and $C(1,3) = \{(1,3),(1,4),(1,1),(3,1),(4,1)\}$. Notice that $u_m(t)$ can be coupled with sub-systems outside $I(m) \cup \bar{I}(m)$. For instance, sub-system 1 is coupled to sub-system 4 via sub-system 3, but $4 \notin I(1) \cup \bar{I}(1)$. The notion of neighborhood establishes the interdependence among sub-systems. Agent $m$'s view of the network is divided into three sets:

• local variables: the variables in vector $u_m(t)$;

• neighborhood variables: all the variables in vector $y_m(t) = [u_i(t) : i \in N(m)]$, where $N(m) = I(m) \cup \{i : (i,j) \in C(m,k),\ k \in \bar{I}(m)\} \setminus \{m\}$ is the neighborhood of agent $m$. The neighborhood of agent $m$ consists of the sub-systems other than $m$ that are affected by the decision $u_m(t)$ or whose decisions affect $x_m(t)$. Notice that $\bar{I}(m) \subseteq N(m)$; and

• remote variables: all of the other variables, which make up the vector $z_m(t) = [u_i(t) : i \notin N(m) \cup \{m\}]$.
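The set definitions above translate almost literally into code. The sketch below is illustrative only: the four-agent input-neighbor map `INPUT_NEIGHBORS` is a hypothetical network (not the test bed of Fig. 4), chosen so that $I(1)$ and $C(1,3)$ coincide with the values quoted in the text, and all function names are assumptions.

```python
# Hypothetical input-neighbor sets I(m); NOT the full network of Fig. 4.
INPUT_NEIGHBORS = {1: {1}, 2: {1, 2}, 3: {1, 3, 4}, 4: {4}}


def output_neighbors(I, m):
    """Sub-systems i != m whose state is affected by u_m, i.e. m in I(i)."""
    return {i for i in I if m in I[i] and i != m}


def coupled_pairs(I, m, k):
    """C(m, k): pairs (i, j) in I(k) x I(k) with i == m or j == m."""
    return {(i, j) for i in I[k] for j in I[k] if i == m or j == m}


def neighborhood(I, m):
    """N(m): input neighbors plus every agent coupled through an output neighbor."""
    coupled = {
        i
        for k in output_neighbors(I, m)
        for (i, _j) in coupled_pairs(I, m, k)
    }
    return (I[m] | coupled) - {m}


def remote(I, m):
    """Agents whose controls are neither local nor in the neighborhood of m."""
    return set(I) - neighborhood(I, m) - {m}
```

In this hypothetical network, agent 4 ends up in the neighborhood of agent 1 only through the shared output neighbor 3, which mirrors the coupling of sub-systems 1 and 4 described in the text.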

From agent $m$'s viewpoint, $u(t) = [u_m(t)' \; y_m(t)' \; z_m(t)']'$. Let $f(u(t)) = \sum_{m \in \mathcal{M}} \phi_m(t)$ denote the objective function of $P(t)$.

A perfect problem decomposition requires the local problem $P_m(t)$ to account for all the dependencies with the neighbors of agent $m$. This is achieved if $P_m(t)$ is obtained from $P(t)$ by (i) discarding from the objective $f$ the terms not involving $u_m(t)$ and (ii) dropping the constraints not associated with agent $m$. Formally, agent $m$'s local problem is

4 The operator $\otimes$ denotes the Kronecker product. $\bar{Q}_m$ is a block diagonal matrix with $K$ blocks, each of which is a matrix $Q_m$.


$$P_m(t, y_m(t)): \quad \min \; f_m = \frac{1}{2} u_m(t)' H_m u_m(t) + g_m(t)' u_m(t) \tag{16a}$$

$$\text{s. to:} \quad \bar{C}_m u_m(t) \geq \bar{c}_m \tag{16b}$$

$$\bar{D}_m u_m(t) = \bar{d}_m \tag{16c}$$

where $H_m$ is a suitable matrix and $g_m(t)$ is a suitable vector. A step-by-step procedure to obtain $H_m$ and $g_m(t)$ from $H_{ijl}$ and $g_{ij}(t)$ is developed in (Camponogara and de Oliveira, 2009). Evidently, a perfect decomposition ensures that

$$f(u(t)) = f_m(u_m(t), y_m(t)) + \bar{f}_m(y_m(t), z_m(t)) + c(t)$$

for each agent $m$, where $\bar{f}_m$ is a suitable function. To simplify notation, hereafter $P_m$, $P_m(t)$, and $P_m(t, y_m(t))$ will be shorthands for sub-problem (16a)–(16c).

A perfect problem decomposition leads to some relationships between $P(t)$ and $\{P_m(t)\}$ that are handy for the design of a distributed algorithm for the agent network. Assumptions and resulting properties are presented below. The reader can refer to (Camponogara and de Oliveira, 2009) for the proofs and some illustrations.

Proposition 1. A solution $u(t)$ satisfies first-order KKT (Karush-Kuhn-Tucker) optimality conditions for $P(t)$ if, and only if, $(u_m(t), y_m(t))$ satisfies KKT conditions of $P_m(t, y_m(t))$ for each $m \in \mathcal{M}$.

Definition 1. (Feasible spaces) The feasible spaces are:

• $U_m = \{u_m : \bar{C}_m u_m \geq \bar{c}_m,\ \bar{D}_m u_m = \bar{d}_m\}$ is the feasible space for $P_m(t)$;

• $U = U_1 \times \cdots \times U_M$ is the feasible space for $P(t)$; and

• $Y_m = \times_{i \in N(m)} U_i$ is the feasible space for agent $m$'s neighborhood variables.

Assumption 1. (Compactness) The feasible space $U$ is a compact set.

Assumption 2. (Strict feasibility) There exists $u \in U$ such that $\bar{C}_m u_m > \bar{c}_m$ and $\bar{D}_m u_m = \bar{d}_m$ for all $m \in \mathcal{M}$.

Compactness$^5$ is a plausible assumption because control signals are invariably bounded. So is the strict feasibility assumption: if the interior of $U$ is empty, then some inequalities are indeed equalities and should be regarded as such.

Proposition 2. Problem $P(t)$ given by (15a)–(15c) is convex.

Corollary 1. Sub-problem $P_m(t, y_m(t))$ is convex.

Proposition 3. (Optimality conditions) Because $f(u(t))$ is a convex function and $U$ is a convex set, $u(t)^\star$ is a local minimum for $f$ over $U$ if and only if:

$$\nabla f(u(t)^\star)'(u(t) - u(t)^\star) \geq 0, \quad \forall u(t) \in U \tag{17}$$

A vector $u(t)^\star$ satisfying condition (17) is called a stationary point.

Corollary 2. (Local optimality conditions) $u(t)^\star$ is a local minimum for $P(t)$ if, and only if, $(u_m(t)^\star, y_m(t)^\star)$ is a local minimum of $P_m(t, y_m(t)^\star)$ for all $m \in \mathcal{M}$.

This corollary means that an overall control vector that cannot be unilaterally improved by a single agent (a fixed point) is locally optimal for all sub-problems $\{P_m(t)\}$ and therefore also locally optimal for $P(t)$. As the problems are all convex, a local optimum is also a global optimum.

3.4. Multi-agent distributed control

A perfect problem decomposition establishes an equivalence between an optimal solution to $P(t)$ and a stationary solution for the sub-problem network $\{P_m(t)\}$. How do the agents reach a fixed point $u(t)^\star$? Below, we present a distributed algorithm for the agents to arrive at a stationary point for $\{P_m(t)\}$ which works by generating a sequence $u(t)^{(k)} = (u_1^{(k)}(t), \ldots, u_M^{(k)}(t))$ of iterates. Starting with a feasible control vector $u(t)^{(0)}$, at each iteration $k$ the agents exchange their decisions locally, coordinate the iterations to preclude coupled agents from acting simultaneously, and keep working until convergence is attained or time is up. At this point, the control signals are implemented and the horizon is rolled forwards to the next sample time. Two fundamental assumptions for the convergence of the agents' iterates to a stationary solution are stated below.

5 A set $S$ is compact if for any given sequence $x^{(k)}$ of vectors in $S$ there exists a subsequence $x^{(k_i)}$ which converges to a point $x^\star$ in $S$. Any compact set is closed and bounded.


Assumption 3. (Synchronous work) If agent $m$ revises its decisions at iteration $k$, then:

(i) agent $m$ uses $y_m^{(k)}(t) = [u_i^{(k)}(t) : i \in N(m)]$ to produce an approximate solution to $P_m(t, y_m^{(k)}(t))$ that becomes its next iterate $u_m^{(k+1)}(t)$;

(ii) all the neighbors of agent $m$ keep their decisions at iteration $k$, that is, $u_i^{(k+1)}(t) = u_i^{(k)}(t)$ for all $i \in N(m)$.

Assumption 4. (Continuous work) If $u(t)^{(k)}$ is not a stationary point for all problems in $\{P_m(t)\}$, then at least one agent $m$ for which $u_m^{(k)}(t)$ is not a stationary point for $P_m(t)$ produces a new iterate $u_m^{(k+1)}(t)$ by approximately solving $P_m(t, y_m^{(k)}(t))$.

Condition (ii) of Assumption 3 and Assumption 4 hold by arranging the agents to iterate repeatedly in a sequence $\langle S_1, \ldots, S_r \rangle$ where $S_i \subseteq \mathcal{M}$, $\cup_{i=1}^{r} S_i = \mathcal{M}$, and all distinct pairs $m, n \in S_i$ are non-neighbors for all $i$. For the illustrative scenario, $\langle S_1, S_2, S_3 \rangle$ with $S_1 = \{2,4,6\}$, $S_2 = \{3,5\}$, and $S_3 = \{1\}$ is a valid sequence. Actually, this sequence is too restrictive because the dynamic Eq. (9) assumes that $u_i(t)$, $i \in I(m)$, influences the entire state vector $x_m(t+1)$. This is not the case in the traffic scenario. While the control signals $u_1(t)$ and $u_4(t)$ influence $x_3(t+1)$ as a whole in the model, $u_1(t)$ influences only the part of $x_3(t+1)$ associated with $x_6$, whereas $u_4(t)$ influences only the part associated with $x_7$. Thus, $S_1 = \{2,4,6\}$, $S_2 = \{3,5\}$, and $S_3 = \{1,4\}$ is also a plausible iteration sequence for the agents. Time-varying sequences that uphold the conditions and synchronization protocols are other alternatives.
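Whether a candidate sequence $\langle S_1, \ldots, S_r \rangle$ meets the two requirements above (the groups cover all agents, and no group contains two neighbors) can be checked mechanically. The following sketch is an illustration under stated assumptions: the function name `is_valid_sequence` is hypothetical, and the neighborhoods used in the usage lines describe a made-up chain of four agents rather than the test bed network.

```python
def is_valid_sequence(groups, neighbors):
    """Check that the groups cover every agent and hold only non-neighbors.

    groups    -- list of sets of agents, iterated one group at a time
    neighbors -- dict mapping each agent m to its neighborhood N(m)
    """
    agents = set(neighbors)
    covers_all = set().union(*groups) == agents
    no_coupled_pair = all(
        n not in neighbors[m]
        for group in groups
        for m in group
        for n in group
        if m != n
    )
    return covers_all and no_coupled_pair


# Hypothetical chain 1 - 2 - 3 - 4 in which only adjacent agents are neighbors.
CHAIN = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
alternating_ok = is_valid_sequence([{1, 3}, {2, 4}], CHAIN)
adjacent_ok = is_valid_sequence([{1, 2}, {3, 4}], CHAIN)
```

For the chain, alternating agents can iterate in parallel, whereas grouping adjacent agents violates condition (ii) of Assumption 3.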

Of relevance is the way an agent $m$ solves $P_m(t)$ approximately so that the iterates $u(t)^{(k)}$ are drawn to a stationary point of $\{P_m(t)\}$. To this end, we developed a distributed algorithm based on the feasible direction method (Bertsekas, 1995) which is only outlined below, but fully developed in (Camponogara and de Oliveira, 2009). The distributed feasible direction method is specially tailored for LDNs, taking advantage of the local dynamic and constraint structure which is not present in frameworks for more general settings (Camponogara et al., 2002; Camponogara and Talukdar, 2007).

At the current iterate $u(t)^{(k)}$, agent $m$ computes a locally descent direction $d_m^{(k)}(t) = \bar{u}_m^{(k)}(t) - u_m^{(k)}(t)$ by solving a linear programming (LP) problem

$$D_m^{(k)}(t) = \min_{\bar{u}_m^{(k)}(t)} \; \nabla_{u_m(t)} f_m(u_m^{(k)}(t), y_m^{(k)}(t))' (\bar{u}_m^{(k)}(t) - u_m^{(k)}(t)) \tag{18a}$$

$$\text{s. to:} \quad \bar{C}_m \bar{u}_m^{(k)}(t) \geq \bar{c}_m \tag{18b}$$

$$\bar{D}_m \bar{u}_m^{(k)}(t) = \bar{d}_m \tag{18c}$$

A direction $d_m^{(k)}(t) \neq 0$ is locally feasible at $(u_m^{(k)}(t), y_m^{(k)}(t))$ if $u_m^{(k)}(t) + \alpha_m d_m^{(k)}(t) \in U_m$ for all sufficiently small $\alpha_m > 0$. A locally feasible direction is locally descent at a nonstationary point $(u_m^{(k)}(t), y_m^{(k)}(t))$ if $\nabla f_m(u_m^{(k)}(t), y_m^{(k)}(t))' d_m^{(k)}(t) < 0$. Notice that the solution to $D_m^{(k)}(t)$ produces a locally descent direction if one exists.

The next iterate $u_m^{(k+1)}(t) = u_m^{(k)}(t) + \beta_m^{\alpha_m^{(k)}(t)} d_m^{(k)}(t)$ is obtained by finding a step that satisfies the Armijo rule. Given $\beta_m, \sigma_m \in (0,1)$, $\alpha_m^{(k)}(t)$ is the smallest nonnegative integer $\alpha_m$ for which:

$$f_m(u_m^{(k)}(t) + \beta_m^{\alpha_m} d_m^{(k)}(t), y_m^{(k)}(t)) \leq f_m(u_m^{(k)}(t), y_m^{(k)}(t)) + \sigma_m \beta_m^{\alpha_m} \nabla_{u_m(t)} f_m(u_m^{(k)}(t), y_m^{(k)}(t))' d_m^{(k)}(t)$$

Agent-iterations as delineated above, together with Assumptions 3 and 4, ensure that the iterates $u^{(k)}(t)$ arrive at a stationary point of $\{P_m(t)\}$ and thereby a solution to $P(t)$. Some technical details are needed for the convergence proof, but effectively the agent network implements a distributed feasible direction method for quadratic programming (Camponogara and de Oliveira, 2009). The procedure used by each agent $m$ at iteration $k$ to solve $P_m(t)$ is outlined below.

Agent-iteration$(t, m, k)$

    if agent m cannot revise its decisions in iteration k then
        $u_m^{(k+1)}(t) = u_m^{(k)}(t)$
        return
    end if
    Agent m obtains $y_m^{(k)}(t) = [u_i^{(k)}(t) : i \in N(m)]$ from its neighbors
    Agent m solves $D_m^{(k)}(t)$ to obtain $d_m^{(k)}(t)$
    if $d_m^{(k)}(t) = 0$ then
        $u_m^{(k+1)}(t) = u_m^{(k)}(t)$    ▷ $(u_m^{(k)}(t), y_m^{(k)}(t))$ is stationary for $P_m(t)$
        return
    end if
    $\alpha_m = 0$
    while $f_m(u_m^{(k)}(t) + \beta_m^{\alpha_m} d_m^{(k)}(t), y_m^{(k)}(t)) > f_m(u_m^{(k)}(t), y_m^{(k)}(t)) + \sigma_m \beta_m^{\alpha_m} \nabla_{u_m(t)} f_m(u_m^{(k)}(t), y_m^{(k)}(t))' d_m^{(k)}(t)$ do
        $\alpha_m = \alpha_m + 1$
    end while
    $u_m^{(k+1)}(t) = u_m^{(k)}(t) + \beta_m^{\alpha_m} d_m^{(k)}(t)$
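The agent-iteration can be sketched in Python for a single agent with fixed neighborhood variables. Everything concrete below is an assumption made for illustration: the objective `f`, its gradient `grad`, the feasible set (two splits that are nonnegative and add up to a normalized cycle of 1), and the helper `vertex_lp`, which solves the direction LP for that particular set. One deliberate deviation from the listing is flagged in the code: stationarity is tested on the directional derivative rather than on $d = 0$, since a vertex solution of the LP can differ from the current point even when no descent direction exists.

```python
def f(u):
    """Hypothetical convex quadratic local objective f_m (neighbors held fixed)."""
    return u[0] ** 2 + 0.5 * u[1] ** 2


def grad(u):
    """Gradient of the hypothetical objective above."""
    return [2.0 * u[0], u[1]]


def vertex_lp(g):
    """Minimize g'u_bar over {u_bar >= 0, sum(u_bar) = 1}: all mass on the smallest entry."""
    best = min(range(len(g)), key=g.__getitem__)
    return [1.0 if i == best else 0.0 for i in range(len(g))]


def agent_iteration(f, grad, vertex_lp, u, beta=0.5, sigma=0.1, tol=1e-9):
    """One agent-iteration: direction LP, stationarity test, Armijo backtracking."""
    g = grad(u)
    u_bar = vertex_lp(g)
    d = [ub - ui for ub, ui in zip(u_bar, u)]
    slope = sum(gi * di for gi, di in zip(g, d))
    # Assumption: treat a non-negative directional derivative as stationarity.
    if slope >= -tol:
        return list(u)
    a = 0
    while f([ui + beta ** a * di for ui, di in zip(u, d)]) > f(u) + sigma * beta ** a * slope:
        a += 1
    return [ui + beta ** a * di for ui, di in zip(u, d)]


# Repeated iterations starting from a feasible vertex of the hypothetical set.
u = [1.0, 0.0]
for _ in range(20):
    u = agent_iteration(f, grad, vertex_lp, u)
```

For this toy objective the constrained minimizer is $(1/3, 2/3)$ with cost $1/3$, and every iterate stays feasible because it is a convex combination of the current point and a vertex of the feasible set.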


The iteration procedure is relatively simple. The most computationally demanding step is the solution of the linear program, for which fast and robust LP solvers are available off-the-shelf.

3.4.1. Analytical computation of feasible descent direction

The constraint structure of the linear dynamic network for signaling split control allows $D_m^{(k)}(t)$ to be solved analytically. To this end, replace the split $u_i(t)$ of a road $i$ at cycle $t$ with $u_i(t) = l_i + d_i(t)$, where $l_i$ is the lower bound for green time and $d_i(t)$ is the green time extension. If the upper bound for $u_i(t)$ is the cycle time $C$, then constraints (18b) and (18c) become

$$\sum_{u_i \in u_m} \bar{d}_i(t+k|t) = C - \sum_{u_i \in u_m} l_i, \quad \forall k \in \mathcal{K}$$

$$\bar{d}_i(t+k|t) \geq 0, \quad \forall u_i \in u_m,\ \forall k \in \mathcal{K}$$

Such constraint structure is separable, having an independent constraint set for each prediction time $k$. Further, any basic solution will have precisely one nonzero variable $\bar{d}_i(t+k|t)$ for each $k$. The net result is that an optimal basic solution to $D_m^{(k)}(t)$ is found by defining the basic variables as those corresponding to the most negative entries of the gradient $\nabla f_m(u_m^{(k)}(t), y_m^{(k)}(t))$.
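A minimal sketch of this analytical solution follows. It assumes that the gradient is arranged as one row per prediction step and one column per green-time extension of the junction; that layout, the function name `analytic_direction_lp`, and the numbers in the usage lines are illustrative assumptions, not details taken from the paper.

```python
def analytic_direction_lp(gradient, budget):
    """Solve the separable direction LP one prediction step at a time.

    gradient -- gradient[k][i]: partial derivative with respect to the
                extension of phase i at prediction step k (assumed layout)
    budget   -- C minus the sum of the lower bounds l_i
    Returns the extensions d_bar[k][i]: for every step k the whole budget is
    assigned to the phase with the smallest gradient entry.
    """
    solution = []
    for g_k in gradient:
        best = min(range(len(g_k)), key=g_k.__getitem__)
        row = [0.0] * len(g_k)
        row[best] = budget
        solution.append(row)
    return solution


# Hypothetical junction: cycle of 120 s, three phases with 10 s minimum green.
cycle, lower_bounds = 120.0, [10.0, 10.0, 10.0]
d_bar = analytic_direction_lp(
    [[3.0, -1.0, 2.0], [-4.0, 0.0, 1.0]], cycle - sum(lower_bounds)
)
```

No LP solver is involved: each prediction step reduces to locating the smallest entry in one row of the gradient.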

3.4.2. Conflict resolution

The multi-agent MPC framework can be viewed as a dynamic game (Camponogara et al., 2006). Each agent $m$ has an implicit reaction function $R_m(y_m)$ determining the agent's response $u_m$ to the decisions $y_m$ of its neighboring agents. The reaction function is computed by solving sub-problem $P_m(t)$. Thus, the agents resolve conflicts by iteratively reacting to one another's decisions until they reach a fixed point. Such a fixed point is a Nash point for the game, that is, a combined decision vector $u$ which cannot be improved unilaterally by any agent with respect to its objective. On the one hand, the agents are selfish to the extent that they are driven by their own interests, as quantified by their objective functions. On the other hand, this selfish behavior leads to a global optimum since the objectives of the agents are aligned with the global objective in problem $P(t)$.

3.4.3. Multi-agent MPC as a multi-agent system

All in all, the multi-agent MPC framework falls within the class of multi-agent systems, which are systems composed of multiple interacting intelligent agents having the characteristics of autonomy, local views, and decentralization (Wooldridge, 2002). The agents have limited autonomy because they follow the iteration and communication protocol imposed by Assumptions 3 and 4, but each agent $m$ is free to decide upon the values of parameters $\beta_m$ and $\sigma_m$ based on what worked best in the past, perform multiple iterations rather than simply satisfying the Armijo rule, and even utilize a totally different algorithm that would solve $P_m$ or find a near-optimal solution which implicitly satisfies the Armijo rule. The views of the agents are local because they sense and decide upon the values of a fraction of the state and control variables, respectively. And the agents are decentralized since no single agent has a complete view of or operates the entire network.

3.5. Closed-loop stability

The MPC approach is a kind of feedback control. It repeatedly revises the predicted control actions over a receding horizon as new state measurements are received. However, the optimizations do not explicitly consider the system behavior beyond the prediction horizon, potentially leading the system to an unstable mode. For simplicity, let the origin $(x,u) = 0$ be an equilibrium point for the dynamic network $x(t+1) = Ax(t) + Bu(t)$. It is important to mention that stabilization conditions assume that the prediction model is perfect, $x(t+k|t) = x(t+k)$, and that a global optimum is found for the optimization problems. The two main strategies for closed-loop stability of MPC are terminal constraints and infinite horizons (Maciejowski, 2002).

The terminal constraint strategy drives the final state to the origin, that is, it introduces the constraint $x_m(t+K|t) = 0$ for all $m \in \mathcal{M}$. Then a positive-definite objective function and these terminal constraints ensure closed-loop stability of the network. Notice that this strategy would couple the sub-systems in more complex ways than the local constraint structure given by Eqs. (10d) and (10e). Instead, the penalty term $\frac{1}{\epsilon} \|x_m(t+K|t)\|^2$ for all $m \in \mathcal{M}$ can be introduced in the objective to retain the local structure of the network. Notice that $x_m(t+K|t)$ tends to 0 as $\epsilon \to 0$.

An infinite prediction horizon ensures closed-loop stability. As the optimizations would not be in finite-dimensional space, the infinite horizon problem has to be expressed in terms of a finite set of control variables. If the network is intrinsically stable (all eigenvalues of $A$ are inside the unit disc), this strategy introduces a terminal cost $\sum_{m \in \mathcal{M}} x_m(t+K|t)' W_m x_m(t+K|t)$ where $W_m = \sum_{i=0}^{\infty} (A_m')^i Q_m A_m^i$ is convergent because $A_m$ is stable. For an unstable plant, $W_m$ does not converge and the unstable modes must be forced to zero at the end of the prediction. The network state is decomposed in terms of stable $x_m^s = A_m^s x_m$ and unstable $x_m^u = A_m^u x_m$ modes via a Jordan decomposition. $P(t)$ is augmented with a terminal constraint $x_m^u(t+K|t) = 0$ for all $m \in \mathcal{M}$ and a terminal cost $\sum_{m \in \mathcal{M}} x_m^s(t+K|t)' W_m^s x_m^s(t+K|t)$ where $W_m^s$ is obtained similarly to $W_m$. The terminal constraints couple the sub-systems through the constraint set, but they can be approximated with a penalty term as explained above to preserve the local structure of $P(t)$.
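For a stable $A_m$, the terminal weight $W_m = \sum_{i=0}^{\infty} (A_m')^i Q_m A_m^i$ can be approximated by accumulating terms of the series until they become negligible. The sketch below uses plain lists to stay self-contained; the helper names, the tolerance, and the numeric matrices are assumptions for illustration. A useful sanity check is that the limit satisfies the discrete Lyapunov equation $W_m = Q_m + A_m' W_m A_m$.

```python
def transpose(a):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def terminal_weight(A, Q, tol=1e-12, max_terms=10000):
    """Accumulate sum_i (A')^i Q A^i until the newest term falls below tol."""
    At = transpose(A)
    term = [row[:] for row in Q]
    W = [row[:] for row in Q]
    for _ in range(max_terms):
        term = matmul(matmul(At, term), A)  # next term (A')^i Q A^i
        W = [[w + t for w, t in zip(w_row, t_row)] for w_row, t_row in zip(W, term)]
        if max(abs(t) for row in term for t in row) < tol:
            return W
    raise ValueError("series did not converge; A may not be stable")


# Hypothetical stable sub-system (eigenvalues 0.5 and 0.8) with Q = I.
A_m = [[0.5, 0.1], [0.0, 0.8]]
Q_m = [[1.0, 0.0], [0.0, 1.0]]
W_m = terminal_weight(A_m, Q_m)
```

When a linear algebra library is available, a dedicated discrete Lyapunov solver would normally replace the truncated series; the loop is shown only because it mirrors the definition directly.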

All in all, the distributed agents can implement the terminal cost strategy if the open-loop plant is stable, or otherwise introduce terminal constraints on the unstable modes while enforcing terminal costs on the stable modes. The agents enforce terminal constraints approximately using penalty terms. Regardless of the strategy or combination thereof, the agents can ensure closed-loop stability without compromising the local structure of $\{P_m(t)\}$.

4. Simulation analysis

Fig. 4 shows the urban traffic network that served as a test bed for simulations with the multi-agent model predictive control strategy and the standard TUC two-stage regulator. First, both strategies are evaluated through a numerical analysis based on the nominal network model. Besides a comparison of these strategies, the study included an analysis of the convergence of the solutions produced by multi-agent MPC to the optimal solution obtained with centralized MPC. Second, simulations were conducted with a professional traffic simulation software to assess the behavior of both strategies in a more realistic scenario and subject to model discrepancies. Third, the test bed network was expanded by adding two junctions, four state variables, and four control signals to illustrate the flexibility and scalability of multi-agent MPC.

4.1. Network specification

The test bed network was designed to represent an urban perimeter traversed by high flow avenues, providing a convenient scenario for split control evaluation. Nevertheless, the complexity of the network is influenced by other variables that include cycle time and offset between junctions. The network consists only of one-way links to diminish the influence of these control parameters and the network specification on the performance metrics. Further, offset control is not implemented and the cycle time is defined as a multiple of the shortest Webster cycle to balance the internal streams of vehicles. To mitigate the influence of the network specification and the uncontrolled variables (offset and cycle time), three scenarios were appraised.

4.1.1. Scenario I: distinct cycles (C≠)

Cycle times and nominal splits were computed through a method known as Webster's procedure (Webster, 1959), which yields optimal cycle times and signaling splits for isolated junctions. The procedure is summarized by the equations below:

$$C_j = \frac{1.5 L_j + 5}{1 - \sum_{k \in I_j} q_k^N / S_k} \quad \text{and} \quad u_{j,i} = \frac{(q_i^N / S_i)(C_j - L_j)}{\sum_{k \in I_j} q_k^N / S_k}, \quad \text{for all } i \in F_j$$

where $C_j$ is the cycle of junction $j$; $L_j$ denotes the lost time of the same junction; $q_i^N$ is the nominal inflow to link $i$ in vehicles per hour; $S_i$ is the saturation flow of link $i$; $I_j$ is the set of input links of junction $j$; $u_{j,i}$ is the nominal green time allocated to phase $i$ of junction $j$; and $F_j$ is the set of phases of the controlled junction $j$.

The cycle times and splits resulting from the application of Webster's procedure appear in Table 1. The cycles and splits are not optimal since the junctions are not isolated and operate synchronously (their offset is zero). In fact, vehicle progression is erratic and difficult since the junctions have distinct cycles. In this scenario, the high traffic inflows concentrated in the main avenues $x_2$ and $x_8$ make progression even more difficult.

4.1.2. Scenario II: equal cycles (C=)

Cycle times were set to 120 s, providing a harmonic progression of vehicles and minimizing the undesirable effects of the lack of synchronization. For this scenario, the traffic inflows were more balanced to avoid oscillations in the internal flows and thereby diminish the effects of synchronization. With the given cycle time, the nominal splits were obtained by Webster's procedure. The nominal traffic control parameters are presented in Table 2.

Table 1
Nominal parameters of the distinct cycles scenario.

| Link $z$ | Control $u_{j,z}$ | Saturation $S_z$ (veh/h) | Nominal inflow $q_z$ (veh/h) | Nominal split $u^N_{j,z}$ (s) | Cycle $C$ (s) |
|---|---|---|---|---|---|
| $x_1$ | $u_{1,1}$ | 3600 | 1000 | 58.0 | 192.0 |
| $x_2$ | $u_{1,2}$ | 3600 | 1100 | 63.8 | |
| $x_3$ | $u_{1,3}$ | 3600 | 900 | 52.2 | |
| $x_4$ | $u_{2,1}$ | 3600 | – | 46.7 | 132.6 |
| $x_5$ | $u_{2,2}$ | 3600 | – | 73.9 | |
| $x_6$ | $u_{3,1}$ | 1800 | – | 26.3 | 81.9 |
| $x_7$ | $u_{3,2}$ | 3600 | – | 43.6 | |
| $x_8$ | $u_{4,1}$ | 3600 | 1800 | 89.2 | 165.6 |
| $x_9$ | $u_{4,2}$ | 3600 | 1300 | 64.4 | |
| $x_{10}$ | $u_{5,1}$ | 3600 | – | 50.9 | 91.7 |
| $x_{11}$ | $u_{5,2}$ | 1800 | – | 28.8 | |
| $x_{12}$ | $u_{6,1}$ | 3600 | – | 75.6 | 131.3 |
| $x_{13}$ | $u_{6,2}$ | 3600 | – | 43.7 | |
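Webster's procedure is easy to sketch in Python. The function below follows the two formulas of Section 4.1.1 under the assumption that each phase serves exactly one input link. The lost time of 18 s in the usage lines is not stated in the text; it is an assumption chosen because it reproduces the junction 1 rows of Table 1.

```python
def webster(lost_time, inflows, saturations):
    """Webster cycle time and nominal green splits for an isolated junction.

    lost_time   -- lost time L_j of the junction, in seconds
    inflows     -- nominal inflows q_i^N of the input links, in veh/h
    saturations -- saturation flows S_i of the same links, in veh/h
    """
    ratios = [q / s for q, s in zip(inflows, saturations)]
    load = sum(ratios)
    if load >= 1.0:
        raise ValueError("junction is oversaturated: sum of q/S must be below 1")
    cycle = (1.5 * lost_time + 5.0) / (1.0 - load)
    splits = [y * (cycle - lost_time) / load for y in ratios]
    return cycle, splits


# Junction 1 of Table 1 with an assumed lost time of 18 s.
cycle, splits = webster(18.0, [1000.0, 1100.0, 900.0], [3600.0, 3600.0, 3600.0])
```

With these inputs the cycle is 192 s and the splits are 58.0, 63.8, and 52.2 s, matching the first three rows of Table 1. The splits add up to the effective green time $C_j - L_j$, not to the full cycle.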


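Table 2
Nominal parameters of the equal cycles scenario.

| Link $z$ | Control $u_{j,z}$ | Saturation $S_z$ (veh/h) | Nominal inflow $q_z$ (veh/h) | Nominal split $u^N_{j,z}$ (s) | Cycle $C$ (s) |
|---|---|---|---|---|---|
| $x_1$ | $u_{1,1}$ | 3600 | 800 | 28.8 | 120 |
| $x_2$ | $u_{1,2}$ | 3600 | 1300 | 46.8 | |
| $x_3$ | $u_{1,3}$ | 3600 | 900 | 32.4 | |
| $x_4$ | $u_{2,1}$ | 3600 | – | 72.5 | 120 |
| $x_5$ | $u_{2,2}$ | 3600 | – | 39.5 | |
| $x_6$ | $u_{3,1}$ | 1800 | – | 54.9 | 120 |
| $x_7$ | $u_{3,2}$ | 3600 | – | 57.1 | |
| $x_8$ | $u_{4,1}$ | 3600 | 900 | 63.0 | 120 |
| $x_9$ | $u_{4,2}$ | 3600 | 700 | 49.0 | |
| $x_{10}$ | $u_{5,1}$ | 3600 | – | 59.8 | 120 |
| $x_{11}$ | $u_{5,2}$ | 1800 | – | 52.2 | |
| $x_{12}$ | $u_{6,1}$ | 3600 | – | 54.7 | 120 |
| $x_{13}$ | $u_{6,2}$ | 3600 | – | 57.3 | |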

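Table 3
Nominal parameters of the equal cycles scenario with crash simulation.

| Link $z$ | Control $u_{j,z}$ | Saturation $S_z$ (veh/h) | Nominal inflow $q_z$ (veh/h) | Nominal split $u^N_{j,z}$ (s) | Cycle $C$ (s) |
|---|---|---|---|---|---|
| $x_1$ | $u_{1,1}$ | 3600 | 800 | 28.8 | 120 |
| $x_2$ | $u_{1,2}$ | 3600 | 1300 | 46.8 | |
| $x_3$ | $u_{1,3}$ | 3600 | 900/0/1500 | 32.4 | |
| $x_4$ | $u_{2,1}$ | 3600 | – | 72.5 | 120 |
| $x_5$ | $u_{2,2}$ | 3600 | – | 39.5 | |
| $x_6$ | $u_{3,1}$ | 1800 | – | 54.9 | 120 |
| $x_7$ | $u_{3,2}$ | 3600 | – | 57.1 | |
| $x_8$ | $u_{4,1}$ | 3600 | 900 | 63.0 | 120 |
| $x_9$ | $u_{4,2}$ | 3600 | 700 | 49.0 | |
| $x_{10}$ | $u_{5,1}$ | 3600 | – | 59.8 | 120 |
| $x_{11}$ | $u_{5,2}$ | 1800 | – | 52.2 | |
| $x_{12}$ | $u_{6,1}$ | 3600 | – | 54.7 | 120 |
| $x_{13}$ | $u_{6,2}$ | 3600 | – | 57.3 | |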


4.1.3. Scenario III: equal cycles with crash simulation (C=/crash)

This scenario has the same characteristics as the previous one, except for the simulation of a car crash in link $x_3$. This incident occurs at the 15th minute of simulation and blocks the link for 15 min. The intent is to temporarily suspend traffic flow through the link. When the link is unblocked at the 30th minute of simulation, the inflow of link $x_3$ reaches a rate higher than the nominal rate for the remainder of the simulation because of the accumulation of vehicles during the incident. Table 3 presents the values regarding this scenario.

4.1.4. Remarks

Table 4 shows the turning rates, which are common to all scenarios$^6$. The first column gives the origin link of the conversion, while the remaining columns define the destination links. The data characterizing a scenario and the turning rates are sufficient to determine the matrix $B$ (see Section 2) and thereby obtain the dynamic system $x(t+1) = Ax(t) + Bu(t)$.

Scenarios II and III share the same dynamic system but differ in the input demand $f(k)$, which simulates the suspension of traffic flow on link $x_3$ for 15 min in scenario III. A comparison between scenarios I and II aims to verify if one of the control strategies is more suitable for an erratic progression (distinct cycle times) or smoother progression (equal cycle times). A comparison between scenarios II and III seeks to assess the robustness of the control strategies when the demands deviate drastically from the nominal demands.

4.2. Numerical results

This section presents results from numerical simulation using Eq. (4) as a model for the traffic system. The simulation can be implemented with scientific computation software, such as MATLAB® and Scilab, and even programming languages such as Python and C. The network's actual state is calculated at each interval based on the given initial conditions, the previous state, and the discrete model. To make the control design model different from the simulation model, a disturbance was introduced in the simulation model

6 Turning rates are not reported for $x_4$, $x_5$, $x_{12}$, and $x_{13}$ because they are exit links.


Fig. 6. Flowchart of the numerical simulation process.

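Table 4
Nominal turning rates for the test bed network.

| $s_{w,j}$ | $x_1$ | $x_2$ | $x_3$ | $x_4$ | $x_5$ | $x_6$ | $x_7$ | $x_8$ | $x_9$ | $x_{10}$ | $x_{11}$ | $x_{12}$ | $x_{13}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $x_1$ | – | – | – | 0.20 | – | 0.05 | – | – | – | – | 0.05 | – | 0.70 |
| $x_2$ | – | – | – | 0.25 | – | 0.30 | – | – | – | – | 0.30 | – | 0.15 |
| $x_3$ | – | – | – | 0.65 | – | 0.05 | – | – | – | – | 0.05 | – | 0.15 |
| $x_6$ | – | – | – | – | 0.50 | – | – | – | – | – | – | – | – |
| $x_7$ | – | – | – | – | 0.80 | – | – | – | – | – | – | – | – |
| $x_8$ | – | – | – | – | – | – | 0.40 | – | – | 0.60 | – | – | – |
| $x_9$ | – | – | – | – | – | – | 0.60 | – | – | 0.40 | – | – | – |
| $x_{10}$ | – | – | – | – | – | – | – | – | – | – | – | 0.80 | – |
| $x_{11}$ | – | – | – | – | – | – | – | – | – | – | – | 0.50 | – |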


$$x(t+1) = Ax(t) + Bu(t) + f(t) \tag{19}$$

where the term $f(t)$ represents the inflows of input links, that is, vehicles entering the network. In the network of Fig. 4, the set of inflow links is $\{1,2,3,8,9\}$. Fig. 6 presents the flowchart of the numerical simulation.

Note, however, that this simulation is a rough representation of real traffic behavior and is best used for control design. For instance, the model does not allow vehicles to cross two intersections in the same control interval. Another limitation is the assumption that queue lengths are sufficiently large and downstream links are not obstructed, so that the outflow of a link with right of way is approximated by its saturation flow. This assumption does not hold when few vehicles are waiting at the stop line of a road, say $x_z$, which feeds another queue, say $x_w$.

The scenario chosen for this experiment is the one of distinct cycles. Since the model ignores the interactions between junctions, the given cycles and splits can be regarded as optimal and there is no need to replicate the experiments for other scenarios. The inflows of the network, $f(t)$, are defined by a Gaussian function centered at the middle of simulation time, with an initial value equal to the nominal inflows and a peak that doubles the nominal.

The network was simulated for approximately 2 h, namely $T = 40$ simulation steps with a control interval of $\Delta T = 200$ s. The impact of the prediction horizon on multi-agent MPC is evaluated for steps ranging from 1 to 5. Furthermore, 10 random initial conditions were considered to increase the reliability of the analysis. For each initial condition, the initial state of each link was drawn at random in the range from 0 to 500 vehicles.
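One step of the disturbed model (19) and a Gaussian inflow profile can be sketched as follows. The exact shape of the profile is not given in the text, so the formula in `gaussian_inflow` (nominal inflow times one plus a Gaussian bump) and its `width` parameter are assumptions. They merely satisfy the stated properties: the peak doubles the nominal inflow at mid-simulation, and the initial value is approximately nominal when the width is small relative to the simulation length. Inflows are assumed to be expressed in vehicles per control interval.

```python
import math


def gaussian_inflow(t, horizon, nominal, width):
    """Inflow vector at step t: nominal value plus a Gaussian bump centered at horizon / 2."""
    bump = math.exp(-((t - horizon / 2.0) ** 2) / (2.0 * width ** 2))
    return [q * (1.0 + bump) for q in nominal]


def step(A, B, x, u, f):
    """One step of x(t+1) = A x(t) + B u(t) + f(t) with matrices as lists of rows."""
    return [
        sum(a * xi for a, xi in zip(A[r], x))
        + sum(b * ui for b, ui in zip(B[r], u))
        + f[r]
        for r in range(len(x))
    ]


# Hypothetical single-link example over T = 40 steps with a fixed control.
T = 40
A, B = [[1.0]], [[-0.5]]
x, u = [50.0], [10.0]
trajectory = [x]
for t in range(T):
    x = step(A, B, x, u, gaussian_inflow(t, T, [5.0], 5.0))
    trajectory.append(x)
```

In a closed-loop experiment the fixed control `u` would be replaced at every step by the signal computed by the TUC regulator or by the agents.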


Fig. 7. Mean accumulated cost over 40 simulation steps for a set of 10 random initial conditions.


Because model (19) is not ideal for traffic representation, the following accumulated cost was chosen as the objective function and comparative metric:

$$J_{exp} = \sum_{t=0}^{T} \left[ x(t)' Q x(t) + \Delta u(t)' R \Delta u(t) \right] \tag{20}$$

where $T$ is the number of simulation steps; $\Delta u(t) = u^N - u(t)$ is the deviation from the nominal control signals; $Q = I$ is an identity matrix weighing the states; and $R = 0.003I$ is a matrix weighing control deviation.
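The accumulated cost (20) reduces to sums of squares when $Q = I$ and $R = rI$. A minimal sketch under that assumption follows; the function name and the sample trajectory are illustrative.

```python
def accumulated_cost(states, controls, nominal, r=0.003):
    """Accumulated cost of Eq. (20) assuming Q = I and R = r * I.

    states   -- list of state vectors x(t), one per simulation step
    controls -- list of control vectors u(t), aligned with the states
    nominal  -- nominal control vector u^N
    """
    cost = 0.0
    for x, u in zip(states, controls):
        deviation = [un - ui for un, ui in zip(nominal, u)]
        cost += sum(xi * xi for xi in x) + r * sum(d * d for d in deviation)
    return cost


# Hypothetical two-step trajectory with two links and two splits.
j_exp = accumulated_cost(
    states=[[3.0, 4.0], [0.0, 0.0]],
    controls=[[10.0, 20.0], [12.0, 20.0]],
    nominal=[10.0, 20.0],
)
```

Here the first step contributes $3^2 + 4^2 = 25$ from the queues, and the second contributes $0.003 \cdot 2^2 = 0.012$ from the control deviation.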

The values assumed for the weighting matrices are typical of other papers on TUC control (Diakaki, 1999; Carlson et al., 2006). The objective $J_{exp}$ simultaneously minimizes queues in a balanced way with the quadratic term $\|x(t)\|_Q^2$ and control deviation from a nominal fixed-time control policy $u^N$ with the quadratic term $\|\Delta u(t)\|_R^2 = \|u(t) - u^N\|_R^2$. The traffic engineer experimentally sets the parameter $r$ defining the control-cost matrix $R = rI$ and thereby the trade-off rate between the two objectives. Nominal splits for the experiments appear in Tables 1–3.

A stop criterion of relative tolerance was selected for multi-agent MPC. Let $J_{exp}(t) = \sum_{k=1}^{K} [x(t+k|t)' Q x(t+k|t) + \Delta u(t+k-1|t)' R \Delta u(t+k-1|t)]$ be the objective function for MPC over the prediction horizon, that is, the objective of $P(t)$. The distributed agents iterate until $\|J_{exp}^{(k)}(t) - J_{exp}^{(k-1)}(t)\| / \|J_{exp}^{(k)}(t)\| < \rho$ where $\rho$ is the tolerance, $\{\Delta u(t)^{(k)}\}$ is the sequence of iterates produced by the agents, and $k$ is the iteration counter. Such a criterion is satisfied when the relative decrease in the objective function becomes insignificant.
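The relative-tolerance test amounts to a one-line comparison. In the sketch below the function name and the guard against a zero objective are assumptions added for illustration.

```python
def has_converged(j_current, j_previous, rho):
    """Relative-decrease stop test: |J(k) - J(k-1)| / |J(k)| < rho."""
    if j_current == 0.0:
        # Assumed guard: a zero objective cannot be improved any further.
        return True
    return abs(j_current - j_previous) / abs(j_current) < rho


# Hypothetical objective values from two consecutive agent iterations.
keep_iterating = not has_converged(100.0, 110.0, rho=0.01)
```

A loose tolerance ends the iterations early, which is consistent with the larger gap to the centralized solution reported for high tolerances.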

Fig. 7 shows the mean accumulated cost $J_{exp}$ over 10 simulation runs with different initial conditions. These results corroborate the efficiency of multi-agent MPC. For a prediction horizon $K = 5$, multi-agent MPC achieves a performance increase of approximately 10% in comparison to the TUC LQR approach. For long horizons, the changes in control signals are more subtle and so are the variations in objective function as shown in the figure. For short horizons, the relative distance between the multi-agent MPC solution and the centralized solution becomes more pronounced, especially for high tolerances. Junctions with high influence on the network, such as junction 1, induce a large cost reduction which, compared to the reduction from less influential junctions, can trigger the stop criterion far from the optimal point.

4.3. Simulation results

Aiming to circumvent the limitations of the numerical analysis, the three scenarios were modeled in AIMSUN® version 6, which is a professional traffic simulator (Barceló and Casas, 2002). The performance results from these simulations are more reliable as the traffic dynamics are modeled more accurately.

Eq. (20) remains the objective function for computing the gain matrix $L$ of the TUC strategy and for multi-agent MPC. Matrix $Q$ was the identity, whereas the control deviation matrix $R$ was either $R_1 = 0.003I$ or $R_2 = I$. All scenarios share the same control interval $\Delta T = 200$ s and a duration of approximately 1 h. Further, equal prediction and control horizons of length $K \in \{1,3\}$ were used for multi-agent MPC. Although they seem small at first, such sliding horizons are in accordance with the dynamics of interest in the process: the proposed control interval is 200 s long, which is larger than the highest cycle time, thereby configuring an adequate control horizon.


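Table 5
Simulation results with $R_1$ matrix for all scenarios.

| Strategy | Scenario | Journey time mean (s/km) | Journey time std. dev. (s/km) | Density mean (veh/km) | Density std. dev. (veh/km) |
|---|---|---|---|---|---|
| TUC LQR | C≠ | 241.23 | 3.15 | 29.51 | 0.67 |
| TUC LQR | C= | 189.89 | 0.75 | 18.57 | 0.23 |
| TUC LQR | C=/crash | 193.06 | 2.72 | 19.14 | 2.74 |
| M-MPC $K = 1$ | C≠ | 240.42 | 6.43 | 29.59 | 0.97 |
| M-MPC $K = 1$ | C= | 189.85 | 0.96 | 18.57 | 0.09 |
| M-MPC $K = 1$ | C=/crash | 192.09 | 1.80 | 19.06 | 2.44 |
| M-MPC $K = 3$ | C≠ | 465.66 | 55.38 | 53.57 | 4.74 |
| M-MPC $K = 3$ | C= | 208.21 | 2.68 | 20.30 | 0.27 |
| M-MPC $K = 3$ | C=/crash | 205.77 | 18.83 | 20.55 | 3.86 |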

Fig. 8. AIMSUN simulation model of the test bed network.


Because state variables are not readily available in AIMSUN$^7$, inductive loop detectors were inserted at the entrance and stop line of the controlled links. Then, the number of vehicles that have entered but not left the link is obtained by subtracting the measurements of the stop-line detector from the measurements of the entrance detector.

Fig. 8 depicts the AIMSUN simulation model of the test bed network. A set of ten replications with different seeds was simulated for each scenario. Tables 5 and 6 report the results achieved by multi-agent MPC (M-MPC) and the TUC LQR strategy for matrices $R_1$ and $R_2$, respectively. The results encompass the scenarios of distinct cycle times (C≠), identical cycle times (C=), and identical cycle times with car crash (C=/crash).

With control-cost matrix $R_1$, the difference between the performance of multi-agent MPC with a unitary step control horizon ($K = 1$) and the TUC LQR approach is not statistically significant. On the other hand, the multi-agent MPC performance is inferior with a prediction horizon of three steps, corroborating the hypothesis that the predictions from the traffic flow model given in Eq. (4) might be significantly wrong. This observation is reinforced by the lack of performance degradation in the numerical experiments in which the predictions match the actual model.

With control-cost matrix $R_2$, the results are slightly favorable to multi-agent MPC but not statistically significant when the length of the prediction horizon is $K = 1$. The TUC LQR approach achieves better performance than multi-agent MPC when $K = 3$.

7 URL: http://www.aimsun.com.


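Table 6
Simulation results with $R_2$ matrix for all scenarios.

| Strategy | Scenario | Journey time mean (s/km) | Journey time std. dev. (s/km) | Density mean (veh/km) | Density std. dev. (veh/km) |
|---|---|---|---|---|---|
| TUC LQR | C≠ | 240.87 | 2.73 | 29.63 | 0.56 |
| TUC LQR | C= | 189.03 | 0.59 | 18.46 | 0.24 |
| TUC LQR | C=/crash | 192.38 | 2.70 | 18.97 | 2.64 |
| M-MPC $K = 1$ | C≠ | 237.82 | 3.03 | 29.35 | 0.37 |
| M-MPC $K = 1$ | C= | 188.74 | 0.80 | 18.47 | 0.06 |
| M-MPC $K = 1$ | C=/crash | 191.60 | 2.47 | 19.05 | 2.53 |
| M-MPC $K = 3$ | C≠ | 311.64 | 31.32 | 37.56 | 3.24 |
| M-MPC $K = 3$ | C= | 199.04 | 2.36 | 19.40 | 0.21 |
| M-MPC $K = 3$ | C=/crash | 202.22 | 6.45 | 20.07 | 0.29 |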


A comparison between Tables 5 and 6 indicates that the performance of all control strategies was slightly better when $R = R_2$.

4.4. Multi-agent MPC reconfigurability

To demonstrate that multi-agent MPC can be reconfigured with ease, two junctions were added to the test bed network as depicted in Fig. 9. The inclusion of the new junctions takes place in two phases, first including sub-system 7 and then sub-system 8. The introduction of junction 7 expands the neighborhood of junction 6 from the set N(6) = {1, 5} to N(6) = {1, 5, 7}. As a consequence, new terms are included in agent 6's objective function to account for the influence of the control signals at junction 6 on the state of junction 7. No change is required in any other junction.

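Initially, the neighborhood of junction 7 consists only of junction 6, configuring an easily implementable sub-system. In the form of Eq. (16a), the objective function of agent 7 is given by

$$
\begin{aligned}
H_7(t) &= H_{777} = \bar{B}_{77}' \bar{Q}_7 \bar{B}_{77} + \bar{R}_7 \\
g_7(t) &= \tfrac{1}{2}\left(H_{767}' + H_{776}\right) u_6(t) + g_{77}(t) \\
&= \bar{B}_{77}' \bar{Q}_7 \bar{B}_{76}\, u_6(t) + \bar{B}_{77}' \bar{Q}_7 \bar{A}_7\, x_7(t)
\end{aligned}
$$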
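As a numerical illustration of how the two terms of agent 7's objective are assembled, the following minimal sketch evaluates the expressions above for small matrices. All numeric values and dimensions (two links and two stages at junction 7, two control signals at junction 6) are hypothetical, and the helper names are introduced here only for illustration; the variables `B77`, `B76`, `Q7`, `R7`, and `A7` stand for the corresponding matrices in the expressions above.

```python
# Minimal dense-matrix helpers (lists of rows), so no external library is needed.
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matadd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def agent7_terms(B77, B76, Q7, R7, A7, u6, x7):
    """Return (H7, g7) for agent 7 when junction 6 is its only neighbor."""
    B77t_Q7 = matmul(transpose(B77), Q7)
    H7 = matadd(matmul(B77t_Q7, B77), R7)        # B77' Q7 B77 + R7
    coupling = matvec(matmul(B77t_Q7, B76), u6)  # B77' Q7 B76 u6
    free = matvec(matmul(B77t_Q7, A7), x7)       # B77' Q7 A7 x7
    g7 = [c + f for c, f in zip(coupling, free)]
    return H7, g7

# Hypothetical data, chosen only to make the arithmetic easy to follow.
B77 = [[-1, 0], [0, -1]]   # green time at junction 7 discharges its own links
B76 = [[1, 0], [0, 0]]     # one stage of junction 6 feeds the first link of junction 7
Q7 = [[1, 0], [0, 1]]
R7 = [[0.5, 0], [0, 0.5]]
A7 = [[1, 0], [0, 1]]

H7, g7 = agent7_terms(B77, B76, Q7, R7, A7, u6=[2, 3], x7=[10, 4])
print(H7)
print(g7)
```

With these made-up values the sketch gives H7 equal to 1.5 times the identity and g7 = (-12, -4). Only `u6` and `x7` change from one sampling instant to the next, so H7 can be computed once and reused.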

The addition of junction 8 is very similar to the previous one. This time, the introduction of junction 8 expands the neighborhood of junction 7 from the set N(7) = {6} to N(7) = {6, 8}. As a consequence, agent 7's objective function must be updated to account for the influence on the state of junction 8.

Fig. 9. Expanded traffic network.


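Table 7. Simulation results of the expanded test bed network.

| Strategy | Scenario | Journey time (s/km), mean | Journey time (s/km), std. dev. | Density (veh/km), mean | Density (veh/km), std. dev. |
|---|---|---|---|---|---|
| M-MPC K = 1 | C= | 200.07 | 0.76 | 19.63 | 0.05 |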


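$$
\begin{aligned}
H_7(t) &= H_{777} + H_{877} \\
&= \left(\bar{B}_{77}' \bar{Q}_7 \bar{B}_{77} + \bar{R}_7\right) + \bar{B}_{87}' \bar{Q}_8 \bar{B}_{87} \\
g_7(t) &= \tfrac{1}{2}\left(H_{767}' + H_{776}\right) u_6(t) + g_{77}(t) + \tfrac{1}{2}\left(H_{887}' + H_{878}\right) u_8(t) + g_{87}(t) \\
&= \bar{B}_{77}' \bar{Q}_7 \bar{B}_{76}\, u_6(t) + \bar{B}_{77}' \bar{Q}_7 \bar{A}_7\, x_7(t) + \bar{B}_{87}' \bar{Q}_8 \bar{B}_{88}\, u_8(t) + \bar{B}_{87}' \bar{Q}_8 \bar{A}_8\, x_8(t)
\end{aligned}
$$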

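The sub-problem of agent 8 is actually fairly simple since junction 7 is its sole neighboring sub-system:

$$
\begin{aligned}
H_8(t) &= H_{888} = \bar{B}_{88}' \bar{Q}_8 \bar{B}_{88} + \bar{R}_8 \\
g_8(t) &= \tfrac{1}{2}\left(H_{878}' + H_{887}\right) u_7(t) + g_{88}(t) = \bar{B}_{88}' \bar{Q}_8 \bar{B}_{87}\, u_7(t) + \bar{B}_{88}' \bar{Q}_8 \bar{A}_8\, x_8(t)
\end{aligned}
$$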

At this point the system is already configured with the newly added junctions. The reconfiguration process is summarized in the following steps:

(1) statistically gather the parameters of the new junction(s);

(2) determine the neighborhood of the added intersection(s); and

(3) revise the objective function of the junctions belonging to that neighborhood and determine the objective function of the new sub-system(s) according to Eq. (16a).
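Steps (2) and (3) can be sketched as a small bookkeeping routine over the neighborhood sets. The function name `add_junction` and the dictionary layout are assumptions made for this illustration, and only the neighborhood of junction 6 stated in the text is included; the remaining junctions of the test bed are omitted.

```python
def add_junction(neighborhoods, new_junction, neighbors):
    """Register a new junction and return the existing agents to be revised.

    neighborhoods maps a junction index to the set of its neighbors.
    Only the agents listed in `neighbors` need their objective updated.
    """
    if new_junction in neighborhoods:
        raise ValueError(f"junction {new_junction} already exists")
    neighbors = set(neighbors)
    neighborhoods[new_junction] = set(neighbors)
    for j in neighbors:
        # Neighborhoods are kept symmetric, as in N(6) and N(7) in the text.
        neighborhoods.setdefault(j, set()).add(new_junction)
    return sorted(neighbors)

# Partial, illustrative network: only N(6) = {1, 5} is taken from the text.
neighborhoods = {6: {1, 5}}

revised_phase1 = add_junction(neighborhoods, 7, {6})  # first phase: junction 7
revised_phase2 = add_junction(neighborhoods, 8, {7})  # second phase: junction 8

print(revised_phase1, revised_phase2)
print(neighborhoods)
```

In both phases the list of agents to revise contains a single junction, which mirrors the localized reconfiguration described above.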

The parameters necessary to put together the simulation scenario are:

• the turning rates are $s_{12,14} = 0.6$, $s_{13,14} = 0.4$, $s_{14,16} = 0.5$, and $s_{15,16} = 0.5$;

• the saturation flow is 3600 veh/h for links $x_{14}$, $x_{15}$, $x_{16}$, and $x_{17}$;

• the nominal splits are $u^N_{7,1} = u^N_{7,2} = u^N_{8,1} = u^N_{8,2} = 54$ s; and

• the inflow for links $x_{15}$ and $x_{17}$ is 800 veh/h.
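To give a feel for the magnitudes implied by these parameters, the sketch below converts the flows to vehicles per interval and applies a generic store-and-forward balance (queue plus arrivals minus departures over one cycle) to a single entry link. The balance is the textbook form and may differ in detail from Eq. (4); the cycle time of 120 s and the initial queue of 40 vehicles are hypothetical values not taken from the text, and the function names are illustrative.

```python
def vehicles_per_interval(flow_veh_per_h, seconds):
    """Number of vehicles carried by a flow (veh/h) during `seconds`."""
    return flow_veh_per_h * seconds / 3600.0

def store_and_forward_step(x, inflow_veh_per_h, sat_flow_veh_per_h, green_s, cycle_s):
    """One-cycle queue update for a link, assuming the link discharges at
    saturation flow during its whole green time (generic store-and-forward)."""
    arrivals = vehicles_per_interval(inflow_veh_per_h, cycle_s)
    departures = vehicles_per_interval(sat_flow_veh_per_h, green_s)
    return x + arrivals - departures

# Values from the list above: 3600 veh/h saturation flow, 54 s nominal split,
# 800 veh/h inflow. The 120 s cycle and the initial queue are assumptions.
discharge = vehicles_per_interval(3600, 54)
x_next = store_and_forward_step(x=40, inflow_veh_per_h=800,
                                sat_flow_veh_per_h=3600, green_s=54, cycle_s=120)
print(discharge, x_next)
```

Under these assumptions a nominal green of 54 s can discharge 54 vehicles per cycle, while roughly 26.7 vehicles arrive, so the hypothetical queue shrinks from 40 to about 12.7 vehicles.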

For the purpose of illustration, the AIMSUN equal cycle scenario (C=) was modified to encompass junctions 7 and 8. The results from the simulations appear in Table 7 for a prediction horizon of one step and $R = 3 \times 10^{-3} I$.

To provide a clear comparison with the LQR process of reconfiguration, the steps needed to include the two junctions above are listed below:

(1) statistically gather the parameters of junction 7;

(2) include the new data in the global matrices A, B, Q, and R;

(3) compute the new control matrix L;

(4) modify all the parameters of the control matrix;

(5) statistically gather the parameters of junction 8;

(6) include the new data in the global matrices A, B, Q, and R;

(7) compute the new control matrix L;

(8) modify all the parameters of the control matrix; and

(9) set up new procedures for recovering feasibility of control signals.

Although the number of steps involved is similar, the inclusion of a new junction in the LQR control scheme requires modification of the control laws of all junctions. As network complexity increases, this task becomes not only arduous but also error prone, as the parameters must be manually input.

5. Summary and future work

The operation of large dynamic systems remains a challenge in control engineering to a great extent due to their sheer size, intrinsic complexity, and nonlinear behavior (Tatara et al., 2005, 2007). Recently, control engineers have turned their attention to multi-agent systems for their composite nature, flexibility, and scalability. To this end, this paper contributed to this evolving technology with a framework for multi-agent control of linear dynamic networks, which are obtained from the interconnection of sub-systems that become dynamically coupled but otherwise have local constraints.


Of particular interest to this paper is the signaling split control of traffic flow modeled by store-and-forward equations. Such a model leads to a linear dynamic network of sub-systems matching the traffic junctions. The state variables are the number of vehicles in the roads leading to each junction, while the control signals are the green times given to each of their stages. The signaling split control entails solving a constrained, infinite-time, linear-quadratic-regulator problem (Diakaki et al., 2002): the quadratic cost seeks to minimize queue lengths and deviation from nominal signals; the constraints ensure that the green times add up to cycle time and are within bounds; and the linear dynamics result from the store-and-forward traffic flow model.
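The constraint set just described (green times that sum to a fixed total and lie within bounds) admits a simple Euclidean projection, which is one way of mapping an infeasible split to the nearest feasible one. The sketch below uses bisection on the multiplier of the sum constraint, a generic technique that is not claimed to be the solver used by the authors or by TUC. The function name, the bounds of 10 s and 100 s, and the total of 108 s (twice the 54 s nominal split) are illustrative assumptions.

```python
def project_green_times(v, lower, upper, total, iterations=200):
    """Project v onto {u : sum(u) = total, lower <= u <= upper} in the
    Euclidean norm, by bisection on the multiplier of the sum constraint."""
    if not (sum(lower) <= total <= sum(upper)):
        raise ValueError("constraints are infeasible")

    def clipped(lam):
        # For a fixed multiplier, each green time is a clipped shift of v.
        return [min(max(vi - lam, lo), hi) for vi, lo, hi in zip(v, lower, upper)]

    lam_lo = min(vi - hi for vi, hi in zip(v, upper))  # every entry at its upper bound
    lam_hi = max(vi - lo for vi, lo in zip(v, lower))  # every entry at its lower bound
    for _ in range(iterations):
        lam = 0.5 * (lam_lo + lam_hi)
        if sum(clipped(lam)) > total:
            lam_lo = lam   # sum too large: shift further down
        else:
            lam_hi = lam
    return clipped(0.5 * (lam_lo + lam_hi))

# Hypothetical two-stage junction: bounds and total are assumptions.
u = project_green_times(v=[70.0, 50.0], lower=[10.0, 10.0],
                        upper=[100.0, 100.0], total=108.0)
print(u)
```

For the hypothetical input (70, 50) the projection removes the 12 s excess evenly, giving approximately (64, 44); when a bound becomes active, the remaining stages absorb the difference.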

The TUC approach uses a feedback control law for signaling split, whereby a static feedback matrix is computed off-line with the LQR technique and a quadratic program is solved on-line to recover split feasibility. On the other hand, model predictive control handles constraints in a systematic way by using a finite-time rolling horizon and solving optimization problems on-line. To cope with large networks and allow distributed reconfiguration, this paper proposed a decomposition of the MPC problem into a set of locally coupled sub-problems that are iteratively solved by a network of distributed agents. The iterates produced by these distributed agents are drawn towards a globally optimal solution if they synchronize their work.

The purpose of the experiments was threefold. First, the numerical analysis aimed to demonstrate the convergent behavior of the multi-agent system and compare its speed with that of an ideal, centralized agent that solves the overall MPC problem. Second, the simulation analysis showed that multi-agent model predictive control can achieve performance comparable to the TUC approach in representative scenarios implemented with the Aimsun simulator. And third, the experiments illustrated the flexibility of the multi-agent MPC framework by introducing two additional controlled junctions, which required only the reconfiguration of the control agent at the neighboring junction.
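The idea of agents iterating on locally coupled sub-problems can be illustrated on a small unconstrained quadratic, minimizing (1/2) u'Hu + g'u. In the sketch below each agent owns one scalar decision and re-optimizes it while its neighbors' decisions are held fixed, with the agents taking turns (a Gauss-Seidel sweep). This is a simplified stand-in under stated assumptions: the constraints of the actual sub-problems are omitted, turn-taking is only one simple form of synchronization, and the matrix H and vector g are made-up data.

```python
def distributed_iterations(H, g, sweeps=100):
    """Agents take turns minimizing 0.5*u'Hu + g'u over their own variable.

    H must be symmetric positive definite. Agent i only needs the entries
    H[i][j] that are nonzero, i.e. the decisions of its neighbors.
    """
    n = len(g)
    u = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            coupling = sum(H[i][j] * u[j] for j in range(n) if j != i)
            u[i] = -(g[i] + coupling) / H[i][i]  # exact minimizer over u[i]
    return u

# Hypothetical three-agent chain: agents 0 and 2 are not coupled (H[0][2] = 0).
H = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
g = [-1.0, -2.0, -3.0]

u = distributed_iterations(H, g)
print(u)
```

For this data the iterates approach (2/9, 1/9, 13/9), which is the solution of Hu = -g that a centralized solver would return. Since agents 0 and 2 share no coupling term, their updates do not interfere with each other.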

The research reported heretofore is multidisciplinary with contributions across the fields of multi-agent technology, optimization, and urban traffic control. Further improvements will be pursued along the following directions:

• numerical and simulated studies with very large networks aimed at confirming the potential of the multi-agent MPC framework;

• the formulation and application of traffic models that more accurately represent traffic flow (Aboudolas et al., 2007); and

• the formal extension of the multi-agent framework to handle constraints on state variables.

References

Aboudolas, K., Papageorgiou, M., Kosmatopoulos, E., 2007. Control and optimization methods for traffic signal control in large-scale congested urban road networks. In: Proceedings of the American Control Conference, New York, USA, pp. 3132–3138.

Balan, G., Luke, S., 2006. History-based traffic control. In: AAMAS’06: Proceedings of the 5th International Joint Conference on Autonomous Agents and Multiagent Systems, ACM, New York, NY, USA, pp. 616–621.

Barceló, J., Casas, J., 2002. Dynamic network simulation with Aimsun. In: Proceedings of the International Symposium on Transport Simulation. <http://www.aimsun.com/site/content/view/35/50/>.

Bertsekas, D.P., 1995. Nonlinear Programming. Athena Scientific, Belmont, MA.

Bielefeldt, C., Diakaki, C., Papageorgiou, M., 2001. TUC and the SMART NETS project. In: Proceedings of the International IEEE Conference on Intelligent Transportation Systems, Oakland, CA, USA, pp. 55–60.

Camacho, E.F., Bordons, C., 2004. Model Predictive Control. Springer-Verlag.

Camponogara, E., de Oliveira, L.B., 2009. Distributed optimization for model predictive control of linear dynamic networks, Accepted by IEEE Transactions on Systems, Man, and Cybernetics – Part A. <http://www.das.ufsc.br/~camponog/papers/dmpc-tuc.pdf>.

Camponogara, E., Talukdar, S.N., 2004. Designing communication networks for distributed control agents. European Journal of Operational Research 153 (3), 544–563.

Camponogara, E., Talukdar, S., 2005. Designing communication networks to decompose network control problems. INFORMS Journal on Computing 17 (2), 207–223.

Camponogara, E., Talukdar, S.N., 2007. Distributed model predictive control: synchronous and asynchronous computation. IEEE Transactions on Systems, Man, and Cybernetics – Part A 37 (5), 732–745.

Camponogara, E., Jia, D., Krogh, B.H., Talukdar, S.N., 2002. Distributed model predictive control. IEEE Control Systems Magazine 22 (1), 44–52.

Camponogara, E., Zhou, H., Talukdar, S.N., 2006. Altruistic agents in uncertain, dynamic games. Journal of Computer & Systems Sciences International 45, 536–552.

Carlson, R.C., Kraus Junior, W., Camponogara, E., 2006. Combining the TUC urban traffic control strategy with bandwidth maximisation control in transportation systems. In: Proceedings of the 11th IFAC Symposium on Control in Transportation Systems.

de Oliveira, L.B., 2008. Otimização e controle distribuído de frações de verde em malhas veiculares urbanas, Master’s thesis, Graduate Program in Electrical Engineering, Federal University of Santa Catarina, in Portuguese.

de Oliveira, L.B., Camponogara, E., 2007. Predictive control for urban traffic networks: initial evaluation. In: Proceedings of the 3rd IFAC Symposium on System, Structure and Control, Iguassu Falls, Brazil.

de Oliveira, D., Bazzan, A.L.C., Lesser, V., 2005. Using cooperative mediation to coordinate traffic lights: a case study. In: AAMAS’05: Proceedings of the 4th International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 563–470.

Diakaki, C., 1999. Integrated control of traffic flow in corridor networks, Ph.D. Thesis, Department of Production Engineering and Management, Technical University of Crete, Greece.

Diakaki, C., Papageorgiou, M., 1997. Partners of the Project TABASCO, Urban Integrated Traffic Control Implementation Strategies, Tech. Rep. Project TABASCO (TR1054), Transport Telematics Office, Brussels, Belgium (September 1997).

Diakaki, C., Papageorgiou, M., Aboudolas, K., 2002. A multivariable regulator approach to traffic-responsive network-wide signal control. Control Engineering Practice 10 (2), 183–195.

Gazis, D.C., Potts, R.B., 1963. The oversaturated intersection. In: Proceedings of the Second International Symposium on Traffic Theory, pp. 221–237.

Hunt, P.B., Robertson, D.I., Bretherton, R.D., Winton, R.I., 1981. SCOOT – a traffic responsive method of coordinating signals, Tech. rep., Transport Research Laboratory, Crowthorne, England.

Jennings, N., 2000. On agent-based software engineering. Artificial Intelligence 117, 277–296.


Kosmatopoulos, E., Papageorgiou, M., Bielefeldt, C., Dinopoulou, V., Morris, R., Mueck, J., Richards, A., Weichenmeier, F., 2006. International comparative field evaluation of a traffic-responsive signal control strategy in three cities. Transportation Research Part A: Policy and Practice 40 (5), 399–413.

Kühne, F., 2005. Controle preditivo de robôs móveis não holonômicos, Master’s thesis, Graduate Program in Electrical Engineering, Federal University of Rio Grande do Sul, Brazil, in Portuguese.

Li, S., Zhang, Y., Zhu, Q., 2005. Nash-optimization enhanced distributed model predictive control applied to the Shell benchmark problem. Information Sciences 170 (2-4), 329–349.

Lowrie, P.R., 1982. The Sydney co-ordinated adaptive traffic system – principles, methodology and algorithms. In: Proceedings of the IEE International Conference on Road Traffic Signalling, London, pp. 67–70.

Maciejowski, J.M., 2002. Predictive Control with Constraints. Prentice Hall.

Manikonda, V., Levy, R., Satapathy, G., Lovell, D.J., Chang, P.C., Teittinen, A., 2001. Autonomous agents for traffic simulation and control. Transportation Research Record 1774, 1–10.

Maturana, F.P., Staron, R.J., Hall, K.H., 2005. Methodologies and tools for intelligent agents in distributed control. IEEE Intelligent Systems 20 (1), 42–49.

Negenborn, R.R., Schutter, B.D., Hellendoorn, J., 2008. Multi-agent model predictive control for transportation networks: serial versus parallel schemes. Engineering Applications of Artificial Intelligence 21 (3), 353–366.

Nguyen-Duc, M., Guessoum, Z., Mari, O., Perrot, J.-F., Briot, J.-P., Duong, V., 2008. Towards a reliable air traffic control. In: AAMAS’08: Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 101–104.

Papageorgiou, M., 2004. Overview of road traffic control strategies. In: Information and Communication Technologies: From Theory to Applications, pp. LIX–LLX.

Pechoucek, M., Šišlák, D., Pavlícek, D., Uller, M., 2006. Autonomous agents for air-traffic deconfliction. In: AAMAS’06: Proceedings of the 5th International Joint Conference on Autonomous Agents and Multiagent Systems, ACM, New York, NY, USA, pp. 1498–1505.

Rigolli, M., Brady, M., 2005. Towards a behavioural traffic monitoring system. In: AAMAS’05: Proceedings of the 4th International Joint Conference on Autonomous Agents and Multiagent Systems, ACM, New York, NY, USA, pp. 449–454.

Robertson, D.I., 1969. TRANSYT: A traffic network study tool, Tech. rep., Transport Research Laboratory, Crowthorne, England.

Robertson, D.I., Bretherton, R.D., 1991. Optimizing networks of traffic signals in real time – the SCOOT method. IEEE Transactions on Vehicular Technology 40 (1), 11–15.

Srinivasan, D., Choy, M.C., 2006. Cooperative multi-agent system for coordinated traffic signal control. IEE Proceedings Intelligent Transport Systems 153 (1), 41–49.

Tatara, E., Birol, I., Teymour, F., Çinar, A., 2005. Agent-based control of autocatalytic replicators in networks of reactors. Computers & Chemical Engineering 29, 807–815.

Tatara, E., Çinar, A., Teymour, F., 2007. Control of complex distributed systems with distributed intelligent agents. Journal of Process Control 17, 415–427.

Tomás, V.R., Garcia, L.A., 2005. A cooperative multiagent system for traffic management and control. In: AAMAS’05: Proceedings of the 4th International Joint Conference on Autonomous Agents and Multiagent Systems, ACM, New York, NY, USA, pp. 52–59.

Tumer, K., Agogino, A., 2007. Distributed agent-based air traffic flow management. In: AAMAS’07: Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems, ACM, New York, NY, USA, pp. 1–8.

Webster, F.V., 1959. Traffic signal settings, Tech. Rep. 39, Road Research Laboratory, London, UK.

Wooldridge, M., 2002. An Introduction to MultiAgent Systems. John Wiley & Sons Ltd.

Yamashita, T., Izumi, K., Kurumatani, K., Nakashima, H., 2005. Smooth traffic flow with a cooperative car navigation system. In: AAMAS’05: Proceedings of the 4th International Joint Conference on Autonomous Agents and Multiagent Systems, ACM, New York, NY, USA, pp. 478–485.