Pergamon

Transpn. Res.-C, Vol. 3, No. 4, pp. 247-260, 1995. Copyright © 1995 Elsevier Science Ltd

Printed in Great Britain. All rights reserved 0968-090X/95 $9.50 + 0.00

A REVIEW OF NEURAL NETWORKS APPLIED TO TRANSPORT

MARK DOUGHERTY The Institute for Transport Studies, University of Leeds, Leeds LS2 9JT, U.K.

(Received 20 March 1995)

Abstract - This paper attempts to summarise the findings of a large number of research papers concerning the application of neural networks to transportation. A brief introduction to neural networks is included, for the benefit of readers unfamiliar with the techniques. Because the subject is so young, some of the papers appear only in conference proceedings or other less formal publications. I make no apology for this; I felt it was important to cover as much of the contemporary work as was possible.

The paper surveys both the application areas found to be fruitful and the range of neural network paradigms which have been used. Not surprisingly, multilayer feedforward networks trained by backpropagation have so far been by far the most popular, but there are signs of a growing diversity; practitioners using neural networks are urged to seek out the less well known paradigms and experiment with them themselves.

A particular weakness noted in much of the work is the informal approach taken to detailed analysis of the results of the research. It is postulated that a more rigorous approach to matters such as comparison with other techniques and also the methodology used to design the neural networks would help a clearer picture to emerge as to best practice and future research directions.

INTRODUCTION

The field of transport studies has seen an explosion of interest in neural networks in the 1990s. This is evident when one considers that over 40 papers published since 1990 are reviewed in this paper. Only a handful can be found from the previous decade. This can be seen as part of a general pattern of increased use of artificial intelligence techniques in transport (Kirby and Parker, 1994).

Review articles on the subject have been published previously (Faghri and Hua, 1992a; Hua and Faghri, 1994), but both of these papers concentrate more on future potential and possibilities rather than previous work. The aim of this paper is therefore to make a critical review of the work carried out thus far. What are the main classes of problem within transport which have been tackled using neural networks? What kinds of neural network have been used? What are the main achievements to date? What mistakes have been made, and how could they have been avoided? What are the obstacles to further progress and how might they be overcome? Finally, I think it is significant that those working in the field of neural networks applied to transport still usually find it necessary to announce this fact in the titles of their papers. Is using neural networks really that different from other methods of analysis?

INTRODUCTION TO NEURAL NETWORKS

Neural networks is a broad term covering a great many different architectures, or paradigms. The operation of these paradigms can vary enormously. However, all neural networks share some basic common features. They are composed of a number of very simple processing elements, known as neurons. These elements take data in from a number of sources and compute an output dependent in some way on the values of the inputs, using an internal “transfer function”. The neurons are joined together by weighted connections; data flows along these connections and is scaled during transmission according to the values of the weights (Fig. 1).

Fig. 1. A neuron (processing element).

In general terms the relationship between the inputs $X_0, \ldots, X_n$ of neuron $j$ and its output $Y_j$ is given by equations (1) and (2). The function $f$ is typically a non-linear function such as a sigmoid.

$$I_j = \sum_{i=0}^{n} W_{ji} X_i \quad \text{(summation)} \tag{1}$$

$$Y_j = f(I_j) \quad \text{(transfer)} \tag{2}$$
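The summation and transfer steps of equations (1) and (2) can be illustrated with a few lines of Python. This is only a minimal sketch: the logistic sigmoid is one common choice of transfer function, the function names are hypothetical, and the weights and inputs are made-up values (with the first input treated as a constant bias, which is an assumption rather than something stated above).

```python
import math


def sigmoid(x):
    """Logistic sigmoid, one common choice for the transfer function f."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(weights, inputs, transfer=sigmoid):
    """Compute Y_j = f(I_j), where I_j is the weighted sum of the inputs."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    # Equation (1): summation of the weighted inputs.
    activation = sum(w * x for w, x in zip(weights, inputs))
    # Equation (2): apply the transfer function.
    return transfer(activation)


# Illustrative values only; the first input (1.0) is treated as a bias.
y = neuron_output([0.5, -1.0, 2.0], [1.0, 0.5, 0.25])
```

Here the weighted sum is 0.5, so the neuron's output is the sigmoid of 0.5, roughly 0.62.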

The output of a particular neuron may therefore contribute to the input received by another. Naturally such a system is of little use unless it communicates with the outside world and so some connections take data in from an external source, whilst others pass data back out. The neural network’s functionality is very much bound up in the values of the connection weights, which can be updated over time, causing the neural network to adapt and possibly “learn”. Partly because this idea is so abstract, those working with neural networks have tended to impose a more rigid structure in practice. Several simplifications are made:

• The neurons are arranged neatly in layers, with the existence or not of a connection between two neurons being governed by a strict rule. For example, a common scheme is for the output of each neuron in one layer to be fully connected to the inputs of all neurons in another. This arrangement is typical of a feedforward network (Fig. 2).
• A “learning rule” is defined which determines how and when connection weights are updated.
• Connection weights have minimum and maximum strengths.
• All the neurons within a layer, or often the entire network, behave in the same way; that is, they all use the same formula to compute an output from the weighted inputs.

Fig. 2. A typical feedforward neural network.

Many networks have a further simplification in that they are feedforward networks with no circular information paths; data flows in steps from the input side to the output side. By contrast, recirculation networks do have such circular paths. In this case it is usually assumed that all neurons compute their results simultaneously; these results then map onto a new neural network state, and the process can be repeated.
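The simplifications above (fully connected layers, a shared transfer function, bounded weights, data flowing in steps from input to output) might be sketched as follows. The layer sizes, weight values and weight bounds are hypothetical, chosen purely for illustration.

```python
import math


def sigmoid(x):
    """Transfer function shared by every neuron in the network."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(weight_rows, inputs):
    """Fully connected layer: each row holds one neuron's input weights."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weight_rows]


def feedforward(layers, inputs):
    """Pass data in steps from the input side to the output side."""
    signal = list(inputs)
    for weight_rows in layers:
        signal = layer_forward(weight_rows, signal)
    return signal


def clip_weight(w, w_min=-5.0, w_max=5.0):
    """Keep a weight within minimum and maximum strengths (bounds are hypothetical)."""
    return max(w_min, min(w_max, w))


# Hypothetical network with 2 inputs, 3 hidden neurons and 1 output neuron.
hidden = [[0.2, -0.4], [0.7, 0.1], [-0.3, 0.5]]
output = [[1.0, -1.0, 0.5]]
result = feedforward([hidden, output], [1.0, 0.0])
```

A recirculation network would differ in that the output of one pass is fed back in as the next state, with all neurons updated simultaneously.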

Neural networks can be categorised according to the type of learning rule employed. Three main learning schemes have been devised: supervised, reinforcement and self-organising. A fourth category of neural network can be defined as those networks which use more than one type of learning; these are often described as hybrid networks.

Supervised learning

In supervised learning, an input is presented to one side of a feedforward network, and an output computed. This is compared with the output desired for these inputs, and a global error function computed. This is then used to update the weights in order to move the output towards the desired output. Over the course of many examples being presented to the neural network, it is hoped that the global error will gradually decline, as the network converges into a steady state. The learning is described as supervised, because the network is given an exact description of the behaviour required after each iteration.
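As a rough illustration of this cycle (present an input, compare the output with the desired value, update the weights, repeat), the sketch below trains a single linear neuron with the delta (least-mean-squares) rule. The choice of rule, the learning rate and the training data are all assumptions made for the example; the text above does not prescribe a particular update formula.

```python
def train_delta_rule(samples, n_inputs, rate=0.1, epochs=200):
    """Train one linear neuron by the delta (least-mean-squares) rule.

    Returns the final weights and the global error recorded in each epoch.
    """
    weights = [0.0] * n_inputs
    history = []
    for _ in range(epochs):
        total_error = 0.0
        for inputs, desired in samples:
            output = sum(w * x for w, x in zip(weights, inputs))
            error = desired - output
            total_error += 0.5 * error * error
            # Move each weight so the output shifts towards the desired value.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
        history.append(total_error)
    return weights, history


# Made-up training set generated from desired = 2*x1 - x2;
# the first input is a constant bias of 1.
samples = [
    ([1.0, 0.0, 0.0], 0.0),
    ([1.0, 1.0, 0.0], 2.0),
    ([1.0, 0.0, 1.0], -1.0),
    ([1.0, 1.0, 1.0], 1.0),
]
weights, history = train_delta_rule(samples, n_inputs=3)
```

The `history` list plays the role of the global error, which should fall as the weights settle towards values near 0, 2 and -1.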

Perceptrons (continuous data) and the ADALINE (binary data) are early examples of this type of network. Whilst it was shown that they could solve certain types of problem, they cannot converge to a solution for a problem which is not linearly separable (Minsky and Papert, 1969). The well-known backpropagation learning method solves this problem for multilayer perceptrons by updating each internal weight in proportion to the partial derivative of the error with respect to that weight (Hecht-Nielsen, 1989). Similarly, the MADALINE network, which is able to solve problems that are not linearly separable, was developed from the ADALINE.
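The linear separability limitation can be demonstrated with a small experiment. The sketch below uses the classic single-layer perceptron rule with a hard threshold; the function names and the learning rate are illustrative assumptions. It is trained once on logical AND (linearly separable) and once on XOR (not linearly separable).

```python
def step(x):
    """Hard threshold transfer function."""
    return 1 if x >= 0.0 else 0


def train_perceptron(samples, rate=1.0, max_epochs=100):
    """Single-layer perceptron rule; returns (weights, converged)."""
    weights = [0.0, 0.0, 0.0]  # bias weight followed by two input weights
    for _ in range(max_epochs):
        mistakes = 0
        for (x1, x2), target in samples:
            inputs = (1.0, x1, x2)
            output = step(sum(w * x for w, x in zip(weights, inputs)))
            if output != target:
                mistakes += 1
                weights = [w + rate * (target - output) * x
                           for w, x in zip(weights, inputs)]
        if mistakes == 0:
            # A full pass with no errors: a separating line has been found.
            return weights, True
    return weights, False


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

_, and_converged = train_perceptron(AND)  # linearly separable
_, xor_converged = train_perceptron(XOR)  # not linearly separable
```

Training on AND finishes with an error-free pass, whereas no single threshold unit can classify all four XOR cases, so the XOR run never converges however many epochs are allowed.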

The majority of transport applications of neural networks have used backpropagation networks; the reader should therefore assume that backpropagation (or similar) has been used for a particular piece of work unless otherwise stated.

Reinforcement learning

A less direct way of training a network is to simply inform it whether it performed well or badly for each iteration. If the neural network performed well, processing elements within the network which are outputting an active signal are examined, and the strength of the most active input connections to these neurons is increased. Often this is done on a “winner takes all” basis, with only the connection providing the strongest input being updated. Thus strong, active paths through the neural network are reinforced. Conversely, if the neural network performed badly the strength of some connections is weakened. Reinforcement learning is often known as Kohonen learning.

Examples of paradigms which use this approach are Learning Vector Quantisation (Kohonen et al., 1988), which is a feedforward network, and Hopfield networks (Hopfield, 1982), which are recirculation networks. A generalisation of the Hopfield network is the Boltzmann machine, which introduces the added sophistication of simulated annealing.
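To make the “winner takes all” idea concrete, the following sketch shows one update step in the style of the textbook LVQ1 rule: only the prototype closest to the input is adjusted, being moved towards the input if its class label is right and away from it if wrong. The data layout (a list of vector and label pairs), the function names and the learning rate are assumptions for illustration, not details taken from the papers reviewed here.

```python
def nearest_prototype(prototypes, sample):
    """Index of the winning prototype (smallest squared Euclidean distance)."""
    def dist2(vector):
        return sum((a - b) ** 2 for a, b in zip(vector, sample))
    return min(range(len(prototypes)), key=lambda i: dist2(prototypes[i][0]))


def lvq1_step(prototypes, sample, label, rate=0.1):
    """Winner-takes-all update: only the winning prototype is moved."""
    i = nearest_prototype(prototypes, sample)
    vector, proto_label = prototypes[i]
    # Reinforce (move closer) when the winner's class is right, weaken otherwise.
    sign = 1.0 if proto_label == label else -1.0
    moved = [v + sign * rate * (s - v) for v, s in zip(vector, sample)]
    prototypes[i] = (moved, proto_label)
    return i


# Two hypothetical prototypes, one per class.
prototypes = [([0.0, 0.0], "a"), ([1.0, 1.0], "b")]
first = lvq1_step(prototypes, [0.2, 0.0], "a", rate=0.5)   # correct winner: pulled closer
second = lvq1_step(prototypes, [0.8, 1.0], "a", rate=0.5)  # wrong winner: pushed away
```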

Self-organising networks

This type of paradigm operates purely on input data, with all the criteria for updating the weights being determined internally within the neural network. The main use of such networks is for classification problems where it is not certain beforehand what the definition of the classes should be. Adaptive Resonance Theory (ART) networks are an example of this type of network (Grossberg, 1976). An ART network contains a novel feature detector: if the network is exposed to a pattern it has not seen before it defines it as belonging to a new class. Further similar examples are then grouped within that class. Self-organising feature maps work differently, in that they perform an adaptive dimensionality reduction of input data from many dimensions to just two (Kohonen, 1988). Spatial grouping within this two-dimensional space can indicate separate classes of input.
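The novelty-detection behaviour described for ART can be imitated by a much simpler scheme, sketched below. This is not an implementation of ART: it is plain "leader" clustering, in which a pattern joins the nearest existing class unless it lies further away than a threshold, in which case it founds a new class. The threshold name `vigilance` is borrowed loosely from ART terminology, and its value and the sample patterns are invented.

```python
import math


def assign_or_create(prototypes, pattern, vigilance=1.0):
    """Return the class index for `pattern`, creating a new class if it is novel."""
    best, best_dist = None, float("inf")
    for i, proto in enumerate(prototypes):
        d = math.dist(proto, pattern)
        if d < best_dist:
            best, best_dist = i, d
    if best is None or best_dist > vigilance:
        prototypes.append(list(pattern))  # unseen pattern: open a new class
        return len(prototypes) - 1
    return best


# Invented two-dimensional patterns forming two loose groups.
patterns = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.1, 0.0)]
prototypes = []
labels = [assign_or_create(prototypes, p) for p in patterns]
```

No class definitions are supplied in advance; the two classes emerge from the data alone, which is the essential property of the self-organising approach.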

Combined networks

Finally, some networks may use different learning methods for the connections between different layers. Radial basis functions are an example of this general approach: here a backpropagation type output layer sits above an initial network of radially symmetric kernel units (Leonard et al., 1992). Counterpropagation has a Grossberg reinforcement learning output layer above a self-organising feature map.
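A radial basis function network of the kind just described might be sketched as follows. Several assumptions are made here: the kernels are Gaussian, their centres and width are fixed by hand rather than learned, and the output layer is trained with the delta rule (the single-layer case of a gradient-based, backpropagation type update). The XOR data set is used only because it is a familiar problem that a single linear layer cannot solve.

```python
import math


def gaussian_kernel(x, centre, width):
    """Radially symmetric unit: response depends only on distance from the centre."""
    return math.exp(-math.dist(x, centre) ** 2 / (2.0 * width ** 2))


def rbf_forward(x, centres, width, out_weights):
    """Return the network output and the hidden (kernel) activations."""
    hidden = [gaussian_kernel(x, c, width) for c in centres]
    return sum(w * h for w, h in zip(out_weights, hidden)), hidden


def train_output_layer(samples, centres, width, rate=0.2, epochs=200):
    """Supervised (delta rule) training of the output weights; kernels stay fixed."""
    weights = [0.0] * len(centres)
    for _ in range(epochs):
        for x, target in samples:
            y, hidden = rbf_forward(x, centres, width, weights)
            weights = [w + rate * (target - y) * h for w, h in zip(weights, hidden)]
    return weights


# Hand-picked kernel centres (an assumption) and the XOR training set.
centres = [(0.0, 1.0), (1.0, 0.0)]
xor_samples = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
               ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
weights = train_output_layer(xor_samples, centres, width=0.5)
outputs = [rbf_forward(x, centres, 0.5, weights)[0] for x, _ in xor_samples]
```

Thresholding `outputs` at 0.5 separates the two XOR classes, because the kernel layer has already mapped the inputs into a space where a linear output layer suffices.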

The reader is recommended to refer to textbooks on neural networks for comprehensive treatment of these and other paradigms (Hecht-Nielsen, 1990; Beale and Jackson, 1990).

THE TARGET SUBJECT AREAS

Before examining any work in detail, it is interesting to gain a broad overview of the kind of subject areas within transport which have received attention from those working with neural networks. Table 1 gives an initial breakdown by subject area of the papers reviewed in this article. Obviously the categories are purely arbitrary; some could reasonably be merged, whilst others cover what is arguably a very broad area and could be sub-divided. However, it is hoped that the reader gets a good idea of the kind of topics being addressed.

A point to notice is that the vast majority of these papers concern road-based transport. Whilst it is probably true that the majority of research projects have been directed towards this mode, the table is not truly representative. For example, only a few papers covering the maritime and avionics industries appear. One reason for this is that a lot of the most interesting work has been kept secret, for both commercial and military reasons. Also bear in mind that some subjects (such as autonomous underwater vehicles) are highly specialised, and the reader is therefore left to search out further references if interested.

What follows is a brief discussion of the papers covering each subject area:

Driver behaviour

Work in this area splits neatly into two schools: modelling of either strategic or instinctive decisions. The first school is typified by papers by Yang et al. (1992) and Dougherty and Joint (1992), who both describe using neural networks to analyze data collected from interactive route choice simulators. Volunteer drivers who took part in these experiments were asked to make route choices based on the values of a wide variety of criteria. A neural network was then trained using these data to make similar decisions. The network was then asked to make predictions of route choice for data items it had not seen previously; good replication rates with regard to the actual decisions made were reported in both papers. A neural network proved a quicker and more accurate method of analysis than alternative techniques such as logit models.

Table 1

Subject area                            Number of papers
Driver behaviour/autonomous vehicles    12
Parameter estimation                    7
Pavement maintenance                    6
Vehicle detection/classification        5
Traffic pattern analysis                5
Freight operations                      4
Traffic forecasting                     4
Transport policy and economics          2
Air transport                           2
Maritime transport                      2
Submarine vehicles                      1
Metro operations                        1
Traffic control                         1
Total                                   52

Perhaps what is particularly interesting is the contrast between these two papers as to the method used to establish the importance of a particular criterion. Yang et al. analyzed the variation of the replication rate after changing the information given to the drivers, contending that changes in replication rate are indicative of how important a particular factor is: if drivers are not given sufficient information their decisions will become less rational. Dougherty and Joint took a completely different approach, performing elasticity tests on the trained networks, in order to get an idea of the relative importance of each criterion.
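One common way to carry out an elasticity test on a trained model is to perturb a single input by a small percentage and measure the percentage change in the output. The sketch below follows that definition; it is not claimed to be the exact procedure used by Dougherty and Joint. The function `toy_model` is a hypothetical stand-in for a trained network, chosen because its true elasticities (-2 and -0.5) are known.

```python
def elasticity(model, inputs, index, rel_step=0.01):
    """Approximate (% change in output) / (% change in one input)."""
    base = model(inputs)
    if base == 0.0 or inputs[index] == 0.0:
        raise ValueError("elasticity is undefined when the output or input is zero")
    perturbed = list(inputs)
    perturbed[index] = inputs[index] * (1.0 + rel_step)
    return ((model(perturbed) - base) / base) / rel_step


def toy_model(x):
    """Hypothetical stand-in for a trained network (not from the papers reviewed)."""
    travel_time, toll = x
    return 10.0 / (travel_time ** 2 * toll ** 0.5)


e_time = elasticity(toy_model, [20.0, 1.5], index=0)  # close to -2
e_toll = elasticity(toy_model, [20.0, 1.5], index=1)  # close to -0.5
```

Comparing the magnitudes of the elasticities gives a relative ranking of the criteria; in this toy case travel time matters roughly four times as much as the toll.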

The second school is typified by work aimed at trying to build real-time models capable of steering a car around a road environment. Lyons and Hunt (1992) describe an initial experiment, using data collected on an interactive computer simulation model, which attempted to model overtaking manoeuvres. This work was later extended and the problem reparameterised to enable real data to be used (Hunt and Lyons, 1993; Lyons, 1994). Learning vector quantisation networks were used in this later work; the “winner takes all” approach appears well suited to modelling this type of decision making. Another classical modelling problem which has been attacked using neural networks is gap acceptance (Pant and Balakrishman, 1994). A self-organising neural network has been built which can reverse articulated trucks: this is a particularly interesting problem because human drivers often cannot solve it themselves. Because of this, a self-organising approach was absolutely essential. Neural networks (ART) have also been demonstrated for control of the lateral position of a vehicle using objects in the driver’s view such as white lines (Kornhauser, 1991). Again, a data set from drivers operating a simulation was used in this work and with the cost of virtual reality technology falling rapidly, this seems a likely area of enormous future interest.

Other researchers have taken this idea much further, designing systems which can be used to equip a real vehicle. Work using neural networks processing data from an array of distance detectors has been reported (Neusser et al., 1991), although this system was only trained to deal with simplistic environments. Much greater flexibility has been achieved in projects such as NAVLAB (Crisman and Webb, 1991). In this type of work we can clearly see the two facets of instinctive and strategic decision making being unified to produce a truly autonomous vehicle capable of interacting with a normal road environment (Pomerleau et al., 1991).

Parameter estimation

This area of interest falls into a class of problem where measurable quantities such as traffic flows are used to estimate parameters which, although of great use to traffic engineers, are not easily measured on street without resorting to expensive manual surveys.

Origin-destination (OD) matrices are a typical data resource which engineers have need of. Work has been published by many groups attempting to make estimates of such matrices from flow data. Neural networks have been used in this context, typically on a very small network, or even a single intersection (Kikuchi et al., 1993; Yang et al., 1992; Chin et al., 1994). Whilst this work is interesting, it must be remembered that traffic engineers often require OD matrices covering several hundred points. Two problems are likely to occur if an attempt is made to scale this work up. Firstly, the computation time of the neural network methods described will rise with the square of the number of points. Secondly, the number of training examples required to train the networks will become very large as a result, and collecting real data on this scale would become infeasible.

The inverse of OD matrix estimation involves estimating traffic flows on the basis of a known (or estimated) OD matrix: the classic traffic assignment problem. A typical scenario is that a traffic scheme is suggested which involves a major change to the road network. First an OD matrix is estimated, and then the traffic reassigned to a new imaginary network which mirrors the existing network with the proposed changes superimposed. Work has been reported which used a hybrid genetic algorithm/neural network to solve the assignment problem (Xiong and Schneider, 1992). In fact the main component of this system was a cumulative genetic algorithm, which was used to search out a minimum on the trip cost surface; a neural network was used as the cost function to be minimised. Two main benefits of this method are claimed. The method was faster than a conventional equilibrium model (although the use of a genetic algorithm is probably the most significant factor affecting speed of operation). Much more directly related to the use of neural networks is that several cost criteria could be incorporated very conveniently.

Travel time estimates are another important requirement of traffic engineers. Neural networks are likely to be much more effective for this type of problem. This is because, unlike OD estimation, good results can be achieved using data collected on only the link of interest (Hua and Faghri, 1994). Even better results might be obtainable using data from neighbouring links (Nelson and Palacharla, 1993). This latter work used counterpropagation networks functioning as an adaptive look-up table; unfortunately the authors do not show whether this method really generalised, as the amount of data used was very small. Estimates have also been made of total link occupancies in an urban context using neural networks; this information could possibly be of use in adaptive traffic control systems (Dougherty et al., 1993). Finally, a system capable of estimating the maximum capacity of a link from patterns of density and speed observed on the link has been reported (Heymans et al., 1991).

Pavement maintenance

A system providing advice on the best options for maintaining a road surface can be split into two sub-systems: a diagnostic element and a prognostic element. Both have received attention from the neural network community.

The diagnostic element has been successfully attacked as a problem of image processing (Kaseko and Ritchie, 1992, 1993). In this work neural networks are used to process pictures of road surfaces and categorise features within them into different types of defect. Note that considerable amounts of classical image processing were still necessary before the neural network stage was reached, emphasising the point that neural networks do not usually provide a complete solution to a problem. Another aspect of pavement diagnosis is that of automatically recognising road markings which have been damaged or obscured (Hua and Faghri, 1993). Here a Hopfield net was used as an associative memory to map incomplete images onto templates. Again we see the idea of neural networks being used as sub-systems; what makes this work unusual is that two different architectures of neural network are used for the different sub-tasks of image association and recognition.

Those exploring the idea of prognostic systems for pavement maintenance suffer a particularly difficult problem with regard to transferability. This is because the action required for the treatment of a road surface is not just dependent on its condition, but also on many other factors such as the level and type of traffic it is intended to bear and, more importantly, how much money there is to spend! Collecting sufficient data to cover all these eventualities is extremely difficult. Work has been described which side-steps this problem altogether by only considering data from a small geographical area and with treatments suggested by a panel of experts from only two (related) organisations (Pant et al., 1993). This gave good results within this rather limited domain, but the network could not be applied elsewhere without retraining; this would involve another lengthy data collection exercise.

An alternative approach is to only order the examples into several priority bands, with no exact suggestion of what particular treatment is needed (Hajek and Hurdal, 1993). This is more general, but a preprocessing stage, tailored to the local situation, is needed. The idea of linking several neural networks appears in respect to this topic as well (Rewinski, 1992), but this paper does not describe its data sources or provide statistically significant results, so it must be regarded as very preliminary in nature.


Vehicle detection/classification

Following on from the idea of a highway equipped with arrays of sensors, much work has been undertaken to extract the maximum amount of information from the signals generated. A typical derived quantity is the class of passing vehicles: a function of its wheelbase, number of axles, weight etc. Basis function networks have been applied to this problem with some success (Mead et al., 1994); although the numerical results were not outstanding, a commercial unit using an algorithmic approach performed even worse! Clearly, more extensive testing, perhaps using devices supplied by other manufacturers, is needed to gain a fuller picture.

A more ambitious idea is to dispense with traditional vehicle detection technology and employ video cameras coupled with high-performance image processing techniques. Two pieces of work using neural networks which complement each other very nicely have been reported in this area. The first concerns detection of vehicles as they pass across a video camera (Bullock et al., 1992, 1993). It is reported that although performance is similar to conventional image processing techniques under ideal conditions, neural networks are more flexible with regard to changes in external factors such as shadowing and camera position. The second piece of work shows how, once detected, vehicles can be classified into one of several types, again using a neural network (Belgaroui and Blosseville, 1993). The idea of a system containing several neural networks, each performing a sub-task, springs to mind. Neural networks have also been used for the problem of automatic license plate reading (Margarita, 1990); once again this work could easily be combined with a detection system.

Traffic pattern analysis

Traffic networks equipped with arrays of inductive loops, or other equivalent sensors, are rich sources of data concerning parameters such as the speed and volume of passing vehicles. Such data sets, particularly if collected from several distinct geographical sites, are extremely complicated to analyze because of the causal relationships in both space and time which drive the behaviour of traffic systems. Several groups have therefore been involved in work which has the general aim of using neural networks to discover patterns within this data, and there is a wide spread of applications within this area.

Neural networks have been demonstrated as an aid to congestion diagnosis (Kirby et al., 1993) by training a neural network to classify an urban traffic network into one of two states: congested or non-congested. The main limitation of this work is the above-mentioned difficulty of transferability, as the network was trained using a highly specific data set defined by a local expert. The paper does, however, give an interesting demonstration of the use of neural networks to fuse several different congestion measures together, to produce a higher level diagnosis. A similar approach is taken in later work (Hua and Faghri, 1993a,b), but the number of congestion categories is extended; this work used adaptive resonance theory and therefore demonstrates that alternative paradigms to backpropagation are certainly worth considering.

This same group has also used adaptive resonance theory to explore the possibility of using neural networks to analyze the seasonal variation of traffic flows (Faghri and Hua, 1992b). This is of importance to traffic engineers making surveys of traffic, so that results can be reduced to a common baseline. Unfortunately the seasonal variation depends on the characteristics of the section of road being examined, and the main problem is therefore to try and classify a section into one of several types before a correction is applied. Naturally this must be done using only a small amount of data in the temporal domain; if continuous data over a number of years were available a correction factor could be easily determined. A neural network was successfully used to carry out this task.

Finally, we come to the task of identifying non-recurrent congestion caused by an incident occurring on the carriageway, such as an accident. Successful work on this subject has been carried out on motorway data (Ritchie et al., 1992), which compares favourably with other more traditional incident detection algorithms in use (Ritchie and Cheu, 1993). The main difficulty in this area is not the detection rate, which is excellent for most of the techniques, but the false alarm rate. This must be exceedingly low if operators are to take actual notice of alarms raised in control centres. It is reported that it is in this area where neural networks score highly, particularly if a persistence factor is set (the neural network must report an incident for two or more consecutive time-slices before an alarm is raised). Another difficulty is finding enough real data containing confirmed incidents to test the system with; for this reason work reported so far has used simulated data sets.
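The persistence factor is simple to express in code. In the sketch below, the classifier's report for each time-slice is assumed to be available as a boolean, and an alarm is raised only once a run of consecutive positive reports reaches the persistence length. The function name and the sample report sequence are invented for illustration.

```python
def alarms_with_persistence(detections, persistence=2):
    """Return the time-slice indices at which an alarm is raised.

    `detections` holds one boolean per time-slice, as reported by the
    classifier. An alarm is raised once a run of positive reports reaches
    `persistence` slices, and not again until the run is broken.
    """
    alarms = []
    run = 0
    for t, flagged in enumerate(detections):
        run = run + 1 if flagged else 0
        if run == persistence:
            alarms.append(t)
    return alarms


# Invented sequence of classifier reports: isolated positives at slices 1 and 3-5, 7-8.
reports = [False, True, False, True, True, True, False, True, True]
filtered = alarms_with_persistence(reports, persistence=2)
unfiltered = alarms_with_persistence(reports, persistence=1)
```

With a persistence of two, the isolated positive at slice 1 is suppressed, at the cost of raising each genuine alarm one slice later.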

Freight operations

The main problem in freight operations for which attempts have been made to use neural networks concerns optimisation of routing networks and scheduling. The main problem all the researchers seem to have experienced is parameterisation of the problem. This is because the problem is so highly non-linear. One possible solution to this is to use self-organising categorisation networks (Matsuyuma, 1991; Jwell et al., 1991). Different optimisation schemes are then used, depending on the class. Another approach is to explore different encoding schemes, with the hope that one can be found which extracts the salient features (Potuin and Shen, 1991). Yet another approach has been to use Boltzmann machines as, unlike other paradigms, this type of network is specifically designed for optimisation problems (Ohba et al., 1989). Unfortunately results were somewhat disappointing and it seems that the techniques involving complex pre-processing show more promise.

Traffic forecasting

The forecasting of traffic falls into two distinct categories. Strategic forecasting is where an attempt is made to predict traffic flows months or years into the future, and usually influences major decisions on road planning. In contrast, short-term forecasts often have a horizon of only a few minutes, and can conceivably feed directly into traffic control systems. Neural networks have been used in both strategic (Chin et al., 1992) and short-term (Dougherty et al., 1994; Dougherty and Cobbett, 1994; Clark et al., 1993) forecasts. Promising results are reported in both cases. The latter paper has a particularly interesting theme in that it makes a considerable effort to compare the neural networks against ARIMA time-series modelling. Several different goodness-of-fit measures are used, and it becomes apparent that the “best” technique depends on how you care to measure the result! The question of comparative studies is discussed at greater length later.
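The dependence of the ranking on the goodness-of-fit measure is easy to reproduce. The sketch below uses mean absolute error and root mean square error as two familiar measures; they are chosen as an assumption, since the measures used in the cited paper are not listed here. All flow values and forecasts are invented.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean square error, which penalises large misses more heavily."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


# Invented flows (vehicles per interval) and two invented sets of forecasts.
actual = [100, 100, 100, 100]
forecast_a = [95, 105, 95, 105]    # consistently off by a little
forecast_b = [100, 100, 100, 116]  # usually exact, with one large miss

scores = {
    "A": (mae(actual, forecast_a), rmse(actual, forecast_a)),
    "B": (mae(actual, forecast_b), rmse(actual, forecast_b)),
}
```

Forecast B has the lower mean absolute error (4 against 5), but forecast A has the lower root mean square error (5 against 8), so each could be called the "best" under a different measure.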

Transport policy and economics

A traffic-related problem in economics is modelling the effect of noise from air traffic on the prices of houses adjacent to an airport (Collins and Evans, 1994). This work is an excellent example of how neural networks may have uses in areas of transport studies previously thought of as unlikely application areas. A neural network was used in this work as a tool for multivariate analysis. Inputs consisted of a large number of possible factors affecting the sale price of a house: condition, size, age etc. and of course a noise factor reflecting the nuisance from aeroplanes landing at the local airport. The elasticity of the neural network was used [in a similar fashion to Dougherty and Joint (1992)] to measure the importance of a particular parameter: in this case the noise factor. A particular point of interest about this work is that the results disagree with another analysis using hedonic regression.

A further study comparing neural networks with regression techniques also reached conflicting conclusions (Duliba, 1991). In this work a neural network was trained to predict overall levels of performance in the transport industry. It performed better than a random effects specification regression model, but worse than a fixed effects specification regression model. It seems that more work is certainly required in this area to further evaluate the usefulness of neural networks.


Air transport

The realm of avionics is highly specialised, and is largely dominated by a few large research organisations. Much of the work is kept secret for defence reasons. Therefore only two papers are mentioned in this review to give the reader a brief insight into the sort of work being carried out (Mann and Hayhim, 1991; Beastall, 1989). Both concern analysis of radar signals; learning vector quantisation is used in the first piece of work.

Maritime transport

Two contrasting pieces of work are mentioned here. A study has shown that neural networks have potential for use in autonomous ship navigation in confined spaces (Stamenkovich, 1991). An unusual aspect of this work is that the neural network is used as a “supervisor” to advise a more microscopic conventional control model, rather than control the ship directly.

The second paper concerns using learning vector quantisation networks in an image processing system which recognises and classifies profile images of ships (Lo and Bavarian, 1991). Whilst this particular piece of work is probably of little civilian interest, it nevertheless points to possible future applications of neural networks in the maritime industry.

Submarine vehicles

A surprisingly large amount of work has been carried out concerning autonomous submarine vehicles, and much of the work involves neural networks (Demuth and Springsteen, 1990). Again, this field is highly specialised and not of such general interest.

Metro operations

A recent paper discusses the possibility of using a combination of fuzzy logic and neural networks to control the acceleration and deceleration of a metro train (Hartani et al., 1994). This very interesting paper describes a complex hierarchical hybrid decision making system which contains neural networks similar to Learning Vector Quantisation.

Traffic control

A single paper in this review deals with a neural network directly attacking a traffic control problem (Nahatsuji and Terutoshi, 1991). A neural network was trained to suggest the optimum green splits for a single intersection. This was later extended to a network of three intersections. Whilst the work is undoubtedly interesting, the authors do not make it clear how they prepared the data sets used to train and test the neural network.

ANALYSIS

Several general points are worthy of further discussion:

Paradigms used

Table 2 shows the distribution of papers reporting the use of different paradigms. It must be pointed out that not all of the networks described in the papers follow exact “textbook” definitions. In these cases I have attempted to place the work in the closest category. Where a paper reports using more than one paradigm I have entered it in both categories; hence the total number of examples of use exceeds the number of papers reported. Unfortunately there are insufficient examples of the use of many of the paradigms to draw any particular conclusions as to their usefulness for different types of problem; this difficulty is exacerbated by the lack of detailed numerical analysis of results in some papers.

Table 2

Paradigm                        Number of papers
Backpropagation                 36
Learning vector quantisation     7
Adaptive resonance theory        4
Self-organising map              2
ADALINE                          1
Hopfield                         1
Basis functions                  1
Counterpropagation               1
Boltzmann machine                1

As yet there is little sign of a methodological approach to the detailed design of a neural network for a particular task, with most researchers applying simple trial and error techniques to find optimum configurations. Work in this area within the transport field has really only reached the initial stage of selecting a paradigm (Faghri and Hua, 1992a) and this work is very general in nature. It also pays little attention to the Kohonen learning based paradigms of learning vector quantisation and self-organising feature maps, which are indicated by Table 2 as being very promising for transport applications. Developments in design methodology should be watched closely, as this is one of the key questions which needs to be solved in order to make the technology more accessible. Progress here will also greatly enhance the credibility of the field.

Neural networks as sub-systems

An interesting point to notice is that only one paper (Nahatsuji and Terutoshi, 1991) concerns a neural network being used to directly alter traffic control parameters. Whilst many of the other papers describe work with a long-term goal of enhancing traffic control systems, the emphasis is very much on neural networks carrying out higher level functions such as pattern recognition or short-term forecasting. Thus it is clear that the experience of most designers is that neural networks are of little use unless they are embedded into systems which contain further algorithms and/or decision making capabilities.

Performance comparisons

An area where much of the reported work is inadequate is making careful comparisons (either qualitative or quantitative) with alternative techniques. In some papers, no comparative work is quoted or reported and therefore the reader is left in the dark as to whether the results justified using neural networks. Since one often-quoted benefit of using neural networks is that they can outperform conventional methods of analysis, more solid evidence of this would be comforting.

In the author’s own experience (Clark et al., 1993) carefully devised methods of statistical analysis can reach similar levels of performance to neural networks. A further complication described in this reference is that different measures of success may not completely agree. It is obvious that further work is often needed to establish confidence limits and the statistical significance of the results. Unfortunately, the exuberant enthusiasm for neural networks displayed by many authors sometimes tempts them into turning a blind eye to this task.
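One way of attaching confidence limits to such a comparison is a paired bootstrap on the test errors of the two models. The sketch below is illustrative only and is not the procedure used in any of the papers reviewed; the function name, the choice of mean absolute error and the percentile interval are all assumptions made for the example.

```python
import random


def bootstrap_mae_difference(errors_a, errors_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(|errors_a|) - mean(|errors_b|).

    errors_a and errors_b are forecast errors of two models on the SAME test
    cases (paired). Returns (observed difference, lower limit, upper limit).
    """
    if len(errors_a) != len(errors_b) or not errors_a:
        raise ValueError("need two non-empty, equally long error lists")
    rng = random.Random(seed)
    # Work with per-case differences so that the pairing is preserved.
    diffs = [abs(a) - abs(b) for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    means = []
    for _ in range(n_resamples):
        # Resample test cases with replacement and recompute the mean difference.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return sum(diffs) / n, lower, upper
```

If the resulting interval comfortably includes zero, the test set gives no clear grounds for preferring either model, whichever happens to have the lower average error.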

Further evidence of a less than rigorous approach can be seen in a common ruse employed by several authors. This is to attack a highly non-linear problem with neural networks and then perform a token linear regression on the data by way of comparison. Not surprisingly, neural networks seem superior under such circumstances, but a trained statistician is unlikely to be impressed! The reader is urged to examine the relationship between statistics and neural networks much more closely (Ripley, 1992).
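The weakness of a token linear baseline can be seen in a small synthetic example. The data below are invented purely for illustration (a noise-free quadratic relationship) and do not come from any of the papers reviewed. A straight line fitted to the raw variable performs poorly, whereas the same least squares method applied to a sensibly transformed variable fits the data almost exactly; it is the latter kind of baseline against which a neural network ought to be judged.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def rmse(predictions, ys):
    """Root-mean-square error between predictions and observations."""
    return math.sqrt(sum((p - y) ** 2 for p, y in zip(predictions, ys)) / len(ys))


# Synthetic, noise-free non-linear data (an assumption for illustration).
xs = [float(i) for i in range(11)]
ys = [3.0 + 0.5 * x * x for x in xs]

# Token baseline: a straight line in the raw variable.
a1, b1 = fit_line(xs, ys)
token_rmse = rmse([a1 + b1 * x for x in xs], ys)

# Fairer baseline: the same method applied to a transformed variable.
squared = [x * x for x in xs]
a2, b2 = fit_line(squared, ys)
fair_rmse = rmse([a2 + b2 * s for s in squared], ys)

print(f"straight-line RMSE: {token_rmse:.3f}, transformed-variable RMSE: {fair_rmse:.3g}")
```

Here the straight line leaves a root-mean-square error of roughly 4.4, while the transformed regression is exact up to rounding. Beating only the former says very little about the merits of a neural network.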

Useability

Neural networks are preferred as a method of analysis for reasons other than improved numerical accuracy. Many authors report that it is often quicker to build a neural network model than a statistical one, because much of the task of model specification and tuning is automated. Another benefit is that neural networks carry none of the preconceptions which a designer imposes on many algorithmic models by naturally building the model along the lines he or she, from personal experience, believes the system to behave.

On the downside, it has to be conceded that neural networks suffer from their black box nature. This is particularly noticeable when they are compared against techniques such as multinomial logit models. The main problem to be faced is that often the main reason for modelling a system is not necessarily to produce an accurate model but rather to reach some understanding. A good example of this difficulty can be seen in Pant and Balakrishman (1994) where a neural network is compared against a binary-logit model for modelling gap acceptance at an intersection. Although the neural network actually performed better in numerical terms, it is arguably the logit model which is more useful, as it produces a utility function for each parameter and the reader therefore gains a much greater insight into the model. Elasticity testing of the neural network (Dougherty and Joint, 1992; Collins and Evans, 1994) is a possible solution which can give the best of both worlds. Unfortunately it is quite a time consuming and painstaking process; advances in this area are outstripping the neural network development tools currently available.
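The idea behind elasticity testing can be sketched as follows: treat the trained network as a black box, perturb one input by a small proportion, and record the proportional change in the output. The code is a minimal illustration of that idea and not the procedure of the cited authors; the function name, the central-difference form and the step size are assumptions.

```python
def elasticity(model, inputs, index, rel_step=0.01):
    """Approximate elasticity of model output with respect to inputs[index].

    model is any callable taking a list of floats and returning a float
    (for example a trained network wrapped in a function). The elasticity is
    the percentage change in output per percentage change in the input,
    estimated with a central difference of relative size rel_step.
    """
    base_x = inputs[index]
    base_y = model(inputs)
    if base_x == 0 or base_y == 0:
        raise ValueError("elasticity is undefined at a zero input or zero output")
    up = list(inputs)
    down = list(inputs)
    up[index] = base_x * (1 + rel_step)
    down[index] = base_x * (1 - rel_step)
    relative_change_in_output = (model(up) - model(down)) / base_y
    return relative_change_in_output / (2 * rel_step)


def stand_in_model(x):
    """Hypothetical stand-in for a trained network (known elasticities 1.5, -0.5)."""
    return 2.0 * x[0] ** 1.5 * x[1] ** -0.5


for i in range(2):
    print(i, round(elasticity(stand_in_model, [4.0, 9.0], i), 3))
```

Repeating the calculation over a representative set of operating points, rather than a single one, gives a picture of how the sensitivity of the network varies across conditions; it is this repetition which makes the process time consuming.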

From the point of view of the useability of networks, the overwhelming majority of the authors quoted in the paper reported very positively. However, if the neural network being built is to be implemented in an operational context, one needs to consider the difficulties mentioned below regarding implementation and retraining.

Implementation

Neural network applications in transport are reaching the stage where significant amounts of research are expected to be implemented in actual systems. The question of stepping from an abstract computer model to a fully implemented prototype must be addressed. This can be problematic if the work has been developed in a simulation environment. Some commercial simulations offer the possibility of converting neural networks into modules of code; otherwise it may be necessary to code up the networks by hand, which rather undermines the argument for buying a simulation environment. A further possibility is to consider a hardware implementation of a neural network. This is likely to be of interest only when designing a real-time system with heavy computational requirements.

A major problem which those using neural networks have only started to address is the problem of neural networks becoming out of date once installed in the field. Two possible solutions to this have been put forward. One can collect new data at regular intervals and retrain the networks; the main questions are how often this must be done, and what the cost of such an operation is. This will of course depend on how easy it is to measure the data, and whether expensive manual preprocessing is needed to select a balanced sample of all conditions (Collins and Evans, 1994). Alternatively, paradigms such as adaptive resonance theory can continue their training on-line in the field because, unlike other paradigms, they do not “forget” earlier relationships, meaning that a balanced sample is not required. This allows more ad hoc retraining schedules and is more flexible. Unfortunately, ART and its derivatives are designed purely for classification and have only binary outputs. This makes them unsuitable for the many transport related problems which need continuous variables as output, such as traffic flow forecasting or travel time estimates.
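Where periodic retraining is chosen, the selection of a balanced sample need not be entirely manual, provided each observation can be labelled with the condition it represents. The following is a minimal sketch under that assumption; the function, its arguments and the condition labels are hypothetical and are not taken from the cited work.

```python
import random
from collections import defaultdict


def balanced_sample(records, condition_of, per_condition, seed=0):
    """Draw the same number of records from every condition.

    records: list of observations (any type).
    condition_of: function mapping a record to its condition label,
        e.g. "free flow" or "congested" (labels are hypothetical).
    per_condition: number of records to draw for each label.
    """
    groups = defaultdict(list)
    for record in records:
        groups[condition_of(record)].append(record)
    rng = random.Random(seed)
    sample = []
    for label in sorted(groups):
        members = groups[label]
        if len(members) < per_condition:
            # Rare conditions must be collected for longer, not oversampled silently.
            raise ValueError(f"only {len(members)} records for condition {label!r}")
        sample.extend(rng.sample(members, per_condition))
    return sample
```

The error raised for under-represented conditions reflects the practical difficulty noted above: rare conditions are precisely those for which sufficient data are expensive to obtain.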

As well as the problem of becoming out of date, one must also consider what happens to system performance in the case of missing or incorrect data, either during training or after implementation. Although fault tolerance is often quoted as an advantage of using neural networks (Beale and Jackson, 1990) the authors’ own experience of such problems has not been very encouraging (Dougherty and Cobbett, 1994). This is an area ripe for future research: up to this moment very few neural network based systems have actually been implemented “on street”, and thus these issues have often not been examined. This returns us to the problem of finding better and less labour intensive ways of building representative training data sets.

Data sources

A good set of data is an essential requirement for working with neural networks. It is clear that it is much easier to obtain data for some transport applications than others. In some cases, it has not been possible to obtain sufficient quantities of data, and as previously mentioned, simulated data have often been used (Nguyen and Widrow, 1989; Ritchie and Cheu, 1993). It is important to remember that much more data are often required for training than testing; therefore where data are limited it may be better to reserve what real data are available for the testing phase, as this allows more credible results to be produced.

Cost-benefit analysis of using neural networks

Even if neural networks can outperform alternative techniques for particular types of analysis, little attention is paid to the question of whether the benefit in performance is worthwhile. What investment is needed in terms of computing platform (both hardware and software) and staff training to realise the benefits of neural computing? With appropriate tools and skills available, does a neural network solution take more or less effort to achieve than a conventional one? These important issues are rarely considered in the papers reviewed, but are important if interest in neural networks is to be sustained.

For those considering the backpropagation paradigm (which has been used in the vast majority of transport applications), the investment required is quite small. Various backpropagation simulators exist in the public domain, and there are comparatively few parameters of the network for the user to optimise. When considering many other paradigms, the situation is much less clear-cut. The user will probably either have to write code to build the networks by hand (in which case an object-oriented language such as C++ is strongly recommended), or purchase a relatively expensive commercial simulation. The latter has a lot to recommend it, especially if the user wishes to experiment with several different network paradigms.

CONCLUSIONS

Many of the problems that those studying transport systems are attempting to solve are highly non-linear. Data sources are often numerous and complex. Neural networks show great promise as a useful tool for analysing these data, but there is much work still to be done, particularly in regard to model interpretation and validation.

Larger and more comprehensive comparative studies are needed. Not only should neural networks be compared with state-of-the-art statistical techniques, but different paradigms apart from backpropagation should be experimented with more often, so that a clearer picture emerges as to the best techniques to use for different problems. Since backpropagation, learning vector quantisation and adaptive resonance theory have been the most widely used paradigms, and are representative of supervised, reinforcement and self-organising learning, respectively, it is suggested that these three paradigms could be considered a standard tool kit for transport applications of neural networks.

More detailed documentation of how neural network based systems in transport have been built would enable a more methodological approach to emerge. Too much work is based around optimising network configurations and data sets by trial and error, which although a useful technique for prototyping, is not suitable for real implementations where the costs of development must be estimated accurately before work starts.

REFERENCES

Beale R. and Jackson T. (1990) Neural Computing: An Introduction. Adam Hilger, Bristol.

Beastall W. (1989) Recognition of radar signals by neural networks. Proc. 1st IEE Conf. on Artificial Neural Networks, London.

Belgaroui B. and Blossville J. M. (1993) A road traffic application of neural techniques. Recherche Transports Sécurité, English Issue No. 9, pp. 53-65.

Bullock D., Garrett J., Hendrickson C. and Pearce A. (1992) A neural network for image base vehicle detection. Proc. Int. Conf. on Artificial Intelligence Applications in Transportation Engineering, San Buenaventura, CA.

Bullock D., Garrett J. and Hendrickson C. (1993) A neural network of image-based vehicle detection. Transpn. Res.-C 1, 235-247.

Chin S. M., Hwang H. L. and Miaou S. P. (1992) Transportation demand forecasting with a computer-simulated neural network model. Proc. Int. Conf. on Artificial Intelligence Applications in Transportation Engineering, San Buenaventura, CA.

Chin S. M., Hwang H. L. and Pei T. (1994) Using neural network to synthesize origin-destination flow in a traffic circle. Preprints of Transport Research Board Conf., Washington, DC.

Clark S. D., Dougherty M. S. and Kirby H. R. (1993) The use of neural network and time series models for short term forecasting: a comparative study. Proc. PTRC Summer Meeting, Manchester.

Collins A. and Evans A. (1994) Aircraft noise and residential property values, an artificial neural network approach. J. Transp. Econ. Policy 28(2), 175-197.

Crisman J. D. and Webb J. A. (1991) The warp machine on NAVLAB. IEEE Trans. Patt. Anal. Mach. Intell. 13(5), 451-465.

Demuth G. and Springsteen S. (1990) Obstacle avoidance using neural networks. Proc. Symp. on Autonomous Underwater Vehicle Technology, Washington, DC.

Dougherty M. S. and Cobbett M. (1994) Short term inter-urban traffic forecasts using neural networks. Proc. 2nd DRIVE-II Workshop on Short-Term Forecasting, Delft, The Netherlands.

Dougherty M. S. and Joint M. (1992) A behavioural model of driver route choice using neural networks. Proc. Int. Conf. on Artificial Intelligence Applications in Transportation Engineering, San Buenaventura, CA.

Dougherty M. S., Kirby H. R. and Boyle R. D. (1993) The use of neural networks to recognise and predict traffic congestion. Traff. Engng Contr. 34(6), 311-314.

Dougherty M. S., Kirby H. R. and Boyle R. D. (1994) Using neural networks to recognise predict and model traffic. Artificial Intelligence Applications to Traffic Engineering (Bielli, Ambrosino and Boero, Eds). VSP, Utrecht.

Duliba K. A. (1991) Contrasting neural nets with regression in predicting performance in the transportation industry. 24th Int. Conf. on System Sciences, Hawaii.

Faghri A. and Hua J. (1992a) Evaluation of artificial neural network applications in transportation engineering. Transportation Research Record 1358, pp. 71-80.

Faghri A. and Hua J. (1992b) Roadway seasonal classification using neural networks. Proc. Int. Conf. on Artificial Intelligence Applications in Transportation Engineering, San Buenaventura, CA.

Grossberg S. (1976) Adaptive pattern recognition and universal recording: (I) parallel development and coding of neural feature detectors. Biol. Cybernet. 23, 121-134.

Hajek J. and Hurdal B. (1993) Comparison of rule-based and neural network solutions for a structured selection problem. Transportation Research Record 1399, pp. 1-7.

Hartani R., Hayat S., Sellam S., Bouchon-Meunier B. and Gallinari P. (1994) Regulation de trafic de lignes de metro basee sur la logique floue et les reseaux de neurones. Proc. 14th Int. Conf. on Artificial Intelligence, Expert Systems and Natural Language (AI and Transportation Conclave), Paris.

Hecht-Nielsen R. (1989) Theory of the backpropagation neural network. Proc. Int. Joint Conf. on Neural Networks, pp. 593-611. IEEE Press, New York.

Hecht-Nielsen R. (1990) Neurocomputing. Addison-Wesley, Reading, MA.

Heymans B. C., Oneria J. P. and Carriere P. E. (1991) Determining maximum traffic flow using back propagation. Proc. Int. Joint Conf. on Neural Networks, Seattle.

Nguyen D. and Widrow B. (1989) The truck backer-upper: an example of self-learning in neural networks. Int. Joint Conf. on Neural Networks, Washington, DC.

Hopfield J. J. (1982) Neural networks and physical systems with emergent collective computational abilities. Proc. Natl Acad. Sci. 79, 2554-2558.

Hua J. and Faghri A. (1993a) Traffic mark classification using artificial neural networks. Proc. Pacific Rim Conf., Seattle.

Hua J. and Faghri A. (1993b) Dynamic traffic pattern classification using artificial neural networks. Transportation Research Record 1399, pp. 14-19.

Hua J. and Faghri A. (1994) Application of artificial neural networks to IVHS. Preprints of Transport Research Board Conf., Washington, DC.

Hunt J. G. and Lyons G. D. (1993) Modelling dual carriageway lane changing using neural networks. Conf. on Informing Technologies for Construction, Civil Engineering and Transport, Brunel.

Juell P. L., Nygard K. E. and Nagesh K. (1991) Multiple neural networks for selecting and problem solving technique. Proc. Int. Joint Conf. on Neural Networks, Seattle.

Kaseko M. S. and Ritchie S. G. (1993) A neural network-based methodology for pavement crack detection and classification. Transpn. Res.-C 1, 275-291.

Kaseko M. S. and Ritchie S. G. (1992) A neural network-based methodology for automated distress classification of pavement images. Proc. Int. Conf. on Artificial Intelligence Applications in Transportation Engineering, San Buenaventura, CA.

Kikuchi S., Nanda R. and Perincherry V. (1993) A method to estimate trip O-D patterns using a neural network approach. Transp. Plann. Technol. 17, 51-65.

Kirby H. R. and Parker G. B. (1994) The development of traffic and transport applications of artificial intelligence: an overview. Artificial Intelligence Applications to Traffic Engineering (Bielli, Ambrosino and Boero, Eds). VSP, Utrecht.

Kirby H. R., Boyle R. D. and Dougherty M. S. (1993) Recognition of road traffic patterns using neural networks. Conf. on Informing Technologies for Construction, Civil Engineering and Transport, Brunel.

Kohonen T. (1988) Self-Organization and Associative Memory, 2nd Edn. Springer-Verlag, New York.

Kohonen T. et al. (1988) Statistical pattern recognition with neural networks: benchmark studies. Proc. Second Annual IEEE Conf. on Neural Networks.


Kornhauser A. (1991) Neural network approaches for lateral control of autonomous highway vehicles. Proc. Vehicle Navigation and Information Systems Conf., pp. 1143-1151, Dearborn, MI.

Leonard J. A., Kramer M. A. and Ungar L. H. (1992) Using radial basis functions to approximate a function and its error bounds. IEEE Trans. Neur. Networks 3(4), 624-626.

Lo Z. P. and Bavarian B. (1991) A neural piecewise linear classifier for pattern classification. Int. Joint Conf. on Neural Networks.

Lyons G. (1994) Calibration and validation of a neural network driver decision model. Traff. Engng Contr. 36, 10-15.

Lyons G. and Hunt J. (1993) Traffic modelling - a role for neural networks? Proc. Third Int. Conf. on the Application of Artificial Intelligence to Civil and Structural Engineering, Edinburgh, U.K.

Mann R. and Hayhim S. (1991) Application of the self-organising feature map and learning vector quantisation to radar clutter classification. Proc. Int. Conf. on Artificial Neural Networks, Espoo.

Margarita S. (1990) Recognition of European car plates with modular neural networks. Proc. Int. Neural Network Conf., Paris.

Matsuyama Y. (1991) Self-organization via competition, cooperation and categorization applied to extended vehicle routing problems. Proc. Int. Joint Conf. on Neural Networks, Seattle.

Mead W. C., Fisher H. N., Jones R. D., Bisset K. R. and Leopold A. L. (1994) Application of adaptive and neural network computational techniques to traffic volume and classification monitoring. Preprints of Transport Research Board Conf., Washington, DC.

Minsky M. L. and Papert S. S. (1969) Perceptrons. MIT Press, Cambridge, MA.

Nahatsuji T. and Terutoshi K. (1991) Development of a self-organizing traffic control system using neural network models. Transportation Research Record 1324, pp. 131-145.

Nelson P. and Palacharla P. (1993) A neural network model for data fusion in ADVANCE. Proc. Pacific Rim Conf., Seattle.

Neusser S., Hoefflinger B., Nijhuis J., Siggelhow A. and Spaanenburg L. (1991) A case study in car control by neural networks. 24th ISATA Int. Symp. on Automotive Technology and Automation, Florence.

Ohba Y., Midorikaura H. and Iizuha H. (1989) Optimizing problems by neural networks. Technology Report of the Seihei University, No. 48.

Pant P. D., Zhou X., Arudi R. S., Bodocsi A. and Aktan A. E. (1993) Neural-network-based procedure for condition assessment of utility cuts in flexible pavements. Transportation Research Record 1399, pp. 8-13.

Pant P. D. and Balakrishman P. (1994) Neural network for gap acceptance at stop-controlled intersections. J. Transpn. Engng 120(3), 432-446.

Pomerleau D., Gowdy J. and Thorpe C. (1991) Combining artificial neural networks and symbolic processing for autonomous robot guidance. Engng Applic. Artif. Intell. 4(4), 279-285.

Potvin J.-Y. and Shen Y. (1991) A neural network approach to the vehicle dispatching problem. IEEE Int. Conf. on Neural Networks, Singapore.

Rewinski S. (1992) The neural designing in pavement management. Proc. Int. Conf. on Artificial Intelligence Applications in Transportation Engineering, San Buenaventura, CA.

Ripley B. D. (1992) Statistical aspects of neural networks. Invited lecture for Semstat, Sandbjerg, Denmark.

Ritchie S. G. and Cheu R. L. (1993) Simulation of freeway incident detection using artificial neural networks. Transpn. Res.-C 1, 203-217.

Ritchie S. G., Cheu R. L. and Recher W. W. (1992) Freeway incident detection using artificial neural networks. Proc. Int. Conf. on Artificial Intelligence Applications in Transportation Engineering, San Buenaventura, CA.

Stamenkovich M. (1991) An application of artificial neural networks for autonomous ship navigation through a channel. Proc. Vehicle Navigation and Information Systems Conf.

Xiong Y. and Schneider J. B. (1992) Transportation network design using a cumulative genetic algorithm and neural network. Transportation Research Record 1364, pp. 31-44.

Yang H., Akiyama T. and Sasaki T. (1992) A neural network approach to the identification of real time origin-destination flows from traffic counts. Proc. Int. Conf. on Artificial Intelligence Applications in Transportation Engineering, San Buenaventura, CA.

Young H., Kitamura R., Jovanis P., Vaughn K. and Abdel-aty M. (1993) Exploration of route choice behaviour with advanced traveller information using neural network concepts. Transportation 20(2), 199-223.
