Zurawski, R., The Industrial Communication Technology Handbook, 2005: Automotive Technologies

V Applications of Networks and Other Technologies

Automotive Communication Technologies

29 Design of Automotive X-by-Wire Systems

Cédric Wilwert, Nicolas Navet, Ye Qiong Song, and Françoise Simonot-Lion

30 FlexRay Communication Technology

Dietmar Millinger and Roman Nossal

31 The LIN Standard

Antal Rajnák

32 Volcano: Enabling Correctness by Design

Antal Rajnák

Networks in Building Automation

33 The Use of Network Hierarchies in Building Telemetry and Control Applications

Edward Koch

34 EIB: European Installation Bus

Wolfgang Kastner and Georg Neugschwandtner

35 Fundamentals of LonWorks/EIA-709 Networks: ANSI/EIA-709 Protocol Standard (LonTalk)

Dietmar Loy

Manufacturing Message Specification in Industrial Automation

36 The Standard Message Specification for Industrial Automation Systems: ISO 9506 (MMS)

Karlheinz Schwarz

37 Virtual Factory Communication System Using ISO 9506 and Its Application to Networked Factory Machine

Dong-Sung Kim and Zygmunt J. Haas

© 2005 by CRC Press

Motion Control

38 The SERCOS interface™

Scott C. Hibbard, Peter Lutz, and Ronald M. Larsen

Train Communication Network

39 The IEC/IEEE Train Communication Network

Hubert Kirrmann and Pierre A. Zuber

Smart Transducer Interface

40 A Smart Transducer Interface Standard for Sensors and Actuators

Kang Lee

Energy Systems

41 Applying IEC 61375 (Train Communication Network) to Data Communication in Electrical Substations

Hubert Kirrmann

SEMI

42 SEMI Interface and Communication Standards: An Overview and Case Study

A.M. Fong, K.M. Goh, Y.G. Lim, K. Yi, and O. Tin


29 Design of Automotive X-by-Wire Systems

Cédric Wilwert, PSA Peugeot–Citroën

Nicolas Navet, LORIA

Ye Qiong Song, LORIA

Françoise Simonot-Lion, LORIA

29.1 Why X-by-Wire Systems?
Steer-by-Wire System • Brake-by-Wire Systems

29.2 Problem, Context, and Constraints for the Design of X-by-Wire Systems
General Constraints • Dependability Constraints • Real-Time Constraints

29.3 Fault-Tolerant Services for X-by-Wire
Overview on the Communication Services • Main Time-Triggered Protocols for Automotive Industry • Operating Systems and Middleware Services

29.4 Steer-by-Wire Architecture: A Case Study
Functional Description of a Steer-by-Wire System • Dependability and Real-Time Properties • Operational Architecture • Dependability Issues

29.5 Conclusion

References

29.1 Why X-by-Wire Systems?

Embedded electronics, and more precisely embedded software, is a fast-growing area, and software-based systems are increasingly replacing mechanical and hydraulic ones. The reasons for this evolution are technological as well as economic. On the one hand, the cost of hardware components is decreasing while their performance and reliability are increasing. On the other hand, electronic technology facilitates the introduction of new functions whose development would be costly, or not even feasible, using mechanical or hydraulic systems alone. This evolution, formerly confined to functions such as motor control, wipers, lights, or door controls, now affects all car domains, even critical functions such as throttle, brake, or steering control. This trend resulted in the concept of X-by-Wire, in which mechanical or hydraulic systems embedded in an automotive application are replaced by fully electric/electronic ones.

Historically, the first critical X-by-Wire function was Throttle-by-Wire, implemented in a Chevrolet Corvette series in 1980 to replace the cable-based throttle. Today, this function is present in most vehicles, for example, the Peugeot 307. Shift-by-Wire systems, also known as Gear-by-Wire, are also implemented in some high-end vehicles such as the BMW 5 and 7 series.

However, mechanical systems are still necessary in most X-by-Wire systems currently in use, either to work in conjunction with the electronic system or as a backup (e.g., the electronic hydraulic braking system; semiactive suspensions, as in the Mercedes Adaptive Dampfung System; the electronic camshaft in the BMW Valvetronic technology; and the robotized gear box). It is interesting to note that the robotized gear box is an option proposed by all carmakers in the world today.

One of the main obstacles to the general acceptance of X-by-Wire systems is the difficulty of proving that all the necessary safety measures are followed. It is enough to note that a malfunction of a Steer-by-Wire, Brake-by-Wire, or Throttle-by-Wire system would jeopardize the safety of the occupants. As seen before, a number of X-by-Wire systems have already been implemented in certain series of vehicles; however, Steer-by-Wire and Brake-by-Wire systems will always have a mechanical backup. The concern for safety is certainly a major factor.

Another obstacle is that customer demand is currently weak: customers do not perceive the technical advantages and see only a higher price tag. But the advantages of this technology can be very attractive for both carmakers and customers (see Sections 29.1.1 and 29.1.2), which explains why carmakers are investing in this domain.

29.1.1 Steer-by-Wire System

The first advantage lies in the decreased risk of the steering column entering the cockpit in the event of a frontal crash. Furthermore, the variable steering ratio of the Steer-by-Wire system brings remarkably increased comfort to the driver. This function enables the steering ratio between the handwheel and the wheels to adapt to the driving conditions. In parking and urban driving, this ratio should be smaller in order to reduce the amplitude of the handwheel rotation. Another steering facility brought by software-based technology is μ-split braking, which consists of applying an asymmetric torque to the wheels in case of asymmetric adherence. Finally, the steering column is one of the heaviest components of the vehicle, and removing it significantly decreases the weight of the vehicle and thus reduces fuel consumption.

Among the drawbacks, the electrical power needed to actuate the front axle requires the use of 42-V technology. At the Society of Automotive Engineers (SAE) Conference in 2003 [1], it was said that this technology would not be mature before 2010. This announcement has considerably reduced the emphasis put on X-by-Wire developments in general. Furthermore, the safety issues have not yet been fully defined.

29.1.2 Brake-by-Wire Systems

A Brake-by-Wire system implemented with one microcontroller and one actuator per wheel can significantly increase the quality of braking, in particular by reducing the stopping distance. Moreover, this technology provides more precise braking by adapting to the pressure the driver puts on the pedal. As with Steer-by-Wire, removing the hydraulic braking system significantly decreases the weight of the vehicle and therefore significantly lowers costs. Finally, Brake-by-Wire will help to protect the environment because no braking fluid is necessary.

Unfortunately, as with the Steer-by-Wire system, the 42-V problem is a barrier to its present deployment. Another hindrance comes from the attitude of customers, who are faced with a new technology for a critical function that will be more expensive in the beginning and whose benefits they do not clearly see.

The first Brake-by-Wire technology to be introduced was electrohydraulic braking (EHB). The main difference between the EHB system and the classic braking system is that each wheel has an independent braking subsystem: the hydraulic pressure is applied independently on each wheel. However, a classic hydraulic circuit is still implemented from the pedal to the front wheels for safety reasons. The EHB system was factory installed for the first time in 2001 with the Mercedes Roadster SL. Today, Toyota proposes a regenerative EHB in its Prius, which uses the energy dissipated during deceleration to charge the battery.

29.2 Problem, Context, and Constraints for the Design of X-by-Wire Systems

29.2.1 General Constraints

As explained before, some X-by-Wire systems are already standard in certain series. However, while the implementation of Brake-by-Wire and Steer-by-Wire is possible with a 14-V battery in low-weight vehicles, this is not the case for the heavier ones. As the first Brake-by-Wire and Steer-by-Wire systems will be costly in the beginning, they have greater chances of being implemented in high-end vehicles. Consequently, 42-V technology has to be mature before X-by-Wire systems can be mass-produced. Moreover, fuel cell technology, with fully electric energy sources, seems to be well situated for replacing combustion engines. This significantly reduces the necessity of 42-V technology in the long run [1].

In addition, the size of the X-by-Wire systems and their cost are major constraints for carmakers; electronics-based systems already account for 30% of the total cost in current vehicles.

29.2.2 Dependability Constraints

An analysis of the requirements for X-by-Wire systems was published in the conclusions of the X-by-Wire Esprit project [19]. In general, for a critical X-by-Wire system, it must be ensured that [19]:

• A system failure does not lead to a state in which human life, economics, or environment is endangered.

• A single failure of one component does not lead to a failure of the whole X-by-Wire system.

The system shall memorize intermittent failures and shall signal a critical failure to the driver, for example, through a warning light on the dashboard. Moreover, the system must be able to tolerate at least one major critical fault without loss of functionality for a time long enough to reach a safe parking area. This requirement is very constraining: if only two redundant components are used to provide a critical function, then as soon as one of them fails, the driver must immobilize the vehicle. This requirement will have to be weighed against the availability requirements in the future.

In terms of the criticality of the functions involved, automotive X-by-Wire systems can reasonably be compared with Flight-by-Wire systems in avionics. According to [19], the probability of encountering a critical safety failure shall not exceed 5 × 10⁻¹⁰ per hour and per system, but other studies have been carried out with a maximal bound of 10⁻⁹. This quantification can be translated in terms of safety integrity level (SIL) [2], and a maximal bound of 10⁻⁹ corresponds to a SIL4 system. In fact, SIL4 conformance is reached below 10⁻⁸.
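As a rough illustration of how such hourly failure bounds map to integrity levels, the following sketch classifies a failure rate using the IEC 61508 bands for high-demand or continuous operation. The band limits are assumed from that standard rather than taken from this chapter, and the function name is hypothetical.

```python
# Upper bounds (dangerous failures per hour) of the IEC 61508 bands for
# high-demand/continuous mode. Assumption: values come from the standard,
# not from this chapter. A rate strictly below a bound meets that SIL.
SIL_UPPER_BOUNDS = [(4, 1e-8), (3, 1e-7), (2, 1e-6), (1, 1e-5)]


def sil_for_rate(failures_per_hour: float) -> int:
    """Return the highest SIL met by the given rate, or 0 if none is met."""
    for sil, upper in SIL_UPPER_BOUNDS:
        if failures_per_hour < upper:
            return sil
    return 0


print(sil_for_rate(5e-10))  # bound quoted from [19]
print(sil_for_rate(1e-9))   # bound used by other studies
print(sil_for_rate(5e-8))   # a less demanding target, for comparison
```

Both bounds quoted in the text fall in the most demanding band, which is consistent with the remark that SIL4 conformance is reached below 10⁻⁸.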

Up to now, it has been a challenge to reach such dependability because of the lack of experience in the automotive industry with X-by-Wire systems and because of the complexity of the problem. In particular, the environment (electromagnetic interference (EMI), temperature peaks, etc.) may reduce the predictability of the system, and the design is subject to heavy cost and weight constraints. It is likely that, as in the aeronautic industry, for legal and technical reasons, the design process (e.g., software development, methods and tools for validation, etc.) will have to be certified in the future.

Finally, one objective that must be reached is that an X-by-Wire system offers the same availability and the same maintainability as its mechanical/hydraulic counterparts. The challenge is to prove that a given X-by-Wire system adheres to all these requirements.

29.2.3 Real-Time Constraints

Belonging to the chassis domain, Steer-by-Wire and Brake-by-Wire systems are intrinsically real-time distributed systems. They implement complex multivariable control laws and deliver real-time information to intelligent devices that are physically distant (for example, the four wheels). They have to respect stringent time constraints, such as a sampling period of only a few milliseconds. End-to-end response times shall also be bounded; for example, the time between a request from the driver and the response of the physical system must be lower than a few tens of milliseconds (see the Steer-by-Wire example in Section 29.4). An excessive end-to-end response time of a control loop may not only degrade performance but also cause the instability of the vehicle.

Although these constraints may differ according to the driving conditions, they all have to be respected whatever the situation. So, in general, the worst-case scenario must be taken into consideration, and these real-time constraints must be met with a high probability, even if the system is under random perturbations, because of the safety-critical nature of X-by-Wire applications.
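A worst-case end-to-end check of this kind can be sketched as follows. The stage names, latencies, periods, and the 30-ms bound are hypothetical values chosen for illustration, not figures from this chapter. The sketch also assumes the pessimistic case in which the periodic stages are not synchronized, so each may wait one full period before sampling its input.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    wcet_ms: float    # worst-case execution or transmission time
    period_ms: float  # activation period; 0 if triggered directly by the previous stage


def worst_case_latency_ms(chain):
    """Pessimistic bound: each periodic stage may wait one full period
    before picking up its input, then needs its worst-case time to finish."""
    return sum(s.period_ms + s.wcet_ms for s in chain)


# Hypothetical driver-request-to-actuation chain.
chain = [
    Stage("handwheel sensor ECU", 0.5, 2.0),
    Stage("network frame", 0.2, 2.0),
    Stage("front-axle control ECU", 1.0, 2.0),
    Stage("actuator command", 0.3, 0.0),
]
BOUND_MS = 30.0  # assumed stand-in for "a few tens of milliseconds"

latency = worst_case_latency_ms(chain)
print(f"worst case {latency:.1f} ms, bound {BOUND_MS} ms, ok={latency <= BOUND_MS}")
```

In a time-triggered design where task and frame offsets are synchronized, the waiting terms can be much smaller than a full period, so this bound errs on the safe side.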


29.3 Fault-Tolerant Services for X-by-Wire

29.3.1 Overview on the Communication Services

The communication system has to provide services that are pertinent with respect to the dependability objectives of the application (see Section 29.2). For instance, the knowledge of the stations that are operational at a given time is usually needed by X-by-Wire applications, and thus a membership service that furnishes this information will ease the development of application-level software and improve its accuracy.

29.3.1.1 Time-Triggered Communication

Among communication networks, one distinguishes between time-triggered protocols, where activities are driven by the progress of time, and event-triggered protocols, where activities are driven by the occurrence of events. Both types of communication have advantages, but it is generally considered that dependability is much easier to ensure using a time-triggered bus (for instance, refer to [3] for a discussion on this topic). This reference explains that, currently, only time-triggered communication systems are being considered for use in X-by-Wire applications. In this category, multiaccess protocols based on time-division multiple access (TDMA) are particularly well suited; they provide deterministic access to the medium (the order of the transmissions is defined statically at design time and organized in rounds), and thus bounded response times. Moreover, their regular message transmissions can be used as heartbeats for detecting station failures.

The two TDMA-based networks that are candidates for supporting X-by-Wire applications are the Time-Triggered Protocol TTP/C (see Section 29.3.2.1) and FlexRay (see Section 29.3.2.2). At the time of writing, FlexRay, which is backed by the major players of the European automotive industry, seems in a strong position to become the standard in the industry, although the specifications are not yet finalized.

29.3.1.2 Fault-Tolerant Unit

To achieve fault tolerance, that is to say, the capacity of a system to deliver its service even in the presence of faults, certain nodes are replicated and clustered into fault-tolerant units (FTUs). An FTU is a set of several stations that perform the same function, and each node of an FTU possesses its own slot in the round so that the failure of one or more stations in the same FTU can be tolerated. Actually, the role of FTUs is twofold. First, they make the system resilient in the presence of transmission errors (some frames of the FTU may still be correct, while others are corrupted). Second, they provide a means to fight against measurement and computation errors occurring before transmission (some nodes may send the correct values, while others may make errors). The stations forming an FTU will be called replicas from here on. The use of fail-silent nodes (see Section 29.3.1.3) greatly simplifies the design of a fault-tolerant architecture. However, the fail-silent property is generally not easy to verify. The number of replicas per FTU required to tolerate k faulty components depends on the behavior of the individual components [4]. For instance, if the failure of k nodes must be tolerated, then the minimum number of replicated nodes is k + 1 when all nodes are fail-silent.

29.3.1.3 Fail-Silent Node

As previously explained, fail-silent nodes greatly decrease the complexity of the design of a critical application. A node is said to be fail-silent if it behaves in one of the following two ways:

1. It sends frames at the correct point in time (correctness in the time domain) and the correct value is transmitted (correctness in the value domain).

2. It sends detectably incorrect frames (e.g., wrong cyclic redundancy check (CRC)) in its own slot, or no frame at all.

A communication system such as TTP/C is able to provide very reliable support for the timing requirements (which, once fulfilled, provide the so-called fail-silence in the temporal domain), especially through the bus guardian concept, while the value domain is mainly the responsibility of the application.

Refer to [4, 5, 6] for good starting points on the difficult problem of ensuring fail-silence.
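To make the interplay between FTUs and fail-silent nodes concrete, here is a minimal sketch of how a receiver might read one value from an FTU whose replicas are assumed to be fail-silent. The frame layout, the use of CRC-32, and all function names are assumptions made for illustration; they do not describe the frame format of TTP/C or FlexRay.

```python
import zlib


def make_frame(payload: bytes) -> bytes:
    """Append a CRC-32 so that receivers can detect corrupted frames."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check_frame(frame):
    """Return the payload if a frame was received and its CRC matches, else None."""
    if frame is None or len(frame) < 4:
        return None
    payload, crc = frame[:-4], int.from_bytes(frame[-4:], "big")
    return payload if zlib.crc32(payload) == crc else None


def read_ftu(slots):
    """Pick the first valid frame among the slots of the FTU's replicas.

    With fail-silent replicas, any frame that passes the check is assumed
    to carry a correct value, so one surviving replica is enough.
    """
    for frame in slots:
        payload = check_frame(frame)
        if payload is not None:
            return payload
    return None  # every replica failed: outside the fault hypothesis


good = make_frame(b"\x01\x2c")
corrupted = bytes([good[0] ^ 0xFF]) + good[1:]  # bit errors, CRC no longer matches

# k = 2 failures (one silent slot, one corrupted frame) with k + 1 = 3 replicas.
print(read_ftu([None, corrupted, good]))
```

If the replicas were not fail-silent, a frame could pass the CRC check and still carry a wrong value, and picking any valid frame would no longer be sufficient.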


29.3.1.4 Bus Guardian

When communications are multiplexed, a faulty electronic control unit (ECU) that transmits outside its specification, for instance, at the wrong time or with a larger frame, may perturb the correct functioning of the whole network. One well-known manifestation is the so-called babbling idiot, a node that transmits continuously. To avoid this situation, a component called the bus guardian restricts the controller’s ability to transmit by allowing transmission only when the node exhibits a specified behavior. Ideally, the bus guardian should have its own copy of the schedule, be physically separated from the controller, possess its own power supply, and be able to construct the global time itself. Due to the strong pressure from the automotive industry concerning costs, these assumptions are not fulfilled in general, which reduces the efficiency of the bus guardian strategy.
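The gating role of a bus guardian on a TDMA bus can be sketched as follows. The class and field names, the slot lengths, and the microsecond time base are hypothetical. The sketch also assumes that the guardian is handed a trustworthy global time, which is one of the idealized conditions listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    owner: str
    start_us: int
    length_us: int


class BusGuardian:
    """Holds its own copy of the TDMA schedule and opens the transmit gate
    only during the slots owned by the guarded node."""

    def __init__(self, node, schedule, round_us):
        self.windows = [(s.start_us, s.start_us + s.length_us)
                        for s in schedule if s.owner == node]
        self.round_us = round_us

    def may_transmit(self, global_time_us, frame_duration_us):
        t = global_time_us % self.round_us  # position inside the current round
        return any(start <= t and t + frame_duration_us <= end
                   for start, end in self.windows)


schedule = [Slot("A", 0, 100), Slot("B", 100, 100), Slot("C", 200, 100)]
guardian_b = BusGuardian("B", schedule, round_us=300)

print(guardian_b.may_transmit(410, 80))  # inside B's slot of the second round
print(guardian_b.may_transmit(250, 80))  # wrong time: this is C's slot
print(guardian_b.may_transmit(110, 95))  # frame too long for the slot
```

A babbling node guarded this way can still corrupt its own slot, but it cannot disturb the slots of the other stations.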

29.3.2 Main Time-Triggered Protocols for Automotive Industry

29.3.2.1 TTP/C

TTP/C (Time-Triggered Protocol), which is defined in [7], is a central part of the Time-Triggered Architecture (TTA) [8], and it possesses numerous features and services related to dependability, such as the bus guardian [5], the group membership algorithm [9], and support for mode changes [10]. TTA and TTP/C have been designed and extensively studied at the Vienna University of Technology. Hardware implementations of TTP/C, as well as software tools for the design of the application, are commercialized by the TTTech company [11] and available today.

On a TTP/C network, the transmission support is replicated and each channel transports its own copy of the same message. Although electromagnetic interference (EMI) is likely to affect both channels in quite a similar manner, the redundancy provides some resilience to transmission errors (see Section 29.4.4.4). TTP/C can be implemented with a bus topology or a star topology. The latter topology provides better fault tolerance since the star can act as a central bus guardian and protect against errors that cannot be avoided by a local bus guardian, such as spatial proximity faults. For instance, a star topology is more resilient to spatial proximity faults and faults due to a desynchronization of an ECU (see Section 29.3.1.4). To avoid a single point of failure, a dual star topology should be used, but with the drawback that the length of the cables is significantly increased.

At the medium access control (MAC) level, TTP/C implements the synchronous TDMA scheme: the stations (or nodes) have access to the bus in a strict deterministic sequential order and each station possesses the bus for a constant period called a slot, during which it has to transmit one frame. The sequence of slots such that all stations have accessed the bus one time is called a TDMA round. An example of a round is shown in Figure 29.6. The size of the slot is not necessarily identical for all stations in the TDMA round, but successive slots belonging to the same station are of the same size. Consecutive TDMA rounds may differ by the data transmitted during the slots, and the sequence of all TDMA rounds is the cluster cycle, which repeats itself in a cycle.

TTP/C includes powerful but complex algorithms for easing and speeding up the design of fault-tolerant applications, and some of them have been formally verified (for instance, [9] and [12]). In particular, TTP/C implements a clique avoidance algorithm and a membership algorithm that also provides data acknowledgment. The fault hypothesis used for the design of TTP/C is well specified and also quite restrictive (two successive faults must occur at least two rounds apart). Situations outside the fault hypothesis are treated using “never give up” (NGU) strategies [3], which aim to continue operation in a degraded mode. For example, a usual method is that each node switches to local control according to the information still available, while trying to return to the normal mode.

29.3.2.2 FlexRay

A consortium of major companies from the automotive field is currently developing the FlexRay protocol. The core members are BMW, Bosch, DaimlerChrysler, General Motors, Motorola, Philips, and Volkswagen. The specifications of the FlexRay protocol are not publicly available or finalized at the time of writing; however, material describing the protocol is available on the FlexRay Web site [13].


The FlexRay network is very flexible with regard to topology and transmission support redundancy. It can be configured as a bus, a star, or multistars, and it is not mandatory that each station possess replicated channels, even though this should be the case for X-by-Wire functions.

At the MAC level, FlexRay defines a communication cycle as the concatenation of a time-triggered (or static) window and an event-triggered (or dynamic) window. In each communication window, whose size is set statically at design time, a different protocol is applied. The communication cycles are executed periodically. The time-triggered window uses a TDMA MAC protocol; the main difference with TTP/C is that a station might possess several slots in the time-triggered window, but the size of all slots is identical (Figure 29.1). In the event-triggered part of the communication cycle, the protocol is FTDMA (flexible time-division multiple access): the time is divided into so-called mini-slots; each station possesses a given number of mini-slots (not necessarily consecutive), and it can start the transmission of a frame inside each of its own mini-slots. A mini-slot remains idle if the station has nothing to transmit. An example of a dynamic window is shown in Figure 29.2: on channel B, frame n started to be transmitted in the mini-slot n while mini-slot n + 1 has not been used. It is noteworthy that the frame n + 4 is not received simultaneously on channels A and B since, in the dynamic window, transmissions are independent on both channels. The FlexRay MAC protocol is much more flexible than the TTP/C MAC since, in the static window, nodes are assigned as many slots as necessary (up to 4095 for each node) and since frames are only transmitted if necessary in the dynamic part of the communication cycle. In contrast to TTP/C, the structure of the communication cycle is not statically stored in the nodes; instead, it is discovered during the start-up phase. However, unlike TTP/C, mode changes with a different communication schedule for each mode are not possible.
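The two parts of the communication cycle can be illustrated with a small simulation. This is a simplified sketch under stated assumptions, not the FlexRay specification: frame lengths are expressed in whole mini-slots, a frame that no longer fits in the window is simply skipped, only one channel is modeled, and all names are hypothetical.

```python
def static_segment(owners):
    """Static window: equal-size slots numbered from 1; a node may own several."""
    return list(enumerate(owners, start=1))


def run_dynamic_segment(pending, n_minislots, first_id=1):
    """Simulate one channel of the dynamic window (FTDMA).

    pending maps a frame ID to its length in mini-slots. The slot counter
    advances after each idle mini-slot, or once the frame sent for the
    current ID has finished. Returns a list of (frame_id, start_minislot).
    """
    sent, minislot, frame_id = [], 0, first_id
    while minislot < n_minislots:
        length = pending.get(frame_id)
        if length is not None and minislot + length <= n_minislots:
            sent.append((frame_id, minislot))
            minislot += length  # the frame stretches over several mini-slots
        else:
            minislot += 1       # nothing to send, or no room left: idle mini-slot
        frame_id += 1
    return sent


print(static_segment(["A", "B", "A", "C", "D"]))
print(run_dynamic_segment({1: 3, 3: 2, 4: 4}, n_minislots=8))
```

In the second call, frame 4 is not sent because the window ends before it could complete, which shows why low-priority (high-ID) frames may be delayed to a later cycle when the dynamic window is heavily loaded.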

From the dependability point of view, none of the services and functionalities of FlexRay, except the bus guardian and the clock synchronization, is currently well documented, nor is the fault hypothesis used for the design. However, it seems that most features will have to be implemented in software or hardware layers on top of FlexRay, with the drawback that efficient implementations might be more difficult to achieve.

FIGURE 29.1 Example of a FlexRay communication cycle with four nodes: A, B, C, and D. [Figure: a TDMA static segment made of static slots owned by nodes A, B, C, and D, followed by an FTDMA dynamic segment made of mini-slots.]

FIGURE 29.2 Example of message scheduling in the dynamic segment of the FlexRay communication cycle. [Figure: frames with IDs n, n + 1, n + 2, n + 4, and n + 5 on Channel 1 and Channel 2, laid out against the mini-slots and the slot counter of each channel.]

29.3.2.3 TTCAN

TTCAN (time-triggered Controller Area Network) [14] is a communication protocol developed by Robert Bosch GmbH on the basis of the CAN physical and data link layers. Time-triggered communication is built upon the standard CAN protocol, but the controllers must be able to disable automatic retransmission and provide the application with the time at which the first bit of a frame was sent or received [15].

The bus topology of the network, the characteristics of the transmission support, the frame format, as well as the maximum data rate (1 Mbit/s) are imposed by the CAN protocol. Channel redundancy is possible, but not standardized, and no bus guardian is implemented in the node. A key idea is to propose, as with FlexRay, a flexible time-triggered/event-triggered protocol. As illustrated in Figure 29.3, TTCAN defines a basic cycle (the equivalent of the FlexRay communication cycle) as the concatenation of one or several time-triggered (or exclusive) windows and one event-triggered (or arbitrating) window. Exclusive windows are devoted to time-triggered transmissions (i.e., periodic messages), while the arbitrating window is ruled by the standard CAN protocol: transmissions are dynamic and bus access is granted according to the priority of the frames. Several basic cycles that differ by their organization in exclusive and arbitrating windows and by the messages sent inside exclusive windows can be defined. The list of successive basic cycles is called the system matrix and the matrix is executed in loops. Interestingly, the protocol enables the master node, the node that initiates the basic cycle through the transmission of the reference message, to stop the functioning in TTCAN mode and to resume in standard CAN. Later, the master node can switch back to the TTCAN mode by sending a reference message.

TTCAN is built on CAN, a well-mastered and cheap technology, but, as defined by the standard, it does not provide important dependability services such as the bus guardian, membership service, and reliable acknowledgment. It is, of course, possible to implement some of these mechanisms at the application or middleware level, but with a reduced efficiency. It seems that carmakers may consider the use of TTCAN for some systems during a transition period until the FlexRay technology is fully mature.
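The notions of basic cycle and system matrix can be sketched as a small data structure and a loop over it. The message names, identifiers, and window layout below are hypothetical. The only CAN rule relied upon is that, during arbitration, the pending frame with the lowest identifier wins the bus.

```python
from itertools import cycle, islice

# Each basic cycle starts with the master's reference message, followed by
# windows: ("excl", msg) is an exclusive window, ("arb", None) an arbitrating one.
SYSTEM_MATRIX = [
    [("excl", "steer_angle"), ("arb", None), ("excl", "wheel_speed")],
    [("excl", "steer_angle"), ("excl", "diagnostics"), ("arb", None)],
]


def run_window(kind, msg, queue):
    """Return what is sent in one window; queue holds pending CAN identifiers."""
    if kind == "excl":
        return msg                 # statically assigned periodic message
    if queue:
        winner = min(queue)        # CAN arbitration: lowest identifier wins
        queue.remove(winner)
        return f"id=0x{winner:X}"
    return None                    # the arbitrating window stays empty


def run(n_cycles, queue):
    """Execute n_cycles basic cycles, looping over the system matrix."""
    trace = []
    for basic_cycle in islice(cycle(SYSTEM_MATRIX), n_cycles):
        trace.append("REF")        # reference message sent by the master node
        trace += [run_window(kind, msg, queue) for kind, msg in basic_cycle]
    return trace


print(run(3, queue=[0x2A0, 0x120]))
```

The trace shows the periodic messages keeping their fixed positions from one loop of the matrix to the next, while the event-triggered frames are served only in the arbitrating windows, in priority order.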

FIGURE 29.3 Example of a TTCAN basic cycle. [Figure: a basic cycle opened by the reference message transmitted by the master node, followed by exclusive windows (TDMA time windows for messages), an arbitration window (standard CAN arbitration), and a free window; the next reference message starts the following basic cycle.]

29.3.3 Operating Systems and Middleware Services

In the context of automotive applications, middleware is a software layer located above the platform (hardware, operating system, protocols) that aims to offer high-level services to the application in order to reduce the time to market and improve the overall quality of the system. The main purpose of middleware is to hide the distribution of the functions and the heterogeneity inside the platform (ECU, network, CPU, OS, etc.). Another interest of middleware is to provide high-level services, and for X-by-Wire applications, services related to dependability are needed.

Several projects aimed at the development of automotive middleware layers have been undertaken (EAST, www.east-eea.net; AUTOSAR, http://www.autosar.org/). To the best of our knowledge, the only results publicly available have been produced in the context of the OSEK/VDX consortium (detailed information can be obtained at http://www.osek-vdx.org), which is a project of the automotive industry whose objective is to build a standard architecture for in-vehicle control units. Among the results of the OSEK/VDX group, two are of particular interest for X-by-Wire: the OSEKTime operating system and the fault-tolerant communication layer.

29.3.3.1 OSEKTime

OSEKTime OS (OSEK/VDX time-triggered operating system) [16] is a small operating system designed to offer the basic services needed by time-triggered applications. In particular, OSEKTime OS offers services for task management, interrupt processing, and error handling.

An offline-generated dispatcher table, termed a dispatcher round, activates the tasks in a determined order and repeats itself as long as the system is running. Several different dispatcher tables, corresponding, for instance, to different functioning modes of the system, can be defined, but the switching from one table to the next can only take place at the end of a round.

Tasks cannot be blocked waiting for an external event, but they can be preempted; a running task will always be preempted by a task that is activated, and the designer must take resource contention into account in this case. An interesting feature of OSEKTime OS is the deadline monitoring that can be performed for some specified tasks: when such a task is still not finished at its deadline, a specific application error-handling routine is invoked and the operating system is reinitialized. In OSEKTime OS, the rate at which interrupts can occur is bounded, at a rate specified when the system is configured, in order to keep the system predictable.

As for communication cycles in time-triggered networks, the configuration of the dispatcher round can be done offline through a software tool that will ensure the correctness of the system.
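The dispatcher round and the end-of-round mode switch can be sketched as follows. This is an illustrative model only, not the OSEKTime API: the task names, timings, and function names are hypothetical, and the offline check ignores preemption by assuming that each task runs without interference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    offset_ms: int    # activation time inside the round
    task: str
    deadline_ms: int  # deadline, measured from the start of the round


def check_table(table, round_ms, wcet_ms):
    """Simplified offline check: each task, run alone, must finish before
    its deadline, and every deadline must lie inside the round."""
    for e in table:
        finish = e.offset_ms + wcet_ms[e.task]
        if finish > e.deadline_ms or e.deadline_ms > round_ms:
            raise ValueError(f"{e.task} cannot meet its deadline")


def run(tables, mode, requests, round_ms, n_rounds):
    """requests maps a round index to the mode requested during that round;
    the switch only takes effect once the round has completed."""
    trace = []
    for r in range(n_rounds):
        for e in sorted(tables[mode], key=lambda e: e.offset_ms):
            trace.append((r * round_ms + e.offset_ms, e.task))
        mode = requests.get(r, mode)
    return trace


TABLES = {
    "normal": [Entry(0, "read_sensors", 2), Entry(2, "control_law", 6),
               Entry(6, "send_command", 8)],
    "degraded": [Entry(0, "read_sensors", 2), Entry(2, "local_control", 8)],
}
WCET_MS = {"read_sensors": 1, "control_law": 3, "send_command": 1, "local_control": 4}

for table in TABLES.values():
    check_table(table, round_ms=10, wcet_ms=WCET_MS)
print(run(TABLES, "normal", requests={0: "degraded"}, round_ms=10, n_rounds=2))
```

Although the mode change is requested during the first round, the tasks of that round are all activated from the original table, and the new table is used only from the second round onward.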

29.3.3.2 FTCom

OSEK/VDX FTCom (fault-tolerant communication) [17] is a proposal for a software layer that provides services for facilitating the development of fault-tolerant applications on top of time-triggered networks. One important function of FTCom, with respect to X-by-Wire, is to manage the redundancy of data needed for achieving fault tolerance (see Section 29.3.1.2). From an implementation point of view, it is sometimes preferable to present only one copy of data to the application in order to simplify the application code and to keep it independent from the level of redundancy (i.e., the number of nodes composing an FTU). In OSEK/VDX terminology, the algorithm responsible for the choice of the value that will be transmitted to the application is termed the agreement algorithm. Many agreement strategies are possible: pick-any (fail-silent node), average value, pick-a-particular-one, majority vote, etc. OSEK FTCom provides a generic way of specifying the agreement strategy of replicated data.

Two other important services of FTCom are (1) management of the packing and unpacking of messages [18], which is needed if the use of network bandwidth has to be optimized, and (2) provision of message-filtering mechanisms for passing only significant data to the application.

29.4 Steer-by-Wire Architecture: A Case Study

A Steer-by-Wire system aims to provide two main services: controlling the wheel direction according to the driver’s request and providing a mechanical-like force feedback to the handwheel. In this section, we present the functional point of view of such a system, the real-time and dependability properties that have to be observed, and a realistic operational Steer-by-Wire system used as a reference architecture for evaluation purposes. Finally, we will focus on the real-time requirements, and after proposing a way to model failures, we will show how each component or subsystem of the reference architecture can reach the dependability objective.

29.4.1 Functional Description of a Steer-by-Wire System

In a Steer-by-Wire system, two main services have to be provided: the front-axle actuation and the handwheel force feedback. So, from a functional point of view, this implies two main functions that are not completely independent. However, in the following discussion, in order to simplify their description, we will not take into account the interdependencies between these functions.

29.4.1.1 Front-Axle Control

This function computes the orders that are given to the motor of the front axle, mainly according to the state of this front axle and the commands given by the driver through the handwheel. The driver’s requests are translated through:

• Handwheel angle
• Handwheel torque
• Handwheel speed

29.4.1.2 Handwheel Force Feedback

This function computes the order that will be given to the handwheel motor, in particular according to:

• Speed of the vehicle
• Front-axle position
• Front tie rod force

The elaboration of these orders requires the execution of filtering algorithms and complex control laws under stringent sampling periods of a few milliseconds. The main property to ensure is that the end-to-end response time between a new command from the driver and the effect on the front axle is bounded.

29.4.2 Dependability and Real-Time Properties

As stated in Section 29.2.2, a Steer-by-Wire system must comply with safety integrity level 4 (SIL4) [2]. This means that the system shall be able to tolerate a single failure and to ensure that the probability of encountering a safety-critical failure does not exceed 10⁻⁹ per hour. An important issue is to formally determine the relation between the distributed system that supports the steering control function and the failure occurrences at the steering system level. Section 29.4.4.5 aims to propose a few solutions to this problem.

Besides these dependability constraints, any steering system has to ensure certain performances. Specifically, let us consider the front-axle control function. According to the vehicle technical requirements stated in [19], whatever the system is (i.e., mechanical/hydraulic or by-wire), the maximum angles of the front wheels should be at least ±40° (±90° for upcoming systems). This leads to a specific performance property imposing the ability to control the steering with a velocity of at least 40° per second (90° per second for upcoming systems).

Furthermore, some real-time properties are derived from control requirements. In particular, some control laws implemented in the case study required a sampling period equal to 2 ms (i.e., 500 sampled wheel angle values per second). Each sample was treated through filtering and control algorithms and led to a value that is used by an actuator in order to reach the desired steering position. Obviously, a delay (named end-to-end response time) that cannot be neglected exists between one sampling and its corresponding reaction on the actuator. This delay is mainly due to the execution of the algorithm on the ECU and to the transfer of the sampled value through the communication systems. In order to guarantee vehicle stability, this delay has to be less than a given bound that depends on the type of vehicle as well as on the driving condition (velocity, wheel angle, etc.). It is the carmakers’ responsibility to be able to compute the limit of the tolerated delay for any given situation the vehicle may be in.

Notice that the occasional absence of samples or out-of-bound delays at the controller or actuator level, for instance, due to frame loss, does not necessarily lead to vehicle instability, but degrades steering performance (or quality of service). This is because most of the control laws that are used are designed with specific mechanisms that compensate for delays and for missing sampling data, thus tolerating perturbations under a given threshold. In Section 29.4.4.5.1, we will show an example of how to evaluate such a threshold value using a Matlab/Simulink model of the system.



29.4.3 Operational Architecture

An operational architecture, which is a solution for the implementation of the functions presented in Section 29.4.1, is described in this section. Figure 29.4 illustrates the hardware architecture on which the operational one is based. It includes four electronic control units (microcontrollers): HW ECU1 (handwheel ECU1), HW ECU2 (handwheel ECU2), FAA ECU1 (front-axle actuator ECU1), and FAA ECU2 (front-axle actuator ECU2). Each node is connected to the two TDMA-based communication channels (BUS1 and BUS2). Three sensors (as1, as2, and as3) placed near the handwheel measure the requests of the driver in a similar way, the latter being translated into a 3-tuple: <handwheel angle, handwheel torque, handwheel speed>. Three other sensors (rps1, rps2, and rps3) are dedicated to the measurement of the front-axle position. Finally, two motors (FAA Motor1 and FAA Motor2), configured in active redundancy, act on the front axle, while two other motors (HW Motor1 and HW Motor2) realize the force feedback control on the handwheel. Sensors as1, as2, and as3 are connected by point-to-point links to both HW ECU1 and HW ECU2; likewise, sensors rps1, rps2, and rps3 are connected by point-to-point links to both FAA ECU1 and FAA ECU2.

29.4.3.1 Implementation of the Front-Axle Control Function

The requests of the driver are measured by the three replicated sensors as1, as2, and as3 and sent to both HW ECU1 and HW ECU2. Each ECU performs a majority vote on the three received values and transmits secure data on both communication channels BUS1 and BUS2. The two ECUs, FAA ECU1 and FAA ECU2, placed behind the front axle, consume these data, as well as the last wheel position, in order to elaborate the commands that are to be applied to FAA Motor1 and FAA Motor2.

FIGURE 29.4 Steer-by-Wire operational architecture. (Diagram: handwheel sensors as1, as2, and as3 linked point to point to HW ECU1 and HW ECU2; rotor position sensors rps1, rps2, and rps3 linked point to point to FAA ECU1 and FAA ECU2; HW Motor 1 and HW Motor 2 are the handwheel actuators for force feedback; FAA Motor 1 and FAA Motor 2 are the front-axle actuators; the ECU nodes are connected by the TDMA network.)

29.4.3.2 Implementation of the Force Feedback Control Function

In a way similar to the previous function, measurements taken by rps1, rps2, and rps3 are transmitted to both FAA ECU1 and FAA ECU2. Each of these ECUs elaborates information transmitted on the network. The consumers of this information are both HW ECU1 and HW ECU2, which compute the command transmitted to HW Motor1 and HW Motor2.

The replication of algorithms on several ECUs, of measurements from several similar sensors, and of information transmission over redundant buses is used extensively in this operational architecture. The choices made in terms of redundancy and diversification (see Section 29.4.4.3) are constrained by dependability, cost, and dimension requirements. Alternative Steer-by-Wire architectures presented in the literature (e.g., [19]) use, in addition, two central ECUs located between the handwheel and the front axle, but from an economic point of view, it is of course preferable to use only four ECUs if dependability criteria are met.

29.4.4 Dependability Issues

In this section, after recalling which failures are taken into account, we justify choices that are made for the specification of the hardware architecture.

29.4.4.1 Failure Model

The terms fault, error, and failure are commonly used in system engineering and are clearly defined in [20]. Let us consider that a system has to deliver a service. A system failure is an event that occurs when the delivered service deviates from the expected one. An error is the part of the state of the system that may cause a subsequent failure, and a fault is the adjudged cause of an error. A fault is active when it produces an error; otherwise, it is dormant. Note that if we consider that a system is composed of components, we can observe possible causal relations between the failure of one or several components and the failure of the system (Figure 29.5).

Usually, two classes of faults can be distinguished according to their effects inside a system: Byzantine faults and coherent faults. These faults are caused by a failure of one or several components of the system. A Byzantine fault is a fault whose effect can be perceived differently by different observers. The effect of a coherent fault is seen the same by all observers. Moreover, the property of fail-silence, assumed for some components, leads to a third class of fault. A component is said to be fail-silent if, each time it is not silent, we can conclude that it is functioning properly. Note that at a system level, the silence of this class of component is seen as a fault when it occurs.

A second classification of faults relies on the duration of a fault. In this case, we consider two types of faults according to their effect on the whole system. In our context, a transient fault is a fault whose duration is such that the system does not reach a “not safe state.” A permanent fault is a nontransient one. The main issue is to evaluate the delay after which a transient fault becomes a permanent one. In Section 29.4.4.5.1, we present a method for the evaluation of the worst tolerable delay.

Following the approach proposed in [21], we define the system dependability requirement by a triplet FM(b, c, f), named flexible failure model, where b is the maximum number of Byzantine failing sources, c the maximum number of coherent failing sources, and f the maximum number of fail-silent sources that the system must be able to tolerate. In this case study, we consider a failure model defined by FM(1, 1, 1): the system must always tolerate, at a given time, one Byzantine fault or one coherent fault or a fault due to one fail-silence of a component.

FIGURE 29.5 Failure propagation. (Diagram: the chain fault, error, failure at the component level feeds the chain fault, error, failure at the system level.)


29.4.4.2 Operational Architecture vs. Dependability Requirements

In this section, we give some rules that we applied for designing the architecture. The system under study has to provide two main services: control of the front axle according to the driver requests and furnishing of a force feedback to the driver. We focus on the former. A similar rationale can be used for the second one. In both cases, the main question is the evaluation of the minimum number of redundant components that contribute to meet the dependability requirement.

Dependability analyses are generally based on a strong hypothesis assuming that, in the whole system, n simultaneous component failures can never occur for any set of redundant components (set of redundant handwheel sensors, set of redundant handwheel ECU, etc.). In this case study, we suppose that n is equal to 1.

Note that in [22], Lamport et al. state that 3n + 1 redundant components are necessary to tolerate n Byzantine faults. In order to tolerate n coherent faults, it is sufficient to have 2n + 1 redundant components.
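The two counting rules quoted above can be written down directly. The following is a minimal sketch with a hypothetical helper name, `replicas_needed`. The `fail_silent` rule (n + 1 replicas) is an assumption added here for comparison; it generalizes the two-ECU solution used later in this case study and is not stated in [22].

```python
def replicas_needed(n_faults, fault_class):
    """Number of redundant components for n_faults simultaneous faults."""
    rules = {
        "byzantine": lambda n: 3 * n + 1,    # rule attributed to Lamport et al. [22]
        "coherent": lambda n: 2 * n + 1,     # rule quoted in the text
        "fail_silent": lambda n: n + 1,      # assumption: one survivor is enough
    }
    if n_faults < 0:
        raise ValueError("n_faults must be non-negative")
    return rules[fault_class](n_faults)


for fault_class in ("byzantine", "coherent", "fail_silent"):
    print(fault_class, replicas_needed(1, fault_class))
```

With n = 1, as assumed in this case study, the Byzantine rule asks for four components and the coherent rule for three, which is why the fail-silence assumption matters so much for cost.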

29.4.4.2.1 ECU Redundancy

Two functions need to be implemented in ECU: front-axle control and force feedback control. To avoid costly and numerous wires, ECUs have to be placed close to the sensors, and communication between ECUs has to be multiplexed. A handwheel ECU (a front-axle ECU, respectively) is a consumer of information sampled from the handwheel (front axle, respectively) and is a producer of information used by the front-axle control function (force feedback control function, respectively) implemented in a front-axle ECU (handwheel ECU, respectively). According to the rule given by Lamport, the minimum number of redundant handwheel ECUs (front-axle ECUs, respectively) should be four. This solution is mainly used in the aeronautic domain. But automotive requirements are completely different in terms of cost and space. Therefore, a classical solution is to use fail-silent ECUs. In this case, obviously, only two handwheel ECUs (front-axle ECUs, respectively) are necessary. However, we have to ensure the fail-silence property. To do this, several techniques based on the Petri net analysis [23], C model simulation [24], or fault injection [25] are used.

29.4.4.2.2 Handwheel Sensor Redundancy

A handwheel sensor produces information for two handwheel ECUs. Three handwheel sensors are necessary for ensuring that each handwheel ECU, assumed to provide a voting algorithm, is able to tolerate one Byzantine fault (and subsequently one coherent fault or one fail-silent sensor).
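As an illustration of what such a voting algorithm might look like, the sketch below implements three of the agreement strategies named in Section 29.3.3.2 for replicated readings. It is not the OSEK FTCom interface: the function names and the `tolerance` parameter are hypothetical, and the tolerance is introduced only because replicated sensors rarely return bit-identical values.

```python
from statistics import median


def pick_any(values):
    """Fail-silent sources: any value that is present is assumed correct."""
    for v in values:
        if v is not None:
            return v
    return None


def majority_vote(values, tolerance):
    """Mean of the largest group of readings within `tolerance` of one reading.

    Returns None when that group is not a strict majority of the readings.
    """
    best = []
    for v in values:
        group = [w for w in values if abs(w - v) <= tolerance]
        if len(group) > len(best):
            best = group
    if 2 * len(best) > len(values):
        return sum(best) / len(best)
    return None


def mid_value(values):
    """Median selection: with three readings, one outlier cannot win."""
    return median(values)


readings = [10.0, 10.2, 57.0]            # third sensor is faulty
print(majority_vote(readings, 0.5), mid_value(readings))
```

With three readings, both `majority_vote` and `mid_value` mask a single arbitrary value, whereas `pick_any` is only safe when the sources are fail-silent.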

29.4.4.2.3 Actuator Redundancy

Actuators are mechanical components without any calculating ability, and a single actuator can take charge of piloting the front axle. Furthermore, according to the inherent reliability properties of these actuators, we guarantee that an actuator can never wrongly apply an order received by a front-axle ECU. Under these assumptions and taking into account the formerly stated fail-silence property of a front-axle ECU, only two couples <front-axle ECU, actuator> are necessary for the tolerance of, at most, one fault.

29.4.4.3 Redundancy and Diversification

According to the dependability requirements, presented in Section 29.4.2, and to the assumption made on the ECUs (fail-silence in our case), and according to the failure occurrence model (Section 29.3.1.3), a certain level of redundancy has to be implemented. If the chosen fault tolerance strategy is failure recovery, redundant ECUs will work only in the case of the primary ECU failing. In this case, failure detection must be quick and reliable. Otherwise, if the strategy is failure compensation, as with the front-axle motors, redundant ECUs will be placed in parallel and work simultaneously. Because of the stringent real-time constraint, our architecture must provide failure compensation.

It is worth noting that the redundancy of identical ECUs does not protect the architecture against common mode failures: the hardware of redundant ECUs should be furnished by different suppliers and their software realized by different teams. The Ariane 501 explosion is a good example to show the importance of diversification. Both backup and active inertial reference systems failed for the same reason [26]. If software and hardware had been diversified, one of the two inertial reference systems would have remained safe. But for cost and maintenance reasons, it is not always possible to implement diversified components and technologies.

29.4.4.4 Configuration of the Communication Protocol

Communication is driven by a TDMA-based protocol with two replicated channels. More precisely, the network that is used in this case study is TTP/C because of the availability of the protocol specification and the components. However, the same analysis is valid for any time-triggered protocol such as TTCAN and FlexRay.

For reliability reasons, the same frame is transmitted on the two replicated channels. In order to avoid common mode failures (EMI, temperature, etc.), channels should be placed as far as possible from each other in the vehicle.

29.4.4.4.1 Slot Allocation Strategy to Maximize the Robustness of the Transmission

In TTP/C, the transmission order inside a round can be freely chosen by the application designer. Among the criteria for constructing the TDMA round, applicative constraints like computation time and sampling rates can be taken into account. But as shown in [27, 28], the robustness of a TDMA-based system against transmission errors heavily depends on the location of the slots inside the round.

In automotive systems, one observes that transmission errors are highly correlated: perturbations occur that corrupt several consecutive frames (so-called bursts of errors). Should two frames that belong to the same FTU (see Section 29.3.1.2) be transmitted just one after the other, then a single perturbation could corrupt both frames. The objective to pursue depends on the status of the FTU with regard to the concept of fail-silence (see Section 29.3.1.3). For FTUs composed of a set of fail-silent nodes, the successful transmission of one single frame for the whole set of replicas is sufficient since the value carried by the frame is necessarily correct. In this case, the objective to achieve with regard to the robustness against transmission errors is the minimizing of the probability that all frames of the FTU (carrying data corresponding to the same production cycle) be corrupted. This probability is denoted P_all in [27].

In practice, replicated sensors may return slightly different observations, and without extra communication for an agreement, replicated nodes of a same FTU may transmit different data. If a decision, such as a majority vote, has to be taken by a node with regard to the value of the transmitted data, the objective is to maximize the probability that at least one frame of each FTU is successfully transmitted during a production cycle. If the production cycle is equal to one round, then it comes back to minimizing P_one, the probability that one or more frames of an FTU have become corrupted.

It has been shown in [27], with some reasonable assumptions on the error model, that the optimal solution to minimize P_all is to “spread” the different frames of a same FTU uniformly over the TDMA round. An algorithm that ensures the optimal solution is provided for the case where the FTUs have, at most, two different cardinalities (for instance, one FTU is made of two replicas and other FTUs are made of three replicas). For the other cases, a low-complexity heuristic is proposed [27], and it was proven to be close to the optimal on simulations that were performed.

In [28], it was demonstrated that under very weak assumptions on the error model, and whatever the number of FTUs and their cardinalities, the clustering together of the transmission of all the frames of an FTU minimizes P_one when the production cycle of the data sent is equal to the length of a TDMA round.

These two results, for the fail-silent case and non-fail-silent case, provide simple guidelines to the application designer for designing the schedule of transmission. In our case study, since all ECUs are fail-silent, our requirement is to minimize the probability of losing all replicas in the TDMA round, and thus the redundant frames have to be spread over time.

29.4.4.4.2 Allocation of the Slots in the Round

Let us consider the architecture shown in Figure 29.4 with the following characteristics for the production of data:


HW ECU1/HW ECU2 produce two pieces of data packed in a single frame:
• HW angle every 2 ms
• HW torque every 4 ms

FAA ECU1/FAA ECU2 produce two pieces of data packed in a single frame:
• FAA position every 2 ms
• Tie rod force every 4 ms

The size of the TDMA round is set to the minimal production period, i.e., 2 ms. Since the frames are composed of the same information, whatever the round, the size of the cluster cycle is equal to one round, which is possible with the latest version of the specification [7]. According to all these considerations, the location of the slots inside the round is shown in Figure 29.6. The slot duration is set equal for every slot. However, it is not a constraint imposed by TTP/C, but in this case, this choice has been justified by application-level constraints (deadline on tasks, etc.).
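The effect of slot placement on P_all and P_one can be made concrete with a toy calculation. The sketch below is not the error model of [27, 28]; it assumes a single burst that corrupts `burst_len` consecutive slots and starts at a uniformly chosen slot of the round. The function name `burst_exposure` is hypothetical, and the two slot orders are illustrative: `spread` alternates handwheel and front-axle frames, as the spreading guideline suggests for a four-slot round, while `clustered` groups the replicas of each FTU.

```python
from fractions import Fraction


def burst_exposure(round_slots, ftu, burst_len):
    """Return (p_all, p_one) for `ftu` under a uniformly placed burst.

    p_all: fraction of burst positions that corrupt every replica of `ftu`
           belonging to the same round (same production cycle).
    p_one: fraction of burst positions that corrupt at least one replica.
    """
    size = len(round_slots)
    replica_slots = {i for i, owner in enumerate(round_slots) if owner == ftu}
    all_hits = one_hits = 0
    for start in range(size):
        hit_by_round = {}
        # The burst may run past the end of the round into the next one.
        for k in range(start, start + burst_len):
            rnd, slot = divmod(k, size)
            if slot in replica_slots:
                hit_by_round.setdefault(rnd, set()).add(slot)
        if hit_by_round:
            one_hits += 1
        if any(hit == replica_slots for hit in hit_by_round.values()):
            all_hits += 1
    return Fraction(all_hits, size), Fraction(one_hits, size)


spread = ["HW", "FAA", "HW", "FAA"]
clustered = ["HW", "HW", "FAA", "FAA"]
for name, layout in (("spread", spread), ("clustered", clustered)):
    p_all, p_one = burst_exposure(layout, "HW", burst_len=2)
    print(name, "p_all =", p_all, "p_one =", p_one)
```

Under this simplified model and a two-slot burst, the spread layout never loses both handwheel frames of a round but always loses one of them, while the clustered layout loses both in one burst position out of four and at least one in three out of four. This is consistent with the guideline that fail-silent FTUs favor spreading and voting FTUs favor clustering.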

29.4.4.5 Evaluation of the Behavioral Reliability of the Architecture

For a given mean time to failure (MTTF) of the components (sensors, computers, network links, actuators) and their redundancy, the use of classic reliability analysis methods, for instance, fault tree analysis (FTA) and failure mode and effect analysis (FMEA), can provide an estimation of the reliability of a Steer-by-Wire architecture (see, for instance, [29]). However, neither transient faults nor real-time performances are taken into account in these kinds of evaluation, and thus these evaluations are clearly not sufficient in our context.

Under normal conditions, the use of a time-triggered scheduling of tasks and messages allows the reception of the sensor data at regular intervals, thus providing bounded end-to-end delays. However, random environmental perturbations (e.g., EMI) could make the communication system unavailable during some periods. For example, consecutive transmission errors can create a period during which the controller or the actuators do not receive any sensor data. The concept of behavioral reliability, defined in [30], is used for determining the probability with which the Steer-by-Wire system violates the end-to-end delay constraints over 1 h under stochastic perturbations. Our objective is to ensure that this probability will be less than 10⁻⁹, which is more stringent than the SIL4 requirement (see Section 29.2.2).

The end-to-end delay for the front-axle control function is composed of the so-called pure delay (delay induced by the system before the driver’s command is given to the actuators) and the mechatronic delay introduced by the actuators (electric motors in our case). The mechatronic delay can be bounded by a constant Tmec. In what follows, we will focus only on the analysis of the pure delay, which is denoted by Tp. Systems that are not able to ensure a pure delay Tp lower than a tolerable upper bound Tmax are considered to be unstable. The value of Tmax can be estimated by tests in vehicles and simulations.

The behavioral reliability is estimated by the probability that the pure delay is greater than the maximum tolerable bound: PBR = P[Tp > Tmax]. When Tp is equal to or lower than Tmax, the quality of service is degraded, while the vehicle is considered to be potentially unstable when Tp > Tmax.

29.4.4.5.1 Evaluation of Tmax

To illustrate the method, only the function “turning the wheels according to driver’s request” is considered. The evaluation of Tmax can be performed either by testing in a vehicle or by using Matlab/Simulink. The method we adopted was using Matlab/Simulink first and confirming the results with testing in vehicles. The software framework used was composed of a Matlab/Simulink model of the Steer-by-Wire architecture and a vehicle environment model.

FIGURE 29.6 Placement of the slots in the cluster cycle. (Diagram: successive TDMA rounds of 2 ms, each equal to one cluster cycle of 2 ms and divided into slots of 0.5 ms labeled HW ECU1, FAA ECU1, HW ECU2, and FAA ECU2.)

The architecture presented in Section 29.4.3 was simulated according to a handwheel angle utilization profile (positions of the handwheel over time). The impact of the variation of Tp on performance, mainly the stability of the vehicle and the time needed to reach the requested wheel angle, was evaluated and translated in terms of the quality-of-service score, denoted by S. Table 29.1 shows an example of the relation between the score S given to the system and the perturbation time. It corresponds to an instantaneous rotation of the handwheel from 0 to 45° at 100 km/h. From Table 29.1, with a minimum tolerable score of 11, one sees that 17.6 ms is the critical limit for this perturbation time; beyond this limit, the vehicle becomes unstable and the safety of the driver can be at risk.
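The way the critical limit is read from Table 29.1 can be sketched as a simple lookup. The values below are copied from Table 29.1; the function name `tolerable_bound` is hypothetical, and the lookup assumes, as the table suggests, that the score does not increase when the perturbation time grows.

```python
# (Tp in ms, score S), copied from Table 29.1
TABLE_29_1 = [
    (0.0, 11.23), (3.6, 11.21), (5.6, 11.19), (9.6, 11.15), (11.6, 11.13),
    (13.6, 11.10), (15.6, 11.05), (17.6, 11.00), (19.6, 10.90), (29.6, 10.45),
]


def tolerable_bound(table, min_score):
    """Largest tabulated Tp whose score is still at least min_score."""
    acceptable = [tp for tp, score in table if score >= min_score]
    return max(acceptable) if acceptable else None


print(tolerable_bound(TABLE_29_1, 11.0))
```

A stricter minimum score moves the bound to a shorter perturbation time, so the value of Tmax is only meaningful together with the score threshold and the driving scenario used to produce the table.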

The different values of Tp in Table 29.1 correspond to the cases where, during 1, 2, 4, 5, 6, 7, 8, 9, and 14 consecutive cluster cycles, the front-axle actuators receive nothing (caused, for instance, by environmental perturbations). In practice, even without receiving any sampling data, the actuator still performs, but the turning of the front axle is made, for instance, on the basis of the command of the previous period, or with an estimation based on the commands of several previous periods. So, in this case, Tp + Tmec is no longer the end-to-end delay strictly speaking, but the delay during which the system has not been able to take into account the current handwheel angle in order to compute the commands for the actuators.
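The two fallback behaviors mentioned above can be sketched as follows. The chapter does not specify the estimator used in the case study, so the linear extrapolation below is only an assumed example of "an estimation based on the commands of several previous periods"; the function names are hypothetical.

```python
def hold_last(history):
    """Reuse the command of the previous period."""
    if not history:
        raise ValueError("no previous command available")
    return history[-1]


def extrapolate(history):
    """Assumed estimator: continue the trend of the last two commands."""
    if len(history) < 2:
        return hold_last(history)
    return history[-1] + (history[-1] - history[-2])


def next_command(history, received=None, estimator=hold_last):
    """Use fresh data when a frame arrived, otherwise fall back on history."""
    command = received if received is not None else estimator(history)
    history.append(command)
    return command


commands = [1.0, 2.0]
print(next_command(commands, received=None, estimator=extrapolate))
```

Either fallback keeps the actuator moving during a perturbation, which is why a lost cluster cycle degrades the quality of service rather than immediately destabilizing the vehicle.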

TABLE 29.1 QoS Score vs. Perturbation Time

Configuration of the Steering System    Tp (ms)    Score S
Mechanical steering system              0          11.23
Steer-by-Wire                           3.6        11.21
Steer-by-Wire                           5.6        11.19
Steer-by-Wire                           9.6        11.15
Steer-by-Wire                           11.6       11.13
Steer-by-Wire                           13.6       11.10
Steer-by-Wire                           15.6       11.05
Steer-by-Wire                           17.6       11
Steer-by-Wire                           19.6       10.90
Steer-by-Wire                           29.6       10.45

29.4.4.5.2 Quantification of the Behavioral Reliability

With the use of the TTP/C network, communication cycles are predefined and cyclic. However, as will be shown by the analysis of the example in Figure 29.7, Tp can be greater than a cluster cycle because of a possible desynchronization between the sampling period and the cluster cycle. The evaluation of the behavioral reliability should be based on the worst-case Tp rather than the nominal Tp because of the safety-critical nature of the system. Moreover, with transient failures due to perturbations, Tp becomes a random variable. Therefore, in this section, we first evaluate Tp and then the behavioral reliability.

29.4.4.5.2.1 Worst-Case Pure Delay without Transient Failures

Figure 29.7 shows the temporal characteristics of the front-axle control function and the relationship with the cluster cycles. The fault-tolerant communication layer (Section 29.3.3.2) has been configured so that the front-axle ECUs (FAA ECU1 and FAA ECU2) have to wait for all replicas of data before consuming them.

FIGURE 29.7 Temporal characteristics of the function “turning the wheels according to the driver’s request.” (Timing diagram: handwheel samplings i and i+1 at the sensors; treatment of samples i and i+1 in HW ECU1 and HW ECU2 and the resulting buffer contents; slots HWA ECU1, FAA ECU1, HWA ECU2, and FAA ECU2 in each cluster cycle of duration TMA; transmission of data i-1 and data i; network delay TNET and treatment time TT in the FAA ECUs; actuator actions for data i-1 and data i; pure delay for data i followed by the mechatronic delay.)

Cluster cycles are numbered by index i = 1, 2, … The ith cluster cycle starts at ti. For the front-axle control function, the worst-case pure delay, denoted by TpWC, appears when the ith handwheel sensor sampling period starts just after ti. This slight desynchronization leads to a situation where, during the ith cluster cycle, only the data elaborated using the sample of the (i – 1)th cluster cycle are transmitted to the front-axle ECUs. In fact, in TTP/C, data kept in the buffer of each HW ECU are transmitted at the beginning of each slot. In Figure 29.7, at the beginning of each HWA ECU slot, data in the buffer have not yet been refreshed and only data corresponding to the previous sample are transmitted. So, the worst-case delay between the ith HWA sample and the beginning of the ith actuation is given by

$$T_p^{WC} = T_{MA} + T_{NET} + T_T \qquad (29.1)$$

where TMA is the duration of a cluster cycle, TNET corresponds to the delay between the beginning of a cluster cycle and the arrival of all the replicas to the FAA ECU2, and TT is the treatment time of the data within an FAA ECU.

Although in our case we have only one TDMA round per cluster cycle, in general one can find several TDMA rounds per cluster cycle. In this latter case, assuming that there is one computation per data reception, Equation 29.1 takes the form

$$T_p^{WC} = n T_r + T_{NET} + T_T \qquad (29.2)$$

where Tr is the duration of a TDMA round and n is the number of TDMA rounds between two HWA data emissions (n could be less than or equal to the number of TDMA rounds in a cluster cycle).

This result should be used for system dimensioning at the design step in order to ensure that TpWC is smaller than the tolerable upper bound Tmax (e.g., 17.6 ms for the example presented in Section 29.4.4.5.1). For the slot placement given in Figure 29.7, we obtain

$$T_p^{WC} = 1 \times 2 + 1.4 + 0.2 = 3.6\ \text{ms}$$

This is to say that in failure-free conditions, the pure delay is bounded by 3.6 ms.

29.4.4.5.2.2 Pure Delay under Transient Failures and Behavioral Reliability Evaluation

When perturbations occur, the pure delay can be longer than TpWC, but how the perturbations will influence the pure delay depends on the failure occurrence model. Establishing a realistic failure model according to perturbation occurrences is a complex statistical work that is beyond the scope of this chapter. A more realistic failure model of EMI perturbations is proposed in [31].

In what follows, to illustrate the evaluation method of behavioral reliability (PBR), we use a simplified failure model, stationary in time and where failures are independent from each other. The granularity of the failure model is the cluster cycle (one failure leads to one erroneous or one empty cluster cycle). A failure can happen at any step detailed in Figure 29.7. Whichever step has a failure, we consider that no information is transmitted to the actuators during this cluster cycle (the information is lost or destroyed before the command is given to the actuators). In the worst case, each time a failure occurs, the actuator has to wait TpWC plus one cluster cycle TMA to receive refreshed information. Let TpWC,ERR denote the maximum delay with N consecutive erroneous cluster cycles:

$$T_p^{WC,ERR} = N T_{MA} + T_p^{WC} = N n T_r + T_p^{WC} \qquad (29.3)$$

Behavioral reliability is then calculated by PBR = P[TpWC,ERR > Tmax] = P[N(nTr) + TpWC > Tmax]. So, we have:

$$P_{BR} = P\left[ N > \left[ \frac{T_{max} - T_p^{WC}}{n T_r} \right] \right] \qquad (29.4)$$

This probability can be directly used to determine the SILs. In this case study, the requirement is that PBR < 10⁻⁹.

In our simplified failure model, as the failures occur following a stationary and uniform probability distribution, PBR also gives the probability of failures per hour. For the studied architecture, the maximum tolerable number of erroneous cluster cycles is given by

$$N = \left[ \frac{T_{max} - T_p^{WC}}{n T_r} \right] = \left[ \frac{17.6 - 3.6}{2} \right] = 7$$

So, we must here have PBR = P[N > 7] < 10⁻⁹ (failure/hour).

As explained before, the chosen error model is a simplified one: erroneous or empty communication cycles are assumed independent events and the probability of losing one cluster cycle is ER. So, the probability of losing x consecutive cluster cycles before one successful transmission is P[N = x] = (1 – ER)(ER)^x (geometric law):

$$P_{BR} = P[N > x] = 1 - \sum_{k=0}^{x} P[N = k] = E_R^{\,x+1} \qquad (29.5)$$

The proposed operational architecture meets the dependability requirements with ER < 0.075 under a geometrical failure model. In practice, it is necessary to use a more realistic error model, constructed on the basis of measurements taken from a prototype. Indeed, the effects of transient failures and external perturbations, such as EMI or temperature peaks, are not negligible and will become even more problematic when the 42-V technology [32] is used.
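The numerical chain of this section (worst-case pure delay, tolerable number of lost cluster cycles, and bound on the cycle loss probability ER) can be reproduced in a few lines. This is a minimal sketch: the function names are hypothetical, times are expressed in integer microseconds to avoid rounding effects, and the independence of cluster cycle losses is the simplifying assumption stated in the text.

```python
import math


def worst_case_pure_delay_us(n_rounds, t_round_us, t_net_us, t_treat_us):
    """Equation 29.2; Equation 29.1 is the case n_rounds * t_round_us = TMA."""
    return n_rounds * t_round_us + t_net_us + t_treat_us


def tolerable_lost_cycles(t_max_us, t_wc_us, n_rounds, t_round_us):
    """Integer part of (Tmax - TpWC) / (n * Tr), as used in Equation 29.4."""
    return (t_max_us - t_wc_us) // (n_rounds * t_round_us)


def p_br_closed_form(e_r, n_tol):
    """Equation 29.5: P[N > n_tol] = ER ** (n_tol + 1) for the geometric law."""
    return e_r ** (n_tol + 1)


def p_br_by_summation(e_r, n_tol):
    """Same quantity computed as 1 minus the sum of P[N = k], k = 0..n_tol."""
    return 1.0 - sum((1.0 - e_r) * e_r ** k for k in range(n_tol + 1))


def max_cycle_loss_probability(target, n_tol):
    """Largest ER for which ER ** (n_tol + 1) stays below the target."""
    return target ** (1.0 / (n_tol + 1))


t_wc = worst_case_pure_delay_us(1, 2000, 1400, 200)    # values of the case study
n_tol = tolerable_lost_cycles(17600, t_wc, 1, 2000)
print(t_wc, n_tol, round(max_cycle_loss_probability(1e-9, n_tol), 4))
print(math.isclose(p_br_closed_form(0.5, n_tol), p_br_by_summation(0.5, n_tol)))
```

The closed form shows why the bound on ER is fairly loose: eight consecutive losses are required before the delay exceeds Tmax, so the per-cycle loss probability only has to stay below the eighth root of 10⁻⁹.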


29.5 Conclusion

X-by-Wire is a clear trend of automotive development due to the advantages of electronic components for enhancing safety and functionality and for reducing cost.

In this chapter, after having examined the real-time and dependability constraints of the X-by-Wire systems, we reviewed the fault-tolerant services and the communication protocols (TTP/C, FlexRay, and TTCAN) that are needed for such systems. Methods for designing a dependable X-by-Wire system were described and a Steer-by-Wire system based on TTP/C was then used as a case study. We showed how to build a fault-tolerant architecture by choosing the necessary redundant components and the schedule of transmission. A method for evaluating the probability that the real-time constraints would be violated under a simple perturbation model was also proposed. This method can be used to predict whether the architecture meets the SIL4 requirement.

Even if the dependability of an X-by-Wire system can be evaluated by assuming that one can establish a realistic failure model, the certification organization still remains to be convinced. At the time of writing, the legislation in some countries does not authorize fully X-by-Wire cars to circulate. The use of X-by-Wire systems in mass-production cars in the future also depends on other factors such as the advances in the 42-V technology.

References

[1] Society of Automotive Engineers (SAE) public discussion: 42 Volt Electrical Systems and Fuel Cells: Harmonious Marriage or Incompatible Partners? SAE (N. Traub), General Motors (Ch. Borroni-Bird, Director of Design and Technology Fusion), Delphi (J. Botti, Innovation Center), Daimler-Chrysler (T. Moore, Vice President, Liberty and Technical Affairs), UTC Fuel Cells (F.R. Preli, Vice President, Engineering), SAE 2003 World Congress and Exhibition, Detroit, 2003.

[2] IEC 61508-1, Functional Safety of Electrical/Electronic/Programmable Electronic Safety-Related Systems: Part 1: General Requirements, IEC/SC 65A, 1998.

[3] J. Rushby, A Comparison of Bus Architectures for Safety-Critical Embedded Systems, Technical Report, Computer Science Laboratory, SRI International, 2003.

[4] E. Dilger, T. Führer, B. Müller, S. Poledna, The X-by-Wire Concept: Time-Triggered Information Exchange and Fail Silence Support by New System Services, Technische Universität Wien, Institut für Technische Informatik, no. 7/1998; also available as SAE Technical Paper 98055, 1998.

[5] C. Temple, Avoiding the babbling-idiot failure in a time-triggered communication system, in International Symposium on Fault-Tolerant Computing (FTCS), Munich, Germany, 1998.

[6] S. Poledna, P. Barrett, A. Burns, A. Wellings, Replica determinism and flexible scheduling in hard real-time dependable systems, IEEE Transactions on Computers, 49, 100–111, 2000.

[7] Time-Triggered Protocol TTP/C, High-Level Specification Document, Protocol Version 1.1, 2003.

[8] H. Kopetz, Real-Time Systems: Design Principles for Distributed Embedded Applications, Kluwer Academic Publishers, Dordrecht, 1997.

[9] H. Pfeifer, Formal verification of the TTP group membership algorithm, in FORTE/PSTV Euroconference, Pisa, Italy, 2000.

[10] H. Kopetz, R. Nossal, R. Hexel, A. Krüger, D. Millinger, R. Pallierer, C. Temple, M. Krug, Mode handling in the Time-Triggered Architecture, Control Engineering Practice, 6, 61–66, 1998.

[11] TTTech Computertechnik AG, http://www.tttech.com/, 2004.

[12] G. Bauer, M. Paulitsch, An investigation of membership and clique avoidance in TTP/C, in 19th IEEE Symposium on Reliable Distributed Systems, Nuremberg, Germany, 2000.

[13] FlexRay Consortium, http://www.flexray.com, 2004.

[14] ISO 11898-4, Road Vehicles: Controller Area Network (CAN): Part 4: Time Triggered Communication.

[15] Bosch, Time Triggered Communication on CAN, http://www.can.bosch.com/content/TT_CAN.html, 2004.

[16] OSEK Consortium, OSEK/VDX Time-Triggered Operating System, Version 1.0, available at http://www.osek-vdx.org/, 2001.

[17] OSEK Consortium, OSEK/VDX Fault-Tolerant Communication, Version 1.0, available at http://www.osek-vdx.org/, 2001.

[18] N. Tracey, Comparing OSEK and OSEKTime, in Embedded System Conference (ESC) Europe, Stuttgart, Germany, 2001.

[19] X-by-Wire Project, Brite-EuRam III Program, X-By-Wire: Safety Related Fault Tolerant Systems in Vehicles, Final Report, 1998.

[20] A. Avizienis, J.-C. Laprie, B. Randell, Fundamental concepts of dependability, in 3rd Information Survivability Workshop, Boston, MA, pp. 7–12, 2000.

[21] J.A. Garay, K.J. Perry, A continuum of failure models for distributed computing, in 6th Distributed Algorithm International Workshop (WDAG), Haifa, Israel, 1992.

[22] L. Lamport, R. Shostak, M. Pease, The Byzantine Generals Problem, ACM Transactions on Programming Languages and Systems, 4, 382–401, 1982.

[23] G. Grünsteidl, H. Kantz, H. Kopetz, Communication reliability in distributed real-time systems, in 10th IFAC Workshop on Distributed Computer Control Systems, Semmering, Austria, 1991.

[24] P. Herout, S. Racek, J. Hlavicka, Model-based dependability evaluation method for TTP/C based systems, in EDCC-4: Fourth European Dependable Computing Conference, Toulouse, France, 2002.

[25] R. Hexel, FITS: a fault injection architecture for time-triggered systems, in 26th Australian Computer Science Conference (ACSC2003), Adelaide, Australia, 2003.

[26] Report by the Inquiry Board, Ariane 501 Flight Failure, available at http://www.mssl.ucl.ac.uk/www_plasma/missions/cluster/about_cluster/cluster1/ariane5rep.html, 1996.

[27] B. Gaujal, N. Navet, Maximizing the Robustness of TDMA Networks with Application to TTP/C, Technical Report RR-4614, INRIA, 2002.

[28] B. Gaujal, N. Navet, Optimal replica allocation for TTP/C based systems, in 5th FeT IFAC Conference (FeT 2003), Aveiro, Portugal, July 2003.

[29] R. Hammett, P. Babcock, Achieving $10^{-9}$ dependability with drive-by-wire systems, in SAE 2003 World Congress and Exhibition, Detroit, MI, 2003.

[30] C. Wilwert, Y.Q. Song, F. Simonot-Lion, T. Clément, Evaluating quality of service and behavioral reliability of steer-by-wire systems, in 9th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), Lisbon, Portugal, 2003.

[31] N. Navet, Y.Q. Song, F. Simonot, Worst-case deadline failure probability in real-time applications distributed over CAN (Controller Area Network), Journal of Systems Architecture, 46, 607–617, 2000.

[32] H. Kopetz, H. Kantz, G. Grünsteidl, P. Puschner, J. Reisinger, Tolerating transient faults in MARS, in 20th Symposium on Fault Tolerant Computing, Newcastle upon Tyne, U.K., 1990.

30 FlexRay Communication Technology

Dietmar Millinger
DECOMSYS (Dependable Computer Systems)

Roman Nossal
DECOMSYS (Dependable Computer Systems)

30.1 Introduction
30.2 Automotive Requirements
Cutting Costs • Future Proof
30.3 What Is FlexRay?
Media Access • Clock Synchronization • Start-Up • Coding and Physical Layer • Bus Guardian • Protocol Services • FlexRay Current State
30.4 System Configuration
Development Models
30.5 Standard Software Components
Standardized Interfaces
References

30.1 Introduction

New electronic technologies have dramatically changed cars and the way we experience driving. Anti-blocking system (ABS), electronic stability program (ESP), air bags, and many more applications have made cars a lot more convenient, comfortable, and, above all, safer. This trend of the past decade has been rather pleasant for the consumer, but a tedious task for the automotive industry. The reasons for this difficulty lie not only in the need for higher integration of the technologies involved. The very nature of the deployed communication technologies makes the task of integration itself a lot more complex and makes the design of fault-tolerant systems on top of these communication technologies rather difficult.

These limitations, on the one hand, and the requirements and anticipated challenges of future automotive applications, on the other hand, motivated OEMs and suppliers to join forces. The goal of the FlexRay consortium [1], founded in 2001, is to establish one standard for a high-performance communication technology in the automotive industry.

30.2 Automotive Requirements

Since OEMs and suppliers were the founding fathers of the FlexRay consortium, it was clear from the very beginning of the work on the new de facto communication standard that FlexRay would have to meet the requirements of the automotive industry. Therefore, two key issues have driven the development work for the communication protocol: the need for a technological basis and solution for future safety-related applications and the need to keep costs down.


30.2.1 Cutting Costs

The cost factor is a key driver for many requirements for the communication system, as the push for systematic reuse of existing components in multiple car platforms proves. With this approach, subsets of components related to a specific function can be reused in multiple platforms without changes inside the components. This elegant and cost-saving solution, however, is only possible if the communication system offers two decisive qualities:

1. It must be standardized and provide a stable interface to the components.
2. It has to provide a deterministic communication service to the components.

This communication determinism is the solution for the problem of interdependencies between components, which is a major problem and cost factor in today's automotive distributed systems. Since any change in one component can change the behavior of the entire system, integration and testing are of utmost importance for ensuring the needed system reliability, and they are therefore extremely difficult and expensive. A deterministic communication system significantly reduces this integration and test effort because it guarantees that the cross-influence is completely under control of the application and not introduced by the communication system.

30.2.1.1 Migration

A new technology such as FlexRay does not make all predecessors obsolete at once. It rather replaces the traditional systems gradually and builds on proven solutions. Therefore, existing components and applications have to be migrated into new systems. In order to make this migration path as smooth and efficient as possible, FlexRay has integrated some key qualities of existing communication technologies, e.g., dynamic communication.

30.2.1.2 Scalability

Communication determinism and reuse are also key enablers for scalability, which obviously is yet another cost-driven requirement. Scalability, however, calls not only for communication determinism and reuse, but also for the support of multiple network topologies and network architectures, as well as the applicability of the communication technology in different application domains like power train, chassis control, backbone architectures, or driver assistance systems.

30.2.2 Future Proof

Keeping costs down is only one side of the coin. The automotive industry has visions of the future car and its applications. The most obvious developments are active safety functions like electronic braking systems, driver convenience functions like active front steering, and the fast-growing domain of driver assistance systems like active cruise control or the lane departure warning function. These automotive applications demand a high level of reliability and safety from the network infrastructure in the car in order to provide the required level of safety at the system level. Therefore, the communication technology has to meet requirements such as redundant communication channels, a deterministic media access scheme, high robustness in case of transient faults, a distributed fault-tolerant agreement about the protocol state, and extensive error detection and reporting toward the application. The most stringent particular requirement arises from the deterministic media access scheme. In time-division multiple-access (TDMA) schemes for networks, all participating communication partners need a common understanding of the time used in order to control access to the communication medium. Typically, a fault-tolerant distributed mechanism for clock synchronization is required. Additionally, the safety requirement introduces the need to protect individual communication partners from faults of other partners by means of guardians. Otherwise, errors of one partner could cross-influence other partners, thus violating safety demands.

Specific automotive issues complete the broad range of requirements forming the framework for FlexRay. These issues include the use of automotive components like crystals, automotive electromagnetic compatibility (EMC) requirements, support for power management to conserve battery power, support for electrical and optical physical layers, and a high-bandwidth demand of at least 2 × 10 Mbit/s.

30.3 What Is FlexRay?

Before the development of FlexRay was started, a comprehensive evaluation of the existing technologies took place. The results showed that none of the existing communication technologies could fulfill the requirements to a satisfactory degree. Thus, the development of a new technology was started. The resulting communication protocol FlexRay is an open, scalable, deterministic, and high-performance communication technology for automotive applications.

A FlexRay network consists of a set of electronic control units (ECUs) with integrated communication controllers (Figure 30.1). Each communication controller connects the ECU to one or more communication channels via a communication port, which in turn links to a bus driver. The bus driver connects to the physical layer of the communication channel and can contain a guardian unit that monitors the TDMA access of the controller (the architecture of an ECU is depicted in Figure 30.2). A communication channel can be as simple as a single bus wire or as complex as active or passive star configurations.

FlexRay supports the operation of a communication controller with single or redundant communication channels. In the case of a single communication channel configuration, all controllers are attached to the communication channel via one port. In the case of redundant configuration, controllers can be attached to the communication channels via one or two ports. Controllers that are connected to two channels can be configured to transmit data redundantly on two channels at the same time. This redundant transmission allows the masking of a temporary fault of one communication channel and thus constitutes a powerful fault tolerance feature of the protocol. A second fault tolerance feature related to transient faults can be constructed by the redundant transmission of data over the same channels with a particular time delay between the redundant transmissions. This delayed transmission allows the toleration of transient faults on both channels under particular preconditions.

FIGURE 30.1 FlexRay network. [Figure: nodes A to F attached to Channel 0 and Channel 1.]

FIGURE 30.2 ECU architecture. [Figure: a node comprising the host, host interface, communication controller (CC), bus driver (BD), bus guardian (BG), and power supply, connected to Channel 0 and Channel 1.]


30.3.1 Media Access

The media access strategy of FlexRay is basically a TDMA scheme with some very specific properties. The basic element of the TDMA scheme is a communication cycle. A communication cycle contains a static segment, a dynamic segment, and two protocol segments called symbol window and network idle time (Figure 30.3). Communication cycles are executed periodically from start-up of the network until shutdown. Two or more communication cycles can form an application cycle.

The static segment consists of slots with fixed duration. The duration and number of slots are determined by configuration parameters of the FlexRay controllers. These parameters must be identical in all controllers of a network. They form a so-called global contract. Each slot is exclusively owned by one FlexRay communication controller for transmission of a frame. This ownership only relates to one channel. On other channels, in the same slot, either the same or another controller can transmit a frame. The identity of the transmitting controller in a slot is also determined by configuration parameters of the FlexRay controllers. This piece of information is local to the sending controller. The receiving controllers do not possess any knowledge of the transmitter of a frame; they are configured solely to receive in a specific slot. Hence, the content of a frame is determined by its position in the communication cycle.

The static segment provides deterministic communication timing, since it is exactly known when a frame is transmitted on the channel, giving a strong guarantee for the communication latency. This strong guarantee in the static segment comes with a trade-off of fixed-bandwidth reservation.
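The determinism of the static segment follows directly from the global contract: once slot duration and slot count are fixed, the start of every slot within the cycle is known. The sketch below illustrates this and checks that no slot on a channel is claimed by two controllers. All names, the 50 µs slot duration, and the example ownership table are assumptions made for illustration; they are not taken from the FlexRay specification.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticSegment:
    """Global contract: both values must be identical in all controllers."""
    slot_duration_us: int
    slot_count: int

    def slot_start_us(self, slot: int) -> int:
        """Offset of a static slot from the cycle start (slots numbered from 1)."""
        if not 1 <= slot <= self.slot_count:
            raise ValueError(f"slot {slot} is outside the static segment")
        return (slot - 1) * self.slot_duration_us


def ownership_conflicts(tx_config):
    """Return the (channel, slot) pairs claimed by more than one controller.

    tx_config maps a controller name to the set of (channel, slot) pairs in
    which that controller is locally configured to transmit.
    """
    owners = defaultdict(list)
    for controller, slots in tx_config.items():
        for key in slots:
            owners[key].append(controller)
    return sorted(key for key, names in owners.items() if len(names) > 1)


segment = StaticSegment(slot_duration_us=50, slot_count=3)
tx_config = {
    "A": {(0, 1), (1, 1)},  # same controller, same slot, both channels
    "B": {(0, 2)},          # slot 2 is used on channel 0 only
    "C": {(0, 3), (1, 3)},
}
```

Because the transmitter of a slot is known only locally, a check of this kind has to be run over the collected configuration of all controllers, for example in an offline configuration tool.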

The dynamic segment has fixed duration, which is subdivided into so-called minislots. A minislot has a fixed length that is substantially shorter than that of a static slot. The length of a minislot is not sufficient to accommodate a frame; a minislot only defines a potential start time of a transmission in the dynamic segment. Similar to static slots, each minislot is exclusively owned by one FlexRay controller for the transmission of a frame. During the dynamic segment, all controllers in the network maintain a consistent view about the current minislot. If a controller wants to transmit in a minislot, the controller accesses the medium and starts transmitting the frame. This is detected by all other controllers, which interrupt the counting of minislots. Thus, the minislot is expanded to a real slot, which is large enough to accommodate a frame transmission. It is only after the end of the frame transmission that counting of the minislots continues. The expansion of a minislot reduces the number of minislots available in this dynamic segment.

The operation of the dynamic segment is illustrated in Figure 30.4: Figure 30.4a shows the situation before minislot 4 occurs. Each of the channels offers 16 minislots for transmission. The owner of minislot 4 on channel 0 (in this case controller D) has data to transmit. Hence, the minislot is expanded as shown in Figure 30.4b. The number of available minislots in the dynamic segment on channel 0 is reduced to 13.

If the owner of a minislot has no data to transmit, it remains silent. The minislot is not expanded and slot counting continues with the next minislot. Because no minislot expansion occurred, no additional bandwidth beyond the minislot itself is used; hence, other, lower-priority minislots have more bandwidth available.

This dynamic media access control scheme produces a priority and demand-driven access pattern that optimally uses the reserved bandwidth for dynamic communication. A controller that owns an earlier minislot, i.e., a minislot that has a lower number, has higher priority. The further back in the dynamic segment a minislot is situated, the higher is the probability that it will not be in existence in a particular cycle due to the expansion of higher-priority slots. A minislot is only expanded and its bandwidth used if the owning controller has data to transmit. As a consequence, the local controller configuration has to ensure that each minislot is configured only once in a network. The minimum duration of a minislot is mainly determined by physical parameters of the network (delay) and by the maximum deviation of the clock frequency in the controllers. The duration of a minislot and the length of the dynamic segment are global configuration parameters that have to be consistent within all controllers in the network.

FIGURE 30.3 FlexRay communication cycle. [Figure: a communication cycle consisting of the network communication time (static segment with static slots, dynamic segment with minislots, symbol window) and the network idle time.]
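The minislot counting described above can be sketched as a small simulation of one channel during one dynamic segment. This is a simplified model under stated assumptions: frame durations are given in whole minislot units, and an owner whose frame no longer fits into the remaining segment simply stays silent. The function name and parameters are hypothetical.

```python
def run_dynamic_segment(first_slot_id, segment_minislots, frames):
    """Simulate minislot counting on one channel for one dynamic segment.

    first_slot_id     -- number of the first minislot (4 in Figure 30.4)
    segment_minislots -- fixed segment length, in minislot units
    frames            -- maps a slot id to the duration (in minislot units,
                         at least 1) of the frame its owner wants to send
    Returns (slot ids in which a frame was sent, number of slots counted).
    """
    sent = []
    slot_id = first_slot_id
    remaining = segment_minislots
    while remaining > 0:
        duration = frames.get(slot_id)
        if duration is not None and duration <= remaining:
            sent.append(slot_id)      # minislot is expanded to a real slot
            remaining -= duration
        else:
            remaining -= 1            # owner stays silent, counting goes on
        slot_id += 1
    return sent, slot_id - first_slot_id


idle = run_dynamic_segment(4, 16, {})          # Figure 30.4a: nothing queued
busy = run_dynamic_segment(4, 16, {4: 4})      # controller D sends in minislot 4
starved = run_dynamic_segment(4, 16, {4: 4, 17: 2})
```

With an assumed frame duration of four minislots, the `busy` case reproduces the example in the text: the 16 available minislots shrink to 13. The `starved` case shows the priority effect, since minislot 17 never comes into existence in that cycle.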

The symbol window is a time slot of fixed duration, in which special symbols can be transmitted on the network. Symbols are used for network management purposes.

The network idle time is a protocol-specific time window in which no traffic is scheduled on the communication channel. The communication controllers use this time window to execute the clock synchronization algorithm. The offset correction (see below) that is done as a consequence of clock synchronization requires that some controllers correct their local view of the time forward while others correct it backward. The correction is done in the network idle time. While it is in progress, consistent operation of the media access control cannot be guaranteed, and thus silence is required. Since this duration has to be subtracted from net bandwidth, it is kept as small as possible. The minimum length is largely determined by the maximum deviation between the local clocks after one communication cycle. The duration of the network idle time is a global parameter that has to be consistent between all controllers in a network.

FIGURE 30.4 FlexRay dynamic segment. [Figure: (a) one communication cycle on Channel 0 and Channel 1, with static slots 1 to 3 (frames A1, B1, C2 on one channel and A1, C1 on the other), dynamic-segment minislots 4 to 19, the symbol window, and the NIT; (b) the same cycle after frame D1 has expanded minislot 4, leaving minislots 4 to 16 on that channel.]

While at least a minimal static segment and the network idle time are mandatory parts of a communication cycle, the symbol window and dynamic segment are optional parts. This results in basically three reasonable configurations (Figure 30.5):

• The pure static configuration, which contains only static slots for transmission. In order to enable clock synchronization, the static segment must consist of at least two slots, which are owned by different controllers. If a fault-tolerant clock synchronization should be maintained, the static segment must contain at least four static slots.

• The mixed configuration with a static segment and a dynamic segment, where the ratio between static bandwidth and dynamic bandwidth can vary in a broad range.

• Finally, a pure dynamic configuration with all bandwidth assigned to dynamic communication. This configuration also requires a so-called degraded static segment, which has two static slots.

Considering the most likely application domains, mixed configurations will be dominant. Depending on the actual configuration, FlexRay can achieve a best-case bandwidth utilization of about 70%, with the average around 60%.

30.3.2 Clock Synchronization

The media access scheme of FlexRay relies on a common and consistent view about time that is shared between all communication controllers in the network. It is the task of the clock synchronization service to generate such a time view locally inside each communication controller. Before the clock synchronization itself is described in detail, the representation of time inside a communication controller is introduced. The physical basis for each time representation is the tick of the local controller oscillator. This clock signal is divided by an integer factor to form a clock signal called microtick. An integer number of microticks forms a time unit called macrotick. Minislots and static slots are set up as integer multiples of macroticks. The number of microticks that constitute a macrotick is statically configured. However, for adjustment of the local time, the clock synchronization service can temporarily adjust this ratio in order to accelerate or decelerate the macrotick clock.
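The temporary adjustment of the microtick-to-macrotick ratio can be pictured as follows. The sketch spreads a correction, expressed in microticks, as evenly as possible over the macroticks of one cycle. The even spreading, the sign convention (a positive correction lengthens the cycle and so decelerates the macrotick clock), and all names and numbers are assumptions for illustration, not the mechanism defined by the protocol.

```python
def macrotick_lengths(nominal_microticks, macroticks_per_cycle, correction_microticks):
    """Return the length of each macrotick of one cycle, in microticks.

    The lengths sum to nominal * macroticks_per_cycle + correction, so the
    whole correction is absorbed within a single communication cycle.
    """
    lengths = [nominal_microticks] * macroticks_per_cycle
    count = abs(correction_microticks)
    step = 1 if correction_microticks > 0 else -1
    for i in range(count):
        # Place the i-th one-microtick adjustment at an evenly spaced macrotick.
        lengths[(i * macroticks_per_cycle) // count] += step
    return lengths


shortened = macrotick_lengths(40, 10, -3)   # local clock must catch up
unchanged = macrotick_lengths(40, 10, 0)
```

Spreading the adjustment keeps every macrotick close to its nominal length, so slot boundaries shift only gradually during the cycle.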

The clock synchronization service is a distributed control system that produces local macroticks with a defined precision in relation to the local macroticks of the other controllers of a network. The control system takes some globally visible reference events that represent the global time ticks, measures the deviation of the local time ticks from the global ticks, and computes the local adjustments in order to minimize the deviation of the local clock from the global ticks. Due to the distributed nature of the FlexRay system, no explicit global reference event exists. The only event that is globally observable on the communication channel is a frame transmission. The start of a transmission is triggered by the local time base of the sending controller. Each controller can collect these reference events to form a virtual global time base by computing a fault-tolerant mean value of the deviation between the local time and the perceived events.

FIGURE 30.5 FlexRay configurations. [Figure: pure static configuration (static segment with at least two static slots, optional symbol window); mixed configuration (static segment with at least two static slots, dynamic segment, optional symbol window); pure dynamic configuration (degraded static segment with at least two static slots, dynamic segment, optional symbol window).]

The reasoning behind this approach is based on the assumption that the majority of local clocks in the network are correct. A local clock is correct when it does not deviate from the other correct clocks by more than the precision value. A controller with a locally correct clock sees only deviation values within the precision value. By computing the median value of deviations, the actual temporal deviation of the local clock from the virtual reference tick is formed. Next, the local clock is adjusted such that the local deviation is minimized. Since all nodes in the network perform this operation, all local clocks move their ticks toward the tick of the virtual global time. The operation of observation, computing, and correction is performed in every communication cycle.

In the case of wrong transmission times of faulty controllers on the communication channel, things get more complicated. Here, a special part of the fault-tolerant median value algorithm takes over. This algorithm uses only the best of the measured deviation values. All other values are discarded. This algorithm ensures that the maximum influence of a faulty controller on the virtual global time is strictly bounded. Additionally, the protocol requires the marking of particular synchronization frames that can be used for deviation measurement. The reasoning behind this mechanism is twofold: First, it is used to pick exactly one frame from a controller in order to avoid monopolization of the global time by one controller with many transmit frames. Second, particular controllers can be excluded from clock synchronization, either because the crystal is not trustworthy or, more likely, because there are system configurations in which a controller is not available.
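The discarding step can be illustrated with a fault-tolerant midpoint rule: sort the measured deviations, drop the k largest and the k smallest, and take the midpoint of the remaining extremes. This is a minimal sketch and not the algorithm as given in this chapter; the function name, the thresholds used to choose k, and the use of the midpoint rather than a mean or median are assumptions for illustration.

```python
def fault_tolerant_midpoint(deviations):
    """Combine measured deviations (e.g., in microticks) into one correction term.

    The k largest and k smallest values are discarded, which bounds the
    influence that up to k faulty senders can have on the result.
    """
    values = sorted(deviations)
    n = len(values)
    if n == 0:
        return 0.0
    if n < 3:
        k = 0          # too few values to discard any (assumed threshold)
    elif n < 8:
        k = 1          # assumed threshold
    else:
        k = 2          # assumed threshold
    kept = values[k:n - k]
    return (kept[0] + kept[-1]) / 2


# Four plausible measurements and one from a faulty controller.
correction = fault_tolerant_midpoint([-2, -1, 0, 1, 50])
```

In the example, the outlier of 50 is discarded together with the smallest value, so the faulty controller cannot pull the result outside the range spanned by the correct measurements.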

In case of a faulty local clock, the faulty controller perceives only deviation values that exceed a particular value. This value can be derived from the precision value. This condition is checked by the synchronization service and an error is reported to the application. A specific extension of the clock synchronization services handles the compensation of permanent deviations of one node. In case such a permanent deviation is detected, a permanent correction is applied. The detection and calculation of such permanent deviations are executed less frequently than the correction of temporal deviations.

The error handling of the protocol follows a strategy that identifies every problem as fast as possible, but keeps the controller alive as long as possible. Problem indicators are frames received outside their expected arrival intervals and an insufficient number of synchronization frames received by the clock synchronization. The automatic reaction of the controller is to degrade the operation from a sending mode to a passive mode where reception is still possible. At the same time, the problem is indicated to the application, which can react in an application-specific manner to the detected problem. This strategy gives the designer of a system maximum flexibility.

30.3.3 Start-Up

The preceding protocol description handled only the case of an already running system. To reach this state, the start-up service is part of the protocol. Its purpose is to establish a common view of the global time and the position in the communication cycle.

Generally, the start-up service has to handle two different cases. The cold-start case is a start-up of all nodes in the network, while the reintegration case is the integration of a starting controller into an already running set of controllers (Figure 30.6).

During cold start, the algorithm has to ensure that a cold-start situation is really given. Otherwise, the starting controller might disturb an already running set of controllers. For this reason, the starting controller has to listen for traffic on the communication channel for a considerable amount of time, the so-called listen timeout. In case no traffic is detected, the controller assumes a cold-start situation and starts to transmit frames for a limited number of rounds. If another controller responds with frames that fit the slot counter of the cold-start node, start-up was successful.

In case traffic is detected during the observation period, the controller changes into the reintegration mode. In this mode, the controller has to synchronize the slot counter with the frames seen on the channel. Therefore, the controller receives frames from the channel and sets the slot counter accordingly. For a certain period, the controller checks the plausibility of the received frames in relation to the internal slot counter. If there is a match, the controller enters the normal mode, in which active transmission of frames is allowed.
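The two start-up paths can be summarized as a small state machine. The sketch below is an illustration only: the state and event names are hypothetical, and the two fallback transitions back to listening (after an unanswered cold start or an implausible reintegration) are assumptions, since the text does not say what happens in those cases.

```python
from enum import Enum, auto


class State(Enum):
    LISTEN = auto()          # observing the channel during the listen timeout
    COLD_START = auto()      # sending frames for a limited number of rounds
    REINTEGRATION = auto()   # aligning the slot counter with observed frames
    NORMAL = auto()          # active transmission allowed


TRANSITIONS = {
    (State.LISTEN, "traffic_detected"): State.REINTEGRATION,
    (State.LISTEN, "listen_timeout"): State.COLD_START,
    (State.COLD_START, "matching_reply"): State.NORMAL,
    (State.COLD_START, "rounds_exhausted"): State.LISTEN,         # assumption
    (State.REINTEGRATION, "frames_plausible"): State.NORMAL,
    (State.REINTEGRATION, "frames_implausible"): State.LISTEN,    # assumption
}


def step(state, event):
    """Return the next state; events that do not apply leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.LISTEN):
    """Feed a sequence of events to the state machine and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

For example, `run(["listen_timeout", "matching_reply"])` follows the cold-start path, while `run(["traffic_detected", "frames_plausible"])` follows the reintegration path; both end in `State.NORMAL`.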

30.3.4 Coding and Physical Layer

The frame format for data transmission contains three sections: the header, payload (body section), and trailer (Figure 30.7). The header contains protocol control information like the synchronization frame flag, the frame ID, a null frame indicator, the frame length, and a cycle counter. The payload section contains up to 254 bytes of data. In case the payload does not contain any data, the null frame indicator is set. Optionally, the data section can contain a message ID, which identifies the type of information transported in the frame. The trailer section contains a 24-bit cyclic redundancy check (CRC) that protects the complete frame.
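The section sizes given above and in Figure 30.7 (a 40-bit header, up to 254 payload bytes, an optional 16-bit message ID, and a 24-bit CRC) allow a quick estimate of frame length and protocol overhead. The following sketch uses hypothetical names and deliberately ignores any additional bits added by the coding on the physical layer, so the computed duration is a lower bound.

```python
HEADER_BYTES = 5          # 40-bit header section
TRAILER_BYTES = 3         # 24-bit CRC
MAX_PAYLOAD_BYTES = 254
MESSAGE_ID_BYTES = 2      # optional 16-bit message ID inside the payload


def frame_length_bytes(payload_bytes):
    """Total frame length: header + payload + trailer."""
    if not 0 <= payload_bytes <= MAX_PAYLOAD_BYTES:
        raise ValueError("payload must be between 0 and 254 bytes")
    return HEADER_BYTES + payload_bytes + TRAILER_BYTES


def max_user_data_bytes(with_message_id):
    """Payload bytes left for application data."""
    return MAX_PAYLOAD_BYTES - (MESSAGE_ID_BYTES if with_message_id else 0)


def raw_duration_us(payload_bytes, bit_rate_mbps=10.0):
    """Transmission time of the frame bits alone, without coding overhead."""
    return frame_length_bytes(payload_bytes) * 8 / bit_rate_mbps


def payload_efficiency(payload_bytes):
    """Share of the frame that carries payload."""
    return payload_bytes / frame_length_bytes(payload_bytes)
```

A full frame is 262 bytes long, of which about 97% is payload, whereas a frame with 8 payload bytes is only half payload. This is one reason why the achievable bandwidth utilization depends on the configuration.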

The existing FlexRay communication controllers support communication bit rates of up to 10 Mbpson two channels over an electrical physical layer. The physical layer is connected to the controller via atransceiver component.

This physical layer supports bus topologies, star topologies, cascaded star topologies, and bus stubs connected to star couplers, as shown in Figure 30.8. This multitude of topologies allows a maximum of scalability and flexibility of electronic architectures in automotive applications.

Besides transforming bit streams between the communication controller and physical layer, the transceiver component also provides a set of very specific services for an automotive network. The major services are alarm handling and wake-up control. Alarm signals are a very powerful mechanism for diverse information exchange between a sender controller and receiver controllers. A sender transmits an alarm symbol on the bus parallel to alarm information in a frame. A receiver ECU receives the alarm information in the frame like normal data. Additionally, the communication controller receives the alarm symbol on the physical layer and indicates this symbol to the ECU. Thus, the ECU has two highly independent indicators for an alarm to act on. This scheme can be used for the validation of critical signals like an air bag fire command.

FIGURE 30.6 Start-up process. (Flowchart elements: listen to traffic and gather time and cycle information; traffic detected? yes/no; listen timeout; send frames and wait for matching reply.)

FIGURE 30.7 FlexRay frame format. (A FlexRay frame comprises 5 + (0 ... 254) + 3 bytes: a 40-bit header section with protocol flags, frame ID, length, header CRC, and cycle; a body section with an optional 16-bit message ID and 0 ... 252 bytes of data; and a trailer section with a 24-bit CRC.)

The second type of service provided by the symbol mechanism is the wake-up function. A wake-up service is required in automotive applications where electronic components have a sleep mode, in which power consumption is extremely reduced. The wake-up service restarts normal operation in all sleeping ECU components. In a network, the wake-up service uses a special signal that is transmitted over the network. In FlexRay, this function relies on the ability of a transceiver component to identify a wake-up symbol and to signal this event to the communication controller and the remaining components of the ECU to wake these components up.

30.3.5 Bus Guardian

The media access strategy completely relies on the cooperative behavior of every communication controller in a network. The protocol mechanisms inside a controller ensure this behavior to a considerably high level of confidence. However, for safety-relevant applications, the controller internal mechanisms do not provide a sufficiently high level of confidence. An additional and independent component is required to ensure that no controller can disturb the media access mechanism of the network. This additional component is called a bus guardian.

In FlexRay, the bus guardian is a component with an independent clock and is constructed such that an error in the controller cannot influence the guardian and vice versa. The bus guardian is configured with its own set of parameters. These are independent of the parameters of the controller, although both parameter sets represent the same communication cycle and slot pattern. During runtime, the bus guardian receives synchronization signals from the controller in order to keep track of the communication cycle. Using its own clock, the bus guardian checks those synchronization signals so that a faulty controller cannot mislead it.

Typically, the bus guardian will be combined with the transceiver component. Optionally, a central bus guardian located inside a star coupler can be used.
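The guarding principle described in this section can be sketched as follows. This Python fragment is a simplified illustration, not the FlexRay bus guardian specification: the class names, the microsecond time base, and the tolerance parameter are all assumptions made for the example. It shows a guardian that holds its own copy of the slot pattern, accepts a synchronization signal from the controller only if it is plausible against its own clock, and enables transmission only inside the slots assigned to its node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuardianConfig:
    """Guardian-side parameters, held separately from the controller's."""
    slot_length_us: int        # duration of one slot
    slots_per_cycle: int       # number of slots in a communication cycle
    allowed_slots: frozenset   # 1-based slots in which this node may send
    sync_tolerance_us: int     # accepted deviation of a sync signal


class BusGuardian:
    def __init__(self, config: GuardianConfig):
        self.config = config
        self.cycle_start_us = 0  # measured on the guardian's own clock

    @property
    def cycle_length_us(self) -> int:
        return self.config.slot_length_us * self.config.slots_per_cycle

    def on_sync_signal(self, now_us: int) -> bool:
        """Accept the controller's cycle-start signal only if plausible."""
        expected = self.cycle_start_us + self.cycle_length_us
        if abs(now_us - expected) > self.config.sync_tolerance_us:
            return False  # implausible signal: keep the guardian's own view
        self.cycle_start_us = now_us
        return True

    def transmit_enabled(self, now_us: int) -> bool:
        """True only while the current slot belongs to this node."""
        offset = (now_us - self.cycle_start_us) % self.cycle_length_us
        slot = offset // self.config.slot_length_us + 1
        return slot in self.config.allowed_slots
```

In this sketch a controller that tries to send outside its slot is simply blocked, because `transmit_enabled` depends only on the guardian's own configuration and clock.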

30.3.6 Protocol Services

Application information is transmitted by the communication controller inside of frames. A frame contains one or more application signals. A controller provides an interface for frame transmission and reception that consists of buffers. A buffer consists of a control/status section and the data section. These sections have different semantics for receive and transmit frames and for static and dynamic slots.

The control section of transmit buffers for frames in the static segment contains the slot ID and channel in which the frame is transmitted. Once a buffer of this type is configured and the communication is started, the controller periodically transmits the data in the data section in the slot configured in the slot ID. When the application changes the data in the buffer, the subsequent transmission contains the new data. A special control flag allows modifying this behavior so that in case the application does not update the data in the buffer, a null frame is transmitted, signaling the failure to update to other controllers.

The control section of a receive buffer for frames in the static segment defines the slot ID and channel from which the frame should be loaded into the buffer. The status section contains the frame receive status and the null frame indicator. One special flag indicates that a new frame has been received. It is important to note that the slot ID and channel selection for slots in the static segment cannot be changed during operation.

FIGURE 30.8 Topologies. (Panels: bus; active star; active cascaded stars; active stars with bus extension.)

Buffer status and control sections for frames in the dynamic segment are similar to the buffers in the static segment. Differences result from the fact that the slot ID and channel can be changed during normal operation and that multiple buffers can be grouped together to form a first-in, first-out (FIFO) for frame reception from the dynamic section.
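The transmit behavior of a static-segment buffer, including the control flag that turns a missed update into a null frame, can be illustrated with a small Python model. This is a sketch under assumptions: the class, field, and method names are hypothetical and do not correspond to the register layout of any real controller.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StaticTxBuffer:
    """Hypothetical model of a transmit buffer for the static segment."""
    slot_id: int                        # fixed once communication has started
    channel: str                        # e.g. "A" or "B"
    null_frame_if_stale: bool = False   # the special control flag
    data: bytes = b""
    _updated: bool = field(default=False, init=False, repr=False)

    def write(self, data: bytes) -> None:
        """Application side: place new data in the data section."""
        self.data = data
        self._updated = True

    def frame_for_slot(self) -> Optional[bytes]:
        """Controller side, called once per cycle in the configured slot.

        Returns the payload to send, or None to represent a null frame.
        """
        if self.null_frame_if_stale and not self._updated:
            return None
        self._updated = False
        return self.data
```

Without the flag, the buffer keeps repeating the last written data every cycle; with the flag set, receivers can tell from the null frame that the application failed to update in time.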

The communication controller provides a set of timers that run clocked by the synchronized time of the network. Several different conditions can be used to generate interrupts based on these timers. These interrupts are efficient means to synchronize the application with the timing on the bus.

30.3.7 FlexRay Current State

At the time of writing, the protocol was in its final stage of development; the protocol specification had the version number 1.9. The first public release of the protocol, specification version 2.0, was published in July 2004.

FlexRay controllers are currently available from Freescale. These controllers are based on intermediate versions of the protocol specification; the latest one, named MFR4200, implements the protocol versions 1.0 and 1.1. Apart from Freescale, Bosch and NEC have announced that in the near future they will also offer FlexRay controllers.

The special physical layer for FlexRay is provided by Philips. It offers support for the topologies mentioned above and a data rate of 10 Mbit/s on one channel. There will be two versions of the bus drivers: one with an integrated bus guardian and the other without this unit.

30.4 System Configuration

With the advent of the TDMA communication technologies, especially in the automotive application domain, the offline configuration of networks has become increasingly important. Offline configuration means that the configuration parameters of the communication controllers are not generated during the runtime of the system, but are determined throughout the development time of the system. The processes for system development are mainly determined by the applied technology, but are also driven by industry-specific technical or organizational constraints. In the following section, the background for such a design process is described by first defining a model for the used information and then explaining the information processing.

The information model categorizes information into eight information domains. This categorization is comprehensive in the sense that each and every piece of development information belongs to one of the information domains (Figure 30.9).

The functional domain defines entities called functions and the communication relations between them. Functions describe the functionality of the entire system, creating a hierarchy from very abstract high-level functions down to specific, tailored ones. A system is normally composed of more than one function. In a vehicle, for example, the functional hierarchy would feature chassis functions on the top level, steering functions, and braking functions, the latter being broken down even more into basic braking function, antilock brake functions, and so on. A communication relation between functions or within a function starts at a sender function, connects the sender function with receiver functions, and has an assigned signal.


Signals are defined in the signal domain. A signal definition contains a signal name, the signal semantics, and the value range. Signals are assigned to messages. The message domain defines messages, i.e., packages of signals that should be transmitted together. A message has a name and a fixed setup of the signals it contains. The network domain determines which ECU will send a frame containing a message as well as which ECUs should receive this frame. The TDMA domain establishes the exact points in time when frames are transmitted. Finally, the architecture domain defines the physical structure of a system, with all ECUs, communication systems, and the connections of the ECUs to them.

The above six information domains are of importance for the entire system, i.e., for all ECUs of the system. Hence, they are considered global information.

Two additional domains complement the message, network, TDMA, and architecture domains. These two domains are the process domain and the dispatching domain. The process domain describes the software architecture of the system. It lists all processes and their interactions like mutual exclusion, as well as the assignment of processes to functions. Processes are information processing units of an application. They have timing parameters assigned to them that define the period and the time offset of their execution. The dispatching domain is the ECU counterpart of the TDMA domain. It determines the application timing, i.e., which process is executed at which point in time. Implicitly, this also defines the preemption of processes, i.e., when a process is interrupted by the execution of another process. The latter two information domains feature information that is relevant for only one ECU. Hence, they are considered to be local information (local referring to one ECU rather than the entire system).
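As an illustration of how several of the domains described above relate to each other, the following Python sketch models signals, messages, frame routing, and slot assignment as plain data classes. All class and field names are hypothetical; the example values for signals A and B are taken from Figure 30.9, and the routing is invented for the example.

```python
from dataclasses import dataclass
from typing import List

GLOBAL_DOMAINS = ("functional", "signal", "message",
                  "network", "TDMA", "architecture")
LOCAL_DOMAINS = ("process", "dispatching")


@dataclass(frozen=True)
class Signal:            # signal domain: name and value range
    name: str
    minimum: int
    maximum: int


@dataclass
class Message:           # message domain: fixed setup of signals
    name: str
    signals: List[Signal]


@dataclass
class FrameRoute:        # network domain: who sends, who receives
    message: Message
    sender: str
    receivers: List[str]


@dataclass
class SlotAssignment:    # TDMA domain: when the frame is transmitted
    slot: int
    route: FrameRoute


def receivers_of(signal_name: str, schedule: List[SlotAssignment]) -> List[str]:
    """ECUs that receive a frame containing the named signal."""
    found = set()
    for entry in schedule:
        if any(s.name == signal_name for s in entry.route.message.signals):
            found.update(entry.route.receivers)
    return sorted(found)


# Example values for signals A and B as shown in Figure 30.9.
a = Signal("A", 0, 200)
b = Signal("B", -32000, 32000)
msg_x = Message("X", [a, b])
schedule = [SlotAssignment(1, FrameRoute(msg_x, "ECU1", ["ECU2"]))]
```

Keeping each domain in its own type mirrors the point of the model: a change in one domain, such as moving a message to another slot, leaves the definitions in the other domains untouched.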

The categorization of information given by the information domain model leads the way to the development process. It is an organizational constraint in the automotive development processes that the knowledge related to the system to be developed is distributed among the process participants. These participants are typically the automobile manufacturer (OEM) and one or more suppliers. The OEM possesses the information on the intended system functions, the envisioned system architecture, and the allocation of functions to architectural components, i.e., to ECUs. Thus, the OEM’s knowledge covers three of the six global information domains.

FIGURE 30.9 Information domains. (The figure illustrates the functional, signal, message, network, TDMA, architecture, process, and dispatching domains, using functions 1 to 3, signals A, B, and C, messages X and Y, slots 1 to 3, and ECU1 and ECU2 on a communication bus as examples.)

The supplier, on the other hand, is the expert on function implementation and ECU design. This means he provides the knowledge on the process domain, i.e., the software architecture underlying the function implementation. Each function of the system or each part thereof is implemented by a set of interacting processes. The functionality of the ECU relies not only on the software architecture but also on the execution pattern. The supplier has to define the timing of each process executed in his developed ECU. Hence, the dispatching domain is supplier knowledge as well.

With five of the eight information domains assigned, the open question is who is responsible for the three remaining information domains. As described in the previous section, these domains (the message, network, and TDMA domains) define the communication behavior on the system level. They specify at which point in time which message is transferred from which sending ECU to which receiving ECUs. These domains obviously affect all ECUs of the system; hence, no single supplier should be able to define these domains. For this reason, the DECOMSYS development process for collaborative system development between an OEM and several suppliers assigns the message, network, and TDMA domains to the OEM.

So far, each piece of development information has been assigned to a process participant, which results in a static structure. In the following, the appropriate dynamic structure and development process will be described in brief.

The proposed OEM–supplier development process takes a two-phase approach (Figure 30.10). In the first phase, the OEM has to cover all global aspects; subsequently, the suppliers deal with the local aspects.

The process builds on the functional model of the system, which belongs to the functional domain and the signal domain, and the architectural model, belonging to the architecture domain. The functional model describes the functions of the system and their structure, consisting of subfunctions. As subfunctions have to exchange information in order to provide the intended function output, the functional model inherently also defines the signals that are transferred from one subfunction to others. The functional model is complemented by the architectural model, which defines the system topology and the ECUs that are present in the system. The mapping between these two models results in a concrete system architecture for a particular system in a particular vehicle. This is described in the distributed functional model.

Based on the distributed functional model, the OEM performs communication scheduling. During this operation, signals are packed to messages, which in turn are scheduled for transmission at a specific point in time. Communication scheduling concludes the global design steps and thus the OEM’s tasks.
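The two steps of communication scheduling, packing signals into messages and placing messages in time, can be sketched with a simple greedy strategy. The Python fragment below is a minimal illustration under stated assumptions: the function names, the first-fit packing rule, and the one-message-per-slot assignment are chosen for the example and are not the algorithm of any particular scheduling tool.

```python
from typing import Dict, List, Tuple

Message = Tuple[str, List[str]]  # (sending ECU, names of packed signals)


def pack_signals(signal_bits: Dict[str, int],
                 sender_of: Dict[str, str],
                 max_message_bits: int) -> List[Message]:
    """Group each sender's signals into messages, first-fit in name order."""
    messages: List[Message] = []
    for sender in sorted(set(sender_of.values())):
        current: List[str] = []
        used = 0
        for name in sorted(n for n, s in sender_of.items() if s == sender):
            bits = signal_bits[name]
            if bits > max_message_bits:
                raise ValueError(f"signal {name} does not fit in one message")
            if current and used + bits > max_message_bits:
                messages.append((sender, current))   # close the full message
                current, used = [], 0
            current.append(name)
            used += bits
        if current:
            messages.append((sender, current))
    return messages


def assign_slots(messages: List[Message],
                 slots_per_cycle: int) -> Dict[int, Message]:
    """Give every message its own slot, numbered from 1."""
    if len(messages) > slots_per_cycle:
        raise ValueError("more messages than slots in the cycle")
    return {slot: msg for slot, msg in enumerate(messages, start=1)}
```

A real scheduler would also have to respect timing requirements of the signals and leave room for later extensions; the sketch only shows the data flow from signals to messages to slots.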

The suppliers base their local design steps on the global information given by the OEM. The so-called split of the distributed functional model tells the supplier which functions or subfunctions his ECU has to perform, that is, what lies within his responsibility. The supplier conducts software architecture design and creates the software model for each function or subfunction. The resulting list of processes is scheduled for the ECU, taking into account local constraints as well as the global constraints defined in the communication schedule.

FIGURE 30.10 OEM–supplier development process. (Models: hardware model, function model, assignment model, communication timing, partial function model, software model, application timing. Steps: function distribution and assignment, communication scheduling, split, schedule export, software design, task scheduling.)

Note that for performing the local design, the supplier solely requires parts of the global information created by the OEM as well as his own knowledge on the ECU. In principle, the suppliers do not influence each other.

30.4.1 Development Models

The use of development models with a clear purpose and information content is the answer to the challenge of reuse of components for different car lines. Each model focuses on a certain type of information. The full picture of the system consists of these individual models and the mappings between them.

When developing a new car, only those models affected by the differences between the previous version and the new vehicle have to be adapted, while the other models remain unchanged. To be more specific, the reuse of system parts calls for the separation of the architectural and functional models. The functional model, i.e., the functions to be executed by the distributed system, is primarily independent of a specific car model and can be reused in different model ranges. The architectural model, on the other hand, i.e., the concrete number of ECUs and their properties in a certain car, varies between model ranges.

It is decisive for a useful development process to allow the separate development of the functional model and the architectural model. At the same time, the process must support the mapping of the functions to a concrete hardware architecture. The DECOMSYS OEM–supplier development process meets this requirement.

30.5 Standard Software Components

Standard software components within an ECU deliver a set of services to the actual application. The application itself can make use of these services without implementing any of them. For example, if a transport layer is part of the standard software components used in a project, the application does not need to take into account the segmentation and reassembly of data that exceed the maximum message size.
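To make the transport layer example concrete, the following Python sketch shows what segmentation and reassembly hide from the application. The two-byte segment header (sequence number and segment count) is an assumption made for this illustration and does not follow any standardized automotive transport protocol; the function names are hypothetical.

```python
from typing import List

SEGMENT_HEADER_BYTES = 2  # assumed header: sequence number, segment count


def segment(data: bytes, max_message_size: int) -> List[bytes]:
    """Split data into segments that each fit into one message."""
    chunk = max_message_size - SEGMENT_HEADER_BYTES
    if chunk <= 0:
        raise ValueError("message size too small for the segment header")
    chunks = [data[i:i + chunk] for i in range(0, len(data), chunk)] or [b""]
    if len(chunks) > 255:
        raise ValueError("data too long for a one-byte segment count")
    return [bytes([index, len(chunks)]) + part
            for index, part in enumerate(chunks)]


def reassemble(segments: List[bytes]) -> bytes:
    """Rebuild the original data; segments may arrive in any order."""
    total = segments[0][1]
    ordered = sorted(segments, key=lambda s: s[0])
    if [s[0] for s in ordered] != list(range(total)):
        raise ValueError("missing or duplicated segment")
    return b"".join(s[SEGMENT_HEADER_BYTES:] for s in ordered)
```

An application that relies on such a service simply hands over a block of data of arbitrary length; the message size limit of the underlying network never appears in the application code.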

In order to reuse the application code in another project, the services offered by the standard software in the new project should be the same. If the standard software provides less functionality, the code has to be changed, as missing services have to be added.

In the optimal case, the standardization effort covers all OEMs and suppliers. Only then can reuse of existing code be guaranteed, thus creating a win–win situation for all participants: the OEMs can purchase tested software that has proven its function and reliability in other projects; suppliers, on the other hand, have the possibility to sell this software, which they have created with considerable effort, to other OEMs.

30.5.1 Standardized Interfaces

Standard software components are not the only answer to the challenge of reuse. The standard software components and their standardized services must be complemented by standardized interfaces, through which the application software can access these services. Standardizing interfaces for software means providing one operating system API (Application Programming Interface) for the software to access communication as well as other resources, like analog-to-digital converters (ADCs). Similarly, standardized network interfaces allow the reuse of entire ECUs in different networks. The hardware of a distributed system can have a standardized interface represented by an abstract description of the network communication.

Standardization of software components as well as interfaces, and thus a system architecture, is not a competitive issue. Depending on where the interfaces are set, there is ample room for each participating company to use its strengths effectively to achieve its purpose. In our opinion, the real competitive issues are the functions in an ECU or the overall system functionality that is realized by the interaction of ECU functions. The special behavior of an electronic power-steering system as perceived by the car driver is mainly determined by the control algorithms and their application data, rather than by the type of communication interface used for integration.

With respect to standardization efforts, the industry is currently moving in the right direction. Initiatives like the OSEK/VDX consortium [2], HIS [3], and many others attempt to standardize certain software components and interfaces. The FIBEX group that is now part of the ASAM consortium [4] develops a standardized exchange format between tools based on Extensible Markup Language (XML), which is able to hold the complete specification of a distributed system. Many of these initiatives and projects are now united in the AUTOSAR development partnership [5] with the goal to generate an industry-wide standard for the basic software infrastructure.

References

[1] www.flexray.com.
[2] www.osek-vdx.org.
[3] www.automotive-his.de.
[4] www.asam.de.
[5] www.autosar.de.

31 The LIN Standard

Antal Rajnák
Volcano Communications Technologies AG

31.1 Introduction
31.2 The Need
31.3 History
31.4 Some LIN Basics
The LIN Physical Layer • The LIN Protocol
31.5 Design Process and Work Flow
System Definition Process • Debugging
31.6 Future
31.7 Volcano LIN Tool Chain
LIN Network Architect • Requirement Capturing • LIN Target Package • LIN Spector: Test Tool
31.8 Summary
Acknowledgments
Additional Information

31.1 Introduction

LIN is much more than just another protocol. It defines a straightforward design methodology, tool interfaces, and a signal-based API (Application Programming Interface) in a single package. LIN (local interconnect network) is an open communication standard, enabling fast and cost-efficient implementation of low-cost multiplex systems. It supports encapsulation for model-based design and validation, leading to front-loaded development processes that are faster and more cost efficient than traditional development methods.

The LIN standard not only covers the definition of the bus protocol, but also expands its scope into the domain of application and tool interfaces, reconfiguration mechanisms, and diagnostic services, thus offering a holistic communication solution for automotive, industrial, and consumer applications. In other words, it is systems engineering at its best, enabling distributed and parallel development processes.

Availability of dedicated tools to automate the design and system integration process is a key factor for the success of LIN.

31.2 The Need

The car industry today is implementing an increasing number of functions in software. Complex electrical architectures using multiple networks, with different protocols, are the norm in modern high-end cars. The software industry in general is handling software complexity through best practices such as:

• Abstraction: Hiding the unnecessary level of detail.

• Composability: Partitioning a solution into a set of separately specified, developed, and validated modules, easily combined into a larger structure inheriting the validity of its components, without the need for revalidation.

• Parallel processes: State-of-the-art development processes such as the rational unified process (RUP) are based on parallel and iterative development where the most critical parts are developed and tested in the first iterations.

The automotive industry is under constant pressure to reduce cost and lead time, while still providing increasing amounts of functionality. This must be managed without sacrificing quality. It is not uncommon today for a car project to spend half a billion U.S. dollars on development, and perhaps as much as $150 million on prototypes. By shortening lead time, the carmaker creates benefits in several ways; typically both development cost and capital costs are reduced. At the same time, an earlier market introduction creates better sales volumes and therefore better profit. One way of reducing lead time is by eliminating traditional prototype loops requiring full-size cars, and rather relying on virtual development, replacing traditional development and testing methods by Computer Aided Engineering (CAE).

To reduce development time while maintaining quality, a reduction in lead time must occur in a coordinated fashion for all major subsystems of a car, such as body, electrical, chassis, and engine. With improved tools and practices for other subsystems, and increasing complexity of the electrical system, more focus must be placed on the electrical development process, as it may determine the total lead time and quality of the car. These two challenges, lead time reduction and handling of increased software complexity, will put growing pressure on the industry to handle development of electrical architectures in a more purposeful manner.

31.3 History

The LIN consortium started in late 1998, initiated by five car manufacturers (Audi, BMW, Daimler-Chrysler, Volvo, and Volkswagen), the tool manufacturer VCT, and the semiconductor manufacturer Motorola. The work group focused on specification of an open standard for low-cost local interconnect networks in vehicles where the bandwidth and versatility of the Controller Area Network (CAN) are not required. The LIN standard includes the specification of the transmission protocol, the transmission medium, the interface between development tools, and the interfaces for application software programming. LIN promotes scalable architectures and interoperability of network nodes from the viewpoint of hardware and software, and a predictable electromagnetic compatibility (EMC) behavior. LIN complements the existing portfolio of automotive multiplex networks. It will be the enabling factor for the implementation of hierarchical vehicle networks, in order to gain further quality enhancement and cost reduction of vehicles. It addresses the needs of increasing complexity and implementation and maintenance of software in distributed systems by provision for a highly automated tool chain.

The main properties of the LIN bus are:

• Single master–multiple slaves structure
• Low-cost silicon implementation based on common Universal Asynchronous Receiver/Transmitter (UART)/Serial Communications Interface (SCI) hardware, an equivalent in software, or as a pure state machine
• Self-synchronization without a quartz or ceramics resonator in the slave nodes
• Deterministic signal transfer entities, with signal propagation time computable in advance
• Signal-based API

A LIN network is composed of one master and one or more slave nodes. The medium access is controlled by a master node; no arbitration or collision management in the slaves is required. Worst-case latency of signal transfer is guaranteed.



31.4 Some LIN Basics

LIN is a low-cost, single-wire network. The starting point of the physical layer design was the ISO 9141 standard. In order to meet EMC requirements, the slew rates are controlled. The protocol is a simple master–slave protocol based on the common UART format. In order to enable communication between nodes clocked by low-cost resistor–capacitor (RC) oscillators, synchronization information is transmitted by the master node on the bus. Slave nodes will synchronize with the master clock, which is regarded as accurate. The speed of the LIN network is up to 20 kbit/s, and the transmission is protected by a checksum. The LIN protocol is message identifier based. The identifiers do not address nodes directly, but denote the meaning of the messages. This way, any message can have multiple destinations (multicasting). The master sends out the message header consisting of a synchronization break (serving as a unique identifier for the beginning of the frame), a synchronization field carrying the clock information, and the message identifier, which denotes the meaning of the message.

Upon reception of the message identifier, the nodes on the network will know exactly what to do with the message. One of the nodes sends out the message response and the others either listen or do not care. Messages from the master to the slave(s) are carried out in the same manner; in this case, the slave task incorporated into the master node sends the response.

LIN messages are scheduled in a time-triggered fashion. This provides a model for the accurate calculation of latency times, thus supporting fully predictable behavior. Since the master sends out the headers, it is in complete control of the scheduling and is also able to swap between a set of predefined schedule tables, according to the specific requirements/modes of the applications running in the subsystem.

31.4.1 The LIN Physical Layer

The transport medium is a single-line, wired-AND bus supplied via a termination resistor from the positive battery node (VBAT, nominally 12 V). The bus line transceiver is an enhanced ISO 9141 implementation. The bus can take two complementary logical values: the dominant value, with an electrical voltage close to ground and representing a logical 0, and the recessive value, with an electrical voltage close to the battery supply and representing a logical 1 (Figure 31.1).
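The wired-AND behavior and the mapping from bus voltage to logical value can be sketched in a few lines of Python. The 40% and 60% decision levels are those marked in Figure 31.1; the function names are hypothetical, and treating the band between the two levels as undefined is an assumption of this example.

```python
from typing import Optional

DOMINANT_BELOW = 0.4    # fraction of VBAT marked in Figure 31.1
RECESSIVE_ABOVE = 0.6   # fraction of VBAT marked in Figure 31.1


def bus_level(v_bus: float, v_bat: float) -> Optional[int]:
    """Map a bus voltage to logical 0 (dominant) or 1 (recessive).

    Returns None while the voltage lies between the two decision levels.
    """
    ratio = v_bus / v_bat
    if ratio < DOMINANT_BELOW:
        return 0
    if ratio > RECESSIVE_ABOVE:
        return 1
    return None


def wired_and(*driver_bits: int) -> int:
    """Resulting bus value when several nodes drive the line at once.

    A single node driving dominant (0) pulls the whole bus to 0.
    """
    return int(all(driver_bits))
```

Because dominant always wins, a node that sends a recessive bit but reads back a dominant one knows that another node is driving the bus.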

The bus is terminated by a pull-up resistor with a value of 1 kOhm in the master node and 30 kOhm in a slave node. A diode in series with the resistor is required to prevent the electronic control unit (ECU) from being powered by the bus in case of a local loss of battery. The termination capacitance is typically CSlave = 220 pF in the slave nodes, while the capacitance of the master node is higher in order to make the total line capacitance less dependent on the actual number of slave nodes in a particular network. The maximum signaling rate is limited to 20 kbit/s. This value is a practical compromise between the conflicting requirements of high slew rates for the purpose of easy synchronization and slower slew rates for electromagnetic compatibility. The minimum baud rate is 1 kbit/s, which helps to avoid conflicts with the practical implementation of time-out periods.

FIGURE 31.1 Logical states and corresponding voltage levels on a LIN-bus. (The figure shows an electronic control unit whose SCI/UART or SLIC is connected to the bus through Rx and Tx, with VBAT of 8 to 18 V, pull-up resistors of 1 kΩ in the master and 30 kΩ in a slave, capacitances of 2.2 nF in the master and 220 pF in a slave, and the bus voltage over time with controlled slope between the recessive level, logic ‘1’, and the dominant level, logic ‘0’, with the 60% and 40% levels marked.)

31.4.2 The LIN Protocol

The entities that are transferred on the LIN bus are frames. One message frame is formed by the header and the response (data) part. The communication in a LIN network is always initiated by the master task sending out a message header, which includes the synchronization break, the synchronization byte, and the message identifier. One slave task is activated upon reception and filtering of the identifier and starts the transmission of the message response. The response is composed of one to eight data bytes and is protected by one checksum byte.
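The text above states only that the response carries one checksum byte. As an illustration, the Python sketch below computes that byte as an inverted eight-bit sum with carry over the data bytes, which is the scheme known as the classic LIN checksum; treating this as the algorithm in use here is an assumption, since the chapter does not spell it out. The function names are hypothetical.

```python
from typing import Sequence


def lin_checksum(data: Sequence[int]) -> int:
    """Inverted 8-bit sum with carry over one to eight data bytes."""
    if not 1 <= len(data) <= 8:
        raise ValueError("a response carries one to eight data bytes")
    total = 0
    for byte in data:
        total += byte
        if total > 0xFF:
            total -= 0xFF  # drop bit 8 and add the carry back in
    return ~total & 0xFF


def response_is_valid(data: Sequence[int], checksum: int) -> bool:
    """Receiver-side check of a response against its checksum byte."""
    return lin_checksum(data) == checksum
```

For the data bytes 0x4A, 0x55, 0x93, 0xE5 the running sum with carry is 0x19, so the sketch yields a checksum of 0xE6.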

The time it takes to send a frame is the sum of the time to send each byte, plus the response space and the interbyte space. The interbyte space is the period between the end of the stop bit of a byte and the start bit of the following byte.

The interframe space is the time from the end of a frame until the start of the next frame. A frame is constructed of a break followed by 4 to 11 byte fields. The structure of a frame is shown in Figure 31.2.
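The frame timing described above lends itself to a short worked calculation. The Python sketch below assumes ten bits per byte field (start bit, eight data bits, stop bit, as in the common UART format) and a break of 13 bits followed by a one-bit delimiter; the break length is an assumption not stated in this chapter. Response space and interbyte spaces are passed in as a single total, expressed in bit times.

```python
BITS_PER_BYTE_FIELD = 10  # UART format: start bit + 8 data bits + stop bit
BREAK_BITS = 14           # assumption: 13-bit break plus 1-bit delimiter


def nominal_frame_bits(data_bytes: int) -> int:
    """Bits on the bus for one frame without any spaces between bytes."""
    if not 1 <= data_bytes <= 8:
        raise ValueError("a frame carries one to eight data bytes")
    # Sync byte + identifier + data bytes + checksum: 4 to 11 byte fields.
    byte_fields = 2 + data_bytes + 1
    return BREAK_BITS + byte_fields * BITS_PER_BYTE_FIELD


def frame_time_ms(data_bytes: int, bit_rate: int = 20000,
                  space_bits: float = 0.0) -> float:
    """Frame duration in milliseconds.

    space_bits is the sum of the response space and all interbyte
    spaces, in bit times.
    """
    total_bits = nominal_frame_bits(data_bytes) + space_bits
    return total_bits / bit_rate * 1000.0
```

Under these assumptions a frame with eight data bytes occupies 124 bit times, roughly 6.2 ms at 20 kbit/s, before any spaces are added.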

In order to allow the detection of signaling errors, the sender of a message is required to monitor the transmission. After transmission of a byte, the subsequent byte may only be transmitted if the received byte was correct. This allows proper handling of bus collisions and time-outs.

Signals are transported in the data field of a frame. Several signals can be packed into one frame as long as they do not overlap each other. Each signal has exactly one producer; i.e., it is always written by the same node in the cluster. Zero, one, or multiple nodes may subscribe to the signal. A key property of the LIN protocol is the use of schedule tables. A schedule table makes it possible to ensure that the bus will never be overloaded. It is also the key component to guarantee timely delivery of signals to the subscribing applications. Deterministic behavior is made possible by the fact that all transfers in a LIN cluster are initiated by the master task. It is the responsibility of the master to ensure that all frames relevant in a certain mode of operation are given enough time to be transferred.
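The role of a schedule table can be illustrated with a small Python model. The sketch is not taken from the LIN specification: the entry fields and function names are hypothetical, and the frame times in any example would have to come from a separate calculation. It checks that every frame fits into the time the master reserves for it, and it derives the longest interval between two consecutive transfers of a given frame, which bounds how old a signal can become.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ScheduleEntry:
    frame: str            # name of the frame whose header is sent
    frame_time_ms: float  # worst-case time needed to transfer the frame
    slot_ms: float        # time reserved before the next header is sent


def table_is_feasible(table: List[ScheduleEntry]) -> bool:
    """True if every frame is given enough time to be transferred."""
    return all(e.frame_time_ms <= e.slot_ms for e in table)


def cycle_time_ms(table: List[ScheduleEntry]) -> float:
    """Duration of one pass through the schedule table."""
    return sum(e.slot_ms for e in table)


def max_update_interval_ms(table: List[ScheduleEntry], frame: str) -> float:
    """Longest time between two consecutive starts of the given frame."""
    starts, elapsed = [], 0.0
    for entry in table:
        if entry.frame == frame:
            starts.append(elapsed)
        elapsed += entry.slot_ms
    if not starts:
        raise KeyError(frame)
    cycle = elapsed
    gaps = []
    for i, start in enumerate(starts):
        following = starts[(i + 1) % len(starts)]
        gap = (following - start) % cycle
        gaps.append(gap if gap > 0 else cycle)  # single entry: full cycle
    return max(gaps)
```

Since the master alone walks through the table, these figures can be computed offline, which is what makes the latency of every signal predictable in advance.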

31.5 Design Process and Work Flow

Regardless of the protocol, a network design process includes three major elements:

• Requirement capturing (signal definitions and timing requirements)
• Network configuration/design
• Network validation

The holistic concept of LIN supports the entire development, configuration, and validation of a network by providing definitions of all necessary interfaces.

FIGURE 31.2 Frame structure.



The LIN work flow allows for the implementation of a seamless chain of design and development tools, enhancing speed of development and the reliability of the resulting LIN cluster.

The LIN configuration language allows description of a complete LIN network and also contains all information necessary to monitor the network. This information is sufficient to make a limited emulation of one or multiple nodes if they are not available. The LIN description file (LDF) can be one component used to generate software for an electronic control unit (ECU), which shall be part of the LIN network. An API has been defined by the LIN standard to provide a uniform, abstract way to access the LIN network from applications. The syntax of a LIN description file is simple and compact enough to be handled manually, but use of computer-based tools is encouraged. Node capability files, as described in the LIN node capability language specification, provide one way to (almost) automatically generate LIN description files.

31.5.1 System Definition Process

Defining optimal signal packing and a schedule table that fulfills the signaling needs in varying modes of operation, with consideration of the capabilities of the participating nodes, is called the system definition process. Typically, it results in the generation of the LDF, written by hand for simple systems or generated by high-level network design tools. When existing, preconfigured slave nodes are reused to create a cluster, writing the LDF from scratch is not very convenient. This is especially true if the defined system contains node address conflicts or frame identifier conflicts. The LIN node capability language, which is a new feature in LIN 2.0, provides a standardized syntax for specification of off-the-shelf slave nodes. This will simplify procurement of standard nodes as well as provide possibilities for tools that automate cluster generation. The availability of such nodes is expected to grow rapidly. If each node is accompanied by a node capability file, it will be possible to generate both the LIN configuration file and the initialization code for the master node. Thus, true plug and play with nodes in a cluster will become a reality.

By receiving a node capability file (NCF) with every existing slave node, the system definition step is automatic: just add the NCFs to your project in the system definition tool and it produces the LDF together with C code to configure a conflict-free cluster. The configuration C code shall, of course, be run in the master node during start-up of the cluster.
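The conflict resolution performed by a system definition tool can be pictured with the following minimal sketch. It is not the algorithm of any particular tool: the dictionary layout merely stands in for parsed node capability files, and all node, frame, and function names are hypothetical. The only fact taken from LIN 2.0 is that identifiers 0 to 59 are available for signal-carrying frames.

```python
# Sketch: resolve frame identifier conflicts between off-the-shelf slave nodes.
# The node descriptions stand in for parsed node capability files; the
# dictionary layout and all names are hypothetical, not the NCF syntax.

SIGNAL_FRAME_IDS = range(0, 60)  # LIN 2.0: identifiers 0..59 carry signal frames

def assign_frame_ids(nodes):
    """Keep each frame's default ID when it is free, else pick the lowest free ID.

    Returns {(node, frame): id}. Entries that differ from the default are the
    reassignments the master would have to configure at start-up.
    """
    used = set()
    assignment = {}
    for node, frames in nodes.items():
        for frame, default_id in frames.items():
            frame_id = default_id
            if frame_id in used:
                frame_id = next(i for i in SIGNAL_FRAME_IDS if i not in used)
            used.add(frame_id)
            assignment[(node, frame)] = frame_id
    return assignment

nodes = {
    "Slave1": {"Status": 0x10, "Command": 0x11},
    "Slave2": {"Status": 0x10},   # collides with the default of Slave1
}
result = assign_frame_ids(nodes)
```

Here the second slave keeps its frame definition but receives a different identifier, so the resulting cluster is free of identifier conflicts.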

If you want to create new slave nodes as well, the process becomes somewhat more complicated. The steps to perform will depend on the system definition tool being used, which is not part of the LIN specification. A useful tool will allow for entering of additional information before generating the LDF. (It is always possible to write a fictitious NCF for the nonexistent slave node, and thus it will be included.)

An example of the intended work flow is depicted in Figure 31.3. The slave nodes are connected to the master forming a LIN cluster. The corresponding node capability files are parsed by the system defining tool to generate an LDF in the system definition process. The LDF is parsed by the system generator to automatically generate LIN-related functions in the desired nodes (the master and slave 3 in the example shown in Figure 31.3). The LDF is also used by a LIN bus analyzer/emulator tool to allow for cluster debugging.

FIGURE 31.3 Work flow. (Elements shown: node capability files, system defining tool, LIN description file, system generator, master and slaves 1 to 3 forming the LIN cluster, and bus analyzer and emulator for debugging.)


If the setup and configuration of any LIN cluster are fully automatic, a great step toward plug-and-play development with LIN will be taken. In other words, it will be just as easy to use distributed nodes in a LIN cluster as it is to use a single node with the physical devices connected directly to the node.

It is worth noting that the generated LDF reflects the configured network; any preexisting conflicts between nodes or frames must have been resolved before activating cluster traffic.

31.5.2 Debugging

Debugging and node emulation are based on the LDF produced during system definition. Emulation of the master adds the requirement that the cluster must be configured to be conflict-free. Hence, the emulator tool must be able to read reconfiguration data produced by the system definition tool.

One example of a comprehensive tool chain built around the open interface definitions of the LIN standard is presented below.

31.6 Future

The driving ideas and resulting technology behind the success of LIN, especially in the area of the structured approach toward the system design process, will most likely migrate to other areas of automotive electronics. LIN itself will find its way to applications outside of the automotive world due to its low cost and versatility. The LIN specification will evolve further to cover upcoming needs. For example, the future 42-V power supply will require a new physical layer. There will be a broad supply of components that are made for LIN. Because of high production volumes, these products can be used cost-effectively in many applications, enhancing the functionality of vehicles.

31.7 Volcano LIN Tool Chain

The Volcano LIN tool chain process is illustrated in Figure 31.4.

1. LIN network requirements are entered into LNA.
2. Automatic frame compilation and schedule table generation are done by LNA.
3. The LIN description file is generated by LNA.
4. The LIN configuration generator tool converts the LIN description file and private file to target-dependent “.c” and “.h” code.
5. Application code is compiled with target-dependent configuration code and linked to the LIN target package library.
6. Analysis and emulation are performed with LIN Spector using the generated LIN description file.

FIGURE 31.4 The Volcano tool-chain for LIN.

31.7.1 LIN Network Architect

The LIN Network Architect (LNA) is built for design and management of LIN networks. Starting with the entry of basic data such as signals, encoding types, and nodes, LNA takes the user through all stages of network definition.

31.7.2 Requirement Capturing

There are two types of data administered by LNA:

• Global objects (signals, encoding types, and nodes)
• Project-related data (network topology, frames, and schedule tables)

Global objects shall be created first and can then be reused in any number of projects (Figure 31.5).

FIGURE 31.5 Definition of global objects in LNA.



They can be defined manually or imported by using a standardized Extensible Markup Language (XML) input file (based on FIBEX revision 1.0). Future versions of the tool will be able to import data directly from the standardized node capability file (NCF). Comprehensive version and variant handling is supported.

The systems integrator combines subsets of these objects to define networks. Consistency checks are continuously performed during this process. This is followed by automatic packing of signals into frames (Figure 31.6).

FIGURE 31.6 Network definition and frame packing.

The last task to be completed is that of generating the schedule table in accordance with the timing requirements captured earlier in the process (Figure 31.7). The optimization considers several factors such as bandwidth and memory usage.

Based on the allocation of signals to networks via node interfaces, the tool will automatically identify gateway requirements between subnetworks, regardless of whether they are LIN to LIN or LIN to CAN. The transfer of signals from one subnetwork to another will become completely transparent to the application of the automatically selected gateway node.

The tool uses a publish–subscribe model. A signal can only be published by one node, but it can be received by any number of other nodes. Different nodes may have different end-to-end timing requirements (Figure 31.8).

The max_age is the most important timing parameter defined in the Volcano timing model. This parameter describes the maximum allowed time between the generation and consumption of a signal involved in a distributed function.
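A max_age requirement can be checked against a candidate schedule table along the lines of the minimal sketch below. It rests on simplifying assumptions that are not part of the Volcano timing model as described here: a value written just after its frame's slot begins has to wait for the next slot of that frame, and it counts as consumed when that slot ends. The schedule, frame names, and function name are hypothetical.

```python
# Sketch: check a max_age requirement against a cyclic schedule table.
# Simplifying assumptions (not taken from the Volcano timing model itself):
# a value written just after its frame's slot begins waits for the next
# slot of that frame, and it is consumed when that slot ends.

def worst_case_age_ms(schedule, frame):
    """Longest time from writing a signal to the end of the slot that carries it."""
    cycle = sum(slot for _, slot in schedule)
    starts, t = [], 0.0
    for name, slot in schedule:
        if name == frame:
            starts.append((t, slot))
        t += slot
    if not starts:
        raise ValueError(f"frame {frame!r} is not scheduled")
    worst = 0.0
    for i, (start, _) in enumerate(starts):
        next_start, next_slot = starts[(i + 1) % len(starts)]
        gap = (next_start - start) % cycle or cycle  # full cycle if scheduled once
        worst = max(worst, gap + next_slot)
    return worst

# Hypothetical schedule table: (frame, slot length in ms).
schedule = [("A", 10.0), ("B", 10.0), ("A", 10.0), ("C", 20.0)]
meets_requirement = worst_case_age_ms(schedule, "A") <= 50.0   # max_age of 50 ms
```

Scheduling a frame more than once per cycle, as done for frame A here, is one way to shorten the worst-case age of the signals it carries.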


Changes can be introduced in a straightforward manner, with frame definitions and schedule tables automatically recalculated to reflect the changed requirements.

When the timing analysis has been performed and the feasibility of the individual subnetworks has been established, LDFs will automatically be created for each network (Figure 31.9).

Textual reports can be generated as well, to enhance the readability of information for all parties involved in the design, verification, and maintenance process.

FIGURE 31.7 Manual or automatic schedule table generation.

FIGURE 31.8 LIN timing model.


31.7.3 LIN Target Package

The LIN target package (LTP) represents the embedded software portion of the Volcano tool chain for LIN. The LTP is distributed as a precompiled and fully validated object library, also including associated documentation and a command line configuration utility (LCFG) with automatic code generation capability, generating the configuration-specific code and set of data structures.

Implementing the LTP with an application program is a simple process. The LDF created by the offline tool contains the communication-related network information. In addition, a target file as an ASCII-based script defines low-level microcontroller information such as memory model, clock, SCI, and other node specifics to the LTP. These two files are run as input through the command line utility LCFG. It converts them both into target-dependent code usable by the microcontroller. The output contains all relevant configuration information formatted into compiler-ready C source code.

The target-dependent source code is added to the module build system along with the precompiled object library. After compilation the LTP gets linked to the application functionality to form the target image, which is ready for download.

The application programmer can interface to the LTP and therefore to the LIN subnetwork through the standardized LIN API (Figure 31.10). API calls include signal-oriented read and write calls, signal-related flag control (but also node-related initialization), interrupt handling, and timed task management. The low-level details of communication are hidden to the application programmer using the LTP.

Specifics about signal allocation within frames, frame IDs, and others are carried within the LDF so that applications can be reused simply by linking to different configurations described by different LDFs. As long as signal formats do not get changed, a reconfiguration of the network only requires repeating the process described above, resulting in a new target image without impact to the application. When the node allows for reflashing, the configuration can even be adapted without further supplier involvement, allowing for end-of-line programming or after-sales adaptations in case of service.

FIGURE 31.9 LDF generation.

LTPs are created and built for a specific microcontroller and compiler target platform. A number of ports to popular targets are available, and new ports can be made at the customer’s request.

31.7.4 LIN Spector: Test Tool

LIN Spector is a highly flexible analysis and emulation tool used for testing and validating LIN networks. The tool is divided into two components: external hardware and PC-based software. Using a 32-bit microcontroller, the hardware portion performs exact low-level real-time bus monitoring (down to 10-ms resolution) while interfacing to the PC via standard RS-232. Other connections are provided for external power and a 9-pin D-Sub for bus and triggering connections. The output trigger is provided for connection to an oscilloscope, allowing the user to externally monitor the bus signaling.

Starting with LDF import, the tool allows for monitoring and display of all network signal data. Advanced analysis is possible with logical name and scaled physical value views. Full emulation of one or many nodes, regardless of whether they are master or slave, is possible using LDF information (Figure 31.11). Communication logging and replay is possible, including the ability to start a log via logic-based triggers.

An optional emulation module enables the user to simulate complete applications or run test cycles when changing emulated signal values and switching schedule tables, all in real time. The functions are specified by the user via LIN emulation control (LEC) files created in a C-like programming language.

This can also be used to validate the complete LIN communication in a target module. Test cases are defined stressing bus communication by error injection on the bit or protocol timing level.

Sophisticated graphical user interface panels can be created using the LIN Go feature within the test device (Figure 31.12). These panels interface with the network data defined by the LDF for display and control.

FIGURE 31.10 LIN API structure.

31.8 Summary

LIN is an enabling factor for the implementation of a hierarchical vehicle network to achieve higher quality and reduction of costs for automotive makers. This is enabled by providing best practices of software development to the industry: abstraction and composability. LIN allows for reduction of the many existing low-end multiplex solutions and for cutting the costs of development, production, service, and logistics in vehicle electronics.


The growing number of car lines equipped with LIN and the ambitious plans for the next generation of cars are probably the best proof for the success of LIN. The simplicity and completeness of the LIN specification, combined with a holistic networking concept allowing for a high degree of automation, have made LIN the perfect complement to CAN as the backbone of in-vehicle communication. Some of the market growth even resides in the downsizing of parts of the vehicle network from MS CAN to LIN, where limited communication requirements allow for such downsizing.

The release of LIN 2.0 has further enhanced the reuse of components across car manufacturers and has added a higher degree of automated design capability by introduction of node capability description files and by defining mechanisms for reconfigurability of identical LIN devices in the same network.

VCT is offering the corresponding and highly automated tool chain to guarantee correctness by design. This shortens design cycles and, as a conceptual approach, allows for integration into higher-level tools. LIN solutions provide a means for the automotive industry to drive new technology and functionality in all classes of vehicles.

Acknowledgments

I thank Hans-Christian von der Wense of Freescale Semiconductor Munich and István Horváth and Thomas Engler of Volcano Automotive Group for their contributions to this chapter.

FIGURE 31.11 LIN Spector — diagnostics and emulation tool for LIN.



Additional Information

ISO 9141, Road Vehicles: Diagnostics Systems: Requirement for Interchange of Digital Information, 1st edition, 1989.

LIN Consortium, LIN Specification, Version 2.0, www.lin-subbus.org, September 2003.

Dr. Günter Reichart, LIN: a subbus standard in an open system architecture, in 1st International LIN Conference 19, Ludwigsburg, Germany, September 2002.

J.W. Specks, A. Rajnák, LIN: protocol, development tools, and software interfaces for local interconnect networks in vehicles, in 9th International Conference on Electronic Systems for Vehicles, Baden-Baden, Germany, October 2000.

W. Specks, A. Rajnák, The scaleable network architecture of the Volvo S80, in 8th International Conference on Electronic Systems for Vehicles, Baden-Baden, Germany, October 1998, pp. 597–641.

The LIN specification package and further background information about LIN and the LIN consortium are available via the URL http://www.lin-subbus.org.

Information about LIN products referred to in this chapter is available via the URL http://www.volcanoautomotive.com.

FIGURE 31.12 LIN Go: graphical objects.


32 Volcano: Enabling Correctness by Design

Antal Rajnák
Volcano Communications Technologies AG

32.1 Introduction
32.2 Volcano Concepts
Volcano Signals and the Publish–Subscribe Model • Frames • Network Interfaces • The Volcano API • Timing Model • Capture of Timing Constraints
32.3 Volcano Network Architect
The Car OEM Tool Chain: One Example • VNA: Tool Overview
32.4 Volcano Software in an ECU
Volcano Configuration • Work Flow
Acknowledgments
Reference
More Information

32.1 Introduction

Volcano is a holistic concept defining a protocol-independent design methodology for distributed real-time networks in vehicles. The concept deals with both technical and nontechnical entities (i.e., partitioning of responsibilities into well-defined roles in the development process).

The vision of Volcano is enabling correctness by design. By taking a strict systems engineering approach and focusing resources into design, a majority of system-related issues can be identified and solved early in a project. The quality is designed into the vehicle, not tested out. Minimized cost, increased quality, and a high degree of configuration and reconfiguration flexibility are the trademarks of the Volcano concept.

The Volcano approach is particularly beneficial as the complexity of vehicles is increasing very rapidly and as projects will have to cope with new functions and requirements throughout their lifetime.

A unique feature of the Volcano concept is a solution called post-compile-time reconfiguration flexibility: the network configuration, which contains signal-to-frame mapping, ID assignment, and frame period, is located in a configurable flash area of the electronic control unit (ECU) and can be changed without touching the application software, thus eliminating the need for revalidation and saving both cost and lead time.

The concept’s origins can be traced back to a project at Volvo Car Corporation in 1994–1998 when development of Volvo’s new large platform took place. It reuses solid industrial experience and takes into account recent findings from real-time research (Figure 32.1).

The concept is characterized by three important features:

• Ability to guarantee the real-time performance of the network already at the design stage, thus significantly reducing the need for testing
• Built-in flexibility, enabling the vehicle manufacturer to upgrade the network in the preproduction phase of a project as well as in the aftermarket
• Efficient use of available resources

The actual implementation of the concept consists of two major parts:

• The offline tool set for requirement capturing and automated network design (covering multiple protocols and gateway configuration). It provides strong administrative functions for variant and version handling, needed during the complete life cycle of a car project.

• The target part, represented by a highly efficient and portable embedded software package offering a signal-based Application Programming Interface (API), handling of multiple protocols, integrated gateway functionality, and post-compile-time reconfiguration capability, together with a PC-based generation tool.

Even though the implementation originally supported the Controller Area Network (CAN) and Volcano lite* protocols, it has successfully been extended to fit other emerging network protocols as well. Local interconnect network (LIN) was added first, to be followed by the FlexRay and Media Oriented Systems Transport (MOST) protocols. The philosophy behind this is that communications have to be managed in one single development environment, covering all protocols used, in order to ensure end-to-end timing predictability, while still providing the necessary architectural freedom to choose the most economic solution for the task.

The Volcano approach is particularly beneficial because the complexity of vehicles is increasing very rapidly and because projects will have to cope with new functions and requirements throughout their lifetime. The computing industry has discovered over the last 40 years that certain techniques are needed in order to manage complex software systems. Two of these techniques are abstraction (where unnecessary information is hidden) and composability (if software components proven to be correct are combined, then the resulting system will be correct as well). Volcano is making heavy use of both of these techniques.

The automotive industry is implementing an increasing number of functions in software. The introduction of protocols like MOST for multimedia and FlexRay for active chassis systems results in highly complex electrical architectures. Finally, all these complex subnetworks are linked through gateways. The behavior of the entire car network has a crucial influence upon the car’s performance and reliability. Managing software development that involves many suppliers, hundreds of thousands of lines of code, and thousands of signals requires a structured systems engineering approach. Inherent in the concept of systems engineering is a clear partitioning of the architecture, requirements, and responsibilities.

FIGURE 32.1 The Volvo S80 main networks.

*A low-speed, Serial Communications Interface (SCI)-based proprietary master–slave protocol used by Volvo.



A modern vehicle includes a number of microprocessor-based components called electronic control units (ECUs), provided by a variety of suppliers.

CAN provides an industry standard solution for connecting ECUs together using a single broadcast bus. A shared broadcast bus makes it much easier to add desired functionality: ECUs can be added easily, and they can communicate data easily and cheaply (adding a function may be “just software”). But increased functionality leads to more software and greater complexity. Testing a module for conformance to timing requirements is the most difficult of the problems. With a shared broadcast bus, the timing performance of the bus might not be known until all the modules are delivered and the bus usage of each is known. Testing for timing conformance can only then begin (which is often too far into the development of a vehicle to find and correct major timing errors). The supplier of a module can only do limited testing for timing conformance: it does not have a complete picture of the final load placed on the bus. This is particularly important when dealing with the CAN bus: arrivals of frames from the bus may cause interrupts on a module wishing to receive the frames, and so the load on the microprocessor in the ECU is partially dependent on the bus load.

It is often thought that CAN is somehow unpredictable and the latencies for lower-priority frames in the network are unbounded. This is untrue, and in fact, CAN is a highly predictable communications protocol. Furthermore, CAN is well suited to handle large amounts of traffic with differing time constraints.

However, with CAN there are a few particular problems:

• The distribution of identifiers: CAN uses identifiers for two purposes: distinguishing different messages on the bus and assigning relative priorities to those messages. The latter is often neglected.

• Limited bandwidth: Due to a low maximum signaling speed of 1 Mbit/s, further reduced by significant protocol overhead.
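The size of the protocol overhead can be made concrete with a short calculation. The sketch below uses the commonly cited worst-case length of a CAN data frame with an 11-bit identifier: 47 framing bits (including the 3-bit interframe space) plus the data field plus worst-case stuff bits, where only 34 of the framing bits and the data field are subject to bit stuffing. The function names are hypothetical.

```python
# Sketch: how much of a CAN frame is payload?
# Standard (11-bit identifier) data frame: 47 bits of framing overhead,
# including the 3-bit interframe space, plus worst-case stuff bits.
# Only 34 of the overhead bits and the data field are subject to stuffing.

def max_frame_bits(data_bytes: int) -> int:
    """Worst-case length in bits of a standard CAN data frame."""
    payload = 8 * data_bytes
    stuff = (34 + payload - 1) // 4
    return payload + 47 + stuff

def payload_efficiency(data_bytes: int) -> float:
    """Fraction of the worst-case frame length that carries application data."""
    return 8 * data_bytes / max_frame_bits(data_bytes)

bits = max_frame_bits(8)            # worst case for a full 8-byte frame
share = payload_efficiency(8)       # fraction of those bits that is payload
```

For a full 8-byte frame the worst case is 135 bits, so less than half of the raw bit rate is available for application data under worst-case stuffing.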

Volcano was designed to provide abstraction, composability, and identifier distribution reflecting true urgencies, while at the same time providing the most efficient utilization of the protocol.

32.2 Volcano Concepts

The Volcano concept is founded on the ability to guarantee the worst-case latencies of all frames sent in a multiprotocol network system. This is a key step because it gives the following:

• A way of guaranteeing that there are no communications-related timing problems.
• A way of maximizing the amount of information carried on the bus. This is important for reduced production costs.
• The possibility to develop highly automated tools for design of optimal network configurations.

The timing guarantee for CAN is provided by mathematical analysis developed from academic research [1]. Other protocols like FlexRay are predictable by design. For this reason, some of the subjects discussed below are CAN specific and others are independent of the protocol used.

The analysis is able to calculate the worst-case latency for each frame sent on the bus. This latency is the longest time from placing a frame in a CAN controller at the sending side to the time the frame is correctly received at all receivers.

The analysis needs to make several assumptions about how the bus is used. One of these assumptions is that there is a limited set of frames that can access the bus and that time-related attributes of these frames are known (e.g., frame size, frame periodicity, queuing jitter, and so on). Another important assumption is that the CAN hardware can be driven correctly:

• The internal message queue within any CAN controller in the system is organized (or can be used) such that the highest-priority message will be sent out first if more than one message is ready to be sent. (The hardware slot position-based arbitration is OK as long as the number of sent frames is less than the number of transmit slots available in the CAN controller.)

• The CAN controller should be able to send out a stream of scheduled messages without releasing the bus in the interframe space between two messages. Such devices will arbitrate for the bus right after sending the previous message and will only release the bus in case of lost arbitration.

A third important assumption is the error model: the analysis can account for retransmissions due to errors on the bus, but requires a model for the number of errors in a given time interval.
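Under assumptions of this kind, a worst-case latency bound can be computed by fixed-point iteration. The following minimal sketch is written in the style of fixed-priority response-time analysis and is not claimed to be the exact analysis of [1]: it ignores the error model, takes the deadline of a frame to be its period, and uses a deliberately pessimistic blocking term. The frame set, time unit, and function name are hypothetical.

```python
# Sketch: worst-case latency of CAN frames by fixed-point iteration, in the
# style of fixed-priority response-time analysis. Simplified illustration:
# no error model, deadline taken as the period, not the exact analysis of [1].
import math

def worst_case_latency(frames, index, bit_time):
    """Latency bound for frames[index]; frames are sorted highest priority first.

    Each frame is a dict with transmission time C, period T, and queuing
    jitter J, all in the same time unit as bit_time.
    """
    me = frames[index]
    higher = frames[:index]
    # Blocking: a frame of equal or lower priority that has already started
    # transmission cannot be preempted (pessimistic, but on the safe side).
    blocking = max(f["C"] for f in frames[index:])
    w = blocking
    while True:
        interference = sum(
            math.ceil((w + f["J"] + bit_time) / f["T"]) * f["C"] for f in higher
        )
        w_next = blocking + interference
        if w_next == w:
            return me["J"] + w + me["C"]
        if w_next + me["C"] > me["T"]:
            return math.inf          # no guarantee within one period
        w = w_next

BIT_US = 2                                   # 500 kbit/s, times in microseconds
frames = [
    {"C": 270, "T": 5000, "J": 0},           # highest priority
    {"C": 270, "T": 10000, "J": 0},
    {"C": 270, "T": 20000, "J": 0},          # lowest priority
]
latencies = [worst_case_latency(frames, i, BIT_US) for i in range(len(frames))]
```

The result is one latency bound per frame, which an offline tool could compare with the timing requirement of the signals carried in that frame.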

The Volcano software running in each ECU controls the CAN hardware and accesses the bus so that all these assumptions are met, allowing application software to rely on all communications taking place on time. This means that integration testing at the automotive manufacturer can concentrate on functional testing of the application software.

Another important benefit is that a large amount of communications protocol overhead can be avoided. Examples of how protocol overheads are reduced by obtaining timing guarantees are:

• There is no need to provide frame acknowledgment within the communications layer, dramatically reducing bus traffic. The only case where an ECU can fail to receive a frame via CAN is if the ECU is off the bus, a serious fault that is detected and handled by network management and onboard diagnostics.

• Retransmissions are unnecessary. The system-level timing analysis guarantees that a frame will arrive on time. Time-outs only happen after a fault, which can be detected and handled by network management or the onboard diagnostics.

A Volcano system never suffers from intermittent overruns during correct operation because of the timing guarantees, and therefore achieves these efficiency gains.

32.2.1 Volcano Signals and the Publish–Subscribe Model

The Volcano system provides signals as the basic communication object. Signals are small data items that are sent between ECUs.

The publish–subscribe model is used for defining signaling needs. For a given ECU there is a set of signals that are published (i.e., made available to the system integrator) and a number of subscribed signals (i.e., signals that are required as inputs to the ECU).

The signal model is provided directly to the programmer of ECU application software, and the Volcano software running in each ECU is responsible for translation between signals and CAN frames.

An important design requirement for the Volcano software was that the application programmer is unaware of the bus behavior: all the details of the network are hidden and the programmer only deals with signals through a simple API. This is crucial because a major problem with alternative techniques is that the application software makes assumptions about the CAN behavior, and therefore changing the bus behavior becomes difficult.

In Volcano there are three types of signals:

• Integer signals: These represent unsigned numbers and are of a static size between 1 and 16 bits. So, for example, a 16-bit signal can store integers in the range of 0 to 65,535.

• Boolean signals: These represent truth conditions (true/false). Note that this is not the same as a 1-bit integer signal (which stores the integer values 0 or 1).

• Byte signals: These represent data with no Volcano-defined structure. A byte signal consists of a fixed number of between 1 and 8 bytes.

The advantage of Boolean and integer signals is that the values of a signal are independent of processor architecture (i.e., the values of the signals are consistent regardless of the “endian-ness” of the microprocessors in each ECU).
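One way to obtain such architecture-independent values is to place signals in the frame with shifts and masks only, as in the minimal sketch below. The bit numbering convention (bit 0 is the least significant bit of byte 0) and the function names are assumptions made for this example; they are not taken from the Volcano implementation.

```python
# Sketch: place an unsigned integer signal at a bit offset in a frame's data
# field using only shifts and masks, so that the frame layout does not depend
# on the byte order of the host CPU. The bit numbering (bit 0 = least
# significant bit of byte 0) is an assumption made for this example.

def write_signal(data: bytearray, offset: int, width: int, value: int) -> None:
    """Store an unsigned value of the given bit width at the given bit offset."""
    if not 1 <= width <= 16:
        raise ValueError("integer signals are 1 to 16 bits wide")
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the signal")
    if offset < 0 or offset + width > 8 * len(data):
        raise ValueError("signal does not fit in the frame")
    frame = int.from_bytes(data, "little")
    mask = ((1 << width) - 1) << offset
    frame = (frame & ~mask) | (value << offset)
    data[:] = frame.to_bytes(len(data), "little")

def read_signal(data: bytes, offset: int, width: int) -> int:
    """Extract an unsigned value of the given bit width from the given bit offset."""
    return (int.from_bytes(data, "little") >> offset) & ((1 << width) - 1)

frame = bytearray(8)
write_signal(frame, offset=4, width=16, value=65535)   # largest 16-bit value
write_signal(frame, offset=0, width=3, value=5)        # neighbor in the same byte
```

Because the explicit byte order is part of the packing rule rather than of the processor, a sender and a receiver with different native byte orders read back the same numbers.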

For published signals, Volcano internally stores the value of these signals and, in case of periodic signals, will send them to the network according to a pattern defined offline by the system integrator. The system integrator also defines the initial value of a signal. The value of a signal persists until updated by the application program via a write call or until Volcano is reinitialized.

For subscribed signals, Volcano internally stores the current value of each signal. The system integrator also defines the initial value of a signal.

The value of a subscribed signal persists until:

• It is updated by receiving a new value from the network
• Volcano is reinitialized
• A signal refresh time-out occurs and the value is replaced by a substitute value defined by the application programmer

In the case where new signal values are received from the network, these values will not be reflected in the values of subscribed signals until a Volcano input call is made.

A published signal value is updated via a write call. The latest value of a subscribed signal is obtained via a read call. A write call for a subscribed signal is not permitted.

The last written value of a published signal may be obtained via a read call.
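The read, write, and input semantics described above can be summarized in a minimal sketch. The class and method names are hypothetical and do not reproduce the Volcano API; the sketch only mirrors the rules stated in the text.

```python
# Sketch of the read/write/input semantics described above. Class and method
# names are hypothetical and do not reproduce the Volcano API.

class SignalStore:
    def __init__(self, published, subscribed):
        # Both arguments map signal names to integrator-defined initial values.
        self._published = dict(published)
        self._subscribed = dict(subscribed)
        self._pending = {}                  # received, but not yet visible

    def write(self, name, value):
        """Update a published signal; subscribed signals cannot be written."""
        if name not in self._published:
            raise PermissionError(f"{name} is not published by this ECU")
        self._published[name] = value

    def read(self, name):
        """Last written value of a published signal, or current subscribed value."""
        if name in self._published:
            return self._published[name]
        return self._subscribed[name]

    def on_frame_received(self, values):
        """Called by the driver layer; does not change what read() returns."""
        self._pending.update(values)

    def input(self):
        """Make the most recently received values visible to the application."""
        self._subscribed.update(self._pending)
        self._pending.clear()

ecu = SignalStore(published={"speed": 0}, subscribed={"rpm": 0})
ecu.on_frame_received({"rpm": 900})
before_input = ecu.read("rpm")      # still the initial value
ecu.input()
after_input = ecu.read("rpm")       # now the received value
```

Separating reception from the input call gives the application a consistent snapshot of its inputs for the duration of one processing cycle.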

32.2.1.1 Update Bits

The Volcano concept permits placement of several signals with different update rates into the same frame. It provides a special mechanism (named update bit) to indicate which signals within the frame have actually been updated; i.e., the ECU generating the signal wrote a fresh value of the signal since the last time it was transmitted. The Volcano software on an ECU transmitting a signal automatically clears the update bit when it has been sent. This ensures that a Volcano-based ECU on the receiving side will know each time the signal has been updated (the application can see this update bit by using flags tied to an update bit; see below). Using update bits to their full extent requires that the underlying protocol is secure. (Frames cannot be lost without being detected.) The CAN protocol is regarded as such, but not the LIN protocol. Therefore, the update bit mechanism is limited to CAN within Volcano.
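The behavior of update bits on the transmitting side can be sketched as follows. This is a minimal illustration with hypothetical names, not Volcano code: writing a signal sets its update bit, and transmitting the frame clears all update bits.

```python
# Sketch: update bits for signals that share one frame but change at
# different rates. Names are hypothetical; this is not Volcano code.

class TxFrame:
    def __init__(self, signal_names):
        self.values = {name: 0 for name in signal_names}
        self.update_bits = {name: False for name in signal_names}

    def write(self, name, value):
        self.values[name] = value
        self.update_bits[name] = True       # fresh value since last transmission

    def transmit(self):
        """Return the frame contents, then clear the update bits."""
        snapshot = (dict(self.values), dict(self.update_bits))
        for name in self.update_bits:
            self.update_bits[name] = False
        return snapshot

frame = TxFrame(["fast", "slow"])
frame.write("fast", 1)
_, first = frame.transmit()     # only "fast" is marked as updated
frame.write("slow", 7)
_, second = frame.transmit()    # only "slow" is marked as updated
```

A receiver that inspects the transmitted update bits can thus tell a repeated old value from a freshly written one, even though both arrive in the same frame.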

32.2.1.2 Flags

A flag is a Volcano object purely local to an ECU. It is bound to one of two things:

• The update bit of a received Volcano signal; the flag is set when the update bit is set.
• The containing frame of a signal; the flag is set when the frame containing the signal is received (regardless of whether an update bit for the signal is set).

Many flags can be bound to each update bit, or the reception of a containing frame. Volcano sets all the flags bound to an object when the occurrence is seen. The flags are cleared explicitly by the application software.
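The two kinds of binding can be illustrated with a minimal sketch. All class, method, and signal names are hypothetical; the sketch only mirrors the rules above: the communication layer sets flags, and the application clears them.

```python
# Sketch: flags bound either to a signal's update bit or to the reception of
# its containing frame. All names are hypothetical.

class Flag:
    def __init__(self):
        self.is_set = False

    def clear(self):                        # cleared explicitly by the application
        self.is_set = False

class RxFrame:
    def __init__(self):
        self.frame_flags = []               # set on every reception
        self.update_flags = {}              # signal name -> flags set on update

    def bind_to_frame(self, flag):
        self.frame_flags.append(flag)

    def bind_to_update_bit(self, signal, flag):
        self.update_flags.setdefault(signal, []).append(flag)

    def receive(self, update_bits):
        """Set all flags bound to the frame and to signals whose update bit is set."""
        for flag in self.frame_flags:
            flag.is_set = True
        for signal, updated in update_bits.items():
            if updated:
                for flag in self.update_flags.get(signal, []):
                    flag.is_set = True

rx = RxFrame()
arrived, changed = Flag(), Flag()
rx.bind_to_frame(arrived)
rx.bind_to_update_bit("slow", changed)
rx.receive({"fast": True, "slow": False})   # frame seen, "slow" not updated
```

After this reception the frame-bound flag is set while the flag bound to the update bit of the unchanged signal stays clear.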

32.2.1.3 Time-Outs

A time-out is, like the flags, a Volcano object purely local to an ECU. The time-out is declared by the application programmer and is bound to a subscribed signal. A time-out condition occurs when the particular signal was not received within the given time limit. In this case, the signal (and a number of other signals) is set to a value specified as part of the declaration of the time-out. As with the flags, the time-out reset mechanism can be bound to either:

• The update bit of a received Volcano signal
• The frame carrying a specific signal
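A time-out with a substitute value can be sketched as follows. The class name, the polling style, and the millisecond clock are assumptions made for this example, and only a single bound signal is handled.

```python
# Sketch: a time-out bound to a subscribed signal. When the signal is not
# refreshed within the limit, a substitute value replaces it. Names, the
# polling style, and the millisecond clock are assumptions for this example.

class SignalTimeout:
    def __init__(self, limit_ms, substitute):
        self.limit_ms = limit_ms
        self.substitute = substitute
        self.last_refresh_ms = 0
        self.expired = False

    def refresh(self, now_ms):
        """Call when the bound update bit or the bound frame is seen."""
        self.last_refresh_ms = now_ms
        self.expired = False

    def poll(self, now_ms, signals, name):
        """Replace the signal value by the substitute once the limit has passed."""
        if now_ms - self.last_refresh_ms > self.limit_ms:
            self.expired = True
            signals[name] = self.substitute
        return self.expired

signals = {"rpm": 900}
watchdog = SignalTimeout(limit_ms=100, substitute=0)
watchdog.refresh(now_ms=50)
watchdog.poll(now_ms=120, signals=signals, name="rpm")   # 70 ms elapsed: value kept
watchdog.poll(now_ms=200, signals=signals, name="rpm")   # 150 ms elapsed: substituted
```

Whether `refresh` is driven by the update bit or by frame reception corresponds to the two bindings listed above.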

32.2.2 Frames

A frame is a container capable of carrying a certain amount of data (0 to 8 bytes for CAN and LIN). Several signals can be packed into the available data space and transmitted together in one frame on the network. The total size of a frame is determined by the protocol. A frame can be transmitted periodically or sporadically. Each frame is assigned a unique identifier. The identifier serves two purposes in the CAN case:

• Identifies and filters a frame on reception at an ECU
• Assigns a priority to a frame
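The packing of several signals into the data space of one frame, described above, can be sketched as follows. The layout format and the bit numbering (offset 0 is the least significant bit of a little-endian payload) are assumptions made for this example; the sketch does not check for overlapping signals.

```python
def pack_frame(signals, layout, frame_bytes=8):
    """Pack unsigned signal values into one frame payload.

    layout maps signal name -> (bit offset, bit length).
    """
    payload = 0
    for name, (offset, length) in layout.items():
        value = signals[name]
        if not 0 <= value < (1 << length):
            raise ValueError(f"{name} does not fit in {length} bits")
        if offset + length > 8 * frame_bytes:
            raise ValueError(f"{name} exceeds the {frame_bytes}-byte frame")
        payload |= value << offset
    return payload.to_bytes(frame_bytes, "little")


def unpack_signal(data, offset, length):
    """Extract one unsigned signal value from a received payload."""
    return (int.from_bytes(data, "little") >> offset) & ((1 << length) - 1)


layout = {"door_open": (0, 1), "wiper_mode": (1, 2), "speed": (8, 16)}
data = pack_frame({"door_open": 1, "wiper_mode": 2, "speed": 1234},
                  layout, frame_bytes=3)
```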

32.2.2.1 Immediate Frames

Volcano normally hides the existence of network frames from the application designer. However, in certain cases there is a need to send and receive frames with very short processing latencies. In these cases, direct application support is required. Such frames are designated immediate frames.

There are two Volcano calls to handle immediate frames:

• A transmit call, which immediately sends the designated frame to the network
• A receive call, which immediately processes the designated incoming frame if that frame is pending

There is also a read-update-bit call to test the update bit of a subscribed signal within an immediate frame.

The signals packed into an immediate frame can be accessed with normal read and write function calls in the same way as all other normal signals.

The application programmer is responsible for ensuring that the transmit call is made only when the signal values of published signals are consistent.

32.2.2.2 Frame Modes

Volcano allows different frame modes to be specified for an ECU. A frame mode is a description of an ECU working mode, where a set of frames (signals) can be active (input and output). A frame can be active in one or many frame modes. The timing properties of a frame do not have to be the same in the different frame modes that support it.

32.2.3 Network Interfaces

A network interface is the device used to send and receive frames to and from networks. A network interface connects a given ECU to the network. In the CAN case, more than one network interface (CAN controller) on the same ECU may be connected to the same network. Likewise, an ECU may be connected to more than one network.

The network interfaces in Volcano are protocol specific. The protocols currently supported are CAN and LIN; FlexRay and MOST are under implementation.

The network interface is managed by a standard set of Volcano calls. These allow the interface to be initialized or reinitialized, connected to the network (i.e., begin operating the defined protocol), and disconnected from the network (i.e., take no further part in the defined protocol). There is also a Volcano call to return the status of the interface.

32.2.4 The Volcano API

The Volcano API provides a set of simple calls to manipulate signals and to control the CAN/LIN controllers. There are also calls to control Volcano sending to and receiving from networks. To manipulate signals there are read and write calls. A read call returns to the caller the latest value of a signal; a write call sets the value of a signal. The read and write calls are the same regardless of the underlying network type.

32.2.4.1 Volcano Thread of Control

There are two Volcano calls that must be called at the same fixed rate: v_input() and v_output(). If the v_gateway() function is used, the same calling rate should be used as for the v_input() and v_output() functions. The v_output() call places the frames into the appropriate controllers. The v_input() call takes received frames and makes the signal values available to read calls. The v_gateway() call copies values of signals in frames received from the network to values of signals in frames sent to the network. The v_sb_tick() call handles transmitting and receiving frames for subbuses.

Volcano also provides a very low latency communication mechanism in the form of the immediate frame API. This is a view of frames on the network that allows transmission and reception from and to the Volcano domain without the normal Volcano input and output latencies, or mutual exclusion requirements with the v_input() and v_output() calls. There are two communication calls in the immediate signal API: v_imf_rx() and v_imf_tx().

The v_imf_tx() call copies values of immediate signals into a frame and places the frame in the appropriate CAN controller for transmission. The v_imf_rx() call takes a received frame containing immediate signals and makes the signal values available to read calls.

A third call, v_imf_queued(), allows the user to see if an immediate frame has really been sent on the network.

The controller calls allow the application to initialize, connect, and disconnect from networks, and to place the controllers into sleep mode, among others.

32.2.4.2 Volcano Resource Information

The ambition of the Volcano concept is to provide a fully predictable communications solution. In order to achieve this, the resource usage of the Volcano embedded part has to be determined. Resources of special interest are memory and execution time.

32.2.4.2.1 Execution Time of Volcano Processing Calls

In order to bound processing time, a budget for the v_input() call (i.e., the maximum number of frames that will be processed by a single call to v_input()) has to be established. A corresponding process for transmitted frames applies as well.
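The idea of a per-call frame budget can be sketched with a simple queue model. The function process_input below is a hypothetical stand-in written for this illustration; it is not v_input() itself, and the queue and signal store are assumed data structures.

```python
from collections import deque


def process_input(rx_queue, signal_store, budget):
    """Process at most `budget` received frames; return how many were handled.

    rx_queue:     deque of (frame_id, dict of signal values)
    signal_store: dict holding the latest value of each signal
    """
    processed = 0
    while rx_queue and processed < budget:
        _frame_id, signals = rx_queue.popleft()
        signal_store.update(signals)  # make values available to read calls
        processed += 1
    return processed


pending = deque([(1, {"a": 1}), (2, {"b": 2}), (3, {"a": 3})])
store = {}
handled = process_input(pending, store, budget=2)
```

Because the loop stops after the budgeted number of frames, the worst-case execution time of one call is bounded by the budget rather than by the number of frames that happen to be pending; the remaining frame is handled by a later call.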

32.2.5 Timing Model

The Volcano timing model covers end-to-end timing (from button press to activation). To be able to set in context the signal timing information needed in order to analyze a network configuration of signals and frames, a timing model is used. This section defines the required information that must be provided by an application programmer in order to be able to guarantee the end-to-end timing requirements.

A Volcano signal is transported over a network within a frame. Figure 32.2 identifies six time points between the generation and consumption of a signal value:

1. Notional generation (signal generated): either by hardware (e.g., switch pressed) or software (e.g., time-out signaled). The user can define this point to best reflect his system.
2. First v_output() (or v_imf_tx() for an immediate frame) at which a new value is available. This is the first such call after the signal value is written by a write call.
3. The frame containing the signal is first entered for transmission (arbitration on a CAN bus).
4. Transmission of the frame completes successfully (i.e., the subscriber’s communication controller receives the frame from the network).
5. v_input() (or v_imf_rx() for an immediate frame) makes the signal available to the application.
6. Notional consumption: the user application consumes the data. The user can define this point to best reflect his system.

FIGURE 32.2 The Volcano timing model. [Timeline of the six time points: (1) notional generation, (2) first v_output at which new value is available, (3) frame enters arbitration, (4) transmission completed, (5) first v_input at which signal is available, (6) notional consumption. The intervals TPL, TBT, TT, TAT, and TSL lie between consecutive points; max_age spans points 1 to 6.]

The max_age of the signal is the maximum age, measured from notional generation, at which it is acceptable for notional consumption. The max_age is the overall timing requirement on a signal.

TPL (publish latency) is the time from notional generation to the first v_output() call when the signal value is available to Volcano (a write call has been made). It will depend on the properties of the publishing application. Typical values might be the frame_processing_period (if the signal is written fresh every period, but this is not synchronized with v_output()), the offset between the write call and v_output() (if the two are synchronized), or the sum of the frame_processing_period and the period of some lower-rate activity that generates the value. This value must be given by the application programmer.

TSL (subscribe latency) is the time from the first v_input() that makes the new value available to the application to the time when the value is consumed. The consumption of a signal is a user-defined event that will depend on the properties of the subscribing function. As an example, it can be a lamp being lit or an actuator starting to move. This value must be given by the application programmer.

The intervals TBT, TT, and TAT are controlled by the Volcano 5 configuration and are dependent upon the nature of the frame in which the signal is transported.

The value TBT is the time before transmission (the time from the v_output() call until the frame enters arbitration on the bus). TBT is a per-frame value that depends on the type of frame carrying the signal (see later sections). This time is shared by all signals in the frame and is common to all subscribers to those signals.

The value TAT is the time after transmission (the time from when the frame has been successfully transmitted on the network until the next v_input() call). TAT is a per-frame value that may be different for each subscribing ECU.

The value TT is the time required to transmit the frame (including the arbitration time) on the network.
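Since the five intervals cover the whole span from notional generation to notional consumption, a simple end-to-end check is to add their worst-case values and compare the sum with max_age. The sketch below assumes exactly that additive model, with all times in microseconds; the class name SignalPath and the numbers are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class SignalPath:
    """Worst-case interval values for one signal and one subscriber (in us)."""
    t_pl: int     # publish latency (given by the application programmer)
    t_bt: int     # time before transmission
    t_t: int      # transmission time, including arbitration
    t_at: int     # time after transmission
    t_sl: int     # subscribe latency (given by the application programmer)
    max_age: int  # overall timing requirement on the signal

    def end_to_end(self):
        return self.t_pl + self.t_bt + self.t_t + self.t_at + self.t_sl

    def meets_max_age(self):
        return self.end_to_end() <= self.max_age


path = SignalPath(t_pl=5000, t_bt=1000, t_t=540, t_at=5000, t_sl=2000,
                  max_age=20000)
```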

32.2.5.1 Jitter

The application programmer at the supplier must also provide information about the jitter to the systems integrator. This information is as follows:

The input_jitter and output_jitter refer to the variability in the time taken to complete the v_input() and v_output() calls, measured relative to the occurrence of the periodic event causing Volcano processing to be done (i.e., calls to v_input(), v_gateway(), and v_output() to be made). Figure 32.3 shows how the output_jitter is measured.

In Figure 32.3, E marks the earliest completion time of the v_output() call and L marks the latest completion time, relative to the start of the cycle. The output_jitter is therefore L – E. The input_jitter is measured according to the same principles.

If a single-thread system is used, without interrupts, the calculation of the input_jitter and output_jitter is straightforward: the earliest time is the best-case execution time of all the calls in the cycle (including the v_output() call), and the latest time is the worst-case execution time of all the calls. The situation is more complex if interrupts can occur or the system consists of multiple tasks, since the latest time must take into account preemption from interrupts and other tasks.

32.2.6 Capture of Timing Constraints

The declaration of a signal in a Volcano fixed configuration file provides syntax to capture the following timing-related information:

• Whether a signal is state or state change (info_type)
• Whether a signal is sporadic or periodic (generation_type)
• The latency
• The min_interval
• The max_interval
• The max_age

The first two (together with whether the signal is published or subscribed to) provide signal properties that determine the kind of signal.

A state signal carries a value that completely describes the signaled property (e.g., the current position of a switch). A subscriber to such a signal need only observe the signal value when the information is required for the subscriber’s purposes (i.e., signal values can be missed without affecting the usefulness of later values).

A state change signal carries a value that must always be observed in order to be meaningful (e.g., distance traveled since last signal value). A subscriber must observe every signal value.

A sporadic signal is one that is written by the application in response to some event (e.g., a button press).

A periodic signal is one that is written by the application at regular intervals.

The latency of a signal is the time from notional generation to being available to Volcano (for a published signal), or from being made available to the application by Volcano to notional consumption (for a subscribed signal). Note that immediate signals (those in immediate frames) include time taken to move frames to and from the network in these latencies.

The min_interval has different interpretations for published and subscribed signals. For a published signal, it is the minimum time between any pair of write calls to the signal (this allows, for example, the calculation of the maximum rate at which the signal could cause a sporadic frame carrying it to be transmitted). For a subscribed signal, it is the minimum acceptable time between arrivals of the signal. This is optional: it is intended to be used if the processing associated with the signal is triggered by arrival of a new value, rather than periodic. In such a case, it provides a constraint that the signal should not be connected to a published signal with a faster rate.

The max_interval has different interpretations for published and subscribed signals. For a published signal, the interesting timing information is already captured by min_interval and publish latency. For a subscribed signal, it is the maximum interval between notional consumptions of the signal (i.e., it can be used to determine that signal values are sampled quickly enough that none will be missed).

The max_age of a signal is the maximum acceptable age of a signal at notional consumption, measured from notional generation. This value is meaningful for subscribed signals.
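The way min_interval and max_interval constrain a publisher-subscriber connection can be sketched as a pair of checks. The record types and the function connection_problems are hypothetical, and the second rule (a state change subscriber must sample at least as often as the publisher may write) is one reading of the text above that ignores network jitter; it is not presented as the rule set applied by the Volcano tools.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PublishedTiming:
    info_type: str     # "state" or "state_change"
    min_interval: int  # minimum time between write calls


@dataclass
class SubscribedTiming:
    min_interval: Optional[int] = None  # min acceptable time between arrivals
    max_interval: Optional[int] = None  # max time between notional consumptions


def connection_problems(pub, sub):
    """Return a list of reasons why pub should not be connected to sub."""
    problems = []
    if sub.min_interval is not None and pub.min_interval < sub.min_interval:
        problems.append("publisher may write faster than the subscriber accepts")
    if (pub.info_type == "state_change"
            and sub.max_interval is not None
            and sub.max_interval > pub.min_interval):
        problems.append("subscriber may miss state change values")
    return problems
```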

FIGURE 32.3 Measurement of output jitter. [Timeline over successive frame processing periods (labeled TV) showing the occurrence of the periodic event that initiates the Volcano processing calls, the execution of other computation, the best-case and worst-case execution times of the v_output call, and the earliest (E) and latest (L) completion of the v_output call.]

In addition to the signal timing properties described above, the Volcano fixed configuration file provides syntax to capture the following additional timing-related information:

• Volcano processing period
• Volcano jitter time

The Volcano processing period defines the nominal intervals between successive v_input() calls on the ECU and between successive v_output() calls (i.e., the rates of the calls are the same, but v_input() and v_output() are not assumed to become due at the same instant). For example, if the Volcano processing period is 5 ms, then each v_output() call becomes due 5 ms after the previous one became due.

The Volcano jitter defines the time by which the actual call may lag behind the time at which it became due. Note that becomes due refers to the start of the call, and jitter refers to the completion of the call.
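The processing period and the single-thread jitter calculation described in the jitter section can be worked through numerically. The helper names and the execution times below are assumptions chosen for illustration; the jitter formula applies only to the single-thread case without interrupts.

```python
def due_times(period, count, first_due=0):
    """Instants at which successive calls become due, one per period."""
    return [first_due + k * period for k in range(count)]


def single_thread_jitter(calls):
    """Jitter of the last call in a cycle for a single-thread system.

    calls: list of (best_case, worst_case) execution times of every call in
    the cycle up to and including the call of interest, e.g. v_output().
    Returns (E, L, jitter) relative to the start of the cycle.
    """
    earliest = sum(best for best, _ in calls)
    latest = sum(worst for _, worst in calls)
    return earliest, latest, latest - earliest


# Illustrative cycle: v_input(), application code, v_output() (times in us).
cycle = [(100, 150), (200, 400), (50, 80)]
e, l, output_jitter = single_thread_jitter(cycle)
schedule_ms = due_times(period=5, count=4)
```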

32.3 Volcano Network Architect

To manage increasing complexity in electrical architectures, a structured development approach is believed essential to ensure correctness by design. The Volcano Automotive Group has developed a network design tool, Volcano Network Architect (VNA), to support a development process based on strict systems engineering principles. Gatewaying of signals between different networks is automatically handled by the VNA tool and the accompanying embedded software.

The tool supports partitioning of responsibilities into different roles, such as system integrator and function owner. Third-party tools may be used for functional modeling. These models can be imported into VNA.

VNA is the top-level tool in the Volcano Automotive Group’s tool chain for designing vehicle network systems. The tool chain supports important aspects of systems engineering such as:

• Use of functional modeling tools
• Partitioning of responsibilities
• Abstracting away from hardware- and protocol-specific details, providing a signal-based API for the application developer
• Abstracting away from the network topology through automatic gatewaying between different networks
• Automatic frame compilation to ensure that all declared requirements are fulfilled (if possible), that is, delivering correctness by design
• Reconfiguration flexibility by supporting post-compile-time reconfiguration capability

The VNA tool supports network design and makes management and maintenance of distributed network solutions more efficient. The tool supports capturing of requirements and then takes a user through all stages of network definition.

32.3.1 The Car OEM Tool Chain: One Example

Increasing competition and complex electrical architectures demand enhanced processes. Function modeling has proved to be a suitable tool to capture the functional needs in a vehicle. Tools such as Rational Rose provide a good foundation to capture all different functions, and other tools (Statemate, Simulink) model them in order to allocate objects and functionality in the vehicle. Networking is essential since the functionality is distributed among a number of ECUs in the vehicle. Substantial parts of the outcome from the function modeling are highly suitable to use as input to a network design tool such as VNA.

The amount of information required to properly define the networks is vast. To support input of data, VNA provides an automated import from third-party tools through an Extensible Markup Language (XML)-based format.

It is the job of the signal database administrator/system integrator to ensure that all data entered into the system are valid and internally consistent. VNA supports this task through a built-in multilevel consistency checker that verifies all data.

In this particular approach, the network is designed by the system integrator in close contact with the different function owners in order to capture all necessary signaling requirements, both functional and nonfunctional (including timing). When the requirements are agreed upon and documented in VNA, the system integrator uses VNA to pack all signals into frames; this can be done manually or automatically. The algorithm used by VNA handles gatewaying by partitioning end-to-end timing requirements into requirements per network segment.

All requirements are captured in the form of a Microsoft Word document called software requirement specification (SWRS) that is generated by VNA and sent to the different node owners as a draft copy to be signed off. When all SWRSs have been signed off, VNA automatically creates all necessary configuration files used in the vehicle, along with a variety of files for third-party analysis and measurement tools.

The network-level (global) configuration files are used as input to the Volcano configuration tool and Volcano back-end tool in order to generate a set of downloadable binary configuration files for each node. The use of reconfigurable nodes makes the system very flexible since the Volcano concept separates application-dependent information and network-dependent information. A change in the network by the system integrator can easily be applied to a vehicle without having to recompile the application software in the nodes. The connection between function modeling and VNA provides good support for iterative design. It verifies network consistency and timing up front, to ensure a predictable and deterministic network.

32.3.2 VNA: Tool Overview

32.3.2.1 Global Objects

The work flow in VNA ensures that all relevant information about the network is captured. Global objects are created first and then (re)used in several projects. The VNA user works with objects of types such as signals, nodes, interfaces, etc. These objects are used to build up the networks used in a car. Signals are defined by name and type and can have logical or physical encoding information attached. Interfaces detailing hardware requirements are defined, leading to the description of actual nodes on a network. For each node, receive and transmit signals are defined, and timing requirements are provided for the signals. This information is intended for global use, that is, across car variants, platforms, etc.

32.3.2.2 Project- or Configuration-Related Data

When all global data have been collected, the network will be designed by connecting the interfaces in a desired configuration. VNA has strong project and variant handling. Different configurations can selectively use or adapt the global objects, for example, by removing a high-end feature from a low-end car model. This means that VNA can manage multiple configurations, designs, and releases, with version and variant handling.

The release handling ensures that all components in a configuration are locked. It is, however, still possible to reuse the components in unchanged form. This makes it possible to go back to any released configuration at any point in time.

32.3.2.3 Database

All data objects, both global and configuration specific, are stored in a common database (Figure 32.4). The VNA tool was designed to have one common multiuser database per car OEM. In order to secure the highest possible performance, all complex and time-consuming VNA operations are performed toward a local RAM mirror of the database. A specially designed database interface ensures consistency in the local mirror. Operations that are not time critical, such as database management, operate toward the database.

The built-in multiuser functionality allows multiple users to access all data stored in the database simultaneously. To ensure that a data object is not modified by more than one user at a time, the object must be locked before any modification; while an object is locked for modification, read access is still allowed for all users.

32.3.2.4 Version and Variant Handling

The VNA database implements functionality for variant and version handling. Most of the global data objects, e.g., signals, functions, and nodes, may exist in different versions, but only one version of an object can be used in a specific project or configuration.

The node objects can be seen as the main global objects, since hierarchically they include all other types of global objects. The node objects can exist in different variants, but only one object can be used from a variant folder in a particular project or configuration.

32.3.2.5 Consistency Checking

Extensive functionality for consistency checking is built into the VNA tool. The consistency check can be manually activated when needed, but it is also running continuously to check user input and give immediate feedback on any suspected inconsistency. The consistency check ensures that the network design follows predefined rules and generates errors when appropriate.

32.3.2.6 Timing Analysis/Frame Compilation

The Volcano concept is based on a foundation of guaranteed message latency and a signal-based publish–subscribe model. This provides abstraction by hiding the network and protocol details, allowing the developer to work in the application domain with signals, functions, and related timing information.

Much effort has been spent on developing and refining the timing analysis in VNA. The timing analysis is built upon a scheduling model called DMA (deadline monotonic analysis) and calculates the worst-case latency for each frame among a defined set of frames sent on the bus. Parts of this functionality have been built into the consistency check routine as described above, but the real power of the VNA tool is found in the frame packer/frame compiler functionality.
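To give a feel for what a deadline monotonic worst-case latency calculation involves, the sketch below uses a simplified fixed-priority, non-preemptive response-time iteration of the kind commonly applied to CAN. It is not VNA's algorithm: the Frame type, the time unit (bit times), and the omission of release jitter, error frames, and busy-period refinements are all simplifying assumptions.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    name: str
    c: int         # worst-case transmission time, in bit times
    t: int         # minimum interval between transmissions, in bit times
    deadline: int  # required worst-case latency, in bit times


def ceil_div(a, b):
    return -(-a // b)


def worst_case_latencies(frames, tau_bit=1):
    """Return {name: latency}; None means the deadline cannot be met."""
    # Deadline monotonic priority assignment: shortest deadline first.
    ordered = sorted(frames, key=lambda f: f.deadline)
    result = {}
    for i, frame in enumerate(ordered):
        higher, lower = ordered[:i], ordered[i + 1:]
        # A transmission in progress cannot be preempted, so the longest
        # lower-priority frame may block this one.
        blocking = max((f.c for f in lower), default=0)
        w = blocking
        while True:
            interference = sum(ceil_div(w + tau_bit, f.t) * f.c for f in higher)
            w_next = blocking + interference
            if w_next + frame.c > frame.deadline:
                result[frame.name] = None
                break
            if w_next == w:
                result[frame.name] = w + frame.c
                break
            w = w_next
    return result


# Three 135-bit frames with periods (= deadlines) of 2500, 5000, 10000 bits.
latencies = worst_case_latencies([
    Frame("A", 135, 2500, 2500),
    Frame("B", 135, 5000, 5000),
    Frame("C", 135, 10000, 10000),
])
```

A frame compiler could use such a result to reject any identifier assignment for which a latency comes back as None, which is the sense in which timing is verified before a configuration is generated.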

FIGURE 32.4 The database is a central part of the VNA system. In order to ensure highest possible performance, each instance of VNA accesses a local mirror of the database that is continuously synchronized with its parent. [Diagram: VNA components (GUI, consistency check, frame compiler, configuration generator, and database tools for backup, console, conversion, and user management) around the database and its mirror in RAM. Outputs shown include Volcano configuration files (fixed, target, network); specifications, reports, and documents (SWRS, HTML); LIN description files (.ldf); generic XML export/import; FIBEX XML files; ASAP files; and a conversion tool for customer and third-party formats.]

The frame packer/compiler attempts to create an optimal packing of signals into frames and then calculates the proper IDs for every frame, ensuring that all the timing requirements captured earlier in the process are fulfilled (if possible). This automatic packing of multiple signals into each frame makes more efficient use of the data bus by amortizing some of the protocol overheads involved, thus lowering bus load. The combined effect of multiple signals per frame and perfect filtering results in a lower interrupt and CPU load, which means that the same performance can be obtained at lower cost. The frame packer can create the best solution if all nodes are reconfigurable. To handle carryover nodes that are not reconfigurable (ROM based), these nodes and their associated frames can be classed as fixed. Frame packing can also be performed manually if desired. Should changes to the design be required at a later time, the process allows rapid turnaround of design changes, rerunning of the frame compiler, and regeneration of the configuration files.

The VNA tool can be used to design network solutions that are later realized by embedded software from any provider. However, the VNA tool is designed with the Volcano embedded software (VTP) in mind, which implements the expected behavior into the different nodes. To get the full benefits of the tool chain, VNA and VTP should be used together.

32.3.2.7 Volcano Filtering Algorithm

A crucial aspect of network configuration is how to choose identifiers so that the load on a CPU related to handling of interrupts generated by frames of no interest for the particular node is minimized: most CAN controllers have only limited filtering capabilities. The Volcano filtering algorithm is designed to achieve this.

An identifier is split into two parts: priority bits and filter bits. All frames on a network must have unique priority bits; for real-time performance, the priority setting of a frame should reflect the relative urgency of the frame. The filter bits are used to determine if a CAN controller should accept or reject a frame. Each ECU that needs to receive frames by interrupts is assigned a single filter bit; the hardware filtering in the CAN controller is set to “must match 1” for the filter bit and “don’t care” for all other bits.

The filter bits of a frame are set for each ECU by which the frame needs to be seen. So a frame that is broadcast to all ECUs on the network is assigned filter bits all set to 1. For a frame sent to a single ECU on the network, just one filter bit is set. Figure 32.5 illustrates this; the frame shown is sent to four ECUs.

If an ECU takes an interrupt for just the frames that it needs, then the filtering is said to be perfect. In some systems there may be more ECUs needing to receive frames by interrupt than there are filter bits in the network; in this case, some ECUs will need to share a bit. If this happens, then Volcano will filter the frames in software, using the priority bits to uniquely identify the frame and discarding unwanted frames.

FIGURE 32.5 A CAN identifier on an extended CAN network. The network clause has defined the CAN identifiers to have 7 priority bits and 13 filter bits. The least significant bit of the value corresponds with the bit of the identifier transmitted last. Only legal CAN identifiers can be specified: identifiers with the seven most significant bits equal to 1 are illegal according to the CAN standard. [Diagram: identifier bits 28 down to 0, with the 7 priority bits at the most significant end, the 13 filter bits at the least significant end, and unused bits (0) in between.]

The priority bits are the most significant bits. They indicate priority and uniquely identify a frame. The number of priority bits must be large enough to uniquely identify a frame in a given network configuration. The priority bits for a given frame are set by the relative urgency (or deadline) of the frame. This is derived from how urgently each subscriber of a signal in the frame needs the signal (as described earlier). In most systems, 5 to 10 priority bits are sufficient.

The filter bits are the remaining least significant bits and are used to indicate the destination ECUs for a given frame. This is done by treating them as a target mask: each ECU (or group of ECUs) is assigned a single filter bit. The filtering for a CAN controller in the ECU is set up to accept only frames where the corresponding filter bit in the identifier is set. This can give perfect filtering: an interrupt is raised if and only if the frame is needed by the ECU. Perfect filtering can dramatically reduce the CPU load compared to filtering in software. Indeed, perfect filtering is essential if the system integrator needs to connect ECUs with slow 8-bit CPUs to high-speed CAN networks (if filtering were implemented in software, the CPU would spend most of its available processing time handling interrupts and discarding unwanted frames). The filtering scheme also allows for broadcast of a frame to an arbitrary set of ECUs. This can reduce the traffic on the bus since frames do not need to be transmitted several times to different destinations (Figure 32.6).
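The split of an identifier into priority bits and filter bits can be made concrete with a short sketch. It assumes a 29-bit extended CAN identifier with the priority field at the most significant end, the filter mask at the least significant end, and zeros in between; the function names and the example values are hypothetical and are not taken from the Volcano tools.

```python
ID_BITS = 29  # length of an extended CAN identifier


def make_identifier(priority, dest_filter_bits, n_priority=7, n_filter=13):
    """Build an identifier from a priority and a set of destination filter bits."""
    if n_priority + n_filter > ID_BITS:
        raise ValueError("priority and filter bits exceed the identifier")
    if not 0 <= priority < (1 << n_priority):
        raise ValueError("priority does not fit in the priority bits")
    mask = 0
    for bit in dest_filter_bits:
        if not 0 <= bit < n_filter:
            raise ValueError("filter bit out of range")
        mask |= 1 << bit
    can_id = (priority << (ID_BITS - n_priority)) | mask
    # CAN forbids identifiers whose seven most significant bits are all 1.
    if can_id >> (ID_BITS - 7) == 0x7F:
        raise ValueError("illegal CAN identifier")
    return can_id


def hardware_accepts(can_id, ecu_filter_bit):
    # "Must match 1" on the ECU's own filter bit, "don't care" elsewhere.
    return bool(can_id & (1 << ecu_filter_bit))


# A frame with priority value 11 addressed to the ECUs on filter bits 0, 5, 6, 12.
can_id = make_identifier(priority=11, dest_filter_bits={0, 5, 6, 12})
```

Because a numerically lower CAN identifier wins arbitration, placing the priority field in the most significant bits means the filter mask never changes the relative priority of two frames with different priority values.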

Because the system integrator is able to define the configuration data and because those data define the complete network behavior of an ECU, the in-vehicle networks are under the control of the system integrator.

32.3.2.8 Multiprotocol Support

The existing version of VNA supports the complementary, contemporary network protocols of CAN and LIN. The next version will also have support for the FlexRay protocol. A prototype version of VNA with partial MOST support is currently under construction. As network technology continues to advance into other protocols, VNA will also move to support these advances.

FIGURE 32.6 VNA screen.

32.3.2.9 Gatewaying

A network normally consists of multiple network segments using different protocols. Signals may be transferred from one segment to another through a gateway node. As implemented throughout the whole tool chain of the Volcano Automotive Group, gatewaying of data, even across multiple protocols, is automatically configured in VNA. In this way, VNA allows any node to subscribe to any signal generated on any network without needing to know how this signal is gatewayed from the publishing node. Timing requirements over one or more gateways are also handled by VNA. The Volcano solution requires no special gatewaying hardware and therefore provides a cost-efficient solution to signal gatewaying.
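At its simplest, signal gatewaying is a table-driven copy from subscribed signals on one segment to published signals on another. The sketch below illustrates only that idea; the function gateway_step, the routing table format, and the signal names are hypothetical, and the configuration that VNA generates for a real gateway node is not shown in this chapter.

```python
def gateway_step(received, routing):
    """Copy received signal values to outgoing signals.

    received: dict of signal name -> latest value on the source segment
    routing:  dict of source signal name -> destination signal name
    """
    return {dst: received[src] for src, dst in routing.items()
            if src in received}


routing = {"wheel_speed": "body_wheel_speed"}
outgoing = gateway_step({"wheel_speed": 88, "door_open": 1}, routing)
```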

32.3.2.10 Data Export and Import

The VNA tool enables the OEMs to get a close integration between VNA and functional modeling tools and to share data between different OEMs and subcontractors, e.g., node developers.

Support of emerging standards such as FIBEX and XML will further simplify information sharing and become a basis for configuration of third-party communication layers.

32.4 Volcano Software in an ECU

The Volcano tool chain includes networking software running in each ECU in the system. This software uses the configuration data to control the transmission and reception of frames on one or more buses and present signals to the application programmer. One view of the Volcano network software is as a communications engine under the control of the system integrator. The view of the application programmer is different: the software is a black box into which published signals are placed, and out of which can be summoned subscribed signals.

The main implementation goals for Volcano target software are as follows:

• Predictable real-time behavior: no data loss under any circumstances
• Efficiency: low RAM usage, fast execution time, small code size
• Portability: low cost of moving to a new platform

32.4.1 Volcano Configuration

Building a configuration is a key part of the Volcano concept. As already mentioned, a configuration is based around details, such as how signals are mapped into frames, allocation of identifiers, and processing of intervals.

For each ECU, there are two authorities acting in the configuration process: the system integrator and the ECU supplier. The system integrator provides the Volcano configuration for the ECU regarding the network behavior at the system level, and the supplier provides the Volcano configuration data for the ECU in terms of the internal behavior.

32.4.1.1 The Configuration Files

The Volcano configuration data are captured in four different types of files:

• Fixed information (agreed upon between the supplier and system integrator).
• Private information provided by the ECU supplier. The ECU supplier does not necessarily have to provide this information to the system integrator.
• Network configuration information supplied by the system integrator.
• Target information (supplier description of the ECU published to the system integrator).



32.4.1.1.1 Fixed InformationThe fixed information is the most important in achieving a working system. It consists of a completedescription of the dependencies between the ECU and the network. This includes a description of thesignals the ECU needs from the network, how often Volcano calls will be executed, and so on. Theinformation also includes a description of the CAN controller(s) and possible limitations regardingreception and transmission boundaries and supported frame modes. The fixed information forms a“contract” between the supplier and the system integrator: the information should not be changedwithout both parties being aware of the changes. The fixed information file is referred to as the FIX file.

32.4.1.1.2 Private Information

The private file contains additional information for Volcano that does not affect the network: time-out values associated with signals and the flags used by the application. The private information file is referred to as the PRI file.

32.4.1.1.3 Network Information

The network information specifies the network configuration of the ECU. The system integrator must define the number of frames sent from and received by the ECU, the frame identifier and length, and details of how the signals in the agreed-upon information are mapped into these frames. Here, the vehicle manufacturer also defines the different frame modes used in the network. The network information file is referred to as the NET file.
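To make the idea of mapping signals into frames concrete, the following Python sketch checks that a set of signals fits into a frame payload without overlapping. The `Signal` and `FrameLayout` classes, their field names, and the example identifiers and signal names are hypothetical illustrations; they do not reproduce the NET file format, which this chapter does not specify.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Signal:
    name: str
    start_bit: int  # position of the first bit inside the frame payload
    width: int      # size of the signal in bits


@dataclass
class FrameLayout:
    identifier: int    # frame identifier (made-up example value)
    length_bytes: int  # payload length in bytes
    signals: list = field(default_factory=list)


def mapping_errors(frame):
    """Return a list of human-readable problems with the signal layout."""
    errors = []
    used = {}  # bit position -> name of the signal occupying it
    total_bits = frame.length_bytes * 8
    for sig in frame.signals:
        if sig.width <= 0 or sig.start_bit < 0:
            errors.append(f"{sig.name}: invalid position or width")
            continue
        if sig.start_bit + sig.width > total_bits:
            errors.append(f"{sig.name}: does not fit in {frame.length_bytes} bytes")
        for bit in range(sig.start_bit, sig.start_bit + sig.width):
            if bit in used:
                errors.append(f"{sig.name}: overlaps {used[bit]}")
                break
            used[bit] = sig.name
    return errors


good = FrameLayout(0x120, 2, [Signal("engine_speed", 0, 12), Signal("brake_active", 12, 1)])
bad = FrameLayout(0x121, 1, [Signal("wheel_speed", 0, 10), Signal("abs_flag", 9, 1)])
print(mapping_errors(good))
print(mapping_errors(bad))
```

The first layout produces an empty list. The second reports that both signals exceed the one-byte payload and that `abs_flag` overlaps `wheel_speed`.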

32.4.1.1.4 Target Information

The target information describes the resources that the supplier has allocated to Volcano in the ECU. It describes the ECU’s hardware (e.g., the CAN controllers used and where they are mapped in memory). The target information file is referred to as the TGT file.

32.4.2 Work Flow

The Volcano system identifies two major roles in the development of a network of ECUs: the application designer (which may include the designer of the ECU system or the application programmer) and the system integrator. The application designer is typically located at the organization developing the ECU hardware and application software. The system integrator is typically located at the vehicle manufacturer. The interface between the application designer and the system integrator is carefully controlled, and the information owned by each side is strictly defined. The Volcano tool chain implementation clearly reflects this partitioning of roles.

The Volcano system includes a number of tools to help the system integrator define a network configuration. The Network Architect is a high-level design tool with a database containing all the publish–subscribe information for each available ECU, as described in the previous sections. After the signaling needs have been mapped onto a particular network architecture, thus defining the connections between the published and subscribed signals, an automatic frame compiler is run. The frame compiler uses the requirements captured earlier to build a configuration that meets those requirements. There are many possibilities for optimizing the bus behavior. The frame compiler includes CAN bus timing analysis and LIN schedule table generation and will not generate a configuration that violates the timing requirements placed on the system. The frame compiler also uses the analysis to answer “What if?” questions and to guide the user in building a valid and optimized network configuration.
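The chapter does not describe the algorithm inside the frame compiler. As an illustration of the kind of CAN timing analysis it refers to, the following Python sketch applies a simplified fixed-priority response-time calculation in the spirit of reference [1]. It is an assumption-laden sketch, not the Volcano implementation: queuing jitter and error frames are ignored, and the `Frame` class, the frame names, the periods, and the transmission times are made-up example values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    name: str
    can_id: int       # lower identifier wins arbitration (higher priority)
    period_us: int    # minimum interval between queuings
    tx_time_us: int   # assumed worst-case transmission time
    deadline_us: int  # latest acceptable completion after queuing


def worst_case_response_us(frame, frames, bit_time_us):
    """Return the worst-case response time in microseconds, or None if the deadline is missed."""
    higher = [f for f in frames if f.can_id < frame.can_id]
    lower = [f for f in frames if f.can_id > frame.can_id]
    # A lower-priority frame that has already started cannot be preempted.
    blocking = max((f.tx_time_us for f in lower), default=0)
    wait = blocking
    while True:
        # Interference from higher-priority frames queued while waiting
        # (the inner expression is an integer ceiling division).
        interference = sum(
            -(-(wait + bit_time_us) // f.period_us) * f.tx_time_us
            for f in higher
        )
        next_wait = blocking + interference
        if next_wait + frame.tx_time_us > frame.deadline_us:
            return None  # the timing requirement cannot be guaranteed
        if next_wait == wait:
            return wait + frame.tx_time_us  # fixed point reached
        wait = next_wait


frames = [
    Frame("A", 0x100, 10_000, 270, 10_000),
    Frame("B", 0x200, 20_000, 270, 20_000),
    Frame("C", 0x300, 50_000, 270, 50_000),
]
for f in frames:
    print(f.name, worst_case_response_us(f, frames, bit_time_us=2))
```

With these example values the sketch reports 540 µs for frame A and 810 µs for frames B and C, all well inside their deadlines. A tool that refuses to emit an invalid configuration could reject any frame set for which such a check returns `None`.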

The output of the frame compiler is used to build configuration data specific to each ECU. This is used by the Volcano target software in the ECU to properly configure and use the hardware resources.

The Volcano configuration data generator tool set (V5CFG/V5BND) is used to translate this ASCII text information to executable binary code in the following way:

• When the supplier executes the tool, it reads the FIX, PRI, and TGT files to generate compile-time data files. These data files are compiled and linked with the application program and the Volcano library supplied for the specific ECU system.



• When the vehicle manufacturer executes the tool, it reads the FIX, NET, and TGT files to generate the binary data that are to be located in the ECU’s Volcano configuration memory (known as the Volcano NVRAM). An ECU is then configured (or reconfigured) by downloading the binary data to the ECU’s memory.

It is vital to realize that the FIX and TGT files cannot be changed without coordination between the system integrator and the ECU supplier.

The vehicle manufacturer can, however, change the NET file without informing the ECU supplier. In the same way, the ECU supplier can change the PRI file without informing the system integrator.

Figure 32.7 shows how the Volcano target code for an ECU is configured by the supplier and the system integrator.
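The file-handling rules of this section can be summarized in a small lookup structure. The Python sketch below is only an illustration: the names `Party`, `TOOL_RUNS`, `CHANGE_PARTIES`, `may_change_alone`, and `rerun_needed` are hypothetical and are not part of the Volcano tool set. The `rerun_needed` helper encodes an inference rather than a statement from the text, namely that a party must regenerate its output when a file its tool run reads has changed.

```python
from enum import Enum


class Party(Enum):
    ECU_SUPPLIER = "ECU supplier"
    SYSTEM_INTEGRATOR = "system integrator"  # typically the vehicle manufacturer


# Files read by each party's run of the configuration tool, and the result.
TOOL_RUNS = {
    Party.ECU_SUPPLIER: (("FIX", "PRI", "TGT"), "compile-time data files"),
    Party.SYSTEM_INTEGRATOR: (("FIX", "NET", "TGT"), "binary data for the Volcano NVRAM"),
}

# Parties that must take part in any change to a file.
CHANGE_PARTIES = {
    "FIX": {Party.ECU_SUPPLIER, Party.SYSTEM_INTEGRATOR},
    "TGT": {Party.ECU_SUPPLIER, Party.SYSTEM_INTEGRATOR},
    "NET": {Party.SYSTEM_INTEGRATOR},
    "PRI": {Party.ECU_SUPPLIER},
}


def may_change_alone(party, file_kind):
    """True if `party` can edit the file without coordinating with the other side."""
    return CHANGE_PARTIES[file_kind] == {party}


def rerun_needed(file_kind):
    """Parties whose tool run reads `file_kind` (assumed to need regeneration)."""
    return {party for party, (inputs, _) in TOOL_RUNS.items() if file_kind in inputs}


print(may_change_alone(Party.SYSTEM_INTEGRATOR, "NET"))
print(may_change_alone(Party.ECU_SUPPLIER, "FIX"))
print(rerun_needed("NET"))
```

The table makes the asymmetry visible: a NET change touches only the integrator's binary configuration data, so the ECU can be reconfigured without rebuilding the supplier's program code.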

The Volcano concept and related products have been successfully used in production since 1996. Car OEMs presently using the entire tool chain are Aston Martin, Jaguar, Land Rover, MG Rover, Volvo Cars, and Volvo Bus Corporation.

Acknowledgments

I acknowledge the contributions to this chapter of my colleagues at Volcano Automotive Group, in particular István Horváth, Niklas Amberntsson, and Mats Ramnefors.

FIGURE 32.7 Volcano Target Code configuration process.

[Figure not reproduced. Labels: ECU supplier; vehicle manufacturer; agreed information; “fixed,” “private,” “network,” and “target” files; V5CFG configuration tool; intermediate “fix/pri” and “fix/net”; V5BND target code generation; V5BND target tailoring; compile-time data; application program; Volcano 5 target library; program code (ROM/FLASH EEPROM); binary data for ECU configuration; Volcano 5 NVRAM pool; ECU memory.]


Reference

[1] K. Tindell and A. Burns, Guaranteeing message latencies on Controller Area Network (CAN), in Proceedings of the 1st International CAN Conference, 1994, pp. 2–11.

More Information

K. Tindell, H. Hansson, and A.J. Wellings, Analysing real-time communications: Controller Area Network (CAN), in Proceedings of the 15th IEEE Real-Time Systems Symposium, San Juan, Puerto Rico, 1994, pp. 259–265.

K. Tindell, A. Rajnák, and L. Casparsson, CAN Communications Concept with Guaranteed Message Latencies, SAE Paper, 1998.

L. Casparsson, K. Tindell, A. Rajnák, and P. Malmberg, Volcano: A Revolution in On-Board Communication, Volvo Technology Report, 1998.

W. Specks and A. Rajnák, The scaleable network architecture of the Volvo S80, in 8th International Conference on Electronic Systems for Vehicles, Baden-Baden, Germany, October 1998, pp. 597–641.


http://www.VolcanoAutomotive.com.

