Embedded computing systems 10CS72
Dept of CSE, SJBIT Page 1
VTU Question paper solutions
1. Give the characteristics and constraints of embedded systems. Jun 14
Embedded computing is in many ways much more demanding than the sort of
programs that you may have written for PCs or workstations. Functionality is
important in both general-purpose computing and embedded computing, but
embedded applications must meet many other constraints as well.
■ Complex algorithms: The operations performed by the
microprocessor may be very sophisticated. For example, the
microprocessor that controls an automobile engine must perform
complicated filtering functions to optimize the performance of the
car while minimizing pollution and fuel utilization.
■ User interface: Microprocessors are frequently used to control
complex user interfaces that may include multiple menus and many
options. The moving maps in Global Positioning System (GPS) navigation
are good examples of sophisticated user interfaces.
■ Real time: Many embedded computing systems have to perform in real
time— if the data is not ready by a certain deadline, the system breaks. In
some cases, failure to meet a deadline is unsafe and can even endanger lives.
In other cases, missing a deadline does not create safety problems but does
create unhappy customers—missed deadlines in printers, for example, can
result in scrambled pages.
■ Multirate: Not only must operations be completed by deadlines, but
many embedded computing systems have several real-time activities
going on at the same time. They may simultaneously control some
operations that run at slow rates and others that run at high rates.
Multimedia applications are prime examples of multirate behavior. The
audio and video portions of a multimedia stream run at very different
rates, but they must remain closely synchronized. Failure to meet a
deadline on either the audio or video portions spoils the perception of the
entire presentation.
■ Manufacturing cost: The total cost of building the system is very
important in many cases. Manufacturing cost is determined by many factors,
including the type of microprocessor used, the amount of memory
required, and the types of I/O devices.
■ Power and energy: Power consumption directly affects the cost
of the hardware, since a larger power supply may be necessary.
Energy consumption affects battery life, which is important in many
applications, as well as heat dissipation, which can be important
even in desktop applications.
2. Explain the challenges in embedded computing system design. Jun 14/Jan 14
External constraints are one important source of difficulty in embedded
system design. Let’s consider some important problems that must be taken
into account in embedded system design.
How much hardware do we need? We have a great deal of control
over the amount of computing power we apply to our problem. We
can not only select the type of microprocessor used, but also select the
amount of memory, the peripheral devices, and more. Since we often must
meet both performance deadlines and manufacturing cost constraints, the
choice of hardware is important—too little hardware and the system fails to
meet its deadlines, too much hardware and it becomes too expensive.
How do we meet deadlines? The brute force way of meeting a
deadline is to speed up the hardware so that the program runs faster. Of
course, that makes the system more expensive. It is also entirely possible
that increasing the CPU clock rate may not make enough difference to
execution time, since the program’s speed may be limited by the memory
system.
How do we minimize power consumption? In battery-powered
applications, power consumption is extremely important. Even in
nonbattery applications, excessive power consumption can increase heat
dissipation. One way to make a digital system consume less power is
to make it run more slowly, but naively slowing down the system can
obviously lead to missed deadlines. Careful design is required to slow
down the noncritical parts of the machine for power consumption while
still meeting necessary performance goals.
How do we design for upgradability? The hardware platform may
be used over several product generations, or for several different versions of
a product in the same generation, with few or no changes. However, we
want to be able to add features by changing software. How can we design
a machine that will provide the required performance for software that we
haven’t yet written?
How does it really work? Reliability is always important when
selling products—customers rightly expect that products they buy will
work. Reliability is especially important in some applications, such as
safety-critical systems. If we wait until we have a running system and try
to eliminate the bugs, we will be too late—we won’t find enough bugs, it
will be too expensive to fix them, and it will take too long as well. Another
set of challenges comes from the characteristics of the components and
systems themselves. If workstation programming is like assembling a
machine on a bench, then embedded system design is often more like
working on a car—cramped, delicate, and difficult. Let’s consider some
ways in which the nature of embedded computing machines makes their
design more difficult.
■ Complex testing: Exercising an embedded system is generally more
difficult than typing in some data. We may have to run a real machine in
order to generate the proper data. The timing of data is often important,
meaning that we cannot separate the testing of an embedded computer
from the machine in which it is embedded.
■ Limited observability and controllability: Embedded
computing systems usually do not come with keyboards and screens. This
makes it more difficult to see what is going on and to affect the system’s
operation. We may be forced to watch the values of electrical signals on the
microprocessor bus, for example, to know what is going on inside the
system. Moreover, in real-time applications we may not be able to easily
stop the system to see what is going on inside.
■ Restricted development environments: The development
environments for embedded systems (the tools used to develop software
and hardware) are often much more limited than those available for PCs
and workstations. We generally compile code on one type of machine, such
as a PC, and download it onto the embedded system. To debug the code, we
must usually rely on programs that run on the PC or workstation and then
look inside the embedded system.
3. Define design methodology. Explain the embedded system design process.
Jun 14
A design methodology is important for three reasons. First, it allows us to
keep a scorecard on a design to ensure that we have done everything we
need to do, such as optimizing performance or performing functional tests.
Second, it allows us to develop computer-aided design tools. Developing a
single program that takes in a concept for an embedded system and emits a
completed design would be a daunting task, but by first breaking the process
into manageable steps, we can work on automating (or at least
semiautomating) the steps one at a time. Third, a design methodology makes
it much easier for members of a design team to communicate. By defining
the overall process, team members can more easily understand what they
are supposed to do, what they should receive from other team members at
certain times, and what they are to hand off when they complete their
assigned steps. Since most embedded systems are designed by teams,
coordination is perhaps the most important role of a well-defined design
methodology.
In the specification, we create a more detailed description of what we want.
But the specification states only how the system behaves, not how it is
built. The details of the system’s internals begin to take shape when we
develop the architecture, which gives the system structure in terms of large
components. Once we know the components we need, we can design those
components, including both software modules and any specialized
hardware we need. Based on those components, we can finally build a
complete system.
In this section we will consider design from the top–down—we will begin
with the most abstract description of the system and conclude with
concrete details. The alternative is a bottom–up view in which we start
with components to build a system. Bottom–up design steps are shown in
the figure as dashed-line arrows. We need bottom–up design because we do
not have perfect insight into how later stages of the design process will turn
out. Decisions at one stage of design are based upon estimates of what will
happen later: How fast can we make a particular function run? How much
memory will we need? How much system bus capacity do we need? If
our estimates are inadequate, we may have to backtrack and amend our
original decisions to take the new facts into account. In general, the less
experience we have with the design of similar systems, the more we will
have to rely on bottom-up design information to help us refine the systemBut
the steps in the design process are only one axis along which we can view
embedded system design. We also need to consider the major goals of the
design:
4. Explain the model train control system. Jan 14
MODEL TRAIN CONTROLLER
In order to learn how to use UML to model systems, we will specify a simple
system, a model train controller, which is illustrated in Figure 1.14. The user
sends messages to the train with a control box attached to the tracks. The
control box may have familiar controls such as a throttle, emergency stop
button, and so on. Since the train receives its electrical power from the
two rails of the track, the control box can send signals to the train over the
tracks by modulating the power supply voltage. The control panel sends
packets over the tracks to the receiver on the train. The train includes analog
electronics to sense the bits being transmitted and a control system to set
the train motor’s speed and direction based on those commands. Each
packet includes an address so that the console can control several trains on
the same track; the packet also includes an error correction code (ECC) to
guard against transmission errors. This is a one-way communication system—
the model train cannot send commands back to the user.
UNIT 2
1. Differentiate between Harvard and von Neumann architecture. Jun 14
The Harvard architecture is a computer architecture with physically separate storage and signal
pathways for instructions and data. The term originated from the Harvard Mark I relay-based
computer, which stored instructions on punched tape (24 bits wide) and data in
electromechanical counters. These early machines had data storage entirely contained within the central
processing unit, and provided no access to the instruction storage as data. Programs needed to be
loaded by an operator; the processor could not boot itself.
Today, most processors implement such separate signal pathways for performance reasons but
actually implement a modified Harvard architecture, so they can support tasks such as loading a
program from disk storage as data and then executing it.
Under pure von Neumann architecture the CPU can be either reading an instruction or
reading/writing data from/to the memory. Both cannot occur at the same time since the
instructions and data use the same bus system. In a computer using the Harvard architecture, the
CPU can both read an instruction and perform a data memory access at the same time, even
without a cache. A Harvard architecture computer can thus be faster for a given circuit
complexity because instruction fetches and data access do not contend for a single memory
pathway.
Also, a Harvard architecture machine has distinct code and data address spaces: instruction
address zero is not the same as data address zero. Instruction address zero might identify a
twenty-four bit value, while data address zero might indicate an eight bit byte that isn't part of
that twenty-four bit value.
2. Define the ARM processor. Explain advanced ARM features. Jun 14
ARM is a family of instruction set architectures for computer processors based on a reduced
instruction set computing (RISC) architecture developed by British company ARM Holdings.
A RISC-based computer design approach means ARM processors require significantly fewer
transistors than typical CISC x86 processors in most personal computers. This approach reduces
costs, heat and power use. These are desirable traits for light, portable, battery-powered
devices—including smart phones, laptops, tablet and notepad computers, and other embedded
systems. A simpler design facilitates more efficient multi-core CPUs and higher core counts at
lower cost, providing improved energy efficiency for servers.
ARM Holdings develops the instruction set and architecture for ARM-based products, but does
not manufacture products. The company periodically releases updates to its cores. Current cores
from ARM Holdings support a 32-bit address space and 32-bit arithmetic; the ARMv8-A
architecture adds support for a 64-bit address space and 64-bit arithmetic. Instructions for ARM
Holdings' cores are fixed-length 32-bit instructions, but later versions of the architecture
also support a variable-length instruction set that provides both 32-bit and 16-bit instructions
for improved code density. Some cores can also provide hardware execution of Java bytecodes.
3. What is pipelining? Explain the C55x seven-stage pipeline with a neat diagram. Jun 14/Jan 14
Pipelining is an implementation technique where multiple instructions are overlapped in
execution. The computer pipeline is divided in stages. Each stage completes a part of an
instruction in parallel. The stages are connected one to the next to form a pipe - instructions enter
at one end, progress through the stages, and exit at the other end. Pipelining does not decrease
the time for individual instruction execution. Instead, it increases instruction throughput. The
throughput of the instruction pipeline is determined by how often an instruction exits the
pipeline.
The C55x has a seven-stage pipeline:
• Fetch
• Decode
• Address: computes data/branch addresses
• Access 1: reads data
• Access 2: finishes the data read
• Read: puts operands on internal busses
• Execute
4. Explain memory system mechanisms? Jan 14
On an Embedded System, memory is at a premium. Some chips, particularly embedded VLSI
chips, and low-end microprocessors may only have a small amount of RAM "on board" (built
directly into the chip), and therefore their memory is not expandable. Other embedded systems
have a certain amount of memory, and have no means to expand. In addition to RAM, some
embedded systems have some non-volatile memory, in the form of miniature magnetic disks,
FLASH memory expansions, or even various 3rd-party memory card expansions. Keep in mind
however, that a memory upgrade on an embedded system may cost more than the entire system
itself. An embedded systems programmer, therefore, needs to be very much aware of the
memory available, and the memory needed to complete a task.
Memory is an important part of embedded systems. The cost and performance of an embedded
system heavily depends on the kind of memory devices it utilizes. In this section we will discuss
about “Memory Classification”, “Memory Technologies” and “Memory Management”.
Memory Classification
Memory Devices can be classified based on following characteristics
(a) Accessibility
(b) Persistence of Storage
(c) Storage Density & Cost
(d) Storage Media
(e) Power Consumption
UNIT 3
1. Write the major components of bus protocol. Explain the burst read transaction with a
neat timing diagram? Jun 14/Jan 14
Bus Protocols
The basic building block of most bus protocols is the four-cycle
handshake. The handshake ensures that when two devices want to
communicate, one is ready to transmit and the other is ready to receive. The
handshake uses a pair of wires dedicated to the handshake: enq (meaning
enquiry) and ack (meaning acknowledge). Extra wires are used for the data
transmitted during the handshake. The four cycles are described below.
1. Device 1 raises its output to signal an enquiry, which tells device
2 that it should get ready to listen for data.
2. When device 2 is ready to receive, it raises its output to signal an
acknowledgement. At this point, devices 1 and 2 can transmit or receive.
3. Once the data transfer is complete, device 2 lowers its output,
signaling that it has received the data.
4. After seeing that ack has been released, device 1 lowers its output.
2. Describe: i) Timer ii) cross compiler iii)logic analyzer Jan 14
Timer: Just about every microcontroller comes with one or more (sometimes many more) built-
in timer/counters, and these are extremely useful to the embedded programmer - perhaps second
in usefulness only to GPIO. The term timer/counter itself reflects the fact that the underlying
counter hardware can usually be configured to count either regular clock pulses (making it a
timer) or irregular event pulses (making it a counter). This tutorial will use the term “timer”
rather than “timer/counter” for the actual hardware block, in the name of simplicity, but will try
to make it clear when the device is acting as an event counter rather than a normal timer. Also
note that sometimes timers are called “hardware timers” to distinguish them from software
timers which are bits of software that perform some timing function.
Cross compiler: A cross compiler is a compiler capable of creating executable code for a
platform other than the one on which the compiler is running. For example in order to compile
for Linux/ARM you first need to obtain its libraries to compile against.
A cross compiler is necessary to compile for multiple platforms from one machine. A platform
may be infeasible for a compiler to run on, such as the microcontroller of an embedded
system, because such systems often have no operating system. In paravirtualization, one machine
runs many operating systems, and a cross compiler could generate an executable for each of
them from one main source.
Logic analyzer: A logic analyzer is an electronic instrument that captures and displays multiple
signals from a digital system or digital circuit. A logic analyzer may convert the captured data
into timing diagrams, protocol decodes, state machine traces, assembly language, or may
correlate assembly with source-level software. Logic analyzers have advanced triggering
capabilities and are useful when a user needs to see the timing relationships between many
signals in a digital system.
3. Explain the glue logic interface? Jun 14
Glue logic is a special form of digital circuitry that allows different types of logic chips or
circuits to work together by acting as an interface between them. As an example, consider a chip
that contains a CPU (central processing unit) and a RAM (random access memory) block. These
circuits can be interfaced within the chip using glue logic, so that they work smoothly together.
On printed circuit boards, glue logic can take the form of discrete ICs (integrated circuits) in
their own packages. In more complicated situations, programmable logic devices (PLDs) can
play the role of glue logic. Other functions of glue logic include address decoding (with older
processors), interfacing to peripherals, circuits to protect against ESD (electrostatic discharge) or
EMP (electromagnetic pulse) events, and the prevention of unauthorized cloning or reverse
engineering, by hiding the actual function of a circuit from external observers and hackers.
4. Explain the components of embedded programs. Jun 14
An embedded system has three main components: hardware, software, and a real-time
operating system.
i) Hardware
• Power Supply
• Processor
• Memory
• Timers
• Serial communication ports
• Input/Output circuits
• System application specific circuits
ii) Software: The application software is required to perform the series of tasks.
An embedded system has software designed to keep in view of three constraints:
• Availability of System Memory
• Availability of processor speed
• The need to limit power dissipation when running the system continuously in
cycles of wait for events, run, stop, and wake up.
iii) Real Time Operating System: (RTOS) It supervises the application software
and provides a mechanism to let the processor run a process as per scheduling and
do the switching from one process (task) to another process.
UNIT 4
1. Write a note on DMA controller? Jan 14
DMA: Standard bus transactions require the CPU to be in the middle of
every read and write transaction. However, there are certain types of data
transfers in which the CPU does not need to be involved. For example, a
high-speed I/O device may want to transfer a block of data into memory.
While it is possible to write a program that alternately reads the device and
writes to memory, it would be faster to eliminate the CPU’s involvement
and let the device and memory communicate directly. This capability
requires that some unit other than the CPU be able to control operations
on the bus.
Direct memory access (DMA) is a bus operation that allows reads and
writes not controlled by the CPU. A DMA transfer is controlled by a
DMA controller, which requests control of the bus from the CPU. After
gaining control, the DMA controller performs read and write operations
directly between devices and memory.
2. Explain the circular buffers for embedded programs? Jun14/Jan 14
The ring buffer's first-in first-out data structure is a useful tool for transmitting data between
asynchronous processes. Here's how to bit bang one in C without C++'s Standard Template
Library.
The ring buffer is a circular software queue. This queue has a first-in-first-out (FIFO) data
characteristic. These buffers are quite common and are found in many embedded systems.
Usually, most developers write these constructs from scratch on an as-needed basis.
The C++ language has the Standard Template Library (STL), which has a very easy-to-use set of
class templates. This library enables the developer to create the queue and other lists relatively
easily. For the purposes of this article, however, I am assuming that we do not have access to the
C++ language.
The ring buffer usually has two indices to the elements within the buffer. The distance between
the indices can range from zero (0) to the total number of elements within the buffer. The use of
dual indices means the queue length can shrink to zero (empty) or grow to the total number of
elements (full). Figure 1 shows the ring structure of the ring buffer (FIFO) queue.
3. With a neat sketch, explain the role of assemblers and linkers in the compilation process.
Jun 14
COMPILERS, ASSEMBLERS and LINKERS
■ Normally the C program building process involves four stages and utilizes different
‘tools’ such as a preprocessor, compiler, assembler, and linker.
■ At the end there should be a single executable file. Below are the stages that happen in
order regardless of the operating system/compiler, graphically illustrated in Figure
w.1.
1. Preprocessing is the first pass of any C compilation. It processes include-files,
conditional compilation instructions and macros.
2. Compilation is the second pass. It takes the output of the preprocessor, and the source
code, and generates assembler source code.
3. Assembly is the third stage of compilation. It takes the assembly source code and
produces an assembly listing with offsets. The assembler output is stored in an object file.
Embedded computing systems 10CS72
Dept of CSE, SJBIT Page 11
4. Linking is the final stage of compilation. It takes one or more object files or libraries as
input and combines them to produce a single (usually executable) file.
4. Explain, with examples, optimization techniques. Jun 14/Jan 14
Performance Optimization Strategies

Let’s look more generally at how to improve program execution time. First, make sure
that the code really needs to be accelerated. If you are dealing with a large program,
the part of the program using the most time may not be obvious. Profiling the program
will help you find hot spots. A profiler does not measure execution time—instead, it
counts the number of times that procedures or basic blocks in the program are executed.
There are two major ways to profile a program: we can modify the executable program
by adding instructions that increment a location every time the program passes that
point, or we can sample the program counter during execution and keep track of the
distribution of PC values. Profiling adds relatively little overhead to the program and
gives us some useful information about where the program spends most of its time.

You may be able to redesign your algorithm to improve efficiency. Examining asymptotic
performance is often a good guide to efficiency. Doing fewer operations is usually the
key to performance. In a few cases, however, brute force may provide a better
implementation. A seemingly simple high-level language statement may in fact hide a
very long sequence of operations that slows down the algorithm. Using dynamically
allocated memory is one example, since managing the heap takes time but is hidden from
the programmer. For example, a sophisticated algorithm that uses dynamic storage may be
slower in practice than an algorithm that performs more operations on statically
allocated memory.

Finally, you can look at the implementation of the program itself. A few hints on
program implementation are summarized below.

■ Try to use registers efficiently. Group accesses to a value together so
that the value can be brought into a register and kept there.

■ Make use of page mode accesses in the memory system whenever possible. Page
mode reads and writes eliminate one step in the memory access. You can
increase use of page mode by rearranging your variables so that more can be
referenced contiguously.
■ Analyze cache behavior to find major cache conflicts. Restructure
the code to eliminate as many of these as you can as follows:
—For instruction conflicts, if the offending code segment is small, try to
rewrite the segment to make it as small as possible so that it better fits into the
cache. Writing in assembly language may be necessary. For conflicts across
larger spans of code, try moving the instructions or padding with NOPs.
—For scalar data conflicts, move the data values to different locations to reduce
conflicts.
—For array data conflicts, consider either moving the arrays or changing your array
access patterns to reduce conflicts.
UNIT 5
1. Explain operating system architecture, with a neat diagram. Jan 14
The kernel is the core of an operating system. It is the software responsible for running programs
and providing secure access to the machine's hardware. Since there are many programs, and
resources are limited, the kernel also decides when and how long a program should run. This is
called scheduling. Accessing the hardware directly can be very complex, since there are many
different hardware designs for the same type of component. Kernels usually implement some
level of hardware abstraction (a set of instructions universal to all devices of a certain type) to
hide the underlying complexity from applications and provide a clean and uniform interface.
This helps application programmers to develop programs without having to know how to
program for specific devices. The kernel relies upon software drivers that translate the generic
command into instructions specific to that device.
2. Define the different types of operating systems. Jan 14
Real-time
A real-time operating system is a multitasking operating system that aims at executing real-time
applications. Real-time operating systems often use specialized scheduling algorithms so that
they can achieve a deterministic nature of behavior. The main objective of real-time operating
systems is their quick and predictable response to events. They have an event-driven or time-
sharing design and often aspects of both. An event-driven system switches between tasks based
on their priorities or external events while time-sharing operating systems switch tasks based on
clock interrupts.
Multi-user
A multi-user operating system allows multiple users to access a computer system at the
same time. Time-sharing systems and Internet servers can be classified as multi-user
systems as they enable multiple-user access to a computer through the sharing of time.
Single-user operating systems have only one user but may allow multiple programs to run
at the same time.
Multi-tasking vs. single-tasking
A multi-tasking operating system allows more than one program to be running at the
same time, from the point of view of human time scales. A single-tasking system has
only one running program. Multi-tasking can be of two types: pre-emptive and
cooperative. In pre-emptive multitasking, the operating system slices the CPU time and
dedicates one slot to each of the programs. Unix-like operating systems such as Solaris
and Linux support pre-emptive multitasking, as does AmigaOS. Cooperative multitasking is
achieved by relying on each process to give time to the other processes in a defined manner.
16-bit versions of Microsoft Windows used cooperative multi-tasking. 32-bit versions of both
Windows NT and Win9x used pre-emptive multi-tasking. Mac OS prior to OS X used to support
cooperative multitasking.
Distributed
A distributed operating system manages a group of independent computers and makes
them appear to be a single computer. The development of networked computers that
could be linked and communicate with each other gave rise to distributed computing.
Distributed computations are carried out on more than one machine. When computers in
a group work in cooperation, they make a distributed system.
Embedded
Embedded operating systems are designed to be used in embedded computer systems.
They are designed to operate on small machines like PDAs with less autonomy. They are
able to operate with a limited number of resources. They are very compact and extremely
efficient by design. Windows CE and Minix 3 are some examples of embedded operating
systems.
3. What is RTOS? List and explain the different services of RTOS. Jun 14
A real-time operating system (RTOS) is an operating system (OS) intended to serve
real-time application requests. It must be able to process data as it comes in, typically
without buffering delays. Processing time requirements (including any OS delay) are
measured in tenths of seconds or shorter.
• The RTOS performs few tasks, thus ensuring that the tasks will always be executed
before the deadline
• The RTOS drops or reduces certain functions when they cannot be executed within the
time constraints ("load shedding")
• The RTOS monitors input consistently and in a timely manner
• The RTOS monitors resources and can interrupt background processes as needed to
ensure real-time execution
• The RTOS anticipates potential requests and frees enough of the system to allow timely
reaction to the user's request
• The RTOS keeps track of how much of each resource (CPU time per timeslice, RAM,
communications bandwidth, etc.) might possibly be used in the worst-case by the
currently-running tasks, and refuses to accept a new task unless it "fits" in the remaining
un-allocated resources.
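The last point, admission control against worst-case resource use, can be sketched with a utilization-based test. This is an illustrative model rather than any specific RTOS API; the task parameters and the rate-monotonic bound used as the acceptance criterion are assumptions for the sketch.

```python
# Sketch of admission control: the RTOS tracks worst-case CPU demand of the
# running tasks and refuses a new task unless it "fits". Tasks are modeled
# as (worst-case execution time, period) pairs; the Liu-Layland
# rate-monotonic bound is one common acceptance test.

def rm_bound(n):
    """Rate-monotonic schedulability bound for n periodic tasks."""
    return n * (2 ** (1 / n) - 1)

def admit(tasks, new_task):
    """Admit new_task only if total utilization stays below the bound."""
    candidate = tasks + [new_task]
    utilization = sum(c / t for c, t in candidate)
    return utilization <= rm_bound(len(candidate))

running = [(1, 4), (2, 8)]          # utilization 0.25 + 0.25 = 0.5
print(admit(running, (1, 10)))      # True: 0.6 fits under bound(3) ~ 0.780
print(admit(running, (4, 10)))      # False: 0.9 exceeds the bound
```

A real RTOS would track RAM and bandwidth the same way; CPU utilization is just the simplest case to show.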
4. Describe the concept of multithreading and write the comparison between thread and
process. Jun 14
Multithreading is the ability of a program or an operating-system process to service
multiple requests, even from several users at once, without running multiple copies of
the program. In comparison, a process has its own address space and resources, while the
threads within a process share them, which makes threads cheaper to create and switch
between. Central processing
units have hardware support to efficiently execute multiple threads. These are distinguished from
multiprocessing systems (such as multi-core systems) in that the threads have to share the
resources of a single core: the computing units, the CPU caches and the translation lookaside
buffer (TLB). Where multiprocessing systems include multiple complete processing units,
multithreading aims to increase utilization of a single core by using thread-level as well as
instruction-level parallelism. As the two techniques are complementary, they are sometimes
combined in systems with multiple multithreading CPUs and in CPUs with multiple
multithreading cores.
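A minimal sketch of the idea: several threads of one process work concurrently while sharing the process's data, so no second copy of the program is needed. The variable names and workload are illustrative.

```python
# Several threads of one process update one shared variable. Because the
# threads share a single address space, the update must be guarded by a
# lock; each thread increments the same counter, not a private copy.

import threading

counter = 0
lock = threading.Lock()

def handle_request(n):
    global counter
    for _ in range(n):
        with lock:            # serialize access to the shared data
            counter += 1

threads = [threading.Thread(target=handle_request, args=(1000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)   # 4000: all four threads updated the same shared variable
```

By contrast, four separate processes would each have incremented a private counter in a private address space.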
UNIT 6
1. Define blocking and non-blocking communication. Explain the two styles of interprocess
communication, with an example. Jun 14
In computing, inter-process communication (IPC) is a set of methods for the exchange of data
among multiple threads in one or more processes. Processes may be running on one or more
computers connected by a network. IPC methods are divided into methods for message passing,
synchronization, shared memory, and remote procedure calls (RPC). The method of IPC used
may vary based on the bandwidth and latency of communication between the threads, and the
type of data being communicated.
There are several reasons for providing an environment that allows process cooperation:
• Information sharing
• Computational speedup
• Modularity
• Convenience
• Privilege separation
• Communication may be blocking or nonblocking. After sending a blocking communication,
the process goes into the waiting state until it receives a response.
Nonblocking communication allows the process to continue
execution after sending the communication. Both types of
communication are useful.
• There are two major styles of interprocess communication: shared
memory and message passing. The two are logically equivalent:
given one, you can build an interface that implements the other.
However, some programs may be easier to write using one rather
than the other. In addition, the hardware platform may make one
easier to implement or more efficient than the other.
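The blocking/nonblocking distinction can be shown with a message queue (the message-passing style). Here `queue.Queue` is used as a stand-in for an OS message-passing primitive; the message text is illustrative.

```python
# Nonblocking vs blocking receive on a message queue.

import queue

q = queue.Queue()

# Nonblocking receive: the process continues immediately if no message.
try:
    q.get_nowait()
except queue.Empty:
    status = "no message yet, continuing"

q.put("sensor reading")

# Blocking receive: the caller would wait until a message is available.
# Here it returns at once because a message is already queued.
msg = q.get()
print(status)   # no message yet, continuing
print(msg)      # sensor reading
```

The same contrast exists on the sending side: a blocking send waits for the receiver, while a nonblocking send deposits the message and returns.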
2. What are the assumptions for the performance of a real system running processes?
Mention the factors that affect context-switching time and interrupt latency. Jun 14/Jan 14
Context switch: A context switch is the computing process of saving and restoring
the state (context) of a CPU such that multiple processes can share a single CPU
resource. The context switch is an essential feature of a multitasking operating
system.
Context switches are usually time consuming and much of the design of
operating systems is to minimize the time of context switches.
A context switch can mean a register context switch, a task context switch, a
thread context switch, or a process context switch. What will be switched is
determined by the processor and the operating system.
The scheduler is the part of the operating system that manages context
switching; it performs a context switch under one of the following conditions:
1. Multitasking: One process needs to be switched out of (it "yields", i.e.,
gives up) the CPU so another process can run. Within a preemptive
multitasking operating system, the scheduler allows every task (according to
its priority level) to run for a certain amount of time, called its time
slice, after which a timer interrupt triggers the operating system to
schedule another process for execution instead.
If a process must wait for a computer resource or perform an I/O
operation, the operating system schedules another process for execution
instead.
2. Interrupt handling: Some CPU architectures (like the Intel x86
architecture) are interrupt driven. When an interrupt occurs, the scheduler
switches contexts and calls the corresponding interrupt handler to serve the
interrupt; the currently running process is suspended until the interrupt
handler has executed.
3. User and kernel mode switching: When a transition between user mode
and kernel mode is required in an operating system, a context switch is not
necessary; a mode transition is not by itself a context switch. However,
depending on the operating system, a context switch may also take place at this
time.
Context switching can be performed primarily by software or hardware. Some
CPUs have hardware support for context switches; otherwise, the switch is
performed entirely by operating-system software. In a context switch, the
state of a process must be saved before another process runs, so that the
scheduler can later resume the process from the point at which it was
suspended by restoring its complete state.
A process (also sometimes referred to as a task) is an executing (i.e., running)
instance of a program. In Linux, threads are lightweight processes that can run
in parallel and share an address space (i.e., a range of memory locations) and
other resources with their parent processes (i.e., the processes that created
them).
A context is the contents of a CPU's registers and program counter at any point
in time. A register is a small amount of very fast memory inside of a CPU (as
opposed to the slower RAM main memory outside of the CPU) that is used to
speed the execution of computer programs by providing quick access to
commonly used values, generally those in the midst of a calculation. A program
counter is a specialized register that indicates the position of the CPU in its
instruction sequence and which holds either the address of the instruction being
executed or the address of the next instruction to be executed, depending on the
specific system.
Context switching can be described in slightly more detail as the kernel (i.e., the
core of the operating system) performing the following activities with regard to
processes (including threads) on the CPU:
(1) suspending the progression of one process and storing the CPU's state (i.e.,
the context) for that process somewhere in memory,
(2) retrieving the context of the next process from memory and restoring it in
the CPU's registers and
(3) returning to the location indicated by the program counter (i.e., returning to
the line of code at which the process was interrupted) in order to resume the
process.
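The three kernel activities above can be sketched as data movement between the CPU's register set and per-process save areas. This is a toy model with made-up register contents; a real kernel does this in assembly on the actual register state.

```python
# Toy model of a context switch: the "CPU" is a dict of registers, and
# each process has a save area in memory.

cpu = {"pc": 104, "r0": 7}                    # context of running process A
saved = {"A": None, "B": {"pc": 200, "r0": 3}}

def context_switch(cpu, saved, old, new):
    saved[old] = dict(cpu)        # (1) store the old context in memory
    cpu.clear()
    cpu.update(saved[new])        # (2) restore the next context
    return cpu["pc"]              # (3) resume at the saved program counter

pc = context_switch(cpu, saved, "A", "B")
print(pc)            # 200: process B resumes where it was suspended
print(saved["A"])    # {'pc': 104, 'r0': 7}: A's state is kept for later
```

The cost of a real switch comes from saving and restoring this state, plus cache and TLB effects that the toy model cannot show.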
A context switch is sometimes described as the kernel suspending execution of
one process on the CPU and resuming execution of some other process that had
previously been suspended. Although this wording can help clarify the concept,
it can be confusing in itself because a process is, by definition, an executing
instance of a program. Thus the wording suspending progression of a process
might be preferable.
3. Explain shared memory communication implemented on a bus? Jan 14
In computer hardware, shared memory refers to a (typically large) block of random access
memory (RAM) that can be accessed by several different central processing units (CPUs) in a
multiple-processor computer system.
A shared memory system is relatively easy to program since all processors share a single view of
data and the communication between processors can be as fast as memory accesses to the same
location. The issue with shared memory systems is that many CPUs need fast access to memory
and will likely cache memory, which has two complications:
• CPU-to-memory connection becomes a bottleneck. Shared memory computers cannot scale very
well. Most of them have ten or fewer processors.
• Cache coherence: Whenever one cache is updated with information that may be used by other
processors, the change needs to be reflected to the other processors, otherwise the different
processors will be working with incoherent data (see cache coherence and memory coherence).
Such coherence protocols can, when they work well, provide extremely high-performance access
to shared information between multiple processors. On the other hand they can sometimes
become overloaded and become a bottleneck to performance.
Technologies like crossbar switches, Omega networks, HyperTransport, or a front-side bus can be
used to reduce the bottleneck effects.
The alternatives to shared memory are distributed memory and distributed shared memory, each
having a similar set of issues. See also Non-Uniform Memory Access.
UNIT 7
1. Explain hardware and software Architecture of distributed embedded system? Jan 14
Networks for embedded computing span a broad range of
requirements; many of those requirements are very different from
those for general-purpose networks. Some networks are used in safety-
critical applications, such as automotive control. Some networks, such
as those used in consumer electronics systems, must be very
inexpensive. Other networks, such as industrial control networks, must
be extremely rugged and reliable.
Several interconnect networks have been developed especially for
distributed embedded computing:
■ The I2C bus is used in microcontroller-based systems.
■ The Controller Area Network (CAN) bus was developed for
automotive electronics. It provides megabit rates and can handle large
numbers of devices.
■ Ethernet and variations of standard Ethernet are used for a variety of
control applications.
In addition, many networks designed for general-purpose computing
have been put to use in embedded applications as well.
In this section, we study some commonly used embedded networks,
including the I2C bus and Ethernet; we will also briefly discuss
networks for industrial applications.
Every I2C device has an address. The addresses of the devices are
determined by the system designer, usually as part of the program for the
I2C driver. The addresses must of course be chosen so that no two
devices in the system have the same address. A device address is 7
bits in the standard I2C definition (the extended I2C allows 10-bit
addresses). The address 0000000 is used to signal a general call or
bus broadcast, which can be used to signal all devices simultaneously.
The address 11110XX is reserved for the extended 10-bit addressing
scheme; there are several other reserved addresses as well.
A bus transaction comprises a series of 1-byte transmissions: an
address followed by one or more data bytes. I2C encourages a data-push
programming style. When a master wants to write to a slave, it transmits
the slave's address followed by the data. Since a slave cannot initiate a
transfer, the master must send a read request with the slave's address
and let the slave transmit the data. Therefore, an address transmission
includes the 7-bit address and 1 bit for data direction: 0 for writing
from the master to the slave and 1 for reading from the slave to the
master.
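The address transmission just described, a 7-bit device address plus one direction bit, can be sketched directly. The device address 0x50 is an illustrative example, not a reserved value.

```python
# Build the first byte of an I2C transaction: 7-bit address shifted left,
# with the direction bit in the least significant position
# (0 = master writes, 1 = master reads).

WRITE, READ = 0, 1

def address_byte(addr7, direction):
    assert 0 <= addr7 < 128 and direction in (0, 1)
    return (addr7 << 1) | direction

print(hex(address_byte(0x50, WRITE)))   # 0xa0: write to device 0x50
print(hex(address_byte(0x50, READ)))    # 0xa1: read from device 0x50
print(address_byte(0b0000000, WRITE))   # 0: the general-call address
```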
2. Explain multichip communication, with a neat diagram. Jan 14
The I2C bus [Phi92] is a well-known bus commonly used to link
microcontrollers into systems. It has even been used for the command
interface in an MPEG-2 video chip [van97]; while a separate bus was
used for high-speed video data, setup information was transmitted to the
on-chip controller through an I2C bus interface.
I2C is designed to be low cost, easy to implement, and of moderate speed (up to 100 kbits/s for the standard bus and up to 400 kbits/s for the extended bus). As a result, it uses only two lines: the serial data line (SDL) for data and the serial clock line (SCL), which indicates when valid data are on the data line. Figure 8.7 shows the structure of a typical I2C bus system. Every node in the network is connected to both SCL and SDL. Some nodes may be able to act as bus masters, and the bus may have more than one master.
The open collector/open drain circuitry allows a slave device to stretch a
clock signal during a read from a slave. The master is responsible for
generating the SCL clock, but the slave can stretch the low period of the
clock (but not the high period) if necessary.
The I2C bus is designed as a multimaster bus: any one of several different devices may act as the master at various times. As a result, there is no global master to generate the clock signal on SCL. Instead, a master drives both SCL and SDL when it is sending data. When the bus is idle, both SCL and SDL remain high. When two devices try to drive either SCL or SDL to different values, the open collector/open drain circuitry prevents errors, but each master device must listen to the bus while transmitting to be sure that it is not interfering with another message; if the device receives a different value than it is trying to transmit, then it knows that it is interfering with another message.
However, starts and stops must be paired. A master can write and
then read (or read and then write) by sending a start after the data
transmission, followed by another address transmission and then more
data. The basic state transition graph for the master’s actions in a bus
transaction is shown in
The formats of some typical complete bus transactions are shown in
Figure 8.11. In the first example, the master writes 2 bytes to the
addressed slave. In the second, the master requests a read from a
slave. In the third, the master writes
1 byte to the slave, and then sends another start to initiate a read
from the slave.
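The three example transactions above can be modeled as the symbol sequences a bus analyzer would show. This is an abstract sketch: "S" stands for a start condition and "P" for a stop, the helper names are made up, and bit-level details (ACK bits, clock stretching) are omitted.

```python
# Model I2C transactions as symbol lists. The address byte carries the
# 7-bit address shifted left plus the direction bit (0 write, 1 read).

def write_txn(addr7, data):
    return ["S", (addr7 << 1) | 0] + list(data) + ["P"]

def read_txn(addr7, nbytes):
    return ["S", (addr7 << 1) | 1] + ["data"] * nbytes + ["P"]

def write_then_read(addr7, data, nbytes):
    # repeated start: a second S is sent without an intervening P
    return write_txn(addr7, data)[:-1] + read_txn(addr7, nbytes)

print(write_txn(0x50, [0x01, 0x02]))
# ['S', 160, 1, 2, 'P'] : master writes 2 bytes to the addressed slave

print(write_then_read(0x50, [0x01], 1))
# ['S', 160, 1, 'S', 161, 'data', 'P'] : write 1 byte, repeated start, read
```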
3. With a neat sketch, explain the CAN data frame format and typical bus transactions
on the I2C bus. Jun 14
CAN bus (for controller area network) is a vehicle bus standard designed to allow
microcontrollers and devices to communicate with each other within a vehicle without a host
computer. CAN bus is a message-based protocol, designed specifically for automotive
applications but now also used in other areas such as aerospace, maritime, industrial automation
and medical equipment. Development of the CAN bus started originally in 1983 at Robert Bosch
GmbH.[1]
The protocol was officially released in 1986 at the Society of Automotive Engineers
(SAE) congress in Detroit, Michigan. The first CAN controller chips, produced by Intel and
Philips, came on the market in 1987. Bosch published the CAN 2.0 specification in 1991. In
2012 Bosch specified an improved CAN data link layer protocol, called CAN FD, which
extends ISO 11898-1. CAN bus is one of five protocols used in the on-board diagnostics
(OBD)-II vehicle diagnostics standard. The OBD-II standard has been mandatory for all cars and
light trucks sold in the United States since 1996, and the EOBD standard has been mandatory for
all petrol vehicles sold in the European Union since 2001 and all diesel vehicles since 2004.
Data frame
The data frame is the only frame for actual data transmission. There are two message formats:
• Base frame format: with 11 identifier bits
• Extended frame format: with 29 identifier bits
The CAN standard requires that implementations accept the base frame format; they may
accept the extended frame format, but must at least tolerate it.
Base frame format
CAN-Frame in base format with electrical levels without stuffbits
The frame format is as follows:
• Start-of-frame (1 bit): denotes the start of frame transmission
• Identifier (11 bits): a (unique) identifier for the data, which also represents the message priority
• Remote transmission request, RTR (1 bit): dominant (0) for data frames (see Remote Frame)
• Identifier extension bit, IDE (1 bit): declares whether an 11-bit or a 29-bit message ID is used; dominant (0) indicates an 11-bit ID, recessive (1) a 29-bit ID
• Reserved bit, r0 (1 bit): must be set dominant (0), but accepted as either dominant or recessive
• Data length code, DLC (4 bits): number of bytes of data (0-8)
• Data field (0-64 bits, i.e. 0-8 bytes): data to be transmitted (length in bytes dictated by the DLC field)
• CRC (15 bits): cyclic redundancy check
• CRC delimiter (1 bit): must be recessive (1)
• ACK slot (1 bit): transmitter sends recessive (1), and any receiver can assert dominant (0)
• ACK delimiter (1 bit): must be recessive (1)
• End-of-frame, EOF (7 bits): must be recessive (1)
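The fixed-length fields at the head of a base-format frame (start-of-frame through DLC) can be sketched as a bit sequence. This is an illustrative packing only: bit stuffing, the CRC computation, and the trailing fields are omitted, and the identifier value is made up.

```python
# Pack the leading fields of a CAN base frame, most significant bit first:
# SOF (1) + Identifier (11) + RTR (1) + IDE (1) + r0 (1) + DLC (4).

def base_frame_header(ident, rtr=0, dlc=0):
    assert 0 <= ident < 2**11 and 0 <= dlc <= 8
    bits = [0]                                             # SOF: dominant
    bits += [(ident >> i) & 1 for i in range(10, -1, -1)]  # 11-bit identifier
    bits += [rtr, 0, 0]                                    # RTR, IDE=0, r0=0
    bits += [(dlc >> i) & 1 for i in range(3, -1, -1)]     # 4-bit DLC
    return bits

hdr = base_frame_header(0x123, dlc=8)
print(len(hdr))    # 19 bits: 1 + 11 + 1 + 1 + 1 + 4
print(hdr[1:12])   # [0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1]: identifier 0x123
```

A lower identifier value wins bus arbitration, which is why the identifier field doubles as the message priority.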
4.Explain Ethernet format and IP structure. Jun 14
Internet Protocol version 4 (IPv4) is the fourth version in the development of the Internet
Protocol (IP), and routes most traffic on the Internet.[1]
However, a successor protocol,
IPv6, has been defined and is in various stages of production deployment. IPv4 is described in
IETF publication RFC 791 (September 1981), replacing an earlier definition (RFC 760, January
1980). IPv4 is a connectionless protocol for use on packet-switched networks. It operates on a
best effort delivery model, in that it does not guarantee delivery, nor does it assure proper
sequencing or avoidance of duplicate delivery. These aspects, including data integrity, are
addressed by an upper layer transport protocol, such as the Transmission Control Protocol
(TCP). IPv4 uses 32-bit (four-byte) addresses, which limits the address space to 4,294,967,296 (2^32)
addresses. As addresses were assigned to users, the number of unassigned addresses decreased.
IPv4 address exhaustion occurred on February 3, 2011, although it had been significantly
delayed by address changes such as classful network design, Classless Inter-Domain Routing,
and network address translation (NAT).
This limitation of IPv4 stimulated the development of IPv6 in the 1990s, which has been in
commercial deployment since 2006. IPv4 reserves special address blocks for private networks
(~18 million addresses) and multicast addresses (~270 million addresses).
An IP packet consists of a header section and a data section.
An IP packet has no data checksum or any other footer after the data section. Typically the link
layer encapsulates IP packets in frames with a CRC footer that detects most errors, and typically
the end-to-end TCP layer checksum detects most other errors.[12]
Header
The IPv4 packet header consists of 14 fields, of which 13 are required. The 14th field is optional
(red background in table) and aptly named: options. The fields in the header are packed with the
most significant byte first (big endian), and for the diagram and discussion, the most significant
bits are considered to come first (MSB 0 bit numbering). The most significant bit is numbered 0,
so the version field is actually found in the four most significant bits of the first byte, for
example.
IPv4 Header Format (one 32-bit word per row, most significant bit first):
• Word 0 (bits 0-31): Version (4 bits) | IHL (4) | DSCP (6) | ECN (2) | Total Length (16)
• Word 1 (bits 32-63): Identification (16) | Flags (3) | Fragment Offset (13)
• Word 2 (bits 64-95): Time To Live (8) | Protocol (8) | Header Checksum (16)
• Word 3 (bits 96-127): Source IP Address (32)
• Word 4 (bits 128-159): Destination IP Address (32)
• Word 5 onward (bits 160+): Options (present only if IHL > 5)
Version
The first header field in an IP packet is the four-bit version field. For IPv4, this has a value of 4
(hence the name IPv4).
Internet Header Length (IHL)
The second field (4 bits) is the Internet Header Length (IHL), which is the number of 32-bit
words in the header. Since an IPv4 header may contain a variable number of options, this field
specifies the size of the header (this also coincides with the offset to the data). The minimum
value for this field is 5 (RFC 791), which is a length of 5×32 = 160 bits = 20 bytes. Being a 4-bit
value, the maximum length is 15 words (15×32 bits) or 480 bits = 60 bytes.
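The IHL arithmetic above can be checked directly: IHL counts 32-bit words, so the header length in bytes is simply IHL × 4. The helper name is illustrative.

```python
# Header length from the 4-bit IHL field (minimum 5, maximum 15 words).

def header_bytes(ihl):
    assert 5 <= ihl <= 15, "IHL is a 4-bit field with minimum value 5"
    return ihl * 4

print(header_bytes(5))    # 20: the minimum 160-bit header
print(header_bytes(15))   # 60: the maximum, 480 bits
```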
Differentiated Services Code Point (DSCP)
Originally defined as the Type of service field, this field is now defined by RFC 2474 for
Differentiated services (DiffServ). New technologies are emerging that require real-time data
streaming and therefore make use of the DSCP field. An example is Voice over IP (VoIP), which
is used for interactive data voice exchange.
Explicit Congestion Notification (ECN)
This field is defined in RFC 3168 and allows end-to-end notification of network congestion
without dropping packets. ECN is an optional feature that is only used when both endpoints
support it and are willing to use it. It is only effective when supported by the underlying network.
Total Length
This 16-bit field defines the entire packet (fragment) size, including header and data, in bytes.
The minimum-length packet is 20 bytes (20-byte header + 0 bytes data) and the maximum is
65,535 bytes — the maximum value of a 16-bit word. The largest datagram that any host is
required to be able to reassemble is 576 bytes, but most modern hosts handle much larger
packets. Sometimes subnetworks impose further restrictions on the packet size, in which case
datagrams must be fragmented. Fragmentation is handled in either the host or router in IPv4.
Identification
This field is an identification field and is primarily used for uniquely identifying the group of
fragments of a single IP datagram. Some experimental work has suggested using the ID field for
other purposes, such as for adding packet-tracing information to help trace datagrams with
spoofed source addresses,[13]
but RFC 6864 now prohibits any such use.
Flags
A three-bit field follows and is used to control or identify fragments. They are (in order, from
high order to low order):
• bit 0: Reserved; must be zero.
• bit 1: Don't Fragment (DF)
• bit 2: More Fragments (MF)
If the DF flag is set, and fragmentation is required to route the packet, then the packet is dropped.
This can be used when sending packets to a host that does not have sufficient resources to handle
fragmentation. It can also be used for Path MTU Discovery, either automatically by the host IP
software, or manually using diagnostic tools such as ping or traceroute.
For unfragmented packets, the MF flag is cleared. For fragmented packets, all fragments except
the last have the MF flag set. The last fragment has a non-zero Fragment Offset field,
differentiating it from an unfragmented packet.
Fragment Offset
The fragment offset field, measured in units of eight-byte blocks (64 bits), is 13 bits long and
specifies the offset of a particular fragment relative to the beginning of the original unfragmented
IP datagram. The first fragment has an offset of zero. This allows a maximum offset of
(2^13 - 1) × 8 = 65,528 bytes, which would exceed the maximum IP packet length of 65,535 bytes with the
header length included (65,528 + 20 = 65,548 bytes).
Time To Live (TTL)
An eight-bit time to live field helps prevent datagrams from persisting (e.g. going in circles) on
an internet. This field limits a datagram's lifetime. It is specified in seconds, but time intervals
less than 1 second are rounded up to 1. In practice, the field has become a hop count—when the
datagram arrives at a router, the router decrements the TTL field by one. When the TTL field hits
zero, the router discards the packet and typically sends an ICMP Time Exceeded message to the
sender.
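The Header Checksum field in the layout above is the 16-bit ones'-complement of the ones'-complement sum of all 16-bit words in the header (the RFC 791 algorithm). A sketch, using a commonly cited sample header whose bytes are illustrative:

```python
# IPv4 header checksum: sum the header as 16-bit big-endian words with
# end-around carry, then take the ones' complement.

def ipv4_checksum(header: bytes) -> int:
    assert len(header) % 2 == 0
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return ~total & 0xFFFF

# A 20-byte sample header with the checksum field (bytes 10-11) zeroed:
hdr = bytes.fromhex("4500003c1c4640004006" "0000" "ac100a63ac100a0c")
print(hex(ipv4_checksum(hdr)))   # 0xb1e6
```

A receiver runs the same sum over the header with the checksum field filled in; a correct header then sums to zero, which is how corruption is detected (and why routers must recompute the field after decrementing TTL).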
UNIT 8
1. Write a note on object file and map file. Jan 14
An object file is a file containing object code, meaning relocatable format machine code that is
usually not directly executable. Object files are produced by an assembler, compiler, or other
language translator, and used as input to the linker, which in turn typically generates an
executable or library by combining parts of object files. There are various formats for object
files, and the same object code can be packaged in different object files.
In addition to the object code itself, object files may contain metadata used for linking or
debugging, including: information to resolve symbolic cross-references between different
modules, relocation information, stack unwinding information, comments, program symbols,
debugging or profiling information.
An object file format is a computer file format used for the storage of object code and related
data. There are many different object file formats; originally each type of computer had its own
unique format, but with the advent of Unix and other portable operating systems, some formats,
such as COFF and ELF, have been defined and used on different kinds of systems. It is possible
for the same file format to be used both as linker input and output, and thus as the library and
executable file format. The design and/or choice of an object file format is a key part of overall
system design. It affects the performance of the linker and thus programmer turnaround while
developing. If the format is used for executables, the design also affects the time programs take
to begin running, and thus the responsiveness for users. Most object file formats are structured as
blocks of data, each block containing a certain type of data (see Memory segmentation). These
blocks can be paged in as needed by the virtual memory system, needing no further processing to
be ready to use. Debugging information may either be an integral part of the object file format, as
in COFF, or a semi-independent format which may be used with several object formats, such as
stabs or DWARF.
Object files are usually divided into segments or sections, not to be confused with memory
segmentation. Segments in different object files may be combined by the linker according to
rules specified when the segments are defined. Conventions exist for segments shared between
object files; for instance, in DOS there are different memory models that specify the names of
special segments and whether or not they may be combined.[2]
Types of data supported by typical object file formats:
• Header (descriptive and control information)
• Text segment (executable code)
• Data segment (static data)
• BSS segment (uninitialized static data)
• External definitions and references for linking
• Relocation information
• Dynamic linking information
• Debugging information

A map file, in contrast, is a text file produced by the linker that records where each
symbol and section was placed in the final image: the addresses of functions, global
variables, and segment boundaries. It is useful for verifying and debugging the memory
layout of an embedded program.
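As a small illustration of how tools identify an object file format, the first bytes of an ELF file (one format named above) carry a magic number followed by a class byte (1 = 32-bit, 2 = 64-bit). The sample bytes below are fabricated for illustration, not read from a real file.

```python
# Inspect the e_ident prefix of an ELF object file: check the magic
# number and report the file class.

sample = b"\x7fELF\x02\x01\x01\x00" + b"\x00" * 8   # fabricated e_ident

def elf_class(e_ident: bytes) -> str:
    magic, ei_class = e_ident[:4], e_ident[4]
    if magic != b"\x7fELF":
        raise ValueError("not an ELF object file")
    return {1: "ELF32", 2: "ELF64"}.get(ei_class, "unknown")

print(elf_class(sample))   # ELF64
```

Linkers and loaders perform checks like this before interpreting the rest of the header, which is what lets one toolchain reject object files built for another format.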
2. Explain about simulators, emulators and debuggers. Jan 14
Emulator
An emulator is hardware or software (or both) that duplicates (or emulates) the
functions of one computer system (the guest) in another computer system (the
host), different from the first one, so that the emulated behavior closely resembles
that of the guest system.

Debugging
Embedded debugging may be performed at different levels, depending on the
facilities available. From simplest to most sophisticated, they can be roughly
grouped into the following areas:
• Interactive resident debugging, using the simple shell provided by the embedded
operating system (e.g. Forth and BASIC).
• External debugging, using logging or serial-port output to trace operation with
either a monitor in flash or a debug server like the Remedy Debugger, which
even works for heterogeneous multicore systems.
Simulation is the imitation of the operation of a real-world process or system
over time. The act of simulating something first requires that a model be
developed; this model represents the key characteristics or behaviors of the selected
physical or abstract system or process. The model represents the system itself,
whereas the simulation represents the operation of the system over time.
3.What is simulator? Explain its features. Jun 14
Simulation is the imitation of the operation of a real-world process or system
over time.[1]
The act of simulating something first requires that a model be
developed; this model represents the key characteristics or behaviors/functions of
the selected physical or abstract system or process. The model represents the
system itself, whereas the simulation represents the operation of the system over
time.
Simulation is used in many contexts, such as simulation of technology for
performance optimization, safety engineering, testing, training, education, and
video games. Often, computer experiments are used to study simulation models.
Simulation is also used with scientific modelling of natural systems or human
systems to gain insight into their functioning.[2]
Simulation can be used to show
the eventual real effects of alternative conditions and courses of action.
Simulation is also used when the real system cannot be engaged, because it may
not be accessible, or it may be dangerous or unacceptable to engage, or it is being
designed but not yet built, or it may simply not exist.[3]
Key issues in simulation include acquisition of valid source information about the
relevant selection of key characteristics and behaviours, the use of simplifying
approximations and assumptions within the simulation, and fidelity and validity
of the simulation outcomes.
4. What are the improvements over firmware software debugging? Explain Jun 14
Embedded debugging may be performed at different levels, depending on the
facilities available. From simplest to most sophisticated they can be roughly
grouped into the following areas:
• Interactive resident debugging, using the simple shell provided by the embedded
operating system (e.g. Forth and BASIC).
• External debugging, using logging or serial-port output to trace operation with
either a monitor in flash or a debug server like the Remedy Debugger, which
even works for heterogeneous multicore systems.
• An in-circuit debugger (ICD), a hardware device that connects to the
microprocessor via a JTAG or Nexus interface. This allows the operation of the
microprocessor to be controlled externally, but is typically restricted to the
specific debugging capabilities of the processor.
• An in-circuit emulator (ICE), which replaces the microprocessor with a simulated
equivalent, providing full control over all aspects of the microprocessor.
• A complete emulator, which provides a simulation of all aspects of the hardware,
allowing all of it to be controlled and modified, and allowing debugging on a
normal PC. The downsides are expense and slow operation, in some cases up to
100× slower than the final system.
For SoC designs, the typical approach is to verify and debug the design on an
FPGA prototype board. Tools such as Certus[8] are used to insert probes in the
FPGA RTL that make signals available for observation. This is used to debug
hardware, firmware, and software interactions across multiple FPGAs, with
capabilities similar to a logic analyzer.
Unless restricted to external debugging, the programmer can typically load and
run software through the tools, view the code running in the processor, and start
or stop its operation. The view of the code may be as HLL source code, assembly
code, or a mixture of both.
Because an embedded system is often composed of a wide variety of elements,
the debugging strategy may vary. For instance, debugging a software- (and
microprocessor-) centric embedded system is different from debugging an
embedded system where most of the processing is performed by peripherals
(DSP, FPGA, co-processor). An increasing number of embedded systems today
use more than one processor core. A common problem with multi-core
development is the proper synchronization of software execution. In such a case,
the embedded system design may wish to check the data traffic on the busses
between the processor cores, which requires very low-level debugging, at
signal/bus level, with a logic analyzer, for instance.