Embedded System Design — read.pudn.com/downloads124/doc/525538/esd_3_1.pdf · 2008-03-17
TRANSCRIPT
1
Embedded System Design
• Introduction to Embedded Systems
• Embedded Processors
• Real-Time Operating Systems
• Embedded Interfacing and Inter-networking
• Hardware/Software Co-Design
2
Overview
• Process
  – Defines the state of an executing program
• Operating system
  – Provides the mechanism for switching execution between the processes
• RTOSes (Real-Time Operating Systems)
  – OSes that satisfy real-time requirements
    • Deterministic response time to external stimuli
  – POSIX-compliant (IEEE standard Portable Operating System Interface for computer environments)
    • Provides an open standard for OS support
    • Based on the Unix system
    • Supports real-time requirements in the POSIX 1003.4 document
    • Many RTOSes are POSIX-compliant
3
Real-Time Design Approaches
4
Real-Time Design Approaches
• Synchronous vs. Event-driven
  – Synchronous systems perform tasks on a set of schedules
    • Conflicting schedules can be difficult to resolve
  – Asynchronous systems are event-driven
    • Actions are performed based on outside stimulation
  – Most systems are a combination of both
• There are two primary techniques used in real-time designs:
  – Super-loops (without kernel)
    • One program running
  – Multi-tasking (with kernel)
    • Many programs running, taking turns
5
Real-Time Design Approaches (Cont.)
• Super-loop operation
  – There is a background loop that is always running while none of the ISRs is executing
  – Interrupt Service Routines (ISRs) handle asynchronous events in the foreground
    • Timer interrupts, I/O interrupts
  – The CPU is always busy
• Multitasking operation
  – With multitasking operation, multiple tasks or threads are executed based on the scheduling policies
  – The scheduling policy is implemented in the kernel
  – The tasks give up the CPU either
    • Voluntarily: cooperative multitasking
      – Developer pre-determines via system calls
    • Involuntarily: preemptive multitasking
      – Process scheduling algorithm
6
Example of Super-Loop Execution
void main (void) {            /* background */
    initialization;
    FOREVER {
        read analog inputs;
        read discrete inputs;
        perform monitoring functions;
        perform control functions;
        update analog outputs;
        scan keyboard;
        handle communication requests;
        update display;
    }
}

ISR (void) {                  /* foreground */
    handle asynchronous events;
}
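The pseudocode above can be sketched as compilable C. This is a minimal illustration in which the foreground ISR only sets a flag and the background loop polls it; the names `timer_isr` and `background_pass` are ours, and the ISR is invoked manually here rather than by hardware.

```c
#include <stdbool.h>

/* Flag that the foreground ISR sets and the background loop consumes.
   volatile because on real hardware it is written from interrupt context. */
static volatile bool tick_pending = false;
static int ticks_handled = 0;

/* Stand-in for the foreground ISR: on real hardware this would be wired
   to an interrupt vector; here the caller invokes it directly. */
void timer_isr(void) { tick_pending = true; }

/* One pass of the background loop: poll the flag set by the ISR and run
   the periodic work (monitoring, control, display updates, ...). */
void background_pass(void) {
    if (tick_pending) {
        tick_pending = false;
        ticks_handled++;   /* perform monitoring/control functions here */
    }
}
```

On a target, `background_pass()` would be called from the `FOREVER` loop, so the CPU stays busy polling between interrupts.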
7
Super-Loop Execution
[Timeline diagram: the background loop runs continuously; foreground ISRs fire repeatedly over time, each briefly interrupting the background loop.]
8
Multitasking Operation
[Diagram: multiple tasks feed a ready queue; the kernel, driven by the system clock and interrupts, dispatches tasks to the CPU.]
9
Processes, Tasks, Threads, Kernel, and Context Switching
10
Processes, Tasks, and Threads
• Processes and Tasks
  – An entity of work within an OS that has control over resources
  – It owns or controls resources (e.g. access to peripherals)
  – It has threads of execution
    • Can be single or multiple threads
    • The task needs to provide separate spaces for each thread
  – The terms process and task are almost interchangeable
    • A process may require additional information beyond normal register-contents swapping (context switch) to maintain integrity
      – E.g. memory management information
• Threads
  – No additional context information beyond that stored in the processor registers
  – Ownership of resources is inherited from the parent task or process
• Threads ∈ Tasks ∈ Processes
11
What is a Kernel?
• The kernel is a piece of software
  – It has to share the CPU with all the rest of the application code
• The kernel is the part of the OS that handles:
  – Process management
    • E.g. process creation, deletion, suspension
  – Process scheduling
    • Ensures that the most important code runs first
  – Multitasking
    • The application is broken into multiple processes
  – Inter-process communication and synchronization
    • Message queues, semaphores
  – Mutual exclusion
    • Semaphore management
• A typical microkernel may only have 10 to 12 functions
  – Interrupts are usually locked as we enter the kernel
  – The kernel typically gets to run anytime we return from an interrupt
12
Why Processes?
• Why do we need multiple processes?
  – Processes help to manage timing complexity
    • Multi-rate systems
      – E.g. multimedia, automotive engines, printers, and telephone PBXs
    • Asynchronous input
      – E.g. user interfaces, communication systems
• Early multi-tasking technology: co-routines
  – Like subroutines, but the caller determines the return address
  – Does not satisfy timing requirements
  – Becomes unmanageable for more than 3 co-routines
13
Example for Co-routines
Co-routine 1:
      ADR r14,co2a
co1a  …
      ADR r13,co1b
      MOV r15,r14        ; jump to co-routine 2
co1b  …
      ADR r13,co1c
      MOV r15,r14        ; jump to co-routine 2
co1c  ...

Co-routine 2:
co2a  …
      ADR r14,co2b
      MOV r15,r13        ; jump back to co-routine 1
co2b  …
      ADR r14,co2c
      MOV r15,r13        ; jump back to co-routine 1
co2c  …

r13: stores the return address for Co-routine 1
r14: stores the return address for Co-routine 2
r15: Program Counter
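A hedged C analogue of the control transfer above: each routine keeps a static "resume point" in place of the saved return-address registers (a protothread-style sketch; the segment names mirror the assembly labels, and the return values are arbitrary markers of which segment just ran).

```c
/* Co-routine 1: each call resumes at the segment after the last yield,
   mimicking how the ARM example updates r13 before transferring control. */
int co1(void) {
    static int resume = 0;          /* plays the role of r13 */
    switch (resume) {
    case 0:  resume = 1; return 1;  /* segment co1a done, yield */
    case 1:  resume = 2; return 2;  /* segment co1b done, yield */
    default: return 3;              /* segment co1c */
    }
}

/* Co-routine 2, symmetric to co1 (its resume variable plays the role of r14). */
int co2(void) {
    static int resume = 0;
    switch (resume) {
    case 0:  resume = 1; return 10; /* segment co2a */
    case 1:  resume = 2; return 20; /* segment co2b */
    default: return 30;             /* segment co2c */
    }
}
```

Alternating calls to `co1()` and `co2()` reproduce the ping-pong control flow of the assembly example.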
14
Essence of Processes
• Processes: a unique execution of a program
  – Fundamental abstraction for dealing with multiple simultaneous operations
  – Organize executable code into manageable units
    • Compare: procedures organize source code into manageable units
  – Apply operating system techniques to manage the interaction between the processes
  – Each process contains both its code and data
    • Records the current state of the execution
    • Two copies of a single program executing on their own data result in two distinct processes
  – The data set for a process includes values in both CPU registers and memory cells
  – Reentrant: can be executed several times with the same results
    • Easier for debugging
    • E.g. non-reentrant: a program with global variables
15
Essence of Processes (Cont.)
• Executing several processes on a single CPU
  – The CPU executes one process at a time
  – Activation record: keeps a separate record of the status of each process, which contains
    • The process's priority
    • The process's state (e.g. ready, waiting, …)
    • The value to be restored into the Program Counter (PC)
    • The related data needed to reactivate the process
• Context Switching
  – Mechanism for moving the CPU from one executing process to another
  – The CPU can stop executing one process (P1) and start executing another process (P2)
    • Record the complete state information of P1 into its activation record
    • Move the data from P2's activation record into the corresponding registers and main memory
    • Change the PC to the beginning of P2's code and start execution of P2
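As a rough illustration, an activation record and context switch might look like the following in C. The field names and sizes are hypothetical (loosely inspired by the small per-process record a kernel like uC/OS-II keeps); a real kernel would save and restore all CPU registers in assembly.

```c
#include <stdint.h>

typedef enum { DORMANT, READY, EXECUTING, WAITING } proc_state_t;

/* Hypothetical activation record: priority, state, saved PC, saved registers. */
typedef struct {
    uint8_t      priority;
    proc_state_t state;
    uint32_t     pc;         /* saved program counter */
    uint32_t     regs[4];    /* saved general-purpose registers */
} activation_record_t;

/* Context switch sketch: record where the outgoing process stopped,
   mark it ready, and mark the incoming process as executing. The CPU
   would then resume at to->pc. */
void context_switch(activation_record_t *from, activation_record_t *to,
                    uint32_t saved_pc) {
    from->pc    = saved_pc;  /* step 1: save P1's state */
    from->state = READY;
    to->state   = EXECUTING; /* steps 2-3: load P2 and jump to its PC */
}
```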
16
Context Switching
[Diagram: (1) the CPU's PC and registers are saved into Process 1's activation record; (2) Process 2's activation record is loaded into the CPU; (3) the PC now points to code within Process 2. Each activation record resides in RAM: e.g. around 20 bytes for uC/OS-II on the 68HC11.]
17
How Does a Process Get the CPU?
• The following events create conditions for a context switch:
  – An external interrupt
  – A timer
    • This can also be thought of as an interrupt (a periodic interrupt)
  – A process willing to give up control of the CPU
  – A process completes all of its work
  – The scheduler decides to switch the context based on its algorithm
18
Processes in POSIX
• A new process is created by making a copy of an existing process
– It creates two different processes running the same code
• Complication: one process runs the code for the new process while the other process continues the work of the old process
• Each process runs in its own address space
– Can not directly access the data or code of the other processes
– POSIX supplies a mechanism for shared memory to allow processes to communicate
[Diagram: process a copies itself, yielding process a and process b.]
19
Processes in POSIX (Cont.)
/* A process makes a copy of itself by calling the fork() function.
   The OS creates a new process called the child process. Return value
   of fork(): the parent process receives the child's process ID number,
   while the child process receives 0. */
childid = fork();
if (childid == 0) {               /* child operations */
    execv("mychild", childargs);  /* run the child program; returns only on failure */
    perror("execv");
    exit(1);                      /* return to parent */
} else {                          /* parent operations */
    parent_stuff();               /* execute parent functionality */
    wait(&cstatus);               /* wait for the child process to exit and free up
                                     the memory it occupied; cstatus returns the
                                     child process's exit status */
    exit(0);
}
20
Process States
• The OS considers a process to be in one of five scheduling states:
  – Dormant
    • Waiting to be created
  – Waiting
    • Waiting for data from an I/O device or another process
  – Ready
    • The scheduler selects the next process to run
      – Based on process priorities
  – Executing
    • At most one process executing on the CPU
  – Interrupted
    • While an ISR executes
21
Five States for a Process
[State diagram: Dormant (resides in ROM, non-active) → Ready on "create process"; Ready (waits for execution) → Executing on "get CPU", and back on "preempted"; Executing → Waiting on "need data" (wait for a time to expire, a signal, or a message); Waiting → Ready on "get data", or → Executing on "get data and CPU"; Executing ↔ Interrupted via Interrupt Service Routines (ISRs).]
22
Essence of Threads
• Threads:
  – A thread is a single stream of control in the system
  – Typically a process can be divided into a series of cooperating threads
  – A thread is a light-weight process (or task)
    • Threads have their own distinct sets of values for the CPU registers
    • But they coexist in the same memory space with other threads
  – A thread may inadvertently destroy the data of another thread executing on the machine
• Threads are commonly used in embedded systems to avoid the cost and complexity of an MMU
  – An MMU provides strict separation between memory spaces (used by general-purpose computers)
23
Example of Process and Thread
• Application:
– Most desktop operating systems use a process-based model
• A process can be made up of several threads with a common address space
• E.g. MS Word is a process-based application
– One thread for printing
– One thread for reading the keyboard
– One thread for displaying the text in a window
– Most RTOSes use a thread-based tasking model
• Each job is a separate thread of execution, and they may communicate with each other
24
Functionalities of Operating Systems
25
Operating Systems
• Operating systems control and allocate resources:
  – Determine who gets the CPU
    • The most important resource
    • Share the CPU and resources in a rigidly established manner between competing threads of execution
    • CPU access is controlled by the scheduler
      – The scheduler is centralized in a single algorithm that coordinates computation and the transfer of CPU state
  – Determine when I/O should take place
  – Determine how much memory is allocated
  – React in a deterministic way to external events
26
Embedded Operating Systems
• Major functionalities in embedded OSes:
  – Scheduling processes
  – Managing shared resources
    • The CPU is shared among the processes
    • I/O devices
      – The OS supplies a driver for each device and a standard communication mechanism
      – Networking is one of the most important I/O functions to be managed in an embedded system
  – File system services
    • May be kept in RAM or Flash systems
  – Powerful debugging facilities
    • Interfaces to host PC systems
  – Security
27
Embedded Operating Systems (Cont.)
• Timing violations:
  – What happens if a process does not finish by its deadline?
    • Hard deadline:
      – The system will fail if the deadline is missed
      – E.g. safety-critical systems: air bag deployment, anti-lock brakes
    • Soft deadline:
      – The user may notice, but the system does not necessarily fail
      – E.g. telephone systems, an Automatic Teller Machine (ATM)
28
Scheduling Policies
29
Scheduling Policies
• The scheduler:
  – One of the jobs of the OS is to run the scheduler
  – The scheduler is a piece of code that implements a certain policy to pick the next process to run
    • Typically involves a context switch
• Scheduling policies:
  – The rules that the scheduler applies to make the context-switch decision are known as the scheduling policy
  – Defines how processes are selected to move from the ready state to the executing state
30
Scheduling Policies (Cont.)
• How do we evaluate a scheduling policy?
– Ability to satisfy all deadlines
– CPU utilization: the percentage of the CPU's execution time devoted to useful work
– Scheduling overhead: the time required to make a scheduling decision
– Tradeoff: higher CPU utilization requires more scheduling overhead
• POSIX supports real-time scheduling via the sched_setscheduler() function
• Scheduling Policies supported in real-time system:
– Cooperative
– Round robin
– Preemptive, priority-based
– Rate-Monotonic Scheduling (RMS)
– Earliest-Deadline-First Scheduling (EDF)
31
Cooperative Multitasking
• Cooperative multitasking
  – One process gives up the CPU to another voluntarily
  – Drawback: can introduce severe bugs
    • Simple programming errors can cause the system to lock up, fail to respond to input, and become totally inoperable
  – Similar to a procedure call
    • But it does not immediately return to the caller
      – It switches between processes
  – The scheduler determines which process is executed next
    • But it has to wait until the previous process gives up the CPU
  – Gives flexibility to choose the order in which processes run
    • But a process will continue to execute until it voluntarily turns over control to another process
32
Example for Cooperative Multitasking Processes
Process 1:
    if (x > 2)
        sub1(y);
    else
        sub2(y,z);
    cswitch();
    proca(a,b,c);

Process 2:
    proc_data(r,s,t);
    cswitch();
    if (val1 == 3)
        abc(val2);
    rst(val3);

Scheduler:
    save_state(current);
    p = choose_process();
    load_and_go(p);

cswitch(): performs the context switch
33
Cooperative Multitasking (Cont.)
• The CPU's procedure call mechanism:
  – The call to cswitch() uses the standard procedure call mechanism to generate a return address
  – cswitch() copies the procedure call state into a process activation record somewhere in memory
  – After the scheduler decides which process will execute next, cswitch() copies that process's activation record into the locations normally used for the procedure call state
  – Executing a standard procedure return then transfers control to the selected process
34
Round-Robin Scheduling
• Round-robin scheduling:
  – Processes of equal priority can be time-sliced:
    • Each process executes for a time quantum
      – The time quantum is generally configurable
    • Assumes that time-sliced processes are ready to run
    • No limit on the number of processes at the same priority
    • A process can give up its time slice
  – Attempts to be fair by giving each process an equal time slice
  – Commonly found in Unix variants
  – Supported by the POSIX-based RTOSes as well as many of the thread-based RTOSes
  – For most RTOSes, the round-robin scheduler does not disable priority-based scheduling
    • It usually cooperates with priority-based scheduling
35
Round-Robin Scheduling (Cont.)
[Timeline diagram: processes A, B, and C have the same mid-level priority and execute in round-robin time slices (A, B, C, B, C cont.); a high-priority ISR and high-priority process D preempt the time-sliced group.]
36
Preemptive, Priority-based Scheduling
• Uses interrupts to build context switching for preemptive multitasking
• An interrupt forces the CPU to transfer control to the OS
  – Reduces programming errors
  – Allows CPU time to be allocated more efficiently
• Timer-driven preemption
  – A timer generates periodic interrupts to the CPU
  – The interrupt handler calls the OS, which
    • Saves the previous process's state in an activation record
    • Selects the next process to execute
    • Switches the context to that process
[Diagram: a timer delivers periodic interrupts to the CPU.]
37
Preemptive, Priority-based Scheduling (Cont.)
• Difference in triggering event:
– Cooperative: voluntary release of the CPU
– Preemptive: timer interrupt
• Flow of control with preemption (over time):
  P1 → [timer interrupt] → OS (scheduling and other functions) → P1 → [timer interrupt] → OS → P2
38
Preemptive, Priority-based Scheduling (Cont.)
[Timeline diagram: on each timer tick the scheduler runs the highest-priority ready process; the ISR preempts everything, high-priority process A preempts mid-priority B, which in turn preempts low-priority C; preempted processes resume later (A cont., B cont.).]
39
Preemptive, Priority-based Scheduling (Cont.)
• How to set process priorities?
  – Examples of low-priority processes:
    • Keyboard scanning
    • Operator interface
    • Display updates
  – Examples of high-priority processes:
    • Control loops
    • Communications
    • Error handlers (protection systems)
    • I/O (analog, digital, discrete, …)
40
Preemptive, Priority-based Scheduling (Cont.)
• General-purpose vs. embedded scheduling:
  – PCs try to prevent processes from being starved of CPU access
    • Fairness is the main concern for CPU access
  – Embedded systems always require deadlines to be met
    • Low-priority processes may not run for a long time
• Priority-driven scheduling
  – Each process has a priority
  – The CPU goes to the highest-priority process that is ready
  – Priorities determine the scheduling policy:
    • Fixed priorities
    • Time-varying priorities
41
Preemptive, Priority-based Scheduling (Cont.)
• Processes (priority 1 is the highest):
  – P1: priority 1, execution time = 10
  – P2: priority 2, execution time = 20
  – P3: priority 3, execution time = 30
• Rules:
  – Each process has a fixed priority
  – A process continues execution until it completes or is preempted by a higher-priority process
  – The ready process with the highest priority is selected for execution
Timeline (neglecting the OS's scheduling overhead):
  t=0:  P2 becomes ready and executes
  t=15: P1 becomes ready and preempts P2
  t=18: P3 becomes ready but must wait (lowest priority)
  t=25: P1 completes; P2 resumes
  t=30: P2 completes; P3 executes until t=60
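The selection rule above (the highest-priority ready process runs; a lower number means higher priority) can be sketched as a small C helper. This is illustrative only, not how a real RTOS ready queue is implemented.

```c
/* Fixed-priority pick: return the index of the highest-priority ready
   process (smallest priority number), or -1 if none is ready. */
int priority_pick(const int *priority, const int *ready, int n) {
    int best = -1;
    for (int i = 0; i < n; i++)
        if (ready[i] && (best < 0 || priority[i] < priority[best]))
            best = i;
    return best;
}
```

With the slide's processes (priorities {1, 2, 3}), at t=0 only P2 is ready so it runs; at t=18 all three are ready and P1 (index 0) wins.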
42
RMS/ RMA
• Rate-Monotonic Scheduling (RMS): a widely used, analyzable real-time scheduling policy
  – Static scheduling policy (fixed priorities)
    • Sufficient in most cases
• The underlying theory, the Rate-Monotonic Analysis (RMA) model, assumes:
  – All processes run periodically on a single CPU
  – Context switching time is ignored
  – No data dependencies between processes
  – Process execution time is constant (time-invariant)
  – All deadlines are at the ends of their periods
  – The highest-priority ready process is selected for execution
43
RMS/ RMA (Cont.)
• RMS priorities:
  – Guarantee all processes meet their deadlines
  – Policy: the shortest-period process is assigned the highest priority
    • Priority is inversely proportional to period
  – This fixed-priority scheduling policy is proven to be the optimum assignment of static priorities to processes
    • It provides the highest CPU utilization while ensuring that all processes meet their deadlines
    • If a set of processes cannot be scheduled by RMS, it cannot be scheduled by any other fixed-priority policy
• Process parameters:
  – Ti is the computation time of process i
  – τi is the period of process i
[Diagram: process Pi executes for computation time Ti within each period τi.]
44
RMS/ RMA (Cont.)
• RMS CPU utilization for a set of n tasks:

    U = Σ_{i=1}^{n} T_i / τ_i

• The maximum allowable processor utilization for m tasks using RMS is (proof omitted):

    U = m(2^{1/m} − 1)

  – For m = 2, U = 2(2^{1/2} − 1) ≈ 0.83
  – As m approaches infinity, U → ln 2 ≈ 0.69
  – RMS will not reach 100% of the available CPU cycles even with zero context-switch overhead
    • Due to the statically assigned priorities
  – However, RMS guarantees all processes will always meet their deadlines
45
RMS/ RMA Example
• P1: period = 4; execution time = 2 (U = 0.5)
  – P1 has higher priority than P2 since it has the shorter period
• P2: period = 12; execution time = 1 (U = 0.08333)
• P2 is preempted by P1 during its first period since P1 has higher priority
• CPU utilization = [(2×3) + 1]/12 = 0.58333
  – Within the feasible utilization of RMS (schedulable, since 0.58333 < 0.83)
[Timeline, t = 0 to 12: P1 executes at the start of each of its periods (t = 0–2, 4–6, 8–10); P2 executes in the first gap left by P1.]
46
EDF Scheduling
• Earliest-Deadline-First (EDF) Scheduling
  – Dynamic priority scheduling scheme
    • Changes process priorities during execution based on initiation times
  – Policy: the process closest to its deadline has the highest priority
    • Priorities are recalculated at every completion of a process
  – Achieves higher CPU utilization than RMS
    • Can achieve 100% CPU utilization
  – May miss a deadline
  – POSIX currently does not support EDF
  – Generally considered too expensive to be used in practice
    • The major problem is keeping the processes sorted by time to deadline
      – Since the times to deadline for the processes change during execution
    • Dynamic sorting adds complexity to the scheduling process
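The EDF policy itself reduces to picking the ready process with the smallest absolute deadline; a minimal sketch (the hard part the slide mentions, keeping the deadlines sorted efficiently, is deliberately not shown; this linear scan is O(n)):

```c
/* EDF pick: among ready processes, return the index of the one whose
   deadline is nearest; -1 if no process is ready. */
int edf_pick(const long *deadline, const int *ready, int n) {
    int best = -1;
    for (int i = 0; i < n; i++)
        if (ready[i] && (best < 0 || deadline[i] < deadline[best]))
            best = i;
    return best;
}
```

Unlike the fixed-priority case, the `deadline` array here changes over time, which is why a real EDF scheduler must re-sort (or re-scan) on every scheduling decision.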
47
RMS and EDF
• RMS versus EDF:
  – EDF extracts higher CPU utilization but may miss deadlines
    • Overhead on each scheduling decision
  – EDF is significantly more complex than RMS as an algorithm
    • Suited to non-safety-critical applications (e.g. set-top boxes)
  – RMS ensures all deadlines will be satisfied but has lower CPU utilization
• What if the set of processes is unschedulable but we still need to guarantee that they meet their deadlines?
  – Get a faster CPU
  – Reduce the execution times of the processes
  – Change the deadlines in the requirements
48
Issues with Modeling Assumptions
Assumption 1:
• Both RMS and EDF assume that each process is self-contained
  – A process does not need a system resource (e.g. an I/O device, a bus)
  – This assumption can lead to priority inversion
• Priority inversion:
  – Arises when scheduling processes without considering the resources they require
  – A low-priority process blocks execution of a higher-priority process by keeping hold of its resource
49
Priority inversion (Cont.)
– Can cause deadlock
  • Example: P1 (higher priority), P2 (lower priority)
    – The low-priority process (P2) grabs a resource (e.g. an I/O device) from the OS
    – The high-priority process (P1) becomes ready, and the OS preempts P2 for P1
      • P2 is still using the resource
    – When P1 requests the resource, it will be denied since P2 already owns it
      • Causes deadlock
      • Unless P1 has a way to take the resource from P2
• Solution to priority inversion:
  – Promote the priority of the process when it requests a resource from the OS
  – Its priority is demoted to the normal value once the process is finished with the resource
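The promote-then-demote remedy above is closely related to priority inheritance, where the holder of a resource is boosted to the priority of the highest-priority process blocked on it. A minimal sketch with hypothetical field names (a real kernel would also track nested resource ownership and wait chains):

```c
/* Hypothetical process descriptor: current (possibly boosted) priority
   plus the normal base priority to restore later. Higher number =
   higher priority in this sketch. */
typedef struct {
    int priority;
    int base_priority;
} proc_t;

/* A higher-priority requester blocks on a resource: boost the holder. */
void inherit_priority(proc_t *holder, const proc_t *requester) {
    if (requester->priority > holder->priority)
        holder->priority = requester->priority;
}

/* Holder releases the resource: demote back to its normal priority. */
void restore_priority(proc_t *holder) {
    holder->priority = holder->base_priority;
}
```

With the boost in place, the low-priority holder can no longer be preempted by mid-priority processes, so it finishes with the resource and the high-priority process proceeds.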
50
Issues with Modeling Assumptions (Cont.)
Assumption 2
• RMS assumes no data dependencies
  – Knowledge of data dependencies can help to improve CPU utilization
  – Some combinations of processes cannot be ready at the same time
    • P1 and P2 cannot run simultaneously
      – P2 cannot be started until P1 is finished
[Diagram: dependency edge P1 → P2.]
51
Issues with Modeling Assumptions (Cont.)
Assumption 3
• Zero context-switching time
  – Can cause a system to miss a deadline
  – The effect is hard to calculate
    • Depends on the order of context switches
  – In most RTOSes, a context switch requires only a few hundred instructions, plus only slight overhead for a simple real-time scheduler (RMS)
  – Problems may occur in high-rate processes
  – A better estimate is obtained by assuming an average number of context switches per process
52
Issues with Modeling Assumptions (Cont.)
• Assumption 4: Process execution time is constant
  – The following factors can cause variations in execution time at run time:
    • Data-dependent behavior
    • Caching effects
      – Multiple programs executing in the same cache make the state space exponentially larger than that for a single program
• Assumption 5: Many real-time systems are designed without consideration of caches
  – Results in provisioning more computational power than necessary
  – Solutions:
    • Develop a model to estimate the performance of multiple processes sharing a cache
    • Cache management can improve CPU utilization
53
POSIX Scheduling Policies
• SCHED_FIFO: rate-monotonic scheduling
• SCHED_RR: round-robin scheduling
  – A combination of real-time and interactive scheduling techniques
    • Within a priority level, the processes are time-sliced in round-robin fashion
    • Interactive scheduling ensures all processes get a chance to run; time is divided into quanta
      – Processes get the CPU in multiples of quanta
      – The length of the quantum can vary with priority level
• SCHED_OTHER: undefined scheduling
  – Allows non-real-time processes to intermix with real-time processes
• Different processes in a system can run with different policies
  – Some processes use SCHED_FIFO while others use SCHED_RR
54
Guidelines for Scheduling Policies
• Cooperative multitasking is typically used by desktop OSes or non-mission-critical systems
  – E.g. in Windows systems
• Round-robin scheduling is often used when there is no desired priority between competing tasks or the processor is expected to be shared equally
  – E.g. multi-channel communication systems
• Rate-monotonic scheduling is typically used as an analysis tool
  – Can be used to predict the impact of new tasks in mission-critical systems (e.g. avionics)
• Most real-time systems use a preemptive, priority-based policy
  – Requires more thought in the system design
55
Inter-process Communication Mechanisms
56
Inter-process Communication (IPC)
• IPC: the OS provides mechanisms so that processes can pass data
• A process can send a communication in one of two ways:
  – Blocking: the process goes into the waiting state until it receives a response
  – Non-blocking: allows the process to continue execution after sending the communication
• Two major IPC styles (logically equivalent):
  – Shared memory:
    • Processes use the same memory location
    • Processes must cooperate to avoid destroying or missing messages
  – Message passing:
    • No common address space
    • Processes send messages along a communication channel
57
Shared Memory
• Shared memory communication implemented on a bus
  – Components can be processes or CPUs
  – The write and read operations are standard and can be encapsulated in a procedural interface
[Diagram: Process 1 writes to a shared memory block that Process 2 reads.]
58
Race Condition in Shared Memory
• Problem when two processes try to write the same location:
  – Process 1 reads the flag location and sees it is 0
  – Process 2 reads the flag location and sees it is 0
  – Process 1 sets the flag to 1 and writes data to the shared location
  – Process 2 erroneously sets the flag to 1 and overwrites the data left by Process 1
(flag values: 0 = data not ready, 1 = data ready)
[Diagram: Process 1 and Process 2 both write the shared memory.]
59
Atomic Test-and-Set
• The problem can be solved with an atomic test-and-set:
  – A single-bus atomic operation: cannot be interrupted
  – Test: reads a location, then sets it to a specific value
    • It returns the result of the test
  – If the location was already set, the additional set has no effect
    • The test-and-set instruction returns a false result
  – If the location was not set, the location is set by the instruction
    • The test-and-set instruction returns a true result
  – Available on most microprocessors
• Critical region: a section of code that cannot be interrupted by another process
  – E.g. writing shared memory, accessing an I/O device
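In C11 this maps directly onto `atomic_flag`: `atomic_flag_test_and_set` atomically sets the flag and returns its old value. A minimal sketch (a real system would spin or block on failure rather than just report it):

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag lock_flag = ATOMIC_FLAG_INIT;

/* Returns true if we acquired the lock, i.e. the flag was previously
   clear — matching the slide's "true result" case. The set happens
   atomically inside atomic_flag_test_and_set. */
bool try_enter_critical(void) {
    return !atomic_flag_test_and_set(&lock_flag);
}

/* Leave the critical region: clear the flag so others can enter. */
void leave_critical(void) {
    atomic_flag_clear(&lock_flag);
}
```

Code between a successful `try_enter_critical()` and `leave_critical()` forms the critical region.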
60
Semaphores
• Semaphores
  – Language-level synchronization construct
  – A semaphore is used to guard access to a block of protected memory (a critical region of memory)
  – Implemented with test-and-set
  – Any process that wants to access the protected memory must use the semaphore to ensure that no other process is actively using it
  – Semaphore protocol:
    1. P(): gain access to the protected memory (wait for the semaphore)
       – Uses a test-and-set to repeatedly test a location that holds a lock on the memory block
       • The P() operation exits when the lock is available
       • Once available, the test-and-set atomically sets the lock
    2. Perform the critical-region operations (the process works on the protected memory block)
    3. V(): release the lock (release the semaphore)
       • Other processes can then gain access to the region using the P() function
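A minimal counting-semaphore sketch of the P()/V() protocol. This version returns −1 instead of blocking, since a real kernel would instead move the caller onto the semaphore's waiting list; the type and function names are ours, not a real RTOS API.

```c
/* Toy counting semaphore: just a counter; no waiting list, no atomicity.
   A real implementation would protect count with test-and-set or by
   disabling interrupts. */
typedef struct { int count; } ksem_t;

/* P(): take the semaphore. Returns 0 on success, -1 if the caller
   would block (a real kernel would enqueue it in the waiting state). */
int sem_P(ksem_t *s) {
    if (s->count > 0) { s->count--; return 0; }
    return -1;
}

/* V(): release the semaphore, allowing one waiter to proceed. */
void sem_V(ksem_t *s) { s->count++; }
```

Initializing the count to 1 gives the binary (mutex-style) semaphore described on the slide.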
61
Semaphores (Cont.)
• A semaphore contains:
  – A value (0..65535): used as the ID of the semaphore
  – A list of processes waiting for the semaphore
[Diagram: a semaphore with value = 0 holds a list pointer to the process activation records waiting for it, ordered from highest to lowest priority.]
62
Message Passing
• Message passing
  – Each communicating entity has its own message send/receive unit at the endpoint, not at the communication link
    • In contrast, shared memory uses a memory block as the communication device
      – Data are stored in the communication link/memory
  – Communication packets are passed among the devices
  – A message is a pointer
    • The pointer can point to a variable or a data structure
  – Processes or ISRs can send messages
  – Only processes can receive a message
    • The highest-priority process waiting on the queue will get the message
    • The receiving process can time out if no message is received within a certain amount of time
  – Two types of queue:
    • FIFO: first-in, first-out
    • LIFO: last-in, first-out
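A FIFO message queue in which messages are pointers, as described above, can be sketched as a ring buffer. This is a minimal single-threaded illustration; a real RTOS queue would add locking, priority-ordered waiters, and the timeout behavior mentioned on the slide.

```c
#include <stddef.h>

#define QSIZE 8   /* queue capacity; arbitrary for this sketch */

/* FIFO ring buffer of message pointers. */
typedef struct {
    void *slot[QSIZE];
    int head, tail, count;
} msgq_t;

/* Send: enqueue a message pointer. Returns -1 if the queue is full. */
int msgq_send(msgq_t *q, void *msg) {
    if (q->count == QSIZE) return -1;
    q->slot[q->tail] = msg;
    q->tail = (q->tail + 1) % QSIZE;
    q->count++;
    return 0;
}

/* Receive: dequeue the oldest message, or NULL if the queue is empty
   (where a real RTOS would block the caller or time out). */
void *msgq_receive(msgq_t *q) {
    if (q->count == 0) return NULL;
    void *msg = q->slot[q->head];
    q->head = (q->head + 1) % QSIZE;
    q->count--;
    return msg;
}
```

A LIFO variant would simply pop from `tail` instead of `head`.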