Costly Mistakes of Real-Time Software Development
Dave Stewart, PhD
Director of Research and Systems Integration
InHand Electronics [email protected]
www.inhand.com
ESC 223
DesignWest 2012 – San Jose
Costly Mistakes with Real-Time Software Development
Dave Stewart, PhD
© 2012 InHand Electronics, Inc. – www.inhand.com
Why this presentation?
Rookies and experts make the same mistakes over and over again.
What Constitutes “Costly” Mistakes?
• Initial software effort is over budget
• Delivery is late
• Mistake requires an expensive solution
• One or more requirements not met
• Quality issues that lead to a poor reputation
• High maintenance costs to fix software
The Order is Subjective
#1 is the costliest mistake
… but just in my opinion. Every person may have their own order.
The Order is Not Really Important
What is important is that the mistake is on the list!
• Correcting just ONE mistake can save thousands of dollars or significantly improve the quality and robustness of software.
• Correcting SEVERAL mistakes can lead to savings and improvements that are incalculable!
No Emulators of Target Applications
Use an Emulator to Reduce Costs
• Prototyping before hardware is available
• Faster development with better tools
• Enable more developers
• Obtain early customer feedback
• Deeper understanding
• Mistakes that do happen are safer and cheaper!
“It’s just a glitch”
• Never assume that a problem has been fixed magically
• Note the problem in your log book immediately!
• Spend some time to try and fix the problem
“It’s just a glitch”
• Recognize the most likely causes
  • timing error (i.e. real-time scheduling issue)
  • race condition
  • memory corruption
  • hardware problem
• Almost any other type of error would be easy to replicate, and would not be a random or rare glitch.
During Design, Take Precautions
Due to the high cost of trying to debug glitches, take precautions that will minimize potential issues
• Formal code review
• Minimize shared resources and memory
• Limit use of interrupts
• Select an appropriate real-time scheduling algorithm
• Analyze expected real-time performance
• Use deadlock-free IPC solutions
Actively Troubleshoot
Knowing that glitches are usually one of a few specific types of errors, actively try to replicate them. E.g.
• Force context switches inside race conditions (e.g. by adding a sleep command) to validate mutexes
• Intentionally overload the system to see how it fails if the CPU is overloaded
• Check for stack or heap corruption
• Monitor progress of software and hardware using a logic analyzer (see ESC Class “Troubleshooting Real-Time Software Issues Using a Logic Analyzer”)
• Incrementally add debug information so that next time the glitch happens, more information is available
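The first technique above, forcing a context switch inside the race window, can be sketched as follows. This is a minimal illustration that assumes POSIX threads; the names `force_context_switch`, `worker`, and `run_race_test` are hypothetical and not from the talk. Two threads do a read, sleep, write sequence on a shared counter. With the mutex in place the final count should be exactly 2 * ITERATIONS; if the lock and unlock calls are removed, the sleep makes lost updates very likely, which is the point of the test.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() */
#endif
#include <pthread.h>
#include <time.h>

#define ITERATIONS 50

static long shared_counter;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Sleep briefly to invite a context switch at exactly this point. */
static void force_context_switch(void)
{
    struct timespec ts = { 0, 100000L };   /* 100 microseconds */
    nanosleep(&ts, NULL);
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&counter_lock);
        long tmp = shared_counter;     /* read */
        force_context_switch();        /* widen the race window on purpose */
        shared_counter = tmp + 1;      /* write */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs two workers and returns the final counter value.
 * If the mutex protects the window, this is 2 * ITERATIONS. */
long run_race_test(void)
{
    pthread_t a, b;
    shared_counter = 0;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_counter;
}
```

A test script would call `run_race_test()` and compare the result against `2 * ITERATIONS`; a smaller value indicates that the critical section is not actually protected.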
Poor or No Software Diagrams
Poor or No Software Diagrams
typedef struct _def_t {
    struct _def_t *next;
    struct _def_t *prev;
    char name[8];
    short loval;
    short hival;
} def_t;

typedef struct _xyz_t {
    int i;
    float f;
    short s[2];
    unsigned char b[8];
} xyz_t;

typedef struct _abc_t {
    def_t *def;
    xyz_t *xyz;
    short ndef;
} abc_t;

Without a corresponding diagram, there is no easy way to know what this code is supposed to do.
By reverse engineering it, at best you can determine what the code does, but is that what it is supposed to do?
Diagrams represent the design. No diagrams means the code was implemented directly from requirements, and will likely be neither robust nor efficient.
Poor or No Software Diagrams
[Figure: the def_t, xyz_t, and abc_t declarations from the previous slide, shown next to a data-structure diagram. abc_t contains the fields *def, *xyz, and ndef. *def points to the head of a list of def_t nodes (name1 … name ndef), each with the fields next, prev, name, loval, and hival. *xyz points to an array xyz[0], xyz[1], xyz[2], … xyz[nxyz–1], whose elements contain i, f, s[0], s[1], and b[0] … b[7]. Legend: structure abc_t; field within structure; zoomed-in view of a structure; pointer.]
Diagrams make it clear what code is supposed to do. Errors are more easily identified, because they show up wherever the code does not reflect the diagram.
Creating Good Diagrams
• Create a legend for every diagram
  • Every block, symbol, line, shading, color, and font type should be specified in the legend.
  • Any deviation from the legend shows an error in the design
[Figure: example legend with the entries “Land” and “Water”]
Creating Good Diagrams
• Use drawing or CASE tools
  • E.g. UML (Unified Modeling Language)
  • UML is nothing more than a standard set of legends for different types of diagrams
• Use different diagrams to represent multiple “views”
  • Data Flow
  • Process Flow
  • Sequence Diagrams
  • Timing Diagrams
  • State Machines
  • Use Cases
  • Data Structures
  • Class Diagrams
• Each layer should have
  • Architectural Design (how components are integrated)
  • Detailed Design (contents of each component)
Inadequate Design for Test
• The most common reason for not testing code is that it is difficult to test
  • Yet that is precisely when testing is most needed
• When a system is being designed, every requirement needs to have the following question answered: “How will this requirement be tested?”
  • If there is no answer, then this is almost guaranteed to be the requirement that will cause problems
  • Update the design to include methods to test!
Example: Too Many Branches
• It is extremely difficult to test many branches.
  • Such code is usually the result of implementing directly from requirements, which tend to have a lot of “if” conditionals.
• Solution: change the design. Define the system as a finite state machine that is easier to test.
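As a rough illustration of that redesign, the sketch below replaces nested conditionals with a table-driven finite state machine. The states, events, and transitions describe a hypothetical power controller invented for this example; they are not from the talk. Because every behavior is one cell of the table, a test can enumerate all state and event pairs instead of chasing branch combinations.

```c
/* Hypothetical power-controller states and events (illustrative only). */
typedef enum { ST_OFF, ST_STARTING, ST_ON, ST_FAULT, ST_COUNT } state_t;
typedef enum { EV_BUTTON, EV_READY, EV_ERROR, EV_COUNT } event_t;

/* One row per current state, one column per event. */
static const state_t transition[ST_COUNT][EV_COUNT] = {
    /*                  EV_BUTTON     EV_READY   EV_ERROR */
    /* ST_OFF      */ { ST_STARTING,  ST_OFF,    ST_OFF   },
    /* ST_STARTING */ { ST_OFF,       ST_ON,     ST_FAULT },
    /* ST_ON       */ { ST_OFF,       ST_ON,     ST_FAULT },
    /* ST_FAULT    */ { ST_OFF,       ST_FAULT,  ST_FAULT },
};

/* Returns the next state; the whole behavior is the table lookup. */
state_t fsm_step(state_t current, event_t ev)
{
    return transition[current][ev];
}
```

With 4 states and 3 events there are exactly 12 cases to verify, and each one is a single call to `fsm_step`.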
Improve Status Output to Console
Bad: Reading 100 bytes
Good: SPI R 100 Cmd=0x12 Data=45 12 20 10 …
Bad: State is 4
Good: [pwr_ctrl.c Line 112 pwr_VoltageSet()] state=4
• Automated test programs can more easily monitor detailed output, even if the detail is a bit cryptic
  • These programs cannot easily handle a lack of information or ambiguous information.
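One way to get output in the style of the second “Good” example is to let the compiler supply the file, line, and function name. The sketch below is an assumption about how such a helper could look; `status_format` and `STATUS` are hypothetical names. It formats into a caller-supplied buffer so that the same text can go to a console, a log, or an automated test.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Writes "[file Line N func()] message" into buf.
 * Returns the length of the full text, or a negative value on a formatting
 * error. If the prefix alone does not fit, the message is not appended. */
int status_format(char *buf, size_t size, const char *file, int line,
                  const char *func, const char *fmt, ...)
{
    va_list args;
    int n = snprintf(buf, size, "[%s Line %d %s()] ", file, line, func);
    if (n < 0 || (size_t)n >= size)
        return n;
    va_start(args, fmt);
    int m = vsnprintf(buf + n, size - (size_t)n, fmt, args);
    va_end(args);
    return (m < 0) ? m : n + m;
}

/* Convenience wrapper: the compiler fills in the source location. */
#define STATUS(buf, size, ...) \
    status_format((buf), (size), __FILE__, __LINE__, __func__, __VA_ARGS__)
```

Inside a function such as `pwr_VoltageSet()`, a call like `STATUS(buf, sizeof buf, "state=%d", state)` would then produce a line that a test script can match on location and value.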
Avoid Interactive Testing
• Interactive tests will rarely be re-done when small changes occur
  • Spend time designing the tests
  • Create scripts to automatically run tests
  • Emulate input devices using known patterns
• Ideal Rule: Always test the entire system all the time!
• Practical Rule:
  • Automate testing as much as possible to get close to the ideal rule.
  • Regularly test non-automated parts of the system following documented procedures.
Improper Separation Between ISR and IST
• ISR: Interrupt Service Routine
  • High-priority interrupt, unscheduled
  • General rule is to process it and return as soon as possible
• IST: Interrupt Service Thread
  • Scheduled actions for most interrupts
  • Priorities can be managed
Bad Implementation of ISR vs IST
• Following is a very common flaw in many RTOS driver implementations
  • This flaw will usually break the ability to achieve good real-time performance on a system

ISR:
    disable interrupts
    decide which IST to run
    signal IST
IST:
    process the interrupt
    re-enable interrupts
Good Implementation of ISR vs IST
ISR:
    disable interrupts
    if “quick” operation (e.g. < 100 usec)
        perform the full operation
    else
        mask out the individual interrupt only
        signal IST
    enable interrupts
IST:
    perform longer operations
    unmask individual interrupt
Example: Serial Driver
• ISR
  • Handles individual bytes arriving over the serial link
  • It only takes a few microseconds per byte; no need to signal the IST for that
• IST
  • When a complete packet arrives, or the buffer reaches a critical threshold, the IST is signaled to provide more extensive processing
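The serial-driver split described above might look roughly like the following. This is a host-side simulation, not driver code for any particular RTOS: the buffer size, threshold, newline packet terminator, and all function names are assumptions made for the example, and a plain flag stands in for whatever event or semaphore the RTOS provides.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_SIZE   64
#define RX_THRESHOLD  48      /* hypothetical "critical" fill level */
#define PACKET_END    '\n'    /* hypothetical packet terminator */

static uint8_t rx_buf[RX_BUF_SIZE];
static volatile size_t rx_len;
static volatile bool ist_signaled;   /* stands in for an RTOS event */

/* ISR: called once per received byte; only stores the byte and decides
 * whether the IST has work to do. */
void serial_isr(uint8_t byte)
{
    if (rx_len < RX_BUF_SIZE)
        rx_buf[rx_len++] = byte;
    if (byte == PACKET_END || rx_len >= RX_THRESHOLD)
        ist_signaled = true;
}

/* IST: runs as a scheduled thread. Copies up to out_size buffered bytes to
 * out, clears the buffer, and returns the number of bytes copied (0 if it
 * was not signaled). On real hardware the serial interrupt would be masked
 * while the buffer is drained; that step is omitted in this simulation. */
size_t serial_ist(uint8_t *out, size_t out_size)
{
    if (!ist_signaled)
        return 0;
    size_t n = (rx_len < out_size) ? rx_len : out_size;
    for (size_t i = 0; i < n; i++)
        out[i] = rx_buf[i];
    rx_len = 0;
    ist_signaled = false;
    return n;
}
```

The per-byte path never wakes a thread, so the IST runs once per packet rather than once per byte.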
Why is the “good” implementation better?
• Reduces overhead of interrupts that are happening rapidly
  • Interrupts may be arriving much faster than the data actually needs to be processed; by processing data less frequently, the priority of the IST can be lowered, making it easier to schedule
• ISR remains at high priority, and thus does not miss any critical events
• Interrupts are never disabled for extended periods of time, thus preventing priority inversion
  • Allows other real-time tasks to execute when needed
Circular Dependencies
This is the root cause of spaghetti—aka “legacy”—code!
Circular Dependencies Break Modularity
[Figure: example of a dependency graph with modules abc, def, ghi, jkl, stu, mno, pqr, uvw, and xyz; module jkl is highlighted]
Suppose we want to test or reuse module jkl, or find an error that is observed in jkl.
Circular Dependencies Break Modularity
[Figure: the same dependency graph, with the four modules jkl, pqr, uvw, and xyz highlighted]
• Unit testing is confined to only the four modules shown.
• If an error is observed in jkl, then quite likely the issue is in one of the four modules shown.
• If module jkl needs to be reused in a different system, only the four modules shown are needed.
Circular Dependencies Break Modularity
[Figure: the same dependency graph, with one circular dependency added between jkl and ghi]
With just one circular dependency (between jkl and ghi) the number of modules needed to confine errors, unit test, or reuse, is increased.
Circular Dependencies Break Modularity
[Figure: the same dependency graph, with one major cycle added]
With just one major cycle, jkl cannot be reused or unit tested!
Circular Dependencies Break Modularity
[Figure: dependency graph containing a major cycle]
Such major cycles are quite common:
• Using a single “globals.h” type of header that all modules include
• Low-level device driver directly uses a constant or variable defined by the application
• Software architecture is not layered
Avoiding Circular Dependencies
• Follow fundamental Software Engineering concepts:
  • Data encapsulation and modularity
  • Use abstract data types or objects
• Avoid including everything
  • E.g. #include "globals.h". This effectively makes everything dependent on everything else.
  • Instead, what you need is what you include
  • Anything more causes undesired dependencies
• Review component relationships
  • If adding an include file causes a circular dependency, review the design immediately.
  • This is much easier to fix at the root of the problem, rather than later on down the road.
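To make the “abstract data type” advice concrete, here is a minimal sketch of an opaque type in C. The `motor` module and its functions are hypothetical, invented for this example; both the header part and the implementation part are shown in one listing for brevity. Clients see only the declarations in the first part, so they cannot depend on the structure layout or on any global defined elsewhere.

```c
#include <stdlib.h>

/* ---- motor.h (public interface): only what clients need ---- */
typedef struct motor motor_t;                 /* opaque: layout is hidden */
motor_t *motor_create(int max_rpm);           /* NULL on allocation failure */
int      motor_set_rpm(motor_t *m, int rpm);  /* 0 on success, -1 if out of range */
int      motor_get_rpm(const motor_t *m);
void     motor_destroy(motor_t *m);

/* ---- motor.c (private implementation) ---- */
struct motor {
    int max_rpm;
    int rpm;
};

motor_t *motor_create(int max_rpm)
{
    motor_t *m = malloc(sizeof *m);
    if (m != NULL) {
        m->max_rpm = max_rpm;
        m->rpm = 0;
    }
    return m;
}

int motor_set_rpm(motor_t *m, int rpm)
{
    if (rpm < 0 || rpm > m->max_rpm)
        return -1;               /* reject the request, keep the old value */
    m->rpm = rpm;
    return 0;
}

int motor_get_rpm(const motor_t *m)
{
    return m->rpm;
}

void motor_destroy(motor_t *m)
{
    free(m);
}
```

A module that uses the motor includes only the interface, which keeps the dependency pointing in one direction.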
No Code Reviews
Textbook Reasons
• Code reviews are a proven way to improve quality and robustness
• Studies have shown that more problems can get fixed in one day of code review than in a week of debugging
These are applicable, but not the main reason this mistake becomes so costly.
Code Reviews
• Greatest cost for software is in the maintenance phase
  • It is likely that the original developers of the software are no longer on the project, and thus are not the ones troubleshooting and revising the software
• Code reviews are necessary:
  • Reviews help eliminate messy code by forcing programmers to show their code to others.
  • This results in cleaner code from the start, before the programmer is moved off the project.
  • Reviews double as training sessions to increase the number of employees who understand and can maintain the code
  • Reduces the company's risk that only one employee knows the code and that employee leaves for any reason.
  • Increased consistency for software from one project to the next, and more reuse of code and software architectures
Wrong Priority Assignments
• Arbitrarily bumping up priorities of “important” threads is a primary cause of breaking real-time performance and creating priority inversions
  • Doing so will help that one thread perform as needed, but at the cost of causing other real-time threads to fail to meet their timing requirements
  • Trying to then modify the priorities of the other threads that stopped working simply counteracts the initial change.
  • This can lead to a never-ending debug cycle and a system that never meets requirements.
Assigning Priorities
• Priorities are a system configuration issue
  • Drivers and threads should never define or hard-code their own priorities
  • Define all priorities in a tabular configuration file so that the priority of every thread can be seen relative to every other thread.
• Absolute priorities are meaningless!
  • Only relative priorities have meaning
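A tabular priority configuration could be as simple as one array that every driver and thread consults at start-up. The thread names, periods, and priority values below are made up for illustration, and the convention that a larger number means a more urgent thread is an assumption; many RTOSes use the opposite convention.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name;
    unsigned    period_us;
    int         priority;    /* assumption: larger value = more urgent */
} thread_cfg_t;

/* Single place where all priorities are visible side by side. */
static const thread_cfg_t thread_table[] = {
    /* name            period_us   priority */
    { "motor_ctrl",        1000,       4 },
    { "sensor_poll",       5000,       3 },
    { "comm",             20000,       2 },
    { "ui",              100000,       1 },
};

/* Looks up a thread's configured priority; returns -1 if it is not listed. */
int thread_priority(const char *name)
{
    for (size_t i = 0; i < sizeof thread_table / sizeof thread_table[0]; i++)
        if (strcmp(thread_table[i].name, name) == 0)
            return thread_table[i].priority;
    return -1;
}
```

Changing the relative order of two threads then means editing two numbers in one table, not hunting through driver sources.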
Assigning Priorities
• Use the Rate Monotonic Algorithm as a guideline
  • An algorithm for assigning fixed priorities
  • The higher the frequency (or shorter the period) of a task, the higher its priority.
  • Semantic importance of a task is irrelevant in real-time scheduling. If a task needs to be hard real-time, it needs to execute faster than other tasks.
• While an accurate analysis would be nice, it is not necessary unless on the threshold of performance
  • Consider the general rates and approximate utilization of each driver or thread, and assign priorities accordingly
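The rate monotonic rule can be applied mechanically: sort tasks by period and hand out priorities in that order. The sketch below does this and also sums the approximate utilization (execution time divided by period). The task structure, the function names, and the convention that a larger number is a higher priority are assumptions for this example; tasks with equal periods receive adjacent priorities in an unspecified order.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    const char *name;
    unsigned    period_us;   /* shorter period = higher rate */
    unsigned    exec_us;     /* approximate execution time per period */
    int         priority;    /* filled in by rma_assign() */
} task_t;

static int by_period(const void *a, const void *b)
{
    const task_t *ta = a;
    const task_t *tb = b;
    return (ta->period_us > tb->period_us) - (ta->period_us < tb->period_us);
}

/* Sorts tasks by increasing period and assigns priorities n, n-1, ..., 1,
 * so the shortest period gets the highest priority. */
void rma_assign(task_t *tasks, size_t n)
{
    qsort(tasks, n, sizeof tasks[0], by_period);
    for (size_t i = 0; i < n; i++)
        tasks[i].priority = (int)(n - i);
}

/* Total CPU utilization as a fraction (1.0 = fully loaded).
 * Every period_us must be nonzero. */
double rma_utilization(const task_t *tasks, size_t n)
{
    double u = 0.0;
    for (size_t i = 0; i < n; i++)
        u += (double)tasks[i].exec_us / (double)tasks[i].period_us;
    return u;
}
```

For three tasks with periods of 1 ms, 20 ms, and 100 ms and execution times of 0.2 ms, 2 ms, and 5 ms, the utilization sums to 0.2 + 0.1 + 0.05 = 0.35, well away from the threshold where a precise analysis would be needed.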
Prioritizing Random Events – Right Way
• The Right Way is to define Aperiodic Servers
  • Each server has reserved execution time at high priority, and has the ability to use leftover execution time at lower priorities
• Requires appropriate RTOS mechanisms to detect timing errors
  • E.g. missed deadline or using more CPU time than reserved.
  • Unfortunately, many RTOSes don’t support the mechanisms
  • See “Mechanisms for Detecting and Handling Timing Errors” by D. Stewart and P. Khosla
    • Available on the Web; search by title
• Use the Right Way for Hard Real-Time Systems
Prioritizing Random Events – Quick Way
• Estimate the fastest rate at which the random event arrives; treat that as the frequency of the thread
• Estimate how long each event takes, and use that as the utilization of the thread
• Assign priority using the Rate Monotonic Algorithm as though the thread were periodic
• This method is generally good enough for soft real-time threads
• It can be used if there are hard real-time threads, as long as the highest priority assigned to a random event thread is less than that of any hard real-time thread
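The arithmetic behind the quick way is small enough to show directly. In this sketch the structure and function names are hypothetical, and priorities are assumed to be numbers where larger means more urgent. The minimum spacing between events plays the role of the period, and the worst-case handling time divided by that spacing gives the utilization estimate.

```c
typedef struct {
    unsigned min_interarrival_us;   /* fastest the event is expected to arrive */
    unsigned worst_exec_us;         /* estimated time to handle one event */
} event_est_t;

/* Utilization estimate in parts per thousand (100 = 10% of the CPU).
 * A zero inter-arrival time is treated as saturating the CPU. */
unsigned event_utilization_permille(event_est_t e)
{
    if (e.min_interarrival_us == 0)
        return 1000u;
    return (unsigned)(((unsigned long long)e.worst_exec_us * 1000u)
                      / e.min_interarrival_us);
}

/* Nonzero if the random-event thread stays below every hard real-time
 * thread (assumption: larger number = higher priority). */
int event_priority_ok(int event_priority, int lowest_hard_rt_priority)
{
    return event_priority < lowest_hard_rt_priority;
}
```

For example, an event that arrives at most every 5 ms and takes 0.5 ms to handle is treated as a 5 ms periodic thread using 10% of the CPU.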
Too Much Locking and Blocking
• Locking and blocking come in many forms
  • Semaphores
  • Message Passing
  • Mutex
  • Critical Sections
• Blocking is an enemy to real-time predictability!
  • Reduces effectiveness of real-time scheduling
  • Introduces priority inversion
  • Potential exists for deadlock if multiple locks are used
Eliminate Locking when Possible
• Minimize inter-module communication
  • Too much inter-module communication is a sign of a poorly-decomposed software architecture
• Limit real-time scheduling preemption
  • Systems don’t have to be fully preemptive
  • If module A reads data that module B produces, there is no reason one should have to interrupt the other; they can be set to the same priority, and always execute in sequence
Eliminate Locking when Possible
• If using message passing:
  • Use non-blocking mechanisms only, such as asynchronous messages
  • If no message is available, then continue with alternate execution
• Use large enough queues
  • In soft real-time systems, if they fill up, throw away data
  • In hard real-time systems, CPU utilization must be low enough on average so that queues never fill up
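The non-blocking behavior described above can be sketched with a small fixed-size queue whose send and receive calls return immediately. This is a single-context illustration with hypothetical names; it shows the soft real-time policy of discarding data when the queue is full and is not safe for concurrent use as written.

```c
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 4

typedef struct {
    int      items[QUEUE_CAPACITY];
    size_t   head;       /* index of the oldest message */
    size_t   count;      /* number of queued messages */
    unsigned dropped;    /* messages thrown away because the queue was full */
} msg_queue_t;

/* Never blocks. Returns false and counts a drop if the queue is full. */
bool mq_try_send(msg_queue_t *q, int msg)
{
    if (q->count == QUEUE_CAPACITY) {
        q->dropped++;
        return false;
    }
    q->items[(q->head + q->count) % QUEUE_CAPACITY] = msg;
    q->count++;
    return true;
}

/* Never blocks. Returns false if no message is available, so the caller
 * can continue with its alternate execution path. */
bool mq_try_receive(msg_queue_t *q, int *msg)
{
    if (q->count == 0)
        return false;
    *msg = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return true;
}
```

The `dropped` counter gives a test a direct way to check whether the queue is large enough for the expected load.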
Eliminate Locking when Possible
• Critical Sections
  • Sometimes it is OK to disable specific interrupts for a few microseconds, to eliminate a lock and possible blocking
• Introduce a buffer for data shared between interrupts and threads
  • Interrupt only accesses the buffer
  • Thread manages refreshing real data from the buffer.
• Access buffers and FIFOs using algorithms that guarantee integrity of data without the need for locks
  • Easiest to achieve if there is only a single producer and a single consumer
• Guard hardware registers by using servers for individual resources
  • Can then use asynchronous message passing between applications (clients) and the server
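One well-known lock-free arrangement for the single-producer, single-consumer case is a ring buffer in which each side writes only its own index. The sketch below uses C11 atomics and is an illustration under that assumption, not the specific algorithm the talk had in mind; all names are hypothetical. The producer (for example an interrupt) is the only writer of `head`, and the consumer (a thread) is the only writer of `tail`.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_SIZE 16u   /* must be a power of two */

typedef struct {
    uint8_t       data[FIFO_SIZE];
    atomic_size_t head;   /* total bytes written; modified only by the producer */
    atomic_size_t tail;   /* total bytes read; modified only by the consumer */
} spsc_fifo_t;

void fifo_init(spsc_fifo_t *f)
{
    atomic_init(&f->head, 0);
    atomic_init(&f->tail, 0);
}

/* Producer side. Returns false if the FIFO is full; never blocks. */
bool fifo_put(spsc_fifo_t *f, uint8_t byte)
{
    size_t head = atomic_load_explicit(&f->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&f->tail, memory_order_acquire);
    if (head - tail == FIFO_SIZE)
        return false;
    f->data[head % FIFO_SIZE] = byte;
    /* Publish the byte only after it has been stored. */
    atomic_store_explicit(&f->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side. Returns false if the FIFO is empty; never blocks. */
bool fifo_get(spsc_fifo_t *f, uint8_t *byte)
{
    size_t tail = atomic_load_explicit(&f->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&f->head, memory_order_acquire);
    if (head == tail)
        return false;
    *byte = f->data[tail % FIFO_SIZE];
    /* Release the slot only after the byte has been read. */
    atomic_store_explicit(&f->tail, tail + 1, memory_order_release);
    return true;
}
```

With more than one producer or more than one consumer this scheme no longer holds, which matches the remark that the single-producer, single-consumer case is the easiest to get right.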
Block threads ONCE and ONLY ONCE
• Ideal model of a real-time thread is to block once per cycle, waiting for the start of a cycle
  • Once the cycle begins, never block within the cycle
[Figure: thread cycle loop with the steps Wait for Event (BLOCK), Read Inputs/Events, Do Processing, Write Outputs]
• For periodic tasks, the event is time-based
• For other tasks, the event could be an interrupt, message arrival, semaphore wakeup, or any other signal
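For the periodic case, the block-once model can be sketched with an absolute-time sleep as the only blocking call in the loop. This example assumes a POSIX system that provides `clock_nanosleep` (for instance Linux); `run_periodic` and `count_cycle` are hypothetical names, and the period is assumed to be shorter than one second.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() and clock_nanosleep() */
#endif
#include <time.h>

typedef void (*cycle_fn)(void *ctx);

/* Advances t by ns nanoseconds (ns assumed to be less than one second). */
static void timespec_add_ns(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    while (t->tv_nsec >= 1000000000L) {
        t->tv_nsec -= 1000000000L;
        t->tv_sec++;
    }
}

/* Runs the cycle body `cycles` times, one period apart.
 * Returns the number of cycles completed. */
int run_periodic(int cycles, long period_ns, cycle_fn body, void *ctx)
{
    struct timespec next;
    int done = 0;

    clock_gettime(CLOCK_MONOTONIC, &next);
    while (done < cycles) {
        timespec_add_ns(&next, period_ns);
        /* The only blocking call: wait for the start of the next cycle. */
        if (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL) != 0)
            break;
        /* Read inputs, do processing, write outputs: must not block. */
        body(ctx);
        done++;
    }
    return done;
}

/* Example cycle body: counts how many times it ran. */
void count_cycle(void *ctx)
{
    (*(int *)ctx)++;
}
```

Using an absolute wake-up time keeps the period from drifting when the body's execution time varies from cycle to cycle.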
No Measurements of Execution Time!
For Real-Time Systems: Execution Time = $$$
Money ($$$)                                 | CPU Execution Time
Have a limited amount                       | Have a limited amount
Bills have deadlines                        | Real-time deadlines
Can’t spend more than you have*             | Can’t use more than you have*
Count it to know how much you are spending  | Measure it to know how much you are using
Don’t buy without knowing the price         | Don’t execute code without knowing its CPU usage
Good budgeting leads to good cash flow      | Good real-time design leads to good CPU usage

* unless you use credit
Measuring Execution Time
Design your system so that the code is measurable!
Measure execution time as part of your standard testing.
Do not only test the functionality of the code!
Measuring Execution Time
• Learn both coarse-grain and fine-grain techniques to measure execution time
  • Use coarse-grain measurements for analyzing real-time properties
  • Use fine-grain measurements for optimizing and fine-tuning
• See “Measuring Execution Time and Real-Time Performance” by D. Stewart
  • Available on the web by searching by title
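As one simple software-only technique, execution time can be sampled around a function call with a monotonic clock and summarized as minimum, maximum, and total. This sketch assumes a POSIX `clock_gettime`; the names `exec_stats_t`, `measure`, and `busy_work` are hypothetical, and the approach is not necessarily one of the techniques described in the referenced paper. On a target without such a clock, a hardware timer or a logic analyzer would take its place.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() */
#endif
#include <stdint.h>
#include <time.h>

typedef struct {
    int64_t  min_ns;
    int64_t  max_ns;
    int64_t  total_ns;
    unsigned runs;
} exec_stats_t;

static int64_t now_ns(void)
{
    struct timespec t;
    clock_gettime(CLOCK_MONOTONIC, &t);
    return (int64_t)t.tv_sec * 1000000000LL + t.tv_nsec;
}

/* Calls fn `runs` times and records per-call execution time.
 * If runs is 0, min_ns is left at INT64_MAX. */
void measure(exec_stats_t *s, void (*fn)(void), unsigned runs)
{
    s->min_ns = INT64_MAX;
    s->max_ns = 0;
    s->total_ns = 0;
    s->runs = 0;
    for (unsigned i = 0; i < runs; i++) {
        int64_t start = now_ns();
        fn();
        int64_t elapsed = now_ns() - start;
        if (elapsed < s->min_ns) s->min_ns = elapsed;
        if (elapsed > s->max_ns) s->max_ns = elapsed;
        s->total_ns += elapsed;
        s->runs++;
    }
}

/* Example workload to measure. */
void busy_work(void)
{
    volatile unsigned sink = 0;
    for (unsigned i = 0; i < 10000u; i++)
        sink += i;
}
```

Running such a measurement as part of the regular test suite makes a change in execution time show up as early as a functional regression would.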
Costly Mistakes of Real-Time Software Development
#1 No Measurements of Execution Time!
#2 Too Much Locking and Blocking
#3 Wrong Priority Assignments
#4 No Code Reviews
#5 Circular Dependencies
#6 Improper Separation between ISR and IST
#7 Inadequate Design for Test
#8 Poor or No Software Design Diagrams
#9 It’s Just a Glitch
#10 No Emulators of Target Application