
Critical Systems Validation

CIS 376

Bruce R. Maxim

UM-Dearborn

Validation Perspectives

• Reliability validation

– Does measured system reliability meet its specification?

– Is system reliability good enough to satisfy users?

• Safety validation

– Does system operate so that accidents do not occur?

– Are accident consequences minimized?

• Security validation

– Is system secure against external attack?

Validation Techniques

• Static techniques

– design reviews and program inspections

– mathematical arguments and proof

• Dynamic techniques

– statistical testing

– scenario-based testing

– run-time checking

• Process validation

– SE processes should minimize the chances of introducing system defects

Static Validation Techniques

• Concerned with analysis of documentation

• Focus is on finding system errors and identifying potential problems that may arise during system operation

• Documents may be prepared to support static validation

– structured arguments

– mathematical proofs

Static Safety Validation Techniques

• Demonstrating safety by testing is difficult

• Testing all possible operational situations is impossible

• Normal reviews for correctness may be supplemented by specific techniques intended to make sure unsafe situations never arise

Safety Reviews

• Are the intended system functions correct?

• Is structure maintainable and understandable?

• Verify algorithm and data structure design against specification

• Check code consistency with algorithm and data structure design

• Review adequacy of system testing

Review Tips

• Keep software as simple as possible

• Avoid error prone software constructs during implementation

• Use information hiding to localize effects of data corruption

• Make appropriate use of fault tolerant techniques

Hazard-Driven Analysis

• Effective safety assurance relies on hazard identification

• Safety can be assured by– hazard avoidance

– accident avoidance

– protection systems

• Safety reviews should demonstrate that one or more of these techniques have been applied to all identified hazards

System Safety Case

• It is normal practice for a formal safety case to be required for all safety-critical computer-based systems

• A safety case presents a list of arguments, based on identified hazards, as to why there is an acceptably low probability that these hazards will result in an accident

• Arguments can be based on formal proof, design rationale, safety proofs, and process factors

Formal Methods and Validation

• Specification validation

– developing a formal model of a system specification often reveals errors and omissions

– mathematical analysis of a formal specification is another way to discover specification problems

• Formal verification

– mathematical arguments are used to demonstrate that a program or design is consistent with its formal specification

Formal Validation Problems

• The formal model of the specification is not likely to be understood by the domain expert

– this makes it hard to check that the formal model is an accurate representation of the system specification

– a consistently wrong specification is useless

• Verification does not scale up

– verification is a complex, error-prone process

– the cost of verification increases exponentially with system size

Formal Methods in Practice

• Use of formal methods in specification writing and verification may not guarantee correctness

• Use of formal methods helps to increase confidence in a system by demonstrating that some classes of errors are not present

• Formal verification is only likely to be used in small critical system components

• 5 or 6 KLOC seems to be the component size limit for current formal verification techniques

Safety Proofs

• Safety proofs are used to show that a system cannot reach an unsafe state

• Correctness proofs are used to show that system code conforms to its specification

• Safety proofs are based on proof by contradiction

– Assume that an unsafe state can be reached

– Show this assumption is contradicted by program code

• Safety proofs may be presented graphically

Safety Proof Construction

• Establish safe exit conditions for a component

• Starting with the end of the code, work backwards until all paths leading to the exit are identified

• Assume the exit condition is false

• Show that for each path leading to the exit, the assignments made during path execution contradict the assumption of a false exit condition

Safety Validation

• Design validation

– design is checked to ensure that hazards do not arise that cannot be handled without causing an accident

• Code validation

– code is checked for conformance to specification and to ensure that the code is a true implementation of the design

• Run-time validation

– run-time checks are used to make sure the system does not enter an unsafe state during operation

Example from Sommerville: Gas Warning System

• System to warn of poisonous gas. Consists of a sensor, a controller and an alarm

• Two levels of gas are hazardous

– Warning level - no immediate danger but take action to reduce level

– Evacuate level - immediate danger. Evacuate the area

• The controller takes air samples, computes the gas level and then decides whether or not the alarm should be activated

Is the gas sensor control code safe?

Gas_level: GL_TYPE ;
loop
   -- Take 100 samples of air
   Gas_level := 0.000 ;
   for i in 1..100 loop
      Gas_level := Gas_level + Gas_sensor.Read ;
   end loop ;
   Gas_level := Gas_level / 100 ;
   if Gas_level > Warning and Gas_level < Danger then
      Alarm := Warning ; Wait_for_reset ;
   elsif Gas_level > Danger then
      Alarm := Evacuate ; Wait_for_reset ;
   else
      Alarm := off ;
   end if ;
end loop ;

Graphical Safety Argument

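[Figure: safety argument tree. The unsafe state at the root is reached through one of three paths.]

Unsafe state: Gas_level > Warning and Alarm = off

Path 1: Gas_level > Warning and Gas_level < Danger, so Alarm = Warning (contradiction)

Path 2: Gas_level > Danger, so Alarm = Evacuate (contradiction)

Path 3: otherwise, Alarm = off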

Condition Checking

Condition                                    Path     Alarm state        Result
Gas_level < Warning                          Path 3   Alarm = off        contradiction
Gas_level = Warning                          Path 3   Alarm = off        contradiction
Gas_level > Warning & Gas_level < Danger     Path 1   Alarm = Warning    contradiction
Gas_level = Danger                           Path 3   Alarm = off*
Gas_level > Danger                           Path 2   Alarm = Evacuate   contradiction

* This indicates a code problem, since danger exists and no alarm sounds.
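The condition check can also be run mechanically. The following is a minimal Python sketch, not the original Ada code: it transcribes the if/elsif/else logic from the slide and tests the boundary values from the table. The numeric thresholds and the function names are assumptions made for illustration only.

```python
# Hypothetical thresholds; the slides do not give numeric values.
WARNING = 10.0
DANGER = 20.0


def alarm_state(gas_level: float) -> str:
    """Transcription of the if/elsif/else logic from the gas sensor code."""
    if WARNING < gas_level < DANGER:
        return "Warning"
    elif gas_level > DANGER:
        return "Evacuate"
    else:
        return "off"


def is_unsafe(gas_level: float, alarm: str) -> bool:
    """Unsafe state from the safety argument: gas above Warning, alarm off."""
    return gas_level > WARNING and alarm == "off"


def find_unsafe_levels(levels):
    """Return the gas levels for which the code ends in the unsafe state."""
    return [g for g in levels if is_unsafe(g, alarm_state(g))]


# One value per row of the condition checking table.
boundary_levels = [WARNING - 1, WARNING, (WARNING + DANGER) / 2, DANGER, DANGER + 1]
print(find_unsafe_levels(boundary_levels))  # prints [20.0]
```

Only Gas_level = Danger reaches the unsafe state, which matches the starred row in the table.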

Safety Assertions

• Assertions should be included in the program indicating conditions that should hold at crucial lines of code

• Assertions may be based on pre-computed limits for critical variables

• Assertions may be used during formal program inspections or may be converted to run-time safety checks
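As an illustration of converting an assertion into a run-time safety check, here is a small Python sketch. The limits, the exception name, and the function names are hypothetical and not taken from the slides. An explicit exception is used instead of the `assert` statement because Python removes `assert` statements when run with the -O option.

```python
# Hypothetical pre-computed limits for the critical variable.
MIN_GAS_LEVEL = 0.0
MAX_GAS_LEVEL = 100.0


class SafetyViolation(Exception):
    """Raised when a run-time safety check fails."""


def check_gas_level(gas_level: float) -> float:
    """Safety assertion: the gas level must lie inside the pre-computed limits."""
    if not (MIN_GAS_LEVEL <= gas_level <= MAX_GAS_LEVEL):
        raise SafetyViolation(
            f"gas level {gas_level} outside [{MIN_GAS_LEVEL}, {MAX_GAS_LEVEL}]"
        )
    return gas_level


def average_gas_level(samples):
    """Average the samples, then check the result before it is used."""
    if not samples:
        raise SafetyViolation("no samples were read from the sensor")
    return check_gas_level(sum(samples) / len(samples))
```

A caller would catch SafetyViolation and move the system to a known safe state, for example by raising the alarm.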

Dynamic Validation

• Concerned with validating system during its execution.

• Testing techniques

– analyzing the system outside of its operational environment

• Run-time checking

– checking during normal execution that a system is operating within its dependability envelope

Reliability Validation

• Involves exercising the program to assess whether it has reached the required level of reliability or not

• Can’t be done during normal defect testing process, because defect test data is not always typical of normal usage data

• Statistical testing must be used where a statistically significant data sample based on simulated usage is used to assess reliability

Statistical Testing

• Used to test for reliability, not fault detection

• Measuring the number of errors allows the reliability of the software to be predicted

• Error seeding is one approach to measuring reliability

• An acceptable level of reliability should be specified before testing begins and the software should be modified until that level is attained

Estimating Number of Program Errors by Error Seeding

• One member of the test team places a known number of errors in the program while other members try to find them.

• Assumption: (s/S) = (n/N)

– s = # seeded errors found during testing

– S = # of seeded errors placed in program

– n = # non-seeded (actual) errors found during testing

– N = total # of non-seeded (actual) errors in program

• This can be written as N = (S*n)/s

Error Seeding Example

• Using the error seeding assumptions

– if 75 of 100 seeded errors are found

– we believe that we have found 75% of the actual errors

• If we found 25 non-seeded errors that means actual # errors in the program is

N = (100 * 25)/75 = 33 1/3
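The estimate N = (S*n)/s is easy to compute directly. The sketch below uses exact fractions so that the result stays 33 1/3 rather than a rounded decimal. The function and parameter names are chosen for illustration.

```python
from fractions import Fraction


def estimate_total_errors(seeded_placed: int, seeded_found: int, actual_found: int) -> Fraction:
    """Estimate N = (S * n) / s under the assumption s/S = n/N."""
    if seeded_found == 0:
        raise ValueError("no seeded errors found, so the estimate is undefined")
    return Fraction(seeded_placed * actual_found, seeded_found)


# Values from the example: S = 100, s = 75, n = 25.
estimate = estimate_total_errors(100, 75, 25)
print(estimate)  # prints 100/3
```

Since N = 100/3 is about 33.3 and 25 actual errors were found, roughly 8 actual errors are estimated to remain.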

Confidence in Software

• C = 1 if n > N

– meaning the actual # errors found during testing exceeds the # actual errors in the program

• C = S / (S - N + 1) if n <= N

– meaning the actual # errors found during testing is no greater than the actual # program errors

Confidence Example

• To achieve a 98% confidence level that our program is bug free (N = 0), how many seeded errors would need to be introduced and found by our test team?

S/(S - N + 1) = 98/100

S/(S - 0 + 1) = 98/100

S/(S + 1) = 98/100

100S = (98S + 98)

2S = 98

S = 49
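The same answer can be found by search. This is a minimal sketch restricted to the bug-free claim (N = 0) used in the example, where the confidence is S/(S + 1). The function names are assumptions for illustration.

```python
from fractions import Fraction


def confidence_bug_free(seeded: int) -> Fraction:
    """Confidence S/(S + 1) when N = 0 and all S seeded errors were found."""
    return Fraction(seeded, seeded + 1)


def seeds_needed(target: Fraction) -> int:
    """Smallest S whose confidence reaches the target level."""
    if not (0 <= target < 1):
        raise ValueError("target must be in [0, 1)")
    seeded = 1
    while confidence_bug_free(seeded) < target:
        seeded += 1
    return seeded


print(seeds_needed(Fraction(98, 100)))  # prints 49
```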

Better Approach

• A more realistic approach to estimating confidence would be based on the number of seeded errors found, regardless of whether they have all been found or not.

• If n <= N then

C = C(S, s - 1)/C(S + N + 1, N + s)

• where

C(S, s - 1) = S!/((s - 1)! * (S - s + 1)!)

C(S + N + 1, N + s) = (S + N + 1)!/((N + s)! * (S + 1 - s)!)
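The binomial coefficients above map directly onto Python's math.comb. The sketch below is one way to evaluate the formula. The function name and argument order are assumptions for illustration.

```python
from fractions import Fraction
from math import comb


def confidence(S: int, s: int, N: int, n: int) -> Fraction:
    """Confidence based on s of S seeded errors found.

    S = seeded errors placed, s = seeded errors found,
    N = actual errors in the program, n = actual errors found.
    """
    if n > N:
        return Fraction(1)
    if not (1 <= s <= S):
        raise ValueError("s must satisfy 1 <= s <= S")
    return Fraction(comb(S, s - 1), comb(S + N + 1, N + s))


# With N = 0 and every seeded error found (s = S) this reduces to S/(S + 1).
print(confidence(49, 49, 0, 0))  # prints 49/50
# Finding only 5 of 10 seeded errors gives lower confidence.
print(confidence(10, 5, 0, 0))   # prints 5/11
```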

Reliability Validation Process

• Establish an operational profile for the system

• Construct test data reflecting this operational profile

• Test the system and observe both the number of failures and the times of the failures

• Compute the reliability after a statistically significant number of failures have been observed
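The last two steps of the process can be sketched in a few lines. The slides do not say which reliability metric to compute, so mean time between failures and rate of occurrence of failures are used here as assumed choices. The failure times are made-up illustration data.

```python
def interfailure_times(failure_times):
    """Gaps between successive failures, measured from the start of testing (t = 0)."""
    gaps = []
    previous = 0.0
    for t in sorted(failure_times):
        gaps.append(t - previous)
        previous = t
    return gaps


def mean_time_between_failures(failure_times):
    """Average gap between observed failures."""
    gaps = interfailure_times(failure_times)
    if not gaps:
        raise ValueError("no failures observed")
    return sum(gaps) / len(gaps)


def rate_of_occurrence(failure_times, total_test_time):
    """Failures per unit of test time."""
    if total_test_time <= 0:
        raise ValueError("total_test_time must be positive")
    return len(failure_times) / total_test_time


# Hypothetical failure times in hours of statistical testing.
observed = [12.0, 30.0, 61.0, 100.0]
print(mean_time_between_failures(observed))   # prints 25.0
print(rate_of_occurrence(observed, 100.0))    # prints 0.04
```

Four failures are far too few for a statistically significant result. The sample is kept small only so the arithmetic is easy to follow.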

Operational Profiles

• Set of test data whose frequency distribution matches frequency distribution of these inputs in normal system usage

• Can be generated from real data collected from an existing system or from assumptions made about system usage patterns

• Should be generated automatically if possible (this can be difficult for interactive systems)

• Hard to predict pattern of unlikely inputs without some type of probability model
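Automatic generation from a profile can be as simple as weighted random sampling. In this sketch the input classes and their frequencies are invented for illustration. A real profile would come from usage data or from stated assumptions about usage.

```python
import random

# Hypothetical operational profile: input class -> relative frequency in normal use.
PROFILE = {
    "query": 0.70,
    "update": 0.25,
    "admin": 0.04,
    "shutdown": 0.01,
}


def generate_test_inputs(profile, count, seed=None):
    """Draw input classes so their frequencies approximate the profile."""
    rng = random.Random(seed)  # a seed makes the test set reproducible
    classes = list(profile)
    weights = [profile[c] for c in classes]
    return rng.choices(classes, weights=weights, k=count)


inputs = generate_test_inputs(PROFILE, 1000, seed=1)
for input_class in PROFILE:
    print(input_class, inputs.count(input_class) / len(inputs))
```

Rare classes such as "shutdown" appear only a handful of times in 1000 draws, which illustrates why unlikely inputs are hard to cover with statistical testing.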

Reliability Growth Models

• Mathematical models of how system reliability changes over time as the system is tested and defects are removed

• Can be used to predict system reliability by using current process data and extrapolating the reliability using the model equations

• Depends on the use of statistical testing to measure reliability for each system version

Reliability Model Selection

• There is no universally applicable growth model

• Many reliability growth models have been proposed

• Reliability does not always increase over time, some system changes may introduce new errors

• Models predicting equal steps between releases are not likely to be correct

• Reliability growth rates tend to slow down with time as frequently occurring faults are removed from software (90/10 rule)

Exponential Growth Models

• Software reliability growth models fall into two major categories

– time between failure models (MTBF)

– fault count models (faults or time normalized rates)

• Reliability growth models are usually based on formal testing data and several curves may need to be checked against the actual test results.

• The exponential distribution is the simplest and most important distribution in reliability and survival studies.

Model Evaluation Criteria

• Predictive validity

– external means of verifying model correctness

• Capability

– model does what it needs to do

• Quality of assumptions

• Applicability

– appropriateness to software design

• Simplicity

– easy to collect data and easy to use

Modeling Process

• Examine data

• Select a model to fit the data

• Estimate model parameters

• Perform goodness of fit test

• Make reliability predictions based on fitted model
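The steps above can be sketched for the simplest case. This example assumes a time-between-failures model with a constant failure rate, which is an assumption made here and not a model prescribed by the slides. It estimates the rate by maximum likelihood, computes a Kolmogorov-Smirnov distance as a rough goodness-of-fit measure, and makes a prediction. The data values are invented.

```python
import math


def estimate_lambda(gaps):
    """Maximum likelihood estimate of the exponential rate: n / sum of gaps."""
    return len(gaps) / sum(gaps)


def exponential_cdf(t, lam):
    """F(t) = 1 - exp(-lambda * t)."""
    return 1.0 - math.exp(-lam * t)


def ks_distance(gaps, lam):
    """Largest gap between the empirical CDF and the fitted exponential CDF."""
    xs = sorted(gaps)
    n = len(xs)
    distance = 0.0
    for i, x in enumerate(xs, start=1):
        fitted = exponential_cdf(x, lam)
        distance = max(distance, i / n - fitted, fitted - (i - 1) / n)
    return distance


def reliability(t, lam):
    """Predicted probability of running for time t without failure."""
    return math.exp(-lam * t)


# Hypothetical times between failures, in hours.
gaps = [12.0, 18.0, 31.0, 39.0]
lam = estimate_lambda(gaps)
print("lambda:", lam)
print("KS distance:", ks_distance(gaps, lam))
print("R(10 hours):", reliability(10.0, lam))
```

Because lambda is estimated from the same data, standard Kolmogorov-Smirnov critical values do not apply directly. The distance is best used here to compare candidate models against each other.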

Reliability Model Fitting

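[Figure: reliability plotted against time, showing the measured reliability points, the fitted reliability model curve, the required reliability level, and the estimated time of reliability achievement where the fitted curve reaches the required level.]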

Exponential Model

• CDF = cumulative distribution function

F(t) = 1 - exp(-t/c) = 1 - exp(-lambda*t)

• PDF = probability density function

f(t) = (1/c) * exp(-t/c) = lambda * exp(-lambda*t)

lambda = 1/c = error detection or hazard rate

t = time

• In real applications we need K = total number of defects in addition to lambda.
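A short sketch of the model in code follows, using F(t) = 1 - exp(-lambda*t) and its derivative f(t) = lambda * exp(-lambda*t). Scaling both by K to obtain cumulative defects and the defect arrival rate is the usual reading of the last bullet, but it is stated here as an assumption. The parameter values are illustrative only.

```python
import math


def cdf(t, lam):
    """F(t) = 1 - exp(-lambda * t)."""
    return 1.0 - math.exp(-lam * t)


def pdf(t, lam):
    """f(t) = lambda * exp(-lambda * t), the derivative of F(t)."""
    return lam * math.exp(-lam * t)


def cumulative_defects(t, K, lam):
    """Expected number of defects found by time t (assumed to be K * F(t))."""
    return K * cdf(t, lam)


def defect_arrival_rate(t, K, lam):
    """Expected defects found per unit time at time t (assumed to be K * f(t))."""
    return K * pdf(t, lam)


# Illustrative parameters: 200 total defects, lambda = 0.1 per week.
K, lam = 200, 0.1
for week in (0, 5, 10, 20, 40):
    found = cumulative_defects(week, K, lam)
    rate = defect_arrival_rate(week, K, lam)
    print(week, round(found, 1), round(rate, 2))
```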

Exponential Distribution Reliability Model Considerations

• The more precise the input data the better the outcome.

• The more data points available the better the model will perform.

• When using calendar time for large projects, you need to verify homogeneity of testing effort.

– person hours per time unit

– test cases run

– variations executed

To Normalize Testing Data

• Calculate average # person hours of testing per week

• Compute defect rates for each n person-hours of testing

• Use the allocated defect data as the weekly input data for the model
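One possible reading of these steps is to rescale each week's defect count to the average weekly testing effort. The sketch below implements that reading. Both the interpretation and the sample numbers are assumptions, not values from the slides.

```python
def normalize_weekly_defects(defects, person_hours):
    """Rescale weekly defect counts to the average weekly testing effort."""
    if len(defects) != len(person_hours):
        raise ValueError("defects and person_hours must have the same length")
    if not person_hours or any(h <= 0 for h in person_hours):
        raise ValueError("person_hours must be non-empty and positive")
    average_hours = sum(person_hours) / len(person_hours)
    # Defects per person-hour in each week, scaled to an average week.
    return [d * average_hours / h for d, h in zip(defects, person_hours)]


# Hypothetical data: defects found and person-hours of testing per week.
weekly_defects = [10, 30, 20]
weekly_hours = [100, 200, 300]
print(normalize_weekly_defects(weekly_defects, weekly_hours))
```

In this example week 3 found more raw defects than week 1, but after normalization it contributes fewer, because it used three times the testing effort.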

Reliability Validation Problems

• Operational profile uncertainty

– does the operational profile reflect actual usage?

• High cost of test data generation

– statistical modeling is labor-intensive work

• Statistical uncertainty for high-reliability systems

– it may be impossible to generate enough failures to draw statistically valid conclusions (e.g. missile defense or nuclear control systems)

Security Validation

• Similar to safety validation in that the goal is to demonstrate that system cannot enter an insecure (or unsafe) state

• The key differences between security and safety are

– safety problems are accidental

– security problems are deliberate

– security problems tend to be generic

– safety problems tend to be application domain specific

Security Validation Techniques

• Experience-based validation

– system is reviewed and analyzed in terms of the types of attack known to the validation team

• Tool-based validation

– security tools (e.g. password checkers) are used to analyze system in operation

• Tiger teams

– teams try to breach security by simulating attacks on the system
