Secure Systems Research Group - FAU 1 A survey of dependability patterns Ingrid Buckley and Eduardo B. Fernandez Dept. of Computer Science and Engineering Florida Atlantic University Boca Raton, FL, USA January 18, 2007



Secure Systems Research Group - FAU

A survey of dependability patterns

Ingrid Buckley and Eduardo B. Fernandez
Dept. of Computer Science and Engineering

Florida Atlantic University
Boca Raton, FL, USA

January 18, 2007


Introduction

Dependability is the property of a system that allows one to rely on its service.

Dependability is of the utmost importance in business and in critical infrastructures such as hospitals, airports, and the electricity grid of a country.

Dependability comprises several aspects:

• Fault Tolerance
• Safety
• Availability
• Reliability


Introduction cont’d

• Fault Tolerance, as it relates to systems, software, and hardware, is the ability to remain operable in the presence of faults.

• Safety is the prevention of catastrophic effects on the environment or the users of the system.

• Availability is the ability of a system to perform its functions when needed.

• Reliability measures the success with which the system conforms to its specification.

• We use the Unified Modeling Language (UML) to represent fault tolerance patterns.


Objectives

• Classify software and hardware fault tolerance patterns according to their objectives.

• Analyze and evaluate the classified fault tolerance patterns.

• Determine how to improve upon existing patterns.

• Design new fault tolerance patterns for unsupported areas within critical systems.


Background

• A pattern is an encapsulated solution to a recurrent problem in a given context, and it can be tailored to fit different situations.

• A fault is a defect in the state of a component or in the design of a system. An error is the manifestation of a fault: a defective value in the state of the system.

• A system failure occurs when there is a deviation from the system’s specification. A failure is the manifestation of an error.

• The System Development Life Cycle (SDLC) is the entire process of formal, logical steps taken to develop software.


Fault Tolerance

• A system that can mask the effects of a fault and continue operating correctly is said to be fault tolerant.

• Fault tolerance requires redundancy and diversity, which are directly linked to the reliability of a system and support its availability.

• Diversity here means having different versions of a function or system, all of which provide the same functionality.

• Integrating hardware and software fault tolerance, so as to cope with the various kinds of faults that can appear in a software system, is a good foundation for achieving a fault tolerant system.

• Several fault tolerance patterns have already been written, and they support different levels of the system architecture. Our aim is to focus on hardware and software fault tolerance patterns.
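
The combination of redundancy and diversity described above can be illustrated with a majority voter over independently written versions of one function. This is only a minimal sketch and is not taken from the slides: the names `majority_vote`, `sqrt_newton`, `sqrt_builtin`, and `sqrt_faulty` are hypothetical, and the deliberately faulty version exists only to show a single fault being masked.

```python
from collections import Counter
import math


def sqrt_newton(x: float) -> float:
    """Version 1 (hypothetical): square root by Newton's method."""
    guess = x if x > 1 else 1.0
    for _ in range(60):
        guess = 0.5 * (guess + x / guess)
    return round(guess, 6)


def sqrt_builtin(x: float) -> float:
    """Version 2 (hypothetical): square root from the standard library."""
    return round(math.sqrt(x), 6)


def sqrt_faulty(x: float) -> float:
    """Version 3 (hypothetical): contains a design fault on purpose."""
    return round(x / 2, 6)


def majority_vote(versions, x):
    """Run every version on the same input and return the majority result.

    Raises RuntimeError when no result is produced by more than half of
    the versions, i.e. when the fault cannot be masked.
    """
    results = [version(x) for version in versions]
    value, count = Counter(results).most_common(1)[0]
    if count <= len(results) // 2:
        raise RuntimeError(f"no majority among results {results}")
    return value
```

Calling `majority_vote([sqrt_newton, sqrt_builtin, sqrt_faulty], 9.0)` returns the value the two correct versions agree on, so the fault in the third version stays hidden from the caller. With only two versions that disagree, the voter can detect the problem but cannot mask it.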


Fault Tolerance Cont’d

• Fault tolerance patterns are a fairly new area in connection with critical systems. The need for them has grown with the need to secure systems against failures caused accidentally or intentionally by attackers.

• Because attacks on different types of systems are so diverse, effective fault tolerance techniques are needed to mitigate faults that may lead to a failure in a critical system.

• To prevent failures, the following are required:
– Detection: detect the occurrence of errors.
– Diagnosis: locate the unit or component where the error has occurred.
– Masking: mask errors so that the system does not malfunction if a fault occurs.
– Containment: confine or delimit the effects of the error.
– Recovery: reconfigure the system to remove the faulty unit and erase the effects of the error.


Hardware Fault Tolerant Patterns

Hardware fault tolerance applies hardware replication to enhance system availability and reliability in the presence of hardware faults.

Hardware Fault Tolerance patterns:

– Watchdog: The Watchdog pattern primarily provides protection against time-based faults by raising an alarm whenever liveness messages are not received within a given time frame.
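
The Watchdog idea can be sketched in a few lines. The class below is an illustration under stated assumptions rather than the pattern's reference design: the names `Watchdog`, `stroke`, `expired`, and `check` are hypothetical, and the clock is injectable purely so the sketch can be exercised without real waiting.

```python
import time


class Watchdog:
    """Raises an alarm when no liveness message arrives within `timeout` seconds."""

    def __init__(self, timeout: float, clock=time.monotonic):
        self.timeout = timeout
        self._clock = clock              # assumption: injectable clock for testing
        self._last_stroke = clock()      # time of the most recent liveness message

    def stroke(self) -> None:
        """Called by the monitored component to signal that it is still alive."""
        self._last_stroke = self._clock()

    def expired(self) -> bool:
        """Return True if the liveness deadline has been missed."""
        return self._clock() - self._last_stroke > self.timeout

    def check(self) -> None:
        """Raise the alarm (here a TimeoutError) if the deadline has been missed."""
        if self.expired():
            raise TimeoutError("watchdog alarm: no liveness message within timeout")
```

A monitoring loop would call `check()` periodically while the monitored component calls `stroke()`. The sketch only observes timing, so a component that keeps stroking while computing wrong results would go unnoticed.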


Hardware Fault Tolerant Patterns Cont’d

– Fail Stop Processor: The Fail-Stop Processor pattern mainly aims at transforming errors that would lead to Byzantine (complex) failures into a clean halt. It is based on redundancy: the outputs of all replicas are compared to reach an agreement.

– Acknowledgement: The Acknowledgement pattern detects crash failures. It is based on acknowledging the reception of input within a given time interval.
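
The two patterns above might be sketched as follows. This is one possible reading of the slide text, not the authors' design: `FailStopProcessor`, `FailStopError`, and `ack_received_in_time` are hypothetical names, replicas are modelled as plain callables, and timestamps are passed in explicitly instead of being read from a real network.

```python
class FailStopError(Exception):
    """Raised when the replicas disagree and the processor halts."""


class FailStopProcessor:
    """Runs the same input on every replica and halts on any disagreement."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        if len(self.replicas) < 2:
            raise ValueError("at least two replicas are needed for comparison")
        self.stopped = False

    def process(self, value):
        if self.stopped:
            raise FailStopError("processor has stopped")
        outputs = [replica(value) for replica in self.replicas]
        # Comparison step: any mismatch is turned into a halt instead of
        # letting a possibly wrong output leave the processor.
        if any(out != outputs[0] for out in outputs[1:]):
            self.stopped = True
            raise FailStopError(f"replica outputs disagree: {outputs}")
        return outputs[0]


def ack_received_in_time(sent_at: float, ack_at, timeout: float) -> bool:
    """Acknowledgement check: True if an ack arrived within `timeout` of sending.

    `ack_at` is None when no acknowledgement was received at all.
    """
    return ack_at is not None and (ack_at - sent_at) <= timeout
```

In this sketch a disagreement stops the processor for good, so every later call fails as well, and a missing or late acknowledgement is simply reported as `False`. Both functions detect errors only; neither one tolerates them.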


Software Fault Tolerant Patterns

• Software fault tolerance applies software redundancy, by means of diversity of design, to tolerate software faults that can occur in the design, programming, or maintenance phases of the software development cycle.

Software Fault Tolerance patterns:

– Roll Forward: The Roll Forward pattern is a failure recovery pattern that detects and recovers from a fault by monitoring two replicas for errors.
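
A rough sketch of Roll Forward, under one reading of the description: an active replica processes each input and copies its new state to a standby replica; when the active replica fails, it is discarded and the standby continues with the subsequent inputs. The classes `Replica` and `RollForward` are hypothetical, a negative input is used only to simulate an error, and dropping the input that triggered the error is a simplification of this sketch.

```python
import copy


class Replica:
    """Hypothetical stateful component: keeps a running total of its inputs."""

    def __init__(self):
        self.state = {"total": 0}

    def process(self, value: int) -> int:
        if value < 0:
            raise ValueError("simulated error inside the replica")
        self.state["total"] += value
        return self.state["total"]


class RollForward:
    """Keeps a standby replica in sync and switches to it when the active one fails."""

    def __init__(self):
        self.active = Replica()
        self.standby = Replica()

    def process(self, value):
        try:
            result = self.active.process(value)
        except Exception:
            if self.standby is None:
                raise                    # no replica left to take over
            # Discard the failed replica; the standby carries on from the
            # last state that was copied to it.
            self.active, self.standby = self.standby, None
            return None
        if self.standby is not None:
            # State copy after every successful input: this is the cost
            # paid even when no error occurs.
            self.standby.state = copy.deepcopy(self.active.state)
        return result
```

Feeding the inputs 2, 3, -1, 4 yields 2, 5, `None`, 9: the failure on -1 costs only the replica switch, while every successful step pays for a state copy.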


Software Fault Tolerant Patterns Cont’d

– Input Guard: The Input Guard pattern stops erroneous input from propagating errors into a component. A guard is placed at every access point of the component to check the validity of the input.

– Fault Container: The Fault Container pattern provides the same benefits as the combination of the Input Guard and the Output Guard patterns, because it prevents an error from being propagated into or out of a given component.
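
The guard idea can be sketched as a wrapper that checks values on the way in and on the way out. This is a minimal illustration with hypothetical names (`fault_container`, `GuardError`, `percent_to_fraction`); the validity predicates stand in for whatever the component's specification actually says.

```python
class GuardError(Exception):
    """Raised when a value does not conform to the component's specification."""


def fault_container(component, valid_input, valid_output):
    """Wrap `component` with an input guard and an output guard."""

    def guarded(value):
        if not valid_input(value):            # input guard
            raise GuardError(f"rejected input: {value!r}")
        result = component(value)
        if not valid_output(result):          # output guard
            raise GuardError(f"rejected output: {result!r}")
        return result

    return guarded


def percent_to_fraction(percent):
    """Hypothetical guarded component: converts 0..100 into 0.0..1.0."""
    return percent / 100


safe_percent_to_fraction = fault_container(
    percent_to_fraction,
    valid_input=lambda p: isinstance(p, (int, float)) and 0 <= p <= 100,
    valid_output=lambda f: 0.0 <= f <= 1.0,
)
```

Using only the input check gives an Input Guard; adding the output check gives the container behaviour. An input of 150 is rejected, but a wrong value that still lies within 0..100 passes unnoticed, because the guards can only check conformance to the specification.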


Hardware/Software Fault Tolerance Pattern

• The Software Redundancy Pattern deals with hardware, software and environmental faults at the same time.


Patterns diagram for the fault tolerance domain


Analysis of Patterns

Watchdog
Advantages:
• Can be used to improve deadlock detection, because strokes can be keyed or can carry data that identifies strokes from different computational steps.
Disadvantages:
• Does not actually check that the internal computation is correct.

Acknowledgement
Advantages:
• The design complexity introduced by the pattern is very low.
• Does not introduce any space overhead.
Disadvantages:
• The error on the monitored system is detected only after some input has been issued to it.
• The timeout must be set based on the time it takes for the input to reach the monitored system plus the time it takes for the acknowledgement to reach the monitoring system.

Fail Stop Processor
Advantages:
• Introduces low time overhead, since the processors function in parallel.
• The processors are replicas of the original system to which the Fail-Stop Processor pattern is applied, without any additional functionality. In practice, this means the processors can be replicas of a legacy system that cannot be changed internally, as it would have to be if the processors required additional functionality.
Disadvantages:
• Does not provide a means to tolerate faults in a system; rather, it provides a means to detect errors.
• Introduces a relatively high space overhead, proportional to the number of simultaneous errors it can deal with.


Analysis of Patterns Cont’d

Roll Forward
Advantages:
• The time overhead imposed by this pattern is low when errors occur: the failed replica is discarded, and the unaffected replica processes the subsequent inputs.
Disadvantages:
• The time overhead imposed by this pattern in the absence of errors is high: before a replica is able to receive and process new input, it must copy its new state to the other replica.

Input Guard
Advantages:
• Stops the guarded component from being contaminated by erroneous input that does not conform to its specification.
• Can be implemented in various ways, each providing different benefits with respect to the time or space overhead introduced by the guard.
Disadvantages:
• Cannot prevent the propagation of errors that do conform to the specification of the guarded component.
• Has significant time and space overhead.

Fault Container
Advantages:
• Stops errors, expressed as input or output content or timing that does not conform to the component specification, from entering or exiting that component.
• The undefined behavior of the container in the presence of errors allows it to be combined with error detection and error masking patterns.
Disadvantages:
• Cannot prevent the propagation of errors that do conform to the specification of the contained component.
• Unless combined with error detection and system recovery mechanisms, this pattern will result in send- or receive-omission failures (i.e. failure to send the output or receive the input of the contained component).


Conclusion

• Based on our analysis, current fault tolerance patterns need to be improved.

• New fault tolerance patterns are necessary to provide dependability in distributed systems, because many of the existing fault tolerance patterns are very similar and do not provide comprehensive support for errors that can lead to failure.


Future Work

• Researching Safety, Availability and Reliability Patterns.

• Defining areas of need where current Fault Tolerance Patterns are lacking or require improvement.

• Designing new Fault Tolerance Patterns.


Recommendations and Questions

Feedback: