Secure Systems Research Group - FAU
A survey of dependability patterns
Ingrid Buckley and Eduardo B. Fernandez
Dept. of Computer Science and Engineering
Florida Atlantic University
Boca Raton, FL, USA
January 18, 2007
Introduction
• Dependability is the property of a system that allows one to rely on its service.
• Dependability is of the utmost importance for critical systems in business and in critical infrastructures such as hospitals, airports, and a country's electricity grid.
• Dependability comprises several aspects:
– Fault Tolerance
– Safety
– Availability
– Reliability
Introduction Cont’d
• Fault tolerance, as it relates to systems, software, and hardware, is the ability to remain operable in the presence of faults.
• Safety is the prevention of catastrophic effects on the environment or on the users of the system.
• Availability is the ability of a system to perform its functions when needed.
• Reliability measures the success with which the system conforms to its specification.
• We use the Unified Modeling Language (UML) to represent fault tolerance patterns.
Objectives
• Classify software and hardware fault tolerance patterns according to their objectives.
• Analyze and evaluate the classified fault tolerance patterns.
• Determine how to improve upon existing patterns.
• Design new fault tolerance patterns for unsupported areas within critical systems.
Background
• A pattern is an encapsulated solution to a recurrent problem in a given context; it can be tailored to fit different situations.
• A fault is a defect in the state of a component or in the design of a system. An error is a defective value in the state of the system; an error is the manifestation of a fault.
• A system failure occurs when there is a deviation from the system’s specification. A failure is the manifestation of an error.
• The System Development Life Cycle (SDLC) is the entire process of formal, logical steps taken to develop software.
Fault Tolerance
• A system that can mask the effects of a fault and continue operating correctly is said to be fault tolerant.
• Fault tolerance requires redundancy and diversity, which are directly linked to reliability and support the availability of a system.
• Diversity here means having different versions of a function or system that all provide the same functionality.
• Integrating hardware and software fault tolerance, to cope with the various kinds of faults that can appear in a software system, is a good foundation for achieving a fault tolerant system.
• Several fault tolerance patterns have already been written, and they support different levels of the system architecture. Our aim is to focus on hardware and software fault tolerance patterns.
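The idea of masking a fault through redundancy and diversity can be sketched as follows. This is a minimal illustration, not a technique taken from the slides: majority voting is assumed as the masking mechanism, and the three square-root versions (`sqrt_newton`, `sqrt_power`, `sqrt_faulty`) and the `vote` helper are hypothetical names.

```python
from collections import Counter
from typing import Callable, Sequence


def sqrt_newton(x: float) -> float:
    """Version 1: Newton's iteration."""
    guess = x if x > 1 else 1.0
    for _ in range(60):
        guess = 0.5 * (guess + x / guess)
    return guess


def sqrt_power(x: float) -> float:
    """Version 2: a different design, using the power operator."""
    return x ** 0.5


def sqrt_faulty(x: float) -> float:
    """Version 3: deliberately wrong, to simulate a design fault."""
    return x / 2


def vote(versions: Sequence[Callable[[float], float]], x: float, digits: int = 6) -> float:
    """Run every version and return the majority result (assumed masking strategy)."""
    results = [round(version(x), digits) for version in versions]
    value, count = Counter(results).most_common(1)[0]
    if count <= len(results) // 2:
        raise RuntimeError("no majority among the versions")
    return value


# The faulty version is outvoted, so its fault is masked.
print(vote([sqrt_newton, sqrt_power, sqrt_faulty], 9.0))
```

All three versions offer the same functionality, which is what makes their outputs comparable; the vote only works when a majority of versions is correct.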
Fault Tolerance Cont’d
• Fault tolerance patterns are a fairly new area in association with critical systems. The need for them has increased with the need to secure systems against failures caused accidentally or intentionally by attackers.
• Due to the diversity of attacks on different types of systems, it is highly important to have effective fault tolerance techniques to mitigate faults that may lead to a failure in a critical system.
• To prevent failures, the following are required:
– Detection: detect the occurrence of errors.
– Diagnosis: locate the unit or component where the error has occurred.
– Masking: mask errors so as to prevent malfunctioning of the system if a fault occurs.
– Containment: confine or delimit the effects of the error.
– Recovery: reconfigure the system to remove the faulty unit and erase the effects of the error.
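The five steps listed above can be walked through in a few lines. This is only a sketch under simplifying assumptions: errors are detected with a caller-supplied validity check, and the names `Unit` and `run_with_recovery` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Unit:
    name: str
    compute: Callable[[int], int]


def run_with_recovery(
    units: List[Unit], x: int, is_valid: Callable[[int], bool]
) -> Tuple[int, List[Unit], List[str]]:
    """Return (result, remaining healthy units, names of faulty units)."""
    outputs = {unit.name: unit.compute(x) for unit in units}
    # Detection and diagnosis: find which units produced an invalid output.
    faulty = [name for name, out in outputs.items() if not is_valid(out)]
    # Containment: erroneous outputs are dropped and never forwarded.
    good = {name: out for name, out in outputs.items() if name not in faulty}
    if not good:
        raise RuntimeError("all units failed; the error cannot be masked")
    # Masking: the caller only ever sees a valid result.
    result = next(iter(good.values()))
    # Recovery: reconfigure by removing the faulty units.
    healthy = [unit for unit in units if unit.name not in faulty]
    return result, healthy, faulty


units = [Unit("A", lambda v: v * 2), Unit("B", lambda v: -1)]
result, healthy, faulty = run_with_recovery(units, 3, lambda out: out >= 0)
print(result, [u.name for u in healthy], faulty)
```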
Hardware Fault Tolerant Patterns
• Hardware fault tolerance applies hardware replication to enhance system availability and reliability in the presence of hardware faults.
Hardware Fault Tolerance patterns:
– Watchdog: The Watchdog pattern primarily provides protection against time-based faults by raising an alarm whenever liveness messages are not received within a given time frame.
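A software analogue of the watchdog idea can be sketched as below. The class name `Watchdog`, its methods, and the injectable `clock` parameter are assumptions made for illustration; a hardware watchdog would be a separate timer circuit.

```python
import time
from typing import Callable


class Watchdog:
    """Raises an alarm condition when no liveness message arrives in time."""

    def __init__(self, timeout: float, clock: Callable[[], float] = time.monotonic):
        self.timeout = timeout
        self.clock = clock  # injectable so the behavior can be tested deterministically
        self.last_stroke = clock()

    def stroke(self) -> None:
        """Called by the monitored component to signal that it is alive."""
        self.last_stroke = self.clock()

    def expired(self) -> bool:
        """True when the last liveness message is older than the timeout."""
        return self.clock() - self.last_stroke > self.timeout


# Usage with a fake clock: the component strokes at t=0.5 and then goes silent.
now = [0.0]
watchdog = Watchdog(timeout=1.0, clock=lambda: now[0])
now[0] = 0.5
watchdog.stroke()
now[0] = 2.0
print(watchdog.expired())
```

The watchdog only observes timing, so a component that keeps sending liveness messages while computing wrong results goes unnoticed.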
Hardware Fault Tolerant Patterns Cont’d
– Fail Stop Processor: The Fail-Stop Processor pattern mainly aims at transforming errors that could lead to Byzantine (complex) failures into crash failures. It is based on redundancy: the outputs of all replicas are compared to reach an agreement.
– Acknowledgement: The Acknowledgement pattern detects crash failures and is based on acknowledging the reception of input within a given time interval.
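Both patterns above can be illustrated in software, with the caveat that this is a simplified sketch rather than a hardware design. `FailStopProcessor` and `send_with_ack` are hypothetical names, replicas are modeled as plain functions, and the acknowledgement channel is assumed to be a `queue.Queue`.

```python
import queue
from typing import Any, Callable, Sequence


class FailStopProcessor:
    """Runs all replicas on each input and stops on the first disagreement."""

    def __init__(self, replicas: Sequence[Callable[[Any], Any]]):
        self.replicas = list(replicas)
        self.stopped = False

    def process(self, x: Any) -> Any:
        if self.stopped:
            raise RuntimeError("processor has stopped")
        outputs = [replica(x) for replica in self.replicas]
        if any(out != outputs[0] for out in outputs[1:]):
            # Disagreement: stop instead of emitting a possibly wrong output.
            self.stopped = True
            raise RuntimeError("replica outputs disagree; stopping")
        return outputs[0]


def send_with_ack(send: Callable[[Any], None], acks: "queue.Queue", message: Any, timeout: float) -> bool:
    """Send a message and report whether it was acknowledged within the timeout."""
    send(message)
    try:
        return acks.get(timeout=timeout) == message
    except queue.Empty:
        return False  # no acknowledgement: the receiver is assumed to have crashed


acks: "queue.Queue" = queue.Queue()
print(send_with_ack(acks.put, acks, "ping", timeout=0.1))         # receiver echoes the message
print(send_with_ack(lambda m: None, acks, "ping", timeout=0.1))   # receiver stays silent
```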
Software Fault Tolerant Patterns
• Software fault tolerance applies software redundancy, by means of design diversity, to tolerate software faults that can occur in the design, programming, or maintenance phases of the software development cycle.
Software Fault Tolerance patterns:
– Roll Forward: The Roll Forward pattern is a failure recovery pattern that detects and recovers from a fault by monitoring two replicas for errors.
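One possible reading of the Roll Forward idea is sketched below. It assumes that errors surface as exceptions, that the active replica copies its state to the other replica after every successful input, and that the names `Replica`, `RollForward`, `flaky_add`, and `add` are purely illustrative.

```python
import copy
from typing import Any, Callable, Optional


class Replica:
    def __init__(self, step: Callable[[Any, Any], Any], state: Any):
        self.step = step    # (state, input) -> new state; may raise on error
        self.state = state


class RollForward:
    def __init__(self, primary: Replica, backup: Replica):
        self.active: Replica = primary
        self.standby: Optional[Replica] = backup

    def process(self, x: Any) -> Any:
        try:
            self.active.state = self.active.step(self.active.state, x)
        except Exception:
            if self.standby is None:
                raise
            # Discard the failed replica; the unaffected one continues from
            # the last copied state and handles this and all later inputs.
            self.active, self.standby = self.standby, None
            self.active.state = self.active.step(self.active.state, x)
            return self.active.state
        if self.standby is not None:
            # Error-free path: copy the new state to the other replica.
            self.standby.state = copy.deepcopy(self.active.state)
        return self.active.state


def flaky_add(state: int, x: int) -> int:
    if x == 3:
        raise ValueError("simulated fault")
    return state + x


def add(state: int, x: int) -> int:
    return state + x


system = RollForward(Replica(flaky_add, 0), Replica(add, 0))
print([system.process(x) for x in (1, 2, 3, 4)])
```

The state copy on every error-free input is where the overhead of this approach comes from, while recovery itself needs no rollback.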
Software Fault Tolerant Patterns Cont’d
– Input Guard: The Input Guard pattern stops erroneous input from propagating the error inside a component. A guard is placed at every access point of the component to check the validity of the input.
– Fault Container: The Fault Container pattern provides the same benefits as the combination of the Input Guard and the Output Guard patterns, because it prevents an error from being propagated inside and outside a given component.
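The guard and container ideas can be sketched as a wrapper around a component. The names `GuardError` and `FaultContainer` are hypothetical, and the specification is assumed to be expressible as two predicates, one for input and one for output.

```python
import math
from typing import Any, Callable


class GuardError(ValueError):
    """Raised when input or output violates the component's specification."""


class FaultContainer:
    """Combines an input guard and an output guard around one component."""

    def __init__(
        self,
        component: Callable[[Any], Any],
        input_ok: Callable[[Any], bool],
        output_ok: Callable[[Any], bool],
    ):
        self.component = component
        self.input_ok = input_ok
        self.output_ok = output_ok

    def __call__(self, x: Any) -> Any:
        # Input guard: erroneous input never reaches the component.
        if not self.input_ok(x):
            raise GuardError("input violates the specification")
        y = self.component(x)
        # Output guard: erroneous output never leaves the component.
        if not self.output_ok(y):
            raise GuardError("output violates the specification")
        return y


safe_sqrt = FaultContainer(math.sqrt, lambda x: x >= 0, lambda y: y >= 0)
print(safe_sqrt(9.0))
```

Using only the input check gives an Input Guard. Values that satisfy the predicates but are still wrong pass through unnoticed.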
Hardware/Software Fault Tolerance Pattern
• The Software Redundancy Pattern deals with hardware, software and environmental faults at the same time.
Patterns diagram for the fault tolerance domain
Analysis of Patterns
• Watchdog
– Advantage: Can be used to improve deadlock detection: strokes can be keyed, or can contain data, to identify strokes from different computational steps.
– Disadvantage: Does not actually check that the internal computation is correct.
• Acknowledgement
– Advantage: The design complexity introduced by the pattern is very low.
– Advantage: Does not introduce any space overhead.
– Disadvantage: An error in the monitored system is detected only after some input has been issued to it.
– Disadvantage: The timeout must be set based on the time it takes for the input to reach the monitored system plus the time it takes for the acknowledgement to reach the monitoring system.
• Fail Stop Processor
– Advantage: Introduces low time overhead, since the processors function in parallel.
– Advantage: The processors are replicas of the original system to which the Fail-Stop Processor pattern is applied, without any additional functionality. In practice, this means the processors can be replicas of a legacy system that cannot be subject to internal changes, such as those that would be needed if the processors required additional functionality.
– Disadvantage: Does not provide means to tolerate faults in a system; rather, it provides means to detect errors.
– Disadvantage: Introduces relatively elevated space overhead, proportional to the number of simultaneous errors it can deal with.
Analysis of Patterns Cont’d
• Roll Forward
– Advantage: The time overhead imposed by this pattern is low when errors occur: the failed replica is discarded, and the unaffected replica processes the subsequent inputs.
– Disadvantage: The time overhead imposed by this pattern in the absence of errors is high: before the replica is able to receive and process new input, it must copy its new state to the other replica.
• Input Guard
– Advantage: Stops the contamination of the guarded component by erroneous input that does not conform to the specification of the guarded component.
– Advantage: Can be implemented in various ways, each providing different benefits with respect to the time or space overhead introduced by the guard.
– Disadvantage: Cannot prevent the propagation of errors that do conform to the specification of the guarded component.
– Disadvantage: Has significant time and space overhead.
• Fault Container
– Advantage: Stops errors, expressed as input or output content or timing that does not conform to the component specification, from entering or exiting that component.
– Advantage: The undefined behavior of the container in the presence of errors allows its combination with error detection and error masking patterns.
– Disadvantage: Cannot prevent the propagation of errors that do conform to the specification of the contained component.
– Disadvantage: Unless combined with some error detection and system recovery mechanisms, this pattern will result in send- or receive-omission failures (i.e., failure to send output or receive input of the contained component).
Conclusion
• Based on our analysis, there is a need to improve upon current fault tolerance patterns.
• New fault tolerance patterns are necessary to provide dependability in distributed systems, because many of the existing fault tolerance patterns are very similar and do not provide comprehensive support for errors that can lead to failure.
Future Work
• Safety, Availability, and Reliability patterns are being researched.
• Defining areas of need where current Fault Tolerance Patterns are lacking or require improvement.
• Designing new Fault Tolerance Patterns.
Recommendations and Questions
Feedback: