Fault Tolerance in CORBA and Wireless CORBA Chen Xinyu 18/9/2002



Outline

Introduction to CORBA and Wireless CORBA

What is Fault Tolerance

Fault Tolerant CORBA

Fault Tolerance in Wireless CORBA

Conclusion

Future Work

What is CORBA

Common Object Request Broker Architecture
• A Distributed Object Computing (DOC) open standard
– Compare to platform/language-specific alternatives
– e.g., Java RMI, Microsoft’s DCOM
• A language-neutral environment
• A middleware infrastructure specification

Administered by the Object Management Group
• a.k.a. the OMG

Wireless CORBA Architecture

Access Bridge
• Encapsulates, forwards or ignores incoming GIOP messages
• Decapsulates and forwards messages from the GIOP tunnel
• Generates mobility events
• Lists available services

Terminal Bridge
• Similar to the Access Bridge
• Does not provide forwarding
• Generates mobility events
• Does not list services

GIOP Tunnel
• Abstract transport-independent tunnel for GIOP messages
• Concrete tunnels for TCP/IP, UDP/IP and WAP
• Only one GIOP tunnel

Home Location Agent
• Keeps track of the associated access bridges
• Redirects requests for services on the terminal

Source: Telecom Wireless CORBA, OMG Document dtc/01-06-02

Wireless CORBA

[Figure: end-to-end path between Wireless CORBA and CORBA. Key: TCP/IP network. Labels: IIOP, GIOP, GTP tunnel, access point.]

CORBA objects may be invoked anywhere along the “end to end” path

Fault, Error and Failure

• Fault: an anomalous condition occurring in the system hardware or software
• Error: the part of the system state that is liable to lead to a failure
• Failure: occurs when the delivered service of a system or a component deviates from its specification

Fault tolerant mechanisms

Fault tolerance is the ability of a system to continue providing its specified service despite component failure


Fault Tolerant CORBA Architecture

Source: Bell Labs Research

Object Replication Styles

Passive Replication
• Only one replica processes each request; other replicas are available as backups
• Lower memory and processing costs
• Slower recovery from faults
• Duplicate message detection during recovery from faults

Active Replication
• Several replicas process each request
• Faster recovery from faults
• State transfer to initialize new replicas
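The contrast between the two styles can be sketched in a few lines of Python. This is only an illustration, not the Fault Tolerant CORBA API: the names `Counter`, `PassiveGroup` and `ActiveGroup` are hypothetical, replicas are plain in-process objects, and state transfer is modelled as a deep copy.

```python
import copy


class Counter:
    """Toy replicated object (hypothetical); its state is a single integer."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n
        return self.value


class PassiveGroup:
    """Passive replication: only the primary executes; backups receive state."""

    def __init__(self, size):
        self.replicas = [Counter() for _ in range(size)]
        self.executions = 0  # how many replicas ran a method (cost measure)

    def invoke(self, n):
        primary = self.replicas[0]
        result = primary.add(n)
        self.executions += 1
        # State transfer: overwrite each backup with a copy of the primary.
        self.replicas[1:] = [copy.deepcopy(primary) for _ in self.replicas[1:]]
        return result

    def fail_primary(self):
        # A backup must be promoted before service can continue.
        self.replicas.pop(0)


class ActiveGroup:
    """Active replication: every replica executes every request."""

    def __init__(self, size):
        self.replicas = [Counter() for _ in range(size)]
        self.executions = 0

    def invoke(self, n):
        replies = [r.add(n) for r in self.replicas]
        self.executions += len(replies)
        # Deterministic replicas give identical replies; keep one of them.
        return replies[0]

    def fail_one(self):
        # The surviving replicas already hold the current state.
        self.replicas.pop(0)
```

With three replicas, one request costs one execution in `PassiveGroup` and three in `ActiveGroup`, which mirrors the lower processing cost of the passive style; after a failure, both groups keep answering from the surviving replicas.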

Passive Replication

[Figure: replicated objects of Server A and Server B, each on its own ORB, connected by reliable totally ordered multicast. Source: Eternal Systems, Inc]

• Client invokes a method of Server A
• Only the primary replica of Server A executes the method
• Only the primary replica of Server B executes the method
• Reply is returned from the primary replica of Server B to the primary replica of Server A
• Reliable totally ordered multicast is used for state transfer

Active Replication

[Figure: replicated objects of Server A and Server B, each on its own ORB, connected by reliable totally ordered multicast. Source: Eternal Systems, Inc]

• Client invokes a method of Server A
• Reliable totally ordered multicasts are used for requests and replies
• Duplicate invocations are suppressed
• Duplicate replies are suppressed
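Duplicate suppression can be illustrated with a small filter. This is a minimal sketch under one assumption that the slides do not state: every invocation carries a unique operation identifier (the hypothetical `op_id`), so copies sent by different replicas of the same caller can be recognised. The multicast layer itself is not modelled.

```python
class DuplicateFilter:
    """Delivers the first copy of each message and suppresses later copies."""

    def __init__(self):
        self.seen = set()

    def deliver(self, op_id, payload):
        if op_id in self.seen:
            return None  # duplicate from another replica: suppressed
        self.seen.add(op_id)
        return payload


def invoke_from_replicas(filter_, op_id, payload, replica_count):
    """Every replica of the caller sends the same invocation once."""
    delivered = []
    for _ in range(replica_count):
        result = filter_.deliver(op_id, payload)
        if result is not None:
            delivered.append(result)
    return delivered
```

Three replicas of Server A sending the same request to Server B result in a single delivery; the same filter applied on the return path would suppress duplicate replies.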

Device, Wireless & Mobile Issues

Device Issues
• Slow processor
• Small memory
• Small disk space
• Low power supply
• Physical damage

Wireless Issues
• High bit error rate
• Low bandwidth
• Long transfer delay

Mobile Issue
• Handoff

Applying Mobile Host as Stable Storage

A large number of system messages or a large size of information carried in a message

Applying Access Bridge as Stable Storage

Uncoordinated checkpointing, pessimistic message logging

Checkpoints and logs collection

Recovery Scheme

Uncoordinated checkpointing
• Triggered by time
• Triggered by a predefined number of messages

Pessimistic message logging
• No extra communication overhead

Independent rollback recovery
• Only failed objects roll back
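The three parts of the scheme fit together as in the following sketch. It is an illustration only: `StableStorage` and `LoggedObject` are hypothetical names, stable storage is an in-memory object, and only the message-count trigger is shown (a time-based trigger would call `take_checkpoint` from a timer instead).

```python
import copy


class StableStorage:
    """Stands in for stable storage (hypothetical in-memory model)."""

    def __init__(self):
        self.checkpoint = None  # last saved state
        self.log = []           # messages logged since the last checkpoint


class LoggedObject:
    """Object that checkpoints on its own schedule and logs pessimistically."""

    def __init__(self, storage, checkpoint_every=3):
        self.storage = storage
        self.checkpoint_every = checkpoint_every
        self.state = []
        self.take_checkpoint()

    def take_checkpoint(self):
        # Uncoordinated: no other object is consulted.
        self.storage.checkpoint = copy.deepcopy(self.state)
        self.storage.log.clear()

    def receive(self, message):
        # Pessimistic logging: the message reaches stable storage
        # before it is allowed to change the object's state.
        self.storage.log.append(message)
        self.state.append(message)
        if len(self.storage.log) >= self.checkpoint_every:
            self.take_checkpoint()

    def crash(self):
        self.state = None  # volatile state is lost

    def recover(self):
        # Independent rollback: only this object restores and replays.
        self.state = copy.deepcopy(self.storage.checkpoint)
        for message in self.storage.log:
            self.state.append(message)
```

Because every message is on stable storage before it is processed, recovery needs nothing from other objects: the failed object restores its last checkpoint and replays its own log.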

Fault Tolerance Architecture

[Figure: fault tolerance architecture.
Mobile Side: Mobile Host with Client Object, Terminal Bridge, Recovery Mechanism, ORB, Platform.
Fixed Side: Access Bridges with ORB, Recovery Mechanism, Logging Mechanism, Platform; Remote Server with Server Replica, ORB, Platform.
Links: GIOP Tunnel, Multicast Messages.]

Checkpoint and Logs Collection Strategies

Pessimistic
• Checkpoint and logs are transferred during handoff
• Generates heavy volume of data transfer

Lazy
• Creates a linked list of Access Bridges
• Complicated recovery

Frequency-based
• The number of handoffs

Distance-based
• The distance between the mobile host and the Access Bridge carrying its latest checkpoint
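One way to read the four strategies is as different answers to the question "should checkpoint and logs move to the new Access Bridge at this handoff?". The sketch below makes that reading explicit. It rests on assumptions the slide does not state: the frequency-based and distance-based strategies collect once a `threshold` is reached, Access Bridges sit at integer positions on a line, and distance is the difference between positions.

```python
def should_collect(strategy, handoffs_since_collection, distance, threshold=3):
    """Decide at handoff time whether checkpoint and logs move to the new
    Access Bridge. `threshold` and the distance unit are assumptions."""
    if strategy == "pessimistic":
        return True   # transfer on every handoff
    if strategy == "lazy":
        return False  # only extend the linked list of Access Bridges
    if strategy == "frequency":
        return handoffs_since_collection >= threshold
    if strategy == "distance":
        return distance >= threshold
    raise ValueError(f"unknown strategy: {strategy}")


def simulate(strategy, path, threshold=3):
    """Walk a mobile host along `path` (Access Bridge positions on a line)
    and return the positions at which data was collected."""
    holder = path[0]  # bridge carrying the latest checkpoint
    handoffs = 0
    collected_at = []
    for bridge in path[1:]:
        handoffs += 1
        distance = abs(bridge - holder)
        if should_collect(strategy, handoffs, distance, threshold):
            holder = bridge
            handoffs = 0
            collected_at.append(bridge)
    return collected_at
```

For a host that wanders back and forth, for example `simulate("distance", [0, 1, 2, 3, 2, 1, 4])`, the distance-based rule transfers less often than the frequency-based one, because moving back toward the bridge that holds the checkpoint does not count against it.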

Mobile Host Crash

[Figure sequence: a mobile host moving among Access Bridge 1, Access Bridge 2 and Access Bridge 3, with a Home Location Agent. Labels: Handoff, Location Update.]

• Collect the last checkpoint and the message logs that succeed it
• Sorted by acknowledgement sequence number (Ack. SN)
• Reconnect
• Messages replay
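The collection and replay steps can be sketched as follows. The data layout is hypothetical: each visited Access Bridge is a dictionary with an optional `"checkpoint"` entry `(ack_sn, state)` and a `"log"` of `(ack_sn, message)` pairs, and replaying a message simply appends it to the state.

```python
def recover(bridges):
    """Rebuild the mobile host's state from the Access Bridges it visited.
    The dictionary layout of `bridges` is an assumption for illustration."""
    checkpoints = [
        b["checkpoint"] for b in bridges if b.get("checkpoint") is not None
    ]
    # The last checkpoint is the one with the highest sequence number.
    ckpt_sn, state = max(checkpoints, key=lambda c: c[0])
    state = list(state)  # restore a private copy of the checkpointed state

    # Collect the log entries that succeed the last checkpoint.
    entries = [e for b in bridges for e in b["log"] if e[0] > ckpt_sn]

    # Replay in Ack. SN order, regardless of which bridge held each entry.
    for _, message in sorted(entries, key=lambda e: e[0]):
        state.append(message)
    return state
```

Sorting by Ack. SN is what makes the scattered logs usable: entries left behind on different Access Bridges are merged back into the order in which the host originally handled them.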


Conclusion

Fault Tolerant CORBA is based on Object Replication

Fault tolerance in Wireless CORBA is based on a rollback-recovery protocol

Collection of checkpoints and message logs is important in Wireless CORBA

Future Work

Low-cost Checkpointing Algorithm
• Forces a minimum number of objects to take checkpoints
• Minimizes the number of synchronization messages
• Makes checkpointing nonblocking

Failure Detection in Wireless Environment


Question and Answer


Thank You