Fine-Grained Failover Using Connection Migration
Alex C. Snoeren,
David G. Andersen, Hari Balakrishnan
MIT Laboratory for Computer Science
The Problem
Servers Fail. More often than users want to know…
[Figure: a client connected to a content server]
Solution: Server Redundancy
Use a healthy one at all times.
Failover Components
1. Health Monitoring
2. Server Selection
3. Connection Resumption
Today’s Replication Technology
• DNS/Content Routing
  - Wide-area replication
  - Need client awareness
• Layer 4/Web Switches
  - Transparent, possibly mid-stream failover
  - Requires co-location
[Figure labels: DNS, Web Switch]
Ideal Technology
• Wide-area replication
  - Yet somehow synchronize replica servers
• Transparent failover
  - Enable other servers to continue connections
Migrate Architecture
• Stream Mapping
  - Infer application state from transport layer information
• Connection Migration
  - Transparently hand off sessions between servers
[Figure: architecture diagram with three components labeled "Stream Mapper"]
Stream Mapping
Client:
  GET /StreamingContent.mpg HTTP/1.1
Server Response:
  HTTP/1.1 200 OK
  Content-Length: 328987
  ...
  Content-Type: video/mpeg
Stream Map:
  TCP SeqNo 083346
  TCP ISS 083521

Client              Object (URL)            Offset (TCP SeqNo)
128.89.3.24:4234    /StreamingContent.mpg   083346
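The stream map entry above can be read as a lookup from a client endpoint to an object and the TCP sequence number at which that object begins. The following is a minimal sketch of how a server taking over a connection might turn such an entry into a byte offset. It assumes the offset is simply the distance between the object's starting sequence number and the next sequence number to send, modulo the 32-bit sequence space. The names `resume_offset`, `range_header`, and `stream_map`, the 100,000-byte figure, and the use of an HTTP `Range` header to fetch the remainder are all illustrative assumptions, not taken from the slides.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def resume_offset(object_start_seq: int, next_seq: int) -> int:
    """Bytes of the object already sent, allowing for sequence wraparound."""
    return (next_seq - object_start_seq) % SEQ_SPACE


def range_header(offset: int) -> str:
    """HTTP Range header value requesting everything from `offset` onward."""
    return f"bytes={offset}-"


# Hypothetical stream map, shaped like the table on the slide.
stream_map = {
    ("128.89.3.24", 4234): {"url": "/StreamingContent.mpg", "start_seq": 83346},
}

entry = stream_map[("128.89.3.24", 4234)]

# Assumption for illustration: 100,000 bytes of the object were delivered
# before the original server failed.
next_seq = (entry["start_seq"] + 100_000) % SEQ_SPACE

offset = resume_offset(entry["start_seq"], next_seq)
print(entry["url"], range_header(offset))
```

The modulo keeps the arithmetic correct when the sequence number wraps past 2^32 in the middle of a long transfer.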
Anatomy of Failover
[Figure: a client and a support group of servers, showing the initial connection and the migrated connection]
Support Groups
• Set of partially mirrored servers
  - All servers able to provide same content
  - Can be topologically diverse
• Synchronize on per-connection basis
  - Servers need not be complete mirrors
  - Connections from a failed server can be handled by a different support server
  - Connections may have distinct support groups
Soft State Synchronization
• Synchronize within support groups
  - Periodic advertisements
  - Advertise client application object requests
  - Communicate initial transport layer state
• Only initial state need be communicated
  - Current info inferred from transport layer
  - Clients will reject redundant migrates from stale support servers
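Soft state of this kind is typically kept in a table whose entries disappear unless a periodic advertisement refreshes them. The sketch below shows that general pattern under stated assumptions: the `Advertisement` fields, the `SoftStateTable` class, and the 3-second lifetime are hypothetical and are not the format used by the Migrate software.

```python
from dataclasses import dataclass


@dataclass
class Advertisement:
    client: tuple      # (ip, port) of the client connection
    url: str           # object the client requested
    initial_seq: int   # initial transport-layer state for the connection


class SoftStateTable:
    """Entries expire unless refreshed by a later advertisement."""

    def __init__(self, lifetime: float):
        self.lifetime = lifetime
        self._entries = {}  # client -> (Advertisement, expiry time)

    def receive(self, ad: Advertisement, now: float) -> None:
        # A repeated advertisement simply pushes the expiry time forward.
        self._entries[ad.client] = (ad, now + self.lifetime)

    def lookup(self, client: tuple, now: float):
        item = self._entries.get(client)
        if item is None or item[1] <= now:
            self._entries.pop(client, None)  # drop stale state lazily
            return None
        return item[0]


table = SoftStateTable(lifetime=3.0)
ad = Advertisement(("128.89.3.24", 4234), "/StreamingContent.mpg", 83521)

table.receive(ad, now=0.0)
fresh = table.lookup(ad.client, now=2.0)     # still within its lifetime
table.receive(ad, now=2.0)                   # periodic refresh
refreshed = table.lookup(ad.client, now=4.0)
expired = table.lookup(ad.client, now=10.0)  # no refresh arrived in time
print(fresh is not None, refreshed is not None, expired is None)
```

Passing `now` explicitly keeps the example deterministic; a real daemon would read a clock instead.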
TCP Connection Migration
1. Initial SYN
2. SYN/ACK
3. ACK (with data)
4. Normal data transfer
5. Migrate SYN
6. Migrate SYN/ACK
7. ACK (with data)
[Figure: packet timeline between client and server]
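[Figure: the same timeline with a failover server added. Packet labels recovered from the figure:]
  545968:546414(536) ack 533526
  SYN 533525:533525(0) ack 545968
  SYN 083521:083521(0) (migrate T, R)
[Other figure labels: current, stale]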
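The slides state that clients reject redundant migrates from stale support servers. One way such a check could work is sketched below. It assumes that the "T" in "(migrate T, R)" identifies the connection and that "R" is a request number that must increase with each accepted migrate. That reading, along with the `ClientConnection` class and its method names, is an assumption made for illustration and is not defined on the slides.

```python
class ClientConnection:
    """Client-side view of one connection that may be migrated."""

    def __init__(self, token: str, peer: str):
        self.token = token       # assumed meaning of "T": identifies the connection
        self.last_request = 0    # assumed meaning of "R": highest request number seen
        self.peer = peer         # server currently handling the connection

    def handle_migrate_syn(self, sender: str, token: str, request_no: int) -> bool:
        if token != self.token:
            return False         # not a migrate for this connection
        if request_no <= self.last_request:
            return False         # redundant or stale migrate: reject
        self.last_request = request_no
        self.peer = sender       # continue the connection with the new server
        return True


conn = ClientConnection(token="T", peer="server-A")

accepted = conn.handle_migrate_syn("server-B", "T", request_no=1)
stale = conn.handle_migrate_syn("server-C", "T", request_no=1)   # same R again
wrong = conn.handle_migrate_syn("server-D", "X", request_no=2)   # unknown token
print(accepted, stale, wrong, conn.peer)
```

With this rule only the first support server to present a given request number takes over the connection, so late arrivals from other group members are ignored.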
Implementation
• Software “Wedge”
  - Stream Mapping
  - Synchronization
[Figure: a client, two boxes labeled "Server App", and two boxes labeled "Wedge", captioned "Stream Mapping Wedges"]
Wedge Overhead
[Figure: microseconds per request (1000 to 1e+07) versus request size in Kbytes (1 to 10000), logarithmic axes, comparing Wedge and Direct]
Experimental Topology
Client initiates a transfer to A… then migrates to B… and back to A…
Servers: Linux/Apache 1.3 (two hosts)
Links: 128 Kbps
Varying Oscillation Rates
[Figure: goodput in bytes (0 to 1e+06) versus time in seconds (0 to 60). Legend entries as extracted: No Oscillations, 10 sec, 12 sec, 2 sec, 5 sec]
Benefits & Limitations
• Enable wide-area server replication
  - Low server synchronization overhead
  - Infer current state from transport layer
• Robust even under adverse loads
  - Health monitors can be overly reactive
  - Gracefully handle cascaded failures
• Leverages connection migration
  - Requires modern transport stack
Software available on the web:
http://nms.lcs.mit.edu/software/migrate
Networks and Mobile Systems