15-446 networked systems practicum

73
15-446 Networked Systems Practicum Lecture 20 – Mobility 1

Upload: emelda

Post on 15-Jan-2016

32 views

Category:

Documents


0 download

DESCRIPTION

15-446 Networked Systems Practicum. Lecture 20 – Mobility. Overview. Internet mobility TCP over noisy links Mosh CODA LBFS. Routing to Mobile Nodes. Obvious solution: have mobile nodes advertise route to mobile address/32 Should work!!! Why is this bad? - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: 15-446 Networked Systems Practicum

15-446 Networked Systems Practicum

Lecture 20 – Mobility

1

Page 2: 15-446 Networked Systems Practicum

2

Overview

• Internet mobility

• TCP over noisy links

• Mosh

• CODA

• LBFS

Page 3: 15-446 Networked Systems Practicum

3

Routing to Mobile Nodes

• Obvious solution: have mobile nodes advertise route to mobile address/32• Should work!!!

• Why is this bad?• Consider forwarding tables on backbone

routers• Would have an entry for each mobile host• Not very scalable

• What are some possible solutions?

Page 4: 15-446 Networked Systems Practicum

4

How to Handle Mobile Nodes?(Addressing)

• Dynamic Host Configuration (DHCP)• Host gets new IP address in new locations• Problems

• Host does not have constant name/address how do others contact host

• What happens to active transport connections?

Page 5: 15-446 Networked Systems Practicum

5

How to Handle Mobile Nodes?(Naming)

• Naming• Use DHCP and update name-address mapping

whenever host changes address• Fixes contact problem but not broken transport

connections

Page 6: 15-446 Networked Systems Practicum

6

How to Handle Mobile Nodes? (Transport)

• TCP currently uses 4 tuple to describe connection• <Src Addr, Src port, Dst addr, Dst port>

• Modify TCP to allow peer’s address to be changed during connection

• Security issues• Can someone easily hijack connection?

• Difficult deployment both ends must support mobility

Page 7: 15-446 Networked Systems Practicum

7

How to Handle Mobile Nodes?(Link Layer)

• Link layer mobility• Learning bridges can handle mobility this is

how it is handled at CMU• Encapsulated PPP (PPTP) Have mobile host

act like he is connected to original LAN• Works for IP AND other network protocols

Page 8: 15-446 Networked Systems Practicum

8

How to Handle Mobile Nodes?(Routing)

• Allow mobile node to keep same address and name

• How do we deliver IP packets when the endpoint moves?• Can’t just have nodes advertise route to their address

• What about packets from the mobile host?• Routing not a problem• What source address on packet? this can cause

problems• Key design considerations

• Scale• Incremental deployment

Page 9: 15-446 Networked Systems Practicum

9

Basic Solution to Mobile Routing

• Same as other problems in computer science• Add a level of indirection

• Keep some part of the network informed about current location• Need technique to route packets through this

location (interception)

• Need to forward packets from this location to mobile host (delivery)

Page 10: 15-446 Networked Systems Practicum

10

Interception

• Somewhere along normal forwarding path• At source• Any router along path• Router to home network• Machine on home network (masquerading as mobile

host)

• Clever tricks to force packet to particular destination• “Mobile subnet” – assign mobiles a special address

range and have special node advertise route

Page 11: 15-446 Networked Systems Practicum

11

Delivery

• Need to get packet to mobile’s current location

• Tunnels• Tunnel endpoint = current location• Tunnel contents = original packets

• Source routing• Loose source route through mobile current

location

Page 12: 15-446 Networked Systems Practicum

12

Mobile IP (RFC 2290)

• Interception• Typically home agent – a host on home network

• Delivery• Typically IP-in-IP tunneling• Endpoint – either temporary mobile address or foreign

agent

• Terminology• Mobile host (MH), correspondent host (CH), home

agent (HA), foreign agent (FA)• Care-of-address, home address

Page 13: 15-446 Networked Systems Practicum

13

Mobile IP (MH at Home)

Mobile Host (MH)

Visiting Location

Home

Internet

Correspondent Host (CH)

Packet

Page 14: 15-446 Networked Systems Practicum

14

Mobile IP (MH Moving)

Visiting Location

Home

Internet

Correspondent Host (CH)Packet

Home Agent (HA) Mobile Host (MH)I am here

Page 15: 15-446 Networked Systems Practicum

15

Mobile IP (MH Away – FA)

Visiting Location

Home

Internet

Correspondent Host (CH)

Packet

Home Agent (HA) Foreign Agent (FA)Encapsulated

Mobile Host (MH)

Page 16: 15-446 Networked Systems Practicum

16

Mobile IP (MH Away - Collocated)

Visiting Location

Home

Internet

Correspondent Host (CH)Packet

Home Agent (HA) Mobile Host (MH)Encapsulated

Page 17: 15-446 Networked Systems Practicum

17

Other Mobile IP Issues

• Route optimality• Resulting paths can be sub-optimal• Can be improved with route optimization

• Unsolicited binding cache update to sender

• Authentication• Registration messages• Binding cache updates

• Must send updates across network• Handoffs can be slow

• Problems with basic solution• Triangle routing• Reverse path check for security

Page 18: 15-446 Networked Systems Practicum

18

Overview

• Internet mobility

• TCP over noisy links

• Mosh

• CODA

• LBFS

Page 19: 15-446 Networked Systems Practicum

Performance Degradation

0.0E+00

5.0E+05

1.0E+06

1.5E+06

2.0E+06

0 10 20 30 40 50 60

Time (s)

Se

que

nce

nu

mb

er

(byt

es)

TCP Reno(280 Kbps)

Expected TCP performance(1.30 Mbps)

2 MB wide-area TCP transfer over 2 Mbps Lucent WaveLAN

19

Page 20: 15-446 Networked Systems Practicum

20

Wireless Bit-Errors

Router

Computer 2Computer 1

2322

Loss Congestion

21 0

Burst losses lead to coarse-grained timeoutsResult: Low throughput

Loss Congestion

Wireless

Page 21: 15-446 Networked Systems Practicum

21

TCP Problems Over Noisy Links

• Wireless links are inherently error-prone• Fades, interference, attenuation• Errors often happen in bursts

• TCP cannot distinguish between corruption and congestion• TCP unnecessarily reduces window, resulting

in low throughput and high latency

• Burst losses often result in timeouts• Sender retransmission is the only option

• Inefficient use of bandwidth

Page 22: 15-446 Networked Systems Practicum

22

Proposed Solutions

• Incremental deployment• Solution should not require modifications to fixed hosts• If possible, avoid modifying mobile hosts

• End-to-end protocols• Selective ACKs, Explicit loss notification

• Split-connection protocols• Separate connections for wired path and wireless hop

• Reliable link-layer protocols• Error-correcting codes• Local retransmission

Page 23: 15-446 Networked Systems Practicum

23

Approach Styles (End-to-End)

• Improve TCP implementations• Not incrementally deployable• Improve loss recovery (SACK, NewReno)• Help it identify congestion (ELN, ECN)

• ACKs include flag indicating wireless loss• Trick TCP into doing right thing E.g. send extra

dupacks

Wired link Wireless link

Page 24: 15-446 Networked Systems Practicum

24

Approach Styles (Link Layer)

• More aggressive local rexmit than TCP• Bandwidth not wasted on wired links

• Possible adverse interactions with transport layer• Interactions with TCP retransmission• Large end-to-end round-trip time variation

• FEC does not work well with burst losses

Wired link Wireless link

ARQ/FEC

Page 25: 15-446 Networked Systems Practicum

25

Overview

• Internet mobility

• TCP over noisy links

• Mosh

• CODA

• LBFS

Page 26: 15-446 Networked Systems Practicum

26

Page 27: 15-446 Networked Systems Practicum
Page 28: 15-446 Networked Systems Practicum
Page 29: 15-446 Networked Systems Practicum
Page 30: 15-446 Networked Systems Practicum
Page 31: 15-446 Networked Systems Practicum
Page 32: 15-446 Networked Systems Practicum
Page 33: 15-446 Networked Systems Practicum
Page 34: 15-446 Networked Systems Practicum
Page 35: 15-446 Networked Systems Practicum
Page 36: 15-446 Networked Systems Practicum
Page 37: 15-446 Networked Systems Practicum
Page 38: 15-446 Networked Systems Practicum
Page 39: 15-446 Networked Systems Practicum

39

Overview

• Internet mobility

• TCP over noisy links

• Mosh

• CODA

• LBFS

Page 40: 15-446 Networked Systems Practicum

Background

• We are back to 1990s.

• Network is slow and not stable

• Terminal “powerful” client• 33MHz CPU, 16MB RAM, 100MB hard drive

• Mobile Users appeared• 1st IBM Thinkpad in 1992

• We can do work at client without network

40

Page 41: 15-446 Networked Systems Practicum

CODA

• Successor of the very successful Andrew File System (AFS)

• AFS• First DFS aimed at a campus-sized user

community• Key ideas include

• open-to-close consistency• callbacks

41

Page 42: 15-446 Networked Systems Practicum

Hardware Model

• CODA and AFS assume that client workstations are personal computers controlled by their user/owner• Fully autonomous• Cannot be trusted

• CODA allows owners of laptops to operate them in disconnected mode• Opposite of ubiquitous connectivity

42

Page 43: 15-446 Networked Systems Practicum

Accessibility

• Must handle two types of failures• Server failures:

• Data servers are replicated

• Communication failures and voluntary disconnections

• Coda uses optimistic replication and file hoarding

43

Page 44: 15-446 Networked Systems Practicum

Design Rationale

• Scalability• Callback cache coherence (inherit from AFS)• Whole file caching• Fat clients. (security, integrity)• Avoid system-wide rapid change

• Portable workstations• User’s assistance in cache management

44

Page 45: 15-446 Networked Systems Practicum

Client Caching in AFS v2

• Client caches both clean and dirty file data and attributes• The client machine uses local disks to cache data• When a file is opened for read, the whole file is fetched

and cached on disk• Why? What’s the disadvantage of doing so?

• However, when a client caches file data, it obtains a “callback” on the file

• In case another client writes to the file, the server “breaks” the callback• Similar to invalidations in distributed shared memory

implementations• Implication: file server must keep state!

45

Page 46: 15-446 Networked Systems Practicum

Design Rationale –Replica Control

• Pessimistic• Disable all partitioned writes - Require a client to acquire control of a cached

object prior to disconnection

• Optimistic• Assuming no others touching the file- sophisticated: conflict detection + fact: low write-sharing in Unix+ high availability: access anything in range

46

Page 47: 15-446 Networked Systems Practicum

What about Consistency?

• Pessimistic replication control protocols guarantee the consistency of replicated in the presence of any non-Byzantine failures• Typically require a quorum of replicas to allow

access to the replicated data• Would not support disconnected mode

47

Page 48: 15-446 Networked Systems Practicum

Pessimistic Replica Control

• Would require client to acquire exclusive (RW) or shared (R) control of cached objects before accessing them in disconnected mode:• Acceptable solution for voluntary

disconnections• Does not work for involuntary disconnections

• What if the laptop remains disconnected for a long time?

48

Page 49: 15-446 Networked Systems Practicum

Leases

• We could grant exclusive/shared control of the cached objects for a limited amount of time

• Works very well in connected mode • Reduces server workload• Server can keep leases in volatile storage as

long as their duration is shorter than boot time

• Would only work for very short disconnection periods

49

Page 50: 15-446 Networked Systems Practicum

Optimistic Replica Control (I)

• Optimistic replica control allows access in every disconnected mode• Tolerates temporary inconsistencies• Promises to detect them later• Provides much higher data availability

50

Page 51: 15-446 Networked Systems Practicum

Optimistic Replica Control (II)

• Defines an accessible universe: set of replicas that the user can access• Accessible universe varies over time

• At any time, user• Will read from the latest replica(s) in his

accessible universe• Will update all replicas in his accessible

universe

51

Page 52: 15-446 Networked Systems Practicum

Coda (Venus) States

1. Hoarding:Normal operation mode

2. Emulating:Disconnected operation mode

3. Reintegrating:Propagates changes and detects inconsistencies

Hoarding

Emulating Recovering

52

Page 53: 15-446 Networked Systems Practicum

Hoarding

• Hoard useful data for disconnection

• Balance the needs of connected and disconnected operation.• Cache size is restricted• Unpredictable disconnections

• Prioritized algorithm – cache manage

• hoard walking – reevaluate objects

53

Page 54: 15-446 Networked Systems Practicum

Prioritized algorithm

• User defined hoard priority p: how interest it is?• Recent Usage q • Object priority = f(p,q)• Kick out the one with lowest priority

+ Fully tunableEverything can be customized

- Not tunable (?)- No idea how to customize

54

Page 55: 15-446 Networked Systems Practicum

Hoard Walking

• Equilibrium – uncached obj < cached obj• Why it may be broken? Cache size is limited.

• Walking: restore equilibrium• Reloading HDB (changed by others)• Reevaluate priorities in HDB and cache• Enhanced callback

• Increase scalability, and availability

• Decrease consistency

55

Page 56: 15-446 Networked Systems Practicum

Emulation

• In emulation mode:• Attempts to access files that are not in the client

caches appear as failures to application• All changes are written in a persistent log,

the client modification log (CML)• Venus removes from log all obsolete entries like

those pertaining to files that have been deleted

56

Page 57: 15-446 Networked Systems Practicum

Persistence

• Venus keeps its cache and related data structures in non-volatile storage

• All Venus metadata are updated throughatomic transactions• Using a lightweight recoverable virtual

memory (RVM) developed for Coda• Simplifies Venus design

57

Page 58: 15-446 Networked Systems Practicum

Reintegration

• When workstation gets reconnected, Coda initiates a reintegration process• Performed one volume at a time• Venus ships replay log to all volumes• Each volume performs a log replay algorithm

• Only care write/write confliction• Succeed?

• Yes. Free logs, reset priority• No. Save logs to a tar. Ask for help

58

Page 59: 15-446 Networked Systems Practicum

Performance

• Duration of Reintegration• A few hours disconnection 1 min

• But sometimes much longer

• Cache size• 100MB at client is enough for a “typical” workday

• Conflicts• No Conflict at all! Why?

• Over 99% modification by the same person

• Two users modify the same obj within a day: <0.75%

59

Page 60: 15-446 Networked Systems Practicum

Coda Summary

• Puts scalability and availability beforedata consistency• Unlike NFS

• Assumes that inconsistent updates are very infrequent

• Introduced disconnected operation mode and file hoarding

60

Page 61: 15-446 Networked Systems Practicum

Remember this slide?

• We are back to 1990s.

• Network is slow and not stable

• Terminal “powerful” client• 33MHz CPU, 16MB RAM, 100MB hard drive

• Mobile Users appear• 1st IBM Thinkpad in 1992

61

Page 62: 15-446 Networked Systems Practicum

What’s now?

• We are in 2010s now.• Network is fast and reliable in LAN• “powerful” client very powerful client

• Multi-core/multi-GHz CPU, multi GB RAM, TB+ hard drive

• Mobile users everywhere• Do we still need disconnection?

• How many people are using coda?

62

Page 63: 15-446 Networked Systems Practicum

Do we still need disconnection?

• WAN and wireless is not very reliable, and is slow

• Smartphone is not very powerful• Electric power constrained

• LBFS (MIT) on WAN, Coda and Odyssey (CMU) for mobile users• Adaptation is also important

63

Page 64: 15-446 Networked Systems Practicum

64

Overview

• Internet mobility

• TCP over noisy links

• Mosh

• CODA

• LBFS

Page 65: 15-446 Networked Systems Practicum

Low Bandwidth File SystemKey Ideas

• A network file systems for slow or wide-area networks

• Exploits similarities between files or versions of the same file• Avoids sending data that can be found in the

server’s file system or the client’s cache

• Also uses conventional compression and caching

• Requires 90% less bandwidth than traditional network file systems

65

Page 66: 15-446 Networked Systems Practicum

Working on slow networks

• Make local copies• Must worry about update conflicts

• Use remote login• Only for text-based applications

• Use instead a LBFS• Better than remote login• Must deal with issues like auto-saves blocking

the editor for the duration of transfer

66

Page 67: 15-446 Networked Systems Practicum

LBFS design

• Provides close-to-open consistency• Uses a large, persistent file cache at client

• Stores clients working set of files

• LBFS server divides file it stores into chunks and indexes the chunks by hash value

• Client similarly indexes its file cache• Exploits similarities between files

• LBFS never transfers chunks that the recipient already has

67

Page 68: 15-446 Networked Systems Practicum

Indexing

• Uses the SHA-1 algorithm for hashing• It is collision resistant

• Central challenge in indexing file chunks is keeping the index at a reasonable size while dealing with shifting offsets• Indexing the hashes of fixed size data blocks• Indexing the hashes of all overlapping blocks at

all offsets

68

Page 69: 15-446 Networked Systems Practicum

LBFS indexing solution

• Considers only non-overlapping chunks

• Sets chunk boundaries based on file contents rather than on position within a file

• Examines every overlapping 48-byte region of file to select the boundary regions called breakpoints using Rabin fingerprints• When low-order 13 bits of region’s fingerprint

equals a chosen value, the region constitutes a breakpoint

69

Page 70: 15-446 Networked Systems Practicum

Effects of edits on file chunks

• Chunks of file before/after edits• Grey shading show edits

• Stripes show 48byte regions with magic hash values creating chunk boundaries

70

Page 71: 15-446 Networked Systems Practicum

More Indexing Issues

• Pathological cases• Very small chunks

• Sending hashes of chunks would consume as much bandwidth as just sending the file

• Very large chunks• Cannot be sent in a single RPC

• LBFS imposes minimum and maximum chuck sizes

71

Page 72: 15-446 Networked Systems Practicum

The Chunk Database

• Indexes each chunk by the first 64 bits of its SHA-1 hash

• To avoid synchronization problems, LBFS always recomputes the SHA-1 hash of any data chunk before using it• Simplifies crash recovery

• Recomputed SHA-1 values are also used to detect hash collisions in the database

72

Page 73: 15-446 Networked Systems Practicum

Conclusion

• Under normal circumstances, LBFS consumes 90% less bandwidth than traditional file systems.

• Makes transparent remote file access a viable and less frustrating alternative to running interactive programs on remote machines.

73