distributed systems and security: an introduction brad karp ucl computer science cs gz03 / 4030 1 st...
TRANSCRIPT
![Page 1: Distributed Systems and Security: An Introduction Brad Karp UCL Computer Science CS GZ03 / 4030 1 st October, 2007](https://reader035.vdocuments.mx/reader035/viewer/2022072012/56649e445503460f94b37ab4/html5/thumbnails/1.jpg)
Distributed Systems and Security:
An Introduction
Brad KarpUCL Computer Science
CS GZ03 / 40301st October, 2007
![Page 2: Distributed Systems and Security: An Introduction Brad Karp UCL Computer Science CS GZ03 / 4030 1 st October, 2007](https://reader035.vdocuments.mx/reader035/viewer/2022072012/56649e445503460f94b37ab4/html5/thumbnails/2.jpg)
2
Today’s Lecture
• Logistics• Course Communication• Overview of Distributed Systems
– What are they?– Why build them?– Why are they hard to build well?
• Detailed Syllabus: Distributed Systems• Assessment regime• Questionnaire
![Page 3: Distributed Systems and Security: An Introduction Brad Karp UCL Computer Science CS GZ03 / 4030 1 st October, 2007](https://reader035.vdocuments.mx/reader035/viewer/2022072012/56649e445503460f94b37ab4/html5/thumbnails/3.jpg)
3
Logistics
• Meeting Times/Locations– Monday: 9 AM – 10 AM , MPEB 1.02– Monday: 1 PM – 2 PM, Roberts 309– Tuesday: 1 PM – 2 PM, Anatomy Gavin de Beer
LT
• Schedule– 1st October – 30th October: Distributed Systems– 5th November – 9th November: Reading Week
(no lecture)– 12th November – 11th December: Security
![Page 4: Distributed Systems and Security: An Introduction Brad Karp UCL Computer Science CS GZ03 / 4030 1 st October, 2007](https://reader035.vdocuments.mx/reader035/viewer/2022072012/56649e445503460f94b37ab4/html5/thumbnails/4.jpg)
4
Course Communication
• Course web page– http://www.cs.ucl.ac.uk/staff/B.Karp/gz03/f2007
/– Detailed calendar: readings, lecture topics,
coursework, announcements/corrections– Your responsibility: check page daily!
• Course mailing lists– {gz03,4030}@<department’s domain>– Subscribe by sending mail to {gz03-request,4030-request}@<department’s domain> with one-word subject join
– Must subscribe from UCL CS email address– Used for course announcements– Your responsibility: check email daily!
![Page 5: Distributed Systems and Security: An Introduction Brad Karp UCL Computer Science CS GZ03 / 4030 1 st October, 2007](https://reader035.vdocuments.mx/reader035/viewer/2022072012/56649e445503460f94b37ab4/html5/thumbnails/5.jpg)
5
What Is a Distributed System?
• Multiple computers (“machines,” “hosts,” “boxes,” &c.)– Each with CPU, memory, disk, network
interface– Interconnected by LAN or WAN (e.g.,
Internet)
• Application runs across this dispersed collection of networked hardware
• But user sees single, unified system
![Page 6: Distributed Systems and Security: An Introduction Brad Karp UCL Computer Science CS GZ03 / 4030 1 st October, 2007](https://reader035.vdocuments.mx/reader035/viewer/2022072012/56649e445503460f94b37ab4/html5/thumbnails/6.jpg)
6
What Is a Distributed System?(Alternate Take)
“A distributed system is a system in which I can’t do my work because some computer that I’ve never even heard of has failed.”
– Leslie Lamport, Microsoft Research (ex DEC)
![Page 7: Distributed Systems and Security: An Introduction Brad Karp UCL Computer Science CS GZ03 / 4030 1 st October, 2007](https://reader035.vdocuments.mx/reader035/viewer/2022072012/56649e445503460f94b37ab4/html5/thumbnails/7.jpg)
7
Start Simple: Centralized System
• Suppose you run Gmail• Workload:
– Inbound email arrives; store on disk– Users retrieve, delete their email
• You run Gmail on one server with disk
GmailServer (PC)
EmailSender
EmailSender
EmailSender
EmailReader
EmailReader
EmailReader
What are shortcomings of this design?
![Page 8: Distributed Systems and Security: An Introduction Brad Karp UCL Computer Science CS GZ03 / 4030 1 st October, 2007](https://reader035.vdocuments.mx/reader035/viewer/2022072012/56649e445503460f94b37ab4/html5/thumbnails/8.jpg)
8
Why Distribute? For Availability
• Suppose Gmail server goes down, or network between client and it goes down
• No incoming mail delivered, no users can read their inboxes
• Fix: replicate the data on several servers– Increased chance some server will be reachable– Consistency? One server down when delete
message, then comes back up; message returns in inbox
– Latency? Replicas should be far apart, so they fail independently
– Partition resilience? e.g., airline seat database splits, one seat remains, bought twice, once in each half!
![Page 9: Distributed Systems and Security: An Introduction Brad Karp UCL Computer Science CS GZ03 / 4030 1 st October, 2007](https://reader035.vdocuments.mx/reader035/viewer/2022072012/56649e445503460f94b37ab4/html5/thumbnails/9.jpg)
9
Why Distribute?For Scalable Capacity
• What if Gmail a huge success?• Workload exceeds capacity of one
server• Fix: spread users across several servers
– Best case: linear scaling—if U users per box, N boxes support NU users
– Bottlenecks? If each user’s inbox on one server, how to route inbound mail to right server?
– Scaling? How close to linear?– Load balance? Some users get more mail
than others!
![Page 10: Distributed Systems and Security: An Introduction Brad Karp UCL Computer Science CS GZ03 / 4030 1 st October, 2007](https://reader035.vdocuments.mx/reader035/viewer/2022072012/56649e445503460f94b37ab4/html5/thumbnails/10.jpg)
10
Performance Can Be Subtle
• Goal: predictable performance under high load
• 2 employees run a Starbucks– Employee 1: takes orders from customers,
calls them out to Employee 2– Employee 2:
• writes down drink orders (5 seconds per order)• makes drinks (10 seconds per order)
• What is throughput under increasing load?
![Page 11: Distributed Systems and Security: An Introduction Brad Karp UCL Computer Science CS GZ03 / 4030 1 st October, 2007](https://reader035.vdocuments.mx/reader035/viewer/2022072012/56649e445503460f94b37ab4/html5/thumbnails/11.jpg)
11
Starbucks Throughput
• Peak system performance: 4 drinks / min• What happens when load > 4 orders / min?• What happens to efficiency as load increases?
What would preferable curve be?What design achieves that goal?
![Page 12: Distributed Systems and Security: An Introduction Brad Karp UCL Computer Science CS GZ03 / 4030 1 st October, 2007](https://reader035.vdocuments.mx/reader035/viewer/2022072012/56649e445503460f94b37ab4/html5/thumbnails/12.jpg)
12
Why Are Distributed SystemsHard to Design?
• Failure: of hosts, of network– Remember Lamport’s lament
• Heterogeneity– Hosts may have different data representations
• Need consistency (many specific definitions)– Users expect familiar “centralized” behavior
• Need concurrency for performance– Avoid waiting synchronously, leaving resources
idle– Overlap requests concurrently whenever possible
![Page 13: Distributed Systems and Security: An Introduction Brad Karp UCL Computer Science CS GZ03 / 4030 1 st October, 2007](https://reader035.vdocuments.mx/reader035/viewer/2022072012/56649e445503460f94b37ab4/html5/thumbnails/13.jpg)
13
Security
• Before Internet:– Encryption and authentication using cryptography– Between parties known to each other (e.g.,
diplomatic wire)• Today:
– Entire Internet of potential attackers– Legitimate correspondents often have no prior
relationship– Online shopping: how do you know you gave credit
card number to amazon.com? How does amazon.com know you are authorized credit card user?
– Software download: backdoor in your new browser?– Software vulnerabilities: remote infection by
worms!– Crypto not enough alone to solve these problems!
![Page 14: Distributed Systems and Security: An Introduction Brad Karp UCL Computer Science CS GZ03 / 4030 1 st October, 2007](https://reader035.vdocuments.mx/reader035/viewer/2022072012/56649e445503460f94b37ab4/html5/thumbnails/14.jpg)
14
Detailed Syllabus
• No textbook• Readings: research papers on real, built
distributed systems that illustrate concepts– You must read papers by day assigned!– Lectures will assume you have.
• Lectures: (some) background for papers; review system from paper; discuss system
• See schedule on course web page
![Page 15: Distributed Systems and Security: An Introduction Brad Karp UCL Computer Science CS GZ03 / 4030 1 st October, 2007](https://reader035.vdocuments.mx/reader035/viewer/2022072012/56649e445503460f94b37ab4/html5/thumbnails/15.jpg)
15
How Will You Be Evaluated?
• 2 courseworks, 15% total– One will involve significant programming: 10%
• Out: 9th Oct, Due: 29th Oct
– Other will be written problem set: 5%• Out: 20th Nov, Due: 13th Dec
• 85% final exam– 2.5 hours; rubric: 3 of 5 questions– 4030 (4th-years): must get >= 40% to pass
• Overall:– 4030 (4th-years): must get >= 40% mean to pass– GZ03 (DCNDS, SSE): must get >= 50% mean to
pass