distributed systems cs 15-440

25
Distributed Systems CS 15-440 Overview and Introduction Lecture 1, Aug 25, 2014 Mohammad Hammoud

Upload: rafael-curry

Post on 30-Dec-2015

30 views

Category:

Documents


0 download

DESCRIPTION

Distributed Systems CS 15-440. Overview and Introduction Lecture 1, Aug 25, 2014 Mohammad Hammoud. Why Should Y ou Study Distributed Systems?. Definition of a Distributed System. A distributed system is:. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Distributed Systems CS 15-440

Distributed SystemsCS 15-440

Overview and Introduction

Lecture 1, Aug 25, 2014

Mohammad Hammoud

Page 2: Distributed Systems CS 15-440

Why Should You Study Distributed Systems?

Application Domain Associated Networked Application

Finance and commerce eCommerce e.g. Amazon and eBay, PayPal, online banking and trading

The information society Web information and search engines, ebooks, Wikipedia; social networking: Facebook and MySpace.

Creative industries and entertainment online gaming, music and film in the home, user-generated content, e.g. YouTube, Flickr

Healthcare health informatics, on online patient records, monitoring patients

Education e-learning, virtual learning environments; distance learning

Transport and logistics GPS in route finding systems, map services: Google Maps, Google Earth

Science The Grid as an enabling technology for collaboration between scientists

Environmental management sensor technology to monitor earthquakes, floods or tsunamis

Application Domain Associated Networked Application

Finance and commerce eCommerce e.g. Amazon and eBay, PayPal, online banking and trading

The information society Web information and search engines, ebooks, Wikipedia; social networking: Facebook and MySpace.

Creative industries and entertainment online gaming, music and film in the home, user-generated content, e.g. YouTube, Flickr

Healthcare health informatics, on online patient records, monitoring patients

Education e-learning, virtual learning environments; distance learning

Transport and logistics GPS in route finding systems, map services: Google Maps, Google Earth

Science The Grid as an enabling technology for collaboration between scientists

Environmental management sensor technology to monitor earthquakes, floods or tsunamis

Application Domain Associated Networked Application

Finance and commerce eCommerce e.g. Amazon and eBay, PayPal, online banking and trading

The information society Web information and search engines, ebooks, Wikipedia; social networking: Facebook and MySpace.

Creative industries and entertainment online gaming, music and film in the home, user-generated content, e.g. YouTube, Flickr

Healthcare health informatics, on online patient records, monitoring patients

Education e-learning, virtual learning environments; distance learning

Transport and logistics GPS in route finding systems, map services: Google Maps, Google Earth

Science The Grid as an enabling technology for collaboration between scientists

Environmental management sensor technology to monitor earthquakes, floods or tsunamis

Application Domain Associated Networked Application

Finance and commerce eCommerce e.g. Amazon and eBay, PayPal, online banking and trading

The information society Web information and search engines, ebooks, Wikipedia; social networking: Facebook and MySpace.

Creative industries and entertainment online gaming, music and film in the home, user-generated content, e.g. YouTube, Flickr

Healthcare health informatics, on online patient records, monitoring patients

Education e-learning, virtual learning environments; distance learning

Transport and logistics GPS in route finding systems, map services: Google Maps, Google Earth

Science The Grid as an enabling technology for collaboration between scientists

Environmental management sensor technology to monitor earthquakes, floods or tsunamis

Application Domain Associated Networked Application

Finance and commerce eCommerce e.g. Amazon and eBay, PayPal, online banking and trading

The information society Web information and search engines, ebooks, Wikipedia; social networking: Facebook and MySpace.

Creative industries and entertainment online gaming, music and film in the home, user-generated content, e.g. YouTube, Flickr

Healthcare health informatics, on online patient records, monitoring patients

Education e-learning, virtual learning environments; distance learning

Transport and logistics GPS in route finding systems, map services: Google Maps, Google Earth

Science The Grid as an enabling technology for collaboration between scientists

Environmental management sensor technology to monitor earthquakes, floods or tsunamis

Application Domain Associated Networked Application

Finance and commerce eCommerce e.g. Amazon and eBay, PayPal, online banking and trading

The information society Web information and search engines, ebooks, Wikipedia; social networking: Facebook and MySpace.

Creative industries and entertainment online gaming, music and film in the home, user-generated content, e.g. YouTube, Flickr

Healthcare health informatics, on online patient records, monitoring patients

Education e-learning, virtual learning environments; distance learning

Transport and logistics GPS in route finding systems, map services: Google Maps, Google Earth

Science The Grid as an enabling technology for collaboration between scientists

Environmental management sensor technology to monitor earthquakes, floods or tsunamis

Application Domain Associated Networked Application

Finance and commerce eCommerce e.g. Amazon and eBay, PayPal, online banking and trading

The information society Web information and search engines, ebooks, Wikipedia; social networking: Facebook and MySpace.

Creative industries and entertainment online gaming, music and film in the home, user-generated content, e.g. YouTube, Flickr

Healthcare health informatics, on online patient records, monitoring patients

Education e-learning, virtual learning environments; distance learning

Transport and logistics GPS in route finding systems, map services: Google Maps, Google Earth

Science The Grid as an enabling technology for collaboration between scientists

Environmental management sensor technology to monitor earthquakes, floods or tsunamis

Application Domain Associated Networked Application

Finance and commerce eCommerce e.g. Amazon and eBay, PayPal, online banking and trading

The information society Web information and search engines, ebooks, Wikipedia; social networking: Facebook and MySpace.

Creative industries and entertainment online gaming, music and film in the home, user-generated content, e.g. YouTube, Flickr

Healthcare health informatics, on online patient records, monitoring patients

Education e-learning, virtual learning environments; distance learning

Transport and logistics GPS in route finding systems, map services: Google Maps, Google Earth

Science The Grid as an enabling technology for collaboration between scientists

Environmental management sensor technology to monitor earthquakes, floods or tsunamis

Page 3: Distributed Systems CS 15-440

Definition of a Distributed System

A distributed system is:

A collection of independent computers

that appear to its users as a single coherent system

(Tanenbaum book)

One in which components located at networked

computers communicate and coordinate their

actions only by passing messages (Coulouris

book)

Page 4: Distributed Systems CS 15-440

Why Distributed Systems?

• Big data continues to grow:

In mid-2010, the information universe carried 1.2 zettabytes 2020 predictions expect nearly 44 times more

• Applications are becoming data-intensive and compute-intensive

Page 5: Distributed Systems CS 15-440

Why Distributed Systems?

• Individual computers have limited resources compared to the scale of current day problems & application domains:

1. Caches and Memory:

L1

Cache

L2 Cache

L3 Cache

Main Memory

16KB- 64KB, 2-4 cycles

512KB- 8MB, 6-15 cycles

4MB- 32MB, 30-50 cycles

1GB- 4GB, 300+ cycles

Page 6: Distributed Systems CS 15-440

Why Distributed Systems?

• Individual computers have limited resources compared to the scale of current day problems & application domains:

2. Hard Disk Drive:

Limited capacity

Limited number of channels

Limited bandwidth

Page 7: Distributed Systems CS 15-440

Why Distributed Systems?

P

L1

L2

P

L1

L2 Cache

P

L1

P

L1

P

L1

Interconnect

• Individual computers have limited resources compared to the scale of current day problems & application domains:

3. Processor:

The number of transistors that can be integrated on a single die has continued to grow at Moore’s pace

Chip Multiprocessors (CMPs) are now available

A single Processor Chip

A CMP

Page 8: Distributed Systems CS 15-440

Why Distributed Systems?3. Processor (cont’d):

Up until a few years ago, CPU speed grew at the rate of 55% annually, while the memory speed grew at the rate of only 7% [H & P]

Memory

Memory

P

M

P

L1

L2

P

L1

L2 Cache

P

L1

P

L1

P

L1

Interconnect

Processor-Memory speed gap

Page 9: Distributed Systems CS 15-440

Why Distributed Systems?3. Processor (cont’d):

Even if 100s or 1000s of cores are placed on a CMP, it is a challenge to deliver input data to these cores fast enough for processing

A Data Setof 4 TBs

4 100MB/S IO Channels

10000 seconds

(or 3 hours) to load data

Memory

P

L1

L2 Cache

P

L1

P

L1

P

L1

Interconnect

Page 10: Distributed Systems CS 15-440

Why Distributed Systems?

Only 3 minutes to load data

A Data Set (data) of 4 TBs

Splits

Memory

P

L1

L2

Memory

P

L1

L2

100 Machines

Page 11: Distributed Systems CS 15-440

Requirements

But this requires:

A way to express the problem as parallel processes and execute them on different machines (Programming Models and Concurrency)

A way for processes on different machines to exchange information (Communication)

A way for processes to cooperate, synchronize with one another and agree on shared values (Synchronization)

A way to enhance reliability and improve performance (Consistency and Replication)

Page 12: Distributed Systems CS 15-440

Requirements

But this requires (cont’d):

A way to recover from partial failures (Fault Tolerance)

A way to secure communication and ensure that a process gets only those access rights it is entitled to (Security)

A way to extend interfaces so as to mimic the behavior of another system, reduce diversity of platforms, and provide a high degree of portability and flexibility (Virtualization)

Page 13: Distributed Systems CS 15-440

An Introductory Course on Distributed Systems

.0.

Introduction.1.

Processes and Communications.2.

Naming.3.

Synchronization.4.

Consistency and Replication.5.

Fault Tolerance.6.

Programming Models.7.

Distributed File Systems.8.

Security.9.

Virtualization

Considered: a reasonably critical and comprehensive perspective.

Thoughtful: Fluent, flexible and efficient perspective.

Masterful: a powerful and illuminating perspective.

Page 14: Distributed Systems CS 15-440

Intended Learning Outcomes

Introduction

• ILO0: Outlining the characteristics of distributed systems and the challenges that must be addressed in their design

Processes and Communication

• ILO1: Explain and contrast the communication mechanisms between processes and systems

Naming

• ILO2: Identify why entities and resources in distributed systems should be named, and examine the naming conventions and the naming-resolution mechanisms

Synchronization

• ILO3: Describe and analyze how multiple machines and services should cooperate and synchronize to correctly solve a problem

Page 15: Distributed Systems CS 15-440

Intended Learning Outcomes

Consistency and Replication

• ILO4: Identify how replication of resources improve performance in distributed systems, and explain algorithms to maintain consistent copies of replicas

Fault Tolerance

• ILO5: Explain how a distributed system can be made fault tolerant

Programming Models

• ILO6: Explain and apply the shared memory, message passing, MapReduce, Pregel and GraphLab programming models and describe the important differences between them

Distributed File Systems

• ILO7: Explain distributed file systems as a paradigm for general-purpose distributed systems, analyze its various aspects and architectures, and contrast against parallel file systems

Page 16: Distributed Systems CS 15-440

Intended Learning Outcomes

Security

• ILO8: Explain the various concepts and mechanisms that are generally incorporated in distributed systems to support security

Virtualization

• ILO9:Explain resource virtualization, how it applies to distributed systems, and how it allows distributed resource management and scheduling

Page 17: Distributed Systems CS 15-440

Course Objectives

The course aims at providing an in-depth and hands-on understanding on

Principles on which distributed systems are based

Principles on which distributed systems are optimized

Distributed system programming models and analytics engines

How modern distributed systems meet the demands of contemporary distributed applications

Page 18: Distributed Systems CS 15-440

Teaching Team

Instructor: Mohammad Hammoud

(MHH)

Teaching Assistant:

Dania Abed Rabbou

(DA)

15-440 Teaching

Team

Office HoursMHH

• Wednesday, 4:30- 5:30PM• Welcome when my office door is open• By appointment

Office HoursDA

• Tuesday, 9:30AM- 12:00PM and Thursday, 10:30AM- 12:00PM• Welcome when her office door is open• By appointment

Page 19: Distributed Systems CS 15-440

Teaching Methods

Lectures (28)

• Motivate learning• Provide a framework or roadmap to organize the information of the

course• Explain subjects and reinforce the critical big ideas

Recitations (14)

•Get students to reveal what they don’t understand, so we can help them•Allow students to practice skills they will need to become competent/expert

Page 20: Distributed Systems CS 15-440

Assignments & Projects

Assignments

• 5 required problem solving and reading assignments

Projects

• 4 large programming projects

Page 21: Distributed Systems CS 15-440

Projects

For all the projects except the final one, the following rules apply:

If you submit one day late, 25% will be deducted from your project score as a penalty

If you are two days late, 50% will be deducted

The project will not be graded (and you will receive a zero score) if you are more than two days late

There will be a 3-grace-day quota

Page 22: Distributed Systems CS 15-440

Assessment Methods

How do we measure learning?

Type # Weight

Project 4 45%

Exam 2 25% (10% Midterm & 15% Final)

Problem Solving and Reading Assignment

5 15%

Quiz 2 10%

Class and Recitation Participation and Attendance

42 5%

Page 23: Distributed Systems CS 15-440

Target Audience and Prerequisites

Target Audience:

Seniors

Prerequisites:

15-213 Students should have a basic knowledge of computer

systems and object-oriented programming

Page 24: Distributed Systems CS 15-440

Text Books

The primary textbooks for this course are:

1. Andrew S. Tannenbaum and Maarten Van Steen, Distributed Systems: Principles and Paradigms, 2nd E, Pearson, 2007.

2. George Coulouris, Jean Dollimore, Tim Kindberg, and Gordon Blair, Distributed Systems: Concepts and Design, 5th E, Addison Wesley, 2011

Reference Book:

4. Tom white, Hadoop: The Definitive Guide, 2nd E, O’Reilly Media, 2011

Page 25: Distributed Systems CS 15-440

Next Lecture

• We will discuss the trends in distributed systems and the challenges encountered when designing such systems

• To Do:• Read C1 & T1• Attend Next Lecture• Attend Recitation on Thursday

• OOP Programming in Java

Questions?