slide 1 comp 308 parallel efficient algorithms lecturer: dr. igor potapov ashton building, room 3.15...

Slide 1 COMP 308 Parallel Efficient Algorithms Lecturer: Dr. Igor Potapov Ashton Building, room 3.15 E-mail: igor@csc.liv.ac.uk COMP 308 web-page: http://www.csc.liv.ac.uk/~igor/COMP308 Introduction to Parallel Computation

Post on 19-Dec-2015


Page 1: Slide 1 COMP 308 Parallel Efficient Algorithms Lecturer: Dr. Igor Potapov Ashton Building, room 3.15 E-mail: igor@csc.liv.ac.uk COMP 308 web-page: igor/COMP308

Slide 1

COMP 308 Parallel Efficient Algorithms

Lecturer: Dr. Igor Potapov
Ashton Building, room 3.15
E-mail: igor@csc.liv.ac.uk

COMP 308 web-page: http://www.csc.liv.ac.uk/~igor/COMP308

Introduction to Parallel Computation


Slide 2

Course Description and Objectives:

• The aim of the module is

– to introduce techniques for the design of efficient parallel algorithms and

– their implementation.


Slide 3

Learning Outcomes:

At the end of the course you will be:

• familiar with the wide applicability of graph theory and tree algorithms as an abstraction for the analysis of many practical problems,

• familiar with the efficient parallel algorithms related to many areas of computer science: expression computation, sorting, graph-theoretic problems, computational geometry, algorithmics of texts, etc.,

• familiar with the basic issues of implementing parallel algorithms.

You will also acquire knowledge of those problems which have been perceived as intractable for parallelization.


Slide 4

Teaching method

• Series of 30 lectures (3 hrs per week)
– Lecture: Monday 10.00
– Lecture: Tuesday 10.00
– Lecture: Friday 10.00

Course Assessment

• A two-hour examination: 80%
• Continuous assessment (written class test + home assignment): 20%


Slide 5

Recommended Course Textbooks

• Introduction to Algorithms, Cormen et al.

• Introduction to Parallel Computing: Design and Analysis of Algorithms, Vipin Kumar, Ananth Grama, Anshul Gupta, and George Karypis, Benjamin Cummings, 2nd ed., 2003.

• Efficient Parallel Algorithms, A. Gibbons, W. Rytter, Cambridge University Press, 1988.

+Research papers (will be announced later)


Slide 6

What is Parallel Computing? (basic idea)

• Consider the problem of stacking (reshelving) a set of library books.
– A single worker trying to stack all the books in their proper places cannot accomplish the task faster than a certain rate.
– We can speed up this process, however, by employing more than one worker.


Slide 7

Solution 1

• Assume that books are organized into shelves and that the shelves are grouped into bays.

• One simple way to assign the task to the workers is:
– to divide the books equally among them;
– each worker stacks the books one at a time.

• This division of work may not be the most efficient way to accomplish the task, since
– the workers must walk all over the library to stack books.


Slide 8

Solution 2

• An alternative way to divide the work is to assign a fixed and disjoint set of bays to each worker. [Instance of task partitioning]

• As before, each worker is assigned an equal number of books arbitrarily.
– If the worker finds a book that belongs to a bay assigned to him or her,
  • he or she places that book in its assigned spot.
– Otherwise,
  • he or she passes it on to the worker responsible for the bay it belongs to. [Instance of communication between tasks]

• The second approach requires less effort from individual workers.
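The second solution can be sketched as a small simulation. This is only an illustrative sketch: the function name `reshelve`, the rule that bay `b` belongs to worker `b % num_workers`, and the round-robin initial split of books are assumptions, not part of the slides.

```python
from collections import defaultdict


def owner(bay, num_workers):
    """Task partitioning (assumed rule): bay b is assigned to worker b % num_workers."""
    return bay % num_workers


def reshelve(books, num_workers):
    """Simulate Solution 2. `books` is a list of bay numbers, one entry per book.

    Returns (shelved, passes): the bays shelved by each worker and the
    number of books that had to be passed to another worker.
    """
    # Each worker starts with an arbitrary, roughly equal share of the books.
    hands = [books[w::num_workers] for w in range(num_workers)]
    shelved = defaultdict(list)
    passes = 0
    for worker, pile in enumerate(hands):
        for bay in pile:
            responsible = owner(bay, num_workers)
            if responsible != worker:
                passes += 1  # communication: hand the book to its owner
            shelved[responsible].append(bay)
    return dict(shelved), passes


shelved, passes = reshelve([1, 0, 3, 2], num_workers=2)
print(shelved, passes)
```

The `passes` counter makes the communication cost visible: it depends on how well the initial split of books happens to match the bay assignment.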


Slide 9

Problems are parallelizable to different degrees

• For some problems, assigning partitions to other processors might be more time-consuming than performing the processing locally.

• Other problems may be completely serial.
– For example, consider the task of digging a post hole.
  • Although one person can dig a hole in a certain amount of time,
  • employing more people does not reduce this time.


Slide 10

Power of parallel solutions

• Pile collection
– Ants/robots with very limited abilities (each sees only its neighbourhood)
– Grid environment (sticks and robots)

Move()
  Move randomly
  Until robot sees a stick in its neighbourhood

Collect()
  Move(); Pick up a stick; Move();
  Put it down; Collect();
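The Move()/Collect() pseudocode can be tried out with a small simulation. The following is a sketch under several assumptions that the slide does not state: a single robot, a square grid that wraps around at the edges, a 3x3 neighbourhood, a dictionary mapping cells to stick counts, and a fixed step budget in place of the endless recursion of Collect().

```python
import random


def neighbourhood(pos, size):
    """Cells the robot can see: its own cell and the 8 around it (wrap-around grid)."""
    r, c = pos
    return [((r + dr) % size, (c + dc) % size)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def visible_stick(grid, pos, size):
    """Return a visible cell that holds at least one stick, or None."""
    for cell in neighbourhood(pos, size):
        if grid.get(cell, 0) > 0:
            return cell
    return None


def random_step(pos, size, rng):
    """One random move to a horizontally or vertically adjacent cell."""
    r, c = pos
    dr, dc = rng.choice([(-1, 0), (1, 0), (0, -1), (0, 1)])
    return ((r + dr) % size, (c + dc) % size)


def collect(grid, size, steps, seed=0):
    """Run one robot for `steps` random moves; returns True if it ends up carrying a stick."""
    rng = random.Random(seed)
    pos, carrying = (0, 0), False
    for _ in range(steps):
        pos = random_step(pos, size, rng)        # Move(): move randomly ...
        target = visible_stick(grid, pos, size)  # ... until a stick is seen
        if target is None:
            continue
        if carrying:
            grid[target] += 1                    # put the stick down on the pile it sees
        else:
            grid[target] -= 1                    # pick up one stick
            if grid[target] == 0:
                del grid[target]
        carrying = not carrying
        pos = target
    return carrying


grid = {(1, 1): 1, (5, 5): 1, (8, 2): 1, (3, 7): 1}
carrying = collect(grid, size=10, steps=2000, seed=1)
print(len(grid), "pile(s) left, robot carrying:", carrying)
```

Because a stick is only ever put down on a cell that already holds one, the number of piles can never grow in this sketch, while picking up the last stick of a pile removes that pile.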


Slide 11

Sorting in nature

6 2 1 3 5 7 4


Slide 12

Parallel Processing
(Several processing elements working to solve a single problem)

• Primary consideration: elapsed time
– NOT: throughput, sharing resources, etc.

• Downside: complexity
– system, algorithm design

• Elapsed Time = computation time + communication time + synchronization time


Slide 13

Design of efficient algorithms

A parallel computer is of little use unless efficient parallel algorithms are available.

– The issues in designing parallel algorithms are very different from those in designing their sequential counterparts.

– A significant amount of work is being done to develop efficient parallel algorithms for a variety of parallel architectures.


Slide 14

Processor Trends

• Moore’s Law
– performance doubles roughly every 18 months (a popular restatement; Moore’s original observation was about transistor counts)

• Parallelization within processors
– pipelining
– multiple pipelines


Slide 15

Why Parallel Computing

• Practical:
– Moore’s Law cannot hold forever
– Problems must be solved immediately
– Cost-effectiveness
– Scalability

• Theoretical:
– challenging problems


Slide 16

Some Complex Problems

• N-body simulation
• Atmospheric simulation
• Image generation
• Oil exploration
• Financial processing
• Computational biology


Slide 17

Some Complex Problems

• N-body simulation
– $O(n \log n)$ time
– galaxy: $10^{11}$ stars, approx. one year per iteration

• Atmospheric simulation
– 3D grid, each element interacts with neighbors
– 1x1x1 mile elements: $5 \times 10^8$ elements
– 10 day simulation requires approx. 100 days


Slide 18

Some Complex Problems

• Image generation
– animation, special effects
– several minutes of video: 50 days of rendering

• Oil exploration
– large amounts of seismic data to be processed
– months of sequential exploration


Slide 19

Some Complex Problems

• Financial processing
– market prediction, investing
– Cornell Theory Center, Renaissance Tech.

• Computational biology
– drug design
– gene sequencing (Celera)
– structure prediction (Proteomics)


Slide 20

Fundamental Issues

• Is the problem amenable to parallelization?
• How to decompose the problem to exploit parallelism?
• What machine architecture should be used?
• What parallel resources are available?
• What kind of speedup is desired?


Slide 21

Two Kinds of Parallelism

• Pragmatic
– goal is to speed up a given computation as much as possible
– problem-specific
– techniques include:
  • overlapping instructions (multiple pipelines)
  • overlapping I/O operations (RAID systems)
  • “traditional” (asymptotic) parallelism techniques


Slide 22

Two Kinds of Parallelism

• Asymptotic
– studies:
  • architectures for general parallel computation
  • parallel algorithms for fundamental problems
  • limits of parallelization
– can be subdivided into three main areas


Slide 23

Asymptotic Parallelism

• Models
– comparing/evaluating different architectures

• Algorithm Design
– utilizing a given architecture to solve a given problem

• Computational Complexity
– classifying problems according to their difficulty


Slide 24

Architecture

• Single processor:
– single instruction stream
– single data stream
– von Neumann model

• Multiple processors:
– Flynn’s taxonomy


Slide 25

Flynn’s Taxonomy

                              Data Streams
                              1         Many
Instruction Streams   1       SISD      SIMD
                      Many    MISD      MIMD
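The four classes follow mechanically from the two stream counts. As a minimal sketch (the function name `flynn_class` is hypothetical and used only for illustration):

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Name the Flynn class for the given numbers of instruction and data streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    instr = "S" if instruction_streams == 1 else "M"  # Single or Multiple instruction
    data = "S" if data_streams == 1 else "M"          # Single or Multiple data
    return f"{instr}I{data}D"


for streams in [(1, 1), (1, 8), (4, 1), (4, 4)]:
    print(streams, flynn_class(*streams))
```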


Slide 26


Slide 27

Parallel Architectures

• Multiple processing elements

• Memory:
– shared
– distributed
– hybrid

• Control:
– centralized
– distributed


Slide 28

Parallel vs Distributed Computing

• Parallel:
– several processing elements concurrently solving the same single problem

• Distributed:
– processing elements do not share memory or system clock

• Which is the subset of which?
– distributed is a subset of parallel


Slide 29

Efficient and optimal parallel algorithms

• A parallel algorithm is efficient iff
– it is fast (e.g. polynomial time) and
– the product of the parallel time and the number of processors is close to the time of the best known sequential algorithm:

$T_{\text{sequential}} \approx T_{\text{parallel}} \times N_{\text{processors}}$

• A parallel algorithm is optimal iff this product is of the same order as the best known sequential time.


Slide 30

Metrics

A measure of relative performance between a multiprocessor system and a single processor system is the speed-up $S(p)$, defined as follows:

$S(p) = \frac{\text{Execution time using a single processor system}}{\text{Execution time using a multiprocessor with } p \text{ processors}}$

$S(p) = \frac{T_1}{T_p}$

$\text{Efficiency} = \frac{S(p)}{p}$

$\text{Cost} = p \times T_p$
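The three metrics are simple ratios and products, so they are easy to compute from measured times. A minimal sketch follows; the function names and the sample timings (100 time units sequentially, 25 on 8 processors) are made up for illustration.

```python
def speedup(t1, tp):
    """S(p) = T1 / Tp."""
    return t1 / tp


def efficiency(t1, tp, p):
    """Efficiency = S(p) / p."""
    return speedup(t1, tp) / p


def cost(tp, p):
    """Cost = p * Tp."""
    return p * tp


# Hypothetical measurements: 100 time units on one processor, 25 on 8 processors.
t1, tp, p = 100.0, 25.0, 8
print(speedup(t1, tp))        # 4.0
print(efficiency(t1, tp, p))  # 0.5
print(cost(tp, p))            # 200.0
```

In this made-up case the cost (200) is twice the sequential time (100), which matches the efficiency of 50%.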


Slide 31

Metrics

• Parallel algorithm is cost-optimal: parallel cost = sequential time

$C_p = T_1$

$E_p = 100\%$

• Critical when down-scaling: parallel implementation may become slower than sequential

$T_1 = n^3$

$T_p = n^{2.5}$ when $p = n^2$

$C_p = n^{4.5}$
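The down-scaling example can be checked with concrete numbers. The sketch below assumes the usual simulation argument, which the slide does not spell out: when only q < p processors are available, each one takes over the work of ceil(p/q) of the original processors, so the running time grows by that factor. The value n = 16 and the name `downscaled_time` are chosen only for illustration.

```python
import math


def downscaled_time(tp, p, q):
    """Time on q < p processors when each simulates ceil(p / q) of the original p."""
    return tp * math.ceil(p / q)


n = 16
t1 = n ** 3                  # sequential time n^3 = 4096
p = n ** 2                   # p = n^2 = 256 processors
tp = n ** 2 * math.isqrt(n)  # n^2.5 = 1024 (exact because n is a perfect square)
cp = p * tp                  # cost n^4.5 = 262144

print(cp // t1)                     # the cost exceeds T1 by a factor of n^1.5 = 64
print(downscaled_time(tp, p, 16))   # 16384: slower than the sequential 4096
print(downscaled_time(tp, p, 64))   # 4096: only breaks even at q = n^1.5 processors
```

With fewer than n^1.5 processors, the non-cost-optimal parallel algorithm is slower than the sequential one in this model.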


Slide 32

Amdahl’s Law

• $f$ = fraction of the problem that’s inherently sequential
  $(1 - f)$ = fraction that’s parallel

• Parallel time $T_p$ (with the sequential time normalized to 1):

$T_p = f + \frac{1 - f}{p}$

• Speedup with $p$ processors:

$S_p = \frac{1}{f + \frac{1 - f}{p}}$
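Both formulas translate directly into code. A minimal sketch, assuming the sequential time is normalized to 1 as in the formulas above (the function names are illustrative):

```python
def amdahl_time(f, p):
    """Parallel time T_p = f + (1 - f) / p, with the sequential time normalized to 1."""
    return f + (1.0 - f) / p


def amdahl_speedup(f, p):
    """Speedup S_p = 1 / (f + (1 - f) / p)."""
    return 1.0 / amdahl_time(f, p)


# Half of the work is sequential: two processors give far less than a 2x speedup.
print(amdahl_time(0.5, 2))     # 0.75
print(amdahl_speedup(0.5, 2))  # about 1.33
```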


Slide 33

What kind of speed-up may be achieved?

• Part f is computed by a single processor
• Part (1-f) is computed by p processors, p>1

Basic observation: Increasing p we cannot speed-up part f.


Slide 34

Amdahl’s Law

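• Upper bound on speedup ($p = \infty$):

$S_p = \frac{1}{f + \frac{1 - f}{p}}$, where the term $\frac{1 - f}{p}$ converges to 0, so

$S_\infty = \frac{1}{f}$

• Example: $f = 2\%$, $S = 1 / 0.02 = 50$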
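The convergence towards the bound can be observed numerically. The sketch below takes the slide's example value f = 2% and prints the speedup for growing processor counts; the chosen values of p are arbitrary.

```python
def speedup(f, p):
    """Amdahl speedup S_p = 1 / (f + (1 - f) / p)."""
    return 1.0 / (f + (1.0 - f) / p)


f = 0.02            # 2% of the work is inherently sequential
bound = 1.0 / f     # upper bound on the speedup, about 50

for p in (1, 10, 100, 1_000, 10_000, 1_000_000):
    print(f"p = {p:>9}: speedup = {speedup(f, p):.3f} (bound {bound:.0f})")
```

The speedup grows with p but stays below 1/f, and most of the gain is already reached at moderate processor counts.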


Slide 35

The main open question

• The basic parallel complexity class is NC.
• NC is the class of problems computable in poly-logarithmic time ($\log^c n$, for a constant $c$) using a polynomial number of processors.
• P is the class of problems computable sequentially in polynomial time.

The main open question in parallel computation is

NC = P ?
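A standard example of a problem in NC is summing n numbers: with about n/2 processors, a tree-shaped reduction finishes in about log2(n) rounds. The following sketch only simulates the rounds sequentially (it does not run in parallel); the function name and return values are chosen for illustration.

```python
def parallel_sum(values):
    """Simulate a tree reduction.

    Returns (total, rounds, max_processors), where `rounds` is the number of
    parallel steps and `max_processors` the largest number of additions that
    would run simultaneously in one round.
    """
    level = list(values)
    rounds = 0
    max_processors = 0
    while len(level) > 1:
        pairs = len(level) // 2
        # On a parallel machine, each of these additions gets its own processor.
        next_level = [level[2 * i] + level[2 * i + 1] for i in range(pairs)]
        if len(level) % 2:
            next_level.append(level[-1])  # the odd element is carried over unchanged
        level = next_level
        rounds += 1
        max_processors = max(max_processors, pairs)
    total = level[0] if level else 0
    return total, rounds, max_processors


print(parallel_sum(range(1, 17)))  # (136, 4, 8): 16 numbers, 4 rounds, 8 processors
```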