chapter9 sorting(1)

Upload: kasia

Post on 22-Feb-2016

DESCRIPTION

Chapter 9: Sorting (1). Outline: introduction, sorting networks, bubble sort and its variants. Introduction: sorting is one of the most common operations performed by a computer; it can be internal or external, and comparison-based (Θ(n log n)) or non-comparison-based (Θ(n)).

TRANSCRIPT

Page 1: Chapter9 Sorting(1)

CHAPTER9 SORTING(1)

Page 2: Chapter9 Sorting(1)

Outline

• Introduction
• Sorting networks
• Bubble sort and its variants

Page 3: Chapter9 Sorting(1)

Introduction

• Sorting is one of the most common operations performed by a computer
• Internal or external
• Comparison-based: Θ(n log n) at best; non-comparison-based: Θ(n) under assumptions on the keys

Page 4: Chapter9 Sorting(1)

Background

• Where are the input and output sequences stored?
○ Stored on one process
○ Distributed among the processes (useful when sorting is an intermediate step)
• What is the order of the output sequence among the processes?
○ Global enumeration of the processes

Page 5: Chapter9 Sorting(1)

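How comparisons are performed

• Compare-exchange is not as simple in parallel sorting algorithms as in sequential ones, because the two elements may reside on different processes
• One element per process
○ Each compare-exchange costs ts + tw; since ts >> tw, communication dominates => poor performance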

Page 6: Chapter9 Sorting(1)

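How comparisons are performed (cont'd)

• More than one element per process
○ Each process holds n/p elements; in the sorted result, every element on Pi is <= every element on Pj for i < j
○ Compare-split replaces compare-exchange: communication time ts + tw·n/p => Θ(n/p) per step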
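The compare-split step can be sketched as follows. This is a minimal single-machine illustration, assuming both blocks are already sorted and of equal size; the function name `compare_split` and the sample blocks are hypothetical, and a real implementation would exchange the blocks over the network first.

```python
import heapq

def compare_split(block_i, block_j):
    """Merge two sorted blocks; Pi keeps the smaller half, Pj the larger half."""
    merged = list(heapq.merge(block_i, block_j))  # linear-time merge, Θ(n/p)
    k = len(block_i)
    return merged[:k], merged[k:]

# Two processes, each holding a sorted block of n/p = 5 elements.
low, high = compare_split([1, 6, 8, 11, 13], [2, 7, 9, 10, 12])
print(low, high)
```

After the step, every element kept by Pi is no larger than any element kept by Pj, which is the Ai <= Aj condition on the slide.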

Page 7: Chapter9 Sorting(1)

Outline

• Introduction
• Sorting networks
○ Bitonic sort
○ Mapping bitonic sort to hypercube and mesh
• Bubble sort and its variants

Page 8: Chapter9 Sorting(1)

Sorting networks

• Sort n elements in Θ(log² n) time
• Key component: comparator
○ Increasing comparator
○ Decreasing comparator

Page 9: Chapter9 Sorting(1)

A typical sorting network

• Depth: the number of columns it contains
○ The time the network takes to sort is proportional to its depth

Page 10: Chapter9 Sorting(1)

Bitonic sort: Θ(log² n)

• Bitonic sequence <a0, a1, …, an-1>
○ Monotonically increasing, then monotonically decreasing, or
○ There exists a cyclic shift of indices so that the above is satisfied
○ E.g.: 8 9 2 1 0 4 5 7
• How to rearrange a bitonic sequence to obtain a monotonic sequence?
○ Let s = <a0, a1, …, an-1> be a bitonic sequence, split into two subsequences s1 and s2
○ s1 and s2 are both bitonic
○ Every element of s1 is smaller than every element of s2
○ Bitonic split, applied recursively: bitonic merge => bitonic-merging network
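A minimal sketch of the bitonic split and merge, assuming the usual definition in which s1 collects min(a_i, a_{i+n/2}) and s2 collects max(a_i, a_{i+n/2}), and assuming n is a power of two. The function names are illustrative only.

```python
def bitonic_split(s):
    """Split a bitonic sequence into s1 (smaller half) and s2 (larger half)."""
    half = len(s) // 2
    s1 = [min(s[i], s[i + half]) for i in range(half)]
    s2 = [max(s[i], s[i + half]) for i in range(half)]
    return s1, s2

def bitonic_merge(s):
    """Sort a bitonic sequence by applying bitonic splits recursively."""
    if len(s) <= 1:
        return list(s)
    s1, s2 = bitonic_split(s)
    return bitonic_merge(s1) + bitonic_merge(s2)

example = [8, 9, 2, 1, 0, 4, 5, 7]   # the bitonic sequence from the slide
print(bitonic_split(example))
print(bitonic_merge(example))
```

Each level of the recursion corresponds to one column of comparators, so merging n elements takes log n columns.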

Page 11: Chapter9 Sorting(1)

Example of bitonic merging

Page 12: Chapter9 Sorting(1)

Bitonic merging network

• log n columns

Page 13: Chapter9 Sorting(1)

Sorting n unordered elements

• Bitonic sort, bitonic-sorting network
• Depth: d(n) = d(n/2) + log n => d(n) = Θ(log² n)
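The recursion behind the bitonic-sorting network can be sketched as below: sort one half ascending and the other descending to form a bitonic sequence, then merge it. This is a sequential illustration assuming n is a power of two; `depth` simply evaluates the recurrence d(n) = d(n/2) + log n with d(1) = 0.

```python
def bitonic_merge(seq, ascending):
    """Sort a bitonic sequence in the requested direction."""
    if len(seq) <= 1:
        return list(seq)
    half = len(seq) // 2
    lo = [min(seq[i], seq[i + half]) for i in range(half)]
    hi = [max(seq[i], seq[i + half]) for i in range(half)]
    if ascending:
        return bitonic_merge(lo, True) + bitonic_merge(hi, True)
    return bitonic_merge(hi, False) + bitonic_merge(lo, False)

def bitonic_sort(seq, ascending=True):
    """Sort an arbitrary sequence whose length is a power of two."""
    if len(seq) <= 1:
        return list(seq)
    half = len(seq) // 2
    first = bitonic_sort(seq[:half], True)     # increasing half
    second = bitonic_sort(seq[half:], False)   # decreasing half
    return bitonic_merge(first + second, ascending)

def depth(n):
    """Depth of the network: d(n) = d(n/2) + log2(n), d(1) = 0."""
    return 0 if n == 1 else depth(n // 2) + (n.bit_length() - 1)

data = [10, 20, 5, 9, 3, 8, 12, 14, 90, 0, 60, 40, 23, 35, 95, 18]
print(bitonic_sort(data))
print(depth(16))   # (log n)(log n + 1)/2 for n = 16
```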

Page 14: Chapter9 Sorting(1)

The first three stages

Page 15: Chapter9 Sorting(1)

How to map bitonic sort to a hypercube?

• One element per process
• How to map the bitonic sort algorithm onto a general-purpose parallel computer?
○ Each process corresponds to a wire
○ The compare-exchange function is performed by a pair of processes
○ Bitonic sort is communication intensive => the topology of the interconnection network must be considered
○ Poor mapping => elements travel a long distance before each comparison, degrading performance
• Observation:
○ Communication happens only between pairs of wires whose labels differ in exactly one bit
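The one-bit observation can be sketched as an iterative bitonic sort in which the process labelled `label` always exchanges with `label XOR 2^j`, which is a direct neighbour in a hypercube. This is a sequential simulation assuming n = 2^d elements, one per process; the function name is illustrative.

```python
def hypercube_bitonic_sort(values):
    """Simulate bitonic sort with one element per hypercube process."""
    a = list(values)
    n = len(a)                      # assumed to be a power of two
    d = n.bit_length() - 1
    for i in range(d):              # stage i merges sequences of size 2^(i+1)
        for j in range(i, -1, -1):  # step j: exchange along dimension j
            for label in range(n):
                partner = label ^ (1 << j)   # differs from label in bit j only
                if label < partner:
                    # Bit i+1 of the label selects the merge direction.
                    ascending = ((label >> (i + 1)) & 1) == 0
                    if (a[label] > a[partner]) == ascending:
                        a[label], a[partner] = a[partner], a[label]
    return a

print(hypercube_bitonic_sort([3, 7, 4, 8, 6, 2, 1, 5]))
```

The inner loop over `label` is what runs concurrently on a real hypercube; each of the Θ(log² n) steps then costs one neighbour-to-neighbour exchange.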

Page 16: Chapter9 Sorting(1)

The last stage of bitonic sort

Page 17: Chapter9 Sorting(1)

Communication characteristics

Page 18: Chapter9 Sorting(1)

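Bitonic sort algorithm on 2^d processors

• TP = Θ(log² n)
• Cost-optimal with respect to sequential bitonic sort, Θ(n log² n), but not with respect to the best sequential sort, Θ(n log n)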

Page 19: Chapter9 Sorting(1)

Mapping Bitonic sort to a mesh

Page 20: Chapter9 Sorting(1)

The last stage of the bitonic sort

Page 21: Chapter9 Sorting(1)

A block of elements per process

• Each process holds n/p elements
• S1: think of each process as consisting of n/p smaller processes
○ Poor parallel implementation
• S2: replace compare-exchange with compare-split: Θ(n/p) computation + Θ(n/p) communication
○ The difference: in S2 each block is first sorted locally
○ Applies to both hypercube and mesh

Page 22: Chapter9 Sorting(1)

Performance on different architectures

• Neither very efficient nor very scalable, since the sequential bitonic sort algorithm is suboptimal

Page 23: Chapter9 Sorting(1)

Outline

• Introduction
• Sorting networks
• Bubble sort and its variants

Page 24: Chapter9 Sorting(1)

Bubble sort

• Θ(n²)
• Inherently sequential

Page 25: Chapter9 Sorting(1)

Odd-even transposition

• n phases, each with Θ(n) comparisons
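A minimal sequential sketch of odd-even transposition sort: the phases alternate between the pairs starting at even positions and the pairs starting at odd positions, and n phases are enough to sort n elements. Which of the two pairings is called "odd" depends on the indexing base; 0-based indices are assumed here.

```python
def odd_even_transposition_sort(values):
    """Sort in n phases of independent compare-exchanges on neighbours."""
    a = list(values)
    n = len(a)
    for phase in range(n):
        start = phase % 2                # alternate between the two pairings
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:          # compare-exchange of neighbours
                a[i], a[i + 1] = a[i + 1], a[i]
    return a

print(odd_even_transposition_sort([3, 2, 3, 8, 5, 6, 4, 1]))
```

All compare-exchanges inside one phase touch disjoint pairs, which is what makes each phase Θ(1) when every element sits on its own process.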

Page 26: Chapter9 Sorting(1)

Odd-even transposition

Page 27: Chapter9 Sorting(1)

Parallel formulation

• Θ(n) with one element per process: n phases of Θ(1) each

Page 28: Chapter9 Sorting(1)

Shellsort

• Drawback of odd-even sort
○ A sequence with only a few elements out of order still needs Θ(n²) comparisons to sort
• Idea
○ Add a preprocessing phase that moves elements across long distances
○ This reduces the number of odd and even phases needed afterwards
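The two-phase idea can be sketched as below. This is a simplified illustration with one element per position, not the block-based formulation: phase 1 pairs each position with its mirror position inside groups that halve in size every round, and phase 2 runs odd-even transposition until an even phase and an odd phase in a row make no exchange. The pairing scheme and the function name are assumptions made for the example.

```python
def shellsort_sketch(values):
    """Long-distance compare-exchanges first, then odd-even phases until stable."""
    a = list(values)
    n = len(a)
    # Phase 1: compare-exchange mirror positions within shrinking groups.
    group = n
    while group > 1:
        for start in range(0, n, group):
            for k in range(group // 2):
                i, j = start + k, start + group - 1 - k
                if j < n and a[i] > a[j]:
                    a[i], a[j] = a[j], a[i]
        group //= 2
    # Phase 2: odd-even transposition, stopping after two quiet phases in a row.
    phases, quiet = 0, 0
    while quiet < 2:
        swapped = False
        for i in range(phases % 2, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        quiet = 0 if swapped else quiet + 1
        phases += 1
    return a, phases

result, phases_used = shellsort_sketch([8, 7, 6, 5, 4, 3, 2, 1])
print(result, phases_used)
```

Correctness rests on phase 2 alone; phase 1 only aims to leave fewer phases for it to run.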

Page 29: Chapter9 Sorting(1)

Shellsort

Page 30: Chapter9 Sorting(1)

Conclusion

• Sorting networks
○ Bitonic network
○ Mapping to hypercube and mesh
• Bubble sort and its variants
○ Odd-even sort
○ Shellsort

Page 31: Chapter9 Sorting(1)

CHAPTER9 SORTING(2)

Page 32: Chapter9 Sorting(1)

Outline

• Issues in sorting
• Sorting networks
• Bubble sort and its variants
• Quicksort
• Bucket and sample sort
• Other sorting algorithms

Page 33: Chapter9 Sorting(1)

Quicksort

• Features
○ Simple, low overhead
○ Θ(n log n) on average, Θ(n²) in the worst case
• Idea
○ Choose a pivot (how?)
○ Partition into two parts, Θ(n)
○ Recursively solve the two sub-problems
• Complexity
○ Worst case: T(n) = T(n-1) + Θ(n) => Θ(n²)
○ Balanced case: T(n) = 2T(n/2) + Θ(n) => Θ(n log n)

Page 34: Chapter9 Sorting(1)

The sequential algorithm
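A minimal sketch of sequential quicksort, assuming the first element of each subarray is taken as the pivot; other pivot rules fit the same structure.

```python
def quicksort(a, lo=0, hi=None):
    """Sort list a in place between indices lo and hi (inclusive)."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        pivot = a[lo]                   # pivot choice: first element (assumption)
        s = lo                          # a[lo+1..s] holds elements <= pivot
        for i in range(lo + 1, hi + 1):
            if a[i] <= pivot:
                s += 1
                a[s], a[i] = a[i], a[s]
        a[lo], a[s] = a[s], a[lo]       # put the pivot in its final position
        quicksort(a, lo, s - 1)         # elements <= pivot
        quicksort(a, s + 1, hi)         # elements > pivot
    return a

print(quicksort([3, 2, 1, 5, 8, 4, 3, 7]))
```

With this pivot rule an already sorted input triggers the T(n) = T(n-1) + Θ(n) case.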

Page 35: Chapter9 Sorting(1)

Parallelizing quicksort

• Solution 1
○ Recursive decomposition
○ Drawback: the first partition is handled by a single process, so the run time is Ω(n) and, with n processes, the process-time product is Ω(n²)
• Solution 2
○ Idea: perform the partition in parallel
○ An array of size n could be partitioned into two smaller arrays in Θ(1) time by using Θ(n) processes
○ How? It depends on the model: CRCW PRAM, shared-address-space, message-passing

Page 36: Chapter9 Sorting(1)

Parallel formulation for CRCW PRAM (cost-optimal)

• Assumptions
○ n elements, n processes
○ Write conflicts are resolved arbitrarily
• Executing quicksort can be visualized as constructing a binary tree

Page 37: Chapter9 Sorting(1)

Example

Page 38: Chapter9 Sorting(1)

Algorithm

procedure BUILD_TREE (A[1...n])
begin
  for each process i do
  begin
    root := i;
    parent_i := root;
    leftchild[i] := rightchild[i] := n + 1;
  end for
  repeat for each process i ≠ root do
  begin
    if (A[i] < A[parent_i]) or (A[i] = A[parent_i] and i < parent_i) then
    begin
      leftchild[parent_i] := i;
      if i = leftchild[parent_i] then exit
      else parent_i := leftchild[parent_i];
    end if
    else
    begin
      rightchild[parent_i] := i;
      if i = rightchild[parent_i] then exit
      else parent_i := rightchild[parent_i];
    end else
  end repeat
end BUILD_TREE

Assuming a balanced tree:
• Each partition step reaches all processes in O(1) time
• Θ(log n) levels × Θ(1) per level = Θ(log n)
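BUILD_TREE can be simulated sequentially as below. This is a sketch, not a PRAM implementation: the "arbitrary" resolution of write conflicts is modelled by a seeded random choice among the competing writers, indices are 0-based, and the sentinel `nil = n` stands in for n + 1. The helper `in_order` is added only to check that the tree encodes the sorted order, and the input is assumed to be non-empty.

```python
import random

def build_tree(A, seed=0):
    """Simulate BUILD_TREE; returns (root, leftchild, rightchild)."""
    rng = random.Random(seed)      # models arbitrary write-conflict resolution
    n = len(A)
    nil = n
    left, right = [nil] * n, [nil] * n
    root = rng.randrange(n)        # all processes write root := i; one write wins
    parent = {i: root for i in range(n) if i != root}
    while parent:                  # one iteration builds one level of the tree
        attempts = {}
        for i, p in parent.items():
            goes_left = A[i] < A[p] or (A[i] == A[p] and i < p)
            attempts.setdefault((p, goes_left), []).append(i)
        for (p, goes_left), writers in attempts.items():
            winner = rng.choice(writers)
            (left if goes_left else right)[p] = winner
            for i in writers:
                if i == winner:
                    del parent[i]          # this process found its place and exits
                else:
                    parent[i] = winner     # the others descend to the new pivot
    return root, left, right

def in_order(node, left, right):
    """In-order traversal of the tree, returning element indices."""
    if node == len(left):
        return []
    return in_order(left[node], left, right) + [node] + in_order(right[node], left, right)

A = [33, 21, 13, 54, 82, 33, 40, 72]
root, left, right = build_tree(A)
sorted_values = [A[i] for i in in_order(root, left, right)]
print(sorted_values)
```

The number of iterations of the `while` loop equals the height of the tree, which is Θ(log n) when the pivots split evenly.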

Page 39: Chapter9 Sorting(1)

Parallel formulation for shared-address-space architecture

• Assumptions
○ n elements, p processes
○ Shared memory
• How to parallelize? Idea of the algorithm
○ Each process is assigned a block
○ A pivot element is selected and broadcast
○ Local rearrangement
○ Global rearrangement => smaller block S, larger block L
○ Blocks are redistributed among the processes
○ How many times? Until the array is broken into p parts

Page 40: Chapter9 Sorting(1)

Example

How to compute the location?

Page 41: Chapter9 Sorting(1)

Example (cont'd)

Page 42: Chapter9 Sorting(1)

How to do global rearrangement?
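One way to sketch the global rearrangement: after the local rearrangement, exclusive prefix sums over the per-process counts give each process the offset at which its smaller part and its larger part start in the shared array. The function name, the block contents and the pivot value below are illustrative assumptions.

```python
from itertools import accumulate

def global_rearrange(blocks, pivot):
    """Return the rearranged array and the size of the smaller part S."""
    # Local rearrangement: each process splits its own block around the pivot.
    smaller = [[x for x in b if x <= pivot] for b in blocks]
    larger = [[x for x in b if x > pivot] for b in blocks]
    s_counts = [len(s) for s in smaller]
    l_counts = [len(l) for l in larger]
    # Exclusive prefix sums: where each process writes its elements.
    s_offsets = [0] + list(accumulate(s_counts))[:-1]
    total_s = sum(s_counts)
    l_offsets = [total_s + off for off in [0] + list(accumulate(l_counts))[:-1]]
    out = [None] * (total_s + sum(l_counts))
    for i in range(len(blocks)):            # independent writes, one per process
        out[s_offsets[i]:s_offsets[i] + s_counts[i]] = smaller[i]
        out[l_offsets[i]:l_offsets[i] + l_counts[i]] = larger[i]
    return out, total_s

blocks = [[7, 13, 18, 2], [17, 1, 14, 20], [6, 10, 15, 9], [3, 16, 19, 4], [11, 12, 5, 8]]
array, split = global_rearrange(blocks, pivot=7)
print(array, split)
```

Because the offsets are disjoint, all processes can copy their elements at the same time without conflicts.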

Page 43: Chapter9 Sorting(1)

Analysis

• Assumption
○ Pivot selection results in balanced partitions
• log p steps, each consisting of
○ Broadcasting the pivot: Θ(log p)
○ Local rearrangement: Θ(n/p)
○ Prefix sum: Θ(log p)
○ Global rearrangement: Θ(n/p)
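Summing the per-step costs over the log p steps gives the overall parallel run time. The local-sort term is an assumption not listed above: it accounts for each process finally sorting its own n/p elements sequentially.

```latex
T_P = \underbrace{\Theta\!\left(\frac{n}{p}\log\frac{n}{p}\right)}_{\text{local sort}}
    + \underbrace{\Theta\!\left(\frac{n}{p}\log p\right)}_{\text{local and global rearrangement}}
    + \underbrace{\Theta\!\left(\log^{2} p\right)}_{\text{pivot broadcast and prefix sums}}
```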

Page 44: Chapter9 Sorting(1)

Parallel formulation for message-passing architecture

• Similar to the shared-address-space formulation
• Difference
○ The array is distributed among the p processes

Page 45: Chapter9 Sorting(1)

Pivot selection

• Random selection
○ Drawback: a bad pivot leads to significant performance degradation
• Median selection
○ Assumption: the initial distribution of elements in each process is uniform

Page 46: Chapter9 Sorting(1)

Outline

• Issues in sorting
• Sorting networks
• Bubble sort and its variants
• Quicksort
• Bucket and sample sort
• Other sorting algorithms

Page 47: Chapter9 Sorting(1)

Bucket sort

• Assumption
○ n elements distributed uniformly over [a, b]
• Idea
○ Divide [a, b] into m equal-sized subintervals (buckets)
○ Place each element into its bucket
○ Sort each bucket
• Θ(n log(n/m)) => Θ(n) when m = Θ(n)
• Compare with quicksort
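A minimal sequential sketch of bucket sort over [a, b] with m equal-sized subintervals. The parameter names follow the slide; the per-bucket sort uses Python's built-in `sorted` as a stand-in for any comparison sort.

```python
def bucket_sort(values, a, b, m):
    """Sort values assumed to lie in [a, b] using m equal-width buckets."""
    buckets = [[] for _ in range(m)]
    width = (b - a) / m
    for x in values:
        idx = min(int((x - a) / width), m - 1)   # x == b falls in the last bucket
        buckets[idx].append(x)
    out = []
    for bucket in buckets:        # each bucket holds about n/m elements if uniform
        out.extend(sorted(bucket))
    return out

print(bucket_sort([29, 3, 72, 44, 100, 0, 58, 91, 17, 63], a=0, b=100, m=5))
```

If the input is far from uniform, most elements land in a few buckets and the cost approaches that of sorting the whole input at once.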

Page 48: Chapter9 Sorting(1)

Parallelization on message-passing architecture

• n elements, p processes => p buckets
• Preliminary idea
○ Distribute n/p elements to each process
○ Assign one subinterval to each process and redistribute the elements
○ Sort locally
○ Drawback: the uniformity assumption is not realistic => performance degradation
• Solution:
○ Sample sort => splitters
○ Guarantees fewer than 2n/m elements per bucket (here m = p)
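A minimal sketch of splitter selection in sample sort, simulated on one machine: each process sorts its block and contributes p-1 evenly spaced samples, the combined sample is sorted, and p-1 global splitters are taken from it. The exact sample positions used here are an assumption, and the code does not check the 2n/m bound; blocks are assumed to be non-empty.

```python
from bisect import bisect_right

def sample_sort(blocks):
    """Return (sorted output, splitters) for p blocks of elements."""
    p = len(blocks)
    local = [sorted(b) for b in blocks]            # local sort on each process
    sample = []
    for b in local:                                # p-1 local samples per process
        sample.extend(b[(k * len(b)) // p] for k in range(1, p))
    sample.sort()                                  # p(p-1) samples in total
    splitters = [sample[k * (p - 1)] for k in range(1, p)]
    buckets = [[] for _ in range(p)]
    for b in local:                                # binary search for the destination
        for x in b:
            buckets[bisect_right(splitters, x)].append(x)
    out = []
    for bucket in buckets:                         # final local sort per process
        out.extend(sorted(bucket))
    return out, splitters

blocks = [[22, 7, 13, 18, 2, 17, 1, 14], [20, 6, 10, 24, 15, 9, 21, 3], [16, 19, 23, 4, 11, 12, 5, 8]]
result, splitters = sample_sort(blocks)
print(splitters)
print(result)
```

Because the splitters come from the data itself, the buckets stay balanced even when the keys are not uniformly distributed.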

Page 49: Chapter9 Sorting(1)

Example

Page 50: Chapter9 Sorting(1)

Analysis

• Distributing elements: n/p per process
• Local sort and sample selection: Θ(p) for the selection
• Sample combining: Θ(p²); sorting the sample: Θ(p² log p); selecting global splitters: Θ(p)
• Partitioning elements: Θ(p log(n/p))
• Redistribution: O(n) + O(p log p)
• Local sorting

Page 51: Chapter9 Sorting(1)

Outline

• Issues in sorting
• Sorting networks
• Bubble sort and its variants
• Quicksort
• Bucket and sample sort
• Other sorting algorithms

Page 52: Chapter9 Sorting(1)

Enumeration sort

• Assumption
○ n² processes, n elements, CRCW PRAM
• Feature
○ Based on the rank of each element
○ Θ(1) time

Page 53: Chapter9 Sorting(1)

Algorithm

procedure ENUM_SORT (n)
begin
  for each process P1,j do
    C[j] := 0;
  for each process Pi,j do
    if (A[i] < A[j]) or (A[i] = A[j] and i < j) then
      C[j] := 1;
    else
      C[j] := 0;
  for each process P1,j do
    A[C[j]] := A[j];
end ENUM_SORT

• Common structures: A[n], C[n]
• Concurrent writes to C[j] are summed, so C[j] ends up holding the rank of A[j]
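A sequential sketch of enumeration sort. The two nested loops stand in for the n² processes Pi,j, and `C[j] += 1` stands in for concurrent writes whose values are summed; indices are 0-based, so the rank is directly the output position.

```python
def enumeration_sort(A):
    """Place every element at the position given by its rank."""
    n = len(A)
    C = [0] * n
    for i in range(n):               # conceptually, all (i, j) pairs run at once
        for j in range(n):
            if A[i] < A[j] or (A[i] == A[j] and i < j):
                C[j] += 1            # summed concurrent write to C[j]
    out = [None] * n
    for j in range(n):
        out[C[j]] = A[j]             # rank = number of elements that precede A[j]
    return out

print(enumeration_sort([4, 1, 3, 1, 2]))
```

Breaking ties by index gives every element a distinct rank, so no two elements are written to the same position.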

Page 54: Chapter9 Sorting(1)

Radix sort

• Assumption
○ n elements, n processes
• Feature
○ Based on the binary representation of the elements
○ Leverages enumeration sorting

Page 55: Chapter9 Sorting(1)

Algorithm

procedure RADIX_SORT (A, r)
begin
  for i := 0 to b/r - 1 do
  begin
    offset := 0;
    for j := 0 to 2^r - 1 do
    begin
      flag := 0;
      if the ith least significant r-bit block of A[Pk] = j then
        flag := 1;
      index := prefix_sum(flag);                 // Θ(log n)
      if flag = 1 then
        rank := offset + index;
      offset := offset + parallel_sum(flag);     // Θ(log n)
    endfor
    each process Pk sends its element A[Pk] to process Prank;   // Θ(n)
  endfor
end RADIX_SORT
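A sequential sketch of the same radix sort. `itertools.accumulate` stands in for the parallel prefix sum and `sum` for the parallel sum; the keys are assumed to be non-negative integers of at most b bits, with b a multiple of r. Ranks are 0-based here, hence the `- 1`.

```python
from itertools import accumulate

def radix_sort(A, b, r):
    """Sort b-bit non-negative integers, r bits per pass."""
    A = list(A)
    n = len(A)
    mask = (1 << r) - 1
    for i in range(b // r):                       # one pass per r-bit block
        rank = [0] * n
        offset = 0
        for j in range(1 << r):                   # one round per digit value
            flags = [1 if ((x >> (i * r)) & mask) == j else 0 for x in A]
            index = list(accumulate(flags))       # inclusive prefix sum
            for k in range(n):
                if flags[k]:
                    rank[k] = offset + index[k] - 1
            offset += sum(flags)                  # elements placed so far
        out = [None] * n
        for k in range(n):                        # each Pk sends A[k] to P_rank
            out[rank[k]] = A[k]
        A = out
    return A

print(radix_sort([170, 45, 75, 90, 2, 802, 24, 66], b=10, r=2))
```

Each pass is stable, since elements with equal digits keep their relative order through the prefix sum, which is what lets later passes build on earlier ones.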

Page 56: Chapter9 Sorting(1)

Conclusion

• Sorting networks
○ Bitonic network, mapping to hypercube and mesh
• Bubble sort and its variants
○ Odd-even sort, shellsort
• Quicksort
○ Parallel formulations on CRCW PRAM, shared-address-space and message-passing architectures
• Bucket and sample sort
• Enumeration and radix sort