sorting (chapter 9) - people.cs.aau.dkadavid/teaching/mvp-08/12-mvp08.pdf · 5 07-04-2008 alexandre...

41
1 Sorting (Chapter 9) Alexandre David 1.2.05

Upload: others

Post on 22-May-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

1

Sorting (Chapter 9)

Alexandre David1.2.05

Page 2: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

2

07-04-2008 Alexandre David, MVP'08 2

Sorting

Arrange an unordered collection ofelements into monotonically increasing(or decreasing) order.Let S = <a1,a2,…,an>.Sort S into S’ = <a1’,a2’,…,an’> such that

ai’ ≤ aj’ for 1 ≤ i ≤ j ≤ nand S’ is a permutation of S.

Problem

The elements to sort (actually used for comparisons) are also called the keys.

Page 3: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

3

07-04-2008 Alexandre David, MVP'08 3

Recall on Comparison Based Sorting Algorithms

Bubble sortSelection sortInsertion sort

Quick sortMerge sortHeap sort

O(n2 )

Θ(n2 )

Ω(n)

Θ(n logn)Ω(n logn)

You should know these complexities from a previous course on algorithms.

Page 4: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

4

07-04-2008 Alexandre David, MVP'08 4

Characteristics of Sorting AlgorithmsIn-place sorting: No need for additional memory (or only constant size).Stable sorting: Ordered elements keep their original relative position.Internal sorting: Elements fit in process memory.External sorting: Elements are on auxiliary storage.

We assume internal sorting is possible.

Page 5: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

5

07-04-2008 Alexandre David, MVP'08 5

Fundamental DistinctionComparison based sorting:

Compare-exchange of pairs of elements.Lower bound is Ω(n logn) (proof based on decision trees).Merge & heap-sort are optimal.

Non-comparison based sorting:Use information on the element to sort.Lower bound is Ω(n).Counting & radix-sort are optimal.

We assume comparison based sorting is used.

Page 6: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

6

07-04-2008 Alexandre David, MVP'08 6

Issues in Parallel SortingWhere to store input & output?

One process or distributed?Enumeration of processes used to distribute output.

How to compare?How many elements per process?As many processes as element ⇒ poor performance because of inter-process communication.

Page 7: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

7

07-04-2008 Alexandre David, MVP'08 7

Parallel Compare-Exchange

Communication cost: ts+tw.Comparison cost much cheaper ⇒ communicationtime dominates.

Page 8: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

8

07-04-2008 Alexandre David, MVP'08 8

Blocks of Elements Per Process

P0 P1 Pp-1…

n elements

n/p elements per process

Blocks: A0 ≤ A1 ≤ … ≤ Ap-1

Page 9: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

9

07-04-2008 Alexandre David, MVP'08 9

Compare-Split

Exchange: Θ(ts+twn/p)

Merge: Θ(n/p) Split: O(n/p)

For large blocks: Θ(n/p)

Page 10: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

10

07-04-2008 Alexandre David, MVP'08 10

Sorting NetworksMostly of theoretical interest.Key idea: Perform many comparisons in parallel.Key elements:

Comparators: 2 inputs, 2 outputs.Network architecture: Comparators arranged in columns, each performing a permutation.Speed proportional to the depth.

Page 11: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

11

07-04-2008 Alexandre David, MVP'08 11

Comparators

Page 12: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

12

07-04-2008 Alexandre David, MVP'08 12

Sorting Networks

Page 13: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

13

07-04-2008 Alexandre David, MVP'08 13

Bitonic Sequence

A bitonic sequence is a sequence ofelements <a0,a1,…,an> s.t.

1. ∃i, 0 ≤ i ≤ n-1 s.t. <a0,…,ai> ismonotonically increasing and<ai+1,…,an-1> is monotonicallydecreasing,

2.or there is a cyclic shift of indicesso that 1) is satisfied.

Definition

Example: <1,2,4,7,6,0> & <8,9,2,1,0,4> are bitonic sequences.

Page 14: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

14

07-04-2008 Alexandre David, MVP'08 14

Bitonic SortRearrange a bitonic sequence to be sorted.Divide & conquer type of algorithm (similar to quicksort) using bitonic splits.

Sorting a bitonic sequence using bitonic splits = bitonic merge.But we need a bitonic sequence…

Page 15: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

15

07-04-2008 Alexandre David, MVP'08 15

Bitonic Split

<a0,a1,…,an/2-1,an/2,an/2+1,…,an-1>

s1 = <mina0,an/2,mina1,an/2+1,…,minan/2-1,an-1>bi

s2 = <maxa0,an/2,maxa1,an/2+1,…,maxan/2-1,an-1>bi’

s2s1

s1 ≤ s2s1 & s2 bitonic!

And in fact the procedure works even if the original sequence needs a cyclic shift to look like this particular case.

Page 16: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

16

07-04-2008 Alexandre David, MVP'08 16

Bitonic Merging Network logn stagesn/2 com

parators

⊕BM[n]

Cost: Θ(logn) obviously.

Page 17: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

17

07-04-2008 Alexandre David, MVP'08 17

Bitonic SortUse the bitonic network to merge bitonic sequences of increasing length… starting from 2, etc.Bitonic network is a component.

Page 18: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

18

07-04-2008 Alexandre David, MVP'08 18

Bitonic Sortlogn stages

Cost: O(log2n).Simulated on a serial computer: O(n log2n).

Not cost optimal compared to the optimal serial algorithm.

Page 19: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

19

07-04-2008 Alexandre David, MVP'08 19

Mapping to Hypercubes & Mesh – IdeaCommunication intensive, so special care for the mapping.How are the input wires paired?

Pairs have their labels differing by only one bit ⇒ mapping to hypercube straightforward.For a mesh lower connectivity, several solutions but worse than the hypercubeTP=Θ(log2n)+Θ(√n) for 1 element/process.Block of elements: sort locally (n/p logn/p) & use bitonic merge ⇒ cost optimal.

But not efficient & not scalablebecause the sequential algorithmis suboptimal.

Hypercube: Neighbors differ with each other by one bit.

Page 20: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

20

07-04-2008 Alexandre David, MVP'08 20

Bubble Sort

Difficult to parallelize as it is because it is inherently sequential.

procedure BUBBLE_SORT(n)begin

for i := n-1 downto 1 dofor j := 1 to i do

compare_exchange(aj,aj+1);end

Θ(n2 )

It is difficult to sort n elements in time logn using n processes (cost optimal w.r.t. the best serial algorithm in n logn) but it is easy to parallelize other (less efficient) algorithms.

Page 21: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

21

07-04-2008 Alexandre David, MVP'08 21

Odd-Even Transposition Sort

(a1,a2),(a3,a4)…

(a2,a3),(a4,a5)…

Θ(n2 )

Page 22: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

22

07-04-2008 Alexandre David, MVP'08 22

Page 23: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

23

07-04-2008 Alexandre David, MVP'08 23

Odd-Even Transposition SortEasy to parallelize!Θ(n) if 1 process/element.Not cost optimal but use fewer processes, an optimal local sort, and compare-splits:

( ) ( )nnpn

pnTP Θ+Θ+⎟⎟

⎞⎜⎜⎝

⎛Θ= log

local sort (optimal) + comparisons + communicationCost optimal for p = O(logn)but not scalable (few processes).

Write speedup & efficiency to find the bound on p but you can also see it with TP.

Page 24: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

24

07-04-2008 Alexandre David, MVP'08 24

Odd-Even Transposition SortParallel formulation cost-optimal forp=O( log n).Isoefficiency function: W=Θ(p2p). Exponential(p) ⇒ poorly scalable.

Page 25: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

25

07-04-2008 Alexandre David, MVP'08 25

Improvement: Shellsort2 phases:

Move elements on longer distances.Odd-even transposition but stop when no change.

Idea: Put quickly elements near their final position to reduce the number of iterations of odd-even transposition.

Page 26: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

26

07-04-2008 Alexandre David, MVP'08 26

2 3

1 3 7 62

Page 27: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

27

07-04-2008 Alexandre David, MVP'08 27

QuicksortAverage complexity: O(n logn).

But very efficient in practice.Average “robust”.Low overhead and very simple.

Divide & conquer algorithm:Partition A[q..r] into A[q..s] ≤ A[s+1..r].Recursively sort sub-arrays.Subtlety: How to partition?

Page 28: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

28

07-04-2008 Alexandre David, MVP'08 28

2 1 5 8 4 3 73

q r

3 5

4 752 13 3 87 83

4 52 1 3 7 83 51 73

42 3 851 73 4 5

Hoare partitioning is better. Check in your algorithm course.

Page 29: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

29

07-04-2008 Alexandre David, MVP'08 29

BUG

Page 30: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

30

07-04-2008 Alexandre David, MVP'08 30

Parallel QuicksortSimple version:

Recursive decomposition with one process per recursive call.Not cost optimal: Lower bound = n (initial partitioning).Best we can do: Use O(logn) processes.Need to parallelize the partitioning step.

Page 31: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

31

07-04-2008 Alexandre David, MVP'08 31

Parallel Quicksort for CRCW PRAMSee execution of quicksort as constructing a binary tree.

33,2,1 7,4,5,8

3 71,2 5,4 8

1

2

5

4

8

Page 32: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

32

07-04-2008 Alexandre David, MVP'08 32

BUG

Text & algorithm 9.5:A[p..s] ≤ x < A[s+1..q].Figures & algorithm 9.6:A[p..s] < x ≤ A[s+1..q].

Page 33: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

33

07-04-2008 Alexandre David, MVP'08 33

only one succeeds

A[i]≤A[parenti]

This algorithm does not correspond exactly to the serial version. Time for partitioning: O(1).

Page 34: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

34

07-04-2008 Alexandre David, MVP'08 34

13 2 5 8 4 3 71 2 3 4 5 6 7 8

root=1

1 1 1 1 1 1 1 1

31

2 6

22 2 666

2 6

Page 35: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

35

07-04-2008 Alexandre David, MVP'08 35

13 2 5 8 4 3 71 2 3 4 5 6 7 8

1 1 1 1 1 1 1 1

31

2 6

2 2 666

2 6

2 4

3 7 5

3 7 5

35 5

1 3 8

4

8

5

7

Each step: Θ(1). Average height: Θ(logn).This is cost-optimal – but it is only a model.

Page 36: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

36

07-04-2008 Alexandre David, MVP'08 36

Parallel Quicksort – Shared Address (Realistic)Same idea but remove contention:

Choose the pivot & broadcast it.Each process rearranges its block of elements locally.Global rearrangement of the blocks.When the blocks reach a certain size, local sort is used.

Page 37: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

37

07-04-2008 Alexandre David, MVP'08 37

Page 38: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

38

07-04-2008 Alexandre David, MVP'08 38

Bug: Figdoesn’tmatch thetext.

Page 39: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

39

07-04-2008 Alexandre David, MVP'08 39

CostScalability determined by time to broadcast the pivot & compute the prefix-sums.Cost optimal.

Page 40: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

40

07-04-2008 Alexandre David, MVP'08 40

MPI Formulation of QuicksortArrays must be explicitly distributed.Two phases:

Local partition smaller/larger than pivot.Determine who will sort the sub-arrays.

And send the sub-arrays to the right process.

Page 41: Sorting (Chapter 9) - people.cs.aau.dkadavid/teaching/MVP-08/12-MVP08.pdf · 5 07-04-2008 Alexandre David, MVP'08 5 Fundamental Distinction Comparison based sorting: Compare-exchangeof

41

07-04-2008 Alexandre David, MVP'08 41

Final WordPivot selection is very important.Affects performance.Bad pivot means idle processes.