stupid columnsort tricks


Post on 31-Dec-2015


Page 1: Stupid Columnsort Tricks

Stupid Columnsort Tricks

Geeta Chaudhry
Tom Cormen

Dartmouth College
Department of Computer Science

Page 2: Stupid Columnsort Tricks

Columnsort

- Sorts N numbers
  - Organized as an r × s mesh
  - Divisibility restriction: s must divide r
  - Height restriction: r ≥ 2s^2
- 8 steps
  1. Sort each column
  2. Transpose
  3. Sort each column
  4. Untranspose
  5. Sort each column
  6. Shift down 1/2 column
  7. Sort each column
  8. Shift up 1/2 column
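The eight steps can be sketched in a few lines of Python. This is a minimal in-memory sketch, not the authors' implementation. It assumes the usual reading of the steps: "transpose" picks values up in column-major order and lays them down in row-major order while keeping the r × s shape, and the half-column shift pads with -infinity and +infinity sentinels. The function name `columnsort` and the list-of-columns layout are illustrative choices.

```python
def columnsort(values, r, s):
    """Sort r*s values laid out column-major in an r x s mesh (sketch)."""
    if len(values) != r * s:
        raise ValueError("need exactly r*s values")
    if r % s != 0 or r < 2 * s * s:
        raise ValueError("requires s | r and r >= 2*s^2")

    def sort_cols(cols):
        return [sorted(c) for c in cols]

    def flatten(cols):  # read the mesh in column-major order
        return [x for c in cols for x in c]

    cols = [list(values[j * r:(j + 1) * r]) for j in range(s)]

    cols = sort_cols(cols)                                    # step 1
    flat = flatten(cols)
    cols = [flat[j::s] for j in range(s)]                     # step 2: transpose
    cols = sort_cols(cols)                                    # step 3
    flat = [cols[j][i] for i in range(r) for j in range(s)]   # read row-major
    cols = [flat[j * r:(j + 1) * r] for j in range(s)]        # step 4: untranspose
    cols = sort_cols(cols)                                    # step 5

    h = r // 2
    neg, pos = float("-inf"), float("inf")
    flat = [neg] * h + flatten(cols) + [pos] * (r - h)        # step 6: shift down
    cols = [flat[j * r:(j + 1) * r] for j in range(s + 1)]
    cols = sort_cols(cols)                                    # step 7
    return flatten(cols)[h:h + r * s]                         # step 8: shift up


print(columnsort([16 - k for k in range(16)], r=8, s=2))
```

The result is returned in column-major order, which is the sorted order columnsort produces.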

Page 3: Stupid Columnsort Tricks

Proof of Correctness

- Columnsort is oblivious
- Use the 0-1 Principle:
  - If an oblivious algorithm sorts all input sets consisting solely of 0s and 1s, then it sorts all input sets with arbitrary values.
- After step 3, the mesh consists of
  - Clean rows of 0s at the top
  - Clean rows of 1s at the bottom
  - ≤ s dirty rows between the clean rows
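The 0-1 Principle also gives a practical test for small oblivious sorts: try every 0-1 input. The sketch below assumes a hypothetical helper `sorts_all_zero_one_inputs` and uses odd-even transposition sort as a stand-in oblivious algorithm. Neither appears in the slides.

```python
from itertools import product


def odd_even_transposition_sort(values, rounds=None):
    """Oblivious sort: which positions are compared never depends on the data."""
    a = list(values)
    n = len(a)
    for rnd in range(n if rounds is None else rounds):
        for i in range(rnd % 2, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a


def sorts_all_zero_one_inputs(sort_fn, n):
    """Return True if sort_fn sorts every length-n sequence of 0s and 1s."""
    return all(sort_fn(bits) == sorted(bits)
               for bits in product((0, 1), repeat=n))


# The full network passes; a network cut short after 2 rounds does not.
print(sorts_all_zero_one_inputs(odd_even_transposition_sort, 6))
print(sorts_all_zero_one_inputs(
    lambda v: odd_even_transposition_sort(v, rounds=2), 6))
```

Checking 2^n inputs is only feasible for small n, which is why the slides argue about 0-1 meshes by hand instead.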

Page 4: Stupid Columnsort Tricks

Proof of Correctness (continued)

- After step 4, the mesh consists of
  - Clean columns of 0s on the left
  - Clean columns of 1s on the right
  - A dirty area of size ≤ s^2 between the clean columns
- r ≥ 2s^2 ==> s^2 ≤ r/2 ==> the dirty area is at most 1/2 a column large

Page 5: Stupid Columnsort Tricks

Proof of Correctness (continued)

- If, entering step 5, the dirty area is at most 1/2 a column large, then steps 5–8 complete the sorting
  - If the dirty area fits in a single column, step 5 cleans it, and steps 6–8 leave the mesh clean
  - If the dirty area spans two columns, then it's in the bottom half of one column and the top half of the next column
    - Step 5 does not change this
    - Step 6 gets the dirty area into one column
    - Step 7 cleans it
    - Step 8 moves all values back to where they belong
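The two-column case can be traced on a small 0-1 mesh. The sketch below assumes r = 8 and s = 3, a dirty area of 4 values straddling columns 0 and 1, and sentinel padding for the half-column shift. The function name `finish_with_steps_5_to_8` is illustrative.

```python
def finish_with_steps_5_to_8(cols):
    """Apply steps 5-8 to a mesh given as a list of s columns of height r."""
    r, s = len(cols[0]), len(cols)
    h = r // 2
    cols = [sorted(c) for c in cols]                              # step 5
    flat = [x for c in cols for x in c]
    flat = [float("-inf")] * h + flat + [float("inf")] * (r - h)  # step 6
    shifted = [sorted(flat[j * r:(j + 1) * r])                    # step 7
               for j in range(s + 1)]
    flat = [x for c in shifted for x in c][h:h + r * s]           # step 8
    return [flat[j * r:(j + 1) * r] for j in range(s)]


mesh = [
    [0, 0, 0, 0, 0, 0, 1, 0],   # dirty in the bottom half of column 0
    [1, 0, 1, 1, 1, 1, 1, 1],   # dirty in the top half of column 1
    [1, 1, 1, 1, 1, 1, 1, 1],   # clean column of 1s
]
for col in finish_with_steps_5_to_8(mesh):
    print(col)
```

After step 6 the dirty values sit together in one shifted column, so the step 7 sort cleans them in a single pass.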

Page 6: Stupid Columnsort Tricks

Removing the Divisibility Restriction

- Step 1: Sort each column
  - Each column has ≤ 1 0→1 transition
    - ≤ s 0→1 transitions
  - There may be a 1→0 transition going from one column to the next
    - ≤ s–1 1→0 transitions
- Step 2: Transpose
  - Within rows, ≤ s 0→1 transitions, ≤ s–1 1→0 transitions

Page 7: Stupid Columnsort Tricks

Divisibility Restriction (continued)

- After step 2, let
  - X = dirty rows with one 0→1 transition, no 1→0
  - Y = dirty rows with one 1→0 transition, no 0→1
  - Z = all other dirty rows (≥ 1 0→1 and ≥ 1 1→0)
  - Number of dirty rows = |X| + |Y| + |Z|
  - All other rows are clean
- Claim: max(|X|, |Y|) + |Z| ≤ s
  - Every row of X, Z contains ≥ 1 0→1 => |X| + |Z| ≤ s
  - Every row of Y, Z contains ≥ 1 1→0 => |Y| + |Z| ≤ s–1
  - max(|X|, |Y|) = |X| => max(|X|, |Y|) + |Z| ≤ s
  - max(|X|, |Y|) = |Y| => max(|X|, |Y|) + |Z| ≤ s–1
  - In either case, max(|X|, |Y|) + |Z| ≤ s
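The claim can be spot-checked on random 0-1 meshes where s does not divide r. The sketch below assumes the same reading of "transpose" as column-major pick-up and row-major lay-down. The function name `classify_rows_after_step_2` and the mesh sizes are illustrative.

```python
import random


def classify_rows_after_step_2(cols):
    """Run steps 1-2 on a 0-1 mesh and return (|X|, |Y|, |Z|)."""
    r, s = len(cols[0]), len(cols)
    flat = [x for c in cols for x in sorted(c)]        # step 1, read column-major
    rows = [flat[i * s:(i + 1) * s] for i in range(r)]  # step 2, lay down row-major
    x = y = z = 0
    for row in rows:
        pairs = list(zip(row, row[1:]))
        up = pairs.count((0, 1))     # 0 -> 1 transitions within the row
        down = pairs.count((1, 0))   # 1 -> 0 transitions within the row
        if up and down:
            z += 1
        elif up:
            x += 1
        elif down:
            y += 1
    return x, y, z


rng = random.Random(1)
r, s = 7, 3                          # s does not divide r
for _ in range(1000):
    mesh = [[rng.randint(0, 1) for _ in range(r)] for _ in range(s)]
    x, y, z = classify_rows_after_step_2(mesh)
    assert max(x, y) + z <= s
print("claim held on all sampled meshes")
```

A random check is evidence, not a proof. The counting argument on the slide is what establishes the bound.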

Page 8: Stupid Columnsort Tricks

Divisibility Restriction (continued)

- After step 3
  - Clean rows of 0s move to the top
  - Clean rows of 1s move to the bottom
  - Pair up the min(|X|, |Y|) pairs of rows with one row in X and other row in Y

[Figure: three example row pairs, before and after the column sort]

                   from X:        from Y:        after (top row)   after (bottom row)
  more 0s than 1s  000000001111   111110000000   000000000000      111110001111
  more 1s than 0s  000111111111   111111110000   000111110000      111111111111
  equal 0s and 1s  000011111111   111100000000   000000000000      111111111111
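The three pairs in the figure can be reproduced directly. Sorting the columns of a two-row 0-1 block puts the smaller value on top in each column. The sketch below treats each pair in isolation, which is a simplification of step 3: the real sort acts on whole columns, not on adjacent row pairs. The function name `sort_columns_of_pair` is illustrative.

```python
def sort_columns_of_pair(x_row, y_row):
    """Column-sort two 0-1 rows given as strings; return (top, bottom)."""
    top = "".join(min(a, b) for a, b in zip(x_row, y_row))
    bottom = "".join(max(a, b) for a, b in zip(x_row, y_row))
    return top, bottom


cases = {
    "more 0s than 1s": ("000000001111", "111110000000"),
    "more 1s than 0s": ("000111111111", "111111110000"),
    "equal 0s and 1s": ("000011111111", "111100000000"),
}
for name, (x_row, y_row) in cases.items():
    print(name, sort_columns_of_pair(x_row, y_row))
```

In every case at least one of the two output rows is clean, which is the observation the next slide builds on.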

Page 9: Stupid Columnsort Tricks

Divisibility Restriction (continued)

- In all cases, ≥ 1 clean row is formed
  - ≥ min(|X|, |Y|) new clean rows are created
- Dirty rows remaining ≤ |X| + |Y| + |Z| – min(|X|, |Y|)
  - |X| + |Y| – min(|X|, |Y|) = max(|X|, |Y|)
  - ==> Dirty rows remaining ≤ max(|X|, |Y|) + |Z| ≤ s
- From here, it's the same as the original proof
  - Dirty area size ≤ s^2 ≤ r/2 (half a column) after step 4
  - Steps 5–8 clean up the dirty area

Page 10: Stupid Columnsort Tricks

Subblock Distribution

- Divide up the mesh into s^(1/2) × s^(1/2) subblocks
  - Each subblock contains s values
- Add two steps between steps 3 and 4
  - Step 3.1: Perform any fixed permutation that moves all values in each subblock into all s columns
  - Step 3.2: Sort each column
- The resulting algorithm is subblock columnsort
  - Works with relaxed height restriction of r ≥ 4s^(3/2) (assuming the divisibility restriction)
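Step 3.1 allows any fixed permutation with the stated property. One hypothetical choice, not necessarily the one the authors use, is sketched below. It assumes s is a perfect square with q = s^(1/2) and that q divides r, and it sends the value at local position (i, j) of subblock (a, b) to row a*q + b and column i*q + j.

```python
import math


def subblock_permutation(r, s):
    """Map each (row, col) to a target (row, col); one possible step 3.1."""
    q = math.isqrt(s)
    if q * q != s or r % q != 0:
        raise ValueError("sketch assumes s is a perfect square and sqrt(s) | r")
    mapping = {}
    for row in range(r):
        for col in range(s):
            a, i = divmod(row, q)   # subblock row index, local row
            b, j = divmod(col, q)   # subblock column index, local column
            mapping[(row, col)] = (a * q + b, i * q + j)
    return mapping


perm = subblock_permutation(r=8, s=4)
# Target columns reached by the subblock in the top-left corner.
print(sorted(perm[(i, j)][1] for i in range(2) for j in range(2)))
```

Because the target column depends only on the local position (i, j), the s values of each subblock land in s distinct columns, and each column receives exactly one value from each of the r subblocks.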

Page 11: Stupid Columnsort Tricks

Subblock Columnsort Correctness

- After step 3, the line dividing 0s and 1s goes left to right and bottom to top (southwest to northeast)
  - Never turns back to the left
  - Never turns back toward the bottom
- To show, suffices to show that after step 2, # of 0s in each column ≥ # of 0s in column to its right
- How could a column have # of 0s < # of 0s in column to its right?
  - There would have to be a 1→0 transition in a row
  - But divisibility restriction => there are no 1→0 transitions in rows after step 2

Page 12: Stupid Columnsort Tricks

Subblock Columnsort Correctness (continued)

- After step 3.1, the number of 0s in any two columns differs by ≤ 2s^(1/2)
- The dirty area is confined to an area s rows high and s columns wide
  - In subblocks, s^(1/2) × s^(1/2)
- Dividing line passes through ≤ s^(1/2) + 1 subblocks vertically and ≤ s^(1/2) subblocks horizontally ==> ≤ 2s^(1/2) subblocks total
  - Don't double-count the 1 extra subblock
- Step 3.1 distributes each subblock to all s columns
  - All clean subblocks distribute their 0s and 1s uniformly to each column
  - Number of 0s between any two columns differs by ≤ number of dirty subblocks = 2s^(1/2)

Page 13: Stupid Columnsort Tricks

Subblock Columnsort Correctness (continued)

- After sorting in step 3.2, the dirty area is ≤ 2s^(1/2) rows high and ≤ s columns wide
  - Size of dirty area is ≤ 2s^(3/2)
- After step 4, the mesh consists of
  - Clean columns of 0s on the left
  - Clean columns of 1s on the right
  - A dirty area of size ≤ 2s^(3/2) between the clean columns

Page 14: Stupid Columnsort Tricks

Subblock Columnsort Correctness (continued)

- Finish up by observing that r ≥ 4s^(3/2) is equivalent to 2s^(3/2) ≤ r/2
  - Now the dirty area is at most 1/2 a column
  - Steps 5–8 clean up the dirty area
- Can remove the divisibility restriction at the cost of tightening the height restriction to r ≥ 6s^(3/2)

Page 15: Stupid Columnsort Tricks

Slabpose Columnsort

- Another variation on columnsort
- Loosens height restriction to r ≥ 4s^(3/2)
- Has 10 steps
- Partitions the mesh into vertical slabs

Page 16: Stupid Columnsort Tricks

Conclusion

- Removed and relaxed restrictions on columnsort
  - Divisibility restriction: s divides r
  - Height restriction: r ≥ 2s^2
- Divisibility restriction is not necessary
- Subblock columnsort
  - With divisibility restriction: r ≥ 4s^(3/2)
  - Without divisibility restriction: r ≥ 6s^(3/2)
- Slabpose columnsort: r ≥ 4s^(3/2)
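To see what the relaxed height restrictions buy, the sketch below computes the largest number of columns s each restriction allows for a fixed column height r. It looks only at the height inequalities and ignores divisibility and any requirement that s be a perfect square. The names `max_columns` and `BOUNDS` are illustrative. The inequality r ≥ c·s^(3/2) is tested in the equivalent integer form r^2 ≥ c^2·s^3 to avoid floating-point rounding.

```python
def max_columns(r, allowed):
    """Largest s >= 0 such that allowed(r, s) holds (allowed is monotone in s)."""
    s = 0
    while allowed(r, s + 1):
        s += 1
    return s


BOUNDS = {
    "columnsort, r >= 2s^2": lambda r, s: r >= 2 * s * s,
    "subblock with divisibility, r >= 4s^(3/2)": lambda r, s: r * r >= 16 * s ** 3,
    "subblock without divisibility, r >= 6s^(3/2)": lambda r, s: r * r >= 36 * s ** 3,
    "slabpose, r >= 4s^(3/2)": lambda r, s: r * r >= 16 * s ** 3,
}

r = 2 ** 20
for name, allowed in BOUNDS.items():
    s = max_columns(r, allowed)
    print(f"{name}: s <= {s}, so N = r*s <= {r * s}")
```

For r = 2^20 this gives s ≤ 724 for the original restriction and s ≤ 4096 for r ≥ 4s^(3/2), so the same column height accommodates a mesh more than five times as wide.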