o'connor -- rpglab: a matlab package for random permutation generation


These are notes on the generation of large random permutations. Included is a description of the RPGLab package written in Matlab that can be used to generate and test random permutations.


NOTES ON

RANDOM PERMUTATION GENERATION

AND THE

MATLAB package RPGLab

Derek O’Connor

January 31, 2011*

1 Introduction

Permutations and combinations have been studied seriously for at least 500 years. Some of the best mathematicians have contributed to this research: Bernoulli, Newton, Stirling, Euler, etc. This area of study has expanded greatly in the last 100 years and is now called Combinatorics, with permutations and combinations forming the foundation. That is, you can’t do combinatorics unless you know (and mind) your Ps and Cs.

Permutations and combinations are considered to be so important that even schoolchildren are required to study the subject. Figure 1 shows the opening page of the P&C chapter in Hall’s Algebra, which I bought as a secondary school pupil in 1960, price 10s/6d. Note that they use the good-old-fashioned notation ${}^nC_r$ rather than the ambiguous $\binom{n}{r}$. They also use ${}^nP_r$, but I don’t know the modern equivalent.¹

This is an outline of these notes:

1. Introduction

2. Definitions

3. Random Permutations

4. Algorithms for Random Permutations

5. MATLAB Implementations and Testing

6. Generating Special Permutations

7. Testing Permutation Generators

8. RPGLab

* Started: 3rd Jan 2010. Web: http://www.derekroconnor.net email: derekroconnor@eircom.net

¹ There is no need for all this notational fuss: C(n, k) for combinations and P(n, k) for permutations will do nicely.


Derek O’Connor Random Permutation Generation

Figure 1. H. S. HALL, An Algebra for Schools, 1ST EDITION 1912, REPRINTED 1956, MACMILLAN, LONDON

© DEREK O’CONNOR, JANUARY 31, 2011 2


2 Definitions

Definition 1. (Permutation.) A permutation is a rearrangement of the elements of an ordered list $A = \{a_1, a_2, \ldots, a_n\}$ into a one-to-one correspondence with A itself.

In what follows we assume that A is the set {1, 2, ..., n}. We will represent any permutation as a MATLAB vector p(1:n), where p(i) is the position of element i in the permutation. Thus we have

$$\{a_1, a_2, \ldots, a_n\} \xrightarrow{\;p\;} \{a_{p(1)}, a_{p(2)}, \ldots, a_{p(n)}\} \tag{2.1}$$

Table 1. A PERMUTATION VECTOR

i 1 2 3 4 5 6 7 8 9 10

p(i) 3 4 9 2 10 8 6 1 5 7
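As a small illustration of (2.1), the following sketch applies the permutation vector of Table 1 to a list of ten labels. The label string `a` and the variable names are made up for the example; only `p` comes from Table 1.

```matlab
% Permutation vector from Table 1
p = [3 4 9 2 10 8 6 1 5 7];
n = numel(p);

% p is a valid permutation if it contains each of 1..n exactly once
isPerm = isequal(sort(p), 1:n);

% Hypothetical labels a_1..a_10, used only for illustration
a = 'ABCDEFGHIJ';

% Equation (2.1): position i of the result holds a(p(i))
b = a(p);

% The inverse permutation q satisfies q(p(i)) = i, so b(q) recovers a
q = zeros(1, n);
q(p) = 1:n;
c = b(q);

disp(b);
disp(c);
```

Indexing with `a(p)` is all that is needed to apply a permutation vector in MATLAB, which is one reason the vector representation is convenient.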

The number of permutations on a set of n elements is n! = n · (n − 1) ··· 2 · 1. There is no exact closed form for n!, but Stirling’s Approximation is excellent for even modest n:

$$n! \approx \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}\left(1 + \frac{1}{12n} + \frac{1}{288n^{2}} - \frac{139}{51840n^{3}} + O\!\left(\frac{1}{n^{4}}\right)\right) \tag{2.2}$$

Table 2. APPROXIMATE NUMBER OF PERMUTATIONS

n         N = n!
10^1      10^7
10^2      10^158
10^3      10^2,568
10^4      10^35,659
10^5      10^456,573
10^6      10^5,565,709
10^7      10^65,657,059
10^8      10^756,570,556
10^9      10^8,565,705,523
10^10     10^95,657,055,186

A random permutation of length n = 10^9 can be generated by a 2.3 GHz, 16 GB machine in a few minutes. Table 2 is there to remind us of the truly gigantic size of the permutation space from which these are generated.
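The exponents in Table 2 can be checked without ever forming n!. The sketch below is my own illustration, not part of RPGLab: it uses the identity log10(n!) = gammaln(n + 1)/log(10) and rounds the result to the nearest integer, which I assume is how the table entries were obtained.

```matlab
% log10(n!) computed via the log-gamma function, since gamma(n+1) = n!
k = 1:10;
n = 10.^k;
log10fact = gammaln(n + 1) / log(10);

% Table 2 quotes n! as a power of ten, so round the exponent
expo = round(log10fact);

for i = 1:numel(k)
    fprintf('n = 10^%-2d   n! ~ 10^%.0f\n', k(i), expo(i));
end
```

Working with logarithms keeps every quantity comfortably inside double precision, even for n = 10^10.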

Definition 2. (Identity.) The identity permutation is p = [1, 2, ..., n], that is, p(i) = i, i = 1, 2, ..., n. In MATLAB the statement p = 1:n; generates the identity permutation.

Definition 3. (Transposition.) A transposition of a permutation is an exchange of two of its elements i, j with all others staying the same. Any transposition of p gives a new permutation q, if i ≠ j.

Definition 4. (Fixed Point.) A fixed point of a permutation p is any i such that p(i) = i. It is called a fixed point because the element $a_i$ does not move under the permutation. The Identity Permutation p = (1, 2, ..., n) has n fixed points.

Definition 5. (Derangement.) A derangement is a permutation with no fixed points, i.e., p(i) ≠ i, i = 1, 2, ..., n.

Definition 6. (Cycle.) A cycle is a sequence i → p(i) → ··· → j → p(j) → i. The length of a cycle is the number of elements in the cycle. A fixed point is a cycle of length 1. The identity permutation has n cycles of length 1.

Definition 7. (Cyclic Permutation.) A cyclic permutation has 1 cycle. That is, starting at any point i we have a sequence i → p(i) → ··· → j → p(j) → i, which includes every element of {1, 2, ..., n}. This cycle has length n.

It should be obvious that a cyclic permutation has no fixed points and is, therefore, a derangement. Equally obvious is that not all derangements are cyclic permutations.
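Definitions 4 to 7 translate almost directly into MATLAB. The following is a minimal sketch, using the permutation vector of Table 1 as input; the variable names (`fixedPts`, `cycLen`, and so on) are my own and are not taken from RPGLab.

```matlab
p = [3 4 9 2 10 8 6 1 5 7];         % permutation vector from Table 1
n = numel(p);

fixedPts      = find(p == 1:n);     % Definition 4: all i with p(i) = i
isDerangement = isempty(fixedPts);  % Definition 5: no fixed points

% Definition 6: follow i -> p(i) -> ... until we return to the start
visited = false(1, n);
cycLen  = zeros(1, 0);              % lengths of the cycles found so far
for i = 1:n
    if ~visited(i)
        len = 0;
        j = i;
        while ~visited(j)
            visited(j) = true;
            j = p(j);
            len = len + 1;
        end
        cycLen(end + 1) = len;      % one more cycle, of length len
    end
end

isCyclic = (numel(cycLen) == 1);    % Definition 7: exactly one cycle

fprintf('%d cycles, %d fixed points\n', numel(cycLen), numel(fixedPts));
```

For this p the loop finds the cycle 1 → 3 → 9 → 5 → 10 → 7 → 6 → 8 → 1 and the cycle 2 → 4 → 2, so p is a derangement but not a cyclic permutation.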

3 Random Permutations

A random permutation of the numbers 1, 2, ..., n is a permutation drawn uniformly from the set of all n! permutations of n numbers. That is, a permutation p is drawn from the set of all permutations in such a way that the probability of drawing a given permutation, p, is 1/n!.

The reason for our interest in random permutations is simple: we are interested in permutation vectors where n ≥ 10^6. For example, if we wish to test a new super-duper sorting algorithm on inputs of size n ≥ 10^6, then we must resort to testing with a random sample from the vast (N ≥ 10^6!) space of permutations.

3.1 PROPERTIES OF RANDOM PERMUTATIONS

It is useful to think of a permutation as a directed graph with n nodes labelled 1, 2, ..., n. A directed arc i → j exists if p[i] = j. Of necessity there can be only one arc from any node i. Also, of necessity, each node has one incoming arc. [WHY?] Hence a permutation is a directed graph with n nodes and n directed arcs. An example of a permutation graph of 10 nodes is shown in Figure 2. The permutation shown in Figure 2 has 10 nodes, 10 arcs, and 4 cycles starting at 1, 2, 7, and 9, with lengths 2, 5, 1, and 2, respectively.²

i       1   2   3   4   5   6   7   8   9  10
p(i)    5   3   6   8   1   4   7   2  10   9

Figure 2. A PERMUTATION AND ITS GRAPH

The Number of Fixed Points. The probability that a permutation of length n has k fixed points is

$$\Pr(p \text{ has } k \text{ fixed points}) \sim \frac{1}{e\,k!}, \tag{3.1}$$

and the average number of cycles of length k in a permutation of length n is 1/k with variance 1/k. Hence, the average number of fixed points in a permutation of length n is 1, with variance 1. This is a surprising result because the average is independent of the length of the permutation.
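For small n these claims can be checked exhaustively rather than by sampling. The sketch below is my own illustration: it enumerates all 6! = 720 permutations with MATLAB's `perms`, counts the fixed points of each, and compares the exact distribution with the approximation (3.1). The choice n = 6 is arbitrary and only keeps the enumeration small.

```matlab
n = 6;
P = perms(1:n);                     % all n! permutations, one per row
N = size(P, 1);

% Number of fixed points of each permutation (row-wise count of p(i) == i)
nfix = sum(P == repmat(1:n, N, 1), 2);

meanFix = mean(nfix);                   % average number of fixed points
varFix  = mean((nfix - meanFix).^2);    % variance over all n! permutations

% Exact probability of k fixed points versus the approximation 1/(e k!)
k      = 0:n;
prob   = arrayfun(@(kk) mean(nfix == kk), k);
approx = 1 ./ (exp(1) * factorial(k));

disp([k; prob; approx]');
```

Even at n = 6 the exact probabilities for small k are already close to 1/(e k!), and the mean and variance both come out as 1.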

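The Number of Derangements. Let D(n) be the number of derangements in the set of n! permutations of length n. Then, for n ≥ 1,

$$D(n) = n! \sum_{k=0}^{n} \frac{(-1)^{k}}{k!} = \left\lfloor \frac{n! + 1}{e} \right\rfloor \to \frac{n!}{e}, \text{ as } n \to \infty. \tag{3.2}$$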

Hence

$$\Pr(p \text{ is a derangement}) = \frac{D(n)}{n!} \to \frac{1}{e} \approx 0.36788. \tag{3.3}$$

This means that in a large sample of random permutations about 37% of them will be derangements, independent of n, the length of the permutation.
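The convergence in (3.3) is very fast, as a short calculation shows. The sketch below is an illustration only. It computes D(n) from the standard recurrence D(n) = (n − 1)(D(n − 1) + D(n − 2)), which is not used elsewhere in these notes, and compares it with the floor expression in (3.2) and with 1/e.

```matlab
nmax = 12;
D = zeros(1, nmax);     % D(n) for n = 1..nmax
D(1) = 0;               % the only permutation of length 1 is the identity
D(2) = 1;               % [2 1] is the only derangement of length 2
for n = 3:nmax
    D(n) = (n - 1) * (D(n - 1) + D(n - 2));
end

n      = 1:nmax;
Dfloor = floor((factorial(n) + 1) / exp(1));   % floor expression in (3.2)
ratio  = D ./ factorial(n);                    % Pr(derangement), as in (3.3)

disp([n; D; ratio]');
```

By n = 12 the ratio D(n)/n! agrees with 1/e to about nine decimal places, which is why the 37% figure can be treated as independent of n.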

The Number of Cycles. The number of cycles in a random permutation ranges from 1, a cyclic permutation, to n, the identity permutation. Let $C_k(n)$ be the number of cycles of length k in a permutation p(1:n).

The expected number of cycles of length less than or equal to m is $H_m$, where $H_m = 1 + 1/2 + \cdots + 1/m$ is the m-th harmonic number.

The expected number of cycles of any length is $H_n$, or about log n. The average length of a cycle is thus n / log n.
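To get a feel for how slowly the expected number of cycles grows, the sketch below (my own, for illustration) evaluates the harmonic numbers directly for n = 10, 100, ..., 10^6 and compares them with log n. The difference tends to Euler's constant, about 0.5772, which is a standard fact not stated above.

```matlab
n  = 10.^(1:6);
Hn = arrayfun(@(m) sum(1 ./ (1:m)), n);   % H_n = 1 + 1/2 + ... + 1/n

logn   = log(n);
gap    = Hn - logn;       % approaches Euler's constant, about 0.5772
avgLen = n ./ Hn;         % average cycle length, roughly n / log n

for i = 1:numel(n)
    fprintf('n = %8d   H_n = %7.4f   log n = %7.4f   n/H_n = %10.1f\n', ...
            n(i), Hn(i), logn(i), avgLen(i));
end
```

A permutation of a million elements is therefore expected to have only about 14 cycles.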

The Number of Cyclic Permutations. There are n! permutations of n distinct elements and (n − 1)! of these will be cyclic.

$$C_n(n) = (n-1)! \quad\text{and}\quad \Pr(p \text{ is cyclic}) = \frac{C_n(n)}{n!} = \frac{1}{n}. \tag{3.4}$$

This means that in a large sample of random permutations a decreasing fraction of them will be cyclic (e.g., 1% for n = 100, 0.1% for n = 1000), while the fraction of derangements remains constant at 37%.

² Most of what follows is from Sedgewick & Flajolet, Analysis of Algorithms, Addison-Wesley, 1996, Chapter 6.

Table 3. CYCLE DISTRIBUTION IN RANDOM PERMUTATIONS

C(i)   n = 10^6, 11 cycles   n = 10^7, 9 cycles    n = 10^8, 11 cycles    n = 10^9, 14 cycles
 i     Start      Length     Start      Length     Start      Length      Start        Length
 1     1          419486     3          765,295    3          7,946,786   1            715510740
 2     2          563485     1          183,715    2          1,805,558   4            143677250
 3     44         15656      38         41,873     74         168,213     20           135801148
 4     509        432        97         8,007      586        53,765      148          3267451
 5     1010       119        1,114      775        1          13,346      1303         1160044
 6     5793       542        4,374      216        1,409      11,991      2774         461529
 7     7003       255        3,307      100        14,927     329         9415         119643
 8     16043      27         465        15         934,615    7           459730       1876
 9     220629     2          253,498    4          958,846    3           7116764      135
10     578575     5          -          -          1,805,235  1           4362180      76
11     815788     2          -          -          3,807,650  1           9884753      37
12     -          -          -          -          -          -           30479687     29
13     -          -          -          -          -          -           112643112    23
14     -          -          -          -          -          -           111244636    19

In the next section we will discuss the two main algorithms for generating random permutations. However, before we do this we should have methods for checking the properties of random permutations. This will allow us to check the output of any generator so that we do not waste time developing bad algorithms and programs that implement them. These methods have been collected in a package called RPGLAB which is discussed in Section 8.

Example 1. (Random Permutation of length 99.)³

p =

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

34 60 68 78 50 4 7 97 16 74 98 76 8 65 99 75 10 30 91 95

21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40

42 15 80 6 90 70 28 47 32 1 63 51 49 57 59 72 66 85 53 94

41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

9 83 87 81 43 84 12 23 54 22 79 92 45 62 64 31 46 82 52 93

61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80

5 58 35 55 20 37 18 19 67 44 24 26 11 41 27 88 73 89 13 36

81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99

56 40 38 3 29 21 69 48 71 17 39 96 25 33 2 14 61 86 77

Figure 3. RANDOM PERMUTATION OF LENGTH 99 WITH 8 CYCLES AND 1 FIXED POINT

³ Figure 3 was produced by the MATLAB function GrVizp.m and GraphViz with the ‘neato’ layout.

Example 2. (Random Permutation of length 150.)

Figure 4. RANDOM PERMUTATION (AND DERANGEMENT) OF LENGTH 150

Example 3. (Random Permutation of length 200.)

Figure 5. RANDOM PERMUTATION OF LENGTH 200

4 Algorithms for Random Permutations

Algorithms and programs for generating random permutations have been around since the start of the computer age. In fact Knuth has tracked down the first known algorithm to R. A. Fisher and F. Yates, Statistical Tables, (London, 1938), Example 12.⁴ There appear to be just two types of algorithm, which are based on: (1) random transpositions, and (2) sorting a random vector.

The production of random permutations may be viewed in different ways:

1. Building a permutation randomly step-by-step.

2. Sampling with replacement from the set of all permutations of length n.

3. Sampling without replacement from the set of integers {1, 2, . . . , n}.

All three views are useful and should be kept in mind.

Conceptually, building a random permutation step-by-step is simple: take a random sample of size n without replacement from a set of n elements $S = \{s_1, s_2, \ldots, s_n\}$.

Algorithm RandPerm(S) → p
    Generates a random permutation of the
    elements of the set S = {s_1, s_2, ..., s_n}
    for k := 1 to n do
        Choose an element s_r^k at random from S
        S := S − {s_r^k}
        p[k] := s_r^k
    endfor k
    return p
endalg RandPerm

Notice that all the elements of S will be chosen and put into the array p. The array is needed to preserve the random order in which the elements were drawn. Thus p is returned containing a random permutation of S.

Analysis of Algorithm RandPerm. By construction, RandPerm returns a permutation of the set S. We need to show that each permutation is equally likely with probability 1/n! of occurring. We prove this by induction on k.

The size of the set |S| decreases by 1 at each stage and so $\Pr(s_r^k) = 1/(n - k + 1)$ at stage k, where $s_r^k$ is the element chosen at random at stage k. This is obviously true for k = 1, with $\Pr(s_r^1) = 1/n$. Hence, because $s_r^k$ is chosen uniformly and independently, we have

$$\Pr(s_r^1, s_r^2, \ldots, s_r^n) = \Pr(s_r^1)\Pr(s_r^2)\cdots\Pr(s_r^n) = \frac{1}{n}\cdot\frac{1}{n-1}\cdots\frac{1}{1} = \frac{1}{n!}. \tag{4.1}$$

Thus we have proved that Algorithm RandPerm generates all possible permutations of the set S with equal probability.
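Algorithm RandPerm can be transcribed into MATLAB almost line for line if the set S is held in a vector and the chosen element is deleted at each stage. This is only a sketch to make the pseudocode concrete: the set `S` below is a made-up example, and deleting from the middle of a vector costs O(n) per stage, so this version runs in O(n^2) time rather than O(n).

```matlab
S = [10 20 30 40 50];   % hypothetical set, for illustration only
n = numel(S);
p = zeros(1, n);

R = S;                  % remaining elements; shrinks by one per stage
for k = 1:n
    r    = ceil(numel(R) * rand);   % uniform index into the remaining set
    p(k) = R(r);                    % p[k] := the chosen element
    R(r) = [];                      % S := S - {the chosen element}
end

disp(p);
```

At stage k the vector `R` holds n − k + 1 elements, which is exactly the 1/(n − k + 1) probability used in the analysis.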

⁴ Available here: http://digital.library.adelaide.edu.au/coll/special/fisher/

If we assume that each set operation can be performed in O(1) time, then this algorithm runs in O(n) time, which is optimal up to a multiplicative constant. There are many ways to implement the set operations, either explicitly or implicitly.

The Fisher-Yates Shuffle Algorithm

This is a random transposition algorithm. The first computer implementation of it was by Durstenfeld.⁵ The most succinct statement of the Fisher-Yates Shuffle Algorithm is given by Reingold, et al.⁶:

$$\textbf{for } i := n \textbf{ downto } 2 \textbf{ do } \pi_i \leftrightarrow \pi_{\mathrm{rand}(1,i)}, \tag{4.2}$$

where π is a permutation of length n, and rand(i, j) returns a random integer from the set {i, i + 1, ..., j − 1, j}. This is a truly elegant algorithm: it is tiny, uses no extra space, and is optimal. Also, it can shuffle any array in place, not just permutations.⁷
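Statement (4.2) can be written out in a few lines of MATLAB. The sketch below shuffles an arbitrary character array in place to emphasise that the input need not be a permutation; the string `x` is a made-up example, and `ceil(i*rand)` plays the role of rand(1, i).

```matlab
x = 'ABCDEFGH';         % hypothetical data; any vector would do
original = x;
n = numel(x);

for i = n:-1:2
    r = ceil(i * rand); % random integer from {1, 2, ..., i}
    t    = x(i);        % swap x(i) and x(r)
    x(i) = x(r);
    x(r) = t;
end

disp(x);
```

No second array is allocated: apart from the scalars `i`, `r` and `t`, the shuffle works entirely inside `x`.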

The Sort Permutation Algorithm

Sorting a set of distinct elements $(x_1, x_2, \ldots, x_n)$ may be viewed as

Find a permutation $p = (p_1, p_2, \ldots, p_n)$ such that $x_{p_1} < x_{p_2} < \cdots < x_{p_n}$. (4.3)

Hence any sorting algorithm will generate, implicitly or explicitly, a permutation and its associated sorted vector:

[s, p] := Sort(x), where s[i] = x[p[i]]. (4.4)

The following line of MATLAB code demonstrates this:

x = rand(1,8); [s,p] = sort(x); table = [x;s;p;x(p)]

Table 4. SORTING WITH A PERMUTATION VECTOR

i       1         2         3         4         5         6         7         8
x       0.146529  0.118308  0.315581  0.683211  0.914784  0.912839  0.753141  0.437558
s       0.118308  0.146529  0.315581  0.437558  0.683211  0.753141  0.912839  0.914784
p       2         1         3         8         4         7         6         5
x[p]    0.118308  0.146529  0.315581  0.437558  0.683211  0.753141  0.912839  0.914784

We can see that the permutation p has one fixed point p[3] = 3, and three cycles (1, 2, 1), (4, 8, 5, 4), and (6, 7, 6).
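It is easy to confuse the sort permutation p, which says where each sorted element came from, with its inverse, which gives the rank of each x(i). The following sketch uses the x values of Table 4 to show both; the name `q` for the inverse is my own.

```matlab
x = [0.146529 0.118308 0.315581 0.683211 ...
     0.914784 0.912839 0.753141 0.437558];
n = numel(x);

[s, p] = sort(x);       % s(i) = x(p(i)): p(i) is the source of s(i)

q = zeros(1, n);
q(p) = 1:n;             % q(i) is the rank of x(i); q is the inverse of p

disp([1:n; p; q]);
```

Here x(p) reproduces s, while s(q) reproduces x. Since p and q are inverses they have the same cycle structure, with each cycle traversed in the opposite direction.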

A random permutation is obtained from the sorting algorithm if it is given a randomly-ordered vector to sort. This is done by filling a vector r[1:n] with random numbers distributed uniformly on (0, 1): r[1:n] := RandReal(0, 1), and then sorting this random vector: [s, p] := Sort(r[1:n]), where s = r[p]. Thus a random permutation operating on a random vector gives an ordered vector.

⁵ Richard Durstenfeld, Algorithm 235, Random Permutation, Communications of the ACM, Vol. 7, July 1964, page 420.

⁶ Edward M. Reingold, Jurg Nievergelt, and Narsingh Deo, Combinatorial Algorithms: Theory and Practice, Prentice-Hall, 1977, page 177.

⁷ Here is an interesting page on card shuffling by Richard J. Wagner: http://www-personal.umich.edu/~wagnerr/shuffle/. Elsewhere on his site is his C++ implementation of the Mersenne Twister random number generator.

Here are the two algorithms, side-by-side for easy comparison:

function GRPfys(n) → p                        function GRPsort(n) → p
    Generates a random permutation                Generates a random permutation
    of the integers 1,2,...,n                     of the integers 1,2,...,n
    p := [1, 2, ..., n]    Identity perm.         for k := 1 to n do
    for k := 2 to n do                                r[k] := RandReal(0, 1)
        r := RandInt(1, k)                        endfor k
        p[k] :=: p[r]      Swap                   [s, p] := Sort(r)
    endfor k                                      return p
    return p                                  endfunc GRPsort
endfunc GRPfys

Analysis of the Shuffle and Sort Algorithms

GRPfys has one for loop (2:n), and so it requires $T_{gr} = (n - 1)(T_r + T_s)$ time, where $T_r$ and $T_s$ are the times to perform the random number generation and swap, respectively. We say the time (or step) complexity of this algorithm is O(n). This is asymptotically optimal – just to read a permutation requires O(n) time.

GRPsort has one for loop (1:n), which requires $nT_r$ time, followed by a sort which requires O(n log n) time. The total time is $T_{gs}(n) = nT_r + O(n \log n)$. The time (or step) complexity of this algorithm is O(n log n).

An important difference between the algorithms is that GRPsort requires twice as much storage as the Shuffle algorithm, assuming that the vector r[1:n] is over-written by s[1:n].

5 MATLAB Implementations of the Shuffle Algorithm

The MATLAB functions here are all minor variations of the Fisher-Yates Shuffle algorithm, except MATLAB’s own randperm which uses sorting.

In the MATLAB functions that follow, all loops are incremented rather than decremented. In the original Knuth algorithm P, he decrements the loop because in those days this gave faster loops. This is no longer true.⁸

Fisher-Yates Implementations

function p = GRPfys(n)
p = 1:n;                    % Identity permutation
for k = 2:n
    r = ceil(k*rand);       % Random integer between 1 and k
    t = p(r);
    p(r) = p(k);            % Swap(p(r),p(k))
    p(k) = t;
end;
return; % GRPfys

Knuth, in the 3rd edition of Volume 2, TAOCP, points out in the second-last paragraph on page 145 that there is no need for the swap if we don’t want to shuffle a given vector, but just want a random permutation. This is implemented below as GRPns.

function p = GRPns(n)
p = 1:n;                    % Identity permutation
for k = 2:n
    r = ceil(k*rand);       % Random integer between 1 and k
    p(k) = p(r);            % No Swap
    p(r) = k;
end;
return; % GRPns

The final variation of GRPfys is what all right-thinking MATLAB-ers consider ‘good practice’, viz., vectorization. In this variation the generation of the random integer r is moved outside the loop and a random integer vector r is generated before entering the loop. Immediately we see that this has doubled the memory required: two vectors p and r instead of just p. But if it speeds up the function, it may be worth it.

⁸ The decremented (or backward) loop, and its close cousin, the zero-based array, have been constant sources of error in the student programs that I have seen over the 25 years of teaching algorithms and data structures. People from childhood have been taught to count from 1 upwards and mistakes are inevitable when this is reversed. The rot set in, I believe, with Sputnik and the American Space Program – all those Houston count-downs. And remember, Knuth started programming at that time.

Likewise with zero-based arrays: people do not naturally start counting from 0. Except of course in Ireland and the UK. Here, in these sceptre’d isles, we count the floors of our buildings from 0 upwards. But we don’t call the first floor Floor Zero. Oh no! We have a special name for it: the Ground Floor. I have lost count (pardon the pun) of the number of Americans wandering lost around our university buildings saying, “But Prof Jones states in his letter that his office is on the second floor of the Gerry Adams Memorial Building”. MATLAB, very sensibly, does not allow zero-based arrays, but I’m sure they get many requests for them.

function p = GRPvec(n);
p = 1:n;                            % Identity permutation
r = ceil(rand(n,1).*(1:n)');        % Vector of random integers
for k = 2:n
    t = p(r(k));
    p(r(k)) = p(k);                 % Swap(p(r),p(k))
    p(k) = t;
end;
return; % GRPvec

We can see in Table 5 that GRPvec takes 10% longer for n = 10^7, and 6% longer for n = 10^8, than the loop version. In this simple function vectorization has given us the worst of both worlds: increased time and increased space. Vectorization, like the optimizing compiler, seems to be a mixed blessing.

I decided to replace the 3-statement swap with a slick one-statement swap in the original Shuffle method.

function p = GRPvswap(n)
p = 1:n;
for k = n:-1:2
    r = ceil(k*rand);
    p([k r]) = p([r k]);    % replaces: 1. t = p(r);  2. p(r) = p(k);  3. p(k) = t;
end;
return; % GRPvswap

This 1-statement swap had a disastrous effect on the execution time of this simple function. This function took 14 to 40 times longer to execute than the original. The slow-down is caused by the need to construct two vectors [k r], [r k] for each swap. This is expensive, according to Jan Simon.⁹

⁹ http://www.mathworks.com/matlabcentral/newsreader/view_thread/295414

Jan Simon

function X = Shuffle(N,'index')

% <--- snip --->

% INPUT:

% X: Array with any size. Types: DOUBLE, SINGLE, CHAR, LOGICAL,

% (U)INT64/32/16/8. This works for CELL and STRUCT arrays also, but

% shuffling their indices is faster.

% Author: Jan Simon, Heidelberg, (C) 2010

% Simple Knuth shuffle in forward direction:

for i = 1:length(X)

w = ceil(rand * i);

t = X(w);

X(w) = X(i);

X(i) = t;

end;

return; % Shuffle

This is Knuth’s Algorithm P with the loop direction reversed. Note that this function can shuffle any vector, not just permutations. Also, it is very general and well-written, being able to handle various integer arrays as well as the usual double precision array.

This MATLAB function is not actually used. Instead, Simon has written this in C and put a MEX wrapper around it so that when it is compiled it can be called from MATLAB. This is because the C compiler can generate much faster code than MATLAB’s interpreter. Thus this is a code generation improvement and not an algorithmic improvement.

MATLAB

function p = randperm(n)

%RANDPERM Random permutation.

% RANDPERM(n) is a random permutation of the integers from 1 to n.

% For example, RANDPERM(6) might be [2 4 5 6 1 3].

%

% Note that RANDPERM calls RAND and therefore changes RAND’s state.

%

% See also PERMUTE.

% Copyright 1984-2004 The MathWorks, Inc.

% $Revision: 5.10.4.1 $ $Date: 2004/03/02 21:48:27 $

[ignore,p] = sort(rand(1,n));

If you have a good sorting method then this is a quick ’n slick method for generating random permutations. Others might call it quick ’n dirty. Yet others would call it just dirty. But let’s not quibble: it works fine and it is fast because MATLAB has carefully implemented (in C or assembly?) a good sorting algorithm, Hoare’s Quicksort, I believe.

Although all the MATLAB functions listed above use rand, which returns a standard IEEE 8-byte double precision floating point number, only MATLAB uses a whole n-vector of them. But the methods above generate n rands, don’t they? Yes, but one-at-a-time, so the total memory requirement is 1 integer n-vector, whereas MATLAB’s method requires 1 integer n-vector and 1 double precision n-vector. This can make a big difference if you work with truly large permutations.

5.1 Timing Tests

The results here show that all variants of GRPfys are about the same and so the choice of function is reduced to three: GRPfys, GRPmex, or randperm. Jan Simon’s GRPmex is the ‘mexed’ version of GRPfys, and runs 2–3 times faster. Both require 1 array of size n. If you do not have the appropriate C compiler then you are forced to use GRPfys or something similar.

Table 5. PERMUTATION GENERATION USING MATLAB R2008A (SECS.)

Function   Coder            Date       Mem   n = 10^7   n = 10^8   n = 10^9
GRPfys     Derek O’Connor   Dec 2010   1     2.0        24.0       315
GRPns      ”                Dec 2010   1     2.0        24.0       310
GRPvec     ”                Dec 2010   1     2.2        25.5       350
GRPmex     Jan Simon        2010       1     0.6        7.6        156
randperm   Matlab           2005       2     2.3        26.0       -

Dell Precision 690, Intel 2xQuad-Core E5345 @ 2.33GHz, 16GB RAM
Windows7 64-bit Prof., MATLAB R2008a, 7.6.0.324

MATLAB’s randperm is the very slick one-liner [s,p] = sort(rand(n,1)). This method’s complexity is O(n log n) which means that an O(n) method such as Shuffle will eventually beat it when n is large enough. If $T_{shuf} = c_{shuf}\,n$ and $T_{sort} = c_{sort}\,n \log n$, then the average times per array element are $c_{shuf}$ and $c_{sort} \log n$. Thus the average time of MATLAB’s sort method grows with n, while the Shuffle remains constant.

Table 5 shows that there is very little difference between GRPfys and randperm for n = 10^7 and n = 10^8. But the crucial difference is that MATLAB’s method uses twice as much working memory as GRPfys. This is why there is a blank in the table for randperm at n = 10^9: 16 GB memory was not enough to allow randperm to generate a random permutation of size 8 bytes × 10^9 = 8 GB.

The general conclusion from these tests is that the simplest implementations of the Fisher-Yates algorithm are the best. Fancy vectorizations and array index manipulations not only obscure the algorithm, but are error-prone, and are slower than the simpler methods.

5.2 Profiling GRPfys(n)

The profile for p = GRPfys(10^7) is shown below. It is important on multi-core systems to switch to single processor mode before running the profiler, otherwise the results will be erratic. See the MATLAB profiler help for details.

 time  Percent     calls
                       1   function p = GRPfys(n);
 0.08      1.0         1   p = 1:n;
                       1   for k = 2:n
 1.87     22.6   9999999       r = ceil(rand*k);
 1.29     15.7   9999999       t = p(r);
 1.89     22.9   9999999       p(r) = p(k);
 1.80     21.8   9999999       p(k) = t;
 1.32     16.0   9999999   end;
                       1   return;

We can see that there are no ‘hot spots’ in this code and that time is fairly evenly spread across each statement. This suggests that there is little we can do in MATLAB to speed it up. But what about vectorization?, I hear you cry. Let us try profiling GRPvec above. Here are the results for n = 10^8:

  time     calls                             |   time     calls
                  function p = GRPfys(n);    |                   function p = GRPvec(n);
  0.66         1  p = 1:n;                   |   0.74         1  p = 1:n;
                                             |   6.22         1  r = ceil(rand(n,1).*(1:n)');
               1  for k = 2:n                |                1  for k = 2:n
 18.05  99999999    r = ceil(rand*k);        |
 12.69  99999999    t = p(r);                |  21.55  99999999    t = p(r(k));
 20.50  99999999    p(r) = p(k);             |  22.97  99999999    p(r(k)) = p(k);
 21.70  99999999    p(k) = t;                |  12.93  99999999    p(k) = t;
 12.57  99999999  end;                       |  12.54  99999999  end;
 -----                                       |  -----
 86.16  total                                |  77.10  total

We can see that the unvectorized code GRPfys on the left requires 86.16/77.1 − 1 = 0.1175 ≈ 12% more time than GRPvec. Random number generation takes 18.05 secs in GRPfys but only 6.22 secs in GRPvec. The times taken by the other statements in the loop are about the same for both versions. Hence the 12% difference is accounted for by the vectorization of the random number generator. Or so it seems.

Because profiling adds a lot of overhead to the run times, it is always a good idea to get the actual times without the profiler on. Still with the computer in single-CPU mode, GRPfys and GRPvec were timed, again with n = 10^8. Here are the command-line results:

>> tic;p=GRPvec(1e8);toc
Elapsed time is 27.283832 seconds.
>> tic;p=GRPvec(1e8);toc
Elapsed time is 27.352696 seconds.
>> tic;p=GRPfys(1e8);toc
Elapsed time is 23.726308 seconds.
>> tic;p=GRPfys(1e8);toc
Elapsed time is 23.714992 seconds.

This shows that GRPvec takes 27.3/23.7 − 1 = 15% more time than the unvectorized GRPfys, a complete reversal of the profiler results. In this case the MATLAB profiler has been worse than useless: it has been grossly misleading. We can see that the profiler added about (86.16 − 23.7)/23.7 = 263% overhead time to the actual running time of GRPfys. This overhead time is noise (we don’t want it) which is swamping the actual time signal. The only reliable information given out by the profiler is the statement count.

But apart from the erroneous profiler results, this example shows that vectorization is not (always) a good thing: not only has it added 15% to the run time, it has doubled the working memory required.
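The profiler-free timings can also be repeated with MATLAB's timeit function, which calls a function handle several times and returns a typical time in seconds. The following is a minimal sketch and not part of RPGLab: TimeShuffles is a hypothetical name, and the two local functions are reconstructed from the profile listing above.

```matlab
function [tFys, tVec] = TimeShuffles(n)
% Time both shuffles with no profiler attached.
% tFys, tVec are typical run times in seconds, as reported by timeit.
tFys = timeit(@() GRPfys(n));
tVec = timeit(@() GRPvec(n));
end

function p = GRPfys(n)
% Loop version: one random number per pass.
p = 1:n;
for k = 2:n
    r = ceil(rand*k);
    t = p(r); p(r) = p(k); p(k) = t;
end
end

function p = GRPvec(n)
% 'Vectorized' version: all n random indices generated up front,
% which needs a second array of length n.
p = 1:n;
r = ceil(rand(n,1).*(1:n)');
for k = 2:n
    t = p(r(k)); p(r(k)) = p(k); p(k) = t;
end
end
```

A call such as [tFys, tVec] = TimeShuffles(1e6) gives the two times side by side; the ratio will depend on the machine and the MATLAB version.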

“It was once true that vectorization would improve the speed of your MATLAB code. However, that is largely no longer true with the JIT-accelerator” – Matlab Doug, Mar. 2010.

6 Generating Special Permutations.

The Fisher-Yates Shuffle Algorithm is provably correct and, when implemented with a good random number generator, produces all possible permutations with equal probability. A special permutation is one which has some special property or restriction: it is cyclic, it is a derangement, it must have 2 fixed points, etc.

The simplest way to generate special permutations is by the Rejection Method, as shown in the function GRPSpec: repeatedly generate a permutation until it has the special property or restriction. The main problem with this general method is that many permutations may need to be generated before a desirable one is found.

function GRPSpec(n) → p
    Generates a special random permutation
    of the integers 1,2,...,n
    p := [1, 2, . . . , n]            Identity perm.
    while p is not special do
        p := GRPfys(n)
    endwhile p
    return p
endfunc GRPSpec
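The rejection loop can be sketched in MATLAB with the special property passed in as a function handle. This is an illustrative sketch, not an RPGLab function: GRPSpecSketch, isSpecial and the local shuffle are names assumed here, and the second output nw simply counts the passes through the while loop.

```matlab
function [p, nw] = GRPSpecSketch(n, isSpecial)
% Rejection method: shuffle until isSpecial(p) is true.
% isSpecial is an assumed function handle, e.g. @(p) ~any(p == 1:numel(p)).
% nw returns the number of passes through the while loop.
p = 1:n;                           % identity permutation
nw = 0;
while ~isSpecial(p)
    p = shuffle(n);
    nw = nw + 1;
end
end

function p = shuffle(n)
% Fisher-Yates shuffle, as in GRPfys.
p = 1:n;
for k = 2:n
    r = ceil(rand*k);
    t = p(r); p(r) = p(k); p(k) = t;
end
end
```

For example, [p, nw] = GRPSpecSketch(8, @(p) ~any(p == 1:numel(p))) returns a derangement of 1..8 together with the number of shuffles it took.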

This type of algorithm is called a Las Vegas Algorithm (footnote 10): an algorithm that is guaranteed to give the correct output, but may take a very long (random) time to do so. It is important to understand that GRPfys(n) is sampling uniformly with replacement from the set of all permutations of length n, and knows nothing about the special property. We have seen that the time complexity of GRPfys(n) is T_s(n) = c_s n ∈ O(n). This is not a random time, but constant for a given n.

The random uniform output of GRPfys(n) means that we do not know how many times the while loop is performed: it is a random number n_w. Hence the time complexity of GRPSpec(n) is a random function

$$T_{spec}(n) = \sum_{k=1}^{n_w} \bigl(T_{test}(n) + T_s(n)\bigr). \qquad (6.1)$$

Using Wald’s Equation, $E\bigl[\sum_{i=1}^{N} X_i\bigr] = E[N]\,E[X]$, equation (6.1) becomes

$$E[T_{spec}(n)] = E[n_w]\,E[T_{test}(n) + T_s(n)] = E[n_w]\bigl(E[T_{test}(n)] + E[T_s(n)]\bigr). \qquad (6.2)$$

10. See Brassard & Bratley, Fundamentals of Algorithmics, Prentice-Hall, 1996, Chapter 10.


We have E[T_s(n)] = T_s(n), a constant for a given n, but we do not know much about T_test(n). Is it a random number or a constant? Usually these tests can be performed in O(n) time, e.g., IsCyc(p) and IsDer(p) are O(n). Assuming that the test time is, at worst, c_t n for some constant c_t, then we have E[T_spec(n)] = E[n_w](c_t n + c_s n) = c n E[n_w], where c = c_t + c_s.
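One step that may help here: because each pass of the while loop draws an independent uniform permutation, the number of passes is geometrically distributed. The symbol q below is notation introduced for this sketch (the fraction of all n! permutations that are special); it does not appear in the original notes, and the initial identity permutation is ignored.

```latex
\Pr(n_w = j) = (1-q)^{\,j-1}\, q, \qquad j = 1, 2, \ldots
\qquad\Longrightarrow\qquad
E[n_w] = \sum_{j \ge 1} j\,(1-q)^{\,j-1}\, q = \frac{1}{q},
\qquad\text{so}\qquad
E[T_{spec}(n)] = \frac{c\,n}{q}.
```

Under this assumption a property held by a fixed fraction of permutations costs O(n) expected time, while a single target permutation (q = 1/n!) costs an expected c n n! time.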

What can be said about n_w and its expected value? Obviously n_w > 0 and, with probability 1, n_w < ∞, assuming that the desired permutation type is not an impossibility, e.g., p(1 : 10) has at least one p(i) = 59, say. Consider asking for p = (n, n − 1, . . . , 2, 1), the reverse identity. The probability of this permutation occurring is 1/n!, and so GRPfys may spend a very long time until it ‘hits’ it: the expected value of n_w is n!, and no finite bound on n_w itself can be guaranteed. This is, of course, the essence of a Las Vegas algorithm: it will find the correct answer, but it may take forever.

6.1 Generating Random Derangements.

A derangement is a permutation with no fixed points, i.e., p_i ≠ i, i = 1, 2, . . . , n. Obviously, a permutation can be checked in O(n) time to determine if it is a derangement. In this case the rejection method works in O(n) expected time because Pr(p is a derangement) = D(n)/n! → 1/e ≈ 0.36788, where D(n) is the number of derangements of n items. Hence the expected time complexity is E(n_w) n = e n ≈ 2.718 n, i.e., on average, the while loop of GRDrej is performed about 2.718 times per derangement generated.
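The limit D(n)/n! → 1/e is reached very quickly, which a few lines of code can show. The sketch below uses the standard inclusion-exclusion sum D(n)/n! = Σ_{k=0}^{n} (−1)^k/k!; DerangementFraction is a hypothetical name, not an RPGLab function.

```matlab
function q = DerangementFraction(n)
% Fraction of the n! permutations of 1..n that are derangements,
% q = D(n)/n! = sum_{k=0}^{n} (-1)^k / k!.
% Each term is obtained from the previous one, so no factorial is formed.
q = 0;
term = 1;                 % (-1)^0 / 0!
for k = 0:n
    q = q + term;
    term = -term/(k+1);   % next term: (-1)^(k+1) / (k+1)!
end
end
```

The reciprocal 1/DerangementFraction(n) is then the expected number of passes through the rejection loop for that n.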

The Rejection Derangement Generator

function p = GRDrej(n);
% Generate a random permutation p(1:n) using GRPfys
% and reject if this is not a derangement.
% Requires expected e = 2.7183... passes through the while-loop.
NotDer = true;
while NotDer
    p = GRPfys(n);
    NotDer = HasFP(p);      % Derangement check
end;
return  % GRDrej

The Martinez-Panholzer-Prodinger Derangement Generator.

This is a new derangement algorithm by Martinez et al. (footnote 11) that is not easy to understand but seems to work well. Shown below are the original algorithm and its MATLAB implementation.

11. http://www.siam.org/proceedings/analco/2008/anl08_022martinezc.pdf

function GRPmpp(n) → p
    Martinez et al’s original Algorithm which generates a random
    derangement of the integers 1,2,...,n
    p := [1, 2, . . . , n]                    Identity permutation
    Mark[1 : n] := false
    i := n; u := n
    while u ≥ 2 do
        if ¬Mark[i] then
            repeat
                j := Random(1, i − 1)
            until ¬Mark[j]
            p[i] :=: p[j]                     Swap
            r := Uniform(0, 1)
            if r < (u − 1) D_{u−2} / D_u then     D_u = number of derangements of u items
                Mark[j] := true
                u := u − 1
            endif
            u := u − 1
        endif ¬Mark[i]
        i := i − 1
    endwhile
    return p
endfunc GRPmpp

function p = GRDmpp(n);
p = I(n);                           % identity permutation
mark = p < 0;                       % mark(1:n) = false
i = n; u = n;
while u > 1
    if ~mark(i)
        j = ceil(rand*(i-1));       % random j in [1,i-1]
        while mark(j)
            j = ceil(rand*(i-1));   % random j in [1,i-1]
        end;
        t = p(i);
        p(i) = p(j);                % Swap p(i) and p(j)
        p(j) = t;
        r = rand;
        if r < 1/u                  % Prob. if test for n large
            mark(j) = true;
            u = u-1;
        end;
        u = u-1;
    end;                            % if ~mark
    i = i-1;
end;                                % while u > 1
return  % GRDmpp

Tests on GRDmpp show that on average the outer while loop is performed 2 times, as predicted by the analysis. Hence this is faster than the Rejection algorithm which requires e = 2.718 loops, on average.
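The MATLAB version replaces the test probability (u − 1)D_{u−2}/D_u of the original algorithm by 1/u, which its comment describes as the value for n large. A small sketch, under the assumption that D_u denotes the number of derangements of u items, shows how close the two are; MppMarkProb is a hypothetical helper and not part of RPGLab.

```matlab
function [exact, approx] = MppMarkProb(umax)
% exact(u)  = (u-1)*D(u-2)/D(u), the test probability in the pseudocode
% approx(u) = 1/u, the large-u shortcut used in GRDmpp
% With q(u) = D(u)/u! the exact value equals q(u-2)/(u*q(u)),
% which avoids forming huge factorials.
q = zeros(1, umax+1);            % q(k+1) holds D(k)/k!, k = 0..umax
q(1) = 1;
term = 1;
for k = 1:umax
    term = -term/k;              % (-1)^k / k!
    q(k+1) = q(k) + term;
end
exact  = nan(1, umax);           % entry u = 1 is unused
approx = nan(1, umax);
for u = 2:umax
    exact(u)  = q(u-1)/(u*q(u+1));   % q(u-1) is D(u-2)/(u-2)!
    approx(u) = 1/u;
end
end
```

For u = 2, 3, 4 the exact values are 1, 0 and 1/3 against 1/2, 1/3 and 1/4 for the shortcut, so the two agree only as u grows.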


Derangement Generator Timing Tests

Timing tests were run on the two derangement functions along with GRDmex, which is GRDrej using Jan Simon’s fast GRPmex function.

Table 6. RANDOM DERANGEMENT GENERATOR TIMES (SECS)

  n        GRDrej            GRDmex            GRDmpp
           time    norm.     time    norm.     time    norm.
  10^5       0.04   8.91      0.01   1          0.02   4.13
  10^6       0.26   3.48      0.08   1          0.24   3.20
  10^7      18.06   5.70      4.83   1.52       3.17   1
  10^8     144.11  14.38     10.02   1         38.03   3.80

Times and normalized times, averaged over a sample of 10 runs for each n.

Expected Running Times of the Derangement Generators

The Rejection algorithm (GRDrej) and the Martinez algorithm (GRDmpp) are examples of Las Vegas Algorithms, i.e., they are guaranteed to give the correct result but they may run for a long (random) time. The Fisher-Yates Shuffle algorithm has a constant running time for a given n, which I will call T_s(n). The running time of the Rejection algorithm is T_r(n) = n_w × (T_t(n) + T_s(n)), where n_w is a random variable that counts the number of times the while loop is executed and T_t(n) is the time for the derangement test. The expected value of n_w is e = 2.7183 . . . because the probability of the Shuffle algorithm generating a derangement is 1/e. Hence E[T_r(n)] = e (T_t(n) + T_s(n)) ≈ 2.7183 T_s(n), if we assume that T_t(n) is negligible compared to T_s(n).

Martinez, et al., prove that the expected running time of their algorithm is T_m(n) = 2 T_s(n) and so it is faster, on average, than the Rejection algorithm, but not by much. My advice would be: if you have a good fast Fisher-Yates Shuffle program, use it in the Rejection algorithm, especially if you are risk-averse, because the Martinez algorithm is harder to implement correctly. We can see in the timing table above that the Rejection algorithm using Jan Simon’s GRPmex was fastest for n = 10^5, n = 10^6 and n = 10^8. GRDmpp was fastest for n = 10^7. An important added advantage of the Rejection algorithm is that it uses just 1 array of length n, whereas the Martinez algorithm uses 2 arrays of length n.

The interesting question remains: is random derangement generation inherently more difficult than random permutation generation, or is there a derangement algorithm with T_d(n) = T_s(n)?

6.2 Generating Random Cyclic Permutations.

A cyclic permutation has a single cycle of length n. A permutation can be tested in O(n) time to see if it is cyclic. Hence GRPSpec will perform an O(n) cyclic test followed by an O(n) permutation generation for each iteration of the while loop. If n_w is the number of times the while loop is performed then the complexity of this method is O(n_w n). From (3.4) we have Pr(p is cyclic) = 1/n. Hence the expected value of n_w is n and so the expected complexity of GRPSpec is O(n^2), for cyclic permutations.


Sattolo’s Cyclic Permutation Generator.

S. Sattolo, in her Master’s thesis, gave a very simple modification of the Fisher-Yates algorithm that generates random cyclic permutations (footnote 12).

function GRCsat(n) → p
    Generates a random cyclic permutation
    of the integers 1,2,...,n
    p := [1, 2, . . . , n]            Identity perm.
    for k := 2 to n do
        r := RandInt(1, k − 1)        Sattolo’s modification
        p[k] :=: p[r]                 Swap
    endfor k
    return p
endfunc GRCsat

We can see that the only change in the Fisher-Yates algorithm is this: r := RandInt(1, k) becomes r := RandInt(1, k − 1). This change ensures that it swaps different elements of p, i.e., p[k] :=: p[r], k ≠ r. It is remarkable that such a small change in the Fisher-Yates algorithm can make such a big difference in its output.

The correctness of this algorithm has been proved and it has been thoroughly analysed by Prodinger (footnote 13). It obviously has the same O(n) complexity as the Fisher-Yates algorithm, and so it is a factor of n faster than the O(n^2) rejection method.
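The claim is easy to check empirically. The sketch below combines the modified shuffle with a walk along the cycle through element 1; SattoloSketch is a hypothetical name, and the inline swap stands in for the transposition step.

```matlab
function [p, isCyclic] = SattoloSketch(n)
% Shuffle with r drawn from 1..k-1 (never k), written with an inline
% swap, followed by a walk along the cycle through element 1 to
% confirm that its length is n.
p = 1:n;
for k = 2:n
    r = ceil(rand*(k-1));          % random integer in 1..k-1
    t = p(k); p(k) = p(r); p(r) = t;
end
len = 1;
next = p(1);
while next ~= 1
    len = len + 1;
    next = p(next);
end
isCyclic = (len == n);
end
```

Every call should return isCyclic equal to true, whatever n is chosen.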

12. S. Sattolo, “An algorithm to generate a random cyclic permutation”, Information Processing Letters, Vol. 22, pages 315–317, 1986.

13. Prodinger, Helmut, “On the analysis of an algorithm to generate a random cyclic permutation”, Ars Combinatoria, Vol. 65, 2002.


7 Testing Combinatorial Generators

Combinatorial generation algorithms and programs are often small, deceptively simple, subtle, and, above all, error-prone. These small programs are often buried deep as subprograms (functions) in large simulations whose operation depends crucially on their correct and efficient working. It should go without saying that these generators must be rigorously tested.

7.1 Random Generator Tests

These tests are for any n, but usually with n in the range [10^6, 10^10], where exhaustive testing is out of the question.

1. Existential. Does the generator output the correct form of combinatorial object? For example, (1) does a permutation generator produce (all possible) permutations? (2) does a random derangement generator produce (all possible) derangements? and (3) does a random cyclic permutation generator produce (all possible) cyclic permutations?

2. Uniformity. Are the random permutations distributed uniformly over the population of n! permutations? Uniformity implies that each number in {1, 2, . . . , n} occurs in each position with relative frequency 1/n.

3. Frequency. Counting Classes. Derangements, Cycle Structure, etc.

4. Others

— TO BE COMPLETED —
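As one possible form of the uniformity test in item 2, the relative frequency of each value in each position can be tabulated for small n. This is a sketch under assumptions: PositionFreq is a hypothetical name, the shuffle is written inline in the style of GRPfys, and the n-by-n table limits it to small n.

```matlab
function F = PositionFreq(n, nsamp)
% F(i,j) = relative frequency with which value j lands in position i
% over nsamp shuffles. For a uniform generator every entry should be
% close to 1/n (a necessary condition, not a sufficient one).
C = zeros(n, n);
for s = 1:nsamp
    p = 1:n;
    for k = 2:n                      % Fisher-Yates shuffle, as in GRPfys
        r = ceil(rand*k);
        t = p(r); p(r) = p(k); p(k) = t;
    end
    for i = 1:n
        C(i, p(i)) = C(i, p(i)) + 1;
    end
end
F = C/nsamp;
end
```

Each row and each column of F sums to 1 by construction, so it is the spread of the individual entries around 1/n that carries the information.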


Here is an example of an existential test of a derangement generator.

function [p,k,t] = FindPerm(target);
% Check if RPG can hit (find) the permutation 'target'
% USE: [p,k,t] = FindPerm([5 4 6 7 1 8 10 2 3 9]);
% The example target is a derangement but not cyclic
% Warning: use small permutations. O(n!) time.
% Derek O'Connor 28 Dec 2010.
n = length(target)
p = randpermfull2(n);
limit = 5*factorial(n);
k = 0;
tic;
while any(p-target) && k < limit
    p = randpermfull2(n);   % Ver 2, Jos van der Geest's 'derangement' gen.
    k = k+1;
end;
t = toc;
Frac = k/factorial(n);
dispa('-------------------------------------');
if any(p-target)            % test the final p, so a hit on the last pass counts
    dispa('Target NOT FOUND after k =', k,'iterations. Time =',t,'secs.');
else
    dispa('Target FOUND after k =', k,'iterations. Time =',t,'secs.');
end;
dispa('Fraction of Space searched (k/n!) =', Frac ,'Rate =', ceil(k/t), 'per second');
[target;p]
return

The target is [5 4 6 7 1 8 10 2 3 9]. HasFP(target) returns 0 or false so we know that target is a derangement. Let’s see if it is a cyclic permutation: IsCyc(target) returns 0 or false, so now we know that target is a derangement that is not cyclic. If randpermfull is a true derangement generator then it should generate this derangement after a sufficiently large number of iterations. Here is what happens:

target = [5 4 6 7 1 8 10 2 3 9];
[p,k,t] = FindPerm(target);
-------------------------------------------------------
Target NOT FOUND after k = 18144000 iterations. Time = 202.6715 secs.
Fraction of Space searched (k/n!) = 5 Rate = 89525 per second
ans =
     5     4     6     7     1     8    10     2     3     9
     6     5     4     1    10     8     3     2     7     9
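How surprising is a miss after this many draws? A rough calculation, assuming (hypothetically) that the generator under test sampled uniformly and independently from all derangements of n items, is sketched below. MissProbability is a name introduced here; the derangement count uses the standard recurrence D(m) = (m − 1)(D(m − 1) + D(m − 2)).

```matlab
function [D, pMiss] = MissProbability(n, k)
% D     = number of derangements of n items (n >= 2), from the recurrence
%         D(m) = (m-1)*(D(m-1) + D(m-2)), with D(0) = 1, D(1) = 0.
% pMiss = probability that k independent uniform draws from the D
%         derangements all miss one fixed target derangement.
Dprev = 1;                          % D(0)
D = 0;                              % D(1)
for m = 2:n
    Dnext = (m-1)*(D + Dprev);
    Dprev = D;
    D = Dnext;
end
pMiss = exp(k*log1p(-1/D));         % (1 - 1/D)^k, computed stably
end
```

For the run above, MissProbability(10, 18144000) gives D = 1334961 and a miss probability of roughly 10^-6, so under the uniformity assumption a miss would be very unlikely to be bad luck.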


Here is an example of an important frequency test of a permutation generator.

function nfp = FreqFixedPts(n,nsamp);
% Frequency Count of fixed points in
% a sample of random permutations p(1:n).
% USE: nfp = FreqFixedPts(20,10^4);
% Derek O'Connor 8th Jan 2011.
nfp = zeros(nsamp,1);
for s = 1:nsamp
    p = GRPfys(n);
    nfp(s) = CountFixedPts(p);
end;
OutFreqFixedPts(nfp);
return; % FreqFixedPts

Figure 6 confirms that the expected value and variance of the number of fixed points are both 1, for any n, and that the relative frequency of 0 fixed points is about 1/e ≈ 0.3679, which is the relative frequency of derangements. A bad RPG would be unlikely to give these results. Indeed, it was this test that showed that there was something wrong with GRDrej. The fault was identified in the ‘simple’ IsDer interrogation function: a compound while test was wrong. The function was re-written as IsDer(p) = ~HasFP(p).
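The helper functions CountFixedPts and OutFreqFixedPts are not listed in these notes. A self-contained sketch of the same frequency test, with the counting and the summary statistics written out, might look as follows; FixedPointStats and its outputs are names assumed here.

```matlab
function [nfp, avfp, varfp, frac0] = FixedPointStats(n, nsamp)
% Count fixed points in nsamp random permutations of 1..n and return
% the counts, their sample mean and variance, and the fraction of
% samples with no fixed point (i.e. the fraction of derangements).
nfp = zeros(nsamp, 1);
for s = 1:nsamp
    p = 1:n;
    for k = 2:n                      % Fisher-Yates shuffle, as in GRPfys
        r = ceil(rand*k);
        t = p(r); p(r) = p(k); p(k) = t;
    end
    c = 0;
    for k = 1:n                      % loop-form fixed point count
        if p(k) == k
            c = c + 1;
        end
    end
    nfp(s) = c;
end
avfp  = mean(nfp);
varfp = var(nfp);
frac0 = sum(nfp == 0)/nsamp;
end
```

With a sound generator, avfp and varfp should both be near 1 and frac0 near 0.3679 for a reasonably large sample.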

Figure 6. FIXED POINT FREQUENCIES

— TO BE COMPLETED —


8 RPGLab

In this section (footnote 14) we describe a set of simple tools, MATLAB functions, that help us manipulate and test permutations. This is done in the spirit of Kernighan and Plauger’s Software Tools in Pascal, Addison-Wesley, 1981, one of the great books in computer science. Sadly, the lessons and insights of this book are either unknown to or ignored by many of today’s MATLAB programmers. In the same spirit as Kernighan & Plauger, the books by Jon Bentley are devoted to small, simple, but powerful programs.

I hope that these simple functions will prove useful to those who wish to experiment with random permutations. Hence the suffix LAB.

Simplicity and Correctness. The functions in RPGLAB have been designed to be as simple as possible and as simple to use as possible. The inputs are: an integer n, or a row vector p(1 : n), or an integer and a vector. Outputs are: a logical ans, or a vector p.

Naming Conventions. Mathematics derives its power from the judicious choice of symbols it uses to name objects and processes. For example, ∑_{i=1}^{n} x_i conveys an immense amount of information in a very compact form. Expressing this in a programming language (except APL) would take many more symbols, while expressing it in ordinary English would take a paragraph or more. With this in mind we have used the shortest possible names that are compatible with conveying information that is essential to understanding a piece of code. The names we use are not explanations but symbolic reminders of what the name stands for. Thus p = GRPfys(n) reminds us that it stands for the process of Generating a Random Permutation of length n using the fisher-yates shuffle algorithm, and stores it in the row vector p. This name and others have to be studied, understood, and remembered, before they can be used with facility – just as in Mathematics.

Error Checking. What may shock many MATLABers is the lack of input error checking. This is in the spirit of Kernighan & Plauger who assume that the users are reasonably competent and not lazy. After all, the makers of 36" pipe wrenches don’t check to see if you are using one to fix your bicycle.

If you generate and manipulate permutations with these functions only, then only valid permutations will be generated. Otherwise you must do your own error-checking.

Comments. Another shock is that very few comments are used, except for a succinct statement of purpose at the head of each function. It has become fashionable in some quarters to write mini-theses at the start of each function, along with enough biographical data to satisfy the Library of Congress. I do include my name in the header of each function (footnote 15). The header comments in the functions that follow have been stripped out to save space, but they are in the m-files.

14. I have written this section to be independent of the others. As a result, some code and discussions are repeated.

15. This reminds me of a comment by a famous art historian: “No matter how abstract the painting, the signature is always very clear”.


‘MATLAB-isms’. Perhaps the biggest shock to MATLABers will be that there is virtually no use of MATLAB’s vectorizations and fancy array manipulation functions. Most code is written in loop or component form, as the Numerical Linear Algebra people call it. I believe that MATLAB’s matrix-vector notation and manipulation functions are very useful, when used in the proper place, but many MATLABers have made a fetish of vectorization and index manipulation, to the detriment of code clarity, and, quite often, to speed.

A benefit of not using what I call ‘MATLAB-isms’ is that these functions are easily translated into other languages, such as Fortran and C, making them good Mex candidates.

MATLAB is a very powerful and useful system for numerical computing. It allows the knowledgeable user to do in a few lines what would take hundreds of lines of tedious Fortran or C code. A superb example of the power of good MATLAB programming is Trefethen’s Chebfun system (footnote 16). Trefethen, by the way, claims to be the first license holder of MATLAB.

8.1 The Primitives

These are the low-level functions that are used everywhere and are so simple that they are, we hope, obviously correct. Writing such simple functions is not a trivial task.

These primitives fall into two classes: (1) permutation constructors, and (2) permutation interrogators.

Table 7. PERMUTATION PRIMITIVES.

Class      Name     Description                                Use
Constrs.   I        Generate the identity permutation          p = I(n)
           Trans    Transpose elements i and j of p            p = Trans(p,i,j)
           Rev      Reverse the elements p(i), . . . , p(j)    p = Rev(p,i,j)
           Rot      Left circular shift p by k positions       p = Rot(p,k)
           GRPfys   Generate a random permutation p(1 : n)     p = GRPfys(n)
           GRPmex   Generate a random permutation p(1 : n)     p = GRPmex(n)
Interrs.   IsPer    Is p a permutation?                        ans = IsPer(p)
           HasFP    Does p have a fixed point?                 ans = HasFP(p)
           IsDer    Is p a derangement?                        ans = IsDer(p)
           IsCyc    Is p cyclic?                               ans = IsCyc(p)

Inputs and Outputs: i, j, k, n are integers, p is a permutation, ans is logical.

16. http://www2.maths.ox.ac.uk/chebfun/

8.2 Constructors

Identity. p = I(n)

function p = I(n);
p = 1:n;            % a row vector
return;             % I(n)

Transpose. p = Trans(p,i,j)

function p = Trans(p,i,j);
t = p(i);
p(i) = p(j);
p(j) = t;
return;             % Trans

Reverse. p = Rev(p,i,j)

function p = Rev(p,i,j);
while i < j
    p = Trans(p,i,j);
    i = i+1;
    j = j-1;
end;
return;             % Rev

This code reverses elements p(i) . . . p(j) of p(1, 2, . . . , n) (footnote 17). Although there is no error checking, this code is quite robust: if i ≥ j then nothing happens. In practice, the transposition function is replaced by the 3-statement swap operation. This operation is now used in the rotation operation in a very clever way.

index   1  2  3  4  5  6  7  8  9 10
p       5  3  6  8  1  4  7  2 10  9      i = 3, j = 7
p       5  3  7  8  1  4  6  2 10  9      i = 4, j = 6
p       5  3  7  4  1  8  6  2 10  9      i = j = 5

Figure 7. REVERSE OPERATION

17. See Kernighan & Plauger, Software Tools in Pascal, Addison-Wesley, 1981, pages 194, 195.


Rotate. p = Rot(p,k)

function p = Rot(p,k);
n = length(p);
p = Rev(p,1,k);
p = Rev(p,k+1,n);
p = Rev(p,1,n);
return;             % Rot

This does a left circular shift of all elements of p by k positions. That is,

$$[p(1), \ldots, p(k), p(k+1), \ldots, p(n)] \xrightarrow{\mathrm{Rot}(p,k)} [p(k+1), \ldots, p(n), p(1), \ldots, p(k)]$$

Here is how the three reversals work:

$$\begin{aligned}
{[p(1), \ldots, p(k), p(k+1), \ldots, p(n)]} &\xrightarrow{\mathrm{Rev}(p,1,k)} [p(k), \ldots, p(1), p(k+1), \ldots, p(n)] \\
{[p(k), \ldots, p(1), p(k+1), \ldots, p(n)]} &\xrightarrow{\mathrm{Rev}(p,k+1,n)} [p(k), \ldots, p(1), p(n), \ldots, p(k+1)] \\
{[p(k), \ldots, p(1), p(n), \ldots, p(k+1)]} &\xrightarrow{\mathrm{Rev}(p,1,n)} [p(k+1), \ldots, p(n), p(1), \ldots, p(k)]
\end{aligned}$$

At first glance, this code may seem inefficient because it is doing three reversals. However, the Reverse function is efficient: it performs about (j − i)/2 transpositions, or a total of about 3(j − i)/2 element moves. Hence, the Rotate operation performs

$$\frac{3(k-1)}{2} + \frac{3(n-k-1)}{2} + \frac{3(n-1)}{2} = \frac{6n-9}{2} \sim 3n \ \text{element moves.}$$

Although this is not optimal (each element is moved twice) it is very efficient nonetheless. There is an extensive body of research literature on this and related topics. The rotation operation is an important low-level operation in text editors.
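The three-reversal rotation can be checked against a direct index rearrangement. The sketch below is self-contained and uses the 3-statement swap instead of Trans; RotSketch and reverseRange are hypothetical names.

```matlab
function q = RotSketch(p, k)
% Left circular shift of p by k positions (0 <= k <= n) using three
% in-place reversals.
n = length(p);
q = p;
q = reverseRange(q, 1, k);
q = reverseRange(q, k+1, n);
q = reverseRange(q, 1, n);
end

function p = reverseRange(p, i, j)
% Reverse p(i..j) with the 3-statement swap; does nothing if i >= j.
while i < j
    t = p(i); p(i) = p(j); p(j) = t;
    i = i + 1;
    j = j - 1;
end
end
```

For p = [5 3 6 8 1 4 7 2 10 9] and k = 3 the result is [8 1 4 7 2 10 9 5 3 6], the same as p([k+1:n, 1:k]).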

Generate a Random Permutation. p = GRPfys(n)

function p = GRPfys(n);
p = I(n);
for k = 2:n
    r = RandInt(1,k);
    p = Trans(p,k,r);
end;
return;             % GRPfys

This is the increasing-loop version of the Durstenfeld version of the Fisher-Yates Shuffle algorithm, shown below, along with Pike’s modification for a partial shuffle.

For those who have never seen, let alone written an Algol program, the function entier(x) is the largest integer not greater than the value of x (footnote 18).

18. Looking at Durstenfeld’s nicely-typeset Shuffle procedure, it is a shock to realize that it is a valid Algol procedure, comments and all. Now, fifty years later, it is a poor reflection on today’s language designers, compiler-interpreter writers, and program-editor makers, that they cannot handle or present us with nicely typeset programs, despite the huge strides made in mathematical typesetting, a much more difficult task than program typesetting. But we do have 19 different types of assignment statements in Java. Programmers of the World, Protest!


Generate a Random Permutation. p = GRPmex(n)

function p = GRPmex(n);
p = ShuffleMex(n,'index');      % Jan Simon's mexed version of F-Y Shuffle
return;                         % GRPmex

This is 2 to 3 times faster than the pure MATLAB version GRPfys(n).


8.3 Interrogators

Valid Permutation Check. ans = IsPer(p)

function ans = IsPer(p)
n = length(p);
count = zeros(n,1);
ans = true;
for k = 1:n
    if count(p(k)) == 0
        count(p(k)) = count(p(k))+1;
    else
        ans = false;            % Stop after first bad p(k)
        return
    end;
end;                            % for k
return;                         % IsPer


A permutation p is valid if and only if each k ∈ {1, 2, . . . , n} appears once and only once. This is an expensive check because it uses an extra array (footnote 19). Notice that the function returns as soon as a bad value is found. No further time is wasted checking for other bad points. We will use this stop-as-soon-as-possible principle throughout the other interrogation functions.
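As written, IsPer indexes count with p(k) directly, so an entry outside 1..n or a non-integer entry stops it with an indexing error instead of a false result. If that matters, a range-checked variant is one option. The sketch below steps outside the no-error-checking philosophy of the package; IsPerChecked is a hypothetical name.

```matlab
function ok = IsPerChecked(p)
% Variant of IsPer that also rejects entries outside 1..n and
% non-integer entries, instead of failing with an indexing error.
n = length(p);
seen = false(n, 1);
ok = true;
for k = 1:n
    v = p(k);
    % Short-circuit || ensures seen(v) is only read for a valid index.
    if v < 1 || v > n || v ~= floor(v) || seen(v)
        ok = false;          % stop after the first bad p(k)
        return
    end
    seen(v) = true;
end
end
```

The logical array seen plays the role of count and keeps the same stop-as-soon-as-possible behaviour.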

Fixed Point Check. ans = HasFP(p)

function ans = HasFP(p)
n = length(p);
ans = false;
for k = 1:n
    if p(k) == k
        ans = true;
        return;                 % stops after first fixed point.
    end;
end;                            % for k
return  % HasFP

A permutation p has a fixed point if p(k) = k, for some k = 1, 2, . . . , n. Note that HasFP(p) returns false if p is a derangement. This function could be written more succinctly with a compound while statement, but such statements are (for me at least) error prone. It is well to remember that succinctness and simplicity are often at odds.

Derangement Check. ans = IsDer(p)

function ans = IsDer(p);
ans = ~HasFP(p);
return  % IsDer

If you are sure that HasFP(p) is correct, then IsDer(p) is obviously correct. Replace the function call with inline code if necessary, but check it carefully (footnote 20).

19. I’m sure there are better ways of doing this check. Later, maybe.

20. My first attempt at IsDer(p) had an error in a compound while check. This error did not show up until much later, when I did frequency tests on the permutation generators.


Cyclic Permutation Check. ans = IsCyc(p)

function ans = IsCyc(p);
n = length(p);
start = 1;
next = p(start);
L = 1;
while next ~= start             % stops at the end of first cycle.
    L = L+1;
    next = p(next);
end;
ans = L == n;
return;                         % IsCyc

A permutation p is cyclic if it has a cycle of length n. This function starts arbitrarily at element 1. If element 1 is part of a cycle whose length is less than n, then p cannot be cyclic, and false is returned.
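The same cycle walk extends naturally to all cycles of p. The following sketch returns the length of every cycle; it is offered only as an illustration, CycleLengths is a hypothetical name, and it is not claimed to match the package's own cycle-structure function.

```matlab
function L = CycleLengths(p)
% Lengths of all cycles of the permutation p, found by walking each
% cycle once and marking the elements already visited.
n = length(p);
visited = false(1, n);
L = zeros(1, 0);
for start = 1:n
    if ~visited(start)
        len = 0;
        next = start;
        while ~visited(next)
            visited(next) = true;
            len = len + 1;
            next = p(next);
        end
        L(end+1) = len;          % one entry per cycle
    end
end
end
```

In these terms p is cyclic when L is the single value n, and p is a derangement when no entry of L equals 1. For the target [5 4 6 7 1 8 10 2 3 9] used in Section 7 the result is [2 8]: a derangement that is not cyclic.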

8.4 Special Permutation Generators

There are many special permutations. Here we give two which seem to be the most useful: Derangement and Cyclic.

Table 8. SPECIAL GENERATORS.

Name     Description                                      Use
GRDrej   Generate a random derangement p(1 : n)           p = GRDrej(n)
GRDmex   Generate a random derangement p(1 : n)           p = GRDmex(n)
GRDmpp   Generate a random derangement p(1 : n)           p = GRDmpp(n)
GRCsat   Generate a random cyclic permutation p(1 : n)    p = GRCsat(n)

Generate a Derangement. p = GRDrej(n)

function p = GRDrej(n);
NotDer = true;
while NotDer
    p = GRPfys(n);
    NotDer = HasFP(p);          % Derangement check
end;
return  % GRDrej


Generate a Derangement. p = GRDmex(n)

function p = GRDmex(n);
NotDer = true;
while NotDer
    p = GRPmex(n);
    NotDer = HasFP(p);          % Derangement check
end;
return  % GRDmex

This is just GRDrej using GRPmex, which uses Jan Simon’s ShuffleMex. This gives a speedup of 2 to 3 times.

Generate a Cyclic Permutation. p = GRCsat(n)

function p = GRCsat(n)
p = I(n);
for k = 2:n
    r = RandInt(1,k-1);         % Changing k to k-1 in GRPfys
    p = Trans(p,k,r);
end;
return;                         % GRCsat

This is Sattolo’s modification of GRPfys. We can see that the only change in GRCsat is the use of k − 1 instead of k. This causes a dramatic change in the output of this function: it generates cyclic permutations only. The correctness of this algorithm has been proved and it has been thoroughly analysed by Prodinger (footnote 21). It obviously has the same O(n) complexity as the Fisher-Yates algorithm.

21. Prodinger, Helmut, “On the analysis of an algorithm to generate a random cyclic permutation”, Ars Combinatoria, Vol. 65, 2002.


Generate a Random Integer. r = RandInt(L,U)

This function is included because it is easy to get wrong. It can be replaced by the single line of code below. The purpose of this function is to pick a random integer from the set {L, L + 1, . . . , U − 1, U}. This is done with the single statement

r = L + floor(rand ∗ (U − L + 1)) (8.1)

We wish to prove that this statement works correctly, given that rand is MATLAB’s implementation of the Mersenne Twister generator. MATLAB’s documentation on rand states that

The rand function now supports a method of random number generation called the Mersenne Twister. The algorithm used by this method, developed by Nishimura and Matsumoto, generates double precision values in the closed interval [2^−53, 1 − 2^−53], with a period of (2^19937 − 1)/2

From this information we have from (8.1)

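$$\begin{aligned}
r &\in L + \mathrm{floor}\bigl([2^{-53},\ 1 - 2^{-53}] \times (U - L + 1)\bigr) \\
  &= L + \mathrm{floor}\,[2^{-53}(U - L + 1),\ (U - L + 1) - 2^{-53}(U - L + 1)] \\
  &= L + \mathrm{floor}\,[\epsilon,\ U - L + 1 - \epsilon], \quad \text{where } \epsilon = 2^{-53}(U - L + 1), \\
  &= L + [\lfloor \epsilon \rfloor,\ \lfloor U - L + 1 - \epsilon \rfloor] \\
  &= L + [0,\ U - L], \quad \text{if } \epsilon < 1, \\
  &= [L,\ U].
\end{aligned}$$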

Now ε < 1 ⇒ 2^−53 (U − L + 1) < 1 ⇒ (U − L + 1) < 2^53. If L and U are 32-bit signed integers then (U − L + 1) is also a 32-bit signed integer in [−2^31, 2^31 − 1]. Hence −2^31 ≤ (U − L + 1) ≤ 2^31 − 1, and the condition ε < 1, or (U − L + 1) < 2^53, obviously holds.

If L and U are 64-bit signed integers then (U − L + 1) is also a 64-bit signed integer in [−2^63, 2^63 − 1]. Hence −2^63 ≤ (U − L + 1) ≤ 2^63 − 1, and the condition ε < 1, or (U − L + 1) < 2^53, may not hold.

Warning: Do not use 64-bit integers with r = L + floor(rand*(U-L+1)).
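Statement (8.1) wrapped as a function looks like this. The sketch assumes L and U are integer-valued doubles with L ≤ U and U − L + 1 < 2^53, as required by the argument above; RandIntSketch is a hypothetical name used to avoid clashing with the package's own RandInt.

```matlab
function r = RandIntSketch(L, U)
% Random integer from {L, L+1, ..., U}, following statement (8.1).
% Assumes L <= U are integer-valued doubles with U-L+1 < 2^53.
r = L + floor(rand*(U - L + 1));
end
```

MATLAB's built-in randi([L U]) draws from the same set and could be used as a cross-check.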

The period of rand is (2^19937 − 1)/2 ≈ 20 × 10^6000, which is a gigantic number, at least to ordinary mortals. However, the sets of permutations that GRPfys(n) samples from are gigantically larger than rand’s gigantic period (See Table 2 above). On a 2.3 GHz, 16 GB machine, random permutations of length n = 10^9 can be generated in a few minutes. Thus we are sampling from a space of N = n! = (10^9)! objects. Using Stirling’s approximation we find log_e (10^9)! ≈ 2 × 10^10, which means that N = n! is an integer with about 10^10 digits. This means that despite rand’s gigantic period – an integer with a mere 6000 digits – GRPfys(n) will never visit more than an infinitesimally small fraction of the permutation space. Yet we can say, with a certain amount of confidence, that GRPfys will produce a permutation p(1 : 10^9) with probability 1/(a 10^10-digit number).
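The digit count can be checked without Stirling's formula, since MATLAB's gammaln gives the logarithm of the factorial directly: ln(n!) = gammaln(n + 1). A short sketch follows; FactorialDigits is a hypothetical name, and the result is approximate only in the sense of floating-point rounding.

```matlab
function d = FactorialDigits(n)
% Approximate number of decimal digits of n!, using
% ln(n!) = gammaln(n+1), so no huge integer is ever formed.
d = floor(gammaln(n + 1)/log(10)) + 1;
end
```

For n = 10^9 this returns roughly 8.6 × 10^9, in line with the estimate of about 10^10 digits.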


— TO BE COMPLETED —

Table 9. OTHER FUNCTIONS.

Name        Description                             Use
InvP        The inverse of p. p(q) = q(p) = I       q = InvP(p)
CanForm     Arrange p in canonical form             q = CanForm(p)
CycStruct   Determine cycle structure of p          [ncyc,S,L] = CycStruct(p)
FindPerm    Can an RPG ‘hit’ the target pt          [p,k,time] = FindPerm(pt)
