January 2011 Advanced Programming Course Notes George C. Wells Department of Computer Science Rhodes University Grahamstown 6140 South Africa EMail: [email protected]


January 2011

Advanced Programming

Course Notes

George C. Wells

Department of Computer Science

Rhodes University

Grahamstown 6140

South Africa

EMail: [email protected]


Copyright © 2011. G.C. Wells, All Rights Reserved.

Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies, and provided that the recipient is not asked to waive or limit his right to redistribute copies as allowed by this permission notice.

Permission is granted to copy and distribute modified versions of all or part of this manual or translations into another language, under the conditions above, with the additional requirement that the entire modified work must be covered by a permission notice identical to this permission notice.

Acknowledgements

Parts of these notes are adapted from previous course notes written by Pat Terry and David Sewry. In addition, Peter Wentworth has been a valuable source of good ideas. Any errors or lack of clarity are, of course, a result of my failure to distil their collective wisdom.


Contents

I Introduction

1 What is “Advanced Programming”?
1.1 Introduction
1.1.1 About the Course
1.2 Writing Good Programs
1.2.1 Correctness
1.2.2 Ease of Use
1.2.3 Generality and Efficiency
1.2.4 Portability
1.2.5 Clarity
1.2.6 Ease of Coding and Testing
1.2.7 Ease of Modification
1.3 Techniques for Improving Program Quality
1.3.1 Preconditions and Postconditions
1.3.2 Assertions
1.3.3 Automated Documentation
1.4 Concluding Comments

2 Old Friends Revisited
2.1 Recursion
2.1.1 Introduction
2.1.2 Calculating Factorials
2.1.3 Writing a String Backwards
2.2 Interfaces in Java
2.2.1 Interfaces and Polymorphism

3 Driven to Abstraction!
3.1 The Need for Abstraction
3.1.1 Some Examples of Abstraction
3.2 Abstraction in Modern Programming Languages
3.2.1 Procedural Abstractions


3.2.2 Abstract Data Types
3.2.3 Information Hiding: Client and Implementor Views
3.3 Closing Comments

II Advanced Data Structures

4 Vectors and Lists
4.1 Introduction
4.2 Vectors
4.3 Linked Lists
4.4 Generic Lists
4.4.1 A Generic List Class Using Polymorphism
4.4.2 A Generic List Class Using Java’s Generic Features
4.5 Closing Remarks

5 Stacks and Queues
5.1 Introduction
5.2 Stacks
5.2.1 An Array-based Stack Implementation
5.2.2 A Linked List Stack Implementation
5.2.3 A Simple Example of the Use of a Stack
5.3 Queues
5.3.1 An Array-based Queue Implementation
5.3.2 A Linked List Queue Implementation
5.3.3 An Example of the Use of Queues
5.4 Deques, Circular Lists and Header Nodes
5.4.1 Implementation Techniques for Deques
5.4.2 A Java Class for Deques
5.4.3 The Use of Deques
5.5 Summary of the ADTs in Chapters Four and Five

6 Trees and Graphs
6.1 Introduction
6.2 Trees
6.2.1 Definitions and Terminology
6.2.2 Implementing Trees as Dynamic Data Structures
6.2.3 Converting a General Tree to a Binary Tree
6.2.4 A General Binary Tree Class in Java
6.2.5 Using the Tree Class
6.2.6 Traversing Trees and the Use of Iterators
6.2.7 Ordered Binary Trees


6.3 Graphs
6.3.1 Definitions and Terminology
6.3.2 Representation of Graphs

7 Making a Hash of It!
7.1 Introduction
7.2 Dictionaries
7.2.1 A Java Interface for Dictionary Data Structures
7.2.2 A Simple Java Class for Dictionaries
7.2.3 An Example of the Use of a Dictionary
7.3 Hash Tables
7.3.1 Hashing Functions
7.3.2 Internal Hashing with Open Addressing
7.3.3 External Hashing
7.4 Comparison of Dictionary Data Structures

III The Analysis of Algorithms

8 Big-O
8.1 Introduction
8.2 Algorithmic Complexity
8.2.1 The Impact of the Input Data
8.2.2 Big-O Notation
8.3 Some Examples
8.3.1 Very Simple Examples
8.3.2 A More Realistic Example

IV Some Common Algorithms

9 Searching
9.1 Introduction
9.1.1 Implementation Issues
9.2 Searching Techniques Revisited
9.2.1 Simple Sequential Search
9.2.2 Searching a Hash Table
9.3 Binary Search Techniques
9.3.1 Binary Search
9.3.2 Interpolated Binary Search
9.3.3 The Relationship Between the Binary Search and Binary Search Trees

10 Sorting


10.1 Introduction
10.2 Simple Sorting Methods
10.2.1 The Bubble Sort
10.2.2 The Insertion Sort
10.2.3 The Selection Sort
10.2.4 Summary of Simple Sorting Methods
10.3 Indirect Sorting
10.4 More Efficient Sorting Methods
10.4.1 The Quick Sort
10.4.2 The Merge Sort
10.5 Concluding Remarks

Index of Data Structures and Algorithms
Data Structures
Algorithms

Index

Bibliography

A File Listings
A.1 Lists and Vectors
A.1.1 IntegerVector.java
A.1.2 IntegerList.java
A.1.3 ObjectList.java
A.1.4 GenericList.java
A.2 Stacks
A.2.1 Stack.java
A.2.2 ArrayStack.java
A.2.3 ListStack.java
A.3 Queues
A.3.1 Queue.java
A.3.2 ArrayQueue.java
A.3.3 ListQueue.java
A.3.4 QSearch.java
A.4 The Iterator Interface: Iterator.java
A.5 Deques: Deque.java
A.6 Trees
A.6.1 Tree.java
A.6.2 Animal.java
A.6.3 BinarySearchTree.java
A.7 Dictionaries and Hash Tables


A.7.1 Dictionary.java
A.7.2 Pair.java
A.7.3 DictionaryPair.java
A.7.4 ListDictionary.java
A.7.5 Concordance.java
A.7.6 InternalHashTable.java
A.7.7 ExternalHashTable.java
A.8 Binary Searches: BinarySearch.java
A.9 Sorting: Sort.java


List of Figures

2.1 A Class Hierarchy for Employee Classes.
2.2 Fitting in PAs (won’t work in Java).
2.3 Fitting in PAs.
2.4 Using an Interface.
3.1 Abstract Views of a Computer System.
3.2 An Example Class Diagram.
5.1 A Tree View of the Breadth-first Search Strategy.
6.1 A Family Tree.
6.2 A Binary Tree (an Ancestor Tree).
6.3 Binary Tree Converted From a General Tree.
6.4 Knowledge-Base Tree.
6.5 Example Binary Tree.
6.6 A Binary Search Tree.
6.7 An Example of a Graph.
7.1 Full Class Diagram for the ListDictionary Class.
7.2 An External Hashing Table.
8.1 Illustration of Some Common Order Functions.
9.1 A Binary Search Tree Equivalent to Binary Searching an Array.


List of Tables

4.1 Operations on Lists
5.1 Summary of ADTs in Chapters Four and Five
6.1 Operations on Binary Search Trees
7.1 Sample Text
7.2 Examples of Hash Values
7.3 Summary of Dictionary ADTs
8.1 Results of Different Order Algorithms
9.1 Comparison of Sequential and Binary Searches
10.1 Comparison of Simple Sorting Algorithms


Part I

Introduction


This section lays the foundation for the rest of the course. It starts by covering the motivation for studying the topics included under the banner of “Advanced Programming”. It also introduces some important new concepts, and revises some programming language features with which you may be familiar already.


Chapter 1

What is “Advanced Programming”?

Objectives

• To introduce the underlying ideas of advanced programming

• To discuss good programming practice

• To introduce preconditions and postconditions, and assertions

• To introduce automatic program documentation mechanisms, specifically Javadoc

1.1 Introduction

Information and Communication Technology (ICT) has been responsible for some of the most amazing and complex innovations that mankind has ever developed, and our society is now almost completely dependent on ICT. One of the key enablers of this process of innovation is the ability to construct new systems: the process of programming.

Most of the programs that you have written and seen up until now have been very small in comparison with “real life” computer programs. In realistic applications it is not unusual for programs to be tens of thousands of lines long. For example, there is an open-source GIS (Geographic Information System) package that has over 900 000 lines of code. The techniques which work well for writing small programs to solve small problems start to fall apart very rapidly when applied to such life-size problems.

When dealing with introductory programming, both student and teacher tend to focus on the language details, and too little emphasis is given to the algorithms and data structures that are being used. In this way the forest tends to get lost in the trees: the problems of dealing with syntax, compilers, etc. obscure the bigger picture of problem solving and program design. This course attempts to redress the balance partially, by considering a number of common data structures and algorithms. The trade-offs that apply to selecting a particular data structure or algorithm are considered, as are methods for analysing the relative merits of differing approaches.

To be more specific in attempting to answer the question posed by the title of this chapter, this course:


• introduces some criteria for judging program quality

• illustrates a number of important data structures, and some of their areas of application

• discusses program efficiency, and how this can be measured

• presents a number of sorting and searching techniques together with their associated measures of efficiency

These aspects might be loosely categorised into two main areas: data structures and algorithms.

A further, extremely important aspect of dealing with large problems is that of design. This is an aspect that we will not be considering explicitly in this course, but will leave to later courses. Tackling a large programming project without a proper design would be like trying to construct a building without an architectural plan. Such an approach might work for a garden shed, but is likely to result in major problems if applied to a three-bedroomed house, and total disaster if used for a multistorey office block! Similarly, writing small programs can be tackled with little formal planning, but writing a program of any realistic size requires careful design and planning if it is to succeed.

1.1.1 About the Course

When teaching a course like this, particularly the data structures section, there are a number of different approaches that might be taken, based on different philosophies about teaching and learning styles. A top-down style would begin by discussing the applications of a data structure or algorithm, focusing on how it is used in typical situations, and treating it in a fairly high-level, abstract manner. With some background in how a data structure or algorithm is used, one would then move on to consider how it might be implemented, and study the “inner workings”.

An alternative philosophy, which is followed in these course notes, is the bottom-up approach. In this case, one begins with a study of the implementation of the data structure or algorithm and then moves on to study its use and example applications. In this way, the higher-level application is grounded on a foundation of understanding the implementation details.

Which of these two philosophies is better is very much a case of personal learning style and preference. The bottom-up approach has been consciously adopted for this course, because it provides a firm understanding of the details before attempting to consider the more abstract use of a data structure or algorithm. However, if your own learning style prefers a top-down approach, you might find the bottom-up style more difficult to follow. If this is the case, you may benefit from reading ahead in each chapter, skimming over the implementation details in each case, and focusing on the applications, before returning to reread the implementation sections.

1.2 Writing Good Programs

Not all programs written to solve the same problem are equally “good”. Certain criteria are required by which to judge the quality of programs. They are:

Correctness: The program should solve the problem that it aims to solve, correctly and completely.

Ease of use: The program should be as easy to use as possible.

Generality and efficiency: Rather than solve a single, specific problem, the program should solve as broad a range of related problems as possible. In addition, the resources (e.g. computer time and memory space) required by the program should be minimised.


Portability: The program should be able to run on a variety of computers and/or operating systems with minimum modification for each installation.

Clarity: The program should be as easy to read and understand as possible.

Ease of coding and testing: The program should be written in such a way that it can be completed with the minimum of effort.

Amenability to modification: The program should be so constructed as to be easy to modify without courting disaster.

Before discussing each of these criteria in more detail, a distinction must be made between three types of programs, each of which places different priorities on the above criteria:

One-shot Programs: These programs are used only once, and are then discarded. Since the time spent writing such a program far exceeds the time spent using it, ease of coding has top priority. Generality, efficiency and amenability to modification are of no importance.

Production Programs: These programs are used frequently, and most programs fall into this category. They will most likely be used for a long time and by many people. A possible priority listing might be:

• correctness

• ease of use

• clarity

• portability

• generality

• ease of modification

• ease of coding

Efficiency would probably be on a par with clarity for programs like editors and compilers, and below ease of modification for less frequently used programs.

Service Routines and Program Components: These are methods and components that perform basic operations (for example, sorting a list of data items, calculating a square root, a GUI button component, or a report-generator). The program code for these library routines and components is normally fairly short and is used many times. Efficiency is critically important.

Since production programs make up the majority of programs written, they will be the main focus of attention when discussing the individual criteria.

1.2.1 Correctness

Programs that do not work correctly are of no use whatsoever. Unfortunately, and despite the best of intentions, the first version of any program will contain errors.

Although programming techniques have improved immensely since the early days, the process of finding and correcting errors in programming, known graphically if inelegantly as “debugging”, still remains a most difficult, confused and unsatisfactory operation. The chief impact of this state of affairs is psychological. Although we are happy to pay lip-service to the adage that to err is human, most of us like to make a small private reservation about our own performance on special occasions when we really try. It is somewhat deflating to be shown publicly and incontrovertibly by a machine that even when we do try, we in fact make as many mistakes as other people. If your pride cannot recover from this blow, you will never make a programmer. (Christopher Strachey)

The majority of programs work most of the time, but for the remainder, they fail or give incorrect results. The people who use a particular program frequently get to know its habits and learn to live with them, but to the people who rarely use the program it becomes highly annoying.

Some common errors are:

1. Incomplete problem solution: Although the general case is solved, a special case is left unattended. A careful study of the scope of the problem before attempting any solution will avoid many errors. But bear the program’s objectives in mind all the time.

2. Initialisation: In some systems, such as Visual BASIC, all variables are initialised, but this should not be assumed, especially if the program is to be portable. In the case of Java, the system either initialises all variables (to a zero, or null, value) or else enforces initialisation by the programmer. Of course, care must still be taken that the default initialisation is appropriate.

3. Off-by-one errors: These occur when a loop is executed one too many or one too few times.

4. Real numbers: Real (floating point) numbers (i.e. float and double values in Java) are not stored exactly on a computer. Frequent addition or multiplication can compound the error.

5. Typographical errors: The difference between 1 (the digit one), I (the upper case letter “i”) and l (the lower case letter “L”), and between 0 (the digit zero) and O (the upper case letter “o”), is sometimes very marginal. The fact that i is a favourite loop control variable (although it should not be) and 1 a common starting point in a loop makes matters worse.

6. Precedence errors: The arithmetic expression:

$$x = \frac{a \times b}{c \times d}$$

is coded correctly in Java as:

x = (a * b) / (c * d);

and not as:

x = a * b / c * d;

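(What does this mean? Because * and / have equal precedence and are applied from left to right, Java evaluates it as ((a * b) / c) * d.) Brackets are free: use them!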
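Three of the errors listed above (off-by-one loops, inexact real numbers and operator precedence) can be demonstrated in a few lines of Java. The following is a minimal sketch rather than part of the course listings; the class name CommonErrors and all of its method names are invented for illustration.

```java
// Illustrative sketch only: the class and method names are hypothetical.
class CommonErrors {

    // Off-by-one: valid indices run from 0 to values.length - 1.
    // Writing "i <= values.length" would execute the loop once too often
    // and throw an ArrayIndexOutOfBoundsException.
    static int sum(int[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    // Real numbers: 0.1 has no exact binary representation, so repeated
    // addition accumulates a small error.
    static double addTenths(int times) {
        double total = 0.0;
        for (int k = 0; k < times; k++) {
            total += 0.1;
        }
        return total;
    }

    // Compare real numbers using a tolerance rather than ==.
    static boolean nearlyEqual(double x, double y, double tolerance) {
        return Math.abs(x - y) <= tolerance;
    }

    // Precedence: the intended expression, (a * b) / (c * d).
    static double bracketed(double a, double b, double c, double d) {
        return (a * b) / (c * d);
    }

    // Without brackets this is evaluated as ((a * b) / c) * d.
    static double unbracketed(double a, double b, double c, double d) {
        return a * b / c * d;
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {1, 2, 3}));         // prints 6
        double total = addTenths(10);
        System.out.println(total == 1.0);                     // prints false
        System.out.println(nearlyEqual(total, 1.0, 1e-9));    // prints true
        System.out.println(bracketed(8, 2, 4, 2));            // prints 2.0
        System.out.println(unbracketed(8, 2, 4, 2));          // prints 8.0
    }
}
```

The tolerance of 1e-9 is an arbitrary choice for this example; a suitable value depends on the magnitudes of the numbers being compared.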

1.2.2 Ease of Use

As far as the user is concerned, he or she is affected by two aspects:

1. Input: the program should be able to accept any input and not “crash”. In addition, the user should not have to do any counting, for example, type in the number of integers to be submitted before actually typing in the integers themselves; let the machine do all the counting. The user should not be expected to employ obscure codes in place of an English word. For example, if dealing with the input of colours the system should accept strings like “red” or “blue”, rather than numeric codes like 16 or 3. In modern systems the use of menus of options and graphical user interfaces has greatly improved ease of use, but care must still be taken to design user interfaces carefully and consistently.


2. Output: all output should be sufficient, clear and self-explanatory. Do not display or print out unnecessary data.
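The advice on input can be sketched in Java as follows. This is only one possible approach, and the names used (FriendlyInput, Colour, parseColour and readIntegers) are assumptions made for the example. It accepts colour names rather than numeric codes, rejects unrecognised input without crashing, and counts the integers itself instead of asking the user how many will follow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

// Illustrative sketch only: the class, enum and method names are hypothetical.
class FriendlyInput {

    enum Colour { RED, GREEN, BLUE }

    // Accepts a colour name in any letter case.
    // Returns null (rather than crashing) if the name is not recognised.
    static Colour parseColour(String text) {
        if (text == null) {
            return null;
        }
        try {
            return Colour.valueOf(text.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    // Reads integers until the input runs out, skipping anything that is
    // not an integer. The program, not the user, does the counting.
    static List<Integer> readIntegers(Scanner in) {
        List<Integer> values = new ArrayList<Integer>();
        while (in.hasNext()) {
            if (in.hasNextInt()) {
                values.add(in.nextInt());
            } else {
                in.next(); // discard the token that is not an integer
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parseColour("Red"));   // prints RED
        System.out.println(parseColour("16"));    // prints null

        List<Integer> values = readIntegers(new Scanner("4 8 oops 15"));
        System.out.println(values.size() + " integers read: " + values);
        // prints 3 integers read: [4, 8, 15]
    }
}
```

A real program would report an unrecognised colour to the user and ask again; returning null simply keeps the sketch short.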

1.2.3 Generality and Efficiency

Unfortunately these two criteria tend to be mutually exclusive. Normally a compromise must be reached. In general, rather than writing a routine to solve a highly specific problem, one should write a routine that will solve a class of related problems. However, be careful not to write a routine that will spend ages trying to decide exactly what the problem is!
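As a small illustration of the difference between a highly specific routine and a more general one, consider finding the largest of a set of values. The sketch below is an assumption-laden example, not a listing from the course: the names Generality, largestOfThree and largest are hypothetical, and the general version relies on Java's generic features.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: the class and method names are hypothetical.
class Generality {

    // Highly specific: works only for exactly three int values.
    static int largestOfThree(int a, int b, int c) {
        return Math.max(a, Math.max(b, c));
    }

    // More general: works for any non-empty list of mutually comparable
    // items (integers, strings, and so on).
    static <T extends Comparable<T>> T largest(List<T> items) {
        if (items.isEmpty()) {
            throw new IllegalArgumentException("list must not be empty");
        }
        T best = items.get(0);
        for (T item : items) {
            if (item.compareTo(best) > 0) {
                best = item;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(largestOfThree(3, 9, 4));                         // prints 9
        System.out.println(largest(Arrays.asList(3, 9, 4, 7)));              // prints 9
        System.out.println(largest(Arrays.asList("pear", "apple", "quince"))); // prints quince
    }
}
```

The general version does slightly more work per call (it traverses a list and calls compareTo), which is a mild instance of the compromise between generality and efficiency described above.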

1.2.4 Portability

A portable program is a program that will execute on a variety of machines or on different operating systems with no, or more realistically, very few modifications. This can be achieved by sticking as close to the language standard as possible. Of course, this kind of portability was one of the important design considerations for the Java language, and as a result Java programs are far more portable than most others.

1.2.5 Clarity

Clarity is a measure of how easy the program is to read and understand. Factors that affect clarity are:

• simplicity of the algorithm

• program structure

• choice of symbolic names for classes, variables, methods, etc. (use meaningful names, and name unfamiliar constants)

• comments

• physical layout

Good supporting documentation will also contribute to program clarity.

1.2.6 Ease of Coding and Testing

Modular programs are both easy to code and easy to test. Parts of the program (i.e. individual classes in object-oriented languages like Java) can be written and tested independently of others. Test data should be simple but still representative of all possible input data. Test code should be left in programs (but commented out) so that it can be easily reused in future.

1.2.7 Ease of Modification

A clear and well-documented program makes the process of maintenance and modification easier. The major problem with modifying a program is not being able to foresee the exact consequences of the alteration. Again, the more modular the program, the smaller the chance that a change in one section of the program will affect another section.


1.3 Techniques for Improving Program Quality

There are some simple techniques and habits that can make a great difference in constructing good quality software. Three techniques that we will consider here are the use of preconditions and postconditions, the use of program assertions, and automated mechanisms for producing documentation.

1.3.1 Preconditions and Postconditions

When writing a method (or function, or procedure¹) there are generally some facts that it will rely on when it is called. These are called the preconditions for the method: the things that should hold true at the start of the method’s execution. In much the same way, the method should leave the program in some well-defined state at the end of its execution. This state is described by the postconditions: the things that should hold true at the end of the method’s execution. These conditions can be expressed as comments given at the point of declaration of the method. For example, a square root method might have the following pre- and postconditions:

public double squareRoot (double x)

// PRE: x >= 0

// POST: squareRoot returns an approximation to the square root of x

// The approximate square root is found using the Newton-Raphson method

For many methods there will be no preconditions. For example:

public double random ()

// PRE: None

// POST: random returns a pseudo-random number r in the range 0 <= r < 1

Another common notation is <entry>, used to indicate the value of a parameter on entry. For example:

public int inc (int n)

// PRE: n has a value

// POST: if n < Integer.MAX_VALUE then inc returns n<entry> + 1

// else inc returns Integer.MIN_VALUE

Preconditions and postconditions are only comments, and so the programmer must write them initially, and then keep them up to date as changes are made to the method. When writing a method, careful thought should be given to what assumptions are being made at the start of the method and what it will guarantee to do by the end of the method. Of course, if the calling code violates the preconditions then there is no guarantee that the postconditions will be upheld. For example, if the squareRoot method above is called with a negative value for the parameter x, it may be justified in returning garbage.
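As a small illustration of how a postcondition can be checked against an implementation, the inc method described above can be completed as follows. This is only a sketch: the wrapper class IncDemo and the main method are assumptions added for illustration. It relies on the fact that Java int arithmetic wraps around on overflow, which is exactly what the "else" branch of the postcondition describes.

```java
class IncDemo {
    // PRE:  n has a value
    // POST: if n < Integer.MAX_VALUE then inc returns n<entry> + 1
    //       else inc returns Integer.MIN_VALUE
    static int inc(int n) {
        // int addition wraps around on overflow, so n + 1 already
        // yields Integer.MIN_VALUE when n == Integer.MAX_VALUE.
        return n + 1;
    }

    public static void main(String[] args) {
        System.out.println(inc(41));                                    // 42
        System.out.println(inc(Integer.MAX_VALUE) == Integer.MIN_VALUE); // true
    }
}
```

Writing the postcondition first makes the overflow case explicit, rather than leaving it as a surprise for the caller.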

1.3.2 Assertions

Assertions are a mechanism that allows us to introduce the idea of “checkpoints” in our programs. An assertion states a fact about the state of the program at the point at which the assertion is made. Many programming languages do not support assertions and so they can only be included as comments, as we did for preconditions and postconditions above. Originally Java did not support assertion checking. However, this was rectified with the release of Java version 1.4.

¹ In the rest of the discussion we will refer only to methods in the usual Java manner, but be aware that the use of preconditions and postconditions is a generally useful technique, and can be applied in any programming language to functions and to procedures.


The Java assertion mechanism allows us to embed assertions in our programs in such a way that they will be checked as the program runs. An assert statement will throw an exception² if the assertion does not hold true. The Java assertion mechanism has two alternative forms:

assert boolean expression;
assert boolean expression : expression;

Consider the following simple example, using the first form above:

. . .

skipSpaces();

ch = (char)System.in.read();

assert ch != ' ';

If we had made an error in writing the skipSpaces method then the error would be detected and the program would terminate with a message like:

Exception in thread "main" java.lang.AssertionError
        at MyApp.main(MyApp.java:12)

The second form of the assert statement has an expression of any type as a second field. This is converted to a string and used to form the error message given by the AssertionError class. In practice, the second expression is usually a simple string giving some form of helpful error message, as in the following example.

. . .

skipSpaces();

ch = (char)System.in.read();

assert ch != ' ' : "Expected a non-space character";

In some cases it may be difficult, or even impossible, to express an assertion using a simple conditional expression. In these situations one needs to revert to using a simple comment to state the assertion. For example:

class PlayingCard

{ ... }

PlayingCard deck[] = new PlayingCard[52];

. . .

shuffleCards(deck);

// ASSERT: deck contains 52 cards in random order

Just as with preconditions and postconditions, the responsibility for the use of assertions rests with the programmer. The discipline of writing assertions into your programs has the two-fold benefit of forcing you to evaluate exactly what your program is meant to be doing at strategic points in its execution, and providing for run-time checking (in situations where this is possible).

Note that assertion checking in Java is disabled by default. If assertions are to be checked, checking can be enabled when the program is run. This allows the checking to be performed during program development and testing, but then be disabled for efficiency when the program is shipped. Further details are available in the Java documentation.

² Actually an error is thrown: AssertionError.
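The following self-contained sketch puts the two forms of assert to work. The class name AssertDemo, the string-based skipSpaces helper and the assertionsEnabled method are assumptions made for illustration (the skipSpaces in the text reads from System.in instead). The assertionsEnabled method uses a common idiom: the assignment inside the assert is only executed when checking is switched on.

```java
class AssertDemo {
    // Hypothetical helper: returns the index of the first non-space
    // character at or after pos (or s.length() if there is none).
    static int skipSpaces(String s, int pos) {
        while (pos < s.length() && s.charAt(pos) == ' ')
            pos++;
        return pos;
    }

    // Returns true only if assertion checking is enabled.
    static boolean assertionsEnabled() {
        boolean enabled = false;
        assert enabled = true; // side effect happens only when asserts are checked
        return enabled;
    }

    public static void main(String[] args) {
        String line = "   hello";
        int pos = skipSpaces(line, 0);
        // Checkpoint: skipSpaces must leave us on a non-space character.
        assert pos == line.length() || line.charAt(pos) != ' '
            : "skipSpaces stopped on a space at index " + pos;
        System.out.println("First non-space character at index " + pos);
        System.out.println("Assertions enabled: " + assertionsEnabled());
    }
}
```

Running this as java AssertDemo leaves the checks switched off; running it as java -ea AssertDemo (or -enableassertions) switches them on.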

Alert readers may have realised that there is a close link between assertions and pre- and postconditions. Both of these techniques make statements about the expected state of our programs at well-defined points in their execution. For example, it is very likely that the postcondition of the shuffleCards method referred to above will state something like “The deck of n cards is in random order”. For this reason one sometimes sees methods starting with an assertion, or assertions, checking that the preconditions hold. For example, returning to our squareRoot method we might have something like the following:

public double squareRoot (double x)

// PRE: x >= 0

// POST: squareRoot returns an approximation to the square root of x

// The approximate square root is found using the Newton-Raphson method

{ assert x >= 0;

. . .

} // squareRoot

In this way the possibility that the precondition may not be met by the calling code is taken into account, and an error message will be generated if the assertion fails.

However, the Java programming guidelines produced by Sun state that using exceptions is a better way of handling preconditions for public methods. This ensures that the exception conditions are properly documented and handled. So a better solution to the problem described above would be:

public double squareRoot (double x)

// PRE: x >= 0

// POST: squareRoot returns an approximation to the square root of x

// The approximate square root is found using the Newton-Raphson method

{ if (x < 0)

throw new IllegalArgumentException("x < 0");

. . .

} // squareRoot
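To show the precondition check in context, here is one way the elided body might be filled in. This is a sketch only: the class name RootsSketch, the starting estimate, the relative tolerance and the iteration cap are all assumptions chosen for illustration, not values taken from these notes.

```java
class RootsSketch {
    // PRE:  x >= 0 (an IllegalArgumentException is thrown otherwise)
    // POST: squareRoot returns an approximation to the square root of x
    static double squareRoot(double x) {
        if (x < 0)
            throw new IllegalArgumentException("x < 0");
        if (x == 0)
            return 0;
        double guess = (x >= 1) ? x : 1;                 // assumed starting estimate
        for (int i = 0; i < 2000; i++) {                 // assumed iteration cap
            double next = 0.5 * (guess + x / guess);     // Newton-Raphson step
            if (Math.abs(next - guess) <= 1e-12 * next)  // assumed relative tolerance
                return next;
            guess = next;
        }
        return guess;
    }

    public static void main(String[] args) {
        System.out.println(squareRoot(2.0));
        System.out.println(Math.sqrt(2.0)); // library value, for comparison
        try {
            squareRoot(-1.0);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Because the check is an ordinary if statement, it is performed whether or not assertion checking has been enabled.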

1.3.3 Automated Documentation

Many current programming languages provide mechanisms that allow programmers to easily generate documentation for the systems that they develop. In particular, Java provides Javadoc. This is a tool that takes a Java file and automatically provides HTML documentation for the classes, methods, data fields, etc. that it contains. While this may be useful in itself, the real power of Javadoc is that it allows programmers to write special documentation comments, which are then used to create explanatory documentation.

Documentation comments are distinguished by beginning with /** (they end with the normal */), and should appear immediately before the program feature that they describe. Depending on the context (i.e. whether a documentation comment is being used with a class, method, etc.) there are a number of tags that can also be used to add specific information to the documentation.

As a simple example, the code below illustrates the use of a number of Javadoc features and some common documentation tags.

/** This class contains a square root algorithm.


* @author George Wells

* @version 1.1 (7 November 2000)

*/

public class Roots

{

/** The approximate square root is found using the Newton-Raphson method.

* @param x The value whose square root is to be calculated.

* @return The approximate square root of x.

* @throws IllegalArgumentException if x < 0.

*/

public static double squareRoot (double x)

{ . . .

} // squareRoot

} // class Roots

When run through the Javadoc processor this file produces HTML documentation that looks something like the following. Note how the documentation comments have been reproduced here, and how the documentation tags have affected the presentation of the information.

Class Roots

java.lang.Object
  └ Roots

public class Roots
extends java.lang.Object

This class contains a square root algorithm.

Version: 1.1 (7 November 2000)
Author: George Wells

Constructor Summary
Roots()

Method Summary
static double squareRoot(double x)
    The approximate square root is found using the Newton-Raphson method.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail
Roots
public Roots()

Method Detail
squareRoot
public static double squareRoot(double x)
    The approximate square root is found using the Newton-Raphson method.
    Parameters: x - The value whose square root is to be calculated.
    Returns: The approximate square root of x.
    Throws: java.lang.IllegalArgumentException - if x < 0.

All the Java API documentation is produced using Javadoc, so the style of this documentation should look very familiar. Because the documentation is created using HTML, the usual HTML tags may also be used to assist in the formatting of the documentation. Further information about the use of the Javadoc tool, the available tags, etc. can be found by referring to the Java documentation.

Automatically producing documentation from program code helps improve program quality by allowing programmers to write the code and the documentation together, with only one set of files to maintain. This aids greatly in ensuring that documentation and code remain in agreement. Simplifying the documentation process also increases the likelihood of documentation being produced at all (programmers are notorious for disregarding the production of documentation!).

The code for the data structures and algorithms that we will be considering during this course is documented using Javadoc (see Appendix A, p. 187), so you will see many examples of the use of documentation comments. However, to save space, many of the code segments discussed in the main body of the notes do not have the full documentation comments included. Of course, for more information on the classes discussed in these notes, you may also refer to the online documentation produced by the Javadoc tool.

1.4 Concluding Comments

This chapter has laid a foundation for the rest of our study of Advanced Programming. As we go on to explore this subject in more detail, readers are also referred to the many references listed in the Bibliography (see p. 185). These books and articles will provide valuable further insights into the subject matter of this course. In particular, Object-Oriented Data Structures Using Java, by Dale, Joyce and Weems [7] is highly recommended as a supporting text, as it covers many of the same topics as these notes at about the same level of detail, but with fresh insights and different examples.


Skills

• You should understand what is meant by “advanced programming”

• You should understand the need for advanced programming techniques and proper program design

• You should know the criteria used to judge the quality of a computer program

• You should be able to write preconditions and postconditions for any methods you develop

• You should be able to write assertions for any code you develop, and to use the assertion mechanisms in Java

• You should be able to use Javadoc to produce system documentation


Chapter 2

Old Friends Revisited

Objectives

• To consider/revise the concept of recursion

• To consider/revise the use of interfaces in Java

Many of the ADTs we will be considering are implemented using interfaces, and many of the algorithms that we will be looking at make use of recursion. This chapter gives a brief introduction (possibly revision for some readers) to the use of these techniques in Java.

2.1 Recursion

Recursion is a method for repeating activity in a program. The use of recursion is perhaps best appreciated for problems where iterative algorithms (i.e. using loops for repetition) start to become very awkward. Some of the best examples of this arise when we come to consider tree ADTs and algorithms for manipulating and traversing such trees. For the moment we will have to be satisfied with some simpler (and rather overdone!) examples.

2.1.1 Introduction

The word recursion is defined in the dictionary as “the act or process of returning or running back” [6]. In the context of Computer Science, this term is used when an algorithm (or a method) refers back to itself, or calls itself.

Recursion can be thought of quite simply as solving a problem in terms of a simpler form of the same problem. For example, consider a simple robot (R) that is trying to catch a target (T) in a room. The robot is incapable of detecting the speed and direction of the target: all it can detect is the current position of the target.


The robot can follow a very simple algorithm of taking one step towards the target, and then following the same process again and again, until it reaches the target. This might result in the following sequence of steps:


Expressed in terms of a pseudo-code algorithm, we might have something like the following:

ALGORITHM: CatchTarget
    IF atTarget THEN
        STOP
    Take one step towards target
    CatchTarget // Recursive use of the algorithm

There are a number of points to notice about this approach. Firstly, we have expressed the problem (CatchTarget) in terms of itself (the last line of the algorithm refers to the entire algorithm again); this is the essence of recursion. However, each time we “reuse” the algorithm, it is in a slightly “simpler” case (i.e. we are one step closer to catching the target). Lastly, and very importantly, we have a stop case: when we reach the target we are finished. Without the stop case the algorithm would never terminate, as each and every attempt to catch the target would result in the algorithm being invoked again.
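The CatchTarget algorithm can be sketched in Java as follows. Several simplifying assumptions are made here that are not part of the description above: the room is an integer grid, the robot may step diagonally, the target stands still, and the method returns the number of steps taken so that the result can be inspected. The class and method names are illustrative only.

```java
class RobotSketch {
    // Moves the robot at (rx, ry) one grid step at a time towards the
    // target at (tx, ty) and returns the number of steps taken.
    static int catchTarget(int rx, int ry, int tx, int ty) {
        if (rx == tx && ry == ty)
            return 0;                            // the stop case: at the target
        // Take one step towards the target in each direction that still differs.
        rx += Integer.signum(tx - rx);
        ry += Integer.signum(ty - ry);
        return 1 + catchTarget(rx, ry, tx, ty);  // the recursive use of the algorithm
    }

    public static void main(String[] args) {
        System.out.println(catchTarget(0, 0, 3, 5)); // 5
    }
}
```

Each recursive call is made with the robot one step closer, so the stop case is eventually reached.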

2.1.2 Calculating Factorials

The first concrete example that we will look at is the traditional factorial problem. The factorial of a positive integer, n, is written n! and is defined to be n multiplied by every positive integer smaller than it, down to 1. So, we can write:

\[ n! = n \times (n-1) \times (n-2) \times \cdots \times 3 \times 2 \times 1 \]

The key to using recursion here is to see that this can be expressed as:

\[ n! = n \times (n-1)! \]

since

\[ (n-1)! = (n-1) \times (n-2) \times \cdots \times 3 \times 2 \times 1 \]

In other words, the factorial of a positive integer can be defined to be that number multiplied by the factorial of the preceding number ($n! = n \times (n-1)!$). As always, the recursion has to stop somewhere, and to do this we introduce the stop case that $0! = 1$.

With this understanding of the recursive approach to solving this problem, we can express the recursive factorial method in Java as follows:


public int factorial (int n)

// Recursive method to calculate the factorial of n

// PRE: n >= 0

// POST: if n! < Integer.MAX_VALUE then factorial returns n!

// else a nonsense value is returned

{ if (n == 0)

return 1; // The stop case

else

return n * factorial(n-1); // The recursive call

} // factorial

As is often the case, it must be noted that the iterative solution to this problem is far more efficient. The recursive solution is elegant, and serves to illustrate a simple recursive method, but should never be used in practice.

Exercise 2.1 Write an iterative, rather than recursive, method to calculate factorials. Which is easier to understand?
Write a program that calls both methods repeatedly, measuring the time taken by them both. Which is more efficient?
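The “nonsense value” mentioned in the postcondition is easy to observe. The sketch below wraps the factorial method from above in an illustrative class (FactorialDemo, an assumed name) and adds a long version, factorialLong, purely for comparison. 12! still fits in an int, but 13! = 6227020800 does not, so the int result wraps around.

```java
class FactorialDemo {
    // PRE:  n >= 0
    // POST: if n! fits in an int then factorial returns n!
    //       else a nonsense (wrapped-around) value is returned
    static int factorial(int n) {
        if (n == 0)
            return 1;                    // the stop case
        else
            return n * factorial(n - 1); // the recursive call
    }

    // The same method using long arithmetic, for comparison only.
    static long factorialLong(int n) {
        return (n == 0) ? 1L : n * factorialLong(n - 1);
    }

    public static void main(String[] args) {
        System.out.println(factorial(12));     // 479001600
        System.out.println(factorial(13));     // 1932053504 (wrapped around)
        System.out.println(factorialLong(13)); // 6227020800
    }
}
```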

2.1.3 Writing a String Backwards

As a second example, consider the problem of reading in a string and then writing it out backwards. For example, given the input George we want a method that will write out egroeG. How can we go about this? We can read a single character and store it in a local variable while we recursively call the method again to deal with the rest of the string. When it returns we can then write out the character we first had. This gives us the recursive algorithm that we need:

ALGORITHM: ReadAndWriteBackwards
    READ ch
    IF ch != '\n' THEN
        ReadAndWriteBackwards // Write out the rest of the string backwards
    WRITE ch

In Java this becomes:

public void readAndWriteBackwards () throws IOException
// Read in a character string terminated by '\n' and write
// it out backwards
// (System.in.read can throw java.io.IOException, hence the throws clause)
// PRE: The input contains a terminating '\n'
// POST: A string of characters has been read and written
// out in reverse order
  { char ch;
    ch = (char)System.in.read();
    if (ch != '\n')
      readAndWriteBackwards();
    System.out.print(ch);
  } // readAndWriteBackwards


Note that this works because each recursive call to the readAndWriteBackwards method has its own copy of the local variable ch, which effectively stores a character while the recursion proceeds. If we used a class or instance variable for this purpose the program would no longer work correctly.
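To experiment with this without typing at the keyboard, the same idea can be sketched using a java.io.Reader as the source and a StringBuilder as the destination. These choices, the class name BackwardsSketch and the reverseLine helper are assumptions made for illustration. Unlike the version above, this sketch also stops at the end of the input and does not write out the terminating newline.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class BackwardsSketch {
    // Reads characters up to '\n' (or the end of the input) and appends
    // them to out in reverse order.
    static void readAndWriteBackwards(Reader in, StringBuilder out)
            throws IOException {
        int c = in.read();               // -1 signals the end of the input
        if (c == -1 || c == '\n')
            return;                      // the stop case
        readAndWriteBackwards(in, out);  // deal with the rest of the string first
        out.append((char) c);            // then write this call's own character
    }

    // Convenience wrapper (hypothetical) that reverses the first line of a string.
    static String reverseLine(String text) {
        StringBuilder out = new StringBuilder();
        try {
            readAndWriteBackwards(new StringReader(text), out);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseLine("George\n")); // egroeG
    }
}
```

The local variable c plays the role of ch: each active call holds one character until the calls made after it have returned.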

Make sure that you understand how these recursive methods work.

Exercise 2.2 The Fibonacci series is another widely overused example of recursion. The nth Fibonacci number ($f_n$) is defined to be the sum of the previous two Fibonacci numbers ($f_{n-1} + f_{n-2}$). The first two Fibonacci numbers ($f_1$ and $f_2$) are defined to be 1. The first few values in the series are: 1, 1, 2, 3, 5, 8, 13, etc. More formally, we can define the Fibonacci series:

\[ f_n = \begin{cases} 1 & \text{if } n = 1 \text{ or } n = 2 \\ f_{n-1} + f_{n-2} & \text{if } n > 2 \end{cases} \]

Write a recursive Java method to calculate the nth number in the Fibonacci series.

Exercise 2.3 An important class of graphics images are what are known as fractals. Fractal images have the interesting property that they are similar when examined at any level of magnification. Here we consider a simple case of recursive fractals, which produce images that look like quite realistic mountain ranges.
Such an image can easily be produced recursively as follows: we start off with the problem of drawing a horizontal line across the centre of the drawing area; we then break this into two subproblems of drawing half a line, but with the centre point of the line moved slightly up or down in a vertical direction by a random factor proportional to the sub-line length. To illustrate this, the first two steps might be as follows:

We then apply the same process to drawing the two half-lines:

The same process is then applied to the problem of drawing the four quarter-length lines, and so on. The stop case in this situation is when the length of a line drops below some suitable threshold (say, five pixels in length).
Write a recursive Java applet that will display a random fractal, as illustrated above.


Exercise 2.4 The Towers of Hanoi is another common (and much better) example of recursion. An ancient legend in the Far East says that there is a monastery that contains three towers. The monks in the monastery are moving a set of 64 disks from one tower to another, one disk at a time. When the job is completed the world will supposedly come to an end! The process of moving the disks is a little complicated by the fact that the disks are all of different sizes, and the rules state that a disk may not be placed on top of a smaller disk. The initial configuration, assuming that there are only four disks, would look like the following:

The recursive solution to this problem can be described as follows:

To move n disks from tower A to tower B:

If n = 1 the disk can be moved directly
Otherwise:
1. Move n − 1 disks from A to C, using B as a temporary holding place
2. Move the nth disk from A to B
3. Move the n − 1 disks from C to B, using A as a temporary holding place

Note that both steps 1 and 3 are recursive in this case.
Develop a Java class that can be used to represent one tower with its disks, and a program to simulate the solution to the Towers of Hanoi problem using three objects of this class.

Exercise 2.5 In bioinformatics, gene sequences are expressed as long strings of information coded using the letters A, C, T and G (these stand for the four distinct bases that comprise genes). When analysing DNA, the data that is obtained is sometimes incomplete. In these cases an X is introduced into the string to show that there was an undecipherable base present. So, we might get a short sequence like “ACCCGTAAXGGCTAGXXGGCT”. In order to match this with other, known sequences we need to be able to generate all possible strings from this form of data (i.e. replace all the X’s with all possible combinations of A, C, T and G). There may be any number of X’s in any arbitrary position(s) in the given sequence.
Recursion allows us to generate these sequences quite easily. We can formulate the problem of generating all possible sequences as follows:

If there are no X’s Then

Print the sequence

Else

Replace the first X with A and generate all possible sequences

Replace the first X with C and generate all possible sequences

Replace the first X with T and generate all possible sequences

Replace the first X with G and generate all possible sequences

In this way we can methodically generate all the possible gene sequences.
Write a Java program that uses this recursive approach to solving the problem.


Figure 2.1: A Class Hierarchy for Employee Classes.

2.2 Interfaces in Java

You should already be familiar with the concepts of inheritance and polymorphism in Java. In particular, you should be aware that inheritance in Java is restricted to single inheritance: a class can have only one superclass (or parent class). In some situations we require objects to have more than just one set of “inherited” characteristics. This is the problem that interfaces help us to address. Let’s consider an example.

In a company’s human resources (personnel) department we might develop a computer system to work with employee data. We might consider using inheritance to do this, and come up with an inheritance hierarchy something like that shown in Figure 2.1.

There is no problem with any of this, but let’s consider what would happen if we wanted to introduce a class for Personal Assistants (PAs), i.e. people who are something like high-powered secretaries, with some managerial responsibilities. We might want to introduce them in the way suggested by the hierarchy in Figure 2.2, but this is not possible in Java, as we can only have one superclass for any class.

So, how can we specify that personal assistants should be something like a cross between a manager and a secretary? In terms of inheritance we would need to decide which one of the two they were more like and build the inheritance hierarchy like that. For example, we might decide that they were mainly secretaries, and build the hierarchy of classes shown in Figure 2.3.

But we still need to express the fact that personal assistants have managerial responsibilities. We can use an interface for this purpose. An interface in Java is similar to a class, but it has no instance data members and, as used in these notes, its methods have no bodies. It simply specifies that classes that implement it must provide the methods that it declares. In our employee situation we might introduce an interface called ManagerialResponsibilities. Diagrammatically we can then express what we need as shown in Figure 2.4. Note how we use dashed lines to emphasise that we are not using classes and inheritance with the interfaces.


Figure 2.2: Fitting in PAs (won’t work in Java).

Figure 2.3: Fitting in PAs.


Figure 2.4: Using an Interface.

It is important to note that the PersonalAssistant and Manager classes do not inherit any properties or methods from the ManagerialResponsibilities interface.

How do we express this in Java? It is in fact very simple. The outlines of the classes and the interface that are required are as follows. Study this closely and compare it with the diagrammatic view in Figure 2.4.

public class Employee

{ ... }

public class Secretary extends Employee

{ ... }

public interface ManagerialResponsibilities

{ public void manage ();

} // interface ManagerialResponsibilities

public class Manager extends Employee

implements ManagerialResponsibilities

{ ...

public void manage ()

{ ...

} // manage

...

} // class Manager


public class PersonalAssistant extends Secretary

implements ManagerialResponsibilities

{ ...

public void manage ()

{ ...

} // manage

...

} // class PersonalAssistant

public class Director extends Manager

{ ... }

Note how the interface, which is shown completely above, does not contain a body for the manage method, simply the “heading” followed immediately by a semicolon. What this means in practice is that the classes which implement this interface (i.e. Manager and PersonalAssistant above) must provide this method, as indicated above. In this way, classes that cannot inherit some behaviour (such as the ability to “manage” here) can be forced to provide the required method(s) by requiring that they implement an interface.

As we can also see from this example, it is possible for a class to both extend another class (using inheritance) and also to implement an interface. In fact, a class can implement many interfaces if necessary, and so it is not unusual to see classes like the following:

public class MyClass extends SomeOtherClass

implements IntA, IntB, IntC

{ ... } // class MyClass

where IntA, IntB and IntC are all interfaces. This class now has four distinct types of behaviour: that which it has inherited from SomeOtherClass, and that which it has been forced to provide through implementing the three interfaces.

2.2.1 Interfaces and Polymorphism

Interface names can be used for variables, just like class names, so we could write the following code for our employee example:

ManagerialResponsibilities mgr;

This is a reference variable that can be used to refer to an object belonging to any class that implements the ManagerialResponsibilities interface. So, we could do any of the following:

mgr = new Manager();

mgr = new PersonalAssistant();

or even:

mgr = new Director();

since the Director class has inherited the necessary behaviour (i.e. the manage method) from the Manager class, which implemented the interface.


As is the case with polymorphism through inheritance, we lose some capabilities in this way. All we can use the mgr variable for is the behaviour defined in the ManagerialResponsibilities interface. We have effectively lost the access to the specific features of the actual class being used (Manager, PersonalAssistant or Director above). In this case, all we can do with the mgr reference is call the manage method:

mgr.manage();

However, as usual, we can use the instanceof operator to find out exactly what type of object we are dealing with, and type-casts to get a reference of the type required. For example:

public void runOffice (ManagerialResponsibilities mgr)

{ mgr.manage();

if (mgr instanceof PersonalAssistant)

{ PersonalAssistant pa = (PersonalAssistant)mgr;

// Use pa to work with PersonalAssistant object

. . .

} // if

} // runOffice

As well as allowing us to work around the limitations of single inheritance, interfaces provide a convenient solution in any situation where multiple classes that are not related by inheritance must still share some common behaviour.
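The outlines in this section can be assembled into a small program that compiles and runs. In the sketch below the bodies are placeholders: manage returns a String (rather than being void, as in the outline above) so that the effect of each call can be seen, and the takeDictation method and the OfficeDemo class are hypothetical additions made for illustration.

```java
interface ManagerialResponsibilities {
    String manage(); // returns a description here; the outline above uses void
}

class Employee { }

class Secretary extends Employee { }

class Manager extends Employee implements ManagerialResponsibilities {
    public String manage() { return "Manager managing"; }
}

class PersonalAssistant extends Secretary implements ManagerialResponsibilities {
    public String manage() { return "PA managing"; }
    String takeDictation() { return "PA taking dictation"; } // hypothetical extra feature
}

class Director extends Manager { } // inherits manage from Manager

class OfficeDemo {
    static String runOffice(ManagerialResponsibilities mgr) {
        String result = mgr.manage(); // all that the interface type allows
        if (mgr instanceof PersonalAssistant) {
            // The type-cast recovers access to PersonalAssistant's own features.
            PersonalAssistant pa = (PersonalAssistant) mgr;
            result += "; " + pa.takeDictation();
        }
        return result;
    }

    public static void main(String[] args) {
        ManagerialResponsibilities[] staff =
            { new Manager(), new PersonalAssistant(), new Director() };
        for (ManagerialResponsibilities mgr : staff)
            System.out.println(runOffice(mgr));
    }
}
```

Note that the array holds objects from two separate branches of the class hierarchy, which is only possible because its element type is the interface.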

Note: The class diagram format used in this chapter and throughout these notes is a highly simplified form of the class diagrams used in the Unified Modeling Language (UML). In general, the use of “proper” UML class diagrams is to be preferred.

We will return to the subject of interfaces and see further examples of their use in the later chapters of this course.

Skills

• You should be able to write and understand recursive methods in Java

• You should be able to use interfaces in Java, together with polymorphism


Chapter 3

Driven to Abstraction!

Objectives

• To discuss general aspects of abstraction

• To introduce the concept of abstract data types

3.1 The Need for Abstraction

Many years ago, Mary Shaw of Carnegie-Mellon University expressed the need for abstraction in Computer Science very well:

We believe that the bulk of problems with contemporary software follow from the fact that it is too complex. Complexity per se is not the culprit, of course; rather it is our human limitations in dealing with it. Dijkstra¹ said it extremely well when he wrote of “our human inability to do much”. It is our human limitations, our inability to deal with all relations and ramifications of a complex situation simultaneously, which lies at the root of our software problems. In other contexts we (humans) have developed an exceptionally powerful technique for dealing with complexity. We abstract from it. Unable to master the entirety of a complex object, we choose to ignore its inessential details, dealing instead with a generalized model of the object — we consider its “essence”. [17]

3.1.1 Some Examples of Abstraction

As mentioned in the quote above, in many areas of our everyday lives we practise abstraction without even thinking about it. For example, very few (if any!) people when driving a car pause to think about the chemical reaction between complex hydrocarbon molecules and oxygen that causes an explosion in the cylinder of the engine, or the laws of physics which determine how the force of that explosion is turned into forward motion, or the computer technology that controls many modern engines. The reason for this is that we have a very good abstract view of a car as a four-wheeled device with a few, relatively simple, controls for getting us from one place to another (in some cases with the side effect of attracting members of the opposite sex!). The controls of a motor car serve as the interface between the driver and the hidden maths, physics and chemistry that actually get us to our destination. Any excessive concern for the physical functioning of the vehicle is likely to overwhelm the driver with far too much tedious detail.

¹ Edsger Dijkstra, a very famous computer scientist.

A Computer as an Abstract Device

In much the same way that we treat everyday objects as abstractions, most readers will have an abstract view of a computer as a programmable device capable of a certain level of calculation and interaction. Of course, there are many levels of detail below this which are conveniently ignored most of the time. These levels can be thought of as forming a hierarchy of abstract views of a computer system where, at each succeeding level, the conceptual “machine” with which one is dealing is further and further removed from the actual details of the circuits and electronic signals. This hierarchy is depicted in Figure 3.1.

Fifth Generation Language
Fourth Generation Language
High level Language
Assembly Language
Operating System
Machine Code
Microcode
Electronic Circuits
Quantum Physics Effects

Figure 3.1: Abstract Views of a Computer System.

At the lowest level we have laws of quantum physics determining the energy levels of electrons in semiconductor materials. This is a level of detail that only really concerns researchers in the field of computer chip manufacturing.

Just above the quantum physics level we find the electronic circuits (the hardware) that actually perform the computations in a computer system. Typically, these circuits can do simple arithmetic and logical operations and store numeric values. The “programming” (if we can call it that) of this level is done by electronic engineers, who design the circuits.

In the levels above the hardware we have different software elements that control the action of the hardware. The microcode level (or microprogramming level) is the very lowest level of software and consists of instructions that control the switches in the hardware circuits directly. The microprogramming of a computer is usually done by the manufacturer of the computer, and so very few programmers ever work at this level.

One step up from the microcode level is machine code. Since we seldom work at the microcode level, this level defines what we refer to as the conventional machine level. It is the lowest level of abstraction of which most programmers are aware. The machine code for a computer consists of binary information that controls the circuitry of the computer. Typically, a single machine code instruction will be made up of several microcode instructions. At this level there are usually instructions to specify arithmetic and logical operations, program control (such as iteration, selection and branching), access to data stored in the memory of the computer, etc. Due to the difficulty of working with a purely numeric representation for the instructions we seldom deal with this level directly. The instructions to add two numbers together at this level might look as follows²:

² This and the following examples of machine level and assembly language instructions are for the Intel 80x86/Pentium family of processors.


1010 0001 0000 0000 0000 0000

0000 0011 0000 0110 0000 0000 0000 0010

Since this is fairly meaningless and hard to comprehend, we usually use hexadecimal (base 16) representation for such values (which isn’t a lot easier to understand, but much easier to write!). In hexadecimal notation the above sequence of machine code instructions is as follows:

A1 0000

03 06 0002
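The correspondence between the binary and hexadecimal forms above can be checked mechanically, since each group of four bits maps to exactly one hexadecimal digit. The following is only a small sketch: the class HexDemo and its toHex method are names invented for this illustration, and the method assumes that the number of bits supplied is a multiple of four.

```java
class HexDemo {
    // Convert a string of binary digits (spaces allowed) to hexadecimal.
    // Assumes the number of binary digits is a multiple of four.
    static String toHex(String bits) {
        String digits = bits.replace(" ", "");
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < digits.length(); i += 4) {
            // Four bits give a value from 0 to 15, i.e. one hex digit.
            int value = Integer.parseInt(digits.substring(i, i + 4), 2);
            hex.append(Character.toUpperCase(Character.forDigit(value, 16)));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex("1010 0001 0000 0000 0000 0000"));            // A10000
        System.out.println(toHex("0000 0011 0000 0110 0000 0000 0000 0010"));  // 03060002
    }
}
```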

Programming at this level is sometimes referred to as working in a first generation language. When the first generation of computers was developed in the late 1940s this was the only way of programming them.

Above the machine code level we have the operating system. This consists of routines (usually supplied by the computer manufacturer) to simplify access to the resources of the computer, such as memory, disks, terminals, printers, etc. For example, the operating system may have routines to allocate a block of memory for use by a program, to print a character on a printer, or to read a line of input from a keyboard. The provision of these facilities in the operating system simplifies the task of programmers working at higher levels since they do not need to concern themselves with the complex operations needed to access the peripheral devices connected to a computer system. The other advantage of an operating system is that it makes the programs more independent of the hardware on which they execute, since the details of the operation of the hardware are kept from the program by the operating system.

Above the operating system level we have the assembly language level. At this level we can control the execution of the circuits by writing instructions in a form of English-like notation. This notation usually has a direct, one-to-one correspondence to the machine code instructions, but is easier for humans to comprehend than binary or hexadecimal. At this level we might represent the sequence of instructions needed to add two numbers as follows:

MOV AX, NUM1

ADD AX, NUM2

Of course the hardware of the computer cannot execute this form of instruction and so it must be translated into machine code by a program called an assembler. At this level we can also make use of the facilities offered by the operating system in order to access peripheral devices. Assembly languages are sometimes referred to as second generation languages since they emerged soon after the early, primitive first generation languages.

Above the assembly language level we find the high level languages. This is the level at which most programmers work, and with which they are most familiar. It is at this level that we find Java, Python, C++, C#, Delphi, and other common computer languages. The languages found at this level are often referred to as third generation languages, as they were developed after assembly languages. The first third generation language was FORTRAN, which was developed by IBM in the 1950s. A feature that distinguishes high level languages from the preceding generations is that they are far more machine independent. An assembly language instruction or machine code is usually specific to a particular computer. On the other hand, a program written in a high level language can be executed on any of a wide range of computers, without too much trouble. Of course, since the hardware circuits only “understand” binary data a high level language has to be translated into machine code. This is the task performed by a compiler or an interpreter.

The notation used by high level languages is usually fairly formal and precise, but far more readable by humans. For example, to add two numbers in Java we could write:

num1 + num2


which is easier to understand than the assembly language version above. In a language like COBOL it is even clearer still!

ADD NUM1 TO NUM2.

Within the range of high level languages we find that there is something of a subhierarchy. Some languages (such as C) are actually fairly close to assembly language, and provide some access to the underlying levels of abstraction. Other high level languages (such as Java) are much further removed from the underlying levels.

The fourth generation languages (often abbreviated to 4GLs) came along after the high level languages. They are typically used for business applications and database access. The aim of fourth generation languages is to make it easier for business people with no knowledge of programming to access information quickly and easily.

The fifth generation languages are not very well defined. They are distinguished by their use of artificial intelligence techniques to make them extremely easy to use.

When working at any of these levels of abstraction we can usually safely ignore the details and complexity of the underlying levels. Just as the car driver would be swamped trying to think about the chemistry and physics at work while driving a car, so too would a programmer be overwhelmed if he or she had to consider the binary translation and electrical signals corresponding to every statement in a high-level programming language.

These examples of abstraction in everyday life and in computer systems illustrate the power of abstraction in allowing us to ignore the details of complex systems. While we may ignore these details, we can still use these complex systems very easily, thanks to the high-level, abstract view that we have of them. Of course, this kind of abstraction is not the main focus of this course — we wish to find out how to build complex software systems that can make use of abstraction in order to simplify the software development process. In order to be able to do this, computer languages provide mechanisms that allow us to write programs using different types of abstraction.

3.2 Abstraction in Modern Programming Languages

Modern programming languages allow us to use built-in abstractions, and also to build our own new abstractions. This applies in two ways: procedural (or process) abstractions and abstract data types (or data abstractions). We will consider each of these in turn.

3.2.1 Procedural Abstractions

A procedural abstraction is a way of viewing a set sequence of steps that have to be carried out. For example, in Java we use a method called writeObject to write an object to an output stream. This is a procedural abstraction that allows us to ignore the detail of how an object is converted into a form that can be written to a stream, and exactly how that writing is done.

In the same way we can develop our own procedural abstractions using the method (or function, or procedure) mechanisms present in most computer languages. For example, in developing a program to calculate salaries we might write a method to calculate income tax. Once this is done we can use the method without any concern for the details of the actual process of tax calculation.
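As a rough illustration of the income tax example, consider the sketch below. The class SalaryDemo, its method names, the tax-free threshold and the single tax rate are all invented for this illustration (real tax rules are considerably more involved). The point is simply that netSalary uses incomeTax without any knowledge of how the tax is worked out.

```java
class SalaryDemo {
    // Hypothetical tax rule, for illustration only: income up to THRESHOLD
    // is tax-free and the remainder is taxed at a single flat RATE.
    static final double THRESHOLD = 60000.0;
    static final double RATE = 0.25;

    // The procedural abstraction: callers need only know WHAT this computes.
    static double incomeTax(double annualIncome) {
        if (annualIncome <= THRESHOLD)
            return 0.0;
        return (annualIncome - THRESHOLD) * RATE;
    }

    // Uses incomeTax with no concern for the details of the calculation.
    static double netSalary(double annualIncome) {
        return annualIncome - incomeTax(annualIncome);
    }

    public static void main(String[] args) {
        System.out.println(netSalary(50000.0));    // 50000.0 (below the threshold)
        System.out.println(netSalary(100000.0));   // 90000.0 (tax of 10000.0)
    }
}
```

If the tax rules change, only the body of incomeTax needs to be rewritten; netSalary and any other callers are unaffected.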


3.2.2 Abstract Data Types

Of more interest to us in this course is the concept of an abstract data type (ADT). An ADT can be formally defined in the following way.

An abstract data type is a pair <V, O>, where:

V is a set of values
O is a set of operations defined on those values

Most computer languages have built-in data types that provide the programmer with abstract views of simple numeric and character values. These are abstractions in the sense that we can use them without too much concern for the binary and electronic representations of the values.

Built-In Abstract Data Types

Many programmers are happy that, for the most part, they can manipulate numbers (e.g. double floating point values) without concern for the internal representation (implementation) of these values, and without knowledge of the hardware operations used to support the data manipulation. The programmer is, quite rightfully, content with the concept of a floating point number and totally oblivious of the underlying implementation. The notion of concept and implementation being separated is the foundation of abstract data types, or data abstraction.

The basic Java data types double, int, char, etc., each supported by a set of operations, are abstract data types. A declaration of the form:

double number;

is an abstraction of the underlying memory circuitry used to house the floating point value number, and the instruction:

number = number + 10.8;

is an abstraction of the electronic circuitry used to implement such an arithmetic operation. The same argument applies to the other built-in Java types. The reader probably has not thought of doubles in such a way because these primitive types are predefined. And therein lies the power of abstraction — there is no need to know or to comprehend the details of implementation, but merely to understand the concept.

In terms of the formal definition given above, the int type in Java could be characterised as follows (you may not have seen all of the operations specified here, but don’t worry about that):

V = { -2 147 483 648 ... 2 147 483 647 }
O = { +, -, *, /, %, ~, |, &, ^, <<, >>, >>>, <, <=, ==, >=, >, !=, = }
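The characterisation of int above can be explored with a short program. This is only a sketch (the class IntTypeDemo and the method successor are names invented here). It prints the bounds of V using the standard constants Integer.MIN_VALUE and Integer.MAX_VALUE, and tries out a few of the less familiar operations in O.

```java
class IntTypeDemo {
    // Adding one always yields another value in V: at the upper bound the
    // result wraps around to the lower bound rather than leaving the set.
    static int successor(int n) {
        return n + 1;
    }

    public static void main(String[] args) {
        // The set of values V is bounded by these two constants.
        System.out.println(Integer.MIN_VALUE);             // -2147483648
        System.out.println(Integer.MAX_VALUE);             // 2147483647
        System.out.println(successor(Integer.MAX_VALUE));  // -2147483648

        // A few of the operations in O.
        int a = 13, b = 5;
        System.out.println(a / b);       // 2  (integer division)
        System.out.println(a % b);       // 3  (remainder)
        System.out.println(a & b);       // 5  (bitwise and: 1101 & 0101 = 0101)
        System.out.println(a << 2);      // 52 (shift left by two bits)
        System.out.println(-16 >> 2);    // -4 (arithmetic shift keeps the sign)
        System.out.println(-16 >>> 28);  // 15 (logical shift fills with zeros)
    }
}
```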

Creating New Abstract Data Types

As well as providing us with built-in abstractions, modern computer languages allow us to create our own new ADTs. In Java it is the class mechanism which allows us to do this. At one level, classes in Java are very like records in other programming languages, and of limited use in constructing ADTs. The features in Java that extend the simple idea of a record and make the class a truly powerful mechanism for constructing ADTs are the ability to specify the visibility of fields within the class (e.g. using public and private access control), and the ability to specify not only the data values for the class but also the operations (using methods). These mechanisms give us the ability to develop our own new data types and integrate them with the existing abstractions built into the Java language.

An example of such an ADT is a frequency table. This has a set of possible values (the frequency counts for several different ranges or “bins”) and a set of possible operations (e.g. add a new entry, return the frequency for some range of values, print the table, print a frequency graph).
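To make the frequency table idea a little more concrete, here is one possible sketch of such a class. It is not taken from the course code: the class name FrequencyTable, the choice of equal-width bins starting at zero, and the method names are all assumptions made for this illustration, and only two of the operations listed above are shown.

```java
class FrequencyTable {
    private final int[] counts;   // The values V: one count per bin.
    private final int binWidth;   // Width of each bin.

    // Bins cover 0..binWidth-1, binWidth..2*binWidth-1, and so on.
    public FrequencyTable(int numBins, int binWidth) {
        if (numBins <= 0 || binWidth <= 0)
            throw new IllegalArgumentException("sizes must be positive");
        this.counts = new int[numBins];
        this.binWidth = binWidth;
    }

    // Operation: add a new entry to the bin that contains value.
    public void add(int value) {
        counts[binFor(value)]++;
    }

    // Operation: return the frequency for the bin that contains value.
    public int frequency(int value) {
        return counts[binFor(value)];
    }

    // Private helper: clients never see how values are mapped to bins.
    private int binFor(int value) {
        if (value < 0 || value / binWidth >= counts.length)
            throw new IllegalArgumentException("value out of range");
        return value / binWidth;
    }

    public static void main(String[] args) {
        FrequencyTable marks = new FrequencyTable(10, 10);  // Bins 0-9, ..., 90-99
        marks.add(55);
        marks.add(58);
        marks.add(72);
        System.out.println(marks.frequency(50));   // 2 (entries in the 50-59 bin)
        System.out.println(marks.frequency(70));   // 1 (entries in the 70-79 bin)
    }
}
```

Note that the array of counts is private, so a client can only work with the table through add and frequency.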

As another example, consider a data structure that could be used for a dictionary, such as might be used by the Rhodes Dictionary Unit. This has a set of values (words and their associated meanings) and a set of operations (e.g. open, close, add a new entry, delete an entry, look up the meaning of a word, etc.).

In the rest of this course we will study many more examples of abstract data types.

Interfaces and Abstraction

Interfaces, with polymorphism, are also a powerful abstraction mechanism in Java. When we declare a parameter (or variable) to be of an interface type, we are effectively taking an abstract view of the actual object that is being referred to. For example, using the interface example of the previous chapter, the following method does not need to know whether it is actually dealing with a Manager object, a PersonalAssistant object or a Director object. All that is relevant is that the object referred to by the parameter m has a manage method — the delegate method does not need to know anything else about the actual parameter.

void delegate (ManagerialResponsibilities m)

{ ...

m.manage();

...

} // delegate
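A complete program built around the delegate method might look something like the following sketch. The interface and class names follow the example above, but the bodies of the classes are invented for this illustration. In particular, having manage return a String is an assumption made here so that the effect is easy to see.

```java
class DelegateDemo {
    // Knows only that m can manage; the actual class of m is irrelevant here.
    static String delegate(ManagerialResponsibilities m) {
        return m.manage();
    }

    public static void main(String[] args) {
        System.out.println(delegate(new Manager()));            // Manager managing
        System.out.println(delegate(new PersonalAssistant()));  // PersonalAssistant managing
    }
}

// Hypothetical versions of the types from the previous chapter.
interface ManagerialResponsibilities {
    String manage();
}

class Manager implements ManagerialResponsibilities {
    public String manage() { return "Manager managing"; }
}

class PersonalAssistant implements ManagerialResponsibilities {
    public String manage() { return "PersonalAssistant managing"; }
}
```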

3.2.3 Information Hiding: Client and Implementor Views

One of the key aspects of data abstraction is hiding away unnecessary details of the implementation. As we have already noted, Java provides the necessary mechanisms for this with the access control available for user-defined classes. Of course, this has two aspects. The first is that of the user of the class (the client view), and the second that of the implementor of the class. It is in some ways unfortunate that in a course like this we study both the implementation and the use of various ADTs, and so are forced to consider both of these views. This tends to obscure the distinction between the two views. As we look at various examples of ADTs try to bear this in mind, and consider separately how you would tackle a project (a) as a user of a particular ADT and (b) as the implementor of the same ADT.

To try to emphasise these two different points of view we will mark sections of the notes that deal with the client view by using highlighting as you see here.

/* The same notation will also be used for client program

segments included in the notes. */

Class Diagrams

To help the reader grasp the fundamental details of the various data structures, we will use a simplified form of UML class diagram, as shown in Figure 3.2. The top part of the diagram simply names the class. The middle section shows the details of the data members, and the bottom section the details of the function members (or methods). Private members are shown with a grey background (as for the data members here) and public members are shown normally. This helps to emphasise the difference between the client and implementor views since the public members are the part of the class that are visible to the client programs, while the private members are those parts that are only visible and of interest to the implementor.

ClassName

data members

methods

Figure 3.2: An Example Class Diagram.

3.3 Closing Comments

Hopefully this chapter has helped to explain the importance of abstraction, as well as its incredible power for helping us to deal with complex systems. As a final thought on this subject for now, here is an unsolicited quote from a Rhodes Computer Science graduate (who was working as a technical business manager at Internet Solutions, one of South Africa’s largest Internet service providers, when he wrote this):

Essentially, my work involves dealing with problems, and solving them — at the moment, that involves developing new products for Internet Solutions. If I think about all I learned at university, probably, the single most powerful construct I learned (actually it appears in both Mathematics and Computer Science) is the concept of abstraction. Abstraction can be applied to almost any complex problem to break it down into manageable chunks. That becomes a huge help when dealing with complex projects, when I am trying to get my mind around how to deal with something. I might not be writing programs, or solving theorems, but the concepts that apply there apply all over the business world. Having that formalism as a backing gives me an infinite lead over my peers who do not have tools like that at their disposal.

(Geoff Rehmet, 2002)

Skills

• You should understand how we use abstraction in everyday life, and in the context of complete computer systems

• You should understand the abstractions which modern computer languages provide

• You should know that modern computer languages allow us to construct our own abstractions

• You should know what is meant by the term abstract data type

• You should understand the difference between the client and implementor views of an ADT


Part II

Advanced Data Structures


This section of the course considers a number of important data structures. Chapter 4 takes a look at arrays and linked lists which form the building blocks for many of the more complex structures which we then consider in the following chapters. These range from simple stacks and queues through to more complex structures such as trees and hash tables. Common techniques for the implementation of these data structures as abstract data types will be considered, together with some examples of their use.


Chapter 4

Vectors and Lists

Objectives

• To consider simple lists of information

• To study the implementation of lists using arrays and using dynamic data structures (linked lists)

• To see how generic ADTs are handled in Java

4.1 Introduction

Many problems involve the use of lists. In real life we have lists of courses which students can take, shopping lists, lists of New Year’s resolutions, lists of cabinet ministers’ names, etc. And, in a rather recursive way, the previous sentence was a list of lists! You are already familiar with the way in which we can handle lists of data in programming languages by using arrays. This chapter will build on this basic concept in two ways. The first section takes the simple concept of an array in Java and extends it using some of the more advanced features of Java. The second section shows how we can get around some of the problems associated with using arrays by using dynamic data structures. In both cases we will be dealing with lists of integers, but obviously the principles apply to lists of any other objects. This is reinforced in the last section of this chapter when we consider examples of generic data structures: lists of any type of object. In doing this, we take advantage of some of the language features introduced in Java 5.0.

4.2 Vectors

Arrays in Java have a few problems associated with them. One of the most obvious is that the size of the array is fixed when the array is created and cannot be changed. Fortunately, Java provides us with the necessary mechanisms to develop our own alternative structures.

In this section we are going to develop a data structure to hold lists of integer values and manipulate these in ways very similar to arrays. In fact we will be using an array as the basic building block of our new data structure. Let us start by thinking about the operations that we might want to perform on a list. We obviously want to add new elements to the list and be able to access them, and will probably want to remove elements. We will also want to display the contents of the list. We might need to know the length of the list, and it might be useful to search the list for the position of a given element. We will also need to provide a constructor, as usual. These requirements are summarised in Table 4.1 and the class diagram below. They are requirements that apply generally to all lists and not just to the lists of integers that we will be dealing with in this section.

• Constructing an empty list (initialisation)

• Adding an element to a list

• Accessing an element of a list

• Removing an element from a list

• Finding an element in a list

• Finding the length of the list

• Displaying the contents of a list

Table 4.1: Operations on Lists

IntegerVector

data, numElements

add, get, set, position, remove, length

Data Members

Let us see how we can build such a data structure in Java. The starting point is, of course, to create a new class. We will call it IntegerVector. The data members of this class will need to hold the contents of the list together with information about the list such as the number of elements it actually contains at any given moment. This gives us the following starting point:

public class IntegerVector

{ private int data[]; // The array of data.

private int numElements; // Number of elements in vector.

. . .

} // class IntegerVector

Note that we will call our new data structure a vector¹ to differentiate it from a normal array in Java, but it is really much the same sort of thing, with a few extra operations. We have arranged to have the data, together with the other bits of housekeeping information, stored as private fields in the class. This means that we will only be able to access the data through the public methods that we write. This is a very common pattern for developing ADTs, as it provides complete control over the clients’ access to the contents of the ADT objects. The full listing of this file (IntegerVector.java) is given in Appendix A of the notes.

¹ Vector is the technical term used by mathematicians to refer to an indexed list of values.

Constructors

The first operation that we need to consider is the creation of a new vector, i.e. the constructor or constructors that we need. The main constructor will simply create the vector with a given size.

public IntegerVector (int initSize)

{ if (initSize <= 0)

throw new IllegalArgumentException("initSize <= 0");

numElements = 0;

data = new int[initSize];

} // constructor

Note the check on the precondition that the initial size specified is positive. Another useful constructor would be one that allows us to make use of a default initial size. This can simply make use of the first constructor above, by using the keyword this, as we see in the following code.

public IntegerVector ()

{ this(100);

} // constructor

These constructors allow us (thinking of our client role now) to declare vectors in either of the two ways illustrated by the following examples:

IntegerVector v1 = new IntegerVector(); // Create a vector with the default size

IntegerVector v2 = new IntegerVector(20); // Create a vector of 20 ints

Adding New Elements

That completes the constructors required for the implementation of the new data structure. What about accessing the data structure? The vectors described by what we have so far start off “empty” — we can see above that the data member numElements is set to zero as part of the initialisation of the class. So, how can we add a new element to the vector? A very simple way is to add the new element to the end of the list (assuming that there is space for it):

public void add (int item)

// Place the new item at the end of an IntegerVector

{ if (numElements + 1 > data.length)

throw new NoSpaceAvailableException("no space available");

data[numElements++] = item;

} // add

Note how the business of adding the new item and updating the number of elements is very neatly carried out by the single Java statement data[numElements++] = item. If you are not sure how this works refer back to your introductory Java notes on the ++ operator, and then try a few examples to see what is happening.
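One such experiment is sketched below. The class PostIncrementDemo and its append method are invented for this illustration and work on a plain array rather than on an IntegerVector. The key point is that the postfix ++ operator yields the old value of the variable (which is used as the array index) and only then increases the variable by one.

```java
class PostIncrementDemo {
    // Store item in the next free slot and return the updated element count.
    // Assumes that numElements < data.length.
    static int append(int[] data, int numElements, int item) {
        // Equivalent to: data[numElements] = item; numElements = numElements + 1;
        data[numElements++] = item;
        return numElements;
    }

    public static void main(String[] args) {
        int[] data = new int[5];
        int numElements = 0;
        numElements = append(data, numElements, 7);   // Stored in data[0]
        numElements = append(data, numElements, 42);  // Stored in data[1]
        System.out.println(data[0] + " " + data[1] + " " + numElements);  // 7 42 2
    }
}
```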

This add method will do very well for a simple list, but what if we wanted to add an item to the middle of our list (or at the beginning, for that matter)? In this case we need to specify the position in the vector where the new item is to be added.


public void add (int item, int position)

{ if (numElements + 1 > data.length)

throw new NoSpaceAvailableException("no space available");

if (position < 0)

throw new IllegalArgumentException("position is negative");

if (position >= numElements) // Add at end

data[numElements++] = item;

else

{ int k;

for (k = numElements-1; k >= position; k--)

data[k+1] = data[k]; // Move elements up

data[k+1] = item; // Put item in place

numElements++;

}

} // add

Notice how we have a further precondition check now to make sure that the position parameter does not have a negative value. We also have two cases that arise. The first is that the new item is to be added to the end of the vector (this is essentially the same as we had before). The second, more difficult, case is that the item is to be added into the middle of the list. In this case we have to make space for it by moving existing items in the list up. This is the purpose of the for loop above. Try out a few examples and satisfy yourself as to how this works.
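The effect of the for loop can be traced with a small stand-alone program. The sketch below is not part of the IntegerVector class: InsertShiftDemo and its insert method are names invented here, and the method works on a plain array, assuming that there is a spare slot and that position lies within the existing elements.

```java
import java.util.Arrays;

class InsertShiftDemo {
    // Insert item at position among the first numElements slots of data and
    // return the new element count. Assumes numElements < data.length and
    // 0 <= position <= numElements.
    static int insert(int[] data, int numElements, int item, int position) {
        int k;
        for (k = numElements - 1; k >= position; k--)
            data[k + 1] = data[k];   // Move elements up, starting from the end
        data[k + 1] = item;          // k is now position-1, so this is data[position]
        return numElements + 1;
    }

    public static void main(String[] args) {
        int[] data = {10, 20, 30, 40, 0};   // Four elements and one spare slot
        int numElements = 4;
        numElements = insert(data, numElements, 25, 2);
        // The loop copied 40 to slot 4 and then 30 to slot 3, freeing slot 2.
        System.out.println(Arrays.toString(data));   // [10, 20, 25, 30, 40]
        System.out.println(numElements);             // 5
    }
}
```

Working from the end of the list backwards matters: copying in the other direction would overwrite each element before it had been moved.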

From the client perspective, we can now add items to our vectors as follows:

v1.add(3);

v1.add(-8);

v2.add(39);

v1.add(21, 1); // Put item in position 1 of v1

What will these two vectors (v1 and v2) contain after this series of operations?

Displaying the Contents of a Vector

Having provided a mechanism that allows us to put items in vectors, it would be rather a nice idea if we could display the contents of a vector. A simple way of displaying the contents of an IntegerVector is to provide a method that outputs the contents of the vector. For example, if we assumed that we would always want to display the vector using System.out, then we could write such a method as follows:

public void output ()

// Display contents of vector using System.out

{ for (int k = 0; k < numElements; k++)

System.out.print(data[k] + " ");

System.out.println();

} // output


This could be used in the following way:

v1.output();

However, it would be far more useful if we provided a toString() method for the IntegerVector class. This will be called automatically if we use System.out.println() to display an IntegerVector:

System.out.println(v1);

In particular, this would not force us always to use System.out to display the vector, but would allow us to use any output stream. Furthermore, it can also be used in any other situation where we need to display a vector (for example, in a textbox in a GUI display).

The toString() method for our IntegerVector class would be as follows:

public String toString ()

{ StringBuffer s = new StringBuffer("[");

int k;

for (k = 0; k < numElements; k++)

{ s.append(data[k]);

if (k < numElements-1)

s.append(", ");

}

s.append("]");

return s.toString();

} // toString

Note how we use a StringBuffer to build up the string representation of the vector and then convert it to a String once the job is completed. The StringBuffer class is a standard Java class that allows us to manipulate strings in ways that are a lot more efficient than manipulating String objects directly. This is because String objects in Java cannot be changed once they are created. As a result, manipulating strings requires that the methods of the String class create new String objects for each result (a time-consuming operation). The contents of a StringBuffer, on the other hand, can be freely changed without requiring the creation of new objects.
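The difference between the two approaches can be seen in the following sketch, in which the class and method names are invented for this illustration. Both methods produce exactly the same text, but the first creates a new String object every time += is applied, while the second appends to a single StringBuffer and creates a String only once, at the end.

```java
class StringBufferDemo {
    // Builds "0, 1, ..., n-1" by repeated concatenation. Strings cannot be
    // changed, so each += produces a brand new String object.
    static String joinWithString(int n) {
        String s = "";
        for (int k = 0; k < n; k++) {
            s += k;
            if (k < n - 1)
                s += ", ";
        }
        return s;
    }

    // Builds the same text by appending to one StringBuffer object.
    static String joinWithBuffer(int n) {
        StringBuffer s = new StringBuffer();
        for (int k = 0; k < n; k++) {
            s.append(k);
            if (k < n - 1)
                s.append(", ");
        }
        return s.toString();   // Convert to a String once the job is complete
    }

    public static void main(String[] args) {
        System.out.println(joinWithString(5));   // 0, 1, 2, 3, 4
        System.out.println(joinWithBuffer(5));   // 0, 1, 2, 3, 4
    }
}
```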

With this method in place we can display vectors very simply. For example:

System.out.println("Vector 1: " + v1 + " Vector 2: " + v2);

Accessor Methods

Moving on to the subject of accessing entries in an IntegerVector, we need to provide two methods here: one to retrieve a data value (get) and one to alter a data value (set). These sorts of methods are usually referred to as accessor methods or just accessors.

public int get (int index)

{ if (index < 0 || index >= numElements)

throw new IndexOutOfBoundsException("index is out of range");

return data[index];

} // get


public void set (int index, int item)

{ if (index < 0 || index >= numElements)

throw new IndexOutOfBoundsException("index is out of range");

data[index] = item;

} // set

The if statements simply check that the index passed as a parameter is in the correct range of values (i.e. that it corresponds to an existing element of our vector). The return statement in the get method then returns the correct element of the vector, while the set method updates the value of the specified element. As we commented earlier, this is a common pattern for ADTs like this, and the use of “get” and “set” (or some variation on these names) is a common convention.

These methods allow us to access our vectors as shown in the following example:

v1.set(2, v2.get(0)+3); // Like: v1[2] = v2[0] + 3

Other Methods

Now we can create vectors, we can add elements to them and we can access and display the contents of our vectors. This comprises a minimal set of operations. To make our vectors more useful we can add some of the further operations we mentioned in the introduction to this section. The first of these allows us to find an item in a list:

public int position (int item)
  { int k;
    for (k = 0; k < numElements; k++)
      if (data[k] == item)
        break; // Leave for loop
    if (k >= numElements) // item was not found
      return -1;
    else
      return k;
  } // position

This is quite straightforward. The only real “catch” is the way we return −1 to indicate that the item could not be found in the vector.

The next method allows us to remove an item from a vector:

public void remove (int position)
  { if (position < 0 || position >= numElements)
      throw new IndexOutOfBoundsException("position is out of range");
    for (int k = position+1; k < numElements; k++)
      data[k-1] = data[k];
    numElements--;
  } // remove

Again this is fairly simple: we just need to move the other items down in the list and decrease the number of elements. Notice again the explicit check of the method’s precondition (a valid position), which throws an exception if the precondition is violated.

The last method that we need to consider is one to tell us the number of items contained in the vector. As we already have this information stored in a private data field, this is a trivial operation.


public int length ()
  // Return number of elements in an IntegerVector
  { return numElements; }

Some Comments

So now we have a class which allows us to construct and use lists of integer values. In itself this is probably not very interesting or useful, but the principles that we have in place here are just as applicable to lists of ingredients and instructions for recipes, lists of parts for a space shuttle, lists of student records, and many other applications.

One problem with this class, as we have developed it here, is that the maximum number of elements that can be contained in the vector is fixed when we construct a new vector. If we make the size too small we will be in trouble because the add operation will eventually fail. On the other hand, if we make the size too large we will be wasting memory. Exercise 4.9 below discusses one way around the first of these problems. The next section introduces a very useful technique that will allow us to solve both of these kinds of space problem at once.

Exercise 4.1 Rather than using the toString() method, we can access the contents of an IntegerVector directly when we want to display them. In this case we can find out the number of elements (using the length method) and then access the elements (using the get method). Write a program which displays a vector in this way.

Exercise 4.2 Write a method deleteAll which will delete all the elements in a vector (this is very easy when you see how to do it: think about how we know how many elements there are in a vector).

Exercise 4.3 Write a method to allow the input of vectors. This should first read an integer giving the length of the vector followed by that number of elements. For example:

3 12 74 89

should be read as a vector of three values (12, 74 and 89).

Exercise 4.4 Write a method for the IntegerVector class that will reverse the contents of the list.

Exercise 4.5 Write a second, overloaded position method for the IntegerVector class that takes a “starting point” (the position in the list from which it should start searching for the given item). This can be used to find the position of duplicate values in the list.


Exercise 4.6 Another useful operation for many ADTs is the ability to assign one to another. For example:

v1.assign(v2); // Assign the contents of v2 to v1

Write this method, using the following outline:

public void assign (IntegerVector v)
  // Replace the contents of this IntegerVector with the
  // contents of v

What precondition do you need to check?

Exercise 4.7 We might want to provide another method called add for our vector class to allow two IntegerVectors to be added together. There are (at least) two different interpretations for what this might mean for vectors of integers. What are they? Write the method to implement one of the two possible meanings.

Exercise 4.8 The set method has a range check which fails if an attempt is made to access an element which has not been added to the vector. Change this to allow such accesses to be made, as long as the access is still within the bounds of the size of the vector (this will allow us to use our vectors more like normal arrays in Java). What implications does this have?

Exercise 4.9 Following on from the previous exercise, we can get around the size constraints for our vectors if we are prepared to increase the size when accesses are made out of range (particularly when adding new items to the vector). Change the add and set methods so that such accesses cause the size of the vector to be increased. To do this you will need to allocate a new data array of the required size, and then copy across all the elements from the existing data array.

Exercise 4.10 Develop a student record class and then change the IntegerVector class to handle lists of student records rather than integers. Is this very difficult?

Exercise 4.11 Develop a class for lists with specified upper and lower bounds, and range checking, as in Visual BASIC. Your class should be able to be used as follows:

IntegerArray a = new IntegerArray(1, 10);
                 // Visual BASIC: Dim a(1 To 10) As Integer
...
a.set(1, 2);   // Assignment to the first array element
a.set(10, -5); // Assignment to the last array element
a.set(11, 2);  // Should cause a run-time error
for (int k = a.lowerBound(); k <= a.upperBound(); k++)
  System.out.println(a.get(k)); // Write out all elements


4.3 Linked Lists

In Exercise 4.9 above we explored one way of getting around the size constraints that apply to the vectors that we developed in the previous section. This approach will work but involves the rather cumbersome copying of the existing data every time we need to increase the size of the vector. Dynamic data structures provide us with a much better solution to this problem. Instead of setting aside a fixed amount of space for the items in a list, we can initially set aside no space at all and then allocate just enough space for each new element as it is added to the list. We can get around the copying requirement by keeping each data item in its own area of memory and linking these all together using references (often called pointers in this context). This is referred to as a dynamic data structure (because it is so easily changed). More specifically, we refer to this as a linked list, due to the way in which the items are kept together using references (or pointers, or links, as they are sometimes called in this context).

Let us work through the development of a class that will allow us to store lists of integers using this technique. As we do so, compare what we are doing here with the vector class of the previous section, and think about the differences between the two approaches. The class diagram for this new class is shown below. Notice the similarities between it and the class diagram for the IntegerVector class.

[Class diagram: IntegerList. Data members: first, numElements. Methods: add, get, set, position, remove, length.]

Data Members

The full listing of this “linked list of integers” class (IntegerList.java) can be found in Appendix A of the notes. The private section of the class is as follows:

public class IntegerList
  { private class ListNode
      { public int data;
        public ListNode next;
      } // class ListNode

    private ListNode first; // Pointer to the first ListNode in an IntegerList
    private int numElements; // Number of elements in an IntegerList
    . . .
  } // class IntegerList

This gets very interesting as we have one class (ListNode) declared inside another class (IntegerList). ListNode is what is known in Java as an inner class (for obvious reasons!). The class diagram for the ListNode class is shown below.

[Class diagram: ListNode. Data members: data, next.]


We need to think very carefully about the visibility of the members of these classes. If we examine the declaration of ListNode we see that its members (it only has two) are both public. But, because ListNode itself is declared as private class ListNode, we can only access these fields and use the ListNode class from within the IntegerList class. In other words, the inner class ListNode and its members data and next are effectively only visible to the methods of IntegerList. This is one of the rare occasions where it is permissible to use public data members in a class.

Linking the Nodes

The ListNode class is the key element in building the linked lists which we are using for our lists of integers. A single instance of this class can be pictured in the following way:

In itself that is not particularly useful, but we can create a list made up of a number of these individual nodes, using the next references to link them together:

The only remaining problem is to locate the start of this list of nodes. This is the purpose of the first reference variable in the IntegerList class. If we add this into the picture we get:

Of course, this shows a list to which several elements have been added. We need to consider how an empty list is first created, and how we can add elements to such a list.
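The linking described above can also be sketched directly in code. The example below is an illustration only: LinkDemo, makeNode and countNodes are hypothetical names, and the static nested ListNode class is a stand-in for the private inner class of IntegerList so that the nodes can be manipulated by hand.

```java
class LinkDemo {
    // Stand-in for the inner ListNode class described above.
    static class ListNode {
        int data;
        ListNode next;
    }

    // Creates a single node that is not yet linked to anything.
    static ListNode makeNode(int value) {
        ListNode node = new ListNode();
        node.data = value;
        node.next = null;
        return node;
    }

    // Counts the nodes reachable from start by following next references.
    static int countNodes(ListNode start) {
        int count = 0;
        for (ListNode curr = start; curr != null; curr = curr.next)
            count++;
        return count;
    }

    public static void main(String[] args) {
        ListNode a = makeNode(17), b = makeNode(3), c = makeNode(-8);
        a.next = b;          // 17 -> 3
        b.next = c;          // 3 -> -8; c.next stays null, marking the end
        ListNode first = a;  // plays the role of the first field
        System.out.println(countNodes(first)); // 3
    }
}
```

A null value in a next field marks the end of the chain, and a null first reference would represent a list with no nodes at all.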

Constructor

The creation of an empty list is handled by the constructor for the IntegerList class:

public IntegerList () // Constructor
  { first = null;
    numElements = 0;
  } // IntegerList constructor

Thus, when we create a list (e.g. IntegerList list1 = new IntegerList();) we have the following situation:

With the constructor behind us, we can start to think of the operations that we might want to perform on our lists. These will be the same operations as we identified at the start of this chapter. The first is adding a new element to a list.

Adding New Elements

The simplest possible case is where we always add new elements to the beginning of a list (why is this the simplest case?). If we choose this approach our add method will look like the following:

1   public void add (int item)
2     // Place the new item in an IntegerList
3     { ListNode node = new ListNode();
4       node.data = item;
5       node.next = first;
6       first = node;
7       numElements++;
8     } // add

Let’s assume that we are adding the value 17 into a list which already contains the values 3 and –8. The existing list will look like this:

After creating the new node (line 3) and setting its data field (line 4) we will have the following picture:


The next step (line 5) is to set the next field of the new node. This is assigned the value of the first reference, giving:

The final linking step (line 6) sets the first pointer to point to the new node, giving:


The last thing we do is increment the number of elements (line 7). After leaving the add method the local variable node is destroyed, and we are left with the following picture:

This is exactly what we want, with the new node first in the list of three nodes.
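The walk-through above can be replayed with a small stand-alone sketch. FrontAddDemo and its contents method are hypothetical names used only for this illustration; the add method follows the same steps as the listing, and the list is assumed to hold 3 followed by –8 before 17 is added.

```java
class FrontAddDemo {
    static class ListNode {
        int data;
        ListNode next;
    }

    ListNode first = null;
    int numElements = 0;

    // Same steps as the add method above, without the listing's line numbers.
    void add(int item) {
        ListNode node = new ListNode();
        node.data = item;
        node.next = first;   // new node points at the old first node
        first = node;        // the list now starts at the new node
        numElements++;
    }

    // Renders the list from front to back, with values separated by spaces.
    String contents() {
        StringBuffer s = new StringBuffer();
        for (ListNode curr = first; curr != null; curr = curr.next) {
            s.append(curr.data);
            if (curr.next != null)
                s.append(" ");
        }
        return s.toString();
    }

    public static void main(String[] args) {
        FrontAddDemo list = new FrontAddDemo();
        list.add(-8);
        list.add(3);
        System.out.println(list.contents()); // 3 -8
        list.add(17);
        System.out.println(list.contents()); // 17 3 -8
    }
}
```

Because every new node goes to the front, the values appear in the reverse of the order in which they were added.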

Exercise 4.12 Work through the add method when adding an element to a list that was previously empty, and satisfy yourself that it is correct in this case too. Draw diagrams of the changing situation like those above.

This is all very well, but it is rather limiting if we can only add new items to the front of the list. What if we want to add new nodes to the end of the list, or in the middle of the list? The following version of the add method allows us to add a new item anywhere in the list, with the default being at the end of the list. From the client’s point of view this is identical to the add method for the IntegerVector class, and numbers the positions in the list from zero in the same way.

1   public void add (int item, int position)
2     { if (position < 0)
3         throw new IllegalArgumentException("position is negative");
4       ListNode node = new ListNode();
5       node.data = item;
6       ListNode curr = first,
7                prev = null;
8       for (int k = 0; k < position && curr != null; k++)
9         // Find position
10        { prev = curr;
11          curr = curr.next;
12        }
13      node.next = curr;
14      if (prev != null)
15        prev.next = node;
16      else
17        first = node;
18      numElements++;
19    } // add

This is complicated by the need to find the correct position in the list. This is done by the for loop at lines 8–12. Notice how we need to check both that k < position and that curr != null in this for loop. This is to make sure that we don’t “run off the end” of the list if position has been given a value greater than the length of the list. In this case curr will become null at the end of the list and the new node will be added at the end.
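To see the effect of the two loop conditions, the positional add method can be exercised in a small stand-alone sketch. PositionAddDemo and contents are hypothetical names for this illustration; the body of add is the same logic as the listing above, with the line numbers removed.

```java
class PositionAddDemo {
    static class ListNode {
        int data;
        ListNode next;
    }

    ListNode first = null;
    int numElements = 0;

    // The positional add from the listing above, unchanged in logic.
    void add(int item, int position) {
        if (position < 0)
            throw new IllegalArgumentException("position is negative");
        ListNode node = new ListNode();
        node.data = item;
        ListNode curr = first, prev = null;
        for (int k = 0; k < position && curr != null; k++) {
            prev = curr;
            curr = curr.next;
        }
        node.next = curr;
        if (prev != null)
            prev.next = node;
        else
            first = node;
        numElements++;
    }

    // Renders the list from front to back, with values separated by spaces.
    String contents() {
        StringBuffer s = new StringBuffer();
        for (ListNode curr = first; curr != null; curr = curr.next) {
            s.append(curr.data);
            if (curr.next != null)
                s.append(" ");
        }
        return s.toString();
    }

    public static void main(String[] args) {
        PositionAddDemo list = new PositionAddDemo();
        list.add(10, 0);   // empty list: 10
        list.add(30, 1);   // at the end: 10 30
        list.add(20, 1);   // in the middle: 10 20 30
        list.add(40, 99);  // position past the end: 10 20 30 40
        System.out.println(list.contents()); // 10 20 30 40
    }
}
```

The last call shows the case discussed above: the loop stops when curr becomes null, so an over-large position simply appends the item.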

Exercise 4.13 Trace through the execution of this method when adding a new element to the beginning, middle and end of a list, and satisfy yourself that it works correctly in all three of these cases. Does it work correctly when adding an item to an empty list?

Accessor Methods

The next methods that we need to consider are the get and set operations, which will allow us to access the contents of the linked list a little like an array. The definitions of the get and set methods are:

public int get (int index)
  { if (index < 0 || index >= numElements)
      throw new IndexOutOfBoundsException("index is out of range");
    ListNode curr = first;
    for (int k = 0; curr != null && k < index; k++)
      curr = curr.next;
    assert curr != null;
    return curr.data;
  } // get

public void set (int index, int item)
  { if (index < 0 || index >= numElements)
      throw new IndexOutOfBoundsException("index is out of range");
    ListNode curr = first;
    for (int k = 0; curr != null && k < index; k++)
      curr = curr.next;
    assert curr != null;
    curr.data = item;
  } // set

This allows us, as clients of the IntegerList class, to use lists in the following way:

IntegerList myList = new IntegerList();

. . .

myList.add(1);

myList.add(37);

myList.add(15, 0); // Add 15 to the beginning of myList

myList.set(2, myList.get(0) + 10);

After this myList will contain: 15 1 25.

Efficiency

Now, while the methods that we have developed above appear to allow us to do exactly the same things with an IntegerList as we can do with the IntegerVector class, there is a very big difference in the way in which they are implemented, and this has major implications for the efficiency of the accessing operations. For an array, as used in the IntegerVector class, the items are stored in consecutive memory locations and so accessing any element of the array is a simple matter of adding an offset (the index) to the starting address of the array to find the location of the item which is being referred to. This is a quick and efficient operation: a simple addition. For our IntegerList class, on the other hand, the get and set operations involve a for loop working its way through each item in the list, counting as it goes along until the right element is found. This is potentially a very slow, inefficient operation, particularly if the list is very long and the items being referred to are located near the end of the list. As a result, one should use these methods very carefully with the linked list implementation.
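The cost described above can be made concrete by counting how many next references are followed. The sketch below is an illustration under stated assumptions: AccessCostDemo, build, hopsForGet and hopsForIndexedScan are hypothetical names, and hopsForGet mirrors the loop inside get rather than calling the real IntegerList class.

```java
class AccessCostDemo {
    static class ListNode {
        int data;
        ListNode next;
    }

    // Builds a list holding 0, 1, ..., n-1 and returns its first node.
    static ListNode build(int n) {
        ListNode first = null;
        for (int v = n - 1; v >= 0; v--) {
            ListNode node = new ListNode();
            node.data = v;
            node.next = first;
            first = node;
        }
        return first;
    }

    // Number of next references followed by one get(index)-style access.
    static long hopsForGet(ListNode first, int index) {
        long hops = 0;
        ListNode curr = first;
        for (int k = 0; curr != null && k < index; k++) {
            curr = curr.next;
            hops++;
        }
        return hops;
    }

    // Total hops when every element is fetched by index, one access each.
    static long hopsForIndexedScan(int n) {
        ListNode first = build(n);
        long total = 0;
        for (int k = 0; k < n; k++)
            total += hopsForGet(first, k);
        return total;
    }

    public static void main(String[] args) {
        int[] sizes = {10, 100, 1000};
        for (int n : sizes)
            System.out.println(n + " elements: " + hopsForIndexedScan(n) + " hops");
        // 10 elements: 45 hops
        // 100 elements: 4950 hops
        // 1000 elements: 499500 hops
    }
}
```

Visiting all n elements by index follows 0 + 1 + ... + (n-1) = n(n-1)/2 links in total, so the work grows roughly with the square of the list length, whereas the same scan over an array needs only n direct accesses.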

Displaying the Contents of a List

The next method we will consider is the toString method to allow us to display the contents of an IntegerList conveniently:

public String toString ()
  { StringBuffer s = new StringBuffer("[");
    for (ListNode curr = first; curr != null; curr = curr.next)
      { s.append("" + curr.data);
        if (curr.next != null)
          s.append(", ");
      }
    s.append("]");
    return s.toString();
  } // toString

Notice again the use of a StringBuffer for efficiency. Note too the use of the for loop in this method. This is a common idiom for working through linked lists in Java. This makes use of a pointer (curr in the example above), which starts off at the node pointed to by the first pointer of the list we are working through. While the curr pointer is not null (i.e. while we have not reached the end of the list) we work through from one element to the next by setting curr to the next field of the current node (curr = curr.next).
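The same traversal idiom serves for any operation that must visit every node. As a minimal sketch, the hypothetical TraversalDemo class below uses it to total the values in a list; the two-argument ListNode constructor is an assumption added here to keep the example short, and is not part of the ListNode class in these notes.

```java
class TraversalDemo {
    static class ListNode {
        int data;
        ListNode next;

        // Convenience constructor (an assumption for this sketch only).
        ListNode(int data, ListNode next) {
            this.data = data;
            this.next = next;
        }
    }

    // Adds up the data fields of all nodes reachable from first.
    static int sum(ListNode first) {
        int total = 0;
        for (ListNode curr = first; curr != null; curr = curr.next)
            total += curr.data;
        return total;
    }

    public static void main(String[] args) {
        ListNode first = new ListNode(15, new ListNode(1, new ListNode(25, null)));
        System.out.println(sum(first)); // 41
        System.out.println(sum(null));  // 0 (an empty list)
    }
}
```

Note that the loop handles the empty list without any special case, since the test curr != null fails immediately.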


Other Methods

The idiom used above in the toString method is used again, in a particularly interesting way, in the next method we consider: the position method. This is used to find a particular element in a list.

public int position (int item) // Find item in an IntegerList
  { ListNode curr = first;
    int k;
    for (k = 0; curr != null && curr.data != item;
         k++, curr = curr.next)
      ; // Search for item in IntegerList
    if (curr == null) // item was not found
      return -1;
    else
      return k;
  } // position

Here the body of the for loop is completely empty! All the “work” of the loop is done in the control section. While this is a common idiom, it is necessary to identify it clearly as an empty loop, hence the very clearly indented and commented ; terminating the for loop. Note that the && operator in Java uses short-circuit evaluation of conditional expressions, and so the test curr != null && curr.data != item is quite safe.
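The importance of short-circuit evaluation can be demonstrated by contrasting && with the non-short-circuit & operator, which always evaluates both operands. ShortCircuitDemo and its two methods are hypothetical names used only for this sketch.

```java
class ShortCircuitDemo {
    static class ListNode {
        int data;
        ListNode next;
    }

    // Safe: when curr is null the right-hand operand is never evaluated.
    static boolean holdsOther(ListNode curr, int item) {
        return curr != null && curr.data != item;
    }

    // Unsafe: & evaluates both operands, so curr.data is read even if curr is null.
    static boolean holdsOtherEager(ListNode curr, int item) {
        return curr != null & curr.data != item;
    }

    public static void main(String[] args) {
        System.out.println(holdsOther(null, 5)); // false
        try {
            holdsOtherEager(null, 5);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException from the eager test");
        }
    }
}
```

The order of the operands matters just as much: writing curr.data != item && curr != null would also fail at the end of the list, because the dereference would come first.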

The remove method is as follows:

1   public void remove (int position)
2     { if (position < 0 || position >= numElements)
3         throw new IndexOutOfBoundsException("position is out of range");
4       ListNode curr = first,
5                prev = null;
6       for (int k = 0; curr != null && k < position; k++)
7         { prev = curr;
8           curr = curr.next;
9         }
10      assert curr != null;
11      if (prev != null)
12        prev.next = curr.next;
13      else
14        first = curr.next;
15      numElements--;
16    } // remove

There are two points to note about this method. Firstly, the checks in lines 2 and 10 are testing the same thing in two different ways, namely that the value of the parameter position is within the range of items in the list. The assertion will only fail if the list is “broken” in some way (i.e. if there is some “internal” problem with our implementation of the linked list). Secondly, we again need a special case for the deletion of the first node in a list, just as we did for adding the first node in a list. This is done by the if statement in lines 11–14.

The last method that we need to consider is length, which reports on the number of elements in the list. This is identical to the equivalent method in the IntegerVector class. Why?
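The two cases handled by the remove method (first node versus any later node) can be observed in a small stand-alone sketch. RemoveDemo, addFront and contents are hypothetical names for this illustration; remove follows the listing above, except that the internal assert is omitted for brevity.

```java
class RemoveDemo {
    static class ListNode {
        int data;
        ListNode next;
    }

    ListNode first = null;
    int numElements = 0;

    // Helper for building a list: adds the item at the front.
    void addFront(int item) {
        ListNode node = new ListNode();
        node.data = item;
        node.next = first;
        first = node;
        numElements++;
    }

    // The remove logic from the listing above (assert omitted).
    void remove(int position) {
        if (position < 0 || position >= numElements)
            throw new IndexOutOfBoundsException("position is out of range");
        ListNode curr = first, prev = null;
        for (int k = 0; curr != null && k < position; k++) {
            prev = curr;
            curr = curr.next;
        }
        if (prev != null)
            prev.next = curr.next;   // unlink a middle or last node
        else
            first = curr.next;       // special case: unlink the first node
        numElements--;
    }

    // Renders the list from front to back, with values separated by spaces.
    String contents() {
        StringBuffer s = new StringBuffer();
        for (ListNode curr = first; curr != null; curr = curr.next) {
            s.append(curr.data);
            if (curr.next != null)
                s.append(" ");
        }
        return s.toString();
    }

    public static void main(String[] args) {
        RemoveDemo list = new RemoveDemo();
        for (int v = 40; v >= 10; v -= 10)
            list.addFront(v);                // 10 20 30 40
        list.remove(0);                      // first node
        System.out.println(list.contents()); // 20 30 40
        list.remove(1);                      // middle node
        System.out.println(list.contents()); // 20 40
        list.remove(1);                      // last node
        System.out.println(list.contents()); // 20
    }
}
```

No node is explicitly freed: once nothing refers to the unlinked node, Java's garbage collector is able to reclaim it.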


Some Comments

That completes our basic set of operations for lists. One of the features of this class is that the client interface is almost identical to that of the IntegerVector class. The major implication of this is that client programs using lists of integers do not have to be changed (see footnote 2) to use one or the other of the two different classes. What we have successfully done is to hide all the details of the implementation from the client. We can choose the implementation (array or linked list) that we want in different circumstances with very little impact on the client programs. This highlights the power of abstraction.

Exercise 4.14 Write a method deleteAll which will delete all the elements in a list. How does this differ from the same method for the IntegerVector class (Exercise 4.2)?

Exercise 4.15 Write a method to allow the input of lists of integers (see Exercise 4.3).

Exercise 4.16 Write a method for the IntegerList class that will reverse the contents of the list (see Exercise 4.4).

Exercise 4.17 Write an overloaded position method for the IntegerList class that will allow duplicate values to be located (see Exercise 4.5).

Exercise 4.18 Write an assignment method, assign, for the IntegerList class (see Exercise 4.6).

Exercise 4.19 The common special case of adding a new item to the end of the list can be simplified by keeping a reference to the last item in the list (say, last). Modify the IntegerList class to use this approach.

Exercise 4.20 Quite often programs follow a common pattern of getting an element of a linked-list ADT immediately followed by setting the value. For example:

if (lst.get(k) == 3)
  lst.set(k, 9);

If the get method “remembers” the value of the index and keeps a pointer to this element of the list, then we can jump straight there for the subsequent set operation. Furthermore, we can also make use of this mechanism to jump into the middle of the list for subsequent accesses at or beyond this point (e.g. a get operation at position k+1 does not need to start from the beginning of the list again).

The “remembered” (or “cached”) values of the index and reference into the middle of the list can be set by any of the methods that find things in the middle of the list (i.e. get, set and position). Similarly, the cached values can be used by both get and set.

We need to be very careful when we implement this: any intervening changes to the list (using the add or remove methods) will potentially invalidate the optimisation.

Modify the IntegerList class to use these optimisations.

Footnote 2: The only change necessary is to declare objects as IntegerList rather than IntegerVector. This is a trivial naming issue.


4.4 Generic Lists

In the last section we commented on the powerful data hiding aspects of Java classes, and on how we can implement lists of integers in different ways without affecting the clients of the classes. This is very useful, but of course the restriction that we are dealing only with lists of integers remains. Java provides us with mechanisms that allow us to develop lists of any class type. Most simply, we can use polymorphism for this purpose. In Java 5.0, some new features were added to the Java programming language, which allow us to develop generic classes with greater safety and ease-of-use.

4.4.1 A Generic List Class Using Polymorphism

Using polymorphism, we can develop generic data structures that can hold any kind of object. We will now consider a generic list data structure, using this approach. We will call our new, generic list class ObjectList.

If you look back at the IntegerList class of the previous section you will see that there are very few places where the fact that it is an integer list is important. The ListNode structure contains an integer value (in the data field). The add, position and set methods take an integer as a parameter specifying the value which is to be added or searched for or set. Lastly, the get method returns an integer value. If we rewrite the class and replace these few references to int with Object then we will have a general-purpose list type that can be used for lists of any objects. Let’s see how we can do this.

Data Members

The class itself is very similar to the previous examples, but we now use Object in the place of int in the definition of our ObjectList class. The first place where this is apparent is in the declaration of the inner ListNode class. Here are all the data members of the ObjectList class:

public class ObjectList
  { private class ListNode
      { public Object data;
        public ListNode next;
      } // class ListNode

    private ListNode first; // Pointer to the first ListNode
    private int numElements; // Number of elements
    . . .
  } // class ObjectList

Notice how the data field is now defined to be of type Object. In the same way, in the methods of our new class we can use Object rather than int, but the algorithms stay exactly as they were before.

The Methods

Here is the listing of the rest of the ObjectList class:

    . . .
    public ObjectList () // Constructor
      { first = null;
        numElements = 0;
      } // ObjectList constructor

    public void add (Object item, int position)
      // Place the new item in an ObjectList
      { if (position < 0)
          throw new IllegalArgumentException("position is negative");
        ListNode node = new ListNode();
        node.data = item;
        ListNode curr = first,
                 prev = null;
        for (int k = 0; k < position && curr != null; k++)
          // Find position
          { prev = curr;
            curr = curr.next;
          }
        node.next = curr;
        if (prev != null)
          prev.next = node;
        else
          first = node;
        numElements++;
      } // add

    public void add (Object item)
      // Place the new item at end of an ObjectList
      { add(item, numElements);
      } // add

    public void remove (int position)
      // Remove item at position in an ObjectList
      { if (position < 0 || position >= numElements)
          throw new IndexOutOfBoundsException("position is out of range");
        ListNode curr = first,
                 prev = null;
        for (int k = 0; curr != null && k < position; k++)
          { prev = curr;
            curr = curr.next;
          }
        assert curr != null;
        if (prev != null)
          prev.next = curr.next;
        else
          first = curr.next;
        numElements--;
      } // remove

    public int length ()
      // Return number of elements in an ObjectList
      { return numElements;
      } // length

    public Object get (int index)
      // Retrieve an element from an ObjectList
      { if (index < 0 || index >= numElements)
          throw new IndexOutOfBoundsException("index is out of range");
        ListNode curr = first;
        for (int k = 0; curr != null && k < index; k++)
          curr = curr.next;
        assert curr != null;
        return curr.data;
      } // get

    public void set (int index, Object item)
      // Change the value of an element in an ObjectList
      { if (index < 0 || index >= numElements)
          throw new IndexOutOfBoundsException("index is out of range");
        ListNode curr = first;
        for (int k = 0; curr != null && k < index; k++)
          curr = curr.next;
        assert curr != null;
        curr.data = item;
      } // set

    public int position (Object item)
      // Find item in an ObjectList
      { ListNode curr = first;
        int k;
        for (k = 0; curr != null && !curr.data.equals(item); k++, curr = curr.next)
          ; // Search for item in ObjectList
        if (curr == null) // item was not found
          return -1;
        else
          return k;
      } // position

    public String toString ()
      { StringBuffer s = new StringBuffer("[");
        for (ListNode curr = first;
             curr != null;
             curr = curr.next)
          { s.append(curr.data.toString());
            if (curr.next != null)
              s.append(", ");
          }
        s.append("]");
        return s.toString();
      } // toString

  } // class ObjectList

If you compare these methods with the equivalent ones from the IntegerList class you will see that the algorithms are almost identical. All that has changed is that a few parameters and variables are now declared as type Object rather than int. Also, in the position method, we have had to use .equals for the comparison rather than ==.
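The reason for switching to .equals can be seen in a short sketch. For object references, == asks whether two references point to the very same object, while equals asks whether the two objects have equal contents. EqualsDemo and its two helper methods are hypothetical names used only for this illustration.

```java
class EqualsDemo {
    // True only if both references point to the very same object.
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    // True if the objects are equal according to their equals method.
    static boolean sameContents(Object a, Object b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Object stored = new String("Wells");  // object held in a list node
        Object wanted = new String("Wells");  // separate object, same contents
        System.out.println(sameObject(stored, wanted));   // false
        System.out.println(sameContents(stored, wanted)); // true
    }
}
```

A position method written with == would therefore find an item only if the client passed in the identical object that was stored, which is rarely what is wanted. For a class such as the student record class to be searched by value, it would need to override equals itself.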


Using the ObjectList ADT

Note: This entire section deals with the client view, and will not be highlighted.

The previous section covered the implementor’s view of a generic class using polymorphism. How do we use this from the client perspective? For our generic list class we can create a new list in the following way:

ObjectList iList = new ObjectList();

We can then use iList in much the same way as objects of our earlier IntegerList class. The only restriction is that, since the ObjectList class works with objects, we need to make use of the Integer wrapper class:

iList.add(new Integer(3));

iList.add(new Integer(-7));

iList.add(new Integer(56));

In fact, thanks to a further new feature of Java 5.0 (autoboxing, which automatically converts primitive types to an object of the equivalent wrapper class), we can dispense with the explicit creation of new Integer objects:

iList.add(3);

iList.add(-7);

iList.add(56);

It is important to note that Integer objects are still involved in this case (the two code segments above have essentially the same effect). The only difference is that the compiler manages the conversion to wrapper objects automatically in the second example; it does so through Integer.valueOf, which may reuse cached objects for small values rather than creating new ones.
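That a wrapper object really is passed can be checked with a short sketch. AutoboxDemo and describe are hypothetical names; describe accepts an Object parameter in the same way that the add method of ObjectList does.

```java
class AutoboxDemo {
    // Accepts any object, as ObjectList.add does, and reports its run-time class.
    static String describe(Object item) {
        return item.getClass().getName() + " " + item;
    }

    public static void main(String[] args) {
        System.out.println(describe(3));               // java.lang.Integer 3
        System.out.println(describe(new Integer(3)));  // java.lang.Integer 3
        System.out.println(describe(2.5));             // java.lang.Double 2.5
    }
}
```

In each call the method receives an object of a wrapper class, whether the client wrote the conversion explicitly or left it to the compiler. (Recent Java versions report the Integer constructor as deprecated; it is used here only to match the explicit form shown above.)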

Returning to the use of our generic class, if we wrote a student record class, we could create a list of student records:

public class Student

{ . . .

} // class Student

ObjectList classList = new ObjectList();

classList.add(new Student());

And so on, and so on! Our ObjectList class is completely generic, and can be used to work with lists of any type of object we require, even mixed lists containing different types of object.

One point that we need to bear in mind is that the get method returns a reference of type Object. This means that when we retrieve an object from one of these lists we need to take some care. In particular, the following code will not work, as we will get a compiler error:

Student st;

st = classList.get(5); // Retrieve sixth student

How can we retrieve the objects then? The way to do this is to use a type cast to convert the Object reference returned by get to the correct type. If we are completely sure that the object we are retrieving is of a certain type (Student in our example above) we can simply write the following:


Student st;

st = (Student)classList.get(5); // Retrieve sixth student

If we are less certain of ourselves, it is possible to be really careful and check first before doing the type cast (which would result in a run-time error if we are wrong). This makes use of the Java instanceof operator. This is a boolean operator that takes a reference variable and a class name, and gives a result of true if the object is of the correct type. It can be used as in the following example:

Student st;
Object obj = classList.get(5); // Retrieve sixth object
if (obj instanceof Student) // Check type
  st = (Student)obj; // It’s a Student so the type cast will work
else
  System.err.println("Class list contains non-student object!");

Exercise 4.21 Develop a generic version of the array-based IntegerVector class using polymorphism. Call your new class ObjectVector. How different is it to the IntegerVector version?

4.4.2 A Generic List Class Using Java’s Generic Features

The last two examples above highlighted a drawback of using polymorphism: since the ObjectList class is written using Object as the type of the data being stored in the list, various casts and conversions are required. In addition, there is no type-checking by the compiler. For example, the following is perfectly legal as far as the compiler is concerned:

ObjectList classList = new ObjectList();

classList.add(new Student());

...

classList.add("Hello World!"); // A String in a class list?!

classList.add(3); // An Integer in a class list?!

The problem is that the ObjectList class is able to hold any and all kinds of objects, and the compiler cannot detect whether the kind of usage that we see above is sensible in the context of the program that is using an ObjectList.

In order to solve these kinds of problems, Java 5.0 introduced generics. This feature allows us to write a class to handle data of some unspecified type, but with strong type-checking by the compiler and automatic type conversions. We will now see how this can be used to develop a truly generic list class.

Writing a Generic Class

The key to creating a generic list class is to parameterise the type of the data that is to be stored in the list. In other words, client programs need to be able to say: “we want a list of objects of type X”. The compiler then needs to be able to check that the list is being used correctly (e.g. that we don’t try to store integers in a class list of students).

The syntax that is used for this in Java is quite simple:

public class GenericList<T>

{ . . .

} // class GenericList

This states that the GenericList class will work with objects of some unspecified type, called T here. Within the class we can use the type parameter T as if it is a class type. So, for example, the ListNode inner class now becomes:

public class GenericList<T>

{ private class ListNode

{ public T data;

public ListNode next;

} // class ListNode

. . .

} // class GenericList

The type parameter T is then specified by the client program, when the class is used:

GenericList<Student> classList = new GenericList<Student>();

The compiler can, and will, now check to ensure that only Student objects are stored in the class list. Any attempt to store the wrong kind of data in the list will result in a compile-time error.

Returning to the rest of the class, writing the methods, etc. is quite easy, as all we need to do is change each mention of Object in the ObjectList class to the type parameter T. The class is shown below (with comments removed to save some space).

public class GenericList<T>

{ private class ListNode

{ public T data;

public ListNode next;

} // inner class ListNode

private ListNode first;

private int numElements;

public GenericList ()

{ first = null;

numElements = 0;

} // GenericList constructor

public void add (T item, int position)

{ if (position < 0)

throw new IllegalArgumentException("position is negative");

ListNode node = new ListNode();

node.data = item;

ListNode curr = first,

prev = null;

for (int k = 0; k < position && curr != null; k++)

// Find position

{ prev = curr;

curr = curr.next;

}

node.next = curr;

if (prev != null)

prev.next = node;

else

first = node;

numElements++;

} // add

public void add (T item)

{ add(item, numElements);

} // add

public void remove (int position)

{ if (position < 0 || position >= numElements)

throw new IndexOutOfBoundsException("position is out of range");

ListNode curr = first,

prev = null;

for (int k = 0; curr != null && k < position; k++)

{ prev = curr;

curr = curr.next;

}

assert curr != null;

if (prev != null)

prev.next = curr.next;

else

first = curr.next;

numElements--;

} // remove

public int length ()

{ return numElements;

} // length

public T get (int index)

{ if (index < 0 || index >= numElements)

throw new IndexOutOfBoundsException("index is out of range");

ListNode curr = first;

for (int k = 0; curr != null && k < index; k++)

curr = curr.next;

assert curr != null;

return curr.data;

} // get

public void set (int index, T item)

{ if (index < 0 || index >= numElements)

throw new IndexOutOfBoundsException("index is out of range");

ListNode curr = first;

for (int k = 0; curr != null && k < index; k++)

curr = curr.next;

assert curr != null;

curr.data = item;

} // set

public int position (T item)

{ ListNode curr = first;

int k;

for (k = 0;

curr != null && !curr.data.equals(item);

k++, curr = curr.next)

; // Search for item in GenericList

if (curr == null) // item was not found

return -1;

else

return k;

} // position

public String toString ()

{ StringBuffer s = new StringBuffer("[");

for (ListNode curr = first; curr != null; curr = curr.next)

{ s.append(curr.data.toString());

if (curr.next != null)

s.append(", ");

}

s.append("]");

return s.toString();

} // toString

} // class GenericList

Using a Generic Class

Note: This entire section deals with the client view, and will not be highlighted.

As already noted, using the class is a simple matter of specifying the type of the data to be stored. This allows us to do things like:

GenericList<Integer> iList = new GenericList<Integer>();

GenericList<Student> classList = new GenericList<Student>();

...

iList.add(3);

iList.add(-7);

iList.add(56);

...

int x = iList.get(0);

...

classList.add(new Student());

...

Student st = classList.get(5); // Retrieve sixth student

...

There are a few things to note that are illustrated by these examples. Firstly, because of the type information that the compiler now has, it can ensure that only integers are stored in iList, and only Student objects in classList. This provides far more safety from accidental programming errors than previously.

Secondly, as is clear from the example above, because the compiler knows the return type of the get method, no type-cast is required as was the case previously³.
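The cast has not disappeared entirely: the compiler inserts it for us wherever a value of the type parameter is used as a specific type. The following sketch makes that hidden cast visible. It uses java.util.ArrayList as a stand-in for GenericList and deliberately bypasses the compiler's checks through a raw type, which is an assumption made purely for illustration and not something to do in real code:

```java
import java.util.ArrayList;
import java.util.List;

class HiddenCastDemo {
    // Smuggles an Integer into a List<String> via a raw reference,
    // then reads it back through the typed reference.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static String firstAsString(List<String> names) {
        List raw = names;                 // raw view: no compile-time checking
        raw.add(0, Integer.valueOf(42));  // accepted, with only a warning
        try {
            String s = names.get(0);      // the compiler-inserted (String) cast fails here
            return s;
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(firstAsString(new ArrayList<String>()));
    }
}
```

No cast appears in the source of firstAsString, yet a ClassCastException is raised on the line that calls get, which is where the compiler placed the cast.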

Exercise 4.22 Develop a generic version of the IntegerVector class using the new generic features in Java 5.0. Call your new class GenericVector. How different is it to the IntegerVector version (and ObjectVector, if you have done Exercise 4.21)? (Note that the java.util package already contains a class called Vector that provides very similar functionality.)

4.5 Closing Remarks

The generic mechanisms in Java 5.0 are quite complex, and the examples above have only scratched the surface of this feature. We will see more of their use as we continue through the course. Also, more details are available in [2].

The classes that we have developed in this chapter have very usefully illustrated a number of important points about abstract data types and their implementation. However, you should note that the Java class libraries already contain classes like these, and many of the ADTs which we will be studying in subsequent chapters. In particular, the java.util package contains a class called LinkedList, which provides the same functionality as our GenericList class (and more). The same package also contains classes called Vector and ArrayList which work very much like an array-based version of the GenericList class, and are very widely used. The classes in the java.util package all make use of the new generic facilities.
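As a brief illustration of the library classes mentioned here, the following sketch runs the same client code against java.util.LinkedList and java.util.ArrayList through their common List interface. The class and method names LibraryListDemo and demo are hypothetical, chosen only for this example:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class LibraryListDemo {
    // Works with any List<String> implementation supplied by the caller
    static String demo(List<String> items) {
        items.add("ant");        // append at the end
        items.add("cat");
        items.add(1, "bee");     // insert at position 1
        items.remove(0);         // remove the first item
        return items.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo(new LinkedList<String>()));
        System.out.println(demo(new ArrayList<String>()));
    }
}
```

Both calls produce the same result, since the two classes differ in their implementation rather than in the operations they offer to clients.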

Skills

• You should know how to use the following Java features: class constructors and polymorphism

• You should be familiar with the use of linked lists, and with the Java syntax and idioms used for linked lists

• You should be aware of the advantages of data-hiding, from the client and implementor viewpoints

• You should know how polymorphism in Java can be used to create generic data structures

• You should know how the generic facilities in Java 5.0 can be used to create generic data structures, and how these differ from the use of polymorphism

• You should be aware that the Java class libraries already contain classes for many common ADTs

³ In fact, a type-cast is still required, but it is inserted automatically by the compiler, using the type information which it has available.

Chapter 5

Stacks and Queues

Objectives

• To consider three abstract data types: stack, queue and deque

• To study the implementation of these abstract data types using arrays and using dynamic data structures

• To introduce some of the uses of these abstract data types

• To consider problem solving algorithms as a particular application of these abstract data types

• To study the following implementation techniques for linked lists: doubly-linked lists, circularly-linked lists and list header nodes

5.1 Introduction

In chapter four we developed classes for simple lists of items. Stacks and queues are just specialised lists. They are defined by the way in which we add items to and remove items from the lists. For a stack, items are always added and removed at one end (more usually called the top). This is analogous to the way we deal with a stack of plates in a kitchen, or a stack of tins or boxes in a supermarket: a new item is put on the top of the stack, and when an item is removed it is the top one again. On the other hand, items are added to a queue at one end (usually called the tail), and are removed from the other end (usually called the head). Again this is analogous to the kinds of queues that we are used to seeing in places like post offices and banks: people join the end of the queue and leave it when they have reached the head of the queue.

Both of these behaviours arise very frequently in programming computers, and so these are probably two of the most common data structures. In this chapter we will study the implementation of generic classes for stack and queue data structures, and will consider some of the possible uses of these structures. In the last section of this chapter we will consider a more general data structure: the double ended queue, or deque¹ as it is usually called. The deque will also be used to introduce some further techniques for implementing linked lists.

¹ Pronounced “deck”. Some authors spell it dequeue.

5.2 Stacks

As already mentioned, stacks are characterised by the fact that items are added to and removed from a single end. One effect of this is that when we remove an item, it is the item that was most recently added to the stack. For this reason a stack is also commonly referred to as a “last in, first out” (or LIFO) list. We call the operation which adds an item to a stack push, and the operation which removes an item from a stack pop. We thus speak of “pushing” and “popping” values.
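The LIFO behaviour can be observed directly with a library class before we write our own. This sketch assumes java.util.ArrayDeque as a ready-made stack (it is not one of the classes developed in these notes) and records the order in which pushed values are popped:

```java
import java.util.ArrayDeque;

class LifoDemo {
    // Pushes the values in the order given and returns them in the order popped
    static int[] popOrder(int... values) {
        ArrayDeque<Integer> stack = new ArrayDeque<Integer>();
        for (int v : values)
            stack.push(v);               // add at the top
        int[] out = new int[values.length];
        for (int i = 0; i < out.length; i++)
            out[i] = stack.pop();        // remove from the top
        return out;
    }

    public static void main(String[] args) {
        for (int v : popOrder(1, 2, 3))
            System.out.print(v + " ");
        System.out.println();
    }
}
```

The values come back in the reverse of the order in which they were pushed: the last one in is the first one out.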

Since a stack is just a special case of a list we can choose to implement it in either of the two ways that we used for general lists in the last chapter (i.e. using an array, or using a linked list). We will consider both of these approaches in turn.

5.2.1 An Array-based Stack Implementation

An array can be used to implement a stack very easily. All that is required is to keep the index of the item which is currently on the top of the stack. If the stack is empty we can set the “top” to a value like −1. Diagrammatically this would look like the following (we will consider a stack of characters for the moment, but the principles are the same for a stack of anything):

top -1

0 1 2 3 ... max

stack ? ? ? ? ... ?

If we push the character ’G’ onto this empty stack the picture becomes:

top 0

0 1 2 3 ... max

stack G ? ? ? ... ?

If we now push the character ’e’ onto this stack the picture becomes:

top 1

0 1 2 3 ... max

stack G e ? ? ... ?

And so on. By the time we have pushed on all the letters of the name “George” we would have the following picture (assuming max > 5):

top 5

0 1 2 3 4 5 ... max

stack G e o r g e ... ?

If we now pop an item from the stack we will get the character ’e’. And the picture becomes:

top 4

0 1 2 3 4 5 ... max

stack G e o r g ? ... ?

Another pop operation would give the character ’g’ and the value of top would become 3, and so on. Of course, the usual disadvantage of an array arises with this implementation, namely that the size of the stack is fixed (at max+1 characters in the example above). If this limit is made too large we waste a lot of space in memory. If it is too small we may run out of space for pushing items during the execution of a program.

How can we develop an array-based stack as a Java class? As far as private data members are concerned we will need to have both the top of stack index and an array of data items. This leads us to the outline of the class shown in the following class diagram (we will call our class ArrayStack to differentiate it from the one we will develop later using a linked list):

ArrayStack

data, topIndex

push, pop, top, isEmpty

The Stack Interface

We will develop this class so that it conforms to the following generic interface describing the functionality required for all stack implementations.

public interface Stack<T>

{ public void push (T item); // Push the new item onto a stack

public T pop (); // Pop item off top of stack

public T top (); // Return a copy of top item

public boolean isEmpty (); // Return TRUE if no items on stack

} // interface Stack

Note the use of generics in Java to specify that a stack may work with any type of object.

Data Members and Constructors

The ArrayStack class itself is then as follows:

public class ArrayStack<T> implements Stack<T>

{ private T[] data; // Array of data

private int topIndex; // Position of top element

public ArrayStack (int initSize) // Constructor

{ data = (T[])new Object[initSize];

topIndex = -1;

} // Constructor

public ArrayStack () // Constructor

{ this(100);

} // Constructor

...

} // class ArrayStack

There is one small issue here to do with the use of the generic features in Java 5.0. For various reasons, which we won’t go into here (see [2] for details), we cannot create an array using a generic type parameter, such as T in this case. This is relatively easy to work around, as we are allowed to create an array of Object and type-cast this, as we see in the first constructor in the code above. Otherwise, note how similar this is to the equivalent parts of the IntegerVector class.

Operations

Moving on to the more interesting parts of this class, how do we implement the push and pop operations? This is quite straightforward in fact, and only takes a few lines of Java:

public void push (T item)

// Push the new item onto an ArrayStack

{ if (topIndex >= data.length-1)

throw new NoSpaceAvailableException();

data[++topIndex] = item;

} // push

public T pop () // Pop item off top of stack

{ if (topIndex < 0)

throw new EmptyException("stack is empty");

return data[topIndex--];

} // pop

Note that the use of the prefix ++ operator in the push method and the postfix -- operator in the pop method is very important. Why? Notice the checks on the preconditions here.
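One way to explore this question is to try the wrong operator and see what happens. The sketch below is an experiment on a plain char array with hypothetical method names, not part of the ArrayStack class. It compares a push written with prefix ++ against one written with postfix ++, starting from an empty stack where topIndex is −1:

```java
class PrefixPostfixDemo {
    // Push using prefix ++: the index is incremented first, then used
    static int pushWithPrefix() {
        char[] data = new char[4];
        int topIndex = -1;
        data[++topIndex] = 'G';      // stores into data[0]
        return topIndex;
    }

    // Push using postfix ++: the old index (-1) is used as the subscript
    static String pushWithPostfix() {
        char[] data = new char[4];
        int topIndex = -1;
        try {
            data[topIndex++] = 'G';  // attempts to store into data[-1]
            return "stored";
        } catch (ArrayIndexOutOfBoundsException e) {
            return "index -1 used";
        }
    }

    public static void main(String[] args) {
        System.out.println("prefix: topIndex = " + pushWithPrefix());
        System.out.println("postfix: " + pushWithPostfix());
    }
}
```

The same reasoning applies in reverse to pop, where the item must be read at the current index before the index is decreased.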

While these operations are all we need in order to use stacks, they are very minimal. Two more simple operations provide us with greatly improved functionality. These methods allow us to examine the data item on the top of the stack without removing it from the stack, and to tell whether or not the stack is empty.

public T top () // Return a copy of top item

{ if (topIndex < 0)

throw new EmptyException("stack is empty");

return data[topIndex];

} // top

public boolean isEmpty ()

// Return TRUE if no items on stack

{ return topIndex < 0;

} // isEmpty

Note just how little the top method differs from the pop method. This completes our array-based stack implementation. The complete listing of this class (ArrayStack.java) can be found in Appendix A. We will consider the use of this data structure in client programs a little later.

Exercise 5.1 Write a method to delete all the items on the stack (this is very easy to do).

Exercise 5.2 Write a toString() method that can be used to print or display the contents of a stack. In what order should the contents be displayed?

Exercise 5.3 Exercise 4.9 (see p. 41) gives a suggestion for dealing with the space constraints discussed above. Implement these ideas for the ArrayStack class.

5.2.2 A Linked List Stack Implementation

To implement a stack as a linked list we will need to be able to add and remove items from one end of the list. This turns out to be very easy to do (a stack is probably the simplest form of linked list to implement, in fact). We will need a pointer to the top item on the stack. When the stack is empty this will be null.

To add an element to the stack we need to perform the following sequence of steps:

• Set the “next” pointer of the new node to point to the current top of stack

• Set the top of stack to point to the new node

That’s all there is to it. Let’s add the value ’G’ to our stack. After the first of these two steps we get:

After the second step we get:

We can redraw this as follows:

If we now add the character ’e’ to our stack we will get the following picture after the first step:

After the second step we get:

which we can redraw as:

By the time we have pushed all of the letters in the name “George” onto our stack we will have the following picture:

So, pushing items onto the linked list stack is very simple. What about popping items off the stack? Well, this turns out to be just as easy. As far as the pointers are concerned all that needs to be done is to set the “top” pointer to the “next” field of the current top item. This would give:

As long as we still have a reference to the old first element we can retrieve its data (the letter ’e’ in this case) and then return it, giving:

Let’s have a look at the Java class we will need to implement this form of the stack data structure.

The class diagram is shown below. We will call the class ListStack to distinguish it from the array implementation of the previous section, but will implement exactly the same interface.

ListStack

topNode

push, pop, top, isEmpty

Data Members

As far as the data members are concerned, we need the following:

public class ListStack<T> implements Stack<T>

{ private class StackNode

{ public T data;

public StackNode next;

} // class StackNode

private StackNode topNode; // Top StackNode in the stack

. . .

} // class ListStack

Just as we did for the general lists of the last chapter we have again used an inner class to describe the nodes of the linked list with data and next fields (see the class diagram below). The only other private information is the top of stack pointer, topNode.

StackNode

data, next

Constructor

The constructor for this class is very simple. All we need to do is ensure that the top of stack pointer is correctly initialised (although this is not strictly necessary in Java):

public ListStack () // Constructor

{ topNode = null; }

Operations

Considering the push operation first, we need to create a new node and link it into the list in the way described previously. This gives the following code:

1 public void push (T item)

2 { StackNode node = new StackNode();

3 node.data = item;

4 node.next = topNode;

5 topNode = node;

6 } // push

We start by creating a new node (line 2), and then setting its data field to the value of the item we are pushing onto the stack (line 3). Lines 4 and 5 then do the necessary juggling of the pointers: setting the next field of the new node to point to the top of the stack, and then resetting the top of the stack to point to the new node. Note that the order of these two operations is very important. What would happen if we reversed them?
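A small experiment can help in answering this question. The sketch below uses a hypothetical, stand-alone Node class holding a char (not the StackNode inner class itself) and performs the two pointer assignments in both orders on a stack that already holds one node:

```java
class PushOrderDemo {
    // Hypothetical stand-alone node, mirroring the data and next fields
    static class Node {
        char data;
        Node next;
    }

    // Counts the nodes reachable from start, giving up after limit steps
    static int reachable(Node start, int limit) {
        int n = 0;
        for (Node curr = start; curr != null && n < limit; curr = curr.next)
            n++;
        return n;
    }

    // The order used in the push method: link the new node, then move the top
    static int correctOrder() {
        Node topNode = new Node();
        topNode.data = 'G';
        Node node = new Node();
        node.data = 'e';
        node.next = topNode;   // new node points at the old top
        topNode = node;        // top now refers to the new node
        return reachable(topNode, 100);
    }

    // The reversed order: the reference to the old top is overwritten first
    static boolean reversedOrderLoops() {
        Node topNode = new Node();
        topNode.data = 'G';
        Node node = new Node();
        node.data = 'e';
        topNode = node;        // the old top is no longer referenced
        node.next = topNode;   // the new node now points at itself
        return topNode.next == topNode;
    }

    public static void main(String[] args) {
        System.out.println("correct order, nodes reachable: " + correctOrder());
        System.out.println("reversed order, node points to itself: " + reversedOrderLoops());
    }
}
```

In the reversed version the node holding ’G’ can no longer be reached, and any loop that follows the next fields would never terminate.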

The pop operation is just as simple:

1 public T pop ()

2 { if (topNode == null)

3 throw new EmptyException("stack is empty");

4 T tmpData = topNode.data;

5 topNode = topNode.next;

6 return tmpData;

7 } // pop

Note the check in line 2 to ensure that the necessary precondition (that the stack is not empty) is met. We need to use a temporary variable here. In line 4 we use tmpData to store the data field of the node we are about to remove from the list. This is done so that we can return this value at the end of the method after we have reset the topNode pointer. Line 5 resets the top of stack pointer, effectively removing the node from the top of the stack (and making it eligible for garbage collection). We can still access the data that it contained using the tmpData variable, of course, and this is used in line 6 to return the data value.

The top method is a lot simpler. Since we do not need to delete the node in this case, we are spared the necessity of using a temporary reference to the data field.

public T top () // Return copy of top item

{ if (topNode == null)

throw new EmptyException("stack is empty");

return topNode.data;

} // top

The last method to consider is the isEmpty method, which allows us to check the state of the stack. This is very simple (indeed we have had if statements checking this precondition in the pop and top methods already).

public boolean isEmpty ()

// Return TRUE if no items on stack

{ return topNode == null;

} // isEmpty

And that is essentially all there is to the linked list implementation of a stack. As usual this implementation has the advantage that space is allocated as it is needed. This has two implications. Firstly, no space is wasted: nodes are allocated only for items that are currently on the stack. Secondly, the only limit on the size of the stack is the total amount of memory available. The complete listing of this class (ListStack.java) can be found in Appendix A of the notes.

Exercise 5.4 Write a toString method for this class.

5.2.3 A Simple Example of the Use of a Stack

Note: This entire section deals with the client view, and will not be highlighted.

As was mentioned in the introduction to this chapter, stacks have many uses in computer programs. In this section we will have a look at a very simple application which uses a stack. This is the problem of reversing a string, which we solved recursively in chapter three (see p. 17). The “last in, first out” nature of a stack is exactly what we need in order to reverse a string. We work through the string character by character, pushing each one onto a stack. When we reach the end we can pop each one off the stack and write it out. The code for this is very easy:

import java.io.*;

import cs2.ListStack; // Or ArrayStack

public class TestStack

{

public static void main (String args[]) throws IOException

{ ListStack<Character> st = new ListStack<Character>(); // Or ArrayStack

char ch = (char)System.in.read();

while (ch != '\n')

{ st.push(ch);

ch = (char)System.in.read();

}

System.out.print("Backwards: ");

while (! st.isEmpty())

System.out.print(st.pop());

System.out.println();

} // main

} // class TestStack

Notice how we can use either of our stack implementations interchangeably, simply by importing the appropriate class and using the corresponding class name (as shown in the comments in the program above). This is due to the fact that we kept to the same interface for both classes. Indeed, if we had chosen to do so, we could have initially developed the array implementation and then replaced it (using the same file and class names) with the linked list implementation. In this case the program above would require no changes at all. Furthermore we could have used the interface name (i.e. Stack) as the type for the variable st, giving us even more flexibility. This is one of the major advantages of the information hiding provided by Java classes (and similar features in other languages, such as MODULEs in Modula-2). Client programs, such as this example, can be highly immune to changes in the implementation of facilities which they use.
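The idea of using the interface name as the variable type can be sketched as follows. To keep the example self-contained, it declares a cut-down Stack interface and two hypothetical implementations (ArrayBacked and ListBacked, built on java.util classes) in place of the ArrayStack and ListStack classes of this chapter. The client method reverse mentions only the interface:

```java
import java.util.ArrayList;
import java.util.LinkedList;

class InterfaceClientDemo {
    // Cut-down version of the Stack interface, for illustration only
    interface Stack<T> {
        void push(T item);
        T pop();
        boolean isEmpty();
    }

    // Hypothetical array-style implementation (stand-in for ArrayStack)
    static class ArrayBacked<T> implements Stack<T> {
        private final ArrayList<T> items = new ArrayList<T>();
        public void push(T item) { items.add(item); }
        public T pop() { return items.remove(items.size() - 1); }
        public boolean isEmpty() { return items.isEmpty(); }
    }

    // Hypothetical linked implementation (stand-in for ListStack)
    static class ListBacked<T> implements Stack<T> {
        private final LinkedList<T> items = new LinkedList<T>();
        public void push(T item) { items.addFirst(item); }
        public T pop() { return items.removeFirst(); }
        public boolean isEmpty() { return items.isEmpty(); }
    }

    // Client code: depends on the interface, not on either implementation
    static String reverse(String text, Stack<Character> st) {
        for (int i = 0; i < text.length(); i++)
            st.push(text.charAt(i));
        StringBuilder out = new StringBuilder();
        while (!st.isEmpty())
            out.append(st.pop());
        return out.toString();
    }

    public static void main(String[] args) {
        Stack<Character> st = new ListBacked<Character>(); // or ArrayBacked
        System.out.println("Backwards: " + reverse("George", st));
    }
}
```

Only the single line that creates the stack names a concrete class, so switching implementations means changing that one line.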

Exercise 5.5 A stack is a useful data structure for checking for the correct use of parentheses (brackets) in programming languages. The way this works is that when an opening (left) bracket is found, it is pushed onto a stack. When a right bracket is found, the top element of the stack is popped and checked to make sure that it matches the right bracket. If a right bracket is found, but the stack is empty then the brackets are mismatched. At the end of the checking process, if there are still elements on the stack, then the brackets are mismatched. Use one of the two stack implementations from this chapter to write a program to check the brackets in a piece of text (e.g. a Java program) in this way.

Exercise 5.6 A useful way of handling arithmetic expressions is to use what is known as Reverse Polish Notation (usually abbreviated to RPN, and also called postfix notation). In this method of calculation the operands are entered first and then the operation is specified. For example, to add two and five you would enter: 2 5 +. It becomes very easy, if a little unnatural, to express complex expressions in this way. For example, (2 + 5) * 10 - 3 becomes 2 5 + 10 * 3 -, and (2 + 5) * (10 - 3) becomes 2 5 + 10 3 - *. Note from these examples how there is no need for parentheses in the RPN form.

A stack can be used to easily evaluate RPN expressions, using the following approach:

read a word

if the word is a number then

push it onto the stack

else if the word is an operator then

pop the operands off the stack

perform the operation

push the result onto the stack

Write a Java program to do this using either of the two stack classes developed in this chapter.

5.3 Queues

As discussed in the introduction, queues are characterised by the fact that items are added at one end and removed from the other. This means that the item which is removed is the one that has been in the queue for the longest time. For this reason, a queue is also commonly referred to as a “first in, first out” (or FIFO) list. This behaviour also leads to a slightly more complicated implementation than was the case for a stack. In this section we will again consider both array and linked list implementations of queues.
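The FIFO behaviour can be seen with a library class before we develop our own. This sketch assumes java.util.ArrayDeque used as a queue (it is not one of the classes developed in these notes) and records the order in which added values are removed:

```java
import java.util.ArrayDeque;

class FifoDemo {
    // Adds the values in the order given and returns them in the order removed
    static int[] removalOrder(int... values) {
        ArrayDeque<Integer> queue = new ArrayDeque<Integer>();
        for (int v : values)
            queue.add(v);                // add at the tail
        int[] out = new int[values.length];
        for (int i = 0; i < out.length; i++)
            out[i] = queue.remove();     // remove from the head
        return out;
    }

    public static void main(String[] args) {
        for (int v : removalOrder(1, 2, 3))
            System.out.print(v + " ");
        System.out.println();
    }
}
```

Here the values come back in the same order in which they were added, in contrast to the stack.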

5.3.1 An Array-based Queue Implementation

A simple implementation of a queue using an array makes use of head and tail indices to keep track of which elements are at the head and the tail of the list. Initially, to represent an empty queue, both indices are set to a value like −1. The diagram below shows an empty queue with space for five elements.

head -1 tail -1

0 1 2 3 4

queue

As items are added to the queue the tail index is incremented, and as they are removed the head index is incremented. The main problem that arises with simple array-based queues is the way in which such a queue tends to “move” through the array. For example, if we add the letters “abc” to the queue we will get:

head 0 tail 2

0 1 2 3 4

queue a b c

If we remove two items (i.e. ‘a’ and ‘b’) from this queue we get:

head 2 tail 2

0 1 2 3 4

queue c

If we attempt to add three more letters (“pqr”) we will get overflow. There is no more space at the tail of the queue, although there are still unused elements in the array.

head 2 tail 5

0 1 2 3 4

queue c p q r

This is known as the travelling queue problem. There are a number of solutions that can be developed to get around this. A simple one is to move up the data in the array when overflow becomes a problem. This approach would leave us with the following situation:

head 0 tail 3

0 1 2 3 4

queue c p q r
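The moving-up step could be sketched as below, using a plain char array and the standard System.arraycopy method. The class and method names (ShiftingQueueDemo, shiftDown) are hypothetical, and the sketch reproduces the five-element example from the diagrams rather than a complete queue class:

```java
class ShiftingQueueDemo {
    // Moves the occupied elements data[hd..tl] to the start of the array.
    // Returns the new tail index; the new head index is 0.
    static int shiftDown(char[] data, int hd, int tl) {
        int count = tl - hd + 1;                     // number of items in the queue
        System.arraycopy(data, hd, data, 0, count);  // handles overlapping ranges
        return count - 1;
    }

    // Replays the example: only 'c' remains, then "pqr" is added
    static String demo() {
        char[] data = { 'a', 'b', 'c', '?', '?' };
        int hd = 2, tl = 2;
        tl = shiftDown(data, hd, tl);
        hd = 0;
        for (char ch : "pqr".toCharArray())
            data[++tl] = ch;
        return new String(data, hd, tl - hd + 1) + " head=" + hd + " tail=" + tl;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Every item still in the queue is copied each time the data is moved, so the cost of a move grows with the length of the queue.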

While this solves the travelling queue problem it is potentially very inefficient, particularly if the queue is long and always close to full. A better solution is to make use of a circular queue. In this situation the array conceptually loops back on itself, so that the result of the operation above would be the following:

head 2 tail 0

0 1 2 3 4

queue r c p q

Removing an item from the queue would give us ’c’, and the following situation:

head 3 tail 0

0 1 2 3 4

queue r p q

Adding another item (say ’z’) would give:

head 3 tail 1

0 1 2 3 4

queue r z p q

And so we could continue. The maximum size of the queue is still fixed (at five in this example) but the problem of overflowing the queue when there are still vacant slots is prevented. Let’s look at how we could implement this as a Java class.

The Queue Interface

Again, we will use a common interface for both of the queue implementations that we will develop, and will make use of the generic features in Java 5.0.

public interface Queue<T>

{

public void add (T item); // Add item to the end of queue

public T remove (); // Remove item from head of queue

public T head (); // Return a copy of item at head of queue

public boolean isEmpty (); // Return TRUE if no items in queue

} // interface Queue

Data Members

The class diagram for our queue class is as follows:

ArrayQueue

data, hd, tl

add, remove, head, isEmpty

The data members are very similar to those that we had for the stack we considered previously. The only difference is that, instead of keeping track of just the top of the stack, we now need to keep track of both the head and the tail:

public class ArrayQueue<T> implements Queue<T>

{ private T[] data; // Pointer to array of data

private int hd, tl; // Position of head and tail elements

. . .

} // class ArrayQueue

Constructor

The constructor is also very similar to that for the ArrayStack class:

public ArrayQueue (int initSize) // Constructor

{ data = (T[])new Object[initSize];

hd = tl = -1;

} // Constructor

Operations

Adding an item to the queue involves incrementing the tail index and inserting the new element. However, there are several points which need to be handled carefully. Firstly, we need to ensure the “wrap around” behaviour required for the circular queue. This can be done as follows:

tl = tl + 1;

if (tl >= data.length)

tl = 0;

An alternative way that is often seen is to use the “mod” operator (% in Java) as follows:

tl = (tl + 1) % data.length;

This has exactly the same effect, but is a little shorter.
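The equivalence of the two forms is easy to check for a five-element array. The sketch below wraps each form in a hypothetical helper method (wrapWithIf and wrapWithMod) so that they can be compared for every tail value that can occur, including the initial value of −1:

```java
class WrapAroundDemo {
    // Wraparound using an explicit test
    static int wrapWithIf(int tl, int length) {
        tl = tl + 1;
        if (tl >= length)
            tl = 0;
        return tl;
    }

    // Wraparound using the remainder operator
    static int wrapWithMod(int tl, int length) {
        return (tl + 1) % length;
    }

    public static void main(String[] args) {
        int length = 5;
        for (int tl = -1; tl < length; tl++)
            System.out.println(tl + " -> " + wrapWithIf(tl, length)
                               + " and " + wrapWithMod(tl, length));
    }
}
```

The two agree here because tl + 1 is never negative. For negative operands the % operator in Java can give a negative result, so the same shortcut would need more care if an index were being decremented.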

The second point we need to watch out for is running out of space in the array. This can be detected if the tail index “catches up with” the head index (consider what would happen in the simple example above if we kept adding characters to the queue). The last point of caution is to ensure that the head index is also updated when we add the first element to the queue. With these facts in mind, let’s have a look at the implementation of the add method:

1 public void add (T item) // Add item to end of queue

2 { tl = (tl + 1);

3 if (tl >= data.length)

4 tl = 0; // wraparound

5 if (tl == hd) // Out of space

6 throw new NoSpaceAvailableException("no space available");

7 data[tl] = item;

8 if (hd == -1) // First item in queue

9 hd = tl;

10 } // add

The if statement in line 5 is used to catch the case where the queue becomes full (the tail index catches up with the head index). While this is a precondition, it is difficult to check at the beginning of the method, so we check it only after updating the tl subscript. The if statement in line 8 deals with the necessity of updating the head index when the first element is added to an empty queue.
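One consequence of checking after the update is that tl has already been changed by the time the exception is thrown. A possible variation, offered here as a sketch and not as the version in Appendix A, computes the candidate index in a local variable and only stores it in tl once the check has passed. To stay self-contained it uses int elements and the standard IllegalStateException in place of the generic type and the NoSpaceAvailableException and EmptyException classes of the notes:

```java
class CircularIntQueue {
    private final int[] data;
    private int hd = -1, tl = -1;   // -1 signifies an empty queue

    CircularIntQueue(int size) {
        data = new int[size];
    }

    void add(int item) {
        int next = (tl + 1) % data.length;   // candidate position, with wraparound
        if (next == hd)                      // would catch up with the head: full
            throw new IllegalStateException("no space available");
        tl = next;                           // commit only after the check
        data[tl] = item;
        if (hd == -1)                        // first item in queue
            hd = tl;
    }

    int remove() {
        if (hd == -1)
            throw new IllegalStateException("queue is empty");
        int tmpData = data[hd];
        if (hd == tl)                        // was last element
            hd = tl = -1;
        else
            hd = (hd + 1) % data.length;     // wraparound
        return tmpData;
    }

    boolean isEmpty() {
        return hd == -1;
    }
}
```

With this arrangement a failed add leaves both indices unchanged, so a client that catches the exception can continue to remove items in the usual order.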

The remove method is also fairly simple. The important points here are that we cannot remove an item from an empty queue, and also the way we handle the removal of the last item from a queue.

     1  public T remove () // Remove item from head of queue
     2    { if (hd == -1)
     3        throw new EmptyException("queue is empty");
     4      T tmpData = data[hd];
     5      if (hd == tl) // Was last element
     6        hd = tl = -1;
     7      else
     8        { hd = (hd + 1);
     9          if (hd >= data.length)
    10            hd = 0; // wraparound
    11        }
    12      return tmpData;
    13    } // remove

The if statement in line 2 checks that we are not attempting to remove an item from an empty queue. The first part of the next if statement (line 5) handles the case where the last item is removed, setting both the head and tail indices to −1 to signify that the queue is empty again. Note again the wrap-around behaviour in the else clause (lines 8–10) required for the circular queue to work correctly.
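To see the two indices chase each other around the array, it may help to run a cut-down version of the two methods on a three-element array. This is only a sketch: CircularTrace and indices are hypothetical names, the items are plain characters, the wrap-around is written with %, and IllegalStateException stands in for the two course exceptions.

```java
// Hypothetical trace class; the add and remove bodies follow the listings in the
// notes, with IllegalStateException standing in for the course exceptions.
class CircularTrace {
    private final char[] data = new char[3];
    private int hd = -1, tl = -1;

    void add(char item) {
        tl = (tl + 1) % data.length;          // wraparound
        if (tl == hd)                         // out of space
            throw new IllegalStateException("no space available");
        data[tl] = item;
        if (hd == -1)                         // first item in queue
            hd = tl;
    }

    char remove() {
        if (hd == -1)
            throw new IllegalStateException("queue is empty");
        char tmpData = data[hd];
        if (hd == tl)                         // was last element
            hd = tl = -1;
        else
            hd = (hd + 1) % data.length;      // wraparound
        return tmpData;
    }

    String indices() { return "hd=" + hd + " tl=" + tl; }

    public static void main(String[] args) {
        CircularTrace q = new CircularTrace();
        q.add('a'); q.add('b'); q.add('c');
        System.out.println(q.indices());      // hd=0 tl=2
        System.out.println(q.remove());       // a
        q.add('d');                           // tail wraps round to slot 0
        System.out.println(q.indices());      // hd=1 tl=0
        q.remove(); q.remove(); q.remove();   // b, c, d
        System.out.println(q.indices());      // hd=-1 tl=-1
    }
}
```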

The last two methods allow us to examine the item at the head of the queue and to tell whether the queue is empty. They are both quite simple and similar to their counterparts in the stack implementation. The full implementation of the ArrayQueue class can be found in Appendix A of the notes (the file is ArrayQueue.java).

    public T head () // Return item at head of queue
      { if (hd == -1)
          throw new EmptyException("queue is empty");
        return data[hd];
      } // head

    public boolean isEmpty () // TRUE if no items in queue
      { return hd == -1;
      } // isEmpty

Exercise 5.7 Write a method to remove all the elements of a queue (again, this is very easy to do).

Exercise 5.8 Write a toString method to allow the display of a queue.

Exercise 5.9 Rather than using the circular queue implementation, modify the class to use the alternative approach discussed at the beginning of this section (i.e. moving the data values up when the tail of the queue reaches the end of the array). Is this any easier to implement? Design a program to test the efficiency of the two implementations and measure which is the more efficient for short queues and long queues.

5.3.2 A Linked List Queue Implementation

Just as was the case for the stack data structure, the use of a linked list to implement a queue gets around several of the problems we have encountered with the array implementation. In this case we will need two pointers for the head and tail of the queue, corresponding to the two indices we had in the array implementation. An empty queue would then be represented by:

If we add three items to this queue (the letters “abc”) we will get:


Removing a letter (i.e. ’a’) from this queue will give:

One point to note is that we could do without the tail pointer if we were prepared to follow the linked list from the head every time we wanted to add an item to the tail of the queue. This would obviously be a lot less efficient than the approach illustrated above, since every add would have to visit each node already in the queue.

[Class diagram: ListQueue. Data members: hd, tl. Methods: add, remove, head, isEmpty.]

Data Members

Turning to the implementation of a queue using a linked list, the data members of the new class are as follows:

    public class ListQueue<T> implements Queue<T>
      { private class QueueNode
          { public T data;
            public QueueNode next;
          } // inner class QueueNode

        private QueueNode hd, tl; // Pointers to head and tail elements
        . . .
      } // class ListQueue

[Class diagram: QueueNode. Data members: data, next.]

Constructor

As is often the case for linked list data structures, the constructor we require is very simple:

    public ListQueue () // Constructor
      { hd = tl = null; }


Operations

Turning our attention first to the add method:

     1  public void add (T item) // Add item to end of queue
     2    { QueueNode newNode = new QueueNode();
     3      newNode.data = item;
     4      newNode.next = null;
     5      if (tl != null)
     6        tl.next = newNode;
     7      tl = newNode;
     8      if (hd == null) // First item in queue
     9        hd = tl;
    10    } // add

Most of this should be reasonably familiar, or at least not too difficult by now. The main subtleties here are the different cases which arise when updating the head and tail pointers. The first part of this process (in lines 5 and 6) is used to link the new node into the list of elements, if there are any existing elements. The tail pointer is then updated to point to the new element (in line 7). The last step (lines 8 and 9) is a check to make sure that the head pointer is updated if we have just placed the first item in a queue. Trace through this sequence of steps, say for adding the letter ’d’ to the last example queue above, and for adding a new element to an empty queue.

The remove method is as follows:

    1  public T remove () // Remove item from head of queue
    2    { if (hd == null)
    3        throw new EmptyException("queue is empty");
    4      T tmpData = hd.data;
    5      hd = hd.next;
    6      if (hd == null) // Was last element
    7        tl = null;
    8      return tmpData;
    9    } // remove

As for our stack class, note the need for the temporary data variable (declared and initialised in line 4). The heart of this method is in the central part: on line 5 the head pointer is updated to remove the first element from the list (there is still a reference to this element’s data in the tmpData variable, of course). Lines 6 and 7 then deal with the case where the last element has just been removed from the queue and the tail pointer needs to be set to null to reflect this.
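The behaviour of the two pointers can be observed with a cut-down, stand-alone version of these two methods. This is a sketch for experimentation rather than the ListQueue class itself: LinkedQueueTrace and tailIsNull are hypothetical names, the items are plain characters, and java.util.NoSuchElementException stands in for EmptyException.

```java
import java.util.NoSuchElementException;

// Hypothetical cut-down version of the linked queue, for char items only;
// NoSuchElementException stands in for the course library's EmptyException.
class LinkedQueueTrace {
    private static class QueueNode {
        char data;
        QueueNode next;
    }

    private QueueNode hd, tl;                 // both null for an empty queue

    void add(char item) {
        QueueNode newNode = new QueueNode();
        newNode.data = item;
        if (tl != null)                       // link after the current tail, if any
            tl.next = newNode;
        tl = newNode;
        if (hd == null)                       // first item in queue
            hd = tl;
    }

    char remove() {
        if (hd == null)
            throw new NoSuchElementException("queue is empty");
        char tmpData = hd.data;
        hd = hd.next;
        if (hd == null)                       // was last element
            tl = null;
        return tmpData;
    }

    boolean tailIsNull() { return tl == null; }

    public static void main(String[] args) {
        LinkedQueueTrace q = new LinkedQueueTrace();
        q.add('a'); q.add('b'); q.add('c');
        System.out.println(q.remove());       // a
        System.out.println(q.remove());       // b
        System.out.println(q.tailIsNull());   // false: 'c' is still queued
        System.out.println(q.remove());       // c
        System.out.println(q.tailIsNull());   // true: tail reset with the head
    }
}
```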

Lastly, the head and isEmpty methods are again very simple:

    public T head () // Return item at head of queue
      { if (hd == null)
          throw new EmptyException("queue is empty");
        return hd.data;
      } // head

    public boolean isEmpty () // TRUE if no items in queue
      { return hd == null;
      } // isEmpty

The full listing of this class can be found in the file ListQueue.java in Appendix A of the notes.


Exercise 5.10 Write a method to delete all the items in a queue.

Exercise 5.11 As suggested in the discussion above (p. 74), we do not need the tail pointer if we are prepared to follow the list from the head each time we add an element to the queue. Modify the class to use this approach, and then write a test program to measure the difference in efficiency (repeatedly add and remove elements using both queue implementations, and measure the time taken in each case).

Exercise 5.12 Modify the ListQueue class to create a priority queue class. A priority queue differs from a normal queue in that each item on the queue has an associated integer priority value. When an item is added to the priority queue, it is not necessarily added at the end of the queue, but is placed in its correct position with regard to the priority of the other items in the queue (usually in ascending order, from head to tail). When an item is removed from a priority queue, it is always the item at the head of the queue, i.e. the one with the highest priority.

You will need to change the add method to take a priority value, along with the item being added. You will also need to provide a new method to retrieve the priority value of the item at the head of the queue. Note that, as a result of these changes, the class you develop will no longer conform to the Queue interface.

5.3.3 An Example of the Use of Queues

Note: This entire section deals with the client view, and will not be highlighted.

A number of applications in Computer Science, particularly in the field of artificial intelligence, require searching through all, or many, of the possible solutions to a problem. Queues can be very useful for keeping track of partial solutions that still need to be explored further. In this section we will develop a simple problem-solving program which makes use of a queue in this way. The problem we will tackle is that of finding a path through a maze.

We can represent a maze as a matrix (i.e. a two-dimensional array) of boolean values indicating whether a block is open or closed. An example of such a maze is shown below (for simplicity, the true values are shown as 1’s and the false values as 0’s).

                 0 0 0 0 0 0 0 0 0 0
                 0 1 1 1 1 1 1 1 1 0
                 0 1 0 1 0 0 0 1 0 0
                 0 1 0 1 1 1 0 1 0 0
                 0 0 1 1 0 1 0 1 0 0
                 0 0 1 0 0 1 0 1 0 0
    Entrance →   1 1 1 0 1 1 0 0 1 1   → Exit
                 0 0 0 0 1 0 1 0 1 0
                 0 1 1 1 1 1 1 1 1 0
                 0 0 0 0 0 0 0 0 0 0

The search strategy that we will use makes use of a queue to keep track of the positions that we still have to use as starting points for further unexplored paths through the maze. In addition, we need to keep track of which positions in the maze we have previously explored so that we do not repeat any paths through the maze. This can be handled easily with a second boolean matrix, beenThere. Initially all positions in beenThere will be false. When we visit a position we can set the corresponding element of beenThere to true, indicating that that position has now been explored and can be ignored if we come back to it in the course of traversing the maze. With this in mind we can develop the first part of our program. We will use the linked list implementation of a queue since we cannot be sure how many potential positions will need to be queued at any time. We will store the positions as records with fields for the row and column numbers of a point in the maze. The full listing of the program can be found in Appendix A of the notes (QSearch.java).

    import cs2.*;
    import java.io.*;

    public class QSearch
      { private static final int MAX_COORD = 10; // Size of maze

        private class Position // Coordinates of location in maze
          { public int r, // Row coordinate
                       c; // Column coordinate
          } // inner class Position

        private Queue<Position> posQueue = new ListQueue<Position>();
          // Queue of positions still to be checked
        private boolean[][] maze = new boolean[MAX_COORD][MAX_COORD];
          // Description of maze
        private boolean[][] beenThere = new boolean[MAX_COORD][MAX_COORD];
          // Keep track of previous positions
        ...

The approach we take to tackling the maze is first to find the entrance. To keep the problem simple we will require that there is only one gap in the first column (the entrance) and only one gap in the last column (the exit), and no other gaps in the “outer walls” at all. So finding the entrance is simply a case of going down the first column to find a gap (a location where the maze matrix has a true value). Once this is found we can put it onto the queue as the first position which needs to be explored as part of a potential solution to the problem. At this stage our program looks like this (do not worry about the readMaze method for now; it simply reads in the matrix of values describing the maze):

    public void solveMaze ()
      { int r, c = 0; // Row and column coordinates;

        readMaze("MAZE");

        // Initialise beenThere to all FALSE
        for (r = 0; r < MAX_COORD; r++)
          for (c = 0; c < MAX_COORD; c++)
            beenThere[r][c] = false;

        // Find starting position
        for (r = 0; r < MAX_COORD; r++)
          if (maze[r][0]) // r is starting row
            break;

        // Put starting position on queue
        addPosition(r, 0);
        ...
      } // solveMaze

    public static void main (String[] args)
      { QSearch qs = new QSearch();
        qs.solveMaze();
      } // main

The addPosition method simply creates a new Position object and places it onto the queue:

    public void addPosition (int row, int col)
      // Put a new position on the queue of positions
      { Position p = new Position();
        p.r = row;
        p.c = col;
        posQueue.add(p);
      } // addPosition

We can now picture our main data structures as follows. Note that the subscripts for the maze and beenThere matrices have been shown to clarify the discussion, and that the representation of the queue has been simplified. Note too that this maze is slightly different to the one shown above.

maze

          0 1 2 3 4 5 6 7 8 9
      0   0 0 0 0 0 0 0 0 0 0
      1   0 1 1 1 1 1 1 1 1 0
      2   0 1 0 1 0 0 0 1 0 0
      3   0 1 0 1 1 1 0 1 0 0
      4   0 1 1 1 0 1 0 1 0 0
      5   0 1 0 0 0 1 0 1 0 0
      6   1 1 1 1 1 1 0 0 1 1
      7   0 1 0 0 1 0 1 0 1 0
      8   0 1 1 1 1 1 1 1 1 0
      9   0 0 0 0 0 0 0 0 0 0

beenThere

          0 1 2 3 4 5 6 7 8 9
      0   0 0 0 0 0 0 0 0 0 0
      1   0 0 0 0 0 0 0 0 0 0
      2   0 0 0 0 0 0 0 0 0 0
      3   0 0 0 0 0 0 0 0 0 0
      4   0 0 0 0 0 0 0 0 0 0
      5   0 0 0 0 0 0 0 0 0 0
      6   0 0 0 0 0 0 0 0 0 0
      7   0 0 0 0 0 0 0 0 0 0
      8   0 0 0 0 0 0 0 0 0 0
      9   0 0 0 0 0 0 0 0 0 0


With the initial state set up in this way we can enter the main part of the algorithm. This is a loop that continues until either the exit of the maze is reached or else there are no more entries on the queue still to be examined (in which case we can conclude that there is no path through the maze). Inside the loop we remove the next entry from the queue of unexplored positions. We mark the beenThere matrix to record the fact that we have now explored this position and then try all possible moves from that point. In outline we have the following:

    while (! posQueue.isEmpty())
      { Position nextPos;

        // Remove next position from queue and try all
        // possible moves
        nextPos = posQueue.remove();
        c = nextPos.c;
        r = nextPos.r;
        beenThere[r][c] = true; // Note that we have visited this spot
        System.out.println("Visiting position: " + r + ", " + c);
        if (c == MAX_COORD-1) // Found exit, so leave search loop
          break;

        // Try all possible moves from this position
        . . .
      } // while

The only changes to our data structures at this point are that the queue is empty and position 6,0 in the beenThere matrix is now set to true (1 in the diagram). How do we handle trying all possible moves from this position? We need to try the four possible directions which we can move in (up, down, left and right). For each of these we need to check: (1) that there is no wall in the maze at that position, and (2) that we have not yet visited that position in the maze. If both of these conditions are met then we can add the new position to the queue as a position that we still need to explore. This comes out as the following section of code (replacing the comment // Try all possible moves from this position in the code shown above).

    // Try to move up
    if (maze[r-1][c] && ! beenThere[r-1][c])
      addPosition(r-1, c);

    // Try to move right
    if (maze[r][c+1] && ! beenThere[r][c+1])
      addPosition(r, c+1);

    // Try to move down
    if (maze[r+1][c] && ! beenThere[r+1][c])
      addPosition(r+1, c);

    // Try to move left
    if (c > 0 && maze[r][c-1] && ! beenThere[r][c-1])
      addPosition(r, c-1);

For our current position (6,0) we cannot move up, down or left (note the need to check that c > 0 when trying to go left, so that we do not attempt to walk back out the entrance!), only right. After this our data structures will look as follows:

maze

          0 1 2 3 4 5 6 7 8 9
      0   0 0 0 0 0 0 0 0 0 0
      1   0 1 1 1 1 1 1 1 1 0
      2   0 1 0 1 0 0 0 1 0 0
      3   0 1 0 1 1 1 0 1 0 0
      4   0 1 1 1 0 1 0 1 0 0
      5   0 1 0 0 0 1 0 1 0 0
      6   1 1 1 1 1 1 0 0 1 1
      7   0 1 0 0 1 0 1 0 1 0
      8   0 1 1 1 1 1 1 1 1 0
      9   0 0 0 0 0 0 0 0 0 0

beenThere

          0 1 2 3 4 5 6 7 8 9
      0   0 0 0 0 0 0 0 0 0 0
      1   0 0 0 0 0 0 0 0 0 0
      2   0 0 0 0 0 0 0 0 0 0
      3   0 0 0 0 0 0 0 0 0 0
      4   0 0 0 0 0 0 0 0 0 0
      5   0 0 0 0 0 0 0 0 0 0
      6   1 0 0 0 0 0 0 0 0 0
      7   0 0 0 0 0 0 0 0 0 0
      8   0 0 0 0 0 0 0 0 0 0
      9   0 0 0 0 0 0 0 0 0 0

At this point we go around the while loop again. This time we remove the position 6,1 from the queue and mark it as visited in the beenThere matrix. Now when we try all possible moves we find that we can move up (to 5,1), right (to 6,2) and down (to 7,1). We are prevented from moving left (to 6,0) although this position is open in the maze because it has been marked in the beenThere matrix. Our data structures now look as follows (some of the rows have been edited out to save space):


maze

          0 1 2 3 4 5 6 7 8 9
      0   0 0 0 0 0 0 0 0 0 0
          · · ·
      5   0 1 0 0 0 1 0 1 0 0
      6   1 1 1 1 1 1 0 0 1 1
      7   0 1 0 0 1 0 1 0 1 0
      8   0 1 1 1 1 1 1 1 1 0
      9   0 0 0 0 0 0 0 0 0 0

beenThere

          0 1 2 3 4 5 6 7 8 9
      0   0 0 0 0 0 0 0 0 0 0
          · · ·
      5   0 0 0 0 0 0 0 0 0 0
      6   1 1 0 0 0 0 0 0 0 0
      7   0 0 0 0 0 0 0 0 0 0
      8   0 0 0 0 0 0 0 0 0 0
      9   0 0 0 0 0 0 0 0 0 0

On the next iteration of the while loop we remove 5,1 from the queue and will explore all possible moves from the position (only to 4,1 in this case) adding them to the end of the queue. On the next iteration we remove position 6,2 and explore the moves from there (only to 6,3) adding them to the queue. The next iteration gives position 7,1 and we add 8,1 to the queue as the only possible move from there. In this way the algorithm continues until such time as we run out of positions to explore or we reach the exit.

Let’s consider how this algorithm works. At the stage shown diagrammatically above, the queue holds all the positions that can be reached in two steps from the entrance. As we explore these positions we add to the end of the queue all the positions which can be reached in three steps from the entrance. Exploring these will lead to all positions four steps from the entrance being added to the end of the queue. In this way the search continues dealing with all paths through the maze of a particular length before moving onto the next length. This means that we are guaranteed to find the shortest possible path through the maze.
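The level-by-level behaviour can be made visible by recording how many steps were needed to reach each position. The following stand-alone sketch does this; it is not the QSearch program. It assumes java.util.ArrayDeque in place of ListQueue, stores a step count where QSearch stores a boolean in beenThere, and marks a position when it is added to the queue rather than when it is removed. BfsSteps and shortestSteps are hypothetical names, and the maze is given as strings of 0’s and 1’s purely for convenience.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// A sketch only: BfsSteps and shortestSteps are hypothetical names, and
// java.util.ArrayDeque stands in for the ListQueue class of the notes.
class BfsSteps {
    // Returns the number of moves on a shortest path from the gap in the
    // first column to the last column, or -1 if there is no such path.
    static int shortestSteps(String[] rows) {
        int height = rows.length, width = rows[0].length();
        int[][] steps = new int[height][width];      // 0 means "not yet reached"
        int start = 0;
        while (start < height && rows[start].charAt(0) != '1')
            start++;                                 // go down the first column
        if (start == height)
            return -1;                               // no entrance at all
        Queue<int[]> queue = new ArrayDeque<>();
        steps[start][0] = 1;                         // stored as steps + 1
        queue.add(new int[] {start, 0});
        int[][] moves = {{-1, 0}, {0, 1}, {1, 0}, {0, -1}};  // up, right, down, left
        while (!queue.isEmpty()) {
            int[] pos = queue.remove();
            int r = pos[0], c = pos[1];
            if (c == width - 1)                      // found exit
                return steps[r][c] - 1;
            for (int[] m : moves) {
                int nr = r + m[0], nc = c + m[1];
                if (nr < 0 || nr >= height || nc < 0 || nc >= width)
                    continue;                        // off the edge of the maze
                if (rows[nr].charAt(nc) == '1' && steps[nr][nc] == 0) {
                    steps[nr][nc] = steps[r][c] + 1; // one step further than r,c
                    queue.add(new int[] {nr, nc});
                }
            }
        }
        return -1;                                   // queue empty: no path
    }

    public static void main(String[] args) {
        String[] maze = {                            // the second maze in the text
            "0000000000", "0111111110", "0101000100", "0101110100", "0111010100",
            "0100010100", "1111110011", "0100101010", "0111111110", "0000000000"
        };
        System.out.println("Moves to exit: " + shortestSteps(maze));
    }
}
```

Because positions leave the queue in the order in which they were added, the step counts taken from the queue never decrease, so the first time the exit column is removed its count is the smallest possible.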

The technique that we have used here is referred to as a breadth-first search. This comes from viewing the possible solutions to the problem as a tree. For example, we can picture the tree for the point up to which we traced the execution of the program above as shown in Figure 5.1. At each stage of the search we try all the possibilities at one level of the tree before moving onto the next level.

Of course, this is not the only strategy we could use for the search. The queue in this problem is being used to remember positions that need to be explored further. Any data structure can be used to hold these positions. If we chose a stack rather than a queue what would be the effect? In fact, this leads us to another important class of problem solving algorithms: the depth-first search.

Figure 5.1: A Tree View of the Breadth-first Search Strategy.

If we push each new position onto a stack then the next position we retrieve will be the last one we put on the stack. For our maze-solving algorithm, after the stage of pushing (5,1), (6,2) and (7,1) we would start the next stage of exploration at (7,1). This would lead to (8,1) being pushed onto the stack and this would then become the next position to be searched. In fact we will explore the entire subtree leading from position (7,1) before we come back to looking at (6,2) or (5,1). This is what is meant by “depth-first”: we explore the full depth of one branch of the tree before we consider any others. This may provide a solution quicker than a breadth-first search. However, in general, the solution found may not be the optimal one.
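The difference in retrieval order is easy to demonstrate in isolation. In this small sketch the three positions from the text are stored as strings in a java.util.ArrayDeque, which is used here (as an assumption, in place of the course’s own stack and queue classes) because it can be filled from either end. FrontierOrder and drain are hypothetical names.

```java
import java.util.ArrayDeque;

// Illustrative only: FrontierOrder is a hypothetical class, and ArrayDeque
// stands in for the stack and queue classes developed in the notes.
class FrontierOrder {
    // Stores the three positions found from (6,1), then removes them all,
    // treating the collection either as a stack or as a queue.
    static String drain(boolean asStack) {
        ArrayDeque<String> frontier = new ArrayDeque<>();
        String[] found = {"(5,1)", "(6,2)", "(7,1)"};
        for (String p : found) {
            if (asStack)
                frontier.addFirst(p);         // push: newest item at the front
            else
                frontier.addLast(p);          // enqueue: newest item at the back
        }
        StringBuilder order = new StringBuilder();
        while (!frontier.isEmpty())
            order.append(frontier.removeFirst()).append(' ');
        return order.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println("queue: " + drain(false));   // (5,1) (6,2) (7,1)
        System.out.println("stack: " + drain(true));    // (7,1) (6,2) (5,1)
    }
}
```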

Exercise 5.13 Change the maze-solving program to use a stack rather than a queue, and see what effect this has.

Exercise 5.14 Change the maze-solving program to use a priority queue (see Exercise 5.12) rather than a normal queue to provide a prioritised search. If the horizontal distance to the right-hand outer wall is used as the priority value this will prefer routes that are making progress towards the right. Make the necessary changes to the maze-searching program and see what effect this has.


Exercise 5.15 It would be useful if our program reported on the route found to reach the exit of the maze. One way of doing this is to change the beenThere matrix. Instead of simply storing a true/false value, we can use it to hold the coordinates of the position that led us to each point in the path. We can then follow the path back from the exit to work out how we got there. Make this change to the program and get it to print out the route at the end. Note that the beenThere matrix will still need to be initialised to some default value to show that no positions have been visited as the program starts. For the first, simple maze the data structures would look like this at the end:

maze

          0 1 2 3 4 5 6 7 8 9
      0   0 0 0 0 0 0 0 0 0 0
      1   0 1 1 1 1 1 1 1 1 0
      2   0 1 0 1 0 0 0 1 0 0
      3   0 1 0 1 1 1 0 1 0 0
      4   0 0 1 1 0 1 0 1 0 0
      5   0 0 1 0 0 1 0 1 0 0
      6   1 1 1 0 1 1 0 0 1 1
      7   0 0 0 0 1 0 1 0 1 0
      8   0 1 1 1 1 1 1 1 1 0
      9   0 0 0 0 0 0 0 0 0 0

beenThere

          0    1    2    3    4    5    6    7    8    9
      0   0,0  0,0  0,0  0,0  0,0  0,0  0,0  0,0  0,0  0,0
      1   0,0  1,2  1,3  2,3  1,3  1,4  1,5  1,6  1,7  0,0
      2   0,0  1,1  0,0  3,3  0,0  0,0  0,0  1,7  0,0  0,0
      3   0,0  2,1  0,0  4,3  3,3  3,4  0,0  2,7  0,0  0,0
      4   0,0  0,0  5,2  4,2  0,0  3,5  0,0  3,7  0,0  0,0
      5   0,0  0,0  6,2  0,0  0,0  4,5  0,0  4,7  0,0  0,0
      6   -,-  6,0  6,1  0,0  6,5  5,5  0,0  0,0  7,8  6,8
      7   0,0  0,0  0,0  0,0  6,4  0,0  8,6  0,0  8,8  0,0
      8   0,0  8,2  8,3  8,4  7,4  8,4  8,5  8,6  8,7  0,0
      9   0,0  0,0  0,0  0,0  0,0  0,0  0,0  0,0  0,0  0,0


Exercise 5.16 Another classic problem solving situation involves a farmer, a goat, a lion and a cabbage! The farmer finds himself on one bank of a river with a goat, a (semi-domesticated!) lion and a cabbage that he needs to get across the river. Unfortunately his only means of transport is a small canoe which can only hold the farmer and one other item at any time [a]. The problem is that the goat will eat the cabbage if the farmer is not on the same bank, and likewise the lion will eat the goat if the farmer is not there. How can he get his cargo safely to the far bank? Obviously there are a number of different states (combinations of the four items on each bank) and some of them are “safe” and some not. We can try to solve the farmer’s problem by starting with the state where all the items are on the first bank. We can then try all possible moves from this position and queue the “safe” ones for further exploration. This process continues until all the items are on the second bank or no further possibilities exist. Devise a notation (a data structure) for representing the states in this problem, and then use the general outline of our maze-solving algorithm to find a solution to the farmer’s problem. Alter your program to use a depth-first search (i.e. based on a stack rather than a queue) and see what effect this has on the solution found.

[a] It is a very large cabbage!

5.4 Deques, Circular Lists and Header Nodes

We have now seen stacks in which items are added and removed from only one end, and queues to which items are added at one end and removed from the other. A more general data structure is the double-ended queue, or deque. A deque permits items to be added to, and removed from, either end. For this reason we cannot distinguish a head or a tail, and so we simply refer to the left and right ends of the deque. Furthermore, rather than having just one add operation and one removal operation, we now need pairs of these operations: addLeft and addRight, and removeLeft and removeRight. Because of the extra complexity of a deque we need to think carefully about its implementation.

5.4.1 Implementation Techniques for Deques

As with any of the abstract data types we have looked at so far, we can implement deques using either arrays or linked lists. An array implementation would need to use a form of circular array, as the ArrayQueue ADT did in order to deal with the travelling queue problem (see p. 70). However, we will not consider an array implementation at this time, but will turn immediately to a linked list implementation, which illustrates a number of new techniques for linked lists.

All of the linked lists we have considered up until now have been simple, singly-linked lists. By this we mean that each node has had only one pointer field (usually called next) pointing to the next node in the list. However, in the case of a deque we may be working through the list from either end. In this case a single pointer is not enough. We need to use a doubly-linked list.

Doubly-Linked Lists

A doubly-linked list is made up of nodes with a data field as usual, but with two pointer fields: one pointing to the left neighbour and one to the right neighbour. In this case, a list containing the letters “abc” would look like the following:


This form of list allows us to add or remove nodes from either end relatively easily.

Exercise 5.17 Develop algorithms to add and remove elements at either end of a deque implemented as a doubly-linked list.

Circularly-Linked Lists

With this kind of structure we can go one step further and turn it into a circularly-linked list. Here we take the null pointers at either end of the list and use them to join up with the other end of the list. If we choose this approach the picture above changes to this:

Note that with this style of implementation one has to be very careful not to end up in infinite loops when traversing a list. As there are no longer any null pointers to check for, we need to construct the termination conditions of such loops along the following lines:

    if (left != null)
      { curr = left;
        do
          { // Handle the current node
            . . .
            curr = curr.rt;
          } while (curr != left);
      }
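To try this loop out we need a circular list to traverse. The following sketch builds one directly from a string of letters (by indexing into an array of nodes, which is a convenience for the demonstration and not how a deque would add items) and then walks it with the do-while pattern. CircularTraversal, ring and contents are hypothetical names.

```java
// Illustrative only: CircularTraversal, ring and contents are hypothetical names.
class CircularTraversal {
    static class Node {
        char data;
        Node lt, rt;
        Node(char data) { this.data = data; }
    }

    // Builds a circular doubly-linked list holding the given letters and
    // returns its left-most node (null for an empty string).
    static Node ring(String letters) {
        int n = letters.length();
        if (n == 0)
            return null;
        Node[] nodes = new Node[n];
        for (int i = 0; i < n; i++)
            nodes[i] = new Node(letters.charAt(i));
        for (int i = 0; i < n; i++) {
            nodes[i].rt = nodes[(i + 1) % n];        // right neighbour, wrapping round
            nodes[i].lt = nodes[(i + n - 1) % n];    // left neighbour, wrapping round
        }
        return nodes[0];
    }

    // Visits every node exactly once, from left to right.
    static String contents(Node left) {
        StringBuilder sb = new StringBuilder();
        if (left != null) {
            Node curr = left;
            do {
                sb.append(curr.data);                // handle the current node
                curr = curr.rt;
            } while (curr != left);                  // stop when we are back at the start
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node left = ring("abc");
        System.out.println(contents(left));          // abc
        System.out.println(left.lt.data);            // c: one step left of the left end
    }
}
```

A do-while loop is used rather than a while loop because the starting node must be handled before the test curr != left can be meaningful.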

There is one other optimisation we can make to the deque data structure we have here. Note that we can now find the right end of the deque by following just one pointer from the left end (the left field of the left-most node points to the right-most node). This means that we can remove the right pointer completely and represent our deque as shown here:


Exercise 5.18 Modify your algorithms from Exercise 5.17 to take this new structure into account.

Lists with Header Nodes

The last new idea we need to introduce in this section is that of a list-head node, or just a header node as it is sometimes called. In many of the algorithms we have studied so far there have been special cases to deal with adding the first element to a list, or removing the last element from a list. These need not arise, if the list is never empty! We can ensure this by artificially inserting an unused node into the list. This is what we call the list-head node (or header node). Of course, our algorithms that manipulate the data in such lists must be extremely careful not to access the header node as if it contained valid data. The doubly- and circularly-linked deque shown above would look like this with a header node:

An empty deque would simply have the header node linked back to itself. The situation would look like this:


5.4.2 A Java Class for Deques

So then, how can we implement a deque ADT using these kinds of advanced linked-list techniques? Let’s look at the data structures and algorithms that we will need. The class diagram for the main class is shown, but we have not shown the (very simple) class diagram for the inner DequeNode class.

[Class diagram: Deque. Data member: header. Methods: addLeft, addRight, removeLeft, removeRight, rightHead, leftHead, isEmpty.]

Data Members

The data members are quite simple, and would be the same for any doubly-linked data structure whether it was circularly-linked or not, and whether it had a header node or not:

    public class Deque<T>
      { private class DequeNode
          { public T data;
            public DequeNode lt, // Pointer to left neighbour
                             rt; // Pointer to right neighbour
          } // class DequeNode

        private DequeNode header; // Pointer to header node
        . . .
      } // class Deque

Constructor

Turning to the algorithms, the constructor for the class will need to set up the header node. Note how this is linked back to itself in the way illustrated in the previous diagram.

    public Deque () // Constructor
      { header = new DequeNode(); // Create header node
        header.lt = header;
        header.rt = header;
      } // Constructor

Operations

Considering first the operation of adding an item to the deque, we will need two versions of this: one to add items to the left end of the deque and one to add items to the right end. We will just consider the addLeft method here; the other is very similar:

    1  public void addLeft (T item) // Add item to left end
    2    { DequeNode newNode = new DequeNode();
    3      newNode.data = item;
    4      newNode.rt = header.rt;
    5      newNode.lt = header;
    6      header.rt.lt = newNode;
    7      header.rt = newNode;
    8    } // addLeft

The main point to notice here is that there are no special cases for adding an item to an empty list, since this situation no longer arises. However, linking the new node into the list is a little more complicated, since four pointers need to be updated. Let’s consider adding the first node into a previously empty deque. We will assume that it is a character deque, and that we are adding the character ’a’. After creating the new node (line 2) and initialising its data field (line 3) we will have the following situation:

Line 4 (newNode.rt = header.rt;) then changes the picture to the following:

After line 5 (newNode.lt = header;) we get:


Having initialised the two pointer fields in the new node, the next steps are to change the links in the remainder of the list. The first of these steps (see line 6: header.rt.lt = newNode;) would change the left pointer of any previous node in the list. Since we are adding the first node this has the effect of changing the left pointer of the header node itself to point to the new node:

The last step (line 7: header.rt = newNode;) sets the right link of the header node to point to the new node:


This is the end result. We can drop the temporary newNode pointer from the diagram, and redraw it more clearly as follows:

Note how no special case was needed for the fact that this was the first node in the list. Trace through the addition of a second node in the same way and satisfy yourself as to how it works in this case.
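A hand trace can be checked against a small stand-alone version of the method. The sketch below copies the four pointer updates of addLeft into a hypothetical class, AddLeftTrace, that holds characters and exposes its header node so that the links can be inspected after two additions.

```java
// Hypothetical stand-alone version of the addLeft logic, for char items only.
class AddLeftTrace {
    static class Node {
        char data;
        Node lt, rt;
    }

    final Node header = new Node();           // exposed so the links can be inspected

    AddLeftTrace() {
        header.lt = header;                   // empty deque: header linked to itself
        header.rt = header;
    }

    void addLeft(char item) {
        Node newNode = new Node();
        newNode.data = item;
        newNode.rt = header.rt;               // old left-most node (or the header)
        newNode.lt = header;
        header.rt.lt = newNode;               // old left-most node now points back to us
        header.rt = newNode;                  // header's right link marks the left end
    }

    public static void main(String[] args) {
        AddLeftTrace d = new AddLeftTrace();
        d.addLeft('a');
        d.addLeft('b');
        Node b = d.header.rt, a = b.rt;
        System.out.println(b.data + " " + a.data);   // b a
        System.out.println(a.rt == d.header);        // true: right end links to header
        System.out.println(d.header.lt == a);        // true: header's left link is right end
        System.out.println(a.lt == b);               // true
    }
}
```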

Turning to the next of the methods, we have removeLeft to take an element from a deque (there is also removeRight which is, again, very similar):

    public T removeLeft () // Remove item from left end
      { if (header.rt == header)
          throw new EmptyException("deque is empty");
        DequeNode tmpPtr = header.rt;
        T tmpData = tmpPtr.data;
        header.rt = tmpPtr.rt;
        tmpPtr.rt.lt = header;
        return tmpData;
      } // removeLeft

The main thing to notice in this method is the if statement to check that there is a node that can be removed from the deque. We can no longer just check for a null pointer as there is now always something to point at (even if it is just the header node). We can tell if the deque is empty in this case by checking to see if the header node points back at itself. We have chosen to do this in the method above by checking the right pointer (header.rt == header), but could just as easily have used the left pointer in the same way. Trace through the way this method removes a node from a deque.
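As with addLeft, the trace can be checked by running the code. This sketch pairs removeLeft with addLeft and isEmpty in a hypothetical character-only class, RemoveLeftTrace; java.util.NoSuchElementException stands in for EmptyException. Note in particular that removing the only remaining node leaves the header linked back to itself without any special case.

```java
import java.util.NoSuchElementException;

// Hypothetical stand-alone version of the deque logic, for char items only;
// NoSuchElementException stands in for the course library's EmptyException.
class RemoveLeftTrace {
    private static class Node {
        char data;
        Node lt, rt;
    }

    private final Node header = new Node();

    RemoveLeftTrace() {
        header.lt = header;
        header.rt = header;
    }

    void addLeft(char item) {
        Node newNode = new Node();
        newNode.data = item;
        newNode.rt = header.rt;
        newNode.lt = header;
        header.rt.lt = newNode;
        header.rt = newNode;
    }

    char removeLeft() {
        if (header.rt == header)              // header points at itself: empty
            throw new NoSuchElementException("deque is empty");
        Node tmpPtr = header.rt;              // node at the left end
        char tmpData = tmpPtr.data;
        header.rt = tmpPtr.rt;                // bypass it from the header ...
        tmpPtr.rt.lt = header;                // ... and from its right neighbour
        return tmpData;
    }

    boolean isEmpty() { return header.lt == header; }

    public static void main(String[] args) {
        RemoveLeftTrace d = new RemoveLeftTrace();
        d.addLeft('a');
        d.addLeft('b');
        System.out.println(d.removeLeft());   // b
        System.out.println(d.isEmpty());      // false
        System.out.println(d.removeLeft());   // a
        System.out.println(d.isEmpty());      // true
    }
}
```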

The last few methods are again very simple. There are two methods for examining the two end nodes without removing them from the deque (only rightHead is shown here) and one to check if the deque is empty (again we could have used either the left or right pointer of the header node for this):


    public T rightHead () // Return item at right end
      { if (header.lt == header)
          throw new EmptyException("deque is empty");
        return header.lt.data;
      } // rightHead

    public boolean isEmpty () // TRUE if no items in deque
      { return header.lt == header;
      } // isEmpty

That ends our look at deques and various alternative implementation techniques. It is important to note that doubly-linked lists, circularly-linked lists and header nodes are not restricted to the implementation of deques but can be applied to many data structures. Indeed, doubly-linked lists simplify many of the algorithms we have looked at already as the extra pointer generally removes the need for a “previous” pointer when inserting and deleting nodes from a list. In the same way, the use of a header node simplifies many of the linked-list algorithms we have looked at for inserting and removing nodes, by removing the special cases needed for handling empty lists.

Exercise 5.19 Change the implementation of the GenericList class from the previous chapter to use a doubly-linked list. Once you have done this, change it to make use of a header node as well.

Exercise 5.20 Add a method to the Deque class to delete all the elements of the list (excluding the header node, obviously!).

Exercise 5.21 Write a toString method for the Deque class that can be used to display the nodes in a deque from left to right.

Exercise 5.22 Rewrite the Deque class to use a circular array rather than a linked list.

5.4.3 The Use of Deques

Note: This entire section deals with the client view, and will not be highlighted.

The deque is a very general data structure. In fact, we can define the previous data structures that we looked at in this chapter in terms of a deque. A stack is simply a deque where we restrict addition and removal of nodes to one end of the deque. Similarly, a queue can be implemented as a deque where addition occurs at only one end, and removal at the other.
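These two restrictions can be illustrated with a short sketch. It uses java.util.ArrayDeque from the Java class library rather than the Deque class developed in these notes, and the class and method names (RestrictedDeques, stackOrder, queueOrder) are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Sketch: a stack and a queue expressed as restricted uses of a deque.
class RestrictedDeques {
    // Stack discipline: add and remove at the same end.
    static <T> List<T> stackOrder(List<T> items) {
        Deque<T> d = new ArrayDeque<T>();
        for (T item : items)
            d.addFirst(item);
        List<T> out = new ArrayList<T>();
        while (!d.isEmpty())
            out.add(d.removeFirst());
        return out;
    }

    // Queue discipline: add at one end, remove at the other.
    static <T> List<T> queueOrder(List<T> items) {
        Deque<T> d = new ArrayDeque<T>();
        for (T item : items)
            d.addLast(item);
        List<T> out = new ArrayList<T>();
        while (!d.isEmpty())
            out.add(d.removeFirst());
        return out;
    }

    public static void main(String[] args) {
        List<Integer> items = Arrays.asList(1, 2, 3);
        System.out.println(stackOrder(items)); // [3, 2, 1]
        System.out.println(queueOrder(items)); // [1, 2, 3]
    }
}
```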

One possible use of a general deque would be in the kind of problem-solving algorithm we looked at with queues. If we can judge that some potential solutions to a problem might be more likely to give a result than others we could add them to the head of the queue rather than to the end as in the breadth-first technique. This would make use of a deque in which we added and removed items from one end and just added items at the other end (sometimes called an output-restricted deque). The effect of this on our searching strategy would be to search in a way that was neither breadth-first nor depth-first but partially prioritised. The success of this would depend on how well we could assess which states might lead to a result. For example, for the maze problem we might give priority to states where the column position was further to the right (i.e. we appeared to be moving towards the exit). Of course, it is very easy to construct mazes for which this strategy is poor. For other problems it may be easier to assess the merits of different partial solutions and give higher priority to those that are likely to lead to a solution faster.

ADT             Array Implementation   Linked List Implementation
List of int     IntegerVector          IntegerList
Generic Lists   -                      ObjectList and GenericList
Stack           ArrayStack             ListStack
Queue           ArrayQueue             ListQueue
Deque           -                      Deque

Table 5.1: Summary of ADTs in Chapters Four and Five

Further applications of deques would be in programs where one needed to model situations where items could be handled in the more general way embodied in a deque. For example, in playing some card games it might be possible to add and remove cards from either the top or the bottom of the pack of cards. When writing a program to play such a game, a deque would be the natural data structure to represent the pack of cards.

Exercise 5.23 Rewrite the maze-solving program, using a deque to implement the prioritised search strategy outlined above.

Using Deques: Another View

The following quote is taken from the Sun Developer Network’s Core Java Technologies Tech Tips newsletter of 14 December 2005:

Why use a deque? Deques are useful data structures for recursive problems, such as searching through a maze or parsing source [code]. As you move along a path, you save “good” spots, adding more data along the way (that is, while you think the path is good). If the path turns bad, you pop off the bad bits, returning to the last good spot. Here, you would add and remove from the same end, like a stack. Once you find your way through, you start back at the beginning to reveal the solution, which is the other end. Other typical examples include operating system schedulers and bad card dealers who like to deal from the bottom of the deck - to themselves at least.

5.5 Summary of the ADTs in Chapters Four and Five

Table 5.1 summarises the abstract data types that have been covered in this chapter and the previous chapter. It shows the names of the ADTs developed using arrays and also linked lists. In some cases we have only studied one type of implementation. The full listings of all these ADTs can be found in Appendix A.

Skills

• You should be familiar with the following abstract data types: stack, queue and deque

• You should be able to implement these abstract data types using arrays and linked lists

• You should be able to use singly-linked lists, doubly-linked lists, circularly-linked lists and header nodes

• You should be familiar with some of the uses and characteristics of stacks, queues and deques, particularly in the context of problem solving algorithms

Chapter 6

Trees and Graphs

Objectives

• To consider nonlinear abstract data types, specifically general trees, binary trees, binary search trees and graphs

• To study the implementation of these abstract data types using dynamic data structures

• To consider some applications of these abstract data types

6.1 Introduction

All the data structures we have considered up until now (lists, stacks, queues and deques) have been linear. That is, there has been a very simple relationship between one item in the structure and the next. Each item has had at most two neighbouring items and they have been clearly “on either side” of it. In this chapter we will be considering data structures that do not have a simple linear relationship between the items in the data structure. These are trees and graphs. In some ways, as we will see, a tree is just a restricted form of the more general graph data structure (similar to the way in which queues and stacks can be viewed as restricted forms of deques). We will begin by studying trees.

6.2 Trees

In fact, we have already met a tree structure in this course, when we considered search strategies for problem solving algorithms in the last chapter. There we illustrated the positions that could be reached in a maze as a tree of coordinates (see Figure 5.1, p. 82). In everyday life many situations involve tree-like structures. A common example is a person’s family tree, as in Figure 6.1. Other good examples are the hierarchical naming structures used by botanists and zoologists for classifying plants and animals. We have also seen many examples of inheritance hierarchies, which, in Java, are trees.

Figure 6.1: A Family Tree.

6.2.1 Definitions and Terminology

Let’s consider a family tree as an example. The one shown in Figure 6.1 happens to be mine, starting from my grandfather (who was also named George Wells). It serves to illustrate several points about trees, particularly concerning their terminology. In fact, much of the terminology used for trees comes from thinking of them as “family trees”.

Formally, we can define a tree as follows:

A tree is a finite set S of one or more nodes of basetype T such that:

a) there is one specially designated node known as the root

b) the remaining nodes (excluding the root) are all partitioned into m ≥ 0 disjoint sets S1 to Sm, where each Si is in turn a tree.

The trees S1 to Sm are referred to as subtrees. According to the above definition, a subtree may be empty. In our example (Figure 6.1) the first node labelled “George” is the root of the entire tree. Below it there are three subtrees, each of which has two subtrees. The nodes are connected by branches (sometimes called edges). A node which has no descendants is called a leaf (the nodes labelled “Gary”, “Gayle”, “George”, “Elizabeth”, “Robert” and “Lorna” are examples of leaves in the tree above). Two nodes that share the same parent are called siblings (for example, “Eric” and “Anne”, or “Gary” and “Gayle” above). A node above another node with a branch connecting them is called a parent. In the example above we can say that the node labelled “Anne” is the parent of the node labelled “Lorna”. Similarly we identify nodes as children. The node labelled “Elizabeth” is a child of the node labelled “Eric”. The depth of a node is the length of the path from that node to the root (the node labelled “Robert” has a depth of 2 above). The outdegree of a node is the number of branches leaving the node (this is 3 for the first node labelled “George”, 2 for the node labelled “Colin”, and 0 for the node labelled “Robert”). In general, any leaf node will have an outdegree of zero. A related concept is the indegree of a node (i.e. the number of branches entering a node). For a tree this can only be zero (for the root) or one (for all other nodes).
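These terms translate directly into code. The sketch below uses a hypothetical GNode class (it is not part of the classes developed in these notes) and a small made-up tree with single-letter labels, since the family tree figure is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical general tree node, used only to illustrate the terminology.
class GNode {
    final String label;
    final GNode parent;                          // null for the root
    final List<GNode> children = new ArrayList<GNode>();

    GNode(String label, GNode parent) {
        this.label = label;
        this.parent = parent;
        if (parent != null)
            parent.children.add(this);
    }

    // Number of branches leaving this node.
    int outdegree() { return children.size(); }

    // Number of branches entering this node: 0 for the root, 1 otherwise.
    int indegree() { return parent == null ? 0 : 1; }

    // A leaf has no descendants.
    boolean isLeaf() { return children.isEmpty(); }

    // Depth: number of branches on the path from this node up to the root.
    int depth() {
        int d = 0;
        for (GNode n = this; n.parent != null; n = n.parent)
            d++;
        return d;
    }

    public static void main(String[] args) {
        GNode a = new GNode("a", null);   // root
        GNode b = new GNode("b", a);
        GNode c = new GNode("c", a);      // b and c are siblings
        GNode d = new GNode("d", b);      // b is the parent of d
        System.out.println(a.outdegree() + " " + a.indegree() + " " + a.depth()); // 2 0 0
        System.out.println(d.isLeaf() + " " + d.depth());                         // true 2
        System.out.println(c.isLeaf() + " " + c.indegree());                      // true 1
    }
}
```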

Notice how we usually draw trees “upside down”, that is with the root at the top. Another important thing to notice about our formal definition of a tree is that it is recursive. Expressed a little less formally, we have defined a tree as a root node connected to several trees (its subtrees). This naturally recursive definition leads us into a situation where many of the algorithms that deal with trees are best and most easily expressed using recursion, as we will see.

A very important class of trees is that of binary trees [1]. These are trees where the maximum outdegree of any node is two, that is, trees with only zero, one or two children beneath each node. In such a case, we talk of the left subtree and the right subtree of a node. If the outdegree is either zero or two (i.e. all non-leaf nodes have exactly two children), we say that the tree is a proper binary tree.

[1] In fact, any tree, no matter what its outdegree, can be converted to a binary tree, as we will see shortly.

Figure 6.2: A Binary Tree (an Ancestor Tree).

Much of the rest of this chapter is concerned with binary trees. An example of a binary tree would be an ancestor tree (the inverse of a family tree). For my immediate ancestors we get the tree shown in Figure 6.2. Notice how every node here has an outdegree of two or zero. Since no one has more (or fewer!) than two biological parents, any ancestor tree will be a proper binary tree.

6.2.2 Implementing Trees as Dynamic Data Structures

The first thing to consider about a tree is the structure of a node. There are at least two aspects that must be dealt with: the data stored in the node (what we have called a “label” in the general discussion above) and pointers to related nodes. We can safely ignore the data field for the moment and rely on the generic mechanisms in Java to provide us with a generic data field that we can use for any data type we need. What about the pointer fields? The number and structure of these depends on exactly what we wish to do with the trees. The simplest possible case is a binary tree where we keep only two pointers in each node: one to each of the two subtrees. An example of such a tree (with a character data field) would be as follows:

Note that the links between the nodes here allow us to move down the tree easily but there is no way of moving back up the tree. In fact, this is not particularly serious when we use recursive methods to traverse the trees. If we are not going to use recursive methods it is still possible to use the above structure, but it may make life easier if we add a third link to each node, pointing back to the parent node. In this case our tree would look like this (we have reshaped the boxes representing nodes to save some space):

This structure allows us to move freely up and down through the tree. Algorithms using this structure need to be extremely careful about the order in which they traverse the tree.

That deals with binary trees, but how might we represent more general trees? If we know the maximum outdegree of any node we can set aside that number of pointers in each node to point to the child nodes. This technique might result in a considerable number of unused pointer fields, but it is simple and quite efficient. If we are faced with the situation where the outdegree of the nodes cannot be predicted, and may be arbitrarily large, then we will be forced to use a more general list of pointers to keep track of the child nodes. One way of doing this would be to use a list ADT (such as the GenericList class from chapter four, or the Java Vector class). This would allow us to add as many child nodes as were needed for each node in the tree. Fortunately such cases arise very rarely, and we can generally restrict ourselves to dealing with only binary trees.

6.2.3 Converting a General Tree to a Binary Tree

As we have already mentioned, any general tree can be converted to a binary tree. At first this might seem impossible as a general tree might have any number of child nodes beneath any node. How can we model this when we only have two children in a binary tree? If we think of the structure of a general tree we can describe it by two relations: each node has a left-most child node and a right sibling. Now a node in a binary tree has two places in which we can store this information: the left and right subtrees.

Consider the general tree shown below.

How would we convert this to a binary tree? Considering the root node (a) first it has only a left-most child (b) and no sibling. It will form the root of the new binary tree, with only one child:

Considering the node labelled b next, it has a left-most child (e) and a right sibling (c). This gives:

Considering e, it has neither a left-most child nor a sibling, so it will have no children in the new tree. In the case of c, we have f as the left-most child and d as the right sibling. This leads to:

And so on. Eventually we get to the binary tree shown in Figure 6.3, which is equivalent to the general tree with which we began.

In most cases, we can traverse this binary tree as easily (and visiting the same nodes in the same order) as we would have done with the original, general tree. The binary trees created from general trees in this way tend to be rather “long and thin”, and are not proper binary trees. However there are algorithms that allow us to reorganise these trees to give us a “flatter” structure if we so desire.
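The conversion can be sketched in a few lines of recursive Java. Everything here is illustrative: the class names (LcrsDemo, GeneralNode, BinaryNode) are hypothetical, and the example tree only follows the first few steps described above (a has children b, c and d; b has child e; c has child f), since the full figure is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the "left-most child, right sibling" conversion.
class LcrsDemo {
    // Hypothetical general tree node with any number of children.
    static class GeneralNode {
        final char label;
        final List<GeneralNode> children = new ArrayList<GeneralNode>();

        GeneralNode(char label, GeneralNode... kids) {
            this.label = label;
            for (GeneralNode k : kids)
                children.add(k);
        }
    }

    // Hypothetical binary node: lt = left-most child, rt = right sibling.
    static class BinaryNode {
        final char label;
        BinaryNode lt, rt;

        BinaryNode(char label) { this.label = label; }
    }

    // Convert the node at position i of a list of siblings.
    private static BinaryNode convert(List<GeneralNode> siblings, int i) {
        if (i >= siblings.size())
            return null;
        GeneralNode g = siblings.get(i);
        BinaryNode b = new BinaryNode(g.label);
        b.lt = convert(g.children, 0);     // left-most child
        b.rt = convert(siblings, i + 1);   // right sibling
        return b;
    }

    static BinaryNode convert(GeneralNode root) {
        List<GeneralNode> only = new ArrayList<GeneralNode>();
        only.add(root);                    // the root has no siblings
        return convert(only, 0);
    }

    // Pre-order listing of the binary tree, for checking the result.
    static String preOrder(BinaryNode b) {
        return b == null ? "" : b.label + preOrder(b.lt) + preOrder(b.rt);
    }

    public static void main(String[] args) {
        GeneralNode tree = new GeneralNode('a',
            new GeneralNode('b', new GeneralNode('e')),
            new GeneralNode('c', new GeneralNode('f')),
            new GeneralNode('d'));
        BinaryNode bin = convert(tree);
        System.out.println(preOrder(bin)); // abecfd
    }
}
```

The pre-order listing of the binary tree visits the nodes in the same order as a pre-order walk of the original general tree would.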

To recap on this section and the previous section: we have now seen how any tree can be represented by a binary tree. We have also seen how a binary tree can be represented by nodes with three fields: one for the data, one for the left subtree pointer and one for the right subtree pointer. And, in some cases, we might need to add a pointer to the parent node as well. We will now turn to the construction of a generic Java class for binary trees.

6.2.4 A General Binary Tree Class in Java

Section 6.2.2 dealt with the data representation for a tree. This, of course, corresponds to the set of values for a tree ADT. The other aspect of an ADT that we need to define is the set of operations that can be applied to the ADT. In the case of trees the operations we might need to perform are, perhaps, not as obvious as those we would require for linear data structures like queues and stacks. Different authors of textbooks propose different sets of operations, and each has very good reasons for doing so. For our purposes we will implement the following operations for our tree class:

• left: return the left subtree of a tree

• right: return the right subtree of a tree

• addLeft: add a left subtree to a node

• addRight: add a right subtree to a node

• getData: access the data value in the root of a tree

Figure 6.3: Binary Tree Converted From a General Tree.

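The class diagram for this class is shown here.

+--------------------------+
| Tree                     |
+--------------------------+
| data, lt, rt             |
+--------------------------+
| getData, left, right,    |
| addLeft, addRight        |
+--------------------------+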

Data Members

With this structure in mind, let’s turn to the implementation of the Tree class. The first section is again quite simple, with just a data field and two pointers to the subtrees:

public class Tree<T>
  { private T data;
    private Tree<T> lt, // Pointer to left subtree
                    rt; // Pointer to right subtree
    . . .
  } // class Tree

The interesting thing to note here is how we represent a tree by just a single node (indeed, our class might be better named TreeNode). This stems from the recursive nature of the tree: any node is effectively the root of some subtree within the overall structure.

Constructors

When creating a new node we will need to specify the value stored in the node. Furthermore we might want to specify existing trees as the left and right subtrees. We can use overloaded constructors to achieve this:

public Tree (T value, Tree<T> left, Tree<T> right)
// Constructor: creates new node with left and right subtrees
  { lt = left;
    rt = right;
    data = value;
  } // Constructor

public Tree (T value)
// Constructor: creates new node
  { this(value, null, null);
  } // Constructor

Operations

The rest of the methods listed above are very simple. Here are a few of them (right and addRight are very similar to the equivalent left subtree operations here):

public Tree<T> left () // Return the left subtree of a tree
  { return lt; }

public void addLeft (Tree<T> left)
// Add a left subtree to a node
  { if (lt != null)
      throw new UnsupportedOperationException("subtree already present");
    lt = left;
  } // addLeft

public T getData ()
// Access the data value in the root node of a tree
  { return data; }

Notice how we have checked the precondition to make sure that we can safely add a subtree to a node without losing a previously existing one.
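Putting the fragments together gives a small class that can be compiled and tried out. This is a sketch rather than the definitive listing: the right and addRight methods are not shown above and are assumed here to mirror left and addLeft, and the main method is added purely for demonstration.

```java
// Sketch of the complete Tree class assembled from the fragments above.
// right and addRight are assumed to mirror left and addLeft.
class Tree<T> {
    private T data;
    private Tree<T> lt,   // Pointer to left subtree
                    rt;   // Pointer to right subtree

    public Tree(T value, Tree<T> left, Tree<T> right) {
        lt = left;
        rt = right;
        data = value;
    }

    public Tree(T value) {
        this(value, null, null);
    }

    public Tree<T> left() { return lt; }

    public Tree<T> right() { return rt; }

    public void addLeft(Tree<T> left) {
        if (lt != null)
            throw new UnsupportedOperationException("subtree already present");
        lt = left;
    }

    public void addRight(Tree<T> right) {
        if (rt != null)
            throw new UnsupportedOperationException("subtree already present");
        rt = right;
    }

    public T getData() { return data; }

    // Demonstration only: the second addLeft violates the precondition.
    public static void main(String[] args) {
        Tree<String> t = new Tree<String>("root");
        t.addLeft(new Tree<String>("left"));
        try {
            t.addLeft(new Tree<String>("again"));
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println(t.left().getData()); // left
    }
}
```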

Exercise 6.1 Write a pair of replace methods for the Tree class that will work like the add methods, but replacing the existing subtree.

Exercise 6.2 Write a method for the Tree class that will allow client programs to modify the data stored in a node in the tree. Call your method setData.

Figure 6.4: Knowledge-Base Tree.

Exercise 6.3 Working with the trees we have developed in this section is a little difficult because we have no way of removing subtrees. Write a remove method to do this.

Turning away from the implementation view for a moment, how would we use such trees?

6.2.5 Using the Tree Class

Note: This entire section deals with the client view, and will not be highlighted.

As an example, we will consider the simple children’s game of “guess the animal”. The user thinks of an animal, and the computer has to try to guess what it is by asking questions that require a “yes/no” answer. How does this involve trees? Well, the computer has a knowledge base of questions and animals, which is arranged into a tree structure. The internal nodes are questions, where the left subtree is considered when the answer is “yes” and the right subtree when the answer is “no”. A simple example of such a knowledge base is shown in Figure 6.4.

The computer starts at the root of the tree, and then works left (if the answer is “yes”) or right (“no”), until a leaf node is reached, at which point it has identified the animal (as best as it can). The following program trace shows an example of a user’s interaction with this program (the user’s inputs are shown in italics to distinguish them from the program’s output):

Does it live in water?

no

Does it fly?

no

Does it bark?

yes

It’s a dog.

Turning to the implementation, let’s look first at the main part of the program, which uses this tree data structure:

 1  public void play ()
 2    { Tree<String> pos;
 3      initTree();
 4      System.out.println("Let's play guess the animal.");
 5      pos = root;
 6      while (pos != null)
 7        { if (pos.left() != null) // Must be a question
 8            { System.out.println(pos.getData());
 9              if (answer())
10                pos = pos.left();
11              else
12                pos = pos.right();
13            }
14          else // Must be an answer
15            { System.out.println("It's a " + pos.getData() + ".");
16              break;
17            }
18        }
19      if (pos == null)
20        System.out.println("Sorry, I don't know the animal.");
21    } // play

The initTree method sets up the tree containing the knowledge base; we will come back to it shortly. The variable pos is used to move through the tree. We start off setting it to point to the root node. Note that the while loop is usually terminated by the break in line 16, and not by the condition on line 6. In line 7 we check to see whether the current node has a left subtree. If it has a subtree then it must be a question node, and the program asks the question in line 8. The answer method is part of this program. It simply waits for the user to enter a string. If the first letter is a “y” (upper- or lowercase) it returns true, if the first letter is “n” it returns false, otherwise it prints out a message asking the user to enter “yes” or “no” and repeats the process until a valid reply is given. The answer given by the user is used to make the decision to follow either the left subtree (if the answer was “yes”) or the right subtree. If the node was not a question, but an animal name, the program prints out the name (line 15) and then exits the while loop. Note how we use the left and right methods to work through the tree and the getData method to retrieve the string from a node for output.
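The answer method itself is not listed. A minimal sketch of how it might be written, following the description above, is given here. The class name AnswerSketch and the Scanner parameter are assumptions made so that the method can be tried out on its own (the original takes no parameters), the wording of the prompt is invented, and the sketch treats an upper- or lowercase “n” as “no”.

```java
import java.util.Scanner;

// Hypothetical stand-alone version of the answer method described above.
class AnswerSketch {
    // Returns true for a reply starting with y/Y, false for n/N, and
    // asks again for anything else. Throws NoSuchElementException if
    // the input runs out before a valid reply is given.
    static boolean answer(Scanner in) {
        while (true) {
            String reply = in.nextLine().trim().toLowerCase();
            if (reply.startsWith("y"))
                return true;
            if (reply.startsWith("n"))
                return false;
            System.out.println("Please answer yes or no.");
        }
    }

    public static void main(String[] args) {
        // Simulated input: one invalid reply followed by a valid one.
        Scanner in = new Scanner("maybe\nYes\n");
        System.out.println(answer(in)); // prints the prompt once, then true
    }
}
```

In the game itself the Scanner would be created once on System.in and shared between calls.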

The initTree method is responsible for setting up the tree. Ideally, this should be done by reading the knowledge base in from a file. For this simple example it has been done “by hand” as part of the program. This illustrates how the addLeft and addRight methods can be used.

private void initTree ()
  { Tree<String> p;
    root = new Tree<String>("Does it live in water?");
    root.addLeft(new Tree<String>("Does it have webbed feet?"));
    root.addRight(new Tree<String>("Does it fly?"));
    p = root.right();
    p.addLeft(new Tree<String>("bird"));
    p.addRight(new Tree<String>("Does it bark?"));
    p = p.right();
    p.addLeft(new Tree<String>("dog"));
    p.addRight(new Tree<String>("cat"));
    // Return to left subtree of root
    p = root.left();
    p.addLeft(new Tree<String>("duck"));
    p.addRight(new Tree<String>("fish"));
  } // initTree

The root of the tree (referenced by the variable root) is a class variable. The local variable p is used to work through the tree creating new nodes and building up the structure. This method works in a “top-down” way starting at the root and working down to the leaves. We could also have done this in a bottom-up fashion by using the other constructor (the one with subtrees as extra parameters). In this case we would have created two leaf nodes (say the ones labelled “dog” and “cat”) and then created the parent question node in this way:

Tree<String> left = new Tree<String>("dog");
Tree<String> right = new Tree<String>("cat");
Tree<String> p = new Tree<String>("Does it bark?", left, right);

and so on up the tree.

While this children’s game may seem like a trivial application, the kind of techniques that are being used here are also applicable to expert systems, a very important area of research in the field of artificial intelligence.

Exercise 6.4 Rewrite the initTree method using the bottom-up approach discussed above. Be careful to ensure that the root is correctly initialised. Which version of initTree is easier to understand?

Exercise 6.5 Develop a file layout that will allow you to specify the knowledge base as a text file, and rewrite initTree to take a file name and initialise the tree from this.

Exercise 6.6 Another way to construct the knowledge base is to get the user to help. In this case the program starts off with a tree containing just one node. This contains an answer (say “cat”). The program immediately guesses that the user is thinking of a cat. The user then has to respond “yes” or “no”. If the response is “no”, the program asks the user to enter both the correct answer and a question that can be used to differentiate between the wrong answer and the correct animal. This is used to restructure the tree incorporating the new animal and the new question. In this way the program can “learn” about new animals. Rewrite the program to use this approach (you will need to do Exercise 6.1, Exercise 6.2 or Exercise 6.3 first).

6.2.6 Traversing Trees and the Use of Iterators

It is often the case that a program has to work through all the nodes of a tree. For binary trees there are a number of standard ways of working through a tree. These are:

In-order: first the left subtree is traversed, then the root node is visited and then the right subtree is traversed (LNR).

Pre-order: first the root node is visited, then the left subtree is traversed, followed by the right subtree (NLR).

Post-order: first the left subtree is traversed, then the right subtree is traversed and finally the root node is visited (LRN).

Breadth-order: the tree is traversed level by level.

Figure 6.5: Example Binary Tree.

The names LNR, etc. come from abbreviating the phrases “Left, Node, Right”, etc., which describe the order of traversing the subtrees of a node. This might be made clearer by considering an example.

Consider the tree shown in Figure 6.5. If we print out the nodes as we visit them we will get the following output from the four different traversal methods:

Traversal                  Output
In-order (LNR)             d b e a f c g
Pre-order (NLR)            a b d e c f g
Post-order (LRN)           d e b f g c a
Breadth-order (top-down)   a b c d e f g

Note that in addition to the first three traversals we could also do NRL, RNL and RLN traversals, but these are uncommon as we more usually work through a tree from left to right. We could also do the breadth-order traversal from the bottom-up, but again this is rather unusual.

How could we do such traversals with the Tree class we have developed? The existing methods that we have defined for the class are sufficient to write methods to do such traversals. The simplest way of writing traversal methods like these is using recursion. Here is one that would print out the nodes in a tree of characters as used in the illustration above. This method does an in-order traversal.

public void LNRPrint (Tree<Character> root)
// In-order traversal of tree printing out the nodes' data
  { if (root != null)
      { LNRPrint(root.left());
        System.out.println(root.getData());
        LNRPrint(root.right());
      }
  } // LNRPrint

Exercise 6.7 Write similar methods to do pre-order and post-order traversals of trees.

Exercise 6.8 One way of implementing a breadth-order traversal is to use a queue. To do this you start at the root of the tree and add its child nodes to a queue. We can then work through the queue visiting the node at the head of the queue, and adding any child nodes it has to the end of the queue. Using this approach, write a method to perform a breadth-order tree traversal.

Figure 6.6: A Binary Search Tree.

We could write methods like this every time we needed to do a traversal of a tree. However these are very common operations and so the traversals are good candidates to be written as methods of the class. The problem with that is that the task to be performed each time may be different. Here we needed to print out the contents of each node as we visited it. In another application we might need to add the (numeric) contents of the node to a running total. We can overcome this problem by providing a method that returns an iterator that can be used to work (or iterate) through the contents of the tree. An iterator is just an object with methods that allow us to access the contents of some data structure. We will return to this topic in the next section.
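To make the idea concrete before then, here is one possible sketch. It is not the iterator mechanism developed in these notes: it assumes the standard java.util.Iterator interface, builds the iterator by first collecting the node contents into a list in LNR order, and uses a hypothetical Node class in place of the Tree class.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Sketch: separating the traversal from the task performed at each node.
class InOrderSketch {
    // Hypothetical minimal node, standing in for the Tree class.
    static class Node<T> {
        final T data;
        final Node<T> lt, rt;

        Node(T data, Node<T> lt, Node<T> rt) {
            this.data = data;
            this.lt = lt;
            this.rt = rt;
        }
    }

    // Recursive in-order (LNR) traversal that records each node's data.
    private static <T> void collect(Node<T> root, List<T> out) {
        if (root != null) {
            collect(root.lt, out);
            out.add(root.data);
            collect(root.rt, out);
        }
    }

    // The client gets an iterator and decides what to do with each item.
    static <T> Iterator<T> inOrderIterator(Node<T> root) {
        List<T> items = new ArrayList<T>();
        collect(root, items);
        return items.iterator();
    }

    public static void main(String[] args) {
        Node<Integer> root = new Node<Integer>(2,
            new Node<Integer>(1, null, null),
            new Node<Integer>(3, null, null));

        // One client prints the contents ...
        Iterator<Integer> it = inOrderIterator(root);
        while (it.hasNext())
            System.out.println(it.next());

        // ... another adds them to a running total.
        int total = 0;
        it = inOrderIterator(root);
        while (it.hasNext())
            total += it.next();
        System.out.println(total); // 6
    }
}
```

The traversal code is written once, while the two loops in main perform quite different tasks with the values it delivers.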

6.2.7 Ordered Binary Trees

Up until now we have not had much to say about the order of the nodes in the trees we have been considering (except for the “guessing game” application). An important group of binary trees are those where the data is stored in some definite order. These are referred to as binary search trees. This is due to the fact that the ordering makes for very efficient searching for a node with a particular value. The key fact about a binary search tree is that for any node the left subtree contains only values less than that contained in the node, and the right subtree contains values greater than or equal to that contained in the node. An example of a binary search tree containing some randomly chosen letters of the alphabet is shown in Figure 6.6. One aspect of a binary search tree that is worth noting is that an LNR (in-order) traversal will visit the nodes in ascending order.
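The ordering property and its consequences can be demonstrated with a very small sketch. This is not the BinarySearchTree class developed in this chapter: the names BstSketch and Node are hypothetical, the nodes hold plain char values, and the letters inserted are arbitrary.

```java
// Sketch of the binary search tree ordering property.
class BstSketch {
    static class Node {
        final char data;
        Node lt, rt;

        Node(char data) { this.data = data; }
    }

    // Smaller values go left; greater or equal values go right.
    static Node insert(Node root, char value) {
        if (root == null)
            return new Node(value);
        if (value < root.data)
            root.lt = insert(root.lt, value);
        else
            root.rt = insert(root.rt, value);
        return root;
    }

    // Each comparison discards one whole subtree from the search.
    static boolean contains(Node root, char value) {
        while (root != null) {
            if (value == root.data)
                return true;
            root = (value < root.data) ? root.lt : root.rt;
        }
        return false;
    }

    // In-order (LNR) traversal, returned as a string.
    static String inOrder(Node root) {
        return root == null ? "" : inOrder(root.lt) + root.data + inOrder(root.rt);
    }

    public static void main(String[] args) {
        Node root = null;
        for (char c : "mfcrjx".toCharArray())
            root = insert(root, c);
        System.out.println(inOrder(root));       // cfjmrx
        System.out.println(contains(root, 'j')); // true
        System.out.println(contains(root, 'a')); // false
    }
}
```

Whatever order the letters are inserted in, the in-order traversal lists them in ascending order.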

Because of the additional structure in a binary search tree it is worth developing a new class with a slightly different set of operations. These operations are summarised in Table 6.1. Since we will need to ensure that the ordering is retained we need to remove the addLeft and addRight methods. These can be replaced with an insert method that creates a new node and inserts it into the correct position depending on the data value which it contains. We can also provide a remove method to remove a node from the tree, and a contains method to return a true/false indication of whether a node is in the tree. The class diagram is given below, omitting a number of private methods for simplicity.

• Constructor

• Adding an element to a tree

• Deleting an element from a tree

• Finding an element in a tree

• Traversing the tree

Table 6.1: Operations on Binary Search Trees

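+-----------------------------+
| BinarySearchTree            |
+-----------------------------+
| root                        |
+-----------------------------+
| insert, remove, contains,   |
| getLNRIterator,             |
| getNLRIterator,             |
| getLRNIterator              |
+-----------------------------+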

Implementing the BinarySearchTree Class

A further change to the structure can be made in terms of how we “enclose” the nodes. The original tree class we developed gave the user of the class control over following the links, etc. For a binary search tree, because the structure is better defined, we can hide the entire tree structure from the user. This is more like the approach that we have used for all our previous linear abstract data types. One impact of this decision is that the (public) interface methods to which the user has access need to be duplicated with internal (private) methods. With all of this in mind, the skeleton of our new class is shown below.

You will note that the generic type used for the class is specified differently here (as class BinarySearchTree<T extends Comparable>). This introduces a new feature of the generic mechanisms in Java, namely that generic type parameters can be bounded. In this case, what we are saying is that the BinarySearchTree class can hold any type of object, as long as the object extends the Comparable interface [2]. In other words, we are restricting the types of classes that can be used with the BinarySearchTree class. This is needed in this case, because we need to be able to compare the data items in the binary search tree in order to maintain the ordering of the nodes.
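The effect of a bound can be seen on a much smaller scale with a generic method. The sketch below is illustrative only: the names BoundedDemo and larger are hypothetical, and the bound is written as Comparable<T> (the parameterised form of the interface) rather than the plain Comparable used for the BinarySearchTree class.

```java
// Sketch of a bounded generic type parameter on a method.
class BoundedDemo {
    // T is bounded: only types that implement Comparable<T> are accepted,
    // so compareTo is guaranteed to be available inside the method.
    static <T extends Comparable<T>> T larger(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        System.out.println(larger(5, 9));            // 9
        System.out.println(larger("pear", "apple")); // pear
        // larger(new Object(), new Object()) would be rejected by the
        // compiler, since Object does not implement Comparable.
    }
}
```

Without the bound the compiler would not allow the call to compareTo, since an arbitrary type T need not provide that method.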

Reminder: Comparable is an interface that requires a method called compareTo. This method is used to allow general comparisons between objects. It returns zero if the two objects are equal, a negative value if the object is less than the one that it is being compared with and a positive integer (greater than zero) otherwise. Many of the standard library classes implement this interface (e.g. the Integer wrapper class). The following example shows how the compareTo method can be used. The program prints true in all three cases.

Integer five = new Integer(5);

Integer nine = new Integer(9);

Integer x = new Integer(5);

System.out.println(nine.compareTo(five) > 0);

System.out.println(five.compareTo(nine) < 0);

System.out.println(x.compareTo(five) == 0);

2 Actually, it must implement the interface, not extend it, but the generic type parameter specification uses the extends keyword for both meanings.
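As a small illustration of a bounded type parameter on its own, separate from the BinarySearchTree class, the following sketch defines a hypothetical helper method called largest. It is not part of the course classes. It also uses the parameterised form Comparable<T> rather than the raw Comparable used in these notes, an assumption made here only to avoid unchecked warnings.

```java
import java.util.Arrays;
import java.util.List;

class BoundedDemo
{
    // T is bounded: only types that can be compared with themselves are accepted
    static <T extends Comparable<T>> T largest (List<T> items)
    {
        T best = items.get(0);                 // assumes a non-empty list
        for (T item : items)
            if (item.compareTo(best) > 0)      // positive result: item is greater than best
                best = item;
        return best;
    } // largest

    public static void main (String[] args)
    {
        System.out.println(largest(Arrays.asList(5, 9, 3)));
        System.out.println(largest(Arrays.asList("pear", "apple", "fig")));
    } // main
} // class BoundedDemo
```

A call such as largest(Arrays.asList(new Object(), new Object())) is rejected by the compiler, since Object does not implement Comparable. This is the same kind of restriction that the bound places on BinarySearchTree.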

public class BinarySearchTree<T extends Comparable>

{ private class BSTreeNode

{ public T data;

public BSTreeNode lt, // Pointer to left subtree

rt, // Pointer to right subtree

parent; // Pointer to parent node

public BSTreeNode (T value) // Constructor

{ data = value;

lt = rt = parent = null;

} // Constructor

public BSTreeNode (T value, BSTreeNode parent)

// Constructor

{ data = value;

this.parent = parent;

lt = rt = null;

} // Constructor

} // inner class BSTreeNode

private BSTreeNode root;

private void insert (T value, BSTreeNode root)

{ . . . } // insert

private T deleteMin (BSTreeNode root)

// Delete and return the smallest value in the tree under

// root. This is only used by remove.

{ ... } // deleteMin

private void remove (T value, BSTreeNode root)

{ ... } // remove

private boolean contains (T value, BSTreeNode root)

{ ... } // contains

private void buildLNRIterator (BSTreeNode root,

TreeIterator t)

{ ... } // buildLNRIterator

private void buildNLRIterator (BSTreeNode root,

TreeIterator t)

{ ... } // buildNLRIterator


private void buildLRNIterator (BSTreeNode root,

TreeIterator t)

{ ... } // buildLRNIterator

// ------------------------------------------------------------

public BinarySearchTree () // Constructor

{ root = null; }

public void insert (T newValue)

// Add a new node to the tree

{ if (root == null)

root = new BSTreeNode(newValue);

else

insert(newValue, root);

} // insert

public void remove (T value)

// Delete a node from the tree

{ remove(value, root); }

public boolean contains (T value)

// Tell whether value is in tree

{ return contains(value, root); }

public Iterator<T> getLNRIterator ()

{ TreeIterator t = new TreeIterator();

buildLNRIterator(root, t);

return t;

} // LNRTraversal

public Iterator<T> getNLRIterator ()

{ TreeIterator t = new TreeIterator();

buildNLRIterator(root, t);

return t;

} // NLRTraversal

public Iterator<T> getLRNIterator ()

{ TreeIterator t = new TreeIterator();

buildLRNIterator(root, t);

return t;

} // LRNTraversal

} // class BinarySearchTree

We will come back to the details in a moment. For now, notice how almost all the public interface methods simply call on overloaded private methods to accomplish the necessary tasks.

Data Members

Turning to the private data members in this class, we have a nested class definition for the nodes (just as for the stacks, lists, etc. of the previous chapters) and a reference (called root) to the start of the data structure. Note that we have chosen to implement a constructor for the BSTreeNode inner class. This simplifies the addition of new nodes slightly. Note too how we have included a link to the parent node of each node in the tree. This is needed for some of the algorithms, as we will see shortly.

BSTreeNode

data, lt, rt, parent

Inserting New Nodes

Let's now have a look at the implementation of the private, internal methods that actually manipulate the trees. These are, again, highly recursive. The first is the insert method, used to insert a new object into the tree. It is given the object and the root of an existing tree as parameters.

private void insert (T value, BSTreeNode root)

{ assert root != null;

if (root.data.compareTo(value) > 0)

// Add to left subtree

if (root.lt != null)

insert(value, root.lt);

else

root.lt = new BSTreeNode(value, root);

else // Add to right subtree

if (root.rt != null)

insert(value, root.rt);

else

root.rt = new BSTreeNode(value, root);

} // insert

This is fairly straightforward. It needs to find the correct place in the tree to add in the new node. One of the features of binary search trees is that we never have to do such an insertion at an interior point in the tree: we always add new nodes as leaves in the tree. As an example, consider adding the letter 'b' to the example tree below (we do permit duplicates in the tree).

Our definition of the binary search tree says that values less than that stored in a node will be found in the left subtree. When we start looking for the insertion point in the tree above this means we must take the right leg (to the node labelled 'd'). At this point we have to go left, as 'b' is less than 'd'. This brings us to the node labelled 'c' and we have to go left again, as 'b' is less than 'c'. At this point we reach a dead end. The process of searching for the insertion point can be illustrated as follows:

The final step is to add a new node containing the second 'b' at this point, giving:

This tree still has all the properties we require of a binary search tree. In particular, an in-order (LNR) traversal will visit the nodes in ascending order: abbcd. Trace through the action of this method for the node addition described above. Satisfy yourself that it works in all possible cases (such as adding to an empty tree, adding a value smaller/greater than any already in the tree, etc.).
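The walk-through above can be checked with a small, self-contained sketch. It is not the BinarySearchTree class itself: the Node type has no parent link, insert returns the root of the subtree rather than being void, and the starting tree (root 'b', with 'a' to its left, 'd' to its right and 'c' below 'd') is an assumption reconstructed from the description of the search path.

```java
class InsertDemo
{
    // Minimal node type for this sketch only
    static class Node
    {
        char data;
        Node lt, rt;
        Node (char value) { data = value; }
    }

    // Same ordering rule as the insert method above: smaller values go left,
    // equal or larger values go right, and a new node always becomes a leaf
    static Node insert (Node root, char value)
    {
        if (root == null)
            return new Node(value);
        if (root.data > value)
            root.lt = insert(root.lt, value);
        else
            root.rt = insert(root.rt, value);
        return root;
    } // insert

    // In-order (LNR) traversal, appending each value to out
    static void inOrder (Node root, StringBuilder out)
    {
        if (root != null)
        {
            inOrder(root.lt, out);
            out.append(root.data);
            inOrder(root.rt, out);
        }
    } // inOrder

    static String demo ()
    {
        Node root = null;
        // The final 'b' is the duplicate discussed in the text
        for (char c : new char[] { 'b', 'a', 'd', 'c', 'b' })
            root = insert(root, c);
        StringBuilder out = new StringBuilder();
        inOrder(root, out);
        return out.toString();
    } // demo

    public static void main (String[] args)
    {
        System.out.println(demo());   // prints abbcd
    } // main
} // class InsertDemo
```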

Deleting Nodes

While adding a new node to a binary search tree and maintaining its essential properties is reasonably straightforward, the same is not true of deleting a node. When we delete an internal node we need to replace it with something so that the subtrees are not just left dangling. This gives rise to four individual cases where the node to be deleted has: (1) no subtrees, (2) only a left subtree, (3) only a right subtree, or (4) both left and right subtrees.

The most difficult of these cases is (4), where the node has two subtrees. For example, what are we to do when deleting the node labelled 'b' below?

One way of solving this problem is to delete the smallest node in the right subtree (i.e. the node labelled 'c' above) and replace the node that we are deleting with this one. Let's see how this works. When we come to delete the node labelled 'c' we still have the problem of dealing with the right subtree (the node chosen for deletion in this way will never have a left subtree, of course; why?). This can simply be moved up a level to replace the deleted node. This gives:

How does this get us closer to deleting the node we really need to delete? Well, we can very easily replace the contents of the node labelled 'b' with the contents of the newly deleted node (i.e. 'c'). This gives:

This tree is exactly what we require: a binary search tree with the 'b' node deleted.

Now that we know how to go about it, we can write the code. It turns out to be easiest to write a separate method to delete the smallest element from a given subtree (the first step in the process described above). Since this is a rather unusual method and is only of interest to the implementation of this ADT, we will not make it public, but rather private. We will call it deleteMin:

private T deleteMin (BSTreeNode root)

// Delete and return the smallest value in the tree under root

// This method is only used by remove

{ assert root != null;

if (root.lt != null)


return deleteMin(root.lt);

else // Delete this node

{ T tmpData = root.data;

replaceInParent(root.parent, root, root.rt);

return tmpData;

}

} // deleteMin

The replaceInParent method that is used here is another private utility method that simplifies the altering of the links in the parent of a deleted node:

private void replaceInParent (BSTreeNode parent, BSTreeNode child,

BSTreeNode newChild)

// Replace child link in parent node with newChild

{ if (parent == null) // at root

root = newChild;

else

if (parent.lt == child)

parent.lt = newChild;

else

parent.rt = newChild;

if (newChild != null)

newChild.parent = parent; // Reset parent

} // replaceInParent

The deleteMin method can then be used by the private remove method to help deal with the difficult case of deleting a node with two subtrees. What about the other three cases mentioned above? Obviously, if a node has no subtrees then deleting it is trivial: we simply delete the node and set the parent's pointer to null (consider deleting the node labelled 'f' above). If the node to be deleted has only a left or a right subtree, it is almost as simple: we can simply replace the node by the child node. Consider a few examples and satisfy yourself that this strategy will work.

We can now write the remove method itself. It consists primarily of a series of nested if statements to decide which of the four possible cases holds. This listing has been commented with the number of the case as in the discussion above.

private void remove (T value, BSTreeNode root)

{ if (root != null)

{ if (root.data.compareTo(value) > 0)

// Delete from left subtree

remove(value, root.lt);

else

if (root.data.compareTo(value) < 0)

// Delete from right subtree

remove(value, root.rt);

else // Must be this node to be deleted

{ if (root.lt != null && root.rt != null)

// Has both left and right subtrees: CASE 4

{ T min = deleteMin(root.rt);

root.data = min;

}

else

{ if (root.lt == null && root.rt == null)


// Has no subtrees: CASE 1

replaceInParent(root.parent, root, null);

else

if (root.lt == null)

// Has only right subtree: CASE 3

replaceInParent(root.parent, root, root.rt);

else // Has only left subtree: CASE 2

replaceInParent(root.parent, root, root.lt);

} // else

} // else

} // if

} // remove

This is probably the most complex algorithm that we have seen so far. Study it carefully and make sure that you know exactly how it works.
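One way to study the four cases is to run them. The sketch below is a deliberately simplified variant, not the listing above: it has no parent links and no replaceInParent, and instead each call returns the root of the subtree that remains after the removal. The class name, the helper methods and the sample trees are all assumptions made for this illustration.

```java
class RemoveDemo
{
    static class Node
    {
        char data;
        Node lt, rt;
        Node (char value) { data = value; }
    }

    static Node insert (Node root, char value)
    {
        if (root == null)
            return new Node(value);
        if (root.data > value)
            root.lt = insert(root.lt, value);
        else
            root.rt = insert(root.rt, value);
        return root;
    } // insert

    // Smallest value in a non-empty subtree: keep following left links
    static char min (Node root)
    {
        while (root.lt != null)
            root = root.lt;
        return root.data;
    } // min

    // Returns the root of the subtree after removing one occurrence of value
    static Node remove (Node root, char value)
    {
        if (root == null)
            return null;                          // value not found: nothing to do
        if (root.data > value)
            root.lt = remove(root.lt, value);     // delete from left subtree
        else if (root.data < value)
            root.rt = remove(root.rt, value);     // delete from right subtree
        else if (root.lt != null && root.rt != null)
        {   // CASE 4: copy up the smallest value in the right subtree,
            // then remove that value from the right subtree
            root.data = min(root.rt);
            root.rt = remove(root.rt, root.data);
        }
        else if (root.lt != null)
            return root.lt;                       // CASE 2: only a left subtree
        else
            return root.rt;                       // CASE 3, or CASE 1 (rt is null)
        return root;
    } // remove

    static String inOrder (Node root)
    {
        return root == null ? "" : inOrder(root.lt) + root.data + inOrder(root.rt);
    } // inOrder

    // Build a tree by inserting the characters of values from left to right
    static Node build (String values)
    {
        Node root = null;
        for (char c : values.toCharArray())
            root = insert(root, c);
        return root;
    } // build

    public static void main (String[] args)
    {
        System.out.println(inOrder(remove(build("dbfaceg"), 'a')));  // CASE 1
        System.out.println(inOrder(remove(build("dba"), 'b')));      // CASE 2
        System.out.println(inOrder(remove(build("abc"), 'b')));      // CASE 3
        System.out.println(inOrder(remove(build("dbfaceg"), 'd')));  // CASE 4
    } // main
} // class RemoveDemo
```

In every case the in-order traversal of the result is still in ascending order, which is the property that remove has to preserve.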

Searching for a Node

The next method to consider is contains, which returns an indication of whether or not a particular value is found in the tree. This involves a fairly straightforward recursive search through the tree.

private boolean contains (T value, BSTreeNode root)

{ if (root == null)

return false;

if (root.data.equals(value))

return true;

else

if (root.data.compareTo(value) > 0) // Look in left subtree

return contains(value, root.lt);

else // Look in right subtree

return contains(value, root.rt);

} // contains

Note that this method uses both the equals and the compareTo methods.
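For most classes the two methods agree: equals returns true exactly when compareTo returns zero. This is not guaranteed, however, and the standard BigDecimal class is a well-known exception. The sketch below uses a hypothetical helper called consistent to show the difference. With the contains method as listed, it would seem to follow that searching for 1.00 in a tree holding 1.0 fails the equals test and then continues into the right subtree, so the value is reported as absent.

```java
import java.math.BigDecimal;

class EqualsVersusCompareTo
{
    // True if equals and compareTo give the same answer about equality
    static <T extends Comparable<T>> boolean consistent (T a, T b)
    {
        return a.equals(b) == (a.compareTo(b) == 0);
    } // consistent

    public static void main (String[] args)
    {
        BigDecimal a = new BigDecimal("1.0");
        BigDecimal b = new BigDecimal("1.00");
        System.out.println(a.equals(b));           // false: the scales differ
        System.out.println(a.compareTo(b) == 0);   // true: numerically equal
        System.out.println(consistent(a, b));      // false
        System.out.println(consistent(5, 5));      // true
    } // main
} // class EqualsVersusCompareTo
```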

Iterators for Traversing the Tree

The final point to consider is the way that the iterators work. You will recall that the public methods all have the following form (this method returns an in-order iterator):

public Iterator<T> getLNRIterator ()

{ TreeIterator t = new TreeIterator();

buildLNRIterator(root, t);

return t;

} // LNRTraversal

This essentially creates a specific Iterator object and returns it. The client can then use the Iterator object to work through the nodes in the tree. We need to consider three things now: (1) what methods does the Iterator interface require, (2) what does the TreeIterator class do, and (3) how do the buildXXXIterator methods work?

The Iterator Interface. The Iterator interface is a very simple Java interface that specifies three methods. Note that it is a generic interface and so can iterate through collections of any specified data type.

public interface Iterator<T>

{ public T get (); // Get the current item

public void next (); // Move to the next item

public boolean atEnd (); // Tell whether there are any more items

} // interface Iterator

These methods allow a client program using an iterator to access the current data item in the ADT through which the client is working, to move on to the next data item, and to tell if the iteration is completed (i.e. that there are no more data items to examine). The use of an iterator in the context of a tree is illustrated below.

The TreeIterator Class. The TreeIterator class is an inner class that simply builds up a list of nodes in a vector (we use the Vector class from the java.util package, but we could have used our own GenericList class or any similar ADT). It then provides the methods required by the Iterator interface to allow clients to work through the list of nodes that it contains.

private class TreeIterator implements Iterator<T>

{ private Vector<T> v;

private int index = 0;

public TreeIterator ()

{ v = new Vector<T>();

} // constructor

public T get ()

{ return v.get(index);

} // get

public void next ()

{ index++;

} // next

public boolean atEnd ()

{ return index >= v.size();

} // atEnd

void add (T value)

{ v.add(value);

} // add

} // class TreeIterator

Note how the Vector class is itself generic and so needs to be created here to work with a list of whatever type T the BinarySearchTree is working with.

Creating the Iterators. The TreeIterator objects are initialised by the buildXXXIterator methods. The buildLNRIterator method is as follows (again, the others are very similar):

private void buildLNRIterator (BSTreeNode root,

TreeIterator t)

{ if (root != null)

{ buildLNRIterator(root.lt, t);

t.add(root.data);

buildLNRIterator(root.rt, t);

}

} // buildLNRIterator

Essentially, all this does is traverse the tree in the correct order (i.e. in-order here), adding the data in the nodes to the given TreeIterator. This is then the iterator that is returned to the client, allowing it to work through the nodes in the same order.
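To see how the three build methods differ, the following stand-alone sketch places them side by side. It is only an approximation of the real methods: a java.util List stands in for the TreeIterator, the Node type is a simplified stand-in for BSTreeNode, and the four-node sample tree is made up for the illustration. The only difference between the three methods is the point at which the node's own data is added.

```java
import java.util.ArrayList;
import java.util.List;

class TraversalDemo
{
    static class Node
    {
        char data;
        Node lt, rt;
        Node (char value, Node left, Node right)
        {   data = value; lt = left; rt = right; }
    }

    // In-order: left subtree, node, right subtree
    static void buildLNR (Node root, List<Character> out)
    {
        if (root != null)
        {   buildLNR(root.lt, out);
            out.add(root.data);
            buildLNR(root.rt, out);
        }
    } // buildLNR

    // Pre-order: node, left subtree, right subtree
    static void buildNLR (Node root, List<Character> out)
    {
        if (root != null)
        {   out.add(root.data);
            buildNLR(root.lt, out);
            buildNLR(root.rt, out);
        }
    } // buildNLR

    // Post-order: left subtree, right subtree, node
    static void buildLRN (Node root, List<Character> out)
    {
        if (root != null)
        {   buildLRN(root.lt, out);
            buildLRN(root.rt, out);
            out.add(root.data);
        }
    } // buildLRN

    // Traverse a fixed sample tree: 'b' at the root, 'a' on its left,
    // 'd' on its right, and 'c' as the left child of 'd'
    static String traverse (String order)
    {
        Node c = new Node('c', null, null);
        Node d = new Node('d', c, null);
        Node a = new Node('a', null, null);
        Node b = new Node('b', a, d);
        List<Character> out = new ArrayList<Character>();
        if (order.equals("LNR"))
            buildLNR(b, out);
        else if (order.equals("NLR"))
            buildNLR(b, out);
        else
            buildLRN(b, out);
        StringBuilder sb = new StringBuilder();
        for (char ch : out)
            sb.append(ch);
        return sb.toString();
    } // traverse

    public static void main (String[] args)
    {
        System.out.println(traverse("LNR"));   // prints abcd
        System.out.println(traverse("NLR"));   // prints badc
        System.out.println(traverse("LRN"));   // prints acdb
    } // main
} // class TraversalDemo
```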

How can we use these iterators? It is actually very simple. The following example shows how an in-order traversal of a tree of characters can be performed in order to print out the contents of the tree. The steps are to first obtain an iterator, using one of the getXXXIterator methods, then to loop using the iterator's atEnd method to detect the end of the process, the get method to obtain the current object from the tree, and the next method to move on to the following one.

BinarySearchTree<Character> tree = new BinarySearchTree<Character>();

...

Iterator<Character> it = tree.getLNRIterator();

while (! it.atEnd())

{ System.out.print(it.get());

it.next();

}

System.out.println();

Exercise 6.9 Provide iterators for the vector and list ADTs in Chapter 4.

Exercise 6.10 Many of the private methods in the BinarySearchTree class (contains, insert, deleteMin and remove) can easily be written in a non-recursive way. Rewrite them without using recursion.

Exercise 6.11 Write a method for this class that provides a breadth-order iterator (see Exercise 6.8 for a way of doing this).

The main uses of binary search trees are in sorting and searching. A file of data can be read in and the data records placed into a binary search tree. At the end of this process, an in-order traversal of the tree will produce the data in ascending order. If the data is originally in a random order this method is a very efficient way of sorting. Additionally, once data has been placed in the binary search tree it can be searched and accessed very efficiently. We will return to this point in Chapter 9.
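The condition about random order matters. The following sketch is an experiment under stated assumptions rather than a proof: it uses int data, an iterative insert, and an arbitrary size and seed. It compares the height of the tree that results from inserting the same values in ascending order and in shuffled order. With ascending input every new node becomes the right child of the previous one, so the tree is no better than a linked list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class HeightDemo
{
    static class Node
    {
        int data;
        Node lt, rt;
        Node (int value) { data = value; }
    }

    // Iterative insertion; returns the (possibly new) root of the tree
    static Node insert (Node root, int value)
    {
        Node n = new Node(value);
        if (root == null)
            return n;
        Node curr = root;
        while (true)
        {
            if (curr.data > value)
            {   if (curr.lt == null) { curr.lt = n; break; }
                curr = curr.lt;
            }
            else
            {   if (curr.rt == null) { curr.rt = n; break; }
                curr = curr.rt;
            }
        }
        return root;
    } // insert

    // Number of nodes on the longest path from the root down to a leaf
    static int height (Node root)
    {
        return root == null ? 0 : 1 + Math.max(height(root.lt), height(root.rt));
    } // height

    static int heightAfterInserting (List<Integer> values)
    {
        Node root = null;
        for (int v : values)
            root = insert(root, v);
        return height(root);
    } // heightAfterInserting

    static List<Integer> sortedValues (int n)
    {
        List<Integer> values = new ArrayList<Integer>();
        for (int i = 0; i < n; i++)
            values.add(i);
        return values;
    } // sortedValues

    static List<Integer> shuffledValues (int n, long seed)
    {
        List<Integer> values = sortedValues(n);
        Collections.shuffle(values, new Random(seed));   // fixed seed: repeatable runs
        return values;
    } // shuffledValues

    public static void main (String[] args)
    {
        System.out.println("sorted input:   height " + heightAfterInserting(sortedValues(1000)));
        System.out.println("shuffled input: height " + heightAfterInserting(shuffledValues(1000, 42L)));
    } // main
} // class HeightDemo
```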

6.3 Graphs

Graphs are even less structured than trees in that any two nodes in a graph can be connected. There is no concept of a “parent/child” relationship: two nodes are either connected or they are not. The traditional example of this is a route map showing the main roads between cities. For example, the diagram in Figure 6.7 shows a graph representing the roads joining the major cities in South Africa.

Figure 6.7: An Example of a Graph.

6.3.1 Definitions and Terminology

A formal definition of a graph is as follows:

A graph G = (V, E) consists of a set of vertices V, and a set of edges E. Each edge is a pair (v, w), where v, w ∈ V.

The nodes in a graph are usually called vertices (the singular word is vertex), and the connecting lines are called edges or arcs. The arcs may be directed (meaning that they have a definite start and end; this is usually shown by placing an arrowhead on the end of the arc or along its length), or undirected, as in the example in Figure 6.7. Furthermore, the arcs may be weighted, meaning that they have some value associated with them. For an example such as the major road graph that we have here, the weights usually correspond to some measurement like the distance between the two vertices connected by the arc.

6.3.2 Representation of Graphs

Common questions when dealing with graphs are problems like: what is the shortest route (i.e. along the fewest arcs) from vertex A to vertex B? what is the lowest weighted route from vertex A to vertex B? what is the shortest round trip route which visits all vertices? can a route be found that visits each vertex only once? and so on. Efficient algorithms to solve problems like these rely on efficient representations of the graph. We will consider two possible representations here: adjacency matrices and edge lists.

Graph Representation using Adjacency Matrices

This method represents a graph using a square matrix, indexed by the vertex labels, showing which vertices are connected. For the example graph in Figure 6.7 we would have the following matrix:

          Pretoria  Jo'burg  Durban  E.L.  P.E.  C.T.  Bloem
Pretoria      1        1       0      0     0     0      0
Jo'burg       1        1       1      0     0     0      1
Durban        0        1       1      1     0     0      0
E.L.          0        0       1      1     1     0      1
P.E.          0        0       0      1     1     1      1
C.T.          0        0       0      0     1     1      1
Bloem         0        1       0      1     1     1      1

Note that, by convention, a vertex is considered to be joined to itself and so the diagonal of this matrix is filled with ones. The rest of the matrix is filled with ones where there exists a link between the two vertices represented by the row and column index. This matrix is for an undirected graph and so is symmetrical about the diagonal. A directed graph on the other hand would not be symmetrical and a convention would need to be established to specify the direction of a link. For example, we might specify that a one in a position meant that there was an arc from the vertex labelling the row to the vertex labelling the column.
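In Java such a matrix might be held in a two-dimensional array. The sketch below is one possible arrangement, with class and method names invented for the illustration. It stores the matrix for the South African roads example and checks the symmetry property just described.

```java
class AdjacencyMatrixDemo
{
    static final String[] CITIES =
        { "Pretoria", "Jo'burg", "Durban", "E.L.", "P.E.", "C.T.", "Bloem" };

    // Rows and columns follow the order of CITIES; 1 means "joined by a road"
    static final int[][] ROADS =
        { { 1, 1, 0, 0, 0, 0, 0 },     // Pretoria
          { 1, 1, 1, 0, 0, 0, 1 },     // Jo'burg
          { 0, 1, 1, 1, 0, 0, 0 },     // Durban
          { 0, 0, 1, 1, 1, 0, 1 },     // E.L.
          { 0, 0, 0, 1, 1, 1, 1 },     // P.E.
          { 0, 0, 0, 0, 1, 1, 1 },     // C.T.
          { 0, 1, 0, 1, 1, 1, 1 } };   // Bloem

    // Position of a city in CITIES, or -1 if it is not a vertex of the graph
    static int indexOf (String city)
    {
        for (int i = 0; i < CITIES.length; i++)
            if (CITIES[i].equals(city))
                return i;
        return -1;
    } // indexOf

    // Assumes both names are valid vertices
    static boolean connected (String from, String to)
    {
        return ROADS[indexOf(from)][indexOf(to)] == 1;
    } // connected

    // An undirected graph must give a matrix that is symmetrical about the diagonal
    static boolean isSymmetric ()
    {
        for (int row = 0; row < ROADS.length; row++)
            for (int col = 0; col < row; col++)
                if (ROADS[row][col] != ROADS[col][row])
                    return false;
        return true;
    } // isSymmetric

    public static void main (String[] args)
    {
        System.out.println(isSymmetric());                   // true
        System.out.println(connected("Jo'burg", "Bloem"));   // true
        System.out.println(connected("Pretoria", "C.T."));   // false
    } // main
} // class AdjacencyMatrixDemo
```

Testing whether two vertices are connected is a single array access, which is the main attraction of this representation.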

This representation is simple, and is often used for that reason. However, it can be wasteful of space, especially in cases where there are many vertices and few arcs (most of the entries will be zeroes). An alternative representation, which avoids this, is the edge list representation.

Graph Representation using Edge Lists

This method associates with each vertex the list of vertices to which it is connected. For the South African roads example we would have the following representation:

Vertex     Edge List
Pretoria   Jo'burg
Jo'burg    Pretoria, Durban, Bloem
Durban     Jo'burg, E.L.
E.L.       Durban, Bloem, P.E.
P.E.       E.L., Bloem, C.T.
C.T.       Bloem, P.E.
Bloem      C.T., Jo'burg, E.L., P.E.

In an actual computer implementation, this might be done using a list of references or pointers (for example, we could use our own GenericList class from Chapter 4). This might look something like the following:

public class CityNode

{ private String name; // City name

private GenericList<CityNode> roads; // Edge list

. . .

} // class CityNode

This can be extended to include the weighting information for the arcs if this is a requirement of the problem.

One of the vital points to bear in mind when dealing with graph algorithms is that it is easy to get into an infinite loop, visiting the same nodes over and over. Great care must be taken to prevent this.
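A common way of taking this care is to record which vertices have already been visited. The sketch below applies the idea to the “fewest arcs” question, using a breadth-first search over the edge lists given above. Breadth-first search is a standard technique chosen here for the illustration, not one developed in these notes, and standard java.util collections stand in for CityNode and GenericList.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RouteDemo
{
    // The edge lists for the South African roads example
    static Map<String, List<String>> roads ()
    {
        Map<String, List<String>> g = new HashMap<String, List<String>>();
        g.put("Pretoria", Arrays.asList("Jo'burg"));
        g.put("Jo'burg",  Arrays.asList("Pretoria", "Durban", "Bloem"));
        g.put("Durban",   Arrays.asList("Jo'burg", "E.L."));
        g.put("E.L.",     Arrays.asList("Durban", "Bloem", "P.E."));
        g.put("P.E.",     Arrays.asList("E.L.", "Bloem", "C.T."));
        g.put("C.T.",     Arrays.asList("Bloem", "P.E."));
        g.put("Bloem",    Arrays.asList("C.T.", "Jo'burg", "E.L.", "P.E."));
        return g;
    } // roads

    // Fewest arcs on any route from start to goal, or -1 if there is no route
    static int fewestArcs (Map<String, List<String>> graph, String start, String goal)
    {
        // distance doubles as the "visited" record: a vertex with an entry
        // has already been reached and must not be queued again
        Map<String, Integer> distance = new HashMap<String, Integer>();
        Deque<String> queue = new ArrayDeque<String>();
        distance.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty())
        {
            String city = queue.remove();
            if (city.equals(goal))
                return distance.get(city);
            for (String next : graph.getOrDefault(city, Collections.<String>emptyList()))
                if (!distance.containsKey(next))     // skip vertices already visited
                {
                    distance.put(next, distance.get(city) + 1);
                    queue.add(next);
                }
        }
        return -1;
    } // fewestArcs

    public static void main (String[] args)
    {
        System.out.println(fewestArcs(roads(), "Pretoria", "C.T."));   // prints 3
    } // main
} // class RouteDemo
```

Without the containsKey test the search would bounce between Pretoria and Jo'burg (and around every other cycle in the graph) indefinitely.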

Graphs occur in many areas of Computer Science. Their uses in describing networks, such as networks of roads, communication networks (e.g. telephone lines and exchanges) and computer networks, should be obvious. In addition they can be used for describing some state-space problems (similar to the solution trees discussed in the maze-solving algorithm of the last chapter).

You will be studying graph theory further in a later course, so we will leave it at this for now.

Skills

• You should be familiar with the following abstract data types: tree, binary tree and binary search tree

• You should be acquainted with the general concepts of graphs, and the terminology used to describe them

• You should be familiar with implementation techniques for these abstract data types

• You should be familiar with some of the uses and characteristics of these abstract data types

• You should be familiar with the use of iterators


Chapter 7

Making a Hash of It!

Objectives

• To consider the following abstract data types: dictionaries and hash tables

• To study the implementation of these abstract data types

• To introduce some of the important issues for the efficient implementation of hash tables

• To consider the application of these abstract data types

7.1 Introduction

In this chapter we will be looking at some abstract data types for storing information in ways similar to those used in databases. This introduces the concept of data items composed of a key and an associated value. We insist that the keys are unique and can be used to identify specific data items. The simplest data structure like this is a dictionary, where a key is used to access some related data. This is obviously analogous to a normal dictionary in which the key corresponds to the word, and the value to its definition. The initial dictionary implementation that we will consider makes use of a sorted list of data items. This is rather inefficient, and leads us onto the topics of hashing and hash tables, which provide more efficient means of implementing dictionaries and similar structures. Hashing techniques also support very efficient searching, a topic that is vital for many applications in the ICT field, not least search engines such as Google.

7.2 Dictionaries

As mentioned above, we now think of our data as having two parts: a key and a value. In fact, the value may be a complete record with several fields. The key is generally somewhat simpler, but this is not a strict requirement. Typical key values would be student or employee numbers, or, in the context of an English dictionary, the keyword. The important thing about keys is that they uniquely identify a specific item of data. All access to the data is done using the key values: client programs have no concept of the position of a data item in the data structure.

7.2.1 A Java Interface for Dictionary Data Structures

The interface that we will use for the ADTs in this chapter is shown below. This shows the use of an interface with two generic type parameters (called K and V here, for the type of the keys and the type of the associated values, respectively). It essentially describes classes that allow us to insert and retrieve data items based on a key, as described above. You will notice that there is provision for inserting just a key without an associated value, but that there are no facilities for accessing or working with the values independently of their keys. You will also notice that we provide a method to get an iterator that can be used to work through all the entries in these types of ADT; we will examine the definition of this and the Pair interface, which it uses, shortly.

public interface Dictionary<K, V>

{

public void makeEmpty ();

// Delete all entries in dictionary

public void insert (K aKey, V aValue);

// Insert new element or update existing one

public void insert (K aKey);

// Overloaded version of insert above

public void remove (K aKey);

// Remove an entry from a dictionary

public V get (K aKey);

// Access an entry in a dictionary, creating it if necessary

public boolean contains (K aKey);

// Tell whether dictionary contains aKey

public boolean isEmpty ();

// Tell whether dictionary is empty

public Iterator<Pair<K, V>> getIterator ();

// Get an Iterator for this Dictionary

} // interface Dictionary

7.2.2 A Simple Java Class for Dictionaries

The first class we will look at that implements the Dictionary interface is very similar to many of the previous classes we have written using linked lists, with a private inner class and a pointer to the start of a dynamically allocated data structure. The inner list node class also makes use of a class called DictionaryPair, which provides much of the functionality needed for all our dictionary ADTs in this chapter.

Figure 7.1: Full Class Diagram for the ListDictionary Class. [Diagram with three boxes: Dictionary (insert, remove, get, contains, isEmpty, makeEmpty, getIterator); Pair (getKey, getValue, setValue, hashCode, equals); ListDictionary (insert, remove, get, contains, isEmpty, makeEmpty, getIterator).]

Data Members

The full class diagram for the ListDictionary, ListNode and DictionaryPair classes is shown in Figure 7.1, showing the “uses” and inheritance relationships between these three classes.

public class ListDictionary<K extends Comparable, V> implements Dictionary<K, V>

{ private class ListNode extends DictionaryPair<K, V>

{ public ListNode next;

public ListNode (K aKey, V aValue)

{ super(aKey, aValue);

}

public ListNode (K aKey)

{ super(aKey);

}

} // inner class ListNode

private ListNode dict;

. . .

} // class ListDictionary

There are a few points to note here.

• We will be keeping the list of data for this class in ascending order of the keys. For this reason, we have bounded the generic type for the keys (i.e. K), just as we did for the BinarySearchTree class in the previous chapter. In essence, while the Dictionary interface states that we can have a dictionary of any type K for the keys and any type V for the values, we are restricting that here to say that the keys must implement the Comparable interface for the purposes of the ListDictionary class.

• The inner class (ListNode) used to build the linked list of data in this case extends the DictionaryPair class. The DictionaryPair class is a simple class that allows us to represent data that is composed of a pair of values: a key and a data value. There are also a number of supporting methods required, which we will discuss as we come across the need for them. The important fact to note is that, once a pair object has been constructed, the key value cannot be changed, although the associated data value may be. The entire DictionaryPair class is shown below.

class DictionaryPair<K, V> implements Pair<K, V>

{ private K key;

private V value;

public DictionaryPair (K aKey, V aValue)

{ key = aKey;

value = aValue;

} // constructor

public DictionaryPair (K aKey)

{ key = aKey;

} // constructor

public K getKey ()

{ return key;

} // getKey

public V getValue ()

{ return value;

} // getValue

public void setValue (V value)

{ this.value = value;

} // setValue

public int hashCode ()

{ return key.hashCode();

} // hashcode

public boolean equals (Object o)

{ if (o instanceof Pair)

return key.equals(((Pair)o).getKey());

else

return key.equals(o);

} // equals

} // class DictionaryPair
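The equals and hashCode methods deserve a second look: both depend only on the key, so two pairs with the same key count as equal whatever values they hold. The sketch below copies that logic into a stand-alone class, renamed KeyedPair here so that it can be run without the Pair interface. It also shows one consequence that is easy to miss: a pair will report itself equal to a bare key, but the key will not return the favour.

```java
class KeyedPair<K, V>
{
    private final K key;
    private V value;

    KeyedPair (K aKey, V aValue)
    {   key = aKey;
        value = aValue;
    } // constructor

    K getKey () { return key; }

    V getValue () { return value; }

    @Override
    public int hashCode ()
    {   return key.hashCode();            // value plays no part
    } // hashCode

    @Override
    public boolean equals (Object o)
    {   if (o instanceof KeyedPair)
            return key.equals(((KeyedPair<?, ?>) o).getKey());
        else
            return key.equals(o);         // allows comparison with a bare key
    } // equals

    public static void main (String[] args)
    {
        KeyedPair<Integer, String> p = new KeyedPair<Integer, String>(7, "VII");
        KeyedPair<Integer, String> q = new KeyedPair<Integer, String>(7, "seven");
        System.out.println(p.equals(q));                     // true: same key
        System.out.println(p.hashCode() == q.hashCode());    // true
        System.out.println(p.equals(7));                     // true
        System.out.println(Integer.valueOf(7).equals(p));    // false: not symmetric
    } // main
} // class KeyedPair
```

The general contract for equals in the Java documentation asks for symmetry, so the comparison with a bare key is a convenience that should be used with care outside the dictionary classes.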

Implementation of the Dictionary Data Structure

What will the lists that we construct using the ListDictionary class look like? We stated above that we would keep the lists sorted on the key values. For a trivial example where the key is an integer, and the associated value is a string with the equivalent number in Roman numerals, we might have something like the following (note that the diagram has been simplified slightly):

If we needed to add a node with the key value 7, we would get:

This example does not show another important point about our dictionary class, namely that the keys must be unique. If, for some odd reason, we came to add another node with a key of 7, we would replace the existing node rather than have two nodes with the same key. With this in mind, we can consider the next part of the ListDictionary class. This is a private method that is used by several of the other methods. It either inserts a new node into the list with a specified key and returns a pointer to it, or returns a pointer to the existing node with that key.

private ListNode findNode (K aKey)

{ ListNode prev = null;

ListNode curr = dict;

while (curr != null && aKey.compareTo(curr.getKey()) > 0)

{ prev = curr;

curr = curr.next;

}

if (curr == null || aKey.compareTo(curr.getKey()) != 0)

// Insert new entry

{ ListNode n = new ListNode(aKey);

n.next = curr;

if (prev == null)

dict = n;

else

prev.next = n;

curr = n;

}

return curr;

} // findNode

Notice how we only insert the key field here, and then only when necessary; the value is left undefined. This is dealt with by the public methods that make use of this private, “internal” method.
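The behaviour of findNode can be tried out in isolation. The following sketch is a cut-down stand-in for ListDictionary, not the class itself: the keys are plain int values and the values are strings, so no generics or DictionaryPair are needed, and the sample keys are an assumption in the spirit of the Roman numerals example. The findNode logic follows the listing above.

```java
class SortedListDemo
{
    static class Node
    {
        final int key;
        String value;              // stays null until a value is supplied
        Node next;
        Node (int aKey) { key = aKey; }
    }

    private Node dict;             // first node of the sorted list

    // Return the node holding aKey, inserting a new one in key order if needed
    private Node findNode (int aKey)
    {
        Node prev = null;
        Node curr = dict;
        while (curr != null && aKey > curr.key)
        {   prev = curr;
            curr = curr.next;
        }
        if (curr == null || curr.key != aKey)      // insert new entry
        {   Node n = new Node(aKey);
            n.next = curr;
            if (prev == null)
                dict = n;
            else
                prev.next = n;
            curr = n;
        }
        return curr;
    } // findNode

    void insert (int aKey, String aValue)
    {   findNode(aKey).value = aValue;             // new entry, or update of an old one
    } // insert

    String get (int aKey)
    {   return findNode(aKey).value;               // creates the entry if necessary
    } // get

    // The keys in list order, separated by spaces
    String keys ()
    {
        StringBuilder sb = new StringBuilder();
        for (Node n = dict; n != null; n = n.next)
            sb.append(n.key).append(' ');
        return sb.toString().trim();
    } // keys

    static SortedListDemo sample ()
    {
        SortedListDemo d = new SortedListDemo();
        d.insert(10, "X");
        d.insert(1, "I");
        d.insert(5, "V");
        d.insert(7, "vii");
        d.insert(7, "VII");        // same key: the existing entry is updated
        return d;
    } // sample

    public static void main (String[] args)
    {
        SortedListDemo d = sample();
        System.out.println(d.keys());    // prints 1 5 7 10
        System.out.println(d.get(7));    // prints VII
    } // main
} // class SortedListDemo
```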


Inserting a Key

Moving on to the public methods of the class, we have the usual constructor, etc. The first method of more interest is the insert method. We will develop three overloaded versions of this, all of which make use of the private findNode method we have already seen. These variations allow one to optionally specify a value (the key, of course, must be specified whenever we insert a new entry) or to provide the data as an existing Pair object.

public void insert (K aKey, V aValue)

// Insert new element or update existing one

{ ListNode curr;

curr = findNode(aKey);

assert (curr != null && aKey.equals(curr.getKey()));

curr.setValue(aValue);

} // insert

public void insert (K aKey)

// Overloaded version of insert above

{ ListNode curr;

curr = findNode(aKey);

assert (curr != null && curr.getKey().equals(aKey));

} // insert

public void insert (Pair<K, V> p)

// Overloaded version of insert above

{ ListNode curr;

curr = findNode(p.getKey());

assert (curr != null && curr.getKey().equals(p.getKey()));

curr.setValue(p.getValue());

} // insert

Note the assertions in these methods. They are there to check that the private findNode method has done its job correctly (i.e. they are checking the postconditions of the findNode method).

Removing a Key

The next public method is the counterpart to insert, remove, and again is quite straightforward:

public void remove (K aKey)

// Remove an entry from a dictionary

{ ListNode curr = dict, prev = null;

while (curr != null && aKey.compareTo(curr.getKey()) > 0)

{ prev = curr;

curr = curr.next;

}

if (curr != null && aKey.compareTo(curr.getKey()) == 0)

// Remove this dictionary entry

{ if (prev == null)

dict = curr.next;

else

prev.next = curr.next;

}


// else entry not found - ignore

} // remove

Accessing an Entry

The next method we define is to access an entry in a dictionary list, by specifying the key. The interesting feature of this method is that it creates a new entry (using the private findNode method) if there is no existing one. So whatever entry a client program accesses is guaranteed to exist. Due to the way we have written the findNode method this becomes very simple. The definition of the get method is as follows:

public V get (K aKey)

// Access an entry in a dictionary, creating it if necessary

{ ListNode curr;

curr = findNode(aKey);

assert curr != null && aKey.equals(curr.getKey());

return curr.getValue();

} // get

Iterating Over the Entries

The next method overcomes a potential problem with the kind of access that we have chosen to use. If, for example, the key type being used were a string type, there would be no simple way to run a for loop over all the key values. In fact, even if the key type is something like int, we still have a problem because the values of the keys may not be contiguous (as was the case with the Roman numbers example we looked at earlier). In order to get around this we make use of an iterator again. In this case, the iterator will allow us to work through the dictionary in the order specified by the keys. The inner class and the method that provides the client with an iterator object are shown below:

private class ListDictionaryIterator implements Iterator<Pair<K, V>>

{ private ListNode nextEntry;

public ListDictionaryIterator (ListNode first)

{ nextEntry = first;

} // constructor

public Pair<K, V> get ()

{ return nextEntry;

} // get

public void next ()

{ nextEntry = nextEntry.next;

} // next

public boolean atEnd ()

{ return nextEntry == null;

} // atEnd

} // class ListDictionaryIterator

public Iterator<Pair<K, V>> getIterator ()

// Get an Iterator for this ListDictionary


{ return new ListDictionaryIterator(dict);

} // getIterator

This is a little simpler than the iterators for trees, as the data is already in a list and the iterator object simply has to keep track of where in the list it is currently positioned (using the nextEntry variable).

The use of the generic mechanisms for this class is a little more complex and quite interesting. The generic specification Iterator<Pair<K, V>> effectively states that we are dealing with an Iterator of Pairs. The Pair objects in turn are composed of two generic types, K and V. We will see how this is used in practice shortly.
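To make the nesting of the generic types concrete, here is a small self-contained sketch. The Pair and Iterator interfaces are simplified stand-ins, inferred only from the way they are used in these notes (getKey, getValue, get, next, atEnd), and the class name NestedGenericsDemo and the helper methods pair, over and describe are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class NestedGenericsDemo {
    // Simplified stand-ins for the course's Pair and Iterator interfaces (assumed shapes)
    interface Pair<K, V> { K getKey(); V getValue(); }
    interface Iterator<T> { T get(); void next(); boolean atEnd(); }

    // Build a simple immutable pair
    static <K, V> Pair<K, V> pair(final K k, final V v) {
        return new Pair<K, V>() {
            public K getKey() { return k; }
            public V getValue() { return v; }
        };
    }

    // An Iterator of Pairs over a list: T is instantiated as Pair<K, V>
    static <K, V> Iterator<Pair<K, V>> over(final List<Pair<K, V>> items) {
        return new Iterator<Pair<K, V>>() {
            private int pos = 0;
            public Pair<K, V> get() { return items.get(pos); }
            public void next() { pos++; }
            public boolean atEnd() { return pos >= items.size(); }
        };
    }

    // Client code: K and V are fixed as String and Integer
    static String describe(Iterator<Pair<String, Integer>> it) {
        StringBuilder sb = new StringBuilder();
        while (!it.atEnd()) {
            Pair<String, Integer> p = it.get();
            sb.append(p.getKey()).append('=').append(p.getValue()).append(';');
            it.next();
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Pair<String, Integer>> items = new ArrayList<Pair<String, Integer>>();
        items.add(NestedGenericsDemo.<String, Integer>pair("I", 1));
        items.add(NestedGenericsDemo.<String, Integer>pair("V", 5));
        System.out.println(describe(over(items)));
    }
}
```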

Other Operations

The last three methods are a simple check on whether the dictionary contains a given entry, a check on whether the dictionary is empty, and a mechanism to delete all the entries in a dictionary:

public boolean contains (K aKey)

// Tell whether dictionary contains aKey

{ ListNode curr = dict;

while (curr != null &&

aKey.compareTo(curr.getKey()) > 0)

curr = curr.next;

return (curr != null &&

aKey.compareTo(curr.getKey()) == 0);

} // contains

public boolean isEmpty ()

// Tell whether dictionary is empty

{ return dict == null;

} // isEmpty

public void makeEmpty ()

// Delete all entries in dictionary

{ dict = null;

} // makeEmpty

7.2.3 An Example of the Use of a Dictionary

Note: This entire section deals with the client view, and will not be highlighted.

A common text-processing problem is that of producing a concordance (or index) for a document. We will develop a simple concordance program that records all the words that appear in a file and keeps track of the line numbers of all occurrences of each word. In order to do this we will need to use strings (the words) as the key type for the dictionary. What can we use for the value type? Well, the values we need to store are the line numbers of the appearances of a word. To do this we need some sort of list of integers. Fortunately, we have several possible ways of keeping lists of integers, so we can use any one of them. In fact, we will use the IntegerList class from chapter four. Writing the concordance program is then very easy:

public static void main (String args[]) throws IOException

{ Dictionary<String, IntegerList> dict =

new ListDictionary<String, IntegerList>();


In this way we proceed through the text building up lists of line numbers
for all the words in the text. The last part of the program then makes use
of the getIterator method for the dictionary to work through all the
entries printing them out.

Table 7.1: Sample Text

String line = null;

int lineNo = 1;

BufferedReader in = new BufferedReader(new FileReader("sample.txt"));

while ((line = in.readLine()) != null)

{ StringTokenizer st = new StringTokenizer(line);

while (st.hasMoreTokens())

{ String word = st.nextToken();

IntegerList lst = dict.get(word);

if (lst == null) // First entry for this word

{ lst = new IntegerList();

lst.add(lineNo);

dict.insert(word, lst);

}

else // Simply add new line number

lst.add(lineNo);

}

lineNo++;

}

// Now print out the index

Iterator<Pair<String, IntegerList>> it = dict.getIterator();

while (! it.atEnd())

{ Pair<String, IntegerList> p = it.get();

System.out.println(p.getKey() + ": " + p.getValue());

it.next();

}

} // main

This program makes use of the StringTokenizer class. This is a standard Java class that allows us to break a string up into words (it’s a little like an iterator for the words in a string). We take each word from the string and use it to access the dictionary ADT (using the get method). This will automatically create an entry for the word if it was not in the dictionary already. In any case, it returns the associated value, which in this case is an IntegerList containing the list of line numbers. If this is null then we know that this is the first time we have seen the word, so we create a new IntegerList, add the current line number and then insert an entry for the current word with the new list. If the list is not null then we have seen the word before and all that is required is to add the current line number to the list.
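As a quick illustration of StringTokenizer on its own, the following sketch collects the words of a single line into a list. With the default delimiters the string is split on whitespace only, so punctuation stays attached to the neighbouring word. The class name TokenizerDemo and the method words are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDemo {
    // Break a line into whitespace-separated tokens
    static List<String> words(String line) {
        List<String> result = new ArrayList<String>();
        StringTokenizer st = new StringTokenizer(line); // Default delimiters: whitespace
        while (st.hasMoreTokens())
            result.add(st.nextToken());
        return result;
    }

    public static void main(String[] args) {
        // "text." keeps its full stop, since '.' is not a delimiter
        System.out.println(words("in the text. The last"));
    }
}
```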

In this way, the program proceeds through the text, building up lists of line numbers for all the words in the text. The last part of the program then makes use of an iterator for the dictionary to work through all the entries, printing them out.

As an example, the output of this program when given the text shown in Table 7.1 (i.e. the text of an earlier version of the previous paragraph) as input was:


In: [1]

The: [2]

all: [2, 3]

building: [1]

dictionary: [3]

entries: [4]

for: [2, 3]

getIterator: [3]

in: [2]

last: [2]

line: [1]

lists: [1]

makes: [2]

method: [3]

numbers: [1]

of: [1, 2, 3]

out.: [4]

part: [2]

printing: [4]

proceed: [1]

program: [2]

text: [1]

text.: [2]

the: [1, 2, 2, 2, 3, 3, 3]

them: [4]

then: [2]

this: [1]

through: [1, 3]

to: [3]

up: [1]

use: [2]

way: [1]

we: [1]

words: [2]

work: [3]

This program is a little simplistic in the way that it handles punctuation symbols (see the second entry for “text” above) and mixed-case words (like “The”), but it serves to illustrate the use of the dictionary and the process of producing a concordance. We should probably also deal with the multiple line entries that show up for common words (like the word “the” above). These problems are addressed by the following exercises.

Exercise 7.1 Develop your own class for words. This should remove any characters other than alphanumeric characters, hyphens and underscores and ignore case differences. You will need to implement the Comparable interface for your class (i.e. provide a compareTo method for it). Use this class as the key type for the concordance program.

Exercise 7.2 Develop an integer set class. This will be similar to the integer list class (IntegerList) used above, but should ignore multiple entries of values. Use this class for the value type in the concordance program.


Key          Hash Value
advanced     26
computer     76
elephant     60
function     16
george       83
hash         70
programming  59
science      39
second       36
words        59
year          8

Table 7.2: Examples of Hash Values

7.3 Hash Tables

Our dictionary implementation is very useful, but may be inefficient when dealing with many entries. This arises from the fact that we need to search through the internal list of nodes in order to find any entry. Since we keep the list in ascending order of keys we don’t necessarily have to search through the entire list, but, on average, we are going to have to search through half the list every time. If only there was some way in which we could immediately access the node we required! It is this problem that is addressed by a hash table. The central data structure in a hash table is an array. When we access the table we take the key and “hash” it: turn it into some numeric value in the range of the subscripts of the array. This hashed key value is then used to access the table entry we require. Thus the major factor in successfully implementing a hash table is the hashing function which turns the key into a number that can be used as an array subscript.

7.3.1 Hashing Functions

The exact nature of a hashing function depends on the type of the key, of course. If the keys are numeric, then we can often perform some simple arithmetic operation on them. More often, however, we are dealing with strings as keys (as in the concordance example in the previous section). Other examples would be the form of student number used at Rhodes (609A1234), or people’s surnames. In this case we need to do something to convert the string into a numeric value. A simple approach is to add together all the Unicode (or ASCII) values of the characters, with or without some sort of scaling. A Java method for doing this might be as follows:

public int hash (String key)

{ int h = 0;

for (int k = 0; k < key.length(); k++)

h = h * 26 + (int)key.charAt(k);

return (h & 0x7FFFFFFF) % SIZE;

} // hash

Here h is the hashed value, and SIZE is the number of entries in the array being used. The code in the last line that reads (h & 0x7FFFFFFF) performs a bit-wise AND operation. What will the effect of this be?

As examples, some random words (the keys) and their corresponding hash values when using this hash function (SIZE has been taken as 100 for this example) are shown in Table 7.2. There are several points to notice about this. Firstly, the process of hashing removes any ordering that was originally present in the data. These words, in alphabetic order originally, would be assigned to entries in the hash table that bear no resemblance to that order. Secondly, the keys “programming” and “words” have hashed to exactly the same value (59). (This is bound to happen if we think about it, since we are taking a potentially infinite set of possible words, and mapping them to a set of only 100 integer values.) This is known as a collision. Dealing with collisions is one of the most important aspects of hash tables.
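The values in Table 7.2 can be explored with a small sketch such as the following. It uses the hash function given earlier, except that the table size is passed as a parameter rather than taken from a SIZE constant (an assumption made here so that the example is self-contained), and it reports any word that lands on a position that is already in use. The class name StringHashDemo is hypothetical.

```java
class StringHashDemo {
    // The string hash function from the text, with the table size as a parameter
    static int hash(String key, int size) {
        int h = 0;
        for (int k = 0; k < key.length(); k++)
            h = h * 26 + (int) key.charAt(k); // May overflow and wrap around for long keys
        return (h & 0x7FFFFFFF) % size;
    }

    public static void main(String[] args) {
        String[] words = { "advanced", "computer", "elephant", "function", "george",
                           "hash", "programming", "science", "second", "words", "year" };
        String[] slot = new String[100]; // Which word first claimed each position
        for (String w : words) {
            int h = hash(w, 100);
            System.out.println(w + " -> " + h);
            if (slot[h] != null)
                System.out.println("  collision with " + slot[h]);
            else
                slot[h] = w;
        }
    }
}
```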

Collision Handling

One way of preventing collisions is to find a so-called perfect hashing function. A perfect hashing function is guaranteed to map a number of distinct values to separate integer values. Such hash functions can be constructed for very limited domains. For example, there are perfect hash functions for the sets of reserved words for various programming languages. For the reserved words in Modula-2, a perfect hashing function is:

// Perfect hashing function for Modula-2 reserved words

len = key.length();

h = (256 * (int)key.charAt(0) + (int)key.charAt(len-1) + len) % 139;

where key is the reserved word being hashed. This gives forty unique values in the range 0–138 for the forty reserved words in Modula-2. While this is a perfect hashing function it is less than optimal as the other 99 entries in a table using this approach would be unused. If we wish to make use of every entry in the table (looked at from another perspective, if we wanted to use a table with only forty entries for the Modula-2 reserved words) we need a perfect minimal hashing function. This must not only map the strings of interest to a set of unique values, but these values must also be contiguous. Such a function can be constructed for the set of Modula-2 reserved words, but will not be presented here.
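Whether a hashing function is perfect for a particular set of keys can be checked mechanically: hash every key and confirm that no value occurs twice. The sketch below wraps the formula above in a method and applies such a check to a handful of words only. The class name PerfectHashDemo and the method isPerfectFor are hypothetical, and the short word lists are chosen purely for illustration (SAT is not a reserved word).

```java
import java.util.HashSet;
import java.util.Set;

class PerfectHashDemo {
    // The hashing function from the text, for upper-case reserved words
    static int hash(String key) {
        int len = key.length();
        return (256 * (int) key.charAt(0) + (int) key.charAt(len - 1) + len) % 139;
    }

    // True if no two of the given (distinct) words hash to the same value
    static boolean isPerfectFor(String[] words) {
        Set<Integer> seen = new HashSet<Integer>();
        for (String w : words)
            if (!seen.add(hash(w))) // add returns false for a value already seen
                return false;
        return true;
    }

    public static void main(String[] args) {
        String[] some = { "IF", "DO", "OF", "OR", "TO" };
        System.out.println(isPerfectFor(some));
        // Same first letter, last letter and length, so these two must collide
        System.out.println(isPerfectFor(new String[] { "SET", "SAT" }));
    }
}
```

The second call shows the limitation of this particular formula: it only looks at the first character, the last character and the length, so it is perfect only for carefully chosen sets of words.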

So, perfect hashing functions are a solution to the problem of collisions. The only problem is that, as stated above, they are only available for very limited domains (such as small sets of reserved words in programming languages). In general, it is extremely unlikely that we are going to be able to construct a perfect hashing function for the keys we are using. Indeed, in many applications (for example, the concordance program) we may not even know what the keys are until the program is run. In these cases we need to choose the best hashing function we can find, and then make other arrangements to deal with the collisions. These give rise to two types of hash tables: those with internal hashing and those with external hashing.

7.3.2 Internal Hashing with Open Addressing

Consider the problem of inserting a new item into a hash table. When using internal hashing we apply our hash function to the key to find the entry in the hash table where we expect to be able to put the given key. If a collision has occurred we will find that that position in the table is already occupied. In this case we choose another location for the item. There are a number of ways of doing this. The simplest is to look sequentially from the entry given by the hash function until we find an unoccupied entry, and place the new entry there. For example, given the set of words above, if we came upon the word “programming” first, it would be placed in position 59. When we later came to add the word “words” we would find position 59 in use and would place “words” in position 60. If position 60 had already been occupied by some other key, we would put it in position 61, etc., etc. This means that, when we try to locate an item in the table, we must be prepared to search through part of the table in case the item that we are looking for was affected by a collision and stored in one of the following positions. This search need only continue until either the desired key value is located or the first unoccupied entry is found (in which case the desired key value is obviously not present). The process of searching through the successive positions in the hash table is known as probing.
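The insertion and search procedures just described can be sketched in a few lines. This is a simplified illustration rather than the class developed later in this section: it stores keys only, does not support deletion, and deliberately uses a very poor hash function (the length of the key) so that collisions are easy to provoke. The class name LinearProbingDemo and its methods are hypothetical.

```java
class LinearProbingDemo {
    private final String[] table;

    LinearProbingDemo(int size) {
        table = new String[size];
    }

    // Deliberately weak hash function (an assumption for this demo): key length
    int hash(String key) {
        return key.length() % table.length;
    }

    // Insert key, probing sequentially from its hash position; returns the slot used
    int insert(String key) {
        int index = hash(key);
        int probes = 0;
        while (table[index] != null && !table[index].equals(key)) {
            if (++probes >= table.length)
                throw new IllegalStateException("table full");
            index = (index + 1) % table.length; // Next slot, wrapping around
        }
        table[index] = key;
        return index;
    }

    // Return the slot holding key, or -1 if an empty slot is reached first
    int find(String key) {
        int index = hash(key);
        for (int probes = 0; probes < table.length && table[index] != null; probes++) {
            if (table[index].equals(key))
                return index;
            index = (index + 1) % table.length;
        }
        return -1;
    }

    public static void main(String[] args) {
        LinearProbingDemo t = new LinearProbingDemo(10);
        System.out.println(t.insert("programming")); // length 11, so hash position 1
        System.out.println(t.insert("a"));           // also hashes to 1: collision
        System.out.println(t.find("a"));
    }
}
```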


We will restrict our discussion to linear probing where we examine successive positions as described above. However, you should be aware that other probing techniques exist where the entries to be examined are not just the immediately following ones. In some techniques a second hash function is applied; in others a constant, or variable, amount is added to the given position to generate the probe sequence; and so on.

There is one important consequence of this, which arises when we remove items from the hash table. If we remove the word “programming” from the hash table, we cannot simply mark the entry as unoccupied again. If we did so then a search to locate the word “words” would fail even though it was still in the hash table. A solution to this problem is to store the state of the entry in a field. This state can take on one of three values: empty, occupied or deleted. When searching for an entry we can only halt when we reach an empty entry. A deleted entry may have been there when we originally inserted the item for which we are looking and so we must keep on looking.
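The difference between the two ways of removing an entry can be demonstrated with a short sketch. It is not the implementation developed below: here a separate boolean array records the deleted state, and the hash function is simply the length of the key, so that two five-letter words are guaranteed to collide. All of the names (TombstoneDemo, removeByClearing, removeByMarking) are hypothetical.

```java
class TombstoneDemo {
    private final String[] keys;
    private final boolean[] deleted; // true marks a "deleted" slot (assumed representation)

    TombstoneDemo(int size) {
        keys = new String[size];
        deleted = new boolean[size];
    }

    private int hash(String key) {
        return key.length() % keys.length; // Weak hash, chosen to force collisions
    }

    // Probe from the hash position, halting only at a genuinely empty slot
    private int locate(String key) {
        int index = hash(key);
        for (int probes = 0; probes < keys.length && keys[index] != null; probes++) {
            if (keys[index].equals(key))
                return index;
            index = (index + 1) % keys.length;
        }
        return -1;
    }

    void insert(String key) {
        int index = hash(key);
        int probes = 0;
        while (keys[index] != null && !keys[index].equals(key)) {
            if (++probes >= keys.length)
                throw new IllegalStateException("table full");
            index = (index + 1) % keys.length;
        }
        keys[index] = key;
        deleted[index] = false;
    }

    boolean contains(String key) {
        int index = locate(key);
        return index >= 0 && !deleted[index];
    }

    // Correct: the slot stays in place, so later probes continue past it
    void removeByMarking(String key) {
        int index = locate(key);
        if (index >= 0)
            deleted[index] = true;
    }

    // Incorrect: the slot becomes empty, cutting off keys stored after it
    void removeByClearing(String key) {
        int index = locate(key);
        if (index >= 0)
            keys[index] = null;
    }

    public static void main(String[] args) {
        TombstoneDemo good = new TombstoneDemo(10);
        good.insert("hello");
        good.insert("words"); // Same length, so it is displaced to the next slot
        good.removeByMarking("hello");
        System.out.println(good.contains("words"));

        TombstoneDemo bad = new TombstoneDemo(10);
        bad.insert("hello");
        bad.insert("words");
        bad.removeByClearing("hello");
        System.out.println(bad.contains("words"));
    }
}
```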

Another consequence of this approach is that collisions will become more frequent as the table fills up. When this happens we get a phenomenon known as clustering where groups of keys occupy a number of neighbouring entries. In this regard, we distinguish between primary clustering and secondary clustering. Primary clustering is caused by several keys mapping to the same location and so occupying successive entries. Secondary clustering occurs when a key maps to an entry which is already occupied by a displaced key. In other words a previous collision has caused a collision between two keys with different hash values. As the table fills up both types of clusters grow, collisions increase in frequency, and the efficiency of accessing the hash table decreases. The only real solution to this is to make the table somewhat larger than it is anticipated will actually be needed. (Perhaps a better solution is to use external hashing instead, but we will return to that topic in the next section.) Alternative probing techniques may partially decrease the problem of secondary clustering.

A further performance penalty can also arise when we have frequent deletions from the table. In this case searches for particular keys may have to consider a large number of deleted entries. If deletions do occur frequently then it is a good idea to reconstruct the table from time to time, compacting the clusters and removing the unused entries.

With all of this in mind we can turn to the implementation of a Java class for a hash table.

[Class diagram: InternalHashTable, with data members table and numEntries; methods hash, insert, remove, contains, get, getIterator, isEmpty and makeEmpty; and nested class TableEntry with fields key, value and occupied.]


Data Members and the Hash Function

The first part of the file (InternalHashTable.java) is as shown below. The three possible “status values” of the entries are handled as follows:

empty: The entry in the array is set to null

occupied: The entry in the array is not null, and the occupied flag is true

deleted: The entry in the array is not null, and the occupied flag is false

We will also use the same DictionaryPair class to hold the data (i.e. key/value pairs) as we used for the ListDictionary class, and extend this to hold the “occupied” status of each entry.

Another important consideration is how we handle the hash function. This class has no idea of what data type might be used for the keys that it stores, so how is it to work out a hash value? Fortunately Java comes to our rescue here: all Java objects have a built-in hash function method as standard (it is defined in the Object class and so it is inherited by all Java classes). This method is called hashCode and returns an int value.

public class InternalHashTable<K, V> implements Dictionary<K, V>

{ private static final int DEF_SIZE = 101; // Default table size

private class TableEntry<K, V> extends DictionaryPair<K, V>

{ boolean occupied = false; // Used to mark deleted entries

public TableEntry (K aKey, V aValue)

{ super(aKey, aValue);

}

public TableEntry (K aKey)

{ super(aKey);

}

} // inner class TableEntry

private TableEntry<K, V>[] table;

private int numEntries; // Number of occupied and deleted slots

private int hash (K aKey)

// Scale hash value for table size

{ return ((aKey.hashCode() & 0x7FFFFFFF) % table.length);

} // hash

...

} // class InternalHashTable

The nested TableEntry class allows us to record the key and the value associated with an entry (inherited from the DictionaryPair class), as well as its status (using the occupied field, as discussed above). The variable table is a reference to the array used to hold the entries, and numEntries keeps track of the number of occupied and deleted slots (i.e. the ones that are not genuinely empty). Since the Java hashCode method may return any integer value, including a negative one, we need the hash method, which ensures that the hashcode is non-negative and scales it by taking the remainder after dividing by the size of the array being used.
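The need for the masking step can be seen in a short sketch. In Java the remainder operator keeps the sign of its left operand, so a negative hashcode would give a negative (and therefore invalid) array subscript, and Math.abs does not help for the single value Integer.MIN_VALUE. The class name HashIndexDemo and the method index are hypothetical; the expression inside index is the one used by the hash method above.

```java
class HashIndexDemo {
    // Turn any hashcode into a valid subscript for a table of the given length
    static int index(Object key, int tableLength) {
        return (key.hashCode() & 0x7FFFFFFF) % tableLength; // Clear the sign bit first
    }

    public static void main(String[] args) {
        Integer key = Integer.valueOf(-7);                 // An Integer's hashcode is its value
        System.out.println(key.hashCode() % 101);          // Negative: unusable as a subscript
        System.out.println(index(key, 101));               // Always in the range 0..100
        System.out.println(Math.abs(Integer.MIN_VALUE));   // Still negative
        System.out.println(index(Integer.valueOf(Integer.MIN_VALUE), 101));
    }
}
```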


Constructors

Turning to the public part of the class, we have the usual constructors for this type of class, allowing the size of the table to be specified by the client if necessary.

public InternalHashTable (int initSize)

// Constructor

{ table = new TableEntry[initSize];

numEntries = 0;

} // Constructor

public InternalHashTable ()

// Constructor

{ this(DEF_SIZE);

} // Constructor

The first constructor simply has to initialise the private data members of the class. The second constructor simply calls on the first with the size set to the defined default value (101).

Inserting a Key

Moving on to the more interesting methods, insert is responsible for adding a new entry to the hash table. This involves hashing the key to find the position in the hash table and then resolving a collision if it occurs. This is further complicated by the need to keep unique keys in the hash table. If we find an existing or deleted entry with the same key, we simply update the value. Notice how we “wrap around” the end of the array as we increment index. This is very similar to the technique we used for a circular queue previously (see p. 70). We also make sure that there is always at least one empty slot in the hash table. This simplifies the probing loops as they are assured of stopping when an empty slot is found.

public void insert (K aKey, V aValue)

{ int index = hash(aKey);

while (table[index] != null && !table[index].getKey().equals(aKey))

{ index = (index + 1);

if (index >= table.length) // wraparound

index = 0;

}

if (table[index] == null) // Insert new entry

{ if (numEntries + 1 >= table.length) // Out of space?

throw new NoSpaceAvailableException("no space available in hash table");

table[index] = new TableEntry<K, V>(aKey, aValue);

table[index].occupied = true;

numEntries++;

}

else // Update existing or deleted entry

{ table[index].setValue(aValue);

if (! table[index].occupied) // Undelete it

{ table[index].occupied = true;

}

}

} // insert

There is also a very simple variation on the insert method that takes only a key value.


Removing a Key

The converse of inserting a key is performed by the remove method. This is quite straightforward again. It involves searching for the item (hashing the key and resolving any collisions). If the item is found then the occupied field is simply changed to false.

public void remove (K aKey)

// Remove an entry from a hash table

{ int index = hash(aKey);

while (table[index] != null &&

!table[index].getKey().equals(aKey))

{ index = (index + 1);

if (index >= table.length) // wraparound

index = 0;

}

if (table[index] != null)

{ table[index].occupied = false;

}

} // remove

Accessing an Entry

The next method, get, is similar in some ways to the insert method as it must insert a new entry if the specified key value is not yet present in the hash table. Notice here again how we ensure that there is always at least one empty slot to ensure that the probing process works correctly.

    public V get (K aKey)
    // Access an entry in a hash table, creating it
    // if necessary
      { int index = hash(aKey);
        while (table[index] != null && !table[index].getKey().equals(aKey))
          { index = (index + 1);
            if (index >= table.length) // wraparound
              index = 0;
          }
        if (table[index] == null) // Insert new entry in an empty slot
          { if (numEntries + 1 >= table.length) // Out of space?
              throw new NoSpaceAvailableException("no space available in hash table");
            table[index] = new TableEntry<K, V>(aKey);
            table[index].occupied = true;
            numEntries++;
          }
        else if (! table[index].occupied) // Recreate a deleted entry
          { // This slot is already counted in numEntries, so it is not counted again
            table[index] = new TableEntry<K, V>(aKey);
            table[index].occupied = true;
          }
        assert aKey.equals(table[index].getKey());
        return table[index].getValue();
      } // get

Other Operations

The contains method is similar to get as it also has to search for a given key and be careful of the table being full. The isEmpty and makeEmpty methods are quite straightforward.

public boolean contains (K aKey)


// Tell whether the hash table contains aKey

{ int index = hash(aKey);

while (table[index] != null && !table[index].getKey().equals(aKey))

{ index = (index + 1);

if (index >= table.length) // wraparound

index = 0;

}

return (table[index] != null &&

table[index].occupied &&

table[index].getKey().equals(aKey));

} // contains

public boolean isEmpty ()

// Tell whether hash table is empty

{ for (int k = 0; k < table.length; k++)

if (table[k] != null && table[k].occupied) // Found an occupied slot

return false;

return true;

} // isEmpty

public void makeEmpty ()

// Delete all entries in the hash table

{ for (int k = 0; k < table.length; k++)

{ table[k] = null;

}

numEntries = 0;

} // makeEmpty

Iterating Over the Entries

The last thing to consider is how to handle iterators for this class. One important point to note about this is that it is not possible to iterate over the entries in the table in any kind of order. This is due to the fact that the hashing function will have randomised the positions at which the keys are placed. There is not much we can do about this except to work through the table from the beginning to the end. Unfortunately, this has a very large impact on the kinds of applications for which we might want to use a hash table. If the order of the information is of any importance then a hash table will not be a suitable implementation. Anyway, here is the inner class and the getIterator method:

private class HashTableIterator implements Iterator<Pair<K, V>>

{ private int index;

public HashTableIterator ()

{ for (index = 0; index < table.length; index++)

if (table[index] != null && table[index].occupied)

// First non-empty slot

break;

} // constructor

public Pair<K, V> get ()

{ return table[index];

} // get


public void next ()

{ while (++index < table.length)

{ if (table[index] != null && table[index].occupied) // Found more data

{ break;

}

}

} // next

public boolean atEnd ()

{ return index >= table.length;

} // atEnd

} // inner class HashTableIterator

public Iterator<Pair<K, V>> getIterator ()

// Get an Iterator for this hash table

{ return new HashTableIterator();

} // getIterator

Note again how we use an iterator of Pair<K, V> objects. Notice too how the constructor of the inner class sets up the iterator by searching for the first occupied entry.

Some Comments

That ends our hash table implementation using internal hashing. We have used the same interface (Dictionary) that we had for the ListDictionary class, and so it can be used as a more efficient mechanism for applications such as the concordance problem we looked at previously. The only difference in the use of this class (and an aesthetically unpleasing one, unfortunately) is that the results will not be in alphabetic order when we print out the concordance when using a hash table.
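If alphabetic output is important, one possible workaround (not part of the classes in these notes) is for the client to copy the keys out of the table and sort them before printing, accepting the extra cost of the sort. The following is a minimal sketch in which a plain array of strings with null gaps stands in for the slots of a hash table; the class name SortedOutputDemo and the method sortedKeys are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SortedOutputDemo {
    // Collect the keys found in the occupied slots and return them in ascending order
    static List<String> sortedKeys(String[] slots) {
        List<String> keys = new ArrayList<String>();
        for (String s : slots)
            if (s != null) // Skip empty slots
                keys.add(s);
        Collections.sort(keys);
        return keys;
    }

    public static void main(String[] args) {
        // Keys in "table order", as an iterator over a hash table might deliver them
        String[] slots = { null, "words", null, "hash", "advanced" };
        System.out.println(sortedKeys(slots));
    }
}
```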

Exercise 7.3 Modify the implementation of the InternalHashTable class to report on collisions. See what effect choosing different table sizes has on the number of collisions. Experiment with other probing techniques and see what effect they have on the number of collisions.

Exercise 7.4 Write a rebuild method for this class. This should create a new array and then work through the existing array copying only the currently occupied entries across (if the new array is in place as the array pointed to by table then the existing insert method can be used to help with this). Your rebuild method should optionally allow the user to change the size of the array. Here is a suggested outline:

public void rebuild (int newSize)

// Reconstruct the hash table.

// If newSize > 0 then the new table can contain newSize elements,

// otherwise the size is unchanged

{ ... } // rebuild

7.3.3 External Hashing

We have seen quite a lot of the problems caused by collisions. The solution used in internal hashing is adequate, but starts to grow increasingly inefficient as the hash table fills up. As has already been commented, external hashing provides a different solution to this problem. The approach used in external hashing is to think of the hash table as providing a number of “buckets”. The hash function is applied to the key value in order to find the bucket into which the entry must be placed. This means that the buckets may hold several entries (all those whose keys hash to the same value). In this way collisions are avoided by allowing several keys to map to the same value. The disadvantage is that, as the number of entries per bucket rises, we are forced to search through the entries in the bucket. Of course, this is no worse than the probing we have to do with internal hashing.

Figure 7.2: An External Hashing Table.

Let’s first consider a very simple (and rather ineffective) approach to external hashing. If we are building up a table of data about people (perhaps employees, or students) we might settle for the very simple hashing function of taking the first letter of a person’s surname and using this (as a value between 0 and 25) as the hash value. This works well in theory. For example, we could take the name “Wells”. This hashes to the value 22, and so we would add this entry to bucket number 22. The problem here arises from the fact that the distribution of first letters of surnames is not very even. A simple analysis of some names shows that about half of the entries will end up in only five of the twenty-six buckets. So, some buckets will be very full and others almost empty. All that this really means, of course, is that our hashing function is too simple. Any function that hashed surnames more effectively would give us a more even distribution of entries over the buckets.
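The first-letter hashing function is simple enough to write down directly, together with a count of how many names fall into each bucket. This is only a sketch: it assumes that every surname starts with one of the letters A to Z (in either case), and the class name SurnameBuckets and its methods are hypothetical.

```java
class SurnameBuckets {
    // Bucket number (0 to 25) taken from the first letter of the surname.
    // Assumes the surname starts with a letter A-Z or a-z.
    static int bucket(String surname) {
        return Character.toUpperCase(surname.charAt(0)) - 'A';
    }

    // Count how many of the given surnames land in each of the 26 buckets
    static int[] distribution(String[] surnames) {
        int[] counts = new int[26];
        for (String s : surnames)
            counts[bucket(s)]++;
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(bucket("Wells")); // 'W' is letter number 22, counting 'A' as 0
        int[] counts = distribution(new String[] { "Wells", "Wentworth", "Terry", "Sewry" });
        for (int b = 0; b < counts.length; b++)
            if (counts[b] > 0)
                System.out.println("bucket " + b + ": " + counts[b]);
    }
}
```

Running distribution over a real list of surnames is an easy way to see how uneven the buckets become.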

When implementing external hashing, we have to consider how we will implement the buckets. A simple approach is to build linked lists of entries. More efficient methods might be to use binary search trees, or even further hash tables for the buckets. We will restrict our discussion to considering simple, unsorted linked lists for the buckets. For the set of words we used to introduce the concepts of hashing we would have the data structure shown in Figure 7.2 (only the keys are shown for simplicity).


[Class diagram: HashTable, with data member table; methods hash, insert, remove, contains, get, getIterator, isEmpty and makeEmpty; and nested class EntryNode with fields key, value and next.]

Data Members and the Hash Function

Let’s implement a Java class to do external hashing. As before, we will use the standard Java hashCode method. We also set a default number of buckets.

public class ExternalHashTable<K, V> implements Dictionary<K, V>
  { private static final int DEF_SIZE = 101; // Default table size

    private class EntryNode<K, V> extends DictionaryPair<K, V>
      { EntryNode<K, V> next;

        public EntryNode (K aKey, V aValue)
          { super(aKey, aValue);
          }

        public EntryNode (K aKey)
          { super(aKey);
          }
      } // inner class EntryNode

    private EntryNode<K, V>[] table;

    private int hash (K aKey)
    // Scale hash value for table size
      { return ((aKey.hashCode() & 0x7FFFFFFF) % table.length);
      } // hash

    ...

  } // class ExternalHashTable

Notice the declaration of the data member called table. This is declared to be an array of EntryNodes. This allows us to create an array of lists, since each EntryNode is part of a linked list of other nodes. Each element of the array table represents one bucket. Note that we have dispensed with the numEntries field here, and have no direct way of telling how many items are stored in the hash table.
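The mask in the hash method deserves a brief illustration. In Java, hashCode may return a negative value and the % operator keeps the sign of its left operand, so without the mask the result could be a negative (invalid) array index. The sketch below repeats the same scaling in a hypothetical stand-alone class, HashScaling, which is an assumption made purely for demonstration.

```java
// Illustrative sketch only: HashScaling is a hypothetical helper class.
class HashScaling {
    // Same scaling as the hash method above, for a table of the given length.
    static int scale(int hashCode, int tableLength) {
        return (hashCode & 0x7FFFFFFF) % tableLength;   // clear the sign bit first
    }

    public static void main(String[] args) {
        int negative = -7;                               // hashCode() may be negative
        System.out.println(negative % 101);              // -7: not a valid array index
        System.out.println(scale(negative, 101));        // always in the range 0..100
        System.out.println(scale(Integer.MIN_VALUE, 101));
    }
}
```

Masking is also safer than Math.abs here, since Math.abs(Integer.MIN_VALUE) is still negative.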


The constructors are very simple in this case, and will not be shown here.

Inserting a Key

What do we need to do when we insert a new entry? Well, the hashing function can be used to identify the correct bucket. We then need to work through the linked list to ensure that we do not end up with duplicate entries. If this is a duplicate then we update the existing entry, otherwise we insert a new entry into the list. Notice how the work of the for loop is all taken care of in the control section so the body is empty.

    public void insert (K aKey, V aValue)
    // Insert new element or update existing one
      { int index = hash(aKey);
        // Look for aKey in linked list
        EntryNode<K, V> c;
        for (c = table[index];
             c != null && !c.getKey().equals(aKey);
             c = c.next)
          ;
        if (c == null) // Insert new node
          { EntryNode<K, V> n = new EntryNode<K, V>(aKey, aValue);
            n.next = table[index];
            table[index] = n;
          }
        else // Update existing entry
          c.setValue(aValue);
      } // insert

Other Operations

The rest of the methods are quite straightforward. We just have to keep in mind that we are dealing with an array of pointers to linked lists.

    public void remove (K aKey)
    // Remove an entry from a hash table
      { int index = hash(aKey);
        if (table[index] != null) // Look for node
          { EntryNode<K, V> c = table[index], p = null;
            while (c != null)
              { if (c.getKey().equals(aKey))
                  break;
                p = c;
                c = c.next;
              }
            if (c != null) // Unlink node
              { if (p == null)
                  table[index] = c.next;
                else
                  p.next = c.next;
              }
          }
      } // remove

    public V get (K aKey)
    // Access an entry in a hash table, creating it if necessary
      { int index = hash(aKey);
        EntryNode<K, V> c = table[index];
        while (c != null && !c.getKey().equals(aKey))
          c = c.next;
        if (c == null) // Insert new entry
          { EntryNode<K, V> n = new EntryNode<K, V>(aKey);
            n.next = table[index];
            table[index] = n;
            c = n;
          }
        assert aKey.equals(c.getKey());
        return c.getValue();
      } // get

    public boolean contains (K aKey)
    // Tell whether the hash table contains aKey
      { EntryNode<K, V> c = table[hash(aKey)];
        while (c != null && !c.getKey().equals(aKey))
          c = c.next;
        return (c != null);
      } // contains

    public boolean isEmpty ()
    // Tell whether hash table is empty
      { for (int k = 0; k < table.length; k++)
          if (table[k] != null)
            return false; // Found at least one entry
        return true; // Found no entries
      } // isEmpty

    public void makeEmpty ()
    // Delete all entries in the hash table
      { for (int k = 0; k < table.length; k++)
          table[k] = null; // Delete linked list
      } // makeEmpty

Notice how we have had to implement the isEmpty method. It searches through the table and returns false as soon as it finds the first entry. If it reaches the end of the table without finding any entries then it returns true. The makeEmpty method is used to clear all the linked lists (giving the Java garbage collector quite a bit of work to do as a result!).

Iterating Over the Entries

The iterator class for this ADT is a little more tricky than the previous ones as it needs to work through the array, and also through the linked lists for the buckets. The inner class that takes care of this is as follows:

    private class HashTableIterator implements Iterator<Pair<K, V>>
      { private int index;
        private EntryNode<K, V> nextEntry;

        public HashTableIterator ()
          { for (index = 0; index < table.length; index++)
              if (table[index] != null) // First non-empty bucket
                break;
            if (index < table.length) // We have some data
              nextEntry = table[index];
          } // constructor

        public Pair<K, V> get ()
          { return nextEntry;
          } // get

        public void next ()
          { nextEntry = nextEntry.next;
            if (nextEntry == null) // Look for next non-empty bucket
              while (++index < table.length)
                { if (table[index] != null) // Found more data
                    { nextEntry = table[index];
                      break;
                    }
                }
          } // next

        public boolean atEnd ()
          { return index >= table.length;
          } // atEnd

      } // class HashTableIterator

Again, the ExternalHashTable class uses the same interface as the first dictionary class and the hash table class using internal hashing. While it suffers from the same aesthetic problems as the previous hash table implementation, it has one major advantage: its size is not limited by the size of the array used. It can contain any number of entries (at least until we run out of memory for dynamically allocated objects). Of course, as the number of entries grows very much larger than the number of buckets, the linked lists start to grow very long and so we start to lose some of the performance benefits of using a hash table.
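To get a feel for how the bucket lists lengthen as entries outnumber buckets, the following sketch counts list lengths without building a real table. The class ChainLengths and its method are hypothetical, and the keys are simply the integers 0 to n-1, which is a best-case assumption because they spread perfectly evenly over the buckets.

```java
// Illustrative sketch only: ChainLengths is a hypothetical helper class.
class ChainLengths {
    // Length of the longest bucket list after inserting the keys 0..n-1,
    // using the same scaling as the hash method of ExternalHashTable.
    static int longestChain(int n, int buckets) {
        int[] length = new int[buckets];
        int longest = 0;
        for (int key = 0; key < n; key++) {
            int index = (Integer.hashCode(key) & 0x7FFFFFFF) % buckets;
            length[index]++;
            longest = Math.max(longest, length[index]);
        }
        return longest;
    }

    public static void main(String[] args) {
        int buckets = 101;                          // the default table size used above
        for (int n : new int[] { 101, 1010, 10100 }) {
            System.out.println(n + " entries: longest list has "
                + longestChain(n, buckets) + " nodes");
        }
    }
}
```

With 100 times more entries than buckets, every search has to walk a list of up to 100 nodes, even with this ideal distribution.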

Exercise 7.5 Modify the ExternalHashTable class to include a count of the number of entries. Be careful as there are a number of places where we can add new entries. This simplifies the isEmpty method considerably.

Exercise 7.6 Implement an external hash table class that makes use of binary search trees for the buckets. What difference would you expect this to make to the performance of the hash table, when the hash table has very few entries and when it has many entries?


ADT                  Advantages       Disadvantages
ListDictionary       Ordered          Slow access
                     Flexible size
InternalHashTable    Fast access      Unordered
                                      Fixed size
ExternalHashTable    Fast access      Unordered
                     Flexible size

Table 7.3: Summary of Dictionary ADTs

7.4 Comparison of Dictionary Data Structures

The advantages and disadvantages of the three data structures studied in this chapter are summarised in Table 7.3. Which of them should be used in a particular situation depends on the trade-off between the factors mentioned in the table. For example, if ordered access is a requirement then only the ListDictionary will be suitable; if fast access is required and the amount of data is not known then an ExternalHashTable should be used; and so on.

Exercise 7.7 Write a program that inserts 5000 random integers in the range 0...19 999 into a dictionary (i.e. any class that implements the Dictionary interface). The program should time how long it takes to search the dictionary for a further 5000 random integers in the same range, using the contains method. Do this for all three of the ADTs considered in this chapter. For the two hash tables the experiment should be repeated, setting the size of the table to 5001 (i.e. just big enough for the data) and then to 10 000. Draw up a table of your results. Does this agree with what you would expect?

Skills

• You should be familiar with the following abstract data types: dictionaries, hash tables using internal hashing with open addressing, and hash tables using external hashing

• You should be familiar with the central idea of a dictionary or hash table data structure: a list of values indexed by some key value

• You should be acquainted with the important concepts of hash tables and the terminology used to describe them

• You should know about some of the issues that affect the performance of hash tables

• You should be familiar with implementation techniques for these abstract data types


Part III

The Analysis of Algorithms


As we have worked through the course so far, we have often commented on the efficiency of various algorithms. In this section we want to lay the foundation for a more thorough analysis of the performance of algorithms. In particular, we need to explore methods that will allow us to compare algorithms without any concern for issues such as the speed of the computer being used, or the quality of the compilers, etc.

You may wonder why we are so concerned with the efficiency and performance of algorithms. After all, the speed and power of the computers that we use are increasing rapidly. We will also deal with these kinds of questions in this section.


Chapter 8

Big-O

Objectives

• To introduce the idea of analysing an algorithm

• To study methods for measuring the efficiency of algorithms and thus comparing different algorithms

• To compare some of the common measurements of efficiency that we find in many algorithms

• To show some simple applications of the methods we use

8.1 Introduction

Many algorithms appear to be simple and reasonably efficient. However, careful analysis may reveal that they are completely unsuitable except for trivial sizes of problem. One startling illustration of this comes from considering the problem of solving sets of simultaneous equations. This is a vitally important aspect of many scientific and engineering applications (e.g. aircraft design and weather prediction). In these cases there may be thousands, or even millions, of equations to be solved. A simple method for solving simultaneous equations is Cramer’s Rule, which you may have encountered in a maths course. If you have not, do not panic! The following discussion is very general in nature.

The key step in Cramer’s method is the calculation of a value known as the determinant. It can easily be shown that the number of operations required to calculate the determinant is proportional to n!, where n is the number of simultaneous equations in the set.

Now, never mind engineering applications with thousands of simultaneous equations, let us consider a set of just thirty equations. You will agree that this is a small, if not tiny, set. From the preceding discussion we can see that the number of operations required to calculate the determinant in this case is 30! ≈ 3 × 10^{32} operations, a rather large number. But computers are very fast at calculations, so surely this poses no real problem? Well, some of the fastest computers in the world today can perform about 10^9 arithmetic operations per second. How long will it take to calculate the determinant for Cramer’s Rule? Only about 10^{16} years, which is far more than the estimated age of the universe! Even if the performance of computers over the next few years increases at a mind-boggling rate we are still faced with an intractable problem.

Giving up on Cramer’s Rule, we might try another approach. Using the same kind of technique that we usually use when solving simultaneous equations by hand leads to an algorithm known as the tri-diagonal method. This requires about 10n operations, and our set of thirty simultaneous equations can then be solved in about 3 × 10^{-7} seconds. This is doing rather better than Cramer’s Rule, and suggests that the tri-diagonal method might be a better approach to solving scientific and engineering problems!
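The arithmetic behind this comparison is easy to reproduce. The sketch below is only a back-of-the-envelope check: the class name CramerCost is hypothetical, the machine speed of 10^9 operations per second is the figure assumed in the discussion above, and a year is taken as 365.25 days.

```java
// Illustrative sketch only: CramerCost is a hypothetical helper class.
class CramerCost {
    static final double OPS_PER_SECOND = 1e9;                 // speed assumed in the text
    static final double SECONDS_PER_YEAR = 365.25 * 24 * 3600;

    // n! as a floating-point value (ample precision for an estimate)
    static double factorial(int n) {
        double result = 1.0;
        for (int k = 2; k <= n; k++) {
            result *= k;
        }
        return result;
    }

    public static void main(String[] args) {
        int n = 30;
        double cramerSeconds = factorial(n) / OPS_PER_SECOND;
        double triDiagonalSeconds = 10.0 * n / OPS_PER_SECOND;
        System.out.printf("30! is about %.2e operations%n", factorial(n));
        System.out.printf("Cramer's Rule: about %.1e years%n", cramerSeconds / SECONDS_PER_YEAR);
        System.out.printf("Tri-diagonal:  about %.1e seconds%n", triDiagonalSeconds);
    }
}
```

The gap between the two results is more than thirty orders of magnitude, which no realistic hardware improvement could close.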

Hopefully this example has brought home the need for the careful analysis of the efficiency and performance of algorithms. Merely seizing upon the first algorithm that springs to mind may lead one to an extremely inefficient solution. This example has also focused on the time taken by an algorithm. Another very important aspect of the efficiency of algorithms is the amount of resources (memory, disk space, etc.) that they use. We will mainly be concentrating on time issues in this course, but will have occasion to mention other resource issues from time to time.

8.2 Algorithmic Complexity

When we speak of algorithmic complexity, or the complexity of algorithms, we are not talking about how difficult they are to understand. Rather, we are referring to their behaviour with respect to time, or other resources that they use. In doing this we need to find a measure of their complexity that is independent of factors such as the compiler used, or the speed of the computer on which the program is run. We want some way of measuring the efficiency of a program that is universally applicable. We can do this by focusing on the main operations carried out by the algorithms, and working out how many times they are done.

For the kinds of algorithms that we are going to be looking at, the important operations are often comparisons and exchanges of data items. If we can count these, they give us a good measure of how complex an algorithm is. Looked at crudely, if one algorithm takes 20 comparisons and 30 exchanges, and another 100 comparisons and 150 exchanges when dealing with the same set of data, then we can clearly state the first is more efficient. Of course, a program based on the first algorithm might run more slowly if it were running on an old Apple II microcomputer at 1 MHz, and a program based on the second algorithm was running on a modern supercomputer. However, the fundamental point remains: the first algorithm is more efficient.
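Counting operations in this way can be done by instrumenting the code directly. The sketch below uses a simple exchange (bubble) sort purely as a convenient vehicle; the choice of algorithm and the class name OperationCounter are assumptions for illustration, not something prescribed by these notes.

```java
// Illustrative sketch only: OperationCounter is a hypothetical class.
class OperationCounter {
    int comparisons = 0;
    int exchanges = 0;

    // Simple exchange (bubble) sort, instrumented to count its key operations.
    void sort(int[] data) {
        for (int pass = data.length - 1; pass > 0; pass--) {
            for (int k = 0; k < pass; k++) {
                comparisons++;                       // one comparison of two items
                if (data[k] > data[k + 1]) {
                    int temp = data[k];              // one exchange of two items
                    data[k] = data[k + 1];
                    data[k + 1] = temp;
                    exchanges++;
                }
            }
        }
    }

    public static void main(String[] args) {
        OperationCounter counter = new OperationCounter();
        counter.sort(new int[] { 5, 4, 3, 2, 1 });
        System.out.println(counter.comparisons + " comparisons, "
            + counter.exchanges + " exchanges");
    }
}
```

The counts depend only on the algorithm and the data, not on the machine or compiler, which is exactly the independence we are after.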

8.2.1 The Impact of the Input Data

An important point that arises from this discussion is the question of the data that is used. We stated above that we were “dealing with the same set of data” when we measured the efficiency of the two algorithms. In general, the input data is going to have a vital impact on the performance of an algorithm. If we are going to compare the complexity of algorithms we need some mechanism that allows us to abstract away from the actual data that is being used. This issue has two aspects: (1) how we characterise, or “measure”, the data, and (2) what specific cases may arise.

Characterising the Input Data

We can usually characterise the data by some abstract quantity. For example, when discussing Cramer’s Rule in the introduction to this chapter, we talked about a set of n simultaneous equations. The actual data (the variables and constants making up the equations) were of no real interest to the analysis we performed: all we needed to know about was how many equations there were.

For many algorithms it is a quantity such as the number of data items that is exactly what is required for a general analysis of the algorithmic complexity. However, you should not fall into the trap of thinking this is always the case. For example, some tree algorithms depend only on the depth of the tree, and not on the number of data items stored in the tree¹. Some algorithms also depend on more than a single value. For example, there are string searching algorithms that depend on both the length of the search string and the length of the text being searched.

Considering Different Cases

The second aspect when considering the input data is that we may need to consider three different characteristics of the algorithm. These are the (1) best case performance, (2) worst case performance, and (3) average case performance. These are obviously determined by the input data. If the input data is “good” in some sense then we might expect the best case performance. If there is some “problem” with the input data then we might expect the worst case performance. In general, we are likely to encounter the average case performance. To fully characterise the complexity of an algorithm we need to consider all of these situations.

Some algorithms are very efficient at the best and average cases, but break down badly when faced with the worst case. In such a situation, if the worst case was likely to arise, we might settle for a compromise and select an algorithm with slightly worse performance in the best and average cases but with better worst case performance. Indeed, the input data that might produce the worst case for one algorithm may produce the best case for another, as we will see.

8.2.2 Big-O Notation

Given that we can find a quantity, such as the number of items n, for the analysis of an algorithm, we can express the complexity using Big-O notation, or order notation. This makes use of several simplifying assumptions, once we have determined the relationship between n and the number of essential operations used by the algorithm. Let’s assume that we have discovered that we have an algorithm such that the number of operations carried out by it is given by the function f(n). For example, we might have:

f(n) = 3.5n^2 + 120n + 45

If we examine the behaviour of this function we find that the dominant term is 3.5n^2: as n grows larger, the contribution of the other terms becomes negligible. Similar reasoning allows us to dispense with the constant 3.5, and we are left with the situation where we can characterise the efficiency of the algorithm by the single quantity n^2. We then say the algorithm has a complexity of the order of n^2. More succinctly, we say that the algorithmic complexity is O(n^2). In this way we have characterised the essential behaviour of the algorithm: its running time is proportional to the square of the number of data items. If we find another algorithm to solve the same problem whose running time is directly proportional to the number of data items, we say it is O(n), and can state that it is more efficient than the first.

We formally define the concept of order notation as follows.

We say that an algorithm is O(f(n)), or that the algorithm is of the order of f(n), if there exist positive constants c and n_0 such that the time t required to execute the algorithm is bounded by:

t ≤ c f(n) for all n > n_0

That is, the time (or, in general, other resource) requirements of the algorithm grow no faster than a constant (c) times f(n), as long as n is greater than some cut-off value n_0.
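As a worked instance of this definition, take the example operation count 3.5n^2 + 120n + 45 from earlier in this section and use n^2 as the order function. For whole numbers n ≥ 1 we have n ≤ n^2 and 1 ≤ n^2, so each lower-order term can be bounded by a multiple of n^2. The constants below are just one convenient choice and are certainly not the smallest possible.

```latex
\begin{align*}
3.5n^2 + 120n + 45 &\le 3.5n^2 + 120n^2 + 45n^2 \qquad (n \ge 1) \\
                   &= 168.5\,n^2
\end{align*}
```

So the definition is satisfied with c = 168.5 and n_0 = 0, which confirms that the example is O(n^2).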

¹ This is a slightly artificial example, as the depth of the tree is often proportional to the number of nodes in the tree.


Order      n = 3           6               9               12              50              100             1000
1          10^{-6}s        10^{-6}s        10^{-6}s        10^{-6}s        10^{-6}s        10^{-6}s        10^{-6}s
log n      2 × 10^{-6}s    3 × 10^{-6}s    3 × 10^{-6}s    4 × 10^{-6}s    6 × 10^{-6}s    7 × 10^{-6}s    10^{-5}s
n          3 × 10^{-6}s    6 × 10^{-6}s    9 × 10^{-6}s    10^{-5}s        5 × 10^{-5}s    10^{-4}s        10^{-3}s
n log n    5 × 10^{-6}s    2 × 10^{-5}s    3 × 10^{-5}s    4 × 10^{-5}s    3 × 10^{-4}s    7 × 10^{-4}s    10^{-2}s
n^2        9 × 10^{-6}s    4 × 10^{-5}s    8 × 10^{-5}s    10^{-4}s        3 × 10^{-3}s    0.01s           1s
n^3        3 × 10^{-5}s    2 × 10^{-4}s    7 × 10^{-4}s    2 × 10^{-3}s    0.13s           1s              16.7m
2^n        8 × 10^{-6}s    6 × 10^{-5}s    5 × 10^{-4}s    4 × 10^{-3}s    36y             4 × 10^{16}y    3 × 10^{287}y

Times are expressed in seconds (s), minutes (m), or years (y).

Table 8.1: Results of Different Order Algorithms

There are a number of general issues that we can note about the use of order notation. Firstly, as described above, we can ignore less significant terms of the function describing the behaviour of an algorithm. If we are to do this we need to understand the ranking of the complexities which commonly occur when considering algorithms. The following list summarises the most common orders found, and they are ranked from the least significant to the most:

1: Constant complexity. The algorithm is independent of any external factor. Few useful algorithms are this favourable.

log n: Logarithmic complexity. This is very good. The time (or other resource) required by the algorithm grows very slowly as n increases.

n: Linear complexity. The algorithm’s performance is directly proportional to the factor n.

n log n: This arises quite commonly in practice. It is slightly worse than linear complexity, but still increases at a fairly slow rate.

n^2: Quadratic complexity. These algorithms start to grow at a rate that makes them impractical for large datasets.

n^m where m ≥ 2: Polynomial complexity (note that quadratic complexity is a special case of polynomial complexity). As the value of m grows larger so the usefulness of these algorithms decreases for large datasets. Polynomial algorithms form a hierarchy of their own within this overall complexity ranking.

2^n: Exponential complexity. These algorithms are useless except for very small datasets.

n!: Factorial complexity. These are even worse! We saw an example of this in the introduction where even n = 30 was totally impractical.

Occasionally one comes across an algorithm with some other complexity such as log log n, or n^n, but these are relatively rare. The relative rate of increase of the more common functions is depicted in the graph in Figure 8.1 (note that the y-axis is logarithmic).

Looking at this subject from another perspective, we can ask the question: if one operation takes a certain time to execute, how long will it take to complete an algorithm with one of the above complexities? A partial table of results is given in Table 8.1, where the time taken to execute one operation has been taken as one microsecond (10^{-6} s). Bearing in mind that for many applications even 1000 data items is not a lot, this gives a good indication of just how intractable some algorithms are.
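Timings of this kind can be regenerated with a short program. The sketch below assumes one microsecond per operation, logarithms to base 2 and a year of 365.25 days; the class OrderTimes and its method names are hypothetical.

```java
// Illustrative sketch only: OrderTimes is a hypothetical helper class.
class OrderTimes {
    static final double OP_SECONDS = 1e-6;                    // one operation takes 1 microsecond
    static final double SECONDS_PER_YEAR = 365.25 * 24 * 3600;

    static double log2(double n) { return Math.log(n) / Math.log(2.0); }

    // Time in seconds needed by an algorithm of each order for a problem of size n
    static double logarithmic(int n)  { return log2(n) * OP_SECONDS; }
    static double linear(int n)       { return n * OP_SECONDS; }
    static double linearithmic(int n) { return n * log2(n) * OP_SECONDS; }
    static double quadratic(int n)    { return (double) n * n * OP_SECONDS; }
    static double cubic(int n)        { return (double) n * n * n * OP_SECONDS; }
    static double exponential(int n)  { return Math.pow(2.0, n) * OP_SECONDS; }

    public static void main(String[] args) {
        int[] sizes = { 3, 6, 9, 12, 50, 100, 1000 };
        for (int n : sizes) {
            System.out.printf("n=%4d  log n=%.0e  n=%.0e  n log n=%.0e  n^2=%.0e  n^3=%.0e  2^n=%.0e%n",
                n, logarithmic(n), linear(n), linearithmic(n),
                quadratic(n), cubic(n), exponential(n));
        }
        System.out.printf("n^3 at n = 1000: %.1f minutes%n", cubic(1000) / 60);
        System.out.printf("2^n at n = 50:   %.0f years%n", exponential(50) / SECONDS_PER_YEAR);
    }
}
```

All times are printed in seconds except the last two lines, which convert two of the larger values into minutes and years.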

Figure 8.1: Illustration of Some Common Order Functions. (Taken from [20])

Yet another way of looking at this is to consider how much bigger a problem we can solve if computer power increases dramatically. For example, if we had access to computers 1 000 000 times more powerful than those available today how much bigger can n grow? Well, if we have an O(n) algorithm we will be able to solve a problem 1 000 000 times as big (i.e. for 1 000 000n). If we have an O(n^2) algorithm we would only be able to solve a problem of size 1000n. In other words, although the computer is 1 000 000 times more powerful, we cannot increase our problem size by anything like the same ratio. If the algorithm is O(2^n) then we can solve a problem of size n + 20. In other words, we can only add 20 items to our dataset even though the computer has grown 1 000 000 times more powerful!
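These figures follow from solving for the new problem size in each case: for a quadratic algorithm the size grows by the square root of the speed-up, and for an exponential algorithm it grows by the base-2 logarithm of the speed-up. The sketch below simply evaluates those two expressions; the class SpeedupGain and its method names are hypothetical.

```java
// Illustrative sketch only: SpeedupGain is a hypothetical helper class.
class SpeedupGain {
    // O(n^2): (k * n)^2 = speedup * n^2, so the size grows by k = sqrt(speedup)
    static double quadraticFactor(double speedup) {
        return Math.sqrt(speedup);
    }

    // O(2^n): 2^(n + d) = speedup * 2^n, so we gain d = log2(speedup) extra items
    static double exponentialExtra(double speedup) {
        return Math.log(speedup) / Math.log(2.0);
    }

    public static void main(String[] args) {
        double speedup = 1_000_000;
        System.out.println("O(n):   size multiplied by " + speedup);
        System.out.println("O(n^2): size multiplied by " + quadraticFactor(speedup));
        System.out.println("O(2^n): extra items gained " + exponentialExtra(speedup));
    }
}
```

The exponential case comes out just under 20 extra items, however large n was to begin with.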

One last point worth making about order notation before we look at how it is applied is that we have lost some information about the exact nature of the algorithm when we state just the order. For example, from our definition, we can see that when we say that an algorithm is O(n), we have discarded any constant factor and any lesser terms (i.e. the real function governing the performance might be 23n + 15). This may affect our choice of algorithm, if (1) these constants have extreme values, and (2) n is small. A common example is in sorting. The best sorting algorithms we have are O(n log n). Simple sorting algorithms are generally O(n^2). However the constants of proportionality are often rather larger for the more complex sorting algorithms. So, if we are only sorting small lists of data we might be better off choosing a simple, O(n^2) sorting algorithm.

8.3 Some Examples

We have discussed the analysis of algorithms in some depth now, but how is all this applied in practice? We will be considering a number of searching and sorting algorithms in the next two chapters, and will take the opportunity to discuss their complexities. In this section we will content ourselves with a few simple examples.

8.3.1 Very Simple Examples

Few algorithms of any practical interest have constant complexity (i.e. are O(1)); almost invariably there is some dependence on the input data. However, an algorithm that had only a single sequence of statements, with no repetition of any kind (i.e. loops or recursion) would yield a constant complexity result.


What about the following algorithm?

    for (int k = 0; k < 10; k++)
      // do something

This has a loop and so looks more interesting. In fact it is also O(1). The reason is that there is no dependence on the input data (assuming that the commented do something is just a simple statement, or sequence of statements). Whether this algorithm is given one item of data or 1 000 000, it is still going to loop only ten times, giving a complexity of O(1).

Of more interest are algorithms that loop, processing data repeatedly. The simplest of these might just work through the input data performing some operation on each item.

    for (int k = 0; k < DATA_LENGTH; k++)
      // do something to the k’th data element

What is the order of this algorithm? The number of operations is directly dependent on the number of data items, and so this is an O(n) algorithm. Again, this is assuming that the process of “doing something” does not involve further loops or recursion dependent on the input data.

The following algorithm shows two loops processing some data:

    for (int k = 0; k < DATA_LENGTH; k++)
      for (int j = 0; j < DATA_LENGTH; j++)
        // do something to the k’th element and the j’th element

This gives rise to an algorithmic complexity of O(n^2). Be careful not to confuse this with the situation that we might have in dealing with a two dimensional matrix. We might have two loops in such a case:

    for (int k = 0; k < NUMBER_OF_ROWS; k++)
      for (int j = 0; j < NUMBER_OF_COLS; j++)
        // do something to the (k,j)’th element

In this case we still have an O(n) algorithm, since the n data items have simply been arranged into a matrix of NUMBER_OF_ROWS by NUMBER_OF_COLS.

Here is another algorithm with two loops:

    for (int k = 0; k < DATA_LENGTH; k++)
      for (int j = 0; j < (DATA_LENGTH-k); j++)
        // do something to the k’th element and the j’th element

What do we have here? Well, the first time around the inner loop we will execute it n times. The next time will give n − 1 iterations, and the next n − 2, and so on. By the time we are finished we will have the following total number of iterations of the “do something” part of the algorithm:

n + (n − 1) + (n − 2) + ··· + 2 + 1

At first this might not seem very helpful, but in fact this is just an arithmetic progression and can be simplified to:

n(n + 1)/2 = n^2/2 + n/2

which is O(n^2). This is a very commonly occurring result as we will see.
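The loop shapes discussed in this section can be checked by simply counting how often the “do something” step runs. The following sketch does this for a problem of size n; the class LoopCounts and its method names are hypothetical, and the step itself is replaced by incrementing a counter.

```java
// Illustrative sketch only: LoopCounts is a hypothetical helper class.
class LoopCounts {
    // Single loop over the data: n iterations
    static long single(int n) {
        long count = 0;
        for (int k = 0; k < n; k++)
            count++;
        return count;
    }

    // Two full nested loops over the data: n * n iterations
    static long nested(int n) {
        long count = 0;
        for (int k = 0; k < n; k++)
            for (int j = 0; j < n; j++)
                count++;
        return count;
    }

    // Inner loop shrinks by one each time: n + (n - 1) + ... + 1 iterations
    static long triangular(int n) {
        long count = 0;
        for (int k = 0; k < n; k++)
            for (int j = 0; j < (n - k); j++)
                count++;
        return count;
    }

    public static void main(String[] args) {
        int n = 100;
        System.out.println("single:     " + single(n));
        System.out.println("nested:     " + nested(n));
        System.out.println("triangular: " + triangular(n)
            + " (formula gives " + ((long) n * (n + 1) / 2) + ")");
    }
}
```

The triangular count is roughly half the full nested count, but both grow with the square of n.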


8.3.2 A More Realistic Example

Let’s consider the problem of searching through an unsorted list in order to find whether or not it contains an item. This is something that we have done several times now. The quantity n, which will determine the complexity of this search, is the number of items in the list. The operations that characterise this process are comparisons between the value that we are searching for and the items in the list.

What is the worst case that could arise? It is obviously when the item that we are searching for is not to be found in the list at all. The only way that we can tell this is to look through the whole list. In this case the number of comparisons is n, and so the algorithm is exhibiting O(n) behaviour.

The best case will be when the item we are looking for is the very first item in the list. In this case we perform only one comparison and so have constant complexity. However, this case is extremely unlikely to arise in practice!

What about the average case? To analyse this we need to make some assumptions. If we assume that the item we are looking for will always be found, then, on average, we will need to search through half the list before we find the item we are looking for. This gives n/2 comparisons, which is still O(n), of course. If that assumption seems unreasonable, what if we rather assumed that on average the item we are searching for is not to be found in the list half the time? In this case half the searches will require n comparisons to ascertain that the item is not present, and the other half will require, on average, n/2 comparisons as before. This gives an overall result of 3n/4 comparisons as the average case based on these assumptions. This is still O(n).
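These averages can be checked experimentally by counting comparisons in a sequential search. The sketch below assumes a list of distinct integers and that every item is equally likely to be the target; the class SequentialSearchCost and its methods are hypothetical. Note that the exact average for a successful search is (n + 1)/2, which is close to n/2 once n is large.

```java
// Illustrative sketch only: SequentialSearchCost is a hypothetical helper class.
class SequentialSearchCost {
    // Number of comparisons a sequential search makes while looking for target
    static int comparisons(int[] list, int target) {
        int count = 0;
        for (int item : list) {
            count++;
            if (item == target) {
                break;                      // found: stop searching
            }
        }
        return count;
    }

    // Average comparisons when each item in the list is searched for exactly once
    static double averageWhenFound(int[] list) {
        long total = 0;
        for (int target : list) {
            total += comparisons(list, target);
        }
        return (double) total / list.length;
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] list = new int[n];
        for (int k = 0; k < n; k++) {
            list[k] = k;                    // distinct items 0..n-1
        }
        double found = averageWhenFound(list);
        int missing = comparisons(list, -1);    // -1 is not in the list
        System.out.println("Average when found:     " + found);
        System.out.println("When not found:         " + missing);
        System.out.println("Half found, half not:   " + (0.5 * found + 0.5 * missing));
    }
}
```

For n = 1000 the mixed case comes out very close to 3n/4, in line with the reasoning above.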

So, a simple sequential searching algorithm is essentially O(n), no matter how we look at it. In the next section of the course we will continue considering these kinds of efficiency issues as we look at searching and sorting algorithms.

Skills

• You should understand the need for measuring the efficiency of algorithms

• You should be familiar with Big-O notation and how it is used

• You should be familiar with the ranking of common order functions

• You should be able to apply the methods you have learnt to the analysis of simple algorithms


Part IV

Some Common Algorithms


Searching and sorting are very common operations that are needed in many programs. Consequently, it is worth making a detailed study of these important classes of algorithms. They also make a useful vehicle for the further exploration of the concepts of algorithmic complexity introduced in the last chapter. This is due to the fact that the analysis of these algorithms is (1) very well understood, and (2) generally quite straightforward. Lastly, this section introduces some balance into our study of advanced programming: up until now we have focused mainly on data structures, and the subject of algorithms has been rather neglected.


Chapter 9

Searching

Objectives

• To introduce efficient searching techniques

• To study the algorithmic complexity of these, together with the simpler searching techniques met already

• To introduce some of the implementation issues pertinent to searching and also to sorting

9.1 Introduction

In the last chapter we analysed the complexity of a simple sequential search algorithm for an unsorted list of elements. In chapter seven we implemented a dictionary data structure using a sorted list of items. This had a search method defined (called contains; see p. 126). We also implemented two hash table data structures that had similar methods. Lastly, the binary tree class from chapter six also had such a method to tell whether or not an item was present. So searching is not something new. What can we add? In this chapter we will revisit some of these ideas, and introduce two new techniques: the binary search algorithm and the interpolated binary search algorithm.

9.1.1 Implementation Issues

Before we look at the advanced searching techniques of this chapter, there are a number of issues we need to consider regarding the implementation of these, and the sorting algorithms of the next chapter. These are (1) the nature of the data being searched/sorted, (2) the underlying data structure being used, and (3) the options available in Java for writing generic search/sort methods.

The Nature of the Data

In general, the items being searched for or sorted will be records of some kind, such as student records, or account details. In any such case we can usually think of the data as having two parts (a key and an associated value), just as we did in chapter seven. To simplify our discussion, we will ignore the associated values, and focus only on searching and sorting keys. To further simplify things, we will consider only numeric keys in our examples. While this may seem rather unrealistic, the principles that we will be considering are just as valid as if we were using more realistic data (such as student records with student numbers as keys).

The Impact of the Data Structure Used

The second general consideration is that of the underlying data structure. Many of the data structures we have seen in this course have been based on linked lists of nodes containing the data. However, for many sorting and searching algorithms these are not the best data structures to use. In these cases arrays are often more efficient, and sometimes are the only data structures that can be used. Why is this? When we consider how we access the data in a linked list, we usually have only one, or maybe two, entry points into the list and must work our way through the list to find the particular entry of interest. In fact, we can state that accessing an entry in a linked list is an O(n) operation. On the other hand, accessing an entry in an array can be done directly: it is an O(1) operation. When searching and sorting, we often need to access any item of the dataset. If this involves an expensive O(n) search for the given node, the overall complexity of the algorithm may become excessively high.

In general then we will be concentrating on using arrays, rather than linked lists, as we look at searching and sorting algorithms. Is this a great restriction? Not really, as we can quite easily set up an array to help us access elements of a linked list directly. For example, if we have a linked list of integers, we can count the number of elements and then create an array of references, or pointers (called index), with one entry referring to each node, that will allow us to access the linked list directly.

We can then use index to “leap into the middle” of the linked list as we search or sort it.
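The idea can be sketched as follows. This is only an illustration: the Node class, the helper names and the list values are assumptions made for the example, not the list classes developed earlier in the course.

```java
// A minimal sketch of indexing a linked list with an array of references.
// Node, fromValues and buildIndex are hypothetical names used only here.
class IndexedListDemo {
    static final class Node {
        final int data;
        Node next;
        Node(int data) { this.data = data; }
    }

    // Build a singly linked list holding the given values, in order.
    static Node fromValues(int... values) {
        Node head = null, tail = null;
        for (int v : values) {
            Node n = new Node(v);
            if (head == null) head = n; else tail.next = n;
            tail = n;
        }
        return head;
    }

    // First pass counts the nodes; second pass stores one reference per node.
    static Node[] buildIndex(Node head) {
        int count = 0;
        for (Node p = head; p != null; p = p.next) count++;
        Node[] index = new Node[count];
        int i = 0;
        for (Node p = head; p != null; p = p.next) index[i++] = p;
        return index;
    }

    public static void main(String[] args) {
        Node head = fromValues(3, 8, 12, 15, 20);
        Node[] index = buildIndex(head);
        // Reaching the middle node no longer requires walking the list.
        System.out.println(index[index.length / 2].data); // prints 12
    }
}
```

Building the index is itself an O(n) operation, so it pays off only when the list is then accessed many times, as it is during a search or sort. The index must also be rebuilt if nodes are added to or removed from the list.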

Generic Comparisons in Java

When we come to implement general searching and sorting algorithms in Java we need some way to compare the values of objects, while using as wide a range of objects as possible. We have already seen, and used, the mechanism that allows us to do this, in the form of the Comparable interface. You will recall that this specifies a method called compareTo that allows us to compare any two objects. For our purposes in this section of the course we will usually be working with arrays of data, declared as follows:

Comparable[] list;
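As a reminder of how a record with a key and an associated value can take part in such comparisons, here is a small sketch. The StudentRecord class and its fields are hypothetical, chosen to match the student-number example mentioned earlier; they are not classes from the course library.

```java
// Hypothetical record type: class and field names are assumptions for illustration.
class StudentRecord implements Comparable<StudentRecord> {
    final int studentNumber;   // the key
    final String name;         // the associated value

    StudentRecord(int studentNumber, String name) {
        this.studentNumber = studentNumber;
        this.name = name;
    }

    // Order records by key only; the associated value plays no part.
    @Override
    public int compareTo(StudentRecord other) {
        return Integer.compare(studentNumber, other.studentNumber);
    }
}

class ComparableDemo {
    public static void main(String[] args) {
        StudentRecord[] records = {
            new StudentRecord(1001, "First"),
            new StudentRecord(1002, "Second")
        };
        // A negative result means the first record sorts before the second.
        System.out.println(records[0].compareTo(records[1]) < 0); // prints true
    }
}
```

Because Java arrays are covariant, an array such as records can be passed to any method that expects a Comparable[] parameter.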


9.2 Searching Techniques Revisited

As mentioned in the introduction, we have already seen a number of search algorithms. In this section we will briefly consider their complexity.

9.2.1 Simple Sequential Search

In chapter four we developed several classes for various types of lists (the IntegerVector, IntegerList, ObjectList and GenericList classes). Each of these had a position method defined which was essentially a search algorithm. If you look back at these methods you will see that they were simple sequential searches. The data in the lists that we were dealing with was unordered, and so we had to search through the entire list in order to tell whether or not an item was present. This is exactly the case which we analysed in the previous chapter and found that, if the item was present, the average complexity was n/2. Furthermore, if we made some (probably reasonable) assumptions, we showed that the average case complexity was 3n/4 if the item was not present half of the time. Either analysis gives an O(n) result, as the cases differ only in the constants.

In chapter seven we developed the ListDictionary class. This too made use of a linked list of nodes. What it introduced was the concept of order: the list was sorted in ascending order of keys. What effect does this have on the complexity of searching? The search method we defined for our dictionary class was called contains. Assuming that the item can always be found in the list we will still, on average, have to make n/2 comparisons to find the item. What if the item is not found in the list? In this case, because the list is ordered, we still need only make n/2 comparisons, on average. As soon as we find a value greater than that of the item we are searching for we can be sure that it is not present in the list. So, for a sorted list a simple sequential search gives the slightly better result that it always requires n/2 comparisons on average. Of course, this is still O(n), so the complexity has not improved as a result of keeping the list sorted. All we have achieved is a slight decrease in the constant factor from 0.75 to 0.5.
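The early exit described above can be sketched for an array of integers as follows. This is an illustrative variant, not the contains method of the ListDictionary class; the class name and the examined counter are assumptions added so that the number of elements inspected can be observed.

```java
// Sequential search of a sorted array that gives up as soon as it has
// passed the place where the item would have to be.
class SortedSequentialSearch {
    // Number of list elements examined by the most recent call to find.
    static int examined;

    // Returns the index of item in the ascending array, or -1 if it is absent.
    static int find(int[] sorted, int item) {
        examined = 0;
        for (int i = 0; i < sorted.length; i++) {
            examined++;
            if (sorted[i] == item) return i;
            if (sorted[i] > item) return -1;  // every later value is larger still
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] list = {3, 8, 12, 15, 20, 21, 29, 37, 39, 42};
        System.out.println(find(list, 21) + " after examining " + examined); // 5 after examining 6
        System.out.println(find(list, 22) + " after examining " + examined); // -1 after examining 7
    }
}
```

An unsuccessful search stops at the first larger value rather than running to the end of the list, which is where the improvement from 3n/4 to n/2 comes from.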

9.2.2 Searching a Hash Table

The exact analysis of searching a hash table for a key can be quite difficult. However, we can draw some simple conclusions. We will consider only internal hashing, but the analysis of external hashing is not very different. If we assume that no collisions occur (a rather unrealistic assumption, but we will pursue this train of thought for a moment) then we can tell whether or not an item is present simply by hashing the key and checking the given element of the hash table. This is an O(1) operation, and so is extremely efficient compared with the sequential searching techniques of the last section.

If we allow for collisions, what effect will this have on our analysis of the complexity? The moment we have collisions we will get clusters of items which we need to search through sequentially in order to find the item we are looking for, or else ascertain that it is not present. The number of comparisons required thus depends on the length of the clusters. Let’s call this c. In general, if the item is found in the table we will require c/2 comparisons, and if the item is not present c comparisons. If the item is found half of the time then we get 3c/4, just as we had before for sequential searches of unordered lists. The question that we really need to answer here is how is c related to n. This depends on the hashing function and the consequent clustering of keys.

If we assume that the hashing function provides a perfectly even distribution of hash values for the keys, and that the distribution of keys inserted into the table is perfectly even then the clusters will all be of the same length. This length will be the number of keys n divided by the number of clusters. In other words c is proportional to n, and our algorithm is still O(n). The main advantage is that the multiplying constant will generally be very small. Take as an example a hash table that has had 100 keys inserted into it. Using our assumptions above, if there are twenty clusters they will each have five entries. So, for this example, n = 100 and c = 5. We stated that the number of comparisons that is required on average would be 3c/4 = 3.75. So, in this specific case, the number of comparisons is only 0.0375n. This is far less than 3n/4 or n/2. So, although we are still dealing with an O(n) algorithm, it is likely to be more efficient than the simple sequential searches.

To prevent the reader from believing that analysing the performance of hash tables is a very simple process, we will present some actual results here, without showing their derivation. For internal hashing, if the item is found in the table, the number of comparisons is given by the following formula:

$$\frac{1 + \left(1 - \frac{n}{M}\right)^{-1}}{2}$$

where M is the size of the table.

For external hashing, if the item is found in the table, the number of comparisons is:

$$1 + \frac{n}{2M}$$

and if the item is not found in the table:

$$e^{-n/M} + \frac{n}{M}$$

Deriving these functions is not particularly easy!

You will note that the expression n/M is an important factor in all of these equations. This is the ratio of the number of items to the size of the table (i.e. how full the table is), which we should expect to be a dominant factor.
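To get a feel for how these expressions behave as the table fills up, they can simply be evaluated for a few values of n/M. The sketch below does only that; the class and method names are assumptions made for the example, and the formulas are the three quoted above.

```java
// Evaluates the three quoted formulas as functions of the ratio n/M.
// Method names are hypothetical; load stands for n/M.
class HashSearchCost {
    // Internal hashing, item found: (1 + (1 - n/M)^-1) / 2. Requires load < 1.
    static double internalFound(double load) {
        return (1 + 1 / (1 - load)) / 2;
    }

    // External hashing, item found: 1 + n/(2M).
    static double externalFound(double load) {
        return 1 + load / 2;
    }

    // External hashing, item not found: e^(-n/M) + n/M.
    static double externalNotFound(double load) {
        return Math.exp(-load) + load;
    }

    public static void main(String[] args) {
        double[] loads = {0.25, 0.5, 0.75, 0.9};
        for (double load : loads) {
            System.out.printf("n/M = %.2f: internal found %.2f, external found %.2f, external not found %.2f%n",
                    load, internalFound(load), externalFound(load), externalNotFound(load));
        }
    }
}
```

For a half-full table (n/M = 0.5) the first formula gives 1.5 comparisons and the second 1.25, regardless of how many items the table actually holds. The internal hashing figure grows rapidly as n/M approaches one.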

9.3 Binary Search Techniques

We have already seen that sorting the data can help a little with the efficiency of searching. When we looked back at the dictionary data structure we were able to perform sequential searches using n/2 comparisons whether the item was present or not. This contrasted with the figure of 3n/4 comparisons when searching an unsorted list. We also made the point that this was still O(n). In fact, if the data is sorted, we can find lower order algorithms. They are the topic of this section.

9.3.1 Binary Search

The binary search is a very powerful technique: as we will see, we can use it to search through a list of one million items using at most twenty comparisons.

The nature of the binary search arises from the fact that, if we have a sorted list, we can start our search in the middle of the list rather than at the beginning of the list. How does this help us? Well, we can straight away ignore one half of the list. Restricting our attention to the half of the list in which we expect to find the value we are looking for, we can repeat the process, now looking halfway along this sublist.

For example, consider looking for the value 29 in the following list of ten items.

Position:   0   1   2   3   4   5   6   7   8   9

Value:      3   8  12  15  20  21  29  37  39  42


We will start our search at position 4¹. The value at this position is 20 and we know that the value we are looking for, if it is present, must lie to the right of this position. We now restrict our search to positions 5 through 9. With only one comparison we have managed to eliminate half of the entries in the list (those in positions 0 through 4). Looking halfway along the list between positions 5 and 9 we consider position 7. The value we find here is 37 and we know that the value we are looking for must lie to the left of this position. We now restrict our search to the sublist between positions 5 and 6. Halfway between these values will take us to position 5. The value we find here is 21 and we know that the item we are looking for must be to the right of this position. That restricts our search to the sublist between positions 6 and 6. Halfway between these positions is still position 6, and here we find the item we were looking for.

This example illustrates the worst case, as we needed the maximum number of comparisons to find this item. If we had been searching for the value 20 we would have found it after only one comparison. In any event, we have found the item we were looking for using only four comparisons to search our list of ten items. This is the worst case that can possibly arise. Using a binary search on a list of ten items, the most comparisons we will ever have to do is four.

Implementation

Let’s look at the algorithm, and then we will return to the analysis of this algorithm’s complexity. The binary search is very easy to code in Java. We make use of three indices: left and right to keep track of the sublist that we are currently considering, and look to keep track of the midpoint of the sublist. The method returns the index of the item if it is found, or a special value of −1 if it is not.

public static int binarySearch (Comparable list[],
                                Comparable item)
  // Search through list of entries for the item.
  // PRE: list must be non-empty and sorted into ascending order
  // POST: returns -1 if item is not found,
  //       otherwise returns the index of item
  { int left = 0,
        right = list.length-1,
        look;
    do
      { look = (left + right) / 2;       // midpoint of the current sublist
        if (list[look].compareTo(item) > 0)
          right = look - 1;              // item, if present, lies to the left
        else
          left = look + 1;               // item is at look or lies to the right
      } while (list[look].compareTo(item) != 0 && left <= right);
    if (list[look].compareTo(item) == 0)
      return look;
    else
      return -1;
  } // binarySearch

Notice how we form the condition controlling the do loop. We need to stop, either when we find the item we are looking for, or when we have ascertained that the item is not present. This can be detected by the left and right indices “crossing over” so that the left index has a value greater than that of the right index.
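To see the loop at work on the ten-item example, the following sketch repeats the same logic but also counts how many positions are examined. It is a variant written for illustration: the class name, the probes counter and the use of a generic type parameter instead of the raw Comparable type are all assumptions of this example. Like the method above, it expects a non-empty list.

```java
// Binary search with the same loop structure as in the text,
// instrumented to count the positions examined.
class BinarySearchTrace {
    // Number of positions examined by the most recent call to search.
    static int probes;

    // PRE: list is non-empty and sorted into ascending order.
    static <T extends Comparable<T>> int search(T[] list, T item) {
        int left = 0, right = list.length - 1, look;
        probes = 0;
        do {
            look = (left + right) / 2;
            probes++;
            if (list[look].compareTo(item) > 0)
                right = look - 1;
            else
                left = look + 1;
        } while (list[look].compareTo(item) != 0 && left <= right);
        return list[look].compareTo(item) == 0 ? look : -1;
    }

    public static void main(String[] args) {
        Integer[] list = {3, 8, 12, 15, 20, 21, 29, 37, 39, 42};
        System.out.println(search(list, 29) + " using " + probes + " probes"); // 6 using 4 probes
        System.out.println(search(list, 20) + " using " + probes + " probes"); // 4 using 1 probes
        System.out.println(search(list, 30));                                  // -1
    }
}
```

The first call visits positions 4, 7, 5 and 6, exactly as in the walk-through above.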

¹ To find the halfway mark we average the two positions we are considering, using integer division, which truncates the result downwards. Here, (0 + 9)/2 gives 4.


In passing, note here how the Comparable interface provides us with a very useful level of abstraction: we do not need to know anything about the objects that we are searching through, except that they have a compareTo method, and we don’t even need to know how that does the actual comparison.

Exercise 9.1 The calculation used to find the midpoint of the array is not always safe, as reported by Joshua Bloch (one of the leading Java developers at Sun) in the following article: http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html. Test this for a suitably large array (which can simply contain sequential integer values) and demonstrate the bug (hint: search for large values). Then implement any of the solutions that Bloch suggests and establish that the modified search is safe for the cases that caused the problem to arise initially.

Complexity Analysis

Turning to the efficiency of this search algorithm, we stated above that the number of comparisons for a list of ten items would never be greater than four. What is the general relationship between the length of the list and the number of comparisons? Since we halve the list at each stage, each comparison leaves us with a sublist of half the length still to be searched. Put slightly differently, the number of comparisons needed for a list of length n (i.e. $C_n$) is one plus the number of comparisons needed for a list of length n/2, or:

$$C_n = C_{n/2} + 1$$

Furthermore, the number of comparisons needed for a list of length one is just one:

$$C_1 = 1$$

How does this help us work out the complexity? We start by assuming that n is a power of 2, let’s say $n = 2^k$. Then we can rewrite our first equation as follows:

$$C_{2^k} = C_{2^{k-1}} + 1$$

This is what is known as a recurrence relation. We can use the formula itself to substitute for the first term on the right hand side. This gives:

$$\begin{aligned}
C_{2^k} &= C_{2^{k-2}} + 1 + 1 \\
        &\;\;\vdots \\
C_{2^k} &= C_{2^0} + k \\
C_{2^k} &= k + 1
\end{aligned}$$

Now, since we originally said we were assuming that $n = 2^k$, we have $\log_2 n = k$. Rewriting this last equation above then gives:

$$C_n = \log_2 n + 1$$

This is the answer we wanted: the number of comparisons required for a list of length n is of the order $\log_2 n$, if n is an exact power of two. In fact, it can be shown that even if n is not an exact power of two the number of comparisons is still proportional to $\log_2 n$. We usually drop the base when taking logarithms², and so this gives us all we need to be able to state that the binary search algorithm is O(log n). From the ranking of order functions in the previous chapter we can tell that this is a better search algorithm than the sequential searches, which are all O(n).

² The actual base used has a small effect on the constant of proportionality, which we disregard anyway.

                 Number of Comparisons Required

          n     Sequential Search     Binary Search
                 (Average Case)       (Worst Case, ⌊log2 n⌋ + 1)

         10                   5                   4
         32                  16                   6
         64                  32                   7
        128                  64                   8
      1 000                 500                  10
  1 000 000             500 000                  20

Table 9.1: Comparison of Sequential and Binary Searches

Turning to some actual examples, we saw above that we could search a list of ten items using four comparisons. This may not seem like a great saving over the five comparisons required on average for a sequential search. Table 9.1 illustrates some other examples showing the differences for various values of n. Note that this is not really a fair comparison as it gives the worst case figures for the binary search and the average case figures for the sequential search. As we can see from the table, the efficiency of the binary search becomes very much more apparent as the size of the dataset being searched increases.
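The worst case figures can also be obtained directly from the recurrence relation, without running a search at all. The sketch below is a minimal illustration; the class and method names are assumptions, and n/2 is taken to mean integer division, which is what the midpoint calculation gives when n is not an exact power of two.

```java
// Worst-case number of comparisons for a binary search, computed from
// the recurrence C(1) = 1, C(n) = C(n/2) + 1 (integer division).
class BinarySearchCost {
    static int worstCase(int n) {
        return n <= 1 ? 1 : worstCase(n / 2) + 1;
    }

    public static void main(String[] args) {
        int[] sizes = {10, 1000, 1000000};
        for (int n : sizes) {
            // Average case for a sequential search is n/2, for comparison.
            System.out.println(n + ": sequential " + (n / 2) + ", binary " + worstCase(n));
        }
        // prints 10: sequential 5, binary 4
        //        1000: sequential 500, binary 10
        //        1000000: sequential 500000, binary 20
    }
}
```

Doubling the size of the list adds just one comparison to the worst case, whereas it doubles the work for a sequential search.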

9.3.2 Interpolated Binary Search

The binary search might seem so efficient that we can hardly expect to do any better. In fact there is a modification we can make to it that improves it further. In the original binary search we stated that we would look in the array at the midpoint between the ends of the sublist we are currently considering. If we think about the way in which we as humans search through long lists, we tend to do things a little differently. For example, if we are searching through a telephone book for the name “Wells” we are unlikely to turn to the middle of the book and then decide whether we need to turn to the left or the right. We are more likely to turn to a position somewhere towards the end of the ’phone book. This is because we have a good understanding of the fact that the letter “W” comes near the end of the alphabet, and so we can start our search nearer the point where we expect that we might find the entry which we are looking for. We can adapt this idea for the binary search to improve its performance further.

The modification that is required to the standard binary search algorithm given above is quite simple in fact. Instead of simply setting the look index to the midpoint of the left and right indices, we examine the data values at the left and right positions and use them to interpolate the look position. For example, if we were looking for the value 765 in a list of numbers and the value at the left position was 13 and the value at the right position was 987, we would set look to a position (765 − 13)/(987 − 13) of the distance from left to right, which is roughly 77% of the way along.

Implementation

In essence, the modified method is as shown below. In practice, unfortunately, it is complicated by the need to prevent division by zero and to ensure that the interpolating calculations are done using floating point arithmetic. The steps required to deal with these problems have been left out below to simplify the code (the full, correct version of this method is shown in Appendix A).

public static int intBinarySearch (double list[], double item)
  // Search through list of doubles for the item,
  // using interpolated binary search.
  // PRE: list must be sorted into ascending order
  // POST: returns -1 if item is not found,
  //       otherwise returns the index of item
  { int left = 0,
        right = list.length-1,
        look;
    do
      { look = left + (right-left) * ((item-list[left]) / (list[right]-list[left]));
        if (look < left)
          look = left;
        if (look > right)
          look = right;
        if (list[look] > item)
          right = look - 1;
        else
          left = look + 1;
      } while (list[look] != item && left <= right);
    if (list[look] == item)
      return look;
    else
      return -1;
  } // intBinarySearch

Note that this relies on the values in the list having some numeric value. If they do not, then the method of interpolation will have to be adjusted accordingly.
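For readers who want something they can compile and experiment with, here is one possible way of filling in the omitted details. It is a sketch written for these notes' example data and is not the Appendix A version: the class name, the loop structure and the particular guards are assumptions of this example.

```java
// A runnable sketch of interpolated searching over a sorted array of doubles.
class InterpolationSearchSketch {
    // Returns the index of item, or -1 if it is not present.
    // PRE: list is sorted into ascending order (it may be empty).
    static int search(double[] list, double item) {
        int left = 0, right = list.length - 1;
        // Only continue while item can still lie within the current sublist.
        // This also keeps the interpolated fraction between 0 and 1.
        while (left <= right && item >= list[left] && item <= list[right]) {
            int look;
            if (list[right] == list[left]) {
                look = left;   // all values in the sublist are equal: avoid dividing by zero
            } else {
                double fraction = (item - list[left]) / (list[right] - list[left]);
                look = left + (int) ((right - left) * fraction);  // explicit conversion back to an index
            }
            if (list[look] == item)
                return look;
            if (list[look] > item)
                right = look - 1;
            else
                left = look + 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        double[] list = {3, 8, 12, 15, 20, 21, 29, 37, 39, 42};
        System.out.println(search(list, 29)); // prints 6
        System.out.println(search(list, 30)); // prints -1
    }
}
```

Checking that the item lies between the values at left and right before interpolating serves two purposes: it detects an absent item early, and it guarantees that the calculated position falls inside the sublist, so no separate range check on look is needed.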

Algorithmic Complexity

So, how efficient is the interpolated binary search? The full analysis is not very simple, but it can be shown that, if the keys are evenly distributed, the method is O(log log n). This is quite remarkable.

An experiment gave the following results (average number of comparisons) when searching through a list of 1000 random integers for 10 000 random values:

Binary search: 9.8 comparisons per search
Interpolated binary search: 3.1 comparisons per search

So, on average, the interpolated binary search can find an item in a list of 1000 values using about three comparisons! This agrees well with the theory since $\log_2 1000 \approx 10$ and $\log_2 \log_2 1000 \approx 3$. For a list of one million items the interpolated binary search is capable of finding an item using only four or five comparisons.

Exercise 9.2 Plot a graph of n, log n and log log n for n = 8, 16, 64, 128, 512 and 1024.

Exercise 9.3 Write a program that fills an array with 1000 random integer values and then performs 10 000 searches, counting the number of comparisons needed. Do your results agree with those given above?

9.3.3 The Relationship Between the Binary Search and Binary Search Trees

Perhaps unsurprisingly, the binary search algorithm we have seen in this section is closely related to the binary search tree ADT that we saw in chapter six. How can this be? Here we are dealing with a search technique for arrays of data, a sequential data structure. In chapter six we were dealing with a nonlinear, tree structure of dynamically allocated nodes. In fact, the binary search algorithm works through the data in exactly the same way that a search through an equivalent binary search tree would. The key point here is that the binary search tree must be equivalent. This means that the element found at the midpoint of the array must be the one at the root of the binary tree, and the subtrees below it should display the same properties. That is, the element found at the midpoint of one of the sublists in the binary search should be the root of an equivalent subtree in the binary search tree. For the example array we had at the beginning of this section the equivalent binary search tree is as shown in Figure 9.1.

[Figure 9.1: A Binary Search Tree Equivalent to Binary Searching an Array. The array holds the values 3 8 12 15 20 21 29 37 39 42 at positions 0 to 9.]

Work through the search patterns for these two data structures for a few examples and see how they effectively work in exactly the same way. In a sense, the binary search algorithm works through the array in a tree-like way, subdividing it just as the subtrees do in a binary search tree. Notice how the maximum depth in our binary search tree is four. This corresponds directly to the fact we had that the maximum number of comparisons we would ever need to use in a binary search on an array of ten items is four.

The last matter arising from this is that we can now discuss the efficiency of searching a binary search tree. As long as the tree is balanced (i.e. the depths of all the leaf nodes differ by no more than one, just as we have in Figure 9.1) then the algorithmic complexity of searching through a binary search tree is O(log n). The analysis of this is very similar to that which led us to the result for the binary search algorithm.
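The equivalent tree can be constructed mechanically by taking the midpoint of each sublist as the root of the corresponding subtree. The following sketch does this for the example array and reports the depth of the result. The Node class and method names are assumptions made for this illustration and are not the binary tree class from chapter six.

```java
// Builds the binary search tree that mirrors a binary search of a sorted array.
class MidpointTree {
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // The midpoint of sorted[left..right] becomes the root of this subtree,
    // using the same integer midpoint calculation as the binary search.
    static Node build(int[] sorted, int left, int right) {
        if (left > right) return null;
        int mid = (left + right) / 2;
        Node node = new Node(sorted[mid]);
        node.left = build(sorted, left, mid - 1);
        node.right = build(sorted, mid + 1, right);
        return node;
    }

    // Number of nodes on the longest path from this node down to a leaf.
    static int depth(Node node) {
        return node == null ? 0 : 1 + Math.max(depth(node.left), depth(node.right));
    }

    public static void main(String[] args) {
        int[] list = {3, 8, 12, 15, 20, 21, 29, 37, 39, 42};
        Node root = build(list, 0, list.length - 1);
        System.out.println("root " + root.key + ", depth " + depth(root)); // root 20, depth 4
    }
}
```

The root holds 20, the value at position 4 where the binary search begins, and its children hold 8 and 37, the midpoints of the two halves.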

Skills

• You should be familiar with the binary search algorithm and the interpolated binary search algorithm

• You should be able to discuss the algorithmic complexities of simple sequential searches and the two binary search methods

• You should be aware of the relationship between the binary search algorithm and a binary search tree


Chapter 10

Sorting

Objectives

• To study the following sorting algorithms: Bubble Sort, Insertion Sort, Selection Sort, Quick Sort and Merge Sort

• To implement these algorithms in Java

• To analyse the algorithmic complexity of these algorithms

10.1 Introduction

You should already be familiar with some sorting techniques from your first year of Computer Science. In addition, we slipped one sorting method in anonymously in chapter seven when we looked at the dictionary data structure. Sorting is important in many applications. For one, if we wish to use efficient searching techniques like the binary search then we need to have a sorted list of data. Furthermore, even when data is already sorted we might need it in another order. A common application that lecturers have is sorting class records (in alphabetic order) into descending mark order so that they can get an overview of the overall performance of their students in a test, etc.

In this chapter we will look at several simple sorting techniques. Some of these should already be known to you. We will then look at some more advanced (and efficient) sorting methods. It is important to bear in mind that the field of sorting algorithms is a vast one, and in a course like this we will only be scratching the surface.

As we discussed at the beginning of the previous chapter, we will make use of the Comparable interface to write generic sorting routines.

10.2 Simple Sorting Methods

The methods that we study in this section all have complexities of O(n²). As we commented in chapter eight, such simple methods, while less efficient, may still be very useful for small datasets, and are worth studying for that reason. We will be considering the Bubble Sort, the Selection Sort and the Insertion Sort.

10.2.1 The Bubble Sort

This is probably one of the simplest and least efficient sorting methods known! It works by repeatedly comparing pairs of items and swapping them if they are out of order. This is repeated until eventually the whole list is in order. We will use the following list as an example.

3 10 7 2 5 11 4 6

If we consider this list, we start by comparing 3 and 10. They are in order so we change nothing. We now compare 10 and 7. They are out of order so we swap them, giving:

3 7 10 2 5 11 4 6

Comparing 10 and 2 we swap them:

3 7 2 10 5 11 4 6

And so on:

3 7 2 5 10 11 4 6 Compare 10 and 5. Swap.

3 7 2 5 10 11 4 6 Compare 10 and 11.

3 7 2 5 10 4 11 6 Compare 11 and 4. Swap.

3 7 2 5 10 4 6 11 Compare 11 and 6. Swap.

3 7 2 5 10 4 6 11 Finished first pass.

At this point we can see that the largest element in the original list (i.e. 11) has found its way to the end of the list. It is as if it has “bubbled up” through the list, and this is where the Bubble Sort gets its name. We will redraw the list as follows to indicate that the last element is now in the correct position:

3 7 2 5 10 4 6 11

We now start all over again at the start of the list. We do not have to work all the way through to the end of the list this time, as we know that the last element is in the correct place.

3 7 2 5 10 4 6 11 Compare 3 and 7.

3 2 7 5 10 4 6 11 Compare 7 and 2. Swap.

3 2 5 7 10 4 6 11 Compare 7 and 5. Swap.

3 2 5 7 10 4 6 11 Compare 7 and 10.

3 2 5 7 4 10 6 11 Compare 10 and 4. Swap.

3 2 5 7 4 6 10 11 Compare 10 and 6. Swap.

3 2 5 7 4 6 10 11 Finished second pass.


The second biggest value (10) has now “bubbled up” into the correct position.

We now start the third pass. In this pass the number 7 will bubble up to the top of the list. We will not illustrate this pass step by step, but the final picture is:

2 3 5 4 6 7 10 11

The next pass gives:

2 3 4 5 6 7 10 11

When we do the next pass, we notice something interesting: we do not swap any pairs of numbers anywhere in the list. The algorithm can use this fact to tell (as we can see clearly from the diagram above) that the list is now in order, and it need not waste any time on doing further passes over the list.

Implementation

That is how the algorithm works. What does it look like in Java? The code is as follows:

public static void bubbleSort (Comparable list[])
  // Bubble Sort list
  { boolean madeSwap = true; // Flag to tell if we can stop
    for (int pass = 0; pass < list.length && madeSwap; pass++)
      { madeSwap = false;
        for (int k = 0; k < list.length-pass-1; k++)
          if (list[k].compareTo(list[k+1]) > 0)
            { Comparable tmp = list[k];
              list[k] = list[k+1];
              list[k+1] = tmp;
              madeSwap = true;
            }
      }
  } // bubbleSort

There are some points to note about this method. Firstly, the variable madeSwap is used for the purpose we mentioned above: as soon as we do a pass and make no swaps, we know we can stop sorting as the list is now in order. The second point to note is how we control the inner loop (the one using k). This runs from 0 up to, but not including, list.length-pass-1, where pass gives the current pass number (counting from zero). This means that, as each number bubbles its way to the top of the list, we don’t consider it again. In essence, we are ignoring the part of the list that was shaded in grey above.

Complexity Analysis

Considering the best case first, if the list is already sorted before we start what will happen? The algorithm will only work through the list once, making no swaps, and then terminate. At this stage we will have done n − 1 comparisons and no swaps. We cannot hope to do any better than this O(n) result.

The worst case would be if the list is in reverse order when we start. In this case the inner for loop is executed the following number of times:

$$(n-1) + (n-2) + (n-3) + \cdots + 2 + 1$$


This is just an arithmetic progression similar to the one we saw in chapter eight, and the result is:

$$\frac{n}{2}(n-1)$$

This is O(n²). In this worst case, this gives both the number of comparisons, and the number of swaps made (since every comparison results in a swap).

The average case analysis is more difficult, and we will not go into the details here. The end result is as follows:

Number of comparisons: $\frac{1}{4}(n-1)(n+2)$

Number of swaps: $\frac{n}{4}(n-1)$

Both of these are O(n²), of course. If you think about the way in which we move the data in this sort, it is not too surprising that it is very inefficient: each item of the data moves only one position at a time. The Bubble Sort cannot be very successful, unless the list is already sorted, or almost sorted.
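The best and worst case figures are easy to confirm by adding counters to the method. The sketch below is an instrumented variant written for illustration; the class name, the two counters and the generic type parameter are assumptions of this example, while the sorting logic follows the bubbleSort method above.

```java
// Bubble Sort with counters for comparisons and swaps.
class BubbleSortCount {
    // Totals for the most recent call to bubbleSort.
    static long comparisons, swaps;

    static <T extends Comparable<T>> void bubbleSort(T[] list) {
        comparisons = 0;
        swaps = 0;
        boolean madeSwap = true;
        for (int pass = 0; pass < list.length && madeSwap; pass++) {
            madeSwap = false;
            for (int k = 0; k < list.length - pass - 1; k++) {
                comparisons++;
                if (list[k].compareTo(list[k + 1]) > 0) {
                    T tmp = list[k];
                    list[k] = list[k + 1];
                    list[k + 1] = tmp;
                    swaps++;
                    madeSwap = true;
                }
            }
        }
    }

    public static void main(String[] args) {
        Integer[] reversed = {8, 7, 6, 5, 4, 3, 2, 1};
        bubbleSort(reversed);
        System.out.println(comparisons + " comparisons, " + swaps + " swaps"); // 28 comparisons, 28 swaps

        Integer[] sorted = {1, 2, 3, 4, 5, 6, 7, 8};
        bubbleSort(sorted);
        System.out.println(comparisons + " comparisons, " + swaps + " swaps"); // 7 comparisons, 0 swaps
    }
}
```

With n = 8 the worst case gives (8/2)(8 − 1) = 28 comparisons and swaps, and the best case gives n − 1 = 7 comparisons, in agreement with the analysis.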

Exercise 10.1 A further improvement to the performance of the Bubble Sort can be made by keeping track on each pass of the position at which the last swap was made. The next pass need only run as far as this position, not right up to list.length-pass-1. Why is this so? Implement this improvement to the above algorithm and measure the performance benefit for a list of random numbers.

10.2.2 The Insertion Sort

The key to the Insertion Sort is that we consider one item at a time and insert it into the correct position in a sorted sublist. Let’s consider the same example list again.

3 10 7 2 5 11 4 6

We will work from the right end of the list back to the left. Initially we have the last item, which we can think of as a sorted sublist of just one item¹. We’ll shade the sorted sublist as before to emphasise this point.

3 10 7 2 5 11 4 6

We now start the actual sorting process. We take the next number from the unsorted part of the list (4 in the example above) and insert it into the correct place in the sorted part of the list. In this instance the number 4 doesn’t actually move at all.

3 10 7 2 5 11 4 6

We now consider the next number in the unsorted part of the list (11) and insert it into the sorted part of the list. To do this we need to put the value 11 somewhere safe temporarily, and shift the other numbers down:

3 10 7 2 5 4 6 6      (11 held in a temporary variable)

¹ Any list with just a single item is “sorted”.


We can then put 11 in its correct position:

3 10 7 2 5 4 6 11

We now consider the next number 5, and insert it into the correct position (between 4 and 6) in the same way. This gives:

3 10 7 2 4 5 6 11

And so we continue. The next steps are as shown here:

3 10 7 2 4 5 6 11 Insert 2.

3 10 2 4 5 6 7 11 Insert 7.

3 2 4 5 6 7 10 11 Insert 10.

2 3 4 5 6 7 10 11 Insert 3.

Implementation

A generic Java method to perform the Insertion Sort is as follows:

public static void insertionSort (Comparable list[])
  // Insertion Sort list in place
  { for (int k = list.length-2; k >= 0; k--)
      { Comparable tmp = list[k];   // Item to insert into the sorted part
        int j = k+1;
        while (j < list.length && tmp.compareTo(list[j]) > 0) // Move data down
          { list[j-1] = list[j];
            j++;
          }
        list[j-1] = tmp;
      }
  } // insertionSort

The first loop here (i.e. the for loop) considers each element of the unsorted part of the list in turn. The inner loop (i.e. the while loop) is the one which moves the elements of the sorted sublist down until the correct position for inserting the tmp value is found.

Complexity Analysis

Turning to the efficiency of the Insertion Sort, the worst case arises when the original list is in reverseorder. In this situation the inner (while) loop always has to run to the end of the list. This gives thefollowing number of comparisons and data movements:

$$1 + 2 + \cdots + (n-2) + (n-1) = \frac{n}{2}(n-1)$$

This is again O(n²).

The best case is when the list is already sorted. In this case the outer (for) loop still runs n − 1 times, but the inner loop is never executed. We make one trivial assignment when we replace the tmp value back where it came from. This does neither any good nor any harm. Checking to prevent it would probably be less efficient than just doing it. So, the best case complexity is O(n).

The average case complexity for the Insertion Sort is not too difficult to work out. The outer loop always executes n − 1 times as we have just seen. On average the item to be inserted is going to be halfway along the sorted sublist, giving half the number of comparisons and movements that we had in the worst case. So, the complexity is:

$$\frac{n}{4}(n-1)$$

which is still O(n²).
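As a quick check of these three results, we can substitute the length of the eight-item example list (n = 8) into each formula. This is simply arithmetic on the expressions above, not a measurement:

$$\text{worst case: } \frac{8}{2}(8-1) = 28, \qquad \text{average case: } \frac{8}{4}(8-1) = 14, \qquad \text{best case: } 8 - 1 = 7$$

So even for this short list the worst case needs four times as many comparisons as the best case, and the gap widens rapidly as n grows.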

Comments

Before we leave the Insertion Sort there are a few other points we should note about it. Firstly, it is an ideal sorting method to use while the data is being read into a program. We simply insert each new item into the correct place in the list as it is read. Secondly, unlike the other sorting methods that we are considering, it is well suited to use with linked lists of data. We can begin by removing any item from the original list and using it as the start of a new linked list. We can then work through the old linked list item by item, inserting each one into the correct position in the new list. This works well, and avoids the data movement present in the array version: we are simply updating pointers in the two lists. If you look back to the findNode method in the ListDictionary class in chapter seven you will see that this made use of a form of Insertion Sort, placing each new item into the correct position in a linked list as it was entered.
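The first of these points can be sketched as follows. This is only an illustration under stated assumptions: the class SortedReader and the method insertInOrder are hypothetical names, an ArrayList stands in for the list, and a Scanner over a fixed string stands in for the real input source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class SortedReader {
    // Insert item into an already sorted list so that it stays sorted.
    static void insertInOrder(List<Integer> sorted, int item) {
        int pos = sorted.size();
        // Step back past every item that is larger than the new one
        while (pos > 0 && sorted.get(pos - 1) > item)
            pos--;
        sorted.add(pos, item);   // ArrayList moves the later items along for us
    }

    public static void main(String[] args) {
        List<Integer> sorted = new ArrayList<Integer>();
        // Assumption: a fixed string plays the role of the data being read
        Scanner in = new Scanner("3 10 7 2 5 11 4 6");
        while (in.hasNextInt())
            insertInOrder(sorted, in.nextInt());   // list is sorted after every read
        System.out.println(sorted);
    }
}
```

The list is in order after every item has been read, so no separate sorting pass is needed once the input is exhausted.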

Exercise 10.2 Implement the Insertion Sort for linked lists of data items, as discussed above.

10.2.3 The Selection Sort

In the Insertion Sort we worked through an unsorted sublist item by item, and placed each item into the correct position in a sorted sublist. The Selection Sort is similar, but works by finding the smallest value remaining in the unsorted sublist and placing this at the end of the sorted sublist. We select the smallest remaining value and put it in place. We will consider the same example again:

3 10 7 2 5 11 4 6

We now look through the whole list for the smallest value. This is the number 2. We swap this item with the one at the start of the list, giving:

2 10 7 3 5 11 4 6

We now repeat the process, looking through the unsorted sublist for the smallest remaining value. This is 3, and we swap this with the value at the start of this sublist (i.e. 10):

2 3 7 10 5 11 4 6


And so we continue. The next smallest value is 4. This is swapped with 7, and so on:

2 3 4 10 5 11 7 6

2 3 4 5 10 11 7 6

2 3 4 5 6 11 7 10

2 3 4 5 6 7 11 10

2 3 4 5 6 7 10 11

Implementation

This algorithm is again quite easily coded in Java:

public static void selectionSort (Comparable list[])
// Selection Sort list of length entries
  { for (int k = 0; k < list.length-1; k++)
      { int minPos = k;
        for (int j = k+1; j < list.length; j++)
          if (list[j].compareTo(list[minPos]) < 0)
            minPos = j;
        // Now swap the k’th and smallest items
        Comparable tmp = list[k];
        list[k] = list[minPos];
        list[minPos] = tmp;
      }
  } // selectionSort

Here the variable minPos is used to keep track of the position in the unsorted sublist where the smallest item is to be found. The outer loop works through the list, and gives the position of the start of the unsorted sublist at any time. The inner loop is used to find the smallest element remaining in the unsorted sublist.

Complexity Analysis

This algorithm is unusual in that it has no particularly good or bad cases: the amount of work done is exactly the same for any ordering of the original input list. The outer loop always executes n − 1 times, and the inner loop varies from n − 1 to 1. This gives a total of $\frac{n}{2}(n-1)$ comparisons again. So this algorithm is always O(n²) with regard to the number of comparisons required. When we consider the amount of data movement, however, the picture is a lot better. We only swap two items once we know which is the smallest remaining value. In this way we only make n − 1 swaps in total. This is much better, in general, than either of the other simple sorting techniques we have studied. This is especially true if the data items are large records. Of course, it doesn’t have the good O(n) best case performance of the other two simple sorts.
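The claim that the work done does not depend on the input order can be checked with a small counting sketch. The class SelectionCount and its two counters are hypothetical additions for illustration, and the sort is specialised to int values to keep it short.

```java
class SelectionCount {
    static long comparisons;   // key comparisons made by the last sort
    static long swaps;         // swaps made by the last sort

    // Selection Sort for int values (an assumption), with two counters added.
    static void selectionSort(int[] list) {
        comparisons = 0;
        swaps = 0;
        for (int k = 0; k < list.length - 1; k++) {
            int minPos = k;
            for (int j = k + 1; j < list.length; j++) {
                comparisons++;
                if (list[j] < list[minPos])
                    minPos = j;
            }
            int tmp = list[k];             // swap the k'th and smallest items
            list[k] = list[minPos];
            list[minPos] = tmp;
            swaps++;
        }
    }

    public static void main(String[] args) {
        int[][] inputs = {
            {2, 3, 4, 5, 6, 7, 10, 11},    // already sorted
            {11, 10, 7, 6, 5, 4, 3, 2},    // reverse ordered
            {3, 10, 7, 2, 5, 11, 4, 6}     // the example list
        };
        for (int[] input : inputs) {
            selectionSort(input);
            System.out.println(comparisons + " comparisons, " + swaps + " swaps");
        }
    }
}
```

For each of the three eight-item inputs this reports 28 comparisons and 7 swaps, matching the formulas with n = 8.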


Sort       Operation     Worst Case   Average Case     Best Case
Bubble     Comparisons   n(n-1)/2     (n-1)(n+2)/4     n-1
           Swaps         n(n-1)/2     n(n-1)/4         0
Insertion  Comparisons   n(n-1)/2     n(n-1)/4         n-1
           Moves         n(n-1)/2     n(n-1)/4         n-1
Selection  Comparisons   n(n-1)/2     n(n-1)/2         n(n-1)/2
           Swaps         n-1          n-1              n-1

Table 10.1: Comparison of Simple Sorting Algorithms

10.2.4 Summary of Simple Sorting Methods

We can summarise the behaviour of these three simple sorting algorithms as shown in Table 10.1. As we can see from this table, the Selection Sort has the most consistent performance, and generally the least amount of data manipulation. The other two sorting methods do have better best case complexities and this may weigh in their favour if sorted, or almost sorted, lists are common inputs.

Exercise 10.3 Write a program that will allow you to test these algorithms with sorted, reverse ordered and random lists of data. Time the execution of all three algorithms, and verify the theoretical results shown in Table 10.1.

Exercise 10.4 Modify the sorting algorithms so that they count the number of comparisons and data exchanges/moves that they perform. Write a program that will allow you to test these algorithms with sorted, reverse ordered and random lists of data (or modify your program from Exercise 10.3). Use this to verify the theoretical results shown in Table 10.1.

10.3 Indirect Sorting

Before taking a look at some more powerful sorting techniques we will make a small digression to consider the problems associated with sorting large records of data. We have studied the number of data moves or exchanges made by the simple algorithms of the previous section. We also commented on the fact that the Selection Sort does markedly less swapping of data items and so is preferable when dealing with large data structures. Of course, moving data around is not a time consuming task when the data items are simple data types like integers. In Java an integer is stored in only four bytes. Even a double or long value takes only eight bytes. Other data structures may be very much larger. For example, a student record containing names and addresses, matric marks, courses taken, marks obtained, fee data, disciplinary records, etc. might be several kilobytes in size. Shuffling these large records around as we sort a list of records becomes very time consuming in itself (this is generally not an issue in Java, of course. Why?).

A solution to this problem is to sort the data indirectly. To do this we set up a secondary list containing only the subscripts of the actual records in the list of data which we want sorted. We then sort the subscripts in this second list into order. Consider the following list (presumably of very large student records, but only the keys are shown here):

Subscript:  0          1          2          3          4
Key:        604G1234   602W4567   603C9823   605B3465   603A9182
            ...        ...        ...        ...        ...

The secondary array that we require, with the subscripts in order, looks like this:


Position:   0   1   2   3   4
Subscript:  1   4   2   0   3

What this tells us is that the element of the student list which should be in the first position is the one with subscript 1 (i.e. 602W4567). The last item in the list should be the one with subscript 3 (i.e. 605B3465). In other words, we can use this secondary list of subscripts to access the original list in ascending order, even though the original list has not been physically sorted!
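The way the secondary list is used can be sketched as follows. To keep the sketch short it builds the list of subscripts with the library method Arrays.sort and a comparator rather than with one of the sorting algorithms from this chapter; the class IndexDemo and the method buildIndex are hypothetical names, and only the keys of the records are stored.

```java
import java.util.Arrays;
import java.util.Comparator;

class IndexDemo {
    // Build a list of subscripts that visits keys in ascending key order.
    // The keys array itself is never rearranged.
    static Integer[] buildIndex(final String[] keys) {
        Integer[] index = new Integer[keys.length];
        for (int k = 0; k < keys.length; k++)
            index[k] = k;                       // start in simple subscript order
        Arrays.sort(index, new Comparator<Integer>() {
            public int compare(Integer a, Integer b) {
                return keys[a].compareTo(keys[b]);   // compare the data, not the subscripts
            }
        });
        return index;
    }

    public static void main(String[] args) {
        String[] keys = {"604G1234", "602W4567", "603C9823", "605B3465", "603A9182"};
        Integer[] index = buildIndex(keys);
        System.out.println(Arrays.toString(index));
        for (int pos = 0; pos < index.length; pos++)
            System.out.println(keys[index[pos]]);    // indirect access, in ascending order
    }
}
```

For the five keys shown above the list of subscripts comes out as 1, 4, 2, 0, 3, and the final loop prints the keys in ascending order although the keys array is untouched.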

Implementation

How can we implement this idea? Well, we start off with a secondary list which is in simple subscript order. It will look like the following:

Position:   0   1   2   3   4
Subscript:  0   1   2   3   4

We then sort this list, but using the values of the student numbers when we do the comparisons. We can use any sorting algorithm we like. The only thing that we need to remember is that the list we are sorting contains subscripts and not the actual data. Let’s consider the Insertion Sort for example. Notice how this now makes indirect accesses to the list array.

public void insertionSort (Comparable list[],
                           int index[])
// Insertion Sort index to access list in ascending order.
  { for (int k = 0; k < list.length; k++)
      index[k] = k; // Set up index list
    for (int k = list.length-2; k >= 0; k--)
      { int tmp = index[k];
        int j = k+1;
        while (j < list.length && list[tmp].compareTo(list[index[j]]) > 0)
        // Move data down
          { index[j-1] = index[j];
            j++;
          }
        index[j-1] = tmp;
      }
  } // insertionSort

Compare this with the original Insertion Sort, and trace through how it sorts the example list of student records shown above.

Exercise 10.5 Rewrite the Bubble Sort and Selection Sort algorithms to make use of this technique.

10.4 More Efficient Sorting Methods

All the sorting methods we studied in Section 10.2 were essentially O(n²). In this section we want to look at two sorting methods that are generally more efficient. These are the Quick Sort and the Merge Sort. Both of these algorithms work by subdividing the list being sorted into successively smaller sublists, and considering these independently.

10.4.1 The Quick Sort

The Quick Sort was invented by C.A.R. Hoare, a very famous computer scientist, in 1960. It has become one of the most popular sorting methods. The central idea in the Quick Sort is to choose one of the items in the list (it doesn’t really matter which one). This is called the partition element. We then divide the list into two sublists: one consisting of those items smaller than the partition element, and one consisting of those items greater than the partition element. We then repeat the process for the two sublists. Let’s see how it works for our example:

3 10 7 2 5 11 4 6

First, we need to decide how to pick the partition element. One of the simplest (and usually quite effective) methods is just to pick the first number in the list. In our example that gives us the value 3 and we need to partition the list into two sublists: those less than 3 and those greater than 3. The result of this can be imagined as follows:

[2] 3 [7 10 5 11 4 6]

While we have drawn this as three separate lists, the data is, of course, still stored in one array. We are just going to consider parts of it as separate sublists from now on. Another point which this raises is that the choice of the partition element was not particularly good here. Ideally, we would like the partition element to fall exactly in the middle of the list after we have partitioned the lists. The last point to notice about this is that the partition element is now in its correct place: all the values to its left are less than it and all the values to its right are greater. We need not consider the partition element again.

Returning to the example, the left sublist only contains a single number and so needs no further sorting. We can turn our attention to the right sublist and apply the Quick Sort method to this. We choose the value 7 as our partition element, and get the following picture:

2 3 4 6 5 7 11 10

We now have two further sublists to sort, and so we reapply the Quick Sort algorithm to each of these in turn. Considering the left sublist first, we get 4 as the partition element. This gives an empty left sublist and 6 and 5 in the right sublist:

2 3 4 6 5 7 11 10

Quick Sorting the list with 6 and 5 in it gives just 5 in the left sublist and an empty right sublist. We are now finished with everything to the left of 7 and can return to consider its right sublist:

2 3 4 5 6 7 11 10

Taking 11 as the partition element we get a left sublist with just 10 in it and no right sublist. This gives the final sorted list:

2 3 4 5 6 7 10 11

There are two things to note about the Quick Sort. Firstly, the method is inherently recursive. We partition the list into two sublists, and then apply the Quick Sort to them independently. Secondly, the key step is the partitioning of the list into sublists about a partition element. We will consider the implementation of the partitioning algorithm first.

Implementation

The partitioning method needs to be able to work with any sublist, so we will pass it the whole list, together with the indices of the left and right ends of the sublist with which it is to work. We also need to return the index of the position where the partition element ends up so that the Quick Sort algorithm can work with the left and right sublists. In Java this is as follows:

private static int partition (Comparable list[], int start, int end)
// Partition list between start and end, returning the partition point.
// The value at list[start] is used as the partition element.
// This method is used by the Quick Sort.
  { int left = start,
        right = end;
    Comparable tmp;
    while (left < right)
      { // Work from right end first
        while (list[right].compareTo(list[start]) > 0)
          right--;
        // Now work up from start
        while (left < right && list[left].compareTo(list[start]) <= 0)
          left++;
        if (left < right)
          { tmp = list[left];
            list[left] = list[right];
            list[right] = tmp;
          }
      }
    // Exchange the partition element with list[right]
    tmp = list[start];
    list[start] = list[right];
    list[right] = tmp;
    return right;
  } // partition

This method makes use of the two subscripts left and right to do the partitioning. The right subscript first works down from the end of the list until a value less than or equal to the partition element is encountered. This value should be in the left sublist. We then start the second nested while loop which works the left subscript up from the beginning of the list until a value greater than the partition element is found, or else we reach the right subscript. If we have not reached the right subscript, we swap the two values indicated by left and right as they are in the wrong sublists. When left reaches right we simply exchange the partition element (list[start]) with the item at position right (we could also use left, since they are equal). We then return right as the position where the partition element can now be found.
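To see the partitioning step in isolation, the following sketch applies the same logic to the example list. It is specialised to int values for brevity, and the class name PartitionDemo is a hypothetical one chosen for this illustration.

```java
import java.util.Arrays;

class PartitionDemo {
    // Same partitioning scheme as above, for int values (an assumption).
    // list[start] is the partition element; its final position is returned.
    static int partition(int[] list, int start, int end) {
        int left = start, right = end;
        while (left < right) {
            while (list[right] > list[start])                   // work down from the right
                right--;
            while (left < right && list[left] <= list[start])   // work up from the left
                left++;
            if (left < right) {                                 // both values are in the wrong sublist
                int tmp = list[left];
                list[left] = list[right];
                list[right] = tmp;
            }
        }
        int tmp = list[start];                                  // put the partition element in place
        list[start] = list[right];
        list[right] = tmp;
        return right;
    }

    public static void main(String[] args) {
        int[] data = {3, 10, 7, 2, 5, 11, 4, 6};
        int p = partition(data, 0, data.length - 1);
        System.out.println(p + " " + Arrays.toString(data));
        // prints: 1 [2, 3, 7, 10, 5, 11, 4, 6]
        p = partition(data, p + 1, data.length - 1);
        System.out.println(p + " " + Arrays.toString(data));
        // prints: 5 [2, 3, 4, 6, 5, 7, 11, 10]
    }
}
```

The two calls correspond to the first two partitioning steps of the worked example: 3 ends up at position 1, and 7 then ends up at position 5.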

The partition method does almost all the work of the Quick Sort algorithm and so the actual Quick Sort method itself is very simple. This version is recursive:

private static void recursiveQS (Comparable list[], int start, int end)
// Recursive Quick Sort list between start and end
  { if (start < end)
      { int partitionPoint = partition(list, start, end);
        recursiveQS(list, start, partitionPoint-1);
        recursiveQS(list, partitionPoint+1, end);
      }
  } // recursiveQS

public static void quickSort (Comparable list[])
// Quick Sort the list - actually just calls recursiveQS
  { recursiveQS(list, 0, list.length-1);
  } // quickSort

We have chosen to define a trivial quickSort method which has exactly the same parameter list as the simple sorts we looked at earlier. This method simply sets up the parameters which the actual Quick Sort method (recursiveQS) uses and then calls it. Notice just how simple the recursiveQS method is. It calls the partition method, and then recursively Quick Sorts the two sublists.

A Non-Recursive Implementation

The algorithm above serves to illustrate the way in which the Quick Sort works, but we have commented before on the fact that recursion is often an inefficient way to handle repetitive tasks. We can write a non-recursive Quick Sort algorithm, but we need some way of remembering which sublists we need to come back to and sort. We can do this using a stack or a queue. We will do it here using the Stack class from the java.util package, but we could just as easily have used one of our own from chapter five. This version of the Quick Sort algorithm uses exactly the same partition method we saw previously.

public static void iterativeQuickSort (Comparable list[])
// Quick Sort list of length elements using a stack
// (requires import java.util.Stack; at the top of the file)
  { class Pair // Store (start,end) pairs
      { public int start, end;
      } // class Pair
    Stack<Pair> s = new Stack<Pair>();
    Pair p = new Pair();
    p.start = 0;
    p.end = list.length-1;
    s.push(p); // Starting pair on the stack - the whole list
    while (! s.empty())
      { p = s.pop(); // Get next sublist to sort
        while (p.start < p.end)
          { int partitionPos = partition(list, p.start, p.end);
            // Now push left sublist
            Pair tmp = new Pair();
            tmp.start = p.start;
            tmp.end = partitionPos-1;
            s.push(tmp);
            // Start work on right sublist
            p.start = partitionPos+1;
          }
      }
  } // iterativeQuickSort

Sublists are represented by pairs of subscripts: the start and end of the sublist. We have declared a small, inner class (Pair) to hold these pairs of values, and have created a stack to hold the pairs. The method works by pushing the left sublist onto the stack each time we partition part of the list, and then continuing straight away to sort the right sublist.

Exercise 10.6 The iterative version of the Quick Sort above removes the recursion successfully. However it may run out of space for the stack when working with very large lists. One way to help prevent this is to push the larger of the left and right sublists and work on the smaller of the two, so that the sublist being worked on at least halves in size each time something is pushed. Modify the algorithm above to do this.

Exercise 10.7 We saw above how the choice of the first item as the partition element could cause one of the sublists to be much smaller than the other. A method for preventing this is to choose the median of the first, middle and last values in the list as the partition element. Implement this improvement in either of the Quick Sort versions above and measure what, if any, improvement it makes.

Exercise 10.8 Another method for preventing the poor choice of the partition element is simply to choose a random element of the list as the partition element. Implement this improvement in either of the Quick Sort versions above and measure what, if any, improvement it makes.

Exercise 10.9 Another modification which is often made to the Quick Sort is to use a simpler sorting technique once the length of the sublists drops below a certain threshold. Make this change to either of the Quick Sort versions. Use the Selection Sort and set the threshold length to ten. Measure the impact that this has on the performance of the Quick Sort algorithm.

Complexity Analysis

Now, let’s turn to the analysis of the Quick Sort. This is not affected by the implementation as a recursive or iterative method. All that this decision affects is the magnitude of the constants, which we drop from our order notation anyway.

Let’s consider the best case performance first. We stated above that the best case was when the partition element divided the list into two sublists of equal length. If we let the number of comparisons performed for a list of n items be $C_n$ then we get the following recurrence relation:

$$C_n = 2C_{n/2} + n, \qquad C_1 = 0$$

In other words, the number of comparisons needed to sort a list of n items is the n comparisons done by the partitioning algorithm, plus the two sets of comparisons done for the two sublists (i.e. $2C_{n/2}$).

This recurrence relation can be solved in a manner similar to that we used when analysing the binary search in the last chapter. We start by assuming that n is a power of 2: $n = 2^k$. This gives:

$$C_{2^k} = 2C_{2^{k-1}} + 2^k$$

If we divide both sides of this equation by $2^k$ we get:

$$\frac{C_{2^k}}{2^k} = \frac{C_{2^{k-1}}}{2^{k-1}} + 1$$

Using this equation to substitute for the first term on the right hand side we get:

$$\frac{C_{2^k}}{2^k} = \frac{C_{2^{k-2}}}{2^{k-2}} + 1 + 1$$

Continuing substituting like this k times we get to:

$$\frac{C_{2^k}}{2^k} = 1 + 1 + \cdots + 1 + 1 = k$$

In other words:

$$C_{2^k} = 2^k k$$

We started by assuming that $n = 2^k$, so we have $k = \log_2 n$. This all means that we can rewrite the above equation as:

$$C_n = n \log_2 n$$

This is the result that we wanted: the number of comparisons done by the Quick Sort, in the best case, is $n \log_2 n$. This means the Quick Sort algorithm is O(n log n) in the best case.
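The closed form can be checked numerically against the recurrence it was derived from. The sketch below (the class RecurrenceCheck is a hypothetical name) evaluates the recurrence directly for powers of two and prints it next to $n \log_2 n$:

```java
class RecurrenceCheck {
    // Evaluate C(n) = 2 C(n/2) + n with C(1) = 0, for n a power of two.
    static long comparisons(long n) {
        return (n == 1) ? 0 : 2 * comparisons(n / 2) + n;
    }

    public static void main(String[] args) {
        for (int k = 0; k <= 10; k++) {
            long n = 1L << k;                  // n = 2^k, so log2 n = k
            System.out.println(n + ": recurrence = " + comparisons(n)
                               + ", n log2 n = " + (n * k));
        }
    }
}
```

The two columns agree for every power of two tried, as the derivation predicts.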

The average case is rather harder to analyse. We expect the partition point, on average, to fall in the middle of the list, and so it is not unreasonable to expect the average case performance to be similar to the best case performance. In fact, it can be shown that the number of comparisons in the average case is approximately $1.38\, n \log_2 n$, so the algorithm is still O(n log n).

What about the worst case performance? This turns out to be the downfall of the Quick Sort. For example, if the list is already sorted then the choice of the first item as the partition element will mean that we always have an empty left sublist and a right sublist of n − 1 elements. This means that we need the following number of comparisons:

$$(n-1) + (n-2) + \cdots + 2 + 1 = \frac{n}{2}(n-1)$$

and the algorithm degenerates to O(n²). If the list was originally in reverse order, we would get the same poor performance. Of course, Exercises 10.7 and 10.8 above suggested modifications to the algorithm, which can help prevent the worst case from arising.
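To get a feel for the size of the difference between these cases, we can substitute n = 1024 (so that $\log_2 n = 10$) into the three results. This is just arithmetic on the formulas above:

$$\text{best case: } 1024 \times 10 = 10\,240, \qquad \text{average case: } 1.38 \times 10\,240 \approx 14\,131, \qquad \text{worst case: } \frac{1024}{2}(1024 - 1) = 523\,776$$

The worst case therefore needs roughly fifty times as many comparisons as the best case for a list of this length.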

Exercise 10.10 Write a program that will allow you to test the Quick Sort with sorted, reverse ordered and random lists of data. Time its execution for all three input lists and verify the results shown above.

Exercise 10.11 Implement the modifications suggested in Exercises 10.7 and 10.8, and see what effect they have on the results of Exercise 10.10.

10.4.2 The Merge Sort

The Quick Sort worked on the basis of partitioning the list into two sublists, and then sorting the two sublists independently. The Merge Sort is rather different. It simply splits the list into two halves, sorts each of them independently, and then merges the two sorted sublists together. This means that the merge algorithm is the key to the Merge Sort. Let’s consider what we need to do in order to merge two sorted lists together.

Consider the following two sorted lists of information:

2 3 7 10

4 5 6 11

We can merge these by starting at the beginning of both lists and working through them, copying the smaller of the two current items to a new array. For example, given the two lists above we would start at the beginning of the first list and copy the value 2 to the new list:

First list:   [2] 3 7 10
Second list:  4 5 6 11
New list:     2

We will cross out each number once we have dealt with it (crossed-out numbers are shown here in square brackets). Considering the two numbers now at the beginning of the two lists we have 3 and 4, so we copy 3 to the new list:

First list:   [2] [3] 7 10
Second list:  4 5 6 11
New list:     2 3

We now have 7 and 4 to consider, and so copy 4 to the new list:

First list:   [2] [3] 7 10
Second list:  [4] 5 6 11
New list:     2 3 4

And so we continue until one of the lists is exhausted:

First list:   [2] [3] 7 10
Second list:  [4] [5] 6 11          Choose 5.
New list:     2 3 4 5

First list:   [2] [3] 7 10
Second list:  [4] [5] [6] 11        Choose 6.
New list:     2 3 4 5 6

First list:   [2] [3] [7] 10        Choose 7.
Second list:  [4] [5] [6] 11
New list:     2 3 4 5 6 7

First list:   [2] [3] [7] [10]      Choose 10.
Second list:  [4] [5] [6] 11
New list:     2 3 4 5 6 7 10

At this stage we have finished with the first list. All that remains to be done is to copy whatever is left in the second list into the new array. Of course, if the original lists had been different this might have been the other way around. Also, in general, we will have more than just one element left over to copy when one of the lists is exhausted. Copying the remnant of the second list over (in this case just the single value, 11) gives:

First list:   [2] [3] [7] [10]
Second list:  [4] [5] [6] [11]
New list:     2 3 4 5 6 7 10 11
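The merging process just traced by hand can be sketched as a method that takes two separate sorted arrays and returns a new one. The class MergeDemo is a hypothetical name and the method works on int values for brevity; it illustrates the idea only and is not the merge method used by the Merge Sort itself.

```java
import java.util.Arrays;

class MergeDemo {
    // Merge two sorted arrays into a new sorted array.
    static int[] merge(int[] a, int[] b) {
        int[] result = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;        // positions in a, b and result
        // Copy the smaller of the two current items until one list is exhausted
        while (i < a.length && j < b.length)
            result[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        // Copy the remnant of whichever list is not yet exhausted
        while (i < a.length)
            result[k++] = a[i++];
        while (j < b.length)
            result[k++] = b[j++];
        return result;
    }

    public static void main(String[] args) {
        int[] first = {2, 3, 7, 10};
        int[] second = {4, 5, 6, 11};
        System.out.println(Arrays.toString(merge(first, second)));
        // prints: [2, 3, 4, 5, 6, 7, 10, 11]
    }
}
```

Only one of the two remnant loops ever does any work, since the main loop stops as soon as either list is exhausted.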


So, from two already sorted lists we have managed to create a new sorted list. How does this help us with sorting? As already stated, the Merge Sort works by splitting an unsorted list into two separate halves, sorting them, then merging the resulting sorted lists. How does it sort the two halves? Well, it splits them each in two, sorts them and then merges the resulting sublists. In this way the Merge Sort is recursive. The stop case for the recursion arises when the length of the sublist to be sorted is one. This is obviously sorted, as a trivial case of a list. With this in mind we can look at the Merge Sort algorithm itself.

Implementation

Written in Java, the essence of the Merge Sort is as follows:

private static void recursiveMS (Comparable list[], int start, int end)
// Recursive function to perform Merge Sort
  { if (start < end)
      { int midPoint = (start + end) / 2;
        recursiveMS(list, start, midPoint);
        recursiveMS(list, midPoint+1, end);
        merge(list, start, midPoint, end);
      }
  } // recursiveMS

public static void mergeSort (Comparable list[])
// Merge Sort list of length entries
  { recursiveMS(list, 0, list.length-1);
  } // mergeSort

As we had for the Quick Sort, the Merge Sort is written as a number of methods. The mergeSort method simply calls the recursive method (recursiveMS) to do the actual sorting. This takes the start and end of the sublist which it is to sort as parameters. It divides the list into two, and then calls itself recursively. Once the two sublists are sorted they are merged using the following merge method. This differs slightly from our previous discussion in that it has to merge two sublists contained in one array.

public static void merge (Comparable list[], int first, int mid, int last)
// Merge list from first to mid with list from mid+1 to last
  { Comparable[] tmp = new Comparable[last-first+1]; // Temporary array for merging
    int i = first, // Subscript for first sublist
        j = mid+1, // Subscript for second sublist
        k = 0; // Subscript for merged list
    // Merge sublists together
    while (i <= mid && j <= last)
      if (list[i].compareTo(list[j]) < 0)
        tmp[k++] = list[i++];
      else
        tmp[k++] = list[j++];
    // Copy remaining tail of one sublist
    while (i <= mid)
      tmp[k++] = list[i++];
    while (j <= last)
      tmp[k++] = list[j++];
    // Now copy tmp back into list
    for (k = first; k <= last; k++)
      list[k] = tmp[k-first];
  } // merge

This method starts by creating a new temporary array tmp into which it can merge the two sublists. It then performs the merge operation, as we described it above. Notice the two while loops which copy the remnant of one of the original lists across. If you look carefully at the conditions under which the first while loop terminates you will see that only one of the other two while loops is ever executed. Once the two sublists are merged into tmp, the contents of tmp are copied back into the original array.

Exercise 10.12 Just as we had for the Quick Sort (see Exercise 10.9), the Merge Sort can be modified to use one of the simple sorting techniques of Section 10.2 when the size of the sublists drops below some threshold. This reduces the amount of recursion and also the amount of copying data to and from the temporary arrays used in merging. Implement this modification and measure the impact it has on the efficiency of the Merge Sort algorithm. Try different threshold values and see which gives the best results.

Exercise 10.13 Develop an iterative version of the Merge Sort, using a stack in the same way as we did with the iterative version of the Quick Sort.

Complexity Analysis

Let’s consider the algorithmic complexity of the Merge Sort. In this method we always divide the list into two equal sublists (assuming the length of the list is a power of two). An analysis similar to that of the best case of the Quick Sort reveals that the Merge Sort is also O(n log n). If the length of the list is not an exact power of two the actual complexity is very slightly worse, but is still O(n log n). This is independent of whether the list is sorted or randomly ordered at the start. So, we do not have the same poor worst case behaviour that we saw with the Quick Sort. Why then do we not use the Merge Sort more frequently? The problem with the Merge Sort arises not from its time complexity, which, as we have just seen, is always very good, but with its space complexity. Because we need the temporary arrays for the merging process this algorithm requires twice as much memory as the Quick Sort. When the lists that we are sorting are very large this is too high a price to pay and the Quick Sort is to be preferred. In general, the special case which produces the O(n²) behaviour of the Quick Sort is unlikely to arise. Even if it does arise, a better choice of the partition element can help overcome the deficiencies of the Quick Sort. For these reasons it, rather than the Merge Sort, is one of the most widely used sorting techniques in practice.

10.5 Concluding Remarks

We have now seen several sorting methods, some of them O(n²) and some O(n log n). There are many other ways of sorting lists. Some names you may have heard of are Shell’s Diminishing Increment Sort (often called simply the Shell Sort) and the Heap Sort. The Tree Sort works by building a binary search tree from the data and then performing an in-order traversal of the tree. This is a useful technique when dealing with dynamically allocated lists of data rather than arrays. These sorting algorithms are essentially all O(n log n) or close to it.

Exercise 10.14 Plot a graph of n² and n log n for n = 8, 16, 64, 128, 512 and 1024.

Exercise 10.15 Modify the Quick Sort and Merge Sort algorithms so that they count the number of comparisons that they perform. Write a program that will allow you to test these algorithms with sorted, reverse ordered and random lists of data, and verify the theoretical results found for these sorts.

Exercise 10.16 Shell’s Diminishing Increment Sort (usually just called the Shell Sort) is a relatively simple, but quite efficient sorting technique. It is similar in some respects to the Bubble Sort (but far more efficient than the Bubble Sort). Like the Bubble Sort, the idea is to compare and swap pairs of values in a list, but using a “gap” between the pairs (i.e. comparing the kth element with the one at position k + gap). The initial gap is set to half the length of the list. When a pass is made with no items being swapped, then the gap is halved, and the process repeated. This continues until the gap is zero. Implement this sorting algorithm, and test its efficiency in comparison to the other sorts in this chapter.

Skills

• You should be familiar with the sorting algorithms of this chapter: Bubble Sort, Insertion Sort, Selection Sort, Quick Sort and Merge Sort

• You should be able to discuss the algorithmic complexities of these sorting algorithms

• You should be aware that there are other sorting algorithms available


Index of Data Structures and Algorithms

This section gives an index of the major data structures and algorithms that have been discussed in these notes.

Data Structures

1 Lists of Integers
  1.1 Array implementation: IntegerVector (page 35)
  1.2 Linked list implementation: IntegerList (page 42)

2 Generic Lists: Linked list implementations
  2.1 Using polymorphism: ObjectList (page 51)
  2.2 Using generics: GenericList (page 55)

3 Stacks
  3.1 Array implementation: ArrayStack (page 62)
  3.2 Linked list implementation: ListStack (page 65)

4 Queues
  4.1 Array implementation: ArrayQueue (page 70)
  4.2 Linked list implementation: ListQueue (page 74)

5 Deques
  5.1 Linked list implementation (page 84)
    5.1.1 Implementation using doubly-linked lists (page 84)
    5.1.2 Implementation using doubly-linked lists with header nodes (page 87)

6 Trees
  6.1 General Binary Tree (page 98)
  6.2 Binary Search Tree (page 105)

7 Graphs
  7.1 Adjacency matrix representation (page 116)
  7.2 Edge list representation (page 117)

8 Dictionary (page 120)

9 Hash Tables
  9.1 Internal hashing with open addressing (page 131)
  9.2 External hashing (page 137)

Algorithms

1 Maze searching (using a queue) (page 76)

2 Converting a general tree to a binary tree (page 97)

3 Guessing Game (using a general binary tree) (page 101)

4 Indexing text (using a dictionary) (page 126)

5 Searching Algorithms
  5.1 Sequential Search (page 156)
  5.2 Binary Search (page 157)
  5.3 Interpolated Binary Search (page 160)

6 Sorting Algorithms
  6.1 Bubble Sort (page 164)
  6.2 Insertion Sort (page 166)
  6.3 Selection Sort (page 168)
  6.4 Quick Sort (page 172)
    6.4.1 Recursive Quick Sort (page 172)
    6.4.2 Iterative Quick Sort (page 174)
  6.5 Merge Sort (page 176)


Index

abstract data type, 28
abstract data types
  nonlinear, 93
accessor methods, 37
accessors, 37
ADT, see abstract data type
algorithmic complexity, 145
assertions, 8

Big-O notation, 146
binary search, 156
binary search trees, 104
binary trees, 94
  proper, 95
breadth-first search, 80
Bubble Sort, 164

circular queue, 69
circularly-linked list, 84
Comparable interface, 105, 154
Cramer’s Rule, 144

depth-first search, 81
deques, 83
dictionaries, 118
documentation, 10
documentation comments, 10
documentation tags, 10
doubly-linked list, 83
dynamic data structure, 40

Fibonacci series, 17

generic data structures, 49
generics, 54
graphs, 114
  adjacency matrix representation, 115
  edge list representation, 116

hash tables, 128
  clustering, 130
  collisions, 129
  external hashing, 135
  internal hashing with open addressing, 129
    probing, 129
hashing function, 128
header node, 85

indirect sorting, 170
Insertion Sort, 166
instanceof, 53
interfaces, 19
  abstraction, 29, 157
  polymorphism, 22
interpolated binary search, 159
iterator, 104
iterators
  dictionary, 124
  external hash table, 139
  internal hash table, 134
  tree, 112

Javadoc, 10

linked lists, 40
list
  circularly-linked, 84
  doubly-linked, 83
list-head node, see header node

Merge Sort, 177

order notation, 146

perfect hashing function, 129
polymorphism, 49
  with interfaces, 22
postconditions, 8
preconditions, 8
priority queue, 75
problem-solving algorithms, 75
  breadth-first, 80
  depth-first, 81
procedural abstraction, 27
program quality, 4
  improving, 7

queues, 68
  array implementation, 68
  circular queue, 69
  linked list implementation, 72
  travelling queue problem, 69
Quick Sort, 172

recurrence relation, 158
recursion, 13
  factorial function, 15
  reversing a string, 16
  tree definition, 94
  tree methods, 108
  tree traversal, 103

searching
  binary search, 156
  interpolated binary search, 159
Selection Sort, 168
single inheritance, 19
sorting
  Bubble Sort, 164
  indirect, 170
  Insertion Sort, 166
  Merge Sort, 177
  Quick Sort, 172
  Selection Sort, 168
stacks, 60
  array implementation, 60
  linked list implementation, 63
StringTokenizer, 126

Towers of Hanoi, 18
travelling queue problem, 69
trees, 93
  traversals, 102
tri-diagonal method, 145

vectors, 33


Bibliography

[1] D.A. Bailey. Java Structures: Data Structures in Java for the Principled Programmer. McGraw-Hill,1999.

[2] G. Bracha. Generics in the Java programming language. URL: http://java.sun.com/j2se/1.5/-pdf/generics-tutorial.pdf, 2004.

[3] T.A. Budd. Classic Data Structures in Java. Addison Wesley Longman, 2001.

[4] F.M Carrano and J.J. Prichard. Data Abstraction and Problem Solving with Java: Walls and Mirrors.Addison Wesley Longman, 2001.

[5] W.J. Collins. Data Structures and the Java Collections Framework. McGraw-Hill, 2002.

[6] Collins Concise Dictionary. HarperCollins, 5th edition, 2001.

[7] N. Dale, D.T. Joyce, and C. Weems. Object-Oriented Data Structures Using Java. Jones and Bartlett,2nd edition, 2006. Superb coverage of advanced programming in Java — highly recommended.

[8] H.M. Deitel and P.J. Deitel. Java: How to Program. Prentice Hall, 3rd edition, 1999.

[9] A. Drozdek. Data Structures and Algorithms in Java. Brooks/Cole, 2001.

[10] M.T. Goodrich and R. Tamassia. Data Structures and Algorithms in Java. John Wiley and Sons,4th edition, 2006.

[11] D. Ince. From Data Structures to Patterns. Macmillan Press, 2000.

[12] J. Lewis and W. Loftus. Java Software Solutions: Foundations of Program Design. Addison Wesley,4th edition, 2005.

[13] M. Main. Data Structures and Other Objects Using Java. Addison-Wesley, 2nd edition, 2003.

[14] B.R. Preiss. Data Structures and Algorithms with Object-Oriented Design Patterns in Java. JohnWiley and Sons, 2000.

[15] G.W. Rowe. An Introduction to Data Structures and Algorithms with Java. Prentice Hall, 1998.

[16] S. Sahni. Data Structures, Algorithms, and Applications in Java. McGraw-Hill, 2000.

[17] M. Shaw. The impact of abstraction concerns on modern programming languages. Technical ReportComputer Science Technical Report CMU-CS-80-116, Carnegie-Mellon University, 1980.

[18] D.A. Watt and D.F. Brown. Java Collections: An Introduction to Abstract Data Types, Data Struc-tures and Algorithms. John Wiley and Sons, 2001.

[19] M.A. Weiss. Data Structures and Algorithm Analysis in Java. Addison-Wesley, 1999.

185

Page 195: Rhodes UniversityContents I Introduction 1 1 What is \Advanced Programming"? 3 1.1 Introduction

[20] C. Willis and D. Paddon. Abstraction and Specification with Modula-2. Pitman, 1992.

[21] R. Winder and G. Roberts. Developing Java Software. John Wiley and Sons, 2nd edition, 2000.A good general introduction to Java, with very good coverage of advanced data structures andalgorithms.


Appendix A

File Listings

A.1 Lists and Vectors

A.1.1 IntegerVector.java

package cs2;

/** Simple class to handle vectors of integers.

* These are simple lists of integers, based on a fixed-size array.

* @author George Wells

* @version 3.0 (25 February 2010)

*/

public class IntegerVector

{

/** The array of data.

*/

private int data[];

/** Number of elements stored in the vector.

*/

private int numElements;

/** Create a new vector.

* <BR><I>Precondition:</I> <CODE>initSize > 0</CODE>.

* @param initSize The maximum capacity of the vector.

* @throws IllegalArgumentException if <CODE>initSize</CODE> is negative or zero.

*/

public IntegerVector (int initSize)

{ if (initSize <= 0)

throw new IllegalArgumentException("initSize <= 0");

numElements = 0;

data = new int[initSize];

} // constructor

/** Create a new vector with a default maximum capacity of 100.

*/

public IntegerVector ()

{ this(100); // Default to 100 elements

} // constructor

/** Place a new item at a specified position in an IntegerVector.


* <BR><I>Precondition:</I> There is space available for another value.

* <BR><I>Precondition:</I> The position is in range.

* <BR><I>Postcondition:</I> The value <CODE>item</CODE> appears at

* <CODE>position</CODE> in the vector, or at the end of the list if

* <CODE>position</CODE> is greater than the original length of the list.

* @param item The integer value to be added to the vector.

* @param position The position in the vector where the item should

* be added.

* @throws IllegalArgumentException if <CODE>position</CODE> is negative.

* @throws NoSpaceAvailableException if no space is available.

*/

public void add (int item, int position)

{ if (numElements + 1 > data.length)

throw new NoSpaceAvailableException("no space available");

if (position < 0)

throw new IllegalArgumentException("position is negative");

if (position >= numElements) // Add at end

data[numElements++] = item;

else

{ int k;

for (k = numElements-1; k >= position; k--)

data[k+1] = data[k]; // Move elements up

data[k+1] = item; // Put item in place

numElements++;

}

} // add

/** Place a new item at the end of an IntegerVector.

* <BR><I>Precondition:</I> There is space available for another value.

* <BR><I>Postcondition:</I> The value <CODE>item</CODE> appears at

* the end of the vector.

* @param item The integer value to be added to the vector.

* @throws NoSpaceAvailableException if no space is available.

*/

public void add (int item)

{ add(item, numElements);

} // add

/** Remove the item at a given position in an IntegerVector.

* <BR><I>Precondition:</I> The position is that of a valid item.

* <BR><I>Postcondition:</I> The item at <CODE>position</CODE> has been

* removed from the vector.

* @param position The position of the item to be removed from the

* vector.

* @throws IndexOutOfBoundsException if <CODE>position</CODE> is invalid.

*/

public void remove (int position)

{ if (position < 0 || position >= numElements)

throw new IndexOutOfBoundsException("position is out of range");

for (int k = position+1; k < numElements; k++)

data[k-1] = data[k];

numElements--;

} // remove

/** Return the current number of elements in an IntegerVector.

* @return The number of elements in the vector.

*/


public int length ()

{ return numElements; }

/** Retrieve an element from an IntegerVector.

* <BR><I>Precondition:</I> <CODE>index</CODE> is in range.

* @param index The position of the item to be retrieved from the vector.

* @return The element at position <CODE>index</CODE> in the vector.

* @throws IndexOutOfBoundsException if <CODE>index</CODE> is invalid.

*/

public int get (int index)

{ if (index < 0 || index >= numElements)

throw new IndexOutOfBoundsException("index is out of range");

return data[index];

} // get

/** Change the value of an element in an IntegerVector.

* <BR><I>Precondition:</I> <CODE>index</CODE> is in range.

* <BR><I>Postcondition:</I> The value of the element at position <CODE>index</CODE>

* in the vector has been changed.

* @param index The position of the item to be changed in the vector.

* @param item The new value for the item.

* @throws IndexOutOfBoundsException if <CODE>index</CODE> is invalid.

*/

public void set (int index, int item)

{ if (index < 0 || index >= numElements)

throw new IndexOutOfBoundsException("index is out of range");

data[index] = item;

} // set

/** Search for a specified item in an IntegerVector.

* @param item The item to be searched for.

* @return The position of this item if it is found, otherwise -1

*/

public int position (int item)

{ int k;

for (k = 0; k < numElements; k++)

if (data[k] == item)

break; // Leave for loop

if (k >= numElements) // item was not found

return -1;

else

return k;

} // position

/** Return a string representation of the IntegerVector.

* The format is: <CODE>[ <I>item</I>, <I>item</I>, ... ]</CODE>

* @return A string representing the contents of this vector

*/

public String toString ()

{ StringBuilder s = new StringBuilder("[");

int k;

for (k = 0; k < numElements; k++)

{ s.append(data[k]);

if (k < numElements-1)

s.append(", ");

}

s.append("]");


return s.toString();

} // toString

} // class IntegerVector
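The shifting done by add and remove is easiest to follow on a small trace. So that the example runs on its own, it uses a cut-down, hypothetical stand-in called MiniVector rather than the full class above. Its capacity of 8 is arbitrary, and IllegalStateException stands in for NoSpaceAvailableException, whose definition is not part of this listing.

```java
import java.util.Arrays;

/** Hypothetical cut-down version of IntegerVector, just to trace the shifting. */
class MiniVector {
    private final int[] data = new int[8]; // fixed capacity, chosen arbitrarily
    private int numElements = 0;

    /** Insert item at position, moving later elements up by one place. */
    void add(int item, int position) {
        if (numElements == data.length)
            throw new IllegalStateException("no space available");
        if (position < 0)
            throw new IllegalArgumentException("position is negative");
        if (position > numElements)
            position = numElements; // past the end: append
        for (int k = numElements - 1; k >= position; k--)
            data[k + 1] = data[k];
        data[position] = item;
        numElements++;
    }

    /** Remove the item at position, moving later elements down by one place. */
    void remove(int position) {
        if (position < 0 || position >= numElements)
            throw new IndexOutOfBoundsException("position is out of range");
        for (int k = position + 1; k < numElements; k++)
            data[k - 1] = data[k];
        numElements--;
    }

    /** The occupied part of the array, formatted by Arrays.toString. */
    String contents() {
        return Arrays.toString(Arrays.copyOf(data, numElements));
    }

    public static void main(String[] args) {
        MiniVector v = new MiniVector();
        v.add(10, 0);
        v.add(20, 1);
        v.add(30, 2);
        v.add(15, 1);                      // 20 and 30 move up one place
        System.out.println(v.contents()); // prints [10, 15, 20, 30]
        v.remove(0);                       // everything moves down one place
        System.out.println(v.contents()); // prints [15, 20, 30]
    }
}
```

Both operations are O(n) in the worst case, because inserting or removing at position 0 moves every element.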

A.1.2 IntegerList.java

package cs2;

/** Simple class to handle lists of integers, using linked lists.

* @author George Wells

* @version 3.0 (25 February 2010)

*/

public class IntegerList

{ private class ListNode

{ public int data;

public ListNode next;

} // inner class ListNode

/** Reference to the first ListNode in an IntegerList.

*/

private ListNode first;

/** Number of elements in an IntegerList.

*/

private int numElements;

/** Create an empty IntegerList.

* <BR><I>Postcondition:</I> The list is empty.

*/

public IntegerList ()

{ first = null;

numElements = 0;

} // IntegerList constructor

/** Place a new item at a specified position in an IntegerList.

* <BR><I>Precondition:</I> The specified position is positive or zero.

* <BR><I>Postcondition:</I> The value <CODE>item</CODE> appears at

* <CODE>position</CODE> in the list, or at the end of the list if

* <CODE>position</CODE> is greater than the original length of the list.

* @param item The integer value to be added to the list.

* @param position The position in the list where the item should

* be added.

* @throws IllegalArgumentException if <CODE>position</CODE> is negative.

*/

public void add (int item, int position)

{ if (position < 0)

throw new IllegalArgumentException("position is negative");

ListNode node = new ListNode();

node.data = item;

ListNode curr = first,

prev = null;

for (int k = 0; k < position && curr != null; k++) // Find position

{ prev = curr;

curr = curr.next;

}

node.next = curr;


if (prev != null)

prev.next = node;

else

first = node;

numElements++;

} // add

/** Place a new item at the end of an IntegerList.

* <BR><I>Postcondition:</I> The value <CODE>item</CODE> appears at

* the end of the list.

* @param item The integer value to be added to the list.

*/

public void add (int item)

{ add(item, numElements);

} // add

/** Remove the item at a given position in an IntegerList.

* <BR><I>Precondition:</I> The position is that of a valid item.

* <BR><I>Postcondition:</I> The item at <CODE>position</CODE> has been

* removed from the list.

* @param position The position of the item to be removed from the

* list.

* @throws IndexOutOfBoundsException if <CODE>position</CODE> is invalid.

*/

public void remove (int position)

{ if (position < 0 || position >= numElements)

throw new IndexOutOfBoundsException("position is out of range");

ListNode curr = first,

prev = null;

for (int k = 0; curr != null && k < position; k++)

{ prev = curr;

curr = curr.next;

}

assert curr != null;

if (prev != null)

prev.next = curr.next;

else

first = curr.next;

numElements--;

} // remove

/** Return the current number of elements in an IntegerList.

* @return The number of elements in the list.

*/

public int length ()

{ return numElements;

} // length

/** Retrieve an element from an IntegerList.

* <BR><I>Precondition:</I> <CODE>index</CODE> is in range.

* @param index The position of the item to be retrieved from the list.

* @return The element at position <CODE>index</CODE> in the list.

* @throws IndexOutOfBoundsException if <CODE>index</CODE> is invalid.

*/

public int get (int index)

{ if (index < 0 || index >= numElements)

throw new IndexOutOfBoundsException("index is out of range");


ListNode curr = first;

for (int k = 0; curr != null && k < index; k++)

curr = curr.next;

assert curr != null;

return curr.data;

} // get

/** Change the value of an element in an IntegerList.

* <BR><I>Precondition:</I> <CODE>index</CODE> is in range.

* @param index The position of the item to be changed in the list.

* @param item The new value for the item.

* @throws IndexOutOfBoundsException if <CODE>index</CODE> is invalid.

*/

public void set (int index, int item)

{ if (index < 0 || index >= numElements)

throw new IndexOutOfBoundsException("index is out of range");

ListNode curr = first;

for (int k = 0; curr != null && k < index; k++)

curr = curr.next;

assert curr != null;

curr.data = item;

} // set

/** Find item in an IntegerList.

* @param item The item to be searched for.

* @return The position of this item if it is found, otherwise -1

*/

public int position (int item)

{ ListNode curr = first;

int k;

for (k = 0; curr != null && curr.data != item; k++, curr = curr.next)

; // Search for item in IntegerList

if (curr == null) // item was not found

return -1;

else

return k;

} // position

/** Return string representation of the IntegerList.

* The format is: <CODE>[ <I>item</I>, <I>item</I>, ... ]</CODE>

* @return A string representing the contents of this list

*/

public String toString ()

{ StringBuilder s = new StringBuilder("[");

for (ListNode curr = first; curr != null; curr = curr.next)

{ s.append(curr.data);

if (curr.next != null)

s.append(", ");

}

s.append("]");

return s.toString();

} // toString

} // class IntegerList
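The prev/curr walk used by add covers three cases: inserting at the head, inserting in the middle, and running off the end of the list. The trace below shows each one. It uses a hypothetical, cut-down class (MiniIntList) so that it can be run on its own; only add and toString are reproduced.

```java
/** Hypothetical cut-down linked list of ints, to trace the prev/curr walk in add. */
class MiniIntList {
    private static class Node {
        int data;
        Node next;
    }

    private Node first = null;

    /** Insert item at position, or at the end if position is past the last node. */
    void add(int item, int position) {
        if (position < 0)
            throw new IllegalArgumentException("position is negative");
        Node node = new Node();
        node.data = item;
        Node curr = first, prev = null;
        for (int k = 0; k < position && curr != null; k++) { // walk to the position
            prev = curr;
            curr = curr.next;
        }
        node.next = curr;     // the new node points at the rest of the list
        if (prev != null)
            prev.next = node; // splice in after prev
        else
            first = node;     // no predecessor: new head of the list
    }

    @Override
    public String toString() {
        StringBuilder s = new StringBuilder("[");
        for (Node curr = first; curr != null; curr = curr.next) {
            s.append(curr.data);
            if (curr.next != null)
                s.append(", ");
        }
        return s.append("]").toString();
    }

    public static void main(String[] args) {
        MiniIntList list = new MiniIntList();
        list.add(2, 0);  // empty list: prev stays null, 2 becomes the head
        list.add(1, 0);  // position 0: prev stays null again, 1 is the new head
        list.add(4, 99); // past the end: curr runs off the list, 4 is appended
        list.add(3, 2);  // middle: spliced in between 2 and 4
        System.out.println(list); // prints [1, 2, 3, 4]
    }
}
```

Unlike the array version, no elements are moved, but reaching the insertion point still takes O(n) steps.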


A.1.3 ObjectList.java

package cs2;

/** Simple class to handle generic lists of objects, using linked lists.

* @author George Wells

* @version 3.0 (25 February 2010)

*/

public class ObjectList

{ private class ListNode

{ public Object data;

public ListNode next;

} // inner class ListNode

/** Reference to the first ListNode in an ObjectList.

*/

private ListNode first;

/** Number of elements in an ObjectList.

*/

private int numElements;

/** Create an empty ObjectList.

* <BR><I>Postcondition:</I> The list is empty.

*/

public ObjectList ()

{ first = null;

numElements = 0;

} // ObjectList constructor

/** Place a new item at a specified position in an ObjectList.

* <BR><I>Precondition:</I> The position is positive or zero.

* <BR><I>Postcondition:</I> The object <CODE>item</CODE> appears at

* <CODE>position</CODE> in the list or at the end of the list if

* <CODE>position</CODE> is greater than the original length of the list.

* @param item The object to be added to the list.

* @param position The position in the list where the item should

* be added.

* @throws IllegalArgumentException if <CODE>position</CODE> is negative.

*/

public void add (Object item, int position)

{ if (position < 0)

throw new IllegalArgumentException("position is negative");

ListNode node = new ListNode();

node.data = item;

ListNode curr = first,

prev = null;

for (int k = 0; k < position && curr != null; k++) // Find position

{ prev = curr;

curr = curr.next;

}

node.next = curr;

if (prev != null)

prev.next = node;

else

first = node;

numElements++;

} // add


/** Place a new item at the end of an ObjectList.

* <BR><I>Postcondition:</I> The object <CODE>item</CODE> appears at

* the end of the list.

* @param item The object to be added to the list.

*/

public void add (Object item)

{ add(item, numElements);

} // add

/** Remove the item at a given position in an ObjectList.

* <BR><I>Precondition:</I> The position is that of a valid item.

* <BR><I>Postcondition:</I> The item at <CODE>position</CODE> has been

* removed from the list.

* @param position The position of the item to be removed from the

* list.

* @throws IndexOutOfBoundsException if <CODE>position</CODE> is invalid.

*/

public void remove (int position)

{ if (position < 0 || position >= numElements)

throw new IndexOutOfBoundsException("position is out of range");

ListNode curr = first,

prev = null;

for (int k = 0; curr != null && k < position; k++)

{ prev = curr;

curr = curr.next;

}

assert curr != null;

if (prev != null)

prev.next = curr.next;

else

first = curr.next;

numElements--;

} // remove

/** Return the current number of elements in an ObjectList.

* @return The number of elements in the list.

*/

public int length ()

{ return numElements;

} // length

/** Retrieve an element from an ObjectList.

* <BR><I>Precondition:</I> <CODE>index</CODE> is in range.

* @param index The position of the item to be retrieved from the list.

* @return The element at position <CODE>index</CODE> in the list.

* @throws IndexOutOfBoundsException if <CODE>index</CODE> is invalid.

*/

public Object get (int index)

{ if (index < 0 || index >= numElements)

throw new IndexOutOfBoundsException("index is out of range");

ListNode curr = first;

for (int k = 0; curr != null && k < index; k++)

curr = curr.next;

assert curr != null;

return curr.data;

} // get


/** Change the value of an element in an ObjectList.

* <BR><I>Precondition:</I> <CODE>index</CODE> is in range.

* @param index The position of the item to be changed in the list.

* @param item The new value for the item.

* @throws IndexOutOfBoundsException if <CODE>index</CODE> is invalid.

*/

public void set (int index, Object item)

{ if (index < 0 || index >= numElements)

throw new IndexOutOfBoundsException("index is out of range");

ListNode curr = first;

for (int k = 0; curr != null && k < index; k++)

curr = curr.next;

assert curr != null;

curr.data = item;

} // set

/** Find item in an ObjectList (uses the <CODE>.equals()</CODE> method for

* comparisons).

* @param item The item to be searched for.

* @return The position of this item if it is found, otherwise -1.

*/

public int position (Object item)

{ ListNode curr = first;

int k;

for (k = 0; curr != null && !(curr.data == null ? item == null : curr.data.equals(item)); k++, curr = curr.next) // null-safe comparison

; // Search for item in ObjectList

if (curr == null) // item was not found

return -1;

else

return k;

} // position

/** Return string representation of the ObjectList.

* The format is: <CODE>[ <I>item</I>, <I>item</I>, ... ]</CODE>

* @return A string representing the contents of this list.

*/

public String toString ()

{ StringBuilder s = new StringBuilder("[");

for (ListNode curr = first; curr != null; curr = curr.next)

{ s.append(curr.data); // append(Object) copes with null items

if (curr.next != null)

s.append(", ");

}

s.append("]");

return s.toString();

} // toString

} // class ObjectList
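Storing items as Object makes a list general, but the type information has to be recovered by the caller. The short illustration below uses a plain Object array instead of the ObjectList class so that it is self-contained; the class and method names are hypothetical. It shows that mixed types are accepted silently, that a cast is needed to get a value back, and that a wrong cast is only detected when the program runs.

```java
/** Hypothetical illustration of the cost of storing items as Object. */
class ObjectStorageDemo {
    /** Sum the Integer items, skipping anything else (checked with instanceof). */
    static int sumIntegers(Object[] items) {
        int total = 0;
        for (Object item : items)
            if (item instanceof Integer)
                total += (Integer) item; // explicit cast needed to get the value back
        return total;
    }

    /** Returns true if casting item to String fails at run time. */
    static boolean castToStringFails(Object item) {
        try {
            String s = (String) item; // compiles for any Object
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object[] items = {1, "two", 3}; // mixed types are accepted without complaint
        System.out.println(sumIntegers(items));          // prints 4
        System.out.println(castToStringFails(items[0])); // prints true
        System.out.println(castToStringFails(items[1])); // prints false
    }
}
```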

A.1.4 GenericList.java

package cs2;

/** Simple class to handle generic lists, using linked lists.

* @author George Wells


* @version 3.0 (25 February 2010)

*/

public class GenericList<T>

{ private class ListNode

{ public T data;

public ListNode next;

} // inner class ListNode

/** Reference to the first ListNode in a GenericList.

*/

private ListNode first;

/** Number of elements in a GenericList.

*/

private int numElements;

/** Create an empty GenericList.

* <BR><I>Postcondition:</I> The list is empty.

*/

public GenericList ()

{ first = null;

numElements = 0;

} // GenericList constructor

/** Place a new item at a specified position in a GenericList.

* <BR><I>Precondition:</I> The position is positive or zero.

* <BR><I>Postcondition:</I> The object <CODE>item</CODE> appears at

* <CODE>position</CODE> in the list or at the end of the list if

* <CODE>position</CODE> is greater than the original length of the list.

* @param item The object to be added to the list.

* @param position The position in the list where the item should

* be added.

* @throws IllegalArgumentException if <CODE>position</CODE> is negative.

*/

public void add (T item, int position)

{ if (position < 0)

throw new IllegalArgumentException("position is negative");

ListNode node = new ListNode();

node.data = item;

ListNode curr = first,

prev = null;

for (int k = 0; k < position && curr != null; k++) // Find position

{ prev = curr;

curr = curr.next;

}

node.next = curr;

if (prev != null)

prev.next = node;

else

first = node;

numElements++;

} // add

/** Place a new item at the end of a GenericList.

* <BR><I>Postcondition:</I> The object <CODE>item</CODE> appears at

* the end of the list.

* @param item The object to be added to the list.


*/

public void add (T item)

{ add(item, numElements);

} // add

/** Remove the item at a given position in a GenericList.

* <BR><I>Precondition:</I> The position is that of a valid item.

* <BR><I>Postcondition:</I> The item at <CODE>position</CODE> has been

* removed from the list.

* @param position The position of the item to be removed from the

* list.

* @throws IndexOutOfBoundsException if <CODE>position</CODE> is invalid.

*/

public void remove (int position)

{ if (position < 0 || position >= numElements)

throw new IndexOutOfBoundsException("position is out of range");

ListNode curr = first,

prev = null;

for (int k = 0; curr != null && k < position; k++)

{ prev = curr;

curr = curr.next;

}

assert curr != null;

if (prev != null)

prev.next = curr.next;

else

first = curr.next;

numElements--;

} // remove

/** Return the current number of elements in a GenericList.

* @return The number of elements in the list.

*/

public int length ()

{ return numElements;

} // length

/** Retrieve an element from a GenericList.

* <BR><I>Precondition:</I> <CODE>index</CODE> is in range.

* @param index The position of the item to be retrieved from the list.

* @return The element at position <CODE>index</CODE> in the list.

* @throws IndexOutOfBoundsException if <CODE>index</CODE> is invalid.

*/

public T get (int index)

{ if (index < 0 || index >= numElements)

throw new IndexOutOfBoundsException("index is out of range");

ListNode curr = first;

for (int k = 0; curr != null && k < index; k++)

curr = curr.next;

assert curr != null;

return curr.data;

} // get

/** Change the value of an element in a GenericList.

* <BR><I>Precondition:</I> <CODE>index</CODE> is in range.

* @param index The position of the item to be changed in the list.

* @param item The new value for the item.


* @throws IndexOutOfBoundsException if <CODE>index</CODE> is invalid.

*/

public void set (int index, T item)

{ if (index < 0 || index >= numElements)

throw new IndexOutOfBoundsException("index is out of range");

ListNode curr = first;

for (int k = 0; curr != null && k < index; k++)

curr = curr.next;

assert curr != null;

curr.data = item;

} // set

/** Find item in a GenericList (uses the <CODE>.equals()</CODE> method for

* comparisons).

* @param item The item to be searched for.

* @return The position of this item if it is found, otherwise -1.

*/

public int position (T item)

{ ListNode curr = first;

int k;

for (k = 0; curr != null && !(curr.data == null ? item == null : curr.data.equals(item)); k++, curr = curr.next) // null-safe comparison

; // Search for item in GenericList

if (curr == null) // item was not found

return -1;

else

return k;

} // position

/** Return string representation of the GenericList.

* The format is: <CODE>[ <I>item</I>, <I>item</I>, ... ]</CODE>

* @return A string representing the contents of this list.

*/

public String toString ()

{ StringBuilder s = new StringBuilder("[");

for (ListNode curr = first; curr != null; curr = curr.next)

{ s.append(curr.data); // append(Object) copes with null items

if (curr.next != null)

s.append(", ");

}

s.append("]");

return s.toString();

} // toString

} // class GenericList
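With a type parameter, the compiler checks what goes into the list and no cast is needed when items come out. The sketch below uses a hypothetical, cut-down class (MiniGenericList) so that it runs on its own. Unlike the listing above it keeps a reference to the last node, purely to keep the example short. It also shows that position finds an item by content (equals) rather than by object identity.

```java
/** Hypothetical cut-down generic linked list: type safety and equals-based search. */
class MiniGenericList<T> {
    private class Node {
        T data;
        Node next;
    }

    private Node first = null, last = null;

    /** Append item to the end of the list. */
    void add(T item) {
        Node node = new Node();
        node.data = item;
        if (last == null)
            first = node;
        else
            last.next = node;
        last = node;
    }

    /** Return the first element; no cast is needed because its type is T. */
    T getFirst() {
        if (first == null)
            throw new IllegalStateException("list is empty");
        return first.data;
    }

    /** Position of the first element equal to item (using equals), or -1. */
    int position(T item) {
        int k = 0;
        for (Node curr = first; curr != null; curr = curr.next, k++)
            if (curr.data == null ? item == null : curr.data.equals(item))
                return k;
        return -1;
    }

    public static void main(String[] args) {
        MiniGenericList<String> names = new MiniGenericList<String>();
        names.add("Ada");
        names.add("Grace");
        // names.add(42);                  // would be rejected by the compiler
        String head = names.getFirst();    // no cast required
        String copy = new String("Grace"); // a different object with equal contents
        System.out.println(head);                    // prints Ada
        System.out.println(names.position(copy));    // prints 1
        System.out.println(names.position("Linus")); // prints -1
    }
}
```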

A.2 Stacks

A.2.1 Stack.java

package cs2;

/** Interface describing features common to all stacks.

* @author George Wells

* @version 2.0 (3 January 2005)

*/


public interface Stack<T>

{

/** Push a new item onto a stack.

* <BR><I>Postcondition:</I> The stack is not empty.

* @param item The item to be pushed onto the stack.

*/

public void push (T item);

/** Pop an item off the top of a stack.

* <BR><I>Precondition:</I> The stack is not empty.

* <BR><I>Postcondition:</I> The item on the top of the stack is removed

* and returned.

* @return The item from the top of the stack.

*/

public T pop ();

/** Return the item on the top of a stack, without removing it.

* <BR><I>Precondition:</I> The stack is not empty.

* <BR><I>Postcondition:</I> The item on the top of the stack is returned.

* @return The value of the item on the top of the stack.

*/

public T top ();

/** Tell if a stack contains any items.

* @return <CODE>true</CODE> if there are no items on the stack, otherwise

* <CODE>false</CODE>.

*/

public boolean isEmpty ();

} // interface Stack
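The point of the interface is that client code can be written without knowing which implementation it will be given. The sketch below checks that brackets are balanced using only the four stack operations. To keep it self-contained, it declares a hypothetical stand-in interface (MiniStack) with the same operations and a hypothetical implementation (DequeStack) backed by java.util.ArrayDeque; neither is part of the cs2 package.

```java
import java.util.ArrayDeque;

/** Client code written purely against the stack interface. */
class BracketChecker {
    /** True if every (, [ and { in text is closed by the matching bracket, in order. */
    static boolean balanced(String text, MiniStack<Character> stack) {
        for (char c : text.toCharArray()) {
            if (c == '(' || c == '[' || c == '{')
                stack.push(c);
            else if (c == ')' || c == ']' || c == '}') {
                if (stack.isEmpty())
                    return false; // closing bracket with nothing open
                char open = stack.pop();
                if ((c == ')' && open != '(') || (c == ']' && open != '[')
                        || (c == '}' && open != '{'))
                    return false; // wrong kind of bracket
            }
        }
        return stack.isEmpty(); // anything left open is an error
    }

    public static void main(String[] args) {
        System.out.println(balanced("a[b(c)]{d}", new DequeStack<Character>())); // prints true
        System.out.println(balanced("(]", new DequeStack<Character>()));         // prints false
    }
}

/** Hypothetical stand-in with the same four operations as the Stack interface. */
interface MiniStack<T> {
    void push(T item);
    T pop();
    T top();
    boolean isEmpty();
}

/** One possible implementation, delegating to java.util.ArrayDeque. */
class DequeStack<T> implements MiniStack<T> {
    private final ArrayDeque<T> items = new ArrayDeque<T>();
    public void push(T item) { items.push(item); }
    public T pop() { return items.pop(); }      // throws NoSuchElementException if empty
    public T top() { return items.getFirst(); } // throws NoSuchElementException if empty
    public boolean isEmpty() { return items.isEmpty(); }
}
```

Any other class implementing the same interface, array-based or list-based, could be passed to balanced without changing it.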

A.2.2 ArrayStack.java

package cs2;

/** Simple generic stack class using an array.

* @author George Wells

* @version 2.0 (3 January 2005)

*/

public class ArrayStack<T> implements Stack<T>

{ /** The array of data.

*/

private T[] data;

/** The position of the top element.

*/

private int topIndex;

/** Create a new stack, with a given capacity.

* <BR><I>Precondition:</I> <CODE>initSize > 0</CODE>.

* <BR><I>Postcondition:</I> The stack is initialised and is empty.

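* @param initSize The maximum capacity of the stack.

* @throws IllegalArgumentException if <CODE>initSize</CODE> is negative or zero.

*/

@SuppressWarnings("unchecked") // Safe: the array only ever holds values of type T

public ArrayStack (int initSize)

{ if (initSize <= 0)

throw new IllegalArgumentException("initSize <= 0");

data = (T[])new Object[initSize];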

topIndex = -1;

} // Constructor


/** Create a new stack with a default capacity of 100 items.

*/

public ArrayStack ()

{ this(100);

} // Constructor

/** Push a new item onto a stack.

* <BR><I>Precondition:</I> There is space available in the stack.

* <BR><I>Postcondition:</I> The stack is not empty.

* @param item The item to be pushed onto the stack.

* @throws NoSpaceAvailableException if the stack capacity is exceeded.

*/

public void push (T item)

{ if (topIndex >= data.length-1)

throw new NoSpaceAvailableException();

data[++topIndex] = item;

} // push

/** Pop an item off the top of the stack.

* <BR><I>Precondition:</I> The stack is not empty.

* @return The item that was popped off the stack.

* @throws EmptyException if the stack is empty.

*/

public T pop ()

{ if (topIndex < 0)

throw new EmptyException("stack is empty");

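T item = data[topIndex];

data[topIndex--] = null; // Clear the slot so the popped object can be garbage collected

return item;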

} // pop

/** Return the item on the top of the stack, without removing it.

* <BR><I>Precondition:</I> The stack is not empty.

* @return The value of the item on the top of the stack.

* @throws EmptyException if the stack is empty.

*/

public T top ()

{ if (topIndex < 0)

throw new EmptyException("stack is empty");

return data[topIndex];

} // top

/** Tell whether the stack is empty.

* @return <CODE>true</CODE> if there are no items on the stack, otherwise

* <CODE>false</CODE>.

*/

public boolean isEmpty ()

{ return topIndex < 0;

} // isEmpty

} // class ArrayStack
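
The indexing scheme used by push and pop above (topIndex is -1 for an empty stack, is advanced before a store and is decremented after a fetch) can be tried out in isolation. The following is only a minimal sketch: the class IntArrayStack, its reversed helper, and the use of the standard IllegalStateException in place of the course's own exception classes are assumptions made so that the example compiles on its own.

```java
/** Hypothetical stand-alone sketch of the array-stack indexing scheme. */
class IntArrayStack
  { private final int[] data;
    private int topIndex = -1;          // -1 means the stack is empty

    IntArrayStack (int capacity)
      { data = new int[capacity]; }

    void push (int item)
      { if (topIndex >= data.length - 1)
          throw new IllegalStateException("no space available");
        data[++topIndex] = item;        // advance first, then store
      }

    int pop ()
      { if (topIndex < 0)
          throw new IllegalStateException("stack is empty");
        return data[topIndex--];        // fetch first, then retreat
      }

    boolean isEmpty ()
      { return topIndex < 0; }

    /** Reverse an array by pushing every value and then popping them all. */
    static int[] reversed (int[] values)
      { IntArrayStack s = new IntArrayStack(values.length);
        for (int v : values)
          s.push(v);
        int[] result = new int[values.length];
        for (int i = 0; i < result.length; i++)
          result[i] = s.pop();
        return result;
      }
  }
```

Because the last item pushed is the first one popped, reversed returns the input values in the opposite order.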

A.2.3 ListStack.java

package cs2;

/** Simple generic stack class using a linked list.

* @author George Wells

* @version 3.0 (25 February 2010)

*/

public class ListStack<T> implements Stack<T>

{ private class StackNode

{ public T data;

public StackNode next;

} // inner class StackNode

/** Reference to the first StackNode in a ListStack.

*/

private StackNode topNode;

/** Create an empty ListStack.

* <BR><I>Postcondition:</I> The stack is initialised and is empty.

*/

public ListStack ()

{ topNode = null; }

/** Push a new item onto a ListStack.

* <BR><I>Postcondition:</I> The stack is not empty.

* @param item The item to be pushed onto the stack.

*/

public void push (T item)

{ StackNode node = new StackNode();

node.data = item;

node.next = topNode;

topNode = node;

} // push

/** Pop an item off the top of the stack.

* <BR><I>Precondition:</I> The stack is not empty.

* <BR><I>Postcondition:</I> The item on the top of the stack is removed

* and returned.

* @return The item from the top of the stack.

* @throws EmptyException if the stack is empty.

*/

public T pop ()

{ if (topNode == null)

throw new EmptyException("stack is empty");

T tmpData = topNode.data;

topNode = topNode.next;

return tmpData;

} // pop

/** Return a copy of the item on the top of the stack.

* <BR><I>Precondition:</I> The stack is not empty.

* <BR><I>Postcondition:</I> The item on the top of the stack is returned.

* @return The value of the item on the top of the stack.

* @throws EmptyException if the stack is empty.

*/

public T top ()

{ if (topNode == null)

throw new EmptyException("stack is empty");

return topNode.data;

} // top

/** Tell whether the stack is empty.

* @return <CODE>true</CODE> if there are no items on the stack, otherwise

* <CODE>false</CODE>.

*/

public boolean isEmpty ()

{ return topNode == null;

} // isEmpty

} // class ListStack
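
A typical use of a linked stack such as the one above is checking that brackets are properly nested. The sketch below is illustrative only: BracketChecker and its private Node class are hypothetical names, and the stack is written inline (push adds a node in front of the top node, pop unlinks it) rather than using the cs2 classes, so that the example is self-contained.

```java
/** Hypothetical sketch: bracket matching with a linked-list stack of chars. */
class BracketChecker
  { private static class Node
      { char data;
        Node next;
        Node (char data, Node next)
          { this.data = data; this.next = next; }
      }

    /** Return true if every bracket in text is closed in the right order. */
    static boolean balanced (String text)
      { Node top = null;                       // empty stack
        for (int i = 0; i < text.length(); i++)
          { char ch = text.charAt(i);
            if (ch == '(' || ch == '[' || ch == '{')
              top = new Node(ch, top);         // push
            else if (ch == ')' || ch == ']' || ch == '}')
              { if (top == null)
                  return false;                // nothing to match
                char open = top.data;
                top = top.next;                // pop
                if ((ch == ')' && open != '(') ||
                    (ch == ']' && open != '[') ||
                    (ch == '}' && open != '{'))
                  return false;
              }
          }
        return top == null;                    // no unmatched openers left
      }
  }
```

As with ListStack, there is no capacity limit here: each push simply allocates another node.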

A.3 Queues

A.3.1 Queue.java

package cs2;

/** Interface describing features common to all queues.

* @author George Wells

* @version 2.0 (3 January 2005)

*/

public interface Queue<T>

{

/** Add an item to the tail of a queue.

* <BR><I>Postcondition:</I> The queue is not empty.

* @param item The item to be added to the queue.

*/

public void add (T item);

/** Remove an item from the head of a queue.

* <BR><I>Precondition:</I> The queue is not empty.

* @return The item from the front of the queue.

*/

public T remove ();

/** Return a copy of the item at the head of a queue.

* <BR><I>Precondition:</I> The queue is not empty.

* @return The value of the item at the front of the queue.

*/

public T head ();

/** Tell whether a queue is empty.

* @return <CODE>true</CODE> if there are no items in the queue, otherwise

* <CODE>false</CODE>.

*/

public boolean isEmpty ();

} // interface Queue

A.3.2 ArrayQueue.java

package cs2;

/** Simple generic queue class, using an array.

* @author George Wells

* @version 3.0 (25 February 2010)

*/

public class ArrayQueue<T> implements Queue<T>

{ /** The array of data.

*/

private T[] data;

/** Index of the item at the head (front) of the queue.

*/

private int hd;

/** Index of the item at the tail (back) of the queue.

*/

private int tl;

/** Create an empty queue, with a given capacity.

* <BR><I>Precondition:</I> <CODE>initSize > 0</CODE>.

* <BR><I>Postcondition:</I> The queue is initialised and is empty.

* @param initSize The maximum size of the queue.

*/

public ArrayQueue (int initSize)

{ data = (T[])new Object[initSize];

hd = tl = -1;

} // Constructor

/** Create an empty queue, with a default capacity of 100 elements.

*/

public ArrayQueue ()

{ this(100);

} // Constructor

/** Add an item to the back of a queue.

* <BR><I>Precondition:</I> There is space available in the queue.

* <BR><I>Postcondition:</I> The queue is not empty.

* @param item The item to be added to the queue.

* @throws NoSpaceAvailableException if the queue’s capacity is exceeded.

*/

public void add (T item)

{ int newTl = tl + 1; // Candidate tail; tl is left unchanged if the add fails

if (newTl >= data.length)

newTl = 0; // wraparound

if (newTl == hd) // Out of space

throw new NoSpaceAvailableException("no space available");

tl = newTl;

data[tl] = item;

if (hd == -1) // First item in queue

hd = tl;

} // add

/** Remove an item from the front of a queue.

* <BR><I>Precondition:</I> The queue is not empty.

* @return The item removed from the front of the queue.

* @throws EmptyException if the queue is empty.

*/

public T remove ()

{ if (hd == -1)

throw new EmptyException("queue is empty");

T tmpData = data[hd];

if (hd == tl) // Was last element

hd = tl = -1;

else

{ hd = (hd + 1);

if (hd >= data.length)

hd = 0; // wraparound

}

return tmpData;

} // remove

/** Return a copy of the item at the front of a queue, without removing it.

* <BR><I>Precondition:</I> The queue is not empty.

* @return The value of the item at the front of the queue.

* @throws EmptyException if the queue is empty.

*/

public T head ()

{ if (hd == -1)

throw new EmptyException("queue is empty");

return data[hd];

} // head

/** Tell whether the queue is empty.

* @return <CODE>true</CODE> if there are no items in the queue, otherwise

* <CODE>false</CODE>.

*/

public boolean isEmpty ()

{ return hd == -1;

} // isEmpty

} // class ArrayQueue
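
The wraparound logic in add and remove above can also be expressed with modulo arithmetic and an explicit item count. This is a different bookkeeping scheme from the hd/tl pair with -1 as the "empty" marker used in ArrayQueue, and is shown only as a sketch for comparison. The class name IntRingQueue and the use of IllegalStateException are assumptions for the sake of a self-contained example.

```java
/** Hypothetical sketch of a circular array queue that tracks a count. */
class IntRingQueue
  { private final int[] data;
    private int head = 0;     // index of the front item
    private int count = 0;    // number of items currently stored

    IntRingQueue (int capacity)
      { data = new int[capacity]; }

    void add (int item)
      { if (count == data.length)
          throw new IllegalStateException("no space available");
        int tail = (head + count) % data.length;   // next free slot
        data[tail] = item;
        count++;
      }

    int remove ()
      { if (count == 0)
          throw new IllegalStateException("queue is empty");
        int item = data[head];
        head = (head + 1) % data.length;           // wraparound
        count--;
        return item;
      }

    boolean isEmpty ()
      { return count == 0; }
  }
```

With a count there is no need for a sentinel index, and a failed add changes nothing because the capacity test comes before any field is updated.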

A.3.3 ListQueue.java

package cs2;

/** Simple generic queue class, using linked lists.

* @author George Wells

* @version 3.0 (25 February 2010)

*/

public class ListQueue<T> implements Queue<T>

{ private class QueueNode

{ public T data;

public QueueNode next;

} // inner class QueueNode

/** Reference to the head (front) of the queue.

*/

private QueueNode hd;

/** Reference to the tail (back) of the queue.

*/

private QueueNode tl;

/** Create an empty queue.

* <BR><I>Postcondition:</I> The queue is empty.

*/

public ListQueue ()

{ hd = tl = null; }

/** Add an item to the tail of a queue.

* <BR><I>Postcondition:</I> The queue is not empty.

* @param item The item to be added to the queue.

*/

public void add (T item)

{ QueueNode newNode = new QueueNode();

newNode.data = item;

newNode.next = null;

if (tl != null)

tl.next = newNode;

tl = newNode;

if (hd == null) // First item in queue

hd = tl;

} // add

/** Remove an item from the head of a queue.

* <BR><I>Precondition:</I> The queue is not empty.

* <BR><I>Postcondition:</I> The item at the head of

* the queue is removed and returned.

* @return The item from the head of the queue.

* @throws EmptyException if the queue is empty.

*/

public T remove ()

{ if (hd == null)

throw new EmptyException("queue is empty");

T tmpData = hd.data;

hd = hd.next;

if (hd == null) // Was last element

tl = null;

return tmpData;

} // remove

/** Return a copy of the item at the head of a queue.

* <BR><I>Precondition:</I> The queue is not empty.

* @return The value of the item at the head of the queue.

* @throws EmptyException if the queue is empty.

*/

public T head ()

{ if (hd == null)

throw new EmptyException("queue is empty");

return hd.data;

} // head

/** Tell whether a queue is empty.

* @return <CODE>true</CODE> if there are no items in

* the queue, otherwise <CODE>false</CODE>.

*/

public boolean isEmpty ()

{ return hd == null;

} // isEmpty

} // class ListQueue

A.3.4 QSearch.java

// Program using a queue to perform a breadth-first maze search.

// George Wells -- 7 November 2000

import cs2.*;

import java.io.*;

public class QSearch

{ private static final int MAX_COORD = 10; // Size of maze

private class Position // Coordinates of location in maze

{ public int r, // Row coordinate

c; // Column coordinate

} // inner class Position

private Queue<Position> posQueue = new ListQueue<Position>();

// Queue of positions still to be checked

private boolean[][] maze = new boolean[MAX_COORD][MAX_COORD];

// Description of maze

private boolean[][] beenThere = new boolean[MAX_COORD][MAX_COORD];

// Keep track of previous positions

public void readMaze (String mazeFile) throws IOException

// Read in the maze

{ int r, c; // Loop counters

BufferedReader f = new BufferedReader(new FileReader(mazeFile));

for (r = 0; r < MAX_COORD; r++)

{ String line = f.readLine();

for (c = 0; c < MAX_COORD; c++)

{ char pos = line.charAt(c);

maze[r][c] = (pos == ' ');

}

}

f.close();

} // readMaze

public void addPosition (int row, int col)

// Put a new position on the queue of positions

{ Position p = new Position();

p.r = row;

p.c = col;

beenThere[row][col] = true; // Mark when queued, so each position is queued only once

posQueue.add(p);

} // addPosition

public void solveMaze ()

{ int r, c = 0; // Row and column coordinates;

try

{ readMaze("MAZE");

}

catch (IOException e)

{ System.err.println("Error reading data file");

e.printStackTrace();

System.exit(1);

}

// Initialise beenThere to all FALSE

for (r = 0; r < MAX_COORD; r++)

for (c = 0; c < MAX_COORD; c++)

beenThere[r][c] = false;

// Find starting position

for (r = 0; r < MAX_COORD; r++)

if (maze[r][0]) // r is starting row

break;

// Put starting position on queue (if an opening was found in column 0)

if (r < MAX_COORD)

addPosition(r, 0);

// Now do search

while (! posQueue.isEmpty())

{ Position nextPos;

// Remove next position from queue and try all possible moves

nextPos = posQueue.remove();

c = nextPos.c;

r = nextPos.r;

beenThere[r][c] = true; // Note that we have visited this spot

System.out.println("Visiting position: " + r + ", " + c);

if (c == MAX_COORD-1) // Found exit, so leave search loop

break;

// Try to move up

if (r > 0 && maze[r-1][c] && ! beenThere[r-1][c])

addPosition(r-1, c);

// Try to move right

if (maze[r][c+1] && ! beenThere[r][c+1])

addPosition(r, c+1);

// Try to move down

if (r < MAX_COORD-1 && maze[r+1][c] && ! beenThere[r+1][c])

addPosition(r+1, c);

// Try to move left

if (c > 0 && maze[r][c-1] && ! beenThere[r][c-1])

addPosition(r, c-1);

}// while

if (c == MAX_COORD-1) // Found exit

System.out.println("Success! Found exit at position: " +

r + ", " + c);

else

System.out.println("Failed to find exit from maze.");

} // solveMaze

public static void main (String[] args)

{ QSearch qs = new QSearch();

qs.solveMaze();

} // main

} // class QSearch
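
The breadth-first search in QSearch can be exercised without a data file. The sketch below is a stand-alone variation under stated assumptions, not the course program itself: MiniMazeSearch, shortestPath and parse are hypothetical names, the maze is passed in as strings (all of the same length, with a space marking an open cell), the queue is a plain int array, and every move is bounds-checked. Cells are marked as seen when they are queued, which also lets the search report the length of a shortest path.

```java
/** Hypothetical sketch: breadth-first search on a small in-memory maze. */
class MiniMazeSearch
  { /** Return the number of moves on a shortest path from (startRow, 0) to
      * any open cell in the last column, or -1 if there is no such path.
      * open[r][c] is true where the maze can be walked on.
      */
    static int shortestPath (boolean[][] open, int startRow)
      { int rows = open.length, cols = open[0].length;
        int[][] dist = new int[rows][cols];
        for (int[] row : dist)
          java.util.Arrays.fill(row, -1);            // -1 means not yet seen
        int[] queue = new int[rows * cols];          // each cell enters once
        int head = 0, tail = 0;
        if (! open[startRow][0])
          return -1;
        dist[startRow][0] = 0;
        queue[tail++] = startRow * cols;             // encode (r, c) as r*cols + c
        int[] dr = { -1, 0, 1, 0 };                  // up, right, down, left
        int[] dc = { 0, 1, 0, -1 };
        while (head < tail)
          { int cell = queue[head++];
            int r = cell / cols, c = cell % cols;
            if (c == cols - 1)
              return dist[r][c];                     // reached the exit column
            for (int k = 0; k < 4; k++)
              { int nr = r + dr[k], nc = c + dc[k];
                if (nr >= 0 && nr < rows && nc >= 0 && nc < cols
                    && open[nr][nc] && dist[nr][nc] < 0)
                  { dist[nr][nc] = dist[r][c] + 1;   // mark when queued
                    queue[tail++] = nr * cols + nc;
                  }
              }
          }
        return -1;
      }

    /** Convert rows of text into a maze; a space marks an open cell. */
    static boolean[][] parse (String... lines)
      { boolean[][] open = new boolean[lines.length][lines[0].length()];
        for (int r = 0; r < lines.length; r++)
          for (int c = 0; c < lines[r].length(); c++)
            open[r][c] = (lines[r].charAt(c) == ' ');
        return open;
      }
  }
```

Because a breadth-first search visits positions in order of their distance from the start, the first exit cell removed from the queue is one of the closest exits.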

A.4 The Iterator Interface: Iterator.java

package cs2;

/** Interface describing iterators for the CS2 data structures.

* These are only simple one-directional, read-only iterators at this stage.

* @author George Wells

* @version 2.0 (3 January 2005)

*/

public interface Iterator<T>

{ /** Get the current item.

* @return A reference to the current data item.

* @throws RuntimeException May throw run-time exceptions if the iterator

* is in an invalid state.

*/

public T get ();

/** Move to the next item.

* @throws RuntimeException May throw run-time exceptions if the iterator

* is in an invalid state.

*/

public void next ();

/** Tell whether there are any more items.

* @return <code>true</code> when the iterator is finished.

*/

public boolean atEnd ();

} // interface Iterator
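
The three methods above are intended to be used together in a loop: test atEnd, use get, then call next. The sketch below shows that idiom. Since the cs2 package is not available to a stand-alone example, it declares its own copy of the protocol under the hypothetical name SimpleIterator, together with a hypothetical array-backed implementation, ArrayIterator.

```java
/** Hypothetical copy of the three-method iterator protocol described above. */
interface SimpleIterator<T>
  { T get ();
    void next ();
    boolean atEnd ();
  }

/** Hypothetical read-only iterator over an array. */
class ArrayIterator<T> implements SimpleIterator<T>
  { private final T[] items;
    private int index = 0;

    ArrayIterator (T[] items)
      { this.items = items; }

    public T get ()
      { if (atEnd())
          throw new IllegalStateException("iterator is finished");
        return items[index];
      }

    public void next ()
      { if (atEnd())
          throw new IllegalStateException("iterator is finished");
        index++;
      }

    public boolean atEnd ()
      { return index >= items.length; }

    /** The usual traversal loop: test atEnd, use get, then call next. */
    static <E> String join (SimpleIterator<E> it)
      { StringBuilder sb = new StringBuilder();
        for ( ; ! it.atEnd(); it.next())
          sb.append(it.get()).append(' ');
        return sb.toString().trim();
      }
  }
```

Note that this protocol separates reading the current item (get) from advancing (next), unlike java.util.Iterator, where a single call to next does both.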

A.5 Deques: Deque.java

package cs2;

/** Simple generic deque class using a circular, doubly linked

* list with a header node.

* @author George Wells

* @version 3.0 (25 February 2010)

*/

public class Deque<T>

{ private class DequeNode

{ public T data;

public DequeNode lt; // Pointer to left neighbour

public DequeNode rt; // Pointer to right neighbour

} // inner class DequeNode

/** Reference to the header node of the deque.

*/

private DequeNode header;

/** Create an empty deque.

* <BR><I>Postcondition:</I> The deque is empty.

*/

public Deque ()

{ header = new DequeNode(); // Create header node

header.lt = header;

header.rt = header;

} // Constructor

/** Add an item to the left end of the deque.

* <BR><I>Postcondition:</I> The deque is not empty.

* @param item The item to be added to the deque.

*/

public void addLeft (T item)

{ DequeNode newNode = new DequeNode();

newNode.data = item;

newNode.rt = header.rt;

newNode.lt = header;

header.rt.lt = newNode;

header.rt = newNode;

} // addLeft

/** Add an item to the right end of the deque.

* <BR><I>Postcondition:</I> The deque is not empty.

* @param item The item to be added to the deque.

*/

public void addRight (T item)

{ DequeNode newNode = new DequeNode();

newNode.data = item;

newNode.lt = header.lt;

newNode.rt = header;

header.lt.rt = newNode;

header.lt = newNode;

} // addRight

/** Remove an item from the left end of the deque.

* <BR><I>Precondition:</I> The deque is not empty.

* <BR><I>Postcondition:</I> The item at the left end

* of the deque is removed and returned.

* @return The item from the left end of the deque.

* @throws EmptyException if the deque is empty.

*/

public T removeLeft ()

{ if (header.rt == header)

throw new EmptyException("deque is empty");

DequeNode tmpPtr = header.rt;

T tmpData = tmpPtr.data;

header.rt = tmpPtr.rt;

tmpPtr.rt.lt = header;

return tmpData;

} // removeLeft

/** Remove an item from the right end of the deque.

* <BR><I>Precondition:</I> The deque is not empty.

* <BR><I>Postcondition:</I> The item at the right end

* of the deque is removed and returned.

* @return The item from the right end of the deque.

* @throws EmptyException if the deque is empty.

*/

public T removeRight ()

{ if (header.lt == header)

throw new EmptyException("deque is empty");

DequeNode tmpPtr = header.lt;

T tmpData = tmpPtr.data;

header.lt = tmpPtr.lt;

tmpPtr.lt.rt = header;

return tmpData;

} // removeRight

/** Return a copy of the item at the left end of the deque.

* <BR><I>Precondition:</I> The deque is not empty.

* <BR><I>Postcondition:</I> The item at the left end

* of the deque is returned.

* @return The value of the item at the left end of the deque.

* @throws EmptyException if the deque is empty.

*/

public T leftHead ()

{ if (header.rt == header)

throw new EmptyException("deque is empty");

return header.rt.data;

} // leftHead

/** Return a copy of the item at the right end of the deque.

* <BR><I>Precondition:</I> The deque is not empty.

* <BR><I>Postcondition:</I> The item at the right end

* of the deque is returned.

* @return The value of the item at the right end of the deque.

* @throws EmptyException if the deque is empty.

*/

public T rightHead ()

{ if (header.lt == header)

throw new EmptyException("deque is empty");

return header.lt.data;

} // rightHead

/** Tell whether the deque is empty.

* @return <CODE>true</CODE> if there are no items in the

* deque, otherwise <CODE>false</CODE>.

*/

public boolean isEmpty ()

{ return header.lt == header;

} // isEmpty

} // class Deque
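
The value of the header node in the Deque class is that an empty deque is simply a header that points to itself, so none of the add or remove methods needs a special case. The following sketch repeats that structure in a cut-down form and uses it to test for palindromes by removing characters from both ends. CharDeque, atMostOne and isPalindrome are hypothetical names, and IllegalStateException stands in for the course's EmptyException.

```java
/** Hypothetical sketch: palindrome test using a header-node deque of chars. */
class CharDeque
  { private static class Node
      { char data;
        Node lt, rt;
      }

    private final Node header = new Node();

    CharDeque ()
      { header.lt = header;              // empty deque: header points to itself
        header.rt = header;
      }

    void addRight (char ch)
      { Node n = new Node();
        n.data = ch;
        n.lt = header.lt;
        n.rt = header;
        header.lt.rt = n;
        header.lt = n;
      }

    char removeLeft ()
      { Node n = header.rt;
        if (n == header)
          throw new IllegalStateException("deque is empty");
        header.rt = n.rt;
        n.rt.lt = header;
        return n.data;
      }

    char removeRight ()
      { Node n = header.lt;
        if (n == header)
          throw new IllegalStateException("deque is empty");
        header.lt = n.lt;
        n.lt.rt = header;
        return n.data;
      }

    /** True when the deque holds no items or exactly one item. */
    boolean atMostOne ()
      { return header.rt == header.lt; }

    static boolean isPalindrome (String s)
      { CharDeque d = new CharDeque();
        for (int i = 0; i < s.length(); i++)
          d.addRight(s.charAt(i));
        while (! d.atMostOne())
          if (d.removeLeft() != d.removeRight())
            return false;
        return true;
      }
  }
```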

A.6 Trees

A.6.1 Tree.java

package cs2;

/** Simple generic tree class.

* This is an immutable, "open box" data structure,

* with in-order, pre-order and post-order iterators.

* @author George Wells

* @version 3.0 (25 February 2010)

*/

public class Tree<T>

{ /** The data stored in this node.

*/

private T data;

/** Pointer to left subtree.

*/

private Tree<T> lt;

/** Pointer to right subtree.

*/

private Tree<T> rt;

/** Creates new node with left and right subtrees.

* @param value The value to be stored in this node.

* @param left The left subtree to be added to this node.

* @param right The right subtree to be added to this node.

*/

public Tree (T value, Tree<T> left, Tree<T> right)

{ lt = left;

rt = right;

data = value;

} // Constructor

/** Creates new node without subtrees.

* @param value The value to be stored in this node.

*/

public Tree (T value)

{ this(value, null, null);

} // Constructor

/** Return the left subtree of a tree.

* @return Left subtree of this node.

*/

public Tree<T> left ()

{ return lt; }

/** Return the right subtree of a tree.

* @return Right subtree of this node.

*/

public Tree<T> right ()

{ return rt; }

/** Add a left subtree to a node.

* <I>Precondition:</I> There is no left subtree.

* @param left The new left tree to be added.

* @throws UnsupportedOperationException if there is a pre-existing left subtree.

*/

public void addLeft (Tree<T> left)

{ if (lt != null)

throw new UnsupportedOperationException("subtree already present");

lt = left;

} // addLeft

/** Add a right subtree to a node.

* <I>Precondition:</I> There is no right subtree.

* @param right The new right tree to be added.

* @throws UnsupportedOperationException if there is a pre-existing right subtree.

*/

public void addRight (Tree<T> right) // Add a right subtree to a node

{ if (rt != null)

throw new UnsupportedOperationException("subtree already present");

rt = right;

} // addRight

/** Access the data value in a node of a tree.

* @return The data contained in this tree node.

*/

public T getData ()

{ return data; }

} // class Tree
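
Because the Tree class exposes its structure through left, right and getData, client code can process a tree with simple recursive methods. The sketch below illustrates this with a height calculation and an in-order listing. MiniTree is a hypothetical stand-in for Tree (so that the example is self-contained), and height and inOrder are illustrative helpers rather than part of the cs2 package.

```java
/** Hypothetical stand-in for the Tree class, with two recursive helpers. */
class MiniTree<T>
  { private final T data;
    private final MiniTree<T> lt, rt;

    MiniTree (T data, MiniTree<T> lt, MiniTree<T> rt)
      { this.data = data; this.lt = lt; this.rt = rt; }

    MiniTree (T data)
      { this(data, null, null); }

    /** Number of nodes on the longest path from t down to a leaf. */
    static <E> int height (MiniTree<E> t)
      { if (t == null)
          return 0;
        return 1 + Math.max(height(t.lt), height(t.rt));
      }

    /** In-order (LNR) listing: left subtree, this node, right subtree. */
    static <E> void inOrder (MiniTree<E> t, StringBuilder out)
      { if (t == null)
          return;
        inOrder(t.lt, out);
        out.append(t.data).append(' ');
        inOrder(t.rt, out);
      }
  }
```

Moving the append call before or after the two recursive calls gives the pre-order (NLR) and post-order (LRN) listings respectively.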

A.6.2 Animal.java

/* Simple guessing game in which the computer tries to guess the

name of an animal the user has in mind by asking simple questions

with yes/no answers.

George Wells -- 7 November 2000 */

import cs2.Tree;

import java.io.*;

public class Animal

{ private Tree<String> root;

private BufferedReader in = new BufferedReader(new InputStreamReader(System.in));

private void initTree ()

{ Tree<String> p;

root = new Tree<String>("Does it live in water?");

root.addLeft(new Tree<String>("Does it have webbed feet?"));

root.addRight(new Tree<String>("Does it fly?"));

p = root.right();

p.addLeft(new Tree<String>("bird"));

p.addRight(new Tree<String>("Does it bark?"));

p = p.right();

p.addLeft(new Tree<String>("dog"));

p.addRight(new Tree<String>("cat"));

// Return to left subtree of root

p = root.left();

p.addLeft(new Tree<String>("duck"));

p.addRight(new Tree<String>("fish"));

} // initTree

private boolean answer ()

// Get yes/no answer from user and return true for yes, false for no

{ String ans = " ";

while (true)

{ try

{ ans = in.readLine();

}

catch (IOException e)

{ e.printStackTrace();

}

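if (ans == null) // End of input: give up and treat as "no"

return false;

char ch = ans.isEmpty() ? ' ' : ans.charAt(0); // Blank line: ask again

if (ch == 'y' || ch == 'Y')

return true;

else

if (ch == 'n' || ch == 'N')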

return false;

else

System.out.println("Please answer yes or no.");

}

} // answer

public void play ()

{ Tree<String> pos;

initTree();

System.out.println("Let's play guess the animal.");

pos = root;

while (pos != null)

{ if (pos.left() != null) // Must be a question

{ System.out.println(pos.getData());

if (answer())

pos = pos.left();

else

pos = pos.right();

}

else // Must be an answer

{ System.out.println("It's a " + pos.getData() + ".");

break;

}

}

if (pos == null)

System.out.println("Sorry, I don't know the animal you had in mind.");

} // play

public static void main (String[] args)

{ Animal a = new Animal();

a.play();

} // main

} // class Animal

A.6.3 BinarySearchTree.java

package cs2;

import java.util.Vector;

/** Simple generic binary search tree class.

* @author George Wells

* @version 3.0 (25 February 2010)

*/

public class BinarySearchTree<T extends Comparable<? super T>>

{ private class BSTreeNode

{ public T data;

public BSTreeNode lt, // Pointer to left subtree

rt, // Pointer to right subtree

parent; // Pointer to parent node

public BSTreeNode (T value) // Constructor

{ data = value;

lt = rt = parent = null;

} // Constructor

public BSTreeNode (T value, BSTreeNode parent) // Constructor

{ data = value;

this.parent = parent;

lt = rt = null;

} // Constructor

} // inner class BSTreeNode

private class TreeIterator implements Iterator<T>

{ private Vector<T> v;

private int index = 0;

public TreeIterator ()

{ v = new Vector<T>();

} // constructor

public T get ()

{ return v.get(index);

} // get

public void next ()

{ index++;

} // next

public boolean atEnd ()

{ return index >= v.size();

} // atEnd

void add (T value)

{ v.add(value);

} // add

} // inner class TreeIterator

/** The root of the entire tree.

*/

private BSTreeNode root;

private void insert (T value, BSTreeNode root)

// Insert value into the tree as a leaf node

{ assert root != null;

if (root.data.compareTo(value) > 0) // Add to left subtree

if (root.lt != null)

insert(value, root.lt);

else

root.lt = new BSTreeNode(value, root);

else // Add to right subtree

if (root.rt != null)

insert(value, root.rt);

else

root.rt = new BSTreeNode(value, root);

} // insert

private T deleteMin (BSTreeNode root)

// Delete and return the smallest value in the tree under root

// This method is only used by remove

{ assert root != null;

if (root.lt != null)

return deleteMin(root.lt);

else // Delete this node

{ T tmpData = root.data;

replaceInParent(root.parent, root, root.rt);

return tmpData;

}

} // deleteMin

private void replaceInParent (BSTreeNode parent, BSTreeNode child, BSTreeNode newChild)

// Replace child link in parent node with newChild

{ if (parent == null) // at root

root = newChild;

else

if (parent.lt == child)

parent.lt = newChild;

else

parent.rt = newChild;

if (newChild != null)

newChild.parent = parent; // Reset parent

} // replaceInParent

private void remove (T value, BSTreeNode root)

{ if (root != null)

{ if (root.data.compareTo(value) > 0) // Delete from left subtree

remove(value, root.lt);

else

if (root.data.compareTo(value) < 0) // Delete from right subtree

remove(value, root.rt);

else // Must be this node to be deleted

{ if (root.lt != null && root.rt != null)

// Has both left and right subtrees

{ T min = deleteMin(root.rt);

root.data = min;

}

else

{ if (root.lt == null && root.rt == null)

// Has no subtrees

replaceInParent(root.parent, root, null);

else

if (root.lt == null)

// Has only right subtree

replaceInParent(root.parent, root, root.rt);

else // Has only left subtree

replaceInParent(root.parent, root, root.lt);

} // else

} // else

} // if

} // remove

private boolean contains (T value, BSTreeNode root)

{ if (root == null)

return false;

if (root.data.compareTo(value) == 0) // Same test for equality as used by remove

return true;

else

if (root.data.compareTo(value) > 0) // Look in left subtree

return contains(value, root.lt);

else // Look in right subtree

return contains(value, root.rt);

} // contains

private void buildLNRIterator (BSTreeNode root, TreeIterator t)

{ if (root != null)

{ buildLNRIterator(root.lt, t);

t.add(root.data);

buildLNRIterator(root.rt, t);

}

} // buildLNRIterator

private void buildNLRIterator (BSTreeNode root, TreeIterator t)

{ if (root != null)

{ t.add(root.data);

buildNLRIterator(root.lt, t);

buildNLRIterator(root.rt, t);

}

} // buildNLRIterator

private void buildLRNIterator (BSTreeNode root, TreeIterator t)

{ if (root != null)

{ buildLRNIterator(root.lt, t);

buildLRNIterator(root.rt, t);

t.add(root.data);

}

} // buildLRNIterator

// ------------------------------------------------------------------

/** Create an empty binary search tree.

* <BR><I>Postcondition:</I> The tree is empty.

*/

public BinarySearchTree ()

{ root = null; }

/** Insert an item into the binary search tree.

* <BR><I>Postcondition:</I> The tree is not empty.

* @param newValue The item to be inserted into the tree.

*/

public void insert (T newValue)

{ if (root == null)

root = new BSTreeNode(newValue);

else

insert(newValue, root);

} // insert

/** Remove an item from the tree. If the item is duplicated in the tree, only the

* first instance found is removed.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> If the item was found in the tree, it has been removed.

* @param value The item to be removed from the tree.

*/

public void remove (T value)

{ remove(value, root); }

/** Tell whether an item appears in the tree.

* @return <CODE>true</CODE> if the item is found, otherwise <CODE>false</CODE>.

* @param value The item to be found in the tree.

*/

public boolean contains (T value) // Tell whether value is in tree

{ return contains(value, root); }

/** Obtain an iterator that can be used to work through all of the data contained

* in the tree using an in-order traversal.

* <P><B>Note:</B> This iterator makes a copy of the contents of the tree.

* Any subsequent changes to the structure of the tree will <EM>not</EM> be

* reflected by this iterator.

* @return An <CODE>Iterator</CODE> allowing the contents of the tree to be

* accessed in order.

*/

public Iterator<T> getLNRIterator ()

{ TreeIterator t = new TreeIterator();

buildLNRIterator(root, t);

return t;

} // getLNRIterator

/** Obtain an iterator that can be used to work through all of the data contained

* in the tree using a pre-order traversal.

* <P><B>Note:</B> This iterator makes a copy of the contents of the tree.

* Any subsequent changes to the structure of the tree will <EM>not</EM> be

* reflected by this iterator.

* @return An <CODE>Iterator</CODE> allowing the contents of the tree to be

* accessed in pre-order.

*/

public Iterator<T> getNLRIterator ()

{ TreeIterator t = new TreeIterator();

buildNLRIterator(root, t);

return t;

} // getNLRIterator

/** Obtain an iterator that can be used to work through all of the data contained

* in the tree using a post-order traversal.

* <P><B>Note:</B> This iterator makes a copy of the contents of the tree.

* Any subsequent changes to the structure of the tree will <EM>not</EM> be

* reflected by this iterator.

* @return An <CODE>Iterator</CODE> allowing the contents of the tree to be

* accessed in post-order.

*/

public Iterator<T> getLRNIterator ()

{ TreeIterator t = new TreeIterator();

buildLRNIterator(root, t);

return t;

} // getLRNIterator

} // class BinarySearchTree
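
The key property of the class above is that an in-order (LNR) traversal delivers the stored values in ascending order, whatever order they were inserted in. The sketch below demonstrates this with a cut-down tree of int values. IntBst is a hypothetical name, and the sketch deliberately omits removal and parent pointers. As in the insert method above, values equal to a node's value are placed in its right subtree.

```java
/** Hypothetical sketch: integer binary search tree with in-order listing. */
class IntBst
  { private static class Node
      { int data;
        Node lt, rt;
        Node (int data) { this.data = data; }
      }

    private Node root;

    void insert (int value)
      { root = insert(root, value); }

    private static Node insert (Node n, int value)
      { if (n == null)
          return new Node(value);
        if (value < n.data)
          n.lt = insert(n.lt, value);   // smaller values go left
        else
          n.rt = insert(n.rt, value);   // equal or larger values go right
        return n;
      }

    boolean contains (int value)
      { Node n = root;
        while (n != null)
          { if (value == n.data)
              return true;
            n = (value < n.data) ? n.lt : n.rt;
          }
        return false;
      }

    /** Values in ascending order, separated by spaces. */
    String inOrder ()
      { StringBuilder sb = new StringBuilder();
        inOrder(root, sb);
        return sb.toString().trim();
      }

    private static void inOrder (Node n, StringBuilder sb)
      { if (n != null)
          { inOrder(n.lt, sb);
            sb.append(n.data).append(' ');
            inOrder(n.rt, sb);
          }
      }
  }
```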

A.7 Dictionaries and Hash Tables

A.7.1 Dictionary.java

package cs2;

/** Interface describing the implementation of simple dictionary ADTs. Used

* by dictionary and hash table ADTs. The specification provides for the

* storage of data accessed by specifying a <EM>key</EM> value.

* <P>For all ADTs implementing this interface there is an assumption that

* the keys are unique.

* @author George Wells

* @version 2.0 (3 January 2005)

*/

public interface Dictionary<K, V>

{

/** Delete all the entries in a dictionary ADT.

* <BR><I>Postcondition:</I> The dictionary is empty.

*/

public void makeEmpty ();

/** Add an item to a dictionary ADT. If the specified key is already

* present then the existing value is replaced by that specified here.

* <BR><I>Postcondition:</I> The dictionary is not empty.

* @param aKey The key to be added.

* @param aValue The associated value to be stored with the key.

*/

public void insert (K aKey, V aValue);

/** Add a key to a dictionary ADT without an associated value.

* <BR><I>Postcondition:</I> The dictionary is not empty.

* @param aKey The key to be added.

*/

public void insert (K aKey);

/** Remove an item from a dictionary ADT.

* <BR><I>Postcondition:</I> The item specified by the given key is no

* longer present.

* @param aKey The key of the entry to be removed.

*/

public void remove (K aKey);

/** Access an entry in a dictionary ADT. Note that this method

* should add the specified key if it is not already present.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> The value associated with the specified key

* is returned.

* @param aKey The key of the item to be accessed.

* @return The value associated with the specified key is returned

* (<CODE>null</CODE> if the key was not previously present).

*/

public V get (K aKey);

/** Tell whether a dictionary contains a specified key.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> An indication of whether the key is present

* in the dictionary is returned.

* @param aKey The key of the item to be accessed.

* @return <CODE>true</CODE> if the specified key is found in the

* dictionary, otherwise <CODE>false</CODE>.

*/

public boolean contains (K aKey);

/** Tell whether a dictionary is empty.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> An indication of whether the dictionary is

* empty is returned.

* @return <CODE>true</CODE> if the dictionary is empty,

* otherwise <CODE>false</CODE>.

*/

public boolean isEmpty ();

/** Obtain an iterator for a dictionary. The iterator should allow all

* data items stored in a dictionary to be accessed. Implementations may,

* or may not, provide a specific ordering by key values. The iterator’s

* <CODE>get</CODE> method should return a <CODE>Pair</CODE> object

* allowing access to both the key and the associated value.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> An iterator for a dictionary is returned.

* @return An iterator for a dictionary.

*/

public Iterator<Pair<K, V>> getIterator ();

} // interface Dictionary
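
Two points in the specification above are easy to overlook: insert replaces the value of a key that is already present, and get adds a missing key (with a null value) rather than failing. The sketch below shows one way these rules might be realised with a simple linear search. TinyDictionary is a hypothetical class with String keys and Integer values; it does not implement the Dictionary interface and is not one of the cs2 implementations.

```java
/** Hypothetical sketch: array-backed dictionary with String keys. */
class TinyDictionary
  { private String[] keys = new String[4];
    private Integer[] values = new Integer[4];
    private int size = 0;

    private int indexOf (String key)
      { for (int i = 0; i < size; i++)
          if (keys[i].equals(key))
            return i;
        return -1;
      }

    /** Add a key, or replace the value if the key is already present. */
    void insert (String key, Integer value)
      { int i = indexOf(key);
        if (i < 0)
          { if (size == keys.length)           // grow both arrays together
              { keys = java.util.Arrays.copyOf(keys, 2 * size);
                values = java.util.Arrays.copyOf(values, 2 * size);
              }
            i = size++;
            keys[i] = key;
          }
        values[i] = value;
      }

    /** Return the value for key, adding the key with a null value if absent. */
    Integer get (String key)
      { int i = indexOf(key);
        if (i < 0)
          { insert(key, null);
            return null;
          }
        return values[i];
      }

    boolean contains (String key)
      { return indexOf(key) >= 0; }
  }
```

A consequence of the second rule is that calling get changes the result of a later contains call for the same key, so contains should be used when a lookup must not modify the dictionary.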

A.7.2 Pair.java

package cs2;

/** Interface describing a (key, value) pair (as used in dictionary and hash

* table data structures).

* @author George Wells

* @version 2.0 (3 January 2005)

*/

public interface Pair<K, V>

{ /** Access the key.

* @return The key value contained in a pair.

*/

public K getKey ();

/** Access the value.

* @return The associated value contained in a pair.

*/

public V getValue ();

/** Replace the value associated with a key.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> The value associated with a key is changed to

* that specified.

* @param value The new value to be associated with a key.

*/

public void setValue (V value);

/** Return a hash code for the key in a pair. This may utilise the

* standard Java <CODE>hashCode</CODE> method.

* @return The hash code of the key.

*/

public int hashCode ();

/** Test keys for equality. If the parameter is an object implementing the

* <CODE>Pair</CODE> interface then the keys should be compared for

* equality. Otherwise the key contained in this pair should be compared

* directly with the parameter.

* @return <CODE>true</CODE> if the keys are equal, <CODE>false</CODE>

* otherwise.

*/

public boolean equals (Object o);

} // interface Pair

A.7.3 DictionaryPair.java

package cs2;

/** Simple class implementing the <CODE>Pair</CODE> interface for use in

* dictionary and hash table data structures. This class allows access to a

* key and an associated value, but only the associated value may be modified.

* @author George Wells

* @version 2.0 (3 January 2005)

*/

public class DictionaryPair<K, V> implements Pair<K, V>

{ /** The key value.

*/

private K key;

/** The data value associated with the key.

*/

private V value;

/** Create a pair, initialising both the key and associated value.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> The key and value are initialised.

* @param aKey The key.

* @param aValue The value initially associated with this key.

*/

public DictionaryPair (K aKey, V aValue)

{ key = aKey;

value = aValue;

} // constructor

/** Create a pair, initialising only the key.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> The key is initialised (the associated value

* is <CODE>null</CODE>).

* @param aKey The key.

*/

public DictionaryPair (K aKey)

{ key = aKey;

} // constructor

/** Return the key.

* @return The key value contained in this pair.

*/

public K getKey ()

{ return key;

} // getKey

/** Return the value.

* @return The associated value contained in this pair.

*/

public V getValue ()

{ return value;

} // getValue

/** Replace the value associated with this key.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> The value associated with this key is changed

* to that specified.

* @param value The new value to be associated with this key.

*/

public void setValue (V value)

{ this.value = value;

} // setValue

/** Return a hash code for this pair. This simply utilises the standard

* Java <CODE>hashCode</CODE> method for the <EM>key</EM>.

* @return The hash code of the key.

*/

public int hashCode ()

{ return key.hashCode();

} // hashCode

/** Test keys for equality. If the parameter is an object implementing the

* <CODE>Pair</CODE> interface then the keys are compared for equality.

* Otherwise the key contained in this pair is compared directly with the

* parameter.

* @return <CODE>true</CODE> if the keys are equal, <CODE>false</CODE>

* otherwise.

*/

public boolean equals (Object o)

{ if (o instanceof Pair)

return key.equals(((Pair<?, ?>)o).getKey()); // Wildcard cast avoids a raw type

else

return key.equals(o);

} // equals

} // class DictionaryPair
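
Because equality and hashing are based on the key alone, two pairs with the same key but different values compare as equal, and a pair may also be compared directly with a bare key. The following is a minimal sketch of that behaviour. KeyedPair is a hypothetical, cut-down class that copies the approach of the listing so that it can be run on its own; it is not part of the cs2 package.

```java
// Hypothetical, cut-down pair class that copies the equals/hashCode
// approach of DictionaryPair so the behaviour can be tried in isolation.
class KeyedPair<K, V>
  { private final K key;
    private V value;

    KeyedPair (K key, V value)
      { this.key = key;
        this.value = value;
      }

    K getKey ()
      { return key; }

    V getValue ()
      { return value; }

    @Override
    public int hashCode ()
      { return key.hashCode(); } // Only the key contributes

    @Override
    public boolean equals (Object o)
      { if (o instanceof KeyedPair)
          return key.equals(((KeyedPair<?, ?>)o).getKey());
        else
          return key.equals(o); // Compare the key directly with the parameter
      }
  }

class KeyedPairDemo
  { public static void main (String[] args)
      { KeyedPair<String, Integer> a = new KeyedPair<String, Integer>("apple", 1);
        KeyedPair<String, Integer> b = new KeyedPair<String, Integer>("apple", 2);
        System.out.println(a.equals(b));        // Same key, different values
        System.out.println(a.equals("apple"));  // Pair compared with a bare key
        System.out.println("apple".equals(a));  // String.equals knows nothing about pairs
      }
  }
```

One consequence worth noting is that this form of equality is not symmetric: the pair recognises a bare key, but the key does not recognise the pair. Code relying on it should therefore call equals on the pair rather than on the key.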

A.7.4 ListDictionary.java

package cs2;

/** Implementation of a simple dictionary ADT. Requires that the keys implement the

* <CODE>Comparable</CODE> interface.

* The keys must be unique.

* @author George Wells

* @version 3.0 (25 February 2010)

*/

public class ListDictionary<K extends Comparable<? super K>, V> implements Dictionary<K, V>

{ private class ListNode extends DictionaryPair<K, V>

{ public ListNode next;

public ListNode (K aKey, V aValue)

{ super(aKey, aValue);

}

public ListNode (K aKey)

{ super(aKey);

}

} // inner class ListNode

private class ListDictionaryIterator implements Iterator<Pair<K, V>>

{ private ListNode nextEntry;

public ListDictionaryIterator (ListNode first)

{ nextEntry = first;

} // constructor

public Pair<K, V> get ()

{ return nextEntry;

} // get

public void next ()

{ nextEntry = nextEntry.next;

} // next

public boolean atEnd ()

{ return nextEntry == null;

} // atEnd

} // inner class ListDictionaryIterator

/** Reference to the linked list containing the data.

*/

private ListNode dict;

/** Find the node in the linked list containing the specified key. If the key

* is not found then it is inserted.

* <BR><I>Postcondition:</I> The dictionary is not empty.

* @param aKey The key to be located or inserted.

* @return A reference to the node containing the specified key.

*/

private ListNode findNode (K aKey)

{ ListNode prev = null;

ListNode curr = dict;

while (curr != null && aKey.compareTo(curr.getKey()) > 0)

{ prev = curr;

curr = curr.next;

}

if (curr == null || aKey.compareTo(curr.getKey()) != 0) // Insert new entry

{ ListNode n = new ListNode(aKey);

n.next = curr;

if (prev == null)

dict = n;

else

prev.next = n;

curr = n;

}

return curr;

} // findNode

/** Create an empty ListDictionary.

* <BR><I>Postcondition:</I> The dictionary is empty.

*/

public ListDictionary ()

{ dict = null; }

/** Delete all the entries in the dictionary.

* <BR><I>Postcondition:</I> The dictionary is empty.

*/

public void makeEmpty ()

{ dict = null;

} // makeEmpty

/** Insert a new item into the dictionary or update an existing one.

* If the specified key is already present then the existing value is

* replaced by that specified here.

* <BR><I>Postcondition:</I> The dictionary is not empty.

* @param aKey The key to be added.

* @param aValue The associated value to be stored with the key.

*/

public void insert (K aKey, V aValue)

{ ListNode curr;

curr = findNode(aKey);

assert (curr != null && aKey.equals(curr.getKey()));

curr.setValue(aValue);

} // insert

/** Insert a new item into the dictionary or update an existing one.

* If the specified key is already present then the existing value

* is replaced by that specified here.

* <BR><I>Postcondition:</I> The dictionary is not empty.

* @param p The key/value pair to be added/updated.

*/

public void insert (Pair<K, V> p)

{ ListNode curr;

curr = findNode(p.getKey());

assert (curr != null && curr.getKey().equals(p.getKey()));

curr.setValue(p.getValue());

} // insert

/** Add a key to the dictionary without an associated value.

* If the specified key is already present then nothing is changed.

* <BR><I>Postcondition:</I> The dictionary is not empty.

* @param aKey The key to be added.

*/

public void insert (K aKey)

{ ListNode curr;

curr = findNode(aKey);

assert (curr != null && curr.getKey().equals(aKey));

} // insert

/** Remove an item from the dictionary. If the specified key is not found, no action

* is taken.

* <BR><I>Postcondition:</I> The item specified by the given key is not present.

* @param aKey The key of the entry to be removed.

*/

public void remove (K aKey)

{ ListNode curr = dict, prev = null;

while (curr != null && aKey.compareTo(curr.getKey()) > 0)

{ prev = curr;

curr = curr.next;

}

if (curr != null && aKey.compareTo(curr.getKey()) == 0) // Remove this dictionary entry

{ if (prev == null)

dict = curr.next;

else

prev.next = curr.next;

}

// else entry not found - ignore

} // remove

/** Access an entry in the dictionary. Note that this method

* adds the specified key if it is not already present.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> The value associated with the specified key is returned.

* @param aKey The key of the item to be accessed.

* @return The value associated with the specified key is returned (<CODE>null</CODE>

* if the key was not previously present).

*/

public V get (K aKey)

{ ListNode curr;

curr = findNode(aKey);

assert curr != null && aKey.equals(curr.getKey());

return curr.getValue();

} // get

/** Tell whether the dictionary contains a specified key.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> An indication of whether the key is present in the

* dictionary is returned.

* @param aKey The key of the item to be accessed.

* @return <CODE>true</CODE> if the specified key is found in the dictionary,

* otherwise <CODE>false</CODE>.

*/

public boolean contains (K aKey)

{ ListNode curr = dict;

while (curr != null && aKey.compareTo(curr.getKey()) > 0)

curr = curr.next;

return (curr != null && aKey.compareTo(curr.getKey()) == 0);

} // contains

/** Tell whether the dictionary is empty.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> An indication of whether the dictionary is empty

* is returned.

* @return <CODE>true</CODE> if the dictionary is empty,

* otherwise <CODE>false</CODE>.

*/

public boolean isEmpty ()

{ return dict == null;

} // isEmpty

/** Obtain an iterator for the dictionary. The iterator allows all

* data items stored in the dictionary to be accessed. The iterator provided by

* this method orders the items into ascending order of the keys.

* The iterator’s <CODE>get</CODE>

* method returns a <CODE>Pair</CODE> object allowing access to both the

* key and the associated value.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> An iterator for the dictionary is returned.

* @return An iterator for the dictionary.

*/

public Iterator<Pair<K, V>> getIterator ()

{ return new ListDictionaryIterator(dict);

} // getIterator

} // class ListDictionary
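
The heart of ListDictionary is the walk along a sorted linked list that either finds a key or splices a new node in at the correct position. The following is a minimal sketch of that walk on its own. SortedKeyList is a hypothetical, cut-down class holding only String keys; it is an illustration under those assumptions, not a replacement for the class above.

```java
// Hypothetical, cut-down sorted list of String keys, written only to
// illustrate the find-or-insert walk; it is not the ListDictionary class.
class SortedKeyList
  { private static class Node
      { final String key;
        Node next;

        Node (String key)
          { this.key = key; }
      }

    private Node head; // First node, or null when the list is empty

    // Return true if the key was newly inserted, false if already present.
    boolean findOrInsert (String aKey)
      { Node prev = null;
        Node curr = head;
        while (curr != null && aKey.compareTo(curr.key) > 0) // Skip smaller keys
          { prev = curr;
            curr = curr.next;
          }
        if (curr != null && aKey.compareTo(curr.key) == 0)
          return false; // Found: nothing to insert
        Node n = new Node(aKey); // Splice in between prev and curr
        n.next = curr;
        if (prev == null)
          head = n; // New first node
        else
          prev.next = n;
        return true;
      }

    // The keys in list order, separated by spaces.
    String keys ()
      { StringBuilder sb = new StringBuilder();
        for (Node c = head; c != null; c = c.next)
          { if (sb.length() > 0)
              sb.append(' ');
            sb.append(c.key);
          }
        return sb.toString();
      }
  }
```

Keeping the list sorted means an unsuccessful search can stop at the first larger key rather than running to the end of the list, and it is also what allows the iterator to deliver the keys in ascending order.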

A.7.5 Concordance.java

// Test dictionaries and hash tables

import cs2.*;

import java.io.*;

import java.util.StringTokenizer;

public class Concordance

{

public static void main (String args[]) throws IOException

{ Dictionary<String, IntegerList> dict =

new ListDictionary<String, IntegerList>();

String line = null;

int lineNo = 1;

BufferedReader in = new BufferedReader(new FileReader("sample.txt"));

while ((line = in.readLine()) != null)

{ StringTokenizer st = new StringTokenizer(line);

while (st.hasMoreTokens())

{ String word = st.nextToken();

IntegerList lst = dict.get(word);

if (lst == null) // First entry for this word

{ lst = new IntegerList();

lst.add(lineNo);

dict.insert(word, lst);

}

else // Simply add new line number

lst.add(lineNo);

}

lineNo++;

}

in.close(); // Release the file once every line has been read

// Now print out the index

Iterator<Pair<String, IntegerList>> it = dict.getIterator();

while (! it.atEnd())

{ Pair<String, IntegerList> p = it.get();

System.out.println(p.getKey() + ": " + p.getValue());

it.next();

}

} // main

} // class Concordance
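
For readers without the cs2 package to hand, the same concordance idea can be sketched with the standard Java collections. This is an illustration only: java.util.TreeMap and a List of Integer are assumed here as stand-ins for the Dictionary and IntegerList classes, and the text is taken from an array of strings rather than from sample.txt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

class ConcordanceSketch
  { // Build a word -> line numbers index from the given lines of text.
    // TreeMap and ArrayList are stand-ins for the cs2 classes (an assumption).
    static Map<String, List<Integer>> build (String[] lines)
      { Map<String, List<Integer>> index = new TreeMap<String, List<Integer>>();
        int lineNo = 1;
        for (String line : lines)
          { StringTokenizer st = new StringTokenizer(line);
            while (st.hasMoreTokens())
              { String word = st.nextToken();
                List<Integer> lst = index.get(word);
                if (lst == null) // First entry for this word
                  { lst = new ArrayList<Integer>();
                    index.put(word, lst);
                  }
                lst.add(lineNo); // Record this occurrence
              }
            lineNo++;
          }
        return index;
      }

    public static void main (String[] args)
      { String[] text = { "the cat sat", "on the mat" };
        for (Map.Entry<String, List<Integer>> e : build(text).entrySet())
          System.out.println(e.getKey() + ": " + e.getValue());
      }
  }
```

One difference from the listing above is that TreeMap.get does not add a missing key, whereas the get method of the cs2 dictionaries does. In both versions a null result is what signals the first occurrence of a word.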

A.7.6 InternalHashTable.java

package cs2;

/** This class implements a simple, internal hashtable dictionary.

* Collisions are resolved by linear probing, and one slot in the table is

* always kept empty so that probing terminates.

* @author George Wells

* @version 3.0 (25 February 2010)

*/

public class InternalHashTable<K, V> implements Dictionary<K, V>

{ /** Default table size.

*/

private static final int DEF_SIZE = 101; // Default table size

private class TableEntry<K, V> extends DictionaryPair<K, V>

{ boolean occupied = false; // Used to mark deleted entries

public TableEntry (K aKey, V aValue)

{ super(aKey, aValue);

}

public TableEntry (K aKey)

{ super(aKey);

}

} // inner class TableEntry

/** The array used for the hash table.

*/

private TableEntry<K, V>[] table;

/** The number of occupied and deleted entries in the table.

* Restricted to at most <CODE>table.length-1</CODE>.

*/

private int numEntries;

private class HashTableIterator implements Iterator<Pair<K, V>>

{ private int index;

public HashTableIterator ()

{ for (index = 0; index < table.length; index++)

if (table[index] != null && table[index].occupied) // First non-empty slot

break;

} // constructor

public Pair<K, V> get ()

{ return table[index];

} // get

public void next ()

{ while (++index < table.length)

{ if (table[index] != null && table[index].occupied) // Found more data

{ break;

}

}

} // next

public boolean atEnd ()

{ return index >= table.length;

} // atEnd

} // inner class HashTableIterator

/** Function used to generate a suitable hash value for a specified key. This method

* uses the standard Java <CODE>hashCode</CODE> method, and ensures that the result

* is non-negative and in the correct range of values to be used as a subscript for

* <CODE>table</CODE>.

* @param aKey The key to be hashed.

* @return The hash value, <I>h</I> (0 <= <I>h</I> < <CODE>table.length</CODE>).

*/

private int hash (K aKey)

{ return ((aKey.hashCode() & 0x7FFFFFFF) % table.length); // Allow for negative hashcodes

} // hash

/** Create a new hash table, with a given capacity.

* <BR><I>Precondition:</I> <CODE>initSize > 0</CODE>.

* <BR><I>Postcondition:</I> The array used by the hash table is initialised and is empty.

* @param initSize The size of the table; at most <CODE>initSize-1</CODE> items can be stored.

*/

public InternalHashTable (int initSize)

{ table = new TableEntry[initSize];

numEntries = 0;

} // Constructor

/** Create a new hash table, with the default table size (currently 101 slots, so at most 100 items).

* <BR><I>Postcondition:</I> The array used by the hash table is initialised and is empty.

*/

public InternalHashTable ()

{ this(DEF_SIZE);

} // Constructor

/** Delete all entries in the hash table.

* <BR><I>Postcondition:</I> The array used by the hash table is empty.

*/

public void makeEmpty ()

{ for (int k = 0; k < table.length; k++)

{ table[k] = null;

}

numEntries = 0;

} // makeEmpty

/** Insert a new item or update an existing one in the hash table.

* There is a requirement that the keys are unique. If

* the key is found in the table, then the associated value is

* replaced with that specified here.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> The item is added or updated.

* @param aKey The key for the item to be added to the hash table.

* @param aValue The value associated with the key.

* @throws NoSpaceAvailableException if the hash table’s capacity is exceeded.

*/

public void insert (K aKey, V aValue)

{ int index = hash(aKey);

while (table[index] != null && !table[index].getKey().equals(aKey))

{ index = (index + 1);

if (index >= table.length) // wraparound

index = 0;

}

if (table[index] == null) // Insert new entry

{ if (numEntries + 1 >= table.length) // Out of space?

throw new NoSpaceAvailableException("no space available in hash table");

table[index] = new TableEntry<K, V>(aKey, aValue);

table[index].occupied = true;

numEntries++;

}

else // Update existing or deleted entry

{ table[index].setValue(aValue);

if (! table[index].occupied) // Undelete it

{ table[index].occupied = true;

}

}

} // insert

/** Insert a new key into the hash table. There is a requirement that the keys are unique.

* If the key is found in the table, then nothing is changed.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> The key is added if not present previously.

* @param aKey The key to be added to the hash table.

* @throws NoSpaceAvailableException if the hash table’s capacity is exceeded.

*/

public void insert (K aKey)

{ int index = hash(aKey);

while (table[index] != null && !table[index].getKey().equals(aKey))

{ index = (index + 1);

if (index >= table.length) // wraparound

index = 0;

}

if (table[index] == null) // Insert new entry

{ if (numEntries + 1 >= table.length) // Out of space?

throw new NoSpaceAvailableException("no space available in hash table");

table[index] = new TableEntry<K, V>(aKey);

table[index].occupied = true;

numEntries++;

}

else // Update deleted entry if necessary

{ if (! table[index].occupied) // Undelete it

{ table[index].setValue(null); // Discard the value left behind by remove

table[index].occupied = true;

}

}

} // insert

/** Remove an entry from the hash table. If

* the key is not found in the table, then nothing is done.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> The key and associated value have been removed

* from the table, if present previously.

* @param aKey The key to be removed from the hash table.

*/

public void remove (K aKey)

{ int index = hash(aKey);

while (table[index] != null && !table[index].getKey().equals(aKey))

{ index = (index + 1);

if (index >= table.length) // wraparound

index = 0;

}

if (table[index] != null)

{ table[index].occupied = false;

}

} // remove

/** Access an entry in the hash table, creating it if necessary. Note that this method

* adds the specified key if it is not already present.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> The value associated with the specified key is returned.

* @param aKey The key of the item to be accessed.

* @return The value associated with the specified key is returned (<CODE>null</CODE> if

* the key was not previously present).

* @throws NoSpaceAvailableException if the hash table’s capacity is exceeded.

*/

public V get (K aKey)

{ int index = hash(aKey);

while (table[index] != null && !table[index].getKey().equals(aKey))

{ index = (index + 1);

if (index >= table.length) // wraparound

index = 0;

}

if (table[index] == null) // Insert new entry

{ if (numEntries + 1 >= table.length) // Out of space?

throw new NoSpaceAvailableException("no space available in hash table");

table[index] = new TableEntry<K, V>(aKey);

table[index].occupied = true;

numEntries++;

}

else if (! table[index].occupied) // Reuse a deleted slot, which numEntries already counts

{ table[index].setValue(null);

table[index].occupied = true;

}

assert aKey.equals(table[index].getKey());

return table[index].getValue();

} // get

/** Tell whether the hash table contains a specified key.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> An indication of whether the key is present in the hash

* table is returned.

* @param aKey The key of the item to be accessed.

* @return <CODE>true</CODE> if the specified key is found in the hash table,

* otherwise <CODE>false</CODE>.

*/

public boolean contains (K aKey)

{ int index = hash(aKey);

while (table[index] != null && !table[index].getKey().equals(aKey))

{ index = (index + 1);

if (index >= table.length) // wraparound

index = 0;

}

return (table[index] != null && table[index].occupied &&

table[index].getKey().equals(aKey));

} // contains

/** Tell whether the hash table is empty.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> An indication of whether the hash table is empty

* is returned.

* @return <CODE>true</CODE> if the hash table is empty,

* otherwise <CODE>false</CODE>.

*/

public boolean isEmpty ()

{ for (int k = 0; k < table.length; k++)

if (table[k] != null && table[k].occupied) // Found an occupied slot

return false;

return true;

} // isEmpty

/** Obtain an iterator for this hash table. The iterator allows all

* data stored in the hash table to be accessed. The order of the items is

* determined by the hash codes of the key values and is unlikely to make any sense

* to a human.<BR><B>Note:</B> The iterator’s <CODE>get</CODE>

* method returns a <CODE>Pair</CODE> object allowing access to both the

* key and the associated value.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> An iterator for the hash table is returned.

* @return An iterator for the hash table.

*/

public Iterator<Pair<K, V>> getIterator ()

{ return new HashTableIterator();

} // getIterator

} // class InternalHashTable
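
The reason remove only marks an entry as deleted, rather than setting the slot back to null, is that a null slot would cut the probe sequence short for any key stored beyond it. The following is a minimal sketch of the idea. ProbeSketch is a hypothetical, cut-down table of non-negative int keys whose hash is simply the key modulo the table size, chosen so that collisions are easy to arrange; it is not the InternalHashTable class.

```java
// Hypothetical, cut-down open-addressing table of non-negative int keys,
// written only to illustrate linear probing with "deleted" markers.
class ProbeSketch
  { private final Integer[] keys;      // null means the slot was never used
    private final boolean[] occupied;  // false for a used slot means "deleted"
    private int used = 0;              // Slots that are occupied or deleted

    ProbeSketch (int size)
      { keys = new Integer[size];
        occupied = new boolean[size];
      }

    // Probe from the home slot until the key or a never-used slot is found.
    private int probe (int key)
      { int index = key % keys.length;
        while (keys[index] != null && keys[index] != key)
          index = (index + 1) % keys.length; // Wrap around at the end
        return index;
      }

    // Returns false if the table has no room for a new key.
    boolean insert (int key)
      { int index = probe(key);
        if (keys[index] == null)
          { if (used + 1 >= keys.length)
              return false; // Keep one slot empty so that probing terminates
            keys[index] = key;
            used++;
          }
        occupied[index] = true; // Also revives a deleted entry
        return true;
      }

    void remove (int key)
      { int index = probe(key);
        if (keys[index] != null)
          occupied[index] = false; // Mark as deleted; keep the slot in the chain
      }

    boolean contains (int key)
      { int index = probe(key);
        return keys[index] != null && occupied[index];
      }
  }
```

With a table of size 7, the keys 3, 10 and 17 all have home slot 3 and so end up in slots 3, 4 and 5. After removing 10, a search for 17 still walks past slot 4 because that slot is marked as deleted rather than emptied.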

A.7.7 ExternalHashTable.java

package cs2;

/** This class implements a simple, external hashtable dictionary using a linked list

* for the buckets.

* @author George Wells

* @version 3.0 (25 February 2010)

*/

public class ExternalHashTable<K, V> implements Dictionary<K, V>

{ /** Default table size.

*/

private static final int DEF_SIZE = 101;

private class EntryNode<K, V> extends DictionaryPair<K, V>

{ EntryNode<K, V> next;

public EntryNode (K aKey, V aValue)

{ super(aKey, aValue);

}

public EntryNode (K aKey)

{ super(aKey);

}

} // inner class EntryNode

/** The array used for the hash table.

*/

private EntryNode<K, V>[] table;

private class HashTableIterator implements Iterator<Pair<K, V>>

{ private int index;

private EntryNode<K, V> nextEntry;

public HashTableIterator ()

{ for (index = 0; index < table.length; index++)

if (table[index] != null) // First non-empty bucket

break;

if (index < table.length) // We have some data

nextEntry = table[index];

} // constructor

public Pair<K, V> get ()

{ return nextEntry;

} // get

public void next ()

{ nextEntry = nextEntry.next;

if (nextEntry == null) // Look for next non-empty bucket

while (++index < table.length)

{ if (table[index] != null) // Found more data

{ nextEntry = table[index];

break;

}

}

} // next

public boolean atEnd ()

{ return index >= table.length;

} // atEnd

} // inner class HashTableIterator

/** Function used to generate a suitable hash value for a specified key. This method

* uses the standard Java <CODE>hashCode</CODE> method, and ensures that the result

* is non-negative and in the correct range of values to be used as a subscript for

* <CODE>table</CODE>.

* @param aKey The key to be hashed.

* @return The hash value, <I>h</I> (0 <= <I>h</I> < <CODE>table.length</CODE>).

*/

private int hash (K aKey)

{ return ((aKey.hashCode() & 0x7FFFFFFF) % table.length); // Allow for negative hashcodes

} // hash

/** Create a new hash table, with a given number of "buckets".

* <BR><I>Precondition:</I> <CODE>initSize > 0</CODE>.

* <BR><I>Postcondition:</I> The array used by the hash table is initialised and is empty.

* @param initSize The number of "buckets" to be used by the hash table.

*/

public ExternalHashTable (int initSize)

{ table = new EntryNode[initSize];

} // Constructor

/** Create a new hash table, with a default number of "buckets" (currently 101).

* <BR><I>Postcondition:</I> The array used by the hash table is initialised and is empty.

*/

public ExternalHashTable ()

{ this(DEF_SIZE);

} // Constructor

/** Delete all entries in the hash table.

* <BR><I>Postcondition:</I> The array used by the hash table is empty.

*/

public void makeEmpty ()

{ for (int k = 0; k < table.length; k++)

table[k] = null; // Delete linked list

} // makeEmpty

/** Insert a new item or update an existing one in the hash table. There is a

* requirement that the keys are unique. If the key is found in the table, then

* the associated value is replaced with that specified here.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> The item is added or updated.

* @param aKey The key for the item to be added to the hash table.

* @param aValue The value associated with the key.

*/

public void insert (K aKey, V aValue)

{ int index = hash(aKey);

// Look for aKey in linked list

EntryNode<K, V> c;

for (c = table[index]; c != null && !c.getKey().equals(aKey); c = c.next)

;

if (c == null) // Insert new node

{ EntryNode<K, V> n = new EntryNode<K, V>(aKey, aValue);

n.next = table[index];

table[index] = n;

}

else // Update existing entry

c.setValue(aValue);

} // insert

/** Insert a new key into the hash table. There is a requirement that the keys

* are unique. If the key is found in the table, then nothing is changed.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> The key is added if not present previously.

* @param aKey The key to be added to the hash table.

*/

public void insert (K aKey)

// Insert the key only if it is not already present

{ int index = hash(aKey);

// Look for aKey in linked list

EntryNode<K, V> c;

for (c = table[index]; c != null && !c.getKey().equals(aKey); c = c.next)

;

if (c == null) // Insert new node

{ EntryNode<K, V> n = new EntryNode<K, V>(aKey);

n.next = table[index];

table[index] = n;

}

} // insert

/** Remove an entry from the hash table. If

* the key is not found in the table, then nothing is done.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> The key and associated value have been removed

* from the table, if present previously.

* @param aKey The key to be removed from the hash table.

*/

public void remove (K aKey)

{ int index = hash(aKey);

if (table[index] != null) // Look for node

{ EntryNode<K, V> c = table[index], p = null;

while (c != null)

{ if (c.getKey().equals(aKey))

break;

p = c;

c = c.next;

}

if (c != null) // Unlink node

{ if (p == null)

table[index] = c.next;

else

p.next = c.next;

}

}

} // remove

/** Access an entry in the hash table, creating it if necessary. Note that this method

* adds the specified key if it is not already present.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> The value associated with the specified key is returned.

* @param aKey The key of the item to be accessed.

* @return The value associated with the specified key is returned (<CODE>null</CODE>

* if the key was not previously present).

*/

public V get (K aKey)

{ int index = hash(aKey);

EntryNode<K, V> c = table[index];

while (c != null && !c.getKey().equals(aKey))

c = c.next;

if (c == null) // Insert new entry

{ EntryNode<K, V> n = new EntryNode<K, V>(aKey);

n.next = table[index];

table[index] = n;

c = n;

}

assert aKey.equals(c.getKey());

return c.getValue();

} // get

/** Tell whether the hash table contains a specified key.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> An indication of whether the key is present in the hash

* table is returned.

* @param aKey The key of the item to be accessed.

* @return <CODE>true</CODE> if the specified key is found in the hash table,

* otherwise <CODE>false</CODE>.

*/

public boolean contains (K aKey)

{ EntryNode<K, V> c = table[hash(aKey)];

while (c != null && !c.getKey().equals(aKey))

c = c.next;

return (c != null);

} // contains

/** Tell whether the hash table is empty.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> An indication of whether the hash table is empty

* is returned.

* @return <CODE>true</CODE> if the hash table is empty,

* otherwise <CODE>false</CODE>.

*/

public boolean isEmpty ()

{ for (int k = 0; k < table.length; k++)

if (table[k] != null)

return false; // Found at least one entry

return true; // Found no entries

} // isEmpty

/** Obtain an iterator for this hash table. The iterator allows all

* data stored in the hash table to be accessed. The order of the items is

* determined by the hash codes of the key values and is unlikely to make any sense

* to a human.<BR>

* <B>Note:</B> The iterator’s <CODE>get</CODE> method returns a <CODE>Pair</CODE>

* object allowing access to both the key and the associated value.

* <BR><I>Precondition:</I> None.

* <BR><I>Postcondition:</I> An iterator for the hash table is returned.

* @return An iterator for the hash table.

*/

public Iterator<Pair<K, V>> getIterator ()

{ return new HashTableIterator();

} // getIterator

} // class ExternalHashTable
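
Both hash table classes turn a hashCode into an array subscript by clearing the sign bit and then taking the remainder. The following sketch compares this with the tempting alternative of Math.abs, which goes wrong for one particular hash code. The class and method names here are hypothetical and are used only for this comparison.

```java
class BucketIndexSketch
  { // Same idea as the hash methods in the listings: clear the sign bit, then reduce.
    static int maskedIndex (int hashCode, int tableLength)
      { return (hashCode & 0x7FFFFFFF) % tableLength; }

    // A tempting alternative that fails for Integer.MIN_VALUE, because
    // Math.abs(Integer.MIN_VALUE) is still negative.
    static int absIndex (int hashCode, int tableLength)
      { return Math.abs(hashCode) % tableLength; }

    public static void main (String[] args)
      { int h = Integer.MIN_VALUE;
        System.out.println(maskedIndex(h, 101)); // A valid subscript
        System.out.println(absIndex(h, 101));    // Negative: not a valid subscript
        System.out.println(maskedIndex("hello".hashCode(), 101));
      }
  }
```

The masked version always yields a value from 0 to tableLength-1, whatever the hash code, which is why it is safe to use directly as a subscript for the table.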

A.8 Binary Searches: BinarySearch.java

package cs2;

/** This class contains two searching algorithms:

* a generic binary search and an interpolated binary search for

* lists of double values.

* @author George Wells

* @version 1.1 (7 November 2000)

*/

public class BinarySearch

{

/** Search through list of entries for an item using a binary search.

* <BR><I>Precondition:</I> The list must be sorted into ascending order.

* <BR><I>Postcondition:</I> Returns -1 if item is not found, otherwise

* returns the index of item.

* <P>This method requires that the data implements

* the <CODE>Comparable</CODE> interface.

* <P>This search is of order log <I>n</I>.

* @param list The list of items to be searched.

* @param item The item being searched for.

* @return -1 if <CODE>item</CODE> is not found, otherwise returns the

* index of item.

*/

public static int binarySearch (Comparable list[], Comparable item)

{ if (list.length == 0) // Nothing to search: avoids indexing an empty list

return -1;

int left = 0,

right = list.length-1,

look;

do

{ look = (left + right) / 2;

if (list[look].compareTo(item) > 0)

right = look - 1;

else

left = look + 1;

} while (list[look].compareTo(item) != 0 && left <= right);

if (list[look].compareTo(item) == 0)

return look;

else

return -1;

} // binarySearch

/** Search through list of entries for an item using an interpolated binary

* search.

* <BR><I>Precondition:</I> The list must be sorted into ascending order.

* <BR><I>Postcondition:</I> Returns -1 if item is not found, otherwise

* returns the index of item.

* <P>This method requires that the data consists of <CODE>double</CODE>

* values for the interpolation.

* <P>This search is of order log log <I>n</I> on average when the values are

* uniformly distributed; in the worst case it is of order <I>n</I>.

* @param list The list of items (<CODE>double</CODE> values) to be

* searched.

* @param item The item (a <CODE>double</CODE> value) being searched for.

* @return -1 if <CODE>item</CODE> is not found, otherwise returns the

* index of item.

*/

public static int intBinarySearch (double list[], double item)

{ if (list.length == 0) // Nothing to search: avoids indexing an empty list

return -1;

int left = 0,

right = list.length-1,

look;

do

{ if (left != right)

look = (int)(left + ((item-list[left]) /

(list[right]-list[left])) *

(double)(right-left));

else

look = left;

if (look < left)

look = left;

if (look > right)

look = right;

if (list[look] > item)

right = look - 1;

else

left = look + 1;

} while (list[look] != item && left <= right);

if (list[look] == item)

return look;

else

return -1;

} // intBinarySearch

} // class BinarySearch
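
The binary search above uses the raw Comparable type, as was usual when it was written. The following is a sketch of how the same search might look with generic type parameters, so that the compiler can check that the list and the item are of compatible types. The class and method names are hypothetical, and the loop is written as a while loop, which is a variation on the listing rather than a copy of it. The interpolate method shows the probe calculation of the interpolated search in isolation.

```java
class TypedSearchSketch
  { // Binary search over a sorted array; returns the index of item, or -1.
    static <T extends Comparable<? super T>> int binarySearch (T[] list, T item)
      { int left = 0;
        int right = list.length - 1;
        while (left <= right)
          { int look = left + (right - left) / 2; // Avoids overflow of left + right
            int cmp = list[look].compareTo(item);
            if (cmp == 0)
              return look;
            else if (cmp > 0)
              right = look - 1; // Item, if present, lies to the left
            else
              left = look + 1;  // Item, if present, lies to the right
          }
        return -1; // Also reached immediately for an empty array
      }

    // Probe position used by an interpolated search over the range
    // left..right (requires list[left] != list[right]).
    static int interpolate (double[] list, int left, int right, double item)
      { return (int)(left + ((item - list[left]) / (list[right] - list[left]))
                              * (double)(right - left));
      }
  }
```

For evenly spaced values such as 0, 10, 20, 30 and 40, the interpolated probe for 30 is 0 + (30/40) * 4 = 3, which lands on the item at the first attempt. This is the situation in which the interpolated search does best.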

A.9 Sorting: Sort.java

package cs2;

import java.util.Stack;

/** This class contains a number of generic sorting algorithms:

* BubbleSort, InsertionSort, SelectionSort,

* QuickSort and MergeSort.

* @author George Wells

* @version 1.1 (7 November 2000)

*/

public class Sort

{

// --- Simple sorting algorithms of O(n^2) ---

/** Sort a list of items into ascending order using the Bubble Sort.

* <BR><I>Precondition:</I> The list contains data that implements

* the <CODE>Comparable</CODE> interface.

* <BR><I>Postcondition:</I> The list is in ascending order.

* <P>This sort is of order <I>n</I><SUP>2</SUP>.

* @param list The list of items to be sorted.

*/

public static void bubbleSort (Comparable list[])

{ boolean madeSwap = true; // Flag to tell if we can stop early

for (int pass = 0; pass < list.length && madeSwap; pass++)

{ madeSwap = false;

for (int k = 0; k < list.length-pass-1; k++)

if (list[k].compareTo(list[k+1]) > 0)

{ Comparable tmp = list[k];

list[k] = list[k+1];

list[k+1] = tmp;

madeSwap = true;

}

}

} // bubbleSort

/** Sort a list of items into ascending order using the Insertion Sort.

* <BR><I>Precondition:</I> The list contains data that implements

* the <CODE>Comparable</CODE> interface.

* <BR><I>Postcondition:</I> The list is in ascending order.

* <P>This sort is of order <I>n</I><SUP>2</SUP>.

* @param list The list of items to be sorted.

*/

    public static void insertionSort (Comparable list[])
      { for (int k = list.length-2; k >= 0; k--)
          { Comparable tmp = list[k];
            int j = k+1;
            while (j < list.length && tmp.compareTo(list[j]) > 0)
              // Move data down
              { list[j-1] = list[j];
                j++;
              }
            list[j-1] = tmp;
          }
      } // insertionSort

/** Sort a list of items into ascending order using the Selection Sort.

* <BR><I>Precondition:</I> The list contains data that implements

* the <CODE>Comparable</CODE> interface.

* <BR><I>Postcondition:</I> The list is in ascending order.

* <P>This sort is of order <I>n</I><SUP>2</SUP>, but does only <I>n</I>

* movements of the data.

* @param list The list of items to be sorted.

*/

    public static void selectionSort (Comparable list[])
      { for (int k = 0; k < list.length-1; k++)
          { int minPos = k;
            for (int j = k+1; j < list.length; j++)
              if (list[j].compareTo(list[minPos]) < 0)
                minPos = j;
            // Now swap the k'th and smallest items
            Comparable tmp = list[k];
            list[k] = list[minPos];
            list[minPos] = tmp;
          }
      } // selectionSort

// --- More powerful sorting methods of O(n log n) ---

/** Partition a list between given start and end points, returning the

* partition point. The value at <CODE>list[start]</CODE> is used as the

* partition element. This method is used by the Quick Sort.

* <BR><I>Precondition:</I> The list contains data that implements

* the <CODE>Comparable</CODE> interface.

* <BR><I>Postcondition:</I> The items in the list before the partition

* point are less than or equal to the partition element, and the items

* in the list after the partition point are greater than the partition

* element.

* @param list The list of items to be partitioned.

* @param start The index of the first item in the list to be considered.

* @param end The index of the last item in the list to be considered.

* @return The index of the partition point.

*/

    private static int partition (Comparable list[], int start, int end)
      { int left = start,
            right = end;
        Comparable tmp;
        while (left < right)
          { // Work from right end first
            while (list[right].compareTo(list[start]) > 0)
              right--;
            // Now work up from start
            while (left < right && list[left].compareTo(list[start]) <= 0)
              left++;
            if (left < right)
              { tmp = list[left];
                list[left] = list[right];
                list[right] = tmp;
              }
          }
        // Exchange the partition element with list[right]
        tmp = list[start];
        list[start] = list[right];
        list[right] = tmp;
        return right;
      } // partition

/** Sort a list of items into ascending order using a recursive form of the

* Quick Sort.

* <BR><I>Precondition:</I> The list contains data that implements

* the <CODE>Comparable</CODE> interface.

* <BR><I>Postcondition:</I> The list between <CODE>start</CODE> and

* <CODE>end</CODE> is in ascending order.

* @param list The list of items to be sorted.

* @param start The index of the first item in the list to be considered.

* @param end The index of the last item in the list to be considered.

*/

    private static void recursiveQS (Comparable list[], int start, int end)
      { if (start < end)
          { int partitionPoint = partition(list, start, end);
            recursiveQS(list, start, partitionPoint-1);
            recursiveQS(list, partitionPoint+1, end);
          }
      } // recursiveQS

/** Sort a list of items into ascending order using a recursive form of the

* Quick Sort.

* <BR><I>Precondition:</I> The list contains data that implements

* the <CODE>Comparable</CODE> interface.

* <BR><I>Postcondition:</I> The list is in ascending order.

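* <P>This sort is of order <I>n</I> log <I>n</I> on average, but degrades to
* order <I>n</I><SUP>2</SUP> in the worst case (for example, a list that is
* already sorted, since the first item is used as the partition element).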

* @param list The list of items to be sorted.

*/

    public static void quickSort (Comparable list[])
      // Quick Sort the list - actually just calls recursiveQS
      { recursiveQS(list, 0, list.length-1);
      } // quickSort

/** Sort a list of items into ascending order using an iterative form of

* the Quick Sort.

* This makes use of the <CODE>java.util.Stack</CODE> class.

* <BR><I>Precondition:</I> The list contains data that implements

* the <CODE>Comparable</CODE> interface.

* <BR><I>Postcondition:</I> The list is in ascending order.

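* <P>This sort is of order <I>n</I> log <I>n</I> on average, but degrades to
* order <I>n</I><SUP>2</SUP> in the worst case (for example, a list that is
* already sorted, since the first item is used as the partition element).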

* @param list The list of items to be sorted.

*/

    public static void iterativeQuickSort (Comparable list[])
      // Quick Sort the list using an explicit stack of (start,end) pairs
      { class Pair // Store (start,end) pairs
          { public int start, end;
          } // class Pair
        Stack<Pair> s = new Stack<Pair>();
        Pair p = new Pair();
        p.start = 0;
        p.end = list.length-1;
        s.push(p); // Starting pair on the stack - the whole list to sort
        while (! s.empty())
          { p = s.pop(); // Get next sublist to sort
            while (p.start < p.end)
              { int partitionPos = partition(list, p.start, p.end);
                // Now push left sublist
                Pair tmp = new Pair();
                tmp.start = p.start;
                tmp.end = partitionPos-1;
                s.push(tmp);
                // Start work on right sublist
                p.start = partitionPos+1;
              }
          }
      } // iterativeQuickSort

/** Merge two sorted sublists of items into a single sorted list.

* <BR><I>Precondition:</I> The sublist between <CODE>first</CODE> and

* <CODE>mid</CODE> is in ascending order, and the sublist between

* <CODE>mid+1</CODE> and <CODE>last</CODE> is in ascending order.

* <BR><I>Precondition:</I> Space is available to create an array of

* <CODE>last-first+1</CODE> elements.

* <BR><I>Postcondition:</I> The list between <CODE>first</CODE> and

* <CODE>last</CODE> is in ascending order.

* @param list The list of items to be merged.

* @param first The index of the first element of the first sublist to be

* considered.

* @param mid The index of the midpoint between the two sublists to be

* considered.

* @param last The index of the last element of the second sublist to be

* considered.

*/

    public static void merge (Comparable list[], int first, int mid, int last)
      { Comparable[] tmp = new Comparable[last-first+1];
                                  // Temporary array for merging
        int i = first, // Subscript for first sublist
            j = mid+1, // Subscript for second sublist
            k = 0; // Subscript for merged list
        // Merge sublists together
        while (i <= mid && j <= last)
          // Using <= takes equal items from the first sublist, keeping the merge stable
          if (list[i].compareTo(list[j]) <= 0)
            tmp[k++] = list[i++];
          else
            tmp[k++] = list[j++];
        // Copy remaining tail of one sublist
        while (i <= mid)
          tmp[k++] = list[i++];
        while (j <= last)
          tmp[k++] = list[j++];
        // Now copy tmp back into list
        for (k = first; k <= last; k++)
          list[k] = tmp[k-first];
      } // merge

/** Sort a list of items into ascending order using a recursive form of the

* Merge Sort.

* <BR><I>Precondition:</I> The list contains data that implements

* the <CODE>Comparable</CODE> interface.

* <BR><I>Postcondition:</I> The list between <CODE>start</CODE> and

* <CODE>end</CODE> is in ascending order.

* @param list The list of items to be sorted.

* @param start The index of the first item in the list to be considered.

* @param end The index of the last item in the list to be considered.

*/

    private static void recursiveMS (Comparable list[], int start, int end)
      { if (start < end)
          { int midPoint = (start + end) / 2;
            recursiveMS(list, start, midPoint);
            recursiveMS(list, midPoint+1, end);
            merge(list, start, midPoint, end);
          }
      } // recursiveMS

/** Sort a list of items into ascending order using a recursive form of the

* Merge Sort.

* <BR><I>Precondition:</I> The list contains data that implements

* the <CODE>Comparable</CODE> interface.

* <BR><I>Postcondition:</I> The list is in ascending order.

* <P>This sort is of order <I>n</I>log <I>n</I>, but has 2<I>n</I> space

* requirements.

* @param list The list of items to be sorted.

*/

    public static void mergeSort (Comparable list[])
      { recursiveMS(list, 0, list.length-1);
      } // mergeSort

} // class Sort
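
The methods in Sort.java take parameters of the raw type Comparable. This still compiles, but on Java 5 and later the compiler reports unchecked-call warnings for each use of compareTo. As a rough sketch only, one of the methods (the insertion sort) could be given a bounded type parameter as shown below. The class name GenericSortSketch is hypothetical and is not part of the cs2 package; the algorithm is the same right-to-left insertion used in the listing.

```java
import java.util.Arrays;

// Hypothetical class for illustration only; not part of the cs2 package.
class GenericSortSketch {

    /** Insertion sort using a bounded type parameter instead of raw Comparable.
     *  Works from the right-hand end, inserting each item into the sorted tail. */
    static <T extends Comparable<? super T>> void insertionSort(T[] list) {
        for (int k = list.length - 2; k >= 0; k--) {
            T tmp = list[k];
            int j = k + 1;
            // Shift smaller items from the sorted tail one place towards the front
            while (j < list.length && tmp.compareTo(list[j]) > 0) {
                list[j - 1] = list[j];
                j++;
            }
            list[j - 1] = tmp;
        }
    }

    public static void main(String[] args) {
        Integer[] numbers = {5, 3, 8, 1, 9, 2};
        insertionSort(numbers);
        System.out.println(Arrays.toString(numbers)); // [1, 2, 3, 5, 8, 9]

        String[] words = {"pear", "apple", "fig"};
        insertionSort(words);
        System.out.println(Arrays.toString(words)); // [apple, fig, pear]
    }
}
```

Any array whose element type implements Comparable (Integer, String and so on) can be passed in the same way. Arrays of primitive types such as int[] cannot, which also holds for the methods in the Sort class.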
