CPSC 211 Data Structures & Implementations (c) Texas A&M University

About These Slides

These slides were developed by

Prof. Jennifer Welch
Department of Computer Science
Texas A&M University
College Station, TX
[email protected]

during Spring 1999. Comments and suggestions for improvements are welcome.


What are Data Structures?

Data structures are ways to organize data (information). Examples:

• simple variables —

• objects —

• arrays —

• linked lists —

Typically, algorithms go with the data structures to manipulate the data (e.g., the methods of a class).

This course will cover some more complicated data structures:

• how

• what


Abstract Data Types

An abstract data type (ADT) defines

•

•

Similar to a

This course will cover

• specifications of

• pros and cons of

• how the


Specific ADTs

The ADTs to be studied (and some sample applications) are:

•

•

•

•

•


How Does C Fit In?

Although data structures are universal (can be implemented in any programming language), this course will use Java and C:

•

•

We will learn how to gain the advantages of

Reasons to learn C:

• learn

• useful

• ubiquitous and

• Unix

• C code can be very

• very efficient


Other Topics

Course will emphasize good software development practice:

•

•

•

•

Course will touch on several more advanced computer science topics that appear later in the curriculum, and fit in with our topics this semester:

•

•

•


Principles of Computer Science

Computer Science is like:

• engineering:

• science:

• math:

However, CS studies

Recurring concepts in computer science are:

• layers, hierarchies, information-hiding, abstraction, interfaces

• efficiency, tradeoffs, resource usage

• reliability, affordability, correctness


Introduction to Data Structures

Data structures are one of the enduring principles in computer science. Why?

1. Data structures are based on the notion of information hiding:

2. A number of data structures are useful in a wide range of applications.


Efficiency Considerations

Since these data structures are so widespread, it’s important to implement them efficiently. Measures of efficiency:

•

•

in

•

•

We will study tradeoffs, such as

•

•

Efficiency will be measured using

•

•


Asymptotic Analysis

Actual (wall-clock) time of a program is affected by:

•

•

•

•

•

•

Instead of wall-clock time, look at the pattern of the program’s behavior as the problem size increases. This is called asymptotic analysis.


Big-Oh Notation

Big-oh notation is used to capture the generic

From a practical point of view, you can get the big-oh notation for a function by

1.

2.

Which terms are lower order than others? In increasing order:

Examples:

• 4302 =

• n^3 + n log n + n^5 + n =

• 34n^3 - 2n log n + 0.0004n^5 + 5.2n =

See Appendix B, Section 4 of Standish, or CPSC 311, for mathematical definitions and justifications.


Why Multiplicative Constants are Unimportant

An example showing how multiplicative constants become unimportant as n gets very large:

n            1000 log n        0.0001 * n^2

2

256

4096

8192

16,384

32,768

1,048,576

Big-oh notation is not always appropriate! If your program is working on small input sizes,


Generic Steps

How can you figure out the running time of an algorithm without implementing it, running it on various inputs, plotting the results, and fitting a curve to the data? And even if you did that, how would you know you fit the right curve?

We count generic steps of the algorithm. Each generic step that we count should be

Classifying an assignment statement as a generic step is

Classifying a statement “sort the entire array” as a generic step is


Stack vs. Heap

Memory used by an executing program is partitioned:

• the stack:

– When a method begins executing, a piece of the stack (stack frame) is devoted to it.

– There is an entry in the stack frame for

  •

  •

  •

– For variables of primitive type, the data itself is stored

For variables of object type,

– When the method finishes, the method’s stack frame is

• the heap: Dynamically allocated memory goes here, including the actual data for objects. Lifetime is


Stack Frames Example

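[Figure: contents of the stack after each event, listed bottom to top]

event            stack (bottom to top)
(start)          main
main calls p     main, p
p calls q        main, p, q
q returns        main, p
p calls r        main, p, r
r calls s        main, p, r, s
s returns        main, p, r
r returns        main, p
p returns        main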


Objects

An object is an entity (e.g., a ball) that has

• state —

• behavior —

A class is the

Analogy: a class is like an

an object is like an

• class defines important

• construction is required to

• many objects/houses can be created


Data Abstraction

The class concept supports

Similar principles apply as for procedural abstraction:

• group

• group

• separate the issue of

• separate the issue of


References

The class of an object is its

Objects are declared differently than are variables of primitive types.

Suppose there is a class called Person.

    int total;
    Person neighbor;

• Declaration of total allocates storage on the

• Declaration of neighbor allocates storage on the


Creating Objects

A constructor is a special method of the class that

When a constructor is called,

• storage space is allocated

• each object gets

• the object’s state is

The name of the constructor for class X is X(). Ex:

    neighbor = new Person();

The operator new must be put in front of the call to the constructor.

Summary: Declaring a variable of an object type produces


Creating Objects (cont’d)

You can combine the declaration and initialization:

    Person neighbor = new Person();

just as you can for primitive types:

    int total = 25;


Object Assignment & Aliases

The meaning of assignment is different for objects than it is for primitive types.

    int num1 = 5;
    int num2 = 12;
    num2 = num1;

At the end, num2 holds 5.

    Person neighbor = new Person();  // creates object 1
    Person friend = new Person();    // creates object 2
    friend = neighbor;

At the end, friend and neighbor both refer to object 1 (they are aliases of each other) and nothing refers to object 2 (it is inaccessible).


Data Abstraction Revisited

As a rule of thumb, referring to instance variables outside the class is

For instance, the implementor of the Person class might decide to store the age

In this case, getAgeInYears must change:

Code that got the age using this method need not change, but code that got the age using .age directly

Moral:


Public vs. Private

You can tailor the ability to access methods and variables from outside the class, using visibility modifiers.

• public: the variable or method can

• private: the variable or method can

Visibility modifiers go at the beginning of the line that declares the variable or method. Ex:

    public static void main(...
    private int age;

Rules of thumb:

• make instance variables

• make instance methods that are part of the public interface of the class

• make instance methods that help with internal work of a class


Public vs. Private (cont’d)

Instance variables should be accessible only indirectly via public "get" and "set" methods. Ex:

getAgeInYears()

Group together all the private variables/methods, and all the public ones when you format your program.


Specification vs. Implementation

Users of a class should rely only on the specification of the class. They are allowed to

• declare

• create

• invoke

Implementors of a class should

• define

• hide

• protect

• feel free to


Inheritance

Inheritance lets a programmer derive a new class from an existing class. New class can

• use

• modify

• have

Thus inheritance promotes software reuse. It is a defining characteristic of

Terminology:

• Class A is derived from (or, inherits from) another class B

• A is called subclass or child class.

• B is called superclass or parent class.


Benefits of Inheritance

Inheritance is particularly useful in large software projects:

•

– saves

– provides

– supports

•

•

•


Costs of Inheritance

�

– Usually this disadvantage is outweighed by

– Once system is working,

�

�


Inheritance in Java

To declare that a class is a subclass of another class:

    class <child-class> extends <parent-class> {
        ... // define the child-class
    }

• child class inherits

• child class inherits

• child class does NOT inherit

• child class does NOT inherit

Inherited variables and methods can be used in the child class

Inheritance is a one-way street!


Protected Visibility

• private:

• public:

This makes it dangerous to inherit variables, since normally instance variables should not be made accessible outside the class.

The solution is

• protected:


Overriding Methods

When a child class defines a method with the same name and signature (sequence of parameters) as the parent, the child’s version overrides the parent’s version. Useful when

Polymorphism means that

These are not necessarily the same, since a variable can refer to any object whose class is a descendant of the variable’s class.

When in doubt, draw a memory diagram!


Abstract Classes — Motivation

Consider a database for a veterinarian to keep track of medical and billing information for each patient.

• Each patient is someone’s pet (e.g., dog, bird).

• Some aspects of the vet’s business are independent of the particular species (e.g., billing, owner info).

• Some aspects depend critically on the species (e.g., the vaccination schedule, diet recommendations).

An obvious organization is to have a

Note that it does not make sense to create a Pet object —

The Pet class is used to


Rules for Abstract Classes and Methods

• Only instance methods can be declared

• Any class with an abstract method must be declared

• A class may be declared abstract

• An abstract class cannot

• A non-abstract subclass of an abstract class must

• If a subclass of an abstract class does not implement all of the abstract methods that it inherits, then

Since an abstract class cannot be instantiated, its variables and methods are not directly used. But they can be


Declaring an Interface

An interface is an abstract class taken to the extreme. It is like an abstract class in which

    interface <interface name> {
        <constant declarations>         // public final
        <abstract method declarations>  // public abstract
    }

An interface provides

• a collection of

• a collection of

For example:


Implementing an Interface

The syntax for “inheriting from” (called implementing) an interface I is:

    class B implements I { ... }

For example:

The class Account

• can access the

• must provide an implementation of


Abstract Classes vs. Interfaces

• An abstract class can be used as a repository of

• A class can implement

• Both abstract classes and interfaces can be used to


Object-Oriented Design

The design of a software system is an iterative process.

• choose

• develop

• previous step may indicate that

• develop

• etc.

As the design matures, objects are abstracted into classes:

• group

• put

• determine

Initial design effort focuses on the overall structure of the program. The algorithms for the methods are specified using pseudocode. Actual coding begins


Deciding on Objects and Classes

Make some guesses about what the objects in the system are and try to arrange them into groups (which will be the classes). Although you should put serious thought into this, don’t try to do this perfectly on the first pass.

Rule of Thumb:

Later you may need

As you come up with the objects, some details (variables and methods) will be obvious. Document these and test them out with scenarios —

A scenario is a


Linked List

Linked lists are useful when

Linked lists are an example of

Separate blocks of storage are

Linked representations are an important alternative to

Many key abstract data types (lists, stacks, queues, sets, trees, tables) can be represented with either

Important to understand the


Pointers

Pointers in Java are called

However, you cannot


Linear Linked Lists

The list consists of a series of

Each node contains

�

�

To realize this idea in Java:

� each

� class

–

–

� another class


Linear Linked Lists (cont’d)

Here is a diagram of the heap:

Space complexity:


Linked List Example — Node Class

For a linked list of books, first define a class that represents individual list elements (nodes).

The type of the link variable is the same as the class being defined —


Linked List Example — List Class

Then define a class that represents

�

�

�


Linked List Operations

What should be the operations on a linked list?

•

  –

  –

  –

•

  –

  –

  –

•

Add some instance methods to the BookList class:


Using a Linked List

Example:


Inserting at the Front of a Linked List

Pseudocode:

1.

2.

In Java (assuming the parameter is not null):


Inserting at the Front of a Linked List (cont’d)

What happens if we do step 1 and step 2 in the opposite order?

Time Complexity:


Inserting at the End of a Linked List

First, assume the list is empty (i.e., first equals null).

1.

2.

Now, assume the list is not empty (i.e., first does not equal null).

1.

2.

How do we do step 1?


Inserting at the End of a Linked List (cont’d)

Time Complexity:


Using a Last Pointer

To improve running time, keep a pointer to the last node in the list class, as well as the first node.

Time Complexity:


Using a Last Pointer (cont’d)


Deleting Last Node from Linked List

Suppose we want to delete the node at the end of the list and return the deleted node.

First, let’s handle the boundary conditions:

• If the list is empty,

• If the list has only one element


Deleting Last Node from Linked List (cont’d)

Suppose the list has at least two elements. First attempt:

1.

2.

3.

...

return this

Step 1 can be done as before.

What about step 2?


Deleting Last Node from Linked List (cont’d)

Time Complexity:

Would it help to keep a last pointer?


Linked Lists Pitfalls

• Check that a link is not null before following it! Example:

• Mark end of list

• Be careful with boundary cases!

• Draw memory diagrams!

• Don’t lose access to needed objects!


Linked Lists vs. Arrays

Space complexity:

Time Complexity (n data items):

                singly    singly linked,   doubly    doubly linked,   array
                linked    last ptr         linked    last ptr

insert front

insert end

delete first

delete last

search


Linked Lists vs. Arrays (cont’d)

Suppose the items in the sequence are in sorted order. Then data items must be inserted in the correct place. But perhaps this will make searching for an item easier. Break the insertion process into two parts:

1. search

2. insert

          singly    singly linked,   doubly    doubly linked,   array
          linked    last ptr         linked    last ptr

search

insert


Linked Lists vs. Arrays (cont’d)

Tradeoff:

• linked list:

– insert is

– search is

because nodes

• arrays:

– insert is

– search is

because nodes

Binary search cannot be used on

Later we will see some other data structures that try to


Other Linked Structures

We don’t have to restrict ourselves to just having one link instance variable per node. We can get arbitrarily complicated linked structures.

Some of the more common and useful ones are:

• doubly linked list —

• rings —

• trees —

• general graphs —


Recursion

Idea of recursion is closely related to the principle of

• Figure out how to

• Assume you have a

• Figure out how to

This is also an application of

Rules for recursive programs:

• There must be

• Recursive call(s) must


Stack Frames for Recursive Methods

When a recursive method is executed,

Example: The factorial of n, represented n!, is calculated as n * (n - 1) * (n - 2) * ... * 2 * 1.

To compute n!:


Stack Frames for Factorial Example

Stack frames when calling fact(4):


Reversing a Linked List Recursively

To find a recursive solution, break the problem down into a smaller problem. Let the list consist of nodes x1, x2, ..., xn.

One idea:

1. Reverse

2. Put

Step 1 solves a smaller problem; step 2 does a littlemore work to solve the larger problem.

(A similar idea:

1. Reverse

2. Put

Stopping case?


Reversing a Linked List Recursively (cont’d)

    abstract class Node {
        Node link;
    }

    class LinkedList {
        Node first;
        ...
        void reverseList() {
            first = reverse(first);
        }
    }

reverseList is an instance method that

Note a common occurrence:


Reversing a Linked List Recursively (cont’d)

• reverse takes as a parameter

• reverse returns


Concatenating Two Lists

Method concat appends the list starting with node b to the end of the list starting with node a. It returns a reference to the first node in the resulting list.

Time Complexity: To reverse a list of n nodes takes


Figure for Reversing a Linked List Recursively


Reversing an Array Recursively

Let A be an array of size n. To reverse A, we must change which indexes are occupied by which data, so that at the end:

• A[0] contains

• A[1] contains

• etc.

We can follow the ideas from the linked list:

1. save

2. recursively cause

3. store

This breaks the problem of size n down into a subproblem of size n - 1.

Stopping case:


Reversing an Array Recursively (cont’d)

The following reverses the elements of A starting at index start:

The top level call is:


Figure for Reversing an Array Recursively


Towers of Hanoi

Towers of Hanoi is an example of a problem that is much easier to solve using recursion than not using recursion.

• There are 3 pegs and n disks, all of different sizes

• Initially all disks are on the start peg, stacked in decreasing size, with largest on bottom and smallest on top.

• We must move all the disks to the end peg

• The third peg

Example: n = 2. Solution is:

1. Move

2. Move

3. Move

For largern, it becomes difficult to figure out.


Recursive Solution to Towers of Hanoi

Using recursion can help. Suppose someone gives us a method M to move n - 1 disks. We can use it to solve the problem for n disks as follows:

1. Move

2. Move

3. Move

Steps 1 and 3 will be done

Stopping case?


Figure for Towers of Hanoi


Recursive Solution to Towers of Hanoi (cont’d)

The output of the program will be a list of instructions.

To call this method, suppose you have 4 disks and you want to use peg 1 as the start peg, peg 3 as the finish peg, and peg 2 as the spare peg:


Time Complexity of Towers of Hanoi Solution

Time Complexity: Asymptotically proportional to the number of

Each instantiation of the method

To count the number of instantiations, draw a

Number of vertices in the tree is

Therefore time complexity is


Parsing Arithmetic Expressions

An important part of a compiler is the parser, which checks whether

An important part of this problem is to check whether

• a + (b - (x/y))

• a + +b/z

• (a)) * c

To simplify the problem:

• Assume that the operands are

• Only consider operators

The correct syntax for arithmetic expressions can be described using


A Grammar for Arithmetic Expressions

Sample Rules: (| means “or”)

1.

2.

3.

Here are some derivations:


Recursive Parsing Algorithm

Idea is to try to obtain an expression from the input. To do this, try to obtain from the input

•

•

•

To obtain a term from the input (starting at the current position), try to obtain

•

•

•

To obtain a factor from the input (starting at the current position), try to obtain

•

•


Recursive Parsing Algorithm (cont’d)

At the top level:

    boolean valid(String input) {
        String remainder = getExpr(input);
        return ((remainder != null) &&
                (remainder.length() == 0));
    }

getExpr recognizes an expression at the beginning of input and returns the rest of the string, which will be the empty string if nothing is left over. If a syntax error is encountered, it returns null. (Does not handle white space in the input.)


Recursive Parsing Algorithm (cont’d)


Abstract Data Types

An abstract data type (ADT) defines entities that have

•

•

ADTs provide the benefits of

There is a strict separation between

This separation facilitates

ADTs are easily achieved in


ADT Example: Priority Queue Specification

The priority queue ADT is useful in many situations. Here is its specification:

• The state is

• The operations on a priority queue are:

–

–

–

Note that there is no operation to

Example applications:

• Pay

• Provide


Using a Priority Queue to Sort a List of Integers

Even without knowing anything about how a priority queue might be implemented, we can take advantage of its operations to solve other problems.

For example, to sort a list of numbers:

• Insert

• Successively

• Store


Implementing a Priority Queue with an Array


Implementing a Priority Queue with a Linked List

Pseudocode:

• To insert an element:

• To remove the highest priority element:

– Scan

– When

Time is

Asymptotic running times are

Time to sort is

Can we do things faster by keeping the array, or linked list, elements in sorted order?

Warning:


Implementing a PQ with a Sorted Array

Keep the array elements in increasing order of priority. (If highest priority is smallest element, then elements will be in decreasing order.)

Pseudocode:




Implementing a PQ with a Sorted Linked List

Pseudocode:



Asymptotic times are


Generic PQ Implementation Using Java

To avoid rewriting the priority queue implementation for every different kind of element (integer, double, String, user-defined classes, etc.), we can use Java’s interface feature.

All that is required is


Using the ComparisonKey Interface

• Change the specification of the PriorityQueue class to consist of a collection of

• Any class that

• Define a class called PQItem that

• sortPQ, the sorting algorithm that uses a priority queue, can


Generic Implementation of PQ with Array

    class PriorityQueue {
        private ComparisonKey[] A =
            new ComparisonKey[100];  // int -> CK
        private int next;

        PriorityQueue() {
            next = 0;
        }

        public void insert(ComparisonKey x) {  // int -> CK
            A[next] = x;
            next++;
        }

        public ComparisonKey remove() {  // int -> CK
            ComparisonKey high = A[0];  // int -> CK
            int highLoc = 0;
            for (int cur = 1; cur < next; cur++) {
                if (high.compareTo(A[cur]) ==
                        ComparisonKey.LOWER) {  // use compareTo method
                    high = A[cur];
                    highLoc = cur;
                }
            }
            A[highLoc] = A[next-1];
            next--;
            return high;
        }
    }


Implementing the Generic PQItem

Here is a possible PQItem class for integers. Note

For a PQItem class for strings:

• make

• make

• the method


Generic PQItem’s (cont’d)

This approach is particularly powerful since we can

Suppose the items are

One form of priority might be

Another form might be

All those decisions will be encapsulated inside the


Sorting with Generic PQ

Finally, here is the sorting algorithm:

    void sortPQ (ComparisonKey[] A) {
        int n = A.length;
        PriorityQueue pq =
            new PriorityQueue();
        for (int i = 0; i < n; i++)
            pq.insert(A[i]);
        for (int i = 0; i < n; i++)
            A[i] = pq.remove();
    }

The only difference from before is

IMPORTANT TO NOTICE:

• The PriorityQueue class

• The sortPQ method


Importance of Modularity and Information Hiding

Why is it valuable to be able to do these kinds of things?

The public/private visibility modifiers of Java, and the discipline of not making the internal details be available outside are forms of

Information hiding promotes modular programming — you can

The key to abstraction is


Compiling and Running a C Program in Unix

Simple scenario in which your program is in a single file: Suppose you want to name your program test.

1. edit

2. compile

3. if

4. run

5. if


Structure of a C Program

A C program is a list of

Every C program must contain

Functions are

• The

• For

•

• The \n is

• Comments


A Useful Library

See the Reek book (especially Chapter 16) for a description of what you can do with built-in libraries. In addition to stdio.h,

• stdlib.h lets you use functions for, e.g.,

–

–

–

–

• math.h provides

• string.h has


Printf

The function printf is used to print to the standard output (screen):

• It can take a

• The first argument must

• The first argument might

• A

• Following the first argument is a

Example:

Output is:
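As a hedged illustration of conversion codes in a format string, the sketch below builds one line of text from an integer and a floating-point value. The function names format_summary and print_summary, their parameters, and the sample text are assumptions made for this example and do not come from the slides. The first function uses snprintf, which accepts the same kind of format string as printf but writes into a character buffer instead of the screen.

```c
#include <stdio.h>

/* Hypothetical helper (not from the slides): formats a count and an
   average into buf using the same conversion codes printf accepts.
   Returns the number of characters the full line requires. */
int format_summary(char *buf, size_t size, int count, double average) {
    /* %d is replaced by an int, %.2f by a double with 2 decimals */
    return snprintf(buf, size, "count = %d, average = %.2f\n",
                    count, average);
}

/* Prints the same line straight to standard output with printf. */
void print_summary(int count, double average) {
    printf("count = %d, average = %.2f\n", count, average);
}
```

Calling print_summary(3, 2.5) would write the line "count = 3, average = 2.50" to the screen.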


Variables and Arithmetic Expressions

The main numeric data types that we will use are:

•

•

•

Variables are declared and manipulated in arithmetic expressions pretty much as in Java. For instance,

However, in C,

Reading from the Keyboard

The function scanf reads in data from the keyboard.

• scanf takes a

• The first argument is

• Each

• After the first argument is a

• The subsequent arguments must each be

• The code for an

When you run this program, it will wait for you to enter two integers, and then continue. The integers can be on the same line separated by a space, or on two lines.
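The behavior described above (two integers accepted on one line or on two) can be sketched as follows. This is only an illustration: the helper read_two_ints is hypothetical, and it uses sscanf, which reads from a string rather than the keyboard, so that the result is easy to check. sscanf takes the same format codes as scanf and likewise needs the address of each variable it fills in.

```c
#include <stdio.h>

/* Hypothetical helper: reads two integers from text into *a and *b.
   sscanf stands in for scanf here; it uses the same format codes.
   Returns the number of values successfully read (2 on success). */
int read_two_ints(const char *text, int *a, int *b) {
    /* white space in the format matches spaces and newlines alike */
    return sscanf(text, "%d %d", a, b);
}
```

With keyboard input, the corresponding call would be scanf("%d %d", &x, &y); for two int variables x and y.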


Functions

Functions in C are pretty much like methods in Java (dealing only with primitive types). Example:

    #include <stdio.h>

    double times2 (double x) {
        x = 2*x;
        return x;
    }

    int main (void) {
        double y = 301.4;
        printf("Original value is %f; final value is %f.\n",
               y, times2(y));
        return 0;
    }

• Functions must be

• As in Java, parameters are

• As in Java, if the function does not return any value,

• Parameters and local variables of functions


Recursive Functions

Recursion is essentially the same as in Java.

The only difference is if you have mutually recursive functions, also called indirect recursion: for instance, if function A calls function B, while B calls A.

Then you have a problem with the requirement that functions be defined before they are used.

You can get around this problem with
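One standard C mechanism that addresses this ordering problem is the function prototype, a declaration that gives a function's name, return type, and parameter types ahead of its definition. The sketch below assumes that is the technique intended here; the functions is_even and is_odd are hypothetical and chosen only because each one calls the other.

```c
/* Prototypes: declare both functions before either is defined, so
   each can call the other. Names and logic are illustrative only. */
int is_even(unsigned int n);
int is_odd(unsigned int n);

/* n is even if it is 0, or if n-1 is odd */
int is_even(unsigned int n) {
    if (n == 0) return 1;
    return is_odd(n - 1);
}

/* n is odd if it is not 0 and n-1 is even */
int is_odd(unsigned int n) {
    if (n == 0) return 0;
    return is_even(n - 1);
}
```

Without the two prototype lines, the call to is_odd inside is_even would refer to a function the compiler has not yet seen.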


Global Variables and Constants

C also provides global variables.

• A global variable is defined

• A global variable can be used

Generally, global variables that can be changed are frowned upon, as contributing to errors. However, global variables are very appropriate for constants. Constants are defined using macros:
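A minimal sketch of constants defined with #define follows. The macro names MAX_STUDENTS and PASSING_GRADE, their values, and the two helper functions are assumptions made for illustration.

```c
/* Constants defined with macros; names and values are illustrative. */
#define MAX_STUDENTS 100
#define PASSING_GRADE 2.0

/* Returns 1 if the grade point is at least the passing grade, else 0. */
int is_passing(double grade_point) {
    return grade_point >= PASSING_GRADE;
}

/* Returns 1 if one more student fits in the class, else 0. */
int has_room(int enrolled) {
    return enrolled < MAX_STUDENTS;
}
```

The preprocessor replaces each macro name with its text before compilation, so no storage is set aside for these constants.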


Boolean Expressions

• The operators to compare two values are the same as in Java:

• However, instead of returning a boolean value, they return

• Actually, C interprets

Thus the analog in C of a boolean expression in Java is any expression that produces

As in Java, boolean expressions can be operated on with

Some examples:

• (10 == 3) evaluates to

• !(10 == 3) evaluates to

• !( (x < 4) || (y == 5) ): if x is 10 and y is 5, then this evaluates to
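The three expressions above can be wrapped in small functions so that their integer results can be inspected. The function names are hypothetical; the expressions themselves are taken from the list. In C, comparison and logical operators yield the int value 1 for true and 0 for false.

```c
/* Each function returns the int produced by one of the expressions. */
int ten_equals_three(void)     { return (10 == 3); }
int not_ten_equals_three(void) { return !(10 == 3); }

/* Yields 1 only when x is not below 4 AND y is not 5. */
int neither(int x, int y)      { return !((x < 4) || (y == 5)); }
```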


If Statements and Loops

Given the preceding interpretation of “boolean expression”, the following statements are the same in C as in Java:

•

•

•

•

Since Boolean expressions are essentially integers, you can have a for statement like this in C:

    for (int count = 99; count; count--) {
        ...
    }

• count is initialized to

• the loop is executed

• count is

• This loop is executed
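To see how often such a loop runs, one can count its iterations. The sketch below is an illustration only; count_iterations is a hypothetical function, and it assumes a non-negative starting value (a negative start would never reach 0 by decrementing in the usual sense).

```c
/* Counts how many times the loop body runs for a given starting
   value (assumed non-negative). The loop stops when count reaches 0,
   because C treats 0 as false. */
int count_iterations(int start) {
    int iterations = 0;
    int count;
    for (count = start; count; count--) {
        iterations++;
    }
    return iterations;
}
```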


Switch

C has a switch statement that is like that in Java:

    switch ( <integer-expression> ) {
        case <integer-constant-1> :
            <statements-for-case-1>
            break;
        case <integer-constant-2> :
            <statements-for-case-2>
            break;
        ...
        default :
            <default-statements>
    }

Don’t forget the break statements!

The integer expression must produce a value belonging to any of the integral data types (various size integers and characters).


Enumerations

This is something neat that Java did not have when these slides were written (enum types were added to Java later, in Java 5).

An enumeration is a way to give

For instance, suppose you need to have some codes in your program to indicate whether a library book is checked in, checked out, or lost. Instead of

    #define CHECKED_IN 0
    #define CHECKED_OUT 1
    #define LOST 2

you can use an enumeration declaration:
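A sketch of what such a declaration could look like is given here, assuming the three names from the #define version. By default the enumerators are numbered 0, 1, 2 in order, which matches the macro values. The helper describe_status is hypothetical.

```c
/* An enumeration declaration: the names become integer constants
   numbered 0, 1, 2 in the order listed. */
enum { CHECKED_IN, CHECKED_OUT, LOST };

/* Hypothetical helper that maps a status code to a description. */
const char *describe_status(int status) {
    switch (status) {
        case CHECKED_IN:  return "checked in";
        case CHECKED_OUT: return "checked out";
        case LOST:        return "lost";
        default:          return "unknown";
    }
}
```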


Using an Enumeration in a Switch Statement

    int status;
    /* some code to give status a value */
    switch (status) {
        case CHECKED_IN :
            /* handle a checked in book */
            break;
        case CHECKED_OUT :
            /* handle a checked out book */
            break;
        case LOST :
            /* handle a lost book */
            break;
    }


Enumeration Data Type

You can give a name to an enumeration and thus create an enumeration data type. The syntax is:

enum <name-of-enum-type> <actual enumeration>

For example:

enum book_status { CHECKED_IN, CHECKED_OUT, LOST };

Why bother to do this?


Type Synonyms

The enumeration type is our first example of a user defined type in C.

It’s rather unpleasant to have to carry around the word enum all the time for this type.

Instead, you can give a name to this type you have created, and subsequently just use that type, without having to keep repeating enum. For example:
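A minimal sketch of a type synonym for the book_status enumeration follows, using typedef. The synonym name BookStatus and the helper check_out are assumptions chosen for this example.

```c
enum book_status { CHECKED_IN, CHECKED_OUT, LOST };

/* Give the enumeration type a one-word synonym. BookStatus is a
   hypothetical name chosen for this sketch. */
typedef enum book_status BookStatus;

/* Hypothetical helper: a checked-in book becomes checked out; any
   other status is returned unchanged. */
BookStatus check_out(BookStatus current) {
    if (current == CHECKED_IN) {
        return CHECKED_OUT;
    }
    return current;
}
```

After the typedef, variables can be declared as BookStatus s; rather than enum book_status s;.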


Structures

C also gives you a way to create more general types of your own, as structures. These are essentially like objects in Java, if you just consider the instance variables. A structure groups together related data items that can be of different types.

The syntax to define a structure is:


Storage on the Stack

The statement

struct student stu;

causes the entire stu structure to be stored


Using typedef with Structures

When using the structure type, you have to carry along the word struct.

To avoid this, you can use a

A more concise way to do this is:

Now you can create a Student variable:
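The sketch below shows one way a structure and its type synonym might be written. The field names age and grade_point and the tag student appear elsewhere in these slides; the exact layout and the helper make_student are assumptions.

```c
/* Structure definition; field names follow the later slides. */
struct student {
    int age;
    double grade_point;
};

/* Type synonym so the word struct need not be repeated. */
typedef struct student Student;

/* Hypothetical helper: fills in a local Student and returns a copy. */
Student make_student(int age, double grade_point) {
    Student stu;          /* no "struct" keyword needed here */
    stu.age = age;
    stu.grade_point = grade_point;
    return stu;
}
```

The two steps can also be merged by writing typedef struct { ... } Student; in a single declaration.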


Using a Structure

You can access the pieces of a structure using dot notation (analogous to accessing instance variables of an object in Java):

You can also have the entire struct on either the left or the right side of the assignment operator:


Figure for Copying a Structure


Passing a Structure to a Function

Structures can be passed as parameters to functions:

Then you can call the function:

But if you put the following line of code after the printf in print_info:
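Because C passes structures by value, a change made to a structure parameter inside a function is made to a copy. The sketch below illustrates that point together with whole-structure assignment; the functions birthday_by_value and copy_of are hypothetical and are not the print_info function referred to above.

```c
typedef struct {
    int age;
    double grade_point;
} Student;

/* Receives a COPY of the caller's structure; changing it here has
   no effect on the caller's variable. */
void birthday_by_value(Student s) {
    s.age = s.age + 1;
}

/* Assigning one structure to another copies every field. */
Student copy_of(Student original) {
    Student duplicate;
    duplicate = original;
    return duplicate;
}
```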


Returning a Structure From a Function

You can return a structure from a function also. Suppose you have the following function:

Now you can call the function:


Figure for Returning a Structure from a Function

The copying of formal parameters and return values can be avoided by


Arrays

To define an array:

For example:

• Unlike Java,

• Unlike Java,

• Unlike Java,

• As in Java,

• As in Java,


Arrays (cont’d)

Two things you CAN do:

• If you have an array of structures,

• You can declare a two-dimensional array (and higher): e.g.,

Two things you CANNOT do:

•

•

We’ll see how to accomplish these tasks


Pointers in C

Pointers are used in C to

• circumvent

– copying of parameters and return values

– lasting changes

• access

• allow

For each data type T,

For instance,

declares iptr to be of type “pointer to int”. iptr refers to a

Actually, most C programmers write it as:


Addresses and Indirection

Computer memory is

Each variable is

The address of the variable is

• iptr refers to

• *iptr refers to

Applying the * operator is called


The Address-Of Operator

We saw the & operator in scanf. It

    int i;
    int* iptr;
    i = 55;
    iptr = &i;
    *iptr = *iptr + 1;

Last line gets data out of location whose address is in iptr, adds 1 to that data, and stores result back in location whose address is in iptr.
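The same five statements can be placed in a function to observe the effect on i. The wrapper increment_through_pointer is a hypothetical name added for this sketch; the statements inside are the ones from the slide.

```c
/* Runs the slide's statements and returns the final value of i. */
int increment_through_pointer(void) {
    int i;
    int* iptr;
    i = 55;
    iptr = &i;            /* iptr now holds the address of i */
    *iptr = *iptr + 1;    /* same effect as i = i + 1 */
    return i;
}
```

The function returns 56: the update was made through the pointer, yet it is visible when i is read directly.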


Comparing Indirection and Address-Of Operators

As a rule of thumb:

• Indirection:

– It CANNOT

– It CAN

• Address-Of:

– It CAN

– It CANNOT


Pointers and Structures

Remember the struct type Student, which has an int age and a double grade_point:

    Student stu;
    Student* sptr;
    sptr = &stu;

To access variables of the structure:

There is a “shorthand” for this notation:


Passing Pointer Variables as Parameters

You can pass pointer variables as parameters.

    void printAge(Student* sp) {
        printf("Age is %i", sp->age);
    }

When this function is called,

1. a Student* variable:

or

2. apply the & operator to a Student variable:

C still uses call by value to pass pointer parameters, but because they are pointers, what gets copied are

Data coming in to the function is not copied.


Passing Pointer Variables as Parameters (cont’d)

Now we can

    void changeAge(Student* sp, int newAge) {
        sp->age = newAge;
    }

You can also

Old initialize with copying:

    Student initialize(int old, double gpa) {
        Student st;
        st.age = old;
        st.grade_point = gpa;
        return st;
    }

More efficient initialize using pointers:


Passing Pointer Variables as Parameters (cont’d)

Using pointers is an optimization in previous case. But it is

    void swapAges (Student* sp1, Student* sp2) {
        int temp;
        temp = sp1->age;
        sp1->age = sp2->age;
        sp2->age = temp;
    }

To call this function:
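One plausible way to call swapAges is to pass the addresses of two Student variables. The sketch below repeats the function so that it is self-contained; the Student layout and the wrapper age_of_first_after_swap are assumptions made for illustration.

```c
typedef struct {
    int age;
    double grade_point;
} Student;

/* Exchanges the age fields of the two structures pointed to. */
void swapAges(Student* sp1, Student* sp2) {
    int temp;
    temp = sp1->age;
    sp1->age = sp2->age;
    sp2->age = temp;
}

/* Hypothetical caller: swaps two local structures by passing their
   addresses, then returns the first one's age. */
int age_of_first_after_swap(int first_age, int second_age) {
    Student s1 = { first_age, 3.0 };
    Student s2 = { second_age, 3.5 };
    swapAges(&s1, &s2);   /* & yields a Student* for each variable */
    return s1.age;
}
```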


Pointers and Arrays

The name of an array is

It is a

To reference array elements, you can use

•

or

•

What is going on with the pointer notation?

• a refers to

• *a refers to

• a+1 refers to

• *(a+1) refers to


Pointers and Arrays (cont’d)

You can also refer to array elements with

For example,

int a[5];
int* p;
p = a;    /* p = &a[0]; is the same */

� p refers to

� *p refers to

� p+1 refers to

� *(p+1) refers to

Since p is a non-constant pointer, you can also

Warning: NO BOUNDS CHECKING IS DONE IN C!
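
The equivalence between bracket notation and pointer notation can be sketched as follows. This is a minimal illustration, not code from the slides; the function name sumWithPointer is an assumption made for the example.

```c
#include <stddef.h>

/* Sums n ints by offsetting a pointer into the array.
   The caller must pass the true length: C does no bounds checking. */
int sumWithPointer(const int* a, size_t n) {
    const int* p = a;          /* same as p = &a[0] */
    int total = 0;
    size_t i;
    for (i = 0; i < n; i++) {
        total += *(p + i);     /* *(p + i) is the same element as a[i] */
    }
    return total;
}
```

Passing a length larger than the real array would read past its end without any error from the compiler or the run-time system, which is exactly the hazard the warning describes.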


Passing an Array as a Parameter

To pass an array to a function:

void printAllAges(int a[], int n) {
    int i;
    for (i = 0; i < n; i++) {
        printf("%i \n", a[i]);
    }
}

The “array” parameter indicates

Alternative definition:

void printAllAges(int* p, int n) {
    int i;
    for (i = 0; i < n; i++) {
        printf("%i \n", *p);
        p++;
    }
}

The formal array parameter is a

You can call the function like this:


Dynamic Memory Allocation in Java

Java

That means that

This happens whenever

In Java there is strict distinction between

Every variable is either

� memory for variables is

This memory

� memory for variables of primitive type

• memory that holds the actual contents of an object is

This memory goes away


Dynamic Memory Allocation in C

In C,

Every type has the possibility of being allocated statically (on the stack) or dynamically (on the heap).

To allocate space statically, you

Space is allocated

To allocate space dynamically, use

� It takes one integer parameter indicating the

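Use the sizeof operator to get the length;

• It returns a

The pointer has type void*. These slides always cast it to the appropriate type; standard C also converts void* to other object pointer types implicitly, so the cast is a convention rather than a language requirement. If malloc fails to allocate the space,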


malloc Example

To dynamically allocate space for an int:

int* p;
p = (int*) malloc(sizeof(int));   /* cast result to int* */
if (p == NULL) {                  /* to be on the safe side */
    printf("malloc failed!");
} else {
    *p = 33;
    printf("%i", *p);
}

Normally, you don’t need to allocate a single integer at a time. Typically, you would use malloc to:

� allocate

� allocate


Another malloc Example

To dynamically allocate space for a structure:

Student* sptr;
sptr = (Student*) malloc(sizeof(Student));
sptr->age = 20;
sptr->grade_point = 3.4;


Allocating a Linked List Node Dynamically

For a singly linked list of students, use this type:

typedef struct Stu_Node {
    int age;
    double grade_point;
    struct Stu_Node* link;
} StuNode;

To allocate a node for the list:

To insert the node pointed to by sptr after the node pointed to by some other pointer, say cur:
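
One way the allocation and the insertion could look is sketched below. The helper names makeNode and insertAfter are hypothetical and chosen only for this example; the StuNode type is repeated so the block stands alone.

```c
#include <stdlib.h>

typedef struct Stu_Node {
    int age;
    double grade_point;
    struct Stu_Node* link;
} StuNode;

/* Allocates a node on the heap; returns NULL if malloc fails. */
StuNode* makeNode(int age, double gpa) {
    StuNode* sptr = (StuNode*) malloc(sizeof(StuNode));
    if (sptr != NULL) {
        sptr->age = age;
        sptr->grade_point = gpa;
        sptr->link = NULL;
    }
    return sptr;
}

/* Inserts the node pointed to by sptr right after the node pointed to by cur. */
void insertAfter(StuNode* cur, StuNode* sptr) {
    sptr->link = cur->link;   /* new node points at cur's old successor */
    cur->link = sptr;         /* cur now points at the new node */
}
```

The order of the two assignments in insertAfter matters: overwriting cur->link first would lose the rest of the list.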


Allocating an Array Dynamically

To allocate an array dynamically,

int i;
int* p;
p = (int*) malloc(100 * sizeof(int));   /* 100 elt array */
/* now p points to the beginning of the array */
for (i = 0; i < 100; i++)               /* initialize the array */
    p[i] = 0;                           /* access the elements */

Similarly, you can allocate an array of structures:


Deallocating Memory Dynamically

When memory is allocated using malloc,

You can get

void sub() {
    int *p;
    p = (int*) malloc(100 * sizeof(int));
    return;
}

Although the space for the pointer variable p goes away when sub finishes executing,

But they are completely useless after sub is done,

If you had wanted them to be accessible outside of sub,


Using free

To deallocate memory when you are through with it,

It takes as an argument a

and returns nothing. The result of free is that all the space starting at the designated location will be

In the function sub above, just before the return, you should say:

DO NOT DO THE FOLLOWING:


Saving Space with Arrays of Pointers

Suppose you need an array of structures, where each structure is fairly large. But you are not sure at compile time how big the array needs to be.

1. Allocate

2. Find out

3. Allocate


Array of Pointers Example

To implement with the usual Student struct:
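
A minimal sketch of the idea, under stated assumptions: the bound MAX_STUDENTS, the global array roster, and the helpers allocateStudents and freeStudents are all hypothetical names introduced for illustration. The array of pointers is sized generously, and the larger structures are allocated only once the real count is known.

```c
#include <stdlib.h>

typedef struct {
    int age;
    double grade_point;
} Student;

#define MAX_STUDENTS 1000   /* assumed upper bound, chosen for illustration */

/* Array of pointers: cheap to over-allocate because each slot is pointer-sized. */
Student* roster[MAX_STUDENTS];

/* Allocates only the n structures actually needed; returns how many were allocated. */
int allocateStudents(int n) {
    int i;
    if (n > MAX_STUDENTS) n = MAX_STUDENTS;
    for (i = 0; i < n; i++) {
        roster[i] = (Student*) malloc(sizeof(Student));
        if (roster[i] == NULL) break;   /* stop at the first failure */
        roster[i]->age = 0;
        roster[i]->grade_point = 0.0;
    }
    return i;
}

/* Frees the first n structures and clears their slots. */
void freeStudents(int n) {
    int i;
    for (i = 0; i < n; i++) {
        free(roster[i]);
        roster[i] = NULL;
    }
}
```

Unused slots cost one pointer each instead of one full structure each, which is where the space saving comes from.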


Information Hiding in C

Java provides support for information hiding by

�

�

Advantages of data abstraction, including the use of constructor and accessor (set and get) functions:

� push

� easier

� easy

� easy

C does not provide the same level of compiler support as Java, but you can achieve the same effect with some


Information Hiding in C (cont’d)

A “constructor” in C would be a function that

� calls

� initializes

� returns

For example:



The analog of a Java instance method in C would be a function whose first parameter is the “object” to be operated on.

You can write set and get functions in C:

You can use the set and get functions to swap the ages for two student objects:

When should you provide set and get functions and when should you not? They obviously impose some overhead in terms of additional function calls.
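
The constructor, accessor, and swap ideas from these two slides can be sketched together. The accessor names getAge and setAge are assumptions made for this example; constructStudent follows the name the slides use later for the constructor.

```c
#include <stdlib.h>

typedef struct {
    int age;
    double grade_point;
} Student;

/* "Constructor": calls malloc, initializes the fields, returns the pointer. */
Student* constructStudent(int age, double gpa) {
    Student* sptr = (Student*) malloc(sizeof(Student));
    if (sptr != NULL) {
        sptr->age = age;
        sptr->grade_point = gpa;
    }
    return sptr;
}

/* Accessors: the first parameter plays the role of the "object". */
int getAge(const Student* sptr) { return sptr->age; }
void setAge(Student* sptr, int newAge) { sptr->age = newAge; }

/* Swaps ages using only the accessors, never touching the fields directly. */
void swapAges(Student* s1, Student* s2) {
    int temp = getAge(s1);
    setAge(s1, getAge(s2));
    setAge(s2, temp);
}
```

Because swapAges goes through the accessors, the layout of Student could change without swapAges being rewritten; the price is the extra function calls mentioned above.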


Strings in C

� There is no explicit string type in C.

• A string in C is an array of characters that is terminated with the null character.

� The length

� The null character

� A sequence of characters enclosed in double quotes


Strings in C (cont’d)

� You can also declare a

To initialize name, do not assign to a string literal! Instead, either

• Access elements using the brackets notation:

char firstLetter;
name[3] = 'a';
firstLetter = name[0];
namePtr[3] = 'b';
firstLetter = namePtr[0];


Passing Strings to and from Functions

To pass a string into a function or return one from a function, you must

Passing in a string:

Returning a string:

You can call these functions like this:


Reading in a String from the User

To read in a string from the user, call:

scanf("%s", name);

• Notice the use of %s in scanf. The corresponding data must be a

• scanf reads a string from the input stream up to

• The letters are read into

• You must make sure that you have a large enough array to hold the string. How much space is needed?

• If you don’t have enough space, whatever follows the array will be
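
One common way to guard against this overflow is to put a maximum field width in the format string. The sketch below uses sscanf on an in-memory string so it can be tried without typing input; the same "%7s" format works with scanf. The names NAME_SIZE and readName are assumptions made for the example.

```c
#include <stdio.h>
#include <string.h>

#define NAME_SIZE 8   /* small on purpose, to show truncation */

/* Reads one whitespace-delimited word from input into name.
   The width 7 leaves one byte of the 8-byte array for the null character.
   Returns 1 if a word was read, 0 otherwise. */
int readName(const char* input, char name[NAME_SIZE]) {
    return sscanf(input, "%7s", name) == 1;
}
```

With the width in place, a word longer than 7 characters is cut off instead of spilling into whatever follows the array.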


String Manipulation Functions

There are some useful string manipulation functions provided for you in C. These include:

• strlen, which takes a string as an argument and returns the length of the string, not counting the null character at the end. I.e., it counts how many characters it encounters before reaching '\0'.

• strcpy, which takes two strings as arguments and copies its second argument to its first argument.

First, to use them, you need to include the header for the string handling library:

#include <string.h>

To demonstrate the use of strlen and strcpy, suppose you want to add a name component to the Student structure and change the constructor so that it asks the user interactively for the name:


String Manipulation Functions Example

typedef struct {
    char* name;
    int age;
    double grade_point;
} Student;

Student* constructStudent(int age, double gpa) {
    char inputBuffer[100];   /* read name into this */
    Student* sptr;
    sptr = (Student*) malloc(sizeof(Student));
    sptr->age = age;
    sptr->grade_point = gpa;

    /* here's the new part: */
    printf("Enter student's name: ");
    scanf("%s", inputBuffer);

    /* allocate just enough space for the name */
    sptr->name = (char*) malloc((strlen(inputBuffer) + 1) * sizeof(char));

    /* copy name into new space */
    strcpy(sptr->name, inputBuffer);
    return sptr;
}

When the constructor returns, inputBuffer goes away. The space allocated for the Student object is an int, a double, and just enough space for the actual name.


Other Kinds of Character Arrays

Not every character array has to be used to represent a string. You may want a character array that holds all possible letter grades, for instance:

char grades[5];
grades[0] = 'A';
grades[1] = 'B';
grades[2] = 'C';
grades[3] = 'D';
grades[4] = 'F';

In this case, there is no reason for the last array entry to be the null character, and in fact, it is not.


File Input and Output

File I/O is much simpler than in Java.

� Include

� Declare

� Call

� Writing to a file is done with

� Reading from a file is done with

� Call


File I/O Example

/* to use the built in file functions */
#include <stdio.h>

int main() {
    /* create a pointer to a struct called FILE; */
    /* it is system dependent */
    FILE* fp;
    char line[80];
    int i;

    /* open the file for writing */
    fp = fopen("testfile", "w");

    /* write into the file */
    fprintf(fp, "Line %i ends \n", 1);
    fprintf(fp, "Line %i ends \n", 2);

    /* close the file */
    fclose(fp);

    /* open the file for reading */
    fp = fopen("testfile", "r");

    /* read six strings from the file */
    for (i = 1; i < 7; i++) {
        fscanf(fp, "%s", line);
        printf("got from the file: %s \n", line);
    }

    /* close the file */
    fclose(fp);
    return 0;
}


Motivation for Stacks

Some examples of last-in, first-out (LIFO) behavior:

� Web browser’s

� Text editors

� The most recent pending method/function call

� To evaluate an arithmetic expression,

A stack is a sequence of elements, to which elements can be added (push) and removed (pop):


Specifying an ADT with an Abstract State

We would like a specification to be as independent of any particular implementation as possible.

But since people naturally think in terms of state, a popular way to specify an ADT is


Specifying the Stack ADT with an Abstract State

1. A stack’s state is modeled as

2. Initially the state of the stack is

3. The effect of a push(x) operation is to

4. The effect of a pop operation is to


Specifying an ADT with Operation Sequences

But a purist might complain that a state-based specification is, implicitly, suggesting a particular implementation. To be even more abstract, one can specify an ADT

For instance:

� push(a) pop(a):

� pop(a):

� push(a) push(b) push(c) pop(c) pop(b) push(d) pop(d):

� push(a) push(b) pop(a):


Additional Stack Operations

Other operations that you sometimes want to provide:

� peek:

� size:

� empty:


Balanced Parentheses

Recursive definition of a sequence of parentheses that is balanced:

� the sequence

� if the sequence

According to this definition:

� ( ) :

� ( ( ) ( ( ) ) ) :

� ( ( ) ) ) ( ) :

� ( ) ) ( :


Algorithm to Check for Balanced Parentheses

Key observations:

1. There must be

2. In any prefix, the number of

Pseudocode:
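
A small C sketch of the check for a single kind of parenthesis follows. It is an illustration, not the slide's pseudocode: because every pushed item would be the same character, the counter depth stands in for the height of the stack (push is an increment, pop is a decrement).

```c
#include <stddef.h>

/* Returns 1 if the string of '(' and ')' is balanced, 0 otherwise.
   depth plays the role of the stack height: push = ++, pop = --. */
int isBalanced(const char* parens) {
    size_t depth = 0;
    size_t i;
    for (i = 0; parens[i] != '\0'; i++) {
        if (parens[i] == '(') {
            depth++;                     /* push */
        } else if (parens[i] == ')') {
            if (depth == 0) return 0;    /* pop on an empty stack */
            depth--;                     /* pop */
        }
    }
    return depth == 0;                   /* the stack must end empty */
}
```

A real stack becomes necessary once there are several kinds of parentheses, since the popped item must then be compared against the closing character.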


Java Method to Check for Balanced Parentheses

Using the java.util.Stack class (which manipulates objects):

import java.util.*;

boolean isBalanced(char[] parens) {
    Stack S = new Stack();
    try {   // pop might throw an exception
        for (int i = 0; i < parens.length; i++) {
            if (parens[i] == '(')
                S.push(new Character('('));
            else
                S.pop();   // discard popped object
        }
        return S.empty();
    }
    catch (EmptyStackException e) {
        return false;
    }
}


Checking for Multiple Kinds of Balanced Parens

Suppose there are 3 different kinds of parentheses: ( and ), [ and ], { and }.

Modify the program:

boolean isBalanced3(char[] parens) {
    Stack S = new Stack();
    try {
        for (int i = 0; i < parens.length; i++) {
            if (leftParen(parens[i]))   // ( or [ or {
                S.push(new Character(parens[i]));
            else {
                char leftp = ((Character) S.pop()).charValue();
                if (!match(leftp, parens[i])) return false;
            }
        }
        return S.empty();
    }   // end try
    catch (EmptyStackException e) {
        return false;
    }
}


Multiple Kinds of Parentheses (cont’d)

boolean leftParen(char c) {
    return ((c == '(') || (c == '[') || (c == '{'));
}

boolean match(char lp, char rp) {
    if ((lp == '(') && (rp == ')')) return true;
    if ((lp == '[') && (rp == ']')) return true;
    if ((lp == '{') && (rp == '}')) return true;
    return false;
}


Postfix Expressions

We normally write arithmetic expressions using infix notation:

Another way to write arithmetic expressions is to use postfix notation:

For example,

� 3 4 + is same as

� 1 2 - 5 - 6 5 / + is same as

One advantage of postfix is that

For instance,

� (1 + 2) * 3 becomes

� 1 + (2 * 3) becomes


Using a Stack to Evaluate Postfix Expressions

Pseudocode:
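
A hedged C sketch of stack-based postfix evaluation is given below. The function name evalPostfix, the fixed capacity MAX_DEPTH, and the error-code convention are assumptions made for this example. Operands are read with strtod, and a leading minus sign is always treated as the subtraction operator, so negative literals are not supported.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DEPTH 64   /* assumed stack capacity for this sketch */

/* Evaluates a postfix expression such as "1 2 - 5 - 6 5 / +".
   Stores the value in *result and returns 1 on success;
   returns 0 on stack underflow, overflow, or a malformed expression. */
int evalPostfix(const char* s, double* result) {
    double stack[MAX_DEPTH];
    int top = 0;                       /* number of items on the stack */
    while (*s != '\0') {
        if (isspace((unsigned char) *s)) {
            s++;
        } else if (strchr("+-*/", *s) != NULL) {
            double x, y;
            if (top < 2) return 0;     /* not enough operands */
            y = stack[--top];          /* right operand is on top */
            x = stack[--top];
            switch (*s) {
                case '+': stack[top++] = x + y; break;
                case '-': stack[top++] = x - y; break;
                case '*': stack[top++] = x * y; break;
                default:  stack[top++] = x / y; break;
            }
            s++;
        } else {
            char* end;
            double v = strtod(s, &end);
            if (end == s || top == MAX_DEPTH) return 0;
            stack[top++] = v;          /* operand: push it */
            s = end;
        }
    }
    if (top != 1) return 0;            /* leftover or missing operands */
    *result = stack[0];
    return 1;
}
```

Note the pop order: the first value popped is the right operand, which matters for subtraction and division.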


StringTokenizer Class

Java’s StringTokenizer class is very helpful to break up the input string into operators and operands, called

• Create a StringTokenizer object out of the input string. It

• Use instance method hasMoreTokens to test

• Use instance method nextToken to

� Second argument to constructor indicates that,

� Third argument to constructor indicates that


Java Method to Evaluate Postfix Expressions

public static double evalPostFix(String postfix)
        throws EmptyStackException {
    Stack S = new Stack();
    StringTokenizer parser = new StringTokenizer(postfix, " \n\t\r+-*/", true);
    while (parser.hasMoreTokens()) {
        String token = parser.nextToken();
        char c = token.charAt(0);
        if (isOperator(c)) {
            double y = ((Double) S.pop()).doubleValue();
            double x = ((Double) S.pop()).doubleValue();
            switch (c) {
                case '+':
                    S.push(new Double(x + y)); break;
                case '-':
                    S.push(new Double(x - y)); break;
                case '*':
                    S.push(new Double(x * y)); break;
                case '/':
                    S.push(new Double(x / y)); break;
            }   // end switch
        }   // end if
        else if (!isWhiteSpace(c))   // token is operand
            S.push(Double.valueOf(token));
    }   // end while
    return ((Double) S.pop()).doubleValue();
}


Evaluating Postfix (cont’d)

public static boolean isOperator(char c) {
    return ( (c == '+') || (c == '-') ||
             (c == '*') || (c == '/') );
}

public static boolean isWhiteSpace(char c) {
    return ( (c == ' ') || (c == '\n') ||
             (c == '\t') || (c == '\r') );
}

Does not

Does no


Implementing a Stack with an Array

Since Java supplies a Stack class, why bother?

Idea:

Issues for Java implementation:

� elements in the array are to be of type

� throw exception if

� dynamically increase the size of the array to avoid

To handle the last point, we’ll do the following:

� initially,

� if array is full and a push occurs,
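
The grow-on-overflow idea can be sketched in C as follows. This is an illustration under assumptions, not the Java implementation on the next slide: it stores ints rather than objects, starts with a capacity of 2 so the doubling is easy to observe, reports errors through return codes instead of exceptions, and uses the hypothetical names IntStack, stackInit, stackPush, stackPop, and stackFree.

```c
#include <stdlib.h>

typedef struct {
    int* items;     /* heap-allocated array */
    int capacity;   /* current size of the array */
    int next;       /* index of the next free slot = number of items */
} IntStack;

/* Starts with a small array; returns 1 on success, 0 if malloc fails. */
int stackInit(IntStack* s) {
    s->capacity = 2;
    s->next = 0;
    s->items = (int*) malloc(s->capacity * sizeof(int));
    return s->items != NULL;
}

/* Pushes x, doubling the array first if it is full. Returns 0 on allocation failure. */
int stackPush(IntStack* s, int x) {
    if (s->next == s->capacity) {
        int* bigger = (int*) realloc(s->items, 2 * s->capacity * sizeof(int));
        if (bigger == NULL) return 0;   /* the old array is still valid */
        s->items = bigger;
        s->capacity *= 2;
    }
    s->items[s->next++] = x;
    return 1;
}

/* Pops into *out; returns 0 if the stack is empty. */
int stackPop(IntStack* s, int* out) {
    if (s->next == 0) return 0;
    *out = s->items[--s->next];
    return 1;
}

/* Releases the array; C has no garbage collector to do this for us. */
void stackFree(IntStack* s) {
    free(s->items);
    s->items = NULL;
    s->capacity = s->next = 0;
}
```

Here realloc takes care of copying the old contents into the larger array, which is the step written out as an explicit loop in a language without it.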


Implementing a Stack with an Array in Java

class Stack {
    private Object[] A;
    private int next;

    public Stack() {
        A = new Object[16];
        next = 0;
    }

    public void push(Object obj) {
        if (next == A.length) {
            // array is full, double its size
            Object[] newA = new Object[2 * A.length];
            for (int i = 0; i < next; i++)   // copy
                newA[i] = A[i];
            A = newA;   // old A can now be garbage collected
        }
        A[next] = obj;
        next++;
    }

    public Object pop() throws EmptyStackException {
        if (next == 0)
            throw new EmptyStackException();
        else {
            next--;
            return A[next];
        }
    }


Implementing a Stack with an Array in Java (cont’d)

    public boolean empty() {
        return (next == 0);
    }

    public Object peek() throws EmptyStackException {
        if (next == 0)
            throw new EmptyStackException();
        else
            return A[next - 1];
    }
}   // end Stack class

class EmptyStackException extends Exception {
    public EmptyStackException() {
        super();
    }
}


Time Performance of Array Implementation

� push:

� pop:

� empty:

� peek:


Implementing a Stack with a Linked List in Java

Idea:

class StackNode {
    Object item;
    StackNode link;
}

class Stack {
    private StackNode top;   // first node in list, the top

    public Stack() {
        top = null;
    }

    public void push(Object obj) {
        StackNode node = new StackNode();
        node.item = obj;
        node.link = top;
        top = node;
    }


Implementing a Stack with a Linked List in Java (cont’d)

    public Object pop() throws EmptyStackException {
        if (top == null)
            throw new EmptyStackException();
        else {
            StackNode temp = top;
            top = top.link;
            return temp.item;
        }
    }

    public boolean empty() {
        return (top == null);
    }

    public Object peek() throws EmptyStackException {
        if (top == null)
            throw new EmptyStackException();
        else
            return top.item;
    }
}


Time Performance of Linked List Implementation

� push:

� pop:

� empty:

� peek:


Interchangeability of Implementations

If you have done things right, you can:

• write a program using the built-in Stack class

• compile and run that program

• then make available your own Stack class, using the array implementation (e.g., put Stack.class in the same directory)

• WITHOUT CHANGING OR RECOMPILING YOUR PROGRAM, run your program; it will use the local Stack implementation and will still be correct!

• then replace the array-based Stack.class file with your own linked-list-based Stack.class file

• again, WITHOUT CHANGING OR RECOMPILING YOUR PROGRAM, run your program; it will use the local Stack implementation and will still be correct!


Motivation for Queues

Some examples of first-in, first-out (FIFO) behavior:

�

�

�

A queue is a


Specifying the Queue ADT

Using the abstract state style of specification:

� The state of a queue is modeled as a

� Initially the state of the queue is the

� The effect of an enqueue(x) operation is to

� The effect of a dequeue operation is to


Specifying the Queue ADT (cont’d)

Alternative specification using allowable sequences would give some rules (an “algebra”). Some specific examples:

• enqueue(a) dequeue(a):

• dequeue(a):

• enqueue(a) enqueue(b) enqueue(c) dequeue(a) enqueue(d) dequeue(b):

• enqueue(a) enqueue(b) dequeue(b):

Other popular queue operations:

�

�

�


Applications of Queues in Operating Systems

The text discusses some applications of queues in operating systems:

• to buffer data coming from a running process going to a printer:

• a printer may be shared between several computers that are networked together.


Application of Queues in Discrete Event Simulators

A simulation program is a program that mimics, or “simulates”, the behavior of some complicated real-world situation, such as

�

�

�

These systems are typically too complicated to be modeled exactly mathematically, so instead, they are simulated: events take place in them according to some random number generator. For instance,

� at random times,




Using a Queue to Convert Infix to Postfix

First attempt: Assume infix expression is

For example:

• (((22/7) + 4) * (6 - 2))

• (7 - (((2 * 3) + 5) * (8 - (4/2))))

Pseudocode:


Converting Infix to Postfix (cont’d)

Examples:

• (((22/7) + 4) * (6 - 2))

Q:

S:

• (7 - (((2 * 3) + 5) * (8 - (4/2))))

Q:

S:


Converting Infix to Postfix with Precedence

It is too restrictive to require parentheses around everything.

Instead, precedence conventions tell

For instance, 4 * 3 + 2 equals

We need to modify the above algorithm to handle operator precedence.

�

�

�


Converting Infix to Postfix with Precedence (cont’d)

create queue Q to hold postfix expression
create stack S to hold operators not yet
    added to the postfix expression
while there are more tokens do
    get next token t
    if t is a number then enqueue t on Q
    else if S is empty then push t on S
    else if t is ( then push t on S
    else if t is ) then
        while top of S is not ( do
            pop S and enqueue result on Q
        endwhile
        pop S   // get rid of ( that ended while
    else   // t is real operator and S not empty
        // ( is treated as having lower precedence than any operator
        while S is not empty and prec(t) <= prec(top of S) do
            pop S and enqueue result on Q
        endwhile
        push t on S
    endif
endwhile
while S is not empty do
    pop S and enqueue result on Q
endwhile
return Q
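
The conversion can be sketched in C under a few assumptions: operands are unsigned integers, the output queue Q is replaced by a character buffer that receives space-separated tokens, the input is well formed, and at most 64 operators are pending at once. The names prec and infixToPostfix are chosen for this example, and '(' is given precedence 0 so that an operator never pops it.

```c
#include <ctype.h>
#include <stddef.h>

/* Precedence of an operator; '(' gets 0 so it is never popped by an operator. */
static int prec(char op) {
    if (op == '*' || op == '/') return 2;
    if (op == '+' || op == '-') return 1;
    return 0;
}

/* Converts infix to postfix, writing space-separated tokens into out.
   out must have room for at least 2*strlen(infix) + 1 chars. */
void infixToPostfix(const char* infix, char* out) {
    char stack[64];              /* operators not yet emitted */
    int top = 0;
    size_t n = 0;                /* number of chars written to out */
    const char* s = infix;
    while (*s != '\0') {
        if (isdigit((unsigned char) *s)) {
            while (isdigit((unsigned char) *s)) out[n++] = *s++;  /* copy the number */
            out[n++] = ' ';
        } else if (*s == '(') {
            stack[top++] = *s++;
        } else if (*s == ')') {
            while (top > 0 && stack[top - 1] != '(') {
                out[n++] = stack[--top];
                out[n++] = ' ';
            }
            if (top > 0) top--;  /* discard the '(' */
            s++;
        } else if (*s == '+' || *s == '-' || *s == '*' || *s == '/') {
            while (top > 0 && prec(*s) <= prec(stack[top - 1])) {
                out[n++] = stack[--top];
                out[n++] = ' ';
            }
            stack[top++] = *s++;
        } else {
            s++;                 /* skip spaces and anything unrecognized */
        }
    }
    while (top > 0) {            /* flush the remaining operators */
        out[n++] = stack[--top];
        out[n++] = ' ';
    }
    if (n > 0) n--;              /* drop the trailing space */
    out[n] = '\0';
}
```

Using <= in the precedence test makes operators of equal precedence associate to the left, so 8 - 4 - 2 becomes 8 4 - 2 -.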


Converting Infix to Postfix with Precedence (cont’d)

For example:

• (22/7 + 4) * (6 - 2)

Q:

S:

• 7 - (2 * 3 + 5) * (8 - 4/2)

Q:

S:


Implementing a Queue with an Array

State is represented with:

• array A

• integer head that holds

• integer tail that holds

Operation implementations:

� enqueue(x):

� dequeue(x):

� empty:

� peek:

� size:

Problem:


Implementing a Queue with a Circular Array

Wrap around to reuse the vacated space at the beginning of the array in a circular fashion, using the mod operator %.

� enqueue(x):

� dequeue(x):

� empty:

The problem is that


Expanding Size of Queue Dynamically

To avoid the overflow problem in the circular array implementation of a queue, use the same idea as for the array implementation of a stack:

If the array is discovered to be full during an enqueue,

� allocate

� copy

� enqueue

� free

One complication with the queue, though, is that the contents of the queue might be in two sections:

1. from

2. then from

Copying into the new array must take this into account.
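
A C sketch of a circular array queue that grows when full is given below. It differs from the slides in one labeled way: instead of a tail index it keeps a count of items, which is one common way to tell a full array from an empty one. All names (IntQueue, queueInit, queueGrow, enqueue, dequeue) are assumptions made for this example.

```c
#include <stdlib.h>

typedef struct {
    int* A;         /* circular array */
    int capacity;   /* size of A */
    int head;       /* index of the front item */
    int count;      /* number of items; distinguishes full from empty */
} IntQueue;

/* Returns 1 on success, 0 if malloc fails. */
int queueInit(IntQueue* q, int capacity) {
    q->A = (int*) malloc(capacity * sizeof(int));
    q->capacity = capacity;
    q->head = 0;
    q->count = 0;
    return q->A != NULL;
}

/* Doubles the array, unwrapping the two sections so the front lands at index 0. */
static int queueGrow(IntQueue* q) {
    int i;
    int* bigger = (int*) malloc(2 * q->capacity * sizeof(int));
    if (bigger == NULL) return 0;
    for (i = 0; i < q->count; i++) {
        bigger[i] = q->A[(q->head + i) % q->capacity];
    }
    free(q->A);
    q->A = bigger;
    q->capacity *= 2;
    q->head = 0;
    return 1;
}

/* Adds x at the tail; returns 0 only if growing the array fails. */
int enqueue(IntQueue* q, int x) {
    if (q->count == q->capacity && !queueGrow(q)) return 0;
    q->A[(q->head + q->count) % q->capacity] = x;   /* tail wraps around */
    q->count++;
    return 1;
}

/* Removes the front item into *out; returns 0 if the queue is empty. */
int dequeue(IntQueue* q, int* out) {
    if (q->count == 0) return 0;
    *out = q->A[q->head];
    q->head = (q->head + 1) % q->capacity;
    q->count--;
    return 1;
}
```

The copy loop in queueGrow reads through the mod operator, so it walks the section from head to the end of the old array and then the section from index 0, in queue order.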


Performance of Circular Array

Performance of the circular array implementation of a queue:

� Time:

� space:


Implementing a Queue with a Linked List

State representation:

� Data items are kept in

• Pointer head points to

• Pointer tail points to

Operation implementations:

� To enqueue an item,

� To dequeue an item,


Implementing a Queue with a Linked List (cont’d)

class Queue {
    private QueueNode head;
    private QueueNode tail;

    public Queue() {
        head = null;
        tail = null;
    }

    public boolean empty() {
        return (head == null);
    }

    public void enqueue(Object obj) {
        QueueNode node = new QueueNode(obj);
        if (empty()) {
            head = node;
            tail = node;
        } else {
            tail.link = node;
            tail = node;
        }
    }

    // continued on next slide


Implementing a Queue with a Linked List (cont’d)

    // continued from previous slide

    public Object dequeue() {
        if (empty())
            return null;   // or throw an EmptyQueueException
        else {
            Object returnItem = head.item;
            head = head.link;   // remove first node from list
            if (head == null)   // fix tail pointer if needed
                tail = null;
            return returnItem;
        }
    }
}

Every operation always takes


Motivation for the List ADT

This ADT is good for modeling

Some sample applications:

�

�

�


Specifying the List ADT

The state of a list object is

Typical operations on a list are:

� create:

� empty:

� length:

� select(i):

� replace(i,x):

� delete(x):

� insert(x):


Implementing the List ADT

Array implementation:

� Keep a counter

� To select or replace at some location,

• To insert at some location, shift the later items down.

� To delete at some location,

Linked list implementation:

� Keep a count of

� To select, replace, delete or insert an item,


Comparing the Times of List Implementations

Time for various operations, on a list of n data items:

list operation    singly linked list    array

empty

length

select(i)

replace(i)

delete(i)

insert(i)

The time for insert in an array assumes no overflow occurs. If overflow occurs,


Comparing the Space of List Implementations

Space requirements:

• If the array holds pointers to the items, then there is the space overhead of

• If the array holds the items themselves, then there is the space overhead of

• In both kinds of arrays, there is also the overhead of

• If you use a linked list, then the space overhead is for

To quantify the space tradeoffs between the array of items and linked list representations:

• Let p be the number of

• Let q be the number of

• Let m be the number of


Comparing the Space (cont’d)

To hold n items,

• the array representation uses

• the linked list representation uses

The tradeoff point is when

• When n < q*m/(p + q),

• When n > q*m/(p + q),

• When the item size, q, is much larger than the pointer size, p,

• When the item size, q, is closer to the pointer size, p,


Generalized Lists

A generalized list is

Example: (a, b, (c, (d, e), f), g, (h, i)).

There are five elements in the (top level) list:

1.

2.

3.

4.

5.

Items which are not lists are called atoms (they cannot be further subdivided).


Sample Java Code for Generalized List

class Node {
    Object item;
    Node link;
    Node(Object obj) { item = obj; }
}

class GenList {
    private Node first;
    GenList() { first = null; }

    void insert(Object newItem) {
        Node node = new Node(newItem);
        node.link = first;
        first = node;
    }

    void print() {
        System.out.print("( ");
        Node node = first;
        while (node != null) {
            if (node.item instanceof GenList)
                ((GenList) node.item).print();
            else System.out.print(node.item);
            node = node.link;
            if (node != null) System.out.print(", ");
        }
        System.out.print(" )");
    }
}


Sample Java Code (cont’d)

Notice:

• o instanceof C returns true if

– object o

– object o

– object o

– object o

• casts node.item to type GenList, if appropriate

• recursive call of the GenList method print

• implicit use of the toString method of every class, in the call to System.out.print

(Don’t confuse the print method of System.out with the print method we are defining for class GenList.)


Sample Java Code (cont’d)

How do we know that print is well-defined and won’t get into an infinite loop?

The print method is recursive and uses a while loop.

• The while loop

• If an item is not a generalized list, then it

• If an item is itself a generalized list, then

• The while loop stops when

Each recursive call takes you deeper into the nesting of the generalized list.

• Assume

• The stopping case for the recursion is

• Each recursive call takes you closer to a stopping case.


Generalized List Pitfalls

Warning! If there is a cycle in the generalized list, print will go into an infinite loop. For instance:

Be careful about shared sublists. For instance,


Application of Generalized Lists: LISP

Generalized lists are

� highly

� good for applications where

� the key structuring paradigm in

LISP is a functional language:

Each function call is represented as a list, with the name of the function coming first, and the arguments coming after it:


LISP-like Approach to Arithmetic Expressions

Apply this approach to evaluating arithmetic expressions:

Use prefix notation (as opposed to postfix), with parentheses to delimit the sublists:


Strings and StringBuffers

Java differentiates between

There are no methods that change an existing String.

If you want to change the characters in a string, use a StringBuffer. Some key features are:

� change

� append

� insert

The StringBuffer class can be implemented using an array of characters. The ideas are not complicated.


The Heap

When you use new or malloc to dynamically allocate some space, the run-time system handles the mechanics of actually finding the required free space of the necessary size.

When you make an object inaccessible (in Java) or use free (in C), again the run-time system handles the mechanics of reclaiming the space.

We are now going to look at HOW one could implement dynamic allocation of objects from the heap. The reasons are:

�

�

�


What is the Heap?

The heap is an area of memory used to store objects that will be dynamically allocated and deallocated.

Memory can be viewed as one long array of memory locations, where the address of a memory location is the index of the location in the array.

Thus we can view the heap as

Contiguous locations in the heap (array) are grouped together into

When a request arrives to allocate n bytes, the system

� finds

� allocates

� returns

Blocks are classified as either

Initially,


Heap Data Structures

Once blocks are allocated, the heap might get chopped up into alternating allocated and free blocks of varying sizes.

We need a way to locate all the free blocks.

This will be done by keeping the free blocks in a

The linked list is implemented using

Each block has some


Allocation

When a request arrives to allocate n bytes,

There are two strategies for choosing the block to use:

�

�

If the block found is bigger than n, then

If the block found is exactly of size n, then

If no block large enough is found, then


Deallocation

When a block is deallocated, as a first cut, simply insert the block at the front of the free list.

[Figure: state of the heap and its free list after each step of the sequence p := alloc(10), q := alloc(20), free(p), r := alloc(40), free(q).]


Fragmentation

[Figure: the heap and its free list after free(q), with r still allocated.]

Problem with previous example: If a request comes in for 30 bytes, the system will check the free list, and find


Coalescing

A solution to fragmentation is to

� physical neighbor:

� virtual neighbor:

To facilitate this operation, we will need additional space overhead in the header, and it will also help to keep “footer” information at the end of each block to:

� make

� indicate

� replicate


More Insidious Fragmentation

[Figure: the heap and its free list after free(q), with r still allocated.]

However, coalescing will not accommodate a request for


Compaction

The solution to this problem is called

The difficulty though is that


Master Pointers

A solution is to use

� A special area of the heap contains

� The addresses

� The address returned by the allocate procedure is

• The contents of a master pointer

� But the user,


Master Pointers (cont’d)

[Figure: master pointers for p, q, and r kept in a special area, each pointing to a block in the rest of the heap.]

Costs:

� Additional

� Additional

� Unpredictable


Garbage Collection

The above discussion of deallocation assumes the memory allocation algorithm is somehow informed about which blocks are no longer in use:

• In C, this is done

• In Java,

This process is part of garbage collection:

�

�

One of the challenging aspects of garbage collection ishow to


Trees

Important terminology:

Some uses of trees:

� model

� model

� a clever implementation of

�


Trees (cont’d)

Some more terms:

� path:

� length of path:

� height of a node:

� height of tree:

� depth (or level) of a node:

� depth of tree:

Fact: The depth of a tree equals the height of the tree.


Binary Trees

Binary tree: a tree in which

Complete binary tree: tree in which

Important Facts:

• A complete binary tree with L levels contains

• A complete binary tree with n nodes has


Binary Trees (cont’d)

Leftmost binary tree: like a complete binary tree, except that

however, all leaves at bottom level are

Important Facts:

• A leftmost binary tree with L levels contains

• A leftmost binary tree with n nodes has


Binary Heap

Now suppose that there is a data item, called a key, inside each node of a tree.

A binary heap (or min-heap) is a

� leftmost binary tree

� satisfies the

Do not confuse this use of “heap” with its usage in memory management!

Important Fact: The same set of keys

There is no


Using a Heap to Implement a Priority Queue

To implement the priority queue operation insert(x):

1.

2.

3.

Time:

To implement the priority queue operation remove():

Tricky part is how to remove the root without messing up the tree structure.

1.

2.

3.

Time:


Using a Heap to Implement a PQ (cont’d)

PQ operation    sorted array or linked list    unsorted array or linked list    heap

insert

remove (min)

No longer have the severe tradeoffs of the array and linked list representations of priority queue.


Heap Sort

Recall the sorting algorithm that used a priority queue:

1. insert the elements to be sorted, one by one, into a priority queue.

2. remove the elements, one by one, from the priority queue; they will come out in sorted order.

If the priority queue is implemented with a heap, the running time is


Linked Structure Implementation of Heap

To implement a heap with a linked structure, each node of the tree will be represented with an object containing

�

�

�

�

To find the next available location for insert, or the rightmost node on the bottom level for remove, in constant time,

�

�

Then keep a


Array Implementation of Heap

Fortunately, there’s a nifty way to implement a heap using an array, based on an interesting observation: If you number the nodes in a leftmost binary tree, starting at the root and going across levels and down levels, you see a pattern:

            1
        2       3
      4   5   6   7
     8 9

• Node number i has left child

• Node number i has right child

• If 2*i > n, then i has no

• If 2*i + 1 > n, then i has no

• Therefore, node number i is a leaf if

• The parent of node i is

• Next available location for insert is index

• Rightmost node on the bottom level is index


Array Implementation of Heap (cont’d)

Representation consists of

• array A[1..max] (ignore location 0)

• integer n, which is initially 0, holding number of elements in heap

To implement insert(x) (ignoring overflow):

n := n+1          // make a new leaf node
A[n] := x         // new node's key is initially x
cur := n          // start bubbling x up
parent := cur/2
while (parent != 0) && A[parent] > A[cur] do
    // current node is not the root and its key
    // has not found final resting place
    swap A[cur] and A[parent]
    cur := parent   // move up a level in the tree
    parent := cur/2
endwhile


Array Implementation of Heap (cont’d)

To implement remove (ignoring underflow):

minKey := A[1]    // smallest key, to be returned
A[1] := A[n]      // replace root's key with key in
                  // rightmost leaf on bottom level
n := n-1          // delete rightmost leaf on bottom level
cur := 1          // start bubbling down key in root
Lchild := 2*cur
Rchild := 2*cur + 1
while (Lchild <= n) && (A[minChild()] < A[cur]) do
    // current node is not a leaf and its key has
    // not found final resting place
    m := minChild()   // remember the index: after the swap,
                      // minChild() could pick the other child
    swap A[cur] and A[m]
    cur := m          // move down a level in the tree
    Lchild := 2*cur
    Rchild := 2*cur + 1
endwhile
return minKey

minChild():   // returns index of child w/ smaller key
min := Lchild
if (Rchild <= n) && (A[Rchild] < A[Lchild]) then
    // node has a right child and it is smaller
    min := Rchild
endif
return min
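
The insert and remove pseudocode can be sketched in C as follows. The capacity HEAP_MAX and the names MinHeap, heapInsert, heapRemove, and swapInts are assumptions made for this example; keys are ints, location 0 of the array is ignored as above, and the caller is expected to set n to 0 before the first insert.

```c
#define HEAP_MAX 100   /* assumed capacity for this sketch */

typedef struct {
    int A[HEAP_MAX + 1];   /* location 0 is ignored */
    int n;                 /* number of elements in the heap */
} MinHeap;

static void swapInts(int* a, int* b) { int t = *a; *a = *b; *b = t; }

/* Adds x as a new leaf and bubbles it up. Returns 0 on overflow. */
int heapInsert(MinHeap* h, int x) {
    int cur, parent;
    if (h->n == HEAP_MAX) return 0;
    h->A[++h->n] = x;
    cur = h->n;
    parent = cur / 2;
    while (parent != 0 && h->A[parent] > h->A[cur]) {
        swapInts(&h->A[parent], &h->A[cur]);
        cur = parent;                              /* move up a level */
        parent = cur / 2;
    }
    return 1;
}

/* Removes the minimum into *out and bubbles the moved key down.
   Returns 0 on underflow. */
int heapRemove(MinHeap* h, int* out) {
    int cur = 1;
    if (h->n == 0) return 0;
    *out = h->A[1];
    h->A[1] = h->A[h->n--];                        /* move last leaf to the root */
    while (2 * cur <= h->n) {
        int child = 2 * cur;                       /* left child */
        if (child + 1 <= h->n && h->A[child + 1] < h->A[child]) {
            child++;                               /* right child is smaller */
        }
        if (h->A[child] >= h->A[cur]) break;       /* key has found its place */
        swapInts(&h->A[child], &h->A[cur]);
        cur = child;                               /* move down a level */
    }
    return 1;
}
```

Inserting a set of keys and then calling heapRemove until it returns 0 yields the keys in increasing order, which is the heap sort idea described earlier.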


Binary Tree Traversals

Now consider any kind of binary tree with data in the nodes, not just leftmost binary trees.

In many applications, we need to traverse a tree: “visit” each node exactly once. When the node is visited, some computation can take place, such as printing the key.

There are three popular kinds of traversals, differing in the order in which each node is visited in relation to the order in which its left and right subtrees are visited:

� inorder traversal:

� preorder traversal:

� postorder traversal:


Binary Tree Traversals (cont’d)

preorder(x):
if x is not empty then
    visit x
    preorder(leftchild(x))
    preorder(rightchild(x))

inorder(x):
if x is not empty then
    inorder(leftchild(x))
    visit x
    inorder(rightchild(x))

postorder(x):
if x is not empty then
    postorder(leftchild(x))
    postorder(rightchild(x))
    visit x

[Figure: example binary tree with nodes a through i]

• preorder:

• inorder:

• postorder:


Binary Tree Traversals (cont’d)

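These traversals are particularly interesting when the binary tree is a parse tree for an arithmetic expression:

• Postorder traversal results in the postfix representation of the expression.

• Preorder gives the prefix representation.

• Does inorder give the infix representation? Only if parentheses are added.

[Figure: parse tree with root *, whose children are + (with leaves 1 and 3) and - (with leaves 5 and 2)]

• preorder:

• inorder:

• postorder: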
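As a rough illustration of the three traversals on a parse tree, here is a small Java sketch. It assumes the tree represents (1 + 3) * (5 - 2); the class name ExprTree and its helper methods are made up for this example.

```java
// Sketch: the three traversals applied to an assumed parse tree for (1 + 3) * (5 - 2).
class ExprTree {
    static class Node {
        final String label;
        final Node left, right;
        Node(String label, Node left, Node right) {
            this.label = label; this.left = left; this.right = right;
        }
        Node(String label) { this(label, null, null); }
    }

    static String preorder(Node x) {
        if (x == null) return "";
        return x.label + " " + preorder(x.left) + preorder(x.right);
    }

    static String inorder(Node x) {
        if (x == null) return "";
        return inorder(x.left) + x.label + " " + inorder(x.right);
    }

    static String postorder(Node x) {
        if (x == null) return "";
        return postorder(x.left) + postorder(x.right) + x.label + " ";
    }

    // Builds the assumed tree: * at the root, + and - below it, digits at the leaves.
    static Node sample() {
        return new Node("*",
            new Node("+", new Node("1"), new Node("3")),
            new Node("-", new Node("5"), new Node("2")));
    }

    public static void main(String[] args) {
        Node t = sample();
        System.out.println("preorder:  " + preorder(t).trim());   // * + 1 3 - 5 2
        System.out.println("inorder:   " + inorder(t).trim());    // 1 + 3 * 5 - 2
        System.out.println("postorder: " + postorder(t).trim());  // 1 3 + 5 2 - *
    }
}
```

Note that the inorder output drops the grouping: read as ordinary infix, 1 + 3 * 5 - 2 is a different expression from the one the tree encodes.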


Representation of a Binary Tree

The most straightforward representation for an (arbitrary) binary tree is a linked structure, where each node has

• the data

• a reference to the left child

• a reference to the right child

Notice that the array representation used for a heap will not work, because the structure of the tree is not necessarily very regular.

class TreeNode {
    Object data;      // data in the node
    TreeNode left;    // left child
    TreeNode right;   // right child

    // constructor goes here...

    void visit() {
        // what to do when node is visited
    }
}


Representation of a Binary Tree (cont’d)

class Tree {
    TreeNode root;
    // other information...

    void preorderTraversal() {
        preorder(root);
    }

    void preorder(TreeNode t) {
        if (t != null) {      // stopping case for recursion
            t.visit();        // user-defined visit method
            preorder(t.left);
            preorder(t.right);
        }
    }
}

But we haven’t yet talked about how you actually MAKE a binary tree. We’ll do that next, when we talk about


Dictionary ADT Specification

So far, we’ve seen the abstract data types

•

•

•

•

Another useful ADT is a dictionary (or table). The abstract state of a dictionary is a

The main operations are:

•

•

•

Some additional operations are:

• find the

• find the

•


Dictionary ADT Applications

The dictionary (or table) ADT is

For instance, student records at a university can be kept in a dictionary data structure:

• When a new student enrolls,

• When a student graduates,

• When information about a student needs to be updated,

• Once the search has located the record for that student,

• When information about a student needs to be retrieved,

The world is full of information databases, many of them extremely large (imagine what the IRS has).

When the number of elements gets very large,


Dictionary Implementations

We will study a number of implementations:

Search Trees

•

• :

–

–

–

•

Hash Tables

•

•


Binary Search Tree

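Recall the heap ordering property for binary heaps:

Another ordering property is the binary search tree property: for each node x,

• all keys in the left subtree of x are less than the key of x

• all keys in the right subtree of x are greater than the key of x

A binary search tree (BST) is a binary tree that satisfies the binary search tree property.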


Searching in a BST

To search for a particular key in a binary search tree, we take advantage of the binary search tree property:

search(x,k):   // x is node where search starts
-----------    // k is key searched for
if x is null then    // stopping case for recursion
    return "not found"
else if k = the key of x then
    return x
else if k < the key of x then
    return search(leftchild(x),k)    // recursive call
else    // k > the key of x
    return search(rightchild(x),k)   // recursive call
endif

The top level call has x equal to the root of the tree.

In the previous tree, the search path for 17 is

and the search path for 21 is

Running Time:

If BST is a chain, then


Searching in a BST (cont’d)

Iterative version of search:

search(x,k):
------------
while x != null do
    if k = the key of x then
        return x
    else if k < the key of x then
        x := leftchild(x)
    else    // k > the key of x
        x := rightchild(x)
    endif
endwhile
return "not found"

As in the recursive version,

The comparison of the search key with the node key tells you at each level

Running Time:


Searching in a Balanced BST

If the tree is a complete binary tree, then the depth is

and thus the search time is

Binary trees with O(log n) depth are considered balanced: there is balance between

You can have binary trees that are

so that the depth is

but might have a larger constant hidden in the big-oh.

As an aside, a binary heap does not have

Since nodes at the same level of the heap have no particular ordering relationship to each other, you will need to


Inserting into a BST

To insert a key k into a binary search tree,

Then

insert(x,k):
-----------
if x = null then
    make a new node containing k
    return new node
else if k = the key of x then
    return x    // key already exists; leave the tree unchanged
else if k < the key of x then
    leftchild(x) := insert(leftchild(x),k)
    return x
else    // k > the key of x
    rightchild(x) := insert(rightchild(x),k)
    return x
endif

Insert called on node x

unless x is null, in which case

As a result, a child of a node

Running Time:
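A minimal Java sketch of BST search and insertion, under the assumptions that keys are ints and that inserting an existing key leaves the tree unchanged. The class name Bst and its fields are invented for this example.

```java
// Sketch of an unbalanced binary search tree with int keys (hypothetical class Bst).
class Bst {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    Node root;

    // Iterative search: returns the node holding k, or null if k is not present.
    Node search(int k) {
        Node x = root;
        while (x != null) {
            if (k == x.key) return x;
            x = (k < x.key) ? x.left : x.right;  // the comparison picks one subtree
        }
        return null;
    }

    void insert(int k) { root = insert(root, k); }

    // Recursive insert: returns the (possibly new) root of the subtree rooted at x.
    private Node insert(Node x, int k) {
        if (x == null) return new Node(k);       // empty spot found: make a new node
        if (k < x.key) x.left = insert(x.left, k);
        else if (k > x.key) x.right = insert(x.right, k);
        // k == x.key: key already exists, so the tree is left unchanged
        return x;
    }

    public static void main(String[] args) {
        Bst t = new Bst();
        for (int k : new int[] {19, 10, 22, 16, 4}) t.insert(k);
        System.out.println(t.search(16) != null);  // true
        System.out.println(t.search(21) != null);  // false
    }
}
```

Both operations follow a single root-to-leaf path, so their cost is proportional to the depth of the tree.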


Inserting into a BST (cont’d)


Finding Min and Max in Binary Search Tree

Fact: The smallest key in a binary search tree is found by

Running Time:

Guess how to find the largest key and how long it takes.

Min is

and max is


Printing a BST in Sorted Order

Cute tie-in between tree traversals and BST’s.

Theorem: Inorder traversal of a binary search tree visits the nodes in sorted order of their keys.

Inorder traversal on previous tree gives:

Proof: Let’s look at some small cases and then use induction for the general case.

Case 1:

Case 2:

Case n: Suppose true for trees of size

Consider a tree of size


Printing a BST in Sorted Order (cont’d)

L contains at most

and R contains at most

Inorder traversal:

• prints out

• then prints out

• then prints out

□

Running Time:


Tree Sort

Does previous theorem suggest yet another sorting algorithm to you?

Tree Sort: Insert all the keys into a BST, then do an inorder traversal.

Running Time:

since each of the n inserts takes
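Tree sort can be sketched in a few lines of Java. This is an illustrative version only: the class name TreeSort is made up, the tree is a plain unbalanced BST, and equal keys are sent to the right subtree so that duplicates survive the sort.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of tree sort on int keys using a plain (unbalanced) BST.
class TreeSort {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // BST insert; equal keys go right so duplicates are kept.
    static Node insert(Node x, int k) {
        if (x == null) return new Node(k);
        if (k < x.key) x.left = insert(x.left, k);
        else x.right = insert(x.right, k);
        return x;
    }

    // Inorder traversal appends the keys in nondecreasing order.
    static void inorder(Node x, List<Integer> out) {
        if (x == null) return;
        inorder(x.left, out);
        out.add(x.key);
        inorder(x.right, out);
    }

    static List<Integer> sort(int[] keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        List<Integer> out = new ArrayList<>();
        inorder(root, out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sort(new int[] {19, 10, 22, 16, 4, 10}));
    }
}
```

If the input is already sorted, this tree degenerates into a chain, which is where the poor worst case of the basic version comes from.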


Finding Successor in a BST

The successor of a node x in a BST is

Case 1: If x has a right child, then the successor of x is the

follow x’s right pointer, then follow left pointers until there are no more.

Path to find successor of 19 is


Finding Successor in a BST (cont’d)

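[Figure: example binary search tree with root 19, containing the keys 4, 10, 13, 16, 17, 19, 20, 22, 26, and 27]

Case 2: If x does not have a right child, then find the lowest ancestor of x whose key is larger than x’s key.

Path to find successor of 17 is

If you never find an ancestor whose key is larger than x’s key, then

Path to try to find successor of 27 is

Running Time: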
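The two successor cases can be sketched in Java as below. The sketch assumes each node stores a parent pointer, and the class name BstSuccessor is invented for the example. The insertion order in main is chosen to produce one plausible shape for a tree with the keys 4, 10, 13, 16, 17, 19, 20, 22, 26, 27 and root 19; that shape is an assumption, not something taken from the figure.

```java
// Sketch of successor in a BST whose nodes carry parent pointers.
class BstSuccessor {
    static class Node {
        int key;
        Node left, right, parent;
        Node(int key) { this.key = key; }
    }

    // Iterative insert that also sets the parent pointer; returns the tree's root.
    static Node insert(Node root, int k) {
        Node parent = null, x = root;
        while (x != null) {
            parent = x;
            x = (k < x.key) ? x.left : x.right;
        }
        Node fresh = new Node(k);
        fresh.parent = parent;
        if (parent == null) return fresh;            // tree was empty
        if (k < parent.key) parent.left = fresh; else parent.right = fresh;
        return root;
    }

    static Node find(Node x, int k) {
        while (x != null && x.key != k) x = (k < x.key) ? x.left : x.right;
        return x;
    }

    static Node successor(Node x) {
        if (x.right != null) {                       // Case 1: leftmost node of right subtree
            Node y = x.right;
            while (y.left != null) y = y.left;
            return y;
        }
        Node y = x.parent;                           // Case 2: climb to first larger ancestor
        while (y != null && y.key < x.key) y = y.parent;
        return y;                                    // null means x holds the largest key
    }

    public static void main(String[] args) {
        Node root = null;
        for (int k : new int[] {19, 10, 22, 4, 16, 20, 27, 13, 17, 26}) root = insert(root, k);
        System.out.println(successor(find(root, 19)).key);    // 20
        System.out.println(successor(find(root, 17)).key);    // 19
        System.out.println(successor(find(root, 27)) == null); // true
    }
}
```

The predecessor is the mirror image: swap left with right and "larger" with "smaller".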


Finding Predecessor in a BST

The predecessor of a node x in a BST is the node whose

To find it,

Case 1: If x has a left child, then the predecessor of x

follow x’s left pointer, then follow right pointers until there are no more.

Case 2: If x does not have a left child, then find the lowest ancestor of x

(I.e., follow parent pointers from x until reaching a key smaller than x’s.)

If you never find an ancestor whose key is smaller than x’s key, then

Running Time:


Deleting a Node from a BST

Case 1: x is a leaf. Then

Case 2: x has only one child. Then

Case 3: x has two children. Use the same strategy as binary heap: Instead of removing the root node,

1. Find

2. Delete

3. Replace

Running Time:
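One common way to realize the three deletion cases is sketched below in Java. For Case 3 the sketch assumes the replacement key is taken from the successor (the smallest key in the right subtree); using the predecessor would work equally well. The class name BstDelete and its helpers are hypothetical.

```java
// Sketch of BST deletion; Case 3 uses the successor as the replacement (an assumption).
class BstDelete {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node insert(Node x, int k) {
        if (x == null) return new Node(k);
        if (k < x.key) x.left = insert(x.left, k);
        else if (k > x.key) x.right = insert(x.right, k);
        return x;
    }

    // Deletes k from the subtree rooted at x and returns the new subtree root.
    static Node delete(Node x, int k) {
        if (x == null) return null;                    // key not present
        if (k < x.key) {
            x.left = delete(x.left, k);
        } else if (k > x.key) {
            x.right = delete(x.right, k);
        } else if (x.left == null) {
            return x.right;                            // leaf, or only a right child
        } else if (x.right == null) {
            return x.left;                             // only a left child
        } else {
            Node s = x.right;                          // two children: find the successor
            while (s.left != null) s = s.left;
            x.key = s.key;                             // copy the successor's key into x
            x.right = delete(x.right, s.key);          // remove the successor's old node
        }
        return x;
    }

    static String inorder(Node x) {
        if (x == null) return "";
        return inorder(x.left) + x.key + " " + inorder(x.right);
    }

    public static void main(String[] args) {
        Node root = null;
        for (int k : new int[] {19, 10, 22, 4, 16}) root = insert(root, k);
        root = delete(root, 10);                       // node with two children
        System.out.println(inorder(root).trim());      // 4 16 19 22
    }
}
```

The successor never has a left child, so removing its old node always falls into Case 1 or Case 2.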


Deleting a Node from a BST (cont’d)


Balanced Search Trees

We would like to come up with a way to keep a binary search tree “balanced”, so that the depth is

and thus the running time for the BST operations will be

There are a number of schemes that have been devised. We will briefly look at a few of them.

They all require much more complicated algorithms for insertion and deletion, in order to

The algorithms for searching, finding min, max, predecessor or successor, are essentially the same as for

Next few slides give the main idea for the definitions of the trees, but not why the definitions give O(log n) depth, and not how the algorithms for insertion and deletion work.


AVL Trees

An AVL tree is a binary search tree such that for each node, the heights of the left and right subtrees of the node

Theorem: The depth of an AVL tree is

When inserting or deleting a node in an AVL tree, if you detect that the AVL tree property has been violated, then you


Red-Black Trees

A red-black tree is a binary search tree in which

• every “real” node is given

• every node is colored

– every leaf node is

– if a node is red, then both its children are

– every path from a node to a leaf contains

From a fixed node, all paths from that node to a leaf differ in length by

Theorem: The depth of a red-black tree is

Insert and delete algorithms are quite involved.


B-Trees

The AVL tree and red-black tree allowed some variation in

An alternative idea is to make sure that all root-to-leaf paths have

and allow

The definition of a B-tree uses a parameter m:

• every leaf

• the root has

• every non-root node has

Keys are placed into nodes like this:

• Each non-leaf node has

• Each leaf node has

• The keys within a node are


B-Trees (cont’d)

And we require the extended search tree property:

• For each node x, the i-th key in x is

and is

B-trees are extensively used in the real world, for instance, database applications. In practice,

Theorem: The depth of a B-tree is

Insert and delete algorithms are quite involved.


Tries

In the previous search trees, each key is

except for their

For some kinds of keys, one key might be a

For example, if the keys are strings, then the key “at” is a prefix of the key “atlas”.

The next kind of tree takes advantage of

to store them more efficiently.

A trie is a (not necessarily binary) tree in which

• each node corresponds to

• prefix for each node

The trie storing “a”, “ale”, “ant”, “bed”, “bee”, “bet”:


Inserting into a Trie

To insert into a trie:

insert(x,s):   // x is node, s is string to insert
------------
if length(s) = 0 then
    mark x as holding a complete key
else
    c := first character in s
    if no outgoing edge from x is labeled with c then
        create a new child node of x
        label the edge to the new child node with c
        put the edge in the correct sorted order
            among all of x’s outgoing edges
    endif
    x := child of x reached by edge labeled c
    s := result of removing first character from s
    insert(x,s)
endif

Start the recursion

To insert “an” and “beep”:

[Figure: the example trie, with edges labeled by characters, used to trace the insertions]


Searching in a Trie

To search in a trie:

search(x,s):   // x is node, s is string to search for
------------
if length(s) = 0 then
    if x holds a complete key then return x
    else return null    // s is not in the trie
else
    c := first character in s
    if no outgoing edge from x is labeled with c then
        return null     // s is not in the trie
    else
        x := child of x reached by edge labeled c
        s := result of removing first character from s
        return search(x,s)
    endif
endif

Start the recursion

To search for “art” and “bee”:

[Figure: the example trie, with edges labeled by characters, used to trace the searches]
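The trie insert and search procedures can be sketched in Java as follows. This is an illustrative version: the class name Trie, the use of a TreeMap to keep each node's outgoing edges in sorted order, and the iterative (rather than recursive) loops are choices made for the example.

```java
import java.util.TreeMap;

// Sketch of a trie over strings; a TreeMap keeps each node's edges sorted by character.
class Trie {
    static class Node {
        boolean complete;   // true if the path from the root to here spells a stored key
        final TreeMap<Character, Node> children = new TreeMap<>();
    }

    final Node root = new Node();

    void insert(String s) {
        Node x = root;
        for (char c : s.toCharArray()) {
            // follow the edge labeled c, creating the child if it does not exist yet
            x = x.children.computeIfAbsent(c, ch -> new Node());
        }
        x.complete = true;  // mark the final node as holding a complete key
    }

    boolean contains(String s) {
        Node x = root;
        for (char c : s.toCharArray()) {
            x = x.children.get(c);
            if (x == null) return false;  // no edge labeled c: s is not in the trie
        }
        return x.complete;  // a bare prefix of a stored key does not count
    }

    public static void main(String[] args) {
        Trie t = new Trie();
        for (String s : new String[] {"a", "ale", "ant", "bed", "bee", "bet"}) t.insert(s);
        System.out.println(t.contains("an"));   // false: only a prefix so far
        t.insert("an");
        t.insert("beep");
        System.out.println(t.contains("an"));   // true
        System.out.println(t.contains("art"));  // false
        System.out.println(t.contains("bee"));  // true
    }
}
```

Both operations take time proportional to the length of the string, independent of how many keys are stored.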


Hash Table Implementation of Dictionary ADT

Another implementation of the Dictionary ADT is a

Hash tables support the operations

•

•

•

with

This is a significant advantage over even balanced search trees, which have average times of

The disadvantage of hash tables is that

and printing all elements in sorted order takes


Main Idea of Hash Table

Main idea: exploit random access feature of arrays: the i-th entry of array A can be accessed

Simple example: Suppose all keys are in the range

Then store elements in an array A with

Initialize all entries to some empty indicator.

• To insert x with key k:

• To search for key k:

• To delete element with key k:

All times are

But this idea does not scale well.


Hash Functions

Suppose

• elements are

• school has

• keys are

Since there are 1 billion possible SSN’s, we need an array of length 1 billion. And most of it will be wasted, since only 40,000/1,000,000,000 = 1/25,000 fraction is nonempty.

Instead, we need a way to

Let M be the size of the array we are willing to provide.

Use a hash function, h, to

Then h maps key values to integers in the range


Simple Hash Function Example

Suppose keys are integers. Let the hash function be h(k) = k mod M. Notice that this always gives you something in the range

• To insert x with key k:

• To search for element with key k:

• To delete element with key k:

All times are

assuming the hash function can be computed in constant time.

The key to making this work is to


Collisions

In reality, any hash function will have collisions: when two different keys

This is inevitable, since the hash function is squashing down a large domain into a small range.

For example, if h(k) = k mod M, then

since they both hash to

What should you do when you have a collision? Two common solutions are

1.

2.


Chaining

Keep all data items that hash to the same array location in a

• to insert element x with key k:

• to search for element with key k:

• to delete element with key k:

Worst case times, assuming computing h is constant:

• insert:

• search and delete:

Worst case is if all n elements
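A chained hash table can be sketched in Java as below. This is only an illustration under several assumptions: keys are ints, the hash function is h(k) = k mod M, each slot holds a java.util.LinkedList, and new keys are added at the front of the list. The class name ChainedTable and the chainLength helper are made up for the example.

```java
import java.util.LinkedList;

// Sketch of hashing with chaining for int keys, using h(k) = k mod M.
class ChainedTable {
    private final LinkedList<Integer>[] slots;
    private final int m;

    @SuppressWarnings("unchecked")
    ChainedTable(int m) {
        this.m = m;
        slots = new LinkedList[m];
        for (int i = 0; i < m; i++) slots[i] = new LinkedList<>();
    }

    private int h(int k) { return Math.floorMod(k, m); }

    // Constant time: add at the front of the list (does not check for duplicates).
    void insert(int k) { slots[h(k)].addFirst(k); }

    // Time proportional to the length of one list.
    boolean search(int k) { return slots[h(k)].contains(k); }

    boolean delete(int k) { return slots[h(k)].remove(Integer.valueOf(k)); }

    int chainLength(int slot) { return slots[slot].size(); }

    public static void main(String[] args) {
        ChainedTable t = new ChainedTable(7);
        t.insert(3);
        t.insert(10);                            // 10 mod 7 = 3, so this collides with 3
        System.out.println(t.chainLength(3));    // 2
        System.out.println(t.search(10));        // true
        System.out.println(t.delete(3));         // true
        System.out.println(t.search(3));         // false
    }
}
```

If every key lands in the same slot, the table behaves like a single linked list, which is the worst case mentioned above.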


Good Hash Functions for Chaining

Intuition: Hash function should

More formally:

Impractical to check in practice since

For example: Suppose the symbol table in a compiler is implemented with a hash table. The compiler writer cannot know in advance which variable names will appear in each program to be compiled.

Heuristics are used to approximate this condition:


Good Hash Functions for Chaining (cont’d)

Some issues to consider in choosing a hash function:

• Exploit

For symbol table example, take into account the kinds of variable names that people often choose (e.g., x1).

• Hash function should depend on

For example: if the keys are English words, it is not a good idea to hash on the first letter, since many words begin with S and few with X.


Average Case Analysis of Chaining

Define load factor of hash table with M entries and n keys to be α = n/M.

Assume a hash function that is ideal for chaining

Fact: Average length of each linked list is

The average running time for chaining:

• Insert:

• Unsuccessful Search: O(1) time to compute h(k); α items, on average, in the linked list are checked until discovering that k is not present.

• Successful Search: O(1) time to compute h(k); on average, key being sought is in middle of linked list, so α/2 comparisons needed to find k.

• Delete:

For these times to be O(1), α must be O(1), so n cannot be too much larger than


Open Addressing

With this scheme, there are

Instead,

If there is a collision, you have to probe the table –

You must pick a pattern that you will use to probe the table.

The simplest pattern is to

and then check

This is called

If h(k) = 7, the probe sequence will be


Clustering

A problem with linear probing:

If an insert probe sequence begins in a cluster,

•

•

To reduce clustering,

to skip over some locations, so locations are not checked

There are various schemes for how to choose the increments; in fact, the increment to use can be


Clustering (cont’d)

If the probe sequence starts at 7 and the probe increment is 4, then the probe sequence will be

Warning! The probe increment must be relatively prime to the table size; otherwise you will not search all locations.

For example, suppose you have table size 9 and increment 3. You will only search
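The effect of a bad increment is easy to check with a few lines of Java. The sketch below simply records which locations a fixed-increment probe sequence reaches; the class name ProbeCoverage and the choice of starting location 7 are assumptions made for the example.

```java
import java.util.TreeSet;

// Sketch: which table locations does a fixed-increment probe sequence reach?
class ProbeCoverage {
    // Returns the set of distinct locations visited in m probes.
    static TreeSet<Integer> visited(int start, int increment, int m) {
        TreeSet<Integer> seen = new TreeSet<>();
        int pos = start % m;
        for (int i = 0; i < m; i++) {
            seen.add(pos);
            pos = (pos + increment) % m;
        }
        return seen;
    }

    public static void main(String[] args) {
        // Table size 9, increment 3: gcd(9, 3) = 3, so only 9/3 = 3 locations are reached.
        System.out.println(visited(7, 3, 9));          // [1, 4, 7]
        // Increment 4 is relatively prime to 9, so all 9 locations are reached.
        System.out.println(visited(7, 4, 9).size());   // 9
    }
}
```

In general, an increment d on a table of size M reaches M / gcd(M, d) locations, which is why a prime table size is a convenient choice.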


Double Hashing

Even when “non-linear” probing is used, it is still true that

To get around this problem, use

1. One hash function, h1, is used to determine

2. A second hash function, h2, is used to determine

If the hash functions are chosen properly,


Double Hashing Example

Let h1(k) = k mod 13 and h2(k) = 1 + (k mod 11).

• To insert 14: start probing at

Probe increment is

Probe sequence is

• To insert 27: start probing at

Probe increment is

Probe sequence is

• To search for 18: start probing at

Probe increment is

Probe sequence is
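The probe locations for this example can be computed mechanically. The Java sketch below assumes the table size is 13 (matching the modulus of h1) and that the i-th probe is at (h1(k) + i * h2(k)) mod 13; the class name DoubleHashing and the probeSequence helper are hypothetical. How many of these probes are actually made depends on which slots are occupied, which the sketch does not model.

```java
import java.util.Arrays;

// Sketch: probe locations under double hashing with h1(k) = k mod 13, h2(k) = 1 + (k mod 11).
class DoubleHashing {
    static final int M = 13;   // assumed table size

    static int h1(int k) { return k % 13; }         // where probing starts
    static int h2(int k) { return 1 + (k % 11); }   // probe increment, never 0

    // First `count` locations of the probe sequence for key k.
    static int[] probeSequence(int k, int count) {
        int[] seq = new int[count];
        for (int i = 0; i < count; i++) {
            seq[i] = (h1(k) + i * h2(k)) % M;
        }
        return seq;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(probeSequence(14, 4)));  // [1, 5, 9, 0]
        System.out.println(Arrays.toString(probeSequence(27, 4)));  // [1, 7, 0, 6]
        System.out.println(Arrays.toString(probeSequence(18, 4)));  // [5, 0, 8, 3]
    }
}
```

Keys 14 and 27 start at the same location but use different increments (4 and 6), so their probe sequences diverge after the first probe.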


Deleting with Open Addressing

Open addressing has another complication:

• to insert:

• to search:

Suppose we use linear probing. Consider this sequence:

• Insert k1, where h(k1) = 3, at location 3.



• Delete k2 from location 4 by setting location 4 to empty.

• Search for k3.

Solution: when an element is deleted, instead of marking the slot as empty,

Then the search algorithm needs to continue searching if it finds one of those slots.
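A common way to implement this solution is to give each slot one of three states. The Java sketch below uses linear probing with h(k) = k mod M on int keys; the class name LinearProbeTable, the state names EMPTY, FULL, and DELETED, and the assumption that insert is never called with a key already present are all choices made for the example.

```java
// Sketch of open addressing with linear probing and a special "deleted" marker.
class LinearProbeTable {
    private static final int EMPTY = 0, FULL = 1, DELETED = 2;
    private final int[] keys;
    private final int[] state;   // all slots start out EMPTY (0)
    private final int m;

    LinearProbeTable(int m) {
        this.m = m;
        keys = new int[m];
        state = new int[m];
    }

    private int h(int k) { return Math.floorMod(k, m); }

    // Puts k in the first slot on its probe path that is not FULL.
    // Assumes k is not already in the table; returns false if the table is full.
    boolean insert(int k) {
        for (int i = 0; i < m; i++) {
            int pos = (h(k) + i) % m;
            if (state[pos] != FULL) {
                keys[pos] = k;
                state[pos] = FULL;
                return true;
            }
        }
        return false;
    }

    // Stops at an EMPTY slot but keeps going past DELETED slots.
    boolean search(int k) {
        for (int i = 0; i < m; i++) {
            int pos = (h(k) + i) % m;
            if (state[pos] == EMPTY) return false;
            if (state[pos] == FULL && keys[pos] == k) return true;
        }
        return false;
    }

    // Marks the slot DELETED rather than EMPTY so later searches are not cut short.
    boolean delete(int k) {
        for (int i = 0; i < m; i++) {
            int pos = (h(k) + i) % m;
            if (state[pos] == EMPTY) return false;
            if (state[pos] == FULL && keys[pos] == k) {
                state[pos] = DELETED;
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        LinearProbeTable t = new LinearProbeTable(10);
        t.insert(3);    // h = 3, stored at location 3
        t.insert(13);   // h = 3, stored at location 4
        t.insert(23);   // h = 3, stored at location 5
        t.delete(13);   // location 4 becomes DELETED, not EMPTY
        System.out.println(t.search(23));  // true: the search continues past location 4
    }
}
```

Had delete reset location 4 to EMPTY, the search for 23 would have stopped there and wrongly reported that the key is absent.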


Good Hash Functions for Open Addressing

An ideal hash function for open addressing would satisfy an even stronger property than that for chaining, namely:

This is even harder to achieve in practice than the ideal property for chaining.

A good approximation is double hashing with this scheme:

•

Generalizes the earlier example.


Average Case Analysis of Open Addressing

In this situation, the load factor α = n/M is always less than 1:

Assume that there is always at least one empty slot.

Assume that the hash function ensures that each key is equally likely to have each permutation of {0, 1, ..., M − 1} as its probe sequence.

Average case running times:

• Unsuccessful Search:

• Insert:

• Successful Search:

• Delete:

The reasoning behind these formulas requires more sophisticated probability than for chaining.


Sanity Check for Open Addressing Analysis

The time for searches should

The formula for unsuccessful search is

• As n gets closer to M,

• so

• so

At the extreme, when n = M − 1, the formula 1/(1 − α) = M, meaning that
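The arithmetic behind the extreme case can be written out as a short worked equation, using the load factor α = n/M from the analysis above:

```latex
\[
n = M - 1
\;\Longrightarrow\;
\alpha = \frac{n}{M} = \frac{M-1}{M}
\;\Longrightarrow\;
\frac{1}{1-\alpha} = \frac{1}{1 - \frac{M-1}{M}} = \frac{1}{1/M} = M .
\]
```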


Sorting

• Insertion Sort:

– Consider

– Shift

– Insert

– Worst-case time is

• Treesort:

– Insert

– Then do

– For a basic BST, worst-case time is

but average time is

– For a balanced BST, worst-case time is

although code is more complicated.


Sorting (cont’d)

• Heapsort:

– Insert

– Then


• Mergesort: Apply the idea of

– Split

– Recursively

– Recursively

– Then


however, it requires more space.


Object-Oriented Software Engineering

References:

• Standish textbook, Appendix C

• Developing Java Software, by Russel Winder and Graham Roberts, John Wiley & Sons, 1998 (ch 8-9).

Outline of material:

•

•

•

•

•

•

•

•

•


Small Scale vs. Large Scale Programming

Programming in the small: programs done by

whose length is

Programming in the large: projects consisting of

and producing

Obviously the complications are much greater here.

The field of software engineering is mostly oriented toward

However, the principles still hold (although simplified) for programming in the small. It’s worth understanding these principles so that

•

•


Object-Oriented Software Engineering

Software engineering studies

Object-oriented software engineering uses

Why object-oriented?

• use of abstractions to

• benefits of encapsulation to

• power of inheritance to

Experience has shown that object-oriented software engineering

• helps create robust reliable programs with

• promotes the development of programs by


Object-Oriented Software Engineering (cont’d)

Solutions to specific problems tend to be fragile and short-lived:

To minimize effects of requirement changes

instead of just focusing on

Usually the problem domain is fairly stable, whereas a

If you capture the problem domain as the core of your design, then the code is likely to be

More traditional structured programming tends to lead to a


Object-Oriented Software Engineering (cont’d)

In OO analysis and design, identify

and model them as

Leads to

• go downwards to

• go upwards to

This approach tends to lead to

and

For instance, when the requirements change, you may have all the basic abstractions right but you

Aim for

which are specialized by inheritance to provide


Software Life Cycle

• inception:

– requirements:

• elaboration:

– analysis:

– design:

– identify reuse:

• implementation

–

–

–

• testing

• delivery and maintenance


Software Life Cycle (cont’d)

Lifecycle is not followed linearly;

An ideal way to proceed is by

• implement

• review

• decide

• proceed

• continue

This supports

letting you try alternatives and


Requirements

Decide what the program is supposed to do

Harder than it sounds.

Ask the user

•

•

Involve the user in reviewing the requirements when they are produced and the prototypes developed.

Typically, requirements are organized

Helpful to construct scenarios, which describe


Requirements (cont’d)

An example scenario to look up a phone number:

1. select

2. enter

3.

4. program computes, to

(do NOT specify data structure to be used at this level)

5.

Construct as many scenarios as needed until you feel comfortable, and have gotten feedback from the user, that

This part of the software life cycle is no different for object-oriented software engineering than for non-object-oriented.


Object-Oriented Analysis and Design

Main objective:

Analysis and design are two ends of a spectrum: Analysis focuses more on the

while design focuses more on the

For large scale projects, there might be a real distinction: for example,

might be required to implement

For small scale projects, there is typically no distinction between analysis and design:


Object-Oriented Analysis and Design (cont’d)

To decide on the classes:

• Study

Look for nouns in the requirements:

These will probably turn into

and/or

See how the requirements specify interactions between things (e.g., each student has a GPA, each course has a set of enrolled students).

• Use an analysis method:

(Particularly aimed at large scale projects.)


An Example OO Analysis Method

CRC (Class, Responsibility, Collaboration): It clearly identifies the Classes, what the Responsibilities are of each class, and how the classes Collaborate (interact).

In the CRC method, you draw class diagrams:

• each class is

–

–

–

• if class 1 is a subclass of class 2, then

• if an object of class 1 is part of (an instance variable of) class 2, then

• if objects of class 1 need to communicate with objects of class 2, then

The arrows and lines can be annotated to indicate the number of objects involved, the role they play, etc.


CRC Example

To model a game with several players who take turns throwing a cup containing dice, in which some scoring system is used to determine the best score:

This is a diagram of the

not the

Object diagrams are trickier since

Double-check that the class diagram is consistent with requirements scenarios.


Object-Oriented Analysis and Design (cont’d)

While fleshing out the design, after identifying what the different methods of the classes should be, figure out

This means deciding what

Do not fall in love with one particular solution (such as the first one that occurs to you). Generate

and then try to

Do not commit to a particular solution too early in the process. Concentrate on

The use of ADTs assists in this aspect.


Verification and Correctness Proofs

Part of the design includes

You should have some convincing argument as to why these algorithms are correct.

In many cases, it will be obvious:

•

•

But sometimes you might be coming up with your own algorithm, or

In these cases, it’s important to check what you are doing!


Verification and Correctness Proofs (cont’d)

The Standish book describes one particular way to prove correctness of small programs, or program fragments. The important lessons are:

• It is possible to

• Formalisms can help you to

• Spending a lot of time thinking about your program, no matter what formalism, will

• These approaches are impossible to do

For large programs, there are research efforts aimed at

i.e., programs that

Generally automatic verification is slow and cumbersome, and requires some specialized skills.


Verification and Correctness Proofs (cont’d)

An alternative approach to program verification is

Instead of trying to verify actual code,

• Represent the algorithm in

• then

Of course, you might make a mistake when translating your pseudocode into Java, but the proving will be much more manageable than the verification.


Implementation

The design is now fleshed out to the level of code:

•

•

•

•

As the code is written, document the key design decisions, implementation choices, and any unobvious aspects of the code.

Software reuse: Use library classes as appropriate (e.g., Stack, Vector, Date, Hashtable). Kinds of reuse:

•

•

•

But sometimes modifications can be more time consuming than starting from scratch.


Testing and Debugging: The Limitations

Testing cannot prove that your program is correct.

It is impossible to test a program on every single input, so

Even if you could apply some kind of program verification to your program,

And in fact, how do you know that your requirements

However, testing still serves a worthwhile, pragmatic purpose.


Test Cases, Plans and Logs

Run the program on various test cases. Test cases should

More specifically,

• test on

• test on

• test on

Organize your test cases according to a

Purposes:

• make it clear

• ensure that

Results of running a set of tests is a

After fixing a bug, you must

(Winder and Roberts call this the Principle of Maximum Paranoia.)


Kinds of Testing

Unit testing:

•

•

Integration testing:

Two approaches to integration testing:

Bottom-up testing

Then progress to the next level up: those methods and classes that only use the bottom level ones already tested. Use a driver to test combinations of the bottom two layers.

Proceed until


Kinds of Testing (cont’d)

Top down testing proceeds in the opposite direction, making

Reasons to do top down testing:

• to allow software development to

• if you have modules that are mutually dependent, e.g., X uses Y, Y uses Z, and Z uses X. You can


Other Approaches to Debugging

In addition to testing, another approach to debugging a program is to

A third approach is called a

Some companies give your (group’s) code to another group, whose job is to try to make your code break!


Maintenance and Documentation

Maintenance includes:

•

•

•

•

Most often, the person (or people) doing the maintenance is NOT the one who originally wrote the program.

There are (at least) two kinds of documentation, both of which need to be updated during maintenance:

• internal documentation,

• external documentation,


Maintenance and Documentation (cont’d)

In addition to good documentation, a clean and easily modifiable structure is needed for effective maintenance,

If changes are made in an ad hoc, kludgey way (either because the maintainer does not understand the underlying design or because the design is poor), the program will

Trying to fix one problem causes something else to break, so in desperation you put in some jumps (spaghetti code) to try to avoid this, etc.

Eventually it may be better to replace the program with


Measurement and Tuning

Experience has shown:

•

•

These observations suggest that optimizing your program can pay big benefits, but that it is smarter to

How can you figure out where your program is spending its time?

• use a tool called an

•


Measurement and Tuning (cont’d)

Things you can do to speed up a program:

• find

• replace

• replace

• take advantage of

Don’t do things that are stupidly slow in your program from the beginning.

On the other hand, don’t go overboard in supposed optimizations (that might hurt readability) unless you


Software Reuse and Bottom-up Programming

The bottom line from section C.7 in Standish is:

• the effort required to build software is

• making use of reusable components can

So it makes lots of sense to try to reuse software. Of course, there are costs associated with reuse:

•

•

Using lots of reusable components leads to more bottom-up, rather than top down, programming. Or perhaps, more appropriately,


Design Patterns

As you gain experience, you will learn to recognize good and bad design and build up

Why not try to exploit other people’s experience in this area as well?

A design pattern captures a component of a complete design that has been observed to

It provides both a solution to a problem and information about them.

There is a growing literature on design patterns, especially for object oriented programming. It is worthwhile to become familiar with it. For instance, search the WWW for “design pattern” and see what you get.


File Structures

A file is

Why on mass storage?

•

•

•

The data is subdivided into

Each record contains a number of

One (or more) field is the

Issue:

We will discuss sequential files, indexed files, and hashed files.


Sequential Files

Records are conceptually organized in

The actual storage might or might not be sequential:

• On a tape,

• On a disk,

Convenient way to batch (group together) a number of updates:

• Store the

• Sort the

• Scan through

Not a convenient organization for accessing a particular record quickly.


Indexed Files

Sequential search is even slower on disk/tape than in main memory. Try to improve performance using

An index for a file is a

Typically the key field is

The index can be organized as a list, a search tree, a hash table, etc.

To find a particular record:

•

•

•

Multiple indexes, one per key field, allow


Hashed Files

An alternative to storing the index as a hash table is to

Instead, hash on the key to find the address of the desired record and

The usual hashing considerations arise.


Databases

A database is

•

•

Example: Collection of student records can be viewed as a database to be used by:

•

•

•

•

The advantages of consolidating the data:

•

•

•


Database System Organization

The “software architecture” of a database system is

• End user calls application software to access the data. End user thinks of data

• Application software calls database management system (DBMS) software. The application software has a

• DBMS deals with the

As usual, the advantages of layering are that


Communication with a Database

Databases usually provide a useful and powerful interface for obtaining information from them. So far, we’ve just seen requests of the form:

•

•

•

But suppose you’d like to print out the names of all students who are freshmen and either have a 4.0 GPA or whose names start with X.

There are ways to conceptually organize the data to allow such queries to be answered efficiently, using what are called

• The application software communicates with

• The DBMS must


Database Integrity

Data in a database is typically

•

•

Thus it must

Data can be corrupted if

Example of corrupted data:

• T1 transfers

• T2 inventories

Suppose this sequence of events occurs:

• T1 subtracts

• T2 gets the

• T2 gets the

• T1 adds

T2’s total balance is


DB Serializability

To prevent transactions from interfering with each other, the DBMS should

This property is called

The DBMS does not have to (and should not) actually make the transactions run serially, but if there is a potential conflict,

One solution is

• Before accessing any data item, the transaction must

• Only one transaction at a time can

• If another transaction already has the lock, then

• After accessing all the data items,


Committing and Aborting a Transaction

Two-phase locking can lead to deadlock, e.g.:

•

•

•

•

The DBMS must periodically check for deadlock, and if one is discovered, it must

If the aborted transaction has already made changes to the database, the DBMS must

• either

• don’t actually

Once the transaction has successfully completed, then it is


Artificial Intelligence

Goal: Develop machines that

•

•

and proceed “intelligently”

•

•

•

Distinct but related goals:

1.

2.

3.


8-Puzzle Example

Given a 3-by-3 box that holds 8 tiles, numbered 1 through 8. One tile is missing. The goal is to start with the tiles scrambled and

We will try to solve this problem by a machine that has

• a gripper,

• a video camera,

• a computer,

• a “finger”,

Ideas from mechanical engineering can be used to implement the gripper and the finger. We will talk about how to “see” where the tiles are, and how to decide how to move the tiles.


Computer Vision

It is not enough to simply store the image obtained from the camera. The program must be

• figure out which parts of the image are the salient objects, called

• and then recognize the objects by comparing them to known symbols, called

For the 8-puzzle, this problem can be highly simplified:

• always expect the digits to

•

•

•

But in general this is a very difficult problem and one where there has been extensive research.


Reasoning

How can the program solve the puzzle?

One solution is to

For example, if the input is

then the solution is to

But in this case there are approximately 9! = 362,880 different inputs, some of which require a long sequence of moves to solve, and it would require a lot of space.

Plus, someone would have to figure out all the answers in advance.


Production Systems

Instead, have the program figure out the solution. One approach is the

First, consider the state graph of the problem:

•

•

Here is a tiny piece of the state graph for the 8-puzzle:

Identify the

The control system figures out how to


Solving a Production System

We must find a path through the state graph from

Luckily, finding paths in graphs is

One way is to build a search tree (not to be confused with a binary search tree), which

Two solutions are


Breadth-First Search

Build the search tree in a breadth-first manner:

• The root

• The next level

• The next level

For example:

[Figure: breadth-first search tree for a sample 8-puzzle start state]

But the search tree grows exponentially.
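A minimal sketch of breadth-first search over the 8-puzzle state graph follows. The representation is an assumption made for illustration: a state is a tuple of nine integers read row by row, with 0 standing for the missing tile, and `GOAL`, `neighbors`, and `bfs` are hypothetical names. Unlike a plain search tree, this version remembers states it has already generated so that it never expands the same state twice.

```python
from collections import deque

# Assumed goal layout: tiles in order, blank (0) in the bottom-right corner.
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)


def neighbors(state):
    """Yield every state reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:
            other = r * 3 + c
            nxt = list(state)
            nxt[blank], nxt[other] = nxt[other], nxt[blank]
            yield tuple(nxt)


def bfs(start, goal=GOAL):
    """Return a shortest list of states from start to goal, or None."""
    parent = {start: None}  # also serves as the set of generated states
    queue = deque([start])
    while queue:
        state = queue.popleft()  # oldest state first: level-by-level order
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in neighbors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None
```

For the start state `(1, 2, 3, 4, 5, 6, 0, 7, 8)` the search returns a path of three states, that is, two moves. Harder starting positions force it to generate many more states before the goal appears.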


Depth-First Search

Another approach is

Pursue more promising paths to greater depths and consider other options only if

To implement this idea, we need some criterion to decide which paths are promising, or appear to be promising.

Such criteria are called heuristics. A heuristic is

We need something quantitative so we can


Heuristic for 8-Puzzle

For the 8-puzzle example, our intuitive rule of thumb is to

A quantitative heuristic measure is:

For instance, if the input is

then the heuristic measure is

This heuristic has two desirable properties:

1. it is a

2. it is
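One common quantitative measure for the 8-puzzle, used here only as an assumed example since the slide leaves its own measure blank, is the sum over all tiles of the number of rows plus columns separating each tile from its goal position. The sketch below assumes a state is a tuple of nine integers read row by row with 0 for the blank; `GOAL` and `heuristic` are hypothetical names.

```python
# Assumed goal layout: tiles in order, blank (0) in the bottom-right corner.
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)


def heuristic(state, goal=GOAL):
    """Sum of row-plus-column distances of each tile from its goal cell."""
    total = 0
    for tile in range(1, 9):  # the blank (0) is not counted
        r1, c1 = divmod(state.index(tile), 3)
        r2, c2 = divmod(goal.index(tile), 3)
        total += abs(r1 - r2) + abs(c1 - c2)
    return total
```

Under this measure the goal state scores 0, and `(1, 2, 3, 4, 5, 6, 0, 7, 8)` scores 2 because tiles 7 and 8 are each one cell away from home. Every move slides one tile by one cell, so the measure never overestimates the number of moves still required.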


Using a Heuristic in Depth-First Search

• Repeatedly

• Choose the

• Generate

• Continue

In the 8-puzzle example above:

• Generate the root. Its heuristic measure is

• Generate all children of the root. They have measures

• Choose the leaf with measure 2 and generate all its children. They have measures

• Choose the leaf with measure 1 and generate all its children. They have measures

In this depth-first search, we only had to generate 9 states, instead of
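The procedure of always expanding the leaf with the smallest heuristic measure can be sketched as below. This strategy is usually called greedy best-first search. The details are assumptions made for illustration: the state encoding, the distance-based heuristic, the first-generated tie-breaking rule, and the names `neighbors`, `heuristic`, and `heuristic_search` are all hypothetical.

```python
import heapq
from itertools import count

# Assumed goal layout: tiles in order, blank (0) in the bottom-right corner.
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)


def neighbors(state):
    """Yield every state reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:
            other = r * 3 + c
            nxt = list(state)
            nxt[blank], nxt[other] = nxt[other], nxt[blank]
            yield tuple(nxt)


def heuristic(state, goal=GOAL):
    """Sum of row-plus-column distances of each tile from its goal cell."""
    total = 0
    for tile in range(1, 9):
        r1, c1 = divmod(state.index(tile), 3)
        r2, c2 = divmod(goal.index(tile), 3)
        total += abs(r1 - r2) + abs(c1 - c2)
    return total


def heuristic_search(start, goal=GOAL):
    """Return (path, number of states generated); path is None on failure."""
    tie = count()  # ties on the measure go to the state generated first
    parent = {start: None}
    leaves = [(heuristic(start, goal), next(tie), start)]
    generated = 1
    while leaves:
        _, _, state = heapq.heappop(leaves)  # leaf with the smallest measure
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1], generated
        for nxt in neighbors(state):
            if nxt not in parent:
                parent[nxt] = state
                generated += 1
                heapq.heappush(leaves, (heuristic(nxt, goal), next(tie), nxt))
    return None, generated
```

Starting from `(1, 2, 3, 4, 5, 6, 0, 7, 8)`, this sketch generates only 5 states before reaching the goal. The path it finds is not guaranteed to be the shortest one in general, because the choice depends only on the heuristic measure and not on the number of moves already made.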


Other Applications of Production Systems

Many problems can be formulated as production systems. In addition to the 8-puzzle,

You can even model the process of drawing logical conclusions from a set of given facts as a production system. In this case,

• each state is

• a production/rule/move corresponds to

For instance, part of the state graph might be:

since there is a rule of logic that says: Given the facts

1.

2.

then you can deduce that


Some Other Areas of AI

Neural Networks: Try to take advantage of the power of parallelism (multiprocessor computer architectures) using a paradigm that (roughly) follows the model of

Robotics: Hardware and software working together, e.g., automated manufacturing. Great interest in having machines explore and function in uncontrolled and unpredictable environments, such as

•

•

•

Expert Systems: Combine domain-specific knowledge from human experts with

For example:

•

•


Time Complexity of an Algorithm

Time complexity of an algorithm: the function T(n) that describes the

Given a particular algorithm, discover this function by attacking the problem from two directions:

• find an upper bound U(n) on the function T(n), i.e., convince ourselves that the algorithm will

• find a lower bound L(n) on the function T(n), i.e., convince ourselves that, for each n, there is

Try to find smallest U and largest L, so that T is squeezed in between and has no room to hide.


Time Complexity of an Algorithm (cont’d)

(a) No execution on an input of size n0 takes

(b) The slowest execution on all inputs of size n0 takes

(c) At least one execution on an input of size n0 takes


Time Complexity of Heapsort

Let T(n) be the time complexity of heapsort.

First cut at upper bound:

First cut at lower bound:

Refined argument for upper bound: each heap operation never

Refined argument for lower bound: Describe a particular input that

On input n, n − 1, n − 2, ..., 3, 2, 1, running time is at least

Thus T(n) now precisely identified as
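To make the upper-bound argument concrete, the sketch below counts key comparisons in an array-based heapsort. It is an illustrative implementation using a max-heap, not necessarily the version used earlier in the course, and `heapsort` is a hypothetical name. Each sift-down makes at most two comparisons per level of the heap, which gives at most 2n log2 n + 2n comparisons in total. This is consistent with the standard Θ(n log n) worst-case bound for heapsort.

```python
import math


def heapsort(items):
    """Sort a copy of items; return (sorted_list, number_of_comparisons)."""
    a = list(items)
    n = len(a)
    comparisons = 0

    def sift_down(root, size):
        """Push a[root] down until the max-heap property holds below it."""
        nonlocal comparisons
        while True:
            child = 2 * root + 1
            if child >= size:
                return
            if child + 1 < size:
                comparisons += 1
                if a[child + 1] > a[child]:
                    child += 1  # pick the larger of the two children
            comparisons += 1
            if a[root] >= a[child]:
                return
            a[root], a[child] = a[child], a[root]
            root = child

    # Build the heap bottom-up, then repeatedly move the maximum to the end.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(i, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)
    return a, comparisons


if __name__ == "__main__":
    for n in (16, 256, 4096):
        _, c = heapsort(range(n, 0, -1))  # the input n, n-1, ..., 2, 1
        print(n, c, round(2 * n * math.log2(n) + 2 * n))
```

Running the loop at the bottom prints the measured count next to the bound for the decreasing input considered above.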


Time Complexity of a Problem

Time complexity of a problem: the time complexity for

To show that a problem has time complexity T(n):

• Identify a

• Then prove

Example: Sorting problem has time complexity O(n log n).

•

• It can be proved that

Problems can be classified by their time complexity. Harder problems are considered to be those


The Class P

All problems (not algorithms) whose time complexity is at most some polynomial are said to be

Example:

Not all problems are in P.

Example: Consider the problem of listing all permutations of the integers 1 through n.

• Output size is

• Thus running time is

• n! is larger than 2^n, thus
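A short sketch makes the growth visible. It lists all permutations with the standard library's `itertools.permutations` and compares the number of permutations with 2^n. The name `list_permutations` is a hypothetical helper.

```python
from itertools import permutations
from math import factorial


def list_permutations(n):
    """Return every permutation of the integers 1 through n."""
    return list(permutations(range(1, n + 1)))


if __name__ == "__main__":
    for n in range(1, 9):
        # Columns: n, number of permutations produced, n!, 2^n
        print(n, len(list_permutations(n)), factorial(n), 2 ** n)
```

The list has n! entries, so just writing the output takes at least n! steps. From n = 4 onward n! exceeds 2^n, and the gap widens quickly.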


NP-Complete Problems

There is an important class of problems that

These problems are called

These problems have the following characteristic:

•

•

Many real-world problems in science, math, engineering, operations research, etc. are NP-complete.


Traveling Salesman Problem

An example NP-complete problem is the

Given a set of cities and the distances between them, determine an order in which to

A candidate solution for TSP is

To check whether the allowed mileage is exceeded, add up the distances between adjacent cities in the listing, which will take

But the total number of different candidate solutions is
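The contrast between checking one candidate and searching all candidates can be sketched as follows. The setup is an assumption for illustration: cities are numbered 0 to n − 1, `dist` is a symmetric distance matrix, a candidate is a listing of every city exactly once, and the trip does not return to its starting city. The names `tour_length`, `verify`, and `brute_force` are hypothetical.

```python
from itertools import permutations


def tour_length(dist, order):
    """Add up the distances between adjacent cities in the listing."""
    return sum(dist[a][b] for a, b in zip(order, order[1:]))


def verify(dist, order, limit):
    """Check one candidate: every city listed once, and mileage within limit."""
    visits_all = sorted(order) == list(range(len(dist)))
    return visits_all and tour_length(dist, order) <= limit


def brute_force(dist, limit):
    """Try every ordering of the cities; return the first within the limit."""
    for order in permutations(range(len(dist))):
        if tour_length(dist, order) <= limit:
            return list(order)
    return None


# Made-up distances between four cities, for illustration only.
EXAMPLE = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
```

With these made-up distances, `verify(EXAMPLE, [0, 1, 3, 2], 9)` succeeds after adding three numbers, while `brute_force` may have to walk through all n! orderings before it can report that no listing fits the limit.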


P vs. NP

Imagine an (unrealistically) powerful model of computation in which the computer first makes a lucky guess (a nondeterministic choice) as to a candidate solution in constant time, and then behaves as an ordinary computer and verifies the solution.

Problems solvable on this computer in polynomial time are

NP includes

Having polynomial running time on this funny computer would not seem to ensure polynomial running time on a real computer.

That is, it seems likely that

But no one has yet been able to prove P ≠ NP. Outstanding open question in CS since the 1970's.


Computability Theory

Complexity theory focuses on

Computability theory focuses on

We will focus on computing (mathematical) functions, with inputs and outputs.

We would like to know if there exist functions that


Church-Turing Thesis

First, we have to decide what constitutes an algorithm.

• Assembly languages have

• High-level languages have

•

Church-Turing thesis: (“thesis” means “conjecture”) Anything that can reasonably be considered an algorithm can be

A Turing machine is a

Thus, for theoretical purposes,


Computing Functions

Some sample functions:

• f(n) = 3:

• f(n) = 2n:

• f(n) = sin n:

There exist non-computable functions, functions whose input/output relationships are so complicated that there is no

We will assume

• your

• with a

• only consider


Goedel Number of a Program

Here is a way to convert a program into an integer.

•

•

Conversely, any integer can be converted

• Most of the time,

• Sometimes it

• Rarely,

• More rarely,

Use this numbering scheme to
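One possible numbering scheme, given here purely as an assumed illustration and not necessarily the one intended on the slide, is to read the bytes of the program text as a single large binary integer. The names `goedel_number` and `program_text` are hypothetical.

```python
def goedel_number(source):
    """Interpret the UTF-8 bytes of the program text as one big integer."""
    return int.from_bytes(source.encode("utf-8"), "big")


def program_text(number):
    """Convert an integer back to text, or None if the bytes are not text."""
    raw = number.to_bytes((number.bit_length() + 7) // 8, "big")
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None  # this integer does not even correspond to valid text
```

Converting `"print(1)"` to an integer and back recovers the same text. Going the other way, an arbitrary integer usually decodes to gibberish that is not a legal program at all, and only occasionally to something that compiles and runs.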


An Uncomputable Function

Define a function h called the

• If the program with Goedel number n halts when its input is n, then

• If the program with Goedel number n does not halt when its input is n, then

Theorem: h is uncomputable.

Proof: Assume in contradiction that h is computable. Then

Define another program I (which will be in the listing):

1. n

2. run program H

3. let x be

4. if x = 0 then

5. else


An Uncomputable Function (cont’d)

Let n_I be the Goedel number of I.

Case 1:

Case 2:

Thus the hypothetical program H

□

Another way to view this result is that
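The two cases of the proof can be checked mechanically with a small model. This sketch rests on assumptions that the slides leave blank: H answers 1 for "halts" and 0 for "does not halt", and program I halts when H answers 0 and loops forever otherwise. It does not run any real halting test, which is the very thing that cannot exist. It only records what I would do for each possible answer from H. The names `behaviour_of_I` and `h_is_correct` are hypothetical.

```python
def behaviour_of_I(h_answer):
    """What program I does on input n_I, given H's answer about that run.

    Assumed convention: 1 means "halts", 0 means "does not halt".
    Assumed program I: if H answers 0 then halt, else loop forever.
    """
    return "halts" if h_answer == 0 else "loops forever"


def h_is_correct(h_answer):
    """Compare H's claim about I on input n_I with what I actually does."""
    claimed = "halts" if h_answer == 1 else "loops forever"
    return claimed == behaviour_of_I(h_answer)


if __name__ == "__main__":
    for answer in (0, 1):
        print(answer, behaviour_of_I(answer), h_is_correct(answer))
```

Whichever value H returns for the pair (n_I, n_I), program I does the opposite, so `h_is_correct` is false for both 0 and 1.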
