lecture 41: semester review
LECTURE 41: SEMESTER REVIEW
CSC 213 – Large Scale Programming
Final Exam
Tues., May 10 from 10:15 – 12:15 in OM 200
  Plan on exam taking full 2 hours
  If major problem, come talk to me ASAP
Exam covers material from entire semester
  Open-book & open-note so bring what you’ve got
  My handouts, solutions, & computers are not allowed
  Cannot collaborate with a neighbor on the exam
  Problems will be in a similar style to 2 midterms
Lab mastery: 2:45 – 3:45 on Thurs., May 12 in OM119
Lazy
Contemplative
Always Using Imagination
Most Important Trait
Critical Property of Test
All good tests FAIL
Loop Testing: Simple Loops
Loop executed at most n times, try inputs that:
  Skip loop entirely
  Make 1 pass through the loop
  Make 2 passes through the loop
  Make m passes through the loop, where (m < n)
  If possible, n-1, n, & (n+1) passes through the loop
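The checklist above can be sketched as a small test driver. This is only an illustration: `sum_first` and `loop_test_sizes` are hypothetical names, and the values n = 5 and m = 3 are assumptions chosen for the example.

```python
def sum_first(values, n):
    """Sum at most the first n items of values (the loop runs at most n times)."""
    total = 0
    for i, v in enumerate(values):
        if i >= n:
            break
        total += v
    return total


def loop_test_sizes(n, m):
    """Input sizes from the checklist: 0, 1, 2, m (< n), n-1, n, and n+1 passes."""
    return [0, 1, 2, m, n - 1, n, n + 1]


n = 5
results = {}
for size in loop_test_sizes(n, m=3):
    data = [1] * size                   # every item is 1, so the sum counts passes
    results[size] = sum_first(data, n)  # should never exceed n
```

The interesting case is the last one: asking for n + 1 passes must still stop after n.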
Indexed File Format
Split information into two (or more) files
  Data file uses fixed-size records to store data
  Index files contain search terms & location record starts
Fixed-size records usually used in data file
  Each record will use exactly that much space
  Extra space wasted if the value is smaller
  But limits data size, cannot get more space
  Makes it far easier to reuse space & rebuild index
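A minimal sketch of the idea, assuming a made-up record layout (16-byte name plus 4-byte integer). An in-memory buffer stands in for the data file and a plain dict stands in for the index file; all names and values here are illustrative.

```python
import io
import struct

# Assumed record layout: 16-byte name field followed by a 4-byte integer.
RECORD = struct.Struct("<16si")


def append_record(data_file, index, name, value):
    """Write one fixed-size record at the end and remember where it starts."""
    offset = data_file.seek(0, io.SEEK_END)
    data_file.write(RECORD.pack(name.encode("utf-8"), value))
    index[name] = offset


def read_record(data_file, index, name):
    """Jump straight to the record using the offset stored in the index."""
    data_file.seek(index[name])
    raw_name, value = RECORD.unpack(data_file.read(RECORD.size))
    return raw_name.rstrip(b"\x00").decode("utf-8"), value


data_file = io.BytesIO()   # stands in for the data file on disk
index = {}                 # stands in for the index file
append_record(data_file, index, "ada", 36)
append_record(data_file, index, "grace", 85)
```

Short names waste the rest of their 16 bytes, and longer names would be cut off, which is the tradeoff the slide describes.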
Entry ADT
Needs 2 pieces: what we have & what we want
  First part is the key: data used in search
  Item we want is value; the second part of an Entry
Implementations must define 2 methods
  key() & value() return appropriate item
  Usually includes setValue() but NOT setKey()
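One way an Entry could look, sketched in Python rather than the course's own language. The class below is an assumption for illustration; `set_value` plays the role of setValue(), and there is deliberately no way to change the key.

```python
class Entry:
    """A key-value pair; the key is fixed once the Entry is created."""

    def __init__(self, key, value):
        self._key = key
        self._value = value

    def key(self):
        return self._key

    def value(self):
        return self._value

    def set_value(self, value):
        """Replace the value and return the old one."""
        old = self._value
        self._value = value
        return old


entry = Entry("apple", 3)
old_value = entry.set_value(4)
```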
What is a MAP?
At simplest level, Map is collection of Entries
  key-value pairs serve as the basic data in a Map
  size() & isEmpty() work at level of Entry
Searchable data stored using Maps
  put() adds an Entry so key is mapped to the value
  get() retrieves value associated with key from Map
  remove() deletes entire Entry
At most one value per key using a Map
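A list-based sketch of these operations. `ListMap` is a hypothetical class written for this example; it stores each Entry as a two-item list and scans the list on every call, so each method takes O(n) time.

```python
class ListMap:
    """Unordered list of [key, value] pairs with at most one pair per key."""

    def __init__(self):
        self._entries = []

    def size(self):
        return len(self._entries)

    def is_empty(self):
        return not self._entries

    def get(self, key):
        for k, v in self._entries:
            if k == key:
                return v
        return None

    def put(self, key, value):
        """Map key to value; return the value it replaced, or None."""
        for entry in self._entries:
            if entry[0] == key:
                old = entry[1]
                entry[1] = value
                return old
        self._entries.append([key, value])
        return None

    def remove(self, key):
        """Delete the whole Entry for key; return its value, or None."""
        for i, (k, v) in enumerate(self._entries):
            if k == key:
                del self._entries[i]
                return v
        return None


m = ListMap()
first_put = m.put("a", 1)
second_put = m.put("a", 2)   # same key: the value is replaced, not added
```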
Dictionary ADT
DICTIONARY ADT very similar to MAP
  Hold searchable data in each of these ADTs
  Both data structures are collections of Entries
  Convert key to value using either concept
DICTIONARY can have multiple values to one key
  1 value for key is still legal option: “pantsless”
  Also many Entries with same key but different value: “cool”, “cool”
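The difference from a Map can be sketched with a list-based Dictionary. `ListDictionary` and its method names are assumptions for this example, and the definitions stored as values are invented placeholders.

```python
class ListDictionary:
    """Unordered list of (key, value) pairs; a key may appear many times."""

    def __init__(self):
        self._entries = []

    def size(self):
        return len(self._entries)

    def insert(self, key, value):
        """Always adds a new Entry, even if the key is already present."""
        entry = (key, value)
        self._entries.append(entry)
        return entry

    def find(self, key):
        """Return one Entry with this key, or None."""
        for entry in self._entries:
            if entry[0] == key:
                return entry
        return None

    def find_all(self, key):
        """Return every Entry with this key."""
        return [entry for entry in self._entries if entry[0] == key]


d = ListDictionary()
d.insert("pantsless", "placeholder definition")
d.insert("cool", "placeholder definition 1")
d.insert("cool", "placeholder definition 2")
```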
Using Hash Properly
Normally, table holds one Entry per index
  Need to be smarter when keys collide
Efficiency is critical
  If we do not care, use List-based approach
Several common schemes used to provide speed
  Each form of probing has strengths & weaknesses
Must consider bad hash effects before using
  If this O(n) time unacceptable, use other “leafy plant” (a tree)
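Linear probing is one common collision scheme. The sketch below is a simplified illustration, not the course's implementation: `ProbingHashMap` is a hypothetical class with a fixed capacity, no resizing, and no removal (removal would need extra bookkeeping).

```python
class ProbingHashMap:
    """Open-addressing hash table that resolves collisions by linear probing."""

    def __init__(self, capacity=8):
        self._table = [None] * capacity

    def _find_slot(self, key):
        """Index holding key, or the first empty index on its probe path."""
        start = hash(key) % len(self._table)
        for step in range(len(self._table)):
            i = (start + step) % len(self._table)
            slot = self._table[i]
            if slot is None or slot[0] == key:
                return i
        return None   # table is full and key is not in it

    def put(self, key, value):
        i = self._find_slot(key)
        if i is None:
            raise RuntimeError("table is full")
        self._table[i] = (key, value)

    def get(self, key):
        i = self._find_slot(key)
        if i is None or self._table[i] is None:
            return None
        return self._table[i][1]


table = ProbingHashMap()
table.put(1, "one")
table.put(9, "nine")   # 9 % 8 == 1, so this collides with key 1
```

With a bad hash function many keys share one probe path, and every lookup degrades toward the O(n) scan the slide warns about.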
Binary Search Trees
Implements a BinaryTree for searching
  Map or Dictionary will be ADT exposed
  Data organized to make usage efficient (maybe)
Strict ordering maintained in tree
  Nodes to the left are smaller
  Larger keys in right child of node
  Equal values not specified
  No problem, just be consistent
[Figure: example binary search tree]
BST Performance
Search, insert, & remove take O(h) time
  h is height of tree
Height’s best case is complete tree at O(log n)
Worst case is BST shaped like a linked list, with O(n) height
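The effect of insertion order on height can be seen in a small sketch. The `Node` class and helper functions are assumptions for this example; here an empty tree has height 0 and a single node has height 1.

```python
class Node:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.left = None
        self.right = None


def insert(root, key, value):
    """Insert into the subtree at root and return the (possibly new) root."""
    if root is None:
        return Node(key, value)
    if key < root.key:
        root.left = insert(root.left, key, value)
    elif key > root.key:
        root.right = insert(root.right, key, value)
    else:
        root.value = value          # equal key: replace the value
    return root


def search(root, key):
    """Follow one root-to-leaf path, so the cost is O(h)."""
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return None if root is None else root.value


def height(root):
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k, str(k))
    return root


bushy = build([4, 2, 6, 1, 3, 5, 7])   # good order: complete tree
chain = build([1, 2, 3, 4, 5, 6, 7])   # sorted order: linked-list shape
```

Both trees hold the same seven keys, but the second one forces a search to walk every node.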
AVL Tree Definition
Fancy type of BST
  O(log n) time provided
  For this, needs more info
Keep tree balanced by…
  Checking heights of kids
  Only let differ by 0 or 1
Fix larger differences by Trinode Restructuring
  Shifts nodes in the BST
  For balance maintenance
[Figure: example AVL tree; node heights are shown in blue]
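The height rule can be checked with a short sketch. Trees are written here as nested `(key, left, right)` tuples, which is an assumption made to keep the example small; `rotate_right` shows only a single rotation, the simplest case of restructuring.

```python
def check(tree):
    """Return (height, is_avl) for a tree given as (key, left, right) or None."""
    if tree is None:
        return 0, True
    _, left, right = tree
    left_height, left_ok = check(left)
    right_height, right_ok = check(right)
    balanced_here = abs(left_height - right_height) <= 1
    return 1 + max(left_height, right_height), left_ok and right_ok and balanced_here


def rotate_right(tree):
    """Lift the left child above its parent, keeping BST order."""
    key, (left_key, left_left, left_right), right = tree
    return (left_key, left_left, (key, left_right, right))


leaf = lambda k: (k, None, None)
chain = (3, (2, leaf(1), None), None)   # kids of 3 have heights 2 and 0
fixed = rotate_right(chain)             # 2 becomes the root
```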
Building a SplayTree
Another approach which builds upon BST
  Not an AVLTree, however, but a new BST subclass
Concept Behind SplayTree
Splay trees do NOT maintain balance
  Recently used nodes clustered near top of BST
  Most recently accessed nodes take O(1) time
  Other nodes may need O(n) time to access
Usually very efficient, but provides no guarantees
Red-Black Tree
Root Property: Root node painted black
External Property: Leaves are painted black
Internal Property: Red nodes’ children are black
Depth Property: Leaves have identical black depth
  Number of black ancestors for the node
[Figure: example red-black tree]
Map & Dictionary ADT

Implementation        Searching   Adding      Removing
Ordered List          O(log n)    O(n)        O(n)
Unordered List        O(n)        O(n)/O(1)   O(n)
Hash                  O(n)        O(n)        O(n)
  if lucky/good       O(1)        O(1)        O(1)
BST                   O(n)        O(n)        O(n)
AVL / balanced        O(log n)    O(log n)    O(log n)
Splay (expected)      O(log n)    O(log n)    O(log n)
Splay (worst-case)    O(n)        O(n)        O(n)
Sorting is a Dance
Merge Sort Execution Tree
Show steps used to sort all of the data
(each node: input → sorted output; children are indented)

7 2 9 4 3 8 6 1 → 1 2 3 4 6 7 8 9
  7 2 9 4 → 2 4 7 9
    7 2 → 2 7
      7 → 7
      2 → 2
    9 4 → 4 9
      9 → 9
      4 → 4
  3 8 6 1 → 1 3 6 8
    3 8 → 3 8
      3 → 3
      8 → 8
    6 1 → 1 6
      6 → 6
      1 → 1
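The execution tree corresponds to a recursive routine like the one below. This is a generic merge sort sketch using new lists for each half, not necessarily the version shown in class.

```python
def merge(left, right):
    """Combine two sorted lists; this is where every comparison happens."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result


def merge_sort(data):
    if len(data) <= 1:              # leaf of the execution tree
        return list(data)
    mid = len(data) // 2            # divide blindly in half
    return merge(merge_sort(data[:mid]), merge_sort(data[mid:]))


sorted_data = merge_sort([7, 2, 9, 4, 3, 8, 6, 1])
```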
Quick Sort
Divide: Partition by pivot
  L has values <= p
  G uses values >= p
Recur: Sort L and G
Conquer: Merge L, p, G
[Figure: data partitioned around pivot p into L and G]
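A short sketch of the three steps. It is a variant of the slide's description: values equal to the pivot go into their own list E instead of L or G, and the middle element is used as the pivot; both choices are assumptions made for this example. It also builds new lists rather than partitioning in place.

```python
def quick_sort(data):
    if len(data) <= 1:
        return list(data)
    p = data[len(data) // 2]              # pivot choice is an assumption
    L = [x for x in data if x < p]        # divide: all comparisons occur here
    E = [x for x in data if x == p]
    G = [x for x in data if x > p]
    return quick_sort(L) + E + quick_sort(G)   # recur, then join L, p, G


quick_sorted = quick_sort([7, 2, 9, 4, 3, 8, 6, 1])
```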
Quick Sort v. Merge Sort

Quick Sort
  Divide data around pivot
    Want pivot to be near middle
    All comparisons occur here
  Conquer with recursion
    Does not need extra space
  Merge usually done already
    Data already sorted!

Merge Sort
  Divide data blindly in half
    Always gets even split
    No comparisons performed!
  Conquer with recursion
    Needs* to use other arrays
  Merge combines solutions
    Compares from (sorted) halves
Bucket & Radix Sort
Sort data written as tuple of enumerable data
  Consumption of wine overall, in liters
  Annual per capita consumption of liters
Sort one place in tuple using bucket sort
  Uses 1 bucket per value that could be enumerated
  When there are ties, preserve relative ordering
Repeat stable sorts to perform radix sort
  Must preserve relative ordering, like bucket sort
  From least to most important sort each tuple place
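Radix-Sort In Action
List of 4-bit integers sorted using RADIX-SORT
(one stable pass per bit, from least to most significant)

Input   After bit 1   After bit 2   After bit 3   After bit 4
        (rightmost)                               (leftmost)
1001    0010          1001          1001          0001
0010    1110          1101          0001          0010
1101    1001          0001          0010          1001
0001    1101          0010          1101          1101
1110    0001          1110          1110          1110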
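The same 4-bit example can be traced with a short sketch. The function names are assumptions; each pass is a stable bucket sort with two buckets (bit value 0 and bit value 1).

```python
def bucket_sort_by_bit(values, bit):
    """Stable bucket sort on one bit: ties keep their relative order."""
    buckets = [[], []]
    for v in values:
        buckets[(v >> bit) & 1].append(v)
    return buckets[0] + buckets[1]


def radix_sort(values, bits=4):
    """Sort by each bit from least to most significant; record every pass."""
    passes = []
    for bit in range(bits):
        values = bucket_sort_by_bit(values, bit)
        passes.append([format(v, "0{}b".format(bits)) for v in values])
    return values, passes


data = [0b1001, 0b0010, 0b1101, 0b0001, 0b1110]
result, passes = radix_sort(data)
```

No two values are ever compared with each other, which is how this sort sidesteps the comparison-based lower bound.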
Lower Bound on Sorting
Smallest number of comparisons is tree’s height
  Decision tree sorting n elements has n! leaves
  At least log(n!) height needed for this many leaves
  As we saw, this simplifies to a height of at least n log n, up to a constant factor
  So time proportional to n log n is needed to compare data!
Practical lower bound, but cheating can do better
  Need enumerable tuples - cannot always cheat
“If you believe radix hypothesis” it takes O(n) time
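One standard way to see why log(n!) grows like n log n is to keep only the larger half of the factors. This derivation is a sketch added for review and may differ from the one used in lecture.

```latex
\log_2(n!) \;=\; \sum_{i=1}^{n} \log_2 i
\;\ge\; \sum_{i=\lceil n/2 \rceil}^{n} \log_2 i
\;\ge\; \frac{n}{2}\,\log_2\frac{n}{2}
```

The last step holds because the second sum has at least n/2 terms and each term is at least log(n/2). Since log(n!) is also at most n log n, the height of the decision tree is proportional to n log n.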
[Figure: example graph of a computer network; vertex labels: John, David, Paul, brown.edu, cox.net, cs.brown.edu, att.net, qwest.net, math.brown.edu, cslab1a, cslab1b]
Graph Applications
Electronic circuits
Transportation networks
Databases
Packing suitcases
Finding terrorists
Scheduling college’s exams
Assigning classes to rooms
Garbage collection
Coloring countries on a map
Playing minesweeper
Edge List Structure
Simplest Graph
  Space efficient
  No change to use with directed or undirected
Fields
  Sequence of vertices
  Sequence of edges
[Figure: graph with vertices u, v, w, z and edges a, b, c, d, stored as a sequence of vertices and a sequence of edges]
Adjacency-List Implementation
Vertex has Sequence of Edges
  Edges still refer to Vertex
Ideas in Edge-List serve as base
  Extends Vertex
  Add Position reference to speed removal
[Figure: adjacency-list structure for vertices u, v, w and edges a, b]
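A rough sketch of the structure for an undirected graph. `AdjacencyListGraph` is a hypothetical class; edges are plain `(label, endpoint, endpoint)` tuples, the endpoints chosen for edges a and b are assumptions, and the Position references used to speed removal are left out.

```python
class AdjacencyListGraph:
    """Each vertex keeps its own list of incident edges."""

    def __init__(self):
        self._incident = {}          # vertex -> list of edges touching it

    def insert_vertex(self, v):
        self._incident.setdefault(v, [])

    def insert_edge(self, u, v, label):
        edge = (label, u, v)         # the edge still refers to both vertices
        self._incident[u].append(edge)
        self._incident[v].append(edge)
        return edge

    def incident_edges(self, v):
        """Costs deg(v): only this vertex's list is read."""
        return list(self._incident[v])

    def are_adjacent(self, u, v):
        """Scan the shorter of the two lists: min(deg(u), deg(v))."""
        if len(self._incident[u]) > len(self._incident[v]):
            u, v = v, u
        return any(v in (e[1], e[2]) for e in self._incident[u])


g = AdjacencyListGraph()
for name in ("u", "v", "w"):
    g.insert_vertex(name)
g.insert_edge("u", "v", "a")   # assumed endpoints for edge a
g.insert_edge("u", "w", "b")   # assumed endpoints for edge b
```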
Adjacency Matrix Structure
Edge-List structure still used as base
Vertex stores int
  Index found in matrix
Adjacency matrix in Graph class
  null if not adjacent -or-
  Edge incident to both vertices
[Figure: vertices u, v, w with indices 0, 1, 2; edges a and b; 3 x 3 adjacency matrix]
Adjacency Matrix Structure
Undirected edges stored in both array locations
Directed edges only in array from source to target
[Figure: vertices u, v, w with indices 0, 1, 2; edges a and b; 3 x 3 adjacency matrix]
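The matrix idea can be sketched as follows. `MatrixGraph` is a hypothetical class with a fixed vertex set; `None` plays the role of null, and the endpoints of edges a and b are assumptions.

```python
class MatrixGraph:
    """Fixed set of vertices; cell [i][j] holds the edge or None."""

    def __init__(self, vertices):
        self._index = {v: i for i, v in enumerate(vertices)}  # vertex -> int
        n = len(vertices)
        self._matrix = [[None] * n for _ in range(n)]

    def insert_edge(self, u, v, label, directed=False):
        i, j = self._index[u], self._index[v]
        self._matrix[i][j] = label          # source -> target
        if not directed:
            self._matrix[j][i] = label      # undirected: both locations

    def are_adjacent(self, u, v):
        """One array lookup, so this takes constant time."""
        return self._matrix[self._index[u]][self._index[v]] is not None


mg = MatrixGraph(["u", "v", "w"])
mg.insert_edge("u", "v", "a")                   # undirected
mg.insert_edge("u", "w", "b", directed=True)    # directed from u to w
```

Adding a vertex would mean building a larger matrix, which is where the n^2 cost of insertVertex comes from.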
Asymptotic Performance
n vertices & m edges, no self-loops

                     Edge-List   Adjacency-List          Adjacency-Matrix
Space                n + m       n + m                   n^2
incidentEdges(v)     m           deg(v)                  n + deg(v)
areAdjacent(v,w)     m           min(deg(v), deg(w))     1
insertVertex(o)      1           1                       n^2
insertEdge(v,w,o)    1           1                       1
removeVertex(v)      m           deg(v)                  n^2
removeEdge(e)        1           1                       1
Just Messing With You
Taking up time just to keep you from:
Graphs Solve Many Problems…
Understand how it works & what it does:
  DFS finds connected components in tree form
  BFS connects vertices using minimal hops
  Dijkstra’s minimizes weight to each vertex
  Total weight of edges minimized with Prim-Jarnik
  Topological sort schedules vertices (when possible)
  Can compute reachability with Floyd-Warshall
Given problem, which algorithm would solve it?
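As one example from the list, BFS finds the minimal number of hops from a start vertex. The sketch assumes the graph is given as a dict mapping each vertex to a list of its neighbors; the small graph below is invented for illustration.

```python
from collections import deque


def bfs_hops(adjacency, start):
    """Return the fewest hops from start to every reachable vertex."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in hops:            # first visit is along a shortest path
                hops[w] = hops[v] + 1
                queue.append(w)
    return hops


# Illustrative undirected graph; "z" is not connected to the rest.
adjacency = {
    "u": ["v", "w"],
    "v": ["u", "x"],
    "w": ["u", "x"],
    "x": ["v", "w"],
    "z": [],
}
hops = bfs_hops(adjacency, "u")
```

Vertices missing from the result are unreachable from the start vertex.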
But Not All Problems…
Cost of Accessing Memory
How long memory access takes is also important
  Will make a major difference in time program takes
Easy memory aid to remember how this works:
  Beer
Multi-Way Search Tree
Nodes contain multiple elements
Tree grows up with leaves always at same level
Each internal node:
  At least 2 children
  Has 1 fewer Entry than children
  Entries sorted from smallest to largest
[Figure: multi-way search tree with root (11 24) and children (2 6 8), (15), (27 30)]
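Searching such a tree can be sketched using the keys from the figure. Representing a node as a `(keys, children)` pair is an assumption made for this example, with an empty child list marking a leaf.

```python
from bisect import bisect_left


def search(node, key):
    """Return True if key is stored somewhere in the multi-way search tree."""
    while node is not None:
        keys, children = node
        i = bisect_left(keys, key)            # position among the sorted keys
        if i < len(keys) and keys[i] == key:
            return True
        # Child i holds the keys that fall between keys[i-1] and keys[i].
        node = children[i] if children else None
    return False


# A node with 2 keys has 3 children: 1 fewer key than children.
tree = ([11, 24], [([2, 6, 8], []), ([15], []), ([27, 30], [])])
```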
Hints for Studying
Will NOT require memorizing:
  ADT’s methods
  Node implementations
  Big-Oh time proofs
  (Anything else you think of)
Hints for Studying
You should know (& be ready to look up):
  How ADTs & algorithms work (trace & big ideas)
  For each ADT implementation, its pros & cons
  Where & why each ADT would be used
  For each method, what it does & what it returns
  Big-Oh complexity & impact for important methods
Studying For the Exam
1. What does the ADT/implementation do?
Where in the real-world is this found?
2. How is the ADT, search tree, or sort used?
How would we apply it to solve a problem?
How is it used and why?
3. What is necessary for implementation?
Given implementation, why do we do it like that?
What tradeoffs does this implementation make?
“Subtle” Hint
Do NOT bother with memorization
Be able to access & use information quickly