queues and stacks. can receive multiple requests from multiple sources ◦ how do we services these...
TRANSCRIPT
Can receive multiple requests from multiple sources◦ How do we services these requests?
First come, first serve processing Priority based processing
◦ Buffering of requests, as they might arrive faster than they can be processed
You could always use a List structure, with an integer value associated with the item, and then append it to the List using the Add() method◦ Inefficient
Queue
Two jobs added to List
Why List is problematic
Job 1 is processed and slot becomes available
Job 3 grabs first available slot, and Job 4 gets the next available slot
nextJobPost keeps track of the “next” job to be processed in the List
Why List is problematic List will continue to grow, even if jobs are
processed right away◦ The default is to double the size, when the list
requires additional “slots” No reclaiming of the already used slots is done with
Lists If you do reclaim the “used” slots in the List,
then your first-come, first-serve processing scheme will not work
A List represents a linear order
When adding an item, once the last item is used, the “next” will “wrap around” to the 0th item in the array/list◦ A “modulus” function is used to “wrap around”
What happens if all items are filled, and you still need another item?◦ Resize the circular array…!
This is done in the Queue class
A “circular” List
Add / remove buffer items◦ First-come, first-serve (FIFO)◦ Manage space utilization◦ Uses Generics
Type-safe Methods
◦ Enqueue() Adds elements at the “tail” index If not enough space, default growth factor of 2.0 is used to resize
Class constructor can specify other growth factor
◦ Dequeue() Returns the current element from the “head” index Sets the “head” element to null and increments the “head” index
◦ Peek() Allows you to see the head element, without a dequeue, or increasing the head index
counter◦ Contains()
Determine if a specific item exists in the Queue◦ ToArray()
Returns an array containing the Queue’s elements
Queue
LIFO structure Uses a circular array, as does the Queue Methods
◦ Push() Adds an item to the stack
◦ Pop() Removes and returns the item on the “top” of the stack
Size is increased, as required (same as the Queue’s growth factor)
Call Stack as used by the CLR is an example of this structure◦ When calling a function, Push its information onto the stack◦ When returning from that routine, Pop it from the stack and
expose the routine to which it returns control
Stacks
Problem: We often don’t know the “position” of an element within an array◦ Potentially we process all elements before finding the
one we need Reduce the O-time to O(1)
◦ Build an array capable of holding all SS#’s◦ Each element would hold a record based on the SS# as a
“key”◦ Waste
109 possible values, but you only have 1,000 employees Utilization would be 0.0001% of the array
Hashing allows us to “compress” this ordinal indexing
Hashtables
Use the last 4 digits (or 3, or 5) of the SS#◦ Mathematical transformation (mapping) of a nine-
digit value to a four-digit value◦ Array ranges from 0000 to 9999
Constant lookup time (O-time) Better utilization of space Hash table
◦ Array which uses hashing to compress the indexers
Hash function◦ Function which performs the hashing
Hashtables
H(x) = last four digits of x Collisions
◦ When multiple inputs to a hash function result in identical outputs 105 collisions for SS#’s ending in “0000”
◦ Collision of hash value results in attempting to store into a “slot” already occupied by a prior hash result
Hashing
Collision frequency is directly correlated to the hash function◦ SS# assumes that the last four digits are
uniformly distributed If year of birth, or geographical location alters the
distribution Increases collisions
◦ Collision avoidance is the selection of an appropriate hashing algorithm
◦ Collision resolution is locating another slot in the hashtable for entry placement
Collision avoidance / resolution
Linear probing◦ If collision in slot i occurs, proceed to the next
available slot (i+1), theni+2 and so on, if required Alice = 1234, Bob=1234, Cal=1237, Danny=1235,
Edward=1235 Insert Alice Insert Bob Insert Cal Insert Danny Insert Edward
Collision resolution
Searching◦ Start at the hash location, and then perform a linear
search from there until the value is located When/if you reach an empty slot your search value is
NOT in that hashtable Linear probing not very good resolution
◦ Leads to clustering of values Ideally you’d like a uniform distribution of values
Quadratic probing◦ Slot s is taken
Probe s+12, then s-12, then s+22, then s-22, and so on… Can still lead to clustering
Collision resolution
Rehashing◦ Used by the .NET Framework Hashtable class◦ Adding an item to the table
Provide item and unique key to access the item Item and key can be of any type
◦ Retrieving item Index the Hashtable by key
Collision resolution
//Note the use of the ContainsKey() Method, which returns a Booleanusing System;using System.Collections;public class HashtableDemo{ private static Hashtable employees = new Hashtable(); public static void Main() { // Add some values to the Hashtable, indexed by a string key employees.Add("111-22-3333", "Scott"); employees.Add("222-33-4444", "Sam"); employees.Add("333-44-5555", "Jisun"); // Access a particular key if (employees.ContainsKey("111-22-3333")) { string empName = (string) employees["111-22-3333"]; Console.WriteLine("Employee 111-22-3333's name is: " + empName); } else Console.WriteLine("Employee 111-22-3333 is not in the hash table..."); }}
Hashtable Code Example
// Step through all items in the Hashtable
foreach(string key in employees.Keys)Console.WriteLine("Value at employees[\"" + key + "\"] = " + employees[key].ToString());
The order of insertion and order of keys are not necessarily the same◦ Depends on the slot the key was stored in
depends on the hash value of the key Depends on the collision resolution used
◦ The output from the above code results in:
Value at employees["333-44-5555"] = JisunValue at employees["111-22-3333"] = ScottValue at employees["222-33-4444"] = Sam
Hashtable Code Example
Function returns an ordinal value◦ Slot # for the key◦ Function can accept a key of any type◦ GetHashCode()
Any object can be represented as a unique number
Hashtable Class: Hash Function
Rehashing (double hashing)
◦ Set of hash functions H1… Hn
◦ H1 is initially used If collision, then H2 is used, and so on
They differ by multiplicative factors
◦ Each slot in the hash table is visited exactly once when hashsize number of probes are made For a given key, Hi and Hj cannot hash to the same slot in the
table This can work if the results of (1 + (((GetHash(key) >> 5) + 1) %
(hashsize – 1)) and hashsize are “relatively prime” They share no common factors
Guaranteed to be prime if hashsize is a prime number
◦ Better collision avoidance than linear or quadratic probing
Hashtable Class: Collision Resolution
Hashtable class◦ Property: loadFactor
Max ratio of items in the Hash to the total slots in the table 0.5 at most, half the slots can be used, and the other
half must remain empty Values range from 0.1 to 1.0
Microsoft has a default “scaling factor” of 72% If you pass 1.0 to the loadFactor property, it’s still only
0.72 behind the scenes Performance issue
Hashing: Load Factors
Hashtable class◦ Add() method
Performs a check against the loadFactor If exceeded, the Hashtable is expanded
◦ Expansion Slot count is approximately doubled
From the current prime number to the next largest prime number value
Hash value depends on the number of total slots All values in the table need to be rehashed when the
table expands Occurs behind the scenes during Add() method
Hashing
loadFactor ◦ Affects the size of the hash table and number of
probes required on a collision High load factor
Denser hash table, but more collisions Expected number of probes needed when a collision
happens 1/(1-loadFactor)
Default 0.72 loadFactor results in 3.5 probes per collision on average Does not vary based on number of items in the table Asymptotic access time is O(1)
Much more desirable that the O(n) search time for an array
Hashtable
Hashtable is “loosely-typed” structure◦ Developer can add keys and values of any type to
the table Generics allow us to have type-safe implementations
of a class Dictionary class is a “type-safe” class
◦ Types the keys and the values◦ You must specify the types for keys/values when
creating the Dictionary instance◦ Once created, you can add and remove items,
just like the Hashtable
Dictionary Class
Collision resolution◦ Different from the Hashtable◦ Chaining is used
Secondary data structure is used for the collisions◦ Each slot in the Dictionary contain an array of
elements A collision prepends the element to the bucket’s list
Dictionary class
8 buckets (example)◦ Employee object is added to the bucket that its key
hashes to If already occupied, item is prepended
Searching and removing items from a chained hashtable◦ Time proportional to total items and number of
buckets O(n/m)
n=total elements m= total buckets
◦ Dictionary class implemented n=m at all times
Dictionary class