scientific programming [24pt] data structuresdisi.unitn.it/~montreso/sp/slides/b02-strutture.pdf ·...
Post on 18-Mar-2019
216 Views
Preview:
TRANSCRIPT
Scientific Programming
Data structures
Alberto Montresor
Università di Trento
2018/11/21
This work is licensed under a Creative CommonsAttribution-ShareAlike 4.0 International License.
references
Table of contents1 Abstract data types (PSADS §1.8, §1.13)
SequencesSetsDictionaries
2 Elementary data structures implementationsLinked List (PSADS §3.19–§3.21)Dynamic vectors
3 Sets, Dictionary implementations in PythonDirect access tablesHash functionsHashtable implementation
4 Stack and queuesStack – Last-in, First-out (LIFO)Queue (PSADS §3.10–§3.12, §3.15–§3.16)
5 Some code examples
Abstract data types (PSADS §1.8, §1.13) Definitions
Introduction
Data
In a programming lan-guage, a piece of datais a value that can beassigned to a variable.
Abstract data type (ADT)
A mathematical model, given by acollection of values and a set of ope-rators that can be performed onthem.
Primitive abstract data types
Primitive abstract data types are provided directly by the language
Type Operators Notesint +,-,*,/, //, % 32-64 bits (264 � 1)long +,-,*,/, //, % Unlimited integer precisionfloat +,-,*,/, //, % 32-64 bitsboolean and, or, not True, False
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 1 / 69
Abstract data types (PSADS §1.8, §1.13) Definitions
Data types
"Specification" and "implementation" of an abstract data type
Specification: its “manual”, hides the implementation details fromthe userImplementation: the actual code that realizes the abstract datatype
Example
Real numbers vs IEEE-754
Don’t try this at home (or try it?):
>>> 0.1 + 0.2
0.30000000000000004
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 2 / 69
Abstract data types (PSADS §1.8, §1.13) Definitions
Data structure
Data structures
Data structures are collections of data, characterized more by theorganization of the data rather than the type of contained data.
How to describe data structures
a systematic approach to organize the collection of dataa set of operators that enable the manipulation of the structure
Characteristics of the data structures
Linear / Non linear (sequence)Static / Dynamic (content variation, size variation)
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 3 / 69
Abstract data types (PSADS §1.8, §1.13) Definitions
Data structures
Type Java C++ Python
Sequences
List, Queue, DequeLinkedList,ArrayList, Stack,ArrayDeque
list, forward_listvectorstackqueue, deque
listtupledeque
Sets
SetTreeSet, HashSet,LinkedHashSet
setunordered_set
set, frozenset
Dictionaries
MapHashTree, HashMap,LinkedHashMap
mapunordered_map
dict
Trees
- - -
Graphs
- - -
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 4 / 69
Abstract data types (PSADS §1.8, §1.13) Sequences
Sequence
Sequence
A dynamic data structure representing an "ordered" group of ele-ments
The ordering is not defined by the content, but by the relativeposition inside the sequence (first element, second element,etc.)Values could appear more than onceExample: [0.1, "alberto", 0.05, 0.1] is a sequence
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 5 / 69
Abstract data types (PSADS §1.8, §1.13) Sequences
Sequence
Operators
It is possible to add / remove elements, by specifying their positions = s1, s2, . . . , snthe element si is in position posi
It is possible to access directly some of the elements of the sequencethe beginning and/or the end of the list
having a reference to the position
It is possible to sequentially access all the other elements
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 6 / 69
Abstract data types (PSADS §1.8, §1.13) Sequences
Sequence – Specification
Sequence
% Return True if the sequence is emptyboolean isEmpty()
% Returns the position of the first elementPos head()
% Returns the position of the last elementPos tail()
% Returns the position of the successor of pPos next(Pos p)
% Returns the position of the predecessor of pPos prev(Pos p)
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 7 / 69
Abstract data types (PSADS §1.8, §1.13) Sequences
Sequence – Specification
Sequence (continue)% Inserts element v of type object in position p.% Returns the position of the new elementPos insert(Pos p, object v)
% Removes the element contained in position p.% Returns the position of the successor of p, which % becomes successor of thepredecessor of p
Pos remove(Pos p)
% Reads the element contained in position pobject read(Pos p)
% Writes the element v of type object in position pwrite(Pos p, object v)
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 8 / 69
Abstract data types (PSADS §1.8, §1.13) Sets
Sets
Set
A dynamic, non-linear data structure that stores an unordered collectionof values without repetitions.
We can consider a total order between elements as the orderdefined over their abstract data type, if present.
Operators
Basic operators:insert
delete
contains
Sorting operatorsMaximum
Minimum
Set operatorsunion
intersection
difference
Iterators:for x in S:
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 9 / 69
Abstract data types (PSADS §1.8, §1.13) Sets
Sets – Specifications
Set
% Returns the size of the setint len()
% Returns True if x belongs to the set; Python: x in S
boolean contains(object x)
% Inserts x in the set, if not already presentadd(object x)
% Removes x from the set, if presentdiscard(object x)
% Returns a new set which is the union of A and BSet union(Set A, Set B)
% Returns a new set which is the intersection of A and BSet intersection(Set A, Set B)
% Returns a new set which is the difference of A and BSet difference(Set A, Set B)
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 10 / 69
Abstract data types (PSADS §1.8, §1.13) Dictionaries
Dictionaries
Dictionary
Abstract data structure that represents the mathematical conceptof partial function R : D ! C, or key-value association
Set D is the domain (elements called keys)Set C is the codomain (elements called values)
Operators
Lookup the value associated to a particular key, if present, NoneotherwiseInsert a new key-value association, deleting potential associationthat are already present for the same keyRemove an existing key-value association
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 11 / 69
Abstract data types (PSADS §1.8, §1.13) Dictionaries
Dictionaries – Specification
Dictionary
% Returns the value associated to key k, if present; returns noneotherwise
object lookup(object k)
% Associates value v to key kinsert(object k, object v)
% Removes the association of key kremove(object k)
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 12 / 69
Abstract data types (PSADS §1.8, §1.13) Dictionaries
Comments
Concepts of sequences, sets, dictionaries are linkedKey sets/ value sets
Iterate over the set of keys
Some realizations are "natural"Sequence $ list
Queue $ queue as a list
Alternative implementations existSet as boolean vector
Queue as circular vector
The choice of the data structure has implications on theperformances
Dictionary as a hash table: lookup O(1), minimum search O(n)Dictionary as a tree: lookup O(log n), minimum search O(1)
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 13 / 69
Table of contents
1 Abstract data types (PSADS §1.8, §1.13)SequencesSetsDictionaries
2 Elementary data structures implementationsLinked List (PSADS §3.19–§3.21)Dynamic vectors
3 Sets, Dictionary implementations in PythonDirect access tablesHash functionsHashtable implementation
4 Stack and queuesStack – Last-in, First-out (LIFO)Queue (PSADS §3.10–§3.12, §3.15–§3.16)
5 Some code examples
Elementary data structures implementations Linked List (PSADS §3.19–§3.21)
List
List (Linked List)
A sequence of memory objects, containing arbitrary data and 1-2pointers to the next element and/or the previous one
Note
Contiguity in the list 6) contiguity in memoryAll the operations require O(1), but in some cases you need a lot ofsingle operations to complete an action
Possible implementations
Bidirectional / MonodirectionalWith sentinel / Without sentinelCircular / Non-circular
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 14 / 69
Elementary data structures implementations Linked List (PSADS §3.19–§3.21)
List
v1L v2 v3 vn nil…
nil v1L v2 vn nilv3 …
v1L v2 vnv3 …
L v1 v2 vn nil…
Monodirectional
Bidirectional
Bidirectional, circular
Monodirectional, with sentinel
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 15 / 69
Elementary data structures implementations Linked List (PSADS §3.19–§3.21)
Monodirectional list, Pythonclass Node:
def __init__(self,initdata):
self.data = initdata
self.next = None
def getData(self):
return self.data
def getNext(self):
return self.next
def setData(self,newdata):
self.data = newdata
def setNext(self,newnext):
self.next = newnext
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 16 / 69
Elementary data structures implementations Linked List (PSADS §3.19–§3.21)
Monodirectional list, Pythonclass UnorderedList:
def __init__(self):
self.head = None
def add(self,item):
temp = Node(item)
temp.setNext(self.head)
self.head = temp
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 17 / 69
Elementary data structures implementations Linked List (PSADS §3.19–§3.21)
Monodirectional list, Pythonclass UnorderedList:
def __init__(self):
self.head = None
def add(self,item):
temp = Node(item)
temp.setNext(self.head)
self.head = temp
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 18 / 69
Elementary data structures implementations Linked List (PSADS §3.19–§3.21)
Monodirectional list, Pythonclass UnorderedList: # Continued
def search(self,item):
current = self.head
found = False
while current != None and not found:
if current.getData() == item:
found = True
else:
current = current.getNext()
return found
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 19 / 69
Elementary data structures implementations Linked List (PSADS §3.19–§3.21)
Monodirectional list, Pythondef remove(self,item):
current = self.head
previous = None
found = False
while not found:
if current.getData() == item:
found = True
else:
previous = current
current = current.getNext()
if previous == None:
self.head = current.getNext()
else:
previous.setNext(current.getNext())
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 20 / 69
Elementary data structures implementations Dynamic vectors
Dynamic vectors
Lists in Python implemented through dynamic vectors
A vector of a given size (initial capacity) is allocatedWhen inserting an element before the end, all elements haveto be moved - cost O(n)
When inserting an element at the end (append), the cost isO(1) (just writing the element at first available slot)
Problem:It is not known how many elements have to be storedThe initial capacity could be insufficient
Solution:A new (larger) vector is allocated, the content is copied in the newvector, the old vector is released
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 21 / 69
Elementary data structures implementations Dynamic vectors
Dynamic vectors
Question
Which is the best approach?
Approach 1
If the old vector has size n, allocate a new vector of size dn. Forexample, d = 2
Approach 2
If the old vector has size n, allocate a new vector of size n+d, whered is a constant. For example, d = 16
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 22 / 69
Elementary data structures implementations Dynamic vectors
Amortized analysis - doubling
Actual cost of an append() operation:
ci =
(i 9k 2 Z+
0
: i = 2
k+ 1
1 otherwise
Assumptions:Initial capacity: 1Writing cost: ⇥(1)
n cost
1 1
2 1 + 2
0= 2
3 1 + 2
1= 3
4 1
5 1 + 2
2= 5
6 1
7 1
8 1
9 1 + 2
3= 9
10 1
11 1
12 1
13 1
14 1
15 1
16 1
17 1 + 2
4= 17
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 23 / 69
Elementary data structures implementations Dynamic vectors
Amortized analysis - doubling
Actual cost of n operations append():
T (n) =nX
i=1
ci
= n+
blogncX
j=0
2
j
= n+ 2
blognc+1 � 1
n+ 2
logn+1 � 1
= n+ 2n� 1 = O(n)
Amortized cost of a singleappend():
T (n)/n =
O(n)
n= O(1)
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 24 / 69
Elementary data structures implementations Dynamic vectors
Amortized analysis - increment
Actual cost of an append() operation:
ci =
(i (i mod d) = 1
1 altrimenti
Assumptions
Increment: d
Initial size: d
Writing cost: ⇥(1)
Example
d = 4
n cost
1 1
2 1
3 1
4 1
5 1 + d = 5
6 1
7 1
8 1
9 1 + 2d = 9
10 1
11 1
12 1
13 1 + 3d = 13
14 1
15 1
16 1
17 1 + 4d = 17
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 25 / 69
Elementary data structures implementations Dynamic vectors
Amortized analysis - increment
Actual cost of n operations append():
T (n) =nX
i=1
ci
= n+
bn/dcX
j=1
d · j
= n+ d
bn/dcX
j=1
j
= n+ d(bn/dc+ 1)bn/dc
2
n+
(n/d+ 1)n
2
= O(n2
)
Amortized cost of a singleappend():
T (n)/n =
O(n2
)
n= O(n)
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 26 / 69
Elementary data structures implementations Dynamic vectors
Reality check
Language Data structure Expansion factorGNU C++ std::vector 2.0
Microsoft VC++ 2003 vector 1.5Python list 1.125
Oracle Java ArrayList 2.0OpenSDK Java ArrayList 1.5
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 27 / 69
Elementary data structures implementations Dynamic vectors
Python – List
Operator Worst case Worst caseamortized
L.copy() Copy O(n) O(n)L.append(x) Append O(n) O(1)
L.insert(i,x) Insert O(n) O(n)L.remove(x) Remove O(n) O(n)L[i] Index O(1) O(1)
for x in L Iterator O(n) O(n)L[i:i+k] Slicing O(k) O(k)L.extend(s) Extend O(k) O(n+ k)x in L Contains O(n) O(n)min(L), max(L) Min, Max O(n) O(n)len(L) Get length O(1) O(1)
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 28 / 69
Table of contents
1 Abstract data types (PSADS §1.8, §1.13)SequencesSetsDictionaries
2 Elementary data structures implementationsLinked List (PSADS §3.19–§3.21)Dynamic vectors
3 Sets, Dictionary implementations in PythonDirect access tablesHash functionsHashtable implementation
4 Stack and queuesStack – Last-in, First-out (LIFO)Queue (PSADS §3.10–§3.12, §3.15–§3.16)
5 Some code examples
Sets, Dictionary implementations in Python
Dictionary: possible implementations
Unorderedarray
Orderedarray
LinkedList
RB Tree Idealimpl.
insert() O(1), O(n) O(n) O(1), O(n) O(log n) O(1)
lookup() O(n) O(log n) O(n) O(log n) O(1)
remove() O(n) O(n) O(n) O(log n) O(1)
Ideal implementation: hash tables
Choose a hash function h that maps each key k 2 U to an integerh(k)
The key-value hk, vi is stored in a list at position h(k)
This vector is called hash table
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 29 / 69
Sets, Dictionary implementations in Python
Hash table – Definitions
All the possible keys are contained in the universe set U of size uThe table is stored in list T [0 . . .m� 1] with size mAn hash function is defined as: h : U ! {0, 1, . . . ,m� 1}
Alberto Montresor
Cristian Consonni
Alessio Guerrieri
Edsger Dijkstra
Keys Hash function
…
Hash table
00
01
02
03
04
05
06
m-3
m-2
m-1
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 30 / 69
Sets, Dictionary implementations in Python
Collisions
When two or more keys in the dictionary have the same hashvalues, we say that a collision has happenedIdeally, we want to have hash functions with no collisions
Alberto Montresor
Cristian Consonni
Alessio Guerrieri
Edsger Dijkstra
Keys Hash functions
…
Hash table
00
01
02
03
04
05
06
m-3
m-2
m-1
Collision
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 31 / 69
Sets, Dictionary implementations in Python Direct access tables
Direct access tables
In some cases: the set U is already a (small) subset of Z+
The set of year days, numbered from 1 to 366The set of Kanto’s Pokemons, numbered from 1 to 151
Direct access tables
We use the identity function h(k) = k as hash functionWe select m = u
Problems
If u is very large, this approach may be infeasibleIf u is not large but the number of keys that are actually recordedis much smaller than u = m, memory is wasted
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 32 / 69
Sets, Dictionary implementations in Python Hash functions
Perfect hash functions
Definition
A hash function h is called perfect if h is injective, i.e.8k
1
, k2
2 U : k1
6= k2
) h(k1
) 6= h(k2
)
Examples
Students ASD 2005-2016N. matricola in [100.090, 183.864]
h(k) = k � 100.090,m = 83.774
Studentes enrolled in 2014N. matricola in [173.185, 183.864]
h(k) = k � 173.185,m = 10.679
Problems
Universe space is oftenlarge, sparse, unknown
To obtain a perfecthash function is difficult
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 33 / 69
Sets, Dictionary implementations in Python Hash functions
Hash functions
If collisions cannot be avoided
Let’s try to minimize their numberWe want hash functions that uniform distribute the keys intohash indexes [0 . . .m� 1]
Simple uniformity
Let P (k) be the probability that key k is inserted in the tableLet Q(i) be the probability that a key ends up in the i-thentry of the table
Q(i) =X
k2U :h(k)=i
P (k)
An hash function h has simple uniformity if:8i 2 [0, . . . ,m� 1] : Q(i) = 1/m
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 34 / 69
Sets, Dictionary implementations in Python Hash functions
Hash functions
To obtain a hash function with simple uniformity, the probability
distribution P should be known
Example
if U is given by real number in [0, 1[ and each key has the sa-me probability of being selected, then H(k) = bkmc has simpleuniformity
In the real world
The key distribution may unknown or partially known
Heuristic techniques are used to obtain an approximation of simpleuniformity
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 35 / 69
Sets, Dictionary implementations in Python Hash functions
How to realize a hash function
Assumption
Each key can be translated in a numerical, non-negative values, byreading their internal representation as a number.
Example: string transformation
ord(c): ordinal binary value of character c in ASCIIbin(k): binary representation of key k, by concatenating thebinary values of its charactersint(b): numerical value associated to the binary number b
int(k) = int(bin(k))
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 36 / 69
Sets, Dictionary implementations in Python Hash functions
Division method
Division method
Let m be a prime number
H(k) = int(k) mod m
Example
m = 383
H(”Alberto”) =18.415.043.350.787.183 mod 383 = 221
H(”Alessio”) =18.415.056.470.632.815 mod 383 = 77
H(”Cristian”) = 4.860.062.892.481.405.294 mod 383 = 130
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 37 / 69
Sets, Dictionary implementations in Python Hash functions
Multiplication method
Multiplication method
Le m be any size, for example 2
k
Let C be a real constant, 0 < C < 1
Let i = int(k)
H(k) = bm(C · i� bC · ic)c
Example
m = 2
16
C =
p5�1
2
H(”Alberto”) = 65.536 · 0.78732161432 = 51.598
H(”Alessio”) = 65.536 · 0.51516739168 = 33.762
H(”Cristian”) = 65.536 · 0.72143641000 = 47.280
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 38 / 69
Sets, Dictionary implementations in Python Hashtable implementation
Separate chaining
Idea
The keys with the same valueh are stored in amonodirectional list /dynamic vector
The H(k)-th slot in the hashtable contains the list/vectorassociated to k
© Alberto Montresor 20
Tecniche di risoluzione delle collisioni
✦ Liste di trabocco (chaining) ✦ Gli elementi con lo stesso
valore hash h vengono memorizzati in una lista
✦ Si memorizza un puntatore alla testa della lista nello slot A[h] della tabella hash
✦ Operazioni: ✦ Insert:
inserimento in testa
✦ Lookup, Delete: richiedono di scandire la lista alla ricerca della chiave
k2
0
m–1
k1 k4
k5 k6
k7 k3
k8
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 39 / 69
Sets, Dictionary implementations in Python Hashtable implementation
Separate chaining: complexity analysis
n Number of keys stored in the hash tablem Size of the hash table↵ = n/m Load factorI(↵) Average number of memory accesses to search a key that
is not in the table (insuccess)S(↵) Average number of memory accesses to search a key that
is not in the table (success)
Worst case analysis
All the keys are inserted in a unique listinsert(): ⇥(1)
lookup(), remove(): ⇥(n)
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 40 / 69
Sets, Dictionary implementations in Python Hashtable implementation
Separate chaining: complexity analysis
Average case analysis
Let’s assume the hash function has simple uniformity
Hash function computation: ⇥(1), to be added to all searches
How long the lists are?
The expected length of a listis equal to ↵ = n/m
© Alberto Montresor 22
Liste di trabocco: complessità
✦ Teorema: ✦ In tavola hash con concatenamento, una ricerca senza successo richiede un tempo
atteso Ө(1 + α)
✦ Dimostrazione: ✦ Una chiave non presente nella tabella può essere collocata in uno qualsiasi degli m
slot
✦ Una ricerca senza successo tocca tutte le chiavi nella lista corrispondente
✦ Tempo di hashing: 1 + lunghezza attesa lista: α → Θ(1+α)
k1 k41
α
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 41 / 69
Sets, Dictionary implementations in Python Hashtable implementation
Separate chaining: complexity analysis
Insuccess
When searching for a missingkey, all the keys in the listmust be read
Success
When searching for a keyincluded in the table, onaverage half of the keys in thelist must be read.
Expected cost: ⇥(1) + ↵ Expected cost: ⇥(1) + ↵/2
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 42 / 69
Sets, Dictionary implementations in Python Hashtable implementation
Separate chaining: complexity analysis
What is the meaning of the load factor?
The cost factor of every operation is influenced by the cost factor
If m = O(n), ↵ = O(1)
In such case, all operations are O(1) in expectation
If ↵ becomes too large, the size of the hash table can be doubledthrough dynamic vectors
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 43 / 69
Sets, Dictionary implementations in Python Hashtable implementation
Python and hash function/tables
Python sets and dict
Are implemented through hash tables
Sets are degenerate forms of dictionaries, where there are novalues, only keys
Unordered data structures
Order between keys is not preserved by the hash function; this iswhy you get unordered results when you print them
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 44 / 69
Sets, Dictionary implementations in Python Hashtable implementation
Python – set
Operation Average case Worst casex in S Contains O(1) O(n)S.add(x) Insert O(1) O(n)S.remove(x) Remove O(1) O(n)S|T Union O(n+m) O(n ·m)
S&T Intersection O(min(n,m)) O(n ·m)
S-T Difference O(n) O(n ·m)
for x in S Iterator O(n) O(n)len(S) Get length O(1) O(1)
min(S), max(S) Min, Max O(n) O(n)
n = len(S),m = len(T)
https://docs.python.org/2/library/stdtypes.html#set
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 45 / 69
Sets, Dictionary implementations in Python Hashtable implementation
Python – dict
Operation Average case Worst casex in D Contains O(1) O(n)D[] = Insert O(1) O(n)= D[] Lookup O(1) O(n)del D[] Remove O(1) O(n)for x in S Iterator O(n) O(n)len(S) Get length O(1) O(1)
n = len(S),m = len(T)
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 46 / 69
Sets, Dictionary implementations in Python Hashtable implementation
Implementing hash functions 1
Rule: If two objects are equal, then their hashes should be equal
If you implement __eq__(), then you should implement function
__hash__() as well
Rule: If two objects have the same hash, then they are likely to be equal
You should avoid to return values that generate collisions in your hash
function.
Rule: In order for an object to be hashable, it must be immutable
The hash value of an object should not change over time
1http://www.asmeurer.com/blog/posts/what-happens-when-you-mess-with-hashing-in-python/
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 47 / 69
Sets, Dictionary implementations in Python Hashtable implementation
Example
class Point:
def __init__(self, x, y):
self.x = x
self.y = y
def __eq__(self, other):
return self.x == other.x and self.y == other.y
def __hash__(self):
return hash( (self.x,self.y) )
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 48 / 69
Table of contents
1 Abstract data types (PSADS §1.8, §1.13)SequencesSetsDictionaries
2 Elementary data structures implementationsLinked List (PSADS §3.19–§3.21)Dynamic vectors
3 Sets, Dictionary implementations in PythonDirect access tablesHash functionsHashtable implementation
4 Stack and queuesStack – Last-in, First-out (LIFO)Queue (PSADS §3.10–§3.12, §3.15–§3.16)
5 Some code examples
Stack and queues Stack
Stack
Stack
A linear, dynamic data structure, in which the operation "remove" returns(and removes) a predefined element: the one that has remained in the datastructure for the least time
Stack
% Returns True if the stack is emptyboolean isEmpty()
% Returns the size of the stackboolean size()
% Inserts v on top of the stackpush(object v)
% Removes the top element of thestack and returns it to the caller
object pop()
% Read the top element of the stack,without modifying it
object peek()
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 49 / 69
Stack and queues Stack
Stack example
Stack Operation Stack Contents Return Values.isEmpty() [] True
s.push(4) [4]
s.push(’dog’) [4,’dog’]
s.peek() [4,’dog’] ’dog’
s.push(True) [4,’dog’,True]
s.size() [4,’dog’,True] 3
s.isEmpty() [4,’dog’,True] False
s.push(8.4) [4,’dog’,True,8.4]
s.pop() [4,’dog’,True] 8.4
s.pop() [4,’dog’] True
s.size() [4,’dog’] 2
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 50 / 69
Stack and queues Stack
Stack
Possible uses
In languages like Python:Compiler: To balance parentheses
In the the interpreter: A new activation
record is created for each function call
In graph analysis:To perform visits of the entire graph
Possible implementations
Through bidirectional listsreference to the top element
Through vectorslimited size, small overhead
top
top
A
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 51 / 69
Stack and queues Stack
Stack in procedural languagesdef func3(y):
z = y
y = y*y
return z+y
def func2(x):
y = x
x = x*3
y = x * func3(y)
return y+x
def func1(w):
x = w
w = w*2
x = w + func2(x)
return x+w
print(func1(5))
See it in code lens!
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 52 / 69
Stack and queues Stack
Stack in Pythonclass Stack:
def __init__(self):
self.items = []
def size(self):
return len(self.items)
def isEmpty(self):
return self.items == []
def pop(self):
return self.items.pop()
def peek(self):
return self.items[len(self.items)-1]
def push(self, item):
self.items.append(item)
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 53 / 69
Stack and queues Stack
Example of application: balanced parenthesis
Check whether the following sets of parentheses are balanced{ { ( [ ] [ ] ) } ( ) }
[ [ { { ( ( ) ) } } ] ]
[ ] [ ] [ ] ( ) { }
( [ ) ]
( ( ( ) ] ) )
[ { ( ) ]
These parentheses could be associated to sets, lists, tuples and/orarithmetic operations
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 54 / 69
Stack and queues Stack
Example of application: balanced parenthesisdef parChecker(parString):
s = Stack()
index = 0
for symbol in parString:
if symbol in "([{":
s.push(symbol)
else:
if s.isEmpty():
return False
else:
top = s.pop()
if not matches(top,symbol):
return False
return s.isEmpty()
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 55 / 69
Stack and queues Stack
Example of application: balanced parenthesisdef matches(openpar,closepar):
opens = "([{"
closers = ")]}"
return opens.index(openpar) == closers.index(closepar)
print(parChecker(’{{([][])}()}’))
print(parChecker(’[{()]’))
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 56 / 69
Stack and queues Queue
Queue – First-In, First-Out (FIFO)
Queue
A linear, dynamic data structure, in which the operation "remo-ve" returns (and removes) a predefined element: the one that hasremained in the data structure for the longest time)
Queue
% Returns True if queue is emptyboolean isEmpty()
% Returns the size of the queueint size()
% Inserts v at the end of thequeue
enqueue(object v)
% Extracts q from the beginningof the queue
object dequeue()
% Reads the element at the top ofthe queue
object top()
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 57 / 69
Stack and queues Queue
Queue example
Queue Operation Queue Contents Return Valueq.isEmpty() [] True
q.enqueue(4) [4]
q.enqueue(’dog’) [’dog’,4]
q.enqueue(True) [True,’dog’,4]
q.size() [True,’dog’,4] 3
q.isEmpty() [True,’dog’,4] False
q.enqueue(8.4) [8.4,True,’dog’,4]
q.dequeue() [8.4,True,’dog’] 4
q.dequeue() [8.4,True] ’dog’
q.size() [8.4,True] 2
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 58 / 69
Stack and queues Queue
Queue
Possible uses
To queue requests performed on alimited resource (e.g., printer)To visit graphs
Possible implementations
Through listsadd to the tail
remove from the head
Through circular arraylimited size, small overhead
head
tail
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 59 / 69
Stack and queues Queue
Queue in Python (wrong)
class Queue:
def __init__(self):
self.items = []
def isEmpty(self):
return self.items == []
def enqueue(self, item):
self.items.insert(0,item)
def dequeue(self):
return self.items.pop()
def size(self):
return len(self.items)
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 60 / 69
Stack and queues Queue
Queue based on circular array
Implementation based on the modulus operationPay attention to overflow problems (full queue)
head
head+n
head+n
enqueue(12)
dequeue() → 3
enqueue(15,17,33)
3 6 54 … … 43
3 6 54 … … 43 12
head head+n
6 54 … … 43 12
headhead+n
33 6 54 … … 43 12 15 17
head
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array
By MuhannadAjjan [CC BY-SA 4.0] (http://creativecommons.org/licenses/by-sa/4.0) via Wikimedia
CommonsAlberto Montresor (UniTN) SP - Data structure 2018/11/21 61 / 69
Stack and queues Queue
Queue based on circular array – Pseudocode
Queue
Queue(self , dim)self .A new int[0 . . . dim� 1]
self .cap dim % Maximum sizeself .head 0 % Head of the queueself .size 0 % Current size
top()if size > 0 then
return A[head]
dequeue()if size > 0 then
temp A[head]
head (head + 1)%cap
size size� 1
return temp
enqueue(v)if size < cap then
A[(head + size)%cap] vsize size + 1
size()return size
isEmpty()return size = 0
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 62 / 69
Stack and queues Queue
Python data structure - Dequeue
>>> from collections import deque
>>> Q = deque(["Eric", "John", "Michael"])
>>> Q.append("Terry") # Terry arrives
>>> Q.append("Graham") # Graham arrives
>>> Q.popleft() # The first to arrive now leaves
’Eric’
>>> Q.popleft() # The second to arrive now leaves
’John’
>>> Q # Remaining queue in order of arrival
deque([’Michael’, ’Terry’, ’Graham’])
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 63 / 69
Table of contents
1 Abstract data types (PSADS §1.8, §1.13)SequencesSetsDictionaries
2 Elementary data structures implementationsLinked List (PSADS §3.19–§3.21)Dynamic vectors
3 Sets, Dictionary implementations in PythonDirect access tablesHash functionsHashtable implementation
4 Stack and queuesStack – Last-in, First-out (LIFO)Queue (PSADS §3.10–§3.12, §3.15–§3.16)
5 Some code examples
Some code examples
Reverse string
def reverse(s):
n = len(s)-1
res = ""
while n >= 0:
res = res + s[n]
n -= 1
return res
Complexity: ⇥(n2
)
n string sums
Each sum copies all thecharacters in a new string
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 64 / 69
Some code examples
Reverse string
def reverse(s):
n = len(s)-1
res = ""
while n >= 0:
res = res + s[n]
n -= 1
return res
Complexity: ⇥(n2
)
n string sums
Each sum copies all thecharacters in a new string
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 64 / 69
Some code examples
Reverse string
def reverse(s):
res = []
for c in s:
res.insert(0, c)
return "".join(res)
Complexity: ⇥(n2
)
n list inserts
Each insert moves allcharacters one position up inthe list
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 65 / 69
Some code examples
Reverse string
def reverse(s):
res = []
for c in s:
res.insert(0, c)
return "".join(res)
Complexity: ⇥(n2
)
n list inserts
Each insert moves allcharacters one position up inthe list
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 65 / 69
Some code examples
Reverse string
def reverse(s):
n = len(s)-1
res = []
while n >= 0:
res.append(s[n])
n -= 1
return "".join(res)
Complexity: ⇥(n)
n list append
Each append has anamortized cost of O(1)
Better solution
def reverse(s):
return s[::-1]
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 66 / 69
Some code examples
Reverse string
def reverse(s):
n = len(s)-1
res = []
while n >= 0:
res.append(s[n])
n -= 1
return "".join(res)
Complexity: ⇥(n)
n list append
Each append has anamortized cost of O(1)
Better solution
def reverse(s):
return s[::-1]
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 66 / 69
Some code examples
Reverse string
def reverse(s):
n = len(s)-1
res = []
while n >= 0:
res.append(s[n])
n -= 1
return "".join(res)
Complexity: ⇥(n)
n list append
Each append has anamortized cost of O(1)
Better solution
def reverse(s):
return s[::-1]
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 66 / 69
Some code examples
Remove duplicates
def deduplicate(L):
res=[]
for item in L:
if item not in res:
res.append(item)
return res
Complexity: ⇥(n2
)
n list append
n checks whether an elementis already present
Each check costs O(n)
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 67 / 69
Some code examples
Remove duplicates
def deduplicate(L):
res=[]
for item in L:
if item not in res:
res.append(item)
return res
Complexity: ⇥(n2
)
n list append
n checks whether an elementis already present
Each check costs O(n)
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 67 / 69
Some code examples
Remove duplicates
def deduplicate(L):
res=[]
present=set()
for item in L:
if item not in present:
res.append(item)
present.add(item)
return res
Complexity: ⇥(n)
n list append
n checks whether an elementis already present
Each check costs O(1)
Other possibility – destroy original order
def deduplicate(L):
return list(set(L))
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 68 / 69
Some code examples
Remove duplicates
def deduplicate(L):
res=[]
present=set()
for item in L:
if item not in present:
res.append(item)
present.add(item)
return res
Complexity: ⇥(n)
n list append
n checks whether an elementis already present
Each check costs O(1)
Other possibility – destroy original order
def deduplicate(L):
return list(set(L))
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 68 / 69
Some code examples
Remove duplicates
def deduplicate(L):
res=[]
present=set()
for item in L:
if item not in present:
res.append(item)
present.add(item)
return res
Complexity: ⇥(n)
n list append
n checks whether an elementis already present
Each check costs O(1)
Other possibility – destroy original order
def deduplicate(L):
return list(set(L))
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 68 / 69
Some code examples
Exercise
Queues and Priority Queues are data structures which are known tomost computer scientists. The Italian Queue, however, is not so wellknown, though it occurs often in everyday life. At lunch time the queuein front of the cafeteria is an italian queue, for example.
In an italian queue each element belongs to a group. If an elemententers the queue, it first searches the queue from head to tail to check ifsome of its group members (elements of the same group) are already inthe queue.
If yes, it enters the queue right behind them.
If not, it enters the queue at the tail and becomes the new lastelement (bad luck).
Dequeuing is done like in normal queues: elements are processed fromhead to tail in the order they appear in the italian queue.
Alberto Montresor (UniTN) SP - Data structure 2018/11/21 69 / 69
top related