Java Programming: The Collections Framework
Java Collections Framework
• The Java Collections Framework defines classes and interfaces. The root interface of the collection hierarchy is Collection.
Java Collections Framework
• Collection: a group of elements. Supported operations are the most general kind that all sets and lists support.
• Set: a collection in which
  – duplicate elements are not allowed
  – elements are not ordered
  – elements do not have a position
• List: a collection in which
  – duplicate elements are allowed
  – elements are ordered
  – elements have a position
Java Collection
public interface Collection<E> extends Iterable<E> {
    int size();
    boolean isEmpty();
    boolean contains(Object element);
    boolean add(E element);                        // optional
    boolean remove(Object element);                // optional
    Iterator<E> iterator();

    boolean containsAll(Collection<?> c);
    boolean addAll(Collection<? extends E> c);     // optional
    boolean removeAll(Collection<?> c);            // optional
    boolean retainAll(Collection<?> c);            // optional
    void clear();                                  // optional

    // others omitted
}
Collection Methods
• int size():
  – returns the number of elements in the collection
• boolean isEmpty():
  – returns true if the collection is empty and false otherwise
• boolean contains(Object x):
  – returns true if the collection contains the specified element and false otherwise
• boolean add(E e):
  – ensures that the collection contains the specified element. Returns true if the collection changed as a result and false otherwise.
Collection Methods
• boolean remove(Object e):
  – removes a single instance of the specified element if it is contained. Returns true if the collection was changed as a result and false otherwise.
• boolean containsAll(Collection<?> c):
  – returns true if this collection contains all of the elements of the specified collection and false otherwise
• void clear():
  – removes all of the elements from this collection
Collection Methods
• boolean addAll(Collection<? extends E> c)
  – Adds all of the elements in the specified collection to this collection. The behavior of this operation is undefined if the specified collection is modified while the operation is in progress.
• boolean removeAll(Collection<?> c)
  – Removes all of this collection's elements that are also contained in the specified collection. After this call returns, this collection will contain no elements in common with the specified collection.
• boolean retainAll(Collection<?> c)
  – Retains only the elements in this collection that are contained in the specified collection. Removes from this collection all of its elements that are not contained in the specified collection.
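The bulk operations above behave like set algebra on collections. The following is a minimal sketch, not part of the original slides: the class name BulkOpsDemo and its helper methods are hypothetical, and each helper works on a copy so the arguments are left unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical demo class: shows retainAll and removeAll on copies.
class BulkOpsDemo {
    // Elements of a that also occur in b (copy a, then retainAll).
    static List<String> intersection(Collection<String> a, Collection<String> b) {
        List<String> result = new ArrayList<>(a);
        result.retainAll(b);
        return result;
    }

    // Elements of a that do not occur in b (copy a, then removeAll).
    static List<String> difference(Collection<String> a, Collection<String> b) {
        List<String> result = new ArrayList<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("China", "America", "Canada");
        List<String> b = Arrays.asList("Canada", "France", "China");
        System.out.println(intersection(a, b)); // [China, Canada]
        System.out.println(difference(a, b));   // [America]
    }
}
```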
Iterator
• Iterator<E> iterator()
  – returns an iterator over the elements in this collection.
• The Iterator interface defines 3 methods
  – boolean hasNext(): returns true if the iteration has more elements
  – E next(): returns the next element in the iteration
  – void remove(): removes from the underlying collection the last element returned by the iterator
public interface Iterator<E> {
    public boolean hasNext();
    public E next();
    public void remove();
}
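Removing elements while iterating is the main use of Iterator.remove(). The sketch below is an illustration under assumptions: the class IteratorRemoveDemo and the method removeShort are hypothetical names, and the collection passed in must support removal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class: filters a collection in place through its iterator.
class IteratorRemoveDemo {
    // Removes every name shorter than minLength and returns how many were removed.
    static int removeShort(Collection<String> names, int minLength) {
        int removed = 0;
        Iterator<String> it = names.iterator();
        while (it.hasNext()) {
            String name = it.next();
            if (name.length() < minLength) {
                it.remove(); // removes the element just returned by next()
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("China", "America", "Canada", "France"));
        System.out.println(removeShort(names, 6)); // 1
        System.out.println(names);                 // [America, Canada, France]
    }
}
```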
Using the Collection interface
public void example(Collection<String> names) {
    names.clear();
    names.add("China");
    names.add("America");
    names.add("Canada");
    names.add("France");

    boolean a = names.contains("Israel");
    boolean b = names.remove("France");
    boolean c = names.isEmpty();
    int s = names.size();

    // Explicit iteration
    Iterator<String> it = names.iterator();
    while (it.hasNext()) {
        System.out.println(it.next());
    }

    // Equivalent for-each loop
    for (String name : names) {
        System.out.println(name);
    }
}
Implementing Collection Methods
• Consider the two methods contains and containsAll.
  – containsAll can be implemented in terms of contains.
public boolean containsAll(Collection<?> other) {
    for (Object element : other) {
        if (!contains(element)) return false;
    }
    return true;
}
Collection implementation
• Some Collection methods can be written in terms of other Collection methods.
  – boolean contains(Object c);
  – boolean containsAll(Collection<?> c);
  – boolean addAll(Collection<? extends E> c); //optional
  – boolean removeAll(Collection<?> c); //optional
  – boolean retainAll(Collection<?> c); //optional
  – void clear(); //optional
• We should write one class that implements these methods.
  – This class would be abstract.
  – No subclass would have to provide these methods.
  – Each subclass must provide all other Collection methods.
AbstractCollection
public abstract class AbstractCollection<E> implements Collection<E> {
    public boolean contains(Object e) {
        // A null 'e' is compared with == because e.equals(...) would throw.
        Iterator<E> items = iterator();
        while (items.hasNext()) {
            E item = items.next();
            if (e == null ? item == null : e.equals(item)) return true;
        }
        return false;
    }

    public boolean containsAll(Collection<?> c) {
        for (Object item : c) {
            if (!contains(item)) return false;
        }
        return true;
    }

    public boolean addAll(Collection<? extends E> c) {
        boolean changed = false;
        for (E item : c) {
            // add(item) comes first so short-circuit evaluation never skips it.
            changed = add(item) || changed;
        }
        return changed;
    }

    // other methods not listed.
}
List ADT
• A list is a sequence of elements
  – A_0, A_1, A_2, …, A_{N-1}
• Properties
  – The subscript indicates the position of an element
  – The size of a list is the number of elements in the list
  – Each element in a list is of the same type.
• List is-a Collection that includes support for
  – Positional access: obtains an element from a position
  – Searching: returns the position of an element
  – Ranged viewing: returns sub-lists based on positions of elements
Java List
public interface List<E> extends Collection<E> {
    // Positional access
    E get(int index);
    E set(int index, E element);                                // optional
    void add(int index, E element);                             // optional
    E remove(int index);                                        // optional
    boolean addAll(int index, Collection<? extends E> c);       // optional

    // Search
    int indexOf(Object o);
    int lastIndexOf(Object o);

    // Iteration
    ListIterator<E> listIterator();
    ListIterator<E> listIterator(int index);

    // Range-view
    List<E> subList(int from, int to);
}
Using the List interface
public void example(List<String> names) {
    names.add("Rams");
    names.add("Vikings");
    names.add("Packers");
    names.add("Bears");

    boolean a = names.contains("Rams");
    boolean b = names.remove("Vikings");
    boolean c = names.isEmpty();
    int s = names.size();

    Iterator<String> it = names.iterator();
    while (it.hasNext()) {
        System.out.println(it.next());
    }

    for (String name : names) {
        System.out.println(name);
    }

    // Positional operations that only List provides
    String n1 = names.get(0);
    String n2 = names.set(0, "Cardinals");
    names.add(0, "Lions");
    int i1 = names.indexOf("Bears");
    List<String> nfc = names.subList(0, 3);
}
Implementing List Methods
• Consider the two methods add(int, E) and addAll(int, Collection)
  – addAll can be implemented in terms of add
• Since some List methods can be implemented in terms of other List methods, write an abstract list class.
public boolean addAll(int index, Collection<? extends E> other) {
    boolean modified = false;
    for (E e : other) {
        add(index++, e);
        modified = true;
    }
    return modified;
}
Concrete Implementation
• There must be a concrete class that implements the List interface.
  – There may be more than one implementation.
  – There are two basic ways to implement:
    • ArrayList: Store the elements in an array.
    • LinkedList: Store the elements in linked nodes.
Class Diagram for Lists
ArrayList
• ArrayList is in the java.util package. An ArrayList implements List with a backing array.
  – An ArrayList is not an array.
  – An ArrayList contains an array.
  – The ArrayList has a capacity (the array size)
  – The ArrayList has a size (the number of elements in the list)
Using the List interface : part 2
public void example() {
    List<String> names = new ArrayList<>();

    names.add("Rams");
    names.add("Vikings");
    names.add("Packers");
    names.add("Bears");

    boolean a = names.contains("Rams");
    boolean b = names.remove("Vikings");
    boolean c = names.isEmpty();
    int s = names.size();

    String n1 = names.get(0);
    String n2 = names.set(0, "Cardinals");
    names.add(0, "Lions");
    int i1 = names.indexOf("Bears");
}
ArrayList implementation
ArrayList Time Complexity
Method                        Expected Time Complexity
boolean add(E element)        O(1)
E get(int i)                  O(1)
int size()                    O(1)
E set(int i, E element)       O(1)
void add(int i, E element)    O(n-i)
int indexOf(E element)        O(n)
E remove(int i)               O(n-i)
void clear()                  O(n)
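The costs in the table follow from how a backing array is used. The sketch below is a simplified illustration, not the java.util.ArrayList source: the class name SimpleArrayList, the initial capacity of 4 and the doubling growth policy are all assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical, simplified array-backed list (not java.util.ArrayList).
class SimpleArrayList<E> {
    private Object[] items = new Object[4]; // capacity is items.length (assumed start of 4)
    private int size = 0;                   // number of elements actually stored

    int size() { return size; }
    int capacity() { return items.length; }

    @SuppressWarnings("unchecked")
    E get(int i) {                          // O(1): direct array access
        checkIndex(i, size);
        return (E) items[i];
    }

    boolean add(E e) {                      // O(1) unless the array must grow
        add(size, e);
        return true;
    }

    void add(int i, E e) {                  // O(n-i): shifts the tail one slot right
        checkIndex(i, size + 1);
        if (size == items.length) {
            items = Arrays.copyOf(items, 2 * items.length); // assumed doubling policy
        }
        System.arraycopy(items, i, items, i + 1, size - i);
        items[i] = e;
        size++;
    }

    @SuppressWarnings("unchecked")
    E remove(int i) {                       // O(n-i): shifts the tail one slot left
        checkIndex(i, size);
        E old = (E) items[i];
        System.arraycopy(items, i + 1, items, i, size - i - 1);
        items[--size] = null;               // clear the now-unused last slot
        return old;
    }

    private static void checkIndex(int i, int bound) {
        if (i < 0 || i >= bound) throw new IndexOutOfBoundsException("index " + i);
    }

    public static void main(String[] args) {
        SimpleArrayList<String> names = new SimpleArrayList<>();
        names.add("Rams"); names.add("Vikings"); names.add("Packers"); names.add("Bears");
        names.add(0, "Lions");                  // array is full, so capacity doubles to 8
        System.out.println(names.capacity());   // 8
        System.out.println(names.remove(2));    // Vikings
        System.out.println(names.size());       // 4
    }
}
```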
LinkedList
• A LinkedList is a concrete implementation of List
  – Uses nodes to hold elements and to impose a linear structure on the elements. A node knows the node that succeeds it in the list (the ‘next’ node).
  – Variations of the linked list can be useful.
    • The LinkedList class in Java is a doubly-linked list.
    • We will start with a singly-linked list.
Using the List interface : part 3
public void example() {
    List<String> names = new SinglyLinkedList<>();

    names.add("Rams");
    names.add("Vikings");
    names.add("Packers");
    names.add("Bears");

    boolean a = names.contains("Rams");
    boolean b = names.remove("Vikings");
    boolean c = names.isEmpty();
    int s = names.size();

    String n1 = names.get(0);
    String n2 = names.set(0, "Cardinals");
    names.add(0, "Lions");
    int i1 = names.indexOf("Bears");
}
SinglyLinkedList
SinglyLinkedList Iterator Code
SinglyLinkedList Time Complexity
Method                        Time Complexity
boolean add(E element)        O(n)
E get(int i)                  O(i)
int size()                    O(1)
E set(int i, E element)       O(i)
void add(int i, E element)    O(i)
int indexOf(E element)        O(n)
E remove(int i)               O(i)
void clear()                  O(1)
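The costs in the table can be read off a head-only linked structure. The following is a minimal sketch under assumptions, not the SinglyLinkedList code from the original slides: the class name SimpleSinglyLinkedList is hypothetical, it keeps only a head reference (no tail), and it does not implement the full List interface.

```java
// Hypothetical, simplified singly-linked list with a head reference only.
class SimpleSinglyLinkedList<E> {
    private static final class Node<E> {
        E value;
        Node<E> next;
        Node(E value, Node<E> next) { this.value = value; this.next = next; }
    }

    private Node<E> head; // first node, or null when the list is empty
    private int size;     // kept up to date so size() is O(1)

    int size() { return size; }

    // O(1): dropping the head makes every node unreachable.
    void clear() { head = null; size = 0; }

    // O(i): follow i links starting at the head.
    private Node<E> nodeAt(int i) {
        if (i < 0 || i >= size) throw new IndexOutOfBoundsException("index " + i);
        Node<E> n = head;
        for (int k = 0; k < i; k++) n = n.next;
        return n;
    }

    E get(int i) { return nodeAt(i).value; }

    // O(i): find the predecessor, then splice in a new node.
    void add(int i, E e) {
        if (i < 0 || i > size) throw new IndexOutOfBoundsException("index " + i);
        if (i == 0) {
            head = new Node<>(e, head);
        } else {
            Node<E> prev = nodeAt(i - 1);
            prev.next = new Node<>(e, prev.next);
        }
        size++;
    }

    // O(n): without a tail reference the whole list is walked to append.
    boolean add(E e) { add(size, e); return true; }

    // O(i): unlink the node at position i and return its value.
    E remove(int i) {
        Node<E> target = nodeAt(i);
        if (i == 0) head = head.next;
        else nodeAt(i - 1).next = target.next;
        size--;
        return target.value;
    }

    public static void main(String[] args) {
        SimpleSinglyLinkedList<String> names = new SimpleSinglyLinkedList<>();
        names.add("Rams"); names.add("Vikings"); names.add("Packers");
        names.add(0, "Lions");
        System.out.println(names.get(1));    // Rams
        System.out.println(names.remove(2)); // Vikings
        System.out.println(names.size());    // 3
    }
}
```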
DoublyLinkedList
• A DoublyLinkedList is a concrete implementation of List
  – Uses nodes to hold elements and to impose a linear structure on the elements.
  – A node knows the preceding and succeeding nodes.
DoublyLinkedList Time Complexity
Method                        Time Complexity
boolean add(E element)        O(1)
E get(int i)                  O(n)
int size()                    O(1)
E set(int i, E element)       O(n)
void add(int i, E element)    O(n)
int indexOf(E element)        O(n)
E remove(int i)               O(n)
void clear()                  O(1)
Stack<E>
• A stack is a very simple linear data type that follows the last-in-first-out (LIFO) principle.
  – E push(E element): adds the element to the stack. Also returns the added element.
  – E pop(): removes and returns the last pushed element
  – E top(): returns the last pushed element without removing it
Stack<E>
• The Stack class in Java is odd. It is a subclass of Vector (which implements List) that simply adds methods named push, pop, peek and a few others.
  – Since it inherits the List operations, a Stack can be used as a list and violate the LIFO principle.
  – This is not a good design, but it is convenient for programmers writing the Stack class.
  – The Stack uses a sequential (array-based) implementation
  – We will write our own implementation
Stack<E>
public interface Stack<E> extends Collection<E> {
    public E top();
    public E pop();
    public E push(E element);
}

public class ArrayStack<E> extends AbstractCollection<E> implements Stack<E> {
    // The adapted list; the top of the stack is the last element.
    private ArrayList<E> stack = new ArrayList<>();

    public int size() { return stack.size(); }

    public Iterator<E> iterator() { return stack.iterator(); }

    public E top() {
        if (isEmpty()) throw new EmptyStackException();
        return stack.get(stack.size() - 1);
    }

    public E push(E e) {
        stack.add(e);
        return e;
    }

    public E pop() {
        if (isEmpty()) throw new EmptyStackException();
        return stack.remove(stack.size() - 1);
    }
}
This is an example of the adapter pattern. An adapter, or wrapper, makes one object behave like another.
Queue Interface
• A queue is a very simple linear data type that follows the first-in-first-out (FIFO) principle.
  – E enqueue(E element): adds the element to the queue. Also returns the added element.
  – E dequeue(): removes and returns the element that was added before all others
  – E front(): returns the element that was added before all others
Queue<E>
public interface Queue<E> extends Collection<E> {
    public E front();
    public E dequeue();
    public E enqueue(E element);
}

public class ArrayQueue<E> extends AbstractCollection<E> implements Queue<E> {
    // The adapted list; the front of the queue is element 0.
    private ArrayList<E> queue = new ArrayList<>();

    public int size() { return queue.size(); }

    public Iterator<E> iterator() { return queue.iterator(); }

    public E front() {
        if (isEmpty()) throw new NoSuchElementException();
        return queue.get(0);
    }

    public E enqueue(E e) {
        queue.add(size(), e);
        return e;
    }

    public E dequeue() {
        if (isEmpty()) throw new NoSuchElementException();
        return queue.remove(0);
    }
}
Map<K,V>
• A map associates keys with values.
  – A map cannot contain duplicate keys
  – Each key can map to at most one value.
• Supports the following basic methods
  – V put(K key, V value): associates the key with the value. Returns the object previously associated with the key (null if none).
  – V get(K key): returns the value associated with the key. Returns null if the key is not in the map
  – V remove(K key): removes any association between the key and value. Returns the value that was associated with the key.
Map<K,V>
public interface Map<K,V> {
    int size();
    boolean isEmpty();
    boolean containsKey(Object key);
    boolean containsValue(Object value);
    V get(Object key);

    V put(K key, V value);
    V remove(Object key);
    void putAll(Map<? extends K, ? extends V> m);
    void clear();

    Set<K> keySet();
    Collection<V> values();
    Set<Map.Entry<K, V>> entrySet();

    interface Entry<K,V> {
        K getKey();
        V getValue();
        V setValue(V value);
        boolean equals(Object o);
        int hashCode();
    }

    boolean equals(Object o);
    int hashCode();
}
Selected Map<K,V> methods
• boolean containsKey(Object key)
  – Returns true if the map contains the key and false otherwise
• boolean containsValue(Object value)
  – Returns true if the map maps one or more keys to the specified value.
• V get(Object key)
  – Returns the value associated with the specified key
• V put(K key, V value)
  – Associates the key with the value. Returns the value previously associated with key or null if none.
• V remove(Object key)
  – Removes the key from the map
• void putAll(Map<? extends K, ? extends V> otherMap)
  – performs a put on every element of otherMap
• void clear()
  – removes all key-value mappings
Selected Map<K,V> methods
• Set<K> keySet()
  – returns a set of the keys
• Collection<V> values()
  – returns a collection of the values
• Set<Map.Entry<K, V>> entrySet()
  – returns a set of entries
• Entry<K,V> is a nested interface. Represents one key-value association.
  – K getKey(): returns the key of the Entry
  – V getValue(): returns the value of the Entry
  – V setValue(V value): modifies the value mapped to by the key. Returns the previous value
Using a Map
public void example(Map<String, Integer> map) {
    map.put("A", 1);
    map.put("B", 2);
    map.put("C", 3);

    Integer x = map.get("C");
    Integer y = map.get("A");
    Integer z = map.get(Integer.valueOf(3)); // null: 3 is a value here, not a key
    int size = map.size();

    // How to print the values in the map?
    // How to print the keys in the map?
}
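One way to answer the two questions in the comments is to iterate over the views returned by keySet(), values() and entrySet(). This is a sketch only: the class MapPrintDemo and the helper describe are hypothetical names, and a LinkedHashMap is assumed so the output order is predictable.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class: prints a map through its three collection views.
class MapPrintDemo {
    // Builds "key=value;" for every entry, in the map's iteration order.
    static String describe(Map<String, Integer> map) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new LinkedHashMap<>(); // keeps insertion order
        map.put("A", 1);
        map.put("B", 2);
        map.put("C", 3);

        for (String key : map.keySet()) {
            System.out.println(key);          // A, then B, then C
        }
        for (Integer value : map.values()) {
            System.out.println(value);        // 1, then 2, then 3
        }
        System.out.println(describe(map));    // A=1;B=2;C=3;
    }
}
```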
public abstract class AbstractMap<K,V> implements Map<K,V> {
    protected AbstractMap() { }

    public boolean isEmpty() { return size() == 0; }

    public void putAll(Map<? extends K, ? extends V> m) {
        for (Map.Entry<? extends K, ? extends V> e : m.entrySet())
            put(e.getKey(), e.getValue());
    }

    public static class SimpleEntry<K,V> implements Entry<K,V> {
        private final K key;
        private V value;

        public SimpleEntry(K key, V value) {
            this.key = key;
            this.value = value;
        }

        public SimpleEntry(Entry<? extends K, ? extends V> entry) {
            this.key = entry.getKey();
            this.value = entry.getValue();
        }

        public K getKey() { return key; }

        public V getValue() { return value; }

        public V setValue(V value) {
            V oldValue = this.value;
            this.value = value;
            return oldValue;
        }
    }
}
public class ListMap<K,V> extends AbstractMap<K,V> {
    private List<Entry<K,V>> entries;

    public ListMap() { entries = new LinkedList<>(); }

    public ListMap(Map<K,V> source) {
        entries = new LinkedList<>();
        for (K key : source.keySet()) {
            put(key, source.get(key));
        }
    }

    // this is linear search...O(n)
    private Entry<K,V> getEntry(Object key) {
        for (Entry<K,V> e : entries) {
            if (e.getKey().equals(key)) return e;
        }
        return null;
    }

    @Override
    public boolean containsKey(Object key) { return getEntry(key) != null; }

    @Override
    public V get(Object key) {
        Entry<K,V> entry = getEntry(key);
        if (entry != null) {
            return entry.getValue();
        }
        return null;
    }

    public V put(K key, V value) {
        Entry<K,V> entry = getEntry(key);
        if (entry != null) {
            return entry.setValue(value);   // returns the previous value
        } else {
            entries.add(0, new AbstractMap.SimpleEntry<>(key, value));
            return null;                    // no previous value for this key
        }
    }

    public V remove(Object key) {
        Entry<K,V> entry = getEntry(key);
        if (entry != null) {
            entries.remove(entry);
            return entry.getValue();
        } else {
            return null;
        }
    }

    public boolean containsValue(Object value) {
        for (Entry<K,V> e : entries) {
            if (e.getValue().equals(value)) return true;
        }
        return false;
    }

    public void clear() { entries.clear(); }

    public int size() { return entries.size(); }

    // several other methods are not shown here
}
What is a Hashtable?
• The previous implementation adapts a LinkedList to perform as a Map.
  – The LinkedList adapter is easy to write.
  – The LinkedList adapter is inefficient. Finding a key in the list is linear: O(n)
• A Hashtable is a map that stores key-value pairs in an array using a hashing strategy.
  – A key is converted to an array index using a hash function. The hash function is fast, therefore finding an entry is fast.
  – In other words: given a key K we know what the index of K will be and therefore don’t need to search (although some small amount of searching may be necessary).
Hashtable Example
• Hashtable of character/integers
  – The hashtable is an array of 7 slots
  – The hashtable capacity is 7
  – Initially, the hashtable size is 0
• Insert characters into the table.
  – A single character is a key.
  – The hash function must take a character as input and convert it into a number between 0 and 6.
• Hash function
  – Let P be the position of the character in the English alphabet (starting with 1).
  – Use h(K) = (P % 7) as the hash function
Hashtable
Index  Entry
0      null
1      null
2      null
3      null
4      null
5      null
6      null
Hashtable Example
• Consider inserting the following data into the table. Subscripts are used to denote the position of the character in the alphabet.
  – put(B₂, 102)
  – put(S₁₉, 3)
  – put(J₁₀, -12)
  – put(N₁₄, 44)
  – put(X₂₄, 85) // interesting
  – put(W₂₃, -33)
  – put(B₂, -101)
  – get(X₂₄)
  – get(W₂₃)
Hashtable (figure: slots 0 to 6, all initially null; the entries B:102, S:3, J:-12, N:44, X:85, W:-33 and B:-101 are placed as the operations above run)
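The slot for each key in the example can be checked by evaluating h(K) = P % 7 directly. This is a small sketch: the class LetterHash and its method names are hypothetical, and keys are assumed to be uppercase letters A to Z.

```java
// Hypothetical helper: evaluates the example hash function h(K) = P % capacity.
class LetterHash {
    // P: 1-based position of an uppercase letter in the English alphabet.
    static int position(char key) {
        return key - 'A' + 1;
    }

    static int hash(char key, int capacity) {
        return position(key) % capacity;
    }

    public static void main(String[] args) {
        for (char key : new char[] {'B', 'S', 'J', 'N', 'X', 'W'}) {
            System.out.println(key + " -> " + hash(key, 7));
        }
        // B -> 2, S -> 5, J -> 3, N -> 0, X -> 3, W -> 2
    }
}
```

X lands on the same index as J, and W on the same index as B, which is why the put of X is marked as interesting.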
Collisions
• A collision occurs when two or more different keys map to the same array index.
• When a collision occurs, a collision resolution policy determines how to resolve the collision.
• The load factor is defined as size/capacity. The odds of a collision increase with increasing load factor.
• Two collision resolution policies:
  – Open Addressing: look for an open slot using a predetermined sequence of probes.
  – Separate Chaining: keep a list of key/value pairs in a slot such that one slot can contain multiple key/value pairs.
Open Addressing Details
• The sequence of locations examined when locating data is called the probe sequence.
• The probe sequence {s(0), s(1), s(2), …} can be described as follows:
  s(i) = norm(h(K) + p(i))
  – h(K): the hash function mapping K to an integer
  – p(i): a probing function returning an offset from h(K) for the i-th probe
  – norm: the normalizing function (usually the remainder modulo the capacity)
Open Addressing
• Linear probing
  – p(i) = i
  – The probe sequence becomes {norm(h(k)), norm(h(k)+1), norm(h(k)+2), …}
• Quadratic probing
  – p(i) = i^2
  – The probe sequence becomes {norm(h(k)), norm(h(k)+1), norm(h(k)+4), …}
  – Must ensure that the sequence will probe an empty slot if the array has an empty slot.
  – A theorem states that this method will find an empty slot if the capacity is prime and the table is not more than half full.
Double Hashing
• Double hashing
  – p(i, k) = i * h2(k)
  – The probe sequence becomes {norm(h(k)), norm(h(k) + h2(k)), norm(h(k) + 2*h2(k)), …}
• The probe sequence is determined by two hash functions, which typically spreads entries better than linear or quadratic probing.
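The three probing functions differ only in the offset p(i). The sketch below generates the first few slots of each probe sequence; the class ProbeSequences and its method names are hypothetical, the hash values are passed in as plain integers, and h2 is assumed to be nonzero.

```java
import java.util.Arrays;

// Hypothetical helper: lists the first `count` slots of each probe sequence.
class ProbeSequences {
    // norm: remainder modulo the capacity, kept non-negative.
    static int norm(int x, int capacity) {
        return Math.floorMod(x, capacity);
    }

    static int[] linear(int h, int capacity, int count) {
        int[] s = new int[count];
        for (int i = 0; i < count; i++) s[i] = norm(h + i, capacity);       // p(i) = i
        return s;
    }

    static int[] quadratic(int h, int capacity, int count) {
        int[] s = new int[count];
        for (int i = 0; i < count; i++) s[i] = norm(h + i * i, capacity);   // p(i) = i^2
        return s;
    }

    static int[] doubleHash(int h, int h2, int capacity, int count) {
        int[] s = new int[count];
        for (int i = 0; i < count; i++) s[i] = norm(h + i * h2, capacity);  // p(i, k) = i * h2(k)
        return s;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(linear(3, 7, 5)));        // [3, 4, 5, 6, 0]
        System.out.println(Arrays.toString(quadratic(3, 7, 5)));     // [3, 4, 0, 5, 5]
        System.out.println(Arrays.toString(doubleHash(3, 3, 7, 5))); // [3, 6, 2, 5, 1]
    }
}
```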
What makes a good key?
• Can any object be used as a key?
  – What are the essential properties of a key object?
• Consider using a Person as a key.
  – The hashing function is given by hashCode().
  – Equality testing is given by equals(…).
class Person {
    private String name;
    private String phone;

    Person(String n, String p) {
        name = n;
        phone = p;
    }

    public int hashCode() { … }
    public boolean equals(Object other) { … }
}
Java hashCode contract
1. Whenever hashCode is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer.
2. If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
3. As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. This is often described as being derived from the memory address of the object, although the exact technique depends on the JVM implementation.
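One possible way to fill in the two elided methods of Person so that the contract holds is shown below. It is an assumption, not the original author's code: equality is taken to mean that both name and phone match, the fields are made final so a key cannot change while it is in a map, and PersonKeyDemo with its sample data is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Sketch of a key class; equality on (name, phone) is an assumed design choice.
final class Person {
    private final String name;
    private final String phone;

    Person(String n, String p) {
        name = n;
        phone = p;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Person)) return false;
        Person that = (Person) other;
        return Objects.equals(name, that.name) && Objects.equals(phone, that.phone);
    }

    @Override
    public int hashCode() {
        // Uses exactly the fields compared in equals, so equal objects hash alike.
        return Objects.hash(name, phone);
    }
}

class PersonKeyDemo {
    public static void main(String[] args) {
        Map<Person, String> offices = new HashMap<>();
        offices.put(new Person("Ann", "555-0100"), "Room 12");
        // A different but equal Person object finds the same entry.
        System.out.println(offices.get(new Person("Ann", "555-0100"))); // Room 12
    }
}
```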
Java Hashcode Method
• The API for Java’s String hashCode method states:
  – Returns a hash code for this string. The hash code for a String object is computed as
    s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
    using int arithmetic, where s[i] is the ith character of the string, n is the length of the string, and ^ indicates exponentiation. (The hash value of the empty string is zero.)
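The polynomial above can be evaluated with Horner's rule, multiplying the running total by 31 before adding each character. The sketch below compares that hand computation with String.hashCode(); the class StringHashDemo and the method manualHash are hypothetical names.

```java
// Hypothetical demo: evaluates s[0]*31^(n-1) + ... + s[n-1] with int arithmetic.
class StringHashDemo {
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i); // Horner's rule; overflow wraps like int arithmetic
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(manualHash("ab"));                        // 3105 (97*31 + 98)
        System.out.println(manualHash("Rams") == "Rams".hashCode()); // true
        System.out.println(manualHash(""));                          // 0
    }
}
```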
Hashcodes and capacity
• What makes a good hashcode?
  – Should be fast to compute
  – Should evenly distribute keys across a large space.
  – Should not generate clumps of hashes for similar keys.
• Hashtable capacities are usually kept at prime values to avoid problems with probe sequences.
• Example: Insert into a table of capacity 16 (slots 0 to 15) using quadratic probing and a key that hashes to 2
  { norm(h(k)), norm(h(k)+1^2), norm(h(k)+2^2), norm(h(k)+3^2), … }
  {2, 3, 6, 11, 2, 11, 6, 3, 2, 3, 6, 11, 2, 11, 6, 3, 2, …}
  – Only slots 2, 3, 6 and 11 are ever probed, no matter how many of the other slots are empty.
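The effect of the capacity on quadratic probing can be checked by collecting the distinct slots a probe sequence visits. This is an illustrative sketch: the class QuadraticCoverage and the method probedSlots are hypothetical, and capacity 17 is chosen here only as a nearby prime for comparison.

```java
import java.util.TreeSet;

// Hypothetical helper: which slots does quadratic probing reach from hash h?
class QuadraticCoverage {
    static TreeSet<Integer> probedSlots(int h, int capacity) {
        TreeSet<Integer> slots = new TreeSet<>();
        // i^2 modulo the capacity repeats after `capacity` steps, so this covers every case.
        for (int i = 0; i < capacity; i++) {
            slots.add((h + i * i) % capacity);
        }
        return slots;
    }

    public static void main(String[] args) {
        System.out.println(probedSlots(2, 16));        // [2, 3, 6, 11]
        System.out.println(probedSlots(2, 17).size()); // 9
    }
}
```

With capacity 16 only 4 of the 16 slots are reachable, while the prime capacity 17 reaches 9 of 17, which is more than half of the table.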
Removing with open addressing
• Consider a table of capacity 11 with the following sequence of operations using h(K) = K % 11 and linear probing.
  – put(36, “Fu”)
  – put(23, “Zhi”)
  – put(4, “Guo”)
  – put(46, “Li”)
  – put(1, “Chen”)
  – get(1)
  – remove(23)
  – remove(36)
  – get(1)
Hashtable
Index  Entry
0
1      23:“Zhi”
2      46:“Li”
3      36:“Fu”
4      4:“Guo”
5      1:“Chen”
6
7
8
9
10
Removal
• Removing an item may cause errors when finding other items at a later time.
• Deletion should be lazy. Each entry has a boolean flag named “hasBeenDeleted”. This flag is set to true when deleting a key. The entry is still in the array.
• The “hasBeenDeleted” flag means
  – When getting/removing: the cell counts as occupied, so keep probing.
  – When putting: the cell counts as empty, so put here.
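The lazy deletion rule can be sketched with a small linear-probing table for int keys. Everything here is an assumption made for illustration: the class LazyDeleteTable, its parallel arrays and the fixed capacity are hypothetical, and the table never resizes.

```java
// Hypothetical open-addressing table with linear probing and lazy deletion.
class LazyDeleteTable {
    private final int[] keys;
    private final String[] values;
    private final boolean[] used;           // the cell has held an entry at some point
    private final boolean[] hasBeenDeleted; // the entry in the cell was removed lazily

    LazyDeleteTable(int capacity) {
        keys = new int[capacity];
        values = new String[capacity];
        used = new boolean[capacity];
        hasBeenDeleted = new boolean[capacity];
    }

    private int hash(int key) { return Math.floorMod(key, keys.length); }

    // Returns the slot holding key, or -1. Deleted cells count as occupied here.
    private int find(int key) {
        for (int i = 0; i < keys.length; i++) {
            int slot = (hash(key) + i) % keys.length;
            if (!used[slot]) return -1;  // a never-used cell ends the search
            if (!hasBeenDeleted[slot] && keys[slot] == key) return slot;
        }
        return -1;
    }

    void put(int key, String value) {
        int slot = find(key);
        if (slot >= 0) { values[slot] = value; return; }
        for (int i = 0; i < keys.length; i++) {
            slot = (hash(key) + i) % keys.length;
            if (!used[slot] || hasBeenDeleted[slot]) { // deleted cells count as empty here
                keys[slot] = key;
                values[slot] = value;
                used[slot] = true;
                hasBeenDeleted[slot] = false;
                return;
            }
        }
        throw new IllegalStateException("table is full");
    }

    String get(int key) {
        int slot = find(key);
        return slot < 0 ? null : values[slot];
    }

    String remove(int key) {
        int slot = find(key);
        if (slot < 0) return null;
        hasBeenDeleted[slot] = true; // the entry stays in the array
        return values[slot];
    }

    public static void main(String[] args) {
        LazyDeleteTable t = new LazyDeleteTable(11);
        t.put(36, "Fu"); t.put(23, "Zhi"); t.put(4, "Guo"); t.put(46, "Li"); t.put(1, "Chen");
        t.remove(23);
        t.remove(36);
        System.out.println(t.get(1));  // Chen (probing continues past the deleted cells)
        System.out.println(t.get(23)); // null
    }
}
```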
Separate Chaining
• Another collision resolution strategy is separate chaining.
  – Each array slot contains a list of entries
  – The fundamental methods then become:
    • PUT: hash into array and add to list
    • GET: hash into array and search the list
    • REMOVE: hash into array and remove from list
• Must ensure that
  – the load factor remains controlled
  – There is no probe sequence but all other considerations related to open addressing apply
Separate Chaining
• Consider inserting the following data into the table. Subscripts are used to denote the position of the character in the alphabet.
  – put(B₂, 102)
  – put(S₁₉, 3)
  – put(J₁₀, -12)
  – put(N₁₄, 44)
  – put(X₂₄, 85)
  – put(W₂₃, -33)
  – put(B₂, -101)
  – get(X₂₄)
  – get(W₂₃)
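Hashtable (figure: slots 0 to 6, each holding a list of entries; B:102, S:3, J:-12, N:44, X:85, W:-33 and B:-101 are added to the lists as the operations above run)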
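The PUT and GET steps of separate chaining can be sketched for this character/integer example. The class ChainedTable is hypothetical; it assumes uppercase letter keys and reuses the earlier hash function h(K) = P % capacity.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical separate-chaining table for uppercase letter keys and int values.
class ChainedTable {
    private static final class Entry {
        final char key;
        int value;
        Entry(char key, int value) { this.key = key; this.value = value; }
    }

    private final List<LinkedList<Entry>> slots = new ArrayList<>();

    ChainedTable(int capacity) {
        for (int i = 0; i < capacity; i++) slots.add(new LinkedList<>());
    }

    // Hash into the array: P % capacity, with P the 1-based alphabet position.
    private LinkedList<Entry> chain(char key) {
        return slots.get((key - 'A' + 1) % slots.size());
    }

    // PUT: search the chain; replace the value if the key exists, else add an entry.
    Integer put(char key, int value) {
        for (Entry e : chain(key)) {
            if (e.key == key) {
                int old = e.value;
                e.value = value;
                return old;
            }
        }
        chain(key).add(new Entry(key, value));
        return null;
    }

    // GET: search only the chain the key hashes to.
    Integer get(char key) {
        for (Entry e : chain(key)) {
            if (e.key == key) return e.value;
        }
        return null;
    }

    int chainLength(int slot) { return slots.get(slot).size(); }

    public static void main(String[] args) {
        ChainedTable t = new ChainedTable(7);
        t.put('B', 102); t.put('S', 3); t.put('J', -12);
        t.put('N', 44); t.put('X', 85); t.put('W', -33);
        System.out.println(t.put('B', -101)); // 102 (the previous value for B)
        System.out.println(t.get('X'));       // 85
        System.out.println(t.chainLength(3)); // 2 (J and X share slot 3)
    }
}
```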
Set<E>
• A collection that contains no duplicate elements.
  – sets contain no pair of elements e1 and e2 such that e1.equals(e2)
  – sets contain at most one null element.
• The Set interface places additional stipulations, beyond those inherited from the Collection interface, on the contracts of all constructors and on the contracts of the add, equals and hashCode methods.
  – All constructors must create a set that contains no duplicate elements.
  – An element of the set cannot be changed in a way that affects equality testing.
  – A set must not contain itself as an element.
• Some sets have restrictions on the elements that they may contain.
  – For example, some implementations prohibit null elements
  – some have restrictions on the types of their elements.
  – Attempting to add an ineligible element throws an unchecked exception, typically NullPointerException or ClassCastException.
Set Hierarchy
public abstract class AbstractSet<E> extends AbstractCollection<E> implements Set<E> {
    // the AbstractSet class has just a few methods
    // these methods are not relevant to our discussion.
}
public class HashSet<E> extends AbstractSet<E> implements Set<E> {
    // Dummy value stored with every key in the backing map.
    private static final Object PRESENT = new Object();
    private HashMap<E,Object> map;

    public HashSet() { map = new HashMap<>(); }

    public HashSet(Collection<? extends E> c) {
        map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
        addAll(c);
    }

    public int size() { return map.size(); }

    public boolean isEmpty() { return map.isEmpty(); }

    public boolean contains(Object o) { return map.containsKey(o); }

    public boolean add(E e) { return map.put(e, PRESENT)==null; }

    public boolean remove(Object o) { return map.remove(o)==PRESENT; }

    public void clear() { map.clear(); }

    public Iterator<E> iterator() { return map.keySet().iterator(); }
}