.NET Collection Classes Deep Dive - Rocksolid Tour 2013

Post on 19-May-2015


1

Collection Classes Deep Dive

By Gary Short, Head of Gibraltar Labs

Gibraltar Software

2

Introduction

• Gary Short
• Head of Gibraltar Labs
• C# MVP
• @garyshort
• gary.short@gibraltarsoftware.com
• facebook.com/TheOtherGaryShort

3

Why do we Care About This Stuff?

4

5

Let’s Start With Something We Know

6

List<T> Demo

7

What we Learned

• Don’t add elements one at a time in a loop
• Add() triggers a capacity growth whenever the backing array is full
• Capacity growth uses Array.Copy()
• Array.Copy() is an O(n) operation
• O(n) is sloooooowwwwwww
• Use AddRange() instead
• Or set a “large enough” initial capacity
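The growth behaviour above can be observed by watching List<T>.Capacity change while elements are added. This is a minimal sketch, not the demo from the talk; the class and method names (CapacityDemo, CountGrowthsWithAdd, FillWithAddRange) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Adds 'count' items one at a time and returns how many times the
    // backing array was reallocated (detected as a change in Capacity).
    public static int CountGrowthsWithAdd(int count, int initialCapacity)
    {
        var list = new List<int>(initialCapacity);
        int growths = 0;
        int lastCapacity = list.Capacity;
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                growths++;
                lastCapacity = list.Capacity;
            }
        }
        return growths;
    }

    // Adds everything in one call. An array exposes its Count up front,
    // so the list can size its backing array before copying.
    public static List<int> FillWithAddRange(int[] source)
    {
        var list = new List<int>();
        list.AddRange(source);
        return list;
    }
}
```

Calling CountGrowthsWithAdd(1000, 0) reports several reallocations, while CountGrowthsWithAdd(1000, 1000) reports none, which is the “large enough initial capacity” advice in action.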

8

How Slow is Slow?

[Chart: Performance: Add Versus AddRange. X axis: Number of Elements Added. Y axis: Number of Ticks. Series: Add, AddRange.]

10

What About Removing Stuff?

11

Demo

12

What we Learned

Prefer RemoveAt() as there’s no IndexOf() step
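When the index is already known, for example while walking the list, RemoveAt() avoids the search that Remove() has to perform first. A small sketch under that assumption; RemoveDemo and RemoveShortWords are hypothetical names, not code from the demo.

```csharp
using System.Collections.Generic;

public static class RemoveDemo
{
    // Removes every word shorter than minLength, in place.
    public static void RemoveShortWords(List<string> words, int minLength)
    {
        // Walk backwards so a removal never shifts elements we still have to visit.
        for (int i = words.Count - 1; i >= 0; i--)
        {
            if (words[i].Length < minLength)
            {
                // The index is in hand, so there is no IndexOf-style scan here.
                // words.Remove(words[i]) would search from the start first.
                words.RemoveAt(i);
            }
        }
    }
}
```

Both calls still shift the tail of the array after the removed slot, so RemoveAt() saves the search, not the shift.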

13

List<T> - Sorting

• Uses QuickSort under the hood
• One of the fastest general-purpose sort algorithms in practice
• O(n log n) in the best case
• O(n log n) in the average case
• Though the worst case is O(n^2)

[Chart: Performance: O(n log n) Vs O(n^2). X axis: Elements to be Sorted. Y axis: Effort. Series: O(n log n), O(n^2).]

15

QuickSort Demo

16

So What is the Worst Case?

• If the list is already sorted (and the pivot is taken from one end)
  – First partition has lower = 0, upper = n
  – Then calls Partition(n-1)
  – This happens a further n-2 times
  – Each partition pass is O(n), so the total is O(n^2)
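The worst case can be made visible by counting comparisons. The sketch below is a textbook quicksort that always pivots on the first element of the range; that pivot choice is an assumption made for illustration and is not the framework's implementation. NaiveQuickSort is a hypothetical name.

```csharp
using System;

public static class NaiveQuickSort
{
    // Sorts 'a' in place and returns the number of element comparisons made.
    // Plain recursion: keep inputs modest, since sorted input recurses n deep.
    public static long Sort(int[] a)
    {
        long comparisons = 0;
        Sort(a, 0, a.Length - 1, ref comparisons);
        return comparisons;
    }

    private static void Sort(int[] a, int lo, int hi, ref long comparisons)
    {
        if (lo >= hi) return;
        int p = Partition(a, lo, hi, ref comparisons);
        Sort(a, lo, p - 1, ref comparisons);
        Sort(a, p + 1, hi, ref comparisons);
    }

    // Partitions a[lo..hi] around a[lo] and returns the pivot's final index.
    private static int Partition(int[] a, int lo, int hi, ref long comparisons)
    {
        int pivot = a[lo];
        int boundary = lo;
        for (int j = lo + 1; j <= hi; j++)
        {
            comparisons++;
            if (a[j] < pivot)
            {
                boundary++;
                Swap(a, boundary, j);
            }
        }
        Swap(a, lo, boundary);
        return boundary;
    }

    private static void Swap(int[] a, int i, int j)
    {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

On an already sorted array of n distinct values every partition peels off just one element, so the count comes to (n-1) + (n-2) + ... + 1 = n(n-1)/2 comparisons; a shuffled array of the same size needs far fewer.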

18

Can we Mitigate the Worst Case?

• Median of Three
  – Take an element from the “top” of the array
  – Take an element from the “middle” of the array
  – Take an element from the “bottom” of the array
  – Find the median value of the three
  – Pivot on the median
• Let’s see if Microsoft uses this algorithm
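The median-of-three steps can be sketched as a small pivot-selection helper. PivotSelection and MedianOfThreeIndex are hypothetical names; this only illustrates the idea and makes no claim about what the framework does.

```csharp
public static class PivotSelection
{
    // Samples a[lo], a[mid] and a[hi] and returns the index holding the
    // median of those three values.
    public static int MedianOfThreeIndex(int[] a, int lo, int hi)
    {
        int mid = lo + (hi - lo) / 2;   // avoids overflow of (lo + hi)
        int top = a[lo], middle = a[mid], bottom = a[hi];

        if ((top <= middle && middle <= bottom) || (bottom <= middle && middle <= top))
            return mid;
        if ((middle <= top && top <= bottom) || (bottom <= top && top <= middle))
            return lo;
        return hi;
    }
}
```

For an already sorted range this picks the middle element, which splits the range roughly in half instead of peeling off one element at a time.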

19

Disadvantage: O(n) Insert and Remove (and Add, whenever capacity grows)

20

What if we Need Fast Add, Insert & Remove?

21

LinkedList<T>

• Doubly linked
  – Each item points to the previous and next items
  – This means it’s super fast
• Add, insert and remove are all O(1) operations (given a reference to the node)
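A short sketch of the O(1) operations, using the node references that LinkedList<T> hands back. LinkedListDemo and Build are hypothetical names chosen for this example.

```csharp
using System.Collections.Generic;

public static class LinkedListDemo
{
    public static LinkedList<string> Build()
    {
        var list = new LinkedList<string>();

        LinkedListNode<string> b = list.AddLast("b");   // O(1): append at the tail
        list.AddFirst("a");                             // O(1): prepend at the head
        LinkedListNode<string> d = list.AddLast("d");   // O(1)
        list.AddBefore(d, "c");                         // O(1): we already hold node d
        list.Remove(b);                                 // O(1): unlink by node reference

        return list;                                    // contains a, c, d
    }
}
```

Note that Remove(value), as opposed to Remove(node), has to find the node first, so it is a linear scan.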

22

Demo

23

Disadvantage: O(n) lookups

24

What if we Need Fast Lookups?

25

Dictionary<TKey, TValue>

• Performance depends on key.GetHashCode()
  – Hash codes must be evenly distributed across the int range
• If two keys return hashes that give the same index
  – Dictionary must store the colliding items in the same bucket
  – Must search that bucket later to return the item
  – This hurts performance
• Use your own type as the key, then this is on you
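The cost of collisions can be shown without a stopwatch by counting how often the dictionary has to call Equals() during one lookup. CountingKey and CollisionDemo are hypothetical types written for this sketch; the constant hash code is deliberately bad.

```csharp
using System;
using System.Collections.Generic;

// Key whose hash code is either well spread (the id) or a constant.
public sealed class CountingKey : IEquatable<CountingKey>
{
    public static int EqualsCalls;   // not thread-safe; fine for a single-threaded demo

    private readonly int _id;
    private readonly bool _constantHash;

    public CountingKey(int id, bool constantHash)
    {
        _id = id;
        _constantHash = constantHash;
    }

    public bool Equals(CountingKey other)
    {
        EqualsCalls++;
        return !ReferenceEquals(other, null) && other._id == _id;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as CountingKey);
    }

    public override int GetHashCode()
    {
        return _constantHash ? 1 : _id;
    }
}

public static class CollisionDemo
{
    // Fills a dictionary with 'size' keys, then returns how many Equals()
    // calls a single lookup of the first key needed.
    public static int EqualsCallsForLookup(int size, bool constantHash)
    {
        var map = new Dictionary<CountingKey, int>();
        for (int i = 0; i < size; i++)
            map[new CountingKey(i, constantHash)] = i;

        CountingKey.EqualsCalls = 0;
        map.ContainsKey(new CountingKey(0, constantHash));
        return CountingKey.EqualsCalls;
    }
}
```

With the constant hash every key lands in the same bucket, so the lookup degrades toward a linear search; with distinct hash codes it needs far fewer Equals() calls.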

26

Dictionary<TKey, TValue>

• Objects used as keys must also implement IEquatable<T>.Equals()
• Or override Equals()
• Why?
  – Different keys may return the same hash code
  – Equals() is used by the dictionary when comparing keys
  – So you must ensure the following
    • If A.Equals(B) then A.GetHashCode() and B.GetHashCode() return the same value
• Override Equals() but not GetHashCode() == compiler warning (CS0659)
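A minimal sketch of a key type that keeps Equals() and GetHashCode() in step. GridPoint is a hypothetical type, and the multiplier 397 is just a common arbitrary choice for mixing fields.

```csharp
using System;
using System.Collections.Generic;

public sealed class GridPoint : IEquatable<GridPoint>
{
    public int X { get; }
    public int Y { get; }

    public GridPoint(int x, int y)
    {
        X = x;
        Y = y;
    }

    // Strongly typed equality, used by Dictionary via the default comparer.
    public bool Equals(GridPoint other)
    {
        return !ReferenceEquals(other, null) && X == other.X && Y == other.Y;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as GridPoint);
    }

    // Hash exactly the fields that Equals() compares, so equal points
    // always produce equal hash codes.
    public override int GetHashCode()
    {
        unchecked
        {
            return (X * 397) ^ Y;
        }
    }
}
```

With this in place, two separately constructed GridPoint(1, 2) instances find the same dictionary entry. The fields are read-only on purpose: a key whose hash code changes after insertion can no longer be found.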

27

Disadvantage: one value per key

28

What if I Need Multiple Values per Key?

29

Lookup<TKey, TElement> Demo
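A small sketch of the multiple-values-per-key idea. Lookup<TKey, TElement> has no public constructor, so it is built with the LINQ ToLookup() extension; LookupDemo and ByInitial are hypothetical names, not the code from the demo.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LookupDemo
{
    // Groups words by their first character. Assumes no word is empty.
    public static ILookup<char, string> ByInitial(IEnumerable<string> words)
    {
        return words.ToLookup(w => w[0]);
    }
}
```

Indexing the result with a key returns every value stored under it, and a key that is not present yields an empty sequence rather than throwing. The lookup is read-only once built.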

30

Concurrent Collections

31

Types of Concurrent Collections

• ConcurrentBag<T>
• ConcurrentDictionary<TKey, TValue>
• ConcurrentQueue<T>
• ConcurrentStack<T>
• OrderablePartitioner<T>
• BlockingCollection<T>

32

Key Characteristics

• New in .NET 4.0
• Guards against multi-thread collection conflicts
• The bag, queue and stack implement IProducerConsumerCollection<T>
  – TryAdd()
    • Tries to add an item to the collection; returns a success bool
  – TryTake()
    • Tries to remove and return an item; returns a success bool
    • Returns the item in an out param
• Always check the return value before moving on
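The “always check the return value” rule looks like this in practice. A minimal sketch written against the IProducerConsumerCollection<T> interface; TryPatternDemo and MoveOne are hypothetical names.

```csharp
using System.Collections.Concurrent;

public static class TryPatternDemo
{
    // Moves one item from 'source' to 'target'.
    // Returns false if nothing could be taken or the add was refused.
    public static bool MoveOne(IProducerConsumerCollection<int> source,
                               IProducerConsumerCollection<int> target)
    {
        int item;
        if (!source.TryTake(out item))
        {
            // Nothing was available: 'item' holds no meaningful value here.
            return false;
        }
        return target.TryAdd(item);
    }
}
```

Because another thread may empty the source between a Count check and the take, the bool returned by TryTake() is the only reliable signal that an item was actually obtained.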

33

Do I Have To Check Every Time?!

• BlockingCollection<T>
  – Blocks and waits until the operation can complete
  – Uses Add() and Take() methods
    • Block the thread and wait until the operation can complete
    • Add() has an overload to pass a CancellationToken
    • Add() may also block if a bounded capacity was set
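A producer/consumer sketch of the blocking behaviour, with a small bounded capacity so that both Add() and Take() get a chance to block. BlockingDemo and ProduceAndSum are hypothetical names; Task.Run assumes .NET 4.5 or later.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class BlockingDemo
{
    // Produces the numbers 1..count on one thread and sums them on another.
    public static int ProduceAndSum(int count, int boundedCapacity)
    {
        using (var queue = new BlockingCollection<int>(boundedCapacity))
        {
            Task producer = Task.Run(() =>
            {
                for (int i = 1; i <= count; i++)
                    queue.Add(i);       // blocks while the collection is full
            });

            int sum = 0;
            for (int i = 0; i < count; i++)
                sum += queue.Take();    // blocks while the collection is empty

            producer.Wait();
            return sum;
        }
    }
}
```

No return values need checking here: each call simply waits until it can complete, which is the trade-off against the Try methods.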

34

But I Don’t Want it to Wait For Ever!

• So we don’t want to wait forever
• Nor do we want to cancel the Add() from outside
• TryAdd() and TryTake() are offered too
• Where you can specify a timeout
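The timeout overloads can be sketched as two small helpers around BlockingCollection<T>. TimeoutDemo, OfferWithin and TakeOrDefault are hypothetical names; the timeouts are in milliseconds.

```csharp
using System.Collections.Concurrent;

public static class TimeoutDemo
{
    // Tries to add 'item', giving up after 'timeoutMs' if the collection stays full.
    public static bool OfferWithin(BlockingCollection<string> queue, string item, int timeoutMs)
    {
        return queue.TryAdd(item, timeoutMs);
    }

    // Tries to take an item, returning 'fallback' if none arrives within 'timeoutMs'.
    public static string TakeOrDefault(BlockingCollection<string> queue, int timeoutMs, string fallback)
    {
        string item;
        return queue.TryTake(out item, timeoutMs) ? item : fallback;
    }
}
```

With a collection bounded to one item, a second OfferWithin() call returns false once the timeout elapses instead of blocking indefinitely, and TakeOrDefault() on an empty collection returns the fallback.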

35

Summary

• List is a good general purpose collection
  – Construct to size if possible
  – Construct to upper threshold then trim
  – Prefer AddRange() over Add()
  – Be aware of “Quicksort Killers”
• Use LinkedList if you need fast insert/remove
• Use Dictionary if you need fast lookup
• Use Lookup if you need multiple values per key
• Use concurrent collections for thread safety

36

Questions

• gary.short@gibraltarsoftware.com
• @garyshort
• Facebook.com/TheOtherGaryShort
