spring 2006cs 685 network algorithmics1 longest prefix matching trie-based techniques cs 685 network...

49
Spring 2006 CS 685 Network Algorithmics 1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Upload: ruben-shatswell

Post on 14-Dec-2015

237 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 1

Longest Prefix MatchingTrie-based Techniques

CS 685 Network AlgorithmicsSpring 2006

Page 2: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 2

The Problem

• Given:– Database of prefixes with associated next hops, say:

1000101* 128.44.2.301101100* 4.33.2.110001* 124.33.55.1210* 151.63.10.11101* 4.33.2.1

1000100101* 128.44.2.3– Destination IP address, e.g. 120.16.8.211

• Find: the longest matching prefix and its next hop

Page 3: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 3

Constraints

• Handle 150,000 prefixes in database• Complete lookup in minimum-sized (40-byte)

packet transmission time– OC-768 (40 Gbps): 8 nsec

• High degree of multiplexing—packets from 250,000 flows interleaved

• Database updated every few milliseconds

performance number of memory accesses

Page 4: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 4

Basic ("Unibit") Trie Approach

• Recursive data structure (a tree)• Nodes represent prefixes in the

database– Root corresponds to prefix of length

zero

• Node for prefix x has three fields:– 0 branch: pointer to node for prefix

x0 (if present)– 1 branch: pointer to node for prefix

x1 (if present)– Next hop info for x (if present)

Example Database:

a: 0* xb: 01000* yc: 011* zd: 1* we: 100* uf: 1100* zg: 1101* uh: 1110* zi: 1111* x

Page 5: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 5

0 1

0 1

ax

0 1

dw

0 1

0 1

cz

0 1

0 1

ew

a: 0* xb: 01000* yc: 011* zd: 1* we: 100* uf: 1100* zg: 1101* uh: 1110* zi: 1111* x

0 1

0 1

0 1

by

0 1

0 1

0 1

fz

0 1

0 1

gu

0 1

hz

0 1

ix

Page 6: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 6

Trie Search Algorithm

typedef struct foo {

struct foo *trie_0, *trie_1;

NEXTHOPINFO trie_info;

} *TRIENODE;

NEXTHOPINFO best = NULL;

TRIENODE np = root;

unsigned int bit = 0x80000000;

while (np != NULL) {

if (np->trie_info)

best = np->trie_info;

// check next bit

if (addr&bit)

np = np->trie_1;

else

np = np->trie_0;

bit >>= 1;

}

return best;

Page 7: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 7

Conserving Space

Sparse database wasted space– Long chains of trie nodes with only one non-NULL

pointer– Solution: handle "one-way" branches with special

nodes• encode the bits corresponding to the missing nodes

using text strings

Page 8: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 8

0 1

0 1

ax

0 1

dw

0 1

0 1

cz

0 1

0 1

eu

a: 0* xb: 01000* yc: 011* zd: 1* we: 100* uf: 1100* zg: 1101* uh: 1110* zi: 1111* x

0 1

0 1

0 1

by

0 1

0 1

0 1

fz

0 1

0 1

gu

0 1

hz

0 1

ix

Page 9: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 9

0 1

0 1

ax

0 1

dw

0 1

0 1

cz

0 1

0 1

eu

a: 0* xb: 01000* yc: 011* zd: 1* we: 100* uf: 1100* zg: 1101* uh: 1110* zi: 1111* x

0 1

by

0 1

0 1

0 1

fz

0 1

0 1

gu

0 1

hz

0 1

ix

00

Page 10: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 10

Bigger Issue: Slow!

• Computing one bit at a time is too slow– Worst-case: one memory access per bit (32

accesses!)

• Solution: compute n bits at a time– n = stride length– Use n-bit chunks of addresses as index into array in

each trie node

• How to handle prefixes which are not a multiple of n in length?– Extend them, replicate entries as needed– E.g. n=3, 1* becomes 100, 101, 110, 111

Page 11: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 11

Extending Prefixes

Original Databasea: 0* x

b: 01000* y

c: 011* z

d: 1* w

e: 100* w

f: 1100* zg: 1101* uh: 1110* zi: 1111* x

Example:stride length=2

Expanded Databasea0: 00* xa1: 01* xb0: 010000* yb1: 010001* yc0: 0110* zc1: 0111* zd0: 10* wd1: 11* we0: 1000* ue1: 1001* uf: 1100* zg: 1101* uh: 1110* zi: 1111* x

Page 12: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 12

Expanded Databasea0: 00* xa1: 01* xb0: 010000* yb1: 010001* yc0: 0110* zc1: 0111* zd0: 10* wd1: 11* we0: 1000* we1: 1001* wf: 1100* zg: 1101* uh: 1110* zi: 1111* x

x

x

w

w

0001

1011

z

z

0001

1011

y

y0001

1011

u

u0001

1011

z

u

z

x

0001

1011

Total cost: 40 pointers (22 null)Max #memory accesses: 3

Page 13: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 13

0 1

0 1

x

0 1

w

0 1

0 1

z

0 1

0 1

u

a: 0* xb: 01000* yc: 011* zd: 1* we: 100* uf: 1100* zg: 1101* uh: 1110* zi: 1111* x

0 1

by

0 1

0 1

0 1

z

0 1

0 1

u

0 1

z

0 1

x

00

Total cost: 46 pointers (21 null)Max #memory accesses: 5

Page 14: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 14

Choosing Fixed Stride Lengths

• We are trading space for time:– Larger stride length fewer memory accesses– Larger stride length more wasted space

• Use the largest stride length that will fit in memory and complete required accesses within the time budget

Page 15: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 15

Updating

Insertion1. Keep a unibit version of the trie, with each node labeled

with longest matching prefix and its length2. To insert P, search for P, remembering last node, until

1. Null pointer (not present), or2. Reach the last stride in P

3. Expand P as needed to match stride length4. Overwrite any existing entries with length less than P's

Deletion is similar1. Find entry for prefix to be deleted2. Remove its entry (from unibit copy also!)3. Expand any entries that were "covered" by the deleted

prefix

Page 16: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 16

Variable Stride Lengths

• It is not necessary that every node have the same stride length

• Reduce waste by allowing stride length to vary per node– Actual stride length encoded in pointer to the trie

node– Nodes with fewer used pointers can have smaller

stride lengths

Page 17: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 17

Expanded Databasea0: 00* xa1: 01* xb: 01000* yc0: 0110* zc1: 0111* zd0: 10* wd1: 11* we: 100* wf: 1100* zg: 1101* uh: 1110* zi: 1111* x

x

x

w

w

0001

1011

z

z

0001

1011

y01

u01

z

u

z

x

0001

1011

Total waste: 16 pointersMax #memory accesses: 3Note: encoding stride length costs 2 bits/pointer

2 bits

1 bit2 bits

1 bit

Page 18: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 18

Calculating Stride Lengths

• How to pick stride lengths?– We have two variables to play with: height and stride length– Trie height determines lookup speed set max height first

• Call it h– Then choose strides to minimize storage

• Define cost of trie T, C(T):– If T is a single node, then number of array locations in the

node– Else number of array locations in root + i C(Ti), where Ti's

are children of T• Straightforward recursive solution:

– Root stride s results in y=2s subtries T1, ... Ty

– For each possible s, recursively compute optimal strides for C(Ti)'s using height limit h-1

– Choose root stride s to minimize total cost = (2s + i C(Ti))

Page 19: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 19

Calculating Stride Lengths

• Problem: Expensive, repeated subproblems• Solution (Srinivasan & Varghese):

Dynamic programming• Observe that each subtree of a variable-stride trie

contains the set of prefixes as some subtree of the original unibit trie

• For each node of the unibit trie, compute optimal stride and cost for that stride

• Start at bottom (height = 1), work up• Determine optimal grouping of leaves in subtree• Given subtree optimal costs, compute parent optimal cost

• This results in optimal stride length selections for the given set of prefixes

Page 20: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 20

0 1

0 1

x

0 1

w

0 1

0 1

z

0 1

0 1

u

0 1

by

0 1

0 1

0 1

z

0 1

0 1

u

0 1

z

0 1

x

00

Stride = 2Cost = 4

Stride = 1Cost = 7

Stride = 0Cost = 1

Stride = 0Cost = 1

Stride = 0Cost = 1

Stride = 1Cost = 2

Page 21: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 21

Alternative Method: Level Compression

• LC-trie (Nilsson & Karlsson '98) is a variable-stride trie with no empty entries in trie nodes

• Procedure:– Select largest root stride that allows no empty

entries– Do this recursively down through the tree

• Disadvantage: cannot control height precisely

Page 22: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 22

0 1

0 1

x

0 1

w

0 1

0 1

z

0 1

0 1

u

0 1

by

0 1

0 1

0 1

z

0 1

0 1

u

0 1

z

0 1

x

00

Stride = 1

Stride = 1

Stride = 1

Stride = 2

Page 23: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 23

Performance Comparisons

• MAE-East database (1997 snapshot)– ~ 40K prefixes

• "Unoptimized" multibit trie: 2003 KB• Optimal fixed-stride: 737 KB, computed in 1 msec

– Height limit = 4 ( 1 Gbps wire speed @ 80 nsec/access)

• Optimized (S&V) variable-stride: 423 KB, computed in 1.6 sec, Height limit = 4

• LC-compressed– Height = 7– 700 KB

Page 24: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 24

Lulea Compressed Tries

• Goals:– Minimize number of memory accesses– Aggressively compress trie

• Goal: so it can fit in SRAM (or even cache)

• Three-level trie with strides of 16, 8, 8– 8 mem accesses typical

• Main Techniques1. Leaf-pushing2. Eliminate duplicate pointers from trie node arrays3. Efficient bit-counting using precomputation for large

bitmaps4. Use of indices instead of full pointers for next-hop info

Page 25: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 25

1. Leaf-Pushing

• In general, a trie node entry has associated– A pointer to a next trie node– A prefix (i.e. pointer to next-hop info)– Or both, or neither

• Observation: we don't need to know about a prefix pointer along the way until we reach a leaf

• So: "push" prefix pointers down to leaves– Keep only one set of pointers per node

Page 26: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 26

Leaf-Pushing: the Concept

Prefixes

Page 27: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 27

Expanded Databasea0: 00* xa1: 01* xb0: 010000* yb1: 010001* yc0: 0110* zc1: 0111* zd0: 10* wd1: 11* we0: 1000* ue1: 1001* uf: 1100* zg: 1101* uh: 1110* zi: 1111* x

x

x

w

w

0001

1011

z

z

0001

1011

y

y0001

1011

u

u0001

1011

z

u

z

x

0001

1011

Cost: 40 pointers(22 wasted)

Before

x0001

1011

x

z

z

0001

1011

y

y

x

x

0001

1011

u

u

w

w

0001

1011

z

u

z

x

0001

1011

Cost: 20 pointers

After Leaf-Pushing

Page 28: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 28

2. Removing Duplicate Pointers

• Leaf-pushing results in many consecutive duplicate pointers

• Would like to remove redundancy and store only one copy in each node

• Problem: now we can't directly index into array using address bits– Example: k=2, bits 01 = index 1

needs to be converted to index 0 somehow

u

u

w

w

0001

1011

u

w

Page 29: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 29

2. Removing Duplicate Pointers

Solution: Add a bitmap: one bit per original entry– 1 indicates new value– 0 indicates duplicate of previous

value

• To convert index i, count 1's up to position i in the bitmap, and subtract 1Example: old index 1 new index 0 old index 2 new index 1

u

u

w

w

0001

1011

u

w

1

0

1

0

0001

1011

Page 30: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 30

Bitmap for Duplicate Elimination

Prefixes

100000000000100010000100000000000000000010000000000100000000000010001000000100000011000000000000000000

Page 31: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 31

3. Efficient Bit-Counting

• Lulea first-level 16-bit stride 64K entries• Impractical to count bits up to, say, entry

34578 on the fly!• Solution: Precompute (P2a)

– Divide bitmap into chunks (say, 64 bits each)– Store the number of 1 bits in each chunk in an array B– Compute # 1 bits up to bit k by:

chunkNum = k >> 6;

posInChunk = k & 0x3f; // k mod 64

numOnes = B[chunkNum] + count1sInChunk(chunkNum,posInChunk) – 1;

Page 32: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 32

Bit-Counting Precomputation Example

1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1

0 3 3 6 7

0 1 0 1

9

Chunk Size = 8 bits

Converted index = 7 + 2 – 1 = 8

index = 35

Cost: 2 memory accesses (maybe less)

Page 33: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 33

4. Efficient Pointer Representation

• Observation: the number of different next-hop pointers is limited– Each corresponds to an immediate neighbor of the

router– Most routers have at most a few dozen neighbors– In some cases a router might have a few hundred

distinct next hops, even a thousand

• Apply P7: avoid unnecessary generality– Only a few bits (say 8-12) needed to distinguish

between actual next-hop possibilities

• Store indices into table of next-hops info– E.g., to support up to 1024 next hops: 10 bits– 40K prefixes 40K pointers 160KB @ 32 bits,

vs 50KB @ 10 bits

Page 34: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 34

Other Lulea Tricks

• First level of trie uses two levels of bit-counting array– First counts bits before the 64-bit chunk– Second counts bits in the 16-bit word within chunk

• Second- and third-level trie nodes are laid out differently depending on number of pointers in them– Each node has 256 entries– Categorized by number of pointers

• 1-8: "sparse" — store 8-bit indices + 8 16-bit pointers (24B)• 9-64: "dense" — like first level, but only one bit-counting array

(only six bits of count needed)• 65-256: "very dense" — like first level, with two bit-counting

arrays: 4 64-bit chunks, 16 16-bit words

Page 35: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 35

Lulea Performance Results

1997 MAE-East database– 32K entries, 58K leaves, 56 different next hops– Resulting Trie size: 160KB– Build time: 99 msec– Almost all lookups took < 100 clock cycles

(333MHz Pentium)

Page 36: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 36

Trie Bitmap(Eatherton, Dittia & Varghese)

• Goal: storage, speed comparable to Lulea plus fast insertion

• Main culprit in slow insertion is leaf-pushing• So get rid of leaf-pushing

– Go back to storing node and prefix pointers explicitly– Use the same compression bitmap trick on both lists

• Store next-hop information separately, only retrieve at the end– Like leaf-pushing, only in the control plane!

• Use smaller strides to limit memory accesses to one per trie node (Lulea requires at least two)

Page 37: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 37

Storing Prefixes Explicitly

• To avoid expansion/leaf pushing, we have to store prefixes in the node explicitly

• There are 2k+1 – 1 possible prefixes of length k– Store list of (unique) next hop pointers for each

prefix covered by this node– Use same bitmap/bit counting technique as Lulea to

find pointer index– Keep trie nodes small (stride 4 or less), exploit

hardware (P5) to do prefix matching, bit counting

Page 38: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 38

a: 0* xb: 01000* yc: 011* zd: 1* we: 100* uf: 1100* zg: 1101* uh: 1110* zi: 1111* x

Example: Root node, stride = 3

000

001

010

011

100

101

110

111

0

0

1

0

0

0

1

0

0

1

*

0*

1

0

0

0

0

0

0

0

1

1

0

0

1*

00*

01*

10*

11*

000*

001*

010*

011*

100*

101*

110*

0111*

to child nodes

x

w

z

u

Page 39: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 39

Tree Bitmap Results

• Insertions are as in simple multibit tries• May cause complete revamp of trie node, but

that requires only one memory allocation• Performance comparable to Lulea, but

insertion much faster

Page 40: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 40

A Different Lookup Paradigm

• Can we use binary search to do longest-prefix lookups?

• Observe that each prefix corresponds to a range of addresses– E.g. 204.198.76.0/24 covers the range

204.198.76.0 – 204.198.76.255– Each prefix has two range endpoints– N disjoint prefixes divide the entire space into 2N+1

disjoint segments– By sorting range endpoints, and comparing to

address, can determine exact prefix match

Page 41: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 41

Prefixes as Ranges

Page 42: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 42

Binary Search on Ranges

• Store 2N endpoints in sorted order– Including the full address range for *

• Store two pointers for each entry– ">" entry: next-hop info for addresses strictly greater

than that value– "=" entry: next-hop info for addresses equal to that

value

Page 43: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 43

Example: 6-bit addresses

Example Database:a: 0* xb: 01000* yc: 011* zd: 1* we: 100* uf: 1100* zg: 1101* uh: 1110* zi: 1111* x

a: 000000-011111 xb: 010000-010001 yc: 011000-011111 zd: 100000-111111 we: 100000-100111 u f: 110000-110011 zg: 110100-110111 uh: 111000-111011 zi: 111100-111111 x

000000010000010001011000011111100000100111110000110011110100110111111000111011111100111111

xyxxxuwzxuxzxx-

xyyzxuuzzuuzzxx

> =

Page 44: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 44

Range Binary Search Results

• N prefixes can be searched in log2 N + 1 steps– Slow compared to multibit tries– Insertion can also be expensive

• Memory-expensive– Requires 2 full-size entries per prefix– 40K prefixes, 32-bit addresses: 320KB, not counting

next-hop info

• Advantage: no patent restrictions!

Page 45: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 45

Binary Search on Prefix LengthsWaldvogel, et al

• For same-length prefixes, a hash table gives fast comparisons

• But linear search on prefix lengths is too expensive• Can we do a faster (binary) search on prefix

lengths?– Challenge: how do we know whether to move "up" or

"down" in length on failure?– Solution: include extra information to indicate presence of

a longer prefix that might match– These are called marker entries– Each marker entry also contains best-matching prefix for

that node– When searching, remember best-matching prefix when

going "up" because of a marker, in case of later failure

Page 46: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 46

Example: Binary Search on Prefix Length

Example Database:a: 0* xb: 01000* yc: 011* zd: 1* we: 100* uf: 1100* zg: 1101* uh: 1110* zi: 1111* x

Prefix Lengths: 1, 3, 4, 7

0* 1*length 1BMP a,xd,w

011* 100* 110M 111M 010Mlength 3

BMP c,z e,u d,w d,w a,x

length 4

BMP1100* 1101* 1110* 1111* 0100M

f,z g,u h,z i,x a,x

Example: Search for address 011000 and 101000

length 5

BMP01000*

b,y

Page 47: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 47

Binary Search on Prefix Length Performance

• Worst-case number of hash-table accesses: 5• However, most prefixes are 16 or 24 bits

– Arrange hash tables so these are handled in one or two accesses

• This technique is very scalable for larger address lengths (e.g. 128 bits for IPv6)– Unibit Trie for IPv6: 128 accesses!

Page 48: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 48

Memory Allocation for Compressed Schemes

• Problem: when using a compressed scheme (like Lulea), trie nodes are kept at minimal size

• If a node grows (changes size), it must be reallocated and copied over

• As we have discussed, memory allocators can perform very badly– Assume M is the size of the largest possible request– Cannot guarantee more than 1/log2 M of memory will

be used!• E.g. if M=32, 20% is max guaranteed utilization• Router vendors cannot claim to support large

databases

Page 49: Spring 2006CS 685 Network Algorithmics1 Longest Prefix Matching Trie-based Techniques CS 685 Network Algorithmics Spring 2006

Spring 2006 CS 685 Network Algorithmics 49

Memory Allocation for Compressed Schemes

• Solution: Compaction– Copy memory from one location to another

• General-purpose OS's avoid compaction!– Reason: very hard to find and update all pointers to

objects in the moved region

• The good news:– Pointer usage is very constrained in IP lookup

algorithms– Most lookup structures are trees at most one

pointer to any node• By storing a "parent" pointer, can easily update pointers

as needed