12_

11
For Review Only Bloom-BIRD: A Scalable Open Source Router Based on Bloom Filter Journal: IEEE Network Magazine Manuscript ID: Draft Topic or Series: Special Issue-Open Source for Networking: Development and Experimentation Date Submitted by the Author: n/a Complete List of Authors: Bahrambeigy, Bahram; Islamic Azad University, Science and Research branch, Information Technology Department Ahmadi, Mahmood; Razi University, Computer Engineering Department Fazlali, Mahmood; Shahid Beheshti University (SBU), Computer Science Department Key Words: BIRD, Bloom Filter, Open Source Routers, Quagga, XORP

Upload: pupuol

Post on 02-Apr-2016

213 views

Category:

Documents


1 download

DESCRIPTION

http://www.pupuol.com/download/12.pdf

TRANSCRIPT

Page 1: 12_

For Review O

nly

Bloom-BIRD: A Scalable Open Source Router Based on

Bloom Filter

Journal: IEEE Network Magazine

Manuscript ID: Draft

Topic or Series: Special Issue-Open Source for Networking: Development and Experimentation

Date Submitted by the Author: n/a

Complete List of Authors: Bahrambeigy, Bahram; Islamic Azad University, Science and Research branch, Information Technology Department Ahmadi, Mahmood; Razi University, Computer Engineering Department Fazlali, Mahmood; Shahid Beheshti University (SBU), Computer Science Department

Key Words: BIRD, Bloom Filter, Open Source Routers, Quagga, XORP

Page 2: 12_

For Review O

nly

1

Abstract— Flexibility and configurability behind the open-

source routers has extended their usage via the networks. On the

other hand, the need for high-performance and high-speed

routers has become a fundamental issue due to incremental

growth of information exchange through Internet and intranets.

Therefore, in this paper we employ Bloom filter to accelerate the

BIRD routing daemon which is the best based on resource usage

comparison (i.e. CPU and memory usage) made between three

open-source routers in the paper. Based on the best of our

knowledge this is the first application of Bloom filter on BIRD

software router. Bloom-BIRD (our changed version of BIRD) can

scale its Bloom filter capacity therefore false positive errors are

handled in an acceptable rate. It shows up to 93% speedup for IP

lookups over standard BIRD when number of inserted nodes into

its internal FIB (Forwarding Information Base) becomes huge.

Index Terms— BIRD, Bloom Filter, Open Source Routers,

Quagga, XORP.

I. INTRODUCTION

he need for high-performance and high-speed routers has

become a fundamental issue due to incremental growth of

information exchange through Internet and intranets. Due to

adoption of CIDR (Class-less Inter-domain Routing), routers

need to find best match between different prefix lengths that

may differ from lengths 1 to 128 based on what version of IP

and what prefix is used. This process of finding matching IPs

is time consuming and a lot of hardware (e.g. TCAM and

SRAM) and algorithmic approaches (e.g. binary searches) are

proposed in the literature as will be discussed further in the

related works.

Modern IP router solutions can be classified into three main

categories, hardware routers, software routers, and

programmable network processors (NPs) [1]. PC-based

software routers have made the reasonable computational

platform available with easy development and

programmability features. These features are the most

important in comparison with hardware routers. Current

software routers can forward up to 40 Gbits/sec traffic on a

Bahram Bahrambeigy is with Information Technology Department,

Science and Research branch, Islamic Azad University, Kermanshah, Iran (e-

mail: [email protected]). Mahmood Ahmadi is with Computer Engineering Department, Razi

University, Kermanshah, Iran (e-mail: [email protected]).

Mahmood Fazlali is with Computer Science Department, Shahid Beheshti University (SBU), Tehran, Iran (e-mail: [email protected]).

single commodity personal computer [1]. On the other hand,

existence of open-source software routers has created an

opportunity to study and change their code to make the better

routers based on what researchers need. BIRD [2], Quagga

[3], and XORP [4] are examples of such open-source routers

and are selected in the paper. These three routers are compared

with each other in terms of CPU and memory usage by a

simple scenario which all of them implement OSPF protocol.

Experimental results show that the BIRD obtains the best

performance among them. BIRD obtained less CPU usage

than Quagga more than twice and more than 49 times than

XORP router. For memory usage, BIRD obtained less

memory usage than Quagga about twice and than XORP about

44 times. Therefore, BIRD is selected to implement a Bloom

filter (BF) [5] on its internal FIB (Forwarding Information

Base) which all routing tables are based on this data-structure.

Results show that BF on BIRD makes it up to 93% faster than

its internal hashing mechanism for searching big FIBs when

the result of search is negative. Also results indicate that there

is even speedup when result of searching a particular prefix in

the FIB is positive.

The main concern in this paper is to show how a Bloom

filter can help an open-source router to speedup searches when

number of inserted nodes becomes huge. In order to have a

fair comparison between two versions of BIRD (i.e. standard

BIRD and Bloom-BIRD), Basic rules of standard BIRD have

not changed. For example maximum length of BIRD’s main

hash is 16-bit, so it is the same for Bloom-BIRD too. The

Bloom-BIRD has a Bloom filter array (so a space overhead) to

speedup simple searches for a given IP and length and also

Longest Prefix Matching (LPM) searches. The array can scale

its capacity therefore false positive errors are handled in an

acceptable rate.

The main contribution of the paper is as follows:

- Comparison of the memory and CPU usage of three

open-source routers, BIRD, Quagga and XORP.

- Proposal of Bloom-BIRD router to enhance the

performance of BIRD open-source router that utilizes

a Bloom filter in its architecture.

The rest of the paper is organized as follows. In section II, a

related work of Bloom filter and its applications in network

processing is presented. In section III, in the first part, a

scenario in order to compare three open-source routers is

presented and in the latter part results of comparison is

presented. Section IV, contains three parts, which first, is a

brief introduction to BIRD’s FIB data structure, in the second

Bloom-BIRD: A Scalable Open Source Router

Based on Bloom Filter

Bahram Bahrambeigy, Mahmood Ahmadi and Mahmood Fazlali

T

Page 1 of 10

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

Page 3: 12_

For Review O

nly

2

part introduction of scenario in order to evaluate Bloom-BIRD

and standard BIRD is presented and the last part is results of

Bloom-BIRD evaluation. Section V presents the conclusion of

the paper.

II. RELATED WORKS

Bloom filter (BF) is a randomized and probabilistic data-

structure proposed by Burton Bloom in the 1970s [5]. BF

normally consists of a bit-array representing existence of

inserted elements. By checking k hash functions and getting

negative answer, it can be determined that the element is not

inserted certainly. However some False Positive (FP) may

occur. Which means BF may say some elements exist by

mistake so it needs to check actual hash table to make sure

about positive answers. BF’s bit-array can reside in an on-chip

memory by a hardware implementation to do k hash functions

checks in parallel. Main advantage of BF structure is Space

and Time efficiency in which consumes much less space than

ordinary data structures because of its potential collisions and

requiring much less and more predictable time to query a

member.

More than 20 variants of BF are presented in the literature

[6]. Each and every one of them is used for a special manner.

For example, standard Bloom filter (SBF) is used in order to

check if a specific element is present or not. An important

draw-back of SBF is that insertions can not be undone.

Counting Bloom filter (CBF) and later, Deletable BF (DlBF)

proposed in order to gain the removability in BF [6]. In CBF

each bit in the Bloom array will be replaced by a counter.

Each insertion, increments counters related to k hash

functions. Obviously, each deletion decrements related

counters. Another variant of BF which supports deletions as

mentioned is DlBF. It splits BF array into multiple regions and

tracks regions of BF array in which collisions occur. A small

fraction of bit-array will be used in order to determine related

area is collision-free or not. If bits are located in a collision-

free region then the bit can be reset safely, otherwise it will

not be safe to delete. Therefore, some bits may not reset if

they are located in a collisionary region. CBF is selected in the

paper because of its simplicity and consistency over deletions

instead of DlBF. DlBF would be a good option if BF is going

to be implemented in an on-chip memory.

Also other useful variants of BF are Scalable Bloom filter

(SlBF) and Dynamic Bloom filter (DBF) which adapt their

performance and capacity when number of inputs increases

[6]. SlBF does not support counting and deletion while DBF

can do it. They are based on this draw-back on SBF that there

may be non-predictable input entries into SBF, so the error

rate becomes to enlarge if BF array does not scale with

number of inputs.

BFs are used in various applications including network

processing as discussed in [7] can be classified into four

categories: Resource routing, Packet routing, Measurement,

and Collaborating in overlay and peer-to-peer networks. Also

IP Route lookup and Packet Classification are important

applications of BF in the network processing (e.g. [8]).

Longest Prefix matching (LPM) or Best Matching Prefix

(BMP) which can be classified into IP route lookup is also an

interesting area of BF application. There are a lot of proposed

algorithms in order to speedup BMP in the literature. In [9]

authors have classified BMP algorithms into Trie-based

algorithms, Binary search on prefix values, and Binary search

on prefix lengths. A Trie is a tree-based data-structure

allowing organization of prefixes on a digital basis using the

bits of prefixes to direct the branching [9]. Trie-based schemes

do a linear search on prefix length since compare one bit at a

time. The worst case of memory accesses is W when prefix

length is W. But binary search algorithms on prefix values are

proportional to log2N which N is number of prefixes. Binary

search on prefix lengths are proportional to log2W [9]. There

has been a BF application for LPM [6] which uses parallel

check on on-chip memory to accelerate lookups before

checking slower off-chip memory.

Previous work which implemented a BF on a router (for the

acceleration) is [10] that authors have implemented a complete

BF even on hardware to show how BF can help forwarding

and shortest path routing that is implemented in Click router.

Their implementation adjusts BF size and false positives

dynamically without impacting forwarding rate. Although

BUFFALO is a complete BF implementation on a real router

but in this paper the main concern is to show BF application in

BIRD’s routing daemon (open-source router) that based on the

best of our knowledge is the first time. Bloom-BIRD can scale

its Bloom filter capacity therefore false positive errors are

handled in an acceptable rate. Basic rules and chain orders of

BIRD are not changed and changes are the modest in order to

have a fair comparison.

III. OPEN SOURCE ROUTERS PERFORMANCE

In this section, three open-source routers, i.e. BIRD,

Quagga and XORP are compared in terms of CPU and

memory usage while implementing OSPF protocol in a simple

scenario.

In the first subsection, the scenario is presented and in the

latter part, results of comparison are discussed.

A. Scenario and Test-bed

The scenario to compare three routers is depicted in Fig 1.

In the scenario, there are two routers (router1, router2) and

two clients (ubuntu1, ubuntu2) in which injection of packets

starts from ubuntu1 to ubuntu2 destination. This scenario is

implemented by Oracle VirtualBox v4.1 which all virtual

Fig. 1. Simple scenario to test routers; packet injection from ubuntu1 to

ubuntu2

Page 2 of 10

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

Page 4: 12_

For Review O

nly

3

machines run Ubuntu 10.04 with unmodified Linux kernel

3.2.0. Hardware configuration of the two routers are same,

they have 403 MB RAM and two network interfaces. And also

the two clients have same hardware configuration in which

they have one network interface and 256 MB RAM. Host of

the virtual machine manager is a home PC with 2.88 MHz

dual core CPU.

Packet injection is made from ubuntu1 client to ubuntu2

destination by Ostinato version 0.5 [11]. For evaluation, a

burst stream including 30 TCP, 30 UDP and 30 ICMP packets

with 64 Bytes length which took 1:30 minutes, is used. Speed

of injection in this software is set to 2 bursts/sec.

To gather amount of CPU and memory usage of the routers,

Valgrind profiler [12] is used in the two routers

simultaneously. This software is a DBI (Dynamic Binary

Instrumentation) framework designed for building

heavyweight DBA (Dynamic Binary Analysis) tools. The

program consists of different tools but in this comparison just

two of them i.e. Callgrind and Massif are used. Callgrind tool

uses number of functions calls as metric for CPU usage.

Massif tool which is used for memory usage is a heap profiler

that for each memory allocation/deallocation makes a snapshot

from memory to measure amount of memory allocated or

freed.

BIRD (1.3.8), Quagga (0.99.22) and XORP (1.8.5) are

executed separately in the scenario three times. Two routers

(router1, router2) run same software router and resource usage

is measured by Valgrind. Results are average performance of

these two routers.

B. Comparison results

Results of three open-source routers comparison is

presented in Figures 2 and 3. All routers run OSPF protocol in

a simple scenario as depicted in Figure 1. In Figure 2 the y

axis shows the total number of function calls of related open-

source router (i.e. CPU usage) in millions (106) base which is

gathered by Callgrind tool from Valgrind profiler, and the x

axis shows open-source routers name. In Figure 3 the y axis

shows total memory usage by routers in MB gathered by

Massif tool from Valgrind profiler. Similarly the x axis shows

open-source routers name.

As it is obvious from these two figures, for OSPF protocol

implementation, BIRD achieved the best performance in terms

of CPU and memory usage. BIRD obtained less CPU usage

than Quagga more than twice and more than 49 times than

XORP router. For memory usage, BIRD needs less memory

usage than Quagga about twice (1.80 times) and than XORP

about 44 times.

Since BIRD is the best among the routers in terms of CPU

and memory usage based on the comparison, it is selected to

implement BF on it and make it even faster when the network

is huge.

IV. BLOOM-BIRD: A BETTER BIRD

The BIRD project [2] aims to develop a fully functional

dynamic IP routing daemon primarily targeted on (but not

limited to) Linux, FreeBSD and other UNIX-like systems. It

was developed at Faculty of Math and Physics, Charles

University Prague. Currently it is developed and supported by

CZ.NIC1 Labs. It supports latest versions of routing protocols

such as BGP, RIP and OSPF. It also supports both IPv4 and

IPv6 and a simple command line interface to configure the

router.

There is a fundamental data-structure named FIB

(Forwarding Information Base) in the BIRD which routing

tables are based on. This data-structure stores IP prefixes and

length of them. Searching in a FIB, where huge number of

prefixes stored can be a speed bottleneck, which can be faster

using a CBF (Counting Bloom Filter) as will be presented.

Storing in BIRD’s FIB is a two stage mechanism. In the first

stage, an order-bit hash is calculated based on prefix value to

find bucket index of main hash table (order can be varied from

10 to 16). In the next stage, there is a chain of nodes in linked

list structure which may become long due to huge number of

nodes. Therefore, a BF can help to reduce of traversing these

long chains for missing nodes which result in accelerating the

whole router. Nodes are allocated in each chain by BIRD’s

Slab Allocator. Their implementation is based on what

Bonwick proposed [13] that makes linked list traversing and

nodes allocation/deallocation fast.

There is three important functions related to FIBs in the

BIRD named fib_get(), fib_find(), fib_route() which are

responsible for adding, searching and longest prefix matching

respectively. Each FIB in BIRD starts with a default 10-bit

1 http://www.nic.cz/

Fig. 2. Three open-source routing daemons CPU usage. XORP is worst and BIRD is best.

Fig. 3. Three open-source routing daemons memory usage (MB). XORP is worst and BIRD is best.

Page 3 of 10

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

Page 5: 12_

For Review O

nly

4

hash table and increases its hash table size when number of

prefixes increases. This expansion will stop at 16-bit,

therefore, chain lengths starts to become larger. Implemented

BF helps the main hash table when this situation happens to

prevent searching in this large FIB chains when an IP prefix

cannot be found.

In the following subsection, the way Bloom-BIRD is

implemented and testing scenario is presented. In the next

subsection, the scenario to test Bloom-BIRD is discussed.

There is also third subsection to present results of comparison

between standard BIRD and Bloom-BIRD.

A. Implementation

BIRD uses dynamic hashing size to store prefixes which

increases when number of inserted prefixes becomes huge. It

starts from 10-bit to 16-bit and never grows afterwards. In

order to have a fair comparison between standard BIRD and

Bloom-BIRD, these rules have not changed. Therefore,

Bloom-BIRD has a BF array in each FIB to help it responding

faster when it is possible.

Hashing mechanism in the BF of Bloom-BIRD is inspired

by the main hash table of standard BIRD. If number of

enteries increases, Bloom-BIRD changes size and order of BF

array like the way main hash table does. Bloom-BIRD starts

with 18-bit BF and it increases to 20-bit if the capacity limit is

reached. This expansion continues until 32-bit and it never

grows afterwards. Therefore, capacity of BF array is limited to

32-bit when an acceptable FP error rate is expected. For

simplicity, this expansion of BF array is not included in the

psuedo-codes in this subsection.

In Figure 4 a simple BIRD’s FIB hashing table is presented.

In the Figure, the order can be varied between 10 to 16 as

mentioned before. Basic fib_find() function that searches for a

given prefix and length in a FIB is shown which uses

ipa_hash() function to determine which bucket in main hash

table should be used. Main hash array is an order-bit array of

fib_node type. Afterwards the node will be inserted into a new

free location in the linked lists chain.

In Figure 5 the way BF is implemented in FIB is depicted.

The fib_find() function checks BF for given prefix firstly. If

BF confirms existing of the prefix, then the main hash table

will be checked in order to determine pointer address of

founded node or a FP error may occur. On the other hand,

(and more importantly) if BF returns negative, checking main

hash table will be ignored. Therefore, the main advantage of

BF is the latter part in which checking main hash table and

maybe long chain lists traverse can be avoided.

The pseudo-code of fib_find() in the standard BIRD (as

discussed), is shown as SB_fib_find() function (in order to

distinguish between standard BIRD functions a SB is

prepended and for Bloom-BIRD functions there is BB prefix).

In the first line, e variable points to selected bucket which is

traversed in order to find given prefix. In the second and third

lines, the bucket chain is traversed to find requested node.

Two situations may happen after this loop. The loop may find

the node, then last line returns the node. Otherwise, traversing

linked list may end with a null pointer therefore fourth line

returns a null pointer indicating that the node cannot be found.

SB_fib_find(fib, prefix, length)

1. e = fib_table[ipa_hash(prefix)] 2. while((not empty e) AND (not found e)) 3. e = e->next 4. return e

In the Bloom-BIRD version of this function, in the first line,

BF array and its hashing mechanism is used in order to ignore

prefixes that does not exist as discussed earlier. The function

is changed as BB_fib_find() shows. Three first lines are

dedicated for BF search for a given prefix. There is k hash

functions in order to search BF. In each iteration if a location

of BF array represents an empty location then the search

returns false answer (i.e. NULL pointer). As it is shown in

second line, k independent hash functions are used for BF to

check array locations. These hash functions are also decpited

in Figure 5 as bloom_hashi() functions.

Fig. 4. FIB hashing architecture of standard BIRD. fib_find() uses ipa_hash()

function to determine which hash bucket must be selected. Afterwards nodes are chained in each bucket.

Fig. 5. FIB hashing architecture of Bloom-BIRD. Before checking main hash

table, k hash is performed on IP prefix to check the Bloom filter array. If Bloom

confirms existing of the prefix then it will go through main hash table. Otherwise it immediately returns null from fib_find().

Page 4 of 10

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

Page 6: 12_

For Review O

nly

5

BB_fib_find(fib, prefix, length) 1. for(i=1 to k) 2. if(filter[bloom_hashi(prefix)] is empty) 3. return NULL

4. e = fib_table[hash(prefix)] 5. while((not empty e) AND (not found e))

6. e = e->next 7. return e

BIRD uses very simple longest prefix matching (LPM)

mechanism that starts from a given length and decrement it

until longest prefix match is found or returns a NULL pointer.

The pseudo-code is shown as SB_fib_route() function.

Since fib_route() uses fib_find() as its main function to

determine existence of the prefix, BF can help fib_route() very

effectively since BF has no false negative and it does not need

to go through the main hash chains for lengths that cannot be

found. Therefore, there is no need to change anything in the

fib_route() function. Although there is better solutions like

binary searches on prefix values and lengths as mentioned in

previous works, BIRD’s LPM algorithm is not changed in

order to show BF performance over standard BIRD.

SB_fib_route(fib, prefix, length)

1. while (length ≥ 0) 2. if(fib_find(fib, prefix, length)) 3. return founded node 4. else

5. length = length – 1 6. return NULL

The last important function is fib_get() which searches for

given prefix and length and if does not exist, it adds the prefix

into the FIB. The pseudo-code of this function is presented as

SB_fib_get() function. It is shown in the first line that the

function uses fib_find() mechanism as its searching function.

If it finds the node then its pointer will be returned. Otherwise

the node will be inserted into selected bucket of the hash table.

SB_fib_get(fib, prefix, length)

1. if(fib_find(fib, prefix, length)) 2. return founded node 3. else

4. Go down through hash chains 5. And add new node

To check existence of nodes in this function also BF can

help through fib_find() when a node does not exist. Therefore,

in the first line, BF is checked before its main hash table and

when a node is not inserted before, BF counters should be

incremented. Therefore, only one change is needed in the

function. This function is been shown as BB_fib_get()

function.

BB_fib_get(fib, prefix, length)

1. if(fib_find(fib, prefix, length)) 2. return founded node 3. else

4. Go down through hash chains 5. And add new node 6. for(i=1 to k)

7. filter[bloom_hashi(prefix)] += 1

Extra lines 6 and 7 of BB_fib_get() are responsible for

updating BF array due to newly added node. This does not

count as a overhead since hash functions are optimized using

bit-wise shifts.

There are two different types of hash functions used in the

Bloom-BIRD. First type is much like BIRD’s original hash

function which returns a 16-bit hash based on prefix value but

optimized using bit-wise shifts. Second type hash functions

are used for BF which have much less collisions than BIRD’s

original hash function. These second type hash functions

return varied bit size based on BF array size. Number of BF

hash functions may vary based on k parameter of Bloom filter

which determines number of hash functions. Optimal value of

the k parameter can be calculated using the following

equation:

� � ��/�� ∗ ln�2� (1)

In which m represents number of bits in the BF array and n

is number of elements that can be inserted. In the Bloom-

BIRD, k is constant and is set to 3 because the loop of

checking k hash functions becomes bottleneck for bigger k.

Therefore, the equation is used in order to calculate size of BF

array in the Bloom-BIRD.

B. Scenario

In order to evaluate standard BIRD and Bloom-BIRD three

real IPv4 prefix sets from [14] are gathered from years 2008,

2010 and 2013 sorted by date which latest and more updated

one contains more than 482 thousands unique IPv4 prefixes as

Table 1 shows. The two versions of BIRD i.e. standard BIRD

and Bloom-BIRD are evaluated by inserting these real prefix

sets and quering them. Prefix sets 4 and 5 are manually

generated which contains all possible 24 length prefixes

starting with 1-19 and 20-39 octets respectively. Prefix set 6 is

concatenation of two prefix sets 4 and 5 in order to test

searching FIBs with even bigger prefix sets and make sure

about results. These last three prefixes contain 99% missing

(not existing) prefixes compared to the other three real

prefixes (i.e. prefix sets 1-3) in order to show performance of

BF when most queries return negative. These last three prefix

Table. 1. Prefix sets to test the two versions of BIRD Prefix sets 1-3 are real dumps from routers and Prefix sets 4-6 are manually

generated

Prefix set alias # of nodes

Prefix1 262,039

Prefix2 351,645

Prefix3 482,500

Prefix4 1,179,648

Prefix5 1,310,720

Prefix6 2,490,368

Table. 2. Percentage of missing nodes when searching for prefix sets

There is zero (0) values since the prefix set is inserted itself (no missing nodes)

Inserted prefix set / Searched prefix set

Prefix1 Prefix2 Prefix3

Prefix1 0 24.53% 36.27%

Prefix2 43.65% 0 22.92%

Prefix3 65.34% 43.82% 0

Prefix4 99.8% 99.77% 99.56%

Prefix5 99.86% 99.77% 99.39%

Prefix6 99.83% 99.77% 99.47%

Page 5 of 10

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

Page 7: 12_

For Review O

nly

6

sets are not real prefixes therefore they are only used for

searching and not for inserting into FIBs.

Percentage of number of missing nodes when each prefix

sets are searched is presented in Table 2. For example when

all prefixes in prefix set 2 are inserted into a FIB and all

prefixes in the prefix set 1 are queried afterwards, 24.53% of

searches return negative answer.

Results of evaluation is included and discussed in the

following subsection.

C. Results of Bloom-BIRD and Discussion

As discussed in the previous subsection, three prefix sets 1-

3 are inserted into a FIB in three different times in the two

versions of standard BIRD and Bloom-BIRD and all prefix

sets 1-6 are queried afterwards. Percetage of speedups of

Bloom-BIRD over standard BIRD and FP error rate are

presented in Table 3 in three main columns. In the first main

column, speedups of fib_fib() function which is responsible

for simple searching for a given prefix and length is presented.

In the second main column, speedups of fib_route() function

which is responsible for Longest Prefix Matching (LPM) is

presented (starting length for LPM is set to 32 for all

searches). In the last column, percentage of FP error rate is

presented. Also there are 6 rows representing what prefix set is

searched. The smallest speedup is 4% and biggest speedup is

93%. Smaller speedups are gained when most prefixes are

found after search (i.e. number of existing nodes are bigger

than missing nodes). On the other hand, bigger speedups are

gained when most prefixes are not found after search. Also FP

error rate is at most 20% which is an acceptable rate.

Based on the experiments, for small number of prefixes, BF

counts only as a memory overhead for BIRD i.e. no valueable

speedup will be gained. Therefore, BF feature of Bloom-BIRD

will remain deactivated until its main hash table reaches into

16-bit order. Afterwards, BF array will be allocated and

initialized to zero. Hashing mechanism in the BF of Bloom-

BIRD is inspired by the main hash table of BIRD. If number

of enteries increases, Bloom-BIRD changes size and order of

BF feature like the way main hash table does. Bloom-BIRD

starts with 18-bit BF and it increases to 20-bit if the capacity

limit is reached. This expansion continues until 32-bit.

In the Table 3 results show how scaling feature helps

accelerating the Bloom-BIRD when prefix set 2 is inserted in

comparison with when prefix set 1 is inserted. Since number

of prefixes in the prefix set 2 is bigger than Bloom-BIRD

default hash order (i.e. 18-bit), therefore the order of BF array

is scaled up to 20-bit and consequently FP is decreased in

comparison with when prefix set 1 is inserted. This situation

also shows how FP error rate is important and can make the

searches faster.

Table 3 shows that speedups of fib_route() function is lower

than its similar situation in the fib_find(). That’s because of

fib_find() tires just once for given prefix and length but

fib_route() tries W(n) times in worst case which n is 32 for

IPv4 and 128 for IPv6. Therefore, fib_route() in most cases

finds the best match.

For memory usage, number of bits in the BF array can be

calculated using Equation (1). Although 4-bit counters are

generally used for counters in CBF, but in the Bloom-BIRD

FIBs, 8-bit counters are used in the CBF because of simplicity

and lower overhead of increment operations in the PC for

Byte data type. Since k is constant and is set to 3 and number

of inputs by default is 18-bit, therefore the memory

requirement for Bloom-BIRD in the 18-bit capacity is 1.08

MB. When the capacity limit is reached, it will be incremented

by 2 therefore it will be 20-bit capacity. This order requires

4.33 MB of memory. This expansion continues until 32-bit

and the memory requirement can be calculated using Equation

(1).

V. CONCLUSION

The paper showed and presented another application of BF on

a practical open-source router. The BF implementation on

BIRD’s FIB data structure showed that it can help BIRD to

search and route faster when inserted prefixes into a FIB

become huge. Bloom-BIRD version which has Bloom Filter

implementation, evaluated using different prefix sets gathered

from real routers dumps and also manually generated prefix

sets to make the tests more accurate. Bloom-BIRD employs a

BF in BIRD’s FIB data structure in order to accelerate the

router when FIB’s linked list chains become long. Comparison

using different prefix sets showed that up to 93% speedup is

gained when a search returns negative answer. Also it is

showed how Bloom-BIRD can handle its FP error rate when

number of inserted prefixes increases. Because of its

Table. 3. Results of Bloom-BIRD (1.3.10) and standard BIRD (1.3.10) comparison

(*) means the prefix set is inserted itself – these are also worst cases since all searched prefixes are existing

fib_find() – simple search Speedup of Bloom-BIRD over BIRD

Using below prefix sets inserted

fib_route() – LPM search Speedup of Bloom-BIRD over BIRD

Using below prefix sets inserted

False Positive Percentage of Bloom-BIRD

Using below prefix sets inserted

Inserted prefix set /

Searched prefix set

Prefix1 Prefix2 Prefix3 Prefix1 Prefix2 Prefix3 Prefix1 Prefix2 Prefix3

Prefix1 (*) 11% 37% 50% (*) 4% 20% 32% (*) 0 17.1% 18.54%

Prefix2 54% (*) 19% 42% 28% (*) 7% 29% 16.44% (*) 0 20.83%

Prefix3 56% 59% (*) 22% 38% 36% (*) 14% 14.4% 9.88% (*) 0

Prefix4 77% 93% 91% 55% 74% 76% 8.86% 0.46% 1.11%

Prefix5 77% 93% 90% 52% 71% 74% 8.63% 0.6% 1.67%

Prefix6 77% 92% 91% 54% 72% 75% 8.74% 0.54% 1.4%

Page 6 of 10

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

Page 8: 12_

For Review O

nly

7

capability to adapt its scale, it is possible to employ this

solution for IPv6 addresses too.

REFERENCES

[1] S. Han, K. Jang, K. Park, and S. Moon, "PacketShader: a GPU-

accelerated software router," ACM SIGCOMM Computer

Communication Review, vol. 41, pp. 195-206, 2010. [2] BIRD Internet Routing Daemon. Available: http://bird.network.cz

[3] Quagga Routing Suite. Available: http://www.nongnu.org/quagga/

[4] M. Handley, O. Hodson, and E. Kohler, "XORP: an open platform for network research," ACM SIGCOMM Computer

Communication Review, vol. 33, pp. 53-57, 2003.

[5] B. H. Bloom, "Space/time trade-offs in hash coding with allowable errors," Communications of the ACM, vol. 13, pp. 422-426, 1970.

[6] S. Tarkoma, C. E. Rothenberg, and E. Lagerspetz, "Theory and

practice of bloom filters for distributed systems," Communications Surveys & Tutorials, IEEE, vol. 14, pp. 131-155, 2012.

[7] A. Broder and M. Mitzenmacher, "Network Applications of Bloom Filters: A Survey," Internet Mathematics, vol. 1, pp. 636-646,

2002.

[8] M. Ahmadi and S. Wong, "Modified collision packet classification using counting Bloom filter in tuple space," presented at the

Proceedings of the 25th conference on Proceedings of the 25th

IASTED International Multi-Conference: parallel and distributed computing and networks, Innsbruck, Austria, 2007.

[9] L. Hyesook and L. Nara, "Survey and proposal on binary search

algorithms for longest prefix match," Communications Surveys & Tutorials, IEEE, vol. 14, pp. 681-697, 2012.

[10] M. Yu, A. Fabrikant, and J. Rexford, "BUFFALO: bloom filter

forwarding architecture for large organizations," presented at the Proceedings of the 5th international conference on Emerging

networking experiments and technologies, Rome, Italy, 2009.

[11] Ostinato, Packet/Traffic Generator and Analyzer. Available: http://code.google.com/p/ostinato

[12] N. Nethercote and J. Seward, "Valgrind: a framework for

heavyweight dynamic binary instrumentation," ACM SIGPLAN Notices, vol. 42, pp. 89-100, 2007.

[13] J. Bonwick, "The slab allocator: an object-caching kernel memory

allocator," presented at the Proceedings of the USENIX Summer 1994 Technical Conference on USENIX Summer 1994 Technical

Conference - Volume 1, Boston, Massachusetts, 1994.

[14] RouteViews BGP RIBs. December 2008, December 2010, July 2013. Available: http://routeviews.org/

Page 7 of 10

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

Page 9: 12_

For Review Only

router1 router2

ubuntu1 ubuntu2

192.168.1.2/24

192.168.1.1/24

20.20.20.1/30

20.20.20.2/3010.10.10.1/8

10.10.10.2/8

Page 8 of 10

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

Page 10: 12_

For Review Only

1

2

3

4

.

.

.

2order

fib_find()

ipa_hash()

node1 node2

node3

node4

node5 node6

node nFIB

Hash table

fib_node array

Page 9 of 10

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960

Page 11: 12_

For Review Only

1

2

3

4

.

.

.

2order

fib_find()

ipa_hash()

node1 node2

node3

node4

node5 node6

node n

1 2 ... m

bloom_hashi()

...1 2 k

True

FIBHash table

False

fib_node array

Page 10 of 10

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960