cse 326 multi-dimensional search trees david kaplan dept of computer science & engineering...

36
CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Post on 19-Dec-2015

217 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

CSE 326Multi-Dimensional Search Trees

David Kaplan

Dept of Computer Science & EngineeringAutumn 2001

Page 2: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Multi-Dimensional SearchTrees

CSE 326 Autumn 20012

Outline Multidimensional search ADT Range Queries k-D Trees Quad Trees

Page 3: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Multi-Dimensional SearchTrees

CSE 326 Autumn 20013

Multi-D Search ADT Dictionary operations

create destroy find insert delete range queries

Each item has k keys for a k-dimensional search tree

Searches can be performed on one, some, or all of the keys or on ranges of the keys

9,13,64,2

5,78,21,94,4

8,42,5

5,2

Page 4: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Multi-Dimensional SearchTrees

CSE 326 Autumn 20014

Applications of Multi-D Search Astronomy (simulation of galaxies) - 3 dimensions Protein folding in molecular biology - 3

dimensions Lossy data compression - 4 to 64 dimensions Image processing - 2 dimensions Graphics - 2 or 3 dimensions Animation - 3 to 4 dimensions Geographical databases - 2 or 3 dimensions Web searching - 200 or more dimensions

Page 5: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Multi-Dimensional SearchTrees

CSE 326 Autumn 20015

Range QueryRange query

search in which exact key may not be entirely specified

primary interface to multi-D data structures

Examples: “All stars in the Milky Way within 1000 light

years of the Sun.” “All customers named ‘Smith’ who spent

from $100-500 with us last year.”

Page 6: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Multi-Dimensional SearchTrees

CSE 326 Autumn 20016

Two-Dimensional

Range Query ExamplesSearch for items based on …

single key range

multiple key ranges

function of multiple keys

Page 7: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Multi-Dimensional SearchTrees

CSE 326 Autumn 20017

x

Range Querying in 1-D

Find everything in the rectangle…

Page 8: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

x

Range Querying in 1-D with a BST

Find everything in the rectangle…

Page 9: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

x

y

1-D Range Querying in 2-D

Page 10: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

x

y

2-D Range Querying in 2-D

Page 11: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Multi-Dimensional SearchTrees

CSE 326 Autumn 200111

k-D Trees Split on the next dimension at each succeeding level If building in batch choose the median along the

current dimension at each level guarantees logarithmic height and balanced tree common scenario: large, (relatively) static data sets

In general, add as in a BST

k-D tree node

dimension

left right

keys value The dimension thatthis node splits on

Page 12: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

x

Building a 2-D Tree (1/4)y

Page 13: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

x

y

Building a 2-D Tree (2/4)

Page 14: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

x

y

Building a 2-D Tree (3/4)

Page 15: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

x

y

Building a 2-D Tree (4/4)

Page 16: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

k-D Tree Data Structure

ac

ih

m

d

e

f

b

jk

g

l

ldkf

hg

e

cj i mb a

Page 17: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

x

y

2-D Range Querying in 2-D Trees

Search every partition that intersects the rectangle. Check whether each node (including leaves) falls into the range.

Page 18: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

x

y

Other Shapes for Range Querying

Search every partition that intersects the shape (circle). Check whether each node (including leaves) falls into the shape.

Page 19: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Multi-Dimensional SearchTrees

CSE 326 Autumn 200119

Find in a k-D TreeFinds the node which has the given set of keys in it or returns null if there is no such node

Node *& find(const keyVector & keys, Node *& root) {

int dim = root->dimension;

if (root == NULL)

return root;

else if (root->keys == keys)

return root;

else if (keys[dim] < root->keys[dim])

return find(keys, root->left);

else

return find(keys, root->right);

}

Runtime:

Page 20: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Multi-Dimensional SearchTrees

CSE 326 Autumn 200120

Breathalyzer Test:k-D Trees Can Lose Their Balance1

insert(<5,0>)insert(<6,9>)insert(<9,3>)insert(<6,5>)insert(<7,7>)insert(<8,6>)

6,9

5,0

6,5

9,3

8,6

7,7

Worst-case imbalance?

1but not when built in batch!

Page 21: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Multi-Dimensional SearchTrees

CSE 326 Autumn 200121

k-D Find ExampleActual Find

find(<3,6>)Insert (first, do a Find)

find(<0,10>)

5,78,21,94,4

8,42,5

5,2

9,13,64,2

Page 22: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Multi-Dimensional SearchTrees

CSE 326 Autumn 200122

Quad Trees Split on all (two) dimensions at each level Split key space into equal size partitions (quadrants) Add a new node by adding to a leaf, and, if the leaf is

already occupied, split until only one node per leaf

quad tree node

Quadrants:

0,1 1,1

0,0 1,0

quadrants

0,0 1,0 0,11,1

keys value

Centerx yCenter:

Page 23: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

x

Building a Quad Tree (1/5)

y

Page 24: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

x

Building a Quad Tree (2/5)

y

Page 25: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

x

Building a Quad Tree (3/5)

y

Page 26: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

x

Building a Quad Tree (4/5)

y

Page 27: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

x

Building a Quad Tree (5/5)

y

Page 28: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Multi-Dimensional SearchTrees

CSE 326 Autumn 200128

Quad Tree Data Structure

a

g

b

ef

d

cga

fed

cb

Page 29: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

x

2-D Range Querying in Quad Trees

y

Page 30: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Multi-Dimensional SearchTrees

CSE 326 Autumn 200130

Find in a Quad Tree Finds the node which has the given pair of

keys in it or returns quadrant where the point should be if there is no such node

Node *& find(Key x, Key y, Node *& root) {

if (root == NULL)

return root; // Empty tree

if (root->isLeaf)

return root; // Key may not actually be here

// Compare vs. center; always make same choice on ties

int quad = getQuadrant(x, y, root);

return find(x, y, root->quadrants[quad]);

}

Runtime?

Page 31: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Multi-Dimensional SearchTrees

CSE 326 Autumn 200131

Walk this straight line, please …Quad Trees Can Lose Their Balance1

b

a

Worst-case imbalance?• O(log(1/dmin))

1but no batch bail-out – why not?

Page 32: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Multi-Dimensional SearchTrees

CSE 326 Autumn 200132

Find Example

a

g

b

ef

d

cga

fed

cb

find(<10,2>) (i.e., c)find(<5,6>) (i.e., d)

Page 33: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Insert Example

gga

insert(<10,7>,x)

… …a

g

b

ef

d

c

x

gx• Find the spot where the node should go.• If the space is unoccupied, insert the node.• If it is occupied, split until the existing node separates from the new one.

Page 34: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Delete Example

a

g

b

ef

d

c

ga

fed

cb

delete(<10,2>)(i.e., c)

• Find and delete the node.• If its parent has just one child, delete it.• Propagate!

Page 35: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Nearest Neighbor Search

ga

fed

cb

getNearestNeighbor(<1,4>)

g

b

f

d

c

• Find a nearby node (do a find).• Do a circular range query.• As you get results, tighten the circle.• Continue until no closer node in query.

a

e

Works on k-D Trees, too!

Page 36: CSE 326 Multi-Dimensional Search Trees David Kaplan Dept of Computer Science & Engineering Autumn 2001

Multi-Dimensional SearchTrees

CSE 326 Autumn 200136

Quad Trees vs. k-D Treesk-D Trees

Density balanced trees Number of nodes is O(n) where n is the number of points Height of the tree is O(log n) with batch insertion Supports insert, find, nearest neighbor, range queries

Quad Trees Number of nodes is O(n(1+ log(/n))) where n is the number

of points and is the ratio of the width (or height) of the key space and the smallest distance between two points

Height of the tree is O(log n + log ) Supports insert, delete, find, nearest neighbor, range queries

Also: many hybrids and variants