P-Tree Implementation Anne Denton

Upload: lily-francis

Post on 13-Dec-2015


Page 1

P-Tree Implementation

Anne Denton

Page 2

So far: Logical Definition

• Cf. Dr. Perrizo’s slides
• Logical definition
  • Defines node information
  • Leaves the representation of structure open
• A wide variety of implementations has been tried

Page 3

Tree Representation Options

• Pointers
• Tree-walks
  • Depth-first
  • Breadth-first
• Node addresses (P-trees: qids)
• Note: Any one of these tree representations makes the tree loss-less!

Page 4

Issues

• Storage requirements
• Suitability for distributed processing (e.g., avoiding pointer swizzling)
• Ease of access to particular nodes

Main issue

• Data structure must optimize anding speed at each node

Page 5

Main Desired Property

• Anding through bit-vector operations
  • New node information
  • New structural information
• Why?
  • Parallelism: up to 32 or 64 bits are processed in parallel on a single-processor CPU
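A minimal sketch of the parallelism argument, assuming Python integers stand in for machine words and using two arbitrary 8-bit vectors chosen only for illustration. One bitwise AND gives the same result as combining the positions one at a time.

```python
# Two arbitrary 8-bit vectors (illustrative values, not from the slides).
a = 0b01100110
b = 0b11010100

# One bitwise AND combines every bit position in a single operation.
word_and = a & b

# The same result, computed one bit position at a time.
bit_and = 0
for pos in range(8):
    bit = (a >> pos) & (b >> pos) & 1
    bit_and |= bit << pos

print(format(word_and, "08b"), word_and == bit_and)
```

On real hardware the word-wide AND is a single instruction, which is where the factor of up to 32 or 64 comes from.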

Page 6

QID-based P-Vector representation (Example: P1V)

[ ]      1001
[01]     0010
[10]     1101
[01.00]  1110
[01.11]  0010
[10.10]  1101

• Node information stored as bit-vector
• Structural information:
  • Traditional relation of degree 2
  • Address is key
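The relation above can be sketched as a keyed mapping. This is an illustration only: the tuple encoding of qids and the names `p1v`, `parse_qid` and `lookup` are assumptions made for the example, not part of the original implementation.

```python
# The example relation: qid (key) -> bit-vector holding the node information.
# Assumption: a qid is modelled as a tuple of child indices, so "01.11" is (1, 3).
p1v = {
    (): "1001",
    (1,): "0010",
    (2,): "1101",
    (1, 0): "1110",
    (1, 3): "0010",
    (2, 2): "1101",
}


def parse_qid(text):
    """Turn a dotted qid such as '01.11' into a tuple of child indices."""
    return tuple(int(part, 2) for part in text.split(".")) if text else ()


def lookup(relation, qid_text):
    """Fetch a node's bit-vector by its address, as in a keyed relation."""
    return relation[parse_qid(qid_text)]


print(lookup(p1v, "01.11"))
```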

Page 7

Can We Convert Address to Bit-Vectors?

[ ]      1001         [ ]   0110
[01]     0010         [01]  1001
[10]     1101   <=>   [10]  0010
[01.00]  1110
[01.11]  0010
[10.10]  1101

• We know this: PMV!
• Claim: qid is now redundant
• Standard conversion to bit-vectors
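One way to read the conversion: for each node, set bit i when a child with index i appears among the qids. The sketch below assumes a fanout of 4 (4-bit vectors) and the tuple encoding of qids; `mixed_vectors` is a hypothetical helper name.

```python
FANOUT = 4  # assumption: 4-bit vectors, so each node has up to 4 children

# qids from the example, as tuples of child indices ("01.11" -> (1, 3)).
qids = [(), (1,), (2,), (1, 0), (1, 3), (2, 2)]


def mixed_vectors(qids, fanout=FANOUT):
    """For every node that has children, set bit i when child i has a qid."""
    present = set(qids)
    result = {}
    for qid in qids:
        bits = "".join(
            "1" if qid + (i,) in present else "0" for i in range(fanout)
        )
        if "1" in bits:  # nodes without children get no mixed vector
            result[qid] = bits
    return result


print(mixed_vectors(qids))
```

For the example qids this yields 0110 for the root, 1001 for [01] and 0010 for [10].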

Page 8

Does this Define Structure?

• Yes!
• Concept:
  • Similar to depth-first search
  • Mixed vector specifies existing children
• Slight modification:
  • Store all children of one node sequentially
  • Reason: address can be computed through counts on mixed
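A hedged sketch of why the mixed vectors alone define the structure. It assumes one particular storage order (a node's children stored as one block, followed by each child's own sub-tree) and a known tree depth; both are assumptions for illustration, and `qids_from_mixed` is a hypothetical name.

```python
def qids_from_mixed(mixed, depth):
    """Rebuild all qids from mixed vectors given in the assumed storage order.

    Assumed order: the children of one node form one block, then each
    child's sub-tree follows. Nodes at level `depth` carry no mixed vector.
    """
    vectors = iter(mixed)
    qids = []

    def visit_block(block, level):
        qids.extend(block)
        if level == depth:
            return
        # Read one mixed vector per node of the block, in storage order.
        child_blocks = []
        for qid in block:
            bits = next(vectors)
            child_blocks.append(
                [qid + (i,) for i, bit in enumerate(bits) if bit == "1"]
            )
        # Then descend into each non-empty block of children.
        for children in child_blocks:
            if children:
                visit_block(children, level + 1)

    visit_block([()], 0)
    return qids


# Mixed vectors of the example: root, [01], [10].
print(qids_from_mixed(["0110", "1001", "0010"], depth=2))
```

With the three mixed vectors of the example the sketch recovers the six qids, written as tuples of child indices.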

Page 9

Representation of Standard Example

Page 10

P-Tree Anding

• Start at root
• Pursue new (potentially) mixed children
• Deriving new mixed (m) and pure1 (u):
  • u is the AND of all u_i
  • m is (the AND of all (m_i OR u_i)) AND NOT u
• Cannot be done with either u or m alone
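The two rules can be sketched for a single node as follows. Vectors are held as Python integers; the function name `and_node`, the `width` parameter and the second operand in the usage line are assumptions made for illustration.

```python
from functools import reduce


def and_node(operands, width=4):
    """Combine (pure1, mixed) vector pairs of corresponding nodes.

    `operands` is a list of (u_i, m_i) integer pairs; returns (u, m).
    """
    full = (1 << width) - 1
    # A child is pure1 in the result only if it is pure1 in every operand.
    u = reduce(lambda acc, um: acc & um[0], operands, full)
    # A child may be mixed if it is pure1 or mixed in every operand,
    # but not pure1 in all of them.
    m = reduce(lambda acc, um: acc & (um[0] | um[1]), operands, full) & ~u & full
    return u, m


# Root of the example (u=1001, m=0110) and a made-up second operand.
u, m = and_node([(0b1001, 0b0110), (0b0101, 0b1000)])
print(format(u, "04b"), format(m, "04b"))
```

Both u_i and m_i enter the formula for m, which is why neither vector suffices on its own. Only children flagged in m need to be pursued further.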

Page 11

Fast Counting using Table Look-up

• How many bits are set in 01100110?
• Look-up table stores “4” for index 102
• Works for sequences of up to 8 bits

00000000  0
00000001  1
00000010  1
00000011  2
00000100  1
00000101  2
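A minimal sketch of such a table, assuming it is built once at start-up. The names `BIT_COUNT` and `count_bits` are illustrative; longer vectors are handled one byte per look-up.

```python
# Number of set bits for every 8-bit value, computed once up front.
BIT_COUNT = [bin(value).count("1") for value in range(256)]


def count_bits(vector, width=8):
    """Count set bits of a longer vector with one look-up per byte."""
    total = 0
    for shift in range(0, width, 8):
        total += BIT_COUNT[(vector >> shift) & 0xFF]
    return total


print(BIT_COUNT[0b01100110])
```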

Page 12

Finding the next 1

• Which is the first bit set in 01100110?
• Look-up table stores “1” for index 102
• Works for sequences of up to 8 bits

(00000000  8)
00000001  7
00000010  6
00000011  6
00000100  5
00000101  5
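The table entries are consistent with counting positions from 0 at the left and storing 8 when no bit is set. A sketch under that reading, with illustrative names:

```python
def first_set_position(value, width=8):
    """Position of the leftmost 1, counting from 0 at the left; `width` if none."""
    for pos in range(width):
        if value & (1 << (width - 1 - pos)):
            return pos
    return width


# Precomputed once, so each later query is a single look-up.
FIRST_SET = [first_set_position(value) for value in range(256)]

print(FIRST_SET[0b01100110])
```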

Page 13

Finding a child

• Assume children are stored in sequence
• For mixed vector 01100110, where is the child with index 5 (part of qid)?
• Count the children in 01100
• Storage location calculated with one table look-up
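A sketch of the offset calculation, assuming positions count from 0 at the left of an 8-bit mixed vector and that only children flagged in the mixed vector are stored. `child_offset` is a hypothetical name.

```python
# Bit-count table for 8-bit values, as used for fast counting.
BIT_COUNT = [bin(value).count("1") for value in range(256)]


def child_offset(mixed, index, width=8):
    """Offset of child `index` among the stored children of one node.

    Returns None when the child is not flagged in the mixed vector,
    since such a child is not stored.
    """
    if not (mixed >> (width - 1 - index)) & 1:
        return None
    # Keep only the bits to the left of `index`, then count them.
    prefix = mixed >> (width - index)
    return BIT_COUNT[prefix]


print(child_offset(0b01100110, 5))
```

For index 5 the prefix is 01100, which contains two set bits, so the child sits at offset 2 within the node's block of children.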

Page 14

Potential problems

• Eliminating large sub-trees is slow
• Speeding up “and”:
  • Introduce additional access structure
  • Array indices as pointers
• Note:
  • No lowest level due to adjacent storage of children
  • Reduces storage by about 1/fanout (e.g., 1/16)
• Access structure does not need to be stored (P-tree loss-less without it)

Page 15

Full Example

Page 16

Summary

• PV1: node values stored as bit-vectors
• Now: tree structure stored as bit-vectors as well
• Benefits: Several fast bit-vector algorithms can be used
• Description of structure:
  • Modified depth-first tree-walk
• Additional access structure efficient