Algorithm Engineering „Parallele Suche“ Stefan Edelkamp


Post on 23-Feb-2016


Page 1: Algorithm   Engineering „Parallele Suche“

Algorithm Engineering „Parallele Suche“

Stefan Edelkamp

Page 2: Algorithm   Engineering „Parallele Suche“

Overview

• Motivation
• PRAM
• Termination
• Depth-Slicing
• Hash-based Partitioning & Transposition Table Scheduling
• Stack Splitting & Parallel Window Search
• Parallel Search with Treaps

Page 3: Algorithm   Engineering „Parallele Suche“

Parallel Shared Memory Graph Search

Single-core CPU Multi-core CPU

• Parallelization is important for multi-core CPUs

• But parallelizing graph-search algorithms such as breadth-first search, Dijkstra’s algorithm, and A* is challenging…

• Issues: Load balancing, Locking, …

Page 4: Algorithm   Engineering „Parallele Suche“

Parallel Shared Memory Graph Search

Single-core CPU Multi-core GPU

• Parallelization is even more important for GPUs

• But parallelizing graph-search algorithms such as breadth-first search, Dijkstra’s algorithm, and A* is challenging…

• Issues: Kernel Function Design, Load balancing, Locking, …

Page 5: Algorithm   Engineering „Parallele Suche“

Parallel External Memory Graph Search

Single-core CPU+HDD Multi-core C/GPU+HDD

• …

Page 6: Algorithm   Engineering „Parallele Suche“

Motivation

Parallel and External Memory Graph Search: Synergies

• Both need partitioned access to large sets of data.
• This data needs to be processed individually.
• Information transfer between two partitions is limited.
• Streaming in external-memory programs relates to communication queues in distributed programs (as communication is often realized on files).
• Good external implementations often lead to good parallel implementations.

Page 7: Algorithm   Engineering „Parallele Suche“

Experiments

Page 8: Algorithm   Engineering „Parallele Suche“

Further Experiments

Page 9: Algorithm   Engineering „Parallele Suche“

Parallel Random Access Machine

Concurrent Read/Exclusive Write (CREW PRAM)

Page 10: Algorithm   Engineering „Parallele Suche“

Parallel Addition

Page 11: Algorithm   Engineering „Parallele Suche“

In Pseudocode
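As a hedged illustration of parallel addition, the sketch below simulates the standard PRAM tree reduction in plain Python. It is not taken from the slide; the function name `parallel_sum` and the returned round counter are assumptions made for this example. Each pass of the inner loop stands for one synchronous PRAM step in which every addition would run on its own processor.

```python
def parallel_sum(values):
    """Simulate a PRAM-style tree reduction; returns (total, rounds)."""
    a = list(values)
    n = len(a)
    if n == 0:
        return 0, 0
    stride = 1
    rounds = 0
    while stride < n:
        # On a PRAM, all iterations of this inner loop would execute in
        # the same time step, one addition per processor. The writes go
        # to distinct cells, so exclusive write is respected.
        for i in range(0, n - stride, 2 * stride):
            a[i] += a[i + stride]
        stride *= 2
        rounds += 1
    return a[0], rounds


total, rounds = parallel_sum(range(1, 9))
print(total, rounds)
```

For n inputs the number of rounds is the ceiling of log2(n), which is the parallel running time of this scheme.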

Page 12: Algorithm   Engineering „Parallele Suche“

Definitions

• Problem size
• Parallel running time
• Work
• Sequential time
• Efficiency
• Speedup (evaluated for the example)
• Linear speedup
• Efficient parallelization (evaluated for the example)
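The quantities named on this slide are commonly defined as follows. This is a hedged reminder of the usual textbook definitions; the symbols $n$ (problem size), $p(n)$ (number of processors), $T(n)$ (parallel running time), $W(n)$ (work), $T_{\mathrm{seq}}(n)$ (sequential time), $S(n)$ (speedup) and $E(n)$ (efficiency) are assumptions and may differ from the notation of the original slide.

```latex
\begin{align*}
W(n) &= p(n) \cdot T(n) && \text{(work)} \\
S(n) &= \frac{T_{\mathrm{seq}}(n)}{T(n)} && \text{(speedup)} \\
E(n) &= \frac{T_{\mathrm{seq}}(n)}{W(n)} = \frac{S(n)}{p(n)} && \text{(efficiency)}
\end{align*}
```

Under these definitions, a speedup is linear if $S(n) = \Theta(p(n))$. For adding $n$ numbers with a balanced tree of additions on $n/2$ processors, assuming that is the example meant here, one obtains $T(n) = \Theta(\log n)$ and $T_{\mathrm{seq}}(n) = \Theta(n)$, hence $S(n) = \Theta(n / \log n)$, $W(n) = \Theta(n \log n)$ and $E(n) = \Theta(1 / \log n)$.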

Page 13: Algorithm   Engineering „Parallele Suche“

Prefix Sum
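A minimal sketch of a parallel prefix sum, simulated sequentially in Python. It uses the well-known doubling scheme (often attributed to Hillis and Steele), which may or may not be the variant shown on the slide; the function name `parallel_prefix_sum` is an assumption made for this example.

```python
def parallel_prefix_sum(values):
    """Inclusive prefix sums computed in ceil(log2(n)) synchronous rounds."""
    a = list(values)
    n = len(a)
    offset = 1
    while offset < n:
        # Every position reads from the old array and writes into a new
        # one, which models one synchronous PRAM step: position i adds
        # the value that lies `offset` cells to its left.
        a = [a[i] + a[i - offset] if i >= offset else a[i] for i in range(n)]
        offset *= 2
    return a


print(parallel_prefix_sum([1, 2, 3, 4, 5]))
```

This variant performs more total work than a sequential scan, in exchange for a logarithmic number of rounds.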

Page 14: Algorithm   Engineering „Parallele Suche“

Termination

Page 15: Algorithm   Engineering „Parallele Suche“

Depth-Slicing

Page 16: Algorithm   Engineering „Parallele Suche“

In Source Code

Page 17: Algorithm   Engineering „Parallele Suche“

Hash-based Partitioning
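A hedged sketch of the idea behind hash-based partitioning: every state is owned by exactly one process, chosen by a hash function, and each owner detects duplicates only in its own closed set. The processes are simulated in one loop here; the names `owner`, `partitioned_search` and the toy successor function are assumptions made for this example and do not come from the slides.

```python
from collections import deque


def owner(state, num_procs):
    """Map a state to the process responsible for it."""
    return hash(state) % num_procs


def partitioned_search(start, successors, num_procs):
    """Explore the state space; returns one closed set per process."""
    closed = [set() for _ in range(num_procs)]
    inbox = [deque() for _ in range(num_procs)]
    inbox[owner(start, num_procs)].append(start)
    while any(inbox):
        for p in range(num_procs):  # each p stands for one process
            while inbox[p]:
                state = inbox[p].popleft()
                if state in closed[p]:
                    continue  # duplicate, detected locally by the owner
                closed[p].add(state)
                for child in successors(state):
                    # A child is sent to its owner, never stored locally.
                    inbox[owner(child, num_procs)].append(child)
    return closed


# Toy state space on the integers 0..9 (an assumption for illustration).
parts = partitioned_search(1, lambda s: [(s + 1) % 10, (2 * s) % 10], 3)
print([sorted(part) for part in parts])
```

Because the partitions are disjoint, no locking of a shared closed list is needed; the cost moves into the communication between processes.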

Page 18: Algorithm   Engineering „Parallele Suche“

Transposition Driven Scheduling

Page 19: Algorithm   Engineering „Parallele Suche“

In Source Code

Page 20: Algorithm   Engineering „Parallele Suche“

Parallel Depth-First Search (Parallel Branch-and-Bound)

Page 21: Algorithm   Engineering „Parallele Suche“

In Source Code

Page 22: Algorithm   Engineering „Parallele Suche“

Load-Balancing via Stack Splitting

Page 23: Algorithm   Engineering „Parallele Suche“

Parallel Window Search (Iterative-Deepening Search)

Page 24: Algorithm   Engineering „Parallele Suche“

Treaps: A Mix of Heaps and Search Trees

Page 25: Algorithm   Engineering „Parallele Suche“

Application

• Using a treap, the need for exclusive locks can be alleviated to some extent.
• Each operation on the treap manipulates the data structure in the same top-down direction. Moreover, it can be decomposed into successive elementary operations.
• Tree partial locking protocol: every process holds exclusive access to a sliding window of nodes in the tree. It can move this window down a path in the tree, which allows other processes to access different, non-overlapping windows at the same time.
• Parallel search using a treap with partial locking has been tested for the FIFTEENPUZZLE on different architectures, with a speedup for 8 processors between 2 and 5.
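To make the data structure concrete, here is a minimal single-threaded treap sketch: keys obey the search-tree order and priorities obey a min-heap order. Insertion walks down from the root and splits the subtree it displaces. The partial locking protocol is deliberately omitted, and all names (`Node`, `split`, `insert`, `inorder`, `is_heap`) as well as the choice of a min-heap are assumptions made for this example, not the implementation referred to on the slide.

```python
import random


class Node:
    def __init__(self, key, priority):
        self.key, self.priority = key, priority
        self.left = self.right = None


def split(node, key):
    """Split a treap into (keys < key, keys >= key)."""
    if node is None:
        return None, None
    if node.key < key:
        smaller, larger = split(node.right, key)
        node.right = smaller
        return node, larger
    smaller, larger = split(node.left, key)
    node.left = larger
    return smaller, node


def insert(root, key, priority):
    """Insert and return the new root; smaller priorities sit higher."""
    if root is None or priority < root.priority:
        new = Node(key, priority)
        new.left, new.right = split(root, key)
        return new
    if key < root.key:
        root.left = insert(root.left, key, priority)
    else:
        root.right = insert(root.right, key, priority)
    return root


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


def is_heap(node):
    if node is None:
        return True
    for child in (node.left, node.right):
        if child is not None and child.priority < node.priority:
            return False
    return is_heap(node.left) and is_heap(node.right)


rng = random.Random(0)
root = None
for key in [5, 2, 8, 1, 9, 3]:
    root = insert(root, key, rng.random())
print(inorder(root), is_heap(root))
```

In a concurrent version, each step of the descent would be one of the elementary operations mentioned above, performed while holding the lock window on the nodes it touches.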

Page 26: Algorithm   Engineering „Parallele Suche“

Self-Organizing Trees via the Splay Operation

See extra slides.

Page 27: Algorithm   Engineering „Parallele Suche“

Parallel External-Memory Graph Search

• Motivation
• Shared and Distributed Environments
• Parallel Delayed Duplicate Detection
  • Parallel Expansion
  • Distributed Sorting
• Parallel Structured Duplicate Detection
  • Finding Disjoint Duplicate Detection Scopes
  • Locking

Page 28: Algorithm   Engineering „Parallele Suche“

Distributed Search over the Network

• Distributed setting provides more space.
• Experiments show that internal time dominates I/O.

Page 29: Algorithm   Engineering „Parallele Suche“

Exploiting Independence

• Since each state in a bucket is independent of the others, they can be expanded in parallel.
• Duplicate removal can be distributed across different processors.
• Bulk (streamed) transfers are much better than single ones.
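Because the states of a bucket are independent, their expansion can be handed to a pool of workers. The following is a minimal sketch using Python's standard `concurrent.futures.ThreadPoolExecutor`; the function name `expand_bucket`, the `workers` parameter and the toy successor function are assumptions made for this example. Duplicates are kept on purpose, so that their removal can be delayed and distributed.

```python
from concurrent.futures import ThreadPoolExecutor


def expand_bucket(bucket, successors, workers=4):
    """Expand all states of a bucket in parallel; duplicates are kept."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields the successor lists in the order of the bucket.
        child_lists = list(pool.map(successors, bucket))
    # Concatenate the results into one bulk list of children.
    return [child for children in child_lists for child in children]


print(expand_bucket([1, 2, 3], lambda s: [s + 1, s + 2]))
```

Collecting the children of a whole bucket before passing them on corresponds to the bulk transfer favoured above over sending states one by one.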

Page 30: Algorithm   Engineering „Parallele Suche“

Parallel Breadth-First Frontier Search: Enumerating the 15-Puzzle

• A hash function partitions both layers into files.
• When a layer is done, the child files are renamed into parent files.
• For parallel processing, a work queue contains parent files waiting to be expanded and child files waiting to be merged.

Page 31: Algorithm   Engineering „Parallele Suche“
Page 32: Algorithm   Engineering „Parallele Suche“
Page 33: Algorithm   Engineering „Parallele Suche“

Distributed Queue for Parallel Best-First Search

[Figure: a queue of entries <g, h, start byte, size> shared by the processes P0, P1 and P2, with a TOP pointer. Entries shown: <15,34, 0, 100>, <15,34, 20, 100>, <15,34, 40, 100>, <15,34, 60, 100>.]

Beware of the mutual exclusion problem!
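The mutual exclusion problem arises when several processes advance the top of the queue at the same time. Below is a minimal shared-memory sketch in which a lock guards the top pointer, so every entry is handed out exactly once. The class `ChunkQueue`, the helper `drain` and the use of threads instead of networked processes are assumptions made for this example; the tuples mirror the <g, h, start byte, size> entries of the figure.

```python
import threading


class ChunkQueue:
    """Queue of (g, h, start_byte, size) entries with a guarded top pointer."""

    def __init__(self, chunks):
        self._chunks = list(chunks)
        self._top = 0
        self._lock = threading.Lock()

    def pop(self):
        # Reading and advancing the top pointer must be one atomic step.
        with self._lock:
            if self._top >= len(self._chunks):
                return None
            chunk = self._chunks[self._top]
            self._top += 1
            return chunk


def drain(queue, num_workers):
    """Let num_workers threads empty the queue; returns what each one took."""
    taken = [[] for _ in range(num_workers)]

    def worker(i):
        while True:
            chunk = queue.pop()
            if chunk is None:
                break
            taken[i].append(chunk)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return taken


chunks = [(15, 34, start, 100) for start in (0, 20, 40, 60)]
print(drain(ChunkQueue(chunks), 3))
```

Which worker receives which entry depends on scheduling, but no entry is delivered twice or lost.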

Page 34: Algorithm   Engineering „Parallele Suche“

Distributed Delayed Duplicate Detection

• Each state can appear several times in a bucket.
• A bucket has to be searched completely for the duplicates.

[Figure: processes P0, P1, P2, P3 and GOAL; labels: sorted buffers, single files.]

Problem: concurrent writes!

Page 35: Algorithm   Engineering „Parallele Suche“

Multiple Processors - Multiple Disks Variant

[Figure: processors P1, P2, P3, P4; labels: sorted buffers w.r.t. the hash value, sorted files, divide w.r.t. the hash ranges, sorted buffers from every processor, sorted file; hash ranges h0 … hk-1 and hk … hl-1.]
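One way to read this variant: every processor sorts its buffer by hash value, the buffers are cut at common hash-range boundaries, and each range is then merged from the pieces of all processors into one sorted, duplicate-free file. The sketch below follows that reading with in-memory lists in place of files and with states represented by their hash values; both simplifications, and the names `split_by_ranges`, `merge_without_duplicates` and `distributed_merge`, are assumptions made for this example.

```python
import heapq
from bisect import bisect_left


def split_by_ranges(sorted_hashes, boundaries):
    """Cut one sorted buffer into len(boundaries) + 1 hash ranges."""
    cuts = [0] + [bisect_left(sorted_hashes, b) for b in boundaries]
    cuts.append(len(sorted_hashes))
    return [sorted_hashes[cuts[i]:cuts[i + 1]] for i in range(len(cuts) - 1)]


def merge_without_duplicates(pieces):
    """Merge sorted pieces and drop equal neighbours in a single scan."""
    merged = []
    for h in heapq.merge(*pieces):
        if not merged or merged[-1] != h:
            merged.append(h)
    return merged


def distributed_merge(buffers, boundaries):
    """One sorted, duplicate-free output list per hash range."""
    parts = [split_by_ranges(buffer, boundaries) for buffer in buffers]
    num_ranges = len(boundaries) + 1
    # Range r could be merged by processor r, independently of the others.
    return [merge_without_duplicates([p[r] for p in parts]) for r in range(num_ranges)]


buffers = [[1, 4, 7, 9], [2, 4, 8], [0, 9]]
print(distributed_merge(buffers, [5]))
```

Since every range is written by a single merger, the concurrent-write problem of the single-file layout does not arise in this sketch.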