![Page 1: Parallel OLAP Andrew Rau-Chaplin Faculty of Computer Science Dalhousie University Joint Work with F. Dehne T. Eavis S. Hambrusch](https://reader036.vdocuments.mx/reader036/viewer/2022071806/56649dbc5503460f94aaed31/html5/thumbnails/1.jpg)
Parallel OLAP
Andrew Rau-Chaplin
Faculty of Computer Science
Dalhousie University
Joint Work with F. Dehne, T. Eavis, S. Hambrusch
![Page 2: Parallel OLAP Andrew Rau-Chaplin Faculty of Computer Science Dalhousie University Joint Work with F. Dehne T. Eavis S. Hambrusch](https://reader036.vdocuments.mx/reader036/viewer/2022071806/56649dbc5503460f94aaed31/html5/thumbnails/2.jpg)
Decision Support Systems
A time-oriented analysis of scientific or organizational data
Information Processing
Online Analytical Processing (OLAP)
Data Mining
![Page 3: Parallel OLAP Andrew Rau-Chaplin Faculty of Computer Science Dalhousie University Joint Work with F. Dehne T. Eavis S. Hambrusch](https://reader036.vdocuments.mx/reader036/viewer/2022071806/56649dbc5503460f94aaed31/html5/thumbnails/3.jpg)
Data Warehousing for Decision Support
Operational data collected into DW
DW used to support multi-dimensional views
Views form the basis of OLAP processing
Our focus: the OLAP server
[Figure: data warehouse architecture. Data cleaning and integration: operational databases and external sources feed the warehouse through extract, clean, transform, load, and refresh steps. Data storage: data warehouse, data marts, metadata repository, monitoring and administration. OLAP engines: OLAP servers. Front-end tools: query/reports, analysis, data mining, producing the output.]
![Page 4: Parallel OLAP Andrew Rau-Chaplin Faculty of Computer Science Dalhousie University Joint Work with F. Dehne T. Eavis S. Hambrusch](https://reader036.vdocuments.mx/reader036/viewer/2022071806/56649dbc5503460f94aaed31/html5/thumbnails/4.jpg)
Data Cube Generation
Proposed by Gray et al. in 1995
Can be generated from a relational DB but…
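[Figure: a three-dimensional data cube with axes A, B, and C (the cuboid ABC, or CAB) holding sample cell values, shown next to the lattice of cuboids: ABC; AB, AC, BC; A, B, C; ALL.]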
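The lattice on this slide can be enumerated mechanically. The following is a minimal sketch, assuming dimensions are given as a tuple of names; the helper names `cuboid_lattice`, `name`, and `parents` are made up for illustration and do not come from the talk.

```python
from itertools import combinations


def cuboid_lattice(dims):
    """Return all cuboids grouped by level, from the full cuboid down to ALL."""
    levels = []
    for k in range(len(dims), -1, -1):
        # Every subset of k dimensions is one cuboid (one group-by).
        levels.append(list(combinations(dims, k)))
    return levels


def name(cuboid):
    """Label a cuboid the way the slide does; the empty group-by is ALL."""
    return "".join(cuboid) if cuboid else "ALL"


def parents(cuboid, dims):
    """Cuboids with exactly one extra dimension; `cuboid` can be aggregated from any of them."""
    return [
        tuple(d for d in dims if d in cuboid or d == extra)
        for extra in dims
        if extra not in cuboid
    ]


DIMS = ("A", "B", "C")
levels = cuboid_lattice(DIMS)
for level in levels:
    print(" ".join(name(c) for c in level))
```

For d dimensions the lattice has 2^d cuboids, so three dimensions give eight. `parents(("A",), DIMS)` lists AB and AC, the smaller inputs from which A could be computed instead of rescanning ABC.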
![Page 5: Parallel OLAP Andrew Rau-Chaplin Faculty of Computer Science Dalhousie University Joint Work with F. Dehne T. Eavis S. Hambrusch](https://reader036.vdocuments.mx/reader036/viewer/2022071806/56649dbc5503460f94aaed31/html5/thumbnails/5.jpg)
Core OLAP Operations
Five fundamental OLAP operations: roll-up, drill-down, slice, dice, and pivot
Range Queries
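To make the operations concrete, here is a small sketch over an in-memory fact table. The rows, the column names, and the functions `roll_up`, `slice_`, and `dice` are illustrative assumptions, not the interface of any particular OLAP server.

```python
from collections import defaultdict

# Illustrative fact table: three dimensions and one measure (sales).
ROWS = [
    {"model": "Chevy", "year": 1990, "colour": "Red", "sales": 5},
    {"model": "Chevy", "year": 1990, "colour": "Blue", "sales": 87},
    {"model": "Ford", "year": 1990, "colour": "Green", "sales": 64},
    {"model": "Ford", "year": 1990, "colour": "Blue", "sales": 99},
    {"model": "Ford", "year": 1991, "colour": "Red", "sales": 8},
    {"model": "Ford", "year": 1991, "colour": "Blue", "sales": 7},
]


def roll_up(rows, keep):
    """Sum sales over every dimension that is not listed in `keep`."""
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[d] for d in keep)] += r["sales"]
    return dict(totals)


def slice_(rows, dim, value):
    """Fix a single dimension to one value."""
    return [r for r in rows if r[dim] == value]


def dice(rows, allowed):
    """Restrict several dimensions to sets of permitted values."""
    return [r for r in rows if all(r[d] in vals for d, vals in allowed.items())]


by_model = roll_up(ROWS, ("model",))            # roll-up to the model level
by_model_year = roll_up(ROWS, ("model", "year"))  # drill-down: keep one more dimension
by_year_model = roll_up(ROWS, ("year", "model"))  # pivot: same cells, dimensions reordered
only_1990 = slice_(ROWS, "year", 1990)
ford_blue_red = dice(ROWS, {"model": {"Ford"}, "colour": {"Blue", "Red"}})
```

In this sketch drill-down is simply a roll-up that keeps more dimensions, and pivot changes only the order of the key, not the aggregated values.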
![Page 6: Parallel OLAP Andrew Rau-Chaplin Faculty of Computer Science Dalhousie University Joint Work with F. Dehne T. Eavis S. Hambrusch](https://reader036.vdocuments.mx/reader036/viewer/2022071806/56649dbc5503460f94aaed31/html5/thumbnails/6.jpg)
The Challenge
Design and build a parallel ROLAP system:
  Full cube generation
  Partial cube generation
  Indexing and query resolution
For:
  High dimensionality: 10 to 30 dimensions
  Large input data sizes: gigabytes
  Large output data sizes: terabytes
Implications:
  Parallel + external memory
  Shared disk + shared nothing
![Page 7: Parallel OLAP Andrew Rau-Chaplin Faculty of Computer Science Dalhousie University Joint Work with F. Dehne T. Eavis S. Hambrusch](https://reader036.vdocuments.mx/reader036/viewer/2022071806/56649dbc5503460f94aaed31/html5/thumbnails/7.jpg)
The Architectural Model
Shared Disk
  A set of p processors connected via an interconnection fabric
  Standard-sized local memory
  Concurrent access to a shared disk array
Shared Nothing
  A set of p processors connected via an interconnection fabric
  Standard-sized local memory
  Independent local disk(s)
Algorithm Design: CGM (Coarse Grained Multicomputer)
[Figure: two diagrams, each showing processors p1, p2, p3, p4, ..., pn attached to a communication fabric.]
![Page 8: Parallel OLAP Andrew Rau-Chaplin Faculty of Computer Science Dalhousie University Joint Work with F. Dehne T. Eavis S. Hambrusch](https://reader036.vdocuments.mx/reader036/viewer/2022071806/56649dbc5503460f94aaed31/html5/thumbnails/8.jpg)
Coarse Grained Multicomputer
A set of p processors
Arbitrary communication topology or shared memory
m memory per processor, m >> p
A communication round consists of an h-relation in which every processor sends and receives O(m) data
[Figure: processors attached to a communication fabric.]
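One communication round can be simulated sequentially to see what the O(m) bound means. This is only a sketch under assumptions: the functions `communication_round` and `partition_by_key` are hypothetical, and redistributing integer keys by `key % p` is an example workload chosen for illustration, not an algorithm from the talk.

```python
def communication_round(outboxes):
    """Simulate one all-to-all exchange; outboxes[i][j] is what processor i sends to j.

    Returns (inboxes, h), where h is the largest number of items that any one
    processor sends or receives during the round.
    """
    p = len(outboxes)
    inboxes = [[] for _ in range(p)]
    for src in range(p):
        for dst in range(p):
            inboxes[dst].extend(outboxes[src][dst])
    sent = [sum(len(msg) for msg in outboxes[i]) for i in range(p)]
    received = [len(box) for box in inboxes]
    return inboxes, max(sent + received)


def partition_by_key(local_items, p):
    """Local computation step: bucket each key by its destination processor."""
    out = [[] for _ in range(p)]
    for key in local_items:
        out[key % p].append(key)
    return out


p = 4
m = 8  # items held by each processor, with m larger than p
local_data = [list(range(i * m, (i + 1) * m)) for i in range(p)]
outboxes = [partition_by_key(items, p) for items in local_data]
inboxes, h = communication_round(outboxes)
```

Here every processor sends its m items and receives m items, so the round is an h-relation with h = m. An algorithm in this model is judged largely by how few such rounds it needs.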
![Page 9: Parallel OLAP Andrew Rau-Chaplin Faculty of Computer Science Dalhousie University Joint Work with F. Dehne T. Eavis S. Hambrusch](https://reader036.vdocuments.mx/reader036/viewer/2022071806/56649dbc5503460f94aaed31/html5/thumbnails/9.jpg)
MOLAP vs. ROLAP
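Data cube stored as a relation (ALL marks an aggregated dimension):

| Model | Year | Colour | Sales |
|-------|------|--------|-------|
| Chevy | 1990 | Blue   | 87    |
| Chevy | 1990 | Red    | 5     |
| Chevy | 1990 | ALL    | 92    |
| Chevy | ALL  | Blue   | 87    |
| Chevy | ALL  | Red    | 5     |
| Chevy | ALL  | ALL    | 92    |
| Ford  | 1990 | Blue   | 99    |
| Ford  | 1990 | Green  | 64    |
| Ford  | 1990 | ALL    | 163   |
| Ford  | 1991 | Blue   | 7     |
| Ford  | 1991 | Red    | 8     |
| Ford  | 1991 | ALL    | 15    |
| Ford  | ALL  | Blue   | 106   |
| Ford  | ALL  | Green  | 64    |
| Ford  | ALL  | Red    | 8     |
| ALL   | 1990 | Blue   | 186   |
| ALL   | 1990 | Green  | 64    |
| ALL   | 1990 | Red    | 5     |
| ALL   | 1991 | Blue   | 7     |
| ALL   | 1991 | Red    | 8     |
| Ford  | ALL  | ALL    | 178   |
| ALL   | 1990 | ALL    | 255   |
| ALL   | 1991 | ALL    | 15    |
| ALL   | ALL  | Blue   | 193   |
| ALL   | ALL  | Green  | 64    |
| ALL   | ALL  | Red    | 13    |
| ALL   | ALL  | ALL    | 270   |

Base table:

| Model | Year | Colour | Sales |
|-------|------|--------|-------|
| Chevy | 1990 | Red    | 5     |
| Chevy | 1990 | Blue   | 87    |
| Ford  | 1990 | Green  | 64    |
| Ford  | 1990 | Blue   | 99    |
| Ford  | 1991 | Red    | 8     |
| Ford  | 1991 | Blue   | 7     |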
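The relational form of the cube can be derived from the six base rows by aggregating over every combination of dimensions. The sketch below is a naive sequential version for illustration only; the function `data_cube` is hypothetical and is not the parallel method discussed in this talk.

```python
from collections import defaultdict
from itertools import product

# Base table from the slide: (Model, Year, Colour, Sales).
BASE = [
    ("Chevy", 1990, "Red", 5),
    ("Chevy", 1990, "Blue", 87),
    ("Ford", 1990, "Green", 64),
    ("Ford", 1990, "Blue", 99),
    ("Ford", 1991, "Red", 8),
    ("Ford", 1991, "Blue", 7),
]


def data_cube(rows, num_dims=3):
    """Sum the measure for every group-by, writing ALL for aggregated dimensions."""
    cube = defaultdict(int)
    for row in rows:
        dims, measure = row[:num_dims], row[num_dims]
        # Each mask decides, per dimension, whether to keep the value or hide it as ALL.
        for mask in product((False, True), repeat=num_dims):
            key = tuple("ALL" if hide else value for value, hide in zip(dims, mask))
            cube[key] += measure
    return dict(cube)


cube = data_cube(BASE)
print(cube[("ALL", "ALL", "ALL")])   # grand total
print(cube[("Ford", "ALL", "Blue")])  # Ford, all years, blue
```

Each base row contributes to 2^3 = 8 cube cells, which is why the output grows so quickly with the number of dimensions. The grand total `cube[("ALL", "ALL", "ALL")]` comes to 270, the same value as the ALL/ALL/ALL row of the table.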
![Page 10: Parallel OLAP Andrew Rau-Chaplin Faculty of Computer Science Dalhousie University Joint Work with F. Dehne T. Eavis S. Hambrusch](https://reader036.vdocuments.mx/reader036/viewer/2022071806/56649dbc5503460f94aaed31/html5/thumbnails/10.jpg)
Existing Parallel Results
Goil & Choudhary
MOLAP approach
Parallelize the generation of each cuboid
Challenge: more than 2^d communication rounds
![Page 11: Parallel OLAP Andrew Rau-Chaplin Faculty of Computer Science Dalhousie University Joint Work with F. Dehne T. Eavis S. Hambrusch](https://reader036.vdocuments.mx/reader036/viewer/2022071806/56649dbc5503460f94aaed31/html5/thumbnails/11.jpg)
Parallelizing the Data Cube
Generating Data Cubes (Shared Disk)
Generating Data Cubes (Shared Nothing)
Generating Partial Data Cubes
Parallel Multi-dimensional Indexing
Conclusions and Future Work