Efficient Top-Down Planning in Business Intelligence
Tobias Lauer • Alexander Haberstroh
Jedox AG
Collaborators: Christoffer Anselm • Zurab Khadikov • Steffen Wittmer
Business Intelligence and Corporate Planning
Jedox Suite
Jedox for Excel
Jedox OLAP
Jedox Excel AddIn
Microsoft Excel
SAP Connectivity
Multiprocessor Scalability
Jedox OO-Addin
Open Office Calc
GPU Acceleration
Jedox ETL Data Integration
Data Analysis
Spreadsheet Front-End
Supervision (Events, LDAP)
3rd Party Access (ODBO)
Jedox Web
Jedox Spreadsheet
Jedox Analyzer
Excel2Web
Jedox OLAP Manager
Jedox User Manager
Jedox Report Manager
Jedox ETL Manager
Jedox Mobile
Web Front-End
Mobile Front-End (requires Mobile Server)
iOS App (iPhone, iPad)
Android App
Android widgets
Mobile Server
Online Analytical Processing (OLAP)
• Data modeled as multidimensional “cube”
– Dimensions are structured hierarchically:
• Base elements
• Consolidated elements
– Operations:
• Analysis:
– Multidimensional aggregation (bottom-up)
• Planning:
– Data distribution (top-down)
[Figure: example cube. Time dimension: months Jan to Dec, consolidated into Q1 to Q4 and Year. Region dimension: All regions, consolidating Europe (France, Italy, UK) and North America (USA, Canada, Mexico). Third dimension: Actual, Budget, Deviation.]
Storage model
• Only store base cells with value ≠ 0
• All higher-level (consolidated) cell values are calculated on demand
• Benefits: memory savings, data consistency
Paths and values
Zero and consolidated
values are not stored!
Path compression
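The storage model above can be illustrated with a minimal sketch. The toy cube, the `children` hierarchy and the function names are assumptions made for illustration; they are not the Jedox data structures, and path compression is not modeled.

```python
from itertools import product

# Hypothetical toy cube: only non-zero base cells are stored, keyed by path.
# A path is a tuple with one element per dimension.
cells = {(0, 0): 16.0, (0, 2): 3.0, (1, 1): 13.0}

# Hypothetical hierarchy: a consolidated element lists its base elements.
children = {"Q1": [0, 1, 2]}

def expand(element):
    """Return the base elements below an element (itself if it is a base element)."""
    return children.get(element, [element])

def value(path):
    """Compute a cell value on demand; consolidated values are never stored."""
    base_paths = product(*(expand(e) for e in path))
    # Cells missing from the store count as zero.
    return sum(cells.get(p, 0.0) for p in base_paths)

total = value(("Q1", "Q1"))   # aggregates all nine base cells
empty = value((2, 2))         # unstored base cell
```

Because consolidated values are derived on every read, a writeback only has to change base cells; every aggregate that depends on them stays consistent automatically.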
Writeback in Top-Down Planning
• Writeback: the “opposite direction” of aggregation
• A value inserted at a high level of aggregation is broken down to lower levels until the base level is reached
• All underlying base cells are modified, depending on the type of writeback
Ranges and areas
• Base elements in each dimension are
collected in ranges
D0: { [0,0] , [2,2] } |D0| = 2
D1: { [0,0] , [2,3] } |D1| = 3
D2: { [0,2] } |D2| = 3
• The Cartesian product of ranges across all dimensions forms an area:
D0 × D1 × D2
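As a small worked example, the ranges listed above can be expanded into the paths of the area. This is a sequential sketch only; the helper names `elements` and `area_paths` are illustrative assumptions.

```python
from itertools import product

# Ranges from the slide: inclusive [start, end] intervals of base-element indices.
D0 = [(0, 0), (2, 2)]
D1 = [(0, 0), (2, 3)]
D2 = [(0, 2)]

def elements(ranges):
    """List every base-element index covered by the ranges of one dimension."""
    return [i for start, end in ranges for i in range(start, end + 1)]

def area_paths(dims):
    """Cartesian product of the per-dimension elements: one path per base cell."""
    return list(product(*(elements(d) for d in dims)))

paths = area_paths([D0, D1, D2])
sizes = [len(elements(d)) for d in (D0, D1, D2)]
```

With |D0| = 2, |D1| = 3 and |D2| = 3, the area contains 2 · 3 · 3 = 18 base cells.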
Multiply-base distribution
Set-base distribution
• Every relevant base cell in the area
is set to the same given value
• Naïve approach:
search for all relevant paths and
replace cell values
– Problem: what about zero-value cells,
which are not represented?
• Better approach:
(1) Delete all existing cells in area
(2) Create all cells in area with new
value
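The "better approach" can be sketched sequentially as follows. The dictionary store and the name `set_base` are assumptions for illustration; the GPU implementation described in the talk is not shown here.

```python
from itertools import product

def set_base(cells, area, value):
    """Set every base cell of the area to the same value.

    cells: dict mapping path -> non-zero value (sparse store)
    area:  list of paths (tuples) forming the target area
    """
    # (1) Delete all existing cells in the area.
    for path in area:
        cells.pop(path, None)
    # (2) Create all cells in the area with the new value.
    # Assumption: a value of 0 creates nothing, since zeros are not stored.
    if value != 0:
        for path in area:
            cells[path] = value

# Hypothetical data: two stored cells inside the area, one outside.
cells = {(0, 0): 13, (0, 1): 29, (5, 5): 60}
area = list(product(range(2), range(3)))
set_base(cells, area, 10)
```

Deleting first avoids the search problem of the naïve approach: cells that were zero, and therefore had no stored path, are simply created along with all others.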
Parallel creation of all cell paths in an area
• “Parallel enumeration” of the
area:
– Each thread computes the path of “its” cell from the thread ID
– Problem:
• Gaps between ranges of a
dimension prevent simple iterations
• Iterating over all ranges and counting
all visited elements is inefficient
– Solution:
• Represent ranges by pre-calculated
prefix sums (rather than start and
end points)
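One way a thread could derive "its" cell from its ID is a mixed-radix decomposition of the ID into one rank per dimension. The sketch below is an assumption about how this step might look, written sequentially in Python rather than as a GPU kernel; a rank is the position among the relevant elements of a dimension, not yet the element index.

```python
def ranks_from_thread_id(tid, sizes):
    """Decompose a linear thread ID into one rank per dimension.

    sizes[d] is the number of relevant base elements in dimension d.
    The last dimension varies fastest.
    """
    ranks = []
    for size in reversed(sizes):
        tid, rank = divmod(tid, size)
        ranks.append(rank)
    return tuple(reversed(ranks))

# Dimension sizes of the example area: |D0| = 2, |D1| = 3, |D2| = 3.
sizes = [2, 3, 3]
all_ranks = [ranks_from_thread_id(t, sizes) for t in range(2 * 3 * 3)]
```

Every thread ID from 0 to 17 yields a distinct rank tuple, so no coordination between threads is needed. Turning a rank into an element index still requires skipping the gaps between ranges.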
Prefix sum representation of ranges
Prefix sums of range lengths: r
Prefix sums of gap lengths: g
Index i of the k-th relevant element in D:
(1) Find the smallest m such that r[m] ≥ k
(2) i = g[m] + k
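The lookup can be sketched as below. Two assumptions are made: ranks k are counted from 0 here, so step (1) becomes "smallest m with r[m] > k" and the result is a 0-based element index; and the gap before the first range is included in g. The function names are illustrative.

```python
from bisect import bisect_right
from itertools import accumulate

def prefix_sums(ranges):
    """Return (r, g): inclusive prefix sums of range lengths and of gap lengths."""
    lengths, gaps, prev_end = [], [], -1
    for start, end in ranges:
        lengths.append(end - start + 1)
        gaps.append(start - prev_end - 1)   # elements skipped before this range
        prev_end = end
    return list(accumulate(lengths)), list(accumulate(gaps))

def kth_element(k, r, g):
    """Element index of the k-th relevant element (k counted from 0)."""
    m = bisect_right(r, k)   # binary search: smallest m with r[m] > k
    return g[m] + k

# D1 = { [0,0], [2,3] } from the ranges example.
r, g = prefix_sums([(0, 0), (2, 3)])
indices = [kth_element(k, r, g) for k in range(r[-1])]
```

Since r is sorted, the search takes logarithmic time in the number of ranges, instead of iterating over all ranges and counting visited elements.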
Add-base distribution
• The same given value v is added
to the value of each relevant base
cell
• Approach:
– Create all cells of the area and set
value to v (as before) and store them
temporarily
– Find all previously existing relevant
cells and add their (old) value to the
one in the new temporary area
– Delete old relevant cells and persist
temporary storage
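The three steps can be sketched sequentially as follows. The dictionary store and the name `add_base` are assumptions for illustration; dropping results that happen to be zero is also an assumption, made to stay consistent with the storage model.

```python
from itertools import product

def add_base(cells, area, v):
    """Add v to every base cell of the area, including unstored zero cells."""
    # Create all cells of the area with value v in temporary storage.
    temp = {path: v for path in area}
    # Add the old value of every previously existing cell.
    for path in area:
        if path in cells:
            temp[path] += cells[path]
    # Delete the old cells and persist the temporary storage.
    for path in area:
        cells.pop(path, None)
    cells.update({p: x for p, x in temp.items() if x != 0})

# Hypothetical data: two stored cells inside the area, one outside.
cells = {(0, 0): 13, (0, 1): 29, (9, 9): 60}
area = list(product(range(2), range(3)))
add_base(cells, area, 5)
```

Cells that existed before end up with old value plus 5 (13 becomes 18, 29 becomes 34), while previously empty cells in the area now hold 5.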
Performance tests
Timings in ms (speed-up compared to CPU in parentheses)

Operation         CPU              2x GPU          3x GPU          4x GPU
                  Intel DualCore   GeForce 260     Tesla C1060     Tesla C2050
Multiply-base 1   3,548            466 (7.6x)      558 (6.4x)      435 (8.2x)
Refresh 1         1,131            200 (5.7x)      127 (8.9x)      74 (15.3x)
Sum               4,679            666 (7.0x)      685 (6.8x)      509 (9.2x)

Multiply-base 2   21,542           513 (42x)       580 (37x)       448 (48x)
Refresh 2         5,508            961 (5.7x)      617 (8.9x)      347 (16x)
Sum               27,050           1,474 (18x)     1,197 (23x)     795 (34x)

Set-base          14,979           900 (17x)       715 (21x)       572 (26x)
Refresh           5,598            962 (5.8x)      610 (9.2x)      347 (16x)
Sum               20,577           1,862 (11x)     1,325 (16x)     919 (22x)

Add-base          110,387          1,465 (75x)     899 (123x)      872 (127x)
Refresh           5,621            953 (5.9x)      608 (9x)        346 (16x)
Sum               116,008          2,418 (48x)     1,507 (77x)     1,218 (95x)
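The speed-up factors in parentheses are the CPU time divided by the GPU time. A minimal check, using the Add-base "Sum" row and one Multiply-base 1 timing from the measurements; the function name is an illustrative assumption.

```python
def speedup(cpu_ms, gpu_ms):
    """Speed-up factor of a GPU run relative to the CPU run."""
    return cpu_ms / gpu_ms

# Add-base "Sum" row: CPU vs. 2x GeForce 260, 3x Tesla C1060, 4x Tesla C2050.
add_base_sum = [round(speedup(116008, t)) for t in (2418, 1507, 1218)]

# Multiply-base 1 on 2x GeForce 260.
multiply_base_1 = round(speedup(3548, 466), 1)
```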
Speed-up factors (compared to CPU)
[Bar chart; values correspond to the "Sum" rows of the timings]

Operation         2x GeForce 260   3x Tesla C1060   4x Tesla C2050
Multiply-base 1   7x               7x               9x
Set-base          11x              16x              22x
Multiply-base 2   18x              23x              34x
Add-base          48x              77x              95x
Concluding remarks
• Top-down planning creates and/or manipulates large numbers of data records
– These updates are systematic and structured
– GPUs are well suited to executing them in parallel
– The CUDA implementation ran 7x to 95x faster than the sequential CPU algorithm in the tests above (operation plus refresh)
– It can benefit from multiple GPUs
• The approach is commercially implemented in Jedox Suite
[Figures: example cube data, shown once with explicit zero cells and otherwise in sparse form with zero cells omitted.]
[Figure sequence: Multiply-base distribution. The base cells of the selected area sum to 240. Multiplying by 2 doubles each stored base cell in the area (for example, 13 becomes 26 and 29 becomes 58), so the area then sums to 480.]
[Figure sequence: Set-base distribution with value 10. The existing cells in the 4 × 5 target area are deleted and all 20 cells of the area are created with value 10, including cells that were previously zero.]
[Figure sequence: Add-base distribution with value 5. A temporary 4 × 5 area is created with value 5 in every cell, the old values of existing cells are added to it (for example, 13 becomes 18 and 29 becomes 34), the old cells are deleted and the temporary area is persisted. The area then sums to 340 (240 + 20 × 5).]