Data Vault Partitioning Strategies

Dani Schnider, Trivadis AG

DOAG Conference, 23 November 2017

@dani_schnider DOAG2017


Our company.


Trivadis is a market leader in IT consulting, system integration, solution engineering and the provision of IT services focusing on [logo] and [logo] technologies in Switzerland, Germany, Austria and Denmark. We offer our services in the following strategic business fields:

OPERATION: Trivadis Services takes over the interacting operation of your IT systems.

Locations: Copenhagen, Munich, Lausanne, Bern, Zurich, Brugg, Geneva, Hamburg, Düsseldorf, Frankfurt, Stuttgart, Freiburg, Basel, Vienna

With over 600 specialists and IT experts in your region.

14 Trivadis branches and more than 600 employees

200 Service Level Agreements

Over 4,000 training participants

Research and development budget: CHF 5.0 million

Financially self-supporting and sustainably profitable

Experience from more than 1,900 projects per year at over 800 customers


Dani Schnider


Working for Trivadis in Glattbrugg/Zurich

– Senior Principal Consultant

– Data Warehouse Lead Architect

– Trainer of several courses

Co-Author of the books

– Data Warehousing mit Oracle

– Data Warehouse Blueprints

Certified Data Vault Data Modeler


@dani_schnider danischnider.wordpress.com


Data Vault Tables

HUB
– Surrogate Key (PK)
– Business Key(s) (UK)
– Load Date
– Record Source

LINK
– Surrogate Key (PK)
– Foreign Key Hub 1
– Foreign Key Hub 2
– ...
– Load Date
– Record Source

SATELLITE
– Foreign Key to Hub (PK)
– Load Date (PK)
– Load End Date (optional)
– Context Attribute 1
– Context Attribute 2
– ...
– Context Attribute n
– Record Source


Source: How to Create a Data Vault Model, https://youtu.be/Q1qj_LjEawc


Example Data Vault Model (Subset)


Partitioning by Load Date


Partitioning by Load Date: A Good Strategy?


SID LOAD_DATE LOAD_END_DATE

1 01.01.2014 13.04.2014

1 13.04.2014 28.07.2015

1 28.07.2015 09.02.2017

1 09.02.2017 31.12.9999

2 15.03.2014 31.12.9999

3 26.06.2016 10.03.2017

3 10.03.2017 13.03.2017

3 13.03.2017 14.03.2017

3 14.03.2017 31.12.9999
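One way to read the version table above: the current versions (LOAD_END_DATE = 31.12.9999) carry load dates spread over several years, so a search for current rows cannot be narrowed down by load date. A minimal sketch, assuming a hypothetical satellite S_DEMO that holds the rows above and is range-partitioned by LOAD_DATE (the table name is not from the slides):

```sql
-- S_DEMO(SID, LOAD_DATE, LOAD_END_DATE) is a hypothetical table used only
-- for illustration; assume it is range-partitioned by LOAD_DATE.
SELECT sid, load_date
  FROM s_demo
 WHERE load_end_date = DATE '9999-12-31';

-- In the sample rows the current versions have the load dates
-- 09.02.2017 (SID 1), 15.03.2014 (SID 2) and 14.03.2017 (SID 3).
-- The predicate does not reference LOAD_DATE, so no load-date partition
-- can be excluded by partition pruning for this query.
```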


Partitioning by Load Date: Use Cases

Master Data (Changing Data)          | Transactional Data (Events)
Product                              | Sales Transaction
Customer                             | Order
Employee                             | Web Tracking
…                                    | …
Beer (general product description)   | Brew (particular brew batch)


Partitioning by Load Date: Overview

[Diagram: example model with hubs H_Beer and H_Brew, link L_Beer_Brew, and satellites S_Beer, S_Recipe and S_Brew_Journal]

Partitioning by Load Date: Hub Example

CREATE TABLE H_BREW (
  H_Brew_Key     RAW(16)           NOT NULL,
  Brew_No        NUMBER(4)         NOT NULL,
  Load_Date      DATE              NOT NULL,
  Record_Source  VARCHAR2(4 CHAR)  NOT NULL
)
PARTITION BY RANGE (Load_Date)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
(
  PARTITION p_old_data
    VALUES LESS THAN (TO_DATE('01-01-2015', 'dd-mm-yyyy'))
);

Partitioning by Load Date: Satellite Example

CREATE TABLE S_BREW_JOURNAL (
  H_Brew_Key     RAW(16)           NOT NULL,
  Load_Date      DATE              NOT NULL,
  Brew_Date      DATE              NOT NULL,
  Brewer         VARCHAR2(40),
  ...
  Record_Source  VARCHAR2(4 CHAR)  NOT NULL
)
PARTITION BY RANGE (Load_Date)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
(
  PARTITION p_old_data
    VALUES LESS THAN (TO_DATE('01-01-2015', 'dd-mm-yyyy'))
);

Partitioning by Load Date

✓ Partition Pruning
✗ Partition-wise Join
✓ Rolling History
✓ Data Distribution
✓ Partition Exchange

Restrictions:
– Only for transactional data
– Global indexes should be avoided
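Two of the benefits listed above can be illustrated on the S_BREW_JOURNAL satellite from the earlier example. This is a sketch only; the date values are arbitrary assumptions and not taken from the slides.

```sql
-- Partition pruning: the predicate on Load_Date lets the optimizer restrict
-- the scan to the monthly partition that covers March 2017.
SELECT COUNT(*)
  FROM s_brew_journal
 WHERE load_date >= DATE '2017-03-01'
   AND load_date <  DATE '2017-04-01';

-- Rolling history: remove one month of old data by dropping the interval
-- partition that contains the given date (here: January 2015).
ALTER TABLE s_brew_journal DROP PARTITION FOR (DATE '2015-01-15');
```

Dropping a partition marks global indexes on the table as unusable unless they are maintained as part of the statement, which is one reason for the restriction on global indexes.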


Partitioning by Load Date: Global Index Issue

ALTER TABLE H_BREW ADD CONSTRAINT H_BREW_PK
  PRIMARY KEY (H_Brew_Key)
  RELY DISABLE NOVALIDATE;

ALTER TABLE H_BREW ADD CONSTRAINT H_BREW_UN
  UNIQUE (Brew_No)
  RELY DISABLE NOVALIDATE;

ALTER TABLE S_BREW_JOURNAL ADD CONSTRAINT S_BREW_JOURNAL_PK
  PRIMARY KEY (H_Brew_Key, Load_Date)
  RELY DISABLE NOVALIDATE;

ALTER TABLE S_BREW_JOURNAL ADD CONSTRAINT H_BREW_S_BREW_JOURNAL_FK
  FOREIGN KEY (H_Brew_Key) REFERENCES H_BREW (H_Brew_Key)
  RELY DISABLE NOVALIDATE;

How to Avoid Global Indexes (PK/UK on Hubs)?
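Because the constraints above are declared with DISABLE, Oracle does not create a unique index to enforce them, so no global index comes into existence. If key lookups still need index support, one possible option is a non-unique local index, which is equipartitioned with the table. The index names below are hypothetical, and whether such indexes are needed at all is an assumption, not a recommendation from the slides.

```sql
-- Hypothetical local (partition-aligned) indexes for key lookups.
-- They are non-unique, so the partition key Load_Date does not have to be
-- part of the indexed columns.
CREATE INDEX h_brew_key_ix
    ON h_brew (h_brew_key) LOCAL;

CREATE INDEX s_brew_journal_key_ix
    ON s_brew_journal (h_brew_key, load_date) LOCAL;
```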


Partitioning by Load End Date


Partitioning by Load End Date: Find Current Versions


SID LOAD_DATE LOAD_END_DATE

1 01.01.2014 13.04.2014

1 13.04.2014 28.07.2015

1 28.07.2015 09.02.2017

1 09.02.2017 31.12.9999

2 15.03.2014 31.12.9999

3 26.06.2016 10.03.2017

3 10.03.2017 13.03.2017

3 13.03.2017 14.03.2017

3 14.03.2017 31.12.9999


Partitioning by Load End Date: Overview

[Diagram: the example model (hubs H_Beer and H_Brew, link L_Beer_Brew, satellites S_Beer, S_Recipe and S_Brew_Journal), with a History Partition and a Current Partition shown three times, presumably once per satellite]

Partitioning by Load End Date: Satellite Example

CREATE TABLE S_RECIPE (
  H_Beer_Key      RAW(16)           NOT NULL,
  Load_Date       DATE              NOT NULL,
  Load_End_Date   DATE DEFAULT ON NULL TO_DATE('31-12-9999', 'dd-mm-yyyy'),
  Start_Temp      NUMBER(3),
  Mashing_Time_1  NUMBER(3),
  Mashing_Temp_1  NUMBER(3),
  …
  Record_Source   VARCHAR2(4 CHAR)  NOT NULL
)
ENABLE ROW MOVEMENT
PARTITION BY LIST (Load_End_Date)
(
  PARTITION p_current VALUES (TO_DATE('31-12-9999', 'dd-mm-yyyy')),
  PARTITION p_history VALUES (DEFAULT)
);
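A short sketch of how this layout might be used, based on the S_RECIPE definition above. The bind variables :beer_key and :new_load_date are hypothetical placeholders.

```sql
-- End-date the current version of one recipe. Changing Load_End_Date moves
-- the row from p_current to p_history, which is why ROW MOVEMENT is enabled.
UPDATE s_recipe
   SET load_end_date = :new_load_date
 WHERE h_beer_key    = :beer_key
   AND load_end_date = TO_DATE('31-12-9999', 'dd-mm-yyyy');

-- Find all current versions. The predicate on the partition key allows
-- partition pruning to p_current, so the history partition is not scanned.
SELECT h_beer_key, load_date, start_temp
  FROM s_recipe
 WHERE load_end_date = TO_DATE('31-12-9999', 'dd-mm-yyyy');
```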


Partitioning by Load End Date

✓ Partition Pruning
✗ Partition-wise Join
✗ Rolling History
✓ Data Distribution
✓ Partition Exchange

Restrictions:
– Only for Satellites, requires LOAD_END_DATE
– ENABLE ROW MOVEMENT required


Partitioning by Load End Date: Partition Exchange


Special Implementation of Satellite Load Jobs

Only useful if most versions are replaced

[Diagram: S_Recipe with a History Partition and a Current Partition, plus a Load Table. Labels: insert new versions, insert unchanged versions, move old versions, Partition Exchange]

1. Load Table contains

• All new versions

• All unchanged versions

2. Move old versions to history partition

• Insert rows with load end date

3. Exchange current partition
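The three steps above could be sketched as follows. This is an illustration under several assumptions: the load table name S_RECIPE_LOAD is hypothetical, the FOR EXCHANGE clause requires Oracle 12.2 or later, the columns elided with "…" in the S_RECIPE example would have to be added to the column lists, and the logic that fills the load table is not shown because it depends on the source system.

```sql
-- Step 1: create a load table with the same column structure as S_RECIPE.
-- It must then be filled with all new versions and all unchanged current
-- versions, each with Load_End_Date = 31-12-9999 (load logic not shown).
CREATE TABLE s_recipe_load FOR EXCHANGE WITH TABLE s_recipe;

-- Step 2: copy the replaced versions into the history partition.
-- A current row counts as replaced if the load table holds a newer version
-- for the same hub key; its load end date is the load date of that version.
INSERT INTO s_recipe
       (h_beer_key, load_date, load_end_date,
        start_temp, mashing_time_1, mashing_temp_1, record_source)
SELECT c.h_beer_key, c.load_date, n.load_date,
       c.start_temp, c.mashing_time_1, c.mashing_temp_1, c.record_source
  FROM s_recipe PARTITION (p_current) c
  JOIN s_recipe_load n
    ON n.h_beer_key = c.h_beer_key
   AND n.load_date  > c.load_date;

-- Step 3: swap the load table with the current partition.
ALTER TABLE s_recipe EXCHANGE PARTITION p_current WITH TABLE s_recipe_load;
```

After the exchange, S_RECIPE_LOAD holds the previous content of the current partition and can be truncated or dropped.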


Partitioning by Hub Key


Partitioning by Hub Key

Improve Join Performance: Full Partition-wise Joins
– Between Hubs and Satellites
– Between Links and Hubs

Equal Distribution with HASH Partitioning

Run Extraction Queries in Parallel

Partition Key:
– Primary Key of Hub
– Foreign Key of Satellite / Link

Link: Composite HASH-HASH Partitioning

[Diagram: Hub and Satellite, each with partitions part 1 to part 4; parallel slaves slave1 to slave4 each join one pair of partitions and report to the query coordinator process, proc(QC)]

Partitioning by Hub Key: Overview

[Diagram: example model with hubs H_Beer and H_Brew, link L_Beer_Brew, and satellites S_Beer, S_Recipe and S_Brew_Journal]

Partitioning by Hub Key: Hub Example

CREATE TABLE H_BEER (
  H_Beer_Key     RAW(16)           NOT NULL,
  Beer_Name      VARCHAR2(40)      NOT NULL,
  Load_Date      DATE              NOT NULL,
  Record_Source  VARCHAR2(4 CHAR)  NOT NULL
)
PARTITION BY HASH (H_Beer_Key) PARTITIONS 8;

Partitioning by Hub Key: Satellite Example

CREATE TABLE S_BEER_DESCRIPTION (
  H_Beer_Key     RAW(16)           NOT NULL,
  Load_Date      DATE              NOT NULL,
  Style          VARCHAR2(40),
  ABV            NUMBER(3,1),
  IBU            NUMBER(3),
  Seasonal       VARCHAR2(10),
  Label_Color    VARCHAR2(10),
  Record_Source  VARCHAR2(4 CHAR)  NOT NULL
)
PARTITION BY HASH (H_Beer_Key) PARTITIONS 8;
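Since H_BEER and S_BEER_DESCRIPTION are hash-partitioned on the same key with the same number of partitions, a join on H_Beer_Key is a candidate for a full partition-wise join. A minimal sketch; the degree of parallelism 4 is an arbitrary assumption, and the optimizer decides whether the partition-wise join is actually used.

```sql
-- Join hub and satellite on the common partition key, running in parallel.
EXPLAIN PLAN FOR
SELECT /*+ PARALLEL(4) */
       h.beer_name, s.load_date, s.style, s.abv, s.ibu
  FROM h_beer h
  JOIN s_beer_description s
    ON s.h_beer_key = h.h_beer_key;

-- Show the execution plan. In a full partition-wise join, a partition
-- iterator such as PX PARTITION HASH ALL typically appears above the join.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```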


Partitioning by Hub Key: Link Example

CREATE TABLE L_BEER_BREW (
  L_Beer_Brew_Key  RAW(16)           NOT NULL,
  H_Beer_Key       RAW(16)           NOT NULL,
  H_Brew_Key       RAW(16)           NOT NULL,
  Load_Date        DATE              NOT NULL,
  Record_Source    VARCHAR2(4 CHAR)  NOT NULL
)
PARTITION BY HASH (H_Beer_Key)
SUBPARTITION BY HASH (H_Brew_Key) SUBPARTITIONS 8
PARTITIONS 8;

Partitioning by Hub Key

✗ Partition Pruning
✓ Partition-wise Join
✗ Rolling History
✓ Data Distribution
✗ Partition Exchange

Restrictions:
– At most two partition keys per Link


Conclusion


Partitioning Strategies: Benefits

                      Load Date 1)   Load End Date 2)   Hub Key
Partition Pruning     ✓              ✓                  ✗
Partition-wise Join   ✗              ✗                  ✓
Rolling History       ✓              ✗                  ✗
Data Distribution     ✓              ✓                  ✓
Partition Exchange    ✓              ✓                  ✗

1) Only for transactional data
2) Only for Satellites

White Paper: Data Vault Partitioning Strategies

Dani Schnider: Data Vault Partitioning Strategies, White Paper, 18 pages, www.trivadis.com, 30.10.2017

Download: https://danischnider.wordpress.com/publications/


Trivadis @ DOAG 2017 #opencompany

Booth: 3rd floor, right by the escalator

We share our know-how!
– Just drop by: live presentations and document archive
– T-shirts, prize draw and more
– We look forward to your visit