Data Aging Strategies in SAP BW 7.3
Rainer Uhle, SAP Product Manager
Dr. Peter Zimmerer, SAP Development Architect
Mannheim, Rosengarten - June 22, 2011
Data Aging Strategies in SAP Business Warehouse BW 7.3
© 2011 SAP AG. All rights reserved. 2
Disclaimer
This presentation outlines our general product direction and should not be relied on in
making a purchase decision. This presentation is not subject to your license agreement
or any other agreement with SAP. SAP has no obligation to pursue any course of
business outlined in this presentation or to develop or release any functionality
mentioned in this presentation. This presentation and SAP's strategy and possible future
developments are subject to change and may be changed by SAP at any time for any
reason without notice. This document is provided without a warranty of any kind, either
express or implied, including but not limited to, the implied warranties of
merchantability, fitness for a particular purpose, or non-infringement. SAP assumes no
responsibility for errors or omissions in this document, except if such damages were
caused by SAP intentionally or grossly negligent.
You Need Complete and Trusted Information to Make Good Business Decisions
“90% of upper level management feel they don’t have the necessary information for critical business decisions; 50% of them are afraid they are making poor decisions because of it.”
“BI strategies are deemed to fail without a trusted data foundation.”
“The #1 risk for building a data mart or data warehouse is data quality.”
How Good is the Data Behind My Dashboard?
• Are these terms consistent with our business definitions?
• Can I trust this data enough to make my critical decisions? Has the data passed all our business rule checks?
• How current is this data? When was it last updated?
• Where did these numbers come from? Are we considering all our relevant sources?
Enterprise Data Warehouse (EDW): Characteristics and Requirements
SAP NetWeaver Business Warehouse: Strong EDW capabilities
Integrated, scalable Enterprise Data Warehouse (EDW) platform
EDW = DBMS + X
Diagram labels: Reliable Data Acquisition, Business Content, Streamlined Operations, Lifecycle Management
Fast, sustainable implementation through:
• Modeling Patterns
• Business Content
Openness and data quality through:
• Out-of-the-box integration for data originating in SAP systems
• Integration with SAP BusinessObjects Data Services (Data Integrator and Data Quality Management)
Efficient data management through:
• Management of data consistency, database abstraction, database neutrality
• Sophisticated Security, Authorization and Identity Handling
• High availability
Sophisticated lifecycle management at different levels:
• System
• Meta Data
• Data (Nearline storage, archiving)
What does BW know about my Business?
Introduction to the term "Layered, Scalable Architecture (LSA)"
The Layered, Scalable Architecture (LSA) is SAP's standard term, intended to establish a common, unified understanding.
The LSA is a Reference Architecture and not only a data model.
At the center is the service idea of the reference architecture: Each layer provides a service that can be used.
• Layered: Layer-based data model in which each layer performs a specific task.
• Scalable: The data model is scalable and can be enhanced, for example by other source systems, regions and scenarios.
• Architecture: The LSA is an architecture that is applied in the entire BW system.
The LSA Reference Architecture layers
BI Applications (Architected Data Mart Layer):
• Reporting Layer: Layer optimized for reporting (consists of InfoCubes and MultiProviders)
• Business Transformation Layer: Application of Business Logic for the applications
• Operational Data Store: Near real-time reporting, close to operational reporting
EDW Layer (Single Point of truth, reusable, granular, complete history):
• Data Propagation Layer: Easily digestible, consumable, integrated and independent data
• Harmonisation Layer: Harmonization, securing data quality, plausibility
• Corporate Memory: Source system close structure, complete storage of history as granular as possible, “Master the Unknown”
• Data Acquisition Layer: Extractor inbox, 1:1 mapping, temporary storage
Data Sources
LSA Data Flow Templates as Content
SAP NetWeaver BW adoption
Adoption of SAP NetWeaver BW constantly growing
Unaffected by economic down-turn in 2009
More than 12,000 customers, accounting for more than 15,000 productive systems
[Chart: Productive SAP NetWeaver BW systems - constant growth]
Q1 09: 13,359; Q2 09: 13,728; Q3 09: 13,910; Q4 09: 14,214; Q1 10: 14,446; Q2 10: 14,687; Q3 10: 14,948; Q4 10: 15,238
Stable Product, Large installed Base, Constant Growth
Analyst Opinions
Forrester 2011
SAP BW EDW and Reality - "60 TB Proof of Concept" on RDBMS (IBM/DB2)
Discussions about corporate DWH architectures (EDW) are frequently driven by fears and prejudices. This results in vague questions like:
Can BW handle 30, 40, ..., 100 terabytes?
The answer:
SAP BW - 60 TB Proof of Concept
BW Accelerator Query Run Time
[Diagram: query processing split between SAP NetWeaver 7.0 Business Intelligence and SAP NetWeaver BW Accelerator]
Diagram labels: BW Analytical Engine, InfoCube, Indexing, Query & Response, Information
BW Accelerator tasks:
• Aggregation “on the fly”
• Merging and results preparation for BI queries
(*) property setting ("load index into main memory") or schedule program RSDDTREX_INDEX_LOAD_UNLOAD
BWA Linear Scalability - Data Volume vs. Resources (25 TB Showcase 2009)
Legend: total DB size, BWA resources, index creation throughput, multiuser reporting throughput, avg. report response time, avg. # records touched per report
• 5 TB, 27 blades: 0.6 TB / h; 100,000 reports / h; 4.5 sec; 6 M records
• 15 TB, 81 blades: 1.1 TB / h; 101,000 reports / h; 4.2 sec; 22 M records
• 25 TB, 135 blades: 1.2 TB / h; 101,000 reports / h; 4.2 sec; 37 M records
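The "linear scalability" claim can be checked with simple arithmetic on the showcase figures. The following is a minimal sketch in Python; the numbers are copied from the slide, while the variable and function names are purely illustrative. It shows that the data volume per blade stays constant at roughly 0.185 TB across all three configurations, while reporting throughput and response time remain nearly flat.

```python
# Figures copied from the 25 TB showcase slide:
# (total DB size in TB, number of BWA blades, reports per hour, avg. response time in sec)
showcase = [
    (5, 27, 100_000, 4.5),
    (15, 81, 101_000, 4.2),
    (25, 135, 101_000, 4.2),
]


def tb_per_blade(db_size_tb, blades):
    """Data volume handled by a single blade (illustrative helper)."""
    return db_size_tb / blades


# Linear scaling means this ratio is the same for every configuration.
ratios = [tb_per_blade(size, blades) for size, blades, _, _ in showcase]

for (size, blades, reports, resp), ratio in zip(showcase, ratios):
    print(f"{size:>2} TB on {blades:>3} blades: {ratio:.3f} TB per blade, "
          f"{reports} reports/h at {resp} sec")
```

Tripling or quintupling both data volume and blades keeps the per-blade load unchanged, which is what the slide presents as linear scalability.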
Bill Inmon's Corporate Information Factory & Nearline Storage
[Diagram, source: Bill Inmon. Components shown: sources (ERP, CRM, eComm., Bus. Int., Internet, Corporate Applications, Changed Data); ETL and Staging Area; EDW; Global ODS and local ODS; Oper. Mart; Departmental Data Marts (Marketing, Acctg, Finance, Sales); DSS Applications; Exploration warehouse / data mining; Granularity Manager; Web Logs and Session Analysis; Dialogue Manager, Cookie Cognition, Preformatted dialogues; Cross media Storage Management; Near line Storage; Archives]
Data-Aging Strategies for Volume Performance
Information Lifecycle according to Importance/Age (Data Category: Storage Type):
• Frequently read / changed data (actual): Online Database
• Infrequently read data (mature): Nearline Storage (read only)
• Very rarely read data (aged): Classic Archive (read only)
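The three-tier lifecycle above can be illustrated with a small classification routine. This is only a sketch in Python: the age thresholds (2 and 7 years) and the function name are hypothetical, since the presentation does not prescribe concrete cut-off values, and a real data aging concept would be driven by business requirements rather than age alone.

```python
from datetime import date

# Hypothetical thresholds; the presentation does not state concrete values.
NEARLINE_AFTER_YEARS = 2
ARCHIVE_AFTER_YEARS = 7


def storage_tier(record_date, today):
    """Map the age of a data slice to one of the three storage types on the slide."""
    age_years = (today - record_date).days / 365.25
    if age_years >= ARCHIVE_AFTER_YEARS:
        return "Classic Archive (read only)"    # very rarely read data (aged)
    if age_years >= NEARLINE_AFTER_YEARS:
        return "Nearline Storage (read only)"   # infrequently read data (mature)
    return "Online Database"                    # frequently read / changed data (actual)


today = date(2011, 6, 22)
for d in (date(2011, 1, 1), date(2008, 1, 1), date(2000, 1, 1)):
    print(d.isoformat(), "->", storage_tier(d, today))
```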
Key facts about SAP NLS
• NLS should be a part of an Information Lifecycle Management (ILM) strategy
• Data archived in NLS can be incorporated into reporting
• Process Chain support
• Copes with changes in the meta data to the BW objects of the archived data
• Mainly time-based archiving, yet can also be based on other characteristics
• Increases retention period for analysis data
• Supports archiving of InfoCubes and DataStore Objects
• Based on well-established SAP / SAP BW archiving concepts
• Data consistency guaranteed before deleting the data from source
• Included in the query statistic data collection (RSRT)
• Saves storage costs and other system resources
• Lock of the archived data slice in the original InfoProviders
• NLS is an application from a third party vendor, running on a separate system
• High compression rate (up to 95%)
• Scheduling and Monitoring of archiving sessions from SAP BW system
Evolution by SAP NetWeaver BW Releases
• Enhanced Look-Up API
• Suspension and selective continuation of archiving processes within Process Chains
• Restore of an archiving request with all successors
• Smaller Data Object size for ADK-based Nearline Solution without semantic grouping
SAP NetWeaver BW 7.00
• Support of write-optimized DataStore Objects for ADK archiving and the Nearline-Storage interface
• Request based Archiving
• Enhanced status and job monitoring within InfoProvider management view
SAP NetWeaver BW 7.01 (EhP1)
SAP NetWeaver BW 7.30
• Support for accessing Nearline-Storage data for MultiProviders
• Feature to allow archiving from uncompressed InfoCubes
• Archiving of Semantic Partitioned Objects (SPO) with SP1
• Automatic rebuild of BW Accelerator index possible
The Nearline Storage Solution for SAP NetWeaver BW
Based on the Nearline Storage Interface, Development Partners can implement their Solutions for Archiving and NLS into the SAP BW
3rd Party NLS Solutions
• are implemented within the SAP BW ABAP Stack in partner specific namespaces
• have to pass a certification process
• can offer a specific Application Area in the SAP Support Portal
• have to be licensed in addition to SAP licenses
• can have a different release cycle compared to SAP NetWeaver BW
Present development partners (in alphabetical order of their products) and whether certified since SAP BW 7.0:
• CBW® - PBS Software: yes
• Dynamic NearLine Access® - SAND Technology: yes
• DB2 Viper 9.5® - IBM: 7.01 SP6
• DataVard OutBoard 1.0: yes
(see also http://www.sap.com/ecosystem/customers/directories/SearchSolution.epx )
Diagram label: NLS Partner Solution
Customer Adoption - BW Archiving and Nearline Storage (based on 895 customer messages)
Data analysis and assistance for ROI analysis
Sizing of Nearline Storage solutions:
• Hardware sizing of the Nearline Storage solution has to be done by the vendor
• Different Nearline Storage technologies on the market: from database solutions, to file-based solutions, to column-based storage solutions
Data volume services by SAP Active Global Support (AGS)
http://service.sap.com/dvm
• Deliver a thorough analysis of BW objects distribution
• Can help in estimating the data volume that may be archived / transferred to NLS for the largest InfoProviders within the system
• Consider only “technical facts” (and not the customer’s “business requirements”)
Data Management with Nearline Storage: Implementation Aspects
1. Create a Data Archiving Process
2. Create and schedule archiving requests
3. Restore archiving requests
4. Load data to subsequent Data Targets
5. Look-up during Transformation
6. Query Settings
7. MultiProvider Settings
[Diagram: LSA data flow annotated with the numbers above. Components shown: DataSource, InfoPackage, PSA, InfoSource, DTPs, Data Acquisition Layer, Corporate Memory, Data Propagation Layer, Reporting Layer (Architected Data Marts) with SAP Sales InfoCube and MultiProvider, DAP, Nearline Storage]
Design Aspects - Nearline Storage (NLS) vs. BW Accelerator (BWA)
[Diagram: BI access frequency scale (very frequently, frequently, not frequently, rarely). Components shown: BWA, InfoMarts (InfoCube), RDBMS, Nearline Storage, ADK Archive. Process labels: Acquisition, Acceleration, Archiving]
Data Management at Query Runtime
The Data Manager identifies the availability of alternative data storage of any kind, such as
1. Data resides in the InfoProvider in the database
2. Data resides in a classical Aggregate
3. Data resides in the BW Accelerator Index
4. Data resides in an NLS Partition
Aggregate Types
• BW Accelerator Index
• NLS Partition
NLS Related MultiProvider Settings
Nearline read mode
• disabled at all
• enabled at all
• InfoProvider settings
MultiProvider: Query Runtime Statistics
Listing of Basis Providers and NLS partitions used during Query execution
NLS Related Query Designer Settings
Reporting
Fixed NLS Settings
• read NLS
• do not read NLS
• see InfoProvider settings
NLS Related Query Designer Settings: Variable
Variable NLS Settings
(Dialog)
• read NLS
• do not read NLS
• see InfoProvider settings
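How the nearline read settings on query, MultiProvider and InfoProvider level could combine is sketched below in Python. The option texts are taken from the slides; the precedence order (query setting first, then MultiProvider, then InfoProvider) is an assumption made for illustration, as are all class and function names. The slides do not spell out how BW resolves the three levels internally.

```python
from enum import Enum


class QueryMode(Enum):
    READ = "read NLS"
    DO_NOT_READ = "do not read NLS"
    INHERIT = "see InfoProvider settings"


class MultiProviderMode(Enum):
    DISABLED = "disabled at all"
    ENABLED = "enabled at all"
    INHERIT = "InfoProvider settings"


def reads_nearline(query_mode, infoprovider_reads_nls,
                   multiprovider_mode=MultiProviderMode.INHERIT):
    """Decide whether a query reads the nearline partition.

    Assumed precedence (not confirmed by the slides):
    query setting > MultiProvider setting > InfoProvider setting.
    """
    if query_mode is QueryMode.READ:
        return True
    if query_mode is QueryMode.DO_NOT_READ:
        return False
    if multiprovider_mode is MultiProviderMode.ENABLED:
        return True
    if multiprovider_mode is MultiProviderMode.DISABLED:
        return False
    return infoprovider_reads_nls


# A query that defers to the InfoProvider, run on a MultiProvider that also defers.
print(reads_nearline(QueryMode.INHERIT, infoprovider_reads_nls=True))
```

With a variable setting (dialog), the value of `query_mode` would simply be chosen by the user at runtime instead of being fixed in the query definition.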
InfoCube: Archiving of Uncompressed Data
Central setting in Data Archiving Process (DAP)
Valid for all archiving requests and DAP variants
Can be changed during operation
Prerequisite: only already processed requests (aggregates, Delta DTP)
Screenshot label: Allow Archiving for non-compressed data
Data Management at Archiving Runtime
During the delete phase of the archiving request, a rebuild of the BWA index is offered in the dialog.
BWA consistency is reflected during DAP processing
Optimized Support for Navigational Attributes
Optimized Support for navigational attributes during Query processing on NLS
Navigational attributes are master data attributes that can be used to navigate/filter in
queries. Master data attributes are located outside the InfoCube persistence in the
extended star schema and thus are not a component of the NLS data stock.
Previous solution:
– Selections for navigational attributes were not transferred to NLS as selections …
– The attribute values were assigned subsequently and filtered in the result set
– Performance problems for highly selective attribute values
Improvement:
– Selections for navigational attributes are first converted to a selection on the characteristic bearing the attributes (max. 100 characteristic values)
– The attribute selection is replaced by this characteristic selection in the query selection.
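The improvement can be sketched as follows. This is a simplified Python illustration, not SAP's implementation: the master data is modelled as a plain dictionary, all names are hypothetical, and the behaviour when more than 100 characteristic values match (falling back to filtering the result set afterwards, as in the previous solution) is an assumption.

```python
MAX_CHARACTERISTIC_VALUES = 100  # limit stated on the slide


def convert_attribute_selection(master_data, attribute, wanted_values):
    """Turn a selection on a navigational attribute into a characteristic selection.

    master_data maps each characteristic value to a dict of its attributes.
    Returns the sorted characteristic values that can be pushed down to NLS,
    or None if the limit is exceeded (assumed fallback: filter the result set
    afterwards).
    """
    wanted = set(wanted_values)
    matches = sorted(
        char_value
        for char_value, attrs in master_data.items()
        if attrs.get(attribute) in wanted
    )
    if len(matches) > MAX_CHARACTERISTIC_VALUES:
        return None
    return matches


# Hypothetical master data: customers with a navigational attribute REGION.
customers = {
    "C1": {"REGION": "EMEA"},
    "C2": {"REGION": "APJ"},
    "C3": {"REGION": "EMEA"},
}
print(convert_attribute_selection(customers, "REGION", ["EMEA"]))
```

The returned characteristic values replace the attribute restriction in the selection sent to NLS, so the nearline system only has to return matching records.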
DSO Lookup for "nearlined" Partitions
SAP NetWeaver BW 7.30 will introduce a separate transformation rule type, a DSO lookup.
If an NLS solution is attached to the BW system, the lookup will automatically read from both the “online” and “nearlined” data partitions.
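Conceptually, such a lookup consults two partitions of the same DataStore Object. The Python sketch below models both partitions as dictionaries keyed by the DSO key; the function name, the data and the order of reading (online first, then nearline) are illustrative assumptions, not the actual transformation rule implementation.

```python
def dso_lookup(key, online_partition, nearline_partition=None):
    """Look up a record by key in the online data, then in the nearlined data.

    nearline_partition is None when no NLS solution is attached to the system.
    Returns None if the key is found in neither partition.
    """
    if key in online_partition:
        return online_partition[key]
    if nearline_partition is not None:
        return nearline_partition.get(key)
    return None


# Hypothetical DSO content: recent documents online, older ones nearlined.
online = {"DOC-2011-001": {"amount": 120}}
nearline = {"DOC-2007-042": {"amount": 75}}

print(dso_lookup("DOC-2011-001", online, nearline))
print(dso_lookup("DOC-2007-042", online, nearline))
```

Because archived time slices are locked and deleted from the online InfoProvider, a given key is expected in only one of the two partitions, so the order of reading should not change the result.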
Data Access within the APD
With SAP NetWeaver BW 7.30, the Analysis Process Designer will be enabled to read from Nearline-Storage also for the source type “Read data from InfoProvider”
Screenshot label: Option to allow reading from NLS for InfoProvider sources
Reload data from both Online and Nearline partitions for InfoCubes
Screenshot label: Option to extract data from both the Online and Nearline Partition in a single DTP
Transaction LISTCUBE
Read data from NLS combined
Archiving of Semantic Partitioned Objects
Facts:
• Semantic Partitioning is possible for InfoCubes (only standard InfoCubes) and DSOs (standard and write-optimized)
• There is not one DAP per PartProvider but only one DAP for the entire SPO. As a consequence, there is not one set of tables / files created in the NLS system per PartProvider but only one set of tables / files per SPO.
• The DAP itself has the same options / settings as for a regular InfoProvider. However, the DAP must contain the logical partitioning criterion as an additional archiving criterion so that data can be archived, reloaded, or restored for a dedicated Semantic Partition.
Screenshot label: Semantic Partitioning criterion
Archiving of Semantic Partitioned Objects
Since archiving is not carried out per PartProvider, there is no “Archive” tab within the administration user interface. Instead, an archiving request can be scheduled by means of a dedicated / global button.
Screenshot label: Maintain Archiving
An archiving request can be scheduled to archive data from all available partitions or only from a dedicated partition (which is equal to an archiving run being restricted to the semantic partition)
Screenshot label: Cross-partition archiving or only for a specific partition
Reading data from SPOs
Query
In SAP NetWeaver BW 7.30, data contained within a Nearline-Storage system can be read with a query that is directly flagged to read data from NLS (query properties to read NLS data no longer have to be maintained via transaction RSRT)
A query can be set to read or not to read data from an NLS. Furthermore, it is possible to specify the same on InfoProvider level, which can also be taken into consideration.
Summary and Outlook
Latest Enhancements
• Enhanced lookup support especially for temporal lookups (non-equal lookup conditions)
• Request-based archiving for InfoCubes (avoid compression before archiving) (BW 7.30)
• Combined DTP extraction from online and archive partition of an InfoCube (BW 7.30)
• Enhanced NLS support for Semantically Partitioned Objects (SPO) based on standard InfoCubes and standard DSOs (BW 7.30 SP 1). NLS support for SPOs based on write-optimized DSOs is available with SP3.
• NLS support for DSO lookup within transformations (DSO lookup feature to be released with SAP NetWeaver BW 7.30 with lookup for online data only)
• Master Data deletion to consider data within NLS
Medium term
• NLS support for BW 7.3 running on HANA In-Memory
• Physical deletion of NLS requests from the Nearline Storage (BW 7.30 SP5)
Long term
• Archiving of InfoCubes with non-cumulative key figures, as well as InfoSets and HybridProviders
• Archiving of master data and hierarchies
• Archiving with free selection criteria (not only time slice archiving)
Planned Roadmap HANA & SAP NetWeaver BW
Timeline markers: 2006, 2009, 2010, 2011, future direction
BW 7.0 / BWA 7.0
• Major release
• BW Accelerator
• New features and improvements across all components
BW 7.0 EhP1 (7.01)
• Go-to release for integration with SAP BusinessObjects BI
BW 7.3 / BWA 7.2
• Major step on Enterprise Data Warehousing scalability and flexibility
• BW Accelerator: additional performance
• Integration Improvements with SAP BusinessObjects Data Services
BW 7.3 SPnn
• BW running on HANA as the underlying In-Memory DB Platform
• In-Memory for Enterprise Data Warehousing
• Integrated Planning In-Memory enabled
HANA V1.0
• Real-time operational analytics on mass data
• Rapid creation of agile data marts
• Non disruptive deployments of HANA side by side ERP and/or BW
HANA V1.0 SPSnn
• Additional calculation capabilities
• Primary persistence layer under BW; eliminates need for separate database
• Models for SAP business content enabling new applications
Future direction: SAP NetWeaver BW evolving to a fully In-Memory enabled EDW solution on top of HANA
Data-Aging Strategies: Nearline Storage Only
Information Lifecycle according to Importance/Age (Data Category: Storage Type):
• Frequently read / changed data (actual): Online Database
• Infrequently read data (mature): Nearline Storage (read only)
• Very rarely read data (aged): Classic Archive (read only)
Diagram label: Archive
Current Situation
• Nearline Storage is the leading and only persistency
• No isolated Delete from Nearline Storage possible
• Workaround: Restore to Online Database and delete from there
Data-Aging Strategies: Classic Archive + Nearline Storage
Information Lifecycle according to Importance/Age (Data Category: Storage Type):
• Frequently read / changed data (actual): Online Database
• Infrequently read data (mature): Nearline Storage (read only)
• Very rarely read data (aged): Classic Archive (read only)
Diagram label: Archive (ADK … + NLS)
Current Situation
• ADK (Classic) Archive is the leading persistency
• Nearline Storage is filled from ADK Archive during Verification Phase
• Nearline Storage is strictly coupled to ADK Archive (no independent Delete)
Details for the planned NLS Deletion Features (for SAP BW 7.3, SP05)
1) Data resides in NLS only (without ADK)
• First step: "logical" Deletion of NLS Data (set NLS Request to "Invalid")
• NLS Status in NLS Archiving-Request-List will be set to "Marked for Deletion" / "Deleted"
• NLS Data will be deleted asynchronously using a Clean-Up Job or (later) a Process Chain
• Time slices will remain locked
2) Data resides in NLS and ADK
• Request can only be deleted from NLS, Data in ADK stays untouched
• ADK delete is not supported from NLS Dialog (see SAP Data Life Cycle / Retention concepts in ERP)
• Later Restore from ADK to NLS supported
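The two-step deletion described above (logical deletion first, asynchronous physical clean-up later) can be pictured as a small state model. The Python sketch below is illustrative only: the class, field and function names are hypothetical, the initial status "Active" is an assumption, and only the status texts "Marked for Deletion" and "Deleted" come from the slide.

```python
from dataclasses import dataclass


@dataclass
class NearlineRequest:
    request_id: int
    status: str = "Active"          # hypothetical initial status name
    data_in_nls: bool = True
    data_in_adk: bool = False       # True if an ADK archive file also exists
    time_slice_locked: bool = True  # archived time slices are locked


def mark_for_deletion(request):
    """Step 1: logical deletion; the data itself is not touched yet."""
    request.status = "Marked for Deletion"


def cleanup_job(requests):
    """Step 2: asynchronous physical deletion from NLS.

    Data in ADK stays untouched and the time slice remains locked.
    """
    for request in requests:
        if request.status == "Marked for Deletion":
            request.data_in_nls = False
            request.status = "Deleted"


req = NearlineRequest(request_id=1, data_in_adk=True)
mark_for_deletion(req)
cleanup_job([req])
print(req)
```

Because the ADK copy survives in the second scenario, a later restore from ADK to NLS remains possible, as stated on the slide.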
Data resides in NLS (only)
(Final) Deletion of Nearline Request
Data resides in NLS only
Three Alternatives lead to Nearline Request Status "Deleted"
• Finally Deleted from NLS (after successful archiving)
• Restored (Deleted from NLS but stored in Online-DB again)
• Invalidated (never deleted from Online-DB)
Data resides in ADK and NLS
Restore deleted Nearline Request from ADK
Data resides in ADK and NLS
New Nearline Request after Restore from ADK