Information Builders, May 11, 2012
Information Builders (Canada) Inc.
WebFOCUS Hyperstage: Analyze/Report from Large Volumes of Data
Reporting
Query & Analysis
Dashboards
Information Delivery
Performance Management
Enterprise Search
Visualization & Mapping
Data Updating
Predictive Analytics
MS Office & e-Publishing
Extended BI
Core BI
Extensions to the WebFOCUS platform
allow you to build more application
types at a lower cost
Business to Business
Data Warehouse & ETL
Master Data Management
Data Profiling & Data Quality
Business Activity Monitoring
High Performance Data Store
Mobile Applications
WebFOCUS: Higher Adoption & Reuse with Lower TCO
WebFOCUS High Performance Data Store
The Business Challenge
Big Data
Copyright 2007, Information Builders. Slide 4
Data Storage
Time
Machine-Generated Data
Human-Generated Data
Today’s Top Data-Management Challenge: Big Data and Machine-Generated Data
Source: Keeping Up with Ever-Expanding Enterprise Data (Joseph McKendrick, Unisphere Research, October 2010)
How Performance Issues are Typically Addressed – by Pace of Data Growth
IT managers try to mitigate these response times in the following ways:
Chart data (legend: High Growth, Low Growth; scale 0% to 100%), two series per response:
Don't Know / Unsure: 7%, 4%
Upgrade networking infrastructure: 21%, 32%
Archive older data on other systems: 30%, 44%
Upgrade/expand storage systems: 33%, 60%
Upgrade server hardware/processors: 54%, 70%
Tune or upgrade existing databases: 66%, 75%
When long-running queries limit the business, organizations often respond by spending much more time and money to resolve the problem.
Classic Approaches and Challenges: Data Warehousing
Traditional Data Warehousing
Labour intensive: heavy indexing, aggregations and partitioning
Hardware intensive: massive storage; big servers
Expensive and complex
More Data, More Data Sources
More Kinds of Output Needed by More Users, More Quickly
Limited Resources and Budget
Real time data
Multiple databases
External Sources
Classic Approaches and Challenges: Data Warehousing – Growing Demands
Early Data Warehouse Characteristics:
Integration of internal systems
Monthly and weekly loads
Heavy use of aggregates
Data Warehousing Matures:
Near real time updates
Integration with master data management
Data mining using discrete business transactions
Provision of data for business critical applications
New Demands:
Larger transaction volumes driven by the internet
Impact of cloud computing
More -> Faster -> Cheaper
Classic Approaches and Challenges: Dealing with Large Data
INDEXES
CUBES/OLAP
Classic Approaches and Challenges: Limitations of Indexes
Increased space requirements
Sum of index space requirements can exceed the source DB
Index management
Increases load times
Building the index
Predefines a fixed access path
Classic Approaches and Challenges: Limitations of OLAP
Cube technology has limited scalability
Number of dimensions is limited
Amount of data is limited
Cube technology is difficult to update (add dimension)
Usually requires a complete rebuild
Cube builds are typically slow
New design results in a new cube
Limitations of Rows: These Solutions Contribute to Operational Limitations
1. Impediments to business agility:
Organizations must wait for DBAs to create indexes or other tuning structures, thereby delaying access to data.
Indexes significantly slow data-loading operations and increase the size of the database, sometimes by a factor of 2x.
2. Loss of data and time fidelity:
ETL operations are typically performed in batch during non-business hours.
This delays access to data and often results in mismatches between operational and analytic databases.
3. Limited ad hoc capability:
Response times for ad hoc queries increase as the volume of data grows.
Unanticipated queries (where DBAs have not tuned the database in advance) can result in unacceptable response times.
4. Unnecessary expenditures:
Attempts to improve performance using hardware acceleration and database tuning schemes raise the capital costs of equipment and the operational costs of database administration.
The added complexity of managing a large database diverts operational budgets away from more urgent IT projects.
Pivoting Your Perspective: Columnar Technology
Row-based databases are ubiquitous because so many of our most important business systems are transactional.
Row-oriented databases are well suited for transactional environments, such as a call center where a customer’s entire record is required when their profile is retrieved and/or when fields are frequently updated.
The Ubiquity of Rows
But disk I/O becomes a substantial limiting factor, since a row-oriented design forces the database to retrieve all column data for any query.
30 columns
50 million rows
The Limitation of Rows
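A rough back-of-the-envelope sketch of the I/O point, using the 30-column, 50-million-row table from the slide. The 8-byte average field width and the 3-column query are illustrative assumptions, not figures from the presentation, and the estimate ignores compression and indexes.

```python
# Rough estimate of bytes scanned by a query on a 30-column, 50-million-row table.
# ASSUMPTIONS (not from the slides): every field averages 8 bytes, and the
# query needs only 3 of the 30 columns.
ROWS = 50_000_000
COLUMNS = 30
BYTES_PER_FIELD = 8      # assumed average field width
COLUMNS_NEEDED = 3       # assumed query shape


def bytes_scanned(rows: int, columns_read: int, bytes_per_field: int) -> int:
    """Bytes read when `columns_read` fields are fetched for every row."""
    return rows * columns_read * bytes_per_field


# A row store reads whole rows, so all 30 columns come off disk.
row_store = bytes_scanned(ROWS, COLUMNS, BYTES_PER_FIELD)
# A column store can read just the columns the query references.
column_store = bytes_scanned(ROWS, COLUMNS_NEEDED, BYTES_PER_FIELD)

print(f"row store:    {row_store / 1e9:.1f} GB")
print(f"column store: {column_store / 1e9:.1f} GB")
print(f"reduction:    {row_store // column_store}x")
```

Under these assumptions the row layout scans ten times as many bytes as the query actually needs; the ratio simply tracks the fraction of columns referenced.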
Row Oriented (1, Smith, New York, 50000; 2, Jones, New York, 65000; 3, Fraser, Boston, 40000; 4, Fraser, Boston, 70000)
Works well if all the columns are needed for every query
Efficient for transactional processing if all the data for the row is available
Column Oriented (1, 2, 3, 4; Smith, Jones, Fraser, Fraser; New York, New York, Boston, Boston; 50000, 65000, 40000, 70000)
Works well with aggregate results (sum, count, avg.)
Only columns that are relevant need to be touched
Consistent performance with any database design
Allows for very efficient compression
Employee Id   Name     Location   Sales
1             Smith    New York   50,000
2             Jones    New York   65,000
3             Fraser   Boston     40,000
4             Fraser   Boston     70,000
Pivoting Your Perspective: Columnar Technology
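The row-oriented and column-oriented layouts of the four employee records can be sketched in a few lines of Python. This is only an illustration of the general idea: the names `rows`, `columns`, `to_columns`, and `run_length_encode` are hypothetical, and run-length encoding stands in for "efficient compression" as one simple technique, not as the algorithm any particular product uses.

```python
# The four employee records from the slide, stored two ways.
# All names here are illustrative only.
from itertools import groupby
from statistics import mean

FIELDS = ("employee_id", "name", "location", "sales")

# Row-oriented: each record is kept together.
rows = [
    (1, "Smith", "New York", 50000),
    (2, "Jones", "New York", 65000),
    (3, "Fraser", "Boston", 40000),
    (4, "Fraser", "Boston", 70000),
]


def to_columns(records, fields):
    """Pivot a list of row tuples into one list per column."""
    return {field: list(values) for field, values in zip(fields, zip(*records))}


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Column-oriented: each column's values are kept together.
columns = to_columns(rows, FIELDS)

# Aggregate over the row layout: every whole record is visited
# just to pick out one field.
total_from_rows = sum(record[3] for record in rows)

# Aggregate over the column layout: only the "sales" column is touched.
total_from_columns = sum(columns["sales"])
average_sales = mean(columns["sales"])

# Similar values sit next to each other in a column, which helps compression.
encoded_locations = run_length_encode(columns["location"])

print(columns["location"])   # ['New York', 'New York', 'Boston', 'Boston']
print(encoded_locations)     # [('New York', 2), ('Boston', 2)]
print(total_from_rows, total_from_columns, average_sales)
```

Both layouts give the same total; the difference is how much data has to be visited to compute it.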
WebFOCUS Hyperstage
Introducing WebFOCUS Hyperstage
Mission
Improve database performance for WebFOCUS applications with less hardware, no database tuning, and easy migration
What is WebFOCUS Hyperstage
High performance analytic data store
Designed to handle business-driven queries on large volumes of data without IT intervention
Easy to implement and manage, Hyperstage provides the answers your business users need at a price you can afford
Advantages
Dramatically increase performance of WebFOCUS applications
Disk footprint reduced with powerful compression algorithm = faster response time
Embedded ETL for seamless migration of existing analytical databases
No change in query or application required
Includes optimized Hyperstage Adapter
WebFOCUS metadata can be used to define hierarchies and drill paths to navigate the star schema
Hyperstage Engine
Knowledge Grid
Compressor
Bulk Loader
• Unmatched Administrative Simplicity
• No indexes
• No data partitioning
• No manual tuning
Introducing WebFOCUS Hyperstage: How it is architected
Combines a columnar database with intelligence we call the Knowledge Grid
to deliver fast query responses.
Improve database performance for WebFOCUS applications with less
hardware, no database tuning, and easy migration
Introducing WebFOCUS Hyperstage: What it means for Customers
Self-managing: 90% less administrative effort
Low-cost: More than 50% less than alternative solutions
Scalable, high-performance: Up to 50 TB using a single industry standard server
Fast queries: Ad hoc queries are as fast as anticipated queries, so users have total flexibility
Compression: Data compression of 10:1 to 40:1 means far less storage is needed; it might even mean the entire database fits in memory.
Create information (metadata) about the data and, upon load, automatically:
o Stores it in the Knowledge Grid (KG)
o KG is loaded into memory
o Less than 1% of compressed data size
Uses the metadata when processing a query to eliminate or reduce the need to access data:
o The less data that needs to be accessed, the faster the response
o Sub-second responses when answered by KG
Architecture Benefits:
o No need to partition data, create/maintain indexes or projections, or tune for performance
o Ad hoc queries are as fast as static queries, so users have total flexibility
Introducing WebFOCUS Hyperstage: How it works
Smarter Architecture
No maintenance
No query planning
No partition schemes
No DBA
Data Packs – data stored in manageably sized, highly compressed data packs
Knowledge Grid – statistics and metadata “describing” the super-compressed data
Column Orientation
Data compressed using algorithms tailored to data type
WebFOCUS Hyperstage Engine: How it works
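The idea of answering a query from metadata about data packs, and opening a pack only when necessary, can be sketched as follows. This is a minimal illustration under stated assumptions: the slides do not say which statistics the Knowledge Grid holds, so the per-pack minimum, maximum, sum, and count used here are assumed, the pack size is tiny for readability, packs are left uncompressed, and every name (`PackStats`, `build_packs`, `sum_greater_than`) is hypothetical.

```python
# Minimal sketch of pack-level pruning. ASSUMPTIONS (not from the slides):
# the pack size, the min/max/sum/count statistics, and every name below are
# hypothetical; a real engine keeps richer metadata and compresses each pack.
from dataclasses import dataclass
from typing import List, Tuple

PACK_SIZE = 4  # assumed tiny pack size so the example is easy to follow


@dataclass
class PackStats:
    """Metadata kept for one data pack (a stand-in for a Knowledge Grid entry)."""
    minimum: int
    maximum: int
    total: int
    count: int


def build_packs(column: List[int], pack_size: int) -> Tuple[List[List[int]], List[PackStats]]:
    """Split one column into packs and record statistics for each pack at load time."""
    packs = [column[i:i + pack_size] for i in range(0, len(column), pack_size)]
    stats = [PackStats(min(p), max(p), sum(p), len(p)) for p in packs]
    return packs, stats


def sum_greater_than(packs: List[List[int]], stats: List[PackStats], threshold: int) -> Tuple[int, int]:
    """Compute SUM(value) WHERE value > threshold, opening as few packs as possible.

    Returns (result, number_of_packs_opened).
    """
    result = 0
    opened = 0
    for pack, meta in zip(packs, stats):
        if meta.maximum <= threshold:
            continue                  # irrelevant pack: no value can match
        if meta.minimum > threshold:
            result += meta.total      # fully relevant: answered from metadata alone
            continue
        opened += 1                   # mixed pack: the data itself must be read
        result += sum(v for v in pack if v > threshold)
    return result, opened


sales = [10, 12, 15, 11, 40, 55, 42, 60, 90, 95, 99, 91]
packs, stats = build_packs(sales, PACK_SIZE)
print(sum_greater_than(packs, stats, 50))   # (490, 1)
```

Here only one of the three packs is read: the first is ruled out by its maximum and the third is answered from its stored total. The less data that must be opened, the faster the response, which is the behaviour the slides attribute to the Knowledge Grid.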
Summary
Business Intelligence – Meeting Requirements
No indexes
No partitions
No views
No materialized aggregates
Value proposition:
Low IT overhead
Allows for autonomy from IT
Ease of implementation
Fast time to market
Less hardware
Lower TCO
No DBA Required!
WebFOCUS Hyperstage: The Big Deal
WebFOCUS Hyperstage Adapter: What it looks like
Example – Focus to Hyperstage Compression (243,639 rows)
Q&A