ffi rs.indd 12/01/15 Page i
The Kimball Group Reader
Remastered Collection
Ralph Kimball and Margy Ross with Bob Becker, Joy Mundy, and Warren Thornthwaite
Relentlessly Practical Tools for Data Warehousing and Business Intelligence
The Kimball Group Reader
The Kimball Group Reader: Relentlessly Practical Tools for Data Warehousing and Business Intelligence, Second Edition
Published by
John Wiley & Sons, Inc.
10475 Crosspoint Boulevard
Indianapolis, IN 46256
www.wiley.com
Copyright © 2016 by Ralph Kimball and Margy Ross
Published simultaneously in Canada
ISBN: 978-1-119-21631-5
ISBN: 978-1-119-23879-9 (ebk)
ISBN: 978-1-119-21659-9 (ebk)
Manufactured in the United States of America
10 9 8 7 6 5 4 3 2 1
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.
Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Web site is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Web site may provide or recommendations it may make. Further, readers should be aware that Internet Web sites listed in this work may have changed or disappeared between when this work was written and when it is read.
For general information on our other products and services please contact our Customer Care Department within the United States at (877) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.
Library of Congress Control Number: 2015956232
Trademarks: Wiley, and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc., is not associated with any product or vendor mentioned in this book.
About the Authors
Ralph Kimball founded the Kimball Group. Since the mid-1980s, he has been the DW/BI industry’s
thought leader on the dimensional approach and has trained more than 20,000 IT professionals.
Prior to working at Metaphor and founding Red Brick Systems, Ralph co-invented the Star workstation
at Xerox’s Palo Alto Research Center (PARC). Ralph has a Ph.D. in Electrical Engineering from
Stanford University.
Margy Ross is President of the Kimball Group and DecisionWorks Consulting. She has focused
exclusively on data warehousing and business intelligence since 1982. Margy has consulted with
hundreds of clients and taught DW/BI best practices to tens of thousands. Prior to working at
Metaphor and co-founding DecisionWorks Consulting, she graduated with a B.S. in Industrial
Engineering from Northwestern University.
Credits
Project Editor
Tom Dinse
Production Editor
Christine O’Connor
Copy Editor
Kim Cofer
Manager of Content Development & Assembly
Mary Beth Wakefield
Marketing Director
David Mayhew
Marketing Manager
Carrie Sherrill
Professional Technology & Strategy Director
Barry Pruett
Business Manager
Amy Knies
Associate Publisher
Jim Minatel
Project Coordinator, Cover
Brent Savage
Proofreader
Nancy Carrasco
Indexer
Johnna VanHoose Dinse
Cover Designer
Wiley
Cover Image
The Kimball Group
Warren Thornthwaite (1957–2014)
The Kimball Group lost Warren to a brain tumor in 2014. He wrote many insightful articles that
appear in this Reader. All of us at the Kimball Group miss Warren dearly—his intellect, curiosity,
creativity, and most especially, his friendship and sense of humor. As any of you who met Warren will
attest, he was truly one of a kind!
Acknowledgments
First, we want to thank the 33,000 subscribers to the Kimball Design Tips, and the uncounted
numbers who have visited the Kimball Group website to peruse our archive. This book brings the
remastered Design Tips and articles together in what we hope is a very usable form.
The Kimball Group Reader would not exist without the assistance of our business partners. Kimball
Group members Bob Becker, Joy Mundy, and Warren Thornthwaite wrote many of the valuable articles
and Design Tips included in the book. Thanks to Julie Kimball for her insightful comments. Thanks
also to former Kimball Group member Bill Schmarzo for his contributions on analytic applications.
Thanks to our clients and students who have embraced, practiced, and validated the Kimball
methods with us. We have learned as much from you as you have from us!
Jim Minatel, our executive editor at Wiley Publishing, project editor Tom Dinse, and the rest of
the Wiley team have supported this project with skill, encouragement, and enthusiasm. It has been
a pleasure to work with them.
To our families, thank you for your support over the twenty-year span during which we wrote
these Design Tips and articles. Julie Kimball and Scott Ross: We couldn’t have done it without you!
And, of course, thanks to our children, Sara Kimball Smith, Brian Kimball, and Katie Ross, who have
grown into adults over the same time!
Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxv
1 The Reader at a Glance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Setting Up for Success . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Resist the Urge to Start Coding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Set Your Boundaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Tackling DW/BI Design and Development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 Data Wrangling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Myth Busters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.5 Dividing the World . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.6 Essential Steps for the Integrated Enterprise Data Warehouse . . . . . . . . . . . . . . . . . . . 13
1.7 Drill Down to Ask Why . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.8 Slowly Changing Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.9 Judge Your BI Tool through Your Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.10 Fact Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
1.11 Exploit Your Fact Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2 Before You Dive In . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Before Data Warehousing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.1 History Lesson on Ralph Kimball and Xerox PARC . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Historical Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.2 The Database Market Splits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.3 Bringing Up Supermarts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Dealing with Demanding Realities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
2.4 Brave New Requirements for Data Warehousing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
2.5 Coping with the Brave New Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
2.6 Stirring Things Up . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
2.7 Design Constraints and Unavoidable Realities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
2.8 Two Powerful Ideas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
2.9 Data Warehouse Dining Experience . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
2.10 Easier Approaches for Harder Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
2.11 Expanding Boundaries of the Data Warehouse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3 Project/Program Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Professional Responsibilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.1 Professional Boundaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.2 An Engineer’s View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
3.3 Beware the Objection Removers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
3.4 What Does the Central Team Do? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
3.5 Avoid Isolating DW and BI Teams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
3.6 Better Business Skills for BI and Data Warehouse Professionals . . . . . . . . . . . . . . . . . . 91
3.7 Risky Project Resources Are Risky Business . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
3.8 Implementation Analysis Paralysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
3.9 Contain DW/BI Scope Creep and Avoid Scope Theft . . . . . . . . . . . . . . . . . . . . . . . . . 96
3.10 Are IT Procedures Beneficial to DW/BI Projects? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
Justification and Sponsorship . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
3.11 Habits of Effective Sponsors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
3.12 TCO Starts with the End User . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
Kimball Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
3.13 Kimball Lifecycle in a Nutshell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
3.14 Off the Bench . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
3.15 The Anti-Architect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
3.16 Think Critically When Applying Best Practices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
3.17 Eight Guidelines for Low Risk Enterprise Data Warehousing . . . . . . . . . . . . . . . . . . . 118
4 Requirements Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
Gathering Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
4.1 Alan Alda’s Interviewing Tips for Uncovering Business Requirements . . . . . . . . . . . . 123
4.2 More Business Requirements Gathering Dos and Don’ts . . . . . . . . . . . . . . . . . . . . . 127
4.3 Balancing Requirements and Realities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
4.4 Overcoming Obstacles When Gathering Business Requirements . . . . . . . . . . . . . . . 130
4.5 Surprising Value of Data Profiling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
Organizing around Business Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
4.6 Focus on Business Processes, Not Business Departments! . . . . . . . . . . . . . . . . . . . . . 134
4.7 Identifying Business Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
4.8 Business Process Decoder Ring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
4.9 Relationship between Strategic Business Initiatives and Business Processes . . . . . . . . 138
Wrapping Up the Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
4.10 The Bottom-Up Misnomer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
4.11 Think Dimensionally (Beyond Data Modeling) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
4.12 Using the Dimensional Model to Validate Business Requirements . . . . . . . . . . . . . . 145
5 Data Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
Making the Case for Dimensional Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.1 Is ER Modeling Hazardous to DSS? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.2 A Dimensional Modeling Manifesto . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
5.3 There Are No Guarantees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
Enterprise Data Warehouse Bus Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
5.4 Divide and Conquer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
5.5 The Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
5.6 The Matrix: Revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
5.7 Drill Down into a Detailed Bus Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
Agile Project Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
5.8 Relating to Agile Methodologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
5.9 Is Agile Enterprise Data Warehousing an Oxymoron? . . . . . . . . . . . . . . . . . . . . . . . . 177
5.10 Going Agile? Start with the Bus Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
5.11 Conformed Dimensions as the Foundation for Agile Data Warehousing . . . . . . . . . 180
Integration Instead of Centralization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
5.12 Integration for Real People . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
5.13 Build a Ready-to-Go Resource for Enterprise Dimensions . . . . . . . . . . . . . . . . . . . . 185
5.14 Data Stewardship 101: The First Step to Quality and Consistency . . . . . . . . . . . . . . 186
5.15 To Be or Not To Be Centralized . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
Contrast with the Corporate Information Factory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
5.16 Differences of Opinion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
5.17 Much Ado about Nothing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
5.18 Don’t Support Business Intelligence with a Normalized EDW . . . . . . . . . . . . . . . . . 199
5.19 Complementing 3NF EDWs with Dimensional Presentation Areas . . . . . . . . . . . . . 201
6 Dimensional Modeling Fundamentals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
Basics of Dimensional Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
6.1 Fact Tables and Dimension Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
6.2 Drilling Down, Up, and Across . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
6.3 The Soul of the Data Warehouse, Part One: Drilling Down . . . . . . . . . . . . . . . . . . . . 210
6.4 The Soul of the Data Warehouse, Part Two: Drilling Across . . . . . . . . . . . . . . . . . . . 213
6.5 The Soul of the Data Warehouse, Part Three: Handling Time . . . . . . . . . . . . . . . . . . 216
6.6 Graceful Modifications to Existing Fact and Dimension Tables . . . . . . . . . . . . . . . . . 219
Dos and Don’ts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
6.7 Kimball’s Ten Essential Rules of Dimensional Modeling . . . . . . . . . . . . . . . . . . . . . . 221
6.8 What Not to Do . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
Myths about Dimensional Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
6.9 Dangerous Preconceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
6.10 Fables and Facts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
7 Dimensional Modeling Tasks and Responsibilities . . . . . . . . . . . . . . . . . . . . . . . . 233
Design Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
7.1 Letting the Users Sleep . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
7.2 Practical Steps for Designing a Dimensional Model . . . . . . . . . . . . . . . . . . . . . . . . . 240
7.3 Staffing the Dimensional Modeling Team . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
7.4 Involve Business Representatives in Dimensional Modeling . . . . . . . . . . . . . . . . . . . . 244
7.5 Managing Large Dimensional Design Teams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
7.6 Use a Design Charter to Keep Dimensional Modeling Activities on Track . . . . . . . . . 248
7.7 The Naming Game . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
7.8 What’s in a Name? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
7.9 When Is the Dimensional Design Done? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
Design Review Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
7.10 Design Review Dos and Don’ts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
7.11 Fistful of Flaws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
7.12 Rating Your Dimensional Data Warehouse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
8 Fact Table Core Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Granularity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
8.1 Declaring the Grain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
8.2 Keep to the Grain in Dimensional Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
8.3 Warning: Summary Data May Be Hazardous to Your Health . . . . . . . . . . . . . . . . . . . 272
8.4 No Detail Too Small . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
Types of Fact Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
8.5 Fundamental Grains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
8.6 Modeling a Pipeline with an Accumulating Snapshot . . . . . . . . . . . . . . . . . . . . . . . . 280
8.7 Combining Periodic and Accumulating Snapshots . . . . . . . . . . . . . . . . . . . . . . . . . . 282
8.8 Complementary Fact Table Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
8.9 Modeling Time Spans . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
8.10 A Rolling Prediction of the Future, Now and in the Past . . . . . . . . . . . . . . . . . . . . . 289
8.11 Timespan Accumulating Snapshot Fact Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
8.12 Is it a Dimension, a Fact, or Both? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
8.13 Factless Fact Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
8.14 Factless Fact Tables? Sounds Like Jumbo Shrimp? . . . . . . . . . . . . . . . . . . . . . . . . . . 298
8.15 What Didn’t Happen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
8.16 Factless Fact Tables for Simplification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
Parent-Child Fact Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
8.17 Managing Your Parents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
8.18 Patterns to Avoid When Modeling Header/Line Item Transactions . . . . . . . . . . . . . 307
Fact Table Keys and Degenerate Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
8.19 Fact Table Surrogate Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
8.20 Reader Suggestions on Fact Table Surrogate Keys . . . . . . . . . . . . . . . . . . . . . . . . . 310
8.21 Another Look at Degenerate Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
8.22 Creating a Reference Dimension for Infrequently Accessed Degenerates . . . . . . . . 313
Miscellaneous Fact Table Design Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
8.23 Put Your Fact Tables on a Diet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
8.24 Keeping Text Out of the Fact Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
8.25 Dealing with Nulls in a Dimensional Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
8.26 Modeling Data as Both a Fact and Dimension Attribute . . . . . . . . . . . . . . . . . . . . . 318
8.27 When a Fact Table Can Be Used as a Dimension Table . . . . . . . . . . . . . . . . . . . . . . 319
8.28 Sparse Facts and Facts with Short Lifetimes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
8.29 Pivoting the Fact Table with a Fact Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
8.30 Accumulating Snapshots for Complex Workflows . . . . . . . . . . . . . . . . . . . . . . . . . 324
9 Dimension Table Core Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
Dimension Table Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
9.1 Surrogate Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
9.2 Keep Your Keys Simple . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
9.3 Durable “Super-Natural” Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
Date and Time Dimension Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
9.4 It’s Time for Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
9.5 Surrogate Keys for the Time Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
9.6 Latest Thinking on Time Dimension Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
9.7 Smart Date Keys to Partition Fact Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
9.8 Updating the Date Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
9.9 Handling All the Dates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
Miscellaneous Dimension Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
9.10 Selecting Default Values for Nulls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
9.11 Data Warehouse Role Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
9.12 Mystery Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
9.13 De-Clutter with Junk Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
9.14 Showing the Correlation between Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
9.15 Causal (Not Casual) Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
9.16 Resist Abstract Generic Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
9.17 Hot-Swappable Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
9.18 Accurate Counting with a Dimensional Supplement . . . . . . . . . . . . . . . . . . . . . . . . 361
Slowly Changing Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
9.19 Perfectly Partitioning History with Type 2 SCD . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
9.20 Many Alternate Realities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
9.21 Monster Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
9.22 When a Slowly Changing Dimension Speeds Up . . . . . . . . . . . . . . . . . . . . . . . . . . 370
9.23 When Do Dimensions Become Dangerous? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
9.24 Slowly Changing Dimensions Are Not Always as Easy as 1, 2, and 3 . . . . . . . . . . . . 373
9.25 Slowly Changing Dimension Types 0, 4, 5, 6 and 7 . . . . . . . . . . . . . . . . . . . . . . . . 378
9.26 Dimension Row Change Reason Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
10 More Dimension Patterns and Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . 385
Snowflakes, Outriggers, and Bridges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
10.1 Snowflakes, Outriggers, and Bridges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
10.2 A Trio of Interesting Snowflakes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 388
10.3 Help for Dimensional Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392
10.4 Managing Bridge Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395
10.5 The Keyword Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
10.6 Potential Bridge (Table) Detours . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
10.7 Alternatives for Multi-Valued Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
10.8 Adding a Mini-Dimension to a Bridge Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407
Dealing with Hierarchies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
10.9 Maintaining Dimension Hierarchies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
10.10 Help for Hierarchies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
10.11 Five Alternatives for Better Employee Dimensional Modeling . . . . . . . . . . . . . . . . 417
10.12 Avoiding Alternate Organization Hierarchies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425
10.13 Alternate Hierarchies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 426
Customer Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
10.14 Dimension Embellishments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
10.15 Wrangling Behavior Tags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429
10.16 Three Ways to Capture Customer Satisfaction . . . . . . . . . . . . . . . . . . . . . . . . . . . 431
10.17 Extreme Status Tracking for Real-Time Customer Analysis . . . . . . . . . . . . . . . . . . . 435
Addresses and International Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
10.18 Think Globally, Act Locally . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
10.19 Warehousing without Borders. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443
10.20 Spatially Enabling Your Data Warehouse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 448
10.21 Multinational Dimensional Data Warehouse Considerations . . . . . . . . . . . . . . . . . 452
Industry Scenarios and Idiosyncrasies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453
10.22 Industry Standard Data Models Fall Short . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453
10.23 An Insurance Data Warehouse Case Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
10.24 Traveling through Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 460
10.25 Human Resources Dimensional Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
10.26 Managing Backlogs Dimensionally . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
10.27 Not So Fast . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 468
10.28 The Budgeting Chain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471
10.29 Compliance-Enabled Data Warehouses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 475
10.30 Clicking with Your Customer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 477
10.31 The Special Dimensions of the Clickstream . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 482
10.32 Fact Tables for Text Document Searching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 485
10.33 Enabling Market Basket Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 489
11 Back Room ETL and Data Quality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495
Planning the ETL System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495
11.1 Surrounding the ETL Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495
11.2 The 34 Subsystems of ETL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500
11.3 Six Key Decisions for ETL Architectures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504
11.4 Three ETL Compromises to Avoid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 508
11.5 Doing the Work at Extract Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 510
11.6 Is Data Staging Relational? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513
11.7 Staging Areas and ETL Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517
11.8 Should You Use an ETL Tool? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 518
11.9 Call to Action for ETL Tool Providers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 521
11.10 Document the ETL System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 522
11.11 Measure Twice, Cut Once . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 523
11.12 Brace for Incoming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527
11.13 Building a Change Data Capture System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 530
11.14 Disruptive ETL Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531
11.15 New Directions for ETL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
Data Quality Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535
11.16 Dealing With Data Quality: Don’t Just Sit There, Do Something! . . . . . . . . . . . . . . 535
11.17 Data Warehouse Testing Recommendations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 537
11.18 Dealing with Dirty Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 539
11.19 An Architecture for Data Quality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545
11.20 Indicators of Quality: The Audit Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 553
11.21 Adding an Audit Dimension to Track Lineage and Confidence . . . . . . . . . . . . . . . 556
11.22 Add Uncertainty to Your Fact Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 559
11.23 Have You Built Your Audit Dimension Yet? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 560
11.24 Is Your Data Correct? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 562
11.25 Eight Recommendations for International Data Quality . . . . . . . . . . . . . . . . . . . . 565
11.26 Using Regular Expressions for Data Cleaning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 568
Populating Fact and Dimension Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572
11.27 Pipelining Your Surrogates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572
11.28 Unclogging the Fact Table Surrogate Key Pipeline . . . . . . . . . . . . . . . . . . . . . . . . 576
11.29 Replicating Dimensions Correctly . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 579
11.30 Identify Dimension Changes Using Cyclic Redundancy Checksums . . . . . . . . . . . 580
11.31 Maintaining Back Pointers to Operational Sources . . . . . . . . . . . . . . . . . . . . . . . . 581
11.32 Creating Historical Dimension Rows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 582
11.33 Facing the Re-Keying Crisis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 585
11.34 Backward in Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587
11.35 Early-Arriving Facts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 590
11.36 Slowly Changing Entities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 591
11.37 Using the SQL MERGE Statement for Slowly Changing Dimensions . . . . . . . . . . . 593
11.38 Creating and Managing Shrunken Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . 595
11.39 Creating and Managing Mini-Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 597
11.40 Creating, Using, and Maintaining Junk Dimensions . . . . . . . . . . . . . . . . . . . . . . . 599
11.41 Building Bridges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 601
11.42 Being Offline as Little as Possible . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 605
Supporting Real Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 606
11.43 Working in Web Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 606
11.44 Real-Time Partitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 610
11.45 The Real-Time Triage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613
12 Technical Architecture Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617
Overall Technical/System Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617
12.1 Can the Data Warehouse Benefit from SOA? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617
12.2 Picking the Right Approach to MDM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 619
12.3 Building Custom Tools for the DW/BI System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
12.4 Welcoming the Packaged App . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 626
12.5 ERP Vendors: Bring Down Those Walls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 629
12.6 Building a Foundation for Smart Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 632
12.7 RFID Tags and Smart Dust . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 637
12.8 Is Big Data Compatible with the Data Warehouse? . . . . . . . . . . . . . . . . . . . . . . . . . 640
12.9 The Evolving Role of the Enterprise Data Warehouse in the Era of Big Data Analytics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 641
12.10 Newly Emerging Best Practices for Big Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 659
12.11 The Hyper-Granular Active Archive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 670
Presentation Server Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 672
12.12 Columnar Databases: Game Changers for DW/BI Deployment . . . . . . . . . . . . . . . 672
12.13 There Is no Database Magic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673
12.14 Relating to OLAP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676
12.15 Dimensional Relational versus OLAP: The Final Deployment Conundrum . . . . . . . 679
12.16 Microsoft SQL Server Comes of Age for Data Warehousing . . . . . . . . . . . . . . . . . . 682
12.17 The Aggregate Navigator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 686
12.18 Aggregate Navigation with (Almost) No Metadata . . . . . . . . . . . . . . . . . . . . . . . . 690
Front Room Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 697
12.19 The Second Revolution of User Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 697
12.20 Designing the User Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 700
Metadata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 704
12.21 Meta Meta Data Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 704
12.22 Creating the Metadata Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 708
12.23 Leverage Process Metadata for Self-Monitoring DW Operations . . . . . . . . . . . . . . 709
Infrastructure and Security Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 712
12.24 Watching the Watchers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 712
12.25 Catastrophic Failure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 716
12.26 Digital Preservation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 719
12.27 Creating the Advantages of a 64-Bit Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 722
12.28 Server Configuration Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 723
12.29 Adjust Your Thinking for SANs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 726
13 Front Room Business Intelligence Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 729
Delivering Value with Business Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 729
13.1 The Promise of Decision Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 730
13.2 Beyond Paving the Cow Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 733
13.3 BI Components for Business Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 736
13.4 Big Shifts Happening in BI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 738
13.5 Behavior: The Next Marquee Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 740
Implementing the Business Intelligence Layer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 743
13.6 Three Critical Components for Successful Self-Service BI . . . . . . . . . . . . . . . . . . . . 743
13.7 Leverage Data Visualization Tools, But Avoid Anarchy . . . . . . . . . . . . . . . . . . . . . . . 745
13.8 Think Like a Software Development Manager . . . . . . . . . . . . . . . . . . . . . . . . . . . . 747
13.9 Standard Reports: Basics for Business Users . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 748
13.10 Building and Delivering BI Reports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 753
13.11 The BI Portal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 757
13.12 Dashboards Done Right . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 759
13.13 Don’t Be Overly Reliant on Your Data Access Tool’s Metadata . . . . . . . . . . . . . . . . 760
13.14 Making Sense of the Semantic Layer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 762
Mining Data to Uncover Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 764
13.15 Digging into Data Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 764
13.16 Preparing for Data Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 766
13.17 The Perfect Handoff . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 770
13.18 Get Started with Data Mining Now . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 774
13.19 Leverage Your Dimensional Model for Predictive Analytics . . . . . . . . . . . . . . . . . . 778
13.20 Does Your Organization Need an Analytic Sandbox? . . . . . . . . . . . . . . . . . . . . . . 779
Dealing with SQL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 781
13.21 Simple Drill Across in SQL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 781
13.22 An Excel Macro for Drilling Across . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 783
13.23 The Problem with Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 785
13.24 SQL Roadblocks and Pitfalls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 789
13.25 Features for Query Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 792
13.26 Turbocharge Your Query Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 794
13.27 Smarter Data Warehouses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 798
14 Maintenance and Growth Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 805
Deploying Successfully . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 805
14.1 Don’t Forget the Owner’s Manual . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 805
14.2 Let’s Improve Our Operating Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 809
14.3 Marketing the DW/BI System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 811
14.4 Coping with Growing Pains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 812
Sustaining for Ongoing Impact . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 816
14.5 Data Warehouse Checkups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 816
14.6 Boosting Business Acceptance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 822
14.7 Educate Management to Sustain DW/BI Success . . . . . . . . . . . . . . . . . . . . . . . . . . 825
14.8 Getting Your Data Warehouse Back on Track . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 828
14.9 Upgrading Your BI Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 829
14.10 Four Fixes for Legacy Data Warehouses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 831
14.11 A Data Warehousing Fitness Program for Lean Times . . . . . . . . . . . . . . . . . . . . . . 835
14.12 Enjoy the Sunset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 839
15 Final Thoughts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 841
Key Insights and Reminders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 841
15.1 Final Word of the Day: Collaboration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 841
15.2 Tried and True Concepts for DW/BI Success . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 843
15.3 Key Tenets of the Kimball Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 845
A Look to the Future . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 847
15.4 The Future Is Bright . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 847
Article Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 853
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 861
fl ast.indd 11/25/15 Page xxv
Introduction
The Kimball Group’s article and Design Tip archive has been the most popular destination on our
website (www.kimballgroup.com). Stretching back twenty years to Ralph’s original 1995 DBMS
magazine articles, the archive explores more than 250 topics, sometimes in more depth than provided
by our books or courses.
With The Kimball Group Reader, Second Edition, we have organized all of the articles in a coherent
way. But The Reader is more than merely a collection of our past magazine articles and Design Tips
verbatim. We have trimmed the redundancy, made sure all the articles use a consistent
vocabulary, and updated many of the figures. This is a new and improved remastered
compilation of our writings.
After considerable discussion, we decided to update many time references and edit content throughout
the book to provide the perspective of 2015 rather than leaving old dates or outdated concepts in
the articles. Thus an article written in 2007 may use 2015 in an example! When articles refer to the
number of years that have passed, we have updated these references relative to 2015. For example,
if a 2005 article originally said “during the past five years,” the article now reads “during the past
fifteen years.” Mentions regarding our years of experience, number of books sold, articles written, or
students taught have also been updated to 2015 figures. Finally, we occasionally changed references
from outmoded technologies such as “modems” to more modern technologies, especially “internet.”
We trust these changes will not mislead or cause confusion, but rather make your reading experience
more natural.
Intended Audience and Goals
The primary reader of this book should be the analyst, designer, modeler, or manager who is delivering
a data warehouse in support of business intelligence. The articles in this book trace the entire
lifecycle of DW/BI system development, from original business requirements gathering all the way
to final deployment. We believe that this collection of articles serves as a superb reference-in-depth
for literally hundreds of issues and situations that arise in the development of a DW/BI system.
The articles range from a managerial focus to a highly technical focus, although in all cases, the
tone of the articles strives to be educational. These articles have been accessed thousands of times
per day on the Kimball Group website over a span of 20 years, so we’re confident they’re useful.
This book adds significant value by organizing the archive, and systematically editing the articles to
ensure their consistency and relevance.
Preview of Contents
Following two introductory chapters, the book’s organization will look somewhat familiar to readers
of The Data Warehouse Lifecycle Toolkit, Second Edition (Wiley, 2008) because we’ve organized the
articles topically to correspond with the major milestones of a data warehouse/business intelligence
(DW/BI) implementation. Not surprisingly, given that the word “Kimball” is practically synonymous with
dimensional modeling, much of The Reader focuses on that topic in particular.
■ Chapter 1: The Reader at a Glance. We begin the book with a series of articles written by Ralph
several years ago for DM Review magazine. This series succinctly encapsulates the Kimball
approach in a cohesive manner, so it serves as a perfect overview, akin to CliffsNotes, for the
book.
■ Chapter 2: Before You Dive In. Long-time readers of Ralph’s articles will find that this chapter
is a walk down memory lane, as many of the articles are historically significant. Somewhat
amazingly, the content is still very relevant even though most of these articles were written
in the 1990s.
■ Chapter 3: Project/Program Planning. With an overview and history lesson under your belt,
Chapter 3 moves on to getting the DW/BI program and project launched. We consider both the
project team’s and sponsoring stakeholders’ responsibilities, and then delve into the Kimball
Lifecycle approach.
■ Chapter 4: Requirements Definition. It is difficult to achieve DW/BI success in the absence of
business requirements. This chapter delivers specific recommendations for effectively eliciting
the business’s needs. It stresses the importance of organizing the requirements findings around
business processes, and suggests tactics for reaching organizational consensus on appropriate
next steps.
■ Chapter 5: Data Architecture. With a solid understanding of the business requirements, we turn
our attention to the data (where we will remain through Chapter 11). This chapter begins with
the justification for dimensional modeling. It then describes the enterprise data warehouse bus
architecture, discusses the agile development approach to support data warehousing, provides
rationalization for the requisite integration and stewardship, and then contrasts the Kimball
architecture with the Corporate Information Factory’s hub-and-spoke.
■ Chapter 6: Dimensional Modeling Fundamentals. This chapter introduces the basics of dimensional
modeling, starting with distinguishing a fact from a dimension, and the core activities of
drilling down, drilling across, and handling time in a data warehouse. We also explore familiar
fables about dimensional models.
■ Chapter 7: Dimensional Modeling Tasks and Responsibilities. While Chapter 6 covers the
fundamental “what and why” surrounding dimensional modeling, this chapter focuses on
the “how, who, and when.” Chapter 7 describes the dimensional modeling process and tasks,
with the aim of organizing an effective team, whether starting with a blank slate or revisiting
an existing model.
■ Chapter 8: Fact Table Core Concepts. The theme for Chapter 8 could be stated as “just the
facts, and nothing but the facts.” We begin by discussing granularity and the three fundamental
types of fact tables, and then turn our attention to fact table keys and degenerate dimensions.
The chapter closes with a potpourri of common fact table patterns, including null, textual, and
sparsely populated metrics, as well as facts that closely resemble dimension attributes.
■ Chapter 9: Dimension Table Core Concepts. We shift our focus to dimension tables in Chapter
9, starting with a discussion of surrogate keys and the ever-present time (or date) dimensions.
We then explore role playing, junk, and causal dimension patterns, before launching into a
thorough handling of slowly changing dimensions, including four new advanced dimension
types. Hang onto your hats.
■ Chapter 10: More Dimension Patterns and Considerations. Chapter 10 complements the
previous chapter with more meaty coverage of dimension tables. We describe snowflakes and
outriggers, as well as a significantly updated section on bridges for handling both multi-valued
dimension attributes and ragged variable hierarchies. We discuss nuances often encountered in
customer dimensions, along with internationalization issues. The chapter closes with a series
of case studies covering insurance, voyages and networks, human resources, finance, electronic
commerce, text searching, and retail; we encourage everyone to peruse these vignettes as the
patterns and recommendations transcend industry or application boundaries.
■ Chapter 11: Back Room ETL and Data Quality. We switch gears from designing the target
dimensional model to populating it in Chapter 11. Be forewarned: This is a hefty chapter, as
you’d expect given the subject matter. This updated edition of the Reader has a wealth of new
material in this chapter. We start by describing the 34 subsystems required to extract, transform,
and load (ETL) the data, along with the pros and cons of using a commercial ETL tool.
From there, we delve into data quality considerations, provide specific guidance for building
fact and dimension tables, and discuss the implications of real-time ETL.
■ Chapter 12: Technical Architecture Considerations. It’s taken us until Chapter 12, but we’re
finally discussing issues surrounding the technical architecture, starting with service-oriented
architecture (SOA), master data management (MDM), and packaged analytics. A new section
on big data features two in-depth Kimball Group white papers written by Ralph. Final sections
in this chapter focus on the presentation server, including the role of aggregate navigation
and online analytical processing (OLAP), user interface design, metadata, infrastructure, and
security.
■ Chapter 13: Front Room Business Intelligence Applications. In Chapter 13, we step into the
front room of the DW/BI system where business users are interacting with the data. We describe
the lifecycle of a typical business analysis, starting with a review of historical performance but
not stopping there. We then turn our attention to standardized BI reports before digging into
data mining and predictive analytics. The chapter closes by exploring the limitations of SQL
for business analysis.
■ Chapter 14: Maintenance and Growth Considerations. In this penultimate chapter, we provide
recommendations for successfully deploying the DW/BI system, as well as keeping it healthy
for sustained success.
■ Chapter 15: Final Thoughts. The Reader concludes with final perspectives on data warehousing
and business intelligence from each Kimball Group principal. The insights range from the
most important hard-won lessons we have learned to some glimpses of what the future of data
warehousing may hold.
Navigation Aids
Given the breadth and depth of the articles in The Kimball Group Reader, we have very deliberately
identified over two dozen articles as “Kimball Classics” because they captured a concept so effectively
that we, and many others in the industry, have referred to these articles repeatedly over the past
twenty years. The classic articles are designated with a special icon.
We expect most people will read the articles in somewhat random order, rather than digesting
the book from front to back. Therefore, we have put special emphasis on The Reader’s index as
we anticipate many of you will delve in by searching the index for a particular technique or
modeling situation.
Terminology Notes
We are very proud that the vocabulary established by Ralph has been so durable and broadly adopted.
Kimball “marker words” including dimensions, facts, slowly changing dimensions, surrogate keys,
fact table grains, factless fact tables, and degenerate dimensions, have been used consistently across
the industry for more than twenty years. But in spite of our best intentions, a few terms have morphed
since their introduction; we have retroactively replaced the old terms with the accepted current ones.
■ Artificial keys are now called surrogate keys.
■ Data mart has been replaced with business process dimensional model, business process subject
area, or just subject area, depending on the context.