Post on 20-Dec-2015
Polaris
Query, Analysis, and Visualization of
Large Hierarchical Relational Databases
Pat Hanrahan
With Chris Stolte and Diane Tang
Computer Science Department
Stanford University
Motivation
Large databases have become very common
Corporate data warehouses: Amazon, Walmart, …
Scientific projects: Human Genome Project
Sloan Digital Sky Survey
Need tools to extract meaning from these databases
Related Work
Formalisms for graphics: Bertin’s “Semiology of Graphics”; Mackinlay’s APT; Roth et al.’s Sage and SageBrush; Wilkinson’s “Grammar of Graphics”
Visual exploration of databases: DeVise; DataSplash/Tioga-2
Visualization and data mining: SGI’s MineSet; IBM’s Diamond
Polaris Formalism
UI interpreted as visual specification that defines:
Table configuration
Type of graphic in each pane
Encoding of data as visual properties of marks
Data transformations and queries
Schema
Ordinal fields (categorical): Market, State, Year, Quarter, Month, Product Type, Product
Quantitative fields (measures): Profit, Sales, Payroll, Marketing, Inventory, Margin, COGS, ...
Coffee chain data [Visual Insights]
Polaris Visual Encodings
Principle of Importance Ordering: Encode the most important
information in the most effective way [Cleveland & McGill]
The Pivot Table Interface
Common interface to statistical packages/Excel
Cross-tabulations
Simple interface based on drag-and-drop
Data Cubes
Structure relation as n-dimensional cube
Each cell aggregates all measures for those dimensions
Each cube axis corresponds to a dimension in the relation
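The cube idea above can be sketched in a few lines of Python. This is only an illustration: the rows are made up, the helper name `build_cube` is hypothetical, and SUM is assumed as the aggregate (the slide does not name one).

```python
from collections import defaultdict

# Made-up rows in the spirit of the coffee chain data (values are invented).
rows = [
    {"Quarter": "Qtr1", "ProductType": "Coffee", "Profit": 10, "Sales": 40},
    {"Quarter": "Qtr1", "ProductType": "Coffee", "Profit": 5, "Sales": 30},
    {"Quarter": "Qtr1", "ProductType": "Tea", "Profit": 7, "Sales": 20},
    {"Quarter": "Qtr2", "ProductType": "Coffee", "Profit": 12, "Sales": 50},
]


def build_cube(rows, dimensions, measures):
    """Map each cell (one value per dimension) to the summed measures.

    Each dimension plays the role of one cube axis; SUM is an assumed aggregate.
    """
    cube = defaultdict(lambda: dict.fromkeys(measures, 0))
    for row in rows:
        cell = tuple(row[d] for d in dimensions)
        for m in measures:
            cube[cell][m] += row[m]
    return dict(cube)


cube = build_cube(rows, ["Quarter", "ProductType"], ["Profit", "Sales"])
```

Here `cube[("Qtr1", "Coffee")]` holds the aggregated Profit and Sales for that cell of a two-dimensional cube.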
Table Algebra: Operands
Ordinal fields: interpret domain as a set that partitions table into rows and columns:
Quarter = {(Qtr1),(Qtr2),(Qtr3),(Qtr4)}
Quantitative fields: treat domain as single element set and encode spatially as axes:
Profit = {(Profit)}
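One way to picture the two kinds of operands is as ordered lists of one-element tuples. The sketch below assumes that representation; the helper names `ordinal` and `quantitative` are hypothetical and not part of Polaris.

```python
def ordinal(domain):
    """Ordinal field: one single-element tuple per domain value, in domain order."""
    return [(value,) for value in domain]


def quantitative(name):
    """Quantitative field: a single-element set containing only the field name."""
    return [(name,)]


quarter = ordinal(["Qtr1", "Qtr2", "Qtr3", "Qtr4"])
profit = quantitative("Profit")
```

The ordinal operand yields four entries (four rows or columns), while the quantitative operand yields one entry that would be drawn as an axis.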
Concatenation (+) Operator
Ordered union of two sets
Quarter + ProductType
= {(Qtr1),(Qtr2),(Qtr3),(Qtr4)}+{(Coffee),(Espresso)}
= {(Qtr1),(Qtr2),(Qtr3),(Qtr4),(Coffee),(Espresso)}
Profit + Sales
= {(Profit),(Sales)}
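A minimal sketch of concatenation, assuming operands are ordered lists of tuples (a representation chosen for illustration; `concat` is a hypothetical helper name):

```python
def concat(a, b):
    """Ordered union: a's entries in order, then b's entries not already present."""
    result = list(a)
    for entry in b:
        if entry not in result:
            result.append(entry)
    return result


quarter = [("Qtr1",), ("Qtr2",), ("Qtr3",), ("Qtr4",)]
product_type = [("Coffee",), ("Espresso",)]

quarter_plus_type = concat(quarter, product_type)
profit_plus_sales = concat([("Profit",)], [("Sales",)])
```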
Cross (×) Operator
Direct product of two sets
Quarter × ProductType =
{(Qtr1, Coffee), (Qtr1, Tea), (Qtr2, Coffee), (Qtr2, Tea),
(Qtr3, Coffee), (Qtr3, Tea), (Qtr4, Coffee), (Qtr4, Tea)}
ProductType × Profit = {(Coffee, Profit), (Tea, Profit)}
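The cross operator can be sketched with `itertools.product`, again assuming operands are ordered lists of tuples (`cross` is a hypothetical helper name):

```python
from itertools import product


def cross(a, b):
    """Direct product: every entry of a joined with every entry of b, in order."""
    return [left + right for left, right in product(a, b)]


quarter = [("Qtr1",), ("Qtr2",), ("Qtr3",), ("Qtr4",)]
product_type = [("Coffee",), ("Tea",)]
profit = [("Profit",)]

quarter_x_type = cross(quarter, product_type)
type_x_profit = cross(product_type, profit)
```

Crossing an ordinal field with a quantitative field keeps the number of entries unchanged, since the quantitative operand contributes a single element.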
SQL Dataflow
[Figure: dataflow from Relational Table through Sort to Tuples in Panes and Marks in Panes]
Notes: Aggregation operators applied after sort. Only one layer is shown; additional z-sort.
Hierarchical Structure
Challenge: these databases are very large
Queries/Vis should not require all the records
Augment database with hierarchical structure
Provide meaningful levels of abstraction
Derived from domain or clustering
Provides metadata (missing data for context)
Hierarchies and Data Cubes
Each dimension in the cube is structured as a tree
Each level in tree corresponds to level of detail
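As a rough illustration of a tree-structured dimension, the sketch below rolls month-level totals up to a coarser level of detail. The year label, the totals, and the names `MONTH_PATH`, `LEVELS`, and `roll_up` are all made up for the example, and SUM is an assumed aggregate.

```python
# Hypothetical Time hierarchy: each month (leaf) maps to its (Year, Quarter) ancestors.
MONTH_PATH = {
    "Jan": ("1998", "Qtr1"), "Feb": ("1998", "Qtr1"), "Mar": ("1998", "Qtr1"),
    "Apr": ("1998", "Qtr2"), "May": ("1998", "Qtr2"), "Jun": ("1998", "Qtr2"),
}
LEVELS = ("Year", "Quarter", "Month")  # root to leaf


def roll_up(month_totals, level):
    """Aggregate month-level totals to the requested level of the hierarchy."""
    depth = LEVELS.index(level) + 1
    totals = {}
    for month, value in month_totals.items():
        path = MONTH_PATH[month] + (month,)
        key = path[:depth]  # prefix of the path identifies the ancestor node
        totals[key] = totals.get(key, 0) + value
    return totals


monthly = {"Jan": 1, "Feb": 2, "Apr": 4}
by_quarter = roll_up(monthly, "Quarter")
by_year = roll_up(monthly, "Year")
```

Moving up one level in the tree replaces several cells with a single aggregated cell, which is what lets a query avoid touching every record.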
Schema: Star Schema
Fact table: State, Month, Product, Profit, Sales, Payroll, Marketing, Inventory, Margin, ... (Profit onward are measures)
Existence tables:
Location: Market, State
Time: Year, Quarter, Month
Products: Product Type, Product Name
Generalizations
• Snowflake schemas
• Lattices (DAGs)
Categorical Hierarchies
Quarter × Month
Direct product of two sets
Would create twelve entries for each quarter, e.g. (Qtr1, December)
Quarter / Month
Based on tuples in database not semantics
Would only create three entries per quarter
Can be expensive to compute
Quarter . Month
Based on tuples in existence tables (not db)
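The difference between the three operators can be sketched on a half-year of made-up data. The existence table, the fact rows, and the function names are assumptions for illustration only.

```python
from itertools import product

QUARTERS = ["Qtr1", "Qtr2"]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]

# Hypothetical existence table: every valid (quarter, month) pair in the hierarchy.
existence = [
    ("Qtr1", "Jan"), ("Qtr1", "Feb"), ("Qtr1", "Mar"),
    ("Qtr2", "Apr"), ("Qtr2", "May"), ("Qtr2", "Jun"),
]

# Made-up fact rows: some months have no records at all.
fact_rows = [("Qtr1", "Jan"), ("Qtr1", "Mar"), ("Qtr1", "Jan"), ("Qtr2", "Apr")]


def cross(a, b):
    """Direct product: pairs every quarter with every month, valid or not."""
    return list(product(a, b))


def nest(rows):
    """Distinct pairs that occur in the fact rows; requires scanning the data."""
    order = lambda p: (QUARTERS.index(p[0]), MONTHS.index(p[1]))
    return sorted(set(rows), key=order)


def dot(existence_table):
    """Pairs defined by the hierarchy itself, without reading the fact rows."""
    return list(existence_table)


crossed = cross(QUARTERS, MONTHS)
nested = nest(fact_rows)
dotted = dot(existence)
```

In this sketch the cross includes impossible pairs such as (Qtr1, Jun), the nest drops months that have no records, and the dot result keeps every valid pair, so missing data remains visible as context.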
Generalization: Techniques
Selection
Simplification
Exaggeration
Regularization
Displacement
Aggregation
Summary
Polaris
Spreadsheet or table-based displays
Simple drag-and-drop interface
Built on a formalism that allows algebraic manipulation of visual mapping of tuples to marks
Multiscale visualizations using data and visual abstraction
Connects to SQL/MDX servers
See http://www.graphics.stanford.edu/projects/polaris
Future Work
Articulate a full set of multiscale design patterns
Transition between levels of detail
Develop system infrastructure for browsing VLDB
Support layers/lenses/linking with tuple flow
Device independence through graphical encodings
Extend formalism to 3D
Couple scientific and information visualization
…