cse 544 data models and views
DESCRIPTION
CSE 544 Data Models and Views. Lecture #4 Wednesday, January 19, 2011. Announcements. Projects : please sign up to meet with me on Friday, between 11-1pm (need about 15’). Before that do this: Form team Choose project Think, so we can have a meaningful discussion - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/1.jpg)
Dan Suciu -- 544, Winter 2011
CSE 544Data Models and Views
Lecture #4Wednesday, January 19, 2011
1
![Page 2: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/2.jpg)
Dan Suciu: Queries on Probabilistic Data 2
Announcements
• Projects: please sign up to meet with me on Friday, between 11-1pm (need about 15’). Before that do this:– Form team– Choose project– Think, so we can have a meaningful
discussion• Homework 1: due on Monday, 12pm
(before the lecture)Wayne State, 1/18/2011
![Page 3: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/3.jpg)
CSE 544 - Fall 2009 3
References
• M. Stonebraker and J. Hellerstein. What Goes Around Comes Around. In "Readings in Database Systems" (aka the Red Book). 4th ed.
![Page 4: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/4.jpg)
CSE 544 - Fall 2009 4
Data Model Motivation
• User is concerned with real-world data– Data represents different aspects of user’s business– Data typically includes entities and relationships between them– Example entities are students, courses, products, clients– Example relationships are course registrations, product purchases
• User somehow needs to define data to be stored in DBMS
• Data model enables a user to define the data using high-level constructs without worrying about many low-level details of how data will be stored on disk
![Page 5: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/5.jpg)
CSE 544 - Fall 2009 5
Levels of Abstraction
Disk
Physical Schema
Conceptual Schema
External Schema External Schema External Schema
a.k.a logical schemadescribes stored datain terms of data model
includes storage detailsfile organizationindexes
schema seen by apps
Classical picture.Remember it !
![Page 6: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/6.jpg)
CSE 544 - Fall 2009 6
Outline• Different types of data
• Early data models– IMS– CODASYL
• Physical and logical independence in the relational model
• Other data models
![Page 7: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/7.jpg)
CSE 544 - Fall 2009 7
Different Types of Data
• Structured data– What is this ? Examples ?
• Semistructured data– What is this ? – Examples ?
• Unstructured data– What is this ? Examples ?
![Page 8: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/8.jpg)
CSE 544 - Fall 2009 8
Different Types of Data
• Structured data– All data conforms to a schema. Ex: business data
• Semistructured data– Some structure in the data but implicit and irregular– Ex: resume, ads
• Unstructured data– No structure in data. Ex: text, sound, video, images
• Our focus: structured data & relational DBMSs
![Page 9: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/9.jpg)
CSE 544 - Fall 2009 9
Outline
• Different types of data
• Early data models– IMS – late 1960’s and 1970’s– CODASYL – 1970’s
• Physical and logical independence in the relational model
• Other data models
![Page 10: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/10.jpg)
CSE 544 - Fall 2009 10
Early Proposal 1: IMS
• What is it ?
![Page 11: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/11.jpg)
CSE 544 - Fall 2009 11
Early Proposal 1: IMS
• Hierarchical data model
• Record– Type: collection of named fields with data types (+)– Instance: must match type definition (+)– Each instance must have a key (+)– Record types must be arranged in a tree (-)
• IMS database is collection of instances of record types organized in a tree
![Page 12: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/12.jpg)
CSE 544 - Fall 2009 12
IMS Example
• See Figure 2 in paper “What goes around comes around”
![Page 13: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/13.jpg)
CSE 544 - Fall 2009 13
Data Manipulation Language: DL/1
• How does a programmer retrieve data in IMS ?
![Page 14: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/14.jpg)
CSE 544 - Fall 2009 14
Data Manipulation Language: DL/1
• Each record has a hierarchical sequence key (HSK)– Records are totally ordered: depth-first and left-to-right
• HSK defines semantics of commands:– get_next– get_next_within_parent
• DL/1 is a record-at-a-time language– Programmer constructs an algorithm for solving the
query– Programmer must worry about query optimization
![Page 15: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/15.jpg)
CSE 544 - Fall 2009 15
Data storage
• How is the data physically stored in IMS ?
![Page 16: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/16.jpg)
CSE 544 - Fall 2009 16
Data storage• Root records
– Stored sequentially (sorted on key)– Indexed in a B-tree using the key of the record– Hashed using the key of the record
• Dependent records– Physically sequential – Various forms of pointers
• Selected organizations restrict DL/1 commands– No updates allowed with sequential organization– No “get-next” for hashed organization
![Page 17: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/17.jpg)
CSE 544 - Fall 2009 17
Data Independence
• What is it ?
![Page 18: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/18.jpg)
CSE 544 - Fall 2009 18
Data Independence• Physical data independence: Applications are insulated
from changes in physical storage details
• Logical data independence: Applications are insulated from changes to logical structure of the data
• Why are these properties important?– Reduce program maintenance as– Logical database design changes over time– Physical database design tuned for performance
![Page 19: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/19.jpg)
CSE 544 - Fall 2009 19
IMS Limitations• Tree-structured data model
– Redundant data, existence depends on parent, artificial structure
• Record-at-a-time user interface– User must specify algorithm to access data
• Very limited physical independence– Phys. organization limits possible operations– Application programs break if organization changes
• Provides some logical independence– DL/1 program runs on logical database – Difficult to achieve good logical data independence with a tree model
![Page 20: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/20.jpg)
CSE 544 - Fall 2009 20
Early Proposal 2: CODASYL
• What is it ?
![Page 21: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/21.jpg)
CSE 544 - Fall 2009 21
Early Proposal 2: CODASYL• Networked data model
• Primitives are also record types with keys (+)• Network model is more flexible than hierarchy(+)
– Ex: no existence dependence• Record types are organized into network (-)
– A record can have multiple parents– Arcs between records are named– At least one entry point to the network
• Record-at-a-time data manipulation language (-)
![Page 22: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/22.jpg)
CSE 544 - Fall 2009 22
CODASYL Example• See Figure 5 in paper “What goes around comes around”
![Page 23: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/23.jpg)
CSE 544 - Fall 2009 23
CODASYL Limitations• No physical data independence
– Application programs break if organization changes
• No logical data independence– Application programs break if organization changes
• Very complex• Programs must “navigate the hyperspace”• Load and recover as one gigantic object
![Page 24: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/24.jpg)
CSE 544 - Fall 2009 24
Outline• Different types of data
• Early data models– IMS– CODASYL
• Physical and logical independence in the relational model
• Other data models
![Page 25: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/25.jpg)
CSE 544 - Fall 2009 25
Relational Model Overview
• Proposed by Ted Codd in 1970
• Motivation: better logical and physical data independence
![Page 26: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/26.jpg)
Dan Suciu: Queries on Probabilistic Data 26
Relational Model Overview
• Defines logical schema only– No physical schema
• Set-at-a-time query language
Wayne State, 1/18/2011
![Page 27: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/27.jpg)
CSE 544 - Fall 2009 27
Physical Independence
• Definition: Applications are insulated from changes in physical storage details
• Early models (IMS and CODASYL): No
• Relational model: Yes– Yes through set-at-a-time language: algebra or
calculus– No specification of what storage looks like– Administrator can optimize physical layout
![Page 28: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/28.jpg)
CSE 544 - Fall 2009 28
Physical Independence
• Definition: Applications are insulated from changes in physical storage details
• Early models (IMS and CODASYL): No
• Relational model: Yes– Yes through set-at-a-time language: algebra or
calculus– No specification of what storage looks like– Administrator can optimize physical layout
![Page 29: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/29.jpg)
CSE 544 - Fall 2009 29
Logical Independence
• Definition: Applications are insulated from changes to logical structure of the data
• Early models– IMS: some logical independence– CODASYL: no logical independence
• Relational model– Yes through views
![Page 30: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/30.jpg)
CSE 544 - Fall 2009 30
Great Debate• Pro relational
– What where the arguments ?
• Against relational– What where the arguments ?
• How was it settled ?
![Page 31: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/31.jpg)
CSE 544 - Fall 2009 31
Great Debate• Pro relational
– CODASYL is too complex– CODASYL does not provide sufficient data independence– Record-at-a-time languages are too hard to optimize– Trees/networks not flexible enough to represent common cases
• Against relational– COBOL programmers cannot understand relational languages– Impossible to represent the relational model efficiently– CODASYL can represent tables
• Ultimately settled by the market place
![Page 32: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/32.jpg)
CSE 544 - Fall 2009 32
Outline
• Different types of data
• Early data models– IMS– CODASYL
• Physical and logical independence in the relational model
• Other data models
![Page 33: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/33.jpg)
CSE 544 - Fall 2009 33
Other Data Models• Entity-Relationship: 1970’s
– Successful in logical database design (next lecture)• Extended Relational: 1980’s • Semantic: late 1970’s and 1980’s• Object-oriented: late 1980’s and early 1990’s
– Address impedance mismatch: relational dbs OO languages
– Interesting but ultimately failed (several reasons, see paper)• Object-relational: late 1980’s and early 1990’s
– User-defined types, ops, functions, and access methods• Semi-structured: late 1990’s to the present
![Page 34: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/34.jpg)
CSE 544 - Fall 2009 34
Summary• Data independence is desirable
– Both physical and logical– Early data models provided very limited data independence– Relational model facilitates data independence
• Set-at-a-time languages facilitate phys. indep. [more next lecture]• Simple data models facilitate logical indep. [more next lecture]
• Flat models are also simpler, more flexible• User should specify what they want not how to get it
– Query optimizer does better job than human
• New data model proposals must– Solve a “major pain” or provide significant performance gains
![Page 35: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/35.jpg)
Views
Dan Suciu -- 544, Winter 2011 35
Views are relations, but may not be physically stored.
For presenting different information to different users
Employee(ssn, name, department, project, salary)
Payroll has access to Employee, others only to Developers
CREATE VIEW Developers AS SELECT name, project FROM Employee WHERE department = ‘Development’
![Page 36: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/36.jpg)
Example
Dan Suciu -- 544, Winter 2011 36
CREATE VIEW CustomerPrice AS SELECT x.customer, y.price FROM Purchase x, Product y WHERE x.product = y.pname
Purchase(customer, product, store)Product(pname, price)
CustomerPrice(customer, price) “virtual table”
![Page 37: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/37.jpg)
Dan Suciu -- 544, Winter 2011 37
SELECT u.customer, v.storeFROM CustomerPrice u, Purchase vWHERE u.customer = v.customer AND u.price > 100
We can later use the view:
Purchase(customer, product, store)Product(pname, price)CustomerPrice(customer, price)
![Page 38: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/38.jpg)
Types of Views
• Virtual views:– Pros/cons ???
• Materialized views– Pros/cons ??
Dan Suciu -- 544, Winter 2011 38
![Page 39: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/39.jpg)
Types of Views
• Virtual views:– Used in databases– Computed only on-demand – slow at runtime– Always up to date
• Materialized views– Used in data warehouses– Pre-computed offline – fast at runtime– May have stale data or expensive synchronization
Dan Suciu -- 544, Winter 2011 39
![Page 40: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/40.jpg)
Query Modification
Dan Suciu -- 544, Winter 2011 40
SELECT u.customer, v.storeFROM CustomerPrice u, Purchase vWHERE u.customer = v.customer AND u.price > 100
CREATE VIEW CustomerPrice AS SELECT x.customer, y.price FROM Purchase x, Product y WHERE x.product = y.pname
View:
Query:
Purchase(customer, product, store)Product(pname, price)
CustomerPrice(customer, price)
![Page 41: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/41.jpg)
Query Modification
Dan Suciu -- 544, Winter 2011 41
SELECT u.customer, v.storeFROM (SELECT x.customer, y.price FROM Purchase x, Product y WHERE x.product = y.pname) u, Purchase vWHERE u.customer = v.customer AND u.price > 100
Modified query:
Purchase(customer, product, store)Product(pname, price)
CustomerPrice(customer, price)
![Page 42: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/42.jpg)
Query Modification
Dan Suciu -- 544, Winter 2011 42
SELECT x.customer, v.storeFROM Purchase x, Product y, Purchase v, WHERE x.customer = v.customer AND y.price > 100 AND x.product = y.pname
Modified and unnested query:
Purchase(customer, product, store)Product(pname, price)
CustomerPrice(customer, price)
![Page 43: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/43.jpg)
Another Example
Dan Suciu -- 544, Winter 2011 43
SELECT DISTINCT u.customer, v.storeFROM CustomerPrice u, Purchase vWHERE u.customer = v.customer AND u.price > 100
??
Purchase(customer, product, store)Product(pname, price)
CustomerPrice(customer, price)
![Page 44: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/44.jpg)
Answer
Dan Suciu -- 544, Winter 2011 44
SELECT DISTINCT u.customer, v.storeFROM CustomerPrice u, Purchase vWHERE u.customer = v.customer AND u.price > 100
Purchase(customer, product, store)Product(pname, price)
CustomerPrice(customer, price)
SELECT DISTINCT x.customer, v.storeFROM Purchase x, Product y, Purchase v, WHERE x.customer = v.customer AND y.price > 100 AND x.product = y.pname
![Page 45: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/45.jpg)
Applications of Virtual Views
• Physical data independence. E.g.– Vertical data partitioning– Horizontal data partitioning
• Security– The view reveals only what the users are
allowed to know
Dan Suciu -- 544, Winter 2011 45
![Page 46: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/46.jpg)
Vertical PartitioningSSN Name Address Resume Picture234234 Mary Huston Clob1… Blob1…345345 Sue Seattle Clob2… Blob2…345343 Joan Seattle Clob3… Blob3…234234 Ann Portland Clob4… Blob4…
Resumes
SSN Name Address234234 Mary Huston
345345 Sue Seattle . . .
SSN Resume234234 Clob1…345345 Clob2…
SSN Picture234234 Blob1…
345345 Blob2…
T1 T2 T3
![Page 47: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/47.jpg)
Vertical Partitioning
Dan Suciu -- 544, Winter 2011 47
CREATE VIEW Resumes AS SELECT T1.ssn, T1.name, T1.address, T2.resume, T3.picture FROM T1,T2,T3 WHERE T1.ssn=T2.ssn and T2.ssn=T3.ssn
![Page 48: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/48.jpg)
Vertical Partitioning
Dan Suciu -- 544, Winter 2011 48
SELECT addressFROM ResumesWHERE name = ‘Sue’
Which of the tables T1, T2, T3 willbe queried by the system ?
![Page 49: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/49.jpg)
Vertical PartitioningWhen to do this:• When some fields are large, rarely accessed
– E.g. Picture• In distributed databases
– Customer info site 1, customer orders at site 2• In data integration
– T1 comes from one source– T2 comes from a different source
Dan Suciu -- 544, Winter 2011 49
![Page 50: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/50.jpg)
Horizontal Partitioning
50
SSN Name City234234 Mary Huston345345 Sue Seattle345343 Joan Seattle234234 Ann Portland-- Frank Spokane
-- Jean Spokane
CustomersSSN Name City234234 Mary Huston
CustomersInHuston
SSN Name City345345 Sue Seattle
345343 Joan Seattle
CustomersInSeattle
SSN Name City-- Frank Spokane
-- Jean Spokane
CustomersInCanada
![Page 51: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/51.jpg)
Horizontal Partitioning
Dan Suciu -- 544, Winter 2011 51
CREATE VIEW Customers AS CustomersInHuston UNION ALL CustomersInSeattle UNION ALL . . .
SSN Name City234234 Mary Huston
CustomersInHuston
SSN Name City345345 Sue Seattle
345343 Joan Seattle
CustomersInSeattle
SSN Name City-- Frank Spokane
-- Jean Spokane
CustomersInCanada
![Page 52: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/52.jpg)
Horizontal Partitioning
Dan Suciu -- 544, Winter 2011 52
SELECT nameFROM CusotmersWHERE city = ‘Seattle’
SSN Name City234234 Mary Huston
CustomersInHuston
SSN Name City345345 Sue Seattle
345343 Joan Seattle
CustomersInSeattle
SSN Name City-- Frank Spokane
-- Jean Spokane
CustomersInCanadaWhich tables are inspected by the system ?
![Page 53: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/53.jpg)
Horizontal Partitioning
Dan Suciu -- 544, Winter 2011 53
SELECT nameFROM CusotmersWHERE city = ‘Seattle’
Which tables are inspected by the system ?
SSN Name City234234 Mary Huston
CustomersInHuston
SSN Name City345345 Sue Seattle
345343 Joan Seattle
CustomersInSeattle
SSN Name City-- Frank Spokane
-- Jean Spokane
CustomersInCanada
All ! The system doesn’t know where ‘Seattle’ is
![Page 54: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/54.jpg)
Better
Dan Suciu -- 544, Winter 2011 54
CREATE VIEW Customers AS SELECT *, ‘Huston’ AS City FROM CustomersInHuston UNION ALL SELECT *, ‘Seattle’ AS City FROM CustomersInSeattle UNION ALL . . .
SSN Name City234234 Mary Huston
CustomersInHuston
SSN Name City345345 Sue Seattle
345343 Joan Seattle
CustomersInSeattle
SSN Name City-- Frank Spokane
-- Jean Spokane
CustomersInCanada
![Page 55: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/55.jpg)
Better
Dan Suciu -- 544, Winter 2011 55
SELECT nameFROM CusotmersWHERE city = ‘Seattle’
SELECT nameFROM CusotmersInSeattle
![Page 56: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/56.jpg)
Horizontal Partitioning
Applications:• Optimizations:
– E.g. archived applications and active applications
• Distributed databases• Data integration
Dan Suciu -- 544, Winter 2011 56
![Page 57: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/57.jpg)
SQL Security Model
• Discretionary access control:– Users × Tables × {SELECT, INSERT,
UPDATE, …}– GRANT and REVOKE commands
• Coarse grained ! Now row-level access control:– Each customer is allowed to see his/her own
records• Views are quick fix to that
Dan Suciu -- 544, Winter 2011 57
![Page 58: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/58.jpg)
Views and Security
Dan Suciu -- 544, Winter 2011 58
Name Address BalanceMary Huston 450.99Sue Seattle -240Joan Seattle 333.25Ann Portland -520
Customers:Fred is notallowed tosee this
How do we grant Fred access only to Name/Address ?
![Page 59: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/59.jpg)
Views and Security
Dan Suciu -- 544, Winter 2011 59
Name Address BalanceMary Huston 450.99Sue Seattle -240Joan Seattle 333.25Ann Portland -520
Grant Fred
access tothis
Customers:Fred is notallowed tosee this
CREATE VIEW PublicCustomers SELECT Name, Address FROM Customers
![Page 60: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/60.jpg)
Views and Security
Dan Suciu -- 544, Winter 2011 60
Name Address BalanceMary Huston 450.99Sue Seattle -240Joan Seattle 333.25Ann Portland -520
Customers: John isnot allowedto see >0balances
How do we grant John access only to delinquent accounts ?
![Page 61: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/61.jpg)
Views and Security
Dan Suciu -- 544, Winter 2011 61
Name Address BalanceMary Huston 450.99Sue Seattle -240Joan Seattle 333.25Ann Portland -520
Customers: John isnot allowedto see >0balances
CREATE VIEW BadCreditCustomers SELECT * FROM Customers WHERE Balance < 0
![Page 62: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/62.jpg)
Technical Problems in Virtual Views
• Simplifying queries over virtual views
• Updating virtual views
Dan Suciu -- 544, Winter 2011 62
![Page 63: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/63.jpg)
Set v.s. Bag Semantics
Dan Suciu -- 544, Winter 2011 63
SELECT DISTINCT a,b,cFROM R, S, TWHERE . . .
SELECT a,b,cFROM R, S, TWHERE . . .
Set semantics
Bag semantics
![Page 64: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/64.jpg)
Unnesting Queries
• Inner query: set/bag semantics
• Outer query: set/bag semantics
• When can we unnest ?
Dan Suciu -- 544, Winter 2011 64
![Page 65: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/65.jpg)
SELECT DISTINCT a,b,cFROM (SELECT DISTINCT u,v FROM R,S WHERE …), TWHERE . . .
SELECT DISTINCT a,b,cFROM (SELECT u,v FROM R,S WHERE …), TWHERE . . .
SELECT a,b,cFROM (SELECT u,v FROM R,S WHERE …), TWHERE . . .
SELECT a,b,cFROM (SELECT DISTINCT u,v FROM R,S WHERE …), TWHERE . . .
![Page 66: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/66.jpg)
SELECT DISTINCT a,b,cFROM (SELECT DISTINCT u,v FROM R,S WHERE …), TWHERE . . .
SELECT DISTINCT a,b,cFROM R, S, TWHERE . . .
SELECT DISTINCT a,b,cFROM (SELECT u,v FROM R,S WHERE …), TWHERE . . .
SELECT a,b,cFROM (SELECT u,v FROM R,S WHERE …), TWHERE . . .
SELECT a,b,cFROM (SELECT DISTINCT u,v FROM R,S WHERE …), TWHERE . . .
![Page 67: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/67.jpg)
SELECT DISTINCT a,b,cFROM (SELECT DISTINCT u,v FROM R,S WHERE …), TWHERE . . .
SELECT DISTINCT a,b,cFROM R, S, TWHERE . . .
SELECT DISTINCT a,b,cFROM (SELECT u,v FROM R,S WHERE …), TWHERE . . .
SELECT DISTINCT a,b,cFROM R, S, TWHERE . . .
SELECT a,b,cFROM (SELECT u,v FROM R,S WHERE …), TWHERE . . .
SELECT a,b,cFROM (SELECT DISTINCT u,v FROM R,S WHERE …), TWHERE . . .
![Page 68: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/68.jpg)
SELECT DISTINCT a,b,cFROM (SELECT DISTINCT u,v FROM R,S WHERE …), TWHERE . . .
SELECT DISTINCT a,b,cFROM R, S, TWHERE . . .
SELECT DISTINCT a,b,cFROM (SELECT u,v FROM R,S WHERE …), TWHERE . . .
SELECT DISTINCT a,b,cFROM R, S, TWHERE . . .
SELECT a,b,cFROM (SELECT u,v FROM R,S WHERE …), TWHERE . . .
SELECT a,b,cFROM (SELECT DISTINCT u,v FROM R,S WHERE …), TWHERE . . .
NO
![Page 69: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/69.jpg)
SELECT DISTINCT a,b,cFROM (SELECT DISTINCT u,v FROM R,S WHERE …), TWHERE . . .
SELECT DISTINCT a,b,cFROM R, S, TWHERE . . .
SELECT DISTINCT a,b,cFROM (SELECT u,v FROM R,S WHERE …), TWHERE . . .
SELECT DISTINCT a,b,cFROM R, S, TWHERE . . .
SELECT a,b,cFROM (SELECT u,v FROM R,S WHERE …), TWHERE . . .
SELECT a,b,cFROM R, S, TWHERE . . .
SELECT a,b,cFROM (SELECT DISTINCT u,v FROM R,S WHERE …), TWHERE . . .
NO
![Page 70: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/70.jpg)
Updating Virtual Views
• V(A1, A2, ..) = view over R1, R2, …
• Insert/modify/delete in/from V
• Can we push this to R1, R2, … ?– Updatable view = yes.– Non-updatable view = no.
Dan Suciu -- 544, Winter 2011 70
![Page 71: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/71.jpg)
Updatable View
Dan Suciu -- 544, Winter 2011 71
CREATE VIEW Expensive-Product AS SELECT pname FROM Product WHERE price > 100
Purchase(customer, product, store)Product(pname, price) INSERT
INTO Expensive-Product VALUES(‘Gizmo’)
![Page 72: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/72.jpg)
Updatable View
Dan Suciu -- 544, Winter 2011 72
CREATE VIEW Expensive-Product AS SELECT pname FROM Product WHERE price > 100
INSERT INTO Product VALUES(‘Gizmo’, NULL)
Purchase(customer, product, store)Product(pname, price) INSERT
INTO Expensive-Product VALUES(‘Gizmo’)
![Page 73: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/73.jpg)
Updatable View
Dan Suciu -- 544, Winter 2011 73
CREATE VIEW AcmePurchase AS SELECT customer, product FROM Purchase WHERE store = ‘AcmeStore’
INSERT INTO AcmePurchase VALUES(‘Joe’, ‘Gizmo’)
Purchase(customer, product, store)Product(pname, price)
![Page 74: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/74.jpg)
Updatable View
Dan Suciu -- 544, Winter 2011 74
CREATE VIEW AcmePurchase AS SELECT customer, product FROM Purchase WHERE store = ‘AcmeStore’
INSERT INTO AcmePurchase VALUES(‘Joe’, ‘Gizmo’)
INSERT INTO PurchaseVALUES(‘Joe’,’Gizmo’,NULL)
Notethis
Purchase(customer, product, store)Product(pname, price)
![Page 75: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/75.jpg)
Nonupdatable Views
Dan Suciu -- 544, Winter 2011 75
INSERT INTO CustomerPrice VALUES(‘Joe’, 200)
? ? ? ? ?
Most views arenon-updateable
CREATE VIEW CustomerPrice AS SELECT x.customer, y.price FROM Purchase x, Product y WHERE x.product = y.pname
Purchase(customer, product, store)Product(pname, price)
![Page 76: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/76.jpg)
Query Minimization under Bag Semantics
Rule 1: If:• x, y are tuple variables over the same
table and:• The condition x.key = y.key is in the
WHERE clauseThen combine x, y into a single variablequery
Dan Suciu -- 544, Winter 2011 76
![Page 77: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/77.jpg)
Query Minimization under Bag Semantics
SELECT y.name, x.dateFROM Order x, Product y, Order zWHERE x.pid = y.pid and y.price < 99 and y.pid = z.pid and x.cid = z.cid and z.weight > 150
Order(cid, pid, weight, date)Product(pid, name, price)
SELECT y.name, x.dateFROM Order x, Product yWHERE x.pid = y.pid and y.price < 99 and x.weight > 150
![Page 78: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/78.jpg)
Query Minimization under Bag Semantics
Rule 2: If • x ranges over S, y ranges over T, and• The condition x.fk = y.key is in the
WHERE clause, and• there is a not null constraint on x.fk• y is not used anywhere else, andThen remove T (and y) from the query
Dan Suciu -- 544, Winter 2011 78
![Page 79: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/79.jpg)
Query Minimization under Bag Semantics
Order(cid, pid, weight, date)Product(pid, name, price)
SELECT x.cid, x.dateFROM Order x WHERE x.weight > 20
SELECT x.cid, x.dateFROM Order x, Product yWHERE x.pid = y.pid and x.weight > 20
What constraintsdo we need to havefor this optimization ?
![Page 80: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/80.jpg)
Materialized Views
• The result of the view is materialized
• May speed up query answering significantly
• But the materialized view needs to be synchronized with the base data
Dan Suciu -- 544, Winter 2011 80
![Page 81: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/81.jpg)
Applications of Materialized Views
• Indexes
• Denormalization
• Semantic caching
Dan Suciu -- 544, Winter 2011 81
![Page 82: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/82.jpg)
Indexes
82
REALLY important to speed up query processing time.
SELECT *FROM PersonWHERE name = 'Smith'
CREATE INDEX myindex05 ON Person(name)
Person (name, age, city)
May take too long to scan the entire Person table
Now, when we rerun the query it will be much faster
![Page 83: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/83.jpg)
B+ Tree Index
Dan Suciu -- 544, Winter 2011 83
Adam Betty Charles …. Smith ….
We will discuss them in detail in a later lecture.
![Page 84: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/84.jpg)
Creating Indexes
84
Indexes can be created on more than one attribute:
SELECT * FROM Person WHERE age = 55 AND city = 'Seattle'
Helps in:
SELECT * FROM Person WHERE city = 'Seattle'
But not in:
CREATE INDEX doubleindex ON Person (age, city)
Example:
SELECT * FROM Person WHERE age = 55
and even in:
![Page 85: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/85.jpg)
CREATE INDEX W ON Product(weight)CREATE INDEX P ON Product(price)
Indexes are Materialized Views
SELECT weight, priceFROM ProductWHERE weight > 10 and price < 100
Product(pid, name, weight, price, …)
W(pid, weight)P(pid, price)
SELECT x.weight, y.priceFROM W x, P yWHERE x.weight > 10 and y.price < 100 and x.pid = y.pid
![Page 86: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/86.jpg)
Denormalization
• Compute a view that is the join of several tables
• The view is now a relation that is not in normal form WHY ?
Dan Suciu -- 544, Winter 2011 86
CREATE VIEW CustomerPrice AS SELECT * FROM Purchase x, Product y WHERE x.product = y.pname
Purchase(customer, product, store)Product(pname, price)
![Page 87: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/87.jpg)
Semantic Caching
• Queries Q1, Q2, … have been executed, and their results are stored in main memory
• Now we need to compute a new query Q• Sometimes we can use the prior results in
answering Q• These queries can be seen as materialized
views
87
![Page 88: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/88.jpg)
Technical Challenges in Managing Views
• Synchronizing materialized views– A.k.a. incremental view maintenance,
or incremental view update
• Answering queries using views
Dan Suciu -- 544, Winter 2011 88
![Page 89: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/89.jpg)
Synchronizing Materialized Views• Immediate synchronization = after each
update• Deferred synchronization
– Lazy = at query time– Periodic– Forced = manual
89
Which one is best for: indexes, data warehouses, replication ?
![Page 90: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/90.jpg)
Incremental View Update
Dan Suciu -- 544, Winter 2011 90
CREATE VIEW FullOrder AS SELECT x.cid,x.pid,x.date,y.name,y.price FROM Order x, Product y WHERE x.pid = y.pid
UPDATE ProductSET price = price / 2WHERE pid = ‘12345’
Order(cid, pid, date)Product(pid, name, price)
UPDATE FullOrderSET price = price / 2WHERE pid = ‘12345’
No need to recompute the entire view !
![Page 91: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/91.jpg)
Incremental View Update
91
CREATE VIEW Categories AS SELECT DISTINCT category FROM Product
DELETE ProductWHERE pid = ‘12345’
Product(pid, name, category, price)
DELETE CategoriesWHERE category in (SELECT category FROM Product WHERE pid = ‘12345’)
It doesn’t work ! Why ? How can we fix it ?
![Page 92: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/92.jpg)
Incremental View Update
92
CREATE VIEW Categories AS SELECT category, count(*) as c FROM Product GROUP BY category
DELETE ProductWHERE pid = ‘12345’
Product(pid, name, category, price)
UPDATE CategoriesSET c = c-1 WHERE category in (SELECT category FROM Product WHERE pid = ‘12345’);DELETE CategoriesWHERE c = 0
![Page 93: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/93.jpg)
Answering Queries Using Views• We have several materialized views:
– V1, V2, …, Vn• Given a query Q
– Answer it by using views instead of base tables• Variation: Query rewriting using views
– Answer it by rewriting it to another query first• Example: if the views are indexes, then we
rewrite the query to use indexes
Dan Suciu -- 544, Winter 2011 93
![Page 94: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/94.jpg)
Rewriting Queries Using Views
94
Purchase(buyer, seller, product, store)Person(pname, city)
CREATE VIEW SeattleView AS SELECT y.buyer, y.seller, y.product, y.store FROM Person x, Purchase y WHERE x.city = ‘Seattle’ AND x.pname = y.buyer
SELECT y.buyer, y.sellerFROM Person x, Purchase yWHERE x.city = ‘Seattle’ AND x..pname = y.buyer AND y.product=‘gizmo’
Goal: rewrite this queryin terms of the view
Have thismaterializedview:
![Page 95: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/95.jpg)
Rewriting Queries Using Views
Dan Suciu -- 544, Winter 2011 95
SELECT y.buyer, y.sellerFROM Person x, Purchase yWHERE x.city = ‘Seattle’ AND x..pname = y.buyer AND y.product=‘gizmo’
SELECT buyer, sellerFROM SeattleViewWHERE product= ‘gizmo’
![Page 96: CSE 544 Data Models and Views](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816970550346895de14380/html5/thumbnails/96.jpg)
Rewriting is not always possible
96
CREATE VIEW DifferentView AS SELECT y.buyer, y.seller, y.product, y.store FROM Person x, Purchase y, Product z WHERE x.city = ‘Seattle’ AND x.pname = y.buyer AND y.product = z.name AND z.price < 100
SELECT y.buyer, y.sellerFROM Person x, Purchase yWHERE x.city = ‘Seattle’ AND x..pname = y.buyer AND y.product=‘gizmo’ SELECT buyer, seller
FROM DifferentViewWHERE product= ‘gizmo’
“Maximallycontainedrewriting”