TRANSCRIPT
![Page 1: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/1.jpg)
LIVE
A lineage-supported, versioned DBMS
Anish Das Sarma Martin Theobald Jennifer Widom
![Page 2: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/2.jpg)
Agenda
ULDB Data Model and the Trio System: Uncertainty & Lineage
LIVE Data Model (LDM): Uncertainty, Lineage & Versioning
Data Modifications: Insert/Delete Tuples, Update Values, Update Confidences
Query Evaluation: Valid-At vs. Snapshot Queries, Interval Computations, Confidence Computations, Complexity
Experiments/Conclusions
21.04.23 LIVE - A lineage-supported, versioned DBMS
![Page 3: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/3.jpg)
ULDB Data Model
Different types of uncertainty: 1. Tuple Alternatives 2. ‘?’ (Maybe) Annotations 3. Confidences
Implementation of the ULDB data model: Trio System
TriQL query language
TrioExplorer browser frontend, trioplus client, API
Enhanced PostgreSQL backend (SPI)
Search for “Stanford Trio”
![Page 4: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/4.jpg)
ULDBs – Alternatives
1. Alternatives: uncertainty about attribute values
2. ‘?’ (Maybe) Annotations 3. Confidences
Saw (witness, color, car)
Amy red, Honda ∥ red, Toyota ∥ orange, Mazda
Three possible worlds
![Page 5: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/5.jpg)
ULDBs – Maybe Annotations
1. Alternatives 2. ‘?’ (Maybe): uncertainty about tuple presence 3. Confidences
Saw (witness, color, car)
Amy red, Honda ∥ red, Toyota ∥ orange, Mazda
Betty blue, Acura ?
Six possible worlds
![Page 6: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/6.jpg)
ULDBs – Confidences
1. Alternatives 2. ‘?’ (Maybe) Annotations 3. Confidences: weighted uncertainty
Saw (witness, color, car)
Amy red, Honda 0.5 ∥ red, Toyota 0.3 ∥ orange, Mazda 0.2
Betty blue, Acura 0.6 ?
Six possible worlds, each with a probability
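The six worlds and their probabilities can be enumerated directly. This is a minimal Python sketch, assuming (as an illustration, not something stated on the slide) that Amy's and Betty's tuples are independent and that Betty's tuple is absent with probability 1 - 0.6. The names `amy`, `betty`, and `possible_worlds` are hypothetical.

```python
from itertools import product

# Alternatives of Amy's tuple with their confidences (from the slide).
amy = [(("red", "Honda"), 0.5), (("red", "Toyota"), 0.3), (("orange", "Mazda"), 0.2)]
# Betty's tuple is a maybe-tuple: present with 0.6, otherwise absent (None).
# Assumption: the absence probability is 1 - 0.6.
betty = [(("blue", "Acura"), 0.6), (None, 0.4)]

def possible_worlds():
    """Return a list of (world, probability) pairs, one per possible world."""
    worlds = []
    for (a_vals, a_conf), (b_vals, b_conf) in product(amy, betty):
        world = [("Amy",) + a_vals]
        if b_vals is not None:
            world.append(("Betty",) + b_vals)
        # Assumption: the two tuples are independent, so confidences multiply.
        worlds.append((world, a_conf * b_conf))
    return worlds

worlds = possible_worlds()
total_probability = sum(p for _, p in worlds)
```

Three alternatives for Amy times two states for Betty give the six worlds, and the world probabilities sum to 1.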
![Page 7: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/7.jpg)
ULDBs – Closure
Saw (witness, car)
Cathy Mazda ∥ Honda
Drives (person, car)
Jimmy, Toyota ∥ Jimmy, Mazda
Billy, Honda ∥ Frank, Honda
Hank, Honda
Suspects = πperson(Saw ⋈ Drives)
Suspects
Jimmy ?
Billy ∥ Frank ?
Hank ?
Does not, and CANNOT, correctly capture the possible worlds in the result!
![Page 8: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/8.jpg)
ULDBs – Lineage
ID Saw (witness, car)
11 Cathy Honda ∥ Mazda
ID Drives (person, car)
21 Jimmy, Toyota ∥ Jimmy, Mazda
22 Billy, Honda ∥ Frank, Honda
23 Hank, Honda
Suspects = πperson(Saw ⋈ Drives)
ID Suspects
31 Jimmy ?
32 Billy ∥ Frank ?
33 Hank ?
λ(31) = (11,2) ∧ (21,2)
λ(32,1) = (11,1) ∧ (22,1); λ(32,2) = (11,1) ∧ (22,2)
λ(33) = (11,1) ∧ 23
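To see why lineage is needed, the query can be evaluated in every possible world of Saw and Drives and compared with what Suspects would encode if its three maybe-tuples were treated as independent. This is a minimal Python sketch with hypothetical helper names; it uses only the data on the slide.

```python
from itertools import product

# Alternatives copied from the slide, keyed by the slide's tuple IDs.
saw = {11: ["Honda", "Mazda"]}
drives = {
    21: [("Jimmy", "Toyota"), ("Jimmy", "Mazda")],
    22: [("Billy", "Honda"), ("Frank", "Honda")],
    23: [("Hank", "Honda")],
}

def suspects_worlds():
    """Evaluate the join-and-project query in every possible world of the inputs."""
    ids = sorted(drives)
    worlds = set()
    for seen_car in saw[11]:
        for choice in product(*(drives[i] for i in ids)):
            worlds.add(frozenset(p for p, car in choice if car == seen_car))
    return worlds

def independent_worlds():
    """Worlds encoded by Suspects if its three '?' tuples were independent."""
    options = [[None, "Jimmy"], [None, "Billy", "Frank"], [None, "Hank"]]
    return {
        frozenset(p for p in combo if p is not None)
        for combo in product(*options)
    }

correct = suspects_worlds()
without_lineage = independent_worlds()
```

The query has four result worlds ({}, {Jimmy}, {Billy, Hank}, {Frank, Hank}), while the lineage-free reading admits twelve, including impossible ones such as {Jimmy, Hank}. The lineage formulas rule those out because both 31 and 33 depend on different alternatives of tuple 11.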
![Page 9: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/9.jpg)
ULDBs – Summary
1. Alternatives 2. ‘?’ (Maybe) Annotations 3. Confidences 4. Lineage
Uncertainty-Lineage Databases (ULDBs)
ULDBs are closed and complete
![Page 10: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/10.jpg)
Lineage & Confidences
Can compute the confidence of a result tuple from its lineage alone.
#P-complete for general Boolean formulas
Approximation algorithms: Luby-Karp, etc.
ID Saw(witness, car)
11 (Mary, Honda) : 0.8
12 (Susan, Honda) : 0.9
13 (Betty, Honda) : 0.5
Select distinct car from Saw;
ID SuspectCars(car)
21 Honda : ?
λ(21) = (11 ∨ 12 ∨ 13)
P(21) = 1 – (1-0.8) × (1-0.9) × (1-0.5) = 0.99
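The arithmetic above can be checked in a few lines. This is a minimal Python sketch, assuming the three Saw tuples are independent; `or_confidence` and `brute_force_confidence` are hypothetical names. The brute-force variant handles an arbitrary Boolean lineage formula by summing over all truth assignments, which is exponential in the number of tuples and hints at why the general problem is hard.

```python
from itertools import product
from math import prod

def or_confidence(confidences):
    """Confidence of a disjunction of independent tuples: 1 - prod(1 - p_i)."""
    return 1.0 - prod(1.0 - p for p in confidences)

def brute_force_confidence(formula, confidences):
    """Sum the probability of every truth assignment that satisfies `formula`.

    `formula` is a callable taking a dict {tuple_id: bool}; this interface is
    an assumption made for illustration.
    """
    ids = sorted(confidences)
    total = 0.0
    for bits in product([False, True], repeat=len(ids)):
        world = dict(zip(ids, bits))
        if formula(world):
            total += prod(
                confidences[i] if world[i] else 1.0 - confidences[i] for i in ids
            )
    return total

saw = {11: 0.8, 12: 0.9, 13: 0.5}
p_closed = or_confidence(saw.values())
p_brute = brute_force_confidence(lambda w: w[11] or w[12] or w[13], saw)
```

Both routes give 0.99 (up to floating-point rounding) for λ(21).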
![Page 11: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/11.jpg)
Versioning (LDM Data Model)
ID Photo(Number,Name) (version 2)
11 (1, Amy) [0,1] : 1.0
12 (1, Bob) [0,∞] : 0.6
13 (2, Carl) [0,1] : 0.3
14 (3, Dale) [1,1] : 0.1
Version intervals for tuples
Contiguous version numbers 0,…,
Database has current version vD
Tuples have a validity interval [s, e]
Valid-At Queries: Select * from Photo valid-at 2;
ID Photo(Number,Name) (version 2)
12 (1, Bob) [0,∞] : 0.6
Snapshot Queries: View Photo at 2;
ID Photo@2(Number,Name)
12 (1, Bob) : 0.6
Possible Worlds: LDM databases encode lists of sets of possible worlds.
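The difference between the two query types can be sketched in Python over the Photo table. This is an illustration only: the list-of-tuples layout and the function names are assumptions, and an open-ended interval is modelled with `math.inf`.

```python
import math

# (id, values, (start, end), confidence), copied from the Photo table.
photo = [
    (11, (1, "Amy"), (0, 1), 1.0),
    (12, (1, "Bob"), (0, math.inf), 0.6),
    (13, (2, "Carl"), (0, 1), 0.3),
    (14, (3, "Dale"), (1, 1), 0.1),
]

def valid_at(relation, v):
    """Keep tuples whose interval contains version v; intervals stay attached."""
    return [row for row in relation if row[2][0] <= v <= row[2][1]]

def snapshot(relation, v):
    """Same selection as valid_at, but the result is an unversioned relation."""
    return [
        (tid, values, conf)
        for tid, values, (start, end), conf in relation
        if start <= v <= end
    ]

at_2 = valid_at(photo, 2)
view_2 = snapshot(photo, 2)
```

At version 2 both return only tuple 12; the valid-at result keeps the interval, the snapshot drops it.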
![Page 12: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/12.jpg)
Data Modifications – Insert
Insert Tuple: Insert t with version [vD+1,∞]
commit; Increase vD
ID People(Name, State, Job) (versions 0, 1, 2)
21 (Bob, NY, Analyst) [0,∞] : 1.0
22 (Carl, IL, Teacher) [0,∞] : 1.0
23 (David, PA, Manager) [0,∞] : 0.6
24 (Frank, CA, Eng.) [1,∞] : 0.3
25 (David, PA, CEO) [2,∞] : 0.3
![Page 13: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/13.jpg)
Data Modifications – Delete
Insert Tuple: Insert t with version [vD+1,∞]
Delete Tuple: Set end(t) to vD
commit; Increase vD
ID People(Name, State, Job) (versions 2, 3)
21 (Bob, NY, Analyst) [0,∞] : 1.0
22 (Carl, IL, Teacher) [0,∞] : 1.0, then [0,2] : 1.0 after the delete
23 (David, PA, Manager) [0,∞] : 0.6
24 (Frank, CA, Eng.) [1,∞] : 0.3
25 (David, PA, CEO) [2,∞] : 0.3
![Page 14: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/14.jpg)
Data Modifications – Update
Insert Tuple: Insert t with version [vD+1,∞]
Delete Tuple: Set end(t) to vD
Update Value: Set end(t) to vD; Insert t’ with version [vD+1,∞]
commit; Increase vD
ID People(Name, State, Job) (versions 3, 4)
21 (Bob, NY, Analyst) [0,∞] : 1.0, then [0,3] : 1.0 after the update
22 (Carl, IL, Teacher) [0,2] : 1.0
23 (David, PA, Manager) [0,∞] : 0.6
24 (Frank, CA, Eng.) [1,∞] : 0.3
25 (David, PA, CEO) [2,∞] : 0.3
21 (Bob, CA, Student) [4,∞] : 0.3
![Page 15: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/15.jpg)
Data Modifications – Update
Insert Tuple: Insert t with version [vD+1,∞]
Delete Tuple: Set end(t) to vD
Update Value: Set end(t) to vD; Insert t’ with version [vD+1,∞]
Update Probability: Set end(t) to vD; Insert t’=t with probability p’ and version [vD+1,∞]
commit; Increase vD
ID People(Name, State, Job) (versions 4, 5)
21 (Bob, NY, Analyst) [0,3] : 1.0
22 (Carl, IL, Teacher) [0,2] : 1.0
23 (David, PA, Manager) [0,∞] : 0.6
24 (Frank, CA, Eng.) [1,∞] : 0.3
25 (David, PA, CEO) [2,∞] : 0.3
21 (Bob, CA, Student) [4,∞] : 0.3, then [4,4] : 0.3 after the update
21 (Bob, CA, Student) [5,∞] : 0.7
![Page 16: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/16.jpg)
Data Modifications – Summary
Insert Tuple: Insert t with version [vD+1,∞]
Delete Tuple: Set end(t) to vD
Update Value: Set end(t) to vD; Insert t’ with version [vD+1,∞]
Update Probability: Set end(t) to vD; Insert t’=t with probability p’ and version [vD+1,∞]
Possible worlds: Updates may create duplicate worlds, which are merged (at any version v).
ID People(Name, State, Job) (versions 4, 5)
21 (Bob, NY, Analyst) [0,3] : 1.0
22 (Carl, IL, Teacher) [0,2] : 1.0
23 (David, PA, Manager) [0,∞] : 0.6
24 (Frank, CA, Eng.) [1,∞] : 0.3
25 (David, PA, CEO) [2,∞] : 0.3
26 (Bob, CA, Student) [4,∞] : 0.3, then [4,4] : 0.3 after the update
21 (Bob, CA, Student) [5,∞] : 0.7
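The four modification rules can be replayed on the People example. This is a minimal Python sketch, not the system's implementation: the class `VersionedTable`, its method names, and the use of `math.inf` for an open-ended interval are assumptions made for illustration.

```python
import math
from dataclasses import dataclass

INF = math.inf  # assumption: open-ended intervals end at infinity

@dataclass
class Row:
    values: tuple
    start: int
    end: float
    conf: float

class VersionedTable:
    def __init__(self):
        self.vD = 0      # current database version
        self.rows = []

    def load(self, values, conf):
        """Initial tuple, valid from version 0."""
        row = Row(values, 0, INF, conf)
        self.rows.append(row)
        return row

    def insert(self, values, conf):
        """Insert t with version [vD+1, inf]."""
        row = Row(values, self.vD + 1, INF, conf)
        self.rows.append(row)
        return row

    def delete(self, row):
        """Set end(t) to vD; the row itself is never removed."""
        row.end = self.vD

    def update_value(self, row, new_values, conf):
        """Close the old tuple and insert the new one starting at vD+1."""
        self.delete(row)
        return self.insert(new_values, conf)

    def update_confidence(self, row, new_conf):
        """Close the old tuple and re-insert the same values with a new confidence."""
        self.delete(row)
        return self.insert(row.values, new_conf)

    def commit(self):
        self.vD += 1

people = VersionedTable()
bob = people.load(("Bob", "NY", "Analyst"), 1.0)
carl = people.load(("Carl", "IL", "Teacher"), 1.0)
david = people.load(("David", "PA", "Manager"), 0.6)
frank = people.insert(("Frank", "CA", "Eng."), 0.3)
people.commit()                                   # vD = 1
ceo = people.insert(("David", "PA", "CEO"), 0.3)
people.commit()                                   # vD = 2
people.delete(carl)
people.commit()                                   # vD = 3
student = people.update_value(bob, ("Bob", "CA", "Student"), 0.3)
people.commit()                                   # vD = 4
student_07 = people.update_confidence(student, 0.7)
people.commit()                                   # vD = 5
```

Because a delete only closes an interval, every earlier version of the table remains queryable after the replay.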
![Page 17: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/17.jpg)
Query Evaluation
1) Data Computation (regular SQL, including lineage)
2) Interval Computation (stored procedure)
[Diagram: D is an encoding of the possible worlds D1, D2, …, Dn at each of the versions @ (0), @ (1), …, @ (vD). Applying Q on each world gives Q(D1), Q(D2), …, Q(Dn). The implementation of Q (operational semantics) maps D to D + Result, which encodes those result worlds.]
![Page 18: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/18.jpg)
Lineage, Confidences & Versions
Can compute the confidence of any result tuple from its lineage alone.
Can compute the version interval of any result tuple from its lineage alone.
![Page 19: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/19.jpg)
Version Interval Computation
Positive Lineage (disjunctions & conjunctions)
In the lineage formula λ(t):
Replace every tuple t’ by its version interval
Replace every ∨ with ∪ and every ∧ with ∩
ID Saw(witness, car) (version 3)
11 (Mary, Honda) [1,∞] : 0.8
12 (Susan, Honda) [2,∞] : 0.9
13 (Betty, Honda) [3,∞] : 0.5
Select distinct car from Saw;
ID SuspectCars(car) (version 3)
21 (Honda) [1,∞] : 0.99
λ(21) = (11 ∨ 12 ∨ 13)
P(21) = 1 – (1-0.8) × (1-0.9) × (1-0.5) = 0.99
![Page 20: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/20.jpg)
Version & Confidence Computation
Positive Lineage (disjunctions & conjunctions)
In the lineage formula λ(t):
Replace every tuple t’ by its version interval
Replace every ∨ with ∪ and every ∧ with ∩
ID Saw(witness, car) (version 3)
11 (Mary, Honda) [1,∞] : 0.8
12 (Susan, Honda) [2,∞] : 0.9
13 (Betty, Honda) [3,∞] : 0.5
Select distinct car from Saw;
ID SuspectCars(car) (version 3)
21 (Honda) [1,∞] : 0.99
Select distinct car from Saw valid-at 2;
ID SuspectCars(car) (version 2)
21 (Honda) [1,∞] : 0.98
λ(21) = (11 ∨ 12)
P(21) = 1 – (1-0.8) × (1-0.9) = 0.98
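The interval rule for positive lineage can be sketched as a small recursive evaluator. This is a Python illustration under stated assumptions: lineage is written as nested tuples such as `("or", 11, 12)`, a version set is a sorted list of closed integer intervals, adjacent intervals are merged because version numbers are contiguous, and `math.inf` stands for an open end. All function names are hypothetical, and the confidence step assumes independent base tuples.

```python
import math
from math import prod

INF = math.inf

def union(a, b):
    """Union of two interval lists, merging overlapping or adjacent intervals."""
    merged = []
    for start, end in sorted(a + b):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def intersect(a, b):
    """Pairwise intersection of two interval lists."""
    out = []
    for s1, e1 in a:
        for s2, e2 in b:
            start, end = max(s1, s2), min(e1, e2)
            if start <= end:
                out.append((start, end))
    return sorted(out)

def interval_of(formula, intervals):
    """Replace tuples by their intervals, 'or' by union and 'and' by intersection."""
    if not isinstance(formula, tuple):
        return [intervals[formula]]
    op, *args = formula
    parts = [interval_of(arg, intervals) for arg in args]
    result = parts[0]
    for part in parts[1:]:
        result = union(result, part) if op == "or" else intersect(result, part)
    return result

saw_intervals = {11: (1, INF), 12: (2, INF), 13: (3, INF)}
saw_conf = {11: 0.8, 12: 0.9, 13: 0.5}

def distinct_car_valid_at(v):
    """Interval and confidence of the 'Honda' result using only tuples valid at v.

    Assumes at least one Saw tuple is valid at v.
    """
    ids = [i for i, (s, e) in sorted(saw_intervals.items()) if s <= v <= e]
    interval = interval_of(("or", *ids), saw_intervals)
    confidence = 1.0 - prod(1.0 - saw_conf[i] for i in ids)
    return interval, confidence

interval_2, conf_2 = distinct_car_valid_at(2)
interval_3, conf_3 = distinct_car_valid_at(3)
```

At version 2 only tuples 11 and 12 contribute, so the confidence is about 0.98 with interval [1,∞]; at version 3 all three contribute and the confidence rises to about 0.99.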
![Page 21: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/21.jpg)
Interval Computations & Query Plans
Can decouple interval computation from data computation
Or: push interval computation into query plans only when there is no negation.
Select R.A from R,S Where R.A=S.A;
r=(a)[0,10] s=(a)[5,15]
t=(a)[5,10]
Select R.A from R EXCEPT ( Select R.A from R EXCEPT Select S.A from S );
Inner EXCEPT (–): r=(a)[0,10] s=(a)[5,15], result u=(a)[0,10]
Outer EXCEPT (–): r=(a)[0,10] u=(a)[0,10], result t=(a)[0,10]
![Page 22: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/22.jpg)
Complexity Results
Positive Lineage (disjunctions & conjunctions)
Version interval computation: PTIME (linear)
Confidence computation: #P-complete
Arbitrary Lineage (including negation)
Version interval computation: PTIME (linear) if all confidences are known; NP-hard if confidences are not known (need to check for idempotence of negated tuples)
Confidence computation: #P-complete
![Page 23: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/23.jpg)
Experiments – Setup
Probabilistic & versioned TPC-H setting
Queries over Lineitem, Orders tables with varying join selectivity from 0.1% to 1% (6,000-60,000 and 1,500-15,000 tuples for Lineitem & Orders)
Update 0.1% to 1% of the input data
Assign probabilities within [0,1] uniform-randomly to tuples
Additional indexes for versioning
Two B+-trees on (start, end) and end points of intervals
Rewrite valid-at & snapshot queries using WHERE (start ≤ v ≤ end) predicates
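The query rewrite can be tried out with Python's built-in `sqlite3` module. This is a sketch under several assumptions that are not from the slides: SQLite stands in for the PostgreSQL backend, its ordinary indexes stand in for the two B+-trees, the interval columns are named `v_start` and `v_end`, and an open-ended interval is stored as the sentinel `OPEN_END`. The data is the Photo table from the versioning slide.

```python
import sqlite3

OPEN_END = 2**31 - 1  # assumption: sentinel standing in for an open-ended interval

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE photo (id INTEGER PRIMARY KEY, number INTEGER, name TEXT, "
    "v_start INTEGER, v_end INTEGER, conf REAL)"
)
conn.executemany(
    "INSERT INTO photo VALUES (?, ?, ?, ?, ?, ?)",
    [
        (11, 1, "Amy", 0, 1, 1.0),
        (12, 1, "Bob", 0, OPEN_END, 0.6),
        (13, 2, "Carl", 0, 1, 0.3),
        (14, 3, "Dale", 1, 1, 0.1),
    ],
)
# One index on (start, end) and one on the end points of the intervals.
conn.execute("CREATE INDEX photo_start_end ON photo (v_start, v_end)")
conn.execute("CREATE INDEX photo_end ON photo (v_end)")

def valid_at(v):
    """Valid-at query rewritten as a range predicate; interval columns are kept."""
    return conn.execute(
        "SELECT id, number, name, v_start, v_end, conf FROM photo "
        "WHERE v_start <= ? AND ? <= v_end ORDER BY id",
        (v, v),
    ).fetchall()

def snapshot(v):
    """Snapshot query: same predicate, but the interval columns are projected away."""
    return conn.execute(
        "SELECT id, number, name, conf FROM photo "
        "WHERE v_start <= ? AND ? <= v_end ORDER BY id",
        (v, v),
    ).fetchall()

rows_at_2 = valid_at(2)
view_at_2 = snapshot(2)
```

Both query types reduce to the same range predicate, which is what makes indexes on the interval end points useful.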
![Page 24: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/24.jpg)
Experiments – Results (I)
Join query: Overhead of versioned system vs. non-versioned system (versions not computed)
Join query: Overhead of computing versions (versioned system)
Axis unit: (%)
![Page 25: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/25.jpg)
Experiments – Results (II)
Join query: Progressive data updates (overwrite multiple times)
Join query: Valid-at queries vs. full version computation
![Page 26: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/26.jpg)
Experiments – Results (III)
Overhead of version computation, different query types (1% data modified)
![Page 27: LIVE A lineage-supported, versioned DBMS Anish Das Sarma Martin Theobald Jennifer Widom](https://reader036.vdocuments.mx/reader036/viewer/2022062803/56649c6d5503460f9491ed75/html5/thumbnails/27.jpg)
Conclusions
LDMs are closed and complete
Generalizes to full ULDB data model (including value alternatives & maybe (?) annotations)
Can employ lineage also for update propagations
Supports all of INSERT/DELETE/UPDATE with INTERSECT/UNION/EXCEPT set operations
[Diagram: Lineage, Uncertainty, and Versioning combined in one DBMS]