bigdawg overview

Upload: albertrcarter

Post on 11-Apr-2017

Category: Science


0 download

TRANSCRIPT

Page 1: bigdawg overview

The BigDAWG Polystore System

Page 3: bigdawg overview

Database Challenges

• Enterprises encounter many databases and data models.
• Specialized systems provide performance, but add complexity.
• BigDAWG goals:

– Provide as much location (database) transparency as possible

– Support a single query notation and interface with limited extensions

Page 4: bigdawg overview

BigDAWG Design

• Many “Sizes”: support for heterogeneous storage and database engines

• Low Latency: support for real-time streaming databases for the Internet of Things

• Location Transparency: allow users to operate on data without explicit knowledge of its location

• Semantic Completeness: support the widest range of database operations with efficient connectors
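
As an illustration of location transparency, the sketch below routes the objects a query touches to the engines that store them, so the caller never names an engine. The catalog contents, object names, and the functions `locate` and `route` are hypothetical and are not taken from BigDAWG itself; the engine names are the ones that appear later in this deck.

```python
# Hypothetical catalog: logical object name -> engine that stores it.
# The assignments below are illustrative assumptions only.
CATALOG = {
    "device": "PostgreSQL",
    "waveform": "SciDB",
    "notes": "Accumulo",
}


def locate(object_name):
    """Return the engine holding object_name, so callers never name it."""
    try:
        return CATALOG[object_name]
    except KeyError:
        raise LookupError(f"unknown object: {object_name}") from None


def route(object_names):
    """Group the objects a query touches by the engine that stores them."""
    plan = {}
    for name in object_names:
        plan.setdefault(locate(name), []).append(name)
    return plan


if __name__ == "__main__":
    print(route(["device", "waveform"]))
```

A real polystore would also need connectors to move intermediate results between engines; this sketch covers only the lookup step.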

Page 10: bigdawg overview

Semantic Islands as the Tradeoff

• Islands are the trade-off between functionality and location transparency.

• Islands have:
- A Data Model
- A Language or Set of Operators
- A Set of Candidate Database Engines

User specifies the Island:

RELATIONAL(select avg(temp) from device)

ARRAY(multiply(A,B))
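
The island-scoped notation above can be split into an island name and an inner query. The following is a minimal sketch of that step, assuming the `ISLAND(query)` shape shown on the slide; the function `split_island_query` and the `KNOWN_ISLANDS` set are hypothetical names, not part of the BigDAWG API.

```python
import re

# Only the two islands named on the slide; a real system may define more.
KNOWN_ISLANDS = {"RELATIONAL", "ARRAY"}

# ISLAND( ... ) with the body running to the final closing parenthesis.
_SCOPE = re.compile(r"^\s*([A-Za-z]+)\s*\((.*)\)\s*$", re.DOTALL)


def split_island_query(text):
    """Split 'ISLAND(query)' into (island, query) or raise ValueError."""
    match = _SCOPE.match(text)
    if match is None:
        raise ValueError("expected ISLAND(query)")
    island = match.group(1).upper()
    body = match.group(2).strip()
    if island not in KNOWN_ISLANDS:
        raise ValueError(f"unknown island: {island}")
    return island, body


if __name__ == "__main__":
    print(split_island_query("RELATIONAL(select avg(temp) from device)"))
    print(split_island_query("ARRAY(multiply(A,B))"))
```

The greedy match keeps nested parentheses such as `avg(temp)` inside the query body, and the inner query is passed along untouched for the island's own language to interpret.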

* An Island offers the intersection of its engines' functionality

* BigDAWG offers the union of its Islands

* Islands are logical
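
One reading of the intersection and union points is in terms of operator sets: an island exposes only the operators that all of its candidate engines support, and the polystore exposes whatever any island exposes. The sketch below illustrates that reading. The engine names, operator names, and the functions `island_ops` and `polystore_ops` are placeholders invented for the example.

```python
def island_ops(engine_ops, engines):
    """Operators every candidate engine of an island supports (intersection)."""
    sets = [engine_ops[name] for name in engines]
    return set.intersection(*sets) if sets else set()


def polystore_ops(engine_ops, islands):
    """Operators reachable through at least one island (union of islands)."""
    result = set()
    for engines in islands.values():
        result |= island_ops(engine_ops, engines)
    return result


# Placeholder capability sets; not real engine feature lists.
ENGINE_OPS = {
    "engine_a": {"select", "join", "aggregate"},
    "engine_b": {"select", "aggregate", "multiply"},
    "engine_c": {"multiply", "transpose"},
}

# Placeholder island membership.
ISLANDS = {
    "RELATIONAL": ["engine_a", "engine_b"],
    "ARRAY": ["engine_b", "engine_c"],
}

if __name__ == "__main__":
    print(sorted(island_ops(ENGINE_OPS, ISLANDS["RELATIONAL"])))
    print(sorted(polystore_ops(ENGINE_OPS, ISLANDS)))
```

Under these placeholder sets, adding an engine to an island can only shrink or preserve what the island offers, while adding an island can only grow what the polystore offers, which is the trade-off between functionality and location transparency.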

Page 11: bigdawg overview

Hackathon to Prototype BigDAWG

• BigDAWG Goal: Harness the power of advanced database engines through a unified interface

• BigDAWG is the vision of the ISTC for Big Data: to develop future technologies and interfaces that support knowledge extraction from big data

• Recent Hackathon at MIT BeaverWorks produced a BigDAWG prototype

Page 12: bigdawg overview

Using BigDAWG Polystore for Medical Big Data

• Data Explorer

• Tell Me Something Interesting

• Text Analytics

• Heavy Analytics

• Streaming Analytics

Page 13: bigdawg overview

Big DAWG Prototype - Island Types

[Figure: architecture diagram with layers Client, Server, Big DAWG API, Islands, Engines]

Client applications: Explorer (ScalaR); Tell Me Something (SeeDB); Searchlight; Text Analytics (D4M); Heavy Analytics (Myria); Streaming (S-Store). Other client labels: S-PI, Watch, Wearables.

Islands: D4M Associative Arrays; Myria (Iterative); Streams; Data Model Islands (e.g. ARRAY, TEXT)

Engines: PostgreSQL, SciDB, MyriaX, S-Store, Accumulo

Data: Tabular Clinical Data; Historical Waveform Data; Text Clinical Data (i.e. chart notes); Streaming Waveform Data; Intermediate results