Astronomy, Petabytes, and MySQL

MySQL Conference, Santa Clara, CA, April 16, 2008

Kian-Tat Lim, Stanford Linear Accelerator Center

Outline

LSST

LSST Database

LSST Database + MySQL

LSST

What Is It?

Why Build It?

Telescope

Proposed telescope to be built in Chile

Large

3.2 gigapixel camera

8.4 meter diameter mirror

Synoptic Survey

Wide

Deep

Fast

LSST: Why Build It?

Dark Matter and Energy

Photo: J. A. Tyson, W. Colley, E. L. Turner, and NASA

Variable Objects

Transient Objects

Moving Objects

Photo: D. Roddy, Lunar and Planetary Institute

LSST Database

What’s In It?

How Big?

How Often?

What Queries?

Unusual Needs

Database: Components

Image Metadata

Moving Objects Catalog

Object Catalog

Source Catalog

Difference Image Source Catalog

Provenance

Statistics

Summaries

Calibration

Engineering and Facility Database

Astronomical Objects

Sources

Changes

Image Metadata

Calibration and Facility

LSST Database: How Big?

Sagans of Rows

49 billion objects

2.8 trillion sources

Lots of Columns

308 columns for objects

56 columns for sources

(for now)

Database Size

Grows to >14 PB
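
A rough back-of-envelope check of how the row and column counts on the preceding slides translate into bytes. The 8 bytes per column is an assumption made only for illustration (the slides give no column widths), and the sketch covers just one raw copy of the object and source tables, so it should not be expected to reproduce the >14 PB total, which the slides do not break down.

```python
# Back-of-envelope table sizes from the row and column counts on the slides.
# ASSUMPTION: every column averages 8 bytes (not stated in the talk).
BYTES_PER_COLUMN = 8  # hypothetical average column width

TABLES = {
    # name: (rows, columns), both figures taken from the slides
    "objects": (49e9, 308),
    "sources": (2.8e12, 56),
}


def raw_size_tb(rows, columns, bytes_per_column=BYTES_PER_COLUMN):
    """Raw (unindexed, uncompressed) table size in terabytes (10**12 bytes)."""
    return rows * columns * bytes_per_column / 1e12


sizes = {name: raw_size_tb(rows, cols) for name, (rows, cols) in TABLES.items()}
total_tb = sum(sizes.values())

for name, tb in sizes.items():
    print(f"{name}: {tb:,.0f} TB")
print(f"total: {total_tb / 1000:.2f} PB")
```

Under the 8-byte assumption the two tables come to about 0.12 PB and 1.25 PB of raw data, with the source table dominating.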

LSST Database: How Often?

Frequency

Nightly updates

Semi-annual data releases

LSST Database: What Queries?

Queries

• All about an object
• All objects meeting criteria
• All objects near objects meeting criteria
• All objects with interesting time series
• All pairs of objects with similar time series
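
The first three query shapes listed above can be sketched as SQL against a toy table. This is an illustration only: the table and column names (`Object`, `objectId`, `ra`, `decl`, `rMag`) are hypothetical, SQLite's in-memory database stands in for MySQL so the example is self-contained, and "near" is approximated by a simple box in ra/decl that ignores spherical geometry.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Object (objectId INTEGER PRIMARY KEY, ra REAL, decl REAL, rMag REAL)"
)
conn.executemany(
    "INSERT INTO Object VALUES (?, ?, ?, ?)",
    [
        (1, 10.00, -30.00, 17.5),
        (2, 10.01, -30.01, 22.0),
        (3, 50.00, 5.00, 18.0),
        (4, 50.50, 5.00, 23.1),
    ],
)

# 1. All about an object: a primary-key lookup.
one = conn.execute("SELECT * FROM Object WHERE objectId = ?", (3,)).fetchone()

# 2. All objects meeting criteria: a filter on a measured column.
bright = [
    row[0]
    for row in conn.execute(
        "SELECT objectId FROM Object WHERE rMag < 19 ORDER BY objectId"
    )
]

# 3. All objects near objects meeting criteria: a self-join, with "near"
#    approximated as a 0.1-degree box around each bright object.
near_bright = [
    row[0]
    for row in conn.execute(
        """
        SELECT DISTINCT n.objectId
        FROM Object AS b
        JOIN Object AS n
          ON n.objectId != b.objectId
         AND ABS(n.ra - b.ra) < 0.1
         AND ABS(n.decl - b.decl) < 0.1
        WHERE b.rMag < 19
        ORDER BY n.objectId
        """
    )
]
```

The two time-series query shapes need a per-observation table joined back to objects and are not covered by this sketch.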

Unusual Needs

Flexibility

Provenance

LSST Database + MySQL

Why MySQL?

Scalability?

Performance?

MySQL

Relational database management system

Open Source

Vibrant community

Strong company support

Hardware

Runs on commodity hardware

In-Memory Tables

Needed for near-real-time processing

LSST Database + MySQL: Scalability?

“MySQL Grid”

Partitioning

Large tables partitioned spatially
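
One simple way to picture spatial partitioning is to assign each row to a chunk according to its position on the sky. The sketch below uses a fixed grid in right ascension and declination; the grid scheme, the 10-degree chunk size, and the names `chunk_id` and `partition` are assumptions for illustration, since the slide does not say how the sky is divided.

```python
import math
from collections import defaultdict


def chunk_id(ra_deg, decl_deg, chunk_deg=10.0):
    """Map a sky position to an integer chunk number on a fixed ra/decl grid."""
    chunks_per_row = math.ceil(360.0 / chunk_deg)
    last_row = math.ceil(180.0 / chunk_deg) - 1
    col = int((ra_deg % 360.0) // chunk_deg)
    # Shift declination from [-90, 90] to [0, 180]; clamp the north pole
    # into the last row instead of opening a row of its own.
    row = min(int((decl_deg + 90.0) // chunk_deg), last_row)
    return row * chunks_per_row + col


def partition(rows, chunk_deg=10.0):
    """Group rows (dicts with 'ra' and 'decl' keys) by chunk number."""
    chunks = defaultdict(list)
    for row in rows:
        chunks[chunk_id(row["ra"], row["decl"], chunk_deg)].append(row)
    return dict(chunks)


rows = [
    {"objectId": 1, "ra": 10.00, "decl": -30.00},
    {"objectId": 2, "ra": 10.01, "decl": -30.01},
    {"objectId": 3, "ra": 50.00, "decl": 5.00},
]
chunks = partition(rows)
```

Objects 1 and 2 are close together on the sky yet land in different chunks because they straddle a grid boundary, so a near-neighbor query under a scheme like this has to look at adjacent chunks as well.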

Replication

Dimension tables likely replicated

Needs: Distributor/Combiner

LSST will build prototype

Need long-term support
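
A minimal sketch of the distributor/combiner idea: the same partial query is sent to every partition, and the per-partition answers are merged into one result. This is not the LSST prototype; the function names are hypothetical, and in-memory SQLite databases stand in for separate MySQL servers so the example runs on its own.

```python
import sqlite3


def make_node(rows):
    """Stand-in for one database server holding one partition (hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Object (objectId INTEGER PRIMARY KEY, rMag REAL)")
    conn.executemany("INSERT INTO Object VALUES (?, ?)", rows)
    return conn


def distribute(nodes, sql, params=()):
    """Distributor: run the same partial query on every node."""
    return [node.execute(sql, params).fetchone() for node in nodes]


def combine(partials):
    """Combiner: merge per-node (count, minimum) pairs into a global answer."""
    total = sum(count for count, _ in partials)
    minimums = [m for _, m in partials if m is not None]
    return total, (min(minimums) if minimums else None)


nodes = [
    make_node([(1, 17.5), (2, 22.0)]),
    make_node([(3, 18.0)]),
    make_node([]),  # an empty partition still answers: COUNT 0, MIN NULL
]
partials = distribute(
    nodes, "SELECT COUNT(*), MIN(rMag) FROM Object WHERE rMag < ?", (19,)
)
result = combine(partials)
```

COUNT and MIN merge directly from per-node values. An average would need each node to return a sum and a count, because per-node averages cannot simply be averaged again.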

LSST Database + MySQL: Performance?

Per-Column Indexing

2X data size

Needs: Optimizer

Efficient use of multiple (20-30) indexes

Needs: Indexes

Bitmap/compressed indexes
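
To show why bitmap indexes are attractive when a query touches many indexed columns, here is a small sketch that keeps one bitmap per distinct column value and answers a two-column predicate with a single bitwise AND. Python integers serve as the bitmaps, the column names and data are hypothetical, and compression is not modeled.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Return {value: bitmap}, where bit i is set if row i holds that value."""
    index = defaultdict(int)
    for row_number, value in enumerate(values):
        index[value] |= 1 << row_number
    return dict(index)


def rows_of(bitmap):
    """Expand a bitmap back into a sorted list of row numbers."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical low-cardinality columns for six rows.
obj_type = ["star", "galaxy", "star", "star", "galaxy", "star"]
variable = [True, False, False, True, True, False]

type_index = build_bitmap_index(obj_type)
var_index = build_bitmap_index(variable)

# WHERE obj_type = 'star' AND variable: one bitwise AND of two bitmaps.
matches = rows_of(type_index["star"] & var_index[True])
```

Each additional indexed predicate adds one more AND or OR over bitmaps, which is the kind of multi-index combination the optimizer slide asks for.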

Needs: Storage Engine

“Shared scan” for long-running full-table queries
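
The slide does not define "shared scan"; one common reading is that a single pass over the table feeds every pending full-table query, instead of each query reading the table separately. The sketch below illustrates that reading only. The function name, the predicate-based query representation, and the data are all hypothetical.

```python
def shared_scan(table, queries):
    """Scan `table` once and evaluate every query's predicate on each row.

    `queries` maps a query name to a predicate function. Returns a dict of
    matching rows per query name, plus the number of rows read.
    """
    results = {name: [] for name in queries}
    rows_read = 0
    for row in table:  # the single, shared pass over the data
        rows_read += 1
        for name, predicate in queries.items():
            if predicate(row):
                results[name].append(row)
    return results, rows_read


table = [{"objectId": i, "rMag": 16.0 + i} for i in range(8)]
queries = {
    "bright": lambda row: row["rMag"] < 18,
    "faint": lambda row: row["rMag"] >= 22,
}
results, rows_read = shared_scan(table, queries)
```

Both queries are answered after reading each row once. Running them separately would read the table twice, and that cost grows with every additional concurrent full-table query.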

Summary

Building a petabyte DB

MySQL can be a core component
