astrophysical surveys: visualization/data managements

35
Astrophysical Surveys: Astrophysical Surveys: Visualization/Data Visualization/Data Managements Managements Peter Nugent (LBNL/UCB) Peter Nugent (LBNL/UCB)

Upload: justis

Post on 02-Feb-2016

38 views

Category:

Documents


0 download

DESCRIPTION

Astrophysical Surveys: Visualization/Data Managements. Peter Nugent (LBNL/UCB). “Current” Optical Surveys. Photometric: Palomar Transient Factory La Silla Supernova Search SkyMapper PanSTARRS Spectroscopic: SDSS III (BOSS, SEGUE-II, APOGEE, MARVELS) - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Astrophysical Surveys: Visualization/Data Managements

Astrophysical Surveys: Astrophysical Surveys: Visualization/Data Visualization/Data

ManagementsManagements

Peter Nugent (LBNL/UCB)Peter Nugent (LBNL/UCB)

Page 2: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

““Current” Optical Current” Optical SurveysSurveys

Photometric:

Palomar Transient FactoryLa Silla Supernova SearchSkyMapperPanSTARRS

Spectroscopic:

SDSS III (BOSS, SEGUE-II, APOGEE, MARVELS)

All of these surveys span astrophysics from planets to cosmology, from the static to the transient universe.

Page 3: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

PTF: CompetitionPTF: CompetitionThe competition were two wide-field multi-color surveys with cadences that were either unpredictable (SkyMapper) or from days to weeks (PanSTARRS) in a given filter.

How could we do something better/different?

- Start quickly - P48” coupled with the CFHT12k camera- Don’t do multiple colors- Explore the temporal domains in unique ways- Take full advantage of the big-iron at Super-Computing

Centers- Get all the science we possibly can out of this program

Thus we need the capability of providing immediate follow-up of unique transients, using 4 to 10-m class telescopes.

Page 4: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

Transient Phase-SpaceTransient Phase-Space

Page 5: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

PTF SciencePTF SciencePTF Key ProjectsPTF Key Projects

Various SNeVarious SNe Dwarf novaeDwarf novae

Transients in nearby Transients in nearby galaxiesgalaxies

Core collapse SNeCore collapse SNe

RR LyraeRR Lyrae Solar system objectsSolar system objects

CVsCVs AGNAGN

AM CVnAM CVn BlazarsBlazars

Galactic dynamicsGalactic dynamics LIGO & Neutrino LIGO & Neutrino transientstransients

Flare starsFlare stars Hostless transientsHostless transients

Nearby star kinematicsNearby star kinematics Orphan GRB afterglowsOrphan GRB afterglows

Rotation in clustersRotation in clusters Eclipsing stars and Eclipsing stars and planetsplanets

Tidal eventsTidal events H-alpha sky-surveyH-alpha sky-surveyThe power of PTF resides in its diverse science goals and follow-up.

Page 6: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

PTF (2009-2013)PTF (2009-2013)

92 Mpixels, 1” resolution, 60s exposures - 128MB data per minute2 cadences SN & Dynamic with g in dark time & R in bright time

Page 7: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

PTF SciencePTF ScienceThe power of PTF resides in its diverse science goals and follow-up.

And so does the cost…these resources are an order of magnitude more expensive than the survey itself.

Page 8: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

PTF PipelinePTF Pipeline

Page 9: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

PTF SubtractionsPTF Subtractionsm

oon

4096 X 2048 CCD images - over 3000 per night

NewRef Sub

Page 10: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

QuickTime™ and a decompressor

are needed to see this picture.

PipelinePipeline

NERSC GLOBAL FILESYSTEM170TB

DataTransferNodes

ScienceGatewayNode 2

ScienceGatewayNode 1

Observatory PTF Collaboration

via Web

Processing/db Subtractions

Carver3200

core IBM

128 MB/90s50 GB/night

512 MB/90s200 GB/night

Page 11: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

PTF DatabasePTF DatabaseR-bandR-band g-bandg-band

imagesimages 1.07M1.07M 166k166k

subtractiosubtractionsns

810k810k 50k50k

referencesreferences 18.6k18.6k 3.0k3.0k

CandidateCandidatess

487M487M 22.2M22.2M

TransientsTransients 2911729117 1670*1670*

The db selection (psql) and schema was based on getting the fastest and cheapest solution to last the length of the survey and serve the science needs.

Page 12: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

PTF Sky CoveragePTF Sky Coverage

1000

100

10

0

To date:• 1200 Spectroscopically typed supernovae• 105 Galactic Transients• 104 Transients in M31

Page 13: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

PTF: Real or BogusPTF: Real or BogusPTF produces 1 million candidates during a typical night:

• Most of these are not real Image Artifacts Misalignment of images due to poor sky conditions Image saturation from bright stars

• 50k are asteroids• 1-2k are variable stars• 100 supernovae• 3-4 new, young supernovae or other explosions

Page 14: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

Real or BogusReal or Bogusm

oon

4096 X 2048 CCD images - over 3000 per night

Page 15: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

Real or BogusReal or Bogus

230 bogus candidates, 2 variable stars, 4 asteroids and the youngest Type Ia supernovae observed to date.

PTF10ygu: Caught 2 days after explosion

Page 16: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

Real or BogusReal or Bogus

Scanners helped vet over 1000 candidates and weights/biases were determined for each

scanner.

QuickTime™ and a decompressor

are needed to see this picture.

QuickTime™ and a decompressor

are needed to see this picture.

Page 17: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

Real or BogusReal or Bogus

There is a natural cut at a value of 0.1, but we go to 2 w/ 0.07 to be safe.

Page 18: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

Users…Users…

QuickTime™ and a decompressor

are needed to see this picture.

Page 19: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

Citizen Scientists…Citizen Scientists…

http:// supernova.galaxyzoo.org is now up and running!A beta version appeared last year to support the SN Ia program in PTF and a WHT spectroscopy run. I spent a week with the folks at Oxford setting up the db and giving them training sets of good and bad candidates. They did the rest… 1200 members of galaxy zoo screened all the candidates between Aug 1 and Aug 12 in 3 hrs. The top 50 hits were all SNe/variable stars and they found 3 before we did. They scanned ~25,000 objects - 3 objects/min. They now do ~200 nightly and we have 15,000 users.

QuickTime™ and a decompressor

are needed to see this picture.

QuickTime™ and a decompressor

are needed to see this picture.

Page 20: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

RobotRobot

QuickTime™ and a decompressor

are needed to see this picture.

A robot (built by Josh Bloom at UCB) queries the db every 20 min and compares new transients with archival information to ascertain its likely nature and publishes them to the collaboration - classification.

Page 21: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

RobotRobot

QuickTime™ and a decompressor

are needed to see this picture.

Complications to traditional methods include varying uncertainties in data, non-structured temporal sequence (bad weather, etc.), differing levels of historical information (in SDSS or not, known host in NED, etc.)

And this is just for stars…we also have ones for SNe, AGN…

Page 22: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

Turn-aroundTurn-aroundThe scanning is handled in three ways:

(1)Individuals can look through anything they want and save things to the PTF database(2)SN Zoo(3)UCB machine learning algorithm is applied to all candidates and reports are generated on the best targets and what they are likely to be (SN, AGN, varstar) by comparison to extant catalogs as well as the PTF reference catalog. These come out ~15 min after a group of subtractions are loaded into the database.

On June 3, 2010 we were able to photometrically screen 4 SN candidates with the Palomar 60” telescope in g, r and i-band (50% of the time on P60 is devoted to this) within 2.5 hrs of discovery on the Palomar Schmidt and take spectra of them at Keck the same night. Now a nightly occurrence.

Page 23: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

Robot -10vdlRobot -10vdl

Discovery and follow-up of PTF 10vdl a SN II.

QuickTime™ and a decompressor

are needed to see this picture.

QuickTime™ and a decompressor

are needed to see this picture.

QuickTime™ and a decompressor

are needed to see this picture.

Page 24: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

PTF TotalsPTF Totals

Transients = 1220Papers = 20

In addition to these we have followed 2 triggers from IceCube and one from LIGO.

We estimate that at the end of the survey we will have 40B detections in the individual images and 40B detections in the deep co-additions.

Page 25: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

Historical DataHistorical Data

The DeepSky program was started in response to the needs of several astrophysics projects hosted at NERSC. The result of this project is an all-sky digital image based upon the point-and-stare observations taken via the Palomar-QUEST Consortium and the SN Factory & Near Earth Asteroid Team. This data (over 9 million ccd images) spans 7 years and almost 20,000 square degrees, with typically 10-100 pointing on a particular part of the sky.

Page 26: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

Historical Data (Own)Historical Data (Own)

Deep co-additions of many images go into making a reference image. Requires Deep co-additions of many images go into making a reference image. Requires HPC to handle the sheer volume of data and the solution to large linear systems HPC to handle the sheer volume of data and the solution to large linear systems to normalize the images to each other before the co-additions. Science can be had to normalize the images to each other before the co-additions. Science can be had from either the reference or the individual images that went into it.from either the reference or the individual images that went into it.

QuickTime™ and aYUV420 codec decompressor

are needed to see this picture.

Page 27: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

SNe IaSNe Ia810 SNe Ia discovered to date.

The cosmology dataset (total of 153) has been followed with a ~2-3 day cadence in gri with LT and FTN/FTSAnd PTF in R/g-band

Page 28: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

PTF TotalsPTF Totals

Page 29: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

Near FutureNear FutureNext Generation Transient Survey (aka PTF-II)

- Upgrade to 5X PTF: 36 sq. deg. (~ 1 billion pixels)- Would like to explore the sky on 100s timescales- Turnaround in 10-20 minutes with list of new

candidates- Ingest SDSS, BOSS, NED, etc. catalogs to refine

our understanding of these candidates in real-time

- Able to handle Advanced LIGO, neutrino detectors, etc.

Page 30: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

QuickTime™ and a decompressor

are needed to see this picture.

Bottlenecks…Bottlenecks…

NERSC GLOBAL FILESYSTEM240TB

DataTransferNodes

ScienceGatewayNode 2

ScienceGatewayNode 1

Observatory PTF Classification

Processing/db

Carver

Subtractions2.5 MB/s

1 GB/s

12 MB/s

4 MB/s(crude)

0.5 MB/s(full)

Page 31: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

Bottlenecks…crude Bottlenecks…crude vsvs. . realreal

time

bri

gh

tness

5- data in db

Page 32: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

Heavy Heavy RandomRandom I/O I/O

SC09 Storage challenge allowed us to couple both the SDSS db and the PTF candidate db to ask the question, which objects that we think are quasars in the static SDSS data vary like one in the PTF data. PTF db is now 250GB and growing nightly…

Page 33: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

Heavy Heavy RandomRandom I/O + I/O + analyticsanalytics Aster Data

provides a parallel db solution that also allows us to embed many of our machine learning algorithms. Already handle PB datasets.

Likely will couple both solutions (Aster + SSD).

QuickTime™ and a decompressor

are needed to see this picture.

Page 34: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

Conclusions - FutureConclusions - Future

LSST - 15TB data/nightOnly one 30-m telescope

Page 35: Astrophysical Surveys: Visualization/Data Managements

INT Exascale Workshop

Conclusions - FutureConclusions - FutureScaling issues:

• Much larger volume of data (100X)• Much larger community (100X)• Greater science scope, not just transients

• Much larger set of historical resources one would wish to query against

• Follow-up resources (All the 8-10-meter & the 30-meter telescopes) much more expensive