Heliophysics Projects Division
HMI Team Meeting, September 8-11, 2009
HMI JSOC Science Data Processing
Art Amezcua
AGENDA
• JSOC-SDP Overview
• JSOC-SDP Status (H/W, S/W)
  – Pipeline processing
  – Database
  – Level-0, Level-1 and higher levels
• JSOC-SDP Maintenance and CM
• Documentation
• Staffing
• Summary
JSOC Science Data Processing (SDP) Status
• JSOC-SDP supports both HMI and AIA through Level-1, and HMI through Level-2 science data products
• JSOC-SDP infrastructure is complete
  – JSOC-SDP hardware is complete; upgrades in process
    • Database systems – warm standby system online in September 2009
    • Web server – upgrade online in September 2009
  – JSOC-SDP software
    • Data Record Management System and Storage Unit Management System (DRMS/SUMS) complete as of March 2009
  – JSOC-SDP archive system is fully operational
• Software components needed to support commissioning
  – Level-0 image processing for both AIA and HMI is ready and was used to support Observatory I&T
  – Level-0 HK, FDS and other metadata merge – complete as of May 2009
  – Level-1 (science observables) – will be completed during commissioning
    • HMI Doppler and LOS magnetic – 95% complete
    • HMI vector field observables – 90% complete
    • AIA Level-1.5 images – 50% complete
JSOC-SDP Status (continued)
• Software components needed to support science mission:
  – Production Pipeline Manager – in development, expected during commissioning
  – HMI Level-2 (Version 1 of science data products)
    • Local helioseismology – work in parallel on “rings”, “time-distance”, and “holography” proceeding with basic capability, expected to be ready during commissioning
    • Global helioseismology – ready for testing during commissioning
    • Magnetic field standard products – ready for testing during commissioning
    • Vector field disambiguation – 80% complete, with preliminary product ready by end of commissioning (requires real data to proceed)
  – Export and catalog browse tools
    • Functional but needs work (http://jsoc.stanford.edu/ajax/lookdata.html)
    • Refinements will continue
  – All science products need flight data during commissioning to complete development
• AIA Visualization Center (AVC) at Lockheed Martin
  – Higher-level AIA processing and science product generation
  – Heliophysics Event Knowledgebase (HEK)
• Stanford summary: on schedule for L – 4 and Phase E – 6 months
HMI and AIA JSOC Overview
[Diagram: At GSFC/White Sands, the MOC and DDS feed the redundant data capture system (19-day archive) at Stanford (JSOC-SDP), which drives the HMI JSOC pipeline processing system with its catalog, primary archive, offline archive, and data export & web service serving the science team, forecast centers, EPO, and the public. An offsite archive resides at LMSAL (JSOC-AVC), where high-level data import, the AIA analysis system, a local archive, and quicklook viewing support HMI & AIA operations (JSOC-IOC), backed by a housekeeping database.]
JSOC-SDP Data Center
• Facility
  – Located in the climate-controlled basement of the Physics and Astrophysics building at Stanford
  – Important components on UPS and building back-up power; databases auto-shutdown on power outage
  – Critical components monitored for failure (telephone, email, webpage notification of issues)
• Components
  – 3 data-capture machines (plus one at the MOC)
  – 1 data-processing cluster (512 CPU cores, 64 nodes, queuing system)
  – 1 file- and tape-server machine
  – 3 database machines
  – 2 gateway machines (to MOC, to SDP)
  – 1 web-server machine
  – 2 LMSAL real-time (housekeeping) machines
  – OC3 lines from DDS; ethernet (1 Gbps) connects all components; high-speed (20 Gbps) interconnect between data-processing cluster and file- and tape-server; 10 Gbps link between file- and tape-server and LMSAL
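As a rough illustration of why the 20 Gbps interconnect matters, the time to move one DCS-sized 13 TB telemetry cache at each link speed can be estimated. This is a back-of-envelope sketch assuming ideal link utilization, not a measured throughput figure:

```python
def transfer_hours(size_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Hours to move size_tb terabytes over a link_gbps link.
    Assumes 1 TB = 8e12 bits and ideal (or scaled) utilization."""
    bits = size_tb * 8e12
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600.0

# One 13 TB data-capture disk array over each link, ideal utilization:
print(round(transfer_hours(13, 1), 1))    # 28.9 hours on 1 Gbps ethernet
print(round(transfer_hours(13, 20), 1))   # 1.4 hours on the 20 Gbps interconnect
```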
JSOC-SDP Major Components
[Diagram: JSOC-SDP major components and connectivity. OC3 lines from the DDS feed three data capture systems (HMI, AIA, and a spare), each with 4 processor cores, 13 TB of disk, and a 50-slot LTO-4 tape library. Ethernet (1 Gbps) connects the components; a 20 Gbps interconnect links the 512-core pipeline cluster (nodes of 2 quad-core x86-64 processors) to the file/tape server (4 cores, 10 TB), which serves 400 TB of disk plus 150 TB per year and a 2200-slot LTO-4 tape library with 12 drives. DRMS & SUMS database hosts HMIDB and HMIDB2 (16 cores, 10 TB each) and a Web/Export host sit behind the firewall along with the web server, GSFC access (FDS, L0HK), and local science workstations. Two SPARC real-time monitors (HMISDP-mon and AIASDP-mon, 0.5 TB each) and the RT Mon at LMSAL use the MOC link for real-time housekeeping; a 10 Gbps ethernet link connects to LMSAL; the outside world connects through the firewall.]
Data Capture
• Telemetry files transmitted to data-capture (DCS) machines via two OC3 lines
  – One line for AIA (data from four cameras over two virtual channels)
  – One line for HMI (data from two cameras over two virtual channels)
• Three sets of telemetry:
  – DCS machines archive telemetry files to tape, driven to LMSAL twice a week and stored in a cabinet
  – Production processes on a dedicated cluster node ingest raw telemetry from DCS disk into hmi.tlm and aia.tlm
  – A dedicated cluster node creates Level-0 data from telemetry and stores it in DRMS/SUMS as hmi.lev0 and aia.lev0
• DCS acknowledges the DDS (once per day) when the offsite tape is in place and verified and the records in hmi.tlm are created
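The daily acknowledgement step reduces to a simple predicate; the function and flag names below are illustrative, not the actual DCS code:

```python
def ready_to_ack(tape_written: bool, tape_verified: bool, tlm_records_created: bool) -> bool:
    """The DCS sends its once-per-day acknowledgement to the DDS only when
    the offsite tape copy is in place and verified AND the telemetry files
    have been ingested as records in hmi.tlm / aia.tlm."""
    return tape_written and tape_verified and tlm_records_created

print(ready_to_ack(True, True, True))    # True: safe to ack; DDS may release its copy
print(ready_to_ack(True, False, True))   # False: tape not yet verified
```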
Data Capture Details
[Diagram: The DDS feeds OC3 switches into the HMI DCS, AIA DCS, and a spare DCS (Cisco Systems switches), each with a 13 TB disk array (19 days) and tape drives feeding the tape archive and tape robot; the pipeline-processing back end connects via ethernet. All capture machines are on UPS, with building power backed by a generator. A 10-gigabit private link connects to LMSAL; the MOC link and an operator hand-net connect to the MOC; HMI and AIA monitoring stations observe the capture systems.]
Data Capture Status
• Data capture machines
  – Online as of January 2008
  – Each is capable of caching 19 days of telemetry
  – Tape drive is used to generate the offsite copy stored at LMSAL
  – Pipeline system is used to generate the tlm copy and Level-0 data series
  – All tapes are LTO-4 (800 GB)
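The 19-day cache is consistent with the 13 TB disk array if the telemetry volume is on the order of 0.7 TB/day; a quick sanity check (the daily rate here is an inferred assumption, not a quoted spec):

```python
def cache_days(disk_tb: float, daily_rate_tb: float) -> float:
    """Days of telemetry a DCS disk array can hold at a given daily volume."""
    return disk_tb / daily_rate_tb

# 13 TB array at an assumed ~0.68 TB/day telemetry volume:
print(round(cache_days(13.0, 0.68)))  # 19 days
```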
JSOC-SDP Pipeline Hardware
Data-processing cluster, file- and tape-server machine, and T950 tape library [COMPLETE as of July 2008]
Data-Processing Pipeline Status
• All machines fully tested for integrity with simulated data at realistic data rate and volumes
• All machines/components under service warranties with vendors
• Database machines have been online for four years (during DRMS development)
• Data-processing cluster, file- and tape-server machine, T950 tape robot, and tape systems went online in July 2008
• Upgrades (new machines onsite)
  – MOC web-access machine
  – solarport (gateway to SDP)
  – ftp server
  – web server
  – two database machines
  – In service in September 2009
Pipeline Software – DRMS/SUMS
• Data Series
  – Related images and metadata stored in “data series”
  – Rows are data records (e.g., one record per time step)
  – Columns are keywords, pointers to data files (in SUMS), and pointers to other data series
• Storage Unit Management System (SUMS)
  – Image files (e.g., FITS files) stored in SUMS
  – Uses PostgreSQL database
  – Sole client is DRMS
• Data Record Management System (DRMS)
  – Data series minus image files
  – Implemented as a C library that wraps a PostgreSQL database
  – Has a Fortran interface
  – Scientists interact directly with DRMS
• NetDRMS
  – Network of DRMS/SUMS sites that share DRMS/SUMS data
  – DRMS data shared via RemoteDRMS, which uses Slony-1 to make data logs that are ingested at the remote site
  – Data files residing in SUMS shared via RemoteSUMS, which uses scp; integrates with VSO so that data are obtained from the least-congested NetDRMS sites
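The data-series model (records as rows, keywords as columns, a storage-unit pointer into SUMS) can be mimicked with a plain relational table. This is an illustrative sketch using SQLite; the actual DRMS schema lives in PostgreSQL and its column names here are made up:

```python
import sqlite3

# One table per data series; each row is a data record.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE hmi_lev0 (
        recnum  INTEGER PRIMARY KEY,  -- record number
        t_obs   TEXT,                 -- keyword: observation time
        fsn     INTEGER,              -- keyword: filtergram sequence number
        sunum   INTEGER               -- pointer to a SUMS storage unit (SUDIR)
    )
""")
conn.execute("INSERT INTO hmi_lev0 VALUES (1, '2009-09-08T00:00:00Z', 1000, 42)")
conn.execute("INSERT INTO hmi_lev0 VALUES (2, '2009-09-08T00:00:45Z', 1001, 42)")

# A DRMS-style query: all records in a time range, with their SUMS pointers.
rows = conn.execute(
    "SELECT fsn, sunum FROM hmi_lev0 WHERE t_obs >= '2009-09-08' ORDER BY recnum"
).fetchall()
print(rows)  # [(1000, 42), (1001, 42)]
```

Note how the image files themselves never appear in the table: DRMS holds only the `sunum` pointer, and SUMS resolves it to a directory of files.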
Database Configuration
[Diagram: HMIDB is the Slony-1 “master” node serving the DRMS and SUMS servers; Slony-1 log generation and log shipping replicate to HMIDB2 (warm standby) and to remote DRMS’s, while WALs and cron’d backups (permissions, user IDs, Slony configs, layout info) feed full tape backups via a snapshot volume; a tape drive is kept for maintenance. WEBDB serves science users, solar weather, VSO, and IDL SolarSoft clients through the Stanford firewall via http://lookdata; SUMS queries from WebDB to the primary are read-only. A dedicated line to LMSAL provides trusted-user direct access (solar weather contingency).]
Level-0 Processing
[Diagram: HMI telemetry (VC02*.tlm/.qac, VC05*.tlm/.qac) is served over NFS from dcs1; AIA telemetry (VC01*.tlm/.qac, VC04*.tlm/.qac) over NFS from dcs0. On cluster node cl1n001, one ingest_lev0 process per virtual channel records the .tlm and .qac files in hmi.tlm and aia.tlm (filename | SUDIR) and writes Level-0 records to hmi.lev0 and aia.lev0 (fsn | lev0 keys | SUDIR), with image.fits, image_sm.fits, and image.png stored in SUMS.]
AIA: ~24 images per VC01 or VC04 .tlm file.
HMI: ~16 images per VC02 or VC05 .tlm file.
Reconstructs images from tlm; no modification of CCD pixels. [COMPLETE as of August 2008]
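The per-channel ingest paths can be sketched as a filename-routing step. The VCnn prefix and the .tlm/.qac pairing follow the slide; the helper itself is illustrative, not the actual ingest_lev0 logic:

```python
from collections import defaultdict

# HMI telemetry arrives on VC02/VC05, AIA on VC01/VC04.
CHANNEL_TO_SERIES = {"VC02": "hmi.tlm", "VC05": "hmi.tlm",
                     "VC01": "aia.tlm", "VC04": "aia.tlm"}

def route_files(filenames):
    """Group .tlm files by the DRMS series their virtual channel feeds."""
    routed = defaultdict(list)
    for name in filenames:
        if not name.endswith(".tlm"):
            continue  # .qac quality files accompany each .tlm but aren't routed here
        vc = name[:4]  # filenames begin with the virtual-channel prefix
        routed[CHANNEL_TO_SERIES[vc]].append(name)
    return dict(routed)

files = ["VC02_001.tlm", "VC02_001.qac", "VC01_007.tlm"]
print(route_files(files))  # {'hmi.tlm': ['VC02_001.tlm'], 'aia.tlm': ['VC01_007.tlm']}
```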
Level-1 Processing
[Diagram: Image-data input: fetch Level-0 keywords & segments and the Level-0 image from hmi.lev0; read flatfield arguments from hmi.flatfield; get readout-mode corrections. Image-data processing: remove overscan rows & cols, correct for gain & offset, ID bad pixels, calculate image center, set quality, and write hmi.lev1. Ancillary-data input and processing: interpolate predicted orbit vectors from sdo.fds_orbit_vectors and spacecraft pointing vectors from sdo.lev0_asd_0303.]
COMPLETE as of September 2009
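The image-data branch (overscan removal followed by gain/offset correction) can be sketched on a toy 2-D array; the camera geometry and calibration values below are made up for illustration:

```python
def strip_overscan(image, overscan_rows=1, overscan_cols=1):
    """Drop the trailing overscan rows and columns from a 2-D pixel list."""
    return [row[:-overscan_cols] for row in image[:-overscan_rows]]

def correct_gain_offset(image, gain, offset):
    """Per-pixel linear calibration: corrected = (raw - offset) * gain."""
    return [[(px - offset) * gain for px in row] for row in image]

raw = [[10, 12, 0],
       [14, 16, 0],
       [ 0,  0, 0]]          # last row/column mimic overscan regions
active = strip_overscan(raw)
print(correct_gain_offset(active, gain=2.0, offset=10.0))
# [[0.0, 4.0], [8.0, 12.0]]
```

In the real pipeline the gain and offset come from hmi.flatfield and the readout-mode corrections, and further steps (bad-pixel ID, image centering, quality flags) follow before the record is written to hmi.lev1.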
HMI Observables – “Level-1” Products
HMI Higher-Level Processing Status
• Higher-Level Science Products
  – Internal rotation Ω(r,Θ) [estimated complete as of August 2009]
  – Internal sound speed cs(r,Θ) [estimated complete as of August 2009]
  – Full-disk velocity v(r,Θ,Φ) [estimated complete as of December 2009]
  – Sound speed cs(r,Θ,Φ) [estimated complete as of December 2009]
  – Carrington synoptic velocity maps [estimated complete as of December 2009]
  – Carrington synoptic speed maps [estimated complete as of December 2009]
  – High-resolution velocity maps [estimated complete as of December 2009]
  – High-resolution speed maps [estimated complete as of December 2009]
  – Deep focus maps [estimated complete as of July 2010]
  – Far-side activity maps [estimated complete as of December 2009]
  – Line-of-sight magnetic field maps [COMPLETE as of July 2009]
  – Vector field inversion and direction disambiguation [estimated complete as of March 2010]
  – Vector magnetic field maps [estimated complete as of April 2010]
  – Coronal magnetic field extrapolations [COMPLETE as of July 2009]
  – Coronal and solar wind models [estimated complete as of April 2010]
  – Brightness images [estimated complete as of August 2009]
Data Distribution and Export
• Scope
  – AIA – Level-0 and Level-1 data
  – HMI – Level-0 data through Level-2 data
• Web Export
  – http://jsoc.stanford.edu/ajax/lookdata.html
  – Query for desired data, then download via web
  – Supports several data formats (internal files, FITS files, tar files, compressed files)
  – Provides support for special processing (such as extracting regions)
  – Other developers can expand on this export method by writing javascript that is allowed to access our web cgi programs
  – Functional now; enhancements estimated complete as of August 2009
• NetDRMS
  – Network of DRMS sites
  – Can share DRMS data (not just data files) among sites using RemoteDRMS and RemoteSUMS
  – Scientists can request the same data from one of many sites
  – Functional now; enhancements estimated complete as of August 2009
• Virtual Solar Observatory (VSO) Integration
  – Provides a UI that allows uniform search of disparate types of data
  – Obtains metadata and data files from NetDRMS sites experiencing the least congestion
  – Estimated complete as of December 2009
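From the client side, a web export request reduces to composing a CGI query string. The endpoint path and parameter names below are hypothetical placeholders, not the documented lookdata API; only the host and the DRMS-style record-set syntax come from the slides:

```python
from urllib.parse import urlencode

def build_export_url(series, time_range, fmt="fits"):
    """Compose a hypothetical export query against the JSOC web CGI.
    The record-set syntax series[time_range] mirrors DRMS query style;
    the /cgi-bin/export path is an illustrative assumption."""
    params = {"ds": f"{series}[{time_range}]", "format": fmt}
    return "http://jsoc.stanford.edu/cgi-bin/export?" + urlencode(params)

print(build_export_url("hmi.lev1", "2009.09.08_00:00_TAI"))
```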
Maintenance and Expansion During Mission
• Hardware
  – Each hardware component is covered under a vendor service plan; as plans expire, they are renewed
  – Planned phased replacement/upgrades throughout Phase-E
• Software
  – Lead software developers are part of the continuing team for Phase-E
• Storage
  – File server – 150 TB per year of disk
  – Tape library – filled tapes stored in the Data Center, replaced with new tapes as needed; library expansion entails a new 1300-slot cabinet when needed
• Functionality
  – Anticipate continued development of science processing and distribution tools during Phase-E
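With 800 GB LTO-4 tapes, the 150 TB yearly increment translates into a rough tape count and library occupancy. A back-of-envelope check, assuming uncompressed tape capacity and that the full yearly volume is archived:

```python
import math

def tapes_needed(data_tb: float, tape_gb: float = 800.0) -> int:
    """Minimum whole LTO-4 tapes to hold data_tb terabytes (uncompressed, 1 TB = 1000 GB)."""
    return math.ceil(data_tb * 1000.0 / tape_gb)

per_year = tapes_needed(150.0)   # yearly growth moved to tape
print(per_year)                  # 188 tapes per year
print(2200 // per_year)          # 11 -- rough years before the 2200-slot library fills
```

This is why the plan cycles filled tapes out to Data Center storage rather than expanding the library immediately.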
Documentation
• Wiki – http://jsoc.stanford.edu/jsocwiki
  – Overview of JSOC Data Series, DRMS, and SUMS
  – JSOC Series Definition Files
  – DRMS Names
  – JSOC User’s Guide
  – JSOC Developer’s Guide
• Doxygen – http://jsoc.stanford.edu/doxygen_html
  – Manual describing DRMS API functions and modules
  – Provides a synopsis and describes input parameters, output, and return values
  – To date, ~1/2 of functions/modules have documentation
• Flow Diagrams
  – Tree diagrams illustrating connections between various programs, data series, data tables, etc.
  – Diagrammatic view of pipeline processes links to documentation
  – Notes stages of development (A – E) and estimated completion date
• Procedures documented
  – Database maintenance
  – DCS operations
  – Level-0 processing management
  – RemoteDRMS/SUMS installation and maintenance
• Procedure documentation in progress
  – Calibration processing (filters, flat fields, etc.)
  – Pipeline Dataflow Management System
  – Export management
  – Weekly data product report generation
Summary
• The JSOC-SDP can support:
  – Archive and distribution functions now
  – Analysis for instrument commissioning now
  – Initial science data processing by launch
Backup Slides
JSOC-SDP Stages of Software Development
• Stage A – Code specification exists, but working code does not exist
• Stage B – Prototype code exists, but not necessarily on HMI data and not necessarily in the correct language
• Stage C – Working code exists but cannot run inside the JSOC pipeline
• Stage D – Working code capable of running in the JSOC pipeline, but undergoing final testing and not released for general use
• Stage E – Working code complete and integrated into the JSOC pipeline

The following dataflow charts show status as of FORR, with estimated months to complete to Stage E shown.
JSOC-SDP Dataflow – Data Capture
JSOC-SDP Dataflow HMI Level-0 and Level-1
JSOC-SDP Dataflow HMI Level-1 – Detail
JSOC-SDP Dataflow HMI Level-2 Helioseismology
JSOC-SDP Dataflow HMI Level-2 – LOS Mag
JSOC-SDP Dataflow HMI Level-2 – Vector Mag
AVC Dataflow – Data Distribution