grid-brick event processing framework in geps chep 03 – la jolla, california a. amorim, p....

23
Grid-Brick Event Processing Framework in GEPS CHEP 03 – La Jolla, California A. Amorim, P. Trezentos, N. Almeida, H. Fei, L.Pedro, J.Villate, H.Wolters

Post on 21-Dec-2015

215 views

Category:

Documents


0 download

TRANSCRIPT

Grid-Brick Event Processing Framework in GEPS

CHEP 03 – La Jolla, California

A. Amorim, P. Trezentos, N. Almeida, H. Fei, L.Pedro, J.Villate, H.Wolters

2

Outline Introduction Architecture Overview Data Flow How do we do it Action! Vantages and disadvantages On-going and future work Summary

[email protected] FCUL - Lisbon

3

Introduction

What is GEPS Grid-based Event Processing System Developed on top of Globus Provides web-based access to grid

computing environment for event processing Developed by GridPT working group

[email protected] FCUL - Lisbon

4

Introduction

Main Idea

Do NOT move data!

Each node stores and is

reponsible by a subset of the whole data…

[email protected] FCUL - Lisbon

5

Introduction

The usual way

High speed Disks

Data Center

[email protected] FCUL - Lisbon

6

Introduction

Our way

User terminal User terminal User terminal………..

Job submit server

Meta-data catalogue

[email protected] FCUL - Lisbon

7

Architecture Overview

User submits a query through a interface to the Job Submit Server (JSS).

Job submitted information will be stored in the Metadata Catalog.

[email protected] FCUL - Lisbon

8

Architecture Overview

[email protected] FCUL - Lisbon

The job is submitted to the grid nodes using Globus API functions.

All the nodes query their own information and retrieve a result.

9

Architecture Overview

JSS receives result from the Grid nodes and produces a final one

User can download or consult the final result

User can also visualize the state of the job in each Grid node

[email protected] FCUL - Lisbon

10

Architecture Overview

[email protected] FCUL - Lisbon

11

Data-flow

User terminal(PHP interface)

BROKER

Meta-data catalogue

[email protected] FCUL - Lisbon

JSS

Grid node

Brick

12

How do we do it:

Technologies used Globus PgSQL LDAP PHP ROOT

[email protected] FCUL - Lisbon

13

How do we do it:

Relevant Features Globus

Toolkit that provide GRID API functions

PgSQL Meta-data catalogue implementation

LDAP Query Grid node information

PHP Web interface

[email protected] FCUL - Lisbon

14

Action!

The human interface

Main Page

Submit a Job

GREED info

Job status

[email protected] FCUL - Lisbon

15

Action!

Enabling ROOT Queries The job is submitted to the grid nodes All the nodes query their own information with

ROOT and retrieve a ROOT file with a TTree JSS receives the ROOT files and produces a

final ROOT file with the result of the query User can download or consult the final file

because it is a TTree.

[email protected] FCUL - Lisbon

16

Action!

Enabling ROOT Queries (cont) Stores the information in each node using TTree’s Filter the information in each node and retrieve a

result file that include a TTree Join all the result files in the Job Submit Server using

a TChain and produces a final TTree that is the query result

View the final result file with a TBrowser or with Carrot

[email protected] FCUL - Lisbon

17

Action!

Enabling ROOT Queries (reading ROOT files)

Analysis Performance Reading From a TFile

050

100150200250300350400

100 500 1000 2000 4000

Events

Tim

e S

pen

t (s

)

30% Events

60% Events

90% Events

[email protected] FCUL - Lisbon

18

Action!

Enabling ROOT Queries (reading ROOT files)

Analysis Performance Reading From a TTree

0

20

40

60

80

100

120

140

100 500 1000 2000 4000 6000

Events

Tim

e S

pen

t (s

)

30% Events

60% Events

90% Events

[email protected] FCUL - Lisbon

19

Action!

Special Features used ROOT

TObject TTree CINT Filtering data from TTree ROOT I/O TChain

Carrot Browsing ROOT files Histograming variables

[email protected] FCUL - Lisbon

20

Vantages and disavantages

Vantages Commodity Data Storage Huge Scalability (400 GB/node) Granularity

Disadvantages Load balancing

Suitable storage policy Fault tolerance

Data replication or Backup

[email protected] FCUL - Lisbon

21

On-going and future work

Error handling and fault-tolerance Recover mechanisms for each node Create a redundancy mechanism to recover

from a malfunction in the nodes Develop a storage mechanism to submit more

work to the best nodes Load balancing

Provide to user several interfaces to submit work

[email protected] FCUL - Lisbon

22

Summary

A different approach is being developed There is already a real prototype working Some (good) results have been achieved A lot of work already done…..

But …. Still a lot of work to do!

[email protected] FCUL - Lisbon

23

Acknowledgments

Thank you to those who are developing and participating in this project A. Amorim ([email protected]) P. Trezentos ([email protected]) N. Almeida ([email protected]) H. Fei ([email protected]) L.Pedro ([email protected]) J.Villate ([email protected]) H.Wolters ([email protected])

Keep the good workThanks for hearing me!

[email protected] FCUL - Lisbon