the atlas eventindex: an event catalogue for experiments ... filethe atlas eventindex: an event...

19
The ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università degli Studi di Genova & INFN Genova 18 th Jun 2014 - Riunione PRIN, Bologna

Upload: others

Post on 11-Oct-2019

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

The ATLAS EventIndex: An event catalogue for

experiments collecting large amount of data

Andrea Favareto Università degli Studi di Genova & INFN Genova

18th Jun 2014 - Riunione PRIN, Bologna

Page 2: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

Introduction• Modern  scien,fic  experiments  collect  very  large  amount  of  data  

‣ ATLAS:  e.g.  2  billion  real  and  4  billion  simulated  events  in  2011  and  2012  

• A  catalog  of  data  is  needed  according  to  different  point  of  view  

‣ mul,ple  use  cases  and  search  criteria  

• A  database  that  contains  the  reference  to  the  file  that  includes  every  event  at  every  stage  of  processing  is  necessary  to  recall  selected  events  from  data  storage  systems  

‣ In  ATLAS  an  EventTag  database  already  exists  

-­‐ designed  in  the  late  1990’s  

-­‐ Oracle  databases  with  separate  tables  for  each  reprocessing  cycle.  Implementa,on  in  Oracle  par,cularly  labor-­‐intensive  

-­‐ each  event  is  recorded  several  ,mes,  once  for  each  cycle  of  reconstruc,on  

‣ EventTag  is  poten,ally  very  useful  but  very  liLle  used,  at  least  in  its  DB  format  

-­‐ main  use:  events  skimming

2

Page 3: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

EventIndex• GOAL:  design  a  more  agile  system  (“NoSQL”  databases),  compared  to  TagDB,  that  keeps  

the  func,onality  of  skimming  by  keeping  only  the  pointers  (currently  GUIDs)  to  the  files  where  the  event  in  ques,on  can  be  found  at  every  stage  of  processing  and  elimina,ng  other  variables  of  lower  importance  

• EventIndex  

‣ a  complete  catalog  of  ATLAS  events  

-­‐ all  events,  real  and  simulated  data  

-­‐ all  processing  stage  

‣ contents  

-­‐ event  iden,fiers  (RunNumber,  EventNumber,  LumiBlockN,  …)  

-­‐ online  trigger  paLern  and  hit  counts  

-­‐ references  (GUIDs)  to  the  events  at  each  processing  stage  (RAW,  ESD,  AOD,  NTUP)  in  all  permanent  files  on  storage

3

Page 4: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

Use Cases• old  use  cases  (from  TagDB)  

‣ event  picking  

-­‐ give  me  the  reference  (pointer)  to  “this”  event  in  “that”  format  for  a  given  processing  cycle  

‣ event  skimming  

-­‐ give  me  the  list  of  events  passing  “this”  selec,on  and  their  references  

‣ produc,on  consistency  checks  

-­‐ technical  checks  that  processing  cycles  are  complete  

• new  use  cases  

‣ event  service  

-­‐ give  me  the  references  for  “this”  range  of  events

4

Page 5: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

Data Flow• Informa,on  for  the  EventIndex  is  collected  from  all  produc,on  jobs  running  

at  CERN  and  on  the  Grid  and  transferred  to  CERN  using  a  messaging  system  

• This  info  is  reformaLed,  inserted  into  the  Hadoop  storage  system  and  internally  catalogued  and  indexed  

• Query  services  (CLI  and  GUI)  implemented  also  as  web  services  query  the  EventIndex  and  retrieve  the  informa,on

5

Page 6: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

EventIndex project breakdown

• Defini,on  of  4  major  work  areas  (or  tasks)  

‣ core  architecture  

-­‐ design  and  prototyping,  then  development  and  deployment  of  the  core  infrastructure  to  implement  the  EventIndex  using  Hadoop  technology;  ,mescale:  summer  2014  

‣ data  collec,on  and  storage  

-­‐ design  and  prototyping,  then  development,  deployment  and  opera,on  of  the  infrastructure  to  collect  from  Grid  and  non-­‐Grid  jobs  the  informa,on  needed  by  the  EventIndex,  and  transfer  and  upload  it  to  the  EventIndex  DB;  ,mescale:  autumn  2014  

‣ query  services  

-­‐ adapta,on  of  the  web  services  and  APIs  now  used  to  query  the  Tag  DB  to  the  EventIndex  using  Hadoop  technology;  ,mescale:  summer  2014  

‣ func,onal  tes,ng  and  opera,on  

-­‐ end-­‐to-­‐end  tests  of  the  completeness,  func,onality  and  performance  of  the  EventIndex  and  related  services.  Checks  of  the  opera,on  procedures  and  sebng  up  monitoring  tools;  ,mescale:  end  2014

6

Page 7: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

Data collection

7

• Data  to  be  stored  in  the  EventIndex  are  produced  by  all  produc,on  jobs  that  run  at  CERN  or  on  the  Grid  

‣ snippet  of  informa,on  of  permanent  output  files  (file  unique  iden,fier,  event  relevant  aLributes)  has  to  be  sent  to  central  server  @CERN

• producers  run  within  the  “pilot”  processon  each  worker  node  and  transmit  the  datausing  a  messaging  system  to  a  queue  in  oneof  the  endpoints

• asynchronous  consumers  con,nuously get  new  data,  check  their  validity  with  theproduc,on  system  and  pass  them  into  Hadoop

Page 8: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

Core architecture• data  are  stored  in  Hadoop  map  files  

‣ event  aLributes  stored  in  several  groups  (constants,  variables,  references,…)  

‣ data  can  be  indexed  (e.g.  using  inverted  indices  for  trigger  info).  Index  files  just  have  key  +  references  to  data  files.  Some  index  files  created  at  upload,  others  added  later.  Results  of  queries  can  be  cached  to  be  used  

• searches  can  be  performed  in  two  steps:  

‣ get  reference  from  index  file  

‣ using  that  ref.,  get  data  

• searches  can  be  performed  using  keys  (immediate  results),  or  on  data  files  

• full  searches  can  be  done  with  full  scans  or  using  MapReduce  jobs  

• access  to  the  data  is  achieved  by  a  single  and  simple  interface  for:  

‣ upload:  copy  file  into  HDFS  

‣ read:  get  &  search  

‣ update:  add  new  (ver,cal)  data  

‣ index:  create  new  indices

8

Page 9: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

Core system - Performance tests• tests  done  with  EI  CLI  and  web  service  

• tests  on  data11_7TeV,physics_Muons,f415_m1025_m1024  (t0  processing,  16  runs,  Sep-­‐Nov  2011)  

• some  use  cases  I’m  considering  are:  

1) COUNT/RETRIEVE  events  W/O  FILTER  (which  simply  means  to  count  all  the  events  in  the  dataset  without  any  par,cular  request)  

2) COUNT/RETRIEVE  events  W/  trigger  AND/OR  W/runnumber  

3) COUNT/RETRIEVE  events  (e.g.  token)  W/  GUID  

• Queries  that  give  the  run  number  and  event  number  (event  picking  use  case)  return  their  result  in  ~5  seconds  

‣ data  in  Hadoop  catalogue  are  indexed  by  RunNumber-­‐EventNumber  key  

‣ currently  we  are  thinking  about  to  generate  many  indices  for  each  event  variable  and  to  automa,se  the  search  in  order  to  use  the  proper  index  file  depending  on  the  variable  on  which  the  query  is  performed

9

Page 10: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

Core system - Performance tests

10

#test selec,on #events frac,on  (%) total  ,me

RETRIEVE  events  W/O  FILTER -­‐ 20371303 100 08:50,4

RETRIEVE  events  W/  trigger EF_mu18 3435250 16,86 03:03,2

RETRIEVE  events  W/  runnumber 189822 6448550 31,66 03:05,9

RETRIEVE  events  W/  runnumber  AND  W/  trigger EF_mu18  AND  18922 1353032 6,64 02:29,1

RETRIEVE  W/  GUID BE456704-­‐BE08-­‐E111-­‐BC39-­‐0030489470C6 8410 0,04 01:27,2

• #events  cross-­‐checked  with  previous  tests  and  with  iELLSI  (EventTag  web  service)  

• total  ,me  can  fluctuate  a  bit  (~  ±30  sec)  

• output  txt  file  created  with  all  events  informa,ons

Page 11: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

Core system - Performance tests

11

#test selec,on #events frac,on  (%) total  ,me

COUNT  events  W/O  FILTER -­‐ 20371303 100 02:04,5

COUNT  events  W/  trigger EF_mu18 3435250 16,86 01:57,5

COUNT  events  W/  runnumber 189822 6448550 31,66 02:02,8

COUNT  events  W/  runnumber  AND  W/  trigger EF_mu18  AND  18922 1353032 6,64 02:03,7

COUNT  W/  GUID BE456704-­‐BE08-­‐E111-­‐BC39-­‐0030489470C6 8410 0,04 01:23,7

• now  simply  count  the  events  and  output  txt  file  NOT  created

Page 12: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

Hadoop cluster load tests

• The  idea  is  to  run  some  queries  simultaneously  on  the  hadoop  cluster  and  measure  it’s  performances  in  this  condi,ons  

• in  order  to  compare  directly  the  results,  the  same  query  has  been  considered  

• mul,ple  queries  on  

‣ same  dataset  

‣ different  datasets  

• mul,ple  queries  from  

‣ single  user  -­‐  a  single  user  can  run  up  to  25  jobs  simultaneously  from  the  same  machine  

‣ two  users

12

Page 13: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

Hadoop cluster load tests: single user

13

#jobs SAME  DATASET:  total  ,me

DIFFERENT  DATASETs:  total  

1 01:32,3

2 01:43,5

5 02:42,8 02:24,0

10 03:19,7 02:31,5

15 04:31,1

20 05:35,2 02:55,7

26  (13x2  different  machines) 06:32,5

30  (15x2  different  machines) 07:49,7 03:49,2

36  (18x2  different  machines) 08:49,4

40  (20x2  different  machines) 09:28,3 04:07,2

00:00.0$

01:26.4$

02:52.8$

04:19.2$

05:45.6$

07:12.0$

08:38.4$

10:04.8$

0$ 10$ 20$ 30$ 40$ 50$

same$query$

different$queries$

same dataset different dataset

Page 14: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

Hadoop cluster load tests: two users

14

SAME  DATASET:  total  ,me

DIFFERENT  DATASETs:  total  ,me

#  total  jobs USER  1 USER  2 USER  1 USER  2

6  (3+3) 02:35,5 02:36,2 01:39,7 01:32,9

10  (5+5) 02:58,2 03:11,3 02:32,7 02:00,1

16  (8+8) 04:32,3 04:34,2 02:27,7 01:59,7

20  (10+10) 05:11,5 05:05,6 02:32,6 02:03,4

26  (13+13) 06:07,3 05:47,6 02:46,2 02:05,2

30  (15+15) 07:44,6 07:42,1

00:00.0$

01:26.4$

02:52.8$

04:19.2$

05:45.6$

07:12.0$

08:38.4$

0$ 5$ 10$ 15$ 20$ 25$ 30$ 35$

Andrea$

Claudia$

dif_queries_Andrea$

dif_queries_Claudia$

sd: USER 1 sd: USER 2 dd: USER 1 dd: USER 2

Page 15: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

Hadoop cluster load tests: comments

• tests  show  that  

‣ the  max  number  of  jobs/user  should  be  set  to  a  large  number  to  allow  many  jobs  to  run  concurrently  

‣ the  response  ,me  depends  linearly  on  the  number  of  concurrent  jobs.  It  doesn’t  maLer  who  the  concurrent  jobs  belong  to  

‣ the  response  ,me  increase  much  slower  for  queries  on  different  datasets  

• not  surprising,  but  good!

15

Page 16: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

Conclusions ad outlook

16

• The  EventIndex  project  is  on  track  to  provide  ATLAS  with  a  global  event  catalogue  in  advance  of  the  resump,on  of  LHC  opera,ons  in  2015.  

• The  global  architecture  is  defined  and  prototyping  work  is  in  progress;  preliminary  results  are  encouraging.  

• Current  work:  

‣ full  implementa,on  of  the  final  prototype,  loaded  with  Run1  data  

‣ extend  the  catalog  also  to  MC  

‣ comple,on  of  deployment  and  opera,on  infrastructure  

‣ sebng  up  of  monitoring  tool  just  started  

‣ Tes,ng  other  ways  to  store  EI  info  into  HBASE  only  (YAEIS  -­‐  Yet  Another  Event  Index  Server)  

Page 17: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

Conference communications, posters and proceedings

17

• Poster  at  CHEP  2013,  Amsterdam,  Oct  2013  

‣ The  ATLAS  EventIndex:  an  event  catalogue  for  experiments  collec>ng  large  amounts  of  data,  D.Barberis,  A.Favareto,  et  al.  hLp://cds.cern.ch/record/1606537;ATL-­‐SOFT-­‐SLIDE-­‐2013-­‐805  

• Talk  at  CHEP  2013,  Amsterdam,  Oct  2013  

‣ The  future  of  event-­‐level  informa>on  repositories,  indexing,  and  selec>on  in  ATLAS,  D.Barberis,  A.Favareto,  et  al.  hLp://cds.cern.ch/record/1623340;  ATL-­‐SOFT-­‐SLIDE-­‐2013-­‐864  

• Proceedings  for  CHEP  2013,  Amsterdam,  Oct  2013  

‣ The  ATLAS  EventIndex:  an  event  catalogue  for  experiments  collec>ng  large  amounts  of  data,  D.Barberis,  A.Favareto,  et  al.  hLp://cds.cern.ch/record/1606543;  ATL-­‐SOFT-­‐PROC-­‐2013-­‐012  –  published  in  J.  Phys.  Conf.  Ser.  513  042002  doi:10.1088/1742-­‐6596/513/4/042002  

• Proceedings  for  CHEP  2013,  Amsterdam,  Oct  2013  

‣ The  future  of  event-­‐level  informa>on  repositories,  indexing,  and  selec>on  in  ATLAS,  D.Barberis,  A.Favareto,  et  al.  hLp://cds.cern.ch/record/1625848;  ATL-­‐SOFT-­‐PROC-­‐2013-­‐046  –  Published  in  J.  Phys.  Conf.  Ser.  513  042009  doi:10.1088/1742-­‐6596/513/4/042009  

• Talk  at  ISGC  2014,  Taipei,  Mar  2014  

‣ The  ATLAS  EventIndex:  an  event  catalogue  for  experiments  collec>ng  large  amounts  of  data,  D.Barberis,  A.Favareto,  et  al.  hLp://cds.cern.ch/record/1670054;  ATL-­‐SOFT-­‐SLIDE-­‐2014-­‐118  

• Proceedings  for  ISGC  2014,  Taipei,  Mar  2014  

‣ The  ATLAS  EventIndex:  an  event  catalogue  for  experiments  collec>ng  large  amounts  of  data,  D.Barberis,  A.Favareto,  et  al.  hLp://cds.cern.ch/record/1690609;  ATL-­‐SOFT-­‐PROC-­‐2014-­‐002  –  To  be  published  in  Proceedings  of  Science.

Page 18: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

Backup

18

Page 19: The ATLAS EventIndex: An event catalogue for experiments ... fileThe ATLAS EventIndex: An event catalogue for experiments collecting large amount of data Andrea Favareto Università

H/W of Hadoop cluster

19