gLite – An Outsider’s View
Stephen Burke, RAL
January 31st 2005 gLite overview
Introduction
• A personal view of the current situation
  – Asked to be provocative!
  – Some things may be wrong
    • Accurate information can be hard to obtain
• History
• Current situation
• Future
What was supposed to happen
• The original idea was to harden/re-engineer the deployed LCG middleware
  – Short development cycles driven by user feedback
  – No “big bang” releases
  – No major new development
• Autumn 2003: the ARDA RTAG recommended a new architecture based on AliEn
  – Set up a prototype system quickly
  – Rapid development endorsed
What actually happened
• EGEE started well after EDG finished
  – Large gap: December 2003 -> April 2004
• EDG infrastructure (CVS, build system, bug tracking, developer guidelines, testbeds, …) all scrapped
  – New system may be better (?), but it took ~7 months to put it in place
• JRA1 “prototype” was essentially AliEn
  – Only two sites, of which one was hardly supported
  – ARDA project was set up and started using the prototype
    • The members (some with no LCG experience) got used to AliEn
• LCG forged ahead with middleware improvements
  – Middleware quality/stability much improved
  – But the experiments are still unhappy
AliEn -> gLite
• JRA1 wrote architecture and design documents for a major middleware development project
  – EDG experience suggests it will take years, not months
  – Not obviously driven by NA4 or SA1 requirements
  – AliEn pushed aside
    • RB will support pull model as well as push
  – Migration to web services – everyone seems to like this, but what is the real gain in the short term?
    • Web service code mostly not available yet anyway
• Big bang release is back!
  – Hardly any testing so far by SA1/LCG
  – Or most users
  – Information is limited
Testbeds
• EDG testbed(s) and ITeam were successful
  – Both effectively lost in EGEE
• “Prototype” testbed not very useful
  – Effectively just one site, few machines
  – Not really a prototype – misled people about what to expect
• JRA1 testing testbed and test team effectively co-opted for integration
  – Reduced resources for testing
  – Few sites, limited manpower
  – Already ~600 bugs in Savannah, growing rapidly
    • 260 closed, another 130 fixed and being tested
    • EDG had ~2500 bugs by the end
• SA1 PPS just starting at the start of 2005
  – Role still unclear
Workload management
• Development of the EDG/LCG RB
  – Seems to be largely backward-compatible
  – Only user docs so far are the EDG manuals
• Not clear what new features are available
  – Or whether LCG mods are included
• Not AliEn!
  – Support for pull model via new CEMon component
• Still uses BDII with GLUE schema
  – Should change to R-GMA (?)
Data Management
• Very complex design, largely new code
  – No real user documentation, just javadoc
  – Mostly not delivered yet
  – Not clear how much will be in RC1
• Metadata activities also in ARDA and GridPP
• Still seem to be developing the architecture
  – Particularly the interaction with the WMS
  – WMS hedging its bets
    • Supports both (gLite and LCG) systems
• LCG has also been developing DM tools
  – New file catalogue on its way
  – How do they relate to gLite?
Data Storage
• gLite has no development of its own, relying on SRM projects
• EDG-SE was not stable enough for production
  – Still in development?
• dCache almost ready, but has taken ~18 months and still has many bugs
  – Support unclear
• New LCG Disk Pool Manager
  – Only an alpha version so far
• Is storage management really this hard?
  – Will the “classic SE” ever die?!
R-GMA
• Should be an information system
  – But both LCG and gLite still use BDII
• Some user documentation available
• gLite version is fairly backward-compatible with LCG version
  – No web services yet
  – gLite version getting “fast track” into LCG
• Still few users
  – But needed for APEL accounting
Security
• “Security must be built in from the start”
  – So it gets a separate activity!
• EGEE security requirements document nearly identical to EDG D7.5 from May 2002
  – Which was mostly not implemented …
• Both LCG and gLite intend to use VOMS
  – But still not yet integrated with most middleware
  – No real strategy for how to use it?
  – Who “owns” VOMS?
Others
• Package management
  – gLite is developing a software package manager
    • and so is LCG!
  – May be useful, no experience yet
• GAS
  – Came with AliEn, not clear if anyone wants it
Operational issues
• System Design
  – Neither SA1 nor JRA1 has anyone designing how the complete system should work
• Configuration
  – The existing system has a very complex configuration which is the source of many problems
  – Being addressed in JRA1, but not clear if it will really make things better
• Stability and debugging
  – In a big Grid some things are always broken
  – Error messages and logging must allow problems to be traced
  – Services need to be fault-tolerant
  – Not clear if JRA1 is addressing this
What happens next?
• Code being delivered to SA1, will run on PPS
• All serious bugs supposed to be fixed by March
  – EDG experience is that it took >1 year to go from code delivery to production use – some things never made it!
• Migration strategy?
  – Hard if you don’t know what will work
  – LCG has its own developments, especially in data management
  – New R-GMA is largely backward-compatible
    • And not critical yet
  – New RB seems similar to current version
    • At least in push mode
• ALICE (and LHCb?) want AliEn
  – Data management is completely different
• Big bang releases
  – Code has now been branched
  – Will developers be keen to fix bugs in the “old” branch?
  – “Wait for the next version, it will all be fixed then”!