A GRID-BASED e-INFRASTRUCTURE
FOR DATA ARCHIVING/COMMUNICATION
AND COMPUTATIONALLY INTENSIVE APPLICATIONS
IN THE MEDICAL SCIENCES
neuGRID Public Event
Brussels, January 26th, 2011
Prof Richard H McClatchey
The Consortium
Vrije Universiteit Medical Centre, THE NETHERLANDS
Frederik Barkhof
CF consulting s.r.l., ITALY
Carla Finocchiaro
National Alzheimer’s Centre Fatebenefratelli, Brescia, ITALY
GB Frisoni, Coordinator
Karolinska Institutet, SWEDEN
Lars-Olof Wahlund
University of the West of England, Bristol, UK
Richard McClatchey, Technical Supervisor
Prodema GmbH, SWITZERLAND
Christian Spenger, Alex Zijdenbos
Maat Gknowledge SL, SPAIN
David Manset
HealthGrid, FRANCE
Yannick Legré, Tony Solomonides
The Project Aims
Overall Objective
To deploy existing e-Infrastructures (EC-funded MammoGrid and AddNeuroMed’s LORIS) into a new user-friendly Grid-based research e-Infrastructure (neuGRID) able to archive/exchange digital biomedical images and perform computationally intensive data analyses.
Further Aims
To deploy a ‘Service Oriented Architecture’ (SOA) to mediate between user applications, the backend system and other systems through the Grid. Functionalities will be isolated and their interfaces well defined, producing services suited to adaptability, re-use & scalability.
To bring concepts of the imaging lab and advanced tools to inexperienced research centres and to provide an environment for the development and validation of new algorithms by experienced users.
The Approach
• User-driven requirements analysis.
• Rapid prototyping with user verification.
• Close collaboration between partners.
• Service-oriented architecture.
• Adherence to emerging standards.
• Regular technical review of prototypes.
• Tested with periodic data challenges.
• Reported regularly for peer review at conferences / symposia.
A Use Case Example
• Identify an existing pipeline.
• Plan and run the pipeline.
• Generate provenance and output data.
• Browse the collected data.
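The four steps of the use case can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Pipeline` and `ProvenanceRecord` classes, the `REGISTRY` lookup and the toy processing steps are hypothetical and do not reflect the actual neuGRID services or their APIs.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def digest(data: bytes) -> str:
    """Short content hash used to identify a step's input and output."""
    return hashlib.sha256(data).hexdigest()[:12]


@dataclass
class ProvenanceRecord:
    """One entry per executed step: what ran, on which input, giving which output."""
    step: str
    input_digest: str
    output_digest: str


@dataclass
class Pipeline:
    """A named, ordered list of processing steps (hypothetical structure)."""
    name: str
    steps: List[Callable[[bytes], bytes]]
    provenance: List[ProvenanceRecord] = field(default_factory=list)

    def run(self, data: bytes) -> bytes:
        # Run each step in order and record provenance as a side effect.
        for step in self.steps:
            result = step(data)
            self.provenance.append(
                ProvenanceRecord(step.__name__, digest(data), digest(result))
            )
            data = result
        return data


# Toy stand-ins for real image-processing steps (assumptions, not real tools).
def strip_header(scan: bytes) -> bytes:
    return scan[4:]


def normalise(scan: bytes) -> bytes:
    return scan.lower()


# 1. Identify an existing pipeline (here: look it up in a hypothetical registry).
REGISTRY: Dict[str, Pipeline] = {
    "toy-preprocess": Pipeline("toy-preprocess", [strip_header, normalise]),
}
pipeline = REGISTRY["toy-preprocess"]

# 2. Plan and run the pipeline; 3. provenance and output data are generated.
output = pipeline.run(b"HDR:VOXELDATA")

# 4. Browse the collected data.
for record in pipeline.provenance:
    print(record.step, record.input_digest, "->", record.output_digest)
```

Chaining the digests (each step's output digest equals the next step's input digest) is one simple way to make a result traceable back to its source data; a production provenance service would record far more, such as tool versions and parameters.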
[Figure: layered architecture, built up over successive slides. Layers in the order shown: MammoAnalyses, BrainAnalyses, Your Favorite Medical Analyses Applications; Distributed Medical Services; GRID Services (EGEE compliant); MammoGrid, NeuGrid]
Service Provision in NeuGRID
Service Design Philosophy
Service Oriented Architecture will drive services
• Reusable and scalable
– Services can be reused across domains & middleware
– Scalable, isolated and self-contained.
• Services need to be open and extensible
– Loose coupling between services and their components
– Non-proprietary in nature
– Contracts and communication based on WSDL
• Services need to follow generally agreed standards
– SOAP based services
– Implementation is protocol independent
– Interoperability then made realisable
• Services need to be middleware agnostic
– Services should not be tied to a particular middleware
– Services can be deployed on any Grid Middleware including the gLite Middleware
Services enable interoperability between infrastructures
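The "middleware agnostic" principle can be sketched in Python as a service that depends only on an abstract contract, with each middleware supplied as an adapter. This is a sketch under stated assumptions: `GridMiddleware`, `InMemoryMiddleware` and `PipelineService` are hypothetical names, the adapter is an in-memory stand-in rather than a real gLite client, and the WSDL/SOAP layer named on the slide is not modelled.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class GridMiddleware(ABC):
    """Contract every middleware adapter must satisfy (hypothetical)."""

    @abstractmethod
    def submit(self, job_description: str) -> str:
        """Submit a job and return an opaque job identifier."""


class InMemoryMiddleware(GridMiddleware):
    """Stand-in adapter; a real one would wrap an actual middleware client."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.jobs: List[str] = []

    def submit(self, job_description: str) -> str:
        self.jobs.append(job_description)
        return f"{self.prefix}-{len(self.jobs)}"


class PipelineService:
    """Service logic that knows only the contract, never a concrete middleware."""

    def __init__(self, middleware: GridMiddleware) -> None:
        self.middleware = middleware

    def run_pipeline(self, pipeline_name: str, scan_ids: List[str]) -> Dict[str, str]:
        # One job per scan; the mapping lets callers track each submission.
        return {
            scan_id: self.middleware.submit(f"{pipeline_name} {scan_id}")
            for scan_id in scan_ids
        }


# The same service code runs unchanged against two different adapters.
service_a = PipelineService(InMemoryMiddleware("middleware-a"))
service_b = PipelineService(InMemoryMiddleware("middleware-b"))
jobs_a = service_a.run_pipeline("CIVET", ["scan-001", "scan-002"])
jobs_b = service_b.run_pipeline("CIVET", ["scan-001"])
print(jobs_a)
print(jobs_b)
```

Swapping the middleware then means writing a new adapter, not changing the service, which is the loose coupling the bullets above call for.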
[Figure: Grid Coordination Center; Data Coordination Center; LORIS; DACS1, DACS2, DACS3; Slave LORIS (x3); Grid; SOA; Workflow; Provenance; Pipeline]
neuGRID Infrastructure
neuGRID DACS are connected to the GEANT2 Network
[Figure: network map with link speeds 100 Mb/s, 100 Mb/s, 1 Gb/s and 20 Mb/s]
neuGRID Infrastructure connects EGI Grid resources
Thousands of CPUs
Petabytes of storage
Strong liaison with BiomedVO/LSVRC
Alzheimer’s Disease Neuroimaging Initiative
- To help researchers and clinicians in developing new treatments and testing their efficacy.
- The ADNI is a multisite, multiyear program which began in October 2004.
- More than 700 subjects recruited: 200 elderly controls, 400 with mild cognitive impairment (MCI) and 200 with Alzheimer's disease (AD).
- Subjects have been followed for 2-3 years and have been seen approximately every 6 months.
neuGRID Data Challenge
Analyzing the US-ADNI Database
[Figure: Expected Results]
Latest Data Challenge
Facts & Figures
Experiment duration on the Grid: < 2 Weeks
Experiment duration on single computer: > 5 Years
Analyzed data:
– Patients: 715
– MR Scans: 6’235
– Images: ~1’300’000
– Voxels: ~9’352’500’000
Total mining operations: 286’810
Max # of processing cores in parallel: 184
Number of countries involved: 4
Volume of output data produced: 1 TB
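A few derived ratios help put these figures in context. The short Python calculation below uses only the numbers on the slide; the variable names are illustrative, the durations are treated as bounds ("< 2 weeks" and "> 5 years"), and a year is assumed to be 365 days, so the speed-up is a rough lower bound rather than a measured value.

```python
# Figures taken from the slide (images and voxels are approximate there).
patients = 715
mr_scans = 6_235
images = 1_300_000
voxels = 9_352_500_000
mining_operations = 286_810
max_parallel_cores = 184

# Durations as bounds: Grid run under 14 days, single computer over 5 years.
grid_days_upper = 14
single_days_lower = 5 * 365  # assumption: 365-day years, leap days ignored

min_speedup = single_days_lower / grid_days_upper
scans_per_patient = mr_scans / patients
images_per_scan = images / mr_scans
ops_per_scan = mining_operations / mr_scans
voxels_per_scan = voxels / mr_scans

print(f"Speed-up over a single computer: at least ~{min_speedup:.0f}x")
print(f"Scans per patient (average):     {scans_per_patient:.2f}")
print(f"Images per scan (average):       {images_per_scan:.1f}")
print(f"Mining operations per scan:      {ops_per_scan:.0f}")
print(f"Voxels per scan:                 {voxels_per_scan:,.0f}")
```

The totals divide evenly: 286’810 operations over 6’235 scans is exactly 46 per scan, and the voxel count is exactly 1’500’000 per scan. That the lower-bound speed-up of roughly 130x sits below the peak of 184 cores is consistent with the run not holding peak parallelism throughout, though the slide does not state this.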
User Testing and Training, 2010
• Validation of use-cases + training sessions.
• From requirements to delivered functionality, led by the User Manager.
• Face-to-face meetings between software developers and clinical researchers in Stockholm (Karolinska), Brescia (IRCCS), at CERN and in Brussels.
• Demonstrations of the Grid infrastructure plus services using real, anonymized data.
• Results from user sessions fed back into final system delivery and platform testing and into the business plan.
neuGRID: Significant Results
• A stable, tested and deployed Grid-based infrastructure with strong user engagement.
• Service-oriented architecture: Portal, Querying, Workflow, Provenance, Glueing, Anonymization and LORIS services.
• Successful grand data challenges achieved using the CIVET pipeline and EGEE-gLite.
• Numerous peer-reviewed publications and awards/prizes at infrastructure community events.
• Strategic international collaboration initiated between EU, Canada and the US -> OUTGrid.
Quotes from users. neuGRID....
“has the functionality to access a large db of imaging files combined with the option for powerful computing”
“seems to be one of the few information technology systems in healthcare that actually works”