IBM Software Group
The Use of OGSA-DAI with DB2 Content Manager in the eDiaMoND Project
M Oevers, B Collins, A Knox, J Williams
Overview
• eDiaMoND the project
• Strategies for Virtualisation
• How DB2 and CM are used
• OGSA-DAI enablement of CM
• Lessons Learnt
eDiaMoND – Project Announcement
“One of the pilot e-science projects is to develop a digital mammography archive, together with an intelligent medical decision support system for breast cancer diagnosis and treatment. An individual hospital will not have supercomputing facilities, but through the grid it could buy the time it needs. So the surgeon in the operating theatre will be able to pull up a high-resolution mammogram to identify exactly where the tumour can be found” – Tony Blair (speech to the Royal Society, 23 May 2002)
eDiaMoND Partners
eDiaMoND – Project Deliverables
Breast Screening Programmes
eDiaMoND
Phase 0 Prototype (end-2003)
? (Next Phase)
BluePrint
• Grid Infrastructure
• Grid-connected Workstation
• Database for Storage & Retrieval of Images & Metadata
• Computation for CADe, CADi and Statistical Analyses
• Required Hardware, Software & Network for given Service Levels
Phase 1 Prototype (mid-2004)
eDiaMoND Functional Model
Strategies for Virtualisation
Use II & II4C
Expose through OGSA-DAI
Investigate DQP
Virtualisation – things to remember
• Each Breast Care Unit (BCU) to operate independently from the others
  - Individual organisations coming together to form a Virtual Organisation
• Data loaded locally in each BCU
  - Data is “owned” by the BCU
• Enable read access across all BCUs seamlessly
  - Replication or Federation
  - DB2 II & II4C (Information Integrator and Information Integrator for Content)
• Remember it’s got to be a Grid (eScience project)
  - OGSA-DAI
  - Distributed Query Processing (DQP) over OGSA-DAI
How OGSA-DAI is used with DB2 and CM
• DB2 stores the non-image data in a structured form
  - DICOM describes an ER model: Patient – Study – Series – Image
  - Flexible to allow for multiple modalities
  - Allows flexibility of data modelling, access control and query rewrite
• CM is used to store and manage the large (30MB) DICOM files
  - Files contain both non-image data and image data
  - Identified by DICOM SOP Instance UID
  - Flat CM data model (customer requirement)
• Both exposed as OGSA-DAI services
DICOM – Digital Imaging and Communications in Medicine
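The Patient – Study – Series – Image hierarchy could be modelled roughly as follows. This is only a minimal in-memory sketch: the class names, field names and the "MG" modality value are illustrative assumptions, not the eDiaMoND DB2 schema.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of the DICOM hierarchy described above.
// Names are illustrative; they are not taken from the eDiaMoND schema.
final class DicomImage {
    final String sopInstanceUid; // key that identifies the DICOM file in CM
    DicomImage(String sopInstanceUid) { this.sopInstanceUid = sopInstanceUid; }
}

final class DicomSeries {
    final String modality; // e.g. "MG", the DICOM code for mammography
    final List<DicomImage> images = new ArrayList<>();
    DicomSeries(String modality) { this.modality = modality; }
}

final class DicomStudy {
    final String studyUid;
    final List<DicomSeries> series = new ArrayList<>();
    DicomStudy(String studyUid) { this.studyUid = studyUid; }
}

final class DicomPatient {
    final String patientId;
    final List<DicomStudy> studies = new ArrayList<>();
    DicomPatient(String patientId) { this.patientId = patientId; }

    // Walk Patient -> Study -> Series -> Image and collect the image UIDs.
    List<String> sopInstanceUids() {
        List<String> uids = new ArrayList<>();
        for (DicomStudy study : studies) {
            for (DicomSeries s : study.series) {
                for (DicomImage image : s.images) {
                    uids.add(image.sopInstanceUid);
                }
            }
        }
        return uids;
    }
}
```

Keeping the modality on the series rather than on the image is one way to leave room for multiple modalities; the structured data would live in DB2, while each UID points at a file held in CM.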
[Architecture diagram]
Client Layer: Administration Client, Screening Workflow, Viewer Client
Grid Layer: Query Service (persistent), Worklist Service (transient), Retrieve Service (persistent)
Grid Layer: OGSA-DAI Service (persistent) in front of each data store
Data Layer: DB2 Instance (Patient ID, DICOM ID); Content Manager Instance (DICOM ID, URL – DICOM ID)
Steps:
1. Query
2. Worklist Create
3. Worklist Consume
4. Retrieve
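The four numbered steps can be walked through with in-memory stand-ins for the two data stores. This is a rough sketch only: it assumes DB2 relates a patient ID to DICOM IDs and CM resolves a DICOM ID to a URL, and every class and method name here is hypothetical rather than part of the eDiaMoND services.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-ins for the services on the slide.
final class WorklistFlowSketch {
    // Stand-in for the DB2 instance: patient ID -> DICOM IDs
    private final Map<String, List<String>> db2 = new HashMap<>();
    // Stand-in for the CM instance: DICOM ID -> URL of the stored file
    private final Map<String, String> cm = new HashMap<>();

    void store(String patientId, String dicomId, String url) {
        db2.computeIfAbsent(patientId, k -> new ArrayList<>()).add(dicomId);
        cm.put(dicomId, url);
    }

    // Step 1: the Query service looks up the DICOM IDs held for a patient.
    List<String> query(String patientId) {
        return db2.getOrDefault(patientId, new ArrayList<>());
    }

    // Step 2: a transient worklist is created from the query result.
    List<String> createWorklist(List<String> dicomIds) {
        return new ArrayList<>(dicomIds);
    }

    // Steps 3 and 4: the viewer consumes the worklist and the Retrieve
    // service resolves each DICOM ID to the URL of its file.
    List<String> consume(List<String> worklist) {
        List<String> urls = new ArrayList<>();
        for (String dicomId : worklist) {
            String url = cm.get(dicomId);
            if (url != null) {
                urls.add(url);
            }
        }
        return urls;
    }
}
```

The worklist is copied rather than shared so that it can be discarded after use, which mirrors the transient nature of the Worklist Service as opposed to the persistent Query and Retrieve services.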
Grid Development – Phase 0 to Phase 1
[Diagram] Layers: Data Layer, Grid Layer (two tiers), Client Layer
Sites: CHU, KCL, UCL, UED
Client Layer: Admin, Viewer
Grid Layer: QUERY, WORKLIST, RETRIEVE; DB2 OGSA-DAI, CM OGSA-DAI; DB2 FED OGSA-DAI, CM FED OGSA-DAI
Data Layer: CM and DB2 (repeated per site); DB2 Fed., CM Fed.
Also shown: DataLoader, Deploy
CM Grid enablement – What it means
OGSA-DAI conf/ext points | Mapping to CM
Driver Class, e.g. com.ibm.db2.jcc.DB2Driver | Datastore object, e.g. com.ibm.mm.sdk.server.DKDatastoreICM
Driver URI, e.g. jdbc:db2://localhost:50000/SAMPLE | Datastore name, e.g. ICMNLSDB
Connection: DriverManager.getConnection() | Connected Datastore: Datastore.connect()
Metadata: table schema for SQL, XML schema for XML DB | Metadata: ItemTypes and Attributes (could it be treated as an XML DB?)
Mapping of Grid Certificates to DB user and password | Mapping of Grid Certificate to CM user and password

It was possible to map CM concepts to the corresponding JDBC concepts that are exposed in the OGSA-DAI configuration files: 2 XML files to edit and 2 Java classes to write.
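The JDBC column of this mapping, together with the certificate-to-user step, might look something like the sketch below. `GridCredentialMapper` and `DbLogin` are hypothetical names introduced for illustration and are not OGSA-DAI classes; only `Class.forName` and `DriverManager.getConnection` are standard JDBC usage.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: maps a grid certificate's distinguished name (DN)
// to the database user and password used for the actual connection.
final class GridCredentialMapper {
    static final class DbLogin {
        final String user;
        final String password;
        DbLogin(String user, String password) {
            this.user = user;
            this.password = password;
        }
    }

    private final Map<String, DbLogin> logins = new HashMap<>();

    void register(String certificateDn, String user, String password) {
        logins.put(certificateDn, new DbLogin(user, password));
    }

    DbLogin lookup(String certificateDn) {
        DbLogin login = logins.get(certificateDn);
        if (login == null) {
            throw new IllegalArgumentException("No mapping for " + certificateDn);
        }
        return login;
    }

    // JDBC side of the table: driver class + driver URI -> Connection.
    // The CM side would instead create the datastore object and call
    // connect() with the datastore name and the mapped CM user.
    Connection open(String driverClass, String driverUri, String certificateDn)
            throws SQLException, ClassNotFoundException {
        DbLogin login = lookup(certificateDn);
        Class.forName(driverClass); // e.g. com.ibm.db2.jcc.DB2Driver
        return DriverManager.getConnection(driverUri, login.user, login.password);
    }
}
```

Calling `open` needs the DB2 JDBC driver on the classpath and a reachable database, so only `register` and `lookup` can be exercised in isolation.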
The Gory Details
[Class diagram]
SimpleCMDataResourceImplementation (from dataresource)
  Operations: SimpleCMDataResourceImplementation(), getDatastore(), returnDatastore(), getDatabaseMetaData(), getDriverClass()
  Related: DataResourceImplementation (from dataresource)
CMDataResource (from dataresource)
  Operations: getDatastore(), returnDatastore(), getDriverClass()
RetrieveByUIDActivity (from activity)
  Attribute: mCredentials : java.lang.String
  Operations: RetrieveByUIDActivity(element : Element), processBlock() : void, setContext(context : Context) : void, finalize() : void
  Related: Activity (from engine)
  -mInput: BlockReader (from engine)
  -mOutput: BlockWriter (from engine)
Lessons Learnt
• OGSA-DAI is a flexible framework into which CM fits reasonably well
  - Chaining of activities
  - User-defined activities
  - Developers focus on writing activities
• Use of dynamic discovery to configure the system
  - Useful during development/testing
  - Register more in the registry
• Unifies the view of the system as far as data is concerned
• Experience of grid-enabling an existing product
  - Have not explored how to expose CM metadata yet
Data Load - High Level Design
[Diagram] Load Client: DICOM Parser, Load API, Load Plugin for Core DB, Load Plugin for Core Store
Input and output: DICOM File (Image or SR), XML File with Reference
Across the Grid Boundary: invocation of the OGSA-DAI DB2 Service and the OGSA-DAI CM Service; the CM service pulls the file from the Reference
1. DICOM file gets parsed
2. XML file created with Reference
3. XML file passed to load services
4. CM pulls DICOM file in
5. As simple as possible
Data Load Detailed Design
[Class diagram]
DataLoader
  Attributes: Plugins[] : LoadServicePlugin, XMLDoc : XMLDocument
  Operations: DataLoader(), load(), parse(DicomFile) : XMLDocument, setPlugin(LoadServicePlugin)
  Associations: parses DicomFile (1 to 1), creates XMLDocument (1 to 1), holds LoadServicePlugin (1 to 1..*)
LoadServicePlugin
  Operations: setXMLDocument(XMLDocument), connect(), load(), disconnect(), configure(Configuration)
  Specialisations: ImageStorePlugin, NonImageStorePlugin
Diagram annotations: IBM, OUCL
• Plugin Architecture
• Decoupling
• Configuration of Plugin to decide
• Parser also pluggable
• API as simple as possible
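The plugin design could be sketched along the following lines. Method names follow the class diagram, but the bodies, the use of a plain String for the XML document, the `Sketch` suffixes and the `RecordingPlugin` class are assumptions made to keep the example self-contained; `configure(Configuration)` is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Method names follow the class diagram; bodies and the String-based
// document type are assumptions, not the eDiaMoND implementation.
interface LoadServicePluginSketch {
    void setXMLDocument(String xmlDocument);
    void connect();
    void load();
    void disconnect();
}

final class DataLoaderSketch {
    private final List<LoadServicePluginSketch> plugins = new ArrayList<>();
    private String xmlDoc;

    void setPlugin(LoadServicePluginSketch plugin) {
        plugins.add(plugin);
    }

    // Placeholder for the pluggable DICOM parser: emits an XML document
    // that carries a reference to the DICOM file rather than its content.
    // No XML escaping is done here; a real parser would need it.
    String parse(String dicomFileReference) {
        xmlDoc = "<dicom reference=\"" + dicomFileReference + "\"/>";
        return xmlDoc;
    }

    // Hand the same XML document to every registered plugin in turn.
    void load() {
        for (LoadServicePluginSketch plugin : plugins) {
            plugin.setXMLDocument(xmlDoc);
            plugin.connect();
            try {
                plugin.load();
            } finally {
                plugin.disconnect(); // always release the connection
            }
        }
    }
}

// Stand-in plugin that only records the calls it receives.
final class RecordingPlugin implements LoadServicePluginSketch {
    final List<String> calls = new ArrayList<>();
    private final String name;

    RecordingPlugin(String name) { this.name = name; }

    public void setXMLDocument(String xml) { calls.add(name + ":set:" + xml); }
    public void connect() { calls.add(name + ":connect"); }
    public void load() { calls.add(name + ":load"); }
    public void disconnect() { calls.add(name + ":disconnect"); }
}
```

Because the loader only sees the interface, an image-store plugin and a non-image-store plugin can be registered side by side, and which ones run is decided purely by what gets registered.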
eDiaMoND API
eDiaMoND - Organisation
[Network diagram] Participating sites, connected over the JANET Network:
• Development (Mirada)
• Development (OUCL)
• Development (IBM)
• Oxford / Churchill
• UCL / St Georges
• KCL / Guys
• Edinburgh
• Aberdeen
Each site LAN (OUCL, Oxford, Edinburgh, Aberdeen, Mirada, UCL, KCL, IBM) connects through VPN & FW to an eDiaMoND LAN.
Legend: Workstation, Server, T221, Grid Boundary
Federation setup DB2
Federated database: DB=FEDCORE, Node=edibm
  View cis.patient = edibm.patient union edouc.patient
  Server = edibm, Nickname = edibm.patient
  Server = edouc, Nickname = edouc.patient
Source databases:
  DB=EDCORE, Node=edibm, Table=cis.patient
  DB=EDCORE, Node=edouc, Table=cis.patient
• Create view over union of nicknames of identical tables
• No query rewrite necessary
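The view over the nicknames can be generated mechanically for any number of nodes. The helper below is a hypothetical sketch that only builds the SQL text; it uses UNION ALL on the assumption that rows from different nodes are distinct, whereas the slide says only "union".

```java
import java.util.List;
import java.util.StringJoiner;

// Builds the statement for a view over the union of per-node nicknames.
// Hypothetical helper; it does not execute anything against DB2.
final class FederatedViewSketch {
    static String createViewSql(String schema, String table, List<String> nodes) {
        StringJoiner selects = new StringJoiner(" UNION ALL ");
        for (String node : nodes) {
            // Each nickname is named <node>.<table>, as on the slide.
            selects.add("SELECT * FROM " + node + "." + table);
        }
        return "CREATE VIEW " + schema + "." + table + " AS " + selects;
    }
}
```

For schema `cis`, table `patient` and nodes `edibm` and `edouc`, this yields `CREATE VIEW cis.patient AS SELECT * FROM edibm.patient UNION ALL SELECT * FROM edouc.patient`. Clients keep querying `cis.patient` exactly as they would on a single node, which is why no query rewrite is needed.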
The M Diagram
eDiaMoND – Non-Functional
[Diagram] Functional areas on the Grid: Screening, Diagnosis, Teaching, Training, Epidemiology
Non-functional concerns: Non-Repudiation, Systems Administration, Ethics, Legal, Security, Performance, Manageability, Scalability, Auditability, …
Specific requirements:
• Anonymisation
• 256MB & 5 secs response
• Lossless Compression
• Encryption
• ~100 Centres
Phase 1 Deployment
[Deployment diagram] Sites: UCL, KCL, CHU, UED, GUY, IBM, MIR, OUCL, SCO, GEO
Networks: JANET / Internet; site LANs (UCL LAN, UED LAN, CHU LAN, GUY LAN, OUCL LAN, MIR LAN, IBM LAN); eDiaMoND LAN
Grid nodes: eDiaMoND Grid Node, eDiaMoND Dev. Grid Node, eDiaMoND Test Grid Node, eDiaMoND Demo Grid Node, IBM Dev. Grid Node
Workstations: eDiaMoND W/S, eDiaMoND Dev. W/S, eDiaMoND Demo W/S, Digitiser W/S (with digitiser), each shown with T221 displays
Other: eDiaMoND Repository Server
UK Breast Screening – Challenges
• Began in 1988
• Women aged 50-70, screened every 3 years
• 2 views/breast + demographic increase
• ~100 Breast Screening Programmes: Scotland, Wales, Northern Ireland, England
• 2,000,000 screened every year
• 120,000 recalled for assessment
• 10,000 cancers
• 1,250 lives saved
• 230 radiologists (double reading)
• 50% workload increase
• Digital
Breast Cancer Facts
1 in 8 women will develop breast cancer in the course of their lives; 1 in 28 will die of it
In the EC breast cancer accounts for 19% of cancer deaths and 24% of cancer cases
Diagnosed in 348,000 women in EC+USA and kills 115,000 women annually
1,000,000 new cases world-wide in 1997
Rationale for Screening
Early diagnosis = better prognosis
Detection at 0.5cm has a favourable outcome in 99% of cases, but at 2cm in only 50%
UK Breast Screening Programme
[Flow diagram]
Call: 1000
Screening: All Clear 960 (914), Recall 40 (86)
Assessment: All Clear 34 (80), Cancer 6
Missed: 1 (Interval Cancers)
Related activities: Epidemiology, Training
~100 Breast Screening Programmes
Diagram labels: Previous, Current
The recall rate is 86 for first-time screening, as no comparison is possible with a previous screening
Project Teams
Grid Infrastructure Team
• IBM
• Oxford University Computing Laboratory
Image Analysis Technology Team
• Dept of Engineering Science
• Mirada Solutions
Image Collection & Clinical Assessment Team
• St Georges Hospital
• Guy’s and St Thomas’ Hospitals
• Oxford Radcliffe Hospitals
• Kings College London
• University College London
• University of Edinburgh
Mammograms have very different appearances, depending on image settings and acquisition systems
The “interesting tissue” representation is a surface independent of scanner
SMF® - Mirada’s Patented Standardisation Process
Mirada’s Interesting Tissue Representation (Hint)
A quantitative representation of breast tissue density
[Diagram labels: Compression Plates, Fatty Tissue, Glandular Tissue, Tumour; dimensions 1cm and 1.0 cm]