A database for medical image management

Esperanza Marcos a, César J. Acuña a,∗, Belén Vela a, José M. Cavero a, Juan A. Hernández b

a Kybele Research Group, Rey Juan Carlos University, C/Tulipán s/n, 28933 Móstoles, Madrid, Spain
b GTEBIM Research Group, Rey Juan Carlos University, C/Tulipán s/n, 28933 Móstoles, Madrid, Spain

Computer Methods and Programs in Biomedicine 86 (2007) 255–269
Journal homepage: www.intl.elsevierhealth.com/journals/cmpb
0169-2607/$ – see front matter © 2007 Elsevier Ireland Ltd. All rights reserved.
doi:10.1016/j.cmpb.2007.03.006

Article history: Received 8 May 2006; received in revised form 9 March 2007; accepted 9 March 2007

Keywords: Databases; Medical images; Model driven development; XML; DICOM

∗ Corresponding author. Tel.: +34 91 488 8140; fax: +34 91 488 8530.
E-mail addresses: [email protected] (E. Marcos), [email protected] (C.J. Acuña), [email protected] (B. Vela), [email protected] (J.M. Cavero), [email protected] (J.A. Hernández).

Abstract

MEDIMAN (Medical Image MANagement) is a web information system (WIS) for medical image management and processing currently used by neuroscientists and clinicians at several medical and research centres in Spain for research and clinical trials. While developing the MEDIMAN database (DB) we encountered several design challenges unlike those arising in traditional DBs. This paper describes the development of MEDIMAN focusing on the database and the use of the database development process proposed in Midas, a model-driven framework for WIS development. Special attention is given to the design decisions made at each stage to address the challenges encountered.

1. Introduction

At present, medical image modalities like positron emission tomography (PET) and functional magnetic resonance imaging (fMRI) make it possible to obtain images of quantitative and qualitative cerebral activity, which neuroscientists and clinicians like neuroradiologists, neurologists, neuropsychiatrists, etc., use in their research. If, for example, they are running clinical trials to evaluate a new drug or pathology, the images acquired must be processed in very different ways for their later correction and statistical analysis.

The volume of data and the complexity of its management mean that researchers do not only need a suitable storage service for the huge number of images they use, but they also need to process them. This involves applying standardized procedures to different sets of images for such purposes as registering, filtering, statistical analysis, viewing and storing in several formats. Moreover, it is very useful to represent the different formats they use in one single format to enable cross-application information exchange. Most of the images are acquired in DICOM [1] (digital imaging and communication in medicine), the most widely accepted standard for the interchange of medical images in digital format [2]. Not only is DICOM used as a format but ANALYZE [3] is also considered, which, though a proprietary format, is widely used for fMRI and PET.

MEDIMAN [4–6] is a web information system (WIS) for medical image management and processing through the Web involving different hospitals and research institutions, mainly, though not only, in the neuroimaging area. Neuroscientists and clinicians are using it successfully for research and clinical trials in several medical and research institutions such as Ruber International, Fundación Hospital de Alcorcón and the University Hospital of Salamanca, among others. The research currently under way with the images stored in MEDIMAN includes studies of psychic diseases such as schizophrenia and photophobia, and of migraine and other types of pain.



The main goals of MEDIMAN are to offer neuroscience researchers and clinicians easily accessible case history image storage and suitably standardized processing and analysis of medical images. It also stores the results of medical image processing jointly with the original images in a database (DB). Another of MEDIMAN's major aims is to facilitate medical information exchange by means of a common representation of the different formats for image file representation.

Today, medical imaging units can store and transmit medical images using picture archiving and communication systems (PACS) and give information about them using part of the functionality provided by radiological information systems (RIS). Such clinical information systems are oriented to individual patient information management, but not to group management under different experimental conditions. Management of groups of patients and of the mathematical procedures applied to their data is not possible with existing software tools of this type. Similar tools have been designed for cardiology and cancer multicentre trials [7–10], but with no use of images. Nor do existing tools permit storage of information on clinical trials with images, such as the post-processing procedures applied, algorithm parameters, and in general, information for further qualitative and quantitative analyses. MEDIMAN fills this gap by permitting the storage of the image, post-processing information, and image processing results.

Some existing tools share some of MEDIMAN's objectives, e.g. MyPACS [11], Casimage [12], MedPix [13] and MIRC [14], the first three of which aim to provide an online medical image storage service for teaching and research purposes. The MIRC site, by contrast, allows users to access materials published on several participating sites, i.e., MIRC allows information retrieval from several image repositories. These tools only offer storage facilities but no other standardized image processing service.

Although most of the mentioned proposals use the eXtensible Markup Language (XML), none of them stores the XML information in an XML database, as MEDIMAN does. Some of them manage the XML files individually in a file system, which becomes a problem when the number of image files grows. The other approaches use object-relational (OR) databases to store the XML data so, as we discuss in Section 6, part of the semantics of the XML file could be lost. It should be pointed out that they do not use XML to encode the information of the DICOM and Analyze files, but they do use it to encode specific teaching files, including metadata about the image files.

During the MEDIMAN DB development process we encountered several design challenges differing from those found in traditional DBs, which we attribute to two causes: the kind of information that MEDIMAN has to handle – structured data, semi-structured information, images, repetitive groups, etc. – and the exchange facilities required for image file information.

The work presented here focuses on the storage service of MEDIMAN, and on its database design and development, with special emphasis on the ways the different challenges were overcome.

Two categories of data are managed by MEDIMAN: firstly, medical image information, i.e. the data of each image file and the results of the image processing; secondly, information on clinical trials, etc. Although the former may be considered as semi-structured data, the information in the latter category is highly structured. The main reasons for differentiating between the two types of data, which constituted one of the main challenges in the design and development of the MEDIMAN database, are explained in detail in Section 2. This study addresses the problem by merging two different database technologies for storing the different types of information: an object-relational (OR) database for the highly structured data and an XML-DB to manage the semi-structured set. Moreover, representing DICOM and Analyze information in XML improves the exchange and integration of medical information in a broader context than the purely medical. There is a range of tools and equipment compliant with the DICOM and Analyze formats which can read and process image data in them, but the exchange capability among them is limited. However, by using XML, which is a standard exchange language, not only in the medical environment but also in almost any discipline, we ensure the sharing of medical image data by means of any tool or equipment able to read XML data.

Multimedia data usually requires special technology for optimal storage, access, indexing and retrieval. Relational, object-oriented and OR DBs have been used (sometimes extended with appropriate characteristics such as types, objects, query languages) for storing this kind of data [15,16]. Several database management systems (DBMSs) support multimedia data types (Informix Dynamic and DB2 Universal DataBase of IBM, Oracle 10g, CA-JASMINE, Sybase, etc.). Of these, we chose Oracle 10g because, in addition to supporting multimedia data types, it integrates XML and OR technologies very well.

In the development of the MEDIMAN database we applied Midas [17], a model-driven framework for the development of WIS with an architecture based on the model-driven architecture (MDA) proposed by the Object Management Group (OMG) [18,19]. This model-driven approach has several advantages, e.g. maintenance and migration are easier (owing to the development of different models at different levels of abstraction), as is the semi-automatic transformation between models by means of a set of mapping rules, which facilitates the creation of development tools. This study will show the usage of these models and Midas techniques in relation to WIS database development.

There have been several attempts to address the database development process, such as [20,21] for OR databases, and initiatives such as [22–25] to transform unified modeling language (UML) conceptual models into XML schemas. Nevertheless, we have found no other methodological framework for the systematic development of DBs which takes into account both structured (OR) and semi-structured (XML) data jointly, as was necessary for the development of the MEDIMAN database.

This study addresses the development of MEDIMAN from the point of view of its database, using the development process proposed in Midas, with special emphasis on the design decisions made at each stage.

The remainder of the paper is structured as follows. Section 2 describes the background to the project, with mention of related work; Section 3 discusses the design considerations made during the development of MEDIMAN and the methods used to develop its database; Section 4 describes MEDIMAN, detailing its goals, architecture, functionality and the database development process; Section 5 presents a status report of MEDIMAN. Finally, Section 6 describes the lessons learned and Section 7 presents the conclusions and future work.

2. Background

This section deals with other systems providing similar functionalities to those offered by MEDIMAN. Related work can be roughly divided into that oriented to medical image data management and that focused on image processing and data analysis.

The first group includes SRB (storage resource broker) [26], which supports shared collections that can be distributed across multiple organizations and heterogeneous storage systems. It is one of the most popular examples of middleware used to build a Data Grid and to support directly the use of DICOM storage servers as data grid sources. SRB is a generic type of middleware for the integration of different data sources, often used to federate different PACS through a specific driver for DICOM sources, as e.g. in Ref. [27].

Specifically, in the area of neuroimaging, we can also mention BIRN (biomedical informatics research network) [28], which represents an attempt to develop a "protocol" for collaborative research among neuroscientists and medical scientists. A core goal of BIRN is the development of a multi-institution information management system to support biomedical research. Each participating institution maintains a database of its experimental or computationally derived data, and the data integration system performs semantic integration over the databases to enable researchers to carry out analyses based on larger and broader datasets than those available in individual institutions.

In the second group of related work, we mention the LONI Pipeline (Laboratory of Neuro Imaging), a simple graphical environment for constructing complex scientific analyses of data. It provides a visually intuitive interface to data analysis while also allowing diverse programs to interact seamlessly. The Pipeline allows researchers to share their analysis methods easily and provides a simple platform for distributing new programs, as well as program updates, to a given community. The environment also takes advantage of supercomputing environments by automatically parallelizing data-independent programs in a given analysis whenever possible.

There are two essential differences between the systems mentioned in this section and MEDIMAN. Firstly, none of them allows for exchanging medical image information in a broader context as proposed by MEDIMAN, and, secondly, MEDIMAN considers jointly a solution for data management, by providing an easily accessible storage of case history images, and a solution for the standardized processing and analysis of medical images.

Another group of systems that we should mention in this section comprises those dealing with content-based image retrieval (CBIR). CBIR is an active field of study that refers to the collection of techniques and algorithms which enable querying image databases using image content such as colour, texture, objects and their geometries rather than textual attributes such as image name or other keywords [29]. CBIR has the potential to provide medical doctors with a powerful resource to help make accurate diagnoses. As examples of CBIR systems we can mention IBM's QBIC (query by image content) [30], which marked the start of content-based image retrieval. Another commercial system for image and video retrieval is Virage [31]. Most of the available systems are, however, from academia, and well-known examples include Candid [32], Photobook [33] and Netra [34]. Providing CBIR capabilities has not so far been an aim of the MEDIMAN system; however, CBIR is an interesting and useful approach for extending MEDIMAN's search capabilities.

3. Design considerations

This section describes the design decisions of MEDIMAN focusing on its database. The data managed by MEDIMAN are of two categories according to their structuredness and the level of interchange needed:

Data related to clinical trial management: this includes information concerning the management of the clinical trials, individual tests, tasks performed, users, centres, etc. The users belong to a research or health care centre, and each user creates clinical tests. The clinical tests are performed on a group of individuals (e.g. "Caucasian women aged between 18 and 35 years"). Several tasks are carried out during a clinical test on a specific group of individuals. Each task has several image files associated with it. These image files are uploaded by the users.

Such data is highly structured and a schema can be agreed on for it. Moreover, in most cases, it does not need to be interchanged as often as the data concerning the medical images does. This portion contains information that can be considered private, e.g., details of centres, users, etc., which is not usually interchanged with other applications. It could therefore be stored straightforwardly using a traditional DBMS. We chose Oracle 10g because of its good response times in query resolution and because it integrates object-relational and XML technologies very well. We have made full use of the object-relational characteristics of Oracle, e.g. using REF types instead of traditional foreign keys. Although foreign keys have the advantage of referential integrity, they do not allow direct navigation, so joins are necessary, making response times worse than when using references. Moreover, response to queries involving bidirectional navigation may be improved by implementing associations through bidirectional references.
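The entities and navigation paths described above can be pictured with a small in-memory analogy. The following Python sketch is only an illustration, not the MEDIMAN schema: the class and attribute names (`Centre`, `User`, `ClinicalTest`, `Task`, `centre_of`) are hypothetical, and Python object references merely stand in for Oracle REF columns to show how bidirectional references allow direct navigation without a join-style lookup.

```python
from __future__ import annotations

from dataclasses import dataclass, field


# eq=False keeps identity comparison, which avoids recursive equality
# checks across the bidirectional references.
@dataclass(eq=False)
class Centre:
    name: str
    users: list[User] = field(default_factory=list)


@dataclass(eq=False)
class User:
    login: str
    centre: Centre                      # reference to the owning centre
    tests: list[ClinicalTest] = field(default_factory=list)

    def __post_init__(self) -> None:
        self.centre.users.append(self)  # maintain the inverse reference


@dataclass(eq=False)
class ClinicalTest:
    title: str
    owner: User
    group: str                          # description of the group of individuals
    tasks: list[Task] = field(default_factory=list)

    def __post_init__(self) -> None:
        self.owner.tests.append(self)


@dataclass(eq=False)
class Task:
    name: str
    test: ClinicalTest
    image_files: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        self.test.tasks.append(self)


def centre_of(task: Task) -> Centre:
    """Follow references from a task to its centre, with no key lookups."""
    return task.test.owner.centre
```

With foreign keys, the same question (which centre does a task belong to?) would require matching key values across three tables; with references, each step is a direct dereference in either direction.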

Data related to medical image storage: most medical images are acquired in DICOM (used as standard) and Analyze (widely used in fMRI and PET) formats, DICOM being the standard for medical image interchange in digital format. It defines the data structures for medical images and their related information, the network-oriented services (image transmission, query of an image archive, etc.), formats for the exchange of stored images, and requirements for conforming devices and programs [35]. Internally, a single DICOM file contains both a header (which stores the patient's identity, type of scan, image dimensions, etc.) and all the image data (which can contain information in three dimensions). The DICOM header consists of a data set, in turn composed of data elements. It has a flat structure, as depicted in Fig. 1.

Fig. 1 – DICOM header.

A data element is uniquely identified by a data element tag, which is an ordered pair of integers representing a group number followed by an element number. VR is the value representation of the value field defining its data type, and is an optional field in DICOM. The value length field defines the length of the value field. Finally, the value field contains the value(s) of the data element. Each data element defines information associated with the medical image.
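The layout of a data element (tag, VR, value length, value) can be illustrated with a short parsing sketch. This is a minimal example under stated assumptions, not part of MEDIMAN: it handles only the explicit VR, little-endian encoding, it does not handle undefined lengths or nested sequences, and the names `DataElement` and `parse_explicit_vr_le` are hypothetical.

```python
from __future__ import annotations

import struct
from typing import NamedTuple

# VRs whose explicit encoding uses two reserved bytes and a 4-byte length.
LONG_VRS = {b"OB", b"OW", b"OF", b"SQ", b"UT", b"UN"}


class DataElement(NamedTuple):
    group: int      # first half of the data element tag
    element: int    # second half of the data element tag
    vr: str         # value representation (data type of the value field)
    length: int     # length of the value field in bytes
    value: bytes    # raw value field


def parse_explicit_vr_le(buf: bytes, offset: int = 0) -> tuple[DataElement, int]:
    """Parse one data element and return it with the offset of the next one."""
    group, element = struct.unpack_from("<HH", buf, offset)
    vr = buf[offset + 4:offset + 6]
    if vr in LONG_VRS:
        # tag (4) + VR (2) + reserved (2) + 32-bit length (4)
        (length,) = struct.unpack_from("<I", buf, offset + 8)
        start = offset + 12
    else:
        # tag (4) + VR (2) + 16-bit length (2)
        (length,) = struct.unpack_from("<H", buf, offset + 6)
        start = offset + 8
    value = buf[start:start + length]
    return DataElement(group, element, vr.decode("ascii"), length, value), start + length
```

For example, the patient name element (0010,0010) with VR `PN` and the value `Doe^John` occupies 16 bytes in this encoding: 8 bytes of tag, VR and length followed by 8 bytes of value.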

An Analyze (7.5) format image consists of two files: an image and a header file. If the image is named, e.g., "brain", then the files for that image will be called "brain.img" and "brain.hdr". The .img file contains the numbers that make up the information in the image. The .hdr file contains information about the .img file, such as the volume represented by each number in the image (voxel size) and the number of pixels in the X, Y and Z directions. This header contains fields of text, of floating point, integer and other information in pairs (description and value). The .hdr file in Analyze is the counterpart of the data set in DICOM.
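As an illustration of the kind of information held in the .hdr file, the sketch below reads the image dimensions and voxel size from a header held in memory. It is a hedged example, not MEDIMAN code: the byte offsets (`dim` at 40, `datatype` at 70, `bitpix` at 72, `pixdim` at 76, total size 348 bytes) reflect the commonly documented Analyze 7.5 layout and should be treated as assumptions to verify against the format description, and the function name `read_analyze_header` is hypothetical.

```python
import struct

HEADER_SIZE = 348  # assumed size in bytes of an Analyze 7.5 .hdr file


def read_analyze_header(raw: bytes) -> dict:
    """Extract image dimensions and voxel size from an Analyze 7.5 header."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("header too short")
    # The byte order is not stored explicitly; the first int32 (sizeof_hdr)
    # should read as 348 in the correct byte order.
    for endian in ("<", ">"):
        (sizeof_hdr,) = struct.unpack_from(endian + "i", raw, 0)
        if sizeof_hdr == HEADER_SIZE:
            break
    else:
        raise ValueError("not an Analyze 7.5 header")
    dim = struct.unpack_from(endian + "8h", raw, 40)       # dim[1..3] = X, Y, Z
    datatype, bitpix = struct.unpack_from(endian + "2h", raw, 70)
    pixdim = struct.unpack_from(endian + "8f", raw, 76)    # pixdim[1..3] = voxel size
    return {
        "shape": dim[1:4],
        "voxel_size": pixdim[1:4],
        "datatype": datatype,
        "bitpix": bitpix,
    }
```

In practice the bytes would come from reading a file such as "brain.hdr" in binary mode; the matching "brain.img" file then holds the voxel values in the order and data type the header describes.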

Although DICOM and Analyze are the most widely accepted formats for the exchange of medical image information, they have some drawbacks. On the one hand, files are generated "individually", hindering medical information management and querying: much more complex queries can be made using XML query languages than directly with DICOM files. On the other hand, DICOM and Analyze encode the information in a specific format. Although there is a range of tools and equipment compliant with the DICOM and Analyze formats which can read and process image data in them, the exchange capability is limited.

We decided to represent both DICOM and Analyze header information using XML and to store their data in an XML database, for the following reasons:

• Data organization: owing to its organization (in data elements), DICOM and Analyze information is very similar to XML information, so it is possible to process the file contents for storage in an equivalent XML structure.

• Data interchange needs: medical image information often needs to be interchanged with other systems, and XML is the standard for information and data exchange between multiple systems not only in the medical environment but also in almost any area. XML and XML schema make it possible to represent a broad range of information types, while modelling the structure and semantic constraints of these types, thus facilitating medical information exchange in a broader context. In this way, storing the information directly in XML improves the interchange capabilities of the medical image information.

• Data schema adaptability: the DICOM and Analyze formats can evolve periodically and their structure may change; new data elements may be added or their data types may change. Although these changes occur, e.g., only when new equipment is developed, they may lead to schema modifications not easily managed in traditional database systems but easily addressed in XML databases. Moreover, as the XML format is hierarchical rather than relational, it can be used in a much more straightforward manner to design data structures to fit users' needs. There is no need to use an entity relationship editor or to normalize the schema. If one element contains another, it can be represented directly in the format, with no need for a join table.

• Scalability: using a less structured data schema like XML makes it easier to add new image file formats, while with traditional database systems considerable effort is often necessary to ensure a single, uniform schema. Further effort is required if one or more of the information sources changes or new sources are added.

• Technology availability: at present, robust XML DBMSs allow for efficient management of XML documents.

• Query capability: standard DICOM query facilities like the C-FIND and C-MOVE protocols perform query/retrieve services, allowing a client to make a query to ascertain which images are stored in an archive or on another device, such as a workstation. This query mechanism is rather limited, as stated in the standard itself: "The types of queries which are allowed are not complex" (see Ref. [1], PS3.4, Annex C). However, by using XML to encode image file information and XPath as the query language, search power increases significantly, especially since such XML files are stored and managed by a DBMS.
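The data organization and query capability arguments above can be made concrete with a small sketch. The XML vocabulary used here (`archive`, `imageFile`, `dataElement`, and the `tag` and `vr` attributes) is hypothetical and is not the schema used by MEDIMAN; Python's `xml.etree.ElementTree` is used only for illustration and supports a limited subset of XPath, whereas an XML DBMS would evaluate full XPath expressions over the stored documents.

```python
import xml.etree.ElementTree as ET


def header_to_xml(elements):
    """Encode (group, element, vr, value) tuples as one imageFile element."""
    root = ET.Element("imageFile", {"format": "DICOM"})
    for group, element, vr, value in elements:
        node = ET.SubElement(root, "dataElement", {
            "tag": f"{group:04X},{element:04X}",  # e.g. "0010,0010"
            "vr": vr,
        })
        node.text = value
    return root


def find_by_tag(archive, tag, value):
    """Return the imageFile elements holding a data element with this tag and value."""
    xpath = f".//dataElement[@tag='{tag}']"
    return [
        image_file
        for image_file in archive.iter("imageFile")
        if any(e.text == value for e in image_file.findall(xpath))
    ]
```

A query such as "all image files whose modality element (0008,0060) equals MR" then becomes a single call, `find_by_tag(archive, "0008,0060", "MR")`, over every header stored under the archive element, rather than a file-by-file scan.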

With these design decisions in mind, the next step is to select the best techniques for the methodological development of the MEDIMAN database. We chose Midas, a methodological framework for the development of web information systems.

3.1. The Midas framework

Midas is a model-driven framework for the development of WIS. It is based on MDA, whose main characteristics are the definition of models as first-class elements for the design and implementation of systems, and the definition of mapping rules between models, to allow transformations to be automated.

Midas proposes modelling the WIS according to two orthogonal dimensions (see Fig. 4). The first takes into account the platform dependence degree (based on the MDA approach), by defining computation independent models (CIMs), platform independent models (PIMs) and platform specific models (PSMs), and specifying the rules for mapping among these models. In the second case, Midas considers the modelling of the WIS according to three basic aspects: hypertext, content and behaviour. The content aspect, detailed in Fig. 2, corresponds to the traditional concept of database and here the data PIM proposed is the conceptual data model (using UML class diagrams). As implementation platforms, Midas considers different database technologies jointly: OR and XML. For OR databases, it defines two kinds of PSM: one for the current standard (SQL:2003), and one for the chosen product (DBMS). A PIM may be transformed into the standard PSM, and then into the product PSM, or directly into the product PSM.

Fig. 2 – Midas architecture.

Using the standard PSM may help for documentation purposes and facilitates future migration and integration with other systems. For semi-structured data, Midas only proposes an XML schema PSM model because in most of the current XML databases, the database schema is defined through an XML schema [36]. This is one of the advantages of Midas with respect to database development, because it permits the development of information systems which combine highly structured data and semi-structured data, allowing for the design of database models for several technologies at the same time. For more details on the other aspects of Midas – behaviour and hypertext – see Refs. [38,39].

The database development process defined in Midas starts from the conceptual data model, represented by means of a UML class diagram at PIM level, and then proposes applying suitable mappings to obtain the different database schemas based on OR or XML models. These models will also be represented using UML class diagrams, extended with the appropriate UML profiles. The logical model of the database is intended to be automatically implemented by Midas-CASE, a CASE tool developed to support the whole Midas methodology. In order to obtain the database implementation of the MEDIMAN system using the Midas-CASE tool, the designer should depict the conceptual data model (UML class diagram) using the drag-and-drop user interface; the conceptual data model is then split into different parts for selection of the portions to be implemented in the OR and XML databases. Finally, by selecting the suitable DBMS (Oracle 10g in the case of MEDIMAN), the PSM models (OR and XML) and the appropriate DDL (data definition language) statements are created to implement the database. At present, only the XML profile is completely implemented by Midas-CASE, so we implemented the XML part of the class diagram of MEDIMAN manually. With the PSM models the designer can make annotations in order to customize the way in which Midas-CASE implements the database, and therefore change the default name of the database objects, the way of implementing certain references, etc.

For a better understanding of the database development of MEDIMAN, we now briefly introduce the different UML profiles defined in Midas for the content aspect.

Midas OR profile overview: with respect to the Midas OR profile, Table 1 shows the mapping rules for the transformation of a conceptual UML class diagram (PIM level) into a UML class diagram extended with the appropriate profile to represent an OR (standard or product-specific) model (PSM level). For a detailed explanation of the Midas-OR profile, see Ref. [40].

Table 1 – Mapping to transform the data PIM into the OR data PSMs

| Data PIM | Data PSM in SQL:2003 | Data PSM in Oracle10g |
| Class | Structured type + typed table | Object type + object table |
| Class extension | Typed table | Table of object type |
| Attribute | Attribute | Attribute |
| Attribute: multivalued | ARRAY/MULTISET | VARRAY/nested table |
| Attribute: composed | ROW/structured type in column | Object type in column |
| Attribute: calculated | Trigger/method | Trigger/method |
| Association: one-to-one | REF/[REF] | REF/[REF] |
| Association: one-to-many | [REF]/[ARRAY] | [REF]/[nested table/VARRAY] |
| Association: many-to-many | ARRAY/ARRAY | Nested table/nested table, VARRAY/VARRAY |
| Aggregation | ARRAY | Nested table/VARRAY of references |
| Composition | ARRAY | Nested table/VARRAY of objects |
| Generalization | Types/typed tables | Types/typed tables |

XML schema profile overview: Table 2 shows the mapping rules for the transformation of a conceptual data model (PIM level) into an XML schema model (PSM level). An in-depth discussion of the XML profile of Midas is available in Ref. [41].

4. Description of MEDIMAN

This section describes MEDIMAN in detail and describes the design process of its database, taking into account the design decisions and the methods stated in Section 3.

Table 2 – Mapping to transform the data PIM into the data PSM

| Data PIM | Data PSM |
| Data PIM | XML schema |
| Class | XML element |
| Sub class (generalization) | XML element of complexType |
| Part class (composition) | XML element of complexType |
| Attribute | Sub-element |
| Attribute: mandatory | minOccurs = 1 (default) |
| Attribute: optional | minOccurs = 0 |
| Attribute: multivalued | maxOccurs = N |
| Attribute: composed | (all or sequence) complexType with subelements |
| Attribute: choice | Choice complexType |
| Association: one-to-one | Sub-element (of any element) for association including an REF attribute that references the other element implicated |
| Association: one-to-many | Sub-element (at multiplicity N class) for association including a REF to element (at multiplicity 1 class) |
| Association: many-to-many | Sub-element (of any element) for association including an REF to elements (maxOccurs N) |
| Aggregation | Subelement for aggregation including REF to the parts |
| Aggregation: maximum multiplicity 1 | Subelement with complexType with REF element |
| Aggregation: maximum multiplicity N | Subelement with sequence complexType of REF elements |
| Composition | Subelement for composition including a sequence complexType with the part elements |

4.1. Overall description

MEDIMAN is a WIS for medical image management and processing now in use by neuroscientists and clinicians at several medical and research centres in Madrid and elsewhere in Spain. Its main goal is to offer neuroscience researchers and clinicians an easily accessible historical storage of medical images. Many of these specialists carry out fMRI and PET studies while the individual, patient or otherwise, is doing sensory or motor psychocognitive tasks. For each individual and task the scanners generate thousands of images, so considerable human and financial resources are spent on acquiring them; yet the images, along with the information they carry, are generally lost when the research or clinical studies end. The queries included in MEDIMAN allow for the management of the whole medical image workflow, including uploading and downloading image files and result files, viewing image file data (data on the patient, clinical trials, etc.), detailed search and access control. MEDIMAN also includes capabilities for audit and usage statistics.

It also offers suitable standardized processing and analysis of medical images: in clinical trials and scientific experiments, the vast amount of information generated requires problem-specific computer processing. Once the images are acquired, they have to be processed in different ways in order to be corrected, and later statistically analyzed. Image processing includes, among other things, the following procedures:

• pre-processing: reconstruction and restoration (artifact removal, filtering and correction),

• post-processing: registration, standardization, etc.

• quantification: morphology, volumetry, perfusion, activity level, etc.

• statistical analysis: parametric, non-parametric, Bayesian, etc.

• viewing: 2D multiplanar, 3D, multimodality fusion, etc.

All the results of medical image processing are also stored, together with the original images, in a database. The images are gathered from the different scanners and stored and processed in one specific centre. This centralization allows the use of specific procedures for mathematical analysis and post-processing in the same manner for all samples. Analysis standardization is thus guaranteed, regardless of the acquisition site, providing a centralized procedure adjusted to a specific protocol. Although MEDIMAN operates in a centralized way, it can be used virtually anywhere as it is web-accessible. In addition to the research usage described, MEDIMAN is intended for teaching purposes. Its image database can be used by students of neurology, radiology, etc. in the research institute and hospitals where MEDIMAN operates, and by those of other disciplines related to medical image processing within the framework of our university's PhD programs. Medicine and radiology students can reuse the images acquired for research to discuss clinical cases. Moreover, students on the course Foundations of Image Processing in Medicine can use MEDIMAN to develop and test new processing algorithms.

4.2. System architecture

Taking as a reference the architectures for Web application development proposed by .NET and J2EE, this Web application was structured in three layers: presentation, behavioural and persistence. Fig. 3 shows the MEDIMAN software architecture.

The presentation layer includes the web user interface, developed using ASP.NET. Fig. 4 shows some screenshots of MEDIMAN. As the application is purely web-based, no client software needs to be installed other than a standard web browser.

The main components of the application are placed in the behavioural layer, which deals with decoding DICOM and Analyze data files into XML files, and with query processing.
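The decoding step performed by the behavioural layer can be illustrated with a minimal Python sketch. It is not the MEDIMAN implementation (which runs on ASP.NET): the element names are borrowed from the XPath expressions of Table 4 with underscores assumed, the header values are hypothetical, and a real decoder would read them from a DICOM or Analyze file rather than from a dictionary.

```python
import xml.etree.ElementTree as ET

def header_to_xml(header: dict[str, str]) -> ET.Element:
    """Wrap already-decoded header fields as Description/Value pairs."""
    root = ET.Element("Fichero_Info")
    dicom = ET.SubElement(root, "Fichero_Info_DICOM")
    for description, value in header.items():
        item = ET.SubElement(dicom, "Elemento_Info_DICOM")
        ET.SubElement(item, "Description").text = description
        ET.SubElement(item, "Value").text = value
    return root

# Hypothetical header values; a real decoder would extract them from the file.
doc = header_to_xml({"PatientsName": "DOE J", "Modality": "MR"})
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

Representing every header field as a generic Description/Value pair is what lets DICOM and Analyze headers share one XML representation, at the cost of pushing field-level typing into the query layer.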

As already stated, this paper focuses mainly on the persistence layer, i.e., on the database design of MEDIMAN.

4.3. MEDIMAN database development

This section presents the process followed to develop the MEDIMAN database using the techniques proposed in Midas for the content aspect. In order to develop the database for the WIS we need: (a) to specify the data PIM, i.e., to build the conceptual data model using a UML class diagram; (b) to apply the right mappings to transform the data PIM into the chosen data PSM.

Fig. 3 – MEDIMAN software architecture.

The proposed data PIM for MEDIMAN is shown in Fig. 5. For reasons of clarity, the figure shows only a reduced version of the PIM without the data concerning medical image processing and results storage. The data PIM represents the information handled by MEDIMAN as described in Section 2. It is also possible, as described in that section, to distinguish two kinds of data in the data PIM proposed: data regarding Analyze or DICOM files and data concerning clinical trial management. Information related to the files is observed to be semi-structured (labelled “semi-structured data” in the figure), while the information on clinical trials management is highly structured (“structured data” in the figure). For the reasons described in Section 2, we have decided to store the image file information in an XML database and the rest of the information in an OR database. So in this application it is necessary to apply simultaneously the XML profile mappings to the semi-structured parts of the data PIM to be stored in an XML database, and the OR ones to the structured parts to be stored in an OR database. As a result, it is possible to obtain a single data PSM, which combines both models.

Fig. 4 – MEDIMAN screenshots.

First, the OR profile mappings are applied to the structured part of the data PIM shown in Fig. 5, in order to obtain the part of the data PSM corresponding to the OR database. Note that, in this case, we transform directly from the data PIM to the product PSM, i.e., Oracle 10g.

• Classes: to transform UML classes into Oracle 10g classes it is necessary to define an object type for each class in the data PIM. The Oracle 10g object type is represented by stereotyping the class with «Object Type». All the classes in the data PIM must be transformed into object types.

• Attributes: each attribute of the UML classes of the data PIM is transformed into an attribute of the object type that represents the corresponding class of the data PIM. The ID attributes of the data PIM must be stereotyped with «PK» in the data PSM.

• Associations: the UML associations were transformed into unidirectional relationships in this case study. In order to transform the associations, we have to consider maximum multiplicity. There are no one-to-one relationships in the data PIM. One-to-many associations are transformed by adding a REF type attribute in the object type which represents the class of maximum multiplicity many. For example, the association between centre and user is transformed by adding the REF type centre attribute into the user object. In the case of many-to-many associations, we have to include an attribute of a collection type in one of the object types participating in the association. This is because we decided to transform them into unidirectional relationships. If navigability is represented in the UML class diagram, it must be borne in mind when the association is transformed. If the maximum cardinality is known, it is advisable to use a VARRAY as collection type; otherwise a nested table may be used.

The many-to-many relationship between User and Clinical Test classes is transformed by defining a nested table collection type of references to the Clinical Test object type and including an attribute of that type in the object type that represents the user class.

• Generalization: in the generalization between File, Result file and Image file classes, the subclasses inherit the attributes and associations of the superclass because File is an abstract class. So the subclasses are transformed into object types like any other class.

• Composition: the composition is transformed by adding an attribute of the collection type into the definition of the whole type.

Fig. 5 – MEDIMAN data PIM.
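The OR mapping rules listed above can be sketched as Oracle DDL generated from Python. This is only an illustration under stated assumptions: all type, attribute and table names (Centre_t, User_t, clinical_tests, User_tab, and so on) are hypothetical, the attribute lists are invented for brevity, and the statements have not been run against Oracle 10g.

```python
def object_type(name, attrs):
    """Return Oracle DDL for an object type; attrs maps attribute -> SQL type."""
    body = ",\n  ".join(f"{attr} {sql_type}" for attr, sql_type in attrs.items())
    # In SQL*Plus a CREATE TYPE statement is executed by the "/" line.
    return f"CREATE TYPE {name}_t AS OBJECT (\n  {body}\n);\n/"

statements = [
    # Classes and attributes: one object type per class of the data PIM.
    object_type("Centre", {"centre_id": "NUMBER", "name": "VARCHAR2(100)"}),
    object_type("Clinical_Test", {"test_id": "NUMBER", "title": "VARCHAR2(200)"}),
    # Many-to-many User/Clinical Test: nested table type of references.
    "CREATE TYPE Clinical_Test_refs_t AS TABLE OF REF Clinical_Test_t;\n/",
    # One-to-many Centre/User: the "many" side (User) holds a REF to Centre.
    object_type("User", {
        "user_id": "NUMBER",
        "centre": "REF Centre_t",
        "clinical_tests": "Clinical_Test_refs_t",
    }),
    # Object table for User; the nested table attribute needs a storage table.
    "CREATE TABLE User_tab OF User_t (user_id PRIMARY KEY)\n"
    "  NESTED TABLE clinical_tests STORE AS User_tests_nt;",
]
ddl = "\n\n".join(statements)
print(ddl)
```

A nested table is used for the many-to-many side because no maximum cardinality is assumed here; with a known bound, a VARRAY type would be the alternative the text recommends.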

The non-shaded area of Fig. 7 shows the partial data PSM of Oracle 10g. To obtain it, the mapping rules of the Oracle 10g OR profile were applied to the structured part of the data PIM depicted in Fig. 5.

Now, the XML profile mappings are applied to the semi-structured part of the data PIM as follows. The resultant partial data PSM is shown in Fig. 6.

• Data PIM: this is represented by a UML package stereotyped with «SCHEMA» and called ‘DATA PSM’. In the case study, the UML package will include the element which represents the classes placed in the semi-structured part of Fig. 5.

• Classes: those classes of the data PIM which are parts of a composition, such as the DICOM element and Analyze element classes, or subclasses of a generalization, like the DICOM Info and Analyze Info classes, are mapped by means of two named complexTypes. The rest of the classes (only the File Info class in this case) are transformed into elements named after the class they come from.

• Attributes: the only classes with attributes in the data PIM are DICOM element and Analyze element. All these attributes are mandatory, therefore they are transformed into classes stereotyped with «ELEMENT» with minimum multiplicity 1. These classes are associated to the container class, in the data PSM, with a composition stereotyped with «sequence».

• Associations: in order to transform the generalization between File Info, Analyze Info and DICOM Info classes, an element for each subclass has to be added into the XML choice complexType, within the element that represents the superclass. The Analyze Info and DICOM Info elements have to be included into the File Info element. The complexTypes of the elements added are Analyze Info type and DICOM Info type, respectively, created when the Analyze Info and DICOM Info classes were transformed.

The composition between the Analyze Info and Analyze element classes is mapped by adding a sub-element to the element that represents the compositor class Analyze Info, named as the part class Analyze element. The type of the Analyze Element element is the complex type Analyze element Type generated when mapping the Analyze Element class.
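The XML profile mappings described above can be sketched by building the resulting XML schema programmatically. The following Python fragment is an illustration only: the underscored names (File_Info, Analyze_Info_Type, and so on) and the two attribute names Description and Value are assumptions, and the schema actually used in MEDIMAN may differ.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

def xs(tag, parent, **attrs):
    """Append an XML Schema node (xs:tag) with the given attributes to parent."""
    return ET.SubElement(parent, f"{{{XS}}}{tag}", attrs)

schema = ET.Element(f"{{{XS}}}schema")

for prefix in ("Analyze", "DICOM"):
    # Part class -> named complexType; mandatory attributes -> sequence of
    # sub-elements with minOccurs = 1 (attribute names are assumed).
    element_type = xs("complexType", schema, name=f"{prefix}_element_Type")
    sequence = xs("sequence", element_type)
    for attr in ("Description", "Value"):
        xs("element", sequence, name=attr, type="xs:string", minOccurs="1")
    # Subclass -> named complexType; composition -> sequence of part elements.
    info_type = xs("complexType", schema, name=f"{prefix}_Info_Type")
    xs("element", xs("sequence", info_type), name=f"{prefix}_element",
       type=f"{prefix}_element_Type", maxOccurs="unbounded")

# Remaining class (File Info) -> element; generalization -> choice of subclasses.
file_info = xs("element", schema, name="File_Info")
choice = xs("choice", xs("complexType", file_info))
for prefix in ("Analyze", "DICOM"):
    xs("element", choice, name=f"{prefix}_Info", type=f"{prefix}_Info_Type")

xsd_text = ET.tostring(schema, encoding="unicode")
print(xsd_text)
```

The choice under File_Info is what allows one schema to validate documents coming from either file format.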

By applying the mapping rules over the semi-structured part of the data PIM of Fig. 5, the partial data PSM for Oracle XML-DB shown in Fig. 6 is obtained.

Fig. 6 – Partial data PSM (XML-DB).

At this point we have developed the MEDIMAN database schema, which is composed of two PSMs: the Oracle 10g OR schema developed using the Oracle 10g OR profile, shown in Fig. 7, and the XML schema representing the schema of the XML database, developed with the XML schema profile of Midas. The next step is to implement the elements in the DBMS chosen, Oracle 10g in this case. The next section shows this implementation.

4.4. Database implementation using Oracle 10g

The two PSMs obtained in the previous section are implemented in Oracle 10g. Fig. 7 shows the complete database model as stored by Oracle 10g, represented using the UML extensions for representing OR databases defined in Ref. [40].

To implement the object types, nested tables, etc. we have used Oracle SQL*Plus, following the partial data PSM depicted in the non-shaded area of Fig. 7 and using the OR facilities of Oracle 10g.

As we chose the structured model based on the XML type storage model of Oracle XML-DB, after registering the XML schema generated from the partial data PSM depicted in Fig. 6, Oracle XML-DB generates suitable OR objects to store the image file XML documents, which are in the shaded area of Fig. 7. Although the way Oracle XML-DB generates the OR elements starting from the XML schema obtained is beyond the scope of this paper, Table 3 briefly shows the mappings performed by XML-DB to transform XML schema types into database types. A detailed explanation is available in Ref. [36].

Table 3 – Mapping to transform the data PIM into the XML data PSM

| XML schema types | Database types |
|---|---|
| Simple types | SQL types |
| Complex types | Object types |
| Attributes | Attributes of the object type |
| Simple elements | Attributes of the object type (a) |

(a) The attributes will be collection attributes depending on the cardinality of the simple element or attributes.

Fig. 7 – Database in Oracle 10g.

It is important to remember that we do not use two databases or two different schemas for storing the structured data and the XML data. We have defined a single database, where each part is managed efficiently taking into account the structuredness of the information. In this way, the highly structured part of the conceptual data model is managed using object-relational technology, and the less structured data (image files) is managed using a technology adapted to this type of data (i.e., using the XML facilities of Oracle XML-DB).

4.5. Using MEDIMAN

This section describes a real usage scenario of MEDIMAN in order to provide a clearer picture of the types of research tasks or clinical trials usually managed by the system. It concerns a study carried out in the Neurology Department of Fundacion Hospital de Alcorcon (Spain), the aim of which was to study the response of the occipital cortex to light stimuli as a measurement of photophobia and cortical excitability in migraine sufferers. Twenty migraineurs not then undergoing attacks (8 with aura and 12 without) and 20 controls were studied with fMRI-BOLD for four different light intensities, for each of which eight axial image sections of 0.5 cm covering the occipital cortex were acquired. Activation of the occipital cortex was quantified for each light stimulus by measuring the number of voxels (area) and percentage of change in baseline signal intensity. Light perception was also estimated according to a 0–3 semi-quantitative scale.

Several post-study analyses were carried out on the data obtained, including:

• fMRI data analysis to determine the differences regarding the number of activated voxels (extension of activation) in images standardized to the MNI brain (Montreal Neurological Institute brain) and the intensity of activation for each voxel (BOLD signal wave amplitude).

• fMRI data series realignment and re-slice to correct for motion artifacts.

• Statistical analysis.

• etc.

A more detailed description of this study is beyond the scope of this paper, but is available in Ref. [37], where the authors have submitted their work in the neurology area. To manage the study described above with MEDIMAN, the following steps were followed:

1. Running a new clinical trial for the study. The description of the study includes a brief definition of the objective of the research task. In this case we have created a new clinical trial entitled “Photo-reactivity of the occipital cortex measured by fMRI-BOLD in Migraine”.

2. Formation of groups of individuals. Two groups of twenty were set up in the MEDIMAN database: “Migraineurs” and “Control”.

3. Establishing tasks to carry out on each group. This involved inserting four different tasks in the MEDIMAN system for each group, related to the four different light intensities applied to them.

4. Upload of images for each task, obtained in DICOM format.
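The four steps above imply a small, highly structured object model. A hedged Python sketch of it follows; the class and attribute names are illustrative and are not taken from the MEDIMAN schema, and the uploaded file name is hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    description: str
    image_files: list[str] = field(default_factory=list)

@dataclass
class Group:
    name: str
    size: int
    tasks: list[Task] = field(default_factory=list)

@dataclass
class ClinicalTrial:
    title: str
    groups: list[Group] = field(default_factory=list)

# Step 1: create the clinical trial.
trial = ClinicalTrial(
    "Photo-reactivity of the occipital cortex measured by fMRI-BOLD in Migraine")
# Step 2: two groups of twenty individuals.
for group_name in ("Migraineurs", "Control"):
    group = Group(group_name, size=20)
    # Step 3: one task per light intensity.
    for level in range(1, 5):
        group.tasks.append(Task(f"Light intensity {level}"))
    trial.groups.append(group)
# Step 4: uploads attach image files to a task (hypothetical file name).
trial.groups[0].tasks[0].image_files.append("subject01_level1.dcm")

print(sum(len(g.tasks) for g in trial.groups))
```

Note that no patient identity appears anywhere in the model, which matches the blinded design of the study.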

It should be noted that the study was blinded: no information about the identity of the patients was entered into the system.

Subsequently, the result files of the post-study analysis were also uploaded to the MEDIMAN database together with the original images. These files included the corrected images, statistical data series, text documents, etc.

From the case study above, we were able to corroborate some of the hypotheses taken into consideration during the development of the MEDIMAN database. It will be noticed that the data introduced in the database to manage the research task is highly structured. It includes several relationships between the different entities (clinical trials, groups of individuals, tasks, etc.), justifying the choice of an object-relational structure for such information. However, neither the information in the image files uploaded for each task (acquired mainly in DICOM for this case study) nor the analysis result files (which can include corrected image files, text files, etc.) have a static structure. They exist in several formats (especially the analysis result files), and new ones may become necessary in the future. With this in mind and taking into account the advantages and applications of XML discussed previously, an XML structure for this kind of information seems to be the best choice.

5. Status report

The current version of the MEDIMAN system involves several medical centres and medical research institutions in Spain, such as Ruber International, Fundacion Hospital de Alcorcon, and the University Hospital of Salamanca, among others.

The database is stored on a RAID-5 magnetic disk system comprising twelve disks with a full capacity of 1.8 terabytes (TB). The average volume of work managed monthly by MEDIMAN is about 32,000 images of different modalities, with an estimated average of four clinical trials a week performed on two different patients, each trial generating an average of 1000 images. This estimate is based on the total number of images stored in the MEDIMAN database and the number of weeks over which the system was operating. It should be borne in mind that the MEDIMAN system was off-line for several months for upsizing. It should also be pointed out that the different users of MEDIMAN use the system in different ways; the number of patients and the number of images generated per clinical trial vary widely.
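The monthly figure can be reproduced with simple arithmetic. The sketch below shows one reading of the estimate that yields 32,000 images, assuming that each trial produces about 1000 images per patient and that a month counts as four weeks; both are assumptions made here, not statements from the study.

```python
TRIALS_PER_WEEK = 4
PATIENTS_PER_TRIAL = 2
IMAGES_PER_PATIENT_PER_TRIAL = 1000
WEEKS_PER_MONTH = 4  # assumption: a month is counted as four weeks

images_per_week = TRIALS_PER_WEEK * PATIENTS_PER_TRIAL * IMAGES_PER_PATIENT_PER_TRIAL
images_per_month = images_per_week * WEEKS_PER_MONTH
print(images_per_week, images_per_month)  # 8000 32000
```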

In order to evaluate the performance of the MEDIMAN system we carried out a study of response times, for which we used different types of queries, as shown in Table 4. Note that the first three queries involve joins between the parts of the database implemented using OR and XML technologies, necessitating the use of XPath as part of the SQL statements.

The total number of images stored in the database was 127,100, distributed over 195 clinical trials. Table 5 shows the results obtained; its columns are explained next:

• Rows affected: the number of images returned by each query.

• Response time: the time elapsed between the user making the query and the instant at which the DBMS finishes the query.

• Access time: the time elapsed between the user making the query and the instant at which the first set of images appears in the web browser. The result of each query is displayed in the web browser as soon as the first set of images is returned. It is not necessary to wait until the end of the query execution to start displaying the results. For this reason, the access time is independent of the response time and the complexity of the query.
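The difference between the two timings can be made concrete with a small Python sketch. It does not measure MEDIMAN: the generator below merely stands in for a DBMS cursor that returns results in batches, and the batch size and per-batch delay are arbitrary assumptions.

```python
import time

def run_query(total_rows, batch_size):
    """Yield result batches, standing in for a cursor that streams rows."""
    for first in range(0, total_rows, batch_size):
        time.sleep(0.001)  # simulated cost of producing one batch
        yield list(range(first, min(first + batch_size, total_rows)))

start = time.perf_counter()
access_time = None
rows = 0
for batch in run_query(total_rows=3719, batch_size=500):
    if access_time is None:
        # The first batch can already be shown in the browser.
        access_time = time.perf_counter() - start
    rows += len(batch)
# The query is only finished once the last batch has arrived.
response_time = time.perf_counter() - start

print(rows, access_time <= response_time)
```

Because display starts with the first batch, the access time depends on the cost of one batch, while the response time grows with the size and complexity of the whole result.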

From the results shown in Table 5, it will be noticed that the most costly queries are those involving XPath expressions. We consider these times good, all the more so given that the average access time is clinically acceptable (less than 5 s) using a broadband Internet connection. Several Web usability studies place the tolerable access time between 2 and 10 s [43,44].

Table 4 – Queries used in the performance evaluation

Query 1: finds all the image files for a given clinical trial performed on a specific patient.

    SELECT nombrefichero
    FROM Tabla_Fichero_imagen tfi
    WHERE tfi.informacion_asociada.existsNode('/Fichero_Info/Fichero_Info_DICOM/Elemento_Info_DICOM[Description="PatientsName"][Value="ALVAREZ LOPEZ J"]') = 1
      AND tfi.fkEnsayo.Ensayo_id = 222

Query 2: finds all the image files for a given patient.

    SELECT nombrefichero
    FROM Tabla_Fichero_imagen tfi
    WHERE tfi.informacion_asociada.existsNode('/Fichero_Info/Fichero_Info_DICOM/Elemento_Info_DICOM[Description="PatientsName"][Value="SALAS ALONSO JA"]') = 1

Query 3: finds all the tasks performed for a given physician.

    SELECT tfi.fktarea.descripcion
    FROM Tabla_Fichero_imagen tfi
    WHERE tfi.informacion_asociada.existsNode('/Fichero_Info/Fichero_Info_DICOM/Elemento_Info_DICOM[Description="ReferringPhysiciansName"][Value="HUME"]') = 1

Query 4: counts all the image files uploaded for a given user.

    SELECT count(tfi.fichero_id)
    FROM Tabla_Fichero_imagen tfi
    WHERE tfi.fkusuario.usuario_id != 0
    GROUP BY tfi.fkusuario.usuario_id;
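The XPath predicates used in queries 1 to 3 of Table 4 can be tried outside Oracle with Python's standard library. The sketch below emulates the existsNode test of Query 1 over a few in-memory rows; the rows, file names and patient names are invented, the element names assume underscores, and ElementTree implements only a subset of XPath, so this is an analogy rather than the Oracle XML-DB evaluation.

```python
import xml.etree.ElementTree as ET

def file_info(patient):
    """Build a minimal image-file document holding one header element."""
    return ("<Fichero_Info><Fichero_Info_DICOM><Elemento_Info_DICOM>"
            "<Description>PatientsName</Description>"
            f"<Value>{patient}</Value>"
            "</Elemento_Info_DICOM></Fichero_Info_DICOM></Fichero_Info>")

# Invented rows: relational columns plus one XML column per image file.
rows = [
    {"nombrefichero": "a.dcm", "ensayo_id": 222, "info": file_info("DOE J")},
    {"nombrefichero": "b.dcm", "ensayo_id": 222, "info": file_info("ROE K")},
    {"nombrefichero": "c.dcm", "ensayo_id": 7, "info": file_info("DOE J")},
]

# Relative to the document root (Fichero_Info), as ElementTree expects.
XPATH = ("./Fichero_Info_DICOM/Elemento_Info_DICOM"
         "[Description='PatientsName'][Value='DOE J']")

def exists_node(xml_text, xpath):
    """Return True when the XPath selects at least one node."""
    return ET.fromstring(xml_text).find(xpath) is not None

matches = [row["nombrefichero"] for row in rows
           if exists_node(row["info"], XPATH) and row["ensayo_id"] == 222]
print(matches)  # ['a.dcm']
```

The filter combines a predicate on the XML column with a predicate on a relational column, which is the kind of OR/XML join the first three queries rely on.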

Table 5 – Response times

| Query # | Rows affected | Response time (min) | Access time |
|---|---|---|---|
| 1 | 3,719 | 00:49.09 | Less than 5 s |
| 2 | 14 | 01:45.03 | Less than 5 s |
| 3 | 420 | 01:47.08 | Less than 5 s |
| 4 | 15,215 | Less than 1 s | Less than 5 s |

6. Lessons learned

Several lessons were learned from the development of the MEDIMAN database. Those included in this section focus on the different alternatives for implementing the MEDIMAN database.

To choose the platform to implement the MEDIMAN database, we analysed the main characteristics of the data to be managed. As said before, it is of two different types: structured data related to clinical trial management, and semi-structured data concerning image file information.

The most suitable way to store the structured data is by using a traditional relational or object-relational model. The best way to store the semi-structured data is with XML, for two main reasons: to facilitate exchange with other medical applications and because a common representation of Analyze and DICOM file data is needed.

With a pure object-relational DBMS to implement the whole MEDIMAN data model, the process of “shredding” the semi-structured data of the DICOM and Analyze files into object-relational tables would require the development of additional tools to extract the information from the image files for its later storage in the object-relational tables. Furthermore, complex queries would be necessary to accommodate the data extracted in the object-relational tables, while semi-structured data does not usually conform to a rigid schema and includes many null values, which are hard to manage with a traditional DBMS. Nevertheless, with a pure XML DBMS, like Tamino [42], structured data may not be suitably managed, so a DBMS that simultaneously manages both types of data seems to be the best alternative. On the basis of the study made in Ref. [45] and taking into account our experience with Oracle databases, we chose Oracle 10g as our DBMS. Oracle, since version 9i, has included a set of facilities for XML file management under the name XML-DB.

We therefore decided to implement each of the parts of the data conceptual model using different database technologies: object-relational database technology for the clinical test data and XML database technology for the part concerning the medical image files data.

Oracle 10g under XML-DB provides three storage models: non-structured, non-structured based on XML-type and structured based on XML-type.

In the non-structured model, the XML documents are stored as plain text files using CLOBs (character large objects). The main drawback of this persistence model is that it does not take into account the nature of XML documents, so the XML data are managed as common text files, including the way of querying XML documents, which increases response time. In addition, as the XML document is stored in a column, the semantics of the OIM (Oracle Identity Management) model are lost.

The “non-structured based on XML-Type” storage model still stores the XML documents in CLOBs, but the use of the new data type “XML-Type” includes facilities for supporting standard XPath. As the XML documents are still managed as text files, the querying process demands text search engines, with the consequent overhead. Also with this storage model, XML files are stored in a column of a table, so the semantics represented by the conceptual data PIM would be lost when transformed into the data PSM.

Finally, the “structured based on XML-Type” storage model is based on the XML schema standard to break the content of the XML data down into structured data stored in object-relational tables. This new storage model sets up an environment to work with XML documents using the capabilities of an object-relational database and allows combining traditional and robust capabilities of databases with the power of the XML standard. Although, using the structured storage model based on XML-Type, the XML data is managed internally using object-relational tables, this process is transparent to the user, who works only with an XML schema and XML documents. The user uses XPath to query the XML data. XPath expressions sent to Oracle XML-DB functions are translated into SQL statements that operate directly on the underlying objects. This rewriting of XML type operations into object-relational SQL statements results in significant performance improvements compared with performing the same operations on XML documents stored using unstructured storage, while the problems of the null values in semi-structured data are managed by the DBMS transparently. Although the “structured based on XML-type” storage model breaks XML data down into relational tables specifically created for the schema with which the XML data complies, the XML data keeps its semantics under this storage model because the data is stored and managed according to its XML schema. The existence of object-relational tables for storing XML information is transparent for the users, who can go on managing the information as XML data. If we stored the image files data directly in an object-relational database, it would be necessary to transform the OR data into XML data in order for it to be exchanged. With Oracle 10g XML-DB, this transformation is made automatically by the DBMS.

The advantages of the “structured based on XML-type” storage model led us to choose it for the semi-structured part of the data conceptual model.
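The trade-off between unstructured and structured storage can be illustrated with a small emulation. The Python sketch below uses SQLite rather than Oracle XML-DB, so it only mimics the idea: in the first table the document is kept as text and must be parsed for every query, while in the second it has been broken down into rows so that the same predicate becomes ordinary SQL. Table, column and element names are assumptions.

```python
import sqlite3
import xml.etree.ElementTree as ET

doc = ("<File_Info><DICOM_Info>"
       "<DICOM_element><Description>Modality</Description><Value>MR</Value></DICOM_element>"
       "<DICOM_element><Description>PatientsName</Description><Value>DOE J</Value></DICOM_element>"
       "</DICOM_Info></File_Info>")

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE clob_store (file_id INTEGER, xml_text TEXT)")
db.execute("CREATE TABLE shredded (file_id INTEGER, description TEXT, value TEXT)")

# Unstructured storage: the whole document goes into one text column.
db.execute("INSERT INTO clob_store VALUES (1, ?)", (doc,))
# Structured storage: the document is broken down into one row per element.
db.executemany(
    "INSERT INTO shredded VALUES (1, ?, ?)",
    [(e.findtext("Description"), e.findtext("Value"))
     for e in ET.fromstring(doc).iter("DICOM_element")])

# Unstructured: every stored document is parsed to evaluate the predicate.
clob_hits = [
    file_id
    for file_id, text in db.execute("SELECT file_id, xml_text FROM clob_store")
    if ET.fromstring(text).find(
        ".//DICOM_element[Description='Modality'][Value='MR']") is not None]

# Structured: the same predicate is rewritten as plain SQL over the rows.
sql_hits = [file_id for (file_id,) in db.execute(
    "SELECT DISTINCT file_id FROM shredded "
    "WHERE description = 'Modality' AND value = 'MR'")]

print(clob_hits, sql_hits)  # [1] [1]
```

Both paths return the same answer; the difference lies in where the work happens, which is why the rewriting of XPath into SQL matters for response times.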

Moreover, as part of these lessons learned, some modifications to the Midas methodology were proposed from the experience acquired while developing MEDIMAN. The main changes concern the way the associations of the class diagram are transformed into XML schema constructions.

With the development and final implementation of the MEDIMAN database, we were able to refine the XML schema profile as well as the proposed mappings. Some of the transformation rules were changed after their implementation in the XML DBMS, because of the specific needs of the product. For example, the mapping of the UML generalizations has changed because of the limitations on the nesting of complexTypes in the selected product. For this reason, generalizations are now mapped by including a subelement of a complexType that represents the generalization (named “generalization” plus a sequence number), instead of including the choice complexType directly in the element from the superclass of the generalization.
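The change in the generalization mapping can be sketched by building both schema shapes side by side. The fragment below reflects one reading of the rule described above and is not the schema used in MEDIMAN: the names File_Info, Analyze_Info, DICOM_Info and the "_Type" suffix are assumptions, and only "generalization" plus a sequence number comes from the text.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

def xs(tag, parent, **attrs):
    """Append an XML Schema node (xs:tag) with the given attributes to parent."""
    return ET.SubElement(parent, f"{{{XS}}}{tag}", attrs)

def add_choice(parent, subclasses):
    """Add a choice with one element per subclass of the generalization."""
    choice = xs("choice", parent)
    for sub in subclasses:
        xs("element", choice, name=sub, type=f"{sub}_Type")

SUBCLASSES = ("Analyze_Info", "DICOM_Info")

# Earlier rule: the choice sits directly in the superclass element.
before = ET.Element(f"{{{XS}}}schema")
add_choice(xs("complexType", xs("element", before, name="File_Info")), SUBCLASSES)

# Revised rule (as read here): the superclass element holds a sub-element
# "generalization1" whose named complexType carries the choice.
after = ET.Element(f"{{{XS}}}schema")
seq = xs("sequence", xs("complexType", xs("element", after, name="File_Info")))
xs("element", seq, name="generalization1", type="generalization1_Type")
add_choice(xs("complexType", after, name="generalization1_Type"), SUBCLASSES)

print(ET.tostring(after, encoding="unicode"))
```

Moving the choice into a named type removes one level of anonymous nesting from the superclass element, which is the kind of limitation the revised rule works around.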

7. Conclusions

This paper has presented the database development process of MEDIMAN, a WIS for medical image management, now in use by neuroscientists and clinicians at several research and medical centres in Madrid. Specifically, the MEDIMAN system is being used for several studies with fMRI for migraine and with MRI for Alzheimer's disease. The database of this WIS has some distinguishing characteristics with respect to traditional WIS DBs, because it has to manage two different kinds of data: the information related to the management of the clinical trials, which can be considered highly structured and with a rigid schema; and that concerning the medical image files, which is less structured. Throughout this paper we have illustrated how the database was developed with the techniques proposed in the content aspect of Midas, which is a methodological framework for WIS development. This paper also mentions the main challenges encountered during MEDIMAN database development, outlining the design decisions made at each stage.

Although MEDIMAN operates in a centralized way, we are working to deploy the MEDIMAN system over a GRID platform in order to guarantee fault tolerance and to balance the high processing cost of the images. For this implementation of MEDIMAN we are using OGSA-DAI (open grid services architecture-data access and integration) [46]. OGSA-DAI is a middleware product which supports the exposure of data resources, such as relational or XML databases, on to grids.

Acknowledgements

This research was carried out within the framework of two projects financed by the Spanish Ministry of Education and Science: “GOLD: A Model-Driven Framework for Web Information Systems Development: Application to the development of a WIS for Medical Image Management” (TIN2005-00010) and “Advanced Method for the Processing of Signals in Neuroimaging by MR and electroencephalography: Application to the cerebral cartography and brain-computer interface on disabled people” (TEC2005-07801-C03-01/TCM).

The authors would also like to thank Marcos Lopez for his help in the implementation task.

Finally, the authors would like to thank the reviewers for their comments, which helped to improve the quality of this paper.

e f e r e n c e s

[1] ACR-NEMA, The DICOM Standard, Retrieved from http://medical.nema.org/, 2004.

[2] R. Moreno, S. Furuie, Integration of heterogeneous systems via DICOM and the clinical image access service, IEEE Comput. Cardiol. (2002) 385–388.

[3] Mayo Clinic, Analyze Software, Retrieved from: http://www.mayo.edu/bir/Software/Analyze/Analyze.html, 2004.

[4] C. Acuna, E. Marcos, V. de Castro, J.A. Hernandez, A web information system for medical image management, biological and medical data analysis, in: Proceedings of the 5th International Symposium, LNCS, vol. 3337, 2004, pp. 49–59.

[5] C. Acuna, E. Marcos, V. de Castro, J.A. Hernandez, Managing medical images, Ercim News 58 (2004) 54–55.

[6] MEDIMAN System, Available at http://ariadna.escet.urjc.es/gesimed/.

[7] K. Jiang, Integrating clinical trial data for decision making via web services, in: Proceedings of the 26th Annual International Conference of the IEEE EMBS, San Francisco, CA, USA, 2004, pp. 3336–3349.

[8] S. Jacob, R. Singh, M. Mathew, B.K. Khandheria, Framework for a virtual center to support multi-center clinical trials in cardiology, Comput. Cardiol. 27 (2000) 583–586.

[9] C.H. Elsner, M. Egbring, H. Kottkamp, T. Berger, S. Zoller, G. Hindricks, Open source or commercial products for electronic data capture in clinical trials a score card comparison, Comput. Cardiol. 30 (2003) 371–373.

[10] K.S. Barber, T. Graser, J. Silva, Developing a traceable domain reference architecture to support clinical trials at the national cancer institute - an experience report, ECBS 2001, in: Proceedings of the Eighth Annual IEEE International Conference and Workshop, 2001, pp. 144–151.

[11] E. Weinberger, R. Jakobovits, M. Halsted, MyPACS.net: a web-based teaching file authoring tool, Am. J. Roentgenol. 179 (2002) 579–582.

[12] A. Rosset, O. Ratib, A. Geissbuhler, J.-P. Vallee, Integration of a Multimedia Teaching and Reference Database in a PACS Environment, RadioGraphics, vol. 22, 2002, pp. 1567–1577.

[13] MedPix Medical Image Database, Retrieved from: http://rad.usuhs.mil/medpix/medpix_home.html#top, 2005.

[14] E. Siegel, D. Channin, J. Perry, C. Carr, B. Reiner, Medical image resource center 2002: an update on the RSNA's medical image resource center, J. Digital Imaging 15 (2002) 2–4.

[15] V. Castelli, L.D. Bergman (Eds.), Image Databases: Search and Retrieval of Digital Imagery, Wiley, 2002, ISBN 0-471-32116-8.

[16] O. Kalipsiz, Multimedia Databases. IV. Fourth International Conference on Information Visualisation (IV’00), 2000, p. 111.

[17] P. Caceres, E. Marcos, B. Vela, A MDA-based approach for web information system development. WISME, in: Workshop in Software Model Engineering in UML’03 Conference, San Francisco, USA, 2003.

[18] J. Miller, J. Mukerji (Eds.), MDA Guide Version 1.0. Document number omg/2003-05-01. Retrieved from: http://www.omg.com/mda, 2006.

[19] J. Miller, J. Mukerji (Eds.), Model Driven Architecture. Document number ormsc/2001-07-01. Retrieved from: http://www.omg.com/mda, 2006.

[20] C. Ambler, Agile model driven development is good enough, IEEE Software 20 (5) (2003) 71–73.

[21] R. Muller, Database Design for Smarties, Morgan Kaufmann, 1999.

[22] D. Carlson, Modeling XML Applications with UML, Addison-Wesley, 2001.

[23] T. Krumbein, T. Kudrass, in: R. Tolksdorf, R. Eckstein (Eds.), Rule-Based Generation of XML Schemas from UML Class Diagrams, Berliner XML Tage 2003, Berlin, Germany, Berliner XML, October 13–15, 2003, pp. 213–227.

[24] W. Provost, UML for W3C XML Schema Design. Retrieved from: http://www.xml.com/lpt/a/2002/08//07/wxs_uml.html, 2005.

[25] N. Routledge, L. Bird, A. Goodchild, UML and XML schema, in: ACM International Conference Proceeding Series. Proceedings of the 13th Australian Conference on Database Technologies, vol. 5, 2002, pp. 157–166.

[26] Storage Resource Broker, http://www.sdsc.edu/srb/index.php/Main_Page, 2006.

[27] L. Leoni, S. Manca, A. Giachetti, G. Zanetti, A Virtual Data Grid Architecture for Medical Data using SRB, EuroPACS-MIR 2004 in the Enlarged Europe, pp. 475–478.

[28] J. Jovicich, M.F. Beg, S. Pieper, C. Priebe, M.M. Miller, R. Buckner, B. Rosen, Morphometry BIRN (2005) biomedical informatics research network: integrating multi-site neuroimaging data acquisition, data sharing and brain morphometric processing, in: Proceedings of the 18th IEEE International Symposium on Computer-Based Medical Systems, 2005.

[29] S. Antani, R. Kasturi, R. Jain, A survey on the use of pattern recognition methods for abstraction, indexing and retrieval of images and video, Pattern Recognit. 35 (2002) 945–965.

[30] M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, P. Yanker, Query by image and video content: the QBIC system, IEEE Comput. 28 (9) (1995) 23–32.

[31] A. Hampapur, A. Gupta, B. Horowitz, C.-F. Shu, C. Fuller, J. Bach, M. Gorkani, R. Jain, Virage video engine, in: I.K. Sethi, R.C. Jain (Eds.), Storage and Retrieval for Image and Video Databases V, vol. 3022 of SPIE Proceedings, 1997, pp. 352–360.

[32] P.M. Kelly, M. Cannon, D.R. Hush, Query by image example: the CANDID approach, in: W. Niblack, R.C. Jain (Eds.), Storage and Retrieval for Image and Video Databases III, vol. 2420 of SPIE Proceedings, 1995, pp. 238–248.

[33] A. Pentland, R.W. Picard, S. Sclaroff, Photobook: tools for content-based manipulation of image databases, Int. J. Comput. Vision 18 (3) (1996) 233–254.

[34] W.Y. Ma, Y. Deng, B.S. Manjunath, Tools for texture- and color-based search of images, in: B.E. Rogowitz, T.N. Pappas (Eds.), Human Vision and Electronic Imaging II, vol. 3016 of SPIE Proceedings, San Jose, CA, 1997, pp. 496–507.

[35] H. Oosterwijk, Dicom Basics, 2nd ed., O Tech Ed., 2002.

[36] R. Murthy, S. Banerjee, XML schemas in Oracle XML-DB, in: Proceedings of the 29th VLDB Conference, Berlin, Germany, 2003.

[37] H. Martín, M. Sanchez-del-Rio, J. Alvarez, J. Hernandez, J. Galvez, J.A. Pareja, Photo reactivity of the occipital cortex measured by fMRI-BOLD in migraine, Official J. Am. Acad. Neurol. (submitted for publication).

[38] E. Marcos, V. de Castro, B. Vela, Representing web services with UML: a case study, in: M.E. Orlowska, S. Weerawarana, M.P. Papazoglou, J. Yang (Eds.), ICSOC (2003), Springer-Verlag, LNCS-2910, 2003, pp. 15–27.

[39] V. de Castro, P. Caceres, E. Marcos, A user service oriented method to model web information systems, Web Information Systems - WISE (2004) 5th International Conference on Web Information Systems Engineering, LNCS 3306, Brisbane, Australia, November, 2004, pp. 41–52.

[40] E. Marcos, B. Vela, J.M. Cavero, Methodological approach for object-relational database design using UML, in: R. France, B. Rumpe (Eds.), Journal on Software and Systems Modeling (SoSyM), vol. 2, Springer-Verlag, 2003, pp. 59–72.

[41] B. Vela, C. Acuna, E. Marcos, A model driven approach for XML database development, Proceedings of the 23rd International Conference on Conceptual Modelling (ER2004), LNCS 3288, Springer-Verlag, 2004, pp. 780–794.

[42] Tamino X-Query. Software AG - System Documentation Version 3.1.1. Software AG, Darmstadt, Germany. Retrieved from: http://www.softwareag.com, 2004.

[43] J. Nielsen, Designing Web Usability, New Riders, Indianapolis, 2000.

[44] A. King, Speed Up Your Site: Web Site Optimization, New Riders Press, 2003.

[45] U. Westermann, W. Klas, An analysis of XML database solutions for the management of MPEG-7 media descriptions, ACM Computing Surv. 35 (4) (2003) 331–373.

[46] OGSA-DAI, Retrieved from: http://www.ogsadai.org.uk/, 2007.