clef 2007 medical image annotation task budapest, september 19-21 2007

CLEF 2007

Medical Image Annotation Task

Budapest, September 19-21 2007

Tatiana Tommasi, Francesco Orabona, Barbara Caputo

IDIAP Research Institute, Centre Du Parc, Av. Des Pres-Beudin 20, Martigny, Switzerland

Overview

• Problem Statement

• Features

• Classifier

• Results

• Conclusions

Problem Statement

Goal of the automatic image annotation task: classify a test set of 1000 medical images, given a training set of 11000 medical images.

IRMA db: Radiographic images divided into 116 classes according to the IRMA code

IRMA code consists of four independent axes:

- modality

- body region

- body orientation

- biological system

Score: the penalty for an annotation error depends on the level of the hierarchy at which the error is made. A greater penalty is applied for an incorrect classification than for a less specific classification in the hierarchy.

Local Features – SIFT

Scale Invariant Feature Transform: local feature descriptor invariant to changes in

- illumination

- image noise

- rotation

- scaling

- minor changes in viewing direction

SIFT points = local maxima of the scale space

Are these really the most informative points for a classification task?

• Dense random sampling of the SIFT points works better than interest point detectors

• Radiographs: low-contrast images

No keypoint orientation

SIFT extracted at only one octave

Vocabulary of Visual Words - SIFT

30 SIFT points extracted from each of the 12000 images

K-means algorithm with K=500

Define a vocabulary of 500 words

1500 points

Feature Vector of 2000 elements
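The quantization step behind a visual vocabulary can be sketched as follows. This is a minimal illustration, assuming the vocabulary (the K-means centroids) is already available: each descriptor is assigned to its nearest word and the assignments are counted. The function names, the toy 2-dimensional descriptors, and the unit-sum normalization are assumptions made for the example. The slides do not spell out how the 2000-element vector is assembled from the 500-word vocabulary, so the sketch stops at a single histogram.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary centroid closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda k: squared_distance(descriptor, vocabulary[k]))


def bow_histogram(descriptors, vocabulary):
    """Count how often each visual word occurs, then normalize to sum 1.

    The normalization is an assumption of this sketch.
    """
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [c / total for c in counts]


# Toy data: a 3-word vocabulary and 4 descriptors (real SIFT has 128 dimensions).
vocabulary = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
descriptors = [[0.1, 0.0], [0.9, 1.2], [1.1, 0.8], [4.0, 5.5]]
histogram = bow_histogram(descriptors, vocabulary)
```

With a real vocabulary the same function would return one 500-bin histogram per set of descriptors.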

Global Features – Raw Pixels

Images resized to 32x32 pixels

Gray values of the pixels normalized to sum to 1

Feature Vector of 1024 elements
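A rough sketch of the raw-pixel descriptor, under stated assumptions: the image is a list of rows of gray values, and the resizing uses nearest-neighbor sampling because the slides do not say which interpolation was used. The helper names are hypothetical.

```python
def resize_nearest(image, size=32):
    """Resize a 2D gray-value image to size x size by nearest-neighbor sampling.

    The interpolation method is an assumption; the slides only give the target size.
    """
    height, width = len(image), len(image[0])
    return [[image[r * height // size][c * width // size] for c in range(size)]
            for r in range(size)]


def raw_pixel_feature(image, size=32):
    """Flatten the resized image and normalize the gray values to sum to 1."""
    small = resize_nearest(image, size)
    flat = [float(value) for row in small for value in row]
    total = sum(flat)
    if total == 0:
        return flat  # all-black image: leave the zero vector unchanged
    return [value / total for value in flat]


# Toy 64x64 image with a simple gradient pattern.
image = [[(r + c) % 256 for c in range(64)] for r in range(64)]
feature = raw_pixel_feature(image)
```

The result has 32 x 32 = 1024 elements, matching the feature vector size given above.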


Support Vector Machine

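Training data: $(x_1, y_1), \dots, (x_m, y_m)$, $x_i \in \mathbb{R}^N$, $y_i \in \{-1, +1\}$

Optimal separating hyperplane: that with maximum distance to the closest points in the training set

$(w \cdot x + b = 0)$

$f(x) = \operatorname{sign}\left( \sum_{i=1}^{m} \alpha_i y_i \, (x_i \cdot x) + b \right)$

the $x_i$ with non-zero $\alpha_i$ are SUPPORT VECTORS

Non-linear SVM: $x \to \Phi(x)$, $K(x, y) = \Phi(x) \cdot \Phi(y)$

used instead of the dot product in $(w \cdot x)$

Chi-square kernel: $K(x, y) = \exp\{-\gamma \chi^2(x, y)\}$, $\chi^2(x, y) = \sum_i \frac{|x_i - y_i|^2}{|x_i + y_i|}$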
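The chi-square kernel can be written out in a few lines. This is a minimal sketch for two feature vectors given as plain lists. Here the index i runs over vector components, and the kernel parameter is written `gamma`. Skipping components where the denominator is zero is a choice made in this sketch to avoid division by zero, and the default `gamma=1.0` is an arbitrary placeholder.

```python
import math


def chi_square_distance(x, y):
    """Sum over components of |x_i - y_i|^2 / |x_i + y_i|."""
    total = 0.0
    for xi, yi in zip(x, y):
        denominator = abs(xi + yi)
        if denominator > 0:  # skip empty bins (assumption of this sketch)
            total += (xi - yi) ** 2 / denominator
    return total


def chi_square_kernel(x, y, gamma=1.0):
    """K(x, y) = exp(-gamma * chi2(x, y))."""
    return math.exp(-gamma * chi_square_distance(x, y))


# Two toy histograms, each summing to 1.
x = [0.5, 0.5]
y = [1.0, 0.0]
k_xy = chi_square_kernel(x, y)
k_xx = chi_square_kernel(x, x)
```

Identical vectors give a kernel value of 1, and the value decays towards 0 as the histograms move apart.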

Multi-Class SVM

one-vs-all - for c classes employs c classifiers.

e.g. 3 classes:

margin(x) for 1 vs {2,3}, margin(x) for 2 vs {1,3}, margin(x) for 3 vs {1,2}

x → class with max(margin)

one-vs-one - for c classes employs c(c-1)/2 classifiers.

e.g. 3 classes:

f(x) for 1 vs 2 → class 2

f(x) for 1 vs 3 → class 3

f(x) for 2 vs 3 → class 3

x → class 3 (most votes)
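The two decision rules can be sketched as below, assuming the binary classifiers have already been evaluated. The input formats (a dictionary of margins per class, and a dictionary of pairwise winners) are hypothetical conveniences for the example, as are the margin values.

```python
from collections import Counter


def one_vs_all_predict(margins):
    """margins maps each class to the margin of its 'class vs rest' classifier.

    The predicted class is the one with the largest margin.
    """
    return max(margins, key=margins.get)


def one_vs_one_predict(pairwise_winners):
    """pairwise_winners maps each class pair to the class its classifier chose.

    The predicted class is the one with the most votes. Ties are resolved in
    favor of the class that received its first vote earliest (a choice of this sketch).
    """
    votes = Counter(pairwise_winners.values())
    return votes.most_common(1)[0][0]


# One-vs-all with 3 classes: 3 classifiers (toy margin values).
ova_label = one_vs_all_predict({1: -0.4, 2: 0.3, 3: 1.2})

# One-vs-one with 3 classes: 3 * (3 - 1) / 2 = 3 classifiers, as in the example above.
ovo_label = one_vs_one_predict({(1, 2): 2, (1, 3): 3, (2, 3): 3})
```

For c classes the second rule needs c(c-1)/2 entries in `pairwise_winners`, while the first needs only c margins.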

Discriminative Accumulation Scheme - DAS

Main idea: information from different cues can be summed together

$M$ object classes, each with $N_j$ training images $\{I_i^j\}$, $i = 1, \dots, N_j$, $j = 1, \dots, M$

For each image we extract a set of $P$ different cues $T_p = T_p(I_i^j)$, $p = 1, \dots, P$

So for an object $j$ we have $P$ new training sets $\{T_p(I_i^j)\}_{i=1}^{N_j}$

$I'$ = test image, $M \ge 2$: for each cue $p$ the distance from the separating hyperplane is

$D_j(p) = \sum_{i=1}^{m_j^p} \alpha_{ij}^p \, y_{ij} \, K_p(T_p(I_i^j), T_p(I')) + b_j^p$

Having all the distances for all the $j$ objects and $p$ cues, the image $I'$ is classified through

$j^* = \arg\max_{j = 1, \dots, M} \left\{ \sum_{p=1}^{P} a_p D_j(p) \right\}$, $a_p \in \mathbb{R}^+$
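The accumulation step can be illustrated with a short sketch. It assumes the per-cue distances D_j(p) have already been computed by the single-cue SVMs and are passed in as `distances[p][j]`. The numeric values and the weights are made up for the example, and classes are indexed from 0.

```python
def das_predict(distances, weights):
    """Discriminative accumulation: weighted sum of per-cue distances, then argmax.

    distances[p][j] is the distance D_j(p) of the test image from the hyperplane
    of class j under cue p; weights[p] is the non-negative weight a_p.
    Returns the winning class index and the accumulated scores.
    """
    num_classes = len(distances[0])
    accumulated = [sum(a * d[j] for a, d in zip(weights, distances))
                   for j in range(num_classes)]
    best = max(range(num_classes), key=lambda j: accumulated[j])
    return best, accumulated


# Toy example with P = 2 cues and M = 3 classes (values are illustrative only).
distances = [
    [0.2, 0.6, -0.5],   # cue 1 alone would choose class index 1
    [0.9, -0.2, -0.8],  # cue 2 alone would choose class index 0
]
label_equal, scores_equal = das_predict(distances, [0.5, 0.5])
label_cue1_heavy, _ = das_predict(distances, [0.9, 0.1])
```

Changing the weights changes which cue dominates the decision, which is why the weights are worth tuning.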

Discriminative Accumulation Scheme - DAS

Example with two cues: class 1: 2 images, class 2: 3 images, class 3: 2 images

Multi Cue Kernel - MCK

Main idea: a new kernel which combines different features extracted from images through a positively weighted linear combination of kernels each of them dealing with only one feature

$K_{MC}(\{T_p(I_i)\}_p, \{T_p(I')\}_p) = \sum_{p=1}^{P} a_p K_p(T_p(I_i), T_p(I'))$

It is possible to

• optimize the weighting factors $a_p$ and all the kernel parameters together;

• use it with both the one-vs-all and one-vs-one SVM extensions to the multiclass problem
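A minimal sketch of such a combined kernel, assuming each image is represented as a tuple of per-cue feature vectors and each cue uses a chi-square kernel with its own parameter. The choice of base kernel, the weights, and the `gammas` values are assumptions made for illustration.

```python
import math


def chi_square_kernel(x, y, gamma):
    """Base kernel for one cue: exp(-gamma * sum_i |x_i - y_i|^2 / |x_i + y_i|)."""
    chi2 = sum((a - b) ** 2 / abs(a + b) for a, b in zip(x, y) if a + b != 0)
    return math.exp(-gamma * chi2)


def multi_cue_kernel(cues_a, cues_b, weights, gammas):
    """Positively weighted linear combination of one kernel per cue.

    cues_a and cues_b hold the per-cue feature vectors of the two images.
    """
    return sum(w * chi_square_kernel(ta, tb, g)
               for ta, tb, w, g in zip(cues_a, cues_b, weights, gammas))


# Two toy images, each described by two cues (e.g. a local and a global feature).
image_a = ([0.5, 0.5], [0.25, 0.75])
image_b = ([1.0, 0.0], [0.25, 0.75])
weights = [0.8, 0.2]
gammas = [1.0, 1.0]

k_ab = multi_cue_kernel(image_a, image_b, weights, gammas)
k_aa = multi_cue_kernel(image_a, image_a, weights, gammas)
```

Because the combination is itself a kernel, a single SVM can be trained on it, so the weights and the per-cue kernel parameters can be tuned in the same search.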

Experiments

Single feature Evaluation

- 5 random and disjoint train/test splits of 10000/1000 images are extracted

- best parameters: those giving the lowest average score on the 5 splits

- experiments with one-vs-one and one-vs-all SVM multiclass extension

SIFT features outperform the raw pixel ones

Experiments

Cue Integration

DAS - distances from the separating hyperplanes associated with the best results of the previous step

- cross validation used only to search the best weights for cue integration

MCK - cross validation applied to look for the best kernel parameters and the best feature weights at the same time

In both cases weights varied from 0 to 1
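The weight search for DAS could look roughly like the sketch below. It is not the procedure used in the experiments: the plain misclassification rate stands in for the hierarchical score, the grid resolution is arbitrary, and the validation data are toy values. Each split is a list of `(distances, true_label)` pairs, with `distances[p][j]` as in the DAS formula.

```python
from itertools import product


def das_label(distances, weights):
    """Class index with the largest weighted sum of per-cue distances."""
    num_classes = len(distances[0])
    scores = [sum(a * d[j] for a, d in zip(weights, distances))
              for j in range(num_classes)]
    return max(range(num_classes), key=lambda j: scores[j])


def mean_error(splits, weights):
    """Misclassification rate averaged over the validation splits.

    A stand-in for the hierarchical score (assumption of this sketch).
    """
    rates = []
    for samples in splits:
        wrong = sum(1 for distances, label in samples
                    if das_label(distances, weights) != label)
        rates.append(wrong / len(samples))
    return sum(rates) / len(rates)


def search_weights(splits, num_cues=2, steps=4):
    """Try every weight combination on a regular grid over [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    best_weights, best_error = None, float("inf")
    for weights in product(grid, repeat=num_cues):
        error = mean_error(splits, weights)
        if error < best_error:
            best_weights, best_error = weights, error
    return best_weights, best_error


# Toy validation data: 2 splits, 2 cues, 2 classes.
splits = [
    [([[0.2, 0.6], [0.9, -0.2]], 0), ([[0.5, -0.5], [-0.3, 0.4]], 0)],
    [([[-0.4, 0.7], [0.1, 0.2]], 1)],
]
best_weights, best_error = search_weights(splits)
```

In this toy data each cue alone misclassifies one sample, while a combination of the two classifies all of them correctly.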

Results

When the label predicted by the raw pixel features is wrong, the true label is far from the top of the decision ranking

Results

The best feature weight for SIFT is higher than that for raw pixels for all the integration methods

The number of support vectors for the best MCK run is higher than that used by the corresponding single-cue SIFT run but lower than that used by PIXEL and DAS.

Results

First, second and third column contain examples of images misclassified by one of the two cues but correctly classified by DAS and MCK

The fourth column shows an example of an image misclassified by both cues and by DAS but correctly classified by MCK

Conclusions and Future Work

We would like to …

use various types of local and global descriptors, to select the best features for the task;

add shape descriptors in our fusion schemes, which should result in a better performance;

exploit the natural hierarchical structure of the data.

Cue integration pays off

Cross Validation pays off
