![Page 1: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/1.jpg)
MEDICAL IMAGE COMPUTING (CAP 5937)
LECTURE 14: Evaluation Framework for Medical Image Segmentation
Dr. Ulas Bagci
HEC 221, Center for Research in Computer Vision (CRCV), University of Central Florida (UCF), Orlando, FL [email protected] or [email protected]
SPRING 2017
![Page 2: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/2.jpg)
Outline
• How to evaluate accuracy of image segmentation?
  – Gold standard ~ surrogate of truth
  – Qualitative
    • Visual
    • Inter- and intra-observer agreement rates
  – Quantitative
    • Volumetric measurements (regression)
    • Region overlaps
    • Shape based measurements
    • Theoretical comparisons
    • STAPLE, uncertainty guidance, and evaluation w/o truths
![Page 3: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/3.jpg)
Visual Assessment
Manual image segmentation from the full spectrum of IDEAL MRI data, delineating SAT (red), VAT (green), liver (blue), pancreas (yellow), and kidneys (purple). Left to right: water-only, fat-only, in-phase, out-of-phase, fat fraction, and segmented labels from SliceOmatic.
Reference: Assessment of Abdominal Adiposity and Organ Fat with Magnetic Resonance Imaging (chp11).
![Page 4: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/4.jpg)
Inherent Uncertainty
Comparison of glioblastoma multiforme (GBM) segmentation results on an axial slice: semi-automatic segmentation under Slicer (green, left image) and pure manual segmentation (blue, middle image). Egger et al., Nat Sci Rep., 2012.
![Page 5: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/5.jpg)
Inherent Uncertainty
red: endocardium; green: epicardium; yellow: ground truth
Queiros et al., European Heart Journal, 2016.
![Page 6: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/6.jpg)
Segmentation Evaluation
Can be considered to consist of two components:
(1) Theoretical
Study mathematical equivalence among algorithms.
(2) Empirical
Study practical performance of algorithms in specific application domains.
![Page 7: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/7.jpg)
Segmentation Evaluation: Theoretical
Fundamental challenges in segmentation evaluation:
(Ch1) Are major pI (purely image-based) frameworks, such as active contours, level sets, graph cuts, fuzzy connectedness, and watersheds, truly distinct, or does some level of equivalence exist among them?
![Page 8: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/8.jpg)
Segmentation Evaluation: Theoretical
Fundamental challenges in segmentation evaluation:
(Ch2) How to develop truly distinct methods that constitute a real advance?
![Page 9: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/9.jpg)
Segmentation Evaluation: Theoretical
Fundamental challenges in segmentation evaluation:
(Ch3) How to choose a method for a given application domain?
![Page 10: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/10.jpg)
Segmentation Evaluation: Theoretical
Fundamental challenges in segmentation evaluation:
(Ch4) How to set an algorithm optimally for an application domain?
![Page 11: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/11.jpg)
Segmentation Evaluation: Theoretical
Currently, any method A can be shown empirically to be better than any method B, even when they are equivalent.
![Page 12: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/12.jpg)
Segmentation Evaluation: Theoretical
Attributes commonly used by segmentation methods:
(1) Connectedness
(2) Texture
(3) Smoothness of boundary
(4) Gradient / homogeneity
(5) Shape information about object
(6) Noise handling
(7) Optimization employed
(8) Orientedness of boundary
![Page 13: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/13.jpg)
Segmentation Evaluation: Theoretical
Attributes utilized by well-known delineation models

| Model | Connected | Gradient | Texture | Smooth | Shape | Noise | Optimize |
|---|---|---|---|---|---|---|---|
| Fuzzy con | Yes | Gr = hom affinity | Obj feat affinity | No | No | Scale FC | In RFC |
| Chan-Vese | No | No | Yes | Yes | No | No | Yes |
| Mum-Shah | No | No | Yes | Yes | No | Yes | Yes |
| KWT snake | Boundary | Yes | No | Yes | No | No | Yes |
| MSV LS | Fg when expanding | Yes | No | No | No | No | No |
| Live wire | Boundary | Yes | Yes | Yes | User | No | Yes |
| Act. shape | Yes | No | No | No | Yes | No | Yes |
| Act. app | Yes | No | Yes | No | Yes | No | Yes |
| Graph cut | Usually not | Yes | Possible | No | No | No | Yes |
| Clustering | No | No | Yes | No | No | No | Yes |
![Page 14: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/14.jpg)
Segmentation Evaluation: Theoretical
Attributes utilized by well-known delineation models

| Model | Connected | Gradient | Texture | Smooth | Shape | Noise | Optimize |
|---|---|---|---|---|---|---|---|
| Deep Learning | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
![Page 15: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/15.jpg)
Segmentation Evaluation: Empirical
T: A task. Example: Estimating the volume of brain.
B: A body region. Example: Head.
P: Imaging protocol. Example: T2 weighted MR imaging with a particular set of parameters.
Application domain: A particular triple $\langle T, B, P \rangle$.
Q: A set of scenes acquired for a particular application domain $\langle T, B, P \rangle$.
![Page 16: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/16.jpg)
Segmentation Evaluation: Empirical
The segmentation efficacy of a method M in an application domain $\langle T, B, P \rangle$ may be characterized by three groups of factors:
Precision (Reliability): Repeatability taking into account all subjective actions influencing the result.
Accuracy (Validity): Degree to which the result agrees with truth.
Efficiency (Viability): Practical viability of the method.
![Page 17: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/17.jpg)
Validation of Image Segmentation
• Spectrum of accuracy versus realism in reference standard.
• Digital phantoms.
  – Ground truth known accurately.
  – Not so realistic.
• Acquisitions and careful segmentation.
  – Some uncertainty in ground truth.
  – More realistic.
• Autopsy/histopathology.
  – Addresses pathology directly; resolution.
• Clinical data?
  – Hard to know ground truth.
  – Most realistic model.
Slide Credit: N. Archip
![Page 18: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/18.jpg)
Comparison To Higher Resolution
Panels (left to right): MRI, photograph, MRI.
Provided by Peter Ratiu and Florin Talos. Credit: N. Archip
![Page 19: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/19.jpg)
Segmentation Evaluation: Empirical
Precision
Repeatability taking into account all subjective actions that influence the segmentation result.
• Intra-operator variations
• Inter-operator variations
• Intra-scanner variations
• Inter-scanner variations
Inter-scanner variations include variations between scanners of the same brand and between scanners of different brands.
![Page 20: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/20.jpg)
Segmentation Evaluation: Empirical
Precision
A measure of precision for method M in a trial that produces $C_M^{O_1}$ and $C_M^{O_2}$ for situation $T_i$ is given by $PR_M^{T_i}$.
[Equation: definition of $PR_M^{T_i}$, labeled for intra/inter operator and intra/inter scanner ($i = 3, 4$) situations.]
$C_M^{O_1}$, $C_M^{O_2}$ may be binary or fuzzy segmentations.
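One common way to quantify repeatability between two trial segmentations is an overlap ratio (intersection over union). The sketch below assumes that measure for illustration; the function name `precision_overlap` and the use of min/max as fuzzy intersection/union are assumptions, not definitions taken from the slide.

```python
import numpy as np

def precision_overlap(seg1, seg2):
    """Overlap ratio between two repeated segmentations.

    Inputs may be binary masks or fuzzy memberships in [0, 1].
    """
    a = np.asarray(seg1, dtype=float)
    b = np.asarray(seg2, dtype=float)
    intersection = np.minimum(a, b).sum()  # fuzzy intersection (assumed: min)
    union = np.maximum(a, b).sum()         # fuzzy union (assumed: max)
    # Two empty segmentations are treated as perfectly repeatable.
    return 1.0 if union == 0 else float(intersection / union)

# Hypothetical repeat trials on a tiny 1x5 scene.
trial_1 = np.array([0, 1, 1, 1, 0])
trial_2 = np.array([0, 0, 1, 1, 1])
print(precision_overlap(trial_1, trial_2))  # 2 / 4 = 0.5
```

For binary masks, min and max reduce to the usual set intersection and union.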
![Page 21: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/21.jpg)
Segmentation Evaluation: Empirical
Accuracy
The degree to which segmentations agree with the true segmentation.
Surrogates of truth are needed.
For any image C acquired for application domain $\langle T, B, P \rangle$:
$C_M^O$ - segmentation of O in C by method M,
$C_{td}$ - surrogate of true delineation of O in C.
![Page 22: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/22.jpg)
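[Figure: diagram of the true segmentation $C_{td}$ and the segmentation $C_M^O$ by algorithm M inside the reference superset $U_d$, with the TP, FP, FN, and TN regions labeled.]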
![Page 23: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/23.jpg)
Segmentation Evaluation: Empirical
$$FNVF_M^d = \frac{|C_{td} - C_M^O|}{|C_{td}|}, \qquad TPVF_M^d = \frac{|C_{td} \cap C_M^O|}{|C_{td}|},$$
$$FPVF_M^d = \frac{|C_M^O - C_{td}|}{|U_d - C_{td}|}, \qquad TNVF_M^d = \frac{|U_d - C_M^O - C_{td}|}{|U_d - C_{td}|}.$$
$U_d$: A binary scene representing a reference superset (for example, this may be the body region that is imaged).
$FNVF_M^d$: Amount of tissue truly in O that is missed by M.
$FPVF_M^d$: Amount of tissue falsely delineated by M.
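The four volume fractions can be computed directly from binary masks. The following is a minimal sketch, assuming the segmentation, the surrogate truth, and the reference superset $U_d$ are given as boolean arrays of the same shape, and that both the truth and its complement within $U_d$ are non-empty; the function name `delineation_fractions` is hypothetical.

```python
import numpy as np

def delineation_fractions(seg, truth, superset):
    """Return TPVF, FNVF, FPVF, TNVF for binary masks inside a superset U_d."""
    superset = np.asarray(superset, dtype=bool)
    seg = np.asarray(seg, dtype=bool) & superset      # restrict to U_d
    truth = np.asarray(truth, dtype=bool) & superset  # restrict to U_d
    n_truth = truth.sum()                             # |C_td|
    n_background = (superset & ~truth).sum()          # |U_d - C_td|
    return {
        "TPVF": (truth & seg).sum() / n_truth,
        "FNVF": (truth & ~seg).sum() / n_truth,
        "FPVF": (seg & ~truth).sum() / n_background,
        "TNVF": (superset & ~seg & ~truth).sum() / n_background,
    }

# Hypothetical 1x8 scene; the whole scene is taken as U_d.
truth = np.array([0, 1, 1, 1, 0, 0, 0, 0])
seg = np.array([0, 0, 1, 1, 1, 0, 0, 0])
fractions = delineation_fractions(seg, truth, np.ones(8))
print(fractions)
```

In this toy example one of three true voxels is missed and one of five background voxels is falsely included, so FNVF is 1/3 and FPVF is 1/5.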
![Page 24: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/24.jpg)
Segmentation Evaluation: Empirical
Requirements for accuracy metrics:
(1) Capture M’s behavior of trade-off between FP and FN.
(2) Satisfy laws of tissue conservation:
$$FNVF_M^d = 1 - TPVF_M^d, \qquad FPVF_M^d = 1 - TNVF_M^d.$$
(3) Capable of characterizing the range of behavior of M.
(4) Any monotonic function g(FNVF, FPVF) is fine as a metric.
(5) Appropriate for $\langle T, B, P \rangle$.
![Page 25: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/25.jpg)
Segmentation Evaluation: Empirical
![Page 26: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/26.jpg)
Segmentation Evaluation: Empirical
Delineation Operating Characteristic (DOC)
[Plot: DOC curve of 1 − FNVF against FPVF for brain WM segmentation in PD MR images.]
Each value of the parameter vector p of M gives a point on the DOC curve. The DOC curve characterizes the behavior of M over a range of its parameter values.
$A_M$: Area under the DOC curve.
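A DOC curve can be traced by sweeping a method's parameter and recording (FPVF, 1 − FNVF) at each setting. The sketch below assumes a hypothetical method whose only parameter is a threshold on a per-voxel score map, takes the whole scene as the reference superset, and estimates the area with the trapezoidal rule; the names `doc_curve` and `area_under_doc` are illustrative.

```python
import numpy as np

def doc_curve(score, truth, thresholds):
    """Return DOC points (FPVF, 1 - FNVF), one per threshold, sorted by FPVF."""
    score = np.asarray(score, dtype=float)
    truth = np.asarray(truth, dtype=bool)
    points = []
    for p in thresholds:
        seg = score >= p                              # segmentation at parameter p
        fpvf = (seg & ~truth).sum() / (~truth).sum()
        tpvf = (seg & truth).sum() / truth.sum()      # equals 1 - FNVF
        points.append((fpvf, tpvf))
    return np.array(sorted(points))

def area_under_doc(points):
    """Trapezoidal estimate of the area under the DOC curve."""
    x, y = points[:, 0], points[:, 1]
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))

# Hypothetical score map and surrogate truth.
score = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
truth = [0, 0, 1, 1, 1, 0]
points = doc_curve(score, truth, thresholds=[0.0, 0.3, 0.5, 1.1])
area = area_under_doc(points)
print(points, area)
```

The extreme thresholds are included so that the curve spans from (0, 0) to (1, 1); with only a few parameter settings the area is a coarse estimate.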
![Page 27: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/27.jpg)
Segmentation Evaluation: Empirical
Optimally setting an algorithm for $\langle T, B, P \rangle$.
p - parameter vector for method M
$g_p(FPVF, FNVF)$ - monotonic function
$p^* = \arg\min_p \, [g_p(FPVF, FNVF)]$
Set M to operate at $p^*$.
[Plot: DOC axes FPVF and 1 − FNVF, each from 0 to 1.]
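The arg min above can be approximated by a grid search over candidate parameter values. This sketch assumes a scalar threshold parameter on a hypothetical score map and uses g = FPVF + FNVF as one possible monotonic function; both choices are illustrative assumptions.

```python
import numpy as np

def operating_point(score, truth, candidates, g=lambda fpvf, fnvf: fpvf + fnvf):
    """Grid search for the parameter p* that minimizes g(FPVF, FNVF)."""
    score = np.asarray(score, dtype=float)
    truth = np.asarray(truth, dtype=bool)
    best_p, best_cost = None, np.inf
    for p in candidates:
        seg = score >= p                               # segmentation at parameter p
        fpvf = (seg & ~truth).sum() / (~truth).sum()
        fnvf = (~seg & truth).sum() / truth.sum()
        cost = g(fpvf, fnvf)
        if cost < best_cost:                           # keep the first minimizer found
            best_p, best_cost = p, float(cost)
    return best_p, best_cost

# Hypothetical score map and surrogate truth.
score = [0.1, 0.4, 0.45, 0.8, 0.7, 0.2]
truth = [0, 0, 1, 1, 1, 0]
p_star, cost = operating_point(score, truth, candidates=[0.0, 0.3, 0.42, 0.5, 1.1])
print(p_star, cost)
```

A different g, for example one that penalizes false negatives more heavily, would generally select a different operating point on the same DOC curve.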
![Page 28: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/28.jpg)
Existing Segmentation Data
Panels: Original Image, Expert 1, Expert 2, Expert 3, Expert 4
• Manual segmentation performed by 4 independent experts
• Low-grade glioma
![Page 29: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/29.jpg)
Expert and Student Segmentations
Test image ? ?
? ?
![Page 30: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/30.jpg)
Expert and Student Segmentations
Test image Expert consensus Student 1
Student 2 Student 3
![Page 31: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/31.jpg)
Segmentation Evaluation: Empirical
Efficiency
Describes practical viability of a method.
Four factors should be considered:
(1) Computational time for one-time training of M: $t_M^{c_1}$
(2) Computational time for segmenting each scene: $t_M^{c_2}$
(3) Human time for one-time training of M: $t_M^{h_1}$
(4) Human time for segmenting each scene: $t_M^{h_2}$
(2) and (4) are crucial. (4) determines the degree of automation of M.
![Page 32: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/32.jpg)
Segmentation Evaluation: Empirical
Precision:
$PR_M^{T_1}$: intra operator
$PR_M^{T_2}$: inter operator
$PR_M^{T_3}$: intra scanner
$PR_M^{T_4}$: inter scanner
Accuracy:
$FNVF_M^d$: FN fraction for delineation
$FPVF_M^d$: FP fraction for delineation
$A_M$: Area under the DOC curve
Efficiency:
$t_M^{c_1}$: computational time for algorithm training
$t_M^{c_2}$: computational time for scene segmentation
$t_M^{h_1}$: operator time for algorithm training
$t_M^{h_2}$: operator time for scene segmentation
![Page 33: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/33.jpg)
Remarks
(1) Precision, accuracy, efficiency are interdependent.
accuracy → efficiency. precision and accuracy → difficult.
(2) “Automatic segmentation method” has no meaning unless the results are proven on a large number of data sets with acceptable precision, accuracy, efficiency, and with $t_M^{h_2} = 0$.
(3) A descriptive answer to “is method M1 better than M2 under $\langle T, B, P \rangle$?” in terms of the 11 parameters is more meaningful than a “yes” or “no” answer.
(4) DOC is essential to describe the range of behavior of M.
![Page 34: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/34.jpg)
Velazquez et al., Scientific Reports, 2013.
![Page 35: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/35.jpg)
Shape Based Metrics for Segmentation Evaluation
Example 1: Sensitivity = 94.69%, Specificity = 94.19%
Example 2: Sensitivity = 72.99%, Specificity = 78.16%
If you use only DSC (Dice similarity coefficient, an overlap measure), the DSC values are similar in both examples, but the sensitivity and specificity values are not.
Is DSC alone sufficient?
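Reporting overlap together with sensitivity and specificity takes only a few lines. This is a minimal sketch for binary masks, assuming the standard definitions DSC = 2TP / (2TP + FP + FN), sensitivity = TP / (TP + FN), and specificity = TN / (TN + FP); the function name `overlap_metrics` and the toy masks are hypothetical.

```python
import numpy as np

def overlap_metrics(seg, truth):
    """Dice, sensitivity, and specificity for two binary masks."""
    seg = np.asarray(seg, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = (seg & truth).sum()
    fp = (seg & ~truth).sum()
    fn = (~seg & truth).sum()
    tn = (~seg & ~truth).sum()
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Hypothetical 1x10 scene: 3 true positives, 1 false positive, 1 false negative.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
seg = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = overlap_metrics(seg, truth)
print(metrics)
```

Specificity depends on how much background the scene contains, which is one reason it can differ between cases whose Dice values are close.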
![Page 36: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/36.jpg)
Hausdorff Distance
• Can be used as an evaluation metric complementary to the overlap measure, for measuring boundary mismatches!
![Page 37: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/37.jpg)
Hausdorff Distance
• Lower Hausdorff distance (HD) means better segmentation accuracy!
$$HD(A, B) = \max\left( \max_{a \in A} d_B(a), \; \max_{b \in B} d_A(b) \right)$$
where $d_B(a) = \min_{b \in B} d(a, b)$ is the distance of one point a on A from B.
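The definition above translates directly into a brute-force computation over two boundary point sets. The sketch assumes Euclidean distance for d(a, b) and point sets small enough for a full pairwise distance matrix; the function names are illustrative.

```python
import numpy as np

def directed_distances(points_a, points_b):
    """d_B(a) for every a in A: distance from a to its nearest point in B."""
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    pairwise = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return pairwise.min(axis=1)

def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance: the larger of the two directed maxima."""
    return float(max(directed_distances(points_a, points_b).max(),
                     directed_distances(points_b, points_a).max()))

# Hypothetical boundary points of two contours.
contour_a = [(0, 0), (1, 0)]
contour_b = [(0, 0), (4, 0)]
print(hausdorff_distance(contour_a, contour_b))  # 3.0
```

Here every point of A is within distance 1 of B, but the point (4, 0) on B is at distance 3 from A, so the single worst boundary mismatch determines the result.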
![Page 38: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/38.jpg)
Segmentation Evaluation: STAPLE
• STAPLE (Simultaneous Truth and Performance Level Estimation):
  – An algorithm for estimating performance and ground truth from a collection of independent segmentations.
  – Warfield, Zou, Wells, MICCAI 2002.
  – Warfield, Zou, Wells, IEEE TMI 2004.
  – Publicly available.
  – The STAPLE algorithm (Warfield et al., 2004) is a region formulation for producing consensus segmentations.
  – When foreground is small → weight w is small.
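The idea of estimating truth and rater performance together can be illustrated with a simplified expectation-maximization loop for binary labels. This is a sketch under stated assumptions, not the published implementation: it uses one global foreground prior taken from the mean of the input labels, starts every rater at sensitivity and specificity 0.9, and runs a fixed number of iterations. The function name `staple_binary` and the toy data are hypothetical.

```python
import numpy as np

def staple_binary(decisions, n_iter=50, prior=None, eps=1e-6):
    """Simplified binary EM: returns per-voxel foreground weights w and
    per-rater sensitivity p and specificity q."""
    d = np.asarray(decisions, dtype=float)      # shape (n_raters, n_voxels), 0/1
    n_raters = d.shape[0]
    if prior is None:
        prior = d.mean()                        # assumed global foreground prior
    p = np.full(n_raters, 0.9)                  # assumed initial sensitivities
    q = np.full(n_raters, 0.9)                  # assumed initial specificities
    for _ in range(n_iter):
        # E-step: probability that each voxel is truly foreground.
        a = prior * np.prod(np.where(d == 1, p[:, None], 1 - p[:, None]), axis=0)
        b = (1 - prior) * np.prod(np.where(d == 1, 1 - q[:, None], q[:, None]), axis=0)
        w = a / (a + b)
        # M-step: update each rater's performance given the current weights.
        p = np.clip((d * w).sum(axis=1) / w.sum(), eps, 1 - eps)
        q = np.clip(((1 - d) * (1 - w)).sum(axis=1) / (1 - w).sum(), eps, 1 - eps)
    return w, p, q

# Hypothetical raters: two agree, the third makes two errors.
decisions = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 1],
]
w, p, q = staple_binary(decisions)
print(np.round(w, 3), np.round(p, 3), np.round(q, 3))
```

Thresholding w at 0.5 gives a consensus segmentation, and the estimated p and q rank the third rater below the other two.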
![Page 39: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/39.jpg)
Segmentation Evaluation: STAPLE
• Segmentations are generated by sampling independently at each voxel.
• However, the produced segmentations may not be realistic for two reasons.
  – First, the variability of the segmentation does not account for the intensity in the image, so borders with strong gradients are as variable as borders with weak gradients. This is counterintuitive, as the basic hypothesis of image segmentation is that changes of intensity are correlated with changes of labels.
  – Second, borders of the segmented structures are unrealistic, mainly due to their lack of geometric regularity.
![Page 40: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/40.jpg)
Regression Analysis in Clinical Problems
• Linear regression between volume(s)
  – automated segmentation’s volume vs. manual segmentation’s volume
  – Bland-Altman plot
• Linear regression between visual inspection (raters)
  – Kappa statistics
  – t-test / p-value
• Significantly different volumes? Score?
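A volume regression of this kind can be sketched with an ordinary least-squares line fit and a Pearson correlation coefficient. The function name `volume_regression` and the volumes below are made up for illustration only.

```python
import numpy as np

def volume_regression(manual, automated):
    """Least-squares line automated = slope * manual + intercept, plus Pearson r."""
    manual = np.asarray(manual, dtype=float)
    automated = np.asarray(automated, dtype=float)
    slope, intercept = np.polyfit(manual, automated, deg=1)
    r = np.corrcoef(manual, automated)[0, 1]
    return float(slope), float(intercept), float(r)

# Hypothetical volumes (mL) from manual and automated segmentation.
manual = [10.0, 20.0, 30.0, 40.0, 50.0]
automated = [11.0, 19.0, 32.0, 41.0, 52.0]
slope, intercept, r = volume_regression(manual, automated)
print(slope, intercept, r)
```

A slope near 1 and an intercept near 0 suggest agreement, but a high correlation alone does not rule out a systematic offset between the two methods.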
![Page 41: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/41.jpg)
Regression Analysis in Clinical Problems
[Plot axis label: Manual segmentation]
Vedentham, et al. JCIS, 2014
![Page 42: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/42.jpg)
What is a Bland-Altman plot?
![Page 43: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/43.jpg)
What is a Bland-Altman plot?
• A method of data plotting used to analyze the agreement between two different assays.
• Claim: any two methods that are designed to measure the same parameter should have good correlation.
  – X-axis: mean of the two measurements
  – Y-axis: difference between the two values
• Good first step in analyzing the data!
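The plotted quantities are straightforward to compute. The sketch below returns the per-case means and differences, and additionally the mean difference (bias) with limits of agreement at bias ± 1.96 standard deviations; those limits follow the usual Bland-Altman convention and are an addition not stated on the slide. The function name and the data are hypothetical.

```python
import numpy as np

def bland_altman(measure_a, measure_b):
    """Return x values (means), y values (differences), bias, and limits of agreement."""
    a = np.asarray(measure_a, dtype=float)
    b = np.asarray(measure_b, dtype=float)
    mean = (a + b) / 2.0                 # X-axis: mean of the two measurements
    diff = a - b                         # Y-axis: difference between the two values
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))         # sample standard deviation of the differences
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return mean, diff, bias, limits

# Hypothetical volumes from two segmentation methods.
mean, diff, bias, limits = bland_altman([10, 20, 30], [12, 19, 31])
print(mean, diff, bias, limits)
```

Plotting `diff` against `mean` with horizontal lines at `bias` and the two limits gives the familiar figure.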
![Page 44: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/44.jpg)
Bland-Altman Plots (e.g., airway segmentation evaluation)
Xu, Bagci, et al. MedIA, 2015.
![Page 45: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/45.jpg)
New Directions: Sampling Image Segmentations (Le et al, MedIA, 2016)
• Automatically produce plausible image segmentation samples from a single expert segmentation!
![Page 46: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/46.jpg)
New Directions: Sampling Image Segmentations (Le et al, MedIA, 2016)
• A probability distribution over image segmentation boundaries is defined as a Gaussian process, which leads to segmentations that are spatially coherent and consistent with the presence of salient borders in the image.
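To give a feel for what sampling from a Gaussian process means, the sketch below draws spatially coherent samples of a one-dimensional level-set function around a mean. It is a generic illustration, not the method of Le et al.: the squared-exponential kernel, its length scale and variance, the jitter term, and the 1D setting are all assumptions.

```python
import numpy as np

def sample_gp(x, mean, length_scale=1.0, variance=1.0, n_samples=3, seed=0):
    """Draw samples from GP(mean, K) with a squared-exponential kernel K."""
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    sq_dist = (x[:, None] - x[None, :]) ** 2
    cov = variance * np.exp(-0.5 * sq_dist / length_scale**2)
    cov += 1e-6 * np.eye(len(x))                 # jitter for numerical stability
    chol = np.linalg.cholesky(cov)
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((len(x), n_samples))
    return mean[:, None] + chol @ z              # shape (n_points, n_samples)

# Mean level-set function: signed distance to a boundary at x = 5 (hypothetical).
x = np.linspace(0.0, 10.0, 20)
samples = sample_gp(x, mean=x - 5.0)
print(samples.shape)
```

Each column is one sampled level-set function; its zero crossing plays the role of a sampled boundary, and a larger length scale yields smoother variations between neighboring positions.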
![Page 47: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/47.jpg)
The Gaussian Density
![Page 48: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/48.jpg)
Remark: Gaussian Process (GP) ?
Credit: Ghahramani
![Page 49: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/49.jpg)
Remark: Gaussian Process (GP) ?
Credit: Ghahramani
![Page 50: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/50.jpg)
Remark: Gaussian Process (GP) ?
Credit: Ghahramani
![Page 51: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/51.jpg)
Remark: Gaussian Process (GP) ?
![Page 52: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/52.jpg)
Remark: Gaussian Process (GP) ?
![Page 53: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/53.jpg)
New Directions: Sampling Image Segmentations (Le et al, MedIA, 2016)
![Page 54: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/54.jpg)
Sample segmentation contours according to mean inter-sample Dice coefficient!
(Top Left) Mean of the GP µ; (Top Middle) Sample of the level set function φ(a) drawn from 𝒢𝒫(µ,Σ); (Others) GPSSI samples. The ground truth is outlined in red, the GPSSI samples are outlined in orange.
![Page 55: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/55.jpg)
New Directions: Sampling Image Segmentations (Le et al, MedIA, 2016)
(Left) Signed geodesic distance µ(a) of the ROI with isocontours –45, 0, 45, 100, 200. (Right) One can check that the samples most probably lie in the region delineated by the isocontours µ(a)=±45. The sampled contours are in orange.
![Page 56: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/56.jpg)
New Directions: Sampling Image Segmentations (Le et al, MedIA, 2016)
![Page 57: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/57.jpg)
New Directions: Sampling Image Segmentations (Le et al, MedIA, 2016)
![Page 58: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/58.jpg)
Provocative Question?
• Can we evaluate segmentation error without the ground truth?
![Page 59: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/59.jpg)
Provocative Question?
• With machine learning support, can we design a classifier that LEARNS segmentation error and adapts itself for better delineation?
![Page 60: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/60.jpg)
Summary
• Segmentation Evaluation
  – Theoretical vs. Empirical
  – Visual Assessment
  – Volumetric Agreement
  – Efficacy (efficiency, accuracy, …)
  – STAPLE
  – New Trends!
  – Segmentation Challenges (choose your project!)
![Page 61: Lec14: Evaluation Framework for Medical Image Segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052514/58ed20e51a28ab43498b467b/html5/thumbnails/61.jpg)
Slide Credits and References
• Credits to: Jayaram K. Udupa of Univ. of Penn., MIPG
• Bagci’s CV Course 2015 Fall.
• K.D. Toennies, Guide to Medical Image Analysis.
• Handbook of Medical Imaging, Vol. 2. SPIE Press.
• Handbook of Biomedical Imaging, Paragios, Duncan, Ayache.
• Suetens, P., Medical Imaging, Cambridge Press.
• Neculai Archip, Ph.D.
• Simon K. Warfield, Ph.D. (See STAPLE Algorithm)