An EM-Algorithm for Analyzing Multi-Center Repeated fMRI Data
Kelly H. Zou, PhD
Assistant Professor of Radiology
Department of Radiology, Brigham and Women’s Hospital
Department of Health Care Policy, Harvard Medical School
Joint Statistical Meetings
August 8, 2004
Quality Assessment BWH, HMS
fBIRN
Functional Imaging Research for Schizophrenia Testbed,
Biomedical Informatics Research Network
Co-Authors
Steven D. Pieper, PhD
Meng Wang, MSE
Simon K. Warfield, PhD
William M. Wells, PhD
Ron Kikinis, MD
FIRST BIRN
Brigham and Women’s Hospital
Harvard Medical School
NCRR P41RR13218
Background
Multi-Site BIRN Study: 11 Sites (MN, UCI, UNC, UCLA…, BWH, MGH)
5 Healthy males as “Human Phantoms”
2 Visits on separate days per site per subject
2 extra visits at one site for 3 of the 5 subjects
4 Sensory motor (SM), 2 cognitive (Cog), 2 breath-hold (BH) runs per visit
Clinical Objectives
Is it meaningful to pool the data to yield a larger sample size in the next-phase clinical study (schizophrenic vs. normal controls)?
How to assess the effects of various factors?
Statistical Objectives
Ultimate problem (Pooling): How to combine multi-site data and to validate the pooling mechanism?
Current problem (Calibration): How to compare and combine data in the calibration step?
https://share.spl.harvard.edu/users/zou
Materials and Methods
Focus on Reproducibility of the SM Task
Subjects perform bilateral finger tapping on button boxes (1 dummy button box and 1 actual) in time with a 3Hz audio cue and a flashing checkerboard square
They press buttons 1 - 4 in consecutive order and then back again, using both hands simultaneously and in sync
Pulse Sequences
A
T2SE: Spin echo T2-weighted, oblique axial; 256x192 matrix; 35 slices, 4mm inter; Scan time=2:24 min
3D SPGR: 3D Spoiled GRASS, axial; 256x192 matrix; 124-128 slices, 1.2mm; Scan time=9:02 min
F
EPI or Spiral GRE: Echo-planar imaging or spiral gradient echo imaging, oblique axial; 64x64 matrix, 1 shot; 35 slices, 4mm
Materials and Methods
Task: Sensory Motor
Site: 5 Sites with 1.5T, 4 with 3T, 1 with 4T
Subject: #101; 103; 104; 105; 106
Run: 4 and registered
Day: #101; 103; 106 tested on 4 days at Stanford and other subjects tested on 2 Days/Site
Threshold: Activation data: $-\log_{10}(p\text{-value}) \cdot \mathrm{sign}(F\text{-statistic}) = 10^{-9}$
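The activation scale on this slide can be illustrated with a short Python sketch. It assumes the "10^-9" refers to a p-value cutoff, so that a voxel counts as active when its absolute activation value reaches 9; the function names and the `p_threshold` parameter are hypothetical and do not come from the slides.

```python
import math


def signed_neg_log10_p(p_value, f_statistic):
    """Return -log10(p-value) multiplied by the sign of the statistic."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must lie in (0, 1]")
    sign = (f_statistic > 0) - (f_statistic < 0)  # -1, 0, or +1
    return -math.log10(p_value) * sign


def is_active(p_value, f_statistic, p_threshold=1e-9):
    """Flag a voxel whose absolute activation reaches -log10(p_threshold).

    Assumption: the threshold is a p-value cutoff applied to the absolute value.
    """
    cutoff = -math.log10(p_threshold)
    return abs(signed_neg_log10_p(p_value, f_statistic)) >= cutoff


strong = is_active(1e-12, 4.0)  # well beyond the assumed cutoff
weak = is_active(1e-3, 4.0)     # far below the assumed cutoff
```

Working on the -log10 scale keeps very small p-values numerically comparable, which is presumably why the threshold is stated on that scale.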
Materials and Methods
Image registration over the repeated runs across sites using FreeSurfer
Voxel-to-voxel registration of the anatomical with the functional volume to convert the subject's anatomical volume to the corresponding functional space using a transformation matrix
Materials and Methods
TkRegister defines the registration matrix
$$T = \begin{pmatrix} -d_c & 0 & 0 & (N_c/2)\,d_c \\ 0 & 0 & d_s & -(N_s/2)\,d_s \\ 0 & -d_r & 0 & (N_r/2)\,d_r \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
$d_c$, $d_r$, and $d_s$ are the resolutions, and $N_c$, $N_r$, and $N_s$ are the number of columns, rows, and slices
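To make the matrix concrete, the following Python sketch builds T from the resolutions and dimensions and applies it to a voxel index. The function names and the example volume (64x64x35 voxels at 3.4375 x 3.4375 x 4 mm) are illustrative assumptions, not values taken from the study.

```python
def tkregister_matrix(d_c, d_r, d_s, n_c, n_r, n_s):
    """Build the 4x4 matrix T from resolutions (d_*) and dimensions (n_*)."""
    return [
        [-d_c, 0.0, 0.0, (n_c / 2) * d_c],
        [0.0, 0.0, d_s, -(n_s / 2) * d_s],
        [0.0, -d_r, 0.0, (n_r / 2) * d_r],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply_to_voxel(matrix, col, row, slc):
    """Multiply T by the homogeneous voxel index (col, row, slice, 1)."""
    v = (col, row, slc, 1.0)
    return tuple(sum(m * x for m, x in zip(r, v)) for r in matrix[:3])


# Hypothetical functional volume: 64 columns, 64 rows, 35 slices.
T = tkregister_matrix(3.4375, 3.4375, 4.0, 64, 64, 35)
center = apply_to_voxel(T, 32, 32, 17.5)  # the volume center maps to the origin
corner = apply_to_voxel(T, 0, 0, 0)       # the first voxel maps to the offsets
```

Under this form of T, the offsets in the last column place the coordinate origin at the center of the volume, and the sign pattern flips the column and row axes.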
Materials and Methods
#   Variable Name   Values
1   Subject         1 - 5
2   Site            1 - 10
3   Visit           1, 2 (all); 1 - 4 (1 site, 3 subjects)
4   Run             1 - 4/visit
5   Strength        1.5T, 3T, 4T
6   Maker           Siemens, GE, Picker
7   K-Space         Raster, Spiral, Dual-Echo Raster
Materials and Methods
Selection of Threshold:
The threshold was selected on the scale of the activation data
The 3D activation map was further standardized using the absolute value for each voxel prior to statistical inferences
Materials and Methods
Complete data density: $f(\mathbf{D}, \mathbf{T} \mid \mathbf{p}, \mathbf{q})$
Binary ground truth $T_i$ for voxel $i$
Expert $j$ segmentation $D_{ij}$
Expert performance characterized by sensitivity $p$ and specificity $q$
We observe expert decisions $\mathbf{D}$
To construct maximum likelihood estimates for each expert’s sensitivity and specificity:
$$(\hat{\mathbf{p}}, \hat{\mathbf{q}}) = \arg\max_{\mathbf{p}, \mathbf{q}} \ln f(\mathbf{D}, \mathbf{T} \mid \mathbf{p}, \mathbf{q})$$
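Solve the incomplete-data log likelihood maximization problem by Expectation Maximization (EM)
$$\hat{\boldsymbol{\theta}} = \arg\max_{\boldsymbol{\theta}} \ln f(\mathbf{D} \mid \boldsymbol{\theta})$$
$$Q(\boldsymbol{\theta} \mid \hat{\boldsymbol{\theta}}) = E\left[\ln f(\mathbf{D}, \mathbf{T} \mid \boldsymbol{\theta}) \mid \mathbf{D}, \hat{\boldsymbol{\theta}}\right]$$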
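A minimal Python sketch of such an EM iteration for binary maps follows, assuming the parameter vector collects each rater's sensitivity p_j and specificity q_j and that voxels are independent given the truth. The function name `staple_binary`, the fixed prior of 0.5, the initial values, and the toy data are all illustrative assumptions rather than the settings used in the study.

```python
def staple_binary(decisions, prior=0.5, p_init=0.9, q_init=0.9, n_iter=50):
    """EM for binary decisions; decisions[i][j] is rater j's label at voxel i."""
    n_raters = len(decisions[0])
    p = [p_init] * n_raters  # sensitivities
    q = [q_init] * n_raters  # specificities
    weights = []
    for _ in range(n_iter):
        # E-step: posterior probability that each voxel is truly active.
        weights = []
        for row in decisions:
            a, b = prior, 1.0 - prior
            for j, d in enumerate(row):
                a *= p[j] if d else 1.0 - p[j]
                b *= 1.0 - q[j] if d else q[j]
            weights.append(a / (a + b) if a + b > 0 else 0.5)
        # M-step: re-estimate each rater's sensitivity and specificity.
        w_sum = sum(weights)
        nw_sum = len(weights) - w_sum
        for j in range(n_raters):
            tp = sum(w for w, row in zip(weights, decisions) if row[j])
            tn = sum(1.0 - w for w, row in zip(weights, decisions) if not row[j])
            if w_sum > 0:
                p[j] = tp / w_sum
            if nw_sum > 0:
                q[j] = tn / nw_sum
    return weights, p, q


# Toy data: six voxels, three raters; the third rater disagrees twice.
decisions = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
    [0, 0, 1],
]
weights, p, q = staple_binary(decisions)
```

In the setting of this talk, the "raters" would be repeated runs or sites rather than human experts. The returned weights are the estimated probabilities that each voxel is truly active.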
Materials and Methods
Visit 1
Level 1A: STAPLE EM Across 4 Runs/Visit, Within Site Within Visit
Level 2A: STAPLE EM Over All Sites Within Visit
Level 2B: STAPLE EM Within Field Strength, Across the Sites/Field Strength Within Visit
Visit 2
Level 1B: STAPLE EM Across 4 Runs/Visit, Within Site Within Visit
Level 2A: STAPLE EM Over All Sites Within Visit
Level 2B: STAPLE EM Within Field Strength, Across the Sites/Field Strength Within Visit
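The two-level grouping can be sketched in Python as below. To keep the example self-contained, a simple majority vote stands in for the STAPLE EM combiner; that substitution, the record layout, the helper names, and the toy maps are all assumptions made for illustration.

```python
from collections import defaultdict


def majority_vote(maps):
    """Stand-in combiner (not STAPLE): active if more than half the maps agree."""
    n = len(maps)
    return [1 if 2 * sum(votes) > n else 0 for votes in zip(*maps)]


def combine_by(records, key, combine=majority_vote):
    """Group (labels, binary_map) records by key(labels) and combine each group."""
    groups = defaultdict(list)
    for labels, binary_map in records:
        groups[key(labels)].append((labels, binary_map))
    combined = []
    for members in groups.values():
        labels = members[0][0]  # keep the first member's labels for the group
        combined.append((labels, combine([m for _, m in members])))
    return combined


# Hypothetical data: two sites, one visit, four runs each, four voxels per map.
site_a = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 1, 0]]
site_b = [[1, 1, 1, 0]] * 4
runs = [({"site": "A", "field": "1.5T", "visit": 1}, m) for m in site_a]
runs += [({"site": "B", "field": "3T", "visit": 1}, m) for m in site_b]

# Level 1: across the 4 runs, within site and within visit.
level1 = combine_by(runs, key=lambda s: (s["site"], s["visit"]))
# Level 2A: over all sites, within visit.
level2a = combine_by(level1, key=lambda s: s["visit"])
# Level 2B: across the sites sharing a field strength, within visit.
level2b = combine_by(level1, key=lambda s: (s["field"], s["visit"]))
```

Passing a different `combine` function would swap in another estimator without changing the grouping logic.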
Materials and Methods
Statistical methods
Activation percentage
Sensitivity and Specificity
Receiver operating characteristic curve
Linear model
Analysis of variance
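The first three summaries in this list can be sketched in Python as follows, assuming a binary activation map is compared against a binary reference map (for example, an estimated truth). The function names and the toy values are hypothetical.

```python
def activation_percentage(binary_map):
    """Percentage of voxels labeled active."""
    return 100.0 * sum(binary_map) / len(binary_map)


def sensitivity_specificity(binary_map, reference):
    """Sensitivity and specificity of a map against a binary reference."""
    tp = sum(1 for d, t in zip(binary_map, reference) if d and t)
    tn = sum(1 for d, t in zip(binary_map, reference) if not d and not t)
    positives = sum(1 for t in reference if t)
    negatives = len(reference) - positives
    sens = tp / positives if positives else float("nan")
    spec = tn / negatives if negatives else float("nan")
    return sens, spec


def roc_points(scores, reference, thresholds):
    """(1 - specificity, sensitivity) pairs, one per threshold on the scores."""
    points = []
    for c in thresholds:
        sens, spec = sensitivity_specificity([s >= c for s in scores], reference)
        points.append((1.0 - spec, sens))
    return points


# Toy example with eight voxels.
reference = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [9.5, 12.0, 3.0, 10.0, 1.0, 0.5, 2.0, 4.0]
binary_map = [1 if s >= 9 else 0 for s in scores]

percent = activation_percentage(binary_map)
sens, spec = sensitivity_specificity(binary_map, reference)
curve = roc_points(scores, reference, thresholds=[0, 9, 100])
```

Sweeping the threshold over the continuous activation scores traces out the ROC curve, with each threshold contributing one operating point.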
Results
Subject 104, Visit 1, Slice #18
Results
Statistically significant factors impacting
on sensitivity: subject (p=0.01)
on specificity: subject (p=0.04), run (p=0.04)
Results
Activation Percentage
Results
Sensitivity
Results
Specificity
Conclusion
Site vs. Subject: Variability across subjects > variability across sites
Field Strength: 3T and 4T were better than 1.5T, yielding more activation and less variability in sensitivity and specificity
Runs: There was a non-constant effect after resting and task periods
Remark
Our findings may help develop a calibration plan to minimize the variability introduced by the sites
Enabling us to pool independent functional data of normal and schizophrenic subjects across different institutions
Future Research
Standardization across subjects
Degree of smoothing
Schizophrenic vs. healthy controls
Longitudinal changes over time
References
Genovese CR, Noll DC, and Eddy WF. Estimating test-retest reliability in fMRI I: statistical methodology. Magnetic Resonance in Medicine 1997; 38: 497-507.
Le TH and Hu X. Methods for assessing accuracy and reliability in functional MRI. NMR in Biomedicine 1997; 10: 160-164.
Machielsen WCM, Rombouts SARB, Barkhof F, Scheltens P, and Witter MP. fMRI of visual encoding: reproducibility of activation. Human Brain Mapping 2000; 9: 156-164.
Maitra R, Roys SR, and Gullapalli RP. Test-retest reliability estimation of functional MRI data. Magnetic Resonance in Medicine 2002; 48: 62-70.
References on fMRI and EM
Warfield SK, Zou KH, Wells WM III. Simultaneous Truth and Performance Level Estimation (STAPLE): An Algorithm for the Validation of Image Segmentation. IEEE Transactions on Medical Imaging 2004; 23: 903-921.
Warfield SK, Zou KH, Wells WM III. Validation of image segmentation and expert quality with an expectation-maximization algorithm. MICCAI 2002, LNCS 2002; 2488: 290-297.
Wei XC, Yoo S-S, Dickey CC, Zou KH, Guttmann CRG, Panych LP. Functional MRI of auditory verbal working memory: long-term reproducibility analysis. NeuroImage 2004; 21: 1000-1008.
References on Validation Metrics
Zou KH, Wells WM III, Kikinis R, Warfield SK. Three validation metrics for automated probabilistic image segmentation of brain tumors. Statistics in Medicine 2004; 23: 1259-1282.
Zou KH, Warfield SK, Fielding JR, Tempany CM, Wells WM III, Kaus MR, Jolesz FA, Kikinis R. Statistical validation based on parametric receiver operating characteristic analysis of continuous classification data. Academic Radiology 2003; 10: 1359-1368.
Zou KH, Warfield SK, Bharatha A, Tempany CMC, Kaus M, Haker S, Wells WM III, Jolesz FA, Kikinis R. Statistical validation of image segmentation quality based on a spatial overlap index. Academic Radiology 2004; 11: 178-189.