
A Computer-Aided Diagnosis System For Digital Mammograms Based on Radial Basis Functions and Feature Extraction Techniques

Dissertation written by

Mohammed Jirari

Proposal for Ph.D. Candidate Examination

July 23rd, 2003


Why This Project?

• Breast cancer is the most common cancer among women and the second leading cause of cancer deaths among women

• Mammographic screening reduces breast cancer mortality

• However, mammography has a low positive predictive value (PPV): only 35% of positive findings are malignant

• The goal of computer-aided diagnosis (CAD) is to provide a second reading, thereby reducing the false positive rate


What is a Mammogram?

• A Mammogram is an x-ray image of the breast. Mammography is the procedure used to generate a mammogram

• The equipment used to obtain a mammogram, however, is very different from that used to perform an x-ray of chest or bones. The breast is composed of tissues that are similar to each other in density. Changes or abnormalities in the breast tissue are often very subtle. Therefore, the mammogram machines, film, and developing process are specially designed to take pictures of these subtle differences.


Mammograms (cont.)

• In order to get a good image, the breast must also be flattened or compressed. This may be uncomfortable, but it will not harm the breast in any way and is extremely important for obtaining a clear image. Compression of the breast is also beneficial because it results in a lower dose of radiation.

• In a standard examination, two images of each breast are taken--one from the top (called a cranio-caudal or CC view) and one from the side (called a medio-lateral oblique or MLO view). This ensures that the images display as much breast tissue as possible.


Mammogram Examples

Mammogram of a left breast, cranio-caudal (from the top) view

Mammogram of a left breast, medio-lateral oblique (from the side) view


Purpose of CAD

• Mammography is the most reliable method for the early detection of breast cancer.

• However, because of the high number of mammograms to be read, the accuracy rate tends to decrease.

• Double reading of mammograms has been proven to increase accuracy, but at a high cost.

• CAD can help the medical staff achieve high efficiency and effectiveness.

• The physician/radiologist makes the call, not the CAD system.


Proposed Method

• The proposed method will assist the physician by providing a second opinion on reading the mammogram, by pointing out an area (if one exists) delimited by its center coordinates and its radius.

• If the two readings agree, no further work is needed.

• If they differ, the radiologist looks at the mammogram one more time to make the final diagnosis.


Co-occurrence Matrices

• A co-occurrence matrix gives the joint probability of occurrence of gray levels a and b for two pixels with a defined spatial relationship in an image.

• The spatial relationship is defined in terms of distance d and angle θ.

• From these matrices, a variety of features may be extracted.


Co-occurrence Matrices (cont.)

• In my project, the matrices are constructed at distances d=1 and d=3 and for angles θ=0°, 45°, 90°, 135°.

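• For each matrix, eight features are extracted.

• The matrices can be formally represented as follows:

$$P_{0^\circ,d}(a,b) = \left|\{[(k,l),(m,n)] \in D : k-m=0,\ |l-n|=d,\ f(k,l)=a,\ f(m,n)=b\}\right|$$

$$P_{45^\circ,d}(a,b) = \left|\{[(k,l),(m,n)] \in D : (k-m=d,\ l-n=-d) \text{ or } (k-m=-d,\ l-n=d),\ f(k,l)=a,\ f(m,n)=b\}\right|$$

$$P_{90^\circ,d}(a,b) = \left|\{[(k,l),(m,n)] \in D : |k-m|=d,\ l-n=0,\ f(k,l)=a,\ f(m,n)=b\}\right|$$

$$P_{135^\circ,d}(a,b) = \left|\{[(k,l),(m,n)] \in D : (k-m=d,\ l-n=d) \text{ or } (k-m=-d,\ l-n=-d),\ f(k,l)=a,\ f(m,n)=b\}\right|$$

Here f(k,l) is the gray level at pixel (k,l), D is the set of all pairs of pixel positions in the image, and |{...}| denotes the number of elements in the set.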
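The co-occurrence matrix for one distance d and one angle θ can be sketched in NumPy as follows. This is a minimal illustration, not the code used in the project: the function name `cooccurrence`, the `OFFSETS` table, and the choice to normalize the counts so they sum to 1 (giving joint probabilities) are assumptions made for this example. Each pixel pair is counted in both orders, which makes the matrix symmetric.

```python
import numpy as np

# (row step, column step) per angle; rows grow downward in image coordinates.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def cooccurrence(image, d, angle, levels):
    """Symmetric, normalized co-occurrence matrix for distance d and angle (degrees).

    image  : 2-D integer array with gray levels in [0, levels)
    returns: (levels, levels) array whose entries sum to 1 (or all zeros
             if no pixel pair exists at that offset)
    """
    dr, dc = OFFSETS[angle]
    dr, dc = dr * d, dc * d
    rows, cols = image.shape

    # Slices selecting every pixel whose neighbor at (dr, dc) is inside the image.
    r0, r1 = max(0, -dr), min(rows, rows - dr)
    c0, c1 = max(0, -dc), min(cols, cols - dc)
    counts = np.zeros((levels, levels), dtype=np.float64)
    if r0 >= r1 or c0 >= c1:
        return counts  # offset larger than the image: no pairs

    first = image[r0:r1, c0:c1]
    second = image[r0 + dr:r1 + dr, c0 + dc:c1 + dc]

    # Accumulate one count per (first gray level, second gray level) pair.
    np.add.at(counts, (first.ravel(), second.ravel()), 1)

    # Count each pair in both orders, as the set definition does.
    counts = counts + counts.T
    return counts / counts.sum()

# Small 4-level test image.
example = np.array([[0, 0, 1, 1],
                    [0, 0, 1, 1],
                    [0, 2, 2, 2],
                    [2, 2, 3, 3]])
P_0_1 = cooccurrence(example, d=1, angle=0, levels=4)
```

For a full 256-level mammogram, `levels=256` would give a 256 × 256 matrix for each (d, θ) combination.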


Features Used

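• Energy or angular second moment:

$$\sum_{a,b} P_{\theta,d}^{2}(a,b)$$

• Entropy:

$$-\sum_{a,b} P_{\theta,d}(a,b)\,\log_2 P_{\theta,d}(a,b)$$

• Maximum probability:

$$\max_{a,b} P_{\theta,d}(a,b)$$

• Inverse difference moment (κ=2, λ=1):

$$\sum_{a,b;\ a \neq b} \frac{P_{\theta,d}^{\lambda}(a,b)}{|a-b|^{\kappa}}$$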
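As a rough sketch, the four features on this slide can be computed from a normalized co-occurrence matrix `P` as shown below. The function names are hypothetical, the entropy uses the usual negative-sum convention with 0·log 0 treated as 0, and the defaults `kappa=2`, `lam=1` follow the values given on the slide.

```python
import numpy as np

def energy(P):
    """Energy (angular second moment): sum of squared entries."""
    return float(np.sum(P ** 2))

def entropy(P):
    """Entropy in bits; zero entries are skipped (0 * log 0 is taken as 0)."""
    nonzero = P[P > 0]
    return float(-np.sum(nonzero * np.log2(nonzero)))

def max_probability(P):
    """Largest entry of the co-occurrence matrix."""
    return float(P.max())

def inverse_difference_moment(P, kappa=2, lam=1):
    """Sum over a != b of P(a, b)^lam / |a - b|^kappa."""
    a, b = np.indices(P.shape)
    off_diagonal = a != b
    diff = np.abs(a[off_diagonal] - b[off_diagonal])
    return float(np.sum(P[off_diagonal] ** lam / diff ** kappa))

# Toy 2-level matrix used only to exercise the functions.
P_toy = np.array([[0.50, 0.25],
                  [0.25, 0.00]])
features = [energy(P_toy), entropy(P_toy),
            max_probability(P_toy), inverse_difference_moment(P_toy)]
```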


Features Used (cont.)

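• Contrast:

$$\sum_{a,b} |a-b|^{\kappa}\, P_{\theta,d}^{\lambda}(a,b)$$

• Homogeneity:

$$\sum_{a}\sum_{b} \frac{1}{1+(a-b)^{2}}\, P_{\theta,d}(a,b)$$

• Inertia or variance:

$$\sum_{a}\sum_{b} (a-b)^{2}\, P_{\theta,d}(a,b)$$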
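The contrast, homogeneity, and inertia features can be sketched the same way. The function names are hypothetical, and `P` is assumed to be a normalized co-occurrence matrix. Note that with κ=2 and λ=1 the contrast formula reduces to the inertia formula, so the two features coincide for those parameter values.

```python
import numpy as np

def contrast(P, kappa=2, lam=1):
    """Sum of |a - b|^kappa * P(a, b)^lam over all gray-level pairs."""
    a, b = np.indices(P.shape)
    return float(np.sum(np.abs(a - b) ** kappa * P ** lam))

def homogeneity(P):
    """Sum of P(a, b) / (1 + (a - b)^2); large when mass sits near the diagonal."""
    a, b = np.indices(P.shape)
    return float(np.sum(P / (1.0 + (a - b) ** 2)))

def inertia(P):
    """Sum of (a - b)^2 * P(a, b)."""
    a, b = np.indices(P.shape)
    return float(np.sum((a - b) ** 2 * P))

# Toy 2-level matrix used only to exercise the functions.
P_toy = np.array([[0.50, 0.25],
                  [0.25, 0.00]])
values = contrast(P_toy), homogeneity(P_toy), inertia(P_toy)
```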


Features Used (cont.)

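• Correlation:

$$\frac{\sum_{a,b} (ab)\, P_{\theta,d}(a,b) - \mu_x \mu_y}{\sigma_x \sigma_y}$$

where

$$\mu_x = \sum_{a} a \sum_{b} P_{\theta,d}(a,b), \qquad \mu_y = \sum_{b} b \sum_{a} P_{\theta,d}(a,b)$$

$$\sigma_x^{2} = \sum_{a} (a-\mu_x)^{2} \sum_{b} P_{\theta,d}(a,b), \qquad \sigma_y^{2} = \sum_{b} (b-\mu_y)^{2} \sum_{a} P_{\theta,d}(a,b)$$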
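A possible NumPy sketch of the correlation feature follows, using the row and column marginals of a normalized co-occurrence matrix `P`. The function name is hypothetical, and returning 0 when either marginal has zero variance is a choice made for this example only.

```python
import numpy as np

def correlation(P):
    """Correlation of the two gray levels under the joint distribution P."""
    levels = np.arange(P.shape[0])
    row_marginal = P.sum(axis=1)   # sum over b for each a
    col_marginal = P.sum(axis=0)   # sum over a for each b

    mu_x = np.sum(levels * row_marginal)
    mu_y = np.sum(levels * col_marginal)
    var_x = np.sum((levels - mu_x) ** 2 * row_marginal)
    var_y = np.sum((levels - mu_y) ** 2 * col_marginal)

    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: undefined correlation is reported as 0

    a, b = np.indices(P.shape)
    mean_product = np.sum(a * b * P)
    return float((mean_product - mu_x * mu_y) / np.sqrt(var_x * var_y))

# Neighboring pixels always share a gray level: perfect positive correlation.
P_diag = np.array([[0.5, 0.0],
                   [0.0, 0.5]])
rho = correlation(P_diag)
```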


Radial Basis Network Used

• Radial basis networks may require more neurons than standard feed-forward backpropagation (FFBP) networks

• However, they can be designed in a fraction of the time it takes to train an FFBP network

• They work best with many training vectors


Radial Basis Network with R Inputs


a = radbas(n)

$$\mathrm{radbas}(n) = e^{-n^{2}}$$
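The transfer function above is a Gaussian bump centered at zero. A one-line NumPy equivalent, given here only as an illustration, is:

```python
import numpy as np

def radbas(n):
    """Radial basis transfer function: exp(-n^2), applied element-wise."""
    return np.exp(-np.square(n))

# The output is 1 at n = 0 and falls to about 0.5 at n = 0.8326.
samples = radbas(np.array([0.0, 0.8326, 2.0]))
```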


A radial basis network consists of 2 layers: a hidden radial basis layer of S1 neurons and a linear output layer of S2 neurons.
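A forward pass through such a two-layer network might look like the sketch below. It assumes the convention of MATLAB's Neural Network Toolbox, in which each hidden neuron computes the distance between the input and its center, scales it by a bias of 0.8326/spread, and applies radbas, so that the neuron outputs about 0.5 at a distance of `spread` from its center. The function and parameter names are hypothetical.

```python
import numpy as np

def rbf_forward(p, centers, spread, W2, b2):
    """Forward pass of a two-layer radial basis network.

    p       : (R,)     input vector
    centers : (S1, R)  one center per hidden neuron
    spread  : float    width of the radial basis functions
    W2      : (S2, S1) weights of the linear output layer
    b2      : (S2,)    biases of the linear output layer
    """
    distances = np.linalg.norm(centers - p, axis=1)   # ||center - p|| per neuron
    bias1 = 0.8326 / spread                           # assumed toolbox convention
    a1 = np.exp(-np.square(distances * bias1))        # hidden layer: radbas
    return W2 @ a1 + b2                               # output layer: linear

# Toy network: R = 2 inputs, S1 = 2 hidden neurons, S2 = 1 output.
centers = np.array([[0.0, 0.0], [1.0, 0.0]])
W2 = np.array([[1.0, 0.0]])
b2 = np.array([0.0])
output = rbf_forward(np.array([0.0, 0.0]), centers, spread=1.0, W2=W2, b2=b2)
```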


Data Used in my Project

• The dataset used is the Mammographic Image Analysis Society (MIAS) MiniMIAS database, which contains medio-lateral oblique (MLO) views of each breast for 161 patients, for a total of 322 images. Every image is 1024 × 1024 pixels with 256 gray levels.


Preprocessing

• In order to improve the quality of the images and make feature extraction more reliable, 2 techniques were used:

* Cropping: cuts the black parts of the image (almost 50%) based on a threshold

* Enhancement: histogram equalization to accentuate the features to be extracted by increasing the dynamic range of gray levels
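The two preprocessing steps can be sketched as follows. This is an illustration under stated assumptions: cropping is implemented as a bounding box around all pixels brighter than a threshold, the threshold value of 10 is arbitrary, and the equalization is the standard cumulative-histogram mapping for 8-bit images. The function names are hypothetical.

```python
import numpy as np

def crop_background(image, threshold=10):
    """Crop to the bounding box of pixels brighter than the threshold."""
    mask = image > threshold
    if not mask.any():
        return image  # nothing above threshold: leave the image unchanged
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return image[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

def equalize_histogram(image, levels=256):
    """Spread the gray levels over the full range using the cumulative histogram."""
    hist = np.bincount(image.ravel(), minlength=levels)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # cumulative count at the darkest level present
    denom = cdf[-1] - cdf_min
    if denom == 0:
        return image.copy()            # single gray level: nothing to equalize
    lut = np.round((cdf - cdf_min) / denom * (levels - 1))
    lut = np.clip(lut, 0, levels - 1).astype(image.dtype)
    return lut[image]                  # map every pixel through the lookup table

# Synthetic 8-bit image: dark background with a small brighter region.
img = np.zeros((6, 8), dtype=np.uint8)
img[1:4, 2:5] = [[50, 60, 70], [50, 60, 70], [80, 80, 80]]
cropped = crop_background(img)
enhanced = equalize_histogram(cropped)
```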


Preprocessing result

(a) Original mammogram; (b) after cropping; (c) after cropping and histogram equalization


Feature extraction

The extraction phase avoids feeding the whole image as input to the neural network. The method takes the whole cropped image and calculates the co-occurrence matrices at distances d=1 and d=3. The angles used are θ=0°, 45°, 90°, 135°, with a fifth matrix being the mean of the 4 directions. From each co-occurrence matrix, the eight statistical features mentioned earlier are computed.


Training

• After normalizing the data between 0 and 1 for the network to have a common range, the training begins.

• The first training set was made up of 212 mammograms with 81 abnormal ones, with features calculated at distances d=1 and d=3.

• The second training set was made up of 163 mammograms with 81 abnormal ones, with features calculated at distances d=1 and d=3.
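The normalization step can be illustrated with per-feature min-max scaling. The slides only state that the data are normalized between 0 and 1, so scaling each feature column independently, and mapping constant columns to 0, are assumptions of this sketch.

```python
import numpy as np

def minmax_normalize(X):
    """Scale each feature (column) of X linearly into [0, 1]."""
    X = np.asarray(X, dtype=np.float64)
    lo = X.min(axis=0)
    hi = X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero for constant columns
    return (X - lo) / span

# Three samples (rows) with two features (columns) on very different scales.
features = np.array([[1.0, 10.0],
                     [2.0, 20.0],
                     [3.0, 30.0]])
normalized = minmax_normalize(features)
```

When new mammograms are tested, the minimum and maximum found on the training set would normally be reused so that training and test features share the same scale.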


Example of a network used


Testing

• A mammogram is presented to the trained network and the output is a suspicious area denoted by its center’s x and y coordinates and its radius. If the mammogram is considered to be normal then zeros are returned for the coordinates and radius.

• The radiologist can then review the original assessment of the patient if some areas flagged by the network were not originally looked at closely.

• The whole database is tested and the accuracy is calculated.

• The smaller dataset performed better than the larger one, and d=3 gave better results than d=1.


Results

• There were 2 training datasets: 163 and 212

• There were 2 distance measures: 1 and 3

• There were 3 spreads: 0.1, 0.25, and 0.05

• There were 3 goals: 0.00003, 0.008, 0.00005.

• For 12 possible combinations.

• The NN was sensitive to the unbalanced data collection that contained about a 70%-30% split in the larger training set. Therefore the smaller dataset was preferred.

• Achieving a high recognition % is not that appealing if the TPF is small


Representative Preliminary Results

| | Net 1 | Net 2 | Net 3 |
|---|---|---|---|
| TPF | 0.01639 | 0.72973 | 0.88043 |
| FPF | 0.5939 | 0.0 | 0.3478 |
| Recognition rate | 0.3323 | 0.9068 | 0.7174 |
| # of neurons | 133 | 91 | 151 |
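The three measures in the table can be computed from per-image decisions as sketched below, taking TPF as the true positive fraction TP/(TP+FN), FPF as the false positive fraction FP/(FP+TN), and the recognition rate as the fraction of correct decisions. These are the standard definitions and are assumed here; the function name is hypothetical, and the example labels are made up purely to show why a high recognition rate can hide a small TPF on unbalanced data.

```python
import numpy as np

def screening_metrics(y_true, y_pred):
    """TPF, FPF and recognition rate from binary labels (1 = abnormal)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    tn = int(np.sum(~y_true & ~y_pred))
    return {
        "TPF": tp / (tp + fn) if tp + fn else 0.0,
        "FPF": fp / (fp + tn) if fp + tn else 0.0,
        "recognition": (tp + tn) / y_true.size,
    }

# Made-up example: 3 abnormal and 7 normal cases, and a network that
# labels everything normal. Recognition is 0.7 although TPF is 0.
truth = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
always_normal = [0] * 10
metrics = screening_metrics(truth, always_normal)
```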


Future work/plans

• Use more features like standard deviation, skewness, kurtosis, ...

• Determine which feature(s) have the most impact:

* Rank the features from best to worst (single input to NN)

* Select the most significant feature(s) by using the leave-one-out method

• Determine whether the area is benign or malignant by adding the severity of the abnormality to the training.


Future work/plans (cont.)

• Try to reduce false negatives on the basis of region characteristics: size and differences in homogeneity and entropy.

• Use a larger database for training, since most commercial CAD systems use hundreds of thousands of mammograms, and try to recognize “foreign” samples.

• Increase the recognition rate, aiming to diagnose with 100% accuracy, since human lives are at stake. Reaching an 80% rate determines the credibility of a CAD system. This may or may not be reached when testing on foreign mammograms, but the attempt can yield valuable ideas on how to improve.


Future work/plans (cont.)

• Use segmentation of breast from its background as it may make the feature extraction more accurate.

• I may experiment with a multichannel wavelet transform and a Kalman-filtering NN, since the wavelet transform provides an efficient multiresolution representation. I may also experiment with a fuzzy neural CAD that uses a fuzzy detection algorithm with a sliding window. Comparing the results will be worth investigating.

• I will use some data mining techniques to unveil new patterns/relationships between the presented patients.