lesion detection and classification of mammogram based on adaptive threshold and discriminant...

Upload: sujay-pujari

Post on 28-Jul-2015


Page 1: LESION DETECTION AND CLASSIFICATION OF MAMMOGRAM BASED ON ADAPTIVE THRESHOLD AND DISCRIMINANT ANALYSIS

LESION DETECTION AND CLASSIFICATION OF

MAMMOGRAM BASED ON ADAPTIVE

THRESHOLD AND DISCRIMINANT ANALYSIS

A thesis submitted in partial fulfillment of the requirements for

The award of the degree of

M.Tech.

in

COMMUNICATION SYSTEMS

By

PUJARI SUJAY GIRISH

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING

NATIONAL INSTITUTE OF TECHNOLOGY

TIRUCHIRAPPALLI – 620 015.

DECEMBER 2010

BONAFIDE CERTIFICATE

This is to certify that the project titled “LESION DETECTION AND

CLASSIFICATION OF MAMMOGRAM BASED ON ADAPTIVE

THRESHOLD AND DISCRIMINANT ANALYSIS” is a bonafide record of the

work done by

PUJARI SUJAY GIRISH (208109013)

in partial fulfillment of the requirements for the award of the degree of Master of

Technology in Communication Systems of the NATIONAL INSTITUTE OF

TECHNOLOGY, TIRUCHIRAPPALLI, during the year 2010-2011.

S. DEIVALAKSHMI

Guide Head of the Department

Project Viva-voce held on _____________________________

Internal Examiner External Examiner

ABSTRACT

Early detection of breast cancer increases both the survival rate and the available treatment options. One of the most powerful techniques for early detection of breast cancer is digital mammography. To detect breast cancer, the radiologist usually searches the mammograms visually for specific abnormalities. However, visual analysis of mammograms is a difficult task for radiologists. Computer-Aided Diagnosis (CADx) technology helps identify abnormalities and assists the radiologist in making the final decision.

The proposed CADx system involves three major steps: lesion detection, feature extraction and classification.

In lesion detection, the digital mammogram is first pre-processed, and the suspicious Region of Interest (ROI) is then detected using an adaptive threshold based on multiresolution analysis. Shape-based features are considered because they can be defined and measured using simple formulas. Thirteen such shape-based features are extracted from the ROI and from each of the four channels obtained by wavelet decomposition of the ROI. This gives five feature sets.

Malignant and benign masses are abnormal (tumour) cells present in the breast: malignant masses are cancerous tumours, whereas benign masses are non-cancerous. Each lesion is classified as benign or malignant using canonical discriminant analysis on each of the five feature sets, and the resulting classification rates are compared. In short, five classification schemes are discussed.

The proposed method can allow the radiologist to focus rapidly on the relevant parts of the mammogram, and it can increase the effectiveness and efficiency of radiology clinics.

Keywords: adaptive threshold based on multiresolution analysis, CADx, discriminant analysis, shape features

ACKNOWLEDGEMENTS

I take this opportunity to express my sincere thanks and deep sense of gratitude to my project guide Ms S. Deivalakshmi, Assistant Professor, Department of Electronics and Communication Engineering, National Institute of Technology, Tiruchirappalli, for her guidance, timely suggestions, moral support, constant encouragement and kind co-operation.

My sincere and heartfelt thanks go to Dr. B. Venkataramani, Professor & Head of the Department, Electronics and Communication Engineering, National Institute of Technology, Tiruchirappalli, for his indispensable help throughout the course of this project work.

Finally, I would like to thank all the teaching staff, my classmates and the computer support group staff for their sincere help, without which I would not have been able to complete this project.

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ABBREVIATIONS

CHAPTER 1 INTRODUCTION
1.1 Motivation
1.2 Objectives and Approach
1.3 Study Outline

CHAPTER 2 LITERATURE REVIEW

CHAPTER 3 LESION DETECTION
3.1 Pre-Processing
3.2 Lesion Detection
3.3 Region selection

CHAPTER 4 FEATURE EXTRACTIONS & CLASSIFICATION
4.1 Feature extraction
4.2 Feature classification

CHAPTER 5 RESULTS AND DISCUSSION
5.1 Lesion detection
5.2 Feature extraction and classification

CHAPTER 6 CADx USER INTERFACE

CHAPTER 7 CONCLUSION & FURTHER WORK

REFERENCES

LIST OF FIGURES

Figure No Title

1.1 Structure of Breast
1.2 CC view and MLO view
1.3 Two basic views of mammographic image: (a) CC view, (b) MLO view
3.1 Main steps involved in the computer aided detection
3.2 The proposed method for lesion detection
3.3 Steps involved in Pre-processing
3.4 Images with salt and pepper noise and after passing through Median filter
3.5 Original image and BW1, BW2 (Binary versions with different Threshold)
3.6 Skin line of given mammogram
3.7 Skin Line and RMLO images
3.8 LMLO and skin line images
3.9 Masking to remove background-Mask1 & after masking
3.10 Before removing rib & after removing rib
3.11 Rib portion (RIB Part), after removing Rib (Pre-processed mammogram) and Given mammogram
3.12 Preprocessed image and Segmented Portion (Lesion)
3.13 Block diagram for adapt. segmentation method adapted by Zhang
3.14 Dashed line indicates the PDF pI(x). Two solid lines indicate P(Cb)pb(x) and P(Ct)pt(x), respectively
3.15 Bayes threshold λ1 and the proposed candidate threshold λ3 are indicated
3.16 Implemented algorithm for Segmentation Part
3.17 Segmented Portion and Lesion Part (Selected Region)
4.1 Abstract Flow of Proposed method
4.2 One stage, 2D-DWT Decomposition with db4 family
4.3 Steps to determine CDF for all four feature sets extracted from respective channels
4.4 Steps to determine CDF for feature set 1 extracted from Lesion
4.5 Steps involved for classification of given Lesion by determining Discriminant score (N) - Validation process
5.1 (MDB063) First stage DWT decomposition of Pre-processed image with db2 family
5.2 Adaptively chosen Threshold for 27 malignant cases
5.3 Adaptively chosen Threshold for 38 benign cases
5.4 Adaptively chosen Threshold for 35 Normal cases
5.5 (MDB135) Original mammogram with its skin line
5.6 (MDB135) Derivative plot of histogram of LL1 channel of given mammogram-Threshold 1=206
5.7 (MDB135) Histogram of given mammogram-Threshold 2=190
5.8 (MDB135) Region selections
5.9 (MDB226) Original mammogram with its skin line
5.10 (MDB226) Derivative plot of histogram of LL1 channel of given mammogram-Threshold 1=224
5.11 (MDB226) Histogram of given mammogram-Threshold 2=189
5.12 (MDB226) Region selections
5.13 (MDB115) Original mammogram with its skin line
5.14 (MDB115) Derivative plot of histogram of LL1 channel of given mammogram-Threshold 1=231
5.15 (MDB115) Histogram of given mammogram-Threshold 2=204
5.16 (MDB115) Region selections
5.17 Lesion part detected from training dataset of 20 benign mammograms
5.18 Lesion part detected from training dataset of 20 malignant mammograms
5.19 Lesion and its one stage level-1 DWT decomposition for mdb002 using db4
5.20 Lesion and its one stage level-1 DWT decomposition for mdb028 using db4
5.21 CDF histogram plot for (a) Benign & (b) Malignant groups using feature set-1
5.22 Classification result for validation dataset (unknown mammograms) (N=1)
5.23 CDF histogram plot for (a) Benign & (b) Malignant groups using feature set-2
5.24 Classification result for validation dataset (unknown mammograms) (N=2)
5.25 CDF histogram plot for (a) Benign & (b) Malignant groups using feature set-3
5.26 Classification result for validation dataset (unknown mammograms) (N=3)
5.27 CDF histogram plot for (a) Benign & (b) Malignant groups using feature set-4
5.28 Classification result for validation dataset (unknown mammograms) (N=4)
5.29 CDF histogram plot for (a) Benign & (b) Malignant groups using feature set-5
5.30 Classification result for validation dataset (unknown mammograms) (N=5)
6.1 GUI interface for CADx (Trained)
7.1 Comparison chart for classification of Lesion using DA on Feature set 1 to 5

LIST OF TABLES

Table No Title

4.1 13 shape based features
5.1 mdb002 extracted features (Benign group)
5.2 mdb028 extracted features (Malignant group)
5.3 Unstandardized coefficients for Lesion
5.4 Classification Results (N=1)
5.5 Unstandardized coefficients for Lesion_Ca
5.6 Classification Results (N=2)
5.7 Unstandardized coefficients for Lesion_Chd
5.8 Classification Results (N=3)
5.9 Unstandardized coefficients for Lesion_Cvd
5.10 Classification Results (N=4)
5.11 Unstandardized coefficients for Lesion_Cdd
5.12 Classification Results (N=5)
5.13 Summaries of all results for classification

ABBREVIATIONS

CAD Computer-Aided Detection

CADx Computer-Aided Diagnosis

CC Craniocaudal

CDF Canonical Discriminant Function

DA Discriminant Analysis

DS Discriminant score

DWT Discrete Wavelet Transform

FNR False Negative Rate

FPR False Positive Rate

MIAS Mammographic Image Analysis Society

MLO Medio Lateral Oblique

PDF Probability Density Function

ROI Region of Interest

TNR True Negative Rate

TPR True Positive Rate

CHAPTER-1

INTRODUCTION

Breast cancer is the second leading cause of cancer death in women today (after lung cancer). An estimated 40,230 breast cancer deaths are expected in 2010. According to the National Cancer Institute, one out of eight women will develop breast cancer during her lifetime.

Breast cancer stages range from stage 0 (a very early form of cancer) to stage IV (advanced, metastatic breast cancer). Early-stage breast cancers are associated with higher survival rates than late-stage cancers.

The key to surviving breast cancer is early detection and treatment. According to the ACS, when breast cancer is confined to the breast, the five-year survival rate is almost 100%. Breast cancer screening has been shown to reduce breast cancer mortality. The high survival rates associated with early detection of breast cancer can be attributed to the use of mammography screening as well as a high level of awareness of the disease symptoms in the population.

Beginning in their early 20s, women should be told about the benefits and limitations of breast self-examination (BSE). For women in their 20s and 30s, it is recommended that clinical breast examination (CBE) be part of a periodic health examination, preferably at least every three years. Asymptomatic women aged 40 and over should continue to receive a clinical breast examination as part of a periodic health examination, preferably annually and prior to mammography, and should begin annual mammography at age 40.

1.1 Motivation

Mammography is a uniquely important type of medical imaging used to screen for breast cancer. All women at risk go through mammography screening procedures for early detection and diagnosis of tumours. Special x-ray machines developed exclusively for breast imaging are used to produce mammography films. These machines use very low doses of radiation and produce high-quality x-rays. A typical mammogram is an intensity x-ray image whose gray levels show the contrast inside the breast that characterizes normal tissue, calcifications and masses.

Each breast is imaged separately in craniocaudal (CC) view and mediolateral-oblique (MLO) view, shown in Fig. 1.3(a) and Fig. 1.3(b), respectively.

Fig. 1.1 Structure of Breast

Fig. 1.2 CC view & MLO view

Fig. 1.3 Two basic views of mammographic image: (a) CC view, (b) MLO view.

Computer-aided detection (CAD) and computer-aided diagnosis (CADx) systems can improve the results of mammography screening programs and decrease the number of false positive cases. CAD systems use computerized algorithms to identify suspicious ROIs. The motivation behind CAD systems is to reduce both the False Positive Rate (FPR) and the False Negative Rate (FNR). When used as intended, CAD would be expected to increase the number of mammograms interpreted as positive, to the extent that it points out abnormalities previously overlooked by the radiologist. On the other hand, the cost of missed or undetected abnormalities (false negatives, FNs) is very high.

1.2 Objectives and Approach

The ultimate aim of the CADx system is to help the radiologist in making

recommendations for patient management.

CAD systems consist primarily of the following processing stages: pre-processing, segmentation, feature extraction and classification. Pre-processing is performed to suppress noise, to enhance the mammogram and to remove the background region in the MLO view of the mammogram. Segmentation, which here amounts to lesion detection, is performed using an adaptive threshold technique. Shape features are then extracted from the detected ROIs of the respective mammograms and passed on for classification of the abnormality.

The training database consists of 40 mammograms, with 20 benign and 20 malignant cases. Five feature sets were extracted from the ROIs and provided as input to the classification stage using DA, and the classification results were compared. Once the canonical discriminant functions for all five feature sets are known, the analysis is performed on a database of 65 unknown mammograms as a validation process, and the algorithm for the CADx system is finalized based on the most significant feature set.

1.3 Study Outline

This thesis is organized as follows. Chapter 2 reviews the literature and background of breast cancer and CAD systems in mammography. The materials and methods used in this study are discussed in Chapters 3 and 4: Chapter 3 deals with lesion detection, and Chapter 4 covers feature extraction and classification of a given lesion. Chapter 5 provides the results and discussion, Chapter 6 presents the CADx user interface, and Chapter 7 concludes the thesis with future directions.

CHAPTER-2

LITERATURE REVIEW

In this chapter, important literature on CADx systems and their algorithms in mammography is reviewed, along with the literature required for the proposed CADx system.

William R. Klecka (1980), in “Discriminant Analysis”, presents a lucid and simple introduction to several related statistical procedures known as discriminant analysis. The text introduces the canonical discriminant function (CDF) of the variables used in discriminant analysis (DA). Professor Klecka derives the canonical discriminant function coefficients, provides a spatial interpretation of them, and discusses the interpretation of CDFs. He also presents a clear discussion of unstandardized and standardized coefficients.

The SPSS ver. 14 manual on algorithms, titled “Discriminant”, explains all the steps involved in classification based on CDF coefficients.

Ingrid Daubechies (1987) invented the first smooth orthogonal wavelets with compact support, now known as the db-N family. In her text “Ten Lectures on Wavelets” she explains the theory behind wavelets and gives a good tour of the field.

Olivier Rioul (1993) described multiresolution analysis and synthesis for discrete-time signals in “A Discrete-Time Multiresolution Theory”. The concepts of scale and resolution are first reviewed in discrete time. The resulting framework allows one to treat the discrete wavelet transform, octave-band perfect reconstruction filter banks, and pyramid transforms from a unified standpoint.

Xiao-Ping Zhang and Mita D. Desai (2001) suggested a general systematic method for the detection and segmentation of bright targets in “Segmentation of Bright Targets Using Wavelets and Adaptive Thresholding”. The method adaptively chooses thresholds to segment targets from the background by using a multiscale analysis of the image probability density function (PDF). A performance analysis based on a Gaussian distribution model is used to show that the obtained adaptive threshold is often close to the Bayes threshold. The method has proven robust even when the image distribution is unknown. Examples are presented to demonstrate the efficiency of the technique on a variety of targets.

H.D. Cheng et al. (2003) surveyed the most important parts of CADx algorithms in “Computer-aided detection and classification of microcalcifications in mammograms: a survey”. In that paper they summarize and compare the methods used in the various stages of computer-aided detection (CAD) systems. In particular, the enhancement and segmentation algorithms, mammographic features, classifiers and their performances are studied and compared. Remaining challenges and future research directions are also discussed.

Gonzalez R. et al. (2004) give a detailed discussion of shape and margin features in Chapter 11 of the text “Digital Image Processing Using MATLAB”.

Alfonso Rojas Dominguez & Asoke K. Nandi (2008) presented a method for automatic detection of mammographic masses in “Detection of masses in mammograms via statistically based enhancement, multilevel-thresholding segmentation, and region selection”. As part of this method, an enhancement algorithm that improves image contrast based on local statistical measures of the mammograms is proposed. After enhancement, regions are segmented via thresholding at multiple levels, and a set of features is computed from each of the segmented regions. For feature extraction they used shape- and margin-based properties.

Jelena Bozek et al. (2009) surveyed algorithms in “A Survey of Image Processing Algorithms in Digital Mammography”. This chapter gives a survey of image processing algorithms that have been developed for detection of masses and calcifications. An overview of the algorithms in each step (segmentation, feature extraction, feature selection and classification) of the mass detection algorithms is given. Wavelet detection methods and other recently proposed methods for calcification detection are presented. An overview of contrast enhancement and noise equalization methods is given, as well as an overview of calcification classification algorithms.

B. Surendiran et al. (2009) performed discriminant analysis for classifying the masses present in mammograms in “Classifying Digital Mammogram Masses using Univariate ANOVA Discriminant Analysis”. This approach combines 19 shape properties of the mass regions and classifies the masses as benign or malignant using univariate ANOVA. The DDSM database, along with its ground truth details, is used for the experiments. According to that work, malignant and benign masses are abnormal (tumour) cells present in the breast; malignant masses are cancerous tumours, whereas benign masses are non-cancerous.

Kai Hu et al. (2010) proposed a novel algorithm for lesion detection in “Detection of Suspicious Lesions by Adaptive Thresholding Based on Multiresolution Analysis in Mammograms”. They combine two thresholding segmentations, a coarse segmentation and a fine segmentation, to segment suspicious lesions in multiscale images. The coarse segmentation is used first to obtain a rough representation of the location of suspicious lesions, and the fine segmentation then refines this rough representation to generate more precise segmentation results. The algorithm avoids the deficiencies of histogram-based and window-based thresholding algorithms and improves the segmentation accuracy effectively.

CHAPTER-3

LESION DETECTION

A CAD system consists of a few typical steps, depicted in Fig. 3.1. Screen-film mammographic images need to be digitized prior to image processing. This is one of the advantages of digital mammography, where the image can be processed directly.

Fig. 3.1 Main steps involved in the computer aided detection.

The first step in image processing is pre-processing. It has to be done on digitized images to reduce the noise and improve the quality of the image; most digital mammographic images are already of high quality. Another part of the pre-processing step is removing the background area and, if the image is an MLO view, removing the pectoral muscle from the breast area.

The segmentation step aims to find suspicious regions of interest (ROIs) containing abnormalities. In the feature extraction step, features are calculated from the characteristics of the ROI. A critical issue in algorithm design is the feature selection step, where the best set of features is selected for eliminating false positives and for classifying lesion types. Feature selection is defined as selecting a smaller feature subset that leads to the largest value of some classifier performance function.

[Fig. 3.1 blocks: Pre-processing -> Segmentation -> Feature extraction -> Feature selection -> Classification]

[Fig. 3.2 blocks: Pre-processing algorithms -> Adaptive Threshold segmentation algorithm]

Fig. 3.2 The proposed method for Lesion Detection

In this chapter we discuss all the algorithms used for lesion detection. After obtaining the segmented region, the user needs to decide whether to treat the mammogram as normal or to send it for further classification. The key point is that, for a normal mammogram, the image after lesion detection is either entirely black (all zeros) or contains only noise or a region from the background.

3.1 Pre-Processing

[Fig. 3.3 blocks: To remove noise -> Skin line detection -> To remove background -> To remove rib portion]

Fig. 3.3 Steps involved in Pre-processing

3.1.1 Median filter

The median filter is a non-linear filter that is efficient at removing salt-and-pepper noise. It tends to preserve the sharpness of image edges while removing noise. Noise is removed more effectively as the size of the window increases, but only at the expense of increased blurring of edges.

The median filter replaces the central value of an M-by-N neighbourhood with the median value of that neighbourhood.
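As an illustration of the neighbourhood median described above, the following is a minimal Python sketch, not the implementation used in this work. The function name `median_filter`, the square window and the border handling (only neighbours inside the image are used) are assumptions made for the example.

```python
from statistics import median

def median_filter(image, size=3):
    """Replace each pixel with the median of its size-by-size neighbourhood.

    Border pixels use only the neighbours that fall inside the image
    (an assumption; the text does not say how borders are handled).
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            out[r][c] = median(window)
    return out

# A flat patch with one "salt" pixel (255) and one "pepper" pixel (0).
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 0, 10],
    [10, 10, 10, 10],
]
cleaned = median_filter(noisy)
```

In this small example both impulse pixels are replaced by the surrounding grey level, which is the behaviour expected of a median filter on salt-and-pepper noise.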

Fig. 3.4 images with salt and pepper noise and after passing through Median filter

3.1.2 Skin Line detection

Fig. 3.5 original image and BW1,BW2 (Binary versions with different Threshold)

Algorithm:

Let I be the input image, with pixel value I(Xi,Yi) = Pi, and let BW1 and BW2 be two binary versions of I, with BW1(Xi,Yi) = BP1i and BW2(Xi,Yi) = BP2i.

Then,

BP1i = 1 for Pi > 3
BP1i = 0 for Pi <= 3

BP2i = 1 for Pi > 12
BP2i = 0 for Pi <= 12

Skin line = BW1 - BW2
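The two-threshold rule can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `skin_line` and its parameter names are hypothetical, while the thresholds 3 and 12 are the ones given in the algorithm.

```python
def skin_line(image, low=3, high=12):
    """Return BW1 - BW2, where BW1 = (pixel > low) and BW2 = (pixel > high).

    A pixel is marked 1 only when low < pixel <= high, i.e. in the thin
    band of faint grey levels around the breast boundary.
    """
    return [
        [int(p > low) - int(p > high) for p in row]
        for row in image
    ]

# One image row going from dark background into bright tissue.
row_profile = [[0, 2, 5, 12, 13, 200]]
band = skin_line(row_profile)
print(band)  # [[0, 0, 1, 1, 0, 0]]
```

Only the pixels with grey levels 5 and 12 fall between the two thresholds, so only they are kept as skin-line pixels.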

Fig 3.6 Skin line of given mammogram

3.1.3 MLO Type: Left/Right?

After detecting the skin line, it is necessary to detect the type of MLO view: left-sided (LMLO) or right-sided (RMLO).

Step 1:

The input image undergoes both the RMLO test and the LMLO test, giving two images,

I1 = ‘RMLO’

I2 = ‘LMLO’

RMLO Test: Algorithm:

Step 1: Start with row 1.

Step 2: Scan from the leftmost column (column 1) towards the right.

Step 3: If the pixel is black, replace it with white, move to the next pixel and repeat Step 3; when the pixel is white, go to Step 4.

Step 4: Move to the next pixel. If it is white, repeat Step 4; else go to Step 5.

Step 5: Replace the current pixel with black and move to the next pixel. If the last column is exceeded, go to Step 6; else repeat Step 5.

Step 6: Repeat Steps 2 to 5 for the next row until all rows have been processed.

The image formed is “RMLO”.

Fig.3.7 Skin Line and RMLO images

LMLO Test: Algorithm:

Step 1: Start with row 1.

Step 2: Scan from the rightmost column towards the left (here "next pixel" means the pixel to the left).

Step 3: If the pixel is black, replace it with white, move to the next pixel and repeat Step 3; when the pixel is white, go to Step 4.

Step 4: Move to the next pixel. If it is white, repeat Step 4; else go to Step 5.

Step 5: Replace the current pixel with black and move to the next pixel. If the last column of the scan (column 1) is exceeded, go to Step 6; else repeat Step 5.

Step 6: Repeat Steps 2 to 5 for the next row until all rows have been processed.

The image formed is “LMLO”.

Fig.3.8 LMLO and skin line images

Step 2:

Mean_right = mean(RMLO) % right side view

Mean_Left = mean(LMLO) % left side view

If Mean_Left > Mean_right, the given mammogram is LMLO: Mask1 = LMLO, View = Left.

Else, the given mammogram is RMLO: Mask1 = RMLO, View = Right.
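The two row scans and the mean comparison can be sketched as follows. This is a hedged Python sketch rather than the code used in this work: the names `scan_mask`, `mean` and `detect_view` are hypothetical, the skin-line image is assumed to be a list of rows with 1 for white and 0 for black, and a row that contains no white pixel is assumed to become entirely white.

```python
def scan_mask(skin, from_left=True):
    """Row-wise scan of a binary skin-line image (1 = white, 0 = black).

    Pixels met before the skin line become white, the skin line stays
    white, and everything after it becomes black.
    """
    mask = []
    for row in skin:
        pixels = row if from_left else row[::-1]
        out, state = [], "before"
        for p in pixels:
            if state == "before" and p == 1:
                state = "on_line"      # reached the skin line
            elif state == "on_line" and p == 0:
                state = "after"        # left the skin line
            out.append(0 if state == "after" else 1)
        mask.append(out if from_left else out[::-1])
    return mask

def mean(mask):
    values = [p for row in mask for p in row]
    return sum(values) / len(values)

def detect_view(skin):
    rmlo = scan_mask(skin, from_left=True)    # RMLO test
    lmlo = scan_mask(skin, from_left=False)   # LMLO test
    if mean(lmlo) > mean(rmlo):
        return "Left", lmlo
    return "Right", rmlo

# Toy skin-line image: the white band lies towards the right-hand side.
skin = [
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0, 0],
]
view, mask1 = detect_view(skin)
```

For this toy image the left-to-right scan produces the mask with the larger mean, so the view is reported as Right and Mask1 is the RMLO image.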

3.1.4 To remove background portion

To remove the background, apply Mask1, which is obtained once the type of MLO view is known. Multiply Mask1 element-wise with the original mammogram to obtain a background-free mammogram (Image2).
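The masking step is a plain element-wise product. The short Python sketch below illustrates it on a single row; the function name `apply_mask` is hypothetical and the images are assumed to be lists of rows.

```python
def apply_mask(image, mask):
    """Element-wise product of a grey-level image and a binary mask."""
    return [
        [p * m for p, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]

mammogram = [[7, 120, 130, 9]]
mask = [[0, 1, 1, 0]]
image2 = apply_mask(mammogram, mask)
print(image2)  # [[0, 120, 130, 0]]
```

Pixels where the mask is 0 are set to zero, and pixels where the mask is 1 keep their original grey level.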

Fig 3.9 Masking to remove background-Mask1 & after masking mammogram (Image2)

3.1.5 To remove rib portion

Fig 3.10: After Local threshold for rib bone removing (Image 3) & after removing rib (Image4)

Algorithm:

Step 1

Apply a local threshold to Image2, i.e. convert it to binary with Threshold = 173 (Fig. 3.10: Image3). Do not consider the first 200 rows (this removes the label).

Step 2

Since the View (Left or Right) is known, scan each row from the right or from the left, respectively. Travel up to the first non-zero pixel; we call this position the POLE.

Step 3

Now travel up to the first zero pixel and replace all travelled pixels with zero.

Step 4

Perform Steps 2 and 3 for all rows, subject to the following rules:

a) For every row, the POLE should not exceed the number of pixels travelled in the previous row.

b) If rule (a) is violated 5 consecutive times, then, keeping a 45° line in mind, decrease the POLE position for the next row by 1 and replace all pixels up to the calculated POLE with zero.

After performing these steps you will get the image shown in Fig. 3.10 (Image4).

Step 5

RIB Part =Image3-Image4;

Step 6

Pre-processed image=Image2-RIB Part
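The core of this procedure can be sketched in Python as below. This is a simplified sketch under stated assumptions, not the implementation used in this work: it covers Step 1 (threshold 173), Steps 2 and 3 for each row, and Steps 5 and 6, but it omits the 200-row exclusion and rules (a) and (b) of Step 4. The names `clear_first_run` and `subtract` are hypothetical, and Step 6 is interpreted as zeroing the rib pixels of Image2 (an assumption, since the RIB Part is a binary image).

```python
def clear_first_run(binary_row, from_right):
    """Steps 2-3 for one row: travel to the first non-zero pixel (the POLE)
    from the scan side, then zero every pixel up to the next zero pixel."""
    row = binary_row[::-1] if from_right else list(binary_row)
    i = 0
    while i < len(row) and row[i] == 0:   # Step 2: travel to the POLE
        i += 1
    while i < len(row) and row[i] != 0:   # Step 3: zero the bright run
        row[i] = 0
        i += 1
    return row[::-1] if from_right else row

def subtract(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Toy background-free image (Image2); bright rib region on the left edge.
image2 = [
    [200, 210, 90, 180, 60],
    [190, 80, 70, 175, 50],
]
image3 = [[1 if p > 173 else 0 for p in row] for row in image2]    # Step 1
image4 = [clear_first_run(row, from_right=False) for row in image3]
rib_part = subtract(image3, image4)                                # Step 5
# Step 6, read here as removing (zeroing) the rib pixels from Image2.
preprocessed = [
    [p * (1 - r) for p, r in zip(img_row, rib_row)]
    for img_row, rib_row in zip(image2, rib_part)
]
```

In the toy example the bright run touching the left edge is removed, while the isolated bright pixel further inside the breast (grey levels 180 and 175) is kept for the later lesion detection stage.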

Fig. 3.11 Rib portion (RIB Part) , after removing Rib (Pre-processed mammogram) and Given

mammogram

3.2 Lesion Detection

Fig. 3.12 Preprocessed image and Segmented Portion (Lesion)

3.2.1 Segmentation

The aim of segmentation is to extract ROIs containing all masses and to locate the suspicious mass candidates within the ROI. Segmentation of the suspicious regions in a mammographic image is designed to have very high sensitivity; a large number of false positives is acceptable, since they are expected to be removed in a later stage of the algorithm. Researchers have used several segmentation techniques and their combinations:

1. Thresholding Techniques

2. Region-Based Techniques

3. Edge Detection Techniques

4. Hybrid Techniques

Thresholding techniques include global thresholding, local thresholding, local adaptive thresholding and adaptive thresholding based on multiresolution analysis. Local thresholding is slightly better than global thresholding, adaptive thresholding is better than both, and adaptive thresholding based on multiresolution analysis is superior to the other methods.

According to Zhang and Desai (2001), after the mammograms are wavelet transformed, the gray-level distributions of the target and background regions of the images approach a Gaussian distribution.

Fig. 3.13 Block diagram of the adaptive segmentation method used by Zhang and Desai

The segmentation of possible targets can be modelled as the following classification problem. For an ideal image I(m,n), the pixels belong to two classes: 1) the background Cb and 2) the target Ct.

Here,

pb(x): PDF of class Cb

P(Cb): a priori probability of class Cb in image I

pt(x): PDF of class Ct

P(Ct): a priori probability of class Ct in image I

[Fig. 3.13 blocks: Input the digitized image -> Pre-processing -> Perform wavelet transform of image -> Select proper scaling channel -> Select adaptive thresholds by looking for local minima of the wavelet transformed images at different channels -> Using adaptive threshold get segmented image]

Fig. 3.14 The dashed line indicates the PDF pI(x). The two solid lines indicate P(Cb)pb(x) and P(Ct)pt(x), respectively. The Bayes threshold λ1 and the proposed threshold λ2 are indicated.

Assuming that fb(x) = P(Cb)pb(x) and ft(x) = P(Ct)pt(x) have one point of intersection, as illustrated in the figure above, the Bayes classifier is equivalent to the following threshold detection criterion:

$$I(m,n) < \lambda:\quad I(m,n) \in C_b \quad \text{(pixel belongs to the background class)}$$

$$I(m,n) > \lambda:\quad I(m,n) \in C_t \quad \text{(pixel belongs to the target class)}$$

where λ satisfies

$$f_b(\lambda) = f_t(\lambda)$$

The segmented image at scale j using the Bayes classifier can be expressed as

$$I_{seg,j}(m,n) = \begin{cases} 1, & I_j(m,n) > \lambda \\ 0, & I_j(m,n) < \lambda \end{cases}$$

where I_j(m,n) denotes the image at scale j.

Wavelet transforms are used in the method of Zhang and Desai, and the Bayes classifier is employed for the segmentation problem. An approach for choosing the threshold adaptively by looking for the global local minima of the PDFs of wavelet transformed images is proposed. Based on the assumption of Gaussian distributions, the adaptive threshold obtained by this method is compared with the Bayes threshold. It is shown that, in general practical cases, the performance of the proposed threshold is often very close to that of the Bayes threshold, which is the optimal threshold from the statistical point of view.
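As a minimal sketch of the Bayes threshold discussed above, the following Python fragment (Python is used here for illustration; the thesis implementation itself is in MATLAB) finds the point where the two weighted Gaussian densities are equal and applies it as a threshold. The function names, the bisection search and the assumption that the target class is the brighter one (background mean below target mean) are illustrative choices, not taken from Zhang and Desai.

```python
import math

def gaussian_pdf(x, mean, sigma):
    """Value of a Gaussian PDF with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def bayes_threshold(p_b, mu_b, sigma_b, p_t, mu_t, sigma_t, tol=1e-9):
    """Find lambda between the class means where P(Cb)pb = P(Ct)pt (bisection).

    Assumes mu_b < mu_t and a single crossing between the two means.
    """
    def diff(x):
        return p_b * gaussian_pdf(x, mu_b, sigma_b) - p_t * gaussian_pdf(x, mu_t, sigma_t)

    lo, hi = mu_b, mu_t
    if diff(lo) <= 0 or diff(hi) >= 0:
        raise ValueError("no single crossing between the class means")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(mid) > 0:   # background still dominates: move right
            lo = mid
        else:               # target dominates: move left
            hi = mid
    return 0.5 * (lo + hi)

def segment(image, lam):
    """Binary segmentation: 1 where the pixel exceeds the threshold, else 0."""
    return [[1 if pixel > lam else 0 for pixel in row] for row in image]
```

With equal priors and equal variances the threshold falls midway between the two means, which is a quick sanity check on the sketch.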


Fig. 3.15 Bayes threshold λ1 and the proposed candidate threshold λ3 are indicated.

According to Kai Hu et al. (2010), Zhang and Desai proved that, when the overlap between pb(x) and pt(x) is not significant, λ2 is often close to λ1. Hence, it is reasonable to carry out segmentation according to λ2. However, when pb(x) and pt(x) are not ideal and the overlap between them is large, the algorithm of Zhang and Desai no longer works.

For example, in the case shown in the top image of Fig. 3.15, λ2 cannot be determined by selecting a local minimum, because there is no minimum at λ2 to the right of μ1, the location of the global maximum of pI(x). In this case, the threshold is selected from the derivative of the PDF curve pI(x). The top image of Fig. 3.15 shows the PDF curve pI(x), the bottom image shows the absolute value of its derivative, |p'I(x)|, and λ3 is a local minimum of |p'I(x)|. Therefore, the segmentation result can be obtained effectively in this case by using the threshold λ3.


3.3 Implemented algorithm

Fig. 3.16 Implemented algorithm for Segmentation Part

3.3 Region selection

From the segmented (detected) portion of the mammogram, the region with the largest area is selected for the purpose of image extraction. To do this, the binary version of the image is scanned on the basis of connectivity, the number of connected components is determined, and a label is assigned to each component.

A matrix holding the label of each component at the corresponding pixel positions is then formed. Pixels carrying the label with the maximum number of pixels are replaced by ones and all other labelled pixels by zeros; using the resulting matrix as a mask, the lesion part can be fixed.
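The region selection step can be sketched as follows. This is an illustrative Python version, not the MATLAB code of the thesis; the function names are hypothetical and 8-connectivity is an assumption, since the text does not state which connectivity was used.

```python
from collections import deque

def label_components(binary):
    """Label connected components (8-connectivity assumed) of a 0/1 image.

    Returns the label matrix and a dict mapping each label to its pixel count.
    """
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    sizes = {}
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and labels[r][c] == 0:
                next_label += 1
                labels[r][c] = next_label
                queue = deque([(r, c)])
                count = 0
                while queue:  # breadth-first flood fill of one component
                    y, x = queue.popleft()
                    count += 1
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and binary[ny][nx] and labels[ny][nx] == 0):
                                labels[ny][nx] = next_label
                                queue.append((ny, nx))
                sizes[next_label] = count
    return labels, sizes

def largest_region_mask(binary):
    """Mask with ones on the largest component and zeros elsewhere."""
    labels, sizes = label_components(binary)
    if not sizes:
        return [[0] * len(row) for row in binary]
    biggest = max(sizes, key=sizes.get)
    return [[1 if v == biggest else 0 for v in row] for row in labels]
```

Multiplying the pre-processed image element-wise by this mask would then isolate the lesion candidate.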

[Fig. 3.16 steps: pre-processed mammogram → level-1 DWT (db2 family), LL channel → histogram → smoothing (moving average) → second derivative → threshold = first zero crossing from the right side → global thresholding using the given threshold → segmented portion]
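The threshold selection steps of the implemented segmentation algorithm (Fig. 3.16) can be sketched as below. This is a hedged Python illustration, not the thesis code: it assumes the LL channel has already been computed and rescaled to 0..255, and the window length of the moving average, the central-difference second derivative and all function names are assumptions.

```python
def histogram(values, bins=256):
    """Count occurrences of integer gray levels, clamped to 0..bins-1."""
    counts = [0] * bins
    for v in values:
        counts[min(bins - 1, max(0, int(v)))] += 1
    return counts

def moving_average(data, window=5):
    """Centered moving average; the window shrinks at the two ends."""
    half = window // 2
    out = []
    for i in range(len(data)):
        lo, hi = max(0, i - half), min(len(data), i + half + 1)
        out.append(sum(data[lo:hi]) / (hi - lo))
    return out

def second_derivative(data):
    """Central second difference; the two end points are left at zero."""
    d2 = [0.0] * len(data)
    for i in range(1, len(data) - 1):
        d2[i] = data[i + 1] - 2 * data[i] + data[i - 1]
    return d2

def first_zero_crossing_from_right(d2):
    """Index of the first sign change met when scanning from the right."""
    for i in range(len(d2) - 2, 0, -1):
        if d2[i] * d2[i + 1] < 0:
            return i
    return None

def adaptive_threshold(ll_values, window=5):
    """Histogram -> smoothing -> second derivative -> zero crossing."""
    smooth = moving_average(histogram(ll_values), window)
    return first_zero_crossing_from_right(second_derivative(smooth))
```

The value returned corresponds to Threshold 1 in the terminology of Chapter 5; `None` signals that no sign change was found.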


Fig.3.17 Segmented Portion and Lesion Part (Selected Region)

Feature extraction and classification are then performed on the selected region (lesion). At this point the radiologist needs to decide whether to send it for feature extraction and classification to find the type of abnormality, i.e. benign or malignant. The next chapter deals with the remaining part of the proposed CADx algorithm.

CHAPTER-4

FEATURE EXTRACTION & CLASSIFICATION


After the lesion part has been selected as explained in the last chapter, this chapter explains the feature extraction and classification algorithms. In this work five different platforms are provided for classification of the lesion, and their classification rates are compared for both the known database (training with 40 sample mammograms) and the unknown database (validation with 65 sample mammograms). This chapter introduces those platforms; the results and their comparison are discussed in a later part of the thesis.

Fig 4.1 Abstract Flow of Proposed method

Fig. 4.1 shows the algorithm followed by all five classification schemes. In the first platform, shape-based feature extraction and classification using DA are performed on the lesion itself; in the other four platforms they are performed on each of the four level-1 DWT channels of the lesion. The respective features are called feature set (N), where N ranges from 1 to 5.

4.1 Feature Extraction

In this step, 13 features are extracted from the given input image containing the ROI. This set of 13 features is feature set (N), where N is determined by the input image applied.

[Fig. 4.1 flow: LESION → shape-based feature extraction algorithm → classification based on discriminant analysis → Benign or Malignant]

[Fig. 4.2 diagram labels: Lesion passed through filters h and g with downsampling by 2, giving Lesion_CA, Lesion_CHD and Lesion_CVD]

Fig. 4.2 One-stage 2-D DWT decomposition with the db4 family

The feature set index N is defined by the input image:

N=1 if input image = Lesion (feature set 1: feature extraction on Lesion)

N=2 if input image = Lesion_CA (feature set 2: feature extraction on Lesion_CA)

N=3 if input image = Lesion_CHD (feature set 3: feature extraction on Lesion_CHD)

N=4 if input image = Lesion_CVD (feature set 4: feature extraction on Lesion_CVD)

N=5 if input image = Lesion_CDD (feature set 5: feature extraction on Lesion_CDD)
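To make the four channels concrete, the sketch below performs a one-level 2-D DWT in plain Python. It is only an illustration under several assumptions: it uses the 4-tap db2 filter (whose coefficients have a simple closed form) rather than the db4 filter named in Fig. 4.2, it uses periodic extension at the borders, it requires even image dimensions, and the mapping of filter combinations to the names CA, CHD, CVD and CDD follows one common convention. Its output will therefore not match MATLAB's wavelet toolbox sample for sample.

```python
import math

S3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# db2 low-pass (scaling) filter; its coefficients sum to sqrt(2)
DB2_LOW = [(1 + S3) / NORM, (3 + S3) / NORM, (3 - S3) / NORM, (1 - S3) / NORM]
# Quadrature-mirror high-pass filter: g[k] = (-1)^k * h[L-1-k]
DB2_HIGH = [((-1) ** k) * DB2_LOW[len(DB2_LOW) - 1 - k] for k in range(len(DB2_LOW))]

def analyze_1d(signal, filt):
    """Filter and downsample by 2, with periodic extension."""
    n = len(signal)
    return [sum(filt[k] * signal[(2 * i + k) % n] for k in range(len(filt)))
            for i in range(n // 2)]

def analyze_rows(image, filt):
    return [analyze_1d(row, filt) for row in image]

def transpose(image):
    return [list(col) for col in zip(*image)]

def along_columns(image, filt):
    """Apply the 1-D analysis step down each column."""
    return transpose(analyze_rows(transpose(image), filt))

def dwt2_level1(image, low=DB2_LOW, high=DB2_HIGH):
    """Return (CA, CHD, CVD, CDD) for one decomposition level."""
    low_rows = analyze_rows(image, low)
    high_rows = analyze_rows(image, high)
    ca = along_columns(low_rows, low)     # approximation
    chd = along_columns(low_rows, high)   # horizontal detail
    cvd = along_columns(high_rows, low)   # vertical detail
    cdd = along_columns(high_rows, high)  # diagonal detail
    return ca, chd, cvd, cdd
```

For a constant image the three detail channels vanish and the approximation equals twice the constant, which is an easy check that the filters are consistent.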

Table 4.1: 13 shape-based features

1. Area (A) = total number of pixels in the ROI

2. Perimeter (P) = total number of pixels on the border of the ROI

3. Max radius (Rmax) = max(distance(centroid, border of ROI))

4. Min radius (Rmin) = min(distance(centroid, border of ROI))

5. Convex area = total number of pixels in the 'Convex Image' (a)

6. Euler number (Eno) = the number of objects in the region minus the number of holes in those objects

7. Eccentricity (Ect) = (distance between the foci / major axis length) of the ellipse (b)

8. Elongatedness (En) = Area / (2*Rmax)^2

9. Solidity = Area / Convex area

10. Circularity1 (C1) = (Area / (pi*Rmax^2))^(1/2) (c)

11. Dispersion (Dp) = Rmax / Area

12. Standard deviation of edge (Esd) = standard deviation of the pixel values on the border

13. Shape index (SI) = Perimeter / (2*Rmax)

a. Convex Image: binary image (logical); the convex hull, with all pixels within the hull filled in (i.e., set to on).

b. Eccentricity (Ect): the value is between 0 and 1. (0 and 1 are degenerate cases; an ellipse whose eccentricity is 0 is actually a circle, while an ellipse whose eccentricity is 1 is a line segment.)

c. Circularity1 is used to measure shape, reflecting the element's similarity to a circle, with a maximum value approximating 1.0 for a circle.

4.2 Feature classification

Feature classification is done using Discriminant Analysis (DA). In this method a canonical discriminant function is determined for the extracted features.

After extracting the features of feature set (N) for all training cases, i.e. mammograms with known abnormality (the ground truth), the unstandardized coefficients (N) and the group centroids (N) are determined. Using any platform, in other words any classification scheme (N: 1 to 5), the discriminant score (N) can then be determined for an unknown mammogram. Classification is done with a threshold rule obtained from the group centroids (N).

4.2.1 Steps for Canonical Discriminant Analysis:

For the n1 benign and n1 malignant cases in the known database (training cases), extracting P features from each case gives two feature matrices, A1 and A2, which are stacked into the total-group matrix C:

$$C = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = G$$

where

Dimension [A1] = (n1 x P), for the n1 benign cases

Dimension [A2] = (n1 x P), for the n1 malignant cases

Dimension [C] = (2*n1 x P), for the total group

M = mean of C, column-wise (feature-wise), (P x 1)

M1 = mean of A1, column-wise (feature-wise), (P x 1)

M2 = mean of A2, column-wise (feature-wise), (P x 1)

Within-groups sums of squares and cross-products matrix:

$$W = \{\mathrm{Cov}(A_1) + \mathrm{Cov}(A_2)\}\,(n_1 - 1)$$

Total sums of squares and cross-products matrix:

$$T = \mathrm{Cov}(C)\,(2 n_1 - 1)$$

$$B = T - W$$

Let $W = L L^{T}$ (Cholesky decomposition), with $U = L^{T}$. Then form

$$A = L^{-1} B\, U^{-1}$$

If X is the eigenvector corresponding to the most prominent eigenvalue of A, then

$$V = U^{-1} X$$

Unstandardized coefficients, (P x 1):

$$D = V \sqrt{2 n_1 - 1}$$

Constant:

$$D_0 = -D^{T} M$$

Group 1 centroid (benign group):

$$C_{BG} = D_0 + D^{T} M_1$$

Group 2 centroid (malignant group):

$$C_{MG} = D_0 + D^{T} M_2$$

Threshold:

$$Th_d = \tfrac{1}{2}\,(C_{BG} + C_{MG})$$

The canonical discriminant function can now be determined as

$$f = D_0 + X D$$

where X here is the (1 x P) feature vector of the given mammogram.
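The training steps above can be sketched in plain Python for the two-group case. This is a hedged illustration rather than the thesis or SPSS implementation: because B has rank one when there are only two groups, the dominant eigenvector is proportional to the solution of W y = (M1 - M2), so the sketch solves that linear system instead of performing the Cholesky and eigenvalue steps explicitly. The scale factor sqrt(2*n1 - 1) is taken as written in the text, equal group sizes are assumed, and all function names are hypothetical.

```python
import math

def column_means(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def scatter(rows, mean):
    """Sums of squares and cross products about the mean: Cov * (n - 1)."""
    p = len(mean)
    s = [[0.0] * p for _ in range(p)]
    for r in rows:
        dev = [r[j] - mean[j] for j in range(p)]
        for a in range(p):
            for b in range(p):
                s[a][b] += dev[a] * dev[b]
    return s

def solve(matrix, rhs):
    """Solve matrix * y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        y[i] = (aug[i][n] - sum(aug[i][j] * y[j] for j in range(i + 1, n))) / aug[i][i]
    return y

def train_cdf(a1, a2):
    """Return (D, D0, CBG, CMG, Thd) for benign rows a1 and malignant rows a2."""
    n1 = len(a1)  # equal group sizes assumed, as in the text
    m1, m2, m = column_means(a1), column_means(a2), column_means(a1 + a2)
    s1, s2 = scatter(a1, m1), scatter(a2, m2)
    p = len(m)
    w = [[s1[i][j] + s2[i][j] for j in range(p)] for i in range(p)]
    d = [m1[j] - m2[j] for j in range(p)]
    y = solve(w, d)                                  # direction W^-1 (M1 - M2)
    norm = math.sqrt(sum(d[j] * y[j] for j in range(p)))
    v = [yj / norm for yj in y]                      # normalized so V^T W V = 1
    coeffs = [vj * math.sqrt(2 * n1 - 1) for vj in v]
    d0 = -sum(coeffs[j] * m[j] for j in range(p))
    cbg = d0 + sum(coeffs[j] * m1[j] for j in range(p))
    cmg = d0 + sum(coeffs[j] * m2[j] for j in range(p))
    return coeffs, d0, cbg, cmg, 0.5 * (cbg + cmg)

def score(x, coeffs, d0):
    """Discriminant score f = D0 + X D for one feature vector."""
    return d0 + sum(c * xi for c, xi in zip(coeffs, x))
```

With equal group sizes the two centroids come out symmetric about zero, so the threshold is zero, which is consistent with the thresholds reported in Chapter 5.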

4.2.3 Classification based on CDF

Substituting the input feature vector Xinput into the obtained CDF gives finput; this value is the discriminant score (DS) of the given input feature vector.

If DS > Thd:

a) if Thd < CBG, the case is classified in the benign group

b) if Thd < CMG, the case is classified in the malignant group

If DS < Thd:

a) if Thd < CBG, the case is classified in the malignant group

b) if Thd < CMG, the case is classified in the benign group
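The threshold rule can be written compactly as the following Python sketch. The function name is hypothetical, and the handling of the tie DS = Thd (assigned here to the lower side) is an assumption, since the text does not specify it.

```python
def classify(ds, thd, cbg, cmg):
    """Assign a discriminant score to the group whose centroid lies on the same side of Thd."""
    benign_is_upper = cbg > thd   # otherwise the malignant centroid is above Thd
    if ds > thd:
        return "Benign" if benign_is_upper else "Malignant"
    # ds <= thd: the tie case is an assumption of this sketch
    return "Malignant" if benign_is_upper else "Benign"
```

For example, with centroids of +1 (benign) and -1 (malignant) and a threshold of 0, a score of -1.59 is assigned to the malignant group.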

[Fig. 4.3 flow, shown for Lesion_CA (feature set 2) and Lesion_CHD (feature set 3): known database (20 benign and 20 malignant cases) → extraction of 13 shape-based features → unstandardized coefficients → classification based on CDF]

Fig 4.3 Steps to determine CDF for all four feature sets extracted from the respective channels in stage 1 of the DWT decomposition of Lesion as shown in Fig. 4.2 (N = 2 to 5)

[Fig. 4.4 flow: training database (20 benign and 20 malignant cases) → Lesion → extraction of 13 shape-based features (feature set 1) → unstandardized coefficients (1) → classification based on CDF]

Fig 4.4 Steps to determine CDF for feature set 1 extracted from Lesion (N=1)

Fig 4.5 Steps involved for classification of given Lesion by determining Discriminant

score (N) - Validation process.

Figures 4.3 and 4.4 describe the algorithm followed during the training period to obtain the CDF and the threshold. Figure 4.5 shows the methodology applied to classify a mammogram once the training period is over, i.e. in the testing phase. Chapters 3 and 4 together thus propose a CADx system for lesion detection and classification in mammograms based on adaptive threshold and DA.

The next chapter presents the results and discussion of the implementation of the proposed CADx system.

[Fig. 4.5 flow: validation database → Lesion → feature set N extracted → unstandardized coefficients (N) → discriminant score (N) → threshold (N) → classification based on CDF → Benign or Malignant; N: 1 to 5]

CHAPTER-5

RESULTS AND DISCUSSION

As the proposed method was explained and implemented in two parts, this chapter is also organized in two parts: the first deals with the results of lesion detection and the second with the results obtained during classification.

The mammogram images used in this experiment were taken from the mini mammography database of MIAS (http://peipa.essex.ac.uk/ipa/pix/mias/). All images are held as 8-bit gray-level images with 256 different gray levels (0-255), physically in portable gray map (pgm) format, with a size of 1024 x 1024 pixels.


5.1 Lesion detection

For Part I, 100 mammograms were selected from the database: 35 normal, 38 benign and 27 malignant cases.

All mammograms are in MLO view, so pre-processing was necessary to remove background portions such as the label, the pectoral muscle or the rib portion.

Fig 5.1 (MDB063): First-stage DWT decomposition of the pre-processed image with the db2 family (panels: LH1, LL1, HH1, HL1)

The global adaptive threshold segmentation described in Chapter 3 is applied to the given pre-processed mammogram to obtain Threshold 1. Figure 5.1 shows the enhanced version of the first-stage DWT decomposition of the given pre-processed mammogram.

Fig 5.2: Adaptively chosen threshold for 27 malignant cases

Fig 5.3: Adaptively chosen threshold for 38 benign cases

Fig 5.4: Adaptively chosen threshold for 35 normal cases

Figures 5.2 to 5.4 show the thresholds obtained for the trial mammograms. They also indicate that the variance about the average value increases, and the smallest threshold decreases, as we move from malignant to benign and from benign to normal cases.

In the next part, three mammograms of type normal, benign and malignant, namely MDB135, MDB226 and MDB115, are used as examples of lesion detection.

To find Threshold 1, the LL1 image is enhanced in such a way that its minimum pixel value is zero and its maximum is 255 (Fig. 5.1). To relate that threshold back to the spatial domain of the given mammogram, Threshold 2 is calculated by scaling the result according to the maximum pixel value present in the image:

Threshold 2 = Threshold 1 * (maximum pixel value in pre-processed mammogram) / 255
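The two scaling operations can be checked with a few lines of Python. This is a small illustrative sketch with hypothetical function names; rounding Threshold 2 to the nearest integer is an assumption that is consistent with the three example cases reported in this chapter (maximum pixel values 235, 215 and 225).

```python
def stretch_to_255(values):
    """Linearly rescale values so the minimum maps to 0 and the maximum to 255."""
    lo, hi = min(values), max(values)
    return [(v - lo) * 255.0 / (hi - lo) for v in values]

def threshold2(threshold1, max_pixel):
    """Map a threshold found on the 0..255 stretched LL1 image back to the mammogram."""
    return round(threshold1 * max_pixel / 255)

# Example cases reported in this chapter: (Threshold 1, maximum pixel value)
cases = {"MDB135": (206, 235), "MDB226": (224, 215), "MDB115": (231, 225)}
results = {name: threshold2(t1, mx) for name, (t1, mx) in cases.items()}
```

Here `results` holds the Threshold 2 values for the normal, benign and malignant examples.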

5.1.1 Normal Case: MDB 135 (LMLO view)

Fig 5.5: (MDB135) original mammogram with its skin line

Fig. 5.6 (MDB135) Derivative plot of histogram of LL1 channel of given mammogram, Threshold 1 = 206

Threshold 2 = Threshold 1*235/255 = 206*235/255 = 190

Fig. 5.7 (MDB135) Histogram of given mammogram-Threshold 2=190

Fig 5.8 (MDB 135) Region selection

5.1.2 Benign case: MDB 226


Fig 5.9: (MDB226) original mammogram with its skin line

Fig. 5.10 (MDB226) Derivative plot of histogram of LL1 channel of given mammogram, Threshold 1 = 224

Threshold 2 = Threshold 1*215/255 = 224*215/255 = 189

Fig. 5.11 (MDB226) Histogram of given mammogram-Threshold 2= 189

Fig 5.12 (MDB 226) Region selection


5.1.3 Malignant case: MDB 115

Fig 5.13: (MDB115) original mammogram with its skin line

Fig. 5.14 (MDB115) Derivative plot of histogram of LL1 channel of given mammogram, Threshold 1 = 231

Threshold 2 = Threshold 1*225/255 = 231*225/255 = 204

Fig. 5.15 (MDB115) Histogram of given mammogram-Threshold 2=204

Fig. 5.16(MDB115) Region selection


5.2 Feature extraction and classification

As per the notation of the previous chapter, here P = 13 features and n1 = 20 cases. To perform DA, i.e. to evaluate the CDF for all five feature sets, a training dataset of 40 mammograms is used, from which the lesion part is detected first. The 20 benign and 20 malignant cases (detected lesion parts) are shown in Figures 5.17 and 5.18 respectively.

For the validation part, classification is done on the basis of the discriminant score obtained and the threshold corresponding to the platform (N = 1 to 5) chosen for feature extraction and classification. The validation database consists of the 38 benign and 27 malignant cases from the MIAS database that were previously used for the lesion detection algorithm. For the training part, the feature vectors corresponding to each feature set are tabulated for one case from each group in Table 5.1 and Table 5.2.

Sections 5.2.1 to 5.2.5 summarize the results obtained in the feature classification part for the respective platforms (N = 1 to 5). To evaluate feature set N, 13 features are extracted from the corresponding input image.

As an example, consider Figure 5.20, which shows MDB028, a mammogram of a malignant case:

Lesion is used as input for N=1;

Lesion_ca is used as input for N=2;

Lesion_chd is used as input for N=3;

Lesion_cvd is used as input for N=4;

Lesion_cdd is used as input for N=5.

This gives the feature vectors listed column-wise in Table 5.2.

After selecting N, i.e. the platform, the matrices A1 and A2 can be formed by the procedure explained in the previous chapter. Performing the steps described there yields the unstandardized coefficients and the constant, from which the group centroids and the canonical discriminant function are identified.


20 Benign cases

Fig. 5.17 Lesion part detected from training dataset of 20 Benign mammograms

20 Malignant cases


Fig 5.18 Lesion part detected from training dataset of 20 malignant mammograms


Fig 5.19 Lesion and its one stage level-1 DWT decomposition for mdb002 using db4.

Table 5.1: mdb002 extracted features (Benign group)

Feature      Feature set 1   Feature set 2   Feature set 3   Feature set 4   Feature set 5
Area         5599            2098            2098            2098            2098
Perimeter    719.74          279.14          279.4           279.14          279.14
rmin         13.11           6.19            3.81            1.23            15.08
rmax         105.37          51.37           20.22           2.02            50.44
convexarea   9026            2778            2788            2778            2788
eno          -10             0               0               0               0
ect          0.97            0.96            0.96            0.96            0.96
en           0.13            0.2             1.28            128.5           0.21
solidity     0.62            0.76            0.76            0.76            0.76
c1           0.4             0.5             1.28            12.79           0.51
dp           0.02            0.02            0.01            0               0.02
esd          111.96          159.17          35.07           41.41           26.58
si           3.42            2.72            6.9             69.08           2.77
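The derived features of Table 4.1 can be cross-checked against the primary measurements in Table 5.1. The short Python sketch below (function name hypothetical) recomputes elongatedness, solidity, circularity1, dispersion and shape index for feature set 1 of mdb002 from its area, perimeter, maximum radius and convex area.

```python
import math

def derived_shape_features(area, perimeter, r_max, convex_area):
    """Derived features 8-11 and 13 of Table 4.1 from the primary measurements."""
    return {
        "en": area / (2 * r_max) ** 2,                    # elongatedness
        "solidity": area / convex_area,
        "c1": math.sqrt(area / (math.pi * r_max ** 2)),   # circularity1
        "dp": r_max / area,                               # dispersion
        "si": perimeter / (2 * r_max),                    # shape index
    }

# Feature set 1 of mdb002 (Table 5.1)
mdb002 = derived_shape_features(area=5599, perimeter=719.74,
                                r_max=105.37, convex_area=9026)
rounded = {name: round(value, 2) for name, value in mdb002.items()}
```

Rounded to two decimals, the recomputed values agree with the corresponding entries of the first column of Table 5.1.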

Fig 5.20 Lesion and its one stage level-1 DWT decomposition for mdb028 using db4 (panels: Lesion_Ca, Lesion_Chd, Lesion_Cvd, Lesion_Cdd).

Table 5.2: mdb028 extracted features (Malignant group)

Feature      Feature set 1   Feature set 2   Feature set 3   Feature set 4   Feature set 5
Area         6153            1893            1893            1983            1893
Perimeter    336.78          171.88          171.8           171.88          171.88
Rmin         34.81           10.28           0.93            0.92            18.31
Rmax         53.97           34.3            3.31            12.27           26.75
Convex area  6590            1988            1988            1988            1988
Eno          1               1               1               1               1
Ect          0.54            0.53            0.53            0.53            0.53
En           0.53            0.4             43.22           3.14            0.66
Solidity     0.93            0.95            0.95            0.95            0.95
C1           0.82            0.72            7.42            2               0.92
Dp           0.01            0.02            0               0.01            0.01
Esd          97              21.2            29.88           31.35           135.24
Si           3.12            2.51            29.57           7.01            3.21

5.2.1 Feature Classification –Feature set 1

Table 5.3: Unstandardized coefficients for Lesion

Feature      Unstandardized coefficient
Area           0.000136053
Perimeter      0.002093029
rmin           0.032719388
rmax          -0.006014262
convexarea    -0.000136239
eno            0.048984735
ect            0.229593731
en             1.948941623
solidity       6.236294906
c1           -18.67941482
dp            50.23437946
esd           -0.074649649
si             0.574652977
constant      10.25357206

Threshold = 0

Benign Group Centroid = 1 & Malignant Group Centroid = -1

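The discriminant score is calculated from the unstandardized coefficients and the extracted feature set. For MDB028, feature set 1 (Table 5.2, first column) and the coefficients of Table 5.3 give

DS = 10.25357 + 0.00013605*6153 + 0.002093*336.78 + 0.032719*34.81 - 0.006014*53.97 - 0.00013624*6590 + 0.048985*1 + 0.229594*0.54 + 1.948942*0.53 + 6.236295*0.93 - 18.679415*0.82 + 50.234379*0.01 - 0.074650*97 + 0.574653*3.12

With the two-decimal feature values of Table 5.2 this expression evaluates to approximately -1.545; the reported value, presumably computed from the unrounded features, is

DS = -1.58861251052099

Since DS < Threshold and the malignant group centroid is also below the threshold, the score falls on the malignant side. The given mammogram is therefore classified in the malignant group, which agrees with the ground truth.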
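The discriminant score for this example can be recomputed from the tabulated values, as in the Python sketch below. It uses the coefficients of Table 5.3 and the two-decimal features of Table 5.2, so the result differs slightly from the reported score, presumably because the reported score was obtained from unrounded features; the variable and function names are illustrative.

```python
# Unstandardized coefficients for feature set 1 (Table 5.3)
COEFFS = {
    "area": 0.000136053, "perimeter": 0.002093029, "rmin": 0.032719388,
    "rmax": -0.006014262, "convexarea": -0.000136239, "eno": 0.048984735,
    "ect": 0.229593731, "en": 1.948941623, "solidity": 6.236294906,
    "c1": -18.67941482, "dp": 50.23437946, "esd": -0.074649649,
    "si": 0.574652977,
}
CONSTANT = 10.25357206

# Two-decimal features of mdb028, feature set 1 (Table 5.2)
FEATURES = {
    "area": 6153, "perimeter": 336.78, "rmin": 34.81, "rmax": 53.97,
    "convexarea": 6590, "eno": 1, "ect": 0.54, "en": 0.53, "solidity": 0.93,
    "c1": 0.82, "dp": 0.01, "esd": 97, "si": 3.12,
}

def discriminant_score(features, coeffs, constant):
    """f = D0 + sum of coefficient * feature over all 13 features."""
    return constant + sum(coeffs[name] * features[name] for name in coeffs)

ds = discriminant_score(FEATURES, COEFFS, CONSTANT)
# Threshold 0 with the benign centroid at +1 and the malignant centroid at -1
label = "Malignant" if ds < 0 else "Benign"
```

The sign of the score, and therefore the malignant classification, is the same as in the text.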

Fig 5.21: CDF histogram plot for a) Benign & b) Malignant groups using feature set-1

Table 5.4: Classification Results (b,c) (N=1)

                                 Predicted Group Membership
                          type   .00      1.00     Total
Original            Count .00    16       4        20
                          1.00   4        16       20
                    %     .00    80.0     20.0     100.0
                          1.00   20.0     80.0     100.0
Cross-validated(a)  Count .00    14       6        20
                          1.00   6        14       20
                    %     .00    70.0     30.0     100.0
                          1.00   30.0     70.0     100.0

a. Cross validation is done only for those cases in the analysis. In cross validation, each case is classified by the functions derived from all cases other than that case.

b. 80.0% of original grouped cases correctly classified.

c. 70.0% of cross-validated grouped cases correctly classified.

Classification rate (Validation rate) = 63.079%


Fig 5.22: Classification result for validation dataset (unknown mammograms) (N=1)

Fig. 5.22 shows the classification result for the validation dataset of 65 mammograms, with 38 benign and 27 malignant cases. The plot shows the following function:

F = R - GT

where R is the result obtained after classification and GT is the ground truth of the abnormality. For both R and GT, 0 means benign and 1 means malignant.

The middle graph indicates TP+TN (true results, F = 0).

The left graph indicates FN (false negative, F = -1), i.e. malignant cases misinterpreted as benign.

The right graph indicates FP (false positive, F = 1), i.e. benign cases misinterpreted as malignant.

Classification rate = 100*(TP+TN)/(TP+TN+FP+FN) %
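The classification rate formula can be illustrated with the short Python sketch below. The function names are hypothetical; the first function works on per-case results coded as in the text (0 = benign, 1 = malignant), and the second on a 2 x 2 table of counts such as the "Original" and "Cross-validated" blocks of Table 5.4.

```python
def classification_rate(results, ground_truth):
    """Return (rate in %, FP count, FN count) using F = R - GT per case."""
    f = [r - gt for r, gt in zip(results, ground_truth)]
    correct = f.count(0)     # TP + TN
    fn = f.count(-1)         # malignant classified as benign
    fp = f.count(1)          # benign classified as malignant
    return 100.0 * correct / (correct + fp + fn), fp, fn

def rate_from_confusion(table):
    """Rate in % from [[benign->benign, benign->malignant], [malignant->benign, malignant->malignant]]."""
    correct = table[0][0] + table[1][1]
    total = sum(sum(row) for row in table)
    return 100.0 * correct / total

original_n1 = rate_from_confusion([[16, 4], [4, 16]])
cross_validated_n1 = rate_from_confusion([[14, 6], [6, 14]])
```

Applied to the counts of Table 5.4, the second function reproduces the 80.0% and 70.0% figures quoted for feature set 1.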

5.2.2 Feature Classification –Feature set 2

Table 5.5: Unstandardized coefficients for Lesion_Ca

Feature      Unstandardized coefficient
Area           -0.001296512
Perimeter      -0.007166919
Rmin            0.05612585
Rmax           -0.052294284
convexarea      0.001359857
Eno             0.126160304
Ect            -3.019447119
en             -2.244389496
solidity       14.81575929
c1             -5.089357262
dp           -148.6243504
esd             0.056719493
si              1.327386923
constant      -13.20729376

Threshold = 0

Benign Group Centroid = 1.08

Malignant Group Centroid = -1.08

Fig 5.23: CDF histogram plot for a) Benign & b) Malignant groups using feature set-2

Table 5.6: Classification Results (N=2)

                                 Predicted Group Membership
                          type   .00      1.00     Total
Original            Count .00    18       2        20
                          1.00   3        17       20
                    %     .00    90.0     10.0     100.0
                          1.00   15.0     85.0     100.0
Cross-validated(a)  Count .00    14       6        20
                          1.00   5        15       20
                    %     .00    70.0     30.0     100.0
                          1.00   25.0     75.0     100.0

87.5% of original grouped cases correctly classified.

72.5% of cross-validated grouped cases correctly classified.

Fig 5.24: Classification result for validation dataset (unknown mammograms) (N=2)

Classification rate (Validation rate) = 33.84%

5.2.3 Feature Classification –Feature set 3

Table 5.7: Unstandardized coefficients for Lesion_Chd

Feature      Unstandardized coefficient
Area            0.000513
Perimeter      -0.00761
rmin            0.075119
rmax           -0.01532
convexarea     -0.00028
eno             0.023116
ect             2.107331
en              0.009295
solidity      -12.1234
c1             -0.82893
dp            143.4587
esd             0.029166
si              0.129811
constant        8.274857

Threshold = 0

Benign Group Centroid = 1.8

Malignant Group Centroid = -1.8

Table 5.8: Classification Results (N=3)

                                 Predicted Group Membership
                          type   .00      1.00     Total
Original            Count .00    19       1        20
                          1.00   0        20       20
                    %     .00    95.0     5.0      100.0
                          1.00   .0       100.0    100.0
Cross-validated(a)  Count .00    16       4        20
                          1.00   2        18       20
                    %     .00    80.0     20.0     100.0
                          1.00   10.0     90.0     100.0

97.5% of original grouped cases correctly classified.

85.0% of cross-validated grouped cases correctly classified.

Fig 5.25: CDF histogram plot for a) Benign & b) Malignant groups using feature set-3

Fig 5.26: Classification result for validation dataset (unknown mammograms) (N=3)

Classification rate (Validation rate) = 67.69%

5.2.4 Feature Classification –Feature set 4

Table 5.9: Unstandardized coefficients for Lesion_Cvd

Feature      Unstandardized coefficient
Area           -0.00019
Perimeter      -0.01045
rmin            0.094519
rmax           -0.02209
convexarea      0.000349
eno            -0.06987
ect             4.259276
en              0.007625
solidity       -3.66475
c1             -0.79561
dp            -85.9958
esd            -0.02715
si              0.14592
constant        3.963368

Threshold = 0

Benign Group Centroid = 0.96

Malignant Group Centroid = -0.96

Table 5.10: Classification Results (N=4)

                                 Predicted Group Membership
                          type   .00      1.00     Total
Original            Count .00    19       1        20
                          1.00   6        14       20
                    %     .00    95.0     5.0      100.0
                          1.00   30.0     70.0     100.0
Cross-validated     Count .00    12       8        20
                          1.00   8        12       20
                    %     .00    60.0     40.0     100.0
                          1.00   40.0     60.0     100.0

82.5% of original grouped cases correctly classified.

60.0% of cross-validated grouped cases correctly classified.

Fig 5.27: CDF histogram plot for a) Benign & b) Malignant groups using feature set-4

Fig 5.28: Classification result for validation dataset (unknown mammograms) (N=4)

Classification rate (Validation rate) = 69.23%

5.2.5 Feature Classification –Feature set 5

Table 5.11: Unstandardized coefficients for Lesion_Cdd

Feature      Unstandardized coefficient
Area           -0.000324559
Perimeter      -0.017567012
rmin            0.21035515
rmax           -0.098717383
convexarea      0.000721538
eno            -0.674295328
ect             4.619567598
en             -0.008232511
solidity      -18.20872834
c1              0.649064184
dp             64.58390762
esd            -0.065053371
si             -0.087046045
constant       16.34716167

Threshold = -1.78E-15

Benign Group Centroid = 1.44

Malignant Group Centroid = -1.44

Table 5.12: Classification Results (N=5)

                                 Predicted Group Membership
                          type   .00      1.00     Total
Original            Count .00    19       1        20
                          1.00   2        18       20
                    %     .00    95.0     5.0      100.0
                          1.00   10.0     90.0     100.0
Cross-validated(a)  Count .00    17       3        20
                          1.00   5        15       20
                    %     .00    85.0     15.0     100.0
                          1.00   25.0     75.0     100.0

92.5% of original grouped cases correctly classified.

80.0% of cross-validated grouped cases correctly classified.

Fig 5.29: CDF histogram plot for a) Benign & b) Malignant groups using feature set-5

Fig 5.30: Classification result for validation dataset (unknown mammograms) (N=5)

Classification rate (Validation rate) = 75.38%

Table 5.13: Summary of all classification results (%)

Feature set     Original grouped       Cross-validated grouped     Validation
                classification rate    case classification rate    classification rate
Feature set 1   80                     70                          63.08
Feature set 2   87.5                   72.5                        33.84
Feature set 3   97.5                   85                          67.69
Feature set 4   82.5                   60                          69.23
Feature set 5   92.5                   80                          75.38

CHAPTER 6

CADx USER INTERFACE

Once the training part is complete, the CDF functions and the respective thresholds can be used to classify the lesion part of a given mammogram as either benign or malignant.

Our CADx allows the radiologist to select any one of the classification schemes. A decision can also be reached by giving different weightage to the different schemes and drawing a final conclusion from them.

All front-end and back-end algorithms are implemented in the MATLAB environment.

Figure 6.1 shows the GUI of the developed CADx system.

MATLAB version 7 is used, together with the Image Processing, Wavelet and Signal Processing toolboxes.

In addition, the SPSS package version 14 is used for discriminant analysis. Its results match those of the MATLAB implementation, but the SPSS software was found to be faster and more user friendly.

Fig. 6.1 GUI for the trained CADx system (controls and panels: Select mammogram; Select classification scheme; Result: type of abnormality; Lesion).

CHAPTER-7

CONCLUSION AND FUTURE WORK

The proposed CADx system is implemented with the help of MATLAB toolboxes. Figure 7.1 compares the classification rates obtained by using DA on the respective feature sets. It is clearly visible that feature sets 3 and 5, i.e. the 13 shape-based features extracted from the horizontal and diagonal detail channels obtained after one-stage DWT decomposition of the lesion in the given mammogram using the db4 wavelet family, give better results than simple feature extraction on the lesion itself (feature set 1).

Fig. 7.1 Comparison chart for classification of Lesion using DA on Feature sets 1 to 5 (series: original grouped classification rate; cross-validated grouped case classification rate; validation classification rate)

As part of further work, one can explore different mother wavelets or wavelet families so as to improve the classification rate. Also, instead of a statistical method of classification, an ANN model can be developed.

REFERENCES

1. William R. Klecka (1980); Discriminant Analysis, Sage University Paper.

2. SPSS ver. 14 manual on algorithms, titled "Discriminant".

3. Ingrid Daubechies (1992); Ten Lectures on Wavelets, Society for Industrial and Applied Mathematics.

4. Olivier Rioul (1993); A Discrete-Time Multiresolution Theory, IEEE Trans. on Signal Processing, vol. 41, no. 8, pp. 2591-2605.

5. Xiao-Ping Zhang and Mita D. Desai (2001); Segmentation of Bright Targets Using Wavelets and Adaptive Thresholding, IEEE Transactions on Image Processing, vol. 10, no. 7, July 2001.

6. H.D. Cheng, Xiaopeng Cai, Xiaowei Chen, Liming Hu, Xueling Lou (2003); Computer-aided detection and classification of microcalcifications in mammograms: a survey, Pattern Recognition 36, pp. 2967-2991.

7. Gonzalez R. C., Woods R. E., Eddins S. L. (2004); Digital Image Processing Using MATLAB.

8. Alfonso Rojas Dominguez & Asoke K. Nandi (2008); Detection of masses in mammograms via statistically based enhancement, multilevel-thresholding segmentation, and region selection, Computerized Medical Imaging and Graphics 32, pp. 304-315.

9. Jelena Bozek, Mario Mustra, Kresimir Delac and Mislav Grgic (2009); A Survey of Image Processing Algorithms in Digital Mammography, Recent Advan. in Mult. Sig. Process. and Communication, SCI 231, pp. 631-657.

10. B. Surendiran et al. (2009); Classifying Digital Mammogram Masses using Univariate ANOVA Discriminant Analysis, Int. Conf. on Advances in Recent Technologies in Communication and Computing.

11. Kai Hu et.al (2010); Detection of Suspicious Lesions by Adaptive Thresholding Based

on Multiresolution Analysis in Mammograms, IEEE Trans. On Instrument and