main seminar

CANCER SUB-CLASSIFICATION USING ARTIFICIAL NEURAL NETWORKS

ABSTRACT: In this project we have attempted to design software that classifies a particular cancer into its subtype faster and more accurately than the conventional method, with minimal reliance on human intervention, knowledge or experience, with the aim of better prognosis and higher chances of survival for cancer patients.


WHAT IS CANCER?

CAUSES

DIAGNOSIS

PROGNOSIS

Oncogenes: Genes that promote cell growth and reproduction.

Tumour Suppressor Genes: Genes that inhibit cell division and survival.

Over-expression or under-expression of these genes can lead to cancer.

http://en.wikipedia.org/wiki/Oncogene

http://en.wikipedia.org/wiki/Tumor_suppressor_gene



DRAWBACKS OF CONVENTIONAL METHODS OF DETECTION AND DIAGNOSIS:

Medical tests: prone to false positives and false negatives.

Sub-classification accuracy depends largely on the doctor's knowledge and experience.

Time-consuming, which delays subtype identification and treatment.

These delays and errors lead to poor prognosis.

NEED FOR THE PROJECT

Faster sub-classification of cancer.

Minimal human intervention required for sub-classification.

Higher accuracy than traditional techniques.

Accurate classification leads to better prognosis and treatment of patients.

Future scope for building cancer-prevention techniques by studying genetic information.

PROJECT OVERVIEW:

1. DATA COLLECTION

2. PRE-PROCESSING

3. TRAINING ANN

4. TESTING

5. CANCER SUB-CLASSIFICATION

6. ACCURACY MET?

7. IF YES: STOP; ELSE KEEP TRAINING

WHAT IS MGE?

DNA microarray: A collection of microscopic DNA spots attached to a solid surface.

DNA microarrays are used to measure the expression levels of large numbers of genes simultaneously.

Core principle: hybridisation

http://en.wikipedia.org/wiki/File:Complementarity_(DNA).png

http://en.wikipedia.org/wiki/File:NA_hybrid.svg

The steps required in a microarray experiment:

http://en.wikipedia.org/wiki/File:Microarray_exp_horizontal.svg

MGE’S ROLE IN CANCER DIAGNOSIS & SUBCLASSIFICATION

Microarray technology is a newer approach that can automate the diagnostic process and improve on the accuracy of traditional diagnostic methods.

Examine the expression of thousands of genes at once => Test for genes with elevated expression => Predict cancer



LUNG CANCER

SQUAMOUS CELL CARCINOMA

ADENOCARCINOMA

LARGE CELL LUNG CANCER


BREAST CANCER

BASAL

APOCRINE



2. DATA PRE-PROCESSING


NEED FOR COMPRESSION

MGE data is huge.

Large input size -> large ANN -> long training time -> poor generalization.

A large input also requires a large amount of memory.


DATA COMPRESSION METHODS

DCT

DWT


DISCRETE COSINE TRANSFORM

DCT: A technique for converting a signal into elementary frequency components.

It separates the signal (an image, in its usual application) into parts, or spectral sub-bands, of differing intensity.
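A minimal sketch of DCT-based compression follows, using only the Python standard library. It assumes that compression means keeping the lowest-frequency coefficients and discarding the rest; the slides do not say which coefficients were retained, so `compress`, `reconstruct` and the `keep` parameter are illustrative names rather than the project's actual code.

```python
import math

def dct2(signal):
    """Unnormalised DCT-II of a real sequence."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(signal))
        for k in range(n)
    ]

def idct2(coeffs):
    """Inverse of dct2 (a scaled DCT-III)."""
    n = len(coeffs)
    return [
        (coeffs[0] / 2.0
         + sum(coeffs[k] * math.cos(math.pi * (i + 0.5) * k / n) for k in range(1, n)))
        * 2.0 / n
        for i in range(n)
    ]

def compress(signal, keep):
    """Assumption: keep only the first `keep` (lowest-frequency) coefficients."""
    return dct2(signal)[:keep]

def reconstruct(kept, n):
    """Zero-fill the discarded coefficients and invert the transform."""
    return idct2(list(kept) + [0.0] * (n - len(kept)))

# Hypothetical smooth profile standing in for one expression vector.
profile = [math.sin(i / 5.0) for i in range(32)]
features = compress(profile, 4)        # 32 values reduced to 4 network inputs
approximation = reconstruct(features, len(profile))
```

The shortened coefficient list, not the reconstruction, is what would be fed to the network; the reconstruction is shown only to make the loss of detail visible.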


DISCRETE WAVELET ANALYSIS

• Wavelet analysis reveals aspects of data that other signal analysis techniques miss, such as trends, discontinuities and self-similarity.

• It can compress or de-noise a signal without appreciable degradation.

• The wavelet transform is defined as the sum over all time of the signal multiplied by scaled, shifted versions of the mother wavelet function.
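To illustrate how a multi-level DWT shrinks the input, here is a sketch using the Haar wavelet, chosen only because it is the simplest; the wavelet family used in the project is not stated in these slides. Each level keeps the approximation coefficients and roughly halves the length. The function names and the odd-length padding rule are assumptions made for this example.

```python
import math

def haar_step(signal):
    """One Haar DWT level on an even-length sequence: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def haar_approximation(signal, levels):
    """Apply `levels` Haar steps, keeping only the approximation each time."""
    data = list(signal)
    for _ in range(levels):
        if len(data) % 2:
            data.append(data[-1])      # assumption: pad odd lengths with the last sample
        data, _detail = haar_step(data)
    return data

# A 7480-value input after four levels, as a length check.
reduced = haar_approximation([float(i % 17) for i in range(7480)], 4)
```

With this padding rule a 7480-value input comes out at 468 values after four levels. Input counts a few values higher than that would be consistent with a longer wavelet filter and boundary extension, but that is a guess.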


COMPRESSION LEVEL

COMPRESSION TECHNIQUE | LUNG CANCER (22283 inputs) | SKIN CANCER (7480 inputs) | BREAST CANCER (7480 inputs)
DCT | 90% (2283 inputs) | 89% (778 inputs) | 89% (778 inputs)
DWT | DWT-5, 96% (707 inputs) | DWT-4, 93% (472 inputs) | DWT-3, 87% (939 inputs); DWT-4, 93% (472 inputs)
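The compression percentages can be checked from the input counts. The short sketch below assumes that "compression level" means the fraction of inputs removed; `reduction_percent` is a name introduced here for illustration.

```python
def reduction_percent(original_inputs, compressed_inputs):
    """Percentage of inputs removed by compression."""
    return 100.0 * (1.0 - compressed_inputs / original_inputs)

# Input counts taken from the compression table.
cases = [
    ("lung, DWT-5", 22283, 707),
    ("skin, DWT-4", 7480, 472),
    ("breast, DWT-3", 7480, 939),
]
for name, before, after in cases:
    print(f"{name}: {reduction_percent(before, after):.1f}% fewer inputs")
```

Under that assumption the computed values agree with the quoted percentages to within a percentage point.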

3. NEURAL NETWORK: BUILDING, TRAINING AND TESTING


WHAT ARE NEURAL NETWORKS?

The development of neural network technology was motivated by the desire to mimic the human brain.

A neural network is a powerful data-modeling tool that is able to capture and represent complex input/output relationships.



PROPERTIES OF NEURAL NETWORKS:

Adaptive learning

Self-learning

Real time operation

Fault tolerant


NEURAL NETWORKS VS. CONVENTIONAL COMPUTERS

MCCULLOCH & PITTS MODEL:

TRANSFER FUNCTIONS:

LEARNING METHODS:

I. SUPERVISED LEARNING

II. UNSUPERVISED LEARNING

BUILDING NEURAL NETWORK

NEURAL NETWORK

Layer sizes: 472 (input), 230 (hidden), 2 (output)
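Reading the three numbers as input, hidden and output layer sizes, a forward pass through such a network can be sketched as follows. The sigmoid transfer function, the bias term on every neuron and the weight range are assumptions; the slides do not specify them, and the random input stands in for one compressed expression profile.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """One weight per input plus a trailing bias for each neuron."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(layer, inputs):
    """Weighted sum plus bias, then sigmoid, for every neuron in the layer."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron, inputs)) + neuron[-1])
        for neuron in layer
    ]

rng = random.Random(0)
hidden = make_layer(472, 230, rng)
output = make_layer(230, 2, rng)

sample = [rng.random() for _ in range(472)]       # hypothetical compressed profile
scores = forward(output, forward(hidden, sample))
predicted_subtype = scores.index(max(scores))     # index of the larger output
```

Even at this size the network has 109,252 adjustable weights and biases, which gives a sense of why the uncompressed inputs would be impractical.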

1. ERROR BACKPROPAGATION:

• Choose random weights for the network.

• Feed in the input and obtain the result for each neuron.

• Feed forward the output of each layer to the next layer.

• Calculate the error for each neuron, and backpropagate it.

• Update the weights.

• Repeat until the network converges on the target output.
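The steps above can be sketched on a toy problem. This example trains a small network with one hidden layer on the logical OR function; the sigmoid units, squared-error loss, learning rate and fixed epoch count (in place of a convergence test) are all assumptions made to keep the sketch short.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, n_hidden=3, lr=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Random initial weights; the last entry of each row is the bias.
    w_h = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_o = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    history = []
    for _ in range(epochs):
        sq_err = 0.0
        for x, target in samples:
            # Forward pass, layer by layer.
            h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + row[-1]) for row in w_h]
            y = sigmoid(sum(w * hi for w, hi in zip(w_o, h)) + w_o[-1])
            # Output error, then the error propagated back to each hidden neuron.
            delta_o = (y - target) * y * (1.0 - y)
            delta_h = [delta_o * w_o[j] * h[j] * (1.0 - h[j]) for j in range(n_hidden)]
            # Gradient-descent weight update.
            for j in range(n_hidden):
                w_o[j] -= lr * delta_o * h[j]
                for i in range(n_in):
                    w_h[j][i] -= lr * delta_h[j] * x[i]
                w_h[j][-1] -= lr * delta_h[j]
            w_o[-1] -= lr * delta_o
            sq_err += (y - target) ** 2
        history.append(sq_err / len(samples))   # mean squared error for this epoch
    return history

OR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
history = train(OR_SAMPLES)
```

Plotting `history` against the epoch index gives the kind of mean-squared-error curve used to judge whether training has converged.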

LOCATION OF LOCAL AND GLOBAL MINIMUM

[Figure: mean squared error versus epoch, with the local minimum and the global minimum marked.]

Algorithms:

1. Gradient Descent With Momentum Backpropagation

2. Gradient Descent With Adaptive Learning Rate Backpropagation

3. Gradient Descent With Momentum And Adaptive Learning Rate Backpropagation

4. Resilient Propagation
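As a rough illustration of how two of these update rules differ, the sketch below applies a momentum update and a resilient-propagation style update to a single weight on a simple quadratic error. The step-size factors (1.2 and 0.5), the limits and the quadratic itself are assumed values chosen for the example, not settings taken from the project.

```python
def gd_momentum(grad, w, lr=0.1, momentum=0.9, steps=500):
    """Gradient descent where each step also carries part of the previous step."""
    velocity = 0.0
    for _ in range(steps):
        velocity = momentum * velocity - lr * grad(w)
        w += velocity
    return w

def rprop(grad, w, steps=200, delta=0.1, eta_plus=1.2, eta_minus=0.5,
          delta_min=1e-6, delta_max=50.0):
    """Resilient propagation: only the sign of the gradient sets the direction."""
    prev_g = 0.0
    for _ in range(steps):
        g = grad(w)
        if g * prev_g > 0:                 # same sign as last time: grow the step
            delta = min(delta * eta_plus, delta_max)
        elif g * prev_g < 0:               # sign flipped: overshoot, shrink the step
            delta = max(delta * eta_minus, delta_min)
            g = 0.0                        # and skip this update
        if g > 0:
            w -= delta
        elif g < 0:
            w += delta
        prev_g = g
    return w

def quadratic_grad(w):
    """Gradient of the toy error (w - 3)^2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)

w_momentum = gd_momentum(quadratic_grad, 0.0)
w_rprop = rprop(quadratic_grad, 0.0)
```

Because the resilient update ignores the gradient's magnitude, it is not slowed by the very small gradients that sigmoid units produce when they saturate.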

3. K-MEANS CLUSTERING

kmeans partitions the observations in your data into k mutually exclusive clusters.

It returns a vector of indices indicating to which of the k clusters it has assigned each observation.

The result is a set of clusters that are as compact and well-separated as possible.
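The idea behind the partitioning can be sketched with the basic two-step iteration (Lloyd's algorithm). This is a simplified stand-in, not MATLAB's kmeans: it uses squared Euclidean distance, a fixed number of iterations and caller-supplied starting centroids, all of which are assumptions for the example.

```python
def kmeans(points, centroids, iterations=20):
    """Return (labels, centroids) after alternating assignment and update steps."""
    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Two hypothetical, well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centres = kmeans(points, centroids=[(0, 0), (10, 10)])
```

The returned `labels` list plays the role of the index vector described above, and `centres` holds one centroid per cluster.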

CODE: kmeans

X = [p1; p2; p3; …p15];          % input data
k = 3;                            % number of clusters
[cidx, ctrs] = kmeans(X, k);      % cidx: cluster index of each sample; ctrs: cluster centroids

figure
for c = 1:3
    subplot(3, 1, c);
    plot(X((cidx == c), :)');
end
suptitle('K-Means Clustering of Profiles');

figure
for c = 1:3
    subplot(3, 1, c);
    plot(ctrs(c, :)');
end
suptitle('K-Means Clustering of Profiles showing only centroids');

figure
[silh3, h] = silhouette(X, cidx, 'city');
xlabel('Silhouette Value')
ylabel('Cluster')

Silhouette plot:

A silhouette plot gives an idea of how well-separated the resulting clusters are.

The silhouette value measures how close each point in one cluster is to points in the neighboring clusters.

The silhouette plot aids in discovering new subtypes of cancer.
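For reference, the silhouette value of a point is commonly defined as (b - a) / max(a, b), where a is its mean distance to the other points in its own cluster and b is its mean distance to the points of the nearest other cluster. The sketch below computes it with city-block distance. It assumes at least two clusters, and scoring a single-member cluster as 0 is a convention chosen here that may differ from MATLAB's.

```python
def cityblock(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def silhouette_values(points, labels):
    values = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        same = [cityblock(p, q)
                for j, (q, lab_q) in enumerate(zip(points, labels))
                if lab_q == lab and j != i]
        if not same:
            values.append(0.0)             # assumption: lone point scores 0
            continue
        a = sum(same) / len(same)
        b = min(
            sum(cityblock(p, q) for q, lab_q in zip(points, labels) if lab_q == other)
            / labels.count(other)
            for other in set(labels) if other != lab
        )
        values.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return values

points = [(0, 0), (0, 1), (10, 10), (10, 11)]
good = silhouette_values(points, [0, 0, 1, 1])   # sensible grouping
bad = silhouette_values(points, [0, 1, 0, 1])    # deliberately wrong grouping
```

Values near 1 indicate a point that sits well inside its cluster, while negative values, as in the deliberately wrong grouping, suggest the point belongs elsewhere.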

4. RESULTS

BREAST CANCER:

ALGORITHM | DCT | DWT-4 | DWT-3
BACKPROPAGATION | 93.61% | 97.58% | 94.65%
BP WITH MOMENTUM | 93.66% | 97.76% | 97.13%
BP WITH VARIABLE LEARNING RATE | 97.26% | 97.78% | 97.26%
BP WITH VARIABLE LR AND MOMENTUM | 98.03% | 98.70% | 98.61%
RESILIENT PROPAGATION | 99.00% | 99.75% | 99.70%


SKIN CANCER:

LUNG CANCER:

ALGORITHM | DCT | DWT-5
1. BP | 89.66 | 89.77
2. BP WITH MOMENTUM | 86.695 | 91.04
3. BP WITH ADAPTIVE LEARNING | 90.57 | 91.97
4. BP WITH MOMENTUM AND ADAPTIVE LEARNING | 86.705 | 91.51
5. RESILIENT BP | 98.52 | 99.52

LIMITATIONS OF THE SYSTEM

1) The method of calculating weights backwards is not biologically plausible.

2) The network loses its training each time the power is switched off.

3) The weights are initialised randomly on every run, so accuracy varies from run to run.

4) It is very difficult and expensive to obtain the genetic information of every individual.

SCOPE FOR FUTURE

To extend the project into a comparative study of other compression algorithms.

To implement the network on a neural network processor, thus preserving its training.

To extend the technique to other genetic diseases/disorders, with faster neural networks.

Discovery of tumor-specific markers and new subtypes.

To study the effect of drugs on gene expression.


THANK YOU.
