main seminar
1
CANCER SUB-CLASSIFICATION USING ARTIFICIAL NEURAL NETWORKS
ABSTRACT: In this project we have attempted to design software that classifies a particular cancer into its subtype faster and more accurately than the conventional method, with minimal human intervention, knowledge or experience, thus leading to better prognosis and higher chances of survival for cancer patients.
2
3
WHAT IS CANCER?
CAUSES
DIAGNOSIS
PROGNOSIS
4
Oncogenes : Genes which promote cell growth and reproduction.
Tumour Suppressor Genes: Genes which inhibit cell division and survival.
Over-expression of oncogenes or under-expression of tumour suppressor genes can lead to cancer.
5
DRAWBACKS OF CONVENTIONAL METHODS OF DETECTION AND DIAGNOSIS:
Medical Tests: False Positives and Negatives
Sub-classification accuracy depends largely on the doctor's knowledge and experience
Time-consuming, which delays subtype identification and treatment
Leads to poor prognosis.
6
NEED FOR THE PROJECT
Faster sub-classification of cancer.
Minimum human interference required for sub-classification.
Higher accuracy than traditional techniques.
High accuracy in classification leads to better prognosis and treatment of patients.
Scope for building cancer-prevention techniques in the future by studying genetic information.
7
PROJECT OVERVIEW:
1. DATA COLLECTION
2. PRE-PROCESSING
3. TRAINING ANN
4. TESTING
5. CANCER SUB-CLASSIFICATION
6. ACCURACY MET?
7. IF YES: STOP, ELSE KEEP TRAINING
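The train, test and repeat loop in steps 3 to 7 can be sketched roughly as follows. This is only an illustration: `train_one_round`, `evaluate`, `target_accuracy` and `max_rounds` are hypothetical names that do not come from the project, and the real stopping rule may differ.

```python
def train_until_accurate(train_one_round, evaluate, target_accuracy, max_rounds=100):
    """Repeat training until the test accuracy reaches the target.

    train_one_round: hypothetical callable that trains the network a bit more.
    evaluate: hypothetical callable that returns accuracy on the test set (0..1).
    Returns (rounds_used, last_accuracy).
    """
    accuracy = None
    for round_number in range(1, max_rounds + 1):
        train_one_round()                 # step 3: training
        accuracy = evaluate()             # steps 4-5: testing / sub-classification
        if accuracy >= target_accuracy:   # step 6: accuracy met?
            return round_number, accuracy # step 7: stop
    return max_rounds, accuracy           # gave up after max_rounds
```

The `max_rounds` cap is an added safeguard so the loop ends even if the target accuracy is never reached.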
8
WHAT IS MGE?
DNA microarray: A collection of microscopic DNA spots attached to a solid surface.
DNA microarrays are used to measure the expression levels of large numbers of genes simultaneously.
Core principle: hybridisation
10
The steps required in a microarray experiment:
11
MGE’S ROLE IN CANCER DIAGNOSIS & SUBCLASSIFICATION
Microarrays are a new technology that can automate the diagnostic process and improve on the accuracy of traditional diagnostic processes.
Examine expression of thousands of genes at once => Test for genes with elevated expression => Predict cancer
15
LUNG CANCER
SQUAMOUS CELL CARCINOMA
ADENOCARCINOMA
LARGE CELL LUNG CANCER
16
19
NEED FOR COMPRESSION
MGE data is huge.
Large input data -> large ANN -> long training time -> poor generalization.
Requires a large amount of memory.
20
DATA COMPRESSION METHODS: DCT, DWT
21
DISCRETE COSINE TRANSFORM
DCT : Technique for converting a signal into elementary frequency components.
Used to separate the image into parts (or spectral sub-bands) of differing intensity.
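As a rough illustration of DCT-based compression, the sketch below computes an orthonormal DCT-II of a 1-D signal and keeps only the first `keep` coefficients. Keeping the lowest-frequency coefficients is an assumption made for the example; the slides do not say which coefficients the project retained. All function names are illustrative.

```python
import math

def dct(signal):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(signal))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs

def idct(coeffs, n):
    """Inverse of dct(); coefficients beyond len(coeffs) are treated as zero."""
    signal = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        signal.append(total)
    return signal

def compress(signal, keep):
    """Keep only the first `keep` (lowest-frequency) DCT coefficients."""
    return dct(signal)[:keep]
```

For a smooth signal most of the energy sits in the first few coefficients, so `idct(compress(signal, keep), len(signal))` stays close to the original even for small `keep`.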
23
24
DISCRETE WAVELET ANALYSIS
• Wavelet analysis: reveals aspects of data that other signal analysis techniques miss, such as trends, discontinuities and self-similarity.
• Can compress or de-noise a signal without appreciable degradation.
• The wavelet transform is defined as the sum over all time of the signal multiplied by scaled, shifted versions of the mother wavelet function.
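A minimal sketch of multi-level wavelet compression, assuming the Haar wavelet and assuming that only the approximation coefficients of the last level are kept. The slides do not name the wavelet that was used, so both choices are illustrative, as are the function names.

```python
import math

def haar_step(signal):
    """One Haar DWT level: returns (approximation, detail) coefficients.

    An odd-length input is extended by repeating its last sample.
    """
    if len(signal) % 2:
        signal = signal + [signal[-1]]
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_approximation(signal, levels):
    """Apply `levels` Haar steps, discarding the detail coefficients each time."""
    approx = list(signal)
    for _ in range(levels):
        approx, _detail = haar_step(approx)
    return approx
```

Each level roughly halves the number of inputs. With this Haar sketch, 7480 inputs shrink to 468 values after 4 levels, close to but not equal to the 472 reported for DWT-4, which suggests the project used a wavelet with a longer filter.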
25
COMPRESSION LEVEL
COMPRESSION TECHNIQUE   LUNG CANCER (22283 inputs)   SKIN CANCER (7480 inputs)   BREAST CANCER (7480 inputs)
DCT                     90% (2283 inputs)            89% (778 inputs)            89% (778 inputs)
DWT                     DWT-5, 96% (707 inputs)      DWT-4, 93% (472 inputs)     DWT-3, 87% (939 inputs); DWT-4, 93% (472 inputs)
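The compression levels above appear to be the percentage of inputs removed, that is 100 × (1 − kept / original). That reading is an assumption, but the small check below reproduces the listed figures to within about one percentage point; the function name is illustrative.

```python
def compression_level(original_inputs, kept_inputs):
    """Percentage of inputs removed by compression."""
    return 100.0 * (1.0 - kept_inputs / original_inputs)

# Input counts taken from the slide.
lung_dct = compression_level(22283, 2283)   # listed as 90%
lung_dwt5 = compression_level(22283, 707)   # listed as 96%
skin_dwt4 = compression_level(7480, 472)    # listed as 93%
breast_dwt3 = compression_level(7480, 939)  # listed as 87%
```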
26
3. NEURAL NETWORK: BUILDING, TRAINING AND TESTING
27
WHAT ARE NEURAL NETWORKS?
The motivation for the development of neural network technology stemmed from the desire to mimic the human brain.
A Neural Network is a powerful data-modeling tool that is able to capture and represent complex input/output relationships.
28
PROPERTIES OF NEURAL NETWORKS:
Adaptive learning
Self-learning
Real time operation
Fault tolerant
29
NEURAL NETWORKS VS. CONVENTIONAL COMPUTERS
30
MCCULLOCH & PITTS MODEL:
31
TRANSFER FUNCTIONS:
32
LEARNING METHODS:
I. SUPERVISED LEARNING
II. UNSUPERVISED LEARNING
33
BUILDING NEURAL NETWORK
34
NEURAL NETWORK
[Figure: network diagram with layer sizes 472, 230 and 2.]
35
1. ERROR BACKPROPAGATION:
• Choose random weights for the network.
• Feed in input and obtain the result for each neuron.
• Feed forward the output of each layer to the next layer.
• Calculate the error for each neuron, and backpropagate it.
• Update the weights.
• Repeat until the network converges on the target output.
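The steps above can be sketched for a tiny network with one hidden layer. This is a minimal illustration, not the project's implementation: the sigmoid transfer function, the squared-error loss, the learning rate, the layer sizes and all names are assumptions.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_out):
    """Feed the input forward; every weight row ends with a bias weight."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    out = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, out

def mse(data, w_hidden, w_out):
    return sum((t - forward(x, w_hidden, w_out)[1]) ** 2 for x, t in data) / len(data)

def init_weights(n_in, n_hidden, seed=0):
    """Choose small random weights for the network."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return w_hidden, w_out

def train(data, w_hidden, w_out, lr=0.5, epochs=2000):
    """Plain backpropagation with one weight update per sample (in place)."""
    n_hidden = len(w_hidden)
    for _ in range(epochs):
        for x, t in data:
            hidden, out = forward(x, w_hidden, w_out)
            # Error term of the output neuron (squared error, sigmoid output).
            delta_out = (out - t) * out * (1.0 - out)
            # Backpropagate the error to each hidden neuron.
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                            for j in range(n_hidden)]
            # Update the weights.
            for j, h in enumerate(hidden + [1.0]):
                w_out[j] -= lr * delta_out * h
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    w_hidden[j][i] -= lr * delta_hidden[j] * v
    return w_hidden, w_out
```

For example, training on the four input/target pairs of the logical OR function reduces the mean squared error from its initial value; here a fixed number of epochs stands in for "repeat until the network converges".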
36
LOCATION OF LOCAL AND GLOBAL MINIMUM
[Figure: mean squared error (0 to 0.5) versus epoch (0 to 140), with the local minimum and the global minimum marked.]
37
Algorithms:
1. Gradient Descent With Momentum Backpropagation
2. Gradient Descent With Adaptive Learning Rate Backpropagation
38
3. Gradient Descent With Momentum And Adaptive Learning Rate Backpropagation
4. Resilient Propagation
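To show how two of these update rules differ, the sketch below applies gradient descent with momentum and a simplified resilient propagation (the variant that skips the update after a sign change) to a single weight. The hyperparameter values and function names are illustrative assumptions, not settings taken from the project.

```python
def gd_momentum(grad, w, lr=0.1, momentum=0.9, steps=200):
    """Gradient descent with momentum: the step blends the previous step
    with the current negative gradient."""
    velocity = 0.0
    for _ in range(steps):
        velocity = momentum * velocity - lr * grad(w)
        w += velocity
    return w

def rprop(grad, w, step=0.1, inc=1.2, dec=0.5,
          step_min=1e-6, step_max=50.0, steps=200):
    """Resilient propagation: only the sign of the gradient is used.
    The step size grows while the sign is stable and shrinks when it flips."""
    prev = 0.0
    for _ in range(steps):
        g = grad(w)
        if g * prev > 0:          # same sign as last time: speed up
            step = min(step * inc, step_max)
        elif g * prev < 0:        # sign flipped: we overshot, slow down
            step = max(step * dec, step_min)
            g = 0.0               # skip the update for this iteration
        if g > 0:
            w -= step
        elif g < 0:
            w += step
        prev = g
    return w

def example_gradient(w):
    """Gradient of the toy error surface (w - 3)^2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)
```

Because resilient propagation ignores the gradient magnitude, very small or very large gradients do not stall or destabilise it, which may be one reason it gives the highest accuracies in the results tables.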
39
3. K-MEANS CLUSTERING
kmeans partitions the observations in your data into K mutually exclusive clusters.
It returns a vector of indices indicating to which of the K clusters it has assigned each observation.
The result is a set of clusters that are as compact and well-separated as possible.
40
CODE: kmeans
X = [p1;p2;p3;…p15];             % input data, one profile per row
k = 3;                           % number of clusters
[cidx, ctrs] = kmeans(X, k);     % cidx: cluster index of each sample; ctrs: cluster centroids

figure                           % profiles, one subplot per cluster
for c = 1:k
    subplot(k,1,c);
    plot(X((cidx == c),:)');
end
suptitle('K-Means Clustering of Profiles');

figure                           % centroids only
for c = 1:k
    subplot(k,1,c);
    plot(ctrs(c,:)');
end
suptitle('K-Means Clustering of Profiles showing only centroids');

figure                           % silhouette plot using city-block distance
[silh3,h] = silhouette(X, cidx, 'cityblock');
xlabel('Silhouette Value')
ylabel('Cluster')
41
42
43
SILHOUETTE PLOT:
A silhouette plot gives an idea of how well-separated the resulting clusters are.
It is a measure of how close each point in one cluster is to points in the neighboring clusters.
The silhouette plot aids in discovering new subtypes of cancer.
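The silhouette value of a point can be computed as (b − a) / max(a, b), where a is its mean distance to the other points in its own cluster and b is the smallest mean distance to the points of any other cluster. The sketch below is an illustrative Python version using city-block distance; it is not the MATLAB routine used in the project, and the function names are assumptions.

```python
def cityblock(p, q):
    """City-block (Manhattan) distance between two points."""
    return sum(abs(x - y) for x, y in zip(p, q))

def silhouette_values(points, labels, dist=cityblock):
    """Silhouette value for every point; values near 1 mean well-separated."""
    values = []
    for i, p in enumerate(points):
        # Distances from point i to every other point, grouped by cluster label.
        by_cluster = {}
        for j, q in enumerate(points):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(dist(p, q))
        own = by_cluster.pop(labels[i], [])
        if not own or not by_cluster:
            values.append(0.0)   # singleton cluster, or only one cluster exists
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        denom = max(a, b)
        values.append((b - a) / denom if denom else 0.0)
    return values
```

Values close to 1 indicate a point that sits well inside its cluster, values near 0 a point on the border between two clusters, and negative values a point that is probably assigned to the wrong cluster.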
44
45
4.RESULTS
46
BREAST CANCER:
ALGORITHM                          DCT      DWT-4    DWT-3
BACKPROPAGATION                    93.61%   97.58%   94.65%
BP WITH MOMENTUM                   93.66%   97.76%   97.13%
BP WITH VARIABLE LEARNING RATE     97.26%   97.78%   97.26%
BP WITH VARIABLE LR AND MOMENTUM   98.03%   98.70%   98.61%
RESILIENT PROPAGATION              99.00%   99.75%   99.70%
47
SKIN CANCER:
48
LUNG CANCER:
ALGORITHM                                    DCT      DWT-5
1. BP                                        89.66    89.77
2. BP WITH MOMENTUM                          86.695   91.04
3. BP WITH ADAPTIVE LEARNING                 90.57    91.97
4. BP WITH MOMENTUM AND ADAPTIVE LEARNING    86.705   91.51
5. RESILIENT BP                              98.52    99.52
49
50
LIMITATIONS OF THE SYSTEM
1) The method of backwards-calculating weights is not biologically plausible.
2) Each time the power is switched off, the network loses its training.
3) Each run of the network initialises the weights randomly, so accuracy differs between runs.
4) It is very difficult and expensive to obtain the genetic information of every individual.
51
SCOPE FOR FUTURE
To extend the project to a comparative study of various other compression algorithms.
To implement the network on a neural network processor, thus saving its training.
To extend the technique to other genetic diseases/disorders, with faster neural networks.
Discovery of tumor-specific markers and new subtypes.
To study the effect of drugs on gene expression.
52
54
THANK YOU.