CHAPTER 2
LITERATURE REVIEW
2.1 INTRODUCTION
Existing methods for optic disk, exudate and blood vessel
extraction are surveyed in this chapter, and representative methods are
described below.
2.2 DETECTION OF OPTIC DISK
The optic disk (OD) is the exit point of retinal nerve fibers from the
eye, and the entrance and exit point for retinal blood vessels. It is brighter
than the rest of the ocular fundus and its shape is approximately
round. The location of the OD is crucial in retinal image analysis, for
example, as a reference to measure distances and identify anatomical parts
in retinal images (e.g. the fovea), for blood vessel tracking, and many other tasks.
Precise localization of the optic disk boundary is an important subproblem of
higher-level problems in ophthalmic image processing. Specifically, in
proliferative diabetic retinopathy, fragile vessels develop in the retina,
largely in the optic disk (OD) region, in response to circulation problems
created during earlier stages of the disease.
2.2.1 Clustering and Principal Component Analysis (PCA) method
The intensity of the optic disk is much higher than the surrounding
retinal background. A common method of optic disk localization is to find the
largest cluster of pixels with the highest gray level. A simple clustering
method (Huiqi Li and Opas Chutatape, 2001) was applied to the intensity
image to find the candidate regions where the optic disk may appear, and
principal component analysis (PCA) was applied only to these candidate
regions to locate the optic disk. The method remained effective even in the
presence of large areas of bright lesions.
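As an illustration, the brightest-cluster idea behind this localization step can be sketched as follows; the threshold, the 4-connectivity and the toy image are assumptions for the sketch, not details of the cited method:

```python
from collections import deque

def locate_od_candidate(img, thresh):
    """Centroid of the largest connected cluster of bright pixels.

    img: 2D list of gray levels; thresh: intensity cutoff (in practice
    it would be chosen from the image histogram).
    """
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for y in range(h):
        for x in range(w):
            if img[y][x] >= thresh and not seen[y][x]:
                # flood-fill the connected bright region (4-connectivity)
                q, cluster = deque([(y, x)]), []
                seen[y][x] = True
                while q:
                    cy, cx = q.popleft()
                    cluster.append((cy, cx))
                    for ny, nx in ((cy+1, cx), (cy-1, cx), (cy, cx+1), (cy, cx-1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] >= thresh and not seen[ny][nx]):
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(cluster) > len(best):
                    best = cluster
    cy = sum(p[0] for p in best) / len(best)
    cx = sum(p[1] for p in best) / len(best)
    return cy, cx
```

PCA would then be applied only around the returned candidate location.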
2.2.2 Principal component analysis (PCA) and active shape model
Principal component analysis (PCA) was applied to the candidate
regions at various scales to locate the optic disk. The minimum distance
between the original retinal image and its projection on to "disk spaces"
indicates the center of the optic disk. The shape of optic disk was obtained by
an active shape method in which affine transformation (Huiqi Li and Opas
Chutatape, 2003 and Daehee Kim et al 2009) was used to transform the shape
model from shape space to image space. The effects of vessels present inside
and around the optic disk were not eliminated, but incorporated in the
processing. This algorithm took advantage of a top-down strategy that
could achieve more robust results, especially in the presence of large areas of
bright lesions and when the edge of the optic disk was partly occluded by vessels.
2.2.3 Contour detection using ellipse fitting and wavelet transform
The contours were usually represented by image edges and image
features consisting of points where the image intensity function varies
sharply. It was crucial for the precise identification of the macula to enable
successful grading of macular pathology such as diabetic maculopathy.
However, the extreme variation of intensity features within the optic disk and
intensity variations close to the optic disk boundary presents a major obstacle
in automated optic disk detection. The presence of blood vessels, crescents
and peripapillary chorioretinal atrophy seen in myopic patients also increase
the complexity of detection. A novel algorithm was used to detect the optic
disk based on wavelet processing and ellipse fitting. Daubechies wavelet
transform (Pallawala et al 2004) was used to approximate the optic disk
region. Next, an abstract representation of the optic disk was obtained using
an intensity-based template. This yielded robust results in cases where the
optic disk intensity was highly non-homogeneous. An ellipse fitting algorithm was
then utilized to detect the optic disk contour from this abstract representation.
Additional wavelet processing was performed on the more complex cases to
improve the contour detection rate. Giri Babu Kande et al (2008) and Mendels
et al (1999) used active contours for the detection of the optic disk. A snake was
used by Kass et al (1988) as an energy minimizing spline guided by external
constraint forces to pull it towards features such as lines and edges.
2.2.4 Sobel edge detection and least square regression
The optic disk appears as a pale, well-defined, round or slightly
vertically oval disk. The detection was performed on the red component in three
steps: candidate area identification, Sobel edge detection and an estimation step.
The candidate area was identified and a clustering algorithm (Huiqi Li and
Chutatape, 2000) was applied to assemble nearby pixels into clusters. The
gravity center of the largest cluster was defined as the center of the candidate
area. Sobel edge detector was applied to the candidate area to get the contour
of the optic disk. Performance wise it was not satisfied because of noises.
Hence LSR (Least Square Regression) was applied to get the estimated circle,
based on Sobel edge detector. Another method used texture descriptors
(Lupascu et al 2008) and a regression based method in order to determine the
best circle that fits the optic disk. The best circle was chosen from a set of
circles determined by the method.
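The least squares estimation step can be illustrated with a simple algebraic (Kasa-style) circle fit to edge points; the formulation below is a sketch under that assumption, not the cited implementation:

```python
import math

def fit_circle(points):
    """Least-squares (Kasa) circle fit to a set of edge points.

    Solves x^2 + y^2 = 2*a*x + 2*b*y + c in the least-squares sense;
    the center is (a, b) and the radius is sqrt(c + a^2 + b^2).
    """
    # build the 3x3 normal equations A^T A u = A^T d
    A = [[0.0] * 3 for _ in range(3)]
    rhs = [0.0] * 3
    for x, y in points:
        row = (2 * x, 2 * y, 1.0)
        d = x * x + y * y
        for i in range(3):
            rhs[i] += row[i] * d
            for j in range(3):
                A[i][j] += row[i] * row[j]
    # Gaussian elimination (no pivoting; sketch only)
    for i in range(3):
        for j in range(i + 1, 3):
            f = A[j][i] / A[i][i]
            for k in range(3):
                A[j][k] -= f * A[i][k]
            rhs[j] -= f * rhs[i]
    u = [0.0] * 3
    for i in (2, 1, 0):
        u[i] = (rhs[i] - sum(A[i][k] * u[k] for k in range(i + 1, 3))) / A[i][i]
    a, b, c = u
    return a, b, math.sqrt(c + a * a + b * b)
```

Feeding the Sobel edge points of the candidate area into such a fit yields the estimated circle.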
2.2.5 Lab color morphology
The location of the optic disk (OD) was of critical importance in
retinal image analysis by both the automatic initialization of the snake
(Alireza Osareh et al 2002) and the application of morphology in color space.
Previous work on the optic disk (OD) focused on locating its center (Sinthanayothin
et al 1999). The interference of blood vessels was removed by dilation, and
erosion restored the boundaries. Standard gray level morphology was
performed after a transformation of the RGB color image into the gray level
image. By fitting the snake onto the optic disk, the performance could be
measured for various morphological methodologies. The initial contour for the
snake must be close to the disk boundary; otherwise it could converge to the
wrong place. The accuracy obtained was 90%. In color morphology
(Hanburry and Serra, 2001), each pixel must be considered as a vector of
color components. Definitions of maximum and minimum operations on
ordered vectors were necessary to perform basic operations.
2.2.6 Simulated annealing optimization
The method was based on the preliminary detection of the main
retinal vessels by means of a vessel tracking procedure. All retinal vessels
originate from the optic disk and then follow a parabolic course towards
retinal edges. A geometrical parametric model (Ruggeri et al 2003) was
proposed to describe the direction of these vessels and two of the model
parameters are just the coordinates of the optic disk center. Using samples of
vessels’ directions (extracted from fundus images by the tracking procedure)
as experimental data, the model parameters were identified by means of
simulated annealing, an optimization technique. These estimated values
provided the coordinates of the center of the optic disk.
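A rough sketch of this identification-by-optimization idea is given below; for simplicity the parabolic model is replaced by a radial stand-in, and the cooling schedule, step sizes and sample points are arbitrary assumptions of the sketch:

```python
import math, random

def vessel_direction(x, y, cx, cy):
    """Simplified stand-in for the parabolic vessel model: vessels
    are assumed to radiate from the OD center (cx, cy)."""
    return math.atan2(y - cy, x - cx)

def cost(center, samples):
    """Sum of squared differences between observed and modelled
    vessel directions; samples are (x, y, direction) triples."""
    cx, cy = center
    return sum((vessel_direction(x, y, cx, cy) - d) ** 2 for x, y, d in samples)

def anneal(samples, start, iters=3000, t0=25.0, seed=0):
    """Simulated annealing over the two OD-center parameters."""
    rng = random.Random(seed)
    cur, cur_c = start, cost(start, samples)
    best, best_c = cur, cur_c
    for i in range(iters):
        t = t0 * (1 - i / iters) + 1e-3          # linear cooling
        cand = (cur[0] + rng.gauss(0, t), cur[1] + rng.gauss(0, t))
        c = cost(cand, samples)
        # Metropolis acceptance rule
        if c < cur_c or rng.random() < math.exp((cur_c - c) / t):
            cur, cur_c = cand, c
            if c < best_c:
                best, best_c = cand, c
    return best, best_c
```

The best parameters found give the estimated OD-center coordinates.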
2.2.7 Circular Hough transform and local variance
The green band of the images was processed, since it has the greatest
contrast between the optic disk and the retinal tissue. Firstly,
the blood vessels in the image were suppressed by morphological methods
(closing). Then 24 radial vectors were defined using the approximate centre
of the optic disk as the origin. The image was resampled along these vectors
to form a representation that was subsequently processed.
The image was first enhanced using the Sobel operator, and then
thresholded, using the local mean and variance to compute the
threshold value. The remaining points were inputs to a circular Hough Transform
(Abdel Ghafara et al 2004) and the largest circle was taken as the optic disk. The success
rate was only 65%. This method separates normal images from those that are abnormal due
to glaucoma: the optic disk appearance is uniform in normal images but becomes
progressively less so as the disease progresses. Thitiporn Chanwimaluang and
Guoliang Fan (2003) suggested two steps. The approximate center of the optic
disk was located using the maximum of the local variance and the boundary was
delineated by fitting the active snake. The local variance could indicate the
intensity variation of the optic disk which was related to blood vessel and small
structures inside the optic disk. For large variance, more snake points should
be used to overcome the disturbing gradients from blood vessels and/or intensity
variations. No strong anatomical constraints were involved.
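The local variance criterion can be sketched directly; the window size and the toy image are illustrative assumptions:

```python
def local_variance_center(img, k=3):
    """Point of maximum local intensity variance, used as a rough
    OD center estimate (k x k window; k is a free parameter)."""
    h, w = len(img), len(img[0])
    r = k // 2
    best, best_v = (r, r), -1.0
    for y in range(r, h - r):
        for x in range(r, w - r):
            vals = [img[y + dy][x + dx]
                    for dy in range(-r, r + 1)
                    for dx in range(-r, r + 1)]
            m = sum(vals) / len(vals)
            v = sum((p - m) ** 2 for p in vals) / len(vals)
            if v > best_v:
                best_v, best = v, (y, x)
    return best
```

The snake would then be initialised around the returned point.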
2.2.8 Histogram Equalisation and Dynamic Contour
The retinal image was enhanced using histogram equalization.
Preprocessing was achieved by applying a pyramid edge detector to the
contrast enhanced image. Pixels in a layer of the pyramid data structure were
computed as the average of groups of four pixels, and this averaging prevents
further examination of spurious edge points arising from noise. Edge
strength was used to fit the snake to the disk boundary. The Cholesky
algorithm (Morris, 1999) was used, but the snake failed to locate the
boundary in the upper right quadrant. The gradient vector flow (GVF) based
algorithms had been used successfully to segment a variety of medical
imagery. However, due to the compromise of internal and external energy
forces within the resulting partial differential equations, these methods could
lead to less accurate segmentation results in certain cases. The use of a new
mean shift-based GVF segmentation algorithm (Huiyu Zhou et al 2010)
derived the internal/external energies towards the correct direction. This
method incorporates a mean shift operation within the standard GVF cost
function to arrive at a more accurate segmentation.
2.2.9 Texture Based Method
At random, 23 images were chosen for training and the other 23 for
testing. In the training step, 3 blocks containing 1% were collected from each
image, for a total of 69 blocks used as the training set. The
weighting function was defined by linearly combining three regression lines
obtained by fitting three pairs of statistical parametric relations (Chu Yang-Mao
et al 2005), which were used to precisely detect optic disk regions; this
could effectively eliminate incorrect optic disk detection caused by
extended bright areas around the optic disk. Optic disk (OD) localization
could also be achieved by an iterative threshold method to identify an initial
set of candidate regions, followed by connected component analysis (Siddalingaswamy
and Prabhu, 2007) to locate the actual optic disk.
2.2.10 Pyramidal Approach
The optic disk tracking was done by pyramidal decomposition
(Gagnon et al 2001), and by Optic Disk contour search technique based on the
Hausdorff distance. The global optic disk region was found by means of a
multi-scale analysis (pyramidal approach) using a simple Haar-based wavelet
transform. The brightest pixel that appears in a coarse resolution image
(at an appropriate resolution level depending on the initial image resolution
and the Optic Disk average dimension) was assumed to be part of the optic
disk. This global optic disk (OD) localization serves as the starting point for a
more accurate optic disk (OD) localization obtained from a template-based
matching that uses the Hausdorff distance measure on a binary image of the
most intense Canny edges. An average error of 7% on optic disk (OD) center
positioning was reached with no false detection. In addition, a confidence
level was associated with the final detection, indicating the "level of
difficulty" the detector had in identifying the optic disk (OD) position and shape.
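The coarse localization step of the pyramidal approach, where each coarse pixel is the average of a 2x2 block (the Haar approximation coefficients), can be sketched as follows; the number of levels and the toy image are assumptions of the sketch:

```python
def haar_downsample(img):
    """One pyramid level: each coarse pixel is the average of a
    2x2 block (Haar approximation coefficients)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2*y][2*x] + img[2*y][2*x + 1]
              + img[2*y + 1][2*x] + img[2*y + 1][2*x + 1]) / 4.0
             for x in range(w)] for y in range(h)]

def coarse_brightest(img, levels=2):
    """Locate the brightest coarse pixel, assumed to lie inside the
    OD, and map it back to full-resolution coordinates."""
    for _ in range(levels):
        img = haar_downsample(img)
    by, bx, best = 0, 0, float("-inf")
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            if v > best:
                best, by, bx = v, y, x
    scale = 2 ** levels
    return by * scale, bx * scale
```

The Hausdorff-distance template matching then refines this coarse position on a Canny edge map.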
Genetic algorithm based detection (Abraham Chandy and Vijaya Kumari, 2006)
involved a preprocessing stage that retained pixels with the highest 2% of gray
levels. A simple clustering mechanism and the optic disk diameter were considered
to determine the candidate region and the optic disk center. The genetic algorithm
explores the combinatorial space of possible contours (solutions) by means of
crossover and mutation, followed by the evaluation of fitness and the
selection of a new set of contours. The cumulative local gradient was used as
a fitness function to find the fittest contour.
A method (Foracchia et al 2004) to identify the position of the optic disc
(OD) in retinal fundus images was based on the preliminary detection of the
main retinal vessels. All retinal vessels originate from the OD and their path
follows a similar directional pattern (parabolic course) in all the images. To
describe the general direction of retinal vessels at any given position in the
image, a geometrical parametric model was proposed, where two of the model
parameters were the coordinates of the OD center. Using as experimental data
samples of vessel centerline points and corresponding vessel directions,
provided by any vessel identification procedure, model parameters were
identified by means of a simulated annealing optimization technique. These
estimated values provided the coordinates of the center of the optic disk.
2.2.11 Topological Active Nets (TAN)
Gagnon et al (2001) extracted the optic disk using Hausdorff based
template matching and pyramidal decomposition. It was neither sufficiently
sensitive nor specific enough for clinical application. An average error of 7%
on Optic Disk center positioning was reached with no false detection.
In addition, a confidence level was associated with the final detection,
indicating the "level of difficulty" the detector had in identifying the
optic disk position and shape. Chrástek et al (2002) used an automated
method for the optic disk segmentation which consists of 4 steps: localization
of the optic disk, nonlinear filtering, Canny edge detection and the Hough
transform. Nonlinear filtering was used for noise reduction while at the
same time preserving the edges. Since the optic disk is a circular
structure, the authors used the Hough transform in order to have a method of
circle detection.
The transform gave them the center and radius of a circle
approximating the border of the optic disk. A new approach to the optic disk
segmentation process was proposed by Novo et al (2008) in digital retinal
images by means of Topological Active Nets (TAN). This was a deformable
model used for image segmentation that integrates features of region-based
and edge-based segmentation techniques, being able to fit the edges of the
objects and model their inner topology. The optimization of the active nets
was performed by a genetic algorithm, with genetic operators adapted to, or
newly designed for, the problem. The active nets incorporated new energy terms
for optic disk segmentation, without the need for any pre-processing of the
images.
2.3 DETECTION OF EXUDATES
2.3.1 Support Vector Machines and Neural Networks
Alireza Osareh et al (2001) used Gaussian-smoothed histogram
analysis and fuzzy C-means clustering to segment candidate exudate regions,
and Support Vector Machines (SVM) and neural networks to classify
exudates and non-exudates. The support vector machine uses the Structural Risk
Minimization approach. For classification, a kernel function was
used with a separating hyperplane (w, b), with w as the weight and b as the
bias, which maximizes the margin or distance from the closest
points. The optimum separating hyperplane based on the kernel is
F(x) = sign( Σi=1..n αi yi k(x, xi) + b )                     (2.1)

where n is the number of training examples, yi is the label of example i,
k is the kernel and αi is the coefficient obtained by maximizing the Lagrangian
representation.
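Evaluating this decision function for a trained SVM can be sketched as follows; the RBF kernel, its parameter and the toy support vectors are illustrative assumptions, not details of the cited system:

```python
import math

def rbf(u, v, gamma=0.5):
    """Gaussian (RBF) kernel, one common choice for k."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def svm_decision(x, support, alphas, labels, b, kernel=rbf):
    """Evaluate F(x) = sign(sum_i alpha_i * y_i * k(x, x_i) + b).

    support, alphas, labels and b would come from training
    (maximizing the Lagrangian under the margin constraints)."""
    s = sum(a * y * kernel(x, xi)
            for a, y, xi in zip(alphas, labels, support)) + b
    return 1 if s >= 0 else -1
```

Shifting b asymmetrically is one way to trade false positives against false negatives.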
This can achieve a tradeoff between false positives and false
negatives using an asymmetric soft margin. SVMs converge to a solution
regardless of the initial conditions and reduce the danger of overfitting. A computer-based
approach (Jagadish Nayak et al 2008) for the detection of diabetic retinopathy
stage using fundus images includes morphological processing techniques and
texture analysis methods to detect the features such as area of hard exudates,
area of the blood vessels and the contrast. Maria et al (2009) used neural
network approach for the classification of exudates.
2.3.2 Bayesian Statistical Classifier
Wang et al (2000) applied a Bayesian statistical classifier based on
color features to differentiate the yellowish lesions. It combines a brightness
adjustment procedure with a statistical classification method and a local window
based verification strategy. Objects in an image could be described in terms of
features such as color, size, shape, texture and other complex characteristics.
Different objects were classified into corresponding clusters by the rules. This
detects the lesions correctly but misclassifies the optic disk due to the
similarity in color. The presence of lesions could be preliminarily detected
using MDD (Maximal Dependence Decomposition) and brightness
adjustment procedure solved the non-uniform illumination problem. The
accuracy was 70% in classifying normal images, and the method could not
accurately distinguish exudates from other lesions. Exudates were also
identified in gray level images using a recursive region
growing technique. An adaptive threshold algorithm was developed
(Leistritz, 1994) to detect and to measure the exudates in gray value images of
patients with diabetic retinopathy. Ege et al (2000) suggested Mahalanobis
and Kohonen neural network approaches along with a Bayesian classifier for
exudate detection.
Gardner et al (1996) used a Neural Network to identify the exudate
lesions in gray level images. A neural network was trained to recognize
features of diabetic fundus images and tested for sensitivity and specificity.
The network can be used to identify the presence of vessels, exudates and
hemorrhages with high predictive value. Back propagation neural network
was employed to analyse images using grids. The network was then tested
on both the previously seen training data and unseen test data. Neural
network determination of retinopathy for a whole image was based on a
statistical threshold, and some squares in the grid may be missed. By altering
the threshold, it is possible to change the network's detection rate to increase
the sensitivity, but this leads to a lower specificity. The network training took
2 to 3 weeks, whereas testing requires only about an hour. Sensitivity and
specificity could also be increased by changing the threshold on the network
output. This approach was weak, since not all of an exudate region may
appear within a square patch.
2.3.3 Fuzzy C-means clustering and Morphological Methods
Akara Sopharak et al (2007 and 2009) transformed Red, Green and
Blue (RGB) space to Hue, Saturation and Intensity (HSI) space. A median
filtering operation was applied to reduce noise before Contrast-Limited
Adaptive Histogram Equalization (CLAHE) was applied for contrast
enhancement. CLAHE operates on small regions in the image, and the contrast of
each small region was enhanced with histogram equalization. Fuzzy C-means
clustering (FCM) was used for coarse segmentation, followed by fine segmentation
using morphological reconstruction (Alireza Osareh et al 2003).
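The FCM alternation between membership and centroid updates can be sketched in one dimension; the initialisation, fuzzifier and iteration count are assumptions of the sketch:

```python
def fcm(data, c=2, m=2.0, iters=20):
    """Minimal 1-D fuzzy C-means sketch (m is the fuzzifier)."""
    lo, hi = min(data), max(data)
    # spread the initial centers linearly over the data range
    centers = [lo + i * (hi - lo) / (c - 1) for i in range(c)]
    for _ in range(iters):
        # membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1))
        u = []
        for x in data:
            d = [abs(x - ck) + 1e-12 for ck in centers]
            u.append([1.0 / sum((d[i] / d[j]) ** (2 / (m - 1))
                                for j in range(c)) for i in range(c)])
        # centroid update: mean weighted by u^m
        centers = [sum(u[k][i] ** m * data[k] for k in range(len(data)))
                   / sum(u[k][i] ** m for k in range(len(data)))
                   for i in range(c)]
    return centers, u
```

On retinal images the same updates run over pixel feature vectors rather than scalars.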
The algorithm still had some false detection because some pixels
with similar color to the exudates belong to the optic disk and the edge of
blood vessel. Candidate exudates can be detected using a multi-scale
morphological process (Alan D Fleming et al 2007). Based on local
properties, the likelihood of a candidate being a member of the classes exudates,
drusen or background was determined. This leads to a likelihood of the
image containing exudates, which could be thresholded to create a binary
decision.
Ahmed Wasif Reza and C. Eswaran (2008) used the green component
of the image and preprocessing steps such as average filtering, contrast
adjustment, and thresholding. The other processing techniques used were
morphological opening, extended maxima operator, minima imposition, and
watershed transformation. The algorithm was evaluated using the test images
with fixed and variable thresholds. Sato et al (1998) employed filtering for the
segmentation, and Russell et al (1993) employed thresholding for the detection of
exudates.
2.3.4 Region growing and Edge detection
Principal component analysis (Huiqi Li and Opas Chutatape, 2003)
was employed to locate the optic disk, and a modified active shape model was
proposed for the shape detection of the optic disk. A fundus coordinate system was
established to provide a better description of the features in retinal images.
Exudates were detected by the combined region growing and edge detection
approaches. LUV was selected as the suitable color space for the exudates
detection. The fundus image was divided into sixty-four sub-images and
detection was done in each sub-image. The color difference of the object
could be defined as

D(i, j) = sqrt[ (L(i, j) - Lr)^2 + (U(i, j) - Ur)^2 ]          (2.2)

where L(i, j) and U(i, j) are the colors of pixel (i, j) in the components
L and U respectively, and Lr and Ur are the reference colors of the object. The
reference color was determined as the gravity centre of the object. A minimum
mean squared error Wiener filter removed the noise. A combined method of region
growing and edge detection, which includes seed selection, edge detection and
growing criteria, was employed to detect the exudates. Edges in the sub-images
were detected by the Canny edge detector. The specificity obtained was
low.
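The color difference of Equation (2.2) and its use for selecting candidate pixels can be sketched as follows; the toy images, reference color and threshold are assumptions of the sketch:

```python
import math

def color_distance(L, U, ref_L, ref_U):
    """Color difference of Equation (2.2) in the L and U
    components of LUV space."""
    return math.sqrt((L - ref_L) ** 2 + (U - ref_U) ** 2)

def candidate_seeds(Limg, Uimg, ref, thresh):
    """Mark pixels whose LUV distance to the reference color
    (the gravity-centre color of the object) is below a threshold."""
    refL, refU = ref
    return [(y, x)
            for y in range(len(Limg))
            for x in range(len(Limg[0]))
            if color_distance(Limg[y][x], Uimg[y][x], refL, refU) < thresh]
```

Region growing then expands from such seeds, constrained by the Canny edges.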
2.3.5 Recursive Region Growing Segmentation
An automated screening system (Sinthanayothin et al 2002) to
analyse the features of non-proliferative diabetic retinopathy (NPDR) includes
recursive region growing segmentation algorithms combined with the use of a
new technique, termed a 'Moat Operator'. These were used to automatically
detect features of NPDR. These features included haemorrhages and
microaneurysms (HMA), which were treated as one group, and hard exudates
as another group. This approach presented encouraging results in the automatic
identification of important features of NPDR. Cree et al (2004) employed the
Morlet wavelet transform to spot diabetic retinopathy.
2.3.6 K nearest Neighbour and Neural Network
Screening to detect retinopathy disease can lead to successful
treatments in preventing visual loss. Intraretinal fatty (hard) exudates are a
visible sign of diabetic retinopathy and also a marker for the presence of
co-existent retinal oedema. The retinal images were automatically analysed in
terms of pixel resolution and image-based diagnostic accuracies and an
assessment of the level of retinopathy was derived. To estimate the exudates
and non-exudates probability density distributions, K nearest neighbour,
Gaussian quadratic and Gaussian mixture model classifiers were utilized. This
includes colour image segmentation and region-level classification based on
neural network and support vector machine classifier models. A sliding
windowing technique (Gerald Schaefer and Edmond Leung, 2007) was also
used to extract parts of the image which were then passed on to the neural
network to classify whether that area was a part of exudates region or not.
Principal component analysis (PCA) and histogram specifications were used
to reduce training time and complexity of the network and to improve the
classification rate.
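The K nearest neighbour classification step can be sketched as follows; the feature vectors, labels and choice of k are illustrative assumptions:

```python
import math

def knn_classify(x, train, k=3):
    """Classify a feature vector by majority vote among its k nearest
    training samples; train is a list of (features, label) pairs."""
    neighbours = sorted(train, key=lambda t: math.dist(x, t[0]))[:k]
    votes = {}
    for _, lab in neighbours:
        votes[lab] = votes.get(lab, 0) + 1
    return max(votes, key=votes.get)
```

In the cited work the features would be per-pixel or per-region colour descriptors, with probability densities estimated from the neighbour counts.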
2.3.7 Fuzzy C-means clustering and Neural Network
In this approach, the detection of exudate regions was done by
image color normalisation, enhancing the contrast between the objects and
background, and segmenting the color retinal image (Alireza Osareh et al 2001).
The exudates and non exudates patches were determined by a neural network.
The color normalisation was done using histogram specification. Local
contrast enhancement improves both the contrasting attributes of lesions and
the overall color saturation in an image. Using fuzzy C-means
clustering (FCM), the salient regions were highlighted, relevant features
were extracted, and finally the exudates were classified using a perceptron
neural network. This was performed on the intensity channel of the image
after it was converted to the HSI color space. The optic disk was segmented
as fragments of exudate regions due to the similarity of its color to
yellowish exudate regions; the falsely detected optic disk regions were
ignored. The segmented objects were classified as exudate/non-exudate
regions by a neural network.
2.3.8 Intelligent System
In the intelligent system, image contrast was enhanced by means of
a neurofuzzy subsystem in which fuzzy rules were implemented using a Hopfield-type
neural network (Leonarda Carnimeo and Antonio Giaquinto, 2006). This
was developed to behave as a fuzzy system to highlight pale regions in the
fundus images of patients. A multilayer perceptron neural network was trained
to evaluate the optimal global threshold that could minimize pixel
classification errors. The performance was compared with algorithms such as
Gradient descent backpropagation, gradient descent momentum with adaptive
learning rate, conjugate gradient backpropagation with Fletcher Reeves updates
and Levenberg-Marquardt backpropagation. The globally optimal segmentation
generates binary images which contain only significant information about
suspect damaged retinal areas. Kirsch et al (1971) used a number of
classification algorithms for pattern recognition in biomedical images.
2.3.9 Adaptive Multiscale Morphological Processing
The detection of diabetic retinopathy related spot lesions of various sizes and
shapes was accomplished by an adaptive multiscale morphological
processing technique (Xin Zhang and Guoliang Fan, 2006). Local minima
were used as reference points to determine the proper scale for dark lesions,
and a two-sided edge model was used to define the lesion area. Segmentation of
dark and bright lesions was obtained by using entropy based thresholding
techniques. In this approach, the limitation was that lesion validation might
remove some dark lesions that were connected with blood vessels.
2.3.10 Machine Learning Based Automated System
A machine learning based automated system (Meindert Niemeijer
et al 2007) was capable of detecting exudates and cottonwool spots and
differentiating them using a statistical classifier. Each pixel was classified,
and a lesion probability map indicated the probability that a pixel was part of a
bright lesion. Pixels with high probability were grouped into clusters. The
classifier differentiated the different types of pixels or clusters based on
so-called features, numerical characteristics such as pixel color or cluster area.
The system output whether bright lesions were present and the class of each
lesion. In this technique, bright lesions may be missed by both human experts
and the automated system because of the limitations of digital two-field
non-stereo photography. Secondly, there could be a constraint on the quality
of the annotations of the training images. A larger number of training
pixels may improve system performance. Retinal thickening, if present,
will be missed by the algorithm, and testing was done on only a small
number of patients.
2.3.11 Contextual clustering and Fuzzy Art Neural Network
There is extensive dissimilarity in lesion color across
patients owing to the intrinsic attributes of lesions, decreasing color dispersion at
the lesion periphery, and lighting disparity, which makes the lesion color in
some images lighter than the background color. Jayakumari and Santhanam
(2007) applied histogram equalization independently to each RGB channel,
followed by enhancement. The contrast enhancement algorithm not only
enhances the brightness of lesions but also augments the brightness of the
surrounding pixels, so that these may be misrecognized as belonging to the lesion class.
segmented images were discriminated to locate the hard exudates using
features such as convex area, solidity and orientation. The exudates were
detected with 93.4% sensitivity and 80% specificity. A limitation of this
method is that it cannot detect soft exudates.
2.4 DETECTION OF BLOOD VESSEL
Pattern recognition techniques deal with the automatic detection or
classification of objects or features. Humans are very well adapted to carry
out pattern recognition tasks, and some pattern recognition techniques are
adaptations of humans' pattern recognition abilities to computer systems.
In the vessel extraction domain, pattern recognition techniques are concerned
with the detection of vessel structures and the vessel features automatically.
Various techniques are as follows:
(1) Multi-scale approach
(2) Skeleton-based (centerline detection) approach
(3) Region growing approach
(4) Ridge-based approach
(5) Differential Geometry-based approach
(6) Matching filters approaches and
(7) Mathematical morphology schemes
2.4.1 Multi-scale Approach
Chwialkowski et al (1996) accomplished segmentation of blood
vessels using multiresolution analysis based on the wavelet transform, aimed
at automated qualitative analysis of arterial flow using velocity-sensitive
phase contrast MR images. Multi-scale approaches perform the segmentation task
on different image resolutions. The main advantage of this technique was the
increased processing speed. Major structures (the large vessels in this
application domain) were extracted from low resolution images, while fine
structures were extracted at high resolution. Another advantage was the
increased robustness. After segmenting the strong structures at the low
resolution, weak structures such as branches in the neighborhood of already
segmented structures can be segmented at higher resolution.
2.4.2 Skeleton-Based (Centerline Detection) Approach
Skeleton-based methods extract blood vessel centerlines. The
vessel tree was then created by connecting these centerlines. Niki et al (1993)
described their 3D blood vessel reconstruction and analysis method. Vessel
reconstruction was achieved using a short-scan cone-beam filtered back-projection
reconstruction algorithm based on Gullberg and Zeng's work.
A 3D thresholding and 3D object connectivity procedure were applied to the
resulting reconstructed images for the visualization and analysis process. A
3D graph description of blood vessels was used to represent the vessel
anatomical structure.
2.4.3 Ridge-Based Approach
Ridge-based methods treat grayscale images as a height map in
which intensity ridges approximate the central skeleton of the
tubular objects. Thus a 2D image can be viewed as a 3D surface, with image
intensity forming the third dimension. Guo and Richardson (1998) proposed a
ridge extraction method that treats digitized angiograms as height maps and
the centerlines of the vessels as ridges in the map. The image was first
filtered by a median filter and then smoothed by a non-linear diffusion
method (anisotropic smoothing) as the preprocessing step. Then a Region of
Interest (ROI) was selected by an adaptive thresholding method. This process
cuts the cost of the ridge extraction process and reduces the false
ridges introduced by image noise. Next, the ridge detection process was
applied to extract the vessel centerlines. Finally the candidate vessel
centerlines were connected together from the extracted ridges using a curve
relaxation process.
2.4.4 Region Growing Approach
The region growing technique segments image pixels into regions that
belong to an object, based on some predefined criteria. The simplest form of
segmentation can be achieved through thresholding and component labeling. The
segmentation process then uses region boundary information to extract the
regions. The main
disadvantage of region growing approach is that it often requires a seed point
as the starting point of the segmentation process. This requires user
interaction. Due to the variations in image intensities and noise, region
growing can result in holes and over segmentation. Thus it sometimes
requires post-processing of the segmentation result.
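The basic seeded region growing scheme described above can be sketched as follows; the 4-connectivity and the intensity-tolerance membership criterion are assumptions of the sketch:

```python
from collections import deque

def region_grow(img, seed, tol):
    """Grow a region from a seed pixel, accepting 4-neighbours whose
    intensity is within tol of the seed's intensity."""
    h, w = len(img), len(img[0])
    sy, sx = seed
    base = img[sy][sx]
    region, q = {seed}, deque([seed])
    while q:
        y, x = q.popleft()
        for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
            if (0 <= ny < h and 0 <= nx < w and (ny, nx) not in region
                    and abs(img[ny][nx] - base) <= tol):
                region.add((ny, nx))
                q.append((ny, nx))
    return region
```

The need for the seed argument illustrates the user-interaction drawback noted above.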
Schmitt et al (2002) combined thresholding with the region growing
technique to segment the vessel tree in 3D in their work on determining
contrast agent propagation in 3D rotational XRA image volumes.
2.4.5 Differential Geometry-Based Approach
Differential geometry-based methods treat images as hypersurfaces
and extract features of the images using the curvature and the crest lines of
the surface. The crest points of the hypersurface correspond to the
centerlines of the vessel structure. 2D and 3D images are treated in a similar
way; they differ only in their mathematical formulation. The principal
curvatures correspond to the eigenvalues of the Weingarten matrix, and the
principal directions are the eigenvectors. Crest points, which are intrinsic
properties of the surfaces, are the local maxima of the maximum curvature
on the hypersurface.
Krissian et al (1998) described their directional anisotropic
diffusion method, derived from Gaussian convolution, to reduce noise in the
image. The algorithm was applied to a set of phantom images containing tori
of different radii and to a set of real images of vessels.
2.4.6 Matching Filters Approach
The matching filters approach convolves the image with multiple
matched filters to extract the objects of interest, which in the vessel
extraction domain are the vessel contours. In extracting vessel contours,
designing different filters to detect vessels of different orientations and sizes
plays a crucial role. The size of the convolution kernel affects the
computational load of the filtering process. Matched filters are usually
followed by other image processing operations, such as thresholding and
connected component analysis, to obtain the final vessel contours. In the
detection of vessel centerlines, connected component analysis is preceded by
a thinning process.
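As an illustration, an oriented matched-filter bank in the spirit described above might be sketched as follows; the Gaussian cross-section profile, kernel size, and set of orientations are illustrative assumptions rather than the parameters of any surveyed method:

```python
import numpy as np

def matched_kernel(theta, sigma=1.5, half_len=4, half_wid=4):
    """Oriented matched-filter kernel: a Gaussian profile across the
    vessel direction, extended along it for the kernel's length."""
    ys, xs = np.mgrid[-half_wid:half_wid + 1, -half_len:half_len + 1]
    u = xs * np.cos(theta) + ys * np.sin(theta)   # along the vessel
    v = -xs * np.sin(theta) + ys * np.cos(theta)  # across the vessel
    k = np.where(np.abs(u) <= half_len,
                 np.exp(-v ** 2 / (2 * sigma ** 2)), 0.0)
    return k - k.mean()  # zero mean: flat background gives no response

def filter_response(img, thetas):
    """Convolve with each oriented kernel ('valid' region only) and
    keep, per pixel, the maximum response over all orientations."""
    best = None
    for theta in thetas:
        k = matched_kernel(theta)
        kh, kw = k.shape
        out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
        for y in range(out.shape[0]):
            for x in range(out.shape[1]):
                out[y, x] = (img[y:y + kh, x:x + kw] * k).sum()
        best = out if best is None else np.maximum(best, out)
    return best

# Bright horizontal "vessel" on a dark background.
img = np.zeros((21, 21))
img[10, :] = 1.0
resp = filter_response(img, thetas=[0, np.pi / 4, np.pi / 2])
# The strongest response lies on the vessel centerline (row 10 of the
# image, i.e. row 6 of the smaller 'valid' output).
print(int(np.argmax(resp.max(axis=1))))
```

Thresholding `resp` and running connected component analysis would give the final contours, as described above.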
Poli and Valli (1997) developed an algorithm to enhance and detect
vessels in real time. The algorithm was based on a set of multiple oriented
linear filters obtained as linear combinations of properly shifted Gaussian
kernels. These filters were sensitive to vessels of different orientations and
thicknesses.
2.4.7 Mathematical Morphology Schemes
Morphology relates to the study of object forms or shapes. It
facilitates the segmentation and search for objects of interest by filling holes
and eliminating unwanted segments.
Figueiredo and Leitao (1995) described a non-smoothing
approach to estimating vessel contours in angiograms. All local maxima of
the morphological edge detector in each vessel cross-section were considered
candidates for edge points. Dynamic programming was then used to find the
minimum-cost path among the candidates by selecting a pair for each cross-
section. Donizelli et al (1998) combined mathematical morphology and region
growing algorithms to segment large vessels from digital subtracted
angiography images. The authors also implemented three other classical and
morphological algorithms (a multiphase analysis process, a region splitting
approach, and morphological thresholding) and compared them with this
method. Due to the binary region growing algorithm employed, this work can
also be classified as a region growing approach. A bilinear top-hat
transformation and matched filters were used for the initial segmentation of
the images (Spencer et al 1996).
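The dynamic-programming selection among edge-point candidates, as in the Figueiredo and Leitao scheme, can be sketched as a minimum-cost path through per-cross-section candidates; the cost values and jump penalty below are invented for illustration:

```python
import numpy as np

def min_cost_path(costs, jump_penalty=1.0):
    """Dynamic programming over cross-sections: each row of `costs`
    holds candidate edge positions for one vessel cross-section; pick
    one candidate per row minimizing candidate cost plus a penalty on
    position jumps between consecutive rows."""
    n, m = costs.shape
    acc = costs[0].astype(float).copy()
    back = np.zeros((n, m), dtype=int)
    for i in range(1, n):
        # trans[j, k]: cost of arriving at position j from position k.
        trans = acc[None, :] + jump_penalty * np.abs(
            np.arange(m)[:, None] - np.arange(m)[None, :])
        back[i] = np.argmin(trans, axis=1)
        acc = costs[i] + trans[np.arange(m), back[i]]
    # Trace the optimal path back from the cheapest final candidate.
    path = [int(np.argmin(acc))]
    for i in range(n - 1, 0, -1):
        path.append(int(back[i, path[-1]]))
    return path[::-1]

# Three cross-sections, four candidate positions each; low costs mark
# likely edge points.
costs = np.array([[5, 1, 5, 5],
                  [5, 5, 1, 5],
                  [5, 5, 1, 5]])
print(min_cost_path(costs))  # -> [1, 2, 2]
```

The jump penalty is what enforces a smooth contour from one cross-section to the next.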
2.4.8 Model-Based Approach
In the model-based approach, explicit vessel models are applied to
extract the vasculature. Model-based approaches are divided into four
categories:
(1) Deformable models
(2) Parametric models
(3) Template matching and
(4) Generalized cylinders.
2.4.9 Deformable Models
Deformable models are categorized into parametric deformable
models and geometric deformable models.
2.4.9.1 Parametric Deformable Models - Active Contours (Snakes)
Deformable models are model-based techniques employed for
finding object contours using parametric curves that deform under the
influence of internal and external forces.
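A minimal sketch of one snake deformation step, under simplifying assumptions: an elasticity-only internal force, and a synthetic external force standing in for the gradient-derived image forces a real implementation would use.

```python
import numpy as np

def snake_step(pts, external_force, alpha=0.5, step=0.2):
    """One deformation step of a closed parametric snake: the internal
    (elasticity) force pulls each point toward the midpoint of its two
    neighbors; the external force pulls it toward image features."""
    prev_pts = np.roll(pts, 1, axis=0)
    next_pts = np.roll(pts, -1, axis=0)
    internal = (prev_pts + next_pts) / 2.0 - pts
    external = np.array([external_force(p) for p in pts])
    return pts + step * (alpha * internal + external)

def toward_circle(p, radius=2.0):
    """Synthetic external force field pulling points onto a circle of
    the given radius (a stand-in for image gradient forces)."""
    r = np.linalg.norm(p)
    return (radius / r - 1.0) * p if r > 1e-9 else np.zeros(2)

# Initialize the snake as a large circle and let it deform inward.
t = np.linspace(0.0, 2.0 * np.pi, 20, endpoint=False)
pts = np.stack([5.0 * np.cos(t), 5.0 * np.sin(t)], axis=1)
for _ in range(200):
    pts = snake_step(pts, toward_circle)
radii = np.linalg.norm(pts, axis=1)
print(round(float(radii.mean()), 2))  # settles near the target radius
```

Iterating the step is the energy-minimization process; the balance of `alpha` against the external force is what the internal/external trade-off above refers to.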
Molina et al (1998) used 3D snakes to reconstruct 3D catheter paths
from biplane angiograms. In the image preprocessing step, geometric
distortions introduced in both images by the X-ray projections of the vessels
were corrected. The 3D snake used in this method was represented by
B-splines and initialized interactively. Using a snake facilitates merging
information from both projections simultaneously during the energy
minimization process.
Kozerke et al (1999) used a modified definition of active
contour models in their technique for automatically segmenting vessels in
phase-contrast flow measurements. The method requires the user to select the
vessel of interest in an arbitrary image frame with a mouse click. First, each
phase frame was convolved with a Gaussian mask to reduce noise. Next,
isolated pixels were removed and holes were filled using connectivity
information. Finally, the first phase image in time with an area of half of the
overall maximum was selected. In the sequential processing of the remaining
frames, the segmented contour of the temporally neighboring previous frame
was used as a model to approximate the contour in the current frame in case
of missing or distorted edge features. The method uses the phase image, in
addition to the magnitude image, to handle image distortions.
Geiger et al (1995) proposed a method for detecting, tracking, and
matching deformable contours. The method was based on dynamic
programming (DP) but was non-iterative and guaranteed to find the global
minimum. The detection algorithm creates a list of uncertainty points for each
point selected by the user. Next, dynamic programming was applied to find
the optimal contour passing through these lists. The deformable model was
obtained after considering all possible contours and deformations. Matching
was achieved through a strategy that uses a cost function and some
constraints and is based on the dynamic programming method.
Toledo et al (2000) combined a probabilistic principal component
analysis (PPCA) technique with a statistical snake technique for tracking
non-rigid elongated structures. The PPCA technique was used to construct
statistical image feature descriptions, while snakes were used for global
segmentation and tracking of the objects. The statistical snake learns and
tracks image features using statistical learning techniques.
2.4.9.2 Geometric Deformable Models and Front Propagation Methods
Quek and Kirbas (2001) developed an approach for the extraction
of vasculature from angiography images by using a wave propagation and
trace back mechanism in 2D. Using a dual-sigmoid filter, the system labels
each pixel in an angiogram with the likelihood that it is within a vessel.
2.4.10 Parametric Model
The parametric model approach defines the objects of interest
parametrically. In tubular object segmentation, objects are described as a set
of overlapping ellipsoids. Pellot et al (1994) employed a deformable elliptic
model to approximate irregular vessels and bifurcations.
Krissian et al (1998) developed a multiscale model to extract and
reconstruct 3D vessels from medical images. This method used a new
response function that measures the contours of the vessels around the
centerlines.
Bors and Pitas (1998) used a pattern classification-based approach
for 3D object segmentation and modeling in volumetric images. The objects
were considered as a stack of overlapping ellipsoids whose parameters were
found using the normalized first- and second-order moments. The method
employed a Radial Basis Function (RBF) network classifier to model the
3D structure and gray-level statistics. The learning of the RBF network was
based on the trimmed mean algorithm.
2.4.11 Template Matching
Template matching tries to recognize a structure model (template)
in an image. The method uses the template as a context, i.e. an a priori
model; thus it is a contextual method and a top-down approach. In arterial
extraction applications, the arterial tree template is usually represented as a
series of nodes connected in segments. This template is then deformed to fit
the structures in the scene optimally. A stochastic deformation process
described by a Hidden Markov Model (HMM) is one method to achieve
template deformation, and dynamic programming is an effective method in
the recognition process.
Petrocelli et al (1993) presented a method, based on a Gauss-
Markov model, to recognize three-dimensional vessel structures from biplane
angiogram images. A unique feature of the system is its ability to extract
structures from unprocessed standard digital angiograms without any
preprocessing. The system, named the Deformable Template Matcher
(DTM), utilizes a priori knowledge of the arterial tree encoded as
mathematical templates. An arterial tree template was represented as a series
of nodes connected in segments and was created by a cardiologist using an
interactive mouse-driven program. The template was then deformed, with a
stochastic deformation process described by a Hidden Markov Model
(HMM), to fit the unknown scene in the image using local image
information. The template was considered to fit the scene when the best state
transition was found. Since the system works in 3D, the deformation process
was performed in space and back-projected onto the two planes used.
Extracting the entire vascular structure requires a good model of the global
structure and carries high computational complexity.
2.4.12 Generalized Cylinders Model
Generalized cylinders (GC) are used to represent cylindrical
objects; technically, generalized cylinders are parametric models.
Binford et al (1971) introduced the use of generalized cylinders
in vision applications. According to the definition of Agin et al (1976), a GC
consists of a space curve, or axis, and a cross-section function defined on
that axis; the cross-section function is usually an ellipse. In tube-like object
extraction, tubular objects are defined by a cross-sectional element that is
swept along the axis of the tube (the spine) using some sweep rules. The
spine was represented by a spline and the cross-section function by an
ellipse. Another way to represent cylinders is to use the Frenet-Serret
formulation as the basis of the GC. However, the Frenet-Serret formulation
and the tube model described earlier suffer from some serious drawbacks:
the cross-section function is not kept orthogonal to the spine, and
discontinuities and non-intuitive twisting behavior can occur. To alleviate
these problems, researchers developed other GC models, one of which is the
Extruded Generalized Cylinder (EGC) developed by O’Donnell et al (1997).
Fessler and Macovski (1991) developed an object-based method for
reconstructing arterial trees from a few projection images. The method used
an elliptical model of generalized cylinders to approximate arterial cross-
sections. By incorporating a priori knowledge of the structure of arteries, the
reconstruction problem turns into an object estimation problem. The method
employed a nonparametric optimality criterion that attempts to capture
arterial smoothness, incorporated into the system as a priori knowledge.
2.4.13 Tracking-Based Approach
Tracking-based approaches apply local operators on a focus known
to be a vessel and track it, whereas pattern recognition approaches apply
local operators to the whole image. Vessel tracking approaches, starting
from an initial point, detect vessel centerlines or boundaries by analyzing the
pixels orthogonal to the tracking direction. Different methods are employed
to determine vessel contours or centerlines; a straightforward approach is
edge detection followed by sequential tracing that incorporates connectivity
information.
Aylward et al (1996) utilized intensity ridges to approximate the
medial axes of tubular objects such as vessels. Some applications achieve
sequential contour tracing by incorporating features detected in the previous
step, such as the vessel central point, the searching direction, and the search
range, into the next step. Fuzzy clustering is another approach to identifying
vessel segments; it uses linguistic descriptions like “vessel” and “non-vessel”
to track vessels in retinal angiogram images.
Haris et al (1997) combined a recursive sequential tracking
algorithm with the morphological tools of homotopy modification and
watersheds to automatically extract coronary arteries from angiogram
images. The initial segmentation of the skeleton and borders of the coronary
artery tree was achieved through an artery tracking method based on circular
template analysis. The authors admit that the system had problems extracting
a complete coronary tree.
2.4.14 Artificial Intelligence-Based Approach
The artificial intelligence-based approach utilizes knowledge bases
to guide the segmentation process and to delineate vessel structures.
Different systems employ different types of knowledge sources from various
inputs. One type of knowledge source is the properties of the image
acquisition technique, such as cine-angiography, digital subtraction
angiography (DSA), computed tomography (CT), magnetic resonance
imaging (MRI), and magnetic resonance angiography (MRA). Some
applications utilize a general blood vessel model as a knowledge source.
Stansfield (1986) applied domain-dependent knowledge of
anatomy to interpret cardiac angiograms in the high-level stages. According
to Stansfield, anatomical knowledge was embodied within the system in the
form of spatial relations between objects and the expected characteristics of
the objects themselves. Knowledge-based systems exploit a priori knowledge
of the anatomical structure. These systems employ some low-level image
processing algorithms, such as thresholding, thinning, and linking, while
guiding the segmentation process using high-level knowledge. Artificial
intelligence-based methods perform well in terms of accuracy, but their
computational complexity is much larger than that of some other methods.
Rost et al (1998) described their knowledge-based system, called
SOLUTION (Solution for a Learning Configuration System for Image
Processing), designed to automatically adapt low-level image processing
algorithms to the needs of the application. It aims to overcome the extensive
changes required for an existing system to perform in a different
environment. The system accepts task descriptions in high-level, naturally
spoken terms and configures the appropriate sequence of image processing
operators using expert knowledge formulated explicitly as rules. In this
implementation, the extraction process was limited to contours.
Smets et al (1988) and Pallawala et al (2001) presented a
knowledge-based system for the delineation of blood vessels on subtracted
angiograms. The system encoded general knowledge about the appearance of
blood vessels in these images in the form of 11 rules (e.g. that vessels have
high-intensity center lines and comprise high-intensity regions bordered by
parallel edges). These rules facilitate the formulation of a 4-level hierarchy
(pixels, center lines, bars, segments), each level derived from the preceding
one by a subset of the 11 rules.
The main stages of the algorithm were as follows. First, the center
lines of the vessels were obtained by an adaptive maximum intensity
detector. Second, thresholding, thinning, and linking operations were applied
to get the final center line segments. Third, bar-like primitives were
constructed using a region growing algorithm on the detected center lines.
Finally, bar-like structures were combined into blood vessel segments using
geometrical and topological knowledge of the blood vessels. The reported
results indicate that the system was successful where the image contained
high contrast between the vessel and the background, but that it had
considerable problems at vessel bifurcations and self-occlusions. Walter et al
(2007) used contrast enhancement and shade correction for the extraction of
features such as blood vessels and microaneurysms.
2.4.15 Neural Network-Based Approach
Neural networks simulate biological learning and are
widely used in pattern recognition; they are basically a classification
approach. One advantage that makes neural networks attractive in medical
image segmentation is their ability to use nonlinear classification boundaries
obtained during the training of the network. Another attractive feature is the
ability to learn: with the selection of a good training set that includes all
possible features or objects, the network can learn the classification
boundaries in its feature space. One disadvantage of neural networks is that
they must be retrained every time a new feature is introduced to the network.
Another limitation is that the performance of the network is difficult to
debug. A network is a collection of elementary processors (nodes). Each
node takes a number of inputs, performs elementary computations, and
generates a single output. Each input is assigned a weight, and the output is a
function of the weighted sum of the inputs; these weights are learned through
training and then used in recognition. The back-propagation algorithm is a
widely used learning algorithm. One problem associated with learning is its
dependence on the training data set: the size of the training data set affects
the learning process, and the training procedure must be rerun each time new
training data are added to the set. Since neural networks require a training
data set, the learning process is supervised. A different class of neural
networks is self-teaching and does not depend on a training data set; the best
known of these are Kohonen feature maps, or Kohonen self-organizing
networks. Neural networks are used in a wide range of applications. In
medical imaging, they are mainly used as a classification method, where the
system is trained with a set of medical images and the target image is
segmented using the trained system.
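A single node trained with the delta rule, the elementary building block described above, might be sketched as follows; the sigmoid activation, learning rate, and toy data are assumptions for illustration:

```python
import numpy as np

def node_output(weights, inputs):
    """One elementary processor: squash the weighted sum of the
    inputs through a sigmoid to produce the node's single output."""
    return 1.0 / (1.0 + np.exp(-np.dot(weights, inputs)))

def train_delta(samples, labels, rate=0.5, epochs=2000):
    """Supervised learning of the node's weights with the delta rule:
    for every training sample, nudge the weights against the error."""
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.1, size=samples.shape[1])
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            out = node_output(w, x)
            w += rate * (y - out) * out * (1.0 - out) * x
    return w

# Toy training set: first column is a constant bias input; the label
# depends on whether the second feature is large.
samples = np.array([[1.0, 0.2], [1.0, 0.9], [1.0, 0.1], [1.0, 0.8]])
labels = np.array([0.0, 1.0, 0.0, 1.0])
w = train_delta(samples, labels)
preds = [int(node_output(w, x) > 0.5) for x in samples]
print(preds)
```

The dependence of `train_delta` on a labeled training set is the supervised-learning requirement discussed above.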
Nekovei and Sun (1995) described their back-propagation network
for the detection of blood vessels in X-ray angiography. This system did not
extract the vascular structure; its purpose was to label pixels as vessel or
non-vessel. The neural network was applied directly to the angiogram pixels
without prior feature detection. Since angiograms are typically very large,
the network was applied to a small sub-window sliding across the
angiogram, and the pixels of this sub-window were fed directly as input to
the network. Pre-labeled angiograms were used as the training set to
determine the network’s weights, using a modified version of the common
delta rule. The algorithm was compared with two other algorithms, a
Bayesian maximum likelihood algorithm and an iterative ternary
classification algorithm, and the classification accuracy of the three methods
was compared.
Hunter et al (1995) combined a neural network-based approach
with knowledge-guided snakes to extract left ventricular (LV) boundaries in
echocardiographic images. Their method comprises three stages. In the first
two stages, a neural network-based radial search algorithm detects candidate
left ventricular edge points along a set of radial search lines that define a
polar domain; the final boundary extraction takes place in this polar domain.
In the third stage, the knowledge-guided snakes automatically select left
ventricular boundary points from the candidate edge points of the second
stage and form a closed contour. In their snakes implementation, the authors
developed a new two-stage dynamic programming method reported to be
faster than the original method. Due to the knowledge-guided snakes used in
the extraction process, this work can also be classified as a parametric
deformable model approach.
2.4.16 Miscellaneous Tube-like Object Detection Approach
The miscellaneous tube-like object detection approach deals with
the extraction of tubular structures from images. Kompatsiaris et al (2000)
developed a method to detect the boundaries of stents in angiographic
images. Their method first constructed a training set using perspective
projections of various deformations of the 3D wire-frame model of the stent.
The initial detection of the stent in the image was accomplished by using the
training set to derive a multivariate Gaussian density estimate based on
eigenspace decomposition using Principal Component Analysis (PCA).
Then, a maximum likelihood estimation framework was formulated to
extract the stent. In the final stage of the algorithm, a 2D active contour
model (snake) was used to refine the detected stent; the snake was initialized
by an iterative technique considering the geometry of the stent. This work
could also be classified as a parametric deformable model approach due to
the active snake used. A fuzzy convergence technique (Adam Hoover et al
2003) was used to determine the origin of the blood vessel network.
Kaupp et al (1994) presented a method to segment a retinal image
into arteries, veins, the optic disk, the macula, and background. The method
was based on split-and-merge segmentation, followed by feature-based
classification. The features used for classification included region intensity
and shape. The primary goal of this work was vessel measurement; the nerve
was identified only to prevent its inclusion in the measurement of vessels.
The problem of optic nerve detection has rarely received attention
on its own. It has been investigated as a precursor to other problems, for
example as identifying a starting point for blood vessel segmentation
(Tamura et al 1988; Tolias and Panas 1998). It has also been investigated as
a byproduct of general retinal image segmentation, for instance in the
separate identification of arteries, veins, the nerve, the fovea, and lesions
(Akitaa and Kuga 1982; Goldbaum et al 1996; Pinz et al 1998). Based on the
above survey, the different methods employed and their performance were
analyzed. The forthcoming chapters discuss the methods for exudate
detection through optic disk and blood vessel extraction.
2.5 OVERVIEW OF THE WORK
To automate the screening process, the retinal images were
obtained from a fundus camera. Keeping these as the input for the process,
the optic disk area was identified. The next step was to trim out the identified
area by defining a circle that completely covers the optic disk. Once the OD
is determined, the optic disk need not remain in the image, because it may
affect the detection of exudates: it has features such as intensity and density
similar to those of the exudates. Once the optic disk had been extracted,
abnormalities present in the retina could easily be found. The final stage was
the detection of exudates.
The steps involved in optic disk detection include converting the
color image into a grayscale image. The optic disk is the brightest part of the
retinal image, and the red component is used for its detection. In the iteration
method, a window of arbitrary size was chosen and scanned over the entire
image. When the number of pixels inside the window exceeding the
threshold value became greater than a particular count, the entire area under
the window was made white; otherwise it was made black. The iteration was
repeated with an increased window size, and the output of each iteration was
fed as input to the next. The final binary image, with white pixels
representing the optic disk, can be mapped back to the RGB image. In the
clustering method, for every pixel exceeding the threshold, a box of a certain
size was constructed; this box formed a cluster. Many clusters were formed,
and the centroid of each cluster was calculated. If the distance between two
centroids was within an acceptable level, they were joined together into a
single cluster. The candidate regions were first determined by clustering the
brightest pixels in the intensity image. Principal Component Analysis was
then applied in this thesis to the clusters for the detection of the optic disk.
The result of the PCA method served as input for the proposed method to
trace the circular shape of the optic disk. With the mean center as center and
the mean radius as radius, a circle was drawn, and the best fit enclosing the
optic disk was chosen.
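The iterative window-thresholding step might be sketched as follows; for brevity the sketch uses non-overlapping windows, and the window sizes, pixel counts, and synthetic image are invented for illustration:

```python
import numpy as np

def window_pass(binary, win, min_count):
    """One pass of the iterative method: tile the image with win x win
    windows and whiten a window's whole area wherever it contains at
    least `min_count` bright pixels, else blacken it."""
    h, w = binary.shape
    out = np.zeros_like(binary)
    for y in range(0, h - win + 1, win):
        for x in range(0, w - win + 1, win):
            block = binary[y:y + win, x:x + win]
            out[y:y + win, x:x + win] = 1 if block.sum() >= min_count else 0
    return out

def locate_bright_region(gray, threshold, passes=((2, 3), (4, 10))):
    """Threshold the gray image, then run window passes of growing
    size, feeding each pass's output into the next."""
    binary = (gray > threshold).astype(np.uint8)
    for win, min_count in passes:
        binary = window_pass(binary, win, min_count)
    return binary

# Synthetic image: a bright 6x6 "optic disk" plus two noise pixels.
gray = np.zeros((16, 16))
gray[4:10, 4:10] = 1.0
gray[0, 0] = 1.0
gray[15, 15] = 1.0
mask = locate_bright_region(gray, threshold=0.5)
print(int(mask.sum()))  # isolated noise pixels are eliminated
```

Each successive pass with a larger window demands more supporting bright pixels, which is what discards isolated bright noise while keeping the disk region.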
The optic disk obtained from the proposed method was made
black, leaving the retinal image with the exudates for further detection. By
fixing a threshold, the exudates were extracted. The image to be examined
was kept as the test image, and a normal healthy image was kept as the
reference image. The corresponding pixels in the test and reference images
were compared by subtraction: if both pixels have the same value, they
cancel; if they differ, the larger value appears in the output.
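The subtraction step can be sketched directly:

```python
import numpy as np

def difference_image(test, reference):
    """Compare a test retina against a healthy reference pixel by
    pixel: equal pixels cancel to zero; where they differ, keep the
    larger value so the abnormality stands out."""
    differs = test != reference
    return np.where(differs, np.maximum(test, reference), 0)

reference = np.array([[10, 10], [10, 10]])
test = np.array([[10, 10], [10, 90]])  # one bright "exudate" pixel
print(difference_image(test, reference))  # only the abnormality survives
```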
The next method used for the extraction was the MDD (minimum
distance discriminant) classifier. Training samples for exudates and
background were obtained by drawing small windows over exudate patches
and background in a number of sample images. The RGB components were
represented in spherical coordinates; the output of the propagation through
radii was converted to spherical coordinates. The discriminant functions for
background and exudates were calculated using their respective means. If
the discriminant function value for exudates was lower than that for
background, the pixel was classified as an exudate; otherwise, as
background. The procedure was repeated for all the pixels and finally
mapped onto the color image. The extraction was then refined using an
enhanced MDD classifier. For background, a group of pixels surrounding the
optic disk was taken; for the yellowish region, pixels belonging to the optic
disk were taken. The discriminants of the yellowish region and the
background were found. If the discriminant of the yellowish region was less
than that of the background, the pixel was classified as yellowish region;
otherwise, as background. The yellowish region includes the optic disk,
exudates, drusen, artifacts, etc. To separate the exudates, the sharp objects in
the image were identified using the Sobel operator. As the exudates have the
desired features of yellow color and sharp edges, a feature-based Boolean
AND operation was used: the objects with sharp edges select the objects in
the image with yellowish elements. Thus only the exudates were detected,
and false detections were removed from the output image.
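The minimum-distance classification rule underlying the MDD classifier can be sketched as follows; the class means are toy values, and plain RGB is used here in place of the spherical-coordinate features described above:

```python
import numpy as np

def mdd_classify(pixel, mean_exudate, mean_background):
    """Minimum-distance discriminant: assign the pixel to whichever
    class mean it lies closer to in feature space."""
    d_ex = np.linalg.norm(pixel - mean_exudate)
    d_bg = np.linalg.norm(pixel - mean_background)
    return "exudate" if d_ex < d_bg else "background"

# Class means as would be estimated from hand-drawn training windows
# (toy values, not measurements).
mean_exudate = np.array([220.0, 210.0, 60.0])    # bright yellowish
mean_background = np.array([120.0, 60.0, 40.0])  # reddish fundus
print(mdd_classify(np.array([200.0, 190.0, 70.0]),
                   mean_exudate, mean_background))  # -> exudate
```

Repeating the rule for every pixel and mapping the labels back onto the color image gives the extraction described above.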
The presence of exudates can also be detected through the
extraction of blood vessels. The blood vessels were extracted both by
filtering and by morphological operations. Laplacian and Gaussian operators
were used to detect the blood vessels accurately, and the blood vessel edges
were thinned and smoothed to improve the extraction.
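The combined Laplacian and Gaussian filtering idea can be sketched as a Laplacian-of-Gaussian response on a synthetic vessel; the kernel size and sigma are illustrative assumptions:

```python
import numpy as np

def log_kernel(size=9, sigma=1.4):
    """Discrete Laplacian-of-Gaussian kernel: smooths with a Gaussian
    and takes the Laplacian in one pass, responding strongly at thin
    lines such as vessels."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x ** 2 + y ** 2
    k = (r2 - 2 * sigma ** 2) / sigma ** 4 * np.exp(-r2 / (2 * sigma ** 2))
    return k - k.mean()  # zero mean: no response on flat regions

def convolve2d_valid(img, k):
    """Naive 'valid' 2D convolution, enough for a small demo."""
    kh, kw = k.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = (img[y:y + kh, x:x + kw] * k).sum()
    return out

# Bright vertical "vessel" one pixel wide.
img = np.zeros((21, 21))
img[:, 10] = 1.0
resp = np.abs(convolve2d_valid(img, log_kernel()))
print(int(np.argmax(resp.max(axis=0))))  # column of strongest response
```

Thresholding the response, then thinning and smoothing the surviving edges, would complete the pipeline described above.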
The performance of exudate detection using all the methods was
measured by calculating the Peak Signal to Noise Ratio (PSNR), using the
formula

PSNR(x, y) = 10 log10 [ (max(max(x), max(y)))^2 / |x - y|^2 ]          (2.3)

The values were tabulated, and the plot shows that exudate
detection using the proposed method gives a higher ratio than the other
methods.
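Equation (2.3) can be computed as follows, interpreting |x - y|^2 as the mean squared difference between the two images (an assumption about the notation):

```python
import numpy as np

def psnr(x, y):
    """Peak signal-to-noise ratio as in Equation (2.3): the squared
    peak value of the two images over the mean squared difference,
    expressed in decibels."""
    peak = max(x.max(), y.max())
    mse = np.mean((x.astype(float) - y.astype(float)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * np.log10(peak ** 2 / mse)

a = np.full((4, 4), 100.0)
b = a.copy()
b[0, 0] = 110.0  # one differing pixel
print(round(psnr(a, b), 1))  # -> 32.9 (dB)
```

A higher PSNR here means the detected-exudate image differs less from the reference, which is how the tabulated comparison above is read.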
Thus, with the automated output, the physician can take decisions
within a few seconds regarding the treatment for the patient. This is very
useful in medical camps where eye check-ups are performed for hundreds of
people. It is not necessary for the ophthalmologist to visit the camp: once the
fundus camera is fixed at the appropriate angle, a technician can take the
images of the patients and transfer them to the system, which automatically
generates the necessary details about the retinal image. If any abnormalities
are present, it quickly detects them, and based on the severity, the patients
can be advised to visit the doctor.
Hence, the approach proves to be more efficient and economical,
and it saves time. The risk of abnormal patients being identified as normal is
also reduced. Once the retinal image of the patient is fed into the system, the
automated program identifies the optic disk, removes it, and indicates the
presence of abnormalities. Thus the patients can get a clear idea about the
current status of their eyes.
The presence of exudates can also be found by first extracting the
blood vessels. The extraction is done by the filter method as well as by the
morphological method. The green component was separated from the RGB
color retinal image, because the blood vessels appear most clearly in this
component.
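The green-channel and morphological route can be sketched with a bottom-hat transform built from basic erosion and dilation; the structuring-element size and the synthetic image are assumptions, and a real implementation would normally use a library such as OpenCV:

```python
import numpy as np

def erode(img, size=3):
    """Grayscale erosion with a size x size square structuring element."""
    half = size // 2
    padded = np.pad(img, half, mode="edge")
    out = np.empty_like(img)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            out[y, x] = padded[y:y + size, x:x + size].min()
    return out

def dilate(img, size=3):
    """Grayscale dilation, the dual of erosion."""
    half = size // 2
    padded = np.pad(img, half, mode="edge")
    out = np.empty_like(img)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            out[y, x] = padded[y:y + size, x:x + size].max()
    return out

def vessel_enhance(rgb):
    """Take the green channel, where vessel contrast is best, and apply
    a bottom-hat (closing minus image) so that thin dark vessels become
    bright against a suppressed background."""
    green = rgb[:, :, 1].astype(float)
    closing = erode(dilate(green))  # morphological closing
    return closing - green

# Synthetic fundus: mid-gray background with one dark vessel line.
rgb = np.full((15, 15, 3), 120.0)
rgb[:, 7, 1] = 40.0  # dark vessel in the green channel
enhanced = vessel_enhance(rgb)
print(int(np.argmax(enhanced.max(axis=0))))  # column of the vessel
```

The bottom-hat output is bright exactly where thin dark structures were removed by the closing, which is why the vessel column dominates the enhanced image.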