ELSEVIER Computer Methods and Programs in Biomedicine 45 (1994) 291-305
A method for quantitative image assessment based on redundant feature measurements and statistical reasoning
David J. Foran a,*, Richard A. Berg b
a Department of Pathology, University of Medicine and Dentistry of New Jersey, 675 Hoes Lane, Piscataway, NJ 08854, USA
b Collagen Corporation, 2500 Faber Place, Palo Alto, CA 94303, USA
Received 27 January 1994; revision received 5 July 1994; accepted 28 July 1994
Advances in computer graphics and electronics have contributed significantly to the increased utilization of digital imaging throughout the scientific community. Recently, as the volume of data being gathered for biomedical applications has begun to approach the human capacity for processing, emphasis has been placed on developing an automated approach to assist health scientists in assessing images. Methods that are currently used for analysis often lack sufficient sensitivity for discriminating among elements that exhibit subtle differences in feature measurements. In addition, most approaches are highly interactive. This paper presents an automated approach to segmentation and object recognition in which the spectral and spatial content of images is statistically exploited. Using this approach to assess noisy images resulted in correct classification of more than 97% of the pixels evaluated during segmentation and in recognition of geometric shapes irrespective of variations in size, orientation, and translation. The software was subsequently used to evaluate digitized stained blood smears.
Keywords: Digital images; Shape descriptors; Chromaticity
Digital imaging has become popular because it offers several advantages over conventional means of collecting and processing pictorial data. Once digitized, images can be analysed, compressed, processed, stored, or transmitted using several modes of communication.
Image processing refers to the manipulation of digitally encoded images to aid in visual understanding by humans or to facilitate subsequent computer analysis. One class of computer-based algorithms used to assist in the automated interpretation of images is referred to as pattern recognition. The term pattern recognition is generally defined as the ability to perceive structure in sensor-derived data. It is a fundamental activity that is intimately intertwined with our concept of intelligence [1]. In the context of this paper it refers to the computer procedures that operate on pictorial data to deliver interpretations of the digitized scene which are analogous to what a human might deduce were she to view the image directly.
* Corresponding author.
0169-2607/94/$07.00 © 1994 Elsevier Science Ireland Ltd. All rights reserved. SSDI 0169-2607(94)01590-C
D.J. Foran, R.A. Berg / Comput. Methods Programs Biomed. 45 (1994) 291-305
Although image processing and pattern recognition have their own distinct application areas, fundamental principles from each field have merged, giving rise to machine vision. Many of the basic components associated with machine vision systems have evolved from attempts to mimic the human visual system. The early stages of computer image processing are analogous to the process that takes place in the human eye and optic nerve, and the pattern recognition mechanisms represent the activities which take place in the human brain [1-7]. Machine vision has been used successfully in many industrial areas including robotics, inspection, process control, material handling, navigation, and parts assembly, but advancements in the medical environment have proceeded at a slower rate because of the greater image complexity and the medical impact of an incorrect assessment [1].
Segmentation is a crucial step for any machine-driven image assessment system since identification of any objects, structures, or shadows within a digitized field requires delineation into subregions. Algorithms which are currently used to segment images range in complexity from those that utilize monochromatic intensity thresholding to those that rely on color algebra, color clustering, or spatial filtering [4,10-14]. Color-based approaches sometimes improve the accuracy of image subdivision when compared with the performance of monochromatic schemes, but the algorithms often lack sufficient discriminatory power for differentiating among structures exhibiting similar spectral characteristics [15-19]. Schemes that rely on spatial filtering can be computationally cumbersome when large numbers of specimens are evaluated and are often difficult to integrate into automated systems [4,20-22].
Shape is a concept that has intuitive appeal, but attempts to quantify shape using a computer have had only limited success [23-26]. Algorithms used for shape analysis and object recognition typically rely upon the accuracy of segmentation operations and are especially difficult to codify when they are to be used in applications where objects may present a range of translations, rotations, and scales within the imaged scene.
Our goal was to develop a system which could reliably and automatically segment chromatic images and recognize delineated objects regardless of variations in spatial parameters. The algorithms have evolved from quantitative methods which we had originally developed to evaluate stained cells cultured on porous microcarriers (PMCs) and stained histological sections excised from guinea pigs [20,28]. The approach integrates principles of color theory, image processing, multivariate discriminant analysis, and spatial pattern recognition.
2.1. Classification
The notion of establishing classification criteria for scientific gain is not a new concept; in fact, Aristotle was one of the first to apply the technique to biological taxonomy. Linnaeus later devised a classification scheme in which membership in one class rather than another depended on a single distinguishing attribute rather than on any measured value. The main change in classification has been an emphasis on measurement. Post-Darwinians searched for a reliable method of quantifying association in measures of biological activity, and it was not until Galton's discovery of the correlation coefficient in 1883 that this quest was brought to fruition. Tables of correlation began to appear in scientific journals and ultimately gave rise to the multivariate statistical analysis which utilizes inter-class and intra-class correlations to make evaluations. At that time most calculations were made by hand, and it was not until the 1950s that multivariate analysis was executed by computer [30,31].
Today there are many state-of-the-art imaging systems, developed for research or industrial applications, that are based on first- or higher-order predicate calculus extended to allow for inductive inferences [32,33]. Popular inductive methods include interference matching, maximal unifying generalizations, conceptual clustering, and constructive induction [9,33].
Another general category of induction methods that has been the subject of much research is neural networks. These approaches have their foundations in statistical analysis, through the use of discriminant functions, and extend first to the perceptron and then to more complex neural-net concepts. They are all based on the concept of finding the best set of coefficients, or weight vectors, that minimize error [9,34]. The main unifying concept for all of these induction methods is that they try to learn to classify input patterns into output patterns.
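As an illustration of what finding a weight vector that minimizes error can look like in the simplest case, the following is a minimal sketch of the classical perceptron learning rule on a linear discriminant function. It is not the classifier used in this paper; the function names, the toy data, the learning rate, and the epoch limit are all assumptions made for the example.

```python
def train_perceptron(samples, labels, epochs=20, learning_rate=1.0):
    """Learn weights w and bias b so that sign(w.x + b) matches labels (+1 or -1).

    Hypothetical helper for illustration; converges only when the two
    classes are linearly separable.
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if activation > 0 else -1
            if predicted != y:
                # Move the decision boundary toward the misclassified sample.
                weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
                bias += learning_rate * y
                errors += 1
        if errors == 0:  # every training sample is classified correctly
            break
    return weights, bias


def classify(weights, bias, x):
    """Evaluate the linear discriminant and return the class label (+1 or -1)."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else -1


# Toy, linearly separable feature vectors (assumed data, not from the paper).
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
labels = [-1, -1, -1, 1]

weights, bias = train_perceptron(samples, labels)
predictions = [classify(weights, bias, x) for x in samples]
```

After training, `predictions` reproduces `labels` on this toy set; more complex neural-net schemes replace the single linear discriminant with layered, non-linear units but keep the same idea of adjusting weights to reduce classification error.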
2.2. Segmentation
Segmentation of digital images is a crucial step for most automated machine vision systems since many of the subsequent interpretation steps depend on the reliability of this delineation process.
One of the first thresholding methods used to segment images was the p-tile method [10]. In this method, an image is assumed to consist of dark objects on a light background. By assuming that the percentage of the object area is known, the threshold is defined as the highest grey level which maps at least (100 - p) percent of the pixels into the objects in the thresholded image. An obvious drawback to this method is that it is not applicable to images in which the object area is not known [21].
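The idea behind p-tile thresholding can be sketched with a cumulative histogram. The example below follows one common reading of the method, in which the assumed percentage of dark object pixels is given directly and the threshold is the lowest grey level at which the cumulative pixel count reaches that percentage. The function name, the `object_percent` parameter, and the toy pixel values are assumptions for illustration, not the formulation of [10].

```python
def p_tile_threshold(pixels, object_percent, levels=256):
    """Return a grey-level threshold for dark objects on a light background.

    pixels: flat list of integer grey values in [0, levels).
    object_percent: assumed percentage of the image occupied by objects.
    """
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1

    target = len(pixels) * object_percent / 100.0
    cumulative = 0
    for level in range(levels):
        cumulative += histogram[level]
        if cumulative >= target:
            return level
    return levels - 1


# Toy flattened image: 40 dark object pixels, 60 light background pixels.
image = [10] * 30 + [20] * 10 + [200] * 60
threshold = p_tile_threshold(image, object_percent=40)

# Pixels at or below the threshold are labelled as object.
object_mask = [value <= threshold for value in image]
```

The sketch makes the drawback noted above concrete: the result depends entirely on `object_percent`, so the method has nothing to work with when the object area is not known in advance.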
Much of the early work in image analysis focused on determining a reliable means of selecting the threshold value. One way to choose the threshold value is to search the histogram of grey levels, assuming that it is bimodal, and find the valley of the histogram. This technique is called the mode method; however, this approach is not appropriate for images with extremely unequal peaks or for those with broad and flat valleys. An elegant refinement of this method assumes that the histogram is the sum of two composite normal functions and determines the valley location from the normal parameters [11].
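A minimal sketch of the mode method is given below, assuming a clearly bimodal histogram. The way the second peak is located (weighting each count by its squared distance from the tallest peak) is a simple heuristic chosen for the example and is an assumption, as are the function name and the toy histogram.

```python
def mode_method_threshold(histogram):
    """Return the grey level at the valley between the two histogram modes."""
    levels = range(len(histogram))

    # Tallest peak of the histogram.
    first_peak = max(levels, key=lambda g: histogram[g])

    # Second peak: favour grey levels that are both populous and far from
    # the first peak (heuristic assumption, not taken from the paper).
    second_peak = max(levels, key=lambda g: histogram[g] * (g - first_peak) ** 2)

    low, high = sorted((first_peak, second_peak))

    # The valley is the least populated grey level between the two peaks.
    return min(range(low, high + 1), key=lambda g: histogram[g])


# Toy 16-level bimodal histogram (assumed data).
histogram = [0, 2, 8, 15, 9, 3, 1, 0, 1, 4, 10, 12, 6, 2, 0, 0]
valley = mode_method_threshold(histogram)
```

With very unequal peaks or a broad, flat valley, the minimum between the peaks is poorly defined, which is the limitation described in the paragraph above.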
Single-threshold methods are useful in simple situations, but may not be dependable in cases in which the regions of pixels to be delineated are not connected. Problems often arise when applying these techniques to images which present a background of varying grey level or regions which vary in grey level by more than the threshold. Two modifications of the threshold approach which ameliorate these difficulties are: to high-pass filter the image to de-emphasize the low-frequency background variation and then apply the original technique t