Journal of Food Engineering 61 (2004) 27–35
www.elsevier.com/locate/jfoodeng
Meat quality evaluation by computer vision
Jinglu Tan *
Department of Biological Engineering, University of Missouri, Columbia, MO 65211-0001, USA
Received 18 April 2002; accepted 2 May 2003
Abstract
Applying computer vision in meat quality evaluation has been an active area of research in recent years. Various studies have
addressed issues from basic technique development to applications. This paper summarizes the main results from a number of recent
application studies, which include characterization of quality attributes such as color, marbling, maturity and texture; prediction of
sensory scores and grades; and prediction of cooked-meat tenderness. The promise of computer vision for objective meat quality
evaluation is demonstrated and the remaining challenges are discussed.
© 2003 Elsevier Ltd. All rights reserved.
Keywords: Image processing; Meat grading; Tenderness
1. Introduction
The existing methods for meat quality (palatability)
grading rely heavily on subjective visual appraisal of
certain carcass characteristics. The official USDA (US Department of Agriculture) beef quality grading system
consists of three visually assessed attributes: abundance
of marbling (intra-muscular fat), muscle color, and
skeletal maturity. An official pork grading system does
not yet exist, but color is considered an important
quality attribute because it influences consumers in their
product selection.
While visual appraisal has served the industry for decades, it has some major drawbacks. Although the
graders are professionally trained, inconsistencies and
variations are intrinsic to subjective evaluations (Cross,
Gilliland, Durland, & Seideman, 1983). This has seriously
limited the ability of the meat industry to provide
consumers with products of consistent quality. Furthermore,
the grades have limited power to predict
eating quality. Research has shown that marbling and color characteristics, and thereby the final grade,
explain only a rather low percentage of the variations in
important palatability measures such as tenderness (Li,
Tan, Martz, & Heymann, 1999; Li, Tan, & Shatadal,
2001).
*Corresponding author. Tel.: +1-573-882-7778; fax: +1-573-884-
5650.
E-mail address: [email protected] (J. Tan).
0260-8774/$ - see front matter © 2003 Elsevier Ltd. All rights reserved.
doi:10.1016/S0260-8774(03)00185-7
Objective measures of beef quality have long been
desired by the industry, and there have been many
research efforts to develop instruments. One popular,
and obvious, approach has been to measure
mechanical properties as indicators of tenderness. A number of devices have appeared. The best known
may be the Warner–Bratzler shear force instrument. The
shear strength of cooked meat is correlated with sensory
tenderness scores (Shackelford, Wheeler, & Koohmaraie,
1995). This method, however, is not practical for
commercial fresh-meat grading.
Computer vision has been recognized as the most
promising approach to objective assessment of meat quality from fresh-meat characteristics. Research in this
area began in the early 1980s. Lenhert and Gilliland (1985)
describe the design of a black-and-white (B/W) imaging
system for lean yield estimation. Cross et al. (1983) and
Wassenberg, Allen, and Kemp (1986) report application
results of the system. Beef quality assessment by image
processing started with the work by Chen, McDonald,
and Crouse (1989) to quantify the marbling area percentage in six standard USDA marbling photographs.
McDonald and Chen (1990a, 1990b) used morphological
operations to separate connected muscle tissues from
the longissimus dorsi (LD) muscle. A Boolean random
set model was proposed to describe the spatial marbling
distribution (McDonald & Chen, 1992), but it did not
significantly improve marbling score prediction over
using only the marbling area (McDonald & Chen, 1991). To separate lean from fat in B/W steak images, Chen,
Nguyen, and Park (1995) suggested using the sixth moment as a thresholding criterion.
To develop a computer vision system for objective
meat grading, several steps must be completed. While
the existing human grading system has many weaknesses,
any new system designed as a replacement must
be compared against the human system before the new
system is accepted. Since the existing system is defined in
terms of qualitative subjective assessments, the quantitative characteristics that contribute to the human
grading are not always obvious. It is therefore necessary
to search for image features that are related to human
scores for marbling abundance, muscle color and maturity
and, eventually, to the USDA grades. Moreover, to
improve the usefulness of the grading system, new
instrumentally measurable characteristics are needed to
enhance the power of the grades in predicting eating quality attributes such as tenderness.
This paper attempts to show where we stand on
computer vision-based meat grading by summarizing
some recent results from work performed at the University
of Missouri and discussing some remaining
challenges. This will shed light on how successful we
have been in characterizing the traditional quality attributes,
predicting official grades, and finding computable features for tenderness prediction. The difficulties
discussed will point to some future directions of research.
Fig. 1. Example beef image. Upper: original. Lower: segmented LD
muscle. The holes in the LD muscle give the marbling image.
2. Characterizing USDA quality attributes
Meat images were processed to characterize the quality attributes defined in the USDA beef grading
system. Color image features were extracted to predict
human scores of color, marbling and maturity.
2.1. Color and marbling
Sixty wholesale beef ribs of varying color and degree
of marbling were acquired from a local supplier. Five-cm-thick
slices were taken from the ribs. Each slice was
then cut into two 2.5-cm-thick steaks. The two freshly
cut surfaces, mirror images of each other, were used for
analysis, one for image capture and the other for sensory analysis.
A 10-member panel was assembled and trained to
evaluate the color and marbling of beefsteaks. Muscle color
was evaluated according to a beef color guide on an
8-point scale: 1 = bleached red, 2 = very light cherry red,
3 = moderately light cherry red, 4 = cherry red, 5 = slightly
dark red, 6 = moderately dark red, 7 = dark red,
and 8 = very dark red. Marbling was rated according to the USDA marbling scorecards on a 9-point scale:
1 = devoid, 2 = practically devoid, 3 = traces, 4 = slight,
5 = small, 6 = modest, 7 = moderate, 8 = slightly abundant,
and 9 = moderately abundant. The panel averages were used as the sensory scores.
Images were acquired while the sensory evaluations
were proceeding. A color imaging system was used to
capture sample images. The illumination and camera
settings were selected so that the image resolution was
appropriate for revealing the small marbling flecks.
The steak images were subjected to several steps of
processing: filtering, background removal, segmentation of fat from muscles, isolation of the LD muscle, and
segmentation of marbling from the LD muscle. Details
of the segmentation algorithms developed and used can
be found in Gao, Tan, and Gerrard (1995), Lu and Tan
(1998), and Lu (2002). Fig. 1 shows an original image
and the segmented LD muscle. The holes in the LD
muscle give an image of the marbling flecks (not shown).
From the segmented LD muscle and marbling images, image features relating to muscle color and marbling
abundance were extracted. The LD muscle color was
characterized with the means (μR, μG and μB) and
standard deviations (σR, σG and σB) of the red, green
and blue color functions.
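The six color features can be illustrated with a minimal sketch. It assumes the image is stored as rows of (R, G, B) tuples and the segmented LD muscle as a boolean mask of the same shape; the function name `color_features`, the data layout, and the use of the population standard deviation are assumptions of this example, not details given in the paper.

```python
from statistics import mean, pstdev


def color_features(image, mask):
    """Return per-channel means and standard deviations over masked pixels.

    image: rows of (R, G, B) tuples; mask: rows of booleans, True inside
    the segmented muscle. Both layouts are assumptions of this sketch.
    """
    channels = ([], [], [])
    for pixel_row, mask_row in zip(image, mask):
        for pixel, inside in zip(pixel_row, mask_row):
            if inside:
                for channel, value in zip(channels, pixel):
                    channel.append(value)
    mu = tuple(mean(c) for c in channels)
    # Population standard deviation; the paper does not say which form was used.
    sigma = tuple(pstdev(c) for c in channels)
    return mu, sigma


# Tiny illustrative image: three muscle pixels and one background pixel.
image = [[(200, 50, 40), (180, 60, 40)],
         [(0, 0, 0), (220, 70, 40)]]
mask = [[True, True],
        [False, True]]
mu, sigma = color_features(image, mask)
print(mu, sigma)
```

Only pixels inside the mask contribute, so the background pixel does not bias the means.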
Features representing the amount and the spatial distribution of marbling were extracted from labeled
marbling images (Gerrard, Gao, & Tan, 1996). The size
of each marbling fleck was calculated. To take into account
the effects of fleck size, the marbling flecks were
classified into the following three size categories:
A1: <2.7 mm²,
A2: 2.7–21.4 mm², and
A3: >21.4 mm².
The following marbling features were then computed
to measure marbling abundance:
Dci: count of marbling flecks in size category Ai per
unit ribeye area,
Dai: marbling area in size category Ai per unit ribeye area,
Dc: count of all marbling flecks per unit ribeye area,
and
Da: total marbling area per unit ribeye area.
Dc and Da define two marbling densities: Dc is a count
density and Da is an area density. Dci and Dai represent
the densities for each fleck size category. The two densities describe marbling abundance from different perspectives.
To characterize the spatial distribution of marbling, a
ribeye image was divided into 5.5-cm² sub-regions. Dc
and Da were computed for each sub-region. The variations
in these localized densities would indicate the
spatial distribution of marbling over the LD. The following
statistics were computed from these densities as image features:
σDc, σDa: standard deviations of the count and area
densities, and
MDc, MDa: third moments of the count and area densities
about the mean.
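The fleck-size categories and the count and area densities defined above can be sketched as follows. The example assumes a binary marbling image (1 = marbling), 4-connected flecks, and a known pixel area in mm²; the names `fleck_areas`, `marbling_densities`, and `mm2_per_pixel` are hypothetical, and the connectivity rule is an assumption because the paper does not state one.

```python
from collections import deque

SIZE_BOUNDS_MM2 = (2.7, 21.4)  # limits of categories A1, A2, A3 from the text


def fleck_areas(marbling, mm2_per_pixel):
    """Label 4-connected flecks in a binary image and return their areas in mm^2."""
    rows, cols = len(marbling), len(marbling[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if marbling[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, n_pixels = deque([(r, c)]), 0
                while queue:  # breadth-first flood fill of one fleck
                    y, x = queue.popleft()
                    n_pixels += 1
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and marbling[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                areas.append(n_pixels * mm2_per_pixel)
    return areas


def marbling_densities(areas, ribeye_area_mm2):
    """Return Dc, Da and the per-category Dci, Dai per unit ribeye area."""
    low, high = SIZE_BOUNDS_MM2
    categories = {1: [], 2: [], 3: []}
    for area in areas:
        categories[1 if area < low else 2 if area <= high else 3].append(area)
    features = {"Dc": len(areas) / ribeye_area_mm2,
                "Da": sum(areas) / ribeye_area_mm2}
    for i, members in categories.items():
        features[f"Dc{i}"] = len(members) / ribeye_area_mm2
        features[f"Da{i}"] = sum(members) / ribeye_area_mm2
    return features


marbling = [[1, 1, 0, 0],
            [0, 0, 0, 1],
            [0, 0, 0, 1],
            [1, 0, 0, 1]]
areas = fleck_areas(marbling, mm2_per_pixel=1.5)
print(marbling_densities(areas, ribeye_area_mm2=100.0))
```

Applying the same two functions to each sub-region would give the localized densities from which the standard deviations and third moments are computed.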
Statistical analysis was conducted. The SAS backward
elimination procedure (SAS, 1996) was used to
select image features significant for sensory score prediction.
For color scores, two image features (μR and
μG) were significant. For marbling scores, five image
features (μR, Dc1, Da1, Da3 and Da) were useful. The R²
values of regression were 0.86 for color and 0.84 for marbling, indicating that the features were useful in
explaining the variations in sensory scores.
The means of red and green (μR and μG) were significant
for color prediction but μB was not. This indicates
that, while all three color components varied, the
blue component did not affect the judges' scoring. The fact that μR was significant in marbling prediction shows
that the judges' opinions were influenced by the lean color. Both the count and area densities of small marbling
flecks (Dc1 and Da1) influenced the sensory scores as expected. The area density of large flecks (Da3) being significant indicated that the presence of a few large
marbling flecks affected the sensory scoring although the
judges were instructed not to put more weight on larger
flecks. The global marbling area density (Da) also
influenced the scoring.
The image features characterizing the spatial variation
of marbling were not significant in the regression. This agrees with the finding of McDonald and Chen (1992) that information
on the spatial distribution of marbling did
not correlate significantly with marbling scores.
Since the sensory scores were imprecise in nature, the
data were also analyzed by using fuzzy set and neural
network techniques (Tan, Gao, & Gerrard, 1998). The
sensory scales were described as fuzzy sets, sensory attributes
as fuzzy variables, and sensory responses as sample membership grades. Multi-judge responses were
formulated as a fuzzy membership vector or fuzzy histogram
of response, which gave an overall panel response
free of the unverifiable assumptions implied in
conventional approaches. Neural networks were used to
predict the sensory responses in their naturally fuzzy
and complex form from the image features selected by
backward elimination. A maximum method of defuzzification was employed to give a crisp grade of majority
opinion. The fuzzy set and neural network method
classified color and marbling of a set of samples 100%
correctly. This further verified the usefulness of the image
features extracted.
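The fuzzy histogram of response and the maximum method of defuzzification can be illustrated with a short sketch. It assumes each judge gives one integer score on the scale; the function names and the tie-breaking rule (the lowest tied level wins) are assumptions of this example and are not taken from Tan et al. (1998).

```python
from collections import Counter


def fuzzy_histogram(responses, scale):
    """Membership vector: fraction of judges choosing each level of the scale."""
    counts = Counter(responses)
    n_judges = len(responses)
    return [counts[level] / n_judges for level in scale]


def defuzzify_max(membership, scale):
    """Crisp grade of majority opinion: the level with the largest membership."""
    best = max(range(len(scale)), key=lambda k: membership[k])
    return scale[best]


# Hypothetical color scores from a 10-member panel on the 8-point scale.
color_scale = list(range(1, 9))
panel_scores = [4, 4, 5, 4, 3, 5, 4, 4, 5, 4]
membership = fuzzy_histogram(panel_scores, color_scale)
print(membership)
print(defuzzify_max(membership, color_scale))
```

Unlike a panel average, the membership vector keeps the spread of opinion, which is the information a network predicting the "naturally fuzzy" response would be trained to reproduce.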
In addition to beef color and marbling, fresh pork
color was also evaluated by image processing (Lu, Tan,
& Gerrard, 1997; Lu, Tan, Shatadal, & Gerrard, 2000). Since consumers pay attention to pork color during
product selection, it is of interest to know if human responses
to pork color can be predicted by image processing.
Forty-four pork loins were randomly picked
and cut at the 10th rib. The muscle color was subjected
to evaluation by a seven-member sensory panel trained
according to procedures described in the American
Meat Science Association guidelines (AMSA, 1991). The color scale used was: 1 = pale-purplish gray,
2 = grayish pink, 3 = reddish pink, 4 = purplish red, and
5 = dark purplish red. Images were captured immediately
after sensory scoring under the same lighting conditions.
Algorithms were developed to segment the loin images
into background, muscle and fat (Lu et al., 2000). Color
image features were computed, which included the
means and standard deviations of red, green, and blue values of the segmented loin muscle areas. Both statistical
and neural network models were employed to
predict the sensory color scores from the image features.
The partial least squares technique was used to determine
the latent variables, which were subsequently used to
derive a statistical model by multiple linear regression
(MLR) and a neural network model by back-propagation.
Fig. 2. Vertebra image. Upper: original. Lower: segmented bone-
cartilage object.
The correlation coefficients between the predicted and the sensory color scores were 0.75 for the neural network
model and 0.52 for the statistical model. A prediction
error of 0.6 or lower was considered negligible
from a practical point of view. For the neural network
model, 93.2% of the samples had a prediction error
lower than 0.6; and for the statistical model, the percentage
was 84.1%. These results showed that image
processing plus neural network modeling is an effective tool for predicting consumer responses to fresh pork
color.
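The tolerance-based success rate used above can be computed as in the following sketch. The function name and the sample values are hypothetical. As a rough consistency check, and as an inference from rounding only, 41 and 37 of the 44 loins would round to the reported 93.2% and 84.1%.

```python
def within_tolerance_rate(predicted, observed, tol=0.6):
    """Percentage of samples whose absolute prediction error is at most tol."""
    hits = sum(1 for p, o in zip(predicted, observed) if abs(p - o) <= tol)
    return 100.0 * hits / len(observed)


# Hypothetical predicted and panel color scores for four loins.
predicted = [2.0, 3.5, 4.0, 1.0]
observed = [2.25, 3.0, 3.0, 1.0]
print(within_tolerance_rate(predicted, observed))

# Inferred sample counts that match the reported percentages after rounding.
print(round(100 * 41 / 44, 1), round(100 * 37 / 44, 1))
```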
2.2. Skeletal maturity
Image processing and neural network techniques
were developed to determine the skeletal maturity of
beef carcasses (Hatem & Tan, 1998). Maturity refers to
the physiological age, which is not synonymous with
chronological age. Maturity is a factor in beef quality
grading. Beef tenderness decreases with increasing maturity.
In the United States, beef carcass maturity is determined by subjective evaluation of indicators such as
cartilage ossification in the vertebrae. There are five
maturity levels, designated as “A” (young) through “E”
(old).
Two sets of carcass samples were used. The first set
included 110 carcasses selected from a commercial plant.
An official USDA grader graded the carcasses and
provided the maturity scores, which ranged from “A” to “E”. The second set included 28 cattle of known chronological
age. Most of them were of “A” maturity and
the rest were of “B” maturity. They were prepared in a
different commercial plant.
Digital color images were taken right after official
grading under the same fluorescent lighting conditions.
For each set of images, the lighting and camera settings
were kept constant. The images were focused on the thoracic vertebrae around the 13th to 15th ribs (Fig. 2).
Existing research has shown that the degree of cartilage
ossification in the vertebrae is the most important
indicator of skeletal maturity. For “A” maturity, the
cartilage in the thoracic vertebrae is free of ossification;
and for “B” maturity, there is some evidence of ossification.
The cartilage then becomes progressively ossified
with age until it appears as bone.
The degree of cartilage ossification was characterized
by processing the vertebra images. First, image segmentation
was performed to isolate the bones with the
cartilage (Hatem & Tan, 2000). Color and spatial features
were used for segmentation. The hue in the HSI
(hue, saturation, and intensity) color system was found
effective in segmenting the cartilage areas while the a*
value in the CIE-Lab color system gave good results for segmenting the bones. After a set of morphological
operations to refine the segmented cartilage and bone,
the two were combined into a bone-cartilage object
(Fig. 2), which was then used to characterize cartilage
ossification.
The fact that cartilage has a lighter color than bone
was used to derive image features. The color values
vary along the bone-cartilage object differently for different degrees of ossification or maturity (Hatem & Tan,
1998). Younger animals have more cartilage and thus
give a longer segment of light colors along the length of
the object. The average hue value was computed along
the length of the bone-cartilage object as image features.
After normalization, these hue values were used as input
vectors to a neural network. The output of the neural
network was the maturity score. The network had a two-layer log-sigmoid/log-sigmoid structure. It was trained
by using the back-propagation algorithm in the Matlab
environment (The Mathworks, Natick, MA). The
trained neural network served as both a feature extractor
and a maturity score predictor. Every set of samples
was divided into five sub-sets for training and testing in
a rotating fashion. For each rotation, four sub-sets were
used for training and the fifth sub-set for testing. The predicted maturity scores were compared with those
given by the professional grader and the percentage of correct classification was calculated. The fuzzy concept
described in Tan et al. (1998) was incorporated in the
use of human scores. The average correct classification
rates for the five rotations varied from 55% to 77%.
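The hue profile used as the network input can be sketched as below. The sketch uses the common textbook HSI hue formula, assumes the object's length runs along the image columns, averages hue arithmetically within each column, and applies min-max normalization; all of these choices, and the function names, are assumptions of this example and not details confirmed by Hatem and Tan (1998).

```python
import math


def hsi_hue(r, g, b):
    """Hue in degrees from the standard HSI conversion (0 for gray pixels)."""
    numerator = 0.5 * ((r - g) + (r - b))
    denominator = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if denominator == 0:
        return 0.0  # hue is undefined for gray; 0 is a placeholder
    ratio = max(-1.0, min(1.0, numerator / denominator))
    theta = math.degrees(math.acos(ratio))
    return theta if b <= g else 360.0 - theta


def hue_profile(image, mask):
    """Average hue per column over object pixels, scaled to the range [0, 1]."""
    profile = []
    for x in range(len(image[0])):
        hues = [hsi_hue(*image[y][x]) for y in range(len(image)) if mask[y][x]]
        if hues:  # skip columns that contain no object pixels
            profile.append(sum(hues) / len(hues))
    low, high = min(profile), max(profile)
    return [(h - low) / (high - low) if high > low else 0.0 for h in profile]


# Toy object: hue rises from red (0 degrees) to green (120 degrees) along its length.
red, green = (255, 0, 0), (0, 255, 0)
image = [[red, red, green],
         [red, green, green]]
mask = [[True, True, True],
        [True, True, True]]
print(hue_profile(image, mask))
```

An arithmetic mean is adequate only when the hues in a column do not straddle the 0/360 degree wrap-around; a circular mean would be safer in general.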
To verify the results from the first set of samples, the
same algorithm was applied to the second set of samples.
The correct classification rate ranged between 57%
and 86% with an average of 75%. This shows that the procedure developed had a certain degree of generality and
robustness.
3. Predicting USDA grades
USDA beef quality and yield grades were predicted
from image features (Lu, Tan, Gao, & Gerrard, 1998).
Beef carcasses (247 for quality grading, 241 for yield
grading) of the same maturity were selected in a commercial
packing plant during normal production operations.
The carcasses were prepared the way they are
normally processed in the industry. The quality and yield grades were given by an official USDA grader.
Quality was appraised on an 8-point scale (prime,
choice, select, standard, commercial, utility, cutter, and
canner). Yield was scored on a 5-point scale (1–5).
Digital color images of the ribbed surfaces (steaks)
were captured immediately after the official grading
under the same lighting conditions. After various steps
of image segmentation to obtain the image regions of interest, image features were computed. The color and
marbling features are described in a previous section.
Fat thickness is an indicator of lean meat yield. Fig. 3
shows the fat area for the sample shown in Fig. 1. The
back-fat area was partitioned into two halves: the dorsal
part (the upper-left half of the fat area in Fig. 3) and the
ventral part (the lower-right half of the fat area in Fig. 3). The thickness was computed in the direction approximately
perpendicular to the back curvature (lower
boundary of the fat area in Fig. 3). The average thickness
of the ventral part and that of the dorsal part were
used as fat thickness measures.
Fig. 3. Fat area for the sample shown in Fig. 1.
A method of divergence maximization was developed
to maximize the differences among classes by applying
linear and nonlinear transforms (Lu & Tan, 1998). Linear transforms were employed for quality classification.
Linear, quadratic and cubic transforms were applied for
yield classification. The Euclidean distance was used as a
similarity measure. Supervised classifiers were trained
for both quality and yield classification. The data set
was randomly partitioned into 10 sub-sets. Nine of the
10 sub-sets were used for training and the 10th was used for
testing in a rotating manner until all the sub-sets had been used for testing.
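The rotating train/test scheme can be sketched as follows. For brevity a nearest-class-mean rule with the Euclidean distance stands in for the divergence-maximization classifier of Lu and Tan (1998), which is not reproduced here; the function names, the random seed, and the toy data are all assumptions of this example.

```python
import math
import random


def k_fold_indices(n_samples, k, seed=0):
    """Randomly partition sample indices into k sub-sets of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def nearest_mean_classifier(train_x, train_y):
    """Stand-in classifier: assign a sample to the class with the closest mean."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[y] = counts.get(y, 0) + 1
    means = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}
    return lambda x: min(means, key=lambda y: math.dist(x, means[y]))


def rotate_and_score(xs, ys, k=10, seed=0):
    """Return the correct classification rate (%) for each of the k rotations."""
    rates = []
    for test in k_fold_indices(len(xs), k, seed):
        held_out = set(test)
        train = [i for i in range(len(xs)) if i not in held_out]
        predict = nearest_mean_classifier([xs[i] for i in train],
                                          [ys[i] for i in train])
        correct = sum(predict(xs[i]) == ys[i] for i in test)
        rates.append(100.0 * correct / len(test))
    return rates


# Two well-separated hypothetical feature clusters, ten samples per grade.
xs = [(0.1 * i, 0.0) for i in range(10)] + [(10 + 0.1 * i, 5.0) for i in range(10)]
ys = ["choice"] * 10 + ["select"] * 10
print(rotate_and_score(xs, ys, k=10))
```

Reporting the rate per rotation, as the paper does, exposes how much the result depends on which samples happen to be held out.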
The rate of correct quality classification varied with
the rotations of the procedure. For three of the 10 rotations,
it was 100%; four of them, 90–99%; and the
remaining three, 60–70%. The overall rate of correct
quality classification was 85.3%. For classification of
biological products, which usually involve significant
variability and inconsistencies, the results were considered excellent.
With a linear transform, the rate of correct yield
classification for the 10 rotations of the procedure was
above 50% for all the rotations except two. Applying
quadratic and cubic transforms did not bring about
significant improvements. The linear transform yielded
the best performance. The overall rate of correct classification
was 64.2%. This was considered reasonably good, given the fact that the grades from a single grader
are usually not very consistent.
4. Predicting tenderness
In commercial production, marbling and color constitute
most of the beef quality grade since most animals are young (“A” or “B” maturity). Color and marbling,
and thus the quality grade, however, are only weak
predictors of the eating quality attributes, of which
tenderness is the most important. There is a need to find
other quantitative predictors of beef tenderness. Muscle
texture was characterized by image processing. Color,
marbling and textural features were used to predict
beef tenderness measured with Warner–Bratzler shear forces and sensory evaluations.
4.1. Warner–Bratzler shear force
Two hundred sixty-five carcasses were selected to
differ in USDA quality grades in a commercial packing
plant. The samples were all of “A” maturity. A rib
section (posterior end) was removed and vacuum-packaged. This rib section was fabricated into 2.54-cm
thick steaks for Warner–Bratzler shear force measurements.
The steaks were cooked on a Hobart Model-CB51
char broiler. During broiling, steaks were turned at
4, 8, 11, and, if necessary, 14 min until they reached a
final internal temperature of 70 °C. Temperature was
monitored with a thermocouple-based thermometer.
After cooking, the steaks were cooled to room temperature (approximately 18 °C). Eight cores of 1.27-cm diameter
were removed from each steak parallel to the
muscle fibers and sheared with a Warner–Bratzler instrument.
The average maximum shear force readings
were used in the data analysis and they varied from 1.25
to 5.24 kg (12.25–51.35 N).
Fig. 4. The saturation images of two samples of different tenderness
exhibit different image textures. The upper sample is less tender.
Images of the ribbed surfaces were captured in the
plant immediately following quality grading. The same exposure and focal distance were used for all the images.
The steak images were segmented into muscle, fat and
marbling as described previously. The color and marbling
features were those described in Section 2.1.
The image texture of beef muscles was considered
directly or indirectly related to tenderness. Fig. 4 shows
differences in the image textures of beef samples of different
tenderness and these differences are measurable by image processing (Li, Tan, & Shatadal, 1999). Image
textural features based on pixel value run length and
spatial dependence were computed as predictors of
tenderness.
A pixel value run is a set of connected pixels having
the same or close pixel values. Pixel value runs can be
characterized by the pixel value, the length, and the direction
of the run. Within a run, the pixel values are within a relatively narrow band, forming a fairly smooth
area. The histogram of pixel value band run length is
defined as P(R, θ, T), where P stands for probability, R is the
run length in number of pixels, θ is the run direction on
the image plane, and T is the pixel value band thickness
or range of pixel values included in a run (Gao & Tan,
1996a, 1996b). Features computed from the pixel value
band run length histogram included:
μrT: mean, giving the average run length,
σrT: standard deviation, showing the variation of run
length, and
MrT: third moment, indicating the imbalance of run
length.
Subscript T denotes the band thickness, which ranged from 2 to 7.
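The run-length features can be illustrated with a simplified sketch. It scans rows only (horizontal direction) and extends a run while the spread between its largest and smallest pixel values stays below the band thickness T; this greedy rule and the function names are assumptions made for illustration, and the actual definition in Gao and Tan (1996a, 1996b) may differ.

```python
import math


def run_lengths(image, band):
    """Lengths of horizontal runs whose pixel-value range stays below `band`."""
    lengths = []
    for row in image:
        low = high = row[0]
        length = 1
        for value in row[1:]:
            new_low, new_high = min(low, value), max(high, value)
            if new_high - new_low < band:  # pixel still fits in the band
                low, high, length = new_low, new_high, length + 1
            else:  # close the current run and start a new one
                lengths.append(length)
                low = high = value
                length = 1
        lengths.append(length)
    return lengths


def run_length_features(lengths):
    """Mean, standard deviation and third central moment of the run lengths."""
    n = len(lengths)
    mu = sum(lengths) / n
    sigma = math.sqrt(sum((r - mu) ** 2 for r in lengths) / n)
    third = sum((r - mu) ** 3 for r in lengths) / n
    return mu, sigma, third


# Toy saturation image with pixel values in arbitrary units.
image = [[10, 11, 10, 20, 21, 30],
         [5, 5, 5, 5, 9, 9]]
lengths = run_lengths(image, band=2)
print(lengths)
print(run_length_features(lengths))
```

Longer average runs indicate smoother texture, while a large third moment indicates that a few long runs dominate many short ones.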
Pixel value spatial dependence can be described by a
matrix P. Entry P(i, j, d, α) is the normalized frequency
at which two pixels d pixels apart in direction α have
pixel values i and j respectively. A general procedure can
be found in Haralick, Shanmugam, and Dinstein (1973)
for extracting textural properties from this matrix and
14 texture features are suggested. The first 13 features were used in the research.
Since the images were roughly isotropic, only direction
α = 0 (horizontal) was used. Distance d was chosen
from 2 to 7. A method based on the auto-correlation
function was used to determine the appropriate distance
(Li et al., 1998).
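The spatial dependence matrix for the horizontal direction can be sketched as follows, together with two of the Haralick features (angular second moment and contrast). The sketch assumes pixel values already quantized to a small number of integer levels and counts each pixel pair in both orders so that the matrix is symmetric, as in Haralick et al. (1973); the function names are hypothetical.

```python
def cooccurrence(image, d, levels):
    """Normalized, symmetric co-occurrence matrix for direction 0 and distance d."""
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for row in image:
        for x in range(len(row) - d):
            i, j = row[x], row[x + d]
            counts[i][j] += 1  # count the pair in both orders
            counts[j][i] += 1
            total += 2
    return [[c / total for c in row] for row in counts]


def angular_second_moment(p):
    """Sum of squared entries; larger for more homogeneous textures."""
    return sum(v * v for row in p for v in row)


def contrast(p):
    """Weighted sum of squared value differences; larger for coarser variation."""
    return sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))


# The 4 x 4, four-level example image used by Haralick et al. (1973).
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
p = cooccurrence(image, d=1, levels=4)
print(angular_second_moment(p), contrast(p))
```

Repeating the computation for d = 2 to 7 and comparing the resulting features with shear force mirrors the distance selection described in the text.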
When d = 3, the texture features had the highest
correlation with shear force and they were selected for subsequent analyses. Principal component regression
(PCR), which is principal component analysis (PCA)
followed by MLR, and partial least squares (PLS) were
performed to test the improvement in shear force prediction
after adding the texture features to the color and
marbling features. The SAS Stepwise procedure was
performed to select the significant variables. Discriminant
analysis was also used to classify the beef samples.
Predictions of shear force from color and marbling
characteristics turned out to be very poor. When the
official color and marbling scores were used as predictor
variables, the shear force regression R²-value was less
than 0.05. When the color and marbling image features were used, R² = 0.16.
Muscle image textural features were combined with
the color and marbling features to improve the predic-
tion of shear force. Three methods were used to analyze
the usefulness of textural features: PCR, PLS and dis-
criminant analysis.
PCR gave a model with an R²-value of 0.18 while PLS
yielded a model with an R²-value of 0.34. These results are still very poor, but the PLS analysis showed that
adding the textural features significantly improved the
shear force prediction over using color and marbling
features only (R² = 0.16).
The beef samples were segregated into three categories
based on shear force values (≤1.71 kg, 1.71–3.09 kg,
and ≥3.09 kg). One hundred ninety-eight samples were
used as calibration data and 45 samples were used as test data. The SAS Discriminant procedure with a linear
discriminant function was used to classify the samples.
The calibration samples could be classified with 76%
accuracy and the test samples with 77% accuracy.
Although there are possibilities to improve the
models for shear force prediction from color, marbling
and textural features, the results obtained were poor.
This may indicate that color, marbling and muscle image texture do not contain sufficient information to
define cooked-meat shear force. Nevertheless, the inclusion
of textural features brought about significant
improvement. The image texture of muscles is at least as
significant an indicator of the mechanical properties of
beef muscles as color and marbling (Li et al., 1999).
4.2. Sensory tenderness
Beef samples were cut from the short loins of pasture-finished
steers and feedlot-finished steers of the same
maturity grade. Two sets of short strip loins were taken from the carcass samples, each having 97 pieces. One set
of the samples was used for sensory tenderness evaluation
and the other was used for image analysis.
A 10-person panel was trained for sensory evaluation.
The steak samples were cooked individually by broiling
to an internal temperature of 68 °C. The cooked samples were then cut into small pieces and evaluated, while warm,
by the trained panel for tenderness. A 16.4-cm-long line scale was used for tenderness scoring.
Images of the beef samples were acquired individually
in a chamber with a uniform black background and
warm-white-deluxe fluorescent lighting. The same exposure
and focal distance were used for all the samples. The
images were processed and the features described in the
last section were computed. The features computed were
first screened for collinearity. By using MLR, the run length features for band thickness T = 2 and the spatial
dependence features for distance d = 1 were selected.
Finally, 37 features were selected for further analysis.
The beef samples (97) were divided into a training sub-set (72) and a test sub-set (25). PCR and PLS were
performed to test the improvement in tenderness prediction
resulting from adding texture features. The SAS
Stepwise procedure was performed to select the variables
significant for tenderness prediction. A neural
network model was also developed.
All the samples were used in PCR. The R²-value for
the regression model increased from 0.30 with the color and marbling features alone to 0.72 after adding the
texture features. This improvement indicated that the
texture features made a significant contribution to beef
tenderness prediction.
In the PLS analysis, the first 14 factors explained
most of the variations and were used for regression. The
R²-values for the training data set and test data set were
0.35 and 0.17 respectively when using only the color and marbling features. They increased to 0.70 and 0.62
respectively after adding the texture features. The increase
in R²-value again verified the usefulness of the texture
features as indicators of beef tenderness.
The 14 factors from the PLS analysis were used as
inputs and the tenderness scores were used as the output
for a neural network with one hidden layer. The back-propagation
algorithm was used to train the neural network. After the network was trained, the test data set
was used to test the model. A prediction R²-value of 0.70
was obtained, a similar result to those from PCR and
PLS (Li et al., 1999).
In a further study (Li et al., 2001), 59 crossbred steers
were used. The sample preparation and sensory evaluation
were conducted in the same manner as described
above. The samples were then segregated into “tough” (tenderness score <8) and “tender” (otherwise) groups.
An area of about 250 mm² of the LD muscle was imaged
for each sample. Each color image (480 × 512 pixels) was divided into several sub-images of 64 × 64 pixels. Forty-five of these smaller images from the tough group and 45
from the tender group were randomly selected for further
analysis. The tenderness scores for the tough samples
ranged from 4.17 to 6.67 and those for the tender samples varied from 8.54 to 11.99.
A wavelet-based method was developed to decompose
a texture image into textural primitives of different sizes
(Li et al., 1999). In this research, the textural primitives
were rectangles. Each image (the saturation function
was used) was decomposed into 35 different primitives. The
degree of presence of each primitive type was measured
with the percentage of the image area occupied by the primitive, which was referred to as the primitive fraction.
Analysis of variance and correlation analysis reduced the
35 primitive fractions to 10, which were significantly
different between the tough and tender groups and were
not significantly correlated to one another.
The 10 primitive fractions were used for multivariate
classification based on the Fisher�s liner discriminant
Fig. 5. A near-infrared beefsteak image taken at wavelength ¼ 850 nm.
34 J. Tan / Journal of Food Engineering 61 (2004) 27–35
(SAS, 1996). The leave-one-out scheme was used, in
which 89 of the 90 samples were used to train the Fisher�slinear discriminant and the remaining one used for test-
ing. This scheme was repeated rotationally for all
90 samples. Out of the 45 tough and 45 tender samples,
38 tough and 37 tender samples were correctly classified,
giving an overall correct classification rate of 83.3%.
Instead of Fisher's linear discriminant, a neural network
classifier was trained and used in the same rotational
fashion, which yielded an overall correct
classification rate of 82.3%. The nonlinear neural network
classifier showed no advantage over the linear
Fisher classifier.
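The leave-one-out scheme with a two-class Fisher discriminant can be sketched in plain Python as below. This is an illustrative reimplementation under stated assumptions, not the SAS procedure used in the study: the decision threshold is placed midway between the projected class means (equal priors assumed), and all function names are hypothetical.

```python
def mean_vec(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def scatter(rows, m):
    """Within-class scatter matrix: sum of (x - m)(x - m)^T over the rows."""
    d = len(m)
    s = [[0.0] * d for _ in range(d)]
    for x in rows:
        dx = [xi - mi for xi, mi in zip(x, m)]
        for i in range(d):
            for j in range(d):
                s[i][j] += dx[i] * dx[j]
    return s

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_fisher(class0, class1):
    """Return (w, threshold) with w = Sw^-1 (m1 - m0)."""
    m0, m1 = mean_vec(class0), mean_vec(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    sw = [[u + v for u, v in zip(r0, r1)] for r0, r1 in zip(s0, s1)]
    w = solve(sw, [b - a for a, b in zip(m0, m1)])
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    return w, 0.5 * (dot(w, m0) + dot(w, m1))   # midpoint threshold (assumption)

def predict(model, x):
    w, threshold = model
    return int(sum(wi * xi for wi, xi in zip(w, x)) > threshold)

def leave_one_out(samples, labels):
    """Train on all samples but one, test on the held-out one, rotate through all."""
    correct = 0
    for i in range(len(samples)):
        train = [(x, y) for j, (x, y) in enumerate(zip(samples, labels)) if j != i]
        c0 = [x for x, y in train if y == 0]
        c1 = [x for x, y in train if y == 1]
        correct += predict(fit_fisher(c0, c1), samples[i]) == labels[i]
    return correct / len(samples)

# Check of the reported figure: 38 tough + 37 tender correct out of 90.
reported_rate = (38 + 37) / 90
```

With 90 samples the loop fits 90 discriminants, each on 89 samples, and the reported counts give 75/90, or 83.3%.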
Near-infrared (NIR) spectral features have also been
studied for beef tenderness prediction (Hatem, Tan, &
Shatadal, 1999). Forty steers were slaughtered and
graded. Short loins were removed for sensory tenderness
evaluation by a trained panel as described previously
and for NIR imaging. Steak images were captured with
an NIR camera and a tunable NIR filter for 20 different
wavelengths from 650 to 1030 nm at 20-nm increments.
Fig. 5 shows an image taken at 850-nm wavelength. The
image intensities were adjusted according to the spectral
characteristics of the camera and the filter.
Areas of muscle, fat and marbling were segmented
and a 20-point spectrum was computed for muscle, fat
or marbling of every sample. Different combinations of
these spectra were used as inputs to train neural network
models for tenderness prediction. The most useful
combination found was the difference between the
muscle and fat spectra. The sensory tenderness scale was
from 0 to 16.4. If a prediction within one unit of the
average sensory panel score was considered correct, the
correct prediction rate was 67%. Over 90% of the pre-
dictions were within one standard deviation from the
averages of the 10-member panel scores.
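The spectral features and the tolerance-based scoring described above can be illustrated with the following sketch. It rests on assumptions not stated in this section: each spectrum point is taken as the mean intensity of a segmented region in one band, a prediction exactly one unit away counts as correct, and the function names are hypothetical.

```python
# 650, 670, ..., 1030 nm: 20 bands at 20-nm increments.
WAVELENGTHS_NM = list(range(650, 1031, 20))

def mean_spectrum(band_images, mask):
    """Mean intensity inside a region for each band.
    band_images: one 2-D list of pixel rows per wavelength.
    mask: 2-D list of booleans marking the region (muscle, fat or marbling)."""
    spectrum = []
    for img in band_images:
        vals = [v for row, mrow in zip(img, mask)
                for v, keep in zip(row, mrow) if keep]
        spectrum.append(sum(vals) / len(vals))
    return spectrum

def difference_spectrum(muscle, fat):
    """Band-by-band difference between the muscle and fat spectra."""
    return [m - f for m, f in zip(muscle, fat)]

def hit_rate(predicted, panel_mean, tolerance=1.0):
    """Fraction of predictions within `tolerance` of the mean panel score."""
    hits = sum(abs(p - t) <= tolerance for p, t in zip(predicted, panel_mean))
    return hits / len(predicted)
```

For example, `hit_rate([5.0, 7.5, 9.0], [5.5, 9.0, 9.9])` counts two of the three predictions as correct.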
5. Remaining challenges and opportunities
Great strides have been made towards computer
vision-based evaluation of meat. To make this techno-
logy practically useful, several challenges still remain. At
the same time, there are many research opportunities of
great potential.
5.1. Image segmentation
System robustness, real-time capability, sample han-
dling, and standardization are among the issues that
remain to be addressed. The last three issues will require
further research and development, but there do not seem
to be insurmountable difficulties. System robustness or
reliability, however, entails further in-depth research. It
is a major challenge to design a system that has sufficient
flexibility and adaptability to handle the biological
variations in meat products. At the core of this issue is
image segmentation. Reliably and consistently seg-
menting a meat image into parts of interest without
human intervention is prerequisite to success of all
subsequent operations and thereby prerequisite to the
eventual success in computer vision-based meat grading.
Given the complex nature of meat images, no existing
algorithms are totally effective for meat image segmen-
tation. There have been successes in employing multi-
variate and nonlinear approaches (Gao et al., 1995; Lu
& Tan, 1998). Our recent work has focused on funda-
mental methodology development in multivariate clas-
sification that will lead to unsupervised image
segmentation algorithms effective for meat image processing
(Lu, 2002). Unsupervised techniques and algorithms
with self-learning capability appear to be the key
to robust meat image segmentation. Basic methodology
development is needed.
5.2. Power of quality indicators
As discussed earlier, the conventional quality indicators
of color, marbling and maturity lead to poor
predictions of important quality measures such as ten-
derness. There are both needs and opportunities to
empower the quality grading system by discovering new
measurable fresh-meat characteristics that are predictors
of cooked-meat quality. The ability of image processing
to characterize complex and subtle differences provides a
range of possibilities. Our efforts have focused on image
texture and NIR characteristics. Many more possibilities
remain unexplored. In particular, images captured
with different wavelengths and even different modalities
have the potential to reveal additional quality informa-
tion.
6. Conclusions
Results from several applications show that color
image processing is a useful technique for meat quality
evaluation. Quality attributes such as muscle color,
marbling, maturity and muscle texture can be effectively
quantified and characterized. USDA quality and yield
grades and cooked-beef tenderness can be predicted to a
satisfactory accuracy. Computer vision is a promising
technology for objective meat quality grading. The existing
research has formed a foundation so that a computerized
grading assistant can be implemented. This
assistant can provide a human grader with quantitative
information unobtainable subjectively. To replace the
human graders, however, much work remains. Among
the issues requiring continued research, effective methodologies
for consistent meat image segmentation are at
the top of the list. This will entail basic research on
image segmentation and classification. There are also
needs and opportunities for discovering new quality
indicators. In particular, imaging with different wavelengths
and modalities holds promise.
References
AMSA (1991). American Meat Science Association Committee on
guidelines for meat color evaluation. Contribution no. 91-545-A,
AMSA, Savoy, IL, USA.
Chen, Y. R., McDonald, T. P., & Crouse, J. D. (1989). Determining
percent intra-muscular fat on ribeye surface by image processing.
1989 ASAE Annual International Meeting, Paper no. 893009,
ASAE, St. Joseph, MI, USA.
Chen, Y. R., Nguyen, M., & Park, B. (1995). An image processing
algorithm for separation of fat and lean tissues on beef cut surface.
1995 ASAE Annual International Meeting, St. Joseph, MI, USA.
Cross, H. R., Gilliland, D. A., Durland, P. R., & Seideman, S. (1983).
Beef carcass evaluation by use of a video image analysis system.
Journal of Animal Science, 57(4), 910–917.
Gao, X., & Tan, J. (1996a). Analysis of expanded-food texture by
image processing, part I: geometric properties. Journal of Food
Process Engineering, 19(4), 425–444.
Gao, X., & Tan, J. (1996b). Analysis of expanded-food texture by
image processing, part II: mechanical properties. Journal of Food
Process Engineering, 19(4), 445–456.
Gao, X., Tan, J., & Gerrard, D. E. (1995). Image segmentation in 3-
dimensional color space. 1995 ASAE Annual International Meeting,
Paper no. 953607, ASAE, St. Joseph, MI, USA.
Gerrard, D. E., Gao, X., & Tan, J. (1996). Determining beef marbling
and color scores by image processing. Journal of Food Science,
61(1), 145–148.
Haralick, R. M., Shanmugam, K., & Dinstein, I. (1973). Textural
features for image classification. IEEE Transactions on Systems,
Man, and Cybernetics, SMC-3(6), 610–621.
Hatem, I., & Tan, J. (1998). Determination of skeletal maturity by
image processing. 1998 ASAE Annual International Meeting, Paper
no. 983019, ASAE, St. Joseph, MI, USA.
Hatem, I., & Tan, J. (2000). Cartilage segmentation in vertebra images.
2000 ASAE Annual International Meeting, Paper no. 003125,
ASAE, St. Joseph, MI, USA.
Hatem, I., Tan, J., & Shatadal, P. (1999). Beef quality prediction by
using near-infrared image features. 1999 ASAE Annual Interna-
tional Meeting, Paper no. 993159, ASAE, St. Joseph, MI, USA.
Lenhert, D. H., & Gilliland, D. A. (1985). The design and testing of an
automated beef grader. 1985 ASAE International Meeting, Paper
no. 853035, ASAE, St. Joseph, MI, USA.
Li, J., Tan, J., Gao, X., Lu, J., Gerrard, D. E., Smith, G. C., Tatum, J.
D., & George, M. H. (1998). Image texture features as indicators of
beef muscle mechanical properties. 1998 ASAE Mid-Central
Conference, Paper no. MC98130, ASAE, St. Joseph, MI, USA.
Li, J., Tan, J., Martz, F., & Heymann, H. (1999). Image texture
features as indicators of beef tenderness. Journal of Meat Science,
53, 17–22.
Li, J., Tan, J., & Shatadal, P. (1999). Discrimination of beef images by
texture features. 1999 ASAE Annual International Meeting, Paper
no. 993158, ASAE, St. Joseph, MI, USA.
Li, J., Tan, J., & Shatadal, P. (2001). Classification of tough and tender
beef by image texture analysis. Journal of Meat Science, 57, 341–
346.
Lu, J. (2002). Transforms for multivariate classification and application
to tissue image segmentation. PhD Dissertation, University of
Missouri, Columbia, MO, USA.
Lu, J., & Tan, J. (1998). Application of image segmentation to meat
image processing. 1998 ASAE Annual International Meeting, Paper
no. 983016, ASAE, St. Joseph, MI, USA.
Lu, J., Tan, J., Gao, X., & Gerrard, D. E. (1998). USDA beef
classification based on image processing. 1998 ASAE Mid-Central
Conference, Paper no. MC98131, ASAE, St. Joseph, MI, USA.
Lu, J., Tan, J., & Gerrard, D. E. (1997). Pork quality evaluation by
image processing. 1997 ASAE Annual International Meeting,
Paper no. 973125, ASAE, St. Joseph, MI, USA.
Lu, J., Tan, J., Shatadal, P., & Gerrard, D. E. (2000). Evaluation of
pork color by using computer vision. Journal of Meat Science, 56,
563–566.
McDonald, T. P., & Chen, Y. R. (1990a). Application of morpho-
logical image processing in agriculture. Transactions of the ASAE,
33(4), 1345–1352.
McDonald, T. P., & Chen, Y. R. (1990b). Separating connected muscle
tissues in images of beef carcass ribeyes. Transactions of the ASAE,
33(6), 2059–2065.
McDonald, T. P., & Chen, Y. R. (1991). Visual characterization of
marbling in beef ribeyes and its relationship to taste parameters.
Transactions of the ASAE, 34(6), 2054–2499.
McDonald, T. P., & Chen, Y. R. (1992). A geometric model of
marbling in beef longissimus dorsi. Transactions of the ASAE,
35(3), 1057–1062.
SAS. (1996). The SAS System for Windows (Release 6.11). SAS, Cary,
NC, USA.
Shackelford, S. D., Wheeler, T. L., & Koohmaraie, M. (1995).
Relationship between shear force and trained sensory panel
tenderness ratings of 10 major muscles from Bos indicus and Bos
taurus cattle. Journal of Animal Science, 73, 3333–3340.
Tan, J., Gao, X., & Gerrard, D. E. (1998). Application of fuzzy sets
and neural networks in sensory analysis. Journal of Sensory
Studies, 14, 119–138.
Wassenberg, R. L., Allen, D. M., & Kemp, K. E. (1986). Video image
analysis prediction of total kilograms and percent primal lean and
fat yield of beef carcasses. Journal of Animal Science, 62(6), 1609–
1616.