2009 IEEE Symposium on Computational Intelligence for Image Processing (CIIP), Nashville, TN, USA, March 30 - April 2, 2009

Predicting Quality Measures in Beef Cattle Using Ultrasound Imaging

Wilson Harron and Robert Dony

Abstract— A method of determining two quality measures of beef cattle using different classification networks is presented. The method involves calculating texture features from ultrasound images of the beef cattle and then predicting the final percentage intramuscular fat (IMF) and marbling grades associated with the beef cattle. This method can be used in the cattle industry to enhance current breeding techniques.

I. INTRODUCTION

There are two quality measures for beef that are employed in the cattle industry. These measures are used to determine the quality and tenderness of the final meat product from the carcass and are therefore important to a producer, as they determine the price at which the meat will be sold. The marbling grade is a subjective measurement made by experts who visually inspect the meat and then assign a grade between 1 and 6. The second quality measurement is the percentage of intramuscular fat (%IMF), which is determined through a chemical extraction process. The %IMF is the ratio of the mass of intramuscular fat, which is distributed throughout the meat, to the total mass of the ribeye area. This measurement is different from the fat content of the carcass because it does not include the backfat (fat underneath the skin).

To determine these quality measures, the animal is usually slaughtered and the carcass is sent to be graded. However, the producer may want to determine the %IMF and marbling grade while the animal is still alive for the purposes of selective breeding. A number of different methods have been employed to predict this information from ultrasound images of the ribeye area [1], [2], [3], with varying results. This investigation was undertaken to find methods of determining the quality measures from ultrasound images using a number of image texture measurements.

II. DATA COLLECTION

To predict the quality measures of each steer, an ultrasound image was captured before slaughter by a technician at the University of Guelph experimental farm. Each of the steers was led through a chute where the tag was read; the animal was then prepared for imaging by brushing away the excess fur and applying a layer of canola oil to the region to be scanned. The steer was then imaged above the 12th and 13th rib using an ultrasound scanner. Figure 1 shows the resulting image. In the image, the marbling can clearly be seen as speckling distributed throughout the meat.

The resulting image was stored with a filename corresponding to the tag number of the steer, and the steer was then sent to slaughter. After the steer had been slaughtered, the percentage of intramuscular fat and the marbling grade were measured. The intramuscular fat is determined using a chemical extraction process; the final mass of intramuscular fat is divided by the total mass of the ribeye area. The marbling grade is determined empirically after slaughter by grading visual components of the meat. The distribution of fat as well as the colour are used to determine the marbling score of each sample.

Fig. 1. Ultrasound image showing the area above the 12th and 13th rib

There were a total of 83 steers used to predict the %IMF and 75 steers used to predict the marbling grade. To train the prediction methods, 60 images were used for the %IMF and 50 were used for the marbling grade. There were different numbers of images for each of the quality measurements because the marbling grade measurements were not taken for some of the steers, while all of the steers had the %IMF measured after slaughter.

III. RELATIONSHIP BETWEEN PERCENTAGE INTRAMUSCULAR FAT (IMF) AND MARBLING GRADE

The %IMF and the marbling grade are measured using two very distinct processes. When the two measures of quality were compared, a moderate degree of correlation was found between them (ρ = 0.7023). This correlation is smaller than expected and demonstrates that there can be large variations between the marbling score and the percentage IMF. In Figure 2, the percentage IMF and the marbling grade are plotted against each other to show the variations between the two quality scores.

Fig. 2. Marbling grade vs. %IMF

IV. TEXTURE FEATURE CALCULATION

A number of different texture features were calculated for the prediction of marbling grades and the final %IMF. To calculate the features, a region of interest (ROI) above the 12th and 13th rib was selected. The ROI was then split into 32 × 32 pixel blocks, which were used to calculate the texture feature measures.
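As an illustration, the following is a minimal sketch in Python with NumPy (the paper does not specify an implementation language, and the function name roi_blocks is hypothetical) of splitting an ROI into such blocks:

import numpy as np

def roi_blocks(roi, size=32):
    # Split the ROI into non-overlapping size x size blocks,
    # discarding any partial blocks at the right and bottom edges.
    h, w = roi.shape
    return [roi[i:i + size, j:j + size]
            for i in range(0, h - size + 1, size)
            for j in range(0, w - size + 1, size)]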

A. Gray Level Co-Occurrence Matrix (GLCM)

The gray level co-occurrence matrix was calculated using the algorithm presented in [4]. The features calculated were the angular second moment, contrast, correlation, measures of entropy and measures of variance.
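One possible realization of these features is sketched below; the paper follows the algorithm in [4] directly, so the use of scikit-image here (version 0.19 or later for the graycomatrix spelling) and the 8-level quantization are assumptions:

import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(block, levels=8):
    # Quantize the block, then build a symmetric, normalized
    # co-occurrence matrix for four directions at distance 1.
    q = (block.astype(np.uint16) * levels // 256).astype(np.uint8)
    P = graycomatrix(q, distances=[1],
                     angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                     levels=levels, symmetric=True, normed=True)
    asm = graycoprops(P, 'ASM')                  # angular second moment
    contrast = graycoprops(P, 'contrast')
    correlation = graycoprops(P, 'correlation')
    entropy = -(P * np.log2(P + 1e-12)).sum(axis=(0, 1))
    # Haralick-style variance: spread of gray levels about the GLCM mean.
    i = np.arange(levels)[:, None, None]
    px = P.sum(axis=1)                           # marginal distribution
    mu = (i * px).sum(axis=0)
    variance = (((i - mu) ** 2) * px).sum(axis=0)
    return asm, contrast, correlation, entropy, variance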

B. Gray Run Length Matrix (GRLM)

The gray run length matrix was calculated using the algorithms presented in [5], [6] at the angles of 0°, 45°, 90° and 135°. The GRLM computation is symmetric, that is, it has the effect of calculating runs in both directions at each of the given angles, so these four angles were determined to be sufficient. Using the GRLM, the following features were calculated (a sketch of the run-length computation follows the list):

• run percentage
• gray level non-uniformity
• run length non-uniformity
• short and long run emphasis
• high and low gray level emphasis
• high and low gray level short and long run emphasis
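A minimal sketch of the run-length computation for the 0° direction follows; the other angles are obtained by traversing the block along columns and diagonals. The block is assumed to be pre-quantized to a small number of gray levels, and short run emphasis is shown as a representative feature:

import numpy as np

def run_length_matrix(block, levels=8):
    # glrlm[g, r-1] counts runs of gray level g with length r
    # along the horizontal (0 degree) direction.
    glrlm = np.zeros((levels, block.shape[1]), dtype=np.int64)
    for row in block:
        run_val, run_len = row[0], 1
        for v in row[1:]:
            if v == run_val:
                run_len += 1
            else:
                glrlm[run_val, run_len - 1] += 1
                run_val, run_len = v, 1
        glrlm[run_val, run_len - 1] += 1         # close the final run
    return glrlm

def short_run_emphasis(glrlm):
    j = np.arange(1, glrlm.shape[1] + 1)         # run lengths 1..N
    return (glrlm / j ** 2).sum() / glrlm.sum()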

C. Histogram

The histogram was calculated using 8 gray levels. The features calculated were the histogram coefficients, mean, variance, entropy and uniformity. Each of these features was calculated using the algorithms presented in [7], [8].
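A sketch of these histogram features, assuming 8-bit input pixels:

import numpy as np

def histogram_features(block, levels=8):
    # Normalized 8-bin histogram plus its summary statistics.
    hist, _ = np.histogram(block, bins=levels, range=(0, 256))
    p = hist / hist.sum()                        # histogram coefficients
    g = np.arange(levels)
    mean = (g * p).sum()
    variance = ((g - mean) ** 2 * p).sum()
    nz = p[p > 0]
    entropy = -(nz * np.log2(nz)).sum()
    uniformity = (p ** 2).sum()
    return p, mean, variance, entropy, uniformity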

D. Fast Fourier Transform (FFT)

The fast Fourier transform was calculated using the radix-2 algorithm [9]. The features calculated were (a sketch of the ring and wedge energies follows the list):

• energy and entropy of the regions, rings and wedges [10], [8]
• peak energy
• Laplacian of the peak
• standard deviation
• moment of inertia
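The ring and wedge energies can be sketched as below; the number of rings and wedges is an assumption, and the corresponding entropies follow analogously from the normalized energy in each region:

import numpy as np

def ring_wedge_energies(block, n_rings=4, n_wedges=4):
    # Energy of the centred power spectrum inside concentric rings
    # and angular wedges.
    F = np.fft.fftshift(np.fft.fft2(block))
    power = np.abs(F) ** 2
    h, w = block.shape
    y, x = np.indices((h, w))
    r = np.hypot(y - h // 2, x - w // 2)
    theta = np.arctan2(y - h // 2, x - w // 2) % np.pi  # spectrum is symmetric
    r_edges = np.linspace(0, r.max() + 1e-9, n_rings + 1)
    t_edges = np.linspace(0, np.pi, n_wedges + 1)
    rings = [power[(r >= r_edges[k]) & (r < r_edges[k + 1])].sum()
             for k in range(n_rings)]
    wedges = [power[(theta >= t_edges[k]) & (theta < t_edges[k + 1])].sum()
              for k in range(n_wedges)]
    return np.array(rings), np.array(wedges)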

E. Discrete Wavelet Transform (DWT)

The discrete wavelet transform was calculated using the 2D Haar wavelet algorithm [11]. The energies from each region of the transform were then calculated.
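A sketch of the subband energies, assuming the PyWavelets package as a stand-in for the algorithm in [11]:

import numpy as np
import pywt  # PyWavelets

def haar_energies(block):
    # Single-level 2D Haar transform; return the energy of the
    # approximation and the three detail subbands.
    cA, (cH, cV, cD) = pywt.dwt2(block.astype(float), 'haar')
    return np.array([(c ** 2).sum() for c in (cA, cH, cV, cD)])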

F. Discrete Cosine Transform (DCT)

The discrete cosine transform was calculated; the mean, variance and entropy were extracted over all regions, and the entropy and energy were extracted for each of the DCT regions [12].
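A sketch of the global DCT statistics, assuming SciPy's dctn; the per-region entropy and energy follow by applying the same statistics to sub-blocks of the coefficient array:

import numpy as np
from scipy.fft import dctn

def dct_features(block):
    # 2D DCT of the block, then mean, variance and entropy of the
    # normalized coefficient magnitudes.
    C = dctn(block.astype(float), norm='ortho')
    p = np.abs(C) / np.abs(C).sum()
    nz = p[p > 0]
    entropy = -(nz * np.log2(nz)).sum()
    return C.mean(), C.var(), entropy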

G. Autoregressive (AR) Model

The autoregressive model was calculated using the least mean square (LMS) algorithm. Each of the coefficients was used as a feature, along with the resulting mean square error of the predicted image and the variance of the error [8].
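A minimal sketch of a causal AR model fitted with the LMS update rule; the three-neighbour support and the step size are assumptions:

import numpy as np

def ar_lms(block, mu=0.05):
    # Predict each pixel from its left, upper and upper-left
    # neighbours; adapt the coefficients with LMS.
    img = block.astype(float) / 255.0            # normalize for stability
    w = np.zeros(3)
    errors = []
    for i in range(1, img.shape[0]):
        for j in range(1, img.shape[1]):
            x = np.array([img[i, j - 1], img[i - 1, j], img[i - 1, j - 1]])
            e = img[i, j] - w @ x                # prediction error
            w += mu * e * x                      # LMS weight update
            errors.append(e)
    errors = np.array(errors)
    return w, (errors ** 2).mean(), errors.var()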

H. Aggregating the Feature Set

As mentioned above, each ROI was split into 32 × 32 pixel blocks (6 in total for each image). To aggregate the results for each of the features, four different measures were selected: the mean, standard deviation, maximum and minimum values for each set were calculated, leading to a total of 424 features for each image. A large number of these features are closely related, and therefore a feature selection algorithm was needed to reduce the number of features to be used for classification.
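The aggregation step can be sketched as follows, with one row of per-block features per block:

import numpy as np

def aggregate(block_features):
    # block_features: (n_blocks, n_features); return the mean, standard
    # deviation, maximum and minimum of each feature over the blocks.
    f = np.asarray(block_features, dtype=float)
    return np.concatenate([f.mean(axis=0), f.std(axis=0),
                           f.max(axis=0), f.min(axis=0)])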

Furthermore, some of the features extracted and aggregated were not accurate measures for prediction of the quality measures. For instance, the FFT peak angle was nearly always the same for all of the images. This is because the peak angle measures the angle from the center of the FFT to the peak: the images have a large amount of energy contained in the lower frequencies, and therefore the peak was always found in the center of the image.

V. FEATURE SELECTION

Two methods of feature selection were used to capture the relevant features for classification and prediction. The first method selected the features that had the highest correlations to the predicted value. The maximum correlations, and the features that have the maximum correlations, are shown for both the prediction of grades and %IMF in Table I.

TABLE I
MAXIMUM CORRELATIONS FOR MARBLING GRADE AND %IMF

Measure          Feature                            Correlation (ρ)
IMF              mean of histogram coefficient 1    0.6559
Marbling grade   mean of histogram coefficient 1    0.5907

The largest 10 correlations of each set are shown in Table II. The standard deviation, mean, maximum and minimum had nearly the same degree of correlation to the results for each image. As a result, the mean over the six blocks in the ROI was used for the correlated features.
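A sketch of this correlation-based selection, returning the indices of the k most strongly correlated features:

import numpy as np

def top_k_by_correlation(X, y, k=10):
    # Pearson correlation of each feature column of X with the target y.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    rho = (Xc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()))
    return np.argsort(-np.abs(rho))[:k], rho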


TABLE II
TOP 10 CORRELATIONS

%IMF:
• Histogram coefficient [1]
• GRLM short run high gray level emphasis [45°]
• AR coefficient [3]
• GRLM short run high gray level emphasis [90°]
• GRLM short run high gray level emphasis [135°]
• GRLM short run low gray level emphasis [45°]
• Histogram mean
• GRLM low gray level run emphasis [0°]
• GRLM long run low gray level emphasis [135°]
• GLCM sum average

Marbling grade:
• Histogram coefficient [1]
• AR coefficient [3]
• DCT region entropy [3]
• DCT entropy
• GRLM long run low gray level emphasis [135°]
• GRLM low gray level run emphasis [0°]
• GRLM low gray level run emphasis [135°]
• GRLM short run low gray level emphasis [45°]
• GRLM low gray level run emphasis [90°]
• GRLM low gray level run emphasis [45°]

Examples of the data with respect to the IMF and marbling grades are shown in Figures 3 and 4. In the figures it can be seen that the texture features are not separable with respect to the different quality measures. This presents a significant challenge, as the data must be separable for the prediction and classification algorithms used.

Fig. 3. Example data set for correlation feature selection (%IMF). Axes: histogram coefficient 1, GRLM short run low gray level emphasis [45°] and AR coefficient 3; points are grouped into five %IMF ranges.

Fig. 4. Example data set for correlation feature selection (marbling grades). Axes: histogram coefficient 1, AR coefficient 3 and DCT region entropy 3; points are grouped by marbling grade.

The second method of feature selection used is a Sequential Forward Selection (SFS) and Sequential Backward Selection (SBS) based on F-statistic tests [13], [14]. At each iteration of the algorithm, the most statistically significant feature is added, or the least statistically significant feature is removed from the current set. At the final iteration, the algorithm will have determined the features that are most significant with a small amount of overlap; that is, the resulting features will be nearly orthogonal to each other, which helps make the problem separable. The number of features identified is a function of the algorithm, as the algorithm will add and remove statistically significant features until there are no more features that meet the minimum requirements for inclusion (a sketch of the forward step appears below). For the %IMF prediction, 13 features were included in the model, and 9 features were included in the marbling grade prediction. The features for the %IMF prediction are shown in Table III and the features included in the grades prediction are shown in Table IV.
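A sketch of the forward step, using the partial F statistic of a linear model; the inclusion threshold is an assumption, and the backward step removes the feature with the smallest partial F in the same manner:

import numpy as np

def forward_select(X, y, f_in=4.0):
    # Add the candidate with the largest partial F statistic until no
    # remaining candidate exceeds the inclusion threshold f_in.
    n = len(y)
    selected, remaining = [], list(range(X.shape[1]))

    def rss(cols):
        A = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        r = y - A @ beta
        return (r ** 2).sum()

    current = rss([])
    while remaining:
        f_best, c_best, rss_best = 0.0, None, None
        for c in remaining:
            new = rss(selected + [c])
            dof = n - len(selected) - 2          # residual degrees of freedom
            f = (current - new) / (new / dof)
            if f > f_best:
                f_best, c_best, rss_best = f, c, new
        if f_best < f_in:
            break
        selected.append(c_best)
        remaining.remove(c_best)
        current = rss_best
    return selected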

TABLE III
FEATURES INCLUDED IN %IMF PREDICTION

• GRLM short run high gray level emphasis [135°] (mean)
• Histogram coefficient [3] (mean)
• DWT energy [1] (mean)
• AR coefficient [2] (mean)
• GLCM sum average (std)
• GRLM run length nonuniformity [45°] (std)
• GRLM long run emphasis [90°] (std)
• GRLM short run high gray level emphasis [90°] (std)
• GRLM long run emphasis [135°] (std)
• FFT ring entropy [0] (std)
• Histogram coefficient [6] (std)
• DWT energy [0] (std)
• AR coefficient [3] (std)

TABLE IV
FEATURES INCLUDED IN MARBLING GRADE PREDICTION

• AR coefficient [2] (mean)
• AR coefficient [3] (mean)
• GLCM sum average (std)
• GRLM low gray level run emphasis [0°] (std)
• GRLM low gray level run emphasis [90°] (std)
• GRLM long run emphasis [135°] (std)
• GRLM high gray level run emphasis [135°] (std)
• Histogram uniformity (max)
• DWT energy [1] (min)

From Tables III and IV it can be seen that the AR coefficients and the GRLM measurements are significant for predicting both the %IMF and the marbling grade. The autoregressive process is significant because it measures the correlations between the pixels, extracting the correlated signals from the image while leaving uncorrelated white noise. The correlated signals of the image can be classified as one texture feature, while the uncorrelated white noise is another. The coefficients of the AR process will be unique for different textures in the image and therefore are statistically significant to the prediction process.

The gray run length matrix (GRLM) measures the distribution of pixels that have the same value and are adjacent to each other in an image. The run number is calculated by finding the number of connected pixels that have the same gray value. From this matrix, a number of different features are calculated that describe the distribution. Marbling grade is determined from the distribution of fat in the ribeye, and the distribution of fat is shown as speckles on the ultrasound image. Therefore the GRLM measures will be significant features in the prediction of both of the quality measures.

Both the FFT and DWT features are significant because they measure the frequency components of the image. The frequency distributions may measure the speckling in the ultrasound image and may therefore be significant in prediction.

The histogram features measure the distribution of pixel values in an image. Because the fat areas of the meat appear at different gray levels than the lean meat in the ROI, the histogram indicates the distribution of fat in the image, and these features are included in the final feature set.

Example data sets are shown in Figures 5 and 6 for the prediction of %IMF and marbling grades, respectively. Both figures show that for the three features selected there is a higher degree of separability than for the features found using the correlation method. The second method was therefore chosen to select the features used in the prediction of the quality measures.

Fig. 5. Sample data for prediction of %IMF. Axes: histogram coefficient [3], AR coefficient [3] and GRLM run length nonuniformity [45°]; points are grouped into five %IMF ranges.

Fig. 6. Sample data for prediction of marbling grades. Axes: AR coefficient [3], GRLM low gray level run emphasis [0°] and GRLM high gray level run emphasis [135°]; points are grouped by marbling grade.

VI. PREDICTION ALGORITHMS

The prediction of the marbling grade was done using four different classification and prediction methods. The first method used was the support vector machine (SVM) [?]. To train the SVM, a number of vectors (support vectors) are calculated that separate the classes, and these vectors are then used to define the separation between classes. As the separating vector is linear, it is used as the input to a kernel, which can be any suitable mathematical function. Through experimentation it was found that the polynomial kernel was the most successful at predicting the quality measures, rather than the linear or radial basis function kernels. Classification and training using the SVM was done using the libsvm package [15].
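The paper calls libsvm directly; a minimal sketch with scikit-learn's libsvm wrapper (the data names and values here are placeholders, not the paper's data) looks like:

import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import MinMaxScaler

X = np.random.rand(50, 9)                 # selected texture features (placeholder)
y = np.random.randint(2, 7, size=50)      # marbling grades (placeholder)

scaler = MinMaxScaler()                   # scale inputs so training converges
svm = SVC(kernel='poly', degree=2)        # degree-2 polynomial kernel
svm.fit(scaler.fit_transform(X), y)
pred = svm.predict(scaler.transform(X))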

A linear neural network (LNN) and a multilayer perceptron network (MLP) were also used to classify and predict the quality measures [?]. The number of layers and the number of neurons in the network were determined from the number of data sets, with each layer having half the number of neurons of the previous layer; the number of layers is therefore equivalent to log2(N). A back-propagation method was used for training the networks. For the LNN, the output activation function was a linear activation function, while the MLP used a sigmoidal activation function. For both networks, the activation function of the hidden layers is a sigmoid. The implementation was accomplished using the Fast Artificial Neural Network (FANN) library [16].
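The paper implements these networks with FANN; as an illustration of the halving rule, a sketch with scikit-learn's MLPRegressor as a stand-in (one reading of the rule, starting the widths from the number of training samples N) is:

from sklearn.neural_network import MLPRegressor

N = 50                                    # number of training samples
widths, w = [], N
while w > 1:
    widths.append(w)                      # each layer half the previous,
    w //= 2                               # giving roughly log2(N) layers

mlp = MLPRegressor(hidden_layer_sizes=tuple(widths),
                   activation='logistic', # sigmoidal hidden units
                   solver='sgd', max_iter=10000)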

Finally, a recursive least squares (RLS) filter was used to predict the quality measures. The RLS was calculated using the algorithms presented in [17]. The RLS filter uses a least squares method to minimize the output error. The learning rate (forgetting factor) in the RLS is used to 'forget' previous data samples so that the RLS algorithm can model a changing system. The prediction of the quality measures is assumed to be a time invariant system, so the learning rate was set to 1.0, the value for which every input is weighted the same.
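A sketch of the standard RLS recursion from [17], with the forgetting factor lam fixed at 1.0 as in the paper:

import numpy as np

def rls_fit(X, y, lam=1.0, delta=0.1):
    # One pass of recursive least squares over the training samples;
    # delta initializes the inverse correlation matrix P.
    w = np.zeros(X.shape[1])
    P = np.eye(X.shape[1]) / delta
    for x, d in zip(X, y):
        k = P @ x / (lam + x @ P @ x)     # gain vector
        e = d - w @ x                     # a priori error
        w = w + k * e
        P = (P - np.outer(k, x @ P)) / lam
    return w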

The inputs (feature set) and the outputs (quality measures) for all of the prediction methods were scaled to a range where the algorithms would be able to converge. The output was then rescaled to its original range after prediction.

To determine which of the prediction methods are more accurate, three measurements of the error were taken: the maximum, the minimum and the root mean square error (RMSE). The correlation between the predicted value and the known value was also taken to ensure that the results were consistent with the known values.
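These error measures can be computed directly:

import numpy as np

def error_metrics(pred, known):
    # RMSE, maximum and minimum absolute error, and the correlation
    # between the predicted and known values.
    err = np.abs(pred - known)
    rmse = np.sqrt(((pred - known) ** 2).mean())
    rho = np.corrcoef(pred, known)[0, 1]
    return rmse, err.max(), err.min(), rho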

VII. PREDICTION OF MARBLING GRADE

The selected features were used to classify the marbling grades with the four prediction networks. The results for the optimal parameters of each network are shown in Table V.

TABLE V
RESULTS FOR MARBLING GRADE CLASSIFICATION

Network   RMSE    MAX(ERR)   MIN(ERR)   ρ(result)   p(t-test)
SVM       0.58    1.50       0          0.67        0.60
LNN       0.43    0.97       0          0.82        0.78
MLP       0.44    1.07       0.0054     0.82        0.93
RLS       21.90   51.67      1.76       0.27        0.19

The results show that for the LNN, MLP and SVM, the marbling grade is usually predicted to within one half of a grade. For the LNN and MLP, the marbling grade was always predicted within one grade, and the SVM predicted the marbling grade to within 1.5 grades. The RLS filter did not appear to predict the marbling grade effectively. The correlation coefficients for the prediction methods show that the results from the SVM, LNN and MLP are statistically significant, while the correlation coefficient for the RLS shows that its results are not. Similarly, a t-test was performed to test whether the results and the known marbling grades were from a normal distribution with similar means. The p-values for the t-test are shown in Table V and show that for the LNN and MLP the results are very closely related to the known marbling grade, while for the SVM they are moderately related and for the RLS they are not closely related at all.

A. Support Vector Machine (SVM)

The SVM was tested by varying one parameter at a time. Different kernel types were tested: the linear kernel, the radial basis function kernel and the polynomial kernel. The degree of the polynomial kernel was varied between 1 and 5. However, the polynomial kernel was the only one to converge for the given training set. Figure 7 shows that the most accurate degree to use is 2, which creates a parabolic kernel for the support vector. When the other parameters of the SVM, such as the cost of constraint violation parameter or the minimum error (eps), were varied, the prediction result did not change.

Fig. 7. SVM RMSE with a varying degree of the polynomial kernel

The SVM was the second least accurate of the predictors for the marbling grade. The final RMSE was found to be 0.5845, with a maximum error of 1.5 and a minimum error of 0. The correlation of the output to the desired result was found to be 0.6707. Even though this is larger than the maximum correlation between any single feature and the quality measure, the correlation of the final output should be much higher; the output is still not linearly related to the known values. When compared to the results of the prediction of the %IMF and the correlation between the %IMF and the marbling grade, this result is consistent.

B. Linear Neural Network (LNN)

The LNN was the most accurate of the prediction methods. Two parameters were used to test the LNN: the learning rate and the minimum error at which to stop. The minimum error was varied between 10⁻¹ and 10⁻³. After the optimal minimum error value was found, the learning rate was varied between 10⁻² and 10⁻⁴. The RMSE of the output for the minimum error parameter is shown in Figure 8 and the correlation between the output and the desired output is shown in Figure 9. During training, the data set was presented a maximum of 10000 times; it was found that any value larger than 10000 did not affect the output significantly.

Fig. 8. LNN RMSE with a varying minimum error

Fig. 9. LNN correlation between output and desired output

From the figures it can be seen that there is a maximum correlation and a minimum RMSE found using the LNN. The optimal minimum error was found to be 0.0437, because the RMSE is the smallest and the correlation is the highest when this value was chosen. When the network was being trained, the calculated error reached a minimum, but that value was larger than the minimum error parameter; this meant that the network did not converge using a smaller minimum error. When the learning rate was changed, the final RMSE and correlation did not change because the network still did not converge to the desired minimum error.

C. Multilayer Perceptron Network (MLP)

The MLP was the second most accurate of the networks and its results were very similar to those of the LNN. The parameters varied were the same as those varied when testing the LNN. The RMSE found using the varying minimum error and the correlation between the output and desired output are shown in Figures 10 and 11, respectively.

Like the LNN, the MLP converged to a minimum error. The minimum error parameter for the MLP that corresponded to the smallest RMSE was 0.0042. As with the LNN, the learning rate had very little effect on the final result after the minimum error was found.

D. Recursive Least Squares (RLS) filter

The RLS filter was found to be the least accurate of the predictors. The RLS filter uses a least-squares algorithm to find the filter coefficients that will predict the desired results. To test the RLS, the parameter δ, which is used to initialize the algorithm, was varied. However, this parameter does not affect the final result. In Figure 12, the results of the RLS filter are plotted against the known marbling grades to show that the results of the RLS filter are neither accurate nor precise. The RLS filter should not be used to predict marbling grades because of the inaccurate and inconsistent results obtained.


Fig. 10. MLP RMSE with a varying minimum error

Fig. 11. MLP correlation between output and desired output with a varying stop error


Fig. 12. RLS results compared to the known marbling grade

VIII. PREDICTION OF %IMF

The same network types were used to predict the %IMF as were used to predict the marbling grade. The results from the predictions using the optimal parameters are shown in Table VI.


TABLE VI
RESULTS FOR PREDICTING %IMF

Network   RMSE     MAX(ERR)   MIN(ERR)    ρ(result)   p(t-test)
SVM       0.0267   0.0560     0.1828e-3   0.6173      0.9530
LNN       0.0136   0.0282     0.0027e-3   0.9172      0.9609
MLP       0.0136   0.0261     0.0404e-3   0.9149      0.9225
RLS       0.0137   0.0288     0.0926e-3   0.9155      0.9537

The results show that for each of the prediction methods, the %IMF can be predicted to within 1.4% of the final %IMF on average. At maximum, the %IMF will be predicted to within 2.9% of the known %IMF, and the output of the predictors is very highly correlated to the known %IMF. The correlation coefficients show that the results are strongly related to the known %IMF for the LNN, MLP and RLS filter, while the results from the SVM are weakly related. The t-test was performed to find whether the results and the known values were from similar normal distributions. From Table VI, it can be seen that the results for all of the networks are strongly related to the known %IMF, even though the results from the SVM have a smaller correlation to the known results.

A. Support Vector Machine (SVM)

The SVM was tested using four different values for the degree of the kernel polynomial, ranging from 2 to 5. As with the SVM applied to the marbling grade, other kernels were tested but the algorithm did not converge; values of the degree larger than 5 also caused the algorithm not to converge. Figure 13 shows that the kernel with a degree of 3 had the smallest RMSE, and Figure 14 shows that it also had the largest correlation between the output and the known %IMF.

Fig. 13. RMSE of SVM with varying degree of the polynomial kernel

The SVM was the least accurate of the predictors, with the highest maximum error, minimum error and RMSE and the lowest correlation. Because the SVM predicts the results using discrete classes, it can suffer from an error similar to quantization error, where the final predicted value may be classified as the mean value of the class even when the input data places it closer to the extremes of the class.

B. Linear Neural Network (LNN)

The LNN was found to be the second most accurate network. Its maximum error and RMSE were the smallest, and the predictor results had the highest correlation to the output. Again, the minimum error parameter was varied; the resulting RMSE is shown in Figure 15 and the correlation between the output and the known %IMF is shown in Figure 16.


Fig. 14. Correlation of SVM output to the desired result with varying degree


Fig. 15. RMSE of LNN with varying stop error rate

The optimal parameter for the maximum correlation and minimum RMSE was found to be 0.0219, and the results shown in Table VI show that the LNN is very accurate. As with the LNN used to predict the marbling grades, the error reached a minimum and the LNN did not converge any further.

C. Multilayer Perceptron Network (MLP)

The MLP network was tested by varying the minimum error parameter. The RMSE of each test is plotted in Figure 17 and the correlation between the output and the known %IMF is plotted in Figure 18. The optimal parameter for the minimum error was found to be 0.0126. Any minimum error parameter value smaller than the optimal parameter had little effect, as the algorithm could not converge to a smaller error rate. As with the test for the marbling grade, the learning rate had little effect on the final result.

With the optimal parameter selected, the MLP had the second smallest RMSE, the smallest maximum error and the second smallest minimum error. The correlation between the output and the known values was similar to that of the LNN and RLS prediction methods.

Fig. 16. Correlation between output and desired result with varying stop error rate

Fig. 17. RMSE of MLP with varying stop error rate


D. Recursive Least Squares (RLS) filter

To test the RLS filter, the δ parameter, which is used to initialize the algorithm, was varied between 0.1 and 0.001. As shown in Figure 19, the parameter does affect the results, although the effect is not significant. The optimal δ was found to be 0.1; with this parameter the RLS algorithm had the smallest RMSE and a correlation between the output and the known %IMF that closely matched the LNN and MLP networks.

The output of the RLS filter is plotted against the known results in Figure 20, which shows that the results very closely match the known values.

IX. CONCLUSIONS

Each of the four networks gave different results for predicting the quality measures. The LNN and the MLP consistently provided accurate results, while the SVM consistently provided less accurate results. The RLS algorithm, however, provided accurate results for the %IMF prediction and very inaccurate results for the marbling grade prediction. The optimal parameters were found for each of the prediction methods and, through the application of the same method, can be found again if the training set were to change. The features that were selected by the SFS and SBS algorithms can be used by the prediction algorithms to provide excellent results.


Fig. 18. Correlation between output and desired result with varying stop error rate

Fig. 19. RMSE of RLS with varying δ


From Tables V and VI, it can be seen that the LNN and MLP networks give results that are significantly related to the known values of marbling grade and %IMF, and the RLS provides results that are significantly related to the known value of the %IMF. The results for the SVM are less strongly related to the known values of the marbling grade and %IMF.

With the inclusion of more samples, which would involve more steers being scanned and the beef quality measures taken, the prediction algorithms may become more accurate. Similarly, more samples per steer could be taken, which may increase the accuracy of the training algorithms. The relative accuracy of the networks might also change as the number of samples is increased, as each network has a different learning algorithm. The SVM in particular is likely to become more accurate with a larger number of training samples.

REFERENCES

[1] J. R. Brethour, "Estimating marbling score in live cattle from ultrasound images using pattern recognition and neural network procedures," Journal of Animal Science, no. 72, pp. 1425–1432, 1994.

[2] W. Herring, L. Kriese, J. Bertrand, and J. Crouch, "Comparison of four real-time ultrasound systems that predict intramuscular fat in beef cattle," Journal of Animal Science, 1998.

Fig. 20. Output of the RLS filter compared to the known %IMF

[3] A. Hassen, D. Wilson, V. Amin, G. Rouse, and C. Hays, "Predicting percentage of intramuscular fat using two types of real-time ultrasound equipment," Journal of Animal Science, no. 79, pp. 11–18, 2001.

[4] R. M. Haralick, K. Shanmugam, and I. Dinstein, "Textural features for image classification," IEEE Transactions on Systems, Man and Cybernetics, vol. 3, no. 6, pp. 610–621, November 1973.

[5] X. Tang, "Texture information in run-length matrices," IEEE Transactions on Image Processing, vol. 7, no. 11, pp. 1602–1609, November 1998.

[6] M. M. Galloway, "Texture analysis using gray level run lengths," Computer Graphics and Image Processing, vol. 4, pp. 172–179, 1975.

[7] M. Stricker and M. Orengo, "Similarity of color images," Storage and Retrieval for Image and Video Databases III, vol. 2420, no. 1, pp. 381–392, 1995.

[8] Q. Huang, "Assisted analysis of ultrasound tendon images based on neural network segmentation," Master's thesis, University of Guelph, 2004.

[9] E. Chu and A. George, Inside the FFT Black Box. CRC Press, 2000.

[10] M. Jernigan and F. D'Astous, "Entropy-based texture analysis in the spatial frequency domain," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 6, no. 2, pp. 237–243, March 1984.

[11] P. J. V. Fleet, Discrete Wavelet Transformations. Wiley, 2008.

[12] Z. Dokur and T. Olmez, "Segmentation of ultrasound images by using a hybrid neural network," Pattern Recognition Letters, vol. 23, pp. 1825–1836, 2002.

[13] J. Kittler, "Feature selection and extraction," in Handbook of Pattern Recognition and Image Processing. Academic Press, Inc., 1986, ch. 3.

[14] K. Fukunaga, "Statistical pattern classification," in Handbook of Pattern Recognition and Image Processing. Academic Press, Inc., 1986, ch. 1.

[15] C.-C. Chang and C.-J. Lin, LIBSVM: a library for support vector machines, 2001.

[16] S. Nissen, "Implementation of a fast artificial neural network library (FANN)," Department of Computer Science, University of Copenhagen (DIKU), Tech. Rep., 2003.

[17] S. S. Haykin, Adaptive Filter Theory, 3rd ed. Prentice Hall, 1996.

