
ORIGINAL PAPER

Multivariate image analysis of a set of FTIR microspectroscopy images of aged bovine muscle tissue combining image and design information

A. Kohler & D. Bertrand & H. Martens & K. Hannesson & C. Kirschner & R. Ofstad

Received: 31 March 2007 / Revised: 27 May 2007 / Accepted: 1 June 2007 / Published online: 17 July 2007
© Springer-Verlag 2007

Abstract In this paper we present an algorithm for analysing sets of FTIR microscopic images of tissue sections. The proposed approach allows one to investigate sets of many FTIR tissue images both with respect to sample information (variation from image to image) and spatial variations of tissues (variation within the image). The algorithm is applied to FTIR microscopy images of beef loin muscles containing myofibre and connective tissue regions. The FTIR microscopy images are taken of sub-samples from five different beef loin muscles that were aged for four different lengths of time. The images were investigated regarding variation due to the ageing length and due to the homogeneity of the connective tissue regions. The presented algorithm consists of the following main elements: (1) pre-processing of the spectra to overcome large quality differences in FTIR spectra and differences due to scatter effects, (2) identification of connective tissue regions in every image, (3) labelling of every connective tissue spectrum with respect to its location in the connective tissue region, and (4) analysis of variations in the FTIR microscopic images in regard to ageing time and pixel position of the spectra in the connective tissue region. Important spectral parameters characterising collagen and proteoglycan structure were determined.

Keywords FTIR imaging . Pre-processing . Standardisation . Extended multiplicative signal correction . ANOVA PLSR . Hyperspectral imaging . PCA . Connective tissue

Introduction

FTIR microspectroscopic imaging is a powerful, non-destructive technique extensively used in biological sciences to study the biochemical composition of microscopic tissue sections in a large number of samples, as for example in pharmaceutical science for the investigation of penetration enhancers [1], in medical science for the investigation and identification of pathological tissues [2, 3], for the study of distribution of collagen deposition in cardiac extracellular matrix [4, 5] and collagen cross-links in mineralised tissue [5–8] or recently also in food science for the investigation of protein secondary structure changes of meat tissue samples [9].

Anal Bioanal Chem (2007) 389:1143–1153
DOI 10.1007/s00216-007-1414-9

A. Kohler (*) : H. Martens : K. Hannesson : C. Kirschner : R. Ofstad
Center for Biospectroscopy and Data Modelling, Norwegian Food Research Institute, Matforsk, Osloveien 1, 1430 Ås, Norway
e-mail: [email protected]

A. Kohler : H. Martens
CIGENE Center for Integrative Genetics, Norwegian University of Life Sciences, 1432 Ås, Norway

A. Kohler
Department of Mathematical Sciences and Technology (IMT), Norwegian University of Life Sciences, 1432 Ås, Norway

D. Bertrand
Unité de Sensométrie et de Chimiométrie, ENITIAA, BP 82225, 44322 Nantes Cedex 3, France

H. Martens
IKBM, Norwegian University of Life Sciences, 1432 Ås, Norway

H. Martens
University of Copenhagen, Faculty of Life Science, Rolighedsvej 30, 1958 Frederiksberg, Denmark

The advantage of FTIR microscopy is that the sample preparation is easy, since no staining of the samples is necessary, and that the acquisition of FTIR images is fast (up to a few minutes per image). While staining techniques allow only one or a few components to be analysed at the same time, FTIR microscopy yields an overall chemical fingerprint giving more or less detailed information about several macromolecules. On the other hand the treatment of FTIR imaging data is difficult due to the huge amount of data and information obtained: a single FTIR image contains several thousand image points (pixels) with a full spectrum of up to several thousand wavenumbers in every pixel. The analysis of a whole set of FTIR images faces several important data analytical problems: (1) there are often large quality differences between spectra due to quality differences of detector pixels, regions with low intensities and scatter effects such as surface effects on tissue sections and differences in the optical path length, which make a filtering of spectra including pre-selection and pre-processing very important [10, 11]; (2) FTIR tissue images often contain information from more than one tissue type and there are local spatial variations in every tissue type; (3) investigating sets of inhomogeneous images, where every image corresponds to certain design variables or to certain sample information, requires an analysis that involves both design/sample information, characterising every single image (global image information), and spatial variation, taking into account inhomogeneity within every image (local image information). This requires stable models for the standardisation of images in order to make many images comparable, since several parameters such as intensity of the light source and sample thickness can vary within and between images.

In this paper an algorithm for the analysis of large sets of FTIR microscopy images is presented. As an example 113 FTIR microscopic images of Longissimus dorsi muscles were used. The images contained regions with myofibre cells and connective tissue. The algorithm includes a quality test and pre-processing/standardisation of FTIR images, segmentation of connective tissue regions on every image and calculation of parameters describing the spectrum position in the connective tissue region (pixel distance to the edge of the connective tissue region). The obtained set of connective tissue spectra is then analysed by an analysis of variance partial least-squares regression (ANOVA PLSR). An important feature of the approach is that design information (ageing time and animals) is mixed with spatial information (position of spectrum in the connective tissue region) in the analysis of variance. Following these lines important connective tissue parameters could be investigated with respect to the homogeneity of the connective tissue regions and ageing time. The approach can easily be adapted to the investigation of spatial variations in other tissue types, e.g. gradients in aortic tissues [12]. To our knowledge this is the first time that a systematic analysis of a large set of FTIR images has been performed automatically, i.e. manual selection of a few image spectra on every image is not required.

As an example FTIR microscopy images of bovine Longissimus dorsi muscles are studied. Muscle is a composite structure comprised of contractile myofibres attached by connective tissue, and the properties of both the myofibres and the connective tissue are important for the texture of meat and fish. The arrangement of the structural components is similar in all myofibres in both mammalian and fish skeletal muscles. In contrast, connective tissue is a very heterogeneous tissue with respect to both the arrangement and type of collagen fibres and to the biochemical composition and structure of the matrix proteoglycans. In order to understand textural variations in different raw materials and the ageing mechanisms, better knowledge of the structure and biochemical composition of the connective tissue has to be gained. So far analyses of the connective tissue components, collagens and proteoglycans, are mainly based on tedious biochemical methods using strong denaturing agents or proteolytic enzymes to disrupt the strong interactions between the molecules. We have in previous studies used this approach on meat samples and reported degradation of proteoglycans (aggrecan-like and decorin) during post mortem storage [13, 14]. The present study is a first step towards more rapid and reliable methods for monitoring connective tissue heterogeneity in meat and fish muscles.

Materials and methods

Meat samples for microscopy

Longissimus dorsi muscles from five Norwegian Red Cattle obtained from a local slaughterhouse were used. The muscles were excised from the carcasses 45 min post mortem. Muscle blocks (5 mm×5 mm×2 mm) for FTIR microscopy at time 0 were excised immediately from a standardized location in the meat slice. The blocks were embedded in optimal cutting temperature (OCT) compound (Tissue-Tek, Electron Microscopy Sciences, Hatfield, USA), frozen in liquid N2 and stored at −80 °C until sectioning. The rest of the muscle was packed in polyethylene bags under slight vacuum and aged at 4 °C for 2, 7 and 21 days according to the described procedure for meat used before physical measurements [15]. Ageing muscle blocks were then excised as described for the time 0 samples. The samples were sectioned at −22 °C transversally to the fibre direction. A cryostat (Leica CM 3050 S, Nussloch, Germany) was used, and 8-μm-thick sections were prepared and thaw-mounted on infrared-transparent 2-mm-thick CaF2 slides for FTIR microscopic measurements. The sections were finally freeze-dried and stored under dry conditions.

Acquisition of infrared spectra/images

An IR microscope (IRscope II) coupled to an Equinox 55 FTIR spectrometer (both from BRUKER OPTICS, Germany) was used to measure the tissue sections. The microscope was equipped with a computer-controlled x,y stage. The Bruker system was controlled with an IBM-compatible PC running OPUS-NT software, version 4.0. An FTIR imaging system, employing a focal plane array detector with 64×64 pixels, was used to measure sample areas containing connective tissue and myofibres with a nominal lateral resolution of 4 μm pixel−1. For the FTIR imaging the spectrometer was run in step-scan mode and 16 counts were co-added in every mirror position. Absorbance spectra were recorded in the range from 3,800 to 900 cm−1 with a spectral resolution of 8 cm−1. The microscope, which was sealed using a specially designed box, and the spectrometer were purged with dry air to reduce spectral contributions from water vapour and CO2. A background spectrum/image of the CaF2 substrate was recorded before each sample measurement in order to account for variations in water vapour and CO2 levels. For each of the five animals and four different ageing times, we analysed two slices, resulting in a total of 5×4×2=40 slices. From each slice we obtained one section and we acquired three infrared images from every section, resulting theoretically in 120 infrared images. Of the 120 images, seven had to be removed due to corrupt slices or corrupt measurements, resulting in 113 images with 4,096 spectra in each image (462,848 spectra in total).
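The image and spectrum counts quoted above follow directly from the design. As a quick check, the arithmetic can be written out in a few lines of Python (the variable names are illustrative only; the authors worked in Matlab):

```python
# Design described in the text: animals x ageing times x slices, three images per section.
animals = 5
ageing_times = 4
slices_per_condition = 2
images_per_section = 3

slices = animals * ageing_times * slices_per_condition   # one section per slice
planned_images = slices * images_per_section
usable_images = planned_images - 7                       # seven images were removed
spectra_per_image = 64 * 64                              # 64 x 64 focal plane array
total_spectra = usable_images * spectra_per_image

print(slices, planned_images, usable_images, spectra_per_image, total_spectra)
```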

Theory

General strategy

The general strategy for the analysis of the FTIR spectra is given below and is illustrated by the flow chart in Fig. 1. The numbering used throughout the paper will refer to the flow chart.

(I) Quality test

A quality test was performed on every spectrum of the image set.

(II) Segmentation of connective tissue regions

(a) A representative subset of spectra (set A, containing 1,130 spectra) was selected using all images. Set A contains myofibre and connective tissue spectra that passed the quality test.

(b) Set A was pre-processed by a physical extended multiplicative signal correction (EMSC model A) to remove baseline shifts, multiplicative and wavenumber-dependent scatter effects.

(c) A principal component analysis (PCA) was performed on set A and a segmentation model based on PCA (PCA model A) was constructed by inspecting score images and using a priori knowledge about discriminative spectral bands for connective tissue and for myofibre cells, respectively.

(d) The EMSC model A and subsequently the PCA model A for segmentation were applied to all 113 images. Connective tissue spectra and myofibre cell spectra were separated.

(III) Study connective tissue

(a) For the connective tissue spectra distance groups were calculated, i.e. every connective tissue pixel/spectrum was characterised by its distance to the edge of the connective tissue.

(b) Set B consisting of 2,697 connective tissue spectra with representative spectra from all images and all distance groups was chosen.

(c) Set B was pre-processed by a physical EMSC (EMSC model B, model for connective tissue spectra only).

(d) Set B was analysed by ANOVA PLSR by mixing the spatial variation and design.

(e) The spatial distribution of important collagen and proteoglycan parameters identified in the previous step was visualised by imaging chemical properties and scatter effects.

Fig. 1 Flow chart illustrating the general strategy for the analysis of the infrared images. Chart steps: (I) quality test on raw images; (II) segmentation of connective tissue regions: (a) random selection of spectra from image set, (b) EMSC model A, (c) PCA model for segmentation, (d) segmentation of connective tissue regions; (III) study connective tissue: (a) position parameters for connective tissue spectra, (b) random selection of connective tissue spectra, (c) EMSC model B, (d) analysis of variance with respect to spatial variation and ageing time, (e) chemical imaging of identified parameters

The individual steps of this general strategy are explained below.

Test of quality of FTIR image spectra

A quality test was performed on every spectrum of the image set in order to identify pixels with very low or zero intensity and pixels with high noise or water vapour contributions. The test is similar to the one proposed by Helm et al. [16] except that we used raw spectra for the quality test. The maximum absorbance differences were calculated in the regions 2,100–1,600 cm−1, in order to check for total absorbance (values between 0.2 and 1.2 were accepted); 2,100–2,000 cm−1, in order to check for noise (values below 0.02 were accepted); and 1,837–1,847 cm−1, in order to check for water vapour (values below 0.005 were accepted). In addition the maximum difference in the region 1,800–1,700 cm−1 (containing C=O stretch in lipids) and in the region 1,700–1,600 cm−1 (containing C=O in proteins) was calculated and the ratio (lipids to proteins) was used to exclude pixels/spectra referring to fat (values below 0.5 were accepted).
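The four acceptance criteria can be sketched in Python as follows (the authors worked in Matlab). This is a minimal illustration, not the authors' code: it assumes that "maximum absorbance difference" means the maximum minus the minimum absorbance within a region, and the function names and the synthetic spectra are hypothetical.

```python
import numpy as np


def band_range(wavenumbers, spectra, low, high):
    """Maximum minus minimum absorbance of each spectrum within [low, high] cm^-1."""
    in_band = (wavenumbers >= low) & (wavenumbers <= high)
    band = spectra[:, in_band]
    return band.max(axis=1) - band.min(axis=1)


def quality_mask(wavenumbers, spectra):
    """True for every raw spectrum that passes all four criteria from the text."""
    total = band_range(wavenumbers, spectra, 1600, 2100)    # total absorbance
    noise = band_range(wavenumbers, spectra, 2000, 2100)    # noise
    vapour = band_range(wavenumbers, spectra, 1837, 1847)   # water vapour
    lipid = band_range(wavenumbers, spectra, 1700, 1800)    # C=O stretch in lipids
    protein = band_range(wavenumbers, spectra, 1600, 1700)  # C=O in proteins
    ratio = lipid / np.maximum(protein, 1e-12)              # avoid division by zero
    return (
        (total >= 0.2) & (total <= 1.2)
        & (noise < 0.02)
        & (vapour < 0.005)
        & (ratio < 0.5)
    )


# Synthetic demonstration spectra (hypothetical, not data from the paper).
wavenumbers = np.arange(900.0, 3801.0, 4.0)


def gaussian(centre, height, width=20.0):
    return height * np.exp(-((wavenumbers - centre) ** 2) / (2 * width ** 2))


tissue = gaussian(1650.0, 0.6)                                   # protein-dominated pixel
void = np.zeros_like(wavenumbers)                                # empty pixel
fat = gaussian(1650.0, 0.3) + gaussian(1740.0, 0.6)              # lipid-dominated pixel
noisy = tissue + 0.05 * (-1.0) ** np.arange(wavenumbers.size)    # strongly noisy pixel

accepted = quality_mask(wavenumbers, np.vstack([tissue, void, fat, noisy]))
print(accepted)
```

In this toy example only the first spectrum is accepted: the void pixel fails the total absorbance check, the fat pixel fails the lipid-to-protein ratio and the noisy pixel fails the noise check.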

Extended multiplicative signal correction

Infrared spectra of tissue sections often show strong scatter effects due to differences in the effective optical path length, due to variations in the light source or due to other effects, like surface effects of the sample [10]. In order to make spectra interpretable, it is important to separate this physical information from the chemical information originating from absorbances of specific functional groups of biomolecules in the samples. Both multiplicative signal correction (MSC) [17, 18] and extended multiplicative signal correction (EMSC) [10, 11, 19] are model-based pre-processing methods that can be used to separate chemical and physical information in FTIR absorbance spectra. In the basic form of EMSC every spectrum $z(\tilde{\nu})$ is a function of the wavenumber $\tilde{\nu}$, which is defined as the reciprocal of the wavelength λ. The spectrum $z(\tilde{\nu})$ can be written as

$$z(\tilde{\nu}) = a + b \cdot m(\tilde{\nu}) + d\,\tilde{\nu} + e\,\tilde{\nu}^{2} + \varepsilon(\tilde{\nu}) \qquad (1)$$

a linear combination of a baseline shift $a$, a multiplicative effect $b$ times a reference spectrum $m(\tilde{\nu})$, and linear and quadratic wavenumber-dependent effects $d \cdot \tilde{\nu}$ and $e\,\tilde{\nu}^{2}$, respectively. The term $\varepsilon(\tilde{\nu})$ contains the unmodelled residuals. The reference spectrum $m(\tilde{\nu})$ is calculated by taking the sample mean of the considered set of spectra or by selecting a spectrum from the sample set as reference spectrum. The EMSC parameters $a$, $b$, $d$ and $e$ can be estimated by ordinary least-squares, and finally the spectra can be corrected according to

$$z_{\mathrm{corr}}(\tilde{\nu}) = \frac{z(\tilde{\nu}) - a - d\,\tilde{\nu} - e\,\tilde{\nu}^{2}}{b} \qquad (2)$$

We refer to this as physical EMSC, since the EMSC model takes into account scatter effects only and does not use constituent difference spectra [10]. The strength of EMSC lies in the fact that the model is defined around a reference spectrum. This makes EMSC modelling very stable, even in the case where there are very strong spectral changes, e.g. from image region to image region or from image to image due to chemical differences in the tissue or scatter effects arising from differences in sample thickness or textural changes.

Another advantage of EMSC is that scatter effects are estimated by EMSC. The obtained scatter effects can therefore be visualised in the same way as chemical properties [10].
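The physical EMSC of Eqs. 1 and 2 amounts to an ordinary least-squares fit per spectrum followed by a correction. The following Python sketch illustrates this under stated assumptions (the authors worked in Matlab): the function name `emsc` is hypothetical, and the wavenumber axis is centred and scaled before fitting purely for numerical stability, which changes the values of d and e but not the corrected spectra.

```python
import numpy as np


def emsc(wavenumbers, spectra, reference=None):
    """Physical EMSC: fit z = a + b*m + d*nu + e*nu^2 and return corrected spectra.

    Returns the corrected spectra and one row of parameters (a, b, d, e) per spectrum.
    """
    spectra = np.asarray(spectra, dtype=float)
    if reference is None:
        reference = spectra.mean(axis=0)  # sample mean as reference spectrum
    # Centred, scaled wavenumber axis (implementation choice, not from the paper).
    nu = (wavenumbers - wavenumbers.mean()) / (wavenumbers.max() - wavenumbers.min())
    model = np.column_stack([np.ones_like(nu), reference, nu, nu ** 2])
    params, *_ = np.linalg.lstsq(model, spectra.T, rcond=None)  # shape (4, n_spectra)
    a, b, d, e = params
    corrected = (spectra - a[:, None] - d[:, None] * nu - e[:, None] * nu ** 2) / b[:, None]
    return corrected, params.T


# Synthetic demonstration: distort a known reference and recover it (hypothetical data).
wavenumbers = np.arange(1000.0, 1801.0, 4.0)
reference = np.exp(-((wavenumbers - 1650.0) ** 2) / (2 * 20.0 ** 2))
distorted = np.vstack([
    0.10 + 1.5 * reference + 1e-4 * wavenumbers,        # baseline, scaling, linear effect
    -0.05 + 0.7 * reference + 2e-7 * wavenumbers ** 2,  # baseline, scaling, quadratic effect
])
corrected, params = emsc(wavenumbers, distorted, reference)
print(params[:, 1])  # estimated multiplicative effects b
```

Because the estimated parameters are returned alongside the corrected spectra, the scatter parameters can be reshaped into images and displayed like any chemical map, as noted above.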

For the study of connective tissue (III, see Fig. 1) the second derivative of the spectra was calculated using the Savitzky–Golay algorithm (based on nine points) [20] before EMSC. This was done to resolve closely spaced bands. In this case the parameters for the baseline effect a and the linear effect d in Eq. 1 are zero, since the second derivative already removes baseline and linear effects. The order of both pre-processing steps can also be changed without changing the result.
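A nine-point Savitzky–Golay second derivative can be sketched in Python with a local polynomial fit (the authors worked in Matlab). The polynomial order is not stated in the text; order 3 is assumed here, which gives the same centre-point weights as order 2 for a symmetric window. The function name is hypothetical, and the edge points where the window does not fit are simply dropped.

```python
import numpy as np


def savgol_second_derivative(spectrum, step, window=9, order=3):
    """Second derivative from a least-squares polynomial fit in a moving window."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    design = np.vander(offsets, order + 1, increasing=True)  # columns 1, x, x^2, ...
    # Row 2 of the pseudo-inverse yields the fitted x^2 coefficient; multiplying by 2
    # gives the second derivative at the window centre, and step**2 converts from
    # index units to wavenumber units.
    weights = 2.0 * np.linalg.pinv(design)[2] / step ** 2
    # Reversing the weights turns the convolution into a correlation.
    return np.convolve(spectrum, weights[::-1], mode="valid")


# Check on a curve with known curvature: baseline and slope vanish, curvature remains.
wavenumbers = np.arange(1000.0, 1801.0, 4.0)
curve = 0.3 + 2e-4 * wavenumbers + 0.5e-4 * (wavenumbers - 1400.0) ** 2
second = savgol_second_derivative(curve, step=4.0)
print(second[:3])
```

The example also illustrates why the baseline parameter a and the linear parameter d of Eq. 1 drop out after differentiation: the constant and linear terms of the test curve leave no trace in the result.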

Segmentation of connective tissue and myofibre regions

The FTIR microspectroscopy images used in this study contain regions with both myofibre and connective tissues. In order to be able to study the spatial variation of connective tissue regions, these regions were segmented. For segmentation principal component analysis (PCA) was used. In order to build a PCA model for segmentation ten spectra from every image were selected randomly from pixels/spectra having passed the quality test, resulting in 1,130 spectra in total (containing myofibre and connective tissue spectra). We refer to this as set A (IIa). Set A was first pre-processed by a physical EMSC in the region from 1,000 to 1,800 cm−1 (IIb). The EMSC model was saved for later use (EMSC model A). A PCA was then performed on set A in the spectral region from 1,000 to 1,500 cm−1. Based on the visual inspection of score images, loadings and by visual inspection of the obtained segmentations (by comparing the segmentations to corresponding light microscopy images) a PCA model was chosen to separate connective tissue and myofibre spectra (PCA model A) (IIc). The EMSC model A and subsequently the PCA model A were applied to all images (IId). Through this procedure connective tissue spectra and myofibre tissue spectra were separated by applying the above-identified threshold.
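The segmentation step (fit a PCA model on set A, project every image pixel onto the first component, apply a threshold) can be sketched in Python as below (the authors worked in Matlab). Everything in the example is an assumption made for illustration: the synthetic marker bands at 1,240 and 1,400 cm−1, the sign convention that orients PC1 towards connective tissue, and the demo threshold of 0. The EMSC pre-processing that precedes PCA in the text is omitted here.

```python
import numpy as np


def fit_pca(spectra, n_components=2):
    """Mean spectrum and loading vectors of a PCA computed via SVD."""
    mean = spectra.mean(axis=0)
    _, _, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    return mean, vt[:n_components]


def first_scores(spectra, mean, loadings):
    """Scores on the first principal component for new spectra."""
    return (spectra - mean) @ loadings[0]


rng = np.random.default_rng(0)
wavenumbers = np.arange(1000.0, 1501.0, 4.0)


def band(centre):
    return np.exp(-((wavenumbers - centre) ** 2) / (2 * 15.0 ** 2))


myofibre, connective = band(1400.0), band(1240.0)  # hypothetical marker bands

# Stand-in for "set A": both tissue types with a little noise.
set_a = np.vstack([myofibre] * 20 + [connective] * 20)
set_a = set_a + rng.normal(0.0, 0.01, set_a.shape)
mean, loadings = fit_pca(set_a)

# PCA signs are arbitrary: orient PC1 so the assumed connective tissue band loads positively.
marker = np.argmin(np.abs(wavenumbers - 1240.0))
if loadings[0, marker] < 0:
    loadings = -loadings

# Apply the stored model to a 4 x 4 "image": left half myofibre, right half connective tissue.
image = np.empty((4, 4, wavenumbers.size))
image[:, :2] = myofibre
image[:, 2:] = connective
passed_quality = np.ones((4, 4), dtype=bool)
passed_quality[0, 3] = False  # one pixel rejected by the quality test

scores = first_scores(image.reshape(-1, wavenumbers.size), mean, loadings).reshape(4, 4)
threshold = 0.0  # demo value; a suitable threshold depends on the data set
connective_mask = passed_quality & (scores > threshold)
print(connective_mask.astype(int))
```

Storing the mean and loadings from set A and reusing them on every image is what keeps the score scale, and hence a single threshold, comparable across images.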

Characterisation of spatial variations of connective tissue regions

In order to quantify spatial variation in connective tissue regions, for every connective tissue pixel/spectrum we calculated the pixel distance to the nearest non-connective tissue pixel/spectrum. These distance values were rounded to integer numbers and resulted in numbers beginning with one, since pixel distances and not true microscopic distances were used. Following this procedure we obtained an integer distance value for every connective tissue spectrum referring to a distance group (IIIa). Every connective tissue spectrum was therefore characterised by the design factors of the sample (information corresponding to every image) and an integer number referring to a distance group, namely by the following parameters: the ageing time (0, 2, 7 or 21 days), the animal (1, 2, 3, 4, 5) and the distance group (1, 2, 3...). In the next step set B was established by sampling connective tissue spectra from all images. From the connective tissue spectra in each image we chose three spectra randomly per distance group, using the condition that a distance was used only in the case when at least three spectra could be found per distance and image (IIIb). This resulted in a set of 2,697 spectra in total (set B). Every spectrum in set B is characterised by the design factors ageing time, animal and distance group. Set B was pre-processed by an EMSC model (using the mean of set B as reference spectrum (IIIc) in the region 1,800–1,000 cm−1) and was then used to investigate the influence of the design factors and the distance groups on the spectra by ANOVA PLSR (IIId). Chemical variation present in the first and second component was identified and important chemical bands were used to create chemical maps (IIIe).

The data analysis was performed using Matlab 7.0 (The MathWorks Inc., Natick, USA).
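The sampling rule used to build set B (three random spectra per distance group and image, skipping groups with fewer than three spectra) can be illustrated with a short Python sketch; Python is used here for illustration only. The function name and the toy distance image are hypothetical, and the value 0 is assumed to mark pixels outside the connective tissue.

```python
import numpy as np


def sample_set_b(distance_groups, rng, per_group=3):
    """Flat pixel indices: `per_group` random picks from every sufficiently large group."""
    chosen = []
    for group in np.unique(distance_groups[distance_groups > 0]):
        members = np.flatnonzero(distance_groups == group)
        if members.size >= per_group:  # skip groups with too few spectra
            picks = rng.choice(members, size=per_group, replace=False)
            chosen.extend(picks.tolist())
    return np.array(chosen, dtype=int)


# Toy distance image: a 5 x 5 connective tissue block with groups 1, 2 and 3.
distance_image = np.zeros((7, 7), dtype=int)
distance_image[1:6, 1:6] = 1
distance_image[2:5, 2:5] = 2
distance_image[3, 3] = 3

rng = np.random.default_rng(0)
chosen = sample_set_b(distance_image, rng)
picked_groups = np.sort(distance_image.ravel()[chosen])
print(picked_groups)
```

Group 3 contains a single pixel in this toy image and is therefore skipped, so six spectra are drawn: three from group 1 and three from group 2.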

Analysis of variance by PLSR mixing design and spatial information

In order to estimate correlation between design variables (sample information and spectrum position in the connective tissue region) and FTIR absorbances, we used partial least-squares regression (PLSR). In PLSR the response variables Y (e.g. FTIR spectra of N samples with K2 variables) are expressed as a linear function of the variables X (e.g. K1 design variables referring to the N samples)

$$Y = B_0 + XB + F \qquad (3)$$

where B0 is the N×K2 matrix of offsets with identical values in every column (every variable has the same offset for all N samples), B is the matrix of regression coefficients with K1×K2 entries and F is the N×K2 matrix of residuals [21]. X is transformed to a new coordinate space, where the new “latent X-variables”, the so-called scores, have a diagonal covariance matrix. The directions in this new coordinate space are given by the loading vectors (as a function of the old variables). In addition the new variables are ordered according to the magnitude of their covariance with Y, i.e. the first PLS component contains the largest covariance and so on.

In order to visualize the correlation between design and FTIR variables, so-called correlation loading plots are used. In a correlation loading plot, correlations between the x- and y-variables and the PLS scores are plotted. The design was characterised by the design matrix X containing so-called indicator variables, where each condition is expressed by a variable with the value 0 or 1, indicating whether the respective samples refer to this condition or not [22]. The design variables and FTIR variables were weighted inversely by their standard deviations prior to PLSR. The design variables contained both distance groups and image (sample) information, i.e. ageing time and animal.
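To make the ingredients of ANOVA PLSR more concrete, the Python sketch below builds an indicator design matrix, weights all variables by their inverse standard deviations, extracts PLS scores and computes correlation loadings (the authors worked in Matlab). It is an illustration under stated assumptions rather than the authors' implementation: the PLS weights are taken from the leading singular vector of XᵀY with deflation after every component, which is one common PLS2 variant, and the design and "peak absorbances" are synthetic.

```python
import numpy as np


def indicator_matrix(labels):
    """One 0/1 column per level of a design factor."""
    labels = np.asarray(labels)
    return (labels[:, None] == np.unique(labels)[None, :]).astype(float)


def autoscale(matrix):
    """Centre every column and weight it by the inverse of its standard deviation."""
    std = matrix.std(axis=0, ddof=1)
    return (matrix - matrix.mean(axis=0)) / np.where(std > 0, std, 1.0)


def pls_scores(x, y, n_components):
    """X-scores of a PLS2 model (assumed variant: SVD-based weights plus deflation)."""
    x, y = x.copy(), y.copy()
    scores = []
    for _ in range(n_components):
        u, _, _ = np.linalg.svd(x.T @ y, full_matrices=False)
        t = x @ u[:, 0]                         # score vector with maximal covariance to Y
        x -= np.outer(t, x.T @ t) / (t @ t)     # deflate X
        y -= np.outer(t, y.T @ t) / (t @ t)     # deflate Y
        scores.append(t)
    return np.column_stack(scores)


def correlation_loadings(variables, scores):
    """Pearson correlation between every variable and every score vector."""
    return autoscale(variables).T @ autoscale(scores) / (len(variables) - 1)


# Synthetic design: two ageing times and three distance groups, twelve samples.
times = np.array([0, 0, 0, 2, 2, 2] * 2)
distances = np.array([3, 4, 5] * 4)
rng = np.random.default_rng(1)
peaks = np.column_stack([
    1.0 * (times == 2),
    0.5 * (distances - 3),
    -1.0 * (times == 2) + 0.2 * (distances - 3),
]) + rng.normal(0.0, 0.05, (12, 3))

x = autoscale(np.hstack([indicator_matrix(times), indicator_matrix(distances)]))
y = autoscale(peaks)
scores = pls_scores(x, y, n_components=2)
loadings_x = correlation_loadings(x, scores)  # rows: design variables, columns: components
loadings_y = correlation_loadings(y, scores)  # rows: peak variables, columns: components
print(np.round(loadings_x, 2))
```

Plotting the rows of `loadings_x` and `loadings_y` in the plane of the first two components gives a correlation loading plot. The two indicator variables of a two-level factor always appear opposite each other, because one is the complement of the other.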

Results

Segmentation of connective tissue regions

A light microscopy image corresponding to a typical FTIR image region is shown in Fig. 2. The light microscopy image was taken from meat aged for 2 days. The image region covers several myofibres enveloped by thin connective sheets (endomysium) and in the middle of the image a thicker connective tissue area (perimysium) enveloping groups of myofibres. All the FTIR images of the image set used in this study covered both regions of myofibre cells and regions of connective tissue (perimysium). While the myofibres are rather homogeneous, the connective tissue is a very heterogeneous tissue with networks of collagenous and elastic fibres embedded in and interacting with proteoglycans and glycoproteins.

Fig. 2 Light microscopy image corresponding to bovine meat aged for 2 days

According to our strategy the whole set of FTIR images was subjected to a quality test (I) and subsequently a random set of spectra from the whole image set was selected containing both myofibre and connective tissue spectra (IIa). This was followed by a physical EMSC run on set A and a PCA performed in the spectral region 1,500–1,000 cm−1 after having removed several spectral outliers. The explained variances were 46% and 25.2% in the first and second component, respectively. The region from 1,800 to 1,500 cm−1 was left out for the PCA in order to avoid myofibre protein changes due to ageing time effects. We observed the strongest changes in myofibre cell spectra due to ageing time in the amide I and the amide II region (results not shown). However, since the differences between connective tissue and myofibre cell spectra are large compared to the changes due to ageing time, the PCA model can alternatively also be established using the region 1,800 to 1,000 cm−1.

The obtained EMSC and PCA models were applied to all FTIR images (IIb,c). Figure 3 shows the first score of a typical IR image (from meat aged for 2 days). The black pixels in Fig. 3 refer to pixels that were not accepted by the quality test. In most cases spectra are removed due to very low or very high absorbances, but in general each of the criteria used for the quality test can lead to the exclusion of spectra. The FTIR score image in Fig. 3, in which the light area represents the connective tissue, corresponds to the light microscopy image of Fig. 2. Figure 3 shows that the first score distinguishes well between connective tissue and myofibre regions. This observation was confirmed by inspecting several first-score images.

The score plot of the first and second principal component of set A is shown in Fig. 4. In the score plot two main populations are visible: a more compact population on the left hand side and one that is more widely spread on the right hand side. Comparing the values for the first scores in the score plot in Fig. 4 with the score image in Fig. 3, it is obvious that the more compact population on the left hand side in the score plot in Fig. 4 refers to myofibre spectra, while the more spread objects on the right hand side refer to connective tissue spectra. This is in accordance with the fact that myofibre cell regions are known to be more homogeneous than the connective tissue regions. After inspecting several score images and score plots, the threshold value for the first score was set to 0.001 in order to separate connective tissue and myofibre cells. Finally the threshold of 0.001 for the score of PC1 was applied to all 113 score images in order to obtain a segmentation of the connective tissue on all images (IId). The applied threshold was high enough to ensure that the edge regions of the segmented connective tissue areas were free from pixels referring to myofibre spectra. This was confirmed by comparing segmentation results with corresponding light microscopy images.

Fig. 3 Colour map showing the first score image of a typical FTIR image (PCA model of set A was used). The corresponding light microscopy image is shown in Fig. 2. The images correspond to bovine meat aged for 2 days

Fig. 4 Score plot of first and second component separating myofibre spectra (left) and connective tissue spectra (right). Axes: PC1 (horizontal, −6 to 8 × 10−3) and PC2 (vertical, −4 to 16 × 10−3)

Calculation of distance groups

For every connective tissue pixel we calculated the distance to the edge of the corresponding connective tissue region using the Matlab function bwdist (Euclidean distance metric) (IIIa). This was done by calculating the distance to the nearest non-connective tissue pixel, i.e. myofibre or void regions (defined by a maximum absorbance level). Void spaces occurred often between the connective tissue and myofibre regions or between myofibre cells. Similar void regions due to myofibre–myofibre and myofibre–perimysium detachments are also observed in histological images of aged meat [14, 23]. These void regions lead to scatter effects at the edge of the void regions and to spectra almost identical to zero in the void regions, which requires a pre-selection of spectra by a quality test. Figure 5 shows the calculated distances for every connective tissue pixel of the FTIR image corresponding to the light microscopy image of Fig. 2.
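The distance calculation can be illustrated with a small NumPy sketch. It is a brute-force stand-in for a Euclidean distance transform, written for clarity rather than speed, and is not the Matlab call used by the authors; the function name and the toy mask are hypothetical. In Python, `scipy.ndimage.distance_transform_edt` is the usual library routine for this task and could be applied to the connective tissue mask instead.

```python
import numpy as np


def distance_groups(connective):
    """Rounded Euclidean pixel distance from each connective tissue pixel to the
    nearest pixel outside the connective tissue (0 marks pixels outside)."""
    inside = np.argwhere(connective)
    outside = np.argwhere(~connective)
    groups = np.zeros(connective.shape, dtype=int)
    if inside.size == 0 or outside.size == 0:
        return groups
    # Pairwise coordinate differences between inside and outside pixels.
    diff = inside[:, None, :] - outside[None, :, :]
    nearest = np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)
    groups[tuple(inside.T)] = np.rint(nearest).astype(int)
    return groups


# Toy mask: a 5 x 5 connective tissue block surrounded by other pixels.
mask = np.zeros((7, 7), dtype=bool)
mask[1:6, 1:6] = True
groups = distance_groups(mask)
print(groups)
```

Pixels on the rim of the block receive distance group 1, the next ring group 2 and the centre pixel group 3, so the numbering starts at one as described in the Theory section.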

Investigation of spatial variations and variations due to ageing

In order to investigate whether the different factors of animal, time and distance have influenced the FTIR spectra, set B consisting of connective tissue spectra was chosen (IIIb) from the segmented connective tissue regions as described in the Theory section. Mean second-derivative spectra for ageing time 0 (solid line) and ageing time 2 days (dashed line) are shown in Fig. 6. Tentative band assignments made according to the literature are summarised in Table 1.

Set B was analysed by ANOVA PLSR. Prior to ANOVA PLSR, an EMSC was performed in the spectral region 1,000 to 1,800 cm−1 (EMSC model B), the mean of all spectra referring to the same design condition (animal, time and distance group) was calculated, and peak absorbances were selected in the region 1,000–1,700 cm−1 according to the minima in Fig. 6. Minima were selected for presentation purposes. Equivalently, the whole spectral region from 1,000 to 1,700 cm−1 can be used for the analysis by ANOVA PLSR [22]. The EMSC model B was stored for later use on FTIR images. The samples referring to distance 1 and distance 2 were removed from the data set before running ANOVA PLSR. This was done to avoid contributions from neighbouring muscle fibres in the connective tissue spectra near the edge. Together with a conservative segmentation model, this ensured that the selected connective tissue spectra were free from contributions from muscle fibre cells. This was also confirmed by visual inspection of the connective tissue edge spectra.
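The averaging and filtering steps of this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used here: the function name `design_means`, the argument layout and the toy numbers are hypothetical, and the spectra are assumed to be EMSC-corrected already. Because the distance group is part of the design condition, dropping distances 1 and 2 before or after averaging gives the same result.

```python
import numpy as np

def design_means(spectra, animal, time, distance, drop_distances=(1, 2)):
    """Average all spectra that share one design condition
    (animal, time, distance group), skipping unwanted distance groups."""
    spectra = np.asarray(spectra, dtype=float)
    groups = {}
    for row, key in enumerate(zip(animal, time, distance)):
        if key[2] in drop_distances:
            continue                      # edge spectra are excluded
        groups.setdefault(key, []).append(row)
    keys = sorted(groups)
    means = np.array([spectra[groups[key]].mean(axis=0) for key in keys])
    return keys, means

# Toy data: four two-channel spectra (hypothetical values)
spectra = [[1.0, 1.0], [2.0, 4.0], [4.0, 6.0], [10.0, 10.0]]
animal = [1, 1, 1, 2]
time = [0, 0, 0, 0]
distance = [1, 3, 3, 3]
keys, means = design_means(spectra, animal, time, distance)
```

Each row of `means` then corresponds to one design condition in `keys` and can serve as one sample in the subsequent regression.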

Fig. 6 Mean spectra for ageing time 0 days (solid line) and 2 days (dashed line)

Fig. 5 Distance image illustrating the calculated distances for every connective tissue spectrum

In the ANOVA PLSR the design matrix including indicator variables was used as X and the selected FTIR peak absorbances as Y. Figure 7 shows the correlation loading plot for the first and second PLS component. In Fig. 7a the samples corresponding to ageing time 0 and 2 days were used; in Fig. 7b samples corresponding to ageing time 2 days and 7 days were used. The cross-validated explained variances for X and for Y are shown in the figure legends. The cross-validation was performed block-wise between the different animals, i.e. samples with the same biological origin were taken out at the same time and predicted from the model of the remaining animals. This is the most conservative form of cross-validation. The inner and the outer circle in Fig. 7 refer to 50% and 100% explained variance, respectively. The design variables and the peak absorbance variables are plotted in blue and black, respectively. The correlation loading plot can be interpreted in the following way: design variables and FTIR peak absorbances with high explained variance are highly correlated if they lie close to each other, and highly negatively correlated if they are located opposite to each other with respect to the origin of the plot. Correlations revealed by the correlation loading plot may then be confirmed by plots of the raw data, e.g. as chemical images of spectral absorbances.
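How a point ends up on the inner or outer circle of a correlation loading plot can be shown with a short sketch. It assumes that correlation loadings are the Pearson correlations between each variable and each score vector, and that the score vectors are mutually orthogonal; the function name and the toy vectors are hypothetical, and no PLSR model is fitted here.

```python
import numpy as np

def correlation_loadings(variables, scores):
    """Pearson correlation between every variable (column of `variables`)
    and every score vector (column of `scores`)."""
    v = np.asarray(variables, dtype=float)
    t = np.asarray(scores, dtype=float)
    vc = v - v.mean(axis=0)
    tc = t - t.mean(axis=0)
    norms = np.outer(np.linalg.norm(vc, axis=0), np.linalg.norm(tc, axis=0))
    return (vc.T @ tc) / norms

# Two orthogonal, mean-centred toy score vectors and one unmodelled direction
t1 = np.array([1.0, -1.0, 1.0, -1.0])
t2 = np.array([1.0, 1.0, -1.0, -1.0])
t3 = np.array([1.0, -1.0, -1.0, 1.0])
scores = np.column_stack([t1, t2])
variables = np.column_stack([t1, -t1, t1 + t3])

r = correlation_loadings(variables, scores)
explained = (r ** 2).sum(axis=1)   # fraction of variance explained by both components
radius = np.sqrt(explained)        # distance from the origin of the plot
```

In this toy case the first two variables lie on the outer circle at opposite sides of the origin (perfectly negatively correlated with each other), while the third, half of whose variance is unmodelled, lies on the inner circle at a radius of about 0.71.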

From Fig. 7a the following results are obtained: the design variables characterising the ageing time mainly span the first component, and the design variables characterising the distance of the spectra to the edges of the connective tissue regions mainly span the second component. The design variables related to the different animals do not contribute significantly to the first and second component (results not shown). Higher PLS components do not show strong correlations between spectral bands and design variables (results not shown).

The bands at 1,695 cm−1 and 1,677 cm−1, earlier assigned to reducible collagen cross-links [5–8], are positively correlated to short ageing times. Collagen cross-links are typical of a collagenous network and are supposed to be reduced with ageing. This result may therefore indicate that reducible cross-links are already reduced to some extent after the ageing period of 2 days.

The band at 1,340 cm−1 was earlier discussed as a marker for the integrity of collagen [24–26]. This band is positively correlated to longer distances to the edge of the connective tissue regions. The same applies to the bands at 1,204 and 1,283 cm−1, which have been discussed as referring to collagen. This indicates that collagen is abundant in the mid regions of the connective tissue.

Fig. 7 Correlation loading plot of ANOVA PLSR. X is defined by the design, mixing spatial and sample information; Y is defined by selected FTIR bands: a ageing times 0 and 2 days, b ageing times 2 and 7 days

Table 1 Assignment of FTIR bands

FTIR band (cm−1): Assignment
1,695, 1,677: Amide I, possibly reducible collagen cross-links (C=O stretch) [5–8, 24]
1,659: Amide I, possibly non-reducible collagen cross-links (C=O stretch) [5–8, 24]
1,638: Amide I from proteoglycans [6, 24]; O–H bending of water [26]
1,600–1,500: Amide II C–N stretch, N–H bend combination of collagen and proteoglycans; 1,551 cm−1 possibly referring to the amide II of collagen [24]
1,497: Not assigned
1,468: CH2 bending vibration [26]
1,453: CH3 asymmetric bending vibration [26]
1,396: COO− stretch of ionised fatty acids and amino side chains [26]
1,340: CH2 side chain vibrations of collagen [24–26]
1,312: Not assigned
1,283: Collagen amide III vibration with significant mixing with CH2 wagging vibration from the glycine backbone and proline sidechain [26]
1,237: Sulfate stretch from proteoglycans [24]; collagen amide III vibration with significant mixing with CH2 wagging vibration from the glycine backbone and proline sidechain [26]
1,204: Collagen amide III vibration with significant mixing with CH2 wagging vibration from the glycine backbone and proline sidechain [26]
1,160: Not assigned
1,121: Not assigned
1,083, 1,062, 1,031: C–O stretching vibrations of the carbohydrate residues present in collagen [26] and in proteoglycans

The band at 1,237 cm−1 has been discussed as referring to sulfate stretching vibrations of proteoglycans [24] and has a positive correlation to ageing time 2 days. Since carbohydrate residues are abundant in the network of proteoglycans one may expect that the C–O stretching vibrations at 1,083, 1,062 and 1,031 cm−1 are related to the amide I of proteoglycans and to the O–H bending of water (1,638 cm−1) and to ageing time 2 days. Nevertheless all the carbohydrate bands show rather weak correlations with the first principal component and only the carbohydrate bands at 1,062 and 1,031 cm−1 are (very weakly) related to the ageing period 2 days. It remains unclear why the bands referring to proteoglycans show an increase from ageing time 0 to ageing time 2 days.

The amide II bands at 1,512 cm−1 and 1,528 cm−1 are closely related to the amide I bands at 1,695 cm−1 and 1,677 cm−1, suggesting that these amide II bands possibly also refer to reducible collagen cross-links. On the other hand, the amide II band at 1,551 cm−1 is positively related to bands referring to proteoglycans and may therefore be assigned to the amide II band of proteoglycans. This is in disagreement with its earlier assignment to amide II of collagen [24].

From Fig. 7 it can also be seen that there is no clearly visible difference between the FTIR bands on day 2 and day 7. A similar result is obtained for the differences between day 7 and day 21 (results not shown).

The random selection of set B was performed several times (applying different start values for the random set) and the subsequent data analysis of the different sets validated our results.

Imaging important connective tissue parameters

In order to visualise the results from the previous section, absorbances at wavenumbers that appeared to be important in the correlation loading plot of Fig. 7 were used to generate chemical images. Chemical images are shown in Fig. 8 (the same FTIR image as in Fig. 3 was used). The image spectra were corrected by the second derivative and EMSC, as described above, before generating the chemical images. In Fig. 8a the chemical image of the absorbance at 1,237 cm−1 (sulfate stretch of proteoglycans) is shown for FTIR images referring to meat that was aged for 2 days. The image shows an increase of small spots with higher absorbances from the edge to the mid region of the connective tissue areas, which confirms the weak correlation between the band at 1,237 cm−1 and high distances to the edge. In Fig. 8b a chemical image is shown for the absorbance at 1,204 cm−1, possibly referring to collagen. This chemical image refers to the same FTIR image as used for creating the chemical image in Fig. 8a. The images in a and b appear similar, although differences are visible.

Fig. 8 Chemical images for the absorbances at 1,237 cm−1 (a) and 1,204 cm−1 (b) and the effective optical path length estimated by EMSC (c)
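Generating a chemical image of the kind shown in Fig. 8 amounts to selecting one spectral channel from the pre-processed image. The sketch below assumes the image is stored as an array of shape (rows, columns, wavenumbers); the function name, the optional connective tissue mask and the toy values are hypothetical.

```python
import numpy as np

def chemical_image(cube, wavenumbers, target, mask=None):
    """Return the image of the channel closest to `target` (cm-1) and the
    wavenumber actually used; pixels outside `mask` are set to NaN."""
    cube = np.asarray(cube, dtype=float)
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    channel = int(np.argmin(np.abs(wavenumbers - target)))
    image = cube[:, :, channel].copy()
    if mask is not None:
        image[~np.asarray(mask, dtype=bool)] = np.nan
    return image, float(wavenumbers[channel])

# Toy 2 x 2 image with three spectral channels (hypothetical values)
wavenumbers = np.array([1204.0, 1237.0, 1283.0])
cube = np.arange(12, dtype=float).reshape(2, 2, 3)
mask = np.array([[True, False], [True, True]])
image, used = chemical_image(cube, wavenumbers, target=1237.0, mask=mask)
```

Masked pixels are returned as NaN so that a plotting routine can leave myofibre and void regions blank instead of mapping them onto the colour scale.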

In Fig. 8c a colour image representing the parameter b of the EMSC modelling is shown (see Eq. 2). The parameter b is the effective optical path length that is mainly determined by the sample thickness. The effective optical path length was estimated and removed from the image spectra by the EMSC pre-processing. Comparing Fig. 8c with Fig. 8a and b shows that the sample thickness is not strongly related to the absorbances at 1,237 cm−1 and 1,204 cm−1, which represent the sulfate stretch of proteoglycans and collagen, respectively. The images in Fig. 8a and b show a higher density of spots with higher absorbances in the connective tissue mid regions, while the effective optical path length image shows a gradually increasing thickness from the edge to the centre. It has to be noted that for the imaging in Fig. 8 EMSC was performed on the second-derivative spectra. This implies that the parameters a and d of the EMSC estimation are zero, since the baseline effects and linear effects are removed by the second derivative. The estimated parameter b on the other hand is identical for EMSC on raw spectra and EMSC on second-derivative spectra. The reason for running the second derivative prior to EMSC estimation was to resolve nearby lying bands.
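The statement about the parameters a, b and d can be checked numerically in a noise-free toy case. The sketch assumes a basic EMSC model of the form spectrum ≈ a + b · reference + d · v, with v a scaled wavenumber axis, and uses plain second differences as a stand-in for the second-derivative filter; the function names, the synthetic reference spectrum and all numerical values are hypothetical.

```python
import numpy as np

def emsc_parameters(spectrum, reference, v):
    """Least-squares estimates (a, b, d) in spectrum ~ a + b*reference + d*v."""
    design = np.column_stack([np.ones_like(v), reference, v])
    coefficients, *_ = np.linalg.lstsq(design, spectrum, rcond=None)
    return coefficients

def second_difference(y):
    """Discrete second derivative on an evenly spaced axis (unscaled)."""
    return y[2:] - 2.0 * y[1:-1] + y[:-2]

wavenumbers = np.linspace(1000.0, 1800.0, 401)
v = (wavenumbers - wavenumbers.mean()) / 400.0   # scaled to [-1, 1]
reference = (np.exp(-0.5 * ((wavenumbers - 1237.0) / 15.0) ** 2)
             + 0.8 * np.exp(-0.5 * ((wavenumbers - 1650.0) / 20.0) ** 2))
# Synthetic spectrum: baseline a = 0.2, path length b = 1.5, slope d = 0.05
spectrum = 0.2 + 1.5 * reference + 0.05 * v

raw = emsc_parameters(spectrum, reference, v)
derivative = emsc_parameters(second_difference(spectrum),
                             second_difference(reference), v[1:-1])
```

Here `raw` recovers (0.2, 1.5, 0.05), whereas `derivative` gives a and d equal to zero within rounding error and the same b of 1.5, because the second difference of a constant or of a straight line vanishes. With measurement noise the two estimates of b would agree only approximately.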

Discussions

The investigation of spatial variations in FTIR microscopy images is a very important issue in many disciplines such as medicine, pharmaceutical science and food science. Our study is, to the best of our knowledge, the first report in which spatial variation together with sample/design information has been investigated for a large set of FTIR microscopic images in one data analysis using a semi-automatic approach. The approach includes standardisation of a whole image set with respect to effective optical path length (multiplicative effect) and additive effects such as baseline shifts. These effects are estimated by EMSC and can be visualised by imaging as shown for the effective optical path length. This allowed an estimation of the sample thickness of the tissue section.

A segmentation model to separate connective tissue spectra from myofibre spectra was established. For the sake of simplicity a PCA model with a threshold was used as segmentation model, which was sufficient for our purposes. In general any segmentation model can be built into the algorithm. A segmentation based on one single variable (the absorbance at 1,237 cm−1, referring to the sulfate stretch) may also be sufficient in our case, although it is (at least intellectually) inferior to the multivariate segmentation model based on PCA, since the multivariate model uses several bands showing different absorbances for connective tissue and muscle fibre cells.
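A PCA model with a threshold can be sketched in a few lines. The example is illustrative only: it assumes that the threshold is applied to the first-component score, orients the component with a hypothetical connective tissue example spectrum (the sign of a principal component is arbitrary), and uses synthetic three-channel spectra; only the threshold value 0.001 is taken from the text.

```python
import numpy as np

def fit_pca_segmenter(spectra, connective_example):
    """Return (mean, loading) of the first principal component, oriented so
    that the example connective tissue spectrum gets a positive score."""
    x = np.asarray(spectra, dtype=float)
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    loading = vt[0]
    if (np.asarray(connective_example, dtype=float) - mean) @ loading < 0:
        loading = -loading   # the sign of a principal component is arbitrary
    return mean, loading

def segment(spectra, mean, loading, threshold):
    """True where the first-component score exceeds the threshold."""
    scores = (np.asarray(spectra, dtype=float) - mean) @ loading
    return scores > threshold

# Synthetic spectra: 20 myofibre-like and 10 connective-tissue-like rows
rng = np.random.default_rng(0)
base = np.array([0.10, 0.20, 0.15])
shift = np.array([0.010, 0.000, 0.005])
myofibre = base + 1e-4 * rng.standard_normal((20, 3))
connective = base + shift + 1e-4 * rng.standard_normal((10, 3))
spectra = np.vstack([myofibre, connective])

mean, loading = fit_pca_segmenter(spectra, connective_example=base + shift)
is_connective = segment(spectra, mean, loading, threshold=0.001)
```

Storing `mean` and `loading` allows the same model and threshold to be applied to every image of a set, which is what makes the segmentation comparable between images.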

Our general approach is adapted to designed experiments in which many images are acquired with different tissue types on every image. We have shown how image information and experimental design information can be used in the analysis of the spectra. One important aspect of our approach is that we create a ‘design matrix’ that mixes experiment design information (animal and ageing time) and image information (position of pixels/spectra with respect to tissue type). We have investigated this new design matrix with respect to the image spectra using ANOVA PLSR, whereby we have shown how experimental design factors and image information influence different components. This adds to existing algorithms for FTIR images described in the literature, e.g. cluster analysis [2] and SIMCA [3], and allows the possibility to regress an underlying design on FTIR image spectra, whereby the design takes into account parameters describing spatial variation within the images. The use of PLSR allows, as with other methods that are based on latent variables, the use of correlation loading plots for the visualisation of correlations between variables/bands and design factors. In addition it allows the use of validation possibilities within the framework of PLSR.
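The construction of such a design matrix, together with a block-wise cross-validation split by animal, can be sketched as follows. The coding with one indicator column per factor level, the function names and the toy design are assumptions made for illustration; the PLSR fit itself is not shown.

```python
import numpy as np

def indicator_matrix(design):
    """Build a 0/1 design matrix with one column per factor level.
    `design` maps a factor name to the level of every sample."""
    columns, names = [], []
    for factor, values in design.items():
        for level in sorted(set(values)):
            columns.append([1.0 if value == level else 0.0 for value in values])
            names.append(f"{factor}={level}")
    return np.array(columns).T, names

def leave_one_animal_out(animals):
    """Yield (train_rows, test_rows) with all samples of one animal held out."""
    animals = np.asarray(animals)
    for animal in np.unique(animals):
        yield np.flatnonzero(animals != animal), np.flatnonzero(animals == animal)

# Toy design mixing sample information (animal, time) and image
# information (distance group); the values are hypothetical
design = {
    "animal":   [1, 1, 1, 1, 2, 2, 2, 2],
    "time":     [0, 0, 2, 2, 0, 0, 2, 2],
    "distance": [3, 5, 3, 5, 3, 5, 3, 5],
}
X, names = indicator_matrix(design)
folds = list(leave_one_animal_out(design["animal"]))
```

The matrix `X` would then be regressed against the matrix of selected peak absorbances, with each fold leaving out every sample of one animal at a time.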

Since according to the Rayleigh criterion the diffraction limit at 1,200 cm−1 is 12 μm [27] (the bands discussed with respect to spatial variation are above 1,200 cm−1 and consequently have a lower limit), the spectra referring to distances 1 and 2 were removed from the data set before the analysis by ANOVA PLSR. This ensured that only pixels with a minimum distance of 12 μm to the myofibre pixels (from pixel centre to pixel centre) were included in the data set. In addition the PCA model for the segmentation was chosen carefully so that pixel spectra referring to distance 1 are already well placed within the connective tissue region and do not visually overlap with the myofibres: the threshold (0.001) chosen for the segmentation using the PCA model is far from the compact population defining the myofibre tissue spectra in Fig. 4. Although we cannot completely exclude that myofibres contribute to the spectra referring to distance 3, we can conclude that the bands at 1,340 cm−1, 1,204 cm−1 and 1,283 cm−1 are somewhat increased in mid connective tissue regions, since they have a high correlation to the distance group 5/6 in the correlation loading plot in Fig. 7a. Dispersive line shape artefacts were observed earlier at edges of tissue sections and can appear near holes or cracks in tissue sections, e.g. when the connective tissue region is detached due to ageing [28]. The spectra of set B used for ANOVA PLSR (without distance 1 and distance 2 spectra) did not show any dispersive line shape artefacts, which was confirmed by visual inspection.

The relevance of bands that were identified to be important using ANOVA PLSR was confirmed by visualising them as chemical images, where they showed the gradients that were predicted by ANOVA PLSR. Among the parameters that turned out to be important for the design mixing spatial and sample information, the sulfate stretch at 1,237 cm−1 was shown to increase from the edge to the mid area of the connective tissue regions. The same was true for bands possibly referring to collagen. This shows that the collagen network has a higher density in mid-connective tissue regions. This higher density is not connected to the effective optical path length, since the image spectra are corrected for this sample thickness using EMSC. We have also observed that the bands associated with proteoglycans increase until ageing time day two. We have earlier shown that proteoglycan breaks down early postmortem [14]. Probably a breakdown of the proteoglycan network leads to a situation where the sulfate stretch in proteoglycans is more easily accessible for infrared absorption, and thus to an increase of absorbance after 2 days of ageing.

In general we can state that the most prominent chemical changes of connective tissue infrared absorption occur within the first 2 days of ageing. This supports our earlier findings, in which we showed that significant changes in proteoglycans take place at a very early stage postmortem [14].

The algorithm presented in this paper is available and can be requested from the corresponding author.

Acknowledgement R. Ofstad and A. Kohler are grateful for financial support by the Norwegian Research Council through grant No. 153381/140. The authors thank V. Høst for fine collaboration and Ulrike Böcker and Ganesh Sockalingum for useful discussions.

References

1. Mendelsohn R, Chen HC, Rerek ME, Moore DJ (2003) J Biomed Opt 8:185–190

2. Lasch P, Haensch W, Naumann D, Diem M (2004) Biochim Biophys Acta 1688:176–186

3. Krafft C, Shapoval L, Sobottka SB, Geiger KD, Schackert G, Salzer R (2006) Biochim Biophys Acta 1758:883–891

4. Liu K-Z, Dixon IMC, Mantsch HH (1999) Cardiovasc Pathol 8:41–47

5. Miller LM, Novatt JT, Hamerman D, Carlson CS (2004) Bone 35:498–506

6. Boskey AL, Mendelsohn R (2005) Vibrational spectroscopy. A collection of papers presented at the 3rd international conference “shedding light on disease: optical diagnostics for the new millennium”. SPEC 2004, Newark, NJ, USA, 19–23 June 2004, vol 38, pp 107–114

7. Paschalis EP, Verdelis K, Doty SB, Boskey AL, Mendelsohn R, Yamauchi M (2001) J Bone Miner Res 16:1821–1828

8. Paschalis EP, Recker R, DiCarlo E, Doty SB, Atti E, Boskey AL (2003) J Bone Miner Res 18:1942–1946

9. Kirschner C, Ofstad R, Skarpeid HJ, Høst V, Kohler A (2004) J Agric Food Chem 52:3920–3929

10. Kohler A, Kirschner C, Oust A, Martens H (2005) Appl Spectrosc 59:707–716

11. Martens H, Nielsen JP, Engelsen SB (2003) Anal Chem 75:394–404

12. Bonnier F, Rubin S, Venteo L, Krishna CM, Pluot M, Baehrel B, Manfait M, Sockalingum GD (2007) Biochim Biophys Acta (in press)

13. Eggen KH, Ekholdt WE, Host V, Kolset SO (1998) Basic Appl Myol 8:159–168

14. Hannesson KO, Pedersen ME, Ofstad R, Kolset SO (2003) J Muscle Foods 14:301–318

15. Hildrum KI, Nilsen BN, Mielnik M, Naes T (1994) Meat Sci 38:67–80

16. Helm D, Labischinski H, Naumann D (1991) J Microbiol Methods 14:127–142

17. Martens H, Jensen SA, Geladi P (1983) Skagenkaien 12. Stokkand Forlag, Stavanger, Norway, pp 205–233

18. Geladi P, McDougall DHM (1985) Appl Spectrosc 39:491–500

19. Martens H, Stark E (1991) J Pharm Biomed Anal 9:625–635

20. Savitzky A, Golay MJE (1964) Anal Chem 36:1627ff

21. Martens H, Martens M (2001) Multivariate analysis of quality: an introduction. Wiley, Chichester, UK

22. Bertram HC, Kohler A, Ofstad R, Böcker U, Andersen HJ (2006) J Agric Food Chem 54:1740–1746

23. Ofstad R, Hannesson K, Enersen G, Høst V, Hildrum KI (2005) Proceedings of the 51st international congress of meat science and technology, Baltimore, Maryland, USA, 7–12 August 2005

24. Camacho NP, West P, Torzilli PA, Mendelsohn R (2001) Biopolymers 62:1–8

25. Bi X, Yang X, Bostrom MPG, Camacho NP (2007) Biochim Biophys Acta (in press)

26. Jackson M, Choo L-P, Watson PH, Halliday WC, Mantsch HH (1995) Biochim Biophys Acta 1270:1–6

27. Lasch P, Naumann D (2006) Biochim Biophys Acta 1758:814–829

28. Romeo M, Diem M (2005) Vibrational spectroscopy. A collection of papers presented at the 3rd international conference “shedding light on disease: optical diagnostics for the new millennium”. SPEC 2004, Newark, NJ, USA, 19–23 June 2004, vol 38, pp 129–132
