classical least square


Upload: iabureid7460

Post on 04-Jun-2018


Page 1: Classical Least Square

8/13/2019 Classical Least Square

http://slidepdf.com/reader/full/classical-least-square 1/39

Classical Least Square

One of our goals in writing this series of columns is to explicate, in words of one syllable or less,

the inner workings of the various algorithms that are commonly used for analyzing data, and

especially those used for multivariate calibration. In (relatively) recent times, we have presented

the math behind multiple linear regression (MLR), also sometimes known as inverse least squares (ILS), inverse Beer's law, and sometimes as the P-matrix calibration algorithm (1), and

the math behind principal component analysis (2–8).

Our next victim for this type of analysis is the algorithm known as classical least squares (CLS),

also known as direct least squares, the Beer's law method, and also as the K-matrix calibration

algorithm, although use of this last term (as well as the corresponding "P-matrix") is

discouraged, due to confusion caused by the plethora of names for the same methodology.

While not as widely used in practice as the other algorithms mentioned (and others yet to be

discussed), this method has the advantage of being more closely related to the way chemists

and spectroscopists think about spectra, rather than how mathematicians and statisticians think

about spectra. It turns out, by the way, that along the way to understanding CLS ourselves, we

came across some rather interesting consequences of applying the CLS algorithm to calibration

data. We will discuss all this eventually, in due course, as we explain our approach to

understanding the algorithm.

We'll begin approaching this by presenting some algebraic equations, describing the two

approaches. Because we are talking about Beer's law, we begin with the equation for Beer's law:

$$A_i = a_i b c_i \qquad (1a)$$

Where

$A_i$ is the total absorbance of the ith component in the mixture

$a_i$ is the absorbance of the pure ith component

$b$ is the pathlength through the sample

$c_i$ is the concentration of the pure ith component

Equation 1a applies wavelength-by-wavelength, and tells us that at any given wavelength, the absorbance (A) is proportional to the absorptivity (a) of a material at the chosen wavelength, the pathlength of the light through the material (b), and the concentration of the material (c). The absorptivity (a), of course, is the intrinsic property of a molecule that varies with wavelength and constitutes the "spectrum" of that

molecule. Because the pathlength is, in practice, often a quantity fixed by the cell that


contains the sample, and is certainly the same for all components of a sample being

measured, it is convenient to combine it with a and consider the product ab as the

quantity we are measuring. This product was represented with the symbol K in older literature; hence the origin of the "K matrix" nomenclature.

Equation 1a applies to a single component in a sample. When there are multiple absorbing components, the total absorbance is the sum of the absorbances of all the

absorbing materials at the wavelength of interest. Because this happens at every

wavelength, we also speak of the spectrum of a mixture, that is, the absorbance at every wavelength, as being the sum of the spectra of the components of the mixture, each one

weighted by its concentration. The total absorbance at any wavelength is the sum of the

absorbance, at that wavelength, of all the components in the mixture, for example:

$$A = \sum_i a_i b c_i \qquad (1b)$$

Equation 1 was derived for, and therefore applies to, measurements made using

transmission measurement geometry, when the sample is a clear (that is, nonscattering)

liquid. The presence of optical scattering enormously complicates the situation, to the

point where it is still considered an unsolved problem despite the extensive efforts of many scientists over the years (see references 9–14 for typical examples). We do not

currently consider this situation in our column.

In this form, we have previously described how the relationship between the absorbance and analyte concentration can be found using least squares calculations (1), that is, the

MLR (under any of its names, as described earlier) algorithm.

An example of this form of addition of spectra is illustrated in Figure 1, where we present

the spectra of water, methanol, acetic acid, and a mixture of these three components. The

particular mixture shown contains (by weight) 25% water, 25% methanol, and 50% acetic acid.
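The additivity described above can be sketched numerically. The following is a minimal illustration using synthetic Gaussian bands rather than the measured spectra of Figure 1; the band positions, widths, wavelength range, and the helper name `gaussian_band` are assumptions made only for the example.

```python
import numpy as np

def gaussian_band(wavelengths, center, width, height=1.0):
    """Synthetic absorption band (hypothetical stand-in for a measured spectrum)."""
    return height * np.exp(-0.5 * ((wavelengths - center) / width) ** 2)

# Wavelength axis in nm (arbitrary range chosen for illustration)
wavelengths = np.linspace(1100.0, 2500.0, 701)

# Rows: synthetic "pure component" spectra with the pathlength folded in (K = ab)
pure_spectra = np.vstack([
    gaussian_band(wavelengths, 1940.0, 40.0),   # stand-in for water
    gaussian_band(wavelengths, 2270.0, 60.0),   # stand-in for methanol
    gaussian_band(wavelengths, 1700.0, 80.0),   # stand-in for acetic acid
])

# Weight fractions of the mixture described in the text: 25%, 25%, 50%
concentrations = np.array([0.25, 0.25, 0.50])

# Beer's law additivity: at each wavelength the mixture absorbance is the
# concentration-weighted sum of the pure-component absorbances
mixture_spectrum = concentrations @ pure_spectra
```

Each element of `mixture_spectrum` is the three-term sum at one wavelength, so the whole mixture spectrum comes from a single vector-matrix product.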


Both the CLS and ILS algorithms apply least squares calculations to the spectral data.

The difference between the CLS and ILS approaches lies in how we treat the spectral

data. Because we have presented the ILS treatment previously, we now present the corresponding treatment according to the CLS methodology. This methodology tells us

that, as shown in equation 1, at every wavelength the total absorbance (A) equals the

sum of the contributions from each component, weighted by their concentrations.

We begin by expanding the summation in equation 1b. We assume that we are working

with a three-component mixture; this suffices to show how multiple components can be handled, without the equations becoming completely unwieldy, and this is the example

we will use:

$$A = a_1 b c_1 + a_2 b c_2 + a_3 b c_3 \qquad (1c)$$

To simplify the notation, we note that the pathlength (b) is, of course, the same for all components in a mixture, and we assume that it is also constant for all samples of interest to us. This permits us to assume a unit of pathlength measurement that allows us to set the pathlength to 1 (unity), so that equation 1c becomes:

$$A = a_1 c_1 + a_2 c_2 + a_3 c_3 \qquad (1d)$$

The Mathematics Behind the CLS Method

Equation 1d is our starting point for further discussion. This equation is valid for a single

wavelength of a single sample. The CLS algorithm is based upon applying equation 1d to

all the wavelengths in the spectral range of interest. Equation 1d becomes equation 2:

$$A_j = a_{1j} c_1 + a_{2j} c_2 + a_{3j} c_3 \qquad (2)$$

Where:

$A_j$ is the absorbance at the jth wavelength

$a_{1j}$ is the absorptivity of the pure material 1 at wavelength j

$c_1$ is the concentration of the pure material 1 (and $c_2$ and $c_3$ similarly)


Knowing $a_{1j}$, $a_{2j}$, and $a_{3j}$ at all wavelengths j (where j goes from 1 to n, the number of wavelengths in the spectrum) means knowing the spectra of the pure components making up the mixture. If we know the spectra of those pure components, then we can set up a least squares computation this way, similar to other least squares computations we have derived.

For a given known set of spectra $a_{1j}$, $a_{2j}$, $a_{3j}$, we want to find the concentrations $c_1$, $c_2$, $c_3$ that best determine the mixture absorbance A. Therefore, we define $E_j$ as the error in the determination of the value of $A_j$ for the jth wavelength:

$$E_j = A_j - a_{1j} c_1 - a_{2j} c_2 - a_{3j} c_3 \qquad (3)$$

The "least square" principle means that we want to minimize the sum-squared error in the reproduction of the values of $A_j$ over all n wavelengths. The next step, therefore, is to set up the sum-squared error:

$$\sum_{j=1}^{n} E_j^2 = \sum_{j=1}^{n} \left( A_j - a_{1j} c_1 - a_{2j} c_2 - a_{3j} c_3 \right)^2 \qquad (4)$$

Equation 4 defines the sum-squared error. Then we minimize the sum-squared error by the usual procedure of taking the derivative and setting it to zero. In this case, because we want to find the concentrations that give the least square estimation of the absorbances, the derivatives are taken with respect to the three concentrations, $c_1$, $c_2$, and $c_3$. Thus, starting by taking the derivative with respect to $c_1$, for example, we get:

$$\frac{\partial}{\partial c_1} \sum_{j=1}^{n} E_j^2 = -2 \sum_{j=1}^{n} a_{1j} \left( A_j - a_{1j} c_1 - a_{2j} c_2 - a_{3j} c_3 \right) \qquad (5)$$


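Similarly, the derivatives with respect to $c_2$ and $c_3$ are

$$\frac{\partial}{\partial c_2} \sum_{j=1}^{n} E_j^2 = -2 \sum_{j=1}^{n} a_{2j} \left( A_j - a_{1j} c_1 - a_{2j} c_2 - a_{3j} c_3 \right) \qquad (6)$$

$$\frac{\partial}{\partial c_3} \sum_{j=1}^{n} E_j^2 = -2 \sum_{j=1}^{n} a_{3j} \left( A_j - a_{1j} c_1 - a_{2j} c_2 - a_{3j} c_3 \right) \qquad (7)$$

Setting all these derivatives to zero and dividing each equation by 2 gives us, for i = 1, 2, 3,

$$-\sum_{j=1}^{n} a_{ij} \left( A_j - a_{1j} c_1 - a_{2j} c_2 - a_{3j} c_3 \right) = 0 \qquad (8)$$

Distributing the summations and multiplying through by the $a_{ij}$, we get

$$-\sum_{j} a_{ij} A_j + c_1 \sum_{j} a_{ij} a_{1j} + c_2 \sum_{j} a_{ij} a_{2j} + c_3 \sum_{j} a_{ij} a_{3j} = 0 \qquad (9)$$

Finally, moving the first term in each equation to the other side:

$$c_1 \sum_{j} a_{ij} a_{1j} + c_2 \sum_{j} a_{ij} a_{2j} + c_3 \sum_{j} a_{ij} a_{3j} = \sum_{j} a_{ij} A_j, \quad i = 1, 2, 3 \qquad (10)$$

With known spectra for the three components, $a_1$, $a_2$, and $a_3$, and a measured spectrum for A, the unknown variables in equation 10 are the three concentrations, $c_1$, $c_2$, and $c_3$. Once the data (that is, the spectral values) are plugged into the expressions represented by equation 10, the equations can be solved, by considering them as simultaneous equations, for the three values of c.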


Alternatively, the equations can be converted into a matrix expression, in which equation

10 becomes

which, when solved for the concentration

[c], is


Put into this form, equation 12 looks an awful lot like the MLR equation, and even the

whole derivation of it (1). Anyone reading the above explanations will be (or at least should be) asking themselves, "So what's the difference between the 'inverse' least squares and this 'classical' (or 'direct' least squares) method? They're both least squares, aren't they?"
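As a concrete illustration of the matrix solution, here is a minimal sketch in Python with NumPy. It assumes the pure-component spectra are stored as the columns of an array `K` (wavelengths by components), which is one possible layout and not necessarily the one used in equation 12; the synthetic spectra and all variable names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

n_wavelengths = 200
# Columns of K: synthetic pure-component spectra (hypothetical data, pathlength = 1)
K = rng.uniform(0.0, 1.0, size=(n_wavelengths, 3))

true_c = np.array([0.25, 0.25, 0.50])
# Mixture spectrum from Beer's law, plus a little measurement noise
A = K @ true_c + rng.normal(0.0, 1e-3, size=n_wavelengths)

# Normal equations (K^T K) c = K^T A: the matrix form of the three
# simultaneous equations obtained by setting the derivatives to zero
c_normal = np.linalg.solve(K.T @ K, K.T @ A)

# The same least-squares estimate from a general-purpose solver
c_lstsq, *_ = np.linalg.lstsq(K, A, rcond=None)
```

The two estimates agree to rounding error; the general-purpose solver is usually preferred numerically because it avoids forming `K.T @ K` explicitly.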

The answer is "Yes, but . . .". The matrix equations are the same, and the computations

are the same. The differences lie in the meanings of the variables that the equations

represent, and therefore, the data that go in the computations specified by those equations.

In both formulations, the vector [c] represents the concentrations of the various components of the samples of interest. In the "least squares" equations for MLR, those

concentrations are considered "known" because they have been measured by some other,

external "reference lab" method. In CLS, the concentrations are unknown, and are

computed as the result of the least squares computations themselves. In fact, this

application of CLS is virtually an "absolute" computation. We compute the concentration of the components of a mixture based solely upon first principles, that is, the principle of

absorbance being proportional to concentration, in accordance with Beer's law.

In both formulations, the spectra of mixture samples are measured. In MLR, those spectra

are related to the reference laboratory values by the least squares calculations. In CLS, those spectra are related to the spectra of the pure components; no reference laboratory

values are used.

In CLS, spectra of the mixture components in pure form are measured. No corresponding

measurements are made for the MLR algorithm. In a sense, the spectra of the pure

components "replace" the reference laboratory results, and by virtue of their being "pure," serve as "absolute" references for the computations.

So in summary, concentration information about mixture components, as well as spectra of the mixtures, are involved, each in their own way, in both algorithms. But each

algorithm also requires a piece of data that the other does not. MLR requires external

"Reference Laboratory" values for the concentrations, while CLS requires spectra of the pure mixture components. Indeed, by properly organizing the data, the same software that

is used for MLR calibrations can be used for CLS calculations as well. One simply needs

to keep track of what data has been used for the different variables in the software, and

which ones contain the results.

The evaluation of the CLS algorithm historically has been similar to the MLR algorithm:

How well does it predict? The difference lies in what it is predicting. In the case of the CLS algorithm, the question is how well does it predict the spectrum of the mixture?

We applied the CLS algorithm to the spectrum of the 25% water, 25% methanol, 50% acetic acid mixture shown in Figure 1, and then calculated the predicted value of absorbance at

each wavelength (the predicted spectrum). Figure 2 presents the plot of the mixture


spectrum reproduced from the calculated CLS concentrations. We observe that, while the

main features of the mixture spectrum are reproduced, overall, the recreation of the

spectrum is indifferently good, at best. We will return to this data and re-examine the underlying causes of this less-than-stellar performance, after examining some more

aspects of CLS theory in our next column.
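The evaluation described here, reproducing the mixture spectrum from the computed concentrations and inspecting the misfit, can be sketched as follows. The data are synthetic, and the deliberate distortion added to the mixture (a stand-in for any departure from ideal Beer's law behavior) is an assumption of the example, not a claim about what caused the misfit in Figure 2.

```python
import numpy as np

rng = np.random.default_rng(1)
K = rng.uniform(0.0, 1.0, size=(150, 3))        # columns: synthetic pure spectra
ideal = K @ np.array([0.25, 0.25, 0.50])         # ideal Beer's law mixture

# Hypothetical non-ideality: a smooth distortion the pure spectra cannot model
distortion = 0.05 * np.sin(np.linspace(0.0, 3.0 * np.pi, 150))
measured = ideal + distortion

# CLS concentrations, then the spectrum "predicted" from them
c_hat, *_ = np.linalg.lstsq(K, measured, rcond=None)
predicted = K @ c_hat

# Misfit between the measured and the reproduced spectrum
residual = measured - predicted
rms_residual = np.sqrt(np.mean(residual ** 2))
print("estimated concentrations:", c_hat)
print("RMS spectral residual:", rms_residual)
```

Plotting `measured` and `predicted` on the same axes gives the kind of comparison the text describes, and the residual spectrum shows where the reproduction fails.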

References 

(1) H. Mark and J. Workman, Spectroscopy 21(5), 34–38 (2006).

(2) H. Mark and J. Workman, Spectroscopy 22(9), 20–29 (2007).

(3) H. Mark and J. Workman, Spectroscopy 23(2), 30–37 (2008).

(4) H. Mark and J. Workman, Spectroscopy 23(5), 14–17 (2008).

(5) H. Mark and J. Workman, Spectroscopy 23(6), 22–24 (2008).

(6) H. Mark and J. Workman, Spectroscopy 23(10), 24–29 (2008).

(7) H. Mark and J. Workman, Spectroscopy 24(2), 16–26 (2008).

(8) H. Mark and J. Workman, Spectroscopy 24(5), 14–15 (2009).

(9) A. Schuster, Phil. Mag. 5, 243 (1903).

(10) A. Schuster, Astrophys. J. 21, 1 (1905).

(11) P. Kubelka, F. Munk, Z. Techn. Physik 12, 593 (1931).

(12) G. Kortüm, Reflectance Spectroscopy: Principles, Methods, Applications, 1st ed. (Springer-Verlag, New York, 1969).

(13) W.W. Wendlandt and H.G. Hecht, Reflectance Spectroscopy (John Wiley and Sons, Hoboken, New Jersey, 1966).

(14) D.J. Dahm and K.D. Dahm, Interpreting Diffuse Reflectance and Diffuse Transmittance: A Theoretical Introduction to Absorption Spectroscopy of Scattering Materials, 1st ed. (IM Publications, West Sussex, UK, 2007).


A key point to note in equation 16 is that by measuring the spectrum of the pure material

in the same cell we measured the spectrum of the mixture in, the pathlength of the cell

dropped out of the equation (even if we hadn't previously set it to unity), and we are left

with the fact that the concentration of the analyte is simply the ratio of the absorbance of

the sample to the absorbance of the pure material.

The complicated-looking matrix equation shown as equation 12 (1) is in fact very similar

to equation 16; it is simply the extension of this derivation to the case of multiple

wavelengths, and the inclusion of the knowledge that the concentration of a given

material is the same regardless of the wavelength we make the measurements at, so that

we can use any or all wavelengths available for the computation.

For our current discussion, however, what's important is that of the three quantities in

equation 16, two are known and one is unknown. Here, the two known quantities are a 

(the spectrum of the pure material) and A (the spectrum of the sample). This allowed us

to solve for the third quantity, the concentration of the analyte.

Taking the same equation:

we now ask the same question as we asked at the beginning of this column: what if we

don't know the spectrum of the pure analyte a?

That's OK. As we learned, again in our famous "high school algebra," as long as we know the values of all the other variables, we can calculate the one we want to know. In

terms of equation 16 we can answer the question this way: if we solve equation 16 for a,

the absorbance of the pure analyte:

We then note that if we measure the spectrum A of the analyte in the sample, and we also have the auxiliary information that specifies the concentration of the components in the sample, we can calculate a, the spectrum of the pure analyte. We could then use that spectrum as the spectrum to "plug into" equation 16 to do future analyses in which we want to know the concentration, just as if we had measured the spectrum of the pure material directly.
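The single-wavelength reasoning can be sketched in a few lines. The relation used is the one the text describes in words (concentration as the ratio of the sample absorbance to the pure-material absorbance, both measured in the same cell); the numbers and variable names are invented for illustration.

```python
# Hypothetical absorbance readings at one wavelength, same cell for both
absorbance_pure = 0.80      # pure analyte (concentration fraction = 1)
absorbance_sample = 0.20    # sample containing the analyte

# Known pure spectrum -> solve for the concentration
concentration = absorbance_sample / absorbance_pure

# Reverse problem: known concentration -> solve for the pure absorbance
known_concentration = 0.25
estimated_pure = absorbance_sample / known_concentration

# The estimated pure absorbance can then be reused on a new sample
new_sample_absorbance = 0.40
new_concentration = new_sample_absorbance / estimated_pure
```

The same relation is used in both directions; only the choice of which quantity is treated as unknown changes.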

Of course this is an idealization of the situation. In equations 16 and 17 we have assumed

that the "spectrum" consists of the absorbance at a single wavelength, and we have also assumed that the analyte is the only absorbing component in the sample mixture.


To take into account the possibility of multiple absorbing components in a sample, we

take another look at equation 2 (1), which includes the spectral contributions of multiple

components:

Here, we cannot simply solve equation 2 for any of the unknown values of a, the

absorbance of the pure materials. As we also learned in high-school algebra, when there

are multiple unknowns in an equation, their values are undetermined because there are many (indeed, infinitely many) possible combinations of values that will produce the

values $A_j$.

What to Do?

Here again we learned in high school algebra what to do. If we have some number of unknowns in our equation (let's say m unknowns; in equation 2, m = 3), then we need, as

a minimum, that same number of equations, to be able to solve them and obtain a unique

solution. Since, for example, there are three unknown quantities, $a_1$, $a_2$, and $a_3$ in

equation 3, then we need two more equations, to create a set of three simultaneous

equations that we need in order to be able to compute unique values for the absorptivities

of those three pure materials. Therefore we first rewrite equation 2, and include as a

subscript, the fact that this is the first of three equations

We then add two more equations, representing the values corresponding to two more mixtures that might be used:

As general mathematical structures, these equations are satisfactory, but we also need to

know how they relate to the problem we have set out for ourselves. Notice that $a_1$, $a_2$, and $a_3$ are the same in all three equations, as they must be if they represent the absorbances of pure materials. As we saw above, these are the unknowns in the simultaneous algebraic equations represented by equations 18, which are what we want to

solve for.


The quantities $A_{1,j}$, $A_{2,j}$, and $A_{3,j}$ are the total absorbances of three mixtures that we need

to use for our data set. In general they will have different values, although under some

special circumstances two, or even all three of them, may be the same.

The quantities $c_{i,j}$ represent the concentrations of the three materials that comprise each

of the three samples. As we saw in equation 18 and the subsequent discussion, along with the total absorbances of the mixtures, these concentrations are the algebraic "knowns" in

the equations.

To solve the equations, the concentrations of the various components of the samples must

have certain properties. In general, they must differ between the three samples. And while

the concentrations need not all be orthogonal, it is extremely important that they not be linearly related, between any pair of samples, or between all three samples. For example,

the concentrations of the components in sample 2 must not be a constant multiple of their

concentrations in sample 1. We make the same requirement for the concentrations of

components in samples 1 and 3, as well as for samples 2 and 3. Also, the sum (or

difference) of the concentrations in one sample should not be the sum of their concentrations in the other two.
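One way to check these conditions numerically is to test the rank of the concentration matrix. The sketch below assumes one row per sample and one column per component; the function name `concentrations_usable`, the tolerance, and the example values are all hypothetical.

```python
import numpy as np

def concentrations_usable(C, tol=1e-8):
    """Return True if the concentration matrix C has full rank.

    C has one row per sample and one column per component
    (an assumed layout for this example).
    """
    C = np.asarray(C, dtype=float)
    return np.linalg.matrix_rank(C, tol=tol) == min(C.shape)

# Three samples whose concentrations are not linearly related
good = [[0.25, 0.25, 0.50],
        [0.50, 0.25, 0.25],
        [0.25, 0.50, 0.25]]

# Sample 3 is the sum of samples 1 and 2
bad_sum = [[0.10, 0.20, 0.30],
           [0.20, 0.10, 0.10],
           [0.30, 0.30, 0.40]]

# Sample 2 is a constant multiple (twice) of sample 1
bad_multiple = [[0.10, 0.20, 0.30],
                [0.20, 0.40, 0.60],
                [0.30, 0.10, 0.20]]

for name, C in [("good", good), ("bad_sum", bad_sum), ("bad_multiple", bad_multiple)]:
    print(name, concentrations_usable(C))
```

A rank test catches exact dependence; for nearly dependent samples, a large condition number (for example from `np.linalg.cond`) is the more informative warning sign.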

If all those conditions hold, then it is possible to solve equations 18a – c by ordinary

algebraic means, and explicitly calculate the values of $a_1$, $a_2$, and $a_3$. It is not our purpose here, however, to illustrate the algebraic solution to simultaneous equations. Besides the fact that we have all done that in high school, doing it in a spectroscopic context has also been illustrated in the context of the development of MLR (in reference 2). What we will do here, though, is to quickly run through a simplified version of the mathematical

development by transitioning from algebraic equations to their matrix representation.

Equations 18a – c can be written in matrix form as:

where, as usual, the brackets indicate a matrix, and the symbols A, c, and a have the same

interpretations as in the algebraic equation 3. Because in the matrix equation, as in the

algebraic equation, A and c are the "known" values, we solve equation 10 for the

unknown quantities, the spectra of the pure components, which are represented by [a] by

first multiplying both sides of equation 10 by the inverse of matrix [c], that is, $[c]^{-1}$:

Since any matrix multiplied by its inverse results in the unit matrix, we obtain


Equation 21 does not represent a least-squares solution, however. It is merely the result of

solving simultaneous equations, and as is known from our previous work with MLR calibration, using the solution to simultaneous equations has some limitations. The chief limitation is the fact that the results are completely dependent upon the values used for the various individual numbers comprising [A] and [c]. Small errors in any of the data can have large effects on the results. The effects can be exacerbated if any of the data values are collinear, as we discussed earlier.

To reduce the effects of any of these problem issues, we proceed as we have done several

times previously. We use more than the minimum number of required samples, and use least squares calculations to determine the values of the unknowns that minimize the

(sums of squares of the) errors.

Here, we have known values for the absorbance of the mixtures and for the

concentrations, and wish to solve for the absorbance of the pure components. We present these in Table I, for comparison with what we have seen previously.


As we see, the layout in Table I is the same as in our previous least square situations and

therefore the least-squares development can be done similarly to the way it was done

before, too. The only difference is that in Table I, the concentrations are different for each sample, while the absorbances are the constants of the equation, whose values are to be determined using least squares. Without going through all the gory mathematical details, we can note that we are trying to find values for the $a_i$, the absorbances of the pure-component spectra, that will allow us to best compute $A_j$, the absorbance of the mixture, for each sample. Because there is presumably some error in each measurement, we want to find the values of the $a_i$ that provide the smallest sum of squares of the error in the computation of the $A_j$. Thus we compute $e_j$ from the following expression:

We then follow the usual procedure of squaring the errors of each sample, summing the

squared errors over all the samples, then taking the derivative of the sum with respect to

the three (in this example) terms $a_i$, exactly the same way we did in reference 1, except

using the appropriate data values.
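The least-squares step at a single wavelength can be sketched as follows. This is an illustration with synthetic data, not the column's Table I: the 12 mixture compositions, the assumed "true" pure absorbances, and the noise level are all invented, and the layout (one row per sample, one column per component) is an assumption.

```python
import numpy as np

rng = np.random.default_rng(2)

# 12 hypothetical mixtures of 3 components; each row sums to 1
C = rng.dirichlet(alpha=[1.0, 1.0, 1.0], size=12)

# Assumed "true" pure-component absorbances at one wavelength
a_true = np.array([0.9, 0.3, 0.5])

# Measured mixture absorbances with a small random error
A = C @ a_true + rng.normal(0.0, 0.005, size=12)

# Least-squares estimate of the pure absorbances (no intercept term)
a_hat, *_ = np.linalg.lstsq(C, A, rcond=None)

# The errors e_j of the text, one per sample, and their sum of squares
errors = A - C @ a_hat
sum_squared_error = float(errors @ errors)
```

By construction, no other choice of the three absorbances gives a smaller `sum_squared_error` for these data, including the values used to generate them.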

When this is done, and converted to matrix expressions, the results are similar to those

we found for the CLS computation when we know the spectra of the pure materials, and

also to the expression for MLR calibration. We illustrate the comparison by rewriting the

matrix equations below in Table II for comparison.

Something we sort of swept under the rug, so to speak, is that all the earlier equations,

from equation 17 to equation 22, refer to the values of absorbance (of samples and of

pure materials) at a single wavelength. This is clear in equation 2, and perhaps even in some of the subsequent discussion. Once we get to equation 18, however, a reader might

think that the multiple values of $a_i$ represent absorbances at different wavelengths, as is

often the case in chemometrics applied to spectroscopy. Multiple absorbances in an

equation are usually the absorbances of a substance at different wavelengths, this being

the most common interpretation of "multivariate" in a spectroscopic context.


In this particular case, however, that's not so. The multivariate nature of the data in equations 18–22, as in Table II, is that the rows represent different samples, while the

columns represent, not different wavelengths, but different materials; the different materials are all measured at the same wavelength, for every sample. The differences in

absorbance arise from the fact that different materials will, in general, have different

absorptivities at any given wavelength.

We demonstrate the application of this methodology in Figure 1, starting with a set of samples made from water, methanol, and acetic acid according to a mixture design (the details of which will be presented in a subsequent column) specifying 12 mixtures (and the three pure materials). The spectra of the 12 mixtures are used to compute the

spectra of the pure materials, as specified by the equation in Table II; then these can be

compared with the actual spectra of the pure materials.

The computation of the spectrum of the pure materials, therefore, arises out of the fact

that we can apply equation 22 separately to each wavelength in the spectrum of interest.

Computationally this might seem very inefficient, but in the computational form shown in Table II, the quantity $[C^T][CC^T]^{-1}$ depends only on the concentrations, which don't change when you do the computation for a different wavelength. That matrix product therefore need only be computed once and then multiplied by [A], the absorbance matrix for each mixture, to compute the values [a] of the various pure-material spectra at that wavelength.
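The reuse of the concentration-only matrix product across wavelengths can be sketched like this. The layout is an assumption chosen to match the form $[C^T][CC^T]^{-1}$ quoted in the text (`C` as components by samples, `A` as wavelengths by samples); the data are synthetic and noise-free so the recovery can be checked exactly.

```python
import numpy as np

rng = np.random.default_rng(3)
n_samples, n_components, n_wavelengths = 12, 3, 50

# C: components x samples (assumed layout); each sample's fractions sum to 1
C = rng.dirichlet(alpha=[1.0, 1.0, 1.0], size=n_samples).T

# Synthetic pure spectra (wavelengths x components) and noise-free mixture spectra
pure_true = rng.uniform(0.0, 1.0, size=(n_wavelengths, n_components))
A = pure_true @ C                      # wavelengths x samples

# Depends only on the concentrations, so it is computed a single time
P = C.T @ np.linalg.inv(C @ C.T)       # samples x components

# One matrix product then handles every wavelength at once;
# row j of pure_est is the single-wavelength solution for wavelength j
pure_est = A @ P
```

In practice a solver such as `np.linalg.lstsq` is usually a safer choice than an explicit inverse; the explicit form is kept here only to mirror the expression in the text.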

Two caveats are needed for performing these calculations. The first caveat is how the

"concentrations" should be expressed. While we will eventually have a lot more to say

about the values that concentrations should be expressed in, for now, we will note that the concentrations should be expressed as fractions (or proportions) of the total, rather than

percent, so that the sum of the concentrations of all the components in the sample equals unity, rather than 100%.

The second caveat is, in a sense, a result of the first caveat. If we perform a regression

where, for every sample, the sum of the components adds to a constant value (as in this case, where they all add to unity), we will quickly find that the matrix to be inverted ($[CC^T]$) is singular (that is, during the course of the computations a division by zero situation will be encountered). The way to deal with this situation is to eliminate one of the coefficients from the computations. The quick and dirty way (in other words, the

wrong way) would be to eliminate one of the concentrations from the set of equations.

However, that is, as stated, the "wrong" way because it would then ignore the contribution

of that sample component to the calculations. The right way to do the calculation is to eliminate the constant term ($a_0$) from the computations. Draper and Smith (3)

demonstrate (on p. 412) how the regression equations need to be modified to

accommodate this calculation.
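The singularity, and the effect of dropping the constant term, can be seen in a small numerical sketch. The mixture compositions are synthetic, the layout (samples by components) is an assumption, and the explicit rank tolerance is a choice made for the example.

```python
import numpy as np

rng = np.random.default_rng(4)

# 12 hypothetical mixtures, samples x components, each row summing to 1
C = rng.dirichlet(alpha=[1.0, 1.0, 1.0], size=12)

# Design matrix with a constant term a_0: the column of ones equals
# the sum of the three concentration columns, so the columns are dependent
X_with_intercept = np.column_stack([np.ones(len(C)), C])
rank_with = np.linalg.matrix_rank(X_with_intercept, tol=1e-10)
cond_with = np.linalg.cond(X_with_intercept.T @ X_with_intercept)

# Dropping the constant term keeps all three components in the model
rank_without = np.linalg.matrix_rank(C, tol=1e-10)
cond_without = np.linalg.cond(C.T @ C)

print("with constant term:    rank", rank_with, "condition number", cond_with)
print("without constant term: rank", rank_without, "condition number", cond_without)
```

With the constant term included, the four-column design matrix has only three independent columns, so the matrix to be inverted is singular to within rounding; without it, the inversion is well behaved.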

Figure 1 shows that reasonable representations of the pure component spectra are

obtained by this means. A more critical view of the spectra can be obtained by comparing them to the original, measured, pure component spectra, which we do in Figure 2. The


three parts of Figure 2 show the comparisons of the reproduced spectra with the original

measured pure water, methanol and acetic acid spectra. We see that, as with the

reproduction of mixture spectra in part I of this column, while the reproductions are "about right," a more critical look at the comparison reveals appreciable flaws in the

results. In particular, we see that the calculated methanol spectrum in Figure 2b has a

peak at roughly 1940 nm, corresponding to the water peak at that wavelength, as does the calculated acetic acid spectrum; neither of the actual spectra has that peak.


While the mathematics of CLS calibration is not inherently more difficult than the

development of some of the other, more common, calibration methods, it is rarely used in

modern chemometric practice. The reasons for this are not theoretical, but practical. While the equations derived are straightforward enough, they are derived for those

situations to which Beer's law strictly applies, that is, clear (nonscattering)

liquid solutions.

This is the first stumbling block in the application of the CLS method; most current applications of chemometric analysis are for powdered solids, or, even if liquids are of interest, they are often emulsions, gels, or some other type of scattering sample.

Even in those cases in which clear liquid mixtures might be of interest, there are other difficulties. The first difficulty is that it might not be possible to obtain or measure the spectrum of all the components of the mixture. As in applications in which other calibration algorithms are commonly used (for example, with natural products), it might not be possible to extract every component and measure its spectrum. Indeed, in complicated samples, not all the components might be known, much less methods to extract them; of course, any extraction method, to be useful, must not degrade the component or change its spectrum.

In many cases, the spectrum of a pure component, after being extracted from the sample, is not the same as the spectrum of that component in its natural state in the sample. This caveat is critical even in what might be thought of as a "simple" case, that of water. A notorious example of this is water in many natural products, where the interactions with the surrounding materials change the spectrum of the water compared to its spectrum in the pure state. Water is by no means unique in this respect.

Earlier in this column, we showed how to deal with those samples where the pure-component spectra are unknown, so a natural question to ask at this point is why not apply that concept, and determine the pure-component spectra from a set of mixtures?

We could indeed do that. The necessary procedure would be to obtain a suitable set of samples, measure their spectra, and measure, using wet chemistry or another laboratory (in other words, reference) method, the concentrations of all the components. However, why do this, when the MLR algorithm has the same requirements except that only the concentration of the analyte needs to be measured using a reference laboratory? Basically, to use CLS for this purpose would have all the requirements of MLR "on steroids," so to speak. Again, this is a matter of practicality, rather than any inherent defect of the CLS method itself.

A more serious reason for not doing that is revealed by the results from this demonstration of the CLS method: the relatively poor reconstructions of the target spectra; this constitutes a more significant drawback to the method. Between the two drawbacks, therefore, the CLS approach is rarely used.

In our next column, we will examine the discrepancies between the measured and reconstructed spectra that we saw in both columns in more detail, and the causes of those discrepancies. We also will look at the behavior of the CLS algorithm in more detail, to see how what we've learned can be applied to obtain useful results.

References 

(1) H. Mark and J. Workman, Spectroscopy 25 (5), 16 – 21 (2010).

(2) H. Mark, Principles and Practice of Spectroscopic Calibration (John Wiley & Sons; Hoboken, New Jersey, 1991).

(3) N. Draper and H. Smith, Applied Regression Analysis, 3rd ed. (John Wiley & Sons; Hoboken, New Jersey, 1998).

In our two previous columns in this (sub)series (1,2), we examined the mathematics behind the classical least squares (CLS) approach to analysis. This approach is based upon the fact that when Beer's law applies, as in clear liquids, the spectrum of a mixture is the sum of the spectra of the individual pure components, each weighted by its concentration.

What we saw in those actual spectra was that while the mathematics describing the spectra is exact, the synthesis of the mixture spectra, and the recovery of the pure component spectra from a set of mixtures, were approximately correct, but not exactly correct.

There are two reasons for the discrepancies we found. One reason was the nature of the optical measurement used for obtaining those spectra. While the samples were measured in a temperature-controlled cell, the optical measurements were made by transflectance. Thus, while the incident beam from the spectrometer was fully directed and specular, the beam was returned to the spectrometer for measurement by diffuse reflectance. A consequence of this measurement technique is that the returning rays, after diffuse reflection, are spread out through a wide range of angles.

We have demonstrated previously an effect on rays passing through a sample at a high angle to the normal (see reference 3 or chapter 29 in reference 4). We summarize the effect here by noting that because those rays at high angles have a longer pathlength through the sample than rays that pass through the sample perpendicular to the faces, they are more strongly absorbed. The net effect is to introduce a nonlinearity in the spectroscopic response, and the nonlinearity is greater at high absorbances, an effect we have also previously demonstrated (see reference 3 or chapter 27 in reference 4).
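A rough numerical sketch of this effect follows, under simplifying assumptions that are not taken from the cited references: a ray at angle θ from the normal travels a path of b/cos θ through a flat slab, refraction is ignored, and the detector averages the transmitted intensity of a few equally weighted rays. The function name and the list of angles are hypothetical.

```python
import math

def apparent_absorbance(true_absorbance, angles_deg):
    """Absorbance reported when equally weighted rays cross a slab at the
    given angles from the normal; true_absorbance is the value at 0 degrees."""
    transmittances = [
        10 ** (-true_absorbance / math.cos(math.radians(angle)))
        for angle in angles_deg
    ]
    # The detector sums intensities (transmittances), not absorbances.
    mean_t = sum(transmittances) / len(transmittances)
    return -math.log10(mean_t)

angles = [0, 15, 30, 45, 60]   # hypothetical ray angles, in degrees
for a in (0.1, 0.5, 1.0, 2.0):
    ratio = apparent_absorbance(a, angles) / a
    print(f"normal-incidence A = {a:.1f}  apparent/normal = {ratio:.3f}")
```

If the response were linear, the ratio of apparent to normal-incidence absorbance would be the same at every absorbance level. In this sketch the ratio shrinks as the absorbance grows, because the strongly attenuated oblique rays contribute less and less to the detected intensity.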

The second reason for the discrepancies is the interactions between water, methanol, and acetic acid. All three of these materials contain the –OH functional group. This functional group can dissociate or form hydrogen bonds. Furthermore, there is a very delicate equilibrium between undissociated –OH and varying amounts of hydrogen bonding between the three materials. The wavelength at which hydrogen absorbs depends strongly upon its environment, and the change from undissociated –OH to hydrogen-bonded –OH causes changes in the absorbance properties of the species, both in strength and wavelength. Thus, mixing these materials together created conditions whereby the absorbances change as the proportions of materials in the mixture change.

An example is shown in Figure 1, where spectra of water–methanol mixtures are shown. In these mixtures, the amount of water varies between 0% and 100% in uniform 25% steps. It is clear, especially in the spectral region around 2270 nm, that the spectra are compressed at the higher absorbances.

The equations for the CLS computations assume that all the spectra respond in a strictly linear fashion to concentration, and add together in exact proportion to their concentrations. As we have seen, however, there are at least two physical causes of nonlinearity in the spectra of the mixtures of these three materials that we were working with: an optical effect and a perturbation of the spectra when the environment is changed.

It should come as no surprise, therefore, to find that the equations that depend upon strict linearity in the behavior of the analytes don't properly describe the system formed, when applied to the mixtures where strict linearity is not present. It is the lack of linearity, resulting from distortions of the spectra of the components (due to interactions with the other components), that caused the differences between the spectra of mixtures calculated from the pure materials and the actual mixture spectra, as well as the imperfect recreation of the spectra of the pure materials from the spectra of the mixtures.

To demonstrate that the CLS approach can, in fact, work as described, we will need to start over again and use materials that will not interact the way water, methanol, and acetic acid do. Before doing that, however, we also will describe the CLS method in a way that (we hope) will be more meaningful to a spectroscopist than the descriptions we used previously (1,2), which were intended for more mathematically and chemometrically oriented practitioners. Then we'll examine the results of using toluene, n-heptane, and dichloromethane as the three liquids, these all being hydrocarbons or chlorinated hydrocarbons with no –OH or similar interacting functional groups. We also will see that the measurements are made in transmission, so that the nonlinearities attendant upon the use of transflection are not operative. So now we'll begin approaching this by presenting the same algebraic equations we saw before, but we will look at them as a spectroscopist would, not as a statistician or chemometrician would. Because we are still talking about Beer's law, we again begin with the equation for Beer's law:

$$A = abc \tag{1}$$

As we described previously, equation 1 applies wavelength-by-wavelength, and tells us that at any given wavelength, the absorbance (A) is proportional to the absorptivity (a) of a material at the chosen wavelength, the pathlength of the light through the material (b), and the concentration of the material (c). The absorptivity (a), of course, is the inherent property of a molecule that varies with wavelength, and thereby constitutes the "spectrum" of that molecule. Because the pathlength is often, in practice, a quantity fixed by the cell that contains the sample, and is certainly the same for all components of a sample being measured, it is convenient to combine it with a and consider the product ab as the quantity we are measuring.

Equation 1 applies to a single component in a sample. When there are multiple absorbing components, the total absorbance is the sum of the absorbances of all the absorbing materials, at the wavelength of interest. Because this happens at every wavelength, we also speak of the spectrum of a mixture, that is, the absorbance at every wavelength, as being the sum of the spectra of the components of the mixture, each one weighted by its concentration.
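The additivity just described can be sketched in a few lines. The four-point "spectra" below are made-up absorptivity-pathlength products, and the component names and concentrations are placeholders rather than measured data.

```python
import numpy as np

# Made-up ab products at four wavelengths for three hypothetical components
pure_spectra = {
    "component_1": np.array([0.10, 0.80, 0.30, 0.05]),
    "component_2": np.array([0.60, 0.20, 0.10, 0.40]),
    "component_3": np.array([0.05, 0.10, 0.90, 0.20]),
}
concentrations = {"component_1": 0.25, "component_2": 0.25, "component_3": 0.50}

def mixture_spectrum(pure, conc):
    """Sum the pure-component spectra, each weighted by its concentration."""
    return sum(conc[name] * spectrum for name, spectrum in pure.items())

mixture = mixture_spectrum(pure_spectra, concentrations)
print(mixture)
```

Each element of `mixture` is the single-wavelength sum; doing it for the whole array at once is what is meant by adding spectra.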

Equation 1 was derived for, and therefore applies to, measurements made in transmission when the sample is a clear (that is, nonscattering) liquid. The presence of scattering enormously complicates the situation, to the point where it is still considered an unsolved problem despite the extensive efforts of many scientists over the years (see references 5–9, and the more recent work of Don Dahm [10], for typical examples).

The "inverse" Beer's law case, as it is sometimes called, is derived from equation 1 by the simple expedient of dividing both sides of equation 1 by the ab product, thereby solving the equation for c, as we saw in reference 2:

$$c = \frac{A}{ab}$$

In this form, we have previously described how the relationship between the absorbance and analyte concentration can be found using least squares calculations (11).

Anyone reading the previous explanations will be (or at least should be) asking themselves, "So what's the difference between the 'inverse' least squares and the 'classical' (or 'direct') least squares methods? They're both least squares, aren't they?"

The answer is "Yes, but . . . ." We continue by presenting some matrix equations, without

attempting to understand them, at this point.

In Table I, the symbols have the following meanings:

C = concentration of analyte(s) *
A = absorbance spectrum of sample(s) **
b = coefficients of absorbances at the given wavelengths ***
S = absorbance spectra of pure materials ***

* In inverse least squares (ILS), [C] represents the concentration of the analyte for which a model is being developed; in CLS, [C] represents the concentrations of all the pure materials comprising the sample.

** Similarly, in ILS, [A] represents the absorbance of the samples comprising the calibration mixtures, while in CLS, [A] represents the absorbance spectrum of the sample being analyzed.

*** Note that b (the coefficients of selected wavelengths) and S (spectra of pure materials) do not have exact counterparts in the other calibration algorithm, although in a loose way, the spectrum of a pure material could be viewed as "coefficients" representing the absorbance of that material at the various wavelengths.

A casual inspection of the two equations in Table I probably would elicit the reaction

"But except for the positions of the labels S  and A, those two equations look practically

the same!"

Yes, indeed they do. There are corresponding parts to the two equations: $[SS^T]^{-1}$ corresponds to $[A^TA]^{-1}$, $S^T$ corresponds to $A^T$, and [C] on the CLS side corresponds to [b] on the ILS side. These correspondences are due to the fact that both equations represent a "least squares" calculation of the data. So what's the difference? The equations seem to do the same things. Because they both specify the computation of least squares, they are doing the same things. In fact, without going into all the gory details, the two equations are, in reality, the inverse case of each other. But then, don't they say to do the same things, regardless of the label used to represent the variables?
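Before turning to the answer, the two calculations can be sketched numerically. The expressions below are written in the form suggested by the correspondences just listed; they are an assumption about the layout of Table I, which is not reproduced in this text. All data, array sizes, and variable names are hypothetical.

```python
import numpy as np

# Hypothetical pure spectra: 3 materials (rows) at 6 wavelengths (columns)
S = np.array([[1.0, 0.2, 0.1, 0.5, 0.3, 0.1],
              [0.1, 1.0, 0.3, 0.2, 0.6, 0.1],
              [0.2, 0.1, 1.0, 0.1, 0.2, 0.7]])

rng = np.random.default_rng(0)
C = rng.uniform(0.0, 1.0, size=(10, 3))   # concentrations, 10 samples
A = C @ S                                  # ideal Beer's-law mixture spectra

# CLS: spectra are modeled from concentrations (A = C S).
# With S known, solve for the concentrations of every component at once.
C_cls = A @ S.T @ np.linalg.inv(S @ S.T)

# ILS: one analyte's concentration is modeled from absorbances (c = A b).
# Three selected wavelengths are used so that A^T A can be inverted.
A_sel = A[:, :3]
b = np.linalg.inv(A_sel.T @ A_sel) @ A_sel.T @ C[:, 0]
c_ils = A_sel @ b

print(np.allclose(C_cls, C), np.allclose(c_ils, C[:, 0]))
```

Both calculations reproduce the made-up concentrations here because the simulated data are noise-free and exactly linear; the point of the sketch is only that the same least-squares pattern appears with the roles of spectra and concentrations exchanged.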

The answer is "not exactly." The difference between the two equations becomes more apparent when you realize that, first of all, those equations in Table I are matrix equations (as we stated initially, just before we presented the equations) and that while matrix operations sometimes appear to reflect algebraic equations, there are differences between matrix operations and similar-looking algebraic equations.

In the case of the equations in Table I, the pertinent rule for matrix operations is that matrix multiplication does not commute; that is, given two matrices [A] and [B], the matrix product [A][B] does not, except under very special conditions, equal [B][A]. Therefore, it's not just a matter of the labeling. Even if A and S were to represent the same quantities (which they don't, despite the fact that they both represent absorbances), the product $A^T[AA^T]^{-1}$ would not equal $[AA^T]^{-1}A^T$, for example.
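A small numerical check of the non-commutativity rule, using made-up matrices (the names `P`, `Q`, and the stand-in array `S` are illustrative only):

```python
import numpy as np

P = np.array([[1.0, 2.0],
              [3.0, 4.0]])
Q = np.array([[0.0, 1.0],
              [1.0, 0.0]])

PQ = P @ Q   # multiplying on the right by Q swaps the columns of P
QP = Q @ P   # multiplying on the left by Q swaps the rows of P
products_equal = np.array_equal(PQ, QP)

# For a non-square matrix, the two orders do not even have the same shape.
S = np.arange(15.0).reshape(3, 5)   # stand-in for 3 spectra at 5 wavelengths
shape_SSt = (S @ S.T).shape         # 3 x 3
shape_StS = (S.T @ S).shape         # 5 x 5

print(PQ, QP, products_equal, shape_SSt, shape_StS, sep="\n")
```

The second half is the case that matters for spectra: with more wavelengths than components, the order of multiplication decides whether the matrix to be inverted has one row per component or one row per wavelength.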

We have previously examined the case of ILS in earlier columns (11,12), so let us now concentrate on the "classical" least squares methodology, to see how it differs. Again, we come back to the fact that "classically," chemists learned to do chemical analysis using spectroscopy by thinking about spectra the way they were taught to do.

So how do chemists think about spectra? To answer this question, let's look at Figures 2a

and 2b. Figure 2a shows the absorbance spectra of some pure liquids: dichloromethane,

toluene, and n-heptane. The reasons for the choice of these materials were mentioned briefly earlier, and also will be discussed further, in due course.

Figure 2: (a) Absorbance spectra of pure dichloromethane, n-heptane, and toluene (x-scale in wavenumbers) and (b) absorbance spectra of pure dichloromethane, n-heptane, and toluene, and of a ternary mixture of the three (x-scale in wavenumbers). Note that the baselines of the spectra are offset for clarity.

When chemists get their first exposure to spectroscopy, it is normally explained in terms

of transmission measurements through clear liquids, and they start by learning a few

 basic "ground rules" about those measurements:

I. The presence of absorbance bands in the material, and their wavelengths, is

characteristic of the nature of the material.

II. The strength of a given absorbance band depends on three factors:

• The absorptivity of the material at that wavelength, which is an inherent property of the material.

• The pathlength of the light through the sample.

• The concentration of the analyte in the sample.

III. The total absorbance of a sample, at a given wavelength, equals the sum of the absorbances, at that wavelength, of all the materials in the sample.

Let us make the point now that in the discussion to follow, all references to samples, spectra, and every other aspect of the discussion are based upon the paradigm of transmission measurements through clear liquid samples.

For clear liquid samples, these properties of spectral measurements are universals, that is, they are valid for any sample and any wavelength, regardless of the underlying physics creating the absorbance properties of the material. Thus, for example, the same considerations apply whether the measurement is made in the UV range, where the underlying absorptions are due to the interactions of photons with the electronic atomic and molecular orbitals, or in the infrared range, where the underlying absorptions are due to interactions of light with the vibrations of the nuclei of the atoms comprising the sample.

Figure 3: Spectra of two mixtures. Mixture 1 (blue): toluene = 50%, dichloromethane =

25%, n-heptane = 25%. Mixture 2 (black): toluene = 25%, dichloromethane = 50%, n-heptane = 25%.

Figure 4: Absorbance spectra of pure dichloromethane, n-heptane, and toluene, and of a ternary mixture of the three (x-scale in wavenumbers). Note that this graph is identical to the graph in Figure 2b except that it has been turned on its side by rotating it 90°.

Later, when we have become more sophisticated about these matters, we learn some new rules, such as the need to avoid interactions between different species present in the sample, and so forth. But for now, let's keep things simple and basic, and just consider the basic rules we stated earlier.

From rule 2 earlier, we learn about Beer's law. Without going into detail, we learn that there is a quantity called the absorbance (abbreviated A) that, at every wavelength, is the product of the three quantities listed: the absorptivity, the pathlength, and the concentration of the analyte. A convenient mnemonic is to rewrite equation 1 again, as we do here:

 A = abc 

where:

 A is the absorbance of the analyte at a specified wavelength,

a is the absorptivity of the analyte at that wavelength,

b is the pathlength of the light through the sample, and

c is the concentration of the analyte.

There are also some truisms that quickly become apparent. For example, a, the absorptivity at a given wavelength, is a constant of nature. Therefore, it is not under our control, and for quantitative purposes, we simply have to accept those values that nature provides for us. On the other hand, there is a benefit to this. Because the absorptivity at a given wavelength depends upon the underlying molecular structures, the absorptivity is characteristic of those structures, and for qualitative purposes, the structure of the absorptivities reflects the structure and properties of the underlying molecules, giving us, the users, the ability to identify those structures. This is an area of spectroscopy that has been highly developed over the years.

Another truism is the simple fact that for a given liquid sample, the pathlength is fixed and is set by the cell that the sample is contained in. Thus, for all the constituents in the sample, all measurements are made using the same pathlength. Furthermore, the concentration of all the components in a given sample is constant, regardless of which ones are of interest. An extension of this is that in a series of measurements, samples are likely to all be measured in the same cell, or, if we ignore the quibbling over how well multiple cells can be matched, at least in cells of the same pathlength. The effect of this is to remove the pathlength from consideration as a variable because for those measurements, it becomes constant and only the concentrations of the sample components will vary from sample to sample.

There are several ways to interpret the constancy of the pathlength. First, it could be folded into other constants; for example, we can consider the product a×b, the absorptivity-pathlength product, as the operative variable for any computations performed on the data.

Alternatively, we could consider the pathlength, whatever it is, to be the "unit pathlength" and set its value to 1, so that it doesn't change the numerical values of any other computations. This will affect only those results that depend upon standardized measurements performed with specific pathlength cells, and the standardized "absolute" absorbances determined from those measurements. The purpose of this change of viewpoint is to simplify any equations that are derived from theoretical considerations of the process of spectroscopic measurements.

The third of the three "ground rules" described previously is the one that is of importance here. While the description of it is couched in language that describes the effect at a single wavelength, the point of it is that it is true at every wavelength in the spectrum. Therefore, it is also correct, and more in line with our present interests, to say that not only is the absorbance of a mixture at a given wavelength equal to the sum of the absorbances of the individual components, but also, the absorbance spectrum of a mixture is equal to the sum of the spectra of the individual components in the mixture. An example of this is shown in Figure 2b, where the spectra of dichloromethane, toluene, and n-heptane are again shown, along with the spectrum of a mixture of these three liquids. With a little careful inspection, contributions from each of the three components can be seen in the spectrum of the mixture.

The mention of "computations," as we did earlier, brings us to the point at which we want to translate between the chemist's view of spectra, and the mathematical view of those spectra (this is, after all, a column about chemometrics!). We begin by noting that a graph like Figure 2b is not the single, absolute spectrum of a mixture of dichloromethane, toluene, and n-heptane. There are many possible spectra for mixtures of these three substances, because while a given mixture of them has a unique spectrum, there are many possible different mixtures. In fact, the spectrum of the mixture shown in Figure 2b is the spectrum of a mixture consisting of 25% dichloromethane, 25% toluene, and 50% n-heptane (all percentages are approximate). For other mixtures, the absorbance bands for a component at higher concentration will become more prominent, and those for a component at lower concentration will become less prominent and eventually disappear, as the concentration of that component decreases toward 0. An example can be seen in Figure 3, where we have plotted spectra of two other mixtures of the same three liquids:

Mixture 1: toluene = 50%, dichloromethane = 25%, n-heptane = 25%

Mixture 2: toluene = 25%, dichloromethane = 50%, n-heptane = 25%

Note how, even though the amount of n-heptane is the same in both mixtures, the spectra are different because of the differences in concentrations of the other two components. Obviously, for the purpose of demonstration and pedagogical effect, we have exaggerated the spectral differences by choosing example mixtures in which the concentration differences are large. In "real" samples, with smaller composition differences between samples, the spectral changes are generally smaller and more subtle. Nevertheless, the same effects occur, and indeed, form the basis for all quantitative spectroscopic analysis.

So we see now that the spectrum of a mixture is not merely the sum of the spectra of the components of the mixture; it is a weighted sum, the weighting factors being the concentrations of the individual components. What we need to do is to find a way to extract, from the spectrum of a mixture, the weighting factors that represent the contribution of each component to the final spectrum, because these weighting factors represent the concentrations of the components. If we call the weighting factors $c_1$, $c_2$, and $c_3$, then the absorbance of the mixture at any wavelength will equal the sum of the weighted absorbances of the components of the mixture:

$$A_M = c_1 A_D + c_2 A_T + c_3 A_H \tag{23}$$

Extending this to the sums of the spectra, we rewrite equation 23 to reflect the spectra of the three components:

$$[M] = c_1[D] + c_2[T] + c_3[H] \tag{24}$$

where [M], [D], [T], and [H] represent the spectra (which are now vectors, in this representation) of the mixture, dichloromethane, toluene, and n-heptane, respectively, and the corresponding $c_1$, $c_2$, and $c_3$ represent the concentrations of dichloromethane, toluene, and n-heptane, also respectively. We may find it convenient at various points to change the subscripts on the concentration terms to $c_D$, $c_T$, and $c_H$ to help us keep track of the different terms. We present this future change of terminology now, to help avoid future confusion.
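Written out element by element, the weighted sum of spectra described by equation 24 takes the following form. This is a sketch in which the subscripts 1 through n index the measured wavelengths, a notation assumed here for illustration rather than taken from the column:

```latex
\begin{bmatrix} M_1 \\ M_2 \\ \vdots \\ M_n \end{bmatrix}
= c_1 \begin{bmatrix} D_1 \\ D_2 \\ \vdots \\ D_n \end{bmatrix}
+ c_2 \begin{bmatrix} T_1 \\ T_2 \\ \vdots \\ T_n \end{bmatrix}
+ c_3 \begin{bmatrix} H_1 \\ H_2 \\ \vdots \\ H_n \end{bmatrix}
= \begin{bmatrix}
    D_1 & T_1 & H_1 \\
    D_2 & T_2 & H_2 \\
    \vdots & \vdots & \vdots \\
    D_n & T_n & H_n
  \end{bmatrix}
  \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix}
```

Each row is the single-wavelength statement of the weighted sum; stacking the rows gives the statement for the whole spectrum, with the three pure spectra forming the columns of a matrix and the three concentrations forming a vector.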

Equivalence of Spectra and Numbers 

Along the way, we also need to learn how to modify our point of view, because digitized spectra that appear to be graphical constructs are in fact composed of numbers, those numbers representing the spectral values (whether transmittance, absorbance, or whatever mode of spectral presentation is used).

It is convenient to start with Figure 2b, which shows the spectra of the three pure components, plus the spectrum of the mixture. We will transform this figure, using an extremely simple transformation. We will transform it by rotating it 90°. This transformed graph is shown in Figure 4. Clearly, the rotation of the graph has not changed or otherwise affected any of the underlying properties of the data that the graph represents, nor does it change the relationships between any of the spectra.

In Figure 5, we have taken Figure 4 and added some symbols to it. We can now compare Figure 5 to equation 24. In equation 24, we represented the spectra of each of the components of the mixture by a symbol ([D], [T], [H]), each representing the corresponding spectrum.

In Figure 5 we have effectively rewritten equation 24 by replacing the symbol

representing each spectrum by the actual spectrum.

Figure 5 is where the spectroscopy meets the math.

Figure 5: Absorbance spectra of pure dichloromethane, n-heptane, and toluene, and of a ternary mixture of the three (x-scale in wavenumbers), along with the concentration information that makes this the graphical representation of equation 24.

References 

(1) H. Mark and J. Workman, Spectroscopy 25(5), 16 – 21 (2010).

(2) H. Mark and J. Workman, Spectroscopy 25(6), 20 – 25 (2010).

(3) H. Mark and J. Workman, Spectroscopy 13(11), 18 – 21 (1998).

(4) H. Mark and J. Workman, Chemometrics in Spectroscopy (Elsevier, Amsterdam and New York, 2007).

(5) A. Schuster, Phil. Mag. 5, 243 (1903).

(6) A. Schuster, J. Astrophys. 21, 1 (1905).

(7) P. Kubelka and F. Munk, Z. Techn. Phys. 12, 593 (1931).

(8) G. Kortum, Reflectance Spectroscopy: Principles, Methods, Applications, 1st ed. (Springer-Verlag, New York, 1969).

(9) W.W. Wendlandt and H.G. Hecht, Reflectance Spectroscopy (John Wiley & Sons, New York, 1966).

(10) D.J. Dahm and K.D. Dahm, Interpreting Diffuse Reflectance and Diffuse Transmittance: A Theoretical Introduction to Absorption Spectroscopy of Scattering Materials, 1st ed. (IM Publications, West Sussex, UK, 2007).

(11) H. Mark and J. Workman, Spectroscopy 21(5), 34 – 38 (2006).

(12) H. Mark and J. Workman, Spectroscopy 21(6), 34 – 36 (2006).

This column is a continuation of our discussion of the classical least squares (CLS) approach to calibration (1–3). As we usually do when we continue the discussion of a topic through more than one column, we continue the numbering of equations from where we left off in the last installment.

The insight we are trying to develop hinges on Figure 1 (which first appeared as Figure 5 in Part III of this series [3]) and the meaning of it in terms of equating the concepts of the spectroscopic and mathematical views of Beer's law as it applies to spectra measured for the purpose of calibrations for quantitative analysis. Therefore, we also repeat equation 24:

$$[M] = c_1[D] + c_2[T] + c_3[H] \tag{24}$$

where [M], [D], [T], and [H] represent the spectra (which are now vectors, in this representation) of the mixture, dichloromethane, toluene, and n-heptane, respectively, and the corresponding $c_1$, $c_2$, and $c_3$ represent the concentrations of dichloromethane, toluene, and n-heptane, respectively.

The reason for all this is the final sentence in our previous column (3): "Figure 5 is where the spectroscopy meets the math." We also repeat some of the prior explanation: In this figure we have taken Figure 4 from Part III of this column series and added some symbols to it. We now compare the figure (Figure 1 in this installment) to equation 24. In equation 24, we represented the spectra of each of the components of the mixture by a symbol ([D], [T], [H]), each representing the corresponding spectrum.

In Figure 1 we have effectively rewritten equation 24 by replacing the symbol representing each spectrum by the actual spectrum.

That's why Figure 1 is where the spectroscopy meets the math. Figure 1 is the same as equation 24, except that the spectra, which are indicated by the matrix symbols [D], [T], and [H] in equation 24, are shown in their conventional graphical form in Figure 1.

This can be even further emphasized by replacing the graphical presentation of the spectra with the actual numbers they represent, as we just described them. In Table I we present the numbers that make up the four spectra that concern us here: the spectrum of the mixture of toluene, dichloromethane, and n-heptane, and the absorbances of the three pure materials, whose spectra are presented in Figure 1.

Now we can go one step further, and add to Table I the expressions needed to convert the bare listing of values into a set of equations describing the behavior of the data. We do so in Table II.

Note that the heading of Table II is essentially a rewriting of equation 23 (3), and each row of Table II represents the calculations that are to be done for each wavelength. With Table II, therefore, we have come full circle. Were we to rewrite Table II in matrix notation, we would arrive back at equation 24, which, as we showed immediately before, was the matrix equation corresponding to the computations to be done.
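The equivalence between a table with one calculation per wavelength and a single matrix equation can be checked with a short sketch. The absorbance values below are invented placeholders, not the entries of Table I or Table II; the columns stand for dichloromethane, toluene, and n-heptane in that order.

```python
import numpy as np

# Hypothetical absorbances: one row per wavelength; columns are D, T, H
pure = np.array([[0.10, 0.60, 0.05],
                 [0.80, 0.20, 0.10],
                 [0.30, 0.10, 0.90],
                 [0.05, 0.40, 0.20]])
c = np.array([0.25, 0.25, 0.50])   # c1, c2, c3 (assumed concentrations)

# Row by row, as in a table with one line of arithmetic per wavelength
row_by_row = np.array([c[0] * d + c[1] * t + c[2] * h for d, t, h in pure])

# The same computation written once, as a matrix-vector product
as_matrix = pure @ c

print(np.allclose(row_by_row, as_matrix))
```

The loop and the matrix product carry out identical arithmetic; the matrix form simply states all the per-wavelength rows at once.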

An alternate view of the sequence we went through, which we want to emphasize here, is that we arrived at Table II, and thereby at equation 24, by starting with the graphical presentation of the spectra of some pure components and a mixture made up from those pure components (Figure 1 from reference 1), and from that we noted that the spectra are in fact made up of the numbers representing those spectra at each wavelength.

So we've found that the absorbances of the various materials, when added together wavelength by wavelength, give the absorbances of the mixture. This ties together (in a mathematical sense) the absorbances of the three different materials. Is there anything else that ties together the absorbances of each material, in a similar sense?

Yes, there is. There is the fact, shown in Figure 1 and in Table II, that all the absorbances for toluene are multiplied by the same factor (the weighting factor, representing the concentration of toluene) indicating its contribution to the mixture spectrum. Similarly, all the absorbances of dichloromethane are multiplied by the same factor (representing the concentration of dichloromethane), and all the absorbances of n-heptane are multiplied by the same factor (representing the concentration of n-heptane).

So now we know how to synthesize the spectrum of a mixture of various materials: We add together the (weighted) spectra of the materials making up the mixture, just as specified in equation 23.
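As a minimal sketch of this synthesis step, the weighted sum can be written out numerically. The absorbance values and concentrations below are made-up illustrative numbers, not the values of Table I, and the names `K`, `c`, and `mixture` are our own labels rather than symbols taken from equations 23 and 24.

```python
import numpy as np

# Hypothetical pure-component absorbances at five wavelengths
# (illustrative numbers only, not the values from Table I).
toluene = np.array([0.10, 0.80, 0.30, 0.05, 0.20])
dichloromethane = np.array([0.60, 0.10, 0.20, 0.40, 0.05])
n_heptane = np.array([0.05, 0.15, 0.70, 0.25, 0.50])

# Assumed concentrations (the weighting factors), one per component.
c_tol, c_dcm, c_hep = 0.5, 0.3, 0.2

# Wavelength-by-wavelength form: the same weight multiplies every
# absorbance of a given component, and the weighted values are summed.
mixture = c_tol * toluene + c_dcm * dichloromethane + c_hep * n_heptane

# Matrix form of the same calculation: the columns of K hold the pure
# spectra, and c holds the concentrations.
K = np.column_stack([toluene, dichloromethane, n_heptane])
c = np.array([c_tol, c_dcm, c_hep])
mixture_matrix = K @ c
```

Both forms give the same five numbers, which is the sense in which the table of per-wavelength calculations and the matrix equation describe one and the same computation.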

This is all of some mild interest, intellectually and pedagogically, but it's not what we want to do. What we really want to do is the inverse operation: to take the spectrum of a mixture, and from it determine the concentrations of the various components of the mixture. How can we do that?

To start getting an appreciation for how we will do that, we again look at equations 23 and 24. Equation 23 tells us that the concentrations are the weighting factors, such that when the absorbances are multiplied by them and then added together, the sum is the absorbance of the mixture. Equation 24 tells us that this is true for the absorbances at all the wavelengths; furthermore, the absorbances are linked together by two links. The first link is physical: The absorbances are representative of the same specific compounds that give rise to them and are represented by the same concentration of each of the materials. The second link is mathematical: All the absorbances of a given material are multiplied by the same coefficient of the equation representing the spectroscopic system.

So to give us a head start, let's ask the following question: Where else do we have a situation where a number of variables are each multiplied by a coefficient (and the same coefficient is used to weight the same variable in each sample) to produce a numeric result?

The answer is when we do multiple linear regression (MLR) (or inverse least squares

[ILS]) calibration. There's an exact parallelism between CLS and ILS. This parallelism is

made explicit in Table III.

In Table III, we have taken the same set of numbers and inserted them twice in the table, once in the context of CLS and once in the context of ILS, changing only the column headings, which changes the meanings of the numbers. We list the comparisons in Table IV.

Table IV: Meanings of labels in Table III in the contexts of CLS and ILS (MLR)

So we see that, first of all, the same quantities are present in each context: absorbances of mixtures, concentrations, coefficients. The only quantity missing is that in the ILS context, there are no absorbances of pure materials explicitly provided.

So what does this all mean in terms of how we want to use this information? First let's look at the ILS section of Table III. When we do an MLR or ILS calibration, we regress all the independent variables against the dependent variable; that's naturally what a regression is all about. So what we would want to do is to take the information representing the independent variables in the CLS case and relate it to the dependent variable, just as we did in the ILS case. In the ILS case we used least-squares mathematics to make the comparison that created the relationships, and we want to do the same type of least-squares computations here to relate the information in the independent variables in the CLS case to the dependent variable in that case. Therefore we would do exactly the same calculations using the independent variables in the CLS case that we did in the ILS case.

The difference is in what the variables in the CLS case represent. In the CLS case, the dependent variable represents the spectrum of the mixture. The independent variables are the spectra of the materials that went into the mixture. Therefore what we do is to regress the spectra of the pure materials against the spectrum of the mixture. This will tell us what the coefficient is that represents the amount of each material that appears in the mixture. This coefficient, then, is the concentration of that material in the mixture.
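The regression described above can be sketched with an ordinary least-squares solver. This is a minimal sketch under stated assumptions: the pure spectra in `K` are made-up numbers, the mixture spectrum is synthesized from them without noise, and NumPy's `np.linalg.lstsq` stands in for whatever least-squares routine one prefers.

```python
import numpy as np

# Hypothetical pure-component spectra: one column per material
# (toluene, dichloromethane, n-heptane), one row per wavelength.
K = np.array([
    [0.10, 0.60, 0.05],
    [0.80, 0.10, 0.15],
    [0.30, 0.20, 0.70],
    [0.05, 0.40, 0.25],
    [0.20, 0.05, 0.50],
])

# Synthesize a noise-free mixture spectrum from assumed concentrations.
true_c = np.array([0.5, 0.3, 0.2])
mixture = K @ true_c

# CLS step: regress the pure spectra (independent variables) against
# the mixture spectrum (dependent variable). The fitted coefficients
# are the estimated concentrations.
est_c, residuals, rank, singular_values = np.linalg.lstsq(K, mixture, rcond=None)
```

With noise-free synthetic data, `est_c` reproduces `true_c` to within rounding error. With measured spectra the fit would be only approximate, and the residual sum of squares gives some indication of how well the supplied pure spectra account for the mixture.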

The concentration of the materials in the mixture is exactly the information that we want

to get out of the analysis. Now, as we have seen, this concentration is exactly the result of

 performing the least-squares calculations using the spectra of the pure materials against

the spectrum of the mixture.

Thus we see that the information we want falls naturally out of the calibration calculations when we perform them the CLS way. We also note that the coefficients are exactly the same coefficients that were used in equations 21 and 23 to create the spectrum of the mixture out of the spectra of the components to start with. We see, therefore, that doing the least squares calculations is the way to recover the concentrations of the materials whose spectra comprise the spectrum of the mixture.

This is the explanation for the similarity of the expressions in the two entries of Table I from reference 3. They both represent least squares calculations. The differences between the two expressions arise from the different positions occupied by the variables used in the calculations in the two cases. These reflect the differences in the meaning of the variables that are used in the two calculations.

In the case of ILS, the concentration of the analyte must be known beforehand. The coefficients that are calculated represent the contribution of the analyte to the spectrum plus corrections for contributions to the spectrum made by the other materials in the sample.

In the case of CLS, no concentration information need be known beforehand. The concentration information is calculated ab initio, purely from the spectral information available. The other side of that coin, however, is that instead of concentration information from the sample, we need to know the spectra of the pure materials that comprise the sample mixture.
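The contrast between what the two approaches need as input can be sketched side by side. Everything here is a hypothetical illustration: the spectra, the calibration concentrations, the choice of three wavelengths for the ILS model, and the omission of an intercept term are all our assumptions, and the data are noise-free, so both routes recover the concentration exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pure spectra (columns: toluene, dichloromethane, n-heptane).
K = np.array([
    [0.10, 0.60, 0.05],
    [0.80, 0.10, 0.15],
    [0.30, 0.20, 0.70],
    [0.05, 0.40, 0.25],
    [0.20, 0.05, 0.50],
])

# An "unknown" sample, used only to check the two estimates.
c_new = np.array([0.5, 0.3, 0.2])
a_new = K @ c_new

# ILS: needs calibration mixtures with KNOWN analyte concentrations,
# but never sees the pure spectra directly.
C_cal = rng.uniform(0.1, 1.0, size=(10, 3))   # known concentrations
A_cal = C_cal @ K.T                           # measured mixture spectra
wavelengths = [0, 1, 2]                       # assumed wavelength selection
b, *_ = np.linalg.lstsq(A_cal[:, wavelengths], C_cal[:, 0], rcond=None)
ils_toluene = a_new[wavelengths] @ b

# CLS: needs the pure spectra, but no prior concentration information.
cls_c, *_ = np.linalg.lstsq(K, a_new, rcond=None)
cls_toluene = cls_c[0]
```

The ILS coefficients `b` apply to absorbances and yield one analyte's concentration, whereas the CLS coefficients `cls_c` are themselves the concentrations of all three materials.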

The more alert among our readers might ask a question: Why is it that if Table I from reference 3 specifies the same calculations for both cases, the matrix operations are written almost in reverse order from one another? The answer, though it might seem a priori puzzling, turns out to be actually rather simple and mundane, indeed almost trivial.

Recall that when we rearranged the data to form Figures 4 and 5 in reference 3 (and Figure 1 in this installment), we rotated the graphical presentation of the spectra by 90° (compare Figure 2b to Figure 5 in reference 3). In terms of the actual numerical data, presented as a matrix, this rotation transformed the rows of the matrix into columns and vice versa (that is, we performed a transpose operation on the matrix). This requires that, if we wish to perform similar computations on the data before and after the transposition, the order of multiplication must be reversed to compensate. Does this give us the same result for the new computation? Well, almost. Having both transposed the matrix and reversed the order of computations, we will wind up with the transpose of the original result (the transpose of a matrix product equals the product of the transposes taken in reverse order). This is a theorem of matrix math that we don't usually have much need for, but because it has affected our discussion, we felt a need to explain it.
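The effect of transposing the data and reversing the order of multiplication can be checked numerically. The sketch below does not reproduce the expressions of Table I from reference 3; it uses the standard normal-equations form of least squares, with made-up spectra and our own variable names, simply to show that the two orientations give results that are transposes of one another.

```python
import numpy as np

# Hypothetical pure spectra stored as COLUMNS (wavelengths down the rows).
K = np.array([
    [0.10, 0.60, 0.05],
    [0.80, 0.10, 0.15],
    [0.30, 0.20, 0.70],
    [0.05, 0.40, 0.25],
    [0.20, 0.05, 0.50],
])
a_col = K @ np.array([[0.5], [0.3], [0.2]])   # mixture spectrum, 5 x 1

# Column orientation: c = (K^T K)^-1 K^T a, giving a 3 x 1 result.
c_col = np.linalg.inv(K.T @ K) @ K.T @ a_col

# Row orientation: the same spectra stored as ROWS (the transposed data).
S = K.T            # 3 x 5
a_row = a_col.T    # 1 x 5

# The factors appear in reverse order: c^T = a S^T (S S^T)^-1, 1 x 3.
c_row = a_row @ S.T @ np.linalg.inv(S @ S.T)
```

Here `c_row` holds the same three numbers as `c_col`, laid out as a row instead of a column. Explicit matrix inversion is used only to mirror the algebra; a solver such as `np.linalg.lstsq` is generally the better numerical choice.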

References

(1) H. Mark and J. Workman, Spectroscopy 25(5), 16–21 (2010).

(2) H. Mark and J. Workman, Spectroscopy 25(6), 20–25 (2010).

(3) H. Mark and J. Workman, Spectroscopy 25(10), 22–31 (2010).