Measurement Uncertainties in Science and Technology
Michael Grabe
With 47 Figures
Springer
Dr. rer. nat. Michael Grabe, Am Hasselteich 5, 38104 Braunschweig, Germany. E-mail: [email protected]
Library of Congress Control Number: 2005921098
ISBN 3-540-20944-1 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.
Springer is a part of Springer Science+Business Media.
springeronline.com
© Springer-Verlag Berlin Heidelberg 2005 Printed in The Netherlands
The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
Typesetting: Data prepared by the author using a Springer TeX macro package. Cover design: design & production GmbH, Heidelberg
Printed on acid-free paper SPIN 10977521 57/3141/AD 5 4 3 2 1 0
To the memory of my parents
in love and gratitude
Preface
In his treatises on the Method of Least Squares, C.F. Gauß [1] distinguished irregular or random errors on the one hand and regular or constant errors on the other. As is well known, Gauß excluded the latter from his considerations on the ground that the observer should either eliminate their ultimate causes or rid every single measured value of their influence.
Today, these regular or constant errors are called unknown systematic errors or, more simply, systematic errors. Such errors turned out, however, to be eliminable neither by appropriately adjusting the experimental set-ups nor by any other means. Moreover, considering the present state of measurement technique, they are of an order of magnitude comparable to that of random errors.
The papers by C. Eisenhart [9] and S. Wagner [10] in particular have entrusted the high-ranking problem of unknown systematic errors to the metrological community. But it was not until the late 1970s that it took center stage, apparently in the wake of a seminar held at the Physikalisch-Technische Bundesanstalt in Braunschweig [19]. At that time, two ways of formalizing unknown systematic errors were discussed. One of these suggested including them smoothly, via a probabilistic artifice, into the classical Gaussian calculus; the other, conversely, proposed generalizing that formalism in order to better bring their particular features to bear.
As the author prefers to see it, systematic errors introduce biases, and this situation would compel the experimenter to differentiate between expectations on the one hand and true values on the other – a distinction that does not exist in the conventional error calculus. This perspective, and another reason which will be explained below, have induced the author to propose a generalization of the classical error calculus concepts. Admittedly, his considerations differ substantially from those recommended by the official metrology institutions. Today, the latter call for international validity under the heading Guide to the Expression of Uncertainty in Measurement [34–36].
Meanwhile, both formal and experimental considerations have raised numerous questions: The Guide does not make a distinction between true values and expectations; in particular, uncertainty intervals are not required to localize the true values of the measurands. Nonetheless, physical laws in general, as well as interrelations between physical constants in particular, are to be expressed in terms of true values. Therefore, a metrology which is not aimed at accounting for a system of true values is scarcely conceivable.
As the Guide treats random and systematic errors alike on a probabilistic basis, hypothesis testing and analysis of variance should remain valid; in least squares adjustments, the minimized sum of squared residuals should approximate the number of degrees of freedom of the linear system. However, none of these assumptions withstands a detailed analysis.
In contrast to this, the alternative error model to be discussed here suggests answers and procedures that appear apt to overcome the said difficulties.
In addition to this, the proposal of the author differs from the recommendations of the Guide in another respect, inasmuch as it provides for the introduction of what may be called "well-defined measurement conditions". This means that each of the measurands to be linked within a joint error propagation should be subjected to the same number of repeat measurements. As obvious as this might seem, the author wishes to point out that just this procedure would return the error calculus to the womb of statistics, which it had left on its way through the course of time. Well-defined measurement conditions allow complete empirical variance–covariance matrices to be assigned to the input data, and this, in fact, offers the possibility of expressing that part of the overall uncertainty which is due to random errors by means of Student's statistics.
Though this idea is inconsistent with the traditional notions of experimenters, which have at all times referred to incomplete sets of data, the advantages attainable when reformulating the tools of data evaluation in terms of the classical laws of statistics appear convincing.
There is another point worth mentioning: the approach of always assessing the true values of the measurands in general, and of the physical constants in particular, eliminates, quasi by itself, the allegedly most fundamental term of metrology, namely the so-called "traceability". Actually, the gist of this definition implies nothing else but just what has been stated above, namely the localization of true values. While the Guide cannot guarantee "traceability", this is the basic property of the alternative error model referred to here.
Last but not least, I would like to express my appreciation of my colleagues' experience, which I drew on when preparing the manuscript, as well as their criticism, which inspired me to clarify the text. For technical support I am grateful to Dr. Michael Weyrauch and to Dipl.-Übers. Hannelore Mewes.
Braunschweig, January 2005 Michael Grabe
Contents
Part I Characterization, Combination and Propagation of Errors
1 Basic Ideas of Measurement . . . 3
1.1 Sources of Measurement Uncertainty . . . 3
1.2 Quotation of Numerical Values and Rounding . . . 8
1.2.1 Rounding of Estimators . . . 8
1.2.2 Rounding of Uncertainties . . . 8

2 Formalization of Measuring Processes . . . 11
2.1 Basic Error Equation . . . 11
2.2 Random Variables, Probability Densities and Parent Distributions . . . 18
2.3 Elements of Evaluation . . . 21

3 Densities of Normal Parent Distributions . . . 25
3.1 One-Dimensional Normal Density . . . 25
3.2 Multidimensional Normal Density . . . 29
3.3 Chi-Square Density and F Density . . . 33
3.4 Student's (Gosset's) Density . . . 36
3.5 Fisher's Density . . . 41
3.6 Hotelling's Density . . . 43

4 Estimators and Their Expectations . . . 51
4.1 Statistical Ensembles . . . 51
4.2 Variances and Covariances . . . 54
4.3 Elementary Model of Analysis of Variance . . . 58

5 Combination of Measurement Errors . . . 61
5.1 Expectation of the Arithmetic Mean . . . 61
5.2 Uncertainty of the Arithmetic Mean . . . 63
5.3 Uncertainty of a Function of One Variable . . . 66
5.4 Systematic Errors of the Measurands . . . 70

6 Propagation of Measurement Errors . . . 71
6.1 Taylor Series Expansion . . . 71
6.2 Expectation of the Arithmetic Mean . . . 74
6.3 Uncertainty of the Arithmetic Mean . . . 77
6.4 Uncertainties of Elementary Functions . . . 80
6.5 Error Propagation over Several Stages . . . 87
Part II Least Squares Adjustment
7 Least Squares Formalism . . . 93
7.1 Geometry of Adjustment . . . 93
7.2 Unconstrained Adjustment . . . 97
7.3 Constrained Adjustment . . . 100

8 Consequences of Systematic Errors . . . 101
8.1 Structure of the Solution Vector . . . 101
8.2 Minimized Sum of Squared Residuals . . . 103
8.3 Gauss–Markoff Theorem . . . 105
8.4 Choice of Weights . . . 107
8.5 Consistency of the Input Data . . . 109

9 Uncertainties of Least Squares Estimators . . . 113
9.1 Empirical Variance–Covariance Matrix . . . 113
9.2 Propagation of Systematic Errors . . . 117
9.3 Overall Uncertainties of the Estimators . . . 119
9.4 Uncertainty of a Function of Estimators . . . 128
9.5 Uncertainty Spaces . . . 130
9.5.1 Confidence Ellipsoids . . . 131
9.5.2 The New Solids . . . 132
9.5.3 EP Boundaries and EPC Hulls . . . 141
Part III Special Linear and Linearized Systems
10 Systems with Two Parameters . . . 151
10.1 Straight Lines . . . 151
10.1.1 Error-free Abscissas, Erroneous Ordinates . . . 154
10.1.2 Erroneous Abscissas, Erroneous Ordinates . . . 165
10.2 Linearization . . . 172
10.3 Transformation . . . 173

11 Systems with Three Parameters . . . 175
11.1 Planes . . . 175
11.2 Parabolas . . . 181
11.2.1 Error-free Abscissas, Erroneous Ordinates . . . 181
11.2.2 Erroneous Abscissas, Erroneous Ordinates . . . 188
11.3 Piecewise Smooth Approximation . . . 192
11.4 Circles . . . 195

12 Special Metrology Systems . . . 203
12.1 Dissemination of Units . . . 203
12.1.1 Realization of Working Standards . . . 203
12.1.2 Key Comparison . . . 205
12.2 Mass Decades . . . 210
12.3 Pairwise Comparisons . . . 215
12.4 Fundamental Constants of Physics . . . 215
12.5 Trigonometric Polynomials . . . 224
12.5.1 Unknown Model Function . . . 225
12.5.2 Known Model Function . . . 227
Part IV Appendices
A On the Rank of Some Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
B Variance–Covariance Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
C Linear Functions of Normal Variables . . . . . . . . . . . . . . . . . . . . . 239
D Orthogonal Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
E Least Squares Adjustments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
F Expansion of Solution Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
G Student’s Density . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
H Quantiles of Hotelling’s Density . . . . . . . . . . . . . . . . . . . . . . . . . . 261
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Part I
Characterization, Combination and Propagation of Errors
1 Basic Ideas of Measurement
1.1 Sources of Measurement Uncertainty
To measure is to compare. To this end, as we know, a well-defined system of physical units is needed. As to the latter, we have to distinguish between the definitions and the realizations of units. By its very nature, a defined unit is a faultless, theoretical quantity. In contrast, a realized unit has been produced by a measuring device and is therefore affected by imperfections. Moreover, further errors occur when a realized unit is used in practice, as the comparison of a physical quantity to be measured with the appropriate unit is a source of measurement errors.
Let us assign a true value to every quantity to be measured, or measurand. True values will be indexed by 0 and measured values by natural numbers 1, 2, . . . .¹ If x designates the physical quantity considered, then the symbols x0, x1, x2, . . . denote the true value and the measured values.
To reduce the influence of measurement errors, it suggests itself to repeat the measurement, say n times, under the same conditions and subsequently to condense the set of repeat measurements thus gained into an estimator of the unknown (and unknowable) true value. Spontaneously, one might wish n to be as large as possible, hoping to achieve high measurement accuracy. However, there are limits to this wishful thinking.

A large number of repeat measurements is meaningful only if the resolution of the measuring device is sufficiently high. If the resolution is low, it makes little sense to pick up a huge number of repeat measurements, as a coarse device will respond to one and the same quantity, repeatedly input, with a rather limited number of different output values.
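The limiting effect of a coarse resolution can be illustrated with a small simulation. The following Python sketch is purely illustrative; the function name and all numerical values are assumptions, not taken from the text:

```python
import random

def read_device(true_value, sigma, resolution):
    """One simulated reading: a normally distributed random error is added
    to the true value, then the result is quantized to the resolution of
    the device, as a coarse display would do."""
    noisy = random.gauss(true_value, sigma)
    return round(noisy / resolution) * resolution

random.seed(1)
# Same quantity and same random errors (sigma = 0.01), two resolutions:
fine = {read_device(10.0, 0.01, 0.001) for _ in range(1000)}
coarse = {read_device(10.0, 0.01, 0.05) for _ in range(1000)}
print(len(fine), len(coarse))  # the coarse device yields only a few distinct values
```

However many readings are taken, the coarse device responds with only a handful of distinct output values, so enlarging n adds little information.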
To be more specific, the accuracy of measuring instruments suffers from random errors, which we are all familiar with, as well as from so-called unknown systematic errors, knowledge of which appeared to be confined, at least until the late 1970s, to, say, esoteric circles. Unknown systematic errors remain constant in time and are unknown with respect to magnitude and sign. Traditionally, they are not considered on the part of the classical Gaussian error calculus. As there is no chance of ignoring or eliminating such perturbations, experimenters rather have to lock them within intervals of fixed extensions, the actual lengths of which have to be taken from the knowledge of the measuring devices and the conditions under which they operate.

¹ As a zeroth measurement does not exist, the index 0 in x0 clearly distinguishes the true value from the measured data x1, x2, . . . .
Given that the unknown systematic errors as well as the statistical properties of the random process which produces the random measurement errors remain constant in time, the experimental set-up corresponds, put in statistical terms, to a stationary process. For the experimenter, this would be the best he can hope for; i.e., this would be, by definition, a perfect measuring device. Over longer periods, however, measuring devices may change their properties; in particular, unknown systematic errors may be subject to drifts.
If the unknown systematic errors change following some trend, the measurements constitute something like a time series, which would have to be treated and analyzed by different methods that are of no concern here.
As drifts might always occur, the experimenter should try to control the measurement conditions such that the unknown systematic errors remain constant at least during the time it takes to pick up a series of repeat measurements, say x1, x2, . . . , xn, and to specify the extensions of the intervals confining the unknown systematic errors such that their actual values are always embraced.
After all, smaller samples of measured values may appear more meaningful than larger ones. In particular, averaging arbitrarily many repeat measurements, affected by time-constant unknown systematic errors, would not reduce the latter.
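This behaviour is easy to demonstrate numerically. In the following Python sketch the true value, the systematic error and the standard deviation are illustrative assumptions, not values from the text:

```python
import random

random.seed(0)
x0 = 5.0      # true value (illustrative)
f = 0.2       # time-constant unknown systematic error (illustrative)
sigma = 0.5   # standard deviation of the random errors (illustrative)

def mean_of_n(n):
    """Arithmetic mean of n repeat measurements x_l = x0 + f + eps_l."""
    return sum(x0 + f + random.gauss(0.0, sigma) for _ in range(n)) / n

# The random part of the deviation shrinks as n grows; the bias f does not.
for n in (10, 1000, 100000):
    print(n, mean_of_n(n) - x0)
```

The mean converges to x0 + f, not to x0: no number of repeat measurements removes a time-constant systematic error.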
The true value of the physical quantity in question is estimated by means of a suitably constructed estimator. The experimenter should declare how much the two quantities, the true value and the estimator, might deviate from each other. The quantity accounting for the difference is called the measurement uncertainty. This basic quantity grants us access to the possible locations of the true value of the measurand relative to its estimator, Fig. 1.1.
As we do not know whether the estimator is larger or smaller than the true value – if we knew this, we would have applied the corresponding correction – we shall have to resort to symmetrical uncertainty intervals. For formal reasons, the numerical values of uncertainties are always positive quantities, which we afterwards provide with a ± sign. We state:
The interval
estimator ± measurement uncertainty
defines the result of a measurement. This range is required to localize the true value of the measurand (or quantity to be measured).
As a rule, measurement uncertainties will be composed of two parts: one is due to random errors and the other is due to unknown systematic errors.
In this context, the terms accuracy and reproducibility turn out to be crucial; see also Sect. 4.1. Accuracy refers to the localization of the true value of the measurand: the smaller the uncertainty, the higher the accuracy. In contrast to this, reproducibility merely aims at the statistical scattering of the measured data: the closer the scattering, the higher the reproducibility.

Fig. 1.1. The interval (estimator ± measurement uncertainty) is required to localize the true value of the measurand
Obviously, it is neither necessary nor advisable always to strive for the highest attainable accuracy. But in any case, it is indispensable to know the actual accuracy of the measurement and, in particular, that the latter reliably localizes the true value. The national metrology institutes ensure the realization of units by means of hierarchies of standards. The respective uncertainties are clearly and reliably documented so that operators can count on them at any time and at any place.
In the natural sciences, a key role falls to measurement uncertainties, as principles of nature are verified or rejected by comparing theoretical predictions with experimental results. Here, measurement uncertainties have to decide whether new ideas should be accepted or discarded.
In the engineering sciences, measurement uncertainties enter an almost unlimited diversity of intertwined dependencies which, in the end, are crucial for the functioning, quality, service life and competitiveness of industrial products.
Trade and industry categorically call for mutual fairness: both the buyer and the seller expect the measuring processes to subdivide merchandise and goods such that neither side will be overcharged.²
Figure 1.2 sketches the basics of a measuring process. The construction of a measuring device becomes possible only after the physical model has found a mathematical formulation. After this, procedures for the evaluation of the measurements have to be devised. As a rule, the latter require additional assumptions, which are either independent of the measuring process, such as the method of least squares, or depend on it, such as, for example, the statistical distribution of random errors and the time-constancy of unknown systematic errors. The monograph at hand confines itself to estimators whose structures are those of unweighted or weighted arithmetic means. Whether or not the procedures estimating their measurement uncertainties may be transferred to estimators satisfying other criteria will not be discussed here.

² One might wish to multiply the relative uncertainties (defined in (1.2) below) of the counters for gas, oil, petrol and electricity by their annual throughput and the respective prices.

Fig. 1.2. Diagram of a measuring process
Let any given arithmetic mean x and its uncertainty ux > 0 have, after all, been estimated. Then, the mathematical notation will be
x ± ux . (1.1)
Example
Let
m ± um = (10.31 ± 0.05) g

be the result of a weighing. Then the true value m0 of the mass m is required to lie somewhere within the interval 10.26 g . . . 10.36 g.
The uncertainty ux may be used to derive other, occasionally useful, statements. In this sense, the quotient

    ux / x    (1.2)

denotes the relative uncertainty. Herein, we might wish to substitute x0 for x. This, however, is prohibitive, as the true value is not known. Relative uncertainties vividly illustrate the accuracy of a measurement, independent of the order of magnitude of the respective estimators. Let U1 and U2 designate any physical units and let

    x1 = 10^-6 U1 ,   x2 = 10^6 U2

be estimators with the uncertainties

    ux1 = 10^-8 U1 ,   ux2 = 10^4 U2 .

The two estimators have, obviously, been determined with equal accuracy, though their uncertainties differ by not less than 12 decimal powers.

There are uncertainty statements which are still more specific. The frequently used ppm uncertainty (ppm means "parts per million") makes reference to the decimal power 6. The quotient

    ux / x = uppm(x) / 10^6

thus leads to

    uppm(x) = (ux / x) · 10^6 .    (1.3)
The idea is to achieve a more comfortable notation of numbers by getting rid of decimal powers which are known a priori. As an example, let uppm = 1; this implies ux/x = 10^-6.
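The quotients (1.2) and (1.3) may be sketched as two small Python helpers; the function names are hypothetical:

```python
def relative_uncertainty(x, ux):
    """u_x / x, Eq. (1.2); the estimator x is used, never the unknown
    true value x0."""
    return ux / abs(x)

def ppm_uncertainty(x, ux):
    """u_ppm(x) = (u_x / x) * 10**6, Eq. (1.3)."""
    return relative_uncertainty(x, ux) * 1e6

# The two estimators discussed above: equal relative accuracy, although
# their absolute uncertainties differ by 12 decimal powers.
print(relative_uncertainty(1e-6, 1e-8))  # 10^-2
print(relative_uncertainty(1e6, 1e4))    # 10^-2
print(ppm_uncertainty(1.0, 1e-6))        # uppm = 1 corresponds to ux/x = 10^-6
```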
Example
The frequency ν of the inverse ac Josephson effect involves dc voltage steps U = (h/2e)ν; e designates the elementary charge and h the Planck constant. Let the quotient 2e/h be quantified as

    4.8359767(14) × 10^14 Hz V^-1 .

How is this statement to be understood? It is an abbreviation and means

    2e/h ± u_2e/h = (4.8359767 ± 0.0000014) × 10^14 Hz V^-1 .

Converting to ppm yields

    uppm = (0.0000014 · 10^14 Hz V^-1) / (4.8359767 · 10^14 Hz V^-1) · 10^6 = 0.289 . . .

or, rounding up,

    uppm(2e/h) = 0.30 .
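The arithmetic of this example is easily verified, for instance in Python:

```python
# Estimator and uncertainty of 2e/h from the example, in Hz/V.
value = 4.8359767e14
u = 0.0000014e14

# Eq. (1.3): the common power of ten cancels in the quotient.
ppm = u / value * 1e6
print(ppm)  # 0.2894...
```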
1.2 Quotation of Numerical Values and Rounding
Evaluation procedures nearly always endow estimators with dispensable, insignificant decimal digits. To get rid of them, the measurement uncertainty should tell us where and how we have to round and, in particular, how to quote the final result [37].
1.2.1 Rounding of Estimators
First of all, we determine the decimal place in which the estimator should be rounded. Given that the first decimal digit of the measurement uncertainty differing from 0 turns out to be one of the digits

    1 or 2:     the estimator should be rounded in the place to the right of this place;
    3 up to 9:  the estimator should be rounded in just this place.

Having determined the decimal place of the estimator in which it should get rounded, we round it

    down, if the decimal place to the right of this place is one of the digits 0 up to 4;
    up,   if the decimal place to the right of this place is one of the digits 5 up to 9.
1.2.2 Rounding of Uncertainties
The uncertainty should be rounded up in the decimal place in which the estimator has been rounded.
Sometimes the numerical value of a physical constant is fixed with regard to some necessity. It goes without saying that fixing a physical constant does not produce a more accurate quantity. A possibility to emphasize a fixed digit would be to print it boldly, e.g. 273.16 K.
Table 1.1. Rounding of estimators and uncertainties
raw results                          rounded results
5.755018 . . . ± 0.000194 . . .      5.75502 ± 0.00020
1.9134 . . . ± 0.0048 . . .          1.913 ± 0.005
119748.8 . . . ± 123.7 . . .         119750 ± 130
81191.21 . . . ± 51.7 . . .          81190 ± 60
Examples

Let

    (7.238143 ± 0.000185) g

be the result of a weighing. The notation

    7 . 2 3 8 1 4 3
                ↑
    0 . 0 0 0 1 8 5

facilitates the rounding procedure. The arrow indicates the decimal place in which the estimator should be rounded down. In just this decimal place the uncertainty should be rounded up, i.e.

    (7.23814 ± 0.00019) g .

For simplicity, the units are suppressed in the examples presented in Table 1.1.
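The rules of Sects. 1.2.1 and 1.2.2 can be cast into a short Python sketch; round_result is a hypothetical helper, checked below against the entries of Table 1.1:

```python
import math

def round_result(estimator, uncertainty):
    """Locate the first non-zero decimal digit of the uncertainty; if it is
    1 or 2, move one place to the right.  Round the estimator half-up in
    that place (down at 0-4, up at 5-9) and round the uncertainty up in
    the same place."""
    exp = math.floor(math.log10(uncertainty))  # place of the first non-zero digit
    if int(uncertainty / 10 ** exp) in (1, 2):
        exp -= 1                               # one place to the right
    scale = 10.0 ** exp
    est = math.floor(estimator / scale + 0.5) * scale    # round half-up
    unc = math.ceil(uncertainty / scale - 1e-9) * scale  # always round up
    return est, unc

# The four rows of Table 1.1:
for raw in ((5.755018, 0.000194), (1.9134, 0.0048),
            (119748.8, 123.7), (81191.21, 51.7)):
    print(round_result(*raw))
```

The small epsilon guards against binary floating-point artefacts when the scaled uncertainty is already an integer.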
2 Formalization of Measuring Processes
2.1 Basic Error Equation
Measuring instruments are known not to operate perfectly. Nevertheless, for the experimenter it suffices to register a sequence of repeat measurements of, say, n outcomes
x1, x2, . . . , xn , (2.1)
which are free from drift. Obviously, under these circumstances, the data will scatter around a horizontal line, Fig. 2.1. As Eisenhart [9] has vividly formulated, the measuring instrument then operates in a state of statistical control. Then, as we shall assume, the measurement errors may be subdivided into two classes, namely into

– random errors and
– unknown systematic errors,

the latter being constant in time, at least during the period it takes to carry out the measurements.
The reservation "during the period of measurement" is essential, as it may well happen that, when the measurements are repeated at a later time, the systematic errors assume other values, giving rise to a different stationary state. To that effect, the boundaries of the unknown systematic errors should be fixed such that they encompass, once and for all, all potential realizations of the systematic errors.
Any drift, however, would model the data according to some trend. The measuring process would then no longer be stationary and, instead of a "horizontally" scattering series of repeat measurements, the experimenter would have registered something similar to a time series.
There will certainly be no drifts if the so-called influence quantities of the measuring process remain constant in time. These quantities are related to environmental and boundary conditions, to the thermal and mechanical stability of the experimental set-up, to the stability of its detectors, its power supplies, etc.
Generally speaking, a non-drifting measuring device will not be free from systematic errors, as the absence of drifts merely means that the influences of systematic perturbations are invisible – just on account of their constancy.
Fig. 2.1. Non-drifting (above) and drifting (below) sequences of repeated measurements
While random errors occur instantaneously, in the course of the measurements, the actual values of systematic errors are predetermined before the measurements get started.
In theoretical terms, systematic errors arise when physical models are conceptualized, as, in general, this calls for approximations and hypotheses. At an experimental level, systematic errors may be due to inexact adjustments and pre-settings, to environmental and boundary conditions, which are known to be controllable only with finite accuracy, etc. As an example, let us consider the density of air. Though its values appear to be predictable with remarkable accuracy, there still remains a non-negligible uncertainty component which is due to unknown systematic errors. In weighings of high accuracy, the greater the differences between the densities of the masses to be compared (e.g. platinum–iridium 21 500 kg/m3 and steel 8 000 kg/m3), the greater the uncertainty of the correction due to air buoyancy. As operators complain, this latter uncertainty, unfortunately, makes a significant contribution to the overall uncertainty of the weighing [26].
Furthermore, the switching-on of an electrical measuring device will not always lead to one and the same stationary state, and the same applies to the use of a mechanical device. Last but not least, inherently different procedures for measuring one and the same physical quantity will certainly bring about different systematic errors [14].
Random errors have to be attributed to microscopic fluctuations of events which affect the measuring process and prove to be uncontrollable on a macroscopic scale. According to experience, the random errors of stationary experimental set-ups may be considered, at least in reasonable approximation, to follow a normal probability distribution. Though we are not in a position to understand these phenomena in detail, we find guidance in the central limit theorem of statistics [43]. According to this, loosely stated, the superposition of sufficiently many similar random perturbations tends to a normal distribution.
Actually, for practical reasons, the statistical properties of measured data are rarely scrutinized. Nevertheless, we have to declare: should the assumption of normally distributed data fail, the procedures for the estimation of the random part of the overall measurement uncertainties, outlined step by step in the following, will not apply.
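The central limit argument can be illustrated numerically. The following Python sketch superposes many uniform perturbations; the choice of the uniform density and all numbers are illustrative assumptions:

```python
import random
import statistics

random.seed(2)

def superposed_error(k=50):
    """Superpose k similar, independent perturbations, uniform on [-1, 1],
    as a stand-in for the many microscopic disturbances mentioned above."""
    return sum(random.uniform(-1.0, 1.0) for _ in range(k))

errors = [superposed_error() for _ in range(20000)]

# The sum is approximately normal with mean 0 and variance k/3, since each
# uniform perturbation has variance 1/3.
sigma = (50 / 3) ** 0.5
within_one_sigma = sum(abs(e) < sigma for e in errors) / len(errors)
print(statistics.mean(errors), statistics.pstdev(errors), within_one_sigma)
```

About 68% of the superposed errors fall within one standard deviation, as expected for a normal density.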
The status of unknown systematic errors is quite different. First of all, the question as to whether or not they should be treated probabilistically is still controversial [10, 19]. Though the measuring process as outlined above relates them to a probabilistic origin, this is not meant in the sense of repeatable events but with respect to the observation that the experimenter has no chance of controlling or influencing the chain of events which, in the end, leads to an operative systematic error. Again, while it is true that systematic errors have their roots in random events, on account of their constancy, and in order to arrive at reliable measurement uncertainties, we prefer to treat them non-probabilistically and to make allowance for them by introducing biases and worst-case estimations. We might say that the approach of the Guide to randomize unknown systematic errors corresponds to an a priori attitude, whereas the idea to classify them as biases reflects an a posteriori position. While the former is rooted in theoretical considerations, the latter starts from experimental facts.
Should there be just one systematic error, it should certainly not be difficult to draw indisputable conclusions. Yet, it will be more controversial to assess the influence of several or even many unknown systematic errors. We shall discuss this situation in Sects. 2.3 and 6.3.
As is common practice, we let f denote the unknown systematic error and let
f1 ≤ f ≤ f2 ; f = const.
specify the confining interval. We expect the boundaries f1 and f2 to be safe, which simply means that f shall not lie outside the limits. As banal as this might sound, just as little should we forget that a declaration of intent is one matter and its implementation another. To recognize the sources of systematic errors, to formalize their potential influence, to experimentally reduce their magnitude as far as possible – all this requires many years of experience which, indeed, constitutes the ultimate art of measurement.
Nevertheless, the mathematical procedures providing the boundaries of the confining error intervals are not necessarily complicated. All sources which are to be considered should be listed and investigated with respect to the kind of influence they exert. Some of them will act directly, others within functional relationships. Then, the overall effect to be anticipated should be estimated on the basis of a worst-case reckoning.
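As a numerical sketch of such a worst-case reckoning – the function name and all numbers below are invented for illustration – suppose each source i acts on the result through a sensitivity coefficient cᵢ and is bounded by |fᵢ| ≤ fs,ᵢ; since the signs of the fᵢ are unknown, the worst case accumulates magnitudes:

```python
# Hypothetical sketch of a worst-case reckoning: each error source
# acts through a sensitivity coefficient c_i and is bounded by
# |f_i| <= fs_i; the worst case sums the magnitudes |c_i| * fs_i.

def worst_case_bound(sensitivities, bounds):
    """Return fs = sum over |c_i| * fs_i, the combined worst-case limit."""
    return sum(abs(c) * b for c, b in zip(sensitivities, bounds))

# Three invented sources: one acting directly (c = 1), one attenuated,
# one amplified within a functional relationship
fs = worst_case_bound([1.0, -0.5, 2.0], [0.01, 0.02, 0.005])
```

The negative sensitivity illustrates why only magnitudes may be accumulated: with the signs of the fᵢ unknown, no cancellation can be credited.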
In the following we will show that it is always feasible to confine f through an interval symmetrical to zero: if there are experimental findings definitely speaking against such an interval, we should use this information and subtract the term
(f1 + f2)/2

from both the asymmetrical boundaries and the measured data. Thus, we arrive at an interval of the form

−fs ≤ f ≤ fs ;  fs = (f2 − f1)/2 , (2.2)

and a new set of data x′l = xl − (f1 + f2)/2 ; l = 1, . . . , n, where, for convenience, we should abstain from explicitly priming the data. Rather, we should tacitly assume that the corrections have already been made.
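The shift to a symmetrical interval may be sketched as follows; the function name and the numeric values are invented, while the arithmetic is exactly that of (2.2):

```python
def symmetrize(f1, f2, data):
    """Turn f1 <= f <= f2 into -fs <= f' <= fs with fs = (f2 - f1)/2,
    subtracting the midpoint (f1 + f2)/2 from boundaries and data alike."""
    mid = (f1 + f2) / 2.0
    fs = (f2 - f1) / 2.0
    return fs, [x - mid for x in data]

# Invented asymmetric bounds 0.002 <= f <= 0.010 and invented readings
fs, shifted = symmetrize(0.002, 0.010, [1.005, 1.007, 1.006])
```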
Thus, our error model assumes normally distributed random errors and unknown systematic errors of property (2.2). The latter are understood to bias the measuring process. We observe:
The basic error equation of measurement for non-drifting measuring devices (operating in a state of statistical control) is given by

xl = x0 + εl + f ;  l = 1, . . . , n ;  −fs ≤ f ≤ fs . (2.3)

While the random errors εl are assumed to be normally distributed, the unknown systematic error f is considered a time-constant perturbation.
There may be situations in which the effects of perturbations are known to one group of persons but not to another. Then, the first group might wish to speak of systematic deviations while the second group has no other choice than to continue to refer to systematic errors. Should such a distinction turn out to be relevant, it is proposed to agree on the definition that
– a systematic deviation denotes a time-constant quantity identified in magnitude and sign, while
– an unknown systematic error designates a time-constant quantity unidentified in magnitude and sign.
In the following, we shall exclusively address the latter, i.e. unknown systematic errors, and, to be brief, we shall simply call them systematic errors, as by their very nature, errors are always unknown.
Finally, Fig. 2.2 particularizes the measuring process depicted in Sect. 1.1. It may happen that the mathematical representation of the physical model under consideration, for numerical or physical reasons, calls for approximations. Even hypotheses may enter, e.g. when a physical model is to be tested by experiment. The design of the measuring device complies with the mathematical formulation of the physical model.

Fig. 2.2. Diagrammed measuring process
Let us now discuss the still controversial question of whether or not systematic errors should be randomized by means of postulated probability densities. We are sure the experimenter has no chance of influencing the actual values of systematic errors entering his measurements. Doubtless, in this respect he is confronted with fortuitous events. On the other hand, the decisive question seems to be whether or not he is in a position to alter those systematic errors. In general, the answer will be negative.
Disregarding the efforts to be made, the experimenter could think of disassembling and reassembling the measuring device many a time, hoping to vary its systematic errors. Yet, he might not be in a position to do that, either for fear of damaging or even destroying some of the components or for reasons of principle.
Furthermore, the experimenter could try to vary the environmental conditions. This, however, might be difficult to achieve, and if he could do this, on account of the complexity of the interactions the new state might offer no more information than the previous one.
After all, even if the experimenter had the intention to vary his systematic errors, the prospects of success will be limited, so that a complete repetition of the measurements under “new conditions” need not necessarily be helpful.
Instead, we ask what properties a formalism which abstains from assigning probability densities to systematic errors should have. From the outset it is clear that such an approach will be incompatible with the essential features of the classical Gaussian formalism, thus making it necessary to start the revision from scratch. This might be seen as a disadvantage. On the other hand, the time-constancy of unknown systematic errors doubtlessly manifests physical reality, while feeding randomized unknown systematic errors into the classical Gaussian formalism produces quite a series of formal as well as experimental contradictions.
In statistical terms, estimators affected by unknown systematic errors are biased. At the same time, postulated densities would formally cancel biases, thus contradicting reality.
As has been mentioned, the treatment of several random variables asks for so-called covariances. The nature of the latter is either theoretical or empirical, depending on whether the covariances are taken from parent distributions or from samples of finite size. Obviously, randomized unknown systematic errors only possess theoretical expectations while physical experiments, on the other hand, will never produce expectations but just estimators. Now, even if stationary measuring processes will certainly not exhibit couplings between random and unknown systematic errors, as long as the latter are treated probabilistically, on account of purely formal reasons we are entitled to ask for covariances between the randomized and the truly random errors. – Up to now, a convincing answer to the question of how this might be accomplished does not seem to be at hand.
The analysis of variance attempts to find out whether or not different sets of empirical data of varying origins, relating to one and the same physical quantity, might have suffered from systematic effects. However, as we presently know, empirical data sequences are almost surely charged with systematic errors, so that any effort to refer to classical test criteria would be superfluous from the outset. In particular, there seems to be no chance to generalize the analysis of variance to include unknown systematic errors by means of postulated densities, Sect. 4.3.
In least squares adjustments, the minimized (and appropriately weighted) sum of squared residuals should approximate the number of degrees of freedom of the linear system. However, as empirical observations repeatedly show, this relationship scarcely applies. Rather, as a rule, the aforesaid sum exceeds the number of degrees of freedom appreciably, and it suggests itself to attribute this discrepancy to randomized systematic errors: though the latter have been randomized by postulated densities, the experiment invariably keeps them constant, thus revealing the untenability of the randomization concept.
Disregarding all conflicting points, i.e. when systematic errors are randomized, one could, nevertheless, set about defining uncertainty intervals. However, the associated formalism necessarily brings about the geometrical combination of random and systematic errors and thus bans clear and concise decisions. In particular, the experimenter would stay in doubt whether or not his uncertainty localized the true value of his measurand. This, unfortunately, implies the failure of traceability within the system of physical quantities and so deprives metrology of its objectivity.
Ostensibly, postulated densities for systematic errors would save the Gaussian error calculus. At the same time, this procedure abolishes the consistency between mathematics and physics.
The only way out seems to be to revise the Gaussian formalism from scratch. That Gauß¹ himself strictly refused to include systematic errors in his considerations should be understood as a clear warning sign – otherwise, as he knew, his formalism would not have been operative.
Indeed, the situation becomes different when systematic errors enter in the form of time-constant perturbations. Then, as explained above, random and systematic errors remain strictly separated and combine arithmetically. Also, when properly done, this approach allows the influence of random errors to be treated by means of Student's statistics and the influence of systematic errors by worst-case estimations. The associated formalism is clear and concise. In particular, so-defined measurement uncertainties localize the true values of the measurands “quasi-safely”, thus ensuring traceability from the start.
However, the announced reference to Student's statistics calls for a paradigm shift inasmuch as we shall have to introduce what might be called well-defined measuring conditions. These ask for equal numbers of repeated measurements for each of the variables entering the error propagation. For apparent reasons, well-defined measuring conditions allow the complete empirical variance–covariance matrices of the input data to be established and, as will become visible, just these matrices lead us back to Student's statistics.

¹ Though this is the proper spelling, in what follows we shall defer to the more common notation Gauss.
Nonetheless, to require equal numbers of repeat measurements conflicts with common practice. Possibly, unequal numbers of repeat measurements reflect the historical state of communication which, in the past, experimenters and working groups had to cope with. In other words, they virtually had no other choice. However, in the age of desk computers and the Internet, working groups can easily agree on common numbers of repeat measurements and, by the same token, they can exchange their raw data.
In the end it seems more advantageous to forego excessive numbers than to apply questionable, or at least clumsy, procedures of data evaluation. Firstly, a physical result should not significantly depend on the numbers of repeat measurements and, secondly, questionable uncertainties will tend to blur the primary goal of safely localizing the true values of the measurands.
In view of the high costs Big Physics involves, we should not hesitate to call for well-defined measuring conditions, all the more as the experimental efforts remain limited while the attainable benefits promise to be considerable.
2.2 Random Variables, Probability Densitiesand Parent Distributions
It seems appropriate to formalize the functioning of measuring devices by means of random variables. From a pragmatic point of view, on the one hand we have

– the set of all measured values which the measuring instruments can produce, and on the other hand
– the mathematically abstract definition of random processes and random variables, which implies that any number generated by a random process may be understood as the realization of a random variable.

Matching the two means assuming that any measured value is a realization of a random variable and any realization of a random variable a measured value.
Let X denote a random variable and let us attempt to identify its realizations x1, x2, . . . , xn with the measured data. Conceptually, we might expect the experimental set-up to produce an infinite number of different values. However, as the instrument's resolution is finite, we must be content with a random variable the realizations of which are discrete numbers with relatively few decimal places. In particular, we will in no way encounter an infinite number of numerically different readings. Nevertheless, for simplicity and in a formal sense, we shall stick to the notion of continuous random variables, assuming that their realizations map the measuring process.
The error model proposed requires that there be but one systematic error fx superposed on each of the realizations xl; l = 1, . . . , n. As a result, the actual (unknown) value of the systematic error defines the “center of gravity” of the statistical scattering of the random errors.

Fig. 2.3. One-dimensional empirical distribution density
When formalizing random errors by means of distribution densities, we shall have to distinguish between theoretical and empirical densities. From the features of empirical densities we may infer the structure of the underlying theoretical densities. To find the empirical distribution density of a sequence of measured data, we assign the data to the horizontal axis and the frequencies of their occurrences to the vertical axis of a rectangular coordinate system. Then, we decompose the horizontal axis into an adequate number r of intervals or “classes” of width ∆xi; i = 1, . . . , r. For any given i, we count the numbers ∆ni of measured values falling within the respective intervals

xi, . . . , xi + ∆xi ;  i = 1, . . . , r .
The empirical distribution density is defined by

p(xi) = ∆ni/(n ∆xi) ;  i = 1, . . . , r . (2.4)

Here, n denotes the total number of measured data. Obviously, r should have a value which is neither too large nor too small. In the first case, the numbers p(xi); i = 1, . . . , r would fluctuate unreasonably strongly, and in the second, they would bring about too coarse a resolution. The p(xi); i = 1, . . . , r define a sequence of rectangular columns, Fig. 2.3. The quotient ∆ni/n is called the frequency ratio. After all, the shape of an empirical probability density depends not only on the properties of the underlying random process but also on the width of the intervals ∆xi and the number n of data entering the sorting procedure.
From a statistical point of view, the set of measured values xl; l = 1, . . . , n defines a sample of size n. Increasing n to any size suggests postulating a so-called parent distribution, encompassing an infinite number of data. Certainly, from an experimental point of view, this is a fiction. Nevertheless, a parent distribution proves a useful abstraction. In particular, it allows for a probability density, called theoretical or exact, the shape of which is unique.

The theoretical density of the random variable X will be denoted as pX(x); the symbols x and X express that x is considered to be any realization of the random variable X.
In general, metrology addresses several series of measurements, e.g.
x1, x2, . . . , xn ; y1, y2, . . . , yn ; z1, z2, . . . , zn (2.5)
stemming from different measuring devices. To each of the sequences we assign a random variable, say X, Y, Z, . . . . Clearly, their joint distribution density is multidimensional. Given two random variables, the implied two-dimensional density is constructed as follows: we cover the x, y-plane with a line screen exhibiting increments ∆xi and ∆yj. Then, we count the numbers ∆nij of pairs xi, yj falling within the rectangles

xi, . . . , xi + ∆xi ;  yj, . . . , yj + ∆yj ;  i, j = 1, . . . , r .
The numbers ∆nij provide the two-dimensional empirical density via

p(xi, yj) = ∆nij/(n ∆xi ∆yj) ;  i, j = 1, . . . , r . (2.6)

Figure 2.4 depicts the basic idea. So-called covariances will assist us in allowing for dependencies between the series of data (2.5); they will be discussed in a later section.
Let us revert to (2.4). Another term for the frequency ratio ∆ni/n is empirical probability. Obviously,

∆Pi = ∆ni/n = p(xi) ∆xi (2.7)

denotes the empirical probability of finding ∆ni out of n measured values in an interval ranging from xi to xi + ∆xi. Accordingly, statement (2.6) yields

∆Pij = ∆nij/n = p(xi, yj) ∆xi ∆yj , (2.8)

denoting the empirical probability of finding ∆nij measured values out of n in a rectangle xi, . . . , xi + ∆xi; yj, . . . , yj + ∆yj. We confine ourselves to this rather technical definition of probability, which relies on the repeatability of physical experiments.
For larger intervals and rectangles, (2.7) and (2.8) turn into sums and double sums respectively. If, instead, we refer to theoretical or exact distribution densities and integral formulations,

P(a < X ≤ b) = ∫_a^b pX(x) dx (2.9)

denotes the probability of finding any realization of the random variable X within an interval a, . . . , b and

P(a < X ≤ b , c < Y ≤ d) = ∫_a^b ∫_c^d pXY(x, y) dx dy (2.10)

the probability of finding any realizations of the random variables X, Y within a rectangle a, . . . , b; c, . . . , d. Chapter 3 summarizes the distribution densities which will be of relevance here. Subject to their structure, they refer either to one or to more than one normal parent distribution.

Fig. 2.4. Two-dimensional empirical density
As to the interplay between random errors and unknown systematic errors, for stationary measuring processes we shall assume throughout that there are no dependencies or couplings of any kind.
2.3 Elements of Evaluation
In order to find suitable estimators, measured data xl; l = 1, . . . , n are usually subjected to arithmetic averaging. In view of the influence of unknown systematic errors, the arithmetic mean

x̄ = (1/n) Σ_{l=1}^n xl (2.11)

can, however, only be a biased estimator of the true value x0 of the measurand. As has repeatedly been stated, even if we know that x̄ carries a systematic error, we are not in a position to appropriately correct x̄. Indeed, all we can do is to design a meaningful measurement uncertainty. According to (2.3), each measured value xl may formally be written as
xl = x0 + εl + f ;  l = 1, . . . , n . (2.12)

Insertion into (2.11) yields

x̄ = x0 + (1/n) Σ_{l=1}^n εl + f . (2.13)
Even if we assume an arbitrary number of repeat measurements and consider the sum over the random errors to become negligibly small, the arithmetic mean still differs by the actual (unknown) value of the systematic error f from the true value x0 of the measurand. Hence, x̄ is biased with respect to the true value x0.
Averaging notionally over the infinitely many data xl; l = 1, 2, . . . of a normal parent distribution, loosely formulated, the arithmetic mean x̄ changes into a parameter

µ = x0 + f , (2.14)

where the symmetry of the implied distribution cancels the sum over the random errors. The parameter µ has the property of what will be called an expected value. To be realistic, so-defined quantities remain fictitious, experimentally inaccessible parameters. Inserting (2.14) in (2.13) yields
x̄ = µ + (1/n) Σ_{l=1}^n εl , (2.15)

which reveals that though the average x̄ is biased with respect to the true value x0 of the measurand, it is unbiased with respect to its expectation µ.
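A simulation with invented numbers illustrates (2.13)–(2.15): however many repeat measurements are averaged, the mean settles at µ = x0 + f, not at the true value x0.

```python
import random
import statistics

random.seed(42)                              # reproducible sketch
x0, f, sigma, n = 10.0, 0.5, 0.2, 10000      # invented true value, bias, scatter
data = [x0 + random.gauss(0.0, sigma) + f for _ in range(n)]
xbar = statistics.mean(data)
# xbar approaches mu = x0 + f = 10.5; averaging cannot remove the bias f
```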
Remarkably enough, C. Eisenhart [9] clarified these ostensibly plain relationships in the early 1960s. To the author's knowledge, earlier attempts to clear up the foundations of the Gaussian error calculus have not been made.
To better visualize the relationship between the true value x0 of the measurand, the expected value µ of the arithmetic mean x̄ and the systematic error f, we introduce the empirical variance

s² = (1/(n − 1)) Σ_{l=1}^n (xl − x̄)² , (2.16)

which, obviously, measures the scattering of the random errors, Fig. 2.5. From (2.12) and (2.13) we take that the difference xl − x̄ has cancelled the systematic error f. In (2.16) we could have inserted a factor 1/n instead of 1/(n − 1). However, a so-defined empirical variance would be biased, Sect. 4.2. Let σ² denote the expected value of the empirical variance s², and, further, s the empirical and σ the theoretical standard deviation. Again, σ² is an abstract definition remaining experimentally inaccessible. Nevertheless, in a formal sense, the theoretical variance and the theoretical standard deviation will prove useful many a time.
Fig. 2.5. True value x0, expected value µ, arithmetic mean x̄, systematic error f and empirical standard deviation s
The uncertainty of the arithmetic mean should account for the random fluctuations of the xl as well as for the systematic error f, the latter being bounded by an interval of the form (2.2). As we ask for safe uncertainties, we shall bring the boundaries ±fs of that interval to bear. This, in fact, means nothing other than to resort to a worst-case estimation.
Equation (2.12) suggests combining random and systematic errors arithmetically. Hence u = s + fs appears to be a first reasonable assessment of the uncertainty of the arithmetic mean. For completeness, let us provide the uncertainty, the empirical standard deviation and the worst-case estimate of the unknown systematic error with indices; then

ux̄ = sx̄ + fs,x . (2.17)
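As a first numerical reading of this assessment – data and bound invented, the first term taken as the plain empirical standard deviation of (2.16), leaving aside the statistical refinement announced in the text:

```python
import statistics

def mean_and_uncertainty(data, fs):
    """First assessment u = s + fs: the empirical standard deviation s
    (1/(n-1) form of (2.16)) combined arithmetically with the
    worst-case systematic bound fs."""
    xbar = statistics.mean(data)
    s = statistics.stdev(data)
    return xbar, s + fs

# Five invented readings and an invented bound fs = 0.05
xbar, u = mean_and_uncertainty([9.98, 10.02, 10.00, 10.01, 9.99], 0.05)
```

The arithmetic combination keeps the two error kinds strictly separated: the statistical part may shrink with further data, while the worst-case part fs stays fixed.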
Later, we shall refine the weight given to random errors by modifying the first term sx̄ through a factor based on sophisticated statistical considerations. In most cases there will be several series of measurements, so that the means x̄, ȳ, . . . have to be inserted into a given function φ(x̄, ȳ, . . .). Thus, we shall have to assess the uncertainty uφ of a given estimator φ(x̄, ȳ, . . .) and to quote something like

φ(x̄, ȳ, . . .) ± uφ . (2.18)

Fortunately, the considerations presented hitherto may be generalized so that, in correspondence with (2.17), we shall arrive at a similar result, namely

uφ = sφ + fs,φ . (2.19)
Fig. 2.6. Error propagation, symbolic
The quantities sφ and fs,φ will be developed in Chap. 6. Also, the first term on the right-hand side may be modified by means of more sophisticated statistical concepts.
In respect of Fig. 2.6, it seems worth adding a remark about the mathematical operations underlying (2.19). The procedures in question will combine n statistically equivalent data sets

x1, y1, . . .
x2, y2, . . .
. . . , . . . , . . .
xn, yn, . . .

where we shall have to brood over the question of how to pair or group the data.
In contrast to the fluctuations of the random errors, the respective systematic errors of the sequences will, of course, not alter, i.e. new information comes in on the part of statistical errors, but not on the part of systematic errors – all we know is that they have assumed (unknown) fixed values within intervals of given lengths.
Apparently, the course of time had shrouded Gauss's regular or constant systematic errors with the veil of oblivion. To the author's knowledge, they were rediscovered, or possibly even newly discovered, by Eisenhart [9] and Wagner [10] inside the national metrology institutes.
Remarkably enough, in certain cases, legal metrology exclusively confines itself to systems of inequalities as given in (2.2), treating and propagating unknown systematic errors in the form of consecutive worst-case estimations, thus meeting basic requirements to operate efficiently and safely [26].
3 Densities of Normal Parent Distributions
3.1 One-Dimensional Normal Density
Let the realizations of a random variable X constitute what has been called a parent distribution. If the probability of a realization of X falling within an interval x, . . . , x + dx is given by

pX(x) dx = 1/(σx √(2π)) exp(−(x − µx)²/(2σx²)) dx , (3.1)

the pool of data is said to be normally distributed and pX(x) itself is called the normal distribution density. While the parameter µx marks the center of the density, the parameter σx quantifies the broadness of the scattering of the data. To abbreviate, we also say X is N(µx, σx)-distributed.
Figure 3.1 depicts the progression of the density. Following the central limit theorem of statistics, we shall consider the stochastic perturbations of measuring processes to be at least approximately normally distributed – though, of course, there will be exceptions. But even if the statistical behavior of the perturbations is known on that score, metrological applications of (3.1) confront the experimenter with the difficulty that he is not familiar with the parameters µx and σx. Though this might be a matter of fact, the author wishes to point to the conflict between theoretically defined parameters of a distribution density and the experimenter's chances of finding appropriate approximations for them. Strictly speaking, by their very nature, theoretically defined parameters prove to be experimentally inaccessible.

Fig. 3.1. Normal probability density
Just to get accustomed to the practical aspect of probability densities, let us try to approximate the numbers ∆nj of the empirical density (2.4) by means of the density pX(x) implied in (3.1). Assuming normally distributed measured data xl; l = 1, . . . , n, the number of values falling within any of the r intervals xj, . . . , xj + ∆xj; j = 1, . . . , r would be

∆nj ≈ n/(σx √(2π)) exp(−(xj − µx)²/(2σx²)) ∆xj . (3.2)
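Evaluating (3.1) and (3.2) numerically, with invented parameters standing in for the unknown µx and σx:

```python
import math

def normal_density(x, mu, sigma):
    """The density p_X(x) of (3.1)."""
    return (math.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2))
            / (sigma * math.sqrt(2.0 * math.pi)))

# Expected class occupancy (3.2) for a class of width dx at the center
n, mu, sigma, dx = 1000, 0.0, 1.0, 0.1    # invented values
dn_center = n * normal_density(mu, mu, sigma) * dx
```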
Using (3.2), we might wish to design a mean of the data,

x̄ ≈ Σ_{j=1}^r xj ∆nj/n , (3.3)

and also something like a quadratic spread with respect to the so-defined mean,

sx² ≈ Σ_{j=1}^r (xj − x̄)² ∆nj/n . (3.4)

However intriguing these attempts may seem, they are meaningless, as they are based upon the unknown parameters µx and σx.
Let us now refer to the parent distribution and not just to a sample of size n. Then, (3.3) changes into the theoretical mean

µx = ∫_{−∞}^{∞} x pX(x) dx (3.5)

and (3.4) into the theoretical variance

σx² = ∫_{−∞}^{∞} (x − µx)² pX(x) dx . (3.6)

As is seen, (3.5) and (3.6) simply reproduce the parameters of the density pX(x). Again, there is no benefit. Nevertheless, the parameters µx and σx², though unknown and inaccessible, are of basic importance as they represent the expectations of the empirical estimators x̄ and sx².

In the following, we shall resort to expectations many a time, so that it suggests itself to introduce a convenient symbolism. Instead of (3.5) and (3.6), we shall write

µx = E{X}  and  σx² = E{(X − µx)²}
respectively. Here, E{·} symbolizes the integration while X stands for the integration variable. As has already been done, random variables will be denoted by upper-case letters and their realizations by lower-case letters.¹
The smaller the scattering of the measured values, the smaller σx and the narrower the density pX(x). Conversely, the wider the scattering, the larger σx and the broader the density pX(x). In particular, given µx = 0 and σx = 1, the density (3.1) turns into the so-called standardized normal density N(0, 1). Figure 3.1 underscores the rapid convergence of pX(x) to zero, which accords well with the experimental behavior of the data – the latter scatter rather to a lesser extent: outside µx ± 2σx, only relatively few values are to be observed. After all, the tails of the normal density outside, say, µx ± 3σx, are scarcely significant. The actual situation depends, however, on the intrinsic properties of the particular measuring device.
Nevertheless, confidence intervals and testing of hypotheses just rely upon the tails of distribution densities, which, in pragmatic terms, means that so-defined criteria are based on statistically robust and powerful assertions. After all, the tails of the densities rather play a conceptual role and scarcely appear to be verifiable – at least as long as measuring devices operate in statistically stationary states and do not produce short-time operant systematic errors, so-called outliers, which leave the experimenters in doubt as to whether or not a particular datum might still pertain to one of the tails of the density.
The result of a measurement should not depend significantly on the actual number of repeat measurements, which implies that outliers should not be allowed to play a critical role with regard to the positioning of estimators – though, in individual cases, it might well be recommendable to rely on sophisticated decision criteria. On the other hand, judging from an experimental point of view, it could be more advantageous to identify the causes of outliers.
When we assume that the consecutive values x1, x2, . . . , xn of a sequence of repeat measurements are independent, we impute that the measuring device relaxes before the next measured value is read. Given the measuring device complies with this requirement, we may derive the probability density of the random variable X̄ corresponding to the arithmetic mean x̄ as follows. To begin with, we generalize our previous notion of just one random variable X producing the data x1, x2, . . . , xn and conceive, instead, of an ensemble of statistically equivalent series of measurements. To the first “vertical sequence”

x1^(1), x1^(2), . . . , x1^(k), . . .

of this ensemble

x1^(1), x2^(1), x3^(1), . . . , xn^(1)   first series
x1^(2), x2^(2), x3^(2), . . . , xn^(2)   second series
. . . . . . . . . . . . . . . . . .
x1^(k), x2^(k), x3^(k), . . . , xn^(k)   k-th series
. . . . . . . . . . . . . . . . . .

¹ Using Greek letters, however, we shall not explicitly distinguish random variables and their realizations.
we assign a random variable X1, to the second a random variable X2, etc., and, finally, to the n-th a random variable Xn. Then, considering the so-defined variables to be independent and normally distributed, by means of any real constants b1, b2, . . . , bn, we may define a sum

Z = b1X1 + b2X2 + · · · + bnXn . (3.7)
As may be shown [40], see Appendix C, the random variable Z is normally distributed. Putting

µi = E{Xi}  and  σi² = E{(Xi − µi)²} ;  i = 1, . . . , n ,

its expectation and scattering are given by

µz = Σ_{l=1}^n bl µl  and  σz² = Σ_{l=1}^n bl² σl² (3.8)

respectively. Its distribution density turns out to be

pZ(z) = 1/(σz √(2π)) exp(−(z − µz)²/(2σz²)) . (3.9)
As the Xl; l = 1, . . . , n are taken from one and the same parent distribution, we have

µ1 = µ2 = · · · = µn = µx  and  σ1² = σ2² = · · · = σn² = σx² .

Hence, setting bl = 1/n; l = 1, . . . , n, the sum Z changes into X̄, µz into µx and, finally, σz² into σx²/n. Consequently, the probability of finding a realization of X̄ within an interval x̄, . . . , x̄ + dx̄ is given by
pX̄(x̄) dx̄ = 1/((σx/√n) √(2π)) exp(−(x̄ − µx)²/(2σx²/n)) dx̄ . (3.10)
In the following, we shall refer to the density pX̄(x̄) of the arithmetic mean X̄ many a time. Error propagation, to be developed in Chap. 6, will be based on linear sums of the kind (3.7), though, admittedly, we shall have to extend the presumption underlying (3.10), as the sum Z turns out to be normally distributed even if the variables Xl; l = 1, . . . , n prove to be dependent, see (3.29). Let a final remark be made on the coherence between the level of probability and the range of integration,
P = ∫_{−a}^{a} pX̄(x̄) dx̄ .

Setting a = 2σx̄ with σx̄ = σx/√n, we have

0.95 ≈ ∫_{−2σx̄}^{2σx̄} pX̄(x̄) dx̄ , (3.11)

i.e. P ≈ 95%.
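Both statements – the shrinking scatter σx̄ = σx/√n of (3.10) and the roughly 95% coverage of (3.11) – can be checked by simulation; all numbers below are invented:

```python
import random
import statistics

random.seed(7)                               # reproducible sketch
mu, sigma, n, trials = 0.0, 1.0, 25, 2000    # invented parameters
means = [statistics.mean(random.gauss(mu, sigma) for _ in range(n))
         for _ in range(trials)]
s_mean = statistics.stdev(means)             # approaches sigma/sqrt(n) = 0.2
inside = sum(abs(m - mu) <= 2 * sigma / n ** 0.5 for m in means) / trials
# inside, the fraction of means within +/- 2 sigma/sqrt(n), comes out near 0.95
```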
3.2 Multidimensional Normal Density
We consider several series of repeat measurements

x1, x2, . . . , xn
y1, y2, . . . , yn
. . . . . . . . . . . .     (3.12)

each comprising the same number n of items. We assign random variables X, Y, . . . to the series and assume that they refer to different normal parent distributions.
The joint treatment of several series of measurements implies the introduction of covariances. Again, we have to distinguish between empirical and theoretical quantities. While the former pertain to series of measurements of finite lengths, the latter are defined through theoretical distribution densities.
Let us consider just two series of measurements $x_l, y_l;\ l = 1,\dots,n$. By definition,

$$s_{xy} = \frac{1}{n-1}\sum_{l=1}^{n}(x_l-\bar x)(y_l-\bar y) \qquad (3.13)$$

designates the empirical covariance and

$$r_{xy} = \frac{s_{xy}}{s_x s_y}\,;\qquad |r_{xy}| \le 1$$

the empirical correlation coefficient. The quantities $s_x$ and $s_y$ denote the empirical standard deviations, i.e. the square roots of the empirical variances

$$s_x^2 = \frac{1}{n-1}\sum_{l=1}^{n}(x_l-\bar x)^2 \quad\text{and}\quad s_y^2 = \frac{1}{n-1}\sum_{l=1}^{n}(y_l-\bar y)^2\,. \qquad (3.14)$$
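Definitions (3.13) and (3.14) translate directly into code. The sketch below uses made-up illustrative data (numpy assumed available) and cross-checks the result against numpy's own covariance routine:

```python
import math
import numpy as np

def empirical_cov(x, y):
    """Empirical covariance s_xy as in (3.13), divisor n - 1."""
    n = len(x)
    return float(np.sum((x - x.mean()) * (y - y.mean())) / (n - 1))

# illustrative (hypothetical) series of n = 5 paired measurements
x = np.array([10.1, 9.8, 10.3, 10.0, 9.9])
y = np.array([20.3, 19.9, 20.5, 20.1, 20.0])

s_xy = empirical_cov(x, y)
s_x = math.sqrt(empirical_cov(x, x))   # empirical standard deviations, (3.14)
s_y = math.sqrt(empirical_cov(y, y))
r_xy = s_xy / (s_x * s_y)              # empirical correlation coefficient
```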
While the expectation of (3.13) leads to the theoretical covariance

$$\sigma_{xy} = E\{S_{xy}\}\,, \qquad (3.15)$$

we associate a theoretical correlation coefficient of the form

$$\varrho_{xy} = \frac{\sigma_{xy}}{\sigma_x\sigma_y}\,;\qquad |\varrho_{xy}| \le 1$$

with the empirical quantity $r_{xy}$. The definitions of $\sigma_{xy}$ and $\varrho_{xy}$ are formal, as they refer to experimentally inaccessible quantities. It proves convenient to assemble the estimators $s_x^2 \equiv s_{xx}$, $s_{xy} = s_{yx}$, $s_y^2 \equiv s_{yy}$ and the parameters $\sigma_x^2 \equiv \sigma_{xx}$, $\sigma_{xy} = \sigma_{yx}$, $\sigma_y^2 \equiv \sigma_{yy}$ in variance–covariance matrices

$$s = \begin{pmatrix} s_{xx} & s_{xy} \\ s_{yx} & s_{yy} \end{pmatrix} \quad\text{and}\quad \sigma = \begin{pmatrix} \sigma_{xx} & \sigma_{xy} \\ \sigma_{yx} & \sigma_{yy} \end{pmatrix} \qquad (3.16)$$

respectively, where it is assumed that they are non-singular. In this case, they are even positive definite, see Appendix B.
Given that the random variables $X$ and $Y$ are normally distributed, the probability of finding a pair of realizations in any rectangle $x,\dots,x+\mathrm{d}x;\ y,\dots,y+\mathrm{d}y$ follows from

$$p_{XY}(x,y)\,\mathrm{d}x\,\mathrm{d}y = \frac{1}{2\pi|\sigma|^{1/2}}\,\exp\!\left(-\frac{1}{2|\sigma|}\left[\sigma_y^2(x-\mu_x)^2 - 2\sigma_{xy}(x-\mu_x)(y-\mu_y) + \sigma_x^2(y-\mu_y)^2\right]\right)\mathrm{d}x\,\mathrm{d}y\,, \qquad (3.17)$$

where $\mu_x$ and $\mu_y$ designate the expectations $E\{X\}$ and $E\{Y\}$ respectively, and $|\sigma|$ the determinant of the theoretical variance–covariance matrix $\sigma$. By means of the vectors

$$\zeta = \begin{pmatrix} x \\ y \end{pmatrix} \quad\text{and}\quad \mu = \begin{pmatrix} \mu_x \\ \mu_y \end{pmatrix},$$

the quadratic form of the exponent in (3.17) may be written as

$$q = \frac{1}{|\sigma|}\left[\sigma_y^2(x-\mu_x)^2 - 2\sigma_{xy}(x-\mu_x)(y-\mu_y) + \sigma_x^2(y-\mu_y)^2\right] = (\zeta-\mu)^{\mathrm T}\sigma^{-1}(\zeta-\mu)\,.$$
The symbol "T", used where necessary as a superscript of vectors and matrices, denotes transposition. Evidently, the density (3.17) reproduces the theoretical covariance
$$\sigma_{xy} = E\{(X-\mu_x)(Y-\mu_y)\} = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}(x-\mu_x)(y-\mu_y)\,p_{XY}(x,y)\,\mathrm{d}x\,\mathrm{d}y\,. \qquad (3.18)$$

The same applies to the parameters $\sigma_x^2$ and $\sigma_y^2$. If $\sigma_{xy}$ vanishes, the series of measurements are called uncorrelated. In this situation, the conventional error calculus tacitly ignores empirical covariances. It is interesting to note that, as long as experimenters refer to unequal numbers of repeat measurements, which is common practice, empirical covariances cannot be formalized anyway. We shall resume the question of how to interpret and handle empirical covariances in due course.
Given that the series of measurements (3.12) are independent, the density $p_{XY}(x,y)$ factorizes and $\sigma_{xy}$ vanishes, which means the two data series are uncorrelated. Conversely, it is easy to give examples proving that uncorrelated series of measurements may well be dependent [43, 50]. However, with respect to normally distributed data series, a vanishing covariance automatically entails the independence of the pertaining random variables.
Considering $m$ random variables $X_1, X_2, \dots, X_m$, the theoretical variances and covariances are given by

$$\sigma_{ij} = E\{(X_i-\mu_i)(X_j-\mu_j)\}\,;\qquad i,j = 1,\dots,m\,, \qquad (3.19)$$

which, again, we assemble in a variance–covariance matrix

$$\sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \dots & \sigma_{1m} \\ \sigma_{21} & \sigma_{22} & \dots & \sigma_{2m} \\ \dots & \dots & \dots & \dots \\ \sigma_{m1} & \sigma_{m2} & \dots & \sigma_{mm} \end{pmatrix};\qquad \sigma_{ii} \equiv \sigma_i^2\,;\quad \sigma_{ij} = \sigma_{ji}\,. \qquad (3.20)$$
If rank$(\sigma) = m$, the latter is positive definite. If rank$(\sigma) < m$, it is positive semi-definite, see Appendix B.
Now, let $x_1, x_2, \dots, x_m$ denote any realizations of the random variables $X_1, X_2, \dots, X_m$ and let $\mu_1, \mu_2, \dots, \mu_m$ designate the respective expectations. Then, defining the column vectors

$$x = (x_1\ x_2\ \cdots\ x_m)^{\mathrm T} \quad\text{and}\quad \mu = (\mu_1\ \mu_2\ \cdots\ \mu_m)^{\mathrm T}\,, \qquad (3.21)$$

the associated $m$-dimensional normal probability density is given by

$$p_X(x) = \frac{1}{(2\pi)^{m/2}|\sigma|^{1/2}}\,\exp\!\left(-\frac{1}{2}(x-\mu)^{\mathrm T}\sigma^{-1}(x-\mu)\right), \qquad (3.22)$$
where we assume rank$(\sigma) = m$.

Let us now consider the random variables $\bar X$ and $\bar Y$. We put down their theoretical variances

$$\sigma_{\bar x\bar x} \equiv \sigma_{\bar x}^2 = \frac{\sigma_x^2}{n} \quad\text{and}\quad \sigma_{\bar y\bar y} \equiv \sigma_{\bar y}^2 = \frac{\sigma_y^2}{n}\,, \qquad (3.23)$$
their theoretical covariance

$$\sigma_{\bar x\bar y} = \frac{\sigma_{xy}}{n} \qquad (3.24)$$

and, finally, their theoretical correlation coefficient

$$\varrho_{\bar x\bar y} = \frac{\sigma_{\bar x\bar y}}{\sigma_{\bar x}\sigma_{\bar y}} = \frac{\sigma_{xy}}{\sigma_x\sigma_y} = \varrho_{xy}\,.$$
After all, the joint distribution density is given by

$$p_{\bar X\bar Y}(\bar x,\bar y) = \frac{1}{2\pi|\bar\sigma|^{1/2}}\,\exp\!\left(-\frac{1}{2}(\bar\zeta-\mu)^{\mathrm T}\bar\sigma^{-1}(\bar\zeta-\mu)\right), \qquad (3.25)$$

where

$$\bar\zeta = \begin{pmatrix} \bar x \\ \bar y \end{pmatrix} \quad\text{and}\quad \bar\sigma = \begin{pmatrix} \sigma_{\bar x\bar x} & \sigma_{\bar x\bar y} \\ \sigma_{\bar y\bar x} & \sigma_{\bar y\bar y} \end{pmatrix}.$$
The quadratic form in the exponent of (3.25),

$$\bar q = (\bar\zeta-\mu)^{\mathrm T}\bar\sigma^{-1}(\bar\zeta-\mu) = \frac{1}{|\bar\sigma|}\left[\sigma_{\bar y\bar y}(\bar x-\mu_x)^2 - 2\sigma_{\bar x\bar y}(\bar x-\mu_x)(\bar y-\mu_y) + \sigma_{\bar x\bar x}(\bar y-\mu_y)^2\right]$$
$$= \frac{n}{|\sigma|}\left[\sigma_y^2(\bar x-\mu_x)^2 - 2\sigma_{xy}(\bar x-\mu_x)(\bar y-\mu_y) + \sigma_x^2(\bar y-\mu_y)^2\right] \qquad (3.26)$$

defines ellipses $\bar q = \text{const}$. Finally, we add the joint density of the variables $\bar X_1, \bar X_2, \dots, \bar X_m$,

$$p_{\bar X}(\bar x) = \frac{1}{(2\pi)^{m/2}|\bar\sigma|^{1/2}}\,\exp\!\left(-\frac{1}{2}(\bar x-\mu)^{\mathrm T}\bar\sigma^{-1}(\bar x-\mu)\right) \qquad (3.27)$$

and liken the quadratic forms in the exponents of (3.22) and (3.27),

$$q = (x-\mu)^{\mathrm T}\sigma^{-1}(x-\mu) \quad\text{and}\quad \bar q = (\bar x-\mu)^{\mathrm T}\bar\sigma^{-1}(\bar x-\mu)\,. \qquad (3.28)$$

Remarkably enough, both are $\chi^2(m)$ distributed, see Sect. 3.3. From $\bar q$ we shall derive the variable of Hotelling's density in due course.
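The density (3.22) can be evaluated directly with numpy's linear algebra. As a sanity check, for a diagonal $\sigma$ the joint density must factorize into a product of one-dimensional normal densities; the numbers below are made up for illustration:

```python
import math
import numpy as np

def mvn_pdf(x, mu, sigma):
    """m-dimensional normal density (3.22), sigma positive definite."""
    m = len(mu)
    d = x - mu
    q = float(d @ np.linalg.inv(sigma) @ d)   # quadratic form (x-mu)^T sigma^-1 (x-mu)
    norm = (2.0 * math.pi) ** (m / 2.0) * math.sqrt(float(np.linalg.det(sigma)))
    return math.exp(-0.5 * q) / norm

def normal_pdf(x, mu, var):
    """One-dimensional normal density with expectation mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

mu = np.array([1.0, -2.0])
sigma = np.diag([0.5, 2.0])        # diagonal: the two components are independent
x = np.array([1.3, -1.5])

p_joint = mvn_pdf(x, mu, sigma)
p_prod = normal_pdf(1.3, 1.0, 0.5) * normal_pdf(-1.5, -2.0, 2.0)
```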
Finally, we generalize (3.7), admitting dependence between the variables. This time, however, we shall consider $m$ random variables instead of $n$, so that

$$Z = b_1X_1 + b_2X_2 + \cdots + b_mX_m\,. \qquad (3.29)$$

Let the $X_i$ come from the density (3.22). Then, as is shown e.g. in [40], see also Appendix C, the variable $Z$ is still normally distributed. Its expectation and its variance are given by
$$\mu_z = E\{Z\} = \sum_{i=1}^{m} b_i\,\mu_i \quad\text{and}\quad \sigma_z^2 = E\{(Z-\mu_z)^2\} = b^{\mathrm T}\sigma\,b\,, \qquad (3.30)$$
respectively, where $b = (b_1\ b_2\ \dots\ b_m)^{\mathrm T}$. In (3.11) we realized that an interval of length $4\sigma_{\bar x}$ is related to a probability level of about 95%. Yet, under relatively simple circumstances, the number of variables involved may easily be 20 or even more. Let us set $m = 20$ and ask about the probability of finding realizations $\bar x_1, \bar x_2, \dots, \bar x_m$ of the random variables $\bar X_1, \bar X_2, \dots, \bar X_m$ within a hypercube symmetrical about zero and having edges of $4\sigma_{\bar x}$ side length. For simplicity, we assume the random variables to be independent. This probability is given by

$$P \approx 0.95^{20} \approx 0.36\,. \qquad (3.31)$$

Not until we extend the side lengths of the edges of the hypercube up to $6\sigma_{\bar x}$ would we come back to $P \approx 95\%$.
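The shrinkage (3.31) and the required widening of the hypercube are easy to verify numerically; the bisection below finds the per-axis half-width that restores a joint level of 95% (a sketch using only the standard library):

```python
import math

def p_axis(k):
    """P(|Z| <= k) for one standardized normal component."""
    return math.erf(k / math.sqrt(2.0))

m = 20
P_joint = p_axis(2.0) ** m        # hypercube with 4-sigma edges: ~0.39
P_rounded = 0.95 ** m             # the rounded estimate of (3.31): ~0.36

# bisection for the half-width k with p_axis(k)**m = 0.95; k turns out near 3,
# i.e. edges of about 6 sigma, as stated in the text
lo, hi = 2.0, 4.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if p_axis(mid) ** m < 0.95:
        lo = mid
    else:
        hi = mid
k95 = 0.5 * (lo + hi)
```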
3.3 Chi-Square Density and F Density
The chi-square or $\chi^2$ density covers the statistical properties of a sum of independent, normally distributed, standardized and squared random variables. Let $X$ be normally distributed with $E\{X\} = \mu_x$ and $E\{(X-\mu_x)^2\} = \sigma_x^2$. Then, obviously, the quantity

$$Y = \frac{X-\mu_x}{\sigma_x} \qquad (3.32)$$

is normal and standardized, as $E\{Y\} = 0$ and $E\{Y^2\} = 1$. Transferred to $n$ independent random variables $X_1, X_2, \dots, X_n$, the probability of finding a realization of the sum variable

$$\chi^2(n) = \sum_{l=1}^{n}\left[\frac{X_l-\mu_x}{\sigma_x}\right]^2 \qquad (3.33)$$

within any interval $\chi^2,\dots,\chi^2+\mathrm{d}\chi^2$ is given by

$$p_{\chi^2}(\chi^2, n)\,\mathrm{d}\chi^2 = \frac{1}{2^{n/2}\Gamma(n/2)}\,(\chi^2)^{n/2-1}\exp\!\left(-\frac{\chi^2}{2}\right)\mathrm{d}\chi^2\,; \qquad (3.34)$$
$p_{\chi^2}(\chi^2, n)$ is called the $\chi^2$ density. As the parameter $\mu_x$ is experimentally inaccessible, (3.33) and (3.34) should be modified. This is, in fact, possible. Substituting $\bar X$ for $\mu_x$ changes (3.33) into

$$\chi^2(n-1) = \sum_{l=1}^{n}\left[\frac{X_l-\bar X}{\sigma_x}\right]^2 = \frac{n-1}{\sigma_x^2}\,S_x^2 \qquad (3.35)$$
and (3.34) into

$$p_{\chi^2}(\chi^2, \nu)\,\mathrm{d}\chi^2 = \frac{1}{2^{\nu/2}\Gamma(\nu/2)}\,(\chi^2)^{\nu/2-1}\exp\!\left(-\frac{\chi^2}{2}\right)\mathrm{d}\chi^2\,, \qquad (3.36)$$

letting $\nu = n-1$. The number $\nu$ specifies the degrees of freedom. The expectation of $\chi^2$ is given by

$$\nu = \int_0^{\infty}\chi^2\,p_{\chi^2}(\chi^2, \nu)\,\mathrm{d}\chi^2\,, \qquad (3.37)$$

while for the variance we find

$$2\nu = \int_0^{\infty}(\chi^2-\nu)^2\,p_{\chi^2}(\chi^2, \nu)\,\mathrm{d}\chi^2\,. \qquad (3.38)$$
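The moments (3.37) and (3.38) can be confirmed by integrating the density (3.36) on a grid; $\nu = 5$ below is an arbitrary illustrative choice:

```python
import math
import numpy as np

nu = 5                                   # arbitrary degrees of freedom
x = np.linspace(1e-9, 120.0, 600001)     # grid; the density is ~0 beyond 120
dx = x[1] - x[0]

# chi-square density (3.36)
pdf = x ** (nu / 2.0 - 1.0) * np.exp(-x / 2.0) / (2.0 ** (nu / 2.0) * math.gamma(nu / 2.0))

total = float(np.sum(pdf) * dx)                    # normalization, ~1
mean = float(np.sum(x * pdf) * dx)                 # (3.37): expectation nu
var = float(np.sum((x - mean) ** 2 * pdf) * dx)    # (3.38): variance 2*nu
```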
The $\chi^2$ density will be used to illustrate the idea of what has been called well-defined measuring conditions. To this end, let us consider the random variable defined in (3.29),

$$Z = b_1X_1 + b_2X_2 + \cdots + b_mX_m\,, \qquad (3.39)$$

and $n$ of its realizations

$$z^{(l)} = b_1x_1^{(l)} + b_2x_2^{(l)} + \cdots + b_mx_m^{(l)}\,;\qquad l = 1, 2, \dots, n\,.$$

The sequence $z^{(l)};\ l = 1, 2, \dots, n$ leads us to the mean and to the empirical variance,

$$\bar z = \frac{1}{n}\sum_{l=1}^{n} z^{(l)} \quad\text{and}\quad s_z^2 = \frac{1}{n-1}\sum_{l=1}^{n}\left(z^{(l)}-\bar z\right)^2 \qquad (3.40)$$
respectively. As the $z^{(l)}$ are independent,² we are in a position to define

$$\chi^2(n-1) = \frac{(n-1)\,S_z^2}{\sigma_z^2}\,, \qquad (3.41)$$

where $\sigma_z^2 = E\{(Z-\mu_z)^2\}$. If $m = 2$, then, as

$$\bar z = b_1\bar x_1 + b_2\bar x_2 \quad\text{and}\quad z^{(l)}-\bar z = b_1\left(x_1^{(l)}-\bar x_1\right) + b_2\left(x_2^{(l)}-\bar x_2\right),$$

we have

$$s_z^2 = b_1^2 s_1^2 + 2b_1b_2 s_{12} + b_2^2 s_2^2\,. \qquad (3.42)$$
² Even if the $X_i;\ i = 1,\dots,m$ are dependent, the realizations $z^{(l)};\ l = 1,\dots,n$ of the random variable $Z$ are independent. This is obvious, as the realizations $x_i^{(l)}$ of $X_i$, with $i$ fixed and $l$ running, are independent. The same applies to the realizations of the other variables $X_j;\ j = 1,\dots,m;\ j \neq i$.
Obviously, the $\chi^2$ as defined in (3.41) presupposes the same number of repeat measurements in each series and includes the empirical covariance $s_{12}$ – whether or not $\sigma_{12} = 0$ is of no relevance.

In error propagation, we generally consider the link-up of several independent series of measurements. Then, although there are no theoretical covariances, we shall nevertheless consider their empirical counterparts (which, in general, will not be zero). If we did not consider the latter, we would abolish basic statistical relationships, namely the beneficial properties of $\chi^2$ as exemplified through (3.41) and (3.42).
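The identity (3.42) holds exactly only when the empirical covariance $s_{12}$ is retained: the empirical variance of the $z^{(l)}$ computed directly from (3.40) equals $b_1^2s_1^2 + 2b_1b_2s_{12} + b_2^2s_2^2$, covariance term included. A quick demonstration with simulated (hypothetical) data:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x1 = rng.normal(0.0, 1.0, n)
x2 = 0.5 * x1 + rng.normal(0.0, 1.0, n)   # second series, dependent on the first
b1, b2 = 2.0, -1.0

z = b1 * x1 + b2 * x2
s2_direct = float(z.var(ddof=1))          # empirical variance of the z^(l), as in (3.40)

s11 = float(x1.var(ddof=1))
s22 = float(x2.var(ddof=1))
s12 = float(np.cov(x1, x2, ddof=1)[0, 1])

s2_with_cov = b1**2 * s11 + 2 * b1 * b2 * s12 + b2**2 * s22   # identity (3.42)
s2_without_cov = b1**2 * s11 + b2**2 * s22                    # covariance dropped
```

Dropping $s_{12}$ leaves a residual of $2b_1b_2s_{12}$, i.e. the "conventional" variance is simply wrong for this data set.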
Let us return to the quadratic forms $q$ and $\bar q$ quoted in (3.28). Assuming rank$(\sigma) = m$, the matrix $\sigma$ is positive definite, so that a decomposition

$$\sigma = c^{\mathrm T} c \qquad (3.43)$$

is defined. As $\sigma^{-1} = c^{-1}(c^{\mathrm T})^{-1}$, we find

$$q = (x-\mu)^{\mathrm T} c^{-1}(c^{\mathrm T})^{-1}(x-\mu) = y^{\mathrm T} y = \sum_{i=1}^{m} y_i^2\,, \qquad (3.44)$$

where

$$y = (c^{\mathrm T})^{-1}(x-\mu) = (y_1\ y_2\ \cdots\ y_m)^{\mathrm T}\,. \qquad (3.45)$$

Obviously, the components of the auxiliary vector $y$ are normally distributed. Moreover, because of $E\{y\} = 0$ and $E\{y\,y^{\mathrm T}\} = I$, where $I$ denotes an $(m\times m)$ identity matrix, they are standardized and independent, i.e. the quadratic form $q$ is $\chi^2(m)$ distributed. The same applies to the quadratic form $\bar q$.
The $F$ density considers two normally distributed, independent random variables $X_1$ and $X_2$. Let $S_1^2$ and $S_2^2$ denote their empirical variances, implying $n_1$ and $n_2$ repeat measurements respectively. Then, the probability of finding a realization of the random variable

$$F = \frac{S_1^2}{S_2^2} = \frac{\chi^2(\nu_1)/\nu_1}{\chi^2(\nu_2)/\nu_2}\,;\qquad \nu_1 = n_1-1\,,\quad \nu_2 = n_2-1 \qquad (3.46)$$

in any interval $f,\dots,f+\mathrm{d}f$ is given by

$$p_F(f; \nu_1, \nu_2)\,\mathrm{d}f = \left[\frac{\nu_1}{\nu_2}\right]^{\nu_1/2}\frac{\Gamma[(\nu_1+\nu_2)/2]}{\Gamma(\nu_1/2)\,\Gamma(\nu_2/2)}\;\frac{f^{\nu_1/2-1}}{\left[1+(\nu_1/\nu_2)f\right]^{(\nu_1+\nu_2)/2}}\,\mathrm{d}f\,. \qquad (3.47)$$
Obviously, the symbol $f$, denoting the realizations of the random variable $F$, conflicts with the symbol we have introduced for systematic errors. On the other hand, we shall refer to the $F$ density only once, namely when we convert its quantiles into those of Hotelling's density.
3.4 Student’s (Gosset’s) Density
The one-dimensional normal density is based on two inaccessible parameters, namely $\mu_x$ and $\sigma_x$. This inspired W.S. Gosset to derive a new density which better suits experimental purposes [2]. Gosset's variable has the form

$$T = \frac{\bar X-\mu_x}{S_x/\sqrt n}\,, \qquad (3.48)$$

where $\bar X$ denotes the arithmetic mean and $S_x$ the empirical standard deviation of $n$ independent, normally distributed repeated measurements,

$$\bar X = \frac{1}{n}\sum_{l=1}^{n} X_l \quad\text{and}\quad S_x = \sqrt{\frac{1}{n-1}\sum_{l=1}^{n}\left(X_l-\bar X\right)^2}\,.$$

Besides (3.48), we also consider the variable

$$T' = \frac{X-\mu_x}{S_x}\,. \qquad (3.49)$$
Inserting (3.35) in (3.48) and (3.49), we find

$$T = \frac{\dfrac{\bar X-\mu_x}{\sigma_x/\sqrt n}}{\dfrac{S_x/\sqrt n}{\sigma_x/\sqrt n}} = \frac{\dfrac{\bar X-\mu_x}{\sigma_x/\sqrt n}}{\sqrt{\dfrac{\chi^2(n-1)}{n-1}}} \quad\text{and}\quad T' = \frac{\dfrac{X-\mu_x}{\sigma_x}}{\dfrac{S_x}{\sigma_x}} = \frac{\dfrac{X-\mu_x}{\sigma_x}}{\sqrt{\dfrac{\chi^2(n-1)}{n-1}}} \qquad (3.50)$$

respectively. The numerators are standardized, normally distributed random variables, while the common denominator displays the square root of a $\chi^2$ variable divided by $\nu = n-1$, the number of degrees of freedom.

The realizations of the variables $T$ and $T'$ lie within $-\infty,\dots,\infty$. The probability of finding a realization of $T$ in any interval $t,\dots,t+\mathrm{d}t$ is given by
$$p_T(t; \nu)\,\mathrm{d}t = \frac{\Gamma[(\nu+1)/2]}{\sqrt{\nu\pi}\,\Gamma(\nu/2)}\left(1+\frac{t^2}{\nu}\right)^{-(\nu+1)/2}\mathrm{d}t\,;\qquad \nu = n-1 \ge 1\,; \qquad (3.51)$$

$p_T(t;\nu)$ is called the $t$-density or Student's density. The latter designation is due to Gosset himself, who published his considerations under the pseudonym Student. The density (3.51) also applies to the variable $T'$, see Appendix G.

On the right-hand side, we may replace $\nu$ by $(n-1)$,

$$p_T(t; \nu)\,\mathrm{d}t = \frac{\Gamma(n/2)}{\sqrt{(n-1)\pi}\,\Gamma[(n-1)/2]}\left(1+\frac{t^2}{n-1}\right)^{-n/2}\mathrm{d}t\,.$$
Fig. 3.2. Student’s density for ν = 2, 9,∞
This latter form proves useful when comparing the $t$-density with its multidimensional analogue, which is Hotelling's density.

Remarkably enough, the number of degrees of freedom, $\nu = n-1$, is the only parameter of Gosset's density. Its graphic representation is broader than, but similar to, that of the normal density. In particular, it is symmetrical about zero, so that the expectation vanishes,
$$E\{T\} = \int_{-\infty}^{\infty} t\,p_T(t;\nu)\,\mathrm{d}t = 0\,. \qquad (3.52)$$

For the variance of $T$ we have

$$E\{T^2\} = \int_{-\infty}^{\infty} t^2\,p_T(t;\nu)\,\mathrm{d}t = \frac{\nu}{\nu-2}\,. \qquad (3.53)$$
The larger $\nu$, the narrower the density, but its variance will never fall below 1. Figure 3.2 compares Student's density for some values of $\nu$ with the standardized normal density; the two coincide as $\nu \to \infty$. Student's density is broader than the standardized normal density, reflecting that the former implies less information than the latter. Moreover, we realize that the greater $\nu$, the smaller this difference becomes.
The probability of finding any realization of $T$ within an interval of finite length, say, $-t_P,\dots,t_P$, is given by

$$P\{-t_P \le T \le t_P\} = \int_{-t_P}^{t_P} p_T(t;\nu)\,\mathrm{d}t\,. \qquad (3.54)$$
Let us once more revert to the meaning of what has been called well-defined measuring conditions. To this end, we consider the random variable
$$Z = b_1X_1 + b_2X_2 + \cdots + b_mX_m\,,$$

quoted in (3.39). Assuming that for each of the variables $X_i;\ E\{X_i\} = \mu_i;\ i = 1,\dots,m$ there are $n$ repeat measurements, we have

$$\bar Z = \frac{1}{n}\sum_{l=1}^{n} Z^{(l)}\,,\qquad \mu_z = E\{Z\} = \sum_{i=1}^{m} b_i\,\mu_i\,,\qquad S_z^2 = \frac{1}{n-1}\sum_{l=1}^{n}\left(Z^{(l)}-\bar Z\right)^2.$$

Now, let us define the variable

$$T(\nu) = \frac{\bar Z-\mu_z}{S_z/\sqrt n}\,. \qquad (3.55)$$
Obviously, the quantity $S_z^2$ implies the empirical covariances between the $m$ variables $X_i;\ i = 1,\dots,m$, be they dependent or not. We have

$$z^{(l)} = b_1x_1^{(l)} + b_2x_2^{(l)} + \cdots + b_mx_m^{(l)}\,,\qquad \bar z = b_1\bar x_1 + b_2\bar x_2 + \cdots + b_m\bar x_m\,,$$

so that

$$s_z^2 = \frac{1}{n-1}\sum_{l=1}^{n}\left[b_1\left(x_1^{(l)}-\bar x_1\right) + b_2\left(x_2^{(l)}-\bar x_2\right) + \cdots + b_m\left(x_m^{(l)}-\bar x_m\right)\right]^2 = b^{\mathrm T} s\,b\,. \qquad (3.56)$$

The matrix notation on the right-hand side refers to the empirical variance–covariance matrix $s$ of the input data,

$$s = \begin{pmatrix} s_{11} & s_{12} & \dots & s_{1m} \\ s_{21} & s_{22} & \dots & s_{2m} \\ \dots & \dots & \dots & \dots \\ s_{m1} & s_{m2} & \dots & s_{mm} \end{pmatrix};\qquad s_{ij} = \frac{1}{n-1}\sum_{l=1}^{n}\left(x_i^{(l)}-\bar x_i\right)\left(x_j^{(l)}-\bar x_j\right) \qquad (3.57)$$

and to an auxiliary vector $b$ of the coefficients $b_i;\ i = 1,\dots,m$,

$$b = (b_1\ b_2\ \cdots\ b_m)^{\mathrm T}\,. \qquad (3.58)$$
As mentioned above, empirical covariances are at present treated as if they were bothersome, even if the measured data appear in pairs or triples, etc. As long as it is understood that there will be no theoretical covariances, no meaning at all is assigned to their empirical counterparts. Nevertheless, in Sect. 3.3 we have realized that normally distributed measured data are embedded in a coherent complex of statistical statements whose self-consistency should not be called into question. We shall quantitatively deploy these thoughts in the following section.

After all, the question as to whether or not empirical covariances bear information reduces to the question of whether or not the model of jointly normally distributed measurement data is acknowledged. Here, empirical covariances are considered essential, as they preserve the framework of statistics and allow concise and safe statistical inferences.
Examples
Let us ask whether or not the difference between two arithmetic means, obtained from different laboratories and relating to one and the same physical quantity, might be significant [39] (Vol. II, p. 146, The problem of two means).

Let us initially summarize the traditional arguments and refer to different numbers of repeat measurements,

$$\bar x_1 = \frac{1}{n_1}\sum_{l=1}^{n_1} x_1^{(l)}\,;\qquad \bar x_2 = \frac{1}{n_2}\sum_{l=1}^{n_2} x_2^{(l)}\,.$$

Then, inevitably, the empirical covariance is to be ignored. After this, we refer to equal numbers of repeat measurements and consider the empirical covariance.
(i) For simplicity, let us disregard systematic errors; consequently we may set $\mu_i = \mu;\ i = 1, 2$. Furthermore, we shall assume equal theoretical variances, so that $\sigma_i^2 = \sigma^2;\ i = 1, 2$. Hence, the theoretical variance of the random variable $Z = \bar X_1 - \bar X_2$ is given by

$$\sigma_z^2 = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} = \sigma^2\,\frac{n_1+n_2}{n_1 n_2}\,.$$

$Z$ obviously follows an

$$N\!\left(0,\ \sigma\sqrt{\frac{n_1+n_2}{n_1 n_2}}\right)$$

density. The associated standardized variable is

$$\frac{\bar X_1-\bar X_2}{\sigma\sqrt{\dfrac{n_1+n_2}{n_1 n_2}}}\,.$$
According to (3.50) we tentatively put [38]

$$T = \frac{\dfrac{\bar X_1-\bar X_2}{\sigma\sqrt{(n_1+n_2)/(n_1 n_2)}}}{\sqrt{\dfrac{\chi^2(\nu^*)}{\nu^*}}}\,, \qquad (3.59)$$

attributing degrees of freedom

$$\nu^* = n_1 + n_2 - 2$$

to the sum of the two $\chi^2$-s,

$$\chi^2(\nu^*) = \frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{\sigma^2}\,.$$
After all, we consider the random variable

$$T(\nu^*) = \frac{\dfrac{\bar X_1-\bar X_2}{\sigma\sqrt{(n_1+n_2)/(n_1 n_2)}}}{\sqrt{\dfrac{(n_1-1)S_1^2+(n_2-1)S_2^2}{\sigma^2}}}\;\sqrt{n_1+n_2-2} \qquad (3.60)$$

to be $t$-distributed.

In contrast to this, from (3.55), which is based on $n_1 = n_2 = n$, we simply deduce

$$T(\nu) = \frac{\bar X_1-\bar X_2}{\sqrt{S_1^2 - 2S_{12} + S_2^2}}\,\sqrt n\,. \qquad (3.61)$$

Clearly, this result is also exact. To compare (3.60) with (3.61), we put $n_1 = n_2$, so that (3.60) changes into

$$T(\nu^*) = \frac{\bar X_1-\bar X_2}{\sqrt{S_1^2 + S_2^2}}\,\sqrt n\,. \qquad (3.62)$$
While this latter $T$ lacks the empirical covariance $S_{12}$ and refers to degrees of freedom $\nu^* = 2(n-1)$, the $T$ defined in (3.61) relates to degrees of freedom $\nu = n-1$, where, of course, we have to consider $t_P(2\nu) < t_P(\nu)$. Remarkably enough, both quantities possess the same statistical properties, namely

$$|\bar X_1-\bar X_2| \le t_P(2\nu)\,\sqrt{S_1^2+S_2^2}\,\big/\sqrt n$$

and

$$|\bar X_1-\bar X_2| \le t_P(\nu)\,\sqrt{S_1^2-2S_{12}+S_2^2}\,\big/\sqrt n$$
respectively.

(ii) To be more general, we now admit different expectations, $\mu_1 \neq \mu_2$, and different theoretical variances, $\sigma_1^2 \neq \sigma_2^2$, briefly touching the so-called Fisher–Behrens problem. A solution akin to that presented above does not seem to exist.

Traditionally, the Fisher–Behrens problem refers to $n_1 \neq n_2$. Its implications are extensively discussed in [39] and will therefore not be dealt with here.³
B.L. Welch [7] has handled the Fisher–Behrens problem by means of a random variable of the kind

$$\omega = \frac{(\bar X_1-\bar X_2) - (\mu_1-\mu_2)}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}\,,\qquad n_1 \neq n_2\,.$$

This quantity is approximately $t$-distributed if related to a so-called effective number of degrees of freedom

$$\nu_{\mathrm{eff}} = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\left(s_1^2/n_1\right)^2\!/(n_1+1) + \left(s_2^2/n_2\right)^2\!/(n_2+1)} - 2\,,$$

i.e. $\omega(\nu_{\mathrm{eff}})$ leads us back to Student's statistics. Let us recall that both Fisher and Welch considered $n_1 \neq n_2$ a reasonable premise.

³ We wish to mention that, from a metrological point of view, the mere notation $\mu_1 \neq \mu_2$ conceals the underlying physical situation. The two means $\bar x_1$ and $\bar x_2$ may have been determined by means of two different measuring devices, likewise aiming at one and the same physical object, or, alternatively, there might have been two different physical objects and one measuring device, or even two different devices.
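The effective number of degrees of freedom is easily evaluated; the helper below implements the formula exactly as quoted above (note the $n_i+1$ denominators and the final $-2$ of this variant), with made-up variances and sample sizes as illustrative input:

```python
def nu_eff(s1_sq, n1, s2_sq, n2):
    """Effective degrees of freedom for Welch's omega, as quoted in the text."""
    u1 = s1_sq / n1
    u2 = s2_sq / n2
    return (u1 + u2) ** 2 / (u1 ** 2 / (n1 + 1) + u2 ** 2 / (n2 + 1)) - 2.0

# illustrative (hypothetical) empirical variances and sample sizes
v = nu_eff(1.2, 8, 2.5, 15)
```

For equal variances and equal sample sizes $n$ the formula yields $2n$, slightly above the pooled value $\nu^* = 2(n-1)$ of case (i).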
Presupposing $n_1 = n_2 = n$, our approach doubtlessly yields an exact

$$T = \frac{(\bar X_1-\bar X_2) - (\mu_1-\mu_2)}{\sqrt{S_1^2 - 2S_{12} + S_2^2}}\,\sqrt n\,. \qquad (3.63)$$

After all, when sticking to well-defined measuring conditions, bringing the empirical covariance $S_{12}$ to bear, the Fisher–Behrens problem is reduced to an ill-defined issue – judged from the kind of statistical reasoning discussed here.
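Since $S_1^2 - 2S_{12} + S_2^2$ is precisely the empirical variance of the pairwise differences $X_1^{(l)} - X_2^{(l)}$, the statistic (3.63) amounts to the ordinary one-sample Student statistic applied to those differences. A brief numerical sketch with simulated (hypothetical) data confirms the identity:

```python
import math
import numpy as np

rng = np.random.default_rng(7)
n = 12
a = rng.normal(5.0, 1.0, n)     # series 1
b = rng.normal(5.0, 1.2, n)     # series 2, simulated independently of series 1

s1_sq = float(a.var(ddof=1))
s2_sq = float(b.var(ddof=1))
s12 = float(np.cov(a, b, ddof=1)[0, 1])

var_combined = s1_sq - 2.0 * s12 + s2_sq   # denominator of (3.63), squared
var_diff = float((a - b).var(ddof=1))      # empirical variance of the differences

# (3.63) with mu_1 - mu_2 = 0 in this simulated setting
T = (a.mean() - b.mean()) / math.sqrt(var_combined) * math.sqrt(n)
```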
Hotelling [5] has developed a multidimensional generalization of the one-dimensional Student's density. Hotelling's density refers to, say, $m$ arithmetic means, each of which implies the same number $n$ of repeat measurements. Again, this latter condition leads us to consider the empirical covariances of the $m$ variables.

Traditionally, Hotelling's variable is designated by $T$ and its realizations, correspondingly, by $t$. Nevertheless, the respective relationship will disclose by itself whether Student's or Hotelling's variable is meant: the former possesses one argument, the latter two:

$$t(\nu)\ \ \text{Student's variable}\,,\qquad t(m,n)\ \ \text{Hotelling's variable}\,.$$
3.5 Fisher’s Density
We consider two measurands $x$ and $y$, assign normally distributed random variables $X$ and $Y$ to them and assume well-defined measuring conditions, i.e. equal numbers of repeat measurements

$$x_1, x_2, \dots, x_n\,;\qquad y_1, y_2, \dots, y_n\,.$$

Then, Fisher's density specifies the joint statistical behavior of the five random variables

$$\bar x\,,\ \bar y\,,\ s_x^2\,,\ s_{xy}\,,\ s_y^2\,,$$
the two arithmetic means, the two empirical variances and the empirical covariance. Fisher's density implies a metrologically important statement with respect to the empirical covariance of uncorrelated series of measurements. The latter is, as a rule, classified as unimportant if its expectation vanishes. As we shall see, the empirical covariance will nevertheless have a meaning which, ultimately, is rooted in the structure of Fisher's density.

Nonetheless, the present situation is this: given that several measurement results obtained by different laboratories are to be linked with one another, empirical covariances are simply disregarded. On the one hand, this seems to be a matter of habit; on the other hand, the original measurement data might not have been preserved, and the experimenters may have carried out different numbers of repeat measurements, which would a priori rule out the possibility of considering empirical covariances.
In the following we refer to the notation and definitions of Sect. 3.2. Fisher's density

$$p(\bar x, \bar y, s_x^2, s_{xy}, s_y^2) = p_1(\bar x, \bar y)\,p_2(s_x^2, s_{xy}, s_y^2) \qquad (3.64)$$

factorizes into the density $p_1$ of the two arithmetic means and the density $p_2$ of the empirical moments of second order. Remarkably enough, the latter,

$$p_2(s_x^2, s_{xy}, s_y^2) = \frac{(n-1)^{n-1}}{4\pi\,\Gamma(n-2)\,|\sigma|^{(n-1)/2}}\left[s_x^2 s_y^2 - s_{xy}^2\right]^{(n-4)/2}\exp\!\left(-\frac{n-1}{2|\sigma|}\,h(s_x^2, s_{xy}, s_y^2)\right);$$
$$h(s_x^2, s_{xy}, s_y^2) = \sigma_y^2 s_x^2 - 2\sigma_{xy}s_{xy} + \sigma_x^2 s_y^2\,, \qquad (3.65)$$
does not factorize even if $\sigma_{xy} = 0$. From this we deduce that the model of jointly normally distributed random variables considers the empirical covariance $s_{xy}$ a statistically indispensable quantity. Finally, when propagating random errors taking empirical covariances into account, we shall be placed in a position to define confidence intervals according to Student, irrespective of whether or not the series of measurements are dependent – the formalism (3.55) will always be correct.

In the same way, Hotelling's density, see Sect. 3.6, will provide experimentally meaningful confidence ellipsoids.
In (3.65), we may substitute the empirical correlation coefficient $r_{xy} = s_{xy}/(s_x s_y)$ for the empirical covariance $s_{xy}$, so that

$$p_2(s_x^2, r_{xy}, s_y^2) = \frac{(n-1)^{n-1}}{4\pi\,\Gamma(n-2)\,|\sigma|^{(n-1)/2}}\,s_x^{n-3} s_y^{n-3}\left[1-r_{xy}^2\right]^{(n-4)/2}\exp\!\left(-\frac{n-1}{2|\sigma|}\,h(s_x^2, r_{xy}, s_y^2)\right);$$
$$h(s_x^2, r_{xy}, s_y^2) = \sigma_y^2 s_x^2 - 2\sigma_{xy}r_{xy}s_x s_y + \sigma_x^2 s_y^2\,. \qquad (3.66)$$
If $\sigma_{xy}$ vanishes, the empirical correlation coefficient $r_{xy}$ and the two empirical variances $s_x^2, s_y^2$ obviously turn out to be statistically independent. This situation is, however, quite different from that in which we considered the empirical covariance and the empirical variances. Meanwhile, the error calculus relates to density (3.65) and not to density (3.66). We conclude:

Within the framework of normally distributed data, the density of the empirical moments of second order requires us to consider empirical covariances, irrespective of whether or not the data series to be concatenated are dependent.
Apparently, we are now confronted with another problem. Independent series of measurements allow the pairing of the measurement data to be altered. For instance, the sequences $(x_1, y_1); (x_2, y_2); \dots$ and $(x_1, y_2); (x_2, y_1); \dots$ would be equally valid. While such permutations in no way affect the empirical variances, they do, however, affect the empirical covariance (3.13). In general, each new pairing of data yields a numerically different empirical covariance, implying, of course, varying lengths of the pertaining confidence intervals to be deduced from (3.55). This might appear confusing; nevertheless, the distribution densities (3.51) and (3.65) willingly accept any such empirical covariance, be it the direct result of measurements or due to permutations in data pairing. In particular, each covariance is equally well covered by the probability statement resulting from (3.51).
On this note, we could even purposefully manipulate the pairing of data in order to arrive at a particular, notably advantageous empirical covariance which, e.g., minimizes the length of a confidence interval.

Should, however, a dependence between the series of measurements exist, the experimentally established pairing of data is to be considered statistically significant and no permutations are admissible, i.e. subsequent permutations abolishing the original pairing are to be considered strictly improper.
Wishart generalized Fisher's density to more than two random variables [38]. The considerations presented above apply analogously.
Let us return to the problem of two means, $\bar X_1$ and $\bar X_2$, see Sect. 3.4. Assuming $\sigma_1^2 = \sigma_2^2 = \sigma^2$, we considered the difference $Z = \bar X_1 - \bar X_2$. Obviously, the variable $Z$ relates to a two-dimensional probability density. Even if $S_1^2$ and $S_2^2$ refer to independent measurements, the model (3.65) formally asks us to complete the empirical variances through the empirical covariance $S_{12}$, as the empirical moments of second order appear to be statistically dependent. In doing so, statistics tells us that any empirical covariance producible out of the input data is likewise admitted.
3.6 Hotelling’s Density
Hotelling's density is the multidimensional analogue of Student's density and presupposes the existence of the density (3.65) of the empirical moments of second order. To represent Hotelling's density, we refer to (3.25). As has been stated, the exponent

$$\bar q = (\bar\zeta-\mu)^{\mathrm T}\bar\sigma^{-1}(\bar\zeta-\mu) = \frac{1}{|\bar\sigma|}\left[\sigma_{\bar y\bar y}(\bar x-\mu_x)^2 - 2\sigma_{\bar x\bar y}(\bar x-\mu_x)(\bar y-\mu_y) + \sigma_{\bar x\bar x}(\bar y-\mu_y)^2\right] \qquad (3.67)$$

is $\chi^2(2)$-distributed. Insertion of

$$\bar\sigma = \sigma/n\,,\qquad \bar\sigma^{-1} = n\sigma^{-1}\,,\qquad |\bar\sigma| = |\sigma|/n^2 \qquad (3.68)$$

has led to

$$\bar q = n(\bar\zeta-\mu)^{\mathrm T}\sigma^{-1}(\bar\zeta-\mu) = \frac{n}{|\sigma|}\left[\sigma_{yy}(\bar x-\mu_x)^2 - 2\sigma_{xy}(\bar x-\mu_x)(\bar y-\mu_y) + \sigma_{xx}(\bar y-\mu_y)^2\right]. \qquad (3.69)$$
Hotelling substituted the empirical variance–covariance matrix

$$s = \begin{pmatrix} s_{xx} & s_{xy} \\ s_{yx} & s_{yy} \end{pmatrix} \qquad (3.70)$$

for the unknown, inaccessible theoretical variance–covariance matrix $\sigma$, so that $\bar q$ formally turned into

$$t^2(2,n) = n(\bar\zeta-\mu)^{\mathrm T}s^{-1}(\bar\zeta-\mu) = \frac{n}{|s|}\left[s_{yy}(\bar x-\mu_x)^2 - 2s_{xy}(\bar x-\mu_x)(\bar y-\mu_y) + s_{xx}(\bar y-\mu_y)^2\right]. \qquad (3.71)$$

Finally, Hotelling confined the $s_{xx}, s_{xy}, s_{yy}$ combinations to the inequality

$$-s_x s_y < s_{xy} < s_x s_y\,;\qquad \left(s_x^2 \equiv s_{xx}\,,\ s_y^2 \equiv s_{yy}\right),$$

thus rendering the empirical variance–covariance matrix $s$ positive definite. Likewise, $\chi^2(2) = \text{const.}$ and $t^2(2,n) = \text{const.}$ depict ellipses, but only the latter, Hotelling's ellipses, are free from unknown theoretical parameters, i.e. they do not rely on the theoretical variances and the theoretical covariance. This is a first remarkable property of Hotelling's density. Another property is that $t^2(2,n)$ equally applies to dependent and independent series of measurements. Consequently, the experimenter may refer to $t^2(2,n)$ without having to know or discover whether or not there are dependencies between the underlying series of measurements. With respect to $m$ measurands, we alter the nomenclature, substituting $x_1, x_2, \dots, x_m$ for $x, y$, so that a vector
$$(\bar x-\mu) = (\bar x_1-\mu_1\ \ \bar x_2-\mu_2\ \cdots\ \bar x_m-\mu_m)^{\mathrm T} \qquad (3.72)$$

is obtained in place of $\bar\zeta-\mu = (\bar x-\mu_x\ \ \bar y-\mu_y)^{\mathrm T}$. Thus

$$t^2(m,n) = n(\bar x-\mu)^{\mathrm T}s^{-1}(\bar x-\mu) \qquad (3.73)$$

is the multidimensional analogue of (3.71), where

$$s = \begin{pmatrix} s_{11} & s_{12} & \dots & s_{1m} \\ s_{21} & s_{22} & \dots & s_{2m} \\ \dots & \dots & \dots & \dots \\ s_{m1} & s_{m2} & \dots & s_{mm} \end{pmatrix};\qquad s_{ij} = \frac{1}{n-1}\sum_{l=1}^{n}(x_{il}-\bar x_i)(x_{jl}-\bar x_j)\,;\quad i,j = 1,\dots,m \qquad (3.74)$$

denotes the empirical variance–covariance matrix of the measured data. Again, we shall assume $s$ to be positive definite [38].
Formally, we again assign random variables $\bar X_1, \bar X_2, \dots, \bar X_m$ to the $m$ means $\bar x_1, \bar x_2, \dots, \bar x_m$. In like manner, we understand the empirical variance–covariance matrix $s$, established through the measurements, to be a particular realization of a random matrix $S$. Hotelling [5] has shown that, free from theoretical variances and covariances, a density of the form

$$p_T(t; m, n) = \frac{2\,\Gamma(n/2)}{(n-1)^{m/2}\,\Gamma[(n-m)/2]\,\Gamma(m/2)}\;\frac{t^{m-1}}{\left[1+t^2/(n-1)\right]^{n/2}}\,;\qquad t > 0\,,\ n > m \qquad (3.75)$$

specifies the probability of finding a realization of the random variable

$$T(m,n) = \sqrt{n(\bar X-\mu)^{\mathrm T}S^{-1}(\bar X-\mu)} \qquad (3.76)$$

in any interval $t,\dots,t+\mathrm{d}t$. Accordingly, the probability of the event $T \le t_P(m,n)$ is given by

$$P\{T \le t_P(m,n)\} = \int_0^{t_P(m,n)} p_T(t; m, n)\,\mathrm{d}t\,. \qquad (3.77)$$
Figure 3.3 illustrates $p_T(t; m, n)$ for $m = 2$ and $n = 10$. If $m = 1$, (3.75) changes into

$$p_T(t; 1, n) = \frac{2\,\Gamma(n/2)}{(n-1)^{1/2}\,\Gamma[(n-1)/2]\,\Gamma(1/2)}\left(1+\frac{t^2}{n-1}\right)^{-n/2},$$

which may be compared with (3.51). Obviously, for $m = 1$, Hotelling's density is symmetric about $t = 0$ and, apart from a factor of 2 in the numerator, identical with Student's density. The latter may be removed by extending the domain of $p_T(t; 1, n)$ from $0,\dots,+\infty$ to $-\infty,\dots,+\infty$, which is, however, admissible only if $m = 1$.
Fig. 3.3. Hotelling’s density; m = 2; n = 10
In order to practically utilize (3.76) and (3.77), we proceed just as we did when discussing Student's density and wished to localize the unknown parameter $\mu$. Following (3.76), we center Hotelling's ellipsoid $t^2(m,n) = \text{const.}$ in $\bar x$ and formally substitute an auxiliary vector $x$ for the unknown vector $\mu$. The so-defined confidence ellipsoid

$$t_P^2(m,n) = n(x-\bar x)^{\mathrm T}s^{-1}(x-\bar x) \qquad (3.78)$$

localizes the point defined by the unknown vector $\mu$ with probability $P\{T \le t_P(m,n)\}$ as given in (3.77).
This latter interpretation obviously reveals the benefit of our considerations. As is known, the conventional error calculus gets confidence ellipsoids from the exponent of the multidimensional normal probability density, and these, unfortunately, imply unknown theoretical variances and covariances. In contrast to this, Hotelling's ellipsoid only needs empirical variances and covariances. Let us again consider the ellipse defined in (3.71). To simplify the notation, we set

$$\alpha = s_{yy}\,,\qquad \beta = -s_{xy}\,,\qquad \gamma = s_{xx}\,,\qquad \delta = \frac{|s|\,t^2(2,n)}{n}\,,$$

so that

$$\alpha(\bar x-\mu_x)^2 + 2\beta(\bar x-\mu_x)(\bar y-\mu_y) + \gamma(\bar y-\mu_y)^2 = \delta\,.$$

From the derivative

$$y' = -\frac{\alpha(\bar x-\mu_x) + \beta(\bar y-\mu_y)}{\beta(\bar x-\mu_x) + \gamma(\bar y-\mu_y)}$$

we deduce

$$|\bar x-\mu_x|_{\max} = \frac{t_P(2,n)}{\sqrt n}\sqrt{s_{xx}} \quad\text{and}\quad |\bar y-\mu_y|_{\max} = \frac{t_P(2,n)}{\sqrt n}\sqrt{s_{yy}}\,.$$
Fig. 3.4. Hotelling’s ellipse
Figure 3.4 makes clear that Hotelling's ellipse is included in a rectangle whose side lengths are independent of the empirical covariance $s_{xy}$. In the case of independent series of measurements, this means: while any permutations within the pairings

$$x_1, y_1\,;\ x_2, y_2\,;\ \dots\,;\ x_n, y_n$$

do not affect the empirical variances, the empirical covariance may, however, assume new values, thus rotating Hotelling's ellipse within the aforesaid fixed rectangle. All positions of the ellipse are equally admissible. Up to now, degenerations had been formally excluded. However, admitting them for a moment, we are in a position to state:

The inequality

$$-s_x s_y \le s_{xy} \le s_x s_y$$

(which here includes degenerations) reveals that, given that the two series of measurements are independent, all $(\bar x-\mu_x)$, $(\bar y-\mu_y)$ pairs within the rectangle are tolerable.
This interpretation proves invertible, i.e. in terms of (3.78) we may just as well state: the point $(\mu_x, \mu_y)$ defining the unknown vector $\mu$ may lie anywhere within a rectangle, centered in $(\bar x, \bar y)$, and having the side lengths

$$2\,\frac{t_P(2,n)}{\sqrt n}\sqrt{s_{xx}} \quad\text{and}\quad 2\,\frac{t_P(2,n)}{\sqrt n}\sqrt{s_{yy}}$$

respectively. We stress that, given that the series of measurements are dependent, we are not allowed to perform any interchanges within the jointly measured data pairs.
Apparently, a table of quantiles $t_P(m,n)$ of order $P(t_P(m,n))$ according to (3.77) does not exist. Meanwhile, we can use the tabulated quantiles of the $F$ density, as there is a relationship between the latter and Hotelling's density.⁴

Fig. 3.5. Transformation of Hotelling's density
We ask: given a random variable with the $F$ density (3.47),

$$p_F(f; \nu_1, \nu_2)\,\mathrm{d}f = \left[\frac{\nu_1}{\nu_2}\right]^{\nu_1/2}\frac{\Gamma[(\nu_1+\nu_2)/2]}{\Gamma(\nu_1/2)\,\Gamma(\nu_2/2)}\;\frac{f^{\nu_1/2-1}}{\left[1+(\nu_1/\nu_2)f\right]^{(\nu_1+\nu_2)/2}}\,\mathrm{d}f\,,$$

what is the density $p_T(t)$ of the variable

$$t = \sqrt{\frac{\nu_1}{\nu_2}\,(n-1)\,f}\;? \qquad (3.79)$$

With a view to Fig. 3.5 we find [43]

$$p_T(t) = p_F\!\left[\frac{\nu_2}{\nu_1(n-1)}\,t^2\right]\left|\frac{\mathrm{d}f}{\mathrm{d}t}\right|. \qquad (3.80)$$
Setting

$$\nu_1 = m\,;\qquad \nu_2 = n-m\,;\qquad \nu_1+\nu_2 = n\,, \qquad (3.81)$$

we easily reconstruct (3.75). Then, from the quantiles of the $F$ density, we infer via

$$P = \int_0^{f} p_F(f'; \nu_1, \nu_2)\,\mathrm{d}f' = \int_0^{t} p_T(t'; m, n)\,\mathrm{d}t' \qquad (3.82)$$

the quantiles $t$ of Hotelling's density.
⁴ As an exception, here we deviate from the nomenclature agreed on, according to which $f$ denotes a systematic error.
Example
Calculate Hotelling's quantile of order $P=95\%$, given $m=1$ and $n=11$. As $\nu_1=1$ and $\nu_2=10$, we have
$$0.95=\int_0^{f_{0.95}} p_F(f;1,10)\,df\,.$$
From the table of the F density we infer the quantile $f_{0.95}=4.96$, so that from (3.79) we obtain Hotelling's quantile $t_{0.95}=2.2$. As expected, this value indeed agrees with Student's quantile, since $m=1$.
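The conversion (3.79) from an F quantile to Hotelling's quantile is easy to automate. The following Python sketch reproduces the example; the tabulated quantile $f_{0.95}(1,10)=4.96$ is taken from the text, while the helper name is ours.

```python
import math

def hotelling_quantile(f_quantile, m, n):
    """Convert an F quantile with nu1 = m, nu2 = n - m degrees of freedom
    into Hotelling's quantile via t = sqrt((nu1/nu2)(n-1) f), i.e. (3.79)."""
    nu1, nu2 = m, n - m
    return math.sqrt(nu1 / nu2 * (n - 1) * f_quantile)

# Example from the text: P = 95%, m = 1, n = 11, tabulated f_0.95 = 4.96
t_095 = hotelling_quantile(4.96, m=1, n=11)
print(round(t_095, 2))  # 2.23, i.e. t_0.95 = 2.2 after the book's rounding
```

For $m=1$ the transformation degenerates to $t=\sqrt f$, which is why the result coincides with Student's two-sided quantile $t_{0.95}(10)\approx 2.23$.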
4 Estimators and Their Expectations
4.1 Statistical Ensembles
Expectations, as theoretical concepts, emerge from integrals and distribution densities. Their experimental counterparts are so-called estimators. The arithmetic mean and the empirical variance
$$\bar x=\frac1n\sum_{l=1}^n x_l\,,\qquad s_x^2=\frac1{n-1}\sum_{l=1}^n (x_l-\bar x)^2\,, \qquad (4.1)$$
for example, are estimators of the expectations
$$\mu_x=E\{X\}=\int_{-\infty}^{\infty} x\,p_X(x)\,dx \qquad (4.2)$$
and
$$\sigma_x^2=E\left\{(X-\mu_x)^2\right\}=\int_{-\infty}^{\infty}(x-\mu_x)^2\,p_X(x)\,dx\,, \qquad (4.3)$$
where $p_X(x)$ denotes the distribution density of the random variable $X$. While (4.2) is quite obvious, statement (4.3) needs some explanation, as the integrand holds a non-linear function of $x$, namely $\phi(x)=(x-\mu_x)^2$, which is simply averaged by the density $p_X(x)$ of $X$.
Let $X$ be a random variable with realizations $x$, and $u=\phi(x)$ a given function. Then, as may be shown, see e.g. [43], the expectation of the random variable $U=\phi(X)$ is given by
$$E\{U\}=\int_{-\infty}^{\infty} u\,p_U(u)\,du=\int_{-\infty}^{\infty}\phi(x)\,p_X(x)\,dx\,, \qquad (4.4)$$
where $p_U(u)$ denotes the density of the variable $U$ and $p_X(x)$ that of the variable $X$. Analogously, considering functions of several variables, e.g. $V=\phi(X,Y)$, we may set
$$E\{V\}=\int_{-\infty}^{\infty} v\,p_V(v)\,dv=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\phi(x,y)\,p_{XY}(x,y)\,dx\,dy\,. \qquad (4.5)$$
Again, $p_V(v)$ constitutes the density of the random variable $V$ and $p_{XY}(x,y)$ the joint density of the random variables $X$ and $Y$. Thus, (4.4) and (4.5) save us the trouble of developing the densities $p_U(u)$ and $p_V(v)$ when calculating the respective expectations.
In the following, we shall address the expectations of some frequently used empirical estimators. For this, we purposefully rely on a (conceived, fictitious) statistical ensemble of infinitely many identical measuring devices which basically have similar statistical properties and, exactly, one and the same unknown systematic error, Fig. 4.1. From each device, we bring in the same number $n$ of repeat measurements. In a formal sense, the collectivity of the random variables $X^{(k)};\ k=1,2,3,\ldots$
$$X^{(k)}\ \Leftrightarrow\ x_1^{(k)},\,x_2^{(k)},\,\ldots,\,x_n^{(k)}\,;\qquad k=1,2,3,\ldots \qquad (4.6)$$
establishes a statistical ensemble enabling us to calculate expectations.
A measuring device which is used over a period of many years may be regarded as approximating an ensemble of the above kind – if only its inherent systematic errors remain constant in time. As has been pointed out, experimenters are pleased to manage with less, namely with systematic errors remaining constant during the time it takes to perform $n$ repeat measurements, so that the data remain free of drifts. Moreover, as variations of systematic errors cannot be excluded during prolonged periods of time, their changes should stay within the error bounds conclusively assigned to them from the outset.
Let us conceptualize these ideas. If, as presumed, all measuring devices have the same unknown systematic error $f=\text{const.}$, the data are gathered under conditions of reproducibility. On the other hand, each of the experimental set-ups might also be affected by a specific unknown systematic error. Then the data would be recorded under conditions of comparability. The distinction between reproducibility and comparability induces us to stress once more that reproducibility aims at quite a different thing than accuracy. We shall encounter conditions of comparability in Sect. 12.2 when treating so-called key comparisons.
Obviously, apart from (4.6), the ensemble also implies random variables of the kind $X_l;\ l=1,2,\ldots,n$
$$X_l\ \Leftrightarrow\ x_l^{(1)},\,x_l^{(2)},\,\ldots,\,x_l^{(k)},\,\ldots\,;\qquad l=1,2,\ldots,n\,. \qquad (4.7)$$
Formally, we now represent the arithmetic mean
$$\bar x=\frac1n\,(x_1+x_2+\cdots+x_n)$$
Fig. 4.1. Statistical ensemble of series of repeated measurements, each series comprising $n$ measurements and being affected by one and the same systematic error $f=\text{const.}$
by means of the $n$ realizations of the random variable $X^{(k)}$ as defined in (4.6). For any $k$ we have
$$\bar x^{(k)}=\frac1n\left(x_1^{(k)}+x_2^{(k)}+\cdots+x_n^{(k)}\right).$$
Certainly, with respect to (4.7), we may just as well introduce the sum variable
$$\bar X=\frac1n\,(X_1+X_2+\cdots+X_n)\,. \qquad (4.8)$$
As assumed, the variables $X_1,X_2,\ldots,X_n$ are independent. We are now in a position to calculate the expectation of the arithmetic mean. Let $p_{\bar X}(\bar x)$ designate the density of the random variable $\bar X$ so that
$$E\{\bar X\}=\int_{-\infty}^{\infty}\bar x\,p_{\bar X}(\bar x)\,d\bar x\,.$$
Referring to (4.5) and (4.8), we find
$$E\{\bar X\}=\frac1n\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}(x_1+x_2+\cdots+x_n)\,p_{X_1X_2\ldots X_n}(x_1,x_2,\ldots,x_n)\,dx_1\,dx_2\cdots dx_n\,.$$
In each case, integrating out the "undesired variables" yields 1,
$$E\{\bar X\}=\frac1n\sum_{l=1}^n\int_{-\infty}^{\infty} x_l\,p_{X_l}(x_l)\,dx_l=\mu_x\,. \qquad (4.9)$$
After all, the expectation of $\bar X$ presents itself as the mean of the expectations of the $n$ random variables $X_1,X_2,\ldots,X_n$, each with the same value $\mu_x$.
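The ensemble of Fig. 4.1 can be mimicked numerically. In the sketch below (all numbers are illustrative, not taken from the text), many fictitious devices share the true value $x_0$, the dispersion $\sigma$, and one and the same systematic error $f$; averaging the device means approximates $E\{\bar X\}=\mu_x=x_0+f$.

```python
import random

random.seed(1)
x0, sigma, f, n = 10.0, 0.5, 0.3, 20   # illustrative values, not from the text

def device_mean():
    """One fictitious device: n repeat measurements with normal random
    errors plus the common constant systematic error f."""
    return sum(x0 + random.gauss(0.0, sigma) + f for _ in range(n)) / n

means = [device_mean() for _ in range(20000)]
grand = sum(means) / len(means)
print(round(grand, 2))  # close to mu_x = x0 + f = 10.3
```

The random errors average out over the ensemble; the bias $f$ does not, which is exactly the point of (4.9) combined with the constant systematic error.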
4.2 Variances and Covariances
As we recall, the concatenation of the $x_l$ in $\bar x$ is linear by definition. The calculation of expectations proves to be more complex if the estimators are of a quadratic nature. Letting $\mu_x$ denote the expected value $E\{\bar X\}$ of the arithmetic mean $\bar X$, we consider the empirical variance as defined through
$$s_x^2=\frac1n\sum_{l=1}^n (x_l-\mu_x)^2\,. \qquad (4.10)$$
Replacing the quantities $s_x^2$ and $x_l$ with the random variables $S_x^2$ and $X_l$ respectively turns (4.10) into
$$S_x^2=\frac1n\sum_{l=1}^n (X_l-\mu_x)^2\,.$$
The expectation follows from
$$E\{S_x^2\}=\frac1n\,E\left\{\sum_{l=1}^n (X_l-\mu_x)^2\right\}=\frac1n\sum_{l=1}^n E\left\{(X_l-\mu_x)^2\right\}.$$
But as
$$E\left\{(X_l-\mu_x)^2\right\}=\int_{-\infty}^{\infty}(x_l-\mu_x)^2\,p_{X_l}(x_l)\,dx_l=\sigma_x^2\,,$$
we find
$$E\{S_x^2\}=\sigma_x^2\,. \qquad (4.11)$$
Next we consider the scattering of $\bar X$ with respect to $\mu_x$. Each of the devices sketched in Fig. 4.1 produces a specific mean $\bar x$. What can we say about the mean-square scattering of $\bar X$ relative to the parameter $\mu_x$? We have
$$E\left\{(\bar X-\mu_x)^2\right\}=E\left\{\left(\frac1n\sum_{l=1}^n X_l-\mu_x\right)^{\!2}\right\}=\frac1{n^2}\,E\left\{\left(\sum_{l=1}^n (X_l-\mu_x)\right)^{\!2}\right\} \qquad (4.12)$$
$$=\frac1{n^2}\Bigl[E\{(X_1-\mu_x)^2\}+2E\{(X_1-\mu_x)(X_2-\mu_x)\}+\cdots+2E\{(X_{n-1}-\mu_x)(X_n-\mu_x)\}+E\{(X_n-\mu_x)^2\}\Bigr].$$
As $X_1$ and $X_2$ are independent, the first covariance yields
$$E\{(X_1-\mu_x)(X_2-\mu_x)\}=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}(x_1-\mu_x)(x_2-\mu_x)\,p_{X_1X_2}(x_1,x_2)\,dx_1\,dx_2$$
$$=\int_{-\infty}^{\infty}(x_1-\mu_x)\,p_{X_1}(x_1)\,dx_1\int_{-\infty}^{\infty}(x_2-\mu_x)\,p_{X_2}(x_2)\,dx_2=0\,,$$
and the same applies to the other covariances. As each of the quadratic terms yields $\sigma_x^2$, we arrive at
$$E\left\{(\bar X-\mu_x)^2\right\}=\frac{\sigma_x^2}{n}\equiv\sigma_{\bar x}^2\,.$$
The empirical variance
$$s_x^2=\frac1{n-1}\sum_{l=1}^n (x_l-\bar x)^2 \qquad (4.13)$$
differs from that defined in (4.10) insofar as reference is taken here to $\bar x$, and $1/(n-1)$ has been substituted for $1/n$. Rewriting (4.13) in random variables and formalizing the expectation yields
$$E\{S_x^2\}=\frac1{n-1}\,E\left\{\sum_{l=1}^n (X_l-\bar X)^2\right\}.$$
Due to
$$\sum_{l=1}^n (X_l-\bar X)^2=\sum_{l=1}^n\left[(X_l-\mu_x)-(\bar X-\mu_x)\right]^2$$
$$=\sum_{l=1}^n (X_l-\mu_x)^2-2(\bar X-\mu_x)\sum_{l=1}^n (X_l-\mu_x)+n(\bar X-\mu_x)^2$$
$$=\sum_{l=1}^n (X_l-\mu_x)^2-n(\bar X-\mu_x)^2$$
we get
$$E\{S_x^2\}=\frac1{n-1}\,E\left\{\sum_{l=1}^n (X_l-\mu_x)^2-n(\bar X-\mu_x)^2\right\}=\frac1{n-1}\left(n\sigma_x^2-n\,\frac{\sigma_x^2}{n}\right)=\sigma_x^2\,. \qquad (4.14)$$
Here again the independence of the measured values $x_l;\ l=1,2,\ldots,n$ proves to be essential. Obviously, estimators (4.10) and (4.13) feature the same expected value, but only (4.13) can be realized experimentally.
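That only the $1/(n-1)$ version (4.13) is unbiased once $\mu_x$ is replaced by $\bar x$ can be checked by simulation. The sketch below, with illustrative parameters of our own choosing, averages both estimators over many samples.

```python
import random

random.seed(2)
mu, sigma, n, trials = 0.0, 1.0, 5, 40000   # illustrative; sigma^2 = 1

def sample_vars():
    """Return the 1/n and 1/(n-1) variance estimators, both around xbar."""
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)
    return ss / n, ss / (n - 1)

biased, unbiased = 0.0, 0.0
for _ in range(trials):
    b, u = sample_vars()
    biased += b / trials
    unbiased += u / trials

print(round(biased, 2), round(unbiased, 2))  # about (n-1)/n = 0.8 and 1.0
```

The $1/n$ form systematically underestimates $\sigma^2$ by the factor $(n-1)/n$, which is precisely what the identity preceding (4.14) quantifies.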
When calculating $E\{(\bar X-\mu_x)^2\}$, we have met the theoretical covariance of the random variables $X_1$ and $X_2$. Evidently, this definition applies to any two random variables $X$ and $Y$ with expectations $\mu_x$ and $\mu_y$. In general, we have
$$E\{(X-\mu_x)(Y-\mu_y)\}=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}(x-\mu_x)(y-\mu_y)\,p_{XY}(x,y)\,dx\,dy=\sigma_{xy}\,. \qquad (4.15)$$
Given $X$ and $Y$ are independent, $p_{XY}(x,y)=p_X(x)\,p_Y(y)$ and $\sigma_{xy}=0$.
Before calculating other expectations, we extend the conceived statistical ensemble by a second random variable $Y$, i.e. with each random variable $X_l$ we associate a further random variable $Y_l$. The consecutive pairs $(X_l,Y_l);\ l=1,\ldots,n$ are assumed to be independent; however, there may be a correlation between $X_l$ and $Y_l$. In particular, each $X_l$ is independent of any other $Y_{l'};\ l\ne l'$.
We now consider the expected value of the empirical covariance
$$s_{xy}=\frac1n\sum_{l=1}^n (x_l-\mu_x)(y_l-\mu_y)\,. \qquad (4.16)$$
Letting $S_{xy}$ denote the associated random variable, we look after the expectation
$$E\{S_{xy}\}=\frac1n\,E\left\{\sum_{l=1}^n (X_l-\mu_x)(Y_l-\mu_y)\right\}.$$
Obviously,
$$E\{S_{xy}\}=\frac1n\sum_{l=1}^n E\{(X_l-\mu_x)(Y_l-\mu_y)\}=\frac1n\,n\,\sigma_{xy}=\sigma_{xy}\,. \qquad (4.17)$$
More effort is needed to cover
$$E\{(\bar X-\mu_x)(\bar Y-\mu_y)\}=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}(\bar x-\mu_x)(\bar y-\mu_y)\,p_{\bar X\bar Y}(\bar x,\bar y)\,d\bar x\,d\bar y=\sigma_{\bar x\bar y}\,. \qquad (4.18)$$
In contrast to (4.15), the expectation (4.18) relates to the less fluctuating quantities $\bar X$ and $\bar Y$. As
$$\bar x-\mu_x=\frac1n\sum_{l=1}^n x_l-\mu_x=\frac1n\sum_{l=1}^n (x_l-\mu_x)\,,$$
$$\bar y-\mu_y=\frac1n\sum_{l=1}^n y_l-\mu_y=\frac1n\sum_{l=1}^n (y_l-\mu_y)\,,$$
we find
$$E\{(\bar X-\mu_x)(\bar Y-\mu_y)\}=\frac1{n^2}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}\sum_{l=1}^n (x_l-\mu_x)\sum_{l'=1}^n (y_{l'}-\mu_y)\;p_{X_1\ldots Y_n}(x_1,\ldots,y_n)\,dx_1\ldots dy_n\,,$$
i.e.
$$E\{(\bar X-\mu_x)(\bar Y-\mu_y)\}=\frac{\sigma_{xy}}{n}\equiv\sigma_{\bar x\bar y}\,. \qquad (4.19)$$
Occasionally, apart from covariances, so-called correlation coefficients are of interest. The latter are defined through
$$\varrho_{xy}=\frac{\sigma_{xy}}{\sigma_x\sigma_y}\qquad\text{and}\qquad \varrho_{\bar x\bar y}=\frac{\sigma_{\bar x\bar y}}{\sigma_{\bar x}\sigma_{\bar y}}\,,\qquad\text{where}\quad \varrho_{xy}\equiv\varrho_{\bar x\bar y}\,. \qquad (4.20)$$
The covariances defined in (4.15) and (4.18) rely on experimentally inaccessible parameters. Of practical importance is
$$s_{xy}=\frac1{n-1}\sum_{l=1}^n (x_l-\bar x)(y_l-\bar y)\,. \qquad (4.21)$$
As the term within the curly brackets on the right-hand side of
$$(n-1)\,E\{S_{xy}\}=E\left\{\sum_{l=1}^n (X_l-\bar X)(Y_l-\bar Y)\right\}$$
passes into
$$\sum_{l=1}^n (X_l-\bar X)(Y_l-\bar Y)=\sum_{l=1}^n\left[(X_l-\mu_x)-(\bar X-\mu_x)\right]\left[(Y_l-\mu_y)-(\bar Y-\mu_y)\right]$$
$$=\sum_{l=1}^n (X_l-\mu_x)(Y_l-\mu_y)+n(\bar X-\mu_x)(\bar Y-\mu_y)-(\bar X-\mu_x)\sum_{l=1}^n (Y_l-\mu_y)-(\bar Y-\mu_y)\sum_{l=1}^n (X_l-\mu_x)$$
$$=\sum_{l=1}^n (X_l-\mu_x)(Y_l-\mu_y)-n(\bar X-\mu_x)(\bar Y-\mu_y)\,,$$
we find
$$(n-1)\,E\{S_{xy}\}=n\sigma_{xy}-n\,\frac{\sigma_{xy}}{n}=(n-1)\,\sigma_{xy}\,,$$
so that
$$E\{S_{xy}\}=\sigma_{xy}\,. \qquad (4.22)$$
The empirical covariance (4.21) is unbiased with respect to the theoretical covariance $\sigma_{xy}$.
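A quick numerical check of (4.22): draw correlated pairs with a prescribed theoretical covariance $\sigma_{xy}$ and average the empirical covariance (4.21) over many samples. The construction of the correlated pairs below is a standard device of our own choosing, not taken from the text.

```python
import math
import random

random.seed(3)
rho, n, trials = 0.6, 10, 20000
# For standard normal X and Y built as below, sigma_xy = rho = 0.6.

acc = 0.0
for _ in range(trials):
    xs, ys = [], []
    for _ in range(n):
        x = random.gauss(0.0, 1.0)
        y = rho * x + math.sqrt(1.0 - rho * rho) * random.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(y)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)  # (4.21)
    acc += sxy / trials

print(round(acc, 2))  # close to sigma_xy = 0.6
```

The $1/(n-1)$ weighting matters here just as it does for the variance: with $1/n$ the average would settle near $(n-1)\sigma_{xy}/n$ instead.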
4.3 Elementary Model of Analysis of Variance
The concept of a conceived statistical ensemble helps us to explore the impact of unknown systematic errors on the analysis of variance. As we recall, the latter shall reveal whether or not series of measurements relating to the same physical quantity, but carried out at different laboratories, are charged by specific, i.e. different, unknown systematic errors. But this, quite obviously, conflicts with the approach pursued here. Given the measured data are biased, the tool analysis of variance no longer appears to be meaningful:
The analysis of variance breaks down, given the empirical data to be investigated are charged or biased by unknown systematic errors.
Let us assume a group of, say, $m$ laboratories has measured one and the same physical quantity, established by one and the same physical object, and that a decision is pending on whether or not the stated results may be considered compatible.
All laboratories have performed the same number $n$ of repeat measurements, so there are means and variances of the form
$$\bar x_i=\frac1n\sum_{l=1}^n x_{il}\,;\qquad s_i^2=\frac1{n-1}\sum_{l=1}^n (x_{il}-\bar x_i)^2\,;\qquad i=1,\ldots,m\,. \qquad (4.23)$$
We assume the $x_{il}$ to be independent and normally distributed. With regard to the influence of unknown systematic errors, we admit that the expectations $\mu_i$ of the arithmetic means $\bar X_i$ differ. The expected values of the empirical variances shall, however, be equal,
$$E\{\bar X_i\}=\mu_i\,,\qquad E\{S_i^2\}=\sigma^2\,;\qquad i=1,\ldots,m\,. \qquad (4.24)$$
The two error equations
$$x_{il}=x_0+(x_{il}-\mu_i)+f_i\,,\qquad \bar x_i=x_0+(\bar x_i-\mu_i)+f_i$$
$$i=1,\ldots,m\,;\qquad l=1,\ldots,n \qquad (4.25)$$
clearly conflict with the basic idea of analysis of variance. On the other hand, we want to show explicitly why, and how, this classical tool of data analysis breaks down due to the empirical character of the data fed into it.
The analysis of variance defines a grand mean, summing over $N=m\times n$ measured data $x_{il}$,
$$\bar{\bar x}=\frac1N\sum_{i=1}^m\sum_{l=1}^n x_{il} \qquad (4.26)$$
and an associated grand empirical variance
$$s^2=\frac1{N-1}\sum_{i=1}^m\sum_{l=1}^n (x_{il}-\bar{\bar x})^2\,. \qquad (4.27)$$
The latter is decomposed by means of the identity
$$x_{il}=\bar{\bar x}+(\bar x_i-\bar{\bar x})+(x_{il}-\bar x_i) \qquad (4.28)$$
and the error equations (4.25). Thus, (4.27) turns into
$$(N-1)\,s^2=n\sum_{i=1}^m (\bar x_i-\bar{\bar x})^2+\sum_{i=1}^m\sum_{l=1}^n (x_{il}-\bar x_i)^2 \qquad (4.29)$$
as the mixed term vanishes,
$$2\sum_{i=1}^m (\bar x_i-\bar{\bar x})\sum_{l=1}^n (x_{il}-\bar x_i)=0\,.$$
If the systematic errors were zero, the right-hand side of (4.29) would offer the possibility of defining empirical variances $s_1^2$ and $s_2^2$, namely
$$(m-1)\,s_1^2=n\sum_{i=1}^m (\bar x_i-\bar{\bar x})^2\qquad\text{and}\qquad (N-m)\,s_2^2=\sum_{i=1}^m\sum_{l=1}^n (x_{il}-\bar x_i)^2\,, \qquad (4.30)$$
so that
$$(N-1)\,s^2=(m-1)\,s_1^2+(N-m)\,s_2^2\,. \qquad (4.31)$$
The expectations of both $S_1^2$ and $S_2^2$ should be equal to the theoretical variance $\sigma^2$ defined in (4.24), i.e.
$$E\{S_1^2\}=\sigma^2\qquad\text{and}\qquad E\{S_2^2\}=\sigma^2\,.$$
Let us first consider the left expectation. As
$$\bar x_i=\frac1n\sum_{l=1}^n x_{il}=x_0+(\bar x_i-\mu_i)+f_i$$
and
$$\bar{\bar x}=\frac1N\sum_{i=1}^m\sum_{l=1}^n x_{il}=\frac nN\sum_{i=1}^m\bigl[x_0+(\bar x_i-\mu_i)+f_i\bigr]=x_0+\frac nN\sum_{i=1}^m (\bar x_i-\mu_i)+\frac nN\sum_{i=1}^m f_i\,,$$
we have
$$\bar x_i-\bar{\bar x}=\left[(\bar x_i-\mu_i)-\frac nN\sum_{j=1}^m (\bar x_j-\mu_j)\right]+\left[f_i-\frac nN\sum_{j=1}^m f_j\right],$$
so that
$$E\{S_1^2\}=\sigma^2+\frac{n}{m-1}\sum_{i=1}^m\left[f_i-\frac nN\sum_{j=1}^m f_j\right]^2. \qquad (4.32)$$
The second expectation follows from
$$\sum_{i=1}^m\sum_{l=1}^n (x_{il}-\bar x_i)^2=(n-1)\sum_{i=1}^m s_i^2\,,$$
so that
$$E\{S_2^2\}=\sigma^2\,. \qquad (4.33)$$
The test ratio of the analysis of variance is the quotient $S_1^2/S_2^2$, which should be F distributed. On account of (4.32), however, any further consideration appears to be dispensable.
5 Combination of Measurement Errors
5.1 Expectation of the Arithmetic Mean
For convenience's sake, we cast the basic error equation (2.3)
$$x_l=x_0+\varepsilon_l+f_x\,;\qquad l=1,2,\ldots,n$$
into the form of an identity
$$x_l=x_0+(x_l-\mu_x)+f_x\,;\qquad l=1,2,\ldots,n\,. \qquad (5.1)$$
Let us remind ourselves that the quantity $x_l$ is a physically measured value, while the decomposition (5.1) reflects the error model under discussion. The same applies to the arithmetic mean
$$\bar x=x_0+(\bar x-\mu_x)+f_x \qquad (5.2)$$
which, in fact, is a formal representation of
$$\bar x=\frac1n\sum_{l=1}^n x_l\,.$$
Let us assign a random variable $\bar X$ to $\bar x$. Then, obviously, the expectation
$$\mu_x=E\{\bar X\}=x_0+f_x \qquad (5.3)$$
puts straight:
The unknown systematic error $f_x$ biases the expectation $\mu_x$ of the arithmetic mean $\bar X$ with respect to the true value $x_0$.
In the following we shall design a confidence interval localizing the expectation $\mu_x$ of the arithmetic mean. To this end, we refer to the empirical variance
$$s_x^2=\frac1{n-1}\sum_{l=1}^n (x_l-\bar x)^2 \qquad (5.4)$$
and to Student's variable
$$T=\frac{\bar X-\mu_x}{S_x/\sqrt n}\,. \qquad (5.5)$$
Fig. 5.1. Confidence interval for the expectation $\mu_x$ of the arithmetic mean $\bar X$
As has been said in (3.54),
$$P\{-t_P\le T\le t_P\}=\int_{-t_P}^{t_P} p_T(t;\nu)\,dt\,;\qquad \nu=n-1 \qquad (5.6)$$
specifies the probability of finding any realization $t$ of $T$ in an interval $-t_P,\ldots,t_P$. Often, $P=0.95$ is chosen. The probability "masses" lying to the left of $-t_P$ and to the right of $t_P$ are traditionally denoted by $\alpha/2$. In terms of this partitioning we have $P=1-\alpha$.
From (5.5) and (5.6), we infer a statement about the probable location of the random variable $\bar X$,
$$P\left\{\mu_x-\frac{t_P(n-1)}{\sqrt n}\,S_x\le \bar X\le \mu_x+\frac{t_P(n-1)}{\sqrt n}\,S_x\right\}=1-\alpha\,.$$
From the point of view of the experimenter it will, however, be more interesting to learn about the probable location of the parameter $\mu_x$. Indeed, inverting the above statement yields
$$P\left\{\bar X-\frac{t_P(n-1)}{\sqrt n}\,S_x\le \mu_x\le \bar X+\frac{t_P(n-1)}{\sqrt n}\,S_x\right\}=1-\alpha\,. \qquad (5.7)$$
An equivalent statement, which reflects the experimental situation more closely, is obtained as follows: We "draw a sample" $x_1,x_2,\ldots,x_n$ of size $n$ and determine the estimators $\bar x$ and $s_x^2$. Then, as illustrated in Fig. 5.1, we may rewrite (5.7) in the form
$$\bar x-\frac{t_P}{\sqrt n}\,s_x\le \mu_x\le \bar x+\frac{t_P}{\sqrt n}\,s_x\qquad\text{with probability } P\,. \qquad (5.8)$$
This latter statement reveals the usefulness of Student's density: it allows us to localize the unknown parameter $\mu_x$ by means of the two estimators $\bar x$ and $s_x^2$. The normal density itself would not admit of such a statement, as the parameter $\sigma_x$ is inaccessible.
The intervals (5.7) and (5.8) localize the parameter $\mu_x$ with probability $P$ as defined in (5.6). The intervals themselves are called confidence intervals and $P$ is called the confidence level.
As has been pointed out, the difference $x_l-\bar x$ cancels the systematic error $f_x$, so that the random variable $S_x^2$ remains unbiased, $E\{S_x^2\}=\sigma_x^2$.
Let us keep in mind that the parameter $\mu_x$ may differ from the true value $x_0$ by up to $\pm f_{s,x}$. In order to assess the uncertainty of the mean $\bar x$ with respect to the true value $x_0$, we therefore need to do better.
5.2 Uncertainty of the Arithmetic Mean
The expectation
$$\mu_x=E\{\bar X\}=x_0+f_x\,,\qquad -f_{s,x}\le f_x\le f_{s,x} \qquad (5.9)$$
may deviate from $x_0$ by up to $\pm f_{s,x}$, Fig. 5.2. Hence, as sketched in Fig. 5.3, we define the overall uncertainty $u_{\bar x}$ of the estimator $\bar x$ as
$$u_{\bar x}=\frac{t_P}{\sqrt n}\,s_x+f_{s,x}\,. \qquad (5.10)$$
The final measurement result reads
$$\bar x\pm u_{\bar x}\,. \qquad (5.11)$$
While (5.8) is a probability statement, (5.11) is not. Instead, we simply argue that this statement localizes the true value $x_0$ with, say, "factual" or "physical" certainty. Though random errors emerging from real experiments may, as a rule, be considered approximately normally distributed, in general the extreme tails of the theoretical distribution model $N(\mu_x,\sigma_x)$ seem to be lacking. Relating $t_P$ to, say, $P=95\%$, this in turn means that we may assume the interval (5.8) to localize $\mu_x$ quasi-safely.
The fact that the systematic error $f_x$ enters into (5.11) in the form of a worst-case estimate might reflect an overestimation. On the other hand, the experimenter does not have any means at his disposal to shape the uncertainty (5.10) more favorably, i.e. more narrowly. If he nevertheless compressed it, he would run the risk of forgoing the localization of the true value $x_0$, so that, as a consequence, all his efforts and metrological work would have been futile and, in particular, he might miss metrology's strict demand for traceability.
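In code, the recipe (5.4), (5.8) and (5.10) takes only a few lines. The data, the quantile $t_{0.95}(9)\approx 2.26$ for $n=10$ (from a standard t table), and the systematic error bound $f_{s,x}$ below are illustrative assumptions of ours.

```python
import math

xs = [10.02, 9.98, 10.05, 9.99, 10.01, 10.03, 9.97, 10.00, 10.04, 9.96]  # illustrative
n = len(xs)
t_P = 2.26   # t_0.95(n - 1) for n = 10, from a standard t table
f_s = 0.02   # assumed worst-case bound on the unknown systematic error

xbar = sum(xs) / n
s_x = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))  # (5.4)
half_width = t_P * s_x / math.sqrt(n)                        # random part, (5.8)
u_x = half_width + f_s                                       # overall uncertainty, (5.10)
print(f"{xbar:.3f} +/- {u_x:.3f}")
```

Note that the two contributions are added arithmetically, not in quadrature: this is exactly the worst-case treatment of the systematic error discussed above.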
Fig. 5.2. Difference between the expected value µx and the true value x0
Fig. 5.3. Uncertainty $u_{\bar x}$ of the arithmetic mean $\bar x$ with respect to the true value $x_0$
Example
Legal SI-based units are defined verbally – in this respect, they are faultless. However, their realizations are blurred, as they suffer from experimental imperfections. We consider the units volt (V), ohm (Ω) and ampere (A). As Fig. 5.4 illustrates, the (present) realization of the SI unit ampere relies on two terms,
$$1\,\mathrm A = 1\,\mathrm A_{\mathrm L} + \bar c\,.$$
Here $1\,\mathrm A_{\mathrm L}$ designates the laboratory ampere. This should coincide with the theoretically defined SI ampere $1\,\mathrm A$. However, as the former deviates from
Fig. 5.4. Realization of the SI unit ampere. The quantities $1\,\mathrm A$ and $1\,\mathrm A_{\mathrm L}$ symbolize the SI ampere and the laboratory ampere respectively
the latter, a correction $\bar c$ needs to be applied. The correction is obtained either from other experiments or from a least squares fit to the fundamental constants of physics. Measured in laboratory units $\mathrm A_{\mathrm L}$, the correction is given in terms of an average $\bar c$ and an uncertainty $u_{\bar c}$, so that
$$1\,\mathrm A = 1\,\mathrm A_{\mathrm L} + (\bar c \pm u_{\bar c}) = (1\,\mathrm A_{\mathrm L} + \bar c) \pm u_{\bar c}\,.$$
On the other hand, the distinction between theoretically defined units and their experimentally realized counterparts may also be expressed in the form of quotients. To this end, so-called conversion factors have been introduced. The conversion factor for the ampere is defined by
$$K_{\mathrm A}=\frac{1\,\mathrm A_{\mathrm L}}{1\,\mathrm A}\,.$$
Let $c$ denote the numerical value¹ of $\bar c$, so that $\bar c=c\,\mathrm A_{\mathrm L}$. Then
$$K_{\mathrm A}=\frac{1}{1+c}\qquad\text{or, equivalently,}\qquad c=\frac{1-K_{\mathrm A}}{K_{\mathrm A}}\,.$$
The uncertainty $u_{K_{\mathrm A}}$, given $u_{\bar c}$, will be derived in Sect. 5.3. Analogous relations refer to the SI units volt and ohm:
$$1\,\mathrm V = 1\,\mathrm V_{\mathrm L} + (\bar a \pm u_{\bar a}) = (1\,\mathrm V_{\mathrm L} + \bar a) \pm u_{\bar a}\,,$$
$$K_{\mathrm V}=\frac{1\,\mathrm V_{\mathrm L}}{1\,\mathrm V}=\frac{1}{1+a}\,,\qquad a=\frac{1-K_{\mathrm V}}{K_{\mathrm V}}\,,$$
$$1\,\Omega = 1\,\Omega_{\mathrm L} + (\bar b \pm u_{\bar b}) = (1\,\Omega_{\mathrm L} + \bar b) \pm u_{\bar b}\,,$$
$$K_{\Omega}=\frac{1\,\Omega_{\mathrm L}}{1\,\Omega}=\frac{1}{1+b}\,,\qquad b=\frac{1-K_{\Omega}}{K_{\Omega}}\,.$$
¹ e.g. the unit is suppressed.
For the realization of the units volt and ohm, Josephson's and von Klitzing's quantum effects
$$U_{\mathrm J}=\frac{h}{2e}\,\nu\qquad\text{and}\qquad R_{\mathrm H}=\frac{h}{i e^2}\,;\qquad i=1,2,3,\ldots\,,$$
respectively, are of tremendous importance; $h$ denotes Planck's constant, $e$ the elementary charge and $\nu$ a high-frequency radiation ($\approx 70\,\mathrm{GHz}$). Unfortunately, the uncertainties of the leading factors $h/(2e)$ and $R_{\mathrm H}=h/e^2$ are still unsatisfactory. Nevertheless, sophisticated predefinitions
$$2e/h = 483\,597.9\ \mathrm{GHz/V}\ \text{(exact)}\,;\qquad h/e^2 = 25.812\,807\ \mathrm{k\Omega}\ \text{(exact)}$$
enable experimenters to exploit the remarkable reproducibilities of quantum effects, i.e. to adjust the sums $1\,\mathrm V_{\mathrm L}+\bar a$ and $1\,\Omega_{\mathrm L}+\bar b$; the uncertainties $u_{\bar a}$
and $u_{\bar b}$ remain, however, unaffected.
We finally add the relationships between the constants $a$, $b$, $c$ and the conversion factors $K_{\mathrm V}$, $K_{\Omega}$, $K_{\mathrm A}$. From
$$1\,\mathrm V = 1\,\Omega\;1\,\mathrm A\qquad\text{and}\qquad 1\,\mathrm V_{\mathrm L} = 1\,\Omega_{\mathrm L}\;1\,\mathrm A_{\mathrm L}$$
we find
$$[1+a]\,\mathrm V_{\mathrm L}=[1+b]\,\Omega_{\mathrm L}\,[1+c]\,\mathrm A_{\mathrm L}$$
and, assuming $a,b,c\lesssim 10^{-6}$, by approximation
$$a=b+c\,.$$
Similarly, from
$$\frac{\mathrm V_{\mathrm L}}{K_{\mathrm V}}=\frac{\Omega_{\mathrm L}}{K_{\Omega}}\,\frac{\mathrm A_{\mathrm L}}{K_{\mathrm A}}$$
we obtain
$$K_{\mathrm V}=K_{\Omega}K_{\mathrm A}\,.$$
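The approximation $a=b+c$ and the exact product $K_{\mathrm V}=K_{\Omega}K_{\mathrm A}$ can be checked with numbers of the stated magnitude ($\lesssim 10^{-6}$); the values below are our own, chosen only for illustration.

```python
b, c = 4.0e-7, 6.03e-6            # illustrative numerical values of order 1e-6
a_exact = (1 + b) * (1 + c) - 1   # from [1+a] = [1+b][1+c]
print(abs(a_exact - (b + c)))     # of order b*c ~ 1e-12: a = b + c to that level

K_A = 1 / (1 + c)
K_Omega = 1 / (1 + b)
K_V = 1 / (1 + a_exact)
print(abs(K_V - K_Omega * K_A))   # zero up to floating point rounding
```

The neglected cross term $b\,c$ is some six orders of magnitude below $a$ itself, which justifies the linearization.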
5.3 Uncertainty of a Function of One Variable
The simplest case of error propagation is the function of one variable, say $\phi(x)$, e.g. $\phi(x)=\ln x$, $\phi(x)=1/x$, etc. In most cases, error propagation within functional relationships takes place on the basis of linearized Taylor series expansions. Whether or not this is admissible depends on the magnitude of the measurement uncertainties of the input data, in this case on the uncertainty $u_{\bar x}$ of the quantity $\bar x$, and on the analytical behavior of the considered function in the neighborhood of $\bar x\pm u_{\bar x}$. Assuming $\phi(x)$ to behave linearly enough, the error equations
$$x_l=x_0+(x_l-\mu_x)+f_x\,,\qquad \bar x=x_0+(\bar x-\mu_x)+f_x$$
lead us to
$$\phi(x_l)=\phi(x_0)+\frac{d\phi}{dx}\Big|_{x=x_0}(x_l-\mu_x)+\frac{d\phi}{dx}\Big|_{x=x_0}f_x+\cdots$$
$$\phi(\bar x)=\phi(x_0)+\frac{d\phi}{dx}\Big|_{x=x_0}(\bar x-\mu_x)+\frac{d\phi}{dx}\Big|_{x=x_0}f_x+\cdots \qquad (5.12)$$
so that²
$$\phi(x_l)-\phi(\bar x)=\frac{d\phi}{dx}(x_l-\bar x)\,;\qquad l=1,2,\ldots,n\,. \qquad (5.13)$$
Here it is assumed that
– the expansions may be truncated after their linear terms and that
– the derivative at $x_0$ may be approximated by the derivative at $\bar x$.
Furthermore, for the sake of lucidity, an equals sign has been introduced which, actually, is of course incorrect. Finally, we shall assume that
– by approximation, the derivatives may be treated as constants.
Within the framework of these approximations, the expected value of the random variable $\phi(\bar X)$,
$$\mu_\phi=E\{\phi(\bar X)\}=\phi(x_0)+\frac{d\phi}{dx}\,f_x\,, \qquad (5.14)$$
presents itself as the sum of the true value $\phi(x_0)$ and a bias expressing the propagated systematic error
$$f_\phi=\frac{d\phi}{dx}\,f_x\,;\qquad -f_{s,x}\le f_x\le f_{s,x}\,. \qquad (5.15)$$
Obviously, the linearization (5.13) implies
$$\bar\phi=\phi(\bar x)=\frac1n\sum_{l=1}^n\phi(x_l)\,. \qquad (5.16)$$
Using (5.13), the empirical variance of the $\phi(x_l)$ with respect to their mean $\phi(\bar x)$ is given by
$$s_\phi^2=\frac1{n-1}\sum_{l=1}^n\left[\phi(x_l)-\phi(\bar x)\right]^2=\left(\frac{d\phi}{dx}\right)^{\!2} s_x^2\,. \qquad (5.17)$$
If the $x_l$ are normally distributed, this also holds for the $\phi(x_l)$, $l=1,\ldots,n$. Hence, referring to (5.12), (5.14) and (5.17), we are in a position to define a Student's
² abbreviating the operation of differentiation.
$$T(n-1)=\frac{\phi(\bar X)-\mu_\phi}{S_\phi/\sqrt n}$$
through
$$T(n-1)=\frac{\dfrac{d\phi}{dx}(\bar X-\mu_x)}{\left|\dfrac{d\phi}{dx}\right|\dfrac{S_x}{\sqrt n}}=\operatorname{sign}\!\left(\frac{d\phi}{dx}\right)\frac{\bar X-\mu_x}{S_x/\sqrt n}\,, \qquad (5.18)$$
where the sign function is meaningless on account of the symmetry of Student's density. After all, (5.18) provides a confidence interval
$$\phi(\bar X)\pm\frac{t_P(n-1)}{\sqrt n}\,S_\phi \qquad (5.19)$$
for the parameter $\mu_\phi$ with reference to a confidence level $P$. Combining (5.19) with the worst-case estimate
$$f_{s,\phi}=\left|\frac{d\phi}{dx}\right| f_{s,x}\,;\qquad -f_{s,\phi}\le f_\phi\le f_{s,\phi} \qquad (5.20)$$
of the propagated systematic error $f_\phi$, the overall uncertainty turns out to be
$$u_\phi=\left|\frac{d\phi}{dx}\right|\left[\frac{t_P(n-1)}{\sqrt n}\,s_x+f_{s,x}\right]=\left|\frac{d\phi}{dx}\right| u_{\bar x}\,, \qquad (5.21)$$
where $u_{\bar x}$ is given in (5.10). Thus, the final measurement result is
$$\phi(\bar x)\pm u_\phi\,. \qquad (5.22)$$
Examples
(i) Occasionally, the fine structure constant³
$$\alpha=7.297\,353\,08(33)\times 10^{-3}$$
is inversely written
$$\alpha^{-1}=137.035\,989\,49\cdots\,.$$
We shall determine the uncertainty $u_{\alpha^{-1}}$, given $u_{\bar\alpha}=t_P\,s_\alpha/\sqrt n+f_{s,\alpha}$. Linearizing $\phi(\alpha)=1/\alpha$,
$$\phi(\alpha_l)-\phi(\bar\alpha)=-\frac1{\bar\alpha^2}\,(\alpha_l-\bar\alpha)\,;\qquad l=1,\ldots,n\,,$$
³ The data are taken from [17]. Here, the rounding procedures proposed in Sect. 1.2 do not apply. Instead, uncertainties are generally provided with two digits.
we find
$$s_{\alpha^{-1}}^2=\frac1{\bar\alpha^4}\,s_\alpha^2\,.$$
As
$$f_{\alpha^{-1}}=-\frac1{\bar\alpha^2}\,f_\alpha\,,$$
the uncertainty in question is
$$u_{\alpha^{-1}}=\frac1{\bar\alpha^2}\left[\frac{t_P}{\sqrt n}\,s_\alpha+f_{s,\alpha}\right]=\frac{u_{\bar\alpha}}{\bar\alpha^2}\,,$$
so that
$$\alpha^{-1}=137.035\,989\,5(62)\,.$$
We notice that the relative uncertainties are equal,
$$\frac{u_{\alpha^{-1}}}{\alpha^{-1}}=\frac{u_{\bar\alpha}}{\bar\alpha}\,.$$
The same applies when ppm are introduced. As
$$\frac{u_{\mathrm{ppm},\,\alpha}}{10^6}=\frac{u_{\bar\alpha}}{\bar\alpha}\,,$$
we have
$$u_{\mathrm{ppm},\,\alpha}=\frac{u_{\bar\alpha}}{\bar\alpha}\,10^6=\frac{u_{\alpha^{-1}}}{\alpha^{-1}}\,10^6=u_{\mathrm{ppm},\,\alpha^{-1}}=0.045\,.$$
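With the numbers quoted from [17], the propagation $u_{\alpha^{-1}}=u_{\bar\alpha}/\bar\alpha^2$ and the ppm statement can be verified directly:

```python
alpha = 7.29735308e-3
u_alpha = 0.00000033e-3      # the "(33)" in the last two quoted digits

u_alpha_inv = u_alpha / alpha**2   # propagated uncertainty of 1/alpha
ppm = u_alpha / alpha * 1e6        # relative uncertainty in ppm

print(f"{1/alpha:.7f}")      # 137.0359895
print(f"{u_alpha_inv:.1e}")  # 6.2e-06, i.e. alpha^-1 = 137.0359895(62)
print(f"{ppm:.3f}")          # 0.045
```

Because $\phi(\alpha)=1/\alpha$ has $|\phi'/\phi|=1/\alpha$, relative uncertainties are preserved under inversion, which is what the identical ppm values express.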
(ii) Given the uncertainty $u_{\bar c}$ of the correction $\bar c$ to the base unit ampere, $1\,\mathrm A=(1\,\mathrm A_{\mathrm L}+\bar c)\pm u_{\bar c}$, we shall find the uncertainty of the conversion factor $K_{\mathrm A}$.
Setting $K_{\mathrm A}=y$ and $c=x$ so that $y=1/(1+x)$, we obtain
$$y-y_0=\frac{dy}{dx_0}(x-x_0)\,;\qquad x-x_0=(x-\mu_x)+f_x\,,$$
i.e.
$$y-y_0=\frac{dy}{dx_0}\left[(x-\mu_x)+f_x\right].$$
But then
$$u_y=\left|\frac{dy}{dx}\right|\left[\frac{t_P}{\sqrt n}\,s_x+f_{s,x}\right]\qquad\text{or}\qquad u_{K_{\mathrm A}}=\frac{u_{\bar c}}{(1+c)^2}\,.$$
Just to illustrate the situation, let us refer to the 1986 adjustment of the fundamental physical constants [17]. We have
$$K_{\mathrm A}=0.999\,993\,97\pm 0.3\times 10^{-6}\,.$$
This means
$$1\,\mathrm A_{\mathrm L}=K_{\mathrm A}\,1\,\mathrm A=0.999\,993\,97\,\mathrm A\,,\qquad c=6.03\times 10^{-6}\,,\qquad u_{\bar c}=0.30\times 10^{-6}\,\mathrm A\,,\qquad \bar c=6.03\times 10^{-6}\,\mathrm A_{\mathrm L}\,,$$
so that
$$1\,\mathrm A_{\mathrm L}+\bar c=0.999\,993\,97+0.000\,006\,03=1\,\mathrm A\,.$$
In Sect. 6.4 we shall assess the uncertainty $u_{\bar c}$, given $u_{\bar a}$ and $u_{\bar b}$. Furthermore, given $u_{K_{\mathrm V}}$ and $u_{K_\Omega}$, we shall quote the uncertainty $u_{K_{\mathrm A}}$.
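The numbers of the 1986 adjustment give, via $u_{K_{\mathrm A}}=u_{\bar c}/(1+c)^2$:

```python
c = 6.03e-6      # numerical value of the correction, from the text
u_c = 0.30e-6    # its uncertainty
K_A = 1 / (1 + c)
u_KA = u_c / (1 + c) ** 2

print(f"{K_A:.8f}")   # 0.99999397
print(f"{u_KA:.2e}")  # 3.00e-07, i.e. +/- 0.3e-6 as quoted
```

Since $c$ is tiny, the factor $(1+c)^{-2}$ changes $u_{\bar c}$ only in the sixth significant digit; the quoted uncertainty carries over essentially unchanged.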
5.4 Systematic Errors of the Measurands
Should the measurand itself, for any reason, be blurred, a true value $x_0$ would not exist. Even in this case it would, however, be assumed that the measurand remains constant during the time it takes to perform $n$ repeat measurements. The blurred structure of the measurand can be taken into account by an additional systematic error
$$-f_{s,x_0}\le f_{x_0}\le f_{s,x_0}\,. \qquad (5.23)$$
The linewidths on semiconductor templates, for example, may locally vary more or less severely. Subject to the resolution of the measuring process, such variations will be integrated out or disclosed. If integrated out, an additional systematic error should account for the inconstancies of the linewidths.
The extra systematic error $f_{x_0}$ of the measurand may be merged with the systematic error of the measuring process
$$-f_{s,x}\le f_x\le f_{s,x}$$
into an overall systematic error
$$-(f_{s,x}+f_{s,x_0})\,,\ \ldots,\ (f_{s,x}+f_{s,x_0})\,. \qquad (5.24)$$
It is essential that, during the time it takes to register the repeat measurements, $f_x=\text{const.}$ and $f_{x_0}=\text{const.}$
The formalization of series of measurements which are subject to time-varying systematic errors, giving rise to so-called time series, would be outside the scope of this presentation. A basic metrological example is given in [12].
6 Propagation of Measurement Errors
6.1 Taylor Series Expansion
We consider a functional relationship $\phi(x,y)$ and given results of measurement
$$\bar x\pm u_{\bar x}\qquad\text{and}\qquad \bar y\pm u_{\bar y} \qquad (6.1)$$
localizing the true values $x_0$ and $y_0$ respectively. The latter define a quantity $\phi(x_0,y_0)$ which we regard as the true value of the physical quantity in question. The result to be found,
$$\phi(\bar x,\bar y)\pm u_\phi\,, \qquad (6.2)$$
should localize this quantity $\phi(x_0,y_0)$. In general, it is sufficient to base the assessment of the uncertainty $u_\phi$ on the linear part of the Taylor expansion. Within the scope of this approximation, and due to the error model relied on, the error propagation resolves into two different, independent branches covering the propagation of random and of systematic errors.
Expanding $\phi(x_l,y_l);\ l=1,2,\ldots,n$ throughout a neighborhood of the point $x_0,y_0$,
$$\phi(x_l,y_l)=\phi(x_0,y_0)+\frac{\partial\phi}{\partial x}\bigg|_{\substack{x=x_0\\ y=y_0}}(x_l-x_0)+\frac{\partial\phi}{\partial y}\bigg|_{\substack{x=x_0\\ y=y_0}}(y_l-y_0)+\cdots$$
and referring to the error equations
$$x_l=x_0+(x_l-\mu_x)+f_x\,,\qquad y_l=y_0+(y_l-\mu_y)+f_y \qquad (6.3)$$
yields¹
$$\phi(x_l,y_l)=\phi(x_0,y_0)+\left[\frac{\partial\phi}{\partial x}(x_l-\mu_x)+\frac{\partial\phi}{\partial y}(y_l-\mu_y)\right]+\left[\frac{\partial\phi}{\partial x}f_x+\frac{\partial\phi}{\partial y}f_y\right]. \qquad (6.4)$$
¹ abbreviating the operation of differentiation.
For the sake of lucidity, the truncated series expansion has been expressed in terms of an equality which, of course, is incorrect. By approximation, we shall treat the differential quotients as constants – this will keep the formalism controllable. As we assume equal numbers of repeat measurements, we may sum over $l$ and divide by $n$. Subtracting
$$\phi(\bar x,\bar y)=\phi(x_0,y_0)+\left[\frac{\partial\phi}{\partial x}(\bar x-\mu_x)+\frac{\partial\phi}{\partial y}(\bar y-\mu_y)\right]+\left[\frac{\partial\phi}{\partial x}f_x+\frac{\partial\phi}{\partial y}f_y\right] \qquad (6.5)$$
from (6.4), we find
$$\phi(x_l,y_l)-\phi(\bar x,\bar y)=\frac{\partial\phi}{\partial x}(x_l-\bar x)+\frac{\partial\phi}{\partial y}(y_l-\bar y)\,. \qquad (6.6)$$
The linearization implies
$$\bar\phi=\frac1n\sum_{l=1}^n\phi(x_l,y_l)=\phi(\bar x,\bar y)\,. \qquad (6.7)$$
From the expectation
$$\mu_\phi=E\{\phi(\bar X,\bar Y)\}=\phi(x_0,y_0)+\left[\frac{\partial\phi}{\partial x}f_x+\frac{\partial\phi}{\partial y}f_y\right] \qquad (6.8)$$
we seize:
The unknown systematic errors $f_x$ and $f_y$ cause the expectation $\mu_\phi$ of the arithmetic mean $\phi(\bar X,\bar Y)$ to be biased with respect to the true value $\phi(x_0,y_0)$. The bias is the propagated systematic error
$$f_\phi=\frac{\partial\phi}{\partial x}f_x+\frac{\partial\phi}{\partial y}f_y\,;\qquad -f_{s,x}\le f_x\le f_{s,x}\,,\quad -f_{s,y}\le f_y\le f_{s,y}\,. \qquad (6.9)$$
The uncertainty $u_\phi$ sought for will be assessed by means of (6.6) and (6.9). Let us summarize. On account of the error model, the linear part of the Taylor expansion (6.5) splits up into two terms. The part due to random errors embodies the unknown expectations $\mu_x$, $\mu_y$. As a result, no direct assessment is possible. Meanwhile, the difference (6.6) will help us to overcome the difficulty. In contrast to this, the second term of (6.5), which expresses the influence of systematic errors, turns out to be directly assessable.
Finally, for formal reasons, we define the vectors
$$b=\begin{pmatrix}\dfrac{\partial\phi}{\partial x}\\[2mm]\dfrac{\partial\phi}{\partial y}\end{pmatrix},\qquad \bar\zeta-\mu=\begin{pmatrix}\bar x-\mu_x\\ \bar y-\mu_y\end{pmatrix} \qquad (6.10)$$
so that (6.5) changes into
$$\phi(\bar x,\bar y)=\mu_\phi+b^{\mathrm T}(\bar\zeta-\mu)\,. \qquad (6.11)$$
To set up the error propagation for more than two variables, we generalize the nomenclature. We consider a functional relationship $\phi(x_1,x_2,\ldots,x_m)$ and arithmetic means
$$\bar x_i=\frac1n\sum_{l=1}^n x_{il}\,;\qquad i=1,2,\ldots,m \qquad (6.12)$$
to be based on an identical number $n$ of repeat measurements.
As usual, we designate the true values of the measurands $x_i;\ i=1,2,\ldots,m$ by
$$x_{0,1},\,x_{0,2},\,\ldots,\,x_{0,m}\,. \qquad (6.13)$$
Then, the final result
$$\phi(\bar x_1,\bar x_2,\ldots,\bar x_m)\pm u_\phi \qquad (6.14)$$
should localize the true value $\phi(x_{0,1},x_{0,2},\ldots,x_{0,m})$. In analogy to (6.5), we have
$$\phi(\bar x_1,\bar x_2,\ldots,\bar x_m)=\mu_\phi+\sum_{i=1}^m\frac{\partial\phi}{\partial x_i}(\bar x_i-\mu_i)\,;\qquad \mu_i=E\{\bar X_i\}\,, \qquad (6.15)$$
where
$$\mu_\phi=E\{\phi(\bar X_1,\bar X_2,\ldots,\bar X_m)\}=\phi(x_{0,1},x_{0,2},\ldots,x_{0,m})+\sum_{i=1}^m\frac{\partial\phi}{\partial x_i}f_i\,. \qquad (6.16)$$
The contribution
$$f_\phi=\sum_{i=1}^m\frac{\partial\phi}{\partial x_i}f_i\,;\qquad -f_{s,i}\le f_i\le f_{s,i} \qquad (6.17)$$
denotes the propagated systematic error biasing the true value $\phi(x_{0,1},x_{0,2},\ldots,x_{0,m})$. Instead of (6.6) we here have, letting $l=1,2,\ldots,n$,
$$\phi(x_{1l},x_{2l},\ldots,x_{ml})-\phi(\bar x_1,\bar x_2,\ldots,\bar x_m)=\sum_{i=1}^m\frac{\partial\phi}{\partial x_i}(x_{il}-\bar x_i)\,. \qquad (6.18)$$
Finally, we convert (6.15) into vector form,
$$\phi(\bar x_1,\bar x_2,\ldots,\bar x_m)=\mu_\phi+b^{\mathrm T}(\bar x-\mu)\,, \qquad (6.19)$$
where
$$b=\left[\frac{\partial\phi}{\partial x_1}\;\cdots\;\frac{\partial\phi}{\partial x_m}\right]^{\mathrm T}\qquad\text{and}\qquad \bar x-\mu=(\bar x_1-\mu_1\;\cdots\;\bar x_m-\mu_m)^{\mathrm T}.$$
We are ultimately prepared to reformulate the formalism for error propagation.
6.2 Expectation of the Arithmetic Mean
We might wish to assess the statistical fluctuations of the random errors from (6.4) and (6.8),
$$\phi(x_l,y_l)=\mu_\phi+\left[\frac{\partial\phi}{\partial x}(x_l-\mu_x)+\frac{\partial\phi}{\partial y}(y_l-\mu_y)\right];\qquad l=1,2,\ldots,n\,, \qquad (6.20)$$
by means of the expectation
$$\sigma_\phi^2=E\left\{(\phi(X,Y)-\mu_\phi)^2\right\}=\left(\frac{\partial\phi}{\partial x}\right)^{\!2}\sigma_x^2+2\left(\frac{\partial\phi}{\partial x}\right)\!\left(\frac{\partial\phi}{\partial y}\right)\sigma_{xy}+\left(\frac{\partial\phi}{\partial y}\right)^{\!2}\sigma_y^2\,. \qquad (6.21)$$
However, neither the theoretical variances nor the theoretical covariance are available – the latter may turn out to be positive or negative. Resorting to (6.6), we can overcome the problem that we do not know the theoretical moments of second order, as
$$\phi(x_l,y_l)-\phi(\bar x,\bar y)=\frac{\partial\phi}{\partial x}(x_l-\bar x)+\frac{\partial\phi}{\partial y}(y_l-\bar y)\,;\qquad l=1,2,\ldots,n$$
puts us in a position to express the scattering of the random errors via the empirical moments $s_x^2$, $s_{xy}$ and $s_y^2$, so that
$$s_\phi^2=\frac1{n-1}\sum_{l=1}^n\left[\phi(x_l,y_l)-\phi(\bar x,\bar y)\right]^2$$
yields
yields
$$s_\phi^2=\frac1{n-1}\sum_{l=1}^n\left[\left(\frac{\partial\phi}{\partial x}\right)^{\!2}(x_l-\bar x)^2+2\left(\frac{\partial\phi}{\partial x}\right)\!\left(\frac{\partial\phi}{\partial y}\right)(x_l-\bar x)(y_l-\bar y)+\left(\frac{\partial\phi}{\partial y}\right)^{\!2}(y_l-\bar y)^2\right]$$
$$=\left(\frac{\partial\phi}{\partial x}\right)^{\!2}s_x^2+2\left(\frac{\partial\phi}{\partial x}\right)\!\left(\frac{\partial\phi}{\partial y}\right)s_{xy}+\left(\frac{\partial\phi}{\partial y}\right)^{\!2}s_y^2\,. \qquad (6.22)$$
As the empirical variances $s_x^2$, $s_y^2$ and the empirical covariance $s_{xy}$ are unbiased, the random variable $S_\phi^2$ is unbiased as well, $E\{S_\phi^2\}=\sigma_\phi^2$.
Succeeding pairs $x_1,y_1;\ x_2,y_2;\ \ldots;\ x_n,y_n$ are independent. On the other hand, with respect to the same $l$, $x_l$ and $y_l$ may well be dependent. When the pairs $x_l,y_l$ are assigned to the random variables $X,Y$ of a two-dimensional normal probability density, the quantities $\phi(x_l,y_l);\ l=1,2,\ldots,n$ defined in (6.20) are normally distributed, be $X$ and $Y$ dependent or not. However, succeeding pairs $\phi(x_l,y_l);\ \phi(x_{l+1},y_{l+1})$ are independent, like pairs $x_l,y_l;\ x_{l+1},y_{l+1}$. As $s_\phi^2$ estimates $\sigma_\phi^2$, we are in a position to define a
$$\chi^2(n-1)=\frac{(n-1)\,S_\phi^2}{\sigma_\phi^2} \qquad (6.23)$$
and a Student's
$$T(n-1)=\frac{\phi(\bar X,\bar Y)-\mu_\phi}{S_\phi/\sqrt n}\,. \qquad (6.24)$$
They are defined because we have agreed on statistically well-defined mea-suring conditions. Thus,
φ(X, Y ) − tP (n − 1)√n
Sφ ≤ µφ ≤ φ(X, Y ) +tP (n − 1)√
nSφ (6.25)
is a confidence interval for the expectation µφ of the arithmetic mean φ(X, Y ).The subscript P in tP specifies the associated level of confidence. Seen ex-perimentally we state:
The confidence interval
φ(x, y) ± tP (n − 1)√n
sφ (6.26)
localizes the expected value µφ = Eφ(X, Y ) of the random variable φ(X, Y )with probability PtP (n − 1).
Referring to the empirical variance–covariance matrix

s = ( s_xx  s_xy
      s_yx  s_yy ) ;   s_xx ≡ s²_x , s_xy = s_yx , s_yy ≡ s²_y

and the vector b defined in (6.10), we may compress (6.22) into

s²_φ = b^T s b .   (6.27)

The formalism does not require us to differentiate between dependent and independent random variables. In the case of a dependence, the experimental setup determines which realizations of X and Y belong together. In the case of independence, the experimenter may choose any ordering.² The numerical value of the empirical covariance s_xy depends on the kind of pairing chosen. While all the values are equally admissible, each of them is confined to the interval

−s_x s_y ≤ s_xy ≤ s_x s_y .

Even purposeful modifications of the data ordering would not invalidate (6.26).
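As a numerical cross-check of the compressed form (6.27), the following sketch – assuming NumPy is available; the function φ(x, y) = x·y² and all numbers are hypothetical, not taken from the text – verifies that the quadratic form b^T s b reproduces the empirical variance of the linearized values exactly.

```python
import numpy as np

rng = np.random.default_rng(1)

# n = 10 simulated measurement pairs (x_l, y_l); phi(x, y) = x * y**2
# is a hypothetical example function.
n = 10
x = 1.0 + 0.01 * rng.standard_normal(n)
y = 2.0 + 0.01 * rng.standard_normal(n)

# Empirical variance-covariance matrix s of the pairs, cf. (6.30) for m = 2.
s = np.cov(np.vstack([x, y]))            # divisor n - 1 by default

# Sensitivity vector b, partial derivatives taken at the means.
xbar, ybar = x.mean(), y.mean()
b = np.array([ybar**2, 2 * xbar * ybar])  # [d(phi)/dx, d(phi)/dy]

# (6.27): s_phi^2 = b^T s b ...
s2_phi = b @ s @ b

# ... equals the empirical variance of the linearized increments
# phi_l - phi(xbar, ybar) ≈ b^T (z_l - zbar), cf. (6.22).
incr = b[0] * (x - xbar) + b[1] * (y - ybar)
s2_direct = incr @ incr / (n - 1)

print(abs(s2_phi - s2_direct) < 1e-12)
```

The two numbers agree to rounding error because (6.22) and (6.27) are algebraically identical; any discrepancy with the variance of the exact φ values measures the quality of the linearization.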
The conventional error calculus, ignoring empirical covariances, restricts itself to empirical variances and, dividing them by their respective numbers of repeat measurements, defines a quantity

s*_φ = √[ (∂φ/∂x)² s²_x/n_x + (∂φ/∂y)² s²_y/n_y ]

ad hoc. With reference to B.L. Welch's effective degrees of freedom ν_eff, a confidence interval approximating Student's interval (6.26) can still be given [7],

φ(x̄, ȳ) − t_P(ν_eff) s*_φ ≤ µ_φ ≤ φ(x̄, ȳ) + t_P(ν_eff) s*_φ .
For more than two variables, little is to be added. From (6.18) we derive the empirical variance

s²_φ = Σ_{i,j}^{m} (∂φ/∂x_i)(∂φ/∂x_j) s_ij = b^T s b ,   (6.28)

in which the vector b and the matrix s are given by

b = ( ∂φ/∂x_1  ∂φ/∂x_2  · · ·  ∂φ/∂x_m )^T   (6.29)

and

s = ( s_11 s_12 . . . s_1m
      s_21 s_22 . . . s_2m
      . . . . . . . . . . .
      s_m1 s_m2 . . . s_mm ) ,

s_ij = (1/(n − 1)) Σ_{l=1}^{n} (x_il − x̄_i)(x_jl − x̄_j) ;   i, j = 1, . . . , m ,   (6.30)

respectively. If need be, we shall denote the diagonal elements of s as s_ii ≡ s²_i. Furthermore, for reasons of symmetry, we have s_ij = s_ji. Finally, Student's confidence interval for the parameter µ_φ which has been defined in (6.16) turns out to be

φ̄(X_1, X_2, . . . , X_m) ± ( t_P(n − 1)/√n ) S_φ .   (6.31)

To assess the influence of systematic errors, we shall have to refer to (6.9) and (6.17).

² For example, the two series (x_1, y_1), (x_2, y_2), . . . and (x_1, y_2), (x_2, y_1), . . . would be equally admissible.
6.3 Uncertainty of the Arithmetic Mean
According to (6.8), the expectation µ_φ differs from the true value φ(x_0, y_0) by the propagated systematic error

f_φ = (∂φ/∂x) f_x + (∂φ/∂y) f_y ;   −f_s,x ≤ f_x ≤ f_s,x ;   −f_s,y ≤ f_y ≤ f_s,y ,   (6.32)

where neither the signs nor the magnitudes of f_x and f_y are known. As we wish to safely localize the true value φ(x_0, y_0), we opt for a worst case estimation, setting

f_s,φ = |∂φ/∂x| f_s,x + |∂φ/∂y| f_s,y   (6.33)

and quote the result of measurement as

φ(x̄, ȳ) ± u_φ

u_φ = ( t_P(n − 1)/√n ) √[ (∂φ/∂x)² s²_x + 2 (∂φ/∂x)(∂φ/∂y) s_xy + (∂φ/∂y)² s²_y ]

      + |∂φ/∂x| f_s,x + |∂φ/∂y| f_s,y .   (6.34)
We do not assign a probability to this statement. Considering m variables, we have

φ(x̄_1, x̄_2, . . . , x̄_m) ± u_φ

u_φ = ( t_P(n − 1)/√n ) √[ Σ_{i,j}^{m} (∂φ/∂x_i)(∂φ/∂x_j) s_ij ] + Σ_{i=1}^{m} |∂φ/∂x_i| f_s,i ,   (6.35)

expecting the so-defined interval to localize the true value φ(x_{0,1}, x_{0,2}, . . . , x_{0,m}).

The question as to whether worst case estimations of propagated systematic errors might provoke unnecessarily large uncertainties, or whether the relevant ISO Guide's quadratic estimations might, in the end, lead to unreliably small uncertainties, is still controversial. Here, worst case estimations and a linear combination of random and systematic errors are in any case preferred, in contrast to the Guide which, postulating theoretical variances for systematic errors, adds uncertainty components quadratically. The computer simulations in the next section will serve to scrutinize the differences.
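The two combination rules contrasted here can be put side by side in a short sketch, assuming NumPy; the function names and all numbers are illustrative only, not taken from the text.

```python
import math
import numpy as np

def u_worst_case(b, s, fs, n, t_p):
    """(6.35)-style uncertainty: Student factor on the random part,
    linear (worst case) sum over the systematic error limits fs."""
    b, s, fs = np.asarray(b, float), np.asarray(s, float), np.asarray(fs, float)
    return t_p / math.sqrt(n) * math.sqrt(b @ s @ b) + np.abs(b) @ fs

def u_quadratic(b, s_diag, fs, n, k_p):
    """ISO-style quadratic combination, treating each limit fs as a
    rectangular density with variance fs**2 / 3."""
    b, s_diag, fs = np.asarray(b, float), np.asarray(s_diag, float), np.asarray(fs, float)
    return k_p * math.sqrt(np.sum(b**2 * (s_diag / n + fs**2 / 3)))

# Illustrative numbers: two variables, uncorrelated, s_1^2 = 4, s_2^2 = 9.
b = [1.0, 1.0]
s = [[4.0, 0.0], [0.0, 9.0]]
fs = [1.0, 2.0]
u_lin = u_worst_case(b, s, fs, n=4, t_p=2.0)       # sqrt(13) + 3
u_quad = u_quadratic(b, [4.0, 9.0], fs, n=4, k_p=2.0)
print(u_lin > u_quad)
```

For these numbers the linear combination yields the larger, i.e. safer, interval, which is the behavior the simulations below examine systematically.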
There are useful supplements to (6.34) and (6.35). Remarkably enough, as −s_x s_y ≤ s_xy ≤ s_x s_y, the term due to random errors can be bounded as follows:

(∂φ/∂x)² s²_x + 2 (∂φ/∂x)(∂φ/∂y) s_xy + (∂φ/∂y)² s²_y

   ≤ (∂φ/∂x)² s²_x + 2 |∂φ/∂x||∂φ/∂y| s_x s_y + (∂φ/∂y)² s²_y

   = ( |∂φ/∂x| s_x + |∂φ/∂y| s_y )² ,   (6.36)
so that (6.34) simplifies to

u_φ = ( t_P(n − 1)/√n ) √[ (∂φ/∂x)² s²_x + 2 (∂φ/∂x)(∂φ/∂y) s_xy + (∂φ/∂y)² s²_y ] + |∂φ/∂x| f_s,x + |∂φ/∂y| f_s,y

    ≤ |∂φ/∂x| [ ( t_P(n − 1)/√n ) s_x + f_s,x ] + |∂φ/∂y| [ ( t_P(n − 1)/√n ) s_y + f_s,y ]

    = |∂φ/∂x| u_x + |∂φ/∂y| u_y ,

i.e.

u_φ ≤ |∂φ/∂x| u_x + |∂φ/∂y| u_y .   (6.37)

Obviously, it is of no significance that this expression is based upon a singular empirical variance–covariance matrix s.

A corresponding simplification is available for (6.35): considering two variables at a time, the respective empirical covariance may be treated just as in the case of two variables. This leads to
u_φ ≤ Σ_{i=1}^{m} |∂φ/∂x_i| u_{x_i} .   (6.38)

In a sense, this approach conceals the empirical covariances between the variables, saving us the necessity to calculate them afterwards from raw data, if need be. Evidently, just this latter property renders (6.38) particularly attractive. With this, we finish the development of the new formalism of error propagation. Its basic principles rely on

– stationarily operated, non-drifting measuring devices,
– decoupled random and systematic errors,
– linearized Taylor series expansion,
– well-defined measuring conditions,
– the propagation of random errors via Student's confidence intervals and of systematic errors by means of worst case estimations, and
– the linear combination of the uncertainty components due to random and systematic errors.
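The bound underlying (6.37) and (6.38) can be probed numerically; a sketch under the assumption that NumPy is available, with randomly generated data.

```python
import numpy as np

rng = np.random.default_rng(7)

# Check the random-error bound behind (6.37)/(6.38): the
# covariance-aware quadratic form b^T s b can never exceed the
# version built from |b_i| s_i alone, since -s_i s_j <= s_ij <= s_i s_j.
for _ in range(100):
    data = rng.standard_normal((3, 10))       # m = 3 variables, n = 10
    s = np.cov(data)                          # empirical matrix, cf. (6.30)
    b = rng.standard_normal(3)                # arbitrary sensitivities
    lhs = b @ s @ b                           # b^T s b, cf. (6.28)
    rhs = (np.abs(b) @ np.sqrt(np.diag(s)))**2
    assert lhs <= rhs + 1e-12
print("bound holds")
```

The inequality is a direct consequence of the Cauchy–Schwarz bound on empirical covariances, so the assertion succeeds for any data set.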
Fig. 6.1. Flow of random and systematic errors
Figure 6.1 presents a descriptive overview of the origin, the classification and the flow of errors entering measurement uncertainties.

The errors of physical constants have to be treated according to the particular role they play in the measuring process under consideration. If constants, akin to measurands, appear as arguments of functional relationships, their uncertainty components should be broken up into components according to whether they are of random or systematic origin. If, on the other hand, they are physically active parts of the measuring procedure, applying for example to material measures, their uncertainties will exclusively be effective as systematic errors. An example will be given in Sect. 12.5.

In general, error propagation covers several stages. As will be shown in Sect. 6.5, the error model under discussion steadily channels the flow of random and systematic errors.
6.4 Uncertainties of Elementary Functions
We shall illustrate the new formalism by means of some examples.
Sums and Differences
Given two measurands x and y, we combine the results

x̄ ± ( t_P(n − 1)/√n ) s_x   and   ȳ ± ( t_P(n − 1)/√n ) s_y   (6.39)
to form functional relationships
φ(x, y) = x ± y . (6.40)
With respect to the sum, φ(x, y) = x + y, the error equations yield

φ(x_l, y_l) = x_0 + y_0 + (x_l − µ_x) + (y_l − µ_y) + f_x + f_y   (6.41)

and

φ(x̄, ȳ) = x_0 + y_0 + (x̄ − µ_x) + (ȳ − µ_y) + f_x + f_y   (6.42)

respectively. Then, subtracting (6.42) from (6.41), we obtain

φ(x_l, y_l) − φ(x̄, ȳ) = (x_l − x̄) + (y_l − ȳ) .   (6.43)

The expected value of (6.41),

µ_φ = E{φ(X, Y)} = x_0 + y_0 + f_x + f_y ,   (6.44)

reveals a propagated systematic error

f_φ = f_x + f_y ;   −f_s,x ≤ f_x ≤ f_s,x ;   −f_s,y ≤ f_y ≤ f_s,y .   (6.45)
Furthermore, from (6.43) we obtain an estimator

s²_φ = (1/(n − 1)) Σ_{l=1}^{n} [ φ(x_l, y_l) − φ(x̄, ȳ) ]² = s²_x + 2 s_xy + s²_y   (6.46)

for the theoretical variance σ²_φ = E{(φ(X, Y) − µ_φ)²}, so that the final result is

φ(x̄, ȳ) ± u_φ

u_φ = ( t_P(n − 1)/√n ) √( s²_x + 2 s_xy + s²_y ) + (f_s,x + f_s,y) .   (6.47)
Now, in terms of a test of hypothesis, we ask for the compatibility of two arithmetic means, x̄ and ȳ, aiming at one and the same physical quantity, i.e. we are looking for a maximum tolerable deviation |x̄ − ȳ|. Setting

φ(x_l, y_l) − φ(x̄, ȳ) = (x_l − x̄) − (y_l − ȳ)   (6.48)

we find

s²_φ = s²_x − 2 s_xy + s²_y   (6.49)

and, as

f_φ = f_x − f_y ;   −f_s,x ≤ f_x ≤ f_s,x ;   −f_s,y ≤ f_y ≤ f_s,y ,   (6.50)

these two means are compatible if

|x̄ − ȳ| ≤ ( t_P(n − 1)/√n ) √( s²_x − 2 s_xy + s²_y ) + (f_s,x + f_s,y) .   (6.51)

Condoning a singular empirical variance–covariance matrix, a coarser estimation would be

|x̄ − ȳ| ≤ u_x + u_y .   (6.52)

Within the scope of so-called round robins, Sect. 12.1, one and the same measuring object is circulated amongst various laboratories. Each of the participants measures the same physical quantity. Assuming this quantity remains constant during the whole period of circulation, any two uncertainties taken at a time should mutually overlap.
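The coarse compatibility criterion (6.52) is easily stated in code; the helper name and the round-robin numbers below are hypothetical.

```python
def means_compatible(xbar, ybar, ux, uy):
    """Coarse compatibility test (6.52) for two arithmetic means
    aiming at the same physical quantity: their deviation must not
    exceed the sum of the individual overall uncertainties."""
    return abs(xbar - ybar) <= ux + uy

# Hypothetical round-robin values:
print(means_compatible(10.02, 9.97, 0.03, 0.03))   # overlap: compatible
print(means_compatible(10.02, 9.90, 0.03, 0.03))   # no overlap: discrepant
```

The sharper test (6.51) would additionally require the raw data, since it involves the empirical covariance s_xy.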
Example
We estimate the uncertainty u_c of the correction c to the ampere, given the uncertainties u_a and u_b of the corrections a and b to the volt and the ohm. In Sect. 5.2 we had 1 V = (1 + a) V_L; 1 Ω = (1 + b) Ω_L; 1 A = (1 + c) A_L. From

c = a − b

we deduce

u_c = ( t_P(n − 1)/√n ) √( s²_a − 2 s_ab + s²_b ) + (f_s,a + f_s,b) .
Products and Quotients
Linearizing the product

φ(x, y) = xy   (6.53)

yields

φ(x_l, y_l) − φ(x̄, ȳ) = ȳ (x_l − x̄) + x̄ (y_l − ȳ) ;   l = 1, 2, . . . , n

so that

s²_φ = ȳ² s²_x + 2 x̄ ȳ s_xy + x̄² s²_y .   (6.54)

As

f_φ = ȳ f_x + x̄ f_y ,   (6.55)

we have

f_s,φ = |ȳ| f_s,x + |x̄| f_s,y ,   (6.56)

so that the true value φ(x_0, y_0) = x_0 y_0 should lie within

φ(x̄, ȳ) ± u_φ

u_φ = ( t_P(n − 1)/√n ) √( ȳ² s²_x + 2 x̄ ȳ s_xy + x̄² s²_y ) + ( |ȳ| f_s,x + |x̄| f_s,y ) .   (6.57)
Linearizing the quotient

φ(x, y) = x/y   (6.58)

leads to

φ(x_l, y_l) − φ(x̄, ȳ) = (1/ȳ)(x_l − x̄) − (x̄/ȳ²)(y_l − ȳ) ;   l = 1, 2, . . . , n .

Thus, the empirical variance follows from

s²_φ = (1/ȳ²) s²_x − 2 (x̄/ȳ³) s_xy + (x̄²/ȳ⁴) s²_y .   (6.59)
Considering the propagated systematic error

f_φ = (1/ȳ) f_x − (x̄/ȳ²) f_y ,   (6.60)

we have

φ(x̄, ȳ) ± u_φ   (6.61)

u_φ = ( t_P(n − 1)/√n ) √[ (1/ȳ²) s²_x − 2 (x̄/ȳ³) s_xy + (x̄²/ȳ⁴) s²_y ] + ( (1/|ȳ|) f_s,x + (|x̄|/ȳ²) f_s,y ) .

Given that X and Y are taken from standardized normal distributions, the random variable U = X/Y follows a Cauchy density. Then, as is known, E{U} and E{U²} would not exist. However, we confine ourselves to a non-pathological behavior of the quotient X/Y, so that the φ(x_l, y_l); l = 1, . . . , n as defined by the truncated Taylor series may be considered to be (approximately) normally distributed.
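A sketch of the quotient case, following the linearized formulas above; the function name and the numbers are illustrative only.

```python
import math

def u_quotient(xbar, ybar, s2x, s2y, sxy, fsx, fsy, n, t_p):
    """Uncertainty of phi = x/y, cf. (6.61): linearized random part
    including the empirical covariance, worst-case systematic part."""
    var = s2x / ybar**2 - 2 * xbar / ybar**3 * sxy + xbar**2 / ybar**4 * s2y
    random_part = t_p / math.sqrt(n) * math.sqrt(var)
    systematic_part = fsx / abs(ybar) + abs(xbar) / ybar**2 * fsy
    return random_part + systematic_part

# Illustrative numbers:
u = u_quotient(xbar=1.0, ybar=2.0, s2x=4e-4, s2y=4e-4,
               sxy=0.0, fsx=1e-3, fsy=1e-3, n=4, t_p=2.0)
print(u)
```

Note that a positive covariance s_xy reduces the random part of a quotient, in contrast to the product case (6.57).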
Examples
We estimate the uncertainty of the conversion factor K_A, given the conversion factors K_V ± u_{K_V} and K_Ω ± u_{K_Ω}. From

K_A = K_V / K_Ω

we obtain

u_{K_A} = ( t_P(n − 1)/√n ) √[ (1/K_Ω²) s²_{K_V} − 2 (K_V/K_Ω³) s_{K_V K_Ω} + (K_V²/K_Ω⁴) s²_{K_Ω} ]

          + [ (1/K_Ω) f_{s,K_V} + (K_V/K_Ω²) f_{s,K_Ω} ] .
We consider the Michelson–Morley experiment and estimate the uncertainty u_V of the displacement

V = 2L (u/c)²

that the interference fringes should exhibit after a 90° rotation of the interferometer. The quantity u denotes the Earth's velocity relative to the postulated ether, c the velocity of light and L the distance between either of the two fully silvered mirrors and the central half-silvered mirror. As u/c ≈ 10⁻⁴, we observe V ≈ 2 × 10⁻⁸ L. Setting e.g. L = 2 × 10⁷ λ, the displacement should be V ≈ 0.4 λ. Quite contrary to today's concepts, we naively consider c a measurand. Thus
u_V = ( t_P(n − 1)/√n )

      × √[ (V/L)² s²_L + (2V/u)² s²_u + (2V/c)² s²_c + 4 (V²/(Lu)) s_Lu − 8 (V²/(uc)) s_uc − 4 (V²/(Lc)) s_Lc ]

      + [ (V/L) f_s,L + (2V/u) f_s,u + (2V/c) f_s,c ] .
Though Michelson and Morley relied on a different approach to estimate the uncertainty of the displacement V, their result will endure.³ This might willingly be taken as an opportunity to play down the importance of measurement uncertainties in general. Nevertheless, we are sure that scientific discoveries rely on vast fields of a priori knowledge, i.e. sound theoretical frameworks and decades of metrological experience, as well as the understanding that any scientific concept has to be subjected to an all-decisive comparison between theory and experiment, so that, in the last analysis, it is the measurement uncertainty which is responsible for the result.

Comparisons will be particularly critical if experiments operate close to the limits of verifiability and if their outcomes have far-reaching physical consequences – which applies, for example, to the still pending question of the correct neutrino model.
Simulation
As the true values of the measurands are unknown, there is no way to test a formalism estimating measurement uncertainties which relies on real, physically obtained data. Nevertheless, we are in a position to design numerical tests based on simulated data, since under conditions of simulation the true values are necessarily part of the model, and thus known a priori. We subject both formalisms, the one proposed here and that of the ISO Guide, to simulative tests. We “measure” the area of a rectangle whose side lengths are chosen to be x_0 = 1 cm and y_0 = 2 cm, Fig. 6.2. In each case, we consider n = 10 repeat measurements for x and y respectively. Let the intervals limiting the systematic errors be
−10 µm ≤ fx ≤ 10 µm , −10 µm ≤ fy ≤ 10 µm .
³ The interpretation of the observation is as follows. If the arm of the interferometer lying parallel to the direction of motion of the Earth is shortened by a factor of [1 − (u/c)²], the displacement V vanishes numerically.
Fig. 6.2. Simulation of the measurement of the area of a rectangle
We adjust the variances σ²_x, σ²_y of the simulated, normally distributed random errors to be equal to the postulated ISO variances of the systematic errors f_x, f_y,

σ²_x = f²_s,x / 3   and   σ²_y = f²_s,y / 3 ,

referring to rectangular densities over the two intervals given above.

According to the error model under discussion, the uncertainty u_xy of the area xy is given by
u_xy = ( t_P(n − 1)/√n ) √( ȳ² s²_x + 2 x̄ ȳ s_xy + x̄² s²_y ) + |ȳ| f_s,x + |x̄| f_s,y ,   (6.62)

while the ISO Guide [36] would yield

u_xy = k_P √{ ȳ² [ s²_x/n_x + f²_s,x/3 ] + x̄² [ s²_y/n_y + f²_s,y/3 ] } .   (6.63)

The error limits f_s,x = 10 µm and f_s,y = 10 µm enter (6.62) and (6.63). But, to simulate the “measuring data”, we choose any values f_x, f_y from the pertaining intervals ±f_s,x, ±f_s,y and superpose them on the simulated data. Let us consider some special cases. If f_x and f_y

– have different signs, we shall speak of error compensation,
– have the same sign, of error intensification,
– are permuted, of a test for symmetry.
As to ISO uncertainties, k_P = 1 is recommended for purposes of high-level metrology and k_P = 2 for standard cases. Within the alternative formalism we set t_68%(9) = 1.04 and t_95%(9) = 2.26, thus allowing meaningful comparisons.

Every “measurement” yields a result x̄ȳ ± u_xy. However, to arrive at statistically representative statements, we carry out 1000 complete cycles of measurements at a time, i.e. for any given choice of f_x = const. and f_y = const., we simulate 1000 new sets of random errors.

If k_P = 1, in the case of error intensification, the ISO model turns out to be unfit. Remarkably enough, the test for symmetry delivers only 59 positive results or successes, given that f_x = 10 µm and f_y = 0 µm. However, if f_x = 0 µm and f_y = 10 µm, we observe 814 successes. This observation is due to the disparity of the partial derivatives ȳ ≈ 2 cm and x̄ ≈ 1 cm which, in a sense, “weight” the errors.

The situation improves on setting k_P = 2. This does not, however, apply to the most unfavorable case f_x = f_y = 10 µm.

Clearly, as Tables 6.1 and 6.2 illustrate, occasionally used terms like “optimistically” or “pessimistically” estimated measurement uncertainties do not have any specific meaning. We neither give information away when quoting safe estimates nor regain information otherwise lost by keeping the measurement uncertainties tight. Measurement uncertainties should rather be simply objective.
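The simulation just described can be sketched as follows, assuming NumPy; the seed and the intensification case f_x = f_y = f_s are chosen for illustration, so the counts will only approximately match the tables.

```python
import numpy as np

rng = np.random.default_rng(0)

x0, y0 = 1.0, 2.0               # true side lengths in cm
fs = 10e-4                      # error limit, 10 µm in cm
sigma = fs / np.sqrt(3)         # random-error sigma, as set in the text
n, cycles = 10, 1000
fx, fy = fs, fs                 # error intensification, worst case

def one_cycle():
    x = x0 + fx + sigma * rng.standard_normal(n)
    y = y0 + fy + sigma * rng.standard_normal(n)
    xb, yb = x.mean(), y.mean()
    sx2, sy2 = x.var(ddof=1), y.var(ddof=1)
    sxy = np.cov(x, y)[0, 1]
    # alternative model (6.62) with t_95%(9) = 2.26
    u_alt = (2.26 / np.sqrt(n)
             * np.sqrt(yb**2 * sx2 + 2 * xb * yb * sxy + xb**2 * sy2)
             + abs(yb) * fs + abs(xb) * fs)
    # ISO model (6.63) with k_P = 2, n_x = n_y = n
    u_iso = 2.0 * np.sqrt(yb**2 * (sx2 / n + fs**2 / 3)
                          + xb**2 * (sy2 / n + fs**2 / 3))
    dev = abs(xb * yb - x0 * y0)
    return dev <= u_alt, dev <= u_iso

hits_alt = hits_iso = 0
for _ in range(cycles):
    ok_alt, ok_iso = one_cycle()
    hits_alt += ok_alt
    hits_iso += ok_iso
print(hits_alt, hits_iso)
```

In this intensification case the worst-case model localizes the true area far more often than the quadratic ISO combination, in line with the counts reported in the tables.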
Table 6.1. Localization of true values: ISO Guide versus alternative approach

1000 simulations, k_P = 1, t_68%(9) = 1.04

                  f_x (µm)  f_y (µm)   ISO: successes  failures   alternative: successes  failures
compensation        10       −10            815          185              1000               0
                     5        −5            986           14              1000               0
intensification     10        10              0         1000               842             158
                     5         5            363          637              1000               0
symmetry            10         0             59          941              1000               0
                     0        10            814          186              1000               0
                     5         0            796          204              1000               0
                     0         5            978           22              1000               0
6.5 Error Propagation over Several Stages
Even in experimentally simple situations, quantities to be linked up may present themselves as results of previous measurements and other functional relationships; thus φ(x, y) = x/y could be a first and Γ[y, φ(x, y)] = y exp[φ(x, y)] a second combination. We consider the uncertainty u_Γ, referring again to truncated series expansions and the notation Γ = Γ[y, φ(x, y)].

Let us initially linearize φ(x, y),

φ(x_l, y_l) = φ(x_0, y_0) + [ (∂φ/∂x)(x_l − µ_x) + (∂φ/∂y)(y_l − µ_y) ] + [ (∂φ/∂x) f_x + (∂φ/∂y) f_y ] ,

φ(x̄, ȳ) = φ(x_0, y_0) + [ (∂φ/∂x)(x̄ − µ_x) + (∂φ/∂y)(ȳ − µ_y) ] + [ (∂φ/∂x) f_x + (∂φ/∂y) f_y ] .

The expectation

µ_φ = φ(x_0, y_0) + (∂φ/∂x) f_x + (∂φ/∂y) f_y   (6.64)

reveals a propagated systematic error of the kind

f_φ = (∂φ/∂x) f_x + (∂φ/∂y) f_y .   (6.65)
We might wish to estimate fφ as
Table 6.2. Localization of true values: ISO Guide versus alternative approach

1000 simulations, k_P = 2, t_95%(9) = 2.26

                  f_x (µm)  f_y (µm)   ISO: successes  failures   alternative: successes  failures
compensation        10       −10           1000            0              1000               0
                     5        −5           1000            0              1000               0
intensification     10        10            232          768               974              26
                     5         5            999            1              1000               0
symmetry            10         0            996           34              1000               0
                     0        10           1000            0              1000               0
                     5         0           1000            0              1000               0
                     0         5           1000            0              1000               0
88 6 Propagation of Measurement Errors
−f_s,φ ≤ f_φ ≤ f_s,φ ,   f_s,φ = |∂φ/∂x| f_s,x + |∂φ/∂y| f_s,y ,   (6.66)

which, however, should be postponed. Let us first refer to

φ(x_l, y_l) − φ(x̄, ȳ) = [ (∂φ/∂x)(x_l − x̄) + (∂φ/∂y)(y_l − ȳ) ] .   (6.67)
Now, we expand Γ[y, φ(x, y)] twofold, abbreviating φ_l = φ(x_l, y_l) and φ_0 = φ(x_0, y_0),

Γ(y_l, φ_l) = Γ(y_0, φ_0) + [ (∂Γ/∂y)(y_l − µ_y) + (∂Γ/∂φ)(φ_l − µ_φ) ] + [ (∂Γ/∂y) f_y + (∂Γ/∂φ) f_φ ] ,

Γ(ȳ, φ̄) = Γ(y_0, φ_0) + [ (∂Γ/∂y)(ȳ − µ_y) + (∂Γ/∂φ)(φ̄ − µ_φ) ] + [ (∂Γ/∂y) f_y + (∂Γ/∂φ) f_φ ] .

This leads us to

Γ(y_l, φ_l) − Γ(ȳ, φ̄) = [ (∂Γ/∂y)(y_l − ȳ) + (∂Γ/∂φ)(φ_l − φ̄) ]   (6.68)

and

f_Γ = [ (∂Γ/∂y) f_y + (∂Γ/∂φ) f_φ ]   (6.69)
respectively. Finally, we insert (6.67) into (6.68),

Γ(y_l, φ_l) − Γ(ȳ, φ̄) = [ (∂Γ/∂φ)(∂φ/∂x)(x_l − x̄) + ( ∂Γ/∂y + (∂Γ/∂φ)(∂φ/∂y) )(y_l − ȳ) ] ,   (6.70)

and (6.65) into (6.69),

f_Γ = (∂Γ/∂φ)(∂φ/∂x) f_x + [ ∂Γ/∂y + (∂Γ/∂φ)(∂φ/∂y) ] f_y .   (6.71)
It is only now that we should estimate the propagated systematic error,
−f_s,Γ ≤ f_Γ ≤ f_s,Γ ,   f_s,Γ = |(∂Γ/∂φ)(∂φ/∂x)| f_s,x + | ∂Γ/∂y + (∂Γ/∂φ)(∂φ/∂y) | f_s,y .   (6.72)

If, instead, we had subjected (6.69) to a worst case estimation

f*_s,Γ = |∂Γ/∂y| f_s,y + |∂Γ/∂φ| f_s,φ
and had done the same with (6.65), we would have got
f*_s,Γ = |(∂Γ/∂φ)(∂φ/∂x)| f_s,x + [ |∂Γ/∂y| + |(∂Γ/∂φ)(∂φ/∂y)| ] f_s,y .
Obviously, this would be an inconsistent worst case estimation, conflicting with the concepts of error propagation provided by the alternative error model. In other words, we would have overlooked a dependency and, as a consequence, estimated too large a systematic error.
Referring to the abbreviations

b_x = (∂Γ/∂φ)(∂φ/∂x)   and   b_y = ∂Γ/∂y + (∂Γ/∂φ)(∂φ/∂y) ,

(6.70) and (6.71) yield

s²_Γ = b²_x s²_x + 2 b_x b_y s_xy + b²_y s²_y   (6.73)

and

f_Γ = b_x f_x + b_y f_y   (6.74)

respectively, so that

Γ(ȳ, φ̄) ± u_Γ   (6.75)

u_Γ = ( t_P(n − 1)/√n ) √( b²_x s²_x + 2 b_x b_y s_xy + b²_y s²_y ) + [ |b_x| f_s,x + |b_y| f_s,y ] .
Just to prove the efficiency of this approach, we consider another relationship, namely

Γ[ φ(x, y, . . .), ψ(p, q, . . .), . . . ] .

Confining ourselves to two measurands, from

Γ(φ_l, ψ_l) = Γ(φ_0, ψ_0) + (∂Γ/∂φ)(φ_l − µ_φ) + (∂Γ/∂ψ)(ψ_l − µ_ψ) + (∂Γ/∂φ) f_φ + (∂Γ/∂ψ) f_ψ ,

Γ(φ̄, ψ̄) = Γ(φ_0, ψ_0) + (∂Γ/∂φ)(φ̄ − µ_φ) + (∂Γ/∂ψ)(ψ̄ − µ_ψ) + (∂Γ/∂φ) f_φ + (∂Γ/∂ψ) f_ψ ,

we deduce

Γ(φ_l, ψ_l) − Γ(φ̄, ψ̄) = (∂Γ/∂φ)(φ_l − φ̄) + (∂Γ/∂ψ)(ψ_l − ψ̄) ,

and, furthermore,

f_Γ = (∂Γ/∂φ) f_φ + (∂Γ/∂ψ) f_ψ .
With

φ_l − φ̄ = [ (∂φ/∂x)(x_l − x̄) + (∂φ/∂y)(y_l − ȳ) ] ;   f_φ = (∂φ/∂x) f_x + (∂φ/∂y) f_y ,

ψ_l − ψ̄ = [ (∂ψ/∂p)(p_l − p̄) + (∂ψ/∂q)(q_l − q̄) ] ;   f_ψ = (∂ψ/∂p) f_p + (∂ψ/∂q) f_q

and

b_x = (∂Γ/∂φ)(∂φ/∂x) ,   b_y = (∂Γ/∂φ)(∂φ/∂y) ,   b_p = (∂Γ/∂ψ)(∂ψ/∂p) ,   b_q = (∂Γ/∂ψ)(∂ψ/∂q) ,

we find

Γ(φ_l, ψ_l) − Γ(φ̄, ψ̄) = b_x (x_l − x̄) + b_y (y_l − ȳ) + b_p (p_l − p̄) + b_q (q_l − q̄)

and

f_Γ = b_x f_x + b_y f_y + b_p f_p + b_q f_q
respectively. Obviously, that part of the overall uncertainty u_Γ which is due to random errors includes the empirical covariances s_xy, s_xp, s_xq, s_yp, s_yq, s_pq, and this in turn means that the evaluation procedures rely on the original series of measurements, which, hopefully, will have been saved by the laboratories.
Part II
Least Squares Adjustment
7 Least Squares Formalism
7.1 Geometry of Adjustment
In a sense, the least squares adjustment of an inconsistent, overdetermined linear system relies on the idea of orthogonal projection. To illustrate this, let us consider a source emitting parallel light falling perpendicularly onto a screen. A rod held obliquely over the screen then casts a shadow, Fig. 7.1. This scenario may be considered an equivalent of the orthogonal projection of the method of least squares.

Let us develop this idea to solve the following problem. Given n repeat measurements

x_1, x_2, . . . , x_n ,

how can we find an estimator β̄ for the unknown true value x_0? Referring n times to the error equations (2.3), we are faced with

x_1 = x_0 + ε_1 + f
x_2 = x_0 + ε_2 + f
· · · · · · · · · · · ·
x_n = x_0 + ε_n + f .   (7.1)
By means of an auxiliary vector
Fig. 7.1. The method of least squares, presented as an orthogonal projection of a rod held obliquely over a screen
Fig. 7.2. Vector x of input data, vector r of residuals and true solution vector x0
a = (1 1 · · · 1)T ,
we may formally define a true solution vector x0 = ax0. Putting
x = (x1 x2 · · · xn)T and ε = (ε1 ε2 · · · εn)T
(7.1) takes the form
x = x0 + ε + f .
As β_0 is unknown, we introduce a variable β instead:

β ≈ x_1 ,   β ≈ x_2 , . . . ,   β ≈ x_n ,

or, in vector form,

(1 1 · · · 1)^T β ≈ (x_1 x_2 · · · x_n)^T ,   i.e.   a β ≈ x .   (7.2)

If x had no errors, we would have a β_0 = x_0, i.e. β_0 = x_0. In a sense, we may understand the vector a to span a one-dimensional space holding the true solution vector x_0.

As there are measurement errors, we purposefully quantify the discrepancy between x and x_0 by the vector

r = x − x_0 ,

the components of which are called residuals, Fig. 7.2. As the vector space a holds the true solution vector x_0, we assume that just this space will hold the approximation to (7.2) we are looking for. Indeed, the method of least squares “solves” the inconsistent system (7.2) by orthogonally projecting x onto a ∥ x_0. With P denoting the appropriate projection operator, we have

a β̄ = P x .   (7.3)
Fig. 7.3. Orthogonal projection P x of x onto a; r̄ vector of residuals of minimal length; β̄ stretch factor
The stretch factor β̄ by which a has to be multiplied so that (7.3) is fulfilled is considered to be the least squares estimator of the inconsistent system (7.2). Figure 7.3 illustrates the proposal. We prove:

The orthogonal projection of x onto a minimizes the Euclidean norm of r.

To this end, an arbitrary vector of residuals r = x − a β is compared with the vector of residuals due to the orthogonal projection,

r̄ = x − a β̄ .

The identity

r = x − a β = (x − a β̄) + a (β̄ − β)

yields

r^T r = r̄^T r̄ + a^T a (β̄ − β)² ,

i.e.

r^T r ≥ r̄^T r̄ .

Usually, this result is established by differentiating an expression of the form

Q = (x_1 − β)² + (x_2 − β)² + · · · + (x_n − β)²

with respect to β. This sum, called the sum of squared residuals, is obviously obtained by squaring and summing up the components of the vector r = x − a β. However, here we prefer to trace the method of least squares back to the idea of orthogonal projections. From

a^T r̄ = a^T (x − a β̄) = 0 ,   i.e.   a^T a β̄ = a^T x ,   (7.4)
we infer

β̄ = (a^T a)⁻¹ a^T x .   (7.5)

Hence, in this introductory example, the required least squares estimator is nothing but the arithmetic mean

β̄ = (1/n) Σ_{l=1}^{n} x_l

of the n observations x_l; l = 1, . . . , n. Let us summarize: a and x are vectors of an n-dimensional space. The orthogonal projection of x onto the one-dimensional space spanned by a, presented in (7.3), has produced the stretch factor or estimator β̄ = x̄. Thus, the orthogonal projection has “solved” the inconsistent system (7.2) in terms of least squares.

To find the projection operator P, we multiply a^T a β̄ = a^T x on the left by a (a^T a)⁻¹,

a β̄ = a (a^T a)⁻¹ a^T x ,

so that

P = a (a^T a)⁻¹ a^T .   (7.6)

In this particular case, all elements of P exhibit the same value 1/n. Multiplying (7.3) on the left by a^T yields

a^T a β̄ = a^T P x = a^T x ,   i.e.   β̄ = (a^T a)⁻¹ a^T x .

The method of least squares cannot do wonders: all it can do is to transfer the approximation

a β ≈ x ,

say by means of a gimmick, into a formally self-consistent statement, namely

a β̄ = P x .

The input data need not fulfil any presumptions at all. The orthogonal projection “adjusts” any kind of contradiction in like manner, so that fatal errors of reasoning and the finest, metrologically unavoidable errors of measurement are treated side by side on the same level.

The vector of residuals, r̄ = x − a β̄, is in no way related to physical errors. As r̄ is due to an orthogonal projection, its components are mathematically defined auxiliary quantities. However, the components of the vector r = x − x_0 are physically meaningful error sums.
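The introductory example is small enough to verify directly; a sketch assuming NumPy, with illustrative data.

```python
import numpy as np

# For a = (1 ... 1)^T, the estimator (7.5) is the arithmetic mean,
# every element of the projection operator (7.6) equals 1/n, and the
# residual vector is orthogonal to a, cf. (7.4).
x = np.array([1.9, 2.1, 2.0, 2.2, 1.8])
n = len(x)
a = np.ones(n)

beta_bar = (a @ x) / (a @ a)          # (a^T a)^{-1} a^T x
P = np.outer(a, a) / (a @ a)          # a (a^T a)^{-1} a^T
r_bar = x - a * beta_bar

print(abs(beta_bar - x.mean()) < 1e-12)   # estimator = mean
print(np.allclose(P, 1.0 / n))            # all elements of P are 1/n
print(abs(a @ r_bar) < 1e-12)             # residuals orthogonal to a
```
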
7.2 Unconstrained Adjustment
Let us assume that the measuring problem has led us to a set of linear relationships

a_11 β_1 + a_12 β_2 + · · · + a_1r β_r = x_1
a_21 β_1 + a_22 β_2 + · · · + a_2r β_r = x_2
· · · · · · · · · · · · · · · · · · · · ·
a_m1 β_1 + a_m2 β_2 + · · · + a_mr β_r = x_m .   (7.7)

To cast this in matrix notation, we introduce an (m × r) matrix A of coefficients a_ik, an (m × 1) column vector x assembling the right-hand sides x_i, and, finally, an (r × 1) column vector β of unknowns β_k. From

A = ( a_11 a_12 . . . a_1r
      a_21 a_22 . . . a_2r
      . . . . . . . . . . .
      a_m1 a_m2 . . . a_mr ) ,   x = ( x_1 x_2 . . . x_m )^T ,   β = ( β_1 β_2 . . . β_r )^T

we find

A β = x .
This system is solvable if and only if the vector x may be expressed as a linear combination of the column vectors of A,

a_k = ( a_1k a_2k . . . a_mk )^T ;   k = 1, . . . , r ,

so that

β_1 a_1 + β_2 a_2 + · · · + β_r a_r = x .

If this applies, x lies inside the column space of A; if not, the linear system is inconsistent or algebraically unsolvable.

Any experimental implementation of the relationships (7.7), being conceptually exact, entails measurement errors, so that

A β ≈ x ,   (7.8)

which means that x is outside the column space of A. To quantify the discrepancy, we again refer to a vector r of residuals,
Fig. 7.4. The vector r̄ of residuals is perpendicular to the r-dimensional subspace spanned by the column vectors a_k ; k = 1, . . . , r
r = x − Aβ .
As has been discussed, we require the solution vector β̄ to minimize the sum of squared residuals.

In the following, we assume

m > r   and   rank(A) = r .

The vectors a_k; k = 1, . . . , r define an r-dimensional subspace of an m-dimensional space. Of course, x is an element of the latter but, on account of measurement errors, not an element of the former. By means of a projection operator P, the method of least squares throws x orthogonally onto the subspace spanned by the r column vectors a_k and substitutes the projection P x for the erroneous vector x so that, as shown in Fig. 7.4,

A β̄ = P x .   (7.9)

Obviously, the vector P x so defined is a linear combination of the column vectors a_k. Hence, (7.9) is self-consistent and solvable. Its solution vector β̄ solves (7.8) in terms of least squares. The components β̄_k; k = 1, . . . , r of β̄ may be seen to stretch the column vectors a_k of A, rendering (7.9) possible.

Due to the orthogonality of the projection of x, the vector of residuals

r̄ = x − A β̄   (7.10)

is perpendicular to each of the column vectors of A,

A^T r̄ = 0 .
Inserting (7.10), we obtain

A^T A β̄ = A^T x ,

so that

β̄ = B^T x ,   (7.11)

where, to abbreviate,

B = A (A^T A)⁻¹   (7.12)

has been set. The definition of projection operators is independent of the method of least squares, see Appendix D. For now we are content with a heuristic derivation of P. Multiplying A^T A β̄ = A^T x on the left by B yields A β̄ = A (A^T A)⁻¹ A^T x, i.e.

P = A (A^T A)⁻¹ A^T .   (7.13)

By means of the identity

r = x − A β = (x − A β̄) + A (β̄ − β)

we again prove that the Euclidean norm

r^T r = r̄^T r̄ + (β̄ − β)^T A^T A (β̄ − β)

of any vector of residuals r = x − A β can never be smaller than the Euclidean norm of the vector of residuals r̄ resulting from the orthogonal projection, i.e.

r^T r ≥ r̄^T r̄ .

The product A^T A is invertible if the column vectors of the design matrix A are independent, i.e. if A has rank r, see Appendix A.

Uncertainty assignments ask for well-defined reference values. Consequently, for any empirical input vector x obtained metrologically, we have to introduce a true vector x_0, the components x_{0,i} of which are the true values of the measurands x_i,

x_0 = ( x_{0,1} x_{0,2} · · · x_{0,m} )^T .

If x_0 were known, r equations would be sufficient to determine the r unknown components β_{0,k} of the true solution vector β_0. But assuming the linear system to be both physically and mathematically well defined, we may nevertheless deduce β_0 referring to all m > r equations,

β_0 = (A^T A)⁻¹ A^T x_0 .

As A is rectangular, an inverse is not defined. However, we may take (A^T A)⁻¹ A^T as a pseudo-inverse.
7.3 Constrained Adjustment
We demand the least squares solution vector β̄ of the overdetermined, inconsistent linear system

A β ≈ x   (7.14)

not only to reconcile the linear system with the measurement errors of the input data, but also to consistently comply with a given set of linear constraints of the form

H β = y ,   (7.15)

where H is a (q × r) matrix of rank q. The least squares vector β̄ is required to satisfy H β̄ = y. The problem-solving procedure depends on the rank of the system matrix A. Initially, we assume

rank(A) = r .

Certainly, the projection operator P = A (A^T A)⁻¹ A^T used until now would produce a vector β̄ that is incompatible with the constraints. Consequently, we have to modify the quiddity of the orthogonal projection. Indeed, as will be shown in Appendix E, a projection operator of the form

P = A (A^T A)⁻¹ A^T − C (C^T C)⁻¹ C^T

meets the purpose. The least squares solution vector is

β̄ = (A^T A)⁻¹ A^T x − (A^T A)⁻¹ H^T [ H (A^T A)⁻¹ H^T ]⁻¹ [ H (A^T A)⁻¹ A^T x − y ] ,   (7.16)

exactly fulfilling H β̄ = y. Now consider

rank(A) = r′ < r .
We add q = r − r′ constraints

H β = y ,   (7.17)

where rank(H) = q, and combine the two systems (7.14) and (7.17). Though the product A^T A is not invertible, the sum of products A^T A + H^T H is. The least squares solution vector turns out to be

β̄ = (A^T A + H^T H)⁻¹ (A^T x + H^T y) ,   (7.18)

satisfying H β̄ = y.
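Formula (7.16) can likewise be exercised on a toy system; a sketch assuming NumPy, with an illustrative constraint.

```python
import numpy as np

# Constrained least squares, cf. (7.16); all numbers are illustrative.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0],
              [1.0, 2.0, 1.0]])
x = np.array([2.1, 2.9, 3.0, 6.1])
H = np.array([[1.0, 1.0, 1.0]])       # one constraint: sum of the betas
y = np.array([6.0])

G = np.linalg.inv(A.T @ A)
beta_u = G @ A.T @ x                  # unconstrained solution
M = np.linalg.inv(H @ G @ H.T)
beta = beta_u - G @ H.T @ M @ (H @ beta_u - y)   # (7.16)

print(np.allclose(H @ beta, y))       # constraint fulfilled exactly
# Optimality: the gradient A^T (A beta - x) lies in the row space of H,
# which is the Lagrange condition for a constrained minimum.
g = A.T @ (A @ beta - x)
lam = np.linalg.solve(H @ H.T, H @ g)
print(np.allclose(H.T @ lam, g))
```
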
8 Consequences of Systematic Errors
8.1 Structure of the Solution Vector
In the following we shall confine ourselves to unconstrained adjustments, i.e. to linear systems of the kind (7.8),

A β ≈ x .   (8.1)

With respect to the assessment of uncertainties, constrained adjustments would not introduce new aspects. As a matter of principle, any vector x of input data could be subjected to a least squares adjustment, given that the matrix A has full rank and there is a true vector x_0 underlying the erroneous input data, i.e. if alongside (8.1) there is a consistent system of the form

A β_0 = x_0 .

While the geometry of the least squares adjustment distinguishes the errors of the input data neither with respect to their causes nor with respect to their properties, as it is but an orthogonal projection, the situation becomes different when we intend to estimate measurement uncertainties.
The column vector
x = (x1 x2 · · · xm)T
contains m erroneous input data
xi = x0,i + (xi − µi) + fi ; i = 1, . . . , m . (8.2)
The x_{0,i} denote the true values of the observations x_i, the differences (x_i − µ_i) their random errors ε_i and, finally, the f_i systematic errors which we consider to be constant in time. Further, we assume the ε_i to be independent and normally distributed. At present we shall abstain from repeat measurements.
The expected values of the random variables Xi; i = 1, . . . , m are given by
µi = E Xi = x0,i + fi ; i = 1, . . . , m . (8.3)
Converting (8.2) into vector form yields
Fig. 8.1. The least squares solution vector β is given by the sum of the true solution vector β0 and the vectors BT(x − µ) and BTf being due to random and systematic errors respectively
x = (x1 x2 · · · xm)T = (x0,1 x0,2 · · · x0,m)T + (x1 − µ1 x2 − µ2 · · · xm − µm)T + (f1 f2 · · · fm)T
or
x = x0 + (x − µ) + f . (8.4)
We now decompose the least squares solution vector
β = (ATA)−1ATx = BTx ; B = A(ATA)−1 (8.5)
into
β = BT [x0 + (x − µ) + f ] = β0 + BT (x − µ) + BTf . (8.6)
Obviously,
β0 = (ATA)−1ATx0 = BTx0 (8.7)
designates the true solution vector. As (8.6) and Fig. 8.1 underscore, the least squares estimator β is given by the sum of the true solution vector β0 and two further terms BT(x − µ) and BTf expressing the propagation of random and systematic errors respectively.

Within the framework of the concept of a statistical ensemble of measuring devices biased by one and the same systematic error, the estimator β scatters with respect to the expectation

µβ = E β = β0 + BTf (8.8)

and not with respect to the true solution vector β0. On this note, we propose:

Given the input data are charged by unknown systematic errors, the least squares solution vector β is biased with respect to the true solution vector β0. Just this bias rules out most of the classical least squares tools.
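The decomposition (8.6) is easy to make tangible in a few lines of NumPy. The design matrix and the systematic errors below are invented; to isolate the bias term, the random errors are set to zero:

```python
import numpy as np

A = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
beta0 = np.array([1., 2.])
f = np.array([0.10, -0.05, 0.02])   # unknown systematic errors, constant in time

B = A @ np.linalg.inv(A.T @ A)      # B = A(ATA)^-1, eq. (8.5)

x = A @ beta0 + f                   # input data with random errors suppressed
beta_hat = B.T @ x                  # least squares estimator

bias = B.T @ f                      # the term shifting the estimator away from beta0
```

The estimator sits exactly at β0 + BTf, i.e. at the expectation µβ of (8.8), and not at β0.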
8.2 Minimized Sum of Squared Residuals
Systematic errors provide the minimized sum of squared residuals with new properties.

For simplicity, we assume the random errors εi = xi − µi to relate to the same parent distribution. We then have

σ2 = E(Xi − µi)2 ; i = 1, . . . , m . (8.9)

If the input data were free from systematic errors, the unweighted minimized sum of squared residuals

Qunweighted = rTr = (x − Aβ)T(x − Aβ) (8.10)

would supply an estimator for (8.9), namely

s2 = Qunweighted/(m − r) ≈ σ2 . (8.11)
Considering (8.4), (7.10) and (7.13),
r = x − Aβ = x − Px
  = x0 + (x − µ) + f − P[x0 + (x − µ) + f]
  = (x − µ) − P(x − µ) + (f − Pf) , (8.12)

and temporarily ignoring the last term (f − Pf) on the right-hand side, we find

Qunweighted = (x − µ)T(I − P)(x − µ) .

Here I denotes an (m × m) identity matrix. Dividing by σ2 produces the weighted sum of squared residuals

Qweighted = [(x − µ)T/σ] (I − P) [(x − µ)/σ] . (8.13)
We shall show that Qweighted is χ2-distributed with m − r degrees of freedom. The matrices I, P and I − P are projection matrices, see Appendix D. As rank(I − P) = m − r, the difference matrix I − P has m − r non-zero eigenvalues, all having the same value 1. The remaining r eigenvalues are zero.
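These properties of the projection matrices are quickly verified numerically; a small NumPy check with an invented design matrix:

```python
import numpy as np

A = np.array([[1., 0.],
              [0., 1.],
              [1., 1.],
              [2., 1.]])   # m = 4 observations, r = 2 parameters
m, r = A.shape

P = A @ np.linalg.inv(A.T @ A) @ A.T
I = np.eye(m)

# I - P is idempotent, hence its eigenvalues can only be 0 or 1
assert np.allclose((I - P) @ (I - P), I - P)

eigvals = np.sort(np.linalg.eigvalsh(I - P))   # m - r ones, r zeros
```

Since the eigenvalues are zeros and ones, trace(I − P) counts them and equals m − r, the number of degrees of freedom.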
Let T denote an orthogonal (m × m) matrix and I an (m × m) identity matrix so that TT T = T TT = I, TT = T−1. Following the spectral theorem of algebra [52], we put the orthogonal eigenvectors of I − P into the columns of T so that
TT(I − P)T = diag(1, . . . , 1, 0, . . . , 0) , (8.14)

the diagonal holding m − r ones followed by r zeros.
Then, introducing an auxiliary vector
y = (y1 y2 . . . ym)T
and putting

(x − µ)/σ = Ty ,
(8.13) changes into
Qweighted = (Ty)T(I − P)Ty = Σ_{i=1}^{m−r} yi2 . (8.15)
The elements of the vector
y = TT(x − µ)/σ
are normally distributed. Moreover, on account of
E y = 0 and E yyT = TT T = I ,

they are standardized and independent. Thus, according to (8.15), Qweighted is χ2-distributed with m − r degrees of freedom, i.e.

E Qweighted = m − r or E Qunweighted = σ2(m − r) , (8.16)
see Sect. 3.3. Unfortunately, we are not allowed to ignore the term (f − Pf) in (8.12). As a consequence we find

E Qunweighted = σ2(m − r) + (fTf − fTPf) (8.17)

and this, obviously, means that as long as the second term on the right-hand side does not vanish, the experimenter is not in a position to derive an estimator for the variance of the input data from the minimized sum of squared residuals. In other words, provided that measurement uncertainties are considered indispensable, in general there is no chance of carrying out least squares adjustments based on single observations. In Sect. 10.1, however, we shall meet an exception.
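The extra term in (8.17) can be checked numerically: if the random errors are switched off entirely, the minimized sum of squared residuals consists of the term fTf − fTPf alone. A NumPy sketch with invented data:

```python
import numpy as np

A = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
beta0 = np.array([1., 2.])
f = np.array([0.10, -0.05, 0.02])          # systematic errors

x = A @ beta0 + f                          # random errors suppressed
P = A @ np.linalg.inv(A.T @ A) @ A.T       # orthogonal projection

beta_hat = np.linalg.solve(A.T @ A, A.T @ x)
r_vec = x - A @ beta_hat                   # residuals
Q_unweighted = r_vec @ r_vec

extra_term = f @ f - f @ P @ f             # the non-vanishing term in (8.17)
```

As long as f is not an element of the column space of A, this term is strictly positive and contaminates any variance estimate drawn from the residuals.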
Meanwhile, instead of using single observations x1, x2, . . . , xm, the experimenter may refer to arithmetic means

xi = (1/n) Σ_{l=1}^{n} xil ; i = 1, . . . , m , (8.18)

the empirical variances of which,

si2 = [1/(n − 1)] Σ_{l=1}^{n} (xil − xi)2 ; i = 1, . . . , m , (8.19)

are known from the outset. Hence, we shall treat inconsistent systems of the kind
Aβ ≈ x , (8.20)
in which the components of the input vector x are arithmetic means
xi = x0,i + (xi − µi) + fi; i = 1, . . . , m . (8.21)
In particular, we shall assume each mean to cover the same number n of repeat measurements. Just this idea will enable us to assign a complete empirical variance–covariance matrix to the solution vector

β = (ATA)−1ATx = BTx ; B = A(ATA)−1 . (8.22)
8.3 Gauss–Markoff Theorem
As is well known, the concept of so-called “optimal estimators” frequently referred to in measurement technique is due to the Gauss–Markoff theorem.1 Just for completeness let us present the theorem, temporarily ignoring the existence of systematic errors so that the least squares estimators are unbiased. The Gauss–Markoff theorem derives the uncertainties of the components of the least squares solution vector

β = (β1 β2 · · · βr)T

from the diagonal elements of the associated theoretical variance–covariance matrix

1R.L. Plackett (Biometrika, (1949), 149–157) has pointed out that the theorem is due to Gauss (Theoria combinationis observationum erroribus minimis obnoxiae, 1821/23). Presumably, the notation Gauss–Markoff theorem intends to honor Markoff’s later contributions to the theory of least squares.
E(β − µβ)(β − µβ)T = (σβkβk′) ; k, k′ = 1, . . . , r . (8.23)

If need be, we shall set σβkβk ≡ σ2βk. The uncertainties uβk of the components βk; k = 1, . . . , r would be

uβk = √σβkβk ≡ σβk ; k = 1, . . . , r .

The Gauss–Markoff theorem states:

Among all unbiased estimators being linear in the input data, the least squares estimator minimizes the diagonal elements of the theoretical variance–covariance matrix of the solution vector.
In view of the underlying error model, however, least squares estimators are biased. In this respect, the fate of the theorem should be sealed. On the other hand, as the ISO Guide steadfastly continues to take the validity of the theorem for granted, we shall carry out computer simulations in order to settle the issue. As we shall see, the results not only differ in the sizes of the measurement uncertainties – an observation apparently regarded as disputable on the part of experimenters – but also in another property which, in fact, manoeuvres the Guide into a difficult position, see Sects. 8.4 and 9.3.

Ignoring systematic errors and assuming observations of equal accuracy, we have

E(x − x0)(x − x0)T = (σ2/n) I , (8.24)
with I denoting an (m × m) identity matrix. According to (8.8) we have µβ = β0. But then (8.23) yields

E(β − β0)(β − β0)T = (σ2/n)(ATA)−1 . (8.25)
Instead of B = A(ATA)−1 let us impute the existence of an (r × m) matrix V defining another solution vector
γ = V x (8.26)
of Aβ ≈ x, also being linear in the observations. Of course, we ask for
γ0 = V x0 = V Aβ0 = β0 ; i.e. V A = I . (8.27)
As
E(γ − γ0)(γ − γ0)T = (σ2/n) V VT ,
we have to compare the diagonal elements of the variance–covariance matrices (ATA)−1 and V VT. The identity

V VT = (ATA)−1 + (V − BT)(V − BT)T (8.28)

reveals the diagonal elements of V VT to be non-negative; in particular they turn out to be minimal in case V = BT, i.e. if V is due to least squares.

However, we are not allowed to ignore the third term BTf in (8.6). Consequently, the estimator β is no longer unbiased and just this fact violates the Gauss–Markoff theorem. With respect to the error model under discussion we state:

Empirical data, being biased by unknown systematic errors, cause the Gauss–Markoff theorem to break down.
Nevertheless, let us continue to consider the classical form of an adjustment for a weighted linear system
GAβ ≈ Gx . (8.29)
According to the Gauss–Markoff theorem the optimal matrix of weights would be

G = [E(x − x0)(x − x0)T]−1/2 (8.30)

leading to minimal diagonal elements of the theoretical variance–covariance matrix

E(β − β0)(β − β0)T = [(GA)T(GA)]−1 (8.31)

of the solution vector

β = BTGx , B = (GA)[(GA)T(GA)]−1 (8.32)
of the weighted system (8.29). However, as Sect. 9.3 will disclose, these considerations turn out to be inoperative so that we shall have to proceed differently.

Incidentally, for the purposes of the experimenter, the Gauss–Markoff theorem does not seem to be exhaustive insofar as it presumes an invariable design matrix. On the other hand, it might well happen that one and the same measuring problem allows for quite different design matrices leading to varying measurement uncertainties. Unfortunately, the Gauss–Markoff theorem will in no way assist the experimenter in finding a design matrix generating absolutely minimal diagonal elements [20].
8.4 Choice of Weights
In general, the input data of the linear system
Aβ ≈ x (8.33)
will not be of the same accuracy. To boost the influence of more accurate data and reduce the influence of less accurate data, so-called weights should be introduced. Weighting the linear system means multiplying its m rows

αi1β1 + αi2β2 + · · · + αirβr ≈ xi ; i = 1, . . . , m (8.34)

by some factors gi ≠ 0.2 We purposefully assemble the latter within a (non-singular) diagonal weight matrix

G = diag(g1, g2, . . . , gm) . (8.35)
We emphasize that weights, in principle, do not shift the components of the true solution vector β0. Clearly,

GAβ0 = Gx0 (8.36)

re-establishes

β0 = [(GA)T(GA)]−1(GA)TGx0 (8.37)

regardless of the weights the experimenter has chosen. Mathematically, this property is self-evident. Nevertheless, it will turn out to be of crucial importance when measurement uncertainties are to be optimized, i.e. minimized.

Weighting procedures, however, shift the least squares estimators of the inconsistent linear system. Then, finding “points of attack”, weights have two types of consequences: they
– shift the components of the vector β and
– either expand or compress the respective uncertainties.

Assuming the unknown systematic errors to abrogate the Gauss–Markoff theorem, there no longer is a rule of how to select the weight factors in order to minimize measurement uncertainties. Nevertheless, as will be demonstrated, the alternative error model is in a position to state:
The intervals
βk ± uβk ; k = 1, . . . , r (8.38)

localize the true values β0,k of the components βk of the solution vector β for any choice of the weights.

In a sense, this statement restores the memory of the method of least squares, see also Sect. 9.3. According to this result, we are, in principle, free to choose the weights gi. In the standard case we would derive the weights from the uncertainties of the input data xi through
2Even linear combinations of rows are admissible.
gi = 1/uxi ; uxi = [tP(n − 1)/√n] si + fs,i ; i = 1, . . . , m . (8.39)

Beyond that, we may cyclically repeat the numerical calculations, varying the weights by trial and error, and observe the estimators’ uncertainties. Hence, we can see whether re-evaluation has produced larger or smaller uncertainties.

If the individual weights were multiplied by one and the same arbitrary factor, nothing would change. For instance, the gi might be replaced with

√wi = gi / √(Σ_{i=1}^{m} gi2) . (8.40)

Clearly, the so-defined weights are dimensionless and their squares add up to one. But it remains immaterial to which set of weights we refer.
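The invariance under a common rescaling of the weights is quickly confirmed in NumPy; the system and the input uncertainties below are invented for illustration:

```python
import numpy as np

A = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
x = np.array([1.02, 1.98, 3.05])          # input data
u_x = np.array([0.05, 0.02, 0.08])        # overall uncertainties of the input data

g = 1.0 / u_x                             # weights, eq. (8.39)
w_sqrt = g / np.sqrt(np.sum(g**2))        # normalized weights, eq. (8.40)
assert np.isclose(np.sum(w_sqrt**2), 1.0) # their squares add up to one

def weighted_ls(gvec):
    """Least squares solution of the weighted system G A beta ~ G x."""
    GA = np.diag(gvec) @ A
    return np.linalg.solve(GA.T @ GA, GA.T @ (gvec * x))

beta_g = weighted_ls(g)
beta_scaled = weighted_ls(10.0 * g)       # common factor: nothing changes
beta_w = weighted_ls(w_sqrt)
```

All three calls return one and the same estimator; only the ratios of the weights matter.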
8.5 Consistency of the Input Data
Conventional least squares provides a test that shall reveal whether or not the input data are consistent,3 i.e. whether they might be blurred by undetected systematic errors.4 In the presence of unknown systematic errors, any such test will, of course, be superfluous. Nevertheless, we shall derive it, though, basically, it simply is a repetition of the considerations of Sect. 8.2. Let the input data be arithmetic means
x = (x1 x2 · · · xm)T . (8.41)
Their underlying theoretical variances
σi2 = E(Xi − µi)2 ; i = 1, . . . , m (8.42)

may be different. Conventional least squares obtains the weight matrix G from the Gauss–Markoff theorem, i.e. from the theoretical variance–covariance matrix of the input data,

E(x − µ)(x − µ)T = (1/n)(σij) ; i, j = 1, . . . , m ; σii ≡ σi2 , (8.43)
3From our point of view, consistency means that the relationship between the true values of the input data and the true solution vector is physically unique.
4Undetected simply means that the experimenter was not aware that an unknown systematic error he should have taken into account has been operative.
setting
G = [E(x − µ)(x − µ)T]−1/2 . (8.44)
Multiplying the inconsistent linear system
Aβ ≈ x (8.45)
on the left by G yields
GAβ ≈ Gx . (8.46)
The projection operator and the vector of the residuals are given by
P = (GA)[(GA)T(GA)]−1(GA)T (8.47)
and
r = Gx − PGx (8.48)
respectively. Inserting
x = x0 + (x − µ) + f (8.49)
we find
r = G (x − µ) − PG (x − µ) + (Gf − P Gf ) , (8.50)
since5
PGx0 = Gx0 . (8.51)
Disregarding systematic errors in (8.50), i.e. the third term on the right, the minimized sum of squared residuals is given by
Qweighted = rTr = [G (x − µ)]T (I − P ) [G (x − µ)] , (8.52)
where I designates an (m × m) identity matrix. Let T denote an orthogonal (m × m) matrix and y an (m × 1) auxiliary vector. Putting
G (x − µ) = T y; y = T TG (x − µ) (8.53)
and referring to (8.14) we find
Qweighted = (Ty)T(I − P)Ty = Σ_{i=1}^{m−r} yi2 . (8.54)

5Following (8.36), the vector Gx0 is an element of the column space of the matrix GA. According to (8.47), the operator P projects onto this space.
Again, the components of the auxiliary vector y are normally distributed (even if there are dependencies between the means xi, see Appendix C, [40]). As

E y = 0 and E yyT = TT T = I ,

Qweighted is χ2-distributed with m − r degrees of freedom, i.e.

E Qweighted = m − r .

After all, given that the input data are free from systematic errors we should observe

Qweighted ≈ m − r . (8.55)

Meanwhile, accounting for the term (Gf − PGf) disregarded in (8.50), and for the observation that, due to systematic errors, the weight matrix G as proposed in (8.35) should be used, we have instead
E Qweighted = Σ_{i=1}^{m} gi2 σi2/n − Σ_{i=1}^{m} gi2 (σi2/n) pii + [(Gf)T(Gf) − (Gf)TP(Gf)] . (8.56)
Here, the pii designate the diagonal elements of the projection matrix P introduced in (8.47). We note that (8.56) presupposes independent xi, as otherwise the expected value E Qweighted would turn out to be more complex. We observe:

Unknown systematic errors suspend the property of the weighted minimized sum of squared residuals to estimate the number of the degrees of freedom of the linear system.

For decades, empirical observations, in particular those relating to high-level metrology, have revealed over and over again

Qweighted ≫ m − r , (8.57)

[16, 17]. Procedures relying on the ISO Guide [36] then speak of inconsistent input data, attributing the inconsistencies to undetected unknown systematic errors. However, within the alternative error model, (8.57) does not necessarily express inconsistencies among the input data, as we think (8.56) rather than (8.55) applies. Whether or not measuring data are inconsistent should be scrutinized by other means.
9 Uncertainties of Least Squares Estimators
9.1 Empirical Variance–Covariance Matrix
To determine the uncertainties of the least squares estimators, we resort to the decomposition (8.6),
β = BTx = BT [x0 + (x − µ) + f ] . (9.1)
Let us firstly estimate the term BT(x − µ). In contrast to the conventional error calculus, we shall rely on empirical quantities, i.e. we shall establish the empirical variance–covariance matrix directly from the input data. This is possible, given each of the xi; i = 1, . . . , m relates to the same number of repeat measurements, as was presumed. Then, the said matrix allows us to express those parts of the overall uncertainties which are due to random errors through confidence intervals according to Student. In principle, we shall proceed as in Sect. 6.3, starting from the components βk; k = 1, . . . , r of the estimator (9.1). Denoting the elements of the matrix B by bik,
B = A(ATA)−1 = (bik) , i = 1, . . . , m ; k = 1, . . . , r ,
we find from
βk = Σ_{i=1}^{m} bik xi ; k = 1, . . . , r , (9.2)

inserting the means

xi = (1/n) Σ_{l=1}^{n} xil ,
the representations
βk = Σ_{i=1}^{m} bik [(1/n) Σ_{l=1}^{n} xil] = (1/n) Σ_{l=1}^{n} [Σ_{i=1}^{m} bik xil]
   = (1/n) Σ_{l=1}^{n} βkl ; k = 1, . . . , r . (9.3)
For convenience, the sums
βkl = Σ_{i=1}^{m} bik xil ; k = 1, . . . , r (9.4)

may be understood as components of a least squares estimator, the input data of which are the m individual measurements x1l, x2l, . . . , xml: each of the m means xi contributes just the l-th measured datum. The following illustration depicts the situation:

x11 x12 . . . x1l . . . x1n ⇒ x1
x21 x22 . . . x2l . . . x2n ⇒ x2
. . . . . . . . . . . . . . . . . .
xm1 xm2 . . . xml . . . xmn ⇒ xm
 ⇓   ⇓        ⇓        ⇓
βk1 βk2 . . . βkl . . . βkn ⇒ βk
According to (9.3), the βk are the arithmetic means of the n means βkl. The differences

βkl − βk = Σ_{i=1}^{m} bik (xil − xi) ; l = 1, . . . , n ; k = 1, . . . , r (9.5)

allow us to establish the empirical variances and covariances

sβkβk′ = [1/(n − 1)] Σ_{l=1}^{n} (βkl − βk)(βk′l − βk′) ; k, k′ = 1, . . . , r
of the components of the estimator β and thus its empirical variance–covariance matrix

sβ = (sβkβk′) ; k, k′ = 1, . . . , r ; sβkβk ≡ s2βk . (9.6)
Inserting the differences (9.5) into sβkβk′ , we obtain
sβkβk′ = [1/(n − 1)] Σ_{l=1}^{n} [Σ_{i=1}^{m} bik (xil − xi)] [Σ_{j=1}^{m} bjk′ (xjl − xj)]
       = Σ_{i,j=1}^{m} bik bjk′ sij . (9.7)
Here the quantities
sij = [1/(n − 1)] Σ_{l=1}^{n} (xil − xi)(xjl − xj) ; i, j = 1, . . . , m (9.8)

designate the elements of the empirical variance–covariance matrix of the input data

s = (sij) ; i, j = 1, . . . , m . (9.9)
By means of the column vectors bk; k = 1, . . . , r of the matrix B,
B = (b1 b2 · · · br) , (9.10)
we can compress the variances and covariances as defined in (9.7) into
sβkβk′ = bkT s bk′ ; k, k′ = 1, . . . , r . (9.11)
Hence, (9.6) turns into
sβ = (bkT s bk′) = BTs B . (9.12)

As each of the βk; k = 1, 2, . . . , r relies on the same set of input data, they necessarily are dependent. The same applies to the βkl. Moreover, for any fixed i, the sequence of the xil is indeterminate, i.e. we may interchange the xil, i fixed. With these permutations, the βk do not alter; the βkl, however, will alter. Consequently, the elements of the matrices (9.9) and (9.12) depend on the sequences of the xil; i fixed. This, nevertheless, complies perfectly with the properties of the underlying statistical distributions and, in particular, with our aim of introducing Student’s confidence intervals.
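The chain (9.3)–(9.12) translates directly into NumPy: the empirical variance–covariance matrix of the auxiliary estimators βkl equals BTsB. Data and design matrix below are invented (fixed random seed):

```python
import numpy as np

rng = np.random.default_rng(1)

A = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
m, r = A.shape
n = 8                                     # repeat measurements per mean

x0 = A @ np.array([1., 2.])
X = x0[:, None] + 0.1 * rng.standard_normal((m, n))   # individual data xil

B = A @ np.linalg.inv(A.T @ A)

beta_l = B.T @ X                          # columns: estimators beta_kl, eq. (9.4)
s = np.cov(X, ddof=1)                     # empirical matrix s of the input, eq. (9.9)
s_beta = B.T @ s @ B                      # eq. (9.12)
```

The identity holds exactly, since the sample covariance is bilinear in the data; the means of the βkl reproduce the βk of (9.3).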
In order to find the empirical variance–covariance matrix of the weighted adjustment, we resort to (8.29)
GAβ ≈ Gx = G [x0 + (x − µ) + f ]
and (8.32)1
1Though the weighted solution vector β will be different from the unweighted one, for simplicity, we use the same symbol.
β = BTGx ; B = (GA)[(GA)T(GA)]−1 .

Denoting the elements of the (m × r) matrix B by bik, we write the r components βk of the vector β in the form

βk = Σ_{i=1}^{m} bik gi xi ; k = 1, . . . , r . (9.13)
Insertion of
xi = (1/n) Σ_{l=1}^{n} xil

leads to

βk = Σ_{i=1}^{m} bik gi [(1/n) Σ_{l=1}^{n} xil] = (1/n) Σ_{l=1}^{n} [Σ_{i=1}^{m} bik gi xil] = (1/n) Σ_{l=1}^{n} βkl . (9.14)
Again, we may consider the quantities
βkl = Σ_{i=1}^{m} bik gi xil (9.15)

as the components of a least squares estimator, the input data of which are the m weighted individual measurements x1l, x2l, . . . , xml. Referring to the differences

βkl − βk = Σ_{i=1}^{m} bik gi (xil − xi) ; k = 1, . . . , r ,
we form the elements
sβkβk′ = [1/(n − 1)] Σ_{l=1}^{n} [Σ_{i=1}^{m} bik gi (xil − xi)] [Σ_{j=1}^{m} bjk′ gj (xjl − xj)]
       = Σ_{i,j=1}^{m} bik bjk′ gi gj sij (9.16)
of the empirical variance–covariance matrix sβ . The matrix itself is given by
sβ = (bkT (GTs G) bk′) = BT(GTs G)B . (9.17)
9.2 Propagation of Systematic Errors 117
Here, we have made use of
B = (b1 b2 · · · br) = (bik) . (9.18)

The elements sij and the matrix s have been defined in (9.8) and (9.9) respectively.
Each of the quantities βkl, as defined in (9.4) and (9.15), implies just one set of input data x1l, x2l, . . . , xml. Assuming the successive sets l = 1, 2, . . . , n to be independent and assigning an m-dimensional normal density to each of them, we are in a position to introduce Student’s confidence intervals. For the non-weighted adjustment we have

βk ± [tP(n − 1)/√n] √(Σ_{i,j=1}^{m} bik bjk sij) ; k = 1, . . . , r (9.19)

while the weighted adjustment yields

βk ± [tP(n − 1)/√n] √(Σ_{i,j=1}^{m} bik bjk gi gj sij) ; k = 1, . . . , r . (9.20)
The intervals (9.19) and (9.20) localize the components of the vectors
µβ = β0 + BTf (9.21)
and
µβ = β0 + BTGf (9.22)
respectively, with probability P as given by tP (n − 1).
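A sketch of the confidence intervals (9.19) in NumPy; the data are invented, and the Student factor tP(n − 1) ≈ 2.262 for P = 95%, n = 10 is inserted as a fixed number rather than computed:

```python
import numpy as np

rng = np.random.default_rng(2)

A = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
m, r = A.shape
n = 10
t_P = 2.262                               # Student's t_P(n-1) for P = 95%, n = 10

x0 = A @ np.array([1., 2.])
X = x0[:, None] + 0.05 * rng.standard_normal((m, n))

B = A @ np.linalg.inv(A.T @ A)
beta_hat = B.T @ X.mean(axis=1)
s = np.cov(X, ddof=1)

# Half-widths of the intervals (9.19): t_P/sqrt(n) * sqrt(b_k^T s b_k)
half = t_P / np.sqrt(n) * np.sqrt(np.diag(B.T @ s @ B))
intervals = np.stack([beta_hat - half, beta_hat + half], axis=1)
```

Each row of intervals holds the lower and upper confidence bound for one component βk.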
9.2 Propagation of Systematic Errors
From (9.21) and (9.22) we read the systematic errors of the estimators of the non-weighted and the weighted adjustment respectively,

fβ = BTf ; fβ = BTGf , (9.23)

or, written in components,

fβk = Σ_{i=1}^{m} bik fi ; fβk = Σ_{i=1}^{m} bik gi fi ; k = 1, . . . , r . (9.24)
Their worst-case estimations are
fs,βk = Σ_{i=1}^{m} |bik| fs,i (9.25)

and

fs,βk = Σ_{i=1}^{m} |bik| gi fs,i . (9.26)

Should fi = f = const., i = 1, . . . , m apply, the estimations turn out to be more favorable,

fs,βk = fs |Σ_{i=1}^{m} bik| ; fs,βk = fs |Σ_{i=1}^{m} bik gi| . (9.27)
According to (9.25) and (9.26), the new formalism should respond to a rising number m of input data with an inappropriate growth in the propagated systematic errors, thus blotting out the increase in input information. This would, of course, scarcely make sense. Fortunately, as the examples in Sect. 9.3 will underscore, this indeed does not apply. For now, let us make this observation a bit plausible. Suppressing the index k and confining ourselves to m = 2 so that

fβ = b1f1 + b2f2 ,

we find from

fβ = b1x1 (f1/x1) + b2x2 (f2/x2) ,

dividing by β = b1x1 + b2x2,

fβ/β = p1 (f1/x1) + p2 (f2/x2) ,

where

p1 = b1x1/β ; p2 = b2x2/β ; p1 + p2 = 1 .

As long as p1 and p2 act as true weight factors, being both positive, we have

fs,β/|β| = p1 fs,1/|x1| + p2 fs,2/|x2| .
In general, however, the situation might not be that simple.
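The worst-case sums (9.25) and (9.27) in code form (NumPy, invented numbers); the equal-error estimate never exceeds the general bound:

```python
import numpy as np

A = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
B = A @ np.linalg.inv(A.T @ A)

f_s = np.array([0.03, 0.02, 0.05])        # error bounds, |f_i| <= f_s,i

# General worst case, eq. (9.25): f_s,beta_k = sum_i |b_ik| f_s,i
f_s_beta = np.abs(B).T @ f_s

# Equal systematic errors f_i = f = const with common bound f_s = 0.04: eq. (9.27)
f_s_const = 0.04 * np.abs(B.sum(axis=0))
```

By the triangle inequality, |Σ bik| fs ≤ Σ |bik| fs, so the constant-error case is indeed the more favorable one.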
9.3 Overall Uncertainties of the Estimators
To find the overall uncertainties uβk; k = 1, . . . , r of the least squares adjustment, we refer to the decomposition
β = BTG [x0 + (x − µ) + f ]
= β0 + BTG (x − µ) + BTGf . (9.28)
As usual, we linearly combine the estimations (9.19), (9.25) and (9.20), (9.26) respectively, to obtain the sought uncertainty intervals

βk − uβk ≤ β0,k ≤ βk + uβk ; k = 1, . . . , r . (9.29)

While the non-weighted adjustment yields

uβk = [tP(n − 1)/√n] √(Σ_{i,j=1}^{m} bik bjk sij) + Σ_{i=1}^{m} |bik| fs,i , (9.30)
the weighted adjustment issues
uβk = [tP(n − 1)/√n] √(Σ_{i,j=1}^{m} bik bjk gi gj sij) + Σ_{i=1}^{m} |bik| gi fs,i . (9.31)
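Combining the two parts, the non-weighted overall uncertainty (9.30) might be sketched as follows (NumPy; the data, the error bounds and the fixed Student factor are invented assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

A = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
m, r = A.shape
n = 10
t_P = 2.262                               # t_P(n-1) for P = 95%, n = 10

beta0 = np.array([1., 2.])
f = np.array([0.02, -0.01, 0.015])        # actual systematic errors
f_s = np.array([0.03, 0.02, 0.02])        # their bounds, |f_i| <= f_s,i

X = (A @ beta0 + f)[:, None] + 0.01 * rng.standard_normal((m, n))

B = A @ np.linalg.inv(A.T @ A)
beta_hat = B.T @ X.mean(axis=1)
s = np.cov(X, ddof=1)

# Eq. (9.30): Student part plus worst-case systematic part
u_beta = t_P / np.sqrt(n) * np.sqrt(np.diag(B.T @ s @ B)) + np.abs(B).T @ f_s
```

In this simulated run the intervals beta_hat ± u_beta localize the true values beta0, as the error model asserts.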
In either case, we expect the intervals (9.29) to localize the true values β0,k of the estimators βk; k = 1, . . . , r. As Fig. 9.1 depicts, the least squares adjustment depends on
– the input data,
– the choice of the design matrix A, and
– the structure of the weight matrix G.
Let us recall (see Sect. 8.4) that weightings involve two things: they numerically shift the estimators and they alter the pertaining uncertainties. Nonetheless, as the uncertainty intervals steadfastly localize the estimators’ true values independent of which weights have been chosen, we may attribute something like a memory to the approach. Also, we should not cling so much to the numerical values of the estimators but rather consider intervals as quoted in (9.29). As Fig. 9.2 illustrates, we may assume a set of uncertainty intervals stemming from different experiments and aiming at one and the same physical quantity. Then, each of the intervals should localize the common true value β0,k. In other words, the intervals should mutually overlap.
The property (9.29) proves to be of vital importance and differs significantly from the implications of the ISO Guide. Though the latter evens out inconsistencies among the input data, the experimenter must, unfortunately, be aware of the fact that the uncertainties at hand might not embrace the true values he is looking for.

Fig. 9.1. The least squares adjustment depends on the input data, the design matrix A and the weight matrix G. The latter may be chosen by trial and error to minimize the uncertainties of the estimators

Fig. 9.2. Overlaps of uncertainty regions of measuring results coming from different paths and aiming at one and the same physical quantity β0,k
Nonetheless, for completeness, we add the Guide’s formalism: the linear system and the least squares estimator of the non-weighted adjustment are given by

Aβ ≈ x , β = BTx , B = A(ATA)−1 (9.32)

where

β0 = E β = BTx0
is tacitly assumed. But then, the theoretical variance–covariance matrix of the estimator β would be

E(β − β0)(β − β0)T = BT E(x − x0)(x − x0)T B . (9.33)

The square roots of its diagonal elements are seen to be the uncertainties of the non-weighted adjustment. Similarly, for the weighted adjustment we have

GAβ ≈ Gx , β = BTGx , B = (GA)[(GA)T(GA)]−1 (9.34)

where again

β0 = E β = BTGx0
is taken for granted. Referring to a weight matrix of the form
G = [E(x − x0)(x − x0)T]−1/2 , (9.35)

the theoretical variance–covariance matrix of the weighted ISO adjustment is given by

E(β − β0)(β − β0)T = BTG E(x − x0)(x − x0)T GB = BTB = [(GA)T(GA)]−1 . (9.36)
Again, the square roots of the diagonal elements of this matrix act as the uncertainties of the weighted adjustment.

Let us recall that, on account of the randomization postulate, there is no distinction between the expectation E β of the estimator β and its true value β0. According to (9.21) and (9.22), however, these quantities differ by the biases

BTf and BTGf

respectively. After all, the experimenter has no means to exclude the possibility that the weights have shifted his estimators away from their true values and, paradoxically, shrunk their uncertainties.
Examples
(i) Non-weighted and weighted adjustment: A computer simulation shall help us to compare the error model of the ISO Guide with that of the alternative approach. To this end we consider the linear system
10 β1 + β2 + 9 β3 ≈ x1
− β1 + 2 β3 ≈ x2
β1 + 2 β2 + 3 β3 ≈ x3
−3 β1 − β2 + 2 β3 ≈ x4
β1 + 2 β2 + β3 ≈ x5
and let
β0,1 = 1 , β0,2 = 2 , β0,3 = 3
be the true values of the parameters so that the true values of the input data turn out to be

x0,1 = 39 , x0,2 = 5 , x0,3 = 14 , x0,4 = 1 , x0,5 = 8 .

We simulate normally distributed input data xil; i = 1, . . . , 5; l = 1, . . . , 10. It seems recommendable to process random and systematic errors of the same order of magnitude. That is why we set, for any fixed i, the theoretical variance of the random errors equal to the postulated theoretical variance of the pertaining fi,

σ2xi = (1/3) f2s,i ; i = 1, . . . , m .
We assume the boundaries of the systematic errors and their actual values to be
fs,1 = 0.39 × 10−2 , fs,2 = 0.35 × 10−4 , fs,3 = 0.98 × 10−5
fs,4 = 0.35 × 10−4 , fs,5 = 0.8 × 10−3
and
f1 = 0.351 × 10−2 , f2 = 0.315 × 10−4 , f3 = −0.882 × 10−5
f4 = −0.315× 10−4 , f5 = −0.72 × 10−3
respectively, i.e. we multiplied the boundaries fs,i; i = 1, . . . , 5 by a factor of 0.9 and purposefully chose the signs of the actual systematic errors in such a way that the Guide’s 2σ uncertainties of the weighted adjustment fail to localize the true values β0,k of the estimators βk.

We refer to a weight matrix G of the form (8.35), the entries being the reciprocals of the uncertainties of the input data corresponding to the ISO Guide and to the alternative approach respectively, i.e.

gi = 1/uxi , i = 1, . . . , 5

with

uxi = √(s2xi/n + (1/3) f2s,i)   (ISO Guide)   and

uxi = [tP(n − 1)/√n] sxi + fs,i   (alternative model).
The alternative error model allows us to manipulate the weights of the adjustment. To exemplify this possibility we choose
g1 = 1/ux1 , g2 = 4/ux2 , g3 = 10/ux3 , g4 = 4/ux4 , g5 = 1/ux5 .

For the ISO Guide we quote so-called 2σ uncertainties. The alternative error model relates the random parts of the overall uncertainties to a confidence level of P = 95%.

Tables 9.1–9.3 display the results. In general, the 2σ uncertainties of the Guide may be considered acceptable. There are, however, singular cases in which the Guide fails to localize the true values of the estimators. We purposefully exemplified this situation, thus underscoring that the Guide’s 2σ uncertainties can be too short.

In contrast to this, the alternative approach presents us with robust uncertainties safely localizing the true values of the estimators. Even if we manipulate the weights, the localization property of the alternative error model still persists.
(ii) Weighted mean of means: Given m arithmetic means and their respective uncertainties,

xi = (1/n) Σ_{l=1}^{n} xil ; uxi = [tP(n − 1)/√n] si + fs,i ; i = 1, . . . , m . (9.37)
We shall find the weighted grand mean β and its uncertainty uβ.

To average a pool of given arithmetic means ranks among the most frequently used standard procedures of metrology. Clearly, this is done in order to extract as much information as possible out of the data. The proceeding is taken to be permissible, trouble-free and self-evident. But consider the following proposal:

Averaging a set of means makes sense only if the underlying true values x0,i; i = 1, . . . , m are identical.

As has been discussed in Sect. 7.1, to average means to perform a least squares adjustment. Here we have, instead of (7.2),
Table 9.1. Least squares estimators and their uncertainties based on the ISO Guide and the alternative error model, respectively. In the first case so-called 2σ uncertainties have been quoted, in the latter case the random parts of the overall uncertainties refer to a confidence level of P = 95%. The * denotes manipulated weights

Model β1 ± uβ1 β2 ± uβ2 β3 ± uβ3
ISO non-weighted 1.0003 ± 0.0004 1.9995 ± 0.0005 3.00016 ± 0.00021
ISO weighted 1.00013 ± 0.00011 1.99980 ± 0.00016 3.00009 ± 0.00007
altern. non-weighted 1.0003 ± 0.0005 1.9995 ± 0.0008 3.00017 ± 0.00029
altern. weighted 1.00014 ± 0.00020 1.99978 ± 0.00030 3.00009 ± 0.00013
altern. weighted* 1.00012 ± 0.00017 1.99981 ± 0.00025 3.00008 ± 0.00011
Table 9.2. Data taken from Table 9.1: localization in least squares adjustment based on the ISO Guide. The weighted adjustment fails to localize the true values
ISO Guide
non-weighted weighted
0.9999 < 1 < 1.0007 1.00002 > 1 < 1.00024
1.9990 < 2 = 2.0000 1.99964 < 2 > 1.99996
2.99995 < 3 < 3.00037 3.00002 > 3 < 3.00016
Table 9.3. Data taken from Table 9.1: localization in least squares adjustment based on the alternative error model. The * denotes manipulated weights
                       alternative model
non-weighted              weighted                   weighted*
0.9998  < 1 < 1.0008      0.99994 < 1 < 1.00034     0.99995 < 1 < 1.00029
1.9987  < 2 < 2.0003      1.99948 < 2 < 2.00008     1.99956 < 2 < 2.00006
2.99988 < 3 < 3.00046     2.99996 < 3 < 3.00022     2.99997 < 3 < 3.00019
$$\begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix} \bar\beta \approx \begin{pmatrix} \bar x_1 \\ \bar x_2 \\ \vdots \\ \bar x_m \end{pmatrix}, \qquad \text{i.e.} \quad \boldsymbol a\, \bar\beta \approx \bar{\boldsymbol x}\,. \tag{9.38}$$

Obviously, the underlying consistent system reads

$$\begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix} \beta_0 = \begin{pmatrix} x_{0,1} \\ x_{0,2} \\ \vdots \\ x_{0,m} \end{pmatrix}. \tag{9.39}$$
From this we take that the true value $\beta_0$ of the grand mean $\bar\beta$ cannot comply with different true values of the input data. Consequently we have to require identical true values $x_{0,i} = x_0$; $i = 1, \dots, m$.
Nevertheless, we might wish to average "ad hoc", i.e. without resorting to the method of least squares. Let us average two masses $m_1 = 1/4\,\mathrm{kg}$ and $m_2 = 3/4\,\mathrm{kg}$, each being accurate to about $\pm 1\,\mathrm{mg}$. The mean is $\bar m = 1/2\,\mathrm{kg}$. However, the associated uncertainty exceeds the uncertainties of the input data by a factor of 250 000. The uncertainty has been blown up since the true values of the masses differ. We are sure uncertainties should be due to measurement errors and not to deviant true values. This drastically exaggerated example elucidates that the averaging of means implies far-reaching consequences if done inappropriately.
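The blow-up factor of this example can be checked numerically. The following sketch (a hedged illustration, not the author's formalism; variable names are made up) compares the half-spread of the two means, which stems from their deviant true values, with the 1 mg input uncertainty:

```python
# Two masses whose TRUE values differ are averaged "ad hoc"; the scatter
# between them -- not the 1 mg measurement uncertainty -- dominates the result.
m1, m2 = 0.25, 0.75              # kg; the true values differ
u_input = 1e-6                   # kg, i.e. +/- 1 mg per mass
mean = (m1 + m2) / 2             # 0.5 kg
half_spread = abs(m2 - m1) / 2   # 0.25 kg, scatter due to deviant true values
blow_up = half_spread / u_input  # factor by which the spread exceeds u_input
print(mean, blow_up)
```

With these numbers the half-spread amounts to 0.25 kg = 250 000 mg, i.e. the factor of 250 000 quoted in the text.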
To weight the inconsistent linear system
$$\boldsymbol a\, \bar\beta \approx \bar{\boldsymbol x}$$
9.3 Overall Uncertainties of the Estimators 125
we set
$$G = \operatorname{diag}\left( g_1, g_2, \dots, g_m \right); \qquad g_i = 1/u_{\bar x_i}$$
so that
$$G \boldsymbol a\, \bar\beta \approx G \bar{\boldsymbol x}\,. \tag{9.40}$$
The least squares estimator is
$$\bar\beta = \sum_{i=1}^{m} w_i \bar x_i\,; \qquad w_i = \frac{g_i^2}{\displaystyle\sum_{i=1}^{m} g_i^2}\,. \tag{9.41}$$
Let us replace the means $\bar x_1, \bar x_2, \dots, \bar x_m$ with the respective l-th individual measurements $x_{1l}, x_{2l}, \dots, x_{ml}$,
$$\begin{array}{cccccccc}
x_{11} & x_{12} & \dots & x_{1l} & \dots & x_{1n} & \Rightarrow & \bar x_1 \\
x_{21} & x_{22} & \dots & x_{2l} & \dots & x_{2n} & \Rightarrow & \bar x_2 \\
\dots & \dots & \dots & \dots & \dots & \dots & & \dots \\
x_{m1} & x_{m2} & \dots & x_{ml} & \dots & x_{mn} & \Rightarrow & \bar x_m\,.
\end{array} \tag{9.42}$$
Then, (9.41) turns into
$$\bar\beta_l = \sum_{i=1}^{m} w_i x_{il}\,; \qquad l = 1, \dots, n \tag{9.43}$$
so that
$$\bar\beta = \frac{1}{n} \sum_{l=1}^{n} \bar\beta_l\,. \tag{9.44}$$
Inserting
$$x_{il} = x_{0,i} + (x_{il} - \mu_i) + f_i\,; \qquad \bar x_i = x_{0,i} + (\bar x_i - \mu_i) + f_i\,,$$
(9.41) and (9.43) yield
$$\bar\beta_l = \sum_{i=1}^{m} w_i \left[ x_{0,i} + (x_{il} - \mu_i) + f_i \right] \quad \text{and} \quad \bar\beta = \sum_{i=1}^{m} w_i \left[ x_{0,i} + (\bar x_i - \mu_i) + f_i \right]. \tag{9.45}$$
Being free from systematic errors, the difference
$$\bar\beta_l - \bar\beta = \sum_{i=1}^{m} w_i \left( x_{il} - \bar x_i \right) \tag{9.46}$$
solely expresses the influence of random errors. Consequently, the empirical variance of the estimator $\bar\beta$ is given by
$$s_{\bar\beta}^2 = \frac{1}{n-1} \sum_{l=1}^{n} \left( \bar\beta_l - \bar\beta \right)^2 = \frac{1}{n-1} \sum_{l=1}^{n} \left[ \sum_{i=1}^{m} w_i \left( x_{il} - \bar x_i \right) \right] \left[ \sum_{j=1}^{m} w_j \left( x_{jl} - \bar x_j \right) \right] = \sum_{i,j}^{m} w_i w_j s_{ij} \tag{9.47}$$
where the
$$s_{ij} = \frac{1}{n-1} \sum_{l=1}^{n} \left( x_{il} - \bar x_i \right) \left( x_{jl} - \bar x_j \right); \qquad i, j = 1, \dots, m\,; \quad s_{ii} \equiv s_i^2 \tag{9.48}$$
denote the empirical variances and covariances of the input data (9.42). Again, the $s_{ij}$ shall define the empirical variance–covariance matrix s and the weights $w_i$ an auxiliary vector w,
$$s = \begin{pmatrix} s_{11} & s_{12} & \dots & s_{1m} \\ s_{21} & s_{22} & \dots & s_{2m} \\ \dots & \dots & \dots & \dots \\ s_{m1} & s_{m2} & \dots & s_{mm} \end{pmatrix}; \qquad \boldsymbol w = \left( w_1\; w_2\; \cdots\; w_m \right)^{\mathrm T}.$$

Hence, (9.47) takes the form
$$s_{\bar\beta}^2 = \boldsymbol w^{\mathrm T} s\, \boldsymbol w\,. \tag{9.49}$$
The propagated systematic errors follow from (9.45),
$$f_{\bar\beta} = \sum_{i=1}^{m} w_i f_i\,,$$
and its worst case estimation is
$$f_{s,\bar\beta} = \sum_{i=1}^{m} w_i f_{s,i}\,. \tag{9.50}$$
Let us formally assign an m-dimensional normal density to each of the n data sets
$$x_{1l}, x_{2l}, \dots, x_{ml}\,; \qquad l = 1, \dots, n \tag{9.51}$$
depicted as columns in the decomposition (9.42). But then the $\bar\beta_l$; $l = 1, \dots, n$ given in (9.43) are normally distributed – whether there are dependencies between the individuals $x_{1l}, x_{2l}, \dots, x_{ml}$, l fixed, remains immaterial. Moreover, as we consider the n successive sets (9.51) to be independent, the $\bar\beta_l$ are also independent.
Finally, the overall uncertainty of the weighted mean $\bar\beta$ is given by
$$u_{\bar\beta} = \frac{t_P}{\sqrt n} \sqrt{ \sum_{i,j}^{m} w_i w_j s_{ij} } + \sum_{i=1}^{m} w_i f_{s,i}\,. \tag{9.52}$$
It looks as if (9.50) would overestimate the influence of systematic errors, i.e. as if $f_{s,\bar\beta}$ might grow inadequately the greater the number m of input data. This is not, however, the case, as the following two examples suggest.
Given $u_{\bar x_1} = u_{\bar x_2} = u_{\bar x_3}$, we find $w_1 = w_2 = w_3 = 1/3$ so that
$$f_{s,\bar\beta} = \left( f_{s,1} + f_{s,2} + f_{s,3} \right)/3\,.$$
Given $u_{\bar x_1} = a$, $u_{\bar x_2} = 2a$, $u_{\bar x_3} = 3a$, we obtain $w_1 = 36/49$, $w_2 = 9/49$, $w_3 = 4/49$, i.e.
$$f_{s,\bar\beta} = \left( 36 f_{s,1} + 9 f_{s,2} + 4 f_{s,3} \right)/49\,.$$
Due to the weight factors $w_i$, the propagated systematic error preserves quite a reasonable order of magnitude even if the number of means entering the average should be large.
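The chain (9.37)–(9.52) lends itself to a compact numerical sketch. The following Python fragment is a hedged illustration with made-up data; the Student factor `t_P` is a placeholder for the appropriate quantile $t_P(n-1)$:

```python
import numpy as np

# Weighted grand mean of m arithmetic means and its overall uncertainty,
# following (9.41), (9.49), (9.50) and (9.52). All numbers are illustrative.
rng = np.random.default_rng(0)
m, n = 3, 10
x = rng.normal(1.0, 0.01, size=(m, n))     # m series of n repeat measurements
f_s = np.array([0.002, 0.001, 0.003])      # worst-case systematic error bounds
t_P = 2.26                                 # placeholder for t_P(n-1), P ~ 95%

xbar = x.mean(axis=1)
s_i = x.std(axis=1, ddof=1)
u_x = t_P / np.sqrt(n) * s_i + f_s         # uncertainties of the means, (9.37)

g = 1.0 / u_x                              # weights per (9.40)
w = g**2 / np.sum(g**2)                    # (9.41)

beta = w @ xbar                            # weighted grand mean, (9.41)
s = np.cov(x)                              # empirical variance-covariance, (9.48)
s2_beta = w @ s @ w                        # (9.49)
f_s_beta = w @ f_s                         # worst-case systematic part, (9.50)
u_beta = t_P / np.sqrt(n) * np.sqrt(s2_beta) + f_s_beta   # (9.52)
```

Note that the systematic part `f_s_beta` is a weighted average of the $f_{s,i}$ and therefore never exceeds the largest of them, in line with the two examples above.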
Let us summarize: we conclude that the Guide's recommendation to assign distribution densities to unknown systematic errors is scarcely tenable.
Over the years, appreciable efforts have been undertaken to develop a formalism allowing experimenters to randomize detected unknown systematic errors and to somehow repair the bothersome observation (8.57).
Ultimately, the inconsistencies in treating the influence of unknown systematic errors had condensed into a sheerly inextricable clew of different procedures competing with one another, similar to a Gordian knot. In connection with the adjustment of the fundamental constants of physics, even the abolition of the method of least squares was discussed [13].
Arguably, the idea of treating unknown systematic errors on a probabilistic basis was inspired by the objective of designing the lowest possible measurement uncertainties. The analysis of variance disclosed, however, a first inconsistency, Sect. 4.3. The computer simulations testing the reliability of the Guide's measurement uncertainties showed that there are situations in which even so-called 2σ uncertainties turn out to be too small, Sect. 6.4. The task of calculating the grand mean of a group of means accentuated anew the need to base data evaluations on true values, Sect. 9.3. Finally, as (8.57) underscored, (8.56) seems to reflect the actual metrological situation more realistically than (8.55) could.
On the other hand, we are in a position to undo the Gordian knot as soon as we are willing to abstain from randomizing systematic errors. Then we understand why the standard tools of data evaluation no longer produce reasonable results when applied to data of empirical origin. This suggests that we either revise those classical concepts according to the prevailing metrological situation or, if necessary, abolish them. Only then shall we be in a position to interlace our measurement results within a reliable and traceable metrological net of data.
Though a rigorous revision of the error calculus might appear uninviting, above all the aim is to extract a maximum of reliable, objectively justifiable information out of the measurements which, as is well known, require considerable efforts even in standard cases and virtually huge efforts in special cases.²
9.4 Uncertainty of a Function of Estimators
According to Sect. 6.5, we shall quote the uncertainty $u_\phi$ of a given function $\phi$ of the components $\bar\beta_1, \bar\beta_2, \dots, \bar\beta_r$ of the least squares estimator $\bar{\boldsymbol\beta}$, i.e. we shall look for
$$\bar\beta_1 \pm u_{\bar\beta_1},\; \bar\beta_2 \pm u_{\bar\beta_2},\; \dots,\; \bar\beta_r \pm u_{\bar\beta_r} \;\Rightarrow\; \phi\left( \bar\beta_1, \bar\beta_2, \dots, \bar\beta_r \right) \pm u_\phi\,. \tag{9.53}$$
For simplicity, we confine ourselves to a function of two estimators $\bar\beta_1, \bar\beta_2$ and, as usual, neglect the errors of the linearization of $\phi$. Inserting
$$\bar x_i = x_{0,i} + (\bar x_i - \mu_i) + f_i\,; \qquad i = 1, \dots, m$$
into
$$\bar\beta_1 = \sum_{i=1}^{m} b_{i1} \bar x_i \quad \text{and} \quad \bar\beta_2 = \sum_{i=1}^{m} b_{i2} \bar x_i$$
yields
$$\bar\beta_k = \beta_{0,k} + \sum_{i=1}^{m} b_{ik} (\bar x_i - \mu_i) + \sum_{i=1}^{m} b_{ik} f_i\,; \qquad k = 1, 2$$
so that

² The vacuum tube of the Large Hadron Collider (LHC) of the European Organization for Nuclear Research (CERN), Geneva, has a circumference of 27 km, and its magnets work near absolute zero. Among the results physicists are waiting for is the proof of the Higgs boson.
$$\phi\left( \bar\beta_1, \bar\beta_2 \right) = \phi\left( \beta_{0,1}, \beta_{0,2} \right) + \frac{\partial\phi}{\partial\bar\beta_1} \left[ \sum_{i=1}^{m} b_{i1} (\bar x_i - \mu_i) \right] + \frac{\partial\phi}{\partial\bar\beta_2} \left[ \sum_{i=1}^{m} b_{i2} (\bar x_i - \mu_i) \right] + \frac{\partial\phi}{\partial\bar\beta_1} \left[ \sum_{i=1}^{m} b_{i1} f_i \right] + \frac{\partial\phi}{\partial\bar\beta_2} \left[ \sum_{i=1}^{m} b_{i2} f_i \right]$$

or

$$\phi\left( \bar\beta_1, \bar\beta_2 \right) = \phi\left( \beta_{0,1}, \beta_{0,2} \right) + \sum_{i=1}^{m} \left( \frac{\partial\phi}{\partial\bar\beta_1} b_{i1} + \frac{\partial\phi}{\partial\bar\beta_2} b_{i2} \right) (\bar x_i - \mu_i) + \sum_{i=1}^{m} \left( \frac{\partial\phi}{\partial\bar\beta_1} b_{i1} + \frac{\partial\phi}{\partial\bar\beta_2} b_{i2} \right) f_i\,. \tag{9.54}$$
To abbreviate, we put
$$\chi_i = \frac{\partial\phi}{\partial\bar\beta_1} b_{i1} + \frac{\partial\phi}{\partial\bar\beta_2} b_{i2}\,; \qquad i = 1, \dots, m\,. \tag{9.55}$$
Then, (9.54) turns into
$$\phi\left( \bar\beta_1, \bar\beta_2 \right) = \phi\left( \beta_{0,1}, \beta_{0,2} \right) + \sum_{i=1}^{m} \chi_i (\bar x_i - \mu_i) + \sum_{i=1}^{m} \chi_i f_i\,. \tag{9.56}$$
For convenience, we also assemble the $\chi_i$; $i = 1, \dots, m$ into an (m × 1) column vector
$$\boldsymbol\chi = \left( \chi_1\; \chi_2\; \dots\; \chi_m \right)^{\mathrm T}. \tag{9.57}$$
First, we assess the second term on the right-hand side of (9.56), which, apparently, is due to random errors. With reference to the notations (9.2), (9.3) and (9.4),
$$\bar\beta_k = \sum_{i=1}^{m} b_{ik} \bar x_i = \frac{1}{n} \sum_{l=1}^{n} \bar\beta_{kl}\,; \qquad \bar\beta_{kl} = \sum_{i=1}^{m} b_{ik} x_{il}\,; \qquad k = 1, 2\,,$$
we expand $\phi(\bar\beta_{1l}, \bar\beta_{2l})$ throughout a neighborhood of $\bar\beta_1, \bar\beta_2$,
$$\begin{aligned}
\phi\left( \bar\beta_{1l}, \bar\beta_{2l} \right) &= \phi\left( \bar\beta_1, \bar\beta_2 \right) + \frac{\partial\phi}{\partial\bar\beta_1} \left( \bar\beta_{1l} - \bar\beta_1 \right) + \frac{\partial\phi}{\partial\bar\beta_2} \left( \bar\beta_{2l} - \bar\beta_2 \right) \\
&= \phi\left( \bar\beta_1, \bar\beta_2 \right) + \sum_{i=1}^{m} \left( \frac{\partial\phi}{\partial\bar\beta_1} b_{i1} + \frac{\partial\phi}{\partial\bar\beta_2} b_{i2} \right) (x_{il} - \bar x_i) \\
&= \phi\left( \bar\beta_1, \bar\beta_2 \right) + \sum_{i=1}^{m} \chi_i (x_{il} - \bar x_i)\,.
\end{aligned}$$
The difference $\phi(\bar\beta_{1l}, \bar\beta_{2l}) - \phi(\bar\beta_1, \bar\beta_2)$ leads us to the empirical variance
$$s_\phi^2 = \frac{1}{n-1} \sum_{l=1}^{n} \left[ \phi\left( \bar\beta_{1l}, \bar\beta_{2l} \right) - \phi\left( \bar\beta_1, \bar\beta_2 \right) \right]^2 = \boldsymbol\chi^{\mathrm T} s\, \boldsymbol\chi\,. \tag{9.58}$$
Here, s designates the empirical variance–covariance matrix of the input data (9.9) and the vector $\boldsymbol\chi$ has been defined in (9.57).
The third term on the right of (9.56),
$$f_\phi = \sum_{i=1}^{m} \chi_i f_i\,, \tag{9.59}$$
is due to the propagation of systematic errors. Its worst case estimation yields
$$f_{s,\phi} = \sum_{i=1}^{m} |\chi_i| f_{s,i}\,. \tag{9.60}$$
Finally, the overall uncertainty of the estimator $\phi(\bar\beta_1, \bar\beta_2)$ turns out to be
$$u_\phi = \frac{t_P(n-1)}{\sqrt n} \sqrt{ \boldsymbol\chi^{\mathrm T} s\, \boldsymbol\chi } + \sum_{i=1}^{m} |\chi_i| f_{s,i}\,. \tag{9.61}$$
Before any particular worst case estimate can be carried out, we have to merge in brackets all terms relating to the same $f_i$; otherwise the triangle inequality would overestimate the share of the overall uncertainty caused by systematic errors.
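Steps (9.55)–(9.61) can be sketched numerically. The fragment below is a hedged illustration with made-up data and an arbitrarily chosen function $\phi(\bar\beta_1, \bar\beta_2) = \bar\beta_1 \bar\beta_2$; the matrix `B` of least squares coefficients and the factor `t_P` are placeholders:

```python
import numpy as np

# Overall uncertainty of a function phi(beta1, beta2) of two least squares
# estimators, per (9.55)-(9.61). All inputs are illustrative placeholders.
rng = np.random.default_rng(1)
m, n = 4, 12
x = rng.normal([1.0, 2.0, 3.0, 4.0], 0.02, size=(n, m)).T   # repeat data
B = rng.normal(size=(m, 2))        # columns b_{i1}, b_{i2} of the adjustment
f_s = np.full(m, 0.005)            # worst-case systematic bounds
t_P = 2.20                         # placeholder for t_P(n-1)

xbar = x.mean(axis=1)
beta = B.T @ xbar                  # estimators beta1, beta2

# phi(beta1, beta2) = beta1 * beta2 (illustrative choice); its gradient:
dphi = np.array([beta[1], beta[0]])

chi = B @ dphi                     # chi_i per (9.55)
s = np.cov(x)                      # empirical variance-covariance matrix
s2_phi = chi @ s @ chi             # (9.58)
f_s_phi = np.abs(chi) @ f_s        # worst-case systematic part, (9.60)
u_phi = t_P / np.sqrt(n) * np.sqrt(s2_phi) + f_s_phi   # (9.61)
```

The absolute values in `f_s_phi` implement exactly the worst case estimation (9.60): the coefficients of each $f_i$ are merged first, then bounded.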
9.5 Uncertainty Spaces
The preceding section dwelled on the problem of how to estimate the uncertainty $u_\phi$ of a given function $\phi$ linking some or all of the least squares estimators. Now we turn toward the problem of spanning so-called uncertainty spaces by means of the estimators themselves.
Let us illustrate this notion by conceiving a least squares adjustment to a circle. Should it suffice to estimate just the radius r, we can confine ourselves to a result of the form $\bar r \pm u_{\bar r}$. However, if we additionally want to know all about where we may insert the tip of the compasses and how wide to open its legs, we have to revert to the couplings between the radius $\bar r$ and the coordinates $\bar x_M, \bar y_M$ of its center hidden within the adjustment's quoted results
$$\bar x_M \pm u_{\bar x_M}\,, \quad \bar y_M \pm u_{\bar y_M}\,; \quad \bar r \pm u_{\bar r}\,.$$
These intervals obviously imply all circles which are compatible with the input data and the error model. We expect the latter to provide a small circular region localizing the true center $x_0, y_0$ of the circle and a large annular region enclosing the true arc $x_0, y_0; r_0$. We shall treat this problem in Sect. 11.4.
The origin of the couplings between the components of the estimator $\bar{\boldsymbol\beta}$ is obvious. As each of the r components
$$\bar\beta_k = \sum_{i=1}^{m} b_{ik} \bar x_i = \beta_{0,k} + \sum_{i=1}^{m} b_{ik} (\bar x_i - \mu_i) + \sum_{i=1}^{m} b_{ik} f_i\,; \qquad k = 1, \dots, r \tag{9.62}$$
relies on one and the same set of input data, the $\bar\beta_k$ are necessarily dependent. The second term on the right-hand side expresses the couplings due to random errors. As the $\bar x_i$ imply the same number of repeat measurements, we gain access to the complete empirical variance–covariance matrix of the input data. Hence, we are in a position to formalize the said dependencies by Hotelling's density. Remarkably enough, the confidence ellipsoids provided by this density differ substantially from those referred to in the conventional error calculus. As is known, the conventional error calculus obtains its confidence ellipsoids from the exponent of the multidimensional normal probability density, so that they are based on unknown theoretical variances and covariances, thus hampering metrological interpretations.
Finally, the third term on the right-hand side of (9.62) formalizes the couplings due to systematic errors, i.e. to influences entirely ignored on the part of the Gaussian error calculus.
9.5.1 Confidence Ellipsoids
Let us reconsider the decomposition (9.3),
$$\bar\beta_k = \frac{1}{n} \sum_{l=1}^{n} \bar\beta_{kl}\,; \qquad k = 1, \dots, r$$
where, as (9.4) shows,
$$\bar\beta_{kl} = \sum_{i=1}^{m} b_{ik} x_{il}\,; \qquad k = 1, \dots, r\,, \quad l = 1, \dots, n\,.$$
Assuming normally distributed measured values $x_{il}$, the $\bar\beta_{kl}$ are also normally distributed; furthermore, as has already been discussed, they are independent. The empirical variance–covariance matrix of the $\bar\beta_k$ has been stated in (9.12),
$$s_{\bar\beta} = B^{\mathrm T} s\, B\,.$$
To establish Hotelling's ellipsoid, we refer to (3.73).³ Assembling the expectations
$$\mu_{\bar\beta_k} = E\left\{ \bar\beta_k \right\} = \beta_{0,k} + \sum_{i=1}^{m} b_{ik} f_i\,; \qquad k = 1, \dots, r \tag{9.63}$$
within a column vector $\boldsymbol\mu_{\bar\beta}$ as we did in (8.8), we have
$$t_P^2(r, n) = n \left( \bar{\boldsymbol\beta} - \boldsymbol\mu_{\bar\beta} \right)^{\mathrm T} \left( B^{\mathrm T} s B \right)^{-1} \left( \bar{\boldsymbol\beta} - \boldsymbol\mu_{\bar\beta} \right). \tag{9.64}$$

³ I am grateful to Dr. Wöger for calling my attention to this analogy.
The ellipsoid is centered in $\boldsymbol\mu_{\bar\beta}$; the probability of holding any realization of $\bar{\boldsymbol\beta}$ is given by
$$P = \int_0^{t_P(r,n)} p_T(t; r, n)\, \mathrm dt\,; \tag{9.65}$$
$p_T(t; r, n)$ has been stated in (3.75). Experimentally speaking, the vector $\boldsymbol\mu_{\bar\beta}$ cannot be known. On the other hand, we may refer (9.64) to $\bar{\boldsymbol\beta}$ and substitute an auxiliary vector $\boldsymbol\beta$ for $\boldsymbol\mu_{\bar\beta}$. The so-defined confidence ellipsoid
$$\frac{t_P^2(r, n)}{n} = \left( \boldsymbol\beta - \bar{\boldsymbol\beta} \right)^{\mathrm T} \left( B^{\mathrm T} s B \right)^{-1} \left( \boldsymbol\beta - \bar{\boldsymbol\beta} \right) \tag{9.66}$$
localizes the vector $\boldsymbol\mu_{\bar\beta}$ with probability P as given in (9.65). In principle, this procedure is similar to that we have used to define one-dimensional confidence intervals in Sect. 5.1.
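Testing whether a trial vector lies inside the confidence ellipsoid (9.66) amounts to evaluating one quadratic form. The sketch below is a hedged illustration; `B`, the data and the quantile `t_P` are made-up placeholders, not values from the text:

```python
import numpy as np

# Membership test for the Hotelling confidence ellipsoid (9.66):
# n * (beta - betabar)^T (B^T s B)^{-1} (beta - betabar) <= t_P^2.
rng = np.random.default_rng(2)
m, r, n = 5, 2, 8
B = rng.normal(size=(m, r))                  # least squares coefficients
x = rng.normal(0.0, 0.05, size=(m, n))       # m series of n repeat measurements

betabar = B.T @ x.mean(axis=1)
s = np.cov(x)                                # empirical variance-covariance
s_beta = B.T @ s @ B                         # (9.12)

def inside_ellipsoid(beta, t_P=3.0):
    """True if beta lies within the ellipsoid (9.66); t_P is a placeholder
    for the Hotelling quantile t_P(r, n)."""
    d = beta - betabar
    return n * d @ np.linalg.solve(s_beta, d) <= t_P**2
```

The center $\bar{\boldsymbol\beta}$ always satisfies the test, while points far from it fail, which is a quick sanity check on the sign and scaling of the quadratic form.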
9.5.2 The New Solids
The propagated systematic errors define a geometric solid as well. According to (9.62), it is constituted by the components
$$f_{\bar\beta_k} = \sum_{i=1}^{m} b_{ik} f_i\,; \qquad -f_{s,i} \le f_i \le f_{s,i}\,; \qquad k = 1, \dots, r \tag{9.67}$$
of the vector
$$\boldsymbol f_{\bar\beta} = B^{\mathrm T} \boldsymbol f = \left( f_{\bar\beta_1}\; f_{\bar\beta_2}\; \cdots\; f_{\bar\beta_r} \right)^{\mathrm T}. \tag{9.68}$$
In a sense, the solid comes into being by letting the vector
$$\boldsymbol f = \left( f_1\; f_2\; \cdots\; f_m \right)^{\mathrm T} \tag{9.69}$$
"scan" the set of points defined by the m-dimensional hypercuboid
$$-f_{s,i} \le f_i \le f_{s,i}\,; \qquad i = 1, 2, \dots, m \tag{9.70}$$
and lying within and on its boundary surfaces. It goes without saying that all vectors $\boldsymbol f$ of the null space $N(B^{\mathrm T})$ of the matrix $B^{\mathrm T}$ are mapped into the null vector $\boldsymbol f_{\bar\beta} = \boldsymbol 0$ and will not be of interest to us.
To begin with, let us focus on some fundamental properties of the linear singular mapping (9.67). The projected solid is

– enclosed within an r-dimensional hypercuboid of extensions
$$-f_{s,\bar\beta_k} \le f_{\bar\beta_k} \le f_{s,\bar\beta_k}\,; \qquad f_{s,\bar\beta_k} = \sum_{i=1}^{m} |b_{ik}|\, f_{s,i}\,; \qquad k = 1, 2, \dots, r \tag{9.71}$$

– convex, as the mapping is linear and the hypercuboid (9.70) is convex.⁴ For any $\tau \in [0, 1]$ we have
$$\boldsymbol f_{\bar\beta} = \boldsymbol f_{\bar\beta}^{(1)} + \tau \left( \boldsymbol f_{\bar\beta}^{(2)} - \boldsymbol f_{\bar\beta}^{(1)} \right) = B^{\mathrm T} \left[ \boldsymbol f_1 + \tau \left( \boldsymbol f_2 - \boldsymbol f_1 \right) \right];$$
as every point on the line connecting any two vectors $\boldsymbol f_1$ and $\boldsymbol f_2$ is a point of the hypercuboid (9.70), all points on the line connecting the vectors $\boldsymbol f_{\bar\beta}^{(1)}$ and $\boldsymbol f_{\bar\beta}^{(2)}$ are points of the (yet unknown) solid (9.67)

– point-symmetric, as together with $\boldsymbol f$ the vector $-\boldsymbol f$ also belongs to the hypercuboid, so that with $\boldsymbol f_{\bar\beta} = B^{\mathrm T} \boldsymbol f$ the point $-\boldsymbol f_{\bar\beta} = B^{\mathrm T}(-\boldsymbol f)$ likewise belongs to the solid.
As we shall see, the objects in question are polygons in the case of two measurands, polyhedra in the case of three and abstract polytopes if there are more than three measurands. Moreover, we suggest adding the prefix security, i.e. speaking of security polygons, security polyhedra and security polytopes. In a sense, this naming corresponds to the terms confidence ellipses and confidence ellipsoids.
Security Polygons
In the first instance, we shall confine ourselves to couplings between just two propagated systematic errors, $f_{\bar\beta_k}$; $k = 1, 2$. Assuming m = 5, we have
$$\begin{aligned}
f_{\bar\beta_1} &= b_{11} f_1 + b_{21} f_2 + b_{31} f_3 + b_{41} f_4 + b_{51} f_5 \\
f_{\bar\beta_2} &= b_{12} f_1 + b_{22} f_2 + b_{32} f_3 + b_{42} f_4 + b_{52} f_5\,,
\end{aligned} \tag{9.72}$$
where
$$-f_{s,i} \le f_i \le f_{s,i}\,; \qquad i = 1, \dots, 5\,. \tag{9.73}$$
Let us refer to a rectangular $f_{\bar\beta_1}, f_{\bar\beta_2}$ coordinate system. According to (9.71), the geometrical figure in question will be enclosed within a rectangle
$$-f_{s,\bar\beta_1}, \dots, f_{s,\bar\beta_1}\,; \qquad -f_{s,\bar\beta_2}, \dots, f_{s,\bar\beta_2}\,.$$
Let $b_{11} \neq 0$. Eliminating $f_1$, we arrive at a straight line,
$$f_{\bar\beta_2} = \frac{b_{12}}{b_{11}} f_{\bar\beta_1} + c_1\,, \tag{9.74}$$
⁴ Imagine a straight line connecting any two points of a given solid. If every point on the line is a point of the solid, the latter is called convex. There will then be neither hollows inside the solid nor folds in its hull.
where
$$c_1 = h_{12} f_2 + h_{13} f_3 + h_{14} f_4 + h_{15} f_5$$
and
$$\begin{aligned}
h_{12} &= \frac{1}{b_{11}} \left( b_{11} b_{22} - b_{12} b_{21} \right), & h_{13} &= \frac{1}{b_{11}} \left( b_{11} b_{32} - b_{12} b_{31} \right), \\
h_{14} &= \frac{1}{b_{11}} \left( b_{11} b_{42} - b_{12} b_{41} \right), & h_{15} &= \frac{1}{b_{11}} \left( b_{11} b_{52} - b_{12} b_{51} \right).
\end{aligned}$$
Any variation of $c_1$ shifts the line (9.74) parallel to itself; $c_1$ is maximal if
$$\begin{aligned}
f_2 &= f_2^* = \operatorname{sign}(h_{12})\, f_{s,2}\,, & f_3 &= f_3^* = \operatorname{sign}(h_{13})\, f_{s,3}\,, \\
f_4 &= f_4^* = \operatorname{sign}(h_{14})\, f_{s,4}\,, & f_5 &= f_5^* = \operatorname{sign}(h_{15})\, f_{s,5}\,.
\end{aligned} \tag{9.75}$$
But then
$$c_{s,1} = h_{12} f_2^* + h_{13} f_3^* + h_{14} f_4^* + h_{15} f_5^*$$
and
$$-c_{s,1} \le c_1 \le c_{s,1}\,. \tag{9.76}$$
Of all the lines specified in (9.74), the lines
$$f_{\bar\beta_2} = \frac{b_{12}}{b_{11}} f_{\bar\beta_1} + c_{s,1}\,, \qquad f_{\bar\beta_2} = \frac{b_{12}}{b_{11}} f_{\bar\beta_1} - c_{s,1} \tag{9.77}$$
hold maximum distance. Shifting $f_1$ within its interval $-f_{s,1}, \dots, f_{s,1}$, the coordinates $(f_{\bar\beta_1}, f_{\bar\beta_2})$, as defined through (9.77), slide along two of the edges of the geometrical figure to be constructed. Obviously, the pertaining vertices follow from (9.72) and (9.75),
$$\begin{aligned}
f_{\bar\beta_1} &= b_{11} f_1 + b_{21} f_2^* + b_{31} f_3^* + b_{41} f_4^* + b_{51} f_5^*\,, \\
f_{\bar\beta_2} &= b_{12} f_1 + b_{22} f_2^* + b_{32} f_3^* + b_{42} f_4^* + b_{52} f_5^*\,.
\end{aligned} \tag{9.78}$$
We finally have found two of the polygon edges and the associated vertices. Elimination of the other variables $f_2, f_3, \dots, f_m$ (m = 5 here) reveals that there will be at most m pairs of parallel edges and, in particular, that the figure in question will be a convex polygon.
Should $b_{11}$, for example, vanish, it would not be possible to eliminate $f_1$. Instead, (9.72) would yield two abscissas
$$f_{\bar\beta_1} = \pm \left[ \operatorname{sign}(b_{21})\, b_{21} f_{s,2} + \operatorname{sign}(b_{31})\, b_{31} f_{s,3} + \operatorname{sign}(b_{41})\, b_{41} f_{s,4} + \operatorname{sign}(b_{51})\, b_{51} f_{s,5} \right], \tag{9.79}$$
through each of which we may draw a vertical line parallel to the $f_{\bar\beta_2}$-axis. Along these vertical lines the coordinate $f_{\bar\beta_2}$ slides according to
$$f_{\bar\beta_2} = b_{12} f_1 \pm \left[ \operatorname{sign}(b_{21})\, b_{22} f_{s,2} + \operatorname{sign}(b_{31})\, b_{32} f_{s,3} + \operatorname{sign}(b_{41})\, b_{42} f_{s,4} + \operatorname{sign}(b_{51})\, b_{52} f_{s,5} \right]. \tag{9.80}$$
Fig. 9.3. Convex, point-symmetric polygon, m = 4
Example
We shall find the polygon defined by
$$B^{\mathrm T} = \begin{bmatrix} 1 & 2 & 3 & -1 \\ -1 & 3 & 2 & 1 \end{bmatrix} \quad \text{and} \quad -1 \le f_i \le 1\,; \quad i = 1, \dots, 4\,.$$
From
$$f_{\bar\beta_1} = f_1 + 2 f_2 + 3 f_3 - f_4\,; \qquad f_{\bar\beta_2} = -f_1 + 3 f_2 + 2 f_3 + f_4$$
we conclude that the polygon is bounded by a rectangle of size
$$-7 \le f_{\bar\beta_1} \le 7\,; \qquad -7 \le f_{\bar\beta_2} \le 7\,.$$
Furthermore, we see that there are, instead of four, just three pairs of straight lines, namely
$$\begin{aligned}
(g_1, g_2):\quad f_{\bar\beta_2} &= -f_{\bar\beta_1} \pm 10\,, \\
(g_3, g_4):\quad f_{\bar\beta_2} &= \tfrac{3}{2} f_{\bar\beta_1} \pm \tfrac{15}{2}\,, \\
(g_5, g_6):\quad f_{\bar\beta_2} &= \tfrac{2}{3} f_{\bar\beta_1} \pm 5\,.
\end{aligned}$$
Figure 9.3 depicts the polygon.
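Since the mapping (9.67) is linear, the security polygon is simply the convex hull of the images of the hypercube corners under $B^{\mathrm T}$. The following sketch (a hedged cross-check of the example, using Andrew's monotone chain rather than the author's edge-elimination procedure) reproduces the bounding rectangle and the six vertices of the three edge pairs:

```python
from itertools import product

# Map all corners of the hypercube |f_i| <= 1 through B^T; the convex hull
# of the images is the security polygon of the example.
BT = [[1, 2, 3, -1],
      [-1, 3, 2, 1]]

points = sorted({(sum(BT[0][i] * f[i] for i in range(4)),
                  sum(BT[1][i] * f[i] for i in range(4)))
                 for f in product((-1, 1), repeat=4)})

def cross(o, a, b):
    return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])

def convex_hull(pts):
    """Andrew's monotone chain; strict turns drop points lying on edges."""
    lower, upper = [], []
    for p in pts:
        while len(lower) > 1 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) > 1 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

hull = convex_hull(points)
print(len(hull))                       # 6 vertices: three pairs of parallel edges
print(max(abs(x) for x, _ in hull))    # 7, the bounding rectangle of (9.71)
```

The hull has six vertices, in agreement with the three pairs of parallel edges $(g_1, g_2), (g_3, g_4), (g_5, g_6)$ found above, and both coordinates stay within ±7.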
Remark: In least squares, treating r unknowns, we may happen to be interested in the couplings between r′ < r estimators, wanting to assign to the remaining r − r′ any fixed values. Considering such a situation, we add to the relations for $f_{\bar\beta_1}$ and $f_{\bar\beta_2}$ some third relation for $f_{\bar\beta_3}$. Let the new system be given by
$$\begin{aligned}
f_{\bar\beta_1} &= f_1 + 2 f_2 + 3 f_3 - f_4 \\
f_{\bar\beta_2} &= -f_1 + 3 f_2 + 2 f_3 + f_4 \\
f_{\bar\beta_3} &= 2 f_1 - f_2 + 3 f_3 + 3 f_4\,,
\end{aligned}$$
and let us look for the contour line $f_{\bar\beta_3} = 0$. Obviously, we have to consider the so-enlarged system as a whole, which, of course, leads us to a result quite different from that we got by treating the first two relations alone. Indeed, as Fig. 9.3 illustrates, the inner polygon depicting the new result encloses a smaller area than the previous outer one.
Security Polyhedra
We shall now investigate the geometrical properties of the solids established by three propagated systematic errors. Exemplifying the case m = 5, we consider
$$\begin{aligned}
f_{\bar\beta_1} &= b_{11} f_1 + b_{21} f_2 + b_{31} f_3 + b_{41} f_4 + b_{51} f_5 \\
f_{\bar\beta_2} &= b_{12} f_1 + b_{22} f_2 + b_{32} f_3 + b_{42} f_4 + b_{52} f_5 \\
f_{\bar\beta_3} &= b_{13} f_1 + b_{23} f_2 + b_{33} f_3 + b_{43} f_4 + b_{53} f_5\,,
\end{aligned} \tag{9.81}$$
where
$$-f_{s,i} \le f_i \le f_{s,i}\,; \qquad i = 1, \dots, 5 \tag{9.82}$$
is assumed. With respect to a rectangular $f_{\bar\beta_1}, f_{\bar\beta_2}, f_{\bar\beta_3}$ coordinate system, the solid in question is enclosed within a cuboid as given by (9.71). Writing (9.81) in the form
$$\begin{aligned}
b_{11} f_1 + b_{21} f_2 + b_{31} f_3 &= f_{\bar\beta_1} - b_{41} f_4 - b_{51} f_5 = \xi \\
b_{12} f_1 + b_{22} f_2 + b_{32} f_3 &= f_{\bar\beta_2} - b_{42} f_4 - b_{52} f_5 = \eta \\
b_{13} f_1 + b_{23} f_2 + b_{33} f_3 &= f_{\bar\beta_3} - b_{43} f_4 - b_{53} f_5 = \zeta
\end{aligned} \tag{9.83}$$
we may solve for $f_3$,
$$f_3 \begin{vmatrix} b_{11} & b_{21} & b_{31} \\ b_{12} & b_{22} & b_{32} \\ b_{13} & b_{23} & b_{33} \end{vmatrix} = \begin{vmatrix} b_{11} & b_{21} & \xi \\ b_{12} & b_{22} & \eta \\ b_{13} & b_{23} & \zeta \end{vmatrix}.$$
Defining
$$\lambda_{12} = b_{12} b_{23} - b_{22} b_{13}\,, \quad \mu_{12} = b_{21} b_{13} - b_{11} b_{23}\,, \quad \nu_{12} = b_{11} b_{22} - b_{21} b_{12}$$
we have
$$\lambda_{12} f_{\bar\beta_1} + \mu_{12} f_{\bar\beta_2} + \nu_{12} f_{\bar\beta_3} = c_{12}\,, \tag{9.84}$$
where
$$c_{12} = h_{12,3} f_3 + h_{12,4} f_4 + h_{12,5} f_5 \tag{9.85}$$
and
$$\begin{aligned}
h_{12,3} &= \lambda_{12} b_{31} + \mu_{12} b_{32} + \nu_{12} b_{33} \\
h_{12,4} &= \lambda_{12} b_{41} + \mu_{12} b_{42} + \nu_{12} b_{43} \\
h_{12,5} &= \lambda_{12} b_{51} + \mu_{12} b_{52} + \nu_{12} b_{53}\,.
\end{aligned}$$
Solving (9.83) for $f_3$, the variables $f_1$ and $f_2$ vanish. Varying $c_{12}$, which is possible through $f_3, f_4, f_5$, the plane (9.84) gets shifted parallel to itself. The choices
$$f_3 = f_3^* = \operatorname{sign}(h_{12,3})\, f_{s,3}\,, \quad f_4 = f_4^* = \operatorname{sign}(h_{12,4})\, f_{s,4}\,, \quad f_5 = f_5^* = \operatorname{sign}(h_{12,5})\, f_{s,5}$$
assign a maximum value to $c_{12}$,
$$c_{s,12} = h_{12,3} f_3^* + h_{12,4} f_4^* + h_{12,5} f_5^*$$
so that
$$-c_{s,12} \le c_{12} \le c_{s,12}\,.$$
Finally, we have found two of the faces of the solid, namely
$$\begin{aligned}
\lambda_{12} f_{\bar\beta_1} + \mu_{12} f_{\bar\beta_2} + \nu_{12} f_{\bar\beta_3} &= -c_{s,12} & (F_1) \\
\lambda_{12} f_{\bar\beta_1} + \mu_{12} f_{\bar\beta_2} + \nu_{12} f_{\bar\beta_3} &= c_{s,12}\,, & (F_2)
\end{aligned} \tag{9.86}$$
where, for convenience, the faces have been designated by $F_1$ and $F_2$. Within the vector
$$\boldsymbol f = \left( f_1\; f_2\; f_3\; f_4\; f_5 \right)^{\mathrm T},$$
the variables $f_1, f_2$ are free, while the variables $f_3, f_4, f_5$ are fixed. If $f_1, f_2$ vary, the vector $\boldsymbol f_{\bar\beta} = B^{\mathrm T} \boldsymbol f$ slides along the faces (9.86). Cyclically swapping the variables, we successively find all the faces of the solid, which, as we now see, is a convex polyhedron. As Table 9.4 indicates, the maximum number of faces is given by m(m − 1). Table 9.5 visualizes the swapping procedure.
If in (9.81), e.g., $b_{11}$ and $b_{12}$ vanish, we cannot eliminate $f_1$. Moreover, as $\nu_{12}$ becomes zero, (9.86) represents a pair of planes parallel to the $f_{\bar\beta_3}$-axis,
$$\lambda_{12} f_{\bar\beta_1} + \mu_{12} f_{\bar\beta_2} = \pm c_{s,12}\,, \qquad \nu_{12} = 0\,.$$
The cases $\lambda_{12} = 0$ and $\mu_{12} = 0$ may be treated correspondingly.
Table 9.4. Maximum number of faces of the polyhedron
× f1f2 f1f3 f1f4 f1f5
× × f2f3 f2f4 f2f5
× × × f3f4 f3f5
× × × × f4f5
× × × × ×
Table 9.5. Cyclic elimination, r = 3 and m = 5
          eliminated variables     remaining variables within the system
1st step f1 f2 ⇒ f3 f4 f5
2nd step f1 f3 ⇒ f4 f5 f2
3rd step f1 f4 ⇒ f5 f2 f3
4th step f1 f5 ⇒ f2 f3 f4
5th step f2 f3 ⇒ f4 f5 f1
6th step f2 f4 ⇒ f5 f1 f3
7th step f2 f5 ⇒ f1 f3 f4
8th step f3 f4 ⇒ f5 f1 f2
9th step f3 f5 ⇒ f1 f2 f4
10th step f4 f5 ⇒ f1 f2 f3
Vertices
The vertices on the faces (9.86) emerge as follows. Given $h_{12,i} \neq 0$; $i = 3, 4, 5$, we consider, as $f_3$, $f_4$ and $f_5$ are fixed, two sets of f-vectors referring to $F_1$ and $F_2$, namely
$$\begin{matrix}
(-f_{s,1} & -f_{s,2} & -f_3^* & -f_4^* & -f_5^*)^{\mathrm T} \\
(\;\;\,f_{s,1} & -f_{s,2} & -f_3^* & -f_4^* & -f_5^*)^{\mathrm T} \\
(-f_{s,1} & \;\;\,f_{s,2} & -f_3^* & -f_4^* & -f_5^*)^{\mathrm T} \\
(\;\;\,f_{s,1} & \;\;\,f_{s,2} & -f_3^* & -f_4^* & -f_5^*)^{\mathrm T}
\end{matrix} \qquad (F_1)$$
and
$$\begin{matrix}
(\;\;\,f_{s,1} & \;\;\,f_{s,2} & f_3^* & f_4^* & f_5^*)^{\mathrm T} \\
(-f_{s,1} & \;\;\,f_{s,2} & f_3^* & f_4^* & f_5^*)^{\mathrm T} \\
(\;\;\,f_{s,1} & -f_{s,2} & f_3^* & f_4^* & f_5^*)^{\mathrm T} \\
(-f_{s,1} & -f_{s,2} & f_3^* & f_4^* & f_5^*)^{\mathrm T}
\end{matrix} \qquad (F_2)$$
On each of the planes, these vectors mark exactly four vertices via $\boldsymbol f_{\bar\beta} = B^{\mathrm T} \boldsymbol f$. If in (9.85), e.g., $h_{12,5}$ vanished, we would encounter eight vertices instead of four on each of the faces (9.86). Should $h_{12,4}$ also be zero, there would be 16 vertices on each face. However, at least one of the coefficients $h_{12,i}$, $i = 3, 4, 5$ must be unequal to zero, as otherwise the plane (9.84) would pass through the origin.
Contour Lines
Let us assume that none of the coefficients of the variables $f_{\bar\beta_k}$; $k = 1, 2, 3$ in (9.84) vanishes. Solving (9.86) for $f_{\bar\beta_2}$ yields
$$\begin{aligned}
f_{\bar\beta_2} &= -\frac{\lambda_{12}}{\mu_{12}} f_{\bar\beta_1} - \frac{c_{s,12}}{\mu_{12}} - \frac{\nu_{12}}{\mu_{12}} f_{\bar\beta_3} & (F_1) \\
f_{\bar\beta_2} &= -\frac{\lambda_{12}}{\mu_{12}} f_{\bar\beta_1} + \frac{c_{s,12}}{\mu_{12}} - \frac{\nu_{12}}{\mu_{12}} f_{\bar\beta_3}\,. & (F_2)
\end{aligned} \tag{9.87}$$
Obviously, any fixed $f_{\bar\beta_3}$ according to
$$\begin{aligned}
f_{\bar\beta_3} &= b_{13} f_1 + b_{23} f_2 - b_{33} f_3^* - b_{43} f_4^* - b_{53} f_5^* = \text{const.} & (F_1) \\
f_{\bar\beta_3} &= b_{13} f_1 + b_{23} f_2 + b_{33} f_3^* + b_{43} f_4^* + b_{53} f_5^* = \text{const.} & (F_2)
\end{aligned} \tag{9.88}$$
yields an $f_{\bar\beta_1}, f_{\bar\beta_2}$ contour line along the pertaining face of the polyhedron. For $F_2$ we have
$$\begin{aligned}
f_{\bar\beta_{3,2}} &= |b_{13}|\, f_{s,1} + |b_{23}|\, f_{s,2} + b_{33} f_3^* + b_{43} f_4^* + b_{53} f_5^* \\
f_{\bar\beta_{3,1}} &= -|b_{13}|\, f_{s,1} - |b_{23}|\, f_{s,2} + b_{33} f_3^* + b_{43} f_4^* + b_{53} f_5^*
\end{aligned}$$
so that the interval
$$f_{\bar\beta_{3,1}} \le f_{\bar\beta_3} \le f_{\bar\beta_{3,2}} \qquad (F_2) \tag{9.89}$$
constitutes the range of admissible $f_{\bar\beta_3}$ values for contour lines alongside $F_2$. Analogously, for $F_1$ we find
$$-f_{\bar\beta_{3,2}} \le f_{\bar\beta_3} \le -f_{\bar\beta_{3,1}}\,. \qquad (F_1) \tag{9.90}$$
We may repeat this procedure with each of the m(m − 1)/2 available pairs of parallel planes of the kind (9.86). Then, choosing any contour level $f_{\bar\beta_3} = \text{const.}$ within the interval
$$-f_{s,\bar\beta_3}, \dots, f_{s,\bar\beta_3}$$
as defined in (9.71), we combine all straight lines of the kind (9.87) which possess an $f_{\bar\beta_3}$-value coping with this preset level. The search for the set of straight lines which actually constitutes a specific contour line $f_{\bar\beta_3} = \text{const.}$ is based on intervals of the kind (9.89) and (9.90). We then find the coordinates of the endpoints of the segments of the straight lines and string the segments
together, thus building a polygon, namely that which is defined by the intersection of a horizontal plane having height $f_{\bar\beta_3} = \text{const.}$ with the polyhedron. Stacking the contour lines on top of each other maps the polyhedron.
It might appear more advantageous to bring the polyhedron into being via its faces and not by its contour lines. On the other hand, we should bear in mind that the number of faces rises rapidly with m, thus provoking situations which might appear scarcely visually controllable. In the opinion of the author, it therefore seems simpler to refer to contour lines. As they appear stacked, the polyhedron's hull should be smoothed by means of a suitable image filter.⁵
Example
We shall find the surfaces and contour lines of the polyhedron
$$\boldsymbol f_{\bar\beta} = B^{\mathrm T} \boldsymbol f\,; \qquad B^{\mathrm T} = \begin{pmatrix} 1 & 2 & 3 & -1 & 2 \\ -1 & 3 & 2 & 2 & -1 \\ 2 & -1 & -3 & 3 & -3 \end{pmatrix}; \qquad -1 \le f_i \le 1\,; \quad i = 1, \dots, 5\,.$$
As the linear system
$$\begin{aligned}
f_{\bar\beta_1} &= f_1 + 2 f_2 + 3 f_3 - f_4 + 2 f_5 \\
f_{\bar\beta_2} &= -f_1 + 3 f_2 + 2 f_3 + 2 f_4 - f_5 \\
f_{\bar\beta_3} &= 2 f_1 - f_2 - 3 f_3 + 3 f_4 - 3 f_5
\end{aligned}$$
shows, the polyhedron is enclosed within the cuboid
$$-9 \le f_{\bar\beta_1} \le 9\,; \qquad -9 \le f_{\bar\beta_2} \le 9\,; \qquad -12 \le f_{\bar\beta_3} \le 12\,.$$
Cyclic elimination, Table 9.6, yields the coefficients λ, µ, ν and the constants $\pm c_s$ for (9.86) and (9.87). Figure 9.4 depicts the polyhedron.
Table 9.6. Cyclic elimination, m = 5, r = 3

     eliminated variables     λ     µ     ν    c_s   f_{β3,1}   f_{β3,2}
 1   f1, f2                  −5    −5     5    80       6         12
 2   f1, f3                  −1    −9     5    80       0         10
 3   f1, f4                  −7     5     1    76       2         12
 4   f1, f5                   5    −7     1    68      −6          4
 5   f2, f3                  −7    −3    −5    24      −6          2
 6   f2, f4                  11     5     7    38      −8          0
 7   f2, f5                 −10    −4    −8    38     −12         −4
 8   f3, f4                  12     6     8    42      −6          6
 9   f3, f5                  −9    −3    −7    34     −10          2
10   f4, f5                  −3    −3    −3    24     −12          0
⁵ I am grateful to my son Niels for designing and implementing the filter.
Fig. 9.4. Exemplary merging of a confidence ellipsoid and a security polyhedron into an EPC hull
Security Polytopes
Polyhedra in more than three dimensions, called polytopes, may be represented as a sequence of three-dimensional intersections, each defining an $f_{\bar\beta_1}, f_{\bar\beta_2}, f_{\bar\beta_3}$ polyhedron. Let, for example, r = 4. It would then be possible to vary one of the propagated systematic errors, say $f_{\bar\beta_4}$, in discrete steps and display the associated sequence of polyhedra. Thus, the couplings would be mapped by an ensemble of polyhedra.
9.5.3 EP Boundaries and EPC Hulls
In the following, confidence ellipsoids and security polytopes will be merged into overall uncertainty spaces. For r = 2 we have to combine confidence ellipses and security polygons, for r = 3 confidence ellipsoids and security polyhedra, and for r > 3 abstract r-dimensional ellipsoids and polytopes.
The boundary of a two-dimensional overall uncertainty region turns out to be convex. As we shall see, the boundary consists of the arc segments of a contiguous decomposition of the circumference of the Ellipse and, additionally, of the edges of the Polygon, i.e. arcs and edges alternately succeed one another. We therefore call these lines EP boundaries. In a sense, they resemble "convex slices of potatoes".
The boundary of a three-dimensional overall uncertainty space is also convex and comprises the segments of a contiguous decomposition of the Ellipsoid's "skin", the faces of the Polyhedron and, finally, certain parts of elliptical Cylinders providing smooth transitions between flat and curved sections of the surface. So-established boundaries will be called EPC hulls. A comparison with "convex potatoes" appears suggestive.
EP Boundaries
We refer to (9.62),
$$\begin{aligned}
\bar\beta_1 &= \beta_{0,1} + \left( \bar\beta_1 - \mu_{\bar\beta_1} \right) + f_{\bar\beta_1} \\
\bar\beta_2 &= \beta_{0,2} + \left( \bar\beta_2 - \mu_{\bar\beta_2} \right) + f_{\bar\beta_2}\,,
\end{aligned} \tag{9.91}$$
and resort to Fig. 9.5 to tackle the task of designing a two-dimensional uncertainty region within which to find the true values $\beta_{0,1}, \beta_{0,2}$ of the estimators $\bar\beta_1, \bar\beta_2$. With respect to the influence of random errors, we refer to the confidence ellipsoid (9.66), as the parameters $\mu_{\bar\beta_1}, \mu_{\bar\beta_2}$ happen to be unknown. Because of
$$s_{\bar\beta}^{-1} = \left( B^{\mathrm T} s B \right)^{-1} = \frac{1}{\left| s_{\bar\beta} \right|} \begin{pmatrix} s_{\bar\beta_2\bar\beta_2} & -s_{\bar\beta_1\bar\beta_2} \\ -s_{\bar\beta_2\bar\beta_1} & s_{\bar\beta_1\bar\beta_1} \end{pmatrix},$$
where $\left| s_{\bar\beta} \right| = s_{\bar\beta_1\bar\beta_1} s_{\bar\beta_2\bar\beta_2} - s_{\bar\beta_1\bar\beta_2}^2$, we arrive at
$$s_{\bar\beta_2\bar\beta_2} \left( \beta_1 - \bar\beta_1 \right)^2 - 2 s_{\bar\beta_1\bar\beta_2} \left( \beta_1 - \bar\beta_1 \right) \left( \beta_2 - \bar\beta_2 \right) + s_{\bar\beta_1\bar\beta_1} \left( \beta_2 - \bar\beta_2 \right)^2 = \left| s_{\bar\beta} \right| \frac{t_H^2(2, n)}{n}\,. \tag{9.92}$$
To simplify the notation, we put
$$x = \left( \beta_1 - \bar\beta_1 \right), \quad y = \left( \beta_2 - \bar\beta_2 \right),$$
$$a = s_{\bar\beta_2\bar\beta_2}\,, \quad b = -s_{\bar\beta_1\bar\beta_2}\,, \quad c = s_{\bar\beta_1\bar\beta_1}\,, \quad d = \left| s_{\bar\beta} \right| \frac{t_H^2(2, n)}{n}$$
so that
$$a x^2 + 2 b x y + c y^2 = d\,. \tag{9.93}$$
Transforming the equation of the tangent to the ellipse (9.93) at the point $x = x_E$, $y = y_E$ into Hesse's normal form, we may easily find that point of the security polygon which maximizes the two-dimensional overall uncertainty region "outwardly". The slope of the tangent at $(x_E, y_E)$,
$$y'_E = -\frac{a x_E + b y_E}{b x_E + c y_E}\,, \tag{9.94}$$
leads to the equation
$$y = y_E + y'_E (x - x_E) \tag{9.95}$$
of the tangent, the Hesse form of which is
$$-\frac{y'_E}{H} x + \frac{1}{H} y + \frac{y'_E x_E - y_E}{H} = 0\,, \tag{9.96}$$
Fig. 9.5. EP boundary localizing the true values β0,1, β0,2
where
$$H = -\operatorname{sign}\left( y'_E x_E - y_E \right) \sqrt{1 + {y'_E}^2}\,.$$
Let the $(\xi_P, \eta_P)$; $P = 1, 2, \dots$ denote the vertices of the polygon with respect to a rectangular, body-fixed $f_{\bar\beta_1}, f_{\bar\beta_2}$ coordinate system. For a given point $(x_E, y_E)$, we successively insert the coordinates of the vertices $(\xi_P, \eta_P)$; $P = 1, 2, \dots$ in (9.96) and select that vertex which produces the largest positive distance according to
$$-\frac{y'_E}{H} \left( x_E + \xi_P \right) + \frac{1}{H} \left( y_E + \eta_P \right) + \frac{y'_E x_E - y_E}{H} > 0\,; \qquad P = 1, 2, \dots\,. \tag{9.97}$$
Obviously, the coordinates
$$x_{EP} = x_E + \xi_P\,; \qquad y_{EP} = y_E + \eta_P \tag{9.98}$$
define a point of the boundary we are looking for. Now, all we have to do is move $(x_E, y_E)$ in discrete steps along the circumference of the ellipse and restart procedure (9.97), thus bringing into being, bit by bit, the boundary of the two-dimensional uncertainty region sought after. According to the error model, this line localizes the true values $\beta_{0,1}, \beta_{0,2}$ "quasi-certainly".
If the tangent runs parallel to one of the edges of the polygon, that edge as a whole becomes part of the boundary. Consider two successive points $(x_E, y_E)$, $(x_{E'}, y_{E'})$, the tangents of which are parallel to polygon edges. Then the arc segment lying in-between becomes, as shifted "outwardly", part of the boundary. Hence, the circumference of the ellipse gets contiguously decomposed, and each of its segments becomes part of the boundary of the two-dimensional uncertainty region. Finally, this line is made up of Elliptical arcs and Polygon edges. From this the term EP boundary has been derived.
Fig. 9.6. Exemplary merging of a confidence interval and a security polygon
Example
We shall merge the confidence ellipse
$$s_{\bar\beta} = \begin{pmatrix} 500 & 70 \\ 70 & 200 \end{pmatrix}; \qquad t_P(2, n)/\sqrt n = 1$$
and the security polygon
$$B^{\mathrm T} = \begin{pmatrix} 1 & 2 & 3 & -1 & 5 & -1 & 7 & 0 & -3 & 8 \\ -1 & 3 & 2 & 1 & -1 & 2 & 0 & 2 & 7 & -1 \end{pmatrix}; \qquad -1 \le f_i \le 1\,; \quad i = 1, \dots, 10$$
into an overall uncertainty region or EP boundary.

According to Sect. 3.6, the confidence ellipse is bounded by a rectangle with sides
$$\pm\sqrt{s_{\bar\beta_1\bar\beta_1}} = \pm\sqrt{500} = \pm 22.4\,, \qquad \pm\sqrt{s_{\bar\beta_2\bar\beta_2}} = \pm\sqrt{200} = \pm 14.1\,.$$
The rectangle which encloses the security polygon has dimensions
$$\pm 31 \quad \text{and} \quad \pm 20\,.$$
But then the EP boundary is contained in a rectangle with sides
$$\pm 53.4 \quad \text{and} \quad \pm 34.1\,.$$
Figure 9.6 illustrates the computer-aided putting together of the two figures.
EPC Hulls
Adding the component $\bar\beta_3$ according to (9.62),
$$\begin{aligned}
\bar\beta_1 &= \beta_{0,1} + \left( \bar\beta_1 - \mu_{\bar\beta_1} \right) + f_{\bar\beta_1} \\
\bar\beta_2 &= \beta_{0,2} + \left( \bar\beta_2 - \mu_{\bar\beta_2} \right) + f_{\bar\beta_2} \\
\bar\beta_3 &= \beta_{0,3} + \left( \bar\beta_3 - \mu_{\bar\beta_3} \right) + f_{\bar\beta_3}\,,
\end{aligned} \tag{9.99}$$
we may proceed as for two-dimensional uncertainty regions. Let us consider the simplest case: we place the center of the security polyhedron at some point $(x_E, y_E, z_E)$ of the ellipsoid's "skin" and shift the tangent plane at this point outwardly until it passes through that vertex of the polyhedron which maximizes the uncertainty space in question. The associated vertex $(\xi_P, \eta_P, \zeta_P)$ is found by trial and error. Finally, the coordinates
$$x_{EPZ} = x_E + \xi_P\,, \qquad y_{EPZ} = y_E + \eta_P\,, \qquad z_{EPZ} = z_E + \zeta_P \tag{9.100}$$
provide a point of the EPC hull to be constructed.

To begin with, we shall find a suitable representation of the tangent plane to the confidence ellipsoid. For convenience, we denote the inverse of the empirical variance–covariance matrix entering (9.66) merely symbolically, i.e. in
$$\frac{t_P^2(3, n)}{n} = \left( \boldsymbol\beta - \bar{\boldsymbol\beta} \right)^{\mathrm T} \left( B^{\mathrm T} s B \right)^{-1} \left( \boldsymbol\beta - \bar{\boldsymbol\beta} \right)$$
we set
$$\left( B^{\mathrm T} s B \right)^{-1} = \begin{pmatrix} \gamma_{11} & \gamma_{12} & \gamma_{13} \\ \gamma_{21} & \gamma_{22} & \gamma_{23} \\ \gamma_{31} & \gamma_{32} & \gamma_{33} \end{pmatrix}.$$
Defining
$$x = \left( \beta_1 - \bar\beta_1 \right), \quad y = \left( \beta_2 - \bar\beta_2 \right), \quad z = \left( \beta_3 - \bar\beta_3 \right), \quad d = t_P^2(3, n)/n\,,$$
the confidence ellipsoid turns into
$$\gamma_{11} x^2 + \gamma_{22} y^2 + \gamma_{33} z^2 + 2\gamma_{12} x y + 2\gamma_{23} y z + 2\gamma_{31} z x = d\,. \tag{9.101}$$
We now intersect the ellipsoid by a set of sufficiently close-spaced planes $z_E = \text{const.}$, where

$$ -\frac{t_P(r,n)}{\sqrt{n}}\sqrt{s_{\bar\beta_3\bar\beta_3}} \;\le\; z_E \;\le\; \frac{t_P(r,n)}{\sqrt{n}}\sqrt{s_{\bar\beta_3\bar\beta_3}}\,, \tag{9.102} $$
and shift the center of the polyhedron in discrete steps along any particular contour line $z_E = \text{const.}$ of the ellipsoid's "skin". During this slide, the spatial orientation of the polyhedron remains fixed. At each stop $(x_E, y_E, z_E)$ we construct the tangent plane
$$ n_x(x - x_E) + n_y(y - y_E) + n_z(z - z_E) = 0\,, \tag{9.103} $$

where $n_x, n_y, n_z$ denote the components of the normal

$$ \boldsymbol n = \begin{pmatrix} n_x \\ n_y \\ n_z \end{pmatrix} = \begin{pmatrix} \gamma_{11}x_E + \gamma_{12}y_E + \gamma_{31}z_E \\ \gamma_{12}x_E + \gamma_{22}y_E + \gamma_{23}z_E \\ \gamma_{31}x_E + \gamma_{23}y_E + \gamma_{33}z_E \end{pmatrix} \tag{9.104} $$
at $(x_E, y_E, z_E)$. By means of Hesse's normal form

$$ \frac{n_x}{H}x + \frac{n_y}{H}y + \frac{n_z}{H}z - \frac{n_x x_E + n_y y_E + n_z z_E}{H} = 0\,, \tag{9.105} $$

where

$$ H = \operatorname{sign}(n_x x_E + n_y y_E + n_z z_E)\sqrt{n_x^2 + n_y^2 + n_z^2}\,, $$
we are in a position to select that vertex of the polyhedron which maximizes the size of the three-dimensional uncertainty region "outwardly". This is easily done by successively inserting the vertices $(\xi_P, \eta_P, \zeta_P)$; $P = 1, 2, \dots$ of the polyhedron into (9.105), i.e. by testing the expressions

$$ \frac{n_x}{H}(x_E + \xi_P) + \frac{n_y}{H}(y_E + \eta_P) + \frac{n_z}{H}(z_E + \zeta_P) - \frac{n_x x_E + n_y y_E + n_z z_E}{H} > 0\,; \quad P = 1, 2, \dots \tag{9.106} $$

one after another. It goes without saying that the points $(x_E, y_E, z_E = \text{const.})$ of a given contour line of the ellipsoid will not produce a contour line of the three-dimensional uncertainty region in question. After all, this region is established through a dense system of curved, non-intersecting lines (9.100). To improve the graphical appearance of the stripy surface pattern we resort to a smoothing filter algorithm⁶ and a suitable "illumination" procedure.
As can be shown, the construction procedure implies a contiguous decomposition of the Ellipsoid's "skin". These segments, shifted "outwardly", become integral parts of the hull. The same applies to each of the faces of the Polyhedron. Finally, polyhedron edges, when shifted appropriately along the ellipsoid's skin, produce segments of elliptically curved Cylinders.

The itemized geometrical constituents, adjoining smoothly, have led to the term EPC hull.
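The vertex-selection step (9.103)–(9.106) can be sketched in a few lines: for a point $E$ on the ellipsoid's skin, the normal follows from the $\gamma$ matrix, and the polyhedron vertex maximizing the signed distance in Hesse's normal form is picked by trial and error. The matrices and vertices below are purely illustrative:

```python
import math

def normal(gamma, E):
    # n = gamma * E, cf. (9.104), for an ellipsoid centered at the origin
    return [sum(gamma[i][j] * E[j] for j in range(3)) for i in range(3)]

def select_vertex(gamma, E, vertices):
    n = normal(gamma, E)
    nE = sum(ni * Ei for ni, Ei in zip(n, E))
    # H carries the sign of n . E, cf. (9.105)
    H = math.copysign(math.sqrt(sum(ni * ni for ni in n)), nE)

    # test expression (9.106): it reduces to n . P / H for a vertex P
    def offset(P):
        return sum(ni * pi for ni, pi in zip(n, P)) / H

    return max(vertices, key=offset)

# hypothetical unit sphere (gamma = identity) and a small test polyhedron
gamma = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
E = [1.0, 0.0, 0.0]                       # point on the "skin"
vertices = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0.5, 0.5, 0.5)]
P = select_vertex(gamma, E, vertices)
hull_point = [e + p for e, p in zip(E, P)]   # cf. (9.100)
print(P, hull_point)  # (1, 0, 0) [2.0, 0.0, 0.0]
```

For the unit sphere, the outward-most vertex in the direction of the surface normal is selected, and adding it to $E$ gives one point of the hull.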
Example
We shall merge the confidence ellipsoid

$$ s_{\bar\beta} = \begin{pmatrix} 121 & -70 & -80 \\ -70 & 100 & 50 \\ -80 & 50 & 81 \end{pmatrix}; \quad t_P(3,n)/\sqrt{n} = 1 $$

and the security polyhedron

$$ B^{\mathrm T} = \begin{pmatrix} 1 & 2 & 3 & -1 & 2 \\ -1 & 3 & 2 & 2 & -1 \\ 2 & -1 & -3 & 3 & -3 \end{pmatrix}; \quad -1 \le f_i \le 1 $$

into an EPC hull.

The ellipsoid, polyhedron and EPC hull are enclosed within cuboids the side lengths of which are given by

$$ \pm 11, \pm 10, \pm 9\,; \qquad \pm 9, \pm 9, \pm 12 \qquad \text{and} \qquad \pm 20, \pm 19, \pm 21\,, $$

respectively.
⁶ I am grateful to my son Niels for designing and implementing the filter.
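The cuboid side lengths of this example can again be verified directly. A minimal sketch, assuming (as in the example) that $t_P(3,n)/\sqrt{n} = 1$ so that the ellipsoid's half-widths reduce to the square roots of the diagonal elements:

```python
import math

# Ellipsoid matrix and polyhedron matrix B^T from the example above.
s_beta = [[121, -70, -80], [-70, 100, 50], [-80, 50, 81]]
BT = [[1, 2, 3, -1, 2], [-1, 3, 2, 2, -1], [2, -1, -3, 3, -3]]

ellipsoid = [math.sqrt(s_beta[k][k]) for k in range(3)]
polyhedron = [sum(abs(e) for e in row) for row in BT]  # worst case |f_i| <= 1
epc = [a + b for a, b in zip(ellipsoid, polyhedron)]

print(ellipsoid)    # [11.0, 10.0, 9.0]
print(polyhedron)   # [9, 9, 12]
print(epc)          # [20.0, 19.0, 21.0]
```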
Part III
Special Linear and Linearized Systems
10 Systems with Two Parameters
10.1 Straight Lines
When fitting a straight line through m given data pairs, we distinguish three metrologically different situations. In the simplest case, we assume, by definition, the abscissas to be error-free and the ordinates to be individual, erroneous measurements
(x0,1, y1) , (x0,2, y2) , . . . , (x0,m, ym) . (10.1)
Next, again considering error-free abscissas, we assume the ordinates to be arithmetic means

$$ (x_{0,1}, \bar y_1)\,, (x_{0,2}, \bar y_2)\,, \dots, (x_{0,m}, \bar y_m)\,. \tag{10.2} $$

In general, however, both coordinates will exhibit the structure of arithmetic means

$$ (\bar x_1, \bar y_1)\,, (\bar x_2, \bar y_2)\,, \dots, (\bar x_m, \bar y_m)\,. \tag{10.3} $$
Let the latter be based on $n$ repeat measurements $x_{il}, y_{il}$, $l = 1, \dots, n$,

$$ \bar x_i = \frac{1}{n}\sum_{l=1}^{n} x_{il} = x_{0,i} + (\bar x_i - \mu_{x_i}) + f_{x_i}\,; \quad -f_{s,x_i} \le f_{x_i} \le f_{s,x_i} $$

$$ \bar y_i = \frac{1}{n}\sum_{l=1}^{n} y_{il} = y_{0,i} + (\bar y_i - \mu_{y_i}) + f_{y_i}\,; \quad -f_{s,y_i} \le f_{y_i} \le f_{s,y_i}\,. $$
The data pairs (10.2) imply error bars

$$ \bar y_i \pm u_{\bar y_i}\,; \quad i = 1, \dots, m\,, $$

and the pairs (10.3) error rectangles

$$ \bar x_i \pm u_{\bar x_i}\,, \quad \bar y_i \pm u_{\bar y_i}\,; \quad i = 1, \dots, m\,. $$
Obviously, the data given in (10.1) should also exhibit error bars. However, the experimenter is not in a position to quote them when finishing his measurements. But, as will be shown, error bars turn out to be assessable afterwards, namely when a straight line has been fitted through the measured data.
Given that the measurements are based on a linear physical model, the true values $(x_{0,i}, y_{0,i})$; $i = 1, \dots, m$ of the coordinate pairs should establish a true straight line
y0(x) = β0,1 + β0,2x , (10.4)
where $\beta_{0,1}$ and $\beta_{0,2}$ denote the true y-intercept and the true slope. A necessary condition for this to occur comes from the error bars (and error rectangles). If the latter prove reliable, they will localize the true values $(x_{0,i}, y_{0,i})$; $i = 1, \dots, m$. Consequently, it should be possible to draw a straight line either through all error bars (and error rectangles) or, at least, through as many as possible. Error bars (or rectangles) lying so far apart that they cannot be "hit" by the straight line do not fit the intersected ones. In the following we shall assume the linear model to be valid and, in the first step, assess the parameters
$$ \bar\beta_k \pm u_{\bar\beta_k}\,; \quad k = 1, 2 \tag{10.5} $$

of the fitted straight line

$$ y(x) = \bar\beta_1 + \bar\beta_2 x\,. \tag{10.6} $$

In the second step we shall find a so-called uncertainty band

$$ y(x) \pm u_{y(x)} \tag{10.7} $$
which we demand to enclose the true straight line (10.4).

Let us formalize a least squares adjustment based on the data pairs (10.3).
The erroneous data give rise to the linear system

$$ \bar y_i \approx \beta_1 + \beta_2 \bar x_i\,; \quad i = 1, \dots, m\,. \tag{10.8} $$

Introducing the vectors

$$ \bar{\boldsymbol y} = (\bar y_1\ \bar y_2\ \cdots\ \bar y_m)^{\mathrm T}\,, \quad \boldsymbol\beta = (\beta_1\ \beta_2)^{\mathrm T} \tag{10.9} $$

and the design matrix

$$ A = \begin{pmatrix} 1 & \bar x_1 \\ 1 & \bar x_2 \\ \cdots & \cdots \\ 1 & \bar x_m \end{pmatrix}, \tag{10.10} $$

(10.8) becomes

$$ A\boldsymbol\beta \approx \bar{\boldsymbol y}\,. \tag{10.11} $$
Obviously, the design matrix $A$ includes erroneous quantities. Though this aspect remains irrelevant with respect to the orthogonal projection of the vector $\bar{\boldsymbol y}$ onto the column space of the matrix $A$, it will complicate the estimation of measurement uncertainties. The orthogonal projection through $P = A(A^{\mathrm T}A)^{-1}A^{\mathrm T}$ as defined in (7.13) turns (10.11) into
$$ A\bar{\boldsymbol\beta} = P\,\bar{\boldsymbol y} \tag{10.12} $$

so that

$$ \bar{\boldsymbol\beta} = B^{\mathrm T}\bar{\boldsymbol y}\,; \quad B = A\left(A^{\mathrm T}A\right)^{-1}. \tag{10.13} $$

Denoting

$$ B = (b_{ik})\,; \quad i = 1, \dots, m\,,\ k = 1, 2\,, $$

the components $\bar\beta_1, \bar\beta_2$ of the solution vector $\bar{\boldsymbol\beta}$ take the form

$$ \bar\beta_1 = \sum_{i=1}^{m} b_{i1}\bar y_i\,, \quad \bar\beta_2 = \sum_{i=1}^{m} b_{i2}\bar y_i\,. \tag{10.14} $$
Finally, the least squares adjustment has confronted the true straight line (10.4) with the approximation formally introduced in (10.6). Let us present the quantities $b_{ik}$ in detail. The inversion of the matrix
$$ A^{\mathrm T}A = \begin{pmatrix} m & \sum_{j=1}^{m}\bar x_j \\ \sum_{j=1}^{m}\bar x_j & \sum_{j=1}^{m}\bar x_j^2 \end{pmatrix} \tag{10.15} $$

yields

$$ \left(A^{\mathrm T}A\right)^{-1} = \frac{1}{D}\begin{pmatrix} \sum_{j=1}^{m}\bar x_j^2 & -\sum_{j=1}^{m}\bar x_j \\ -\sum_{j=1}^{m}\bar x_j & m \end{pmatrix}; \quad D = m\sum_{j=1}^{m}\bar x_j^2 - \Bigl[\sum_{j=1}^{m}\bar x_j\Bigr]^2 \tag{10.16} $$
so that

$$ B = \frac{1}{D}\begin{bmatrix} \sum_{j=1}^{m}\bar x_j^2 - \bar x_1\sum_{j=1}^{m}\bar x_j & \;-\sum_{j=1}^{m}\bar x_j + m\bar x_1 \\ \sum_{j=1}^{m}\bar x_j^2 - \bar x_2\sum_{j=1}^{m}\bar x_j & \;-\sum_{j=1}^{m}\bar x_j + m\bar x_2 \\ \cdots & \cdots \\ \sum_{j=1}^{m}\bar x_j^2 - \bar x_m\sum_{j=1}^{m}\bar x_j & \;-\sum_{j=1}^{m}\bar x_j + m\bar x_m \end{bmatrix}, \tag{10.17} $$

and, letting $i = 1, \dots, m$,

$$ b_{i1} = \frac{1}{D}\Bigl[\sum_{j=1}^{m}\bar x_j^2 - \bar x_i\sum_{j=1}^{m}\bar x_j\Bigr]\,; \quad b_{i2} = \frac{1}{D}\Bigl[-\sum_{j=1}^{m}\bar x_j + m\bar x_i\Bigr]\,. \tag{10.18} $$
In the special case of error-free abscissas, we merely need to substitute the true values $x_{0,j}$ for the means $\bar x_j$,

$$ b_{i1} = \frac{1}{D}\Bigl[\sum_{j=1}^{m}x_{0,j}^2 - x_{0,i}\sum_{j=1}^{m}x_{0,j}\Bigr]\,; \quad b_{i2} = \frac{1}{D}\Bigl[-\sum_{j=1}^{m}x_{0,j} + m x_{0,i}\Bigr] \tag{10.19} $$

where

$$ D = m\sum_{j=1}^{m}x_{0,j}^2 - \Bigl[\sum_{j=1}^{m}x_{0,j}\Bigr]^2. $$
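The coefficients (10.18)/(10.19) are easy to compute directly. The sketch below, with illustrative numbers, also checks the column sums that the text uses later in (10.32):

```python
# Coefficients b_i1, b_i2 of a straight-line fit, computed from the abscissas.
def line_coefficients(x):
    m = len(x)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    D = m * sxx - sx * sx
    b1 = [(sxx - xi * sx) / D for xi in x]   # b_i1, cf. (10.18)
    b2 = [(-sx + m * xi) / D for xi in x]    # b_i2, cf. (10.18)
    return b1, b2

x = [0.0, 1.0, 2.0, 3.0, 4.0]
b1, b2 = line_coefficients(x)

# Column sums as used later in (10.32): sum b_i1 = 1, sum b_i2 = 0.
# On exact data y = 1 + 2x the estimators reproduce intercept and slope.
y = [1 + 2 * xi for xi in x]
beta1 = sum(b * yi for b, yi in zip(b1, y))
beta2 = sum(b * yi for b, yi in zip(b2, y))
print(round(sum(b1), 12), round(sum(b2), 12))  # 1.0 0.0
print(round(beta1, 12), round(beta2, 12))      # 1.0 2.0
```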
10.1.1 Error-free Abscissas, Erroneous Ordinates
Though in most cases both the abscissas and the ordinates will be affected by measurement errors, the standard literature confines itself to the case of error-free abscissas. To begin with, we will follow this tradition; the subsequent section will, however, deal with the more general situation.
No Repeat Measurements
In the simplest case there is just one measured ordinate for each abscissa. Let us further assume that to each of the ordinates we may attribute one and the same systematic error; otherwise, under the present conditions, we would not be in a position to estimate uncertainties, see Sect. 8.2.
Summarizing, our premises are as follows: We assume

– the true values $(x_{0,i}, y_{0,i})$; $i = 1, \dots, m$ to satisfy the linear model (10.4),
– the abscissas to be error-free,
– the random errors, superposed on the true values of the ordinates, to be independent and to stem from one and the same parent distribution,
– each ordinate to carry one and the same systematic error $f_{y_i} = f_y$; $i = 1, \dots, m$.
The input data

$$ \boldsymbol y = (y_1\ y_2\ \dots\ y_m)^{\mathrm T} $$
$$ y_i = y_{0,i} + \varepsilon_i + f_y = y_{0,i} + (y_i - \mu_{y_i}) + f_y\,; \quad -f_{s,y} \le f_y \le f_{s,y} \tag{10.20} $$

and the design matrix

$$ A = \begin{pmatrix} 1 & x_{0,1} \\ 1 & x_{0,2} \\ \cdots & \cdots \\ 1 & x_{0,m} \end{pmatrix} \tag{10.21} $$
define the inconsistent system

$$ A\boldsymbol\beta \approx \boldsymbol y \tag{10.22} $$

to be submitted to least squares. The coefficients $b_{i1}$ and $b_{i2}$ of the estimators

$$ \bar\beta_1 = \sum_{i=1}^{m} b_{i1}y_i \quad \text{and} \quad \bar\beta_2 = \sum_{i=1}^{m} b_{i2}y_i \tag{10.23} $$

have been quoted in (10.19). For convenience, let us collect the quantities defining the true linear system

$$ A\boldsymbol\beta_0 = \boldsymbol y_0\,, \quad \boldsymbol\beta_0 = (\beta_{0,1}\ \beta_{0,2})^{\mathrm T}\,, \quad \boldsymbol y_0 = (y_{0,1}\ y_{0,2}\ \dots\ y_{0,m})^{\mathrm T}. \tag{10.24} $$
Here, $\boldsymbol\beta_0$ designates the true solution vector and $\boldsymbol y_0$ the vector of the true values of the observations.
Next we prove that, given the above quoted conditions apply, the term $\boldsymbol f^{\mathrm T}\boldsymbol f - \boldsymbol f^{\mathrm T}P\boldsymbol f$ appearing in (8.17) vanishes. To this end we refer to an auxiliary vector $\boldsymbol\gamma_0$. As on each of the ordinates one and the same systematic error $f_y$ is superimposed, on account of

$$ A\boldsymbol\gamma_0 = \boldsymbol y_0 + \boldsymbol f\,; \quad \boldsymbol f = f_y (1\ 1\ \dots\ 1)^{\mathrm T} \tag{10.25} $$

$\boldsymbol\gamma_0$ has the components

$$ \boldsymbol\gamma_0 = \begin{pmatrix} \gamma_{0,1} \\ \gamma_{0,2} \end{pmatrix} = \begin{pmatrix} \beta_{0,1} + f_y \\ \beta_{0,2} \end{pmatrix} \tag{10.26} $$

as, evidently, the slope $\beta_{0,2}$ is not subject to change. Combining (10.24) and (10.25) yields

$$ A\boldsymbol\gamma_0 = A\boldsymbol\beta_0 + A\begin{pmatrix} f_y \\ 0 \end{pmatrix} = A\boldsymbol\beta_0 + \boldsymbol f\,, $$

i.e.

$$ A\begin{pmatrix} f_y \\ 0 \end{pmatrix} = \boldsymbol f\,, \quad P\boldsymbol f = \boldsymbol f \;\Rightarrow\; \boldsymbol f^{\mathrm T}\boldsymbol f - \boldsymbol f^{\mathrm T}P\boldsymbol f = 0 \tag{10.27} $$
with $P = A(A^{\mathrm T}A)^{-1}A^{\mathrm T}$. Consequently, we may refer to (8.11) and estimate the unknown theoretical variance

$$ \sigma^2 = E\bigl\{(Y_i - \mu_{y_i})^2\bigr\}\,; \quad \mu_{y_i} = E\{Y_i\}\,; \quad i = 1, \dots, m $$

of the input data¹ through

$$ s^2 = \frac{Q_{\text{unweighted}}}{m - 2} \tag{10.28} $$

so that, as shown in Sect. 8.2,

$$ \frac{(m-2)\,s^2}{\sigma^2} = \frac{Q_{\text{unweighted}}}{\sigma^2} = \chi^2(m-2)\,. \tag{10.29} $$
Finally, we are in a position to assign uncertainties to the individual measurements $y_i$, $i = 1, \dots, m$. Referring to (3.49), Student's

$$ T = \frac{(Y_i - \mu_{y_i})/\sigma}{S/\sigma} = \frac{(Y_i - \mu_{y_i})/\sigma}{\sqrt{\bigl[\chi^2(m-2)\bigr]/(m-2)}}\,; \quad i = 1, \dots, m $$

enables us to define the confidence intervals

$$ y_i - t_P(m-2)\,s \;\le\; \mu_{y_i} \;\le\; y_i + t_P(m-2)\,s\,; \quad i = 1, \dots, m $$

for the expectations $\mu_{y_i} = E\{Y_i\}$. We then associate overall uncertainties of the kind

$$ y_i \pm u_{y_i}\,; \quad u_{y_i} = t_P(m-2)\,s + f_{s,y}\,; \quad i = 1, \dots, m \tag{10.30} $$

with the individual measurements $y_i$; $i = 1, \dots, m$.
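The procedure (10.28)–(10.30) can be sketched numerically. The data, the quantile $t_P(m-2)$, and the bound $f_{s,y}$ below are illustrative; in practice $t_P$ would be taken from a Student's t table for the chosen confidence level:

```python
import math

# Error-free abscissas, one erroneous ordinate per abscissa.
x0 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y  = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]
m = len(x0)

sx, sxx = sum(x0), sum(v * v for v in x0)
D = m * sxx - sx * sx
b1 = [(sxx - xi * sx) / D for xi in x0]       # b_i1, cf. (10.19)
b2 = [(-sx + m * xi) / D for xi in x0]        # b_i2, cf. (10.19)
beta1 = sum(b * yi for b, yi in zip(b1, y))
beta2 = sum(b * yi for b, yi in zip(b2, y))

# Minimized sum of squared residuals Q and the estimate s^2 = Q/(m-2), (10.28)
Q = sum((yi - beta1 - beta2 * xi) ** 2 for xi, yi in zip(x0, y))
s = math.sqrt(Q / (m - 2))

t_P, f_s_y = 2.776, 0.05                      # assumed quantile and bound
u_y = t_P * s + f_s_y                         # overall uncertainty (10.30)
print(round(beta1, 3), round(beta2, 3), u_y > 0)  # 1.057 1.977 True
```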
Uncertainties of the Components of the Solution Vector
Though, at present, there are no repeat measurements, we shall still denote estimators by a bar on top. Inserting (10.20) into (10.14) yields
$$ \bar\beta_k = \sum_{i=1}^{m} b_{ik}\bigl[y_{0,i} + (y_i - \mu_{y_i}) + f_y\bigr] = \beta_{0,k} + \sum_{i=1}^{m} b_{ik}(y_i - \mu_{y_i}) + f_y\sum_{i=1}^{m} b_{ik}\,; \quad k = 1, 2\,. \tag{10.31} $$
The respective sums of the coefficients $b_{i1}$ and $b_{i2}$, as defined in (10.19), are given by

$$ \sum_{i=1}^{m} b_{i1} = 1 \quad \text{and} \quad \sum_{i=1}^{m} b_{i2} = 0 \tag{10.32} $$

so that the third term on the right-hand side of (10.31) gives the propagated systematic errors

$$ f_{\bar\beta_1} = f_y \quad \text{and} \quad f_{\bar\beta_2} = 0\,. \tag{10.33} $$

¹If, in contrast to our premise, the systematic errors vary along the sequence of measured ordinates, or should the fitted line deviate too much at its left and right "ends" from the empirical data pairs, implying that the latter do not fulfil the linear model (10.4), the estimator (10.28) will, of course, break down.
The empirical variance–covariance matrix of the solution vector

$$ s_{\bar\beta} = \begin{pmatrix} s_{\bar\beta_1\bar\beta_1} & s_{\bar\beta_1\bar\beta_2} \\ s_{\bar\beta_2\bar\beta_1} & s_{\bar\beta_2\bar\beta_2} \end{pmatrix} \tag{10.34} $$
can only be stated in analogy to the way the conventional error calculus proceeds, i.e. we first define a theoretical variance–covariance matrix and subsequently substitute empirical quantities for the inaccessible theoretical quantities. Denoting the expectations of the components $\bar\beta_1, \bar\beta_2$ of the solution vector as

$$ \mu_{\bar\beta_1} = \sum_{i=1}^{m} b_{i1}\mu_{y_i}\,, \quad \mu_{\bar\beta_2} = \sum_{i=1}^{m} b_{i2}\mu_{y_i}\,; \quad \mu_{y_i} = E\{Y_i\}\,, \tag{10.35} $$
the theoretical variances and the theoretical covariance turn out to be

$$ \sigma_{\bar\beta_1}^2 \equiv \sigma_{\bar\beta_1\bar\beta_1} = E\bigl\{(\bar\beta_1 - \mu_{\bar\beta_1})^2\bigr\} = E\Bigl\{\Bigl[\sum_{i=1}^{m} b_{i1}(Y_i - \mu_{y_i})\Bigr]^2\Bigr\} = \sigma^2\sum_{i=1}^{m} b_{i1}^2 $$

$$ \sigma_{\bar\beta_2}^2 \equiv \sigma_{\bar\beta_2\bar\beta_2} = \sigma^2\sum_{i=1}^{m} b_{i2}^2 \tag{10.36} $$

and

$$ \sigma_{\bar\beta_1\bar\beta_2} = \sigma^2\sum_{i=1}^{m} b_{i1}b_{i2}\,, \tag{10.37} $$

respectively. Obviously, their empirical counterparts are

$$ s_{\bar\beta_1\bar\beta_1} = s_{\bar\beta_1}^2 = s^2\sum_{i=1}^{m} b_{i1}^2\,, \quad s_{\bar\beta_2\bar\beta_2} = s_{\bar\beta_2}^2 = s^2\sum_{i=1}^{m} b_{i2}^2\,, \quad s_{\bar\beta_1\bar\beta_2} = s^2\sum_{i=1}^{m} b_{i1}b_{i2}\,. \tag{10.38} $$
According to (10.29) and (3.49), Student's t pertaining to $\bar\beta_1$ is given by

$$ T(m-2) = \frac{\bar\beta_1 - \mu_{\bar\beta_1}}{S\sqrt{\sum_{i=1}^{m} b_{i1}^2}} = \left(\frac{\bar\beta_1 - \mu_{\bar\beta_1}}{\sigma\sqrt{\sum_{i=1}^{m} b_{i1}^2}}\right)\left(\frac{S\sqrt{\sum_{i=1}^{m} b_{i1}^2}}{\sigma\sqrt{\sum_{i=1}^{m} b_{i1}^2}}\right)^{-1} = \frac{\bigl(\bar\beta_1 - \mu_{\bar\beta_1}\bigr)\Big/\Bigl(\sigma\sqrt{\sum_{i=1}^{m} b_{i1}^2}\Bigr)}{\sqrt{\dfrac{\chi^2(m-2)}{m-2}}}\,, \tag{10.39} $$

where, as usual, $S$ denotes the random variable formally associated with the empirical standard deviation $s$. Finally, we assign the uncertainties
$$ u_{\bar\beta_1} = t_P(m-2)\,s\sqrt{\sum_{i=1}^{m} b_{i1}^2} + f_{s,y}\,; \quad u_{\bar\beta_2} = t_P(m-2)\,s\sqrt{\sum_{i=1}^{m} b_{i2}^2} \tag{10.40} $$

to the components $\bar\beta_1, \bar\beta_2$ of the solution vector $\bar{\boldsymbol\beta}$.
Uncertainty Band
Inserting (10.31) into (10.6) yields

$$ y(x) = \bar\beta_1 + \bar\beta_2 x = (\beta_{0,1} + \beta_{0,2}x) + \sum_{i=1}^{m}(b_{i1} + b_{i2}x)(y_i - \mu_{y_i}) + f_y\,. \tag{10.41} $$

For any fixed $x$, we subtract the expectation

$$ \mu_{y(x)} = E\{Y(x)\} = (\beta_{0,1} + \beta_{0,2}x) + f_y \tag{10.42} $$

from (10.41) so that

$$ \sigma_{y(x)}^2 = E\bigl\{\bigl(Y(x) - \mu_{y(x)}\bigr)^2\bigr\} = \sigma^2\sum_{i=1}^{m}(b_{i1} + b_{i2}x)^2\,. \tag{10.43} $$
Obviously, an estimator for $\sigma_{y(x)}^2$ is

$$ s_{y(x)}^2 = s^2\sum_{i=1}^{m}(b_{i1} + b_{i2}x)^2\,. \tag{10.44} $$

Referring to

$$ T(m-2) = \frac{Y(x) - \mu_{y(x)}}{S\sqrt{\sum_{i=1}^{m}(b_{i1} + b_{i2}x)^2}}\,, \tag{10.45} $$

the uncertainty band in question turns out to be

$$ y(x) \pm u_{y(x)}\,; \quad u_{y(x)} = t_P(m-2)\,s\sqrt{\sum_{i=1}^{m}(b_{i1} + b_{i2}x)^2} + f_{s,y}\,. \tag{10.46} $$
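A sketch of the band (10.44)–(10.46), reusing the coefficients of (10.19); the data, quantile and systematic bound are again illustrative:

```python
import math

def band(x0, y, t_P, f_s_y, x):
    """Fit value and uncertainty u_y(x) per (10.46) at abscissa x."""
    m = len(x0)
    sx, sxx = sum(x0), sum(v * v for v in x0)
    D = m * sxx - sx * sx
    b1 = [(sxx - xi * sx) / D for xi in x0]
    b2 = [(-sx + m * xi) / D for xi in x0]
    beta1 = sum(b * yi for b, yi in zip(b1, y))
    beta2 = sum(b * yi for b, yi in zip(b2, y))
    Q = sum((yi - beta1 - beta2 * xi) ** 2 for xi, yi in zip(x0, y))
    s = math.sqrt(Q / (m - 2))
    s_yx = s * math.sqrt(sum((a + b * x) ** 2 for a, b in zip(b1, b2)))
    return beta1 + beta2 * x, t_P * s_yx + f_s_y

x0 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y  = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]
center, u = band(x0, y, t_P=2.776, f_s_y=0.05, x=2.5)
# At the mean abscissa the fitted line passes through (x_mean, y_mean).
print(round(center, 3), u > 0)  # 6.0 True
```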
Repeat Measurements
Repeat measurements are optional if one and the same systematic error is superimposed on each of the ordinates; however, they are indispensable if the ordinates are biased by different systematic errors. The components $\bar\beta_1$ and $\bar\beta_2$ of the solution vector $\bar{\boldsymbol\beta}$ are given in (10.14), the associated coefficients $b_{i1}$ and $b_{i2}$ in (10.19).
Uncertainties of the Components of the Solution Vector
We first quote the empirical variance–covariance matrix of the solution vector and the systematic errors of its components. With a view to Fig. 10.1 and the error equations

$$ \bar y_i = y_{0,i} + (\bar y_i - \mu_{y_i}) + f_{y_i}\,; \quad -f_{s,y_i} \le f_{y_i} \le f_{s,y_i}\,; \quad i = 1, \dots, m\,, $$

we "expand" the estimators $\bar\beta_k$, $k = 1, 2$ throughout a neighborhood of the point $(y_{0,1}, \dots, y_{0,m})$,

$$ \bar\beta_k(\bar y_1, \dots, \bar y_m) = \bar\beta_k(y_{0,1}, \dots, y_{0,m}) + \sum_{i=1}^{m}\frac{\partial\bar\beta_k}{\partial y_{0,i}}(\bar y_i - y_{0,i})\,; \quad k = 1, 2\,. $$

To abbreviate we set

$$ \bar\beta_k(y_{0,1}, \dots, y_{0,m}) = \beta_{0,k} \quad \text{and} \quad \frac{\partial\bar\beta_k}{\partial y_{0,i}} = b_{ik}\,. $$
Fig. 10.1. Mean value $\bar y_i$, true value $y_{0,i}$, systematic error $f_{y_i}$ and expectation $\mu_{y_i} = E\{Y_i\}$ of the $i$-th measured ordinate
On the other hand, we could have directly inserted the error equations for $\bar y_i$ into (10.14) to find

$$ \bar\beta_k(\bar y_1, \dots, \bar y_m) = \beta_{0,k} + \sum_{i=1}^{m} b_{ik}(\bar y_i - \mu_{y_i}) + \sum_{i=1}^{m} b_{ik}f_{y_i}\,; \quad k = 1, 2\,. \tag{10.47} $$
To assess the second term on the right-hand side, which expresses random errors, we refer to the individual measurements $y_{il}$ via

$$ \bar y_i = \frac{1}{n}\sum_{l=1}^{n} y_{il}\,; \quad i = 1, \dots, m\,. $$
Hence, (10.14) turns into

$$ \bar\beta_k = \sum_{i=1}^{m} b_{ik}\Bigl[\frac{1}{n}\sum_{l=1}^{n} y_{il}\Bigr] = \frac{1}{n}\sum_{l=1}^{n}\Bigl[\sum_{i=1}^{m} b_{ik}y_{il}\Bigr] = \frac{1}{n}\sum_{l=1}^{n}\beta_{kl}\,; \quad k = 1, 2\,, \tag{10.48} $$

where

$$ \beta_{kl} = \sum_{i=1}^{m} b_{ik}y_{il}\,. \tag{10.49} $$
From (10.14) and (10.49) we obtain

$$ \beta_{kl} - \bar\beta_k = \sum_{i=1}^{m} b_{ik}(y_{il} - \bar y_i)\,. \tag{10.50} $$
Thus, the elements of the empirical variance–covariance matrix of the solution vector take the form

$$ s_{\bar\beta_k\bar\beta_{k'}} = \frac{1}{n-1}\sum_{l=1}^{n}\bigl(\beta_{kl} - \bar\beta_k\bigr)\bigl(\beta_{k'l} - \bar\beta_{k'}\bigr) = \frac{1}{n-1}\sum_{l=1}^{n}\Bigl[\sum_{i=1}^{m} b_{ik}(y_{il} - \bar y_i)\Bigr]\Bigl[\sum_{j=1}^{m} b_{jk'}(y_{jl} - \bar y_j)\Bigr] = \sum_{i,j=1}^{m} b_{ik}b_{jk'}s_{ij}\,; \quad k, k' = 1, 2\,, \tag{10.51} $$

in which we recognize the elements

$$ s_{ij} = \frac{1}{n-1}\sum_{l=1}^{n}(y_{il} - \bar y_i)(y_{jl} - \bar y_j)\,; \quad i, j = 1, \dots, m \tag{10.52} $$

of the empirical variance–covariance matrix $s$ of the input data. Finally, we denote

$$ s_{\bar\beta} = \begin{pmatrix} s_{\bar\beta_1\bar\beta_1} & s_{\bar\beta_1\bar\beta_2} \\ s_{\bar\beta_2\bar\beta_1} & s_{\bar\beta_2\bar\beta_2} \end{pmatrix} = B^{\mathrm T}s\,B\,; \quad s_{\bar\beta_k\bar\beta_k} \equiv s_{\bar\beta_k}^2\,. \tag{10.53} $$
According to (10.48) and (10.49), the $\bar\beta_k$ are the arithmetic means of $n$ independent, normally distributed quantities $\beta_{kl}$. Consequently, we may define Student's confidence intervals

$$ \bar\beta_k - \frac{t_P(n-1)}{\sqrt{n}}s_{\bar\beta_k} \;\le\; \mu_{\bar\beta_k} \;\le\; \bar\beta_k + \frac{t_P(n-1)}{\sqrt{n}}s_{\bar\beta_k}\,; \quad k = 1, 2 \tag{10.54} $$

for the expectations $E\{\bar\beta_k\} = \mu_{\bar\beta_k}$; $k = 1, 2$. The propagated systematic errors follow from (10.47). Their worst case estimations turn out to be

$$ f_{s,\bar\beta_k} = \sum_{i=1}^{m}|b_{ik}|\,f_{s,y_i}\,; \quad k = 1, 2\,. \tag{10.55} $$

Finally, the overall uncertainties $u_{\bar\beta_k}$ of the estimators $\bar\beta_k \pm u_{\bar\beta_k}$, $k = 1, 2$ are given by

$$ u_{\bar\beta_k} = \frac{t_P(n-1)}{\sqrt{n}}s_{\bar\beta_k} + f_{s,\bar\beta_k} = \frac{t_P(n-1)}{\sqrt{n}}\sqrt{\sum_{i,j=1}^{m} b_{ik}b_{jk}s_{ij}} + \sum_{i=1}^{m}|b_{ik}|\,f_{s,y_i}\,; \quad k = 1, 2\,. \tag{10.56} $$
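The chain (10.52)–(10.56) translates directly into code. A minimal sketch with illustrative repeat measurements, an assumed $t$ quantile, and assumed bounds $f_{s,y_i}$:

```python
import math

x0 = [0.0, 1.0, 2.0]
Y = [[0.9, 1.1, 1.0],      # n = 3 repeat measurements of ordinate 1
     [3.1, 2.8, 3.0],      # ordinate 2
     [5.0, 5.2, 4.9]]      # ordinate 3
m, n = len(x0), len(Y[0])
ybar = [sum(row) / n for row in Y]

sx, sxx = sum(x0), sum(v * v for v in x0)
D = m * sxx - sx * sx
b = [[(sxx - xi * sx) / D, (-sx + m * xi) / D] for xi in x0]  # b_i1, b_i2

# Empirical variances/covariances s_ij of the input data, (10.52)
s = [[sum((Y[i][l] - ybar[i]) * (Y[j][l] - ybar[j]) for l in range(n)) / (n - 1)
      for j in range(m)] for i in range(m)]

# s_beta = B^T s B, cf. (10.51)/(10.53)
s_beta = [[sum(b[i][k] * b[j][kp] * s[i][j]
               for i in range(m) for j in range(m))
           for kp in range(2)] for k in range(2)]

t_P, f_s_y = 4.303, [0.05, 0.05, 0.05]   # assumed quantile and error bounds
u = [t_P / math.sqrt(n) * math.sqrt(s_beta[k][k])
     + sum(abs(b[i][k]) * f_s_y[i] for i in range(m)) for k in range(2)]
print(u[0] > 0 and u[1] > 0)  # True
```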
Equal Systematic Errors
Let $f_{y_i} = f_y$, $f_{s,y_i} = f_{s,y}$; $i = 1, \dots, m$. From (10.47) we then deduce

$$ f_{\bar\beta_k} = \sum_{i=1}^{m} b_{ik}f_{y_i} = f_y\sum_{i=1}^{m} b_{ik}\,. \tag{10.57} $$

Because of (10.32) we have

$$ f_{\bar\beta_1} = f_y\,; \quad f_{\bar\beta_2} = 0 \tag{10.58} $$

so that

$$ u_{\bar\beta_1} = \frac{t_P(n-1)}{\sqrt{n}}\sqrt{\sum_{i,j=1}^{m} b_{i1}b_{j1}s_{ij}} + f_{s,y} $$

$$ u_{\bar\beta_2} = \frac{t_P(n-1)}{\sqrt{n}}\sqrt{\sum_{i,j=1}^{m} b_{i2}b_{j2}s_{ij}}\,. \tag{10.59} $$
Uncertainty Band
As disclosed through (10.14) and (10.56), it clearly is not only a single straight line which is compatible with the input data and the error model. Rather, in addition to (10.6), arbitrarily many least squares lines would apply just as well. Speaking vividly, what we expect is something like the "bow net" shaped uncertainty band similar to (10.46). Inserting (10.47),

$$ \bar\beta_k(\bar y_1, \dots, \bar y_m) = \beta_{0,k} + \sum_{i=1}^{m} b_{ik}(\bar y_i - \mu_{y_i}) + \sum_{i=1}^{m} b_{ik}f_{y_i}\,; \quad k = 1, 2\,, $$

into (10.6),

$$ y(x) = \bar\beta_1 + \bar\beta_2 x\,, $$

yields

$$ y(x) = \beta_{0,1} + \beta_{0,2}x + \sum_{i=1}^{m}(b_{i1} + b_{i2}x)(\bar y_i - \mu_{y_i}) + \sum_{i=1}^{m}(b_{i1} + b_{i2}x)f_{y_i}\,. \tag{10.60} $$
We first estimate the contribution due to random errors. To this end, we insert (10.48),

$$ \bar\beta_k = \frac{1}{n}\sum_{l=1}^{n}\beta_{kl}\,; \quad k = 1, 2\,, $$

into (10.6):

$$ y(x) = \frac{1}{n}\sum_{l=1}^{n}\beta_{1l} + \frac{1}{n}\sum_{l=1}^{n}\beta_{2l}x = \frac{1}{n}\sum_{l=1}^{n}\bigl(\beta_{1l} + \beta_{2l}x\bigr) = \frac{1}{n}\sum_{l=1}^{n} y_l(x)\,. \tag{10.61} $$
Obviously,

$$ y_l(x) = \beta_{1l} + \beta_{2l}x\,; \quad l = 1, \dots, n \tag{10.62} $$

designates a fitted line which we would get if at every $i$-th measuring point we inserted just the $l$-th measured value $y_{il}$. Referring to (10.49) and to (10.14), we have

$$ \beta_{kl} = \sum_{i=1}^{m} b_{ik}y_{il}\,, \quad l = 1, \dots, n \quad \text{and} \quad \bar\beta_k = \sum_{i=1}^{m} b_{ik}\bar y_i\,. $$
Fig. 10.2. Measuring points, measurement uncertainties, least squares fit and "bow net" shaped uncertainty band
Then, for any fixed $x$, the difference

$$ y_l(x) - y(x) = \bigl(\beta_{1l} - \bar\beta_1\bigr) + \bigl(\beta_{2l} - \bar\beta_2\bigr)x = \sum_{i=1}^{m}(b_{i1} + b_{i2}x)(y_{il} - \bar y_i) \tag{10.63} $$

gives rise to an empirical variance

$$ s_{y(x)}^2 = \frac{1}{n-1}\sum_{l=1}^{n}\bigl[y_l(x) - y(x)\bigr]^2 = \boldsymbol b^{\mathrm T}s\,\boldsymbol b\,, \tag{10.64} $$

where for convenience the auxiliary vector

$$ \boldsymbol b = (b_{11} + b_{12}x\ \ b_{21} + b_{22}x\ \dots\ b_{m1} + b_{m2}x)^{\mathrm T} \tag{10.65} $$

has been introduced. According to (10.60), the propagated systematic error is given by
$$ f_{y(x)} = \sum_{i=1}^{m}(b_{i1} + b_{i2}x)f_{y_i}\,, \tag{10.66} $$

and its worst case estimate is

$$ f_{s,y(x)} = \sum_{i=1}^{m}|b_{i1} + b_{i2}x|\,f_{s,y_i}\,. \tag{10.67} $$
As shown in Fig. 10.2, for any fixed $x$, the uncertainty $u_{y(x)}$ of the least squares fit (10.6),

$$ y(x) = \bar\beta_1 + \bar\beta_2 x\,, $$

takes the form

$$ u_{y(x)} = \frac{t_P(n-1)}{\sqrt{n}}s_{y(x)} + f_{s,y(x)} = \frac{t_P(n-1)}{\sqrt{n}}\sqrt{\boldsymbol b^{\mathrm T}s\,\boldsymbol b} + \sum_{i=1}^{m}|b_{i1} + b_{i2}x|\,f_{s,y_i}\,. \tag{10.68} $$
Hence, the true straight line (10.4),
y0(x) = β0,1 + β0,2x ,
should lie within the uncertainty band
y(x) ± uy(x) . (10.69)
Equal Systematic Errors
Let $f_{y_i} = f_y$, $f_{s,y_i} = f_{s,y}$; $i = 1, \dots, m$. Because of (10.32), the relationship (10.66) turns into

$$ f_{y(x)} = f_y\sum_{i=1}^{m}(b_{i1} + b_{i2}x) = f_y\,, \tag{10.70} $$

i.e.

$$ u_{y(x)} = \frac{t_P(n-1)}{\sqrt{n}}s_{y(x)} + f_{s,y} = \frac{t_P(n-1)}{\sqrt{n}}\sqrt{\boldsymbol b^{\mathrm T}s\,\boldsymbol b} + f_{s,y}\,. \tag{10.71} $$

To check this, we compare (10.56) with (10.68) for the special case $y(0) = \bar\beta_1$. The uncertainty $u_{\bar\beta_1}$ should be equal to the uncertainty $u_{y(x)}$ at $x = 0$. If $x = 0$, the vector $\boldsymbol b$ as defined in (10.65) turns out to be

$$ \boldsymbol b_1 = (b_{11}\ b_{21}\ \dots\ b_{m1})^{\mathrm T}\,; $$

hence

$$ s_{y(0)}^2 = s_{\bar\beta_1\bar\beta_1} \equiv s_{\bar\beta_1}^2 \tag{10.72} $$

so that

$$ u_{y(0)} = \frac{t_P(n-1)}{\sqrt{n}}s_{\bar\beta_1} + \sum_{i=1}^{m}|b_{i1}|\,f_{s,y_i} \tag{10.73} $$

which agrees with (10.56).
Fig. 10.3. Measuring points $(\bar x_i, \bar y_i)$, expectations $(\mu_{x_i}, \mu_{y_i})$ and true values $(x_{0,i}, y_{0,i})$
10.1.2 Erroneous Abscissas, Erroneous Ordinates
It would be more realistic to consider both coordinates, the abscissas and the ordinates, to be affected by measurement errors, Fig. 10.3.
Uncertainties of the Components of the Solution Vector
Let us expand the elements $b_{ik}$ of the matrix $B$ as defined in (10.17). Indeed, under the conditions considered, these elements are erroneous so that, in contrast to (10.47), there is now no other choice:

$$ \bar\beta_k(\bar x_1, \dots, \bar y_m) = \bar\beta_k(x_{0,1}, \dots, y_{0,m}) + \sum_{i=1}^{m}\frac{\partial\bar\beta_k}{\partial x_{0,i}}(\bar x_i - x_{0,i}) + \sum_{i=1}^{m}\frac{\partial\bar\beta_k}{\partial y_{0,i}}(\bar y_i - y_{0,i}) + \cdots \tag{10.74} $$
Linearizing the expansion, approximating the derivatives in $(x_{0,1}, \dots, y_{0,m})$ through derivatives in $(\bar x_1, \dots, \bar y_m)$ and inserting the error equations

$$ \bar x_i = x_{0,i} + (\bar x_i - \mu_{x_i}) + f_{x_i}\,; \quad -f_{s,x_i} \le f_{x_i} \le f_{s,x_i} $$
$$ \bar y_i = y_{0,i} + (\bar y_i - \mu_{y_i}) + f_{y_i}\,; \quad -f_{s,y_i} \le f_{y_i} \le f_{s,y_i} $$

yields, as $\bar\beta_k(x_{0,1}, \dots, y_{0,m}) = \beta_{0,k}$,

$$ \bar\beta_1(\bar x_1, \dots, \bar y_m) = \beta_{0,1} + \sum_{i=1}^{m}\frac{\partial\bar\beta_1}{\partial\bar x_i}(\bar x_i - \mu_{x_i}) + \sum_{i=1}^{m}\frac{\partial\bar\beta_1}{\partial\bar y_i}(\bar y_i - \mu_{y_i}) + \sum_{i=1}^{m}\frac{\partial\bar\beta_1}{\partial\bar x_i}f_{x_i} + \sum_{i=1}^{m}\frac{\partial\bar\beta_1}{\partial\bar y_i}f_{y_i} \tag{10.75} $$

and

$$ \bar\beta_2(\bar x_1, \dots, \bar y_m) = \beta_{0,2} + \sum_{i=1}^{m}\frac{\partial\bar\beta_2}{\partial\bar x_i}(\bar x_i - \mu_{x_i}) + \sum_{i=1}^{m}\frac{\partial\bar\beta_2}{\partial\bar y_i}(\bar y_i - \mu_{y_i}) + \sum_{i=1}^{m}\frac{\partial\bar\beta_2}{\partial\bar x_i}f_{x_i} + \sum_{i=1}^{m}\frac{\partial\bar\beta_2}{\partial\bar y_i}f_{y_i}\,. \tag{10.76} $$
To establish the partial derivatives, we insert (10.18) into (10.14),

$$ \bar\beta_1 = \sum_{i=1}^{m} b_{i1}\bar y_i = \frac{1}{D}\sum_{i=1}^{m}\Bigl[\sum_{j=1}^{m}\bar x_j^2 - \bar x_i\sum_{j=1}^{m}\bar x_j\Bigr]\bar y_i = \frac{1}{D}\Bigl[\sum_{j=1}^{m}\bar y_j\sum_{j=1}^{m}\bar x_j^2 - \sum_{j=1}^{m}\bar x_j\bar y_j\sum_{j=1}^{m}\bar x_j\Bigr] $$

so that

$$ \frac{\partial\bar\beta_1}{\partial\bar x_i} = \frac{1}{D}\Bigl[2\bar x_i\sum_{j=1}^{m}\bar y_j - \bar y_i\sum_{j=1}^{m}\bar x_j - \sum_{j=1}^{m}\bar x_j\bar y_j\Bigr] - \frac{2\bar\beta_1}{D}\Bigl[m\bar x_i - \sum_{j=1}^{m}\bar x_j\Bigr] \tag{10.77} $$

and

$$ \frac{\partial\bar\beta_1}{\partial\bar y_i} = b_{i1} = \frac{1}{D}\Bigl[\sum_{j=1}^{m}\bar x_j^2 - \bar x_i\sum_{j=1}^{m}\bar x_j\Bigr]\,. \tag{10.78} $$
Similarly, from

$$ \bar\beta_2 = \sum_{i=1}^{m} b_{i2}\bar y_i = \frac{1}{D}\sum_{i=1}^{m}\Bigl[-\sum_{j=1}^{m}\bar x_j + m\bar x_i\Bigr]\bar y_i = \frac{1}{D}\Bigl[-\sum_{j=1}^{m}\bar x_j\sum_{j=1}^{m}\bar y_j + m\sum_{j=1}^{m}\bar x_j\bar y_j\Bigr] $$

we obtain

$$ \frac{\partial\bar\beta_2}{\partial\bar x_i} = \frac{1}{D}\Bigl[-\sum_{j=1}^{m}\bar y_j + m\bar y_i\Bigr] - \frac{2\bar\beta_2}{D}\Bigl[m\bar x_i - \sum_{j=1}^{m}\bar x_j\Bigr] \tag{10.79} $$

and

$$ \frac{\partial\bar\beta_2}{\partial\bar y_i} = b_{i2} = \frac{1}{D}\Bigl[-\sum_{j=1}^{m}\bar x_j + m\bar x_i\Bigr]\,. \tag{10.80} $$
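The analytic derivatives (10.77) and (10.79) can be validated against a central finite difference, which is a useful sanity check when implementing them. The data below are illustrative:

```python
def fit(x, y):
    """Least squares intercept and slope, cf. (10.14)-(10.18)."""
    m = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(a * b for a, b in zip(x, y))
    D = m * sxx - sx * sx
    return (sy * sxx - sxy * sx) / D, (m * sxy - sx * sy) / D

def dbeta_dxi(x, y, i):
    """Analytic derivatives (10.77) and (10.79) w.r.t. abscissa x_i."""
    m = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    D = m * sum(v * v for v in x) - sx * sx
    b1, b2 = fit(x, y)
    d1 = (2 * x[i] * sy - y[i] * sx - sxy) / D - 2 * b1 / D * (m * x[i] - sx)
    d2 = (-sy + m * y[i]) / D - 2 * b2 / D * (m * x[i] - sx)
    return d1, d2

x = [0.1, 1.2, 2.0, 2.9, 4.1]
y = [0.8, 3.1, 4.9, 7.2, 9.0]
i, h = 2, 1e-6
xp = x[:]; xp[i] += h
xm = x[:]; xm[i] -= h
num1 = (fit(xp, y)[0] - fit(xm, y)[0]) / (2 * h)   # numeric d beta1 / d x_i
num2 = (fit(xp, y)[1] - fit(xm, y)[1]) / (2 * h)   # numeric d beta2 / d x_i
ana1, ana2 = dbeta_dxi(x, y, i)
print(abs(ana1 - num1) < 1e-5, abs(ana2 - num2) < 1e-5)  # True True
```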
In (10.75) and (10.76), random and systematic errors are each expressed through two terms. Let us repeat the expansions of $\bar\beta_1$ and $\bar\beta_2$, this time, however, throughout a neighborhood of the point

$$ (\bar x_1, \dots, \bar x_m; \bar y_1, \dots, \bar y_m) $$

and with respect to the $l$-th repeat measurements of the measuring points $i = 1, \dots, m$, i.e. with respect to the data set

$$ (x_{1l}, \dots, x_{ml}; y_{1l}, \dots, y_{ml})\,; \quad l = 1, \dots, n\,. $$
We then find

$$ \beta_{1l}(x_{1l}, \dots, x_{ml}; y_{1l}, \dots, y_{ml}) = \bar\beta_1(\bar x_1, \dots, \bar x_m; \bar y_1, \dots, \bar y_m) + \sum_{i=1}^{m}\frac{\partial\bar\beta_1}{\partial\bar x_i}(x_{il} - \bar x_i) + \sum_{i=1}^{m}\frac{\partial\bar\beta_1}{\partial\bar y_i}(y_{il} - \bar y_i) + \cdots \tag{10.81} $$

and

$$ \beta_{2l}(x_{1l}, \dots, x_{ml}; y_{1l}, \dots, y_{ml}) = \bar\beta_2(\bar x_1, \dots, \bar x_m; \bar y_1, \dots, \bar y_m) + \sum_{i=1}^{m}\frac{\partial\bar\beta_2}{\partial\bar x_i}(x_{il} - \bar x_i) + \sum_{i=1}^{m}\frac{\partial\bar\beta_2}{\partial\bar y_i}(y_{il} - \bar y_i) + \cdots\,. \tag{10.82} $$
Obviously, the linearizations imply

$$ \bar\beta_k = \frac{1}{n}\sum_{l=1}^{n}\beta_{kl}\,; \quad k = 1, 2\,. $$
For convenience, we simplify the nomenclature by introducing the definitions

$$ v_{il} = x_{il}\,, \quad \bar v_i = \bar x_i\,, \quad \mu_i = \mu_{x_i}\,, \quad f_i = f_{x_i}\,, \quad f_{s,i} = f_{s,x_i} $$
$$ v_{i+m,l} = y_{il}\,, \quad \bar v_{i+m} = \bar y_i\,, \quad \mu_{i+m} = \mu_{y_i}\,, \quad f_{i+m} = f_{y_i}\,, \quad f_{s,i+m} = f_{s,y_i} \tag{10.83} $$
and

$$ c_{i1} = \frac{\partial\bar\beta_1}{\partial\bar x_i}\,, \quad c_{i+m,1} = \frac{\partial\bar\beta_1}{\partial\bar y_i}\,; \qquad c_{i2} = \frac{\partial\bar\beta_2}{\partial\bar x_i}\,, \quad c_{i+m,2} = \frac{\partial\bar\beta_2}{\partial\bar y_i}\,. \tag{10.84} $$
Applying this notation, the expansions (10.75), (10.76), (10.81) and (10.82) turn into

$$ \bar\beta_k = \beta_{0,k} + \sum_{i=1}^{2m} c_{ik}(\bar v_i - \mu_i) + \sum_{i=1}^{2m} c_{ik}f_i\,; \quad k = 1, 2 \tag{10.85} $$

and

$$ \beta_{kl} - \bar\beta_k = \sum_{i=1}^{2m} c_{ik}(v_{il} - \bar v_i)\,; \quad k = 1, 2\,. \tag{10.86} $$
From (10.86) we deduce the elements

$$ s_{\bar\beta_k\bar\beta_{k'}} = \frac{1}{n-1}\sum_{l=1}^{n}\bigl(\beta_{kl} - \bar\beta_k\bigr)\bigl(\beta_{k'l} - \bar\beta_{k'}\bigr) = \frac{1}{n-1}\sum_{l=1}^{n}\Bigl[\sum_{i=1}^{2m} c_{ik}(v_{il} - \bar v_i)\Bigr]\Bigl[\sum_{j=1}^{2m} c_{jk'}(v_{jl} - \bar v_j)\Bigr] = \sum_{i,j=1}^{2m} c_{ik}c_{jk'}s_{ij} \tag{10.87} $$

of the empirical variance–covariance matrix $s_{\bar\beta}$ of the solution vector, in which the

$$ s_{ij} = \frac{1}{n-1}\sum_{l=1}^{n}(v_{il} - \bar v_i)(v_{jl} - \bar v_j)\,; \quad i, j = 1, \dots, 2m $$

designate the elements of the (expanded) empirical variance–covariance matrix

$$ s = (s_{ij}) $$
of the input data. We condense the coefficients $c_{i1}$ and $c_{i2}$ into the auxiliary matrix

$$ C^{\mathrm T} = \begin{pmatrix} c_{11} & c_{21} & \cdots & c_{2m,1} \\ c_{12} & c_{22} & \cdots & c_{2m,2} \end{pmatrix} $$

and arrive, in analogy to (10.53), at

$$ s_{\bar\beta} = \begin{pmatrix} s_{\bar\beta_1\bar\beta_1} & s_{\bar\beta_1\bar\beta_2} \\ s_{\bar\beta_2\bar\beta_1} & s_{\bar\beta_2\bar\beta_2} \end{pmatrix} = C^{\mathrm T}s\,C\,; \quad s_{\bar\beta_k\bar\beta_k} \equiv s_{\bar\beta_k}^2\,. \tag{10.88} $$
Finally, in view of (10.85) and (10.87), we may assign the uncertainties

$$ u_{\bar\beta_k} = \frac{t_P(n-1)}{\sqrt{n}}\sqrt{\sum_{i,j=1}^{2m} c_{ik}c_{jk}s_{ij}} + \sum_{i=1}^{2m}|c_{ik}|\,f_{s,i}\,; \quad k = 1, 2 \tag{10.89} $$

to the estimators $\bar\beta_k \pm u_{\bar\beta_k}$, $k = 1, 2$.
Equal Systematic Errors
Let $f_{x_i} = f_x$, $f_{s,x_i} = f_{s,x}$; $f_{y_i} = f_y$, $f_{s,y_i} = f_{s,y}$; $i = 1, \dots, m$. From (10.75) we obtain

$$ \sum_{i=1}^{m}\frac{\partial\bar\beta_1}{\partial\bar x_i}f_{x_i} + \sum_{i=1}^{m}\frac{\partial\bar\beta_1}{\partial\bar y_i}f_{y_i} = f_x\sum_{i=1}^{m} c_{i1} + f_y\sum_{i=1}^{m} c_{i+m,1} = -f_x\bar\beta_2 + f_y $$

and from (10.76)

$$ \sum_{i=1}^{m}\frac{\partial\bar\beta_2}{\partial\bar x_i}f_{x_i} + \sum_{i=1}^{m}\frac{\partial\bar\beta_2}{\partial\bar y_i}f_{y_i} = f_x\sum_{i=1}^{m} c_{i2} + f_y\sum_{i=1}^{m} c_{i+m,2} = 0\,. $$

Hence, (10.89) turns into

$$ u_{\bar\beta_1} = \frac{t_P(n-1)}{\sqrt{n}}\sqrt{\sum_{i,j=1}^{2m} c_{i1}c_{j1}s_{ij}} + f_{s,x}\bigl|\bar\beta_2\bigr| + f_{s,y} \tag{10.90} $$

and

$$ u_{\bar\beta_2} = \frac{t_P(n-1)}{\sqrt{n}}\sqrt{\sum_{i,j=1}^{2m} c_{i2}c_{j2}s_{ij}}\,. \tag{10.91} $$
Uncertainty Band
Inserting (10.85) into the fitted line $y(x) = \bar\beta_1 + \bar\beta_2 x$, we obtain

$$ y(x) = \beta_{0,1} + \beta_{0,2}x + \sum_{i=1}^{2m}(c_{i1} + c_{i2}x)(\bar v_i - \mu_i) + \sum_{i=1}^{2m}(c_{i1} + c_{i2}x)f_i\,. \tag{10.92} $$

To estimate the contribution due to random errors we refer to the $n$ least squares lines of the repeat measurements,

$$ y_l(x) = \beta_{1l} + \beta_{2l}x\,; \quad l = 1, \dots, n\,. $$

Upon insertion of (10.86), the difference

$$ y_l(x) - y(x) = \bigl(\beta_{1l} - \bar\beta_1\bigr) + \bigl(\beta_{2l} - \bar\beta_2\bigr)x\,; \quad l = 1, \dots, n \tag{10.93} $$

changes into

$$ y_l(x) - y(x) = \sum_{i=1}^{2m}(c_{i1} + c_{i2}x)(v_{il} - \bar v_i)\,; \quad l = 1, \dots, n \tag{10.94} $$

so that

$$ s_{y(x)}^2 = \boldsymbol c^{\mathrm T}s\,\boldsymbol c\,, \tag{10.95} $$
Fig. 10.4. Measuring points, uncertainty rectangles, fitted straight line, "bow net" shaped uncertainty band and true straight line
where for convenience the auxiliary vector

$$ \boldsymbol c = (c_{11} + c_{12}x\ \ c_{21} + c_{22}x\ \cdots\ c_{2m,1} + c_{2m,2}x)^{\mathrm T} $$

has been introduced. As (10.92) shows, the propagated systematic error is given by

$$ f_{y(x)} = \sum_{i=1}^{2m}(c_{i1} + c_{i2}x)f_i\,. \tag{10.96} $$

Its worst case estimation yields

$$ f_{s,y(x)} = \sum_{i=1}^{2m}|c_{i1} + c_{i2}x|\,f_{s,i}\,. \tag{10.97} $$

As illustrated in Fig. 10.4, the "bow net" shaped uncertainty band takes the form

$$ y(x) \pm u_{y(x)}\,, \tag{10.98} $$

$$ u_{y(x)} = \frac{t_P(n-1)}{\sqrt{n}}s_{y(x)} + f_{s,y(x)} = \frac{t_P(n-1)}{\sqrt{n}}\sqrt{\boldsymbol c^{\mathrm T}s\,\boldsymbol c} + \sum_{i=1}^{2m}|c_{i1} + c_{i2}x|\,f_{s,i}\,. $$
Equal Systematic Errors
Let $f_{x_i} = f_x$, $f_{s,x_i} = f_{s,x}$; $f_{y_i} = f_y$, $f_{s,y_i} = f_{s,y}$; $i = 1, \dots, m$. Then, from (10.96) we deduce

$$ f_{y(x)} = -f_x\bar\beta_2 + f_y\,, \tag{10.99} $$

the worst case estimation being

$$ f_{s,y(x)} = f_{s,x}\bigl|\bar\beta_2\bigr| + f_{s,y}\,. \tag{10.100} $$

The modified overall uncertainty is given by

$$ u_{y(x)} = \frac{t_P(n-1)}{\sqrt{n}}\sqrt{\boldsymbol c^{\mathrm T}s\,\boldsymbol c} + f_{s,x}\bigl|\bar\beta_2\bigr| + f_{s,y}\,. \tag{10.101} $$
Again, we may convince ourselves that uy(0) = uβ1.
Example
Measurement of the linear coefficient of expansion. The length L(t) of a rod is to be measured as a function of the temperature t. Let α denote the linear coefficient of expansion. Then we have
L(t) = Lr [1 + α (t − tr)] = Lr + αLr (t − tr) .
Here, $L_r$ denotes the length of the rod at some reference temperature $t_r$. Setting $x = t - t_r$, $y = L(t)$, $\beta_1 = L_r$, and $\beta_2 = \alpha L_r$, we obtain
y(x) = β1 + β2x .
The uncertainties $u_{\bar\beta_1}, u_{\bar\beta_2}$ have already been quoted in (10.89). Consequently, we may confine ourselves to assessing the uncertainty $u_{\bar\alpha}$ of the coefficient

$$ \bar\alpha = \phi\bigl(\bar\beta_1, \bar\beta_2\bigr) = \bar\beta_2/\bar\beta_1\,, $$
see Sect. 9.4. Inserting (10.85),

$$ \bar\beta_k = \beta_{0,k} + \sum_{i=1}^{2m} c_{ik}(\bar v_i - \mu_i) + \sum_{i=1}^{2m} c_{ik}f_i\,; \quad k = 1, 2\,, $$

into the expansion

$$ \phi\bigl(\bar\beta_1, \bar\beta_2\bigr) = \phi\bigl(\beta_{0,1}, \beta_{0,2}\bigr) + \frac{\partial\phi}{\partial\beta_{0,1}}\bigl(\bar\beta_1 - \beta_{0,1}\bigr) + \frac{\partial\phi}{\partial\beta_{0,2}}\bigl(\bar\beta_2 - \beta_{0,2}\bigr) + \cdots $$

yields

$$ \phi\bigl(\bar\beta_1, \bar\beta_2\bigr) = \phi\bigl(\beta_{0,1}, \beta_{0,2}\bigr) + \sum_{i=1}^{2m}\Bigl(\frac{\partial\phi}{\partial\bar\beta_1}c_{i1} + \frac{\partial\phi}{\partial\bar\beta_2}c_{i2}\Bigr)(\bar v_i - \mu_i) + \sum_{i=1}^{2m}\Bigl(\frac{\partial\phi}{\partial\bar\beta_1}c_{i1} + \frac{\partial\phi}{\partial\bar\beta_2}c_{i2}\Bigr)f_i\,, $$

where the partial derivatives are given by

$$ \frac{\partial\phi}{\partial\bar\beta_1} \approx -\frac{\bar\alpha}{\bar\beta_1}\,; \qquad \frac{\partial\phi}{\partial\bar\beta_2} \approx \frac{1}{\bar\beta_1}\,. $$
Further expansion with respect to $\beta_{1l}$ and $\beta_{2l}$ leads to

$$ \phi\bigl(\beta_{1l}, \beta_{2l}\bigr) = \phi\bigl(\bar\beta_1, \bar\beta_2\bigr) + \frac{\partial\phi}{\partial\bar\beta_1}\bigl(\beta_{1l} - \bar\beta_1\bigr) + \frac{\partial\phi}{\partial\bar\beta_2}\bigl(\beta_{2l} - \bar\beta_2\bigr) $$

so that

$$ s_{\phi}^2 = \frac{1}{n-1}\sum_{l=1}^{n}\bigl[\phi\bigl(\beta_{1l}, \beta_{2l}\bigr) - \phi\bigl(\bar\beta_1, \bar\beta_2\bigr)\bigr]^2 = \Bigl(\frac{\partial\phi}{\partial\bar\beta_1}\Bigr)^2 s_{\bar\beta_1\bar\beta_1} + 2\Bigl(\frac{\partial\phi}{\partial\bar\beta_1}\frac{\partial\phi}{\partial\bar\beta_2}\Bigr)s_{\bar\beta_1\bar\beta_2} + \Bigl(\frac{\partial\phi}{\partial\bar\beta_2}\Bigr)^2 s_{\bar\beta_2\bar\beta_2}\,. $$
Hence, the uncertainty $u_{\bar\alpha}$ of the result $\bar\alpha \pm u_{\bar\alpha}$ follows from

$$ u_{\bar\alpha} = \frac{t_P(n-1)}{\sqrt{n}}\sqrt{\Bigl(\frac{\partial\phi}{\partial\bar\beta_1}\Bigr)^2 s_{\bar\beta_1\bar\beta_1} + 2\Bigl(\frac{\partial\phi}{\partial\bar\beta_1}\frac{\partial\phi}{\partial\bar\beta_2}\Bigr)s_{\bar\beta_1\bar\beta_2} + \Bigl(\frac{\partial\phi}{\partial\bar\beta_2}\Bigr)^2 s_{\bar\beta_2\bar\beta_2}} + \sum_{i=1}^{2m}\Bigl|\frac{\partial\phi}{\partial\bar\beta_1}c_{i1} + \frac{\partial\phi}{\partial\bar\beta_2}c_{i2}\Bigr|\,f_{s,i}\,. $$
10.2 Linearization
Let $\phi(x, y; a, b) = 0$ be some non-linear function whose parameters $a$ and $b$ are to be estimated by means of $m$ measured data pairs $(x_i, y_i)$; $i = 1, \dots, m$. In order to linearize the function through series expansion and subsequent truncation, we have to assume that estimates or starting values $a^*, b^*$ exist. The series expansion yields

$$ \phi(x, y; a, b) = \phi(x, y; a^*, b^*) + \frac{\partial\phi(x, y; a^*, b^*)}{\partial a^*}(a - a^*) + \frac{\partial\phi(x, y; a^*, b^*)}{\partial b^*}(b - b^*) + \cdots \tag{10.102} $$

Clearly, the closer the estimates $a^*, b^*$ are to the true values $a_0, b_0$, the smaller the linearization errors. Considering the differences

$$ \beta_1 = a - a^* \quad \text{and} \quad \beta_2 = b - b^* \tag{10.103} $$

as unknowns and setting

$$ u_i = \frac{\partial\phi(x_i, y_i; a^*, b^*)/\partial b^*}{\partial\phi(x_i, y_i; a^*, b^*)/\partial a^*}\,, \qquad v_i = -\frac{\phi(x_i, y_i; a^*, b^*)}{\partial\phi(x_i, y_i; a^*, b^*)/\partial a^*}\,, \tag{10.104} $$
relation (10.102) turns into

$$ \beta_1 + \beta_2 u_i \approx v_i\,; \quad i = 1, \dots, m\,. \tag{10.105} $$
The least squares estimators $\bar\beta_1$ and $\bar\beta_2$ lead to the quantities

$$ \bar a = a^* + \bar\beta_1 \quad \text{and} \quad \bar b = b^* + \bar\beta_2 \tag{10.106} $$

which, given the procedure converges, will be closer to the true values $a_0, b_0$ than the starting values $a^*, b^*$. In this case we shall cyclically repeat the adjustment until

$$ \bar\beta_1 \approx 0 \quad \text{and} \quad \bar\beta_2 \approx 0\,. \tag{10.107} $$

If the input data were error-free, the cycles would reduce the estimators $\bar\beta_1, \bar\beta_2$ down to the numerical uncertainty of the computer. However, as the input data are erroneous, after a certain number of cycles, the numerical values of the estimators $\bar\beta_1, \bar\beta_2$ will start to oscillate around some residuals which are different from zero and reflect the degree of accuracy of the input data.
The uncertainties of the estimators $\bar\beta_1, \bar\beta_2$ pass to the uncertainties of the parameters $\bar a$ and $\bar b$ according to

$$ u_{\bar\beta_1} = u_{\bar a}\,; \quad u_{\bar\beta_2} = u_{\bar b} \tag{10.108} $$

as the quantities $a^*, b^*$ enter error-free. The estimation of the uncertainties $u_{\bar\beta_1}, u_{\bar\beta_2}$ has been explained in Sect. 10.1.
10.3 Transformation
Assume that a given non-linear relationship may be linearized by means of coordinate transformations. Then, initially, there will be no linearization errors which otherwise would be superimposed on the measurement errors. On the other hand, when transferring the uncertainties of the estimators of the linearized model to the estimators of the primary, non-linear model, we have to linearize as well. Which of the two approaches – linearization or transformation of a given non-linear function – turns out to be more favourable seems to depend on the strategy aimed at.
Example
Given m measured data pairs $(x_i, y_i)$; $i = 1, \dots, m$ and a non-linear relationship $\phi(x, y; a, b) = 0$, we shall estimate the parameters $a$ and $b$. Let
$$ \phi(x, y; a, b) = y - ax^b\,. \tag{10.109} $$
Taking logarithms yields

$$ \ln y = \ln a + b\ln x\,. \tag{10.110} $$
Setting v = ln y, u = ln x, β1 = ln a, and β2 = b changes (10.110) into
β1 + β2u = v . (10.111)
From (10.14) and (10.89), we find the parameters and their uncertainties, say

$$ \bar\beta_k \pm u_{\bar\beta_k}\,, \quad k = 1, 2\,. \tag{10.112} $$
To estimate the uncertainty $u_{\bar a}$ of the parameter $\bar a$, we linearize

$$ \bar a = e^{\bar\beta_1} \tag{10.113} $$

with respect to the points $\bar a$ and $a_0$,

$$ a_l = \bar a + e^{\bar\beta_1}\bigl(\beta_{1,l} - \bar\beta_1\bigr)\,; \quad l = 1, \dots, n\,, $$
$$ \bar a = a_0 + e^{\bar\beta_1}\bigl(\bar\beta_1 - \beta_{0,1}\bigr)\,. \tag{10.114} $$

From the first expansion we obtain the uncertainty due to random errors and from the second one the uncertainty due to propagated systematic errors, so that

$$ \bar a \pm u_{\bar a}\,; \quad u_{\bar a} = e^{\bar\beta_1}u_{\bar\beta_1}\,. \tag{10.115} $$
Finally, as $\bar b = \bar\beta_2$, we have

$$ \bar b \pm u_{\bar b}\,; \quad u_{\bar b} = u_{\bar\beta_2}\,. \tag{10.116} $$
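The transformation route can be sketched end to end: fit a straight line in the coordinates $u = \ln x$, $v = \ln y$, then map $\bar\beta_1 = \ln \bar a$ back via $\bar a = e^{\bar\beta_1}$ and propagate the uncertainty per (10.115). The data and the value of $u_{\bar\beta_1}$ are illustrative:

```python
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0 * xi ** 1.5 for xi in x]        # exact power law: a0 = 3, b0 = 1.5

u = [math.log(xi) for xi in x]           # u = ln x
v = [math.log(yi) for yi in y]           # v = ln y
m = len(u)
su, sv = sum(u), sum(v)
suu = sum(t * t for t in u)
suv = sum(p * q for p, q in zip(u, v))
D = m * suu - su * su
beta1 = (sv * suu - suv * su) / D        # estimate of ln a
beta2 = (m * suv - su * sv) / D          # estimate of b

a = math.exp(beta1)                      # back-transform (10.113)
u_beta1 = 0.01                           # assumed uncertainty of beta1
u_a = math.exp(beta1) * u_beta1          # propagated to a, cf. (10.115)
print(round(a, 6), round(beta2, 6))      # 3.0 1.5
```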
11 Systems with Three Parameters
11.1 Planes
The adjustment of three-parametric relationships calls for coordinate triples

$$ (\bar x_i, \bar y_i, \bar z_i)\,; \quad i = 1, \dots, m\,. $$
As usual, we assign to each of the measurands $\bar x_i, \bar y_i, \bar z_i$ an error equation:

$$ \bar x_i = \frac{1}{n}\sum_{l=1}^{n} x_{il} = x_{0,i} + (\bar x_i - \mu_{x_i}) + f_{x_i}\,; \quad -f_{s,x_i} \le f_{x_i} \le f_{s,x_i} $$
$$ \bar y_i = \frac{1}{n}\sum_{l=1}^{n} y_{il} = y_{0,i} + (\bar y_i - \mu_{y_i}) + f_{y_i}\,; \quad -f_{s,y_i} \le f_{y_i} \le f_{s,y_i} $$
$$ \bar z_i = \frac{1}{n}\sum_{l=1}^{n} z_{il} = z_{0,i} + (\bar z_i - \mu_{z_i}) + f_{z_i}\,; \quad -f_{s,z_i} \le f_{z_i} \le f_{s,z_i}\,. $$
Let us now consider the least squares adjustment of planes, see Fig. 11.1.
Solution Vector
Inserting $m > 4$ erroneous coordinate triples into the equation of the plane

$$ z = \beta_1 + \beta_2 x + \beta_3 y \tag{11.1} $$

yields an inconsistent linear system of the kind

$$ \beta_1 + \beta_2\bar x_i + \beta_3\bar y_i \approx \bar z_i\,; \quad i = 1, \dots, m\,. \tag{11.2} $$

For convenience, we define the column vectors

$$ \bar{\boldsymbol z} = (\bar z_1\ \bar z_2\ \dots\ \bar z_m)^{\mathrm T}\,, \quad \boldsymbol\beta = (\beta_1\ \beta_2\ \beta_3)^{\mathrm T} \tag{11.3} $$

and a design matrix

$$ A = \begin{pmatrix} 1 & \bar x_1 & \bar y_1 \\ 1 & \bar x_2 & \bar y_2 \\ \cdots & \cdots & \cdots \\ 1 & \bar x_m & \bar y_m \end{pmatrix} $$
Fig. 11.1. Least squares plane, measured and true data triples
so that (11.2) turns into

$$ A\boldsymbol\beta \approx \bar{\boldsymbol z}\,. \tag{11.4} $$

The solution vector is

$$ \bar{\boldsymbol\beta} = B^{\mathrm T}\bar{\boldsymbol z}\,; \quad B = A\left(A^{\mathrm T}A\right)^{-1}. \tag{11.5} $$
Uncertainties of the Parameters
We immediately see the difficulties incurred in estimating the measurement uncertainties and the uncertainty "bowls" which should enclose the true plane: firstly, a $(3 \times 3)$ matrix has to be inverted algebraically, and, secondly, the components of the solution vector $\bar{\boldsymbol\beta}$ have to be linearized. However, not every user needs to bother about these procedures, as there will be software packages implementing the formalism we are looking for.
Expanding the components of the solution vector (11.5) throughout a neighborhood of the point $(x_{0,1}, \dots, x_{0,m}; y_{0,1}, \dots, y_{0,m}; z_{0,1}, \dots, z_{0,m})$ with respect to the means $\bar x_1, \dots, \bar x_m; \bar y_1, \dots, \bar y_m; \bar z_1, \dots, \bar z_m$ and making use of the common approximations yields, letting $k = 1, 2, 3$,

$$ \bar\beta_k(\bar x_1, \dots, \bar x_m; \bar y_1, \dots, \bar y_m; \bar z_1, \dots, \bar z_m) = \bar\beta_k(x_{0,1}, \dots, x_{0,m}; y_{0,1}, \dots, y_{0,m}; z_{0,1}, \dots, z_{0,m}) + \sum_{j=1}^{m}\frac{\partial\bar\beta_k}{\partial\bar x_j}(\bar x_j - x_{0,j}) + \sum_{j=1}^{m}\frac{\partial\bar\beta_k}{\partial\bar y_j}(\bar y_j - y_{0,j}) + \sum_{j=1}^{m}\frac{\partial\bar\beta_k}{\partial\bar z_j}(\bar z_j - z_{0,j})\,. \tag{11.6} $$
The β0,k = βk(x0,1 . . . , x0,m; y0,1, . . . , y0,m; z0,1, . . . , z0,m) designate the truevalues of the coefficients of the plane (11.1). The systematic errors of theestimators βk are given by
f_{β_k} = ∑_{j=1}^{m} (∂β_k/∂x_j) f_{x_j} + ∑_{j=1}^{m} (∂β_k/∂y_j) f_{y_j} + ∑_{j=1}^{m} (∂β_k/∂z_j) f_{z_j} ;   k = 1, 2, 3 .   (11.7)
To present the empirical variance–covariance matrix of the solution vector, we expand the β_k; k = 1, 2, 3 in

(x_1, …, x_m; y_1, …, y_m; z_1, …, z_m)

with respect to the l-th individual measurements

x_{1l}, …, x_{ml}; y_{1l}, …, y_{ml}; z_{1l}, …, z_{ml}

so that
β_{kl}(x_{1l}, …, x_{ml}; y_{1l}, …, y_{ml}; z_{1l}, …, z_{ml})
   = β_k(x_1, …, x_m; y_1, …, y_m; z_1, …, z_m)   (11.8)
   + ∑_{j=1}^{m} (∂β_k/∂x_j)(x_{jl} − x_j) + ∑_{j=1}^{m} (∂β_k/∂y_j)(y_{jl} − y_j) + ∑_{j=1}^{m} (∂β_k/∂z_j)(z_{jl} − z_j) ,

where l = 1, …, n. The partial derivatives

c_{ik} = ∂β_k/∂x_i ,   c_{i+m,k} = ∂β_k/∂y_i ,   c_{i+2m,k} = ∂β_k/∂z_i ;   i = 1, …, m   (11.9)
are given in Appendix F. To abbreviate, we introduce the notations
v_{il} = x_{il}        v_{i+m,l} = y_{il}        v_{i+2m,l} = z_{il}
v_i = x_i              v_{i+m} = y_i             v_{i+2m} = z_i
µ_i = µ_{x_i}          µ_{i+m} = µ_{y_i}         µ_{i+2m} = µ_{z_i}
f_i = f_{x_i}          f_{i+m} = f_{y_i}         f_{i+2m} = f_{z_i}
f_{s,i} = f_{s,x_i}    f_{s,i+m} = f_{s,y_i}     f_{s,i+2m} = f_{s,z_i}
so that (11.6) turns into
β_k = β_{0,k} + ∑_{j=1}^{3m} c_{jk}(v_j − µ_j) + ∑_{j=1}^{3m} c_{jk} f_j ;   k = 1, 2, 3   (11.10)

while (11.8) yields

β_{kl} − β_k = ∑_{j=1}^{3m} c_{jk}(v_{jl} − v_j) ;   k = 1, 2, 3 .   (11.11)
From this we find
s_{β_k β_{k′}} = (1/(n−1)) ∑_{l=1}^{n} [ ∑_{i=1}^{3m} c_{ik}(v_{il} − v_i) ] [ ∑_{j=1}^{3m} c_{jk′}(v_{jl} − v_j) ]

   = ∑_{i,j=1}^{3m} c_{ik} c_{jk′} [ (1/(n−1)) ∑_{l=1}^{n} (v_{il} − v_i)(v_{jl} − v_j) ]

   = ∑_{i,j=1}^{3m} c_{ik} c_{jk′} s_{ij}   (11.12)
where the
s_{ij} = (1/(n−1)) ∑_{l=1}^{n} (v_{il} − v_i)(v_{jl} − v_j) ;   i, j = 1, …, 3m   (11.13)

denote the empirical variances and covariances of the input data. Assembling the s_{ij} and the c_{ik} within matrices
s = (sij) (11.14)
and
C^T = ( c_{11} c_{21} ··· c_{3m,1}
        c_{12} c_{22} ··· c_{3m,2}
        c_{13} c_{23} ··· c_{3m,3} )   (11.15)

respectively, we may condense (11.12) to

s_β = C^T s C .   (11.16)
Furthermore, from (11.10) we infer
f_{β_k} = ∑_{j=1}^{3m} c_{jk} f_j ;   k = 1, 2, 3 .   (11.17)

Finally, the uncertainties of the parameters β_k; k = 1, 2, 3 of the fitted least squares plane
z(x, y) = β1 + β2x + β3y (11.18)
are given by
u_{β_k} = (t_P(n−1)/√n) √( ∑_{i,j=1}^{3m} c_{ik} c_{jk} s_{ij} ) + ∑_{j=1}^{3m} |c_{jk}| f_{s,j} ;   k = 1, 2, 3 .   (11.19)
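A hedged numerical sketch of this machinery: instead of evaluating the derivatives c_{ik} of Appendix F analytically, one may exploit (11.11) and refit the plane to each l-th set of individual measurements, whose scatter about the fit to the means reproduces the covariance (11.12) to first order. The simulated data, the seed and the t-quantile stand-in below are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 8, 12                                       # m triples, n repeats each
x0 = np.array([0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0])
y0 = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])
z0 = 1.0 + 2.0 * x0 - 0.5 * y0                     # assumed true plane

# simulated repeat measurements (random errors only, for brevity)
X = x0 + 0.01 * rng.standard_normal((n, m))
Y = y0 + 0.01 * rng.standard_normal((n, m))
Z = z0 + 0.01 * rng.standard_normal((n, m))

def fit_plane(x, y, z):
    A = np.column_stack([np.ones_like(x), x, y])
    return np.linalg.solve(A.T @ A, A.T @ z)

beta = fit_plane(X.mean(0), Y.mean(0), Z.mean(0))  # fit to the means
beta_l = np.array([fit_plane(X[l], Y[l], Z[l]) for l in range(n)])

# empirical covariance of the beta_kl about beta, cf. (11.12)
s_beta = (beta_l - beta).T @ (beta_l - beta) / (n - 1)
t = 2.2                                            # stand-in for t_P(n-1), assumed
u_random = t / np.sqrt(n) * np.sqrt(np.diag(s_beta))
print(beta, u_random)
```

The systematic component of (11.19) would be added on top of `u_random`; it is omitted here to keep the sketch short.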
Equal Systematic Errors
Let f_{x_i} = f_x; f_{y_i} = f_y; f_{z_i} = f_z; i = 1, …, m. Obviously, such perturbations shift the least squares plane parallel to itself. Splitting up the systematic errors (11.17),
f_{β_k} = ∑_{j=1}^{m} c_{jk} f_{x_j} + ∑_{j=1}^{m} c_{j+m,k} f_{y_j} + ∑_{j=1}^{m} c_{j+2m,k} f_{z_j}

   = f_x ∑_{j=1}^{m} c_{jk} + f_y ∑_{j=1}^{m} c_{j+m,k} + f_z ∑_{j=1}^{m} c_{j+2m,k} ,
and using
∑_{j=1}^{m} c_{j1} = −β_2 ,   ∑_{j=1}^{m} c_{j+m,1} = −β_3 ,   ∑_{j=1}^{m} c_{j+2m,1} = 1

∑_{j=1}^{m} c_{j2} = 0 ,     ∑_{j=1}^{m} c_{j+m,2} = 0 ,     ∑_{j=1}^{m} c_{j+2m,2} = 0

∑_{j=1}^{m} c_{j3} = 0 ,     ∑_{j=1}^{m} c_{j+m,3} = 0 ,     ∑_{j=1}^{m} c_{j+2m,3} = 0

we find

f_{β_1} = −β_2 f_x − β_3 f_y + f_z ;   f_{β_2} = 0 ;   f_{β_3} = 0   (11.20)
so that
u_{β_1} = (t_P(n−1)/√n) √( ∑_{i,j=1}^{3m} c_{i1} c_{j1} s_{ij} ) + |β_2| f_{s,x} + |β_3| f_{s,y} + f_{s,z}

u_{β_2} = (t_P(n−1)/√n) √( ∑_{i,j=1}^{3m} c_{i2} c_{j2} s_{ij} )

u_{β_3} = (t_P(n−1)/√n) √( ∑_{i,j=1}^{3m} c_{i3} c_{j3} s_{ij} ) .   (11.21)
Uncertainty “Bowl”
Inserting (11.10) into (11.18) leads to
z(x, y) = β_{0,1} + β_{0,2} x + β_{0,3} y   (11.22)
   + ∑_{j=1}^{3m} (c_{j1} + c_{j2}x + c_{j3}y)(v_j − µ_j) + ∑_{j=1}^{3m} (c_{j1} + c_{j2}x + c_{j3}y) f_j .

From this we read the systematic error of z(x, y),

f_{z(x,y)} = ∑_{j=1}^{3m} (c_{j1} + c_{j2}x + c_{j3}y) f_j ,   (11.23)
and its worst case estimation
f_{s,z(x,y)} = ∑_{j=1}^{3m} |c_{j1} + c_{j2}x + c_{j3}y| f_{s,j} .   (11.24)
The least squares planes of the l-th individual measurements
zl = β1l + β2lx + β3ly ; l = 1, . . . , n ,
allow for the differences
z_l − z = (β_{1l} − β_1) + (β_{2l} − β_2)x + (β_{3l} − β_3)y   (11.25)
   = ∑_{j=1}^{3m} (c_{j1} + c_{j2}x + c_{j3}y)(v_{jl} − v_j) ;   l = 1, …, n .

For any fixed pair x, y, the empirical variance turns out to be

s²_{z(x,y)} = (1/(n−1)) ∑_{l=1}^{n} (z_l − z)² = c^T s c ,   (11.26)

where c designates the auxiliary vector

c = (c_{11} + c_{12}x + c_{13}y   c_{21} + c_{22}x + c_{23}y   …   c_{3m,1} + c_{3m,2}x + c_{3m,3}y)^T .   (11.27)
Hence the uncertainty “bowl” in which to find the true plane
z0(x, y) = β0,1 + β0,2x + β0,3y (11.28)
is of the form
z(x, y) ± u_{z(x,y)}

u_{z(x,y)} = (t_P(n−1)/√n) √(c^T s c) + ∑_{j=1}^{3m} |c_{j1} + c_{j2}x + c_{j3}y| f_{s,j} .   (11.29)
Equal Systematic Errors
Let fxi = fx; fyi = fy; fzi = fz. As (11.23) changes into
fz(x,y) = −β2fx − β3fy + fz ,
we have
z(x, y) ± u_{z(x,y)}

u_{z(x,y)} = (t_P(n−1)/√n) √(c^T s c) + |β_2| f_{s,x} + |β_3| f_{s,y} + f_{s,z} .   (11.30)
Finally, we may convince ourselves of uz(0,0) = uβ1.
11.2 Parabolas
Though second-order curves, parabolas are linear in their parameters. In this respect, the execution of least squares fits does not pose particular problems.
Given m measured, erroneous coordinate pairs
(x1, y1) , (x2, y2) , . . . , (xm, ym) , (11.31)
we assume the underlying true values (x_{0,i}, y_{0,i}); i = 1, …, m to define the true parabola

y_0(x) = β_{0,1} + β_{0,2}x + β_{0,3}x² .   (11.32)

Inserting the measured points into the equation y(x) = β_1 + β_2x + β_3x² leads to a linear, inconsistent system

β_1 + β_2 x_i + β_3 x_i² ≈ y_i ;   i = 1, …, m .   (11.33)
Let A, β and y denote the system matrix, the vector of the unknowns and the vector of the observations respectively,

A = ( 1 x_1 x_1²
      1 x_2 x_2²
      ··· ··· ···
      1 x_m x_m² ) ;   β = (β_1 β_2 β_3)^T ;   y = (y_1 y_2 … y_m)^T .   (11.34)
Using these definitions, we may compress (11.33) into
Aβ ≈ y .
The least squares solution vector is
β = BTy , B = A(ATA
)−1= (bik) . (11.35)
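In code, the parabola adjustment (11.34)–(11.35) only changes the columns of the design matrix. The points and coefficients below are invented and noise-free, so the fit reproduces the assumed coefficients exactly:

```python
import numpy as np

# Sketch of (11.34)-(11.35); illustrative data on y = 1 - 2x + 0.5 x^2 (assumed).
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 1.0 - 2.0 * x + 0.5 * x**2

A = np.column_stack([np.ones_like(x), x, x**2])
B = A @ np.linalg.inv(A.T @ A)        # B = A (A^T A)^{-1} = (b_ik)
beta = B.T @ y                        # solution vector (11.35)
print(beta)                           # -> approximately [1.0, -2.0, 0.5]
```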
11.2.1 Error-free Abscissas, Erroneous Ordinates
To begin with, we shall follow the traditional approach and assume the abscissas to be error-free. Subsequently this restriction will be lifted.
No Repeat Measurements
In the simplest case, we consider the ordinates to represent individual, erroneous measurements. Furthermore, we shall imply that each of the latter is biased by one and the same systematic error f_y so that
y = (y1 y2 . . . ym)T = y0 + (y − µ) + f
yi = y0,i + (yi − µyi) + fy , i = 1, . . . , m , (11.36)
where
f = f_y (1 1 … 1)^T   (m components) ;   −f_{s,y} ≤ f_y ≤ f_{s,y} .

The random errors y_i − µ_{y_i} are assumed to be independent and normally distributed; furthermore, they shall relate to the same parent distribution.

In (11.33), substituting true values x_{0,i} for the abscissas x_i, the matrix B becomes error-free. The solution vector is
β = BTy . (11.37)
Inserting (11.36) into (11.37) yields
β = β0 + BT(y − µ) + f β , f β = BTf . (11.38)
Clearly, the components of the vector f β are given by
fβ1= fy , fβ2
= 0 , fβ3= 0 . (11.39)
Remarkably enough, in analogy to (10.26) and (10.27), we may prove f^T f − f^T P f = 0 so that, following (8.11), the minimized sum of squared residuals yields an estimator

s_y² = Q_unweighted / (m − 3)   (11.40)
for the theoretical variance
σ_y² = E{(Y_i − µ_{y_i})²} ;   i = 1, …, m   (11.41)

of the input data.¹
Uncertainties of the Components of the Solution Vector
From the components
β_k = ∑_{i=1}^{m} b_{ik} y_i ;   k = 1, 2, 3   (11.42)

of the solution vector and its expectations

E{β_k} = µ_{β_k} = ∑_{i=1}^{m} b_{ik} µ_{y_i} ;   k = 1, 2, 3 ,   (11.43)
¹If the existence of a true parabola is doubtful, (11.40), of course, is also questionable.
we obtain the differences
β_k − µ_{β_k} = ∑_{i=1}^{m} b_{ik}(y_i − µ_{y_i}) ;   k = 1, 2, 3
so that we can define the elements
σ_{β_k β_{k′}} = E{(β_k − µ_{β_k})(β_{k′} − µ_{β_{k′}})} = σ_y² ∑_{i=1}^{m} b_{ik} b_{ik′} ;   k, k′ = 1, 2, 3 ,

of the theoretical variance–covariance matrix of the solution vector β. According to (11.40), their empirical counterparts are given by
s_{β_k β_{k′}} = s_y² ∑_{i=1}^{m} b_{ik} b_{ik′} ;   k, k′ = 1, 2, 3 .   (11.44)
Finally, the uncertainties of the estimators βk; k = 1, 2, 3 turn out to be
u_{β_k} = t_P(m − 3) s_y √( ∑_{i=1}^{m} b_{ik} b_{ik} ) + f_{s,β_k} ;   k = 1, 2, 3   (11.45)
in which, according to (11.39),
f_{s,β_1} = f_{s,y} ,   f_{s,β_2} = 0 ,   f_{s,β_3} = 0 .   (11.46)
Uncertainty Band
Inserting the components of the solution vector (11.38),
β_k = β_{0,k} + ∑_{i=1}^{m} b_{ik}(y_i − µ_{y_i}) + f_{β_k} ;   k = 1, 2, 3 ,

into the fitted parabola y(x) = β_1 + β_2x + β_3x², we find

y(x) = y_0(x) + ∑_{i=1}^{m} (b_{i1} + b_{i2}x + b_{i3}x²)(y_i − µ_{y_i}) + f_y   (11.47)
in which
f_y = f_{β_1}   (11.48)

designates the systematic error. For any fixed x, the theoretical variance of Y(x) with respect to the expectation

µ_{Y(x)} = E{Y(x)} = y_0(x) + f_y   (11.49)
yields
σ²_{Y(x)} = E{(Y(x) − µ_{Y(x)})²} = σ_y² ∑_{i=1}^{m} (b_{i1} + b_{i2}x + b_{i3}x²)² .   (11.50)
Hence, we expect the uncertainty band
y(x) ± u_{y(x)}

u_{y(x)} = t_P(m − 3) s_y √( ∑_{i=1}^{m} (b_{i1} + b_{i2}x + b_{i3}x²)² ) + f_{s,y}   (11.51)
to enclose the true parabola (11.32).
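A minimal sketch of this case — error-free abscissas, one erroneous ordinate each: s_y is estimated from the residuals via (11.40), and the band (11.51) is evaluated at a single abscissa. The data, the noise level, the numerical stand-in for t_P(m − 3) and the zero systematic bound are assumptions:

```python
import numpy as np

rng = np.random.default_rng(7)
m = 12
x = np.linspace(0.0, 3.0, m)               # error-free abscissas
y0 = 1.0 - 2.0 * x + 0.5 * x**2            # assumed true parabola
y = y0 + 0.05 * rng.standard_normal(m)     # one erroneous ordinate each

A = np.column_stack([np.ones(m), x, x**2])
B = A @ np.linalg.inv(A.T @ A)
beta = B.T @ y
resid = y - A @ beta
s_y = np.sqrt(resid @ resid / (m - 3))     # s_y^2 = Q/(m-3), cf. (11.40)

t = 2.26                                   # stand-in for t_P(m-3), assumed
fs_y = 0.0                                 # no systematic error in this sketch
xg = 1.5                                   # band evaluated at one point
g = np.array([1.0, xg, xg**2])
u = t * s_y * np.sqrt(np.sum((B @ g) ** 2)) + fs_y   # cf. (11.51)
print(beta, u)
```

The sum ∑(b_{i1} + b_{i2}x + b_{i3}x²)² is evaluated as the squared norm of B g with g = (1, x, x²)^T.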
Repeat Measurements
For convenience, we shall preserve the nomenclature hitherto used, i.e. we continue to denote the least squares parabola by

y(x) = β_1 + β_2x + β_3x² ,
though the ordinates will now enter in the form of arithmetic means
y = (y1 y2 . . . ym)T
yi = y0,i + (yi − µyi) + fyi ; −fs,yi ≤ fyi ≤ fs,yi . (11.52)
As usual, we assume each of the ordinates to include the same number, say n, of repeat measurements. At the same time, we dismiss the restriction that the means y_i; i = 1, …, m are biased by one and the same systematic error.
Uncertainties of the Components of the Solution Vector
As we assume the abscissas to be error-free, so are the elements b_{ik} of the matrix B. Inserting the error equations (11.52) into

β_k = ∑_{i=1}^{m} b_{ik} y_i ;   k = 1, 2, 3   (11.53)
yields
β_k(y_1, …, y_m) = β_{0,k} + ∑_{i=1}^{m} b_{ik}(y_i − µ_{y_i}) + ∑_{i=1}^{m} b_{ik} f_{y_i} .   (11.54)
Obviously, the
f_{β_k} = ∑_{i=1}^{m} b_{ik} f_{y_i} ,   f_{s,β_k} = ∑_{i=1}^{m} |b_{ik}| f_{s,y_i}   (11.55)
denote the propagated systematic errors of the components β_k; k = 1, 2, 3 of the solution vector. To assess the influence of random errors, we insert the means

y_i = (1/n) ∑_{l=1}^{n} y_{il} ,   i = 1, …, m
into (11.53),
β_k = ∑_{i=1}^{m} b_{ik} [ (1/n) ∑_{l=1}^{n} y_{il} ] = (1/n) ∑_{l=1}^{n} [ ∑_{i=1}^{m} b_{ik} y_{il} ] = (1/n) ∑_{l=1}^{n} β_{kl} ;   k = 1, 2, 3 .   (11.56)
Obviously, the quantities
β_{kl} = ∑_{i=1}^{m} b_{ik} y_{il} ;   k = 1, 2, 3 ;   l = 1, …, n   (11.57)

are the estimators of an adjustment which uses only every l-th measurement of the means y_i; i = 1, …, m. The differences

β_{kl} − β_k = ∑_{i=1}^{m} b_{ik}(y_{il} − y_i) ;   k = 1, 2, 3 ;   l = 1, …, n   (11.58)
lead us to the elements
s_{β_k β_{k′}} = (1/(n−1)) ∑_{l=1}^{n} (β_{kl} − β_k)(β_{k′l} − β_{k′})

   = (1/(n−1)) ∑_{l=1}^{n} [ ∑_{i=1}^{m} b_{ik}(y_{il} − y_i) ] [ ∑_{j=1}^{m} b_{jk′}(y_{jl} − y_j) ]

   = ∑_{i,j=1}^{m} b_{ik} b_{jk′} s_{ij} ;   k, k′ = 1, 2, 3   (11.59)

of the empirical variance–covariance matrix of the solution vector β. As to computer-assisted approaches, the matrix representation
s_β = ( s_{β_1β_1} s_{β_1β_2} s_{β_1β_3}
        s_{β_2β_1} s_{β_2β_2} s_{β_2β_3}
        s_{β_3β_1} s_{β_3β_2} s_{β_3β_3} ) = B^T s B ;   s_{β_kβ_k} ≡ s²_{β_k}   (11.60)

proves to be more convenient. The matrix

s = (s_{ij}) ,   s_{ij} = (1/(n−1)) ∑_{l=1}^{n} (y_{il} − y_i)(y_{jl} − y_j) ,   i, j = 1, …, m   (11.61)
designates the empirical variance–covariance matrix of the input data. Now we are in a position to establish Student's confidence intervals

β_k − (t_P(n−1)/√n) s_{β_k} ≤ µ_{β_k} ≤ β_k + (t_P(n−1)/√n) s_{β_k} ;   k = 1, 2, 3   (11.62)
for the expectations Eβk = µβk.
Finally, the overall uncertainties u_{β_k} of the results β_k ± u_{β_k}; k = 1, 2, 3 turn out to be

u_{β_k} = (t_P(n−1)/√n) s_{β_k} + f_{s,β_k}

   = (t_P(n−1)/√n) √( ∑_{i,j=1}^{m} b_{ik} b_{jk} s_{ij} ) + ∑_{i=1}^{m} |b_{ik}| f_{s,y_i} ;   k = 1, 2, 3 .   (11.63)
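The matrix forms (11.60)–(11.63) translate almost line-for-line into code. The following sketch simulates repeat ordinate measurements; the data, the t-quantile stand-in and the systematic bounds are assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 10, 8
x = np.linspace(-1.0, 2.0, m)                     # error-free abscissas
y0 = 1.0 - 2.0 * x + 0.5 * x**2                   # assumed true parabola
Y = y0 + 0.05 * rng.standard_normal((n, m))       # n repeats per ordinate
ybar = Y.mean(axis=0)

A = np.column_stack([np.ones(m), x, x**2])
B = A @ np.linalg.inv(A.T @ A)                    # (b_ik)
beta = B.T @ ybar

s = (Y - ybar).T @ (Y - ybar) / (n - 1)           # input covariance, cf. (11.61)
s_beta = B.T @ s @ B                              # cf. (11.60)

t = 2.36                                          # stand-in for t_P(n-1), assumed
fs_y = np.full(m, 0.01)                           # assumed systematic bounds f_s,yi
u = t / np.sqrt(n) * np.sqrt(np.diag(s_beta)) + np.abs(B).T @ fs_y   # (11.63)
print(beta, u)
```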
Equal Systematic Errors
Let f_{y_i} = f_y; i = 1, …, m. The expansions given in Appendix F (where we have to substitute true abscissas x_{0,j} for the means x_j quoted there) yield

∑_{j=1}^{m} b_{j1} = 1 ,   ∑_{j=1}^{m} b_{j2} = 0 ,   ∑_{j=1}^{m} b_{j3} = 0 ,   (11.64)
i.e.
f_{β_1} = f_y ,   f_{β_2} = 0 ,   f_{β_3} = 0   (11.65)
so that
u_{β_1} = (t_P(n−1)/√n) √( ∑_{i,j=1}^{m} b_{i1} b_{j1} s_{ij} ) + f_{s,y}

u_{β_2} = (t_P(n−1)/√n) √( ∑_{i,j=1}^{m} b_{i2} b_{j2} s_{ij} )

u_{β_3} = (t_P(n−1)/√n) √( ∑_{i,j=1}^{m} b_{i3} b_{j3} s_{ij} ) .   (11.66)
Uncertainty Band
Insertion of (11.54) into the least squares parabola y(x) = β_1 + β_2x + β_3x² yields

y(x) = y_0(x) + ∑_{j=1}^{m} (b_{j1} + b_{j2}x + b_{j3}x²)(y_j − µ_{y_j}) + ∑_{j=1}^{m} (b_{j1} + b_{j2}x + b_{j3}x²) f_{y_j} .   (11.67)
Clearly, the x-dependent systematic error is given by
f_{y(x)} = ∑_{j=1}^{m} (b_{j1} + b_{j2}x + b_{j3}x²) f_{y_j}
and its worst case estimation yields
f_{s,y(x)} = ∑_{j=1}^{m} |b_{j1} + b_{j2}x + b_{j3}x²| f_{s,y_j} .   (11.68)
Furthermore, from
y_l(x) = β_{1l} + β_{2l}x + β_{3l}x²

y(x) = β_1 + β_2x + β_3x²
and (11.58) we find
y_l(x) − y(x) = (β_{1l} − β_1) + (β_{2l} − β_2)x + (β_{3l} − β_3)x²

   = ∑_{j=1}^{m} (b_{j1} + b_{j2}x + b_{j3}x²)(y_{jl} − y_j)   (11.69)
so that
s²_{y(x)} = (1/(n−1)) ∑_{l=1}^{n} [y_l(x) − y(x)]² = b^T s b ,   (11.70)
where the auxiliary vector
b = (b_{11} + b_{12}x + b_{13}x²   b_{21} + b_{22}x + b_{23}x²   …   b_{m1} + b_{m2}x + b_{m3}x²)^T   (11.71)

has been introduced. The matrix s was defined in (11.61). Ultimately, the true parabola (11.32) is expected to lie within an uncertainty band of the form
y(x) ± u_{y(x)}

u_{y(x)} = (t_P(n−1)/√n) √(b^T s b) + ∑_{j=1}^{m} |b_{j1} + b_{j2}x + b_{j3}x²| f_{s,y_j} .   (11.72)
Equal Systematic Errors
Let f_{y_i} = f_y; i = 1, …, m. According to (11.64), relation (11.68) turns into f_{s,y(x)} = f_{s,y}, i.e.

y(x) ± u_{y(x)} ,   u_{y(x)} = (t_P(n−1)/√n) √(b^T s b) + f_{s,y} .   (11.73)
11.2.2 Erroneous Abscissas, Erroneous Ordinates
If, as is common experimental practice, both the abscissas and the ordinates are erroneous quantities, the elements b_{ik} of the matrix B as defined in (11.35) are erroneous as well. Accordingly, we linearize the estimators β_k twofold, namely throughout the neighborhoods of the points
(x0,1, . . . , x0,m ; y0,1, . . . , y0,m) and (x1, . . . , xm ; y1, . . . , ym)
respectively. In the former case we have
β_k(x_1, …, x_m; y_1, …, y_m)

   = β_k(x_{0,1}, …, x_{0,m}; y_{0,1}, …, y_{0,m})

   + ∑_{j=1}^{m} (∂β_k/∂x_{0,j})(x_j − x_{0,j}) + ∑_{j=1}^{m} (∂β_k/∂y_{0,j})(y_j − y_{0,j}) + ···
The constants β_{0,k} = β_k(x_{0,1}, …, x_{0,m}; y_{0,1}, …, y_{0,m}); k = 1, 2, 3 denote the true values of the estimators. Furthermore, we have to relate the partial derivatives to the point (x_1, …, x_m; y_1, …, y_m),

∂β_k/∂x_{0,j} ≈ ∂β_k/∂x_j = c_{jk} ,   ∂β_k/∂y_{0,j} ≈ ∂β_k/∂y_j = c_{j+m,k} ;   j = 1, …, m .
The constants c_{jk}, c_{j+m,k} are quoted in Appendix F. With respect to the error equations

x_j = x_{0,j} + (x_j − µ_{x_j}) + f_{x_j} ;   −f_{s,x_j} ≤ f_{x_j} ≤ f_{s,x_j}

y_j = y_{0,j} + (y_j − µ_{y_j}) + f_{y_j} ;   −f_{s,y_j} ≤ f_{y_j} ≤ f_{s,y_j}
and the second expansion which is still due, we introduce the notations
vjl = xjl ; vj+m,l = yjl ; vj = xj ; vj+m = yj
µj = µxj ; µj+m = µyj ; fj = fxj ; fj+m = fyj .
Hence, we have
β_k = β_{0,k} + ∑_{j=1}^{2m} c_{jk}(v_j − µ_j) + ∑_{j=1}^{2m} c_{jk} f_j ;   k = 1, 2, 3 .   (11.74)

From this, we infer the propagated systematic errors f_{β_k} of the components β_k as

f_{β_k} = ∑_{j=1}^{2m} c_{jk} f_j ;   k = 1, 2, 3 .
The second expansion throughout a neighborhood of the point (x_1, …, x_m; y_1, …, y_m) with respect to every l-th repeat measurement yields

β_{kl}(x_{1l}, …, x_{ml}; y_{1l}, …, y_{ml}) = β_k(x_1, …, x_m; y_1, …, y_m)

   + ∑_{j=1}^{m} (∂β_k/∂x_j)(x_{jl} − x_j) + ∑_{j=1}^{m} (∂β_k/∂y_j)(y_{jl} − y_j) + ···
i.e.
β_{kl} − β_k = ∑_{j=1}^{2m} c_{jk}(v_{jl} − v_j) ;   k = 1, 2, 3 ;   l = 1, …, n .   (11.75)
These differences allow the elements
s_{β_k β_{k′}} = (1/(n−1)) ∑_{l=1}^{n} (β_{kl} − β_k)(β_{k′l} − β_{k′})

   = (1/(n−1)) ∑_{l=1}^{n} [ ∑_{i=1}^{2m} c_{ik}(v_{il} − v_i) ] [ ∑_{j=1}^{2m} c_{jk′}(v_{jl} − v_j) ]

   = ∑_{i,j=1}^{2m} c_{ik} c_{jk′} s_{ij} ;   k, k′ = 1, 2, 3   (11.76)

of the empirical variance–covariance matrix s_β of the solution vector β to be calculated. The s_{ij} designate the elements of the (expanded) empirical variance–covariance matrix of the input data,

s = (s_{ij}) ,   s_{ij} = (1/(n−1)) ∑_{l=1}^{n} (v_{il} − v_i)(v_{jl} − v_j) ;   i, j = 1, …, 2m .   (11.77)
Finally, with the coefficients cik being comprised within the auxiliary matrix
C^T = ( c_{11} c_{21} ··· c_{2m,1}
        c_{12} c_{22} ··· c_{2m,2}
        c_{13} c_{23} ··· c_{2m,3} ) ,   (11.78)
we arrive at
s_β = ( s_{β_1β_1} s_{β_1β_2} s_{β_1β_3}
        s_{β_2β_1} s_{β_2β_2} s_{β_2β_3}
        s_{β_3β_1} s_{β_3β_2} s_{β_3β_3} ) = C^T s C .   (11.79)
Thus, the uncertainties of the estimators βk; k = 1, 2, 3 turn out to be
u_{β_k} = (t_P(n−1)/√n) √( ∑_{i,j=1}^{2m} c_{ik} c_{jk} s_{ij} ) + ∑_{j=1}^{2m} |c_{jk}| f_{s,j} ;   k = 1, 2, 3 .   (11.80)
Equal Systematic Errors
Let fxi = fx, fyi = fy; i = 1, . . . , m. As
∑_{j=1}^{m} c_{j1} = −β_2 ,   ∑_{j=1}^{m} c_{j+m,1} = 1 ,

∑_{j=1}^{m} c_{j2} = −2β_3 ,   ∑_{j=1}^{m} c_{j+m,2} = 0 ,

∑_{j=1}^{m} c_{j3} = 0 ,   ∑_{j=1}^{m} c_{j+m,3} = 0 ,   (11.81)
the propagated systematic errors change into
f_{β_k} = ∑_{j=1}^{2m} c_{jk} f_j = f_x ∑_{j=1}^{m} c_{jk} + f_y ∑_{j=1}^{m} c_{j+m,k} ;   k = 1, 2, 3

i.e.

f_{β_1} = −β_2 f_x + f_y ;   f_{β_2} = −2β_3 f_x ;   f_{β_3} = 0
so that
f_{s,β_1} = |β_2| f_{s,x} + f_{s,y} ;   f_{s,β_2} = 2|β_3| f_{s,x} ;   f_{s,β_3} = 0 .   (11.82)
Finally, (11.80) turns into
u_{β_1} = (t_P(n−1)/√n) √( ∑_{i,j=1}^{2m} c_{i1} c_{j1} s_{ij} ) + |β_2| f_{s,x} + f_{s,y}

u_{β_2} = (t_P(n−1)/√n) √( ∑_{i,j=1}^{2m} c_{i2} c_{j2} s_{ij} ) + 2|β_3| f_{s,x}

u_{β_3} = (t_P(n−1)/√n) √( ∑_{i,j=1}^{2m} c_{i3} c_{j3} s_{ij} ) .   (11.83)
Uncertainty Band
Insertion of (11.74) into the least squares parabola y(x) = β_1 + β_2x + β_3x² yields

y(x) = y_0(x) + ∑_{j=1}^{2m} (c_{j1} + c_{j2}x + c_{j3}x²)(v_j − µ_j) + ∑_{j=1}^{2m} (c_{j1} + c_{j2}x + c_{j3}x²) f_j .   (11.84)
Thus, the systematic error of y(x) turns out as

f_{y(x)} = ∑_{j=1}^{2m} (c_{j1} + c_{j2}x + c_{j3}x²) f_j .   (11.85)
We then combine
y(x) = β_1 + β_2x + β_3x²

and

y_l(x) = β_{1l} + β_{2l}x + β_{3l}x² ;   l = 1, …, n

to define the n differences

y_l − y = (β_{1l} − β_1) + (β_{2l} − β_2)x + (β_{3l} − β_3)x² .   (11.86)
Inserting (11.75), we are in a position, for any fixed x, to quote the empirical variance of the y_l(x) with respect to y(x),

s²_{y(x)} = (1/(n−1)) ∑_{l=1}^{n} (y_l − y)² = c^T s c .   (11.87)

Here, s designates the (expanded) empirical variance–covariance matrix (11.77) of the input data and c the auxiliary vector

c = (c_{11} + c_{12}x + c_{13}x²   c_{21} + c_{22}x + c_{23}x²   …   c_{2m,1} + c_{2m,2}x + c_{2m,3}x²)^T .   (11.88)
Hence, we expect the true parabola (11.32) to lie within the uncertainty band
y(x) ± u_{y(x)}

u_{y(x)} = (t_P(n−1)/√n) √(c^T s c) + ∑_{j=1}^{2m} |c_{j1} + c_{j2}x + c_{j3}x²| f_{s,j} .   (11.89)
As may be shown, uy(0) = uβ1.
Equal Systematic Errors
Let fxi = fx, fyi = fy; i = 1, . . . , m. As
f_{y(x)} = f_x ∑_{j=1}^{m} (c_{j1} + c_{j2}x + c_{j3}x²) + f_y ∑_{j=1}^{m} (c_{j+m,1} + c_{j+m,2}x + c_{j+m,3}x²)

   = −β_2 f_x − 2β_3 x f_x + f_y
(11.89) changes into
u_{y(x)} = (t_P(n−1)/√n) √(c^T s c) + |β_2 + 2β_3 x| f_{s,x} + f_{s,y} .   (11.90)
Example
Conversion of erroneous data pairs (x_i, y_i) into data pairs (ξ_{0,i}, η_i) the abscissas ξ_{0,i} of which are error-free while the ordinates η_i carry the errors of the abscissas x_i and the ordinates y_i.

Let it suffice to base the conversion on, say, five neighboring data pairs (x_{i−2}, y_{i−2}), …, (x_{i+2}, y_{i+2}). Then, every so-considered group of data defines a least squares parabola y(x) and an associated uncertainty band u_{y(x)}. We simply put ξ_{0,i} = x_i and η_i = y(ξ_{0,i}). We may then define y_l(ξ_{0,i}); l = 1, …, n, so that
η_i = y(ξ_{0,i}) = (1/n) ∑_{l=1}^{n} y_l(ξ_{0,i}) .
The uncertainty uy(ξ0,i) has been quoted in (11.89).
11.3 Piecewise Smooth Approximation
Suppose we want to approximate an empirical function, defined through a sequence of data pairs (x_i, y_i); i = 1, …, by a string of consecutive parabolas the junctions of which shall be continuous and differentiable, Fig. 11.2. For simplicity, we shall restrict ourselves to two consecutive parabolas so that there is but one junction. We shall presume that there are sufficiently many measuring points within the regions of approximation, so that it is possible to substitute points (ξ_{0,i}, η_i) for the measuring points (x_i, y_i), the abscissas ξ_{0,i} of which may be considered error-free while the ordinates η_i include the errors of the abscissas x_i and the ordinates y_i. This idea has been outlined at the end of the preceding section. Let us assume that the transformations have already been done and that, for convenience, our common notation (x_{0,i}, y_i); i = 1, 2, … has been reintroduced.
Solution Vector
Let ξ2 designate the junction of parabolas I and II. We then have
y_I(x) = β_1 + β_2x + β_3x² ;   ξ_1 ≤ x < ξ_2

y_II(x) = β_4 + β_5x + β_6x² ;   ξ_2 ≤ x < ξ_3 .   (11.91)

Obviously, the parabolas have to satisfy the conditions

y_I(ξ_2) = y_II(ξ_2) ;   y′_I(ξ_2) = y′_II(ξ_2) .   (11.92)
Defining the column vector
β = (β_1 β_2 β_3 β_4 β_5 β_6)^T ,   (11.93)
Fig. 11.2. Consecutive parabolas with continuous and differentiable junctions for piecewise smooth approximation of an empirical function
according to (7.15), we require
Hβ = 0 , (11.94)
where
H =
(1 ξ2 ξ2
2 −1 −ξ2 −ξ22
0 1 2ξ2 0 −1 −2ξ2
). (11.95)
We stress that the β_k; k = 1, …, 6 do not define a common polynomial; nevertheless we shall adjust them in one single step. The parabolas defined in (11.91) give rise to two inconsistent systems

β_1 + β_2 x_{0,i} + β_3 x_{0,i}² ≈ y_i ;   i = 1, …, m_1

β_4 + β_5 x_{0,i} + β_6 x_{0,i}² ≈ y_i ;   i = m_1 + 1, …, m_1 + m_2 = m .   (11.96)
Let us assign the means
y_i = (1/n) ∑_{l=1}^{n} y_{il} ;   i = 1, …, m   (11.97)
to a column vector
y = (y1 y2 . . . ym)T (11.98)
and the coefficients 1, x_{0,i}, x_{0,i}² to a system matrix of the form

A = ( 1 x_{0,1} x_{0,1}²         |  0 0 0
      ··· ··· ···                |  ··· ··· ···
      1 x_{0,m_1} x_{0,m_1}²     |  0 0 0
      0 0 0  |  1 x_{0,m_1+1} x_{0,m_1+1}²
      0 0 0  |  ··· ··· ···
      0 0 0  |  1 x_{0,m} x_{0,m}² ) .
Then, (11.96) turns into
Aβ ≈ y . (11.99)
The solution vector follows from (7.16),
β = [ (A^T A)^{−1} A^T − (A^T A)^{−1} H^T [ H (A^T A)^{−1} H^T ]^{−1} H (A^T A)^{−1} A^T ] y .   (11.100)
To abbreviate, we assign a matrix DT to the outer square brackets, so that
β = D^T y ;   D = (d_{ik}) ,   (11.101)

( β_1 )   ( d_{11} d_{21} … d_{m1} )
( β_2 )   ( d_{12} d_{22} … d_{m2} )   ( y_1 )
( β_3 ) = ( d_{13} d_{23} … d_{m3} )   ( y_2 )
( β_4 )   ( d_{14} d_{24} … d_{m4} )   (  …  )
( β_5 )   ( d_{15} d_{25} … d_{m5} )   ( y_m )
( β_6 )   ( d_{16} d_{26} … d_{m6} )
in components,
β_k = ∑_{i=1}^{m} d_{ik} y_i ;   k = 1, …, 6 .   (11.102)
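The constrained adjustment can be sketched numerically: build the block design matrix, the constraint matrix H of (11.95), and project the unconstrained solution onto the subspace Hβ = 0, cf. (11.100). The empirical function, the junction position and the noise level below are assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)
xi2 = 1.0                                   # junction of parabolas I and II
x1 = np.linspace(0.0, 1.0, 8)               # region I abscissas
x2 = np.linspace(1.0, 2.0, 8)               # region II abscissas
f = lambda x: np.sin(1.5 * x)               # assumed empirical function
y = np.concatenate([f(x1), f(x2)]) + 0.01 * rng.standard_normal(16)

def rows(x):
    return np.column_stack([np.ones_like(x), x, x**2])

A = np.block([[rows(x1), np.zeros((8, 3))],
              [np.zeros((8, 3)), rows(x2)]])
H = np.array([[1, xi2, xi2**2, -1, -xi2, -xi2**2],
              [0, 1, 2 * xi2, 0, -1, -2 * xi2]])

# constrained solution, cf. (11.100): correct the unconstrained fit so H beta = 0
G = np.linalg.inv(A.T @ A)
b_u = G @ A.T @ y
beta = b_u - G @ H.T @ np.linalg.inv(H @ G @ H.T) @ H @ b_u

yI = lambda x: beta[0] + beta[1] * x + beta[2] * x**2
yII = lambda x: beta[3] + beta[4] * x + beta[5] * x**2
print(yI(xi2) - yII(xi2))                   # -> ~0: continuous junction
```

By construction H β = 0 holds to machine precision, so any realization of the errors still produces a continuous and differentiable pair of parabolas.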
Uncertainty Band of a String of Parabolas
For simplicity, we confine ourselves to the uncertainty band of the fitted parabola

y_I(x) = β_1 + β_2x + β_3x² .   (11.103)
Obviously, this formally agrees with (11.72),
y_I(x) ± u_{y_I(x)}

u_{y_I(x)} = (t_P(n−1)/√n) √(d_I^T s d_I) + ∑_{i=1}^{m} |d_{i1} + d_{i2}x + d_{i3}x²| f_{s,y_i} ,   (11.104)
where the auxiliary vector
d_I = (d_{11} + d_{12}x + d_{13}x²   d_{21} + d_{22}x + d_{23}x²   …   d_{m1} + d_{m2}x + d_{m3}x²)^T   (11.105)
corresponds to (11.71) and the matrix s to (11.61). With respect to the fitted parabola y_II(x) we have just two modifications to make: in the uncertainty component due to random errors we have to substitute an auxiliary vector

d_II = (d_{14} + d_{15}x + d_{16}x²   d_{24} + d_{25}x + d_{26}x²   …   d_{m4} + d_{m5}x + d_{m6}x²)^T   (11.106)
for d_I, and in the uncertainty component due to systematic errors the matrix elements d_{i4}, d_{i5}, d_{i6} for the elements d_{i1}, d_{i2}, d_{i3}. Obviously, any error combination yields a continuous and differentiable string of parabolas.

The procedure may be generalized to more than two regions of approximation, e.g. to three regions with junctions ξ_1 and ξ_2, in which case the matrix H takes the form
H = ( 1 ξ_1 ξ_1²  |  −1 −ξ_1 −ξ_1²  |  0 0 0
      0 1  2ξ_1   |   0  −1  −2ξ_1  |  0 0 0
      0 0 0  |  1 ξ_2 ξ_2²  |  −1 −ξ_2 −ξ_2²
      0 0 0  |  0 1  2ξ_2   |   0  −1  −2ξ_2 ) .   (11.107)
11.4 Circles
In contrast to polynomials, the equation of a circle is non-linear in the parameters to be adjusted. Consequently, we first have to linearize. This can be done either by truncating a series expansion or through a transformation, namely a stereographic projection.² We will content ourselves here with series truncation.
Linearization
As has been discussed in Sect. 10.2, we expand the circle’s equation
φ(r, x_M, y_M; x, y) = r² − (x − x_M)² − (y − y_M)² = 0   (11.108)

throughout a neighborhood of some starting values r*, x*_M, y*_M,

φ(r, x_M, y_M; x, y) = φ(r*, x*_M, y*_M; x, y)   (11.109)

   + (∂φ/∂r*)(r − r*) + (∂φ/∂x*_M)(x_M − x*_M) + (∂φ/∂y*_M)(y_M − y*_M) + ··· = 0 .
As
∂φ/∂r* = 2r* ,   ∂φ/∂x*_M = 2(x − x*_M) ,   ∂φ/∂y*_M = 2(y − y*_M)
²E.g. Knopp, K., Elemente der Funktionentheorie, Sammlung Göschen, vol. 1109, Walter de Gruyter, Berlin 1963
we find, truncating non-linear terms,

r*² − (x − x*_M)² − (y − y*_M)²   (11.110)

   + 2r*(r − r*) + 2(x − x*_M)(x_M − x*_M) + 2(y − y*_M)(y_M − y*_M) ≈ 0 .
Introducing the abbreviations
β_1 = r − r* ,   β_2 = x_M − x*_M ,   β_3 = y_M − y*_M

u(x) = (x − x*_M)/r* ,   v(y) = (y − y*_M)/r*

w(x, y) = (1/(2r*)) [ (x − x*_M)² + (y − y*_M)² − r*² ]   (11.111)
the expansion (11.110) turns into
β1 + u(x)β2 + v(y)β3 ≈ w(x, y) . (11.112)
Obviously, in a formal sense, (11.112) is the equation of a plane. In order to reduce the linearization errors, we cyclically repeat the adjustment, each time replacing the “old” starting values with improved “new” ones.

Suppose there are m ≥ 4 measured data pairs (x_i, y_i); i = 1, …, m, lying in the vicinity of the circumference of the circle and having the structure
x_i = (1/n) ∑_{l=1}^{n} x_{il} = x_{0,i} + (x_i − µ_{x_i}) + f_{x_i} ,

y_i = (1/n) ∑_{l=1}^{n} y_{il} = y_{0,i} + (y_i − µ_{y_i}) + f_{y_i} .
Inserting the latter into (11.112) produces the linear, inconsistent system
β1 + u (xi) β2 + v (yi)β3 ≈ w (xi, yi) ; i = 1, . . . , m . (11.113)
The system matrix A, the column vector β of the unknowns and the column vector w of the right-hand sides,

A = ( 1 u_1 v_1
      1 u_2 v_2
      ··· ··· ···
      1 u_m v_m ) ;   β = (β_1 β_2 β_3)^T ;   w = (w_1 w_2 … w_m)^T ,
turn (11.113) into
Aβ ≈ w . (11.114)
Iteration of the Solution Vector
The least squares solution vector is
β = B^T w ,   B = A(A^T A)^{−1} .   (11.115)
Its components β1, β2, β3 define the parameters of the circle,
r = β_1 + r* ,   x_M = β_2 + x*_M ,   y_M = β_3 + y*_M .   (11.116)

Interpreting the results r, x_M, y_M as new, improved starting values r*, x*_M, y*_M, we are in a position to cyclically repeat the adjustment. As has been discussed in Sect. 10.2, we expect the estimators to tend to zero,
β1 → 0, β2 → 0, β3 → 0 .
This will, however, be the case only as far as allowed by the measurement errors (and, possibly, by the numerical accuracy of the machine). In

β_k = ∑_{j=1}^{m} b_{jk} w_j ;   k = 1, 2, 3   (11.117)
the coefficients bjk as well as the quantities wj prove to be erroneous.
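The iteration just described can be sketched compactly. The circle below (radius 2, centre (1, −1)) and the crude starting values are assumptions; the data are noise-free so that the cycle converges to the assumed parameters:

```python
import numpy as np

# Sketch of the cyclically repeated adjustment (11.111)-(11.116).
phi = np.linspace(0.0, 2 * np.pi, 12, endpoint=False)
x = 1.0 + 2.0 * np.cos(phi)
y = -1.0 + 2.0 * np.sin(phi)

r, xM, yM = 1.0, 0.0, 0.0                  # crude starting values r*, xM*, yM*
for _ in range(20):
    u = (x - xM) / r
    v = (y - yM) / r
    w = ((x - xM) ** 2 + (y - yM) ** 2 - r**2) / (2 * r)
    A = np.column_stack([np.ones_like(x), u, v])
    beta = np.linalg.solve(A.T @ A, A.T @ w)   # solution vector (11.115)
    r, xM, yM = r + beta[0], xM + beta[1], yM + beta[2]

print(r, xM, yM)                            # -> approximately 2, 1, -1
```

With erroneous data the estimators β_k would stop shrinking at the level set by the measurement errors, as stated in the text.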
Uncertainties of the Parameters of the Circle
In order to estimate the uncertainties, we carry out series expansions throughout the neighborhoods of the points
(x0,1, . . . , x0,m ; y0,1, . . . , y0,m) and (x1, . . . , xm ; y1, . . . , ym) .
The former leads us to the propagated systematic errors and the latter to the empirical variance–covariance matrix of the components β_k; k = 1, 2, 3 of the solution vector respectively. The expansion with respect to the true values yields
β_k(x_1, …, y_m) = β_k(x_{0,1}, …, y_{0,m})   (11.118)

   + ∑_{j=1}^{m} (∂β_k/∂x_{0,j})(x_j − x_{0,j}) + ∑_{j=1}^{m} (∂β_k/∂y_{0,j})(y_j − y_{0,j}) + ··· ;   k = 1, 2, 3 .
As usual, we linearize the expansion and approximate the partial derivatives according to

∂β_k/∂x_{0,j} ≈ ∂β_k/∂x_j = c_{jk} ,   ∂β_k/∂y_{0,j} ≈ ∂β_k/∂y_j = c_{j+m,k} ;   k = 1, 2, 3 ,   j = 1, …, m .
The coefficients c_{jk} and c_{j+m,k} are given in Appendix F. Furthermore, we introduce

f_j = f_{x_j} ,   f_{j+m} = f_{y_j} ;   j = 1, …, m .
Then, according to (11.118), the propagated systematic errors of the βk;k = 1, 2, 3 turn out as
f_{β_k} = ∑_{j=1}^{2m} c_{jk} f_j ,   f_{s,β_k} = ∑_{j=1}^{2m} |c_{jk}| f_{s,j} ;   k = 1, 2, 3 .   (11.119)
The variances and covariances follow from the second expansion
β_{kl}(x_{1l}, …, y_{ml}) = β_k(x_1, …, y_m)   (11.120)

   + ∑_{j=1}^{m} (∂β_k/∂x_j)(x_{jl} − x_j) + ∑_{j=1}^{m} (∂β_k/∂y_j)(y_{jl} − y_j) + ··· .
We define
vjl = xjl , vj+m,l = yjl , vj = xj , vj+m = yj
µj = µxj , µj+m = µyj ; j = 1, . . . , m
so that
β_{kl} − β_k = ∑_{j=1}^{2m} c_{jk}(v_{jl} − v_j) ,   k = 1, 2, 3 ,   l = 1, …, n .   (11.121)
From this we obtain
s_{β_k β_{k′}} = (1/(n−1)) ∑_{l=1}^{n} [ ∑_{i=1}^{2m} c_{ik}(v_{il} − v_i) ] [ ∑_{j=1}^{2m} c_{jk′}(v_{jl} − v_j) ]

   = ∑_{i,j=1}^{2m} c_{ik} c_{jk′} s_{ij} ;   k, k′ = 1, 2, 3 .   (11.122)
Assigning the coefficients cjk to the (2m × 3) auxiliary matrix
C^T = ( c_{11} c_{21} … c_{2m,1}
        c_{12} c_{22} … c_{2m,2}
        c_{13} c_{23} … c_{2m,3} )   (11.123)
the empirical variance–covariance matrix of the solution vector β takes the form

s_β = ( s_{β_1β_1} s_{β_1β_2} s_{β_1β_3}
        s_{β_2β_1} s_{β_2β_2} s_{β_2β_3}
        s_{β_3β_1} s_{β_3β_2} s_{β_3β_3} ) = C^T s C ;   s_{β_kβ_k} ≡ s²_{β_k} .   (11.124)
Here
s = (s_{ij}) ;   i, j = 1, …, 2m ,   s_{ij} = (1/(n−1)) ∑_{l=1}^{n} (v_{il} − v_i)(v_{jl} − v_j)   (11.125)

denotes the empirical variance–covariance matrix of the input data. Finally, the primary results of the adjustment are given by

β_k ± u_{β_k}   (11.126)

u_{β_k} = (t_P(n−1)/√n) √( ∑_{i,j=1}^{2m} c_{ik} c_{jk} s_{ij} ) + ∑_{j=1}^{2m} |c_{jk}| f_{s,j} ;   k = 1, 2, 3 .
The parameters of the circle follow from

r = β_1 + r* ,   x_M = β_2 + x*_M ,   y_M = β_3 + y*_M .
The uncertainties of the latter are, of course, identical with those of the βk;k = 1, 2, 3.
Equal Systematic Errors
Let fxi = fx; fyi = fy; i = 1, . . . , m. As (11.119) changes into
f_{β_k} = f_x ∑_{j=1}^{m} c_{jk} + f_y ∑_{j=1}^{m} c_{j+m,k} ;   k = 1, 2, 3 ,   (11.127)
according to Appendix F we find

∑_{j=1}^{m} c_{j1} = −β_2/r* ;   ∑_{j=1}^{m} c_{j+m,1} = −β_3/r* ;

∑_{j=1}^{m} c_{j2} = 1 ;   ∑_{j=1}^{m} c_{j+m,2} = 0 ;

∑_{j=1}^{m} c_{j3} = 0 ;   ∑_{j=1}^{m} c_{j+m,3} = 1 .   (11.128)
The propagated systematic errors become
f_{β_1} = −(1/r*)(β_2 f_x + β_3 f_y) ≈ 0 ;   f_{β_2} = f_x ,   f_{β_3} = f_y   (11.129)

so that³

u_{β_1} = (t_P(n−1)/√n) s_{β_1} ,   u_{β_2} = (t_P(n−1)/√n) s_{β_2} + f_{s,x} ,

u_{β_3} = (t_P(n−1)/√n) s_{β_3} + f_{s,y} .   (11.130)
³On finishing the iteration, the absolute values of the estimators β_2 and β_3 will be very small so that f_{β_1} will be close to zero – the radius remains insensitive to shifts of the center of the circle.
Uncertainty Regions
From (11.111) we conclude
β_1 = r − r* ;   µ_{β_1} = E{β_1} = E{r} − r* = µ_r − r* ,

i.e.

β_1 − µ_{β_1} = r − µ_r ,

and, similarly,

β_2 − µ_{β_2} = x_M − µ_{x_M} ;   β_3 − µ_{β_3} = y_M − µ_{y_M}

so that

β − µ_β = (r − µ_r   x_M − µ_{x_M}   y_M − µ_{y_M})^T .   (11.131)
Referring to Hotelling’s ellipsoid (9.64)
t²_P(r, n) = n (β − µ_β)^T (C^T s C)^{−1} (β − µ_β)   (11.132)
we may replace β − µ_β by a vector of the kind

(β̃ − β) = (r̃ − r   x̃_M − x_M   ỹ_M − y_M)^T   (11.133)

so that (11.132) takes the form of a confidence ellipsoid

t²_P(r, n) = n (β̃ − β)^T (C^T s C)^{−1} (β̃ − β)   (11.134)
which we depict in a rectangular r, x_M, y_M coordinate system. Combining (11.134) with the pertaining security polyhedron produces the EPC hull, the procedure turning out to be simpler in case we restrict ourselves to the degeneration presented in (11.129).

There are two uncertainty regions to be constructed: a very small inner circular region, enclosing the admissible centers, and a much larger, outer annular region, marking off the set of allowable circles. The inner region is given by the projection of the EPC hull onto the x_M, y_M-plane.

To find the outer region, we proceed numerically. To this end, we put a bunch of rays through the point x_M, y_M covering the angular range 0, …, 2π uniformly and densely enough.

Let us intersect the EPC hull with a plane perpendicular to the r-axis and consider the orthogonal projection of the intersection onto the x_M, y_M-plane. Obviously, each point of the so-projected line represents an admissible center (centers lying in the interior are of no interest). Each of the centers has the same radius, namely that which is given by the height of the intersecting plane. Repeating the procedure with a sufficiently closely spaced system
of intersecting planes produces the set of all admissible centers and circles. We then consider the intersections of the circles with the rays of the bunch. Sorting the intersections on a given ray according to their distances from the point x_M, y_M, the intersection with the largest distance obviously establishes a point of the outer boundary of the ring-shaped uncertainty region in question, while the intersection with the smallest distance marks a point of the inner boundary of that region.
12 Special Metrology Systems
12.1 Dissemination of Units
Two basic procedures are available for disseminating units: the linking of so-called working standards to primary standards and the intercomparison of standards (of any kind) through what is called key comparisons.
12.1.1 Realization of Working Standards
The primary standards of the SI units are the starting points for the realization of day-to-day employed working standards. By nature, the accuracy such standards should have depends on their field of application. This aspect has led to a hierarchy of standards of decreasing accuracies. Once established, we shall denote the pertaining “ranks” by H_k; k = 1, 2, … and use growing k to express decreasing accuracy.
Suppose the comparator operates by means of an equation of the kind
βk = βk−1 + x ; k = 1, 2, . . . . (12.1)
Here, β_k designates the standard of rank H_k, which is to be calibrated, β_{k−1} the standard of rank H_{k−1}, which accomplishes the link-up, and, finally, x the difference shown in the comparator display. Normally, any link-up implies a loss of accuracy. Let us assume l = 1, …, n repeat measurements
xl = x0 + (xl − µx) + fx ; l = 1, . . . , n , (12.2)
where x_0 designates the true value of the displayed difference x_l, ε_l = x_l − µ_x a random and f_x a systematic error. The n repeat measurements define an arithmetic mean and an empirical variance,
x = (1/n) ∑_{l=1}^{n} x_l ;   s_x² = (1/(n−1)) ∑_{l=1}^{n} (x_l − x)² .   (12.3)
Let the systematic error fx be contained within an interval
−fs,x ≤ fx ≤ fs,x . (12.4)
Fig. 12.1. Primary standard: nominal value N(β) and true value β0
Then, the interval
x − u_x ≤ x_0 ≤ x + u_x ;   u_x = (t_P(n−1)/√n) s_x + f_{s,x}   (12.5)

should localize the true value x_0 of the differences x_l; l = 1, …, n.

Let N(β) denote the nominal value of a given primary standard, e.g.
N(β) = 1 kg (exact) and β_0 its actual, physically true value, which, of course, is unknown. Let the nominal value and the true value differ by an unknown systematic error f_{N(β)}, which we assume to be confined to the interval −u_{N(β)}, …, u_{N(β)}. We then have, Fig. 12.1,
β_0 = N(β) − f_{N(β)} ;   −u_{N(β)} ≤ f_{N(β)} ≤ u_{N(β)} .   (12.6)
Clearly, whenever N(β) is used, its true value β_0 will be effective. In linking up a standard β_1 of true value β_{0,1} to N(β), the physical error equation reads
β1,l = β0 + xl = β0 + x0 + (xl − µx) + fx ; l = 1, . . . , n .
Obviously, β0 and x0 define the true value
β0,1 = β0 + x0 (12.7)
of the standard β_1. Averaging over the n values β_{1,l}; l = 1, …, n leads to

β̄_1 = β_0 + x̄ .   (12.8)

Unfortunately, the so-defined quantity is metrologically undefined. In order to arrive at a useful statement, we have to substitute N(β) for β_0 and include the associated systematic error f_{N(β)} in the uncertainty of the modified mean

β̄_1 = N(β) + x̄   (12.9)
so that
Fig. 12.2. The true values of standards of the same hierarchical rank need not coincide, nor must their uncertainties overlap – the respective uncertainties should rather localize the associated true values

u_{β̄_1} = u_{N(β)} + (t_P(n−1)/√n) s_x + f_{s,x} .   (12.10)
According to
β̄_1 − u_{β̄_1} ≤ β_{0,1} ≤ β̄_1 + u_{β̄_1} ,   (12.11)

the interval β̄_1 ± u_{β̄_1} localizes the true value β_{0,1} of the lower-ranking standard β_1. The true value β_{0,1} of β_1 will, of course, differ from the true value β_0 of the primary standard N(β).
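The link-up chain (12.2)–(12.11) can be traced numerically. The following sketch uses invented comparator readings; the bounds f_{s,x} and u_N as well as the Student factor t_P(n−1) are assumptions chosen for illustration, not values from the text.

```python
import statistics

# Hypothetical comparator readings x_l in mg (values invented); the error
# bound f_sx, the standard's uncertainty u_N and the Student factor t_P
# are assumptions.
readings = [0.1012, 0.0987, 0.1005, 0.0993, 0.1008, 0.0995]
n = len(readings)

x_bar = statistics.fmean(readings)     # arithmetic mean, (12.3)
s_x = statistics.stdev(readings)       # empirical standard deviation, (12.3)

t_P = 2.57                             # assumed t_P(n-1) for n = 6, P ~ 95 %
f_sx = 0.002                           # assumed bound of the systematic error
u_x = t_P / n ** 0.5 * s_x + f_sx      # uncertainty of the difference, (12.5)

u_N = 0.005                            # assumed uncertainty of the standard
u_beta1 = u_N + u_x                    # imported into the link-up, (12.10)
print(x_bar, u_x, u_beta1)
```

Note how the uncertainty of the calibrated standard can only grow with each rank of the hierarchy, since u_N is imported unchanged.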
12.1.2 Key Comparison
So-called key comparisons are intended to ensure that the national measurement institutes rely on equivalent SI standards. Let β^{(1)}, β^{(2)}, …, β^{(m)} denote a selected group of standards (of equal physical quality) with true values β^{(i)}_0; i = 1, …, m. The superscripts distinguish individuals of the same hierarchical rank, Fig. 12.2. Again, the true values of the standards need not coincide. It is rather indispensable that the uncertainties of the standards localize the respective true values. Key comparisons are implemented through round robins in which a suitable transfer standard, which we shall call T, is passed on from one participant to the next, where it is calibrated on the spot, Fig. 12.3. During the portage the physical properties of the transfer standard must not alter, so that each link-up relates to exactly one and the same true value. Subsequently, the mutual consistency of the results has to be tested.

The i-th participant, i ∈ 1, …, m, links the transfer standard T to his laboratory standard β^{(i)}. We assume the uncertainties of the laboratory standards to localize the pertaining true values β^{(i)}_0 so that

β̄^{(i)} − u_{β̄^{(i)}} ≤ β^{(i)}_0 ≤ β̄^{(i)} + u_{β̄^{(i)}} ,   i = 1, …, m .   (12.12)
For convenience, let us rewrite these inequalities as a set of equations
Fig. 12.3. Round robin for a group of m standards β^{(i)}; i = 1, …, m realized by means of a transfer standard T. The calibrations T̄^{(i)} should be mutually consistent

β̄^{(i)} = β^{(i)}_0 + f_{β^{(i)}} ;   −u_{β̄^{(i)}} ≤ f_{β^{(i)}} ≤ u_{β̄^{(i)}} ;   i = 1, …, m .   (12.13)
The physical error equations of the m link-ups read

T^{(i)}_l = β^{(i)}_0 + x^{(i)}_0 + (x^{(i)}_l − µ_{x^{(i)}}) + f_{x^{(i)}} ;   i = 1, …, m ;   l = 1, …, n

T̄^{(i)} = β^{(i)}_0 + x^{(i)}_0 + (x̄^{(i)} − µ_{x^{(i)}}) + f_{x^{(i)}} ;   i = 1, …, m .

The second group of equations is obviously identical to

T̄^{(i)} = β^{(i)}_0 + x̄^{(i)} ;   i = 1, …, m .
For practical reasons, we have to substitute estimators β̄^{(i)} for the inaccessible true values β^{(i)}_0 and to include the imported errors in the uncertainty u_{T̄^{(i)}} of the relevant mean T̄^{(i)}. We then find

T̄^{(i)} = β̄^{(i)} + x̄^{(i)} ,   u_{T̄^{(i)}} = u_{β̄^{(i)}} + u_{x̄^{(i)}} ;   i = 1, …, m .   (12.14)
Let T_0 designate the true value of the transfer standard. As

T_0 = β^{(i)}_0 + x^{(i)}_0 ;   i = 1, …, m ,   (12.15)

we have

|T̄^{(i)} − T_0| ≤ u_{T̄^{(i)}} ;   i = 1, …, m .   (12.16)
Figure 12.4 depicts the result of a consistent round robin. We might wish to quantify differences of the kind T̄^{(i)} − T̄^{(j)}. Here, we should observe

|T̄^{(i)} − T̄^{(j)}| ≤ u_{β̄^{(i)}} + u_{x̄^{(i)}} + u_{β̄^{(j)}} + u_{x̄^{(j)}} ;   i, j = 1, …, m .
Fig. 12.4. Round robin: in case of consistency of the standards β^{(i)}; i = 1, …, m, a horizontal line may be drawn intersecting each of the uncertainty intervals u_{T̄^{(i)}} of the calibrations T̄^{(i)}
Finally, we refer to the m means T̄^{(i)} to establish a weighted grand mean¹ as defined in (9.41),

β̄ = Σ_{i=1}^m w_i T̄^{(i)} ;   w_i = g_i² / Σ_{j=1}^m g_j² ;   g_i = 1/u_{T̄^{(i)}} .   (12.17)
In this context, the mean β̄ is called a key comparison reference value (KCRV). We derive the uncertainties u_{d̄_i} of the m differences

d̄_i = T̄^{(i)} − β̄ ;   i = 1, …, m .   (12.18)
To this end, we have to insert (12.14) into (12.18), so that

d̄_i = β̄^{(i)} + x̄^{(i)} − Σ_{j=1}^m w_j (β̄^{(j)} + x̄^{(j)}) .
Combining the two errors f_{x^{(i)}} and f_{β^{(i)}},

f_i = f_{x^{(i)}} + f_{β^{(i)}} ,   (12.19)

the error representation of (12.18) reads

d̄_i = T_0 + (x̄^{(i)} − µ_{x^{(i)}}) + f_i − Σ_{j=1}^m w_j [ T_0 + (x̄^{(j)} − µ_{x^{(j)}}) + f_j ]

    = (x̄^{(i)} − µ_{x^{(i)}}) − Σ_{j=1}^m w_j (x̄^{(j)} − µ_{x^{(j)}}) + f_i − Σ_{j=1}^m w_j f_j .   (12.20)
¹ As has been discussed in Sect. 9.3, a group of means with unequal true values does not define a coherent grand mean.
Reverting to individual measurements x^{(i)}_l we find

d̄_i = (1/n) Σ_{l=1}^n (x^{(i)}_l − µ_{x^{(i)}}) − Σ_{j=1}^m w_j [ (1/n) Σ_{l=1}^n (x^{(j)}_l − µ_{x^{(j)}}) ] + f_i − Σ_{j=1}^m w_j f_j

    = (1/n) Σ_{l=1}^n [ (x^{(i)}_l − µ_{x^{(i)}}) − Σ_{j=1}^m w_j (x^{(j)}_l − µ_{x^{(j)}}) ] + f_i − Σ_{j=1}^m w_j f_j

so that

d_{i,l} = (x^{(i)}_l − µ_{x^{(i)}}) − Σ_{j=1}^m w_j (x^{(j)}_l − µ_{x^{(j)}}) + f_i − Σ_{j=1}^m w_j f_j ;   l = 1, …, n .   (12.21)

Hence, we are in a position to define the n differences

d_{i,l} − d̄_i = (x^{(i)}_l − x̄^{(i)}) − Σ_{j=1}^m w_j (x^{(j)}_l − x̄^{(j)}) ,   l = 1, …, n .   (12.22)
The empirical variances and covariances of the participants are

s_{x^{(i)} x^{(j)}} ≡ s_{ij} = (1/(n−1)) Σ_{l=1}^n (x^{(i)}_l − x̄^{(i)})(x^{(j)}_l − x̄^{(j)}) ;   i, j = 1, …, m .

The s_{ij} define the empirical variance–covariance matrix

s = ( s_11  s_12  …  s_1m
      s_21  s_22  …  s_2m
      …     …     …  …
      s_m1  s_m2  …  s_mm ) ;   s_ii ≡ s_i² .
For convenience, we introduce the auxiliary vector

w = (w_1 w_2 … w_m)ᵀ

comprising the weights w_i. Then, from (12.22) we obtain

s²_{d̄_i} = (1/(n−1)) Σ_{l=1}^n (d_{i,l} − d̄_i)² = s_i² − 2 Σ_{j=1}^m w_j s_{ij} + wᵀ s w .   (12.23)
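The identity (12.23) can be checked numerically. A minimal sketch with invented readings for m = 3 participants, n = 4 repeats and equal weights compares the directly computed variance of the d_{i,l} with the matrix expression:

```python
# Invented comparator readings x^(i)_l for m = 3 participants, n = 4 repeats.
x = [
    [1.02, 0.98, 1.01, 0.99],   # participant 1
    [2.01, 2.03, 1.97, 1.99],   # participant 2
    [2.98, 3.02, 3.01, 2.99],   # participant 3
]
m, n = len(x), len(x[0])
w = [1.0 / m] * m                       # equal weights for simplicity

xb = [sum(row) / n for row in x]        # means x_bar^(i)
# empirical variances/covariances s_ij
s = [[sum((x[i][l] - xb[i]) * (x[j][l] - xb[j]) for l in range(n)) / (n - 1)
      for j in range(m)] for i in range(m)]

def s2_d(i):
    """Left-hand side of (12.23): variance of the d_il, computed directly."""
    d = [x[i][l] - sum(w[j] * x[j][l] for j in range(m)) for l in range(n)]
    db = sum(d) / n
    return sum((dl - db) ** 2 for dl in d) / (n - 1)

def s2_d_matrix(i):
    """Right-hand side of (12.23): s_i^2 - 2 sum_j w_j s_ij + w^T s w."""
    wsw = sum(w[a] * s[a][b] * w[b] for a in range(m) for b in range(m))
    return s[i][i] - 2 * sum(w[j] * s[i][j] for j in range(m)) + wsw

for i in range(m):
    assert abs(s2_d(i) - s2_d_matrix(i)) < 1e-12
print("(12.23) verified")
```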
The worst case estimation of (12.19) yields

f_{s,i} = f_{s,x^{(i)}} + u_{β̄^{(i)}} ;   i = 1, …, m .
Hence, the uncertainties of the differences (12.18) are given by
Fig. 12.5. The absolute value of the difference d̄_i = T̄^{(i)} − β̄ should not exceed the uncertainty u_{d̄_i}
u_{d̄_i} = (t_P(n−1)/√n) √( s_i² − 2 Σ_{j=1}^m w_j s_{ij} + wᵀ s w )
        + (1 − w_i) f_{s,i} + Σ_{j=1, j≠i}^m w_j f_{s,j} ,   i = 1, …, m .

We are free to add ±w_i f_{s,i} to the systematic terms on the right,

(1 − w_i) f_{s,i} + Σ_{j=1, j≠i}^m w_j f_{s,j} ± w_i f_{s,i} = f_{s,i} − 2 w_i f_{s,i} + Σ_{j=1}^m w_j f_{s,j} ,

so that

u_{d̄_i} = (t_P(n−1)/√n) √( s_i² − 2 Σ_{j=1}^m w_j s_{ij} + wᵀ s w )
        + f_{s,i} − 2 w_i f_{s,i} + Σ_{j=1}^m w_j f_{s,j} .   (12.24)
As illustrated in Fig. 12.5, we should observe

|T̄^{(i)} − β̄| ≤ u_{d̄_i} .   (12.25)
In the case of equal weights, we have to set w_i = 1/m. Whether or not (12.25) turns out to be practically useful appears somewhat questionable. Let us recall that the grand mean β̄ as defined in (12.17) is established through the calibrations T̄^{(i)} themselves. Consequently, we cannot rule out that, just by chance, correct calibrations might appear to be incorrect and incorrect ones might appear correct.

Rather, it appears simpler and less risky to check whether the uncertainties of the T̄^{(i)}; i = 1, …, m mutually overlap. If they do, this indicates compatibility of the β^{(1)}, β^{(2)}, …, β^{(m)}.
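The weighted grand mean (12.17) and the differences (12.18) are easily computed. A sketch with invented round-robin calibrations T̄^{(i)} and uncertainties (all numbers hypothetical):

```python
# Hypothetical round-robin calibrations T^(i) with uncertainties u_T^(i)
# (numbers invented for illustration).
T = [10.003, 9.998, 10.001, 10.005]
u = [0.004, 0.006, 0.003, 0.008]

g = [1.0 / ui for ui in u]                       # g_i = 1/u_T(i)
gsq = sum(gi ** 2 for gi in g)
w = [gi ** 2 / gsq for gi in g]                  # weights, (12.17)
beta = sum(wi * Ti for wi, Ti in zip(w, T))      # KCRV, (12.17)

d = [Ti - beta for Ti in T]                      # differences, (12.18)
print(beta, d)
```

By construction the weights sum to one and the weighted differences Σ w_i d̄_i cancel, which is exactly why the KCRV cannot serve as an independent arbiter of the calibrations it was built from.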
Finally, let us refer to what metrologists call "traceability". In a sense, this is the most basic definition of metrology, stating the imperative: the uncertainty of every standard and every physical quantity should localize the pertaining true value. Provided the alternative error model we refer to applies, we are in a position to claim:

The alternative error model meets the requirements of traceability.

In contrast to this, the ISO Guide presents us with difficulties in explaining the term and in realizing its implications – even lengthy discussions extending over years do not seem to have clarified the situation.
12.2 Mass Decades
Since the First General Conference on Weights and Measures in 1889, the unit of mass has been realized through artefacts, namely so-called prototypes of the kilogram.² However, as thorough long-term investigations have revealed, mass artefacts are not really time-constant quantities; their inconstancies may amount to up to ±5 µg [15]. Another problem relates to the high density of the platinum–iridium alloy: its density of 21 500 kg m⁻³ has to be confronted with the density of 8 000 kg m⁻³ of stainless steel weights. Owing to this difference, the primary link-up

prototype ⇒ 1 kg stainless steel weight

is affected by an air buoyancy of the order of magnitude of 100 mg, the uncertainty of which should lie in an interval of ±5 µg up to ±10 µg.

The scaling of the kilogram into submultiples and multiples such as

…, 10 g, 20 g, 50 g, 100 g, 200 g, 500 g, 1 kg, 2 kg, 5 kg, …   (12.26)

is carried out through weighing. In the following, we shall consider the ideas underlying mass link-ups. To adjust the weights to exact nominal values would be neither necessary nor, regarding the costs, desirable. Rather, differences of the kind

β_k = m_k − N(m_k) ;   k = 1, 2, …   (12.27)

between the physical masses m_k and their nominal values N(m_k) are considered.
Weighing Schemes
For simplicity, we shall confine ourselves to "down-scaling". Consider a set of weights with nominal values

N(m_1) = 500 g ,   N(m_2) = 200 g ,   N(m_3) = 200 g ,
N(m_4) = 100 g ,   N(m_5) = 100 g .   (12.28)

² The prototypes, whose nominal value is 1 kg (exact), are cylinders 39 mm in height and 39 mm in diameter, consisting of an alloy of 90% platinum and 10% iridium. At present, a new definition of the SI unit kg is being investigated.

Fig. 12.6. Link-up standard with nominal value N, true value N_0 and mean value N̄
In order to better control the weighing procedures, we might wish to add a so-called check-weight, for example
N(m6) = 300 g ,
the actual mass of which is known a priori. After the link-up, the two intervals

(m̄_6 ± u_{m̄_6})_{before calibration}   and   (m̄_6 ± u_{m̄_6})_{after calibration}

should mutually overlap.

According to Fig. 12.6, we assume that a standard of nominal value N = 1 kg is provided, N_0 being its physically true value and N̄ its known mean value with uncertainty u_{N̄},

N̄ − u_{N̄} ≤ N_0 ≤ N̄ + u_{N̄} ;   N̄ = N_0 + f_{N̄} ;   −u_{N̄} ≤ f_{N̄} ≤ u_{N̄} .   (12.29)
We shall always compare two groups of weights of equal nominal value, e.g.

m_1 + m_2 + m_3 + m_6   nominal value 1.2 kg
m_4 + m_5 + N_0         nominal value 1.2 kg     (1st comparison).

We refer to l = 1, …, n repeat measurements and represent the comparison in the form of an observational equation

m_1 + m_2 + m_3 + m_6 − (m_4 + m_5 + N_0) ≈ x̄_1 ;   x̄_1 = (1/n) Σ_{l=1}^n x_{1,l} .

In physical terms, the link-up standard enters, of course, with its true value N_0. As this value is unknown, we have to substitute N̄ for N_0 and to increase the uncertainty u_{x̄_1} of the right-hand side by u_{N̄},

m_1 + m_2 + m_3 + m_6 − (m_4 + m_5 + N̄) ≈ x̄_1 ;   u_{right-hand side} = u_{N̄} + u_{x̄_1} .
For convenience, we now turn to deviations as defined in (12.27),

β_1 + β_2 + β_3 − β_4 − β_5 + β_6 ≈ x̄_1 + d̄ .   (12.30)

Here

d̄ = N̄ − N   (12.31)

denotes the difference between N̄ and N; obviously,

d_0 = N_0 − N   (12.32)

denotes the true value of d̄. The associated error equation is

d̄ = d_0 + f_{N̄} ;   −u_{N̄} ≤ f_{N̄} ≤ u_{N̄} .   (12.33)
As we learn from (12.30), the entries of the first row of the design matrix A are +1 or −1, depending on whether the weight is a member of the first or of the second group. Since there are also comparisons which do not include all weights of the set considered, e.g.

m_6         nominal value 300 g
m_3 + m_5   nominal value 300 g     (2nd comparison),

i.e.

−β_3 − β_5 + β_6 ≈ x̄_2 ,

there will also be zeros. The first two rows of the design matrix A read

 1  1  1 −1 −1  1
 0  0 −1  0 −1  1 .   (12.34)

If there are r unknowns β_k; k = 1, …, r, we need m > r observational equations, allowing for rank(A) = r.

The exemplary weighing scheme presented below is made up of m = 16 mass comparisons and r = 6 unknown weights. The first five comparisons include the link-up standard N̄, which in the design matrix is indicated by a (+) sign. In general, N̄ will appear within the first p observational equations, so that

d = ( d̄ … d̄ | 0 … 0 )ᵀ ,   with p entries d̄ followed by m − p zeros .   (12.35)
Furthermore, 14 comparisons include the check-weight. Setting

β = (β_1 β_2 … β_r)ᵀ ,   x̄ = (x̄_1 x̄_2 … x̄_m)ᵀ ,

A =
   1  1  1 −1 −1  1     (+)
   1 −1  1  1  1  1     (+)
   1  1 −1  1  1  1     (+)
   1  1  0  1 −1  1     (+)
   1  0  1 −1  1  1     (+)
   1 −1  1 −1 −1 −1
   1 −1 −1  1  1 −1
   1  1 −1 −1 −1 −1
   1 −1  0  1 −1 −1
   1  0 −1 −1  1 −1
   0 −1  0 −1  0  1
   0 −1  0 −1  0  1
   0  0 −1  0 −1  1
   0  0 −1  0 −1  1
   0  0  0  1 −1  0
   0  0  0  1 −1  0
the inconsistent linear system takes the form

Aβ ≈ x̄ + d .   (12.36)

The least squares solution vector is

β̄ = (Aᵀ A)⁻¹ Aᵀ (x̄ + d) = Bᵀ (x̄ + d) ;   B = A (Aᵀ A)⁻¹ = (b_{ik})   (12.37)

or, written in components,

β̄_k = Σ_{i=1}^m b_{ik} x̄_i + d̄ Σ_{i=1}^p b_{ik} ;   k = 1, …, r .   (12.38)
Inserting the error equations

x̄_i = x_{0,i} + (x̄_i − µ_i) + f_i ,   d̄ = d_0 + f_{N̄}

we find

β̄_k = Σ_{i=1}^m b_{ik} [ x_{0,i} + (x̄_i − µ_i) + f_i ] + (d_0 + f_{N̄}) Σ_{i=1}^p b_{ik} ;   k = 1, …, r .
Obviously,

β_{0,k} = Σ_{i=1}^m b_{ik} x_{0,i} + d_0 Σ_{i=1}^p b_{ik}   (12.39)

are the true values of the estimators β̄_k. The propagated systematic errors turn out to be

f_{s,β̄_k} = Σ_{i=1}^m |b_{ik}| f_{s,i} + u_{N̄} | Σ_{i=1}^p b_{ik} | ;   k = 1, …, r .   (12.40)
Going back to individual measurements, (12.38) leads us to

β_{kl} = Σ_{i=1}^m b_{ik} x_{il} + d̄ Σ_{i=1}^p b_{ik} ;   k = 1, …, r .   (12.41)

By means of the differences

β_{kl} − β̄_k = Σ_{i=1}^m b_{ik} (x_{il} − x̄_i) ;   k = 1, …, r ;   l = 1, …, n

we arrive at the empirical variances and covariances of the components of the solution vector,

s_{β̄_k β̄_k′} = (1/(n−1)) Σ_{l=1}^n (β_{kl} − β̄_k)(β_{k′l} − β̄_{k′})

           = (1/(n−1)) Σ_{l=1}^n [ Σ_{i=1}^m b_{ik} (x_{il} − x̄_i) ] [ Σ_{j=1}^m b_{jk′} (x_{jl} − x̄_j) ]

           = Σ_{i,j}^m b_{ik} b_{jk′} s_{ij} ;   s_{β̄_k β̄_k} ≡ s²_{β̄_k} .   (12.42)

Here, the elements

s_{ij} = (1/(n−1)) Σ_{l=1}^n (x_{il} − x̄_i)(x_{jl} − x̄_j) ;   i, j = 1, …, m   (12.43)

designate the empirical variances and covariances of the input data. After all, we shall assign uncertainties

u_{β̄_k} = (t_P(n−1)/√n) s_{β̄_k} + f_{s,β̄_k}   (12.44)

       = (t_P(n−1)/√n) √( Σ_{i,j}^m b_{ik} b_{jk} s_{ij} ) + Σ_{i=1}^m |b_{ik}| f_{s,i} + u_{N̄} | Σ_{i=1}^p b_{ik} |

to the r adjusted weights β̄_k; k = 1, …, r of the set.
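The least squares machinery (12.36)–(12.38) can be carried out for the exemplary weighing scheme. The following sketch uses exact rational arithmetic; the Gauss–Jordan helper and the use of the fractions module are implementation choices for illustration, not part of the original scheme.

```python
from fractions import Fraction

# The 16 x 6 design matrix A of the exemplary weighing scheme; the first
# p = 5 comparisons include the link-up standard (the "(+)" rows).
A = [
    [ 1,  1,  1, -1, -1,  1],
    [ 1, -1,  1,  1,  1,  1],
    [ 1,  1, -1,  1,  1,  1],
    [ 1,  1,  0,  1, -1,  1],
    [ 1,  0,  1, -1,  1,  1],
    [ 1, -1,  1, -1, -1, -1],
    [ 1, -1, -1,  1,  1, -1],
    [ 1,  1, -1, -1, -1, -1],
    [ 1, -1,  0,  1, -1, -1],
    [ 1,  0, -1, -1,  1, -1],
    [ 0, -1,  0, -1,  0,  1],
    [ 0, -1,  0, -1,  0,  1],
    [ 0,  0, -1,  0, -1,  1],
    [ 0,  0, -1,  0, -1,  1],
    [ 0,  0,  0,  1, -1,  0],
    [ 0,  0,  0,  1, -1,  0],
]
m, r = len(A), len(A[0])

# Normal matrix A^T A in exact rational arithmetic (no round-off)
AtA = [[Fraction(sum(A[i][k] * A[i][kk] for i in range(m)))
        for kk in range(r)] for k in range(r)]

def invert(M):
    """Gauss-Jordan inverse; fails with StopIteration if M is singular."""
    n = len(M)
    aug = [list(M[i]) + [Fraction(int(i == j)) for j in range(n)]
           for i in range(n)]
    for col in range(n):
        piv = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [e / pivot for e in aug[col]]
        for i in range(n):
            if i != col:
                f = aug[i][col]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]

AtA_inv = invert(AtA)          # inversion succeeds, hence rank(A) = r = 6
B = [[sum(Fraction(A[i][k]) * AtA_inv[k][j] for k in range(r))
      for j in range(r)] for i in range(m)]   # B = A (A^T A)^{-1}, (12.37)
print("rank(A) =", r)
```

With B in hand, (12.38) and (12.44) reduce to plain sums over the columns b_{ik} once the measured means x̄_i and error bounds are supplied.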
12.3 Pairwise Comparisons
Occasionally, within a set of physical quantities of the same quality (lengths, voltages, etc.), only differences between two individuals are measurable. Consider, for example, four quantities β_k; k = 1, …, 4. Then there are either six or twelve measurable differences, depending on whether not only the differences

(β_i − β_j)   but also the inversions   (β_j − β_i)
appear to be of metrological relevance. Let us refer to the first, simpler situation,
β1 − β2 ≈ x1
β1 − β3 ≈ x2
β1 − β4 ≈ x3
β2 − β3 ≈ x4
β2 − β4 ≈ x5
β3 − β4 ≈ x6 . (12.45)
Obviously, the first and the second relation establish the fourth one, the first and the third the fifth one and, finally, the second and the third the sixth one. Consequently, the design matrix

A =
   1 −1  0  0
   1  0 −1  0
   1  0  0 −1
   0  1 −1  0
   0  1  0 −1
   0  0  1 −1     (12.46)

has rank 3. As has been outlined in Sect. 7.3, a constraint is needed. Let us, e.g., consider

h_1 β_1 + h_2 β_2 + h_3 β_3 + h_4 β_4 = y .   (12.47)

The uncertainties of the β̄_k; k = 1, …, 4 follow from Chap. 9. Of course, the right-hand side of (12.47) may also refer to quantities of the kind ȳ ± u_ȳ.
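The rank deficiency of (12.46) can be confirmed by direct inspection. A minimal sketch verifying the linear dependencies stated above, and the fact that every row annihilates a common offset:

```python
# Design matrix (12.46) of the six pairwise differences of four quantities.
A = [
    [1, -1,  0,  0],
    [1,  0, -1,  0],
    [1,  0,  0, -1],
    [0,  1, -1,  0],
    [0,  1,  0, -1],
    [0,  0,  1, -1],
]

# Rows 4-6 are linear combinations of rows 1-3, so rank(A) < 4:
r1, r2, r3 = A[0], A[1], A[2]
assert A[3] == [b - a for a, b in zip(r1, r2)]   # row4 = row2 - row1
assert A[4] == [b - a for a, b in zip(r1, r3)]   # row5 = row3 - row1
assert A[5] == [b - a for a, b in zip(r2, r3)]   # row6 = row3 - row2

# Every row sums to zero: shifting all beta_k by a constant leaves all
# differences unchanged, which is why a constraint like (12.47) is needed.
assert all(sum(row) == 0 for row in A)
print("rank deficiency confirmed; a constraint such as (12.47) is required")
```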
12.4 Fundamental Constants of Physics
Among others, the following physical constants are at present considered fundamental:

h    Planck constant               R∞   Rydberg constant
e    elementary charge             µB   Bohr magneton
c    speed of light in vacuum      µN   nuclear magneton
µ0   permeability of vacuum        me   electron mass
ε0   permittivity of vacuum        mp   proton mass
α    fine-structure constant       Mp   proton molar mass
γp   proton gyromagnetic ratio     µp   proton magnetic moment
NA   Avogadro constant             F    Faraday constant

Due to measurement errors, fundamental constants taken from measured data prove mutually inconsistent, i.e. different concatenations which we expect to produce equal results disclose numerical discrepancies. To circumvent this mishap, a sufficiently large set of constants, say r, is subjected to a least squares adjustment calling for mutual consistency. Just as the uncertainties of the input data are expected to localize the pertaining true values, we require the adjusted constants to meet the respective demand, i.e. each of the newly obtained intervals

(constant ± uncertainty)_k ,   k = 1, …, r

should localize the pertaining true value. The latter property should apply irrespective of the weighting procedure, as has been discussed in Sects. 8.4 and 9.3.
Examples of concatenations between fundamental constants are

R∞ = m_e c α² / (2h) ,   α = µ0 c e² / (2h) ,   µB = e h / (4π m_e) ,   µN = e h / (4π m_p) ,

M_p = N_A m_p ,   γ_p = (µ_p/µ_B)(m_p/m_e)(N_A e / M_p) ,   ε0 µ0 = 1/c² ,   F = N_A e .
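Two of these concatenations can be spot-checked numerically. The sketch below uses CODATA-2018 values, quoted to limited precision — an assumption of this illustration, since the adjustment described in the text works with measured input data, not with these rounded figures.

```python
# CODATA-2018 values (treated here as given inputs, rounded):
N_A = 6.02214076e23       # Avogadro constant, 1/mol
e   = 1.602176634e-19     # elementary charge, C
h   = 6.62607015e-34      # Planck constant, J s
c   = 299792458.0         # speed of light in vacuum, m/s
mu0 = 1.25663706212e-6    # vacuum permeability, N/A^2

F = N_A * e                          # Faraday constant, F = N_A e
alpha = mu0 * c * e**2 / (2 * h)     # fine-structure constant

print(F, alpha, 1 / alpha)
```

Both results reproduce the tabulated values of F and α to within the precision of the quoted inputs, illustrating what "self-consistency" of a set of constants means in practice.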
Those constants which are not members of the adjustment will be derived from the adjusted ones so that, as the bottom line, a complete system of numerically self-consistent constants is obtained. Two remarks may be in order:

– The smaller the set of primary constants, the larger the risk of an overall maladjustment.
– The adjusted constants are not exact but rather self-consistent, i.e. they no longer give rise to numerical discrepancies.³ We expect the uncertainties as issued by the adjustment to localize true values, Fig. 12.7.

Constants may enter the adjustment either directly as measured values

K_i = measured value (i) ,   i = 1, 2, …   (12.48)

or in the form of given concatenations

φ_i(K_1, K_2, …) = measured value (i) ,   i = 1, 2, …   (12.49)

In the latter case, the functions φ_i, i = 1, 2, … have to be linearized by means of suitable starting values. The presently accepted form of adjustment is presented in [17]. Here, however, the error model recommended in [34] is used. As the latter sets aside the introduction of true values, collective shifts of the numerical levels of the fundamental physical constants⁴ are, at least in the opinion of the author, conceivable.

In the following, we shall outline the principles of an alternative adjustment [28].

³ I.e. the adjustment establishes consistency through redistribution of discrepancies.

Fig. 12.7. Consider a set of three fundamental constants β_1, β_2 and β_3 and scrutinize the interplay between weightings and the capability of the system to localize true values
Linearizing the System of Equations
Given m functions

φ_i(K_1, K_2, …, K_r) ≈ a_i y_i ;   i = 1, …, m   (12.50)

between the r < m constants K_k; k = 1, …, r to be adjusted. Not every constant will enter each functional relationship. Let the a_i be numerically known quantities, the details of which will not concern us here. Furthermore, let the input data

ȳ_i = Σ_{j=1}^{m_i} w^{(i)}_j x̄^{(i)}_j ;   i = 1, …, m   (12.51)

be weighted means, each defined through m_i means⁵ x̄^{(i)}_j; j = 1, …, m_i. The quotations

x̄^{(i)}_j ± u^{(i)}_j   (12.52)

x̄^{(i)}_j = (1/n) Σ_{l=1}^n x^{(i)}_{jl} ,   u^{(i)}_j = (t_P(n−1)/√n) s^{(i)}_j + f^{(i)}_{s,j} ;   j = 1, …, m_i

are presented by the different working groups participating in the adjustment. The pertaining error equations

x^{(i)}_{jl} = x^{(i)}_0 + (x^{(i)}_{jl} − µ^{(i)}_j) + f^{(i)}_j

x̄^{(i)}_j = x^{(i)}_0 + (x̄^{(i)}_j − µ^{(i)}_j) + f^{(i)}_j ;   −f^{(i)}_{s,j} ≤ f^{(i)}_j ≤ f^{(i)}_{s,j} ;   j = 1, …, m_i   (12.53)

⁴ http://www.nist.gov
⁵ See Sect. 9.3, where the precondition for the existence of a metrologically reasonable mean of means has been extensively discussed.
lead us to the empirical variances and covariances of the i-th functional relationship,

s^{(i)} = ( s^{(i)}_{jj′} )   (12.54)

s^{(i)}_{jj′} = (1/(n−1)) Σ_{l=1}^n (x^{(i)}_{jl} − x̄^{(i)}_j)(x^{(i)}_{j′l} − x̄^{(i)}_{j′}) ;   s^{(i)}_{jj} ≡ s^{(i)}_j ² .

The weights

w^{(i)}_j = g^{(i)}_j ² / Σ_{j′=1}^{m_i} g^{(i)}_{j′} ² ;   g^{(i)}_j = 1/u^{(i)}_j ;   j = 1, …, m_i

and the uncertainties

u_{ȳ_i} = (t_P(n−1)/√n) √( Σ_{j,j′}^{m_i} w^{(i)}_j w^{(i)}_{j′} s^{(i)}_{jj′} ) + Σ_j^{m_i} w^{(i)}_j f^{(i)}_{s,j}   (12.55)
have been quoted in Sect. 9.3.

Expanding the φ_i by means of known starting values K_{v,k}; k = 1, …, r yields

φ_i(K_1, …, K_r) = φ_{i,v}(K_{v,1}, …, K_{v,r})
    + (∂φ_i/∂K_{v,1})(K_1 − K_{v,1}) + ⋯ + (∂φ_i/∂K_{v,r})(K_r − K_{v,r}) ≈ a_i ȳ_i .

Truncating non-linear terms and rearranging leads to

(1/φ_{i,v})(∂φ_i/∂K_{v,1})(K_1 − K_{v,1}) + ⋯ + (1/φ_{i,v})(∂φ_i/∂K_{v,r})(K_r − K_{v,r}) ≈ (a_i ȳ_i − φ_{i,v})/φ_{i,v} ,

where

φ_{i,v} ≡ φ_{i,v}(K_{v,1}, …, K_{v,r}) .

A reasonable weighting matrix is

G = diag( g_1, g_2, …, g_m ) ;   g_i = 1 / ( |a_i/φ_{i,v}| u_{ȳ_i} ) ,   (12.56)

leading to

(g_i/φ_{i,v})(∂φ_i/∂K_{v,1})(K_1 − K_{v,1}) + ⋯ + (g_i/φ_{i,v})(∂φ_i/∂K_{v,r})(K_r − K_{v,r}) ≈ g_i (a_i ȳ_i − φ_{i,v})/φ_{i,v} ,

i = 1, …, m .   (12.57)
To abbreviate, we put

K_k − K_{v,k} = K_{v,k} β_k ;   K_{v,k} (g_i/φ_{i,v}) (∂φ_i/∂K_{v,k}) = a_{ik} ;   z_i = g_i (a_i ȳ_i − φ_{i,v})/φ_{i,v}   (12.58)

so that (12.57) turns into

a_{i1} β_1 + ⋯ + a_{ir} β_r ≈ z_i ;   i = 1, …, m .   (12.59)

Finally, defining

A = (a_{ik}) ,   β = (β_1 β_2 … β_r)ᵀ   and   z = (z_1 z_2 … z_m)ᵀ ,

we arrive at

Aβ ≈ z .   (12.60)
Here, the ≈ sign covers both linearization and measurement errors.
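The transition (12.58)–(12.60) can be illustrated with a toy system. The sketch below is an invented example: two hypothetical constants K1, K2 entering three made-up relations φ_1 = K1, φ_2 = K2, φ_3 = K1·K2, with a_i = g_i = 1 for simplicity.

```python
# Toy linearized adjustment in the spirit of (12.50)-(12.60); all numbers
# are invented for illustration.
y = [2.02, 2.97, 6.03]            # "measured" right-hand sides a_i*y_i
Kv = [1.9, 3.1]                   # starting values K_v,k

phi_v = [Kv[0], Kv[1], Kv[0] * Kv[1]]
dphi = [[1.0, 0.0], [0.0, 1.0], [Kv[1], Kv[0]]]   # partial derivatives

# a_ik = K_v,k/phi_i,v * dphi_i/dK_k ;  z_i = (a_i y_i - phi_i,v)/phi_i,v
A = [[Kv[k] / phi_v[i] * dphi[i][k] for k in range(2)] for i in range(3)]
z = [(y[i] - phi_v[i]) / phi_v[i] for i in range(3)]

# Least squares via the 2 x 2 normal equations
AtA = [[sum(A[i][k] * A[i][kk] for i in range(3)) for kk in range(2)]
       for k in range(2)]
Atz = [sum(A[i][k] * z[i] for i in range(3)) for k in range(2)]
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
beta = [(AtA[1][1] * Atz[0] - AtA[0][1] * Atz[1]) / det,
        (AtA[0][0] * Atz[1] - AtA[1][0] * Atz[0]) / det]

K = [Kv[k] * (1 + beta[k]) for k in range(2)]   # back-transformation (12.80)
print(K)
```

Despite deliberately poor starting values, one linearization step already moves the constants close to the values suggested by the (invented) measurements.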
Structure of the Solution Vector
From (12.60) we obtain

β̄ = Bᵀ z ;   B = A (Aᵀ A)⁻¹ ,   (12.61)

or, written in components,

β̄_k = Σ_{i=1}^m b_{ik} z_i ;   k = 1, …, r .

Inserting the z_i as given in (12.58) yields

β̄_k = Σ_{i=1}^m b_{ik} g_i (a_i ȳ_i − φ_{i,v})/φ_{i,v} = Σ_{i=1}^m b_{ik} a_i g_i ȳ_i/φ_{i,v} − Σ_{i=1}^m b_{ik} g_i .

We then introduce the ȳ_i according to (12.51),

β̄_k = Σ_{i=1}^m (b_{ik} a_i g_i/φ_{i,v}) Σ_{j=1}^{m_i} w^{(i)}_j x̄^{(i)}_j − Σ_{i=1}^m b_{ik} g_i ,   (12.62)

and, finally, the x̄^{(i)}_j as given in (12.53),

β̄_k = Σ_{i=1}^m (b_{ik} a_i g_i/φ_{i,v}) Σ_{j=1}^{m_i} w^{(i)}_j [ x^{(i)}_0 + (x̄^{(i)}_j − µ^{(i)}_j) + f^{(i)}_j ] − Σ_{i=1}^m b_{ik} g_i

    = Σ_{i=1}^m (b_{ik} a_i g_i/φ_{i,v}) x^{(i)}_0 − Σ_{i=1}^m b_{ik} g_i
      + Σ_{i=1}^m (b_{ik} a_i g_i/φ_{i,v}) Σ_{j=1}^{m_i} w^{(i)}_j [ (x̄^{(i)}_j − µ^{(i)}_j) + f^{(i)}_j ] ,   (12.63)

where Σ_{j=1}^{m_i} w^{(i)}_j = 1 has been used.
Obviously,

β_{0,k} = Σ_{i=1}^m (b_{ik} a_i g_i/φ_{i,v}) x^{(i)}_0 − Σ_{i=1}^m b_{ik} g_i = Σ_{i=1}^m b_{ik} g_i (a_i x^{(i)}_0 − φ_{i,v})/φ_{i,v} ,   k = 1, …, r   (12.64)

designate the true values of the estimators β̄_k within the limits of vanishing linearization errors, so that

β̄_k = β_{0,k} + Σ_{i=1}^m (b_{ik} a_i g_i/φ_{i,v}) Σ_{j=1}^{m_i} w^{(i)}_j [ (x̄^{(i)}_j − µ^{(i)}_j) + f^{(i)}_j ] .   (12.65)

From this we obtain the propagated systematic errors

f_{β̄_k} = Σ_{i=1}^m (b_{ik} a_i g_i/φ_{i,v}) Σ_{j=1}^{m_i} w^{(i)}_j f^{(i)}_j ;   k = 1, …, r .   (12.66)
To simplify the handling, we alter the nomenclature and turn the double sum (12.62) into an ordinary sum,

β̄_k = b_{1k} (a_1 g_1/φ_{1,v}) [ w^{(1)}_1 x̄^{(1)}_1 + w^{(1)}_2 x̄^{(1)}_2 + … + w^{(1)}_{m_1} x̄^{(1)}_{m_1} ]
    + b_{2k} (a_2 g_2/φ_{2,v}) [ w^{(2)}_1 x̄^{(2)}_1 + w^{(2)}_2 x̄^{(2)}_2 + … + w^{(2)}_{m_2} x̄^{(2)}_{m_2} ]
    + …
    + b_{mk} (a_m g_m/φ_{m,v}) [ w^{(m)}_1 x̄^{(m)}_1 + w^{(m)}_2 x̄^{(m)}_2 + … + w^{(m)}_{m_m} x̄^{(m)}_{m_m} ]
    − Σ_{i=1}^m b_{ik} g_i

so that

β̄_k = Σ_{i=1}^M b̃_{ik} v̄_i − Σ_{i=1}^m b_{ik} g_i   (12.67)

putting

M = Σ_{i=1}^m m_i   (12.68)

and

i = 1 :   v̄_1 = x̄^{(1)}_1 ,   v̄_2 = x̄^{(1)}_2 ,   … ,   v̄_{m_1} = x̄^{(1)}_{m_1}
i = 2 :   v̄_{m_1+1} = x̄^{(2)}_1 ,   v̄_{m_1+2} = x̄^{(2)}_2 ,   … ,   v̄_{m_1+m_2} = x̄^{(2)}_{m_2}
…
i = m :   v̄_{q+1} = x̄^{(m)}_1 ,   v̄_{q+2} = x̄^{(m)}_2 ,   … ,   v̄_M = x̄^{(m)}_{m_m}

as well as

i = 1 :   b̃_{1k} = b_{1k} (a_1 g_1/φ_{1,v}) w^{(1)}_1 ,   b̃_{2k} = b_{1k} (a_1 g_1/φ_{1,v}) w^{(1)}_2 ,   … ,   b̃_{m_1,k} = b_{1k} (a_1 g_1/φ_{1,v}) w^{(1)}_{m_1}
i = 2 :   b̃_{m_1+1,k} = b_{2k} (a_2 g_2/φ_{2,v}) w^{(2)}_1 ,   b̃_{m_1+2,k} = b_{2k} (a_2 g_2/φ_{2,v}) w^{(2)}_2 ,   … ,   b̃_{m_1+m_2,k} = b_{2k} (a_2 g_2/φ_{2,v}) w^{(2)}_{m_2}
…
i = m :   b̃_{q+1,k} = b_{mk} (a_m g_m/φ_{m,v}) w^{(m)}_1 ,   b̃_{q+2,k} = b_{mk} (a_m g_m/φ_{m,v}) w^{(m)}_2 ,   … ,   b̃_{M,k} = b_{mk} (a_m g_m/φ_{m,v}) w^{(m)}_{m_m} ,

where

q = Σ_{j=1}^{m−1} m_j .

Inserting (12.52) into (12.62) yields

β̄_k = Σ_{i=1}^m (b_{ik} a_i g_i/φ_{i,v}) Σ_{j=1}^{m_i} w^{(i)}_j (1/n) Σ_{l=1}^n x^{(i)}_{jl} − Σ_{i=1}^m b_{ik} g_i

    = (1/n) Σ_{l=1}^n [ Σ_{i=1}^m Σ_{j=1}^{m_i} (b_{ik} a_i g_i/φ_{i,v}) w^{(i)}_j x^{(i)}_{jl} ] − Σ_{i=1}^m b_{ik} g_i ,   (12.69)

so that

β̄_k = (1/n) Σ_{l=1}^n β_{kl} ;   k = 1, …, r ,   (12.70)
understanding the

β_{kl} = Σ_{i=1}^m Σ_{j=1}^{m_i} (b_{ik} a_i g_i/φ_{i,v}) w^{(i)}_j x^{(i)}_{jl} − Σ_{i=1}^m b_{ik} g_i ;   k = 1, …, r ;   l = 1, …, n   (12.71)

as estimators of an adjustment based on the respective l-th repeat measurements. The n differences between (12.71) and (12.62),

β_{kl} − β̄_k = Σ_{i=1}^m Σ_{j=1}^{m_i} (b_{ik} a_i g_i/φ_{i,v}) w^{(i)}_j (x^{(i)}_{jl} − x̄^{(i)}_j) ;   k = 1, …, r ;   l = 1, …, n   (12.72)

give rise to the elements of the empirical variance–covariance matrix of the solution vector. To have a concise notation, we introduce

i = 1 :   v_{1l} = x^{(1)}_{1l} ,   v_{2l} = x^{(1)}_{2l} ,   … ,   v_{m_1,l} = x^{(1)}_{m_1,l}
i = 2 :   v_{m_1+1,l} = x^{(2)}_{1l} ,   v_{m_1+2,l} = x^{(2)}_{2l} ,   … ,   v_{m_1+m_2,l} = x^{(2)}_{m_2,l}
…
i = m :   v_{q+1,l} = x^{(m)}_{1l} ,   v_{q+2,l} = x^{(m)}_{2l} ,   … ,   v_{M,l} = x^{(m)}_{m_m,l}

so that

β_{kl} − β̄_k = Σ_{i=1}^M b̃_{ik} (v_{il} − v̄_i) ;   k = 1, …, r ;   l = 1, …, n .   (12.73)

Further, we define an auxiliary matrix assembling the b̃_{ik},

B̃ = ( b̃_{ik} ) .   (12.74)

Then, referring to the empirical variance–covariance matrix of all input data,

s = (s_{ij}) ;   s_{ij} = (1/(n−1)) Σ_{l=1}^n (v_{il} − v̄_i)(v_{jl} − v̄_j) ;   i, j = 1, …, M ,   (12.75)

the empirical variance–covariance matrix of the solution vector β̄ turns out to be

s_{β̄} = B̃ᵀ s B̃ .   (12.76)

Its elements are

s_{β̄_k β̄_{k′}} = (1/(n−1)) Σ_{l=1}^n (β_{kl} − β̄_k)(β_{k′l} − β̄_{k′}) ;   s_{β̄_k β̄_k} ≡ s²_{β̄_k} ;   k, k′ = 1, …, r .   (12.77)

To concisely present the systematic errors, we set

i = 1 :   f_1 = f^{(1)}_1 ,   f_2 = f^{(1)}_2 ,   … ,   f_{m_1} = f^{(1)}_{m_1}
i = 2 :   f_{m_1+1} = f^{(2)}_1 ,   f_{m_1+2} = f^{(2)}_2 ,   … ,   f_{m_1+m_2} = f^{(2)}_{m_2}
…
i = m :   f_{q+1} = f^{(m)}_1 ,   f_{q+2} = f^{(m)}_2 ,   … ,   f_M = f^{(m)}_{m_m}

so that (12.66) becomes

f_{β̄_k} = Σ_{i=1}^M b̃_{ik} f_i ;   k = 1, …, r .   (12.78)

Finally, the result of the adjustment is

β̄_k ± u_{β̄_k} ;   u_{β̄_k} = (t_P(n−1)/√n) s_{β̄_k} + Σ_{i=1}^M |b̃_{ik}| f_{s,i} ;   k = 1, …, r .   (12.79)
Uncertainties of the Constants
According to (12.58), the constants and their uncertainties turn out to be

K̄_k = K_{v,k} (β̄_k + 1) ,   u_{K̄_k} = K_{v,k} u_{β̄_k} ;   k = 1, …, r .   (12.80)

Cyclically repeating the adjustment with improved starting values should reduce the linearization errors. However, on account of the errors of the input data, the estimators β̄_k; k = 1, …, r tend to oscillate after a certain number of cycles, telling us to end the iteration.
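The cyclic relinearization can be sketched for a single hypothetical constant. With one error-free equation the cycle converges instead of oscillating — the oscillation described above only appears with scattered, over-determined input data — so the stopping criterion below is a stand-in for the "end the iteration" rule.

```python
# Sketch of cyclic relinearization for one invented constant K obeying
# phi(K) = K^3 = y; all numbers are hypothetical.
y = 8.12                      # "measured" value
K = 1.5                       # deliberately poor starting value

for cycle in range(20):
    phi_v = K ** 3
    a = K / phi_v * 3 * K ** 2        # a_11 per (12.58): K_v/phi_v * dphi/dK
    z = (y - phi_v) / phi_v           # z_1 per (12.58), with a_1 = g_1 = 1
    beta = z / a                      # one-equation least squares solution
    K = K * (1 + beta)                # improved starting value, cf. (12.80)
    if abs(beta) < 1e-12:             # stop when the update stalls
        break

print(K, cycle)
```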
12.5 Trigonometric Polynomials
Let us look for a smoothing trigonometric polynomial to be fitted to a given empirical relationship between any two measured quantities. At first, we shall assume the model function to be unknown. After this, we shall consider the model function to be known – yet, to simplify, we wish to substitute a trigonometric polynomial, presuming the difference between the two functions to be negligible as compared with the error bounds of the empirical data.

For simplicity, we confine ourselves to the case of error-free abscissas and erroneous ordinates,

(x_{0,1}, ȳ_1) , (x_{0,2}, ȳ_2) , … , (x_{0,m}, ȳ_m) ,   (12.81)

where we assume

ȳ_i ± u_{ȳ_i} ;   i = 1, …, m   (12.82)

ȳ_i = (1/n) Σ_{l=1}^n y_{il} = y_{0,i} + (ȳ_i − µ_i) + f_{y_i} ,   u_{ȳ_i} = (t_P(n−1)/√n) s_{y_i} + f_{s,y_i} .
12.5.1 Unknown Model Function
We require the fitted function to intersect each of the error bars of the given pairs of data. To implement this demand, we are looking for a well-behaved, bendable trigonometric polynomial. Let us keep in mind that if the polynomial's degree is too low, it will not be in a position to adapt itself to the details suggested by the sequence of data. If it is too high, the polynomial might follow the data's pseudo-structures too closely. Fortunately, the number of terms of a trigonometric polynomial is not overly critical. Hence, we may fix its degree by trial and error so that the fitted polynomial passes through each of the error bars and, at the same time, appropriately smoothes the empirical structure of the given data sequence.
We suppose the data pairs to be sufficiently numerous and closely spaced, so that the adjustment of an empirical function appears physically reasonable. We then ask for a smoothing trigonometric polynomial

y(x) = β_1 + Σ_{k=2}^r β_k t_k(x) ;   r = 3, 5, 7, … < m

t_k(x) = cos( (k/2) ω_0 x ) ,        k even ;
t_k(x) = sin( ((k−1)/2) ω_0 x ) ,    k odd ;   k ≥ 2 ,

which we intend to adjust according to least squares. To find the coefficients β_k; k = 1, …, r over the interval

∆X = x_{0,m} − x_{0,1} ,   (12.83)
we define
t_{ik} = cos( (k/2) ω_0 x_{0,i} ) ,       k even ;
t_{ik} = sin( ((k−1)/2) ω_0 x_{0,i} ) ,   k odd ;   k ≥ 2 ,   i = 1, …, m .   (12.84)
As is well known, trigonometric polynomials (and Fourier series) are based on auxiliary frequencies which are multiples of a basic frequency

ω_0 = 2π/∆X ,

their purpose being to keep the least squares adjustment linear.⁶ After all, the inconsistent system is given by

β_1 + β_2 t_{i2} + ⋯ + β_r t_{ir} ≈ ȳ_i ;   i = 1, …, m ;   r = 3, 5, 7, … < m .   (12.85)

The design matrix

A =
  1  t_12  t_13  ⋯  t_1r
  1  t_22  t_23  ⋯  t_2r
  ⋯   ⋯    ⋯   ⋯   ⋯
  1  t_m2  t_m3  ⋯  t_mr

and the column vectors

ȳ = (ȳ_1 ȳ_2 ⋯ ȳ_m)ᵀ ,   β = (β_1 β_2 ⋯ β_r)ᵀ

transfer (12.85) into

Aβ ≈ ȳ .   (12.86)

Assuming rank(A) = r, we have

β̄ = Bᵀ ȳ ;   B = A (Aᵀ A)⁻¹ = (b_{ik}) .   (12.87)

The components

β̄_k = Σ_{i=1}^m b_{ik} ȳ_i ;   k = 1, …, r   (12.88)

of the solution vector β̄ establish the adjusted polynomial

ȳ(x) = β̄_1 + Σ_{k=2}^r β̄_k t_k(x) ;   r = 3, 5, 7, … < m .   (12.89)
Since the model function is unknown, all we can do is argue as follows: the error bars ȳ_i ± u_{ȳ_i}; i = 1, …, m localize the true values y_{0,i} of the means ȳ_i. And, as the trigonometric polynomial shall pass through each of the error bars, we have

|ȳ(x_{0,i}) − y_{0,i}| ≤ 2 u_{ȳ_i} ;   i = 1, …, m .   (12.90)

⁶ This suggests we also adjust the frequencies, but we would then be faced with a highly non-linear problem to which, in general, there is no solution. Remarkably enough, G. Prony, a contemporary of J. Fourier, developed a true harmonic analysis.
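The adjustment (12.84)–(12.88) can be sketched in pure Python. In the example below the ordinates are invented to lie exactly in the span of the basis (coefficients 1.0, 0.5, 0.3 are assumptions), so the least squares solution must reproduce them; with real, scattered data the solution would smooth instead.

```python
import math

def t_k(k, w0, x):
    """Basis functions t_k per (12.84): cos for even k, sin for odd k >= 2."""
    if k % 2 == 0:
        return math.cos(k // 2 * w0 * x)
    return math.sin((k - 1) // 2 * w0 * x)

def solve(M, rhs):
    """Gaussian elimination with partial pivoting for the normal equations."""
    n = len(M)
    M = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(n):
            if i != c:
                f = M[i][c] / M[c][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

# Error-free ordinates lying exactly in the span of the basis; x over [0, 1].
m, r = 21, 5
xs = [i / (m - 1) for i in range(m)]
w0 = 2 * math.pi / (xs[-1] - xs[0])          # basic frequency omega_0
ys = [1.0 + 0.5 * t_k(2, w0, x) + 0.3 * t_k(5, w0, x) for x in xs]

A = [[1.0] + [t_k(k, w0, x) for k in range(2, r + 1)] for x in xs]
AtA = [[sum(A[i][a] * A[i][b] for i in range(m)) for b in range(r)]
       for a in range(r)]
Aty = [sum(A[i][a] * ys[i] for i in range(m)) for a in range(r)]
beta = solve(AtA, Aty)                        # least squares coefficients
print(beta)
```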
12.5.2 Known Model Function
Provided the difference between the known model function and the approximating trigonometric polynomial is insignificant, we may formally define a true polynomial over the interval (12.83) via the true values (x_{0,i}, y_{0,i}); i = 1, …, m of the data pairs (12.81),

y_0(x) = β_{0,1} + Σ_{k=2}^r β_{0,k} t_k(x) ;   r = 3, 5, 7, … < m .   (12.91)

Obviously, (12.91) expresses equality in a formal sense. Out of necessity, we have to keep the adjusted polynomial (12.89). Assigning the ȳ_i; i = 1, …, m given in (12.82) to a vector

ȳ = y_0 + (ȳ − µ_y) + f_y ,

(12.87) yields

β̄ = Bᵀ [ y_0 + (ȳ − µ_y) + f_y ] = β_0 + Bᵀ (ȳ − µ_y) + Bᵀ f_y .   (12.92)

Here, the "true solution vector"

β_0 = (β_{0,1} β_{0,2} ⋯ β_{0,r})ᵀ

provides a suitable reference to assess the fluctuations of (12.89). The vector

f_{β̄} = Bᵀ f_y ;   f_{β̄_k} = Σ_{i=1}^m b_{ik} f_{y_i} ;   k = 1, …, r   (12.93)

bears the influence of the systematic errors with respect to β_0. Let us assume that one and the same systematic error is superimposed on each of the ordinates,

f_{y_i} = f_y ,   −f_{s,y} ≤ f_{y_i} ≤ f_{s,y} ;   i = 1, …, m .

Then the f_{β̄_k}; k = 1, …, r have to satisfy

f_{β̄_k} = Σ_{i=1}^m b_{ik} f_{y_i} = f_y Σ_{i=1}^m b_{ik} ,

i.e.

f_{β̄_1} = f_y   and   f_{β̄_2} = … = f_{β̄_r} = 0   (12.94)

as

Σ_{i=1}^m b_{i1} = 1   and   Σ_{i=1}^m b_{ik} = 0 ;   k = 2, …, r .
The polynomial coefficients β_{kl} of the l-th set of repeat measurements

y_{1l}, y_{2l}, ⋯, y_{ml} ;   l = 1, …, n

may be taken from

β̄_k = Σ_{i=1}^m b_{ik} [ (1/n) Σ_{l=1}^n y_{il} ] = (1/n) Σ_{l=1}^n Σ_{i=1}^m b_{ik} y_{il} = (1/n) Σ_{l=1}^n β_{kl} ;   k = 1, …, r ,   (12.95)

i.e.

β_{kl} = Σ_{i=1}^m b_{ik} y_{il} ;   k = 1, …, r ;   l = 1, …, n .   (12.96)

The differences

β_{kl} − β̄_k = Σ_{i=1}^m b_{ik} (y_{il} − ȳ_i) ;   k = 1, …, r ;   l = 1, …, n   (12.97)

give rise to the elements s_{β̄_k β̄_{k′}}; k, k′ = 1, …, r of the empirical variance–covariance matrix s_{β̄} of the solution vector β̄,

s_{β̄_k β̄_{k′}} = (1/(n−1)) Σ_{l=1}^n (β_{kl} − β̄_k)(β_{k′l} − β̄_{k′})

            = (1/(n−1)) Σ_{l=1}^n [ Σ_{i=1}^m b_{ik} (y_{il} − ȳ_i) ] [ Σ_{j=1}^m b_{jk′} (y_{jl} − ȳ_j) ]

            = Σ_{i,j=1}^m b_{ik} b_{jk′} s_{ij} ;   s_{β̄_k β̄_k} ≡ s²_{β̄_k} .   (12.98)

Here, the

s_{ij} = (1/(n−1)) Σ_{l=1}^n (y_{il} − ȳ_i)(y_{jl} − ȳ_j)

are the empirical variances and covariances of the input data. Letting s = (s_{ij}) designate the associated empirical variance–covariance matrix, (12.98) changes into

s_{β̄} = ( s_{β̄_k β̄_{k′}} ) = Bᵀ s B .   (12.99)
Finally, we find
β̄_k ± u_{β̄_k} ;   k = 1, …, r   (12.100)

u_{β̄_1} = (t_P(n−1)/√n) s_{β̄_1} + f_{s,y} ;   u_{β̄_k} = (t_P(n−1)/√n) s_{β̄_k} ;   k = 2, …, r .

The intervals β̄_k ± u_{β̄_k}; k = 1, …, r localize the auxiliary quantities β_{0,k}.
Uncertainty Band
The l-th set of input data produces the least squares polynomial

y_l(x) = \beta_{1l} + \beta_{2l} t_2(x) + \dots + \beta_{rl} t_r(x) ; \quad l = 1, \dots, n . \qquad (12.101)

According to (12.89), we have

\bar y(x) = \frac{1}{n} \sum_{l=1}^{n} y_l(x) . \qquad (12.102)

The empirical variance at the point x is given by

s_{\bar y}^2(x) = \frac{1}{n-1} \sum_{l=1}^{n} \left( y_l(x) - \bar y(x) \right)^2 . \qquad (12.103)

Within the differences

y_l(x) - \bar y(x) = \left( \beta_{1l} - \bar\beta_1 \right) + \left( \beta_{2l} - \bar\beta_2 \right) t_2(x) + \dots + \left( \beta_{rl} - \bar\beta_r \right) t_r(x)

= \sum_{i=1}^{m} \left[ b_{i1} + b_{i2} t_2(x) + \dots + b_{ir} t_r(x) \right] \left( y_{il} - \bar y_i \right) ; \quad l = 1, \dots, n

we assign auxiliary constants

c_i(x) = b_{i1} + b_{i2} t_2(x) + \dots + b_{ir} t_r(x) ; \quad i = 1, \dots, m \qquad (12.104)

to the square brackets. Defining the auxiliary vector c = ( c_1 \; c_2 \cdots c_m )^{\rm T}, the empirical variance s_{\bar y}^2(x) turns into

s_{\bar y}^2(x) = c^{\rm T} s c . \qquad (12.105)

Finally, the uncertainty band, which is a measure of the fluctuations of the least squares polynomial with respect to the formally introduced true polynomial (12.91), is given by

\bar y(x) \pm u_{\bar y}(x) ; \qquad u_{\bar y}(x) = \frac{t_P(n-1)}{\sqrt{n}}\, s_{\bar y}(x) + f_{s,\bar y} . \qquad (12.106)
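A minimal numerical illustration of (12.103)–(12.105), assuming NumPy; the basis functions and the data are invented. It checks that c^T s c at a point x_0 reproduces the empirical variance of the n least squares polynomials evaluated there.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 15, 10
x = np.linspace(0.0, 1.0, m)
A = np.column_stack([np.ones(m), np.cos(np.pi * x), np.sin(np.pi * x)])
B = A @ np.linalg.inv(A.T @ A)

y = 1.0 + 0.5 * np.cos(np.pi * x)[:, None] + 0.1 * rng.normal(size=(m, n))
s = np.cov(y)                                  # empirical variance-covariance matrix

def band_std(x0):
    """Empirical std of the fitted polynomial at x0, from (12.105)."""
    t = np.array([1.0, np.cos(np.pi * x0), np.sin(np.pi * x0)])
    c = B @ t                                  # c_i(x0) = sum_k b_ik t_k(x0)
    return float(np.sqrt(c @ s @ c))

# cross-check against the definition (12.103)
beta_l = B.T @ y                               # per-cycle solution vectors
x0 = 0.3
t0 = np.array([1.0, np.cos(np.pi * x0), np.sin(np.pi * x0)])
yl = t0 @ beta_l                               # y_l(x0), l = 1..n
s_direct = yl.std(ddof=1)
assert np.isclose(band_std(x0), s_direct)
```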
Fig. 12.8. Approximation of a given inharmonic polynomial based on five frequencies through a harmonic trigonometric least squares polynomial requiring about 16 frequencies. Data: correct abscissas, erroneous ordinates given in the form of arithmetic means
Example
Adjustment of a harmonic trigonometric polynomial. We simplify our preconditions and assume the existence of a true, known inharmonic trigonometric polynomial of the form

y_0(x) = a_0 + \sum_{k=1}^{N} \left( a_k \cos\omega_k x + b_k \sin\omega_k x \right)

based on N = 5 frequencies. We superimpose normally distributed random errors and a fixed systematic error, satisfying f_{\bar y_i} = f_y ; -f_{s,\bar y} \le f_{\bar y_i} \le f_{s,\bar y} ; i = 1, \dots, m, on the ordinates of the given polynomial.
Figure 12.8 depicts the fitted harmonic trigonometric least squares polynomial and its uncertainty band.
While the given inharmonic polynomial y_0(x) gets along with just N = 5 frequencies \omega_1, \dots, \omega_5, the harmonic trigonometric least squares polynomial calls for an r of the order of 33, i.e. about 16 harmonic frequencies.
Part IV
Appendices
A On the Rank of Some Matrices
A matrix A is said to have rank r if there is at least one non-zero sub-determinant of order (r × r) while all sub-determinants of higher order are zero. The notation is rank(A) = r.
The so-called design matrix A, referred to in the method of least squares, Sect. 7.2, has m rows and r columns, m > r. Given that A has rank r, its r columns are linearly independent.
It appears advantageous to regard the columns of a given matrix as vectors. Obviously, A has r m-dimensional column vectors. We denote the latter by a_k ; k = 1, \dots, r. If the a_k are independent, they span an r-dimensional subspace R(A) = V_m^r(\mathbb{R}) of the underlying m-dimensional space V_m(\mathbb{R}); \mathbb{R} denotes the set of real numbers.
1. ATA, Sect. 7.2
Let rank(A) = r and let the a_k ; k = 1, \dots, r be a basis of an r-dimensional subspace R(A) = V_m^r(\mathbb{R}) of the m-dimensional space V_m(\mathbb{R}) considered. As the a_k are independent, it is not possible to find a vector

\beta = ( \beta_1 \; \beta_2 \cdots \beta_r )^{\rm T} \ne 0

so that

A\beta = \beta_1 a_1 + \beta_2 a_2 + \dots + \beta_r a_r = 0 .

The vector

c = \beta_1 a_1 + \beta_2 a_2 + \dots + \beta_r a_r

lies in the column space of the matrix A. Consequently, the vector A^{\rm T} c must be different from the zero vector, as c cannot be orthogonal to all vectors a_k ; k = 1, \dots, r. But if, for any vector \beta \ne 0,

A^{\rm T} A \beta \ne 0 ,

the columns of A^{\rm T} A are independent, i.e. rank(A^{\rm T} A) = r. Moreover, as (A\beta)^{\rm T} A\beta equals the squared length of the vector A\beta, which is non-zero if \beta is non-zero, we have

\beta^{\rm T} ( A^{\rm T} A ) \beta > 0

so that the symmetric matrix A^{\rm T} A is also positive definite.
2. B = A(ATA)−1, Sect. 7.2
As (ATA)−1 is non-singular, the rank of B is that of A.
3. P = A(ATA)−1AT, Sect. 7.2
The rank of an idempotent matrix (P^2 = P) is equal to its trace [53]. Given that the product of two matrices, say R and S, is defined either way, RS and SR, we have trace(RS) = trace(SR), i.e.

rank(P) = trace(P) = trace\!\left[ A \left( A^{\rm T} A \right)^{-1} A^{\rm T} \right] = trace\!\left[ A^{\rm T} A \left( A^{\rm T} A \right)^{-1} \right] = r .
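These properties are cheap to confirm numerically. A sketch assuming NumPy, with an invented random full-rank A:

```python
import numpy as np

rng = np.random.default_rng(3)
m, r = 9, 4
A = rng.normal(size=(m, r))                  # full column rank almost surely
P = A @ np.linalg.inv(A.T @ A) @ A.T

assert np.allclose(P @ P, P)                 # idempotent
assert np.allclose(P, P.T)                   # symmetric
assert np.isclose(np.trace(P), r)            # trace = rank = r
eig = np.sort(np.linalg.eigvalsh(P))
assert np.allclose(eig[-r:], 1.0)            # r eigenvalues equal 1
assert np.allclose(eig[:m - r], 0.0)         # m - r eigenvalues equal 0
```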
4. C = A(ATA)−1HT, Sect. 7.3
Each of the columns of C is a linear combination of the columns of B = A ( A^{\rm T} A )^{-1}. As rank(H) = q, the q columns of C are independent.
5. BTsB, Sect. 9.5
Since we consider positive definite (non-singular) empirical variance–covariance matrices s only, we are allowed to resort to the decomposition s = c^{\rm T} c; c denoting a non-singular (m × m) matrix [52]. As B^{\rm T} s B = ( c B )^{\rm T} ( c B ), where rank(c B) = r, we have rank(B^{\rm T} s B) = r. Moreover, the symmetric matrix B^{\rm T} s B is positive definite.
B Variance–Covariance Matrices
We consider m random variables X_i, \mu_i = E\{X_i\} ; i = 1, \dots, m and m real numbers \xi_i ; i = 1, \dots, m. The expectation

E\left\{ \left[ \sum_{i=1}^{m} \xi_i \left( X_i - \mu_i \right) \right]^2 \right\} \ge 0 \qquad (B.1)

yields the quadratic form

\xi^{\rm T} \sigma\, \xi \ge 0 , \qquad (B.2)

where

\xi = ( \xi_1 \; \xi_2 \dots \xi_m )^{\rm T} \qquad (B.3)

and

\sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \dots & \sigma_{1m} \\ \sigma_{21} & \sigma_{22} & \dots & \sigma_{2m} \\ \dots & \dots & \dots & \dots \\ \sigma_{m1} & \sigma_{m2} & \dots & \sigma_{mm} \end{pmatrix} ,

\sigma_{ij} = E\left\{ \left( X_i - \mu_i \right)\left( X_j - \mu_j \right) \right\} ; \quad i, j = 1, \dots, m . \qquad (B.4)
From (B.2) we see: variance–covariance matrices are at least positive semi-definite.
As the variance–covariance matrix is real and symmetric, there is an orthogonal transformation diagonalizing it. This in turn allows for
|\sigma| = \lambda_1 \cdot \lambda_2 \cdot \dots \cdot \lambda_m , \qquad (B.5)

the \lambda_i ; i = 1, \dots, m denoting the eigenvalues of \sigma. If rank(\sigma) = m, necessarily all the eigenvalues \lambda_i ; i = 1, \dots, m are different from zero. On the other hand, given \sigma is positive definite,

\xi^{\rm T} \sigma\, \xi > 0 , \qquad (B.6)

the theory of quadratic forms tells us

\lambda_i > 0 ; \quad i = 1, \dots, m \qquad (B.7)

so that

|\sigma| > 0 . \qquad (B.8)

This in the end implies: given that the variance–covariance matrix has full rank, it is positive definite.

If (B.6) does not apply, the variance–covariance matrix \sigma is positive semi-definite, rank(\sigma) < m and some of the eigenvalues are zero; those different from zero are still positive. We then have

|\sigma| = 0 . \qquad (B.9)
Let us consider m = 2 and substitute \sigma_{xy} for \sigma_{12}; furthermore \sigma_{xx} \equiv \sigma_x^2 and \sigma_{yy} \equiv \sigma_y^2 for \sigma_{11} and \sigma_{22} respectively. Then, from

\sigma = \begin{pmatrix} \sigma_{xx} & \sigma_{xy} \\ \sigma_{yx} & \sigma_{yy} \end{pmatrix} , \quad given \; |\sigma| > 0 ,

we deduce

-\sigma_x \sigma_y < \sigma_{xy} < \sigma_x \sigma_y .
Let us now switch from expectations and theoretical variance–covariance matrices to sums and empirical variance–covariance matrices. To this end, we consider m arithmetic means

\bar x_i = \frac{1}{n} \sum_{l=1}^{n} x_{il} ; \quad i = 1, \dots, m , \qquad (B.10)

each covering n repeat measurements x_{il} ; i = 1, \dots, m ; l = 1, \dots, n. The non-negative sum

\frac{1}{n-1} \sum_{l=1}^{n} \left[ \sum_{i=1}^{m} \xi_i \left( x_{il} - \bar x_i \right) \right]^2 \ge 0 \qquad (B.11)

yields

\xi^{\rm T} s\, \xi \ge 0 , \qquad (B.12)
where

s = \begin{pmatrix} s_{11} & s_{12} & \dots & s_{1m} \\ s_{21} & s_{22} & \dots & s_{2m} \\ \dots & \dots & \dots & \dots \\ s_{m1} & s_{m2} & \dots & s_{mm} \end{pmatrix} ,

s_{ij} = \frac{1}{n-1} \sum_{l=1}^{n} \left( x_{il} - \bar x_i \right)\left( x_{jl} - \bar x_j \right) ; \quad i, j = 1, \dots, m \qquad (B.13)

denotes the empirical variance–covariance matrix of the m data sets x_{il} ; i = 1, \dots, m ; l = 1, \dots, n.
Given rank(s) = m, the matrix s is positive definite,

\xi^{\rm T} s\, \xi > 0 , \quad |s| > 0 , \qquad (B.14)

but for rank(s) < m, the matrix s is positive semi-definite,

\xi^{\rm T} s\, \xi \ge 0 , \quad |s| = 0 . \qquad (B.15)

Again, we set m = 2 and substitute s_{xy} for s_{12}, s_{xx} \equiv s_x^2 and s_{yy} \equiv s_y^2 for s_{11} and s_{22} respectively. Then, from

s = \begin{pmatrix} s_{xx} & s_{xy} \\ s_{yx} & s_{yy} \end{pmatrix} , \quad provided \; |s| > 0 ,

we obtain

-s_x s_y < s_{xy} < s_x s_y . \qquad (B.16)
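A short check of (B.12)–(B.16) on simulated data (NumPy assumed; the data are invented):

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 3, 25
x = rng.normal(size=(m, n))                  # x_il: m series of n repeat measurements
xbar = x.mean(axis=1, keepdims=True)
s = (x - xbar) @ (x - xbar).T / (n - 1)      # empirical matrix, eq. (B.13)

assert np.allclose(s, np.cov(x))             # agrees with NumPy's estimator
xi = rng.normal(size=m)
assert xi @ s @ xi >= 0.0                    # (B.12): at least positive semi-definite
sx, sy, sxy = np.sqrt(s[0, 0]), np.sqrt(s[1, 1]), s[0, 1]
assert -sx * sy < sxy < sx * sy              # (B.16); |s| > 0 almost surely here
```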
C Linear Functions of Normal Variables
We confine ourselves to sums of two (possibly dependent) normally distributed random variables, X and Y, and refer to the nomenclature introduced in Sect. 3.2, i.e. we consider

Z = b_x X + b_y Y \qquad (C.1)

and denote by \mu_x = E\{X\}, \mu_y = E\{Y\} the expectations and by \sigma_{xx} = E\{(X-\mu_x)^2\}, \sigma_{yy} = E\{(Y-\mu_y)^2\}, \sigma_{xy} = E\{(X-\mu_x)(Y-\mu_y)\} the theoretical moments of second order. As usual, we associate the variables X and Y with the quantities to be measured, x and y. Defining

\zeta = ( x \; y )^{\rm T} , \quad \mu_\zeta = ( \mu_x \; \mu_y )^{\rm T} , \quad b = ( b_x \; b_y )^{\rm T} , \quad \sigma = \begin{pmatrix} \sigma_{xx} & \sigma_{xy} \\ \sigma_{yx} & \sigma_{yy} \end{pmatrix} ,

we arrive at

z = b^{\rm T} \zeta , \quad \mu_z = E\{Z\} = b^{\rm T} \mu_\zeta , \quad \sigma_z^2 = E\left\{ \left( Z - \mu_z \right)^2 \right\} = b^{\rm T} \sigma\, b .
We prove that Z is N(\mu_z, \sigma_z^2)-distributed. To this end we refer to the technique of characteristic functions. Given a random variable X, its characteristic function is defined by [43]

\chi(v) = E\left\{ e^{\mathrm{i} v X} \right\} = \int_{-\infty}^{\infty} p_X(x)\, e^{\mathrm{i} v x} \, \mathrm{d}x . \qquad (C.2)

If X is N(\mu_x, \sigma_x^2)-distributed, we have

\chi(v) = \exp\left( \mathrm{i} \mu_x v - \frac{\sigma_x^2 v^2}{2} \right) . \qquad (C.3)

Obviously, the Fourier transform of \chi(v) reproduces the distribution density p_X(x).
We consider the characteristic function of the random variable Z. Following (4.5) we may put

E\left\{ e^{\mathrm{i} v Z} \right\} = \int_{-\infty}^{\infty} e^{\mathrm{i} v z}\, p_Z(z)\, \mathrm{d}z = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \exp\left[ \mathrm{i} v \left( b_x x + b_y y \right) \right] p_{XY}(x, y)\, \mathrm{d}x\, \mathrm{d}y , \qquad (C.4)

the density p_{XY}(x, y) being given in (3.17),

p_{XY}(x, y) = \frac{1}{2\pi |\sigma|^{1/2}} \exp\left[ -\frac{1}{2} \left( \zeta - \mu_\zeta \right)^{\rm T} \sigma^{-1} \left( \zeta - \mu_\zeta \right) \right] .
Adding \pm \mathrm{i} v\, b^{\rm T} \mu_\zeta to the exponent of (C.4), we form the identity

\mathrm{i} v\, b^{\rm T} \zeta - \frac{1}{2} \left( \zeta - \mu_\zeta \right)^{\rm T} \sigma^{-1} \left( \zeta - \mu_\zeta \right) \pm \mathrm{i} v\, b^{\rm T} \mu_\zeta

= -\frac{1}{2} \left( \zeta - \mu^* \right)^{\rm T} \sigma^{-1} \left( \zeta - \mu^* \right) + \mathrm{i} v\, b^{\rm T} \mu_\zeta - \frac{v^2}{2}\, b^{\rm T} \sigma\, b , \qquad (C.5)

in which

\mu^* = \mu_\zeta + \mathrm{i} v\, \sigma b .
Finally, (C.4) turns into

E\left\{ e^{\mathrm{i} v Z} \right\} = \exp\left( \mathrm{i}\, b^{\rm T} \mu_\zeta\, v - \frac{ b^{\rm T} \sigma\, b\, v^2 }{2} \right) = \exp\left( \mathrm{i} \mu_z v - \frac{\sigma_z^2 v^2}{2} \right) . \qquad (C.6)

A comparison of (C.3) and (C.6) reveals that Z is normally distributed.
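The statement \mu_z = b^T \mu_\zeta, \sigma_z^2 = b^T \sigma b can be illustrated by simulation; a sketch assuming NumPy, with invented numbers:

```python
import numpy as np

rng = np.random.default_rng(5)
mu = np.array([1.0, -2.0])
sigma = np.array([[1.0, 0.6],
                  [0.6, 2.0]])               # positive definite, correlated X and Y
b = np.array([0.7, 0.3])

zeta = rng.multivariate_normal(mu, sigma, size=200_000)
z = zeta @ b                                 # realizations of Z = b_x X + b_y Y

mu_z = b @ mu                                # theoretical mean, b^T mu
var_z = b @ sigma @ b                        # theoretical variance, b^T sigma b
assert abs(z.mean() - mu_z) < 0.02
assert abs(z.var(ddof=1) - var_z) < 0.02
```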
D Orthogonal Projections
Let V_m(\mathbb{R}) denote an m-dimensional vector space. Furthermore, let the r < m linearly independent vectors

a_1 = \begin{pmatrix} a_{11} \\ a_{21} \\ \dots \\ a_{m1} \end{pmatrix} , \quad a_2 = \begin{pmatrix} a_{12} \\ a_{22} \\ \dots \\ a_{m2} \end{pmatrix} , \quad \dots , \quad a_r = \begin{pmatrix} a_{1r} \\ a_{2r} \\ \dots \\ a_{mr} \end{pmatrix}

define an r-dimensional subspace V_m^r(\mathbb{R}) \subset V_m(\mathbb{R}). We assemble the r vectors within a matrix

A = ( a_1 \; a_2 \cdots a_r )

so that V_m^r(\mathbb{R}) coincides with the column space R(A) of the matrix A. The (m-r)-dimensional null space N(A^{\rm T}) = V_m^{m-r}(\mathbb{R}) of the matrix A^{\rm T} is orthogonal to R(A). The set of linearly independent vectors r_1, \dots, r_{m-r}, spanning N(A^{\rm T}), is defined through the linear system

A^{\rm T} r = 0 . \qquad (D.1)

Each vector lying within R(A) is orthogonal to each vector lying within N(A^{\rm T}) and vice versa: V_m(\mathbb{R}) is the direct sum of the subspaces R(A) and N(A^{\rm T}),

V_m(\mathbb{R}) = N(A^{\rm T}) \oplus R(A) .
Each vector y being an element of R(A) may be uniquely decomposed into a linear combination of the vectors a_k ; k = 1, \dots, r,

A\beta = y . \qquad (D.2)

The r components \beta_k ; k = 1, \dots, r of the vector

\beta = ( \beta_1 \; \beta_2 \dots \beta_r )^{\rm T}

may be seen as coefficients or "stretch factors" attributing to the vectors a_k ; k = 1, \dots, r suitable lengths so that the linear combination
Fig. D.1. The r column vectors a_1, a_2, \dots, a_r span an r-dimensional subspace R(A) = V_m^r(\mathbb{R}) of the m-dimensional space V_m(\mathbb{R}). The vector of the residuals, r \in N(A^{\rm T}), N(A^{\rm T}) \perp R(A), is orthogonal to the vectors a_1, a_2, \dots, a_r
\beta_1 a_1 + \beta_2 a_2 + \dots + \beta_r a_r

defines, as desired, the vector y.

Clearly, any vector x \in V_m(\mathbb{R}) may only be represented through a linear combination of the vectors a_k ; k = 1, \dots, r if it happens that x lies in R(A) \subset V_m(\mathbb{R}). This, however, will in general not be the case. In view of Fig. D.1 we consider the orthogonal projection of x onto R(A).¹
Let P be an operator accomplishing the orthogonal projection of x onto R(A),

y = Px = A\beta . \qquad (D.3)

We shall show that the column vectors a_k ; k = 1, \dots, r of the matrix A determine the structure of the projection operator P. Let r \in N(A^{\rm T}) denote the component of x perpendicular to R(A) = V_m^r(\mathbb{R}). We then have

r = x - y ; \quad A^{\rm T} r = 0 . \qquad (D.4)

Any linear combination of the m-r linearly independent vectors spanning the null space N(A^{\rm T}) = V_m^{m-r}(\mathbb{R}) of the matrix A^{\rm T} is orthogonal to the vectors a_k ; k = 1, \dots, r spanning the column space R(A) = V_m^r(\mathbb{R}). However, given P, the orthogonal projection y = Px of x onto R(A) is fixed. At the same time, out of all vectors r satisfying A^{\rm T} r = 0, the vector of the residuals r = x - Px is selected.
Suppose we had an orthonormal basis
¹Should x already lie in R(A), the projection would not entail any change at all.
\alpha_1 = \begin{pmatrix} \alpha_{11} \\ \alpha_{21} \\ \dots \\ \alpha_{m1} \end{pmatrix} , \quad \dots , \quad \alpha_m = \begin{pmatrix} \alpha_{1m} \\ \alpha_{2m} \\ \dots \\ \alpha_{mm} \end{pmatrix}

of the space V_m(\mathbb{R}). Let the first r vectors define the column space R(A) and let the last m-r span the null space N(A^{\rm T}). Thus, for any vector x \in V_m(\mathbb{R}) we have

x = \underbrace{ \gamma_1 \alpha_1 + \dots + \gamma_r \alpha_r }_{y} + \underbrace{ \gamma_{r+1} \alpha_{r+1} + \dots + \gamma_m \alpha_m }_{r} ; \qquad \alpha_k^{\rm T} x = \gamma_k ; \quad k = 1, \dots, m , \qquad (D.5)

or

y = \sum_{k=1}^{r} \gamma_k \alpha_k , \qquad r = \sum_{k=r+1}^{m} \gamma_k \alpha_k .
By means of the first r vectors \alpha_k ; k = 1, \dots, r we define a matrix

T = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \dots & \alpha_{1r} \\ \alpha_{21} & \alpha_{22} & \dots & \alpha_{2r} \\ \dots & \dots & \dots & \dots \\ \alpha_{m1} & \alpha_{m2} & \dots & \alpha_{mr} \end{pmatrix} ; \qquad T^{\rm T} T = I . \qquad (D.6)

Here, I designates an (r × r) identity matrix. In contrast to this, the product T T^{\rm T} accomplishes the desired orthogonal projection, since from

( \gamma_1 \; \gamma_2 \cdots \gamma_r )^{\rm T} = T^{\rm T} x

we find

y = T ( \gamma_1 \; \gamma_2 \cdots \gamma_r )^{\rm T} = T T^{\rm T} x = P x , \qquad P = T T^{\rm T} . \qquad (D.7)
The projection operator P has now been established. But this has been done with reference to an orthonormal basis, which actually we do not possess. In order to express P by means of the vectors a_k ; k = 1, \dots, r of the column space of A, we define a non-singular (r × r) auxiliary matrix H so that

A = T H . \qquad (D.8)

Following (D.6) we have H = T^{\rm T} A and, furthermore, T = A H^{-1}, i.e.

P = T T^{\rm T} = A H^{-1} \left( H^{-1} \right)^{\rm T} A^{\rm T} = A \left( H^{\rm T} H \right)^{-1} A^{\rm T} ,

and as

H^{\rm T} H = A^{\rm T} T T^{\rm T} A = A^{\rm T} P A = A^{\rm T} A ,

PA = A, we finally arrive at

P = A \left( A^{\rm T} A \right)^{-1} A^{\rm T} . \qquad (D.9)

This operator, expressed in terms of A, accomplishes the desired orthogonal projection of x onto R(A).
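Since projection operators are unique, (D.9) must coincide with the basis construction (D.7); a QR factorization supplies an orthonormal basis T of R(A). A sketch assuming NumPy, with an invented A:

```python
import numpy as np

rng = np.random.default_rng(6)
m, r = 8, 3
A = rng.normal(size=(m, r))

# T: orthonormal basis of the column space R(A), obtained here via QR
T, _ = np.linalg.qr(A)                       # T^T T = I_r
P_basis = T @ T.T                            # (D.7)
P_direct = A @ np.linalg.inv(A.T @ A) @ A.T  # (D.9)

assert np.allclose(T.T @ T, np.eye(r))
assert np.allclose(P_basis, P_direct)        # projection operators are unique
```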
Should there be two different projection operators P_1 x = y, P_2 x = y, we necessarily have

\left( P_1 - P_2 \right) x = 0 ,

which is not, however, possible for arbitrary vectors x. From this we conclude that projection operators are unique.
We easily convince ourselves once again that the projection of any vector lying already within R(A) simply reproduces just this vector. Indeed, from (D.7) we obtain

P y = T T^{\rm T} T ( \gamma_1 \; \gamma_2 \cdots \gamma_r )^{\rm T} = T ( \gamma_1 \; \gamma_2 \cdots \gamma_r )^{\rm T} = y .
Projection operators possess the property P^2 = P, i.e. they are idempotent. Furthermore, they are symmetric. Vice versa, any symmetric matrix satisfying P^2 = P is a projection operator.
As rank(A) = r, we have rank(P) = r. Let v be an eigenvector of P and \lambda \ne 0 the pertaining eigenvalue, Pv = \lambda v. Then from P(Pv) = \lambda^2 v = Pv = \lambda v, i.e. \lambda^2 = \lambda, we see \lambda = 1. The eigenvalues of projection operators are either 0 or 1.
Obviously, the trace of P is given by the sum of its eigenvalues; at the same time, the rank of an idempotent matrix equals its trace. From this we conclude that r eigenvalues are 1 and m - r eigenvalues are 0 [52–54].
E Least Squares Adjustments
We refer to the nomenclature of Appendix D.
Unconstrained Adjustment
The linear system

A\beta \approx x \qquad (E.1)

is unsolvable as, due to the measurement errors, x \in V_m(\mathbb{R}) while A\beta \in V_m^r(\mathbb{R}), where least squares presupposes r < m. Let us decompose V_m(\mathbb{R}) into orthogonal subspaces V_m^r(\mathbb{R}) and V_m^{m-r}(\mathbb{R}) so that

V_m(\mathbb{R}) = V_m^r(\mathbb{R}) \oplus V_m^{m-r}(\mathbb{R}) .

The method of least squares projects the vector x by means of the projection operator

P = A \left( A^{\rm T} A \right)^{-1} A^{\rm T} \qquad (E.2)

orthogonally onto the subspace V_m^r(\mathbb{R}), so that Px \in V_m^r(\mathbb{R}). The orthogonal projection defines a so-called vector of residuals \bar r satisfying A^{\rm T} \bar r = 0, \bar r \in V_m^{m-r}(\mathbb{R}) and x - \bar r = Px \in V_m^r(\mathbb{R}).¹ Ignoring \bar r and substituting the vector Px for x,

A\beta = Px , \qquad (E.3)

the inconsistent system (E.1) becomes "solvable". After all, the quintessence of the method of least squares is rooted in this substitution, which is, frankly speaking, a trick. Multiplying (E.3) on the left by ( A^{\rm T} A )^{-1} A^{\rm T} yields the least squares estimator

\bar\beta = \left( A^{\rm T} A \right)^{-1} A^{\rm T} x \qquad (E.4)

of the unknown true solution vector.

¹There are, of course, infinitely many vectors r \in V_m^{m-r}(\mathbb{R}) satisfying A^{\rm T} r = 0. The orthogonal projection through P selects the vector of residuals \bar r.
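A sketch of the unconstrained adjustment (E.4), assuming NumPy with invented random data; the assertion confirms that the residual vector lies in N(A^T):

```python
import numpy as np

rng = np.random.default_rng(7)
m, r = 10, 3
A = rng.normal(size=(m, r))
x = rng.normal(size=m)                       # erroneous observations

beta = np.linalg.inv(A.T @ A) @ A.T @ x      # (E.4)
res = x - A @ beta                           # vector of residuals

assert np.allclose(A.T @ res, 0.0)           # residuals orthogonal to R(A)
assert np.allclose(beta, np.linalg.lstsq(A, x, rcond=None)[0])
```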
Constrained Adjustment
1. rank (A) = r
The least squares estimator \bar\beta shall exactly satisfy q < r constraints

H \bar\beta = y , \qquad (E.5)

where rank(H) = q is assumed. Although we adhere to the idea of orthogonal projection, we of course have to alter the projection operator, i.e. though we require the new P to accomplish Px \in V_m^r(\mathbb{R}), this new P will be different from the projection operator of the unconstrained adjustment.

First, we introduce an auxiliary vector \beta^* leading to homogeneous constraints – later on this vector will cancel.

Let \beta^* be any solution of the linear constraints (E.5),

H \beta^* = y . \qquad (E.6)

Then, from

A \left( \beta - \beta^* \right) \approx x - A\beta^* , \qquad H \left( \beta - \beta^* \right) = 0

we obtain, setting

\gamma = \beta - \beta^* , \qquad z = x - A\beta^* , \qquad (E.7)

the new systems

A\gamma \approx z , \qquad H\gamma = 0 . \qquad (E.8)

Let us assume that the new projection operator P is known. Hence, P projects z orthogonally onto the column space of A,

A\bar\gamma = Pz , \qquad P \ne A \left( A^{\rm T} A \right)^{-1} A^{\rm T} \;\text{here} , \qquad (E.9)

so that

\bar\gamma = \left( A^{\rm T} A \right)^{-1} A^{\rm T} P z . \qquad (E.10)

Obviously, the condition

H\bar\gamma = H \left( A^{\rm T} A \right)^{-1} A^{\rm T} P z = C^{\rm T} P z = 0 \qquad (E.11)

is fulfilled, given the vector Pz lies in the null space of the matrix

C^{\rm T} = H \left( A^{\rm T} A \right)^{-1} A^{\rm T} .

Finally, Pz has to lie in the column space of the matrix A as well as in the null space of the matrix C^{\rm T}, i.e.
Fig. E.1. V_m^r(\mathbb{R}) symbolic, R(C) \subset R(A), R(C) \perp N(C^{\rm T}), intersection R(A) \cap N(C^{\rm T})
Pz \in R(A) \cap N(C^{\rm T}) .

The q linearly independent column vectors of the matrix C span a q-dimensional subspace R(C) = V_m^q(\mathbb{R}) of the space R(A) = V_m^r(\mathbb{R}), where R(C) \subset R(A). Furthermore, as

N(C^{\rm T}) = V_m^{m-q}(\mathbb{R}) , \quad R(C) \perp N(C^{\rm T}) \quad\text{and}\quad R(C) \oplus N(C^{\rm T}) = V_m(\mathbb{R}) ,

the intersection R(A) \cap N(C^{\rm T}) is well defined, Fig. E.1. To project into just this intersection, we simply put

P = A \left( A^{\rm T} A \right)^{-1} A^{\rm T} - C \left( C^{\rm T} C \right)^{-1} C^{\rm T}

so that the component projected onto R(C), R(C) \subset R(A), cancels. Returning to \bar\beta yields

\bar\beta = \left( A^{\rm T} A \right)^{-1} A^{\rm T} x - \left( A^{\rm T} A \right)^{-1} H^{\rm T} \left[ H \left( A^{\rm T} A \right)^{-1} H^{\rm T} \right]^{-1} \left[ H \left( A^{\rm T} A \right)^{-1} A^{\rm T} x - y \right] . \qquad (E.12)

Finally, we see H\bar\beta = y.
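Formula (E.12) in code (NumPy assumed, random invented data); the assertion confirms that the constraint is met exactly:

```python
import numpy as np

rng = np.random.default_rng(8)
m, r, q = 12, 4, 2
A = rng.normal(size=(m, r))
H = rng.normal(size=(q, r))                   # q < r constraints, rank(H) = q
x = rng.normal(size=m)
y = rng.normal(size=q)

N = np.linalg.inv(A.T @ A)
beta_u = N @ A.T @ x                          # unconstrained estimator (E.4)
# (E.12): correct the unconstrained solution so that H beta = y holds exactly
beta = beta_u - N @ H.T @ np.linalg.inv(H @ N @ H.T) @ (H @ beta_u - y)

assert np.allclose(H @ beta, y)
```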
2. rank (A) = r′ < r
We assume the first r' column vectors of the matrix A to be linearly independent. Then, we partition the matrix A into an (m × r') matrix A_1 and an (m × (r - r')) matrix A_2,

A = ( A_1 \,|\, A_2 ) .

Formally, the inconsistencies of the system (E.1) "vanish" when the erroneous vector x is projected onto the column space of A_1,
A\beta = Px , \qquad P = A_1 \left( A_1^{\rm T} A_1 \right)^{-1} A_1^{\rm T} . \qquad (E.13)

However, as rank(A) = r' < r, there are q = r - r' indeterminate components of the vector \beta. Consequently, we may add q constraints. As the column vectors a_k ; k = r'+1, \dots, r are linear combinations of the column vectors a_k ; k = 1, \dots, r', and the projection of any vector lying in the column space of A_1 simply gets reproduced, we have PA = A or (PA)^{\rm T} = A^{\rm T} P = A^{\rm T}. Hence, multiplying (E.13) on the left by A^{\rm T} yields

A^{\rm T} A \beta = A^{\rm T} x . \qquad (E.14)

Adding q = r - r' constraints, we may solve the two systems simultaneously. To this end, we multiply (E.5) on the left by H^{\rm T},

H^{\rm T} H \beta = H^{\rm T} y , \qquad (E.15)

so that

\left( A^{\rm T} A + H^{\rm T} H \right) \beta = A^{\rm T} x + H^{\rm T} y . \qquad (E.16)

Obviously, the auxiliary matrix

C = \begin{pmatrix} A \\ H \end{pmatrix}

possesses r' linearly independent rows in A and q = r - r' linearly independent rows in H so that rank(C) = r. But then, the product matrix

C^{\rm T} C = A^{\rm T} A + H^{\rm T} H

has the same rank. Finally, (E.16) yields

\bar\beta = \left( A^{\rm T} A + H^{\rm T} H \right)^{-1} \left( A^{\rm T} x + H^{\rm T} y \right) . \qquad (E.17)

The two relationships [8],

H \left( A^{\rm T} A + H^{\rm T} H \right)^{-1} A^{\rm T} = 0 \quad\text{and}\quad H \left( A^{\rm T} A + H^{\rm T} H \right)^{-1} H^{\rm T} = I ,

where I denotes a (q × q) identity matrix, disclose that \bar\beta satisfies H\bar\beta = y.
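A sketch of (E.17) for a rank-deficient design matrix (NumPy assumed; A, H and the data are invented so that rank(C) = r):

```python
import numpy as np

rng = np.random.default_rng(9)
m = 10
A1 = rng.normal(size=(m, 2))
A = np.column_stack([A1, A1[:, 0] + A1[:, 1]])   # rank(A) = r' = 2 < r = 3
H = np.array([[0.0, 0.0, 1.0]])                  # q = r - r' = 1 added constraint
x = rng.normal(size=m)
y = np.array([0.5])

# (E.17): A^T A + H^T H is non-singular although A^T A alone is not
beta = np.linalg.inv(A.T @ A + H.T @ H) @ (A.T @ x + H.T @ y)

assert np.linalg.matrix_rank(A) == 2
assert np.allclose(H @ beta, y)                  # constraint satisfied exactly
assert np.allclose(A.T @ A @ beta, A.T @ x)      # normal equations (E.14) hold
```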
F Expansion of Solution Vectors
Section 11.1: Planes
The design matrix

A = \begin{pmatrix} 1 & x_1 & y_1 \\ 1 & x_2 & y_2 \\ \dots & \dots & \dots \\ 1 & x_m & y_m \end{pmatrix}

leads us to the product

A^{\rm T} A = \begin{pmatrix} H_{11} & H_{12} & H_{13} \\ H_{21} & H_{22} & H_{23} \\ H_{31} & H_{32} & H_{33} \end{pmatrix}

in which

H_{11} = m , \quad H_{22} = \sum_{j=1}^{m} x_j^2 , \quad H_{33} = \sum_{j=1}^{m} y_j^2 ,

H_{12} = H_{21} = \sum_{j=1}^{m} x_j , \quad H_{13} = H_{31} = \sum_{j=1}^{m} y_j , \quad H_{23} = H_{32} = \sum_{j=1}^{m} x_j y_j .

Let D denote the determinant of A^{\rm T} A,

D = H_{11}\left( H_{22}H_{33} - H_{23}^2 \right) + H_{12}\left( H_{31}H_{23} - H_{33}H_{12} \right) + H_{13}\left( H_{21}H_{32} - H_{13}H_{22} \right) .

Obviously,

\left( A^{\rm T} A \right)^{-1} = \frac{1}{D} \begin{pmatrix} H_{22}H_{33} - H_{23}^2 & H_{23}H_{13} - H_{12}H_{33} & H_{12}H_{23} - H_{22}H_{13} \\ H_{23}H_{13} - H_{12}H_{33} & H_{11}H_{33} - H_{13}^2 & H_{12}H_{13} - H_{11}H_{23} \\ H_{12}H_{23} - H_{22}H_{13} & H_{12}H_{13} - H_{11}H_{23} & H_{11}H_{22} - H_{12}^2 \end{pmatrix}

is the inverse of A^{\rm T} A. Putting B = A \left( A^{\rm T} A \right)^{-1}, the components of \beta = B^{\rm T} z are given by

\beta_1 = \sum_{j=1}^{m} b_{j1} z_j , \quad \beta_2 = \sum_{j=1}^{m} b_{j2} z_j , \quad \beta_3 = \sum_{j=1}^{m} b_{j3} z_j

or

\beta_1 = \frac{1}{D} \sum_{j=1}^{m} \left\{ \left[ H_{22}H_{33} - H_{23}^2 \right] + \left[ H_{23}H_{13} - H_{12}H_{33} \right] x_j + \left[ H_{12}H_{23} - H_{22}H_{13} \right] y_j \right\} z_j

\beta_2 = \frac{1}{D} \sum_{j=1}^{m} \left\{ \left[ H_{23}H_{13} - H_{12}H_{33} \right] + \left[ H_{11}H_{33} - H_{13}^2 \right] x_j + \left[ H_{12}H_{13} - H_{11}H_{23} \right] y_j \right\} z_j

\beta_3 = \frac{1}{D} \sum_{j=1}^{m} \left\{ \left[ H_{12}H_{23} - H_{22}H_{13} \right] + \left[ H_{12}H_{13} - H_{11}H_{23} \right] x_j + \left[ H_{11}H_{22} - H_{12}^2 \right] y_j \right\} z_j .
Before writing down the partial derivatives, we differentiate the determinant,

\frac{\partial D}{\partial x_i} = 2\left[ H_{11}H_{33} - H_{13}^2 \right] x_i + 2\left[ H_{12}H_{13} - H_{11}H_{23} \right] y_i + 2\left[ H_{13}H_{23} - H_{12}H_{33} \right]

\frac{\partial D}{\partial y_i} = 2\left[ H_{12}H_{13} - H_{11}H_{23} \right] x_i + 2\left[ H_{11}H_{22} - H_{12}^2 \right] y_i + 2\left[ H_{12}H_{23} - H_{13}H_{22} \right]

\frac{\partial D}{\partial z_i} = 0 .
Now, each of the components βk; k = 1, 2, 3 has to be partially differentiatedwith respect to x, y and z,
c_{i1} = \frac{\partial \beta_1}{\partial x_i} = -\frac{\beta_1}{D}\frac{\partial D}{\partial x_i} + \frac{1}{D}\Big\{ 2\left( H_{33}x_i - H_{23}y_i \right)\sum_{j=1}^{m} z_j + \left( H_{13}H_{23} - H_{12}H_{33} \right) z_i + \left( H_{13}y_i - H_{33} \right)\sum_{j=1}^{m} x_j z_j + \left( H_{23} + H_{12}y_i - 2H_{13}x_i \right)\sum_{j=1}^{m} y_j z_j \Big\}

c_{i+m,1} = \frac{\partial \beta_1}{\partial y_i} = -\frac{\beta_1}{D}\frac{\partial D}{\partial y_i} + \frac{1}{D}\Big\{ 2\left( H_{22}y_i - H_{23}x_i \right)\sum_{j=1}^{m} z_j + \left( H_{12}H_{23} - H_{13}H_{22} \right) z_i + \left( H_{23} + H_{13}x_i - 2H_{12}y_i \right)\sum_{j=1}^{m} x_j z_j + \left( H_{12}x_i - H_{22} \right)\sum_{j=1}^{m} y_j z_j \Big\}

c_{i+2m,1} = \frac{\partial \beta_1}{\partial z_i} = b_{i1}

c_{i2} = \frac{\partial \beta_2}{\partial x_i} = -\frac{\beta_2}{D}\frac{\partial D}{\partial x_i} + \frac{1}{D}\Big\{ \left( H_{13}y_i - H_{33} \right)\sum_{j=1}^{m} z_j + \left( H_{11}H_{33} - H_{13}^2 \right) z_i + \left( H_{13} - H_{11}y_i \right)\sum_{j=1}^{m} y_j z_j \Big\}

c_{i+m,2} = \frac{\partial \beta_2}{\partial y_i} = -\frac{\beta_2}{D}\frac{\partial D}{\partial y_i} + \frac{1}{D}\Big\{ \left( H_{23} + H_{13}x_i - 2H_{12}y_i \right)\sum_{j=1}^{m} z_j + \left( H_{12}H_{13} - H_{11}H_{23} \right) z_i + 2\left( H_{11}y_i - H_{13} \right)\sum_{j=1}^{m} x_j z_j + \left( H_{12} - H_{11}x_i \right)\sum_{j=1}^{m} y_j z_j \Big\}

c_{i+2m,2} = \frac{\partial \beta_2}{\partial z_i} = b_{i2}

c_{i3} = \frac{\partial \beta_3}{\partial x_i} = -\frac{\beta_3}{D}\frac{\partial D}{\partial x_i} + \frac{1}{D}\Big\{ \left( H_{23} + H_{12}y_i - 2H_{13}x_i \right)\sum_{j=1}^{m} z_j + \left( H_{12}H_{13} - H_{11}H_{23} \right) z_i + \left( H_{13} - H_{11}y_i \right)\sum_{j=1}^{m} x_j z_j + 2\left( H_{11}x_i - H_{12} \right)\sum_{j=1}^{m} y_j z_j \Big\}

c_{i+m,3} = \frac{\partial \beta_3}{\partial y_i} = -\frac{\beta_3}{D}\frac{\partial D}{\partial y_i} + \frac{1}{D}\Big\{ \left( H_{12}x_i - H_{22} \right)\sum_{j=1}^{m} z_j + \left( H_{11}H_{22} - H_{12}^2 \right) z_i + \left( H_{12} - H_{11}x_i \right)\sum_{j=1}^{m} x_j z_j \Big\}

c_{i+2m,3} = \frac{\partial \beta_3}{\partial z_i} = b_{i3} .
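The explicit inverse and the coefficients \beta_k above can be cross-checked against a generic least squares solver; a sketch assuming NumPy, with invented plane data:

```python
import numpy as np

rng = np.random.default_rng(10)
m = 20
x, y = rng.normal(size=m), rng.normal(size=m)
z = 1.0 + 2.0 * x - 0.5 * y + 0.01 * rng.normal(size=m)

A = np.column_stack([np.ones(m), x, y])
H = A.T @ A                                  # the H_kk' of the text
D = np.linalg.det(H)

# inverse of A^T A written out via cofactors, as in the text
adj = np.array([
    [H[1,1]*H[2,2] - H[1,2]**2,      H[1,2]*H[0,2] - H[0,1]*H[2,2],  H[0,1]*H[1,2] - H[1,1]*H[0,2]],
    [H[1,2]*H[0,2] - H[0,1]*H[2,2],  H[0,0]*H[2,2] - H[0,2]**2,      H[0,1]*H[0,2] - H[0,0]*H[1,2]],
    [H[0,1]*H[1,2] - H[1,1]*H[0,2],  H[0,1]*H[0,2] - H[0,0]*H[1,2],  H[0,0]*H[1,1] - H[0,1]**2],
])
Hinv = adj / D
assert np.allclose(Hinv, np.linalg.inv(H))

B = A @ Hinv
beta = B.T @ z                               # beta_k = sum_j b_jk z_j
assert np.allclose(beta, np.linalg.lstsq(A, z, rcond=None)[0])
```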
Section 11.1: Parabolas
From the design matrix

A = \begin{pmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ \dots & \dots & \dots \\ 1 & x_m & x_m^2 \end{pmatrix}

we find the product matrix

A^{\rm T} A = \begin{pmatrix} H_{11} & H_{12} & H_{13} \\ H_{21} & H_{22} & H_{23} \\ H_{31} & H_{32} & H_{33} \end{pmatrix}

whose elements are given by

H_{11} = m , \quad H_{22} = \sum_{j=1}^{m} x_j^2 , \quad H_{33} = \sum_{j=1}^{m} x_j^4 ,

H_{12} = H_{21} = \sum_{j=1}^{m} x_j , \quad H_{13} = H_{31} = \sum_{j=1}^{m} x_j^2 , \quad H_{23} = H_{32} = \sum_{j=1}^{m} x_j^3 .

Designating the determinant of A^{\rm T} A by

D = H_{11}\left( H_{22}H_{33} - H_{23}^2 \right) + H_{12}\left( H_{31}H_{23} - H_{33}H_{12} \right) + H_{13}\left( H_{21}H_{32} - H_{13}H_{22} \right) ,

we find

\left( A^{\rm T} A \right)^{-1} = \frac{1}{D} \begin{pmatrix} H_{22}H_{33} - H_{23}^2 & H_{23}H_{13} - H_{12}H_{33} & H_{12}H_{23} - H_{22}H_{13} \\ H_{23}H_{13} - H_{12}H_{33} & H_{11}H_{33} - H_{13}^2 & H_{12}H_{13} - H_{11}H_{23} \\ H_{12}H_{23} - H_{22}H_{13} & H_{12}H_{13} - H_{11}H_{23} & H_{11}H_{22} - H_{12}^2 \end{pmatrix} .
Setting B = A \left( A^{\rm T} A \right)^{-1}, the components of \beta = B^{\rm T} y are

\beta_1 = \sum_{j=1}^{m} b_{j1} y_j , \quad \beta_2 = \sum_{j=1}^{m} b_{j2} y_j , \quad \beta_3 = \sum_{j=1}^{m} b_{j3} y_j

or

\beta_1 = \frac{1}{D} \sum_{j=1}^{m} \left\{ \left[ H_{22}H_{33} - H_{23}^2 \right] + \left[ H_{23}H_{13} - H_{12}H_{33} \right] x_j + \left[ H_{12}H_{23} - H_{22}H_{13} \right] x_j^2 \right\} y_j

\beta_2 = \frac{1}{D} \sum_{j=1}^{m} \left\{ \left[ H_{23}H_{13} - H_{12}H_{33} \right] + \left[ H_{11}H_{33} - H_{13}^2 \right] x_j + \left[ H_{12}H_{13} - H_{11}H_{23} \right] x_j^2 \right\} y_j

\beta_3 = \frac{1}{D} \sum_{j=1}^{m} \left\{ \left[ H_{12}H_{23} - H_{22}H_{13} \right] + \left[ H_{12}H_{13} - H_{11}H_{23} \right] x_j + \left[ H_{11}H_{22} - H_{12}^2 \right] x_j^2 \right\} y_j .
The partial derivatives of the determinant follow from

\frac{\partial D}{\partial x_i} = 2\left[ H_{13}H_{23} - H_{12}H_{33} \right] + 2\left[ H_{11}H_{33} + 2\left( H_{12}H_{23} - H_{13}H_{22} \right) - H_{13}^2 \right] x_i + 6\left[ H_{12}H_{13} - H_{11}H_{23} \right] x_i^2 + 4\left( H_{11}H_{22} - H_{12}^2 \right) x_i^3 ,

\frac{\partial D}{\partial y_i} = 0 ,
and for those of the components of the solution vector we have
c_{i1} = \frac{\partial \beta_1}{\partial x_i} = -\frac{\beta_1}{D}\frac{\partial D}{\partial x_i} + \frac{1}{D}\Big\{ 2\left( H_{33}x_i - 3H_{23}x_i^2 + 2H_{22}x_i^3 \right)\sum_{j=1}^{m} y_j + \left( H_{23}H_{13} - H_{12}H_{33} \right) y_i + \left( -H_{33} + 2H_{23}x_i + 3H_{13}x_i^2 - 4H_{12}x_i^3 \right)\sum_{j=1}^{m} x_j y_j + 2\left( H_{12}H_{23} - H_{22}H_{13} \right) x_i y_i + \left[ H_{23} - 2\left( H_{13} + H_{22} \right)x_i + 3H_{12}x_i^2 \right]\sum_{j=1}^{m} x_j^2 y_j \Big\}

c_{i+m,1} = \frac{\partial \beta_1}{\partial y_i} = b_{i1}

c_{i2} = \frac{\partial \beta_2}{\partial x_i} = -\frac{\beta_2}{D}\frac{\partial D}{\partial x_i} + \frac{1}{D}\Big\{ \left( -H_{33} + 2H_{23}x_i + 3H_{13}x_i^2 - 4H_{12}x_i^3 \right)\sum_{j=1}^{m} y_j + 4\left( -H_{13}x_i + H_{11}x_i^3 \right)\sum_{j=1}^{m} x_j y_j + \left( H_{11}H_{33} - H_{13}^2 \right) y_i + 2\left( H_{12}H_{13} - H_{11}H_{23} \right) x_i y_i + \left[ H_{13} + 2H_{12}x_i - 3H_{11}x_i^2 \right]\sum_{j=1}^{m} x_j^2 y_j \Big\}

c_{i+m,2} = \frac{\partial \beta_2}{\partial y_i} = b_{i2}

c_{i3} = \frac{\partial \beta_3}{\partial x_i} = -\frac{\beta_3}{D}\frac{\partial D}{\partial x_i} + \frac{1}{D}\Big\{ \left[ H_{23} - 2\left( H_{13} + H_{22} \right)x_i + 3H_{12}x_i^2 \right]\sum_{j=1}^{m} y_j + \left( H_{13} + 2H_{12}x_i - 3H_{11}x_i^2 \right)\sum_{j=1}^{m} x_j y_j + \left( H_{12}H_{13} - H_{11}H_{23} \right) y_i + 2\left( H_{11}H_{22} - H_{12}^2 \right) x_i y_i + 2\left( -H_{12} + H_{11}x_i \right)\sum_{j=1}^{m} x_j^2 y_j \Big\}

c_{i+m,3} = \frac{\partial \beta_3}{\partial y_i} = b_{i3} .
Section 11.4: Circles
From

A = \begin{pmatrix} 1 & u_1 & v_1 \\ 1 & u_2 & v_2 \\ \dots & \dots & \dots \\ 1 & u_m & v_m \end{pmatrix}

we find

A^{\rm T} A = \begin{pmatrix} H_{11} & H_{12} & H_{13} \\ H_{21} & H_{22} & H_{23} \\ H_{31} & H_{32} & H_{33} \end{pmatrix}

whose elements are given by

H_{11} = m , \quad H_{22} = \sum_{j=1}^{m} u_j^2 , \quad H_{33} = \sum_{j=1}^{m} v_j^2 ,

H_{12} = H_{21} = \sum_{j=1}^{m} u_j , \quad H_{13} = H_{31} = \sum_{j=1}^{m} v_j , \quad H_{23} = H_{32} = \sum_{j=1}^{m} u_j v_j .

The determinant of A^{\rm T} A is

D = H_{11}\left( H_{22}H_{33} - H_{23}^2 \right) + H_{12}\left( H_{31}H_{23} - H_{33}H_{12} \right) + H_{13}\left( H_{21}H_{32} - H_{13}H_{22} \right) ,

so that

\left( A^{\rm T} A \right)^{-1} = \frac{1}{D} \begin{pmatrix} H_{22}H_{33} - H_{23}^2 & H_{23}H_{13} - H_{12}H_{33} & H_{12}H_{23} - H_{22}H_{13} \\ H_{23}H_{13} - H_{12}H_{33} & H_{11}H_{33} - H_{13}^2 & H_{12}H_{13} - H_{11}H_{23} \\ H_{12}H_{23} - H_{22}H_{13} & H_{12}H_{13} - H_{11}H_{23} & H_{11}H_{22} - H_{12}^2 \end{pmatrix} .

Putting B = A \left( A^{\rm T} A \right)^{-1}, the components of \beta = B^{\rm T} w are

\beta_1 = \sum_{j=1}^{m} b_{j1} w_j , \quad \beta_2 = \sum_{j=1}^{m} b_{j2} w_j , \quad \beta_3 = \sum_{j=1}^{m} b_{j3} w_j

or

\beta_1 = \frac{1}{D} \sum_{j=1}^{m} \left\{ \left[ H_{22}H_{33} - H_{23}^2 \right] + \left[ H_{23}H_{13} - H_{12}H_{33} \right] u_j + \left[ H_{12}H_{23} - H_{22}H_{13} \right] v_j \right\} w_j

\beta_2 = \frac{1}{D} \sum_{j=1}^{m} \left\{ \left[ H_{23}H_{13} - H_{12}H_{33} \right] + \left[ H_{11}H_{33} - H_{13}^2 \right] u_j + \left[ H_{12}H_{13} - H_{11}H_{23} \right] v_j \right\} w_j

\beta_3 = \frac{1}{D} \sum_{j=1}^{m} \left\{ \left[ H_{12}H_{23} - H_{22}H_{13} \right] + \left[ H_{12}H_{13} - H_{11}H_{23} \right] u_j + \left[ H_{11}H_{22} - H_{12}^2 \right] v_j \right\} w_j .
For the partial derivatives of the determinant we find

\frac{\partial D}{\partial x_i} = \frac{2}{r^*}\left[ \left( H_{11}H_{33} - H_{13}^2 \right) u_i + \left( H_{13}H_{12} - H_{11}H_{23} \right) v_i + \left( H_{13}H_{23} - H_{33}H_{12} \right) \right]

\frac{\partial D}{\partial y_i} = \frac{2}{r^*}\left[ \left( H_{12}H_{13} - H_{11}H_{23} \right) u_i + \left( H_{11}H_{22} - H_{12}^2 \right) v_i + \left( H_{12}H_{23} - H_{13}H_{22} \right) \right]
and for those of the components of the solution vector
c_{i1} = \frac{\partial \beta_1}{\partial x_i} = -\frac{\beta_1}{D}\frac{\partial D}{\partial x_i} + \frac{1}{D}\Big\{ \frac{2}{r^*}\left( H_{33}u_i - H_{23}v_i \right)\sum_{j=1}^{m} w_j + \left( H_{22}H_{33} - H_{23}^2 \right) u_i + \frac{1}{r^*}\left( H_{13}v_i - H_{33} \right)\sum_{j=1}^{m} u_j w_j + \left( H_{23}H_{13} - H_{12}H_{33} \right)\left( \frac{w_i}{r^*} + u_i^2 \right) + \frac{1}{r^*}\left( H_{23} + H_{12}v_i - 2H_{13}u_i \right)\sum_{j=1}^{m} v_j w_j + \left( H_{12}H_{23} - H_{22}H_{13} \right) u_i v_i \Big\}

c_{i+m,1} = \frac{\partial \beta_1}{\partial y_i} = -\frac{\beta_1}{D}\frac{\partial D}{\partial y_i} + \frac{1}{D}\Big\{ \frac{2}{r^*}\left( H_{22}v_i - H_{23}u_i \right)\sum_{j=1}^{m} w_j + \left( H_{22}H_{33} - H_{23}^2 \right) v_i + \frac{1}{r^*}\left( H_{13}u_i + H_{23} - 2H_{12}v_i \right)\sum_{j=1}^{m} u_j w_j + \left( H_{23}H_{13} - H_{12}H_{33} \right) u_i v_i + \frac{1}{r^*}\left( H_{12}u_i - H_{22} \right)\sum_{j=1}^{m} v_j w_j + \left( H_{12}H_{23} - H_{22}H_{13} \right)\left( \frac{w_i}{r^*} + v_i^2 \right) \Big\}

c_{i2} = \frac{\partial \beta_2}{\partial x_i} = -\frac{\beta_2}{D}\frac{\partial D}{\partial x_i} + \frac{1}{D}\Big\{ \frac{1}{r^*}\left( H_{13}v_i - H_{33} \right)\sum_{j=1}^{m} w_j + \left( H_{23}H_{13} - H_{12}H_{33} \right) u_i + \left( H_{11}H_{33} - H_{13}^2 \right)\left( \frac{w_i}{r^*} + u_i^2 \right) + \frac{1}{r^*}\left( H_{13} + H_{11}v_i \right)\sum_{j=1}^{m} v_j w_j + \left( H_{12}H_{13} - H_{11}H_{23} \right) u_i v_i \Big\}

c_{i+m,2} = \frac{\partial \beta_2}{\partial y_i} = -\frac{\beta_2}{D}\frac{\partial D}{\partial y_i} + \frac{1}{D}\Big\{ \frac{1}{r^*}\left( H_{23} + H_{13}u_i - 2H_{12}v_i \right)\sum_{j=1}^{m} w_j + \left( H_{23}H_{13} - H_{12}H_{33} \right) v_i + \frac{2}{r^*}\left( H_{11}v_i - H_{13} \right)\sum_{j=1}^{m} u_j w_j + \left( H_{11}H_{33} - H_{13}^2 \right) u_i v_i + \frac{1}{r^*}\left( H_{12} - H_{11}u_i \right)\sum_{j=1}^{m} v_j w_j + \left( H_{12}H_{13} - H_{11}H_{23} \right)\left( \frac{w_i}{r^*} + v_i^2 \right) \Big\}

c_{i3} = \frac{\partial \beta_3}{\partial x_i} = -\frac{\beta_3}{D}\frac{\partial D}{\partial x_i} + \frac{1}{D}\Big\{ \frac{1}{r^*}\left( H_{23} + H_{12}v_i - 2H_{13}u_i \right)\sum_{j=1}^{m} w_j + \left( H_{12}H_{23} - H_{22}H_{13} \right) u_i + \frac{1}{r^*}\left( H_{13} - H_{11}v_i \right)\sum_{j=1}^{m} u_j w_j + \left( H_{12}H_{13} - H_{11}H_{23} \right)\left( \frac{w_i}{r^*} + u_i^2 \right) + \frac{2}{r^*}\left( H_{11}u_i - H_{12} \right)\sum_{j=1}^{m} v_j w_j + \left( H_{11}H_{22} - H_{12}^2 \right) u_i v_i \Big\}

c_{i+m,3} = \frac{\partial \beta_3}{\partial y_i} = -\frac{\beta_3}{D}\frac{\partial D}{\partial y_i} + \frac{1}{D}\Big\{ \frac{1}{r^*}\left( H_{12}u_i - H_{22} \right)\sum_{j=1}^{m} w_j + \left( H_{12}H_{23} - H_{22}H_{13} \right) v_i + \frac{1}{r^*}\left( H_{12} - H_{11}u_i \right)\sum_{j=1}^{m} u_j w_j + \left( H_{12}H_{13} - H_{11}H_{23} \right) u_i v_i + \left( H_{11}H_{22} - H_{12}^2 \right)\left( \frac{w_i}{r^*} + v_i^2 \right) \Big\} .
G Student’s Density
Suppose the random variable X to be N(\mu, \sigma^2)-distributed. Then, its distribution function is given by

P(X \le x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right) \mathrm{d}x . \qquad (G.1)

We consider a sample

x_1, x_2, \dots, x_n \qquad (G.2)

and the estimator

s^2 = \frac{1}{n-1} \sum_{l=1}^{n} \left( x_l - \bar x \right)^2 ; \qquad \nu = n - 1 . \qquad (G.3)

For a fixed t we define

x = \mu + t s , \qquad (G.4)

so that

P(X \le \mu + t s) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\mu + t s} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right) \mathrm{d}x . \qquad (G.5)

Substituting \eta = (x - \mu)/\sigma for x leads to

P(X \le \mu + t s) = P(t, s, \nu) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t s/\sigma} e^{-\eta^2/2} \, \mathrm{d}\eta . \qquad (G.6)
For this statement to be statistically representative, we average by means of the density

p_{S^2}(s^2) = \frac{\nu^{\nu/2}}{2^{\nu/2}\,\Gamma(\nu/2)\,\sigma^\nu} \exp\left( -\frac{\nu s^2}{2\sigma^2} \right) s^{\nu-2} , \qquad (G.7)

which we deduce from (3.36) putting \chi^2 = \frac{n-1}{\sigma^2} s^2 and p_{S^2}(s^2) = p_{\chi^2}(\chi^2)\,\frac{\mathrm{d}\chi^2}{\mathrm{d}s^2}, so that

P(t, \nu) = E\{ P(t, s, \nu) \} = \int_{0}^{\infty} \left\{ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t s/\sigma} e^{-\eta^2/2} \, \mathrm{d}\eta \right\} p_{S^2}(s^2) \, \mathrm{d}s^2 . \qquad (G.8)

Averaging by means of the density p_S(s), as given e.g. in [50], leads to the same result, as p_S(s)\,\mathrm{d}s = p_{S^2}(s^2)\,\mathrm{d}s^2.
Obviously, (G.8) is a distribution function and its only variable is t. Hence, differentiating with respect to t,

\frac{\mathrm{d}P(t,\nu)}{\mathrm{d}t} = \int_{0}^{\infty} \left\{ \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{t^2 s^2}{2\sigma^2} \right) s \right\} p_{S^2}(s^2) \, \mathrm{d}s^2 , \qquad (G.9)

and integrating out the right-hand side, we find Student's (or Gosset's) density

p_T(t, \nu) = \frac{ \Gamma\!\left( \frac{\nu+1}{2} \right) }{ \sqrt{\pi\nu}\;\Gamma\!\left( \frac{\nu}{2} \right) } \; \frac{1}{ \left( 1 + \frac{t^2}{\nu} \right)^{\frac{\nu+1}{2}} } , \qquad (G.10)

the random variable of which is implied in (G.4),

T(\nu) = \frac{X - \mu}{S} . \qquad (G.11)
According to (G.7), the number of degrees of freedom is \nu = n - 1. Instead of (G.2) we now consider the estimator

\bar x = \frac{1}{n} \sum_{l=1}^{n} x_l \qquad (G.12)

and the limit of integration

\bar x = \mu + t s/\sqrt{n} \qquad (G.13)

so that

P\left( \bar X \le \bar x \right) = \frac{1}{(\sigma/\sqrt{n})\sqrt{2\pi}} \int_{-\infty}^{\bar x} \exp\left( -\frac{(\bar x - \mu)^2}{2\sigma^2/n} \right) \mathrm{d}\bar x \qquad (G.14)

leads us to

P\left( \bar X \le \mu + t s/\sqrt{n} \right) = \frac{1}{(\sigma/\sqrt{n})\sqrt{2\pi}} \int_{-\infty}^{\mu + t s/\sqrt{n}} \exp\left( -\frac{(\bar x - \mu)^2}{2\sigma^2/n} \right) \mathrm{d}\bar x . \qquad (G.15)

On substituting \eta = (\bar x - \mu)/(\sigma/\sqrt{n}) we again arrive at (G.6). Hence, the random variable

T(\nu) = \frac{\bar X - \mu}{S/\sqrt{n}} \qquad (G.16)

is t-distributed with \nu = n - 1 degrees of freedom.
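As a sanity check, (G.10) should integrate to one over t; a sketch using only the Python standard library (\nu and the integration range are arbitrary choices):

```python
import math

nu = 5                                        # degrees of freedom, arbitrary choice
c = math.gamma((nu + 1) / 2) / (math.sqrt(math.pi * nu) * math.gamma(nu / 2))

def p_T(t):
    """Student's density, eq. (G.10)."""
    return c * (1.0 + t * t / nu) ** (-(nu + 1) / 2)

# trapezoidal integration over a wide interval: the density integrates to 1
N, a = 200_000, 50.0
h = 2 * a / N
total = sum(p_T(-a + k * h) for k in range(N + 1)) - 0.5 * (p_T(-a) + p_T(a))
total *= h
assert abs(total - 1.0) < 1e-3
```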
H Quantiles of Hotelling’s Density
t_{0.95} ; \quad P = 95\%

 n\m    1     2     3     4     5     6     7     8     9    10
  2    13
  3    4.3   28
  4    3.2   7.6   44
  5    2.8   5.1  10.7   60
  6    2.6   4.2   6.8  13.9   76
  7    2.5   3.7   5.4   8.5  17.0   92
  8    2.4   3.5   4.8   6.7  10.3  20.1  108
  9    2.3   3.3   4.4   5.8   7.9  12.0  23.3  124
 10    2.3   3.2   4.1   5.2   6.7   9.1  13.7  26.4  140
 11    2.2   3.1   3.9   4.9   6.0   7.7  10.3  15.4  29.5  156
 12    2.2   3.0   3.8   4.6   5.6   6.9   8.7  11.5  17.1  32.7
 13    2.2   3.0   3.7   4.4   5.3   6.3   7.7   9.6  12.7  18.8
 14    2.2   2.9   3.6   4.3   5.0   5.9   7.0   8.5  10.6  13.9
 15    2.1   2.9   3.5   4.1   4.8   5.6   6.6   7.7   9.3  11.2
 16    2.1   2.8   3.4   4.0   4.7   5.4   6.2   7.2   8.4  10.1
 17    2.1   2.8   3.4   4.0   4.6   5.2   5.9   6.8   7.8   9.1
 18    2.1   2.8   3.3   3.9   4.5   5.1   5.7   6.5   7.4   8.4
 19    2.1   2.8   3.3   3.8   4.4   4.9   5.5   6.2   7.0   7.9
 20    2.1   2.7   3.3   3.8   4.3   4.8   5.4   6.0   6.7   7.5
 21    2.1   2.7   3.3   3.7   4.2   4.7   5.3   5.8   6.5   7.2
 22    2.1   2.7   3.2   3.7   4.2   4.7   5.2   5.7   6.3   6.9
 23    2.1   2.7   3.2   3.7   4.1   4.6   5.1   5.6   6.1   6.7
 24    2.1   2.7   3.2   3.6   4.1   4.5   5.0   5.5   6.0   6.5
 25    2.1   2.7   3.2   3.6   4.0   4.5   4.9   5.4   5.9   6.4
 26    2.1   2.7   3.1   3.6   4.0   4.4   4.8   5.3   5.7   6.2
 27    2.1   2.7   3.1   3.6   4.0   4.4   4.8   5.2   5.7   6.1
 28    2.1   2.7   3.1   3.5   3.9   4.3   4.7   5.1   5.6   6.0
 29    2.1   2.6   3.1   3.5   3.9   4.3   4.7   5.1   5.5   5.9
 30    2.0   2.6   3.1   3.5   3.9   4.3   4.6   5.0   5.4   5.8
 31    2.0   2.6   3.1   3.5   3.9   4.2   4.6   5.0   5.4   5.8
 32          2.6   3.1   3.5   3.8   4.2   4.6   4.9   5.3   5.7
 33                3.1   3.5   3.8   4.2   4.5   4.9   5.3   5.6
 34                      3.4   3.8   4.2   4.5   4.9   5.2   5.6
 35                            3.8   4.1   4.5   4.8   5.2   5.5
 36                                  4.1   4.5   4.8   5.1   5.4
 37                                        4.4   4.7   5.1   5.4
 38                                              4.7   5.1   5.4
 39                                                    5.0   5.3
 40                                                          5.3
References
Journals
1. Gauss, C.F., Abhandlungen zur Methode der kleinsten Quadrate, Physica, Würzburg, 1964
2. Student (W.S. Gosset), The Probable Error of a Mean, Biometrika 6 (1908) 1–25
3. Fisher, R.A., Applications of Student's Distribution, Metron 5 (1925) 90 ff.
4. Behrens, W.U., Ein Beitrag zur Fehlerrechnung bei wenigen Beobachtungen, Landwirtschaftliches Jahrbuch 68 (1929) 807–937
5. Hotelling, H., The Generalization of Student's Ratio, Annals of Mathematical Statistics 2 (1931) 360–378
6. Fisher, R.A., The Comparison of Samples with Possibly Unequal Variances, Annals of Eugenics 9 (1939) 174 ff.
7. Welch, B.L., The Generalization of Student's Problem when Several Different Population Variances are Involved, Biometrika 34 (1947) 28–35
8. Plackett, R.L., Some Theorems in Least Squares, Biometrika 37 (1950) 149–157
9. Eisenhart, C., The Reliability of Measured Values – Part I: Fundamental Concepts, Photogrammetric Engineering 18 (1952) 543–561
10. Wagner, S., Zur Behandlung systematischer Fehler bei der Angabe von Meßunsicherheiten, PTB-Mitt. 79 (1969) 343–347
11. Chand, D.R. and S.S. Kapur, An Algorithm for Convex Polytopes, J. Ass. Computing Machinery 17 (1970) 78–86
12. Barnes, J.A., Characterization of Frequency Stability, IEEE IM 20 (1971) 105–120
13. Bender, P.L., B.N. Taylor, E.R. Cohen, J.S. Thomsen, P. Franken and C. Eisenhart, Should Least Squares Adjustment of the Fundamental Constants be Abolished? In: Precision Measurement and Calibration, NBS Special Publication 343, United States Department of Commerce, Washington, D.C., 1971
14. Cordes, H. and M. Grabe, Messung der Neigungsstreuung mikroskopischer Rauheiten in inkohärenten Lichtfeldern, Metrologia 9 (1973) 69–74
15. Quinn, T.J., The Kilogram: The Present State of our Knowledge, IEEE IM 40 (1991) 81–85
16. Birge, R.T., Probable Values of the General Physical Constants, Rev. Mod. Phys. 1 (1929) 1–73
17. Cohen, E.R. and B.N. Taylor, The 1986 Adjustment of the Fundamental Physical Constants, Codata Bulletin No. 63 (1986)
Works by the Author
18. Masseanschluß und Fehlerausbreitung, PTB-Mitt. 87 (1977) 223–227
19. Über die Fortpflanzung zufälliger und systematischer Fehler, Seminar über die Angabe der Meßunsicherheit, 20. und 21. Februar 1978, PTB Braunschweig (On the Propagation of Random and Systematic Errors, Seminar on the Statement of the Measurement Uncertainty)
20. Note on the Application of the Method of Least Squares, Metrologia 14 (1978) 143–146
21. On the Assignment of Uncertainties Within the Method of Least Squares, Poster Paper, Second International Conference on Precision Measurement and Fundamental Constants, Washington, D.C., 8–12 June 1981
22. Principles of "Metrological Statistics", Metrologia 23 (1986/87) 213–219
23. A New Formalism for the Combination and Propagation of Random and Systematic Errors, Measurement Science Conference, Los Angeles, 8–9 February 1990
24. Über die Interpretation empirischer Kovarianzen bei der Abschätzung von Meßunsicherheiten, PTB-Mitt. 100 (1990) 181–186
25. Uncertainties, Confidence Ellipsoids and Security Polytopes in LSA, Phys. Letters A165 (1992) 124–132, Erratum A205 (1995) 425
26. Justierung von Waagen und Gewichtsstücken auf den konventionellen Wägewert, Wäge-, Mess- und Abfülltechnik, Ausgabe 6, November 1977
27. On the Estimation of One- and Multidimensional Uncertainties, National Conference of Standards Laboratories, 25–29 July, Albuquerque, USA, Proceedings 569–576
28. An Alternative Algorithm for Adjusting the Fundamental Physical Constants, Phys. Letters A213 (1996) 125–137
29. see [14]
30. Gedanken zur Revision der Gauß'schen Fehlerrechnung, tm Technisches Messen 67 (2000) 283–288
31. Estimation of Measurement Uncertainties – an Alternative to the ISO Guide, Metrologia 38 (2001) 97–106
32. Neue Formalismen zum Schätzen von Meßunsicherheiten – Ein Beitrag zum Verknüpfen und Fortpflanzen von Meßfehlern, tm Technisches Messen 69 (2002) 142–150
33. On Measurement Uncertainties Derived from "Metrological Statistics", Proceedings of the Fourth International Symposium on Algorithms for Approximation, held at the University of Huddersfield, July 2001, edited by J. Levesley, I. Anderson and J.C. Mason, 154–161
National and International Recommendations
34. BIPM, Bureau International des Poids et Mesures, Draft Recommendation on the Statement of Uncertainties, Metrologia 17 (1981) 69–74
35. Deutsches Institut für Normung e.V., Grundbegriffe der Meßtechnik (Fundamental Concepts of Metrology), DIN 1319-1 (1995), DIN 1319-2 (1980), DIN 1319-3 (1996), DIN 1319-4 (1999), Beuth Verlag GmbH, Berlin 30
36. ISO, International Organization for Standardization, Guide to the Expression of Uncertainty in Measurement, 1 Rue de Varembé, Case Postale 56, CH-1211 Geneva 20, Switzerland
37. Deutsches Institut für Normung e.V., Zahlenangaben (Presentation of Numerical Data), DIN 1333 (1992), Beuth Verlag GmbH, Berlin 30
Monographs
38. Cramér, H., Mathematical Methods of Statistics, Princeton University Press, Princeton, 1954
39. Kendall, M.G. and A. Stuart, The Advanced Theory of Statistics, Vols. I–III, Charles Griffin, London, 1960
40. Graybill, F.A., An Introduction to Linear Statistical Models, McGraw-Hill, New York, 1961
41. Weise, K. and W. Wöger, Meßunsicherheit und Meßdatenauswertung (Measurement Uncertainty and Measurement Data Evaluation), Wiley-VCH, Weinheim, 1999
42. Papoulis, A., Probability, Random Variables and Stochastic Processes, McGraw-Hill Kogakusha Ltd., Tokyo, 1965
43. Beckmann, P., Elements of Applied Probability Theory, Harcourt, Brace & World Inc., New York, 1968
44. Ku, H.H. (Ed.), Precision Measurement and Calibration, NBS Special Publication 300, Vol. 1, United States Department of Commerce, Washington, D.C., 1969
45. Eadie, W.T. et al., Statistical Methods in Experimental Physics, North Holland, Amsterdam, 1971
46. Clifford, A.A., Multivariate Error Analysis, Applied Science Publishers Ltd., London, 1973
47. Duda, R.O. and P.E. Hart, Pattern Classification and Scene Analysis, John Wiley & Sons, New York, 1973
48. Chao, L.L., Statistics: Methods and Analyses, McGraw-Hill Kogakusha Ltd., Tokyo, 1974
49. Seber, G.A.F., Linear Regression Analysis, John Wiley & Sons, New York, 1977
50. Fisz, M., Wahrscheinlichkeitsrechnung und mathematische Statistik (Probability Theory and Mathematical Statistics), VEB Deutscher Verlag der Wissenschaften, Berlin, 1978
51. Draper, N.R. and H. Smith, Applied Regression Analysis, John Wiley & Sons, New York, 1981
52. Strang, G., Linear Algebra and its Applications, Harcourt Brace Jovanovich College Publishers, New York, 1988
53. Ayres, F., Matrices, Schaum's Outline Series, McGraw-Hill, New York, 1974
54. Rao, C.R., Linear Statistical Inference and Its Applications, John Wiley & Sons, New York, 1973
Index
accuracy 4
analysis of variance 16, 58
arithmetic mean
  biased 21
  uncertainty of 63, 77
arithmetic means
  compatibility of 81

bias
  and least squares 102
  and minimized sum of squared residuals 104
  introduction of 13
  of arithmetic mean 22
  of least squares solution vector 102

central limit theorem 13
circle
  adjustment of 195
  uncertainty region of 200
comparability 52
confidence ellipsoid 42, 46
  in least squares 132
confidence interval 27, 63, 75
  in least squares 117
confidence level 63, 75
constants
  as active parts of measuring processes 80
  as arguments of relationships 80
conversion factor 65
  uncertainty of 69, 83
correlation coefficient
  empirical 29
  theoretical 32
covariance
  empirical, neglect of 31
  empirical 29, 35, 38
  empirical and Fisher's density 43
  theoretical 30, 35

density of air 12
deviation, systematic 14
difference between two arithmetic means 39
digits, insignificant 8
distribution density
  Chi-square 33
  empirical 19
  F 35
  Fisher's 41
  Hotelling's 41, 44, 261
  linear function of normal variables 239
  multidimensional, normal 31
  normal 25
  postulated 16
  postulated and biases 16
  standardized normal 27
  Student's 36, 257
  theoretical 19
  two-dimensional, normal 30
drift 11

ensemble, statistical 52, 102
environmental conditions 16
EP boundary 141
EPC hull 141
error bars 11
error calculus
  Gaussian 3, 16
  revision of 17
error combination
  arithmetical 23
  geometrical 17
error equation
  basic 14, 71
error propagation 23, 71
  over several stages 87
estimator 4, 26
expansion of solution vector 249
  circles 254
  parabolas 252
  planes 249
  straight lines 165
expectation 26, 51
  of arithmetic mean 54, 74
  of empirical covariance 57
  of empirical variance 54
  of estimators 52
  of non-linear functions 51
  of the arithmetic mean 63

fairness, mutual 5
Fisher–Behrens 40
fundamental constants of physics
  adjustment of 215
  localization of true values 216
  uncertainties of 224

Gauss–Markoff theorem 106
  breakdown of 107
  non-exhaustive properties 107
Gordian knot 127

Hotelling's ellipse 44, 47
Hotelling's ellipsoid 46
hypothesis testing 27

influence quantities 11
input data, consistency of 109
ISO Guide
  consequences of 127

key comparison 52, 205

least squares
  abolition of 127
  method of 127
least squares adjustment 93, 245
  and weighting 108
  constrained 100, 246
  unconstrained 97, 245
localization of true values
  ISO Guide 84, 120
  alternative error model 84
  in least squares 108

mass decades 210
matrix, rank of 233
mean of means, weighted 123
measurement uncertainty 4
measuring conditions, well-defined 17, 75
measuring process 11, 14
  stationary 4

national metrology institutes 5
normal density
  convergence of 27
  tails of 27, 63

outlier 27

pairwise comparison 215
parabola
  adjustment of 181
  uncertainty band of 183, 186, 190
parent distribution 20
piecewise smooth approximation 192
plane
  adjustment of 175
  uncertainty bowl of 179
platinum-iridium 12
points of attack 108
potatoes, convex 141
  slices of 141
primary standards 203
projection
  orthogonal 96, 241
prototype of the kilogram 210

quantiles 35
  Hotelling's density 47, 261

random errors 12
  origin of 12
random process 18
random variables
  introduction of 18
  realization of 18
  sum of 28, 32, 34, 38
repeat measurements 3, 18
reproducibility 4, 52, 66
residuals
  vector of 94, 96
residuals, sum of squared
  degrees of freedom of 127
  minimized 17, 95, 127
resolving power 18
round robin 81, 205
rounding
  of estimators 9
  of uncertainties 9

security polygons 133
security polyhedra 136
security polytopes 141
series expansion 71
  truncation of 72
SI units 64
simulation 84
state of statistical control 11
straight line
  adjustment of 151, 165
  for linearized function 172
  for transformed function 173
  uncertainty band of 158, 162, 169
string of parabolas, uncertainty band of 194
Student's density
  convergence of 37

time series 11
traceability 17, 209, 210
trigonometric polynomial
  adjustment of 224
  uncertainty band of 229
true value 3

uncertainty 4
  function of least squares estimators 128
  function of measurands 77
  of a mean of means 127
  of least squares estimators 108, 119
  of repeated measurements 63
  ppm 7
  relative 7
uncertainty spaces 130
unit
  defined 3
  realized 3
unknown systematic errors 3, 11
  confining interval 13
  estimation of 13
  of measurands 70
  of measured values 14
  propagation of 88, 117
  randomization of 13
  sources of 13
  status of 13
  varying of 15

variance–covariance matrix 30, 31, 235
  in least squares 113, 115

weighing scheme 210
weightings, consequences of 119
weights
  arbitrary factor in 109
  varying by trial and error 109
working standard
  realization of 203
  uncertainty of 205
worst-case estimation 13