

Miszellen

David Brenner/D.A.S. Fraser

The Identification of Distribution Form

Summary

A notion of distribution form with an intuitive interpretation has had a long presence in statistical writings: "the variable has a normal distribution", "the lifetime is Weibull". This paper examines this notion of distribution form and presents a formalization of it. For purposes of statistical modelling (Fraser, 1979) the objectivity of the distribution form is of central concern. Three criteria are examined which relate to this objectivity. Each is shown to require the same restriction on the determination of distribution form, namely, that the class of possible response presentations should have closure properties under composition, that is, be group-like. This paper examines the foundational support for the use of the transformation model investigated in Brenner and Fraser (1979).

1. Introduction

We examine the ubiquitous notion of distribution form as encountered throughout the statistical literature. As instances consider the following. A response variable is normally distributed with unknown location $\theta$ and known scaling $\sigma_0$. The distribution is approximately Student (6) with location $\mu$ and scaling $\sigma$. The response vector $y$ has location $X\beta$ relative to the design matrix $X$ with error that is approximately normal with mean $0$ and scaling $\sigma$.

Are the very clear references to the normal and the Student (6) in these examples just a convenience? Or do we have an element of substance, with clear aspects of objectivity for an investigator? We examine the notion of an identifiable distribution form and investigate three criteria that relate to its basic objectivity. Roughly expressed, these criteria are observability, samplability, and compatibility. Each leads to certain explicit closure properties among the presentation (or location) functions of the response variable, essentially that these form a group.

2. The Definition of Distribution Form

Consider a few simple examples. A response variable is normally distributed with unknown location $\theta$ and known scaling $\sigma_0$. For the system under investigation there is in fact just one distribution and we can refer to points on the range of this distribution: for example, the upper 10 % point, the median, the lower 10 % point. Correspondingly, to the degree that the model is valid, we can then refer to the same points in terms of "the normal distribution" or more specifically in terms of each of the possible normal distributions for the response.

Consider a distribution that is approximately Student (6) with location $\mu$ and scaling $\sigma$. For the system being investigated there is just one distribution and we can refer to points on its range as in the preceding example: the upper 10 % point, the median, the lower 10 % point. And again, to the degree that the model is valid, we can describe these same points in terms of "the Student (6) distribution" or in terms of each of the possible Student (6) response presentations.

As a third example consider a simple factor-analysis situation involving a bivariate response. We suppose that the background information supports a single linear factor with, say, a symmetric triangular distribution convolved with a rotationally symmetric Student (6) error distribution. For the response presentation we suppose that this can appear as any positive linear deformation. There is just one distribution for the application and to the degree of validity of the model we can refer to points on its range directly in terms of the same points for "the standard case" or in terms of the same points for each of the possible response possibilities.

We now formalize the notion of distribution form. Let $Y$ be the sample space for the response variable and let $\tau$ in $T$ be a parameter indexing the various possible response distributions. Let $P = \{p\}$ be a space of labels for referring to the points of the distribution, and correspondingly, for each $\tau \in T$, let $p_\tau : Y \to P$ be a one-one onto function that presents the label $p = p_\tau(y)$ for each point $y$ in the sample space of the $\tau$-distribution.

The preceding presents identifying labels for points on each of the possible distributions. It does not, however, tell us how the labels were engendered or derived from the physical ingredients of the problem.

Pursuing this, we now consider points in the response space and a means for "scanning" the various possible distributions. Let $Y_e$ be an examination space for this purpose and let $T = \{t\}$ now denote a class of one-one transformations $t : Y_e \to Y$; we scan possibilities on the response space $Y$ by examining them on $Y_e$. Appropriately the elements of $T$ will be referred to as scans.

Note that we are using the same symbol $T$ for both the parameter space and the class of scans, where a scan $t$ examines a distribution for which $\tau = t$. Of course the transformations in $T$ must necessarily embody the same continuity and differentiability characteristics and properties that are either implicit or explicit in the modelling of the particular physical system.


Finally we record the natural requirement for the class $T$: the class scans (or recognizes) the distribution form in the sense that

$$p_t \circ t : Y_e \to P$$

is a fixed function, $p$, independent of the particular $t$ in $T$. In this way $p = p_t \circ t$ is the point-labelling function when the various distributions are "scanned" back to the examination space.
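The requirement can be checked numerically for the first example of this section. The following is a minimal sketch, assuming scans of the form $t_\theta(u) = \theta + \sigma_0 u$ and labels given by cumulative probability; the names `scan`, `label`, `scanned_label`, `max_deviation` and the value of `SIGMA0` are illustrative choices, not taken from the paper.

```python
from statistics import NormalDist

SIGMA0 = 2.0  # known scaling; an illustrative value, not from the paper


def scan(theta):
    """Scan t_theta from the examination space to the response space."""
    return lambda u: theta + SIGMA0 * u


def label(theta):
    """Point-labelling function p_theta: response point -> cumulative probability."""
    return NormalDist(mu=theta, sigma=SIGMA0).cdf


def scanned_label(theta, u):
    """The composite p_t(t(u)) for the scan with location theta."""
    return label(theta)(scan(theta)(u))


def max_deviation(thetas, us):
    """Largest gap between p_t(t(u)) and the standard normal cdf p(u)."""
    p = NormalDist().cdf  # candidate fixed function p on the examination space
    return max(abs(scanned_label(th, u) - p(u)) for th in thetas for u in us)


if __name__ == "__main__":
    # The deviation should be at the level of floating-point rounding.
    print(max_deviation([-3.0, 0.0, 5.5], [-1.28, 0.0, 1.28]))
```

Under these assumptions the composite does not depend on the location, so the standard normal cumulative probability plays the role of the fixed function $p$.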

The preceding allows us to summarize the possibilities for the response distribution in terms of a distribution for form. Let $u$ be a variable for this distribution on $Y_e$. At this point we say that $u$ describes a nominal underlying variation for the system. Correspondingly, we can then write the response variable as $y = \tau u$ where $\tau$ in $T$ indexes the possible distributions.

The preceding can be presented in a more comprehensive set theoretic form. For this consider the Cartesian product $Y \times T$, the examination space $Y_e$, and the point-labelling space $P$. The $\tau$-section on $Y \times T$ is a copy of $Y$ and can be used to consider the $\tau$-distribution for the response $y$. The point-labelling function $p_\tau$ maps this section into the space $P$. On the compound space we can then consider the composite function

$$p : Y \times T \to P, \quad (y, \tau) \mapsto p_\tau y.$$

The preimage of $p$ is a partition of $Y \times T$. For convenience we use $P$ to label this partition; for a point say $p'$ in $P$, the correspondingly labelled fibre is then given by

$$p^{-1}(p') = \{(y, t) : y = p_t^{-1} p'\} = \{(tu, t) : u = p^{-1} p'\}.$$

Now consider the preceding for a sample $\mathbf{y} = (y_1, \dots, y_n)$ from a distribution with $\tau$ in $T$. For this we have the Cartesian product $Y^n \times T$, the examination space $Y_e^n$, and the point-labelling space $P^n$. The $\tau$-section on $Y^n \times T$ is a copy of $Y^n$ and can be used to consider the distribution for the sample response $\mathbf{y}$. The point-labelling function $p_\tau$ maps this section into the space $P^n$. On the full compound space we then have the composite function

$$p : Y^n \times T \to P^n, \quad (\mathbf{y}, \tau) \mapsto p_\tau \mathbf{y} = (p_\tau y_1, \dots, p_\tau y_n).$$

The preimage of $p$ is a partition of $Y^n \times T$ and we label this partition by $P^n$ and label the fibres or sets by the points in $P^n$. Let $p'$ be a point in $P^n$; the corresponding fibre is then given by

$$p^{-1}(p') = \{(\mathbf{y}, t) : \mathbf{y} = p_t^{-1} p', \ t \in T\} = \{(t\mathbf{u}, t) : \mathbf{u} = p^{-1} p', \ t \in T\}.$$

3. Observability of Distribution Form

Our discussion of the scans, $t$, in the preceding section implicitly assumed that they provided a means to observe distribution form, essentially that distribution form was objective. For the moment then we will speak of nominal distribution form and consider criteria for it to be real or objective. In this section we examine a criterion in terms of observability.

Consider the various response distributions $y = \tau u$ as obtained by the scans $\tau$ from the variable $u$ for nominal distribution form. In this section we centre our concern on the observability of distribution form: can an unknown distribution for $y = \tau u$ be examined on $Y_e$ in a way independent of the unknown $\tau$; in other words, can the alleged form be examined objectively?

What does the collection $T$ of scans present on the examination space $Y_e$? If the response distribution is $\tau$ then the point-labelling functions re-presented on the examination space by the scans, $t$, give

$$H_\tau = \{p_\tau t : t \in T\}.$$

This is a collection of point-labelling functions from the examination space $Y_e$ to the labelling space $P$. For the application under investigation the particular value for $\tau$ is unknown and observability requires that what we see be independent of the unknown $\tau$.

Definition 1. The class $T = \{t\}$ of scans is a platform (1) for distribution form if $H_\tau$ is independent of $\tau$ in $T$.

In effect a platform gives a coherent picture of distribution form independent of the particular presentation $\tau$.

We examine the consequences of a class of scans being a platform (1). The condition

$$H_\tau = \{p_\tau t : t \in T\} = H, \quad \tau \in T,$$

can be rewritten (using $p_\tau \tau = p$) as

$$H_\tau = \{p \tau^{-1} t : t \in T\} = H, \quad \tau \in T.$$


As $p$ is a fixed bijection the preceding is equivalent to

$$G = \{\tau^{-1} t : t \in T\}, \quad \tau \in T,$$

where $G$ designates a fixed collection of functions. It follows routinely then that $G = T^{-1}T$ is a group of transformations on the examination space $Y_e$ and equivalently that

$$G = TT^{-1}$$

is a group of transformations on the response space $Y$. The reverse analysis is obvious. We summarize the preceding in the following lemma.

Lemma 1. A class $T$ of scans is a platform (1) if and only if $T^{-1}T$ is a transformation group $G$ on the examination space $Y_e$.
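The condition behind Lemma 1 can be tried out on a toy finite case. The sketch below assumes, purely for illustration, that the examination space and the response space are both the three-point set {0, 1, 2} and that scans are permutations stored as tuples; all names are hypothetical. It checks the condition in its rewritten form, namely that the set $\{\tau^{-1}t : t \in T\}$ is the same for every $\tau$ in $T$.

```python
def compose(s, t):
    """Return the permutation s∘t (apply t first, then s)."""
    return tuple(s[t[i]] for i in range(len(t)))


def inverse(t):
    """Return the inverse of the permutation t."""
    inv = [0] * len(t)
    for i, j in enumerate(t):
        inv[j] = i
    return tuple(inv)


def relative_scans(tau, scans):
    """The set {tau^{-1} t : t in scans} of maps on the examination space."""
    tau_inv = inverse(tau)
    return frozenset(compose(tau_inv, t) for t in scans)


def is_platform_1(scans):
    """True if {tau^{-1} t : t in T} is the same set for every tau in T."""
    return len({relative_scans(tau, scans) for tau in scans}) == 1


def is_group(maps):
    """Closure under composition and inversion for a finite set of permutations."""
    closed = all(compose(a, b) in maps for a in maps for b in maps)
    return closed and all(inverse(a) in maps for a in maps)


IDENTITY = (0, 1, 2)
SWAP = (1, 0, 2)    # exchanges the points 0 and 1
ROTATE = (1, 2, 0)  # sends 0 -> 1 -> 2 -> 0

# A class of the form tau0 G with G = {identity, swap}: the condition holds.
coset = {compose(ROTATE, g) for g in (IDENTITY, SWAP)}
# A class that is not of this form: the condition fails.
not_coset = {IDENTITY, ROTATE}

if __name__ == "__main__":
    print(is_platform_1(coset), is_group(relative_scans(ROTATE, coset)))
    print(is_platform_1(not_coset))
```

In the first class the common set is the two-element group generated by the swap; in the second class the set seen from the identity differs from the set seen from the rotation, so what is observed would depend on the unknown $\tau$.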

4. Samplability of Distribution Form

We continue with our examination of nominal distribution form and investigate a criterion involving the ability to directly sample distribution form.

As a first consideration we note that the ability to sample distribution form requires that the class $T$ of scans $t$ or of distributions $\tau$ cannot be too large. This prevents multiple presentation possibilities standing between the examination space and nominal form. To avoid this elusiveness, we introduce the following assumption.

Assumption 1. For $n \geq n_0$ there is at most one $t$ in $T$ satisfying an equation

$$(y_1, \dots, y_n) = t(u_1, \dots, u_n).$$
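As an illustration of Assumption 1, consider location-scale scans $t(u) = a + cu$ with $c > 0$; this family and the function name `recover_scan` are assumptions made for the sketch, not part of the paper. Two coordinates with distinct $u$-values pin down $(a, c)$, so $n_0 = 2$ would serve whenever the $u$-values are not all equal.

```python
def recover_scan(y, u, tol=1e-9):
    """Find (a, c) with c > 0 such that y_i = a + c * u_i for every i.

    Returns the pair when exactly one such scan exists, and None when there
    is no such scan or when the points are too few to determine it.
    """
    if len(y) != len(u):
        raise ValueError("y and u must have the same length")
    # Two coordinates with distinct u-values are needed to fix both a and c.
    pair = next(
        ((i, j) for i in range(len(u)) for j in range(i + 1, len(u)) if u[i] != u[j]),
        None,
    )
    if pair is None:
        return None
    i, j = pair
    c = (y[i] - y[j]) / (u[i] - u[j])
    a = y[i] - c * u[i]
    if c <= 0:
        return None
    # The candidate scan must reproduce every coordinate of the sample.
    if all(abs(a + c * ui - yi) <= tol for yi, ui in zip(y, u)):
        return (a, c)
    return None


if __name__ == "__main__":
    print(recover_scan([1.0, 3.0, 7.0], [-1.0, 0.0, 2.0]))  # one scan fits
    print(recover_scan([5.0], [1.0]))  # n = 1: the scan is not determined
```

With a single coordinate every positive scale admits a matching location, which is the kind of multiplicity the assumption rules out.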

Now consider the assessment of a response sample $\mathbf{y} = (y_1, \dots, y_n)$. Does it produce a sample from distribution form?

For an alleged presentation value $\tau$ the examination space point is $\tau^{-1}\mathbf{y}$. The possible samples that could then appear on the response space are given by

$$T_\tau(\mathbf{y}) = \{t\tau^{-1}\mathbf{y} : t \in T\}.$$

For this to be independent of the alleged $\tau$ we require $T_\tau(\mathbf{y}) = T(\mathbf{y})$ to be independent of $\tau$ in $T$.

Definition 2. The class $T = \{t\}$ of scans is a platform (2) for distribution form if for $n \geq n_0$ the class of samples $T_\tau(\mathbf{y})$ is independent of $\tau$ in $T$.


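We examine what follows from the condition that a collection of scans is a platform (2). From the definition it follows that for $n \geq n_0$

$$\{t\tau_1^{-1}\mathbf{y} : t \in T\} = \{t\tau_2^{-1}\mathbf{y} : t \in T\}.$$

This equality implies that for each $t$ there corresponds a unique $\tilde{t}_n(t : \mathbf{y})$ in $T$ such that

$$t\tau_1^{-1}\mathbf{y} = \tilde{t}_n(t : \mathbf{y})\,\tau_2^{-1}\mathbf{y}.$$

Now we consider a sample size $n + 1$ with the first $n$ coordinates given by the preceding $\mathbf{y}$ and an additional coordinate say $y$. Analogously we then obtain $\tilde{t}_{n+1}(t : \mathbf{y}, y)$ such that

$$t\tau_1^{-1}(\mathbf{y}, y) = \tilde{t}_{n+1}(t : \mathbf{y}, y)\,\tau_2^{-1}(\mathbf{y}, y)$$

or equivalently

$$t\tau_1^{-1}\mathbf{y} = \tilde{t}_{n+1}(t : \mathbf{y}, y)\,\tau_2^{-1}\mathbf{y},$$

$$t\tau_1^{-1}y = \tilde{t}_{n+1}(t : \mathbf{y}, y)\,\tau_2^{-1}y.$$

But the first equation, together with the uniqueness in Assumption 1, gives

$$\tilde{t}_{n+1}(t : \mathbf{y}, y) = \tilde{t}_n(t : \mathbf{y})$$

and thus

$$t\tau_1^{-1}y = \tilde{t}_n(t : \mathbf{y})\,\tau_2^{-1}y$$

for all $y$. This last implies that $t\tau_1^{-1} = \tilde{t}_n(t : \mathbf{y})\,\tau_2^{-1}$ with $\tilde{t}_n$ unique for given $\tau_1$, $\tau_2$. It follows that $TT^{-1} = G$ is a group of transformations on the response space $Y$. The reverse analysis is obvious.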

We now summarize this in the following lemma.

Lemma 2. The class $T = \{t\}$ of scans is a platform (2) if and only if $TT^{-1}$ is a transformation group $G$ on the response space $Y$.


5. Event Compatibility of Sample Information

We now examine whether sample information provides event-type information concerning the variation. It has been noted, for example by Fraser (1976, Section 4.1), that information need not be of the pure event form that supports conditional probability. Some recent discussion may be found in Brenner and Fraser (1979). Essentially, the requirement that information be of event type is that its production be equivalent to obtaining an observed value of a well defined function. We examine this requirement in the present context of nominal distribution form.

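Consider the information provided by a response sample $\mathbf{y}$. For this we use the set theoretic framework recorded at the end of Section 2. Let $S(\mathbf{y})$ be the corresponding set of possible vectors for the variation on $Y_e^n$,

$$S(\mathbf{y}) = \{\tau^{-1}\mathbf{y} : \tau \in T\}.$$

A vector $\mathbf{u}$ on $Y_e^n$ corresponds to a fibre

$$V(\mathbf{u}) = \{(t\mathbf{u}, t) : t \in T\}$$

and in particular the vector $\mathbf{u} = \tau^{-1}\mathbf{y}$ corresponds to

$$V(\tau^{-1}\mathbf{y}) = \{(t\tau^{-1}\mathbf{y}, t) : t \in T\}.$$

In this way we can then represent $S(\mathbf{y})$ as a set $\mathcal{P}(\mathbf{y})$ of fibres on $Y^n \times T$:

$$\mathcal{P}(\mathbf{y}) = \{V(\tau^{-1}\mathbf{y}) : \tau \in T\}.$$

We have however that the fibres form a partition of $Y^n \times T$; accordingly we can represent $\mathcal{P}(\mathbf{y})$ equivalently as the union of its components

$$\mathcal{P}(\mathbf{y}) = \bigcup_{\tau \in T} V(\tau^{-1}\mathbf{y}) = \{(t_2 t_1^{-1}\mathbf{y}, t_2) : t_i \in T\}.$$

Summarizing the preceding we note that a response sample $\mathbf{y}$ identifies the set

$$\mathcal{P}(\mathbf{y}) = \{(t_2 t_1^{-1}\mathbf{y}, t_2) : t_i \in T\}$$

on the compound space $Y^n \times T$ as the set of possibilities for the variation.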


We now investigate the mapping from $\mathbf{y}$ to $\mathcal{P}(\mathbf{y})$, from a point to a set in the space $Y^n \times T$. In particular, is the mapping of the pure type discussed at the beginning of this section; specifically, is $\mathcal{P}$ a canonical projection to a partition on $Y^n \times T$?

Definition 3. The class $T$ of scans is a platform (3) for distribution form if $\mathcal{P}(\cdot)$ is a canonical projection to a partition.

In effect a platform (3) provides usable event-type information concerning the variation.
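The partition requirement of Definition 3 can be explored on a toy finite case. The sketch below assumes a three-point response space {0, 1, 2}, scans given as permutations acting coordinatewise on samples of size $n = 2$, and hypothetical function names; it does not address Assumption 1. It builds the set of possibilities $\{(t_2 t_1^{-1}\mathbf{y}, t_2) : t_i \in T\}$ for each sample and checks whether distinct sets are disjoint.

```python
from itertools import product


def compose(s, t):
    """Return the permutation s∘t (apply t first, then s)."""
    return tuple(s[t[i]] for i in range(len(t)))


def inverse(t):
    """Return the inverse of the permutation t."""
    inv = [0] * len(t)
    for i, j in enumerate(t):
        inv[j] = i
    return tuple(inv)


def act(t, y):
    """Apply the scan t to each coordinate of the sample vector y."""
    return tuple(t[yi] for yi in y)


def possibilities(y, scans):
    """The set {(t2 t1^{-1} y, t2) : t1, t2 in scans} identified by y."""
    return frozenset(
        (act(compose(t2, inverse(t1)), y), t2) for t1 in scans for t2 in scans
    )


def is_platform_3(scans, n=2, points=(0, 1, 2)):
    """True if the sets identified by the samples are pairwise equal or disjoint."""
    # Each set contains the whole section through its own y (take t1 = t2), so the
    # sets cover the compound space; a partition only needs distinct sets disjoint.
    cells = list({possibilities(y, scans) for y in product(points, repeat=n)})
    return all(a.isdisjoint(b) for i, a in enumerate(cells) for b in cells[i + 1:])


IDENTITY = (0, 1, 2)
SWAP = (1, 0, 2)    # exchanges the points 0 and 1
ROTATE = (1, 2, 0)  # sends 0 -> 1 -> 2 -> 0

if __name__ == "__main__":
    print(is_platform_3({IDENTITY, SWAP}))    # a group of scans
    print(is_platform_3({IDENTITY, ROTATE}))  # not closed under composition
```

For the two-element group the sets are orbits crossed with the class of scans and so form a partition; for the second class the sets identified by the samples (0, 0) and (1, 1) overlap without coinciding.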

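We examine the consequences of a class of scans being a platform (3). First we note that the section $\mathbf{y} \times T$ is contained in $\mathcal{P}(\mathbf{y})$. This follows trivially by forming the subset of $\mathcal{P}(\mathbf{y})$ with $t_1 = t_2$.

Now consider a section $\mathbf{y}' \times T$ that has a nontrivial intersection with $\mathcal{P}(\mathbf{y})$. Then

$$\mathcal{P}(\mathbf{y}') \cap \mathcal{P}(\mathbf{y}) \neq \emptyset$$

and then from the partition property in Definition 3 we have $\mathcal{P}(\mathbf{y}') = \mathcal{P}(\mathbf{y})$. In particular it follows that $\mathcal{P}(\mathbf{y})$ is a union of sections $\mathbf{y}' \times T$.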

The preceding can be presented more compactly by letting $\pi$ be the projection of $Y^n \times T$ on $Y^n$:

$$\mathcal{P}(\mathbf{y}) = \pi^{-1} T(\mathbf{y})$$

where

$$T(\mathbf{y}) = \{t_1 t_2^{-1}\mathbf{y} : t_i \in T\}$$

is the projection of $\mathcal{P}(\mathbf{y})$ on $Y^n$. This notation allows us to note that $T_\tau(\mathbf{y})$ in Section 4 is the $\pi$ projection of

$$\{(t\tau^{-1}\mathbf{y}, t) : t \in T\} = V(\tau^{-1}\mathbf{y}).$$

But $\mathcal{P}(\mathbf{y})$ is both a union of sets $V(\tau^{-1}\mathbf{y})$ and is composed of sections $\mathbf{y}' \times T$. It follows that $T_\tau(\mathbf{y})$ is independent of $\tau$. We can then use the results from Section 4 and conclude that $TT^{-1} = G$ is a group of transformations on the response space $Y$. The reverse analysis is obvious.

We can summarize the preceding as a lemma.

Lemma 3. The class $T = \{t\}$ of scans is a platform (3) if and only if $TT^{-1}$ is a transformation group $G$ on the response space $Y$.


References

[1] Brenner, David and Fraser, D.A.S. (1979): On foundations for conditional probability with statistical models - when is a class of functions a function. Statistische Hefte 20, 148-159.

[2] Fraser, D.A.S. (1976): Probability and Statistics, Theory and Applications, Toronto, University of Toronto Textbook Store.

[3] Fraser, D.A.S. (1979): Inference and Linear Models, New York City, McGraw Hill.
