

  • Technical Publication TP-72-25

    WHAT DISTINGUISHES AN INDEPENDENTLY

    OBSERVED VECTOR FROM AN ESTIMATED

    MULTIVARIATE NORMAL POPULATION?

    By

    A. C. BITTNER, JR.

    Systems Integration Division

    3 April 1972

    APPROVED FOR PUBLIC RELEASE; DISTRIBUTION UNLIMITED.

    NAVAL MISSILE CENTER

    Point Mugu, California


  • NAVAL MISSILE CENTER

    AN ACTIVITY OF THE NAVAL AIR SYSTEMS COMMAND

    E. E. IRISH, CAPT USN
    Commanding Officer

    D. F. SULLIVAN
    Technical Director

    CDR W. H. Nelson, Head, Human Factors Engineering Branch; Mr. E. P. Olsen, Head, Systems Integration Division; and Mr. J. J. O'Brien, Associate Laboratory Officer, have reviewed this report for publication.

    Technical Publication TP-72-25

    Published by: Editorial Branch, Technical Publications Division, Photo/Graphics Department

    Security classification: UNCLASSIFIED

    First printing: 145 copies

  • CONTENTS

    SUMMARY

    GLOSSARY

    INTRODUCTION

    APPROACH

    DERIVATIONS

    DISCUSSION

    REFERENCES

    APPENDIX: Two Working Theorems

  • SUMMARY

    Once a researcher has determined that a multivariate observation $x_0$ is different from an estimated population, he still has an unanswered question. He wants to know, "How is it different?" A method for answering this question is considered in this report.

    Two results are shown and both involve the estimated population parameters $\bar{x}$, the estimated mean, and $\hat{\Sigma}$, the estimated covariance matrix. The first result answers the question "Which linear combinations of the elements of the difference $x_0 - \bar{x}$ are significant?" The second answers the question "Which elements of the difference $x_0 - \bar{x}$ are significant?" Both results are derived using S. N. Roy's Union-Intersection Principle; hence, one can set an overall $\alpha$ (significance, type-I) level for the totality of tests.

    A brief discussion of an application is also presented.

    Publication Unclassified.

    Approved for public release; distribution unlimited.

  • GLOSSARY

    $x_0$    A p-component (p by 1) vector observation, "o", independent of the vectors $x_1, x_2, \ldots, x_n$

    $n$    The number of independent observations which are independent of the vector $x_0$

    $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$    A p-component (p by 1) estimate of the population mean vector $\mu$

    $\mu$    A p-variate (p by 1) population mean vector

    $\Sigma$    A p by p population covariance matrix

    $\hat{\Sigma}$    An unbiased estimate of the population covariance matrix based on m degrees of freedom

    $\hat{\Sigma}^{-1}$    The inverse of the matrix $\hat{\Sigma}$

    $\alpha$    The probability that the (null) hypothesis $H_0$ will be rejected when it is true

    $H_0$    The hypothesis that the vector $x_0$ and the set of vectors $x_1, x_2, \ldots, x_n$ are from the same population

    $H_1$    The negation of the hypothesis $H_0$

    $N(\mu, \Sigma)$    The multivariate normal population with parameters $\mu$ and $\Sigma$

    $m$    The number of degrees of freedom of the matrix estimate $\hat{\Sigma}$

    $t$    A "Student's" t-test statistic

    $c$    A (1 by p) p-variate vector of constant coefficients

    $c^t$    The p by 1 transpose of the vector $c$

    $\hat{c}$    The (1 by p) p-variate vector which maximizes the function $T^2(c)$

    $e_i$    A 1 by p vector which has a 1 as the i-th element and zeros elsewhere

    $\partial T^2(c)/\partial c$    A p-variate (p by 1) vector of partial derivatives whose i-th component is $\partial T^2(c)/\partial c_i$

    $0$    A p by 1 vector which has zeros for all the elements

    $s_i^2$    An unbiased estimate of the variance of the i-th variate from $N(\mu, \Sigma)$

    $D$    A matrix conformal with $x$

  • INTRODUCTION

    In a previous report by the author (reference 1), a probability density/distribution function was developed for an observed (p by 1) vector ($x_0$) from a multivariate normal distribution with estimated parameters (see A-1).* This result enables a researcher to ask the general question "How unlikely is it that the observation $x_0$ arose from $N(\mu, \Sigma)$, where $\mu$ is estimated by $\bar{x}$ and $\Sigma$ is estimated by $\hat{\Sigma}$?" Specifically, the statistic

    $$T^2 = \frac{n}{n+1}\,(x_0 - \bar{x})^t\,\hat{\Sigma}^{-1}\,(x_0 - \bar{x}) \tag{1}$$

    which is distributed as $\frac{mp}{m-p+1}F$ with p and m-p+1 degrees of freedom (df), can be compared with tables of the F distribution for probability of occurrence. If the probability of occurrence is less than some specified amount ($\alpha$), then the hypothesis ($H_0$) that $x_0$ is from the same population that generated $\bar{x}$ could be rejected. In other words, the hypothesis ($H_1$) that the populations which generated $x_0$ and $\bar{x}$ are not the same would be accepted.
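
    As a minimal sketch of how statistic (1) might be evaluated, the Python below computes $T^2$ and compares it with the scaled F criterion. The function names (`solve`, `t2_statistic`, `reject_h0`), the sample numbers, and the choice to pass a tabled F value in as an argument are illustrative assumptions, not part of the original report.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination (partial pivoting)."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def t2_statistic(x0, xbar, sigma_hat, n):
    """Statistic (1): n/(n+1) * (x0 - xbar)^t sigma_hat^{-1} (x0 - xbar)."""
    d = [a - b for a, b in zip(x0, xbar)]
    w = solve(sigma_hat, d)  # w = sigma_hat^{-1} (x0 - xbar)
    return n / (n + 1) * sum(di * wi for di, wi in zip(d, w))


def reject_h0(t2, p, m, f_crit):
    """Compare T^2 with mp/(m-p+1) times a tabled upper-alpha F value (p and m-p+1 df)."""
    return t2 > m * p / (m - p + 1) * f_crit


# Hypothetical inputs: p = 2 variates, n = 10 observations, m = 9 degrees of freedom.
x0 = [3.0, 1.0]
xbar = [1.0, 0.0]
sigma_hat = [[2.0, 1.0], [1.0, 2.0]]
t2 = t2_statistic(x0, xbar, sigma_hat, n=10)
# 4.46 is the tabled upper 5 percent point of F with 2 and 8 df.
decision = reject_h0(t2, p=2, m=9, f_crit=4.46)
print(t2, decision)
```

    Solving the linear system instead of forming the inverse explicitly is a design choice of this sketch; either route gives the same quadratic form.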

    The rejection of the hypothesis that $x_0$ and $\bar{x}$ are from the same population ($H_0$) doesn't show which of the variates of $x_0$ were significantly different. Although individual tests of significance could be constructed (e.g., t-tests), there is no control of the overall $\alpha$ level for the set of comparisons. The reasons for this are twofold: there are p such tests, and the variates are generally correlated. Hence an approach is needed for testing the individual variates of $x_0$ while controlling the overall $\alpha$-level. The purpose of the present development is to delineate such a procedure.

    *The results A-1 and A-2 are working theorems which are given in the appendix.


  • APPROACH

    The approach employed here uses S. N. Roy's Union-Intersection Principle (references 2 and 3). This principle allows one to fix the overall significance level for the totality of tests of linear compounds of the difference $x_0 - \bar{x}$. Operationally, this is accomplished by applying the same ($\alpha$-level) criterion required for the test of the most unlikely linear compound, $\hat{c}(x_0 - \bar{x})$, to all particular tests of linear compounds. Since the individual variates $x_{0i} - \bar{x}_i$ (i = 1, ..., p) can be tested by the linear compounds $e_i(x_0 - \bar{x})$ (i = 1, ..., p), this procedure will yield a solution to the problem posed in the Introduction.*

    Let us define the statistic $T^2(c)$ as follows:

    $$T^2(c) = \frac{n}{n+1}\,[c(x_0 - \bar{x})]^t\,[c\,\hat{\Sigma}\,c^t]^{-1}\,[c(x_0 - \bar{x})] \tag{2}$$

    where $c$ is a fixed 1 by p vector. This equation is a special case of equation (1) with the p-variate terms $x_0$, $\bar{x}$, and $\hat{\Sigma}$ replaced by the corresponding univariate terms $c x_0$, $c\bar{x}$, and $c\hat{\Sigma}c^t$. These univariate terms are those appropriate for linear compounds of the respective variates (see A-2). Under the hypothesis ($H_0$) that $x_0$ is from the same population that generated $\bar{x}$, equation (2) is distributed as F with 1 and m degrees of freedom. Hence, the most unlikely $T^2(c)$ value would occur for that $c$ which maximizes (2).
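
    Equation (2) reduces to a ratio of two scalars, which the following sketch evaluates directly. The function name `t2_linear_compound` and the numerical inputs are hypothetical choices made for illustration only.

```python
def t2_linear_compound(c, x0, xbar, sigma_hat, n):
    """Equation (2): n/(n+1) * [c (x0 - xbar)]^2 / (c sigma_hat c^t) for a 1 by p vector c."""
    p = len(c)
    # Scalar linear compound of the difference vector.
    num = sum(c[i] * (x0[i] - xbar[i]) for i in range(p))
    # Estimated variance of the linear compound, c sigma_hat c^t.
    var = sum(c[i] * sigma_hat[i][j] * c[j] for i in range(p) for j in range(p))
    return n / (n + 1) * num * num / var


# Hypothetical inputs (p = 2, n = 10).
x0 = [3.0, 1.0]
xbar = [1.0, 0.0]
sigma_hat = [[2.0, 1.0], [1.0, 2.0]]

t2_first = t2_linear_compound([1.0, 0.0], x0, xbar, sigma_hat, 10)    # first variate alone
t2_diff = t2_linear_compound([1.0, -1.0], x0, xbar, sigma_hat, 10)    # difference of variates
t2_scaled = t2_linear_compound([3.0, -3.0], x0, xbar, sigma_hat, 10)  # rescaled coefficients
print(t2_first, t2_diff, t2_scaled)
```

    Rescaling $c$ leaves the value unchanged, so only the direction of $c$ matters when searching for the maximizing compound.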

    DERIVATIONS

    In the following, a theorem will be stated which contains both the conditions for maximizing $T^2(c)$ and a procedure for testing the significance of the totality of linear compounds with a fixed overall significance level. Consider the following:

    Theorem 1.0. If $T^2(c)$ is defined as in equation (2), then its maximum value is

    $$T^2(\hat{c}) = \frac{n}{n+1}\,(x_0 - \bar{x})^t\,\hat{\Sigma}^{-1}\,(x_0 - \bar{x}) \tag{3}$$

    which is distributed as $\frac{mp}{m-p+1}F$ with p and m-p+1 degrees of freedom when $H_0$ is true. This value of $T^2(\hat{c})$ is obtained for

    $$\hat{c} = (x_0 - \bar{x})^t\,\hat{\Sigma}^{-1} \tag{4}$$

    *$e_i$ (i = 1, ..., p) is a 1 by p vector with a 1 in the i-th entry and zeros elsewhere.


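  • Further, the totality of linear compounds of the form $c(x_0 - \bar{x})$ could be tested for significance at the overall $\alpha$-level by the test:

    $$T^2(c) \underset{H_0}{\overset{H_1}{\gtrless}} \frac{mp}{m-p+1}\,F_{\alpha:p,\,m-p+1} \tag{5}$$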
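
    The maximizing property stated in Theorem 1.0 can be checked numerically. The sketch below forms $\hat{c}$ from equation (4), evaluates $T^2(\hat{c})$, and confirms that randomly drawn coefficient vectors never exceed it. All names, the sample values, the random search, and the tabled F value 4.46 (upper 5 percent point for 2 and 8 df) are assumptions introduced for illustration.

```python
import random


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def t2_linear_compound(c, x0, xbar, sigma_hat, n):
    """Equation (2): n/(n+1) * [c (x0 - xbar)]^2 / (c sigma_hat c^t)."""
    p = len(c)
    num = sum(c[i] * (x0[i] - xbar[i]) for i in range(p))
    var = sum(c[i] * sigma_hat[i][j] * c[j] for i in range(p) for j in range(p))
    return n / (n + 1) * num * num / var


# Hypothetical inputs: p = 2, n = 10, m = 9.
n, m, p = 10, 9, 2
x0 = [3.0, 1.0]
xbar = [1.0, 0.0]
sigma_hat = [[2.0, 1.0], [1.0, 2.0]]

# Equation (4): c_hat = (x0 - xbar)^t sigma_hat^{-1}; sigma_hat is symmetric,
# so the entries equal those of sigma_hat^{-1} (x0 - xbar).
d = [a - b for a, b in zip(x0, xbar)]
c_hat = solve(sigma_hat, d)
t2_max = t2_linear_compound(c_hat, x0, xbar, sigma_hat, n)

# Random coefficient vectors should never give a larger T^2(c).
rng = random.Random(0)
trials = [
    t2_linear_compound([rng.uniform(-1.0, 1.0) for _ in range(p)], x0, xbar, sigma_hat, n)
    for _ in range(1000)
]

# Test (5): every compound is judged against the same criterion.
criterion = m * p / (m - p + 1) * 4.46
significant = t2_max > criterion
print(c_hat, t2_max, max(trials), significant)
```

    Because every compound is compared with the criterion that applies to the largest one, no compound can be declared significant unless $T^2(\hat{c})$ itself exceeds the criterion.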

  • There will be one nonzero eigenvalue of (10) since the rank of $(x_0 - \bar{x})(x_0 - \bar{x})^t$ is one and the product of it and any conformal nonsingular matrix (e.g., $\hat{\Sigma}^{-1}$) would have the same rank.* Since the trace (tr) of (10) equals the sum of its eigenvalues (of which there is exactly one, $T^2(\hat{c})$, that is nonzero), it follows that

    $$T^2(\hat{c}) = \mathrm{tr}\left[\frac{n}{n+1}\,\hat{\Sigma}^{-1}\,(x_0 - \bar{x})(x_0 - \bar{x})^t\right] \tag{11}$$

    This can, by the commutative laws for traces,** be rewritten

    $$T^2(\hat{c}) = \frac{n}{n+1}\,\mathrm{tr}\left[(x_0 - \bar{x})^t\,\hat{\Sigma}^{-1}\,(x_0 - \bar{x})\right] \tag{12}$$

    $$= \frac{n}{n+1}\,(x_0 - \bar{x})^t\,\hat{\Sigma}^{-1}\,(x_0 - \bar{x}) \tag{13}$$

    which is the first result (3) of Theorem 1.0. The distribution of (3) or (13) is given by A-1; hence, the first result of the theorem is completed.
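
    The trace argument can be illustrated numerically for a small case. The sketch below builds the rank-one matrix appearing inside the trace in (11), then checks that its trace equals the quadratic form in (13) and that its determinant vanishes. The helper names and numerical inputs are hypothetical, and the determinant check is written for p = 2 only.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def inverse(a):
    """Invert a square matrix column by column using solve()."""
    n = len(a)
    cols = [solve(a, [1.0 if i == j else 0.0 for i in range(n)]) for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]


# Hypothetical inputs: p = 2, n = 10, d = x0 - xbar.
n, p = 10, 2
d = [2.0, 1.0]
sigma_hat = [[2.0, 1.0], [1.0, 2.0]]
scale = n / (n + 1)

inv = inverse(sigma_hat)
outer = [[di * dj for dj in d] for di in d]  # (x0 - xbar)(x0 - xbar)^t, rank one
mat = [
    [scale * sum(inv[i][k] * outer[k][j] for k in range(p)) for j in range(p)]
    for i in range(p)
]

trace = sum(mat[i][i] for i in range(p))
det = mat[0][0] * mat[1][1] - mat[0][1] * mat[1][0]  # zero because the rank is one
quad = scale * sum(d[i] * inv[i][j] * d[j] for i in range(p) for j in range(p))
print(trace, quad, det)
```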

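    The second result of this theorem (4) follows upon substitution of the value of $\hat{c}$ given in equation (4) for $c$ in equation (6). This yields a $T^2(c)$ value of

    $$T^2(\hat{c}) = \frac{n}{n+1}\,\frac{\left[(x_0 - \bar{x})^t\,\hat{\Sigma}^{-1}\,(x_0 - \bar{x})\right]^2}{(x_0 - \bar{x})^t\,\hat{\Sigma}^{-1}\,\hat{\Sigma}\,\hat{\Sigma}^{-1}\,(x_0 - \bar{x})} \tag{14}$$

    or equivalently

    $$T^2(\hat{c}) = \frac{n}{n+1}\,(x_0 - \bar{x})^t\,\hat{\Sigma}^{-1}\,(x_0 - \bar{x}) \tag{15}$$

    Because this corresponds to the maximum value of $T^2(c)$, equation (4) gives the desired vector.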

    The last portion of the theorem can be seen when one recalls the nature of the Union-Intersection principle. Specifically, by employing the significance criterion ($\alpha$-level) of $T^2(\hat{c})$ for each test of the type $T^2(c)$, one is assured that the totality of such tests has a joint significance level of $\alpha$. Since $\frac{mp}{m-p+1}F_{\alpha:p,\,m-p+1}$ is this criterion, as indicated by the first part of the theorem and A-1, the general test (5) follows directly.

    *See reference 4 for discussion of the rank of the product of two matrices.

    **Reference 4 also contains a discussion of various results concerning the traces of matrices.


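  • Tests for the individual components can be derived from specialization of equation (5). Since testing a specific component (the i-th) for significance is equivalent to testing $T^2(e_i)$ for significance, one could substitute $e_i$ for $c$ in equation (5) and obtain a specialized result. This will be considered in the remainder of this section.

    Substitution of $e_i$ for $c$ in equation (5) yields

    $$T^2(e_i) \underset{H_0}{\overset{H_1}{\gtrless}} \frac{mp}{m-p+1}\,F_{\alpha:p,\,m-p+1} \tag{16}$$

    which by equation (2) is equivalent to

    $$\frac{n}{n+1}\,[e_i(x_0 - \bar{x})]^t\,[e_i\,\hat{\Sigma}\,e_i^t]^{-1}\,[e_i(x_0 - \bar{x})] \underset{H_0}{\overset{H_1}{\gtrless}} \frac{mp}{m-p+1}\,F_{\alpha:p,\,m-p+1} \tag{17}$$

    Multiplying out each of the bracketed terms,

    $$\frac{n}{n+1}\,[x_{0i} - \bar{x}_i]\,\{[\hat{\Sigma}]_{ii}\}^{-1}\,[x_{0i} - \bar{x}_i] \underset{H_0}{\overset{H_1}{\gtrless}} \frac{mp}{m-p+1}\,F_{\alpha:p,\,m-p+1} \tag{18}$$

    where $x_{0i}$ is the i-th entry of $x_0$, $\bar{x}_i$ is the i-th entry of $\bar{x}$, and $[\hat{\Sigma}]_{ii}$ is the i,i-th entry of $\hat{\Sigma}$. Since in general the diagonal entries of $\hat{\Sigma}$ are variance estimates, $\{[\hat{\Sigma}]_{ii}\}^{-1}$ can be written as $1/s_i^2$, where $s_i^2$ is the estimated variance of the i-th variate. Thus equation (18) can be written:

    $$\frac{n}{n+1}\,\frac{(x_{0i} - \bar{x}_i)^2}{s_i^2} \underset{H_0}{\overset{H_1}{\gtrless}} \frac{mp}{m-p+1}\,F_{\alpha:p,\,m-p+1} \tag{19}$$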
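
    The per-variate form (19) needs only the diagonal of $\hat{\Sigma}$. A minimal sketch follows, assuming a hypothetical helper `component_tests`, made-up inputs, and a tabled F value (4.46, the upper 5 percent point for 2 and 8 df) supplied by the caller.

```python
def component_tests(x0, xbar, sigma_hat, n, m, f_crit):
    """Apply test (19) to each variate; returns (statistic, is_significant) pairs."""
    p = len(x0)
    criterion = m * p / (m - p + 1) * f_crit  # same criterion for every variate
    results = []
    for i in range(p):
        s2 = sigma_hat[i][i]  # estimated variance of the i-th variate
        stat = n / (n + 1) * (x0[i] - xbar[i]) ** 2 / s2
        results.append((stat, stat > criterion))
    return results


# Hypothetical inputs: p = 2, n = 10, m = 9.
results = component_tests(
    x0=[9.0, 1.0],
    xbar=[1.0, 0.0],
    sigma_hat=[[2.0, 1.0], [1.0, 2.0]],
    n=10,
    m=9,
    f_crit=4.46,
)
flags = [flag for _, flag in results]
print(results)
```

    In this made-up case only the first variate exceeds the criterion, so only it would be reported as distinguishing $x_0$.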

  • DISCUSSION

    In the above, two results were derived which answered the question "How does the observation $x_0$ differ from the population which generated $\bar{x}$?" The first of these, Theorem 1.0, allows one to ask if any particular linear combination of variates ($c x_0$) distinguishes $x_0$ from the population which generated $\bar{x}$. The second result, Corollary 1.1, allows one to ask if any particular variate distinguishes $x_0$ from the $\bar{x}$ generating population. Both of these results have obvious practical application. Let us briefly consider one.

    In a multiple criteria experiment, an unforeseen (random) event occurs which makes suspect a single observation $x_0$. The first question which faces the researcher is, "Is $x_0$ from the same population as the set of other observations ($x_1, x_2, \ldots, x_n$) from the same condition?" This question can be answered by employing the statistic:

    $$T^2 = \frac{n}{n+1}\,(x_0 - \bar{x})^t\,\hat{\Sigma}^{-1}\,(x_0 - \bar{x}) \tag{21}$$

    where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, $\hat{\Sigma} = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^t$, and $T^2$ is distributed as $\frac{(n-1)p}{n-p}F$ with p and n-p degrees of freedom.* Given that this statistic (21) is significant, the next question is, "Which variates of $x_0$ differ from the population which generated $\bar{x}$ and $\hat{\Sigma}$?" This could be answered by applying Corollary 1.1 with m equaling n-1. Guided by the costs of observations and their number, the results of these tests would be useful for decisions regarding the inclusion of $x_0$ as part of the data of the experiment.
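
    The application can be walked through end to end on a toy data set. In the sketch below the four bivariate observations, the suspect vector, and all function names are invented for illustration; 19.0 is the tabled upper 5 percent point of F with 2 and 2 df, which is what p = 2 and n = 4 require.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def mean_vector(obs):
    """Sample mean of a list of p-component observations."""
    n = len(obs)
    return [sum(row[j] for row in obs) / n for j in range(len(obs[0]))]


def covariance(obs, xbar):
    """Unbiased covariance estimate with divisor n - 1."""
    n, p = len(obs), len(xbar)
    return [
        [sum((r[i] - xbar[i]) * (r[j] - xbar[j]) for r in obs) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


# Hypothetical data: n = 4 retained observations and one suspect observation x0.
observations = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
x0 = [11.0, 1.0]
n, p = len(observations), len(x0)
m = n - 1  # degrees of freedom of the covariance estimate

xbar = mean_vector(observations)
sigma_hat = covariance(observations, xbar)
d = [a - b for a, b in zip(x0, xbar)]

# Statistic (21), then the common criterion (n-1)p/(n-p) * F.
t2 = n / (n + 1) * sum(di * wi for di, wi in zip(d, solve(sigma_hat, d)))
criterion = m * p / (m - p + 1) * 19.0
overall_significant = t2 > criterion

# Follow-up per-variate tests against the same criterion.
flags = [n / (n + 1) * d[i] ** 2 / sigma_hat[i][i] > criterion for i in range(p)]
print(t2, criterion, overall_significant, flags)
```

    With so few observations the criterion is very large; the example is sized for hand checking rather than realism.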

    The above does not exhaust the set of possible applications. Hopefully this application

    will suggest others to the reader.

    REFERENCES

    1. Naval Missile Center. Probability Density When an Independent Observation is From an Estimated Multivariate Normal Population, by A. C. Bittner, Jr. Point Mugu, Calif., 15 Jul 1971. (TP-71-33) UNCLASSIFIED

    2. Morrison, D. F. Multivariate Statistical Methods, San Francisco, McGraw-Hill, 1967.

    3. Roy, S. N. Some Aspects of Multivariate Analysis, New York, Wiley, 1957.

    4. Marcus, M. and Minc, H. Introduction to Linear Algebra, New York, Macmillan, 1965.

    5. Anderson, T. W. An Introduction to Multivariate Statistical Analysis, New York, Wiley, 1958.

    *The statistic shown in (21) is a variation of A-1. A proof for its distribution was shown in reference 1.


  • APPENDIX

    TWO WORKING THEOREMS

    The first theorem (A-1) was shown in reference 1 by the author. The second theorem (A-2) was shown by Anderson (reference 5) in 1958.

    A-1

    If $x_0$ is an observed p-variate vector from $N(\mu, \Sigma)$,

    $$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

    is a mean vector also from $N(\mu, \Sigma)$ based on n independent observations, and $m\hat{\Sigma}$ is the sum of the matrix products of m independent $N(0, \Sigma)$ p-variate vectors ($z_1, z_2, \ldots, z_m$), i.e.,

    $$m\hat{\Sigma} = \sum_{i=1}^{m} z_i z_i^t$$

    then

    $$T^2 = \frac{n}{n+1}\,(x_0 - \bar{x})^t\,\hat{\Sigma}^{-1}\,(x_0 - \bar{x})$$

    is distributed as:

    $$\frac{mp}{m-p+1}\,F$$

    where F has p and m - p + 1 degrees of freedom.


  • A-2

    If $x$ is distributed according to $N(\mu, \Sigma)$, then $z = Dx$ is distributed according to $N(D\mu, D\Sigma D^t)$.
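
    The parameters in A-2 can be computed with two matrix products. The sketch below uses a hypothetical helper `transform_normal` and made-up values for $\mu$, $\Sigma$, and $D$; the first call uses a single-row $D$, which is the case of a linear compound $c x$.

```python
def transform_normal(D, mu, sigma):
    """Return (D mu, D sigma D^t), the parameters of z = D x when x is N(mu, sigma)."""
    rows, p = len(D), len(mu)
    mean = [sum(D[i][k] * mu[k] for k in range(p)) for i in range(rows)]
    # First form D sigma, then multiply by D^t.
    d_sigma = [[sum(D[i][k] * sigma[k][j] for k in range(p)) for j in range(p)] for i in range(rows)]
    cov = [[sum(d_sigma[i][k] * D[j][k] for k in range(p)) for j in range(rows)] for i in range(rows)]
    return mean, cov


# Hypothetical parameters.
mu = [1.0, 0.0]
sigma = [[2.0, 1.0], [1.0, 2.0]]

# A 1 by 2 matrix D plays the role of the coefficient vector c.
mean_c, cov_c = transform_normal([[1.0, -1.0]], mu, sigma)

# A 2 by 2 matrix D keeps the first variate and adds the sum of both.
mean_d, cov_d = transform_normal([[1.0, 0.0], [1.0, 1.0]], mu, sigma)
print(mean_c, cov_c, mean_d, cov_d)
```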



  • INITIAL DISTRIBUTION

    EXTERNAL    Copies

    Commander
    Naval Material Command
    Headquarters
    Washington, DC 20360
    Attn: NMAT-0313    1
          NMAT-0325    1

    Commander
    Naval Air Systems Command
    Headquarters
    Washington, DC 20360
    Attn: AIR-604    2
          AIR-104    1
          AIR-303B    1
          AIR-3034C    1
          AIR-340D    1
          AIR-510    1
          AIR-5313    3
          AIR-533B1    1
          AIR-5337    1
          AIR-53371    1

    Defense Documentation Center
    Cameron Station
    Alexandria, VA 22314
    Attn: TIOG    12

    Chief of Naval Operations
    Navy Department
    Washington, DC 20350
    Attn: OP-507    1
          OP-70111    1

    Commander
    Naval Ship Systems Command
    Headquarters
    Washington, DC 20360
    Attn: Librarian    2

    Commander
    Naval Electronics Systems Command Headquarters
    Washington, DC 20360
    Attn: Code 0532    1

    Chief of Naval Research
    Navy Department
    Washington, DC 20360
    Attn: Code 455    1
          Code 461    1

    Director
    Office of Naval Research
    Branch Office
    495 Summer St.
    Boston, MA 02210

    Director
    Office of Naval Research
    Branch Office
    New York Annex
    207 West 24th St.
    New York, NY 10011

    Director
    Office of Naval Research
    Branch Office
    San Francisco Annex
    1076 Mission St.
    San Francisco, CA 94013

    Director
    Office of Naval Research
    Branch Office
    219 South Dearborn St.
    Chicago, IL 60604

    Director
    Office of Naval Research
    Branch Office
    Box 39
    FPO New York 09510

    Director
    Office of Naval Research
    Branch Office
    1030 E. Green St.
    Pasadena, CA 91101

  • EXTERNAL    Copies

    Director
    Naval Research Laboratory
    Washington, DC 20390
    Attn: Technical Information Office    2

    Commander
    Naval Air Development Center
    Warminster, PA 18974
    Attn: ADLR (Technical Library)    2

    Commander
    Naval Air Test Center
    Patuxent River, MD 20670
    Attn: Technical Library    2

    Commander
    Naval Weapons Center
    China Lake, CA 93555
    Attn: Code 3513    1

    Commanding Officer
    Naval Avionics Facility
    21st and Arlington Ave.
    Indianapolis, IN 46218    1

    Naval Safety Center
    Naval Air Station
    Norfolk, VA 23511    1

    Commanding Officer
    Human Factors Support Branch
    Naval Personnel Research Activity
    San Diego, CA 92152    1

    Commanding Officer and Director
    Navy Marine Engineering Laboratory
    Annapolis, MD 21402
    Attn: Librarian    1

    Commanding Officer and Director
    Naval Training Devices Center
    Orlando, FL 32813
    Attn: Library (Technical)    1

    Superintendent
    Naval Postgraduate School
    Monterey, CA 93940    2

    Commander
    Naval Command Control Communications Laboratory Center
    San Diego, CA 92152

    Office, Secretary of the Army
    Assistant Secretary for R&D
    The Pentagon, Room 3E383
    Washington, DC 20360

    Chief of Research and Development
    Department of the Army
    Washington, DC 20315
    Attn: CDR/R    1

    Commanding General
    Army CDR
    Fort Belvoir, VA 22060
    Attn: Technical Library

    Commanding General
    U.S. Army Research and Development Activity
    Fort Huachuca, AZ 85613

    Commanding Officer
    Army Human Engineering Laboratories
    Aberdeen Proving Ground, MD 21005
    Attn: Technical Library

    Commanding General
    U.S. Army Laboratory
    Fort Eustis, VA 23604
    Attn: Technical Library

    Commanding General
    TRECOM
    Fort Eustis, VA 23604
    Attn: Library

  • EXTERNAL    Copies

    Headquarters, USAF (RTD/RTNF)
    Washington, DC 20332
    Attn: Technical Library    2

    AMSMD-R
    Headquarters Mobility Command
    Centerline, MI 48015    1

    AFFDL
    Wright-Patterson AFB, OH 45433
    Attn: Technical Library    1

    Commander
    USAF 6570th Aero Medical Research Laboratory
    Wright-Patterson AFB, OH 45433
    Attn: AMRL-HRIIED    1

    National Aeronautics and Space Administration
    1520 H Street, N.W.
    Washington, DC 20546
    Attn: Library    1

    National Aeronautics and Space Administration
    Ames Research Center
    Moffett Field, CA 94035
    Attn: Librarian    1

    National Aeronautics and Space Administration
    Ames Research Center
    Electronics Research Center
    575 Technology Square
    Cambridge, MA 02139
    Attn: Library    1

    Federal Aviation Agency
    Department of Transportation
    800 Independence Avenue, S.W.
    Washington, DC 20553
    Attn: Information Retrieval Branch HQ-630    2

    Human Resources Research Office
    Division No. 6 (Aviation)
    P.O. Box 428
    Fort Rucker, AL 36360
    Attn: Librarian    1
          R. Wright    1

    Chief
    Engineering Electronics Section
    National Bureau of Standards
    Washington, DC 20360
    Attn: Library

    Science and Technology Division
    Library of Congress
    Washington, DC 20540    1

    Officer in Charge
    Naval Aerospace Medical Research Laboratory
    Naval Aerospace Medical Institute
    Naval Aerospace Medical Center
    Aerospace Psychology Division
    Pensacola, FL 32512
    Attn: CDR T. J. Gallagher    1

    Bureau of Medicine and Surgery
    Aerospace Operational Psychology Branch
    Washington, DC 20390
    Attn: CDR J. E. Goodson

  • INTERNAL    Copies

    Technical Director
    Code 51
    D. F. Sullivan    1

    Planning and Resources Management Office
    Code 51-1
    R. W. Sexty    1

    Technical Consultant
    Code 530
    L. S. Marquardt    1

    Systems Evaluation Division
    Code 5110
    E. Q. Smith, Jr.    1

    Operations Research Division
    Code 5170
    Dr. W. M. Simpson    1

    Laboratory Department
    Code 5300
    T. P. Perry    1

    Reliability Office
    Code 5301-2
    R. A. Harmen    1

    Systems Integration Division
    Code 5340
    E. P. Olsen    1
    Code 5342
    CDR W. H. Nelson    20
    A. C. Bittner, Jr.    10

    Technical Publications Division
    Code 5632.2
    Technical Library    2

  • UNCLASSIFIED
    Security Classification

    DOCUMENT CONTROL DATA - R & D

    (Security classification of title, body of abstract and indexing annotation must be entered when the overall report is classified)

    1. ORIGINATING ACTIVITY (Corporate author): Naval Missile Center, Point Mugu, California 93042

    2a. REPORT SECURITY CLASSIFICATION: UNCLASSIFIED

    2b. GROUP

    3. REPORT TITLE: WHAT DISTINGUISHES AN INDEPENDENTLY OBSERVED VECTOR FROM AN ESTIMATED MULTIVARIATE NORMAL POPULATION?

    4. DESCRIPTIVE NOTES (Type of report and inclusive dates)

    5. AUTHOR(S) (First name, middle initial, last name): Alvah C. Bittner, Jr.

    6. REPORT DATE: 3 April 1972

    7a. TOTAL NO. OF PAGES: 12

    7b. NO. OF REFS: 5

    8a. CONTRACT OR GRANT NO.

    b. PROJECT NO.

    c.

    d.

    9a. ORIGINATOR'S REPORT NUMBER(S): TP-72-25

    9b. OTHER REPORT NO(S) (Any other numbers that may be assigned this report)

    10. DISTRIBUTION STATEMENT: Approved for public release; distribution unlimited.

    11. SUPPLEMENTARY NOTES

    12. SPONSORING MILITARY ACTIVITY: Naval Air Systems Command

    13. ABSTRACT

    Once a researcher has determined that a multivariate observation $x_0$ is different from an estimated population, he still has an unanswered question. He wants to know, "How is it different?" A method for answering this question is considered in this report.

    Two results are shown and both involve the estimated population parameters $\bar{x}$, the estimated mean, and $\hat{\Sigma}$, the estimated covariance matrix. The first result answers the question "Which linear combinations of the elements of the difference $x_0 - \bar{x}$ are significant?" The second answers the question "Which elements of the difference $x_0 - \bar{x}$ are significant?" Both results are derived using S. N. Roy's Union-Intersection principle; hence, one can set an overall $\alpha$ (significance, type-I) level for the totality of tests.

    A brief discussion of an application is also presented.

    DD FORM 1473 (PAGE 1)    UNCLASSIFIED
    S/N 0101-807-6801    Security Classification

  • UNCLASSIFIED
    Security Classification

    KEY WORDS    LINK A (ROLE, WT)    LINK B (ROLE, WT)    LINK C (ROLE, WT)

    Applications
    Multivariate normal
    Normal population
    Significance test
    Statistics
    Union-intersection principle

    DD FORM 1473 (BACK) (PAGE 2)    UNCLASSIFIED
    Security Classification