comparing ciren and nass cases: a multivariate similarity

35
Comparing CIREN and NASS Cases: Comparing CIREN and NASS Cases: A Multivariate Similarity Approach A Multivariate Similarity Approach Joel Stitzel, Ph. D., Joel Stitzel, Ph. D., Patrick Kilgo, M. S., Patrick Kilgo, M. S., Brian Brian Schmotzer Schmotzer , M. S., , M. S., Clay Gabler, Ph. D., Clay Gabler, Ph. D., J. Wayne Meredith, M.D. J. Wayne Meredith, M.D. Toyota – Wake Forest University School of Medicine CIREN Center

Upload: others

Post on 30-Dec-2021

4 views

Category:

Documents


0 download

TRANSCRIPT

Comparing CIREN and NASS Cases: Comparing CIREN and NASS Cases: A Multivariate Similarity ApproachA Multivariate Similarity Approach

Joel Stitzel, Ph. D.,Joel Stitzel, Ph. D.,Patrick Kilgo, M. S.,Patrick Kilgo, M. S.,

Brian Brian SchmotzerSchmotzer, M. S.,, M. S.,Clay Gabler, Ph. D.,Clay Gabler, Ph. D.,

J. Wayne Meredith, M.D.J. Wayne Meredith, M.D.

Toyota – Wake Forest UniversitySchool of Medicine

CIREN Center

Impetus: The QuestionImpetus: The QuestionCIREN CIREN -- excellent source for detailed excellent source for detailed injury datainjury dataNASS weighted sample of real world casesNASS weighted sample of real world casesQuestion: How do we draw conclusions Question: How do we draw conclusions about injuries and safety using CIREN about injuries and safety using CIREN data, knowing that it accurately describes data, knowing that it accurately describes real world injuries? For:real world injuries? For:

Biomechanics ResearchBiomechanics ResearchSafety RegulationSafety RegulationAutomotive IndustryAutomotive Industry

NASS/CDS vs. CIRENNASS/CDS vs. CIRENNASS/CDSNASS/CDS

National SampleNational SampleCrash DataCrash DataEntry Criteria: TowEntry Criteria: Tow--away away CrashesCrashes40004000--5000 crashes/year5000 crashes/yearLarge contingent of Large contingent of MAIS 1MAIS 1--2 injuries2 injuriesReliable Sampling Reliable Sampling WeightsWeights

CIRENCIRENClinical DataClinical DataCrash DataCrash DataEntry Criteria: Level I Entry Criteria: Level I Trauma CenterTrauma Center300300--400 crashes/year400 crashes/yearMinimum MAIS3Minimum MAIS3**

Detailed Injury DataDetailed Injury DataExhaustive OutcomesExhaustive Outcomes

*CIREN inclusion criteria specify some exceptions to this rule.

NASS/CDS vs. CIRENNASS/CDS vs. CIRENWhy use NASS/CDS?Why use NASS/CDS?

Allows National Estimates of InjuryAllows National Estimates of InjuryInjury data not always complete or detailed (less Injury data not always complete or detailed (less clinical data)clinical data)Lots of data, sampling weights, well established, Lots of data, sampling weights, well established, some injury datasome injury data

Why use CIREN?Why use CIREN?Detailed Clinical Evaluation of InjuriesDetailed Clinical Evaluation of InjuriesCases reviewed by doctors, engineers, crash Cases reviewed by doctors, engineers, crash investigators, etc.investigators, etc.Not a populationNot a population--based samplebased sampleDifficult to extrapolate to national outcomeDifficult to extrapolate to national outcome

ObjectivesObjectives

QuantifyQuantify the similarity/difference between the similarity/difference between a given CIREN case (or a subset of CIREN a given CIREN case (or a subset of CIREN cases) and the population of NASS casescases) and the population of NASS casesHypothetical questions:Hypothetical questions:

Given a subset of NASS cases of interest:Given a subset of NASS cases of interest:How do we identify CIREN cases that How do we identify CIREN cases that ““match itmatch it””??What is the best way to make a comparison?What is the best way to make a comparison?What variables in NASS or CIREN are important in What variables in NASS or CIREN are important in describing the differences?describing the differences?

NASS/CDS vs. CIRENNASS/CDS vs. CIREN

NASS/CDS NASS/CDS CIREN

CIREN

Is CIREN a subset of

NASS/CDS?

Does CIREN contain

crash scenarios not in

NASS/CDS?

NASS/CDSCIREN

What is the nature and size of the overlap?

Main QuestionMain Question

How similar are individual CIREN cases to How similar are individual CIREN cases to the average NASS case?the average NASS case?By extension: how similar is a group of By extension: how similar is a group of CIREN cases of interest, to a population of CIREN cases of interest, to a population of NASS cases of interest?NASS cases of interest?

General ApproachGeneral Approach

Compute a Compute a ““similarity scoresimilarity score”” between between CIREN cases and the average NASS caseCIREN cases and the average NASS case

That is, the kThat is, the k--dimensional dimensional ““distancedistance””between the case typesbetween the case types

Where k=number of variables common Where k=number of variables common between the data sources between the data sources

Methods Methods ––Basics of Mahalanobis DistanceBasics of Mahalanobis Distance

A multivariate measure of distanceA multivariate measure of distanceIt puts all variables on the same scale It puts all variables on the same scale ––standardizes themstandardizes themTakes into account the correlation Takes into account the correlation between the k variablesbetween the k variables

Mahalanobis DistanceMahalanobis Distance

Based on Based on correlationscorrelations between variables between variables Different patterns can be identified and Different patterns can be identified and analyzed. analyzed. Useful way of determining Useful way of determining similaritysimilarity of an of an unknown unknown sample setsample set to a known one. to a known one. Differs from Differs from EuclideanEuclidean distance in that it takes distance in that it takes into account the correlations of the data set and into account the correlations of the data set and isis scalescale--invariantinvariant, i.e. not dependent on the , i.e. not dependent on the scale of measurements scale of measurements

http://en.wikipedia.org/wiki/Mahalanobis_distance

Example: Case Study of Example: Case Study of Height/Weight of Two MenHeight/Weight of Two Men

Male #1: Male #1: 6 inches above average height6 inches above average height100 pounds 100 pounds above above average weightaverage weight

Male #2Male #26 inches above average height6 inches above average height100 pounds 100 pounds belowbelow average weightaverage weight

By Euclidean Distance, these cases are By Euclidean Distance, these cases are equidistant from the average (equidistant from the average (centroidcentroid))By Mahalanobis Distance, Male #2 is By Mahalanobis Distance, Male #2 is MUCH further from the averageMUCH further from the average

Previous Example:Previous Example:

AVERAGE

Height

Wei

ght

Male #2

Male #1

Mahalanobis

<<Euclidean

Mahalanobis Distance and Similarity Mahalanobis Distance and Similarity ScoresScores

Mahalanobis distance is an effective Mahalanobis distance is an effective multivariate measure for how far points multivariate measure for how far points are apart in kare apart in k--space in the context of the space in the context of the correlations between themcorrelations between themSo, we can use Mahalanobis distance as So, we can use Mahalanobis distance as our Similarity Scoreour Similarity Score

Methods: Data Gathering Methods: Data Gathering –– Common Common Variables Between DatasetsVariables Between Datasets

Total Delta VTotal Delta VOccupant AgeOccupant AgeWeight (lbs)Weight (lbs)Height (ft)Height (ft)MAXAISMAXAISISSISS

Model yearModel yearGenderGenderMaximum IntrusionMaximum Intrusion# of Lower # of Lower ExtrExtr. Injuries. Injuries# of Upper # of Upper ExtrExtr. Injuries. Injuries# of Head Injuries# of Head Injuries# of Chest Injuries# of Chest Injuries

Methods: Our SampleMethods: Our SampleNASSNASS

All NASS cases meeting following All NASS cases meeting following criteria:criteria:

MAXAIS >= 3MAXAIS >= 32001 to 20052001 to 2005

A subset of 1869 NASS cases (with A subset of 1869 NASS cases (with MAXAIS >= 3)MAXAIS >= 3)

CIRENCIRENAll CIREN cases from 2001 to the All CIREN cases from 2001 to the present (2819)present (2819)

Methods Methods –– Dealing with Dealing with MissingnessMissingness

When there are large amounts of When there are large amounts of missingnessmissingness that need to be dealt with.that need to be dealt with.Possibilities:Possibilities:

Impute with averagesImpute with averagesEstimate covariance structure with multiple Estimate covariance structure with multiple imputation methodsimputation methodsDelete these observationsDelete these observations

For this study, we For this study, we simply imputed simply imputed columncolumn--wise wise averagesaverages

Next stepNext step

Methods: NASS WeightsMethods: NASS Weights

Used NASS sampling weight coefficient in Used NASS sampling weight coefficient in calculating weighted average for NASS calculating weighted average for NASS casescasesUsed in weighting our Used in weighting our ““covariancecovariance”” matrixmatrix

ItIt’’s as if each row was repeated s as if each row was repeated ““kk”” times times where where ““kk”” is the NASS weight for that rowis the NASS weight for that row

NASS/CDS is composed of lower NASS/CDS is composed of lower severity casesseverity cases

(Distribution of Cases by MAIS (Distribution of Cases by MAIS –– NASS/CDS 2005 vs. CIREN)NASS/CDS 2005 vs. CIREN)

57%

35%

6%1.6% 0.5% 0.2% 0.1%0%

4%

12%

47%

20%

13%

4%

0%

10%

20%

30%

40%

50%

60%

70%

0 1 2 3 4 5 6Max AIS

% C

ases

NASS/CDS (wgt'd)CIREN

NASS/CDS and CIREN roughly NASS/CDS and CIREN roughly comparable for MAIS3+ Casescomparable for MAIS3+ Cases

(Distribution of MAIS3+ Cases(Distribution of MAIS3+ Cases–– NASS/CDS 2005 vs. CIREN)NASS/CDS 2005 vs. CIREN)

67%

21%

9%

3%

56%

22%

16%

6%

56%

23%

16%

5%

0%

10%

20%

30%

40%

50%

60%

70%

80%

3 4 5 6Max AIS

% C

ases

NASS/CDS (wgt'd)NASS/CDS (unwgt'd)CIREN

Weighted NASS and CIREN MeansWeighted NASS and CIREN Means

0.500.50199719970.150.1521.621.63.393.395.395.39160.5160.536.736.740.940.9

CIREN MeansCIREN Means

0.560.56MaleMale19941994Model YearModel Year0.240.24MortalityMortality23.623.6ISSISS3.73.7MAXAISMAXAIS5.565.56Height (ft)Height (ft)167167Weight (lbs)Weight (lbs)38.038.0AgeAge37.537.5Total Delta VTotal Delta V

Weighted Weighted NASS MeanNASS Mean

VariableVariable

Normalized CIREN to NASS MeansNormalized CIREN to NASS Means

0.6

0.7

0.8

0.9

1

1.1

Total D

elta V Age

Weig

ht (lb

s)Heig

ht (ft)

MAXAIS ISS

Mortali

tyMod

el Yea

r

Male

Nor

mal

ized

Mea

nnormalized nassmeansnormalized cirenmeans

Histogram of Mahalanobis Distances

Distance

Den

sity

0 5 10 15

0.0

0.1

0.2

0.3

0.4

NASSCIREN

ResultsResults

Distance Distribution Distance Distribution is roughly the same.is roughly the same.CIREN has a greater CIREN has a greater proportion of its proportion of its cases clustered cases clustered about the mean about the mean distance than NASS distance than NASS doesdoes““CIREN is more like CIREN is more like NASS than NASSNASS than NASS””

Potential outcomes of principal Potential outcomes of principal components analysiscomponents analysis

Population1

Population

2

Population1

Population

2

Pop.3

Pop.3

Principal Component 1

Pri

nci

pal C

ompo

nen

t 2

Principal Component 1

Pri

nci

pal C

ompo

nen

t 2

Principal Components AnalysisPrincipal Components Analysis

There are There are differences between differences between NASS and CIRENNASS and CIRENBut more important But more important part is part is overlap, overlap, they are largely they are largely the same.the same.CIREN cases are CIREN cases are driven to the right driven to the right by some low MAISby some low MAIS

NASS

CIREN

Principal Components AnalysisPrincipal Components Analysis

Could Could subsetsubsetCIREN or NASS CIREN or NASS to minimize to minimize Mahalanobis Mahalanobis distance distance depending on depending on interest.interest.This is shown This is shown here in 2here in 2--d by d by minimizing minimizing variation in 1variation in 1stst 2 2 principal principal components.components.

Number of Head Injuries

Mah

alan

obis

Dis

tanc

e

0 129630

3

6

9

12

Number of Chest Injuries0 102 4 6 8

Mah

alan

obis

Dis

tanc

e

0

3

6

9

12

Number of Upper Extremity Injuries0 123 6 9

Mah

alan

obis

Dis

tanc

e

0

3

6

9

12

0 302010

Mah

alan

obis

Dis

tanc

e

0

3

6

9

12

Number of Lower Extremity Injuries

Distance vs. # InjuriesDistance vs. # Injuries

Injury Severity Score (ISS) Maximum Intrusion 0 16080 12040

0

3

12

6

9

Mah

alan

obis

dis

tanc

e

Total Delta V

0

3

12

6

9

Mah

alan

obis

dis

tanc

e

0

3

12

6

9M

ahal

anob

is d

ista

nce

1 53 42 60

3

12

6

9

Mah

alan

obis

dis

tanc

e

0 12060 9030

0 8040 6020

Maximum AIS

Distance vs. Crash and Injury CharacteristicsDistance vs. Crash and Injury Characteristics

Mah

alan

obis

Dis

tanc

e

0

3

6

9

12

Model Year1970 20101990 20001980

Mah

alan

obis

Dis

tanc

e

0

3

6

9

12

1 53 42 6Height

7

Mah

alan

obis

Dis

tanc

e

0

3

6

9

12

0 1005025 75Age

Mah

alan

obis

Dis

tanc

e

3

6

9

12

0

Weight0 600300150 450

Distance vs. Model Year and Anthropometric Variables Distance vs. Model Year and Anthropometric Variables

CaveatsCaveats

Other regressions could be performedOther regressions could be performedShape of data suggests maybe they Shape of data suggests maybe they should be performedshould be performed

Mah

alan

obis

Dis

tanc

e

0

3

6

9

12

0 20 40 60 80 100Age

Distance vs. AgeDistance vs. Age

Mah

alan

obis

Dis

tanc

e

0

3

6

9

12

1 2 3 4 5 6MAIS

Distance vs. Max AISDistance vs. Max AIS

Mah

alan

obis

Dis

tanc

e

0

3

6

9

12

1 2 6 8 10 12Number of Head Injuries

4

Distance vs. # Head InjuriesDistance vs. # Head Injuries

Results Results –– Subset of CIREN cases Subset of CIREN cases ““FurthestFurthest”” from weighted average from weighted average

NASS caseNASS case

Delta V Head Chest Up_ex Low_ex Age Weight Height Gender Max_int Max_AIS ISS Dead Model Year130 1 4 1 5 64.3 201 2.2 1 148 5 43 1 199761 0 0 0 3 40.8 560 6 1 31 3 22 0 200041 1 0 0 2 53.5 220 5.25 1 180 2 9 0 199846 1 0 0 0 9 66 1.74 1 6 1 2 0 2000

101 0 4 11 22 19.3 143 5.68 1 71 5 43 1 199736 0 0 0 0 1.3 24 1.57 0 13 1 1 0 2000

126 0 1 3 2 1.4 26 2.49 1 70 3 10 0 199935 0 0 0 0 0.2 13 1.9 1 31 1 1 0 200141 5 1 0 0 0.2 11 1.51 1 12 5 35 0 199345 4 1 1 0 0.1 11 1.84 1 50 5 57 0 1979

ConclusionsConclusions

CIREN crashes are similar to NASS crashesCIREN crashes are similar to NASS crashesFor severe injuries (MAIS 3)For severe injuries (MAIS 3)

Valuable approach to improve understanding Valuable approach to improve understanding and use of CIRENand use of CIRENCould refine method to be able to assign a NASS Could refine method to be able to assign a NASS similarity score to each CIREN crashsimilarity score to each CIREN crashOr NASS crash similarity or anthropomorphic Or NASS crash similarity or anthropomorphic similarity score or injury similarity scoresimilarity score or injury similarity scoreGood method to quality control data in CIREN Good method to quality control data in CIREN and NASSand NASS

Thank youThank you

AcknowledgementsAcknowledgementsToyotaToyotaCarol Flannagan and Jonathan Rupp, UMTRICarol Flannagan and Jonathan Rupp, UMTRIAnalytical CommitteeAnalytical Committee

Disclaimer: the published material represents Disclaimer: the published material represents the position of the authors and not necessarily the position of the authors and not necessarily that of Toyota or NHTSAthat of Toyota or NHTSAFor more information contact:For more information contact:

Dr. Joel Stitzel (Dr. Joel Stitzel ([email protected]@wfubmc.edu) )