Using Permutation Tests to Study Infant Handling by
Female Baboons
Thomas L. Moore
Vicki Bentley-Condit
Grinnell College
Infant handling examples
Plan for talk
• The data & problem
• The use of permutation tests
• Interpret results
• The stability of results
• The choice of test statistics
• Summary
The data (handout)
Rows are infants (infant ID / mother ID, with the infant's rank); columns are handlers, with each handler's rank on the line below her ID.
INFANT/  Infant  KM KN NQ PO | HQ LL NY PS SK ST WK | AL CO DD LS LY MH ML MM PA PH PT RS
Mother   rank     1  1  1  1 |  2  2  2  2  2  2  2 |  3  3  3  3  3  3  3  3  3  3  3  3
KG/KM      1      0  0  4  1 |  1  0  0  0  3  1  0 |  0  0  0  0  0  0  0  0  0  0  2  1
HZ/HQ      2     13 23  7  5 |  0  2  1  1  5  6 18 |  1  6  3  0  1  4  1  0  9  0 10  1
LC/LL      2      4  0  1  4 |  3  0  2  1  1  5  3 |  1  0  0  1  0  2  1  1  1  0  1  6
NK/NY      2     12  4 10  5 |  9  1  0  2  3 11  7 |  8  6  3  1  0  2  1  1  5  3  2  3
PZ/PS      2      1  3  4  1 |  0  0  0  0  0  0  2 |  0  2  0  0  0  3  0  1  1  0  3  0
CY/CO      3      2  2  7  3 |  1  1  2  0  3 12 16 |  3  0  2  0  0  2  0  0  1  0  0  2
LZ/LS      3      1  0  3  2 |  1  1  0  0  2  0  5 |  2  2  2  0  1  9  2  0  0  0  3  2
MQ/ML      3      0  1  5  2 |  2  4  2  2  2  4  5 |  7  5  2  1  1  7  0  4  4  1  0  2
MW/MH      3      3  0  7  4 |  2  3  0  5  2  8 13 |  7 14  2  0  0  0  4  0  8  0 13  6
MX/MM      3      2  3  4  5 |  0  0  0  0  0  5  2 |  9  3  1  0  0  2  0  0  1  2  2  3
PK/PH      3      2  0  6  4 |  3  4  1  0  0 15 10 |  8  5  1  0  3  1  1  6  3  0  7  5
High-ranked female handles mid-ranked infant: Female NQ handles Infant NK 10 times
NK’s mother is NY
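Excerpt of the data (high-ranked handlers and the first mid-ranked handler; high- and mid-ranked infants):
INFANT/  Infant  KM KN NQ PO | HQ
Mother   rank     1  1  1  1 |  2
KG/KM      1      0  0  4  1 |  1
HZ/HQ      2     13 23  7  5 |  0
LC/LL      2      4  0  1  4 |  3
NK/NY      2     12  4 10  5 |  9
PZ/PS      2      1  3  4  1 |  0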
The variables
• Handler rank: high(1), mid(2), low(3)
• Infant rank: high(1), mid(2), low(3)
• The number of interactions between a given infant-handler pair
NOTE: Rank is determined from a dominance hierarchy score measured independently of infant handling.
3 kinds of handling behavior
• Passive: movement to within 1 m of the mother-infant pair with no attempt to handle,
• Unsuccessful: movement to within 1 m of the mother-infant pair with an attempted (but not successful) handle, or
• Successful: a successful handle.
• NOTE: Each count in the matrix above is the sum of counts from the 3 categories listed on this slide.
Research hypotheses
1. Females will tend to handle the infants of females who are ranked the same as or lower than themselves. (RH1)
2. Females will tend to handle the infants of females who are ranked directly below them (or same rank if female is low-ranked). (RH2)
Collapsed 3-by-3 table of counts (X = handler rank, columns; Y = infant rank, rows):
     [,1] [,2] [,3]
[1,]    5    5    3
[2,]   97   83   95
[3,]   68  138  184
(A) Counts
Infant rank   Handler's rank
                Hi     Mid     Low
Hi               5       5       3
Mid             97      83      95
Lo              68     138     184
Totals:        170     226     282

(B) Column % (each column sums to 100%; > and < compare adjacent columns)
Infant rank   Handler's rank
                Hi        Mid        Low
Hi             2.9%   >   2.2%   >   1.1%
Mid           57.1%   >  36.7%   >  33.7%
Lo            40.0%   <  61.1%   <  65.2%
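As a minimal sketch in Python (not the authors' code), the collapse from the 11-by-23 handout matrix to the 3-by-3 rank table, and the column percentages, could be computed as follows. The counts and ranks are transcribed from the handout; the names `COUNTS`, `collapse`, and `col_pct` are illustrative assumptions.

```python
# Handout counts: one row per infant, one column per handler, in handout order.
COUNTS = [
    [0, 0, 4, 1, 1, 0, 0, 0, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1],        # KG
    [13, 23, 7, 5, 0, 2, 1, 1, 5, 6, 18, 1, 6, 3, 0, 1, 4, 1, 0, 9, 0, 10, 1],    # HZ
    [4, 0, 1, 4, 3, 0, 2, 1, 1, 5, 3, 1, 0, 0, 1, 0, 2, 1, 1, 1, 0, 1, 6],        # LC
    [12, 4, 10, 5, 9, 1, 0, 2, 3, 11, 7, 8, 6, 3, 1, 0, 2, 1, 1, 5, 3, 2, 3],     # NK
    [1, 3, 4, 1, 0, 0, 0, 0, 0, 0, 2, 0, 2, 0, 0, 0, 3, 0, 1, 1, 0, 3, 0],        # PZ
    [2, 2, 7, 3, 1, 1, 2, 0, 3, 12, 16, 3, 0, 2, 0, 0, 2, 0, 0, 1, 0, 0, 2],      # CY
    [1, 0, 3, 2, 1, 1, 0, 0, 2, 0, 5, 2, 2, 2, 0, 1, 9, 2, 0, 0, 0, 3, 2],        # LZ
    [0, 1, 5, 2, 2, 4, 2, 2, 2, 4, 5, 7, 5, 2, 1, 1, 7, 0, 4, 4, 1, 0, 2],        # MQ
    [3, 0, 7, 4, 2, 3, 0, 5, 2, 8, 13, 7, 14, 2, 0, 0, 0, 4, 0, 8, 0, 13, 6],     # MW
    [2, 3, 4, 5, 0, 0, 0, 0, 0, 5, 2, 9, 3, 1, 0, 0, 2, 0, 0, 1, 2, 2, 3],        # MX
    [2, 0, 6, 4, 3, 4, 1, 0, 0, 15, 10, 8, 5, 1, 0, 3, 1, 1, 6, 3, 0, 7, 5],      # PK
]
INFANT_RANKS = [1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3]    # 1 = high, 2 = mid, 3 = low
HANDLER_RANKS = [1] * 4 + [2] * 7 + [3] * 12


def collapse(counts, infant_ranks, handler_ranks):
    """Sum counts into a 3-by-3 table: rows = infant rank, columns = handler rank."""
    table = [[0] * 3 for _ in range(3)]
    for row, infant_rank in zip(counts, infant_ranks):
        for n, handler_rank in zip(row, handler_ranks):
            table[infant_rank - 1][handler_rank - 1] += n
    return table


table = collapse(COUNTS, INFANT_RANKS, HANDLER_RANKS)
col_totals = [sum(col) for col in zip(*table)]
# Percentage of each handler-rank column falling in each infant-rank row.
col_pct = [[round(100 * n / t, 1) for n, t in zip(row, col_totals)] for row in table]

for row in table:
    print(row)
print(col_totals)
```

With the handout data this gives the rows 5 5 3, 97 83 95, and 68 138 184, and the column totals 170, 226, and 282.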
Adjusted residuals
Infant rank   Handler's rank
                Hi      Mid     Low
Hi             1.12    0.40   -1.37
Mid            5.06   -1.44   -3.08
Lo            -5.34    1.32    3.43
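The slides do not state how the adjusted residuals were computed. A minimal sketch, assuming the usual adjusted (standardized) Pearson residual, (n − e) / sqrt(e (1 − row total/N)(1 − column total/N)) with e = row total × column total / N, agrees with the values shown to two decimals. The function name `adjusted_residuals` is an illustrative assumption.

```python
import math

# Observed 3-by-3 table: rows = infant rank (Hi, Mid, Lo), columns = handler rank.
TABLE = [[5, 5, 3], [97, 83, 95], [68, 138, 184]]


def adjusted_residuals(table):
    """Adjusted Pearson residual for every cell of a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    result = []
    for i, row in enumerate(table):
        out = []
        for j, n in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            variance = (expected
                        * (1 - row_totals[i] / total)
                        * (1 - col_totals[j] / total))
            out.append((n - expected) / math.sqrt(variance))
        result.append(out)
    return result


residuals = adjusted_residuals(TABLE)
for row in residuals:
    print(["%.2f" % r for r in row])
```

Values beyond roughly ±2 flag cells that depart from what independence of infant rank and handler rank would predict; here the largest departures are in the Hi-handler column.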
Is the relationship statistically significant?
The Null Model
The female handlers interacted with infants as given in the data set. These interactions involved a variety of complex causes, but none of this complexity had anything to do with ranks. That is, ranks can be viewed as meaningless labels attached to infants and females.
Computing a permutation test
• Choose a test statistic, C.
• (1) Assign ranks at random to infants and females using the rank distributions of the data set. That is, assign ranks at random so that infants are assigned, in this case, 1 High, 4 Mid, and 6 Low and so that females are assigned 4 Highs, 7 Mids, and 12 Lows. This assignment leads to the original data table but with permuted ranks.
• (2) Re-form the 3-by-3 table.
• (3) Compute the value of C for this table.
• Iterate (1)-(3) many times to obtain an empirical null distribution of C.
• $\text{P-value} = P(C \ge C_D)$, where $C_D$ is the value of C for the observed data and the probability is the proportion of permuted tables whose C is at least $C_D$.
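The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code. Two points are inferred rather than quoted: the statistic is taken to be a ±1-weighted sum of the 3-by-3 table (+1 for cells consistent with research hypothesis 1), which gives 472 on the observed data; and, because every infant in the handout's sample permutation carries its mother's permuted rank, ranks are shuffled among the 11 mothers (infants inherit them) and separately among the 12 other females. All function names are hypothetical.

```python
import random

HANDLERS = "KM KN NQ PO HQ LL NY PS SK ST WK AL CO DD LS LY MH ML MM PA PH PT RS".split()
MOTHERS = "KM HQ LL NY PS CO LS ML MH MM PH".split()   # mother of each infant, row order
RANKS = dict(zip(HANDLERS, [1] * 4 + [2] * 7 + [3] * 12))   # 1 = high, 2 = mid, 3 = low
COUNTS = [   # rows = infants (same order as MOTHERS), columns = HANDLERS
    [0, 0, 4, 1, 1, 0, 0, 0, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1],
    [13, 23, 7, 5, 0, 2, 1, 1, 5, 6, 18, 1, 6, 3, 0, 1, 4, 1, 0, 9, 0, 10, 1],
    [4, 0, 1, 4, 3, 0, 2, 1, 1, 5, 3, 1, 0, 0, 1, 0, 2, 1, 1, 1, 0, 1, 6],
    [12, 4, 10, 5, 9, 1, 0, 2, 3, 11, 7, 8, 6, 3, 1, 0, 2, 1, 1, 5, 3, 2, 3],
    [1, 3, 4, 1, 0, 0, 0, 0, 0, 0, 2, 0, 2, 0, 0, 0, 3, 0, 1, 1, 0, 3, 0],
    [2, 2, 7, 3, 1, 1, 2, 0, 3, 12, 16, 3, 0, 2, 0, 0, 2, 0, 0, 1, 0, 0, 2],
    [1, 0, 3, 2, 1, 1, 0, 0, 2, 0, 5, 2, 2, 2, 0, 1, 9, 2, 0, 0, 0, 3, 2],
    [0, 1, 5, 2, 2, 4, 2, 2, 2, 4, 5, 7, 5, 2, 1, 1, 7, 0, 4, 4, 1, 0, 2],
    [3, 0, 7, 4, 2, 3, 0, 5, 2, 8, 13, 7, 14, 2, 0, 0, 0, 4, 0, 8, 0, 13, 6],
    [2, 3, 4, 5, 0, 0, 0, 0, 0, 5, 2, 9, 3, 1, 0, 0, 2, 0, 0, 1, 2, 2, 3],
    [2, 0, 6, 4, 3, 4, 1, 0, 0, 15, 10, 8, 5, 1, 0, 3, 1, 1, 6, 3, 0, 7, 5],
]
# Assumed weights: +1 where the infant's rank is the same as or lower than the handler's.
WEIGHTS = [[1, -1, -1], [1, 1, -1], [1, 1, 1]]


def collapse(ranks):
    """Step (2): 3-by-3 table, rows = infant (mother's) rank, columns = handler rank."""
    table = [[0] * 3 for _ in range(3)]
    for mother, row in zip(MOTHERS, COUNTS):
        for handler, n in zip(HANDLERS, row):
            table[ranks[mother] - 1][ranks[handler] - 1] += n
    return table


def statistic(table):
    """Step (3): weighted sum of the table cells."""
    return sum(w * n for w_row, t_row in zip(WEIGHTS, table)
               for w, n in zip(w_row, t_row))


def permute(ranks, rng):
    """Step (1): shuffle ranks among mothers and, separately, among other females."""
    others = [h for h in HANDLERS if h not in MOTHERS]
    new_ranks = {}
    for group in (MOTHERS, others):
        shuffled = [ranks[f] for f in group]
        rng.shuffle(shuffled)
        new_ranks.update(zip(group, shuffled))
    return new_ranks


def permutation_test(n_resamples=1000, seed=0):
    rng = random.Random(seed)
    observed = statistic(collapse(RANKS))
    hits = sum(statistic(collapse(permute(RANKS, rng))) >= observed
               for _ in range(n_resamples))
    return observed, hits / n_resamples


observed, p_value = permutation_test()
print(observed, p_value)
```

The estimated p-value depends on the seed and the number of resamples, so it will not match a published figure exactly.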
Test statistic for Research hypothesis 1
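$$
\mathrm{LTE} = \begin{pmatrix} +1 & -1 & -1 \\ +1 & +1 & -1 \\ +1 & +1 & +1 \end{pmatrix} * \begin{pmatrix} 5 & 5 & 3 \\ 97 & 83 & 95 \\ 68 & 138 & 184 \end{pmatrix} = 472
$$
Here $*$ is the sum of element-wise products; rows are infant rank and columns are handler rank. The weight matrix is inferred: +1 for cells consistent with RH1 (infant ranked the same as or lower than the handler) and −1 otherwise reproduces the stated value, (5 + 97 + 68 + 83 + 138 + 184) − (5 + 3 + 95) = 575 − 103 = 472.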
Test statistic for Research hypothesis 2
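$$
\mathrm{LT} = \begin{pmatrix} -1 & -1 & -1 \\ +1 & -1 & -1 \\ -1 & +1 & +1 \end{pmatrix} * \begin{pmatrix} 5 & 5 & 3 \\ 97 & 83 & 95 \\ 68 & 138 & 184 \end{pmatrix} = 160
$$
Here $*$ is the sum of element-wise products; rows are infant rank and columns are handler rank. The weight matrix is inferred: +1 for cells consistent with RH2 (infant ranked directly below the handler, or the same rank for low-ranked handlers) and −1 otherwise reproduces the stated value, (97 + 138 + 184) − (5 + 5 + 3 + 83 + 95 + 68) = 419 − 259 = 160.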
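A short Python check of the two statistics, as a sketch under one assumption: the ±1 weight matrices are inferred (+1 for cells consistent with the research hypothesis, −1 for the rest), not quoted from the slides. They are offered because they give LTE = 472 and LT = 160 on the observed table and LTE = 424 on the handout's sample permutation. The names `W_LTE`, `W_LT`, and `weighted_sum` are illustrative.

```python
# Rows = infant rank (Hi, Mid, Lo); columns = handler rank (Hi, Mid, Lo).
OBSERVED = [[5, 5, 3], [97, 83, 95], [68, 138, 184]]
SAMPLE_PERMUTATION = [[5, 1, 7], [85, 60, 119], [81, 117, 203]]

# Assumed weights for RH1: infant ranked the same as or lower than the handler.
W_LTE = [[1, -1, -1],
         [1, 1, -1],
         [1, 1, 1]]
# Assumed weights for RH2: infant ranked directly below the handler
# (or the same rank when the handler is low-ranked).
W_LT = [[-1, -1, -1],
        [1, -1, -1],
        [-1, 1, 1]]


def weighted_sum(weights, table):
    """Sum of element-wise products of a weight matrix and a table of counts."""
    return sum(w * n for w_row, t_row in zip(weights, table)
               for w, n in zip(w_row, t_row))


lte_observed = weighted_sum(W_LTE, OBSERVED)
lt_observed = weighted_sum(W_LT, OBSERVED)
lte_permuted = weighted_sum(W_LTE, SAMPLE_PERMUTATION)
print(lte_observed, lt_observed, lte_permuted)   # 472 160 424
```

Because the weights are ±1, each statistic equals twice the count in hypothesis-consistent cells minus the table total, so larger values mean more of the interactions fall where the hypothesis predicts.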
A sample permutation (handout)
Same counts as the original data, with ranks permuted.
INFANT/  Infant  KM KN NQ PO HQ LL NY PS SK ST WK AL CO DD LS LY MH ML MM PA PH PT RS
Mother   rank     1  3  1  3  3  3  2  2  3  1  3  2  2  3  3  1  3  3  3  2  2  3  2
KG/KM      1      0  0  4  1  1  0  0  0  3  1  0  0  0  0  0  0  0  0  0  0  0  2  1
HZ/HQ      3     13 23  7  5  0  2  1  1  5  6 18  1  6  3  0  1  4  1  0  9  0 10  1
LC/LL      3      4  0  1  4  3  0  2  1  1  5  3  1  0  0  1  0  2  1  1  1  0  1  6
NK/NY      2     12  4 10  5  9  1  0  2  3 11  7  8  6  3  1  0  2  1  1  5  3  2  3
PZ/PS      2      1  3  4  1  0  0  0  0  0  0  2  0  2  0  0  0  3  0  1  1  0  3  0
CY/CO      2      2  2  7  3  1  1  2  0  3 12 16  3  0  2  0  0  2  0  0  1  0  0  2
LZ/LS      3      1  0  3  2  1  1  0  0  2  0  5  2  2  2  0  1  9  2  0  0  0  3  2
MQ/ML      3      0  1  5  2  2  4  2  2  2  4  5  7  5  2  1  1  7  0  4  4  1  0  2
MW/MH      3      3  0  7  4  2  3  0  5  2  8 13  7 14  2  0  0  0  4  0  8  0 13  6
MX/MM      3      2  3  4  5  0  0  0  0  0  5  2  9  3  1  0  0  2  0  0  1  2  2  3
PK/PH      2      2  0  6  4  3  4  1  0  0 15 10  8  5  1  0  3  1  1  6  3  0  7  5

Re-formed 3-by-3 table (rows: infant rank; columns: handler rank):
     [,1] [,2] [,3]
[1,]    5    1    7
[2,]   85   60  119
[3,]   81  117  203
Test statistic for Research hypothesis 1
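For the permuted table:
$$
\mathrm{LTE} = \begin{pmatrix} +1 & -1 & -1 \\ +1 & +1 & -1 \\ +1 & +1 & +1 \end{pmatrix} * \begin{pmatrix} 5 & 1 & 7 \\ 85 & 60 & 119 \\ 81 & 117 & 203 \end{pmatrix} = 424
$$
Here $*$ is the sum of element-wise products. The ±1 weights are inferred (+1 for cells consistent with RH1); they reproduce the stated value, (5 + 85 + 81 + 60 + 117 + 203) − (1 + 7 + 119) = 551 − 127 = 424.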
Null distribution: 1000 resamples
[Figure: histogram of C over the 1000 resamples. x-axis: C, ticks from 100 to 500; y-axis: Frequency, ticks from 0 to 200.]
Conclusion
• P-value ≈ 15/1000 = 0.015
• The observed pattern is unlikely to be the result of chance alone.
Summary by type of interaction
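Type     LTE      LT       n
PA      0.015    0.038    678
Pass    0.013    0.071    377
Un      0.012    0.119    189
Succ    0.372    0.017    112
p-values for two test statistics (LTE and LT) for 4 datasets of counts; n is the number of interactions. Pass, Un, and Succ are the passive, unsuccessful, and successful interactions; PA is the pooled table (678 = 377 + 189 + 112).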
Look at Successful interactions
Counts (rows: infant rank; columns: handler rank, as in the earlier tables)
        hi     mid     lo
hi       0       3      2
mid     19       6     22
lo       3      14     43

Column%
        hi     mid     lo
hi      0%     13%     3%
mid    86%     26%    33%
lo     14%     61%    64%

Resids
        hi     mid     lo
hi    -1.13   2.23   -0.92
mid    4.71  -1.73   -2.39
lo    -4.19   0.79    2.75
Stability of results
• Suggested by Clifford Lunneborg (Stats 2002)
• Stable description: “finding of the study … is not unduly influenced by the inclusion in the study of one particular source of observations.”
Four views of stability
                  Remove infants    Remove handlers
Test statistic    ????              ????
P-value           ????              ????
For example …
• Remove infants, one at a time, recompute the test statistic.
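• Use a normalized test statistic: LTE* = LTE / (sum of table entries);
• For a given table, LTE* and LTE have the same permutation distribution up to a constant factor (permuting ranks leaves the table total unchanged), so they give the same p-value, …
• But LTE* accounts for the variation in total count between the sub-tables left after each removal.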
• LTE* values are stable (plot below)
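[Figure: "Remove infant, LTE*, PA". LTE* (axis ticks 0.64 to 0.76) plotted against the ID of the removed infant.]
[Figure: "Remove infant, p-value, Pass". P-value (axis ticks 0.02 to 0.10) plotted against Infant ID.]
[Figure: "Remove handler, p-value, Pass". P-value (axis ticks 0.01 to 0.07) plotted against Handler ID.]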
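The remove-one-infant view of the test statistic can be sketched in Python as below. This is an illustration, not the authors' code: it assumes LTE is the ±1-weighted sum that gives 472 on the full data (the weights are inferred, not quoted), takes LTE* = LTE / total count, and covers only the test statistic, not the p-values. The names `lte_star` and `leave_one_out` are hypothetical.

```python
COUNTS = [   # rows = infants in handout order, columns = handlers in handout order
    [0, 0, 4, 1, 1, 0, 0, 0, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1],
    [13, 23, 7, 5, 0, 2, 1, 1, 5, 6, 18, 1, 6, 3, 0, 1, 4, 1, 0, 9, 0, 10, 1],
    [4, 0, 1, 4, 3, 0, 2, 1, 1, 5, 3, 1, 0, 0, 1, 0, 2, 1, 1, 1, 0, 1, 6],
    [12, 4, 10, 5, 9, 1, 0, 2, 3, 11, 7, 8, 6, 3, 1, 0, 2, 1, 1, 5, 3, 2, 3],
    [1, 3, 4, 1, 0, 0, 0, 0, 0, 0, 2, 0, 2, 0, 0, 0, 3, 0, 1, 1, 0, 3, 0],
    [2, 2, 7, 3, 1, 1, 2, 0, 3, 12, 16, 3, 0, 2, 0, 0, 2, 0, 0, 1, 0, 0, 2],
    [1, 0, 3, 2, 1, 1, 0, 0, 2, 0, 5, 2, 2, 2, 0, 1, 9, 2, 0, 0, 0, 3, 2],
    [0, 1, 5, 2, 2, 4, 2, 2, 2, 4, 5, 7, 5, 2, 1, 1, 7, 0, 4, 4, 1, 0, 2],
    [3, 0, 7, 4, 2, 3, 0, 5, 2, 8, 13, 7, 14, 2, 0, 0, 0, 4, 0, 8, 0, 13, 6],
    [2, 3, 4, 5, 0, 0, 0, 0, 0, 5, 2, 9, 3, 1, 0, 0, 2, 0, 0, 1, 2, 2, 3],
    [2, 0, 6, 4, 3, 4, 1, 0, 0, 15, 10, 8, 5, 1, 0, 3, 1, 1, 6, 3, 0, 7, 5],
]
INFANT_RANKS = [1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3]    # 1 = high, 2 = mid, 3 = low
HANDLER_RANKS = [1] * 4 + [2] * 7 + [3] * 12
# Assumed LTE weights: +1 where the infant's rank is the same as or lower than the handler's.
WEIGHTS = [[1, -1, -1], [1, 1, -1], [1, 1, 1]]


def lte_star(counts, infant_ranks):
    """LTE divided by the total number of interactions in the (sub-)table."""
    lte = total = 0
    for row, infant_rank in zip(counts, infant_ranks):
        for n, handler_rank in zip(row, HANDLER_RANKS):
            lte += WEIGHTS[infant_rank - 1][handler_rank - 1] * n
            total += n
    return lte / total


full = lte_star(COUNTS, INFANT_RANKS)
# Drop each infant (row) in turn and recompute the normalized statistic.
leave_one_out = [
    lte_star(COUNTS[:i] + COUNTS[i + 1:], INFANT_RANKS[:i] + INFANT_RANKS[i + 1:])
    for i in range(len(COUNTS))
]

print(round(full, 3))
print([round(v, 3) for v in leave_one_out])
```

Removing a handler instead would drop a column and the matching entry of `HANDLER_RANKS`; the p-value views would rerun the permutation test on each reduced data set.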
Stability summary
                  Remove infants                      Remove handlers
Test statistic    Stable for both LTE and LT          Stable for both LTE and LT
P-value           Slight instability for Infant #1    Slight instability for Handler #1
                  and Passive handling                and Passive handling
Choice of test statistics
• LTE and LT ad hoc, but intuitive
• Power analysis to compare LTE and LT to some other statistics
  – Correlation-based statistics
    • M = Pearson’s correlation putting scores on ranks (Agresti, 88)
    • GK = Goodman and Kruskal’s gamma (Agresti, 58)
  – Beta = the asymmetry parameter in an ordinal quasi-symmetric log-linear model (Agresti, 202)
Simulation
• Product multinomial model where the J-th handler generates n_J interactions
• 2^(7-3) design to estimate main effects for 7 factors, including
  – Total sample size
  – Table dimension
  – RH1 vs RH2
  – Strength of research hypothesis
  – Patterns of non-homogeneity of counts
  – Infant rank distribution
  – Handler rank distribution
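One way to read the product multinomial model is sketched below in Python: each handler independently spreads her n_J interactions over the infants according to her own probability vector. The slides give no parameter values, so everything numeric here is hypothetical, including the `strength` multiplier used to favor infants ranked the same as or lower than the handler; the function name `simulate_counts` is likewise an assumption.

```python
import random


def simulate_counts(n_per_handler, infant_weights_by_handler, rng):
    """Return counts[infant][handler]; handler j spreads n_j interactions over infants."""
    n_infants = len(infant_weights_by_handler[0])
    counts = [[0] * len(n_per_handler) for _ in range(n_infants)]
    for j, (n_j, weights) in enumerate(zip(n_per_handler, infant_weights_by_handler)):
        # One multinomial draw per handler: n_j independent picks of an infant.
        for i in rng.choices(range(n_infants), weights=weights, k=n_j):
            counts[i][j] += 1
    return counts


# Hypothetical toy setting: three handlers and three infants, one of each rank.
handler_ranks = [1, 2, 3]
infant_ranks = [1, 2, 3]
n_per_handler = [30, 20, 10]
strength = 3.0   # hypothetical: how strongly RH1-consistent infants are favored

weights = [
    [strength if infant_rank >= handler_rank else 1.0 for infant_rank in infant_ranks]
    for handler_rank in handler_ranks
]
counts = simulate_counts(n_per_handler, weights, random.Random(0))
for row in counts:
    print(row)
```

Setting `strength` to 1.0 gives a null-model data set; a power study would simulate many tables at each design point and record how often each test statistic rejects.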
Results of power study
• LTE and LT outperform others;
• LTE does best under RH1
• LT does best under RH2
• Good news!
Summary of talk
• There is evidence for both research hypotheses, depending on the type of handling behavior;
• This suggests a nuanced description of how infant-handling behavior works.
• Permutation tests gave a way of analyzing a messy data set;
• We assessed stability through a simple remove-one-at-a-time strategy;
• We compared the power of test statistics and found simple ones to perform well.
Final slide
• Thank you for listening.
• Email: [email protected]
• Slides and handout: http://www.math.grinnell.edu/~mooret/reports/reports.html
• Note: this info is on your handout.