Effect of Additional Loci on Likelihood Ratio Values for Complex Kinship Analysis
Kristen Lewis O’Connor, Ph.D.
National Institute of Standards and Technology
21st International Symposium on Human Identification, San Antonio, Texas, October 12, 2010
Final version of this presentation available at: http://www.cstl.nist.gov/strbase/NISTpub.htm
Questions to Be Addressed
• How does kinship analysis relate to forensic DNA typing?
• Is there value in examining additional loci?
• What has NIST accomplished with kinship analysis?
• Where can one learn more about these topics?
What is our forensic core competency?
Laying the foundation for a discussion of kinship analysis
Forensic Core Competency
Standard STR Typing: 13 core loci
Standard STR Typing for Database Search: 13 core loci
Type of Search: 1-to-1, 1-to-many
Random Match Probability determines the rarity of the profile
Direct match
High certainty
Evidence profile: 8,14 - 10,13 - …
Profile 1: 9,13 - 10,11 - ..
Profile 2: 8,11 - 10,12 - ..
Profile 3: 10,11 - 11,12 - ..
Profile 4: 15,16 - 13,13 - ..
Profile 5: 9,10 - 10,10 - ..
Profile 6: 8,14 - 10,13 - ..
Profile 7: 9,12 - 10,11 - ..
Profile 8: 7,15 - 10,12 - ..
Profile 9: 9,13 - 11,11 - ..
Reference Profiles (K1 to Kn)
Expanding the Forensic Core Competency
Standard STR Typing: 13 core loci
Standard STR Typing for Database Search: 13 core loci
Paternity Testing: ~15 loci
Type of Search: 1-to-1, 1-to-many
Type of Match: Direct, Indirect
Level of Certainty: High, Low
Paternity trio: alleged father (AF) 12,15; mother (M) 8,13; child (C) 12,13
Paternity Index determines the strength of the compared relationships
Expanding the Forensic Core Competency
Standard STR Typing: 13 core loci
Standard STR Typing for Database Search: 13 core loci
Paternity Testing: ~15 loci
Complex Kinship Analysis: Additional loci may be beneficial
Type of Search: 1-to-1, 1-to-many
Type of Match: Direct, Indirect
Level of Certainty: High, Low
What is kinship analysis?
Evaluation of relatedness between individuals
Applications
Parentage testing (civil or criminal)
Disaster victim identification
Missing persons identification
Familial searching
Immigration
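Figure: pedigree with alleged father (AF), mother (M), and child (C), with the relationship in question marked “?”
Immigration Testing
U.S. Department of Homeland Security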
U.S. anchor
Anchor may sponsor up to 15 relatives (spouse, parents, siblings, children)
79% of refugee claims were fraudulent based on DNA testing or failure to appear for DNA testing (U.S. Dept. of State)
DHS is looking to require DNA to support relationship claims
Why can kinship analysis be complex?
Direct Matching
Exact match between compared genotypes
Standard STR Typing
Evidence profile: 8,14 - 10,13 - …
Suspect profile: 8,14 - 10,13 - ...
Kinship Analysis?
Kinship profile 1: 8,14 - 10,13 - …
Kinship profile 2: 8,14 - 10,13 - ...
Probability of Sharing Alleles from a Common Ancestor
Relationship      0 alleles   1 allele   2 alleles
Identical twin    0           0          1
Example: parents (12,15) and (10,13); both children 12,13
Twins! 2 alleles shared at every locus
Indirect Matching: Parent-Offspring
Relationship      0 alleles   1 allele   2 alleles
Parent-child      0           1          0
Example: father (F) 12,15; mother (M) (10,13); child (C) 12,13
1 allele shared at every locus
Indirect Matching: Full Siblings
Relationship      0 alleles   1 allele   2 alleles
Full siblings     1/4         1/2        1/4
Example: parents (12,15) and (10,13); children 12,13 and 10,15
0 alleles shared at a locus
What information is required for kinship analysis?
• Alleged relationship
• Genotypes of specific markers
• Method to assess the relationship
Paternity trio
Paternity Index (Likelihood Ratio) = (Probability of genotypes if “10” is the true father of “21”) / (Probability of genotypes if an unrelated man is the father of “21”)
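As a minimal sketch of the Paternity Index at a single locus, the Python below enumerates which allele each parent could transmit and divides the two probabilities. The genotypes are the example trio shown earlier in this talk (AF 12,15; M 8,13; C 12,13). The allele frequencies and all function names are hypothetical, and mutation, null alleles, and population substructure are ignored.

```python
from fractions import Fraction

def transmission(parent, allele):
    """Probability that a parent with this genotype passes on `allele` (no mutation)."""
    return Fraction(sum(a == allele for a in parent), 2)

def child_given_parents(child, mother, father):
    """P(child genotype | genotypes of both parents)."""
    c1, c2 = child
    p = transmission(mother, c1) * transmission(father, c2)
    if c1 != c2:  # maternal and paternal origin of the two alleles can be swapped
        p += transmission(mother, c2) * transmission(father, c1)
    return p

def child_given_mother(child, mother, freq):
    """P(child genotype | mother's genotype, father is a random unrelated man)."""
    c1, c2 = child
    p = transmission(mother, c1) * freq.get(c2, 0)
    if c1 != c2:
        p += transmission(mother, c2) * freq.get(c1, 0)
    return p

def paternity_index(child, mother, alleged_father, freq):
    """Single-locus likelihood ratio: alleged father vs. unrelated man."""
    return (child_given_parents(child, mother, alleged_father)
            / child_given_mother(child, mother, freq))

# Hypothetical allele frequencies (not taken from the presentation)
freq = {12: Fraction(1, 10), 13: Fraction(3, 10)}
pi = paternity_index(child=(12, 13), mother=(8, 13), alleged_father=(12, 15), freq=freq)
print(pi)  # 5
```

With these assumed frequencies the mother must have passed on the 13, so the father must supply the 12. The alleged father does so with probability 1/2 and a random man with probability 0.1, giving an index of 5.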
Paternity trio: Pedigrees are not always this simple
Complex Pedigree (legend: Male, Female, Divorce, No data)
Why can kinship analysis be complex?
Probability of Sharing Alleles from a Common Ancestor
Relationship              0 alleles   1 allele   2 alleles
Parent-child              0           1          0
Full siblings             1/4         1/2        1/4
Half siblings             1/2         1/2        0
Uncle-nephew              1/2         1/2        0
Grandparent-grandchild    1/2         1/2        0
First cousins             3/4         1/4        0
Half siblings, uncle-nephew, and grandparent-grandchild are genetically indistinguishable (identical allele-sharing probabilities)
For more distant familial relationships, allele sharing decreases and uncertainty increases (level of certainty goes from high to low)
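The table above can be encoded directly. The sketch below groups relationships that have the same sharing probabilities and computes the expected fraction of alleles shared from a common ancestor. The dictionary layout and function name are illustrative choices, not part of the presentation.

```python
from fractions import Fraction as F

# (P(0 alleles), P(1 allele), P(2 alleles)) shared from a common ancestor, per the table
SHARING = {
    "Parent-child": (F(0), F(1), F(0)),
    "Full siblings": (F(1, 4), F(1, 2), F(1, 4)),
    "Half siblings": (F(1, 2), F(1, 2), F(0)),
    "Uncle-nephew": (F(1, 2), F(1, 2), F(0)),
    "Grandparent-grandchild": (F(1, 2), F(1, 2), F(0)),
    "First cousins": (F(3, 4), F(1, 4), F(0)),
}

def expected_shared_fraction(probs):
    """Expected fraction of the two alleles at a locus shared from a common ancestor."""
    p0, p1, p2 = probs
    return (p1 + 2 * p2) / 2

# Relationships with identical probabilities cannot be told apart by a pairwise comparison
groups = {}
for name, probs in SHARING.items():
    assert sum(probs) == 1  # each row of the table is a probability distribution
    groups.setdefault(probs, []).append(name)

for probs, names in groups.items():
    print(", ".join(names), "->", expected_shared_fraction(probs))
```

Parent-child and full siblings have the same expected fraction (1/2) but different distributions, which is why the full three-column table is needed rather than a single number.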
What materials are used for kinship analysis at NIST?
What markers are being studied for kinship analysis?
• 46 autosomal loci
• 17 Y-chromosomal loci
• 15 X-STRs (AFDIL collaboration)
• Mitochondrial control region
Autosomal STR Markers
NIST 26plex Assay (non-commercial): D1GATA113, D1S1627, D1S1677, D2S1776, D3S3053, D3S4529, D4S2364, D4S2408, D5S2500, D6S474, D6S1017, D8S1115, D9S1122, D9S2157, D10S1435, D11S4463, D12ATA63, D14S1434, D17S974, D17S1301, D2S441, D10S1248, D22S1045, D18S853, D20S482, D20S1082
U.S.: TPOX, CSF1PO, D5S818, D7S820, D13S317, FGA, vWA, D3S1358, D8S1179, D18S51, D21S11, TH01, D16S539 (13 CODIS loci); D2S1338, D19S433, Penta D, Penta E
Europe: FGA, vWA, D3S1358, D8S1179, D18S51, D21S11, TH01 (7 ESS loci); D16S539, D2S1338, D19S433; D12S391, D1S1656, D2S441, D10S1248, D22S1045 (5 loci adopted to expand to 12 ESS loci); SE33
European Standard Set = ESS
46 unique STR loci have been characterized at NIST
See poster #40 for details on additional loci
NIST Sample Set
• NIST U.S. population samples
– 254 African American, 261 Caucasian, 139 Hispanic
• U.S. father/son samples
– 178 African American, 198 Caucasian, 190 Hispanic, 198 Asian
• Extended family samples
– 6 sets of 3–4 generations
– 165 total samples
www.cstl.nist.gov/strbase/
NIST Data Analysis Capabilities
Kinship Software
– DNA-VIEW™ v. 29.23 (Charles Brenner)
– KIn CALc v. 4.0 (CA DOJ, Steven Myers)
– GeneMarker® HID v. 1.90 (SoftGenetics)
– FSS DNA Lineage (Forensic Science Service)
– LISA (Future Technologies Inc.)
Population Genetics Software
– Arlequin v. 3.5
What methods are we using to assess kinship?
How is kinship assessed?
Figure: pedigree with mother (M), father (F), and individuals 1 and 2
Likelihood Ratio (LR)
Evaluate genotypes to give weight (strength) to compared relationships
LR = (Probability of genotypes if 1,2 are full siblings) / (Probability of genotypes if 1,2 are unrelated)
By the definition of a LR:
LR > 1 supports the numerator (alleged relationship)
LR < 1 supports the denominator (unrelated)
Larger LR values provide more support for the alleged relationship
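One common way to evaluate such a ratio for a pair of genotypes at one locus is to weight three cases (0, 1, or 2 alleles shared from a common ancestor) by the relationship's sharing probabilities. The sketch below does this for full siblings. It assumes Hardy-Weinberg proportions and no mutation, and the allele frequencies and function names are hypothetical. It is not a description of how any of the software packages listed in this talk compute LRs.

```python
from fractions import Fraction as F

def genotype_prob(g, freq):
    """Hardy-Weinberg probability of an unordered genotype."""
    a, b = g
    return freq[a] ** 2 if a == b else 2 * freq[a] * freq[b]

def prob_one_shared(g2, g1, freq):
    """P(g2 | g1) when exactly one allele is shared from a common ancestor."""
    a, b = g2
    total = 0
    for s in g1:  # either allele of g1 is the shared one, each with probability 1/2
        if a == b:
            total += F(1, 2) * (freq[a] if s == a else 0)
        else:
            total += F(1, 2) * ((freq[b] if s == a else 0) + (freq[a] if s == b else 0))
    return total

def locus_lr(g1, g2, sharing, freq):
    """LR for 'related with these sharing probabilities' vs. 'unrelated' at one locus."""
    k0, k1, k2 = sharing
    p2 = genotype_prob(g2, freq)
    same = sorted(g1) == sorted(g2)
    numerator = k0 * p2 + k1 * prob_one_shared(g2, g1, freq) + k2 * (1 if same else 0)
    return numerator / p2

FULL_SIBS = (F(1, 4), F(1, 2), F(1, 4))
# Hypothetical allele frequencies at a single locus
freq = {10: F(1, 10), 13: F(1, 5), 15: F(7, 10)}

print(float(locus_lr((10, 13), (10, 13), FULL_SIBS, freq)))  # 8.375
print(float(locus_lr((10, 10), (13, 15), FULL_SIBS, freq)))  # 0.25
```

A multi-locus LR is usually the product of the single-locus values, which is only valid if the loci are independent.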
How is kinship assessed?
Goal: How well does a set of loci perform for kinship analysis?
Method: Evaluate “expected” range of LRs for different relationship questions
Need: Graphical method to display the LRs
Solution: Distribution of data points (LR values)
What is a likelihood ratio distribution?
Variables:
Allele frequency
Number of loci
Kinship probabilities (account for shared alleles from common ancestor)
Figure: distribution of likelihood ratio values, ranging from low LR (less weight for relationship) through moderate support for relationship to high LR (more weight for relationship)
Computer simulations can generate many genotype combinations and relatedness scenarios (pedigrees)
Calculate LR values for each pedigree
Generate LR distributions
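The simulation idea can be sketched in a few lines: draw pairs of genotypes under a true relationship, compute the LR for the tested relationship at each locus, multiply across loci, and collect log10(LR) values. This is a toy version under loud assumptions: one hypothetical set of allele frequencies is reused for every locus, loci are independent, there is no mutation, and pairs are simulated directly from sharing probabilities rather than from full pedigrees. It does not reproduce the NIST simulations.

```python
import math
import random
import statistics

# Hypothetical allele frequencies, reused for every locus (assumption)
FREQ = {"a": 0.05, "b": 0.10, "c": 0.15, "d": 0.20, "e": 0.25, "f": 0.25}
ALLELES = list(FREQ)
WEIGHTS = [FREQ[a] for a in ALLELES]
FULL_SIBS = (0.25, 0.5, 0.25)  # P(0, 1, 2 alleles shared from a common ancestor)
UNRELATED = (1.0, 0.0, 0.0)

def draw_allele(rng):
    return rng.choices(ALLELES, weights=WEIGHTS, k=1)[0]

def draw_pair(sharing, rng):
    """Draw genotypes for two people whose true allele sharing follows `sharing`."""
    g1 = (draw_allele(rng), draw_allele(rng))
    n_shared = rng.choices([0, 1, 2], weights=sharing, k=1)[0]
    if n_shared == 2:
        g2 = g1
    elif n_shared == 1:
        g2 = (rng.choice(g1), draw_allele(rng))
    else:
        g2 = (draw_allele(rng), draw_allele(rng))
    return g1, g2

def genotype_prob(g):
    a, b = g
    return FREQ[a] ** 2 if a == b else 2 * FREQ[a] * FREQ[b]

def prob_one_shared(g2, g1):
    """P(g2 | g1) when exactly one allele is shared from a common ancestor."""
    a, b = g2
    total = 0.0
    for s in g1:
        if a == b:
            total += 0.5 * (FREQ[a] if s == a else 0.0)
        else:
            total += 0.5 * ((FREQ[b] if s == a else 0.0) + (FREQ[a] if s == b else 0.0))
    return total

def locus_lr(g1, g2, sharing):
    k0, k1, k2 = sharing
    p2 = genotype_prob(g2)
    same = sorted(g1) == sorted(g2)
    return (k0 * p2 + k1 * prob_one_shared(g2, g1) + k2 * same) / p2

def simulate_log10_lrs(true_sharing, tested_sharing, n_loci, n_pairs, rng):
    """Return log10 of the multi-locus LR for each simulated pair."""
    results = []
    for _ in range(n_pairs):
        lr = 1.0
        for _ in range(n_loci):
            g1, g2 = draw_pair(true_sharing, rng)
            lr *= locus_lr(g1, g2, tested_sharing)
        results.append(math.log10(lr) if lr > 0 else float("-inf"))
    return results

rng = random.Random(2010)
related = simulate_log10_lrs(FULL_SIBS, FULL_SIBS, n_loci=13, n_pairs=1000, rng=rng)
unrelated = simulate_log10_lrs(UNRELATED, FULL_SIBS, n_loci=13, n_pairs=1000, rng=rng)
print("median log10(LR), true full siblings:", round(statistics.median(related), 2))
print("median log10(LR), unrelated pairs:", round(statistics.median(unrelated), 2))
```

Plotting histograms of `related` and `unrelated` on a shared log10(LR) axis gives the kind of paired distributions described here. Changing `n_loci` shows how the number of loci affects their separation.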
What is a likelihood ratio distribution?
LR threshold = 1
Unrelated persons: LR < 1
Related persons: LR > 1
Separate distributions
- High probability of shared alleles from common ancestor (e.g., parent-offspring)
- Discriminating loci genotyped
Overlap of likelihood ratio distributions
Overlapping distributions
- Low probability of shared alleles from common ancestor (e.g., first cousin)
- Less discriminating loci genotyped
Overlap of the related and unrelated distributions around the LR threshold = 1: false positives, false negatives, uncertainty
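Given two sets of simulated log10(LR) values, the error rates at a threshold of LR = 1 are simple proportions. A minimal sketch follows. The function name and the example values are hypothetical, and treating a value exactly at the threshold as neither error is an assumption.

```python
def error_rates(unrelated_log10_lrs, related_log10_lrs, log10_threshold=0.0):
    """Return (false positive rate, false negative rate); the default threshold is LR = 1."""
    # False positive: an unrelated pair whose LR supports the relationship
    false_pos = sum(x > log10_threshold for x in unrelated_log10_lrs) / len(unrelated_log10_lrs)
    # False negative: a truly related pair whose LR supports "unrelated"
    false_neg = sum(x < log10_threshold for x in related_log10_lrs) / len(related_log10_lrs)
    return false_pos, false_neg

# Hypothetical log10(LR) values, for illustration only
unrelated = [-3.2, -1.5, -0.4, 0.3, -2.1]
related = [2.4, 0.8, -0.2, 3.1, 1.7]
print(error_rates(unrelated, related))  # (0.2, 0.2)
```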
What kinship questions have we asked with our dataset?
Can D12S391 be used with vWA for kinship analysis?
vWA chr12:6,093,104 – 6,093,253
D12S391 chr12:12,449,874 – 12,450,226
6.3 megabases apart on chromosome 12
UCSC Genome Browser, Feb. 2009 assembly
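The quoted separation can be checked from the two coordinate ranges. The short sketch below computes the gap between the end of vWA and the start of D12S391; the tuple layout and function name are illustrative.

```python
# Coordinates as listed above (UCSC Genome Browser, Feb. 2009 assembly)
VWA = ("chr12", 6_093_104, 6_093_253)
D12S391 = ("chr12", 12_449_874, 12_450_226)

def gap_in_bp(first, second):
    """Distance from the end of the first interval to the start of the second."""
    assert first[0] == second[0], "loci must be on the same chromosome"
    return second[1] - first[2]

gap = gap_in_bp(VWA, D12S391)
print(gap, "bp, about", round(gap / 1e6, 2), "Mb")
```

The result is roughly 6.36 Mb, consistent with the 6.3 megabases quoted on the slide.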
Are vWA and D12S391 independent?
O’Connor KL, et al., Linkage disequilibrium analysis of D12S391 and vWA in U.S. population and paternity samples, Forensic Sci. Int. Genet. (in press)
No!
No!
Budowle B, et al., Population genetic analyses of the NGM STR loci, Int. J. Legal Med. (in press)
Should vWA and D12S391 be multiplied for profile probability calculations in kinship analysis?
How do 13 loci perform for kinship analysis?
Figure: distributions of log10(LR) for unrelated and related pairs in three panels. x-axis: log10(LR), -10 to 14; y-axis: proportion of simulations using U.S. Caucasian allele frequencies, 0 to 0.5; LR threshold = 1.
Parent-Offspring (1000 simulations): unrelated median LR=2.7E-13 (0.668 of simulations below -10); related median LR=10,000
Full Siblings (5000 simulations): unrelated median LR=1.5E-3; related median LR=2400
Half Siblings (1000 simulations): unrelated median LR=0.14; related median LR=6.9
The degree of overlap corresponds with possible values for false positive or false negative results.
Parent-offspring comparisons: No overlap between unrelated and related LR distributions
Full sibling comparisons: False positive rate = 0.027, False negative rate = 0.033
Half sibling comparisons: False positive rate = 0.155, False negative rate = 0.168
Do additional loci improve the discrimination of true relatives vs. unrelated persons?
How do 20 loci perform for kinship analysis?
Figure: distributions of log10(LR) for unrelated and related pairs in three panels. x-axis: log10(LR), -13 to 17; y-axis: proportion of simulations using U.S. Caucasian allele frequencies, 0 to 0.5; LR threshold = 1.
Parent-Offspring (1000 simulations): unrelated median LR=0 (0.991 of simulations below -13); related median LR=9 million
Full Siblings (5000 simulations): unrelated median LR=2.4E-5; related median LR=680,000
Half Siblings (1000 simulations): unrelated median LR=0.036; related median LR=32.5
Parent-offspring comparisons: No overlap between unrelated and related LR distributions
Full sibling comparisons: False positive rate = 0.006, False negative rate = 0.008
Half sibling comparisons: False positive rate = 0.075, False negative rate = 0.104
Additional loci improve separation of LR distributions for parent-offspring and full siblings.
How do 40 loci perform for kinship analysis?
Figure: distributions of log10(LR) for unrelated and related pairs in three panels. x-axis: log10(LR), -17 to 19; y-axis: proportion of simulations using U.S. Caucasian allele frequencies, 0 to 0.4; LR threshold = 1.
Parent-Offspring (1000 simulations): unrelated, all LRs = 0; related median LR=140 billion
Full Siblings (5000 simulations): unrelated median LR=1.4E-8; related median LR=3.1 billion
Half Siblings (1000 simulations): unrelated median LR=0.005; related median LR=300
Parent-offspring comparisons: No overlap between unrelated and related LR distributions
Full sibling comparisons: False positive rate = 0.0006, False negative rate = 0.0018
Half sibling comparisons: False positive rate = 0.051, False negative rate = 0.066
Additional loci further improve separation of LR distributions for parent-offspring and full siblings.
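The error rates reported for 13, 20, and 40 loci can be put side by side to see how much each relationship benefits. The sketch below uses only the rates stated in this talk. The data structure and the fold-reduction summary are illustrative choices, not an analysis from the presentation.

```python
# (false positive rate, false negative rate) at LR threshold = 1, as reported above
RATES = {
    "Full siblings": {13: (0.027, 0.033), 20: (0.006, 0.008), 40: (0.0006, 0.0018)},
    "Half siblings": {13: (0.155, 0.168), 20: (0.075, 0.104), 40: (0.051, 0.066)},
}

def fold_reduction(rates_by_loci, start=13, end=40):
    """How many times smaller each error rate is with `end` loci than with `start` loci."""
    fp = rates_by_loci[start][0] / rates_by_loci[end][0]
    fn = rates_by_loci[start][1] / rates_by_loci[end][1]
    return fp, fn

for relationship, by_loci in RATES.items():
    fp, fn = fold_reduction(by_loci)
    print(f"{relationship}: false positives down {fp:.1f}-fold, "
          f"false negatives down {fn:.1f}-fold from 13 to 40 loci")
```

Going from 13 to 40 loci cuts the full-sibling false positive rate about 45-fold, while the half-sibling rate drops only about 3-fold.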
Do additional loci improve kinship determination?
Review previous data
Figure: log10(LR) distributions compared across 13 CODIS loci, 20 STRs, and 40 STRs in three panels. x-axis: log10(LR), -4 to 21; y-axis: proportion of simulations using U.S. Caucasian allele frequencies.
Parent-offspring: 1000 simulations
Full siblings: 5000 simulations (additional simulations performed for smoother curves)
Half siblings: 1000 simulations
How can uncertainty in kinship determination be reduced?
Improve the measurement technique
• Add more family references
• Add more loci
– Autosomal STRs* improve identification of parent-offspring and full siblings
– Lineage markers or SNP arrays may improve identification of more distant relatives
* More chances for mutation
Nothnagel M, Schmidtke J, Krawczak M. (2010) Potentials and limits of pairwise kinship analysis using autosomal short tandem repeat loci, Int. J. Legal Med. 124(3):205-15.
Know your limits… simulate… validate!
How does kinship analysis relate to questions that concern you?
Expanding the Forensic Core Competency
Standard STR Typing: 13 core loci
Standard STR Typing for Database Search: 13 core loci
Paternity Testing: ~15 loci
Complex Kinship Analysis: Additional loci may be beneficial
Familial Searching: Additional loci may be beneficial
Type of Search: 1-to-1, 1-to-many
Type of Match: Direct, Indirect
Level of Certainty: High, Low
Expanding the Forensic Core Competency to Familial Searching
• One-to-many search leads to many false positives
• Allele sharing method
– Partial match between evidence profile and database profile
– Misses true relatives
• Especially full siblings (1/4 probability of sharing 0 alleles at a locus)
– Introduces many false positives due to chance allele sharing
• Likelihood ratio approach
– Kinship probabilities plus allele frequencies account for allele sharing due to familial relationship
• Reduce uncertainty with additional loci (autosomal STRs, Y-STRs)
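The 1/4 figure for full siblings compounds quickly across loci. As a rough sketch, the code below computes the probability that a true sibling pair shares at least one allele from a common ancestor at every locus, assuming independent loci. It deliberately ignores alleles that match by chance, so the real probability of a partial match at every locus is higher than these values. The function name is hypothetical.

```python
from fractions import Fraction

def prob_ancestral_sharing_at_every_locus(p_zero_shared, n_loci):
    """P(at least one allele shared from a common ancestor at each of n independent loci)."""
    return (1 - p_zero_shared) ** n_loci

# Full siblings: 1/4 probability of sharing 0 alleles at a locus
for n_loci in (13, 20, 40):
    p = prob_ancestral_sharing_at_every_locus(Fraction(1, 4), n_loci)
    print(n_loci, "loci:", float(p))
```

For 13 loci the value is under 3%. This illustrates why a rule that requires a shared allele at every locus can miss true siblings, and why weighting by kinship probabilities and allele frequencies is preferred.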
What is NIST doing to improve kinship analysis?
Allele frequencies for U.S. population samples
Evaluation of new loci
Concordance testing of new multiplexes
Developed a new website to support kinship analysis
See Poster #40 tomorrow
Kinship Resource Page on STRBase: www.cstl.nist.gov/strbase/kinship.htm
NIST Standard Reference Family Data
Aid validation of algorithms, software, and loci selection for kinship analysis
– Use genotypes with known inheritance
– Compare LRs from algebraic and software calculations
– Test algorithms for mutation, rare alleles, null alleles, incest
– Evaluate use of additional loci to detect relationships
See Poster #35 today
Acknowledgments
Applied Genetics
John Butler, Applied Genetics Group Leader
Peter Vallone, DNA Biometrics Project Leader
Kristen Lewis O’Connor
Erica Butts
Becky Hill
Workshops & Textbooks
Concordance & LT-DNA
Rapid PCR & Biometrics
Kinship Analysis
DNA Extraction Efficiency
Funding
NRC – Postdoctoral fellowship support for Kristen O’Connor
FBI – Application of DNA Typing as a Biometric Tool
NIJ – Forensic DNA Standards, Research, and Training
Kinship Page on STRBase: http://www.cstl.nist.gov/strbase/kinship.htm