Human variation – DNA profiling with mtDNA and microsatellites
Eric Yap
Defence Medical Research Institute
Singapore Eye Research Institute
COFM Dept, National University of [email protected]
DNA profiling
• Forensic remains identification
• Types of DNA polymorphisms
• Methods of genotyping polymorphisms
• Short tandem repeats (Microsatellites)
• Mitochondrial DNA
• Osteoarchaeology case study
DNA Profiling for Remains Identification
• Conventional Methods
– Personal effects, tattoos
– Dermal fingerprints
– Dental X-rays
Sept 11 - Pentagon & AA77
188 casualties (124 Pentagon, 64 AA77)
838 samples to AFDIL (588 bones, 285 soft tissue, 22 teeth, 43 hair)
97% reportable data
182 unique DNA profiles, 177 “by name” DNA report
52 has cards, total 183 reference profiles
177 matched, 178th by dental
188 victims - 178 named = 10 unidentified
5 terrorists with no DNA reference
5 Americans with DNA references
Types of DNA Polymorphisms
– repeats (may be invariant)
• long repeats (SINEs, LINEs, Alu)
– length (repeat)
• minisatellite (variable number of tandem repeats)
• microsatellite (short tandem repeats)
– sequence
• minisatellite variant repeats (MVR)
• restriction fragment length polymorphisms
• single nucleotide polymorphisms
Microsatellites
• Repeat unit
– di- (CA)n or (GT)n
– triple repeats
– tetra-nucleotide, penta, etc
• Ubiquitous, well distributed
• Densely mapped in human genome: 7000+ (pre-sequence)
Biological Significance of STR
• Unstable, highly variable
• Conserved across phyla
• Associated with disease:
– fragile chromosome syndromes
– cancer (MSH, MLH mutations)
– degenerative neuromuscular diseases
• Mechanisms
– transcriptional regulation & gene expression
– structural instability
Applications of STR Genotyping
• Positional cloning/Reverse genetics
– linkage studies: genome wide screens
– association studies: candidate genes
• Molecular Diagnostics
– directly or by linkage
• Individual profiling & Human diversity
– remains id, forensics, anthropology
gagagtncttgaaatgttcccaacacacacacactctctctctctcnctctctcacacacacacacacacacacacacacacacacagttttctccctataaaatatgntttccaggcccaatcctctcacaagcatcctagagcagtaaagatttcacatagtgacatcagacttgagtttgattccacctttactattcattcaccatgagactggacaaattagttaacctcagagtctcagtttcctcatctgtgcattgaatataggaaacag
Sequence of D5S434 marker (above), with the CA repeat and primers
[Figure: Genetic map at the Phosphodiesterase A gene locus (scale bar 2 cM) beside an idiogram of chromosome 5, showing markers D5S2033, D5S463, D5S2099, D5S2090, D5S413, D5S434, D5S2013, D5S636 and D5S640 and the gene of Phosphodiesterase A; electrophoresis gel separation with DNA marker]
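The CA repeat in a marker sequence such as D5S434 can also be located programmatically. The following is a minimal Python sketch and not part of the original talk; the function name `repeat_tracts`, the `min_repeats` threshold and the toy sequence are illustrative assumptions.

```python
import re

def repeat_tracts(seq, unit, min_repeats=3):
    """Return (start_index, repeat_count) for every run in which `unit`
    is repeated at least `min_repeats` times. Hypothetical helper."""
    pattern = re.compile(
        f"(?:{re.escape(unit)}){{{min_repeats},}}", re.IGNORECASE
    )
    return [
        (m.start(), len(m.group()) // len(unit))
        for m in pattern.finditer(seq)
    ]

# Toy sequence (not taken from the slide): a (CA)4 tract in unique flanks.
toy = "ggttcacacacaggat"
print(repeat_tracts(toy, "ca"))
```

Ambiguous base calls (`n`) in a read simply interrupt the match, so the tracts on either side of an `n` are reported separately.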
Gel electrophoresis & Detection
• Denaturing polyacrylamide gels
• Detection
– 32P, 33P
– Ag
– Fluorescence (FAM, HEX, TET + size standard)
• Labelling: Incorporation (Double strand) vs Primer end-label (Single strand)
High Throughput Genotyping by Multiplexing on Automated Fluorescent Detection System
[Figure: marker panels labelled D5S2013, D4S3038, D4S1536, D4S3002, D3S3606, RHO, D4S227, D5S434]
High-throughput Genotyping. The 11 markers used were labelled with 3 different fluorescent dyes and multiplexed in 1 lane. Allele sizing was determined by the Genotyper software.
Sizing & Analysis
• Size stds (intralane)
• Controls (“gel shift”)
• Automated sizing, calling of peaks, alleles
• Problems:
– stutter
– ‘+A’
– allele drop-out
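Sizing against intralane standards can be illustrated with a short Python sketch. The slides do not state which sizing algorithm the software used, so plain linear interpolation between the two flanking size-standard peaks is a swapped-in simplification; the function name `size_fragment` and all numbers are hypothetical.

```python
from bisect import bisect_right

def size_fragment(migration, standards):
    """Estimate fragment size (bp) from its migration value.

    `standards` is a list of (migration, size_bp) pairs for the
    size-standard peaks run in the same lane. Assumption: linear
    interpolation between the two flanking standards.
    """
    pts = sorted(standards)
    if len(pts) < 2:
        raise ValueError("need at least two size-standard peaks")
    xs = [m for m, _ in pts]
    if not xs[0] <= migration <= xs[-1]:
        raise ValueError("peak lies outside the size-standard range")
    # Index of the standard just above the peak, clamped to a valid pair.
    i = min(max(bisect_right(xs, migration), 1), len(pts) - 1)
    (x0, y0), (x1, y1) = pts[i - 1], pts[i]
    return y0 + (y1 - y0) * (migration - x0) / (x1 - x0)

# Hypothetical lane: standards of 100, 150 and 200 bp.
lane_standards = [(1000, 100), (1500, 150), (2000, 200)]
print(size_fragment(1250, lane_standards))
```

Because the standards run in the same lane as the sample, lane-to-lane differences in migration do not affect the estimate, which is the point of intralane sizing.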
Recent Developments
• Novel genotyping methods
– Very high throughput CE
– MALDI-TOF Mass Spectrometry
• Single Nucleotide Polymorphisms
– 300,000 SNPs, Maps
– microarray genotyping
Relative Merits of nucDNA with mtDNA
nucDNA | mtDNA
High discrimination | Limited discrimination
Only 2 copies | Multiple copies
Best indirect references - mother, father, offspring | Only indirect references - maternal lineage
Sample - several years | Sample - decades to centuries
Quick turn-around | Lengthy turn-around
Moderate expense | High expense
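Marker Panels
Marker | CODIS | FSS | Interpol | DMRI
Amelogenin | X | X | X | X
D3S1358 | X | X | X | X
TH01 | X | X | X | X
D21S11 | X | X | X | X
D18S51 | X | X | X | X
vWA | X | X | X | X
D8S1179 | X | X | X | X
TPOX | X | | | X
FGA | X | X | X | X
D5S818 | X | | | X
D13S317 | X | | | X
D7S820 | X | | | X
D16S539 | X | X | | X
CSF1PO | X | | | X
Penta D | | | | X
Penta E | | | | X
D2S1338 | | X | |
D19S433 | | X | |
CODIS: FBI Laboratory’s Combined DNA Index System, USA.
FSS: Forensic Science Service, UK.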
DMRI In-house Marker panel
Marker | Chromosome | Gene Name | GenBank Locus | GenBank Access. No. | Repeat Seq
D3S1358 | 3p | | | 11449919 | AGAT complex
TH01 | 11p15.5 | Human tyrosine hydroxylase gene | HUMTH01 | D00269 | AATG
D21S11 | 21q11 – 21q21 | | HUMD21LOC | M84567 | TCTA complex
D18S51 | 18q21.3 | | HUMUT574 | L18333, X91254 | AGAA
Penta D | 21q | | | AC001752 | AAAGA
D5S818 | 5q23.3 – 32 | | | G08446 | AGAT
D13S317 | 13q22 – q31 | | | G09017 | TATC
D7S820 | 7q11.21 – 22 | | | G08616 | GATA
D16S539 | 16q24 – qter | | | G07925 | GATA
CSF1PO | 5q33.3 - 34 | Human c-fms proto-oncogene for CSF-1 receptor gene | HUMCSF1PO | U63963 | AGAT
Penta E | 15q | | | AC027004 | AAAGA
Amelogenin | Xp22.1 – 22.3 & Y | Human Y chromosomal gene for Amelogenin-like protein | HUMAMEL | M55419 |
vWA | 12p12 – pter | Human von Willebrand factor gene | HUMVWFA31 | M25858 | TCTA complex
D8S1179 | 8q | | | G08710 | TCTA
TPOX | 2p23 – 2pter | Human thyroid peroxidase gene | HUMTPOX | M68651 | AATG
FGA | 4q28 | Human fibrinogen alpha chain gene | HUMFIBRA | M64982 | TTTC complex
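Discrimination Power
• Random Match Probability (RMP)
$RMP = \sum_i f_i^2$
where the sum is taken over all haplotypes $i$ observed and $f_i$ is the observed frequency (count divided by $n$) of haplotype $i$
• Genetic diversity h
$h = \left(1 - \sum_i f_i^2\right) \frac{n}{n-1}$
where $n$ is the number of haplotypes sampled (the sample size)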
Discrimination Power
Matching Probabilities
American populations: African, Caucasian, Hispanic, Asian
Promega PP16 (15 markers): 7.09 x 10^-19, 5.46 x 10^-18, 3.41 x 10^-18, 2.67 x 10^-18
ABI Profiler (2 panels, 15 markers): 4.66 x 10^-18, 2.23 x 10^-17
Singaporean populations: Chinese, Malay, Indian, Combined
DMRI Panel (15 markers): 1.16 x 10^-17, 5.60 x 10^-18, 2.57 x 10^-18, 1.01 x 10^-18
Categories of Disease Causation
adapted from Ward RH 1979: Social Biology 27:87-100
Category: Examples
• Chromosomal
– aneuploidy: Down’s syndrome
– structural: Prader-Willi syndrome
• Single-gene: Inborn errors of metabolism
• Mitochondrial: musculoskeletal/neurodegenerative syn
• Multifactorial
– high heritability: cleft lip / palate, neural tube defects
– low heritability: coronary artery disease, asthma etc
• Infectious: viral, bacterial, fungal, protozoal pathogens
• Environmental: physical (radiation, trauma), chemical (drug, pollutant, occupational), nutritional
Mendelian
Complex
Morbidity of Genetic Disease
TYPE | LIFETIME FREQ
Chromosomal | 3.8 per 1000
Single gene | 20 per 1000
Multifactorial | 646 per 1000
Emory & Rimoin (1997)
Cytogenetic Diagnostic Tests
• Congenital abnormalities, dysmorphic syndromes, short stature
• Delayed development, mental retardation
• Recurrent, unexplained miscarriages, subfertility
• Ambiguous/abnormal genitalia, amenorrhoea
• Down’s (21), Edward’s (18 or E1), Patau’s (13 or D1) syndrome (advanced maternal age,
• Fragile (breakage) chromosomes
• Angelman and Prader-Willi syndromes
• Haemopoietic neoplasia (NHL, CLL)
Correlation of Allele distance and Peak height Ratio
[Figure: two plots of Ratio of Ave. Peak Height, one against Distance of Alleles (bp) and one against No. of Allele Repeats (per 4 bp)]
Chr 21 Trisomy
Peak Ht Characteristic of genotype
[Figure panels: D21S11, D21S1412]
Marker | Genotype | Peak Ht Ratio
D21S11 | 221, 225 | 0.686 : 1
D21S1412 | 406, 410, 414 | 1.225 : 1.185 : 1
D21S1413 | 156, 176 | 0.816 : 1
D21S1411 | 284, 300 | 0.576 : 1
Conclusion - Utility in trisomy diagnosis
Mitochondrial DNA
• 16,569 bp, circular double-stranded DNA
• 37 genes:
> 2 rRNAs,
> 22 tRNAs &
> 13 mRNAs that encode 13 polypeptides for OXPHOS
Polymorphisms between Cambridge reference sequence (Anderson et al 1981) & a local Chinese sample
mtDNA position | Ref Seq | Local Chinese sample | Fragment
16086 | T | C | HV1
16129 | G | A | HV1
16192 | C | T | HV1
16223 | C | T | HV1
16297 | T | C | HV1
73 | A | G | HV2
150 | C | T | HV2
199 | T | C | HV2
263 | A | G | HV2
311 | - | Insertion C | HV2
verify with reverse sequences
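A difference list of this kind can be produced from a pairwise alignment of the sample against the reference. The Python sketch below is illustrative only: it assumes the two sequences are already aligned (equal length, `-` marking gaps), the function name `list_differences` is hypothetical, and reporting an insertion at the preceding reference position is a convention chosen here, not necessarily the one used in the talk.

```python
def list_differences(ref_aligned, sample_aligned, start_pos):
    """List (position, ref_base, sample_call) for each difference.

    Both inputs must be pre-aligned strings of equal length, with '-'
    for gaps. `start_pos` is the reference coordinate of the first
    reference base. Assumption: an insertion is reported at the last
    reference position before it.
    """
    if len(ref_aligned) != len(sample_aligned):
        raise ValueError("aligned sequences must have equal length")
    diffs = []
    pos = start_pos - 1  # coordinate of the last reference base consumed
    for r, s in zip(ref_aligned.upper(), sample_aligned.upper()):
        if r != "-":
            pos += 1
        if r == s:
            continue
        if r == "-":
            diffs.append((pos, "-", f"Insertion {s}"))
        elif s == "-":
            diffs.append((pos, r, "Deletion"))
        else:
            diffs.append((pos, r, s))
    return diffs

# Toy alignment (not real mtDNA): one substitution, then one inserted C.
for row in list_differences("AC-GT", "ATCGT", start_pos=100):
    print(row)
```

Running the same comparison on the reverse-strand read and checking that the calls agree is one way to act on the slide's reminder to verify with reverse sequences.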
Phylogenetic Analysis of mtDNA CR region of 101 DNAs
[Figure: phylogenetic tree of the CR sequences. Leaves are labelled with sample IDs carrying the suffix -CR (numeric IDs such as 146-CR and 206-CR, plus Rita-CR, QL0299113-CR, QL0299115-CR, QL0299116-CR and QL0299136-CR), together with CRS, NIST_9947A-CR and NIST_CHR-CR. Highlighted labels: CRS, NIST_CHR, NIST_9947A, 136, QL136]
No. of haplotypes | Occurrence
1 | 3
3 | 2
88 | 1
QC:
• 2 NIST ref samples produced concordant seq.
• Internal std (QL136) reproduced seq
CR region = 1197 bp
Random Match Probability = 0.0116
Genetic Diversity = 0.9987114
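The reported Random Match Probability and Genetic Diversity can be checked against the formulas on the Discrimination Power slide. The Python sketch below is not from the talk. It assumes the haplotype table reads as 1 haplotype seen 3 times, 3 haplotypes seen twice and 88 seen once (97 sequences in total), and that n in the diversity formula is that total; the function name `rmp_and_diversity` is hypothetical.

```python
def rmp_and_diversity(counts):
    """Return (RMP, h) from a list of per-haplotype occurrence counts.

    RMP = sum of squared haplotype frequencies.
    h   = (1 - RMP) * n / (n - 1), with n the number of sequences sampled.
    """
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two sampled sequences")
    rmp = sum((c / n) ** 2 for c in counts)
    h = (1 - rmp) * n / (n - 1)
    return rmp, h

# Assumed reading of the haplotype table: 1 x 3, 3 x 2, 88 x 1.
counts = [3] * 1 + [2] * 3 + [1] * 88
rmp, h = rmp_and_diversity(counts)
print(f"n = {sum(counts)}, RMP = {rmp:.4f}, h = {h:.6f}")
```

Under that reading the sketch gives an RMP that rounds to 0.0116 and a diversity of about 0.998711, which agrees with the slide's figures to roughly six decimal places.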