Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)
1000 Genomes - A deep catalog of Human Genetic Variation
![Page 1: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/1.jpg)
Integrative analysis in 1000 Genomes data
Fuli Yu
BioInfoSummer, Adelaide Australia
2012
![Page 2: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/2.jpg)
Outline
• Background overview of 1000G
• 1000G Phase I results
• BCM NGS variation analysis software
• Further development and timeline
![Page 3: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/3.jpg)
The history before the 1000 Genomes Project
- Phase I and II: common SNPs in CEU, CHB, JPT, YRI
- HapMap3: 11 populations
- Patterns of linkage disequilibrium and haplotypes defined genome-wide
www.hapmap.org
Impacts
• Complex disease gene mapping: GWAS
• Characteristics of human genome variants: allele frequency spectrum, LD patterns, recombination rate variation…
• Population genetics: selection, migration, drift, admixture
1,449 published GWA at p ≤ 5×10⁻⁸ for 237 traits
![Page 4: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/4.jpg)
Disease mutations are likely rare and heterogeneous
McClellan J and King M-C, 2010
‘Clan Genomics’ Lupski JR et al. 2011
![Page 5: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/5.jpg)
The quest for rare genetic variation
Gibbs R 2005
HapMap
1000G
![Page 6: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/6.jpg)
Project goal
“…sequence a large number of people, to provide a comprehensive resource on human genetic variation…”
“…find most genetic variants that have frequencies of at least 1% in the populations studied…”
www.1000genomes.org
![Page 7: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/7.jpg)
1000 Genomes Project Design and Progress
• Pilot data collected in 2008; paper published October 2010 in Nature
– Companions in Science and Genome Research
– Other companions later
• Full project data collection and analysis underway
– Phase 1 results published Nov 1st 2012
– Phase 2 / Phase 3 being completed
• Sequencing completion - early 2013
– Analysis completion in 2013-2014
![Page 8: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/8.jpg)
Nature, Oct 2010
-179 WGS, 700 exon seq
-15M new SNPs
-CNV group
-Exon group
![Page 9: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/9.jpg)
![Page 10: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/10.jpg)
1000G Phase I populations
![Page 11: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/11.jpg)
Mark DePristo
![Page 12: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/12.jpg)
An integrative map of 40 million variants
[Figure: low-pass genomes and deep exomes combined into integrated genotypes]
• SNPs: 38M
• INDELs: 1.4M
• SVs: 14k
• Integrated genotypes: ~40M
Hyun Min Kang
![Page 13: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/13.jpg)
![Page 14: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/14.jpg)
Discovery power
• 1% SNPs
– 99.3% genome / 99.8% exome
• 0.1% SNPs
– 70% genome / 90% exome
• Exome: high r² > 0.9
• With LD information, WGS genotype accuracy
– improves by 30-40% for MAF ≥ 1%
– is unchanged for MAF < 0.1%
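One ingredient of discovery power is simply whether a variant is present in the sample at all. The sketch below is not how the project estimated the figures above; it only illustrates, under a simple independent-sampling assumption, the chance that a variant of population frequency `freq` appears at least once among `2 × n_samples` chromosomes. The function name and the use of 1094 samples (the number of low-pass BAMs quoted later in the talk) are choices made here for illustration.

```python
def presence_probability(freq: float, n_samples: int) -> float:
    """Probability that an allele of population frequency `freq` is carried
    by at least one of the 2 * n_samples sampled chromosomes.

    Assumes chromosomes are sampled independently (no relatedness or
    population structure), which is a simplification.
    """
    n_chromosomes = 2 * n_samples
    return 1.0 - (1.0 - freq) ** n_chromosomes


if __name__ == "__main__":
    # 1094 is an illustrative sample size, not a statement about the
    # project's power calculation.
    for freq in (0.01, 0.001):
        p = presence_probability(freq, n_samples=1094)
        print(f"frequency {freq:.3%}: present in sample with probability {p:.3f}")
```

Presence in the sample is necessary but not sufficient: at low coverage a variant must also be covered by enough reads in its carriers, which is one reason rare-variant power is higher in the deep exome data than in the low-pass genomes.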
![Page 15: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/15.jpg)
Phase 1 variants are of high quality
![Page 16: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/16.jpg)
Overall genotype accuracy at ~99%
Hyun Min Kang
![Page 17: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/17.jpg)
Hyun Min Kang
Sensitivity >96% in a given genome
![Page 18: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/18.jpg)
Rare variation is population specific
• 17% of low-frequency variants (0.5-5%) are found in a single ancestry group
• 53% of rare variants (<0.5%) are found in a single population
• African populations have many more low-frequency variants due to bottlenecks on other lineages
• All populations are enriched in rare variants
– Explosive recent population growth
Slide Courtesy of Paul Flicek Adam Auton, Gil McVean
![Page 19: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/19.jpg)
Rare variants identify recent historical links between populations
48% of IBS variants shared with American populations
ASW shows stronger sharing with YRI than LWK
Adam Auton, Gil McVean
![Page 20: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/20.jpg)
The proportion of rare variants by conservation
Tuuli Lappalainen
![Page 21: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/21.jpg)
![Page 22: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/22.jpg)
![Page 23: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/23.jpg)
Implication for GWAS imputation
Bryan Howie, Hyun Min Kang
![Page 24: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/24.jpg)
BCM NGS PIPELINES: ATLAS2 & SNPTOOLS
![Page 25: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/25.jpg)
Overview of NGS variation analysis pipelines
Nielsen R 2011
SNPTools Atlas2
![Page 26: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/26.jpg)
Atlas uses logistic regression: systematic errors

$$\log\frac{\Pr(\mathrm{SNP})_i}{1-\Pr(\mathrm{SNP})_i} = \alpha + b_1\,\mathrm{RawQuality} + b_2\,\mathrm{Swap} + b_3\,\mathrm{NQS} + b_4\,\mathrm{Dist}$$

| Item | Value derived from our training experiment | Z score | Significance (p-value) |
|---|---|---|---|
| Intercept α | -3.3 | -39 | <2e-16 |
| Coefficient b1 for raw quality score | 0.11 | 19 | <2e-16 |
| Coefficient b2 for swap | -3.5 | 28 | <2e-16 |
| Coefficient b3 for NQS | 0.26 | 3 | 0.001 |
| Coefficient b4 for relative position | -0.37 | -4 | 0.0005 |

[Figure: reads i = 1, 2, …, n aligned to the reference sequence across sites j = 1 (0/1), 2 (0/0), …, m (0/0); reads harboring reference alleles vs. reads harboring substitutions]
Shen et al. 2010 Genome Research
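The per-read model on this slide can be sketched in a few lines. This is a minimal illustration, not the Atlas-SNP2 code: the coefficients are copied from the table on the slide, while the function name and the feature encodings (`swap` and `nqs` as 0/1 flags, `dist` as a relative position between 0 and 1) are assumptions made here.

```python
import math

# Coefficients as listed in the slide's training table.
ALPHA = -3.3          # intercept
B_RAW_QUALITY = 0.11  # raw base quality score
B_SWAP = -3.5         # swap indicator
B_NQS = 0.26          # neighborhood quality standard indicator
B_DIST = -0.37        # relative position of the base in the read


def pr_snp_for_read(raw_quality: float, swap: int, nqs: int, dist: float) -> float:
    """Pr(SNP)_i for one read carrying a substitution.

    The feature encodings are hypothetical; only the functional form
    (a logistic regression on four covariates) comes from the slide.
    """
    logit = (ALPHA
             + B_RAW_QUALITY * raw_quality
             + B_SWAP * swap
             + B_NQS * nqs
             + B_DIST * dist)
    # Invert log(p / (1 - p)) = logit.
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    print(pr_snp_for_read(raw_quality=30, swap=0, nqs=1, dist=0.5))
    print(pr_snp_for_read(raw_quality=30, swap=1, nqs=1, dist=0.5))
```

With these coefficients a swap flag lowers the probability sharply, which is the kind of systematic error the slide title refers to.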
![Page 27: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/27.jpg)
Posterior Pr(SNP) using Bayes' rule
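[Figure: reads i = 1, 2, …, n aligned to the reference sequence across sites j = 1 (0/1), 2 (0/0), …, m (0/0); reads harboring reference alleles vs. reads harboring substitutions]

$$\Pr(\mathrm{error})_i = 1 - \Pr(\mathrm{SNP})_i$$

$$\Pr(\mathrm{error})_j = \prod_i \Pr(\mathrm{error})_i$$

$$\Pr(\mathrm{SNP})_j = 1 - \Pr(\mathrm{error})_j = S_j$$

$$\Pr(\mathrm{SNP} \mid S_j, c) = \frac{\Pr(S_j \mid \mathrm{SNP}, c)\,\mathrm{prior}(\mathrm{SNP} \mid c)}{\Pr(S_j \mid \mathrm{SNP}, c)\,\mathrm{prior}(\mathrm{SNP} \mid c) + \Pr(S_j \mid \mathrm{error}, c)\,\mathrm{prior}(\mathrm{error} \mid c)}$$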
Shen et al. 2010 Genome Research
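The two steps on this slide, combining per-read probabilities into a site score S_j and then applying Bayes' rule, can be sketched as follows. The likelihoods Pr(S_j | SNP, c) and Pr(S_j | error, c) are passed in as plain numbers because the slide does not say how they are obtained; the function names and the example values are hypothetical.

```python
from math import prod


def site_score(read_probs):
    """S_j = 1 - prod_i (1 - Pr(SNP)_i) over the reads carrying the substitution."""
    return 1.0 - prod(1.0 - p for p in read_probs)


def posterior_snp(lik_snp: float, lik_error: float, prior_snp: float) -> float:
    """Pr(SNP | S_j, c) by Bayes' rule.

    lik_snp   : Pr(S_j | SNP, c), supplied by the caller (assumed known)
    lik_error : Pr(S_j | error, c), supplied by the caller (assumed known)
    prior_snp : prior(SNP | c); prior(error | c) is taken as 1 - prior_snp
    """
    prior_error = 1.0 - prior_snp
    numerator = lik_snp * prior_snp
    return numerator / (numerator + lik_error * prior_error)


if __name__ == "__main__":
    s_j = site_score([0.5, 0.5])  # two supporting reads
    print(s_j)
    # Illustrative likelihoods and prior, not values from the talk.
    print(posterior_snp(lik_snp=0.9, lik_error=0.1, prior_snp=0.1))
```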
![Page 28: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/28.jpg)
Exome data summary
• 1128 (822 Illumina/306 SOLiD) samples in 20110521.alignment.index
– 822 Illumina BAMs
• MOSAIK
– 306 SOLiD BAMs
• BFAST
• SNPs are called using Atlas-SNP2 at BCM
![Page 29: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/29.jpg)
Exome SNP calls on consensus target regions

| Call set | # SNPs | dbSNP | Ti/Tv |
|---|---|---|---|
| Intersection | 238,356 | 48.5% | 3.35 |
| Baylor Exome unique | 218,739 | 8.2% | 2.97 |
| VQSR v2b unique | 23,096 | 15.3% | 2.67 |

| Platform | # Samples | # SNPs | % dbSNP b132 | Known Ti/Tv (merged / per-sample) | Novel Ti/Tv (merged / per-sample) |
|---|---|---|---|---|---|
| Illumina + SOLiD | 1128 | 457,095 | 29.23% | 3.47 / 3.41 | 3.05 / 2.97 |
| SOLiD | 306 | 244,736 | 42.05% | 3.54 / 3.51 | 3.19 / 3.03 |
| Illumina | 822 | 348,599 | 35.94% | 3.46 / 3.37 | 2.99 / 2.95 |
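The Ti/Tv columns above are the ratio of transitions (A↔G, C↔T) to transversions (all other substitutions) in a call set. A minimal sketch of the calculation, with a hypothetical function name and a toy input, looks like this:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    pair = {ref.upper(), alt.upper()}
    return pair <= PURINES or pair <= PYRIMIDINES


def ti_tv_ratio(snps) -> float:
    """Ti/Tv for an iterable of (ref, alt) single-base substitutions.

    Assumes every pair is a true SNP (ref != alt, both in ACGT).
    """
    snps = list(snps)
    transitions = sum(is_transition(ref, alt) for ref, alt in snps)
    transversions = len(snps) - transitions
    return transitions / transversions if transversions else float("inf")


if __name__ == "__main__":
    toy_calls = [("A", "G"), ("C", "T"), ("A", "C")]
    print(ti_tv_ratio(toy_calls))
```

Because each base has one possible transition and two possible transversions, random errors pull the ratio toward 0.5, which is why a lower Ti/Tv in a set of unique calls is commonly read as a sign of more false positives.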
![Page 30: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/30.jpg)
SNPTools pipeline overview
Raw Sequence Reads (FASTQ) → Short Reads Mapping → Base Quality Recalibration → Binary sequence Alignment/Map Files (BAM)
Effective Base Depth
• Novel Effective Base Depth (EBD) summarization for each BAM
• High-performance IO, small disk footprint (1~2 GB per BAM)
SNP Site Discovery
• Novel variance-ratio-based site discovery statistics
• High sensitivity and specificity
Sequence Genotype Likelihood
• Novel BAM-specific binomial mixture modeling (BBMM)
• Captures BAM heterogeneity
Existing Genotype Integration
• ‘Dynamic linking’ of multiple existing genotype datasets in a Bayesian style
• Improves both existing genotypes and sequence calls significantly
Genotype Imputation
• Novel imputation engine
• High genotyping and phasing accuracy
Haplotype with Confidence Score (VCF) → Downstream Analysis
![Page 31: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/31.jpg)
EBD file format
![Page 32: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/32.jpg)
New algorithm for Genotype Likelihood
• Challenges in Raw Genotype Likelihood
1. Mapping/sequencing errors in site discovery
2. BAM heterogeneity, potential contamination
• Solutions
1. Novel concept of Effective Base Depth (EBD) to summarize sequence details
2. BAM-specific binomial mixture model handles BAM heterogeneity
![Page 33: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/33.jpg)
Rationale
• BAM-specific modeling
– Using whole-genome VQSR sites
– Perform 3-component BBMM on each BAM using Phase I VQSR (38M) SNP sites
– High-precision modeling with 38M data points!
– Enables SNP-array-free QC on individual BAMs

[Figure: matrix of 1094 BAMs × 39M VQSR SNP sites, modeled either per site or per BAM]
• Site-specific modeling: small learning size, BAM heterogeneity, low accuracy for alt/alt
• BAM-specific modeling: huge learning size, high accuracy for alt/alt, usable as one QC metric

$$P(r_i) = \sum_{g \,\in\, \{rr,\ ra,\ aa\}} w_g \, B(r_i + a_i,\ e_g)$$
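A three-component binomial mixture of this form can be sketched as below. The reading of the symbols is an assumption made here, since the slide does not define them: `r` and `a` are taken as reference and alternate read counts at a site, `e_g` as the expected reference-allele fraction under genotype g, and `w_g` as the mixture weight of that genotype within one BAM. The example parameter values are hypothetical, and the sketch uses integer counts even though effective base depths need not be integers.

```python
from math import comb

GENOTYPES = ("rr", "ra", "aa")


def binom_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability of k successes in n trials with success rate p."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def genotype_likelihoods(r: int, a: int, ref_fracs: dict) -> dict:
    """Per-genotype likelihood of seeing r reference reads out of r + a."""
    return {g: binom_pmf(r, r + a, ref_fracs[g]) for g in GENOTYPES}


def mixture_probability(r: int, a: int, weights: dict, ref_fracs: dict) -> float:
    """P(r) = sum_g w_g * B(r; r + a, e_g) for one site in one BAM."""
    likelihoods = genotype_likelihoods(r, a, ref_fracs)
    return sum(weights[g] * likelihoods[g] for g in GENOTYPES)


if __name__ == "__main__":
    # Hypothetical per-BAM parameters; in practice they would be fitted
    # (for example by EM) over all sites in the BAM.
    ref_fracs = {"rr": 0.99, "ra": 0.5, "aa": 0.01}
    weights = {"rr": 0.90, "ra": 0.08, "aa": 0.02}
    print(genotype_likelihoods(2, 2, ref_fracs))
    print(mixture_probability(2, 2, weights, ref_fracs))
```

Fitting `weights` and `ref_fracs` separately for each BAM is what lets the model absorb platform and library differences instead of assuming the same error behavior everywhere.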
![Page 34: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/34.jpg)
BBMM overcomes platform heterogeneity
![Page 35: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/35.jpg)
SOLiD GL: BBMM better than Samtools
HM3
OMNI
Hyun Min Kang Univ Mich
![Page 36: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/36.jpg)
Improvement of using BBMM GL also seen in Beagle
Hyun Min Kang Univ Mich
![Page 37: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/37.jpg)
SNPTools Imputation – ‘Constraint Li-Stephens’
![Page 38: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/38.jpg)
Phase I Genotypes: Chr1, Chr20 (released 2011-05-08)
| Call set | OMNI AA | OMNI RA | OMNI RR | OMNI non-ref | HM3 AA | HM3 RA | HM3 RR | HM3 non-ref | Axiom AA | Axiom RA | Axiom RR | Axiom non-ref |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| chr1 | 1.03 | 1.02 | 0.19 | 1.43 | 1.64 | 0.86 | 0.21 | 1.43 | 0.85 | 1.38 | 0.19 | 1.51 |
| chr20 | 1.02 | 1.18 | 0.23 | 1.60 | 1.22 | 0.88 | 0.25 | 1.30 | 1.33 | 1.48 | 0.22 | 1.85 |
| chr20 V4 | 1.33 | 1.21 | 0.37 | 2.02 | 1.20 | 0.83 | 0.25 | 1.26 | 1.36 | 1.45 | 0.21 | 1.83 |
| chr20* | 0.99 | 1.17 | 0.22 | 1.57 | 1.18 | 0.88 | 0.25 | 1.28 | 1.23 | 1.47 | 0.21 | 1.79 |
| chr20 V4* | 1.01 | 1.11 | 0.22 | 1.52 | 1.18 | 0.83 | 0.24 | 1.25 | 1.24 | 1.44 | 0.21 | 1.77 |

• chr1 and chr20 are based on new VQSR sites
• chr20 V4 is based on old VQSR sites
• chr20* and chr20 V4* are the overlapped sites between new VQSR and old VQSR

Chr20 genotype call set
• Better OMNI concordance than V4 due to site/allele selection improvement
• Similar accuracy on overlapped sites
Chr1 genotype call set
• Slightly better than chr20 call set
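The slide does not state the unit of the table. If the figures are read as percent discordance against each array, split by the array genotype class (AA, RA, RR) plus a non-reference summary, a comparison of that kind could be computed as sketched below. The function name, the input layout, and the definition of non-reference discordance (sites where either the call or the array genotype is not RR) are assumptions made for illustration.

```python
def discordance_by_class(calls, array):
    """Fraction of discordant genotypes per array genotype class.

    calls, array: equal-length sequences of 'RR', 'RA' or 'AA' for the same
    sample/site pairs, with `array` treated as the reference standard.
    """
    if len(calls) != len(array):
        raise ValueError("calls and array must have the same length")

    stats = {}
    for genotype in ("AA", "RA", "RR"):
        idx = [i for i, g in enumerate(array) if g == genotype]
        wrong = sum(calls[i] != genotype for i in idx)
        stats[genotype] = wrong / len(idx) if idx else float("nan")

    # Non-reference discordance: restrict to pairs where at least one side
    # carries an alternate allele, so the many RR/RR matches do not dominate.
    nonref = [i for i in range(len(array)) if array[i] != "RR" or calls[i] != "RR"]
    wrong = sum(calls[i] != array[i] for i in nonref)
    stats["non-ref"] = wrong / len(nonref) if nonref else float("nan")
    return stats


if __name__ == "__main__":
    array_genotypes = ["RR", "RR", "RA", "AA"]
    sequence_calls = ["RR", "RA", "RA", "RA"]
    print(discordance_by_class(sequence_calls, array_genotypes))
```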
![Page 39: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/39.jpg)
Phasing accuracy evaluation
![Page 40: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/40.jpg)
Integrating known array genotypes
[Figure: sample × sites matrix of raw genotype probabilities, with known genotypes overlaid]
• Direct re-weighting of overall accuracy. Improvement is in proportion to the number of known genotypes integrated.
• Imputation improvement of on-array accuracy. Known genotypes are treated as 99.98% confidence priors, which leaves them still improvable.
• Imputation improvement of off-array accuracy. Makes full use of the LD between on-array and off-array sites.
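Treating a known array genotype as a 99.98% confidence prior, rather than as a fixed truth, can be sketched as a simple re-weighting of the sequence-based genotype likelihoods. The function name and the even split of the remaining 0.02% across the other two genotypes are assumptions made here; the slide gives only the 99.98% figure.

```python
def integrate_known_genotype(likelihoods: dict, known: str,
                             confidence: float = 0.9998) -> dict:
    """Posterior genotype probabilities after applying an array-based prior.

    likelihoods : sequence-based likelihood per genotype, e.g. {'rr': ..., 'ra': ..., 'aa': ...}
    known       : genotype reported by the array (must be a key of `likelihoods`)
    confidence  : prior mass placed on the array genotype
    """
    if known not in likelihoods:
        raise KeyError(f"unknown genotype: {known}")
    others = [g for g in likelihoods if g != known]
    # Assumption: the residual prior mass is shared evenly by the other genotypes.
    prior = {g: (1.0 - confidence) / len(others) for g in others}
    prior[known] = confidence

    unnormalized = {g: likelihoods[g] * prior[g] for g in likelihoods}
    total = sum(unnormalized.values())
    return {g: value / total for g, value in unnormalized.items()}


if __name__ == "__main__":
    # Weak sequence evidence: the array genotype dominates.
    print(integrate_known_genotype({"rr": 0.2, "ra": 0.5, "aa": 0.3}, known="rr"))
    # Very strong contrary sequence evidence: the array genotype can be overruled.
    print(integrate_known_genotype({"rr": 1e-6, "ra": 0.999, "aa": 0.001}, known="rr"))
```

Keeping the prior slightly below 100% is what allows overwhelming sequence evidence to correct an occasional array error.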
![Page 41: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/41.jpg)
Integrating LowPass + ExomeOffTarget
![Page 42: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/42.jpg)
Exome off-target reads are evenly distributed
![Page 43: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/43.jpg)
Exome off-target reads improve sensitivity
•~5% improved sensitivity in off targets
![Page 44: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/44.jpg)
1000G NEW DEVELOPMENT & TIMELINE TO COMPLETION
![Page 45: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/45.jpg)
![Page 46: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/46.jpg)
1000G Phase 2/3 populations
ACB, CDX, GIH, KHV, PEL, CHD, GWD, MSL, ESN, PJL, BEB, STU, ITU
![Page 47: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/47.jpg)
Overview of AFR Phase 2 Call Set Sizes (chr20)
[Figure: three bar-chart panels (SNPs; Indels/Cplxsubs; MNPs) of call set sizes for BCM, BI1, LU, SI1, UM, BC, BI2, OX1, OX2, SI2, grouped into alignment-based and assembly-based call sets. Bar labels as extracted: 195K, 511K, 491K, 480K, 481K, 362K, 460K, 452K, 252K, 0, 17K, 0, 0, 48K, 42K, 42K, 44K, 49K, 46K, 28K, 0, 0, 0, 0, 0, 4K, 8K, 19K, 3K, 206]
Adrian Tan, Hyun Min Kang
![Page 48: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/48.jpg)
A timeline
• Data generation (incl. LC, exome, CG, SNP arrays) by end of March
• Final alignment index from DCC by start of June
• Contributing call sets (SNP, indel, MNP, complex, SV) by end of July
• Consensus and resolved site list with GLs by end of August
• Integrated haplotypes by ASHG 2013
Gil McVean
![Page 49: Integrative Analysis in 1000 Genomes - BioInfoSummer 2012 (Fuli Yu)](https://reader034.vdocuments.mx/reader034/viewer/2022042814/55504d77b4c90580748b52bb/html5/thumbnails/49.jpg)
Acknowledgements
BCM-HGSC
• Yi Wang: SNPTOOLS
• Jin Yu: Atlas-SNP
• Danny Challis: Atlas-INDEL
• Uday Evani: VCFPRINTER
• Matthew Bainbridge
• Donna Muzny
• Jeffrey Reid
• Richard Gibbs
Boston College
• Gabor Marth
• Amit Indap
• Wen Fung Leong
• Alistair Ward
Broad Institute
• Mark DePristo
• Ryan Poplin
• Eric Banks
Stanford University
• Simon Gravel
• Carlos Bustamante
Univ of Michigan
• Goncalo Abecasis
• Hyun Min Kang
BCM-BRL
• Andrew R. Jackson
• Sameer Paithankar
• Cristian Coarfa
• Aleksandar Milosavljevic
BlueBioU@Rice University
• Kim Andrews
• Roger Moye
• Chandler Wilkerson