[Figure: Bionano workflow. (1) Extraction of long DNA molecules from blood, cells, tissue, or microbes. (2) Label DNA at specific sequence motifs. (3) Saphyr Chip linearizes DNA in NanoChannel arrays. (4) Saphyr automates imaging of single molecules in NanoChannel arrays. (5) Molecules and labels detected in images. (6) Bionano Access software assembles optical maps.]
[Figure: DNA conformation. Free DNA in solution: Gaussian coil. DNA in a microchannel: partially elongated. DNA in a nanochannel: linearized. Axis label: Position (kbp).]
Methods
(1) Long molecules of DNA are labeled with Bionano reagents by (2) incorporation of fluorophores at a specific sequence motif throughout the genome. (3) The labeled genomic DNA is then linearized in the Saphyr Chip using NanoChannel arrays. (4) Single molecules are imaged by Saphyr and then digitized. (5) Molecules are uniquely identifiable by their distinct distribution of sequence motif labels, (6) and are then assembled by pairwise alignment into de novo genome maps.
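As a rough illustration of steps (2) and (5), the sketch below finds every occurrence of a motif in a sequence and reports the distances between consecutive labels, which is the kind of pattern that makes a molecule identifiable. The motif `CTTAAG`, the function names, and the toy sequence are assumptions made for illustration; the poster does not state which motif or enzyme was used, and real labeling acts on both strands of the DNA.

```python
def label_positions(sequence, motif):
    """Return 0-based start positions of every motif occurrence (overlaps allowed)."""
    sequence = sequence.upper()
    motif = motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


def label_intervals(positions):
    """Return distances between consecutive label positions."""
    return [right - left for left, right in zip(positions, positions[1:])]


# Toy sequence built from three motif copies separated by filler bases.
MOTIF = "CTTAAG"  # hypothetical motif, chosen only for this example
toy_sequence = "AA" + MOTIF + "GGGG" + MOTIF + "TTTTTTTT" + MOTIF + "A"

positions = label_positions(toy_sequence, MOTIF)
intervals = label_intervals(positions)
print(positions)  # [2, 12, 26]
print(intervals)  # [10, 14]
```

Two molecules from the same locus would be expected to share a similar run of intervals, which is what a pairwise alignment step could compare.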
Comprehensive Detection of Germline and Somatic Structural Mutation in Cancer Genomes
by Bionano Genomics Optical Mapping
Conclusions
We demonstrate that the Saphyr system can be used to accurately detect hallmark genetic mutations in cancer samples. These include large rearrangements, ranging from translocations and intrachromosomal fusions to copy number alterations. Researchers can perform mapping experiments to uncover somatic variants by comparing against our control sample database and a matched non-tumor sample. Furthermore, our molecule mapping approach enables us to identify mutations at low allelic fraction. Our results indicate that Saphyr can capture a broad spectrum of functionally important variation and can provide a straightforward solution for cancer studies.
Acknowledgements
We would like to thank our collaborators Wigard Kloosterman, Jose Espejo Valle-Inclan and Edwin Cuppen for the samples, input on the study design, and advice on the analysis.
Abstract
In cancer genetics, the ability to identify constitutive and low-allelic-fraction structural variants (SVs) is crucial. Conventional karyotyping and cytogenetics approaches are manually intensive. Microarrays and short-read sequencing cannot reliably call variants in segmental duplications and repeats, often miss balanced variants, and have trouble finding low-frequency mutations.
We describe the use of Bionano Genomics’ Saphyr platform to comprehensively identify SVs for studying cancer genomes. DNA molecules longer than 100 kbp are extracted, labeled at specific motifs, and linearized through NanoChannel arrays for visualization. Molecule images are digitized and de novo assembled, creating megabase-long Bionano genome maps. Somatic mutations can be identified by running the variant annotation pipeline, which compares the cancer sample’s assembly calls against > 600,000 SVs in Bionano’s control sample SV database, and against a matched control sample’s variants. In addition, two new Bionano pipelines leverage these long molecules to identify additional somatic SVs: the copy number variation pipeline and the molecule mapping pipeline.
By examining the coverage depth of molecule alignments to the public human genome reference, the copy number variation pipeline can identify megabase-long amplifications and deletions. Similarly, the molecule mapping pipeline uses clusters of split-molecule alignments against the reference to reliably find translocations and other rearrangements.
We applied this suite of discovery tools to construct a comprehensive map of SVs in a well-studied melanoma cell line, COLO829. We collected data from the tumor and the matched blood cell line, constructed contiguous assemblies (N50 > 50 Mbp), and called over 6,000 SVs in each genome. We then classified 51 SVs as somatic by comparing the tumor and the blood control. Furthermore, molecule mapping identified additional mutations. The copy number profile captured the BRAF amplification on chr7, as well as other chromosomal-arm gains and losses. In conclusion, with one comprehensive platform, Saphyr can discover a broad range of traditionally refractory but functionally relevant SVs, and further improve our understanding of cancer etiology.
Background
Generating high-quality finished genomes with accurate identification of structural variation and high completeness (minimal gaps) remains challenging using short-read sequencing technologies alone. The Saphyr™ system provides direct visualization of long DNA molecules in their native state, bypassing the statistical inference needed to align paired-end reads with an uncertain insert size distribution. These long labeled molecules are de novo assembled into physical maps spanning the entire diploid genome. The resulting maps make it possible to correctly position and orient sequence contigs into chromosome-scale scaffolds and to detect a large range of homozygous and heterozygous structural variation with very high efficiency.
A.W.C. Pang, J. Lee, T. Anantharaman, E. Lam, A. Hastie, H. Cao, M. Borodkin
Bionano Genomics, San Diego, California, United States of America
                                     COLO 829     COLO 829-BL
Assembly size (haplotype-aware)      6.29 Gbp     6.34 Gbp
Genome map N50                       52.7 Mbp     57.3 Mbp
SV calls against hg19
  Insertions                         4,089        4,218
  Deletions                          1,736        1,771
  Duplications                       181          164
  Inversion breakpoints              308          305
  Intrachromosomal translocations    0            0
  Interchromosomal translocations    2            0
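For readers unfamiliar with the N50 statistic reported above, the following minimal sketch computes it under the usual definition: the length L such that maps of length L or longer together cover at least half of the total assembled length. The function name and the toy map lengths are made up for illustration and are not taken from the COLO 829 assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of map (or contig) lengths."""
    if not lengths:
        raise ValueError("n50() requires at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest map down until half the total length is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy map lengths in Mbp (hypothetical values).
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))  # 70
```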
Sample SV call set -> annotated sample SV call set
Step 1
• Compare with Bionano SV database (~180 normal/healthy humans)
• Examine coverage and chimera quality scores
• Overlap with gene annotation
Step 2
• Compare the sample’s SV calls with the matched control’s SV calls
Step 3
• Align the blood control’s molecules to the sample’s assembly
• Does the blood control contain molecule support for the sample’s SV?
  • Yes: False classification as somatic due to a false negative call in blood.
  • No: Sample-unique SV.
• Re-align the sample’s molecules to the sample’s assembly
• Does the sample contain molecule support for the sample’s SV?
  • Yes: True SV.
  • No: False positive SV.
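The two yes/no questions in Step 3 can be read as a small decision procedure. The sketch below is one possible reading, assuming each question has already been reduced to a boolean; how molecule support is measured (for example, a minimum number of spanning molecules) is not stated on the poster, and the names `Verdict` and `classify_sample_sv` are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    NOT_SOMATIC = "false somatic classification (missed call in blood control)"
    FALSE_POSITIVE = "false positive SV (no support in the sample's own molecules)"
    SOMATIC = "sample-unique, molecule-supported SV"


def classify_sample_sv(control_has_support, sample_has_support):
    """Combine the two Step 3 checks into a single verdict.

    control_has_support: blood control molecules support the SV when aligned
        to the sample's assembly.
    sample_has_support: the sample's own molecules support the SV when
        re-aligned to the sample's assembly.
    """
    if control_has_support:
        # The variant is present in the control but was not called there.
        return Verdict.NOT_SOMATIC
    if not sample_has_support:
        return Verdict.FALSE_POSITIVE
    return Verdict.SOMATIC


print(classify_sample_sv(False, True).name)  # SOMATIC
```

Checking the control first is a design choice made here for brevity; the poster lists the two checks in sequence without stating how conflicting answers are resolved.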
SV Type                              COLO 829
Insertions                           5
Deletions                            28
Duplications                         16
Inversion breakpoints                0
Intrachromosomal translocations      0
Interchromosomal translocations      2
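Step 2 of the annotation pipeline, comparing the sample's SV calls with the matched control's calls, can be pictured as a filter that keeps only calls without a matching event in the control. The sketch below is a simplified stand-in and not Bionano's implementation: the `SVCall` record, the matching rule (same type, same chromosome, both breakpoints within a tolerance), the 10 kbp default tolerance, and the coordinates are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SVCall:
    chrom: str
    start: int
    end: int
    sv_type: str


def same_event(a, b, tolerance=10_000):
    """Treat two calls as the same event if type, chromosome and both breakpoints agree."""
    return (
        a.sv_type == b.sv_type
        and a.chrom == b.chrom
        and abs(a.start - b.start) <= tolerance
        and abs(a.end - b.end) <= tolerance
    )


def sample_unique(sample_calls, control_calls, tolerance=10_000):
    """Return sample calls that have no matching call in the control."""
    return [
        call
        for call in sample_calls
        if not any(same_event(call, other, tolerance) for other in control_calls)
    ]


# Toy calls with made-up coordinates.
tumor = [
    SVCall("chr1", 1_000_000, 1_050_000, "deletion"),
    SVCall("chr2", 5_000_000, 5_012_000, "deletion"),
]
control = [SVCall("chr1", 1_002_000, 1_049_000, "deletion")]

for call in sample_unique(tumor, control):
    print(call)  # only the chr2 deletion remains
```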
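[Figure, panel 3: t(10;18) translocation. hg19 chr18 and hg19 chr10 aligned to COLO 829 map 1871. Coordinates shown: 9,735,632 to 10,024,212 and 7,493,300 to 7,787,674; marked positions 9,869,797 and 7,647,848. Gene labels: RAB31, ITIH5, TXNDC2.]
[Figure, panel 6: PTEN deletion. hg19 chr10, 89,593,241 to 89,751,246. COLO 829 maps 242 and 241 are each marked with a 12.0 kb deletion; COLO 829-BL maps 71 and 72 are shown for comparison. Gene labels: CFL1P1, PTEN, AK130076.]
[Figure: Chromosome 7 copy number variation, with the BRAF amplification marked.]
[Additional gene labels from the figures: TMEM185A, MAGEA11.]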
Case study: Identifying SVs in metastatic melanoma fibroblast and matched lymphoblastoid cell lines
De novo assembly and SVs detected against hg19
1) Assembly and SV statistics for the two samples.
2) A circos plot indicating the large-scale rearrangements uniquely detected in COLO 829. Legend: Insertion, Deletion, Inversion, Duplication, Translocation, CNV.
3) An assembled map captures a t(10;18) translocation.
4) Variant annotation pipeline designed to identify somatic variants for cancer studies.
5) Resulting number of somatic variants detected in COLO 829.
6) A 12 kb deletion identified interrupting the PTEN gene in the tumor and not in the blood cell line.
Detection of low-allelic-fraction mutations by direct molecule alignment to hg19
7) A t(21;X) translocation uniquely detected by molecule alignment to hg19. We called SVs directly from the molecules and then constructed a local assembly to validate the call. The red arrows indicate the location of the breakpoint on the locally assembled map.
[Figure, panel 7: hg19 chrX and hg19 chr21 aligned to the COLO 829 local assembly map, with supporting molecules. Gene labels: TLR8-AS1, TLR7, PRPS2, TMSB4X, TPTE, BAGE.]
8) A 75.8 kb somatic inversion on chrX that was uniquely captured by the molecule alignment and local assembly approach.
[Figure, panel 8: hg19 chrX aligned to the COLO 829 local assembly map, with supporting molecules.]
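The abstract describes finding rearrangements from clusters of split-molecule alignments. A minimal sketch of that idea follows, assuming each split molecule has already been reduced to a pair of reference positions, one per aligned segment. The binning approach, the 50 kbp bin size, the minimum support of three molecules, and the toy chromosome names are assumptions for illustration, not parameters of the Bionano pipeline.

```python
from collections import Counter


def candidate_junctions(split_alignments, bin_size=50_000, min_support=3):
    """Count split molecules per pair of reference bins and keep well-supported pairs.

    split_alignments: iterable of (chrom_a, pos_a, chrom_b, pos_b) tuples.
    Returns a dict mapping ((chrom, bin), (chrom, bin)) to the molecule count.
    """
    counts = Counter()
    for chrom_a, pos_a, chrom_b, pos_b in split_alignments:
        end_a = (chrom_a, pos_a // bin_size)
        end_b = (chrom_b, pos_b // bin_size)
        # Sort the two ends so the junction key ignores segment order.
        key = tuple(sorted([end_a, end_b]))
        counts[key] += 1
    return {key: n for key, n in counts.items() if n >= min_support}


# Toy split alignments on made-up chromosomes.
splits = [
    ("chrA", 120_500, "chrB", 480_200),
    ("chrB", 481_900, "chrA", 121_700),
    ("chrA", 119_000, "chrB", 479_000),
    ("chrA", 700_000, "chrC", 50_000),  # isolated, likely noise
]
print(candidate_junctions(splits))
# {(('chrA', 2), ('chrB', 9)): 3}
```

Fixed bins can split a true cluster that straddles a bin boundary, so a production method would more likely cluster by distance. A candidate junction found this way could then be checked by local assembly, as was done for the t(21;X) call.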
9) Copy number variation can be captured by elevation and depletion of coverage in molecule alignment to hg19.
[Figure, panel 9: Chromosome 3 copy number variation for COLO 829 and COLO 829-BL; x-axis: Position (bp). Annotations: 24 Mb deletion, 170 Mb duplication, 2.4 kb amplification.]
[Axis coordinates spilled from the figure panels: 10,956,785 to 11,327,739; 12,738,707 to 13,093,292; 148,670,159 to 148,906,550.]
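To make the coverage-based idea in panel 9 concrete, the sketch below flags runs of reference bins whose molecule coverage is well above or below the sample's median. It is a toy threshold method written under stated assumptions, not the Bionano copy number pipeline: the ratio cutoffs (1.3 and 0.7), the minimum run of three bins, and the coverage values are all hypothetical.

```python
from statistics import median


def call_cnv_segments(bin_coverage, gain_ratio=1.3, loss_ratio=0.7, min_bins=3):
    """Return (start_bin, end_bin_exclusive, state) for runs of gained or lost bins."""
    if not bin_coverage:
        return []
    baseline = median(bin_coverage)
    if baseline <= 0:
        raise ValueError("median coverage must be positive")

    # Label each bin by its coverage relative to the median.
    states = []
    for depth in bin_coverage:
        ratio = depth / baseline
        if ratio >= gain_ratio:
            states.append("gain")
        elif ratio <= loss_ratio:
            states.append("loss")
        else:
            states.append("neutral")

    # Merge consecutive bins with the same label and keep long non-neutral runs.
    segments = []
    start = 0
    for i in range(1, len(states) + 1):
        if i == len(states) or states[i] != states[start]:
            if states[start] != "neutral" and i - start >= min_bins:
                segments.append((start, i, states[start]))
            start = i
    return segments


# Toy per-bin coverage with one elevated run and one depleted run.
coverage = [60, 62, 58, 61, 95, 98, 97, 99, 60, 59, 20, 22, 21, 61, 60]
print(call_cnv_segments(coverage))
# [(4, 8, 'gain'), (10, 13, 'loss')]
```

Using the median as the baseline assumes most of the genome is copy-neutral, which may not hold for a heavily rearranged tumor; comparing against the matched control's coverage would be one alternative.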