20160308 dtl ngs_focus_group_meeting_slideshare
TRANSCRIPT
Nanopore Sequencing
Hans Jansen, ZF-screens B.V.
ZF-screens B.V.
Common carp (Cyprinus carpio). High throughput screening model. Genome and transcriptomes.
European and Japanese eel (Anguilla anguilla and Anguilla japonica). Completing the life cycle in aquaculture. Genome and transcriptomes.
King cobra (Ophiophagus hannah). Evolution and toxins. Genome and transcriptomes.
But the quality of these genomes could be improved.
Dutch SME located in the Bioscience Park in Leiden, the Netherlands. Compound screens. Fish fertility. Sequencing.
• Started to work on nanopore sensing in 2005
• Investments to date 251 million GBP (355 M€)
• Valued at 1.2 B€
• Broad IP portfolio on nanopore sensing
• Products: MinION and PromethION systems
Oxford Nanopore Technologies
But MAP is much more. It is about being a community and a playground to test new applications and analysis tools.
Visible as a web portal with information from ONT and a social-media-like system with blogs, comments, likes, and a forum to ask for advice.
MinION Access Program
MinION Flow Cell design
Figure labels: inlet port, electrode, waste reservoir, signal amplifier (SA), well.
Array of 2048 wells with nanopores embedded in a polymer membrane. Underneath is an application-specific integrated circuit (ASIC) which contains the 512 signal amplifiers (SA).
After 24 hrs a well is exhausted and the SA switches to a new well (remux).
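The well and amplifier numbers above imply four wells per signal amplifier. A minimal sketch of that arithmetic follows. The function `well_for` and its fixed round-robin order are hypothetical, used only to illustrate the remux idea; the slides do not say how the instrument actually picks the next well.

```python
WELLS = 2048        # wells per flow cell (from the slide)
AMPLIFIERS = 512    # signal amplifiers on the ASIC (from the slide)


def wells_per_amplifier(wells=WELLS, amplifiers=AMPLIFIERS):
    """Number of wells that share one signal amplifier."""
    return wells // amplifiers


def well_for(amplifier, hours_elapsed, remux_interval_hrs=24):
    """Hypothetical mapping: which well an amplifier reads at a given time.

    Assumes wells are numbered so that each amplifier owns a consecutive
    group, and that the amplifier moves to the next well of its group at
    every remux. This is an illustration, not the device's real logic.
    """
    group_size = wells_per_amplifier()
    step = int(hours_elapsed // remux_interval_hrs) % group_size
    return amplifier * group_size + step


print(wells_per_amplifier())   # wells available to each amplifier
print(well_for(0, 25))         # amplifier 0 after the first remux
```

With four wells per amplifier and a remux every 24 hrs, each amplifier has a fresh well available for several days of running under these assumptions.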
Stills taken from: https://www.nanoporetech.com/news/movies#movie-24-nanopore-dna-sequencing
A potential (voltage) is applied across the membrane and there is an ion flux through the pore. The ion flux is measured by an application-specific integrated circuit (ASIC).
The ion flux is partially blocked by the translocating DNA strand.
Strand sequencing
Figure labels: ATP, ADP, tethering oligo, motor protein, hairpin, abasic nucleotides, adapter, template strand, complement strand.
• Tethers keep the DNA molecules on the membrane to increase the concentration of fragments.
• The motor protein unwinds dsDNA and ratchets it through the pore at a controlled rate.
• Abasic nucleotides in the hairpin are a marker for the basecaller to separate template and complement data.
Strand sequencing
Template Complement
Bulk Data (ionic current, pA)
Events (with time domain)
Squiggle (events with time domain removed)
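The three data levels above (bulk data, events, squiggle) can be illustrated with a toy segmentation. This is a minimal sketch that assumes a simple jump threshold on the running mean; the function names and the threshold value are hypothetical, and this is not the event detector used by ONT.

```python
def detect_events(samples, threshold=5.0):
    """Split raw current samples (pA) into events.

    A new event starts when a sample differs from the running mean of the
    current event by more than `threshold` pA (an assumed, illustrative rule).
    Each event keeps its time domain: start index and length in samples.
    """
    if not samples:
        return []
    events = []
    start, total = 0, samples[0]
    for i in range(1, len(samples)):
        mean = total / (i - start)
        if abs(samples[i] - mean) > threshold:
            events.append({"mean": mean, "start": start, "length": i - start})
            start, total = i, samples[i]
        else:
            total += samples[i]
    events.append({"mean": total / (len(samples) - start),
                   "start": start, "length": len(samples) - start})
    return events


def squiggle(events):
    """Drop the time domain and keep only the mean current per event."""
    return [e["mean"] for e in events]


bulk = [60.0, 61.0, 59.0, 80.0, 81.0, 70.0]
print(squiggle(detect_events(bulk)))
```

Keeping `start` and `length` in the events is what preserves the time domain; the squiggle is the same series with that information thrown away.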
~6 nucleotides are blocking the current when a DNA strand is in the pore, but nucleotides around this k-mer can influence the signal.
Base calling
Reading in k-mer space
Figure: sequence AGGCTCACTCCCATAAGC with its first 6-mer AGGCTC.
AGGCTCACTCCCATAAGC
AGGCTC, GGCTCA, GCTCAC, CTCACT, TCACTC, CACTCC, ACTCCC, CTCCCA
The A is measured 6 times on the template strand and the corresponding T 6 times on the complement strand.
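Reading in k-mer space can be sketched as a sliding window over the sequence. The example below uses the sequence AGGCTCACTCCCATAAGC from the slide and k = 6; the helper names are hypothetical. It shows why a single base, here the A at position 7, is covered by six consecutive 6-mers.

```python
def kmers(seq, k=6):
    """All overlapping k-mers of seq, in order, each shifted by one base."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmers_covering(seq, pos, k=6):
    """The k-mers that contain the base at 0-based index `pos`."""
    first = max(0, pos - k + 1)
    last = min(pos, len(seq) - k)
    return [seq[i:i + k] for i in range(first, last + 1)]


seq = "AGGCTCACTCCCATAAGC"
print(kmers(seq)[:8])
print(kmers_covering(seq, 6))   # index 6 is the A at position 7
```

Bases near the ends of a strand are covered by fewer than six k-mers, which `kmers_covering` handles by clamping the window.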
Figure: four consecutive events.
Within an event the different possibilities each have a probability value attached to them. Base calling is choosing the path through all possibilities with the highest probability.
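Choosing the most probable path through the possibilities is commonly done with a Viterbi-style dynamic program. The sketch below is a toy version under loud assumptions: 3-mers instead of 6-mers, made-up probabilities, and only one allowed transition (the next k-mer shifts by exactly one base). It is not the ONT base caller, which the slides do not describe in this detail.

```python
import math


def best_path(events):
    """Most probable k-mer path through a list of events.

    `events` is a list of dicts mapping candidate k-mer -> probability (> 0).
    Assumption: consecutive k-mers must overlap by k-1 bases (one-base step).
    Raises ValueError if no path satisfies that constraint.
    """
    # score[kmer] = best log-probability of any path ending in that k-mer
    score = {kmer: math.log(p) for kmer, p in events[0].items()}
    back = []
    for candidates in events[1:]:
        new_score, pointers = {}, {}
        for kmer, p in candidates.items():
            prevs = [prev for prev in score if prev[1:] == kmer[:-1]]
            if prevs:
                best_prev = max(prevs, key=score.get)
                new_score[kmer] = score[best_prev] + math.log(p)
                pointers[kmer] = best_prev
        score = new_score
        back.append(pointers)
    last = max(score, key=score.get)
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    # First k-mer in full, then one new base per following k-mer.
    called = path[0] + "".join(kmer[-1] for kmer in path[1:])
    return path, called


events = [
    {"ACG": 0.6, "ACT": 0.4},
    {"CGT": 0.5, "CTA": 0.5},
    {"GTA": 0.1, "TAC": 0.9},
]
print(best_path(events))
```

In this toy input the locally best first k-mer (ACG, 0.6) is not on the best overall path, which is the reason for scoring whole paths rather than picking the top candidate per event.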
Read length is limited by the non-nicked fragment length rather than by the system itself.
Longest 2D reads are over 100 kbp, 1D reads over 200 kbp.
Read length distribution
The base caller should model the translocation process in and around the pore.
Most errors are a result of the basecaller not being accurate enough.
The error rate is around 10-13% for the current chemistry (R7.3 and MAP006). Deletions, insertions and substitutions contribute more or less equally.
Still challenging:
• Homopolymers: the time domain of the events might be used to resolve this.
• DNA modifications: modified bases need to be added to the model.
• AT-rich areas: better nanopores and better modeling.
Base caller code will be released shortly to enable the community to develop different base callers.
In MARC, data will be generated to enable the incorporation of modified bases in the base calling model.
The R9 nanopore is in the pipeline (improving on G/C bias and better S/N).
Errors
GC content
June 2014 February 2016
Abundance of k-mers in the reads is plotted against the abundance in the reference
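A comparison like the one plotted can be sketched by counting k-mers in the reads and in the reference and pairing their relative abundances. The function names, the choice of k = 5 and the normalisation to fractions are assumptions for illustration; the slides do not state how the plot was made.

```python
from collections import Counter


def kmer_counts(seqs, k=5):
    """Count every overlapping k-mer in a collection of sequences."""
    counts = Counter()
    for seq in seqs:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def abundance_pairs(reads, reference, k=5):
    """Map each reference k-mer to (fraction in reference, fraction in reads)."""
    ref = kmer_counts([reference], k)
    obs = kmer_counts(reads, k)
    ref_total = sum(ref.values())
    obs_total = sum(obs.values()) or 1   # avoid division by zero without reads
    return {kmer: (ref[kmer] / ref_total, obs[kmer] / obs_total) for kmer in ref}


pairs = abundance_pairs(["AAA", "AAA"], "AAAT", k=3)
print(pairs)
```

Points on the diagonal (equal fractions) mean a k-mer is represented in the reads as often as in the reference; k-mers below it are under-represented.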
MinION Analysis and Reference Consortium
The MinION Analysis and Reference Consortium (MARC) is a collaboration of members of MAP.
• Evaluate the reproducibility of the MinION platform.
• Investigate improvements to the library preparation and running protocols.
• Phase 1: 5 labs running the same protocols on the same samples.
A first paper was published on F1000research.com in a channel dedicated to the analysis of nanopore sequence data (Ip, L. C., et al. doi: 10.12688/f1000research.7201.1)
http://f1000research.com/channels/nanoporeanalysis
MARC intends to publish more papers in this channel.
This channel is open to all papers that fall within the scope of the channel.
Reproducibility: flow cell yield
Active pores during a run
remux 24 hrs remux 48 hrs
Reproducibility: active pores
The percentage of active pores decreases during the run.
The number of active pores can vary significantly.
Reproducibility: read length
Read length is influenced by the shearing conditions, DNA quality, g-TUBE.
We joined MAP right from the start.
Our first MinION arrived in April 2014 and the first flow cells and kits in June.
Since then we have run ~50 flow cells.
MAPpers competition: we topped the leaderboard on read length and yield, so we now have three MinIONs.
MinION Access Program and ZF-screens
And one of these is on the way.
144000 channels producing 6.4 TB/day @ 500 bases/sec.
Basecalling is done locally.
Experimental design can be flexible. A loading port is connected to 750 channels (4 ports/Flow Cell). Flow Cells can be run independently from each other.
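The PromethION numbers can be checked with simple arithmetic. The sketch below assumes every channel sequences continuously for 24 hours at the quoted speed, which is an idealisation. Under that assumption the product is about 6.2 × 10^12 bases per day, close to but not identical to the 6.4 TB/day quoted, so the quoted figure presumably rests on slightly different assumptions. The channel numbers also imply 48 flow cells.

```python
# Figures taken from the slide.
CHANNELS = 144_000
BASES_PER_SECOND = 500
CHANNELS_PER_PORT = 750
PORTS_PER_FLOW_CELL = 4
SECONDS_PER_DAY = 24 * 60 * 60


def bases_per_day(channels=CHANNELS, rate=BASES_PER_SECOND):
    """Theoretical maximum, assuming all channels run non-stop for a day."""
    return channels * rate * SECONDS_PER_DAY


def flow_cells(channels=CHANNELS):
    """Number of flow cells implied by the channel counts."""
    return channels // (CHANNELS_PER_PORT * PORTS_PER_FLOW_CELL)


print(bases_per_day() / 1e12)   # terabases per day under these assumptions
print(flow_cells())
```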
PromethION
Conclusions
The Nanopore platform is very promising but it is not completely ready for production.
The MinION is a platform with some unique characteristics.
• Very mobile.
• Cost structure is very different.
• Reads can be very long.
• Data can be accessed in real time.
Things that should be improved
• Flow cell quality.
• Base calling.
• Throughput.
What I would like to see happening
• Development of tools that work with the event data rather than the base called data.
• This might also allow for “streaming analysis”.
Acknowledgements
Ron Dirks (CEO of ZF-screens B.V.)
Christiaan Henkel (Assistant professor, Leiden University)
Rosemary Dokos
Oliver Hartwell
All members of the MARC consortium
Ewan Birney, EMBL-EBI
Camilla Ip, WTCHG Oxford
John Tyson, University of British Columbia
Justin O’Grady, UEA
Sara Goodwin, CSHL
Vadim Zalunin, EMBL-EBI
Miten Jain, UCSC
Matt Loose, Nottingham
Jared Simpson, OICR, Toronto