Detection of chimeric sequences from PCR artefacts
![Page 1: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/1.jpg)
Detection of chimeric sequences from PCR artefacts
Thomas Huber [email protected]
Computational Biology and Bioinformatics Environment (ComBinE)
Departments of Biochemistry & Mathematics
The University of Queensland
![Page 2: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/2.jpg)
What are PCR-generated chimeric sequences?
• Prematurely terminated amplicon
• Re-annealing with foreign DNA
• Copied to completion in the following PCR cycle
• Artificial sequence from 2 parent sequences
From: http://www.gnis-pedagogie.org
![Page 3: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/3.jpg)
Are chimeric sequences a problem?
• Culture-independent surveys of microbial communities
  – Chimeric sequences suggest non-existing organisms
  – 0.5-5% of all sequences are PCR artefacts
• Why bother with such a small artefact?
  – Signal vs noise
    • 100 repetitions of the same survey (5% chimeras): ratio of existing:non-existing organisms = 1:5
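The 1:5 ratio can be checked with a few lines of arithmetic. This is a minimal sketch under two assumptions that the slide does not state explicitly: every survey recovers the same set of real sequences, and every chimera is a new, unique artefact. The function name and the library size of 1000 are hypothetical.

```python
def artefacts_per_real_sequence(n_sequences: int, chimera_rate: float, n_surveys: int) -> float:
    """Distinct artefact sequences per distinct real sequence after repeated surveys."""
    # Assumption: the real sequences are the same in every survey,
    # so repeating the survey adds no new real sequences.
    real = n_sequences * (1 - chimera_rate)
    # Assumption: each chimera is a unique artefact,
    # so the number of distinct artefacts grows with every survey.
    artefacts = n_sequences * chimera_rate * n_surveys
    return artefacts / real


ratio = artefacts_per_real_sequence(n_sequences=1000, chimera_rate=0.05, n_surveys=100)
print(f"existing : non-existing = 1 : {ratio:.1f}")
```

With 5% chimeras and 100 repetitions the result is roughly 1:5, because the signal stays constant while the noise accumulates.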
![Page 4: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/4.jpg)
Detection of chimeras: 1. Alignment to reference sequences
• Each target sequence in turn
  – Align to ref. sequences
  – If alignment to a single sequence gives a better match than alignment to two sequences: no chimera
  – Else: chimera !!
(Cole et al., 2003; Komatsoulis and Waterman, 1997, …)
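The decision rule above can be sketched as follows. This is an illustration only, not the method of the cited programs: it assumes the target and all reference sequences are already aligned to the same length, scores a match as the number of identical columns, and tries every breakpoint for every ordered pair of references. The names `check_chimera` and `min_gain` are hypothetical, and at least two references are required.

```python
from itertools import permutations


def matches(a: str, b: str) -> int:
    """Number of identical columns between two aligned fragments."""
    return sum(x == y for x, y in zip(a, b))


def check_chimera(target: str, references: dict, min_gain: int = 1):
    """Compare the best single-reference match with the best two-parent match."""
    best_single = max(matches(target, ref) for ref in references.values())
    best_pair, best_info = -1, None
    # Left part of the target is matched to parent A, right part to parent B.
    for (name_a, a), (name_b, b) in permutations(references.items(), 2):
        for bp in range(1, len(target)):
            score = matches(target[:bp], a[:bp]) + matches(target[bp:], b[bp:])
            if score > best_pair:
                best_pair, best_info = score, (name_a, name_b, bp)
    # Assumption: call a chimera only if two parents beat one by at least min_gain.
    is_chimera = best_pair - best_single >= min_gain
    return is_chimera, best_single, best_pair, best_info


refs = {"A": "AAAAAAAAAA", "B": "CCCCCCCCCC"}
print(check_chimera("AAAAACCCCC", refs))  # two parents explain the target better
print(check_chimera("AAAAAAAAAA", refs))  # a single reference is the best match
```

The sketch also makes the database problems concrete: if a parent is missing from `refs`, or if a chimera has been deposited as a reference, the comparison gives the wrong answer.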
![Page 5: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/5.jpg)
Problems
• Database contamination
  – More and more chimeras accumulate
• Database coverage
  – Parent sequences are not necessarily in database
![Page 6: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/6.jpg)
2. Partial tree building approach
• Align sequence to existing sequences (build MSA)
• Divide MSA at postulated conversion point
• Construct 2 trees
• Compare consistency of phylogeny
(Wang and Wang, 1997; Hugenholtz, 2003)
[Figure: two trees over sequences 1 to 5, one for each side of the conversion point]
![Page 7: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/7.jpg)
3. Bellerophon approach
• Just like “partial tree building”, but:
  – MSA from PCR library
    • More likely to contain parent sequence
  – No trees are actually built
  – All possible conversion points are tested
![Page 8: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/8.jpg)
How Bellerophon works
• Compute MSA
• For each conversion point:
  – 2 windows left/right
    • Calculate all “distances” between sequences
  – Instead of comparing trees, compare distance matrices

$$\mathrm{dme} = \sum_{i}^{n} \sum_{j}^{n} \left| \mathrm{dm}_{\mathrm{left}}[i][j] - \mathrm{dm}_{\mathrm{right}}[i][j] \right|$$
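The distance matrix comparison for one conversion point can be sketched as below. This is a minimal illustration, assuming the MSA is a list of equal-length strings and using the fraction of mismatching columns as the “distance”; the distance measure Bellerophon actually uses may differ. The function names and the `window` parameter are hypothetical.

```python
def distance(a: str, b: str) -> float:
    """Assumed distance: fraction of mismatching columns between two aligned fragments."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(fragments):
    """All pairwise distances between the fragments of one window."""
    return [[distance(a, b) for b in fragments] for a in fragments]


def dme(msa, breakpoint: int, window: int) -> float:
    """Sum of absolute differences between the left and right distance matrices."""
    # Caller must keep both windows inside the alignment.
    left = [seq[breakpoint - window:breakpoint] for seq in msa]
    right = [seq[breakpoint:breakpoint + window] for seq in msa]
    dm_left = distance_matrix(left)
    dm_right = distance_matrix(right)
    n = len(msa)
    return sum(abs(dm_left[i][j] - dm_right[i][j]) for i in range(n) for j in range(n))


parents = ["AAAAAAAA", "CCCCCCCC"]
with_chimera = parents + ["AAAACCCC"]
print(dme(parents, breakpoint=4, window=4))       # windows agree
print(dme(with_chimera, breakpoint=4, window=4))  # chimera makes the matrices disagree
```

In the toy alignment the third sequence looks like the first parent on the left of the conversion point and like the second parent on the right, so the two distance matrices no longer agree.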
![Page 9: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/9.jpg)
How Bellerophon works (cont.)
• Chimeric sequence will result in large dme
• Chimera detection:
  – Exclude sequence
  – Observe change of dme

$$\mathrm{preference}[i] = \frac{\mathrm{dme}}{\mathrm{dme}[i]}$$

(dme[i]: dme recomputed with sequence i excluded)
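A direct way to compute the preference score is to leave each sequence out in turn and recompute dme. The sketch below assumes the input is the matrix of absolute differences between the left and right distance matrices for one conversion point; the matrix values are made up for illustration, and the handling of a zero denominator is an assumption of this sketch.

```python
def dme_of(diff, skip=None) -> float:
    """dme over all sequence pairs, optionally leaving one sequence out."""
    idx = [k for k in range(len(diff)) if k != skip]
    return sum(diff[i][j] for i in idx for j in idx)


def preference_scores(diff):
    """preference[i] = dme / dme[i], recomputing dme for every excluded sequence."""
    total = dme_of(diff)
    scores = []
    for i in range(len(diff)):
        reduced = dme_of(diff, skip=i)
        if reduced == 0:
            # Assumption: removing i removes all disagreement (or there was none).
            scores.append(float("inf") if total > 0 else 1.0)
        else:
            scores.append(total / reduced)
    return scores


# Hypothetical |dm_left - dm_right| matrix; sequence 3 disagrees with all others.
diff = [
    [0.0, 0.1, 0.1, 0.9],
    [0.1, 0.0, 0.2, 0.8],
    [0.1, 0.2, 0.0, 0.7],
    [0.9, 0.8, 0.7, 0.0],
]
scores = preference_scores(diff)
print(scores.index(max(scores)))  # index of the most likely chimera
```

Excluding the chimeric sequence removes most of the disagreement, so its score stands out from the rest. Each call to `dme_of` costs O(n²), which gives O(n³) for all n sequences.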
![Page 10: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/10.jpg)
How Bellerophon works (cont.)
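• Excluding each sequence in turn and recomputing dme is expensive to calculate (O(n³))
• Speedy way: compute dme once and subtract the contribution of sequence i

$$\mathrm{dme} = \sum_{i}^{n} \sum_{j}^{n} \left| \mathrm{dm}_{\mathrm{left}}[i][j] - \mathrm{dm}_{\mathrm{right}}[i][j] \right|$$

$$\mathrm{col}[i] = \sum_{j}^{n} \left| \mathrm{dm}_{\mathrm{left}}[i][j] - \mathrm{dm}_{\mathrm{right}}[i][j] \right|$$

$$\mathrm{preference}[i] = \frac{\mathrm{dme}}{\mathrm{dme} - 2\,\mathrm{col}[i]}$$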
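The shortcut works because the difference matrix is symmetric with a zero diagonal, so excluding sequence i removes row i and column i, each contributing col[i]. The sketch below assumes such a matrix as input; the function name, the matrix values and the zero-denominator handling are hypothetical.

```python
def fast_preference_scores(diff):
    """preference[i] = dme / (dme - 2 * col[i]), using one pass over the matrix."""
    # Assumption: diff is symmetric with zeros on the diagonal.
    col = [sum(row) for row in diff]  # col[i]: total disagreement involving sequence i
    total = sum(col)                  # dme over all pairs
    scores = []
    for c in col:
        reduced = total - 2 * c       # dme with that sequence excluded
        if reduced > 0:
            scores.append(total / reduced)
        else:
            # Assumption: no disagreement is left once this sequence is removed.
            scores.append(float("inf") if total > 0 else 1.0)
    return scores


# Hypothetical |dm_left - dm_right| matrix; sequence 3 disagrees with all others.
diff = [
    [0.0, 0.1, 0.1, 0.9],
    [0.1, 0.0, 0.2, 0.8],
    [0.1, 0.2, 0.0, 0.7],
    [0.9, 0.8, 0.7, 0.0],
]
scores = fast_preference_scores(diff)
print(scores.index(max(scores)))  # index of the most likely chimera
```

All n scores for one conversion point now cost O(n²) instead of O(n³). With floating-point distances the subtraction can leave a tiny residue, so a real implementation would probably compare against a small tolerance rather than exactly zero.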
![Page 11: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/11.jpg)
Bellerophon user interface
![Page 12: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/12.jpg)
Example output
Title line
![Page 13: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/13.jpg)
Example output
Title line
Job parameter
![Page 14: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/14.jpg)
Example output
Title line
Job parameter
!! Advice !!
Chimera output
![Page 15: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/15.jpg)
Example output
Title line
Job parameter
!! Advice !!
Chimera output
Preference score (only relative)
Conversion points
Sequence identities across windows
IDs of chimera and parents
![Page 16: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/16.jpg)
Server usage
[Chart: Bellerophon: Number of jobs processed, monthly from Mar-03 to Aug-05; y-axis from 0 to 500]
http://foo.maths.uq.edu.au/~huber/bellerophon.pl
![Page 17: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/17.jpg)
Who uses Bellerophon?
![Page 18: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/18.jpg)
What Bellerophon does/does not do!
• Bellerophon does not determine chimeric sequences !!
• It merely indicates putative chimeras
• You must confirm them !
![Page 19: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/19.jpg)
Current developments
• Bellerophon 2
  – For large PCR libraries (or single sequences)
    • A smaller library of related sequences is selected for each target sequence
  – Cost reduction from O(n³) to something more tractable
  – Cleaning up sequence databases
• Web services
• Large scale data statistics on chimeras
![Page 20: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/20.jpg)
Bellerophon web services
• Sporadic user (web page interface)
  – Interactive / manual use
  – Easy to understand, convenient to use
• Large scale users have different needs
  – E.g. JGI’s microbial ecology pipeline
  – Easy to implement/use interface that allows automatic submission and processing of data
Web services
• Standardised protocol (SOAP, WSDL)
• Remote service calls from own scripts and programs
• Not a mirror. All Bellerophon services are maintained in Brisbane
![Page 21: Detection of chimeric sequences from PCR artefacts](https://reader035.vdocuments.mx/reader035/viewer/2022062810/56815e1f550346895dcc7bb7/html5/thumbnails/21.jpg)
Large scale data statistics on chimeras
• How many chimeras to expect in a PCR library?
  – Differences in phyla?
• Is recombination in 16S rRNA a random event?
  – Structural bias?