![Page 1: Comparative Assembly for Cancer Human Genome](https://reader035.vdocuments.mx/reader035/viewer/2022081513/5681617a550346895dd109a6/html5/thumbnails/1.jpg)
Comparative Assembly for Cancer Human Genome
Gao Song
2010/02/03
![Page 2: Comparative Assembly for Cancer Human Genome](https://reader035.vdocuments.mx/reader035/viewer/2022081513/5681617a550346895dd109a6/html5/thumbnails/2.jpg)
Content
Background Knowledge
Problem Description
Framework of Solution
Own Methods
Results
![Page 3: Comparative Assembly for Cancer Human Genome](https://reader035.vdocuments.mx/reader035/viewer/2022081513/5681617a550346895dd109a6/html5/thumbnails/3.jpg)
Background Knowledge
Paired-End Tag (PET)
![Page 4: Comparative Assembly for Cancer Human Genome](https://reader035.vdocuments.mx/reader035/viewer/2022081513/5681617a550346895dd109a6/html5/thumbnails/4.jpg)
Background Knowledge
Concordant PET (CPET)
Discordant PET (DPET)
◦ Distance or orientation is incorrect
◦ Tags map to different chromosomes
DPET Cluster
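The CPET/DPET distinction above can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the talk: the `PET` record, the expected `+`/`-` orientation, and the span limits `min_span` and `max_span` are all hypothetical and depend on the library.

```python
from dataclasses import dataclass


@dataclass
class PET:
    """Hypothetical record for one mapped paired-end tag."""
    chrom1: str
    pos1: int
    strand1: str  # '+' or '-'
    chrom2: str
    pos2: int
    strand2: str


def classify_pet(pet, min_span=8_000, max_span=12_000):
    """Return 'CPET' or 'DPET'. Span limits are assumed, library-specific values."""
    # Tags mapping to different chromosomes are discordant.
    if pet.chrom1 != pet.chrom2:
        return "DPET"
    # Order the two tags by position along the reference.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pet.pos1, pet.strand1), (pet.pos2, pet.strand2)]
    )
    # Assumed concordant orientation: left tag on '+', right tag on '-'.
    if (left_strand, right_strand) != ("+", "-"):
        return "DPET"
    # The distance between the tags must fit the expected insert size.
    span = right_pos - left_pos
    if not (min_span <= span <= max_span):
        return "DPET"
    return "CPET"
```

With these assumed limits, a pair on chr9 at 1,000 (`+`) and 11,000 (`-`) is concordant, while the same pair with both tags on `+` is discordant.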
![Page 5: Comparative Assembly for Cancer Human Genome](https://reader035.vdocuments.mx/reader035/viewer/2022081513/5681617a550346895dd109a6/html5/thumbnails/5.jpg)
Problem Description
Given:
◦ Frequency of DPETs and CPETs along the reference genome
◦ DPET clusters
Requirement:
◦ Find rearrangements of the cancer genome compared to the normal human genome
◦ Current focus: amplicons
![Page 6: Comparative Assembly for Cancer Human Genome](https://reader035.vdocuments.mx/reader035/viewer/2022081513/5681617a550346895dd109a6/html5/thumbnails/6.jpg)
Framework of Solution
The reference genome is cut wherever the CPET frequency is 0 => some big contigs
Find the breakpoints according to the DPETs
Use CPETs to check whether there is a connection between breakpoints
Convert DPET clusters into edges in the graph
Use high-copy edges to form subgraphs of amplicons
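The first step, cutting the reference wherever the CPET frequency drops to 0, might look like the sketch below. It assumes the frequency is available as one count per fixed-size bin; the function name and the half-open interval convention are illustrative choices, not taken from the talk.

```python
def cut_into_contigs(cpet_coverage):
    """Return half-open (start, end) bin intervals with CPET coverage > 0.

    cpet_coverage: one CPET count per bin along a chromosome (assumed input).
    """
    contigs = []
    start = None  # start bin of the contig currently being extended
    for i, count in enumerate(cpet_coverage):
        covered = count > 0
        if covered and start is None:
            start = i                   # a new contig begins here
        elif not covered and start is not None:
            contigs.append((start, i))  # coverage dropped to 0: cut
            start = None
    if start is not None:
        contigs.append((start, len(cpet_coverage)))
    return contigs
```

For example, the coverage `[0, 3, 5, 0, 0, 2, 1]` yields two contigs covering bins 1-2 and bins 5-6.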
![Page 7: Comparative Assembly for Cancer Human Genome](https://reader035.vdocuments.mx/reader035/viewer/2022081513/5681617a550346895dd109a6/html5/thumbnails/7.jpg)
Framework of Solution
[Flow diagram. Labels: Reference Genome, CPET, DPET, Original Contigs, DPET Start and End Breakpoint, Filtered Breakpoints, Small Contigs, Nodes, Edges, Graph]
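The graph step can be sketched as follows, assuming nodes are contig identifiers and each DPET cluster becomes an edge carrying a copy count. Treating "high copy" as `copies >= min_copies` and taking connected components as the amplicon subgraphs are assumptions made for illustration.

```python
from collections import defaultdict


def amplicon_subgraphs(edges, min_copies):
    """Group nodes into connected components using only high-copy edges.

    edges: iterable of (node_a, node_b, copies); `copies` is assumed to be
    the number of PETs supporting the DPET cluster behind the edge.
    """
    adjacency = defaultdict(set)
    for a, b, copies in edges:
        if copies >= min_copies:  # keep only high-copy edges
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen = set()
    components = []
    for node in sorted(adjacency):
        if node in seen:
            continue
        # Depth-first search collects everything reachable from `node`.
        component = set()
        stack = [node]
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(adjacency[current] - component)
        seen |= component
        components.append(component)
    return components
```

With edges `c1-c2` (10), `c2-c3` (12), `c3-c4` (1), `c4-c5` (9) and `min_copies=5`, the low-copy edge is dropped and two subgraphs remain: `{c1, c2, c3}` and `{c4, c5}`.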
![Page 8: Comparative Assembly for Cancer Human Genome](https://reader035.vdocuments.mx/reader035/viewer/2022081513/5681617a550346895dd109a6/html5/thumbnails/8.jpg)
Own Methods - Naive (Chromosome 9)
DPET frequency curve
Use the DPET frequency directly: choose a threshold to select the breakpoints
Problems:
◦ How to choose the threshold
◦ Within an amplicon region it is hard to find the breakpoint: the baseline frequency is too high
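A minimal sketch of the naive approach, assuming the DPET frequency is given as one value per bin. The toy data are invented to show the second problem: inside an amplified region every bin clears the threshold, so the real peak is hidden.

```python
def breakpoints_by_threshold(dpet_freq, threshold):
    """Return the indices of all bins whose DPET frequency reaches the threshold."""
    return [i for i, freq in enumerate(dpet_freq) if freq >= threshold]


# Toy curve (invented): a clean peak at bin 2, then an amplified region
# (bins 4-8) with a high baseline and a peak at bin 6.
toy_dpet = [1, 1, 9, 1, 20, 20, 28, 20, 20]
flagged = breakpoints_by_threshold(toy_dpet, threshold=8)
# flagged == [2, 4, 5, 6, 7, 8]: the whole amplified region is reported.
```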
![Page 9: Comparative Assembly for Cancer Human Genome](https://reader035.vdocuments.mx/reader035/viewer/2022081513/5681617a550346895dd109a6/html5/thumbnails/9.jpg)
Own Methods - Slope (Chromosome 9)
Use the slope (derivative) of the DPET frequency curve
Problems:
◦ How to define the threshold
◦ Too many false positives
◦ Also misses some DPET clusters
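The slope idea can be sketched with a first-order finite difference between neighbouring bins. The difference scheme and the `min_slope` parameter are assumptions; the slides do not say how the derivative was computed.

```python
def slope(freq):
    """Finite-difference slope between consecutive bins."""
    return [freq[i + 1] - freq[i] for i in range(len(freq) - 1)]


def breakpoints_by_slope(dpet_freq, min_slope):
    """Return bins where the DPET frequency rises by at least min_slope."""
    return [i + 1 for i, d in enumerate(slope(dpet_freq)) if d >= min_slope]
```

On the invented curve `[1, 1, 9, 1, 20, 20, 28, 20, 20]` with `min_slope=8`, this reports bins 2, 4 and 6: the peak inside the amplified region is now visible, but the region's leading edge at bin 4 is reported as well.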
![Page 10: Comparative Assembly for Cancer Human Genome](https://reader035.vdocuments.mx/reader035/viewer/2022081513/5681617a550346895dd109a6/html5/thumbnails/10.jpg)
Own Method - Consider Ratio
At a breakpoint, DPET increases and CPET decreases
This can be used as another criterion
Problem:
◦ Another parameter!
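One way to turn this observation into a criterion is the fraction of DPETs among all PETs in a bin. The exact ratio used in the talk is not given, so the definition below and the `min_ratio` cut-off (the extra parameter the slide complains about) are assumptions.

```python
def dpet_ratio(dpet_freq, cpet_freq):
    """Per-bin fraction DPET / (DPET + CPET); 0.0 where a bin has no PETs."""
    ratios = []
    for d, c in zip(dpet_freq, cpet_freq):
        total = d + c
        ratios.append(d / total if total > 0 else 0.0)
    return ratios


def breakpoints_by_ratio(dpet_freq, cpet_freq, min_ratio):
    """Return bins where the DPET fraction reaches min_ratio."""
    ratios = dpet_ratio(dpet_freq, cpet_freq)
    return [i for i, r in enumerate(ratios) if r >= min_ratio]
```

For DPET counts `[1, 1, 9, 1]` and CPET counts `[9, 9, 3, 9]`, only bin 2 (ratio 0.75) passes `min_ratio=0.5`.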
![Page 11: Comparative Assembly for Cancer Human Genome](https://reader035.vdocuments.mx/reader035/viewer/2022081513/5681617a550346895dd109a6/html5/thumbnails/11.jpg)
New Method of Finding Breakpoints
Use the slope to find the threshold
The previously missed point can now be found
![Page 12: Comparative Assembly for Cancer Human Genome](https://reader035.vdocuments.mx/reader035/viewer/2022081513/5681617a550346895dd109a6/html5/thumbnails/12.jpg)
Own Method - Hypothesis Testing
Local checking using two consecutive windows (window1, window2)
◦ Each window has: μ, σ
◦ Null hypothesis: σ2 is not significantly larger than σ1
◦ Using binomial testing:
Significance level: 0.05
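The test statistic itself is not shown on the slide. As a stand-in, the sketch below applies a one-sided exact binomial test to the DPET counts of two equal-sized windows: under the null hypothesis each tag is equally likely to fall in either window. This swaps in a simpler count-based test and is not the σ1/σ2 comparison described above; the function names are hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def window2_is_enriched(count1, count2, alpha=0.05):
    """True if window2 holds significantly more tags than window1.

    Assumes equal-sized windows, so each tag lands in window2 with
    probability 0.5 under the null hypothesis.
    """
    n = count1 + count2
    if n == 0:
        return False  # no tags at all: nothing to test
    return binomial_upper_tail(count2, n) < alpha
```

With 5 tags in window1 and 20 in window2, the tail probability is about 0.002 and the null hypothesis is rejected at the 0.05 level; with 10 versus 12 tags it is not.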
![Page 13: Comparative Assembly for Cancer Human Genome](https://reader035.vdocuments.mx/reader035/viewer/2022081513/5681617a550346895dd109a6/html5/thumbnails/13.jpg)
Own Method - Hypothesis Testing
Some details:
◦ Check whether the cluster region is included in the window
Not finished yet
Calculating σ is time-consuming: it has to be recalculated after each step
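One common way to avoid recomputing σ from scratch at every step is to keep running sums of the values and their squares and update them as the window slides by one bin. This is a sketch of that general technique, not something the talk reports having implemented; it assumes a fixed window length and a step of one bin.

```python
from math import sqrt


def sliding_std(values, window):
    """Population standard deviation of every length-`window` slice.

    Each step updates two running sums in O(1) instead of rescanning
    the whole window.
    """
    if window <= 0 or window > len(values):
        return []
    total = sum(values[:window])
    total_sq = sum(v * v for v in values[:window])
    result = []
    for end in range(window, len(values) + 1):
        mean = total / window
        # Clamp at 0 to absorb small negative values from rounding.
        variance = max(total_sq / window - mean * mean, 0.0)
        result.append(sqrt(variance))
        if end < len(values):
            leaving = values[end - window]
            entering = values[end]
            total += entering - leaving
            total_sq += entering * entering - leaving * leaving
    return result
```

Running sums can lose precision when the values are large relative to their spread; Welford-style updates are a more stable alternative if that becomes an issue.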
![Page 14: Comparative Assembly for Cancer Human Genome](https://reader035.vdocuments.mx/reader035/viewer/2022081513/5681617a550346895dd109a6/html5/thumbnails/14.jpg)
Results (slope)

| Metric | 10k | 20k |
| --- | --- | --- |
| # of subgraphs | 72 | 35 |
| Max chromosomes in one subgraph | 4 | 4 |
| Average chromosomes in one subgraph | 1.18 | 1.23 |
| Max edges in one subgraph | 42 | 44 |
| Average edges in one subgraph | 5.47 | 5.77 |
![Page 15: Comparative Assembly for Cancer Human Genome](https://reader035.vdocuments.mx/reader035/viewer/2022081513/5681617a550346895dd109a6/html5/thumbnails/15.jpg)
One Special Case
![Page 16: Comparative Assembly for Cancer Human Genome](https://reader035.vdocuments.mx/reader035/viewer/2022081513/5681617a550346895dd109a6/html5/thumbnails/16.jpg)
10k Lib
![Page 17: Comparative Assembly for Cancer Human Genome](https://reader035.vdocuments.mx/reader035/viewer/2022081513/5681617a550346895dd109a6/html5/thumbnails/17.jpg)
20k Lib
![Page 18: Comparative Assembly for Cancer Human Genome](https://reader035.vdocuments.mx/reader035/viewer/2022081513/5681617a550346895dd109a6/html5/thumbnails/18.jpg)
Another Example
10k lib, 20k lib