Modelling of CGH arrays experiments
• Philippe Broët
Faculté de Médecine,
Université de Paris-XI
• Sylvia Richardson
Imperial College
London
CGH = Comparative Genomic Hybridization
Outline
• Background
• Mixture model with spatial allocations
• Performance, comparison with CGH-Miner
• Analyses of CGH-array cancer data sets
• Extensions
The development of solid tumors is associated with the acquisition of complex genetic alterations that modify normal cell growth and survival.
Many of these changes involve gains and/or losses of parts of the genome: amplification of an oncogene and deletion of a tumor suppressor gene are both considered important mechanisms of tumorigenesis.
• Loss: tumor suppressor gene
• Gain: oncogene
Aim: study genomic alterations in oncology
CGH = Comparative Genomic Hybridization
• Array containing short sequences of DNA bound to a glass slide
• Fluorescein-labeled normal and pathologic samples co-hybridised to the array
Experimental steps, applied to the case and control samples:
1. DNA extraction
2. Labelling (fluo)
3. Co-hybridization
4. Scanning
• Once hybridization has been performed, the signal intensities of the fluorophores are quantified
• This provides a means to quantitatively measure DNA copy-number alterations and to map them directly onto the genomic sequence
MCF7 cell line investigated in Pollack et al. (2002)
23 chromosomes and 6691 cDNA sequences
Data log transformed: difference between MCF7 and reference
Types of alterations observed
• (Single) gain or deletion of sequences, occurring for contiguous regions
Low-level changes in the ratio: ± log 2, but with attenuation (dye bias) the ratio ≈ ± 0.4
• Multiple gains (small regions)
High-level change, easy to pick up
Focus the modelling on the first, common type of alterations
[Figure: chromosome 1 profile, with regions annotated "Deletion?", "Multiple gains?" and "Normal?"]
2 -- Mixture model
Specificity of CGH array experiment
A priori biological knowledge from conventional CGH:
• Limited number of states for a genomic sequence (GS): presence (modal), deletion, gain(s), corresponding to different intensity ratios on the array
→ Mixture model to capture the underlying discrete states
• GS located contiguously on chromosomes are likely to carry alterations of the same type
→ Use clone spatial location in the allocation model
→ 3-component mixture model with spatial allocation
Mixture model
For chromosome k:
$Z_{gk}$: log ratio of the normal versus tumoral measurements for genomic sequence (GS) $g$ on chromosome $k$
Dye bias is estimated using a reference array (normal/normal) and then subtracted from $Z_{gk}$
$$Z_{gk} \sim w_{1gk}\,N(\mu_1, \sigma_1^2) + w_{2gk}\,N(\mu_2, \sigma_2^2) + w_{3gk}\,N(\mu_3, \sigma_3^2)$$
Components: 1 = deletion, 2 = presence, 3 = gain
For unique labelling: $\mu_1 < 0$, $\mu_3 > 0$, and $\mu_2 = 0$ (dye bias has been adjusted)
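As a minimal sketch of the three-component density above, the following evaluates the mixture for a single GS and the conditional probability of each state given fixed parameters. The function names and all numeric values are illustrative assumptions, not estimates or code from the talk.

```python
import math


def normal_pdf(z, mean, sd):
    """Density of N(mean, sd^2) evaluated at z."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def mixture_density(z, weights, means, sds):
    """Three-component mixture density for one genomic sequence.

    weights = (w1, w2, w3) for deletion, presence, gain; they must sum to 1.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * normal_pdf(z, m, s) for w, m, s in zip(weights, means, sds))


def responsibilities(z, weights, means, sds):
    """Probability of each state given z, for fixed weights and parameters."""
    parts = [w * normal_pdf(z, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(parts)
    return [p / total for p in parts]


# Illustrative values only (assumptions, not results from the talk).
means = (-0.4, 0.0, 0.4)    # mu1 < 0, mu2 = 0, mu3 > 0
sds = (0.3, 0.3, 0.3)
weights = (0.2, 0.6, 0.2)
resp = responsibilities(-0.5, weights, means, sds)
```

With these made-up values, a log ratio of -0.5 is most compatible with the deletion component, then the modal one, and hardly at all with the gain component.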
Mixture model with spatial allocation
$$Z_{gk} \sim w_{1gk}\,N(\mu_1, \sigma_1^2) + w_{2gk}\,N(\mu_2, \sigma_2^2) + w_{3gk}\,N(\mu_3, \sigma_3^2)$$
Spatial structure on the weights (cf. Fernandez and Green, 2002):
• Introduce 3 centred Markov random fields $\{u^m_{gk}\}$, $m = 1, 2, 3$, with nearest neighbours along the chromosomes
• Define mixture proportions to depend on the chromosomic location via a logistic model:
$$w_{cgk} = \frac{\exp(u^c_{gk})}{\sum_m \exp(u^m_{gk})}$$
This favours allocation of nearby GS to the same component
Spatial neighbours of GS $g$: $g-1$ and $g+1$
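The logistic link from the three random fields to the mixture weights can be sketched as follows. The field values are hypothetical, chosen only to show that smoothly varying fields give similar weights at adjacent GS; `spatial_weights` is an illustrative name.

```python
import math


def spatial_weights(u):
    """Map the field values (u1, u2, u3) at one GS to mixture weights."""
    shift = max(u)  # subtracting the maximum avoids overflow and leaves the ratios unchanged
    expu = [math.exp(x - shift) for x in u]
    total = sum(expu)
    return [e / total for e in expu]


# Hypothetical field values at three neighbouring GS (g-1, g, g+1).
fields = [(1.0, 0.2, -1.2), (1.1, 0.1, -1.2), (0.9, 0.3, -1.2)]
weights = [spatial_weights(u) for u in fields]
```

Each weight vector sums to one, and because the fields change little between neighbours, all three GS here put most of their weight on the same component.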
Prior structure
• $w_{cgk} = \exp(u^c_{gk}) / \sum_m \exp(u^m_{gk})$
with Gaussian Conditional AutoRegressive (CAR) model:
$$u^c_{gk} \mid u^c_{-gk} \sim N\left(\sum_h u^c_{hk} / n_g,\; \sigma_{ck}^2 / n_g\right)$$
for $h$ = neighbour of $g$ ($n_g = \#h$, one or two in this simple case), with constraint $\sum_g u^c_{gk} = 0$
• Variance parameters $\sigma_{ck}^2$ of the CAR act as a smoothing prior. They are indexed by the chromosome, so the 'switching structure' between the states can differ between chromosomes
• Means and variances $(\mu_c, \sigma_c^2)$ of the mixture components are common to all chromosomes: borrowing information
• Inverse gamma priors for the variances, uniform priors for the means
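To make the CAR conditional concrete, here is a small sketch that computes the conditional mean and variance of one field value given its neighbours along a chromosome, plus a helper that imposes the sum-to-zero constraint. Function names and numbers are assumptions for illustration; the talk's actual implementation is in WinBUGS.

```python
def car_conditional(u, g, sigma2):
    """Conditional mean and variance of u[g] given its nearest neighbours.

    u      : field values along one chromosome for one component c
    g      : 0-based position of the GS
    sigma2 : CAR variance for this chromosome and component
    """
    neighbours = [h for h in (g - 1, g + 1) if 0 <= h < len(u)]
    n_g = len(neighbours)  # one at the chromosome ends, two elsewhere
    if n_g == 0:
        raise ValueError("need at least two GS on the chromosome")
    mean = sum(u[h] for h in neighbours) / n_g
    return mean, sigma2 / n_g


def centre(u):
    """Shift the field so that it sums to zero over the chromosome."""
    m = sum(u) / len(u)
    return [x - m for x in u]


# Hypothetical field along a 4-GS chromosome.
u = centre([0.5, 1.0, 1.5, -1.0])
interior = car_conditional(u, 1, 0.4)   # two neighbours
edge = car_conditional(u, 0, 0.4)       # one neighbour
```

An interior GS averages its two neighbours and has half the conditional variance of an end GS, which is why a small CAR variance smooths the weights more strongly.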
Posterior quantities of interest
• Bayesian inference via MCMC, implemented using WinBUGS
• In particular, latent allocations $L_{gk}$ of GS $g$ on chromosome $k$ to state $c$ are sampled during the MCMC run
• Compute posterior allocation probabilities:
$$p^c_{gk} = P(L_{gk} = c \mid \text{data}), \quad c = 1, 2, 3$$
• Probabilistic classification of each GS using a threshold on $p^c_{gk}$:
-- Assign $g$ to a modified state, deletion ($c=1$) or gain ($c=3$), if the corresponding $p^c_{gk} > 0.8$
-- Otherwise allocate to the modal state
This gives the subset $S$ of genomic sequences classified as modified (this subset depends on the chosen threshold)
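The classification rule can be sketched in a few lines, assuming the sampled allocations for one GS are available as a list of labels in {1, 2, 3}. The function names and the example draws are hypothetical.

```python
def allocation_probabilities(draws):
    """Estimate (p1, p2, p3) for one GS from its sampled allocation labels."""
    n = len(draws)
    return tuple(draws.count(c) / n for c in (1, 2, 3))


def classify(p, threshold=0.8):
    """Return 1 (deletion) or 3 (gain) if that probability exceeds the
    threshold, otherwise 2 (modal state)."""
    p_deletion, _, p_gain = p
    if p_deletion > threshold:
        return 1
    if p_gain > threshold:
        return 3
    return 2


# Hypothetical MCMC output for one GS: 9 of 10 draws allocate it to deletion.
p = allocation_probabilities([1, 1, 1, 1, 2, 1, 1, 1, 1, 1])
state = classify(p)
```

Because the threshold is above 0.5, at most one of the two modified states can pass it, so the order of the two checks does not matter.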
False Discovery Rate
• Using the posterior allocation probabilities, one can compute an estimate of the FDR for the list $S$:
$$\text{Bayes FDR}(S) \mid \text{data} = \frac{1}{\text{card}(S)} \sum_{g \in S} p^2_{gk}$$
where $p^2_{gk}$ is the posterior probability of allocation to the modal ($c=2$) state (the superscript 2 is the state index, not a square)
Note: the threshold can be adjusted to get a desired FDR, and vice versa
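A minimal sketch of this estimate: select the GS whose deletion or gain probability exceeds the threshold, then average their modal-state probabilities. The posterior values below are made up, and returning 0.0 for an empty list is a convention chosen here, not something stated in the talk.

```python
def bayes_fdr(posteriors, threshold=0.8):
    """Return (indices of the selected GS, estimated FDR of that list).

    posteriors: one (p1, p2, p3) tuple per GS.
    """
    selected = [
        i for i, (p1, _, p3) in enumerate(posteriors)
        if p1 > threshold or p3 > threshold
    ]
    if not selected:
        return selected, 0.0  # convention for an empty list (assumption)
    fdr = sum(posteriors[i][1] for i in selected) / len(selected)
    return selected, fdr


# Hypothetical posterior allocation probabilities for four GS.
posteriors = [
    (0.90, 0.10, 0.00),
    (0.10, 0.85, 0.05),
    (0.00, 0.05, 0.95),
    (0.50, 0.50, 0.00),
]
selected, fdr = bayes_fdr(posteriors)
stricter, stricter_fdr = bayes_fdr(posteriors, threshold=0.92)
```

Raising the threshold shrinks the list and lowers the estimated FDR, which is the trade-off the note on the slide refers to.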
3 -- Performance
Simulation set-up
• 200 fake GS with
$Z \sim N(0, 0.3^2)$: modal
$Z \sim N(\log 2, 0.3^2)$: deletion, a block of 30 GS
$Z \sim N(-\log 2, 0.3^2)$: gains, blocks of 20 and 10 GS
• Reference array with $Z \sim N(0, 0.3^2)$
• 50 replications
[Figure: layout of the simulated profile: Modal, Deletion (30 GS), Modal, Gain (20 GS), Modal, Gain (10 GS), Modal]
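A set-up of this kind could be generated as below. Several details are assumptions: the block positions are hypothetical (the slide gives only block lengths), "log 2" is taken as the natural logarithm, and the signs of the deletion and gain means follow the slide as printed. Flip `deletion_mean` and `gain_mean` if the log ratio is oriented so that deletions are negative, as in the labelling $\mu_1 < 0$.

```python
import math
import random

# Hypothetical (start, stop, state) positions; only the lengths 30, 20, 10
# come from the slide.
BLOCKS = [(40, 70, "deletion"), (100, 120, "gain"), (150, 160, "gain")]


def simulate_array(rng, n_gs=200, sd=0.3, blocks=BLOCKS,
                   deletion_mean=math.log(2), gain_mean=-math.log(2)):
    """Simulate one array: returns (log ratios, true states)."""
    states = ["modal"] * n_gs
    for start, stop, state in blocks:
        for g in range(start, stop):
            states[g] = state
    centres = {"modal": 0.0, "deletion": deletion_mean, "gain": gain_mean}
    z = [rng.gauss(centres[s], sd) for s in states]
    return z, states


def simulate_reference(rng, n_gs=200, sd=0.3):
    """Simulate a normal/normal reference array."""
    return [rng.gauss(0.0, sd) for _ in range(n_gs)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
replications = [simulate_array(rng) for _ in range(50)]
reference = simulate_reference(rng)
```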
CGH-Miner
• Data mining approach to select gains and losses (Wang et al. 2005):
– Hierarchical clustering with a spatial constraint (i.e. only spatially adjacent clusters are joined)
– Subtree selection according to predefined rules, with a focus on selecting large consistent gain/loss regions and small (big spike) regions
– Implemented in the CGH-Miner Excel plug-in
• Estimation of FDR using a reference (normal/normal) array and the same set of rules to prune the tree. Declared target: 1%
• Simulation set-up is similar to that of Wang et al.
Classification obtained by CGH-Miner and CGHmix
[Figure: classification along the simulated profile: Modal, Deletion (30 GS), Modal, Gain (20 GS), Modal, Gain (10 GS), Modal]
Posterior probabilities of allocation to the 3 components
Comparative performance between CGHmix and CGH-Miner

| 50 simulations | CGHmix | CGH-Miner |
|---|---|---|
| Realised false positives (mean) | 1.9 | 16.4 |
| Realised false positives (range) | 0 to 20 | 3 to 39 |
| Realised false negatives (mean) | 1.0 | 9.6 |
| Realised false negatives (range) | 0 to 4 | 0 to 50 |
| Realised FDR (%) | 2.8 | 23.7 |
| Estimated FDR (%) | 1.3 | 1.2 |
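For one simulated array, realised error counts of this kind can be obtained by comparing the calls with the known truth. The sketch below treats the problem as binary (altered versus modal); the talk does not spell out its exact definitions, so this is an assumption, as is the convention of returning 0.0 when nothing is called.

```python
def realised_errors(truth, calls):
    """Return (false positives, false negatives, realised FDR).

    truth, calls: lists of booleans, True meaning 'altered' (deletion or gain).
    """
    fp = sum(1 for t, c in zip(truth, calls) if c and not t)
    fn = sum(1 for t, c in zip(truth, calls) if t and not c)
    n_called = sum(calls)
    fdr = fp / n_called if n_called else 0.0  # convention for no calls (assumption)
    return fp, fn, fdr


# Hypothetical truth and calls for five GS.
truth = [True, True, False, False, True]
calls = [True, False, True, False, True]
fp, fn, fdr = realised_errors(truth, calls)
```

Averaging such per-array quantities over the replications gives summary figures comparable in form to those in the table.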
4 -- Analyses of CGH-array cancer data sets
Breast cancer cell line MCF7
• Data from Pollack et al., 6691 GS on 23 chromosomes
• $\hat{\mu}_1 = -0.35$, $\hat{\sigma}_1 = 0.37$
• ($\mu_2 = 0$), $\hat{\sigma}_2 = 0.27$
• $\hat{\mu}_3 = 0.44$, $\hat{\sigma}_3 = 0.54$
• Estimated FDR CGHmix = 2.6%
• Estimated FDR CGH-Miner = 1.5%
Classification of GS obtained by CGHmix
[Figure annotations: known alterations found by both methods; additional known alterations found by CGHmix]
Neuroblastoma KCNR cell line
Curie Institute CGH custom array for chromosome 1
• 190 genomic clones, mostly on the short arm
• 3 replicate spots for each
• $\hat{\mu}_1 = -0.49$, loss component
• $\hat{\mu}_3 = 0.04$, not plausible as a gain: no gain in this case
• Estimate FDR by regrouping the $c=2$ and $c=3$ classes
• Substantial number of deletions on the short arm
• No deletion found for the long arm by CGHmix, a result confirmed by classical cytogenetic information
[Figure: chromosome 1 results, with the long arm indicated]
Extensions
• Account for variability in the case of repeated measurements: add a measurement model with GS-specific noise, with an exchangeable prior
• Refine the spatial model:
– Incorporate genomic sequence location in the neighbourhood definition of the CAR model: move from 0-1 contiguity to spatial weights
– In particular, account for overlapping sequences by using weights that depend on the overlap