Inferring clonal composition of a breast cancer from multiple tissue samples Habil Zare Department of Genome Sciences University of Washington 19 Dec 2013 1


Page 1

Inferring clonal composition of a breast cancer from multiple tissue samples

Habil Zare

Department of Genome Sciences

University of Washington

19 Dec 2013

Page 2

Hypothesis

Because cancer is a heterogeneous disease, synergistic medications can treat it better than a single drug.

Page 3

Traditional concept of a tumor

Schematic figure

Page 4

Most tumors are heterogeneous

Schematic figure

Page 5

Different clones have different genotypes and phenotypes

Clone 1

Clone 2

Clone 3

Clone 4

Clone 5

Clone 6

Schematic figure

Page 6

It is important to identify the clonal composition

Treatment A

Treatment B

Relapse

Relapse

Page 7

It is important to identify the clonal composition

?

Page 8

It is important to identify the clonal composition

?

Page 9

Our approach to analyze multiple samples from a single tumor

Page 10

Our approach to analyze multiple samples from a single tumor

Page 11

Each sample has different information about the clonal composition

PCR

PCR

PCR

Next Gen Sequencing

Next Gen Sequencing

Next Gen Sequencing

Counting the number of reads which support each mutation

Page 12

A closer look at the Next-Gen Sequencing output

• At each locus, 2 integers are provided: the total number of analyzed reads, and the number of reads supporting the mutation.

• Because different clones have different contributions to each sample, these numbers vary across the samples.

How to use this variation to infer the clonal composition?
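As a small illustration of the two integers reported per locus, the sketch below computes the observed fraction of variant reads in each sample. The counts are invented for illustration and are not from the study, and the function name is hypothetical.

```python
# Hypothetical counts for ONE locus across three samples:
# (total analyzed reads, reads supporting the mutation). Invented values.
counts = [(120, 6), (95, 31), (140, 70)]


def variant_fraction(total_reads, variant_reads):
    """Observed fraction of reads that support the mutation."""
    if total_reads == 0:
        return float("nan")  # no coverage at this locus
    return variant_reads / total_reads


fractions = [variant_fraction(total, variant) for total, variant in counts]
for sample, fraction in enumerate(fractions, start=1):
    print(f"sample {sample}: {fraction:.2f}")
```

The spread of these fractions across samples is the variation that the model described on the following slides tries to explain.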

Page 13

The observations

The observations boil down to the number of reads which support each allele.

• M samples
• Mutations on N loci

Building a generative model

Tumor

Page 14

Building a generative model

Given the parameters, how to generate data?

Data

Parameters

Generate

Page 15

Data

Parameters

Generate

?

Building a generative model

Given the parameters, how to generate data?

Page 16

Generate

Building a generative model

Parameters?

Page 17

The main assumption on the distribution of reads

Mutation i can be present or absent in each clone

Project on Mutation i

Building a generative model

Assumption: Reads are analyzed uniformly at random, so the number of reads supporting a mutation is binomially distributed.

Page 18

The main assumption on the distribution of reads

Mutation i can be present or absent in each clone

Project on Mutation i

Number of reads exhibiting the variant allele at locus i in sample j.

Total number of reads

Frequency of variant allele

Assumption

Building a generative model
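The three labels on this slide describe the pieces of a binomial model, but the slide's own symbols did not survive in this transcript. One way to write it, with X, R and θ as assumed names (R is used for the read total to avoid a clash with N, the number of loci):

```latex
% Assumed notation: X_{i,j} = reads exhibiting the variant allele at locus i in sample j,
% R_{i,j} = total number of reads, \theta_{i,j} = frequency of the variant allele.
X_{i,j} \sim \mathrm{Binomial}\left(R_{i,j},\ \theta_{i,j}\right),
\qquad
\Pr\left(X_{i,j} = x\right) = \binom{R_{i,j}}{x}\, \theta_{i,j}^{\,x}\, \left(1 - \theta_{i,j}\right)^{R_{i,j} - x}
```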

Page 19

A close look at the binomial distribution

Total number of reads: observed

Frequency of variant allele: ?

Number of reads exhibiting the variant: observed

The frequency of the variant allele depends on:
1. Which clones contain mutation i?
2. What is the frequency of those clones in sample j?

Building a generative model

Page 20

Introducing the hidden variables

If Z_{i,c} = 1, clone c has a variant allele at locus i.

The frequency of the variant allele depends on:
1. Which clones contain mutation i?
2. What is the frequency of those clones in sample j?

Building a generative model

Page 21

Notation for the model parameters

The frequency of the variant allele depends on:
1. Which clones contain mutation i?
2. What is the frequency of those clones in sample j?

Building a generative model
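The two questions above suggest how the variant allele frequency could be written in terms of the genotype matrix Z and the frequency matrix P. The formula itself is not preserved in this transcript, so the following is a sketch under assumptions: the index convention P_{c,j} (frequency of clone c in sample j), the symbol θ, and the factor 1/2 (every mutation taken to be heterozygous in a diploid region) are all assumed here rather than taken from the slide.

```latex
% Assumed: Z_{i,c} \in \{0,1\}, P_{c,j} \ge 0 with \sum_{c=1}^{C} P_{c,j} = 1 for each sample j.
% The factor 1/2 encodes the (assumed) heterozygous case: one of two alleles is mutated.
\theta_{i,j} = \frac{1}{2} \sum_{c=1}^{C} Z_{i,c}\, P_{c,j}
```

For example, with two clones where only the second carries mutation i and makes up 30% of sample j, this gives θ = 0.5 × 0.3 = 0.15.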

Page 22

Building a generative model

Parameters

C

Generate

?

Page 23

The assumptions

• Each mutation can occur at a locus independently at random.
• The samples are independent from each other.

Building a generative model
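Under these independence assumptions, the likelihood of all observed counts would factorize over loci and samples. A sketch, with assumed notation (X_{i,j} variant reads, R_{i,j} total reads, θ_{i,j} the variant allele frequency at locus i in sample j, which is determined by Z and P):

```latex
% Product over the N loci and the M samples; each factor is one binomial term.
\Pr\left(X \mid Z, P\right)
  = \prod_{i=1}^{N} \prod_{j=1}^{M}
    \binom{R_{i,j}}{X_{i,j}}\,
    \theta_{i,j}^{\,X_{i,j}}\,
    \left(1 - \theta_{i,j}\right)^{R_{i,j} - X_{i,j}}
```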

Page 24

Building a generative model

Parameters

C

Generate

Technical

Page 25

Overview of the generative model from parameters to the observations

C

Parameters

Observations
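The path from parameters to observations can be sketched in a few lines of code. This is an illustration under assumptions, not the released software: the function names, the toy values, the layout of P as clones by samples, and the heterozygous factor 0.5 are all hypothetical choices.

```python
import random


def variant_allele_frequency(Z, P, i, j, het_factor=0.5):
    """theta_ij = het_factor * sum over clones c of Z[i][c] * P[c][j].

    het_factor=0.5 assumes heterozygous mutations (an assumption of this sketch).
    """
    return het_factor * sum(Z[i][c] * P[c][j] for c in range(len(P)))


def generate_counts(Z, P, total_reads, rng):
    """Draw variant read counts X[i][j] ~ Binomial(total_reads[i][j], theta_ij)."""
    X = []
    for i in range(len(Z)):
        row = []
        for j in range(len(P[0])):
            theta = variant_allele_frequency(Z, P, i, j)
            # Binomial draw written as a sum of Bernoulli trials (stdlib only).
            row.append(sum(rng.random() < theta for _ in range(total_reads[i][j])))
        X.append(row)
    return X


# Toy parameters (invented): N = 3 loci, C = 2 clones, M = 2 samples.
Z = [[0, 1], [0, 0], [0, 1]]   # genotype matrix; clone 0 carries no mutations
P = [[0.7, 0.2], [0.3, 0.8]]   # clone frequencies; each sample (column) sums to 1
R = [[200, 200], [200, 200], [200, 200]]  # total reads per locus and sample
X = generate_counts(Z, P, R, random.Random(0))
print(X)
```

A locus that no clone carries (the middle row of Z) always yields zero variant reads here, because the sketch does not model sequencing error.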

Page 26

Inference

Given the observed counts, how do we infer the clonal structure?

C

Inference

Technical

EM

Page 27

We infer model parameters using expectation-maximization

Details omitted

Derived from the binomial distribution

Derived from Bernoulli distribution
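The slide omits the update equations, so the following is not the talk's EM algorithm. It is a simplified, brute-force stand-in for an E-step at a single locus: given clone frequencies P, it enumerates every possible genotype vector and computes the posterior probability that each clone carries the mutation. The uniform prior over genotypes, the heterozygous factor 0.5, the function name and the toy numbers are all assumptions of this sketch.

```python
import itertools
import math


def e_step_one_locus(x, r, P, het_factor=0.5, eps=1e-9):
    """Posterior Pr(Z_c = 1 | counts) for each clone c at a single locus.

    x[j], r[j]: variant and total reads in sample j.
    P[c][j]: frequency of clone c in sample j.
    Enumerates all 2**C genotype vectors with a uniform prior (assumption).
    """
    C, M = len(P), len(x)
    log_w = {}
    for z in itertools.product((0, 1), repeat=C):
        ll = 0.0
        for j in range(M):
            theta = het_factor * sum(z[c] * P[c][j] for c in range(C))
            theta = min(max(theta, eps), 1.0 - eps)  # avoid log(0)
            # Binomial log-likelihood; the binomial coefficient does not depend on z.
            ll += x[j] * math.log(theta) + (r[j] - x[j]) * math.log(1.0 - theta)
        log_w[z] = ll
    top = max(log_w.values())  # subtract the maximum for numerical stability
    weights = {z: math.exp(v - top) for z, v in log_w.items()}
    total = sum(weights.values())
    return [sum(w for z, w in weights.items() if z[c]) / total for c in range(C)]


# Toy input (invented): 2 clones, 2 samples, counts that match "only clone 1 is mutated".
P = [[0.7, 0.2], [0.3, 0.8]]
posterior = e_step_one_locus(x=[30, 80], r=[200, 200], P=P)
print(posterior)
```

Enumeration costs 2^C evaluations per locus, so it is only practical for a handful of clones. A full EM would also need an M-step that re-estimates P, which is not shown.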

Page 28

How can we evaluate whether the model works?

Inference

Two rounds of next gen sequencing

C

Page 29

We do not know the reality

~Inferred Reality

Page 30

Generating synthetic data

Inference

C`

Generate

Page 31

Inference

C

Generate

Generating synthetic data

Random parameters

compare

Page 32

Defining accuracy criteria

• Genotype error: The frequency of false entries in the genotype matrix Z

• Clone frequency error: The average error in entries of the frequency matrix P
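The two criteria can be sketched as follows. The slide does not say how "average error" is measured, so mean absolute difference is an assumption here, as are the function names and toy matrices. The sketch also assumes the inferred clones are already listed in the same order as the true clones; matching clone labels is not handled.

```python
def genotype_error(Z_true, Z_est):
    """Fraction of entries of the genotype matrix Z that differ."""
    pairs = [(a, b) for row_t, row_e in zip(Z_true, Z_est)
             for a, b in zip(row_t, row_e)]
    return sum(a != b for a, b in pairs) / len(pairs)


def clone_frequency_error(P_true, P_est):
    """Mean absolute difference between entries of the frequency matrix P."""
    pairs = [(a, b) for row_t, row_e in zip(P_true, P_est)
             for a, b in zip(row_t, row_e)]
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


# Toy matrices (invented): 3 loci x 2 clones for Z, 2 clones x 2 samples for P.
Z_true = [[0, 1], [0, 0], [0, 1]]
Z_est = [[0, 1], [0, 1], [0, 1]]   # one wrong entry out of six
P_true = [[0.7, 0.2], [0.3, 0.8]]
P_est = [[0.6, 0.2], [0.4, 0.8]]   # two entries off by 0.1

z_err = genotype_error(Z_true, Z_est)
p_err = clone_frequency_error(P_true, P_est)
print(z_err, p_err)
```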

Page 33

Simulation shows genotype error decreasing with increasing samples

Page 34

Simulation shows genotype error decreasing with increasing samples

Page 35

Clone frequency error shows a similar trend

Page 36

M1

P1

P3

P2

Experiment with real data

Study on a primary breast cancer

• 10 breast tumor samples
• 1 adjacent normal
• 2 samples from the metastatic lymph node

Page 37

Clone frequencies vary smoothly across the tumor sections

The model doesn’t know anything about the anatomic location of the samples!

Page 38

Clone frequencies vary smoothly across the tumor sections

Page 39

Phylogenetic analysis tells the story of the tumor over time

Page 40

Five clone solution

Page 41

Six-clone solution is consistent with five-clone solution

Page 42

Next-Gen Sequencing Data

Oncologists

Clonal structure

EM

Validated by simulations

Anatomic variation of clones

Phylogenetic trees

Overview of the project

Inferring clonal composition of a breast cancer from multiple tissue samples

Page 43

Software publicly available

Page 44

Supplementary slides

Page 45

Proposed project based on former experiences:

Identifying clonal decomposition using sub-tissues

SamSPECTRAL

Sort cell populations

Next Gen Sequencing

Next Gen Sequencing

Next Gen Sequencing

Next Gen Sequencing

Leukemia or lymphoma sample

Clonal analysis