clustering from structural variation in endometrial...

1
Clustering from structural variation in endometrial cancer Julian Stanley, Shreepriya Das, Michele Cummings, Nicolas Orsi, & Jeremy Gunawardena Department of Systems Biology, Harvard Medical School, Boston, MA 02115 Picture courtesy of Affymetrix Picture courtesy of Leica Biosystems Endometrial cancers are the fourth most common cancer in women worldwide, and identification of molecular characteristics that define subtypes of the cancer may complement traditional diagnostic techniques. We obtained 95 hysterectomies of different grades of endometrial cancer and analyzed structural variation events from formalin-fixed samples. Hierarchical clustering of copy number variation (CNV) revealed two clusters of endometrial cancer that also correspond to survival. Clustering can also be accomplished by a simple technique of plotting patients in terms of aggregate copy number gain and loss. Further work needs to be done to identify genetic markers that may aid diagnosis, and how those markers compare across cancer types. Grade 1 or 2 Endometroid Grade 3 Endometroid Non-Endometroid Better Prognosis Worse Prognosis A simplified view of endometrial cancer subtypes Grade 1 or 2 Endometroid Grade 3 Endometroid Non-Endometroid Better Prognosis Worse Prognosis A simplified view of endometrial cancer subtypes Grade 1 or 2 Endometroid “Type-1” Grade 3 Endometroid “Type-1” or “Type 2” Non-Endometroid “Type 2” Better Prognosis Worse Prognosis A simplified view of endometrial cancer subtypes Picture courtesy of Cancer Research UK Figure courtesy of Watkins et al. 2014 in Breast Cancer Research. Green bar presents copy number gain, red bar represents copy number loss. Figures represent: (left) copy number gain leading to allelic imbalance, (center) copy number loss leading to LOH, (right) recombination leading to copy neutral LOH. Copy number variation (CNV), allelic imbalance, and loss of heterozygosity (LOH) are types of structural variation Structural variation analysis CN Loss Allelic Imbalance LOH CN Gain Grade 1 Endometroid (11) Grade 2 Endometroid (21) Non-Endometroid (31) Grade 3 Endometroid (32) Figure 1. Aggregate (a) copy number gain and loss and (b) allelic imbalance and loss of heterozygosity in 95 endometrial cancer FFPE samples a b Cluster 1 (68 Patients) 2 (27 Patients) Cluster 1 (Low Copy Number) Cluster 2 (High Copy Number) Log-rank P = 0.0025 Figure 2. Method: Unsupervised hierarchical clustering based on Euclidean distance, Ward linkage, and silhouette optimization. a: Copy number variation separates patients into two clusters. Green represents CN loss and red represents CN gain. b: Clusters have a statistically and clinically significant difference in survival. Figure 3. Patient clusters in copy number space, labeled by clinical diagnosis. K-means clustering on a simple metric shows clusters that have significant difference in survival (not shown). CNV-based clusters Does the method also apply to ovarian cancer? Normalized copy number gain Fraction survived Time since diagnosis (days) Normalized copy number loss Clear Cell: 19 Low Grade EDM: 14 Mucinous: 16 High Grade Serous: 19 Low Grade Serous: 18 Clear Cell: 19 Low Grade Endometroid: 14 Mucinous: 16 High Grade Serous: 19 Low Grade Serous: 18 Figure 4. Ovarian cancer patient clusters in copy number space. Low grade endometroid are least severe Low copy number Mucinous is intermediate, but based on CN we predict less severe Low grade serous and clear cell are intermediate; they distribute normally High grade serous are most serious High copy number Figure 5. Copy number calls were made and analyzed by Nexus Express Software. Breast and renal cell cancers were used as comparisons with respectively similar and different risk factors as endometrial and ovarian cancer. a: CN gain (blue) and CN loss (red) and b: allelic imbalance (purple) and LOH (brown) shown as a proportion of samples with the given variation. Discussion Both endometrial and ovarian cancer are extremely heterogenous diseases that affect hundreds of thousands of women worldwide. Microarray analysis is a powerful tool that may provide information that can lead to more personalized cancer treatments. The Oncoscan platform, based on the molecular inversion probe technique, can consistently detect variation heavily-degraded FFPE clinical samples. Clustering techniques and comparative analyses are necessary to understand what differences between cancers are visible from genomic analyses, and how different molecular profiles can complement traditional diagnostics. Acknowledgements References I am extremely grateful to HMS Systems Biology and the Summer Internship in Systems Biology program for the welcoming working environment. Thanks to Shreepriya Das and Jeremy Gunawardena for their invaluable mentorship during this short internship, and to Nicolas Orsi and Michele Cummings for providing samples and helpful clinical expertise. 1. Levine, DA et al. 2013 Integrated genomic characterization of endometrial carcinoma. Nature 497, 67-73. 2. Zhang, J. 2018 CNTools: Convert segment data into a region by sample matrix to allow for other high level computational analyses. R package version 1.36.0. 3. Kaveh, F. et al. 2016 A systematic comparison of copy number alterations in four types of female cancer. BMC Cancer 16:913. 4. Jung, H. et al. 2017 Utilization of the Oncoscan microarray assay in cancer diagnostics. Applied Cancer Research 37:1. Contact: [email protected] Normalized copy number gain Normalized copy number loss Moving forward: comparative analyses Better Prognosis Worse Prognosis Grade 3 Endometroid Grade 1 Endometroid Grade 2 Endometroid Non-Endometroid Less severe More severe Intermediate DNA extracted from FFPE tissue was analyzed using the Oncoscan Copy Number assay % of samples with CN gain % of samples with CN loss % of samples with allelic imbalance % of samples with LOH Breast Cancer (11) Endometrial Cancer (95) Ovarian Cancer (89) Renal Cell Cancer (15) Breast Cancer (11) Endometrial Cancer (95) Ovarian Cancer (89) Renal Cell Cancer (15) Chromosome: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 18 20 22 17 19 21 X Aggregate % of samples Aggregate % of samples Chromosome: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 18 20 22 17 19 21 X

Upload: others

Post on 17-Oct-2019

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Clustering from structural variation in endometrial cancervcp.med.harvard.edu/papers/poster-julian-stanley.pdf · Julian Stanley, Shreepriya Das, Michele Cummings, Nicolas Orsi, &

Clustering from structural variation in endometrial cancer Julian Stanley, Shreepriya Das, Michele Cummings, Nicolas Orsi, & Jeremy Gunawardena

Department of Systems Biology, Harvard Medical School, Boston, MA 02115

Picture courtesy of AffymetrixPicture courtesy of Leica Biosystems

Endometrial cancers are the fourth most common cancer in womenworldwide, and identification of molecular characteristics that define subtypesof the cancer may complement traditional diagnostic techniques. We obtained95 hysterectomies of different grades of endometrial cancer and analyzedstructural variation events from formalin-fixed samples. Hierarchical clusteringof copy number variation (CNV) revealed two clusters of endometrial cancerthat also correspond to survival. Clustering can also be accomplished by asimple technique of plotting patients in terms of aggregate copy number gainand loss. Further work needs to be done to identify genetic markers that mayaid diagnosis, and how those markers compare across cancer types.

Grade 1 or 2 Endometroid

Grade 3 Endometroid

Non-Endometroid

Better Prognosis

Worse Prognosis

A simplified view of endometrial cancer subtypes

Grade 1 or 2 Endometroid

Grade 3 Endometroid

Non-Endometroid

Better Prognosis

Worse Prognosis

A simplified view of endometrial cancer subtypes

Grade 1 or 2 Endometroid“Type-1”

Grade 3 Endometroid“Type-1” or “Type 2”

Non-Endometroid“Type 2”

Better Prognosis

Worse Prognosis

A simplified view of endometrial cancer subtypes

Picture courtesy of Cancer Research UK

Figure courtesy of Watkins et al. 2014 in Breast Cancer Research. Green bar presents copy number gain, red bar represents copy number loss. Figures represent: (left) copy number gain leading to allelic imbalance, (center) copy number loss leading to LOH, (right) recombination leading to copy neutral LOH.

Copy number variation (CNV), allelic imbalance, and loss of heterozygosity (LOH) are types of structural variation

Structural variation analysis

CN

Lo

ss

Allelic

Imb

alance

LOH

CN

Gain

Grade 1 Endometroid (11)

Grade 2 Endometroid (21)

Non-Endometroid (31)

Grade 3 Endometroid (32)

Figure 1. Aggregate (a) copy number gain and loss and (b) allelic imbalance and loss of heterozygosity in 95 endometrial cancer FFPE samples

a b

Cluster 1 (68 Patients) 2 (27 Patients)1 (68 Patients) 2 (27 Patients)

Cluster 1 (Low Copy Number)

Cluster 2 (High Copy Number) Log-rank P = 0.0025

Figure 2.

Method: Unsupervised hierarchical clustering based on Euclidean distance, Ward linkage, and silhouette optimization.

a: Copy number variation separates patients into two clusters. Green represents CN loss and red represents CN gain.

b: Clusters have a statistically and clinically significant difference in survival.

Figure 3. Patient clusters in copy number space, labeled by clinical diagnosis.

K-means clustering on a simple metric shows clusters that have significant difference in survival (not shown).

CNV-based clusters

Does the method also apply to ovarian cancer?

No

rmal

ized

co

py

nu

mb

er g

ain

Frac

tio

n s

urv

ived

Time since diagnosis (days)

Normalized copy number loss

Clear Cell: 19

Low Grade EDM: 14

Mucinous: 16

High Grade Serous: 19

Low Grade Serous: 18

Clear Cell: 19

Low Grade Endometroid: 14

Mucinous: 16

High Grade Serous: 19

Low Grade Serous: 18

Figure 4. Ovarian cancer patient clusters in copy number space.

❖ Low grade endometroid are least severe Low copy number❖Mucinous is intermediate, but based on CN we predict less severe❖ Low grade serous and clear cell are intermediate; they distribute normally❖ High grade serous are most serious High copy number

Figure 5. Copy number calls were made and analyzed by Nexus ExpressSoftware. Breast and renal cell cancers were used as comparisons withrespectively similar and different risk factors as endometrial and ovariancancer. a: CN gain (blue) and CN loss (red) and b: allelic imbalance (purple)and LOH (brown) shown as a proportion of samples with the given variation.

DiscussionBoth endometrial and ovarian cancer are extremely heterogenous diseasesthat affect hundreds of thousands of women worldwide. Microarray analysisis a powerful tool that may provide information that can lead to morepersonalized cancer treatments.

The Oncoscan platform, based on the molecular inversion probe technique,can consistently detect variation heavily-degraded FFPE clinical samples.

Clustering techniques and comparative analyses are necessary to understandwhat differences between cancers are visible from genomic analyses, andhow different molecular profiles can complement traditional diagnostics.

Acknowledgements

References

I am extremely grateful to HMS Systems Biology and the Summer Internshipin Systems Biology program for the welcoming working environment. Thanksto Shreepriya Das and Jeremy Gunawardena for their invaluable mentorshipduring this short internship, and to Nicolas Orsi and Michele Cummings forproviding samples and helpful clinical expertise.

1. Levine, DA et al. 2013 Integrated genomic characterization of endometrial carcinoma. Nature 497, 67-73.2. Zhang, J. 2018 CNTools: Convert segment data into a region by sample matrix to allow for other high levelcomputational analyses. R package version 1.36.0.3. Kaveh, F. et al. 2016 A systematic comparison of copy number alterations in four types of female cancer. BMC Cancer16:913.4. Jung, H. et al. 2017 Utilization of the Oncoscan microarray assay in cancer diagnostics. Applied Cancer Research 37:1.

Contact: [email protected]

No

rmal

ized

co

py

nu

mb

er g

ain

Normalized copy number loss

Moving forward: comparative analyses

Better Prognosis

Worse Prognosis

Grade 3 Endometroid

Grade 1 Endometroid

Grade 2 Endometroid

Non-Endometroid

Less severe

More severe

Intermediate

DNA extracted from FFPE tissue was analyzed using the Oncoscan Copy Number assay

% of samples with CN gain

% of samples with CN loss

% of samples with allelic imbalance

% of samples with LOH

Breast Cancer (11)

Endometrial Cancer (95)

Ovarian Cancer (89)

Renal Cell Cancer (15)

Breast Cancer (11)

Endometrial Cancer (95)

Ovarian Cancer (89)

Renal Cell Cancer (15)

Chromosome: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 18 20 2217 19 21 X

Agg

rega

te %

of

sam

ple

sA

ggre

gate

% o

f sa

mp

les

Chromosome: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 18 20 2217 19 21 X