ngs genetic analysis for national lung matrix trial stratified medicine technology hubs and illumina...
TRANSCRIPT
NGS GENETIC ANALYSIS FOR NATIONAL LUNG MATRIX TRIAL
STRATIFIED MEDICINE TECHNOLOGY HUBS AND ILLUMINA
22 JUNE 2015
Contents
– Overview of sample workflow in technology hubs
• Timings
• QC steps
– Overview of the SMP2 NGS panel and analysis
• Explanation of Nextera protocol
• Overview of the validation at Illumina
• Analysis / How aberrations are detected
• How the aberrations will be reported
Monday 22 June 2015
Lung Cancer Research Stratified Medicine Educational Event - Birmingham
TH Operations
Clinical Hub
• Sample prep and assessment
Technology Hub
• XML receipt and logging in
• Allocation of test
• Macro-dissection and DNA extraction
• Testing
Clinical Hub
• Receipt of results
• Monthly KPIs to CR-UK
Matrix Trial / CRUK
• Archive results to FTP
• Patient enters appropriate trial arm
(XML files are passed between the stages.)
Sample processing timings at the TH
– Pre-testing
– Post-testing
Pre-testing
Day 1
•Paired sample receipt
•Booked in
•Triaged
•Assign extraction / send for quantification
Day 2
•DNA extraction
Day 3
•DNA quantification
•>50ng – proceed to testing
•<50ng – send Pre-Test QC fail report for tumour and bank blood DNA
Day 4
•Select next 5 pairs ready for testing
•Prepare worksheets, etc
Day 5-10
•Nextera enrichment
• MiSeq run
QC step
WORK DONE BY ILLUMINA TO IDENTIFY A QC STEP
– Three samples run at different concentrations.
– With less than 30ng input DNA, no sequencing data is generated.
– 50ng input DNA to the NGS analysis is optimal.
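The Day 3 pre-test QC rule can be sketched as a small check. This is a minimal illustration only: the function name and return strings are hypothetical, and treating exactly 50ng as a pass is an assumption, since the slides only state ">50ng" and "<50ng".

```python
# Pre-test QC threshold taken from the Day 3 rule in the slides.
PRE_TEST_QC_THRESHOLD_NG = 50.0


def pre_test_qc(dna_ng: float) -> str:
    """Return 'proceed' or 'fail' for a quantified DNA amount in ng.

    Assumption: exactly 50 ng is treated as a pass; the slides only
    give '>50ng' (proceed) and '<50ng' (QC fail report).
    """
    if dna_ng >= PRE_TEST_QC_THRESHOLD_NG:
        return "proceed"
    return "fail"


for amount in (60.0, 30.0):
    print(amount, "ng ->", pre_test_qc(amount))
```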
[Figure: number of reads (PF_Duplicate_reads and PF_UNIQUE_READS, axis 0 to 10,000,000) and MEAN_TARGET_COVERAGE (axis 0 to 700) per sample, for NA12878, 12M09075 and D12-33773, each at 50ng, 30ng, 20ng and 10ng input DNA (samples S1 to S12).]
Factors affecting transitioning from Day 3 to Day 4
– Batching
• 5 pairs processed per NGS run
• 10 pairs processed max per week (2 MiSeq runs) (40 samples per month)
– Sample numbers
• Too few – waiting for samples to activate a batch
• Too many – samples in a queue waiting to start testing
– Timings
• Protocol for enrichment is complex – 5 day protocol
• Safe stopping points that are built into a working week
• Sample pair that is QC ready on a Thursday will wait until following Wednesday to start testing
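The batching limits above give a rough way to estimate how long a queue takes to clear. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and it assumes every run starts as soon as capacity allows, whereas in practice a batch may wait for 5 pairs to be available.

```python
import math

PAIRS_PER_RUN = 5   # "5 pairs processed per NGS run"
RUNS_PER_WEEK = 2   # "2 MiSeq runs" per week


def weeks_to_clear(queued_pairs: int) -> int:
    """Weeks needed to start testing on every queued sample pair.

    Assumption: partially filled runs are allowed and no new
    samples arrive while the queue is being worked through.
    """
    runs_needed = math.ceil(queued_pairs / PAIRS_PER_RUN)
    return math.ceil(runs_needed / RUNS_PER_WEEK)


for queue in (4, 10, 11, 25):
    print(queue, "pairs ->", weeks_to_clear(queue), "week(s)")
```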
Post-testing
Day 11
•Raw data retrieved from MiSeq
•Scientist 1 processes data and performs 1st analysis
Day 12
•Scientist 2 performs 2nd check
•Scientist 3 validates test result entry onto laboratory database
Day 13
•Reports written
Day 14
•Reports checked and authorised by senior scientist
•XML generated
Day 15
•XML reports available for retrieval by CH from sFTP site
[Figure: Batching effect on TaT (date rec'd to test start). Categories: <10 days, 10-20 days, >20 days; axis 0 to 50.]
[Figure: Average TaT by month, Dec-14 to May-15; axis label Reports/TAT, 0 to 200. Series: TaT from receipt NGS; TaT from test start NGS; TaT from receipt QC Fail. Data labels as extracted (series assignment not recoverable): 110, 165.9, 44.1, 19.5, 92, 26.6, 12.5, 11.4.]
CR-UK NGS Panel 2
– Developed in partnership with Illumina
– Panel linked to Matrix Trial
– Increased gene spectrum
– Increased mutation spectrum
Nextera hybridisation
28 gene
Must have matched blood sample and tumour % information for analysis stage
Detects SNVs, insertions/deletions, CNV, translocations
Single test that requires less DNA, analyses more genes, all types of mutational events, better quality result
AKT1 ALK BRAF CCND1
CCND2 CCND3 CCNE1 CDK2
CDK4 CDKN2A EGFR FGFR2
FGFR3 Her2* HRAS KRAS
MET NF1 NRAS NTRK1
PIK3CA PTEN RB1 RET
ROS1 STK11 TSC1 TSC2
Nextera library prep
PATIENT DNA IS PROCESSED TO GENERATE A SEQUENCING LIBRARY
– Collection of DNA fragments derived from the patient
DNA FRAGMENTS ARE FLANKED BY ADAPTOR SEQUENCES
– Allow fragments to ‘stick’ to the flow cell surface
• Transposons fragment and tag the DNA with primers
• Indexes are added to each sample so they can be separated
• Uses an enrichment or capture approach – positively select fragments of interest from your sequencing library
SMP2 Nextera NGS panel validation overview
– Validation was performed at Illumina
• Assessment of how panel works
• All types of variant were validated, but not CNV as no clinical samples were available
• Used samples previously tested by THs in their routine clinical pathways
• Cell line DNA provided by Horizon Discovery
  – translocations
  – known variant frequency
– Technology transfer to TH
• TH ran 3 Nextera panels in house
  – Panel 1, cell lines
  – Panel 2, previously analysed blood and tumour
  – Panel 3, ‘real’ SMP2 samples
– No issues with the transfer of the wet protocol.
Validation conclusions
– Combining the results of the validation for SNV and indels, we know with 95% certainty that the TH can detect SNV and indels at >10% overall allele frequency in at least 95% of cases.
– CNV are detectable in samples with high tumour percentage and if the CNV is large.
• Confident in calling >5 copy number increases
• TH will request a retrospective FISH slide to confirm low level or suspected copy number variants.
– Homozygous deletions are not detectable in samples with <60% of neoplastic nuclei, because of contamination from normal tissue.
• The TH will request a retrospective FISH slide to confirm the deletion is homozygous.
– Translocations are detected, but caution is required with poorer quality samples where less read depth is achieved
• Prospective validation by FISH analysis, through analysis of a number of translocation negative samples.
How aberrations are detected
DEVELOPMENT OF ANALYSIS TOOL BY THE THREE TH
– 4 stages to analysis
1 • Raw data output from MiSeq run
2 • Analysis of raw data using commercially available tools (Illumina Variant Studio), open source tools (Manta), or manually
3 • Collation of detected variants
  – Excel spreadsheet containing algorithms developed by TH-based bioinformaticians
  – Information on variants detected and coverage achieved across the panel test
4 • Assessment of pathogenicity / eligibility for Matrix Trial
  – Scientific assessment
[Diagram: analysis pipeline, numbered by the 4 stages. Platforms shown: virtual machine (Unix), MiSeq Reporter (Windows), Variant Studio, Excel.]
1 • Patient raw data from MiSeq (blood and FFPE)
2 • BWA alignment to human genome (generates BAM files)
  • Somatic variant caller: SNV and indels in FFPE and blood (generates gVCF files); min 3 events
  • Germline variants removed from FFPE; pass filter
  • Manta variant calling: structural changes (FFPE); min 3 events
  • Flag and remove duplicate reads that span chromosomes; germline variants removed from FFPE
  • Coverage calculator (FFPE): using coverage at each base per gene, calculates:
    – Average depth per gene
    – Min and max depth per gene
    – Mean depth per gene
3 • Excel
  • SNV calling:
    – Filter known SNP out
    – Filter 10% allele freq and min 10 reads
    – Manual raw data check if needed
  • Structural variant calling:
    – See if Manta has pulled out a variant.
  • CNV calling:
    – Plot graph of mean coverage per gene/mean coverage per sample
    – Filter gains >5 fold
    – Filter loss <0.5 fold
  • WT calling:
    – Add tumour %
    – Look at % of bases in each gene covered to required depth
    – Consider hotspot genes
    – Pass or fail gene
4 • Compare to tier variant list from Pharma
  • Report in XML
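The SNV and CNV filters listed for the Excel stage can be sketched in a few lines. This is an illustration, not the TH spreadsheet itself: the class and function names are hypothetical, "min 10 reads" is interpreted as 10 variant-supporting reads, and the >5 fold and <0.5 fold cut-offs are applied as strict inequalities.

```python
from dataclasses import dataclass

MIN_ALLELE_FREQ = 0.10    # "Filter 10% allele freq"
MIN_VARIANT_READS = 10    # "min 10 reads" (assumed: variant-supporting reads)
GAIN_FOLD = 5.0           # "Filter gains >5 fold"
LOSS_FOLD = 0.5           # "Filter loss <0.5 fold"


@dataclass
class Snv:
    gene: str
    variant_reads: int
    total_reads: int
    known_snp: bool = False


def passes_snv_filter(v: Snv) -> bool:
    """Keep a variant only if it is not a known SNP and meets both thresholds."""
    if v.known_snp or v.total_reads == 0:
        return False
    freq = v.variant_reads / v.total_reads
    return freq >= MIN_ALLELE_FREQ and v.variant_reads >= MIN_VARIANT_READS


def call_cnv(gene_mean_cov: float, sample_mean_cov: float) -> str:
    """Classify a gene from the ratio of its mean coverage to the sample mean."""
    ratio = gene_mean_cov / sample_mean_cov
    if ratio > GAIN_FOLD:
        return "gain"
    if ratio < LOSS_FOLD:
        return "loss"
    return "no call"


print(passes_snv_filter(Snv("KRAS", variant_reads=20, total_reads=100)))
print(call_cnv(gene_mean_cov=600.0, sample_mean_cov=100.0))
```

Candidates flagged this way would still go through the manual raw data check and the FISH confirmation steps described elsewhere in the slides.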
How aberrations will be reported
GENERAL RULES
– Variants are being reported in an XML format.
– Split into 3 tiers by the pharmaceutical partners.
• Tier 1 = trial eligible
• Tier 2 = trial eligible
• Tier 3 = not trial eligible
– These lists will be maintained throughout the programme through quarterly meetings to look at evidence for moving variants between tiers
– A confidence scoring system was developed to answer the question: how confident can you be that, in a given tumour sample, no variants above 10% frequency have been missed? This considers:
• Minimum sequencing coverage across the region of interest
• Tumour % of sample
• Can only be applied to SNVs and indels
Tumour %, depth of coverage, WT confidence, variant frequency are all linked
– The number of times a DNA base or gene is sequenced is called the depth/coverage or number of reads/read depth
– This is what makes NGS a quantitative assay
• 100% tumour material has a KRAS c.35G>A p.(Gly12Asp) present at 20%
• In 100 reads at base c.35, 80 reads = G and 20 reads = A
– NGS allows us to look for low level variants, i.e. ‘needle in a haystack’
– Based on validation, test sensitivity set at 10%
• Confidently call variants where we detect 10 reads in 100
– In 100% tumour material where we have 100 uniform reads at any given site, we can be confident that we have not missed any variant that was present above 10%
– Most material sent for testing will be less than 100% tumour even if macrodissected, so this will either
• Decrease sensitivity of detection if we keep a given read depth, e.g. 100 reads
• Maintain sensitivity by increasing read depth
Tumour % | % of reads mutant for a variant present at 10% frequency | Mutant reads | Minimum read depth required (10 mutant reads minimum)
5% | 0.5% | 1/200 | 2000
10% | 1% | 1/100 | 1000
20% | 2% | 1/50 | 500
40% | 4% | 1/25 | 250
50% | 5% | 1/20 | 200
100% | 10% | 1/10 | 100
[Figure: Relationship between tumour %, test sensitivity and NGS read depth. X axis: Tumour % (5, 10, 20, 40, 50, 100); Y axis: Read depth (0 to 2500).]
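The read depth figures in the table follow from simple arithmetic, reproduced below. This is a minimal sketch using an expected-value model only (no sampling variation); the function names are hypothetical.

```python
MIN_MUTANT_READS = 10   # "10 mutant reads minimum"
VARIANT_FREQ_PCT = 10   # test sensitivity: variant present at 10% frequency


def mutant_read_pct(tumour_pct: int) -> float:
    """Percentage of reads expected to carry the variant."""
    return tumour_pct * VARIANT_FREQ_PCT / 100


def min_read_depth(tumour_pct: int) -> int:
    """Smallest read depth at which 10 mutant reads are expected."""
    numerator = MIN_MUTANT_READS * 100 * 100
    denominator = tumour_pct * VARIANT_FREQ_PCT
    return -(-numerator // denominator)  # integer ceiling division


for tumour in (5, 10, 20, 40, 50, 100):
    print(f"{tumour}% tumour: {mutant_read_pct(tumour)}% mutant reads, "
          f"minimum depth {min_read_depth(tumour)}")
```

Integer arithmetic is used for the depth so that the values match the table exactly without floating point rounding.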
– Tumour % information critical for analysis
– Really important to define this to the nearest 10%.
– Coverage calculator works out % of bases within gene covered to the depth required
– Gene is passed or failed based on criteria below
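The coverage calculator step can be sketched as follows, assuming per-base depths for one gene are already available as a list. The function names are hypothetical, and the pass/fail criteria themselves are not given in this transcript, so the sketch stops at the percentage.

```python
from typing import Dict, Sequence


def percent_bases_covered(depths: Sequence[int], required_depth: int) -> float:
    """Percentage of bases in a gene sequenced to at least the required depth."""
    if not depths:
        return 0.0
    covered = sum(1 for d in depths if d >= required_depth)
    return 100.0 * covered / len(depths)


def gene_depth_summary(depths: Sequence[int]) -> Dict[str, float]:
    """Min, max and mean depth for one gene (depths must be non-empty)."""
    return {
        "min": min(depths),
        "max": max(depths),
        "mean": sum(depths) / len(depths),
    }


# Hypothetical per-base depths for a four-base region, 40% tumour sample,
# so the required depth from the table is 250 reads.
example_depths = [250, 300, 100, 260]
print(percent_bases_covered(example_depths, required_depth=250))
print(gene_depth_summary(example_depths))
```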
How aberrations will be reported
WILD TYPE SAMPLES OR FAILED SAMPLES
– Reports for wild type samples will say how confident the TH is that this is a true WT for SNV and indels.
– Samples that fail either NGS or QC step are eligible for a Matrix Trial re-biopsy.
– TH will never say no translocation or no copy number variation, as a confidence value for these is still being developed.
Outcome | Test Result | Test Report
Wild type | No variant detected | High/medium confidence
Gene test failed | No result | Fail. Repeat sample requested if available.
QC step failed | Not tested | Failed QC step - insufficient sample. Repeat sample requested if available.
How aberrations will be reported
VARIANTS DETECTED
– The general formats for variants in the ‘Test Result’ field are:
Type of variant | Test Result
Single nucleotide variants and small indels (Mutation/Sequence change/Indels) | c.[coding DNA position and base change] p.([amino acid position and change])
Translocation | Gene_Gene_exons
Copy number variant - deletions | Whole gene deletion homozygous; Whole gene deletion heterozygous; Exons deletion homozygous; Exons deletion heterozygous
Copy number variant - amplification | Whole gene amplification ± confirmed by FISH
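As an illustration of the SNV ‘Test Result’ format, the sketch below parses simple substitution strings such as "c.49G>A p.(Glu17Lys)". The pattern and function name are hypothetical; it covers single-base substitutions with three-letter amino acid codes only, not indels, translocations or copy number entries.

```python
import re
from typing import Optional

# Matches e.g. "c.49G>A p.(Glu17Lys)": coding position, base change,
# then protein position and amino acid change in three-letter codes.
SNV_RESULT = re.compile(
    r"^c\.(?P<pos>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])"
    r" p\.\((?P<aa_ref>[A-Z][a-z]{2})(?P<aa_pos>\d+)(?P<aa_alt>[A-Z][a-z]{2})\)$"
)


def parse_snv_result(text: str) -> Optional[dict]:
    """Return the parsed fields, or None if the text is not a simple SNV result."""
    match = SNV_RESULT.match(text.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["pos"] = int(fields["pos"])
    fields["aa_pos"] = int(fields["aa_pos"])
    return fields


print(parse_snv_result("c.49G>A p.(Glu17Lys)"))
print(parse_snv_result("EML4-ALK"))
```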
How aberrations will be reported
VARIANTS DETECTED
– The ‘Test Report’ field will contain text around the variant tier, and therefore whether the variant makes the patient potentially trial eligible.
– There will also be text to say if a FISH confirmation confirmed the variant.
– If FISH slides were requested by the TH but could not be obtained, this will be in the comments field of the XML.
Test Report | Meaning - trial eligibility
Tier 1 | Potentially trial eligible
Tier 2 | Potentially trial eligible
Tier 3 | Not trial eligible
Confirmed by FISH |
How aberrations will be reported
EXAMPLES OF REPORTING – WHOLE PANEL
Gene | Test Status | Test Result | Test Report
AKT1 | Success | c.49G>A p.(Glu17Lys) | Tier 1 AKT1 variant
ALK | Success | EML4-ALK | Tier 3 ALK translocation
BRAF | Success | c.1799T>A p.(Val600Glu) | Tier 3 BRAF variant
CCND1 | Success | No variant detected | Medium confidence Wild Type
CCND2 | Complete Fail | No result | Fail. Repeat sample requested if available.
CCND3 | Complete Fail | No result | Fail. Repeat sample requested if available.
CCNE1 | Success | No variant detected | Medium confidence Wild Type
CDK2 | Complete Fail | No result | Fail. Repeat sample requested if available.
CDK4 | Success | Amplification |
CDKN2A | Success | Whole gene deletion homozygous | Tier 1 CDKN2A homozygous deletion confirmed by FISH
EGFR | Success | c.2235_2249del15 p.(Glu746_Ala750del) | Tier 3 EGFR 15bp deletion
FGFR2 | Success | FGFR2-TACC2 | Tier 3 FGFR2 translocation
FGFR3 | Success | Amplification |
Her2 | Success | Exon 5 and 6 deletion heterozygous | Tier 3 Her2 exon 5 and 6 heterozygous deletion
HRAS | Success | No variant detected | High confidence Wild Type
KRAS | Success | No variant detected | High confidence Wild Type
MET | Success | Exon 5 deletion heterozygous | Tier 3 MET exon 5 heterozygous deletion
NF1 | Partial Fail | c.135T>A p.(Asn45Lys) | Tier 2 NF1 variant
NRAS | Success | c.183A>C p.(Gln61His) | Tier 1 NRAS variant
NTRK1 | Success | CD74-NTRK1.C3N13 | Tier 3 NTRK1 translocation
PIK3CA | Success | c.1624G>A p.(Glu542Lys) and c.1616C>G p.(Pro539Arg) | Tier 1 and tier 2 PIK3CA variant
PTEN | Success | c.106G>C p.(Gly36Arg) | Tier 2 PTEN variant
RB1 | Success | No variant detected | High confidence Wild Type
RET | Partial Fail | No variant detected | Low confidence Wild Type
ROS1 | Success | CD74-ROS1_C6:R34 | Tier 1 ROS1 translocation
STK11 | Success | c.523_528del6 (delAAGGAC) p.(Lys175_Asp176del) | Tier 2 STK11 6bp deletion
TSC1 | Success | c.2647G>A p.(Ala883Thr) | Tier 2 TSC1 variant
TSC2 | Complete Fail | No result | Fail. Repeat sample requested if available.
How aberrations will be reported
Stats so far …
– 147 samples received
– 48 (33%) failed before testing - not enough DNA
– 78 reported full 28 gene panel
• 9/78 (12%) all genes failed post testing
– 21 in progress
– Detected sequence variants 49/78 patients (63%)
– 78 sequence variants detected in 49 patients
– 39/49 patients with Tier 1 mutations
• Some with multiple Tier 1 mutations
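The percentages quoted above can be checked with a few lines of arithmetic. The variable and function names are chosen for illustration; the counts are those on the slide.

```python
received = 147
failed_pre_test = 48
reported = 78
in_progress = 21
all_genes_failed = 9
patients_with_variants = 49


def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# The three outcome groups account for every sample received.
print(failed_pre_test + reported + in_progress == received)
print(pct(failed_pre_test, received), "% failed before testing")
print(pct(all_genes_failed, reported), "% all genes failed post testing")
print(pct(patients_with_variants, reported), "% patients with sequence variants")
```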
[Pie charts: 78 Variants. By tier, legend Tier 1, Tier 2, Tier 3, slice labels 62%, 4%, 35%. By type, legend SNVs / indels, Amplification, Deletion, Structural, slice labels 86%, 10%, 1%, 3%.]
AKT1 ALK BRAF 2 CCND1 2
CCND2 CCND3 CCNE1 1 CDK2
CDK4 2 CDKN2A 6 EGFR 8 FGFR2 2
FGFR3 Her2* HRAS KRAS 15
MET 2 NF1 10 NRAS NTRK1
PIK3CA 10 PTEN 1 RB1 7 RET
ROS1 2 STK11 6 TSC1 1 TSC2 1
Variants detected in 17/28 genes on panel
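The per-gene counts in the grid above can be tallied to confirm the summary line. This is a small sketch: the dictionary simply transcribes the counts shown next to each gene, and genes without a count are omitted.

```python
PANEL_SIZE = 28

# Variant counts as listed next to each gene on the slide.
variants_per_gene = {
    "BRAF": 2, "CCND1": 2, "CCNE1": 1, "CDK4": 2, "CDKN2A": 6,
    "EGFR": 8, "FGFR2": 2, "KRAS": 15, "MET": 2, "NF1": 10,
    "PIK3CA": 10, "PTEN": 1, "RB1": 7, "ROS1": 2, "STK11": 6,
    "TSC1": 1, "TSC2": 1,
}

genes_with_variants = len(variants_per_gene)
total_variants = sum(variants_per_gene.values())

print(f"{genes_with_variants}/{PANEL_SIZE} genes with variants")
print(f"{total_variants} variants in total")
```

The totals agree with the slide: 17 of 28 genes, and 78 variants overall.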
THANK YOU
CRUK SMP2 Technology Hubs
Birmingham - Mike Griffiths, Jennie Bell, Fiona MacDonald, Pauline Rehal, Alessandro Rettino, Sam Clokie
Cardiff - Rachel Butler, Ian Williams, Michelle Wood, Helen Roberts
RMH - David Gonzalez de Castro, Lisa Thompson, Keeda Dover, Brian Walker, Lisa Grady
Illumina - David McBride, Mark Ross
cruk.org