application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the...

184
Application of next generation sequencing to aid in effective diagnosis of familial hypercholesterolaemia by Satishkumar Krishnan A portfolio of research and development in a professional context submitted in part fulfilment of the Doctorate in Biomedical Science School of Pharmacy and Biomedical Sciences Faculty of Science November 2018

Upload: others

Post on 05-Oct-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

Application of next generation sequencing to aid in

effective diagnosis of familial

hypercholesterolaemia

by

Satishkumar Krishnan

A portfolio of research and development in a professional context submitted in part

fulfilment of the Doctorate in Biomedical Science

School of Pharmacy and Biomedical Sciences

Faculty of Science

November 2018

Page 2: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

I

Abstract Familial Hypercholesterolaemia (FH) is an autosomal dominant inherited condition

characterised by increased blood cholesterol levels which leads to xanthomas and

coronary heart disease. Prevalence of FH worldwide as well as in the UK is 1 in 250-500.

Almost 85% of people with FH still remain undiagnosed. FH is predominantly caused by

mutations in the three genes LDLR, APOB and PCSK9.

Aim To set up Next Generation Sequencing (NGS) methodology for genetic screening of FH.

Methods A retrospective cohort of 94 patients known and suspected to suffer from FH were

selected for targeted sequencing on LDLR, APOB, PCSK9, LDLRAP1 and APOE genes

using an Illumina MiSeq system. Data analysis was performed with BaseSpace and

Galaxy platforms. Sanger sequencing was performed as a gap filling for NGS.

Results A total of 94 samples were screened by the reference lab of which 58 patients had no

reported FH variants. Our NGS approach allowed the identification of FH variants in 62

patients out the 94 samples. Of these, 32 patients had variants identified in LDLR, 27

patients in APOB, 14 patients had variants in APOE and 1 patient had a variant in the

PCSK9 gene. Furthermore, 10 patients had variants identified in both LDLR and APOB

genes, 4 patients in LDLR and APOE and 3 patients in APOB and APOE.

Conclusion and impact This study has demonstrated the practical utility of NGS as a diagnostic platform for

genetic disorders such as FH to be used by the NHS of the future. NGS represents a

paradigm shift from the conventional Sanger sequencing in its ever-increasing breadth of

coverage, resolution and reliability, coupled with ever-reducing costs. Therefore,

incorporation of NGS into routine molecular diagnostics will create a major impact on the

way we diagnose and tailor the management of hereditary disorders such as FH.

Page 3: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

II

Table of Contents

Abstract ............................................................................................................................................... I

List of figures ................................................................................................................................... VI

List of tables .................................................................................................................................. VIII

Abbreviations ...................................................................................................................................IX

Dissemination ................................................................................................................................XIII

Acknowledgements ...................................................................................................................... XIV

Dedication ...................................................................................................................................... XVI

Disclaimer ..................................................................................................................................... XVII

CHAPTER ONE ................................................................................................................................ 1

INTRODUCTION .............................................................................................................................. 1

1.1 Familial hypercholesterolaemia ........................................................................... 2

1.2 Disease prevalence ............................................................................................. 2

1.3 Hepatic cholesterol homeostasis ......................................................................... 5

1.3.1 Endogenous pathway ................................................................................... 5

1.3.2 LDLR pathway .............................................................................................. 7

1.3.3 Reverse cholesterol transport ....................................................................... 9

1.4 Pathophysiology of FH ......................................................................................... 9

1.4.1 Xanthomas ................................................................................................... 9

1.4.2 Atherosclerosis ........................................................................................... 10

1.5 FH genes ........................................................................................................... 12

1.5.1 LDLR .......................................................................................................... 13

1.5.2 APOB ......................................................................................................... 15

1.5.3 PCSK9 ....................................................................................................... 17

1.5.4 LDLRAP1 ................................................................................................... 19

1.5.5 APOE ......................................................................................................... 21

1.6 Diagnosis ........................................................................................................... 23

1.6.1 Biochemical diagnosis ................................................................................ 23

1.6.2 Clinical diagnosis ........................................................................................ 24

1.6.3 Genetic diagnosis ....................................................................................... 25

1.6.4 Cascade screening ..................................................................................... 25

1.7 New approaches in molecular diagnosis ............................................................ 26

1.8 FH Management ................................................................................................ 26

Page 4: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

III

1.9 Gaps in the literature and problems with the current diagnostic methods ........... 27

1.10 Need for further research and service development........................................... 29

1.11 Aims of the study ............................................................................................... 30

CHAPTER TWO ............................................................................................................................. 31

MATERIALS AND METHODS ..................................................................................................... 31

Materials and methods ................................................................................................. 32

2.1 Ethical approval ...................................................................................................... 32

2.2 Sample selection .................................................................................................... 32

2.3 Primer design on Illumina DesignStudio ................................................................. 32

2.4 Measuring the DNA concentration .......................................................................... 35

2.4.1 Measuring the DNA concentration by Qubit ..................................................... 35

2.4.2 Measuring the DNA concentration using NanoDrop ......................................... 35

2.5 Illumina workflow .................................................................................................... 36

2.5.1 Library preparation ........................................................................................... 36

2.5.2 Hybridize the oligo pool .................................................................................... 38

2.5.3 Removal of unbound oligos .............................................................................. 39

2.5.4 Extend and ligate bound oligos ........................................................................ 39

2.5.5 Amplify Libraries .............................................................................................. 40

2.5.6 Quality, quantification and sizing of the amplified libraries................................ 42

2.5.7 Clean up Libraries ............................................................................................ 44

2.5.8 Normalize Libraries .......................................................................................... 45

2.5.9 Pool Libraries ................................................................................................... 46

2.5.10 Denaturing and diluting the libraries ............................................................... 46

2.5.11 Creating sample plate, sample sheets and manifest file on Illumina experiment

manager (IEM) software ........................................................................................... 47

2.5.12 MiSeq flow cell ............................................................................................... 47

2.5.13 Illumina reagent cartridge ............................................................................... 47

2.6 Monitoring the run on BaseSpace .......................................................................... 48

2.7 Bioinformatic analysis ............................................................................................. 52

2.7.1 BaseSpace analysis ......................................................................................... 52

2.7.2 BaseSpace Variant Interpreter Beta ................................................................. 62

2.7.3 Galaxy ............................................................................................................. 64

2.7.4 wAnnovar ......................................................................................................... 74

2.8 Sanger Sequencing ................................................................................................ 74

2.8.1 PCR primers .................................................................................................... 74

Page 5: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

IV

2.8.2 Initial PCR amplification ................................................................................... 74

2.8.3 DNA Sequencing ............................................................................................. 75

2.8.4 Sequencing electrophoresis ............................................................................. 76

CHAPTER THREE ......................................................................................................................... 77

RESULTS ........................................................................................................................................ 77

3. Results ......................................................................................................................................... 78

3.1 Reference lab findings ............................................................................................ 78

3.1.1 Spectrum of FH variants in the reference lab samples ..................................... 78

3.2 BaseSpace findings at UHS ................................................................................... 79

3.2.1 Spectrum of variants identified by BaseSpace ................................................. 79

3.2.2 Pathogenic FH variants identified through BaseSpace ..................................... 79

3.2.3 Variants of unknown significance identified through BaseSpace ...................... 81

3.2.4 Combination of variants in different FH causing genes .................................... 84

3.3 Comparison of UHS BaseSpace findings with the reference lab findings ............... 84

3.3.1 31 samples (source or method of analysis unknown) ....................................... 84

3.3.2 FH 20 samples ................................................................................................. 85

3.3.3 Bristol 36 samples............................................................................................ 85

3.4 Investigation of non-concordant results .................................................................. 86

3.5 Sanger sequencing ................................................................................................ 91

CHAPTER FOUR ........................................................................................................................... 93

DISCUSSION .................................................................................................................................. 93

4. Discussion ................................................................................................................................... 94

4.1 Investigation of non-concordant results .................................................................. 96

4.1.1 Sample quality ................................................................................................. 96

4.1.2 Edge effects ..................................................................................................... 97

4.1.3 Technical issues .............................................................................................. 98

4.1.4 Amplicon issues ............................................................................................... 99

4.1.5 Panel design .................................................................................................. 100

4.1.6 Analytical validity and clinical validity of the analytical software ..................... 102

4.2 Barriers for implementation of an NGS-based FH mutation testing strategy in the

NHS ........................................................................................................................... 104

4.2.1 Bioinformatic analysis .................................................................................... 104

4.2.2 Big data ......................................................................................................... 106

4.2.3 Cloud computing ............................................................................................ 107

Page 6: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

V

4.2.4 Data governance ........................................................................................... 108

4.2.5 Costing .......................................................................................................... 109

4.2.6 Health Economics .......................................................................................... 110

4.2.7 Genetic services reconfiguration .................................................................... 111

4.2.8 FH services .................................................................................................... 113

4.3 Recommendations ............................................................................................... 114

4.4 Limitations ............................................................................................................ 115

4.5 Conclusion ........................................................................................................... 116

CHAPTER FIVE ........................................................................................................................... 117

REFLECTION ............................................................................................................................... 117

5. Reflection .................................................................................................................................. 118

5.1 The Professional Doctorate programme ............................................................... 118

5.2 Researching while working as a full time Biomedical Scientist ............................. 119

5.3 Bringing about a change in professional practice ................................................. 119

5.4 Dissemination of information ................................................................................ 120

5.5 Professional development .................................................................................... 120

References .................................................................................................................................... 121

Appendices .................................................................................................................................... 142

Appendix 1 ................................................................................................................. 142

UHS Sponsorship letter .......................................................................................... 142

Appendix 2 ................................................................................................................. 146

UHS R&D Confirmation of capacity & capability letter ............................................. 146

Appendix 3 ................................................................................................................. 150

REC approval ......................................................................................................... 150

Appendix 4 ................................................................................................................. 153

HRA approval ......................................................................................................... 153

Appendix 5 ................................................................................................................. 159

Excel spreadsheet with sample information and BaseSpace findings ..................... 159

Excel spreadsheet with amplicon coverage ............................................................ 159

Galaxy FH-NGS pipeline......................................................................................... 159

Appendix 6 ................................................................................................................. 160

Illumina BaseSpace generated pdf report. .............................................................. 160

Page 7: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

VI

List of figures Figure 1 Diagnosis of familial hypercholesterolaemia (FH) worldwide in 2017 based on a

frequency of 1:250. .......................................................................................................................... 4

Figure 2 Endogenous pathway of cholesterol homeostasis. ..................................................... 6

Figure 3 LDLR receptor pathway for the uptake and degradation of LDL. ............................. 8

Figure 4 Deposition of cholesterol in peripheral tissues. ........................................................... 9

Figure 5 Atherosclerosis formation in the blood vessel of patients with raised LDL. .......... 11

Figure 6 LDLR gene located on the short (p) arm of chromosome 19 at position 13.2

(19p13.2). ........................................................................................................................................ 14

Figure 7 APOB gene located on the short (p) arm of chromosome 2. ................................... 16

Figure 8 PCSK9 gene located on the short (p) arm of chromosome 1 at position 32.3

(1p32.3). .......................................................................................................................................... 18

Figure 9 LDLRAP1 gene located on the short (p) arm of chromosome 1 at position 36.11

(1p36.11). ........................................................................................................................................ 20

Figure 10 APOE gene located on the long (q) arm of chromosome 19 at position 13.32

(19q13.32). ...................................................................................................................................... 22

Figure 11 Screen shot of Illumina® DesignStudio™. ............................................................... 34

Figure 12 Illumina sample workflow. ........................................................................................... 36

Figure 13 TruSeq Custom Amplicon Workflow. ........................................................................ 37

Figure 14 Hybridization of upstream and downstream oligos to the region of interest. ...... 38

Figure 15 Filter Plate Unit (FPU) assembly. .............................................................................. 39

Figure 16 Extension and ligation of bound oligos. .................................................................... 40

Figure 17 TruSeq Index Plate Fixture. ........................................................................................ 41

Figure 18 Sample indexing and amplification. ........................................................................... 42

Figure 19 Gel electrophoresis of amplified libraries along with 100bp ladder. ..................... 43

Figure 20 Amplified libraries along with a ladder on a Bioanalyser. ....................................... 44

Figure 21 Library clean up using AMPure XP beads................................................................ 44

Figure 22 Sequencing Analysis Viewer (SAV) on BaseSpace. .............................................. 49

Figure 23 Screen shot of the charts on the FH run on BaseSpace. ....................................... 50

Figure 24 Run and Lane metrics of the FH-NGS run. .............................................................. 51

Figure 25 Read level statistics along with percentage of aligned reads. ............................... 54

Figure 26 Base level statistics along with percentage of aligned bases. .............................. 56

Figure 27 Small variants summary along with the number of SNVs passing through the

filter. .................................................................................................................................................. 58

Figure 28 Insertions statistics report with number of insertions passing through the filter. 59

Figure 29 Deletions statistics report along with number of deletions passing through the

filter. .................................................................................................................................................. 60

Figure 30 Coverage summary. ..................................................................................................... 61

Figure 31 Pathogenic LDLR on the Illumina Variant Interpreter Beta. ................................... 63

Figure 32 Range of quality scores over all reads at each position. ........................................ 65

Figure 33 Average quality scores over all reads. ...................................................................... 66

Figure 34 Screen shot of output table from IdxStats ................................................................ 68

Figure 35 Histogram of the insert sizes for all reads. ............................................................... 69

Figure 36 List of variants in the vcf file on Galaxy. ................................................................... 71

Page 8: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

VII

Figure 37 FH-NGS Galaxy pipeline for annotation, filtering, alignment and variant calling.

.......................................................................................................................................................... 73

Figure 38 Electrophoretic gel displaying PCR products. .......................................................... 75

Figure 39 IGV screen shot showing no coverage in some of the exons and poor coverage

in few exons in the LDLR gene. ................................................................................................... 88

Figure 40 IGV screen shot on 4 samples showing no coverage around 11216263bp,

11216243bp and 11216244bp regions of the LDLR gene. ...................................................... 89

Figure 41 IGV screen shot on sample 1107996839 showing no coverage at location

11231100bp in the exon 14 of the LDLR gene. ......................................................................... 90

Figure 42 Sanger sequencing electropherogram showing LDLR variants. ........................... 92

Figure 43 ACCE Model Process for evaluating genetic tests. .............................................. 103

Page 9: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

VIII

List of tables Table 1 Simon Broome criteria. ................................................................................................... 24

Table 2 PHRED score table. ........................................................................................................ 65

Table 3 Primer sequences used for Sanger Sequencing. ....................................................... 74

Table 4 Reference lab findings. ................................................................................................... 78

Table 5 Spectrum of FH variants in the reference lab findings. .............................................. 79

Table 6 UHS BaseSpace findings. .............................................................................................. 79

Table 7 Pathogenic FH variants identified using BaseSpace. ................................................ 80

Table 8 Details of pathogenic FH variants identified in the samples through BaseSpace. 80

Table 9 Variants of unknown significance. ................................................................................. 81

Table 10 Summary of some of the variants identified through BaseSpace similar to the

Latvian study group........................................................................................................................ 82

Table 11 Combination of variants in the FH causing genes. ................................................... 84

Table 12 Comparison of findings in 31 samples whose source was unknown. ................... 85

Table 13 Comparison of FH variants in 27 samples of FH20. ................................................ 85

Table 14 Comparison of FH variants in the 36 Bristol panel samples. .................................. 86

Table 15 Samples which were not in concordance with the reference lab findings. ........... 86

Table 16 Identification of variants in BaseSpace by manual analysis. .................................. 87

Table 17 Variants identified in the LDLR gene by Sanger sequencing. ................................ 91

Table 18 NanoDrop® and Qubit® findings on the 5 discordant samples. ............................ 97

Table 19 Off the shelf bespoke FH panels. .............................................................................. 101

Page 10: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

IX

Abbreviations

*.bcl Base Calling Files

*.cif Cluster Intensity Files

*.tiff Tagged Image File Format

.bam Binary Compressed Sequence Alignment/Map

.vcf Variant Call Format

ACD1 Amplicon Control DNA

ACP1 Control Oligo Pool

AI Artificial Intelligence

AP2 Adaptor Protein 2

APOA Apolipoprotein A

APOB Apolipoprotein B

APOC Apolipoprotein C

APOE Apolipoprotein E

ARH Autosomal Recessive Hypercholesterolaemia

bp Base Pair

BWA-MEM Burrows-Wheeler Alignment

CAT Custom Amplicon Oligos

CDC Centre for Disease Control and Prevention

CHD Coronary Heart Disease

CLP Cleanup Plate

CM Chylomicron

CNV Copy Number Variant

DBMS Doctorate in Biomedical Science

DLCN Dutch Lipid Clinic Network

DLSO Downstream Locus-Specific Oligos

DNA Deoxyribonucleic acid

dsDNA Double Stranded Deoxyribonucleic acid

Page 11: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

X

EAS European Society of Cardiology/European Atherosclerosis Society

EBT Elution Buffer with Tris

EDTA Ethylenediaminetetraacetic acid

ELM4 Extension-ligation Mix 4

ER Endoplasmic reticulum

FH Familial Hypercholesterolaemia

FPU Filter Plate Unit

GB Gigabyte

HDL High Density Lipoprotein

HeFH Heterozygous Familial Hypercholesterolaemia

HL Hepatic Lipase

HoFH Homozygous Familial Hypercholesterolaemia

HRA Health Research Authority

HT1 Hybridisation Buffer 1

HYP Hybridisation Plate

IAP Index Amplification Plate

IDL Intermediate Density Lipoprotein

IEM Illumina Experiment Manager

IGV Integrative Genome Viewer

kb Kilobase

LCAT Lecithin Cholesterol Acyl Transferase

LDL Low Density Lipoprotein

LDL-C Low Density Lipoprotein Cholesterol

LDLR Low Density Lipoprotein

LDLRAP1 Low Density Lipoprotein Receptor Adaptor Protein 1

LLDLE Type-1 clathrin box

LNA1 Library Normalisation Additives 1

LNB1 Library Normalisation Beads 1

Page 12: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

XI

LNP Library Normalisation Plate

LNS2 Library Normalisation Storage Buffer 2

LNW1 Library Normalisation Wash 1

LOVD Leiden Open Variation Database

LPL Lipoprotein Lipase

MB Megabyte

MEDPED Make Early Diagnosis to Prevent Early Death

MI Myocardial infarction

MLPA Multiplex Ligation-dependent Probe Amplification

NaOH Sodium Hydroxide

NGS Next Generation Sequencing

NHS National Health Service

NICE National Institute of Clinical Excellence

OHS2 Oligo Hybridisation for Sequencing 2

OMIM Online Mendelian Inheritance in Man

PAL Pooled Amplicon Library

PCR Polymerase Chain Reaction

PCSK9 Proprotein Convertase Subtilisin/Kexin type 9

PDF Portable Document Format

PF Passing Filter

PMM2 Polymerase Chain Reaction Master Mix 2

ProfDoc Professional Doctorate

PTB Phosphotyrosine Binding Domain

QALYs Quality-adjusted-life-years

R&D Research and Development

REC Research Ethics Committee

RNA Ribonucleic acid

RPM Revolutions per Minute

Page 13: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

XII

RT Room Temperature

sam Sequence Alignment Map

SAV Sequence Analysis Viewer

SBS Sequencing by Synthesis

SGP Storage Plate

SNV Single Nucleotide Variant

SSCP Single Strand Confirmation Polymorphism

SW1 Stringent Wash 1

TAE Tris Acetate-EDTA buffer

TDP1 TruSeq DNA polymerase 1

UB1 Universal Buffer 1

UHS University Hospital Southampton NHS Foundation Trust

UK United Kingdom

ULSO Upstream Locus-Specific Oligos

UoP University of Portsmouth

US United States of America

UTR Untranslated Region

VLDL Very Low Density Lipoprotein

VUS Variant of Unknown Significance

WES Whole Exome Sequencing

WGS Whole Genome Sequencing

WHO World Health Organisation

Page 14: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

XIII

Dissemination The subject of the thesis has been disseminated to others through presentations.

At national/International level, the output from the study was disseminated through a

poster presentation in the form of an impact case study poster titled “Application of next

generation sequencing to aid in effective diagnosis of familial hypercholesterolaemia” in

the 6th International Conference on Professional Doctorates in London on March 22-23,

2018.

The output of the study was also disseminated through a poster presentation titled

“Application of next generation sequencing to aid in effective diagnosis of familial

hypercholesterolaemia” at Research and Innovation Conference at the University of

Portsmouth on September 6, 2018.

Page 15: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

XIV

Acknowledgements Firstly, I am so thankful and grateful to God for giving me a happy family and an

opportunity to study and research. Through God, I have known myself and have learnt

that “Known is a drop and unknown is an ocean”.

I would like to thank my work place supervisors Dr Elizabeth Hodges and Dr Rosalind

Ganderton for giving me the opportunity to undertake this research. Thank you for all the

motivation and guidance you both have provided me over the last few years. I have been

very lucky to have you both as my workplace supervisors and truly you have been my

inspiring guardians. Special thanks to Dr Elizabeth Hodges for supporting me and being

there with me during difficult times of this research, especially while facing challenges and

when things became tough. Your unconditional support and energy has been a great

inspiration to me. Special thanks to Dr Rosalind Ganderton, who has guided me tirelessly

and has provided me valuable suggestions on every chapter and for being the driving

force from day one of this research. Your compassionate personality, enthusiasm for

science and your positive attitude has increased my understanding of both science and

life. Special thanks to Dr Paul Cook for his continual support and guidance.

I would also like to express my deepest gratitude to my University supervisors, Prof Darek

Gorecki and Dr Sassan Hafizi for encouraging me and supporting me in my research.

Special thanks to Prof Darek for his excellence mentorship, top quality intellectual advice,

straightforward critique, overwhelming desire to help and showing me the importance of

patience and positivity in my work. Sincere thanks to Dr Sassan Hafizi for his devotion to

help, constant motivation, passionate guidance and for sharing his top quality academic

and writing skills. His insightful suggestions, comments and timely corrections have

helped me to improve and develop this work. Many thanks for guiding me and supporting

me tirelessly with your valuable feedback on every draft of the chapters as they evolved. I

am extremely grateful and indebted for all his efforts for my success. Sincere thanks to

Prof Graham Mills for finding me such lovable, caring, compassionate supervisors who

have immensely supported me and guided me through this journey.

Special thanks to the entire lab staff in Molecular Pathology who have always been very

friendly and very supportive. Special thanks to WISH lab at the UHS and to Dan for

helping us with our experiments.

I would like to thank my Mum and Dad, without whom I would not be where I am today.

Many thanks for your unconditional love, dedication and sacrifice without which I would

not achieved anything in life. Special thanks to my brother for his sacrifice and without

who I would not have been here in the UK. He has always been my inspiration and my

Page 16: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

XV

confidence in life who has taught me the reality in life. My sincere thanks to my brother’s

family and to my niece and nephew who have always prayed to God for my success and

made me smile in times of difficulty. Last but not the least, special thanks to my tolerant,

loving and caring wife who has been very understanding and for running the family all

these years without my support. She has been amazing in looking after and supporting my

two daughters who always knew dad as a person who resided either at library or at work.

Thanks to my loving daughters who have always greeted me with a smiling face after a

busy day at work or from the library.

Page 17: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

XVI

Dedication I dedicate this thesis to my family for their unconditional love, dedication and sacrifice

without which I would not have achieved anything in life.

Page 18: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

XVII

Disclaimer I declare that whilst studying for the Doctorate in Biomedical Science at the University of

Portsmouth, I have not been registered for any other award at another university. The

work undertaken for this degree has not been submitted elsewhere for any other award.

The work contained within this submission is my own work and that, to the best of my

knowledge and belief, it contains no material previously published or written by another

person, except where due acknowledgement has been made in the text.

Satishkumar Krishnan.

November 2018.

Page 19: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

1

CHAPTER ONE

INTRODUCTION

Page 20: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

2

1.1 Familial hypercholesterolaemia

Familial hypercholesterolemia (FH, OMIM no. 143890) is an autosomal dominant inherited

disorder of lipoprotein metabolism characterised by raised blood plasma cholesterol levels

(J L Goldstein & Brown, 1973) (Brown & Goldstein, 1974) and was first described in 1920

(Burns, 1920). Increased plasma cholesterol levels result in its deposition in peripheral

tissues leading to xanthomas (A. K. Soutar & Naoumova, 2007) (Fahed & Nemer, 2011)

and deposition in arterial wall leads to coronary heart disease (CHD) (Goldstein JL, Hobbs

HH, Brown MS, 2001) (Khachadurian, 1988). As a result of raised blood cholesterol

levels, men are at almost 50% risk of fatal or non-fatal CHD by the age of 50 and is

slightly lower in women, who are at risk of almost 30% by the age of 60 (Slack, 1969)

(Stone, Levy, Fredrickson, & Verter, 1974). It was in 1938, a Norwegian physician, Carl

Müller in his landmark paper showed the connection between plasma cholesterol and

heart attacks (Müller, 1938). In his paper, he had described 17 Norwegian families with 76

family members who had xanthomas, high cholesterol, myocardial infarction and 68

members of the family had symptoms of possible heart disease (Müller, 1938).

1.2 Disease prevalence

FH exists in both homozygous and heterozygous forms. Heterozygous FH (HeFH) is more

common and the prevalence of HeFH worldwide as well as in the UK was known to be

1:500 (Goldstein JL, Hobbs HH, Brown MS, 2001) (Scriver, 1995). However, recent

population based studies in UK (Wald et al., 2016), the Netherlands (Barbara Sjouke et

al., 2015), northern Europe (Benn, Watts, Tybjærg-Hansen, & Nordestgaard, 2016),

Poland (Pajak et al., 2016) and the United States (Abul-Husn et al., 2016) reveal the

prevalence to be 1 in 200-250, twice as high as previously thought. Furthermore, due to

founder effects the prevalence is even higher in certain populations, such as 1 in 200 in

French Canadians, 1 in 165 in Tunisians, 1 in 85 in Christian Lebanese, and in South

African Afrikaners it is 1 in 72 (Austin, Hutter, Zimmern, & Humphries, 2004). Such

founder effects are also found in Finns, Icelanders, Gujarati South African Indians and

Ashkenazi Jews (Henderson, O’Kane, McGilligan, & Watterson, 2016). Homozygous FH

(HoFH) is extremely rare with a frequency of 1 in 1,000,000 individuals (Bhagavan &

Jackson, 2002) (Rader, Cohen, & Hobbs, 2003) and in certain populations it is known to

be 1 in 160,000 to 1 in 300,000 (Barbara Sjouke et al., 2015) (Cuchel et al., 2014). Using

these prevalence figures, it is estimated that almost 34 million individuals worldwide have

FH and less than 1% have been diagnosed (Iacocca et al., 2017). In the UK, the

proportion of patients with FH identified and being treated in the lipid clinics is

approximately 15% which therefore indicates that almost 85% of people with FH still

remain undiagnosed (D Marks, Thorogood, Farrer, & Humphries, 2004) (Neil, Hammond,

Page 21: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

3

Huxley, Matthews, & Humphries, 2000) . The Netherlands, Norway and Japan are the

only countries known to have high diagnostic rates of 86%, 37% and 26% as shown in

Figure 1.

Underdiagnosis of FH has been mainly due the failure in implementation of national

guidelines such as NICE in the UK, poor availability of genetic services, low through put

from traditional screening methodologies, high costs of traditional genetic testing,

fragmented service delivery and also due to the lack of investment in identification of

index cases (Norsworthy et al., 2014). Undertreatment has been due to the lack of

awareness of the pathogenicity of FH, fear of side effects of statins, lack of knowledge

about the efficacy and importance of statin treatment among patients and healthcare

providers, lack of universal screening programmes for FH, lack of electronic health record

systems and lack of national registries to monitor and follow-up FH patients (Haymana,

2017) (Metha et al., 2016).

Page 22: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

4

Figure 1 Diagnosis of familial hypercholesterolaemia (FH) worldwide in 2017 based on a frequency of 1:250 (Borge G Nordestgaard & Benn, 2017).

LDL-C: Low density lipoprotein concentration.

Page 23: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

5

1.3 Hepatic cholesterol homeostasis

Cholesterol is an essential component of the cell membranes and also acts as a precursor

to steroidal and non-steroidal hormones. Cholesterol is essential for normal cellular

functions and each day the human body synthesises 700-900 mg of cholesterol and

absorbs 300-500 mg of cholesterol from diet (Dietschy, 1984). In humans, cholesterol is

mainly synthesised in the liver, intestine, adrenal glands and in the reproductive organs

(Russell, 1992). Cholesterol is transported in the blood as five major lipoproteins,

chylomicrons (CM), very low density lipoprotein (VLDL), intermediate density lipoprotein

(IDL), low density lipoprotein (LDL) and the high density lipoprotein (HDL) (Wadhera,

Steen, Khan, Giugliano, & Foody, 2016). In common parlance, LDL is often called the

“bad” cholesterol due to its risk of causing atherosclerosis whilst “HDL” is known as the

good cholesterol due to its protective effects. There are three distinct mechanisms in the

liver through which cholesterol homeostasis occurs: endogenous cholesterol synthesis,

LDLR expression on the liver cells, and cholesterol excretion as bile acids by reverse

cholesterol transport (A David Marais, 2004).

1.3.1 Endogenous pathway

VLDL is synthesised along with the apolipoprotein ApoB-100 and is secreted by the liver

(Nguyen et al., 2008). In circulation, VLDL acquires ApoC and ApoE from the circulating

HDL particles. VLDL is hydrolysed by lipoprotein lipase leading to the formation of fatty

acids and glycerol. During this process VLDL loses ApoC to the HDL resulting into the

transformation of IDL. IDL is either taken up by the LDLR or converted to LDL by the

action of hepatic lipase (shown in Figure 2) and during this process IDL molecule loses

ApoE to the HDL molecules. LDL particles with ApoB enter the liver through the LDLR

pathway which further gets metabolised to bile acids (T. F. Daniels, Killinger, Michal,

Wright, & Jiang, 2009) and excreted in the faeces.

Page 24: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

6

Figure 2 Endogenous pathway of cholesterol homeostasis. VLDL: Very low density lipoprotein, B100: APOB100, APOC: Apolipoprotein C, APOE: Apolipoprotein E, APOA: Apolipoprotein A, LPL: Lipoprotein lipase, HL: Heaptic lipase, LDLR: Low density lipoprotein receptor.

Page 25: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

7

1.3.2 LDLR pathway

The LDL receptor pathway characterised by Brown and Goldstein for the uptake and

degradation of LDL is shown in Figure 3. LDL receptor is a cell surface glycoprotein

synthesised by the endoplasmic reticulum (ER) and processed by the Golgi apparatus (A.

K. Soutar & Naoumova, 2007). The mature form of the LDL receptor anchors to the

hepatic cell membrane. The circulating LDL particles bind to the LDL receptor specifically

through the APOB component of the LDL particle (A. K. Soutar & Naoumova, 2007). The

receptor-ligand complex is then internalised by the process of endocytosis through the

clathrin-coated pits and forms a vesicle through its interaction with the LDL receptor

adaptor protein (J L Goldstein, Anderson, & Brown, 1982). The vesicle is transported to

the endosomes and the acidic environment in the endosomes causes the dissociation of

the LDL particles from the LDL receptors (A. K. Soutar & Naoumova, 2007) (Brautbar et

al., 2015). LDL receptors are either degraded or recycled back to the cell surface

(Brautbar et al., 2015). LDL particles are degraded in the lysosomes where the APOB is

converted to amino acids and cholesterol esters are converted to free cholesterol which is

used in the synthesis of cell membranes, steroid hormones and bile acids (J L Goldstein

et al., 1982). Proprotein convertase subtilisin/kexin-9 synthesised by the Golgi is involved

in the degradation of the LDL receptors and prevents it from being recycled back to the

cell surface (Brautbar et al., 2015).

Page 26: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

8

Figure 3 LDLR receptor pathway for the uptake and degradation of LDL. LDL: Low density lipoprotein, APOB: Apolipoprotein B, LDLR: Low density

lipoprotein rceptor, PCSK9: Proprotein convertase subtilisin/kexin-9.

Page 27: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

9

1.3.3 Reverse cholesterol transport

In this pathway, HDL particles transport cholesterol from the arterial walls and peripheral

tissues back to the liver where it can be catabolised and excreted as bile (Moffatt &

Stamford, 2005). Hence, HDL is also termed as good cholesterol since it controls the bad

cholesterol (LDL) levels and protects from CHD (Gordon et al., 1989). This pathway relies

on the availability of a group of apolipoproteins, enzymes, transfer proteins and receptors

(Moffatt & Stamford, 2005). Amongst these, HDL contains ApoA, functions as an acceptor

of cholesterol as well as serves a co-factor for lecithin cholesterol acyl transferase (LCAT)

(D. P. Nicholls, Nicholls, & Young, 2009).

1.4 Pathophysiology of FH

Mutations in any of the FH causing genes leads to raised blood plasma cholesterol levels.

In children, the cholesterol level is usually >6.7 mmol/l and in adults it is >7.5 mmol/l

(“Risk of fatal coronary heart disease in familial hypercholesterolaemia. Scientific Steering

Committee on behalf of the Simon Broome Register Group,” 1991). Raised blood plasma

cholesterol levels results in its deposition in peripheral tissues leading to xanthomas and

deposition in the blood vessels leading to the formation of atherosclerotic plaques and

consequently, CHD.

1.4.1 Xanthomas

Deposition of cholesterol in tendons leads to thickening of the tendons and causes tendon

xanthomata as shown in panel B, C and D of Figure 4. It is composed of monocyte-

derived foam cells which can accumulate and store cholesterol intracellularly (Kruth,

1985). Tendon xanthomata usually occur in the elbows, Achilles tendon and hands.

Deposition of cholesterol in the cornea of the eye leads to corneal arcus which usually

appears as a crescent white line. Deposition of cholesterol in the eyelids leads to

xanthelasma which appears as a flat yellow plaque on the eye lids.

Figure 4 Deposition of cholesterol in peripheral tissues. A= Corneal arcus and xanthelasma, B= Extensor tendon xanthomas, C and D= Achilles tendon xanthomas. Reproduced with permission from Burnett et al (D. A. Bell, Hooper, Watts, & Burnett, 2012).

A B C D

Page 28: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

10

1.4.2 Atherosclerosis

Raised blood plasma LDL cholesterol for prolonged period leads to its deposition into the

arterial blood vessel walls, leading to atherosclerosis as shown in Figure 5. LDL in

circulation is oxidised and initiates an inflammatory response damaging the endothelial

tissues to form atherosclerotic plaques (Van Wijk et al., 2014). Oxidised LDL enters the

endothelial cell surface where it is phagocytosed by monocytes (Stocker, 2004).

Monocytes acquire the characteristics of macrophages and undergo a series of changes

to develop into a foam cell (Libby, 2002). Cholesterol intake by macrophages is facilitated

by the scavenger receptors and is capable of accumulating vast stores of cholesterol

esters (A D Marais & Firth, 2005). Scavenger receptors are a diverse group of receptors

which are involved in the uptake of oxidised lipoproteins and are expressed by the

macrophages in the atherosclerotic plaques and by circulating monocytes in atherogenic

conditions (Kzhyshkowska, Neyen, & Gordon, 2012).

The early atherosclerotic lesion is characterised by these lipid-laden macrophages called

foam cells which leads to the formation of a so-called fatty streak (Libby, 2002). A fatty

streak develops into an atheroma along with the proliferation of smooth muscle cells

which evolves into a plaque and occludes the blood vessel by narrowing the lumen of the

arteries. Stenotic lesions, produced by the atherosclerotic plaques, restrict the blood flow,

leading to ischaemia (Libby, 2002). Disruption and rupture of these atherosclerotic

plaques allows contact with tissue factor in the blood, which triggers thrombus formation

(Davies, 1996). Thrombus formation contributes to occlusion of blood vessel, which is a

precursor to acute myocardial infarction (MI) (Libby, 2002).

Page 29: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

11

Figure 5 Atherosclerosis formation in the blood vessel of patients with raised LDL. Oxidised LDL initiates an inflammatory response on the endothelial cells. Oxidised LDL cholesterol is engulfed by the macrophages and develops into foam cells loaded with LDL particles. Accumulation of foam cells leads to the atherosclerotic plaque formation. Rupture of the plaque leads to thrombus formation.

Page 30: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

12

1.5 FH genes

The hereditary basis of the FH was first demonstrated by Wilkinson et al (Wilkinson,

Hand, & Fliegelman, 1948) and later on Khachadurian demonstrated the autosomal-

dominant mode of inheritance (Khachadurian, 1964). FH is predominantly caused by

mutations in the genes coding for three proteins: Low Density Lipoprotein Receptor

(LDLR) (Hobbs, Brown, & Goldstein, 1992a) (Scriver, 1995), Apolipoprotein B (APOB)

(Soria et al., 1989), and Proprotein Convertase Subtilisin/Kexin Type 9 (PCSK9) (M

Abifadel et al., 2003). A very rare recessive form of FH is caused by mutations in the Low

Density Lipoprotein Receptor Adaptor Protein (LDLRAP1) (Henderson et al., 2016) (A. K.

Soutar & Naoumova, 2007) (Faiz, Hooper, & Van Bockxmeer, 2012). However, recently

Apolipoprotein E (APOE) is known to cause FH in certain families (Cenarro et al., 2016)

(Awan et al., 2013). To date, according to the Leiden Open Variation Database (LOVD) an

web-based platform designed by the Netherlands, to collect and display variants in the

DNA sequence has 1741 variants reported in the LDLR gene (S. E. Leigh, Foster,

Whittall, Hubbart, & Humphries, 2008), 69 variants in APOB gene of which only 8 variants

are known to cause FH (Henderson et al., 2016), and 163 variants in the PCSK9 gene (S.

E. A. Leigh, Leren, & Humphries, 2009). It has been reported that almost 79% of FH

patients carry a mutation in their LDLR gene, 5.5% carry a mutation in their APOB gene

and 1.5% carry a mutation in their PCSK9 gene (Tosi, Toledo-Leiva, Neuwirth,

Naoumova, & Soutar, 2007) (Henderson et al., 2016) (De Castro-Orós, Pocoví, & Civeira,

2010). The remaining 15% of the FH cases are either polygenic or the genetic cause is

still unknown (Wiegman et al., 2015).

It is also evident from the literature that there could be other genes involved in FH, for

which evidence is still evolving and emerging (Vandrovcova et al., 2013). The patient may

have FH caused by a very rare mutation in a gene which is yet to be discovered and

hence researchers have started screening other genes in the cholesterol-processing

pathways (Vandrovcova et al., 2013) (Thomas, 2015). In addition, there could also be

variants which have been identified, but lack sufficient evidence for being the cause of FH

(Thomas, 2015).

Mutations in the APOB and the LDLR gene results in defective binding of LDL particles to

the LDL receptors on the cell surface (Brautbar et al., 2015). As a result, LDL cannot be

internalised by the hepatic cell for degradation, which results in raised blood plasma LDL

cholesterol levels. A mutation in the LDLRAP1 gene prevents the internalisation of the

receptor-ligand complex into the hepatic cell (Brautbar et al., 2015). A gain-of-function

mutation in the PCSK9 gene leads to increased degradation of the LDL receptor and

prevents it from being recycled (Brautbar et al., 2015).

Page 31: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

13

1.5.1 LDLR

LDLR was first discovered in 1974 by Goldstein and Brown (J. L. Goldstein & Brown,

1974). In 1986 they discovered the LDLR molecular defect underlying FH (M S Brown &

Goldstein, 1986) and were later awarded the Nobel Prize for this work. LDLR is a cell

surface glycoprotein receptor expressed on the surface of hepatocytes and regulates the

LDL concentration in the blood by removing them from circulation and aids in hepatic

cholesterol homeostasis. The 45 kilobase (kb) LDLR gene (MIM 606945) is located on the

short arm of chromosome 19 (19p13.2) and consists of 18 exons and 17 introns

(Henderson et al., 2016) as shown in Figure 6. The LDLR protein is comprised of five

structural domains encoded by exons 2 to 18 (A. Soutar, 2009). Exon 1 encodes a signal

peptide which is required for the transport of the LDLR to the ER. Exon 2 to 6 encodes the

functional domain which is responsible for binding of lipoprotein particles. Epidermal

Growth Factor (EGF) precursor-like domain is encoded by exons 7 to 14 which play a

significant role in dissociation of LDLR-LDL complex in the endosomes. Oligosaccharide

rich domain is encoded by exon 15 and deletion of this domain does not appear to have

any effect on the LDLR function in vitro (Davis et al., 1986). Transmembrane (TM) domain

rich in hydrophobic amino acids is encoded by exon 16 and aids in anchoring LDLR to the

cell membrane. The cytoplasmic domain is encoded by exons 17 and 18 and is

responsible for ensuring the correct orientation of LDLR as well as directing the LDLR-

LDL complex to the clathrin coated pit during endocytosis (Hussain, Strickland, & Bakillah,

1999). To date, 1741 mutations (S. E. Leigh et al., 2008) have been identified in the exons

of the LDLR gene and these have been divided into five categories based on their effects

on the LDLR. Class 1 mutations are defined as null mutations since they result in no

detectable LDLR protein (Hopkins, Toth, Ballantyne, & Rader, 2011) (Hobbs, Russell,

Brown, & Goldstein, 1990). In class 2, there are two subclasses: 2a and 2b, where the

transport of LDLR from the ER to Golgi is blocked completely (2a) or partially blocked (2b)

(Hobbs, Brown, & Goldstein, 1992b) (Hopkins et al., 2011) (Hobbs et al., 1990). Class 3

mutations lead to the production of defective or non-functional LDLR (Hopkins et al.,

2011) (Hobbs et al., 1990). Class 4 mutations are due to defective internalisation of the

LDLR and LDL complex via clathrin-coated pits (Hopkins et al., 2011) (Hobbs et al.,

1990). Class 5 mutations arise due to defects in the recycling of LDLR (Hopkins et al.,

2011) (Hobbs et al., 1990). More than 90% of the reported variants are disease causing

variants (Usifo et al., 2012) and the majority of these are clustered around exon 4

(Grenkowitz et al., 2016).

Page 32: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

14

Figure 6 LDLR gene located on the short (p) arm of chromosome 19 at position 13.2 (19p13.2). Numbered vertical bars represent the exons which encode the various domains of the LDLR protein. OLS: O-linked sugar domain, TM: Transmembrane domain. Mutations can affect any domain of the LDLR

protein and can result in defective or impaired binding of LDL particle to the LDLR and also can results in impaired internalisation of the complex into the cell.

Page 33: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

15

1.5.2 APOB

The 42-kb APOB gene (MIM 107730) is located on the short arm of chromosome 2

between 2p23-p24 and consists of 29 exons and 28 introns (Huang, Miller, Bruns, &

Breslow, 1986) (Law et al., 1985). The APOB gene gives rise to two isoforms of APOB

protein; apoB-100, found in the LDL particles and is expressed in the liver cells and apoB-

48 expressed in the small intestine (Whitfield, Barrett, Van Bockxmeer, & Burnett, 2004).

APOB-100 is 4,563 amino acids in length and contains five domains: NH3-ßα1-ß-α2-ß2-

α3-COOH (S. H. Chen et al., 1986) (Segrest, Jones, De Loof, & Dashti, 2001) as shown in

Figure 7. Mutations in the APOB gene lead to familial defective APOB-100 (FDB)

(Innerarity et al., 1990). The binding region for the microsomal triglyceride transfer protein

(MTTP) which is very significant for the lipoprotein assembly is present on the ßα1 domain

(White, Bennett, Billett, & Salter, 1998). ß1 and ß2 domains can irreversibly bind to the

lipids, while α2 and α3 can bind reversibly (Segrest et al., 2001). In addition ß2 domain

also contains the LDLR binding region which is essential for the uptake of lipoproteins

containing APOB-100 (Whitfield et al., 2004). FDB results in production of defective

APOB-100 protein, which results in defective binding of APOB to LDLR and thereby fails

to clear LDL from the circulation (Innerarity et al., 1990). In contrast to LDLR, there are

only 8 disease causing variants and the majority of these variants are located on exon 26

(Henderson et al., 2016). The most common APOB gene variant Arg3527Gln, which is

due to R3500Q at the gene level, is known to disrupt the conformational binding of LDL to

its receptor (Borén, Ekström, Ågren, Nilsson-Ehle, & Innerarity, 2001). The prevalence of

this mutation is known to be 1 in 500 to 1 in 1000 and varies with the population being

studied (Innerarity et al., 1990). A Danish study has also shown the prevalence of this

mutation in the general population to be 1 in 1000 and was associated with a significant

increase in blood plasma cholesterol levels (Tybjærg-Hansen, Steffensen, Meinertz,

Schnohr, & Nordestgaard, 1998).

Page 34: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

16

Figure 7 APOB gene located on the short (p) arm of chromosome 2. Numbered vertical bars represent the exons. To-date there are 8 known disease

causing variants and the majority of them are located in exon 26.

Page 35: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

17

1.5.3 PCSK9

PCSK9 (MIM 607786) codes for a 25 kb serine protease (Marianne Abifadel et al., 2014),

which belongs to the proprotein convertase family (Seidah & Prat, 2007) and is mainly

expressed in the liver, intestine and the kidneys (Seidah et al., 2003) (Kwon, Lagace,

McNutt, Horton, & Deisenhofer, 2008) (Zhang et al., 2007). Human PCSK9 is located on

the short (p) arm of chromosome 1 at position 32.3 and contains 12 exons and 11 introns

(Seidah et al., 2003) (Hunt et al., 2000). The protein consists of 692 amino acids

(Marianne Abifadel et al., 2014) and contains N-terminal pro-domain, catalytic domain,

and a cysteine- and histidine-rich C- terminal domain as shown in Figure 8 (Cariou, Le

May, & Costet, 2011) (Pandit et al., 2008). PCSK9 regulates cholesterol levels in the body

through its interaction with the EGF domain of the LDLR (Zhang et al., 2007). Mutations in

the PCSK9 gene are rare and it was first found in French families (M Abifadel et al.,

2003). However, it is now known to be associated with severe hypercholesterolaemia in a

number of families in several countries (Leren, 2004) (Sun et al., 2005) (Timms et al.,

2004). Gain of function mutations in the PCSK9 gene causes hypercholesterolaemia and

the exact underlying mechanism in not fully understood (Faiz et al., 2012). However, it is

known that PCSK9 gene encodes an enzyme which is involved in the regulation and

degradation of the LDLR receptor protein and preventing it from being recycled. Hence, in

the recent times the inhibition of PCSK9 has been of great interest to the pharmaceutical

industry as an important target for development of new drugs to FH (A. K. Soutar &

Naoumova, 2007). PCSK9 gene mutations account for 1.5% clinically diagnosed FH (De

Castro-Orós et al., 2010) and are commonly seen as heterozygotes, homozygotes and

compound heterozygous on their own or in combination with LDLR mutations as double

heterozygotes (Brautbar et al., 2015).

Page 36: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

18

Figure 8 PCSK9 gene located on the short (p) arm of chromosome 1 at position 32.3 (1p32.3). Numbered vertical bars represent the exons which encode the various domains of the PCSK9 protein. To date 163 mutations are known in the PCSK9 gene, some of which cause gain-of-function whilst some

cause loss-of-function.

Page 37: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

19

1.5.4 LDLRAP1

The 25-kb LDLRAP1 gene is located on the short (p) arm of chromosome 1 at position

36.11 which contains 9 exons and 8 introns. It yields a protein product containing 308

amino acids with 3 functional domains: phosphotyrosine-binding domain (PTB), a type- 1

clathrin box (LLDLE) and an adaptor protein 2 (AP2)-binding domain (Figure 9). Loss-of-

function mutations in the LDLRAP1 gene causes Autosomal Recessive

Hypercholesterolaemia (ARH) and was first noted by Khachadurian (Khachadurian, 1964).

One of the main functions of LDLRAP1 is to facilitate the endocytosis of LDLR in the

clathrin-coated pits of the hepatocytes. However, in ARH due mutations in the LDLRAP1,

the LDL-C ligand and receptor complex cannot be internalised in the hepatocytes which

results in impaired catabolism of LDL-C. ARH are extremely rare with an estimated

frequency of 1 in 5,000,000 (Rader et al., 2003). The prevalence of ARH is high in

Sardinia, Italy; at around 1 in 40,000, presumably due to founder effects, consanguineous

marriages and due to its isolated geographical location (Filigheddu et al., 2009). To date,

there are 39 mutations reported in the LDLRAP1 gene (Henderson et al., 2016). The

phenotype for both ARH and homozygous FH due to LDLR are similar (Pisciotta et al.,

2006). However, total serum cholesterol and LDL cholesterol levels are lower and when

compared to homozygous FH caused by LDLR (A. K. Soutar & Naoumova, 2007). In

addition, ARH patients tend to have high HDL cholesterol in comparison to patients with

homozygous FH due to LDLR mutations (Pisciotta et al., 2006).

Page 38: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

20

Figure 9 LDLRAP1 gene located on the short (p) arm of chromosome 1 at position 36.11 (1p36.11). Numbered vertical bars represent the exons which

encodes to the respective domains of the LDLRAP1 protein. PTB domain: phosphotyrosine–binding domain, AP-2: adaptor protein 2.

Page 39: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

21

1.5.5 APOE

APOE gene is located on the long (q) arm of chromosome 19 at position 13.32

(19q13.32). The APOE gene consists of 4 exons and 3 introns and encodes a protein

called apolipoprotein E (ApoE) with 299 amino acids with a molecular mass of ~34kDa

(Zhao, Liu, Qiao, & Bu, 2018). The receptor binding domain of the ApoE is in the N-

terminal domain and the lipid binding domain is in the C-terminal domain as shown in the

Figure 10. The gene produces a multifunctional glycosylated protein synthesised in

several organs such as liver, brain, kidney and spleen (Awan et al., 2013) (Mahley, 1988).

ApoE is a key component of chylomicrons, chylomicron remnants and very low density

lipoprotein (VLDL) (Awan et al., 2013). It acts as a ligand for LDLR and transports

cholesterol and other lipids among various cells of the body (Mahley, 1988). A mutation in

the APOE gene leads to defective binding of LDLR which is characterised by raised blood

cholesterol levels and accelerated coronary artery disease (Mahley, 1988). Through

genome wide association studies (GWAS), it is evident that the APOE gene is strongly

associated with LDL-C levels (Asselbergs et al., 2012). Recent evidences from the

literature indicate that APOE could be another gene candidate causing FH (Solanas-

Barca et al., 2012) (Marduel et al., 2013). In particular APOE p.Leu167del mutation has

been predicted to alter the ApoE protein structure and weaken its lipoprotein binding link

with the LDLR (Awan et al., 2013).

Page 40: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

22

Figure 10 APOE gene located on the long (q) arm of chromosome 19 at position 13.32 (19q13.32). Numbered vertical bars represent exons.

Apolipoprotein E forms the major component of chylomicrons.

Page 41: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

23

1.6 Diagnosis

FH is a public health problem and studies have shown that FH is underdiagnosed and

undertreated throughout the world (Børge G. Nordestgaard et al., 2013). The Netherlands,

Norway and Japan are the only countries known to have high diagnostic rates of 86%,

37% and 26% respectively, whereas the diagnostic rates in the rest of the world is known

to be less than 10% (Borge G Nordestgaard & Benn, 2017). In the UK, only 12% of the

population are known to be diagnosed (Børge G. Nordestgaard et al., 2013) and an audit

by the Royal College of Physicians has therefore indicated that 85% of the individuals with

FH remain undiagnosed (Mooney, 2011). Generally, diagnosis is primarily based on family

history, clinical history of CHD, physical examination for xanthomas, repeated

measurements of high blood plasma cholesterol levels and by genetic diagnosis (F

Civeira, 2004). Despite being cost-effective, this approach risks missing 30-60% of the

affected patients (S. R. Daniels & Greer, 2008). Moreover, diagnosis solely based on

physical examination and family histories are not effective as family studies are

complicated and some of the features of FH occur only in the adulthood (Heath,

Humphries, Middleton-Price, & Boxer, 2001). In addition, studies have also shown that low

cholesterol levels were observed in individuals with certain LDLR mutation carriers and

hence measurement of cholesterol levels alone are not effective in diagnosis (Hobbs et

al., 1989) (Koivisto, Koivisto, Miettinen, & Kontula, 1992) (Sass et al., 2004). Hence,

currently, FH is diagnosed based on biochemical, clinical and genetic diagnosis (Watts et

al., 2011) (D Marks, Thorogood, Neil, & Humphries, 2003).

1.6.1 Biochemical diagnosis

FH is primarily diagnosed by biochemical diagnosis whereby patient’s blood plasma

cholesterol levels are measured and at least two elevated blood plasma cholesterol

measurements are required. Serial measurements of blood cholesterol levels, in particular

LDL cholesterol, aid in the diagnosis of FH. However, cholesterol levels vary with age, sex

and the population group and hence it is important to establish a cut-off based on the

above parameters (D Marks et al., 2003). In the UK, based on the Simon-Broome

Diagnostic Criteria, the cut-off for total cholesterol and LDL cholesterol for children is set

to >6.7 mmol/l and >4.0 mmol/l and for adults it is >7.5 mmol/l and >4.9 mmol/l (“Risk of

fatal coronary heart disease in familial hypercholesterolaemia. Scientific Steering

Committee on behalf of the Simon Broome Register Group,” 1991). LDL cholesterol is

known to be the driving factor for CHD and hence fasting lipid profile is requested by the

GPs and clinicians in the lipid clinic as the first line of the obligatory biochemical screening

process where LDL cholesterol is calculated using the Friedewald formula (Friedewald,

Levy, & Fredrickson, 1972). However, studies have shown that almost 15-20% patients

Page 42: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

24

would have been misdiagnosed based on cholesterol testing alone (Koivisto et al., 1992)

(Ward et al., 1996). Hence, biochemical coupled with clinical and genetic diagnosis would

be expected to be the most effective way to diagnose and screen patients suspected to

have FH.

1.6.2 Clinical diagnosis

Several diagnostic criteria have been developed to clinically diagnose FH throughout the

world and no international standard currently exists (Fahed & Nemer, 2011). There are

currently three widely used clinical diagnostic criteria which supplement the use of blood

plasma cholesterol levels, clinical signs, family history and genetic diagnosis. The three

widely used clinical diagnostic tools are Simon Broome criteria (“Risk of fatal coronary

heart disease in familial hypercholesterolaemia. Scientific Steering Committee on behalf

of the Simon Broome Register Group,” 1991), Dutch Lipid Clinic Network (DLCN) (WHO,

1999) and Make Early Diagnosis to Prevent Early Deaths (MEDPED) criteria (Williams et

al., 1993). In the UK, the Simon Broome criteria are followed, which classifies the patient

as definite FH or possible FH based on the blood cholesterol and LDL cholesterol levels,

family history of myocardial infarction, presence of tendon xanthomata along with genetic

diagnosis, as shown in Table 1.

Table 1 Simon Broome criteria.

Definite FH Possible FH

Cholesterol: >6.7 mmol/l (LDL > 4.0 mmol/l) in children under 16, >7.5 mmol/l (LDL > 4.9 mmol/l) in adults.

Plus either Tendon xanthomata in patient, 1st or 2nd degree relative.

Plus either Family history of myocardial infarction <50yrs in 2nd degree relative or <60yrs in 1st degree relative.

OR DNA confirmation.

OR Family history of cholesterol >7.5mmol/l in 1st or 2nd degree relative.

FH: Familial hypercholesterolaemia, LDL: Low density lipoprotein.

The DLCN tool is similar to the Simon Broome tool, but adds the calculation of a numeric

score and is in use in some part of Europe. Based on the scores, diagnosis is considered

to be certain if it is greater than 8 and probable if it is between 6 and 8 and is considered

to be possible if the score is between 3 and 5. Any score below 3 is considered to be

negative (F Civeira, 2004). The US MEDPED criteria take into account of age-specific and

first degree and second degree relative specific criteria for total cholesterol only and do

not take clinical characteristics, such as tendon xanthomata and an identification of a

mutation into account (Williams et al., 1993). These clinical criteria are not expensive and

Page 43: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

25

hence they are still being used for clinically diagnosing FH as well as to identify family

members who may have FH. However, this process is not accurate in diagnosing index

cases in the general population.

1.6.3 Genetic diagnosis

The definitive way of diagnosing FH, is by genetic testing whereby pathological mutations

can be detected by a DNA based mutation screening (D Marks et al., 2003) (P. Nicholls,

Young, Lyttle, & Graham, 2001). The National Institute for Health and Care Excellence

(NICE) guidelines in the UK states, that genetic analysis is clinically more effective as well

as cost-effective for the diagnosis of FH than LDL cholesterol screening (A S Wierzbicki,

Humphries, & Minhas, 2008). It also states that all patients meeting the Simon Broome

criteria should be offered a DNA test to confirm FH. So far, a combination of different

genetic techniques have been used to genetically diagnose FH which has resulted in only

few people being able to avail a genetic test partially due to cost implications as well as

time associated with these screening methods (Maglio et al., 2014) (Thomas, 2015).

Currently, Sanger sequencing is still considered as the gold standard for SNV mutation

analysis and Multiplex ligation-dependent probe amplification (MLPA) is used for the

detection of large insertions and deletions. Several of these traditional molecular

screening methods are currently in use which varies in sensitivity (Maglio et al., 2014)

(Sharma et al., 2012). However, recently genetic diagnosis through next generation

sequencing (NGS) technology has created a major impact in the scientific and diagnostic

community by allowing labs to run several samples in a single run as well as to analyse

multiple genes. Genetic diagnosis through NGS has the potential to improve early disease

identification, atherosclerosis risk prediction and aid in timely management.

1.6.4 Cascade screening

Cascade screening is a process of identifying people who are at risk of a genetic condition

by systematically screening the families of the index case (Sturm, 2014). In cascade

screening, an index patient is initially identified through confirmation of a mutation by

genetic testing. On confirmation, the first degree relatives of the index patient are

screened for the same mutation. New confirmed cases from the first degree are then

treated as index cases and their first degree relatives are screened (Fahed & Nemer,

2011). FH being a dominantly inherited condition makes cascade screening as an

effective tool as half of the screened relatives of the index case will demonstrate the

mutation (Faiz, Nguyen, Van Bockxmeer, & Hooper, 2014). In 1994, the Netherlands

came up with the first successful model of cascade screening and to date this programme

has identified 33,000 FH cases (Huijgen, Hutten, Kindt, Vissers, & Kastelein, 2012). In the

UK, cascade screening has shown to be more cost-effective than universal screening

Page 44: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

26

(Dalya Marks et al., 2002) and has been established in Scotland, Wales and Northern

Ireland. It is also evident from the literature that genetic cascade screening programmes

has been well received by the patients and their relatives (Faiz et al., 2014) as it releases

the unaffected relatives from anxiety and further surveillance (Thomas, 2015).

1.7 New approaches in molecular diagnosis

We have witnessed the evolution of NGS technology and the way it has fuelled the

diagnostics in a very short period of time. This state-of-art technology will increase our

understanding of pathologies at molecular and cellular levels and aid in personalised

medicine. These advancements will pave way for bringing healthcare research from the

traditional “bench to bedside” and will also aid in genetic tests being the forefront of

diagnostics and treatment. Currently, there are different NGS platforms available which

share a common technological feature - massively parallel sequencing of clonally

amplified or single DNA molecules that are spatially separated in a flow cell. These NGS

platforms differ in sequencing chemistries and are capable of generating sequences of

read length ranging from 35bp to 400bp. They require minimal DNA input as a starting

material for construction of a library and output data ranges from few Mb to 600Gb per

run. The run time also varies and ranges from 8 hours to 11.5 days based on the

platforms.

NGS for FH can be either performed as whole genome sequencing (WGS), whole exome

sequencing (WES) or by using targeted gene panels. Each of these approaches has its

advantages and disadvantages. WGS is universal, is capable of analysing non-coding

regions and has more reliable coverage in comparison to WES (Majewski,

Schwartzentruber, Lalonde, Montpetit, & Jabado, 2011). However, WES is more cost-

effective than WGS and has increased depth of sequencing (Majewski et al., 2011). In

contrast, targeted gene panels are preferred for single gene disorders, such as FH, since

they are less labour intensive and cost-effective in comparison to WGS and WES (Wu,

Sun, Pan, Yang, & Wang, 2014). In addition, targeted gene panels do not pose any ethical

issues due to incidental findings as in WGS and WES (Maglio et al., 2014).

1.8 FH Management

Early diagnosis and management of FH can significantly reduce the mortality and

morbidity rates (Finnie et al., 2012). In the UK, National Institute for Health and Care

Excellence (NICE) recommends that patients with FH should achieve a reduction in their

LDL-cholesterol levels of more than 50% from their baseline (National Institute for Health

and Clinical Excellence, 2017). The European Society of Cardiology/European

Atherosclerosis Society (EAS) recommends the acceptable target levels of LDL-

Page 45: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

27

cholesterol in heterozygous patients with confirmed CHD as <1.8mmol/l and <2.5mmol/l in

heterozygous patients without confirmed CHD (Børge G. Nordestgaard et al., 2013).

Currently, statins are the first line of drugs in the treatment of patients with FH and there is

large amount of evidence available to show that regular use of statins in FH patients can

reduce the cardiovascular events (Catapano et al., 2011) (Baigent et al., 2010). Statins

act by inhibiting HMG-CoA reductase, the rate-limiting step in cholesterol synthesis and by

increasing the expression of LDL receptors resulting in lower plasma LDL cholesterol

(Ooi, Barrett, & Watts, 2013). Large clinical trials (n=26) have shown overwhelming

evidence that rigorous management with statins can significantly reduce cardiovascular

mortality and morbidity (Baigent et al., 2010). In the UK, NICE guideline (CG71)

recommends the use of statins in children from the age of 10 (Steve E Humphries,

Cooper, Dale, & Ramaswami, 2018). However, it is evident from the literature that some

patients not able to tolerate statins are not able to achieve the EAS recommended levels

of LDL cholesterol (Pijlman et al., 2010). In such patients Ezetimibe could be the drug of

choice which works by blocking the cholesterol absorption in the small intestine and is

preferable to be given in combination with statins (Anthony S. Wierzbicki, Humphries, &

Minhas, 2008). Alternative lipid lowering strategies with lifestyle modifications and drugs

such as bile acid sequestrants, mipomersen, fibrates (Jun et al., 2010), niacin (Ganji,

Kamanna, & Kashyap, 2003) and PCSK9 inhibitors (Reiner, 2015) in combination with

statins should be considered (B Sjouke, Kusters, Kastelein, & Hovingh, 2011). In

homozygote FH patients with extremely high blood plasma cholesterol levels, lipoprotein

apheresis (Varghese, 2014) should be considered and gene replacement therapy should

be explored for reducing the LDL cholesterol levels (Raper, Kolansky, & Cuchel, 2012).

1.9 Gaps in the literature and problems with the current diagnostic

methods

It is evident from the literature that there is currently no single genetic test that is effective

for screening and diagnosing FH patients. There are several studies that have reported

the use of a combination of DNA tests for FH screening and diagnosis (Chiou, Charng, &

Chang, 2011) (Duskova et al., 2011) (Finnie et al., 2012) (C A Graham et al., 2005)

(Heath et al., 2001) (Palacios et al., 2012). However, these studies have shown a wide

variation in sensitivity (34%-100%) and specificity (28%-99.1%) and have concluded that

the poor sensitivity and specificity observed could be due to the screening methods

employed as well as patient selection. Another key finding which could have played a

significant role in altering the sensitivity and specificity rate in these studies is using

different clinical diagnostic criteria in selecting patients. Some studies (Chiou et al., 2011)

(Finnie et al., 2012) (Futema, Plagnol, Whittall, Neil, & Humphries, 2012) (C A Graham et

Page 46: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

28

al., 2005) (Heath et al., 2001) (Vandrovcova et al., 2013) have employed Simon Broome

criteria while a few others (Duskova et al., 2011) (Palacios et al., 2012) have used DLCN

and MEDPED criteria. This also highlights the fact that there are currently no universal

guidelines that could be accepted and used in the clinical diagnosis of these FH patients.

Two recent targeted NGS based studies in the UK have shown a very high sensitivity and

specificity rate highlighting the significance and potential of NGS based approaches in the

screening and diagnosis of FH (Vandrovcova et al., 2013) (Futema et al., 2012).

Furthermore, a Latvian study has shown that NGS based approach can be used to detect

FH in high-risk individuals who do not meet the defined clinical criteria (Radovica-Spalvina

et al., 2015). In addition, a recent Canadian study has shown that NGS based approach

can also be used in detecting LDLR gene copy number variants in patients with FH and

has also concluded that MLPA can be removed from routine diagnostic screening

(Iacocca et al., 2017). Furthermore, as evident from the literature it could be argued that

microarray studies (Chiou et al., 2011) (Duskova et al., 2011) have also shown a very high

reproducibility rate of 99.99% and a mutation-pick up rate of 98.9% similar to the NGS

studies; therefore, this platform could also be a powerful new tool for genetic screening

and diagnosis of FH patients. However, perhaps a major limitation of micro-array methods

is that they are not capable of picking up novel mutations. Furthermore, the microarray

chips also require regular updates with new insertions and deletions into their tiling

(Palacios et al., 2012). Moreover, microarrays are specifically designed and are only

capable of picking up mutations most common to a specific population and therefore may

not be applicable for a wider global population. Therefore, its application may not be

optimal to a multi-cultural society such as UK.

Many clinically diagnosed FH patients with raised blood plasma cholesterol levels fail to

show mutations in the known FH causing genes (Fahed & Nemer, 2011). This suggests

that there could be mutations in other novel genes involved in the cholesterol metabolic

pathway which are yet to be discovered (Fahed & Nemer, 2011). In addition, current

traditional screening methodologies also vary in sensitivity which could also lead to

diagnostic gaps. Researchers in Canada have studied this diagnostic gap and have

shown that exon-by-exon sequencing is capable of diagnosing only two thirds of FH

patients (J. Wang, Ban, & Hegele, 2005). They have reduced the diagnostic gap from

30% to 10% by detecting copy number variants through Multiplex Ligation Dependent

Probe Amplification (MLPA) (J. Wang et al., 2005). However, there could be a number of

reasons for having a negative result through genetic testing-a mutation could have been

missed due to technical reasons, a variant could have been detected but there is not

enough evidence to prove that it is pathogenic (variants of unknown significance), the

Page 47: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

29

patient could be having mutations in multiple genes (polygenic) and there could be other

novel genes which are yet to be discovered (Thomas, 2015). Hence, it is important for

clinicians and patients to acknowledge that a negative result does not exclude a diagnosis

of FH.

1.10 Need for further research and service development

In the last decade, our understanding of human lipid biology has dramatically increased,

and novel FH-associated variants continue to be reported world-wide on FH causing

genes. This consequently imposes practical limitations on our current traditional genetic

screening methods and therefore there remains no single test that can effectively

diagnose FH. However, it is of paramount importance to effectively diagnose and detect

FH at an early stage with relatively low cost in order to prevent CHD and also to tailor

appropriate management strategies. Recently, few studies have shown that targeted NGS

approaches can effectively be used in the screening and diagnosis of FH (Futema et al.,

2012) (Vandrovcova et al., 2013).

NGS and its potential applications in the next few years will bring a huge transition in

routine molecular diagnostics and hence NHS laboratories should be prepared to

embrace the technology at the earliest. To do so, more NGS studies will need to be

carried out in the NHS laboratories to witness the diagnostic potential of NGS as well as to

aid its translation into routine molecular diagnostics. In addition, by conducting more pilot

studies using NGS approaches in the laboratories of the NHS will allow the healthcare

professionals to gain more expertise and competence for successful implementation of

NGS into routine diagnostics.

In the current era of genomics, it is necessary for the NHS molecular laboratories to keep

in pace with the rapid explosion of developments in the NGS technology. Our study will

serve as a proof of principle study and will aid in understanding the technical challenges

and barriers in implementing NGS technology into routine molecular diagnostic services of

the NHS. The future for NGS is highly promising and its use in routine diagnostic

molecular laboratories can also be expanded into areas beyond FH.

Page 48: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

30

1.11 Aims of the study

1) To set up next generation sequencing methodology for the genetic screening of familial

hypercholesterolaemia.

2) To validate the next generation sequencing results by comparing it with the reference

lab results.

3) To evaluate the technical issues and implementation barriers of setting up the NGS

methodology in routine molecular diagnostic laboratories of the NHS.

Page 49: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

31

CHAPTER TWO

MATERIALS AND METHODS

Page 50: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

32

Materials and methods

2.1 Ethical approval

This study was sponsored by the University Hospital Southampton NHS Foundation Trust

(UHS) under the terms of the Department of Health Research Governance Framework for

Health and Social Care (Health, 2005). R&D confirmation of capacity and capability

approval was obtained from UHS R&D. Research ethics committee (REC) approval

(16/NW/0215) and Health Research Authority (HRA) approval were obtained for this study

from North West – Greater Manchester South Research Ethics Committee.

2.2 Sample selection

University Hospital Southampton NHS Foundation Trust runs a lipid clinic where patients

with elevated cholesterol and other forms of lipidaemia are being diagnosed, assessed

and managed. Patients visiting the lipid clinic have either one of the following or a

combination of the following: elevated blood plasma cholesterol, family history of elevated

blood plasma cholesterol, coronary heart disease (CHD) and tendon xanthomata. Patients

with the above conditions and who fulfil the Simon Broome criteria (“Risk of fatal coronary

heart disease in familial hypercholesterolaemia. Scientific Steering Committee on behalf

of the Simon Broome Register Group,” 1991) are referred for genetic testing for FH in a

reference lab by the Consultant in the lipid clinic. The DNA is extracted and stored from

these patient samples. Patients identified as having a mutation in one of the FH genes or

not having a mutation are documented in the patient notes and are managed with

appropriate prophylactic strategies in UHS. A retrospective cohort of 94 DNA samples

from patients known to have FH or suspected to have FH was selected for this study. All

the samples for this study were provided by the Consultant running the lipid clinic at UHS.

The samples were fully anonymised and no patient identifiable information such as name,

age, hospital number and gender was available. Details provided were unique identifier

number for each sample and the type of variant detected by the reference lab along with

the method or panel used for the detection of the variant. No other information was

available.

2.3 Primer design on Illumina DesignStudio

Illumina® DesignStudio™ (Illumina, 2016a) was used to design the amplicons for the FH-

NGS study. DesignStudio™ is a personalised web based sequencing assay design tool

used to design, review and order the amplicons for the study. TruSeq® custom amplicon

is a fully customisable amplicon- based assay for targeted sequencing offered by Illumina.

TruSeq custom amplicon allows researchers to sequence hundreds of genomic regions

with little as 2Kb to 650Kb of cumulative sequences.

Page 51: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

33

Five main genes (LDLR, APOB, PCSK9, LDLRAP1 and APOE) with mutations known to

cause FH were added to the DesignStudio to design the amplicons for the study. After

adding the genes of interest in the DesignStudio, the programme brings up the target

regions with number of targets, cumulative targets (base pairs), total number of amplicons

along with their coverage. The UHS design data was reviewed and was found to have low

coverage in certain regions in LDLR as low as 68% and the overall coverage was 94% as

shown below in Figure 11. Illumina diagnostics was contacted and the final design was

optimised by the Illumina concierge team. The FH-NGS panel designed and validated by

Illumina had an overall coverage of 98% with 211/211 selected amplicons, 68/68 selected

targets and with a cumulative target of 32,197bp. There was one gap with a gap distance

of 510bp in the LDLR region 11,241,937 to 11,244,525 which had only 80% coverage.

Page 52: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

34

Figure 11 Screen shot of Illumina® DesignStudio™. FH-NGS amplicons were designed at UHS on the Illumina® DesignStudio™.

Page 53: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

35

2.4 Measuring the DNA concentration

Quality of DNA is very crucial for a good library preparation. Construction of high quality

sequencing libraries is very significant for successful NGS. DNA quality and concentration

was estimated both by Qubit Flourometer and by NanoDrop. Through Qubit Flourometer,

the interference of contaminants such as degraded DNA/RNA is eliminated and through

NanoDrop the quality and purity is determined.

2.4.1 Measuring the DNA concentration by Qubit

The double stranded DNA concentrations of the anonymised patients sample were

determined using the next generation benchtop Qubit® 3.0 Fluorometer (Thermo Fisher

Scientific, 2014). The Qubit Fluorometer utilises florescent dyes that specifically bind to

the targets even at low concentrations and emits a signal. The Qubit Fluorometer is more

sensitive than UV absorbance, since it minimises the effect of other contaminants such as

degraded DNA /RNA.

The Qubit® Fluorometer was calibrated using Qubit™ dsDNA Broad Range Standard 1

and 2 before measuring the DNA concentration on the patient samples. A working solution

was prepared by adding 1µl of Qubit® Broad Range Detection Reagent (fluorochrome) to

199 µl of Qubit® Broad Range Buffer. 195 µl of the working solution was added to 5 µl of

the patients DNA sample in Qubit® Assay tubes. The tubes were vortexed for 2-3 sec and

incubated at RT for 2 min. The tubes were then placed in the Qubit® 3.0 Fluorometer and

the dsDNA concentration was measured using the dsDNA Broad Range programme. The

concentrations of dsDNA in the patients sample were documented on a spreadsheet in

ng/µl. This worksheet was used to work out the dilutions required to prepare a working

concentration DNA of 25ng/µl. All the patient samples were then diluted individually with

DNA/RNA free water to have a concentration of 25ng/µl.

2.4.2 Measuring the DNA concentration using NanoDrop

The anonymised samples were also checked for concentration and quality using the

NanoDrop® ND-1000 Spectrophotometer (Thermo Fisher Scientific, 2010) which can

analyse the absorbance of RNA, DNA, nucleic acids and proteins. The ratio of

absorbance at 260nm and 280nm were used to assess the purity of the DNA samples. A

ratio of ~1.8 was generally accepted as “pure” for DNA and 98% of our study samples had

a 260/280 ratio greater than 1.8. This technology is also useful for low volume samples.

Page 54: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

36

2.5 Illumina workflow

Illumina MiSeq platform was used in this FH-NGS study which utilises the sequencing by

synthesis (SBS) principle (Braun & LaBaer, 2003) and is a widely used next generation

sequencing technology. Illumina platforms support massively parallel sequencing using a

proprietary method that detects single bases as they are incorporated into the growing

DNA strand. Illumina workflow includes four basic steps: Library preparation, cluster

generation, sequencing and data analysis as shown in Figure 12.

Figure 12 Illumina sample workflow.

2.5.1 Library preparation

Illumina TruSeq Custom Amplicon v1.5 protocol (Illumina, 2016c) was followed to prepare

the libraries as shown in Figure 13. 94 uniquely indexed paired-end libraries of the

genomic DNA were prepared using the Illumina® TruSeq® Custom Amplicon Library

Preparation kit. The TruSeq Custom Amplicon library protocol offers multiplexing

capability to amplify all the amplicons in a single reaction and sequence them in a single

run.

Page 55: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

37

Figure 13 TruSeq Custom Amplicon Workflow (Illumina, 2016c).

Page 56: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

38

2.5.2 Hybridize the oligo pool

This process hybridizes a custom oligo pool which contains the upstream and

downstream oligos specific to targeted regions of interest as shown in Figure 14. In this

step the primers anneal to the genomic DNA. Multiple samples can be processed

simultaneously using the 96 well filter plate. Extreme care should be taken and personal

protective equipment should be used during this procedure as these sets of reagents

contain formamide, a reproductive toxin. 5µl of Amplicon Control DNA (ACD1) was added

to 5µl of distilled water to well one of the Hybridization Plate (HYP). Use of control DNA

enables the Illumina technical team to troubleshoot in case of any issues with the assay.

Then 10µl of genomic DNA (25ng/µl) was added to each of the remaining wells

respectively. Then 5µl of Control Oligo Pool (ACP1) was added to the well containing

ACD1 and 5µl of Custom Amplicon Oligos (CAT) was added to all the wells containing the

genomic DNA. The plate was then centrifuged at 1000 x g for 1 min. After centrifuging,

35µl of Oligo Hybridization for Sequencing 2 (OHS2) was added to each well and mixed

by pipetting and the plate was again centrifuged at 1000 x g for 1 min. The plate was then

placed on the preheated heat block and incubated at 40ºC for 80 min.

Figure 14 Hybridization of upstream and downstream oligos to the region of interest. ULSO:

Upstream Locus-Specific Oligos, DLSO: Downstream Locus-Specific Oligos.

Page 57: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

39

2.5.3 Removal of unbound oligos

This step removes the unbound oligos from the genomic DNA using a size-selection filter.

The filter plate was assembled as shown in Figure 15.

Figure 15 Filter Plate Unit (FPU) assembly (Illumina, 2016c). A=Lid, B=Filter plate, C=Adaptor

collar, D=Midiplate.

The wells of the filter plate were washed by adding 45µl of Stringent Wash 1 (SW1) to

each well. The FPU was then covered and centrifuged at 2400 x g for 10 minutes. Care

was taken to ensure that there was not >15µl/well of residual buffer in the wells. Then the

plate was removed from the heat block and centrifuged at 1000 x g for 1 min. The

samples from each well were then transferred to the corresponding well of the FPU plate.

The FPU plate was covered and centrifuged at 2400 x g for 2 min. The plate was then

washed twice by adding 45µl of SW1 to each sample well. The plate was covered and

centrifuged at 2400 x g for 2 min. The plate was centrifuged again to ensure the SW1 had

drained completely. The flow through was discarded with each wash and 45µl of Universal

Buffer 1 (UB1) was added to each sample well and covered. The plate was then

centrifuged at 2400 x g for 2 min. The plate was centrifuged for further few minutes to

ensure UB1 had drained completely.

2.5.4 Extend and ligate bound oligos

This step connects the hybridized upstream and the downstream oligos. A DNA

polymerase extends from the upstream oligos through the targeted region, followed by the

ligation to the 5’ end of the downstream oligo using a DNA ligase as shown in Figure 16.

Page 58: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

40

This results in the formation of products containing targeted regions of interest flanked by

sequences required for amplification.

45µl of Extension-ligation Mix 4 (ELM4) was added to each sample well of the FPU plate.

The plate was sealed using foil adhesive seal and incubated at 37ºC for 45 min.

Figure 16 Extension and ligation of bound oligos. ULSO: Upstream Locus-Specific Oligos,

DLSO: Downstream Locus-Specific Oligos.

2.5.5 Amplify Libraries

This step amplifies the extension-ligation products and adds the index 1 (i7) adapters,

index 2 (i5) adapters and sequences required for the cluster formation as shown in Figure

18. By multiplexing, large number of libraries can be pooled and can be sequenced in a

single sequencing run.

The index 1 (i7) adapters and the index 2 (i5) adapters were arranged on the TruSeq

Index Plate Fixture in columns 1-12 and rows A-H as shown in Figure 17.

Page 59: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

41

Figure 17 TruSeq Index Plate Fixture (Illumina, 2016c). A= Rows A-H: Index 2 (i5) adapters

(white caps), B= Columns 1-12: Index 1 (i7) adaptors (orange caps), C= Indexed Amplification Plate (IAP) plate.

The plate was then placed on the TruSeq Index Plate Fixture and 4µl of each Index 1 (i7)

adapter was added down to each column using a multichannel pipette. Then 4µl of each

index 2 (i5) adapters was added across each rows using a multichannel pipette. New

orange and white caps were used to recap the unused indexes to avoid cross

contamination. Then the FPU plate from the incubator was removed and the aluminium

foil seal was replaced with the filter plate lid. The plate was then centrifuged at 2400 x g

for 2 min. After centrifugation, 25µl of freshly made 50 mM NaOH was added to each well

and mixed well with a pipette. The plate was then incubated at RT for 5 min. This step

removed the bound libraries from the filter plate. PCR master mix 2 (PMM2) and TruSeq

DNA Polymerase 1 (TDP1) mixture was prepared freshly just before use by combining

56µl TDP1 to a full tube (2.8ml) of PMM2 and mixed well by inverting. Then 22µl of

PMM2/TDP1 mixture was added to each well of the IAP plate. Then 20µl of the eluted

samples from the FPU plate was transferred to the IAP plate and mixed well with a

pipette. The waste collection in the midi plate was discarded. The IAP plate was then

centrifuged at 1000 x g for 1 min. After centrifugation, the IAP plate was taken to the post-

amplification area and placed on a thermal cycler to run the PCR. The plate was left on

the thermal cycler at 95ºC for 3 min and then the PCR was run for 25 cycles at 95ºC for

30 sec, 66ºC for 30 sec, 72ºC for 60 sec. After the 25 cycles the plate was programmed to

incubate at 72ºC for 5 min and then held for infinite at 10ºC on the thermal cycler.

Page 60: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

42

Figure 18 Sample indexing and amplification. ULSO: Upstream Locus-Specific Oligos, DLSO:

Downstream Locus-Specific Oligos.

2.5.6 Quality, quantification and sizing of the amplified libraries

The amplified library quality, quantification and sizing were performed both by the gel

electrophoresis and by the Bioanalyser method.

2.5.6.1 Gel electrophoresis method

2% Agarose gels were prepared using 2g of agarose powder (Alpha Laboratories) in

100ml of Tris Acetate-EDTA (TAE) buffer. The gel was supplemented with ethidium

bromide (Sigma) (final concentration of 0.5µg/ml) before pouring the gel. The gel was

poured on the plate with combs in place and was allowed to settle at RT for 1h. Then the

combs were removed and the gel was placed in the electrophoresis tank with TAE buffer

along with 1-2µl of ethidium bromide. To visualise the amplified libraries, 10µl of the

samples were added to 1-2µl of the loading dye and electrophoresed at 100 volts for

40min-1hour. The patient samples were run along with a 100bp ladder to visualise the

size of the products as shown in the Figure 19. Expected PCR product size around 350bp

for 250bp amplicons were seen on both the gel electrophoresis as well as on the

Bioanalyser.

Page 61: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

43

Figure 19 Gel electrophoresis of amplified libraries along with 100bp ladder. Amplicons of

size 350bp were seen on the gel electrophoresis post PCR.

2.5.6.2 Bioanalyser method

The amplified libraries were also run on a Bioanalyser (Agilent Technologies, 2007)

according to the manufacturer’s instructions to assess the quality, quantity and the size.

Agilent DNA 1000 assay protocol was followed in running the Bioanalyser. The gel-dye

mix was prepared by adding 25µl of DNA dye concentrate to the DNA gel matrix vial and

vortexed to mix well. The vial was then centrifuged at 2240 x g for 15 mins and then

stored at 4ºC and protected from light. Expected PCR product size around 350bp for

250bp amplicons were seen on the Bioanalyser as shown in Figure 20. Bioanalyser was

used to accurately quantify and assess the quality and size of the amplified libraries which

is significant for further downstream analysis. It also allows visualisation other unwanted

products present in the sample.

Page 62: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

44

Figure 20 Amplified libraries along with a ladder on a Bioanalyser. TSCA: TruSeq Custom

Amplicon.

2.5.7 Clean up Libraries

AMPure XP beads (Beckman Coulter, 2016) were used in this step to purify the PCR

products from other reaction components as shown in Figure 21. These are paramagnetic

beads which acts like a magnet in a magnetic field.

Figure 21 Library clean up using AMPure XP beads (Beckman Coulter, 2016).

45µl of AMPure XP beads (for amplicon size 250bp) were added to each well of a new

midi plate and labelled as Cleanup Plate (CLP). All the supernatant from the IAP plate

were transferred to the corresponding well of the CLP plate. The plate was shaken at

1800 rpm for 2 min and then incubated at RT for 10 min. The PCR products bind to the

paramagnetic beads. The plate was then placed on a magnetic stand for approximately 2

min until the liquid was clear. This step aids in the separation of beads bound to the PCR

products from other contaminants. All the supernatant from each well was removed and

discarded. 40ml of fresh 80% ethanol was prepared from 100% ethanol for 96 samples.

The CLP plate was washed 2 times by adding 200µl of 80% ethanol to each sample well

and incubating on the magnetic stand for 30 sec to remove all the contaminants. The

supernatant was removed and discarded after each wash with 80% ethanol. The residual

Page 63: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

45

ethanol was removed from each well by using a 20µl pipette. The plate was then removed

from the magnetic stand and air dried at RT for exactly 10 min. The plate should be dried

exactly for 10 min only at RT, as prolonged air drying could lead to cracking of beads.

After air drying 30µl of Elution Buffer with Tris (EBT), was added to each well and shaken

at 1800 rpm for 2 min. Care was taken to ensure that all the beads are suspended after

shaking. This step washes all the DNA from the beads when incubated at RT for 2 min.

After incubation, the plate was placed on the magnetic stand for approximately 2 min until

the liquid was clear. 20µl of the supernatant from each well of the CLP plate were then

transferred to the corresponding well of the Library Normalization Plate (LNP) and

centrifuged for 1000 x g for 1 min.

2.5.8 Normalize Libraries

This step normalizes the quantity of each library for balanced representation in pooled

libraries. Library normalization ensures all samples are represented equally. During this

normalization procedure, DNA binds to the beads in approximately the same

concentration in all the samples and yields the same concentration across all samples

when eluted. After normalization, the final sample is single stranded which can be loaded

on the MiSeq analyser.

A Library Normalization Additives 1 (LNA1) and Library Normalization Beads 1 (LNB1)

mixture was prepared just before use by adding 4.4ml of LNA1 to 800µl LNB1 in a 15ml

conical tube. The tube was inverted several times to prepare the LNA1/LNB1 mixture.

45µl of LNA1/LNB1 mixture was added to each well of the LNP plate and shaken at 1800

rpm for exactly 30 min, as any other duration other than 30 min will affect the library

representation and cluster density. After 30 min the plate was removed from the shaker

and placed on a magnetic plate for approximately 2 min until the liquid was clear. All the

supernatant was then removed and discarded. The plate was then removed from the

magnetic plate and washed twice by adding 45µl of Library Normalization Wash1 (LNW1)

to each well and shaking the plate at 1800 rpm for 5 min. After shaking for 5 min the plate

was placed on the magnetic plate for 2 min until the liquid was clear. The supernatant was

removed and discarded after each wash. The residual LNW1 was removed from each well

using a 20µl pipette. The plate was removed from the magnetic plate and 30µl of freshly

made 0.1N NaOH was added to each well and shaken at 1800 rpm for 5 min for

suspending the libraries. The LNP plate was then placed on the magnetic plate for 2 min

until the liquid was clear. For storage 30µl of Library Normalization Storage Buffer 2

(LNS2) were added to a plate labelled as Storage Plate (SGP) and 30µl of supernatant

from each well of the LNP plate was transferred to the corresponding well of the SGP

Page 64: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

46

plate. The plate was then centrifuged at 1000 x g for 1 min and stored at -25ºC to -15ºC

(stable for 30 days) until the libraries were pooled.

2.5.9 Pool Libraries

Pooling libraries combines equal volumes of normalized libraries in a single tube. This

pool is later diluted and denatured for the sequencing run. 5µl of each library is transferred

to an 8-tube strip, column by column to form a pool. The contents of the 8-tube strip were

transferred to a Pooled Amplicon Library (PAL) Eppendorf tube. This PAL tube can be

frozen at -25ºC to -15ºC and stable for 30 days. 94 samples were done in three batches

and all the 3 library pools were pooled together to form a single library pool and was

frozen at -25ºC to -15ºC.

2.5.10 Denaturing and diluting the libraries

For denaturing and diluting the libraries, Protocol B: Bead based normalization method in

the Illumina document #15039740 v01 was followed (Illumina, 2016b). The Hybridization

Buffer (HT1) was removed from the -25ºC to -15ºC freezer and thawed at room

temperature. After thawing the HT1 was stored at 2ºC to 8ºC. The library was then diluted

to the loading concentration by adding 8µl of amplicon library pool to 592µl of prechilled

HT1 to make a total volume of 600µl in a micro centrifuge tube. This was briefly vortexed

for 1 minute at 280 x g. PhiX serves as a control adapter-ligated library in the sequencing

run and provides quality metrics for cluster generation, sequencing, alignment, phasing

and pre-phasing (Illumina, 2016d).

The diluted library was placed in the incubator for 2 min at 98ºC. After 2 min, the micro

centrifuge tube was removed from the 98ºC incubator and immediately cooled by placing

the micro centrifuge tube on ice for 5 min.

The PhiX Control was denatured and diluted to 12.5pM to produce an optimum cluster

density for v2 Illumina reagents. PhiX was diluted to 4 nM by combining 2µl of 10 nM PhiX

library to 3µl of 10 nM Tris-Cl at pH 8.5 with 0.1% Tween 20. PhiX Control was denatured

by combining 5µl of 4 nM PhiX library with 5 µl of freshly prepared 0.2N NaOH in a micro

centrifuge tube. The mixture was briefly vortexed and centrifuged for 1 minute at 280 x g

and incubated at room temperature for 5 min. The denatured PhiX was diluted to 20 pM

by adding 10µl of denatured PhiX library to 990µl of pre chilled HT1which results in 1ml of

20 pM PhiX library. This was further diluted to 12.5 pM PhiX by mixing 375µl of 20 pM

denatured PhiX library to 225µl of pre chilled HT1 which results in 600µl of 12.5 pM PhiX

library.

Page 65: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

47

A low concentration PhiX control spike-in of 1% was used as a sequencing control which

was prepared by combining 6µl of denatured and diluted PhiX with 594µl of denatured

and diluted library. This mixture was placed on ice to be loaded onto the reagent cartridge.

2.5.11 Creating sample plate, sample sheets and manifest file on Illumina

experiment manager (IEM) software

IEM is a software programme used in creating and editing sample sheets for Illumina

instrument and analysis. A sample sheet was created using IEM which involved two steps.

Initially, a sample plate was created which included information about samples in each of

the 96 wells, type of library that was performed, plate name and the sample indexes used.

Secondly, the sample sheet was created based on the sample plate information which

passes the information on the run and samples in the correct (.csv) file format to the

instrument and the analysis software. While creating the sample sheet, care was taken to

ensure correct barcode details of the reagent cartridge, type of kit, number of indexes,

name of the experiment, name of the investigator and date were being entered.

A manifest file was created which contained information about each sample that focuses

analysis results only to user-defined regions of interest, information on the reference

genome used for alignment, information on Chromosome coordinates i.e. start and stop of

each amplicon used and the length of the PCR primer used for the generation of the PCR

amplicon.

2.5.12 MiSeq flow cell

MiSeq flow cell is a single use glass based substrate where the clusters are generated

and the sequencing reaction is performed. The reagents enter the flow cell through the

inlet port and pass through the imaging lane and exists the flow cell through the outlet

port. The waste from the flow cells goes to the waste bottle.

2.5.13 Illumina reagent cartridge

Illumina MiSeq reagent cartridge is a single use cartridge consisting of foil-sealed

reservoirs consisting of reagents for cluster generation and sequencing for 1 flow cell. The

reservoirs on the cartridge are numbered and the sample libraries are loaded on the

cartridge on reservoir number 17, which is labelled as “Load Samples”. Libraries were

loaded on the Illumina reagent cartridge and analysed according to the manufacturer’s

instructions.

Page 66: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

48

2.6 Monitoring the run on BaseSpace

MiSeq run monitoring was performed using Sequencing Analysis Viewer (SAV) in

BaseSpace as shown in Figure 22. BaseSpace is a cloud based genomics computing

environment for analysing, storing next generation sequencing data. Sequencing labs can

store and share data as well as monitor sequencing runs on the Illumina analysers in real

time. BaseSpace also offers a wide variety of NGS data analysis applications including a

few third party applications. Using BaseSpace, one can remotely monitor a sequencing

run, cluster Intensity, Q-Score, cluster density, clusters passing through the filter and the

estimated yield. The FH-NGS run in Figure 23 and 24 shows lane metrics and graphs on

flow cell chart, data by cycle, data by lane, QScore distribution and QScore heatmap.

Page 67: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

49

Figure 22 Sequencing Analysis Viewer (SAV) on BaseSpace. Screen shot of the FH run summary on BaseSpace.

Page 68: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

50

Figure 23 Screen shot of the charts on the FH run on BaseSpace.

Page 69: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

51

Figure 24 Run and Lane metrics of the FH-NGS run.

Page 70: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

52

The run lane metrics gives more detailed information about the run. It gives information on

the read 1 and 2 along with the number of cycles, yield, percentage of aligned reads, error

rate in percentage, Q30 in percentage, cluster density (K/MM2), percentage of clusters

passing through the filter, number of reads passing through the filter etc.

2.7 Bioinformatic analysis

Bioinformatic analysis is a process of analysing and interpreting complex genomic data

using computer software tools. It is a complex stepwise process which involves the

application of computational techniques to design a pipeline which aids in identification of

pathogenic variants from raw genomic data. First step involved determining the quality of

raw reads and bases. Poor quality reads and bases were trimmed at the 3’ end. After

quality checks, the reads were aligned to the reference genome for variant calling. Finally

the data was annotated to filter and identify the pathogenic variants.

2.7.1 BaseSpace analysis

FH-NGS data was analysed on Illumina BaseSpace and the data analysis workflow

consists of two main phases: 1) Primary analysis and 2) Secondary analysis.

During the primary analysis, the images were analysed for base calling after intensity

extraction and correction. The initial sequencing output files from the MiSeq analyser were

Tagged Image File Format (*.tiff). MiSeq control software analyses the images in real time

and after analysis the images are deleted since the file sizes are very big and require

large storage space. The tiff files are converted and saved as Cluster Intensity Files (*.cif).

Base calling is performed as soon as image analysis is completed. Cluster intensity files

are converted to Base Calling Files (*.bcl) which are converted to *.fastq files which

contain information about the read, sequence of the read etc.

During the secondary analysis, the MiSeq reporter software processes the base calls

generated during the primary analysis and also aids in alignment and variant calling. The

output files are in .fastq.gz, .bam, and .vcf formats. .fastq.gz is a text based file format for

storing nucleotide sequence and its corresponding quality scores. Binary Compressed

Sequence Alignment/Map (.bam) file contains information on aligned reads and the

Variant Call Format (.vcf) file contains information about variant calls.

Fastq files were loaded on the BaseSpace and the variant calling was performed. On

completion, detailed information was available on BaseSpace on i) amplicon summary

statistics was available both at read level and base level ii) small variants summary

statistics with SNV’s, insertions and deletions and iii) coverage summary with amplicon

mean coverage depth for each sample.

Page 71: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

53

i) Amplicon summary statistics: The amplicon summary report detailed

statistical information on all samples at a) Read level and at the b) Base level.

a) Read Level Statistics

Read level statistics report details information on each sample with total

aligned reads (R/R2) i.e. the total count of PF (passing filter) reads aligning

for the sample (Read1/Read2), statistical information on percent aligned

reads (R1/R2) i.e. percentage of PF reads aligning for the sample

(Read1/Read2), and statistical information on overall percent aligned reads

i.e. percentage of PF reads aligning for the sample across both reads 1

and 2 as shown in Figure 25.

Page 72: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

54

Figure 25 Read level statistics along with percentage of aligned reads.

Page 73: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

55

b) Base level statistics

Base level statistics report details statistical information on total number of

bases aligning for the sample (Read1/Read2), information on overall total

aligned bases i.e. the total count of bases aligning for the sample across both

reads 1 and 2, statistical information on percent aligned bases (R1/R2) i.e.

percentage of bases aligning for the sample (Read1/Read2), statistical

information on overall percent aligned bases for sample across both reads 1

and 2, statistical information on the percentage of bases with a quality score of

30 or higher for read 1 and 2, information on mismatch rate i.e. percentage of

mismatch to reference averaged over cycles per read (Read1/Read2) as

shown in Figure 26.

Page 74: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

56

Figure 26 Base level statistics along with percentage of aligned bases.

Page 75: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

57

ii) Small variants summary statistics

The small variants summary statistics report detailed information on a) SNVs,

b) Insertions and c) Deletions.

a) SNVs: SNVs statistics report detailed information on the total number of

SNVs which passed through the quality filter which ranged from 28 to 68 as

shown in Figure 27.

b) Insertions: The insertions statistics report detailed information about the

number of insertions present in the dataset which passed the quality filter

as shown in Figure 28.

c) Deletions: The deletions statistical report on the dataset detailed

information about the total number of deletions passing through the quality

filters as shown in Figure 29.

Page 76: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

58

Figure 27 Small variants summary along with the number of SNVs passing through the filter.

Page 77: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

59

Figure 28 Insertions statistics report with number of insertions passing through the filter.

Page 78: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

60

Figure 29 Deletions statistics report along with number of deletions passing through the filter.

Page 79: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

61

iii) Coverage summary: The coverage summary of the dataset details information on the amplicon mean coverage depth as shown in

Figure 30.

Figure 30 Coverage summary.

Page 80: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

62

BaseSpace also creates a PDF summary report for each sample analysed as shown in

Appendix 6. The PDF summary report includes detailed sample information which

includes total number of reads passing the filter and percentage Q30 score, amplicon

summary which includes the number of amplicon regions, total length of amplicon regions.

The PDF summary report also details information on read level statistics which includes

information on total number of aligned reads for 1 and 2 along with the percentage of

aligned reads. At base level statistics it includes information on percent Q30 bases, total

number of aligned bases, percentage of aligned bases out of the total bases and the

mismatch rates for read 1 and 2. The PDF report generated also provides information on

the small variants summary with total number of SNV’s, insertions and deletions. The

report also classifies the variants further in detail with number of variants at gene level, in

the exonic level, coding regions, UTR regions and in splice site regions and also further

classifies variants by their consequences i.e. frameshift, non-synonymous, synonymous,

stop gained and stop lost etc. The coverage summary in the report includes information

on amplicon mean coverage and uniformity of coverage both in tabulated and graphical

versions.

2.7.2 BaseSpace Variant Interpreter Beta

This is a cloud-based platform offered by Illumina for interpreting and reporting genomic

variants. This was used to see if there were any differences in the variant calling. It also

decreases the time and effort in designing complicated pipelines. VCF files were uploaded

along with the sample number, gender of the patient and other identifications. Once

submitted the data are imported and analysed. BaseSpace Variant Interpreter Beta

platform generates a report with the total number of variants. The variants can be filtered

generally based on prediction if they are pathogenic, likely pathogenic, benign and based

on the zygosity and genomic regions. Filtering of variants can also be performed based on

population frequency, annotation data bases, consequences and impact etc. Figure 31

shows identification of a pathogenic LDLR variant on Illumina Variant Interpreter Beta.

Page 81: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

63

Figure 31 Pathogenic LDLR on the Illumina Variant Interpreter Beta.

Page 82: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

64

2.7.3 Galaxy

Galaxy (Blankenberg, Kuster, et al., 2010) is a web-based, open source platform which

was used to perform the bioinformatics analysis for this study. This was used in addition to

BaseSpace, to identify if there were any differences in variant calling and to avoid biases.

Special training was undertaken to use this platform as a part of the Genomic Medicine

course. The platform also provides detailed auditable information on the bioinformatic

analysis. The history bar on the Galaxy analysis platform shows an audit of steps

undertaken to manipulate and analyse the data. A pipeline was specially designed and

constructed for the FH-NGS study as shown in Figure 37 by adding the analysis tools in a

step by step process as shown below.

1) Input data sets: The first step of the pipeline was to input the Fastq raw data for

read 1 and 2. There are options to upload files from local disk or from the web. As

soon as the data is uploaded, the file formats can be auto-detected or manually

assigned.

2) FastQC: Once the raw reads were uploaded, a quality check was performed on

the sequences by FastQC (Andrews, 2010). The aim of this step was to get an

idea on the quality of reads and to establish if the reads were of expected quality

before performing the analysis. The main function of FastQC is to provide a quick

overview of the quality and areas where there could be problems. It summarises

the findings on the basic statistics, per base sequence quality, per base sequence

quality scores, per base sequence content, per base GC content, per sequence

GC content, sequence length distribution, per base N content i.e. base calls which

could not be made with certainty, sequence duplication levels etc. Quality scores

were derived from the PHRED programme. This programme reads DNA

sequences and assigns quality scores to each base using the formula shown

below.

Q_PHRED = -10 log 10 (Pe)

Pe = error in base calling. All of the quality scores are stored in a file called QUAL,

which has a header line followed by integers. Low integers indicate probability of

the bases being called incorrectly and higher scores indicates the probability of

increased accuracy in base calls.

Page 83: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

65

Table 2 PHRED score table.

PHRED Quality Score Probability of incorrect base call

Base call accuracy

10 1 in 10 90%

20 1 in 100 99%

30 1 in 1000 99.9%

40 1 in 10000 99.99%

50 1 in 100000 99.999%

Figure 32 Range of quality scores over all reads at each position. Quality metrics are reported using a traffic light warning system, normal (green), abnormal (orange) and bad (red). Quality scores across all the bases are in the green region indicating they are good quality data.

Page 84: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

66

Figure 33 Average quality scores over all reads. Quality metrics indicates the mean sequence quality is well above Q30 indicating good quality data. A mean quality score below 27 indicates to 0.2% error rate and score below 20 equate to 1% error.

3) FASTQ joiner: In this step paired end FASTQ reads from two separate files are

joined into a single read in one file using the FASTQ joiner tool (Aronesty, 2013).

The joining is performed using sequence identifiers and if they are not present in

both the files they are excluded from the output.

4) Filter by quality: This tool filters the reads based on their quality scores. The

quality cut-off value and percent of quality bases in the sequence are assigned so

that any reads below the assigned quality will be filtered.

5) FASTQ splitter: This tool splits a single fastq data set representing a paired end

run into two data sets (Blankenberg, Gordon, et al., 2010). This tool works only

when both the reads are of same length.

6) Map with BWA-MEM: This alignment algorithm tool is used to align the

sequences against a reference genome (H Li & Durbin, 2009) (Heng Li & Durbin,

2010). NGS experiments generate millions of small reads which need to be

aligned or mapped to the reference genome. Using the standard alignment

algorithms for millions of reads is time consuming and not feasible. Burrows-

Wheeler Alignment (BWA) is generally used for DNA projects and performs

gapped alignments. BWA is most widely used, but its mapping algorithms are least

understood. This tool has a greater power to determine indels and SNPs. BWA

Page 85: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

67

automatically chooses between local and end to end alignments, supports paired-

end reads. This algorithm is robust to sequencing errors and is best suited to read

lengths greater than 70 bp. This tool takes in fastq files as the input files and the

produces the output files in BAM format.

7) FASTQC: This tool is used to perform the quality checks.

8) Flagstat: This tool tabulates descriptive statistics for the BAM dataset by using

samtools flagstat (H Li et al., 2009) as shown below.

9) Filter SAM or BAM, output SAM or BAM: This tool filters a SAM or BAM file on

the mapping quality, FLAG bits, read group, library or region. The input file type is

SAM or BAM format and the output will be SAM or BAM based on the chosen

option.

10) Flagstat: This tool uses samtools flagstat command to tabulate descriptive stats

for BAM dataset.

11) IdxStats: This runs samtools idxstats command to tabulate mapping statistics for

BAM dataset (H Li et al., 2009). The input file is a sorted and indexed BAM file and

the output is tabular with four columns representing reference sequence identifier,

reference sequence length, number of mapped reads and number of placed but

unmapped reads as shown in Figure 34.

Page 86: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

68

Figure 34 Screen shot of output table from IdxStats

12) Add or replace read groups: This tool adds or replaces read group information in

an input BAM or SAM files. Setting read groups correctly can simplify the analysis

greatly as multiple BMA files can be merged into one which will significantly reduce

the number of steps in the analysis.

13) Collect insert size metrics: This tool reads a SAM or BAM dataset and creates a

file with metrics containing statistical distribution of insert sizes and generates a

histogram plot as shown in Figure 35.

Page 87: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

69

Figure 35 Histogram of the insert sizes for all reads.

14) Depth of coverage: Depth and breadth of sequence coverage is very significant

for the identification as well as to accurately call the variants. In this step a set of

BAM files are processed to establish coverage at different levels. Coverage is

analysed per locus, gene or in total. Coverage can also be determined per sample

and read group and can be summarised by mean, median and by percentage of

bases covered etc. The input files are BAM format and the output files are in tables

(Blankenberg, Kuster, et al., 2010).

15) Unified Genotyper: This variant caller calls SNP and indels by using a general

Bayesian framework (DePristo et al., 2011) (Blankenberg, Kuster, et al., 2010).

GATK accepts an aligned BAM file as input file and the output files are in VCF

formats. It also generates some metrics about the base calls with number of visited

bases along with percentage of callable bases as shown below.

Page 88: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

70

16) ANNOVAR Annotate VCF: Annovar was used to annotate each variant in the

VCF file in respect to their location in genes and coding sequences (K. Wang, Li, &

Hakonarson, 2010). In Galaxy, ANNOVAR works only with hg19 VCF files and

hence they had to be changed from hg_g1k_v37 to hg19. The input file is a VCF

file format and the output file is a table of annotations for the variants in the VCF

data as shown in Figure 36.

Page 89: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

71

Figure 36 List of variants in the vcf file on Galaxy.

Page 90: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

72

17) Text reformatting: This tool runs the unix awk command on the ANNOVAR vcf

data and the output file is tabular data with a list of variants.

Page 91: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

73

Figure 37 FH-NGS Galaxy pipeline for annotation, filtering, alignment and variant calling. Pipeline designed for use in Galaxy. A PDF link is provided in

appendix 5 to enable viewing at a higher resolution.

Page 92: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

74

2.7.4 wAnnovar

wAnnovar is an online web-based tool to annotate variants obtained from high-throughput

sequencing. This was used since it was simple and easy and also to see if any new

variants were called in comparison to BaseSpace and Galaxy. This web- based tool

required basic information i.e. e-mail address of the person performing the analysis,

sample identifier, input file (VCF), name of the disease/phenotype and reference genome

etc. On completion of analysis, a link is sent to the provided e-mail address with a list of

variants.

2.8 Sanger Sequencing

Sanger sequencing was performed on five samples which had discordant results with the

reference lab results.

2.8.1 PCR primers

The PCR primers used for Sanger sequencing were as described in the literature (Lee et

al., 1998) (Whittall et al., 2010). The primers were supplied by Eurofins Genomics

(Eurofins Genomics, 2017) as shown in the Table 3. The forward and the reverse primers

were diluted with DNase free and RNase free water to a stock standard concentration of

100 pmol/µl. The stock primers were further diluted to working standard concentration of

5pmol/µl.

Table 3 Primer sequences used for Sanger Sequencing.

Gene Exon Primer Sequence (5’→3’)

LDLR Exon 4 Upstream GGACAGTAGCCCCTGCTCG

LDLR Exon 4 Downstream GGAGCCCAGGGACAGGTGATAG

LDLR Exon 14 Upstream GAATCTTCTGGTATAGCTGAT

LDLR Exon 14 Downstream GCAGAGAGAGGCTCAGGAGG

2.8.2 Initial PCR amplification

The target region of the LDLR gene was amplified using polymerase chain reaction

(PCR). The products were prepared in a 25µl reaction with 5U/µL Taq polymerase

(Thermo Start), 10x reaction buffer, 25 mM MgCl2 (ThermoScientific), 2mM dNTP’s

(ThermoScientific) and a working concentration of 5pmol/µl of each primer (Eurofins

Genomics). The PCR was run on G-Storm PCR thermal cycler for 35 cycles of 95ºC for 30

sec, 68ºC for 30 sec, 72ºC for 1 min and a final extension of 72ºC for 5 min. The PCR

products were run on an agarose gel and PCR products at the expected sizes of around

250bp were seen as shown in Figure 38.

Page 93: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

75

Figure 38 Electrophoretic gel displaying PCR products.

2.8.3 DNA Sequencing

2.8.3.1 Purification of PCR products by Qiagen MinElute kit

The MinElute system (Anon, 2008) was used in purification of the PCR products. It uses

spin-column technology and selective binding properties of specially designed silica

membrane to yield high end-concentrations of DNA fragments. DNA adsorbs to the silica

membrane in the presence of high salt concentration leaving the contaminants to drain

through the column (Anon, 2008). DNA was eluted according to the manufacturer’s

instructions using the Qiagen MinElute kit.

2.8.3.2 Sequencing PCR reaction

Using ABI’s BigDye Terminator v1.1 cycle sequencing kit (Thermo Fisher Scientific,

2016), sequencing PCR was set up for 20µl, with 4µl of BigDye sequencing reaction mix

(Applied Biosystems), 2µl of 5 X dilution buffer (Applied Biosystems), 2µl of the respective

sequencing primers (5pmol/µl) (Eurofins Genomics), 2µl of purified PCR product and 10µl

of distilled water. The sequencing PCR was run on G-Storm PCR thermal cycler for 25

cycles at 96ºC for 10 sec, 50ºC for 5 sec, and a final extension of 60ºC for 4 min.

2.8.3.3 Dye-Terminator removal

Dye-Terminator was removed using QIAGEN’s DyeEx 2.0 spin kit (QIAgen, 2012). The

spin columns were vortexed to re-suspend the resin and DNA was eluted according to

Page 94: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

76

manufacturer’s instructions. After centrifugation, the columns were discarded and the

eluate contained purified DNA. The eluted DNA (20µl) was then transferred to a 96 well

plate and dried at 70ºC for 1-2 hrs using the PCR machine block. After drying, 10µl of HiDi

Formamide (Applied Biosystems) was added to each well. The DNA was denatured into

single strands at 94ºC for 5 min and then cooled to 4ºC using the PCR machine block.

2.8.4 Sequencing electrophoresis

Sequencing electrophoresis was carried out on Applied Biosystems Hitachi 3500xL

Genetic Analyser. The plate records were set up on the analyser with the folder name as

FH-NGS under the sequencing folder. The plate was then loaded on the analyser and the

plate record was linked to the plate and the run was started. On completion of the run, the

sequences were loaded on the Mutation Surveyor® DNA Variant Analysis software

(Minton, Flanagan, & Ellard, 2011) for mutation identification.

Page 95: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

77

CHAPTER THREE

RESULTS

Page 96: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

78

3. Results A retrospective cohort of 94 DNA samples were analysed on Illumina MiSeq analyser.

These samples were previously analysed by the reference lab by standard molecular

diagnostic techniques. The reference lab samples had three distinct patterns i.e. 1) 36

samples were analysed by Bristol Panel, 2) 27 samples were analysed by the FH20 chip

and 3) for the remaining 31 samples it was not specified by which methodology they were

analysed. The reference lab had offered an FH service using Bristol panel which

consisted three levels of analysis. Level 1 consisted sequencing all 18 exons plus

promotor regions of the LDLR gene, exon 26 hotspot of the APOB, exon 7 of the PCSK9

along with MLPA analysis of all 18 exons and promotor regions of the LDLR gene. Level 2

services consisted extended analysis with sequencing of exons 1-6 and 8-12 of the

PCSK9 gene and level 3 services consisted of Cascade/familial mutation testing.

Elucigene FH20 kit was being used to detect the most common FH variants in specific

populations and was designed to replace the comprehensive genetic analysis (Sharma et

al., 2012). Elucigene FH20 kit detected point mutations, insertions or deletions in the

LDLR, APOB and PCSK9 genes and the kit was designed only to detect the 20

commonest mutations which were found in the UK population. FH 20 is now obsolete and

is not commercially available since they were able to detect only 20-50% of FH causing

variants and also was not recommended for cascade testing relatives of people with

confirmed FH. It was also not found to be cost effective when compared to the

comprehensive genetic analysis with greater QALY gains and was withdrawn from the

NICE diagnostic guidance (Sharma et al., 2012).

3.1 Reference lab findings

Out of 94 samples, a total of 36 patients had FH variants detected in their samples and

the remaining 58 patients had no variants identified in their samples as shown in Table 4.

Table 4 Reference lab findings.

Ref lab Positive Negative

31 Samples (Unknown source) 24 7

FH 20 (27 samples) 0 27

Bristol panel (36 samples) 12 24

TOTAL 36 58 Positive= FH variants detected, Negative= FH variants were not detected.

3.1.1 Spectrum of FH variants in the reference lab samples

The spectrum of FH variants identified in the 94 samples by the reference lab are

summarised in the Table 5. 17 patients had FH variants detected in the LDLR gene, 17

patients had FH variants in their APOB gene and 1 patient had a FH variant in the PCSK9

Page 97: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

79

gene. No variants were detected in the LDLRAP1 and APOE genes. 1 patient had FH

variants detected in both LDLR and APOB genes.

Table 5 Spectrum of FH variants in the reference lab findings.

Ref lab LDLR APOB PCSK9 APOE LDLRAP1 LDLR + APOB

31 Samples (Unknown source) 10 14 0 0 0 0

FH 20 (27 samples) 0 0 0 0 0 0

Bristol panel (36 samples) 7 3 1 0 0 1

LDLR: Low density lipoprotein receptor, APOB: Apolipoprotein B, PCSK9: Proprotein

convertase subtilisin/kexin type 9, APOE: Apolipoprotein E, LDLRAP1: Low density

lipoprotein receptor adaptor protein 1.

3.2 BaseSpace findings at UHS

The bioinformatic analysis on BaseSpace at UHS yielded a huge number of variants

across all the major FH causing genes. These data were tabulated on a MS Excel

spreadsheet and were also compared against the reference lab findings as shown in

Appendix 5. Out of the 94 samples analysed, BaseSpace analysis detected 62 patients

with FH variants whilst 32 patients had no FH variants detected in their samples as shown

in Table 6.

Table 6 UHS BaseSpace findings.

UHS Positive Negative

31 Samples (Source unknown) 28 3

FH20 (27 Samples) 12 15

Bristol (36 Samples) 22 14

TOTAL 62 32 Positive= FH variants detected, Negative= FH variants were not detected.

3.2.1 Spectrum of variants identified by BaseSpace

FH variants were detected in 62 of 94 patients. A total of 426 variants were identified in

LDLR gene, 493 variants in APOB gene, 486 variants in PCSK9 gene and 7 variants in

LDLRAP1 gene. In the 62 patients with FH variants, 40.4% of them were coding missense

variants, 9.4% were coding synonymous variants, 5.6% were intronic variants, 0.9 % was

coding frame shift variants and 0.9% was coding stop gained variants.

3.2.2 Pathogenic FH variants identified through BaseSpace

A total of 32 FH variants were detected in the LDLR gene of which 17 patients had known

pathogenic FH variants and 15 patients had variants whose pathogenicity was unknown

Page 98: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

80

or had conflicting reports of their pathogenicity as shown in Table 7. In addition, 18

patients had FH variants detected in APOB gene, 14 patients had FH variants detected in

their APOE gene and 1 patient had a known FH variant detected in the PCSK9 gene. No

variants were identified in the LDLRAP1 gene. Table 8 summarises the details of

pathogenic FH variants identified in the samples through BaseSpace.

Table 7 Pathogenic FH variants identified using BaseSpace.

UHS BaseSpace LDLR APOB PCSK9 APOE LDLRAP1

31 Samples (Source unknown)

17 (8 has conflictions in pathogenicity). 15 0 7 0

FH20 (27 Samples) 5 (3 has conflictions in pathogenicity). 0 0 3 0

Bristol (36 Samples)

10 (4 has conflictions in pathogenicity). 3 1 4 0

Variant rs2228671 in the LDLR gene had conflicting reports of pathogenicity in the

literature. Out of the 94 samples screened, 15 samples had this variant detected.

Table 8 Details of pathogenic FH variants identified in the samples through BaseSpace.

Variants Numbers rs Consequence

LDLR 11210912 C>T 15 rs2228671 Stop Gained

LDLR c.662A>G p.(Asp221Gly) 1 rs373822756 Missense Variant

LDLR c.1238C>T p.(Thr413Met) 1 rs368562025 Missense Variant

LDLR c.2547+73G>T 1 LDLR 11227625 LDLR c.1796T>C

p.(Leu599Ser) 1 rs879255025 Missense Variant

LDLR 11224119 Ch19 T>C LDLR p.(Ile451Thr) 1 rs879254874 Missense Variant

LDLR c.313+1G>A 3 rs112029328 Splice Donor Variant

LDLR c.680_681delAC p.(Asp227Glyfs*12) 1 rs387906305 Frameshift

LDLR c.2054C>T p.(Pro685Leu) 1 rs28942084 Missense Variant

LDLR c.2029T>C p.(Cys677Arg) 1 rs775092314 Missense Variant

LDLR c.1444G>A p.(Asp482Asn) 1 rs139624145 Missense Variant

LDLR c.1061A>T p.(Asp354Val) 1 rs755449669 Missense Variant

LDLR c.1048C>T p.(Arg350Ter) 1 rs769737896 Stop Gained

LDLR c.301G>A p.(Glu101Lys) 1 rs144172724 Stop Gained

APOB c.10580G>A p.(Arg3527Gln) 14 rs5742904 Missense Variant

APOB c.8882A>G p.(Asn2961Ser) 1 rs142756262 Missense Variant

APOB c.2981C>T p.(Pro994Leu) 1 rs41288783 Missense Variant

APOB codons 3110-3655: c.10294C>G p.(Gln3432Glu) 1 rs1042023 Missense Variant

APOB c.11833A>G p.(Thr3945Ala) - class 2 1 rs1801698 Missense Variant

PCSK9 c.385G>A p.(Asp129Asn) 1 rs778738291 Missense Variant

APOE (45411941 T>C) 14 rs429358 Missense Variant Known FH causing variants detected on LDLR, APOB, PCSK9 and APOE genes by

BaseSpace. Numbers= Number of patients carrying the variant, rs= reference SNP.

Page 99: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

81

3.2.3 Variants of unknown significance identified through BaseSpace

A total of 1,352 variants of unknown significance (VUS) were identified through

BaseSpace analysis in the 94 samples as shown in Table 9. In the LDLR gene alone,

there were 394 variants whose pathogenicity or significance was unknown, in APOB there

were 466 variants, in PCSK9 there were 485 variants and in LDLRAP1 there were 7

variants of unknown significance.

Table 9 Variants of unknown significance.

UHS BaseSpace LDLR VUS

APOB VUS

PCSK9 VUS

LDLRAP1 VUS

31 Samples (Source unknown) 151 203 177 5

FH20 (27 Samples) 122 115 159 0

Bristol (36 Samples) 121 148 149 2 BaseSpace analysis on 94 samples identified a large number of variants across LDLR,

APOB, PCSK9 and LDLRAP1 whose significance was not known on the dbSNP short

genetic variation database.

UHS BaseSpace findings have identified variants in LDLR, APOB and PCKS9 genes

similar to findings of the Latvian study group (Radovica-Spalvina et al., 2015). Based on

the evidence from the literature, the Latvian study group had categorised these variants

into various categories from 1 to 6. Table 10, below summarises some of the variants

identified in our study similar to the Latvian study. Most of the PCSK9 variants shown in

the table below from our study have also been previously reported by the Humphries

study group through Single Strand Confirmation Polymorphism (SSCP) analysis (S E

Humphries et al., 2006). By identifying some of the common variants, this study has also

shown that our NGS approach has higher sensitivity and specificity rate.

Page 100: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

82

Table 10 Summary of some of the variants identified through BaseSpace similar to the Latvian study group.

CAT Gene rs code Patient no.s

Variant Description with references

3 APOB rs1801696 14 p.(Glu2566Lys) Hypertriglyceridemia (Johansen et al., 2010)

4 APOB rs142151703 2 c.*179G>A No information

4 APOB rs1801695 4 p.(Ala4481Thr) Associated with HDL (Edmondson et al., 2011), risk of dementia (Reynolds et al., 2010), seen in individuals with low TG levels (Neale et al., 2011).

4 APOB rs1042034 66 p.(Ser4338Asn) Common variant (Radovica-Spalvina et al., 2015), benign (Mäkelä et al., 2012)

4 APOB rs1801702 1 p.(Arg4270Thr) Associated with TC and LDL-C(Liao et al., 2008)

4 APOB rs1042031 36 p.(Glu4181Lys) Common variant (Radovica-Spalvina et al., 2015), benign (Mäkelä et al., 2012), influences BMI (Bentzen, Jorgensen, & Fenger, 2002)

4 APOB rs1801701 18 p.(Arg3638Gin) Common variant (Radovica-Spalvina et al., 2015), associated With LDL-C (Benn et al., 2008), influences BMI (Bentzen et al., 2002)

4 APOB rs533617 7 p.(His1923Arg) Probably damaging to APOB (Mäkelä et al., 2012), seen in individuals with low TG (Neale et al., 2011)

4 APOB rs679899 67 p.(Ala618Val) Probably damaging to APOB (Mäkelä et al., 2012), associated with kidney disease (Yoshida et al., 2009), common variant (Radovica-Spalvina et al., 2015).

4 APOB rs72653061 1 c.905-15G>C No information

4 APOB rs1367117 54 p.(Thr98Ile) Common variant (Calandra et al., 2011), benign (Mäkelä et al., 2012)

6 APOB rs676210 30 p.(Pro2739Leu) Hypercholesterolaemia (Benn et al., 2008), common variant (Radovica-Spalvina et al., 2015)

6 APOB rs12691202 5 p.(Val730Ile) As a compound heterozygote with Arg490Trp influences severity of hypobetalipoproteinaemia (Lam et al., 2012)

4 LDLR rs2738442 58 c.1060+7T>C No association with FH (Al-Khateeb et al., 2011)

4 LDLR rs12710260 37 c.1060+10G>C Found in Greek FH patients (Dedoussis et al., 2004), common

Page 101: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

83

variant (Radovica-Spalvina et al., 2015)

4 LDLR rs11669576 7 p.(Ala391Thr) Associated with increased risk of stroke (Frikke-Schmidt, Nordestgaard, Schnohr, & Tybjaerg-Hansen, 2004), found in FH patients (Salazar et al., 2002).

4 LDLR rs14158 43 c.*52G>A Probably benign (Amsellem et al., 2002), common variant (Radovica-Spalvina et al., 2015)

4 LDLR rs17249029 1 c.*338G>A No information

4 PCSK9 rs45448095 21 c.-64C>T Found in hypercholesteraemic individuals (Leren, 2004), common variant (Radovica-Spalvina et al., 2015)

4 PCSK9 rs2483205 59 c.658-7C>T Found in FH individuals (Tosi et al., 2007), common variant (Radovica-Spalvina et al., 2015)

4 PCSK9 rs2495477 36 c.799+3A>G Found in FH individuals (Leren, 2004), common variant (Radovica-Spalvina et al., 2015)

4 PCSK9 rs562556 55 p.(Val474Ile) Found in both low and high LDL-C groups (Miyake et al., 2008), common variant (Leren, 2004)(Al-Waili et al., 2013)(Scartezini et al., 2007)

4 PCSK9 rs505151 57 p.(Gly670Glu) Associated with severity of atherosclerosis (S. N. Chen et al., 2005), found in hypercholesterolemic individuals (Leren, 2004), common polymorphism (Leren, 2004), common variant (Radovica-Spalvina et al., 2015)

4 PCSK9 rs28362287 1 c.*75C>T No information

4 PCSK9 rs11800231 8 c.524-11G>A Found in FH (Leren, 2004)(S E Humphries et al., 2006) and healthy individuals (Radovica-Spalvina et al., 2015)

4 PCSK9 rs11800243 8 c.657+9G>A Found in FH (Leren, 2004)(S E Humphries et al., 2006) and healthy individuals (Radovica-Spalvina et al., 2015)

UHS BaseSpace variants on LDLR, APOB and PCSK9 genes similar to the Latvian study group. CAT= Designated category of variant by

Latvian study group. 3= other rare, non-synonymous and potential splice site variants, 4= other rare, synonymous variants and common

variants, 6= variants with opposite effect. TG= Triglycerides, LDL-C= Low Density Lipoprotein- Cholesterol.

Page 102: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

84

3.2.4 Combination of variants in different FH causing genes

A total of 17 patients had two or more variants identified in the same gene or in a different

candidate gene known to cause FH. Out of the 17, 10 patients had variants in LDLR and

APOB genes, 4 patients had variants in LDLR and APOE genes and 3 patients had

variants in APOB and APOE genes as shown in Table 11. There were 2 patients from the

Bristol sample group who had 2 variants in the LDLR gene along with a variant in the

APOB gene. Furthermore, 1 patient from the unknown source group had 2 variants

identified in the APOB gene along with a variant in the APOE gene.

Table 11 Combination of variants in the FH causing genes.

UHS BaseSpace LDLR + APOB

LDLR + APOE

APOB + APOE

31 Samples (Source unknown) 8 4 2

FH20 (27 Samples) 0 0 0

Bristol (36 Samples) 2 0 1

Combinations of variants were identified by the BaseSpace. 4 patients had 2 variants in the

LDLR gene and 2 patients had 2 variants in the APOB gene.

3.3 Comparison of UHS BaseSpace findings with the reference lab

findings

As mentioned earlier, there were three patterns of sample sources identified in the

samples which were analysed for FH variants. There were a total of 31 samples whose

source or method of analysis was not known. 27 samples were analysed by the FH20 chip

method and 36 samples were analysed by the Bristol Panel.

3.3.1 31 samples (source or method of analysis unknown)

Out of the 31 samples, the reference lab findings showed only 24 samples to have FH

variants and 7 patients had no FH variants detected. In contrast, our findings showed 28

patients had FH variants and only 3 had no FH variants detected. On analysing the

spectrum of FH variants, the reference lab detected 10 FH variants in LDLR gene and 14

FH variants in APOB gene, where as our BaseSpace detected 17 FH variants in the LDLR

gene and 18 patients with FH variants in the APOB gene and 7 FH variants in the APOE

gene. In addition there were 151 FH variants in LDLR gene, 203 FH variants in APOB

gene, 177 FH variants in the PCSK9 gene and 5 FH variants in the LDLRAP1 whose

pathogenicity was not known, as shown in Table 12.

Page 103: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

85

Table 12 Comparison of findings in 31 samples whose source was unknown.

Samples Positive Negative LDLR APOB APOE LDLR VUS

APOB VUS

PCSK9 VUS

LDLRAP1 VUS

31 Samples (Source unknown) 24 7 10 14

UHS 28 3 17 18 7 151 203 177 5 UHS BaseSpace analysis was able to identify more FH variants in patients when compared

to the reference lab. There were also variants in FH genes whose pathogenicity or

significance was not known.

3.3.2 FH 20 samples

The reference lab detected no FH variants in the 27 samples which were analysed by the

FH20 chip. Our BaseSpace analysis detected 8 patients with known FH variants; no

known FH variants were detected in 19 patients. There were 5 patients known to have

known FH causing variants in the LDLR gene and 3 patients had FH causing variants

detected in the APOE gene. In addition there were 122 variants detected in the LDLR

gene, 115 variants in the APOB gene and 159 variants in the PCSK9 gene whose

pathogenicity was not known as shown in Table 13.

Table 13 Comparison of FH variants in 27 samples of FH20.

Samples Positive Negative LDLR APOB APOE LDLR VUS

APOB VUS

PCSK9 VUS

LDLRAP1 VUS

FH 20 0 27 UHS 8 19 5

3 122 115 159

UHS BaseSpace analysis was able to identify FH variants in 8 patients out of the 27 samples

in the FH20 group.

3.3.3 Bristol 36 samples

The last group of 36 samples were analysed by the Bristol panel. The reference lab

detected FH causing variants in 12 patients and the remaining 24 patients had no FH

causing variants detected in their samples. In total, there were 8 patients who had FH

causing variants in the LDLR gene, 3 patients had FH causing variants in the APOB gene

and 1 patient had FH causing variant detected in the PCSK9 gene. In contrast, our

BaseSpace findings detected FH causing variants in 22 patients and in the remaining 14

patients there were no known FH causing variants detected. Overall, a total of 10 known

FH causing variants were detected in the LDLR gene, 9 known FH causing variants were

detected in the APOB gene, 4 known FH causing variants in the APOE gene and 1 known

FH causing variant in the PCSK9 gene. In addition, there were 121 variants in the LDLR

gene, 148 variants in the APOB gene, 149 variants in the PCSK9 gene and 2 variants in

the LDLRAP1 whose pathogenicity was not known, as shown in Table 14.

Page 104: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

86

Table 14 Comparison of FH variants in the 36 Bristol panel samples.

Samples Positive Negative LDLR APOB APOE PCSK9 LDLR VUS

APOB VUS

PCSK9 VUS

LDLRAP1 VUS

Bristol(36 Samples) 12 24 8 3

1

UHS 22 14 10 9 4 1 121 148 149 2 UHS BaseSpace analysis was able to identify more FH variants in the Bristol sample group.

There were variants identified in the FH causing genes whose pathogenicity or significance

was not known.

3.4 Investigation of non-concordant results

On completion of the bioinformatics analysis on all the 94 samples, the UHS BaseSpace

findings were compared with the reference lab findings. Out of the 94 samples, there were

11 samples where variants could not be identified by UHS BaseSpace and were not in

concordance with the reference lab findings. All 11 samples had variants identified in their

LDLR gene by the reference lab as shown in Table 15.

Table 15 Samples which were not in concordance with the reference lab findings.

Sample number Variant rs

1107996864 LDLR c.662A>G rs373822756

1107996791 LDLR c.1238C>T rs368562025

1107996839 LDLR c.2042G>C rs201637900

1107996837 LDLR c.2547+73G>T

1107996821 LDLR c.661G>A rs875989906

1107996798 LDLR c.681C>G rs121908028

1107996822 LDLR c.662A>G rs373822756

1107996786 LDLR c.681C>G rs121908028

1107996029 LDLR c.2029T>C rs775092314

1107996006 LDLR c.1061A>T rs755449669

1107996070 LDLR c.1048C>T rs769737896 List of 11 samples which were not in concordance with the reference lab findings.

All 11 samples had variant in the LDLR gene.

The BaseSpace analysis results on these 11 samples were individually looked into. Out of

the 11 samples, BaseSpace was able to identify the nucleotide change in the sequences

in 6 samples but were not able to call them with a dbSNP number. These 6 samples

showed considerable total depth i.e. number of reads aligned at the respective position

and considerable alt depth i.e. number of reads containing the variant allele as shown in

Table 16. The exact position of the variants on the BaseSpace within the reference

chromosome were noted and searched on the NCBI LDLR gene using the genomic

sequence Chromosome 19 Reference GRCh37.p13 primary assembly. All the 6 samples

showed variants in the LDLR gene and were in concordance with the reference lab

results.

Page 105: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

87

Table 16 Identification of variants in BaseSpace by manual analysis.

Samples rs Location Variant Total depth Alt depth

1107996864 rs373822756 11216244 A>G

LDLR c.662A>G p.(Asp221Gly)

412 159

1107996791 rs368562025 11224005 C>T

LDLR c.1238C>T p.(Thr413Met)

307 166

1107996837

11240419 G>T

LDLR c.2547+73G>T Intron

449 222

1107996029 rs775092314 11231087 T>C

LDLR c.2029T>C p.(Cys677Arg)

195 85

1107996006 rs755449669 11222190 A>T

LDLR c.1061A>T p.(Asp354Val)

197 101

1107996070 rs769737896 11221435 C>T

LDLR c.1048C>T p.(Arg350Ter)

431 220

LDLR variants identified on BaseSpace which did not have a dbSNP identifier. Total depth=

Number of reads aligned at the respective position, Alt depth= Number of reads containing

the variant allele.

The remaining 5 (1107996821, 1107996798, 1107996822, 1107996786 and 1107996839)

samples did not show any variants in the BaseSpace analysis. These 5 samples were run

on the Galaxy through a pipeline specially constructed and designed for this FH-NGS

study as shown in Figure 35 in the methods section. Galaxy was also not able to identify

any variants in these 5 samples. Since there were no variants identified on the 5 samples

both on BaseSpace and Galaxy, the .vcf files of these 5 samples were loaded on

Integrated Genome Viewer (IGV) to see if there were any coverage issues. On loading the

.vcf files on the IGV it was found that there was no coverage in the respective regions and

hence the variants could not be detected as shown in Figures 39, 40 and 41.

Page 106: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

88

Figure 39 IGV screen shot showing no coverage in some of the exons and poor coverage in few exons in the LDLR gene. Faint grey colour indicates

no coverage and dark grey colour indicates good coverage of the exons.

Page 107: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

89

Figure 40 IGV screen shot on 4 samples showing no coverage around 11216263bp, 11216243bp and 11216244bp regions of the LDLR gene. 4 samples 1107996821, 1107996798, 1107996822 and 1107996786 on the IGV showing no coverage on the exon 4 of the LDLR gene.

Page 108: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

90

Figure 41 IGV screen shot on sample 1107996839 showing no coverage at location 11231100bp in the exon 14 of the LDLR gene.

Page 109: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

91

3.5 Sanger sequencing

Sanger sequencing was performed as a gap-filling analysis for this FH-NGS study since

no variants could be identified both by BaseSpace and Galaxy due to coverage issues.

Reference lab findings indicated the presence of FH variants in the LDLR gene on exons

4 and 14. Hence, targeted Sanger sequencing was performed to detect the variants in the

LDLR gene on exons 4 and 14. The sequences of five samples were loaded on the

Mutation Surveyor® DNA Variant Analysis software for identification of the variants in the

LDLR gene. Sanger sequencing identified variants in the LDLR gene on all 5 sample

which were in concordance with the reference lab findings (Table 17).

Table 17 Variants identified in the LDLR gene by Sanger sequencing.

Samples rs Variant

1107996821 rs875989906 LDLR c.661G>A p.(Asp221Asn)

1107996798 rs121908028 LDLR c.681C>G p.(Asp227Glu)

1107996822 rs373822756 LDLR c.662A>G p.(Asp221Gly)

1107996786 rs121908028 LDLR c.681C>G p.(Asp227Glu)

1107996839 rs201637900 LDLR c.2042G>C p.(Cys681Ser) LDLR variants were identified in 5 samples by Sanger sequencing. These variants

were not detected by BaseSpace due to the coverage issues.

Out of the five patients, two patients had LDLR c.681C>G p.(Asp227Glu) variant, one

patient had LDLR c.661G>A p.(Asp221Asn) variant, one patient had LDLR c.662A>G

p.(Asp221Gly) variant and one patient had LDLR c.2042G>C p.(Cys681Ser) variant as

shown in Figure 42.

Page 110: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

92

Figure 42 Sanger sequencing electropherogram showing LDLR variants. Mutation Surveyor® software was used to analyse and discover the LDLR

variants from the Sanger sequencing.

Page 111: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

93

CHAPTER FOUR

DISCUSSION

Page 112: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

94

4. Discussion FH is an autosomal dominant inherited disorder characterised by raised blood plasma

cholesterol levels. Increased plasma cholesterol can lead to its deposition in the

peripheral tissues, leading to xanthomas, and deposition in the arterial wall can lead to

CHD. It is estimated that there are 10 million subjects with FH worldwide. It is evident from

the literature, that early diagnosis and treatment can significantly reduce the mortality rate

and can have a significant impact on the life expectancy in FH patients (Finnie et al.,

2012) (Joseph L Goldstein, 2001).

The identification, early treatment and management of affected individuals still remains a

major challenge, and underdiagnosis of FH is a huge worldwide problem (Neil et al.,

2000). Diagnosis has often been through physical examination for xanthomas, complete

family history for CHD and a laboratory blood test for measuring blood plasma cholesterol

levels. It is evident from the literature that almost 50% of patients do not display any

physical signs (Colin A. Graham et al., 1999) and there is often no evidence of CHD in

heterozygous FH patients (Bhagavan & Jackson, 2002) and in some, plasma cholesterol

levels have been to be found around normal levels. Therefore, diagnosis based on just

physical symptoms and family history has proven to be insufficient and further genetic

testing is mandatory for FH diagnosis. Recent findings also suggest the prevalence of FH

has been higher than previously thought and the proportion of patients known to have FH

remains low. Therefore, genetic testing is pivotal to timely diagnosis and management of

these patients to prevent early mortality and morbidity. At present a combination of

genetic tests are being employed to screen and diagnose FH. This process is labour

intensive, time consuming and involves more staffing, the throughput is limited and the

associated costs are high. The sensitivity of these combined different approaches also is

found to be variable (Maglio et al., 2014).

Although genetic testing is the definitive way of diagnosing FH and has been widely

accepted as a gold standard, it is very important to acknowledge that LDL cholesterol is

the main driver for CVD risk rather than the presence of a molecular defect. In addition,

LDL cholesterol measurements also aid in tailoring therapies and management strategies.

However, genetic testing plays an important role in identifying variants which leads to

raised blood plasma cholesterol levels. Moreover, it is evident from the literature and also

from our study findings that FH phenotype could be due the presence of multiple variants

in a number of genes suggesting a polygenic cause to be the reason for raised blood

plasma cholesterol levels similar to the monogenic forms (Talmud et al., 2013). Hence it is

important to acknowledge biochemical diagnosis coupled with genetic diagnosis is the

most effective way to diagnose and screen patients suspected to have FH.

Page 113: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

95

In recent years, huge advances have been made in the area of genetic testing technology,

and a major aim is to translate these into routine diagnostics in order to improve our

current clinical diagnostic service. The ability of NGS platforms to process a large volume

of data in a short period of time makes NGS technology a powerful tool for screening and

diagnosing diseases on a large scale. In addition, the ability of NGS to multiplex several

samples as well as its ability to analyse multiple genes in a single run offers significant

cost reduction over single gene testing which makes it a powerful diagnostic platform of

choice to be used by the NHS of the future. With great developments in sequencing

technologies, single nucleotide polymorphism (SNP) has become the marker of choice in

the wide range of genetic analysis. SNP refers to a polymorphism i.e. change in a

nucleotide base at a particular position in comparison to the reference sequence.

Recently, researchers have shown that copy number variants in the LDLR gene can be

successfully identified by mining the NGS data through depth of coverage analysis

(Iacocca et al., 2017). They have successfully demonstrated and concluded that multiplex

ligation-dependant probe amplification (MLPA) can be removed from the routine

diagnostic screening process of FH and thereby significantly reducing the associated

costs, resources and analysis time (Iacocca et al., 2017).

This study has demonstrated that NGS can be used to effectively screen and diagnose

FH without using much of the traditional genetic tests. In addition, this study has also

demonstrated that a targeted NGS approach is very effective in the screening and

diagnosis of FH, since it greatly reduces the genetic screening range in comparison to

whole exome sequencing (WES). It is also cost effective when compared to WES and

also the data analysis can be effectively performed with greater efficiency. Furthermore,

development of NGS for routine diagnostics in the NHS can aid an efficient, rapid and

affordable mutation analysis methodology in other areas beyond FH.

A total of 94 samples from patients known and suspected to suffer from FH were

screened using targeted next generation sequencing on LDLR, APOB, PCSK9, LDLRAP1

and APOE genes using the Illumina MiSeq® system. Out of the 94 samples analysed,

BaseSpace analysis detected 62 patients with FH variants whilst 32 patients had no FH

variants detected in their samples. A total of 32 FH variants were detected in the LDLR

gene of which 17 patients had known pathogenic FH variants and 15 patients had variants

whose pathogenicity was unknown or had conflicting reports of their pathogenicity in the

literature. In addition, 18 patients had FH variants detected in APOB gene, 14 patients

had FH variants detected in their APOE gene and 1 patient had a known FH variant

detected in the PCSK9 gene. No variants were identified in the LDLRAP1 gene. Several

variants detected in this cohort as shown in Table 8 of the results section were all well-

Page 114: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

96

known variants previously described in the literature. In addition, a total of 1,352 variants

of unknown significance (VUS) were identified through BaseSpace analysis in the 94

samples. In the LDLR gene alone, there were 394 variants whose pathogenicity or

significance was unknown, in APOB there were 466 variants, in PCSK9 there were 485

variants and in LDLRAP1 there were 7 variants of unknown significance. Some of these

variants shown in Table 10 of the results section were previously reported by the Latvian

study group.

The UHS BaseSpace findings were compared with the reference lab findings. Our NGS

approach was able to identify FH variants in the 26 patients out of 58, in whom no FH

variants were previously detected by the reference lab. In 6 samples, BaseSpace was

able to identify FH variants, but was not able to designate them as pathogenic. The exact

positions of these variants on the BaseSpace were located in the NCBI LDLR genomic

sequence using the Chromosome 19 Reference GRCh37.p13 primary assembly. All 6

samples showed FH variants in the LDLR gene and were in concordance with the

reference lab findings. However, there were 5 samples where BaseSpace was not able to

identify FH variants in the LDLR gene. There could be number of reasons for not being

able to identify the FH variants in those 5 samples. On review of the literature these could

be drawn into 6 areas: (i) sample quality, (ii) edge effect, (iii) technical issues, (iv)

amplicon issues, (v) panel design and (vi) analytical and clinical validity of the software.

4.1 Investigation of non-concordant results

4.1.1 Sample quality

High quality, intact pure DNA is essential for accurate and reliable NGS. For successful

high quality library preparation it is essential to have a high quality genomic DNA as

source material. As evident from the literature, the DNA should be freshly purified, free

from DNases and have an OD260/280 ratio between 1.8 to 2.0 and an OD260/230 ratio of 2.0

to 2.2 (Endrullat, Glökler, Franke, & Frohme, 2016) on the NanoDrop®. The ratio of

absorbance at 260 and 280 nm is used to assess the purity of nucleic acid and a ratio of

~1.8 is generally accepted as ‘pure’ for DNA (Wilfinger, Mackey, & Chomczynski, 1997),

with a lower ratio indicating protein contamination. OD260/230 is generally used as a

secondary measure of nucleic acid purity and a ratio below 2.0 indicates the presence of

contaminants (Wilfinger et al., 1997) such as salts and solvents (phenol). In the 5

discordant samples, 4 samples had an OD260/280 ratio above 1.8 except one sample

1107996798 as shown in the Table below which had a ratio of ~1.73. Three out of the 5

discordant samples had an OD260/230 less than ~2.0 as shown in the Table 18 which

indicated the presence of contaminants. The genomic DNA concentrations were also low

Page 115: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

97

in two of the five samples by NanoDrop®. Sample 1107996798 had a DNA concentration

of 50.7ng/µl and sample 1107996822 had a DNA concentration of 45.6ng/µl. Qubit®

findings also illustrated the differences in the genomic DNA concentration in comparison

to the NanoDrop® findings as shown in Table 18. It is also important to note that 3 of the 5

samples (1107996798, 1107996822 and 1107996786) had genomic DNA concentrations

below the minimum input recommendations (50ng) when measured by Qubit®. Therefore,

with the Qubit® and NanoDrop® findings and through the review of literature, it can be

concluded that a combination of contaminants and low DNA concentration in the samples

could have been one of the reasons for non-concordant results in comparison to the

reference lab findings.

Table 18 NanoDrop® and Qubit® findings on the 5 discordant samples.

Sample 1107996839 1107996821 1107996798 1107996822 1107996786

OD260/280 1.89 1.93 1.73 2.4 1.98

OD260/230 2.37 2.02 1.37 1.99 1.9

Nano drop® (ng/µl)

303.1 164.8 50.7 45.6 109.6

Qubit® (ng/µl) 129 60 11.9 12.4 38 OD260/280 = Ratio of absorbance used to assess the purity of DNA and RNA, OD260/230 = Ratio of

absorbance used to assess the nucleic acid purity. ng= nanogram.

4.1.2 Edge effects

Edge effect is a phenomenon which mainly affects the peripheral wells of the plate during

the incubation phase which can alter the volume in the well, its concentration and the

evaporation rates. Edge effect is a major yet understated problem, which contributes to

deterioration of assays in experiments (Lundholt, Scudder, & Pagliaro, 2003). The causes

of edge effects are complex and evaporation during the incubation phase has been one of

the major problems. There are several reports on edge effect in microplate-based

experiments which are due to differences in thermal gradients (Burt, Carter, & Kricka,

1979) (Oliver, Sanders, Douglas Hogg, & Woods Hellman, 1981) and due to differential

adsorption (Kricka et al., 1980) characteristics across the plate. Thermal gradients are

greatest in the edge wells of the plates (Lundholt et al., 2003). It has been a common

practice in many labs to avoid the use of peripheral wells on the plate to avoid the edge

effects. In our FH-NGS study, the locations of the 5 discordant samples on the 96 well

plate were identified so as to rule out the edge effect. The five samples, 1107996839,

1107996821, 1107996798, 1107996822 and 1107996786 were located in positions F02,

E05, C06, E06 and B07. The exact locations of the 5 samples on the 96 well plate

revealed that the edge effect could be excluded as reason for not being able to detect the

variants in the LDLR gene.

Page 116: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

98

4.1.3 Technical issues

Batch analysis issues or batch effects are a common and a very powerful source of

variation in the high throughput experiments (Leek et al., 2010). In the majority of the

cases, batch effect leads to increased variability in the outcome of the experiments and

also decreases the ability to detect the significant biological signals (Leek et al., 2010).

However, if properly dealt with, the problems caused by batch effects can be prevented.

Some of the common reasons for batch effects are technical, such as variations in

pipetting techniques, laboratory conditions, reagent lots, analysing personnel, laboratory

temperature, storage of reagents, equipment issues and training (Head et al., 2014). It is

evident from the published studies that batch effects and other technical artefacts are

critical issues to be addressed as the technical variability can be more damaging than the

biological variability (Leek et al., 2010). Nevertheless, as all the samples were processed

using the same lot of reagents and in the same, temperature-controlled, laboratory

environment, at least these factors can likely be excluded. Hence, I could narrow the

reason for non-concordant results to technical issues; I then started investigating further

by looking into the way the samples were processed by plotting the data on a

spreadsheet.

In our FH-NGS study, 94 samples were processed in 3 batches of 24, 36 and 34 samples.

On plotting the information on a spreadsheet (Excel), it was found that 4 of the 5

discordant samples were in batch two. The coverage file of the FH-NGS run from the

MiSeq analyser was also uploaded on a spreadsheet to investigate batch issues and

coverage issues as shown in Appendix 5. It was observed from the spreadsheet data that

there were batch testing issues, as a few samples from the batch 2 had poor coverage

across few amplicons. On reviewing the literature, I speculated and looked into batch

testing issues arising due to improper PCR amplification (Akbari, Hansen, Halgunset,

Skorpen, & Krokan, 2005) and improper mixing of reagents (McIntyre et al., 2011). PCR

amplification issues could be due to a number of reasons; using inappropriate number of

cycles in the PCR, inappropriate temperature, using incorrect PCR plates in the thermal

cycler i.e. using a round bottom PCR plate in a thermal cycler which accommodates only

flat bottom PCR plates. Nevertheless, all three batches of samples were analysed using

the same PCR programme, same thermal cycler and similar PCR plates and hence these

factors can likely be excluded. This then leaves the option of improper mixing which could

be the reason for the batch testing issues. It is generally known that pipetting skills are

very significant and play an important role in any assay which involves steps of manual

mixing using pipettes. Throughout the experiment, I used single channelled and multi-

channelled pipettes. Using multi-channelled pipette requires special practice and skills

Page 117: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

99

and even an experienced user can sometimes find it difficult to use. Usage of multi-

channelled pipette by inexperienced users can lead to pipetting errors and could lead to

assay failures. It could be speculated that pipetting and mixing reagents using multi-

channel pipettes could be the reason for issues in batch 2 samples.

Furthermore, the .vcf files of the 5 discordant samples were loaded on the IGV as shown

in Figure 38 and Figure 39 of the results section which showed no coverage in the LDLR

region where the variants were located. On mining the data on the spreadsheet, it was

evident that there were a few low performing amplicons which resulted in low coverage or

no coverage at all. With the available evidence on the spreadsheet and through IGV data,

it can be concluded that a combination of batch issues and some low performing

amplicons resulted in coverage issues which led to the non-concordant results in the five

samples.

4.1.4 Amplicon issues

PCR amplicon-based sequencing is widely used as a targeted approach for both DNA and

RNA workflows (Peng, Vijaya Satya, Lewis, Randad, & Wang, 2015). However, one of the

limitations with this approach is coverage issues due to failed amplicons or low

performance amplicons (Froyen et al., 2016) and also incomplete coverage arising from

the limits of multiplexing (Radovica-Spalvina et al., 2015). Inadequate coverage can result

in the failure of detection of variants leading to false-negative results for heterozygotes

(Wheeler et al., 2008) (Bentley et al., 2008). Identification of these low performing or failed

amplicons in a panel is very important for laboratories using them for routine diagnosis

(Yan et al., 2016). In addition, it is evident from the literature that massively parallel

sequencing approaches do not provide adequate coverage of all regions, particularly the

GC–rich regions in the hybridisation based capture methods and is not able to sequence

through homopolymer regions in pyrosequencing and semi-conductor based methods

(Faiz et al., 2014). In addition, studies have shown that in most of the NGS experiments,

low performing amplicons result in coverage issues (Samorodnitsky et al., 2015). In our

FH-NGS study, when the coverage file from MiSeq analyser was exported to an Excel

spreadsheet, it became evident that there were a few low performing amplicons and these

amplicons showed no coverage across all the 94 samples. I also loaded the .vcf files of

the 5 non-concordant samples on the IGV and it was clearly evident that there were no

coverage and hence we were not able to detect the variants in those 5 samples. However,

on performing gap filling analysis with Sanger sequencing on those 5 samples we were

able to detect the variants and show that they were in concordance with the reference lab

results. Therefore, it can be confirmed and concluded that coverage issues and a few low

performing amplicons were the reason for non-concordance in the 5 samples.

Page 118: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

100

4.1.5 Panel design

One of the limitations in our study is the panels were in silico designed and not wet lab

validated. This could be the reason for some of the low performing amplicons, as they

were not subjected to systematic optimisation and performance testing by the provider.

With NGS creating major impacts on our routine molecular diagnostic services, many

laboratories have started tailoring their custom panels through diagnostic companies who

offer the flexibility to design the panels of choice as per the requirements. Some

laboratories include only core genes based on the evidence in the literature and some

also include genes involved in the cholesterol pathways for which evidence is still

emerging. However, these panels need to be wet lab validated with considerable

optimisation (Jennings et al., 2017) before they can be used in routine diagnostics.

Optimisation and familiarisation is a process whereby samples are subjected to NGS test,

from library preparation to sequencing and variant calling so that it can be systematically

evaluated (Jennings et al., 2017). However, at present there are now wet lab validated, off

the shelf bespoke panels which can be used in routine FH screening services after

extensive verification but without the requirement for any in house optimisation and

validation. These off the shelf bespoke panels were not available at the time of our study.

At present, there are a handful of diagnostic companies which provide off the shelf

bespoke panels which have been optimised and wet lab validated which are capable of

detecting SNV’s and copy number variants (CNV’s) as shown in Table 19. In our opinion,

these off the shelf bespoke panels could be the future for targeted sequencing which

offers options for mix and match and also allows the labs to customise the required core

genes and hotspots.

Page 119: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

101

Table 19 Off the shelf bespoke FH panels.

Provider Name Genes Method Detects Weblink

Oxford Gene Technology

SureSeq myPanel™ NGS custom FH panel

LDLR, PCSK9, APOB, LDLRAP1, APOE, LIPA, STAP1 and Hotspots

NGS SNV, CNV https://www.ogt.com/products/1350_sureseq_mypanel_ngs_custom_fh_panel

Progenika Biopharma

SEQPRO LIPO IS LDLR, APOB, PCSK9, APOE, STAP1 (ADH) and LDLRAP1

NGS SNV http://www.progenika.com/other_regions/index.php?option=com_content&task=view&id=325&Itemid=415

GB HealthWatch Genetic tests

Familial Hypercholesterolemia Panel

ABCA1, APOB, LDLR, LDLRAP1, PCSK9, STAP1, APOA2, EPHX2, ITIH4

NGS SNV, CNV https://www.gbhealthwatch.com/gbinsight/gb-panels-info.php?catalog=GBDNA2031&target=gene

Ambry Genetics

FHNext APOB, LDLR, PCSK9, LDLRAP1 NGS SNV, CNV http://www.ambrygen.com/clinician/genetic-testing/13/cardiology/fhnext-1

CGC genetics Hypercholesterolemia, familial (NGS panel for 15 genes)

ABCA1, ABCG5, ABCG8, APOA2, APOB, APTX, EPHX2, GHR, ITIH4, LDLR, LDLRAP1, LIPA, LRP6, PCSK9, PPP1R17

NGS SNV http://www.cgcgenetics.com/en/by-specialty/3864

Blueprint Genetics

Hyperlipidemia Core Panel

APOB, LDLR, LDLRAP1, PCSK9 NGS SNV, CNV https://blueprintgenetics.com/tests/panels/cardiology/hyperlipidemia-core-panel/

Phosphorus diagnostics

Familial Hypercholesterolemia Panel

APOB, LDLR, LDLRAP1, PCSK9 NGS SNV, CNV http://phosphorus.com/diagnostics/familial-hypercholesterolemia-panel/

Fulgent Familial Hypercholesterolemia NGS Panel

APOB, LDLR, LDLRAP1, PCSK9, SLCO1B1

NGS SNV https://www.fulgentgenetics.com/familial-hypercholesterolemia

Page 120: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

102

The joint consensus recommendation of the Association for Molecular Pathology and

College of American Pathologists addresses the significance of NGS test development,

optimisation and validation, including recommendations on panel content selection and

rationale for optimisation and familiarisation phase conducted before test validation

(Jennings et al., 2017). In our study, the panel design and its in silico validation by the

provider was a clear limitation. Through this study we were able to demonstrate and show

problems of in silico designed panels and its implications on clinical validity. In our

opinion, these off the shelf bespoke wet lab validated panels can be time saving and more

cost effective to cater for the needs of the NHS laboratories without the need for further

optimisation and validation.

4.1.6 Analytical validity and clinical validity of the analytical software

The Centres for Disease Control and prevention (CDC) office of Public Health Genomics

has developed a framework to evaluate the validity and utility of the emerging genetic

tests based on several factors as shown in Figure 43. Considering these variables in the

context of bioinformatics is essential for successful implementation of NGS as well as to

provide a routine NGS diagnostic service in the NHS. In this context, analytical validity can

be defined as the ability of the test to detect the genotype of interest and clinical validity

can be defined as the availability of the scientific evidence to categorise the variant as

pathogenic or non-pathogenic i.e. the association of the variant with the disease.

Page 121: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

103

Figure 43 ACCE Model Process for evaluating genetic tests (Haddow & Palomaki, 2004). ACCE takes its name from four of its main criteria- analytical validity, clinical validity, clinical utility and associated ethical, legal and social implications. PPV: Positive predictive value, NPP: Negative predictive value, QC: Quality control.

In our FH-NGS study we used BaseSpace®; a cloud-based computing system provided

by Illumina, Galaxy and wANNOVAR platforms for bioinformatics analysis. The

bioinformatic analysis on all the 94 samples was performed using Illumina BaseSpace®.

Galaxy platform and wANNOVAR were also used to perform the bioinformatics analysis

only on samples which had discordant results in comparison with the reference lab

findings. Through the study findings I was able to come across one of the limitations of

Illumina BaseSpace®, whereby the BaseSpace® was able to identify a change in

nucleotide in the sequences but was unable to identify them as pathogenic variants and

there were 6 samples in our study with this issue. This suggests that BaseSpace® had

analytical validity but poor clinical validity, since it was able to identify the nucleotide

change but was unable to associate the variant with the disease. Through the study

findings it can also be acknowledged that, at present the clinical utility of the rich

bioinformatics data from the BaseSpace® is not being exploited to its fullest potential. One

Page 122: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

104

way to achieve this would be to perform more studies and in silico analyses to gather

more evidence on the variants of unknown significance so as to be able to categorise

them as pathogenic or non-pathogenic.

Through the investigations, I can conclude that coverage issues due to few low performing

amplicons were the main reasons for not being able to detect the variants in the 5

samples which were not in concordance with the reference lab findings. If the panel had

been subjected to intense wet lab validation by the provider, these low performing

amplicons could have been identified during the process and could have been optimised

before it was being sold to the users. Secondly, poor sample quality and batch testing

issues could have compounded with the coverage issues.

4.2 Barriers for implementation of an NGS-based FH mutation testing

strategy in the NHS

Establishing and implementing a new service in the NHS is very complicated and requires

rigorous planning. However, in this current era of the genomic revolution, with genomic

discoveries being made and published almost every week and with the rapid pace of

developments using NGS technologies, it is necessary for NHS laboratories to embrace

this technology at the earliest opportunity for the benefit of public healthcare. It is also

important to acknowledge that there is a need for training and the development of

healthcare professionals working within the NHS, so that they can gain the ability to

understand complex genetic and genomic information to potentially adopt the new

emerging branch of medicine, the so-called “personalised medicine” of modern clinical

care.

Although NGS has been the forefront of the genomic technologies, its implementation into

routine diagnostics in the NHS is slow due to bioinformatic challenges, high costs,

generation of unprecedented volumes of data which in turn creates challenges and

opportunities for data management, in its storage, computational resources, infrastructure,

analysis and interpretation. Hence, for successful implementation in the NHS, it is very

important to demonstrate the cost effectiveness of these genomic tests in routine

diagnostics along with their clinical utility and its efficacy in comparison to the existing

traditional methodologies.

4.2.1 Bioinformatic analysis

Bioinformatic analysis is a process through which sequence reads are aligned to a

reference genome for annotation and variant discovery. Generation of significant clinical

and diagnostic information from the raw sequencing data through bioinformatics analysis

is very important for any NGS experiments. However, one of the practical difficulties is the

Page 123: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

105

presentation of this complex data to clinicians in a way which could be easily understood

and used by them to effectively manage and treat their patients.

At present, there are large numbers of software suites available for bioinformatic data

analysis. These softwares are either open source or commercially available as supplied

by the provider. Most of the commercial providers cover the analytical process from read

alignment to variant annotation whereas in-house pipelines utilise different programmes

for read alignment, removal of duplicate reads, quality calibration, variant calling and

annotation (Deans et al., 2015).

Big data often poses complex bioinformatics challenges and expert bioinformatics

knowledge is required in the NHS laboratories to analyse and deal with complex

bioinformatics data. As evident from the house of commons report on genomics and

genome editing in the NHS, it is important to acknowledge that there are gaps in training

required for the implementation of genomic services in the NHS (Science and Technology

Committee. House of Commons., 2018). Hence, in the current climate, it is significant for

the NHS to invest in more training of existing scientists to study bioinformatics as well as

to recruit specially trained bioinformaticians in order to embed NGS into our routine

practice and also for successful implementation of NGS in our routine diagnostic services.

Hence, recruitment and retention of skilled bioinformatics workforce is essential for the

successful implementation of NGS in the NHS.

Use of Artificial Intelligence (AI) in interpreting and analysing complex bioinformatics data

could be one of the solutions for the NHS of the future. AI can handle and analyse large

amounts of data faster than humans. Making sense of the enormous amount of data

generated by the NGS analysers is a huge challenge and this is one of the areas where AI

excels humans (Loh, 2018). A recent study on genomics has shown that AI platform took

just 10 mins to analyse a genome of a patient with brain cancer and suggest a treatment

plan in comparison to human experts who took 160 hours (Wrzeszczynski et al., 2017). It

is evident from the literature that researchers are currently using machine learning to

identify patterns in the large volume genetic data sets (Libbrecht & Noble, 2015). These

patterns are then translated into computer models which may help to predict the

probability of an individual developing a disease or aid in designing potential therapies.

Major giants such as Google are already developing computational systems that can

learn, analyse and interpret genetic variations. Deep Variant an AI tool introduced by

Google is an analysis pipeline which uses neural networks to call variants from the NGS

datasets (Anderson, 2018). It is capable of identifying small insertions, deletions and

single-base-pair mutations in the sequencing data. Integration of AI in genomics will aid in

Page 124: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

106

analysis and interpretation of genomic data which is complex even for experts. These AI

approaches will also aid in quicker identification of diseases as well as design potential

drug therapies favouring a personalised medicine approach (Loh, 2018). Use of AI in

analysing and interpreting NGS data will aid the healthcare industry to make big leaps in

the coming years. However, only time will tell when these AI tools could be available in the

NHS.

4.2.2 Big data

NGS is considered to be a paradigm shift from conventional Sanger sequencing, in the

scale of the massively parallel sequencing of NGS technology. In the last few years the

ability of NGS platforms to produce large volumes of data in a short period of time has

rapidly increased and has outpaced Moore’s Law (Simó, Cifuentes, & García-Cañas,

2014) (Krampis & Wultsch, 2015). It is known that sequencing a single person’s genome

will generate 100’s of gigabytes of raw data and further analysis of this genome will

generate 1 gigabyte of data (Beigh, 2016). NHS laboratories are currently not equipped

and need to meet these infrastructure and computational challenges before implementing

any NGS analysis. The time necessary for the analysis of NGS data greatly depends on

the data volumes, hardware and software packages and computational power which exist

in the NHS laboratories. With the massive amount of data generated, there will always be

a trade-off between speed and accuracy of analysis and a balance would be required to

have an ideal alignment and computational efficiency to effectively analyse the data and

aid in timely diagnosis. However, majority of the NGS platforms are now integrated with

bioinformatics applications and tools and the introduction of benchtop analysers with

much simpler methodologies will aid the scientists to analyse and interpret the large scale

genomic data much quicker than before and therefore will pave the way for its translation

into routine diagnostics.

This complex big data and its analysis obviously poses great bioinformatics challenges

and opportunities to the laboratories and one of the big challenges for laboratories is

converting this big data into actionable clinical outcomes and storing them. Hence, it is

important to acknowledge that significant digital infrastructure is required and huge

amount of work needs to be done on the digital front in order to provide a routine genomic

service in the NHS (Science and Technology Committee. House of Commons., 2018).

Currently NHS England through Global Digital Exemplars (GDE Initiative) is safely

harnessing the power of technology to integrate and store all patient information digitally,

so that healthcare providers and patients can have access to all their data and can be

securely used to improve patient outcomes and the value of services. Through

digitalisation, clinicians can have access to all healthcare information concerning the

Page 125: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

107

patient, to aid in timely diagnosis and management. Already we are witnessing the

integration of computer systems in pathology and radiology to aid in diagnosis and provide

answers which are even difficult for experienced pathologists (Granter, Beck, & Papke Jr,

2017). Artificial Intelligence offers solutions and could be the new weapon in our fight

against diseases. AI can be used to mine the electronic health care records for patient

history, digital images, laboratory results and genomic data to aid in precision diagnosis.

Recent findings have shown the success of AI in accurate identification various medical

conditions ranging from the identification of cataracts in diabetic patients (Long et al.,

2017), classifying skin cancers in competence with expert dermatologists (Esteva et al.,

2017) and accurate prediction of cardiovascular risk (Weng, Reps, Kai, Garibaldi, &

Qureshi, 2017). Even the UK government has proposed plans for the NHS to use AI to

transform the diagnosis of many diseases including cancer and aid in early management

of these patients (Prime Minister’s office & The Rt Hon Theresa May MP, 2018).

Efforts should also be taken to streamline the digitalisation and integration of the genomic

data and should be considered as a part of the general digital reconfiguration in the NHS.

Systems should be developed to store the genomic data within the patient’s electronic

record and should be presentable in a way which is easily understandable and

interpretable by the healthcare professionals.

4.2.3 Cloud computing

Sequencing and analysing a person’s whole genome which comprises of 3.2 billion letters

of the DNA generates 100’s of gigabytes of data. The generation of huge data sets with

the NGS presents challenges in NHS laboratories by placing a great demand on the

computational resources and the skills required to handle the data. However, with the rise

of cloud computing new avenues have opened up for storing and analysing the big

sequencing data. In cloud computing, customers usually rent the hardware and storage as

much as they need and pay for the time rented as well as for the storage. Through cloud

computing, data can be remotely stored and analysis of data can also be performed

remotely. There is a great need to securely protect the data with strict protocols as privacy

protection in cloud computing is a major concern (Greenbaum, Du, & Gerstein, 2008)

(Stein, Knoppers, Campbell, Getz, & Korbel, 2015). If cloud services are to be used by the

NHS, the data needs to be encrypted before it can be stored in the cloud, and

consequently this raises issues to do with privacy protection laws and regulations. The

cost of sequencing has been dropping several times faster than the cost of storing data

(Stein, 2010). Very soon it will be not feasible to store all the data, and laboratories will

have to filter huge data sets to identify the informative ones alone and discard the rest.

Currently in the NHS the genomic data are stored in restricted access computers and are

Page 126: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

108

password protected. A scalable biocomputing infrastructure is essential in order to

implement the NGS technology in the NHS laboratories. At present, the NHS is harvesting

the power of technology to digitalise all patient information to provide exceptional care by

providing clinicians the access to all the patient information through world-class digital

technology. Through the GDE initiative, all the patient information will be digitally

encrypted and stored securely in cloud-based systems which can be securely accessed

by the relevant professionals. Through digitalisation and cloud computing, the NHS will

create a healthcare system free of paper at the point of care (NHS England, 2017).

4.2.4 Data governance

In the context of healthcare data, it is important to acknowledge the words of John Bell

“One of the most important resources held by the UK health system is the data generated

by the 65 million people within it” (J. Bell, 2017). Data governance in the NHS refers to the

handling of patient information in a confidential and secure manner to appropriate ethical

standards. Data governance provides a consistent framework to staff in the NHS to deal

with the availability, usability, integrity and security of patient information. Genomic data

needs to be protected and its privacy and confidentiality needs to be preserved. Artificial

Intelligence could be an option for the NHS of the future to safe guard the genomic data

by various processes such as by encrypting the data, protecting systems using passwords

and by having an audit of the data access, exchange and transfer. It is also important to

have local data governance policies along with the protection offered by the supplier.

Detailed auditable information is essential while dealing with confidential patient

information on its access and usage. Our study findings reveal that there is no auditable

information available on the bioinformatic analysis performed on the BaseSpace®. This is

very significant especially in an NHS setup; it is important and mandatory to have

auditable information on any analysis performed on the patient samples. The Galaxy

platform was used to perform bioinformatic analysis on samples which were discordant

from the reference lab findings. Through our study, we found that for auditable purposes

platforms such as Galaxy would be very useful in an NHS setup as there is auditable

information available on the analysis at each step right from uploading the raw .fastq files

till the results output stage. However, through our study we were also able to identify that

without any experience in bioinformatics or without the support from bioinformaticians,

designing a pipeline on Galaxy for bioinformatic analysis can be a daunting task. Hence, it

is important for the NHS to invest in training our professionals in bioinformatics as NGS

becomes an integral part of our routine molecular diagnostic services. As evident from

literature (Desai & Jere, 2012) and from the study findings, I conclude that in the current

Page 127: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

109

NHS diagnostic molecular laboratories, bioinformatic data analysis will be one of the

biggest challenge for implementation of NGS in routine clinical diagnostic services.

4.2.5 Costing

It is well known through the literature that post completion of the Human Genome Project,

the costs of sequencing the human genome have considerably dropped. It took

approximately 10 years to sequence the first human genome and then 3 years for

completing the data analysis, with a price tag of approximately 3 billion USD (Illumina,

2012a). Thus, with recent developments in NGS technology in the post genomic era the

costs have been considerably brought down which has allowed experiments to be

performed which were once unimaginable. Currently, through NGS we have developed

the capability to sequence five human genomes in a single run and produce data within a

week at a price tag of 5000 USD per genome (Illumina, 2012b). This dramatic price

decrease, especially with targeted NGS in comparison to whole genome sequencing

(WGS) and whole exome sequencing (WES), has allowed us to witness genomic

discoveries being made and published almost on a weekly basis. At present, the

economic impact of additional incidental findings through WGS and WES is still unknown

and hence our targeted approach would be the model of choice for the NHS laboratories.

Our targeted NGS approach in the diagnosis and screening of FH by multiplexing large

number of samples and simultaneously sequencing them in a single run is a much better

substitute for WGS and WES since it can be performed at a lower fraction of the cost in

comparison to those global approaches. In addition, our targeted NGS approach also

generates less data and has less turnaround times in comparison to the WGS and WES,

and indeed similar views have been expressed by many others as witnessed from the

House of Commons report on Genomics and Genome Editing in the NHS (Science and

Technology Committe. House of Commons., 2018). At present, there are off the shelf

bespoke panels which can also detect CNV’s, eliminating the need for further traditional

approaches and WGS. These targeted panels could therefore be the best option for busy

NHS laboratories. They would be the most cost-effective option, as well as time-saving for

busy scientists, eliminating the need for further extensive wet lab validation and

optimisation. A recent study finding (Futema et al., 2012) also supports our view that a

targeted NGS approach through MiSeq platform can significantly reduce the costs as well

as reduce the time spent on data analysis. In conclusion, targeted sequencing can be

more cost-effective in comparison to genome-wide approaches, and furthermore preclude

the ethical implications and the associated issues surrounding incidental findings.

Page 128: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

110

4.2.6 Health Economics

Health economics is a branch of economics which studies the efficient allocation of

healthcare resources (Whittington, 2008). It informs decision-makers on which health

treatments, medications or technologies to support, and from which areas to disinvest, in

order to maximise population health (Whittington, 2008). NGS technologies have

revolutionised our scientific communities and has allowed us to understand and practice

medicine at a level which was once unthinkable. However, despite its success and major

impact, the translation of NGS from research into routine diagnosis has been limited due

to the lack of economic evidence. Despite the lack of economic evidence, optimism still

exists in relation to the integration of genomic technologies into routine clinical settings so

as to reduce overall healthcare expenditures (S. G. Nicholls et al., 2013). Recently,

however, this optimism has been scaling down among the scientific community as we

have been witnessing a steep rise in the costs of downstream analysis. The cost of

sequencing per base has been falling, however, conversely the cost of bioinformatics

analysis, variant identification, clinical interpretation and reporting has been increasing

(Christensen, Dukhovny, Siebert, & Green, 2015) (Buchanan, Wordsworth, & Schuh,

2013) (Science and Technology Committe. House of Commons., 2018).

Economic evaluations of implementing NGS in an NHS laboratory setup should take into

account of the costs right from the point of sample collection, laboratory testing, data

analysis, reporting of test results, training, resources and the infrastructure. The economic

impact of laboratory results in the timely diagnosis, treatment and management of patients

can be huge. It may be argued that the laboratory testing and the analysis costs can be

insignificant when compared to the downstream costs associated with medical procedures

and interventions ordered in response to the laboratory findings (Christensen et al., 2015).

However, it is evident from the literature that health economic studies strongly argue in

favour of systematic screening of FH (Brice, Burton, Edwards, Humphries, & Aitman,

2013). The UK could save almost £380 million by preventing CHD events if the relatives of

the index cases were identified and treated over a period of 55 years, equating the

savings to £6.9 million per year (Heart UK, 2012). In addition, the findings of a study on

FH has shown that structured models of care undertaken in their study have proven to be

more cost-effective than estimated by the NICE (Pears et al., 2014). It is also evident from

the literature that detection and treatment of FH is cost-effective (Alonso et al., 2008)

(Marang-van de Mheen, Ten Asbroek, Bonneux, Bonsel, & Klazinga, 2002) (D Marks et

al., 2000) when compared with the healthcare interventions at a later stage due to CHD.

Similarly, a recent study in the UK has also shown cascade testing on relatives suspected

to have FH was found to be highly cost-effective and the adoption of cascade testing will

Page 129: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

111

yield substantial quality of life and survival gains (Kerr et al., 2017). In addition, with

particular reference to a targeted NGS screening approach, it also well-known from the

literature that applying this approach for FH is more cost-effective and accurate compared

to WES (Brautbar et al., 2015) (Futema et al., 2012).

In this FH-NGS study, the cost of the reagents per sample was approximately £54.82

which was based on the reagent costs in April 2016. The cost per sample is heavily

dependent on the instrument usage and without prior expert knowledge in economics it

was not possible to calculate the equipment usage cost and labour costs. This could have

probably incurred an additional cost of about £200 approximately in addition to the

reagent costs.

Compounding all these issues to perform a health economics evaluation of FH-NGS was

beyond the scope of this study and requires detailed systematic analysis on its own and

expert academic knowledge in the field of economics to be able to fully assess the health

economics of FH-NGS in NHS laboratories. It is also important to note that economic

evaluations of genomic technologies in particular pose a number of methodological

challenges to health economists itself (Buchanan et al., 2013). It is also evident from the

literature that health economists are also unsure if the current existing economic

evaluation models are sufficient to evaluate these genomic technologies (Buchanan et al.,

2013). However, it is necessary to acknowledge that the health economics evaluation of

FH-NGS could significantly contribute to an efficient allocation of the available scarce

resources in NHS laboratories. One of the ways to decrease the cost in the NHS

laboratories is to streamline the workflow, increase the number of samples being analysed

as the cost per sample are heavily dependent on the instrument usage, integrating NGS in

the routine molecular diagnostics in areas beyond FH, and automating the full process

(Yohe & Thyagarajan, 2017).

4.2.7 Genetic services reconfiguration

Genomics England, through their ‘100,000 Genome Project’ is providing a roadmap for

NHS molecular laboratories in the UK to learn and embrace the NGS technology in their

routine diagnostics and thereby provide a personalised or precision medicine to the

patients. It is expected, on completion of the ‘100,000 Genome Project’, NHS England in

partnership with Genomics England will be involved in embedding genomic medicine into

routine NHS care pathways (Science and Technology Committe. House of Commons.,

2018). The ‘100,000 Genome Project’ has the potential to catalyse and bring about a

system change in the NHS diagnostics very soon. This however, brings great challenges

Page 130: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

112

as well as opportunities to the NHS molecular laboratories and hence it is very important

to engage all the key stake holders including the public.

The revolution in sequencing technology through NGS has led to the production of

sequencing machines in size from the giant ones to the bench top ones. These analysers

can generate massive amount of data and swamp the current computing infrastructures in

the NHS laboratories. They also bring along complex challenges as regards data analysis

and storage. Hence, Health Education England started funding to provide more training to

healthcare professionals in the NHS to enhance their understanding and knowledge in this

emerging field by developing programmes such as MSc in Genomic Medicine and through

some online courses in genomics and bioinformatics. These courses are specially

designed for healthcare professionals working in the NHS to improve their skills as well as

to integrate genomic technologies into the patient care in the NHS. However, it is

important to acknowledge that expert knowledge in bioinformatics will be required to deal

with complex bioinformatics challenges if NGS is to be implemented in routine NHS

laboratories. Furthermore, infrastructure, resources and computational requirements pose

additional challenges to NHS laboratories. Hence, there is a planned reconfiguration of

regional services with the required infrastructure, computing facilities and professionals

with great expertise and experience in genomics to provide comprehensive and

specialised genetic services (Bruce Keogh, 2015). Currently, plans are underway to have

a network of genomic hub laboratories and local genomic laboratories to deliver a specific

list of genomic tests and a central data repository connected to a wider NHS digital

infrastructure (Science and Technology Committe. House of Commons., 2018). This

opportunity could be used to embrace NGS in NHS routine diagnostic services for

screening and diagnosis of simple and single gene disorders such as FH. Our FH-NGS

study serves as an example of how an existing NHS molecular laboratory can start

piloting this new technology in their diagnostic services. It also can accelerate the

integration and uptake of NGS into routine NHS diagnostics. It can be argued here that

implementing NGS locally has its own values and benefits; the clinicians can tailor

effective management strategies in a timely manner, sample turnaround times can be

significantly reduced, costings involved in sample transport to the labs can be saved,

scientists can become specialised and develop expertise in the relevant areas. Regional

centres with big infrastructure, resource, computational facilities and experts in the field

can aid in providing expert knowledge and skills to the developing NHS labs, and should

be able to deal with complex cases which the local labs could not deal with and offer

training to the small NHS labs. However, in the current NHS landscape it is not clear how

the genetic reconfiguration would happen and what impact it would have on the molecular

Page 131: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

113

diagnostic labs in the NHS, and whether FH testing should be adopted locally or be

centralised. Evaluating the health economics through cost-benefit analysis and measuring

the outcomes in terms of quality-adjusted life-years (QALYs) of local and regional services

can break the political barriers and also can be an important driving factor in the

reconfiguration of genetic services in the UK. One option for the NHS could be developing

partnerships with universities and private companies so that they could bring in more

investments into the NHS to meet the infrastructure demands, computational

requirements and the financial challenges. However, these should be carefully planned as

they could bring about challenges in legal liability, data ownership and privacy issues. In

the current political climate and in the current NHS landscape, it is a policy of ‘wait and

see’ what the reconfiguration brings and means to the existing NHS molecular labs.

4.2.8 FH services

It is evident from the literature that various approaches and screening strategies are being

followed across the world to provide an effective FH diagnostic service. Critiques have

argued and discussed the advantage and disadvantage of these approaches in their

respective settings. For example, in Slovenia, universal screening of children for FH is

being implemented from the age of 5 (Kusters et al., 2012) where as in other parts of the

world this approach has remained a subject of controversy (Newman & Garber, 2000). In

the UK, NICE guidelines advices that children with FH should be identified before the age

of 10, so that the treatment could be started early (A S Wierzbicki et al., 2008). In the US,

universal screening of children aged 9-11 is recommended (National Heart, Lung, 2011)

whereas the same approach is not being recommended by the Royal Australian College

of Practitioners (Practitioners, 2009). Index cases are usually identified by opportunistic or

targeted systematic screening (Børge G. Nordestgaard et al., 2013). An Australian study

team has reviewed various approaches for screening FH, i.e. such as opportunistic

screening for family history, opportunistic screening of lipids, systematic screening of

general practice electronic records and universal screening for a general practice setting

(Kirke, Watts, & Emery, 2012). However, such approaches in the UK are debatable and

also may not be cost-effective. The UK literature evidences cascade screening as the

most cost-effective screening strategy (Dalya Marks et al., 2002). The first successful

national genetic cascade screening was from the Netherlands in 1994 (Umans-

Eckenhausen, Defesche, Sijbrands, Scheerder, & Kastelein, 2001) (Wonderling et al.,

2004), followed by Norway in 2003 (Leren et al., 2004) (Leren, Finborud, Manshaus, Ose,

& Berge, 2008). This successful cascade screening strategy is also being followed in

other countries such as Spain (Pocovi, Civeira, Alonso, & Mata, 2004) (Fernando Civeira

Page 132: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

114

et al., 2008), New Zealand (Muir, George, Laurie, Reid, & Whitehead, 2010) and Wales

(Taylor et al., 2010).

At UHS, cascade screening has been found to be an effective tool in the screening and

diagnosis of FH. On identification of an index case at the lipid clinic in the UHS, their first

degree “at risk” relatives are approached as a part of the cascade screening process with

their consent. These patients are seen and managed by a specialist team of clinical

consultant, specialist FH nurses and FH genetic counsellor. Blood samples from the first

degree relatives of the FH index case are sent to a reference lab for FH genetic

screening. If they are found to have FH, they are managed appropriately with appropriate

management strategies. GPs also refer suspected cases to the UHS lipid clinic and if they

found to have FH they are then managed either by the primary care specialist or by the

referring GP practices. Regardless of the FH genetic testing outcome, the genetic

counsellor plays a very important role in the FH management pathway and aids the

patient and their families to make informed choices (Sturm, 2014). It is also evident from

the literature that genetic counselling helps patients to conceive and acclimate themselves

to the medical and physiological implications of genetic contributions to the disease

(National Society of Genetic Counselors’ Definition Task Force et al., 2006). Studies have

also shown that involvement of a genetic counsellor right from the process of requesting

tests and reviewing the ordered genetic tests could lead to huge savings and decrease

wastage (Miller et al., 2014).

In the current rapid pace of genomics, this study strongly suggests that the

implementation of NGS in the screening and diagnosis of FH at the UHS will make a

substantial impact on cascade screening as it will aid in early identification of “at risk”

relatives of the probands. This will also aid in their early management, thereby preventing

considerable mortality and morbidity due to CHD at later stages of life. NGS technology

will allow us to perform the genetic screening of FH as a standard in-house clinical

practice in the lipid clinics as well as by GP services across the UK.

4.3 Recommendations

Current Sanger screening methodologies are sensitive and specific; however they are

impractical for routine screening purposes as well as for screening large populations due

to the time and cost involved. NGS is considered to be a paradigm shift in technology

since a large number of samples can be analysed in one run with minimal human

interference through automated protocols. Hence it becomes very significant for the future

NHS laboratories to embrace this technology to meet the demands of the increasing

diagnostic needs in the most cost-effective way to aid in timely diagnosis and

Page 133: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

115

management. Since the technology is becoming more economical, it is also necessary for

the NHS of the future to implement NGS towards the approach of personalised medicine.

However, it is important to acknowledge that concurrent training of healthcare

professionals is crucial in order to successfully implement NGS in routine NHS diagnostics

as it requires new skills. As evident from the House of Commons report, there are

widespread concerns that the current NHS staff have insufficient training and there is a

lack of suitably qualified staff in the NHS to meet the demands of genomic medicine

integration into the routine diagnostic services of the NHS (Science and Technology

Committe. House of Commons., 2018). Hence, it is important to develop, train and

integrate the existing workforce which includes diagnostic staff in the laboratories and

imaging, computer scientists, statisticians, data scientist and bioinformaticians for

successful implementation of NGS into our routine NHS diagnostics. The lab

professionals need to be trained to provide the results in a way which is understandable

and interpretable by the clinicians, who themselves also need to be trained to interpret the

results provided by the laboratories. Hence, healthcare professionals in the NHS should

be trained in bioinformatics which will aid them in analysing and understanding the data in

a more effective way so that diagnosis and management can be tailored appropriately to

the patients. Another important recommendation would be the use of wet lab validated

panels in NGS with considerable optimisation and familiarisation phases (Jennings et al.,

2017) before it is in use for routine diagnostics. This will eliminate the need for the labs to

perform extensive in-house optimisation and validation and save huge costs and time

involved in these processes.

4.4 Limitations

One of the limitations of this study was that the variant findings could not be compared

with the patient family history and cholesterol levels, since the samples were fully

anonymised in this study. This comparison would have enabled me to understand the

correlation of specific mutations with the blood cholesterol levels and the occurrence of

CHD. Nonetheless, our main goal in this pilot study was to set up and standardise the

NGS methodology for the genetic screening of FH and to study the implementation issues

both at the technical and organisational levels.

In our study, we were able to achieve a sensitivity rate of 92.54% and a specificity of

100%. The reason for 92.54% sensitivity was due to the coverage issues with the

amplicons. This was mainly due to the usage of panels which were in silico validated and

not wet lab validated by the provider, which is another limitation of our study. However,

with gap filling Sanger sequencing we were able to achieve 100% sensitivity. This

limitation could be overcome by using wet lab validated off the shelf bespoke panels

Page 134: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

116

which were not available in the market at the time of our study. Nevertheless, our NGS

approach shows increased rate of sensitivity and specificity in comparison to the current

traditional sequencing methodologies in use as seen in the literature (Heath et al., 2001)

(Palacios et al., 2012) (Finnie et al., 2012) (C A Graham et al., 2005). The variant

detection rates in our FH-NGS study were very similar and comparable to studies

described in the literature (Futema et al., 2012) (Vandrovcova et al., 2013) (Radovica-

Spalvina et al., 2015).

4.5 Conclusion

NGS allowed the identification of FH variants in patients which were not previously

reported by the reference lab. Moreover, our NGS approach was also able to identify

numerous variants of unknown significance in LDLR, APOB, PCSK9 and LDLRAP1

genes. We were also able to identify some of the common variants in LDLR, APOB and

PCSK9 genes similar to the variants reported in the Latvian population study (Radovica-

Spalvina et al., 2015).

In contrast with the traditional Sanger capillary sequencing, NGS has allowed us to

analyse multiple samples in a single run by multiplexing samples. It has the potential to be

fully automated and errors arising due to human intervention can be minimised. Our NGS

approach has also found that it can significantly reduce the turnaround times of the

samples and aid the clinician in diagnosis and tailor appropriate management strategies

effectively in a timely manner.

This study has demonstrated the practical utility of NGS to be used by the NHS of the

future as a diagnostic platform for genetic disorders such as FH. NGS represents a

paradigm shift from the conventional Sanger sequencing in its ever-increasing breadth of

coverage, resolution and reliability, coupled with ever-reducing costs. Therefore,

incorporation of NGS into routine molecular diagnostics will create a major impact on the

way we diagnose and tailor the management of hereditary disorders such as FH. It will

also aid in the integration of personalised medicine approach and practice into our routine

clinical practice in the NHS.

Based on my personal experience and our study’s findings, I conclude that, with well

validated panels, appropriate resources together with the bioinformatics support, NGS can

be effectively translated from the research domain into routine NHS diagnostics. Also, the

experience gained from this study will be used to implement the applications of NGS in

other areas beyond FH in the UHS Molecular Pathology laboratory.

Page 135: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

117

CHAPTER FIVE

REFLECTION

Page 136: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

118

5. Reflection

5.1 The Professional Doctorate programme

Soon after completing my Masters degree in Biomedical Science at the University of

Portsmouth in 2011, I enrolled onto the Professional Doctorate (Prof Doc) in Biomedical

Science (DBMS) programme at the University of Portsmouth (UoP) in September 2012.

The programme is composed of two phases:

Taught phase This phase of the course consist of taught units on ‘Professional Review

and Development’, ‘Advanced Research Techniques’, ‘Publication and Dissemination’ and

‘Project Proposal’, which were taught by lecturers at UoP who were experts with up-to-

date knowledge in their respective fields. These units have allowed me to gain an in-depth

understanding of my strengths and weaknesses, to learn how to formulate research

questions using different frame-works, develop research proposals. In addition, I have

also learnt how to critically analyse and appraise qualitative and quantitative research

articles systematically using the critical appraisal tools. The greatest advantage of this

taught phase was that the units were taught by lecturers with considerable experience in

their fields which were then followed by interactive workshops and group discussions to

facilitate the learning. These interactive discussions and workshops gave student an

opportunity to apply what was being taught in the lectures. On completion of each unit in

the taught phase, we were expected to submit an assignment and do a PowerPoint

presentation in order to progress. These units have greatly enhanced my knowledge and

understanding on qualitative and quantitative research, as well as enhancing my

confidence in public speaking and scientific discussions.

Research phase This phase involved the completion of a workplace research project on

the application of next generation sequencing (NGS) to aid in effective diagnosis of

familial hypercholesterolaemia (FH). This research work required an in-depth

understanding and knowledge of NGS technology and how this new technology can be

translated effectively into routine diagnostics. To achieve this, I had to learn and

understand the current practice methods, gather information around my research subject

to effectively understand the science, enrol myself onto courses to learn and be

familiarised with the appropriate new technology, and attend lectures and workshops. I

had to reflect on every situation to effectively assess my own progress and monitor my

development needs in order to progress and successfully complete this research project. I

am sure these techniques and methodologies will enhance my future learning and

research. I feel that this doctorate programme has provided me the required knowledge to

Page 137: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

119

bring about changes in diagnostic practices and believe that it has enhanced my

confidence in public speaking and scientific debates.

5.2 Researching while working as a full time Biomedical Scientist

The DBMS programme has been a challenging programme which required discipline and

dedication. For me it was a very fine line I had to trend in order to effectively balance full-

time work and university life with my family life. I faced a few financial burdens, since I

was self-funding my DBMS, and to rub salt in the wound, I had no study leave approved

for this programme at my workplace and hence had to use my annual leave to attend

university lectures, workshops, and meetings. However, my family was very

understanding and supported me during this period of financial hardship. Moreover, my

workplace and university supervisors were very supportive and motivating, which allowed

me to successfully complete my research. My workplace and university supervisors are

very busy professionals, each having their own priorities and timelines, but nevertheless

still made the effort to support me and guide me in my research work. Their expertise,

knowledge sharing and constructive feedback has allowed me to work more effectively on

my research project.

5.3 Bringing about a change in professional practice

Bringing about change to a procedure or practice in the workplace is never expected to be

a smooth or flawless process. This research project has helped me to understand how

NGS can be effectively used to diagnose FH. Through this pilot study, we have gained an

understanding of how this new technology can be effectively translated from the research

domain into routine diagnostics. It has also allowed me to understand the implementation

challenges and how they could be effectively addressed in order to successfully

implement NGS into our routine molecular diagnostics (e.g. at UHS ) in the near future. It

is evident from this study and from the literature that NGS will likely aid in effective

screening and diagnosis of patients with FH. This would allow the clinicians in tailoring

effective management strategies in a timely manner. The rapid pace of developments in

NGS technologies has created more opportunities and has opened doors for the NHS

diagnostic laboratories to embrace this new technology. Furthermore, the potential

applications of this technology could be expanded to other branches of diagnostics

beyond FH. Hence, to keep pace, it is important to address that the NHS needs to

embrace NGS into their routine diagnostics at the earliest feasible opportunity. This study

will lay a foundation for the implementation of NGS for FH in routine molecular diagnostics

at the UHS. Implementing NGS at UHS will surely bring in economic benefits to the NHS

as well as to society as a whole.

Page 138: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

120

5.4 Dissemination of information

During the research phase of this study, the author has presented the provisional findings

at a local level in the Molecular Pathology Laboratory at UHS which was attended by

biomedical scientists, clinical scientists, trainees and consultants. At national/International

level, the output from the study was disseminated through a poster presentation in the

form of an impact case study poster in the 6th International Conference on Professional

Doctorates in London 2018. The abstract of this study and the link to the extended

abstract was published in the conference brochure. The output of the study was

disseminated through a poster presentation at Research and Innovation Conference at

the University of Portsmouth on September 2018.

5.5 Professional development

The Professional Doctorate in Biomedical Science programme at UoP has created a

positive impact on my academic career. Through this programme, I feel that I have

enhanced and developed my academic skills through lectures, assignments,

presentations, university workshops, as well as by attending and enrolling onto courses

required for the successful completion of my research project. Whilst on this programme, I

have had various colleagues at work discussing about their research projects and seeking

advice on the ethical approval process for their research projects. The knowledge and the

skills gained through this programme have allowed me to give them a helping hand by

providing guidance and advice on research projects and ethical approval process. I am

confident and sure that on completion of my doctorate programme, my horizons will be

expanded and my opportunities for career development will also be enhanced.

Page 139: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

121

References Abifadel, M., Elbitar, S., El Khoury, P., Ghaleb, Y., Chémaly, M., Moussalli, M. L., …

Boileau, C. (2014). Living the PCSK9 adventure: From the identification of a new gene in familial hypercholesterolemia towards a potential new class of anticholesterol drugs. Current Atherosclerosis Reports. https://doi.org/10.1007/s11883-014-0439-8

Abifadel, M., Varret, M., Rabes, J. P., Allard, D., Ouguerram, K., Devillers, M., … Boileau, C. (2003). Mutations in PCSK9 cause autosomal dominant hypercholesterolemia. Nat Genet, 34(2), 154–156. https://doi.org/10.1038/ng1161

Abul-Husn, N. S., Manickam, K., Jones, L. K., Wright, E. A., Hartzel, D. N., Gonzaga-Jauregui, C., … Murray, M. F. (2016). Genetic identification of familial hypercholesterolemia within a single U.S. Health care system. Science, 354(6319). https://doi.org/10.1126/science.aaf7000

Agilent Technologies. (2007). Agilent 2100 Bioanalyzer. Retrieved August 18, 2016, from http://www.webcitation.org/6LUOQQnc4

Akbari, M., Hansen, M. D., Halgunset, J., Skorpen, F., & Krokan, H. E. (2005). Low copy number DNA template can render polymerase chain reaction error prone in a sequence-dependent manner. The Journal of Molecular Diagnostics : JMD, 7(1), 36–39.

Al-Khateeb, A., Zahri, M. K., Mohamed, M. S., Sasongko, T. H., Ibrahim, S., Yusof, Z., & Zilfalil, B. A. (2011). Analysis of sequence variations in low-density lipoprotein receptor gene among Malaysian patients with familial hypercholesterolemia. BMC Medical Genetics, 12. https://doi.org/10.1186/1471-2350-12-40

Al-Waili, K., Al-Zidi, W. A. M., Al-Abri, A. R., Al-Rasadi, K., Al-Sabti, H. A., Shah, K., … Banerjee, Y. (2013). Mutation in the PCSK9 gene in Omani Arab subjects with autosomal dominant hypercholesterolemia and its efect on PCSK9 protein structure. Oman Medical Journal, 28(1), 48–52. https://doi.org/10.5001/omj.2013.11

Alonso, R., Fernández de Bobadilla, J., Méndez, I., Lázaro, P., Mata, N., & Mata, P. (2008). [Cost-effectiveness of managing familial hypercholesterolemia using atorvastatin-based preventive therapy]. Revista Española De Cardiología, 61(4), 382–393. https://doi.org/10.1016/S0140-6736(00)04257-4

Amsellem, S., Briffaut, D., Carrié, A., Rabés, J. P., Girardet, J. P., Fredenrich, A., … Benlian, P. (2002). Intronic mutations outside of Alu-repeat-rich domains of the LDL receptor gene are a cause of familial hypercholesterolemia. Human Genetics, 111(6), 501–510. https://doi.org/10.1007/s00439-002-0813-4

Anderson, C. (2018). Google’s AI Tool DeepVariant Promises Significantly Fewer Genome Errors. Clinical OMICs, 5(1), 33. https://doi.org/10.1089/clinomi.05.01.21

Andrews, S. (2010). FastQC: A quality control tool for high throughput sequence data. Babraham Bioinformatics, http://www.bioinformatics.babraham.ac.uk/projects/. https://doi.org/citeulike-article-id:11583827

Anon. (2008). MinElute Handbook. QIAGEN Sample & Assay Technologies, (March), 23–24. Retrieved from http://www.qiagen.com/resources/Download.aspx?id=%7BFA2ED17D-A5E8-4843-80C1-3D0EA6C2287D%7D&lang=en&ver=1

Aronesty, E. (2013). Comparison of Sequencing Utility Programs. The Open Bioinformatics Journal, 7(1), 1–8. https://doi.org/10.2174/1875036201307010001

Page 140: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

122

Asselbergs, F. W., Guo, Y., van Iperen, E. P. A., Sivapalaratnam, S., Tragante, V., Lanktree, M. B., … Drenos, F. (2012). Large-scale gene-centric meta-analysis across 32 studies identifies multiple lipid loci. American Journal of Human Genetics, 91(5), 823–838. https://doi.org/10.1016/j.ajhg.2012.08.032

Austin, M. A., Hutter, C. M., Zimmern, R. L., & Humphries, S. E. (2004). Genetic causes of monogenic heterozygous familial hypercholesterolemia: a HuGE prevalence review. Am J Epidemiol, 160(5), 407–420. https://doi.org/10.1093/aje/kwh236

Awan, Z., Choi, H. Y., Stitziel, N., Ruel, I., Bamimore, M. A., Husa, R., … Genest, J. (2013). APOE p.Leu167del mutation in familial hypercholesterolemia. Atherosclerosis, 231(2), 218–222. https://doi.org/10.1016/j.atherosclerosis.2013.09.007

Baigent, C., Blackwell, L., Emberson, J., Holland, L. E., Reith, C., Bhala, N., … Collins, R. (2010). Efficacy and safety of more intensive lowering of LDL cholesterol: a meta-analysis of data from 170,000 participants in 26 randomised trials. Lancet, 376(9753), 1670–1681. https://doi.org/10.1016/s0140-6736(10)61350-5

Beckman Coulter, I. (2016). Agencourt AMPure XP-PCR Purification. Retrieved August 18, 2016, from https://www.beckmancoulter.com/wsrportal/techdocs?docname=B37419

Beigh, M. (2016). Next-Generation Sequencing: The Translational Medicine Approach from “Bench to Bedside to Population.” Medicines, 3(2), 14. https://doi.org/10.3390/medicines3020014

Bell, D. A., Hooper, A. J., Watts, G. F., & Burnett, J. R. (2012). Mipomersen and other therapies for the treatment of severe familial hypercholesterolemia. Vascular Health and Risk Management. https://doi.org/10.2147/VHRM.S28581

Bell, J. (2017). Life sciences: industrial strategy. Retrieved May 19, 2018, from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/650447/LifeSciencesIndustrialStrategy_acc2.pdf

Benn, M., Stene, M. C. A., Nordestgaard, B. G., Jensen, G. B., Steffensen, R., & Tybjærg-Hansen, A. (2008). Common and rare alleles in apolipoprotein B contribute to plasma levels of low-density lipoprotein cholesterol in the general population. Journal of Clinical Endocrinology and Metabolism, 93(3), 1038–1045. https://doi.org/10.1210/jc.2007-1365

Benn, M., Watts, G. F., Tybjærg-Hansen, A., & Nordestgaard, B. G. (2016). Mutations causative of familial hypercholesterolaemia: Screening of 98 098 individuals from the Copenhagen General Population Study estimated a prevalence of 1 in 217. European Heart Journal, 37(17), 1384–1394. https://doi.org/10.1093/eurheartj/ehw028

Bentley, D. R., Balasubramanian, S., Swerdlow, H. P., Smith, G. P., Milton, J., Brown, C. G., … Smith, A. J. (2008). Accurate whole human genome sequencing using reversible terminator chemistry. Nature, 456(7218), 53–59. https://doi.org/10.1038/nature07517

Bentzen, J., Jorgensen, T., & Fenger, M. (2002). The effect of six polymorphisms in the Apolipoprotein B gene on parameters of lipid metabolism in a Danish population. Clin Genet, 61(2), 126–134.

Bhagavan, N. V., & Jackson, C. M. (2002). Medical Biochemistry. Medical Biochemistry.

Page 141: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

123

https://doi.org/10.1016/B978-012095440-7/50037-8

Blankenberg, D., Gordon, A., Von Kuster, G., Coraor, N., Taylor, J., Nekrutenko, A., & Team, G. (2010). Manipulation of FASTQ data with galaxy. Bioinformatics, 26(14), 1783–1785. https://doi.org/10.1093/bioinformatics/btq281

Blankenberg, D., Kuster, G. Von, Coraor, N., Ananda, G., Lazarus, R., Mangan, M., … Taylor, J. (2010). Galaxy: A web-based genome analysis tool for experimentalists. Current Protocols in Molecular Biology. https://doi.org/10.1002/0471142727.mb1910s89

Borén, J., Ekström, U., Ågren, B., Nilsson-Ehle, P., & Innerarity, T. L. (2001). The Molecular Mechanism for the Genetic Disorder Familial Defective Apolipoprotein B100. Journal of Biological Chemistry, 276(12), 9214–9218. https://doi.org/10.1074/jbc.M008890200

Braun, P., & LaBaer, J. (2003). High throughput protein production for functional proteomics. Trends in Biotechnology, 21(9), 383–8. https://doi.org/10.1016/S0167-7799(03)00189-6

Brautbar, A., Leary, E., Rasmussen, K., Wilson, D. P., Steiner, R. D., & Virani, S. (2015). Genetics of Familial Hypercholesterolemia. Current Atherosclerosis Reports, 17(4), 20. https://doi.org/10.1007/s11883-015-0491-z

Brice, P., Burton, H., Edwards, C. W., Humphries, S. E., & Aitman, T. J. (2013). Familial hypercholesterolaemia: A pressing issue for European health care. Atherosclerosis. https://doi.org/10.1016/j.atherosclerosis.2013.09.019

Brown, M. S., & Goldstein, J. L. (1974). Expression of the familial hypercholesterolemia gene in herozygotes: Mechanism for a dominant disorder in man. Science, 185(4145), 61–63. https://doi.org/10.1126/science.185.4145.61

Brown, M. S., & Goldstein, J. L. (1986). A receptor-mediated pathway for cholesterol homeostasis. Science (New York, N.Y.), 232(4746), 34–47.

Bruce Keogh. (2015). Personalised Medicine Strategy. Retrieved May 19, 2018, from https://www.england.nhs.uk/wp-content/uploads/2015/09/item5-board-29-09-15.pdf

Buchanan, J., Wordsworth, S., & Schuh, A. (2013). Issues surrounding the health economic evaluation of genomic technologies. Pharmacogenomics, 14(15), 1833–1847. https://doi.org/10.2217/pgs.13.183

Burns, F. S. (1920). A contribution to the study of the etiology of xanthoma multiplex. Archives of Dermatology and Syphilology, 2(4), 415–429. https://doi.org/10.1001/archderm.1920.02350100003001

Burt, S. M., Carter, T. J. N., & Kricka, L. J. (1979). Thermal characteristics of microtitre plates used in immunological assays. Journal of Immunological Methods, 31(3–4), 231–236. https://doi.org/10.1016/0022-1759(79)90135-2

Calandra, S., Tarugi, P., Speedy, H. E., Dean, A. F., Bertolini, S., & Shoulders, C. C. (2011). Mechanisms and genetic determinants regulating sterol absorption, circulating LDL levels, and sterol elimination: implications for classification and disease risk. Journal of Lipid Research, 52(11), 1885–1926. https://doi.org/10.1194/jlr.R017855

Cariou, B., Le May, C., & Costet, P. (2011). Clinical aspects of PCSK9. Atherosclerosis. https://doi.org/10.1016/j.atherosclerosis.2011.04.018

Page 142: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

124

Catapano, A. L., Reiner, Ž., De Backer, G., Graham, I., Taskinen, M.-R., Wiklund, O., … Durrington, P. (2011). ESC/EAS Guidelines for the management of dyslipidaemias. Atherosclerosis, 217, 1–44.

Cenarro, A., Etxebarria, A., De Castro-Orós, I., Stef, M., Bea, A. M., Palacios, L., … Civeira, F. (2016). The p.Leu167del mutation in APOE gene causes autosomal dominant hypercholesterolemia by down-regulation of LDL receptor expression in hepatocytes. Journal of Clinical Endocrinology and Metabolism, 101(5), 2113–2121. https://doi.org/10.1210/jc.2015-3874

Chen, S. H., Yang, C. Y., Chen, P. F., Setzer, D., Tanimura, M., Li, W. H., … Chan, L. (1986). The complete cDNA and amino acid sequence of human apolipoprotein B-100. Journal of Biological Chemistry, 261(28).

Chen, S. N., Ballantyne, C. M., Gotto, A. M., Tan, Y., Willerson, J. T., & Marian, A. J. (2005). A Common PCSK9Haplotype, Encompassing the E670G Coding Single Nucleotide Polymorphism, Is a Novel Genetic Marker for Plasma Low-Density Lipoprotein Cholesterol Levels and Severity of Coronary Atherosclerosis. Journal of the American College of Cardiology, 45(10), 1611–1619. https://doi.org/10.1016/j.jacc.2005.01.051

Chiou, K. R., Charng, M. J., & Chang, H. M. (2011). Array-based resequencing for mutations causing familial hypercholesterolemia. Atherosclerosis, 216(2), 383–389. https://doi.org/10.1016/j.atherosclerosis.2011.02.006

Christensen, K. D., Dukhovny, D., Siebert, U., & Green, R. C. (2015). Assessing the costs and cost-effectiveness of genomic sequencing. Journal of Personalized Medicine. https://doi.org/10.3390/jpm5040470

Civeira, F. (2004). Guidelines for the diagnosis and management of heterozygous familial hypercholesterolemia. Atherosclerosis, 173(1), 55–68. https://doi.org/10.1016/j.atherosclerosis.2003.11.010

Civeira, F., Ros, E., Jarauta, E., Plana, N., Zambon, D., Puzo, J., … Pocovi, M. (2008). Comparison of genetic versus clinical diagnosis in familial hypercholesterolemia. The American Journal of Cardiology, 102(9), 1187–93, 1193.e1. https://doi.org/10.1016/j.amjcard.2008.06.056

Cuchel, M., Bruckert, E., Ginsberg, H. N., Raal, F. J., Santos, R. D., Hegele, R. A., … Chapman, M. J. (2014). Homozygous familial hypercholesterolaemia: New insights and guidance for clinicians to improve detection and clinical management. A position paper fromthe Consensus Panel on Familial Hypercholesterolaemia of the European Atherosclerosis Society. European Heart Journal. https://doi.org/10.1093/eurheartj/ehu274

Daniels, S. R., & Greer, F. R. (2008). Lipid Screening and Cardiovascular Health in Childhood. PEDIATRICS, 122(1), 198–208. https://doi.org/10.1542/peds.2008-1349

Daniels, T. F., Killinger, K. M., Michal, J. J., Wright, R. W., & Jiang, Z. (2009). Lipoproteins, cholesterol homeostasis and cardiac health. International Journal of Biological Sciences. https://doi.org/10.7150/ijbs.5.474

Davies, M. J. (1996). Stability and instability: Two faces of coronary atherosclerosis: The Paul Dudley White lecture 1995. Circulation. https://doi.org/10.1161/01.CIR.94.8.2013

Davis, C. G., Elhammer, A., Russell, D. W., Schneider, W. J., Kornfeld, S., Brown, M. S.,

Page 143: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

125

& Goldstein, J. L. (1986). Deletion of clustered O-linked carbohydrates does not impair function of low density lipoprotein receptor in transfected fibroblasts. The Journal of Biological Chemistry, 261(6), 2828–2838.

De Castro-Orós, I., Pocoví, M., & Civeira, F. (2010). The genetic basis of familial hypercholesterolemia: inheritance, linkage, and mutations. The Application of Clinical Genetics, 3, 53–64. https://doi.org/10.2147/TACG.S8285

Deans, Z., Watson, C. M., Charlton, R., Ellard, S., Wallis, Y., Mattocks, C., & Stephen Abbs. (2015). Practice guidelines for Targeted Next Generation Sequencing Analysis and Interpretation. Acgs, 1–13. Retrieved from http://www.acgs.uk.com/media/983872/bpg_for_targeted_next_generation_sequencing_-_approved_dec_2015.pdf

Dedoussis, G. V. Z., Genschel, J., Bochow, B., Pitsavos, C., Skoumas, J., Prassa, M., … Schmidt, H. (2004). Molecular characterization of familial hypercholesterolemia in German and Greek patients. Human Mutation, 23(3), 285–286. https://doi.org/10.1002/humu.9218

DePristo, M. A., Banks, E., Poplin, R., Garimella, K. V, Maguire, J. R., Hartl, C., … Daly, M. J. (2011). A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genetics, 43(5), 491–498. https://doi.org/10.1038/ng.806

Desai, A. N., & Jere, A. (2012). Next-generation sequencing: ready for the clinics? Clinical Genetics, 81(6), 503–510. https://doi.org/10.1111/j.1399-0004.2012.01865.x

Dietschy, J. M. (1984). Regulation of cholesterol metabolism in man and in other species. Klinische Wochenschrift, 62(8), 338–45. https://doi.org/10.1007/BF01716251

Duskova, L., Kopeckova, L., Jansova, E., Tichy, L., Freiberger, T., Zapletalova, P., … Fajkusova, L. (2011). An APEX-based genotyping microarray for the screening of 168 mutations associated with familial hypercholesterolemia. Atherosclerosis, 216(1), 139–145. https://doi.org/10.1016/j.atherosclerosis.2011.01.023

Edmondson, A. C., Braund, P. S., Stylianou, I. M., Khera, A. V., Nelson, C. P., Wolfe, M. L., … Rader, D. J. (2011). Dense genotyping of candidate gene loci identifies variants associated with high-density lipoprotein cholesterol. Circulation. Cardiovascular Genetics, 4(2), 145–55. https://doi.org/10.1161/CIRCGENETICS.110.957563

Endrullat, C., Glökler, J., Franke, P., & Frohme, M. (2016). Standardization and quality management in next-generation sequencing. Applied and Translational Genomics. https://doi.org/10.1016/j.atg.2016.06.001

Esteva, A., Kuprel, B., Novoa, R. A., Ko, J., Swetter, S. M., Blau, H. M., & Thrun, S. (2017). Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542(7639), 115.

Eurofins Genomics. (2017). DNA & RNA Oligonucleotides. Retrieved March 16, 2017, from https://www.eurofinsgenomics.eu/en/dna-rna-oligonucleotides.aspx

Fahed, A. C., & Nemer, G. M. (2011). Familial hypercholesterolemia: The lipids or the genes? Nutrition and Metabolism. https://doi.org/10.1186/1743-7075-8-23

Faiz, F., Hooper, A. J., & Van Bockxmeer, F. M. (2012). Molecular pathology of familial hypercholesterolemia, related dyslipidemias and therapies beyond the statins. Critical

Page 144: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

126

Reviews in Clinical Laboratory Sciences. https://doi.org/10.3109/10408363.2011.646942

Faiz, F., Nguyen, L. T., Van Bockxmeer, F. M., & Hooper, A. J. (2014). Genetic screening to improve the diagnosis of familial hypercholesterolemia. Clinical Lipidology, 9(5), 523–532.

Filigheddu, F., Quagliarini, F., Campagna, F., Secci, T., Degortes, S., Zaninello, R., … Arca, M. (2009). Prevalence and clinical features of heterozygous carriers of autosomal recessive hypercholesterolemia in Sardinia. Atherosclerosis, 207(1), 162–167. https://doi.org/10.1016/j.atherosclerosis.2009.04.027

Finnie, R. M., Bell, C., Bloomfield, P., Ho, C. K. M., Jenks, S., Shand, N., & Walker, S. W. (2012). The first hundred families diagnosed with familial hypercholesterolaemia in two lipid clinics in lothian. The British Journal of Diabetes & Vascular Disease, 12(5), 243–247.

Friedewald, W. T., Levy, R. I., & Fredrickson, D. S. (1972). Estimation of the concentration of low-density lipoprotein cholesterol in plasma, without use of the preparative ultracentrifuge. Clinical Chemistry, 18(6), 499–502.

Frikke-Schmidt, R., Nordestgaard, B. G., Schnohr, P., & Tybjaerg-Hansen, A. (2004). Single nucleotide polymorphism in the low-density lipoprotein receptor is associated with a threefold risk of stroke. A case-control and prospective study. Eur Heart J, 25(11), 943–951. https://doi.org/10.1016/j.ehj.2004.03.020\rS0195668X04002143 [pii]

Froyen, G., Broekmans, A., Hillen, F., Pat, K., Achten, R., Mebis, J., … Maes, B. (2016). Validation and application of a custom-designed targeted next-generation sequencing panel for the diagnostic mutational profiling of solid tumors. PLoS ONE, 11(4). https://doi.org/10.1371/journal.pone.0154038

Futema, M., Plagnol, V., Whittall, R. A., Neil, H. A., & Humphries, S. E. (2012). Use of targeted exome sequencing as a diagnostic tool for Familial Hypercholesterolaemia. J Med Genet, 49(10), 644–649. https://doi.org/10.1136/jmedgenet-2012-101189

Ganji, S. H., Kamanna, V. S., & Kashyap, M. L. (2003). Niacin and cholesterol: role in cardiovascular disease (review). The Journal of Nutritional Biochemistry, 14(6), 298–305. https://doi.org/10.1016/S0955-2863(02)00284-X

Goldstein, J. L. (2001). Familial hypercholesterolemia.

Goldstein, J. L., Anderson, R. G., & Brown, M. S. (1982). Receptor-mediated endocytosis and the cellular uptake of low density lipoprotein. Ciba Foundation Symposium, (92), 77–95. Retrieved from http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=6129958

Goldstein, J. L., & Brown, M. S. (1973). Familial hypercholesterolemia: identification of a defect in the regulation of 3-hydroxy-3-methylglutaryl coenzyme A reductase activity associated with overproduction of cholesterol. Proc Natl Acad Sci U S A, 70(10), 2804–2808.

Goldstein, J. L., & Brown, M. S. (1974). Binding and degradation of low density lipoproteins by cultured human fibroblasts. Comparison of cells from a normal subject and from a patient with homozygous familial hypercholesterolemia. Journal of Biological Chemistry, 249(16), 5153–5162.

Page 145: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

127

Goldstein JL, Hobbs HH, Brown MS, . (2001). The metabolic and molecular bases of inherited disease. Familial hypercholesterolemia New York: In McGraw-Hill (pp. 2863–913). https://doi.org/10.1036/ommbid.149

Gordon, D. J., Probstfield, J. L., Garrison, R. J., Neaton, J. D., Castelli, W. P., Knoke, J. D., … Tyroler, H. A. (1989). High-density lipoprotein cholesterol and cardiovascular disease. Four prospective American studies. Circulation, 79(1), 8–15. https://doi.org/10.1161/01.CIR.79.1.8

Graham, C. A., McClean, E., Ward, A. J. M., Beattie, E. D., Martin, S., O’Kane, M., … Nicholls, D. P. (1999). Mutation screening and genotype:phenotype correlation in familial hypercholesterolaemia. Atherosclerosis, 147(2), 309–316. https://doi.org/10.1016/S0021-9150(99)00201-4

Graham, C. A., McIlhatton, B. P., Kirk, C. W., Beattie, E. D., Lyttle, K., Hart, P., … Nicholls, D. P. (2005). Genetic screening protocol for familial hypercholesterolemia which includes splicing defects gives an improved mutation detection rate. Atherosclerosis, 182(2), 331–340. https://doi.org/10.1016/j.atherosclerosis.2005.02.016

Granter, S. R., Beck, A. H., & Papke Jr, D. J. (2017). AlphaGo, deep learning, and the future of the human microscopist. Archives of Pathology & Laboratory Medicine, 141(5), 619–621.

Greenbaum, D., Du, J., & Gerstein, M. (2008). Genomic anonymity: Have we already lost it? American Journal of Bioethics. https://doi.org/10.1080/15265160802478560

Grenkowitz, T., Kassner, U., Wühle-Demuth, M., Salewsky, B., Rosada, A., Zemojtel, T., … Demuth, I. (2016). Clinical characterization and mutation spectrum of German patients with familial hypercholesterolemia. Atherosclerosis, 253, 88–93. https://doi.org/10.1016/j.atherosclerosis.2016.08.037

Haddow, J. E., & Palomaki, G. E. (2004). ACCE : A Model Process for Evaluating Data on Emerging Genetic Tests. In Human Genome Epidemiology: A Scientific Foundation for Using Genetic Information to Improve Health and Prevent Disease (pp. 217–233).

Haymana, C. (2017). Identifying Undiagnosed or Undertreated Patients with Familial Hypercholesterolemia from the Laboratory Records of a Tertiary Medical Center. Turk Kardiyoloji Dernegi Arsivi-Archives of the Turkish Society of Cardiology. https://doi.org/10.5543/tkda.2017.63846

Head, S. R., Kiyomi Komori, H., LaMere, S. A., Whisenant, T., Van Nieuwerburgh, F., Salomon, D. R., & Ordoukhanian, P. (2014). Library construction for next-generation sequencing: Overviews and challenges. BioTechniques, 56(2), 61–77. https://doi.org/10.2144/000114133

Health, D. of. (2005). Research governance framework for health and social care. Retrieved from https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/139565/dh_4122427.pdf

Heart UK. (2012). Saving lives, saving families. The Health, Social and Economic Advantages of Detecting and Treating Familial Hypercholesterolaemia (FH). Retrieved October 21, 2018, from www.heartuk.org.uk/files/uploads/documents/HUK_SavingLivesSavingFamilies_FHreport_Feb2012.pdf.

Page 146: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

128

Heath, K. E., Humphries, S. E., Middleton-Price, H., & Boxer, M. (2001). A molecular genetic service for diagnosing individuals with familial hypercholesterolaemia (FH) in the United Kingdom. Eur J Hum Genet, 9(4), 244–252. https://doi.org/10.1038/sj.ejhg.5200633

Henderson, R., O’Kane, M., McGilligan, V., & Watterson, S. (2016). The genetics and screening of familial hypercholesterolaemia. Journal of Biomedical Science. https://doi.org/10.1186/s12929-016-0256-1

Hobbs, H. H., Brown, M. S., & Goldstein, J. L. (1992a). Molecular genetics of the LDL receptor gene in familial hypercholesterolemia. Human Mutation, 1(6), 445–466. https://doi.org/10.1002/humu.1380010602

Hobbs, H. H., Brown, M. S., & Goldstein, J. L. (1992b). Molecular genetics of the LDL receptor gene in familial hypercholesterolemia. Hum Mutat, 1(6), 445–466. https://doi.org/10.1002/humu.1380010602

Hobbs, H. H., Leitersdorf, E., Leffert, C. C., Cryer, D. R., Brown, M. S., & Goldstein, J. L. (1989). Evidence for a dominant gene that suppresses hypercholesterolemia in a family with defective low density lipoprotein receptors. The Journal of Clinical Investigation, 84(2), 656–664. https://doi.org/10.1172/JCI114212

Hobbs, H. H., Russell, D. W., Brown, M. S., & Goldstein, J. L. (1990). The LDL receptor locus in familial hypercholesterolemia: mutational analysis of a membrane protein. Annual Review of Genetics, 24, 133–170. https://doi.org/10.1146/annurev.ge.24.120190.001025

Hopkins, P. N., Toth, P. P., Ballantyne, C. M., & Rader, D. J. (2011). Familial hypercholesterolemias: Prevalence, genetics, diagnosis and screening recommendations from the National Lipid Association Expert Panel on Familial Hypercholesterolemia. Journal of Clinical Lipidology, 5(3 SUPPL.). https://doi.org/10.1016/j.jacl.2011.03.452

Huang, L. S., Miller, D. a, Bruns, G. a, & Breslow, J. L. (1986). Mapping of the human APOB gene to chromosome 2p and demonstration of a two-allele restriction fragment length polymorphism. Proceedings of the National Academy of Sciences of the United States of America, 83(3), 644–648. https://doi.org/10.1073/pnas.83.3.644

Huijgen, R., Hutten, B. A., Kindt, I., Vissers, M. N., & Kastelein, J. J. P. (2012). Discriminative ability of LDL-cholesterol to identify patients with familial hypercholesterolemia: A cross-sectional study in 26 406 individuals tested for genetic FH. Circulation: Cardiovascular Genetics, 5(3), 354–359. https://doi.org/10.1161/CIRCGENETICS.111.962456

Humphries, S. E., Cooper, J., Dale, P., & Ramaswami, U. (2018). The UK Paediatric Familial Hypercholesterolaemia Register: Statin-related safety and 1-year growth data. Journal of Clinical Lipidology, 12(1), 25–32. https://doi.org/10.1016/j.jacl.2017.11.005

Humphries, S. E., Whittall, R. A., Hubbart, C. S., Maplebeck, S., Cooper, J. A., Soutar, A. K., … Neil, H. A. W. (2006). Genetic causes of familial hypercholesterolaemia in patients in the UK: relation to plasma lipid levels and coronary heart disease risk. J Med Genet, 43(12), 943–949. https://doi.org/10.1136/jmg.2006.038356

Hunt, S. C., Hopkins, P. N., Bulka, K., McDermott, M. T., Thorne, T. L., Wardell, B. B., … Samuels, M. E. (2000). Genetic localization to chromosome 1p32 of the third locus for familial hypercholesterolemia in a Utah kindred. Arteriosclerosis, Thrombosis, and

Page 147: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

129

Vascular Biology, 20(4), 1089–1093. https://doi.org/10.1161/01.ATV.20.4.1089

Hussain, M. M., Strickland, D. K., & Bakillah, A. (1999). The mammalian low-density lipoprotein receptor family. Annual Review of Nutrition, (19), 141–172. https://doi.org/http://dx.doi.org/10.1146/annurev.nutr.19.1.141

Iacocca, M. A., Wang, J., Dron, J. S., Robinson, J. F., McIntyre, A. D., Cao, H., & Hegele, R. A. (2017). Use of next-generation sequencing to detect LDLR gene copy number variation in familial hypercholesterolemia. Journal of Lipid Research, jlr.D079301. https://doi.org/10.1194/jlr.D079301

Illumina. (2012a). An Introduction to Next-Generation Sequencing Technology. Retrieved August 18, 2016, from https://www.illumina.com/Documents/products/Illumina_Sequencing_Introduction.pdf

Illumina. (2012b). An Introduction to Next-Generation Sequencing Technology.

Illumina. (2016a). Illumina DesignStudio. Retrieved May 18, 2017, from https://designstudio.illumina.com/

Illumina. (2016b). MiSeq System Denature and Dilute Libraries Guide. Retrieved July 18, 2016, from https://support.illumina.com/content/dam/illumina-support/documents/documentation/system_documentation/miseq/miseq-denature-dilute-libraries-guide-15039740-01.pdf

Illumina. (2016c). TruSeq Custom Amplicon v1.5 Reference Guide. Retrieved May 18, 2016, from https://support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/samplepreps_truseq/truseqcustomamplicon/truseq-custom-amplicon-15-reference-guide-15027983-02.pdf

Illumina. (2016d). Using a PhiX Control for HiSeq Sequencing Runs. Retrieved August 20, 2017, from https://www.illumina.com/content/dam/illumina-marketing/documents/products/technotes/hiseq-phix-control-v3-technical-note.pdf

Innerarity, T. L., Mahley, R. W., Weisgraber, K. H., Bersot, T. P., Krauss, R. M., Vega, G. L., … McCarthy, B. J. (1990). Familial defective apolipoprotein B-100: a mutation of apolipoprotein B that causes hypercholesterolemia. Journal of Lipid Research, 31, 1337–1349.

Jennings, L. J., Arcila, M. E., Corless, C., Kamel-Reid, S., Lubin, I. M., Pfeifer, J., … Nikiforova, M. N. (2017). Guidelines for Validation of Next-Generation Sequencing–Based Oncology Panels: A Joint Consensus Recommendation of the Association for Molecular Pathology and College of American Pathologists. Journal of Molecular Diagnostics. https://doi.org/10.1016/j.jmoldx.2017.01.011

Johansen, C. T., Wang, J., Lanktree, M. B., Cao, H., McIntyre, A. D., Ban, M. R., … Hegele, R. A. (2010). Excess of rare variants in genes identified by genome-wide association study of hypertriglyceridemia. Nature Genetics, 42(8), 684–687. https://doi.org/10.1038/ng.628

Jun, M., Foote, C., Lv, J., Neal, B., Patel, A., Nicholls, S. J., … al., et. (2010). Effects of fibrates on cardiovascular outcomes: a systematic review and meta-analysis. Lancet (London, England), 375(9729), 1875–84. https://doi.org/10.1016/S0140-6736(10)60656-3

Kerr, M., Pears, R., Miedzybrodzka, Z., Haralambos, K., Cather, M., Watson, M., & Humphries, S. E. (2017). Cost effectiveness of cascade testing for familial

Page 148: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

130

hypercholesterolaemia, based on data from familial hypercholesterolaemia services in the UK. European Heart Journal, 38(23), 1832–1839. https://doi.org/10.1093/eurheartj/ehx111

Khachadurian, A. K. (1988). Clinical features, diagnosis and frequency of familial hypercholesterolemia. Beitrage Zur Infusionstherapie = Contributions to Infusion Therapy, 23, 26–32.

KHACHADURIAN, A. K. (1964). THE INHERITANCE OF ESSENTIAL FAMILIAL HYPERCHOLESTEROLEMIA. The American Journal of Medicine, 37, 402–407.

Kirke, A., Watts, G. F., & Emery, J. (2012). Detecting familial hypercholesterolaemia in general practice. Australian Family Physician, 41(12), 965–968.

Koivisto, P. V, Koivisto, U. M., Miettinen, T. A., & Kontula, K. (1992). Diagnosis of heterozygous familial hypercholesterolemia. DNA analysis complements clinical examination and analysis of serum lipid levels. Arteriosclerosis and Thrombosis : A Journal of Vascular Biology, 12(5), 584–592.

Krampis, K., & Wultsch, C. (2015). A Review of Cloud Computing Bioinformatics Solutions for Next-Gen Sequencing Data Analysis and Research. Methods in Next Generation Sequencing, 2(1). https://doi.org/10.1515/mngs-2015-0003

Kricka, L. J., Carter, T. J. N., Burt, S. M., Kennedy, J. H., Holder, R. L., Halliday, M. I., … Wisdom, G. B. (1980). Variability in the adsorption properties of microtitre plates used as solid supports in enzyme immunoassay. Clinical Chemistry, 26(6), 741–744.

Kruth, H. S. (1985). Lipid deposition in human tendon xanthoma. The American Journal of Pathology, 121(2), 311–315.

Kusters, D. M., De Beaufort, C., Widhalm, K., Guardamagna, O., Bratina, N., Ose, L., & Wiegman, A. (2012). Paediatric screening for hypercholesterolaemia in Europe. Archives of Disease in Childhood. https://doi.org/10.1136/archdischild-2011-300081

Kwon, H. J., Lagace, T. A., McNutt, M. C., Horton, J. D., & Deisenhofer, J. (2008). Molecular basis for LDL receptor recognition by PCSK9. Proceedings of the National Academy of Sciences of the United States of America, 105(6), 1820–1825. https://doi.org/10.1073/pnas.0712064105

Kzhyshkowska, J., Neyen, C., & Gordon, S. (2012). Role of macrophage scavenger receptors in atherosclerosis. Immunobiology, 217(5), 492–502. https://doi.org/10.1016/j.imbio.2012.02.015

Lam, M. C. W., Singham, J., Hegele, R. A., Riazy, M., Hiob, M. A., Francis, G., & Steinbrecher, U. P. (2012). Familial hypobetalipoproteinemia-induced nonalcoholic steatohepatitis. Case Reports in Gastroenterology, 6(2), 429–437. https://doi.org/10.1159/000339761

Law, S. W., Lee, N., Monge, J. C., Brewer Jr., H. B., Sakaguchi, A. Y., & Naylor, S. L. (1985). Human ApoB-100 gene resides in the p23----pter region of chromosome 2. Biochemical and Biophysical Research Communications, 131(2), 1003–1012. https://doi.org/0006-291X(85)91339-7 [pii]

Lee, W. K., Haddad, L., Macleod, M. J., Dorrance, a M., Wilson, D. J., Gaffney, D., … Dominiczak, a F. (1998). Identification of a common low density lipoprotein receptor mutation (C163Y) in the west of Scotland. Journal of Medical Genetics, 35(7), 573–8. https://doi.org/10.1136/jmg.35.7.573

Page 149: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

131

Leek, J. T., Scharpf, R. B., Bravo, H. C., Simcha, D., Langmead, B., Johnson, W. E., … Irizarry, R. A. (2010). Tackling the widespread and critical impact of batch effects in high-throughput data. Nature Reviews Genetics. https://doi.org/10.1038/nrg2825

Leigh, S. E. A., Leren, T. P., & Humphries, S. E. (2009, March). Commentary PCSK9 variants: A new database. Atherosclerosis. Ireland. https://doi.org/10.1016/j.atherosclerosis.2009.02.006

Leigh, S. E., Foster, A. H., Whittall, R. A., Hubbart, C. S., & Humphries, S. E. (2008). Update and analysis of the University College London low density lipoprotein receptor familial hypercholesterolemia database. Ann Hum Genet, 72(Pt 4), 485–498. https://doi.org/10.1111/j.1469-1809.2008.00436.x

Leren, T. P. (2004). Mutations in the PCSK9 gene in Norwegian subjects with autosomal dominant hypercholesterolemia. Clinical Genetics, 65(5), 419–422. https://doi.org/10.1111/j.0009-9163.2004.0238.x

Leren, T. P., Finborud, T. H., Manshaus, T. E., Ose, L., & Berge, K. E. (2008). Diagnosis of familial hypercholesterolemia in general practice using clinical diagnostic criteria or genetic testing as part of cascade genetic screening. Community Genetics, 11(1), 26–35. https://doi.org/10.1159/000111637

Leren, T. P., Manshaus, T., Skovholt, U., Skodje, T., Nossen, I. E., Teie, C., … Bakken, K. S. (2004). Application of Molecular Genetics for Diagnosing Familial Hypercholesterolemia in Norway: Results from a Family-Based Screening Program. Seminars in Vascular Medicine. https://doi.org/10.1055/s-2004-822989

Li, H., & Durbin, R. (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25(14), 1754–1760. https://doi.org/10.1093/bioinformatics/btp324

Li, H., & Durbin, R. (2010). Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics, 26(5), 589–595. https://doi.org/10.1093/bioinformatics/btp698

Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., … Durbin, R. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25(16), 2078–2079. https://doi.org/10.1093/bioinformatics/btp352

Liao, Y.-C., Lin, H.-F., Rundek, T., Cheng, R., Hsi, E., Sacco, R. L., & Juo, S.-H. H. (2008). Multiple genetic determinants of plasma lipid levels in Caribbean Hispanics. Clinical Biochemistry, 41(4–5), 306–12. https://doi.org/10.1016/j.clinbiochem.2007.11.011

Libbrecht, M. W., & Noble, W. S. (2015). Machine learning applications in genetics and genomics. Nature Reviews. Genetics, 16(6), 321–332. https://doi.org/10.1038/nrg3920

Libby, P. (2002). Inflammation in atherosclerosis. Nature. https://doi.org/10.1038/nature01323

Loh, E. (2018). Medicine and the rise of the robots: a qualitative review of recent advances of artificial intelligence in health. BMJ Leader. Retrieved from http://bmjleader.bmj.com/content/early/2018/06/01/leader-2018-000071.abstract

Long, E., Lin, H., Liu, Z., Wu, X., Wang, L., Jiang, J., … Chen, J. (2017). An artificial intelligence platform for the multihospital collaborative management of congenital

Page 150: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

132

cataracts. Nature Biomedical Engineering, 1(2), 24.

Lundholt, B. K., Scudder, K. M., & Pagliaro, L. (2003). A simple technique for reducing edge effect in cell-based assays. Journal of Biomolecular Screening, 8(5), 566–570. https://doi.org/10.1177/1087057103256465

Maglio, C., Mancina, R. M., Motta, B. M., Stef, M., Pirazzi, C., Palacios, L., … Romeo, S. (2014). Genetic diagnosis of familial hypercholesterolaemia by targeted next-generation sequencing. J Intern Med, 276(4), 396–403. https://doi.org/10.1111/joim.12263

Mahley, R. W. (1988). Apolipoprotein E: cholesterol transport protein with expanding role in cell biology. Science (New York, N.Y.), 240(4852), 622–630.

Majewski, J., Schwartzentruber, J., Lalonde, E., Montpetit, A., & Jabado, N. (2011). What can exome sequencing do for you? Journal of Medical Genetics. https://doi.org/10.1136/jmedgenet-2011-100223

Mäkelä, K.-M., Seppälä, I., Hernesniemi, J. A., Lyytikäinen, L.-P., Oksala, N., Kleber, M. E., … Lehtimäki, T. (2012). Genome-Wide Association Study Pinpoints a New Functional ApoB Variant Influencing Oxidized LDL Levels but Not Cardiovascular Events: AtheroRemo Consortium. Circulation: Cardiovascular Genetics. https://doi.org/10.1161/CIRCGENETICS.112.964965

Marais, A. D. (2004). Familial hypercholesterolaemia. The Clinical Biochemist. Reviews, 25(1), 49–68. https://doi.org/10.1016/S0140-6736(77)92173-0

Marais, A. D., & Firth, J. C. (2005). The diagnosis and management of familial hypercholesterolaemia. European Review for Medical and Pharmacological Sciences, 9(3), 141.

Marang-van de Mheen, P. J., Ten Asbroek, A. H. A., Bonneux, L., Bonsel, G. J., & Klazinga, N. S. (2002). Cost-effectiveness of a family and DNA based screening programme on familial hypercholesterolaemia in The Netherlands. European Heart Journal, 23(24), 1922–1930. https://doi.org/10.1053/euhj.2002.3281

Marduel, M., Ouguerram, K., Serre, V., Bonnefont-Rousselot, D., Marques-Pinheiro, A., Erik Berge, K., … Varret, M. (2013). Description of a Large Family with Autosomal Dominant Hypercholesterolemia Associated with the APOE p.Leu167del Mutation. Human Mutation, 34(1), 83–87. https://doi.org/10.1002/humu.22215

Marks, D., Thorogood, M., Farrer, J. M., & Humphries, S. E. (2004). Census of clinics providing specialist lipid services in the United Kingdom. J Public Health (Oxf), 26(4), 353–354. https://doi.org/10.1093/pubmed/fdh176

Marks, D., Thorogood, M., Neil, H. A., & Humphries, S. E. (2003). A review on the diagnosis, natural history, and treatment of familial hypercholesterolaemia. Atherosclerosis, 168(1), 1–14.

Marks, D., Wonderling, D., Thorogood, M., Lambert, H., Humphries, S. E., & Neil, H. A. (2000). Screening for hypercholesterolaemia versus case finding for familial hypercholesterolaemia: a systematic review and cost-effectiveness analysis. Health Technol Assess, 4(29), 1–123.

Marks, D., Wonderling, D., Thorogood, M., Lambert, H., Humphries, S. E., & Neil, H. A. W. (2002). Cost effectiveness analysis of different approaches of screening for familial hypercholesterolaemia. BMJ : British Medical Journal, 324(7349), 1303. Retrieved

Page 151: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

133

from http://www.ncbi.nlm.nih.gov/pmc/articles/PMC113765/

McIntyre, L. M., Lopiano, K. K., Morse, A. M., Amin, V., Oberg, A. L., Young, L. J., & Nuzhdin, S. V. (2011). RNA-seq: technical variability and sampling. BMC Genomics, 12, 293. https://doi.org/10.1186/1471-2164-12-293

Metha, R., Zubirán, R., Martagón, A. J., Vázquez Cárdenas, A., Segura Kato, Y., Tusié Luna, M. T., & Aguilar-Salinas, C. A. (2016). The panorama of familial hypercholesterolemia in Latin America: a systematic review. Journal of Lipid Research. https://doi.org/10.1194/jlr.R072231

Miller, C. E., Krautscheid, P., Baldwin, E. E., Tvrdik, T., Openshaw, A. S., Hart, K., & Lagrave, D. (2014). Genetic counselor review of genetic test orders in a reference laboratory reduces unnecessary testing. American Journal of Medical Genetics, Part A, 164(5), 1094–1101. https://doi.org/10.1002/ajmg.a.36453

Minton, J. A. L., Flanagan, S. E., & Ellard, S. (2011). Mutation surveyor: software for DNA sequence analysis. Methods in Molecular Biology (Clifton, N.J.), 688, 143–53. https://doi.org/10.1007/978-1-60761-947-5_10

Miyake, Y., Kimura, R., Kokubo, Y., Okayama, A., Tomoike, H., Yamamura, T., & Miyata, T. (2008). Genetic variants in PCSK9 in the Japanese population: Rare genetic variants in PCSK9 might collectively contribute to plasma LDL cholesterol levels in the general population. Atherosclerosis, 196(1), 29–36. https://doi.org/10.1016/j.atherosclerosis.2006.12.035

Moffatt, R. J., & Stamford, B. (2005). Lipid Metabolism and Health. CRC Press. Retrieved from https://books.google.co.uk/books?id=2_LLBQAAQBAJ

Mooney, H. (2011). Families at risk of genetically high cholesterol are not being spotted, says royal college. BMJ (Clinical Research Ed.). https://doi.org/10.1136/bmj.d520

Muir, L. A., George, P. M., Laurie, A. D., Reid, N., & Whitehead, L. (2010). Preventing cardiovascular disease: A review of the effectiveness of identifying the people with familial hypercholesterolaemia in New Zealand. New Zealand Medical Journal.

Müller, C. (1938). Xanthomata, Hypercholesterolemia, Angina Pectoris. Acta Medica Scandinavica, 95(89 S), 75–84. https://doi.org/10.1111/j.0954-6820.1938.tb19279.x

National Heart, Lung, and B. I. (2011). Expert Panel on Integrated Guidelines for Cardiovascular Health and Risk Reduction in Children and Adolescents: Summary Report. Pediatrics. https://doi.org/10.1542/peds.2009-2107C

National Institute for Health and Clinical Excellence. (2017). Familial hypercholesterolaemia: identification and management. Retrieved June 28, 2018, from https://www.nice.org.uk/guidance/cg71/chapter/Recommendations

National Society of Genetic Counselors’ Definition Task Force, Resta, R., Biesecker, B. B., Bennett, R. L., Blum, S., Hahn, S. E., … Williams, J. L. (2006). A new definition of Genetic Counseling: National Society of Genetic Counselors’ Task Force report. Journal of Genetic Counseling, 15(2), 77–83. https://doi.org/10.1007/s10897-005-9014-3

Neale, B. M., Rivas, M. A., Voight, B. F., Altshuler, D., Devlin, B., Orho-Melander, M., … Daly, M. J. (2011). Testing for an unusual distribution of rare variants. PLoS Genetics, 7(3). https://doi.org/10.1371/journal.pgen.1001322

Neil, H. A., Hammond, T., Huxley, R., Matthews, D. R., & Humphries, S. E. (2000). Extent

Page 152: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

134

of underdiagnosis of familial hypercholesterolaemia in routine practice: prospective registry study. Bmj, 321(7254), 148.

Newman, T. B., & Garber, A. M. (2000). Cholesterol screening in children and adolescents. Pediatrics, 105(3 Pt 1), 637–638.

Nguyen, P., Leray, V., Diez, M., Serisier, S., Le Bloc’H, J., Siliart, B., & Dumon, H. (2008). Liver lipid metabolism. Journal of Animal Physiology and Animal Nutrition, 92(3), 272–283. https://doi.org/10.1111/j.1439-0396.2007.00752.x

NHS England. (2017). Global Digital Exemplars. Retrieved June 3, 2018, from https://www.england.nhs.uk/digitaltechnology/info-revolution/exemplars/

Nicholls, D. P., Nicholls, P., & Young, I. (2009). Lipid Disorders. OUP Oxford. Retrieved from https://books.google.co.uk/books?id=KQIYNF2tKR0C

Nicholls, P., Young, I., Lyttle, K., & Graham, C. (2001, April). Screening for familial hypercholesterolaemia. Early identification and treatment of patients is important. BMJ (Clinical Research Ed.). England.

Nicholls, S. G., Wilson, B. J., Craigie, S. M., Etchegary, H., Castle, D., Carroll, J. C., … Scherer, S. (2013). Public attitudes towards genomic risk profiling as a component of routine population screening 1. Genome, 56(10), 626–633. https://doi.org/10.1139/gen-2013-0070

Nordestgaard, B. G., & Benn, M. (2017). Genetic testing for familial hypercholesterolaemia is essential in individuals with high LDL cholesterol: who does it in the world? European Heart Journal, 38(20), 1580–1583. https://doi.org/10.1093/eurheartj/ehx136

Nordestgaard, B. G., Chapman, M. J., Humphries, S. E., Ginsberg, H. N., Masana, L., Descamps, O. S., … Tybjærg-Hansen, A. (2013). Familial hypercholesterolaemia is underdiagnosed and undertreated in the general population: Guidance for clinicians to prevent coronary heart disease. European Heart Journal, 34(45). https://doi.org/10.1093/eurheartj/eht273

Norsworthy, P. J., Vandrovcova, J., Thomas, E. R. A., Campbell, A., Kerr, S. M., Biggs, J., … Aitman, T. J. (2014). Targeted genetic testing for familial hypercholesterolaemia using next generation sequencing: A population-based study. BMC Medical Genetics. https://doi.org/10.1186/1471-2350-15-70

Oliver, D. G., Sanders, A. H., Douglas Hogg, R., & Woods Hellman, J. (1981). Thermal gradients in microtitration plates. Effects on enzyme-linked immunoassay. Journal of Immunological Methods, 42(2), 195–201. https://doi.org/10.1016/0022-1759(81)90149-6

Ooi, E. M. M., Barrett, P. H. R., & Watts, G. F. (2013). The extended abnormalities in lipoprotein metabolism in familial hypercholesterolemia: Developing a new framework for future therapies. International Journal of Cardiology, 168(3), 1811–1818. https://doi.org/10.1016/j.ijcard.2013.06.069

Pajak, A., Szafraniec, K., Polak, M., Drygas, W., Piotrowski, W., Zdrojewski, T., & Jankowski, P. (2016). Prevalence of familial hypercholesterolemia: A meta-analysis of six large, observational, population-based studies in Poland. Archives of Medical Science. https://doi.org/10.5114/aoms.2016.59700

Palacios, L., Grandoso, L., Cuevas, N., Olano-Martin, E., Martinez, A., Tejedor, D., & Stef,

Page 153: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

135

M. (2012). Molecular characterization of familial hypercholesterolemia in Spain. Atherosclerosis, 221(1), 137–142. https://doi.org/10.1016/j.atherosclerosis.2011.12.021

Pandit, S., Wisniewski, D., Santoro, J. C., Ha, S., Ramakrishnan, V., Cubbon, R. M., … Fisher, T. S. (2008). Functional analysis of sites within PCSK9 responsible for hypercholesterolemia. Journal of Lipid Research, 49(6), 1333–1343. https://doi.org/10.1194/jlr.M800049-JLR200

Pears, R., Griffin, M., Watson, M., Wheeler, R., Hilder, D., Meeson, B., … Byrne, C. D. (2014). The reduced cost of providing a nationally recognised service for familial hypercholesterolaemia. Open Heart, 1(1). https://doi.org/10.1136/openhrt-2013-000015

Peng, Q., Vijaya Satya, R., Lewis, M., Randad, P., & Wang, Y. (2015). Reducing amplification artifacts in high multiplex amplicon sequencing by using molecular barcodes. BMC Genomics, 16(1). https://doi.org/10.1186/s12864-015-1806-8

Pijlman, A., Huijgen, R., Verhagen, S. N., Imholz, B. P. M., Liem, a H., Kastelein, J. J. P., … Visseren, F. L. J. (2010). Evaluation of cholesterol lowering treatment of patients with familial hypercholesterolemia: a large cross-sectional study in The Netherlands. Atherosclerosis, 209(1), 189–94. https://doi.org/10.1016/j.atherosclerosis.2009.09.014

Pisciotta, L., Priore Oliva, C., Pes, G. M., Di Scala, L., Bellocchio, A., Fresa, R., … Bertolini, S. (2006). Autosomal recessive hypercholesterolemia (ARH) and homozygous familial hypercholesterolemia (FH): a phenotypic comparison. Atherosclerosis, 188(2), 398–405. https://doi.org/10.1016/j.atherosclerosis.2005.11.016

Pocovi, M., Civeira, F., Alonso, R., & Mata, P. (2004). Familial Hypercholesterolemia in Spain: Case-Finding Program, Clinical and Genetic Aspects. Seminars in Vascular Medicine. https://doi.org/10.1055/s-2004-822988

Practitioners, R. A. C. of G. (2009). Guidelines for preventive activities in general practice / [prepared by the Royal Australian College of General Practitioners “Red Book” Taskforce]. (R. A. C. of G. Practitioners & R. A. C. of G. P. R. B. Taskforce, Eds.). South Melbourne, Vic: Royal Australian College of General Practitioners.

Prime Minister’s office, & The Rt Hon Theresa May MP. (2018). PM to set out ambitious plans to transform outcomes for people with chronic diseases. Artificial intelligence will help to prevent 22,000 cancer deaths each year by 2033. Retrieved June 20, 2018, from https://www.gov.uk/government/news/pm-to-set-out-ambitious-plans-to-transform-outcomes-for-people-with-chronic-diseases

QIAgen. (2012). QIAprep Spin Miniprep Kit. Http://Www.Qiagen.Com/Products/Plasmid/ QIAprepMiniprepSystem/QIAprepSpinMiniprepKit.Aspx, (February), 12–13.

Rader, D. J., Cohen, J., & Hobbs, H. H. (2003). Monogenic hypercholesterolemia: New insights in pathogenesis and treatment. Journal of Clinical Investigation. https://doi.org/10.1172/JCI200318925

Radovica-Spalvina, I., Latkovskis, G., Silamikelis, I., Fridmanis, D., Elbere, I., Ventins, K., … Klovins, J. (2015). Next-generation-sequencing-based identification of familial hypercholesterolemia-related mutations in subjects with increased LDL–C levels in a latvian population. BMC Medical Genetics, 16(1), 86. https://doi.org/10.1186/s12881-015-0230-x

Page 154: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

136

Raper, A., Kolansky, D. M., & Cuchel, M. (2012). Treatment of familial hypercholesterolemia: Is there a need beyond Statin therapy? Current Atherosclerosis Reports, 14(1), 11–16. https://doi.org/10.1007/s11883-011-0215-y

Reiner, Ž. (2015). Management of patients with familial hypercholesterolaemia. Nature Reviews Cardiology, 12(10), 565–575. https://doi.org/10.1038/nrcardio.2015.92

Reynolds, C. A., Hong, M. G., Eriksson, U. K., Blennow, K., Wiklund, F., Johansson, B., … Prince, J. A. (2010). Analysis of lipid pathway genes indicates association of sequence variation near SREBF1/TOM1L2/ATPAF2 with dementia risk. Human Molecular Genetics, 19(10), 2068–2078. https://doi.org/10.1093/hmg/ddq079

Risk of fatal coronary heart disease in familial hypercholesterolaemia. Scientific Steering Committee on behalf of the Simon Broome Register Group. (1991). Bmj, 303(6807), 893–896.

Russell, D. W. (1992, April). Cholesterol biosynthesis and metabolism. Cardiovascular Drugs and Therapy. United States.

Salazar, L. A., Hirata, M. H., Cavalli, S. A., Nakandakare, E. R., Forti, N., Diament, J., … Hirata, R. D. C. (2002). Molecular basis of familial hypercholesterolemia in Brazil: Identification of seven novel LDLR gene mutations. Human Mutation, 19(4), 462–463. https://doi.org/10.1002/humu.9032

Samorodnitsky, E., Jewell, B. M., Hagopian, R., Miya, J., Wing, M. R., Lyon, E., … Roychowdhury, S. (2015). Evaluation of Hybridization Capture Versus Amplicon-Based Methods for Whole-Exome Sequencing. Human Mutation, 36(9), 903–914. https://doi.org/10.1002/humu.22825

Sass, C., Giroux, L. M., Ma, Y., Roy, M., Lavigne, J., Lussier-Cacan, S., … Minnich, A. (2004). Evidence for a cholesterol-lowering gene in a French-Canadian kindred with familial hypercholesterolemia. Human Genetics, 96(1), 21–26. https://doi.org/10.1007/BF00214181

Scartezini, M., Hubbart, C., Whittall, R. a, Cooper, J. a, Neil, A. H. W., & Humphries, S. E. (2007). The PCSK9 gene R46L variant is associated with lower plasma lipid levels and cardiovascular risk in healthy U.K. men. Clinical Science (London, England : 1979), 113(11), 435–41. https://doi.org/10.1042/CS20070150

Science and Technology Committe. House of Commons. (2018). Genomics and genome editing in the NHS : Third Report of session 2017-19. Volume 1, Report. Retrieved May 19, 2018, from http://www.publicinformationonline.com/download/173998

Scriver, C. R. (1995). The metabolic and molecular bases of inherited disease. McGraw-Hill, Health Professions Division. Retrieved from http://books.google.co.uk/books?id=b5ue-5ZMzxwC

Segrest, J. P., Jones, M. K., De Loof, H., & Dashti, N. (2001). Structure of apolipoprotein B-100 in low density lipoproteins. Journal of Lipid Research, 42, 1346–67. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/11518754

Seidah, N. G., Benjannet, S., Wickham, L., Marcinkiewicz, J., Jasmin, S. B., Stifani, S., … Chretien, M. (2003). The secretory proprotein convertase neural apoptosis-regulated convertase 1 (NARC-1): liver regeneration and neuronal differentiation. Proceedings of the National Academy of Sciences of the United States of America, 100(3), 928–933. https://doi.org/10.1073/pnas.0335507100

Page 155: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

137

Seidah, N. G., & Prat, A. (2007). The proprotein convertases are potential targets in the treatment of dyslipidemia. Journal of Molecular Medicine. https://doi.org/10.1007/s00109-007-0172-7

Sharma, P., Boyers, D., Boachie, C., Stewart, F., Miedzybrodzka, Z., Simpson, W., … Mowatt, G. (2012). Elucigene FH20 and LIPOchip for the diagnosis of familial hypercholesterolaemia: a systematic review and economic evaluation. Health Technology Assessment (Winchester, England), 16(17), 1–266. https://doi.org/10.3310/hta16170

Simó, C., Cifuentes, A., & García-Cañas, V. (2014). Fundamentals of Advanced Omics Technologies: From Genes to Metabolites. Elsevier Science. Retrieved from https://books.google.co.uk/books?id=s75CAgAAQBAJ

Sjouke, B., Kusters, D. M., Kastelein, J. J., & Hovingh, G. K. (2011). Familial hypercholesterolemia: present and future management. Curr Cardiol Rep, 13(6), 527–536. https://doi.org/10.1007/s11886-011-0219-9

Sjouke, B., Kusters, D. M., Kindt, I., Besseling, J., Defesche, J. C., Sijbrands, E. J. G., … Hovingh, G. K. (2015). Homozygous autosomal dominant hypercholesterolaemia in the Netherlands: Prevalence, genotype-phenotype relationship, and clinical outcome. European Heart Journal, 36(9), 560–565. https://doi.org/10.1093/eurheartj/ehu058

Slack, J. (1969). Risks of ischaemic heart-disease in familial hyperlipoproteinaemic states. Lancet, 2(7635), 1380–1382.

Solanas-Barca, M., de Castro-Orós, I., Mateo-Gallego, R., Cofán, M., Plana, N., Puzo, J., … Cenarro, A. (2012). Apolipoprotein E gene mutations in subjects with mixed hyperlipidemia and a clinical diagnosis of familial combined hyperlipidemia. Atherosclerosis, 222(2), 449–455. https://doi.org/10.1016/j.atherosclerosis.2012.03.011

Soria, L. F., Ludwig, E. H., Clarke, H. R., Vega, G. L., Grundy, S. M., & McCarthy, B. J. (1989). Association between a specific apolipoprotein B mutation and familial defective apolipoprotein B-100. Proc Natl Acad Sci U S A, 86(2), 587–591.

Soutar, A. (2009). Regulation of the LDL receptor in familial hypercholesterolemia. Clinical Lipidology, 4(6), 755–765.

Soutar, A. K., & Naoumova, R. P. (2007). Mechanisms of disease: genetic causes of familial hypercholesterolemia. Nature Clinical Practice Cardiovascular Medicine, 4(4), 214–225.

Stein, L. D. (2010). The case for cloud computing in genome informatics. Genome Biology. https://doi.org/10.1186/gb-2010-11-5-207

Stein, L. D., Knoppers, B. M., Campbell, P., Getz, G., & Korbel, J. O. (2015). Create a cloud commons. Nature, 523(7559), 149–151. https://doi.org/10.1038/523149a

Stocker, R. (2004). Role of Oxidative Modifications in Atherosclerosis. Physiological Reviews, 84(4), 1381–1478. https://doi.org/10.1152/physrev.00047.2003

Stone, N. J., Levy, R. I., Fredrickson, D. S., & Verter, J. (1974). Coronary artery disease in 116 kindred with familial type II hyperlipoproteinemia. Circulation, 49(3), 476–488.

Sturm, A. C. (2014). The Role of Genetic Counselors for Patients with Familial Hypercholesterolemia. Current Genetic Medicine Reports, 2(2), 68–74. https://doi.org/10.1007/s40142-014-0036-8

Page 156: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

138

Sun, X. M., Eden, E. R., Tosi, I., Neuwirth, C. K., Wile, D., Naoumova, R. P., & Soutar, A. K. (2005). Evidence for effect of mutant PCSK9 on apolipoprotein B secretion as the cause of unusually severe dominant hypercholesterolaemia. Human Molecular Genetics, 14(9), 1161–1169. https://doi.org/10.1093/hmg/ddi128

Talmud, P. J., Shah, S., Whittall, R., Futema, M., Howard, P., Cooper, J. A., … Humphries, S. E. (2013). Use of low-density lipoprotein cholesterol gene score to distinguish patients with polygenic and monogenic familial hypercholesterolaemia: A case-control study. The Lancet, 381(9874), 1293–1301. https://doi.org/10.1016/S0140-6736(12)62127-8

Taylor, A., Wang, D., Patel, K., Whittall, R., Wood, G., Farrer, M., … Humphries, S. E. (2010). Mutation detection rate and spectrum in familial hypercholesterolaemia patients in the UK pilot cascade project. Clin Genet, 77(6), 572–580. https://doi.org/10.1111/j.1399-0004.2009.01356.x

Thermo Fisher Scientific. (2010). NanoDrop 1000 Spectrophotometer V3.8 User ’ s Manual. Retrieved June 19, 2017, from http://tools.thermofisher.com/content/sfs/manuals/nd-1000-v3.8-users-manual-8 5x11.pdf

Thermo Fisher Scientific. (2014). Qubit 3.0 Fluorometer. Retrieved June 14, 2017, from https://www.thermofisher.com/content/dam/LifeTech/global/life-sciences/Laboratory Instruments/Files/1014/MAN0010866_Qubit 3.0_RevA_UG_22Sep2014.pdf

Thermo Fisher Scientific. (2016). BigDyeTMTerminator v1.1 Cycle Sequencing Kit User Guide. Retrieved June 20, 2017, from https://tools.thermofisher.com/content/sfs/manuals/cms_041330.pdf

Thomas, E. R. A. (2015). The past, present and future of molecular genetic diagnosis in familial hypercholesterolemia. Clinical Lipidology, 10(5), 379–385. https://doi.org/10.2217/clp.15.29

Timms, K. M., Wagner, S., Samuels, M. E., Forbey, K., Goldfine, H., Jammalapati, S., … Shattuck, D. M. (2004). A mutation in PCSK9 causing autosomal-dominant hypercholesterolemia in a Utah pedigree. Human Genetics, 114(4), 349–353. https://doi.org/10.1007/s00439-003-1071-9

Tosi, I., Toledo-Leiva, P., Neuwirth, C., Naoumova, R. P., & Soutar, A. K. (2007). Genetic defects causing familial hypercholesterolaemia: Identification of deletions and duplications in the LDL-receptor gene and summary of all mutations found in patients attending the Hammersmith Hospital Lipid Clinic. Atherosclerosis, 194(1), 102–111. https://doi.org/10.1016/j.atherosclerosis.2006.10.003

Tybjærg-Hansen, A., Steffensen, R., Meinertz, H., Schnohr, P., & Nordestgaard, B. G. (1998). Association of Mutations in the Apolipoprotein B Gene with Hypercholesterolemia and the Risk of Ischemic Heart Disease. New England Journal of Medicine, 338(22), 1577–1584. https://doi.org/10.1056/NEJM199805283382203

Umans-Eckenhausen, M. A. W., Defesche, J. C., Sijbrands, E. J. G., Scheerder, R. L. J. M., & Kastelein, J. J. P. (2001). Review of first 5 years of screening for familial hypercholesterolaemia in the Netherlands. Lancet, 357(9251), 165–168. https://doi.org/10.1016/S0140-6736(00)03587-X

Usifo, E., Leigh, S. E., Whittall, R. A., Lench, N., Taylor, A., Yeats, C., … Humphries, S. E. (2012). Low-density lipoprotein receptor gene familial hypercholesterolemia variant database: update and pathological assessment. Ann Hum Genet, 76(5), 387–401.

Page 157: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

139

https://doi.org/10.1111/j.1469-1809.2012.00724.x

Van Wijk, D. F., Sjouke, B., Figueroa, A., Emami, H., Van Der Valk, F. M., Macnabb, M. H., … Stroes, E. S. G. (2014). Nonpharmacological lipoprotein apheresis reduces arterial inflammation in familial hypercholesterolemia. Journal of the American College of Cardiology, 64(14), 1418–1426. https://doi.org/10.1016/j.jacc.2014.01.088

Vandrovcova, J., Thomas, E. R. a, Atanur, S. S., Norsworthy, P. J., Neuwirth, C., Tan, Y., … Aitman, T. J. (2013). The use of next-generation sequencing in clinical diagnosis of familial hypercholesterolemia. Genetics in Medicine : Official Journal of the American College of Medical Genetics, 15(May), 1–10. https://doi.org/10.1038/gim.2013.55

Varghese, M. J. (2014). Familial hypercholesterolemia: A review. Annals of Pediatric Cardiology, 7(2), 107–117. https://doi.org/10.4103/0974-2069.132478

Wadhera, R. K., Steen, D. L., Khan, I., Giugliano, R. P., & Foody, J. M. (2016). A review of low-density lipoprotein cholesterol, treatment strategies, and its impact on cardiovascular disease morbidity and mortality. Journal of Clinical Lipidology. https://doi.org/10.1016/j.jacl.2015.11.010

Wald, D. S., Bestwick, J. P., Morris, J. K., Whyte, K., Jenkins, L., & Wald, N. J. (2016). Child–Parent Familial Hypercholesterolemia Screening in Primary Care. New England Journal of Medicine, 375(17), 1628–1637. https://doi.org/10.1056/NEJMoa1602777

Wang, J., Ban, M. R., & Hegele, R. a. (2005). Multiplex ligation-dependent probe amplification of LDLR enhances molecular diagnosis of familial hypercholesterolemia. Journal of Lipid Research, 46(2), 366–372. https://doi.org/10.1194/jlr.D400030-JLR200

Wang, K., Li, M., & Hakonarson, H. (2010). ANNOVAR: Functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Research, 38(16). https://doi.org/10.1093/nar/gkq603

Ward, A. J., O’Kane, M., Nicholls, D. P., Young, I. S., Nevin, N. C., & Graham, C. A. (1996). A novel single base deletion in the LDLR gene (211delG): Effect on serum lipid profiles and the influence of other genetic polymorphisms in the ACE, APOE and APOB genes. Atherosclerosis, 120(1–2), 83–91.

Watts, G. F., Sullivan, D. R., Poplawski, N., van Bockxmeer, F., Hamilton-Craig, I., Clifton, P. M., … Trent, R. (2011). Familial hypercholesterolaemia: A model of care for Australasia. Atherosclerosis Supplements. https://doi.org/10.1016/j.atherosclerosissup.2011.06.001

Weng, S. F., Reps, J., Kai, J., Garibaldi, J. M., & Qureshi, N. (2017). Can machine-learning improve cardiovascular risk prediction using routine clinical data? PloS One, 12(4), e0174944.

Wheeler, D. A., Srinivasan, M., Egholm, M., Shen, Y., Chen, L., McGuire, A., … Rothberg, J. M. (2008). The complete genome of an individual by massively parallel DNA sequencing. Nature, 452(7189), 872–876. https://doi.org/10.1038/nature06884

White, D. a, Bennett, A. J., Billett, M. a, & Salter, A. M. (1998). The assembly of triacylglycerol-rich lipoproteins: an essential role for the microsomal triacylglycerol transfer protein. The British Journal of Nutrition, 44(3), 219–229. https://doi.org/S0007114598001263 [pii]

Page 158: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

140

Whitfield, A. J., Barrett, P. H. R., Van Bockxmeer, F. M., & Burnett, J. R. (2004). Lipid disorders and mutations in the APOB gene. Clinical Chemistry. https://doi.org/10.1373/clinchem.2004.038026

Whittall, R. a, Scartezini, M., Li, K., Hubbart, C., Reiner, Z., Abraha, A., … Humphries, S. E. (2010). Development of a high-resolution melting method for mutation detection in familial hypercholesterolaemia patients. Annals of Clinical Biochemistry, 47(Pt 1), 44–55. https://doi.org/10.1258/acb.2009.009076

Whittington, R. (2008). Introduction to Health Economics Concepts - A Beginners Guide. Greenflint Limited. Retrieved from https://books.google.co.uk/books?id=7K0n_bxIsfoC

WHO, H. G. P. (1999). Familial hypercholesterolemia: Report of a second WHO consultation. WHO Geneva.

Wiegman, A., Gidding, S. S., Watts, G. F., Chapman, M. J., Ginsberg, H. N., Cuchel, M., … Watts, G. F. (2015). Familial hypercholesterolæmia in children and adolescents: Gaining decades of life by optimizing detection and treatment. European Heart Journal. https://doi.org/10.1093/eurheartj/ehv157

Wierzbicki, A. S., Humphries, S. E., & Minhas, R. (2008). Familial hypercholesterolaemia: summary of NICE guidance. BMJ (Clinical Research Ed.), 337. https://doi.org/10.1136/bmj.a1095

Wierzbicki, A. S., Humphries, S. E., & Minhas, R. (2008). Familial hypercholesterolaemia: summary of NICE guidance. Bmj, 337, a1095. https://doi.org/10.1136/bmj.a1095

Wilfinger, W. W., Mackey, K., & Chomczynski, P. (1997). Effect of pH and ionic strength on the spectrophotometric assessment of nucleic acid purity. BioTechniques, 22(3), 474–481.

Williams, R. R., Hunt, S. C., Schumacher, M. C., Hegele, R. A., Leppert, M. F., Ludwig, E. H., & Hopkins, P. N. (1993). Diagnosing heterozygous familial hypercholesterolemia using new practical criteria validated by molecular genetics. Am J Cardiol, 72(2), 171–176.

Wonderling, D., Umans-Eckenhausen, M. A. W., Marks, D., Defesche, J. C., Kastelein, J. J. P., & Thorogood, M. (2004). Cost-effectiveness analysis of the genetic screening program for familial hypercholesterolemia in The Netherlands. Seminars in Vascular Medicine, 4(1), 97–104. https://doi.org/10.1055/s-2004-822992

Wrzeszczynski, K. O., Frank, M. O., Koyama, T., Rhrissorrakrai, K., Robine, N., Utro, F., … Shah, M. (2017). Comparing sequencing assays and human-machine analyses in actionable genomics for glioblastoma. Neurology Genetics, 3(4), e164.

Wu, W. F., Sun, L. Y., Pan, X. D., Yang, S. W., & Wang, L. Y. (2014). Use of targeted exome sequencing in genetic diagnosis of chinese familial hypercholesterolemia. PLoS One, 9(4), e94697. https://doi.org/10.1371/journal.pone.0094697

Yan, B., Hu, Y., Ng, C., Ban, K. H. K., Tan, T. W., Huan, P. T., … Chng, W. J. (2016). Coverage analysis in a targeted amplicon-based next-generation sequencing panel for myeloid neoplasms. Journal of Clinical Pathology, 69(9), 801–804. https://doi.org/10.1136/jclinpath-2015-203580

Yohe, S., & Thyagarajan, B. (2017). Review of Clinical Next-Generation Sequencing. Archives of Pathology & Laboratory Medicine, 141(11), 1544–1557.

Page 159: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

141

https://doi.org/10.5858/arpa.2016-0501-RA

Yoshida, T., Kato, K., Yokoi, K., Watanabe, S., Metoki, N., Satoh, K., … Yamada, Y. (2009). Association of candidate gene polymorphisms with chronic kidney disease in Japanese individuals with hypertension. Hypertension Research, 32(5), 411–418. https://doi.org/10.1038/hr.2009.22

Zhang, D. W., Lagace, T. A., Garuti, R., Zhao, Z., McDonald, M., Horton, J. D., … Hobbs, H. H. (2007). Binding of proprotein convertase subtilisin/kexin type 9 to epidermal growth factor-like repeat A of low density lipoprotein receptor decreases receptor recycling and increases degradation. Journal of Biological Chemistry, 282(25), 18602–18612. https://doi.org/10.1074/jbc.M702027200

Zhao, N., Liu, C. C., Qiao, W., & Bu, G. (2018). Apolipoprotein E, Receptors, and Modulation of Alzheimer’s Disease. Biological Psychiatry. https://doi.org/10.1016/j.biopsych.2017.03.003

Page 160: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

142

Appendices

Appendix 1

UHS Sponsorship letter

Page 161: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

143

Page 162: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

144

Page 163: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

145

Page 164: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

146

Appendix 2

UHS R&D Confirmation of capacity & capability letter

Page 165: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

147

Page 166: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

148

Page 167: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

149

Page 168: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

150

Appendix 3

REC approval

Page 169: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

151

Page 170: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

152

Page 171: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

153

Appendix 4

HRA approval

Page 172: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

154

Page 173: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

155

Page 174: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

156

Page 175: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

157

Page 176: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

158

Page 177: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

159

Appendix 5

Excel spreadsheet with sample information and BaseSpace findings

Sample information with BaseSpace findings.

Excel spreadsheet with amplicon coverage

Amplicon coverage in FH-NGS.

Galaxy FH-NGS pipeline

Galaxy FH-NGS pipeline.

Page 178: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

160

Appendix 6

Illumina BaseSpace generated pdf report.

Page 179: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

161

Page 180: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

162

Page 181: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

163

Page 182: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

164

Page 183: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

165

Page 184: Application of next generation sequencing to aid in effective ......2019/04/12  · mutations in the three genes LDLR, APOB and PCSK9. Aim To set up Next Generation Sequencing (NGS)

166