Diversity of the emerging pneumococcal serotype 6C in the UK. SGM Autumn Conference - University of Nottingham, 6-9 September 2010. Rebecca Gladstone, 06 September 2010.

Upload: rebeccagladstone

Post on 12-Jul-2015


Page 1: Diversity Of The Emerging Pneumococcal Serotype 6C

Diversity of the emerging pneumococcal serotype 6C in the UK

SGM Autumn Conference - University of Nottingham

6-9 September 2010

Rebecca Gladstone

06 September 2010

Page 2: Diversity Of The Emerging Pneumococcal Serotype 6C

Overview

• Introduction – Why is pneumococcal serotype 6C diversity important?

• Collection of samples

• Strain selection - How were the strains chosen for further analysis?

• Whole genome sequencing

• Summary and future work

OK, just a quick overview of what I will be talking about today. Firstly, a brief introduction as to why 6C is important. I will describe the collection methods and the selection of strains for whole genome sequencing. Then I will go on to discuss the whole genome sequencing results, and finally I will summarise our findings.
Page 3: Diversity Of The Emerging Pneumococcal Serotype 6C

Introduction

• Streptococcus pneumoniae is a common coloniser of the nasopharynx but also a major cause of morbidity and mortality in the UK

• S. pneumoniae serotype 6C is a recently recognised serotype that arose from 6A

• Prevalence of 6C has significantly increased in our carriage study

• Driven by expansion of multi-locus sequence type 1692 (MLST: a genotyping tool based on the sequence of seven housekeeping genes)

• Capacity to cause invasive disease

• Not included in any conjugate vaccine

Carvalho et al., 2009, Cooper et al., 2010, Park et al., 2007a, Park et al., 2007b, Tocheva et al., 2010

- The pneumococcus colonises the nasopharynx of approximately 30% of healthy children <4 years old in the Southampton area, but it can also cause otitis media, pneumonia and meningitis.
- 6C was first recognised as a serotype distinct from 6A in 2007 by Park et al.
- Retrospective analysis revealed the earliest isolate confirmed to be 6C to be from 1979. In recent years 6C has been seen to increase in our carriage study, predominantly due to the expansion of multi-locus sequence type 1692.
- 6C is capable of causing disease; therefore an increase in carriage could potentially equate to a rise in invasive pneumococcal disease caused by this serotype, as was observed in the US.
- The polysaccharide conjugate vaccines Prevenar and Prevenar 13 do not include serotype 6C. However, recent reports suggest that the inclusion of 6A in Prevenar 13 will provide functional cross-protection against 6C.
Page 4: Diversity Of The Emerging Pneumococcal Serotype 6C

Sample collection and strain selection

• Nasopharyngeal specimens obtained from healthy children <4 years during the winter months of 2006-9

– 301 S. pneumoniae strains isolated
– 32 serotype 6C
– 17 sequence type 1692
– 15 strains selected, with representatives from each of the three winters, covering each of the 9 observed sequence types

Carvalho et al., 2009, Tocheva et al., 2010

• IPD isolates obtained from HPA SE regional microbiology laboratory, Southampton in 2004-2010

– 315 S. pneumoniae IPD strains
– 6 serotype 6C
– 4 most recent strains selected for analysis
– 3 ST1692

- During the winter months of 2006-2009, 301 nasopharyngeal Streptococcus pneumoniae strains were isolated.
- Of these, approximately 10% were 6C, with 14 6C isolates collected in each of the latter two winters.
- 17 of all the 6C isolates were ST1692.
- 15 strains were selected to cover each of the 9 observed multi-locus sequence types, with 6 ST1692 isolates among the selection.
- 6 of the 315 available IPD strains were determined to be 6C; those from 2006 to present were selected, and three out of the four were ST1692.
Page 5: Diversity Of The Emerging Pneumococcal Serotype 6C

Whole genome sequencing (WGS)

• Objectives of WGS study:

– the diversity of the serotype 6C

– the diversity of ST1692

– clinical relevance of any diversity

• Preliminary analysis:

– Confirmation of MLST

– Gene content

– SNPs

Hiller et al., 2007, Silva et al., 2006

We aimed to determine the diversity of the recently recognised serotype 6C and of the ST1692 clone driving its expansion. Additionally, we wish to determine the clinical relevance of any diversity. Our preliminary analysis involved comparing strains to our internal reference ST1692 isolate (SOT2073) or to the publicly available complete reference genome R6. This allowed the gene content and single nucleotide polymorphisms to be determined. Next we are planning to look at large-scale genome rearrangements with reference to published regions of diversity.
Page 6: Diversity Of The Emerging Pneumococcal Serotype 6C

Whole genome sequencing (WGS)

• 454 Genome Sequencer FLX System

• Whole genome shotgun methodology

• Analysis was performed via xBASE-NG (http://ng.xbase.ac.uk) and (http://xbase.ac.uk/annotation), utilising the following software

– De novo assembly (Newbler 2.5, Roche)

– Annotation (xBASE annotation pipeline)

– Mapping (Newbler gsMapper, Roche)

• Sequencing averages: 113,178 reads, 39,434,823 bases and 18.55× coverage

The 454 GS FLX system at the University of Birmingham was utilised with the whole genome shotgun methodology. Newbler was used for de novo assembly, the xBASE annotation pipeline was used for annotation, and mapping was performed with Newbler gsMapper. The sequencing averages were 113,178 reads, 39,434,823 bases and 18.55× coverage.
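The reported averages are linked by simple arithmetic, which the following minimal sketch makes explicit. It assumes coverage is defined as total bases divided by the length of the sequence being covered; the talk does not state which length was used, so the genome length derived here is only the value implied by the reported figures. The function names are illustrative.

```python
# Reported sequencing averages from the slide.
READS = 113_178
BASES = 39_434_823
COVERAGE = 18.55


def mean_read_length(total_bases, total_reads):
    """Average number of bases per read."""
    return total_bases / total_reads


def implied_genome_length(total_bases, fold_coverage):
    """Length implied by coverage = total_bases / length (an assumption)."""
    return total_bases / fold_coverage


def fold_coverage(total_bases, genome_length):
    """Fold coverage for a given total yield and target length."""
    return total_bases / genome_length


read_len = mean_read_length(BASES, READS)
genome_len = implied_genome_length(BASES, COVERAGE)
print(f"mean read length: {read_len:.0f} bases")
print(f"implied genome length: {genome_len / 1e6:.2f} Mb")
```

Under that assumption the averages correspond to roughly 348 bases per read and an implied length of a little over 2.1 Mb.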
Page 7: Diversity Of The Emerging Pneumococcal Serotype 6C

A plot of the gene-distance matrix: the number of genes not shared between any given pair

Image produced by Dr Nick Loman

Analysis of the number of genes not shared between any given pair produced this gene-distance matrix. The 19 isolates can be split into two genetic clusters. The first includes all ST1692 isolates together with ST395 and ST1714. These three STs, in clonal complex group 1, differ by only one base in the xpt allele. 78% of our observed 6C in carriage belonged to these sequence types. Interestingly, three ST1692 strains, two of which are from IPD, group between the ST395 and the ST1714 isolates, away from the other ST1692 isolates. As the isolates do not group exactly according to MLST alone, this suggests that further diversity exists within the STs. The second cluster is more heterogeneous, containing isolates from the other six sequence types.
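The distance described above, the number of genes not shared between a pair of isolates, can be sketched as the size of the symmetric difference between two gene sets. This is a minimal illustration, not the analysis used to produce the plot: the isolate names and gene identifiers are hypothetical, and it assumes each isolate's gene content has already been reduced to a set of comparable gene identifiers.

```python
from itertools import combinations


def gene_distance(genes_a, genes_b):
    """Number of genes present in exactly one of the two isolates."""
    return len(set(genes_a) ^ set(genes_b))


def gene_distance_matrix(gene_content):
    """Symmetric matrix (dict of dicts) of pairwise gene distances."""
    names = sorted(gene_content)
    matrix = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        distance = gene_distance(gene_content[a], gene_content[b])
        matrix[a][b] = matrix[b][a] = distance
    return matrix


# Hypothetical toy data, not taken from the study.
toy_content = {
    "isolate_A": {"g1", "g2", "g3", "g4"},
    "isolate_B": {"g1", "g2", "g3", "g5"},
    "isolate_C": {"g1", "g6", "g7"},
}
matrix = gene_distance_matrix(toy_content)
for name, row in matrix.items():
    print(name, row)
```

In this toy example isolates A and B sit close together (distance 2) while C is further from both, which is the kind of pattern that separates clusters in such a matrix.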
Page 8: Diversity Of The Emerging Pneumococcal Serotype 6C

SNPs – Isolates vs 2073 (ST1692)

Strain  Sequence Type  Total SNP  CDS    Non-Synonymous
2074    1692           81         4      3
3022    1692           233        51     29
2105    1692           217        49     30
802M    1692           202        77     45
954Q    1692           251        70     45
1058S   1692           237        77     46
0081    1692           189        80     48
3055    1692           304        102    70
2371    395            367        123    81
3074    1714           353        123    82
0237    1714           334        122    83
0113    65             16772      2042   636
2029    3460           18835      10360  2977
3060    1150           17824      11935  3466
3050    1600           16534      11551  3494
3088    398            17735      12275  3721
1060N   1150           19298      13403  3874
2300    1862           24959      17209  5054

Key: IPD strains, Carriage strains

SNPs were identified by comparing all isolates to 2073, an ST1692 isolate. Within sequence type 1692 we identified a number of putative non-synonymous SNPs, ranging from 3 to 70. As is to be expected, the more distantly related the ST is from 1692, the larger the number of non-synonymous SNPs. Closely related STs have similar numbers of SNPs; presumably many of these will be the same. Furthermore, when two ST1150 isolates ('in green'), one from carriage and one from disease, were compared to each other, we identified 151 putative non-synonymous SNPs, showing further diversity within an ST.
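The distinction between synonymous and non-synonymous SNPs in a coding sequence can be illustrated with a short sketch. The talk does not describe how the pipeline classified SNPs, so this is a generic illustration under stated assumptions: the standard genetic code, an in-frame coding sequence in upper case, and a 0-based SNP position. The toy sequence and the function name are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(ref_cds, position, alt_base):
    """Classify a single-base change at a 0-based position in an in-frame CDS."""
    codon_start = position - position % 3
    offset = position - codon_start
    ref_codon = ref_cds[codon_start:codon_start + 3]
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "synonymous"
    return "non-synonymous"


# Hypothetical toy coding sequence: ATG GCT GAA (Met-Ala-Glu).
toy_cds = "ATGGCTGAA"
print(classify_snp(toy_cds, 5, "C"))  # GCT -> GCC, both alanine
print(classify_snp(toy_cds, 6, "A"))  # GAA -> AAA, glutamate to lysine
```

Only changes that alter the encoded amino acid (or introduce or remove a stop codon) are counted as non-synonymous, which is why that column is always smaller than the CDS column in the table.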
Page 9: Diversity Of The Emerging Pneumococcal Serotype 6C

Novel genes

• Novel genes were detected by orthologue comparison of study strains and removal of any genes found in other completed S. pneumoniae genomes in GenBank.

– ermB and a gene encoding a tetracycline-resistance protein were unique to 2029 (ST3460). Erythromycin and tetracycline resistance were confirmed as functional.

– Two novel putative beta-lactamase genes in 2300 (ST1862) homologous to coding sequences in two plant pathogens (Ralstonia solanacearum and Dickeya dadantii).

– 6C group 1 strains contain a region encoding a lantibiotic biosynthesis protein and a lantibiotic efflux protein

Ding et al., 2009

We detected novel genes through orthologue comparison of all the studied strains and removal of any genes found in other completed Streptococcus pneumoniae genomes in GenBank. The majority of these genes were found in the ST3460 and ST1862 isolates. The presence of ermB and a tetracycline-resistance gene was unique to 2029. Erythromycin and tetracycline resistance were confirmed as functional. There were two novel putative beta-lactamase genes in 2300 that were homologous to coding sequences in two plant pathogens. All the strains in the first 6C cluster contained a region coding for a lantibiotic biosynthesis protein and an efflux protein, as has been described in a serotype 14 pneumococcus by Ding et al. This could give the cluster 1 6C strains a competitive advantage over their co-colonisers.
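The filtering step described above can be sketched as a set difference. This is a minimal illustration only: it assumes orthologue groups have already been assigned and that each genome is represented as a set of orthologue-group identifiers; the identifiers, strain names and function names are hypothetical and do not reflect the actual pipeline.

```python
def novel_genes(study_strains, reference_genomes):
    """Genes in each study strain that appear in no reference genome."""
    known = set().union(*reference_genomes.values())
    return {strain: set(genes) - known for strain, genes in study_strains.items()}


def unique_to_one_strain(novel_by_strain):
    """Novel genes that are found in exactly one of the study strains."""
    unique = {}
    for strain, genes in novel_by_strain.items():
        others = set().union(
            *(g for s, g in novel_by_strain.items() if s != strain)
        )
        unique[strain] = genes - others
    return unique


# Hypothetical toy data, not taken from the study.
study = {
    "strain_X": {"og1", "og2", "og8", "og9"},
    "strain_Y": {"og1", "og3", "og8"},
}
references = {
    "ref_1": {"og1", "og2"},
    "ref_2": {"og1", "og3"},
}
novel = novel_genes(study, references)
unique = unique_to_one_strain(novel)
print(novel)
print(unique)
```

Here og8 is novel relative to the references but shared by both study strains, while og9 is novel and unique to strain_X, mirroring the difference between cluster-wide regions and strain-specific genes.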
Page 10: Diversity Of The Emerging Pneumococcal Serotype 6C

Summary

• Two distinct 6C genetic clusters could be observed by gene-distance analysis

• The cluster that included ST1692 was more homogeneous than the second cluster

• MLST is a good indicator of genotype

• MLST can be determined from WGS however there is diversity within the STs of 6C

Two distinct 6C clonal complexes could be observed after gene-distance analysis. The clonal complex that included ST1692 was more homogeneous than the second; this diversity is due to both gene content and SNPs. The differential clustering of MLSTs based on gene distance suggests there is diversity below the level of MLST. The presence of up to 70 coding non-synonymous SNPs within ST1692, and 151 between two ST1150 isolates, supports this finding.
Page 11: Diversity Of The Emerging Pneumococcal Serotype 6C

Future work

• Analysis of insertions and deletions including previously reported regions of diversity

• Comparison with other streptococcal genomes

• Analysis of gene content/diversity for known virulence factors

• Significance of identified SNPs and novel genes

• Further comparison of disease and carriage isolates

Hava & Camilli, 2002; Hiller et al., 2007; Obert et al., 2006; Silva et al., 2006.

Future work will include the analysis of insertions and deletions, as it is known that the primary method of diversity generation in S. pneumoniae is via recombination. Comparisons of our data with other streptococcal genomes could inform on the extent of gene sharing between streptococcal species. We will also analyse the gene content and diversity of known virulence factors, including pneumolysin and pneumococcal surface proteins, determine the biological significance of identified SNPs and novel genes, and perform further comparisons of disease and carriage isolates to identify any properties unique to either class.
Page 12: Diversity Of The Emerging Pneumococcal Serotype 6C

Acknowledgements

University of Southampton

Dr Stuart Clarke

Dr Saul Faust

Dr Jo Jefferies

Anna Tocheva

Centre for Systems Biology

University of Birmingham

Professor Mark Pallen

Dr Nick Loman - Bioinformatics

Dr Chrystala Constantinidou - Sequencing

Mala Patel - Sequencing

I would like to acknowledge and thank those who made this project possible, especially those attending today: my supervisors Jo Jefferies and Stuart Clarke and, from our collaborators at the University of Birmingham, Nick Loman for his bioinformatic expertise.
Page 13: Diversity Of The Emerging Pneumococcal Serotype 6C

Thank you for your attention

SGM Autumn Conference - University of Nottingham, 6-9 September 2010

Rebecca Gladstone

06 September 2010

I have mountains more data to analyse so thank you for your time and watch this space!
Page 14: Diversity Of The Emerging Pneumococcal Serotype 6C

References

• xBASE-NG website http://ng.xbase.ac.uk
• Carvalho, M., Pimenta, F. C., Gertz, R. E., Jr., Joshi, H. H., Trujillo, A. A., Keys, L. E., Findley, J., Moura, I. S., Park, I. H. & other authors (2009). PCR-Based Quantitation and Clonal Diversity of the Current Prevalent Invasive Serogroup 6 Pneumococcal Serotype, 6C, in the United States in 1999 and 2006 to 2007. J Clin Microbiol 47, 554-559.
• Cooper, D., Yu, X., Sidhu, M., Nahm, M. H., Fernsten, P. & Jansen, K. U. (2010). Development of an opsonophagocytic assay to Streptococcus pneumoniae serotype 6C: Demonstration of cross-functional responses to 6C in Prevenar-13 immune sera. In ESPID. Nice.
• Ding, F., Tang, P., Hsu, M.-H., Cui, P., Hu, S., Yu, J. & Chiu, C.-H. (2009). Genome evolution driven by host adaptations results in a more virulent and antimicrobial-resistant Streptococcus pneumoniae serotype 14. BMC Genomics 10, 158.
• Hava, D. L. & Camilli, A. (2002). Large-scale identification of serotype 4 Streptococcus pneumoniae virulence factors. Mol Microbiol 45, 1389-1406.
• Hiller, N. L., Janto, B., Hogg, J. S., Boissy, R., Yu, S., Powell, E., Keefe, R., Ehrlich, N. E., Shen, K. & other authors (2007). Comparative genomic analyses of seventeen Streptococcus pneumoniae strains: insights into the pneumococcal supragenome. J Bacteriol 189, 8186-8195.
• Obert, C., Sublett, J., Kaushal, D., Hinojosa, E., Barton, T., Tuomanen, E. I. & Orihuela, C. J. (2006). Identification of a Candidate Streptococcus pneumoniae core genome and regions of diversity correlated with invasive pneumococcal disease. Infect Immun 74, 4766-4777.
• Park, I. H., Park, S., Hollingshead, S. K. & Nahm, M. H. (2007a). Genetic basis for the new pneumococcal serotype, 6C. Infect Immun 75, 4482-4489.
• Park, I. H., Pritchard, D. G., Cartee, R., Brandao, A., Brandileone, M. C. C. & Nahm, M. H. (2007b). Discovery of a New Capsular Serotype (6C) within Serogroup 6 of Streptococcus pneumoniae. J Clin Microbiol 45, 1225-1233.
• Park, I. H., Moore, M. R., Treanor, J. J., Pelton, S. I., Pilishvili, T., Beall, B., Shelly, M. A., Mahon, B. E. & Nahm, M. H. (2008). Differential effects of pneumococcal vaccines against serotypes 6A and 6C. J Infect Dis 198, 1818-1822.
• Silva, N. A., McCluskey, J., Jefferies, J. M. C., Hinds, J., Smith, A., Clarke, S. C., Mitchell, T. J. & Paterson, G. K. (2006). Genomic Diversity between Strains of the Same Serotype and Multilocus Sequence Type among Pneumococcal Clinical Isolates. Infect Immun 74, 3513-3518.
• Tocheva, A. S., Jefferies, J. M., Christodoulides, M., Faust, S. N. & Clarke, S. C. (2010). Increase in serotype 6C pneumococcal carriage, United Kingdom. Emerg Infect Dis 16, 154-155.

Page 15: Diversity Of The Emerging Pneumococcal Serotype 6C

Sequencing results

Strain   Number of reads  Number of bases  Coverage
0081     91237            28733390         13.58
0113     110623           38489470         17.02
0237     116400           38374563         17.87
2029     107703           35591041         15.76
2073     122684           41618252         19.66
2074     226938           74015324         34.98
2105     96988            32480840         15.37
2300     78683            23267452         10.55
2371     117005           35530396         16.64
3022     127553           39499836         18.68
3050     133283           40967776         18.87
3055     94317            31276976         14.90
3060     74967            25060487         12.20
3074     136821           45703125         21.42
3088     107105           36219264         17.44
0802M    117813           41369459         20.37
0954Q    167622           58368746         27.67
1058S    122634           43260411         20.91
1060N    124573           44306824         21.79
Average  113178           39434823         18.55

Key Lowest Highest

The highest and lowest numbers of reads, bases and coverage are highlighted. The averages were 113,178 reads and 39,434,823 bases; the average coverage was 18.55.
Page 16: Diversity Of The Emerging Pneumococcal Serotype 6C

Structural difference between serotype 6A and 6C

[Figure: ring structures of galactose and glucose]

• There is one structural difference between serotype 6A and serotype 6C

• There are two structural differences between serotype 6B and serotype 6C

6A [P→2) – galactose – (1→3) – glucose – (1→3) – rhamnose – (1→3) – ribitol – (5→P]

6B [P→2) – galactose – (1→3) – glucose – (1→3) – rhamnose – (1→4) – ribitol – (5→P]

6C [P→2) – glucose – (1→3) – glucose – (1→3) – rhamnose – (1→3) – ribitol – (5→P]

Page 17: Diversity Of The Emerging Pneumococcal Serotype 6C

MLST tree

[Figure: neighbour-joining tree; tip labels in order of appearance: ST1150, ST398, ST395, ST1692, ST1714, ST398, ST65, ST1862, ST3460]

Multi-locus sequence typing uses the sequence of seven housekeeping genes as a marker of genotype; each strain is assigned a sequence type (ST) based on the alleles of these seven genes. This is a neighbour-joining tree based on the seven MLST alleles, and it shows the relation between the strains in this study based purely on MLST. The whole genome sequencing results allowed the MLST to be discerned with 100% accuracy when compared to the MLST results of Qiagen genomic services. A single base change in an MLST locus would result in a different allele number for that gene, and consequently the isolate would be assigned a different ST. This highlights the precision of the 454 sequencing for detecting SNP-level differences between sequenced genomes.
Page 18: Diversity Of The Emerging Pneumococcal Serotype 6C

MLST: WGS vs Qiagen

Gene and allele number

Isolate   aroE  gdh  gki  recP  spi  xpt  ddl  ST WGS  ST Qiagen  Agreement
SOT0081   1     5    7    12    17   158  14   ST1692  ST1692     Y
SOT0113   2     7    4    10    10   1    27   ST65    ST65       Y
SOT0237   1     5    7    12    17   148  14   ST1714  ST1714     Y
SOT2029   2     34   1    5     42   28   75   ST3460  ST3460     Y
SOT2073   1     5    7    12    17   158  14   ST1692  ST1692     Y
SOT2074   1     5    7    12    17   158  14   ST1692  ST1692     Y
SOT2105   1     5    7    12    17   158  14   ST1692  ST1692     Y
SOT2300   79    1    40   13    6    1    6    ST1862  ST1862     Y
SOT2371   1     5    7    12    17   1    14   ST395   ST395      Y
SOT3022   1     5    7    12    17   158  14   ST1692  ST1692     Y
SOT3050   1     5    9    12    94   1    20   ST1600  ST1600     Y
SOT3055   1     5    7    12    17   158  14   ST1692  ST1692     Y
SOT3060   7     25   8    6     25   6    8    ST1150  ST1150     Y
SOT3074   1     5    7    12    17   148  14   ST1714  ST1714     Y
SOT3088   37    25   4    4     15   20   28   ST398   ST398      Y
SOT0802M  1     5    7    12    17   158  14   ST1692  ST1692     Y
SOT0954Q  1     5    7    12    17   158  14   ST1692  ST1692     Y
SOT1058S  1     5    7    12    17   158  14   ST1692  ST1692     Y
SOT1060N  7     25   8    6     25   6    8    ST1150  N/A        N/A

The whole genome sequencing results allowed the MLST to be discerned with 100% accuracy when compared to the MLST results of Qiagen genomic services. A single base change in an MLST locus would result in a different allele number for that gene, and consequently the isolate would be assigned a different ST. This highlights the precision of the 454 sequencing for detecting SNP-level differences between sequenced genomes. The analysis also confirms that MLST is a good indicator of overall genotype.
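How an allelic profile maps to a sequence type, and how closely related STs can be compared locus by locus, can be sketched with a few of the profiles from the table above. This is a minimal illustration: the lookup holds only five profiles copied from the slide, whereas a real assignment would query the full MLST database, and the function names are illustrative.

```python
# Locus order as in the table above.
LOCI = ("aroE", "gdh", "gki", "recP", "spi", "xpt", "ddl")

# A subset of allelic profiles copied from the table.
PROFILES = {
    "ST1692": (1, 5, 7, 12, 17, 158, 14),
    "ST1714": (1, 5, 7, 12, 17, 148, 14),
    "ST395": (1, 5, 7, 12, 17, 1, 14),
    "ST65": (2, 7, 4, 10, 10, 1, 27),
    "ST1150": (7, 25, 8, 6, 25, 6, 8),
}


def assign_st(profile):
    """Return the ST whose seven alleles match exactly, or None."""
    for st, known in PROFILES.items():
        if tuple(profile) == known:
            return st
    return None


def differing_loci(st_a, st_b):
    """Names of the loci at which two sequence types carry different alleles."""
    return [
        locus
        for locus, a, b in zip(LOCI, PROFILES[st_a], PROFILES[st_b])
        if a != b
    ]


print(assign_st((1, 5, 7, 12, 17, 148, 14)))
print(differing_loci("ST1692", "ST395"))
print(differing_loci("ST1692", "ST1714"))
```

ST1692, ST1714 and ST395 differ from one another only at xpt, whereas ST65 differs from ST1692 at all seven loci, consistent with the two clusters seen in the gene-distance analysis.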
Page 19: Diversity Of The Emerging Pneumococcal Serotype 6C

Wheel comparison of 9 S. pneumoniae strains with R6 genome

Page 20: Diversity Of The Emerging Pneumococcal Serotype 6C

Clinical data for IPD strains
