Diversity of the emerging pneumococcal serotype 6C in the UK
SGM Autumn Conference - University of Nottingham
6-9 September 2010
Rebecca Gladstone
06 September 2010
Overview
• Introduction – Why is pneumococcal serotype 6C diversity important?
• Collection of samples
• Strain selection - How were the strains chosen for further analysis?
• Whole genome sequencing
• Summary and future work
Introduction
§ Streptococcus pneumoniae is a common coloniser of the nasopharynx but also a major cause of morbidity and mortality in the UK
§ S. pneumoniae serotype 6C is a recently recognised serotype that arose from 6A
§ Prevalence of 6C has significantly increased in our carriage study
§ Driven by expansion of multi-locus sequence type 1692 (MLST: a genotyping tool based on the sequences of seven housekeeping genes)
§ Capacity to cause invasive disease
§ Not included in any conjugate vaccine
Carvalho et al., 2009, Cooper et al., 2010, Park et al., 2007a, Park et al., 2007b, Tocheva et al., 2010
Sample collection and strain selection
• Nasopharyngeal specimens obtained from healthy children <4 years during the winter months of 2006-9
– 301 S. pneumoniae strains isolated
– 32 serotype 6C
– 17 sequence type 1692
– 15 strains selected, with representatives from each of the three winters, covering each of the 9 observed MLST types
Carvalho et al., 2009, Tocheva et al., 2010
• Invasive pneumococcal disease (IPD) isolates obtained from HPA SE regional microbiology laboratory, Southampton, in 2004-2010
– 315 S. pneumoniae IPD strains
– 6 serotype 6C
– 4 most recent strains selected for analysis
– 3 ST1692
Whole genome sequencing (WGS)
• Objectives of WGS study:
– the diversity of serotype 6C
– the diversity of ST1692
– clinical relevance of any diversity
• Preliminary analysis:
– Confirmation of MLST
– Gene content
– SNPs
Hiller et al., 2007, Silva et al., 2006
Whole genome sequencing (WGS)
• 454 Genome Sequencer FLX System
• Whole genome shotgun methodology
• Analysis was performed via xBASE-NG (http://ng.xbase.ac.uk) and the xBASE annotation pipeline (http://xbase.ac.uk/annotation), using the following software:
– De novo assembly (Newbler 2.5, Roche)
– Annotation (xBASE annotation pipeline)
– Mapping (Newbler gsMapper, Roche)
• Sequencing averages: 113,178 reads, 39,434,823 bases, and 18.55-fold coverage
A plot of the gene-distance matrix
The number of genes not shared between any given pair
Image produced by Dr Nick Loman
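As a rough illustration of what a gene-distance matrix holds, the sketch below counts, for each pair of strains, the genes present in one strain but not the other (the symmetric difference of their gene sets). The strain names and gene identifiers are hypothetical placeholders, and this is a minimal sketch under that assumption, not the method used to produce the plot.

```python
from itertools import combinations


def gene_distance_matrix(gene_sets):
    """Return {(a, b): n}, where n is the number of genes not shared by a and b."""
    names = sorted(gene_sets)
    # A strain has distance 0 to itself.
    dist = {(name, name): 0 for name in names}
    for a, b in combinations(names, 2):
        # Symmetric difference: genes found in exactly one of the two strains.
        d = len(gene_sets[a] ^ gene_sets[b])
        dist[(a, b)] = dist[(b, a)] = d
    return dist


# Hypothetical toy input: strain name -> set of gene (orthologue group) identifiers.
gene_sets = {
    "strainA": {"g1", "g2", "g3"},
    "strainB": {"g1", "g2", "g4"},
    "strainC": {"g1", "g5", "g6", "g7"},
}

dist = gene_distance_matrix(gene_sets)
for (a, b), d in sorted(dist.items()):
    if a < b:
        print(a, b, d)
```

Closely related strains give small counts and distantly related strains give large ones, which is what lets clusters show up when the matrix is plotted.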
SNPs – Isolates vs 2073 (ST1692)
Strain  Sequence Type  Total SNP  CDS    Non-Synonymous
2074    1692           81         4      3
3022    1692           233        51     29
2105    1692           217        49     30
802M    1692           202        77     45
954Q    1692           251        70     45
1058S   1692           237        77     46
0081    1692           189        80     48
3055    1692           304        102    70
2371    395            367        123    81
3074    1714           353        123    82
0237    1714           334        122    83
0113    65             16772      2042   636
2029    3460           18835      10360  2977
3060    1150           17824      11935  3466
3050    1600           16534      11551  3494
3088    398            17735      12275  3721
1060N   1150           19298      13403  3874
2300    1862           24959      17209  5054
Key: IPD strains, Carriage strains
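The table separates SNPs that fall in coding sequences (CDS) from those that are non-synonymous. A minimal sketch of that distinction: substitute the alternative base into the reference codon and compare the encoded amino acids. It uses the standard genetic code (whose amino-acid assignments are shared by the bacterial code); the example codons are illustrative and are not taken from the study data.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_snp(ref_codon, pos_in_codon, alt_base):
    """Classify a coding SNP as 'synonymous' or 'non-synonymous'.

    pos_in_codon is the 0-based position of the SNP within the codon.
    """
    alt_codon = ref_codon[:pos_in_codon] + alt_base + ref_codon[pos_in_codon + 1:]
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "synonymous"
    return "non-synonymous"


# Illustrative examples (not from the study):
print(classify_snp("CTT", 2, "C"))  # CTT (Leu) -> CTC (Leu)
print(classify_snp("GAA", 1, "T"))  # GAA (Glu) -> GTA (Val)
```

A real pipeline would also need gene coordinates and strand to locate each SNP within its codon; that bookkeeping is left out here.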
Novel genes
• Novel genes were detected by orthologue comparison of study strains and removal of any genes found in other completed S. pneumoniae genomes in GenBank.
– ermB and a gene encoding a tetracycline-resistance protein were unique to 2029 (ST3460). Erythromycin and tetracycline resistance were confirmed as functional.
– Two novel putative beta-lactamase genes in 2300 (ST1862), homologous to coding sequences in two plant pathogens (Ralstonia solanacearum and Dickeya dadantii).
– 6C group 1 strains contain a region encoding a lantibiotic biosynthesis protein and a lantibiotic efflux protein
Ding et al., 2009
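The filtering step described on this slide (keep only genes not found in any completed reference genome) amounts to a set subtraction once genes have been assigned to orthologue groups. The sketch below assumes that assignment has already been done; all strain names and group identifiers are hypothetical.

```python
def novel_genes(study, reference_genomes):
    """Return, per study strain, the orthologue groups absent from every reference genome."""
    # Pool every orthologue group seen in any reference genome.
    known = set().union(*reference_genomes.values())
    return {strain: groups - known for strain, groups in study.items()}


# Hypothetical orthologue-group identifiers.
reference_genomes = {
    "refA": {"og1", "og2", "og3"},
    "refB": {"og1", "og4"},
}
study = {
    "strainX": {"og1", "og2", "og9"},
    "strainY": {"og3", "og4"},
}

novel = novel_genes(study, reference_genomes)
print(novel)
```

Candidates found this way still need checking by hand, since a gene can look novel simply because orthologue assignment failed.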
Summary
• Two distinct 6C genetic clusters could be observed by gene-distance analysis
• The cluster that included ST1692 was more homogeneous than the second cluster
• MLST is a good indicator of genotype
• MLST can be determined from WGS; however, there is diversity within the STs of 6C
Future work
• Analysis of insertions and deletions including previously reported regions of diversity
• Comparison with other streptococcal genomes
• Analysis of gene content/diversity for known virulence factors
• Significance of identified SNPs and novel genes
• Further comparison of disease and carriage isolates
Hava & Camilli, 2002; Hiller et al., 2007; Obert et al., 2006; Silva et al., 2006.
Acknowledgements
University of Southampton
Dr Stuart Clarke
Dr Saul Faust
Dr Jo Jefferies
Anna Tocheva
Centre for Systems Biology
University of Birmingham
Professor Mark Pallen
Dr Nick Loman - Bioinformatics
Dr Chrystala Constantinidou - Sequencing
Mala Patel - Sequencing
Thank you for your attention
References
• xBASE-NG website http://ng.xbase.ac.uk
• Carvalho, M., Pimenta, F. C., Gertz, R. E., Jr., Joshi, H. H., Trujillo, A. A., Keys, L. E., Findley, J., Moura, I. S., Park, I. H. & other authors (2009). PCR-Based Quantitation and Clonal Diversity of the Current Prevalent Invasive Serogroup 6 Pneumococcal Serotype, 6C, in the United States in 1999 and 2006 to 2007. J Clin Microbiol 47, 554-559.
• Cooper, D., Yu, X., Sidhu, M., Nahm, M. H., Fernsten, P. & Jansen, K. U. (2010). Development of an opsonophagocytic assay to Streptococcus pneumoniae serotype 6C: Demonstration of cross-functional responses to 6C in Prevenar-13 immune sera. In ESPID. Nice.
• Ding, F., Tang, P., Hsu, M.-H., Cui, P., Hu, S., Yu, J. & Chiu, C.-H. (2009). Genome evolution driven by host adaptations results in a more virulent and antimicrobial-resistant Streptococcus pneumoniae serotype 14. BMC Genomics 10, 158.
• Hava, D. L. & Camilli, A. (2002). Large-scale identification of serotype 4 Streptococcus pneumoniae virulence factors. Mol Microbiol 45, 1389-1406.
• Hiller, N. L., Janto, B., Hogg, J. S., Boissy, R., Yu, S., Powell, E., Keefe, R., Ehrlich, N. E., Shen, K. & other authors (2007). Comparative genomic analyses of seventeen Streptococcus pneumoniae strains: insights into the pneumococcal supragenome. J Bacteriol 189, 8186-8195.
• Obert, C., Sublett, J., Kaushal, D., Hinojosa, E., Barton, T., Tuomanen, E. I. & Orihuela, C. J. (2006). Identification of a Candidate Streptococcus pneumoniae core genome and regions of diversity correlated with invasive pneumococcal disease. Infect Immun 74, 4766-4777.
• Park, I. H., Park, S., Hollingshead, S. K. & Nahm, M. H. (2007a). Genetic basis for the new pneumococcal serotype, 6C. Infect Immun 75, 4482-4489.
• Park, I. H., Pritchard, D. G., Cartee, R., Brandao, A., Brandileone, M. C. C. & Nahm, M. H. (2007b). Discovery of a New Capsular Serotype (6C) within Serogroup 6 of Streptococcus pneumoniae. J Clin Microbiol 45, 1225-1233.
• Park, I. H., Moore, M. R., Treanor, J. J., Pelton, S. I., Pilishvili, T., Beall, B., Shelly, M. A., Mahon, B. E. & Nahm, M. H. (2008). Differential effects of pneumococcal vaccines against serotypes 6A and 6C. J Infect Dis 198, 1818-1822.
• Silva, N. A., McCluskey, J., Jefferies, J. M. C., Hinds, J., Smith, A., Clarke, S. C., Mitchell, T. J. & Paterson, G. K. (2006). Genomic Diversity between Strains of the Same Serotype and Multilocus Sequence Type among Pneumococcal Clinical Isolates. Infect Immun 74, 3513-3518.
• Tocheva, A. S., Jefferies, J. M., Christodoulides, M., Faust, S. N. & Clarke, S. C. (2010). Increase in serotype 6C pneumococcal carriage, United Kingdom. Emerg Infect Dis 16, 154-155.
Sequencing results
Strain   Number of reads  Number of bases  Coverage
0081     91237            28733390         13.58
0113     110623           38489470         17.02
0237     116400           38374563         17.87
2029     107703           35591041         15.76
2073     122684           41618252         19.66
2074     226938           74015324         34.98
2105     96988            32480840         15.37
2300     78683            23267452         10.55
2371     117005           35530396         16.64
3022     127553           39499836         18.68
3050     133283           40967776         18.87
3055     94317            31276976         14.90
3060     74967            25060487         12.20
3074     136821           45703125         21.42
3088     107105           36219264         17.44
0802M    117813           41369459         20.37
0954Q    167622           58368746         27.67
1058S    122634           43260411         20.91
1060N    124573           44306824         21.79
Average  113178           39434823         18.55
Key Lowest Highest
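The three columns of the sequencing table are related by simple arithmetic. The sketch below takes three rows from the table and derives the mean read length (bases divided by reads) and the genome length implied by the reported coverage (bases divided by coverage). That coverage was calculated as total bases over genome or assembly length is an assumption here; the slides do not state the formula.

```python
# (strain, number of reads, number of bases, coverage) copied from the table.
rows = [
    ("2073", 122684, 41618252, 19.66),
    ("2074", 226938, 74015324, 34.98),
    ("2300", 78683, 23267452, 10.55),
]


def mean_read_length(reads, bases):
    """Average read length in bases."""
    return bases / reads


def implied_genome_length(bases, coverage):
    """Genome length implied if coverage = total bases / genome length (assumed)."""
    return bases / coverage


for strain, reads, bases, coverage in rows:
    print(
        strain,
        round(mean_read_length(reads, bases), 1),
        round(implied_genome_length(bases, coverage)),
    )
```

For these rows the implied lengths come out at a little over 2 Mb.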
Structural difference between serotype 6A and 6C
[Figure: ring structures of galactose and glucose]
• There is one structural difference between serotype 6A and serotype 6C
• There are two structural differences between serotype 6B and serotype 6C
6A [P→2) – galactose – (1→3) – glucose – (1→3) – rhamnose – (1→3) – ribitol – (5→P]
6B [P→2) – galactose – (1→3) – glucose – (1→3) – rhamnose – (1→4) – ribitol – (5→P]
6C [P→2) – glucose – (1→3) – glucose – (1→3) – rhamnose – (1→3) – ribitol – (5→P]
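The two counts quoted on this slide can be checked by writing each repeat unit as a list of (residue, linkage) pairs and comparing them position by position. The encoding below is just one convenient way to transcribe the three repeat units shown on the slide; the linkage strings are labels, not a formal carbohydrate notation.

```python
# Each repeat unit as (residue, linkage to the next residue), transcribed from the slide.
REPEAT_UNITS = {
    "6A": [("galactose", "1-3"), ("glucose", "1-3"), ("rhamnose", "1-3"), ("ribitol", "5-P")],
    "6B": [("galactose", "1-3"), ("glucose", "1-3"), ("rhamnose", "1-4"), ("ribitol", "5-P")],
    "6C": [("glucose", "1-3"), ("glucose", "1-3"), ("rhamnose", "1-3"), ("ribitol", "5-P")],
}


def differences(a, b):
    """List the positions at which two serotypes' repeat units differ."""
    pairs = zip(REPEAT_UNITS[a], REPEAT_UNITS[b])
    return [(i, x, y) for i, (x, y) in enumerate(pairs) if x != y]


print(len(differences("6A", "6C")))  # galactose vs glucose
print(len(differences("6B", "6C")))  # galactose vs glucose, and rhamnose 1-4 vs 1-3
```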
MLST tree
[Figure: MLST tree; leaf labels: ST1150, ST398, ST395, ST1692, ST1714, ST398, ST65, ST1862, ST3460]
MLST: WGS vs Qiagen
Gene and allele number
Isolate   aroE  gdh  gki  recP  spi  xpt  ddl  ST WGS  ST Qiagen  Agreement
SOT0081   1     5    7    12    17   158  14   ST1692  ST1692     Y
SOT0113   2     7    4    10    10   1    27   ST65    ST65       Y
SOT0237   1     5    7    12    17   148  14   ST1714  ST1714     Y
SOT2029   2     34   1    5     42   28   75   ST3460  ST3460     Y
SOT2073   1     5    7    12    17   158  14   ST1692  ST1692     Y
SOT2074   1     5    7    12    17   158  14   ST1692  ST1692     Y
SOT2105   1     5    7    12    17   158  14   ST1692  ST1692     Y
SOT2300   79    1    40   13    6    1    6    ST1862  ST1862     Y
SOT2371   1     5    7    12    17   1    14   ST395   ST395      Y
SOT3022   1     5    7    12    17   158  14   ST1692  ST1692     Y
SOT3050   1     5    9    12    94   1    20   ST1600  ST1600     Y
SOT3055   1     5    7    12    17   158  14   ST1692  ST1692     Y
SOT3060   7     25   8    6     25   6    8    ST1150  ST1150     Y
SOT3074   1     5    7    12    17   148  14   ST1714  ST1714     Y
SOT3088   37    25   4    4     15   20   28   ST398   ST398      Y
SOT0802M  1     5    7    12    17   158  14   ST1692  ST1692     Y
SOT0954Q  1     5    7    12    17   158  14   ST1692  ST1692     Y
SOT1058S  1     5    7    12    17   158  14   ST1692  ST1692     Y
SOT1060N  7     25   8    6     25   6    8    ST1150  N/A        N/A
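An MLST sequence type is the label attached to one specific combination of seven allele numbers, so assigning an ST from WGS-derived alleles is a lookup. The sketch below uses a handful of profiles from the table as a stand-in for the full MLST database, which is a simplification for illustration, and also reports which loci differ between two STs.

```python
LOCI = ("aroE", "gdh", "gki", "recP", "spi", "xpt", "ddl")

# Allelic profiles copied from the table (a small stand-in for the MLST database).
PROFILES = {
    "ST1692": (1, 5, 7, 12, 17, 158, 14),
    "ST1714": (1, 5, 7, 12, 17, 148, 14),
    "ST395": (1, 5, 7, 12, 17, 1, 14),
    "ST1600": (1, 5, 9, 12, 94, 1, 20),
    "ST65": (2, 7, 4, 10, 10, 1, 27),
}
ST_BY_PROFILE = {profile: st for st, profile in PROFILES.items()}


def assign_st(alleles):
    """Look up the sequence type for seven allele numbers given in LOCI order."""
    return ST_BY_PROFILE.get(tuple(alleles), "unknown")


def differing_loci(st_a, st_b):
    """Return the loci at which two sequence types carry different alleles."""
    return [
        locus
        for locus, x, y in zip(LOCI, PROFILES[st_a], PROFILES[st_b])
        if x != y
    ]


print(assign_st([1, 5, 7, 12, 17, 158, 14]))
print(differing_loci("ST1692", "ST1714"))
print(differing_loci("ST1692", "ST1600"))
```

In the table, ST1692, ST1714 and ST395 differ from one another only at xpt.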
Wheel comparison of 9 S. pneumoniae strains with R6 genome
Clinical data for IPD strains