

www.elsevier.com/locate/jmicmeth

Journal of Microbiological Methods 56 (2004) 49–62

A new high-throughput AFLP approach for identification of new genetic polymorphism in the genome of the clonal microorganism Mycobacterium tuberculosis

Nicole van den Braak a, Guus Simons b, Roy Gorkink b, Martin Reijans b, Kimberly Eadie a, Kristin Kremers c, Dick van Soolingen c, Paul Savelkoul d, Henri Verbrugh a, Alex van Belkum a,*

a Department of Medical Microbiology and Infectious Diseases, Erasmus MC, Dr. Molewaterplein 40, 3015 GD Rotterdam, The Netherlands

b Department of Microbial Genomics, Keygene BV, Agro Businesspark 90, 6708 PW Wageningen, The Netherlands

c Diagnostic Laboratory for Infectious Diseases and Perinatal Screening, National Institute of Public Health and the Environment, P.O. Box 1, 3720 BA Bilthoven, The Netherlands

d Department of Medical Microbiology, Free University of Amsterdam, De Boelelaan, Amsterdam, The Netherlands

Received 21 August 2003; received in revised form 13 September 2003; accepted 15 September 2003

Abstract

We applied high-throughput amplified fragment length polymorphism (htAFLP) analysis to strains belonging to the five classical species of the Mycobacterium tuberculosis complex. Using 20 strains, three enzyme combinations and eight selective amplification primer pairs, 24 AFLP reactions were performed per strain. Overall, this resulted in 480 DNA fingerprints, and more than 1200 htAFLP-amplified PCR fragments were visualised per strain. The cumulative dendrogram correctly clustered strains from the various species, albeit within a distance of 6.5% for most of them. The single isolate of Mycobacterium canettii presented separately at 19% distance. Overall, 169 fragments (14%) appeared to be polymorphic. Sixty-eight were specific for M. canettii and forty-five for Mycobacterium bovis. For the 10 different M. tuberculosis strains included in the present analysis, 56 polymorphic markers were identified. Upon sequencing 20 of these marker regions and comparison with the H37Rv genome sequence, 25% appeared to share homology with members of the antigenically variable PE/PPE surface protein-encoding gene family, confirming previous findings on the genetic heterogeneity within these genes. In addition, homologues of phage genes and insertion element-encoded genes were detected. Forty-five percent of the sequences derived from ORFs with a currently unknown function, which was corroborated by genome sequence comparison for the clinical M. tuberculosis CD 1551 isolate. Sequence variation in M. tuberculosis was assessed in more detail for a subset of these loci by newly designed PCR restriction fragment length polymorphism (RFLP) tests and direct sequencing. Fourteen novel PCR RFLP tests were developed and twelve novel single nucleotide polymorphisms (SNPs) were identified, all suited for epidemiological analysis of M. tuberculosis. The tests allowed for identification of the major Mycobacterium species and of M. tuberculosis variants and clones.

© 2003 Elsevier B.V. All rights reserved.

Keywords: Mycobacterium tuberculosis; Genetic variation; High-throughput AFLP; PCR RFLP; Single nucleotide polymorphism

0167-7012/$ - see front matter © 2003 Elsevier B.V. All rights reserved.

doi:10.1016/j.mimet.2003.09.018

* Corresponding author. Tel.: +31-10-4635813; fax: +31-10-4633875.

E-mail address: [email protected] (A. van Belkum).


1. Introduction

Mycobacterium tuberculosis, a cause of severe infectious morbidity and mortality among humans, is a clonally reproducing microorganism. The most likely explanation for its genetic homogeneity is a short evolutionary history (Kapur et al., 1994). This hypothesis was verified by high-throughput sequencing of housekeeping genes, which demonstrated that, despite normal mutation frequencies (David and Newman, 1971), M. tuberculosis has accumulated only limited numbers of allelic variants in these genes (Sreevatsan et al., 1997; Kapur et al., 1994; Musser et al., 2000). The largest proportion of the mutations recorded was associated with selective pressure, for instance that exerted by antimicrobial agents. However, various variable elements or loci were still identifiable in the M. tuberculosis genome. The first of these was a set of insertion elements including IS1081 (Collins and Stephens, 1991), IS1547 (Fang et al., 1998), an IS-like element (Mariani et al., 1993) and the highly polymorphic IS6110 (Thierry et al., 1990). Because of its flexibility, both in copy number and chromosomal location, mapping of IS6110 restriction fragment length polymorphism (RFLP) has become an important tool for investigating the epidemiology of M. tuberculosis (Small et al., 1994; Van Embden et al., 1993).

The nucleotide sequence of the entire genome of M. tuberculosis H37Rv revealed the presence of additional IS elements (Cole et al., 1998) and other regions of putative variability, including minisatellite-like elements. These mycobacterial interspersed repetitive units (MIRUs) were usually between 40 and 100 nucleotides in length and were found to be dispersed across the chromosome (Supply et al., 2000). Out of 31 candidate loci, 12 appeared to be variable in both repeat copy number and primary structure, a feature that is not uncommon among microbes (for a review, see Van Belkum et al., 1998). It was simultaneously documented that the variable number of tandem repeats (VNTRs) in these MIRUs seemed to be evolving slowly in mycobacterial populations. MIRU-VNTRs can be used for assessing diversity among strains of M. tuberculosis (Supply et al., 2001) and Mycobacterium bovis (Roring et al., 2002). Another example of such a repetitive domain, the direct repeat or DR locus, had already been discovered in the pre-genomics era (Hermans et al., 1993; Kamerbeek et al., 1997). Within the DR locus, constant repeats alternate with variable sections. These latter elements have been used for the development of a typing system suited for large-scale mycobacterial epidemiology (Fang et al., 1998; Groenen et al., 1993). Population genetic studies involving the DR element indicated that successive deletions in the region, rather than scrambling of the variable units, gave rise to significant levels of diversity (Van Embden et al., 2000). In conclusion, even after the elucidation of the first M. tuberculosis genome sequence, genetic variation seemed to be confined to a limited number of loci and was frequently based on repeat diversity or inherently dynamic insertion elements. This implied that additional methods for defining genetic variability in M. tuberculosis are still required before the population genetics of this medically highly relevant microorganism can be fully understood.

More recent developments in microbial genomics and DNA array technology have improved our understanding of the biology of M. tuberculosis. Initially, arrays of bacterial artificial chromosome (BAC) clones were used to make inventories of genomic differences between strains of M. tuberculosis (Gordon et al., 1999). Mapping of genomic deletions in particular revealed important details on the phylogeny of M. tuberculosis and M. bovis. Similar but more detailed data were obtained using whole genome arrays manufactured by spotting PCR products of 250–1000 nucleotides for all of the predicted open reading frames in the genome of H37Rv on glass slides (Behr et al., 1999). The large deletions are currently supposed to involve genes whose functions are no longer necessary for certain lineages of mycobacteria. This was confirmed by the observation that strains containing increasing numbers of deletions caused less severe infections without pulmonary cavitation (Kato-Maeda et al., 2001a,b). Most recently, a second M. tuberculosis genome sequence became available for the clinical isolate CD 1551 (Fleischmann et al., 2002). This enabled the most detailed genetic comparison between two strains of a single bacterial species presented thus far. Various novel polymorphic sites were identified, not only deletions and insertions but also widespread single nucleotide polymorphism (SNP). Apparently, genetic variation in M. tuberculosis may be more widespread than anticipated. It has to be realised, however, that determination of genetic variability based on direct genome comparisons is not always feasible: it is not (yet?) economically possible to sequence the entire chromosome for two or more representative strains of all bacterial species. In addition, the variability between two members of a species may not be representative of the general genetic heterogeneity within the species as a whole. This raises the need for alternative technology that could be more generally applied to discover genetic variation among all of the microbial pathogens.

Table 1
Survey of strains used for htAFLP analysis
Numbers between brackets refer to numbering as given in Kremer et al. (1999). Genuine duplicates are indicated by coloured shades; the boxed region on the right indicates the strains also present in the pilot collection.

We here demonstrate that high-throughput amplified fragment length polymorphism (htAFLP) analysis is a suitable candidate for genome-wide mutation screening. AFLP as such is a restriction-amplification method developed in the early 1990s (Vos et al., 1995). After digestion of genomic DNA by a combination of restriction enzymes, restriction site-specific linkers are ligated to the resulting multitude of DNA fragments. The attached linkers contain site-specific PCR priming sequences. When targeted by PCR, the combination of sites at the termini of an individual restriction fragment determines whether the fragment is amplified or not. Usually, due to the complexity of the restriction digest, one or two AFLP fingerprints suffice for obtaining a reliable genetic signature for a microbial strain. In principle, the method screens restriction site polymorphism, and extending the primers used during AFLP by one or more selective nucleotides also allows DNA polymorphism in the region neighbouring the restriction site to be monitored. This has resulted in the establishment of reproducible and robust microbial typing strategies that not only provide genetic epidemiological information but can also be used to identify new species, even within the M. tuberculosis complex (Ahmed et al., 2003). We show that analysis of various restriction enzyme combinations together with differentially extended AFLP primers facilitates high-density genomic screening, independent of knowledge of mutation positions. Our current studies were aimed at the detection of novel sites of genetic variation not previously characterised in the genetically homogeneous species belonging to the M. tuberculosis complex. This is the first model study on the value of htAFLP for detecting genetic variants in an essentially clonal bacterial species.
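The selection principle described above can be illustrated with a small in-silico sketch. It is a simplification and not the laboratory protocol: the toy sequence, the function name `aflp_bands` and the choice of which primer carries the +2 or +1 extension are assumptions made for illustration, and the exact cut positions within the recognition sites are ignored. Only the MboI (GATC) and TaqI (TCGA) recognition sequences are taken as given.

```python
import re

# Recognition sequences of the two enzymes (cut positions ignored for simplicity).
SITES = {"MboI": "GATC", "TaqI": "TCGA"}

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq[::-1].translate(str.maketrans("ACGT", "TGCA"))

def aflp_bands(seq, extensions):
    """Lengths of fragments that a selective primer pair would amplify.

    A fragment counts as amplified when it is flanked by two different
    sites and the bases just inside each site match that primer's
    selective extension (read 5'->3' from the respective end).
    """
    cuts = sorted((m.start(), name) for name, site in SITES.items()
                  for m in re.finditer(f"(?={site})", seq))
    bands = []
    for (p1, e1), (p2, e2) in zip(cuts, cuts[1:]):
        if e1 == e2:
            continue  # both ends carry the same linker: not scored in this toy model
        inner = seq[p1 + len(SITES[e1]):p2]
        if inner.startswith(extensions[e1]) and revcomp(inner).startswith(extensions[e2]):
            bands.append(p2 + len(SITES[e2]) - p1)
    return bands

# Hypothetical 37-nt "genome" and a variant with one substitution next to a site.
reference = "TTGATCAAGGCCTTTCGAGGGATCACCCCGTTCGATT"
variant = reference[:6] + "C" + reference[7:]

print(aflp_bands(reference, {"MboI": "AA", "TaqI": "A"}))
print(aflp_bands(reference, {"MboI": "AC", "TaqI": "A"}))
print(aflp_bands(variant, {"MboI": "AA", "TaqI": "A"}))
```

In this toy example the single substitution leaves both restriction sites intact but removes the band, which is the kind of site-neighbouring polymorphism that the selective nucleotides are meant to detect.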

2. Materials and methods

2.1. Strains and DNA isolation

Strains used for the htAFLP analysis have been described before (Kremer et al., 1999). A survey of the isolates in the collection is given in Table 1. It is important to emphasise that two separate but overlapping collections of strains were used. The first (pilot) collection was employed for the htAFLP, whereas the second collection was used for the validation of markers developed using the pilot collection. Both collections were provided in a blinded fashion, the receiving institution being unaware of the nature of the strains before htAFLP was finished. In the validation collection, three sets of three outbreak-related isolates were included to confirm inter-strain reproducibility of the newly developed genotyping assays. Besides strains of the five classical species of the M. tuberculosis complex, a single strain of Mycobacterium canettii was included. Isolates within this taxon clearly belong to the M. tuberculosis complex, but present as a separate lineage based on spoligotyping and on IS6110 and IS1081 RFLP mapping. As such, this strain presents an adequate internal AFLP process control (Pfyffer et al., 1998; Van Soolingen et al., 1997). The M. tuberculosis isolates represented important different genotypes, such as the Beijing, the Haarlem and the Africa genotypes, reference strains H37Rv and H37Ra, and a strain devoid of IS6110. DNA isolation for AFLP analysis was performed as described before (Van Soolingen et al., 1994).

Table 2
Survey of polymorphic htAFLP markers for various Mycobacterium species
The individual markers have been given a numerical code (column on the left) and a letter code (fragment code); the strains are numbered according to Table 1. Columns with an identical shade highlight duplicate analyses. The plusses and minuses indicate the presence or absence, respectively, of the fragment in the htAFLP analysis. X = strain not tested for the htAFLP primer combination; ? = very faint band (still visible). It has to be emphasised that the fragments derive from different fingerprints; additional information can be obtained upon request from the corresponding author. The begin and end of the BLAST hits are given, and the coding potential of the region in both the H37Rv and CD1551 whole genome sequences is stated.

2.2. High-throughput AFLP

The individual AFLP PCRs were performed essentially as detailed before, in the presence of radioactive nucleotides for the visualisation of the fingerprints (Vos et al., 1995). htAFLP was performed using three different enzyme combinations to digest the mycobacterial DNA: MboI/TaqI, NlaI/TaqI and MaeII/NlaI (all Boehringer-Mannheim, Mannheim, Germany). Each restriction enzyme was combined with the ligation of specific linker oligonucleotide pairs (MboI: 98/16: 5′-GTAGACTGCGTACCGATC-3′; 98/15: 5′-GATCGGTACGCAGTCTAC-3′; NlaIII: 98/28: 5′-GTAGACTGCGTACACATG-3′; 98/27: 5′-TGTACGCAGTCTAC-3′; MaeII: 91P25: 5′-GACGATGAGTCCTGAC-3′; 02K195: 5′-CGGTCAGGACTCAT-3′; TaqI: 91P25 and 92H51: 5′-AGCCAGTCCTGAGTAGCAG-3′). For each of these linker combinations, AFLP was performed using eight different linker-specific primer combinations. One of the primers in each pair was extended with a single nucleotide (+1), whereas the other primer was equipped with a 3′-terminal dinucleotide (+2). These nucleotides probe sequence variation beyond that present in the restriction site itself. The extensions were AA/A, AA/G, AC/A, AC/T, AG/A, AG/T, AT/C and AT/G. Amplified material was analysed on 50 × 20 cm polyacrylamide slab gels and the amplimers were visualised using phosphor-imaging. Post-AFLP, gels were fixed, dried and stored at ambient temperature.
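The experimental design above amounts to a simple cross product, which can be tallied as follows. The enzyme combinations and extensions are copied from the text; the variable names, and the estimate of sampled fragments under an equal-base-frequency assumption (which a GC-rich genome does not meet exactly), are illustrative only.

```python
from itertools import product

# Enzyme combinations and +2/+1 selective extensions as listed in the text.
enzyme_combinations = ["MboI/TaqI", "NlaI/TaqI", "MaeII/NlaI"]
extensions = ["AA/A", "AA/G", "AC/A", "AC/T", "AG/A", "AG/T", "AT/C", "AT/G"]
n_strains = 20  # size of the pilot collection

reactions_per_strain = list(product(enzyme_combinations, extensions))
total_fingerprints = len(reactions_per_strain) * n_strains

# Rough illustration only: with three selective bases there are 4**3 = 64
# possible extension pairs, so eight pairs would sample about one eighth of
# the doubly cut fragments if all bases were equally frequent.
sampled_fraction = len(extensions) / 4 ** 3

print(len(reactions_per_strain), total_fingerprints, sampled_fraction)
```

This reproduces the 24 reactions per strain and 480 fingerprints quoted in the abstract.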

2.3. Marker selection, sequencing and genomic identification

Upon visual inspection of the autoradiographs, polymorphic marker bands were identified. This was corroborated by indexing using the automated interpretation software package AFLP QuattroPro (Keygene, Wageningen, The Netherlands). The fragments were excised from the gels and re-amplified using their matching AFLP consensus primer set without restriction site-specific +1/+2 extension sequences attached (99G18: 5′-AGCGGATAACAATTTCACAGAGGACACACTGGTATAGACTGCGTACCGAT-3′; 99G22: 5′-AGCGGATAACAATTTCACACAGGACACACTGGTATAGACTGCGTACACATG-3′). The amplimers were subjected to DNA sequencing using a 96-well capillary sequencing machine (MegaBace; ABI, Gouda, The Netherlands). We used the nucleotide sequences of the fragments derived from the htAFLP of the H37Rv strain in order to facilitate fragment identification by BLAST analysis against the genome sequence of this strain. Table 2 lists the fragments that were successfully subjected to DNA sequencing and analysed by BLAST. Identification codes for the fragments are included as well.

Table 3
Development of novel PCR RFLP tests for M. tuberculosis
The fragment codes are identical to those listed in Table 2. Shaded areas on the right identify those PCRs that did not result in amplification. Different numbers in the PCR RFLP result sections indicate different RFLP patterns.

2.4. Development of PCR RFLP tests and directed amplicon sequencing

Where an amplimer sequence matched a target region in the M. tuberculosis H37Rv genome, a novel PCR test was designed for probing the detected genetic polymorphism in more detail. This involved the synthesis of forward and reverse primers located approximately 50 nucleotides upstream and downstream of the region of homology, respectively. The PCRs were applied to all of the validation strains (n = 26) (see Table 3), thereby amplifying not only the differentially present AFLP fragment itself but also neighbouring sequences. First, the amplimers were digested with the restriction enzymes used for the AFLP reaction. PCR RFLP digests were analysed on agarose gels. In addition, some of the fragments were amplified and completely sequenced. This analysis revealed whether the variability was due to variation in the restriction sites or in the adjoining +1/+2 nucleotides targeted by the AFLP primers. DNA sequences were compared with the genome sequence using the web-accessible version of the Basic Local Alignment Search Tool (BLAST). Additional comparison was performed by alignment of all sequences using DNASTAR (Lasergene, Madison WI, USA).
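The logic of a PCR RFLP test, in which a restriction site polymorphism within an amplicon changes the fragment pattern seen on the gel, can be sketched as below. The two 80-nt alleles and the function name `rflp_pattern` are hypothetical and serve only as an illustration; the TaqI recognition sequence T^CGA, cut after the first base, is the only enzyme detail assumed.

```python
import re

def rflp_pattern(amplicon, site, cut_offset):
    """Fragment lengths (longest first) after complete digestion of an amplicon.

    `site` is the recognition sequence and `cut_offset` the position of the
    cut within it on the top strand.
    """
    cuts = [m.start() + cut_offset for m in re.finditer(f"(?={site})", amplicon)]
    bounds = [0] + cuts + [len(amplicon)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:]) if b > a), reverse=True)

TAQI = ("TCGA", 1)  # T^CGA

# Hypothetical alleles: identical except for one base inside the TaqI site.
allele_with_site = "A" * 30 + "TCGA" + "C" * 46
allele_without_site = "A" * 30 + "TCAA" + "C" * 46

print(rflp_pattern(allele_with_site, *TAQI))
print(rflp_pattern(allele_without_site, *TAQI))
```

Two strains that give different lists here would show different banding patterns on an agarose gel, whereas a polymorphism outside the restriction site (for instance in the +1/+2 positions) leaves the pattern unchanged and is only found by sequencing.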

3. Results

3.1. htAFLP analysis and fragment analysis for Mycobacterium strains

The first 20 Mycobacterium spp. strains (pilot collection, see Table 1) were subjected to htAFLP. For these strains, three enzyme combinations were combined with eight different selective primer pairs; this generated 480 fingerprints in total (24 per strain) and approximately 1200 amplimers per strain. Fig. 1 shows the dendrogram based on all of these markers. All of the different M. tuberculosis genotype strains (Haarlem, Beijing and Africa) cluster closely. The M. bovis strains cluster with the single M. bovis BCG and the Mycobacterium microti isolates. Mycobacterium africanum and M. canettii form quite distinct branches in the phylogenetic tree, with M. canettii occupying the most exceptional position. The genome strains were analysed in duplicate, and the duplicate genotyping results clustered most closely. The difference between duplicates is less than 0.2%, indicative of the high signal-to-noise ratio of AFLP. Different numbers of selectively amplified marker bands, not universally present for all strains in the pilot collection, were identified per enzyme combination. In the case of MboI/TaqI, 58 useful markers were observed, whereas NlaI/TaqI and MaeII/NlaI digests rendered 64 and 47 differentiating markers, respectively. Overall, this amounted to 169 well-scored markers. Among these, 31 were positive and 37 were negative for the M. canettii strain only. Out of the remaining 101 markers, 45 were specific for M. bovis, leaving 56 polymorphic markers for M. tuberculosis, essentially generated using only 10 different strains.
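The marker bookkeeping in this paragraph can be checked with a few lines of arithmetic. All counts are taken from the text; the variable names are illustrative, and the denominator of 1200 is the approximate number of amplimers per strain, so the resulting percentage is approximate as well.

```python
# Differentiating markers per enzyme combination, as reported above.
markers_per_digest = {"MboI/TaqI": 58, "NlaI/TaqI": 64, "MaeII/NlaI": 47}

total_markers = sum(markers_per_digest.values())
canettii_specific = 31 + 37   # bands present only in, or absent only from, M. canettii
bovis_specific = 45
tuberculosis_markers = total_markers - canettii_specific - bovis_specific

# Share of polymorphic fragments, using roughly 1200 amplimers per strain.
polymorphic_fraction = total_markers / 1200

print(total_markers, canettii_specific, tuberculosis_markers,
      round(100 * polymorphic_fraction))
```

The totals agree with the 169 polymorphic fragments (about 14%), 68 M. canettii-specific markers and 56 M. tuberculosis markers stated in the abstract.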

3.2. DNA sequencing of M. tuberculosis specific markers

Out of the 56 polymorphic markers identified for M. tuberculosis, a randomly selected set of 20 was sequenced (Table 2, which also describes their differential occurrence in the AFLP banding patterns). For the M. tuberculosis markers, the results of the homology searches against the two currently available genomes (H37Rv and CD 1551) are quite similar: in 18 out of 20 comparisons, homologous sequences were identified in both genome sequences. In addition, the physical locations of all of the markers are essentially identical in both genomes (see the begin and end of the BLAST hits in Table 2). The sequence homologies with the H37Rv genome sequence that were revealed upon BLAST analysis are described in detail in the same table. Interestingly, 5 out of 20 sequences (25%) demonstrated homology with the PPE family of proteins. These proteins are most probably surface associated and putatively involved in virulence. The products of these genes contain variable numbers of either PE or PPE peptide repeats (Ramakrishnan et al., 2000; Skeiky et al., 2000). The fact that many of these sequences turn up in our AFLP analysis indicates that the encoding genes are subject to relatively frequent sequence variation. Other obviously polymorphic genes pinpointed by htAFLP are those encoding a phage-related protein and a transposon-associated resolvase. Whether heat shock proteins, Esat6 and the universal stress protein are also inherently variable is currently not clear, but our data suggest that this may be the case. One example of a variable intergenic region is included. Nine sequences (45%) are similar to those of hypothetical open reading frames in the H37Rv genome and require additional investigation as to the molecular basis of their putative genetic variability. No apparent clusters were observed for the polymorphisms; the mutations seem to be scattered throughout the genome.

Fig. 1. Example of a phylogenetic tree constructed on the basis of combined htAFLP fingerprints generated for the pilot collection of strains using eight primer combinations for each of three different types of DNA restriction digest. Note the duplicate sets of isolates and observe the outlying position of the M. canettii strain (see also Table 1). In addition, several of the fingerprints are pasted next to the dendrogram in order to visualise the experimental output. Species- and isolate-specific markers are highlighted.

3.3. DNA sequencing of Mycobacterium spp. specific markers

For the single strain of M. canettii, a multitude of markers was identified by scanning fingerprints. This is in full agreement with its outlying position in the overall AFLP dendrogram (see Fig. 1). Out of 68 markers, a random set of 34 were successfully sequenced. It is interesting to note that all of the sequences showed significant homology scores with the H37Rv and CD 1551 genome sequences, indicating high levels of cross-species sequence identity and suggesting that sequence variation rather than gene gain or loss defines the species' boundaries. Nine out of thirty-four sequences matched intergenic motifs identified in the H37Rv genome, suggesting that mutations tend to accumulate in intergenic regions (1/20 versus 9/34 in M. tuberculosis versus M. canettii, respectively; p = 0.07). Eight sequences matched open reading frames for which no function has been proposed as yet. For six of the hits, the BLAST analysis revealed a match with a gene encoding a putative surface component. Three of these again matched the so-called PE/PPE protein genes. The variability that was encountered using the htAFLP approach seems to be in agreement with expectations. The other surface components appeared to be encoded by the ABC transporter and potassium efflux genes.
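The text does not state which statistical test produced the p value quoted for the intergenic markers. As one plausible reading, a two-sided Fisher's exact test on the 2 x 2 table of intergenic versus other markers can be computed with the standard library alone; the function below is a sketch written for this illustration, not the authors' analysis.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(k):
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_observed = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_observed * (1 + 1e-9))

# Intergenic vs. non-intergenic markers: M. tuberculosis 1/20, M. canettii 9/34.
p_value = fisher_exact_two_sided(1, 19, 9, 25)
print(round(p_value, 2))
```

Under this assumption the result rounds to the reported value of 0.07, which is above the conventional 0.05 threshold, so the difference is suggestive rather than significant.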

Various other species-specific polymorphisms were also identified. Of the 21 markers that appeared to separate either M. bovis, M. microti or M. africanum, 5 coincided with hypothetical protein genes, 5 with intergenic regions and, again, 5 with surface-associated protein genes. The comparison performed with the genome sequence obtained for the clinical M. tuberculosis isolate CD1551 largely confirmed the findings listed above (see Table 2 for details).

3.4. Development of PCR RFLP tests for M. tuberculosis strains

Based on the sequences listed in Table 2, PCR tests for amplifying the locus identified by htAFLP and its surrounding regions were developed. The correct fragments were generated for 17 out of 20 markers. In these cases, the PCR facilitated the amplification of a correctly sized fragment, although for three tests the PCRs failed to deliver sufficient quantities of the amplicon. In addition, the PCRs were isolate-selective for 5 out of 17 tests. Ultimately, 17 RFLP analyses were performed. The results for the differential amplification and subsequent RFLP analysis are summarised in Table 3. It is immediately obvious that the PCR amplification in itself was already indicative of the genetic heterogeneity among the strains. The grey blocks in Table 3 reveal that PCRs turned out negative quite regularly, but this is concordant with the epidemiological data: when one of the strains belonging to any of the three clusters included in the validation collection was negative, the two other strains of that cluster were negative as well. In addition, negative amplification was relatively often documented for the non-tuberculosis species. One of the PCR data sets appeared to be specifically negative for the M. bovis strains (fragments C1 and F5). It is interesting to note that the three epidemiologically related clusters of three strains can be adequately discriminated on the basis of PCR effectiveness and RFLP analysis. Combining the data from 3 out of 14 tests accurately discriminated the clusters (fragments A5, A11 and F5). The results show that the PCR in itself already corroborated the epidemiological relatedness between the strains. The explanation could be that different AFLP patterns are due to mutations in the primer extension region rather than in the restriction sites themselves. For this reason, we decided to completely sequence the C11, H8 and D4 fragments generated for all 26 strains from the validation collection.

3.5. Detection of single nucleotide polymorphism in

marker fragments for M. tuberculosis

Fragments C11, D4 and H8were amplified for all 26

strains in the validation collection. Amplification was

successful in all cases, but use of the corresponding

AFLP restriction enzymes did not result in distinguish-

able patterns. Sequencing these fragments in full,

however, revealed the presence of 12 single nucleotide

polymorphisms (SNPs) (see Table 4 for the precise

nature and position of the mutations). Out of these 12, 4

were physically linked and most likely the result of a

recombination event (see Table 4 for the position and

the nature of the SNPs). Two mutations appeared to be

insertions, the others were point mutations. The two

insertion sites were located within coding sequences:

whether the insertion causes abrogation of the gene

sequences is subject to current investigations. One of

the two genes encodes a member of the PE/PPE

proteins and it could well be that the insertion event

is part of the antigenic variation noted before for these

genes. When all mutation are accumulated, nine differ-

ent overall genotypes are found among the 26 strains.

WithinM. tuberculosis, five different types were found

Table 4

Detection of sequence polymorphism in AFLP-derived PCR products

No.  Strain    C11    D4                  H8                       Overall
               A      A    B    C    D    A    B    C    D         sequence type
               (–)    (A)  (–)  (A)  (C)  (G)  (C)  (G)  A–T–G–G

Diverse M. tuberculosis strains
2    mtb A     –      –    –    –    –    –    –    –    –         B
7    mtb A     –      –    –    –    –    –    –    –    –         B
5    mtb H     C      –    –    –    –    –    –    –    –         D
9    mtb H     C      –    –    –    –    –    –    –    –         D
3    mtb       –      –    –    –    –    –    –    –    –         B
8    mtb       –      C    –    –    –    –    –    A    G–C–A–T   E
11   mtb       –      C    –    –    –    –    –    A    G–C–A–T   E
12   mtb gen   –      –    –    –    –    –    –    –    –         B
15   mtb H37   –      –    –    –    –    –    –    –    –         B

Clonally related clusters of M. tuberculosis strains
4    mtb bej   –      –    G    –    –    –    –    –    –         C
6    mtb bej   –      –    G    –    –    –    –    –    –         C
18   mtb bej   –      –    G    –    –    C    –    –    –         I
19   mtb bej   –      –    G    –    –    C    –    –    –         I
20   mtb bej   –      –    G    –    –    C    –    –    –         I
21   mtb103    –      –    –    –    –    –    –    –    –         B
22   mtb103    –      –    –    –    –    –    –    –    –         B
23   mtb103    –      –    –    –    –    –    –    –    –         B
24   mtb265    –      –    –    –    –    –    –    –    –         B
25   mtb265    –      –    –    –    –    –    –    –    –         B
26   mtb265    –      –    –    –    –    –    –    –    –         B

Other mycobacterial species
1    bovis     –      –    –    –    –    C    T    –    –         A
13   bovis     –      –    –    –    –    –    –    A    G–C–A–T   F
16   bovis     –      –    –    –    –    –    –    A    –         G
14   afric.    –      –    –    –    –    –    –    ?    –         B
10   microti   –      –    –    –    –    –    –    A    G–C–A–T   F
17   canetti   –      –    –    G    G    –    –    A    G–C–A–T   H

SNPs were identified by sequencing the PCR products designated C11, D4 and H8 (see also Tables 2 and 3). The mutations listed in this table have been

mapped in the H37Rv genome sequence (GenBank entry MTBH37RV) with the following results: C11A, insertion between 1092342 and

1092343; D4A, point mutation at 4303404; D4B, insertion between 4303497 and 4303496; D4C, point mutation at 4303431; D4D, point

mutation at 4303500; H8A, point mutation at 2626387; H8B, point mutation at 2626384; H8C, point mutation at 2626321; H8D, recombination

at a region including positions 2626231, 2626223, 2626218 and 2626216. The C11 locus comprises a hypothetical glycine-rich protein gene, the D4

region harbours the IS1537 resolvase gene and the H8 region encodes an unknown hypothetical gene.

N. van den Braak et al. / Journal of Microbiological Methods 56 (2004) 49–62

(B, C, D, E and I). It is particularly noteworthy that the

C and I types were specific for the Beijing clone of M.

tuberculosis and capable of distinguishing within this

strongly conserved clone. Apparently, strains can be

discriminated below the clonality level using our new

assays. It is also important to note that the species M.

bovis, M. canettii and M. microti show sequence types

that are not encountered among the M. tuberculosis

strains. The fact that the B type of M. africanum is also

found in several of the M. tuberculosis strains is in full

agreement with the position of M. africanum in the

dendrogram displayed in Fig. 1.
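
The assignment of overall sequence types can be illustrated with a short Python sketch that groups the 26 strains by their combined SNP profile across C11, D4 and H8. The profiles are transcribed from Table 4. The helper names (profile, group_by_profile) are illustrative and not part of the original analysis, and the unresolved H8C call for M. africanum is treated here as identical to the reference, an assumption that is consistent with its type B designation.

```python
from collections import defaultdict

# SNP positions in Table 4 order; "-" means identical to the reference base.
COLUMNS = ("C11A", "D4A", "D4B", "D4C", "D4D", "H8A", "H8B", "H8C", "H8D")


def profile(**changes):
    """Build a 9-position profile; unspecified positions equal the reference."""
    return tuple(changes.get(col, "-") for col in COLUMNS)


# (strain numbers, species, profile), transcribed from Table 4.
TABLE4 = [
    ((2, 7, 3, 12, 15, 21, 22, 23, 24, 25, 26), "tuberculosis", profile()),
    ((5, 9), "tuberculosis", profile(C11A="C")),
    ((8, 11), "tuberculosis", profile(D4A="C", H8C="A", H8D="GCAT")),
    ((4, 6), "tuberculosis", profile(D4B="G")),
    ((18, 19, 20), "tuberculosis", profile(D4B="G", H8A="C")),
    ((1,), "bovis", profile(H8A="C", H8B="T")),
    ((13,), "bovis", profile(H8C="A", H8D="GCAT")),
    ((16,), "bovis", profile(H8C="A")),
    # Assumption: the "?" at H8C for M. africanum is treated as reference.
    ((14,), "africanum", profile()),
    ((10,), "microti", profile(H8C="A", H8D="GCAT")),
    ((17,), "canettii", profile(D4C="G", D4D="G", H8C="A", H8D="GCAT")),
]


def group_by_profile(rows):
    """Map each distinct combined profile to the strains that carry it."""
    groups = defaultdict(list)
    for strains, species, prof in rows:
        for strain in strains:
            groups[prof].append((strain, species))
    return groups


groups = group_by_profile(TABLE4)
mtb_profiles = {
    prof
    for prof, members in groups.items()
    if any(species == "tuberculosis" for _, species in members)
}
print(len(groups), len(mtb_profiles))  # prints: 9 5
```

Grouping on the full nine-position tuple reproduces the nine overall genotypes and the five types seen within M. tuberculosis; the two Beijing profiles differ only at H8A.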

4. Discussion

Comprehensive studies on microbial evolution and

population genetics depend on the detection of genetic

variation and diversity among members of the bacte-

rial species under investigation (Van Belkum et al.,

2001; Kato-Maeda et al., 2001a,b). Assessment of

panmicticism versus clonality is based on the degree

of genetic variability that can be addressed in the

microbial genetics laboratory. Consequently, the tractability

of a bacterial genotype may be limited when only

small numbers of distinct genetic polymorphisms that

effectively separate strains are known. The mechanism

underlying the epidemic spread of organisms that are

largely clonal remains incompletely understood. M.

tuberculosis is an example of such a microorganism:

although major clones have been described and fol-

lowed globally (Bifani et al., 2002; Kato-Maeda et al.,

2001a,b), additional genotyping tools are required in

order to deepen our understanding of M. tuberculosis’

recent evolution and its mode of transmission (Brosch

et al., 2002; Mostowy et al., 2002). This is important

since control of mycobacterial diseases can only be

optimised once dissemination mechanisms have been

elucidated in full detail (Mathema and Kreiswirth,

2003). In addition, it has been demonstrated that single

mutations can change the pathogenicity of an M.

tuberculosis isolate (Collins et al., 1995) or its resis-

tance against some of the most commonly applied

antibiotics (Troesch et al., 1999; Upton et al., 2001;

Ramaswamy et al., 2003). This again argues for

the need to identify additional molecular

markers suited for detecting clinically relevant genetic

polymorphism in M. tuberculosis, even beyond

the large deletions that were detected upon whole-genome

sequence comparison.

4.1. Distinguishing among M. tuberculosis strains

We here show that 10 strains of M. tuberculosis

suffice for the detection and identification of several

new genetic markers by htAFLP. Straightforward

application of htAFLP resulted in significant numbers

of markers for which sequence data could be

obtained. Further development of a small subset of

these marker molecules already resulted in several

novel and convenient PCR-RFLP tests for mapping

genomic polymorphisms that are epidemiologically

concordant among M. tuberculosis strains (see Table

3). In addition, by sequencing a similarly

limited set of htAFLP-identified loci, we identified

several new polymorphisms suited for distinguishing

strains in the M. tuberculosis complex including

members of the highly conserved Beijing family.

The universal spread of this strain and its frequent

association with outbreaks of disease indicate that

such additional molecular markers may be used to

further refine the ontogeny of this particularly path-

ogenic strain (Glynn et al., 2002). The mutation we

detected is located in a conserved hypothetical gene

for which further functional studies are certainly

warranted.

4.2. htAFLP and whole genome sequences

In principle, AFLP is a simple laboratory tech-

nology, requiring PCR machines and electrophoresis

equipment only. This renders the htAFLP easily

accessible to the microbiology research laboratory.

AFLP can also be used diagnostically and when

adequate software is available fingerprints can be

stored in exchangeable and expandable databases.

Informative evolutionary and epidemiological com-

parisons can be made (Kassama et al., 2002). In

addition, when whole genome sequences are avail-

able, AFLP fingerprints can be predicted on the

basis of computerised analyses. The number of frag-

ments generated by our htAFLP method is in rea-

sonable agreement with the expectations based on in

silico analyses (results not shown). This is in con-

trast with a previous study where the authors pre-

sented significant differences between theoretical and

practical outcomes (Sims et al., 2002). The data

presented by these authors were not supported by

sequencing of polymorphic fragments. It seems that

the choice of the restriction enzyme is critical in this

respect, and the methylation status of a

restriction site may also be important (Hemavathy and

Nagaraja, 1995). Furthermore, it was recently dem-

onstrated for another clonal bacterial species that

even in the absence of a full genome sequence

AFLP can be used for the generation of informative

DNA probes. For Salmonella enterica serovar typhi-

murium, it was shown that probes specific for a

certain phage type could be readily developed (Hu et

al., 2002). Among 46 strains, 84 phage type-specific

fragments were identified on the basis of a single

restriction enzyme combination and all 16 different

+1 extensions for the primer pair used. This corroborates

the approach sketched here, although for

an even more clonal organism such as M.

tuberculosis the number of variant fragments is smaller.

Other procedures, such as for instance mapping IS

element insertion sites (Collins et al., 2002), are

clearly less efficient for mutation detection in my-

cobacterial species.
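
The in silico prediction of AFLP fingerprints mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the software used in this study: the enzyme pair EcoRI/MseI is taken from the classic AFLP protocol (Vos et al., 1995) rather than from the enzyme combinations used here, the genome is treated as a linear string, both recognition sites are palindromic, and adapter lengths and the size window of the gel are ignored. The function names are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_cuts(seq, site, offset, label):
    """Cut coordinates for every (possibly overlapping) occurrence of a site."""
    return [(m.start() + offset, label) for m in re.finditer(f"(?={site})", seq)]


def predict_fragments(seq, enzymes, selective=None):
    """Predict lengths of fragments flanked by two different enzymes.

    enzymes:   {label: (recognition_site, cut_offset_within_site)}
    selective: {label: selective nucleotides read inward from that site}
    """
    selective = selective or {}
    cuts = sorted(
        cut
        for label, (site, offset) in enzymes.items()
        for cut in find_cuts(seq, site, offset, label)
    )
    lengths = []
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        if left == right or end == start:
            continue  # only fragments with two different ends are amplified
        frag = seq[start:end]
        l_site, l_off = enzymes[left]
        _, r_off = enzymes[right]
        # Bases just inside each site remnant, read in primer direction.
        inner_left = frag[len(l_site) - l_off:]
        inner_right = revcomp(frag[:len(frag) - r_off])
        if inner_left.startswith(selective.get(left, "")) and \
                inner_right.startswith(selective.get(right, "")):
            lengths.append(len(frag))
    return sorted(lengths)


# Assumed enzyme pair: EcoRI (G^AATTC) and MseI (T^TAA).
ENZYMES = {"EcoRI": ("GAATTC", 1), "MseI": ("TTAA", 1)}
toy = "CC" + "GAATTC" + "ACGTACGT" + "TTAA" + "GG" + "GAATTC" + "CC"
print(predict_fragments(toy, ENZYMES))                  # [6, 14]
print(predict_fragments(toy, ENZYMES, {"EcoRI": "A"}))  # [14]
```

Adding a selective nucleotide to one primer removes the fragments whose first internal base at that end does not match, which is how the +1 extensions reduce the complexity of each fingerprint.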

The availability of two genome sequences for M.

tuberculosis (Cole et al., 1998; Fleischmann et al.,

2002) facilitates initial functional analysis of our specific

htAFLP markers. Mathematical studies performed by

Hughes et al. (2002) indicated that the rate of synonymous

mutations was 0.000328 ± 0.000022% when the two full genome sequences were

compared. More than 80% of all sites appeared to be

nonvariable. We here establish approximately 5%

polymorphism when screening 1200 loci by htAFLP.

Apparently, when more than two strains are used, the

relative ease with which mutations can be tracked down

increases, which does not support the claim by Hughes

et al. that “large numbers of loci need to be screened”

before significant variability can be detected for my-

cobacterial strains. When assessing the nature of the

variable regions, the genomics approaches identified

phospholipase, membrane lipoproteins, the PE/PPE

surface proteins and certain cyclases (Fleischmann et

al., 2002). All of these elements can also be traced back

in Table 2, highlighting the BLAST searches for many

of the AFLP fragments. Additional genetic elements

such as the prophage phiRv1 and molybdopterin cofac-

tor biosynthesis genes were also identified by both

approaches. Based on their genomic comparison,

Fleischmann et al. (2002) literally conclude that they

were able to “... develop a set of markers that would be

valuable in studying the phylogenetics of the M. tuberculosis

species and other tubercle bacilli”. In view of

the overlapping outcome of our AFLP analysis, we feel

confident in repeating this statement for the AFLP

approach.
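
The polymorphism percentages quoted for htAFLP follow directly from the fragment counts given in the abstract. The short calculation below is a sketch that assumes the figure of approximately 5% refers to the 56 markers polymorphic among the 10 M. tuberculosis strains, and that 1200 is used as the number of screened fragments (the abstract reports "more than 1200", so the true fractions are slightly lower).

```python
def percent_polymorphic(polymorphic, screened):
    """Share of screened fragments that are polymorphic, in percent."""
    return 100.0 * polymorphic / screened


SCREENED = 1200  # assumption: lower bound from "more than 1200" fragments

# All species of the complex: 169 polymorphic fragments.
print(round(percent_polymorphic(169, SCREENED), 1))  # 14.1
# Within M. tuberculosis only: 56 polymorphic markers.
print(round(percent_polymorphic(56, SCREENED), 1))   # 4.7
```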

4.3. Concluding remarks

We have not yet reached the stage where for all

species of microorganisms two full genome sequences

are available. This calls for alternative strategies for

high-density assessment of informative genetic poly-

morphism. We here provide proof of principle for one

such method. htAFLP data as presented in this com-

munication are in adequate agreement with whole

genome comparisons. It is also shown that the avail-

ability of such markers can be helpful in the develop-

ment of simple tests for assessment of genetic

polymorphism among large numbers of microbial

strains. In conclusion, in the absence of multiple

genome sequences for a given microbial species,

htAFLP provides an attractive option for high-density

genotyping and the subsequent development of phy-

logenetically informative molecular variables.

Acknowledgements

The research described in this communication has

in part been facilitated by a grant provided by the

Dutch Ministry of Economic Affairs (BTS 00145).

References

Ahmed, N., Alam, M., Majeed, A.A., Rahman, S.A., Cataldi, A.,

Cousins, D., Hasnain, S.E., 2003. Genome sequence based, com-

parative analysis of the fluorescent amplified fragment length

polymorphisms (FAFLP) of tubercle bacilli from seals provides

molecular evidence for a new species within the Mycobacterium

tuberculosis complex. Infect. Genet. Evol. 2, 193–199.

Behr, M.A., Wilson, M.A., Gill, W.P., Salamon, H., Schoolnik,

G.K., Rane, S., Small, P.M., 1999. Comparative genomics of

BCG vaccines by whole genome DNA microarray. Science

284, 1520–1523.

Bifani, P.J., Mathema, B., Kurepina, N.E., Kreiswirth, B.N., 2002.

Global dissemination of the Mycobacterium tuberculosis W–

Beijing family strains. Trends Microbiol. 10, 45–52.

Brosch, R., Gordon, S.V., Marmiesse, M., Brodin, P., Buchrieser, C.,

Eiglmeier, K., Garnier, T., Gutierrez, C., Hewinson, G., Kremer,

K., Parsons, L.M., Pym, A.S., Samper, S., Van Soolingen, D.,

2002. A new evolutionary scenario for the Mycobacterium tuberculosis

complex. Proc. Natl. Acad. Sci. U. S. A. 99, 3684–3689.

Cole, S.T., Brosch, J., Parkhill, J., Garnier, T., Churcher, C., Harris,

D., Gordon, S.V., Eiglmeier, K., Gas, S., Barry, C.E., Tekaia,

K., Badcock, K., Baham, D., Brown, D., Chillingworth, T.,

Connor, R., Davies, R., Devlin, K., Feltwell, T., Gentles, S.,

Hamlin, N., Holroyd, S., Hornsby, T., Jagels, K., Krogh, A.,

McClean, J., Moule, S., Murphy, L., Oliver, K., Osborne, J.,

Quail, M.A., Rajandream, M.A., Rogers, R., Sutter, S., Seeger,

K., Skelton, J., Squares, R., Sulston, J.E., Taylor, K., White-

head, S., Barrell, B.G., 1998. Deciphering the biology of Myco-

bacterium tuberculosis from the complete genome sequence.

Nature 393, 537–544.

Collins, D.M., Stephens, D.M., 1991. Identification of insertion

sequence, IS1081, in Mycobacterium bovis. FEMS Lett. 83,

11–16.

Collins, D.M., Kawakami, R.P., De Lisle, G.W., Pascopella, L.,

Bloom, B.R., Jacobs, W.R., 1995. Mutation of the principal

sigma factor causes loss of virulence in a strain of the Mycobacterium

tuberculosis complex. Proc. Natl. Acad. Sci. U. S. A.

92, 8036–8040.

Collins, D.M., De Zoete, M., Cavaignac, S.M., 2002. Mycobacte-

rium avium subsp. paratuberculosis strains from cattle and

sheep can be distinguished by a PCR test based on a novel

DNA sequence difference. J. Clin. Microbiol. 40, 4760–4762.

David, H.L., Newman, C.M., 1971. Some observations on the ge-

netics of isoniazid resistance in the tubercle bacilli. Am. Rev.

Respir. Dis. 104, 508–515.

Fang, Z., Morrison, N., Watt, B., Doig, C., Forbes, K.J., 1998.

IS6110 transposition and evolutionary scenario of the direct

repeat locus in a group of closely related Mycobacterium tuber-

culosis strains. J. Bacteriol. 180, 2102–2109.

Fleischmann, R.D., Alland, D., Eisen, J.A., Carpenter, L., White,

O., Peterson, J., DeBoy, R., Dodson, R., Gwinn, M., Haft, D.,

Hickey, E., Kolonay, J.F., Nelson, W.C., Umayam, L.A., Ermo-

laeva, M., Salzberg, S.L., Delcher, A., Utterback, T., Weidman,

J., Khouri, H., Gill, J., Mikula, A., Bishai, W., Jacobs, W.R.,

Venter, J.C., Fraser, C.M., 2002. Whole genome comparison

of Mycobacterium tuberculosis clinical and laboratory strains.

J. Bacteriol. 184, 5479–5490.

Glynn, J.R., Whiteley, J., Bifani, P.J., Kremer, K., Van Soolingen,

D., 2002. Worldwide occurrence of Beijing/W strains of Myco-

bacterium tuberculosis: a systematic review. Emerg. Infect. Dis.

8, 843–849.

Gordon, S.V., Brosch, R., Billault, A., Garnier, T., Eiglmeier, K.,

Cole, S.T., 1999. Identification of variable regions in the ge-

nomes of tubercle bacilli using bacterial artificial chromosome

arrays. Mol. Microbiol. 32, 643–655.

Groenen, P.M.A., Bunschoten, A.E., Van Soolingen, D., Van

Embden, J.D.A., 1993. Nature of DNA polymorphism in the

direct repeat cluster of Mycobacterium tuberculosis: application

for strain differentiation by a novel typing method. Mol. Micro-

biol. 10, 1057–1085.

Hemavathy, K.C., Nagaraja, V., 1995. DNA methylation in myco-

bacteria: absence of methylation at GATC (Dam) and CCA/

TGG (Dcm) sequences. FEMS Immunol. Med. Microbiol. 11,

291–296.

Hermans, P.W.M., Van Soolingen, D., Bik, E.M., De Haas, P.E.W.,

Dale, J.W., Van Embden, J.D.A., 1993. The insertion element

IS987 from Mycobacterium bovis BCG is located in a hot spot

integration region for insertion elements in Mycobacterium

tuberculosis complex strains. Infect. Immun. 59, 2695–2705.

Hu, H., Lan, R., Reeves, P.R., 2002. Fluorescent amplified fragment

length polymorphism analysis of Salmonella enterica serovar

typhimurium reveals phage-type specific markers and potential

for microarray typing. J. Clin. Microbiol. 40, 3406–3415.

Hughes, A.L., Friedman, R., Murray, M., 2002. Genome-wide pattern

of synonymous nucleotide substitution in two complete

genomes of Mycobacterium tuberculosis. Emerg. Infect. Dis.

8, 1342–1345.

Kamerbeek, J., Schouls, L.M., Kolk, A., Van Agterveld, M., Van

Soolingen, D., Kuijper, S., Bunschoten, J.E., Molhuizen, H.,

Shaw, R., Goyal, M., Van Embden, J.D.A., 1997. Simultaneous

detection and strain differentiation of Mycobacterium tuberculosis

for diagnosis and epidemiology. J. Clin. Microbiol. 35,

907–914.

Kapur, V., Whittam, T.S., Musser, J.M., 1994. Is Mycobacterium

tuberculosis 15,000 years old? J. Infect. Dis. 170, 1348–1349.

Kassama, Y., Rooney, P.J., Goodacre, R., 2002. Fluorescent amplified

fragment length polymorphisms probabilistic database for

identification of bacterial isolates from urinary tract infections.

J. Clin. Microbiol. 40, 2795–2800.

Kato-Maeda, M., Bifani, P.J., Kreiswirth, B.N., Small, P.M., 2001a.

The nature and consequence of genetic variability within Myco-

bacterium tuberculosis. J. Clin. Invest. 107, 533–537.

Kato-Maeda, M., Rhee, J.T., Gingeras, T.R., Salamon, H., Dren-

kow, J., Smittipat, N., Small, P.M., 2001b. Comparing genomes

within the species Mycobacterium tuberculosis. Genome Res.

11, 547–554.

Kremer, K., Van Soolingen, D., Frothingham, R., Haas, W.H.,

Hermans, P.W.M., Martin, C., Palittapongarnpim, P., Plikaytis,

B.B., Riley, L.W., Yakrus, M.A., Musser, J.M., Van Embden,

J.D.A., 1999. Comparison of methods based on different mo-

lecular epidemiological markers for typing of Mycobacterium

tuberculosis complex strains: interlaboratory study of discrim-

inatory power and reproducibility. J. Clin. Microbiol. 37,

2607–2618.

Mariani, F., Piccolella, E., Collizzi, V., Rappuoli, R., Gross, R.,

1993. Characterization of an IS-like element from Mycobacte-

rium tuberculosis. J. Gen. Microbiol. 139, 1767–1772.

Mathema, B., Kreiswirth, B.N., 2003. Rethinking tuberculosis epi-

demiology: the utility of molecular methods. ASM News 69,

80–85.

Mostowy, S., Cousins, D., Brinkman, J., Aranaz, A., Behr, M.A.,

2002. Genomic deletions suggest a phylogeny for the Mycobac-

terium tuberculosis complex. J. Infect. Dis. 186, 74–80.

Musser, J.M., Amin, A., Ramaswamy, S., 2000. Negligible genetic

diversity of Mycobacterium tuberculosis host immune system

protein targets. Evidence of limited selective pressure. Genetics

155, 7–16.

Pfyffer, G.E., Auckenthaler, R., Van Embden, J.D.A., Van Soolingen,

D., 1998. Mycobacterium canettii, the smooth variant of M.

tuberculosis, isolated from a Swiss patient exposed in Africa.

Emerg. Infect. Dis. 4, 631–634.

Ramakrishnan, L., Federspiel, N.A., Falkow, S., 2000. Granuloma-

specific expression of Mycobacterium virulence proteins from

the glycine-rich PE-PGRS family. Science 288, 1436–1439.

Ramaswamy, S.V., Reich, R., Dou, S.J., Jasperse, L., Pan, X.,

Wanger, A., Quitugua, T., Graviss, E.A., 2003. Single nucleo-

tide polymorphisms in genes associated with isoniazid resist-

ance in Mycobacterium tuberculosis. Antimicrob. Agents

Chemother. 47, 1241–1250.

Roring, S., Scott, A., Brittain, D., Walker, I., Hewison, G., Neill, S.,

Skuce, R., 2002. Development of variable number of tandem

repeat typing of Mycobacterium bovis: comparison of results

with those obtained by using existing exact tandem repeats

and spoligotyping. J. Clin. Microbiol. 40, 2126–2133.

Sims, E.J., Goyal, M., Arnold, C., 2002. Experimental versus in

silico fluorescent amplified fragment length polymorphism analysis

of Mycobacterium tuberculosis: improved typing with an

extended fragment range. J. Clin. Microbiol. 40, 4072–4076.

Skeiky, Y.A., Ovendale, P.J., Jen, S., Alderson, M.R., Dillon, D.C.,

Smith, S., Wilson, C.B., Orme, I.M., Reed, S.G., Campos-Neto,

A., 2000. T cell expression cloning of a Mycobacterium tuber-

culosis gene encoding a protective antigen associated with the

early control of infection. J. Immunol. 165, 7140–7149.

Small, P.M., Hopewell, P.C., Singh, S.P., Paz, A., Parsonnet, J.,

Ruston, D.C., Schecter, G.F., Daley, C.L., Schoolnik, G.A.,

1994. The epidemiology of tuberculosis in San Francisco. A

population based study using conventional and molecular

methods. N. Engl. J. Med. 330, 1703–1709.

Sreevatsan, S., Pan, X., Stockbauer, K.E., Connell, N.D., Kreis-

wirth, B.N., Whittam, T.S., Musser, J.M., 1997. Restricted struc-

tural gene polymorphism in the Mycobacterium tuberculosis

complex indicates evolutionarily recent global dissemination.

Proc. Natl. Acad. Sci. U. S. A. 94, 9869–9874.

Supply, P., Lesjean, S., Savine, E., Kremer, K., Van Soolingen, D.,

Locht, C., 2001. Automated high throughput genotyping for

study of global epidemiology of Mycobacterium tuberculosis

based on mycobacterial interspersed repetitive units. J. Clin.

Microbiol. 39, 3563–3571.

Supply, P., Mazars, E., Lesjean, S., Vincent, V., Gicquel, B., Locht,

C., 2000. Variable human minisatellite-like regions in the Myco-

bacterium tuberculosis genome. Mol. Microbiol. 36, 762–771.

Thierry, D., Brisson Noel, A., Vincent-Levy-Frebault, V., Nguyen,

S., Guesdon, J., Gicquel, B., 1990. Characterization of a Myco-

bacterium tuberculosis insertion sequence, IS6110, and its ap-

plication in diagnosis. J. Clin. Microbiol. 28, 2668–2673.

Troesch, A., Nguyen, H., Miyada, C.G., Desvarenne, S., Gingeras,

T.R., Kaplan, P.M., Cros, P., Mabilat, C., 1999. Mycobacterium

species identification and rifampin resistance testing with high-

density DNA probe arrays. J. Clin. Microbiol. 37, 49–55.

Upton, A.M., Mushtaq, A., Victor, T.C., Sampson, S.L., Sandy, J.,

Smith, D.M., Van Helden, P.V., Sim, E., 2001. Arylamine N-

acetyltransferase of Mycobacterium tuberculosis is a polymor-

phic enzyme and a site of isoniazid metabolism. Mol. Microbiol.

42, 309–317.

Van Belkum, A., Scherer, S., Van Alphen, L., Verbrugh, H., 1998.

Short sequence DNA repeats in prokaryotic genomes. Microbiol.

Mol. Biol. Rev. 62, 275–293.

Van Belkum, A., Struelens, M., De Visser, A., Verbrugh, H.,

Tibayrenc, M., 2001. Role of genomic typing in taxonomy, evo-

lutionary genetics, and microbial epidemiology. Clin. Microbiol.

Rev. 14, 547–560.

Van Embden, J.D.A., Crawford, J.T., Dale, J.W., Gicquel, B., Her-

mans, P.W.A., McAdam, R., Shinnick, T., Small, P.M., 1993.

Strain identification of Mycobacterium tuberculosis by DNA fingerprinting:

recommendations for a standardized method. J. Clin.

Microbiol. 31, 406–409.

Van Embden, J.D.A., Van Gorkom, T., Kremer, K., Jansen, R.,

Van der Zeijst, B.A.M., Schouls, L.M., 2000. Genetic varia-

tion and evolutionary origin of the direct repeat locus of

Mycobacterium tuberculosis complex bacteria. J. Bacteriol.

182, 2393–2401.

Van Soolingen, D., De Haas, P.E.W., Hermans, P.W.M., Van

Embden, J.D.A., 1994. DNA fingerprinting of Mycobacterium

tuberculosis. Methods Enzymol. 235, 196–205.

Van Soolingen, D., Hoogenboezem, T., De Haas, P.E.W., Hermans,

P.W.M., Koedam, M.A., Teppema, K.S., Brennan, P.J., Besra,

G.S., Portaels, F., Top, J., Schouls, L.M., Van Embden, J.D.,

1997. A novel pathogenic taxon of the Mycobacterium tuber-

culosis complex, Canettii: characterization of an exceptional

isolate from Africa. Int. J. Syst. Bacteriol. 47, 1236–1245.

Vos, P., Hogers, R., Bleeker, M., Reijans, M., Van de Lee, T.,

Hornes, M., Frijters, A., Pot, J., Peleman, J., Kuiper, M., Za-

beau, M., 1995. AFLP: a new technique for DNA fingerprinting.

Nucleic Acids Res. 23, 4407–4414.