Novel primers for complete mitochondrial cytochrome b gene sequencing in mammals
ASHWIN NAIDU,* ROBERT R. FITAK,† ADRIAN MUNGUIA-VEGA* and MELANIE CULVER*†‡
*School of Natural Resources and the Environment, University of Arizona, 1311, East Fourth Street, Room 317, Tucson, AZ 85721,
USA, †Genetics Graduate Interdisciplinary Program, University of Arizona, Tucson, AZ 85721, USA, ‡Arizona Cooperative Fish
and Wildlife Research Unit, University of Arizona, Tucson, AZ 85721, USA
Abstract
Sequence-based species identification relies on the extent and integrity of sequence data available in online databases such
as GenBank. When identifying species from a sample of unknown origin, partial DNA sequences obtained from the sam-
ple are aligned against existing sequences in databases. When the sequence from the matching species is not present in the
database, high-scoring alignments with closely related sequences might produce unreliable results on species identity. For
species identification in mammals, the cytochrome b (cyt b) gene has been identified to be highly informative; thus, large
amounts of reference sequence data from the cyt b gene are much needed. To enhance availability of cyt b gene sequence
data on a large number of mammalian species in GenBank and other such publicly accessible online databases, we identi-
fied a primer pair for complete cyt b gene sequencing in mammals. Using this primer pair, we successfully PCR amplified
and sequenced the complete cyt b gene from 40 of 44 mammalian species representing 10 orders of mammals. We submit-
ted 40 complete, correctly annotated, cyt b protein coding sequences to GenBank. To our knowledge, this is the first single
primer pair to amplify the complete cyt b gene in a broad range of mammalian species. This primer pair can be used for the
addition of new cyt b gene sequences and to enhance data available on species represented in GenBank. The availability of
novel and complete gene sequences as high-quality reference data can improve the reliability of sequence-based species
identification.
Keywords: cytochrome b, mammals, primers, sequence data, species identification
Received 12 June 2010; revision received 31 August 2011; accepted 7 September 2011
Introduction
Sequence-based species identification relies on the extent
and integrity of sequence data present in online databases
such as GenBank, but the availability and quality of
reference sequence data have often been questioned
(Bridge et al. 2003; Harris 2003; Nilsson et al. 2006).
When identifying species from a sample of unknown
origin, a BLAST search of partial DNA sequence(s)
obtained from the sample may not overlap, or match
completely, with reference sequences present in Gen-
Bank. When the sequence from the matching species is
not present, high-scoring alignments with closely
related sequences might produce unreliable results on
species identity. Accurate identification of species from
DNA sequences depends on existing reference data in
GenBank and interpretation of BLAST search results
(reviewed in Kang et al. 2010). To minimize errors in
species identification, there is a need for the deposition
of complete gene sequences into GenBank that can be
particularly useful as reference sequences. These refer-
ence sequences would provide full coverage in local
alignments with query sequences obtained from ampli-
fication of partial gene regions (e.g. in forensic cases
and ancient DNA applications).
For species identification, the query sequence must
show little intra-specific variation and sufficient interspe-
cies variation such that two closely related members of
the same genus could be separated; this query sequence
can be the whole gene or only a short region of the gene
(e.g. Hebert et al. 2003). Recently, for species identifica-
tion in mammals, the mitochondrial cytochrome b (cyt b)
gene and the cytochrome c oxidase subunit 1 (COI) gene
have become popular (reviewed in Ogden et al. 2009). In
mammals, the cyt b gene shows higher interspecies varia-
tion and thereby, when using shorter regions, can be
more informative than the COI gene for species identifi-
cation (Tobe et al. 2009). Although many mammalian
species are represented in online databases such as GenBank,
only partial DNA sequences from the cyt b gene of
these species are generally available. This could be due
to the primer pairs that have already been in use for cyt b
gene sequencing in whole or in part (e.g. Kocher et al.
1989; Irwin et al. 1991; Bartlett & Davidson 1992; Parson
et al. 2000; Hsieh et al. 2001; Verma & Singh 2003).
Correspondence: Ashwin Naidu, Fax: (520) 621 8801; E-mail: [email protected]
© 2011 Blackwell Publishing Ltd
Molecular Ecology Resources (2012) 12, 191–196 doi: 10.1111/j.1755-0998.2011.03078.x
To alleviate the issues associated with reference
sequence availability and efficient amplification of whole
genes, we developed a single PCR primer pair to enable
sequencing of the complete cyt b gene (~1140 bp) in a
large number of mammal species. Because the cyt b gene
is valuable for mammalian species identification, we
developed this primer pair with the aim of contributing
towards and expanding the complete cyt b sequence data
in GenBank.
Materials and methods
To design this primer pair, we downloaded DNA
sequences spanning ~1740 bp in the mitochondrial genome
of six species representing six mammalian orders
from GenBank (Table 1). These DNA sequences included
the complete coding sequence (cds) of the cyt b gene and
~300 bp of flanking sequence on either side. We aligned
these sequences using BioEdit software v7.0.9.0 (Hall
1999). We identified and designed a forward primer,
MTCB-F (5′-CCHCCATAAATAGGNGAAGG-3′), and a
reverse primer, MTCB-R (5′-WAGAAYTTCAGCTTTGGG-3′),
located in highly conserved regions: the forward
primer is located in the NADH dehydrogenase
subunit 6 gene, and the reverse primer is anchored in the
transfer RNA-Pro gene. The positions of these forward
and reverse primers in the human mitochondrial genome
are 14588–14607 and 15989–16006, respectively, according
to Anderson et al. (1981).
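The degenerate positions in the two primers can be made concrete with a short sketch. It expands each primer into the non-degenerate oligonucleotides it encodes, using the standard IUPAC nucleotide codes, and checks the primer lengths against the human coordinates quoted above. The function names `expand` and `span` are illustrative only and are not part of any tool used in this study.

```python
from itertools import product

# Standard IUPAC nucleotide codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

MTCB_F = "CCHCCATAAATAGGNGAAGG"  # forward primer, as given in the text
MTCB_R = "WAGAAYTTCAGCTTTGGG"    # reverse primer, as given in the text


def expand(primer):
    """Return every non-degenerate oligo encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def span(start, end):
    """Length of an inclusive coordinate range."""
    return end - start + 1


fwd_variants = expand(MTCB_F)
rev_variants = expand(MTCB_R)
print(len(MTCB_F), "nt,", len(fwd_variants), "variants")
print(len(MTCB_R), "nt,", len(rev_variants), "variants")
print("primer-to-primer span in the human reference:", span(14588, 16006), "bp")
```

Under these coordinates the span between the outer primer ends in the human sequence is 1419 bp, which falls within the 1415–1442 bp range of target fragment sizes reported in Table 1.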
We used OligoAnalyzer software v3.1 (Integrated
DNA Technologies, Inc., Coralville, IA, USA, http://
www.idtdna.com/analyzer/Applications/OligoAnalyzer/)
to assess melting temperatures and to verify homo-
dimers, hetero-dimers, secondary structures and self-
priming on the putative primer pair—MTCB-F and
MTCB-R. We used a short-input sequence optimized
BLAST search on each primer sequence to check for
matching sequences in GenBank. To test the functionality
and scope of this primer pair, we performed an in silico
PCR using the Amplify software v3.1 (Copyright of Bill
Engels, 2005, University of Wisconsin, http://
engels.genetics.wisc.edu/amplify/) on mitochondrial
genome sequences of 27 species representing 26 mamma-
lian orders (Table 1). We obtained amplification in all 27
species. The amplified target fragment size ranged
between 1415 and 1442 bp (Table 1). We also observed
amplification of nontarget fragments in some species
indicating that in vitro gel purification of the amplified
target fragment may be required prior to sequencing.
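The idea behind the in silico test can be illustrated with a minimal sketch. This is not the algorithm implemented in Amplify v3.1, which also evaluates imperfect priming; the sketch assumes exact, degeneracy-aware matches on a linear template, with the forward primer matching the given strand and the reverse primer matching the complementary strand. The toy template and the function names are hypothetical.

```python
import re

# IUPAC codes written as regular-expression character classes
IUPAC_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]", "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

MTCB_F = "CCHCCATAAATAGGNGAAGG"
MTCB_R = "WAGAAYTTCAGCTTTGGG"


def reverse_complement(seq):
    """Reverse complement, including degenerate IUPAC codes."""
    return seq.translate(COMPLEMENT)[::-1]


def to_regex(primer):
    """Turn a degenerate primer into a regular expression."""
    return "".join(IUPAC_REGEX[b] for b in primer)


def in_silico_pcr(template, fwd, rev, max_len=3000):
    """Return sorted product sizes for exact primer matches on a linear template."""
    template = template.upper()
    # Lookahead patterns so that overlapping binding sites are also reported
    fwd_starts = [m.start() for m in re.finditer("(?=" + to_regex(fwd) + ")", template)]
    rev_pattern = to_regex(reverse_complement(rev))
    rev_ends = [m.start() + len(rev)
                for m in re.finditer("(?=" + rev_pattern + ")", template)]
    return sorted(end - start
                  for start in fwd_starts for end in rev_ends
                  if len(fwd) + len(rev) <= end - start <= max_len)


# Toy template: one binding site for each primer, 100 bp apart (hypothetical data)
fwd_site = "CCACCATAAATAGGAGAAGG"
rev_site = reverse_complement("AAGAACTTCAGCTTTGGG")
template = "GATTACA" * 3 + fwd_site + "A" * 100 + rev_site + "GATTACA" * 3
print(in_silico_pcr(template, MTCB_F, MTCB_R))
```

A template with additional primer-compatible sites would return more than one size, which corresponds to the nontarget fragments listed in Table 1.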
We further tested the primer pair using PCR in vitro
on 44 species representing 10 orders of mammals
(Table 2). Of these 44 species, 41 were not among the 27
species that we tested in silico. All 44 samples were tissue
samples collected in the field by various experienced
field biologists (see Acknowledgements), who identified
species with the specimen in hand. To extract DNA from
these samples, we used the QIAamp Tissue & Blood
kit (Qiagen, Valencia, CA, USA) and followed the manu-
facturer’s instructions. To test specificity of the primer
pair, we also included DNA samples from two nonmammalian
species, Athene cunicularia and Anodonta californiensis.
We used 0.1–10 ng/µL of DNA as template for
PCR. We performed amplifications in a 20-µL reaction
volume with the following final concentrations: 1× PCR
buffer containing 1.5 mM MgCl2 (Qiagen), an additional
1.0 mM MgCl2 (Qiagen), 0.2 mM dNTPs (Qiagen), 0.05%
BSA (Sigma-Aldrich, St. Louis, MO, USA), 0.5 U of Taq
DNA Polymerase (Qiagen) and 0.5 µM each of forward
and reverse primers. PCR cycling was performed in
Mastercycler PCR machines (Eppendorf, Westbury, NY,
USA) with an initial denaturation at 95 °C for 10 min,
followed by 35 cycles of denaturation at 95 °C for 45 s,
annealing at 55 °C for 1 min and extension at 72 °C for 2 min,
and then a final extension step at 72 °C for 10 min. We
subjected PCR products to electrophoresis in a 1% aga-
rose gel stained with ethidium bromide. We prepared
PCR products for sequencing via treatment with the Exo-
SAP-IT PCR Clean-up kit (USB Corporation, Cleveland,
OH, USA) using the manufacturer’s recommendations. Where
nontarget fragments were amplified in addition to the
target region (~1420 bp), we used the QIAquick Gel
Extraction kit (Qiagen) and followed the manufacturer’s
instructions to purify the target fragment for sequencing.
All purified PCR products were sequenced using both
forward and reverse primers on a 3730xl Automated
DNA Analyzer (Applied Biosystems, Foster City, CA,
USA).
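For readers setting up the reaction, the final concentrations above can be converted into pipetting volumes with the dilution relation C_stock × V_stock = C_final × V_reaction. The sketch below is only an illustration: the final concentrations and the 20-µL reaction volume are taken from the text, whereas every stock concentration and the 1 µL of template per reaction are assumed values that must be replaced with those of the reagents actually in hand.

```python
REACTION_UL = 20.0  # reaction volume, from the text

# name: (final concentration, stock concentration), same unit within each pair.
# Final concentrations come from the text; every stock concentration is ASSUMED.
COMPONENTS = {
    "10x PCR buffer (x)": (1.0, 10.0),
    "additional MgCl2 (mM)": (1.0, 25.0),
    "dNTPs (mM)": (0.2, 10.0),
    "BSA (%)": (0.05, 1.0),
    "MTCB-F (uM)": (0.5, 10.0),
    "MTCB-R (uM)": (0.5, 10.0),
}
TAQ_UNITS = 0.5           # units per reaction, from the text
TAQ_STOCK_U_PER_UL = 5.0  # ASSUMED stock activity


def stock_volume_ul(final, stock, reaction_ul=REACTION_UL):
    """Solve C_stock * V_stock = C_final * V_reaction for V_stock."""
    return final * reaction_ul / stock


def master_mix(n_reactions=1, template_ul=1.0):
    """Component volumes in uL; template_ul is an ASSUMED volume per reaction."""
    per_reaction = {name: stock_volume_ul(final, stock)
                    for name, (final, stock) in COMPONENTS.items()}
    per_reaction["Taq polymerase"] = TAQ_UNITS / TAQ_STOCK_U_PER_UL
    per_reaction["template DNA"] = template_ul
    # Water makes up the remainder of the reaction volume
    per_reaction["water"] = REACTION_UL - sum(per_reaction.values())
    return {name: round(vol * n_reactions, 2) for name, vol in per_reaction.items()}


# Total MgCl2: 1.5 mM supplied by the buffer plus the additional 1.0 mM
total_mgcl2_mm = 1.5 + 1.0

for name, vol in master_mix().items():
    print(f"{name}: {vol} uL")
```

Calling `master_mix(10)` scales the same volumes to ten reactions; in practice a small excess is usually prepared to allow for pipetting loss.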
All samples were sequenced in triplicate. We assem-
bled and edited sequences from each species using
Sequencher software v4.9 (Gene Codes Corporation, Ann
Arbor, MI, USA). We aligned both forward and reverse
sequences from all three sequencing attempts with refer-
ence sequences to check for base calling errors, frame
shifts, insertions and deletions. We trimmed flanking
sequences from the sequence assembly and derived a
consensus of the cyt b cds (~1140 bp). We annotated each
cyt b cds in Sequin software v10.3 (NCBI, http://
www.ncbi.nlm.nih.gov/Sequin/), to check for complete
amino acid translation to the cyt b protein, prior to
submitting sequences to GenBank.
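The translation check can be approximated in a few lines. The sketch below is not the procedure implemented in Sequin; it only scans a candidate cds in frame for stop codons of the vertebrate mitochondrial genetic code (NCBI translation table 2, in which TAA, TAG, AGA and AGG are stops and TGA encodes tryptophan) and reports bases left over after the last complete codon. The function name and the toy sequences are hypothetical.

```python
# Stop codons under the vertebrate mitochondrial genetic code
# (NCBI translation table 2); TGA encodes Trp there, so it is not listed.
VERT_MITO_STOPS = {"TAA", "TAG", "AGA", "AGG"}

# Sequence lengths (bp) reported in Table 2
TABLE2_LENGTHS = (1140, 1143, 1146, 1149, 1200)


def check_cds(seq):
    """Summarise reading-frame features of a candidate coding sequence."""
    seq = seq.upper()
    n_full = len(seq) // 3
    codons = [seq[3 * i:3 * i + 3] for i in range(n_full)]
    trailing = len(seq) % 3
    # If the length is a multiple of 3, the last codon is the expected terminal one
    internal = codons if trailing else codons[:-1]
    return {
        "length": len(seq),
        "trailing_bases": trailing,
        "first_codon": codons[0] if codons else "",
        "internal_stops": [i for i, c in enumerate(internal) if c in VERT_MITO_STOPS],
        "ends_with_stop": trailing == 0 and bool(codons) and codons[-1] in VERT_MITO_STOPS,
    }


intact = "ATGACCTGACTATAA"        # toy cds: TGA is read as Trp, TAA terminates
disrupted = "ATGACCAGACTATAA"     # toy cds with an in-frame AGA stop
shifted = intact[:4] + intact[5:]  # toy cds with a 1-bp deletion

for name, seq in (("intact", intact), ("disrupted", disrupted), ("shifted", shifted)):
    print(name, check_cds(seq))
```

All sequence lengths listed in Table 2 are multiples of three, as expected for complete coding sequences. An insertion or deletion whose length is a multiple of three would pass this check, so comparison with related reference sequences remains necessary.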
To test for sequence homology between species, we
used MegAlign software v9.0.4 (DNASTAR, Inc.,
Madison, WI, USA, https://www.dnastar.com/t-sub-
products-lasergene-megalign.aspx). We performed a
slow-accurate ClustalW alignment of the full-length cyt b
gene. We used 1140 bp from each sequence to perform
the alignment with the following multiple alignment
parameters: gap penalty = 15.00, gap length penalty =
6.66, delay divergent sequences = 30%, DNA transition
weight = 0.50, protein weight matrix = Gonnet series,
DNA weight matrix = ClustalW. We calculated pairwise
percent identity and divergence between each of the
sequences.
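Pairwise percent identity on an existing alignment can be sketched as follows. The exact formula used by MegAlign is not given here, so this sketch assumes one common definition: identical columns divided by compared columns, where columns gapped in both sequences are skipped and a gap opposite a base counts as a difference. The function name and the toy sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent identity between two rows of the same alignment (gaps as '-')."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            matches += 1  # a gap opposite a base never matches
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def identity_matrix(aligned):
    """Pairwise percent identity for a dict of name -> aligned sequence."""
    names = list(aligned)
    return {(p, q): percent_identity(aligned[p], aligned[q])
            for i, p in enumerate(names) for q in names[i + 1:]}


toy_alignment = {  # hypothetical aligned fragments, not real cyt b data
    "seq1": "ATGACCAACA",
    "seq2": "ATGACTAACA",
    "seq3": "ATGAC-AATA",
}
for pair, value in identity_matrix(toy_alignment).items():
    print(pair, round(value, 1))
```

Other definitions, for example dividing by the length of the shorter sequence, give different values for gapped alignments, so the definition should be stated whenever such figures are compared between studies.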
Results
We obtained PCR amplicons from 43 of 44 mammalian
species (98%) and no amplicons from either nonmammalian
species. The sample from Chaetodipus rudinoris failed
to amplify even after multiple PCR attempts. Of the 43,
amplicons from 30 species were sequenced directly
(70%), and amplicons from 13 species needed gel extraction
prior to sequencing (30%, Table 2). On average, in
each direction, forward and reverse, we obtained a
sequence read length of 700–750 bp. Sequencing in both
forward and reverse directions was necessary to obtain
the entire gene sequence putatively spanning ~1140 bp.
We found that the cyt b gene sequences ranged from
1140 to 1200 bp in our data set. Our submission to GenBank
consisted of 40 mammalian species’ consensus cyt b
gene sequences that correctly annotated to the cyt b pro-
tein (Table 2).
Table 1 Sequences from GenBank used in primer design and in silico testing
Order             Species                     GenBank accession  TFS (bp)  NTFS (bp)
Sequences used in alignment for primer design
Carnivora         Acinonyx jubatus            AF344830.1
Cetartiodactyla   Cervus elaphus              AB245427.2
Didelphimorphia   Didelphis virginiana        Z29573.1
Primates          Homo sapiens                J01415.2
Proboscidea       Loxodonta africana          AJ224821.1
Rodentia          Mus musculus                DQ874614.2
Sequences used for in silico PCR simulations
Afrosoricida      Echinops telfairi           AJ400734.2         1433      –
Carnivora         Herpestes javanicus         AY873843.1         1421      –
Cetartiodactyla   Grampus griseus             EU557095.1         1422      93, 877, 1075
Cetartiodactyla   Lama glama                  AP003426.1         1421      –
Cingulata         Dasypus novemcinctus        Y11832.1           1425      363, 633
Chiroptera        Artibeus jamaicensis        AF061340.1         1419      60, 92, 635, 973
Dasyuromorphia    Dasyurus hallucatus         AY795973.1         1430      59, 252, 367, 884, 975
Dermoptera        Cynocephalus variegatus     AJ428849.1         1420      –
Didelphimorphia   Didelphis virginiana        Z29573.1           1436      –
Diprotodontia     Vombatus ursinus            AJ304826.1         1427      –
Erinaceomorpha    Erinaceus europaeus         X88898.2           1421      –
Hyracoidea        Procavia capensis           AB096865.1         1442      1305
Lagomorpha        Ochotona collaris           AF348080.1         1415      654, 716, 873, 1056, 1214
Macroscelidea     Macroscelides proboscideus  AJ421452.1         1417      60
Monotremata       Ornithorhynchus anatinus    X83427.1           1416      370, 1689
Notoryctemorphia  Notoryctes typhlops         AJ639874.1         1429      534, 1098, 1173
Paucituberculata  Rhyncholestes raphanurus    AJ508399.1         1432      97
Peramelemorphia   Isoodon macrourus           AF358864.1         1428      719
Perissodactyla    Equus caballus              X79547.1           1427      703
Pholidota         Manis tetradactyla          AJ421454.1         1419      –
Primates          Eulemur macaco              AB371088.1         1419      1504
Proboscidea       Elephas maximus             DQ316068.1         1419      186, 309, 897
Rodentia          Myoxus glis                 AJ001562.1         1421      248, 633, 871, 971
Scandentia        Tupaia belangeri            AF217811.1         1415      761
Sirenia           Dugong dugon                AJ421723.1         1420      –
Soricomorpha      Crocidura russula           AY769264.1         1422      364, 373, 1465
Tubulidentata     Orycteropus afer            Y18475.1           1426      718, 764
TFS, target fragment size; NTFS, nontarget fragment size, detected by in silico PCR in Amplify v3.1 software.
The intra-species percent identity (homology), as
between Canis lupus baileyi and Canis lupus familiaris, was
99.3%; the interspecific percent identity, as between two
Dipodomys spp., was 85.4%, and between three Lepus spp.
averaged 93.5%. We also generated an overall comparison
chart of pairwise percent identity and sequence divergence
within species and between closely related genera
in our data set (see Table S1, Supporting information).
Table 2 DNA samples used in PCR in vitro testing and respective sequences deposited in GenBank
Order             Species                              SD/GE  GenBank accession  Sequence length (bp)
Carnivora         Panthera onca                        SD     GU175435           1140
Carnivora         Lynx rufus                           SD     GU175436           1140
Carnivora         Procyon lotor                        GE     GU175439           1140
Carnivora         Mephitis mephitis                    SD     GU175440           1140
Carnivora         Puma concolor couguar                SD     GU175442           1140
Carnivora         Canis lupus baileyi                  SD     HM222711           1140
Carnivora         Canis lupus familiaris               SD     JF489119           1140
Carnivora         Urocyon cinereoargenteus             SD     JF489121           1140
Carnivora         Vulpes macrotis                      SD     JF489127           1140
Cetartiodactyla   Phocoena sinus                       GE     HM222714           1140
Cetartiodactyla   Balaena mysticetus                   SD     JF489130           1140
Cetartiodactyla   Antilocapra americana sonoriensis    SD     GU175434           1140
Cetartiodactyla   Ovis canadensis mexicana             SD     HM222706           1140
Cetartiodactyla   Odocoileus hemionus                  SD     HM222707           1140
Cetartiodactyla   Alces alces                          SD     JF489131           1140
Cetartiodactyla   Lama pacos                           SD     JF489132           1140
Cetartiodactyla   Cervus elaphus                       SD     JF489133           1140
Cetartiodactyla   Pecari tajacu                        SD     JF489135           1140
Cetartiodactyla   Madoqua kirkii                       SD     JF489137           1140
Chiroptera        Leptonycteris curasoae               SD     GU175441           1140
Chiroptera        Myotis auriculus                     SD     JF489122           1140
Chiroptera        Euderma maculatum                    SD     JF489125           1140
Chiroptera        Tadarida brasiliensis                SD     JF489129           1140
Didelphimorphia   Didelphis virginiana                 GE     HM222715           1149
Didelphimorphia   Caluromys derbianus                  GE     JF489138           1200
Insectivora       Sorex monticolus                     GE     JF489124           1140
Lagomorpha        Lepus californicus xanti             GE     HM222712           1140
Lagomorpha        Lepus insularis                      GE     HM222713           1140
Lagomorpha        Lepus americanus                     SD     JF489126           1140
Perissodactyla    Equus caballus                       SD     JF489134           1140
Primates          Ateles geoffroyi frontatus           SD     HM222708           1140
Primates          Lemur catta                          SD     JF489136           1140
Rodentia          Dipodomys simulans peninsularis      SD     GU175437           1140
Rodentia          Dipodomys merriami melanurus         SD     GU175438           1140
Rodentia          Tamiasciurus hudsonicus grahamensis  SD     GU175443           1140
Rodentia          Mus musculus                         GE     HM222709           1146
Rodentia          Rattus norvegicus                    GE     HM222710           1143
Rodentia          Peromyscus maniculatus               SD     JF489123           1143
Rodentia          Neotoma albigula                     SD     JF489128           1143
Sirenia           Trichechus manatus                   GE     JF489120           1140
DNA samples amplified by PCR in vitro but not submitted to GenBank
Carnivora         Ursus americanus                     GE
Cetartiodactyla   Phocoena spinipinnis                 GE
Rodentia          Castor canadensis                    GE
DNA samples failed to amplify by PCR in vitro
Rodentia          Chaetodipus rudinoris
SD, sequenced directly; GE, gel extracted.
Discussion
The cyt b gene is a valuable marker for sequence-based
species identification in mammalian species, although
effective species identification depends on the availability
of reference sequence data available in databases such
as GenBank. We developed a single PCR primer pair to
enable sequencing of the complete cyt b gene in a large
number and diverse range of mammalian species. We
developed this primer pair with the aim of expanding the
complete cyt b sequence data on mammalian species
represented in GenBank. Primer sets previously
described for mammalian cyt b gene sequencing (Kocher
et al. 1989; Bartlett & Davidson 1992; Parson et al. 2000;
Hsieh et al. 2001; Verma & Singh 2003) were for partial
fragments of the cyt b gene. The primers described by
Irwin et al. (1991) amplify the entire cyt b but were tested
in a limited set of mammalian species. To our knowledge,
this is the first single primer pair tested for complete cyt b
gene sequencing in a broadly representative set of mam-
malian species.
Two major applications of this primer pair are as fol-
lows: (i) development of mammalian cyt b gene sequence
databases for species identification and (ii) verification of
cyt b sequence data before and after deposition into
online databases. As this is a single primer pair for whole
gene sequencing, PCRs are more time and cost effective
than when using a set or panel of primers. This is desir-
able especially when dealing with a large number of sam-
ples or when setting up reference sequence databases
where multiple sequencing runs may be required.
Although this primer pair may have limited use in
degraded or ancient DNA applications, this primer pair
can be used to obtain sequences from known specimens,
including field-collected samples from well-documented
species.
Because we did not test this primer pair by PCR in
vitro in monotreme or xenarthran species, and because
in marsupials we tested it in only two species (Didelphis
virginiana and Caluromys derbianus), it is possible that
this primer pair will mostly
be useful for studies involving eutherian mammals. Also,
we ran multiple PCR trials on samples from different
individuals of Chaetodipus rudinoris, but the samples
failed to amplify. Based on this information, we accept
that exceptions may exist to the universal nature of this
primer pair, most likely due to mismatches in the primer
sites on template DNA.
Nuclear mitochondrial pseudogenes (numts) could be
easily coamplified by universal primers (Song et al. 2008).
However, because numts are usually not functional, they
can be removed by examination of sequence characteristics
that imply functionality of the gene product, including
indels, in-frame stop codons and nucleotide
composition (Song et al. 2008). We did not submit
sequences from three species, Ursus americanus, Phocoena
spinipinnis and Castor canadensis, because these sequences
contained internal stop codons, indels or frameshift
mutations that we identified upon translation
and/or upon comparison with closely related
complete cyt b reference sequences from GenBank. This
may suggest the presence of numt amplicons, or that the
template DNA used during PCR was of poor quality or
degraded. In support of the argument about the possibil-
ity of pseudogene coamplification, we recognize that
such problems have been encountered in previous stud-
ies (e.g. Song et al. 2008; Moulton et al. 2010) and that the
issue of nonspecificity is an important concern for a pri-
mer pair that has been specifically designed for universal
amplification across a range of taxa.
Considering the examination of amino acid translation
as a quality control measure for protein coding genes, we
submitted only the complete coding sequences that cor-
rectly annotated to the cyt b protein (Table 2). The soft-
ware we used (Amplify v3.1) for in silico testing
predicted several nonspecific fragment amplifications in
the mitochondrial genomes. We also needed to perform
gel extraction of target fragment from 30% of the species
that we tested in vitro. Although increasing the primer
specificity may not eliminate the coamplification of
pseudogenes (Moulton et al. 2010), to reduce nonspecific
amplifications, we suggest the use of more stringent PCR
conditions. Using 1–1.5 mM of MgCl2 instead of 2.5 mM
MgCl2 (total concentration including MgCl2 precontained
in Qiagen PCR buffer) and an annealing time of 30–45 s
instead of 1 min could enhance specificity.
Primers containing degenerate bases are known to be
less likely to anneal preferentially to
numts (Sorenson et al. 1999). Even so, we
tested whether our degenerate
primers putatively match any nuclear DNA
sequences available in GenBank. A
short-input sequence optimized BLAST search on each
primer sequence returned hits only from mitochondrial
DNA sequences (data not shown). However, this result
does not imply that these primers will not anneal to other
sites in the nuclear genome, even if stringent PCR conditions
are used. On the other hand, it is known that genomic
extracts from tissue samples have a much higher copy
number of mitochondrial loci compared with nuclear
loci. Additionally, considering that most eukaryotic numts
are shorter than amplified fragments >700 bp from
mitochondrial DNA (see Pereira & Baker 2004; Richly &
Leister 2004; also see http://www.pseudogene.net/), we
can be confident that the target fragments we amplified
(~1420 bp), sequenced (~750 bp in each direction) and
submitted to GenBank are not numt sequences.
A search of GenBank will show many partial mamma-
lian sequences corresponding to shorter (<500 bp) frag-
ments, partial cds, of the cyt b gene. Effective species
identification will require the availability of reliable refer-
ence data that are complete. There is also a need to
update sequence data on species represented in Gen-
Bank, particularly in the light of data sharing concerns
among the scientific community as expressed by Noor
et al. (2006). We believe that this primer pair and the
design and use of other such primers will enhance
sequence data availability for species identification, mini-
mize errors in DNA sequence submissions and promote
successful data sharing in a more convenient, efficient
and cost-effective manner.
Acknowledgements
We would like to thank the researchers who contributed DNA
samples to us at the Conservation Genetics Laboratory in the
School of Natural Resources and the Environment, University
of Arizona—Alberto Macias-Duarte, Cora Varas-Nelson, Judith
Ramirez, Karla Pelz-Serrano, Sarah Rinkevich, Terry Myers
and Ron Thompson; Carol Chambers, Tad Theimer, Tzeidle
Wasserman and Suzanne Hagell from Northern Arizona
University; and Linda Searles from Southwest Wildlife Conser-
vation Center. We also thank the University of Arizona Genet-
ics Core for assistance with DNA sequencing. We are grateful
to the United States Fish and Wildlife Service–Science Support
Program for funding this work.
References
Anderson S, Bankier AT, Barrell BG et al. (1981) Sequence and organiza-
tion of the human mitochondrial genome. Nature, 290, 457–465.
Bartlett SE, Davidson WS (1992) FINS (forensically informative nucleotide
sequencing): a procedure for identifying the animal origin of biological
specimens. BioTechniques, 12, 408–411.
Bridge PD, Roberts PJ, Spooner BM, Panchal G (2003) On the unreliability
of published DNA sequences. New Phytologist, 160, 43–48.
Hall TA (1999) BioEdit: a user-friendly biological sequence alignment editor
and analysis program for Windows 95/98/NT. Nucleic Acids Symposium
Series, 41, 95–98.
Harris DJ (2003) Can you bank on GenBank? Trends in Ecology and Evolu-
tion, 18, 317–319.
Hebert PDN, Cywinska A, Ball SL, DeWaard JR (2003) Biological identifi-
cations through DNA barcodes. Proceedings of the Royal Society B: Biolog-
ical Sciences, 270, 313–321.
Hsieh HM, Chiang HL, Tsai LC et al. (2001) Cytochrome b gene for spe-
cies identification of the conservation animals. Forensic Science Interna-
tional, 122, 7–18.
Irwin DM, Kocher TD, Wilson AC (1991) Evolution of cytochrome b in
mammals. Journal of Molecular Evolution, 32, 128–144.
Kang S, Mansfield MA, Park B et al. (2010) The promise and pitfalls of
sequence-based identification of plant pathogenic fungi and oomyce-
tes. Phytopathology, 100, 732–737.
Kocher TD, Thomas WK, Meyer A et al. (1989) Dynamics of mitochon-
drial DNA evolution in animals: amplification and sequencing with
conserved primers. Proceedings of the National Academy of Sciences of the
USA, 86, 6196–6200.
Moulton MJ, Song H, Whiting MF (2010) Assessing the effects of primer
specificity on eliminating numt coamplification in DNA barcoding: a
case study from Orthoptera (Arthropoda: Insecta). Molecular Ecology
Resources, 10, 615–662.
Nilsson RH, Ryberg M, Kristiansson E, Abarenkov K, Larsson K-H,
Kõljalg U (2006) Taxonomic reliability of DNA sequences in public
sequence databases: a fungal perspective. PLoS ONE, 1, e59.
Noor MAF, Zimmerman KJ, Teeter KC (2006) Data sharing: how much
doesn’t get submitted to GenBank? PLoS Biology, 4, 1113–1114.
Ogden R, Dawnay N, McEwing R (2009) Wildlife DNA forensics–bridg-
ing the gap between conservation genetics and law enforcement.
Endangered Species Research, 9, 179–195.
Parson W, Pegoraro K, Niederstatter H, Foger M, Steinlechner M (2000)
Species identification by means of the cytochrome b gene. International
Journal of Legal Medicine, 114, 23–28.
Pereira SL, Baker AJ (2004) Low number of mitochondrial pseudogenes in
the chicken (Gallus gallus) nuclear genome: implications for molecular
inference of population history and phylogenetics. BMC Evolutionary
Biology, 4, 17.
Richly E, Leister D (2004) NUMTs in sequenced eukaryotic genomes.
Molecular Biology and Evolution, 21, 1081–1084.
Song H, Buhay JE, Whiting MF, Crandall KA (2008) Many species in one:
DNA barcoding overestimates the number of species when nuclear
mitochondrial pseudogenes are coamplified. Proceedings of the National
Academy of Sciences of the USA, 105, 13486–13491.
Sorenson MD, Ast JC, Dimcheff DE, Yuri T, Mindell DP (1999) Primers
for a PCR-based approach to mitochondrial genome sequencing in
birds and other vertebrates. Molecular Phylogenetics and Evolution, 12,
105–114.
Tobe SS, Kitchener A, Linacre A (2009) Cytochrome b or cytochrome c oxi-
dase subunit I for mammalian species identification—an answer to the
debate. Forensic Science International, 2, 306–307.
Verma SK, Singh L (2003) Novel universal primers establish identity of an
enormous number of animal species for forensic application. Molecular
Ecology Notes, 3, 28–31.
Data Accessibility
DNA sequences: GenBank accessions GU175434–
GU175443, HM222706–HM222715, and JF489119–
JF489138.
Supporting Information
Additional supporting information may be found in the
online version of this article.
Table S1 Pairwise percent identity (above diagonal) and
sequence divergence (below diagonal) between all 40 cyt
b gene sequences submitted to GenBank.
Please note: Wiley-Blackwell are not responsible for the
content or functionality of any supporting information
supplied by the authors. Any queries (other than missing
material) should be directed to the corresponding author
for the article.