DEGREE PROJECT IN BIOTECHNOLOGY, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2016
Development of a massive parallel sequencing method for population genetics, for the sequencing of 1,000 dog mitochondrial genomes per Miseq run, based on nested and multiplexed PCR amplification and PCR-incorporated dual-index identification barcodes
LINNEA GULDBRAND
KTH SCHOOL OF BIOTECHNOLOGY
www.kth.se
Master Thesis at the School of Biotechnology, KTH Royal Institute of Technology,
Department of Gene Technology, SciLifeLabs
Supervisor and Examiner: Peter Savolainen
Abstract
The geographical origin of the domestic dog has not yet been conclusively established. The
mitochondrion, being matrilineally inherited and prone to a greater rate of mutation compared
to nuclear DNA, is of great significance in genetic evolutionary studies and as such, complete
sequencing of the mitochondrial genome of a great number of individuals would provide
important data for the furthering of such studies.
This project aims to design a method of sequencing the entire mitochondrial genome of the
domestic dog for a large number of individuals in parallel on the Illumina MiSeq sequencing
platform, using several sets of PCR primers to generate barcoded and sequencing-ready
libraries of predetermined fragments for each individual.
PCR primers were constructed both for initial long-range products and for shorter fragments,
suitable for sequencing and containing partial sequencing adaptors, covering the entire
mitochondrial chromosome. Additionally, primers containing barcode indices and the final
required sequencing constructs were designed. The viability of the primers and of different
PCR parameters was investigated and verified on agarose gels and on a Bioanalyzer, and a set of
samples was taken through cleaning, barcoding, and sequencing.
Results indicate a promising method, in which all primers successfully generate product and
both cleaning and sequencing appear essentially successful. However, the relative amounts of
product obtained from each primer, and consequently the number of reads obtained in
sequencing, vary significantly with the initial set-up. Subsequent experiments, performed
after the closing of the practical part of the project, have shown that compensating for this
uneven amplification by using significantly unequal primer concentrations greatly
alleviates these issues.
Abstract
Introduction
  Previous Findings on the Geographical Origins of the Domestic Dog
  The Mitochondrion, in Biology and in Forensics
  The Illumina, Dual Indexed, Paired End Sequencing Method
  Existing Methods for Whole mtDNA Sequencing
Aim of Project
Materials and Methods
  Primer Layout and Design
  PCR Reactions
Results
  Primer Layout and Design
  PCR Reactions, Viability of Primers and Multiplex Set Ups
  Barcoding PCR
  Concentration Measurements and Cleaning
  Initial Sequencing
  Results Obtained Post-Project
Discussion
  Primer Layout and Design
  PCR Procedures
  Indexing, Cleaning, and Sequencing
Conclusions
References
Introduction
Previous Findings on the Geographical Origins of the Domestic Dog
That the domestic dog has its evolutionary origin in the wolf has long been known and
accepted as fact, based on both genetic evidence and on archaeological findings, as well as on
behavioural and physical traits [1]. However, the precise circumstances, the historical time
point, and the geographical location of the original domestication event, or events, are less
clear. Several evolutionary genetic studies have been performed, using various methods and
sample materials, with varying results. Simply put, these studies attempt to identify the most
likely common ancestor of the domestic dog as the one whose genetic material can account
for the diversity of all others, while also comparing them to wild wolf populations and
estimating the timeframe for the domestication event via the rate at which mutations are
believed to accumulate. Variously, such studies have indicated the geographical origin of the
domestic dog in places as disparate as Europe, South East Asia, and the Middle East.
A 2002 study [1] of a stretch of 582 base pairs from the so-called control region of the
mitochondrial DNA from 654 dogs, representing dog populations worldwide, indicated an
East Asian origin for the domestic dog, based on a comparatively higher degree of
phylogenetic variation in dogs from this area. These findings were corroborated by the
analysis of 14 437 base pairs from the Y chromosome (fragmented sequences from
incomplete sequencing of the DNA of a male dog, assigned to the Y chromosome through
comparison to female dog DNA as well as to human Y chromosome sequences [21]) from
151 dogs worldwide [2] as well as by a further study of the complete mitochondrial DNA
from 169 individuals, combined with the 582 control region base pairs from 1576 individuals,
both placing the geographical origin of the domestic dog in South Eastern Asia, south of the
Yangtze River, China, less than 16 300 years ago [3]. Both studies also show that this region
of South East China, south of the Yangtze River, is the region in which the genetic diversity is
the greatest, and is the only region where almost all haplotypes of both the mtDNA and the Y
chromosome can be found simultaneously. Additionally, analysis of the mtDNA of Native
American dog breeds, when compared to East Asian and European dogs as well as
Pre-Columbian samples, shows low levels of European mtDNA [4], indicating a more ancient,
Asian origin of the Native American dog breeds. Similarly, mtDNA analysis of the
Australian dingo and Polynesian domestic dogs indicates an origin in mainland South East
Asia [5].
On the other hand, a study of the mitochondrial DNA from 18 ancient canids [6] indicated a
closer relationship with either ancient or modern European canids for all modern dogs the
world over. The study did, admittedly, lack ancient canid samples from both the Middle East
and China, two other major candidates for the geographical origin of the domestic dog.
Furthermore, genome-wide SNP (Single Nucleotide Polymorphism) analysis of over 48,000
SNPs in 912 dogs as well as 225 grey wolves has indicated a Middle Eastern origin for the
domestic dog, based on the significantly larger genetic variation found in breeds from this
region [7].
The Mitochondrion, in Biology and in Forensics
The mitochondrion is an organelle present in eukaryotic cells, whose role is to perform
oxidative metabolism, providing energy for the cell. The origin of the mitochondrion is
assumed to be the engulfment of a purple bacterium by an ancient eukaryotic ancestor,
resulting in an endosymbiotic relationship from which both the eukaryotic host cell and the
bacterial symbiont benefit. Certain genetic material has since migrated from the original
bacterium into the nuclear DNA of the host cell, to the degree that the modern mitochondrion
can no longer survive independently, although it does retain certain vital genes in its own
mitochondrial DNA, the mtDNA. Mitochondria also still reproduce through division, like
bacteria, rather than through being disassembled and reassembled, as is the case with all other
organelles apart from the chloroplasts of plants, which have origins similar to those of the
mitochondrion. [8]
Mitochondrial DNA usually consists of a single double-stranded, circular chromosome
present in multiple copies, although single-stranded and linear chromosomes also exist. The
mitochondrial DNA encodes components necessary for protein production and certain
enzymes required for aerobic metabolism, but many components of the mitochondrion are
encoded by nuclear DNA and are transported into the mitochondrion. Multiple
mitochondria are present in any given cell, and their genetic make-up is not necessarily
homogeneous. [8]
Mitochondria are inherited maternally. In meiosis in females, mitochondria are evenly
segregated between the two new cells and in the resulting embryo, the mitochondria present
in the fertilised ovum will divide and produce all the mitochondria in the new organism. Due
to this manner of inheritance, mitochondrial DNA can be used in forensics, to trace familial
relations on the maternal side (e.g. mother and child, or siblings who share a mother,
but not paternity), to rule out suspects based on crime scene DNA, and, on a larger time scale,
to trace the evolution of a species. Mitochondrial DNA is more suited to such analyses than
nuclear DNA for two main reasons. Firstly, while each cell only contains one complete set of
nuclear DNA, mitochondrial DNA is present in multiple copies per cell, making it significantly
more abundant than nuclear DNA and somewhat circumventing the common issue of limited
sample material. Secondly, specifically in mammals, mutations accumulate at a much higher
rate in mitochondrial DNA than in nuclear DNA, on average 10⁻⁸ mutations per nucleotide per
year, meaning that evolutionary differences can be visible on a comparatively shorter timescale.
[8]
The Illumina, Dual Indexed, Paired End Sequencing Method
The Illumina sequencing method, also known as Solexa sequencing, is a
Sequencing-By-Synthesis (SBS) method that utilises fluorescently labelled nucleotides to
track base incorporation. In its
basic iteration, DNA is sheared into randomly sized fragments, to the ends of which a forward
and a reverse adaptor sequence are ligated. These adaptor-ligated fragments are then attached
as single strands, at random positions, to the surface of the flow cell, where each individual
fragment is amplified into a cluster of multiple copies of the same fragment, using so-called
bridge amplification. This means that the non-attached adaptor sequence of any given fragment
anneals to its complementary adaptor on the surface of the flow cell, which then acts as a
primer, allowing amplification into an arch-shaped double-stranded structure in which each of
the two strands is attached to the flow cell at one end. A denaturation step separates the two
strands of the ‘bridge’, and the process is repeated until a sufficiently dense cluster of
single-stranded DNA fragments has been formed.
Figure 1: Basic overview of Illumina sequencing, using random fragmentation and adaptor ligation [9]
After the clusters have been formed, the sequencing commences. Bases are determined
through the use of fluorescently labelled nucleotides, each of the four bases fluorescing at a
different wavelength. The labelled nucleotides are also reversible terminators, meaning that
the fluorescent label blocks more than one nucleotide from being incorporated at a time, but
that after the base has been determined, this label is enzymatically cleaved off, in preparation
for the next cycle, allowing the next nucleotide to be incorporated. For the first cycle of the
sequencing, all four fluorescently labelled nucleotides are added to the flow cell at once,
together with primers specific to the adaptor sequences and DNA polymerase, and one
nucleotide is incorporated in the first position of each strand, in each cluster. A laser is then
used to excite the fluorescent label on the nucleotide, and its identity is recorded for each
cluster. The label is then cleaved off, and all remaining reagents are washed away. For all
subsequent cycles, all four labelled nucleotides and DNA polymerase are added, one base is
incorporated in each strand in each cluster, laser excitation and image recording are performed,
and the label is cleaved off. Remaining reagents are then washed away in preparation for the next
cycle. [9] [10]
Figure 2: Sequencing-by-Synthesis using Illumina sequencing, incorporating one base at a time and detecting each base by its fluorescent label [9]
This sequencing method provides sequencing data from all fragments applied to the flow cell,
but has two downsides: it cannot differentiate between the origins of the different
fragments and, depending on the size of the fragments, it may not obtain full
sequences, due to limitations imposed by the inherent read length of the sequencing method.
The former limitation can be addressed by employing the Illumina Single- or Dual-Indexed
Sequencing method, both based on the Paired End Sequencing method. The latter can be
addressed by ensuring that all fragments used in the sequencing are below the maximum read
length for the particular sequencing method.
The Dual-Indexed Paired End Sequencing method employs several modifications to the
original adaptor construct and the sequencing procedure in order to distinguish between the
origins of different sequenced fragments. Instead of a simple adaptor on each end of the
fragment to be sequenced, the adaptor sequences are composed of several different
components each. These complex adaptors are shown in Figure 3, in a step-by-step depiction
of the sequencing process. Upstream of the DNA insert to be sequenced, the components of
the construct are the P5 adaptor, one of the slide-attaching sequences, which is also
complementary to the i5 Index Sequencing Primer, followed by the i5 Index, and the
sequence complementary to the Read 1 Primer, which initiates the sequencing from one end
of the DNA Insert. Downstream, the DNA insert is followed by a stretch of bases that is
complementary to both the i7 Index Sequencing Primer and to the Read 2 Primer, the i7
Index, and finally the P7 adaptor sequence that also attaches to the surface of the flow cell.
Each of the two index sequences is composed of 8 bases.
The sequencing involves four different primers, sequencing the DNA insert from both ends as
well as the two indices. First, the Read 1 Primer is annealed and the DNA insert is sequenced
from the P5 end of the construct. The Read 1 product is then removed. Secondly, the i7 Index
Sequencing Primer is used to sequence the i7 Index, after which the index product is
removed. Then, the P5 adaptor is annealed to its corresponding adaptor, grafted to the surface
of the flow cell, which is used as the primer for sequencing the i5 Index. The i5 Index product
is removed, the full complementary strand is generated, and the original strand is removed.
Lastly, the Read 2 Primer is used to sequence the DNA insert from the P7 end. [11]
Figure 3: Schematic overview of the Dual-Indexed Paired End sequencing method, showing the order and the orientation of the primers involved [11]
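The role of the two 8-base indices in separating pooled samples can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the index sequences, the sample labels, and the function names `demultiplex` and `max_samples` are hypothetical and do not come from the Illumina documentation or from this project.

```python
from collections import defaultdict

# Hypothetical 8-base index sequences and labels, for illustration only.
I7_INDEXES = {"AAAACCCC": "i7_01", "CCCCAAAA": "i7_02"}
I5_INDEXES = {"GGGGTTTT": "i5_01", "TTTTGGGG": "i5_02"}


def demultiplex(reads):
    """Group insert reads by their (i7, i5) index pair.

    `reads` is an iterable of (i7_sequence, i5_sequence, insert_sequence).
    Reads whose indices are not recognised are collected under "undetermined".
    """
    samples = defaultdict(list)
    for i7, i5, insert in reads:
        if i7 in I7_INDEXES and i5 in I5_INDEXES:
            samples[(I7_INDEXES[i7], I5_INDEXES[i5])].append(insert)
        else:
            samples["undetermined"].append(insert)
    return dict(samples)


def max_samples(n_i7, n_i5):
    """Number of distinguishable samples with dual indexing."""
    return n_i7 * n_i5


reads = [
    ("AAAACCCC", "GGGGTTTT", "ACGT"),
    ("AAAACCCC", "TTTTGGGG", "TTGA"),
    ("NNNNNNNN", "GGGGTTTT", "CCCC"),
]
by_sample = demultiplex(reads)
```

Because every i7 index can be combined with every i5 index, the number of distinguishable samples is the product of the two set sizes. For instance, `max_samples(24, 48)` gives 1152, although this particular split is only an illustrative factorisation and is not taken from the text.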
Existing Methods for Whole mtDNA Sequencing
There are existing methods for sequencing the entire mitochondrial DNA of several
individuals in parallel, using different set ups. One such is the PTS (Parallel Tagged
Sequencing) method on the 454 sequencing platform [23, 24], using single-stranded, self-
hybridising barcodes to tag samples prior to pooling and sequencing. Samples are barcoded
separately and then pooled and prepared for sequencing. The barcodes are 6 bp long and
allow for 72 samples to be sequenced in parallel. Another method is the PCR-product capture
method [25], using fragments from a reference individual, fixed to beads, in order to retrieve
and enrich mtDNA fragments from complex DNA mixtures. Long range PCR is used to
produce two PCR fragments that cover the entire mtDNA, and these are then sonicated into
15-800 bp fragments, which are biotinylated and immobilized on streptavidin-coated beads.
The beads are then used to extract mtDNA fragments from sheared DNA mixtures, by
hybridisation, and the fragments can then be eluted, amplified, and sequenced, after separately
barcoding each library and preparing them for sequencing.
Aim of Project
The aim of this project was to design and implement a method for the sequencing of the
canine mitochondrial genome, for the purpose of producing data for phylogeographical
analysis of the geographical origin of the domestic dog, for which large numbers of samples
are necessary. The strategy employed was the introduction of barcodes during preparatory
PCR in order to enable multiplexed sequencing, on the Illumina MiSeq, of 1152 individuals in
parallel. Ultimately, the samples intended for use are saliva samples stored on FTA cards
(Whatman).
In contrast to existing methods, the focus of this project lies on a high degree of
parallelisation, which requires reducing the workload and streamlining the procedures,
and on the specificity of the amplified fragments, in size and location, to guarantee
coverage of the entirety of the mtDNA in fragments that can be fully sequenced by the
Illumina MiSeq sequencing platform. The PTS method, being on the 454 sequencing platform
and only providing a 72-plex, is therefore not suitable. Neither is the PCR-product capture
method, both because the intended sample material for the project is immobilised
on FTA cards, and because it requires one library per individual to be prepared separately
all the way to sequencing, which is both labour- and cost-intensive.
In order to enable these high degrees of parallelisation, it is important that the read numbers
obtained from each fragment are as even as possible. This is to ensure that all fragments are
sequenced with a sufficiently high redundancy to provide reliable output data.
Materials and Methods
Primer Layout and Design
Primers were designed using the NCBI Primer BLAST tool [15], which can be used to
generate primers according to a set of user-specified parameters, using the canine
mtDNA reference genome [13] as the template.
The goal was to generate primers that would allow for the sequencing of the entire canine
mitochondrial genome in fragments of a size that would be fully covered by the MiSeq
sequencing platform. The highest number of base pairs the MiSeq can cover is 600 bp, which
influences the number of primer pairs that are needed. These primers would, apart from the
sequence-specific component, contain parts of the adaptor constructs necessary for MiSeq
sequencing, to which barcoding primers, containing the rest of the necessary adaptors, can
later anneal.
In addition to these primers, a set of primers to be used for initial amplification of longer
fragments was desired. The purpose of these long-range primers is to limit the use of the
original samples, to avoid depleting them, as well as to create a type of nested PCR [26] for the
sequencing-specific fragments, reducing the likelihood of unspecific products being generated
by limiting the available unrelated template.
PCR Reactions
PCR reactions were carried out using either TagTaq, obtained from the Alba Nova University
Center [22], or PlatinumTaq, produced by Invitrogen, both of which are DNA polymerases.
TagTaq was used for the inner primers, due to its
availability and lower cost, as its lower processivity was deemed sufficient for the shorter
inner fragments, while PlatinumTaq was required to fully amplify the longer outer fragments.
Originally, TagTaq was intended for both the inner and the outer primers, but attempts
to amplify the outer fragments using TagTaq, in multiple reaction set-ups, failed to
yield product, possibly because the outer fragments were too long for the TagTaq enzyme to
amplify successfully. PlatinumTaq was therefore employed instead.
The TagTaq-based reaction mixture consisted of 2.5 µl “P” (10x polymerase buffer, final
concentrations 50 mM KCl, 2 mM MgCl2, 10 mM TrisHCl pH 8.5, 0.1% v/v Tween), 2.5 µl
“C” (10x dNTP mix, containing 2 mM of each dNTP in water, final concentration 0.2 mM), 1
µl Forward primer (0.2 µM final concentration), 1 µl Reverse primer (0.2 µM final
concentration), 1 µl template, and 17 µl nuclease-free H2O, to a final volume of 25 µl per
reaction. Initially, 0.1 µl TagTaq was used per reaction, according to suggestions from the
providers of the enzyme, but this was later increased to 0.2 µl per reaction due to the low
yield.
For PlatinumTaq-based reaction mixtures, used for amplifying the outer fragments, volumes
were adapted from the information sheet provided by Invitrogen [16] and consisted of 5 µl 10x
PCR Buffer without MgCl2, 5 µl dNTP mixture (2 mM of each dNTP), 1.5 µl MgCl2 (50
mM), 2 µl Forward primer (0.2 µM final concentration), 2 µl Reverse primer (0.2 µM final
concentration), 1 µl template, and 33.5 µl nuclease-free H2O, to a final volume of 50 µl per
reaction. A volume of 0.2 µl PlatinumTaq, 5U/µl, per reaction was used throughout the
experiments.
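The volumes and concentrations listed for the two reaction mixtures can be cross-checked with the dilution relation C1·V1 = C2·V2. The following is a minimal sketch that uses only the figures stated above; the dictionary keys and the helper name `final_concentration` are illustrative, and the enzyme volume is left out of the totals, as it is in the stated final volumes.

```python
def final_concentration(stock, volume_ul, total_ul):
    """Solve C1 * V1 = C2 * V2 for C2 (returned in the same unit as `stock`)."""
    return stock * volume_ul / total_ul


# Volumes in µl as listed in the text; the polymerase volume is not included.
tagtaq_mix = {
    "buffer P": 2.5, "dNTP mix C": 2.5, "forward primer": 1.0,
    "reverse primer": 1.0, "template": 1.0, "water": 17.0,
}
platinum_mix = {
    "10x buffer": 5.0, "dNTP mix": 5.0, "MgCl2": 1.5,
    "forward primer": 2.0, "reverse primer": 2.0,
    "template": 1.0, "water": 33.5,
}

tagtaq_total = sum(tagtaq_mix.values())      # stated final volume: 25 µl
platinum_total = sum(platinum_mix.values())  # stated final volume: 50 µl

# Final concentration of each dNTP (mM), from a 2 mM stock.
dntp_tagtaq = final_concentration(2.0, tagtaq_mix["dNTP mix C"], tagtaq_total)
dntp_platinum = final_concentration(2.0, platinum_mix["dNTP mix"], platinum_total)

# Final MgCl2 concentration (mM) in the PlatinumTaq mix, from a 50 mM stock.
mgcl2_platinum = final_concentration(50.0, platinum_mix["MgCl2"], platinum_total)

# Units of PlatinumTaq per reaction: 0.2 µl of a 5 U/µl stock.
platinum_units = 0.2 * 5
```

Under these figures, both mixtures end up at 0.2 mM of each dNTP, the PlatinumTaq mixture contains 1.5 mM MgCl2, and each PlatinumTaq reaction receives about 1 U of enzyme.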
For both TagTaq-based and PlatinumTaq-based reactions, in the case of multiplexing,
initially, equal amounts of each of the necessary forward and reverse primers were added, and
the volume of H2O was lowered accordingly. In later experiments, in attempts to obtain
comparable levels of each product in these multiplexes, the concentrations of the primers
included in each multiplex were varied, increasing the concentration of those primers that
failed to yield product in relation to those that did.
The PCR reactions were tried out with several different annealing temperatures, extension
times, and numbers of cycles. The initial set up for the inner primers was 1.5 minutes of initial
denaturation at 94°C, followed by 30 cycles of 30 seconds of annealing at 46°C and 2 minutes
of extension at 72°C, a final extension for 10 minutes at 72°C and ending in a Hold at 4°C.
The number of cycles was later increased to 40, and both 49°C and 52°C as annealing
temperatures were evaluated.
The PCR reaction parameters for the outer primers were initially the same as for the inner
primers, with the exception of the annealing temperature being set to 50°C. This was later
adjusted to evaluate both 5 and 10 minutes of extension time, as well as different numbers of
cycles.
The annealing temperatures were chosen by manually calculating the optimal annealing
temperature for each primer, using only the sequence-specific part of the inner primers and
the entirety of the outer primers, adding 2°C for an adenine or a thymine and 4°C for a
guanine or a cytosine, together with estimations of melting temperatures from the primer
generating tool, and choosing a temperature that was believed to be sufficiently low to allow
all primers to anneal successfully.
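The manual calculation described above, 2°C per adenine or thymine and 4°C per guanine or cytosine, can be expressed as a short function. This is a sketch of that rule of thumb only; the function name `rule_of_thumb_tm` and the example sequence are hypothetical and are not primers from this project.

```python
def rule_of_thumb_tm(primer):
    """Estimate the melting temperature in °C as 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


# Hypothetical 20-mer with 10 A/T and 10 G/C bases.
example_primer = "AGCTTGACCATGGTACCTGA"
example_tm = rule_of_thumb_tm(example_primer)  # 2*10 + 4*10 = 60
```

For the inner primers, only the sequence-specific part would be passed to such a function, since the adaptor tail does not anneal to the template in the first cycles.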
The success of PCR reactions was evaluated by running aliquots of the reaction mixture on
1% agarose gels, pre-stained with GelRed (Biotium). In the case of multiplex reactions,
singleplex reaction mixtures for the primers participating in the multiplex were prepared and
dilutions of the multiplex reaction mixtures were used as template for the singleplexes. The
products of these singleplexes were then checked on gels, on the assumption that a singleplex
would be successful if and only if the multiplex had been successful with regard to that
specific primer.
Results
Primer Layout and Design
The primers required for the project were subject to a number of criteria set by the intended
sequencing platform, the parameters of adjacent primers, as well as the nature of the mtDNA
itself.
The external criterion set by Illumina Paired End sequencing on the MiSeq is stated as a
maximum of 550 bases per primer-amplified segment, including the primer sequences, for
sufficient coverage of the entire segment. This includes an overlap of 50 bp at the centre for
better coverage of the ends of the reads. This is due to the fact that, as an ensemble
sequencing-by-synthesis (SBS) method, the read length when sequencing on the MiSeq is
limited by the reliability of the synchronous incorporation of the correct base to each strand in
the cluster. In each step of the sequencing, the correct base has to be incorporated exactly
once and be measured accurately, followed by the removal of the extension-blocking agent,
allowing the next base to be incorporated and measured. As the sequencing proceeds, errors
are eventually introduced, wherein bases fail to be properly incorporated in certain strands,
leading to portions of the cluster lagging behind the others, giving 'false' signals. As these
errors accumulate, the signal-to-noise ratio will decrease, ultimately to the point where bases
can no longer be accurately detected. The number of bases into the sequencing where this
threshold is reached dictates the read length of the method in question. [12]
It is, however, possible to use segments of sizes approaching 600 bp by utilising the Illumina
stitching algorithm, which merges two reads that overlap by at least 10 bp into a single read
using consensus and quality data from the two reads, allowing the use of larger inserts. The upper limit for the
size of a DNA insert, including primer sequences, was thus set to 590 bp. Subtracting the
length of the primer sequences from the inserts leaves approximately 550 bp sequenced in
each insert, as primers are ideally around 20 bp long. With this average fragment length, it
was estimated that 32 fragments would be needed in order to fully cover the 16727 bp
reference genome, with a reasonable margin for overlaps and difficult-to-align stretches of
DNA. [13]
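The relationship between fragment size and read overlap can be made explicit with a small calculation. The sketch below assumes two reads of 300 bases each, which is inferred from the stated 600 bp maximum rather than given directly in the text; the function names are illustrative.

```python
def read_overlap(fragment_len, read_len=300):
    """Bases covered by both reads of a pair; a negative value means a gap."""
    return 2 * read_len - fragment_len


def sequenced_without_primers(fragment_len, primer_len=20):
    """Template bases left in a fragment once both primer sites are subtracted."""
    return fragment_len - 2 * primer_len


overlap_at_550 = read_overlap(550)               # 50 bp, the stated standard overlap
overlap_at_590 = read_overlap(590)               # 10 bp, the stated stitching minimum
usable_at_590 = sequenced_without_primers(590)   # 550 bp
```

With these assumptions, the 590 bp upper limit is exactly the fragment size at which the overlap drops to the 10 bp needed for stitching.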
Sequencing 32 individual 550 bp long sequences would yield a total of 17600 bp, leaving a
margin of 873 bases when compared to the 16727 bp of the reference genome. Spread out
over 32 fragments, this allows a variation of 27 bp per fragment, providing a certain degree
of freedom when aligning the primers. Finally, in order to fully cover the mitochondrial
genome, the fragments cannot average lower than 523 bp (563 bp with the primers included).
A set of 32 primer pairs is also desirable from a practical design point of view, as sets of 32
fit evenly on 96-well plates and are a multiple of eight, corresponding to the width of
common laboratory equipment.
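The coverage arithmetic in this paragraph can be reproduced directly. The sketch uses only the figures stated in the text (a 16727 bp reference genome, 32 fragments, 550 sequenced bases per fragment, primers of about 20 bp); the variable names are illustrative.

```python
import math

GENOME_LEN = 16727        # bp, canine mtDNA reference genome
N_FRAGMENTS = 32
SEQ_PER_FRAGMENT = 550    # bp sequenced per fragment, primers excluded
PRIMER_LEN = 20           # approximate primer length in bases

total_sequenced = N_FRAGMENTS * SEQ_PER_FRAGMENT        # 17600
margin = total_sequenced - GENOME_LEN                   # 873
slack_per_fragment = margin // N_FRAGMENTS              # 27
min_average = math.ceil(GENOME_LEN / N_FRAGMENTS)       # 523
min_average_with_primers = min_average + 2 * PRIMER_LEN # 563

# Practical layout checks: 32 divides a 96-well plate and is a multiple of 8.
fits_plate = 96 % N_FRAGMENTS == 0
multiple_of_eight = N_FRAGMENTS % 8 == 0
```

The subtraction gives a margin of 873 bases, which is consistent with the stated 27 bp of slack per fragment.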
These 32 primer pairs must then be laid out in an interconnecting fashion, where each forward
primer must be placed slightly upstream of the reverse primer of the previous pair, relative to
the leading strand, so that every base is sequenced independently.
Figure 4: Schematic representation of the overlapping orientation of primers, highlighting how all parts of the template are covered by amplified fragments in an interlocking fashion. Template DNA represented by the wide yellow line, the primers by red and orange arrows (alternating colours purely for visual clarity) and the amplified fragments in corresponding colours below the template.
Another limiting factor for the placement of the primers is the repeating region of the mtDNA,
inside which the primers cannot reliably be placed. This is due to the fact that the repeating
region, as indicated by its name, consists of multiple repetitions of the same DNA motif,
meaning that a primer that is complementary to a site in this region is also complementary to a
large number of sites, upstream and downstream of the intended annealing site, at every place
where this motif repeats itself. In the domestic canine mtDNA, this repeating region alternates
between two almost identical 10 bp segments, only differing in one position. This region
covers bases 16131 through 16430 of the reference genome, but can vary greatly in size
between individuals due to differing numbers of repeats of the two 10 bp motifs. [14]
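Why a primer cannot be placed reliably inside such a region can be illustrated with a toy model. The two 10 bp motifs and the flanking sequence below are hypothetical stand-ins, chosen only so that the motifs differ in a single position; they are not the actual canine mtDNA repeat motifs. The region length of 300 bp matches bases 16131 through 16430 of the reference genome.

```python
# Hypothetical motifs that differ only at the ninth base.
MOTIF_A = "GTACACGTGC"
MOTIF_B = "GTACACGTAC"

# Hypothetical unique 20-base flank placed upstream of the repeat region.
FLANK = "CCATTGGAATCCGATTAGCA"

# 15 alternations of the two motifs give a 300 bp repeat region.
repeat_region = (MOTIF_A + MOTIF_B) * 15
template = FLANK + repeat_region


def annealing_sites(template, site):
    """Return every start position at which `site` matches the template."""
    positions = []
    start = template.find(site)
    while start != -1:
        positions.append(start)
        start = template.find(site, start + 1)
    return positions


unique_sites = annealing_sites(template, FLANK)              # one site
repeat_sites = annealing_sites(template, MOTIF_A + MOTIF_B)  # many sites
```

In this toy template, a 20-base site in the flank matches once, whereas a 20-base site inside the repeat matches at 15 positions, one per alternation of the two motifs.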
The option of not including the repeating region was considered, as the size differences may
mean that longer repeating regions would not be completely sequenced by the Illumina Paired
End sequencing method, and shorter ones would be sequenced to redundancy, but possibly
without the means to tell to what extent, represented in figure 5.
Figure 5: Schematic representation of the different possible results when sequencing the repeating region. Due to its varying size between individuals, coverage will vary, and due to its repeating nature, conclusive alignments cannot be guaranteed.
It was decided to attempt to sequence the repeat region to the highest extent possible, as full
coverage of the rest of the mtDNA appeared to be achievable with the remaining 31 primer
pairs, meaning that no information would be lost from trying to sequence the repeat region as
well. Including the repeat region as an amplified segment would also ensure that the bases
immediately preceding and following it would actually be included in the sequencing,
something that could otherwise not be achieved, as primers cannot reliably be aligned inside
the repeat region.
With this in mind, the primers were aligned, starting from the primer pair upstream of the
repeating region, placing the reverse primer as close to the repeating region as possible,
followed by the pair covering the repeating region, ensuring enough room after the repeating
region to align the forward primer of the next primer pair.
The primers were designed using the NCBI Primer BLAST tool [15]. The Canis familiaris
reference mitochondrion genome entry [13] was used as the template to which the primers
were to be aligned. The PCR product size was set to a maximum of 590 bases and a minimum
of 540 bases, to ensure coverage of the whole mtDNA sequence. Remaining parameters were
subject to dynamic modifications depending on the ease or, rather, difficulty with which
primers could be aligned. TM was desired to be between 52 and 60 degrees Celsius, with an
optimal temperature of 56 degrees. The allowed difference in TM between the primers in a
pair was initially set at 3 degrees, but was subject to increases in cases where primers could
otherwise not be aligned.
The initial advanced settings were for a primer size between 17 and 23 bases with 20 as an
optimum, a GC-clamp of 2, maximum poly-X sequences of 4, and maximum 3’ GC content
of 3. GC content was desired to be between 40 and 60% and due to the nature of the
mitochondrial DNA, ‘Avoid low complexity regions for primer selection’ was unchecked.
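The sequence-level constraints listed above can be expressed as a small check. The following is a minimal Python sketch, not the Primer-BLAST implementation: the function name `primer_issues` is hypothetical, the GC clamp is read as the number of consecutive G/C bases required at the 3' end, the 3' GC limit is applied to the last five bases, and melting temperature is left out because its value depends on the thermodynamic model used.

```python
from itertools import groupby


def primer_issues(seq, min_len=17, max_len=23, gc_min=40.0, gc_max=60.0,
                  gc_clamp=2, max_poly_x=4, max_end_gc=3):
    """Return the names of the initial design constraints that a primer violates.

    Assumes a non-empty sequence written 5' to 3'. Illustration only.
    """
    seq = seq.upper()
    issues = []
    if not min_len <= len(seq) <= max_len:
        issues.append("length")
    gc_percent = 100.0 * sum(base in "GC" for base in seq) / len(seq)
    if not gc_min <= gc_percent <= gc_max:
        issues.append("gc_content")
    # GC clamp: the last `gc_clamp` bases at the 3' end must all be G or C.
    if gc_clamp and any(base not in "GC" for base in seq[-gc_clamp:]):
        issues.append("gc_clamp")
    # Poly-X: longest run of a single repeated base.
    longest_run = max(len(list(group)) for _, group in groupby(seq))
    if longest_run > max_poly_x:
        issues.append("poly_x")
    # 3' GC content: number of G/C among the last five bases.
    if sum(base in "GC" for base in seq[-5:]) > max_end_gc:
        issues.append("end_gc")
    return issues
```

An empty list means that the primer satisfies all of the initial settings; a non-empty list plays the same role as the failure reasons reported by the design tool, indicating which parameter might be relaxed.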
Primers were then generated by specifying a stretch of approximately 50 bases within which
the forward primer was allowed to align. The starting point of the first stretch was dictated by
the end of the repeating region, while all subsequent alignment areas were instead dictated by
the location of the reverse primer in the previous primer pair, i.e. in relation to the leading
strand, each forward primer had to end before the reverse primer of the preceding pair
‘started’.
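As an illustration of how each forward-primer search window follows from the preceding pair, the sketch below computes such a window on the leading strand. Coordinates are taken as 1-based and inclusive, and the function name and the example position are hypothetical; only the window length of approximately 50 bases comes from the text.

```python
def forward_primer_window(prev_reverse_start, window=50):
    """Return (start, end) of the stretch in which the next forward primer may lie.

    prev_reverse_start: leading-strand position at which the reverse primer of
    the preceding pair begins. The window ends one base before that position,
    so the forward primer ends before the preceding reverse primer starts.
    """
    end = prev_reverse_start - 1
    start = end - window + 1
    return start, end


# Hypothetical example: the preceding reverse primer starts at position 1000.
print(forward_primer_window(1000))              # (950, 999)
# Widening the stretch when no suitable primer is found in the first window.
print(forward_primer_window(1000, window=80))   # (920, 999)
```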
Due to the structure of the mtDNA and the rigidity of where the next primers had to be
aligned, in relation to the preceding pairs, it was often difficult to align primers according to
the above-mentioned ‘optimal’ parameters, which necessitated that the conditions were made
less stringent, on a primer-by-primer basis. Initially, the stretch of bases allotted to the
alignment of the forward primer would be extended, in the hopes of finding a primer without
having to lower the other requirements placed on the primer. Failing this, as moving the
primer too far back would in the end compromise the possibility of covering the entire
mtDNA in the chosen number of primers, the remaining parameters were in turn made less
stringent. The decision on what parameter to change was aided by the error message given by
the Primer-BLAST tool upon failure to generate a primer pair, which would list the reasons
for the failure, e.g. TM difference too high, too long poly-X sequence, or lack of GC clamp.
Decisions were also made by observing the surrounding sequence manually, and thereby
deciding whether or not certain changes were appropriate. For each primer pair, the changes to
the parameters that were deemed to cause the least impactful changes to the overall structure
of the primers were chosen.
To limit the use of template DNA, which is available in limited amounts, primers that would
amplify longer parts of the mtDNA were required. These would then be used as templates for
the aforementioned 32 primer pairs, also creating a sort of nested PCR [26], which reduces
the likelihood of generating unspecific PCR products. It also serves the purpose of generating
template that is in solution, not bound to FTA cards.
The 32 primer pairs will from here on be referred to as ‘Inner Primers’ and these new,
analogously dubbed ‘Outer Primers’ were aligned in much the same manner as the inner
primers, interlocking with each other, but also taking care not to overlap with the alignment
sequences of the inner primers.
Figure 6: Schematic representation of the interlocking design of the outer primers. Template DNA represented by the wide yellow line, the primers by blue and green arrows (alternating colours purely for visual clarity) and the amplified fragments in corresponding colours below the template.
Figure 7: Schematic representation of how the outer primers fully cover a set of four inner primers. Template DNA represented by the wide yellow line, the inner primers by red and orange arrows and outer primers by blue and green arrows (alternating colours purely for visual clarity) and the amplified fragments in corresponding colours below the template.
The outer primers were designed to cover four inner primers each, resulting in eight outer
primer pairs, each amplifying around 2200 bp long sequences. These were to serve as both a
way of amplifying the original template, which is available in limited amounts, and as a way
to create a nested PCR, reducing the complexity in subsequent PCR reactions.
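The relationship between the two primer sets can be written down explicitly. The sketch below assumes that the inner primer pairs are numbered consecutively along the genome, so that outer pair k spans inner pairs 4k-3 to 4k; this numbering is consistent with inner primers 9-12 being amplified from outer primer 3 later in this section, but the function names are hypothetical.

```python
INNER_PAIRS = 32
INNER_PER_OUTER = 4
OUTER_PAIRS = INNER_PAIRS // INNER_PER_OUTER  # 8 outer primer pairs


def outer_pair_for_inner(inner):
    """Return the outer primer pair whose product contains the given inner fragment."""
    if not 1 <= inner <= INNER_PAIRS:
        raise ValueError("inner primer pair must be between 1 and 32")
    return (inner - 1) // INNER_PER_OUTER + 1


def inner_pairs_for_outer(outer):
    """Return the four inner primer pairs covered by the given outer primer pair."""
    if not 1 <= outer <= OUTER_PAIRS:
        raise ValueError("outer primer pair must be between 1 and 8")
    first = INNER_PER_OUTER * (outer - 1) + 1
    return list(range(first, first + INNER_PER_OUTER))


print(outer_pair_for_inner(9))     # 3
print(inner_pairs_for_outer(6))    # [21, 22, 23, 24]
```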
As detailed previously, in order to utilise the Illumina Dual-Indexed Paired End sequencing
protocol, a number of additional specific sequences need to be present in the primers. The
basic Illumina sequencing relies on random fragmentation of sample DNA, followed by
ligation of specific adaptors to the fragments, which enable bridge amplification of the
fragments on the sample slide, as well as containing the primer alignment sequence for the
sequencing-by-synthesis steps.
As this project endeavours to sequence the entire mtDNA of thousands of individuals in
specific, predetermined PCR-amplified segments, this random fragmentation approach to
creating the DNA inserts to which adaptors are ligated is not appropriate, as it would require
separate libraries for each individual and involves increased labour and cost, as well as
removing the specificity of using primers to ensure full coverage. Instead, the Read Primer
parts of the adaptor sequences are added single-strandedly to the 5’ end of the forward and
reverse inner primers as handles, making these increase in size significantly. In order to
complete the sequencing-enabling structures for Dual-Indexed Paired End Sequencing, an
additional PCR step is required. This step will be used to introduce the outermost adaptor
sequences that allow attachment to the flow cell, P5 and P7, as well as the two index sequences, i5
and i7.
Figure 8: Schematic overview of the two PCR steps that complete the sequencing construct. The top step uses specific inner primers (shown in dark grey) with attached partial adaptor sequences containing read primer complementary sequences (shown in yellow and light blue). The second step adds the outer adaptor sequences (shown in red and dark blue) and the indices (shown in light and dark green) by completing the previously added adaptors.
The final construct, shown above in figure 8, consists, from left to right, of the P5 flow cell
attachment sequence, the i5 index barcode, the Read 1 Primer complementary region, the
forward insert specific primer, the DNA insert, the reverse insert specific primer, the i7 index
complementary region (which doubles as the Read 2 complementary region when read in the
other direction), the i7 index barcode, and the P7 flow cell attachment sequence. The
difference in structure between the default ligated adaptor and this PCR-generated construct
lies in the forward and reverse insert specific primers, which enable the sequencing of
specific, predetermined parts of the sample DNA, but from a sequencing standpoint, these are
merely treated as a part of the DNA insert, and do not influence the sequencing itself in any
way.
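The order of the construct and the PCR step that contributes each part can be summarised as a simple data structure. This is only a sketch of the layout described above: the segment names follow the text, the step labels "inner PCR" and "barcoding PCR" are shorthand chosen here, and no actual adaptor sequences or lengths are included.

```python
# Segments of the final construct, left to right, as listed in the text.
# The second field records where each segment comes from.
CONSTRUCT = [
    ("P5 flow cell attachment sequence", "barcoding PCR"),
    ("i5 index barcode", "barcoding PCR"),
    ("Read 1 primer complementary region", "inner PCR"),
    ("forward insert-specific primer", "inner PCR"),
    ("DNA insert", "template"),
    ("reverse insert-specific primer", "inner PCR"),
    ("i7 index / Read 2 complementary region", "inner PCR"),
    ("i7 index barcode", "barcoding PCR"),
    ("P7 flow cell attachment sequence", "barcoding PCR"),
]


def segments_from(step):
    """Return the construct segments contributed by one source, in construct order."""
    return [name for name, source in CONSTRUCT if source == step]


for name in segments_from("barcoding PCR"):
    print(name)
```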
Using 32 inner fragments per individual, and sequencing 1152 individuals in parallel, given
the total read output of the MiSeq v3 sequencing kit [27], an equal distribution of reads
between the fragments would, in an ideal situation, provide a redundancy of 600 reads per
fragment and individual.
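The arithmetic behind this estimate can be sketched as follows. The total number of reads used here is back-calculated from the stated redundancy of 600 reads and is not taken from the kit specification; the function names are hypothetical.

```python
def required_reads(individuals, fragments_per_individual, reads_per_fragment):
    """Total reads needed for a given redundancy, assuming a perfectly even distribution."""
    return individuals * fragments_per_individual * reads_per_fragment


def reads_per_fragment(total_reads, individuals, fragments_per_individual):
    """Redundancy per fragment and individual for a given total read output."""
    return total_reads / (individuals * fragments_per_individual)


# 1152 individuals x 32 fragments = 36,864 fragment/individual combinations.
print(required_reads(1152, 32, 600))            # 22118400
print(reads_per_fragment(22_118_400, 1152, 32)) # 600.0
```

In other words, a redundancy of 600 corresponds to roughly 22 million usable reads spread evenly over all fragments and individuals; any unevenness in amplification lowers the redundancy of the weakest fragments accordingly.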
PCR Reactions, Viability of Primers and Multiplex Set Ups
Initially, the first eight inner primers, from Eurofins, were tried in singleplex, 0.1 µl TagTaq,
30 cycles.
Figure 9: First attempt at amplifying the first 8 inner primers (0.1 µl TagTaq, 30 cycles) flanked by two DNA ladders (Low Range, 3% TopVisionAgarose #RO491 25-700 bp). Ladders are smeary and dissimilar, and exposure is high in an attempt to visualize potential product.
Bands were very weak and smeary. As this was true for the ladders too, as well as for other
gels run by others in the lab at the same point in time, part of the fault may, in this case, lie in
the gel bath itself.
Next, the same primers were tried again, resulting in a gel with much sharper bands but still
very weak product bands, showing only for primers 1-3 (Figure 10), but submitting the
remaining reaction mixtures to a subsequent extra 12 cycles showed more clear results
(Figure 11). The gel after the additional 12 cycles shows bands of the expected size for
primers 1-3, as well as multiple bands of lower sizes, presumed to be various primer
dimers. These results prompted the decision to increase the number of cycles for the inner
primers to 40. Low processivity of the enzyme was suspected.
Figure 10: Second attempt at amplifying the first 8 inner primers (0.1 µl TagTaq, 30 cycles), flanked by two DNA ladders (Low Range, 3% TopVisionAgarose #RO491 25-700 bp, and M, 1% TopVision LE GQ Agarose #RO491 250-10000 bp). Ladders are clearer but product is very weak, faintly visible for primers 1 through 3.
Figure 11: Second attempt on first 8 inner primers after 12 additional PCR cycles. Primers 1 through 3 are clearly visible. Samples flanked by two DNA ladders as before (Low Range, 3% TopVisionAgarose #RO491 25-700 bp, and M, 1% TopVision LE GQ Agarose #RO491 250-10000 bp). The two rightmost lanes before the M ladder are primers 1 and 2 from a different sample compared to the first eight lanes.
Reactions for the same eight primers were then run using twice the amount of enzyme, 0.2 µl,
for 40 cycles, and 10 times as much enzyme, 1 µl, but remaining at 30 cycles. The 0.2 µl, 40
cycle run showed product of the expected size for all primers except primer 8, while the 1.0
µl, 30 cycle run did not yield any product. Whether the latter was caused by a laboratory
mistake or a result of imbalances between the reaction components due to the increase in
enzyme concentration, or simply still too few amplification cycles was not further
investigated.
Figure 12: Inner primers 1 through 8 amplified with 0.2 µl TagTaq and 40 cycles, in duplicate, M ladder. Product from all primers except primer 8 appears clearly.
Figure 13: Inner primers 1 through 8 amplified with 1.0 µl TagTaq and 30 cycles, in duplicate, M ladder. No product visible, possibly due to human error.
Deciding to proceed with 40 cycles and 0.2 µl enzyme per reaction for inner primers in
singleplex, different annealing temperatures were investigated. Both 49°C and 52°C were
tried, using the now established parameters, both yielding product for all primers apart from
primer 8.
Figure 14: Inner primers 1 through 8 amplified with 0.2 µl TagTaq and 40 cycles, 49°C annealing temperature to the left and 52°C annealing temperature to the right, Low Range ladder. All primers except primer 8 appear clearly in both sets.
Inner primers were also tried in multiplex, initially in quadruplexes of primers 1-4 and 5-8, 50
cycles, yielding vague primer dimer products. At the same time, the eight outer primers were
run for the first time, using TagTaq, 50°C annealing temperature and 40 cycles, but no
product was obtained. Figure 15 below shows these results, using inner primer 1 as a positive
control.
Figure 15: Attempt at amplifying the 8 outer primers using 0.2 µl TagTaq, 50°C annealing temperature and 40 cycles, in duplicate, with inner primer 1 as positive control, M ladder. The two rightmost lanes contain inner primer multiplex attempts, primers 1-4 and 5-8, 50 cycles. No outer primer product visible, and only unspecific product visible for the inner primer multiplexes.
The outer primers and the multiplex attempts were retried using both 5 minutes and 10
minutes extension time for both, again failing to result in the desired products.
Figure 16: Attempt at amplifying the 8 outer primers using 0.2 µl TagTaq, 50°C annealing temperature and 40 cycles, using 5 minutes extension time (left) and 10 minutes (right), with inner primer 1 as positive control, M ladder. The 4 rightmost lanes contain inner primer multiplex attempts, primers 1-4 and 5-8, 50 cycles, in duplicate. Again, no outer primer product visible, and only unspecific product visible for the inner primer multiplexes.
In order to rule out human error, all outer primers were re-suspended from stock solution and
the PCR reactions were re-run at 40 cycles and 10 minutes extension time. As yet again no
product was obtained, it was suspected that TagTaq lacked the processivity required to
adequately amplify the longer outer primer fragments, and PlatinumTaq was tried instead,
using 45 cycles and 10 minutes extension time. This set up yielded clear product for all eight
outer primers. It was subsequently concluded that TagTaq did indeed lack the necessary
processivity to reliably produce the longer, outer primer fragments, and PlatinumTaq was
employed for all outer primer reactions from this point onwards.
Figure 17: Outer primers amplified with PlatinumTaq, 45 cycles, 10 minutes extension time, inner primer one as positive control, M ladder. All outer primers visible.
Subsequently, quadruplexes of the outer primers were set up, 1-4 and 5-8, using 0.2 µl
PlatinumTaq, 45 cycles, and 10 minutes extension time, and singleplexes of each of the eight
primers were run using 1 µl 1:500 dilutions from the corresponding quadruplexes as template
and were run for 15 cycles. These secondary singleplexes yielded product in outer primers 3,
4, 5, and 8.
Figure 18: Singleplexes of outer primers, from 1 µl 1:500 dilutions of 1-4 and 5-8 quadruplex template, 15 cycles, M ladder. Outer primers 3, 4, 5, and 8 clearly visible, 1, 2, 6, and 7 seemingly not amplified.
In order to further investigate the possibility of quadruplexing the outer primers, quadruplexes
composed of odd- and even-numbered outer primers, as well as combinations of outer
primers 1, 2, 6, and 7 and of 3, 4, 5, and 8, were set up. As before, these multiplexes were verified by
singleplex reactions based on the multiplex product, run on gels. The former combinations
showed clear product for outer primers 3, 4, and 5, and weak bands for 7 and 8. The latter was
similar, and showed primers 3, 5, and 8 relatively clearly, and 4 weakly.
Figure 19: Singleplexes of outer primers, from 1 µl 1:500 dilutions of Odd and Even combination (i.e. 1-3-5-7 and 2-4-6-8) quadruplex template, 15 cycles, M ladder. Outer primers 3, 4, and 5 clearly visible, primers 7 and 8 very faintly, and 1, 2, and 6 seemingly not amplified.
Figure 20: Singleplexes of outer primers, from 1 µl 1:500 dilutions of 1-2-6-7 and 3-4-5-8 quadruplex template, 15 cycles, M ladder. Primers 3, 5, and 8 were visible, and primer 4 was faintly visible.
These results led to the decision to try the outer primers in duplexes, one set up with primers
1+2, 3+4, 5+6, and 7+8, and one set up with primers 1+3, 2+4, 5+7, and 6+8. The duplex
reaction mixture was then used as templates for the corresponding singleplexes, using 1 µl
1:20 dilutions and run for 15 cycles. These set ups consistently yielded product for primers 3,
4, 5, and 8, similarly to the earlier quadruplexes, but primers 1 and 2 showed weak
amplification when paired together, as did primer 7 when paired with primer 5.
Figure 21: Singleplexes of outer primers from duplex set-ups (1+2, 3+4, 5+6, and 7+8), M ladder (one blank lane between the ladder and the first primer). Primers 1 through 5, and primer 8 visible, primers 3 through 5 more strongly.
Figure 22: Singleplexes of outer primers from duplex set-ups (1+3, 2+4, 5+7, and 6+8), M ladder (one blank lane between the ladder and the first primer). Primers 3 through 5, and primers 7 and 8 visible, primers 3, 5, and 8 more strongly, primer 7 very faint.
Outer primers 5 and 6, one that had consistently worked and one that did not appear to work,
were subsequently chosen for testing other parameters of the PCR. Singleplexes of primers 5
and 6 were run at 20 cycles and at 30 cycles, with 5 minutes and 10 minutes of extension
time, i.e. four different PCR set ups for each primer. A duplex of outer primers 5 and 6 was
also run at the same parameters. The gels showed 20 cycles to be too few to properly amplify
the segments, and the longer extension time appeared to increase yield. At 10 minutes
extension time, outer primer 5 amplified to a higher extent than primer 6. The singleplexes
performed from the duplexes showed amplification of only primer 5.
Figure 23: Parameter tests for outer primers, using outer primers 5 and 6. From left to right, 20 cycles with 5 minutes extension time, 20 cycles with 10 minutes extension time, 30 cycles with 5 minutes extension time, and 30 cycles with 10 minutes extension time for both primer 5 and 6. 20 cycles did not yield product at any extension time, and the longer extension time yielded more product, especially for primer 5.
Figure 24: Singleplex amplification of outer primers 5 and 6 from duplex template (image cropped from a larger gel containing other samples). Only showing result of 30 cycle runs, ostensibly only yielding product for outer primer 5.
After the initial attempts at running the first eight inner primers, the remaining 24 inner
primers were ordered. Due to the issues with getting inner primer 8 to yield product, and
based on advice regarding primer purchase (personal communication with Afshin Ahmadian,
Associate Professor, School of Biotechnology, Royal Institute of Technology, KTH) the new
primers were ordered from Biolegio. Primer 8 was redesigned, and both the new and old
versions were ordered, along with primer 1, for comparison to the Eurofins primers, together
with inner primers 9-32.
Firstly, primer 1 from both Eurofins and Biolegio were run in triplicate, as well as both the
original and the new version of Primer 8, both from Biolegio and also in triplicate. The
reactions were run as before, at 46°C annealing temperature and for 40 cycles. The two
versions of inner primer 1 gave comparable results, and product was obtained from the
re-synthesis of the original inner primer 8, but not from the new version.
Figure 25: Comparison of inner primer 1 from Eurofins and from Biolegio, and of the old and new design of inner primer 8, both from Biolegio, in triplicate. Primer 1 worked comparably well from both manufacturers, and the old design of primer 8 from Biolegio appeared to work, while the new design did not.
Next, inner primers 9 through 32 were tested in duplicate, according to the same parameters.
Due to the unexpected result from the two versions of inner primer 8, these were also re-run and are
included on the gel showing inner primers 25 through 32. The re-run did support the previous
evidence in showing that the re-synthesis of the original primer 8 worked, while the redesign
did not. The majority of inner primers 9-32 showed product, and the ones that did not or
appeared only weakly were re-run.
Figure 26: Singleplexes of inner primers 9-16, in duplicate (one set after the other, with one empty lane in-between the sets), M ladder.
Figure 27: Singleplexes of inner primers 17-24, in duplicate (one set after the other, with one empty lane in-between the sets), M ladder.
Figure 28: Singleplexes of inner primers 25-32, in duplicate (one set after the other), M ladder. Additionally, to the left of the ladder, the old and new design of inner primer 8, again showing product from the old design.
The primers to be re-run in duplicate were 9, 16, 19, 20, 22, 23, 24 and 25. The results
obtained were largely inconclusive, being inconsistent between duplicates, at best showing
fairly weak bands, at worst appearing almost fully blank, and overall showing a lot of
unspecific product.
Figure 29: Singleplexes in duplicate of the inner primers between 9 and 32 that did not appear to yield product in the initial singleplexes. From left to right, in pairs, 9, 16, 19, 20, 22, 23, 24, and 25, M ladder.
Next, the viability of TagTaq compared to PlatinumTaq for the amplification of inner primers
was assessed, at the same time investigating how well the inner primers amplify from a
previously amplified outer primer segment, by setting up singleplexes of inner primers 9-12
using template from singleplex amplification of outer primer 3. The outer PCR was run at
50°C annealing temperature, 30 cycles, and 5 minutes extension time. 1 µl 1:20 dilution of the
outer primer product was used as template for the inner singleplexes. These were run at 46°C
annealing temperature, 40 cycles. All steps were performed in duplicate, i.e. two singleplex
reactions of outer primer 3 were used for duplicates of both the TagTaq and the PlatinumTaq,
totalling four singleplexes of each inner primer. All singleplexes were successful, with
PlatinumTaq showing much more strongly, and the two sets of TagTaq clearly differing in
intensity.
Figure 30: Duplicate sets of inner primers 9 through 12 in singleplex from previous amplification of outer primer 3, comparing TagTaq (left) to PlatinumTaq (right), M ladder. The PlatinumTaq amplified inner primers show more strongly, and there is a marked difference between the two TagTaq sets, despite having been amplified under the same conditions.
Similarly, quadruplexes of inner primers 9-12, one using TagTaq and one using PlatinumTaq,
were set up, still using the amplified outer primer 3, 1:20 dilution, as template. Reactions
were run at 46°C annealing temperature, 30 cycles. Secondary singleplexes for verification
were performed with PlatinumTaq for all reactions, on 1:20 dilutions of the quadruplex mixtures.
Results were similar between the two enzymes; inner primers 9, 11, and 12 were successfully
amplified, while primer 10 was very weak.
Figure 31: Singleplexes from quadruplexes of inner primers 9 through 12. All singleplex reactions were performed with PlatinumTaq, while one quadruplex was performed with TagTaq (left) and one with PlatinumTaq (right), M ladder.
Due to the apparent failure of certain outer primers in quadruplex reactions, the outer primers
were re-suspended from stock and run in singleplex, as before, in order to investigate whether
degradation of the primers due to freeze-thaw cycles was the cause of the amplification failure.
Primers 3 through 8 appeared clearly, primer 2 was weaker, and primer 1 was very faint.
Figure 32: Outer primers in singleplex, re-suspended from stock, M ladder.
Diluting all outer primer product (apart from primer 1, being significantly weaker) at a ratio
of 1:20, singleplexes of all inner primers were run from their corresponding outer primer, for
40 cycles, with 46°C annealing temperature. Results showed amplification of inner primers
corresponding to each of the outer primers, but not from all inner primers, even from inner
primers that had previously been successfully amplified.
Figure 33: Singleplexes of inner primers 1 through 16 from outer primer singleplexes 1 through 4, M ladder.
Figure 34: Singleplexes of inner primers 17 through 32 from outer primer singleplexes 5 through 8, M ladder.
The resuspended outer primers were then tried in quadruplex; outer primers 1, 3, 5, and 7 in
one quadruplex and outer primers 2, 4, 6, and 8 in the other. They were run in duplicates of
both 20 and 30 cycles, all using 50°C annealing temperature and 5 minutes extension time. 1:20 dilutions of
the quadruplex reaction mixtures were used for singleplex verification and were run for 15
cycles. The gels showed successful amplification of primers 3, 4, 5, and 8 in both the 20 cycle
and the 30 cycle quadruplexes, and in the latter, outer primer 7 was also visible.
Figure 35: Duplicates of outer primer singleplexes from outer primer quadruplexes (1+3+5+7 and 2+4+6+8, i.e. Odd and Even), M ladder. Quadruplex run for 20 cycles, primers 3, 4, 5, and 8 faintly visible.
Figure 36: Duplicates of outer primer singleplexes from outer primer quadruplexes (1+3+5+7 and 2+4+6+8, i.e. Odd and Even), M ladder. Quadruplex run for 30 cycles, primers 3, 4, 5, 7 and 8 visible, number 7 more faintly.
It was then decided to attempt singleplex inner primer amplification, using the 30 cycle outer
primer quadruplex reaction mixture as template, at 1:100 dilution. The singleplexes were run
for 40 cycles. Although product was only expected for inner primers corresponding to outer
primers 3, 4, 5, 7, and 8, gels showed amplification of inner primers from all outer primers,
including those that appeared not to have worked in quadruplex. Only two inner primers
appeared to not have yielded product.
Figure 37: Singleplexes of inner primers 1 through 16, from outer primer quadruplex template (even and odd outer primer combinations), M ladder. Most inner primers visible, irrespective of whether or not the corresponding outer primer appeared to have yielded product.
Figure 38: Singleplexes of inner primers 17 through 32, from outer primer quadruplex template (even and odd outer primer combinations), M ladder. Most inner primers visible, irrespective of whether or not the corresponding outer primer appeared to have yielded product.
To verify this unexpected result, the entire experiment was run again, starting from the
quadruplex of the outer primers, with similar results.
Figure 39: Singleplexes of inner primers 1 through 16, from re-run of outer primer quadruplex template (even and odd outer primer combinations), M ladder. Again, almost all inner primers are clearly visible.
Figure 40: Singleplexes of inner primers 17 through 32, from re-run of outer primer quadruplex template (even and odd outer primer combinations), M ladder. Again, almost all inner primers are clearly visible.
Barcoding PCR
Following these results, the decision was made to proceed towards sequencing. For two
different DNA samples, PCR1 was run in quadruplexes of odd and even numbered outer
primers, 40 cycles, and PCR2 was run in quadruplexes, duplexes and singleplexes according
to the pattern in figure 41, each for 25 cycles. A further four DNA samples were prepared in
the same manner, but for these, PCR2 was only run on quadruplexes.
Figure 41: Duplex and quadruplex set-ups for all 32 inner primers, with corresponding names (Q1-8 for the quadruplexes and D1-8 and D17-24 for the duplexes).
Products from PCR2 were pooled per individual and multiplexing set-up, creating 32-plexes
of inner fragments, and 1:100 dilutions of these pools were used as template for the
barcode-introducing PCR3. This reaction was run for 15 cycles, at 58 °C annealing
temperature, and 5 minutes extension time. Both TagTaq and PlatinumTaq were employed,
according to Table 1 below.
Sample   Singleplex  Singleplex   Duplex  Duplex       Quadruplex  Quadruplex
         TagTaq      PlatinumTaq  TagTaq  PlatinumTaq  TagTaq      PlatinumTaq
IR119    X           X            X       X            X           X
IR126    X           X            X       X            X           X
IR85     -           -            -       -            X           -
IR92     -           -            -       -            X           -
IR114    -           -            -       -            X           -
IR127    -           -            -       -            X           -
Table 1: Table showing the different combinations of samples, enzymes, and multiplexing variants used in PCR3 for the introduction of the barcode indices.
Concentration Measurements and Cleaning
Concentration measurements were performed on all 32-plexes after PCR3, using the Qubit
dsDNA HS Assay Kit (Invitrogen, Life Technologies), followed by a cleaning step on an
MBS machine (Magnetic Bead Separation) in order to remove smaller fragments than those
meant for sequencing, such as loose primers and primer dimer constructs. This first
concentration measurement was performed both as a quick way of verifying product from
PCR3, and as a means of estimating the relative concentration of actual product when
compared to the clean samples. The Illumina CA Purification protocol [15] was used, diluting
20 µl of PCR3 product to 50 µl using elution buffer (EB). A concentration of 14% PEG was
used as precipitation buffer in order to achieve an appropriate size cut-off [17] [18]. The
parameters entered into the Magnatrix OS were 50 µl sample volume, 20 µl magnetic beads,
100 µl Precipitation Buffer, 25 µl EB, and 10 minutes binding time, resulting in input
volumes of 50 µl sample, 95 µl EB, 125 µl 14% PEG, 220 µl 80% EtOH, and 25 µl beads.
After MBS cleaning, a second Qubit concentration measurement was performed on all
samples, in order to estimate the actual product concentration, on which the pooling of
samples for the sequencing was subsequently based. The samples were also run on
BioAnalyzer (Agilent Technologies, 1000 kit) for a visual verification of the success of the
cleaning step. The desired products are expected to be in the range of 600-700 bp, due to the
base fragment being around 550 bp, to which large additional adaptor sequences have been
added.
Figure 42: Bioanalyzer results of all samples apart from the singleplex PlatinumTaq set ups, which were on a separate Bioanalyzer run. All samples are successfully cleaned, showing no short, unspecific products, and those run with PlatinumTaq clearly showing a peak at the expected size range of 600-700, while peaks are very small for those run with TagTaq.
Both assays indicated higher yields of specific product from the sample set-ups run with
PlatinumTaq than from those that were performed with TagTaq, and product above the
detection cut-off for all samples except one (Table 2). In the case of the four additional
individual samples, these were pooled into one sample prior to the second cleaning step.
Sample     Concentration before cleaning          Concentration after cleaning
           In assay [ng/ml]   In sample [µg/ml]   In assay [ng/ml]   In sample [µg/ml]
IR119ST    20.6               4.12                1.32               0.263
IR119SP    50.2               10.0                23.9               4.78
IR119DT    21.0               4.20                1.06               0.212
IR119DP    70.8               14.2                25.6               5.12
IR119QT    14.7               2.94                <0.5*              -
IR119QP    48.7               9.74                21.6               4.33
IR126ST    20.0               4.0                 1.61               0.322
IR126SP    36.3               7.27                22.3               4.45
IR126DT    18.8               3.77                1.24               0.248
IR126DP    35.6               7.12                19.9               3.99
IR126QT    23.2               4.64                1.42               0.284
IR126QP    37.6               7.51                18.0               3.59
IR85QT     17.6               3.52                (pooled)           (pooled)
IR92QT     11.6               2.33                (pooled)           (pooled)
IR114QT    14.5               2.89                (pooled)           (pooled)
IR127QT    22.7               4.53                (pooled)           (pooled)
Pool of IR85QT, IR92QT, IR114QT, IR127QT          1.15               0.230
Table 2: Overview of the amount of PCR product in the different samples before and after cleaning using the Illumina CA Purification Protocol on the MBS [17].
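The "in assay" and "in sample" columns of Table 2 differ by a constant factor of 200 together with a unit change from ng/ml to µg/ml. The sketch below reproduces that conversion; the factor of 200 is inferred from the tabulated values (it would correspond to, for example, 1 µl of sample in a 200 µl assay volume, which is an assumption and is not stated in the text), and the function names are hypothetical.

```python
def sample_concentration_ug_per_ml(assay_ng_per_ml, dilution_factor=200):
    """Convert a concentration read in the assay tube to the concentration in the sample.

    dilution_factor is the ratio of assay volume to sample volume; 200 is
    inferred from Table 2 rather than taken from the protocol.
    """
    return assay_ng_per_ml * dilution_factor / 1000.0  # ng/ml -> µg/ml


def recovery_fraction(before, after):
    """Fraction of the measured concentration that remains after cleaning."""
    return after / before


# IR119ST before cleaning: 20.6 ng/ml in the assay.
print(round(sample_concentration_ug_per_ml(20.6), 2))   # 4.12
# IR119SP: 10.0 µg/ml before and 4.78 µg/ml after cleaning.
print(round(recovery_fraction(10.0, 4.78), 3))          # 0.478
```

Since the pre-cleaning measurement also includes loose primers and primer dimers, the recovery fraction is only a rough indication of how much of the initial signal was actual product.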
Initial Sequencing
All 32-plexes were then pooled for sequencing on the MiSeq, using the MiSeq Reagent Kit
V2, 300 cycles (Illumina), i.e. paired-end sequencing of 150 bases from each end. After
demultiplexing, the results, shown in abbreviation in Tables 3 and 4, were analysed, and table
5 provides a colour coding key, used to highlight the different magnitudes of reads. Results
showed generally lower numbers of reads than anticipated, and while not being completely
conclusive nevertheless showed clear trends in successful amplification and sequencing. The
main implications were that sample set-ups where PlatinumTaq had been used for PCR3 had
overall generated larger numbers of reads than those that had been performed with TagTaq,
and that the inner fragments corresponding to outer fragments 1, 2, and 6 (inner fragments 1
through 8, and 21 through 24) had yielded far fewer reads than the remaining ones. The latter
trend was particularly noticeable among the first 8 inner fragments, with only two out of 16
deviating from the pattern, while the pattern for fragments 21 through 24 was not as
pervasive.
Fragment        85      92     114     127   119DP   119DT   119QP   119QT   119SP   119ST
1              392     332       1      10    1191     861    3763     691    1551    1382
2                4       1       3       1    2190    1892      18       8     360     182
3               21       1       3       1     172     101      27       8     711     900
4                0       6       5       4    1565     410     240       9   49258    3885
5               13       9       7       9     411     199     297      16    6943    2992
6                1       2       0       0     159      90      57       8   26382    2707
7                7       6       2       5   34767    6293     361      15   63126    6568
8                1       0       0       0     211      69      97       3   37699    1718
9             2125    1157    1146    1158  133270   12310   28138    1132     397     515
10            2136    1071    1538    1943  116761   31809   14731    1677    9387   13220
11            1683    1017    1725    1931  168542   17748    8367     351   38699    5969
12            2161    1215     266    5836  126426   24588    9050     629   57564   10430
13             967     406      90     531  104270    7243   51632     819   31637    2396
14             435     144     114     394  104642    6709   63538    1390   72882    3827
15             502     144      58     122   29198    8296    1073     335   74150    7058
16             192      72      13    3011   75913    9907   19411     973   82509   16184
17            2326    1409    2517    2885   59213    9038  270002    6629     706    1394
18            3349    2002    2754    2442  105531   10036  316883    9134   73157    5234
19            3322    1830    1564    1680  136755   12378  437807   11373  110441    7505
20            3386    1705      52     561  126721    9141  390608    8600   99596    5159
21            1477     612      13     130    1427    2868    4268     528  139331   10548
22              10       2       8       6    2243     993    3006     133   71586   12202
23              10       5       1       5     734     591     548      95   70299    6355
24               2       1       2       9    8859    6700    2139     107   40107   19496
25             424     193     260     231    4194    1639   19755    1307   73767    6168
26             395     199      42      42    8142    3634   25041    1888  155734   18440
27             188      86     128      72    6825    1478   33161     749   32533    3727
28              74      36      19      48    1698     574    6172     237  221553   12900
29            5145    2708    4085    4350  101017    9688  257360    7500  198695   19424
30            8141    4205    2521    4991  130392   19357  274886    9148     906    4091
31            2772    1551    1587    1965   71028   13996  179512    5725   41943    3070
32            4579    2926     743     753   66604   29799  228790    7520   78606   12442
Table 3: Table showing the number of reads obtained for each fragment from the additionally prepared TagTaq quadruplexes 85, 92, 114, 127, and all six TagTaq and PlatinumTaq set ups for sample 119. The colour of the background, in a scale from red to yellow to green, serves to highlight the differences in the number of reads obtained for each fragment, with one colour for each degree of magnitude, with the exception of dark red, signifying a frequency of 0, detailed in table 5.
Sample    126DP   126DT   126QP   126QT   126SP   126ST
Fragments
1         74      87      8       200     497     447
2         76      35      15      63      69      96
3         34      16      19      71      401     568
4         5336    651     984     496     191     202
5         120     152     183     265     111     22
6         87      18      53      67      435     479
7         56622   3217    855     154     3583    4222
8         271     70      109     62      20365   1590
9         121469  12941   5479    3597    395     161
10        199811  24356   10993   9149    483     114
11        147953  23749   13682   7657    81658   10882
12        311537  59062   71196   31221   82312   11919
13        245150  20750   40067   14957   35829   2558
14        152221  13621   64102   10398   106776  8142
15        68266   12994   1838    18021   80154   6856
16        124980  11135   87470   8458    80434   16867
17        471     4529    226507  34521   58965   5959
18        133148  14492   255859  45269   19840   2061
19        3427    1402    183958  28863   100110  9264
20        1083    181     85691   9098    96191   10227
21        2996    970     4545    11022   167713  16844
22        1451    510     2753    2892    134284  19105
23        381     561     188     1925    32671   4575
24        23872   2099    14601   1511    69398   12497
25        372     694     1991    7647    43532   7101
26        10983   1363    19320   3233    129972  15561
27        1008    943     75301   16132   132979  9330
28        832     285     38164   12357   117058  10674
29        157249  19281   223762  46346   250894  30530
30        132277  26932   258884  78112   118590  12603
31        128804  27254   212690  39436   86354   6759
32        22988   12172   13393   33537   94675   39913
Table 4: Number of reads obtained for each fragment from all six TagTaq and PlatinumTaq set ups for sample 126. The background colour, on a scale from red through yellow to green, highlights the differences in the number of reads obtained for each fragment, with one colour for each order of magnitude. The colour key is given in table 5.
Colour Coding Key Number of reads
n = 0
1 ≤ n < 10
10 ≤ n < 100
100 ≤ n < 1'000
1'000 ≤ n < 10'000
10'000 ≤ n < 100'000
100'000 ≤ n < 1'000'000
Table 5: Table showing the colours used to label the read counts, and the range of read counts that each colour represents.
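The binning behind the colour key can be sketched as a small function. This is only an illustration: the function name `read_count_bin` is hypothetical, and it assumes that each range includes its lower bound (so that counts of exactly 1, 10, 100 and so on fall into a bin), which the key does not state explicitly.

```python
def read_count_bin(n: int) -> str:
    """Return the range from the colour key that a read count falls into.

    Assumption: every range includes its lower bound and excludes its
    upper bound, i.e. one bin per order of magnitude.
    """
    if n < 0:
        raise ValueError("read counts cannot be negative")
    if n == 0:
        return "n = 0"
    # Number of digits minus one gives the order of magnitude:
    # 0 for 1-9, 1 for 10-99, 2 for 100-999, ...
    exponent = len(str(n)) - 1
    if exponent > 5:
        raise ValueError("count lies above the highest range in the key")
    return f"{10 ** exponent:,} <= n < {10 ** (exponent + 1):,}"


if __name__ == "__main__":
    for count in (0, 7, 861, 437807):
        print(count, "->", read_count_bin(count))
```

Counting digits rather than taking a logarithm keeps the bin boundaries exact for integer read counts.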
Results Obtained Post-Project
Subsequent PCR and MiSeq runs, performed by the research group while this report was being
written, have yielded additional insight into the behaviour of the primers, PCR set ups, and
sequencing. After the relative concentrations of both inner and outer primers had been adjusted, a
substantially modified set of primer concentrations, shown in tables 6 and 7, was found to give
a very even level of reads in sequencing, shown in table 8, with a maximum 6.1-fold difference
between the fragments with the highest and lowest number of reads when the two high
outliers are counted (4.2-fold when they are discounted).
Outer Primer  Concentration in PCR1 [µM]
1             0.5
2             0.4
3             0.08
4             0.04
5             0.04
6             0.4
7             0.15
8             0.035
Table 6: Table showing the modified concentrations of the outer primers. Rather than equal concentrations for each pair of primers, the concentrations now vary significantly.
Concentrations of Primers in Modified Inner Quadruplexes
Q1   c [µM]   Q2   c [µM]   Q3   c [µM]   Q4   c [µM]
1    0.07     2    0.1      3    0.2      4    0.15
9    0.1      10   0.1      11   0.15     12   0.08
17   0.1      18   0.1      19   0.18     20   0.1
25   0.07     26   0.12     27   0.07     28   0.15

Q5*  c [µM]   Q6   c [µM]   Q7*  c [µM]   Q8   c [µM]
7    0.1      6    0.25     5    0.4      8    0.2
13   0.3      14   0.5      15   0.8      16   0.25
21   0.2      22   0.3      23   0.6      24   0.15
29   0.1      30   0.35                   32   0.3
31   0.2
Table 7: Table showing the modified quadruplexes and related concentrations of the inner primers. Rather than equal concentrations for each pair of primers, the concentrations vary significantly, and the primers included in quadruplexes 5 and 7 have been changed: inner primers 7 and 31 have been moved to Q5* and primer 5 to Q7*, making them a quintuplex and a triplex, respectively.
Sample C1V2.7
Fragment  %         Reads
1         2.32 %    24103
2         1.39 %    14484
3         3.85 %    40023
4         5.32 %    55317
5         2.54 %    26366
6         4.03 %    41859
7         2.44 %    25403
8         1.08 %    11207
9         3.66 %    37995
10        4.68 %    48609
11        4.67 %    48503
12        1.92 %    19993
13        1.87 %    19398
14        1.43 %    14910
15        1.34 %    13876
16        2.06 %    21401
17        3.46 %    35993
18        4.50 %    46735
19        4.37 %    45458
20        6.58 %    68381
21        3.15 %    32769
22        1.55 %    16121
23        1.74 %    18135
24        2.92 %    30357
25        2.19 %    22806
26        4.40 %    45703
27        3.61 %    37564
28        4.11 %    42677
29        4.19 %    43558
30        2.90 %    30190
31        2.89 %    30011
32        2.83 %    29465
Table 8: Table showing the result of a MiSeq sequencing run on samples generated using the modified concentrations shown in tables 6 and 7. As before, the colour key in table 5 was used to label the numbers of reads, and the percentage of total reads is here graded in blue for clarity.
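The 6.1-fold figure quoted for this run can be checked directly from the read counts in table 8. The following is a minimal sketch; the function names are illustrative and not part of any pipeline used in the project.

```python
# Read counts per fragment (1 to 32) for sample C1V2.7, copied from table 8.
READS = [
    24103, 14484, 40023, 55317, 26366, 41859, 25403, 11207,
    37995, 48609, 48503, 19993, 19398, 14910, 13876, 21401,
    35993, 46735, 45458, 68381, 32769, 16121, 18135, 30357,
    22806, 45703, 37564, 42677, 43558, 30190, 30011, 29465,
]


def fold_difference(counts):
    """Ratio between the largest and the smallest read count."""
    return max(counts) / min(counts)


def extreme_fragments(counts):
    """1-based numbers of the fragments with the fewest and the most reads."""
    lowest = counts.index(min(counts)) + 1
    highest = counts.index(max(counts)) + 1
    return lowest, highest


if __name__ == "__main__":
    low, high = extreme_fragments(READS)
    print(f"fewest reads: fragment {low}, most reads: fragment {high}")
    print(f"fold difference: {fold_difference(READS):.1f}")
```

The sketch only reproduces the figure that includes the outliers, since the text does not specify which fragments were treated as outliers for the 4.2-fold figure.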
Discussion
Primer Layout and Design
The first, and one of the more challenging, parts of the project was the initial primer design.
Owing to a lack of previous experience in primer design, together with the
constraints imposed by the template DNA itself, the process took significant time, effort, and
compromise to finalise. The first attempt at aligning 32 primer pairs so that they covered
the entire 16727 bp mtDNA failed to reach all the way around, and the second
attempt therefore required further compromises and careful choices between possible primer pairs
at every step. That is not to say that the final set of primers is inherently
inferior in quality; rather, different choices early in the primer selection process,
informed by the shortcomings of the previous attempt, allowed for a tighter
alignment of primer pairs, thereby covering more ground. It was evident, when
performing the second primer alignment, that actively choosing a few of the longer-reaching
primer pairs early on shifted the entire alignment into a more favourable frame,
in which most forward inner primers could be placed closer to the preceding reverse inner
primer than in the previous attempt. This ultimately led to more freedom in
placing problematic primer pairs, because more pairs overall met or exceeded
the average required length.
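The notion of an average required length, and of a layout failing to reach all the way around the circular genome, can be illustrated with a short sketch. The genome length and the number of fragments are taken from the text; the evenly spaced layouts are purely hypothetical and do not reflect the actual primer positions, and the model ignores primer binding sites and any overlap needed between neighbouring fragments.

```python
GENOME_LENGTH_BP = 16727  # circular dog mtDNA, length as given in the text
N_FRAGMENTS = 32


def mean_required_length(genome_bp=GENOME_LENGTH_BP, n_fragments=N_FRAGMENTS):
    """Average length each fragment must contribute to close the circle."""
    return genome_bp / n_fragments


def covers_circle(fragments, genome_bp=GENOME_LENGTH_BP):
    """Check whether (start, length) fragments cover a circular genome.

    Positions are 0-based; a fragment that runs past the last base wraps
    around to position 0.
    """
    covered = [False] * genome_bp
    for start, length in fragments:
        for offset in range(length):
            covered[(start + offset) % genome_bp] = True
    return all(covered)


def evenly_spaced(length, step=523, n_fragments=N_FRAGMENTS):
    """Hypothetical layout: fragments of equal length at a fixed spacing."""
    return [(i * step, length) for i in range(n_fragments)]


if __name__ == "__main__":
    print(f"average required length: {mean_required_length():.1f} bp")
    print("523 bp fragments close the circle:", covers_circle(evenly_spaced(523)))
    print("522 bp fragments close the circle:", covers_circle(evenly_spaced(522)))
```

In this simplified picture, every fragment that falls short of the average has to be compensated by a longer one elsewhere, which is why choosing a few longer-reaching pairs early leaves more room for the problematic ones.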
Most inner primer pairs successfully amplified their intended target at the
expected size, with a few significant exceptions: the originally
synthesised primer pair 8 failed, and the product of primer pair 2 occasionally appeared much
lower on the gels than expected, suggesting that a shorter sequence than intended had been
amplified. The latter is, however, to be expected, as inner primer 2 covers the repeat region,
which varies in size between individuals. Apart from these specific events, the primers often
produced shorter non-specific products, suspected to be primer dimers, sometimes in worryingly
large quantities compared with the specific product, and the amplification of individual
sequences was markedly inconsistent. Certain primer pairs were more prone to
this latter behaviour than others, but in the end there was evidence that each primer pair
had functioned, although never all 32 on a single occasion. The suspected primer dimer
products were also very effectively removed by the cleaning step employed later in the
process.
The 32 inner primer set up does seem to be successful, albeit after several modifications to
relative concentrations.
There is some concern regarding the amplification of the control region, which is a very
informative region of the mtDNA containing a large number of single nucleotide
polymorphisms. It would be preferable to cover this region in a single amplified segment,
but currently it spans the last third of the segment amplified by primer pair 32,
continues through segments 1 and 2, and ends at the very beginning of segment 3. It is, however, not
possible to cover the control region fully in a single amplified segment using the chosen
sequencing platform, as it measures 1270 bp in the reference genome [19], which is
far beyond the 600 bp covered by the Illumina Paired End Sequencing
protocol. Even hypervariable region 1 (HV1) alone, which contains the majority of these
informative sites and itself measures 582 bp [20], would have
exceeded the upper limit of 590 bp once primers for its specific amplification had been added.
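The length argument can be written out as simple arithmetic. The sketch below takes the 590 bp limit and the region lengths from the text at face value; the 20-nucleotide primer length is a hypothetical, typical value chosen only for illustration, since the actual primer lengths are not given here.

```python
AMPLICON_LIMIT_BP = 590    # upper limit quoted in the text
CONTROL_REGION_BP = 1270   # length in the reference genome [19]
HV1_BP = 582               # length of hypervariable region 1 [20]


def amplicon_length(target_bp, forward_primer_bp, reverse_primer_bp):
    """Total amplicon length when both primers flank the target region."""
    return target_bp + forward_primer_bp + reverse_primer_bp


def fits(target_bp, forward_primer_bp, reverse_primer_bp,
         limit_bp=AMPLICON_LIMIT_BP):
    """True if the amplicon, including its primers, stays within the limit."""
    return amplicon_length(target_bp, forward_primer_bp,
                           reverse_primer_bp) <= limit_bp


def primer_budget(target_bp, limit_bp=AMPLICON_LIMIT_BP):
    """Bases left for both primers combined once the target is included."""
    return limit_bp - target_bp


if __name__ == "__main__":
    # Hypothetical 20-nucleotide primers on either side of HV1.
    print("HV1 amplicon length:", amplicon_length(HV1_BP, 20, 20), "bp")
    print("HV1 fits:", fits(HV1_BP, 20, 20))
    print("bases left for both primers:", primer_budget(HV1_BP))
    print("control region fits:", fits(CONTROL_REGION_BP, 20, 20))
```

With only 8 bases left for both primers combined, no realistic primer pair could flank HV1 and stay within the limit.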
PCR Procedures
Although not to the same extent as with primer design, previous experience of setting up PCR
reactions, particularly at this scale, was quite limited. Significant trial and error was
involved in finding working protocols, and especially early on it was hard to distinguish
human error from sub-optimal protocols when results were poor. As the project
proceeded, however, more experience was gained, both theoretical and practical, and it
became easier both to perform and to evaluate the PCR runs. At the close of the laboratory part
of the project, many of the protocols were in all likelihood less than optimal, and can probably
be improved upon at a later date.
As it stood then, quadruplexes of the outer primers appeared to be working, as evidenced by
the subsequent successful singleplex amplification of inner primers from the diluted
quadruplex reaction mixture, even though subsequent singleplexes of the outer primers
themselves ostensibly showed this not to be the case. These odd results were later
explained by experiments performed post-project, which revealed that highly unequal
concentrations of both inner and outer primers were required to yield even numbers
of reads in sequencing.
Outer primers, when run in singleplex from earlier quadruplexes prepared with equal concentrations,
almost invariably yielded product only for primers 3, 4, 5, and 8. In light of the concentrations
established by later experiments, in which these previously successful outer primers were
used at around a tenth of the concentration of those that had previously failed to
amplify, the original results hardly seem surprising. Given the exponential nature of PCR
amplification, this difference in concentrations is highly significant. The same pattern can be
seen in the inner primers.
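Both points in this paragraph can be illustrated numerically. The first part of the sketch compares the post-project outer primer concentrations from table 6. The second part uses an idealised model in which the copy number grows as N0 (1 + E)^n over n cycles with a constant efficiency E; this model and the efficiency values are assumptions made for illustration, not measurements from the project, and real multiplex reactions are further complicated by competition between primer pairs and by plateau effects.

```python
# Post-project outer primer concentrations in PCR1 [µM], from table 6.
OUTER_PRIMER_UM = {1: 0.5, 2: 0.4, 3: 0.08, 4: 0.04,
                   5: 0.04, 6: 0.4, 7: 0.15, 8: 0.035}
# Outer primers that yielded product at equal concentrations (from the text).
PREVIOUSLY_SUCCESSFUL = (3, 4, 5, 8)


def mean_concentration(primers):
    """Mean table 6 concentration of the given outer primers."""
    return sum(OUTER_PRIMER_UM[p] for p in primers) / len(primers)


def concentration_ratio():
    """Mean concentration of the successful primers relative to the rest."""
    others = [p for p in OUTER_PRIMER_UM if p not in PREVIOUSLY_SUCCESSFUL]
    return mean_concentration(PREVIOUSLY_SUCCESSFUL) / mean_concentration(others)


def copies_after(cycles, efficiency, start_copies=1.0):
    """Idealised PCR: each cycle multiplies the copy number by (1 + efficiency)."""
    return start_copies * (1.0 + efficiency) ** cycles


def yield_ratio(efficiency_a, efficiency_b, cycles):
    """How many times more product reaction A gives than reaction B."""
    return copies_after(cycles, efficiency_a) / copies_after(cycles, efficiency_b)


if __name__ == "__main__":
    print(f"concentration ratio: {concentration_ratio():.2f}")
    # Hypothetical efficiencies of 100 % and 80 % per cycle over 30 cycles.
    print(f"yield ratio after 30 cycles: {yield_ratio(1.0, 0.8, 30):.1f}")
```

Under these assumptions, even a modest per-cycle difference between two primer pairs compounds into a difference of more than an order of magnitude in product, which is in line with the large concentration adjustments that were needed.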
Indexing, Cleaning, and Sequencing
The incorporation of index primers into the sequencing construct seems to be successful
overall, but its efficiency appears to vary significantly, primarily with the enzyme employed. This is
evident from the fact that the concentration of product post-purification, i.e. the specific
product in the correct size range, is significantly higher in the reactions that used PlatinumTaq
than in those that used TagTaq.
The cleaning itself also appears successful, judging by the difference in the amount of product
pre- and post-cleaning, and the lack of low length sequences present post-cleaning. The
Bioanalyzer corroborates these results, showing samples clear of low length, unspecific
products, primer dimers, or loose primers, while showing peaks in the expected size range for
the desired, complete sequencing constructs.
The first attempt at sequencing was moderately successful, showing reads from most of the 32
different fragments for all sequenced individuals, while also displaying indications of the
expected issues with this original 32-fragment approach. The fragments were far from
equally represented, and there were also clear indications that the relative
success of the outer primer from which an individual inner primer is derived heavily
influences the number of reads obtained for the inner primer in question. The latter conclusion
is derived from the pattern of lower-yielding inner primers, which occur in groups of four that
correspond to the outer primers and not to the physical inner
quadruplexes in which they were amplified. The samples that had the indices incorporated
with PlatinumTaq had a higher number of reads overall than those that used
TagTaq, but generally showed the same patterns and variation between inner primer fragments.
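The grouping pattern can be made concrete with the read counts for set up 126QP in table 4. The sketch rests on two assumptions that should be checked against the primer layout: that inner fragments 1 to 4 derive from outer primer 1, fragments 5 to 8 from outer primer 2, and so on, and that the original inner quadruplexes combined every eighth fragment (1, 9, 17, 25 and so on), as the unmodified quadruplexes in table 7 suggest.

```python
from statistics import median

# Read counts for fragments 1 to 32, set up 126QP, copied from table 4.
READS_126QP = [
    8, 15, 19, 984, 183, 53, 855, 109,
    5479, 10993, 13682, 71196, 40067, 64102, 1838, 87470,
    226507, 255859, 183958, 85691, 4545, 2753, 188, 14601,
    1991, 19320, 75301, 38164, 223762, 258884, 212690, 13393,
]


def outer_primer(fragment):
    """Assumed outer primer (1 to 8) covering an inner fragment (1 to 32)."""
    return (fragment - 1) // 4 + 1


def inner_quadruplex(fragment):
    """Assumed original inner quadruplex (1 to 8) of an inner fragment."""
    return (fragment - 1) % 8 + 1


def group_medians(reads, group_of):
    """Median read count per group, for a given fragment-to-group mapping."""
    groups = {}
    for fragment, count in enumerate(reads, start=1):
        groups.setdefault(group_of(fragment), []).append(count)
    return {group: median(counts) for group, counts in sorted(groups.items())}


def spread(medians):
    """Ratio between the highest and the lowest group median."""
    values = list(medians.values())
    return max(values) / min(values)


if __name__ == "__main__":
    by_outer = group_medians(READS_126QP, outer_primer)
    by_inner = group_medians(READS_126QP, inner_quadruplex)
    print("medians by outer primer:", by_outer)
    print("medians by inner quadruplex:", by_inner)
    print(f"spread by outer primer: {spread(by_outer):.0f}")
    print(f"spread by inner quadruplex: {spread(by_inner):.0f}")
```

If read numbers were governed by the inner quadruplexes, the medians would differ most when grouped that way; under the assumed mappings, the spread is far larger when the fragments are grouped by outer primer instead.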
The results obtained post-project make it evident that the outer primers in particular were
very uneven in their comparative efficiency, and that drastically changing their relative
concentrations to compensate for this gave very favourable results.
Conclusions
While the initial sequencing run was not by any means perfect, the method as a whole shows
promise. All outer primers appear functional, as do all inner primers, although some variation
has been observed in different individual experiments. All individual inner primers have been
shown to yield product in different experiments, but never all at the same time, at least not to
a degree readily visible on agarose gels. While the outer primers could not be
proven functional by re-amplifying them in singleplex from the outer
primer quadruplexes themselves, subsequent successful inner primer amplification from outer
primer quadruplexes shows that all outer primers must have amplified enough to provide
adequate template for the inner primers. The cleaning on the MBS using the Illumina CA
Purification works well and does not seem to need any modification, and the incorporation
of indices works well, especially when using PlatinumTaq. The sequencing itself indicates
that the amount of indexed sequencing constructs provided by the TagTaq enzyme was in fact
enough to provide a substantial number of reads, although the read counts obtained with
PlatinumTaq were generally higher by one order of magnitude.
The main challenge remaining at the conclusion of the practical part of the project
was to level out the highly varying read numbers between the different
fragments. Since these appeared to correlate strongly with the outer primers from which
the fragments were amplified, a solution seemed to be to alter the relative concentrations of the
outer primers in the initial outer quadruplexes, or in some other manner to manipulate the relative
abundance of the outer primer products. This, in conjunction with slight adjustments to the
compositions and the relative concentrations of the primers in the inner quadruplexes, has
now, as described in the post-project results, been shown to drastically improve the ratio
between the numbers of reads obtained for the 32 different fragments. Based on these results,
the method appears very promising.
References
1. Savolainen P., Zhang Y-P., Luo J., Lundeberg J., and Leitner T. (2002)
Genetic Evidence for an East Asian Origin of Domestic Dogs
SCIENCE Vol. 298:1610-1613
2. Ding Z-L., Oskarsson M., Ardalan A., Angleby H., Dahlgren L-G., Tepeli C.,
Kirkness E., Savolainen P., and Zhang Y-P. (2011)
Origins of domestic dog in Southern East Asia is supported by analysis of Y-chromosome DNA
Heredity (2011), 1–8
3. Pang J-F., Kluetsch C., Zou X-J., Zhang A-B., Luo L-Y., Angleby H., Ardalan A.,
Ekström C., Sköllermo A., Lundeberg J., Matsumura S., Leitner T., Zhang Y-P., and
Savolainen P. (2009)
mtDNA Data Indicate a Single Origin for Dogs South of Yangtze River, Less Than
16,300 Years Ago, from Numerous Wolves
Mol. Biol. Evol. 26(12): 2849–2864
4. van Asch B., Zhang A-B., Oskarsson M. C. R., Klütsch C. F. C., Amorim A., and
Savolainen P. (2013)
Pre-Columbian origins of Native American dog breeds, with only limited replacement
by European dogs, confirmed by mtDNA analysis
Proc R Soc B 280: 20131142
5. Oskarsson M. C. R., Klütsch C. F. C., Boonyaprakob U., Wilton A., Tanabe Y., and
Savolainen P. (2011)
Mitochondrial DNA data indicate an introduction through Mainland Southeast Asia
for Australian dingoes and Polynesian domestic dogs
Proc. R. Soc. B
DOI: 10.1098/rspb.2011.1395
6. Shapiro B., Cui P., Schuenemann V. J., Sawyer S. K., Greenfield D. L., Germonpré
M. B., Sablin M. V., López-Giráldez F., Domingo-Roura X., Napierala H., Uerpmann
H-P., Loponte D. M., Acosta A. A., Giemsch L., Schmitz R. W., Worthington B.,
Buikstra J. E., Druzhkova A., Graphodatsky A. S., Ovodov N. D., Wahlberg N.,
Freedman A. H., Schweizer R. M., Koepfli K-P., Leonard J. A., Meyer M., Krause J.,
Pääbo S., Green R. E., Wayne R. K. (2013)
Complete Mitochondrial Genomes of Ancient Canids Suggest a European Origin of
Domestic Dogs
SCIENCE Vol. 342:871-874
7. von Holdt B. M., Pollinger J. P., Lohmueller K. E., Han E., Parker H. G., Quignon P.,
Degenhardt J. D., Boyko A. R., Earl D. A., Auton A., Reynolds A., Bryc K., Brisbin
A., Knowles J. C., Mosher D. S., Spady T. C., Elkahloun A., Geffen E., Pilot M.,
Jedrzejewski W., Greco C, Randi E., Bannasch D., Wilton A., Shearman J., Musiani
M., Cargill M., Jones P. G., Qian Z., Huang W., Ding Z-L, Zhang Y-P., Bustamante
C. D., Ostrander E. A., Novembre J., and Wayne R. K. (2010)
Genome-wide SNP and haplotype analyses reveal a rich history underlying dog
domestication
Nature 464, 898-902
8. Ringo J. (2004)
Fundamental Genetics
9. DNA Sequencing with Solexa® Technology (2007)
Illumina, Pub. No. 770-2007-002 01May07
10. Illumina Sequencing Technology Highest data accuracy, simple workflow, and a
broad range of applications (2010)
Illumina, Pub. No. 770-2007-002 Current as of 11 October 2010
11. Sequencing Dual-Indexed Libraries on the HiSeq® System User Guide
ILLUMINA PROPRIETARY Part # 15032071 Rev. B July 2012
12. Fuller, C. W., Middendorf L. R., Benner S. A., Church G. M., Harris T., Huang X.,
Jovanovich S. B., Nelson J. R., Schloss J. A., Schwartz D. C., and Vezenov D. V.
(2009)
The challenges of sequencing by synthesis
Nature Biotechnology Vol. 27, No. 11: 1013-1023
13. GenBank: U96639.2, Mitochondrial reference genome of the domestic canine on the
NCBI Database
http://www.ncbi.nlm.nih.gov/nuccore/U96639
14. Savolainen P., Arvestad L., and Lundeberg J. (2000)
mtDNA Tandem Repeats in Domestic Dogs and Wolves: Mutation Mechanism
Studied by Analysis of the Sequence of Imperfect Repeats
Mol. Biol. Evol. 17(4): 474–488
15. Primer BLAST primer alignment tool on the NCBI database
http://www.ncbi.nlm.nih.gov/tools/primer-blast/
16. Invitrogen information sheet on Platinum Taq
https://www.lifetechnologies.com/content/dam/LifeTech/migration/files/pcr/pdfs.par.26652.file.dat/platinumtaq-pps.pdf
17. Protocol for Illumina CA Purification
https://github.com/EnvGen/LabProtocols/blob/master/CA_cleaning.pdf
18. Lundin S., Stranneheim H., Pettersson E., Klevebring D., Lundeberg J. (2010)
Increased Throughput by Parallelization of Library Preparation for Massive
Sequencing.
PLoS ONE 5(4): e10029.
DOI:10.1371/journal.pone.0010029
19. Gundry R. L., Allard M. W., Moretti T. R., Honeycutt R. L., Wilson M. R., Monson
K. L., and Foran D. R. (2007)
Mitochondrial DNA Analysis of the Domestic Dog: Control Region Variation Within
and Among Breeds
DOI: 10.1111/j.1556-4029.2007.00425.x
20. Imes D. L., Wictum E. J., Allard M. W., Sacks B. N. (2012)
Identification of single nucleotide polymorphisms within the mtDNA genome of the
domestic dog to discriminate individuals with common HVI haplotypes
DOI: 10.1016/j.fsigen.2012.02.004
21. Natanaelsson C., Oskarsson MC., Angleby H., Lundeberg J., Kirkness E., Savolainen
P. (2006).
Dog Y chromosomal DNA sequence: identification, sequencing and SNP discovery
BMC Genet 7: 45
22. AlbaNova University Center
School of Biotechnology of the Royal Institute of Technology (KTH)
23. Meyer M., Stenzel U., Myles S., Prüfer K., and Hofreiter M. (2007)
Targeted high-throughput sequencing of tagged nucleic acid samples
DOI: 10.1093/nar/gkm566
24. Gunnarsdóttir E. D., Li M., Bauchet M., Finstermeier K., and Stoneking M. (2011)
High-throughput sequencing of complete human mtDNA genomes from the
Philippines
DOI: 10.1101/gr.107615.110
25. Maricic T., Whitten M., and Pääbo S. (2010)
Multiplexed DNA Sequence Capture of Mitochondrial Genomes Using PCR Products
DOI: 10.1371/journal.pone.0014004
26. Haff L. A. (1994)
Improved quantitative PCR using nested primers
Genome Res. 3: 332-337
27. Illumina MiSeq Specifications
http://www.illumina.com/systems/miseq/performance_specifications.html