
DEGREE PROJECT IN BIOTECHNOLOGY, SECOND CYCLE, 30 CREDITS

STOCKHOLM, SWEDEN 2016

Development of a massive parallel sequencing method for population genetics, for the sequencing of 1,000 dog mitochondrial genomes per Miseq run, based on nested and multiplexed PCR amplification and PCR-incorporated dual-index identification barcodes

LINNEA GULDBRAND

KTH SCHOOL OF BIOTECHNOLOGY

www.kth.se


Master Thesis at the School of Biotechnology, KTH Royal Institute of Technology,

Department of Gene Technology, SciLifeLabs

Supervisor and Examiner: Peter Savolainen


Abstract

The geographical origin of the domestic dog has not yet been conclusively established. The

mitochondrion, being matrilineally inherited and prone to a greater rate of mutation compared

to nuclear DNA, is of great significance in genetic evolutionary studies and as such, complete

sequencing of the mitochondrial genome of a great number of individuals would provide

important data for the furthering of such studies.

This project aims to design a method of sequencing the entire mitochondrial genome of the

domestic dog for a large number of individuals in parallel on the Illumina MiSeq sequencing

platform, using several sets of PCR primers to generate barcoded and sequencing-ready

libraries of predetermined fragments for each individual.

PCR primers were constructed both for initial long-range products and for shorter fragments,

suitable for sequencing and containing partial sequencing adaptors, covering the entire

mitochondrial chromosome. Additionally, primers containing barcode indices and the final

required sequencing constructs were designed. The viability of the primers and of different

PCR parameters was investigated and verified on agarose gels and on the Bioanalyzer, and a set of

samples was taken through cleaning, barcoding, and sequencing.

Results indicate a promising method, where all primers successfully generate product, and

both cleaning and sequencing appear in essence successful, but the relative amounts of

product obtained from each primer, and subsequently the number of reads obtained in

sequencing, vary significantly with the initial set-up. Subsequent experiments, performed

after the closing of the practical part of the project, have shown that compensating for this

uneven amplification by using significantly unequal primer concentrations greatly serves to

alleviate these issues.


Abstract
Introduction
  Previous Findings on the Geographical Origins of the Domestic Dog
  The Mitochondrion, in Biology and in Forensics
  The Illumina, Dual Indexed, Paired End Sequencing Method
  Existing Methods for Whole mtDNA Sequencing
Aim of Project
Materials and Methods
  Primer Layout and Design
  PCR Reactions
Results
  Primer Layout and Design
  PCR Reactions, Viability of Primers and Multiplex Set Ups
  Barcoding PCR
  Concentration Measurements and Cleaning
  Initial Sequencing
  Results Obtained Post-Project
Discussion
  Primer Layout and Design
  PCR Procedures
  Indexing, Cleaning, and Sequencing
Conclusions
References


Introduction

Previous Findings on the Geographical Origins of the Domestic Dog

That the domestic dog has its evolutionary origin in the wolf has long been known and

accepted as fact, based on both genetic evidence and on archaeological findings, as well as on

behavioural and physical traits [1]. However, the precise circumstances, the historical time

point, and the geographical location of the original domestication event, or events, are less

clear. Several evolutionary genetic studies have been performed, using various methods and

sample materials, with varying results. Simply put, these studies attempt to identify the most

likely common ancestor of the domestic dog as the one whose genetic material can account

for the diversity of all others, while also comparing them to wild wolf populations and

estimating the timeframe for the domestication event via the rate at which mutations are

believed to accumulate. Variously, such studies have indicated the geographical origin of the

domestic dog in places as disparate as Europe, South East Asia, and the Middle East.

A 2002 study [1] of a stretch of 582 base pairs from the so-called control region of the

mitochondrial DNA from 654 dogs, representing dog populations worldwide, indicated an

East Asian origin for the domestic dog, based on a comparatively higher degree of

phylogenetic variation in dogs from this area. These findings were corroborated by the

analysis of 14 437 base pairs from the Y chromosome (fragmented sequences from

incomplete sequencing of male dog DNA, assigned to the Y chromosome through

comparison to female dog DNA as well as human Y chromosome sequences [21]) from

151 dogs worldwide [2] as well as by a further study of the complete mitochondrial DNA

from 169 individuals, combined with the 582 control region base pairs from 1576 individuals,

both placing the geographical origin of the domestic dog in South Eastern Asia, south of the

Yangtze River, China, less than 16 300 years ago [3]. Both studies also show that this region

of South East China, south of the Yangtze River, is the region in which the genetic diversity is

the greatest, and is the only region where almost all haplotypes of both the mtDNA and the Y

chromosome can be found simultaneously. Additionally, analysis of the mtDNA of Native

American dog breeds, when compared to East Asian and European dogs, as well as pre-Columbian

samples, shows low levels of European mtDNA [4], indicating a more ancient,

Asian, origin of the Native American dog breeds. Similarly, mtDNA analysis of the

Australian Dingo and Polynesian domestic dogs indicates an origin in mainland South East

Asia [5].


On the other hand, a study of the mitochondrial DNA from 18 ancient canids [6] indicated a

closer relationship with either ancient or modern European canids for all modern dogs the

world over. The study did, admittedly, lack ancient canid samples from both the Middle East

and China, two other major candidates for the geographical origin of the domestic dog.

Furthermore, genome-wide SNP (Single Nucleotide Polymorphism) analysis of over 48,000

SNPs in 912 dogs as well as 225 grey wolves has indicated a Middle Eastern origin for the

domestic dog, based on the significantly larger genetic variation found in breeds from this

region [7].

The Mitochondrion, in Biology and in Forensics

The mitochondrion is an organelle present in eukaryotic cells, whose role is to perform

oxidative metabolism, providing energy for the cell. The origin of the mitochondrion is

assumed to be the enveloping of a purple bacterium by an ancient eukaryotic ancestor,

resulting in an endosymbiotic relationship whereby both the eukaryotic host cell and the

bacterial symbiont benefit. Certain genetic material has since migrated from the original

bacterium into the nuclear DNA of the host cell to the degree that the modern mitochondrion

can no longer survive independently, but it does retain certain vital genes in its own

mitochondrial DNA, the mtDNA. Mitochondria also still, like bacteria, reproduce through division,

rather than through being disassembled and reassembled, as is the case with all other

organelles, apart from the chloroplasts of plants, which have similar origins to the

mitochondrion. [8]

Mitochondrial DNA usually consists of one double-stranded, circular

chromosome, present in multiple copies, although single-stranded and linear chromosomes also exist. The

mitochondrial DNA encodes components necessary for protein production and certain

enzymes required for aerobic metabolism, but many components of the mitochondrion are

encoded by nuclear DNA and are transported into the mitochondrion. Multiple

mitochondria are present in any given cell, and their genetic make-up is not necessarily

homogeneous. [8]

Mitochondria are inherited maternally. In meiosis in females, mitochondria are evenly

segregated between the two new cells and in the resulting embryo, the mitochondria present

in the fertilised ovum will divide and produce all the mitochondria in the new organism. Due

to this manner of inheritance, mitochondrial DNA can be used in forensics, to trace familial


relations on the maternal side (e.g. mother and child, as well as siblings who share a mother

but not a father), rule out suspects based on crime scene DNA, and on a larger time scale,

trace the evolution of a species. Mitochondrial DNA is more suited to such analyses for two

main reasons. Firstly, while each cell only contains one complete set of nuclear DNA,

mitochondrial DNA is present in multiple copies per cell, thereby making it significantly

more abundant than nuclear DNA, somewhat circumventing the common issue of limited

sample material. Secondly, specifically in mammals, mutations accumulate at a much higher

rate in mitochondrial DNA than in nuclear DNA, on average 10⁻⁸ mutations per nucleotide per

year, meaning that evolutionary differences can be visible on a comparatively shorter timescale.

[8]

The Illumina, Dual Indexed, Paired End Sequencing Method

The Illumina sequencing method is a Sequencing-By-Synthesis (SBS) method, also

known as Solexa sequencing, utilising fluorescently labelled nucleotides to track base incorporation. In its

basic iteration, DNA is sheared into randomly sized fragments, to the ends of which a forward

and a reverse adaptor sequence are ligated. These adaptor-ligated fragments are then attached

randomly, as single strands, to the surface of the flow cell, where each individual fragment

is amplified into clusters of multiple copies of the same fragment, using so-called bridge

amplification. This means that the non-attached adaptor sequence of any given fragment

anneals to its complementing adaptor on the surface of the flow cell, which then acts as a

primer, allowing amplification into an arch-shaped double-stranded structure where each of

the two strands is attached at one end to the flow cell. A denaturation step separates the two

strands of the ‘bridge’, and the process is repeated until a sufficiently dense cluster of

single-stranded DNA fragments has been formed.


Figure 1: Basic overview of Illumina sequencing, using random fragmentation and adaptor ligation [9]

After the clusters have been formed, the sequencing commences. Bases are determined

through the use of fluorescently labelled nucleotides, each of the four bases fluorescing at a

different wavelength. The labelled nucleotides are also reversible terminators, meaning that

the fluorescent label blocks more than one nucleotide from being incorporated at a time, but

that after the base has been determined, this label is enzymatically cleaved off, in preparation

for the next cycle, allowing the next nucleotide to be incorporated. For the first cycle of the

sequencing, all four fluorescently labelled nucleotides are added to the flow cell at once,

together with primers specific to the adaptor sequences and DNA polymerase, and one

nucleotide is incorporated in the first position of each strand, in each cluster. A laser is then

used to excite the fluorescent label on the nucleotide, and its identity is recorded for each

cluster. The label is then cleaved off, and all remaining reagents are washed away. For all

subsequent cycles, all four labelled nucleotides and DNA polymerase are added, one base is

incorporated in each strand in each cluster, laser excitation and image recording are performed,

and the label is cleaved off. Remaining reagents are then washed away in preparation for the next

cycle. [9] [10]


Figure 2: Sequencing-by-Synthesis using Illumina sequencing, by incorporating one base at a time and detecting each base by its fluorescent label [9]

This sequencing method provides sequencing data from all fragments applied to the flow cell,

but has the downside of not being able to differentiate between the origins of the different

fragments, as well as, depending on the size of the fragments, not being able to obtain full

sequences, due to limitations imposed by the inherent read length of the sequencing method.

The former limitation can be addressed by employing the Illumina Single- or Dual-Indexed Sequencing

method, both based on the Paired End Sequencing method. The latter can be addressed by

ensuring that all fragments used in the sequencing are below the maximum read length for the

particular sequencing method.

The Dual-Indexed Paired End Sequencing method employs several modifications to the

original adaptor construct and the sequencing procedure in order to distinguish between the

origins of different sequenced fragments. Instead of a simple adaptor on each end of the

fragment to be sequenced, the adaptor sequences are composed of several different

components each. These complex adaptors are shown in Figure 3, in a step-by-step depiction

of the sequencing process. Upstream of the DNA insert to be sequenced, the components of

the construct are the P5 adaptor, one of the slide-attaching sequences, which is also

complementary to the i5 Index Sequencing Primer, followed by the i5 Index, and the

sequence complementary to the Read 1 Primer, which initiates the sequencing from one end


of the DNA Insert. Downstream, the DNA insert is followed by a stretch of bases that is

complementary to both the i7 Index Sequencing Primer and to the Read 2 Primer, the i7

Index, and finally the P7 adaptor sequence that also attaches to the surface of the flow cell.

Each of the two index sequences is composed of 8 bases.
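As a rough illustration of how the two 8-base indices identify a sample, the sketch below builds a lookup from (i7, i5) index pairs to sample labels and uses it to assign a read. The index names and sequences are invented for the example and are not the indices designed in this project. The point is only that the number of distinguishable samples is the product of the sizes of the two index sets.

```python
from itertools import product

# Hypothetical 8-base index sequences, invented for illustration only.
I7_INDICES = {"i7_a": "ACGTACGT", "i7_b": "TGCATGCA", "i7_c": "GATCGATC"}
I5_INDICES = {"i5_a": "AAGGCCTT", "i5_b": "CCTTAAGG"}


def build_sample_sheet(i7_indices, i5_indices):
    """Map every (i7, i5) sequence pair to its own sample label."""
    sheet = {}
    pairs = product(i7_indices.values(), i5_indices.values())
    for number, (i7_seq, i5_seq) in enumerate(pairs, start=1):
        sheet[(i7_seq, i5_seq)] = f"sample_{number:04d}"
    return sheet


def assign_read(sheet, i7_read, i5_read):
    """Return the sample label for an index read pair, or None if unknown."""
    return sheet.get((i7_read, i5_read))


sheet = build_sample_sheet(I7_INDICES, I5_INDICES)
print(len(sheet))  # number of index combinations: 3 * 2
print(assign_read(sheet, "TGCATGCA", "CCTTAAGG"))
```

Exact matching is used here for simplicity; a real demultiplexer would typically also tolerate a limited number of index mismatches.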

The sequencing includes four different primers, sequencing the DNA insert from both ends as

well as the two indices. First, the Read 1 Primer is annealed and the DNA insert is sequenced from

the P5 end of the construct. The Read 1 product is then removed. Secondly, the i7 Index

Sequencing Primer is used to sequence the i7 Index, after which the index product is

removed. Then, the P5 adaptor is annealed to its corresponding adaptor, grafted to the surface

of the flow cell, which is used as the primer for the i5 Index. The i5 Index product is then removed,

the full complementary strand is generated, and the original strand is removed. Lastly,

the Read 2 Primer is used to sequence the DNA insert from the P7 end. [11]

Figure 3: Schematic overview of the Dual-Indexed Paired End sequencing method, showing the order and the orientation of the primers involved [11]

Existing Methods for Whole mtDNA Sequencing

There are existing methods for sequencing the entire mitochondrial DNA of several

individuals in parallel, using different set ups. One such is the PTS (Parallel Tagged

Sequencing) method on the 454 sequencing platform [23, 24], using single-stranded, self-

hybridising barcodes to tag samples prior to pooling and sequencing. Samples are barcoded

separately and then pooled and prepared for sequencing. The barcodes are 6 bp long and


allow for 72 samples to be sequenced in parallel. Another method is the PCR-product capture

method [25], using fragments from a reference individual, fixed to beads, in order to retrieve

and enrich mtDNA fragments from complex DNA mixtures. Long range PCR is used to

produce two PCR fragments that cover the entire mtDNA, and these are then sonicated into

15-800 bp fragments, which are biotinylated and immobilized on streptavidin-coated beads.

The beads are then used to extract mtDNA fragments from sheared DNA mixtures, by

hybridisation, and the fragments can then be eluted, amplified, and sequenced, after separately

barcoding each library and preparing them for sequencing.

Aim of Project

The aim of this project was to design and implement a method for the sequencing of the

canine mitochondrial genome, for the purpose of producing data for phylogeographical

analysis of the geographical origin of the domestic dog, for which large numbers of samples

are necessary. The strategy employed was the introduction of barcodes during preparatory

PCR in order to enable multiplexed sequencing, on the Illumina MiSeq, of 1152 individuals in

parallel. Ultimately, the samples intended for use are saliva samples stored on FTA cards

(Whatman).

In contrast to existing methods, the focus of this project lies on a high degree of

parallelisation, which requires reducing the workload and streamlining the procedures,

and on the specificity of the amplified fragments, in size and location, to guarantee the

coverage of the entirety of the mtDNA, in fragments that can be fully sequenced by the

Illumina MiSeq sequencing platform. The PTS method, being on the 454 sequencing platform

and only providing a 72-plex, is therefore not suitable. Neither is the PCR-product capture

method, both due to the fact that the intended sample material for the project is immobilised

on FTA cards, and because it requires one library per individual to be prepared all the way to

sequencing separately, which is both labour- and cost-intensive.

In order to enable these high degrees of parallelisation, it is important that the read numbers

obtained from each fragment are as even as possible. This is to ensure that all fragments are

sequenced with a sufficiently high redundancy to provide reliable output data.


Materials and Methods

Primer Layout and Design

Primers were designed using the NCBI Primer BLAST tool [15], which can be used to

generate primers according to a set of user-specified parameters, using the canine

mtDNA reference genome [13] as the template.

The goal was to generate primers that would allow for the sequencing of the entire canine

mitochondrial genome in fragments of a size that would be fully covered by the MiSeq

sequencing platform. The highest number of base pairs the MiSeq can cover is 600 bp, which

influences the number of primer pairs that are needed. These primers would, apart from the

sequence-specific component, contain parts of the adaptor constructs necessary for MiSeq

sequencing, to which barcoding primers, containing the rest of the necessary adaptors, can

later be attached.

In addition to these primers, a set of primers to be used for initial amplification of longer

fragments was desired. The purpose of these long-range primers is to limit the use of the

original samples, to avoid depleting them, as well as to create a type of nested PCR [26] for the

sequencing-specific fragments, reducing the likelihood of unspecific targets being generated

by limiting the available unrelated template.

PCR Reactions

PCR reactions were carried out using either TagTaq, obtained from the Alba Nova University

Center [22], or PlatinumTaq, produced by Invitrogen, both being polymerase enzymes for the

purpose of the replication of DNA. The TagTaq was used for the inner primers, due to its

availability and lower cost, as its lower processivity was deemed sufficient for the shorter

inner fragments, while PlatinumTaq was required to fully amplify the longer outer fragments.

Originally, TagTaq was intended for both the inner and the outer primers, but after attempting

to amplify the outer fragments using the TagTaq, in multiple reaction set-ups, and failing to

obtain product, possibly due to the outer fragments being too long for the TagTaq enzyme to

successfully amplify, PlatinumTaq was employed instead.

The TagTaq-based reaction mixture consisted of 2.5 µl “P” (10x polymerase buffer, final

concentrations 50 mM KCl, 2 mM MgCl2, 10 mM TrisHCl pH 8.5, 0.1% v/v Tween), 2.5 µl


“C” (10x dNTP mix, containing 2 mM of each dNTP in water, final concentration 0.2 mM), 1

µl Forward primer (0.2 µM final concentration), 1 µl Reverse primer (0.2 µM final

concentration), 1 µl template, and 17 µl nuclease-free H2O, to a final volume of 25 µl per

reaction. Initially, 0.1 µl TagTaq was used per reaction, according to suggestions from the

providers of the enzyme, but this was later increased to 0.2 µl per reaction due to the low

yield.

For PlatinumTaq-based reaction mixtures, used for amplifying the outer fragments, volumes

were adapted from the information sheet provided by Invitrogen [16] and consisted of 5 µl 10x

PCR Buffer without MgCl2, 5 µl dNTP mixture (2 mM of each dNTP), 1.5 µl MgCl2 (50

mM), 2 µl Forward primer (0.2 µM final concentration), 2 µl Reverse primer (0.2 µM final

concentration), 1 µl template, and 33.5 µl nuclease-free H2O, to a final volume of 50 µl per

reaction. A volume of 0.2 µl PlatinumTaq, 5U/µl, per reaction was used throughout the

experiments.
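The volumes and concentrations listed for the two reaction mixtures can be cross-checked with a simple dilution calculation. The sketch below is only an illustration: the dictionary keys are made-up labels, the enzyme volume is left out of the totals as in the text, and the MgCl2 final concentration and the primer stock concentration are values derived from the listed numbers rather than quoted from the protocol.

```python
def final_concentration(stock_conc, volume_ul, total_ul):
    """Dilution: final concentration = stock concentration * volume / total."""
    return stock_conc * volume_ul / total_ul


# Component volumes in µl, as listed in the text (enzyme volume not counted).
TAGTAQ_MIX = {"buffer_10x": 2.5, "dntp_mix": 2.5, "fwd_primer": 1.0,
              "rev_primer": 1.0, "template": 1.0, "water": 17.0}
PLATINUM_MIX = {"buffer_10x": 5.0, "dntp_mix": 5.0, "mgcl2": 1.5,
                "fwd_primer": 2.0, "rev_primer": 2.0, "template": 1.0,
                "water": 33.5}

tagtaq_total = sum(TAGTAQ_MIX.values())      # total volume in µl
platinum_total = sum(PLATINUM_MIX.values())  # total volume in µl

# The dNTP stock holds 2 mM of each dNTP in both mixtures.
dntp_tagtaq_mM = final_concentration(2.0, TAGTAQ_MIX["dntp_mix"], tagtaq_total)
dntp_platinum_mM = final_concentration(2.0, PLATINUM_MIX["dntp_mix"], platinum_total)

# The MgCl2 stock is 50 mM; the final concentration is derived, not quoted.
mgcl2_platinum_mM = final_concentration(50.0, PLATINUM_MIX["mgcl2"], platinum_total)

# Primer stock concentration implied by a 0.2 µM final concentration.
primer_stock_uM = 0.2 * tagtaq_total / TAGTAQ_MIX["fwd_primer"]

print(tagtaq_total, platinum_total)
print(dntp_tagtaq_mM, dntp_platinum_mM, mgcl2_platinum_mM, primer_stock_uM)
```

Both mixtures add up to their stated final volumes, and the dNTP concentration comes out at 0.2 mM in each case.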

For both TagTaq-based and PlatinumTaq-based reactions, in the case of multiplexing,

initially, equal amounts of each of the necessary forward and reverse primers were added, and

the volume of H2O was lowered accordingly. In later experiments, in attempts to obtain

comparable levels of each product in these multiplexes, the concentrations of the primers

included in each multiplex were varied, increasing the concentration of those primers that

failed to yield product in relation to those that did.

The PCR reactions were tried out with several different annealing temperatures, extension

times, and numbers of cycles. The initial set up for the inner primers was 1.5 minutes of initial

denaturation at 94°C, followed by 30 cycles of 30 seconds of annealing at 46°C and 2 minutes

of extension at 72°C, a final extension for 10 minutes at 72°C and ending in a Hold at 4°C.

The number of cycles was later increased to 40, and both 49°C and 52°C as annealing

temperatures were evaluated.

The PCR reaction parameters for the outer primers were initially the same as for the inner

primers, with the exception of the annealing temperature being set to 50°C. This was later

adjusted to evaluate both 5 and 10 minutes of extension time, as well as different numbers of

cycles.

The annealing temperatures were chosen by manually calculating the optimal annealing

temperature for each primer, using only the sequence-specific part of the inner primers and

the entirety of the outer primers, adding 2°C for an adenine or a thymine and 4°C for a


guanine or a cytosine, together with estimations of melting temperatures from the primer

generating tool, and choosing a temperature that was believed to be sufficiently low to allow

all primers to anneal successfully.
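The manual calculation described above (2°C per A or T, 4°C per G or C) can be sketched as follows. The primer sequences are hypothetical placeholders, and taking the lowest value across a multiplex, optionally minus a safety margin, is one possible reading of "sufficiently low to allow all primers to anneal", not necessarily the exact procedure used.

```python
def rule_of_thumb_tm(primer):
    """Estimate in °C: 2 per A/T plus 4 per G/C, as described in the text."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    if at + gc != len(primer):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


def shared_annealing_temperature(primers, margin=0):
    """Lowest estimate among the primers, minus an optional margin (assumed)."""
    return min(rule_of_thumb_tm(p) for p in primers) - margin


# Hypothetical sequence-specific primer parts, for illustration only.
example_primers = ["ACGTACGTACGTACGTACGT", "ATATATGCATATATGCAT"]
for p in example_primers:
    print(p, rule_of_thumb_tm(p))
print(shared_annealing_temperature(example_primers))
```

This rule of thumb ignores salt concentration and nearest-neighbour effects, which is why the text combines it with the estimates from the primer design tool.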

The success of PCR reactions was evaluated by running aliquots of the reaction mixture on

1% agarose gels, pre-stained with GelRed (Biotium). In the case of multiplex reactions,

singleplex reaction mixtures for the primers participating in the multiplex were prepared and

dilutions of the multiplex reaction mixtures were used as template for the singleplexes. The

products of these singleplexes were then checked on gels, on the assumption that the singleplex

would be successful if and only if the multiplex had been successful with regard to that

specific primer.

Results

Primer Layout and Design

The primers required for the project were subject to a number of criteria set by the intended

sequencing platform, the parameters of adjacent primers, as well as the nature of the mtDNA

itself.

The external criterion set by the Illumina Paired End sequencing on the MiSeq is stated as a

maximum of 550 bases per primer-amplified segment, including the primer sequences, for

sufficient coverage of the entire segment. This includes an overlap of 50 bp at the centre for

better coverage of the ends of the reads. This is due to the fact that, as an ensemble

sequencing-by-synthesis (SBS) method, the read length when sequencing on the MiSeq is

limited by the reliability of the synchronous incorporation of the correct base to each strand in

the cluster. In each step of the sequencing, the correct base has to be incorporated exactly

once and be measured accurately, followed by the removal of the extension-blocking agent,

allowing the next base to be incorporated and measured. As the sequencing proceeds, errors

are eventually introduced, wherein bases fail to be properly incorporated in certain strands,

leading to portions of the cluster lagging behind the others, giving 'false' signals. As these

errors accumulate, the signal-to-noise ratio will decrease, ultimately to the point where bases

can no longer be accurately detected. The number of bases into the sequencing where this

threshold is reached dictates the read length of the method in question. [12]
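The loss of synchrony described above can be illustrated with a deliberately simplified model. It assumes that, in every cycle, each strand independently fails to incorporate a base with a fixed probability, so that the fraction of strands still in phase after n cycles is (1 - p)^n. Both the failure probability and the usable-signal threshold below are arbitrary illustrative values, not measured MiSeq characteristics.

```python
def in_phase_fraction(cycle, p_fail):
    """Fraction of strands in a cluster that have never lagged after `cycle` cycles."""
    return (1.0 - p_fail) ** cycle


def max_read_length(p_fail, min_fraction):
    """Largest number of cycles for which the in-phase fraction stays at or
    above `min_fraction` under the simplified model."""
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must lie strictly between 0 and 1")
    if not 0.0 < min_fraction <= 1.0:
        raise ValueError("min_fraction must lie in (0, 1]")
    cycles = 0
    while in_phase_fraction(cycles + 1, p_fail) >= min_fraction:
        cycles += 1
    return cycles


# Illustrative values only: 0.2% failure per cycle, 50% in-phase threshold.
print(max_read_length(p_fail=0.002, min_fraction=0.5))
```

Even a small per-cycle failure rate compounds over hundreds of cycles, which is the qualitative reason for a hard upper limit on read length.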

It is, however, possible to use segments of sizes approaching 600 bp by utilising the Illumina


stitching algorithm to merge two reads that overlap by at least 10 bp into a single read, using consensus

and quality data from the two reads, allowing the use of larger inserts. The upper limit for the

size of a DNA insert, including primer sequences, was thus set to 590 bp. Subtracting the

length of the primer sequences from the inserts leaves approximately 550 bp sequenced in

each insert, as primers are ideally around 20 bp long. With this average fragment length, it

was estimated that 32 fragments would be needed in order to fully cover the 16727 bp

reference genome, with a reasonable margin for overlaps and difficult-to-align stretches of

DNA. [13]
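The relationship between insert size and read overlap can be made explicit with a short calculation. It assumes a paired-end run of 2 x 300 cycles, which matches the 600 bp maximum mentioned earlier, and primers of 20 bp each. The function names are illustrative.

```python
READ_LENGTH = 300   # assumption: paired-end run of 2 x 300 cycles
PRIMER_LENGTH = 20  # approximate primer length given in the text


def read_overlap(insert_length, read_length=READ_LENGTH):
    """Number of bases covered by both reads of a pair."""
    return 2 * read_length - insert_length


def sequenced_without_primers(insert_length, primer_length=PRIMER_LENGTH):
    """Template bases left once both primer sequences are subtracted."""
    return insert_length - 2 * primer_length


print(read_overlap(550))                # overlap for the recommended maximum
print(read_overlap(590))                # overlap at the chosen upper limit
print(sequenced_without_primers(590))   # template bases per insert
```

An insert of 550 bp leaves a 50 bp overlap, while the chosen limit of 590 bp leaves exactly the 10 bp minimum required for stitching.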

Sequencing 32 individual 550 bp long sequences would yield a total of 17600 bp, leaving a

margin of 873 bases when compared to the 16727 bp of the reference genome. Spread out

over 32 fragments, this allows a variation of about 27 bp per fragment, providing a certain degree

of freedom when aligning the primers. Finally, in order to fully cover the mitochondrial

genome, the fragments cannot average lower than 523 bp (563 bp with the primers included).

32 primer pairs is also a desirable number from a practical design point of view, as sets of 32

fit evenly on 96-well plates as well as in multiples of eight, corresponding to the width of

common laboratory equipment.
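The fragment budget can be verified with a few lines of arithmetic, using only the numbers given in the text (a 16727 bp reference genome, 32 fragments, 550 bp sequenced per fragment, and roughly 20 bp per primer). The variable names are illustrative.

```python
import math

GENOME_LENGTH = 16727   # bp, canine mtDNA reference genome
N_FRAGMENTS = 32
TARGET_LENGTH = 550     # bp sequenced per fragment, primers excluded
PRIMER_LENGTH = 20      # bp, approximate

total_sequenced = N_FRAGMENTS * TARGET_LENGTH
margin = total_sequenced - GENOME_LENGTH
margin_per_fragment = margin // N_FRAGMENTS

# Smallest average fragment length that still covers the whole genome.
min_average = math.ceil(GENOME_LENGTH / N_FRAGMENTS)
min_average_with_primers = min_average + 2 * PRIMER_LENGTH

print(total_sequenced, margin, margin_per_fragment)
print(min_average, min_average_with_primers)
print(96 % N_FRAGMENTS == 0, N_FRAGMENTS % 8 == 0)  # plate and 8-channel fit
```

The total comes to 17600 bp, the margin to 873 bp (about 27 bp per fragment), and the minimum average fragment length to 523 bp, or 563 bp with primers included.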

These 32 primer pairs must then be laid out in an interconnecting fashion, where each forward

primer must be placed slightly upstream of the reverse primer of the previous pair, relative to

the leading strand, so that every base is sequenced independently.

Figure 4: Schematic representation of the overlapping orientation of primers, highlighting how all parts of the template are covered by amplified fragments in an interlocking fashion. Template DNA represented by the wide yellow line, the primers by red and orange arrows (alternating colours purely for visual clarity) and the amplified fragments in corresponding colours below the template.

Another limiting factor for the placement of the primers is the repeating region of the mtDNA

inside which the primers cannot reliably be placed. This is due to the fact that the repeating

region, as indicated by its name, consists of multiple repetitions of the same DNA motif,

meaning that a primer that is complementary to a site in this region is thus complementary to a


large number of sites, upstream and downstream of the intended annealing site, at every place

where this motif repeats itself. In the domestic canine mtDNA, this repeating region alternates

between two almost identical 10 bp segments, only differing in one position. This region

covers bases 16131 through 16430 of the reference genome, but can vary greatly in size

between individuals due to differing numbers of repeats of the two 10 bp motifs. [14]
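The problem with placing a primer inside the repeating region can be demonstrated on a toy template. The two 10 bp motifs and the flanking sequences below are invented for the example (they are not the real canine sequences), the motifs are simply alternated, and matching is done on one strand only for simplicity.

```python
# Hypothetical 10 bp motifs differing at a single position (not the real
# canine repeat sequences).
MOTIF_A = "ACGTTGCAAG"
MOTIF_B = "ACGTTGCAGG"

# Hypothetical unique flanking sequences on either side of the repeats.
LEFT_FLANK = "CCATTTTGGCATCTGGTTCT"
RIGHT_FLANK = "GGCCATAGCTGAGTCCTTAA"

# 15 alternations of the two motifs give 300 bp, the size of the repeat
# region in the reference genome (bases 16131 through 16430).
REPEAT_REGION = (MOTIF_A + MOTIF_B) * 15
TEMPLATE = LEFT_FLANK + REPEAT_REGION + RIGHT_FLANK


def count_sites(primer, template):
    """Count all (possibly overlapping) exact matches of primer in template."""
    count = 0
    start = template.find(primer)
    while start != -1:
        count += 1
        start = template.find(primer, start + 1)
    return count


primer_in_repeat = MOTIF_A + MOTIF_B   # 20-mer placed inside the repeats
primer_in_flank = LEFT_FLANK           # 20-mer placed outside the repeats

print(len(REPEAT_REGION))
print(count_sites(primer_in_repeat, TEMPLATE))
print(count_sites(primer_in_flank, TEMPLATE))
```

A primer inside the repeats has many equally good annealing sites, whereas a primer in the unique flanking sequence has exactly one, which is why primers are placed just outside the region.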

The option of not including the repeating region was considered, as the size differences may

mean that longer repeating regions would not be completely sequenced by the Illumina Paired

End sequencing method, and shorter ones would be sequenced to redundancy, but possibly

without the means to tell to what extent, as represented in Figure 5.

Figure 5: Schematic representation of the different possible results when sequencing the repeating region. Due to its varying size between individuals, coverage will vary, and due to its repeating nature, conclusive alignments cannot be guaranteed.

It was decided to attempt to sequence the repeat region to the highest extent possible, as full

coverage of the rest of the mtDNA appeared to be achievable with the remaining 31 primer

pairs, meaning that no information would be lost from trying to sequence the repeat region as

well. Including the repeat region as an amplified segment would also ensure that the bases

immediately preceding and following it would actually be included in the sequencing,

something that could otherwise not be achieved, as primers cannot reliably be aligned inside

the repeat region.


With this in mind, the primers were aligned, starting from the primer pair upstream of the

repeating region, placing the reverse primer as close to the repeating region as possible,

followed by the pair covering the repeating region, ensuring enough room after the repeating

region to align the forward primer of the next primer pair.

The primers were designed using the NCBI Primer BLAST tool [15]. The Canis familiaris

reference mitochondrion genome entry [13] was used as the template to which the primers

were to be aligned. The PCR product size was set to a maximum of 590 bases and a minimum

of 540 bases, to ensure coverage of the whole mtDNA sequence. Remaining parameters were

subject to dynamic modifications depending on the ease or, rather, difficulty with which

primers could be aligned. TM was desired to be between 52 and 60 degrees Celsius, with an

optimal temperature of 56 degrees. The allowed difference in TM between the primers in a

pair was initially set at 3 degrees, but was subject to increases in cases where primers could

otherwise not be aligned.

The initial advanced settings were for a primer size between 17 and 23 bases with 20 as an

optimum, a GC-clamp of 2, maximum poly-X sequences of 4, and maximum 3’ GC content

of 3. GC content was desired to be between 40 and 60% and due to the nature of the

mitochondrial DNA, ‘Avoid low complexity regions for primer selection’ was unchecked.
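The sequence-level settings above can be summarised as a simple filter. The following is a minimal sketch in Python and not the Primer-BLAST implementation. The function name `check_primer` is hypothetical, the GC clamp is assumed to mean the number of consecutive G/C bases required at the 3’ end, and the maximum 3’ GC content is assumed to mean the number of G/C bases allowed among the last five bases. Melting temperature is deliberately left out, as its value depends on the thermodynamic model used.

```python
from itertools import groupby


def check_primer(primer, min_len=17, max_len=23, gc_min=0.40, gc_max=0.60,
                 gc_clamp=2, max_poly_x=4, max_end_gc=3):
    """Return the list of violated constraints (an empty list means a pass)."""
    seq = primer.upper()
    if not seq:
        return ["empty sequence"]
    problems = []

    if not min_len <= len(seq) <= max_len:
        problems.append("length")

    gc = sum(base in "GC" for base in seq) / len(seq)
    if not gc_min <= gc <= gc_max:
        problems.append("GC content")

    # Longest run of one identical base (poly-X).
    longest_run = max(len(list(run)) for _, run in groupby(seq))
    if longest_run > max_poly_x:
        problems.append("poly-X")

    # Assumed meaning of GC clamp: the last `gc_clamp` bases are all G or C.
    if gc_clamp and any(base not in "GC" for base in seq[-gc_clamp:]):
        problems.append("GC clamp")

    # Assumed meaning of max 3' GC: number of G/C among the last five bases.
    if sum(base in "GC" for base in seq[-5:]) > max_end_gc:
        problems.append("3' GC content")

    return problems
```

A failed check returns the names of the violated settings, which mirrors how the parameters were relaxed one at a time when no primer could be found.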

Primers were then generated by specifying a stretch of approximately 50 bases within which

the forward primer was allowed to align. The starting point of the first stretch was dictated by

the end of the repeating region, while all subsequent alignment areas were instead dictated by

the location of the reverse primer in the previous primer pair, i.e. in relation to the leading

strand, each forward primer had to end before the reverse primer of the preceding pair

‘started’.
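The placement rule described above can be sketched as a check on leading-strand coordinates. The `PrimerPair` type, the function names and any coordinates passed to them are hypothetical and serve only as illustration. The sketch assumes that each pair is described by the first and last base of its forward primer and of its reverse primer site, and it ignores the circular nature of the mtDNA.

```python
from typing import NamedTuple


class PrimerPair(NamedTuple):
    fwd_start: int  # first base of the forward primer (leading strand)
    fwd_end: int    # last base of the forward primer
    rev_start: int  # first base of the reverse primer site (leading strand)
    rev_end: int    # last base of the reverse primer site


def product_size(pair):
    """Length of the PCR product, primers included."""
    return pair.rev_end - pair.fwd_start + 1


def tiling_problems(pairs, min_size=540, max_size=590):
    """List violations of the size limits and of the interlocking rule."""
    problems = []
    for number, pair in enumerate(pairs, start=1):
        size = product_size(pair)
        if not min_size <= size <= max_size:
            problems.append(f"pair {number}: product size {size}")
    for number, (previous, following) in enumerate(zip(pairs, pairs[1:]), start=1):
        # The following forward primer must end before the previous
        # reverse primer site starts, so that this site is read as insert.
        if following.fwd_end >= previous.rev_start:
            problems.append(f"pairs {number}/{number + 1}: primers do not interlock")
    return problems
```

With this rule, the bases under each reverse primer fall inside the insert of the next fragment, so no position is represented by primer sequence alone.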

Due to the structure of the mtDNA and the rigidity of where the next primers had to be

aligned, in relation to the preceding pairs, it was often difficult to align primers according to

the above-mentioned ‘optimal’ parameters, which necessitated that the conditions were made

less stringent, on a primer-by-primer basis. Initially, the stretch of bases allotted to the

alignment of the forward primer would be extended, in the hopes of finding a primer without

having to lower the other requirements placed on the primer. Failing this, as moving the

primer too far back would in the end compromise the possibility of covering the entire

mtDNA in the chosen number of primers, the remaining parameters were in turn made less

stringent. The decision on what parameter to change was aided by the error message given by

the Primer-BLAST tool upon failure to generate a primer pair, which would list the reasons


for the failure, e.g. TM difference too high, too long poly-X sequence, or lack of GC clamp.

Decisions were also made by manually inspecting the surrounding sequence, thereby

deciding whether or not certain changes were appropriate. For each primer pair, the changes to

the parameters that were deemed to cause the least impactful changes to the overall structure

of the primers were chosen.

To limit the use of template DNA, which is available in limited amounts, primers that would

amplify longer parts of the mtDNA were required. These would then be used as templates for

the aforementioned 32 primer pairs, also creating a sort of nested PCR [26], which reduces

the likelihood of generating unspecific PCR products. It also serves the purpose of generating

template that is in solution, not bound to FTA cards.

The 32 primer pairs will from here on be referred to as ‘Inner Primers’ and these new,

analogously dubbed ‘Outer Primers’ were aligned in much the same manner as the inner

primers, interlocking with each other, but also taking care not to overlap with the alignment

sequences of the inner primers.

Figure 6: Schematic representation of the interlocking design of the outer primers. Template DNA represented by the wide yellow line, the primers by blue and green arrows (alternating colours purely for visual clarity) and the amplified fragments in corresponding colours below the template.


Figure 7: Schematic representation of how the outer primers fully cover a set of four inner primers. Template DNA represented by the wide yellow line, the inner primers by red and orange arrows and outer primers by blue and green arrows (alternating colours purely for visual clarity) and the amplified fragments in corresponding colours below the template.

The outer primers were designed to cover four inner primers each, resulting in eight outer

primer pairs, each amplifying around 2200 bp long sequences. These were to serve as both a

way of amplifying the original template, which is available in limited amounts, and as a way

to create a nested PCR, reducing the complexity in subsequent PCR reactions.
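The relation between the 32 inner and the eight outer primer pairs can be written as a small lookup. This is a sketch under the assumption that the pairs are numbered consecutively along the mtDNA, which is consistent with later experiments in which inner primers 9-12 are amplified from outer primer 3. The function names are hypothetical.

```python
INNER_PER_OUTER = 4
N_INNER = 32
N_OUTER = N_INNER // INNER_PER_OUTER  # eight outer primer pairs


def outer_for_inner(inner):
    """Number of the outer primer pair whose product contains an inner pair."""
    if not 1 <= inner <= N_INNER:
        raise ValueError(f"inner primer number out of range: {inner}")
    return (inner - 1) // INNER_PER_OUTER + 1


def inners_for_outer(outer):
    """Numbers of the four inner primer pairs covered by an outer pair."""
    if not 1 <= outer <= N_OUTER:
        raise ValueError(f"outer primer number out of range: {outer}")
    first = (outer - 1) * INNER_PER_OUTER + 1
    return list(range(first, first + INNER_PER_OUTER))
```

For example, `inners_for_outer(6)` gives inner primers 21 through 24.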

As detailed previously, in order to utilise the Illumina Dual-Indexed Paired End sequencing

protocol, a number of additional specific sequences need to be present in the primers. The

basic Illumina sequencing relies on random fragmentation of sample DNA, followed by

ligation of specific adaptors to the fragments, which enable bridge amplification of the

fragments on the sample slide, as well as containing the primer alignment sequence for the

sequencing-by-synthesis steps.

As this project endeavours to sequence the entire mtDNA of thousands of individuals in

specific, predetermined PCR-amplified segments, this random fragmentation approach to

creating the DNA inserts to which adaptors are ligated is not appropriate, as it would require

separate libraries for each individual and involves increased labour and cost, as well as

removing the specificity of using primers to ensure full coverage. Instead, the Read Primer

parts of the adaptor sequences are added single-strandedly to the 5’ end of the forward and

reverse inner primers as handles, making these increase in size significantly. In order to

complete the sequencing-enabling structures for Dual-Indexed Paired End Sequencing, an

additional PCR step is required. This step will be used to introduce the outermost adaptor

sequences that allow attachment to the slides, P5 and P7, as well as the two index sequences, i5

and i7.


Figure 8: Schematic overview of the two PCR steps that complete the sequencing construct. The top step uses specific inner primers (shown in dark grey) with attached partial adaptor sequences containing read primer complementary sequences (shown in yellow and light blue). The second step adds the outer adaptor sequences (shown in red and dark blue) and the indices (shown in light and dark green) by completing the previously added adaptors.

The final construct, shown above in figure 8, consists, from left to right, of the P5 flow cell

attachment sequence, the i5 index barcode, the Read 1 Primer complementary region, the

forward insert specific primer, the DNA insert, the reverse insert specific primer, the i7 index

complementary region (which doubles as the Read 2 complementary region when read in the

other direction), the i7 index barcode, and the P7 flow cell attachment sequence. The

difference in structure between the default ligated adaptor and this PCR-generated construct

lies in the forward and reverse insert specific primers, which enable the sequencing of

specific, predetermined parts of the sample DNA, but from a sequencing standpoint, these are

merely treated as a part of the DNA insert, and do not influence the sequencing itself in any

way.
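The order of the segments in the final construct can be illustrated with a short sketch. The segment names and the function `assemble_construct` are hypothetical, the example values are placeholder labels rather than real Illumina or primer sequences, and strand orientation is ignored.

```python
# Segment order of the final construct, left to right, as listed in the text.
CONSTRUCT_ORDER = (
    "P5",
    "i5",
    "read1_site",
    "forward_primer",
    "insert",
    "reverse_primer",
    "index_read2_site",
    "i7",
    "P7",
)


def assemble_construct(segments):
    """Concatenate the named segments in construct order."""
    missing = [name for name in CONSTRUCT_ORDER if name not in segments]
    if missing:
        raise KeyError(f"missing segments: {missing}")
    return "".join(segments[name] for name in CONSTRUCT_ORDER)


# Placeholder labels only, to make the layout visible.
example = {name: f"[{name}]" for name in CONSTRUCT_ORDER}
layout = assemble_construct(example)
```

In this layout, only `i5` and `i7` differ between individuals, while the forward primer, insert and reverse primer differ between the 32 fragments.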

Using 32 inner fragments per individual, and sequencing 1152 individuals in parallel, given

the total read output of the MiSeq v3 sequencing kit [27], an equal distribution of reads

between the fragments would, in an ideal situation, provide a redundancy of 600 reads per

fragment and individual.
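The redundancy estimate is a simple division of the read total by the number of fragment and individual combinations. The sketch below uses a hypothetical helper, `reads_per_fragment`. The read total used for the estimate is not stated in the text; the value of 22 million reads is an assumption made here for illustration.

```python
def reads_per_fragment(total_reads, individuals=1152, fragments=32):
    """Mean reads per fragment and individual, assuming an even distribution."""
    return total_reads / (individuals * fragments)


ASSUMED_TOTAL_READS = 22_000_000  # assumption, not a figure from the thesis
combinations = 1152 * 32          # fragment and individual combinations
estimate = reads_per_fragment(ASSUMED_TOTAL_READS)
```

Under this assumption, the estimate comes out just under 600 reads per fragment and individual. Any unevenness between fragments lowers the coverage of the weakest fragments accordingly.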


PCR Reactions, Viability of Primers and Multiplex Set Ups

Initially, the first eight inner primers, from Eurofins, were tried in singleplex, 0.1 µl TagTaq,

30 cycles.

Figure 9: First attempt at amplifying the first 8 inner primers (0.1 µl TagTaq, 30 cycles) flanked by two DNA ladders (Low Range, 3% TopVisionAgarose #RO491 25-700 bp). Ladders are smeary and dissimilar, and exposure is high in an attempt to visualize potential product.

Bands were very weak and smeary. As this was true for the ladders too, as well as for other

gels run by others in the lab at the same point in time, part of the fault may, in this case, lie in

the gel bath itself.

Next, the same primers were tried again, resulting in a gel with much sharper bands but still

very weak product bands, showing only for primers 1-3 (Figure 10), but submitting the

remaining reaction mixtures to a subsequent extra 12 cycles showed more clear results

(Figure 11). The gel after the additional 12 cycles shows bands of the expected size for

primers 1-3, as well as multiple bands of lower sizes, presumed to be various primer

dimers. These results prompted the decision to increase the number of cycles for the inner

primers to 40. Low processivity of the enzyme was suspected.


Figure 10: Second attempt at amplifying the first 8 inner primers (0.1 µl TagTaq, 30 cycles), flanked by two DNA ladders (Low Range, 3% TopVisionAgarose #RO491 25-700 bp, and M, 1% TopVision LE GQ Agarose #RO491 250-10000 bp). Ladders are clearer but product is very weak, faintly visible for primers 1 through 3.

Figure 11: Second attempt on first 8 inner primers after 12 additional PCR cycles. Primers 1 through 3 are clearly visible. Samples flanked by two DNA ladders as before (Low Range, 3% TopVisionAgarose #RO491 25-700 bp, and M, 1% TopVision LE GQ Agarose #RO491 250-10000 bp). The two rightmost lanes before the M ladder are primers 1 and 2 from a different sample compared to the first eight lanes.

Reactions for the same eight primers were then run using twice the amount of enzyme, 0.2 µl,

for 40 cycles, and 10 times as much enzyme, 1 µl, but remaining at 30 cycles. The 0.2 µl, 40

cycle run showed product of the expected size for all primers except primer 8, while the 1.0

µl, 30 cycle run did not yield any product. Whether the latter was caused by a laboratory

mistake or a result of imbalances between the reaction components due to the increase in

enzyme concentration, or simply still too few amplification cycles was not further

investigated.


Figure 12: Inner primers 1 through 8 amplified with 0.2 µl TagTaq and 40 cycles, in duplicate, M ladder. All primers except primer 8 appear clearly.

Figure 13: Inner primers 1 through 8 amplified with 1.0 µl TagTaq and 30 cycles, in duplicate, M ladder. No primers visible, possibly due to human error.

Deciding to proceed with 40 cycles and 0.2 µl enzyme per reaction for inner primers in

singleplex, different annealing temperatures were investigated. Both 49°C and 52°C were

tried, using the now established parameters, both yielding product for all primers apart from

primer 8.

Figure 14: Inner primers 1 through 8 amplified with 0.2 µl TagTaq and 40 cycles, 49°C annealing temperature to the left and 52°C annealing temperature to the right, Low Range ladder. All primers except primer 8 appear clearly in both sets.


Inner primers were also tried in multiplex, initially in quadruplexes of primers 1-4 and 5-8, 50

cycles, yielding vague primer dimer products. At the same time, the eight outer primers were

run for the first time, using TagTaq, 50°C annealing temperature and 40 cycles, but no

product was obtained. Figure 15 below shows these results, using inner primer 1 as a positive

control.

Figure 15: Attempt at amplifying the 8 outer primers using 0.2 µl TagTaq, 50°C annealing temperature and 40 cycles, in duplicate, with inner primer 1 as positive control, M ladder. The two rightmost lanes contain inner primer multiplex attempts, primers 1-4 and 5-8, 50 cycles. No outer primer product visible, and only unspecific product visible for the inner primer multiplexes.

The outer primers and the multiplex attempts were retried using both 5 minutes and 10

minutes extension time for both, again failing to result in the desired products.

Figure 16: Attempt at amplifying the 8 outer primers using 0.2 µl TagTaq, 50°C annealing temperature and 40 cycles, using 5 minutes extension time (left) and 10 minutes (right), with inner primer 1 as positive control, M ladder. The 4 rightmost lanes contain inner primer multiplex attempts, primers 1-4 and 5-8, 50 cycles, in duplicate. Again, no outer primer product visible, and only unspecific product visible for the inner primer multiplexes.


In order to rule out human error, all outer primers were re-suspended from stock solution and

the PCR reactions were re-run at 40 cycles and 10 minutes extension time. As yet again no

product was obtained, it was suspected that TagTaq lacked the processivity required to

adequately amplify the longer outer primer fragments, and PlatinumTaq was tried instead,

using 45 cycles and 10 minutes extension time. This set up yielded clear product for all eight

outer primers. It was subsequently concluded that TagTaq did indeed lack the necessary

processivity to reliably produce the longer, outer primer fragments, and PlatinumTaq was

employed for all outer primer reactions from this point onwards.

Figure 17: Outer primers amplified with PlatinumTaq, 45 cycles, 10 minutes extension time, inner primer one as positive control, M ladder. All outer primers visible.

Subsequently, quadruplexes of the outer primers were set up, 1-4 and 5-8, using 0.2 µl

PlatinumTaq, 45 cycles, and 10 minutes extension time, and singleplexes of each of the eight

primers were run using 1 µl 1:500 dilutions from the corresponding quadruplexes as template

and were run for 15 cycles. These secondary singleplexes yielded product in outer primers 3,

4, 5, and 8.


In order to further investigate the possibility of quadruplexing the outer primers, quadruplexes

comprised of odd- and even-numbered outer primers, as well as a combination of outer

primers 1, 2, 6, and 7 and 3, 4, 5, and 8, were set up. As before, these multiplexes were verified by

singleplex reactions based on the multiplex product, run on gels. The former combinations

showed clear product for outer primers 3, 4, and 5, and weak bands for 7 and 8. The latter was

similar, and showed primers 3, 5, and 8 relatively clearly, and 4 weakly.

Figure 19: Singleplexes of outer primers, from 1 µl 1:500 dilutions of odd and even combination (i.e. 1-3-5-7 and 2-4-6-8) quadruplex template, 15 cycles, M ladder. Outer primers 3, 4, and 5 clearly visible, primers 7 and 8 very faintly, and 1, 2, and 6 seemingly not amplified.

Figure 20: Singleplexes of outer primers, from 1 µl 1:500 dilutions of 1-2-6-7 and 3-4-5-8 quadruplex template, 15 cycles, M ladder. Primers 3, 5, and 8 were visible, and primer 4 was faintly visible.

Figure 18: Singleplexes of outer primers, from 1 µl 1:500 dilutions of 1-4 and 5-8 quadruplex template, 15 cycles, M ladder. Outer primers 3, 4, 5, and 8 clearly visible, 1, 2, 6, and 7 seemingly not amplified.

These results led to the decision to try the outer primers in duplexes, one set up with primers

1+2, 3+4, 5+6, and 7+8, and one set up with primers 1+3, 2+4, 5+7, and 6+8. The duplex

reaction mixture was then used as templates for the corresponding singleplexes, using 1 µl

1:20 dilutions and run for 15 cycles. These set ups consistently yielded product for primers 3,

4, 5, and 8, similarly to the earlier quadruplexes, but primers 1 and 2 showed weak

amplification when paired together, as did primer 7 when paired with primer 5.

Figure 21: Singleplexes of outer primers from duplex set-ups (1+2, 3+4, 5+6, and 7+8), M ladder (one blank lane between the ladder and the first primer). Primers 1 through 5, and primer 8 visible, primers 3 through 5 more strongly.

Figure 22: Singleplexes of outer primers from duplex set-ups (1+3, 2+4, 5+7, and 6+8), M ladder (one blank lane between the ladder and the first primer). Primers 3 through 5, and primers 7 and 8 visible, primers 3, 5, and 8 more strongly, primer 7 very faint.

Outer primers 5 and 6, one that had consistently worked and one that did not appear to work,

were subsequently chosen for testing other parameters of the PCR. Singleplexes of primer 5

and 6 were run at 20 cycles and at 30 cycles, with 5 minutes and 10 minutes of extension

time, i.e. four different PCR set ups for each primer. A duplex of outer primers 5 and 6 was

also run at the same parameters. The gels showed 20 cycles to be too few to properly amplify

the segments, and the longer extension time appeared to increase yield. At 10 minutes

extension time, outer primer 5 amplified to a higher extent than primer 6. The singleplexes

performed from the duplexes showed amplification of only primer 5.


Figure 23: Parameter tests for outer primers, using outer primers 5 and 6. From left to right, 20 cycles with 5 minutes extension time, 20 cycles with 10 minutes extension time, 30 cycles with 5 minutes extension time, and 30 cycles with 10 minutes extension time for both primer 5 and 6. 20 cycles did not yield product at any extension time, and the longer extension time yielded more product, especially for primer 5.

Figure 24: Singleplex amplification of outer primers 5 and 6 from duplex template (image cropped from a larger gel containing other samples). Only the result of the 30 cycle runs is shown, ostensibly only yielding product for outer primer 5.

After the initial attempts at running the first eight inner primers, the remaining 24 inner

primers were ordered. Due to the issues with getting inner primer 8 to yield product, and

based on advice regarding primer purchase (personal communication with Afshin Ahmadian,

Associate Professor, School of Biotechnology, Royal Institute of Technology, KTH) the new

primers were ordered from Biolegio. Primer 8 was redesigned, and both the new and old

versions were ordered, along with primer 1, for comparison to the Eurofins primers, together

with inner primers 9-32.

Firstly, primer 1 from both Eurofins and Biolegio was run in triplicate, as well as both the

original and the new version of Primer 8, both from Biolegio and also in triplicate. The

reactions were run as before, at 46°C annealing temperature and for 40 cycles. The results


showed comparable results for both versions of inner primer 1 and indicate product from the

re-synthesis of the original inner primer 8, but not from the new version.

Figure 25: Comparison of inner primer 1 from Eurofins and from Biolegio, and of the old and new design of inner primer 8, both from Biolegio, in triplicate. Primer 1 worked comparably well from both manufacturers, and the old design of primer 8 from Biolegio appeared to work, while the new design did not.

Next, inner primers 9 through 32 were tested in duplicate, according to the same parameters.

Due to the unexpected result from the two versions of inner primer 8, these were also re-run and are

included on the gel showing inner primers 25 through 32. The re-run did support the previous

evidence in showing that the re-synthesis of the original primer 8 worked, while the redesign

did not. The majority of inner primers 9-32 showed product, and the ones that did not or

appeared only weakly were re-run.

Figure 26: Singleplexes of inner primers 9-16, in duplicate (one set after the other, with one empty lane in-between the sets), M ladder.


Figure 27: Singleplexes of inner primers 17-24, in duplicate (one set after the other, with one empty lane in-between the sets), M ladder.

Figure 28: Singleplexes of inner primers 25-32, in duplicate (one set after the other), M ladder. Additionally, to the left of the ladder, the old and new design of inner 8, again showing product from the old design.

The primers to be re-run in duplicate were 9, 16, 19, 20, 22, 23, 24 and 25. The results

obtained were largely inconclusive, being inconsistent between duplicates, at best showing

fairly weak bands, at worst appearing almost fully blank, and overall showing a lot of

unspecific product.

Figure 29: Singleplexes in duplicate of the inner primers between 9 and 32 that did not appear to yield product in the initial singleplexes. From left to right, in pairs, 9, 16, 19, 20, 22, 23, 24, and 25, M ladder.


Next, the viability of TagTaq compared to PlatinumTaq for the amplification of inner primers

was assessed, at the same time investigating how well the inner primers amplify from a

previously amplified outer primer segment, by setting up singleplexes of inner primers 9-12

using template from singleplex amplification of outer primer 3. The outer PCR was run at

50°C annealing temperature, 30 cycles, and 5 minutes extension time. 1 µl 1:20 dilution of the

outer primer product was used as template for the inner singleplexes. These were run at 46°C

annealing temperature, 40 cycles. All steps were performed in duplicate, i.e. two singleplex

reactions of outer primer 3 were used for duplicates of both the TagTaq and the PlatinumTaq,

totalling four singleplexes of each inner primer. All singleplexes were successful, with

PlatinumTaq showing much more strongly, and the two sets of TagTaq clearly differing in

intensity.

Figure 30: Duplicate sets of inner primers 9 through 12 in singleplex from previous amplification of outer primer 3, comparing TagTaq (left) to PlatinumTaq (right), M ladder. The PlatinumTaq amplified inner primers show more strongly, and there is a marked difference between the two TagTaq sets, despite having been amplified under the same conditions.

Similarly, quadruplexes of inner primers 9-12, one using TagTaq and one using PlatinumTaq,

were set up, still using the amplified outer primer 3, 1:20 dilution, as template. Reactions

were run at 46°C annealing temperature, 30 cycles. Secondary singleplexes for verification

were performed with PlatinumTaq for all reactions, on 1:20 dilutions of the quadruplex mixtures.

Results were similar between the two enzymes; inner primers 9, 11, and 12 were successfully

amplified, while primer 10 was very weak.


Figure 31: Singleplexes from quadruplexes of inner primers 9 through 12. All singleplex reactions were performed with PlatinumTaq, while one quadruplex was performed with TagTaq (left) and one with PlatinumTaq (right), M ladder.

Due to the apparent failure of certain outer primers in quadruplex reactions, outer primers

were re-suspended from stock and run in singleplex, as before, showing primers 3 through 8

clearly, while primer 2 was weaker and primer 1 was very faint. This was to investigate degradation

of the primers due to freeze-thaw cycles as the cause of the amplification failure.

Figure 32: Outer primers in singleplex, re-suspended from stock, M ladder.

Diluting all outer primer product (apart from primer 1, being significantly weaker) at a ratio

of 1:20, singleplexes of all inner primers were run from their corresponding outer primer, for

40 cycles, with 46°C annealing temperature. Results showed amplification of inner primers

corresponding to each of the outer primers, but not from all inner primers, even from inner

primers that had previously been successfully amplified.


Figure 33: Singleplexes of inner primers 1 through 16 from outer primer singleplexes 1 through 4, M ladder.

Figure 34: Singleplexes of inner primers 17 through 32 from outer primer singleplexes 5 through 8, M ladder.

The resuspended outer primers were then tried in quadruplex; outer primers 1, 3, 5, and 7 in

one quadruplex and outer primer 2, 4, 6, and 8 in the other. They were run in duplicates of

both 20 and 30 cycles, all using 50°C annealing temperature and 5 minutes extension time. 1:20 dilutions of

the quadruplex reaction mixtures were used for singleplex verification and were run for 15

cycles. The gels showed successful amplification of primers 3, 4, 5, and 8 in both the 20 cycle

and the 30 cycle quadruplexes, and in the latter, outer primer 7 was also visible.

Figure 35: Duplicates of outer primer singleplexes from outer primer quadruplexes (1+3+5+7 and 2+4+6+8, i.e. odd and even), M ladder. Quadruplex run for 20 cycles, primers 3, 4, 5, and 8 faintly visible.


Figure 36: Duplicates of outer primer singleplexes from outer primer quadruplexes (1+3+5+7 and 2+4+6+8, i.e. odd and even), M ladder. Quadruplex run for 30 cycles, primers 3, 4, 5, 7 and 8 visible, number 7 more faintly.

It was then decided to attempt singleplex inner primer amplification, using the 30 cycle outer

primer quadruplex reaction mixture as template, at 1:100 dilution. The singleplexes were run

for 40 cycles. Although product was only expected for inner primers corresponding to outer

primers 3, 4, 5, 7, and 8, gels showed amplification of inner primers from all outer primers,

including those that appeared not to have worked in quadruplex. Only two inner primers

appeared to not have yielded product.

Figure 37: Singleplexes of inner primers 1 through 16, from outer primer quadruplex template (even and odd outer primer combinations), M ladder. Most inner primers visible, irrespective of whether or not the corresponding outer primer appeared to have yielded product.

Figure 38: Singleplexes of inner primers 17 through 32, from outer primer quadruplex template (even and odd outer primer combinations), M ladder. Most inner primers visible, irrespective of whether or not the corresponding outer primer appeared to have yielded product.

To verify this unexpected result, the entire experiment was run again, starting from the

quadruplex of the outer primers, with similar results.


Figure 39: Singleplexes of inner primers 1 through 16, from re-run of outer primer quadruplex template (even and odd outer primer combinations), M ladder. Again, almost all inner primers are clearly visible.

Figure 40: Singleplexes of inner primers 17 through 32, from re-run of outer primer quadruplex template (even and odd outer primer combinations), M ladder. Again, almost all inner primers are clearly visible.

Barcoding PCR

Following these results, the decision was made to proceed towards sequencing. For two

different DNA samples, PCR1 was run in quadruplexes of odd and even numbered outer

primers, 40 cycles, and PCR2 was run in quadruplexes, duplexes and singleplexes according

to the pattern in figure 41, each for 25 cycles. A further four DNA samples were prepared in

the same manner, but for these, PCR2 was only run on quadruplexes.

Figure 41: Duplex and quadruplex set-ups for all 32 inner primers, with corresponding names (Q1-8 for the quadruplexes and D1-8 and D17-24 for the duplexes).


Products from PCR2 were pooled together, per individual and multiplexing set-up, creating

32-plexes of inner fragments, and 1:100 dilutions of these pools were used as template for the

barcode-introducing PCR3. This reaction was run for 15 cycles, at 58 °C annealing

temperature, and 5 minutes extension time. Both TagTaq and PlatinumTaq were employed,

according to Table 1 below.

Samples  Singleplex  Singleplex   Duplex  Duplex       Quadruplex  Quadruplex
         TagTaq      PlatinumTaq  TagTaq  PlatinumTaq  TagTaq      PlatinumTaq
IR119    X           X            X       X            X           X
IR126    X           X            X       X            X           X
IR85     -           -            -       -            X           -
IR92     -           -            -       -            X           -
IR114    -           -            -       -            X           -
IR127    -           -            -       -            X           -

Table 1: Table showing the different combinations of samples, enzymes, and multiplexing variants used in PCR3 for the introduction of the barcode indices.

Concentration Measurements and Cleaning

Concentration measurements were performed on all 32-plexes after PCR3, using the Qubit

dsDNA HS Assay Kit (Invitrogen, Life Technologies), followed by a cleaning step on an

MBS machine (Magnetic Bead Separation) in order to remove smaller fragments than those

meant for sequencing, such as loose primers and primer dimer constructs. This first

concentration measurement was performed both as a quick way of verifying product from

PCR3, and as a means of estimating the relative concentration of actual product when

compared to the clean samples. The Illumina CA Purification protocol [15] was used, diluting

20 µl of PCR3 product to 50 µl using elution buffer (EB). A concentration of 14% PEG was

used as precipitation buffer in order to achieve an appropriate size cut-off [17] [18]. The

parameters entered into the Magnatrix OS were 50 µl sample volume, 20 µl magnetic beads,

100 µl Precipitation Buffer, 25 µl EB, and 10 minutes binding time, resulting in input

volumes of 50 µl sample, 95 µl EB, 125 µl 14% PEG, 220 µl 80% EtOH, and 25 µl beads.

After MBS cleaning, a second Qubit concentration measurement was performed on all

samples, in order to estimate the actual product concentration, on which the pooling of

samples for the sequencing was subsequently based. The samples were also run on

BioAnalyzer (Agilent Technologies, 1000 kit) for a visual verification of the success of the


cleaning step. The desired products are expected to be in the range of 600-700 bp, due to the

base fragment being around 550 bp, to which large additional adaptor sequences have been

added.

Figure 42: Bioanalyzer results of all samples apart from the singleplex PlatinumTaq set ups, which were on a separate Bioanalyzer run. All samples are successfully cleaned, showing no short, unspecific products, and those run with PlatinumTaq clearly showing a peak at the expected size range of 600-700, while peaks are very small for those run with TagTaq.

Both assays indicated higher yields of specific product from the sample set-ups run with

PlatinumTaq than from those that were performed with TagTaq, and product above the

detection cut-off for all samples except one (Table 2). In the case of the four additional

individual samples, these were pooled into one sample prior to the second cleaning step.


Sample    Concentration before cleaning         Concentration after cleaning
          In assay [ng/ml]   In sample [µg/ml]  In assay [ng/ml]   In sample [µg/ml]
IR119ST   20.6               4.12               1.32               0.263
IR119SP   50.2               10.0               23.9               4.78
IR119DT   21.0               4.20               1.06               0.212
IR119DP   70.8               14.2               25.6               5.12
IR119QT   14.7               2.94               <0.5*              -
IR119QP   48.7               9.74               21.6               4.33
IR126ST   20.0               4.0                1.61               0.322
IR126SP   36.3               7.27               22.3               4.45
IR126DT   18.8               3.77               1.24               0.248
IR126DP   35.6               7.12               19.9               3.99
IR126QT   23.2               4.64               1.42               0.284
IR126QP   37.6               7.51               18.0               3.59
IR85QT    17.6               3.52               (pooled)
IR92QT    11.6               2.33               (pooled)
IR114QT   14.5               2.89               (pooled)
IR127QT   22.7               4.53               (pooled)
Pool of 4                                       1.15               0.230

Table 2: Overview of the amount of PCR product in the different samples before and after cleaning using the Illumina CA Purification Protocol on the MBS [17].
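The two concentration columns in Table 2 differ by a factor of approximately 200 (for example 20.6 ng/ml in the assay against 4.12 µg/ml in the sample). This would be consistent with 1 µl of sample measured in a 200 µl assay volume. These volumes are not stated in the text and are an assumption in the sketch below, as is the helper name.

```python
def sample_concentration_ug_per_ml(assay_ng_per_ml, sample_volume_ul=1.0,
                                   assay_volume_ul=200.0):
    """Convert a concentration read in the assay tube to that of the sample.

    The default volumes are assumptions that reproduce the factor of 200
    seen between the assay and sample columns of Table 2.
    """
    dilution_factor = assay_volume_ul / sample_volume_ul
    ng_per_ml_in_sample = assay_ng_per_ml * dilution_factor
    return ng_per_ml_in_sample / 1000.0  # ng/ml to µg/ml
```

Since 1 µg/ml equals 1 ng/µl, the sample values can be used directly when calculating pooling volumes.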

Initial Sequencing

All 32-plexes were then pooled for sequencing on the MiSeq, using the MiSeq Reagent Kit

V2, 300 cycles (Illumina), i.e. paired-end sequencing of 150 bases from each end. After

demultiplexing, the results, shown in abbreviation in Tables 3 and 4, were analysed, and table

5 provides a colour coding key, used to highlight the different magnitudes of reads. Results

showed generally lower numbers of reads than anticipated, and while not being completely

conclusive nevertheless showed clear trends in successful amplification and sequencing. The

main implications were that sample set-ups where PlatinumTaq had been used for PCR3 had

overall generated larger numbers of reads than those that had been performed with TagTaq,

and that the inner fragments corresponding to outer fragments 1, 2, and 6 (inner fragments 1

through 8, and 21 through 24) had yielded far fewer reads than the remaining ones. The latter

trend was particularly noticeable among the first 8 inner fragments, with only two out of 16

deviating from the pattern, while the pattern for fragments 21 through 24 was not as

pervasive.
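The grouping of inner fragments by outer fragment can be made explicit with a short calculation. The sketch below uses the read counts of the 119DP column of Table 3 and takes the median per group of four consecutive inner fragments. The helper name and the choice of the median as summary are assumptions made for illustration, not part of the original analysis.

```python
from statistics import median

# Reads per inner fragment 1-32 for sample set-up 119DP (Table 3).
READS_119DP = [
    1191, 2190, 172, 1565, 411, 159, 34767, 211,
    133270, 116761, 168542, 126426, 104270, 104642, 29198, 75913,
    59213, 105531, 136755, 126721, 1427, 2243, 734, 8859,
    4194, 8142, 6825, 1698, 101017, 130392, 71028, 66604,
]


def reads_by_outer(reads, inner_per_outer=4):
    """Group per-fragment read counts by the outer fragment covering them."""
    groups = {}
    for start in range(0, len(reads), inner_per_outer):
        outer = start // inner_per_outer + 1
        groups[outer] = reads[start:start + inner_per_outer]
    return groups


median_reads = {outer: median(counts)
                for outer, counts in reads_by_outer(READS_119DP).items()}
```

For this column, the medians for outer fragments 1, 2 and 6 are 1378, 311 and 1835 reads, against 129848 for outer fragment 3, which illustrates the trend described above.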


Sample    85      92      114     127     119DP   119DT   119QP   119QT   119SP   119ST
Fragment
1         392     332     1       10      1191    861     3763    691     1551    1382
2         4       1       3       1       2190    1892    18      8       360     182
3         21      1       3       1       172     101     27      8       711     900
4         0       6       5       4       1565    410     240     9       49258   3885
5         13      9       7       9       411     199     297     16      6943    2992
6         1       2       0       0       159     90      57      8       26382   2707
7         7       6       2       5       34767   6293    361     15      63126   6568
8         1       0       0       0       211     69      97      3       37699   1718
9         2125    1157    1146    1158    133270  12310   28138   1132    397     515
10        2136    1071    1538    1943    116761  31809   14731   1677    9387    13220
11        1683    1017    1725    1931    168542  17748   8367    351     38699   5969
12        2161    1215    266     5836    126426  24588   9050    629     57564   10430
13        967     406     90      531     104270  7243    51632   819     31637   2396
14        435     144     114     394     104642  6709    63538   1390    72882   3827
15        502     144     58      122     29198   8296    1073    335     74150   7058
16        192     72      13      3011    75913   9907    19411   973     82509   16184
17        2326    1409    2517    2885    59213   9038    270002  6629    706     1394
18        3349    2002    2754    2442    105531  10036   316883  9134    73157   5234
19        3322    1830    1564    1680    136755  12378   437807  11373   110441  7505
20        3386    1705    52      561     126721  9141    390608  8600    99596   5159
21        1477    612     13      130     1427    2868    4268    528     139331  10548
22        10      2       8       6       2243    993     3006    133     71586   12202
23        10      5       1       5       734     591     548     95      70299   6355
24        2       1       2       9       8859    6700    2139    107     40107   19496
25        424     193     260     231     4194    1639    19755   1307    73767   6168
26        395     199     42      42      8142    3634    25041   1888    155734  18440
27        188     86      128     72      6825    1478    33161   749     32533   3727
28        74      36      19      48      1698    574     6172    237     221553  12900
29        5145    2708    4085    4350    101017  9688    257360  7500    198695  19424
30        8141    4205    2521    4991    130392  19357   274886  9148    906     4091
31        2772    1551    1587    1965    71028   13996   179512  5725    41943   3070
32        4579    2926    743     753     66604   29799   228790  7520    78606   12442

Table 3: Table showing the number of reads obtained for each fragment from the additionally prepared TagTaq quadruplexes 85, 92, 114, 127, and all six TagTaq and PlatinumTaq set ups for sample 119. The colour of the background, in a scale from red to yellow to green, serves to highlight the differences in the number of reads obtained for each fragment, with one colour for each degree of magnitude, with the exception of dark red, signifying a frequency of 0, detailed in table 5.


Sample     126DP   126DT   126QP   126QT   126SP   126ST
Fragment
1             74      87       8     200     497     447
2             76      35      15      63      69      96
3             34      16      19      71     401     568
4           5336     651     984     496     191     202
5            120     152     183     265     111      22
6             87      18      53      67     435     479
7          56622    3217     855     154    3583    4222
8            271      70     109      62   20365    1590
9         121469   12941    5479    3597     395     161
10        199811   24356   10993    9149     483     114
11        147953   23749   13682    7657   81658   10882
12        311537   59062   71196   31221   82312   11919
13        245150   20750   40067   14957   35829    2558
14        152221   13621   64102   10398  106776    8142
15         68266   12994    1838   18021   80154    6856
16        124980   11135   87470    8458   80434   16867
17           471    4529  226507   34521   58965    5959
18        133148   14492  255859   45269   19840    2061
19          3427    1402  183958   28863  100110    9264
20          1083     181   85691    9098   96191   10227
21          2996     970    4545   11022  167713   16844
22          1451     510    2753    2892  134284   19105
23           381     561     188    1925   32671    4575
24         23872    2099   14601    1511   69398   12497
25           372     694    1991    7647   43532    7101
26         10983    1363   19320    3233  129972   15561
27          1008     943   75301   16132  132979    9330
28           832     285   38164   12357  117058   10674
29        157249   19281  223762   46346  250894   30530
30        132277   26932  258884   78112  118590   12603
31        128804   27254  212690   39436   86354    6759
32         22988   12172   13393   33537   94675   39913

Table 4: Table showing the number of reads obtained for each fragment from all six TagTaq and PlatinumTaq set ups for sample 126. The colour of the background, in a scale from red to yellow to green, serves to highlight the differences in the number of reads obtained for each fragment, with one colour for each degree of magnitude, detailed in table 5.

Colour Coding Key

Number of reads

n = 0
1 ≤ n < 10
10 ≤ n < 100
100 ≤ n < 1,000
1,000 ≤ n < 10,000
10,000 ≤ n < 100,000
100,000 ≤ n < 1,000,000

Table 5: Table showing the colours used to label the reads, and their corresponding number of reads
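The colour key amounts to sorting each read count by its order of magnitude, with zero as a separate class. A minimal sketch of that classification is given below; the function name and the text labels are hypothetical, and it is assumed that an exact power of ten (such as 10 or 1000) belongs to the higher class.

```python
def magnitude_class(n: int) -> str:
    """Return a text label for the order-of-magnitude class of a read count.

    Assumption: exact powers of ten fall in the higher class (10 -> 10^1 class).
    """
    if n < 0:
        raise ValueError("read counts cannot be negative")
    if n == 0:
        return "n = 0"
    # For a positive integer, digits minus one equals floor(log10(n)).
    exponent = len(str(n)) - 1
    return f"10^{exponent} <= n < 10^{exponent + 1}"


if __name__ == "__main__":
    # A few read counts taken from tables 3 and 4.
    for count in (0, 9, 10, 861, 437807):
        print(count, "->", magnitude_class(count))
```

Working on the number of digits rather than a floating-point logarithm avoids rounding problems at the class boundaries.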


Results Obtained Post-Project

Subsequent PCR and MiSeq runs, performed by the research group during the writing of the

report, have yielded additional insight into the workings of the primers, PCR set ups, and

sequencing. After modifying the relative concentrations of both inner and outer primers, a

highly modified set up of primer concentrations, seen in tables 6 and 7, was found to result in

a very even level of reads in sequencing, seen in table 8, with a maximum 6.1-fold difference

between the fragments with the highest and lowest number of reads, counting the two high

outliers (4.2 if discounting the outliers).

Outer Primer   Concentration in PCR1 [µM]
1              0.5
2              0.4
3              0.08
4              0.04
5              0.04
6              0.4
7              0.15
8              0.035

Table 6: Table showing the modified concentrations of the outer primers. Rather than equal concentrations for each pair of primers, the concentrations now vary significantly.

Concentrations of Primers in Modified Inner Quadruplexes

Q1    c [µM]    Q2    c [µM]    Q3    c [µM]    Q4    c [µM]
1     0.07      2     0.1       3     0.2       4     0.15
9     0.1       10    0.1       11    0.15      12    0.08
17    0.1       18    0.1       19    0.18      20    0.1
25    0.07      26    0.12      27    0.07      28    0.15

Q5*   c [µM]    Q6    c [µM]    Q7*   c [µM]    Q8    c [µM]
7     0.1       6     0.25      5     0.4       8     0.2
13    0.3       14    0.5       15    0.8       16    0.25
21    0.2       22    0.3       23    0.6       24    0.15
29    0.1       30    0.35      32    0.3       31    0.2

Table 7: Table showing the modified quadruplexes and related concentrations of the inner primers. Rather than equal concentrations for each pair of primers, the concentrations vary significantly, and the primers included in quadruplexes 5 and 7 have been changed: inner primers 7 and 31 have been moved to Q5* and primer 5 to Q7*, making them a quintuplex and a triplex, respectively.


Sample C1V2.7

Fragment   % of reads   Reads
1          2.32 %       24103
2          1.39 %       14484
3          3.85 %       40023
4          5.32 %       55317
5          2.54 %       26366
6          4.03 %       41859
7          2.44 %       25403
8          1.08 %       11207
9          3.66 %       37995
10         4.68 %       48609
11         4.67 %       48503
12         1.92 %       19993
13         1.87 %       19398
14         1.43 %       14910
15         1.34 %       13876
16         2.06 %       21401
17         3.46 %       35993
18         4.50 %       46735
19         4.37 %       45458
20         6.58 %       68381
21         3.15 %       32769
22         1.55 %       16121
23         1.74 %       18135
24         2.92 %       30357
25         2.19 %       22806
26         4.40 %       45703
27         3.61 %       37564
28         4.11 %       42677
29         4.19 %       43558
30         2.90 %       30190
31         2.89 %       30011
32         2.83 %       29465

Table 8: Table showing the result of a MiSeq sequencing run on samples generated using the modified concentrations shown in tables 6 and 7. As before, the colour key in table 5 was used to label the numbers of reads, and the percentage of total reads is here graded in blue for clarity.
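The evenness of the run can be recomputed directly from the read counts in table 8. The sketch below is only an illustration: the function name and the `drop_highest` argument are hypothetical conveniences, and which fragments should count as outliers is a judgment call that the code does not settle.

```python
# Reads per fragment 1..32 for sample C1V2.7, copied from table 8.
READS = [24103, 14484, 40023, 55317, 26366, 41859, 25403, 11207,
         37995, 48609, 48503, 19993, 19398, 14910, 13876, 21401,
         35993, 46735, 45458, 68381, 32769, 16121, 18135, 30357,
         22806, 45703, 37564, 42677, 43558, 30190, 30011, 29465]


def fold_difference(counts, drop_highest=0):
    """Ratio between the highest and lowest read count.

    drop_highest removes that many of the largest values first
    (hypothetical helper for excluding high outliers).
    Assumes the lowest count is not zero.
    """
    ordered = sorted(counts)
    if drop_highest:
        ordered = ordered[:-drop_highest]
    return ordered[-1] / ordered[0]


def percentage_of_total(counts):
    """Share of all reads held by each fragment, in percent."""
    total = sum(counts)
    return [100 * c / total for c in counts]


if __name__ == "__main__":
    print(f"max/min fold difference: {fold_difference(READS):.1f}")
    shares = percentage_of_total(READS)
    print(f"fragment 20 share: {shares[19]:.2f} %")
```

The ratio of the extremes is a deliberately simple summary; it is sensitive to single fragments, which is why the text reports it both with and without the high outliers.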

Discussion

Primer Layout and Design

The first part of the project, and one of the more challenging, was the initial primer design. The lack of previous experience in primer design, together with the constraints imposed by the template DNA itself, meant that the process took significant time, effort, and compromise to finalise. The first attempt at aligning 32 pairs of primers so that they covered the entirety of the 16727 bp mtDNA failed to reach all the way around, and thus the second attempt required further compromises and careful choices between possible primer pairs at every step of the way. That is not to say that the final set of primers is necessarily inferior in quality; rather, different choices early on in the primer selection process, influenced by knowledge of the shortcomings of the previous attempt, allowed for a tighter alignment of primer pairs, thereby covering more ground. It was quite evident, when performing the second primer alignment, that actively choosing a few of the longer-reaching primer pairs early on effectively shifted the entire alignment into a more favourable frame, where most forward inner primers could be placed closer to the preceding reverse inner primer than in the previous attempt. This ultimately led to more freedom in aligning problematic primer pairs, because more pairs overall met or exceeded the average required length.
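Whether a set of amplicons reaches all the way around a circular genome can be checked mechanically. The following is a minimal sketch under stated assumptions: the amplicon coordinates are generated evenly and are purely hypothetical (they are not the positions of the primers designed in this project), and both function names are invented for the illustration. Only the genome length of 16727 bp and the number of fragments, 32, come from the text.

```python
GENOME_LENGTH = 16727  # bp, dog mtDNA length used in the text
N_FRAGMENTS = 32


def uncovered_gaps(amplicons, genome_length):
    """Return uncovered stretches as (start, end), 1-based and inclusive.

    Each amplicon is (start, end); start > end means it wraps past the origin.
    A gap that itself spans the origin is reported as two pieces.
    """
    linear = []
    for start, end in amplicons:
        if start <= end:
            linear.append((start, end))
        else:  # split a wrapping amplicon into two linear intervals
            linear.append((start, genome_length))
            linear.append((1, end))
    linear.sort()
    gaps = []
    covered_to = 0  # last position known to be covered
    for start, end in linear:
        if start > covered_to + 1:
            gaps.append((covered_to + 1, start - 1))
        covered_to = max(covered_to, end)
    if covered_to < genome_length:
        gaps.append((covered_to + 1, genome_length))
    return gaps


def evenly_spaced(n, amplicon_length, genome_length):
    """Hypothetical layout: n amplicons of equal length at even spacing."""
    step = genome_length / n
    layout = []
    for i in range(n):
        start = int(i * step) + 1
        end = start + amplicon_length - 1
        layout.append((start, (end - 1) % genome_length + 1))
    return layout


if __name__ == "__main__":
    print(f"average required length: {GENOME_LENGTH / N_FRAGMENTS:.1f} bp")
    for length in (500, 560):
        gaps = uncovered_gaps(evenly_spaced(N_FRAGMENTS, length, GENOME_LENGTH),
                              GENOME_LENGTH)
        print(f"{length} bp amplicons leave {len(gaps)} gap(s)")
```

In a real layout the amplicon lengths differ, so pairs that exceed the average length create slack that shorter, more constrained pairs can use, which is the effect described above.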

Most inner primer pairs have been shown to amplify their intended target at the expected size, with a few significant exceptions. The originally synthesised primer pair 8 failed, and the product of primer pair 2 occasionally appeared much lower on the gels than expected, suggesting that a shorter sequence than intended had been amplified. The latter is, however, to be expected, as inner primer pair 2 covers the repeating region, which varies in size between individuals. Apart from these specific events, primers often produced shorter non-specific products, suspected to be primer dimers, sometimes in worryingly large quantities in comparison to the specific product, and there was significant inconsistency in the successful amplification of individual sequences. Certain primer pairs were more prone to this latter behaviour than others, but in the end there was evidence that each primer pair had functioned, although never all 32 on a single occasion. The suspected primer dimer products were also very effectively removed by the cleaning step employed later in the process.

The 32 inner primer set up does seem to be successful, albeit after several modifications to

relative concentrations.

There is some concern regarding the amplification of the control region, a very informative region of the mtDNA that contains a large number of single nucleotide polymorphisms. It would be preferable to cover this region in one single amplified segment, but currently it spans the last third of the segment amplified by primer pair 32, through segments 1 and 2, ending at the very beginning of segment 3. It is, however, not possible to fully cover the control region in a single amplified segment using the chosen sequencing platform, as it measures 1270 bp in the reference genome [19], which is far beyond the reach of the 600 bp covered by the Illumina Paired End Sequencing protocol. Even the hypervariable region 1 (HV1) alone, which contains the majority of these informative sites and measures 582 bp in itself [20], would have exceeded the upper limit of 590 bp once primers for its specific amplification had been added.

PCR Procedures

Previous experience with setting up PCR reactions, particularly at this scale, was also quite limited, although not to the same extent as for primer design. There was significant trial and error involved in finding working protocols, and especially early on, it was harder to distinguish human error from sub-optimal protocols when results were poor. As the project proceeded, however, more experience was obtained, both theoretically and practically, and it became easier to both perform and evaluate the PCR runs. At the close of the laboratory part of the project, many of the protocols were in all likelihood less than optimal, and can probably be improved upon at a later date.

As it stood then, quadruplexes of the outer primers appeared to be working, as evidenced by the subsequent successful singleplex amplification of inner primers from the diluted quadruplex reaction mixture, despite the fact that subsequent singleplexes of the outer primers themselves had ostensibly shown this not to be the case. These odd results were later elucidated by experiments performed post-project, which revealed that highly unequal concentrations of both inner and outer primers were required to ultimately yield even numbers of reads in sequencing.

Outer primers, when run in singleplex from earlier quadruplexes using equal concentrations, almost invariably yielded product only for primers 3, 4, 5, and 8. In light of the concentrations established by later experiments, where these previously successful outer primers were used at concentrations of around a tenth of those of the previously unsuccessful ones, these original results hardly seem surprising. Given the exponential nature of PCR amplification, this difference in concentrations is highly significant. The same pattern can be seen in the inner primers.
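The effect of exponential amplification can be illustrated with the idealised model in which the number of copies after c cycles is the starting number multiplied by (1 + E)^c, where E is the per-cycle efficiency. The sketch below uses hypothetical efficiencies and cycle numbers, since no efficiencies were measured in this project, and it ignores plateau effects and primer depletion.

```python
def product_after(cycles, efficiency, start_copies=1.0):
    """Idealised PCR yield: start_copies * (1 + efficiency) ** cycles.

    efficiency is the fraction of templates copied per cycle (1.0 = doubling).
    """
    return start_copies * (1.0 + efficiency) ** cycles


def yield_ratio(cycles, efficiency_a, efficiency_b):
    """How much more product A than product B, from equal starting template."""
    return product_after(cycles, efficiency_a) / product_after(cycles, efficiency_b)


if __name__ == "__main__":
    # Hypothetical efficiencies: a modest per-cycle difference compounds quickly.
    for cycles in (10, 20, 30):
        ratio = yield_ratio(cycles, 0.9, 0.8)
        print(f"{cycles} cycles: product ratio {ratio:.1f}")
```

Under these assumptions, a per-cycle efficiency of 0.9 against 0.8 already gives roughly a five-fold difference in product after 30 cycles, which shows how a small advantage for some primer pairs in a multiplex can dominate the final result.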


Indexing, Cleaning, and Sequencing

The incorporation of index primers into the sequencing construct seems to be overall

successful, but appears to vary significantly based primarily on the enzyme employed. This is

evident from the fact that the concentration of product post-purification, i.e. the specific

product in the correct size range, is significantly higher in the reactions that used PlatinumTaq

than the ones that used TagTaq.

The cleaning itself also appears successful, judging by the difference in the amount of product

pre- and post-cleaning, and the lack of low length sequences present post-cleaning. The

Bioanalyzer corroborates these results, showing samples clear of low length, unspecific

products, primer dimers, or loose primers, while showing peaks in the expected size range for

the desired, complete sequencing constructs.

The first attempt at sequencing was moderately successful, showing reads from most of the 32 different fragments for all sequenced individuals, while also displaying indications of the expected issues with this original 32-fragment approach. The fragments were far from equally represented, and there were also clear indications that the relative success of the outer primer pair from whose product an individual inner fragment is amplified heavily influences the number of reads obtained for that inner fragment. The latter conclusion is derived from the pattern of low-yielding inner fragments, which occur in groups of four that correspond to the outer fragments, but not to the physical inner quadruplexes in which they were amplified. The samples that had the indices incorporated with PlatinumTaq had, collectively, a higher number of reads than those that used TagTaq, but generally showed the same patterns and variation between inner fragments.
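The grouping argument can be made concrete by summarising the same read counts in two ways, once by outer fragment and once by inner quadruplex. The sketch below uses the reads for sample 85 from table 3. Two points are assumptions: that inner fragments 1 to 4 derive from outer fragment 1, 5 to 8 from outer fragment 2, and so on, which matches the groups named in the results; and that the original quadruplexes were composed as Q1 = 1, 9, 17, 25 and onwards, which is inferred from the layout of table 7. The function names are hypothetical.

```python
from statistics import median

# Reads per inner fragment 1..32 for sample 85, copied from table 3.
READS_85 = [392, 4, 21, 0, 13, 1, 7, 1,
            2125, 2136, 1683, 2161, 967, 435, 502, 192,
            2326, 3349, 3322, 3386, 1477, 10, 10, 2,
            424, 395, 188, 74, 5145, 8141, 2772, 4579]


def outer_fragment(inner):
    """Outer fragment (1..8) assumed to contain inner fragment `inner` (1..32)."""
    return (inner - 1) // 4 + 1


def inner_quadruplex(inner):
    """Assumed original quadruplex (1..8): Q1 = 1, 9, 17, 25 and so on."""
    return (inner - 1) % 8 + 1


def median_by_group(reads, group_of):
    """Median read count per group, with groups given by group_of(fragment)."""
    groups = {}
    for inner, count in enumerate(reads, start=1):
        groups.setdefault(group_of(inner), []).append(count)
    return {group: median(values) for group, values in sorted(groups.items())}


if __name__ == "__main__":
    by_outer = median_by_group(READS_85, outer_fragment)
    by_quadruplex = median_by_group(READS_85, inner_quadruplex)
    print("median reads per outer fragment:  ", by_outer)
    print("median reads per inner quadruplex:", by_quadruplex)
```

For this sample the medians differ far more between outer fragments than between inner quadruplexes, with outer fragments 1, 2, and 6 lowest, which is the pattern the conclusion above rests on.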

Using the results obtained post-project, it is clearly evident that mainly the outer primers were

very uneven in their comparative efficiency, but that drastically changing their relative

concentrations to compensate for this resulted in very favourable results.

Conclusions

While the initial sequencing run was not by any means perfect, the method as a whole shows promise. All outer primers appear functional, as do all inner primers, although some variation has been observed between individual experiments. All individual inner primers have been shown to yield product in different experiments, but never all at the same time, at least not to such a degree as to be readily visible on agarose gels. While the outer primers could not be proven functional by re-amplifying them in singleplex from the outer primer quadruplexes themselves, subsequent successful inner primer amplification from outer primer quadruplexes shows that all outer primers must have amplified enough to provide adequate template for the inner primers. The cleaning on the MBS using the Illumina CA Purification works well and does not seem to need any modifications, and the incorporation of indices works well, especially when using PlatinumTaq. The sequencing itself indicates that the amount of indexed sequencing constructs provided by the TagTaq enzyme was in fact enough to provide a substantial number of reads, comparable to those provided by PlatinumTaq, although the latter were generally greater by one order of magnitude.

The challenges yet to be overcome after the conclusion of the practical part of the project

were primarily those of levelling out the highly varying read numbers between the different

fragments. Since these appeared to correlate significantly with the outer primers from which

they were amplified, a solution seemed to be to alter the relative concentrations of the outer

primers in the initial outer quadruplexes, or in some other manner manipulate the relative

abundance of the outer primer products. This, in conjunction with slight adjustments to the

compositions and the relative concentrations of the primers in the inner quadruplexes, has

now, as described in the post-project results, been shown to drastically improve the ratio

between the number of reads obtained for the 32 different fragments. Based on these results,

the method appears very promising.


References

1. Savolainen P., Zhang Y-P., Luo J., Lundeberg J., and Leitner T. (2002)

Genetic Evidence for an East Asian Origin of Domestic Dogs

SCIENCE Vol. 298:1610-1613

2. Ding Z-L., Oskarsson M., Ardalan A., Angleby H., Dahlgren L-G., Tepeli C.,

Kirkness E., Savolainen P., and Zhang Y-P. (2011)

Origins of domestic dog in Southern East Asia is supported by analysis of Y-chromosome DNA

Heredity (2011), 1–8

3. Pang J-F., Kluetsch C., Zou X-J., Zhang A-B., Luo L-Y., Angleby H., Ardalan A.,

Ekström C., Sköllermo A., Lundeberg J., Matsumura S., Leitner T., Zhang Y-P., and

Savolainen P. (2009)

mtDNA Data Indicate a Single Origin for Dogs South of Yangtze River, Less Than

16,300 Years Ago, from Numerous Wolves

Mol. Biol. Evol. 26(12): 2849–2864

4. van Asch B., Zhang A-B., Oskarsson M. C. R., Klütsch C. F. C., Amorim A., and


Pre-Columbian origins of Native American dog breeds, with only limited replacement

by European dogs, confirmed by mtDNA analysis

Proc R Soc B 280: 20131142

5. Oskarsson M. C. R., Klütsch C. F. C., Boonyaprakob U., Wilton A., Tanabe Y., and


Mitochondrial DNA data indicate an introduction through Mainland Southeast Asia

for Australian dingoes and Polynesian domestic dogs

Proc. R. Soc. B

DOI: 10.1098/rspb.2011.1395

6. Shapiro B., Cui P., Schuenemann V. J., Sawyer S. K., Greenfield D. L., Germonpré

M. B., Sablin M. V., López-Giráldez F., Domingo-Roura X., Napierala H., Uerpmann

H-P., Loponte D. M., Acosta A. A., Giemsch L., Schmitz R. W., Worthington B.,

Buikstra J. E., Druzhkova A., Graphodatsky A. S., Ovodov N. D., Wahlberg N.,

Freedman A. H., Schweizer R. M., Koepfli K-P., Leonard J. A., Meyer M., Krause J.,

Pääbo S., Green R. E., Wayne R. K. (2013)


Complete Mitochondrial Genomes of Ancient Canids Suggest a European Origin of

Domestic Dogs

SCIENCE Vol. 342:871-874

7. von Holdt B. M., Pollinger J. P., Lohmueller K. E., Han E., Parker H. G., Quignon P.,

Degenhardt J. D., Boyko A. R., Earl D. A., Auton A., Reynolds A., Bryc K., Brisbin

A., Knowles J. C., Mosher D. S., Spady T. C., Elkahloun A., Geffen E., Pilot M.,

Jedrzejewski W., Greco C, Randi E., Bannasch D., Wilton A., Shearman J., Musiani

M., Cargill M., Jones P. G., Qian Z., Huang W., Ding Z-L, Zhang Y-P., Bustamante

C. D., Ostrander E. A., Novembre J., and Wayne R. K. (2010)

Genome-wide SNP and haplotype analyses reveal a rich history underlying dog

domestication

Nature 464, 898-902

8. Ringo J. (2004)

Fundamental Genetics

9. DNA Sequencing with Solexa® Technology (2007)

Illumina, Pub. No. 770-2007-002 01May07

10. Illumina Sequencing Technology Highest data accuracy, simple workflow, and a

broad range of applications (2010)

Illumina, Pub. No. 770-2007-002 Current as of 11 October 2010

11. Sequencing Dual-Indexed Libraries on the HiSeq® System User Guide

ILLUMINA PROPRIETARY Part # 15032071 Rev. B July 2012

12. Fuller, C. W., Middendorf L. R., Benner S. A., Church G. M., Harris T., Huang X.,

Jovanovich S. B., Nelson J. R., Schloss J. A., Schwartz D. C., and Vezenov D. V.

(2009)

The challenges of sequencing by synthesis

Nature Biotechnology Vol. 27, No. 11: 1013-1023

13. GenBank: U96639.2, Mitochondrial reference genome of the domestic canine on the

NCBI Database

http://www.ncbi.nlm.nih.gov/nuccore/U96639

14. Savolainen P., Arvestad L., and Lundeberg J. (2000)

mtDNA Tandem Repeats in Domestic Dogs and Wolves: Mutation Mechanism

Studied by Analysis of the Sequence of Imperfect Repeats

Mol. Biol. Evol. 17(4): 474–488

15. Primer BLAST primer alignment tool on the NCBI database


http://www.ncbi.nlm.nih.gov/tools/primer-blast/

16. Invitrogen information sheet on Platinum Taq

https://www.lifetechnologies.com/content/dam/LifeTech/migration/files/pcr/pdfs.par.2

6652.file.dat/platinumtaq-pps.pdf

17. Protocol for Illumina CA Purification

https://github.com/EnvGen/LabProtocols/blob/master/CA_cleaning.pdf

18. Lundin S., Stranneheim H., Pettersson E., Klevebring D., Lundeberg J. (2010)

Increased Throughput by Parallelization of Library Preparation for Massive

Sequencing.

PLoS ONE 5(4): e10029.

DOI:10.1371/journal.pone.0010029

19. Gundry R. L., Allard M. W., Moretti T. R., Honeycutt R. L., Wilson M. R., Monson

K. L., and Foran D. R. (2007)

Mitochondrial DNA Analysis of the Domestic Dog: Control Region Variation Within

and Among Breeds

DOI: 10.1111/j.1556-4029.2007.00425.x

20. Imes D. L., Wictum E. J., Allard M. W., Sacks B. N. (2012)

Identification of single nucleotide polymorphisms within the mtDNA genome of the

domestic dog to discriminate individuals with common HVI haplotypes

DOI: 10.1016/j.fsigen.2012.02.004

21. Natanaelsson C., Oskarsson MC., Angleby H., Lundeberg J., Kirkness E., Savolainen

P. (2006).

Dog Y chromosomal DNA sequence: identification, sequencing and SNP discovery.

BMC Genet 7: 45.

22. AlbaNova University Center

School of Biotechnology of the Royal Institute of Technology (KTH)

23. Meyer M., Stenzel U., Myles S., Prüfer K., and Hofreiter M. (2007)

Targeted high-throughput sequencing of tagged nucleic acid samples

DOI: 10.1093/nar/gkm566

24. Gunnarsdóttir E. D., Li M., Bauchet M., Finstermeier K., and Stoneking M. (2011)

High-throughput sequencing of complete human mtDNA genomes from the

Philippines

DOI: 10.1101/gr.107615.110


25. Maricic T., Whitten M., and Pääbo S. (2010)

Multiplexed DNA Sequence Capture of Mitochondrial Genomes Using PCR Products

DOI: 10.1371/journal.pone.0014004

26. Improved quantitative PCR using nested primers.

Haff L.A.

Genome Res. 1994 3: 332-337

27. Illumina MiSeq Specifications

http://www.illumina.com/systems/miseq/performance_specifications.html
