molecular marker and its application to genome mapping and molecular breeding
Post on 11-May-2015
Molecular Marker and Its Application to
Genome Mapping and Molecular Breeding
Binying Fu
Institute of Crop Sciences
The Chinese Academy of Agricultural Sciences
Beijing 100081, China
Nov-14-2012
Definition of Biological Marker
• Biological markers can be anything that distinguishes one individual or population from another
• Can be phenotypic
  Color: yellow vs white, etc.
  Texture: smooth vs rough, etc.
  Shape: round vs irregular, etc.
• Can be a biochemical or genetic difference
Phenotypic Markers
Weakness: unstable, and limited in number and polymorphism
http://cgil.uoguelph.ca/QTL/Fig2_3.htm
Cytological Marker
Any distinct and heritable feature of chromosome structure that
can be used to follow (usually by microscopy) that chromosome
or chromosome region in breeding experiments.
Weakness: side effects, and the need for special techniques
Biochemical Marker-Isozyme and Protein
Weakness: limited number, spatio-temporal expression, and the need for special techniques such as starch gel electrophoresis with special staining
Characteristics of Ideal Markers
• Polymorphism
• Stability, no influences from the environment
• Wide dispersion through the genome
• Simplicity of observation
• Low cost
• Mendelian heritability
• Co-dominance
• Reproducibility
• Portability between species
Molecular Markers
Definition: A molecular selection technique of DNA signposts which allows the identification of differences in the nucleotide sequences of the DNA in different individuals. Alternatively, any genetic element (locus, allele, DNA sequence or chromosome feature) that can be readily detected by phenotypic, cytological or molecular techniques and used to follow a chromosome or chromosomal segment during genetic analysis (also called a DNA marker).
Agriculture: a tool that allows crop geneticists and breeders to locate on a plant chromosome the genes for a trait of interest. It is considered more efficient than conventional breeding because it has the potential to greatly reduce development times and substitutes laboratory selection for much of the fieldwork. MAS or MDB!
Molecular, or DNA-based, markers have become increasingly important in plant breeding because of their features: phenotypic stability (not affected by environment), useful polymorphism, and ease of development.
Where does the molecular marker come from?
Mutation = heritable (at the cell level) change in DNA
sequence, regardless of whether the change produces any
detectable effect on a gene product. Mutations are the source
of new variation (polymorphism) upon which natural selection
works. Inherited mutations that are dispersed through a
population can become polymorphisms.
Polymorphism = presence in the same population of two or
more alternative forms of a DNA sequence, with the most
common allele having a frequency of 99% or less. Any two
individuals have a polymorphic difference every 1,000-10,000
base pairs.
Comparison of Mutation Frequencies
Class of Mutation     Mechanism                   Frequency                          Example
Genome mutation       Chromosome missegregation   10^-2/cell division                Aneuploidy
Chromosome mutation   Chromosome rearrangement    6x10^-4/cell division              Translocation
Gene mutation         Base-pair mutation          10^-10/base pair/cell division;    Point mutation
                                                  10^-5 to 10^-6/locus/generation
Humans have ~10^9 base pairs/haploid genome; therefore each person will have 1-100 new mutations.
1 in 20 people will have a new gene mutation.
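The step from a per-base-pair rate to "1-100 new mutations" is a multiplication of rate, genome size and number of cell divisions. The sketch below makes that arithmetic explicit. The slide does not state how many cell divisions it assumes, so the values 10 and 1000 used here are illustrative assumptions chosen only because they bracket the quoted range; the function name is hypothetical.

```python
def expected_new_mutations(rate_per_bp_per_division, genome_bp, divisions):
    """Expected number of new point mutations in one haploid genome.

    Simple product of rate x genome size x cell divisions.
    The number of divisions is an assumption, not a value from the slide.
    """
    return rate_per_bp_per_division * genome_bp * divisions


if __name__ == "__main__":
    rate = 1e-10        # per base pair per cell division (from the table)
    genome = 1e9        # base pairs per haploid genome (order of magnitude)
    for divisions in (10, 1000):   # assumed range of cell divisions
        print(divisions, expected_new_mutations(rate, genome, divisions))
```

With these assumed inputs the product runs from about 1 to about 100, which matches the range quoted on the slide.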
Types of Mutations (1)
Nucleotide Substitutions Altering Coding Sequence
• Missense mutations (amino acid substitution)
• Nonsense mutations (premature stop codon)
Types of Mutations (2)
Nucleotide Substitutions Altering Gene Expression
• RNA processing mutations (destruction of splice sites,
cap sites, poly A sites, or creation of cryptic sites)
• Regulatory mutations (promoter mutations)
Types of Mutations (3)
Deletions and Insertions (InDels)
• Insertion or deletion of a small number of bases
  If the number of bases involved is not a multiple of 3,
  causes frameshift
  If the number of bases involved is a multiple of 3,
  causes loss or gain of codons
• Larger deletions, inversions, and duplications
  Can create gene syndromes
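The multiple-of-3 rule for small InDels can be written as a one-line check. This is a minimal sketch for illustration; the function name and the wording of the returned labels are hypothetical.

```python
def indel_effect(n_bases):
    """Classify a small coding-region insertion or deletion by its length.

    A length that is a multiple of 3 keeps the reading frame and adds or
    removes whole codons; any other length shifts the reading frame.
    """
    if n_bases <= 0:
        raise ValueError("n_bases must be a positive integer")
    if n_bases % 3 == 0:
        return f"in-frame: loss or gain of {n_bases // 3} codon(s)"
    return "frameshift"


if __name__ == "__main__":
    for n in (1, 3, 4, 6):
        print(n, indel_effect(n))
```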
Recombination-Generated Duplications, Deletions, Insertions
(Figure: diagrams of duplication, insertion and inversion.)
Brief Summary
The term MARKER is usually used for “LOCUS MARKER”. Each gene has a particular place along the chromosome called a
LOCUS. Due to mutations, a gene can exist in several mutually
exclusive forms called ALLELES (or allelic forms). All allelic
forms of a gene occur at the same locus on homologous
chromosomes. When the allelic forms at one locus are identical, the
genotype is called HOMOZYGOUS (at this locus), whereas
different allelic forms constitute a HETEROZYGOTE. In
diploid organisms, the GENOTYPE is constituted by the two
allelic forms on the homologous chromosomes.
Thus, MOLECULAR MARKERS are all locus markers related
to DNA (sometimes biochemical or morphological markers are
included).
Molecular Markers Classes
First Generation: 1980s
• Based on DNA-DNA hybridization, such as RFLP.
Second Generation: 1990s
• Based on PCR:
  Using random primers: RAPD, DAF, ISSR
  Using specific primers: SSR, SCAR, STS
• Based on PCR and restriction cutting: AFLP, CAPS
Third Generation: recently
• Based on DNA point mutations (SNP); can be detected by SSCP,
DASH, DNA chip, sequencing, etc.
The Evolution of Markers
(Timeline figure of hallmark events, listed chronologically.)
Morphological variants (pre-1950s)
Protein scene: gel electrophoresis (1950s), allozymes (1960s)
DNA-hybridization scene (pre-PCR): restriction (1968) and Southern blotting (1975), RFLPs (1980)
Oligo scene: PCR (1986), microsatellites (SSRs, 1989), RAPDs (1990), SSCPs and CAPS (1993), AFLPs (1996), SCARs, gene-specific PCR, cDNA sequencing (cSSR)
Automation: AFLPs on automated sequencers (1998), AFLPs on microarrays (2000)
Genomic era: complete genomic sequence, high-throughput marker analysis, SNPs, SNPs on chips
DNA Markers
Simple Sequence Repeats-SSR
Single Nucleotide Polymorphism-SNP
Single Feature Polymorphisms (SFPs)
Microsatellites
What are microsatellites?
• Simple sequence repeats (SSRs) or microsatellites are tandemly repeated mono-, di-, tri-, tetra-, penta-, and hexa-nucleotide motifs. SSR length polymorphisms are caused by differences in the number of repeats.
• SSR loci are “individually amplified by PCR using pairs of oligonucleotide primers specific to unique DNA sequences flanking the SSR sequence”.
Example
Mononucleotide SSR (A)11: AAAAAAAAAAA
Dinucleotide SSR (GT)6: GTGTGTGTGTGT
Trinucleotide SSR (CTG)4: CTGCTGCTGCTG
Tetranucleotide SSR (ACTC)4: ACTCACTCACTCACTC
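The definition above (a 1 to 6 nucleotide motif repeated in tandem) maps directly onto a regular expression with a backreference. The sketch below is one possible way to scan a sequence for perfect SSRs. It is not a published SSR-mining tool; the function name and the default threshold of 4 repeats are assumptions chosen so that the slide's examples are detected.

```python
import re


def find_ssrs(seq, min_unit=1, max_unit=6, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the regex try the shortest motif first, so
    AAAA is reported as (A)4 rather than (AA)2. The motif is reported in
    the phase in which it is first encountered (e.g. TG rather than GT).
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


if __name__ == "__main__":
    for example in ("AAAAAAAAAAA", "GTGTGTGTGTGT",
                    "CTGCTGCTGCTG", "ACTCACTCACTCACTC"):
        print(example, find_ssrs(example))
```

Imperfect or compound repeats are not handled; a production pipeline would normally also record flanking sequence for primer design.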
Feature of SSR Marker
• SSRs tend to be highly polymorphic.
• SSRs are highly abundant and randomly dispersed throughout most genomes.
• Most SSR markers are co-dominant and locus specific.
• Genotyping throughput is high and can be automated.
Where are microsatellites found?
Majority are in non-coding region
Repeat Motifs
• AC repeats tend to be more abundant than other di-nucleotide repeat motifs in animals.
• The most abundant di-nucleotide repeat motifs in plants, in descending order, are AT, AG, and AC.
• Because AT repeats self-anneal, AT-enrichment methods have not been developed.
• Typically, SSRs are developed for di-, tri-, and tetra-nucleotide repeat motifs. CA and GA have been widely used in plants.
• SSR markers have been developed for a variety of tri- and tetra-nucleotide repeats in plants.
• Tetra-nucleotide repeats have the potential to be very highly polymorphic.
SSR Containing Sequences from BAC-ends
(Pie chart: SSRs by repeat unit length, with classes 2 bp, 3 bp, 4 bp and 5-6 bp; slice labels 76%, 21%, 3% and 3%.)
SSR-containing sequences make up 1% of BAC-end sequences in corn and
0.6% in soybean. Among these, most are dinucleotide repeats.
Trinucleotide Repeats in Soy BAC-end Sequences
(Pie chart: trinucleotide motifs AAT, AAC, AAG, ATG, ATC, AGG, ACT, CCT, CGT, ACC and CTG; labelled slices 48%, 25%, 15% and 5%.)
In the soybean genome, most of the trinucleotide repeats
in BAC-end sequences are AAT repeats; one quarter of
them are AAC repeats.
Simple sequence repeats (SSRs). SSRs are particularly useful for developing genetic markers. They are believed to vary through DNA replication slippage and are related to genetic instability. In Table 2, we describe SSR content for two sectors, n = 6 to 11 units and n > 11 units, to emphasize that the number of SSRs dropped substantially after 11 units. The SSR content for 93-11 was 1.7% of the genome,
lower than in the human, where it was 3%. The overwhelming majority of
rice SSRs were mononucleotides, primarily (A)n or (T)n, with n from 6 to
11. In contrast, for the human, the greatest contributions came from dinucleotides.
From Nipponbare (Goff et al., 2002, Science):
The most prevalent SSR is tri-nucleotide; the most frequent 2-SSR is AG, 3-SSR is
CGG, and 4-SSR is CGAT.
How do microsatellites mutate?
• Replication slippage (“polymerase slippage” or “slipped-strand mispairing”)
• Unequal crossing-over during meiosis
Replication Slippage
When the DNA replicates, the polymerase loses track of its place, and either leaves
out repeat units or adds too many repeat units.
Replication slippage is a commonly observed replication error. It
occurs at repetitive sequences when the new strand mispairs with the template
strand. Microsatellite polymorphism is mainly caused
by replication slippage. If the mutation occurs in a coding region, it could produce abnormal proteins, leading to diseases.
Unequal crossing-over during meiosis
This is thought to explain more drastic changes in numbers of repeats. In this
diagram, chromosome A obtained too many repeats during crossing-over, and
chromosome B obtained too few repeats.
Why do microsatellites exist?
• "Junk" DNA, and the variation is mostly neutral
• A necessary source of genetic variation
• Regulate gene expression and protein function
Moxon, E. R., Wills, C. 1999. "DNA microsatellites: Agents of Evolution?" Scientific
American. Jan., pp. 72-77.
Kashi, Y. and M. Soller. 1999. "Functional Roles of Microsatellites and Minisatellites."
In: Microsatellites: Evolution and Applications. Edited by Goldstein and Schlotterer.
Oxford University Press.
Models of Microsatellite Mutation (1)
1. Stepwise Mutation Model (SMM)
This model holds that when microsatellites mutate, they only gain
or lose one repeat. This implies that two alleles that differ by one
repeat are more closely related (have a more recent common
ancestor) than alleles that differ by many repeats. In other words, size matters when doing statistical tests of population
substructuring. The SMM is generally the preferred model when
calculating relatedness between individuals and population
substructuring, although there is the problem of homoplasy.
Models of Microsatellite Mutation (2)
2. Infinite Alleles Model (IAM)
Each mutation can create any new allele randomly. A 15-repeat allele
could be just as closely related to a 10-repeat allele as to an 11-repeat allele.
All that matters is that they are different alleles. In other words, size isn't
important.
(Figure: alleles with 15, 11, 10 and 8 repeats.)
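The practical difference between the two models is how allele dissimilarity is scored. The following sketch contrasts them using the repeat counts from the figure. The function names are hypothetical, and the SMM score is a simplification: it is the minimum number of single-repeat steps and ignores homoplasy.

```python
def smm_distance(repeats_a, repeats_b):
    """Stepwise model: alleles are further apart the more repeats separate them.

    Returns the minimum number of one-repeat steps between the two alleles.
    """
    return abs(repeats_a - repeats_b)


def iam_distance(repeats_a, repeats_b):
    """Infinite alleles model: only identity matters, not allele size."""
    return 0 if repeats_a == repeats_b else 1


if __name__ == "__main__":
    for other in (11, 10, 8):
        print(f"15 vs {other}: SMM={smm_distance(15, other)}, "
              f"IAM={iam_distance(15, other)}")
```

Under the SMM the 15-repeat allele is closer to the 11-repeat allele than to the 10-repeat allele, while under the IAM both comparisons score the same.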
Conventional Developmental Steps of SSR Markers
Genomic DNA -> DNA library -> screening with SSR probes -> positive clones -> sequencing of positive DNA clones -> specific SSR -> PCR test using diverse genotypes
Four Assay Methods
1. The customary method for SSR genotyping is denaturing polyacrylamide gel electrophoresis using silver-stained PCR products. These assays can usually distinguish alleles differing by 4 bp and may distinguish alleles differing by 2 bp.
2. Semi-automated SSR genotyping can be performed by assaying fluorescently labelled PCR products for length variants on an automated DNA sequencer. Several instruments have been developed (e.g., Applied Biosystems and Li-Cor). Alleles differing by 2 to 4 bp can usually be distinguished.
3. SSR length polymorphisms can be assayed using non-denaturing high performance liquid chromatography (Marino et al. 1998). Alleles differing by 2 to 4 bp can usually be distinguished.
4. SSR alleles differing by several repeat units can often be distinguished on agarose gels.
SSRs assayed on polyacrylamide gels typically show a characteristic “stuttering”. Stutter bands are artifacts produced by DNA polymerase slippage. Typically, the most prominent stutter bands are +1 and -1 repeat (e.g., + or - 2 bp for a di-nucleotide repeat), and, if visible, the next most prominent stutter bands are +2 and -2 repeats.
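The expected positions of stutter bands follow from the allele size and the motif length. A small sketch of that arithmetic is given below; the function name and the 150 bp allele used in the example are hypothetical.

```python
def stutter_band_sizes(allele_bp, motif_len, max_steps=2):
    """Fragment sizes (bp) of expected stutter bands around a true allele.

    Bands are listed from most to least prominent: -1 and +1 repeat first,
    then -2 and +2 repeats, following the description in the text.
    """
    sizes = []
    for step in range(1, max_steps + 1):
        sizes.append(allele_bp - step * motif_len)
        sizes.append(allele_bp + step * motif_len)
    return sizes


if __name__ == "__main__":
    # Hypothetical 150 bp allele of a di-nucleotide SSR.
    print(stutter_band_sizes(150, 2))
```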
Weaknesses
• The development of SSRs is labor intensive (not so for sequence-based SSR development).
• SSR marker development costs are very high.
• SSR markers are taxa specific.
• Start-up costs are high for automated SSR assay methods.
• Developing PCR multiplexes is difficult and expensive. Some markers may not multiplex.
Single Nucleotide Polymorphisms
• SNPs are the molecular basis for most phenotypic differences between individuals.
• SNPs are the most common type of genetic variation.
• SNPs are highly abundant, stable and distributed throughout the genome.
• SNP assays are amenable to automation and high throughput.
• SNPs are biallelic.
GATTTAGATCGCGATAGAG
GATTTAGATCTCGATAGAG
• SNPs in intergenic regions may …
  Have no genetic effect …
  Affect genetic regulatory signals …
  Interfere with RNA splice sites …
• SNPs in coding regions (cSNPs) may …
  Synonymously change the codon of an amino acid, which may have no further effect, or may influence e.g. codon bias.
  Non-synonymously alter the encoded amino acid (nsSNP) by a conservative exchange or a non-conservative (radical) mutation.
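The two example sequences on this slide differ at a single position. The sketch below compares two equal-length, already aligned sequences and reports each difference. The transition/transversion label is an addition for illustration and is not taken from the slide; the function name is hypothetical.

```python
PURINES = {"A", "G"}


def find_snps(seq_a, seq_b):
    """List (position, base_a, base_b, kind) for mismatches between two
    aligned sequences of equal length. Positions are 1-based.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    snps = []
    for index, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        if a != b:
            # Purine<->purine or pyrimidine<->pyrimidine is a transition.
            same_class = (a in PURINES) == (b in PURINES)
            kind = "transition" if same_class else "transversion"
            snps.append((index + 1, a, b, kind))
    return snps


if __name__ == "__main__":
    print(find_snps("GATTTAGATCGCGATAGAG", "GATTTAGATCTCGATAGAG"))
```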
SNP Variation in Maize and Soybean
(Bar chart: percentage of SNPs in each class, CT, GA, GC, AC, GT, AT and Del, for maize and soy; vertical axis from 0 to 40%.)
Frequency of Candidate SNPs from
Different Sources in Maize and Soy
Region         Maize       Soy
EST (5' end)   1/1.5 kb    1/1.9 kb
Genomic        1/640 bp    1/750 bp
3'UTR          1/441 bp    1/416 bp
(Figure: four pie charts labelled SNP/250bp, SNP/268bp, SNP/243bp and SNP/236bp, with slice labels 16.5%, 18.2%, 65.3%; 23.5%, 21.8%, 54.7%; 14.3%, 16.3%, 69.4%; 23.3%, 22.4%, 54.3%.)
SNPs Discovery
1. Sequence database searches
2. Target specific SNP discovery and development
   - Conformation-based mutation scanning
   - Direct DNA sequencing
Identify SNP from Sequence Databases
Identification of Target Specific SNPs
Steps:
1. Amplify the genes of interest with PCR
2. Scan for mutations with various methods
   - Conformation-based mutation scanning
     - Single-strand conformation polymorphism analysis
     - Gel electrophoresis
     - Chemical and enzymatic mismatch cleavage detection
     - Denaturing gradient gel electrophoresis
     - Denaturing HPLC
3. Sequence positive PCR products
   - Sequence multiple individuals
   - Sequence heterozygotes
4. Align sequences from different sources to find SNPs
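The alignment step can be sketched as a column-by-column scan over sequences that have already been aligned to equal length. Requiring the second most common base to be seen at least twice is a filter against single-read sequencing errors; that threshold is an assumption added here, not something stated on the slide, and the function name is hypothetical.

```python
from collections import Counter


def candidate_snps(aligned_seqs, min_minor_count=2):
    """Return (position, base_counts) for alignment columns that look polymorphic.

    aligned_seqs: equal-length strings, one per individual or source.
    Gaps ('-') and ambiguous calls ('N') are ignored. Positions are 1-based.
    """
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all sequences must have the same aligned length")
    hits = []
    for col in range(length):
        counts = Counter(seq[col].upper() for seq in aligned_seqs)
        counts.pop("-", None)
        counts.pop("N", None)
        ranked = counts.most_common()
        if len(ranked) >= 2 and ranked[1][1] >= min_minor_count:
            hits.append((col + 1, dict(counts)))
    return hits


if __name__ == "__main__":
    reads = ["GATCG", "GATCG", "GATTG", "GATTG", "GCTCG"]
    print(candidate_snps(reads))
```

In this toy input the single C at position 2 is treated as a likely error, while position 4 (three C, two T) is reported as a candidate SNP.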
Technologies for Detecting Known SNPs
Gel-Based Methods
-PCR-restriction fragment length polymorphism analysis
-PCR-based allele-specific amplification
-Oligonucleotide ligation assay genotyping
-Minisequencing (10~20 bases)
Non-Gel-Based High-Throughput Genotyping Technologies
-Solution hybridization using fluorescence dyes
-Allele-specific ligation
-Allele-specific nucleotide incorporation
1. High resolution separation
2. Chemical color reaction
-DNA microarray genotyping
Oligo Ligation Assay (OLA)
Two allele-specific oligonucleotide probes (one specific for the wild-type allele and the
other specific for the variant allele) and a fluorescent common probe are used in each
assay. The 3' ends of the allele-specific probes are immediately adjacent to the 5' end of
the common probe. In the presence of thermally stable DNA ligase, ligation of the
fluorescently labeled probe to the allele-specific probe(s) occurs only when there is a
perfect match between the variant or the wild-type probe and the PCR product template.
These ligation products are then separated by electrophoresis, which permits the
recognition of the wild-type genotypes, the variants, the heterozygotes, and the unligated
probes.
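Reading an OLA result comes down to asking which allele-specific probe was ligated to the common probe. The decision logic can be sketched as follows; the function name and the boolean inputs are hypothetical simplifications of what would be band or peak calls in practice.

```python
def call_ola_genotype(wild_type_ligated, variant_ligated):
    """Genotype call from the presence of the two possible ligation products.

    wild_type_ligated: True if the wild-type probe + common probe product is seen.
    variant_ligated:   True if the variant probe + common probe product is seen.
    """
    if wild_type_ligated and variant_ligated:
        return "heterozygote"
    if wild_type_ligated:
        return "homozygous wild type"
    if variant_ligated:
        return "homozygous variant"
    return "no call (only unligated probes)"


if __name__ == "__main__":
    print(call_ola_genotype(True, True))
    print(call_ola_genotype(True, False))
```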
Allele-Specific Codominant PCR Strategy
Figure. Schematic representation of the allele-specific codominant PCR strategy. Oligonucleotide primers with 3' nucleotides that correspond to an SNP site are used to preferentially amplify specific alleles. A, Primer P1 forms a perfect match with allele 1 but forms a mismatch at the 3' terminus with the DNA sequence of allele 2. Primer P2 similarly forms a perfect match with allele 2 and a 3' terminus mismatch with allele 1. B, Schematic of agarose gel analysis showing the expected outcome for the amplification of organisms homozygous and heterozygous for both alleles using primers P1 and P2. P1, primer 1; P2, primer 2; A1, allele 1; A2, allele 2.
Eliana Drenkard et al. 2000 Plant Physiol 124: 1483-1492
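The core idea in the caption, a primer whose 3' terminal nucleotide sits on the SNP, can be illustrated with a few lines of string slicing. This is only a sketch of that idea, not the primer design procedure of the cited paper: the function name, the default primer length and the use of the slide's earlier example sequence are assumptions, and only forward primers on the given strand are produced.

```python
def allele_specific_primers(template, snp_index, allele_1, allele_2, primer_len=20):
    """Return forward primers (P1, P2) whose 3' end is the SNP base.

    template:  sequence of the strand the primers should match (5'->3').
    snp_index: 0-based position of the SNP in the template.
    Both primers share the same upstream sequence and differ only in
    their final (3' terminal) nucleotide.
    """
    start = snp_index - primer_len + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for this primer length")
    upstream = template[start:snp_index].upper()
    return upstream + allele_1.upper(), upstream + allele_2.upper()


if __name__ == "__main__":
    p1, p2 = allele_specific_primers("GATTTAGATCGCGATAGAG", 10, "G", "T",
                                     primer_len=11)
    print(p1)
    print(p2)
```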
SNP Detection: Allele Specific Oligohybridization
Principle: A 1 bp mismatch in the center of a 15mer will change
the Tm by 5-10 degrees; therefore a SNP in the middle of a
15mer can be genotyped using paired ASOs.
• PCR amplify target gene (different individuals) in 96-well format
• Prepare dot-blot on nylon filter
• Hybridize to allele-specific 15mer and detect the signal
• Wash at stringency temperature
• Repeat for alternate allele and other SNPs
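Designing the paired ASOs amounts to cutting a 15-mer window with the SNP at its center, once per allele. The sketch below does that and also gives a rough perfect-match melting temperature using the Wallace rule, Tm = 2(A+T) + 4(G+C). The Wallace rule is brought in here as a common rule of thumb for short oligos; it is not from the slide and does not model the 5-10 degree drop caused by a mismatch. Function names are hypothetical.

```python
def paired_asos(template, snp_index, alleles, length=15):
    """Return one allele-specific oligo per allele, with the SNP centered.

    template:  reference sequence (5'->3').
    snp_index: 0-based position of the SNP in the template.
    alleles:   iterable of bases, e.g. ("G", "T").
    """
    flank = length // 2
    if snp_index - flank < 0 or snp_index + flank >= len(template):
        raise ValueError("not enough flanking sequence around the SNP")
    left = template[snp_index - flank:snp_index].upper()
    right = template[snp_index + 1:snp_index + flank + 1].upper()
    return [left + allele.upper() + right for allele in alleles]


def wallace_tm(oligo):
    """Rule-of-thumb Tm in degrees C for a short, perfectly matched oligo."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for aso in paired_asos("GATTTAGATCGCGATAGAG", 10, ("G", "T")):
        print(aso, wallace_tm(aso))
```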
Single-Strand Conformation Polymorphism Analysis
Single-stranded DNAs are generated by denaturation of the PCR
products and separated on a nondenaturing polyacrylamide gel. A
fragment with a single-base modification generally forms a different
conformer and migrates differently when compared with wild-type
DNA.
Size <200 bp: accuracy 70%-95%
Size >400 bp: accuracy 50%
1% false positive
SNP Genotyping Using Oligo Chip
Oligo chip: a set of 15-nucleotide probes arranged in overlapping
probe sets, with adjacent sets overlapping by 14 nucleotides.
Among the four probes in one set, the sequences are the same
except at one position, which is A, G, C or T.
(Figure panels: T genotype, C genotype.)
http://www.ricesnp.org/index.aspx##
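The probe layout described for the oligo chip can be generated mechanically from a reference sequence: slide a 15-nucleotide window one base at a time, and for each window emit four probes that differ only at one position. The slide does not say which position is varied, so the sketch assumes it is the central base; the function name is hypothetical.

```python
def tiling_probe_sets(reference, probe_len=15):
    """Build resequencing-style probe sets from a reference sequence.

    Returns a list of (interrogated_index, [probe_A, probe_C, probe_G, probe_T]).
    Consecutive windows overlap by probe_len - 1 nucleotides. The varied
    position is assumed to be the center of each probe.
    """
    center = probe_len // 2
    sets = []
    for start in range(len(reference) - probe_len + 1):
        window = reference[start:start + probe_len].upper()
        probes = [window[:center] + base + window[center + 1:] for base in "ACGT"]
        sets.append((start + center, probes))
    return sets


if __name__ == "__main__":
    for position, probes in tiling_probe_sets("GATTTAGATCGCGATAGAG"):
        print(position, probes)
```

In each set, the probe that matches the sample perfectly at the central base is expected to give the strongest hybridization signal, which is how the T and C genotypes in the figure would be told apart.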
Direct Sequencing - New Sequencing Technology
Pyrosequencing technology offers rapid and accurate genotyping, allowing for
dependable SNP and mutation analysis. This technology utilizes an enzyme cascade system that results in the production of measurable light whenever a nucleotide forms a base pair with its complementary base in a DNA template strand.
Solexa/Illumina Sequencing
Munroe & Harris, (2010) Third-generation sequencing fireworks at Marco Island.
Nature Biotechnology 28: 426–428.
Use of SNPs
1. Markers for linkage mapping: discover SNPs that contribute
to agronomic traits
2. Trace origin of introgression
3. Markers for association studies (Linkage Disequilibrium)
4. Markers for population genetic analysis
Further Reading:
McNally et al., 2009. Genomewide SNP variation reveals relationships among landraces and modern varieties of rice. PNAS 106(30):12273-8.
Jones et al., 2009. Development of single nucleotide polymorphism (SNP) markers for use in commercial maize (Zea mays L.) germplasm. Mol Breeding 24 (2):165-176.
Varshney et al., 2007. Single nucleotide polymorphisms in rye (Secale cereale L.): discovery, frequency, and applications for genome mapping and diversity studies. TAG 114 (6): 1105-1116.
Wu et al., 2010. SNP discovery by high-throughput sequencing in soybean. BMC Genomics 11:469.
Single Feature Polymorphisms (SFPs)
SFPs are a consequence either of insertion/deletion (InDel)
polymorphisms or of multiple SNPs across the complementary sequences.
SFPs are identified through hybridization of genomic DNA to whole-genome
tiling arrays (e.g., Affymetrix GeneChips) or home-made microarrays.
References
Yeast: Wodicka, L., H. Dong, M. Mittmann, M.H. Ho, and D.J. Lockhart.
1997. Nat Biotechnol 15: 1359-1367.
Arabidopsis: Borevitz, J.O., D. Liang, D. Plouffe, H.S. Chang, T. Zhu, D.
Weigel, C.C. Berry, E. Winzeler, and J. Chory. 2003. Genome Res 13:
513-523.
Further reading:
Kumar et al., 2007. Single Feature Polymorphism Discovery in Rice. Plos ONE, 2(3): e284
Principle of microarray-based genotyping of Single Feature Polymorphisms (SFPs) by oligo chip.
(Figure panels: A genotype, B genotype, A/B genotype.)
http://cropwiki.irri.org/gcp/images/6/61/Single_Feature_Polymorphism.pdf
Classification of DNA Markers
A. Mutation at restriction sites (RFLP, CAPS, AFLP) or PCR
primer sites (RAPD, DAF, AP-PCR, SSR, ISSR)
B. Insertion or deletion between restriction sites (RFLP, CAPS,
AFLP) or PCR primer sites (RAPD, DAF, AP-PCR, SSR, ISSR)
C. Changes in the number of repeat units between restriction sites
or PCR primer sites: SSR, VNTR, ISSR
D. Mutations at single nucleotides: SNP
Summary of Common Molecular Markers
Single Locus Detection
RFLP (restriction fragment length polymorphism) Hybridization
CAPS (cleaved amplified polymorphic sequences) PCR
SSLP (simple sequence length polymorphism)
---- VNTR (variable number of tandem repeat) Hybridization or PCR
---- SSR/STR (simple sequence repeats/tandem repeats) PCR
SCAR (Sequence characterized amplified region) PCR
SNP (Single nucleotide polymorphism)
---- DASH (dynamic allele-specific hybridization) Hybridization
---- SSCP (single strand conformation polymorphism) Conformation
Multiple Loci Detection
AFLP (amplified fragment length polymorphism) PCR
RAPD (random amplified polymorphic DNA) PCR
AP-PCR (arbitrarily primed-PCR) PCR
DAF (DNA amplification fingerprinting) PCR
SSLP (simple sequence length polymorphism) PCR (when multiple pairs of primers are used)
ISSR (inter-simple sequence repeat) PCR
SNP (Single nucleotide polymorphism)
-- SSCP (single strand conformation polymorphism) Conformation (when used to scan for randomly located SNPs)
Conclusion
Not all molecular markers are equal. None is
ideal. Some are better for some purposes than
others. However, all are generally preferable to
morphological markers for mapping and marker
assisted selection.
Thanks For Your Attention!