Genomics based tools for omega‐3 fatty acid rich oil seed crop, Linseed
Vidya Gupta, Narendra Kadoo, Ashok Giri, Varsha Pardeshi, Vitthal Barvkar, Sandip Kale, Ashwini Rajwade (CSIR‐NCL, Pune, India)
P. B. Ghorpade (COA, Nagpur)
R. L. Shrivastava (CSAUAT, Kanpur, India)
Next Generation Genomics and Integrated Breeding for Crop Improvement Feb 20, 2014, ICRISAT, Hyderabad
Flax/ Linseed (Linum usitatissimum L.)
Versatile total utilization crop
High nutritional and industrial value:
Unique fatty acid composition (>50% α‐Linolenic acid (ALA), an omega‐3 fatty acid)
Excellent source of protein and dietary fiber (28 g total dietary fibre/100 g dry weight)
Rich source of lignans (phytoestrogens) and other antioxidants (800 mcg/g)
Oil used in paint industry while fibre used for carpet making, linoleum etc.
Flax agriculture in India
India ranks 1st in the world in area under cultivation (0.45 MHa)
6th in production (0.14 MT); average yield of 317.4 kg/ha (FAO, 2010)
Largely a neglected crop in India
Figure: Area under cultivation (Ha), 2010
Rabi crop confined to limited areas in the country
Figure: Yield (Kg/Ha), 2010
Low economic returns: native varieties have low productivity (400 kg/hectare under rainfed conditions)
Susceptible to rust, wilt, seedling blight, viral diseases, insect pests
Production and area under cultivation shrinking
Figure: Production (Tonnes), 2010
Need based research in flax/linseed
Enhancing omega‐3 fatty acid and lignan content
Levels of omega‐3 fatty acids in the diet decreasing drastically
Only agricultural source of high ALA (45‐65 %) and SDG
Systematic efforts lacking: Breeding, OMICS and Engineering approaches for omega‐3 fatty acid and high lignan content
Essential breeding activity
Tapping varietal variation for biochemical parameters, oil content, quality and yield
High yielding, early maturing, disease tolerant
High oil content and better quality
Fiber content and better quality
Better management of the crop
Developing genetic and genomic tools
Developing appropriate mapping populations
Core population development and phenomics
Developing molecular tools
Genome analysis, metabolic pathway networks
Health benefits Flax products
Verification of health related activities using animal models
Development of genetic and genomic resources
Development of Genomic SSR and EST‐SSR Markers
Development of genomic SSR markers
Genomic DNA: Flax variety NL‐97
Microsatellite enrichment library preparation
5’ anchored PCR method (Fisher et al. 1996)
PIMA (PCR Isolation of Microsatellite Arrays) (Lunt et al. 1999)
FIASCO (Fast Isolation by AFLP of Sequences COntaining repeats) (Zane et al. 2002)
Next generation sequencing: pooled amplicons from each enrichment library sequenced by 454 pyrosequencing
Kale et al., Molecular Breeding 2012
Frequency distribution of genomic SSRs
Figure: Dinucleotide motif distribution (x: dinucleotide repeat motif AG, GA, CT, TC, AC, CA, TG, GT; y: % frequency)
Figure: Trinucleotide motif distribution (x: trinucleotide repeat motif AAG, AGA, GAA, CTT, TTC, TCT; y: % frequency)
Figure: % of repeat motif by class (DNR, TNR, others)
Motif analysis (Katti et al. 2001), motif classes: AT/TA, AG/GA/CT/TC, AC/CA/TG/GT, GC/CG
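The motif classes listed here group repeat units that describe the same microsatellite read from a different start position or from the complementary strand (for example AG, GA, CT and TC). A minimal Python sketch of that grouping follows; the function names are illustrative and are not taken from the original analysis.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def motif_class(motif: str) -> str:
    """Return a canonical label for an SSR repeat unit.

    Every rotation of the motif and of its reverse complement is treated
    as the same repeat; the alphabetically smallest variant is the label.
    """
    motif = motif.upper()
    variants = set()
    for seq in (motif, reverse_complement(motif)):
        for i in range(len(seq)):
            variants.add(seq[i:] + seq[:i])
    return min(variants)
```

With this labelling the twelve non-homopolymer dinucleotides collapse into four classes, which matches the four groups named on the slide.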
Polymorphism screening
27 diverse genotypes used
52 primers randomly selected; 43 showed amplification
31 (72%) TNR, 10 (23%) DNR
Nine (21%) polymorphic
Development of EST‐SSR markers
Work flow
Species                No. of primers amplified   Transferability (%)
Jatropha curcas        0                          0.00
Arabidopsis thaliana   3                          0.34
Ricinus communis       13                         1.47
Medicago truncatula    22                         2.49
Populus trichocarpa    22                         2.49
Glycine max            22                         2.49
Gossypium hirsutum     52                         5.88
Vitis vinifera         60                         6.79
Nicotiana tabacum      66                         7.47
e‐PCR: deviation in virtual PCR product size observed in different species from that observed in L. usitatissimum
Tools: Cd-hit-est, CAP3, SSR Locator v.1
Development of Indian flax core collection
Indian flax diversity
Indian flax germplasm: 2239
Indigenous: 1890
Exotic: 349
Descriptors used for clustering    Descriptors used for evaluation
Days to 50% flowering (DTF)        Plant type
Days to maturity (DTM)             Flower colour
Plant height (cm) (PH)             Flower size & shape
Technical plant height (cm) (TPH)  Aestivation type
Capsules per plant (CPP)           Venation colour
Seeds per capsule (SPC)            Anther colour, stigma colour, style colour
1000 seeds weight (g) (TW)         Capsule dehiscence
Seed yield per plant (g) (SPP)     Seed colour & seed size
Kale et al. 2013, communicated
Method: Ward's minimum variance
Distance: Standardized Euclidean
Squared multiple correlation (R2): 0.75
CC construction: proportional random sampling (Upadhyaya et al., 2003)
Software: SAS 9.1.3, XLSTAT
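Core collection (CC) construction by proportional random sampling can be sketched as follows: once accessions have been assigned to clusters (assumed here to come from the Ward's clustering step), each cluster contributes accessions in proportion to its size. The function name, the minimum of one accession per cluster, and the fixed random seed are assumptions made for illustration, not details of the original SAS/XLSTAT analysis.

```python
import random
from collections import defaultdict


def proportional_sample(cluster_of, core_size, seed=0):
    """Draw a core set from clusters in proportion to cluster size.

    cluster_of: dict mapping accession id -> cluster label.
    core_size:  target number of accessions in the core.
    """
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for accession, label in cluster_of.items():
        clusters[label].append(accession)

    total = len(cluster_of)
    core = []
    for label in sorted(clusters):
        members = sorted(clusters[label])
        # Assumption: keep at least one accession per cluster,
        # and never ask for more than the cluster contains.
        n = max(1, round(core_size * len(members) / total))
        n = min(n, len(members))
        core.extend(rng.sample(members, n))
    return core
```

Because each cluster's share is rounded separately, the returned core can differ from `core_size` by a few accessions.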
Diversity within eight quantitative characters and five fatty acid contents
Variable         mean    SD     min     max    range   CV
DTF              82.59   8.35   54.67   105    50.33   10.1
DTM              139.5   7.6    115.9   158    42.11   5.45
PH               62.94   11.4   27.06   111    83.94   18.1
TPH              33.78   9.18   15      88     73      27.2
CPP              107.8   46.6   22      364    342     43.2
SPC              7.78    1.42   4       10.56  6.56    18.3
TW               6.49    1.76   3       10.6   7.6     27.1
YPP              5.48    2.24   0.46    15     14.54   40.9
Stearic acid     1.02    0.37   0.35    2.71   2.36    35.9
Palmitic acid    1.41    0.46   0.67    4.21   3.54    32.5
Oleic acid       3.89    1.47   0.95    11.3   10.34   37.8
Linoleic acid    2.31    0.99   0.59    8.98   8.39    42.9
Linolenic acid   10.33   3.58   2.87    22.3   19.43   34.7
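The range and CV columns follow directly from the other columns: range is max minus min, and CV is the standard deviation expressed as a percentage of the mean. A short Python check, using the DTF and DTM rows as input (the function names are illustrative):

```python
def coefficient_of_variation(mean: float, sd: float) -> float:
    """CV in percent: standard deviation relative to the mean."""
    return 100.0 * sd / mean


def value_range(minimum: float, maximum: float) -> float:
    """Spread between the smallest and largest observed value."""
    return maximum - minimum


# DTF row of the table: mean 82.59, SD 8.35, min 54.67, max 105
dtf_cv = coefficient_of_variation(82.59, 8.35)
dtf_range = value_range(54.67, 105)
```

Small differences from the tabulated CV values (for example for stearic acid) are expected, because the slide reports rounded means and SDs.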
Core population size: 192
Figure: Shannon‐Weaver diversity index for 12 morphological descriptors in the entire and core collections
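For reference, the Shannon‐Weaver index of a descriptor is H' = −Σ p_i ln p_i, where p_i is the proportion of accessions in descriptor state i. The sketch below assumes natural logarithms and no normalisation; the slide does not say which log base or scaling was used, so absolute values may differ from the plotted ones.

```python
import math
from collections import Counter


def shannon_weaver(states):
    """Shannon-Weaver index H' for a list of observed descriptor states.

    Assumption: natural log, no normalisation by the number of states.
    """
    counts = Counter(states)
    n = sum(counts.values())
    h = 0.0
    for count in counts.values():
        p = count / n
        h -= p * math.log(p)
    return h
```

Computing the index once for the entire collection and once for the core, descriptor by descriptor, gives the paired comparison shown in the figure.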
Means (± standard errors) for morphological descriptors for the entire and core collections
MD%: 0.00; CR%: 82.35
Variable   Entire collection   Core collection   p value   Difference
DTF        82.946±0.18         82.58±0.58        0.550     NS
DTM        139.3±0.17          139.34±0.51       0.942     NS
PH         62.142±0.25         62.92±0.74        0.318     NS
TPH        33.666±0.19         33.71±0.59        0.950     NS
CPP        106.724±0.96        107.44±3.16       0.829     NS
SPC        7.749±0.03          7.76±0.09         0.919     NS
TW         6.391±0.03          6.55±0.11         0.187     NS
YPP        5.505±0.05          5.48±0.15         0.875     NS
Genetic diversity analysis
Total available SSRs: 100
Polymorphic SSRs: 34
Genotypes used: 192
No. of alleles: 78
Average allele no.: 2.5
Average gene diversity: 0.4
Average PIC: 0.3
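Gene diversity and PIC are both functions of the allele frequencies at a locus. The sketch below uses the standard textbook definitions (expected heterozygosity, and PIC after Botstein et al. 1980); the slide does not state which estimators or software produced the averages, so this is an illustration and not the original computation.

```python
def gene_diversity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content:
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2.
    """
    cross = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return gene_diversity(freqs) - cross
```

For a locus with two equally frequent alleles this gives a gene diversity of 0.5 and a PIC of 0.375; PIC is never larger than gene diversity.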
Neighbor‐ joining tree constructed using genetic distance matrix calculated based on molecular marker data
PCA analysis of 192 linseed accessions based on molecular marker data
Presence of two sub‐populations in the core collection
Software used: STRUCTURE v. 2.3.1
Ancestry model: Admixture
No. of K assumed: 1 to 10 (each with 10 replicates)
Burn‐in: 100,000
MCMC iterations: 100,000
Optimal K determination: ad hoc procedure (Pritchard et al. 2000), ∆K method (Evanno et al. 2005)
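The ∆K method looks for the K at which the log-likelihood curve bends most sharply, scaled by the run-to-run variability at that K. A minimal sketch, assuming the Ln P(D) values of the replicate runs are already collected per K; it uses per-K means for the second difference, which is one common way of implementing the method, and the input layout and function name are assumptions.

```python
from statistics import mean, stdev


def delta_k(lnp):
    """Compute Delta K from replicate STRUCTURE log-likelihoods.

    lnp: dict mapping K -> list of Ln P(D) values from replicate runs.
    Delta K is only defined where K-1 and K+1 were also run.
    """
    means = {k: mean(values) for k, values in lnp.items()}
    result = {}
    for k in sorted(lnp):
        if k - 1 not in means or k + 1 not in means:
            continue
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        sd = stdev(lnp[k])
        if sd > 0:  # skip K values whose replicates are identical
            result[k] = second_diff / sd
    return result
```

With K run from 1 to 10, ∆K can be evaluated for K = 2 to 9; the K with the largest value is taken as the most likely number of sub‐populations.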
SSR marker representative
SNP Development and Genome Wide Association Scan
Genotyping By Sequencing‐Overview
SNP Discovery
GBS Workshop, June 2013
SNP Development and Annotation
1.84 million reads
38% aligned to flax reference genome
72,758 SNPs identified
13,280 SNPs with MAF ≥ 0.1
Density: one SNP per 27.86 kbp
Annotation: SnpEff
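The MAF ≥ 0.1 filter can be illustrated with a small Python sketch. It assumes biallelic SNPs stored as two-letter genotype strings with "N" marking missing calls; this data layout and the function names are hypothetical and are not taken from the GBS pipeline used in the study.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """MAF of one SNP from genotype strings such as 'AA', 'AG', 'GG'.

    Genotypes that are None or contain 'N' are treated as missing.
    """
    counts = Counter()
    for genotype in genotypes:
        if genotype is None or "N" in genotype:
            continue
        counts.update(genotype)  # count each of the two alleles
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0  # monomorphic or no data
    # second most common allele; equals the minor allele when biallelic
    return sorted(counts.values(), reverse=True)[1] / total


def filter_by_maf(snps, threshold=0.1):
    """Keep SNPs (name -> genotype list) whose MAF meets the threshold."""
    return {name: g for name, g in snps.items()
            if minor_allele_frequency(g) >= threshold}
```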
Linkage Disequilibrium
Contig_25 (LD‐Decay = 75Kbp )
Contig_67 (LD‐Decay = 112Kbp )
Contig_123 (LD‐Decay = 100Kbp )
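LD decay distances such as these are usually read off a plot of pairwise LD against physical distance. The slide does not say which LD statistic or cut-off was used; the sketch below assumes the common r² measure computed from phased two-locus haplotypes coded 0/1, purely as an illustration.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic SNPs.

    haplotypes: list of (a, b) pairs, each allele coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom
```

Computing r² for every SNP pair on a contig and noting the distance at which it drops below a chosen threshold gives a decay estimate per contig.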
Population Structure Analysis
Software: Structure 2.3.3
Software: DARwin v5.0.158
Fst : 0.006
Association studies
CPP: 16 significant SNPs
TPH: 7 significant SNPs
PH: one SNP
Figure: SNP distribution for capsules per plant, technical plant height and plant height
Trait   Alleles   Contig    P‐value    Marker R2
CPP     A/G       604       1.34E‐12   0.45
CPP     C/T       543       5.18E‐10   0.37
CPP     A/T       1486      1.20E‐09   0.36
CPP     T/G       3345      2.06E‐09   0.35
CPP     T/G       863       2.09E‐09   0.35
CPP     T/C       165       3.34E‐09   0.35
CPP     G/A       176       4.68E‐09   0.34
CPP     C/A       165       1.52E‐08   0.32
CPP     G/A       1253      4.45E‐08   0.31
CPP     T/G       1123      1.09E‐06   0.26
CPP     C/T       8159373   1.69E‐06   0.25
CPP     C/T       86        1.8E‐06    0.25
CPP     A/G       1376      2.38E‐06   0.25
CPP     A/T       1376      2.38E‐06   0.25
CPP     G/A       98        2.59E‐06   0.24
CPP     G/C       6         2.72E‐06   0.24
PH      A/T       883       5.41E‐07   0.27
TPH     A/T       883       3.55E‐12   0.44
TPH     A/T       34        4.43E‐10   0.37
TPH     T/C       464       1.87E‐07   0.29
TPH     C/T       196       2.41E‐07   0.28
TPH     T/A       8         8.04E‐07   0.26
TPH     C/T       888       1.81E‐06   0.25
TPH     A/G       924       1.90E‐06   0.25
Development of mapping populations for SDG and Omega‐3 fatty acid (COA, Nagpur)
Sr. No.   Cross                                    Character
1         R‐552 (7.9) x KL‐221 (1.12)              SDG
2         KL‐224 (7.49) x LC‐54 (1.23)             SDG
3         PKV‐NL‐260 (55.5%) x TL‐23 (1.2%)        Omega 3
4         TL‐23 (1.2%) x EC‐541221 (66.13%)        Omega 3
Genome analysis
Genome wide identification of flax NBS‐LRR genes
Identification and characterization of flax NBS‐LRR genes
CN CNL N NL TCNL TLTNL TN TNL TNLTNL TNN TNNN TNTNL Total
CNLA 1 2 3
CNLC 4 17 1 4 26
TNLA 2 1 3
TNLB 1 8 1 1 11
TNLC 1 1 4 1 7
TNLD 2 1 20 1 24
Total 4 18 1 8 2 1 3 33 1 1 1 1 74
• Flax gene models: ~43,484 gene models
• 114 characterized R genes from plant R gene database: BLASTP
• NBS, LRR, TIR and CC domains search: Pfam and COILS
• Phylogenetic analysis
• Real time analysis
97 NBS‐LRR genes
Classification of NBS‐LRR genes
Kale et al., Genome 2013
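The class labels used for NBS‐LRR genes (CN, CNL, NL, TNL, TNN and so on) read as one letter per detected domain in N‐ to C‐terminal order. A minimal sketch of that encoding, assuming the domain hits from the Pfam and COILS searches are already ordered along the protein; the function name and input format are illustrative.

```python
DOMAIN_CODE = {"TIR": "T", "CC": "C", "NBS": "N", "LRR": "L"}


def architecture_code(domains):
    """Encode an ordered list of domain names as a class label.

    Example: ['TIR', 'NBS', 'LRR'] -> 'TNL'.
    Each list entry contributes one letter, so repeated domains
    (as in 'TNN') must appear once per occurrence in the input.
    """
    return "".join(DOMAIN_CODE[name] for name in domains)
```

Counting genes per label then yields the kind of class-by-clade summary shown in the table.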
Gene structure
TNL‐A and TNL‐C members either lack a domain or show an unusual structure
Phylogeny analysis of flax NBS‐LRR genes
NBS region: P‐loop to WMA
HMMER 3: profile building and alignment
Phylogeny method : Parsimony (protpars ; Phylip suite )
Dots : coalescent points determined by comparing with Poplar & Arabidopsis NBS Sequences
NBS‐LRR gene expression analysis
In silico expression:
• In total, 23% of predicted genes had EST expression evidence, with the highest for g2659
• Promoter region analysis:
– Uniform distribution of the regulatory elements (WBOX, CBF, GCC and DRE) across the families
Real time gene expression:
Variety: Ayogi (disease tolerant)
Pathogen: Alternaria lini
Tissue collected: 0, 4, 7 and 10 days after infection (DAI)
4 TNL and 1 CNL genes studied
Identification of the genes involved in lignan biosynthesis
Lignan
• Phytoestrogens, structurally similar to several sex hormones (estradiol and estrone)
• Anti‐cancer effects (hormone related: breast, prostate)
• Flax seed contains the lignan secoisolariciresinol diglucoside (SDG)
Why study UDP‐Glycosyltransferases (UGTs)?
• Glycosylation is a terminal and regulatory step of secondary metabolite biosynthesis
• Alters the bioactivity, solubility, bioavailability and transport of metabolites
• In planta functions: hormone homeostasis, detoxification of xenobiotics, and biosynthesis and storage of secondary metabolites
Strategy for identification of flax UDP‐Glycosyltransferases
Mining of WGS using PSPG box
Filtration using various criteria for identification of UDP‐Glucosyltransferase
Phylogenetic and expression analysis
Exploitation of the generated data to identify and characterize genes involved in lignan biosynthesis
Barvkar et al., BMC Genomics 2012
Phylogeny of flax UDP‐glycosyltransferases (UGTs)
New family members
Flax diverged members
Expression of flax UGT genes
Duplicated genes with similar expression
Duplicated genes with different expression
Duplicated genes with one of the genes without expression
Candidate gene with high expression in seeds
Genome wide identification of the regulatory genes
Strategy for prediction of miRNA from flax genome
Mature sequences from miRBase (3430)
Flax genome search using PatMan
Extract miRNA flanking sequences
Folding using RNAfold
Most stable hairpin check using Randfold
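The flank-extraction step of this strategy can be sketched in a few lines of Python: given the coordinates of a mature miRNA hit on a genomic sequence, cut out the hit together with its neighbouring bases so that the region can be passed on for folding. The flank length of 200 bases and the function name are placeholders; the slide does not give the window actually used.

```python
def extract_flanks(genome_seq, start, end, flank=200):
    """Return the hit plus up to `flank` bases on each side.

    Coordinates are 0-based and end-exclusive. The window is clipped
    at the sequence ends. Returns (subsequence, left, right).
    """
    left = max(0, start - flank)
    right = min(len(genome_seq), end + flank)
    return genome_seq[left:right], left, right
```

Returning the clipped coordinates alongside the sequence makes it possible to map a predicted hairpin back to its genomic position.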
Total 116 microRNA genes belonging to 23 families identified
Minimum free energy (MFE) for all miRNAs was ≤ ‐37.70 kcal mol‐1
Barvkar et al., Planta 2013
Flax miRNA genes characterization
RT-qPCR of 14 selected pre-miRNA transcripts
Target prediction
Mature miRNA sequences were searched against the assembled flax EST database
Tool: psRNATarget (http://plantgrn.noble.org/psRNATarget)
miRNA/target duplexes must obey the following rules:
Complementarity: Smith–Waterman algorithm
No more than four mismatches between miRNA and target
No more than two adjacent mismatches in the miRNA/target duplex
No adjacent mismatches in positions 2‐12 of the miRNA/target duplex (5' of miRNA)
Target site accessibility: RNAup calculates unpaired energy (UPE)
Translational inhibition: predict mode of gene expression inhibition
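The three mismatch rules can be expressed as a simple filter. The sketch below is not psRNATarget's scoring scheme: it assumes an ungapped duplex of equal length, counts G:U wobble pairs as mismatches, and uses illustrative function names.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def mismatch_profile(mirna, target_site):
    """List of booleans; entry i is True if miRNA position i+1
    (counted from the miRNA 5' end) is not Watson-Crick paired."""
    mirna = mirna.upper().replace("T", "U")
    site = target_site.upper().replace("T", "U")
    if len(mirna) != len(site):
        raise ValueError("ungapped duplex expected: equal lengths required")
    # the miRNA read 5'->3' pairs with the target site read 3'->5'
    return [(m, t) not in PAIRS for m, t in zip(mirna, reversed(site))]


def passes_rules(mirna, target_site):
    """Apply the three mismatch rules to one miRNA/target duplex."""
    mm = mismatch_profile(mirna, target_site)
    if sum(mm) > 4:                      # rule 1: at most four mismatches
        return False
    run = 0
    for is_mismatch in mm:               # rule 2: at most two in a row
        run = run + 1 if is_mismatch else 0
        if run > 2:
            return False
    # rule 3: no adjacent mismatches within positions 2-12 (indices 1..11)
    for i in range(1, min(11, len(mm) - 1)):
        if mm[i] and mm[i + 1]:
            return False
    return True
```

Target site accessibility (UPE) and the predicted mode of inhibition are separate criteria and are not covered by this filter.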
Expression analysis of miRNA target genes
A total of 479 (142 non-redundant) potential target transcripts identified
Most of the targets coding for TFs
Flax EST target transcripts identified for 105 (90.51%) miRNAs
52% of identified target transcripts of unknown function
Inverse correlation in expression of miRNA and its predicted target, indicating miRNA‐mediated regulation of the target genes
Flax genetic improvement
Gene regulation
Genome organization and evolution
Genomic resources
Seed development
FA & SDG pathway
Seed proteome
Seed metabolites
Seed miRNA
Molecular markers
Core population
Genetic map
miRNA profiling
Gene promoter
Comparative genomics
Multi‐gene families
Publications
A. V. Rajwade, R. S. Arora, N. Y. Kadoo, A. M. Harsulkar, P. B. Ghorpade and V. S. Gupta (2010). Relatedness of Indian Flax Genotypes (Linum usitatissimum L.): An Inter‐Simple Sequence Repeat (ISSR) primer assay. Mol Biotechnol, 45:161–170. DOI 10.1007/s12033‐010‐9256‐7
S. M. Kale, V. C. Pardeshi, N. Y. Kadoo, P. B. Ghorpade, M. M. Jana and V. S. Gupta (2012). Development of simple sequence repeat markers in linseed using next generation sequencing technology. Molecular Breeding, 30, 596‐606
S. M. Kale, S. G. Kale, V. C. Pardeshi, G. S. Gurjar, V. S. Gupta, R. T. Gohokar, P. B. Ghorpade and N. Y. Kadoo (2012). Inter‐simple sequence repeat markers reveal high genetic diversity among Alternaria alternata isolates of Indian origin. Journal of Mycology and Plant Pathology, 42(2)
V. T. Barvkar, V. C. Pardeshi, S. M. Kale, N. Y. Kadoo and V. S. Gupta (2012). Phylogenomic analysis of UDP glycosyltransferase 1 multigene family in Linum usitatissimum identified genes with varied expression patterns. BMC Genomics, 13:175
V. T. Barvkar, V. C. Pardeshi, S. M. Kale, N. Y. Kadoo and V. S. Gupta (2012). Proteome profiling of flax (Linum usitatissimum) seed: Characterization of functional metabolic pathways operating during seed development. Journal of Proteome Research, 11 (12), 6264‐6276
S. M. Kale, V. C. Pardeshi, V. T. Barvkar, V. S. Gupta and N. Y. Kadoo (2013). Genome‐wide identification and characterization of nucleotide binding site leucine‐rich repeat genes in linseed reveal distinct patterns of gene structure. Genome, 56:91‐99, 10.1139/gen‐2012‐0135
V. T. Barvkar, V. C. Pardeshi, S. M. Kale, S. Qiu, M. Rollin S, R. Datla, V. S. Gupta and N. Y. Kadoo (2013). Genome‐wide identification and characterization of microRNA genes and their targets in flax (Linum usitatissimum). Planta, 237:1149–116
A. V. Rajwade, N. Y. Kadoo, S. P. Borikar, A. M. Harsulkar, P. B. Ghorpade and V. S. Gupta (2014). Differential transcriptional activity of SAD, FAD2 and FAD3 desaturase genes in developing flax seeds contributes to varietal variation in α‐linolenic acid content. Phytochemistry, 98: 41–53
S. M. Kale, D. Jarquin, R. L. Srivastava, P. K. Singh, V. C. Pardeshi, V. T. Barvkar, A. Lowrenz, N. Y. Kadoo, V. S. Gupta (2014). Genomewide association mapping in linseed identifies significant loci for different agronomic traits. Submitted to Molecular Breeding
S. M. Kale, R. L. Srivastava, P. K. Singh, V. C. Pardeshi, V. T. Barvkar, N. Y. Kadoo, V. S. Gupta (2014). Development of core collection of linseed and genetic diversity, population structure analysis using SSR markers. Submitted to Indian Journal of Genetics and Plant Breeding
NCL team
Dr. Narendra Y. Kadoo, Dr. Ashok P. Giri
Dr. Varsha C. Pardeshi, Mrs. Ashwini Rajwade, Mr. Sandip Kale, Mr. Vitthal Barvkar
Collaborators
Dr. Prakash B. Ghorpade, COA, Nagpur
Dr. Abhay Harsulkar, IRSHA, Pune
Dr. R. L. Srivastava, CSAUAT, Kanpur
Dr. Raju Datla / Dr. Mike Deyholos, PBI, Canada
Funding Agency
DBT, India and CSIR, India
Thank you