Transcriptomics as a tool for functional characterization of the grapevine genome
Massimo Delledonne
University of Verona - Italy
Assembly                                Number   N50 (Kb)   Longest (Kb)   Size (Mb)   Percentage of the assembly
Contigs
  All                                   19578    65.9       557            467.5       -
Supercontigs
  All                                   3515     2 065      12 675         487.1       100
  Anchored on chromosomes               191      3 189      12 675         335.6       68.9
  Anchored on chromosomes and oriented  143      3 827      12 675         296.9       60.9
Annotation    Number    Median size (bp)   Total length (Mb)   Percentage of the genome   %GC
Genes         30 434    3 399              225.6               46.3                       36.2
Exons CDS     149 351   130                33.6                6.9                        44.5
Introns CDS   118 917   213                178.6               36.7                       34.7
Intergenic    30 434    3 544              261.5               34.7                       33.0
tRNA*         600       73                 0.04                NS                         43.0
miRNA**       164       103.5              0.002               NS                         35.9
Orthology              Number of orthologous proteins   Identity percent
Populus trichocarpa    12 996                           72.7
Arabidopsis thaliana   11 404                           65.5
Oryza sativa           9 731                            59.8
Common to Eudicotyledons: 10 547 orthologous proteins
Common to Magnoliophyta (flowering plants): 8 121 orthologous proteins
25 years of Arabidopsis molecular biology and genetics have now yielded experimental proof of function for 3,500 genes out of 25,000
1 GAATTCACGC TTTAAGGCTA TGGCCACCTT TAAATAAAGT ACAGCATTAC TAAAAAAAAA
61 TAAATAAATA ACCAATACAA ATCTTTTCAG AGACAAATGC ATTCTCTGAC ATCTGAGGTT
121 ACAGCAAATC TCTTCTTCAC CTGTTTGCTT GTTTAGAGTT GTAATATTTG CTTTGGTGTA
181 GAGCTGAAGA CATAAATTGG TAACCAATGG AATTATCTGG CCTCAGACTT TATTTATTTT
241 CATCATTTAT TTCACTGATG TGCAAATTTA TTCCGTACCA GCAAATGTCA ATTTAATTAT
301 ATTCTACAGT ACACAGTGAA TCATGTATAC TTAGTCAAGT TGTAAATACA CTAAACCATA
361 TAAACTCACA ACAGTATATC AGCTCATGAT GGGTAAATGA CTTTTCCCTG AGAAAGAGTA
421 TCTGTTTAAC CTGCATGATC TCACTCTTTA GTATTTGCTT CTTTAGTCGA CGTTTGTTTC
481 CTAGTTTTGA ATATAATCAT GATATGGAGA GACAAGTGAA ATCACCACAA TTTTGTTTTC
541 CAAAATGGGA GACTATGCAA ATGCTGAAAT GAGAATTAAT ACATCCAAAA TATCGAACCA
601 CAATTATGGC TTTGCTTTAC TTTTTGCCCG TAAGAGACAT GTGGCCTAGA ATAGGTGGCA
661 GGTATTCCTA CCACAACCTT GCTTAGCATA GTGGTTGACT AAATATAAAT TTTAGAGATG
721 AAGGTTGTTC TATACCCAGA TTTCAACGTG ATTGCTATGC CCACTTCACT TTCTTTAAAA
781 TACATATTTT TCTTACTTCT CACTTTCTTT TTCTTCTTGG TTGACATTTT TTGGCTCAGG
841 GATTTTTTTT TTCCTTATGA TCTCAAGAAA TTTTTCTCAT TGAAAAAGAC ATAATCGTGC
901 TGGGAGTGGT GGCTCATGCT TGTAATCCCA GCACTTTGGG AGGCTGAGGC TGGTGGATCA
961 CCTGAGGTCA GCAGTTACAG ATGAGCCCGG CCAAAATGGT GAAACCTCAT CTCTACTAAA
1021 AATACAAAAA TTTGCCAGGT GTGGTGGCAG GCACTTGTAA TCCCAGCCAC TCGGGAGGCT
1081 GAGGCAGGAG AATCGCTTGA ACCCAGGAGG CAGAGGTTGC AGTGAGCCAA GATCATTCCA
1141 TTGGACTCTA GCAGGGTGAC AAGAGCAAAA CTCCATCTCA GGAAAAAAAA AATCATAAAT
1201 TTTCCCATAT GAAAAAAATA ACACAAGATC CGGAATACAG AGAGGAGCAT AATCCTTTGC
1261 AGGTCATAGA TGTAATCTTT CTTCCAGGAA AAATTTATTT CAGATAAGAC CAGAATTGGA
1321 AACATATTCC ATGCCGTCAG ATAGCACTGG CTTAGGAGAC GAATGAGGAG GAGCCTGCAG
1381 GCTACCTCAA GGATAAGAAG CAGGCAAAAG GCAAGCACAG GGGCGGCATG CACTCACACT
1441 GGGGCTGCTC CTTCCTGGGC AAGTTTCAGA AACTCACTGA CAGAGCTAGC AGCTCCCATA
1501 GAGATGAATG CCCATGTTTT CCCGAAGGGA GAACTGATGC TTAGAAAGGC TGAATGACTT
[Figure: diagram of a genome with its genes labelled Gene 1 to Gene 10]
Orchestration of gene expression is at the basis of the mystery of life
If we could monitor the expression of the whole set of genes that are present in a cell or tissue, we could:
• make a “genetic catalog” of the biological processes going on in those cells
• understand the set of genes required for those processes
And by comparing such “gene expression profiles” from two different cell types, we could learn what makes those cells different from each other.
Microarray market keeps evolving
• Affymetrix dominates the market of whole-genome chips (reproducibility, platform diffusion, platform automation etc.), but it is NOT flexible and is EXPENSIVE. Customization is nearly impossible ($$$!!!!).
• A number of competitors offer more advanced or flexible technologies (Nimblegen, Illumina, Agilent etc.). Customization is possible, but expensive.
CombiMatrix Microarray
Exceptional quality control: each spot of each microarray is 100% functionally tested
High flexibility: each array contains up to 90,000 different sequences. In-house synthesis
Arrays are re-usable: cost of the analysis significantly reduced
Electrode 1 Electrode 2
CGTGTACGTCGATGTCTACTGTT
TTATTATCAAATGTACAACTGTT
• Software applies voltage to sets of specific electrodes
• Electrode activation controls chemical reactions at each individual electrode on the microarray
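The electrode-addressed synthesis described above can be pictured with a small simulation. This is only a sketch under simplifying assumptions, not the CombiMatrix software: the function name `synthesis_schedule`, the fixed base order and the idea that one base is offered per step are all hypothetical, and the chemistry and synthesis direction are ignored. The two sequences are the ones shown for Electrode 1 and Electrode 2.

```python
def synthesis_schedule(targets, base_order="ACGT"):
    """Simulate which electrodes are activated at each base-delivery step.

    targets: dict mapping an electrode name to the sequence to build on it.
    Returns (steps, grown), where steps is a list of (base, [active electrodes]).
    Hypothetical model: bases are offered in a fixed cycle, and an electrode
    is activated only when the offered base is the next one it needs.
    """
    for electrode, seq in targets.items():
        if set(seq) - set(base_order):
            raise ValueError(f"{electrode}: sequence has bases outside {base_order}")

    grown = {electrode: "" for electrode in targets}
    steps = []
    while any(grown[e] != targets[e] for e in targets):
        for base in base_order:
            # Electrodes whose next required base is the one being offered.
            active = [e for e in targets
                      if len(grown[e]) < len(targets[e])
                      and targets[e][len(grown[e])] == base]
            if active:
                for e in active:
                    grown[e] += base
                steps.append((base, active))
    return steps, grown


targets = {
    "Electrode 1": "CGTGTACGTCGATGTCTACTGTT",
    "Electrode 2": "TTATTATCAAATGTACAACTGTT",
}
steps, grown = synthesis_schedule(targets)
print(len(steps), "activation steps; all probes complete:", grown == targets)
```

Each electrode grows a different sequence from the same stream of reagents because only the selected electrodes are activated at each step.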
90K sensitivity calculation with dynamic range
Very good signal of hybridization due to good hybridization and washing procedures
Scatter plot with pair-wise correlation coefficients determining limit of reproducibility
[Figure: three scatter plots comparing signal between pairs of hybridizations of the same array (panel labels: 3 and 5; 5 and 6; 2 and 7). Value of correlation coefficients: 0.99, 0.98, 0.97. Labels: up to 5 hyb; after 5 hyb]
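A pair-wise correlation coefficient of this kind can be computed as in the sketch below. It assumes the Pearson coefficient on per-spot signal values (for example log2 intensities). The slide does not say which coefficient or transformation was used, so both are assumptions, as are the function names and the toy numbers.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length signal lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlations(hybridizations):
    """Correlation for every pair of hybridizations.

    hybridizations: dict mapping a hybridization label to its list of
    per-spot signals (same spot order in every list).
    """
    return {
        (a, b): pearson(hybridizations[a], hybridizations[b])
        for a, b in combinations(sorted(hybridizations), 2)
    }


# Toy, made-up signals for three hybridizations of the same array.
toy = {
    "hyb1": [8.1, 9.4, 12.0, 15.2, 10.3],
    "hyb2": [8.0, 9.6, 11.8, 15.0, 10.5],
    "hyb3": [8.4, 9.1, 12.3, 14.7, 10.0],
}
for pair, r in pairwise_correlations(toy).items():
    print(pair, round(r, 3))
```

A coefficient that falls as the same array is re-used more often gives a practical limit on how many hybridizations remain reproducible.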
Centre for plant functional genomics
Combimatrix Synthesiser
Grape genome project: The centre is in charge of the production of the Vitis vinifera “complete gene chip”. Two chip versions released so far. Collaboration with USA, Portugal, Spain, Germany and FRANCE
Other microarrays currently synthesised and analysed:
• Homo sapiens (University of Verona)
• Sheep (University of Udine and University of Viterbo)
• Medicago truncatula (University of Turin, CNR)
• Barley (CRA, Piacenza and Rome)
• Lotus japonicus (CNR Naples)
• Arabidopsis (University of Verona)
• Tomato (University of Verona, University of Naples, CNR)
• Tobacco (University of Verona)
Agilent Bioanalyzer
Real-Time PCR
1st grape probe design: EST based
• Total Grape database:
– 19583 TCs and ETs
– 2394 singletons (on 14550)
– Total of 21977 Grape sequences
• Probe selection:
– High specificity across the total database, to avoid cross-hybridization
– Without secondary structure
– With the same range of melting temperature and GC content
– 35-40 mers, which is the limit of the technology
– Preferentially in the 3’ end
• OligoArray design on Grape sequences (controls not included):
– 17454 oligo pairs
– 1908 single oligos
– 2615 sequences rejected
• Total number of Grape transcripts represented: 19363
• Negative (QC, virus, bacteria) and positive controls (spikes, housekeeping genes) added
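Some of the probe selection criteria listed above (length, GC content, melting temperature, preference for the 3’ end) can be sketched as simple filters. This is not how OligoArray works internally. The thresholds and function names are hypothetical, the melting temperature uses the basic GC-count approximation Tm = 64.9 + 41 (G + C - 16.4) / N rather than a thermodynamic model, and secondary structure and cross-hybridization are not checked.

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def basic_tm(seq):
    """Approximate melting temperature (deg C) from GC count and length."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def candidate_probes(transcript, min_len=35, max_len=40,
                     gc_range=(0.40, 0.60), tm_range=(60.0, 75.0)):
    """List (start, probe) windows passing the filters, 3'-most first.

    The GC and Tm ranges are illustrative defaults, not values from the talk.
    """
    transcript = transcript.upper()
    hits = []
    for length in range(min_len, max_len + 1):
        for start in range(len(transcript) - length + 1):
            probe = transcript[start:start + length]
            if (gc_range[0] <= gc_content(probe) <= gc_range[1]
                    and tm_range[0] <= basic_tm(probe) <= tm_range[1]):
                distance_from_3p = len(transcript) - (start + length)
                hits.append((distance_from_3p, -length, start, probe))
    hits.sort()  # closest to the 3' end first, longer probes first on ties
    return [(start, probe) for _, _, start, probe in hits]


# Toy transcript (made up) with balanced base composition.
toy_transcript = "ACGT" * 20
best_start, best_probe = candidate_probes(toy_transcript)[0]
print(best_start, len(best_probe), round(basic_tm(best_probe), 2))
```

Keeping all probes within one narrow Tm and GC range is what lets a single hybridization and washing condition work for the whole array.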
2nd grape probe design: based on the 8.4X assembly
• Total Grape database:
– 19583 TCs and ETs
– 2394 singletons (on 14550)
– 3494 genes predicted from genomic sequences
– Total of 25471 Grape sequences
• Probe selection:
– High specificity across the total database, to avoid cross-hybridization
– Without secondary structure
– With the same range of melting temperature and GC content
– 35-40 mers, which is the limit of the technology
– Preferentially in the 3’ end (after exclusion of the last 30 bases)
• OligoArray design on Grape sequences (controls not included):
– 21451 unique
– 3111 with cross-hybridization
• Total number of Grape transcripts represented: 24562
• Negative (QC, virus, bacteria) and positive controls (spikes, housekeeping genes) added
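The split between “unique” probes and probes “with cross-hybridization” can be illustrated with a crude proxy: a probe is flagged when it shares an exact stretch of k bases with any sequence other than its own target. This is an assumption made for illustration only. The actual design relied on OligoArray, whose specificity test is not reproduced here, and the function names, the value of k and the toy sequences are hypothetical. Reverse complements are ignored.

```python
def kmers(seq, k):
    """Set of all substrings of length k in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify_probes(probes, database, k=15):
    """Label each probe 'unique' or 'cross-hybridization'.

    probes:   dict mapping a target sequence id to its probe sequence
    database: dict mapping every sequence id to its full sequence
    A probe is flagged if it shares at least one exact k-mer with a
    database sequence other than its own target (hypothetical criterion).
    """
    index = {}
    for seq_id, seq in database.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(seq_id)

    labels = {}
    for target_id, probe in probes.items():
        off_targets = set()
        for kmer in kmers(probe, k):
            off_targets |= index.get(kmer, set()) - {target_id}
        labels[target_id] = "cross-hybridization" if off_targets else "unique"
    return labels


# Toy, made-up database; k is kept small so the example is readable.
toy_db = {
    "g1": "AAAAACCCCCGGGGG",
    "g2": "TTTTTGGGGGTTTTT",
    "g3": "ACACACACACACACA",
}
toy_probes = {"g1": "AAAAACCCCC", "g2": "TTTTTGGGGG", "g3": "ACACACACAC"}
print(classify_probes(toy_probes, toy_db, k=5))
```

In the toy data the probe for g2 is flagged because its GGGGG stretch also occurs in g1.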
3rd grape probe design: based on the 12X assembly and on deep sequencing of the grape transcriptome
Predicted genes validated only by a 454 sequence: 67
Predicted genes validated only by an EST sequence: 84
Predicted genes covered by 454 OR EST: 609
2500 “orphan genes” estimated to be identified in the grape genome -> dedicated microarray for further validation
Exercise on chromosome 3
LEVEL 1: Structural Genomics
Sequencing, Mapping: Genomic sequencing; Molecular polymorphisms; Physical map; High resolution genetic map; QTLs for resistance
LEVEL 2: Functional Genomics
Transcriptomics, Bioinformatics: EST; Deep cDNA sequencing; Microarray; Web portal; Database; Data analysis
Proteomics, Metabolomics: 2D-E; MS; Metabolites analysis
LEVEL 3: APPLIED PROJECTS
Germplasm characterization; Berry maturation; Biotic stresses; Abiotic stresses; Reproduction / Development
UniVR Platform for the analysis of gene expression
Samples quality control -> Hybridization -> Data analysis -> Validation of results
Exercise with the prototype microarray carrying 24,000 grape genes
The objectives that we intend to achieve are the following:
• Functional characterization of the genes involved in the process of berry ripening and withering
• Functional characterization of the genes involved in the mechanisms of plant resistance to diseases
• Application of a “genetical genomics” approach to grape
Through extensive microarray analysis, quantitative trait loci (QTL) analysis is applied to gene expression levels to identify the genomic loci that control the observed expression changes (eQTL), thus establishing “genetic regulatory networks”.
Exercise: in collaboration with the “Institut Des Sciences De la Vigne et Du Vin” in Bordeaux, analysis of a population of 130 individuals segregating for root resistance to iron chlorosis and grape quality.
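The eQTL idea, treating the expression level of a gene as a quantitative trait and asking which locus explains it, can be sketched with a single-marker scan. This is a toy illustration under stated assumptions, not the analysis planned in the project. Genotypes are coded as 0/1/2 allele counts, the association measure is the squared Pearson correlation from a simple linear regression, the names and data are made up, and real studies add interval mapping, significance thresholds and multiple-testing control.

```python
def r_squared(genotypes, expression):
    """Share of expression variance explained by a marker (simple regression)."""
    n = len(genotypes)
    if n != len(expression) or n < 2:
        raise ValueError("need equal-length lists with at least 2 individuals")
    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n
    sge = sum((g - mean_g) * (e - mean_e) for g, e in zip(genotypes, expression))
    sgg = sum((g - mean_g) ** 2 for g in genotypes)
    see = sum((e - mean_e) ** 2 for e in expression)
    if sgg == 0 or see == 0:
        return 0.0  # a non-segregating marker or flat expression explains nothing
    return (sge * sge) / (sgg * see)


def best_eqtl(markers, expression):
    """Scan all markers for one transcript; return (best marker, all scores).

    markers: dict mapping a marker id to the genotype code of each individual
    expression: expression level of one gene in the same individuals
    """
    scores = {m: r_squared(g, expression) for m, g in markers.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy, made-up data for six individuals of a segregating population.
toy_markers = {
    "marker_1": [0, 0, 1, 1, 2, 2],
    "marker_2": [0, 1, 0, 1, 0, 1],
}
toy_expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.1]
best, scores = best_eqtl(toy_markers, toy_expression)
print(best, {m: round(s, 3) for m, s in scores.items()})
```

Repeating the scan for every gene on the array gives the gene-to-locus links from which regulatory networks can be proposed.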
Project: “MOLECULAR CHARACTERIZATION OF RIPENING AND WITHERING IN CORVINA BERRY, THE MAJOR COMPONENT OF AMARONE WINE” (BACCA)
Veneto regional wine district, 2005
1,000,000 euro: ORVIT 600,000, Veneto region 400,000
To effectively improve a process, it is imperative to understand its fundamental mechanisms as well as their mechanisms of regulation.
Wine is the final product of the biological transformation of grape berries, whose characteristics determine the quality of the wine.
OUR GOAL:
To study ripening and withering in grape berries in order to identify the genes (“the instructions”) involved in the process and in the accumulation of compounds characterizing high-quality wines
Vitis vinifera cv. Corvina clone 48
2005, 2006 and 2007
• 3 areas of Verona province:
• High and moderate altitude
• Mode of cultivation: pergola or spalliera
Bardolino zone, Valpolicella, East Verona zone
Masi Faettini Borghetti Pule
Boni Az.GIV Zeni Cinquetti VillaMedici
Caloini RamaSolfa Danzi Aldegheri
Pasqua
Sampling dates:
2005: 3-08-2005, 22-08, 15-09, 18-10, 17-11, 15-12
2006: 8-08-2006, 4-09, 18-09, 23-10, 13-11, 18-12
RIPENING, WITHERING
• RNA and protein extraction from skin and pulp
• Metabolites extraction from berry and wine
18-07-2007 8-08 29-08 10-10 30-10 25-11
wine making
Transcriptomics
Proteomics
Metabolomics
[Figure C2: levels (axis label: g/Kg, scale 0 to 900) of 3-oxo-α-ionol, vomifoliol, 3-OH-β-damascone and 3-OH-β-ionone by sampling date (7/8/02, 23/8/02, 29/8/02, 4/9/02, 18/9/02, 25/9/02)]
• Correlation of gene expression results from microarray analysis with those coming from metabolomics, proteomics and with data from wine making experiments
• “Phenotypic plasticity” study on the 3 areas of production