european networking summer school (enss) plant genomics...
TRANSCRIPT
28.7.2009
European Networking Summer School(ENSS)
Plant Genomics & Bioinformatics
Standing Committee for Life, Earth and EnvironmentalSciences (LESC)
28.7.2009
Supported by:Austria Fonds zur Förderung der wissenschaftlichen Forschung (FWF)
Belgium: Fonds voor Wetenschappelijk Onderzoek (FWO)
Finland: Academy of Finland - Research Council fo Biosciences and Environment
Ireland: Irish Research Council for Science Engineering and Technology (IRCSET)
Italy: Consiglio Nazionale delle Ricerche (CNR) - Dipartimento Agroalimentare
Netherlands: Nederlandse Wetenschappelijk voon Onderzoek (NWO)
Norway: The Research Council of Norway
Poland: The Polish Academy of Science
Romania: Ministry of Education and Research
United Kingdom: Biotechnology and Biological Sciences Research Council (BBSRC)
28.7.2009
AIMS
• Support plant genome research networks based bytraining of young investigators
• Summer courses with theoretic and practical training
• Access to technologies, resources, skills and know-how
28.7.2009
ENSS 2009 Plant Bioinformatics, Systems and Synthetic Biology
27-31 July 2009University of Nottingham, UK
Natalio Krasnogor, Jaume Bacardit, Malcolm Bennett
ENSS 2010 Plant Epigenetics
September 2010Leibniz Institute of Plant Genetics and Crop
Plant Research (IPK) in Gatersleben, GermanyMichael Florian Mette
28.7.2009
Comparative and Functional Genomics
Functional genomics is the understanding of the function of genes and other parts of the genome
Comparative genomics involves the use of computer programs to line up multiple genomes/genes for the identification of similarities
28.7.2009
What is needed to do comparative and functional genomics?
model organism
Why are model organisms important?
Criteria for a good model organism?
Relationship of the model to important crop plants?
How many genes are the same?
Why using knock out/down mutants?
How will they help us determine gene function?
28.7.2009
What will you hear?
Background on annotating gene function using comparativegenomic tools
Example for the use comparing genomes/genes from individuals between populations to determine their function
Example to show how these tools can be employed to get a glimps on the function of a yet unknown gene
28.7.2009
All Starts With Genome Sequencing Projects
http://www.ncbi.nlm.nih.gov/genomes/leuks.cgi
http://www.ensembl.org/info/about/species.html
How many plant genomes have been sequenced?
28.7.2009
Plant Genome Sequencing Projects
28.7.2009
Improvements in the rate of DNA sequencing overthe past 30 years and into the future
Stratton et al., 2009
28.7.2009
Quote from Joe Ecker IARC2009Capillary sequencing – 500 people – 7 years – 70.000.000 $
Perlegen sequencing – 50 people – 1 year – 70.000 $
Next generation sequencing – 2 people –7 days – 7.000 $ - 50 x coverage
28.7.2009
Paradigm Change
28.7.2009
After genome sequencing still many questions remain – example Arabidopsis
MASC 2007
MASC 2009
28.7.2009
Sequenced genomes – the basis to address questions on
Function of all genes
Role of single nucleotide polymorphisms (SNPs, natural variation)
Where? - Localization (organs, tissues, cellular, sub-cellular)
Functional redundancies/diversification of gene families
Role of noncoding regions and repeats in the genome
Biological role(s)
When? – Regulation (transcriptional, post-transcriptional, post-translational,..)
Interacting partners - Networks
Role of alternative splicing variants
28.7.2009
Functional Genomic ToolsSequences genome, full-length cDNA clones
Gene knock-outs, knock-downs (T-DNA, transposon, amiRNA, tilling, gene targeting, collection of natural variants, ....)
Methods for studying functions of nonprotein-coding sequences
Comprehensive analysis of gene expression (microarray, deepsequencing, cell sorting, laser dissection, reporter constructs, … )
Large-scale protein analyses (proteomics, protein arrays, large scaleY2H, interactomes-networks, 3D structures)
Metabolomics
-omics
28.7.2009
Comparative genomics between species
comparison of genomes from different taxa
28.7.2009
Comparative genomics within a speciescomparison of genomes from different individuals between populationsthat might be differentially adapted to particular environments ….
Weigel Lab
28.7.2009
What to compare? on the structural level:
sequence similarity (nucleic acid, protein, domains)gene location (synteny)gene structure (length, number of exons) amount of noncoding DNA highly conserved regions (fundamental/essential genes)highly/less polymorphic regions (indication of adaptation,
selection)
on the functional level: expression pattern epigenetic regulation post-transcriptional translational regulation subcellular localization interactionspost-translational regulation/modification
28.7.2009
How to compare?Search tools for homologies: BLAST
FASTA
28.7.2009
How to compare?http://www.expasy.org/
28.7.2009
28.7.2009
28.7.2009
28.7.2009
Example
1. BLAST:
Proteome analysis in Poplar result a peptide of MILSALLTSVGINLGLC
UniGene
2. UniGene
28.7.2009
G – Entrez Gene
28.7.2009
G – Entrez Gene
28.7.2009
Domains – Function - Localization?
28.7.2009
Example
1. BLAST:
Proteome analysis in Poplar result a peptide of MILSALLTSVGINLGLC
28.7.2009
G – Entrez Gene
28.7.2009
G – Entrez Gene
28.7.2009
alternatively spliced
28.7.2009
GBrowser
gene family 13 members
28.7.2009
Synteny SearchCan the annotation of one member of the gene family in any plant species
guide to the function?
28.7.2009
Search for HYP1
28.7.2009
Common Function?
28.7.2009
Other Synteny Tools
http://chibba.agtec.uga.edu/duplication/
28.7.2009
Plant Genome Duplication Database (PGDD)
http://chibba.agtec.uga.edu/duplication/
28.7.2009
Intra-genome Dotblot analysis at PGDD
duplication of chr 3 and chr 2
non-synonymous substitution (Ka)synonyoumous substitution (Ks)
28.7.2009
Cross-genome Dotblot analysis at PGDD
28.7.2009
PGDD
microsynteny
Vitis vinifera
Medicagotrunculata
28.7.2009
Back to TAIR
28.7.2009
SUBA-Database
28.7.2009
novel putative function
HYP1
http://aramemnon.botanik.uni-koeln.de/
15 related proteins
28.7.2009
What is AtGFS10?http://www.ncbi.nlm.nih.gov/sites/gquery
28.7.2009
Topology Prediction
28.7.2009
Strongly predicted to be in the secretroypathway
28.7.2009
Expression Analyses
Poplar homologs
http://bbc.botany.utoronto.ca/efp/cgi-bin/efpWeb.cgi
28.7.2009highest in roots and young leaves
Expression Analyses
highest in male catkins
highest in leaves
28.7.2009
Upon water stressand
day/night cycles
28.7.2009
co-expressed gene in Arabidopsis
28.7.2009
Expression of the Arabidopsis homolog uponosmotic stress
28.7.2009
Still no entry in Proteins Wiki
http://proteins.wikia.com/wiki/
28.7.2009
Summary Peptide/Protein AnnotationMILSALLTSVGINLGLC
• belongs to gene family –• members HYP1 (hypothetical protein 1)• RXW8 (name of cDNA) • ERD4 (early responsive to dehydration)• AtGFS10 (protein involved in vacuolar sorting fo storage proteins,
green fluorescent seed, gfs mutant)
Quote: „no closely related homologs of GFS10 in the Arabidopsis genome“but topology similar – Aramemnon and 39% amino acid homology
• integral membrane protein
• plasma membrane, secretory pathway
• induced during water stress
wild-type
secrete vacuole-targeted GFP out of the seed cells
28.7.2009
What next?
Poplar gene with homolog in Arabidopsis (81%) First functional tests in Arabidopsis: knock outs
over-/ectopic-/ inducible expressionin vivo localization – XFP, immunointeraction with other proteins
Design experiments for functional characterization
28.7.2009
QTL-Analysis or Association Mapping
28.7.2009
Understanding functional consequences of natural variation: trichome patterning in
Arabidopsis
Example for the comparison of genomes/gene and their function from individuals between populations
Julia Hilscher, Christian Schlötterer, Marie-Theres Hauser
28.7.2009
Trichomes in Arabidopsis• Single cell structure• Present on leaves, stem, petioles, sepals• 32C->polyploid• Model for cell fate specification
A B
DC
28.7.2009
Trichome function• Arabidopsis
– Protection against herbivoryMauricio & Rausher (1997), Handley, Ekbom & Ågren (2005)
• A. lyrata– Protection against herbivory
Kivimäki, Kärkkäinen, Gaudeul, Løe & Ågren (2007)
• Other plants– Decrease of water loss– Increased light reflection– Freezing tolerance– Ca++ homeostasis– Heavy metal storage– Metabolite production and storage – Cotton fiber development
28.7.2009
Cross-species function of Trichomeregulators
Cotton fiber development
Trichome development of Arabidopsis 35S::GaMYB2 in Arabidopsis
„hairs on seeds“
28.7.2009
Trichome density differs in natural populations
0.0
0.5
1.0
1.5
2.0
2.5
3.0
3.5
4.0
Gr-
1Bl
h-1
Bur
-0/1
Ba-
1B
uckh
orn
Pas
sLe
rN
ok-0
Ga-
0Lz
-0W
t-5O
y-0
Di-0
Van
-0B
rest
-1C
t-1/4
No-
0A
n-1
Wa-
1/3
83-3
Tsu-
0Ki
n-0
Pi-0
Pa-
1/1
T-20
-IA
g-0/
4T-
12-I
I-33
N7
Lip-
0/1
RLD
-1S
toc
5T-
29-I
Mh-
0M
t-0G
e-0
Tsch
a-1
Hi-0
Kar
a7-2
Sf-2
Pa-
3D
ül/4
Nd-
0Es
t-1Yo
-0S
t-0 Col
Aa-
0S
haG
rivo-
1B
ayU
k-3
Es-0
9481
Te-0
Ws-
0Te
in-1
Bla-
1K
as-1
Ms-
0R
yb2
Mas
/2Ita
-0C
ond
Per
-1/5
Lim
384
-1Al
c-0/
1Ts
ar/1
Edi
-0/2
Gy-
0P
og-0
/2B
as-2
/320
-13
Rub
-1/2
Can
-0
Can-0Gr-1
1.5 mm
28.7.2009
QTL mapping
x
F1x F1
F2
28.7.2009
Composite Interval Mapping
• 266 F2 individuals
• 24 microsatellites
• A single QTL on chromosome 2 explains 33% of the variation in trichome number
0
20
40
60
80
100
120
0 10 20 30 40 50 60
LR0ng
a63
Ath
ZFP
G
T27K
12-S
p6
T2K
10
0
20
40
60
80
100
120
0 10 20 30 40 50
LR0
[cM]
CIW
2
F19G
14A
nga2
235
ngaT
3B23
nga3
61ng
a347
0ng
a369
2
ngaT
2P4
Chr I Chr II
0
20
40
60
80
100
120
0 10 20 30 40 50 60
LR0
nga1
62
3g28
3 7
F1P
2TG
F
nga6
0
20
40
60
80
100
120
0 10 20 30 40
LR0
[cM]
JV30
/31
nga8
CIW
7
nga1
139
Chr III Chr IV
0
20
40
60
80
100
120
0 10 20 30 40
LR0
[cM]
nga1
51 ng
a13
9 AtS
0191
nga1
29
Chr V Trichome numberTrichome density
28.7.2009
Components of trichomedevelopment
Patterning involves positive and negative regulators with redundant functions
WTtry cpc
28.7.2009
Model of epidermal patterning
Phenotypic class of mutants
Gene Action
no trichomes“glabrous”
GL1 Positive regulators
TTG1
decreased trichome #
GL3/EGL3
GL2
nests of trichomes
TRY Negative regulators
single-repeat R3 MYB genes,
act non-cell autonomous
increasedtrichome #
CPC
GL3 GL3
TTG1
GL1GL2, …
CPC, TRY, ETC1, ETC2, TCL1, TCL2,ETC3
modified from Larkin et al., 2003
All of the genes or their paralogs are also involved in root hair patterning
28.7.2009
Composite Interval Mapping
TCL2
28.7.2009
Fine mapping on chromosome II• Selective genotyping
– 465 F2 with extreme phenotype– 10 additional markers
• 66 recombinants in the initial interval -> expected mapping resolution of 40kb
• 288 kb mapping interval with remainig 87 genes
ngaT
3B23
nga3
000
nga3
023
HD
UP
3038
Sm
all M
ybs
nga3
073
nga3
098
nga3
61ng
a310
82ng
a311
9
nga3
168
nga3
341
% re
com
bina
ntch
rom
osom
es
0
1
2
3
4
5
6
28.7.2009
3 candidate genes
Three single-repeat R3 MYB genes are located in the fine-mapping interval
At2g30440
TCL1 ETC2
KIS
TCL2
At2g30430
1kb
28.7.2009
Association with trichome number
CPC
84-I
Gr
Can
Oy
Ler
Col
84-I
CPC
Gr
Can
Oy
Ler
Col
Ws
20-13
84-I
CPC
Gr
Can
Oy
Ler
Col
Ws
20-13
low trichome number accessions
high trichome number accessions
Small MYB_A
Small MYB_B
Small MYB_C
28.7.2009
Complementation test
mutants are complemented with the wildtype alleles from high and low trichome number accessions
smal
lmyb
Col
Sm
allM
ybG
r-1
smal
lMY
BLe
r
smal
lMY
BC
an-0
Sm
allM
YB
20-1
3
# of
tric
hom
es/le
af
0
100
200
300
400
F1 cross to
low trichome numberaccession
high trichome numberaccession
28.7.2009
Identification of QTN
Many SNPs show association with trichome phenotype
Screen many high/low trichome accessions for recombinants
28.7.2009
Identification of QTN: 2 candidates-53 +55
28.7.2009
Identification of QTN
Gr-1
Can-0
A
C A
G-53 bp +55 bpposition
Gr-1
Can-0
C
C G
G
Gr-1
Can-0
A
A A
A
allele background
SNP status
wt
+55 Gr-1
-53 Can-0
+55 Can-0
-53 Gr-1
28.7.2009
Identification of QTNTransgenic complementation
0
50
100
150
200
250
300#
of tr
icho
mes
/leaf
+55Gr-1 +55Can-0
Gr-1Can-0
- 53Can-0 - 53Gr-1
allele background
ANOVA
SNP state: p=0.001
allele background: p=0.35
Mean ± s.e.m. of n>21 T1 lines each, 3 leaf positions
28.7.2009
QTN is highly conserved among family members
+55 mutation leads to an amino acid replacement: Lysine (K) to Glutamate (E)
K: ancestral, yet unknown importance
Alpha Helices 1-3 constituting R3 MYB domain with conserved W residues forming cluster
Predicted bHLH interaction motif: [DE]Lx2[RK]x3Lx6Lx3R (Zimmermann et al., Plant J 2004)
Required for CPC movement (Kurata et al., Development 2005)
H1 H2 H3
CPL3
TCL1
CPL3
TCL1
K19E
TCL2
TCL2TCL1
28.7.2009
Competition between activators and repressors
TTG1
GL3/EGL3
TTG1
GL3/EGL3smallmyb
target genes
GL1
target genes
smallMyb
GL1
TTG1
GL3/EGL3
competition
28.7.2009
3 possible factors for trichome patterning
Binding strength to GL3 or the regulatory region
Movement rate to neighboring cells
Stability
Lysine modification by: methylationN-glycosylationubiquitylationsumoylationacetylation
Glutamate modification by: methylation
28.7.2009
Next steps
Biochemistry
Cell Biology
28.7.2009
Sumoylation? Use ExPASy
28.7.2009
Take Home Message
Functional genetics & natural variation – powerful tool
Strong QTL or Association
33% of natural variation in trichome number was explainable by a single aa replacement
Classical genetic studies failed to identify the major modifier of trichome number
Typical accession used for functional tests (Ws, Col) have intermediate-high trichome number and the weak suppressor allele
Requirements for success
28.7.2009
Association Mapping
trichome density
In future
– phenotyping ecotypes/accession- look up into the association map- identify the gene with the variable SNPs- functional confirmation
28.7.2009
More sequences * More functions * More to compare
28.7.2009
THANKS
QUESTIONS?