What can (many) sequences tell us? Nuclear receptor function
![Page 1: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/1.jpg)
What can (many) sequences tell us?
![Page 2: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/2.jpg)
Nuclear receptor function
![Page 3: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/3.jpg)
Nuclear receptor family
NR1C1-PPAR
NR1C2-PPAS
NR1C3-PPAT
NR1D1-EAR1
NR1D2-BD73
NR1I3-MB67
NR1I4-CAR1-MOUSE-
NR1H2-NER
NR1H3-LXR
NR1H4-FAR
NR4A2-NOT
NR4A3-NOR1
NR4A1-NGFI
NR2F1-COTF
NR2F2-ARP1
NR2F6-EAR2
NR2E3-PNR
NR2B1-RRXA
NR2B2-RRXB
NR2A2-HN4G
NR3C1-GCR
NR3C4-ANDR
NR3C3-PRGR
NR3A1-ESTR
NR3A2-ERBT
NR3B1-ERR1
NR3B2-ERR2
NR5A1-SF1
NR5A2-FTF
NR1I1-VDR
NR1B3-RRG1
NR2E1-TLX
NR2C1-TR2-11
NR2C2-TR4
NR6A1-GCNF
NR2B3-RRXG
NR2A1-HNF4
NR2A5-HN4
NR0B1-DAX1
NR0B2-SHP
NR3C2-MCR
NR1F3-RORG
NR1F2-RORB
NR1F1-ROR1
NR1A2-THB1
NR1A1-THA1
NR1I2-PXR
NR1B2-RRB2
NR1B1-RRA1
![Page 4: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/4.jpg)
Nuclear receptor structure
Domains: A-B, C, D, E, F
C: DNA binding domain, highly conserved, > 90% similarity
E: Ligand binding domain, conserved protein fold, > 20% sequence similarity
Figure labels: AF-1 (A-B), DNA (C), LBD (E)
![Page 5: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/5.jpg)
The questions
How do ligands relate to activity?
What is the role of each amino acid in the NR LBD?
Which data handling / bioinformatics is needed to answer these questions?
![Page 6: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/6.jpg)
3D structure LBD
(hER)
![Page 7: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/7.jpg)
Available NR data
56 structures in the PDB (>200 now*)
>500 sequences (scattered) (>1500 now*)
>1000 mutations (very scattered)
>10000 ligand-binding studies (secret)
Disease patterns, expression, >1000 SNPs, genetic localization, etc., etc., etc.
This data must be integrated, sorted, combined, validated, understood, and used to answer our questions.
* "Now" was 2007.
![Page 8: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/8.jpg)
Step 1
The first important step is a common numbering scheme, because every structure has its own numbering scheme and the insertions and deletions between species confuse any numbering.
Whoever solves that problem once and for all should get three Nobel Prizes.
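One simple way to picture a common numbering scheme is to number residues by their column in a shared alignment. The sketch below is only an illustration under that assumption: the function name `common_numbering`, the toy sequences, and the starting residue numbers are all hypothetical and do not come from the slides.

```python
def common_numbering(aligned_seq, first_residue_number=1):
    """Map a sequence's own residue numbers to 1-based alignment columns.

    aligned_seq: one row of a multiple sequence alignment, '-' marks a gap.
    first_residue_number: the number the first residue carries in its own
    (structure- or species-specific) numbering scheme.
    """
    mapping = {}
    residue_number = first_residue_number
    for column, symbol in enumerate(aligned_seq, start=1):
        if symbol == "-":
            continue  # a gap consumes a column but no residue number
        mapping[residue_number] = column
        residue_number += 1
    return mapping


# Hypothetical toy rows: one insertion/deletion in each sequence.
human = "MKT-AYIA"
mouse = "MKTGAY-A"

human_map = common_numbering(human, first_residue_number=301)
mouse_map = common_numbering(mouse, first_residue_number=12)

# Residue 304 in the first sequence and residue 16 in the second share a column.
print(human_map[304], mouse_map[16])
```

Any residue can then be referred to by its column number, whatever the native numbering of the structure it came from. The quality of the mapping is only as good as the alignment behind it.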
![Page 9: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/9.jpg)
Large data volumes
Large data volumes allow us to develop new data analysis techniques.
Entropy-variability analysis is a novel technique to look at very large multiple sequence alignments.
Entropy-variability analysis requires ‘better’ alignments than are routinely obtained with ‘standard’ multiple sequence alignment programs.
![Page 10: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/10.jpg)
Part of the big alignment
We see correlations between columns and between ‘things’.
![Page 11: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/11.jpg)
Vriend’s first rule of sequence analysis
If it is conserved, it is important
![Page 12: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/12.jpg)
Vriend’s second rule of sequence analysis
If it is very conserved, it is very important
![Page 13: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/13.jpg)
Consequence:
If something is conserved in each sub-family, it is involved in a sub-family specific function.
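The consequence above can be sketched as a search for alignment positions that are conserved inside every sub-family but differ between sub-families. This is a minimal illustration, not the method used in the talk; the sub-family names and the toy sequences are invented for the example.

```python
def conserved_residue(column):
    """Return the single residue type in a column, or None if it varies."""
    residues = set(column) - {"-"}
    return next(iter(residues)) if len(residues) == 1 else None


def subfamily_specific_positions(subfamilies):
    """Find 1-based positions conserved within each sub-family
    but not identical across all sub-families."""
    first_family = next(iter(subfamilies.values()))
    length = len(first_family[0])
    hits = {}
    for index in range(length):
        per_family = {
            name: conserved_residue([seq[index] for seq in seqs])
            for name, seqs in subfamilies.items()
        }
        if None in per_family.values():
            continue  # variable inside at least one sub-family
        if len(set(per_family.values())) > 1:
            hits[index + 1] = per_family
    return hits


# Hypothetical toy alignment, split into two made-up sub-families.
toy = {
    "subfamily_A": ["MKLDSA", "MKLDTA", "MKLDSA"],
    "subfamily_B": ["MRLESA", "MRLETA", "MRLEGA"],
}

hits = subfamily_specific_positions(toy)
print(hits)
```

Positions that are identical in all sub-families are deliberately left out: by the first two rules they are important for the whole family rather than for one sub-family.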
![Page 14: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/14.jpg)
What is CMA?
QWERTYASDFGRGH
QWERTYASDTHRPM
QWERTNMKDFGRKC
QWERTNMKDTHRVW
Red = conserved
Green = variable
Blue = correlated
Example: (chymo)trypsin
A function is never carried by just one residue.
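The toy alignment on this slide can be used to sketch what "correlated" means. The code below uses a deliberately strict, simplified criterion (two columns count as correlated when the residue in one fully determines the residue in the other, and vice versa); this criterion and the function names are assumptions for illustration, not the scoring used by any particular CMA program.

```python
from itertools import combinations

# The four toy sequences shown on the slide.
alignment = [
    "QWERTYASDFGRGH",
    "QWERTYASDTHRPM",
    "QWERTNMKDFGRKC",
    "QWERTNMKDTHRVW",
]


def column(aln, index):
    """Return one alignment column as a list of residues."""
    return [seq[index] for seq in aln]


def informative(col):
    """Neither fully conserved nor different in every sequence."""
    return 1 < len(set(col)) < len(col)


def perfectly_correlated(col_a, col_b):
    """True when residues in the two columns map one-to-one."""
    if not (informative(col_a) and informative(col_b)):
        return False
    pairs = set(zip(col_a, col_b))
    return len(pairs) == len(set(col_a)) == len(set(col_b))


def correlated_pairs(aln):
    """List all 1-based column pairs that co-vary perfectly."""
    length = len(aln[0])
    return [
        (i + 1, j + 1)
        for i, j in combinations(range(length), 2)
        if perfectly_correlated(column(aln, i), column(aln, j))
    ]


pairs = correlated_pairs(alignment)
print(pairs)
```

On these four sequences the sketch groups columns 6 to 8 together and columns 10 and 11 together, while the two groups do not correlate with each other. Columns in which every sequence differs are skipped because, with so few sequences, they would match any other such column trivially.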
![Page 15: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/15.jpg)
Correlations
Residues can correlate with residues, and when that happens we have found a function, regardless of conservation or variability.
Residues that have a function correlate with that function.
![Page 16: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/16.jpg)
Correlations with wavelength
Residues can also correlate with something else. Example: optimal wavelength for opsin excitation.
Wavelength  Loop1  Loop2
UV          Gln    His
Blue        Asn    Gln
Red/Green   Leu    Gln
![Page 17: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/17.jpg)
Wilma Kuipers Thesis
Correlations with drug binding (so no longer evolution-based…)
![Page 18: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/18.jpg)
Correlation analysis
Receptor
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 ...
Affinity + + + + - - - - - - - - - - - + + + - - - - - - ...
res. 386 N N N N T T T T A A A V V L L N N N Y Y Y Y T T ...
1 = 5HT-1a
2 = 5HT-1b
3 = 5HT-1d
.... ....
• Correlate sequences with ligand binding affinities
• Alignments showed 100% correlation of affinity for pindolol and the absence/presence of Asn386
• Obviously, Asn386 plays an important role in ligand binding
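The 100% correlation can be checked mechanically on the 24 receptors listed in the table on this slide. The sketch below encodes the two table rows as strings and asks which residue type at position 386, if any, predicts affinity perfectly; the function name `perfect_predictors` is made up for this illustration.

```python
# Rows copied from the table above (receptors 1 to 24).
affinity = "+" * 4 + "-" * 11 + "+" * 3 + "-" * 6
residue_386 = "N" * 4 + "T" * 4 + "A" * 3 + "V" * 2 + "L" * 2 + "N" * 3 + "Y" * 4 + "T" * 2


def perfect_predictors(residues, binds):
    """Return residue types whose presence matches binding in every receptor."""
    hits = []
    for amino_acid in sorted(set(residues)):
        if all((res == amino_acid) == bound for res, bound in zip(residues, binds)):
            hits.append(amino_acid)
    return hits


binds = [symbol == "+" for symbol in affinity]
predictors = perfect_predictors(residue_386, binds)
print(predictors)
```

Real affinity data are rarely this clean, so a practical analysis would need a tolerance for exceptions and measurement noise; the all-or-nothing test here only mirrors the example on the slide.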
![Page 19: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/19.jpg)
Wilma Kuipers Thesis
![Page 20: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/20.jpg)
Summary correlation
If it is conserved, it is important; if it is important, it remains conserved.
If a residue position shows correlation with ‘something’, it is involved in that ‘something’. ‘Something’ can be any of a very large number of functions.
![Page 21: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/21.jpg)
Example correlation: Which cysteines form a pair in this protein family? Shown are aligned peptides from five different bacteria.
ASDFGCHIKLMCNPQRSCTVW
YSDYGCNIKLFCQPQRSCT--
ATDYPVQIKLMCNPQKSCSMW
YTDFGCHVKLLVQPNRSVTVW
-TDFGVHVKLMCNPQKSCSFW
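The cysteine question can be approached as a small correlation exercise: two cysteines that form a pair are expected to be present together or absent together. The sketch below applies that assumption to the five peptides on this slide; the helper names are hypothetical, and the presence/absence test is a simplification rather than a general disulfide predictor.

```python
from itertools import combinations

# The five aligned peptides from the slide.
peptides = [
    "ASDFGCHIKLMCNPQRSCTVW",
    "YSDYGCNIKLFCQPQRSCT--",
    "ATDYPVQIKLMCNPQKSCSMW",
    "YTDFGCHVKLLVQPNRSVTVW",
    "-TDFGVHVKLMCNPQKSCSFW",
]


def cys_columns(aln):
    """0-based columns in which at least one sequence has a cysteine."""
    return [i for i in range(len(aln[0])) if any(seq[i] == "C" for seq in aln)]


def cys_pattern(aln, index):
    """Presence/absence of cysteine at one column, per sequence."""
    return tuple(seq[index] == "C" for seq in aln)


def candidate_pairs(aln):
    """1-based column pairs whose cysteines appear and disappear together."""
    return [
        (i + 1, j + 1)
        for i, j in combinations(cys_columns(aln), 2)
        if cys_pattern(aln, i) == cys_pattern(aln, j)
    ]


pairs = candidate_pairs(peptides)
print(pairs)
```

The cysteine at column 6 is lost in different sequences than the other two, so it is not proposed as a partner for either of them.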
![Page 22: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/22.jpg)
Conserved or very conserved? Recalcitrant.
ASDFGCHIKLMCNPQRSCTVW
YSDYGCNIKLFCNPQRSCT--
ATDLPVQIKLMANPQKSCSVW
LSDFGCHIKLMCNPQRSCTVW
YTDFGCHVKLLVQPNRSVAFW
-SDAGVHVKLMVQPNKSVSF-
YTDFGCHVKLLVQPNRSVVFW
-TDSGVHVKLMIQPNKSVSFW
![Page 23: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/23.jpg)
Conclusion from recalcitrance
The more exceptions you find in other (homologous) families, the less important the residue is in your family.
![Page 24: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/24.jpg)
Entropy and variability
So far we saw that conservation and correlation can help us find functionally important residues.
Can variability patterns also tell us something?
![Page 25: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/25.jpg)
Entropy
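Sequence entropy E_i at position i is calculated from the frequencies p_{a,i} of the twenty amino acid types a at position i:

$$E_i = -\sum_{a=1}^{20} p_{a,i} \ln(p_{a,i})$$

E_i is 0 for a fully conserved position and reaches ln(20) ≈ 3.0 when all twenty amino acid types are equally frequent.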
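The entropy of one alignment column can be sketched in a few lines. Two choices here are assumptions not stated on the slide: gap characters are ignored, and amino acid types that do not occur contribute zero (the usual convention that 0 ln 0 = 0). The function name is illustrative.

```python
import math
from collections import Counter


def sequence_entropy(column):
    """Shannon entropy (natural log) of the residue types in one column.

    column: iterable of one-letter residue codes; '-' gaps are ignored
    (an assumption of this sketch).
    """
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # Types that are absent have p = 0 and add nothing to the sum.
    return -sum((n / total) * math.log(n / total) for n in counts.values())


conserved = sequence_entropy("AAAAAAAA")
two_types = sequence_entropy("AAAACCCC")
all_types = sequence_entropy("ACDEFGHIKLMNPQRSTVWY")
print(conserved, two_types, all_types)
```

A fully conserved column gives 0, a column split evenly over two types gives ln 2, and a column with all twenty types equally represented gives ln 20.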
![Page 26: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/26.jpg)
Variability
Sequence variability Vi is the number of amino acid types observed at position i in more than 0.5% of all sequences.
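The variability count can be sketched the same way. The 0.5% threshold comes from the definition above; treating gaps as "not an amino acid type" while still counting gapped sequences in the denominator is an assumption of this sketch, as are the function and parameter names.

```python
from collections import Counter


def sequence_variability(column, min_fraction=0.005):
    """Number of amino acid types seen in more than min_fraction of all sequences.

    column: iterable of one-letter residue codes, one per sequence;
    '-' gaps are not counted as a type but do count as sequences.
    """
    column = list(column)
    if not column:
        return 0
    counts = Counter(aa for aa in column if aa != "-")
    return sum(1 for n in counts.values() if n / len(column) > min_fraction)


# Hypothetical column from 1000 sequences: D occurs in only 0.4% of them.
toy_column = "A" * 990 + "C" * 6 + "D" * 4
variability = sequence_variability(toy_column)
print(variability)
```

The threshold keeps rare residues, which may be sequencing or alignment errors, from inflating the count.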
![Page 27: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/27.jpg)
Intermezzo
It is common practice in bioinformatics to create a hypothesis. But every hypothesis must be tested against real data from real experiments.
![Page 28: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/28.jpg)
Ras Entropy-Variability
11 Red
12 Orange
22 Yellow
23 Green
33 Blue
![Page 29: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/29.jpg)
GPCR Entropy-Variability; signalling path
GPCR
11 G protein
12 Support
22 Signaling
23 Ligand in
33 Ligand out
![Page 30: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/30.jpg)
NR LBD Entropy-Variability
[Figure: entropy (y-axis, 0.0 to 2.8) plotted against variability (x-axis, 0 to 18), divided into boxes 11, 12, 22, 23, and 33]
11 main function
12 first shell around main function
22 core residues (signal transduction)
23 modulator
33 mainly surface
![Page 31: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/31.jpg)
Example: role of Asp 351
EV and correlation. But the correlation would never have been found from sequence analyses.
antagonist
agonist
![Page 32: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/32.jpg)
Summary variability analysis
Variability patterns hold information.
Entropy and Variability are two (of the) ways to measure variability patterns.
Entropy and Variability patterns can say something about the type of function, and thus add detail to correlation studies.
![Page 33: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/33.jpg)
Conclusions:
Data is difficult, but we need it (sic); life would be so nice if we could do without it. PDB files are the worst.
Nomenclature is not homogeneous. Ontologies….
Much data has been carefully hidden in the literature, where it can only be recovered with great difficulty.
Residue numbering is difficult but very necessary.
Variability-entropy analysis is powerful, but requires very 'good' alignments.
![Page 34: What can (many) sequences tell us?. Nuclear receptor function](https://reader038.vdocuments.mx/reader038/viewer/2022110209/56649e545503460f94b4b5b4/html5/thumbnails/34.jpg)
A short break for a word from our sponsors
Laerte Oliveira
Our industrial sponsor:
Florence Horn
Wilma Kuipers, Weesp
Bob Bywater, Copenhagen
Nora vd Wenden, The Hague
Mike Singer, New Haven
Ad IJzerman, Leiden
Margot Beukers, Leiden
Fabien Campagne, New York
Øyvind Edvardsen, Tromsø
Simon Folkertsma, Frisia
Henk-Jan Joosten, Wageningen
Joost van Durma, Brussels
David Lutje Hulsik, Utrecht
Tim Hulsen, Goffert
Manu Bettler, Lyon
Elmar Krieger
Simon Folkertsma
David
Tim
Adje
Margot
Fabien
Manu