Exploiting Structural and Comparative Genomics to Reveal Protein Functions
![Page 1: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/1.jpg)
Exploiting Structural and Comparative Genomics to Reveal Protein Functions
How many domain families can we find in the genomes and can we predict the functions of relatives?
Exploiting protein structure to predict protein functions
Using correlated phylogenetic profiles based on CATH domains to reveal functional associations
CATH: Domain families of known structure
Gene3D: Protein families and domain annotations for completed genomes
![Page 2: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/2.jpg)
CATHEDRAL
Oliver Redfern and Andrew Harrison
CATH version 3.0
1100 fold groups
2100 homologous superfamilies
86,000 domains
Combines a rapid graph theory secondary structure filter with dynamic programming for accurate residue alignment
SVM is used to combine scores and assess the significance of a match
![Page 3: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/3.jpg)
[Figure: Fold Recognition Performance. x-axis: Rank (0 to 25); y-axis: % Correct Fold (0.8 to 1); methods shown in the legend: CATHEDRAL, CE, DALI, LSQMAN, STRUCTAL, SSAPDDP, SSAP]
![Page 4: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/4.jpg)
Gene3D: Domain annotations in genome sequences
>2 million protein sequences from 300 completed genomes and UniProt
scan against library of HMM models: ~2000 CATH, ~9000 Pfam
assign domains to CATH and Pfam superfamilies
Benchmarking by structural data shows that 76% of remote homologues can be identified using the HMMs
![Page 5: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/5.jpg)
Gene3D: Domain annotations in genome sequences
DomainFinder: structural domains from CATH take precedence
[Diagram: domain matches CATH-1, Pfam-1, Pfam-2 and NewFam placed along a protein chain from N to C terminus]
Domain architecture: CATH-1, Pfam-1, NewFam, Pfam-2
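The precedence rule on this slide can be illustrated with a minimal sketch. It is not the DomainFinder implementation: the `Hit` type, the `resolve` function, the `min_gap` cutoff and the hit named `Pfam-X` are all hypothetical, and treating long unassigned regions as NewFam is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    family: str  # e.g. "CATH-1" or "Pfam-2"
    source: str  # "CATH", "Pfam" or "NewFam"
    start: int   # 1-based, inclusive
    end: int     # inclusive


def overlaps(a, b):
    """True if the two residue ranges share at least one position."""
    return a.start <= b.end and b.start <= a.end


def resolve(hits, length, min_gap=50):
    """Keep CATH hits first, then Pfam hits that overlap nothing already kept.

    Unassigned stretches of at least min_gap residues are labelled NewFam
    (an assumption for this sketch, not a documented rule).
    """
    kept = []
    for source in ("CATH", "Pfam"):  # CATH takes precedence over Pfam
        candidates = sorted((h for h in hits if h.source == source),
                            key=lambda h: h.start)
        for hit in candidates:
            if not any(overlaps(hit, k) for k in kept):
                kept.append(hit)
    kept.sort(key=lambda h: h.start)

    architecture, pos = [], 1
    for hit in kept:
        if hit.start - pos >= min_gap:
            architecture.append(Hit("NewFam", "NewFam", pos, hit.start - 1))
        architecture.append(hit)
        pos = hit.end + 1
    if length - pos + 1 >= min_gap:
        architecture.append(Hit("NewFam", "NewFam", pos, length))
    return architecture


# Invented hits on a 400-residue protein; Pfam-X overlaps CATH-1 and is dropped.
hits = [
    Hit("Pfam-X", "Pfam", 50, 120),
    Hit("CATH-1", "CATH", 1, 100),
    Hit("Pfam-1", "Pfam", 110, 200),
    Hit("Pfam-2", "Pfam", 320, 400),
]
architecture = [h.family for h in resolve(hits, length=400)]
print(architecture)
```

With these invented coordinates the resolved order matches the architecture drawn on the slide.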
![Page 6: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/6.jpg)
Domain families ranked by size (number of domain sequences)
[Figure: x-axis: rank by family size; y-axis: percentage of all domain family sequences; series: CATH superfamilies of known structure, Pfam families of unknown structure, NewFam of unknown structure]
~90% of domain sequences in the genomes and UniProt can be assigned to ~7000 domain families
![Page 7: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/7.jpg)
Only ~3% of diverse sequences in large CATH domain families have known structures
<100 families account for 50% of domain sequences of known fold
[Diagram: a structural superfamily (CATH) divided into subfamilies of relatives F1, F2, F3, F4, F5; relatives within a subfamily are likely to have similar functions]
![Page 8: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/8.jpg)
Gene3D: Domain mappings for 300 Completed Genomes
http://www.biochem.ucl.ac.uk:8080/Gene3D
Iterative Profile Search Methodology
300 genomes, >2 million sequences including UniProt and RefSeq
structural domain assignments from CATH
functional domain assignments from Pfam
Also: SWISS-PROT, EC, COGs, GO, KEGG, MIPS, BIND, IntAct
Russell Marsden, Corin Yeats, Michael Maibaum, David Lee
Yeats et al. Nucleic Acids Res. 2006.
![Page 9: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/9.jpg)
Conservation of enzyme function in homologous domains with same multidomain architecture (MDA) in Gene3D
[Diagram: Protein 1 and Protein 2 with the same MDA (CATH-1, Pfam-1, NewFam, Pfam-2), compared by sequence identity]
[Figure: DOMAINS IN SAME ARCHITECTURES. x-axis: sequence identity (bins 11-20 to 91-100); y-axis: conservation of EC number to 3 levels (%), measured as a 3rd-level EC string match (0 to 100); series: no overlap, then 10% to 100% overlap in steps of 10%]
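The measure plotted on this slide, agreement of EC numbers to three levels within sequence identity bins, can be sketched as below. This is an illustration under stated assumptions, not the Gene3D analysis: the function names are hypothetical, the protein pairs are invented, and treating a "-" field as ending the comparison is a choice made here.

```python
import math


def ec_shared_levels(ec_a, ec_b):
    """Count the leading EC levels two EC numbers share ('-' stops the comparison)."""
    shared = 0
    for a, b in zip(ec_a.split("."), ec_b.split(".")):
        if a == "-" or b == "-" or a != b:
            break
        shared += 1
    return shared


def identity_bin(identity):
    """Map a percent identity to the slide's bins: 11-20, 21-30, ..., 91-100."""
    if identity <= 10 or identity > 100:
        return None
    lower = ((math.ceil(identity) - 1) // 10) * 10 + 1
    return f"{lower}-{lower + 9}"


def conservation_by_bin(pairs, levels=3):
    """pairs: iterable of (percent_identity, ec_a, ec_b).

    Returns {bin label: percentage of pairs sharing at least `levels` EC levels}.
    """
    totals, conserved = {}, {}
    for identity, ec_a, ec_b in pairs:
        label = identity_bin(identity)
        if label is None:
            continue
        totals[label] = totals.get(label, 0) + 1
        if ec_shared_levels(ec_a, ec_b) >= levels:
            conserved[label] = conserved.get(label, 0) + 1
    return {label: 100.0 * conserved.get(label, 0) / n
            for label, n in totals.items()}


# Invented pairs, for illustration only.
pairs = [
    (35.0, "1.1.1.1", "1.1.1.2"),    # same to 3 levels
    (38.0, "1.1.1.1", "1.2.1.1"),    # diverges at level 2
    (95.0, "3.4.21.4", "3.4.21.5"),  # same to 3 levels
]
result = conservation_by_bin(pairs)
print(result)
```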
![Page 10: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/10.jpg)
Sequence identity thresholds for 95% conservation of enzyme function (to 3 EC levels)
[Figure: x-axis: sequence identity thresholds (bins 11-20% to 91-100%); one axis: number of sequences (domain relatives), log scale from 1 to 1,000,000; other axis: number of families (superfamilies), 0 to 200; annotations: 332 highly conserved families, 60 highly variable families]
![Page 11: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/11.jpg)
![Page 12: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/12.jpg)
Conservation of Enzyme Function in CATH Domain Families
[Figure: x-axis: pairwise sequence identity (%), 0 to 100; y-axis: structural similarity (SSAP) score, 40 to 100; series: same functions, different functions]
![Page 13: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/13.jpg)
Correlation of structural variability with number of different functional groups
[Figure: COGs vs SSGs. x-axis: number of diverse structural clusters (structural sub-groups) within family, 0 to 60; y-axis: number of COG functional groups, 0 to 90; legend bins: 0-25, 25-50, 50-75, 75-100; labelled outlier: P-loop hydrolases (COG-270, SSG-67)]
![Page 14: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/14.jpg)
Some families show great structural diversity
Multiple structural alignment by CORA allows identification of consensus secondary structure and embellishments
2DSEC algorithm
In 117 superfamilies, relatives have expanded at least 2-fold
These families represent more than half the genome sequences of known fold
Gabrielle Reeves
![Page 15: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/15.jpg)
Structural embellishments can modify the active site
Galectin binding superfamily
![Page 16: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/16.jpg)
Structural embellishments can modulate domain interactions
Glucose 6-phosphate dehydrogenase
side orientation
face orientation
Dihydrodipicolinate reductase
Additional secondary structures shown at (a) are involved in subunit interactions
![Page 17: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/17.jpg)
Structural embellishments can modify function by modifying active site geometry and mediating new domain and subunit interactions
ATP Grasp superfamily
Biotin carboxylase
D-alanine-D-alanine ligase
Dimer of biotin carboxylase
![Page 18: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/18.jpg)
Secondary structure insertions are distributed along the chain but aggregate in 3D
60% of domains have secondary structure embellishments co-located in 3D with 3 or more other embellishments
In 80% of domains, 1 or more embellishments contact other domains or subunits
[Figure: x-axis: size of indel (number of secondary structures), 1 to 12; y-axis: frequency (%), 0 to 80; bars with indel frequency < 1% are labelled 0.85%, 0.38%, 0.23%, 0.11%, 0.06%, 0.02%]
85% of residue insertions comprise only 1 or 2 secondary structures
![Page 19: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/19.jpg)
2 Layer Beta Sandwich
2 Layer Alpha Beta Sandwich
Alpha / Beta Barrel
3 Layer Alpha Beta Sandwich
~80% of variable families adopt regular layered architectures
![Page 20: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/20.jpg)
2 Layer Beta Sandwich
2 Layer Alpha Beta Sandwich
Alpha / Beta Barrel
3 Layer Alpha Beta Sandwich
![Page 21: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/21.jpg)
Function prediction to Guide Target Selection for Structural Genomics
Only ~3% of diverse sequence families (S30 clusters) in large CATH families have known structures
[Diagram: a structural superfamily (CATH) divided into groups F1, F2, F3, F4, F5 of close relatives with the same MDA; relatives within a group are likely to have similar functions]
![Page 22: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/22.jpg)
Conservation of Enzyme Function in Homologous Domains
[Figure: x-axis: structure similarity (SSAP) score (bins 50-60 to 90-100); y-axis: conservation of EC levels, % frequency (0 to 100); series: Not Conserved, Less than 3 EC, EC3, EC4]
![Page 23: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/23.jpg)
FLORA – structural templates for assigning structures to functional subgroups in CATH
Perform CORA multiple structural alignment on functional subfamilies within a CATH superfamily
Use CORAXplode (HMMs) to find related sequences in UniProt and identify conserved residues (seed)
Explore local structural environment of seed residues to find conserved structural motifs
Dataset of 84 enzyme superfamilies in CATH of which 21 are functionally very diverse
![Page 24: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/24.jpg)
Finding conserved residue positions (seeds) - Scorecons
multiple sequence alignment of relatives from functional family, guided by structure alignment
identify most highly conserved residue positions using Scorecons (Valdar and Thornton, 2001)
seed positions
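Picking seed positions from an alignment can be illustrated with a simple stand-in for the conservation score. The sketch below uses normalised Shannon entropy per column, which is not the Scorecons scoring scheme; the function names, the 0.9 threshold, the gap down-weighting and the toy alignment are all assumptions.

```python
import math
from collections import Counter


def column_conservation(column):
    """Score one alignment column between 0 (variable) and 1 (invariant).

    Uses 1 - H / log2(20), where H is the Shannon entropy of the residue
    frequencies, then down-weights the score by the fraction of non-gap rows.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    entropy = 0.0
    for count in Counter(residues).values():
        p = count / total
        entropy -= p * math.log2(p)
    score = 1.0 - entropy / math.log2(20)
    return score * (total / len(column))


def seed_positions(alignment, threshold=0.9):
    """Return 0-based column indices whose conservation reaches the threshold."""
    return [i for i, column in enumerate(zip(*alignment))
            if column_conservation(column) >= threshold]


# Toy alignment of four invented sequences (rows must have equal length).
alignment = ["GKSTH", "GKTAH", "GKS-H", "GRSLH"]
seeds = seed_positions(alignment)
print(seeds)
```

In this toy case only the two invariant columns pass the threshold; a real analysis would use the published Scorecons method on the structure-guided alignment.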
![Page 25: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/25.jpg)
FLORA Algorithm for Identifying Structural Homologues with Similar Functions
assign conserved sequence seeds
expand to local environment of 12Å
identify structurally conserved residue cliques and generate template
new structures are scanned against a library of FLORA templates and SVMs used to assess significance of matches
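The step that expands seeds to a 12 Å local environment can be sketched as a plain distance filter. This is not the FLORA code: the function name is hypothetical, one representative atom per residue (for example C-alpha) is assumed, and the coordinates are invented.

```python
import math


def local_environment(coords, seeds, radius=12.0):
    """Return residue ids lying within `radius` angstroms of any seed residue.

    coords: {residue id: (x, y, z)} with one representative atom per residue.
    seeds:  residue ids previously chosen as conserved seed positions.
    """
    environment = set()
    for res_id, xyz in coords.items():
        if any(math.dist(xyz, coords[seed]) <= radius for seed in seeds):
            environment.add(res_id)
    return sorted(environment)


# Invented coordinates, for illustration only.
coords = {
    10: (0.0, 0.0, 0.0),   # seed residue
    11: (3.8, 0.0, 0.0),   # sequence neighbour
    45: (10.0, 5.0, 0.0),  # distant in sequence, about 11.2 angstroms away
    80: (20.0, 0.0, 0.0),  # outside the 12 angstrom sphere
}
environment = local_environment(coords, seeds=[10])
print(environment)
```

Residues that are far apart in sequence but close in space, such as residue 45 here, are what a local structural template is meant to capture.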
![Page 26: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/26.jpg)
Performance of FLORA vs Global Structure Comparison (SSAP)
[Figure: x-axis: error rate (0 to 0.2); y-axis: coverage (0 to 1); series: SSAP, FLORA]
![Page 27: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/27.jpg)
![Page 28: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/28.jpg)
Eisenberg Phylogenetic Profiles for Detecting Functional Associations
(presence or absence of superfamily in organism)

| Superfamily | sp1 | sp2 | sp3 | sp4 |
|---|---|---|---|---|
| Superfamily 1 | 1 | 0 | 1 | 0 |
| Superfamily 2 | 1 | 0 | 1 | 0 |
| Superfamily 3 | 0 | 0 | 1 | 1 |

Functionally linked: superfamilies with matching profiles (here Superfamily 1 and Superfamily 2)

Gene3D Phylogenetic Occurrence Profiles
(number of relatives from superfamily in organism)

| CATH Domain Superfamily | sp1 | sp2 | sp3 | sp4 |
|---|---|---|---|---|
| Superfamily 1 | 35 | 0 | 12 | 60 |
| Superfamily 2 | 12 | 13 | 14 | 11 |
| Superfamily 3 | 6 | 0 | 0 | 0 |
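The two kinds of profile on this slide can be related with a short sketch: occurrence counts reduce to presence or absence, and identical presence/absence profiles suggest a functional link. The function names are hypothetical, and exact matching is a simplification; the numbers are the ones shown on the slide.

```python
from itertools import combinations


def to_presence(counts):
    """Reduce an occurrence profile (counts per organism) to presence/absence."""
    return tuple(1 if count > 0 else 0 for count in counts)


def linked_pairs(profiles):
    """Return pairs of families whose presence/absence profiles are identical."""
    return [(a, b) for a, b in combinations(sorted(profiles), 2)
            if profiles[a] == profiles[b]]


# Presence/absence profiles from the slide (organisms sp1 to sp4).
presence = {
    "Superfamily 1": (1, 0, 1, 0),
    "Superfamily 2": (1, 0, 1, 0),
    "Superfamily 3": (0, 0, 1, 1),
}
links = linked_pairs(presence)
print(links)

# Occurrence profiles from the slide keep the number of relatives per organism.
occurrence = {
    "Superfamily 1": (35, 0, 12, 60),
    "Superfamily 2": (12, 13, 14, 11),
    "Superfamily 3": (6, 0, 0, 0),
}
reduced = {name: to_presence(counts) for name, counts in occurrence.items()}
print(reduced)
```

Reducing counts to presence/absence discards the copy-number signal, which is the extra information the occurrence profiles retain.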
![Page 29: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/29.jpg)
Phylogenetic Occurrence Profiles Based on Domain Superfamily and Subfamilies in Gene3D
[Diagram: a superfamily subdivided into nested 30%, 40% and 50% sequence identity clusters]
![Page 30: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/30.jpg)
Phylogenetic Profiles for Families and Subfamilies
domains clustered at different levels of sequence similarity: Superfam., 30%, 40%, 50%, 60%, … 100%
phylogenetic occurrence profile matrix

| | Sp1 | Sp2 | Sp3 | Sp4 | … | Spn |
|---|---|---|---|---|---|---|
| Cluster 1 | 3 | 3 | 5 | 7 | … | 5 |
| Cluster 2 | 0 | 2 | 4 | 5 | … | 4 |
| Cluster 3 | 1 | 0 | 1 | 0 | … | 1 |
| Cluster 4 | 0 | 2 | 0 | 0 | … | 6 |
| Cluster 5 | 1 | 0 | 2 | 1 | … | 0 |
| Cluster 6 | 0 | 3 | 1 | 2 | … | 1 |
| Cluster 7 | 0 | 0 | 0 | 1 | … | 2 |
| … | . | . | . | . | … | . |
| Cluster n | 0 | 1 | 0 | 1 | … | 0 |

Juan Ranea and Corin Yeats
![Page 31: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/31.jpg)
Comparison of Pairs of Phylogenetic Profiles

| | Sp1 | Sp2 | Sp3 | Sp4 | Sp5 | … | Spn |
|---|---|---|---|---|---|---|---|
| Cluster 1 | 6 | 9 | 6 | 9 | 5 | … | 9 |
| Cluster 2 | 4 | 3 | 7 | 5 | 3 | … | 5 |
| Cluster 3 | 1 | 0 | 1 | 0 | 2 | … | 1 |
| Cluster 4 | 0 | 2 | 0 | 0 | 1 | … | 6 |
| Cluster 5 | 1 | 4 | 1 | 4 | 1 | … | 4 |
| Cluster 6 | 0 | 3 | 1 | 2 | 0 | … | 1 |
| Cluster 7 | 4 | 8 | 4 | 8 | 4 | … | 8 |
| … | . | . | . | . | . | … | . |
| Cluster n | 0 | 1 | 0 | 1 | 1 | … | 0 |

[Figure: three panels plotting pairs of profiles across species Sp1 to Spn (y-axis ticks at 5 and 10): Cluster 1 with Cluster 2, Cluster 1 with Cluster 5, Cluster 1 with Cluster 7]
Euclidean distance: E1 >> E2
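Comparing pairs of profiles by Euclidean distance can be sketched as below. Only the first five species columns visible on the slide are used, because the remaining columns are elided there. The function name is hypothetical, and the slide does not say which pairs correspond to E1 and E2 or whether profiles are normalised first.

```python
import math


def euclidean(profile_a, profile_b):
    """Euclidean distance between two occurrence profiles of equal length."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same species")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


# First five columns (Sp1 to Sp5) of the matrix shown on the slide.
profiles = {
    "Cluster 1": [6, 9, 6, 9, 5],
    "Cluster 2": [4, 3, 7, 5, 3],
    "Cluster 5": [1, 4, 1, 4, 1],
    "Cluster 7": [4, 8, 4, 8, 4],
}

distances = {
    other: euclidean(profiles["Cluster 1"], profiles[other])
    for other in ("Cluster 2", "Cluster 5", "Cluster 7")
}
for other, distance in distances.items():
    print(f"Cluster 1 vs {other}: {distance:.2f}")
```

On raw counts, Cluster 5 is far from Cluster 1 even though the two rise and fall together, which is one reason a correlation measure is also useful for profile comparison.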
![Page 32: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/32.jpg)
Statistical Significance of Correlated Pairs
(Comparison against 3 randomised models)
[Figure: x-axis: Pearson correlation coefficients (bins of width 0.1 from -0.3 to 1.0); y-axis: frequency (0 to 80); series: Real matrix, Random matrix I, Random matrix II, Random matrix III]
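The comparison of real and randomised matrices can be sketched as follows. The slide does not describe its three randomised models, so shuffling the values within each row is a stand-in chosen here; the function names are hypothetical, and the data are the five visible columns of the profile matrix from the previous slide.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either profile is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlations(matrix):
    """Correlate every pair of rows in {name: profile}."""
    return {(a, b): pearson(matrix[a], matrix[b])
            for a, b in combinations(sorted(matrix), 2)}


def shuffled_rows(matrix, seed=0):
    """One randomised model (an assumption): shuffle the values within each row."""
    rng = random.Random(seed)
    randomised = {}
    for name, row in matrix.items():
        row = list(row)
        rng.shuffle(row)
        randomised[name] = row
    return randomised


profiles = {
    "Cluster 1": [6, 9, 6, 9, 5],
    "Cluster 2": [4, 3, 7, 5, 3],
    "Cluster 5": [1, 4, 1, 4, 1],
    "Cluster 7": [4, 8, 4, 8, 4],
}
real = pairwise_correlations(profiles)
randomised = shuffled_rows(profiles)
null = pairwise_correlations(randomised)

for pair in real:
    print(pair, f"real={real[pair]:.2f}", f"shuffled={null[pair]:.2f}")
```

Repeating the shuffle many times would give a null distribution of coefficients against which the real correlations can be judged, which is the idea behind the histogram on this slide.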
![Page 33: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/33.jpg)
Domain Associations Network from 13 Eukaryotes:
Actin & VCP-like ATPases
DNA replication and repair
Chaperones and Cytoskeleton
DNA Topoisomerase & Elongation factor G
![Page 34: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/34.jpg)
[Figure: DNA topoisomerase & Elongation Factor G. x-axis: species (1 to 13); y-axis: number of domain relatives (0 to 10)]
![Page 35: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/35.jpg)
Highly correlated profiles correspond to pairs of families with significant similarity in GO functions
[Figure: x-axis: distances of correlated profile scores (bins of width 1 from 0 to 19, then >=19); y-axis: frequency of significant GO semantic similarity scores (0 to 60); series: %Frq, %Sum_SS/Frq]
biological processes
![Page 36: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/36.jpg)
Summary
– On average 85% of domain sequences in genomes can be assigned to ~6000 domain families in CATH and Pfam
– Information on multidomain architectures (MDAs) can extend functional annotations obtained through domain-based homologies
– Specific structural templates for functional subgroups within domain families can also help in assigning functions as more structures are solved
– Analysis of Gene3D phylogenetic occurrence profiles allows detection of functional associations between families
![Page 37: Exploiting Structural and Comparative Genomics to Reveal Protein Functions](https://reader036.vdocuments.mx/reader036/viewer/2022070411/568146bb550346895db3ec28/html5/thumbnails/37.jpg)
Acknowledgements
CATH: Lesley Greene, Alison Cuff, Ian Sillitoe, Tony Lewis, Mark Dibley, Oliver Redfern, Tim Dallman
Gene3D: Corin Yeats, Sarah Addou, Russell Marsden, David Lee, Alastair Grant, Ilhem Diboun, Juan Garcia Ranea
Medical Research Council, Wellcome Trust, NIH, EU funded Biosapiens, EU funded Embrace, BBSRC
http://www.biochem.ucl.ac.uk/bsm/cath_new