PSI Structural Genomics Knowledgebase
Helen M. Berman
Bottlenecks Workshop
April 14, 2008
PSI SG Knowledgebase
Knowledgebase Vision
- The PSI Structural Genomics Knowledgebase (PSI SG KB) will turn the products of the PSI effort into major advances in knowledge that can be used to understand living systems and human disease.
- The PSI SG KB will be a key resource for the advancement of biology, biochemistry, functional genomics, pharmacology, bioinformatics, chemistry, education, and clinical medicine.
Knowledgebase Goals
To provide a “marketplace of ideas” that
- connects protein sequence information to 3D structures and homology models
- enhances functional annotations
- provides access to new experimental protocols and materials
To kick-start and enable advancements in structural genomics
- by communicating and providing visibility and accessibility of information and technology advances of the PSI
- through presentation and discussion of the most provocative challenges with the general community
- by fostering community collaborations
Scope
To capture, make accessible, and highlight elements of the high-throughput pipelines for general use in the community, and to leverage such information through the generation of hundreds of thousands of molecular models and functional annotations. Standard metrics will be used to measure progress.
[Figure: structure determination pipeline. Stages: Genomic-Based Target Selection; Isolation, Expression, Purification, Crystallization; Data Collection; Structure Determination; PDB Deposition & Release. Knowledgebase areas shown alongside: Target Selection, Materials, Experimental Tracking, Technology, Models, Annotations, Publications, Metrics.]
Knowledgebase Users
- Biologists
- Biochemists
- Functional Genomists
- Pharmacologists
- Bioinformaticians
- Chemists
- Clinical Researchers and Physicians
- Teachers and Students
KB Site Features
- News and Events
- Molecules of Unknown Function
- Link to Functional Sleuth Gallery
- Featured Structure
- Link to Technology Module
- Technology Feature
- Search by sequence, keyword, or PDB ID
PSI SG KB Portal
- Collects sequences, common features, and common identifiers
- Maintains correspondences in local database
- Delivers aggregate reports, inventories, and e-publications which contain links to PSI projects, modules, and external resources
- Delivers featured articles describing PSI news and events, featured molecules and technologies, and molecules of unknown function
- Provides collaborative environments for discussion, annotation, and target suggestions
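As a rough illustration of what maintaining correspondences between resources could involve, the Python sketch below groups identifiers from different sources under a shared sequence. The record layout, the function name, and the sample identifiers are all hypothetical; the slides do not describe how the portal actually stores these links.

```python
from collections import defaultdict


def build_correspondences(records):
    """Group (resource, identifier) pairs by normalized sequence.

    `records` is an iterable of (resource, identifier, sequence) tuples.
    This layout is an assumption made for the example.
    """
    index = defaultdict(set)
    for resource, identifier, sequence in records:
        # Normalize so the same sequence matches regardless of case or spacing.
        key = "".join(sequence.split()).upper()
        index[key].add((resource, identifier))
    return dict(index)


# Hypothetical sample data, not real TargetDB or PDB entries.
records = [
    ("TargetDB", "TARGET-0001", "MKTAYIAK"),
    ("PDB", "1ABC", "mktayiak"),
    ("TargetDB", "TARGET-0002", "GSHMLEDP"),
]
correspondences = build_correspondences(records)
```

Here the first two records collapse onto one key, so a lookup by sequence returns both the target identifier and the structure identifier.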
PSI SG KB Portal Databases
[Figure: portal architecture diagram. Queries: PDB ID, Sequence, Keyword. Resources: PSI Modules, PSI Centers, PSI Info Site, Models Portal, Related Biological Resources, Archival Sequence Databases, Domain Databases (Pfam), Literature (PubMed), TargetDB, PepcDB, PDB. Databases: TargetDB Sequences, PDB Sequences, Keyword Database. Legend: Portal, Resource, Database.]
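The three query types named on this slide (PDB ID, sequence, keyword) suggest a dispatch step before any lookup. The sketch below is one possible heuristic, not the portal's implementation: it assumes a PDB ID is four characters (a digit followed by three letters or digits) and treats long strings made only of the 20 standard amino-acid letters as sequences. The length cutoff is arbitrary and exists because short keywords such as "kinase" also consist only of amino-acid letters.

```python
import re

PDB_ID = re.compile(r"[0-9][A-Za-z0-9]{3}")  # assumed four-character ID format
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
MIN_SEQUENCE_LENGTH = 20  # arbitrary cutoff chosen for this sketch


def classify_query(text):
    """Return 'pdb_id', 'sequence', or 'keyword' for a search string."""
    query = "".join(text.split())  # ignore whitespace and line breaks
    if PDB_ID.fullmatch(query):
        return "pdb_id"
    if len(query) >= MIN_SEQUENCE_LENGTH and set(query.upper()) <= AMINO_ACIDS:
        return "sequence"
    return "keyword"
```

A real service would more likely let the user pick the query type explicitly, as the "Search by" options on the site suggest; the heuristic only shows how the three cases differ.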
Modules
Modules derived from PSI information and external resources
- Target Selection & Experimental Data Tracking
- Materials Repository
- Models
- Annotation
- Metrics
- Technology
- Outreach
Target Selection & Experimental Data Tracking
- Target Selection: PSI-2 BIG4
  - Family definitions and target management
- TargetDB
  - Search by sequence, Target ID, project site, status, update date, protein name, and source organism
  - Links to other sequence databases, domain databases, other structural genomics centers, and PDB
  - Download target data
  - Target statistics summary
- PepcDB
  - All the functionality of TargetDB plus
    - Experimental protocols
    - Detailed status history of experimental trials
    - Information on failed experiments
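To make the field-based search concrete, here is a minimal Python sketch of filtering target records by several criteria at once. The `Target` class, its field names, and the sample records are invented for illustration and do not reflect the actual TargetDB schema or data.

```python
from dataclasses import dataclass


@dataclass
class Target:
    # Hypothetical subset of the searchable fields listed on the slide.
    target_id: str
    site: str
    status: str
    protein_name: str
    organism: str


def search_targets(targets, **criteria):
    """Return targets whose fields equal every given criterion (case-insensitive)."""
    def matches(target):
        return all(
            str(getattr(target, field)).lower() == str(value).lower()
            for field, value in criteria.items()
        )
    return [t for t in targets if matches(t)]


# Invented sample records.
targets = [
    Target("T0001", "Center A", "Crystallized", "hypothetical protein", "Escherichia coli"),
    Target("T0002", "Center B", "In PDB", "putative kinase", "Homo sapiens"),
    Target("T0003", "Center A", "In PDB", "hypothetical protein", "Thermotoga maritima"),
]
hits = search_targets(targets, site="center a", status="in pdb")
```

Calling `search_targets` with no criteria returns every record, which mirrors a "download all target data" style request.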
Experimental Tracking
[Screenshots: PepcDB Search Form; Protocol Keywords Search]
Experimental Tracking Module
Materials Repository
PSI Materials Repository Module
Modeling Portal
Current Phase 1 Model Portal contains
- Models from 4 PSI centers and 2 public model databases (SwissModel and ModBase) integrated on a common UniProt reference system.
- Current release consists of 5.8 million comparative protein models for 1.97 million distinct UniProt entries.
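The two release figures imply an average of roughly 2.9 models per UniProt entry. The short calculation below shows where that comes from; it uses only the rounded numbers quoted on the slide, so the result is approximate.

```python
# Rounded figures quoted on the slide.
total_models = 5_800_000      # "5.8 million comparative protein models"
uniprot_entries = 1_970_000   # "1.97 million distinct UniProt entries"

# Average number of models available per distinct UniProt entry.
models_per_entry = total_models / uniprot_entries
print(f"average models per UniProt entry: {models_per_entry:.2f}")
```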
Modeling Portal
Metrics Module
- Provides objective measures of the progress and output of the PSI project
- Centered around the “Goals and Milestones” document
PSI-2 Summary Statistics (updated April 1, 2008)
I.1.A  Number of novel experimental PSI-2 structures: 1031
I.1.B  Number of distinct experimental PSI-2 structures (non-redundant sequences): 1428
I.1.D  Total number of experimental PSI-2 structures: 1628
I.1.E  Numbers of experimentally determined distinct residues: 319977
       Numbers of experimentally determined novel residues: 225518
I.2.J  Number of experimental structures of human proteins: 61
I.2.K  Number of experimental structures of eukaryotic proteins: 186
I.2.M  Number of experimental structures of membrane proteins: 1
I.2.N  Number of experimental structures determined at the atomic level using x-ray crystallography: 1484
       Number of experimental structures determined at the atomic level using NMR methods: 144
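The reported counts can be cross-checked against each other: the x-ray and NMR structure counts sum to the reported total (1484 + 144 = 1628). The sketch below repeats that check and derives a few ratios from the same figures. The ratios are illustrative only and are not metrics defined in the PSI "Goals and Milestones" document.

```python
# Figures copied from the April 1, 2008 summary statistics.
total_structures = 1628      # I.1.D
novel_structures = 1031      # I.1.A
distinct_structures = 1428   # I.1.B
xray_structures = 1484       # I.2.N, x-ray crystallography
nmr_structures = 144         # row following I.2.N, NMR methods
distinct_residues = 319977   # I.1.E
novel_residues = 225518      # row following I.1.E

# The two experimental methods account for every structure in the total.
methods_match_total = (xray_structures + nmr_structures == total_structures)

# Derived ratios (illustrative, not official PSI metrics).
novel_structure_fraction = novel_structures / total_structures
distinct_structure_fraction = distinct_structures / total_structures
novel_residue_fraction = novel_residues / distinct_residues

print(f"x-ray + NMR equals total: {methods_match_total}")
print(f"novel structures: {novel_structure_fraction:.1%}")
print(f"distinct structures: {distinct_structure_fraction:.1%}")
print(f"novel residues: {novel_residue_fraction:.1%}")
```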
PSI-2 Summary Statistics for Domain and Modeling Leverage
I.1.C  Number and Size of BIG Domain Families for which PSI-2 provides the first Experimental Structure Representative: 474
       Number and Size of MEGA Domain Families for which PSI-2 provides the first Experimental Structure Representative: 399
I.1.E  Numbers of Experimentally Determined Distinct BIG Family Residues: 76579
       Numbers of Experimentally Determined Distinct MEGA Family Residues: 76121
I.3.A  Total Modeling Leverage: 583735
I.3.B  Novel Modeling Leverage: 114407
Updated January 15, 2008
Updated February 21, 2008
Technology Module
PSI Centers are actively developing technologies and methodologies for all aspects of the structure determination pipeline
[Figure: pipeline stages. Genomic-Based Target Selection; Isolation, Expression, Purification, Crystallization; Data Collection; Structure Determination; PDB Deposition & Release; Functional Annotation; Publication.]
Technology Module Progress
- Phase 1 Technology Portal in place
- Summary information from all PSI Centers
- Keyword search from KB portal
Outreach Module
Provides information to the public about the products and accomplishments of the PSI
- Media reports
- Publications
- Community activities
- Plans for a Nature Gateway
Current Annotation Module
10 PSI Interactive Services for Sequence, Structure and Functional Annotations
11 PSI Galleries and Summaries of Sequence, Structure and Functional Annotations
35 other resources for annotation
Provides paths to unravel sequence, structure, function relationships
Annotation Module
Biological Annotation of Novel Proteins
March 7-8, 2008, Calit2, UCSD
Participants
- PSI groups
- Annotation system authors
- General biological community
Outcome
- Recommendations for standard annotations
- Processes for community input
Standard Annotations
- Genomic features: gene identifier, name and synonyms, operon/regulon mappings
- Protein sequence features: amino acid sequence, taxonomy & phylogeny, sequence database accession, isoform, SNPs, PTMs, sequence families, residue conservation
- Structure features: oligomeric state, structure and functional domains, DNA binding motifs, nests & clefts, sites of interaction, residue regions of protein-protein, ligand-protein, catalytic sites, secondary structure, structural neighbors and comparison of groups of structures with a common feature, properties/features mapped to 3D and their similarities (e.g. electrostatics, cavities, conserved residues, quality assessment)
- Ligands: chemical structure, interactions, functional role
- Functional classification: GO, FunCat, EC, epitope mapping, cellular location, organ location, substrate specificity, disease involvement
- Mapping to biological systems: mapping to networks and pathways (e.g. Reactome, KEGG, HPRD, BioCyc, NetPath, MINT, MIPS, DIP, STRING, STITCH, PROLINKS)
- Literature: synonyms for protein names, links to PubMed by database identifier and related text and authors
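One way to picture the standard annotation categories is as a single record per protein, with one slot per category. The Python sketch below is a hypothetical data structure for that purpose; the class, field names, and sample values are assumptions for illustration and are not a format proposed at the workshop.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ProteinAnnotation:
    # One slot per annotation category named on the slide (hypothetical layout).
    genomic_features: dict = field(default_factory=dict)
    sequence_features: dict = field(default_factory=dict)
    structure_features: dict = field(default_factory=dict)
    ligands: list = field(default_factory=list)
    functional_classification: dict = field(default_factory=dict)
    biological_systems: list = field(default_factory=list)
    literature: list = field(default_factory=list)


def missing_categories(annotation):
    """List the categories that have no data yet for this protein."""
    return [f.name for f in fields(annotation) if not getattr(annotation, f.name)]


# Placeholder values, not a real annotation.
example = ProteinAnnotation(
    genomic_features={"gene_name": "exampleA"},
    functional_classification={"EC": "1.1.1.1"},
)
missing = missing_categories(example)
```

A completeness check like `missing_categories` is the kind of simple summary that could flag proteins still lacking, for example, ligand or literature annotations.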
Future Improvements
- Experimental Data Tracking
  - Standardization of the protocols in PepcDB
  - PepcDB data deposition tool
  - Integration with the Materials Repository
- Materials Repository
  - Searchable database of clones
  - Ordering system
  - Integration with PepcDB and PSI SGKB
- Models Module
  - Public web service interface
  - Additional quality assessment
  - Interactive homology modeling
Future Improvements (continued)
- Technology Module
  - Improved navigation over technology topic areas
  - Keyword search option of descriptions and publications
- PSI SGKB
  - Integration with Nature Gateway
  - Simple presentation and search of standard annotations
  - Incorporation of data about ligands and modified residues
  - Molecular visualization tool
Acknowledgements
KB Team: Wendy Tao, Raship Shah, James Chun, John Westbrook
Modules: Torsten Schwede (Models), Andrei Kouranov (Exp. Data Tracking), Paul Adams (Technology), Wladek Minor (Publications), Josh La Baer (Materials), Rajesh Nair (Metrics)
Access Information
http://kb.psi-structuralgenomics.org/KB/