RCSB Protein Data Bank Advisory Committee Meeting and Teleconference Wednesday May 8, 2019


RCSB Protein Data Bank Advisory Committee

Meeting and Teleconference Wednesday May 8, 2019


State of the RCSB PDB

2019-2023: Meeting the Challenges Ahead

Structural biology is evolving

1. Growth/Complexity
2. Evolving Experimental Methods (SFX/XFEL, 3DEM)
3. Emerging Integrative/Hybrid Methods (I/HM)

I/H Methods Structures: 552-protein yeast Nuclear Pore Complex
Kim et al. (2018) Nature 555, 475-82
PDBDEV_00000010; PDBDEV_00000011; PDBDEV_00000012


Millions ~600,000

RCSB PDB: Four Interoperating Services

1. Deposition/Biocuration
• Deposition
• Validation
• Biocuration

2. Archive Management/Access
• Data standards
• Data integration
• Data storage
• Data access

3. Data Exploration
• Portal
• Search
• Browse
• 3D visualization

4. Outreach/Education
• PDB-101

Customer Service Help Desk and IT Support

RCSB PDB Data Pipeline Assures Adherence to the FAIR Principles

1. Deposition/Biocuration supporting Data Depositors through deposition, validation, and biocuration. Data are well-curated and validated for scientific/technical accuracy. (FAIR)

2. Archive Management/Access supporting Data Consumers by maintaining the PDB archive and data standards, enabling global data delivery, and integrating PDB data with other data resources. (FAIR)

3. Data Exploration supporting Data Consumers through open-access tools for structure query, visualization, and analysis. (FAIR)

4. Outreach/Education Services support educators, students, and the general public via PDB-101 website. (FAR)

Customer Service and IT Support underpin all services

Data Pipeline

PDB Deposition: PDBe, PDBj, RCSB PDB
Biocuration (ongoing)
Weekly Release finalized
Master Archive preparation
wwPDB partner access
Automated integrated and comparative data preparation
Bicoastal data staging and Web service preparation
Global Data Release
User Access

GLOBAL DATA
GLOBAL KNOWLEDGE

>200 New Structures Released Each Week

Archive Management/Access Ensures FAIR PDB

§ Package PDB data for release
§ Maintain PDB Data Standards and conduct archive-wide standardization
§ Compute comparative data to support search applications
§ Integrate data from across the Life Sciences ecosystem
§ Support programmatic access

Figure labels: Comparative Data; PDB Archive Data; ~40 External Resources; ƒ(x) Data Integration; RESTful Web Services
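As a rough illustration of what programmatic access to archive files can look like, the sketch below only builds a download URL for a single entry. The host `files.rcsb.org` and the `/download/<ID>.<ext>` path layout are assumptions that do not appear on the slide, and `entry_file_url` is a hypothetical helper, so check the current RCSB PDB documentation before relying on it.

```python
import re

# Assumed base URL for per-entry file downloads (not given in the slides).
DOWNLOAD_BASE = "https://files.rcsb.org/download"


def entry_file_url(pdb_id: str, ext: str = "cif") -> str:
    """Return a download URL for one PDB entry (hypothetical helper)."""
    pdb_id = pdb_id.strip().upper()
    # Classic PDB IDs are four characters: a digit followed by three alphanumerics.
    if not re.fullmatch(r"[0-9][A-Z0-9]{3}", pdb_id):
        raise ValueError(f"not a 4-character PDB ID: {pdb_id!r}")
    return f"{DOWNLOAD_BASE}/{pdb_id}.{ext}"


# 3og7 is the B-Raf kinase/vemurafenib entry cited elsewhere in this deck.
print(entry_file_url("3og7"))
```

The helper validates the identifier before building the URL so that a malformed ID fails early instead of producing a request that cannot succeed.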

2018 Annual Snapshot (January 1, 2019)

§ 147,610 structures
§ >1.5 TB data
§ ~1.9 million unique archive files
§ >1 billion 3D atomic coordinates
§ Online annual and milestone ftp archive snapshots from 2005

Updating the PDB Master Archive

§ Assemble weekly data from wwPDB partners
§ Package final archival data files, validation reports, reference dictionaries, and supporting data files (1 GB)
§ Authoritative Master Archive readied for delivery
• Traditional archive layout
• Versioned archive layout
§ Exported to wwPDB partners

Comparative and Integrated Data for Contextual Views

§ Comparative data
• Sequence clustering
• 3D structure clustering
§ Leverage cyberinfrastructure (CI) data from ~40 key life science resources
• Diffraction data (ProteinDiffraction.org, SBGrid, Store.Synchrotron Data Store)
• DrugBank
• NCBI
• Gene Ontology (GO)
• Sequence (UniProt, SIFTS/PDBe)
• SCOP and CATH
§ Leverage CI computing from DIBBS and Open Science Grid

Figure labels: Comparative Data; PDB Archive Data; ~40 External Resources; Programmatic Users; ƒ(x) Data Integration; RESTful Web Services

RCSB.org: Supporting the Scientific Ecosystem

§ RCSB PDB services go well beyond original structure and scientific publication
§ Up-to-date access to
• Newly-released PDB structures
• Sequence/3D structure comparisons
• Integration with ~40 external resources
• 3D structure/annotation visualization

Figure labels: Pathways; Genetic Variations; Target-Drug Interactions; 2D/3D Sequence Annotations

PDB-101: Training Support for ~600K Users/Year

§ Primary distribution of Outreach/Education efforts
§ Molecule of the Month: >230 articles about Fundamental Biology, Biomedicine, and Energy
§ Curricular modules on public health concerns, fundamental structural biology
§ Videos, posters, PDB data user guides, and other content
§ Today’s students are tomorrow’s PDB users


Year in the Life of the RCSB PDB Community

Undergraduate Course on Antimicrobial Resistance

wwPDB and RCSB PDB AC Meetings

5th Annual Video Challenge

AAAS Biomedical Career Symposium

The New York Structural Biology Discussion Group Summer Meeting

12,179 structures deposited into the PDB

New structures added to the archive for a total of 147,610 entries

Millions of unique users served

>500 million data files downloaded from RCSB PDB web and FTP sites

2018

Site visit

Publications Supporting RCSB PDB Services

PDB Impact on 2010-2016 New Drug Approvals (ref. 1)

§ 210 new drugs approved, 2010-2016
§ 5,913 PDB structures contributed to 184 of these drug approvals
§ >$100 billion of NIH funding contributed to these approvals (20% of NIH budget), 2000-2016 (ref. 2)

Example: B-Raf Kinase complex with Vemurafenib, PDB ID 3og7

1. Westbrook & Burley (2019) Structure 27, 211-217.
2. Galkina Cleary et al. (2018); Value in 2016 US$.

Deposition/Biocuration (Service 1)
Jasmine Young

Deposition/Biocuration Ensures Well-curated and High Quality Structure Data

§ Support structures determined by MX, NMR, and 3DEM methods and combinations with these techniques (e.g., NMR-SAS)
§ Pre-deposition tools provide data preparation for submission
§ Validation implements community Task Force recommendations
§ Geographically distributed biocuration

2018 Deposition Growth

12,179 Structures
Rapid growth in 3DEM

[Figure: Total # of depositions by year, 2000-2022]

Method | 2017 Depositions | 2018 Depositions
MX | 11,889 (91.1%) | 10,594 (87.0%)
NMR | 460 (3.5%) | 418 (3.4%)
3DEM | 674 (5.2%) | 1,140 (9.4%)
Other | 26 (0.2%) | 27 (0.2%)
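The percentages in the deposition table can be re-derived from the raw counts. The short sketch below does that and also computes the year-over-year change in 3DEM depositions. The 2017 total is not stated on the slide; it is obtained here simply by summing the four method counts, and `method_shares` is a helper name introduced only for this illustration.

```python
# Deposition counts by method, copied from the table above.
depositions = {
    2017: {"MX": 11889, "NMR": 460, "3DEM": 674, "Other": 26},
    2018: {"MX": 10594, "NMR": 418, "3DEM": 1140, "Other": 27},
}


def method_shares(counts: dict) -> dict:
    """Return each method's share of the yearly total, in percent (one decimal)."""
    total = sum(counts.values())
    return {method: round(100 * n / total, 1) for method, n in counts.items()}


for year, counts in depositions.items():
    print(year, sum(counts.values()), method_shares(counts))

# Relative growth in 3DEM depositions from 2017 to 2018, in percent.
em_2017 = depositions[2017]["3DEM"]
em_2018 = depositions[2018]["3DEM"]
em_growth = 100 * (em_2018 - em_2017) / em_2017
print(f"3DEM growth 2017 -> 2018: {em_growth:.0f}%")
```

The 2018 counts sum to the 12,179 structures quoted in the slide title, and the recomputed shares agree with the printed percentages.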

Complexity and Size Growth

[Figure: Size Growth in Molecular Weight and Polymer Chains, 2000-2018; axes: Molecular Weight (Millions) and # of Polymer Chains]

[Figure: Total # of Ligands Available in PDB, 2000-2018; 2,498 new in 2018]

2018 Deposition/Biocuration Statistics

§ 12,179 deposited
• 5,117 RCSB PDB-biocurated
§ Workload balanced geographically
• 42% Americas, Oceania
• 34% Europe, Africa
• 24% Asia

Processing Site: RCSB PDB 42%; PDBe 34%; PDBj 24%

Depositor Location (one slice labeled <1%): North America 34%; South America 1%; Oceania 3%; Commercial 6%; Europe 33%; Africa 0%; Asia 23%
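A quick consistency check on these statistics, written as a small sketch: it recomputes the RCSB PDB share from the two counts on the slide and confirms that each pie chart's rounded percentages add up to 100. It assumes the three processing-site percentages map to RCSB PDB, PDBe, and PDBj in the order they are listed; the variable names are illustrative only.

```python
total_deposited = 12179
rcsb_biocurated = 5117

# Share of 2018 depositions biocurated at RCSB PDB, in percent.
rcsb_share = 100 * rcsb_biocurated / total_deposited
print(f"RCSB PDB share: {rcsb_share:.1f}%")

# Rounded percentages as printed on the slide's two pie charts.
processing_site = {"RCSB PDB": 42, "PDBe": 34, "PDBj": 24}
depositor_location = {
    "North America": 34,
    "South America": 1,
    "Oceania": 3,
    "Commercial": 6,
    "Europe": 33,
    "Africa": 0,
    "Asia": 23,
}

for name, chart in (("Processing site", processing_site),
                    ("Depositor location", depositor_location)):
    print(name, "sums to", sum(chart.values()))
```

The recomputed share rounds to the 42% shown for the Americas/Oceania processing site.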

Addressing Increasing Growth and Complexity Through Biocuration Efficiency

[Figure: New Structures/wwPDB Biocurator; average number of entries processed per FTE by year, 2008-2018; * marks OneDep launch]

§ Continuing increased efficiency since 2009
§ Ongoing improvements in Biocuration processes
§ Significant increase from OneDep launch
• Need to boost productivity in 2019 and beyond

2018 Efforts to Improve Biocuration Efficiency

Biocuration

§ Enable utilization of external computing for large calculations (e.g., ribosome validation)
§ Re-use previous sequence annotation
§ Routine tasks more automated
§ Processes streamlined

Deposition

§ Major issues made more prominent to depositors
§ More checks and gates to deposition

Figure labels: Ligand Processing; Sequence Processing; Value-added Annotation; Validation; Communication; Entity Transformation

2018 Milestones

OneDep Development

§ Mandatory ORCiD
• 25% of unique depositors with ORCiD
• 3,342 unique PIs with ORCiD
§ Improved biocuration efficiency
§ Better software management via GitHub
§ GDPR-compliant

PDB Archive Improvements

§ Carbohydrate Remediation
• Collaboration with Glycoscience community
• PDBx/mmCIF dictionary extension and examples public via GitHub
§ Validation report recalculation

Roadmap in 2019

2019 Goals

Goal | Impact/Gain
Validation enhancements: Ligands, NMR restraints, EM maps | Improve data quality; Increase Biocuration efficiency
Mandatory mmCIF deposition for crystallographic structures | Capture more complete data; Increase Biocuration efficiency
Author-initiated coordinate replacement | Improve data quality; More automated Biocuration
Supporting NEF format from NMR technique | Enable restraint validation; Improve data quality
Carbohydrate remediation | Enable FAIR; Better data validation
Chemical Component versioning | Better data management; Automated tracking on changes
Biocuration by Depositor and Biocuration Automation | Increase Biocuration efficiency
DOI resolution at wwpdb.org landing page | Highlight wwPDB collaboration
Provide ED map coefficients at FTP | Enable data reproducibility
Infrastructure software upgrade | More effective software testing and deployment; Reduce Biocuration testing resource

Developing Next Generation Ligand Validation

§ Software adapted from Global Phasing Ltd. under formal agreement
§ Provides 2D depiction of geometrical quality
§ Provides electron density fit for X-ray
§ Now mandatory at deposition: identification of Ligand/s Of Interest (LOI)

Examples of NADP

PDB entry 5zix (Better data quality)
PDB entry 1zk4 (Worse data quality)

Carbohydrate Remediation (NIGMS grant U01 CA221216)

Objectives

§ Standardize nomenclature following IUPAC/IUBMB
§ Adopt community software for
• standard nomenclature assignment
• linear description for oligosaccharides
§ Provide uniform representation for oligosaccharides with appropriate descriptor(s)
§ Identify, validate, and biocurate glycosylation

Scope

§ 1,614 monosaccharides and 369 oligosaccharides in PDB Chemical Component Dictionary
§ 15,244 PDB structures
§ ~20,000 oligosaccharides in 9,000 PDB structures

Interoperable Linear Representation for Oligosaccharides

PDB ID 6cmg

Condensed IUPAC
LFucpa1-6[DManpa1-3DManpb1-4DGlcpNAcb1-4][LFucpa1-3]DGlcpNAcb1-ASN

LINUCS
[][ASN]{[(4+1)][b-D-GlcpNAc]{[(3+1)][a-L-Fucp]{}[(4+1)][b-D-GlcpNAc]{[(4+1)][b-D-Manp]{[(3+1)][a-D-Manp]{}}}[(6+1)][a-L-Fucp]{}}}

IUPAC
a-L-Fucp-(1-6)+
| a-D-Manp-(1-3)-b-D-Manp-(1-4)-b-D-GlcpNAc-(1-4)-b-D-GlcpNAc-(1-4)-ASN
| a-L-Fucp-(1-3)+

These descriptions can be translated into symbolic representations used by glycoscientists.
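To make the nesting in the LINUCS descriptor easier to see, here is a minimal sketch that parses the string for PDB ID 6cmg into a tree. The grammar used, `[linkage][residue]{children}`, is inferred from this single example and is an assumption, not the official LINUCS specification; `Node`, `parse_linucs`, and `residues` are names made up for this illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    linkage: str   # e.g. "(4+1)"; empty for the root
    residue: str   # e.g. "b-D-GlcpNAc" or "ASN"
    children: list = field(default_factory=list)


# Assumed node shape: "[linkage][residue]{" + child nodes + "}".
_NODE_HEAD = re.compile(r"\[([^\]]*)\]\[([^\]]*)\]\{")


def parse_linucs(text: str) -> Node:
    """Parse a LINUCS-style string into a tree (sketch; grammar inferred from one example)."""

    def parse_node(pos: int):
        m = _NODE_HEAD.match(text, pos)
        if m is None:
            raise ValueError(f"expected '[linkage][residue]{{' at offset {pos}")
        node = Node(m.group(1), m.group(2))
        pos = m.end()
        # Child nodes follow until the matching closing brace.
        while pos < len(text) and text[pos] != "}":
            child, pos = parse_node(pos)
            node.children.append(child)
        if pos >= len(text):
            raise ValueError("unbalanced braces")
        return node, pos + 1  # skip the closing "}"

    root, end = parse_node(0)
    if end != len(text):
        raise ValueError("trailing characters after the root node")
    return root


def residues(node: Node) -> list:
    """List residue names in depth-first order."""
    out = [node.residue]
    for child in node.children:
        out.extend(residues(child))
    return out


# LINUCS string copied from the slide (PDB ID 6cmg).
LINUCS_6CMG = (
    "[][ASN]{[(4+1)][b-D-GlcpNAc]{[(3+1)][a-L-Fucp]{}[(4+1)][b-D-GlcpNAc]"
    "{[(4+1)][b-D-Manp]{[(3+1)][a-D-Manp]{}}}[(6+1)][a-L-Fucp]{}}}"
)

tree = parse_linucs(LINUCS_6CMG)
print(residues(tree))
```

Read this way, the ASN root carries one GlcpNAc, which in turn has three branches (two fucoses and the chain leading to the mannoses), matching the branched layout in the IUPAC drawing.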

Carbohydrate Project Status

§ Extended data content and examples available to the public
• Project summary: https://www.wwpdb.org/documentation/carbohydrate-remediation
• Examples of remediated data: https://github.com/pdbxmmcifwg/carbohydrate-extension/tree/master/examples
§ Glycoscience community tools that produce oligosaccharide linear descriptors and IUPAC nomenclature tested
§ Branched polymer representation software ready for integration with community tools
§ Currently standardizing monosaccharide nomenclature in PDB Chemical Component Dictionary

Management
Stephen K. Burley

Managing a Global Public Good

§ Experienced leadership team
§ Broad knowledge of basic and applied research
§ Deep subject matter expertise in biomedical science and information technology
§ Strong project management support
§ Specialist community engagement and oversight
§ Professional accreditation
§ Responsible data stewardship

Organizational Considerations

§ Delivery of high quality services and complex products to a diverse, global user community
§ RCSB PDB staff is by design
• Broad range of skills represented
• Domain experts in key areas
• Geographically distributed
§ Strategic collaborations with international partners
§ Scientific rigor combined with effective project management

wwPDB International Collaboration

§ Cost-sharing ensures PDB continues as the single Open Access archive
§ Operates the global system for deposition, validation, and biocuration (OneDep)
§ Defines data standards and content in concert with the scientific community (PDBx/mmCIF Dictionary)
§ Enables data uniformity in the PDB archive (“Remediation”)
§ Synergistic sharing of resources and services (e.g., SIFTS, Mol*)
§ Member-hosted websites offer complementary services and views of the data

Figure label: 2018 wwPDB Advisory Committee Meeting

New wwPDB Organizational Structure

CORE ARCHIVES: PDB, BMRB, EMDB
CORE MEMBERS: RCSB PDB, PDBe, PDBj, BMRB, EMDB
FEDERATED RESOURCES: EMPIAR, SASBDB, MX Images

Other Strategic Collaborations

§ Integrative/Hybrid Methods Working Groups
§ EMDB Archive Team
§ UniProt
§ CCDC: Cambridge Crystallographic Data Centre
§ NCBI PubChem
§ Other external data resources

Staff Recruitment and Advancement

§ Ongoing recruitment with assistance from host institution Human Resources
• Diversity and Inclusion Planning
§ Professional development
• Mentoring
• In-service training
• Professional society involvement
• Co-authoring scientific papers and proposals
• Science communication outreach and teaching
§ Healthy turnover of ~10%/year
§ Where do RCSB PDB staff go when they leave?
• Private sector jobs (e.g., Disney, Google, Invitae)
• Academic faculty positions

RCSB PDB Team

RCSB PDB is funded by the National Science Foundation (DBI-1832184), the National Cancer Institute, the National Institute of General Medical Sciences, and the US Department of Energy (DE-SC0019749)

RCSB PDB is a member of the Worldwide Protein Data Bank partnership (wwPDB; wwpdb.org)


Join the RCSB Protein Data Bank Team at Rutgers, The State University of New Jersey

SOFTWARE DEVELOPERS AND BIOCHEMISTS

Open positions:

Biochemical Information & Annotation Specialist (Biocurator)

Curate, validate, and standardize macromolecular structures from the PDB community.

Knowledge and skills:

• PhD in Biological chemistry
• Background in 3DEM, small molecule crystallography, or macromolecular crystallography
• Experience with metalloprotein and small molecule data
• Knowledge of Linux computer systems and biological databases preferred

Front End Web Developer

Develop and maintain web applications, from design to deployment.

Knowledge and skills:

• Familiarity with responsive, adaptive design practices using HTML, CSS, Bootstrap
• Experience with JavaScript, JavaScript frameworks and libraries
• Any experience with backend services such as databases (MongoDB), REST, or GraphQL a plus
• Experience with TypeScript and WebGL a plus

More information: http://www.rcsb.org/pages/jobs
Questions? [email protected]

Celebrating 21 Advisory Committee Meetings

[Figure: number of meetings (axis 0-6) by location: Rutgers Proteomics; Rutgers Doolittle; SDSC/UCSD; Teleconferences; NSF; Rutgers CABM; New Orleans; Toronto; LBNL]

Sustaining Open Access Biological Data

§ Perceptible, but slow, movement towards finding a global solution for funding data resources like PDB
• Global Life Science Data Resources Working Group
• NIH Scientific Data Council
• US Interagency Working Group on Biological Data Sharing
• EU ELIXIR

Do not anticipate change 2019-2023
doi: 10.1038/543179a

Mol* 3D Visualization Demonstration
Alex Rose

Mol* Overview

§ What
• Web Molecular Graphics + UI/State + Data Delivery Services
§ Who
• Collaborative Project with PDBe (and others, open to everyone)
• Successor to NGL (RCSB PDB) and LiteMol (PDBe)
§ Status
• Currently working on core capabilities
• Soon focus on making it more user friendly
• Try: https://molstar.org/viewer
• Develop: https://github.com/molstar

Macromolecular Rendering

■ Basic 3D Representations
● Cartoon, Spacefill, Ball & Stick, Molecular Surface
■ Demo
● XFEL crystal structure of human melatonin receptor in complex with Ramelteon (6ME2)

Volume Streaming and Rendering

§ VolumeServer
• Efficient access to small and very large volumetric datasets
• Evolved from LiteMol’s DensityServer
§ Demos
• Zika virus EM density at different resolutions from same dataset
• X-ray density of selection (around NADP ligand in 5ZIX)

Figure labels: Ketopantoate reductase bound to NADP+ (5ZIX); Zika Virus (5IRE)

Bonus

§ Carbohydrate Symbols (3D-SNFG)
§ Special Cases
• Cyclic peptides
• Peptide nucleic acid (PNA)

Figure labels: Carbohydrate in Cardosin (1B5F); Cyclic protease inhibitor (1SFI); RNA/PNA complex (5EME)

Demonstration

Outreach/Education (Service 4)
Christine Zardecki

Goal: Promote Structural Views of Biology

§ Enable open-access exploration of PDB highlights
§ Enable education of undergraduate, graduate and professional students, postdoctoral fellows, and researchers in academe, government, and industry
§ Provide training materials for PDB Users
§ Expose the public to global health topics through the lens of 3D structure

PDB-101 Website Serves as Primary Vehicle

Key Performance Indicators:

§ 1,816,972 page views in 2018
• 1,750,456 in 2017
§ 594,073 Users in 2018
• 620,784 in 2017

Top accessed features

• Molecule of the Month (hemoglobin, catalase, GFP, carbonic anhydrase)
• Guide to Understanding PDB Data
• Paper Models
• Content Browser

80% Desktop, 20% mobile or tablet

Figure labels: Users by Country; Users by State
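The year-over-year movement in the two PDB-101 key performance indicators is easy to quantify. The sketch below computes the relative change from the figures on the slide; `pct_change` and the `kpis` dictionary are names introduced here purely for illustration.

```python
def pct_change(new: int, old: int) -> float:
    """Relative change from old to new, in percent."""
    return 100 * (new - old) / old


# (2018 value, 2017 value) as reported on the slide.
kpis = {
    "page views": (1_816_972, 1_750_456),
    "users": (594_073, 620_784),
}

for name, (y2018, y2017) in kpis.items():
    print(f"{name}: {pct_change(y2018, y2017):+.1f}%")
```

On these figures, page views rose by roughly 3.8% while the user count fell by roughly 4.3%, so the average number of pages viewed per user went up.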

Molecule of the Month Survey

Based on ~240 responses as of April 30

§ >50% re-use illustrations in their classroom
§ ~50% use MOTM for teaching
§ MOTM helps >50% understand
• Health and Disease
• Biomolecular Structure and Function
• Basic Principles of Molecular Biology
§ >60% located at a College/University

2020: 20 Years of Molecule of the Month

Public Health Focus: Antimicrobial Resistance (2018-2019)

§ PDB-101
• Molecule of the Month
• Videos
• 2018 calendar
• Interactive Poster
§ Undergraduate course
§ Global Health resource development
§ High school Video Challenge

Next focus: Drugs and the Brain

Figure label: High School Video Challenge Winners

Factors Influencing Decision Making

Figure labels: RCSB PDB Outreach/Education; Advisory Committee; Expert Staff Knowledge; Collaborators; Feedback and Analytics; Sustainability; Evolving Community Needs

2019 Goals: Growth and Preparation

§ New PDB-101 materials to drive traffic (users and visits)
• Global Health: diabetes, AMR
• Curricular materials
• Molecule of the Month articles on Fundamental Biology, Biomedicine, and Energy
§ Support RCSB.org development with training materials
§ Maintain PDB-101 uptime
§ Plan for the future
• Develop materials for Drugs and the Brain health focus (2020-2021)
• Initial PDB50 discussions
• Plan materials and events to leverage 20 years of Molecule of the Month

Figure labels: First Molecule of the Month: Myoglobin, January 2000; Measles Virus Proteins, March 2019


Outreach Depends Upon Everyone

2018: Select Offline Highlights

Growing the Data Consumer Community

Communities Currently Served

§ Millions of users visit RCSB.org
• Increased 10% in 2018
• Estimated 3.5 million unique users in 2018
• Unable to directly track research interests
§ >400 resources utilize PDB data
§ ~19,000 publications cite inaugural RCSB PDB publication (Berman et al. 2000)
• Predominately biology-biomedicine-chemistry

[Figure: Journal subject categories for papers citing Berman et al. 2000 (axis 0-8,000): Biochemistry Molecular Biology; Biophysics; Biochemical Research Methods; Computer Science Interdisciplinary Applications; Chemistry Medicinal; Chemistry Multidisciplinary; Mathematical Computational Biology; Biotechnology Applied Microbiology; Chemistry Physical; Multidisciplinary Sciences]

RCSB.org Integration with Key Resources

Resource | Type of Data
BindingDB | Binding affinities
Binding MOAD | Binding affinities
BiGG | Reconstruction of metabolic pathways
BMRB | BMRB-to-PDB mappings
Catalytic Site Atlas | Enzyme active sites and catalytic residues
CATH | Protein structure classification
DrugBank | Drug and target data
EMDB | 3DEM density maps and associated metadata
ExPASy | Enzyme classification
Gencode | Gene structure data
Gene Ontology | Biological ontologies
HMMER3 | Sequence similarity searches
Human Gene Nomenclature Committee | Nomenclature and genomic information
Immune Epitope Database | Antibody and T cell epitopes
LS-SNP | Single Nucleotide Polymorphisms
mpstruc | Classification of transmembrane protein structures
NCBI Gene | Gene info, reference sequences, et al.
NCBI Taxonomy | Organism classification
NDB | Experimentally-determined nucleic acids and complex assemblies
OLDERADO | NMR domain composition and clustering
OPM | Orientation of transmembrane proteins
PDBbind-CN | Binding affinities
PDBflex | Protein structure flexibility
Pfam | Protein families
PhosphoSitePlus | Mammalian post-translational modifications
ProteinDiffraction.org | Diffraction images
Protein Model Portal | Theoretical models
PubMed | Citation information
PubMed Central | Open access literature
Recon3D | A 3-Dimensional View of Human Metabolism and Disease
RECOORD | NMR structure ensembles
RESID | Protein modifications
SBGrid | Diffraction images
SCOP | Protein structure classification
SIFTS (PDBe) | Structure, function, taxonomy, sequence
Store.Synchrotron Data Store | Diffraction images
TCDB | Membrane transport protein classification
UniProt | Protein sequences and annotations
UCSC genome browser | Human genome data

http://www.rcsb.org/pages/external-resources

Supporting and Growing Established User Communities

§ Can we grow user community by
• Improving RCSB.org tools
• Building new tools
• Integrating with comparative protein models
• Integrating with additional data resources (PubChem, CARD, ModelArchive, …)
• Develop new training materials

PDB Data Impact on Scientific Literature

[Figure: Percentage of PDB Archive Cited in Subject-Area Publications; axis: Percentage of PDB Archive, 0%-100%; subject areas: Biochemistry & Molecular Biology; Chemistry; Cell Biology; Pharmacology & Pharmacy; Biotechnology & Applied Microbiology; Microbiology; Genetics & Heredity; Physics; Computer Science; Medicine, Research & Experimental; Computer Science, Interdisciplinary Applications; Immunology; Mathematical & Computational Biology; Physics, Atomic, Molecular & Chemical; Plant Sciences; Virology; Materials Science; Toxicology; Parasitology; Materials Science, Multidisciplinary; Engineering; Food Science & Technology; Nanoscience & Nanotechnology; Evolutionary Biology; Environmental Sciences & Ecology; Agriculture; Environmental Sciences; Zoology]

Recruiting New User Communities

§ Are there roadblocks to using RCSB.org?
• Utility of 3D data for research not clear?
• Barriers to utilizing RCSB.org tools?
§ Opportunities for future growth to consider?
• New tools to develop
• Integration with new resources (PubChem, CARD, ModelArchive, …)
• Training materials
• Collaborations with scientific societies
• …
§ Should we expand the current Advisory Committee membership?

Thank you for your contributions