PSSJ 2016-06-07

Application of the integrated NMR database in protein science

Naohiro Kobayashi

PDBj-BMRB group, Institute for Protein Research

Osaka University, Japan


Organization of the PDBj project at Osaka University, supported by JST-NBDC

NBDC / wwPDB

Project director: Haruki Nakamura

PDBj: Database for X-ray and EM structures and experimental data
• Haruki Nakamura (Professor)
• Atsushi Nakagawa (Professor)
• Rei Kinjyo (Associate Prof.)
• Daron Standley (Prof. at IFReC)
• 5 annotators, 2 programmers, 2 research fellows

PDBj-BMRB: Database for NMR structures and experimental data
• Toshimichi Fujiwara (Professor)
• Chojiro Kojima (Prof. YNU & Osaka Univ.)
• Naohiro Kobayashi (Sp.Ap. Associate Prof.)
• Masashi Yokochi (Programmer, Annotator)
• Takeshi Iwata (Sys. Admin., Annotator)

3-year project: April 2014 to March 2017


What is a chemical shift?

The most important parameter in NMR…

Chemical shift anisotropy

Ring current effect

Very sensitive to the structure of the protein!


But very difficult to predict from structure…

That's why we are collecting chemical shift data!


What can you do with chemical shifts?

You can use the NMR signals as hundreds of probes at atomic resolution


You can get a variety of information using chemical shifts

Domain orientation

Interaction with ligand

Structure

Dynamics

Sugase et al., Nature, 2007


We are collecting chemical shift data in collaboration with Univ. Wisconsin and wwPDB


Our activities on BMRB processing (2015)

• 78 entries processed at Osaka (~10%)

• BMRB total 759 entries

[Figure: chart of BMRB entries per year, 2005 to 2015 (y-axis: Entry, 0 to 1200; x-axis: Year), with series for Madison and Osaka]


Conversion of BMRB data into a machine-readable format

NMR-STAR v3 (bmr15400.str) -> BMRBxTool -> BMRB/XML

Yokochi et al., J. Biomed. Sem. (2016)


Toward a more machine-readable format

BMRB/XML (bmr15400.xml) -> BMRBoTool -> BMRB/RDF (bmr15400.rdf)

RDF: Resource Description Framework


Example of SPARQL search script

PREFIX uniprot_c: <http://purl.uniprot.org/core/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX omim_v: <http://bio2rdf.org/omim_vocabulary:>

SELECT DISTINCT ?label ?omim_id ?dbsnp_id ?mutation ?phenotype
FROM <http://purl.uniprot.org/uniprot>
FROM <http://bio2rdf.org/omim_resource:bio2rdf.dataset.omim.R3>
WHERE {
  # Start from a single UniProt entry (UniProt ID P51608)
  BIND (IRI(CONCAT("http://purl.uniprot.org/uniprot/", "P51608")) AS ?s_uniprot)
  # Look up the protein name and cross-references in UniProt
  SERVICE <http://uniprot.bio2rdf.org/sparql> {
    ?s_uniprot uniprot_c:recommendedName ?s_name .
    ?s_name uniprot_c:fullName ?label .
    ?s_uniprot rdfs:seeAlso ?o_purl .
  }
  # Keep only the OMIM cross-references and extract the OMIM ID
  FILTER (STRSTARTS(STR(?o_purl), "http://purl.uniprot.org/mim/"))
  BIND (STRAFTER(STR(?o_purl), "http://purl.uniprot.org/mim/") AS ?omim_id)
  BIND (IRI(CONCAT("http://bio2rdf.org/omim:", ?omim_id)) AS ?s_omim)
  # Fetch the variants, dbSNP IDs, and phenotypes from OMIM
  SERVICE <http://omim.bio2rdf.org/sparql> {
    ?s_omim omim_v:variant ?s_allele .
    ?s_allele omim_v:x-dbsnp ?s_dbsnp ;
              omim_v:mutation ?mutation ;
              rdfs:label ?phenotype .
    BIND (STRAFTER(STR(?s_dbsnp), "http://bio2rdf.org/dbsnp:") AS ?dbsnp_id)
  }
}

• Similar to SQL
• Multiple databases can be searched with a single simple script (Uniprot -> OMIM)

Callouts on the query: Uniprot ID, dbSNP ID


Data analysis using BMRB/PDB/OMIM database

[Application of BMRB/RDF]

BMRB  Mutation  OMIM    dbSNP       2nd structure  SASA%  ∆∆Ghydr
4280  L100V     300005  rs28935168  Coil           22.5   3.9
4280  R106W     300005  rs28934907  Strand          8.0   11
4280  R133C     300005  rs28934904  Coil           49.4   7.2
4280  E137G     300005  rs61748392  Helix          41.7   6.6
4280  A140V     300005  rs28934908  Helix          42.0   1.7

Callouts on the table: High SASA, High ∆∆Ghydr

MeCP2: methylated CpG binding protein

NMR structure 1QK9: Wakefield et al., 1999 (residues R133 and E137 labeled on the structure)

The mutations R133C and E137G have been found in patients with X-linked mental retardation (detailed information from OMIM).
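The "High SASA" and "High ∆∆Ghydr" callouts can be read as a simple two-column filter over the table. The following is a minimal Python sketch of that reading, not the analysis used in the talk. The rows are copied from the table; the two cutoff values and all names (`Mutation`, `SASA_CUTOFF`, `DDG_CUTOFF`, `high_sasa_and_ddg`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutation:
    name: str
    secondary_structure: str
    sasa_percent: float  # SASA% column
    ddg_hydr: float      # ∆∆Ghydr column (units as on the slide)


# Rows copied from the slide's table (BMRB 4280, OMIM 300005).
MUTATIONS = [
    Mutation("L100V", "Coil", 22.5, 3.9),
    Mutation("R106W", "Strand", 8.0, 11.0),
    Mutation("R133C", "Coil", 49.4, 7.2),
    Mutation("E137G", "Helix", 41.7, 6.6),
    Mutation("A140V", "Helix", 42.0, 1.7),
]

# Hypothetical cutoffs; the slide does not state what counts as "high".
SASA_CUTOFF = 40.0
DDG_CUTOFF = 5.0


def high_sasa_and_ddg(mutations, sasa_cutoff=SASA_CUTOFF, ddg_cutoff=DDG_CUTOFF):
    """Return the names of mutations at or above both cutoffs."""
    return [
        m.name
        for m in mutations
        if m.sasa_percent >= sasa_cutoff and m.ddg_hydr >= ddg_cutoff
    ]


flagged = high_sasa_and_ddg(MUTATIONS)
print(flagged)
```

With these assumed cutoffs the filter selects R133C and E137G, the two residues labeled on the 1QK9 structure.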


[Application of BMRB/XML] Multiple database search on the PDBj-BMRB portal

Universal search box Search filter

Search tabs & settings

Search button


Number of hits in each DB

Sort menu


MagRO: tool for highly automated NMR data analysis and deposition to BMRB

FLYA: tool for fully automated structure analysis

[Application of BMRB/RDF]


Easy to get averaged chemical shifts for target Comp_ID and Atom_ID
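The averaging itself amounts to grouping shift values by the (Comp_ID, Atom_ID) pair and taking the mean of each group. The sketch below shows that step in plain Python on an in-memory list; it is an illustration under stated assumptions, not the BMRB/RDF query shown in the talk. The sample rows are made-up values, not BMRB data, and `ROWS` and `average_shifts` are hypothetical names.

```python
from collections import defaultdict
from statistics import mean

# Illustrative rows only: (Comp_ID, Atom_ID, chemical shift in ppm).
# These are made-up numbers, not values taken from BMRB.
ROWS = [
    ("ALA", "CA", 52.0),
    ("ALA", "CA", 53.0),
    ("ALA", "CB", 19.0),
    ("GLY", "CA", 45.0),
    ("GLY", "CA", 46.0),
]


def average_shifts(rows):
    """Return {(Comp_ID, Atom_ID): mean shift} for the given rows."""
    grouped = defaultdict(list)
    for comp_id, atom_id, value in rows:
        grouped[(comp_id, atom_id)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


averages = average_shifts(ROWS)
print(averages[("ALA", "CA")])
```

In practice the rows would come from a database query rather than a literal list, and the same grouping could be done inside the query with an aggregate.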


Secondary database of modeled structures linked with the other life science databases

[Application of BMRB/RDF]

Selection of SAHG models:
• SAHG: ~42,000 models
• Selection of models not found in PDB: ~32,000
• Selection of high-quality models (sequence identity: 40~90%) and filter on total residue numbers (40~300 residues): ~3,100
• With NMR data: 2,038
• No NMR data: 1,041

Linked databases:
• Refseq
• PDB
• BMRB
• OMIM: single nucleotide polymorphism
• IntAct: information on interactions (pull-down, Y2H)
• Uniprot
• SAHG models
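The two selection criteria on the slide (sequence identity 40~90%, 40~300 residues) followed by the split into models with and without NMR data can be sketched as below. This is a hedged illustration only: the `Model` record, its field names, the inclusive treatment of the range bounds, and the sample models are all assumptions, not the actual SAHG schema or pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    model_id: str             # hypothetical identifier
    sequence_identity: float  # percent identity to the template
    n_residues: int           # total residue count
    has_nmr_data: bool        # whether linked NMR data exist


def passes_filters(model):
    """Apply the slide's ranges; inclusive bounds are an assumption."""
    return (
        40.0 <= model.sequence_identity <= 90.0
        and 40 <= model.n_residues <= 300
    )


def split_by_nmr(models):
    """Return (with_nmr, without_nmr) ID lists for models passing the filters."""
    selected = [m for m in models if passes_filters(m)]
    with_nmr = [m.model_id for m in selected if m.has_nmr_data]
    without_nmr = [m.model_id for m in selected if not m.has_nmr_data]
    return with_nmr, without_nmr


# Made-up sample models for illustration.
SAMPLE = [
    Model("m1", 55.0, 120, True),   # passes both filters
    Model("m2", 95.0, 120, True),   # identity too high
    Model("m3", 60.0, 350, False),  # too many residues
    Model("m4", 40.0, 40, False),   # passes at the lower bounds
]

with_nmr, without_nmr = split_by_nmr(SAMPLE)
print(with_nmr, without_nmr)
```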


Publication of the secondary database

http://bmrbdep.pdbj.org/en/mp_search.html


Automated assignment system of MagRO with modeled structures and BMRB data

[Diagram components]
• NMR spectra: peak and residue type recognition assisted by neural network
• BMRB data: Matrix ABMRB
• Model structure: predicted chemical shifts (SPARTA), Matrix AMODEL
• Sequence alignment
• Other matrices: Matrix AASS, Matrix ART, Matrix CCON
• Anneal_Robot / FLYA
• Assignment results


Acknowledgement

Osaka University: Toshimichi Fujiwara, Chojiro Kojima (YNU, Osaka Univ.), Masashi Yokochi, Takeshi Iwata

Nagoya University: Hidekazu Hiroaki

Kyoto University: Masato Katahira, Takashi Nagata

Wisconsin University: Eldon L. Ulrich, John L. Markley

AIST (Tokyo): Chie Motono