Download - 2009 NIGMS Workshop Enabling Technologies for Structural Biology March 4 th -6 th , 2009

2009 NIGMS Workshop: Enabling Technologies for Structural Biology

March 4th-6th, 2009

Extra-Cellular Mammalian Proteins As Structural Genomics Targets

Steve Anderson

Quote from NESG PSI-2 Center Grant Application, Nov. 2004:

“…Table II-1 indicates the relative success rate, in terms of solubility and structure depositions, of targets from key organisms when expressed in E. coli. The data indicate that eukaryotic targets are some three-fold less likely to be expressed and soluble in our E. coli expression systems. Moreover, although structures of some of these eukaryotic targets will be determined in the coming months, the soluble eukaryotic proteins (particularly human proteins) are some three-fold less likely to result in NMR or crystal structures. Despite the tremendous overall success of bacterial expression systems, success rates with eukaryotic proteins in these systems are limited, and eukaryotic protein sample production remains a major challenge for structural genomics.”

Table II-1. PDB submissions based on target organism

Organism           Cloned   % Soluble   In PDB   % PDB/cloned
E. coli               237          82       21            9.0
B. subtilis           216          50        7            3.3
S. cerevisiae         490          17        5            1.0
D. melanogaster       113          17        1            0.9
Human                 970          25        7            0.7
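The "three-fold" figures in the quote can be checked against Table II-1 with a few lines of arithmetic. The sketch below is only an illustration: the rows are transcribed from the table, while the function names and the "structures per soluble target" estimate are assumptions introduced here, not part of the original analysis. Recomputed % PDB/cloned values agree with the listed column to within about 0.1.

```python
# Rows transcribed from Table II-1:
# organism -> (targets cloned, percent soluble, structures in PDB)
TABLE_II_1 = {
    "E. coli": (237, 82, 21),
    "B. subtilis": (216, 50, 7),
    "S. cerevisiae": (490, 17, 5),
    "D. melanogaster": (113, 17, 1),
    "Human": (970, 25, 7),
}


def pdb_per_cloned(organism):
    """Percent of cloned targets that reached the PDB."""
    cloned, _, in_pdb = TABLE_II_1[organism]
    return 100.0 * in_pdb / cloned


def pdb_per_soluble(organism):
    """Structures per soluble target (soluble count estimated from the percent column)."""
    cloned, pct_soluble, in_pdb = TABLE_II_1[organism]
    return in_pdb / (cloned * pct_soluble / 100.0)


# Fold differences between E. coli and human targets.
solubility_fold = TABLE_II_1["E. coli"][1] / TABLE_II_1["Human"][1]
structure_fold = pdb_per_soluble("E. coli") / pdb_per_soluble("Human")

for name in TABLE_II_1:
    print(f"{name:16s} {pdb_per_cloned(name):4.1f} % PDB/cloned")
print(f"solubility fold (E. coli / human): {solubility_fold:.2f}")
print(f"structures-per-soluble fold (E. coli / human): {structure_fold:.2f}")
```

Both ratios come out a little above three, in line with the "some three-fold" wording of the quote.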

Major Roadblock:

Many interesting proteins (e.g., eukaryotic proteins) -- especially eukaryotic secreted proteins -- do not express well in commonly-used E. coli systems.

We are especially interested in eukaryotic secreted proteins.

What about E. coli secretion vectors?

Finding media suitable for enrichment with 15N and 13C (or labeling with SeMet) is the key issue, because minimal medium appears to strongly inhibit secretion.

Lu et al (1997) J. Mol. Biol. 266, 441.

Protein A secretion vector for producing samples for structural biology

Would this enable isotope enrichment, too?

Test expression of ZZ-OM3D-P1Lys

Expression in Celtone media is better than expression in MJ media

[Gel image: osmotic shockate and IgG-purified panels, lanes 1-4 in each, with marker lane M]

1 - Celtone + 0.2% glucose
2 - Spectra + 0.2% glucose
3 - MJ media (standard media for labeling protein)
4 - 2XTY + 0.2% glucose (optimal rich media, but can’t be used for labeling)

Cloning sites of new pEZZ318 ZZ-fusion vector

Version 1.0 (TEV cleavage):

Translation:    A P K V D G L E G S G E N L Y F Q S L E *
Top strand:     …GGCGCCGAAAGTAGAC GGTCTAGAA GGTAGCGGT GAAAACCTGTATTTTCAGAGC CTCGAG TAA GGGCCCAAGCTTGAATTC…
Bottom strand:  …CCGCGGCTTTCATCTG CCAGATCTT CCATCGCCA CTTTTGGACATAAAAGTCTCG GAGCTC ATT CCCGGGTTCGAACTTAAG…
Restriction sites: Nar I, Xba I, Xho I, Hind III, EcoR I
Labeled features: C-term of Z domain coding, TEV cleavage site

• Synthetic genes with N- or C-terminal His tags inserted as Xho I / Hind III fragments

Version 2.0 (His-tag / TEV cleavage):

Translation:    L E G S G H H H H H H H H G S G E N L Y F Q S S *
Top strand:     …TCTAGAA GGCTCTGGT CATCACCATCACCATCACCATCAC GGCAGCGGT GAAAACCTGTATTTTCAGAGCTCTTAA GGGCCCAAGCTT…
Bottom strand:  …AGATCTT CCGAGACCT GTAGTGGTAGTGGTAGTGGTAGTG CCGTCGCCA CTTTTGGACATAAAAGTCTCGAGAATT CCCGGGTTCGAA…
Restriction sites: Xba I, Sac I, Hind III
Labeled features: Octa-His tag, TEV cleavage site

• Synthetic genes without His tags inserted as Sac I / Hind III fragments
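The version 1.0 cloning-site sequence on this slide can be sanity-checked by translating it and locating the labeled restriction sites. The following is a minimal sketch, not the authors' tooling: the top strand is transcribed from the slide, the recognition sequences are the standard ones for each enzyme, and the function names and the reading-frame offset of 1 are assumptions chosen to reproduce the translation printed on the slide.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Top strand of the version 1.0 cloning region, transcribed from the slide.
TOP_STRAND = (
    "GGCGCCGAAAGTAGAC" "GGTCTAGAA" "GGTAGCGGT"
    "GAAAACCTGTATTTTCAGAGC" "CTCGAG" "TAA" "GGGCCCAAGCTTGAATTC"
)

# Recognition sequences of the enzymes labeled on the slide.
SITES = {
    "Nar I": "GGCGCC",
    "Xba I": "TCTAGA",
    "Xho I": "CTCGAG",
    "Hind III": "AAGCTT",
    "EcoR I": "GAATTC",
}


def translate(dna, offset=0):
    """Translate codon by codon from `offset`, stopping after the first stop codon."""
    protein = []
    for i in range(offset, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        protein.append(residue)
        if residue == "*":
            break
    return "".join(protein)


def complement(dna):
    """Base-by-base complement (not reversed), as the bottom strand is drawn on the slide."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))


protein = translate(TOP_STRAND, offset=1)
site_positions = {name: TOP_STRAND.find(seq) for name, seq in SITES.items()}

print(protein)
print(site_positions)
print(complement(TOP_STRAND))
```

The translated product contains the TEV recognition motif ENLYFQS immediately before the Xho I-encoded Leu-Glu and the stop codon, and the sites are found in the same left-to-right order as the labels on the slide.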

Our source of targets: Human Cancer Protein Interaction Network (HCPIN)

Systematically complete structural coverage of pathways and interaction networks

Study structures of complexes

Pathway-Interaction Subnet

Huang, Montelione, et al (2008) Molec. Cell. Proteomics 7, 2048.

Human Cancer Pathway Interaction Network*

• Cell cycle progression
• Apoptosis
• Toll-like receptor pathway
• Interferon alpha/beta
• JAK-STAT pathway
• TGF-beta pathway
• PI3K pathway
• MAPK pathway

Huang, Montelione, et al (2008) Molec. Cell. Proteomics 7, 2048.

*For further information see Janet Huang (posters 49 & 50).

[Figure: target-selection diagram with counts 2328, 658, 2971, 506, 1160, and 136]

Target Selection

~1100 human proteins/domains are selected as NESG targets

http://nmr.cabm.rutgers.edu:9090/PLIMS/

Approximately 1/3 of the HCPIN targets not selected are predicted to be secreted or membrane-bound proteins.

FSTL3:
FSTL3.E3  -SCDGVECGPGKACRMLG-GRPRC-EC-APDCSGL-PARLQVCGSDGATYRDECELRAARCRGHPDLSVMYRGRCRK  72
FSTL3.E4  -SCEHVVCPRPQSCVVDQTGSAHCVVCRAAPCPVPSSPGQELCGNNNVTYISSCHMRQATCFLGRSIGVRHAGSCAG  76

FHR1:
FHR1.E4   TS--CVNPPTVQNAHILS----RQMSKYPSGERVRYECRSPYEMFGD---EEVMCLNGNWTEPPQCKD--  59
FHR1.E5   STGKCGPPPPIDNGDITS----FPLSVYAPASSVEYQCQNLYQLEGN---KRITCRNGQWSEPPKCLH--  61
FHR1.E2   TF--CDFP-KINHGILYDEEKYKPFSQVPTGEVFYYSCEYNFVSPSKSFWTRITCTEEGWSPTPKCLR--  65
FHR1.E3   ---LCFFP-FVENGHSES-----SGQTHLEGDTVQIICNTGYRLQNNE--NNISCVERGWSTPPKCRSTD  59

FBLN4:
FBLN4.E4  VNECLTIPEACKGEMKCINHYGGYLCLPRSAAVINDLHGEGPPPPVPPAQHPNPCPPGYEPDDQ---------DSCVD  69
FBLN4.E5  VDECAQALHDCRPSQDCHNLPGSYQCT---------------------------CPDGYRKIG----------PECVD  41
FBLN4.E6  IDECRYR--YCQHR--CVNLPGSFRCQ---------------------------CEPGFQLGPNN--------RSCVD  39
FBLN4.E8  IDECSYSSYLCQYR--CINEPGRFSCH---------------------------CPQGYQLLAT---------RLCQD  40
FBLN4.E7  VNECDMG-APCEQR--CFNSYGTFLCR---------------------------CHQGYELHRDG--------FSCSD  40
FBLN4.E9  IDECESGAHQCSEAQTCVNFHGGYRCVDTN-----------------------RCVEPYIQVSENRCLCPASNPLCRE  55

CEAM1:
CEAM1.E3  ---ELPKPSISSNNSNPVEDKDAVAFTCEPETQD-TTYLWWINNQSLPVSPRLQLSNGNRTLTLLSVTRNDTGPYECEIQNPVS-ANRSDPVTLNVTY  93
CEAM1.E5  LSPVVAKPQIKASKTTVTGDKDSVNLTCSTNDTG-ISIRWFFKNQSLPSSERMKLSQGNTTLSINPVKREDAGTYWCEVFNPIS-KNQSDPIMLNVNY  96
CEAM1.E4  ---GPDTPTISPSDTYYRPGAN-LSLSCYAASNPPAQYSWLING---------TFQQSTQELFIPNITVNNSGSYTCHANNSVTGCNRTTVKTIIVTE  85

FCGR1:
FCGR1.E3  TTKAVITLQPPWVSVFQEETVTLHCE---VLHLPGSSS-TQWFLNGTAT--QTSTPSYRITSASVNDSGEYRCQRGLS-G---RSDP-IQLEIHRG  85
FCGR1.E5  LFPAPVLNASVTSPLLEGNLVTLSCETKLLLQRPGLQLYFSFYMGSKTLRGRNTSSEYQILTARREDSGLYWCEAATEDGNVLKRSPELELQVL-G  95
FCGR1.E4  ----WLLLQVSSRVFTEGEPLALRCH----AWKDKLVYNVLYYRNGKAFKFFHWNSNLTILKTNISHNGTYHCSGMGKHR---YTSAGISVTVKE-  84

Predicted Extra-cellular Domains

For secreted HCPIN proteins exhibiting evidence of multiple, reiterated domain modules bounded by phase one intron insertion positions [Patthy (1999) Gene 238, 103], multiple sequence alignments of the intervening exons were prepared.*

*See Chiang et al. poster (#43) for further information.
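One simple way to read exon alignments like these is to list the columns where every exon carries the same residue. The sketch below does this for the four FHR1 exon rows as they appear on the slide. The function name is hypothetical, and this is only an illustration of the idea, not the procedure used to prepare the alignments.

```python
# FHR1 exon alignment rows, transcribed from the slide ("-" marks a gap).
FHR1_ALIGNMENT = {
    "FHR1.E4": "TS--CVNPPTVQNAHILS----RQMSKYPSGERVRYECRSPYEMFGD---EEVMCLNGNWTEPPQCKD--",
    "FHR1.E5": "STGKCGPPPPIDNGDITS----FPLSVYAPASSVEYQCQNLYQLEGN---KRITCRNGQWSEPPKCLH--",
    "FHR1.E2": "TF--CDFP-KINHGILYDEEKYKPFSQVPTGEVFYYSCEYNFVSPSKSFWTRITCTEEGWSPTPKCLR--",
    "FHR1.E3": "---LCFFP-FVENGHSES-----SGQTHLEGDTVQIICNTGYRLQNNE--NNISCVERGWSTPPKCRSTD",
}


def invariant_columns(rows):
    """Return (column index, residue) for columns identical in every row, ignoring gapped columns."""
    hits = []
    for index, column in enumerate(zip(*rows)):
        residues = set(column)
        if len(residues) == 1 and "-" not in residues:
            hits.append((index, column[0]))
    return hits


rows = list(FHR1_ALIGNMENT.values())
conserved = invariant_columns(rows)
conserved_cysteines = [index for index, residue in conserved if residue == "C"]

print(conserved)
print("invariant Cys columns:", conserved_cysteines)
```

For these rows the invariant positions are four cysteines, two prolines, and one tryptophan, which matches the number of fully conserved positions marked on the original slide.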

Osmotic shockates of targets 401, 601, 801/803, & 901/902
Expressed in ZZ vector with TEV-cleavable linker

(38 hr. culture in 15N-Celtone)

[Gel image; labeled band: lysozyme]

TEV cleavage of 15N-enriched targets

[Gel image: targets 601, 801, and 901, each with lanes P and C; labeled bands: ZZ, TEV protease]

P - purified
C - TEV cleaved

Progress Report:

Some case studies with individual domains

Example 1:

Human follistatin-like protein 3_domain 1 (exon 3)

- Paolo Rossi

• TGF-beta antagonist
• binds activin A
• implicated in glucose & fat homeostasis

Example 2:

Sushi domain from human complement factor H-related 1 protein.

>>> examination of spectra of 15N-labeled material led to the conclusion that this domain was relatively unstructured.

Is this due to the fact that some domains may need to be packed against adjacent domains for stability’s sake?

Herbert et al (2006) J. Biol. Chem. 281, 16512

There is some evidence that sushi domains are close-packed in holoproteins -- see structure of a pair of sushi domains from human complement factor H (left).

Purified ZZ fusion of recombinant Sushi domain from human complement factor H-related protein 1, run on reducing and non-reducing gels

[Gel image: reducing and non-reducing panels; lanes M, -, +, +, + (incubation with TEV buffer*); labeled band: full-length fusion]

Conclusion 1: Multimeric disulfide cross-linked concatamers can form from recombinant proteins in the periplasmic space.

Conclusion 2: Thiol-disulfide exchange is promoted by the redox character* of the TEV protease cleavage buffer, allowing breakage of inter-molecular disulfides and refolding to the monomer species.

*includes 3 mM GSH and 0.3 mM GSSG
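The "redox character" of a 3 mM GSH / 0.3 mM GSSG buffer can be put on a rough numerical footing with the Nernst equation for the GSSG/2GSH couple. This is a back-of-the-envelope sketch only: the standard potential of -240 mV (pH 7, 25 °C) is a commonly quoted literature value assumed here, not a number from the talk, and the function name is hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618   # J / (mol K)
FARADAY = 96485.33212        # C / mol
TEMPERATURE_K = 298.15       # 25 degrees C (assumption)
E0_PRIME_VOLTS = -0.240      # assumed standard potential of GSSG/2GSH at pH 7


def glutathione_potential(gsh_molar, gssg_molar):
    """Half-cell potential for GSSG + 2 H+ + 2 e- -> 2 GSH (two-electron couple)."""
    nernst_slope = GAS_CONSTANT * TEMPERATURE_K / (2 * FARADAY)
    return E0_PRIME_VOLTS - nernst_slope * math.log(gsh_molar ** 2 / gssg_molar)


# Concentrations stated for the TEV cleavage buffer: 3 mM GSH and 0.3 mM GSSG.
buffer_potential = glutathione_potential(3e-3, 0.3e-3)
print(f"GSH:GSSG ratio = {3e-3 / 0.3e-3:.0f}:1")
print(f"estimated potential = {buffer_potential * 1000:.0f} mV")
```

Under these assumptions the buffer sits near -195 mV, mild enough to let disulfides exchange without fully reducing them, which is consistent with the refolding to monomer described in Conclusion 2.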

Connectivity map showing completeness of assignments

Example 3: Assignments for human fibulin-4 (FBLN4) domain 6 (exon 9)

- Swapna Gurla

• predicted to be a Ca++-binding EGF-like domain
• binds to extracellular matrix proteins
• dysregulated in colon cancer
• involved in embryonic development & remodelling

Potential disulfide scrambling issues with FBLN4_domain 6 motivated us to improve the purification protocol by adding an ion exchange step.

Mono Q purification of FBLN4_domain 6

We then checked for incorrect disulfide bond formation by purifying FBLN4_domain 6 in the presence of oxidized glutathione (GSSG), which should reversibly cap any exposed thiols, and then treating the purified sample with iodoacetamide (IAM) in the presence of 6 M Gdn-HCl, which should irreversibly alkylate any buried thiols.

Result: Based on MALDI-TOF MS, >90% of the protein appeared to be of the correct molecular weight and fully disulfide bonded.
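The logic of this mass-spectrometry check can be illustrated by the mass shifts that incompletely disulfide-bonded species would show. The sketch below is an assumption-laden illustration: it uses the FBLN4.E9 exon sequence from the alignment slide only to count cysteines (the expressed construct may carry extra residues), standard monoisotopic adduct masses for carbamidomethylation and glutathionylation, and a hypothetical function name. It computes shifts only, not absolute masses.

```python
# FBLN4.E9 exon sequence from the alignment slide, gaps removed.
# Assumption: used here only to count cysteines; the actual construct may differ.
FBLN4_E9 = "IDECESGAHQCSEAQTCVNFHGGYRCVDTNRCVEPYIQVSENRCLCPASNPLCRE"

HYDROGEN = 1.007825          # Da, monoisotopic
CARBAMIDOMETHYL = 57.021464  # Da added per thiol alkylated by iodoacetamide
GLUTATHIONYL = 305.068156    # Da added per thiol capped as a glutathione mixed disulfide


def shift_vs_fully_oxidized(n_alkylated, n_glutathionylated):
    """Mass shift relative to the fully disulfide-bonded protein.

    Each modified cysteine regains the hydrogen it lacks as a half-cystine,
    then gains the adduct mass.
    """
    n_modified = n_alkylated + n_glutathionylated
    return (n_modified * HYDROGEN
            + n_alkylated * CARBAMIDOMETHYL
            + n_glutathionylated * GLUTATHIONYL)


n_cys = FBLN4_E9.count("C")
print(f"cysteines in exon sequence: {n_cys}")
for alkylated, capped in [(2, 0), (0, 2), (1, 1)]:
    shift = shift_vs_fully_oxidized(alkylated, capped)
    print(f"{alkylated} alkylated, {capped} glutathionylated: +{shift:.2f} Da")
```

A spectrum dominated by the unshifted mass, with and without IAM, is what the fully disulfide-bonded interpretation predicts; one missing disulfide would appear as a shift of roughly +116 Da (both thiols alkylated) or +612 Da (both capped by glutathione).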

MALDI-TOF of FBLN4_domain 6 (no IAM)

- Haiyan Zheng

MALDI-TOF of FBLN4_domain 6 (+ IAM)

- Haiyan Zheng

Human fibulin-4 (FBLN4) domain 6 (exon 9)

[Structure image with N and C termini labeled]

Preliminary (further structure calculations are in progress…)

- Swapna Gurla

Summary of results so far (still in research phase):

6 - targets cloned

3 - expressed

2 - 3D structural information (one expressed domain was soluble but disordered)

>>> The numbers are small but promising!
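To put the pilot numbers next to the earlier E. coli-based statistics, the stage-to-stage yields can be computed directly. This is a rough sketch with hypothetical variable names. Note that "3D structural information" in the pilot is not the same endpoint as a PDB deposition, so the comparison with the 0.7% PDB/cloned figure for human targets in Table II-1 is indicative at best, and six targets is a very small sample.

```python
# Pilot counts from the summary slide.
cloned = 6
expressed = 3
with_3d_information = 2

# % PDB/cloned for human targets, from Table II-1 (E. coli cytoplasmic expression).
HUMAN_PDB_PER_CLONED = 0.7

expression_yield = 100.0 * expressed / cloned
structure_yield = 100.0 * with_3d_information / cloned

print(f"cloned -> expressed:            {expression_yield:.0f}%")
print(f"cloned -> 3D information:       {structure_yield:.0f}%")
print(f"Table II-1 human PDB/cloned:    {HUMAN_PDB_PER_CLONED}%")
```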

Conclusion:

• Facile expression of extracellular human proteins as structural genomics targets looks promising. This effort may even result in lower levels of attrition (cloned --> 3D structure) than “classic” expression approaches have shown.

• Prospective domain parsing of larger extracellular human proteins is possible using the phase 1 intron rule.

Mission Statement. The long-range goal of the Protein Structure Initiative is to make the three-dimensional atomic-level structures of most proteins easily obtainable from knowledge of their corresponding DNA sequences.

Huang, Montelione, et al (2008) Molec. Cell. Proteomics 7, 2048.

“Holy Grail” of structural genomics (cf. Mission Statement of PSI): Complete structural coverage of some domain families in an organism?

For example, the EGF domain family

Yi-Wen Chiang

Davis Anderson
Jung B. Seo

Yushen Qian

Paolo Rossi
Swapna Gurla

Guy Montelione

Haiyan Zheng
Peter Lobel

Thanks also to

Tom Acton

Li Chung Ma

Rong Xiao

John Everett

Mike Baran
