practical tips for cloning, expressing and purifying proteins for structural biology
Post on 12-Jan-2016
Practical tips for cloning, expressing and purifying proteins for structural biology
Aled Edwards
Banting and Best Department of Medical Research
University of Toronto, Canada
aled.edwards@utoronto.ca
Affinium Pharmaceuticals
Toronto, Canada
aedwards@afnm.com
Molecular biological approaches to structural biology
An excellent structural sample usually has the following properties
• Lack of conformational heterogeneity
• Soluble at high concentrations
• Pure
Molecular biology is probably the fastest way to transform a “poor” sample into an “excellent” one.
Outline
• Historical perspective on engineering proteins for structural biology
• Practical advice for cloning/purification of structural samples
• Ancillary benefits of high-throughput studies
RNA polymerase II
From 15 Å to 3 Å by eliminating heterogeneity
Another source of sample heterogeneity
Eukaryotic proteins comprise multiple domains
• Conformational heterogeneity lowers probability of crystallization
• Protein domains
  • Are resistant to proteolysis
  • Fold autonomously
  • Can usually be expressed in bacteria
  • Are between 15 and 30 kDa (NMR or X-ray size)
  • Are the fundamental unit of protein function
• Domains are often the only tractable targets for HTP crystallography
EBNA1 DNA-binding domain
(No sequence homologue in database)
RPA Domain Structure
A collection of OB-folds
[Figure: RPA subunits RPA70, RPA32 and RPA14, with domains A and B labeled]
RPA crystallization
• Start with full-length protein purified using baculovirus (Wold)
• Identify domain (aa 1-442) soluble in E. coli (Wold)
• Crystallize domain (7 Å)
• Use limited proteolysis to define smaller domain (aa 161-442) (3.5 Å, and same cell as 7 Å crystal)
• Create many constructs varying N- and C-termini to identify final construct (aa 181-422) (2.2 Å, solve structure)
Final tally: 15 different constructs
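The construct-scanning step above can be sketched in a few lines. This is only an illustration, assuming that candidate N- and C-terminal boundaries are simply paired in all combinations; the function name `enumerate_constructs`, the minimum-length filter, and residues 171 and 432 are hypothetical and not part of the RPA work described here.

```python
from itertools import product


def enumerate_constructs(n_starts, c_ends, min_length=50):
    """Pair every N-terminal start with every C-terminal end.

    Residue numbers are 1-based and inclusive. Constructs shorter than
    min_length residues are skipped (the cutoff is an arbitrary assumption).
    """
    constructs = []
    for start, end in product(sorted(n_starts), sorted(c_ends)):
        length = end - start + 1
        if length >= min_length:
            constructs.append((start, end, length))
    return constructs


# 1, 161, 181, 422 and 442 are boundaries quoted in the slides;
# 171 and 432 are hypothetical extra candidates for illustration.
n_starts = [1, 161, 171, 181]
c_ends = [422, 432, 442]
for start, end, length in enumerate_constructs(n_starts, c_ends):
    print(f"aa {start}-{end} ({length} residues)")
```

Four starts and three ends already give twelve constructs, which is in line with the tally of 15 quoted above.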
RPA70 Domains A and B
Two OB-folds bound to DNA
[Figure: domains A and B with L12 loops and L45 loops labeled]
How does one map domains?
Domain mapping using limited proteolysis
Integrative Proteomics
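TFIIS Domain Structure
[Figure: limited proteolysis of TFIIS with protease; domains I, II and III, with residue positions 1, 124, 131, 240, 264 and 309 marked. Annotations: binds holoenzyme, similar to elongin and CRSP70; RNA polymerase binding; transcript cleavage and read-through (nucleic acid binding?)]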
Industrialized Domain Mapping
• Partial proteolysis in 96-well plates
• Optimized set of proteases
• Low protein requirement
• No SDS-PAGE
• No N-terminal sequencing
• Direct identification of domains by mass spectrometry
DomainHunter™
[Figure: DomainHunter™ mass spectra over a protease titration; x-axis m/z (23000 to 33000), y-axis relative intensity (r.i.), peaks labeled with masses]
Fragment  Mass     Matching sequence  Expression  Solubility
B         10324.0  G[44-133]R         +++         ++
C         12352.0  G[44-150]D         no
A          9131.0  I[55-133]R         ++          ++
D         11159.0  I[55-150]D         no
DomainHunter Applied to NMR Sample
[Figure: fragments A to D mapped onto residue number (N-terminus to 140), with V8 cleavage sites and chymotrypsin sites marked]
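Matching an observed fragment mass back to the sequence can be sketched as a brute-force search over all start and end positions. This is a minimal sketch, not the DomainHunter implementation: it assumes average residue masses, unmodified free peptides, and a fixed mass tolerance, and the function names and the example sequence are hypothetical.

```python
# Average residue masses in Da; one water is added per free peptide chain.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528


def fragment_mass(seq, start, end):
    """Average mass of residues start..end (1-based, inclusive)."""
    return sum(RESIDUE_MASS[aa] for aa in seq[start - 1:end]) + WATER


def match_fragments(seq, observed_mass, tol=2.0, min_len=5):
    """Return (start, end, mass) for every fragment within tol Da of observed_mass."""
    prefix = [0.0]  # prefix[i] = summed residue mass of the first i residues
    for aa in seq:
        prefix.append(prefix[-1] + RESIDUE_MASS[aa])
    hits = []
    for i in range(len(seq)):
        for j in range(i + min_len, len(seq) + 1):
            mass = prefix[j] - prefix[i] + WATER
            if abs(mass - observed_mass) <= tol:
                hits.append((i + 1, j, round(mass, 1)))
    return hits


# Hypothetical sequence and target, for illustration only.
seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
target = fragment_mass(seq, 5, 20)
print(match_fragments(seq, target, tol=0.5))
```

In practice the candidate list would be narrowed further by requiring that each end coincide with a cleavage site of the protease used, as the V8 and chymotrypsin sites in the figure suggest.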
[Figure: panels labeled MTH40, MTH1184, MTH538, MTH129, MTH1048, MTH1699, MTH1790, MTH152, MTH1615, MTH1175 and MTH150]
Structural Proteomics
Nat. Str. Biol. Oct/Nov 2000
5 more done
3 more soon
Molecular biology for crystallization and for large-scale studies
1. Basic steps in creating expression vectors for E. coli
2. Practical tips for making fewer mistakes
3. Application of methods to higher-throughput
4. Alternate expression systems
5. Some results
E. coli is the first choice. Why?
• Cost effective
• Easy to grow
• Abundance of expertise and reagents
• Easy to incorporate selenomethionine
• High yield
• Rapid doubling time and rapid scale-up
Factors involved in successful expression of recombinant proteins in Escherichia coli cytoplasm
Expression vector
- Copy number (gene dosage; sometimes less is better than more)
- Promoter choice (T7, Ptac, Plac, Para)
- Little or no expression before induction
- Reliable and adjustable expression
- mRNA stability (RNase E mutant)
Translation
- Consensus SD sequence
- Proper spacing and sequence before the initiation codon
- Possible mRNA secondary structures that block ribosome binding, or an internal ribosome binding site
- Codon bias
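Codon bias can be screened for before cloning with a quick count of rare codons in the ORF. The sketch below assumes a small set of codons that are often cited as rare in E. coli; that set is an assumption and should be replaced with a codon usage table for the actual host strain. The function name `rare_codon_report` is hypothetical.

```python
from collections import Counter

# Assumed list of codons often cited as rare in E. coli; adjust as needed.
RARE_CODONS = {"AGA", "AGG", "ATA", "CTA", "CGA", "CCC", "GGA"}


def rare_codon_report(orf):
    """Count rare codons in an in-frame ORF and return (counts, fraction)."""
    orf = orf.upper().replace("U", "T")
    if not orf or len(orf) % 3:
        raise ValueError("ORF must be non-empty and a multiple of 3 nt long")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    counts = Counter(codon for codon in codons if codon in RARE_CODONS)
    fraction = sum(counts.values()) / len(codons)
    return counts, fraction


# Toy ORF for illustration: ATG AGA AGG CTG TAA
counts, fraction = rare_codon_report("ATGAGAAGGCTGTAA")
print(dict(counts), f"{fraction:.0%} rare codons")
```

A high fraction, or clusters of rare codons near the start of the gene, would be one reason to consider a different host strain or a resynthesized gene.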
But which E. coli?
BL21(DE3) F- ompT hsdSB (rB-,mB-), gal, dcm, (DE3)
BL21-Star(DE3) F- ompT hsdSB (rB-,mB-), gal, dcm, rne131, (DE3)
Tuner(DE3) F- ompT hsdSB (rB- mB-) gal dcm lacY1 (DE3)
BL21-Gold(DE3) F- ompT hsdS (rB- mB-) dcm+ Tetr gal endA (DE3)
Conventional cloning approach
1. Select vector of choice
2. Restriction digest the vector
3. PCR the insert
4. Restriction digest the insert
5. Ligate the vector and insert
6. Transform and plate
7. Pick colonies and screen for insert
8. Screen positive clones for protein expression
9. Sequence positive clones
Which vector/tag?
1. T7 RNA polymerase-based systems are the overwhelming choice
- Highly specific
- High yields
- Exquisitely controlled
2. Choice of vector
- Restriction sites (are there internal sites in gene?)
- Are there many possible sites?
- Are the enzymes commonly available?
- Do the enzymes cut near ends of DNA fragments?
3. Which tag?
- Relatively little data on which generates best proteins for crystallization
- His-tag, GST, MBP all are effective at purification
- His-tag offers advantage of being able to screen +/- tag for crystals (double bang for the buck)
- Make sure there is a protease site to remove tag
Practical issues with cloning
1. Choice of protease?
- Thrombin (more difficult to get but highly effective)
- TEV, recombinant with His-tag, stable mutant with less autoproteolysis activity (Waugh), needs calcium, finicky
- Factor X, enterokinase: avoid
“I can’t use thrombin, it digests my protein”
Purification of Thrombin from Thrombostat
1. We started with 10,000 units of Thrombostat from Parke-Davis, dissolved in 10 ml of 50 mM NaPO4 pH 6.5 and 5% glycerol.
2. The solution was then spun at 10,000 rpm for 10 min in an SS34 rotor to clarify it.
3. This was then loaded onto a Poros S column (7.5 mm x 100 mm, Perseptive Biosystems) pre-equilibrated in the above buffer at 3 ml/min.
4. The column was then washed in the above buffer until the OD 280 reached zero.
5. The column was then washed with 100 mM NaPO4 pH 6.5 and 5% glycerol until the absorbance went to zero.
6. Thrombin was then eluted from the column in 300 mM NaPO4 pH 8.5 and 5% glycerol at a flow rate of 1 ml/min. Fractions of 0.5 ml were collected and run out on a 15% SDS-PAGE gel, and the 35 kD protein (thrombin) was pooled and frozen in small aliquots.
7. Total protein yield was about 3 mg in 10 ml of buffer.
Schleiff, E., Khanna, R., Orlicky, S. and Vrielink, A. Expression, purification, and in vitro characterization of the human outer mitochondrial membrane receptor human translocase of the outer mitochondrial membrane 20. Arch. Biochem. Biophys. 367:95-103 (1999)
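The protocol gives the spin in rpm and the yield as a total, so two small conversions are handy when reproducing it on other equipment. This is a sketch using the standard relation RCF = 1.118e-5 x r(cm) x rpm^2; the 10.7 cm radius is an assumed maximum radius for an SS-34 rotor and should be checked against the rotor manual.

```python
def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) from rotor speed and radius in cm."""
    return 1.118e-5 * radius_cm * rpm ** 2


def concentration_mg_per_ml(total_mg, volume_ml):
    """Average protein concentration of a pooled sample."""
    return total_mg / volume_ml


# Step 2: 10,000 rpm; the radius below is an assumption, not from the protocol.
print(f"{rcf_from_rpm(10_000, 10.7):.0f} x g")
# Step 7: about 3 mg in 10 ml.
print(f"{concentration_mg_per_ml(3, 10):.1f} mg/ml")
```

With these assumptions the spin comes out near 12,000 x g and the pooled thrombin at about 0.3 mg/ml.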
Practical issues with cloning
Restrict the plasmid
- Double digestion often leaves one end undigested, which in turn results in high background due to re-ligation
- Phosphatase treatment and gel purification of a large prep make life much easier in the long run
- Optimize system to get no background
Practical issues with cloning
PCR the insert
- For HTP studies, need to optimize conditions for the genome or clone
- Order primers from a reputable supplier (most common problem is in deprotecting oligos)
- Have someone else double-check primer sequence
- Order primers with requisite overhang (be over-cautious)
- Use error-correcting polymerase
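Generating primers programmatically is one way to cut down on transcription mistakes in the overhangs. The sketch below assumes a simple layout of clamp + restriction site + annealing region; the four-base clamp, the 20-nt annealing length, and the function names are illustrative assumptions. It does not check reading frame, melting temperature, or overlap between the site and the start codon.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(orf, site_5p, site_3p, clamp="GCGC", anneal_len=20):
    """Return (forward, reverse) primers written 5' to 3'.

    Each primer is clamp + restriction site + annealing region. The clamp
    gives the enzyme a few extra bases to cut near the fragment end.
    """
    orf = orf.upper()
    forward = clamp + site_5p + orf[:anneal_len]
    reverse = clamp + site_3p + reverse_complement(orf[-anneal_len:])
    return forward, reverse


# Toy ORF; BamHI (GGATCC) and XhoI (CTCGAG) chosen only as examples.
fwd, rev = design_primers("ATGAAACCCGGGTTTTAA", "GGATCC", "CTCGAG", anneal_len=6)
print(fwd)
print(rev)
```

The output still deserves the second pair of eyes recommended above, since the script only encodes whatever layout it was given.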
Practical issues with cloning
Digest the PCR insert
- Make sure that there are no internal sites
- Purify the restricted product
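Checking for internal sites is easy to automate before primers are ordered. A minimal sketch, assuming a small table of common six-base palindromic recognition sequences (so only one strand needs to be scanned); the function name `internal_sites` is hypothetical.

```python
# Recognition sequences of a few common enzymes (all palindromic).
SITES = {
    "NdeI": "CATATG",
    "NcoI": "CCATGG",
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
}


def internal_sites(insert, enzymes):
    """Return {enzyme: [1-based positions]} for sites found inside the insert."""
    insert = insert.upper()
    found = {}
    for name in enzymes:
        site = SITES[name]
        positions = []
        pos = insert.find(site)
        while pos != -1:
            positions.append(pos + 1)
            pos = insert.find(site, pos + 1)
        if positions:
            found[name] = positions
    return found


# Toy insert containing one BamHI and one EcoRI site.
print(internal_sites("ATGGGATCCAAAGAATTCTAA", ["BamHI", "EcoRI", "XhoI"]))
```

Any enzyme that appears in the result should not be used for cloning that gene.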
Practical issues with cloning
Ligation and transformation
- If vector control background is low, and PCR product is purified, then there should be no problem
- Use highly competent cells
Practical issues with cloning
Screen for positive clones
- PCR screen from colony
- Screen by protein expression
- Make note of expression, as well as solubility
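[Figure: expression constructs under the T7 promoter, each with the gene and a STOP codon; tag variants shown are 6His/TEV, 6His/TEV/MBP and 6His/TEV/TRX]
Screening for inserts by PCR
[Figure: PCR screen of clones]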
Cloning (conventional method)
TOPO cloning
GATEWAY™ Cloning System Technology - Phage
[Figure: phage integration and excision in E. coli. attB x attP gives attL + attR in the E. coli lysogen (IHF, Int); attL x attR gives attB + attP (IHF, Int, Xis). Site pairs shown: attB1 x attP1, attB2 x attP2, attR1 x attL1, attR2 x attL2]
“Gateway type” cloning
Cloning and Test Expression
[Workflow figure: PCR x 96 clones; ligate; transform (Kan, Amp); 24 x 3 ml LB cultures (Kan, Amp) at 37 °C, induce at OD600, grow O/N at 15 °C or 20 °C; two 300 µl aliquots (x 96 each); spin, dissolve pellet in SDS; spin, freeze, lyse with BugBuster™, spin again, take supernatant; SDS-PAGE (x 96 each)]
[Figure: bar chart (axis 0 to 100) for cloned, expressed and soluble; 1750 clones]
Expression systems for eukaryotic proteins
• Baculovirus infection of insect cells
  • Simple, relatively cost effective, selenomethionine-compatible, not fully able to replicate human post-translational modifications
• Viral infection of human cells
  • Viruses not as easy to work with, high yield, proper modification
• Stable transformation of human cells
  • Usually lower expression. After selection, transcription sometimes goes away. Low throughput due to selection process
• Transfection of human cells
  • High expression in few cells, uses up lots of DNA
Generation of recombinant baculoviruses and gene expression with the Bac-To-Bac expression system
[Figure: pFastBacDual donor (foreign gene 1 under pPolh, foreign gene 2 under p10, flanked by Tn7L and Tn7R) is transformed into competent DH10Bac E. coli cells carrying the bacmid (lacZ, mini-attTn7) and helper; transposition and antibiotic selection give E. coli (LacZ-) containing recombinant bacmid; mini-prep of high molecular weight DNA yields recombinant bacmid DNA; transfection of insect cells with CELLFECTIN reagent; infection for recombinant gene expression or viral amplification. Timeline labels: Day 1, Days 2-3, Day 4, Day 8]
Protein Purification
Parallel purification of proteins
[Figure: gels 1 and 2, lanes 1 to 5 and 1’ to 5’]
ProteoMax – Automated Protein Purification and Concentration System
Affinium Pharmaceuticals
A few observations from our work
Structure determination strategy
[Flowchart: targets split at 20 kDa (< 20 kDa, > 20 kDa). Labeling routes: 15N-labeled and 15N/13C-labeled samples, with 3-5 weeks of NMR data collection; Se-methionine-labeled samples, with synchrotron data]
                              Escherichia coli   Thermotoga maritima
ORFs                          4,288              1,877
Topt                          37 °C              80 °C
Genome size                   4,639,221 bp       1,860,725 bp
Orthologues                   68                 68
Expressed & soluble           62                 48
Concentratable to > 2 mg/ml   50                 44
[Venn diagram of concentratable proteins: 35, 15, 9]
9 proteins could not be purified from either species
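The numbers in this comparison can be cross-checked with simple set arithmetic. The sketch below assumes the three Venn values read as 35, 15 and 9, with 35 as the overlap, 15 as E. coli only and 9 as T. maritima only; that assignment is an inference from the totals, not something stated on the slide.

```python
def venn_totals(only_a, both, only_b, total):
    """Return (total in A, total in B, in neither) for a two-set Venn diagram."""
    in_a = only_a + both
    in_b = only_b + both
    neither = total - (only_a + both + only_b)
    return in_a, in_b, neither


# Assumed reading of the Venn values; 68 orthologue pairs in total.
ecoli, tmaritima, neither = venn_totals(only_a=15, both=35, only_b=9, total=68)
print(ecoli, tmaritima, neither)  # 50 44 9
```

Under this reading the totals reproduce the 50 and 44 concentratable proteins and the 9 that could not be purified from either species.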
[Figure: Venn diagrams for E. coli and T. maritima orthologues: total crystals (30) and total good/promising NMR spectra (14); counts shown in the diagrams: 3, 11, 13, 4, 24, 3, 10, 6]
NMR & Crystallography: complementary!
24 small proteins for which both crystal trials and NMR data collected
[Figure: Venn diagram of good/promising HSQC versus crystals]
Of 32 proteins that gave poor HSQCs, 7 have crystallized
Data storage and Mining: Defined Vocabulary
Property                      Vocabulary
Expression level              0-5 (no expression to high expression)
Solubility (test expression)  0-5 (insoluble to highly soluble)
Concentratability             0-5 (or mg/ml)
Crystal trials                clear, precipitate, crystal
Initial HSQC NMR              good, promising, poor
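A defined vocabulary like this is straightforward to enforce in code so that free-text entries never reach the database. This is a hypothetical record layout, not the schema actually used; the class and field names are assumptions, and concentratability is stored only as a 0-5 score here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CrystalTrial(Enum):
    CLEAR = "clear"
    PRECIPITATE = "precipitate"
    CRYSTAL = "crystal"


class HSQC(Enum):
    GOOD = "good"
    PROMISING = "promising"
    POOR = "poor"


@dataclass
class TargetRecord:
    target_id: str
    expression: int          # 0 (no expression) to 5 (high expression)
    solubility: int          # 0 (insoluble) to 5 (highly soluble)
    concentratability: int   # 0 to 5
    crystal_trial: Optional[CrystalTrial] = None
    hsqc: Optional[HSQC] = None

    def __post_init__(self):
        # Reject scores outside the defined 0-5 range.
        for name in ("expression", "solubility", "concentratability"):
            value = getattr(self, name)
            if not 0 <= value <= 5:
                raise ValueError(f"{name} must be between 0 and 5, got {value}")


# Hypothetical target and scores, for illustration only.
record = TargetRecord("target-001", 5, 4, 3, CrystalTrial.CRYSTAL, HSQC.GOOD)
print(record)
```

Restricting every field to a fixed set of values is what makes later mining across hundreds of targets possible.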
Expression/solubility testing
[Figure: examples scored 5, 5, 4, 3, 2, 1, 0, 0]
Solubility tree based on 58 sequence properties
Kluger & Gerstein
[Figure legend: mostly soluble, mostly insoluble]
Empirical Bioinformatics
[Figure legend: clear drop, precipitate, crystal]
Efficiency through mining crystal screens
[Figure: matrix of crystallization conditions versus different proteins]
Crystal trial: Diminishing Returns
[Figure: plot against number of screening conditions; axis ticks 0 to 300]
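One way to look for diminishing returns in accumulated screen data is to order conditions greedily by how many new proteins each one crystallizes. This is a sketch of a greedy coverage heuristic with made-up data; it is not the analysis behind the figure, and the function name and condition labels are hypothetical.

```python
def greedy_condition_order(hits):
    """Order conditions by marginal gain in distinct proteins crystallized.

    hits maps condition name -> set of proteins that crystallized in it.
    Returns a list of (condition, newly covered, cumulative covered) and
    stops once no remaining condition adds a new protein.
    """
    remaining = dict(hits)
    covered = set()
    order = []
    while remaining:
        best = max(remaining, key=lambda c: (len(remaining[c] - covered), c))
        gain = len(remaining[best] - covered)
        if gain == 0:
            break
        covered |= remaining.pop(best)
        order.append((best, gain, len(covered)))
    return order


# Made-up screen results, for illustration only.
hits = {
    "c1": {"p1", "p2", "p3"},
    "c2": {"p2", "p3"},
    "c3": {"p4"},
    "c4": {"p1"},
}
for condition, gain, cumulative in greedy_condition_order(hits):
    print(condition, gain, cumulative)
```

Plotting the cumulative column against the number of conditions gives a saturation curve, and conditions that never add a new protein are candidates for dropping from the screen.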
Collaborators on Structural Proteomics
Lawrence McIntosh (UBC) C. Mackereth, G. Lee
Mike Kennedy (PNNL)* J. Cort, T. Ramelot
Kalle Gehring (McGill) I. Ekiel, G. Kozlov
Dave Wishart (U. Alberta) S. Bhattacharyya
Weontae Lee (Yonsei U.)
Emil Pai (U. Toronto) V. Saridakis, N. Wu
Thomas Szyperski* (SUNY Buffalo)
Mark Gerstein (Yale)* Yuval Kluger, Ning Lan
Sherry Mowbray (Sweden)
Liang Tong (Columbia)*
John Hunt (Columbia)*
Andrzej Joachimiak (ANL)*
Guy Montelione (Rutgers)*
*Northeast Structural Genomics Consortium
*Midwest Structural Genomics Consortium