Protein purification
Why purify a protein?
• To establish that a particular biological activity (enzymatic activity, signaling capacity, etc.) actually resides in a unique protein
• To use as a tool in biochemical investigations
• To study a protein’s properties
– e.g., determine mass, pI, specific activity
Protein Purification Objective: to separate a particular protein from all other proteins and cell components
In a given cell, the target protein can be 0.001-20% of total protein
Other components:
nucleic acids, carbohydrates, lipids, small molecules
Enzymes are found in different states and locations:
soluble, insoluble, membrane bound, DNA bound,
in organelles, cytoplasmic, periplasmic, nuclear
Protein purification: purposes
1. Preparative – aims to produce a large quantity of purified protein for subsequent use
– Used in the preparation of commercial products: enzymes, nutritional proteins, biopharmaceuticals (e.g., insulin)
2. Analytical – produces a relatively small amount of protein for a variety of research or analytical purposes
– e.g., identification, quantification, determination of the protein's structure, post-translational modifications and function
Protein purification strategy
• Move from organism to pure protein in as few steps as possible, with as little loss of activity (assayable quality) as possible
– Time and temperature are factors
• Highly individualized
• Use a common approach
– Fractionate the crude extract in a way that the protein of interest always goes into either the pellet or the supernatant
– Follow progress with functional assay
How do we recognize the protein that we are looking for?
• Have to continually know how much of your protein is present and how much contaminating material is present
– Ratio = specific activity
• Viable assay needed
– Should target a unique identifying property of the protein
• For enzymes, usually based on the reaction it catalyzes in the cell
– If the measured unit is well defined (e.g., a 0.1 optical density increase in a 1 cm cuvette at 650 nm, 37 °C and pH 6.2), the amount of enzyme present in the sample can be estimated
Lactate dehydrogenase assay
• NADH absorbs light at 340 nm
• assay for lactate dehydrogenase activity → increase in A340 observed in 1 minute
• In general, as the number of units per mg of protein (the specific activity) increases, so does the purity
• Assays need to be relatively rapid and reproducible to be useful
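The conversion from an absorbance change to enzyme units can be sketched with the Beer-Lambert law. This is only an illustration under stated assumptions: the molar absorptivity of NADH at 340 nm is taken as the commonly quoted 6220 M^-1 cm^-1, the cuvette path is 1 cm, and one unit is defined as 1 µmol of NADH formed per minute (a common convention, not one given on the slide). The function names are hypothetical.

```python
# Assumed literature value for NADH at 340 nm, in M^-1 cm^-1.
NADH_EXT_COEFF = 6220.0


def ldh_units(delta_a340_per_min, assay_volume_ml, path_cm=1.0):
    """Return activity in units (µmol NADH formed per minute).

    Beer-Lambert: A = epsilon * c * l, so the rate of concentration
    change is (dA/dt) / (epsilon * l), in mol per litre per minute.
    """
    molar_per_min = delta_a340_per_min / (NADH_EXT_COEFF * path_cm)
    litres = assay_volume_ml * 1e-3
    return molar_per_min * litres * 1e6  # mol -> µmol


def specific_activity(units, protein_mg):
    """Units per mg of total protein in the assayed sample."""
    return units / protein_mg


if __name__ == "__main__":
    units = ldh_units(delta_a340_per_min=0.622, assay_volume_ml=1.0)
    print(f"{units:.3f} units")
    print(f"{specific_activity(units, protein_mg=0.02):.1f} units/mg")
```

Tracking the second number across fractions is what lets the progress of a purification be followed.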
Separation processes that can be used to fractionate proteins
Separation process   Method                                    Basis of separation
Precipitation        ammonium sulfate                          solubility
                     polyethyleneimine (PEI)                   charge, size
                     isoelectric                               solubility, pI
Chromatography       gel filtration (SEC)                      size, shape
                     ion exchange (IEX)                        charge, charge distribution
                     hydrophobic interaction (HIC)             hydrophobicity
                     DNA affinity                              DNA binding site
                     immunoaffinity (IAC)                      specific epitope
                     chromatofocusing                          pI
Electrophoresis      gel electrophoresis (PAGE)                charge, size, shape
                     isoelectric focusing (IEF)                pI
Centrifugation       sucrose gradient                          size, shape, density
Ultrafiltration      ultrafiltration (UF)                      size, shape
Other considerations
• pH
– Buffer solution should be in pH range over which material is stable
• Proteases
– Should be removed
– Add protease inhibitors
• Temperature
– Many proteins denature at T > 25°C
– Other proteins are cold-labile
Protein sources for purification
• Traditional natural sources
– bacteria, animal and plant tissue
• Cloning recombinant proteins into overexpression vector/host systems for intracellular production
– E. coli the most widely used
• In vitro protein synthesis
– Transcription/translation systems
Steps in recombinant protein purification
1. Design expression plasmid, transform, select
2. Grow culture of positive clone, induce expression
3. Lyse cells
4. Centrifuge to isolate protein-containing fraction
5. Column Chromatography—collect fractions
6. Assess purity on SDS-PAGE
The insertion of a DNA fragment into a bacterial plasmid with the enzyme DNA ligase
Purification and amplification of a specific DNA sequence by DNA cloning in a bacterium
Protein purification steps from traditional sources
• Extraction from source
– Homogenization
– Differential centrifugation
• Protein enrichment
– Salt precipitation
– Isoelectric precipitation
• Protein purification
– Column chromatography
• Determination of yield, activity
Methods of Solubilization
• Protein must be liberated from the cells that contain it
• Method of choice depends on
1. mechanical characteristics of the source tissue
2. location of the required protein
• E.g., if the target protein is cytosolic, only osmotic lysis of the cells is required
– Cells are suspended in a hypotonic solution
• Cells that have a cell wall, such as bacteria or plant cells, also require degradation of the cell wall
– Can use lysozyme, which degrades bacterial cell walls
Other means of cellular disruption
Many cells require some sort of mechanical disruption process to break them open.
several cycles of freezing and thawing
grinding with sand, alumina or glass beads
high-speed blender
homogenizer
French press
Sonicator
Once the cells have been broken open, the crude lysate may be filtered or centrifuged to remove cell debris
Differential centrifugation
Salting out
• central role in all purification schemes
• Different proteins precipitate at different salt concentrations
– [salt] may be adjusted to precipitate the target protein
• Ammonium sulfate is the most commonly used reagent for salting out proteins
• Salting out leaves the protein in a high-salt environment
– Easiest method for eliminating salt = dialysis
Kosmotrope vs. Chaotrope
• Ammonium sulfate
– Increasing concentration causes proteins to precipitate stably
– Kosmotropic ion = stabilizing ion
• Urea
– Increasing concentration denatures proteins; when they finally do precipitate, it is random and aggregated
– Chaotropic agent = denaturing agent
Dialysis
• Employs a semipermeable membrane – e.g., collodion bag
• Molecules with dimensions >> pore diameter retained inside the dialysis bag
• smaller molecules, ions can pass through membrane and into the dialysate outside the bag
• useful for removing a salt or other small molecules
• Takes many hours, usually overnight
• Not easily used for large-scale purification
• Protein molecules (red) are retained within the dialysis bag
• Small molecules (blue) diffuse into the surrounding medium
• Can replace external buffer multiple times
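To see why replacing the external buffer several times helps, here is a minimal arithmetic sketch. It assumes that the salt passes freely through the membrane, that each round reaches full equilibrium, and that the sample volume does not change; real dialysis only approximates these conditions. The function name and the example volumes are illustrative.

```python
def residual_salt(c0_molar, sample_ml, buffer_ml, changes):
    """Salt concentration left in the bag after a number of buffer changes.

    At equilibrium the salt is spread evenly over sample + buffer, so each
    round dilutes it by sample / (sample + buffer).
    """
    dilution_per_round = sample_ml / (sample_ml + buffer_ml)
    return c0_molar * dilution_per_round ** changes


if __name__ == "__main__":
    # 10 ml of sample in 2 M salt, dialysed against 990 ml of fresh buffer.
    for n in range(1, 4):
        print(n, f"{residual_salt(2.0, 10, 990, n):.2e} M")
```

Under these assumptions, three changes of a modest buffer volume remove far more salt than a single round against three times the volume.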
Recombinant protein expression
Reasons for producing recombinant proteins:
• To study its function
• To analyze its physical properties
• To determine its sequence
• For industrial or therapeutic applications
Protein pharmaceuticals
• Natural sources are often rare and expensive
– Difficult to keep up with demand
– Hard to isolate product
– Can lead to immune reactions (different species)
– Risk of viral/pathogen contamination
• Most protein pharmaceuticals today are produced using recombinant methods
– Cheaper, safer, abundant supply
Recombinant proteins for human use
Steps in recombinant protein purification
1. Design recombinant expression vector, transform, select
2. Grow culture of positive clone, induce expression
3. Lyse cells
4. Centrifuge to isolate protein-containing fraction
5. Column Chromatography—collect fractions
6. Assess purity on SDS-PAGE
Vector selection
• Must be compatible with the host cell system
– prokaryotic vectors for prokaryotic cells
– eukaryotic vectors for eukaryotic cells
• Needs a good combination of
– strong promoters
– ribosome binding sites
– termination sequences
– affinity tag or solubilization sequences
– multi-enzyme restriction site
Key vector components
• Origin of replication (ORI): DNA sequence required for initiation of replication
• Selectable marker (Amp or Tet resistance)
• Inducible promoter: short DNA sequence which enhances expression of the adjacent gene
• Multi-cloning site (MCS)/polylinker region: facilitates insertion of the gene
A generic vector
Vector elements
• Promoters
– arabinose systems (pBAD), phage T7 (pET), Trc/Tac promoters, phage lambda PL or PR
• Tags
– His6 for metal affinity chromatography (Ni)
– FLAG epitope tag DYKDDDDK
– CBP-calmodulin binding peptide (26 residues)
– E-coil/K-coil tags (poly E35 or poly K35)
– c-myc epitope tag EQKLISEEDL
– Glutathione-S-transferase (GST) tags
– Cellulose binding domain (CBD) tags
Protein expression in E. coli: example
pGEX plasmid:
• Gene encoding the affinity tag, glutathione S-transferase (GST)
• Spacer between genes
– encodes protease cleavage site (thrombin)
• Ptac promoter
– inducible with IPTG
• Ribosome binding site
Ligation inserts gene in-frame with GST tag
In frame in pGEX-2T BamHI
CTG GTT CCG CGT GGA TCC CCG GGA ATT CAT CGT GAC TGA CTG ACG
L V P R G S P G I H R D *
Insert into BamHI site BamHI insert BamHI
CTG GTT CCG CGT GGA TCC CTG GGT GAG CGT GAA GCG GGA TCC CCG GGA ATT CAT CGT GAC TGA
L V P R G S L G E R E A G S P G I H R D *
Out of frame in pGEX-3X BamHI
ATC GAA GGT CGT GGG ATC CCC GGG AAT TCA TCG TGA CTG ACT GAC
I E G R G I P G N S S *
Insert into BamHI site BamHI insert BamHI
ATC GAA GGT CGT GGG ATC CCT GGG TGA GCG TGA AGC GGG ATC CCC GGG AAT TCA TCG TGA
I E G R G I P G * A * S G I P G N S S *
* indicates stop codon
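The reading-frame check above can be reproduced with a short translation routine. This is a minimal sketch using the standard genetic code; the sequences are the two BamHI insertions shown above, and the function name `translate` is illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna, stop_at_first=True):
    """Translate DNA from its first base; '*' marks a stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        protein.append(aa)
        if aa == "*" and stop_at_first:
            break
    return "".join(protein)


PGEX_2T_INSERT = ("CTG GTT CCG CGT GGA TCC CTG GGT GAG CGT GAA GCG "
                  "GGA TCC CCG GGA ATT CAT CGT GAC TGA")
PGEX_3X_INSERT = ("ATC GAA GGT CGT GGG ATC CCT GGG TGA GCG TGA AGC "
                  "GGG ATC CCC GGG AAT TCA TCG TGA")

if __name__ == "__main__":
    print("pGEX-2T:", translate(PGEX_2T_INSERT))  # insert read in frame
    print("pGEX-3X:", translate(PGEX_3X_INSERT))  # premature stop
```

The same insert gives a full-length fusion in pGEX-2T but hits a stop codon after eight residues in pGEX-3X, because the BamHI site sits in a different frame relative to the GST coding sequence.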
IPTG-inducible protein expression: recall lac operon
Bacterial systems
Advantages
• Grow quickly (8-12 hrs to produce protein)
• High yields (50-500 mg/L)
• Low cost of media (simple media constituents)
• Low fermentor costs
Disadvantages
• Difficulty expressing large proteins (>50 kD)
• No glycosylation or signal peptide removal
• Eukaryotic proteins are sometimes toxic
• Can’t handle S-S rich proteins
Alternatives to bacterial protein production
• Use different expression system
• Use different host for protein expression
Yeast (Pichia pastoris)
Virus (Baculovirus)
Mammalian cell culture
Plants
Sheep/Cows
Cloning and transforming in yeast cells
• Yeast = single celled eukaryotes
– But bacteria-like in their ability to reproduce
• P. pastoris = methylotrophic yeast
– can use methanol as sole carbon source (using alcohol oxidase)
• Has a very strong promoter for the alcohol oxidase (AOX) gene
– ~30% of protein produced when induced
Pichia pastoris cloning
• Uses a special plasmid that works both in E. coli and yeast
• Once gene of interest is inserted into this plasmid, vector is linearized
• Double cross-over recombination event occurs
– causes the gene of interest to insert directly into the P. pastoris chromosome where the old AOX gene used to be
– gene of interest is now under control of the AOX promoter
Yeast systems
Advantages
• Grow quickly (12-24 hrs to produce protein)
• Very high yields (50-5000 mg/L)
• Low cost of media (simple media constituents)
• Low fermentor costs
• Can express large proteins (>50 kD)
• Glycosylation and signal peptide removal
• Has chaperonins to help fold “tough” proteins
• Can handle S-S rich proteins
Baculovirus expression
• Autographa californica multiple nuclear polyhedrosis virus (AcMNPV; a baculovirus)
• Commonly infects cells of insects such as the alfalfa looper (a moth) or armyworms (and their larvae)
• Uses the super-strong promoter of the polyhedrin coat protein gene to enhance protein expression
[Figure: Baculovirus expression. A double cross-over (x x) between the transfer vector carrying the cloned gene and AcMNPV DNA replaces the polyhedrin gene, giving recombinant AcMNPV DNA that carries the cloned gene.]
Baculovirus systems
Advantages
• Can express large proteins (>50 kD)
• Correct glycosylation & signal peptide removal
• Has chaperonins to help fold “tough” proteins
• Very high yields, cheap
Disadvantages
• Grow very slowly (10-12 days for set-up)
• Cell culture is only sustainable for 4-5 days
• Set-up is time consuming, not as simple as yeast
Baculovirus successes
• Alpha and beta interferon
• Adenosine deaminase
• Erythropoietin
• Interleukin 2
• Poliovirus proteins
• Tissue plasminogen activator (TPA)
Mammalian cell line expression
• With the exception of budding yeast, plasmids are uncommon in eukaryotes
• most eukaryotic vectors based on DNA or RNA viral genomes – e.g., SV40
• Sometimes required for difficult-to-express proteins
• Cells are typically derived from the Chinese Hamster Ovary (CHO) cell line
Mammalian expression systems
• Vectors usually use SV-40 virus, CMV or vaccinia virus promoters and DHFR (dihydrofolate reductase) as the selectable marker gene
Mammalian protein expression
• Gene initially cloned and plasmid propagated in bacterial cells
• Mammalian cells transformed by electroporation
• Gene integrates one or more times into random locations within different chromosomes
• Multiple rounds of growth and selection using methotrexate to select for cells with highest expression, integration of DHFR and the gene of interest
Methotrexate (MTX) selection
Foreign gene expressed at high levels in MTX-resistant cells
Mammalian systems
Advantages
• Can express large proteins (>50 kD)
• Correct glycosylation & signal peptide removal, generates authentic proteins
• Has chaperonins to help fold “tough” proteins
Disadvantages
• Selection takes time (weeks for set-up)
• Cell culture is only sustainable for a limited period of time
• Set-up is very time consuming, costly
• Modest yields
Mammalian cell protein expression successes
• Factor IX
• Factor VIII
• Gamma interferon
• Interleukin 2
• Human growth hormone
• Tissue plasminogen activator (TPA)
Expression system selection
Choice depends on size and character of protein
Large proteins (>100 kD)? Choose eukaryote
Small proteins (<30 kD)? Choose prokaryote
Glycosylation essential? Choose baculovirus or mammalian cell culture
High yields, low cost? Choose E. coli
Post-translational modifications essential? Choose yeast, baculovirus or other eukaryote
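The rules of thumb above can be written down as a small lookup. This is only a sketch of the slide's guidance, not a validated selection procedure: the function name and arguments are hypothetical, and the rules are returned in slide order without any ranking when several apply.

```python
def suggest_systems(mass_kd, needs_glycosylation=False, needs_ptm=False,
                    want_cheap_high_yield=False):
    """Collect every rule of thumb from the slide that applies."""
    hints = []
    if mass_kd > 100:
        hints.append("eukaryote")
    if mass_kd < 30:
        hints.append("prokaryote")
    if needs_glycosylation:
        hints.append("baculovirus or mammalian cell culture")
    if want_cheap_high_yield:
        hints.append("E. coli")
    if needs_ptm:
        hints.append("yeast, baculovirus or other eukaryote")
    return hints


if __name__ == "__main__":
    print(suggest_systems(20, want_cheap_high_yield=True))
    print(suggest_systems(150, needs_glycosylation=True))
```

Conflicting hints (for example, a small protein that must be glycosylated) are left for the user to weigh, since the slide gives no order of precedence.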
Engineering proteins for ease of purification and detection
• Once you have a gene cloned and can over-express the protein, you can alter protein to improve the ease of purification or detection
• Can fuse a tag to the N- or C-terminus of your protein
• Can decide to remove the tag or not
Basic strategies
• Add signal sequence that causes secretion into culture medium
• Add protein that helps the protein refold and stay soluble
• Add sequence that aids in precipitation
• Add an affinity handle (by far the most used is the His-tag)
• Add sequence that aids in detection
Protein purification by column chromatography
1. Protein mixture applied to column
2. Solvent applied to top, flows through column
3. Proteins travel through matrix at different rates
4. Proteins collected separately in different fractions
Protein properties: handles for fractionation
Net charge
Ionizable group           pKa     Charge below pKa   Charge above pKa
C-terminal (COOH)         4.0     o                  -
Aspartate (COOH)          4.5     o                  -
Glutamate (COOH)          4.6     o                  -
Histidine (imidazole)     6.2     +                  o
N-terminal (amino)        7.3     +                  o
Cysteine (SH)             9.3     o                  -
Tyrosine (phenol)         10.1    o                  -
Lysine (amino)            10.4    +                  o
(+ positive, o neutral, - negative; original chart spans pH 2 to pH 12)
Isoelectric point: pI = pH where protein has zero net charge
• typical range of pI = 4-9
Charge distribution
[Figure: surface charges spread uniformly versus clustered]
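Net charge at a given pH, and from it an estimate of the pI, can be sketched with the Henderson-Hasselbalch relation and the pKa values listed above. This is a simplified model: it treats every group as independent, ignores the local environment in a folded protein, and covers only the groups in the table. The function names and the example composition are illustrative.

```python
# (pKa, charge of the protonated form), using the pKa values from the slide.
GROUPS = {
    "C-terminal": (4.0, 0),
    "Aspartate": (4.5, 0),
    "Glutamate": (4.6, 0),
    "Histidine": (6.2, 1),
    "N-terminal": (7.3, 1),
    "Cysteine": (9.3, 0),
    "Tyrosine": (10.1, 0),
    "Lysine": (10.4, 1),
}


def net_charge(counts, ph):
    """Sum the average charge of each ionizable group at the given pH."""
    total = 0.0
    for group, n in counts.items():
        pka, protonated_charge = GROUPS[group]
        frac_protonated = 1.0 / (1.0 + 10 ** (ph - pka))
        # Losing the proton lowers the charge of the group by one.
        total += n * (protonated_charge - (1.0 - frac_protonated))
    return total


def isoelectric_point(counts, lo=0.0, hi=14.0, tol=1e-4):
    """Bisect for the pH of zero net charge.

    Assumes the composition has both acidic and basic groups, so the
    net charge changes sign between lo and hi.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(counts, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical composition, for illustration only.
    protein = {"N-terminal": 1, "C-terminal": 1, "Aspartate": 6,
               "Glutamate": 8, "Histidine": 2, "Lysine": 10}
    print(f"net charge at pH 7: {net_charge(protein, 7.0):+.2f}")
    print(f"estimated pI: {isoelectric_point(protein):.2f}")
```

Comparing the estimated pI with the working pH is one way to decide between a cation and an anion exchanger.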
Protein properties: handles for fractionation
• Hydrophobicity
– Hydrophobic residues are usually buried internally
– The number and distribution on the surface vary
– Can use hydrophobic interaction chromatography
[Figure: protein surface with a hydrophobic patch (H H H)]
• Solubility
– Varies from barely soluble (<mg/ml) to very soluble (>300 mg/ml)
– Varies with pH, ionic strength/type, polarity of solvent, T
– Least soluble at the isoelectric point, where there is least charge repulsion
• Ligand and metal binding
– Affinity for cofactors, substrates, effector molecules, metals, DNA
– When a ligand is immobilized on a bead, you have an affinity bead
Chromatographic modes of protein purification
Chromatographic mode                      Acronym   Separation principle
Non-interactive modes of liquid chromatography
Size-exclusion chromatography             SEC       Differences in molecular size
Slalom chromatography (for DNA)           -         Differences in length and flexibility
Interactive modes of liquid chromatography
Ion-exchange chromatography               IEC       Electrostatic interactions
Normal-phase chromatography               NPC       Polar interactions
Reversed-phase chromatography             RPC       Dispersive interactions
Hydrophobic interaction chromatography    HIC       Dispersive interactions
Affinity chromatography                   AC        Biospecific interaction
Metal interaction chromatography          MIC       Complex with an immobilized metal
(Christian G. Huber, Biopolymer Chromatography, Encyclopedia of Analytical Chemistry, 2000)
Gel filtration chromatography: separation by size
Beads have pores of different sizes. As the column flows:
• large proteins are excluded from the pores; flow rapidly
• small proteins enter the pores; flow slowly
Gel filtration by HPLC clearly defines the individual proteins because of its greater resolving power
1. thyroglobulin (669 kd)
2. catalase (232 kd)
3. bovine serum albumin (67 kd)
4. ovalbumin (43 kd)
5. ribonuclease (13.4 kd)
Ion-exchange chromatography
• Separates analytes based on charge
• Electrostatic interactions with the stationary phase cause a substance to move more slowly through the column
• 2 types:
1. Cation exchanger
2. Anion exchanger
Ion exchange chromatography: separation by charge
• Target proteins eluted with increasing amount of salt (NaCl or KCl)
• Can elute an ion exchange column with a gradient of salt concentrations or by step elution
• Most protein purification is done on anion exchange columns
– most proteins are negatively charged at physiological pH values (pH 6-8)
– proteins can become inactivated at extreme pH values, so these are avoided
Ion exchange chromatography
Net charge of protein   Type of ion-exchange column   Counter ion
Positive                Cation exchange               Na+
Negative                Anion exchange                Cl-
Ion-exchange chromatography
Exercise: Predict the order of elution of serine, glutamic acid and histidine through a cation-exchange column
Hydrophobic interaction chromatography
• Hydrophobic amino acids are not normally exposed on proteins
– protein is usually surrounded by water molecules, except when in high salt concentration
• When in high salt, hydrophobic regions are exposed
– bind to matrix beads coated with hydrophobic fatty acid chains
• Column is eluted with decreasing [salt]
– Proteins usually elute only at very low [salt]
• Very useful as a next step after proteins are eluted from ion exchange columns with high salt
Affinity chromatography: separation by biological binding interactions
Example: GST-glutathione
• GST-tagged proteins bind to glutathione on beads
• Non-specifically or weakly bound proteins are washed off
• GST-tagged proteins are eluted with glutathione (competitor) or thrombin (protease)
GST-glutathione affinity chromatography
[Figure: a fusion of GST, thrombin site and protein of interest binds to glutathione on a porous bead; steps shown are apply sample, wash, elute]
Immunoaffinity chromatography
(http://www.cellmigration.org/resource/discovery/discovery_proteomics_approaches.html)
Problems with immunoaffinity chromatography
• Theoretically, a monoclonal antibody (MAb) can be produced and attached to a stationary phase that will bind the protein target
• Problems:
1. MAbs are expensive and difficult to purify
2. Other proteins may inactivate them or bind non-specifically to them
3. Some of the MAbs may leach off the column during the purification and must be removed
• Affinity columns are usually used as an expensive last resort!
– done late in the process, when the volume has been reduced and the majority of contaminants have already been removed
Commercially available protein purification kits
• GST•Bind™ Purification Kits
• His•Bind® Purification Kits
• Magnetight™ Oligo d(T) Beads
• MagPrep® Streptavidin Beads
• Protein A and Protein G Plus Agaroses
• S•Tag™ Purification Kits
• Streptavidin Agarose
• T7•Tag™ Affinity Purification Kit
• ProteoSpin™ CBED (Concentration, Buffer Exchange and Desalting) Maxi Kit (Norgen Biotek Corporation): desalts and concentrates up to 8 mg of protein
• ProteoSpin™ Detergent Clean-up Micro Kit: removes detergents including SDS, Triton® X-100, CHAPS, NP-40 and Tween 20
(http://www.emdbiosciences.com)
Protein purification by chromatography
Typical protein purification scheme
Table 4.1. Quantification of a purification protocol for a fictitious protein
Step                                  Total protein (mg)   Total activity (units)   Specific activity (units mg-1)   Yield (%)   Purification level
Homogenization                        15,000               150,000                  10                               100         1
Salt fractionation                    4,600                138,000                  30                               92          3
Ion-exchange chromatography           1,278                115,500                  90                               77          9
Molecular exclusion chromatography    68.8                 75,000                   1,100                            50          110
Affinity chromatography               1.75                 52,500                   30,000                           35          3,000
Purification parameters
• Yield – protein activity recovered/starting protein activity
• Purification level – measure of the increase in purity = specific activity at a purification step/specific activity of the initial extract
• Total protein – [protein] measured in a part of each fraction x the fraction's total volume
• Total activity – enzyme activity per unit volume of the fraction assayed x the fraction's total volume
• Specific activity – total activity/total protein
A good purification scheme takes into account both purification levels and yield.
• A high degree of purification but poor yield leaves little protein for experiments
• A high yield with low purification leaves many contaminants in the fraction – complicates the interpretation of experiments.
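The derived columns of Table 4.1 follow directly from total protein and total activity. As a minimal sketch (the step names and numbers are those of the fictitious protein in the table; the function name is illustrative):

```python
# (step, total protein in mg, total activity in units), from Table 4.1.
STEPS = [
    ("Homogenization", 15000.0, 150000.0),
    ("Salt fractionation", 4600.0, 138000.0),
    ("Ion-exchange chromatography", 1278.0, 115500.0),
    ("Molecular exclusion chromatography", 68.8, 75000.0),
    ("Affinity chromatography", 1.75, 52500.0),
]


def purification_table(steps):
    """Compute specific activity, yield and purification level per step."""
    _, protein0, activity0 = steps[0]
    specific0 = activity0 / protein0
    rows = []
    for name, protein_mg, activity_units in steps:
        specific = activity_units / protein_mg
        rows.append({
            "step": name,
            "specific_activity": specific,                      # units/mg
            "yield_percent": 100.0 * activity_units / activity0,
            "purification_level": specific / specific0,         # fold
        })
    return rows


if __name__ == "__main__":
    for row in purification_table(STEPS):
        print(f"{row['step']:<36}"
              f"{row['specific_activity']:>10.0f}"
              f"{row['yield_percent']:>8.0f}"
              f"{row['purification_level']:>8.0f}")
```

The table rounds some entries: for example, the computed specific activity after molecular exclusion chromatography is about 1,090 units/mg, shown as 1,100.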
Causes for getting inactive proteins
• Requires molecular chaperone for proper folding
• Requires post-translational modifications
• Cofactors/protein partners needed for proper folding
– up to 1/3 of eukaryotic proteins may be “natively” unfolded until binding their protein partner
• Need to obtain these proteins as correctly folded molecules!
Protein Characterization
Characterization of proteins and peptides involves different processes:
1. Determining the MW, pI of the Protein
2. Determining aa composition, sequence
3. Determining 3d structure
4. Determining interactions with proteins, other molecules
5. Determining function
Protein structure determination methods
High resolution: X-ray crystallography | NMR | Electron crystallography
Medium resolution: Cryo-electron microscopy | Fiber diffraction | Mass spectrometry
Spectroscopic: NMR | Circular dichroism | Absorbance | Fluorescence | Fluorescence anisotropy
Translational diffusion: Analytical ultracentrifugation | Size exclusion chromatography | Light scattering | NMR
Rotational diffusion: Fluorescence anisotropy | Flow birefringence | Dielectric relaxation | NMR
Chemical: Hydrogen-deuterium exchange | Site-directed mutagenesis | Chemical modification
Thermodynamic: Equilibrium unfolding
Computational: Protein structure prediction | Molecular docking
Topics for reporting
1. Industrial scale protein purification
2. Techniques in proteome analysis
3. X-ray crystallography
4. Protein NMR
5. Circular dichroism
6. Fluorescence techniques for protein structure analysis
7. Comparative modeling
– Homology modeling, protein threading
8. Ab initio / de novo protein structure prediction