
Genetically Encoded Synthesis of Protein-Based Polymers with Precisely Specified Molecular Weight and Sequence by Recursive Directional Ligation: Examples from the Elastin-like Polypeptide System

Dan E. Meyer and Ashutosh Chilkoti*

Department of Biomedical Engineering, Box 90281, Duke University, Durham, North Carolina 27708-0281

Received October 22, 2001; Revised Manuscript Received December 19, 2001

We report a new strategy for the synthesis of genes encoding repetitive, protein-based polymers of specified sequence, chain length, and architecture. In this stepwise approach, which we term “recursive directional ligation” (RDL), short gene segments are seamlessly combined in tandem using recombinant DNA techniques. The resulting larger genes can then be recursively combined until a gene of a desired length is obtained. This approach is modular and can be used to combine genes encoding different polypeptide sequences. We used this method to synthesize three different libraries of elastin-like polypeptides (ELPs); each library encodes a unique ELP sequence with systematically varied molecular weights. We also combined two of these sequences to produce a block copolymer. Because the thermal properties of ELPs depend on their sequence and chain length, the synthesis of these polypeptides provides an example of the importance of precise control over these parameters that is afforded by RDL.

Introduction

Protein-based polymers, which are composed of repeat units of natural or unnatural amino acids, have recently emerged as a promising new class of materials.1-4 They are attractive from a fundamental materials science perspective because their genetically encoded synthesis provides precise control, to a level unattainable using chemical polymerization techniques, over the primary architectural features of polymers, namely sequence, chain length, and stereochemistry. Furthermore, these materials frequently also have desirable mechanical, chemical, and biological properties (e.g., biocompatibility, biodegradation) that make their use appealing as biomaterials and tissue engineering scaffolds. The sequence and molecular weight (MW) of repetitive polypeptides are of particular importance because these two primary architectural variables determine the physicochemical properties of the macromolecule. These variables are also important for in vivo applications because the MW controls pharmacokinetics and transport phenomena, while the amino acid sequence can impart biological activity and often determines the biodegradation of the polypeptide.

The precise and rapid synthesis of genes encoding a polypeptide of desired sequence and length is therefore a key requirement for producing genetically encoded, repetitive polypeptides for specific applications. Although a number of different strategies have been developed to assemble synthetic genes for such polypeptides,5-7 most methods have focused upon the simultaneous generation of a library of oligomeric genes by concatemerization of a monomer gene.8-12 Concatemerization, the self-ligation of a DNA monomer with cohesive ends, has the advantage of creating, in a single ligation step, a library of genes that encode oligomeric polypeptides with the same repeat sequence but different sizes. Although concatemerization is rapid, it sacrifices precise control over the oligomerization process because it is a statistical process that yields a population of DNA oligomers with a distribution of different lengths. The average degree of oligomerization can be partially controlled by varying the ligation conditions, but concatemerization does not guarantee the synthesis of a gene with a desired length. Because concatemerization does not guarantee a priori that a clone will be isolated that contains an insert encoding the desired number of peptide repeats, it is a useful synthetic strategy when a range of MWs needs to be rapidly generated but where synthesis of a gene encoding a specified number of peptide repeats is of secondary importance.

We report here a general strategy, which we term “recursive directional ligation” (RDL), for the synthesis of repetitive polypeptides of a specified chain length. This gene-level approach involves controlled, stepwise oligomerization of a DNA monomer to yield a library of oligomers ranging from the monomer to an oligomer of a required chain length (Figure 1). We employed RDL to synthesize elastin-like polypeptides (ELPs), which are based on the repetitive pentapeptide motif Val-Pro-Gly-Xaa-Gly (where the “guest residue” Xaa is any amino acid except Pro). ELPs are an interesting class of polypeptides because they undergo an inverse temperature phase transition.13,14 ELPs are soluble in aqueous solution below the inverse transition temperature (Tt, also known as the lower critical solution temperature or LCST). However, when the temperature is raised above the Tt, they undergo a sharp (∼2 °C range) phase transition leading to desolvation and aggregation of the polypeptide. The transition can be induced by changes in temperature, ionic strength, or pH and is completely reversible.14 Numerous applications of ELPs in biotechnology and medicine have been proposed.7,15-20

* To whom correspondence should be addressed at [email protected].

Biomacromolecules 2002, 3, 357-367. 10.1021/bm015630n CCC: $22.00 © 2002 American Chemical Society. Published on Web 01/30/2002.

We developed RDL as a synthetic strategy for ELPs in particular because we wished to study the effect of sequence, MW, and the architecture (e.g., in ELP block copolymers) of ELPs on their thermal behavior in solution. The ability to precisely control these variables is important because the Tt of a given ELP is independently related to its sequence (i.e., the identity of the guest residues) and to its chain length.21,22 Using RDL, we synthesized genes encoding oligomeric libraries of three different ELP sequences and combined two of these genes to encode a block copolymer. Because the sequence and chain length determine the thermal properties of ELPs, these polypeptides provide an excellent example of the importance of the precise control over these parameters that can be obtained by RDL.

Materials and Methods

Materials. Restriction endonucleases, the pUC19 cloning vector, and T4 DNA ligase were purchased from New England Biolabs (Beverly, MA). Calf intestinal alkaline phosphatase (CIP) and Taq DNA polymerase were obtained from Gibco BRL-Life Technologies (Grand Island, NY). Plasmid DNA was purified using spin miniprep kits from QIAGEN, Inc. (Valencia, CA). The pET-25b(+) expression vector and the BLR(DE3) E. coli strain were purchased from Novagen Inc. (Milwaukee, WI). All cultures were grown in CircleGrow medium from Q-BIOgene (Carlsbad, CA), and polypeptide expression was induced with isopropyl β-D-thiogalactopyranoside (IPTG) from Teknova, Inc. (Half Moon Bay, CA). Custom oligonucleotides were synthesized by Integrated DNA Technologies, Inc. (Coralville, IA). Precast Mini-Protean SDS-PAGE gels were from BioRad, Inc. (Hercules, CA).

Nomenclature. We distinguish the different ELP constructs using the notation ELP[XiYjZk-n]. The bracketed capital letters are the single letter amino acid codes specifying the guest residues in the ELP sequence, and the subscripts (i, j, and k) designate the number of Val-Pro-Gly-Xaa-Gly repeats for each corresponding guest residue in the monomer gene. The total length of the ELP gene in number of pentapeptides is specified by n. For example, ELP[V5A2G3-180] is an ELP of 180 pentapeptides in length that has a repeat unit composed of 10 pentapeptides with the guest residues Val, Ala, and Gly in a 5:2:3 ratio, respectively. If n is unspecified, the notation refers to a MW library of the given ELP sequence, rather than to an ELP construct of specific length.
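The notation can be read mechanically. The following is a minimal sketch of a parser for construct names written in this style; the function names `parse_elp` and `monomer_length` are hypothetical helpers introduced for illustration and are not part of the original work.

```python
import re

# Matches names such as "ELP[V5A2G3-180]" (specific construct)
# or "ELP[V5A2G3]" (MW library, n unspecified).
_NAME = re.compile(r"ELP\[((?:[A-Z]\d+)+)(?:-(\d+))?\]")


def parse_elp(name):
    """Return (guest_counts, n) for an ELP construct name.

    guest_counts maps each guest residue (single letter code) to the number
    of pentapeptides carrying it in the monomer gene; n is the total number
    of pentapeptides, or None when the name denotes a whole library.
    """
    match = _NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not an ELP construct name: {name!r}")
    guest_counts = {
        residue: int(count)
        for residue, count in re.findall(r"([A-Z])(\d+)", match.group(1))
    }
    n = int(match.group(2)) if match.group(2) is not None else None
    return guest_counts, n


def monomer_length(guest_counts):
    """Number of pentapeptides encoded by one monomer gene."""
    return sum(guest_counts.values())


guests, n = parse_elp("ELP[V5A2G3-180]")
print(guests, n, monomer_length(guests))
```

For ELP[V5A2G3-180] the parser gives a 5:2:3 Val:Ala:Gly guest ratio, a 10-pentapeptide repeat unit, and n = 180, which agrees with the worked example in the text.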

Monomer Gene Synthesis. Standard molecular biology protocols were used for DNA manipulation, E. coli culture, and protein expression.23 The DNA sequences of the monomer genes for the three libraries are shown in Figure 2A. For the ELP[V5A2G3] library, a synthetic gene (150 bp) encoding a 50 amino acid ELP repeat was constructed from four 5′-phosphorylated, PAGE-purified synthetic oligonucleotides. Although this is the monomer gene for RDL, we term it a “10-mer” because it encodes 10 pentapeptides. The oligonucleotides were annealed by heating an equimolar mixture of the four oligonucleotides (2 µM each in ligase buffer) to >95 °C and then slowly cooling to room temperature to form a double-stranded DNA cassette with EcoR I and HinD III compatible ends. pUC19 was codigested with EcoR I and HinD III and enzymatically dephosphorylated using CIP. The linearized pUC19 vector was then purified using a microcentrifuge spin column purification kit and eluted in sterile, deionized water. The annealed oligonucleotides were then ligated to the linearized vector (∼200 U ligase, ∼0.1 pmol vector, and ∼1 pmol insert incubated in 20 µL ligase buffer at 16 °C for 2 h). A 10 µL portion of the ligation mixture was combined with 100 µL of chemically competent E. coli cells (XL1-Blue strain), and the cells were transformed by heat shock (30 min at 4 °C, 60 s at 42 °C, 5 min at 4 °C), spread on CircleGrow medium agar plates supplemented with ampicillin (100 µg/mL), and incubated overnight at 37 °C. Colonies were initially screened by blue-white screening and subsequently verified by agarose gel electrophoresis of colony PCR products. The DNA sequence of putative inserts was further verified by dye terminator DNA sequencing (ABI 370 DNA sequencer). The ELP[V5-5] and ELP[V1A8G7-16] monomer genes were constructed similarly (Figure 2A).

Gene Oligomerization. One round of RDL oligomerization of the ELP[V5A2G3-10] gene is described here as an example. Additional rounds proceed identically, except that the products of previous rounds serve as the starting materials. After the ELP[V5A2G3-10] monomer gene was constructed, an ELP[V5A2G3-20] gene was synthesized by ligating the 10-mer insert into a vector containing the same 10-mer gene, as follows. The vector, containing a copy of the gene of the monomer ELP[V5A2G3-10], was linearized with PflM I (∼10× overdigestion for 6 h), enzymatically dephosphorylated with CIP, and then purified using a microcentrifuge spin column purification kit. A separate sample of vector was doubly digested with PflM I and Bgl I to excise the gene encoding ELP[V5A2G3-10]. After digestion, the reaction products were separated by agarose gel electrophoresis, and the insert was purified using a gel extraction microcentrifuge spin column kit. During purification of the insert from the agarose gel slice, vortexing or heating of the samples above 37 °C was avoided to prevent damage to the DNA. The purified insert and the linearized vector were ligated and transformed into XL1-Blue cells using the protocol described above. Transformants were initially screened by colony PCR and/or diagnostic restriction endonuclease (RE) digests and further confirmed by DNA sequencing.

Figure 1. Overview of gene oligomerization by RDL. A monomer DNA segment encoding a polypeptide sequence of interest is seamlessly self-ligated. The process is repeated, doubling the gene length with each step, until the gene of a desired length is obtained (left pathway). Other genes encoding different polypeptide sequences can be incorporated at any step (right pathway). In this example, a gene encoding a different sequence is ligated to the first sequence to produce a diblock copolymer, which in turn can be further oligomerized to create more complex block copolymers.

Expression Vector Construction. Expression vectors that are compatible with the ELP genes were constructed by modifying the DNA sequence spanning Nde I to EcoR I of pET-25b(+), a commercial T7-lac expression vector, by cassette mutagenesis to incorporate a unique Sfi I recognition site (Figure 2B). The modified pET-25b(+) expression vector was digested with Sfi I (∼10× overdigestion for 6 h), dephosphorylated, and purified using a microcentrifuge spin column kit. The ELP gene was excised from the pUC19 vector by digestion with PflM I and Bgl I, and the excised ELP gene was purified by agarose gel extraction following gel electrophoresis. The Sfi I linearized pET-25b vector and the ELP-encoding gene were ligated and transformed as described above, and plasmids isolated from the resulting transformants were screened by diagnostic RE digest and sequenced.

Expression. The expression vectors were transformed into the E. coli strain BLR(DE3) for expression. Typically, starter cultures (250 mL flasks containing 50 mL of medium supplemented with 100 µg/mL ampicillin) were inoculated with transformed cells from a fresh agar plate or from DMSO stocks stored at -80 °C, and incubated overnight at 37 °C with shaking (300 rpm). The confluent starter cultures were then centrifuged at 3000g for 15 min at 4 °C to remove β-lactamase, and resuspended in 10 mL of fresh medium. Expression cultures (4 L flasks containing 1 L of medium with 100 µg/mL ampicillin) were inoculated with 2 mL of the resuspended starter culture and incubated with shaking (∼300 rpm) at 37 °C. When the OD600 reached ∼0.8-1.0 (typically about 3.5 h postinoculation), expression was induced by the addition of IPTG to a final concentration of 1 mM. The cells were typically harvested 3 h after induction by centrifugation at 3000g for 20 min at 4 °C, resuspended in 35 mL of cold, low ionic strength buffer (typically PBS: 137 mM NaCl, 2.7 mM KCl, 4.2 mM Na2HPO4, 1.4 mM KH2PO4, pH 7.3), and lysed by sonic disruption at 4 °C (90 s of sonication at maximum power, using 10 s pulses separated by 20 s; 550 Sonic Dismembrator, Fisher Scientific, Pittsburgh, PA). The cell lysate was centrifuged at 20000g for 15 min at 4 °C to remove insoluble cellular matter. Soluble nucleic acids were precipitated by the addition of polyethylenimine (0.5% final concentration, w/v) and removed by centrifugation at 20000g for 15 min at 4 °C.

Figure 2. Gene and corresponding polypeptide sequences. (A) Monomer genes for the three ELP libraries are shown. The PflM I and Bgl I sites, which are used for RDL, are shown with recognition sequences in bold, cleavage sites indicated by arrows, and cohesive ends underlined. (B) Any ELP gene can be ligated into the Sfi I site of the expression vectors, which also encode short leader and trailer peptides.

ELP Purification. ELPs were purified by inverse transition cycling.10,17 Briefly, the ELPs were selectively aggregated by heating the cell lysate (typically 30-45 °C) and/or by adding NaCl (typically 0.5-2 M). The aggregated protein was separated from solution by centrifugation at 10000g for 15 min at 30-45 °C. The supernatant, containing soluble contaminants from the lysed E. coli cells, was decanted and discarded. The pellet containing the ELP was resolubilized in cold, low ionic strength buffer. Once fully resuspended in solution, the first round of inverse transition cycling was completed with a final centrifugation step at 15000g for 10 min at 4 °C to remove any remaining insoluble contaminants. Typically, two rounds of inverse transition cycling (warm centrifugation, pellet resuspension, and subsequent cold centrifugation) were sequentially performed to purify the ELP.

Characterization of the Expressed ELPs. The ELPs were characterized by SDS-PAGE, mass spectrometry, and UV-vis spectrophotometry. The concentration of ELP solutions was determined spectrophotometrically using the molar extinction coefficient of Trp at 280 nm (5690 M⁻¹ cm⁻¹). SDS-PAGE gels were visualized by copper staining.10,24 Matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) was performed by the Duke University Mass Spectrometry Facility in the Department of Chemistry using a PE Biosystems Voyager-DE instrument equipped with a nitrogen laser (337 nm). The MALDI-MS samples were prepared in an aqueous 50% acetonitrile solution containing 0.1% trifluoroacetic acid, using a sinapinic acid matrix.

Thermal Characterization. To characterize the ELP inverse temperature transition, the OD350 of ELP solutions (typically 25 µM ELP in PBS) was monitored as a function of temperature on a Cary 300 UV-visible spectrophotometer equipped with a multicell thermoelectric temperature controller (Varian Instruments, Walnut Creek, CA). The heating and cooling rates were 1 °C min⁻¹. The derivative of the turbidity profile with respect to temperature was numerically calculated, and the Tt was defined as the solution temperature at the maximum of the turbidity gradient. The size of ELP aggregates formed during the inverse temperature transition was characterized as a function of temperature by dynamic light scattering (DLS) using a DynaPro-LSR DLS instrument equipped with a Peltier temperature control unit (Protein Solutions, Charlottesville, VA). A 25 µM solution of ELP in PBS was centrifuged at 16000g for 10 min at 4 °C to remove air bubbles and insoluble debris, and the cold supernatant was then filtered through a 20 nm Whatman Anodisc filter. Light scattering data (15 measurements, each with a 5 s acquisition time) were collected at 1 °C intervals as the solution was heated from 35 to 60 °C. The autocorrelation function was analyzed using a regularization algorithm for spherical particles provided by the manufacturer (Dynamics software version 5.26.37).
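The Tt definition above (the temperature at the maximum of the turbidity gradient) can be illustrated with a short numerical sketch. The text does not state which differencing scheme was used, so the central-difference estimate here is an assumption, and the sigmoidal turbidity curve is synthetic data invented for the example.

```python
import math


def transition_temperature(temps, od350):
    """Return the temperature at the maximum of d(OD350)/dT.

    temps must be strictly increasing and paired with od350. The derivative
    is estimated with central differences at interior points and one-sided
    differences at the two end points (an assumed scheme, not the authors').
    """
    if len(temps) != len(od350) or len(temps) < 3:
        raise ValueError("need at least three paired measurements")
    last = len(temps) - 1
    gradient = []
    for i in range(len(temps)):
        lo, hi = max(i - 1, 0), min(i + 1, last)
        gradient.append((od350[hi] - od350[lo]) / (temps[hi] - temps[lo]))
    steepest = max(range(len(temps)), key=lambda i: gradient[i])
    return temps[steepest]


# Synthetic turbidity profile: a sharp sigmoid centred at 40 °C, sampled
# every 1 °C. These values are hypothetical and only exercise the function.
temps = list(range(30, 51))
od350 = [1.0 / (1.0 + math.exp(-(t - 40) / 0.5)) for t in temps]
print(transition_temperature(temps, od350))  # 40 for this synthetic curve
```

With real data, noise in the OD350 trace can shift the location of the maximum, so smoothing before differentiation may be worth considering.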

Results and Discussion

Overview of RDL. A schematic of RDL is shown in Figure 3. A synthetic oligonucleotide cassette encoding the monomer gene is first ligated into a cloning vector such as pUC19. The oligonucleotides are designed so that EcoR I and HinD III compatible cohesive ends are produced upon annealing, which enables the annealed product to be directly ligated into EcoR I and HinD III cleaved pUC19 (Figure 3A). The monomer gene is designed to encode a defined number of pentapeptide repeats, while incorporating two additional restriction endonuclease recognition sites, one at each end of the coding sequence, internal to the EcoR I and HinD III sites (Figure 3B). These additional sites, generically labeled RE1 and RE2 in Figure 3, are used to oligomerize the gene by RDL as follows. An insert is produced by cleaving the plasmid that harbors the monomer gene with both RE1 and RE2, and a linearized vector is produced by separately digesting another aliquot of the same plasmid with only RE1 (Figure 3C). The purified insert is ligated into the linearized vector, resulting in dimerization of the gene (Figure 3D).

The monomer gene is designed such that gene oligomerization by RDL achieves three goals. First, the insert is ligated with its directionality preserved in a head-to-tail orientation upon ligation into the vector. Second, the ligation is seamless in that extraneous residues are not introduced at the ligation site. Third, the original recognition sites for RE1 and RE2 are maintained at each end of the dimerized gene, but neither recognition site is generated at the internal site of ligation. Therefore, the oligomer assembled in any round of RDL (e.g., dimer in the first round) can be used in future rounds of RDL as the insert and/or the vector (Figure 3E). Later rounds of RDL are identical to the first round, except that products from previous rounds serve as the source of the insert and vector.

Figure 3. The molecular biology steps of RDL. (A) A synthetic monomer gene is inserted into a cloning vector. (B) The gene is designed to contain recognition sites for two different restriction endonucleases, RE1 and RE2, at each end of the coding sequence. (C) An insert is prepared by digestion of the vector with both RE1 and RE2 and subsequently ligated into the vector that has been linearized by digestion with only RE1. (D) The product contains two head-to-tail repeats of the original gene, with the RE1 and RE2 sites maintained only at the ends of the gene. (E) Additional rounds of RDL proceed identically, using products from previous rounds as starting materials.

Selection of RE1 and RE2 for Recursive Directional Ligation. RDL requires the selection of a pair of restriction endonucleases that satisfy four important requirements. First, they must have different recognition sequences so that the DNA can be selectively cleaved either by one or by both of the enzymes and so that neither site is re-formed at the internal site of ligation. Second, the two enzymes must produce complementary, single-stranded DNA overhangs upon cleavage. Third, at least one of the two sites (generically, RE1) should be unique on the cloning vector so that digestion with the enzyme cleaves the plasmid only at a single site. Finally, the recognition sequences of both REs must be compatible with the coding sequence of the polypeptide such that, upon the ligation of two gene segments, the repeat sequence of the polypeptide is not disrupted at the internal site of ligation. To fulfill these four requirements, we have chosen PflM I as RE1 and Bgl I as RE2 for our implementation of RDL for oligomerization of ELP sequences.

PflM I and Bgl I have different recognition sequences, and thus the pair meets the first requirement above. Furthermore, both enzymes have a split palindromic recognition sequence located on either side of five unconstrained bases, three of which comprise the 3′ overhang produced on cleavage (Figure 4C,D). In contrast with the more common RE cleavage patterns in which the site of cleavage is located within a palindromic recognition sequence, the unconstrained overhang sequence produced by these enzymes is useful for two reasons. First, this enables the selection of the same overhang sequence for both enzymes, which for our ELP genes was 5′-GGC-3′. Therefore, the ends created by digestion with PflM I are compatible with those created by digestion with Bgl I, thereby meeting the second requirement above. Second, these overhang bases are nonpalindromic, which forces ligation of the insert in a head-to-tail orientation with respect to the gene segment contained within the vector.
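The directionality argument can be checked with a few lines of code. This is a minimal sketch, assuming each cohesive end is represented by the sequence of its protruding strand read 5′ to 3′; the helper names are hypothetical. Two ends can anneal only if one overhang is the reverse complement of the other.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindromic(overhang):
    """True if the overhang equals its own reverse complement."""
    return overhang == reverse_complement(overhang)


def can_anneal(overhang_a, overhang_b):
    """True if two protruding strands (each read 5'->3') can base-pair."""
    return overhang_a == reverse_complement(overhang_b)


# The 5'-GGC-3' overhang on one strand pairs with 5'-GCC-3' on the other.
print(reverse_complement("GGC"))
# Correct (head-to-tail) orientation: a GGC end meets a GCC end.
print(can_anneal("GGC", "GCC"))
# Flipped insert: a GGC end would meet another GGC end and cannot pair.
print(can_anneal("GGC", "GGC"))
# For contrast, a palindromic overhang such as AATT pairs with itself,
# so it would permit insertion in either orientation.
print(is_palindromic("AATT"), is_palindromic("GGC"))
```

A side observation from this model: an overhang with an odd number of bases can never be palindromic, because its middle base would have to be its own complement.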

The pUC19 cloning vector has no PflM I sites, and therefore the single site introduced upon insertion of the monomeric gene is unique, meeting the third requirement outlined above. This allows the vector to be linearized by digestion with PflM I, to receive the insert. It is also desirable that the second site be unique, but this requirement is less stringent. For example, Bgl I has two recognition sites in pUC19 in addition to the site that is introduced into the plasmid at the 3′ end of the ELP gene. Therefore, when preparing the ELP gene insert by digestion of its vector with PflM I and Bgl I, four fragments are formed. The desired fragment containing the ELP gene can usually be separated from the other three fragments by extraction of the appropriate band following agarose gel electrophoresis. Furthermore, we selected an overhang sequence for the introduced Bgl I and PflM I sites that is incompatible with the cohesive ends produced by the two sites that are native to pUC19, and therefore the three vector-derived fragments cannot ligate into the linearized vector prepared by digestion with PflM I. Therefore, if an ELP gene insert and one of the contaminating fragments are of similar size and cannot be separated by electrophoresis, it is not critical to purify the insert from all three contaminating bands. It is necessary, however, to purify the insert from at least one of the contaminating bands to minimize background arising from religation of all four fragments of the wild-type vector.

The fourth and final requirement relates to the compatibility of these two enzymes with the gene sequence, as illustrated in Figure 4. Recognition sites for RE1 and RE2 must be present at opposite ends of the coding sequence, and the sequences must be designed such that the repetitive polypeptide sequence is not disrupted upon ligation of a RE1-cleaved end to a RE2-cleaved end. To design a gene to satisfy these requirements, a potential RE pair is checked for suitability by first combining the 5′ end of one RE’s recognition site, divided at the cleavage point, with the 3′ end of its potential partner (Figure 4C-E). To ensure that the polypeptide repeat sequence is not disrupted at the site of ligation during RDL, this paired sequence, which will be formed at the internal ligation site, is compared for compatibility along the complete length of the degenerate codon sequence that encodes the desired polypeptide repeat (Figure 4F). Next, the coding sequence is shifted, in our example from (Val-Pro-Gly-Val-Gly) to (Val-Gly-Val-Pro-Gly), so that the recognition sequences of these two REs are positioned at each end of the coding sequence (Figure 4G-H). (Note that the second half of each recognition sequence must be incorporated external to each end of the coding sequence, as illustrated in Figure 2A.) Finally, the intervening sequence between the two sites (denoted by “...” in Figure 4G-H) can be chosen to encode any desired polypeptide sequence, as long as no new RE1 or RE2 sites are introduced. In our designs, we continue the Val-Pro-Gly-Xaa-Gly motif seamlessly from the PflM I site to the Bgl I site (Figure 2A).

Figure 4. Gene design for RDL. The polypeptide sequence to be oligomerized (A) specifies a degenerate DNA coding sequence (B). Pairs of restriction endonucleases (C & D) are checked for compatibility by combining the 5′ segment of one enzyme recognition sequence with the 3′ segment of the second enzyme (E). (Recognition sequences are in bold, sites of cleavage are marked by vertical arrows, and single-stranded overhangs created by cleavage are underlined.) The degenerate coding sequence is searched for the paired recognition sequence (F), and then the coding sequence is shifted so that the cleavage sites of the two enzymes are located at opposite ends of the gene (G).
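The pairing check described in this section can be sketched in code. The recognition sequences used below (PflM I as CCANNNN^NTGG and Bgl I as GCCNNNN^NGGC, where ^ marks the top-strand cut) are the standard published specificities rather than values quoted from this text, and the two concrete site instances are hypothetical: they fix the shared overhang to 5′-GGC-3′ and set the remaining unconstrained bases to G purely for illustration. The actual gene sequences are those given in Figure 2A.

```python
# Recognition sequences, with N standing for any base. Both enzymes cut the
# top strand after the 7th base, leaving a 3-base 3' overhang (bases 5-7).
PFLM_I = "CCANNNNNTGG"
BGL_I = "GCCNNNNNGGC"
CUT = 7


def matches(site, seq):
    """True if seq is an instance of the (possibly degenerate) site."""
    return len(site) == len(seq) and all(
        s == "N" or s == b for s, b in zip(site, seq)
    )


def ligate(upstream_site, downstream_site):
    """Top strand formed when the 5' part of one cleaved site is joined
    to the 3' part of another cleaved site."""
    return upstream_site[:CUT] + downstream_site[CUT:]


# Hypothetical site instances sharing the GGC overhang at bases 5-7.
pflm_site = "CCAGGGCGTGG"
bgl_site = "GCCGGGCGGGC"
assert matches(PFLM_I, pflm_site) and matches(BGL_I, bgl_site)

# Internal junction: Bgl I-cut 3' end of one gene copy joined to the
# PflM I-cut 5' end of the next copy.
internal = ligate(bgl_site, pflm_site)
print(internal, matches(PFLM_I, internal), matches(BGL_I, internal))

# The 5' end of the oligomer: PflM I-cut vector joined to PflM I-cut insert.
print(matches(PFLM_I, ligate(pflm_site, pflm_site)))

# Reading frame across the junction, using only the three codons needed.
CODONS = {"CCG": "P", "GGC": "G", "GTG": "V"}


def translate(dna):
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


print(translate(internal[1:10]))
```

In this toy case the internal junction matches neither recognition sequence, the PflM I site is regenerated at the 5′ end, and the junction reads Pro-Gly-Val in frame, so the Val-Pro-Gly-Xaa-Gly repeat continues across the ligation site.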

In addition to meeting the four RDL requirements for a RE pair, another attractive feature of PflM I and Bgl I is the availability of a small pool of similar REs that also produce three-base, unconstrained 3′ overhangs upon cleavage, including AlwN I, Bsl I, BstAP I, Dra III, Mwo I, and Sfi I. Because the RE recognition sequence constrains the encoded residues flanking each end of the repetitive polypeptide gene, the variety of recognition sequences within this pool of enzymes provides flexibility in the design of expression vectors that are compatible with RDL-derived genes (discussed below). Furthermore, using these compatible enzymes to synthesize libraries of different polypeptide sequences allows modular combination between libraries to form more complex, multidomain repetitive polypeptides (e.g., block copolymers).

Although the existence of a suitable pair of REs for a given polypeptide repeat sequence is not guaranteed, RDL is widely applicable because of the large number of REs with different recognition sequences and cleavage patterns that are commercially available. Furthermore, type IIS REs, which cleave at a defined distance to the side of their recognition sequence, could be used to completely eliminate the sequence compatibility requirement by locating the recognition sequence outside each end of the coding region.25 However, they must still conform to the other RE requirements, and in particular they must produce a cohesive end that is compatible with at least one other RE. There are numerous type IIS REs that fulfill these requirements, and those that produce four-base, unconstrained 5′ overhangs upon cleavage would be particularly well suited to RDL, including Alw26 I, Bbs I, BbV I, BbV II, Bsa I, BsmA I, BsmB I, BsmF I, BspM I, Esp3 I, Fok I, Fin I, and SfaN I. Using type IIS REs for RDL would provide the significant advantage that expression vectors could be designed to receive the RDL-derived gene without the introduction of any extraneous residues into the coding sequence.12,26

Synthesis of ELP Genes by RDL. We used RDL to produce three ELP gene libraries. Within each library, genes encode the same repeat unit, as defined by the monomer gene (Figure 2), but differ in the number of repeats. The first library, ELP[V5], is comprised of a set of Val-Pro-Gly-Val-Gly homopolymers ranging in length up to 120 pentapeptides (∼50 kDa). The second library, ELP[V5A2G3], is a more complex copolymer that contains Val, Ala, and Gly in a 5:2:3 ratio at the fourth residue of the ELP pentapeptide repeat. This library encodes polypeptides ranging up to 330 pentapeptides (∼130 kDa) in length. The third library, ELP[V1A8G7], is a set of copolymers with a Val:Ala:Gly guest residue ratio of 1:8:7, ranging up to 320 pentapeptides (∼120 kDa). For the two copolymers, we dispersed the different guest residues throughout each sequence in order to minimize repetition at the gene level and to produce a pseudorandom distribution in the expressed ELP. Preferred E. coli codons were favored wherever possible,27 with exceptions made to reduce repetition of the nucleotide sequence. We chose different monomer gene lengths for each ELP sequence, depending on the desired incremental step size during oligomerization by RDL. For simplicity, we discuss the synthesis of ELP[V5A2G3] only, although similar results were obtained for the other two gene libraries.
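The approximate molecular weights quoted for the three libraries can be reproduced from composition alone. The sketch below uses standard average residue masses (values assumed here, not taken from the text) and counts only the Val-Pro-Gly-Xaa-Gly repeats; it ignores the short leader and trailer peptides contributed by the expression vector, so it slightly underestimates the mass of the expressed protein. The function name is hypothetical.

```python
# Average residue masses in Da (amino acid minus one water); standard values.
RESIDUE_MASS = {"G": 57.0519, "A": 71.0788, "V": 99.1326, "P": 97.1167}
WATER = 18.0153  # one water per chain for the free N- and C-termini


def elp_mass_kda(guest_counts, n):
    """Approximate mass (kDa) of n Val-Pro-Gly-Xaa-Gly pentapeptides.

    guest_counts gives the number of pentapeptides per monomer gene that
    carry each guest residue, e.g. {"V": 5, "A": 2, "G": 3}.
    """
    monomer_len = sum(guest_counts.values())
    if n % monomer_len != 0:
        raise ValueError("n must be a multiple of the monomer length")
    # Val + Pro + Gly + Gly: the four fixed positions of each pentapeptide.
    fixed = RESIDUE_MASS["V"] + RESIDUE_MASS["P"] + 2 * RESIDUE_MASS["G"]
    guests = sum(RESIDUE_MASS[aa] * k for aa, k in guest_counts.items())
    monomer_mass = monomer_len * fixed + guests
    return ((n // monomer_len) * monomer_mass + WATER) / 1000.0


for guests, n in [
    ({"V": 5}, 120),
    ({"V": 5, "A": 2, "G": 3}, 330),
    ({"V": 1, "A": 8, "G": 7}, 320),
]:
    print(guests, n, round(elp_mass_kda(guests, n), 1))
```

The three results fall close to the ∼50, ∼130, and ∼120 kDa figures given above for the largest member of each library.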

Figure 5A shows the results of DNA agarose gel electrophoresis of the genes for the entire ELP[V5A2G3] library, which was synthesized as follows. The monomer gene cassette, which encodes 10 pentapeptides, was constructed by annealing complementary, chemically synthesized oligonucleotides and then ligated into pUC19 to yield pUC19-ELP[V5A2G3-10]. In the first round of RDL, the monomeric ELP insert was prepared by digesting the pUC19-ELP[V5A2G3-10] vector with PflM I and Bgl I and purifying the insert after agarose gel electrophoresis. An ELP vector fragment was prepared to receive the insert by linearizing another aliquot of the same pUC19-ELP[V5A2G3-10] plasmid with only PflM I. The insert encoding the monomer gene was then ligated into the vector. Transformants yielded both pUC19-ELP[V5A2G3-20], the anticipated product, and also pUC19-ELP[V5A2G3-30]. The latter construct was the product of a trimolecular ligation, created when two inserts were ligated into the linearized vector. Multiple inserts obtained in a single round of RDL are identical to the product obtained after sequential steps of RDL with single inserts, and for both cases, the ligation is seamless and the inserts are joined in a head-to-tail orientation. We have observed that for small inserts of less than 500 bp, double inserts are obtained in a small but significant fraction of transformants. This is useful because it reduces the number of RDL cycles required to build a larger library.

Figure 5. ELP library produced by RDL. (A) Agarose gel (1.2%) electrophoresis of ELP[V5A2G3] genes visualized by ethidium bromide staining. The left lane contains a size standard, which is labeled in bp. Plasmids containing the ELP genes were digested with EcoR I and HinD III, producing a vector fragment (2635 bp) and an ELP gene fragment. The expected size of each ELP gene is labeled in bp on the right. The number of pentapeptide repeats encoded by each gene is labeled below each lane. (B) SDS-PAGE (4-20% gradient) visualized by copper staining of purified ELPs expressed from the genes in (A). The left lane contains a molecular weight standard, which is labeled in kilodaltons. The expected molecular weight of each ELP is indicated on the right. The length in pentapeptides of each ELP is labeled below each lane.

After the initial round of RDL, the ELP[V5A2G3] library was expanded by recursive combination of the newly synthesized genes to produce larger genes. In the second round of RDL, an insert encoding 30 pentapeptides was prepared from pUC19-ELP[V5A2G3-30] by digestion with PflM I and Bgl I and was then ligated into PflM I linearized pUC19-ELP[V5A2G3-30] vector to yield the gene for 60 pentapeptide repeats. The 30 pentapeptide gene insert from the second round of RDL was then ligated into the pUC19-ELP[V5A2G3-60] vector to form a gene encoding 90 pentapeptides. Next, the genes encoding 120, 150, and 180 pentapeptides were produced in parallel by ligating inserts encoding 30, 60, and 90 pentapeptides into the pUC19-ELP[V5A2G3-90] vector, respectively. Finally, genes encoding 240 and 330 pentapeptides were created by ligating the genes for ELP[V5A2G3-60] and ELP[V5A2G3-150] as inserts into the pUC19-ELP[V5A2G3-180] vector.
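The sequence of ligations described above can be replayed as simple bookkeeping. The sketch below is illustrative only: the names `RDL_STEPS` and `build_library` are hypothetical, lengths are counted in pentapeptides, and the coding length assumes 15 bp per pentapeptide (five codons), which matches the 150 bp monomer gene that encodes 10 pentapeptides. Flanking vector sequence released with the gene on a gel is not included.

```python
BP_PER_PENTAPEPTIDE = 15  # 5 codons x 3 nt; the 10-pentapeptide monomer is 150 bp

# Each step is (list of insert lengths, vector length), in pentapeptides.
RDL_STEPS = [
    ([10], 10),       # first round, single insert      -> 20
    ([10, 10], 10),   # first round, double insert      -> 30
    ([30], 30),       # second round                    -> 60
    ([30], 60),       #                                 -> 90
    ([30], 90),       # three parallel ligations        -> 120
    ([60], 90),       #                                 -> 150
    ([90], 90),       #                                 -> 180
    ([60], 180),      # final ligations                 -> 240
    ([150], 180),     #                                 -> 330
]


def build_library(monomer, steps):
    """Replay RDL steps, checking that every parent gene already exists."""
    library = {monomer}
    for inserts, vector in steps:
        missing = [g for g in (*inserts, vector) if g not in library]
        if missing:
            raise ValueError(f"gene(s) {missing} not yet in the library")
        library.add(vector + sum(inserts))
    return sorted(library)


for length in build_library(10, RDL_STEPS):
    print(f"{length:3d} pentapeptides = {length * BP_PER_PENTAPEPTIDE} bp coding length")
```

Every ligation draws only on genes that already exist in the library, which is the recursive property that gives RDL its name.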

Each RDL step requires minimal screening of transformants, and only 5 to 10 colonies are screened for each round. The percentage of positive clones in each RDL step ranges from ∼30 to 80% and appears to be independent of insert size. Additionally, ∼10 to 20% of screened colonies yield double inserts for inserts of ∼500 bp or less, although we have not observed double insertions for ELP genes larger than 500 bp. Therefore, when combining two genes of unequal size by RDL, it is desirable to use the smaller gene as the insert because of the possibility of creating tandem repeats in one round of RDL. To achieve these overall ligation efficiencies of up to 80%, however, it is critical to minimize WT background by ensuring that both digestion of the vector with RE1 (e.g., PflM I for the ELP genes) and its subsequent dephosphorylation are complete. The WT background could be eliminated completely using a third RE that cleaves the vector between its antibiotic resistance gene and the plasmid's origin of replication, as described by Rosenfeld and Kelly.28 In this case, one fragment is prepared by digestion with RE1 and the third RE, and the second fragment is prepared by digestion with RE2 and the third RE. After purification of the two fragments and subsequent ligation, only the desired product produces a viable plasmid. This approach, however, would eliminate the possibility of obtaining multiple inserts in a single round of RDL.
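The screening figures above explain why 5 to 10 colonies suffice per round. As a rough sketch, and assuming that colonies are independent and that each is positive with the same probability (an assumption made here, not a claim of the original text), the chance of finding at least one positive clone among n colonies is 1 - (1 - p)^n. The function name is hypothetical.

```python
def p_at_least_one(p_positive, n_colonies):
    """Chance that screening n independent colonies finds >= 1 positive clone."""
    return 1.0 - (1.0 - p_positive) ** n_colonies


# Bounds quoted in the text: 30 to 80% positive clones, 5 to 10 colonies screened.
for p in (0.3, 0.8):
    for n in (5, 10):
        print(f"p = {p:.1f}, n = {n:2d}: {p_at_least_one(p, n):.3f}")
```

Even at the low end of the reported range (30% positives), screening five colonies would be expected to succeed in more than four rounds out of five under this simple model.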

Expression of the ELP Gene Libraries. To express an oligomeric gene constructed by RDL, it is first excised from the cloning vector and ligated into an expression vector. The expression vector is designed to include a single, unique RE recognition site that provides RE1- and RE2-compatible cohesive ends upon cleavage. This site must allow ligation of the RDL gene insert in frame with the reading frame of the expression vector. The expression vector also encodes other amino acids adjacent to the insertion site of the repetitive polypeptide gene. Minimal leader and trailer sequences might encode only the initiation and termination codons, respectively, while others can include codons for specific amino acids such as Lys or Cys to enable site-specific chemical conjugation with the expressed polypeptide. Even more complex leading or trailing sequences can encode entire proteins, thereby producing fusion proteins.17,29 Transfer of the gene from the cloning vector to the expression vector is achieved using methods that are identical to gene oligomerization by RDL: an insert encoding the gene of interest, which is prepared by digestion of the cloning vector with RE1 and RE2, is ligated into the expression vector that has been linearized by single digestion with the insertion RE, which cleaves to produce cohesive ends that are compatible with those produced by RE1 and RE2.

We have selected Sfi I as the insertion RE for all of our ELP expression vectors. The pET25b(+) T7 lac expression vector was therefore modified by replacement of the region between Nde I and EcoR I in the parent vector by oligonucleotide cassette mutagenesis to introduce a unique Sfi I site (Figure 2B). The Sfi I site was designed to be compatible with the RDL inserts, produced by digestion of the pUC19-based cloning vectors with PflM I and Bgl I, by selecting the single stranded overhang produced by cleavage with Sfi I to be 5′-GGC-3′. When read in-frame with the ELP inserts, the Sfi I recognition sequence was designed to encode Gly-Pro immediately prior to the insertion site and Trp-Pro immediately following it. The Trp residue was chosen to allow measurements of the ELP concentration by UV-visible spectrophotometry. The remaining three residues were selected because they match well with the pentapeptide repeat of ELPs and are therefore unlikely to significantly affect the inverse temperature transition.

We have constructed a number of different expression vectors, which can be divided into two categories: vectors for expression of “free” ELPs and vectors that enable expression of ELPs fused to other proteins.17,29 The vectors designed for expression of free ELPs typically encode two or three additional residues at the N- and C-termini of the ELP (Figure 2B). These residues are useful because they enable unique reactive side chains to be incorporated into the polypeptide for site-specific chemical conjugation, cross-linking, or surface immobilization of the ELP. These residues include Lys for conjugation with amine-reactive agents, Cys for sulfhydryl reactive agents, and N-terminal Ser or Thr (after in vivo processing of the initiating Met30) for reaction with hydrazide linkers after periodate oxidation.31 These expression vectors are generic in that they can be used to express any ELP sequence previously produced by RDL.

Characterization of ELPs. Figure 5B shows SDS-PAGE results for the ELP[V5A2G3] library. When compared to a commercial MW marker, the ELPs consistently migrate as ∼20% larger than expected, a trend that has been previously observed by McPherson et al.10 However, the migration of the ELPs was consistent with the expected MW differences with respect to each other. Similar SDS-PAGE results were obtained for the other two libraries. To further verify the MWs of the ELPs, the following members of each ELP library were characterized by MALDI-MS: ELP[V5-120], ELP[V5A2G3-180], and ELP[V1A8G7-160]. The MW of each polypeptide as determined by MALDI-MS was within 0.4% of the calculated value.

Polymers of Controlled Molecular Weight and Sequence Biomacromolecules, Vol. 3, No. 2, 2002 363

The thermal behavior of each member of the three ELP libraries was studied by measuring solution turbidity as a function of temperature. When a solution is heated to its Tt, it rapidly becomes cloudy due to ELP aggregation. We defined the Tt as the temperature at which the increase in turbidity was most rapid. Figure 6 shows the Tt values for each library as a function of ELP size. ELPs smaller than those shown (i.e., ELP[V5-15], ELP[V5A2G3-30], and ELP[V1A8G7-96]) were also studied by temperature-dependent turbidimetry but did not exhibit a transition below 90 °C, the highest temperature that was experimentally accessible. These constructs, however, do exhibit a thermal transition at lower temperatures when NaCl is added to the solution to depress their Tt values, showing that these ELPs with high Tt values were thermally responsive (data not shown).
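The definition of the Tt as the temperature of the most rapid increase in turbidity amounts to locating the maximum of the turbidity gradient. The following is a minimal sketch of that calculation, assuming evenly or unevenly spaced paired readings and a central-difference estimate of the slope; the function name is hypothetical, and the profile used to exercise it is synthetic, not data from this study.

```python
import math


def transition_temperature(temps_c, od350):
    """Return the temperature at which d(OD)/dT is largest.

    Uses a central difference at interior points, so the two endpoints are
    never returned. temps_c must be strictly increasing.
    """
    if len(temps_c) != len(od350) or len(temps_c) < 3:
        raise ValueError("need at least three paired readings")
    best_t, best_slope = None, float("-inf")
    for i in range(1, len(temps_c) - 1):
        slope = (od350[i + 1] - od350[i - 1]) / (temps_c[i + 1] - temps_c[i - 1])
        if slope > best_slope:
            best_t, best_slope = temps_c[i], slope
    return best_t


# Synthetic sigmoidal turbidity profile centered at 40 C (illustration only).
temps = [30 + 0.5 * i for i in range(41)]  # 30.0 to 50.0 C in 0.5 C steps
od = [2.0 / (1.0 + math.exp(-(t - 40.0) / 0.4)) for t in temps]
print(transition_temperature(temps, od))
```

For noisy experimental traces, some smoothing before differentiation would likely be needed; that step is omitted here.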

The data in Figure 6 show that the Tt is a function of two macromolecular parameters, both of which can be precisely controlled at the gene level through RDL. First, the Tt increases with decreasing ELP MW. This result qualitatively parallels that of Urry et al., who studied the inverse temperature transition of chemically synthesized ELPs at high concentration (∼40 mg/mL).22 However, the logarithmic relationship between Tt and MW that they observed is not reproduced for the ELPs in this study, which were studied at the significantly lower concentration of 25 µM (∼0.25 to 3.25 mg/mL). Second, for a given MW, decreasing the hydrophobicity of the guest residues increases the Tt, which is also consistent with previous results of Urry et al.21 On the basis of these data, it is now possible to design an ELP of specified sequence and MW to exhibit a desired Tt. Because the thermal properties of ELPs depend on their sequence and chain length, the synthesis of ELPs by RDL provides an excellent example of the utility of this synthesis methodology.
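Because all ELPs were studied at a fixed molar concentration, the mass concentration scales with MW. As a small worked conversion (the function name is hypothetical), 25 µM corresponds to ∼0.25 mg/mL for a 10 kDa polypeptide and ∼3.25 mg/mL for a 130 kDa polypeptide, which brackets the range quoted above.

```python
def micromolar_to_mg_per_ml(conc_um, mw_kda):
    """Convert a molar concentration to mg/mL.

    umol/L x kg/mol = mg/L, so dividing by 1000 gives mg/mL.
    """
    return conc_um * mw_kda / 1000.0


for mw in (10, 50, 130):  # kDa, illustrative values spanning the libraries
    print(f"{mw:3d} kDa at 25 uM = {micromolar_to_mg_per_ml(25, mw):.2f} mg/mL")
```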

ELP Block Copolymer. The simple and facile synthesis of block copolymers is another example of the utility of RDL, which provides exquisite control both over the architecture of the block copolymer and over the repeat sequence within individual blocks. To demonstrate this application of RDL, we synthesized and expressed a gene encoding an AB diblock copolymer, in which ELP[V1A8G7-64] is followed seamlessly by ELP[V5-60]. This was achieved in one cycle of RDL using the PflM I/Bgl I-digested ELP[V5-60] gene as the insert and pUC19-ELP[V1A8G7-64] as the PflM I-linearized vector. Both of these genes had been previously generated during synthesis of the ELP libraries.

We chose these two blocks because ELP[V5-60] has a Tt of 35 °C and ELP[V1A8G7-64] has a Tt > 90 °C, and we hypothesized that a copolymer of these two blocks would form a nanoparticle in solution as a function of temperature, driven by the disparity in the Tt of each block. If the two blocks exhibited independent transition behavior, the ELP[V5-60] block should hydrophobically collapse and aggregate at temperatures above its Tt while the ELP[V1A8G7-64] block should remain hydrophilic and solvated, leading to the formation of a micellar structure.32,33

We selected ELP[V5A2G3-120], a pseudorandom analogue, as a control sequence for comparison to the ELP[V1A8G7-64]-ELP[V5-60] block copolymer. The block copolymer is 124 pentapeptides in length and its guest residues are composed of 51.6% Val, 25.8% Ala, and 22.6% Gly, with the N-terminal block primarily composed of Ala and Gly guest residues followed by a block containing exclusively Val guest residues. Similarly, the pseudorandom ELP[V5A2G3-120] is 120 pentapeptides in length and its guest residues are composed of 50% Val, 20% Ala, and 30% Gly. In contrast to the block copolymer, however, the guest residues are dispersed evenly throughout the polymer chain. Thus, comparison of the block and pseudorandom copolymer was expected to provide insight into the effect of the distribution of guest residues within an ELP sequence on its thermal behavior.
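The guest residue compositions quoted above follow directly from the block lengths and guest residue ratios. The sketch below reproduces that arithmetic; the function names are hypothetical, and the only inputs are the ratios and lengths stated in the text.

```python
def guest_counts(ratio, n_pentapeptides):
    """Number of each guest residue in an ELP with the given ratio and length."""
    total = sum(ratio.values())
    if n_pentapeptides % total:
        raise ValueError("length must be a multiple of the ratio sum")
    return {aa: k * n_pentapeptides // total for aa, k in ratio.items()}


def composition_percent(*blocks):
    """Pooled guest-residue composition (%) of one or more (ratio, length) blocks."""
    pooled = {}
    for ratio, n in blocks:
        for aa, count in guest_counts(ratio, n).items():
            pooled[aa] = pooled.get(aa, 0) + count
    total = sum(pooled.values())
    return {aa: round(100.0 * c / total, 1) for aa, c in pooled.items()}


block = composition_percent(({"V": 1, "A": 8, "G": 7}, 64), ({"V": 5}, 60))
control = composition_percent(({"V": 5, "A": 2, "G": 3}, 120))
print(block)    # guest residues of ELP[V1A8G7-64]-ELP[V5-60]
print(control)  # guest residues of ELP[V5A2G3-120]
```

The ELP[V1A8G7-64] block contributes 4 Val, 32 Ala, and 28 Gly guest residues and the ELP[V5-60] block contributes 60 Val, giving 64:32:28 over 124 pentapeptides, which is the 51.6:25.8:22.6 composition stated above.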

The block and pseudorandom copolymers were each studied by measuring solution turbidity and by DLS as a function of temperature (Figures 7 and 8). When a solution of the pseudorandom ELP[V5A2G3-120] copolymer is heated, reaching the Tt of 44.6 °C triggers a stepwise increase in turbidity (Figure 7). DLS indicates that this increase in turbidity results from the conversion of soluble ELP monomer with a hydrodynamic radius (Rh) of 4.8 ± 1.1 nm (mean Rh ± polydispersity) to aggregates with an Rh of 1.2 ± 0.26 µm (Figure 8A). We term this transition from soluble monomer to micrometer-size aggregates the “bulk transition” because it results in the sudden and dramatic formation of aggregates over a narrow temperature range without formation of particles of intermediate size.

A solution of the ELP[V1A8G7-64]-ELP[V5-60] block copolymer displays a bulk transition at 50.8 °C, also leading to the formation of micrometer size aggregates. The disparity between the bulk Tt values of the pseudorandom copolymer (Tt = 44.6 °C) and the block copolymer suggests that the bulk aggregation behavior of an ELP copolymer is sensitive to the distribution of the guest residues in the polymer chain. The bulk Tt at 50.8 °C of the block copolymer is also significantly different from that of each free block. The bulk transition of the block copolymer occurs well below the Tt of ELP[V1A8G7-64], which is >90 °C, indicating that fusion to the more hydrophobic ELP[V5-60] reduces the Tt of the ELP[V1A8G7-64] segment. Conversely, the first detectable change in turbidity of the block copolymer (∼40 °C, as discussed below) is above the bulk transition at 34.8 °C of the ELP[V5-60] homopolymer, which comprises the hydrophobic, low Tt segment of the block copolymer. This shows that fusion of ELP[V5-60] to the more hydrophilic ELP[V1A8G7-64] sequence in the block copolymer significantly increases the Tt of the ELP[V5-60] segment.

Figure 6. ELP Tt as a function of sequence and chain length. The Tt was determined by temperature-dependent turbidity measurements for each member of three ELP libraries: ELP[V5] (□), ELP[V5A2G3] (○), and ELP[V1A8G7] (△). The ELP concentration was 25 µM in PBS. The Tt was defined as the solution temperature at the maximum of the turbidity gradient obtained while heating the solution at a rate of 1 °C min⁻¹. These data show that the exquisite control over both composition and chain length provided by RDL is critical for the design of ELPs with precisely specified thermal properties.

In contrast to the pseudorandom copolymer, which simply displays a single transition from monomer to micrometer aggregates, the ELP block copolymer exhibits temperature-dependent mesoscale self-assembly. In addition to the bulk transition at 50.8 °C, the turbidity profile of the block copolymer also exhibits two inflection points at lower temperatures that are not observed for the ELP[V5-60] homopolymer or the pseudorandom ELP[V5A2G3-120] copolymer. The first inflection point in the turbidity profile of the block copolymer is observed at 40.0 °C, and between 40 and 47.5 °C, a linear increase in turbidity is observed, with a slope that is significantly different from baseline. The DLS results for ELP[V1A8G7-64]-ELP[V5-60] show that the ELP monomer (4.4 ± 1.6 nm) is the sole species in solution at temperatures below 40 °C (Figure 8B). As the temperature is increased, a new, larger particle with an Rh of 20.4 ± 5.8 nm is observed at 40 °C that persists up to 47 °C. At 47.5 °C, a second inflection point in the turbidity versus temperature profile of the block copolymer is observed, and the DLS data show a discontinuous jump in the Rh of the particles from 20.4 ± 5.8 nm to 54.5 ± 20.3 nm. These particles are stable until the bulk transition occurs at 50.8 °C, above which larger aggregates with an Rh of 1.4 ± 0.35 µm are formed. When the solution is cooled, the turbidity profile closely overlays the heating curve, showing that all three inflections at 40.0, 47.5, and 50.8 °C are due to fully reversible transitions. Note that the presence of smaller particles can be masked in the DLS data by scattering from larger particles, even though several species may coexist at a given temperature.

The turbidity and DLS results for the ELP[V1A8G7-64]-ELP[V5-60] copolymer suggest that the two blocks undergo sequential and independent transitions at different temperatures. We hypothesize that upon increasing the temperature to 40.0 °C, the solvated ELP[V5-60] block undergoes an inverse temperature transition, resulting in desolvation and hydrophobic collapse of this segment of the polymer chain. Driven by hydrophobic interactions between the collapsed ELP[V5-60] segments, molecules of the block copolymer then self-assemble to form nanoparticles with an Rh of 20 nm that are composed of a hydrophobic core of collapsed ELP[V5-60] surrounded by a hydrophilic shell of solvated ELP[V1A8G7-64] segments. The initial appearance of these micelle-like nanoparticles causes a slight increase in turbidity. The linear increase in solution turbidity observed as the temperature is increased between 40.0 and 47.5 °C is likely caused by an increase in the concentration of nanoparticles at the expense of monomeric ELP molecules. These nanoparticles, which have a constant diameter of ∼40 nm over this temperature range, undergo a rearrangement at 47.5 °C to form nanoparticles that have an apparent diameter of ∼110 nm. The mechanism of the transition leading to an increase in the diameter of the nanoparticles from ∼40 to ∼110 nm is not known at this time. Finally, at 50.8 °C, the solvent-exposed ELP[V1A8G7-64] block undergoes its inverse temperature transition, which drives the aggregation of the ∼110 nm diameter nanoparticles to form micrometer-sized aggregates.

Figure 7. Solution turbidity of ELP block and pseudorandom copolymers as a function of temperature. OD350 as a function of temperature for solutions of three ELPs: ELP[V5-60], ELP[V5A2G3-120], and the ELP[V1A8G7-64]-[V5-60] block copolymer. The turbidity profiles were obtained for 25 µM ELP concentration in PBS, while heating at a rate of 1 °C min⁻¹ (solid lines). All solutions cleared fully upon cooling; however, for clarity, a cooling profile is shown only for ELP[V1A8G7-64]-[V5-60] (dashed line). The ELP[V1A8G7-64]-[V5-60] block copolymer exhibited a complex turbidity profile, with three inflection points observed over a range of 10 °C.

Figure 8. Particle size as a function of temperature for the (A) pseudorandom and (B) block ELP copolymers. The hydrodynamic radius of each ELP was measured by dynamic light scattering as a function of temperature (mean ± polydispersity of the particle size distribution). The corresponding turbidity profiles from Figure 7 are replotted for comparison with the DLS results (solid lines). These results suggest that the block copolymer forms a micelle-like structure at temperatures intermediate between the initial collapse of the more hydrophobic ELP segment at 40.0 °C and the collapse and subsequent aggregation of the less hydrophobic segment at 50.8 °C.

Advantages of RDL. RDL is a facile method to rapidly generate a library of polymers with systematically varied MWs. This approach is flexible in that the insert and vector for a given round of RDL can be prepared from the same plasmid or from different plasmids. When the insert and linearized vector are prepared from the same plasmid, each round of RDL doubles the size of the gene. Repeated dimerization of a gene is the most rapid method to generate a large, repetitive gene. Each round of RDL can be completed in 2 days, and gene libraries encoding large polypeptides can be generated in just a few weeks. For example, a 6400 bp gene composed of 64 repeats of a 100 bp monomer can be constructed in six sequential rounds of RDL. In practice, large genes can be produced in even fewer rounds because multiple inserts are often obtained in the earlier rounds of RDL.
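The doubling example above can be written out explicitly. This is a minimal sketch with a hypothetical function name; it assumes one single-insert self-ligation per round and the 2 days per round stated in the text, and it does not account for the multiple insertions that can shorten the schedule in practice.

```python
def doubling_schedule(monomer_bp, n_repeats, days_per_round=2):
    """Rounds, gene sizes after each round, and elapsed days for pure doubling."""
    if n_repeats < 1 or n_repeats & (n_repeats - 1):
        raise ValueError("pure doubling only reaches powers of two")
    rounds = n_repeats.bit_length() - 1  # log2 of a power of two
    sizes = [monomer_bp * 2 ** r for r in range(rounds + 1)]
    return rounds, sizes, rounds * days_per_round


rounds, sizes, days = doubling_schedule(100, 64)
print(rounds, sizes, days)
```

For the 100 bp monomer, the intermediate genes are 200, 400, 800, 1600, and 3200 bp, and each of them remains available as a member of the library.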

RDL is not, however, restricted solely to the dimerization of a gene. It is a modular and flexible synthesis technique that allows any two gene sequences to be seamlessly combined in a defined orientation, requiring only that each sequence is designed to incorporate compatible RE recognition sites at their 5′ and 3′ termini. Ligation of the two gene sequences results in a longer sequence that can itself be used in subsequent rounds of RDL. For example, instead of dimerization, DNA oligomers of different lengths can be joined in a round of RDL (e.g., a trimer with a tetramer to yield a heptamer), which enables the construction of a gene of any length in increments of the monomer gene size. Because RDL is a modular synthesis methodology, two genes encoding different sequences can also be combined to form a larger, more complex sequence that can then be oligomerized in subsequent rounds of RDL (e.g., as illustrated by the right pathway in Figure 1). Alternatively, larger genes with different sequences previously constructed by RDL can be combined to produce block copolymers, as described for ELP[V1A8G7-64]-ELP[V5-60].
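One simple way to plan a gene of arbitrary length is a double-and-add scheme: dimerize the current gene, and add a single monomer whenever the binary representation of the target calls for it. The sketch below is one possible planning strategy offered for illustration, not the route used in this work and not necessarily the shortest; the function name is hypothetical, and each tuple is (insert, vector, product) in units of the monomer gene.

```python
def rdl_plan(n_repeats):
    """List (insert, vector, product) ligations that reach n_repeats monomers."""
    if n_repeats < 1:
        raise ValueError("target must be at least one monomer")
    plan, current = [], 1
    for bit in bin(n_repeats)[3:]:  # binary digits after the leading 1
        plan.append((current, current, 2 * current))  # dimerize the current gene
        current *= 2
        if bit == "1":
            plan.append((1, current, current + 1))  # ligate in one monomer
            current += 1
    return plan


for target in (7, 33):
    steps = rdl_plan(target)
    print(target, "repeats in", len(steps), "rounds:", steps)
```

Every insert and vector in the plan is either the monomer or the product of the preceding step, so each ligation uses only genes already in hand. A heptamer takes four rounds by this scheme, and a 33-mer takes six.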

Compared to previously reported methods for the assembly of synthetic genes encoding repetitive polypeptides, RDL has a number of unique features that account for its flexibility and precision. First, RDL enables a desired MW to be rationally and precisely targeted during synthesis. Second, each round of RDL yields identical DNA oligomers that do not have to be fractionated or screened (with the exception of a low fraction of multimers in early rounds of RDL with shorter gene segments, as described above). Third, the process of assembling a large gene by RDL necessarily creates a library of potentially useful smaller genes ranging in size from the monomer to the target gene. Finally, the procedure is modular in that monomers or oligomers encoding different peptide or protein repeats can be combined at any step to further generate diversity at the sequence level.

RDL is also useful in the synthesis of very large repetitive genes that are several thousand nucleotides in length or greater. We have constructed genes up to 4950 bp in length (33 repeats of a 150 bp monomer gene), encoding an ELP with a MW of ∼130 kDa. Although this is the largest gene we have attempted to synthesize to date, this does not represent an upper limit. This is because, regardless of the gene size, an RDL step requires the ligation of only two DNA fragments. In contrast, producing large genes by a concatemerization method is complicated by the likelihood of circularization of larger oligomers. The resulting closed-circle oligomer cannot be subsequently ligated into a vector. Methods to ameliorate this problem have been proposed, including the use of chain-terminating capping sequences,10 sequential concatemerization,25 and ligation under fluid shear;34 however, reports of the successful synthesis of very large, repetitive genes by concatemerization are nonetheless limited.

The modular nature of RDL provides a convenient and powerful method to vary the physicochemical properties of block copolymers, which are of great interest for drug and gene delivery.35,36 For example, RDL should prove useful in precisely engineering the properties of ELP block copolymers to enable thermally triggered nanoparticle formation and to control the nanoparticle size and its drug loading capacity. These parameters can be systematically varied by selecting ELP blocks of different lengths and sequences, and hence Tt values. The loading of drugs could be maximized by optimizing noncovalent interactions between the hydrophobic segment of the block copolymer and the drug. Sites for chemical conjugation of the drug could also be engineered into either the low or high Tt block. Similarly, DNA for gene delivery could be bound within the ELP nanoparticle by polycationic blocks. Altering the particle size by adjusting the relative lengths of each ELP block will enable control of the pharmacokinetics of systemically injected, drug-loaded nanoparticles. RDL should also enable the presentation of affinity targeting peptides or proteins at the termini of the solvent-exposed hydrophilic segments to enable receptor-mediated targeting of the nanoparticle to physiological targets of interest. Furthermore, mixing of block copolymers containing a targeting peptide (or peptides) with polymers lacking a targeting sequence would enable the number of targeting moieties per nanoparticle to be precisely specified and would thereby enable polyvalency effects to be exploited in targeted drug delivery.37

More broadly, RDL is a generic synthesis approach that is not restricted to ELPs and can be used to oligomerize many other genes of interest. For example, RDL can be used to systematically vary the location and density of cell attachment sequences or cross-linking sites in repetitive polypeptides designed as tissue engineering scaffolds. Similarly, oligomers of peptide pharmaceuticals can be produced by RDL in order to increase the avidity of drugs for their targets. RDL is also useful for applications in which DNA oligomers are the end product, such as the synthesis of multiple copies of a therapeutic gene for gene delivery or antisense therapy. In conclusion, RDL is a useful strategy to generate oligomeric genes and polypeptides for diverse applications in medicine and biotechnology.

Acknowledgment. This work was supported by grants from the Whitaker Foundation and the National Institutes of Health (R21-GM-057373 and R01-GM-61232). We also thank the Whitaker Foundation for support of D.E.M. as a graduate fellow.

References and Notes

(1) Cappello, J. Trends Biotechnol. 1990, 8, 309-311.
(2) McGrath, K. P.; Tirrell, D. A.; Kawai, M.; Mason, T. L.; Fournier, M. J. Biotechnol. Prog. 1990, 6, 188-192.
(3) Barron, A. E.; Zuckermann, R. N. Curr. Opin. Chem. Biol. 1999, 3, 681-687.
(4) Nagarsekar, A.; Ghandehari, H. J. Drug Target. 1999, 7, 11-32.
(5) McPherson, D. T.; Morrow, C.; Minehan, D. S.; Wu, J.; Hunter, E.; Urry, D. W. Biotechnol. Prog. 1992, 8, 347-352.
(6) Feeney, K. A.; Tatham, A. S.; Gilbert, S. M.; Fido, R. J.; Halford, N. G.; Shewry, P. R. Biochim. Biophys. Acta 2001, 1546, 346-355.
(7) Kostal, J.; Mulchandani, A.; Chen, W. Macromolecules 2001, 34, 2257-2261.
(8) Cappello, J.; Crissman, J.; Dorman, M.; Mikolajczak, M.; Textor, G.; Marquet, M.; Ferrari, F. Biotechnol. Prog. 1990, 6, 198-202.
(9) Creel, H. S.; Fournier, M. J.; Mason, T. L.; Tirrell, D. A. Macromolecules 1991, 24, 1213-1214.
(10) McPherson, D. T.; Xu, J.; Urry, D. W. Protein Expr. Purif. 1996, 7, 51-57.
(11) Fukushima, Y. Biopolymers 1998, 45, 269-279.
(12) McMillan, R. A.; Lee, T. A. T.; Conticello, V. P. Macromolecules 1999, 32, 3643-3648.
(13) Urry, D. W. Prog. Biophys. Mol. Biol. 1992, 57, 23-57.
(14) Urry, D. W. J. Phys. Chem. B 1997, 101, 11007-11028.
(15) Cappello, J.; Crissman, J. W.; Crissman, M.; Ferrari, F. A.; Textor, G.; Wallis, O.; Whitledge, J. R.; Zhou, X.; Burman, D.; Aukerman, L.; Stedronsky, E. R. J. Controlled Release 1998, 53, 105-117.
(16) Urry, D. W.; Pattanaik, A.; Xu, J.; Woods, T. C.; McPherson, D. T.; Parker, T. M. J. Biomater. Sci. Polym. Ed. 1998, 9, 1015-1048.
(17) Meyer, D. E.; Chilkoti, A. Nat. Biotechnol. 1999, 17, 1112-1115.
(18) Urry, D. W. Trends Biotechnol. 1999, 17, 249-257.
(19) Welsh, E. R.; Tirrell, D. A. Biomacromolecules 2000, 1, 23-30.
(20) Meyer, D. E.; Kong, G. A.; Dewhirst, M. W.; Zalutsky, M. R.; Chilkoti, A. Cancer Res. 2001, 61, 1548-1554.
(21) Urry, D. W.; Luan, C.-H.; Parker, T. M.; Gowda, D. C.; Prasad, K. U.; Reid, M. C.; Safavy, A. J. Am. Chem. Soc. 1991, 113, 4346-4348.
(22) Urry, D. W.; Trapane, T. L.; Prasad, K. U. Biopolymers 1985, 24, 2345-2356.
(23) Ausubel, F. M.; Brent, R.; Kingston, R. E.; Moore, D. H.; Seidman, J. G.; Smith, J. A.; Struhl, K. Current Protocols in Molecular Biology; John Wiley: New York, 1995.
(24) Lee, C.; Levin, A.; Branton, D. Anal. Biochem. 1987, 166, 308-312.
(25) Lee, J. H.; Skowron, P. M.; Rutkowska, S. M.; Hone, S. S.; Kim, S. C. Genet. Anal. 1996.
(26) Padgett, K. A.; Sorge, J. A. Gene 1996, 168, 31-35.
(27) Aota, S.-i.; Gojobori, T.; Ishibashi, F.; Maruyama, T.; Ikemura, T. Nucleic Acids Res. 1988, 16, Suppl., r315-402.
(28) Rosenfeld, P. J.; Kelly, T. J. J. Biol. Chem. 1986, 261, 1398-1408.
(29) Meyer, D. E.; Trabbic-Carlson, K.; Chilkoti, A. Biotechnol. Prog. 2001, 17, 720-728.
(30) Hirel, P. H.; Schmitter, J. M.; Dessen, P.; Fayat, G.; Blanquet, S. Proc. Natl. Acad. Sci. U.S.A. 1989, 86, 8247-8251.
(31) Geoghegan, K. F.; Stroh, J. G. Bioconjugate Chem. 1992, 3, 138-146.
(32) Chung, J. E.; Yokoyama, M.; Aoyagi, T.; Sakurai, Y.; Okano, T. J. Controlled Release 1998, 53, 119-130.
(33) Lee, T. A. T.; Cooper, A.; Apkarian, R. P.; Conticello, V. P. Adv. Mater. 2000, 12, 1105-1110.
(34) Haber, C.; Wirtz, D. Biophys. J. 2000, 79, 1530-1536.
(35) Kataoka, K.; Harada, A.; Nagasaki, Y. Adv. Drug Delivery Rev. 2001, 47, 113-131.
(36) Jones, M.; Leroux, J. Eur. J. Pharm. Biopharm. 1999, 48, 101-111.
(37) Mammen, M.; Choi, S.-K.; Whitesides, G. M. Angew. Chem., Int. Ed. 1998, 37, 2454-2794.

BM015630N
