University of Groningen Library design and screening strategies for efficient enzyme evolution van Leeuwen, Johannes Gustaaf Ernst IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below. Document Version Publisher's PDF, also known as Version of record Publication date: 2015 Link to publication in University of Groningen/UMCG research database Citation for published version (APA): van Leeuwen, J. G. E. (2015). Library design and screening strategies for efficient enzyme evolution. University of Groningen. Copyright Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons). Take-down policy If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim. Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons the number of authors shown on this cover page is limited to 10 maximum. Download date: 21-11-2020


Chapter 1

General introduction and outline of the thesis

Figure 1. Classical propene to epichlorohydrin process. The first step involves the allylic chlorination of propene. 3-Chloropropene is then reacted with hypochlorous acid, prepared by dissolving chlorine gas in water, yielding a 3:1 mixture of 1,3-dichloropropan-2-ol and 2,3-dichloropropan-1-ol. This mixture is reacted with an alkaline solution to yield racemic epichlorohydrin. The chlorine atom-efficiency of this process is only 25% and significant quantities of halogenated side products like 1,2,3-trichloropropane are formed.


Figure 2. Newly developed glycerol to epichlorohydrin process. The initial hydrochlorination of glycerol with hydrogen chloride is mediated by an organic acid catalyst (e.g. acetic acid) under mild reaction conditions to give a 30-50:1 mixture of 1,3-dichloropropan-2-ol and 2,3-dichloropropan-1-ol. This intermediate is converted to racemic epichlorohydrin under alkaline conditions. The glycerol to epichlorohydrin process produces only one equivalent of waste chloride and virtually no organic side-products are formed.

Figure 3. Comparison of the thermodynamics of the propene to epichlorohydrin and the glycerol to epichlorohydrin conversion processes. Panel a; the precursors in the classical propene to epichlorohydrin route are chemically highly reactive resulting in low activation energy (ΔG‡). The overall reaction energy (ΔG) is strongly negative which indicates a highly exothermic reaction. Panel b; the activation energy of the non-catalyzed glycerol to epichlorohydrin reaction is very high (solid line); the reaction hardly proceeds under ambient conditions. The presence of an organic acid catalyst such as acetic acid lowers the rate-limiting free energy of activation (dotted line) and facilitates high reaction rates under mild conditions.

[Figure 3 graphic: Gibbs free energy plotted against reaction progress from substrate (S) to product (P). Panel a. Classical: propene to epichlorohydrin, showing the transition state (‡), ΔG‡ and ΔG. Panel b. New: glycerol to epichlorohydrin, showing the transition state of the non-catalyzed reaction (‡) and of the catalyzed reaction (‡*), ΔG‡, ΔG‡* and ΔG.]


Green chemistry & catalysis for sustainable organic synthesis

The negative effects of growing industrial production and the increasing use of natural resources call for the development of green production processes that are based on renewable starting materials.[1] The chemical industry in particular is challenged to produce in a more sustainable manner and to minimize its environmental impact, for example by preventing waste and by reducing energy use. Traditional chemical processes have already become less polluting since the 1980s due to measures against the emission of hazardous compounds, but nowadays cleaner alternative production routes are also being implemented.

A good example of the introduction of cleaner technology is a newly developed route towards racemic bulk epichlorohydrin, which is an important intermediate for epoxy resins, paints, paper products and pharmaceuticals (production approximately 1 Mton/year). The traditional process runs under harsh conditions and utilizes propene, chlorine gas and hypochlorous acid as precursors (Figure 1). Propene is commonly derived from fossil sources and, besides stoichiometric amounts of solid waste, highly toxic and persistent side products such as 1,2,3-trichloropropane (TCP) and chlorinated ethers are released (Chapters 2 and 3).[2] The newly developed process utilizes renewable glycerol and hydrochloric acid as precursors and virtually no organohalogen side products are formed (Figure 2).[3]

Glycerol and hydrochloric acid hardly react under ambient conditions. To facilitate high reaction rates under mild conditions an organic acid is applied as a catalyst. The function of a catalyst in organic synthesis is to lower the rate-limiting free energy of activation. This promotes the formation of the transition-state complex for the desired reaction product and reduces the need for a high temperature (Figure 3). Milder reaction conditions translate to a reduced energy input and less formation of undesired side products. The overall reaction energy is not affected by the presence of a catalyst and the catalyst itself is not consumed in the reaction.

Besides the large-scale processes for commodity chemicals, production processes for fine chemicals are also notorious for the large amounts of waste they produce.[4] This is because of the more complex nature of fine chemical synthesis and the requirement for specialized catalysts that make (enantio)selective conversions possible. The high time-to-market pressure is often not


compatible with the time-consuming development of dedicated catalysts, and stoichiometric activation reactions are often used instead. However, the growing availability of highly selective organometallic catalysts and biocatalysts over the last decades is a major factor that counters this trend.[5-7] Besides enhancing sustainability, the continuous expansion of the catalytic toolbox greatly widens the possibilities of synthetic organic chemistry. In this thesis I will focus on the development of enzymes as biocatalysts for selective chemical conversions.


Figure 4. Examples of early biocatalytic processes. a. Glucose isomerase catalyzed conversion of D-glucose to D-fructose (both displayed in the most abundant β-pyranose form, above in open Fischer projection). b. Amino acid acylase catalyzed kinetic resolution of racemic N-acetyl methionine. The enzyme displays a high enantiopreference towards the L-form of the substrate; after ~50% conversion L-methionine is the predominant product and D-N-acetyl methionine remains virtually untouched by the enzyme. c. Cytochrome P450 catalyzed stereo- and regioselective hydroxylation at the 11 position of progesterone. One oxygen atom that is derived from atmospheric oxygen is incorporated in the product and the second oxygen atom is reduced to water by electrons from β-nicotinamide adenine dinucleotide 2′-phosphate (NADPH). The 11-α-hydroxyprogesterone product can be further converted to cortisol (gray).


Enzymes in organic synthesis

The use of enzymes in organic synthesis started more than a century ago. At that time scientists discovered the possibility of using living cells, or extracts thereof, as catalysts for the production of useful (chiral) fine chemicals. Already in 1856 Louis Pasteur noticed that living cells of the Lactobacillus genus were responsible for the conversion of glucose into lactic acid, at that time an unexplained problem in wine making.[8] Twenty-five years later, L-lactic acid was the first natural chiral compound that was produced by fermentation on an industrial scale.[9]

The use of isolated enzymes as catalysts for specific biotransformations was also recognized as early as the nineteenth century. In 1833 the French bio-pioneer Anselme Payen demonstrated the hydrolysis of starch into fermentable sugars by using an enzyme preparation from malted grains.[10] The first industrial application of isolated enzymes came more than one hundred years later; during World War II immobilized invertase was applied at ambient temperature and pH for the production of invert sugar from sucrose.[11] Sulfuric acid, the preferred catalyst at that time, was not available due to war activities. Other early examples of industrial biotransformations are the xylose isomerase (also known as glucose isomerase) catalyzed conversion of D-glucose into the sweeter tasting sugar D-fructose,[12] the amino acid acylase catalyzed kinetic resolution of various proteinogenic amino acids[13] and the cytochrome P450 (whole cells) catalyzed hydroxylation of progesterone as the first step in the production of corticosteroid hormones (Figure 4).[14]

In these settings enzymes proved to be useful catalysts and high enantio- and regioselectivity was often obtained. For example, the biocatalytic 11-α-hydroxylation of progesterone strongly simplified the original Merck process for cortisone acetate, which involved 31 chemical steps. After the biocatalytic step was implemented the product yield increased and its price dropped from 200 to 6 dollars per gram.[14]

On the other hand, the use of enzymes in chemical processes also has serious limitations. This is not surprising since enzymes have been adapted, through millions of years of evolution, to their physiological role in an aqueous environment, which is often very different from what organic chemists require. Problems with separating the product from the biocatalyst itself could be largely overcome by the application of carrier-bound enzymes.[15] This also enabled the reuse


of the enzyme, but other limitations such as a narrow substrate scope, poor catalytic efficiency, low or undesired regio- and stereoselectivity, poor operational stability and (product) inhibition could not be easily solved. The limited availability of stable enzymes also restricted their use in organic synthesis.

Tailoring enzymes for chemical processes

A number of scientific breakthroughs, starting with the discovery of the DNA double helix structure by Watson and Crick in 1953, initiated a new age of biocatalysis.[16] In the decades after this major discovery genetic engineering tools were developed that enabled the over-expression of a gene from a donor organism in a suitable production host such as E. coli. In 1982 human insulin became the first protein drug that was recombinantly produced in E. coli (Genentech, licensed to Eli Lilly and Company).[17] An early example of an industrial recombinant enzyme is chymosin from calf, which was marketed in 1988 by Gist-Brocades (now DSM Food Specialties).[18] Today fermentation-produced chymosin (FPC) is used for over 80% of the global cheese production. Examples of other important developments for biocatalysis are the elucidation of the first protein crystal structure in 1958,[19] the invention of the polymerase chain reaction (PCR) in 1983,[20] and the advances in DNA sequencing and synthesis technology.[21]

Enzyme optimization by rational design

The greatly improved accessibility of natural enzymes from various species through recombinant protein production further promoted the discovery and use of enzymes as catalysts in organic synthesis.[22-25] In an attempt to overcome some of the limitations of natural enzymes, such as poor stability or a narrow substrate scope, protein engineering technologies were developed. In the 1980s and 1990s this was primarily done in a rational way, guided by a crystal structure of the target enzyme. Rationally designed protein variants were created by site-directed mutagenesis, in which specific amino acid substitutions are obtained via targeted mutations in the coding DNA. For example, in 1989 Matsumura et al. reported the successful


stabilization of phage T4 lysozyme by engineered disulfide bonds.[26,27] They introduced pairs of cysteine residues on the surface of the protein, thereby creating one, two or three disulfide bonds that stabilize the native globular protein structure. The melting temperature of their best mutant protein turned out to be 23.4 °C higher than that of the wild-type enzyme, which has no disulfide bonds (Figure 5).

Figure 5. Rational design of a phage T4 lysozyme mutant with higher melting temperature. Structural model based on PDB entry 1L35. The polypeptide chain of the 164 amino acids long enzyme is displayed from blue (N-terminus) to red (C-terminus); sulfur atoms of engineered disulfide bonds are shown as yellow spheres. The three engineered disulfide bonds hamper the thermal unfolding of the native protein.

In many more cases, rational protein design was successfully applied to enhance enzyme stability or catalytic properties.[28] Nonetheless, this approach did not prove to be a robust answer to most engineering challenges. Limitations were encountered, for example, when biocatalysts had to be developed for the synthesis of non-natural pharma intermediates or in cases where the enantioselectivity towards a target substrate had to be inverted.[29] In most cases enzyme variants with a desired specificity could not be predicted well. Another major limitation of rational protein design is that it requires a structure of the template enzyme at atomic resolution. This was especially a problem in the early days of enzyme engineering, when the number of available protein crystal structures was still very small.


Enzyme optimization by laboratory evolution

In a search to circumvent the limitations of rational enzyme design, researchers started in the second half of the 1990s to explore molecular biology methods that mimic Darwinian evolution.[30,31] This eventually resulted in a collection of methods that has been termed “directed evolution”.[32,33] Contrary to enzyme optimization by rational design, directed evolution uses a random approach that is based on iterative cycles of mutagenesis, starting with a target gene or a set of related genes, followed by the selection or screening of the resulting protein library for variants with improved target features. The genes of the best hits are used as templates for the next round of mutagenesis and screening.

Mutations can, for example, be introduced by using a random process such as gene amplification under error-prone conditions (error-prone PCR, epPCR). This mutagenesis approach was extensively explored by Frances Arnold and colleagues in the 1990s.[34] In 1994 Pim Stemmer invented a very different method for creating genetic diversity, which is called “gene shuffling”.[35] In this method a set of homologous parent genes is fragmented into smaller DNA pieces, which are randomly recombined to form a library of full-length hybrid genes that also carry additional random mutations. The general strategy for obtaining enzymes with novel properties by directed evolution is outlined in Figure 6.

With this laboratory evolution approach it is possible to discover enzyme variants with unexpected beneficial amino acid substitutions throughout the entire protein sequence without requiring knowledge of the structure or catalytic mechanism. Whereas the evolution of an enzyme can take millions of years in the context of a living organism, this can now be reduced to several months or even weeks, making directed evolution a very powerful approach for the development of biocatalysts with novel properties.
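The iterative cycle of mutagenesis and screening described above can be illustrated with a small toy simulation. The sketch below is not a method from this thesis: it assumes random substitutions directly at the amino acid level (rather than at the DNA level, as in epPCR), and the screening read-out is a made-up score that counts matches to a hypothetical optimal sequence. All function and parameter names are chosen for illustration only.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the twenty proteinogenic amino acids


def mutate(seq, rate, rng):
    """Return a copy of seq; each position is replaced by a random residue with probability `rate`."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else aa
        for aa in seq
    )


def fitness(seq, target):
    """Toy screening read-out: number of positions identical to a hypothetical optimum."""
    return sum(a == b for a, b in zip(seq, target))


def directed_evolution(parent, target, rounds=10, library_size=200, rate=0.05, seed=0):
    """Run iterative rounds of random mutagenesis and screening; return best variant and score history."""
    rng = random.Random(seed)
    best = parent
    history = [fitness(best, target)]
    for _ in range(rounds):
        # Mutagenesis: create a library of variants from the current template.
        library = [mutate(best, rate, rng) for _ in range(library_size)]
        # Screening: identify the best hit in this round.
        hit = max(library, key=lambda s: fitness(s, target))
        # The hit becomes the next template only if it improves on the current one.
        if fitness(hit, target) > fitness(best, target):
            best = hit
        history.append(fitness(best, target))
    return best, history


if __name__ == "__main__":
    best, history = directed_evolution("MKTAYIAKQR", "MKTWYIAHQR")
    print(best, history)
```

Because the template is only replaced when a better hit is found, the score history never decreases. In a real campaign the read-out would be a measured property such as activity or selectivity, and the optimum would of course be unknown.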


Figure 6. Enzyme optimization via directed evolution. The gene of a promising starting protein is used as template for the creation of a large diversity of gene variants with (random) mutations in the nucleotide sequence. The resulting proteins diverge in amino acid sequence and are subjected to a screening, e.g. in microtiter plate (MTP) format, or a selection assay to identify variants that display enhanced function. The genes of the best hits can be used as template for another round of mutagenesis and screening, which can be repeated until a protein variant is obtained that is fit for the application.

The chance of finding good hits in a directed evolution project is largely determined by the strategy that is chosen and by the design and quality of the gene libraries that are used. For example, gene libraries that are created with epPCR can reveal beneficial amino acid substitutions and so-called “hot-spot” positions throughout the entire protein sequence without the use of structural knowledge.[36,37] A disadvantage of this method is that many potentially beneficial mutations will be missed due to intrinsic limitations such as the mutational bias of the employed DNA polymerase and the organization of the genetic code.[38] Synergistic combinations of mutations are also not efficiently explored with epPCR libraries, because the chance that two or more specific amino acid substitutions occur in the same mutant enzyme is very small.[39] Site-saturation mutagenesis (SSM) is an oligonucleotide-assisted PCR technique that allows all amino acid substitutions to be explored at predetermined sites in the protein of interest.[40] The selection of positions for targeting with SSM can, for example, be inspired by structural data or by the screening and sequencing results of an earlier round. Directed evolution involving both


epPCR and SSM has been used by DSM to engineer a biocatalyst that can catalyze the selective hydroxylation of compactin as the last step in the production of pravastatin, a cholesterol-lowering drug.[41,42] A wild-type cytochrome P450 enzyme from Amycolatopsis orientalis was used as the starting point. This enzyme already has the desired regioselectivity; it catalyzes the hydroxylation of compactin at the 6-position. However, it produces the “wrong” epimer of the product (epi-pravastatin), which lacks the desired biological function (Figure 7, left reaction). The aim of the directed evolution study was to invert the stereopreference of the template enzyme. In the first round an epPCR library was created and thoroughly evaluated (no structure or literature data were available for the template enzyme). The DNA sequences of the best hits revealed several hot-spot positions. Amino acid substitutions at some of these positions improved the stereoselectivity, whereas mutations at other positions enhanced the catalytic efficiency of the enzyme towards the target substrate. The most promising positions were further explored with combinatorial SSM to identify potentially beneficial combinations of mutations. Comprehensive screening of the second-generation SSM library revealed a very active triple mutant enzyme that is capable of producing the desired form of pravastatin with an enantiomeric excess of > 95% (Figure 7, right reaction).

Figure 7. Cytochrome P450 from Amycolatopsis orientalis as a catalyst in the hydroxylation of compactin in the last step of the synthesis of the cholesterol-lowering drug pravastatin. The wild-type enzyme produces the α-variant of pravastatin, which lacks the desired biological function. A triple mutant enzyme that was identified in a directed evolution screening campaign produces the desired epimer of pravastatin with high enantiomeric excess.


Workflows for efficient laboratory enzyme evolution

In the 1990s directed evolution was successfully applied to acquire improved enzymes for diverse biotechnological applications but the overall process remained costly and time-consuming.[43,44] Later research was focused on improving the efficiency of directed evolution. Important progress was made through the development of better high-throughput screening assays or with advanced molecular tools for creating genetic diversity.[44-47] Here I will further elaborate on methods and considerations for the more effective probing and sampling of sequence diversity.

Considerations for optimal library design and sampling

In 2005 Reetz and colleagues established a directed evolution workflow for the development of enzymes with novel catalytic properties that is called the “combinatorial active-site saturation test” (CAST).[48] This method uses structural data to select amino acid positions around the substrate binding pocket of an enzyme, since mutations at these sites will influence various catalytic functions.[39] In consecutive rounds these first-shell positions (typically around 10 amino acid residues) are subjected, in small sets of two or three sites, to saturation mutagenesis and screening. This structure-based approach proved to be a robust method and has become a common way of engineering catalytic parameters such as substrate scope, inhibition properties, catalytic efficiency and (enantio)selectivity.[48,49] The vast expansion of the protein structure database (www.pdb.org)[50] in the late 1990s and the 2000s supported the use of CASTing.

Despite all successes there are three important limitations of the CASTing approach, which also apply to many other directed evolution methods: A) the evaluation of combinatorial saturation libraries requires a significant screening effort; B) multiple rounds of laborious mutagenesis and screening are needed; and C) possible synergistic combinations of two or more mutations are only partially studied. These limitations arise because the capacity of available screening assays is limited, with the consequence that just a relatively small number of target positions can be efficiently explored with combinatorial SSM. If, for example, two or three positions are


simultaneously randomized to all twenty proteinogenic amino acids, 400 (20^2) and 8,000 (20^3) enzyme variants have to be assessed, respectively. The latter is about the maximum number of variants that can be covered with analytical tools such as high-performance liquid chromatography (HPLC) and gas-liquid chromatography (GLC), which are often used in the screening for catalytic function. At some point the screening capacity is always limiting; even highly efficient selection methods can cover only a diminutive fraction of the huge sequence space of proteins.[51]

One of the challenges in directed evolution is to make optimal use of the screening capacity that is available. A common idea is that comprehensive screening is required to find the best evolved mutant enzyme.[48] However, full coverage of a random protein library requires oversampling. It is like rolling a die; most of the time it takes more than six rolls to see all six values, while several values come up more than once. Reaching 95% coverage of an unbiased SSM library of 8,000 unique variants requires, on average, a screening effort of approximately 23,965 randomly picked variants. This translates to almost three-fold library oversampling, whereas the repeated testing of identical clones does not contribute to the discovery of better hits. The relationship between library coverage and oversampling is visualized in Figure 8. Equation 1a is used to calculate the library coverage as a function of the screening effort.[52] Equation 1b is used to determine the average number of clones that need to be screened to achieve a desired coverage.
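The numbers in this paragraph can be reproduced with a few lines of Python. The sketch below implements the relationships stated in Equation 1 for an unbiased library, in which every variant is equally likely to be picked; the function names are my own and not taken from the cited literature.

```python
import math


def library_coverage(screened, library_size):
    """Equation 1a: expected coverage (%) after screening `screened` randomly picked clones."""
    return (1.0 - (1.0 - 1.0 / library_size) ** screened) * 100.0


def clones_for_coverage(coverage_pct, library_size):
    """Equation 1b: number of randomly picked clones needed to reach `coverage_pct` coverage."""
    return math.log(1.0 - coverage_pct / 100.0) / math.log(1.0 - 1.0 / library_size)


def p_next_unique(screened, library_size):
    """Probability (%) that the next picked clone has not been sampled before."""
    return 100.0 - library_coverage(screened, library_size)


if __name__ == "__main__":
    # 95% coverage of a library of 8,000 variants (three positions, 20 amino acids each)
    needed = math.ceil(clones_for_coverage(95, 8000))
    print(needed, needed / 8000)          # screening effort and oversampling factor
    # Coverage after 1x oversampling of a library of 400 variants
    print(library_coverage(400, 400))     # about 63%
    print(p_next_unique(400, 400))        # about 37%
```

Rounding the result of Equation 1b up to a whole clone gives the 23,965 variants mentioned above, which corresponds to an oversampling factor of just under three.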

Figure 8. Coverage of an unbiased random protein library in percent as a function of the screening effort (solid line). The probability of sampling unique variants decreases as the library coverage progresses (dotted line). For example, after 1× oversampling about 63.2% library coverage is achieved; the probability that the 401st randomly picked clone is unique (not sampled before) for a library with 400 variants is approximately 37% (100-63).

[Figure 8 plot: library coverage (%) and probability of picking a unique variant (%), both on a vertical axis from 0 to 100, against library oversampling (0× to 5×) on the horizontal axis.]


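a.
$$\mathit{LC} = \left(1 - \left(1 - \frac{1}{\mathit{LS}}\right)^{\mathit{SC}}\right) \times 100\%$$

b.
$$\mathit{SC} = \log_{\left(1 - \frac{1}{\mathit{LS}}\right)}\left(1 - \frac{\mathit{LC}}{100}\right) = \frac{\ln\left(1 - \mathit{LC}/100\right)}{\ln\left(1 - 1/\mathit{LS}\right)}$$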

Equation 1. a. Coverage of an unbiased random protein library as a function of the screening effort. LC is the percentage of the theoretical library diversity that is covered, LS is the library size (total number of unique variants that are present in the library) and SC is the number of clones that are screened. The library coverage corresponds to the probability of finding a specific variant, which is the same as one minus the chance that such a variant is not picked. The chance that a specific variant is not picked can be written as (1-1/LS)^SC. b. The number of randomly picked clones that have to be screened (SC) in order to achieve a given degree of library coverage (LC, of theoretical diversity in percent). Equation 1b is derived from 1a.

An important question is to what extent a random protein library should be screened to make optimal use of the screening capacity. In other words, what is the optimal balance between library coverage and oversampling? An absolute answer to this question is hard to give, but if only one improved variant were present in a certain library it could make sense to strive for 95% coverage, so that there is only a 5% chance of not finding this hypothetical variant. This scenario, however, rarely occurs. Especially in larger combinatorial SSM libraries it is much more likely that multiple suitable variants are present, or that no improved variants are formed at all. In the favorable situation where multiple sufficiently improved variants exist, much less library oversampling is required to discover at least one of them.[52] For example, to have a 95% chance of finding at least one out of five sufficiently improved variants in a random library of 8,000 variants, just 45.1% library coverage is required (Equation 2a). To reach this degree of library coverage it is necessary to screen about 4,793 randomly picked clones (Equation 2b).

This example indicates that comprehensive screening is not always required and that the testing of large numbers of redundant clones can be avoided. Library screening strategies are further explored in Chapter 3, which also covers the sampling of biased libraries.
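The example with five sufficiently improved variants can be checked numerically. The following sketch assumes the relationships given in Equation 2 (an unbiased library in which the improved variants are found independently of each other); the function names are illustrative.

```python
import math


def required_coverage(confidence_pct, n_improved):
    """Equation 2a: coverage (%) needed to find at least one of `n_improved` variants."""
    miss = 1.0 - confidence_pct / 100.0          # accepted chance of finding none
    return (1.0 - miss ** (1.0 / n_improved)) * 100.0


def clones_to_screen(confidence_pct, n_improved, library_size):
    """Equation 2b: clones to screen to find at least one improved variant with given confidence."""
    miss = 1.0 - confidence_pct / 100.0
    # (1 - LC/100) expressed through the confidence and the number of improved variants
    uncovered = miss ** (1.0 / n_improved)
    return math.log(uncovered) / math.log(1.0 - 1.0 / library_size)


if __name__ == "__main__":
    print(round(required_coverage(95, 5), 1))            # coverage in percent
    print(math.ceil(clones_to_screen(95, 5, 8000)))      # clones to screen
    print(math.ceil(clones_to_screen(95, 1, 8000)))      # single improved variant
```

With a single improved variant the calculation reduces to Equation 1b, so the required effort returns to roughly three-fold oversampling; with five improved variants it drops to about 0.6-fold.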


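a.
$$\mathit{LC} = \left(1 - \left(1 - \frac{\mathit{RC}}{100}\right)^{\frac{1}{\mathit{SIV}}}\right) \times 100\%$$

b.
$$\mathit{SC} = \log_{\left(1 - \frac{1}{\mathit{LS}}\right)}\left(\left(1 - \frac{\mathit{RC}}{100}\right)^{\frac{1}{\mathit{SIV}}}\right) = \frac{\ln\left(1 - \mathit{RC}/100\right)}{\mathit{SIV} \cdot \ln\left(1 - 1/\mathit{LS}\right)}$$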

Equation 2. a. Library coverage (LC) required to find at least one of a defined number of variants that are designated as sufficiently improved variants (SIV). The library coverage is indicated with LC, the confidence of finding at least one SIV is indicated with RC (in %). The probability of finding at least one out of multiple SIVs can be written as 1-(1-(LC/100))^SIV and is defined as RC. Using this equation the required LC can be easily calculated. b. Number of clones that need to be screened (SC) to find at least one SIV with a certain confidence (RC). The library size (LS) is the theoretical number of unique variants that are encoded. Equation 2b follows from Equations 2a and 1b.

From completely random to knowledge-driven library design

To make optimal use of screening capacity it is necessary to create libraries with a high fraction of genotypically different improved variants. One option is to suppress the uneven distribution of amino acids in SSM libraries, which arises from the fact that some amino acids are specified by a larger number of codons than others. However, even in unbiased SSM libraries the frequency of positives usually remains relatively low. This is because only a small fraction of the explored amino acid substitutions will be beneficial for a desired function, whereas the vast majority of random mutations are neutral or even detrimental.[51] The frequency of beneficial mutations in site-specific mutant libraries can be significantly enhanced by the application of restricted mutagenic codons that specify only functional subsets of the twenty proteinogenic amino acids. With this approach, called site-restricted mutagenesis (SRM),[53] the randomness in the ensuing libraries is drastically reduced.[54-56] For example, an SRM library in which four positions are randomized to five different amino acids plus the wild-type residue (5+1) has a total diversity of 1,296 (6^4) variants. This is 123-fold less than a full saturation library that also addresses four positions (20^4 = 160,000). With SRM more target positions can be covered simultaneously with less screening. However, beneficial amino acid residues that are not included will be missed. This makes the definition of functional subsets of amino acids of key importance in SRM library design. One option that was explored by Reetz and co-workers is to

Page 16: University of Groningen Library design and screening ... · global cheese production. Examples of other important developments for biocatalysis are the elucidation of the first protein

Introduction !

!21

choose subsets in such way that libraries encompass large differences in physical and chemical properties of amino acids without codon bias.[57] More data-driven options use, for example, phylogenetic and co-evolution data, following the assumption that individual or combinations of amino acids that occur in homologous enzymes are likely to be tolerated.[58-61] Also structural[62,63] and literature[64,65] data can be used. Moreover computational methodologies are becoming increasingly more important for the definition of functional subsets of amino acids[66-69] (Chapter 6). Smart library design via SRM typically applies knowledge from various sources and this strategy is visualized in Figure 9.
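The library-size comparison and the screening-effort reasoning behind Equation 2 can be illustrated with a short numerical sketch. The function names are hypothetical, the values RC = 95% and SIV = 3 are arbitrary illustrations, and Equation 1b is not reproduced in this part of the chapter; the sketch assumes it is the commonly used exponential oversampling relation LC/100 = 1 - exp(-SC/LS).

```python
import math


def library_size(residues_per_position: int, positions: int) -> int:
    """Unique variants when every position samples the same number of residues."""
    return residues_per_position ** positions


def required_coverage(rc_percent: float, siv: int) -> float:
    """Equation 2a: coverage LC (in %) needed to find at least one of `siv`
    sufficiently improved variants with confidence RC (in %)."""
    return (1.0 - (1.0 - rc_percent / 100.0) ** (1.0 / siv)) * 100.0


def clones_to_screen(ls: int, lc_percent: float) -> int:
    """Clones to screen for a given coverage.

    ASSUMPTION: uses LC/100 = 1 - exp(-SC/LS), solved for SC and rounded up.
    """
    return math.ceil(-ls * math.log(1.0 - lc_percent / 100.0))


srm = library_size(6, 4)    # five substitutions plus wild type at four positions
ssm = library_size(20, 4)   # full saturation of the same four positions
print(srm, ssm, round(ssm / srm))

lc = required_coverage(95.0, 3)   # illustrative: RC = 95 %, three SIVs in the library
print(f"required coverage: {lc:.1f} %")
print("clones to screen (SRM, SSM):", clones_to_screen(srm, lc), clones_to_screen(ssm, lc))
```

Under these assumptions the required number of clones scales linearly with library size, so the roughly 123-fold smaller SRM library needs a correspondingly smaller screening effort for the same confidence, provided it still contains the same number of improved variants.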

Figure 9. Knowledge and data-driven library design. From the left (clockwise): (1) phylogenetic data can indicate the mutability of individual sites and the amino acid diversity that is likely to be accepted by the template enzyme; (2) co-evolution analysis (2D heat map showing co-evolution scores created on 3dm.bio-prodict.nl) can identify correlated occurrence of amino acids in homologous enzyme sequences, which is likely to be important for a certain function; (3) structural inspection and modeling can reveal promising sites and allow favorable amino acid substitutions to be predicted; (4) target sites and amino acid diversity can be inspired by earlier mutagenesis studies of (related) enzymes; (5,6) various computational routines that run, for example, under Yasara (http://www.yasara.org/) or Rosetta (http://depts.washington.edu/bakerpg/drupal/) can be used to determine functional diversity for targeting in SRM libraries; (7) HotSpot Wizard (http://loschmidt.chemi.muni.cz/hotspotwizard/) is a web tool that uses structural and phylogenetic data to identify residues that may come into contact with ligand molecules entering or leaving the active site.

[Figure 9 graphic: numbered panels 1-7 arranged around the central label "Knowledge driven library design", including a list of homologous sequence identifiers, a sequence logo (information content in bits, weblogo.berkeley.edu) and a HotSpot Wizard panel.]


Discovery and characterization of haloalkane dehalogenases

The subsequent chapters of this thesis describe tools and strategies for more efficient laboratory evolution of enzymes. Some approaches focused on enhancing the biocatalytic potential of haloalkane dehalogenases are experimentally evaluated. Like many other hydrolytic enzymes such as lipases, proteases, esterases and epoxide hydrolases, haloalkane dehalogenases belong to the so-called α/β-hydrolase fold superfamily and are composed of two domains: a large catalytic α/β-core domain and a mainly alpha-helical cap-domain that shapes part of the substrate binding pocket (Figure 10).[70-73]

Figure 10. Cartoon representation of a. haloalkane dehalogenase from Xanthobacter autotrophicus GJ10 - DhlA (image created from pdb file 1B6G[71]) and b. haloalkane dehalogenase from Rhodococcus rhodochrous NCIMB 13064 - DhaA (image created from pdb file 1BN6[72]). α-Helices and β-sheets of the α/β core domain are shown in blue and red respectively, α-helices of the cap domain are displayed in dark-blue. Catalytic residues are displayed as yellow sticks.


Figure 11. The hydrolytic cleavage reaction of haloalkanes catalyzed by haloalkane dehalogenase yields an alcohol product, a halide anion and a proton.


Haloalkane dehalogenases occur in some bacteria and catalyze the hydrolytic cleavage of carbon-halogen bonds (Figure 11). Whereas most enzyme substrates are naturally occurring compounds, this is not the case for haloalkane dehalogenases. Until the second half of the previous century relatively few haloalkanes were present in nature; most natural halogenated compounds are formed by marine organisms.[74,75] However, this situation changed dramatically when haloalkanes started being produced at an industrial scale for diverse applications, such as solvents, gasoline additives, biocides and synthons in organic synthesis. Today halogenated materials can be found in, for example, solvents,[76] paint removers,[77] flame retardants[78] and about 20% of all pharmaceuticals that are on the market.[79] In just a few decades this new class of chemicals became widespread in nature due to spillage at production sites, extensive use in agriculture and contamination after disposal.[80,81]

Unfortunately, most halogenated compounds, and especially haloalkanes bearing multiple halogen atoms such as 1,2,3-trichloropropane, proved to be toxic and highly recalcitrant towards biodegradation. Nonetheless, the selection pressure at contaminated sites and the metabolic versatility and evolutionary potential of microorganisms have resulted in microbes that can mineralize halogenated compounds such as 1,2-dichloroethane (DCE),[82] vinyl chloride[83] and pesticides like γ-hexachlorocyclohexane (lindane).[84] Since 1978 haloalkane-degrading microbes have been isolated from contaminated soils[85,86] and the enzymes that are involved in dehalogenation have received widespread scientific attention.[87-89]

The first isolated haloalkane dehalogenase was obtained from Xanthobacter autotrophicus GJ10 (DhlA) by Keuning et al. in 1984.[90] This Xanthobacter strain was isolated from 1,2-dichloroethane-contaminated soil and is capable of utilizing this toxic compound as its sole carbon and energy source. The substrate scope of DhlA is not limited to DCE. In vitro studies have indicated that it also catalyzes the hydrolytic cleavage of carbon-halogen bonds in a range of other mono- and dihalogenated short-chain alkanes such as 1,2-dibromoethane, 1,3-dichloropropane, 1-chloroethane and 1-bromopropane.[90] Site-directed mutagenesis experiments and the elucidation of a three-dimensional structure of wild-type DhlA by X-ray crystallography have shed light on the catalytic mechanism and


structure-function relationship of haloalkane dehalogenases.[73,91,92] In DhlA the residues Asp124, His289 and Asp260 make up a nucleophile-histidine-acid catalytic triad, which is a typical motif in α/β-hydrolases. The indole NH groups of Trp125 and Trp175 form hydrogen bonds with the halogen atom of the substrate, which stabilizes the transition state complex. The dehalogenation reaction proceeds in two steps. First, a carboxylate oxygen of Asp124 carries out a nucleophilic attack on the halogen-bearing carbon atom of the substrate, thereby producing an alkylated enzyme intermediate and a halide ion. In the next step the alkyl-enzyme intermediate is cleaved by a water molecule that is activated by His289 (Figure 12). The catalytic pentad, formed by the catalytic triad and the two halide-stabilizing residues, differs somewhat within the haloalkane dehalogenase family. The catalytic acid that forms a hydrogen bond with the general base (Asp260 in DhlA) can also be a glutamate residue and the second halide-stabilizing residue (Trp175 in DhlA) can be either a tryptophan or an asparagine.[93] The topological arrangement of the catalytic acid and the second halide-stabilizing residue also diverges within the family. The second halide-stabilizing residue can be located either in the cap domain or in the core domain and the catalytic acid can be positioned in two different loops in the core domain.[93]

Figure 12. The catalytic mechanism of haloalkane dehalogenase from Xanthobacter autotrophicus. In the first half reaction a covalent alkyl-enzyme intermediate and a halide ion are formed. In the second half reaction the alkyl-enzyme intermediate is hydrolytically cleaved, producing the alcohol and a proton.

Besides its value for fundamental studies, the structure of DhlA also suggested ways of engineering its catalytic and physical properties. In the 1990s targeted mutations in the cap-domain were reported that altered the


substrate scope of the enzyme[94,95] and in 2002 a DhlA variant with engineered disulfide bonds was described that displayed higher thermostability.[96]

Haloalkane dehalogenases have received much attention for applications in bioremediation. In 1995 a bioreactor harboring Xanthobacter autotrophicus GJ10 cells was used to clean up DCE-polluted groundwater in Germany,[97] but many other haloalkanes could not be mineralized with this system, in part due to the limited catalytic activity and substrate scope of DhlA.

Since the description of DhlA, various other haloalkane dehalogenases have been discovered and characterized. For example, the haloalkane dehalogenase from Sphingomonas paucimobilis UT26 (LinB) was reported by Nagata et al. in 1993.[98,99] This enzyme is highly active on bulky compounds such as 1,3,4,6-tetrachloro-1,4-cyclohexadiene, which is an intermediate in the degradation of the insecticide lindane. In 1997 Kulakova et al. described a haloalkane dehalogenase (DhaA) from Rhodococcus rhodochrous NCIMB 13064, which was the first haloalkane dehalogenase known to be involved in the degradation of several C2-C8 n-haloalkanes.[100] The latter enzyme also displays better catalytic activity on trihalogenated propanes than DhlA, but this was still not sufficient to support bacterial growth on TCP.[101] In 2002 Bosma and colleagues used a directed evolution approach to enhance the catalytic efficiency of DhaA towards TCP and applied their best evolved mutant in the construction of a synthetic biology organism that can utilize TCP as the sole carbon and energy source.[102] In a subsequent directed evolution study, which focused on the substrate access tunnel, the catalytic activity of DhaA on TCP was further improved.[103] This variant, referred to as DhaA31, was also used to create a further improved TCP-degrading organism.[104]

Biocatalytic potential of haloalkane dehalogenases

The studies cited above indicate that there are several possibilities to tailor haloalkane dehalogenases for applications in the field of bioremediation but the potential of haloalkane dehalogenases in the production of valuable fine chemicals is only poorly explored. Especially the biocatalytic production of chiral intermediates for the pharmaceutical industry requires highly


enantioselective enzymes. Unfortunately the best studied haloalkane dehalogenases so far display only poor to moderate enantioselectivity.[105-107] An attractive option to expand the biocatalytic potential of haloalkane dehalogenases is to develop enantioselective variants for useful target conversions.

Outline of the Thesis

In Chapter 2 the biocatalytic potential of five wild-type and one mutant haloalkane dehalogenases is explored in the asymmetric conversion of prochiral polyhalogenated compounds towards chiral haloalcohol building blocks. To enhance the optical purity of the primary dehalogenase product also the subsequent kinetic resolution of the haloalcohols towards the diol is investigated.

In Chapter 3 a smart library design approach is explored to evolve haloalkane dehalogenase variants with complementary enantioselectivity towards the industrial waste product TCP. The anticipated products (R)- and (S)-2,3-dichloropropanol can be converted into (S)- and (R)-epichlorohydrin which are valuable chiral building blocks for the pharmaceutical industry.

In Chapter 4 new tools and strategies are explored for designing and creating site-restricted mutant libraries. The first tool is focused on the definition of sequence diversity that can be optimally covered with the screening capacity that is available. The purpose of the second tool is to find optimal sets of (partly undefined) codons that specify the required subsets of amino acids.

In Chapter 5 a new combinatorial library design strategy is explored. The purpose is to efficiently explore larger numbers of positions and amino acid variation in single combinatorial library designs. Exploring protein sequence space in a more efficient way can speed up the directed evolution process.

In Chapter 6 a computational evolution methodology is explored for the development of biocatalysts that can convert non-natural target compounds. The method takes three key requirements for enzyme catalysis into account:


enzyme stability, substrate binding and substrate turnover. The final goal of this study is to evolve haloalkane dehalogenase variants with enhanced properties in the kinetic resolution of racemic 3-chloro-2-alkyn towards enantiopure 3-butyn-2-ol building blocks.

References

1. World Commission on Environment and Development (1987) Our common future. Oxford University Press, USA.

2. K. Weissermel, H.J. Arpe (1997) Industrial organic chemistry, 3rd Edn., Wiley VCH, Weinheim, Germany. p 294-299.

3. B.M. Bell, J.R. Briggs, R.M. Campbell, S.M. Chambers, P.D. Gaarenstroom, J.G. Hippler, B.D. Hook, K. Kearns (2008) Glycerin as a renewable feedstock for epichlorohydrin production. The GTE Process. CLEAN - Soil, Air, Water 36: 657-661.

4. R.A. Sheldon (2000) Atom efficiency and catalysis in organic synthesis. Pure Appl. Chem. 72: 1233-1246.

5. R.A. Sheldon, I. Arends, U. Hanefeld (2007) Green chemistry and catalysis. Wiley VCH, Weinheim, Germany.

6. Dutch National Research School Combination Catalysis Controlled by Chemical Design (NRSC-Catalysis) (2009) Future perspectives in catalysis. http://www.nrsc-catalysis.nl/files/media/scientific_reports/Future_perspectives_in_Catalysis.pdf

7. R.H. Garrett, C.M. Grisham (1999) Biochemistry. Saunders college publishing, Philadelphia, USA p 426-427.

8. L. Pasteur (1857) Mémoire sur la fermentation appelée lactique. Comptes rendus des séances de l'Académie des Sciences 45: 913-916.

9. R.A. Sheldon (1993) Chirotechnology - Industrial synthesis of optically active compounds. CRC press, New York, USA p. 105.

10. A. Payen, J.F. Persoz (1833) Memoire sur la Diastase; les principaux produits de ses réactions et leurs applications aux arts industriels. J. Ann. Chem. Phys. 53: 73-92.

11. P.S.J. Cheetham (1995) The application of enzymes in industry, in: Handbook of Enzyme Biotechnology, 3rd Edn., Ellis Horwood, London. p 420.

12. R.L. Antrim, W. Colilla, B.J. Schnyder (1979) Glucose isomerase production of high-fructose syrups. In: Appl. Biochem. Bioeng. (vol 2), Enzyme Technology. Academic Press, New York, USA p 97-207.

13. A.S. Bommarius, K. Drauz, U. Groeger, C. Wandrey (1992) Membrane bioreactors for the production of enantiomerically pure α-amino acids, Chirality in Industry. John Wiley & Sons Ltd, New York, USA p 372-397.

14. O.K. Sebek, D. Perlman (1979) Microbial transformation of steroids and sterols. Microbial Technology (vol. 1), 2nd Edn., Academic Press, New York, USA p. 484-488.

15. L. Cao (2005) Carrier-bound Immobilized Enzymes: Principles, Application and Design. Wiley-VCH, Weinheim, Germany.

16. J.D. Watson, F.H.C. Crick (1953) Molecular structure of nucleic acids: a structure for deoxyribose nucleic acid. Nature 171: 737-738.

17. S.S. Hall (2002) Invisible frontiers: the race to synthesize a human gene. Oxford University Press, USA.

18. J.A. van den Berg, K.J. van der Laken, A.J. van Ooyen, T.C. Renniers, K. Rietveld, A. Schaap, A.J. Brake, R.J. Bishop, K. Schultz, D. Moyer, M. Richman, J.R. Shuster (1990) Kluyveromyces as a host for heterologous gene expression: expression and secretion of prochymosin. Biotechnology 8: 135-139.

19. J.C. Kendrew, G. Bodo, H.M. Dintzis, R.G. Parrish, H. Wyckoff, D.C. Phillips (1958) 3-Dimensional model of the myoglobin molecule obtained by X-ray analysis. Nature 181: 662-666.

20. K. Mullis, F. Faloona, S. Scharf, R. Saiki, G. Horn, H. Erlich (1986) Specific enzymatic amplification of DNA in vitro – the polymerase chain-reaction. Cold Spring Harbor Symposia on Quantitative Biology 51: 263-273.

21. E. Pettersson, J. Lundeberg, A. Ahmadian (2009) Generations of sequencing technologies. Genomics 93: 105-111.

22. W. Kühne (1976) Enzymes: One Hundred Years. FEBS Lett. vol. 62.

23. R. Borriss (1987) Biotechnology of enzymes, in Biotechnology vol 7a, eds. H.J. Rehm, G. Reed, series Enzyme Technology, ed. J.F. Kennedy, VCH Verlagsgesellschaft, Weinheim, Germany. p 35-62.

24. W. Gerhartz (1990) Enzymes in industry, VCH Verlagsgesellschaft, Weinheim, Germany. p 11.

25. J.C. Venter, K. Remington, J.F. Heidelberg, A.L. Halpern, D. Rusch, J.A. Eisen, D. Wu, I. Paulsen, K. E. Nelson, W. Nelson, D.E. Fouts, S. Levy, A.H. Knap, M.W. Lomas, K. Nealson, O. White, J. Peterson, J. Hoffman, R. Parsons, H. Baden-Tillson, C. Pfannkoch, Y.H. Rogers, H.O. Smith (2004) Environmental genome shotgun sequencing of the Sargasso Sea. Science 304: 66-74.

26. M. Matsumura, W.J. Becktel, M. Levitt, B.W. Matthews (1989) Stabilization of phage T4 lysozyme by engineered disulfide bonds. Proc. Natl. Acad. Sci. 86: 6562-6566.

27. M. Matsumura, G. Signor, B.W. Matthews (1989) Substantial increase of protein stability by multiple disulfide bonds. Nature 342: 291-293.

28. U.T. Bornscheuer, M. Pohl (2001) Improved biocatalysts by directed evolution and rational protein design. Curr. Opin. Chem. Biol. 2: 137-143.

29. R. Chen (2001) Enzyme engineering: rational redesign versus directed evolution. Trends Biotechnol. 19: 13-14.

30. J.C. Moore, F.H. Arnold (1996) Directed evolution of a para-nitrobenzyl esterase for aqueous-organic solvents. Nature Biotechnol. 14: 458-467.

31. W.P. Stemmer (1994) DNA shuffling by random fragmentation and reassembly: in vitro recombination for molecular evolution. Proc. Natl. Acad. Sci. USA 91: 10747-10751.

32. F.H. Arnold (2001) Combinatorial and computational challenges for biocatalyst design. Nature 409: 253-257.


33. H.E. Schoemaker, D. Mink, M.G. Wubbolts (2003) Dispelling the myths - biocatalysis in industrial synthesis. Science 299: 1694-1697.

34. K. Chen, F.H. Arnold (1993) Tuning the activity of an enzyme for unusual environments: sequential random mutagenesis of subtilisin E for catalysis in dimethylformamide. Proc. Natl. Acad. Sci. USA 90: 5618-5622.

35. W.P. Stemmer (1994) Rapid evolution of a protein in vitro by DNA shuffling. Nature 370: 389-391.

36. L. You, F.H. Arnold (1996) Directed evolution of subtilisin E in Bacillus subtilis to enhance total activity in aqueous dimethylformamide. Protein Eng. 9: 77-83.

37. J.H. Spee, W.M. de Vos, O.P. Kuipers (1993) Efficient random mutagenesis method with adjustable mutation frequency by use of PCR and dITP. Nucleic Acids Res. 21: 777-778.

38. T.S. Wong, D. Zhurina, U. Schwaneberg (2006) The diversity challenge in directed protein evolution. Comb. Chem. High Throughput Screen. 9: 271-288.

39. K.L. Morley, R.J. Kazlauskas (2005) Improving enzyme properties: when are closer mutations better? Trends Biotechnol. 23: 231-237.

40. H.H. Hogrefe, J. Cline, G.L. Youngblood, R.M. Allen (2002) Creating randomized amino acid libraries with the QuikChange Multi Site-Directed Mutagenesis kit. BioTechniques. 33: 1158-1165.

41. P. Klaassen, A.W.H. Vollebregt, M.A. van den Berg, M. Hans, J.M. van der Laan (2007) Process for preparing pravastatin. European Patent EP2094841.

42. M. Hans, J.M. van der Laan, B. Meijrink, W. van Scheppingen, R. Kerkman, M. van den Berg, M. Kittelmann, A. Kuhn, A. Riepp, J. Kühnöl, A. Fredenhagen, L. Oberer, O. Ghisalba, S. Luetz, D.P. Mangan, T.S. Moody, D. Schmid, A. Osorio-Lozada, F.O. Ütkür, J. Collins, C. Brandenbusch, G. Sadowski, A. Schmid, B. Bühler, M. Kinne, M. Poraj-Kobielska, R. Ullrich, M. Hofrichter, G. Grogan, M.L. Thompson (2012) Regio- and stereoselective hydroxylation. In: Practical Methods for Biocatalysis and Biotransformations. John Wiley & Sons, Ltd, Chichester, UK.

43. F.H. Arnold, J.C. Moore (1997) Optimizing industrial enzymes by directed evolution, in: New Enzymes for Organic Synthesis, vol. 58, Adv. Biochem. Eng. Biotechnol., Springer, Berlin, p 2-14.

44. M.T. Reetz (2004) Controlling the enantioselectivity of enzymes by directed evolution: practical and theoretical ramifications. Proc. Natl. Acad. Sci. USA 101: 5716-5722.

45. H. Lin, V.W. Cornish (2002) Screening and selection methods for large-scale analysis of protein function. Angew. Chem. 41: 4402-4425.

46. E.T. Boder, K.D. Wittrup (1997) Yeast surface display for screening combinatorial polypeptide libraries. Nat. Biotechnol. 15: 553-557.

47. T.S. Wong, D. Roccatano, D. Loakes, K.L. Tee, A. Schenk, B. Hauer, U. Schwaneberg (2008) Transversion-enriched sequence saturation mutagenesis (SeSaM-Tv+): a random mutagenesis method with consecutive nucleotide exchanges that complements the bias of error-prone PCR. Biotechnol. J. 3: 74-82.


48. M.T. Reetz, M. Bocola, J.D. Carballeira, D.X. Zha, A. Vogel (2005) Expanding the range of substrate acceptance of enzymes: combinatorial active-site saturation test. Angew. Chem. 44: 4192-4196.

49. M.T. Reetz, L.W. Wang, M. Bocola (2006) Directed evolution of enantioselective enzymes: iterative cycles of CASTing for probing protein-sequence space. Angew. Chem. 45: 1236-1241.

50. H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T.N. Bhat, H. Weissig, I.N. Shindyalov, P.E. Bourne (2000) The protein data bank. Nucleic Acids Research 28: 235-242.

51. F.H. Arnold (1998) Design by directed evolution. Acc. Chem. Res. 31: 125-131.

52. Y. Nov (2012) When second best is good enough: another probabilistic look at saturation mutagenesis. Appl. Environ. Microbiol. 78: 258-262.

53. J.G.E. van Leeuwen, H.J. Wijma, R.J. Floor, J.M. van der Laan, D.B. Janssen (2012) Directed evolution strategies for enantiocomplementary haloalkane dehalogenases: from chemical waste to enantiopure building blocks. Chembiochem. 13: 137-148.

54. R.E. Campbell, O. Tour, A.E. Palmer, P.A. Steinbach, G.S. Baird, D.A. Zacharias, R.Y. Tsien (2002) A monomeric red fluorescent protein. Proc. Natl. Acad. Sci. USA 99: 7877-7882.

55. R.J. Hayes, J. Bentzien, M.L. Ary, M.Y. Hwang, J.M. Jacinto, J. Vielmetter, A. Kundu, B.I. Dahiyat (2002) Combining computational and experimental screening for rapid optimization of protein properties. Proc. Natl. Acad. Sci. USA 99: 15926-15931.

56. R. Fox, L.J. Giver, D. Held, D. Hattendorf, T. Choudhary (2010) Reduced codon mutagenesis. Codexis patent. US2011/0082055 A1.

57. M.T. Reetz, D. Kahakeaw, R. Lohmer (2008) Addressing the numbers problem in directed evolution. Chembiochem. 9: 1797-1804.

58. C. Jäckel, J.D. Bloom, P. Kast, F.H. Arnold, D. Hilvert (2010) Consensus protein design without phylogenetic bias. J. Mol. Biol. 399: 541-546.

59. A. Pavelka, E. Chovancova, J. Damborsky (2009) HotSpot Wizard: a web server for identification of hot spots in protein engineering. Nucleic Acids Res. 37: W376-383.

60. R.K. Kuipers, H.J. Joosten, E. Verwiel, S. Paans, J. Akerboom, J. van der Oost, N.G. Leferink, W.J. van Berkel, G. Vriend, P.J. Schaap (2009) Correlated mutation analyses on super-family alignments reveal functionally important residues. Proteins 76: 608-616.

61. H. Jochens, D. Aerts, U.T. Bornscheuer (2010) Thermostabilization of an esterase by alignment-guided focussed directed evolution. Protein Eng. Des. Sel. 23: 903-909.

62. M.T. Reetz, P. Soni, L. Fernandez (2009) Knowledge-guided laboratory evolution of protein thermolability. Biotechnol. Bioeng. 102: 1712-1717.

63. J.F. Chaparro-Riggers, K.M. Polizzi, A.S. Bommarius (2007) Better library design: data-driven protein engineering. J. Biotechnol. 2: 180-191.

64. L.G. Otten, F. Hollmann, I.W. Arends (2010) Enzyme engineering for enantioselectivity: from trial-and-error to rational design? Trends Biotechnol. 28: 46-54.


65. R.K. Kuipers, H.J. Joosten, W.J. van Berkel, N.G. Leferink, E. Rooijen, E. Ittmann, F. van Zimmeren, H. Jochens, U. Bornscheuer, G. Vriend, V.A. dos Santos, P.J. Schaap (2010) 3DM: Systematic analysis of heterogeneous superfamily data to discover protein functionalities. Proteins: Struct. Funct. Bioinf. 78: 2101-2113.

66. C.A. Smith, T. Kortemme (2011) Predicting the tolerated sequences for proteins and protein interfaces using Rosetta Backrub flexible backbone design. PLoS One 6: e20451.

67. M.C. Laboissière, M.M. Young, R.G. Pinho, S. Todd, R.J. Fletterick, I. Kuntz, C.S. Craik (2002) Computer-assisted mutagenesis of ecotin to engineer its secondary binding site for urokinase inhibition. J. Biol. Chem. 277: 26623-26631.

68. P. Araujo, B. Sosa, T. Miller, S. Mayo (2012) In silico screening of computational enzyme designs. Protein Science 21: 132.

69. S.M. Lippow, T.S. Moon, S. Basu, S.H. Yoon, X. Li, B.A. Chapman, K. Robison, D. Lipovšek, K.L. Prather (2010) Engineering enzyme specificity using computational design of a defined-sequence library. Chem. Biol. 17: 1306-1315.

70. J. Marek, J. Vévodová, I.K. Smatanová, Y. Nagata, L.A. Svensson, J. Newman, M. Takagi, J. Damborský (2000) Crystal structure of the haloalkane dehalogenase from Sphingomonas paucimobilis UT26. Biochemistry 39: 14082-14086.

71. I.S. Ridder, H.J. Rozeboom, B.W. Dijkstra (1999) Haloalkane dehalogenase from Xanthobacter autotrophicus GJ10 refined at 1.15 Å resolution. Acta Crystallogr. D Biol. Crystallogr. 55: 1273-1290.

72. J. Newman, T.S. Peat, R. Richard, L. Kan, P.E. Swanson, J.A. Affholter, I.H. Holmes, J.F. Schindler, C.J. Unkefer, T.C. Terwilliger (1999) Haloalkane dehalogenases: structure of a Rhodococcus enzyme. Biochemistry 38: 16105-16114.

73. K.H. Verschueren, S.M. Franken, H.J. Rozeboom, K.H. Kalk, B.W. Dijkstra (1993) Refined X-ray structures of haloalkane dehalogenase at pH 6.2 and pH 8.2 and implications for the reaction mechanism. J. Mol. Biol. 232: 856-872.

74. F. Laturnus, C. Wiencke, H. Klöser (1996) Antarctic macroalgae – Sources of volatile halogenated organic compounds. Mar. Environ. Res. 41: 169-181.

75. G.W. Gribble (1994) The natural production of chlorinated compounds. Environ. Sci. Technol. 28: 310A-319A.

76. S.R. Armstrong, L.C. Green (2004) Chlorinated hydrocarbon solvents. Clin. Occup. Environ. Med. 4: 481-496.

77. R.D. Stewart, C.L. Hake (1979) Paint-remover hazard. JAMA 235: 398-401.

78. M.J. Dagani, H.J. Barda, T.J. Benya, D.C. Sanders (2002) "Bromine compounds" In: Ullmann's Encyclopedia of Industrial Chemistry. Wiley-VCH, Weinheim, Germany.

79. L.N. Herrera-Rodriguez, F. Kahn, K.T. Robins, H.P. Meyer (2011) Perspectives on biotechnological halogenation. Part 1: Halogenated products and enzymatic halogenation. Chem. Today. 29: n4 (Lonza)

80. E.C. Voldner, Y.F. Li (1995) Global usage of selected persistent organochlorines. Sci. Total Environ. 160: 201-210.


81. D.W. Connell, G.J. Miller, M.R. Mortimer, G.R. Shaw, S.M. Anderson (1999) Persistent lipophilic contaminants and other chemical residues in the southern hemisphere. Crit. Rev. Environ. Sci. Technol. 29: 47-82.

82. D.B. Janssen, A. Scheper, L. Dijkhuizen, B. Witholt (1985) Degradation of halogenated aliphatic compounds by Xanthobacter autotrophicus GJ10. Appl. Environ. Microbiol. 49: 673-677.

83. S. Hartmans, J.A. De Bont (1992) Aerobic vinyl chloride metabolism in Mycobacterium aurum L1. Appl. Environ. Microbiol. 58: 1220-1226.

84. R. Imai, Y. Nagata, K. Senoo, H. Wada, M. Fukuda, M. Takagi, K. Yano (1989) Dehydrochlorination of γ-hexachlorocyclohexane (γ-BHC) by γ-BHC-assimilating Pseudomonas paucimobilis. Agric. Biol. Chem. 53: 2015-2017.

85. T. Omori, M. Alexander (1978) Bacterial and spontaneous dehalogenation of organic compounds. Appl. Environ. Microbiol. 35: 512-516.

86. T. Omori, M. Alexander (1978) Bacterial dehalogenation of halogenated alkanes and fatty acids. Appl. Environ. Microbiol. 35: 867-871.

87. D.B. Janssen, F. Pries, J.R. van der Ploeg (1994) Genetics and biochemistry of dehalogenating enzymes. Annu. Rev. Microbiol. 48: 163-191.

88. S. Fetzner (1998) Bacterial dehalogenation. Appl. Microbiol. Biotechnol. 50: 633-657.

89. M.I. Arif, G. Samin, J.G.E. van Leeuwen, J. Oppentocht, D.B. Janssen (2012) Novel dehalogenase mechanism for 2,3-dichloro-1-propanol utilization in Pseudomonas putida strain MC4. Appl. Environ. Microbiol. 78: 6128-6136.

90. S. Keuning, D.B. Janssen, B. Witholt (1985) Purification and characterization of hydrolytic haloalkane dehalogenase from Xanthobacter autotrophicus GJ10. J. Bacteriol. 163: 635-639.

91. F. Pries, J. Kingma, M. Pentenga, G. van Pouderoyen, C.M. Jeronimus-Stratingh, A.P. Bruins, D.B. Janssen (1994) Site-directed mutagenesis and oxygen isotope incorporation studies of the nucleophilic aspartate of haloalkane dehalogenase. Biochemistry 33: 1242-1247.

92. F. Pries, J. Kingma, G.H. Krooshof, C.M. Jeronimus-Stratingh, A.P. Bruins, D.B. Janssen (1995) Histidine 289 is essential for hydrolysis of the alkyl-enzyme intermediate of haloalkane dehalogenase. J. Biol. Chem. 270: 10405-10411.

93. E. Chovancová, J. Kosinski, J.M. Bujnicki, J. Damborský (2007) Phylogenetic analysis of haloalkane dehalogenases. Proteins. 67: 305-316.

94. P. Holloway, K.L. Knoke, J.T. Trevors, H. Lee (1998) Alteration of the substrate range of haloalkane dehalogenase by site-directed mutagenesis. Biotechnol. Bioeng. 59: 520-523.

95. J.P. Schanstra, A. Ridder, J. Kingma, D.B. Janssen (1997) Influence of mutations of Val226 on the catalytic rate of haloalkane dehalogenase. Protein Eng. 10: 53-61.

96. M.G. Pikkemaat, A.B. Linssen, H.J. Berendsen, D.B. Janssen (2002) Molecular dynamics simulations as a tool for improving protein stability. Protein Eng. 15: 185-192.

97. G. Stucki, M. Thüer (1995) Experiences of a large-scale application of 1,2-dichloroethane degrading microorganisms for groundwater treatment. Environ. Sci. Technol. 29: 2339-2345.

98. Y. Nagata, T. Nariya, R. Ohtomo, M. Fukuda, K. Yano, M. Takagi (1993) Cloning and sequencing of a dehalogenase gene encoding an enzyme with hydrolase activity involved in the degradation of gamma-hexachlorocyclohexane in Pseudomonas paucimobilis. J. Bacteriol. 175: 6403-6410.

99. Y. Nagata, K. Miyauchi, J. Damborsky, K. Manova, A. Ansorgova, M. Takagi (1997) Purification and characterization of a haloalkane dehalogenase of a new substrate class from a gamma-hexachlorocyclohexane-degrading bacterium, Sphingomonas paucimobilis UT26. Appl. Environ. Microbiol. 63: 3707-3710.

100. A.M. Kulakova, M.J. Larkin, L.A. Kulakov (1997) The plasmid-located haloalkane dehalogenase gene from Rhodococcus rhodochrous NCIMB 13064. Microbiology. 143: 109-115.

101. T. Bosma, E. Kruizinga, E.J. de Bruin, G.J. Poelarends, D.B. Janssen (1999) Utilization of trihalogenated propanes by Agrobacterium radiobacter AD1 through heterologous expression of the haloalkane dehalogenase from Rhodococcus sp. strain M15-3. Appl. Environ. Microbiol. 65: 4575-4581.

102. T. Bosma, J. Damborský, G. Stucki, D.B. Janssen (2002) Biodegradation of 1,2,3-trichloropropane through directed evolution and heterologous expression of a haloalkane dehalogenase gene. Appl. Environ. Microbiol. 68: 3582-3587.

103. M. Pavlova, M. Klvana, Z. Prokop, R. Chaloupkova, P. Banas, M. Otyepka, R.C. Wade, M. Tsuda, Y. Nagata, J. Damborsky (2009) Redesigning dehalogenase access tunnels as a strategy for degrading an anthropogenic substrate. Nat. Chem. Biol. 5: 727-733.

104. G. Samin, D.B. Janssen (2012) Transformation and biodegradation of 1,2,3-trichloropropane (TCP). Environ. Sci. Pollut. Res. Int. 19: 3067-3078.

105. R.J. Pieters, J.H. Lutje Spelberg, R.M. Kellogg, D.B. Janssen (2001) The enantioselectivity of haloalkane dehalogenases. Tetrahedron Letters 42: 469-471.

106. Z. Prokop, Y. Sato, J. Brezovsky, T. Mozga, R. Chaloupkova, T. Koudelakova, P. Jerabek, V. Stepankova, R. Natsume, J.G.E. van Leeuwen, D.B. Janssen, J. Florian, Y. Nagata, T. Senda, J. Damborsky (2010) Enantioselectivity of haloalkane dehalogenases and its modulation by surface loop engineering. Angew. Chem. Int. Ed. 49: 6111-6115.

107. A. Westerbeek, J.G.E. van Leeuwen, W. Szymański, B.L. Feringa, D.B. Janssen (2012) Haloalkane dehalogenase catalysed desymmetrisation and tandem kinetic resolution for the preparation of chiral haloalcohols. Tetrahedron 68: 7645-7650.
