
Structure-based biologics protein drug design using BioLuminate

David A. Pearlman
Schrödinger, Inc., Cambridge, MA
Schrödinger Webinar Series, 3 December 2015

Schrödinger biologics

Software Platform (BioLuminate)

End user tool (Comp chemists, bench scientists)

Collaborations and contract research

Next generation research (Aggregation, Immunogenicity, Solubility, etc.)

Requires a platform intuitive enough for bench scientists and sophisticated enough for expert users

BioLuminate: A biologics design toolkit

•  Comfortable learning curve supports use across disciplines

–  Focus on workflows and tasks
–  Feature rich
–  GUI modeled on PyMOL

Experiment is great, but sometimes it’s not enough

Growing embrace of structure-based computational approaches in biologics discovery

Obtain Protein Structure → Theoretical Calculations → Improved Proteins

Structure-based protein biologics modeling

•  Predict changes in stability/affinity
•  Binding site/Epitope ID (protein docking)
•  In silico affinity maturation / library design
•  Antibody Humanization
•  Liability ID / removal (aggregation, reactivity, immunogenicity, solubility)
•  Stabilization (cysteine disulphides)
•  Enzyme improvement
•  ADC site ID
•  Formulation and delivery

Xray or Predicted from Sequence

•  Homology modeling
•  Specialized antibody prediction tools

Obtain Protein Structure


Improved Proteins

Structure-based antibody modeling

Xray or Predicted from Sequence

Predicting Fv from sequence

Predicting antibody CDR: The H3 loop is difficult

For antibodies, L1→L3, H1 and H2 are usually “pretty good” using homology models. But H3 is a problem:

               Framework   L1     L2    L3    H1    H2    H3
Length range   -           6-13   3-3   7-8   7-9   5-6   10-13
RMS (Å)        0.9         1.0    0.5   1.4   1.3   1.1   3.3

Based on blinded prediction of 9 antibody structures, using four different “best practice” approaches. Almagro et al. (2011) Proteins 79 3050-3066

•  H3 structure is very important
–  Antigen recognition
–  Frequent mutations during affinity maturation
•  But H3 is most problematic
–  Rules/homology often don’t work
–  Large variation in length (5-26)
•  Prime de novo approach to H3 prediction
–  Based on physics + knowledge-based terms
–  Friesner lab, Columbia; Jacobson lab, UCSF
•  Proven state-of-the-art method for loop prediction

Dealing with H3 loop prediction

•  Kai Zhu & Tyler Day: “Ab Initio Structure Prediction of the Antibody Hypervariable Loop”
–  Proteins: Struct. Funct. Bioinf. (2013) 81 1081-1089.
•  Prediction in native crystal structure
–  Remove H3 from crystal coordinates, build H3 de novo
•  Benchmark set of 53 structures
–  10 of length 4-6 (short)
–  29 of length 7-11 (medium)
–  14 of length 12-22 (long)
–  Sivasubramanian et al. (2009) Proteins 74 497
•  Calculations run in BioLuminate program

Prime Applied to Antibody H3 Prediction

Antibody H3 Loop Predictions using Prime

H3 Loop Length   4-6   7-9   10-11   12-14   >17
Prime*           0.2   0.5   0.5     0.9     3.7
Rosetta±         1.7   1.5   1.7     2.9     4.0

*Zhu & Day Proteins: Struct. Funct. Bioinf. (2013) 81 1081
±Sivasubramanian, Sircar, Chaudhury & Gray (2009) Proteins 74 497

Average RMS deviations from x-ray for H3 (Å)
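The RMS deviations quoted in these tables compare predicted loop coordinates with the crystal structure. A minimal sketch of that metric follows. It assumes both coordinate sets are already in the same frame (for example, superposed on the framework) and list matching atoms in the same order. The function name and the toy coordinates are illustrative and are not taken from the cited papers.

```python
import math

def rmsd(predicted, reference):
    """Root-mean-square deviation, in the units of the input (e.g. Å).

    Both arguments are sequences of (x, y, z) tuples for matching atoms.
    No superposition is performed; that is an assumption of this sketch.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))

# Hypothetical three-atom example: every atom displaced by 1 Å along x.
xray = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(round(rmsd(model, xray), 3))  # 1.0
```

Which atoms are included (backbone only or all heavy atoms) and how the structures are superposed both change the reported value, so published numbers are only comparable when those choices match.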

Crystal Symmetry Improves Predictions

H3 Loop Length    4-6   7-9   10-11   12-14   >17
Prime*            0.2   0.5   0.5     0.9     3.7
Prime* with sym   0.2   0.4   0.5     0.8     2.0

*Zhu & Day Proteins: Struct. Funct. Bioinf. (2013) 81 1081
±Sivasubramanian, Sircar, Chaudhury & Gray (2009) Proteins 74 497

Average RMS deviations from x-ray for H3 (Å)

The 2nd Blinded Antibody Modeling Assessment (AMA-II) 2013

Volume 82, Issue 8, August 2014

•  7 Participants:
–  Schrödinger; CCG; Accelrys; Rosetta (Jeff Gray, Johns Hopkins); Macromoltek; Astellas Pharma + Osaka U; PIGS server
•  Predict 10 unpublished structures (4 human Ab, 6 mouse Ab)
•  Two stages:
–  Stage 1: Predict full Fv from sequence
–  Stage 2: Predict H3 given xray coordinates of remainder of structure

Method                Fv RMSD      Framework RMSD   All loops RMSD (excl. H3)   H3 RMSD
Schrödinger           1.1 ± 0.2Å   0.8 ± 0.2Å       1.1 ± 0.4Å                  2.7 ± 0.8Å
Accelrys              1.1 ± 0.3Å   0.9 ± 0.3Å       1.1 ± 0.5Å                  3.0 ± 1.1Å
CCG                   1.1 ± 0.2Å   0.9 ± 0.3Å       1.0 ± 0.3Å                  3.3 ± 0.9Å
Rosetta (Jeff Gray)   1.1 ± 0.2Å   0.8 ± 0.2Å       1.1 ± 0.4Å                  2.6 ± 0.9Å
Macromoltek           1.4 ± 0.2Å   1.2 ± 0.2Å       1.2 ± 0.3Å                  3.0 ± 1.0Å
Astellas + Osaka U    1.1 ± 0.2Å   0.8 ± 0.2Å       1.0 ± 0.2Å                  2.3 ± 0.6Å
PIGS server           1.2 ± 0.1Å   0.9 ± 0.2Å       0.9 ± 0.4Å                  3.1 ± 1.1Å
Average               1.1 ± 0.2Å   0.9 ± 0.2Å       1.1 ± 0.4Å                  2.8 ± 0.9Å

AMA-II: Overall results for Round 1: Full Fv from sequence

•  All methods are generally producing decent models
•  H3 is the recurrent problem

Method                H3 RMSD (Round 1)   H3 RMSD (Round 2)
Schrödinger           2.7 ± 0.8Å          1.4 ± 1.1Å
Accelrys              3.0 ± 1.1Å          2.3 ± 1.0Å
CCG                   3.3 ± 0.9Å          2.5 ± 1.6Å
Rosetta (Jeff Gray)   2.6 ± 0.9Å          2.1 ± 1.1Å
Macromoltek           3.0 ± 1.0Å          3.3 ± 1.2Å
Astellas + Osaka U    2.3 ± 0.6Å          1.4 ± 1.9Å
PIGS server           3.1 ± 1.1Å          (no Round 2 value)
Average               2.8 ± 0.9Å          2.2 ± 0.9Å

AMA-II : Overall results for Round 2: Predict H3, given xray structure of remainder of Fv

•  Impressive automated prediction using Prime

Blinded H3 predictions: Prime versus other methods

Model   H3 Length   Prime Prediction   Prime rank versus other methods   RMSD of best method (if not Prime)
AM-2    11          3.2                2                                 3.0
AM-3    8           0.5                1                                 N/A
AM-4    8           1.1                4                                 1.0
AM-5    8           3.2                6                                 0.9
AM-6    14          3.1                1                                 N/A
AM-7    8           0.4                1                                 N/A
AM-8    11          1.8                1                                 N/A
AM-9    10          0.6                1                                 N/A
AM-10   11          1.0                1                                 N/A
AM-11   10          0.5                1                                 N/A

RMSD distances in Å

Colour key in the original slide: Best in competition; Not best, but very close; Miss

Blinded H3 predictions: Loop lookup versus Prime

Model     H3 Length   Best in Database   Homology Prediction   Prime Prediction
AM-2      11          1.7                4.3                   3.2
AM-3      8           0.9                1.5                   0.5
AM-4      8           0.7                2.2                   1.1
AM-5      8           1.0                2.4                   3.2
AM-6      14          2.6                3.1                   3.1
AM-7      8           1.5                2.3                   0.4
AM-8      11          1.7                3.3                   1.8
AM-9      10          0.7                1.9                   0.6
AM-10     11          1.5                2.8                   1.0
AM-11     10          0.4                2.6                   0.5
Average               1.3                2.7                   1.4

RMSD distances in Å

Highlighted in the original slide: Prime prediction better than ANY loop model in database (from the values above: AM-3, AM-7, AM-9, AM-10)

BioLuminate provides tools for antibody humanization

Automated framework replacement

Homology + mutation calculations to optimize loop region

Structure-based protein biologics modeling


•  Liability ID & removal
–  Aggregation / viscosity
–  Immunogenicity
–  Solubility
–  Post-translational modifications (glycosylation, etc.)
–  Reactive hot spots
–  Thermal stability
–  They can’t “fix it in formulation”
•  → Engineer out liabilities without destroying affinity or stability
•  Need to calculate how affinity & stability change with sequence

You might be a biologics designer if…

…the following keep you up at night…

ID liabilities

Evaluate possible mutations

Select and create mutants

It is critical to engineer out protein liabilities early

•  Aggregation/Viscosity propensity
•  Hotspots: Deamidation, oxidation, glycosylation, proteolysis
•  Immunogenicity
•  Solubility
•  Thermal stability
•  IP avoidance

•  Calculate energy changes
–  Affinity
–  Stability
•  Mutations:
–  Remove liabilities
–  Retain affinity
–  Retain stability

Liability ID using BioLuminate

Aggregation hot spots

Reactive residue ID

Immunogenicity prediction

Titration curve / Isoelectric point
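As a rough illustration of what reactive residue ID can mean at the sequence level, the sketch below scans a sequence for a few widely used liability motifs. It is not BioLuminate's method, which is structure-based. The motif set, the function name and the example sequence are assumptions chosen for illustration only.

```python
import re

# Sequence-level heuristics, stated as assumptions of this sketch:
# N-glycosylation sequon N-X-S/T (X is not Pro), Asn deamidation at NG/NS,
# Asp isomerization at DG, and any Met as an oxidation candidate.
MOTIFS = {
    "N-glycosylation": r"N[^P][ST]",
    "deamidation": r"N[GS]",
    "isomerization": r"DG",
    "oxidation": r"M",
}

def find_liabilities(sequence):
    """Return sorted (1-based position, motif name, matched text) tuples."""
    hits = []
    for name, pattern in MOTIFS.items():
        # A lookahead group reports overlapping matches as well.
        for match in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((match.start() + 1, name, match.group(1)))
    return sorted(hits)

# Hypothetical sequence fragment, not from the webinar.
for hit in find_liabilities("QVNGSLMDGK"):
    print(hit)
```

A sequence scan like this over-predicts, because it cannot tell whether a residue is solvent exposed. That is one reason a structure-based tool is used for this step.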

•  How does this residue change affect:
–  Stability
–  Affinity (to other molecules)

Removing Liabilities: Residue mutation studies

ABC → AXC ?

•  For liability reduction:
–  Mutate out liability
–  Maintain stability
–  Maintain affinity (if applicable)
•  Empirical scoring methods
–  Approximate
–  Fast
–  Can only predict parameterized moieties
•  MM-GBSA
–  Approximate (implicit solvent)
–  Fast (< 1 minute per calculation)
•  FEP (Free Energy Perturbation)
–  Precise (mean unsigned error ~1 kcal/mol)
–  Computationally more expensive, explicit solvent
–  ~1-2 calculations per GPU processor/day
•  Requires a huge amount of conformational sampling

Predicting free energy changes: Affinity/Stability of Residue A to Residue B

ABC → AXC ?

Physics-based methods (Can predict non-standard AA)


Free energy calculations (ΔG)

Question: What is ΔG between two structurally similar molecules?

Method: ΔG from statistical mechanics, over a thermodynamic ensemble of conformations. Ensemble generated using Molecular Dynamics.

Changes in affinity/binding and stability

FEP: Technologies Facilitate a Robust Solution

Improved force field: OPLS3
Enhanced sampling: REST
Hardware acceleration: GPU
Automated setup: FEP Mapper
Error estimates: Cycle Closure

Faster computers and GPU blast off

CPU speeds during FEP era increased by ~8000x

GPU: 50-100x through parallelism

Schrödinger’s GPU Cluster

400 GPUs: Two 200 GPU racks

Roughly the same processing power as the total of every home PC in the USA in 1987

Year   Relative amount sampling   How long would it take?     Sufficient for accuracy?
1987   1x                         1 month on supercomputer    No
2015   3000x                      1 day on GPU                Yes

FEP for affinity validated for small molecule ligands

[Figure: scatter plot of ΔG FEP (kcal/mol) versus ΔG Expt. (kcal/mol), both axes from -15 to -4, for BACE, CDK2, JNK1 and MCL1]

[Figure: histogram of |ΔΔG FEP – ΔΔG Expt.| (kcal/mol) against percentage]
< 0.6: 46.2%; 0.6-1.2: 24.8%; 1.2-1.8: 15.4%; 1.8-2.4: 7.4%; > 2.4: 6.2%

•  Over 500 perturbations tested for 17 systems w/ identical automated protocol
–  RMSE ≈ 1.2 kcal/mol

Wang et al. (2015) JACS 137 2695-2703

What does RMSE ≈ 1.2 kcal/mol mean for predictions?

•  The probability the sign of ∆∆G is correct depends on the size of |∆∆G|

Derived from Shirts, M. R., Mobley, D. L., & Brown, S. P. (2010) Free-energy calculations in structure-based drug design.

Predicted |ΔΔG| (kcal/mol)   Probability predicted sign agrees with experiment
0.6                          69.1%
1.2                          84.1%
1.4                          87.8%
1.8                          93.3%
2.4                          97.7%
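The tabulated probabilities can be reproduced with a simple error model. The sketch below assumes the prediction error is Gaussian with zero mean and a standard deviation equal to the RMSE of 1.2 kcal/mol, so the chance that the predicted sign is right is Φ(|ΔΔG| / RMSE), where Φ is the standard normal cumulative distribution. The Gaussian assumption and the function name are mine; the slide states only that the values are derived from Shirts et al.

```python
import math

def sign_probability(ddg_abs, rmse=1.2):
    """P(predicted sign of ΔΔG is correct) under a Gaussian error model.

    Assumes zero-mean normal error with standard deviation equal to the
    RMSE, so the probability is Phi(|ΔΔG| / RMSE).
    """
    z = ddg_abs / rmse
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for ddg in (0.6, 1.2, 1.4, 1.8, 2.4):
    print(f"{ddg:.1f} kcal/mol -> {100 * sign_probability(ddg):.1f}%")
```

Under this model a predicted change smaller than about half the RMSE is only weakly informative about the direction of the effect, while predictions above roughly twice the RMSE are almost always right in sign.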

•  7 projects
•  Protein:small molecule ligand binding
•  158 prospective FEP predictions
•  Experimentally tested AFTER prediction
•  111 predictions within 1 log unit
•  Only 9 predictions off by > 2 log units

FEP validated for small molecule ligands: Performance in Prospective Drug Discovery Collaborations
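To relate these counts to the kcal/mol errors quoted elsewhere in the talk, a log unit can be converted to free energy with RT ln(10). The sketch below assumes a log unit means a factor of 10 in the binding constant and assumes a temperature of 298.15 K, which the slide does not state. The function name is illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def log_units_to_kcal(log_units, temperature_k=298.15):
    """Free-energy equivalent of a change in log10(K): RT ln(10) per unit."""
    return log_units * R_KCAL * temperature_k * math.log(10.0)

total, within_1, beyond_2 = 158, 111, 9  # counts from the slide
print(f"1 log unit ~ {log_units_to_kcal(1):.2f} kcal/mol")
print(f"within 1 log unit: {100 * within_1 / total:.1f}%")
print(f"off by > 2 log units: {100 * beyond_2 / total:.1f}%")
```

With these assumptions, 1 log unit is about 1.36 kcal/mol, roughly 70% of the prospective predictions fall within it, and about 6% miss by more than 2 log units.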

Protein affinity changes calculated using FEP

How do we calculate free energy changes for affinity?

ΔΔG_binding = ΔG_1 – ΔG_2 = ΔG_A – ΔG_B

[Diagram: thermodynamic cycle with vertical legs 1 and 2 and horizontal legs A and B]

§  Experimental: Measure vertical processes 1 and 2.

§  Theoretical: Calculate horizontal processes A and B
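The identity ΔG_1 – ΔG_2 = ΔG_A – ΔG_B holds because free energy is a state function, so the changes around the closed cycle sum to zero. The sketch below shows only that bookkeeping. The function names and the numerical leg values are hypothetical and chosen to illustrate the algebra, not taken from any calculation in the talk.

```python
def ddg_from_cycle(dg_a, dg_b):
    """ΔΔG = ΔG_A - ΔG_B, the difference of the two horizontal (calculated) legs."""
    return dg_a - dg_b

def cycle_residual(dg_1, dg_2, dg_a, dg_b):
    """Deviation of the four legs from ΔG_1 - ΔG_2 = ΔG_A - ΔG_B."""
    return (dg_1 - dg_2) - (dg_a - dg_b)

# Hypothetical leg values in kcal/mol.
dg_1, dg_2 = -10.0, -8.5   # vertical legs (measured experimentally)
dg_a, dg_b = 3.0, 4.5      # horizontal legs (calculated)
print(ddg_from_cycle(dg_a, dg_b))              # -1.5
print(cycle_residual(dg_1, dg_2, dg_a, dg_b))  # 0.0
```

A residual of zero means the calculated horizontal legs agree with the measured vertical legs. In practice the residual is the prediction error being reported when FEP is compared with experiment.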

FEP delta affinity results across all protein systems

223 mutations: R² = 0.76, Mean Unsigned Err = 1.7 kcal/mol, Slope = 0.98

158 mutations: R² = 0.52, Mean Unsigned Err = 2.8 kcal/mol, Slope = 1.08

How do we calculate free energy changes for stability?

ΔΔG_stability = ΔG_1 – ΔG_2 = ΔG_A – ΔG_B

[Diagram: thermodynamic cycle with vertical legs 1 and 2 and horizontal legs A and B]

§  Experimental: Measure vertical processes 1 and 2.

§  Theoretical: Calculate horizontal processes A and B

Protein stability predictions using FEP

Applied to systems from Fold-X Test Set

System                       PDB ID   # Mutations   R²-value       MUE         ΔΔG Sign correct
T4-Lysozyme                  2LZM     66            0.67           1.2         92%
Human Lysozyme               1REX     45            0.66           1.3         80%
Peptostrept. Magn. Prot. L   1HZ6     44            0.59           1.1         89%
B1 IG binding protein G      1PGA     24            0.37           1.1         79%
Fibronectin II domain        1TEN     32            0.33* / 0.68   1.6 / 1.3   88% / 93%
FK506 BP                     1FKB     27            0.4            1.6         85%
All                                   238           0.57           1.2         87%

Errors in kcal/mol; *: Result strongly affected by terminal outliers

Aggregate FEP stability predictions

Slope = 1.3 Offset = -0.2 MUE = 1.2 kcal/mol
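The summary statistics used on these slides (R², MUE and the fraction of correct ΔΔG signs) can be computed from paired predicted and experimental values as sketched below. It is an assumption here that R² is the squared Pearson correlation coefficient; the slides do not say. The ΔΔG values are made up for illustration.

```python
def mue(pred, expt):
    """Mean unsigned error."""
    return sum(abs(p - e) for p, e in zip(pred, expt)) / len(pred)

def r_squared(pred, expt):
    """Square of the Pearson correlation coefficient."""
    n = len(pred)
    mean_p, mean_e = sum(pred) / n, sum(expt) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(pred, expt))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_e = sum((e - mean_e) ** 2 for e in expt)
    return cov * cov / (var_p * var_e)

def sign_agreement(pred, expt):
    """Fraction of cases where predicted and experimental ΔΔG share a sign."""
    same = sum(1 for p, e in zip(pred, expt) if (p > 0) == (e > 0))
    return same / len(pred)

# Hypothetical ΔΔG values in kcal/mol, not data from the webinar.
expt = [1.0, -0.5, 2.0, 0.5]
pred = [1.5, -1.0, 1.0, -0.5]
print(mue(pred, expt))             # 0.75
print(sign_agreement(pred, expt))  # 0.75
print(round(r_squared(pred, expt), 2))
```

The three numbers answer different questions: MUE measures absolute accuracy, R² measures ranking quality, and sign agreement measures whether a mutation is correctly classed as stabilizing or destabilizing.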

FEP performance compared to other methods

Software      R²-value achieved*   Stabilizing/destabilizing % correct
CC/PBSA       0.31                 79%
EGAD          0.35                 71%
FoldX         0.25                 70%
Hunter        0.20                 69%
I-Mutant2.0   0.29                 78%
Rosetta       0.07                 73%
FEP           0.57                 87%

•  FEP: Appreciably better R²
•  FEP: Better correct stabilizing/destabilizing classification
•  (Non FEP results from Potapov, 2009, Prot. Eng. Des. Sel., 22, 553)

Building a better protein…

…by introducing stabilizing protein disulphide bonds

Introducing mutations to create stable disulphide bonds

ID residues close enough to be candidates

Apply weighted scoring function
•  A) Function inferred from geometries in 11399 PDB structures +
•  B) Implicit solvent energy function (MM-GBSA)

Create triaged list of mutations likely to create stable disulphides
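The first step of this workflow, finding residue pairs close enough to be disulphide candidates, can be sketched as a distance filter on Cβ atoms. The 4.5 Å cutoff, the sequence-separation filter, the function name and the coordinates are all assumptions made for illustration and are not BioLuminate parameters. The geometric and MM-GBSA scoring steps are not reproduced here.

```python
import math
from itertools import combinations

def disulphide_candidates(cb_coords, max_cb_distance=4.5, min_separation=2):
    """Return (res_i, res_j, distance) for Cβ pairs within max_cb_distance (Å).

    cb_coords maps residue number -> (x, y, z) of its Cβ atom.
    Cutoff and separation filter are assumptions of this sketch.
    """
    pairs = []
    for (i, ci), (j, cj) in combinations(sorted(cb_coords.items()), 2):
        if j - i < min_separation:
            continue  # skip immediate sequence neighbours
        d = math.dist(ci, cj)
        if d <= max_cb_distance:
            pairs.append((i, j, round(d, 2)))
    return sorted(pairs, key=lambda t: t[2])  # closest pairs first

# Hypothetical Cβ coordinates (Å) for four residues.
coords = {
    10: (0.0, 0.0, 0.0),
    11: (3.8, 0.0, 0.0),
    45: (0.0, 4.0, 0.0),
    80: (9.0, 9.0, 9.0),
}
print(disulphide_candidates(coords))  # [(10, 45, 4.0)]
```

A distance filter only narrows the search. Ranking the surviving pairs still requires a scoring step such as the weighted function described on the slide.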

Application of cys scanning to find stable construct

LPA + ONO agonist. Not stable enough to be crystallized

ID 5 sets of mutation candidates to introduce new disulphide using BioLuminate

(Asp204, Val282) -> (Cys204…Cys282) works

Substantially better thermal stability

Crystals suitable for a structure

Other BioLuminate features…

• I Tubert-Brohman, W Sherman, M Repasky & T Beuming (2013) J. Chem. Inf. Model. 53 1689-1699

Automated protein-peptide docking

Design peptide linkers of any composition between protein domains

Protein crosslink design

Non-standard amino acids supported in residue scanning/affinity maturation

Non-standard amino acids

Automatically evaluates structure quality on a large number of criteria

Protein quality report

Other BioLuminate features…

Quickly ID, characterize and visualize residues at protein interface

Interactive Protein Interaction Table

Identify structural problems and visualize them in the workspace

Protein quality visualizer

Homology modeling, chimeric models, advanced loop modeling

Homology Modeling

Advanced sequence viewer (work in progress)

Sequence viewer

•  BioLuminate offers an extensive set of foundation tools that facilitate
–  A suitably low learning curve for new and non-expert users
–  Collaboration
–  Advanced exploration of new approaches to fundamental problems in biologics
•  BioLuminate features first-in-class approaches to
–  Antibody modeling
–  Protein engineering, including cysteine scanning and FEP

Future work…

•  Kai Zhu
•  Tyler Day
•  Dora Warshaviak
•  Colleen Murrett
•  Richard Friesner

Acknowledgements

•  Thomas Steinbrecher
•  Fiona McRobb
•  Jeffrey Sanders

DRUG DISCOVERY COLLABORATIONS

Antibody predictions

Protein FEP Calculations

•  Woody Sherman
•  Robert Abel

Immunogenicity

•  Tyler Day

Aggregation patch analyzer

•  Johannes Maier
