Topologically diverse shapes accessible by modular design of arylopeptoid macrocycles.

Thomas Hjelmgaard,§† Lionel Nauton,‡ Francesco De Riccardis,|| Laurent Jouffret,‡ Sophie Faure*‡

§ Department of Chemistry, Section for Chemical Biology and Nanobioscience, Faculty of Science, University of Copenhagen, Thorvaldsensvej 40, 1871 Frederiksberg C, Denmark

|| Department of Chemistry and Biology, University of Salerno, Via Giovanni Paolo II n. 132, I-84084 Fisciano (SA), Italy

‡ Clermont Université, Université Blaise Pascal, Institut de Chimie de Clermont-Ferrand, BP 10448, 63000 Clermont-Ferrand, France and CNRS, UMR 6296, ICCF, F-63178 Aubière Cedex, France

* [email protected]

Contents

S2-S12: Experimental section

S13: Macrocyclization Optimization study

S14-S17: HPLC and LC-MS profiles

S18-S40: NMR spectra

S41-S45: NMR study

S46: X-Ray crystallography

S47: Molecular modelling

S48-S50: Complexation study

S51: References

Experimental section

General experimental methods

CH2Cl2 used as solvent in reactions was dried over 4Å molecular sieves. All other chemicals and

solvents obtained from commercial sources (Acros Organics, Alfa Aesar, Fluka and Sigma-Aldrich)

were used as received. For synthesis of linear precursors 1-12 see below. Melting points were

determined on a Mettler Toledo MP70 melting point system (linear precursors 1-12) or on a Stuart

Scientific melting point apparatus SMP3 and are referenced to the melting points of benzophenone

and benzoic acid. IR spectra were recorded on a Shimadzu FTIR-8400S spectrometer equipped with a

Pike Technologies MIRacle™ ATR and ν values are expressed in cm-1. NMR spectra were recorded on a

Bruker Avance 300 MHz spectrometer or on a 400 MHz Bruker AC 400 spectrometer. Chemical

shifts are referenced to the residual solvent peak and J values are given in Hz. The following

multiplicity abbreviations are used: (s) singlet, (d) doublet, (t) triplet, (q) quartet, (m) multiplet, and

(br) broad. Where applicable, assignments were based on COSY, HMBC, HSQC and J-mod

experiments. TLC was performed on Merck TLC aluminium sheets, silica gel 60, F254. Progress of

reactions was, when applicable, followed by HPLC, NMR and/or TLC. Visualization of spots was

effected with UV light and/or ninhydrin in EtOH/AcOH. Flash chromatography was performed with

silica gel 60, 35-75 or 40-63 µm. Unless otherwise stated, flash chromatography was performed in the

eluent system for which the Rf values are given. Numbers in parentheses in LC-MS spectra are relative

abundances; only major peaks (>20% relative abundance) are listed. HRMS of linear precursors were

recorded on a Micromass Q-Tof Micro (3000V) apparatus or a Thermo Scientific Q Exactive

Quadrupole-Orbitrap Mass Spectrometer. Analytical HPLC of the linear precursors 1-12 was

performed on a Waters 2525 binary gradient module equipped with a Waters 2767 sample manager, a

column fluidic organiser, a Gemini 110 column (C18, 5 µm, 110 Å, 4.6×100 mm) with flow = 1.0

mL/min, and a UV fraction manager coupled with a Waters 2996 PDA detector; detection range =

210-400 nm; solvent A = MeOH/water/TFA 5:95:0.1 and solvent B = MeOH/water/TFA 95:5:0.1;

Gradient (10 min runs): 50% B (0-2 min), 50→100% B (2-7 min), 100→50% B (7-9 min), 50% B (9-

10 min). Analytical LC-MS of macrocycles 13-23 was performed on a Dionex Ultimate 3000 system

with a Gemini-NX C18 (3 µm, 50×4.6 mm) column, thermostated to 42 °C with a column oven,

connected to an ESI-MS (MSQ Plus Mass Spectrometer, Dionex); flow = 1.0 mL/min; detection at

215 nm; solvent A = water (0.1% formic acid) and solvent B = MeCN (0.1% formic acid); gradient

(12 min runs): 5% B (0-0.5 min), 5→100% B (0.5-9 min), 100% B (9-10.4 min), 100→5% B (10.4-

10.5 min), 5% B (10.5-12 min).
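Both gradient programs above are piecewise-linear in time. The following is a minimal sketch, not part of the original methods, that interpolates the percentage of solvent B at a given run time. The names `HPLC_GRADIENT`, `LCMS_GRADIENT` and `percent_b` are illustrative assumptions; the breakpoints are copied from the text.

```python
# Breakpoints (time in min, % solvent B) copied from the gradient descriptions above.
HPLC_GRADIENT = [(0.0, 50.0), (2.0, 50.0), (7.0, 100.0), (9.0, 50.0), (10.0, 50.0)]
LCMS_GRADIENT = [(0.0, 5.0), (0.5, 5.0), (9.0, 100.0), (10.4, 100.0), (10.5, 5.0), (12.0, 5.0)]


def percent_b(gradient, t):
    """Return % solvent B at time t (min) by linear interpolation between breakpoints."""
    if t < gradient[0][0] or t > gradient[-1][0]:
        raise ValueError("time lies outside the run")
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t0 <= t <= t1:
            if t1 == t0:
                return b1
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)
    raise ValueError("gradient table is not sorted by time")


if __name__ == "__main__":
    for minute in (1.0, 4.5, 8.0):
        print(minute, percent_b(HPLC_GRADIENT, minute))
```

Such a helper can be used, for instance, to estimate the eluent composition at which a given retention time falls.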

Synthesis of linear precursors

The linear arylopeptoids were synthesized on a 2-chlorotrityl chloride copoly(styrene-1% DVB) resin

(100-200 mesh, 1.50 mmol/g, Merck) using “Method A” described in our previous work,1 using 3- or

4-(chloromethyl)benzoic acid in the attachment step, 2-, 3- or 4-(chloromethyl)benzoyl chloride in the

coupling steps, and 2-methoxyethylamine in the substitution steps. The only modifications to the

procedures were: (1) The arylopeptoids were synthesized at twice the scale described. (2) The

substitution steps were performed using 20 equiv / 4.0 M solutions of 2-methoxyethylamine rather

than 20 equiv / 2.0 M solutions. (3) The capping step was performed using di-tert-butyl dicarbonate

(330 mg, 1.52 mmol) in place of benzoyl chloride (1.52 mmol) in order to cap with a Boc group. (4)

After cleavage with HFIP and evaporation, the crude product was taken up in CH2Cl2 (15 mL) and washed

with 1M HCl (10 mL). The aqueous layer was extracted with CH2Cl2 (15 mL) and the combined

organic layers were concentrated and dried in vacuo, yielding the crude linear arylopeptoid which was

used in the ensuing steps without further purification.

General procedure for macrocyclization of arylopeptoids

To a solution of the N-Boc-protected linear arylopeptoid (0.045 mmol) in CH2Cl2 (1.5 mL) at 0

°C was added TFA (1.5 mL) and the resulting mixture was stirred for 3 h at 0 °C. The solvents were

evaporated under reduced pressure and the residue was co-evaporated several times with CH2Cl2 and

dried in vacuo, yielding the crude terminally deprotected linear arylopeptoid. To a solution of the crude

linear arylopeptoid in CH2Cl2 (9.0 mL) at 0 °C under N2 was added enough DIPEA (approx. 0.039

mL, 0.224 mmol) to turn the mixture slightly basic. COMU (23.2 mg, 0.054 mmol) was added and the

resulting mixture was stirred overnight while being allowed to warm slowly to rt. The solvents were

evaporated under reduced pressure and the residue was taken up in EtOAc (15 mL). The organic layer

was washed with satd. aq. NaHCO3 (2×7.5 mL), satd. aq. NH4Cl (2×7.5 mL) and water (7.5 mL). The

organic layer was concentrated and dried in vacuo. Flash chromatography of the residue yielded the

desired product.
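The quantities in the general procedure imply the reagent equivalents and the cyclization concentration. A short sketch of that arithmetic follows; the function names are illustrative assumptions, and only the numbers come from the procedure above.

```python
# Quantities taken from the general procedure above.
substrate_mmol = 0.045          # linear arylopeptoid
dipea_mmol = 0.224              # approximate; added until the mixture is slightly basic
comu_mmol = 0.054
cyclization_volume_ml = 9.0     # CH2Cl2 used in the cyclization step


def equivalents(reagent_mmol, limiting_mmol):
    """Molar equivalents of a reagent relative to the limiting substrate."""
    return reagent_mmol / limiting_mmol


def concentration_mm(amount_mmol, volume_ml):
    """Concentration in mM (mmol per litre)."""
    return amount_mmol / (volume_ml / 1000.0)


def volume_for_concentration_ml(amount_mmol, target_mm):
    """Solvent volume (mL) needed to reach a target concentration in mM."""
    return amount_mmol / target_mm * 1000.0


if __name__ == "__main__":
    print("COMU equiv:", round(equivalents(comu_mmol, substrate_mmol), 2))
    print("DIPEA equiv:", round(equivalents(dipea_mmol, substrate_mmol), 2))
    print("conc. (mM):", round(concentration_mm(substrate_mmol, cyclization_volume_ml), 2))
    print("volume for 2.5 mM (mL):", round(volume_for_concentration_ml(substrate_mmol, 2.5), 1))
```

Under these numbers the procedure corresponds to roughly 1.2 equiv of COMU and 5 equiv of DIPEA at 5.0 mM; running the same scale at 2.5 mM would require about twice the solvent volume.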

Linear arylopeptoid pppp-1: Colorless foam (264 mg, 59%, 95% purity). mp =

53-56 °C. 1H NMR (300 MHz, CDCl3): δ = 8.02 (d, J = 8.1 Hz, 2H, o-C6H4COO),

7.55-7.04 (m, 14H, Ar-H), 4.96-4.56 (m, 6H, 3×CONCH2Ar), 4.55-4.42 (br s, 2H,

BocNCH2Ar), 3.76-3.06 (m, 28H, 4×CH2CH2OCH3, 4×CH2CH2OCH3 and 4×CH2CH2OCH3), 1.54-

1.28 (2×br s, 9H, Boc) ppm. 13C NMR (75 MHz, CDCl3): δ = 172.1 (3Cq, 3×CON), 169.4, 169.2 (Cq,

COO), 155.8, 155.5 (Cq, Boc), 143.1, 142.6, 140.4, 140.1, 139.1, 138.6, 135.1, 134.6, 129.1 (8Cq),

130.3, 127.9, 127.5, 127.3, 126.9, 126.3 (16CH), 80.0 (Cq, Boc), 71.1, 70.7, 70.3 (4CH2,

4×CH2CH2OCH3), 58.9, 58.9, 58.7, 58.6 (4CH3, 4×CH2CH2OCH3), 53.7, 48.1, 46.4, 46.2, 44.7

(7CH2, 4×CH2CH2OCH3, 3×CONCH2Ar), 51.4, 50.5 (CH2, BocNCH2Ar), 28.3 (3CH3, Boc) ppm.

HRMS (TOF MS ES+) calcd for C49H63N4O11 [M + H]+ m/z 883.4488, found 883.4485.
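The calculated HRMS values quoted in this section can be reproduced from monoisotopic masses. The sketch below is an illustration only: the helper names are assumptions, and the isotope masses are standard literature values rounded to the digits shown. Note that the formulae printed with the HRMS data are those of the ions, so the neutral molecule of pppp-1 is C49H62N4O11.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes, and the electron mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
ELECTRON = 0.00054858


def parse_formula(formula):
    """Parse a simple formula such as 'C49H62N4O11' into {element: count}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def ion_mz(neutral_formula, protons):
    """m/z of [M + nH]n+ (protons > 0) or [M - nH]n- (protons < 0)."""
    counts = parse_formula(neutral_formula)
    neutral = sum(MONOISOTOPIC[el] * n for el, n in counts.items())
    proton = MONOISOTOPIC["H"] - ELECTRON
    return (neutral + protons * proton) / abs(protons)


if __name__ == "__main__":
    print(round(ion_mz("C49H62N4O11", 1), 4))    # tetramer, [M + H]+
    print(round(ion_mz("C71H88N6O15", -1), 4))   # hexamer, [M - H]-
    print(round(ion_mz("C71H88N6O15", 2), 4))    # hexamer, [M + 2H]2+
```

The same function covers the singly protonated, deprotonated and doubly protonated ions reported for the linear precursors.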

Linear arylopeptoid pppppp-2: Colorless foam (358 mg, 56%, 96% purity). mp

= 61-64 °C. 1H NMR (300 MHz, CDCl3): δ = 8.01 (d, J = 8.1 Hz, 2H, o-

C6H4COO), 7.58-7.02 (m, 22H, Ar-H), 6.84-6.40 (br s, 1H, COOH), 4.98-4.54

(m, 10H, 5×CONCH2Ar), 4.54-4.41 (br s, 2H, BocNCH2Ar), 3.72-3.04 (m, 42H, 6×CH2CH2OCH3,

6×CH2CH2OCH3 and 6×CH2CH2OCH3), 1.54-1.26 (2×br s, 9H, Boc) ppm. 13C NMR (75 MHz,

CDCl3): δ = 172.1 (5Cq, 5×CON), 168.9, 168.7 (Cq, COO), 155.7, 155.5 (Cq, Boc), 143.0, 142.5,

142.4, 140.3, 139.0, 138.7, 135.2, 134.6, 129.2 (12Cq), 131.4, 130.3, 129.0, 127.8, 127.5, 127.0,

126.9, 126.3 (24CH), 80.0 (Cq, Boc), 71.1, 70.7, 70.3, 70.2 (6CH2, 6×CH2CH2OCH3), 58.7, 58.6

(6CH3, 6×CH2CH2OCH3), 54.3, 53.7, 48.3, 48.1, 46.4, 46.2, 44.6 (11CH2, 6×CH2CH2OCH3,

5×CONCH2Ar), 51.4, 50.6 (CH2, BocNCH2Ar), 28.3 (3CH3, Boc) ppm. HRMS (TOF MS ES-) calcd

for C71H87N6O15 [M - H]- m/z 1263.6235, found 1263.6232.

Linear arylopeptoid mmmm-3: Colorless foam (299 mg, 67%, 95% purity). mp

= 46-49 °C. 1H NMR (300 MHz, CDCl3): δ = 8.06-7.74 (m, 2H, o/o’-C6H4COO),

7.50-7.10 (m, 14H, Ar-H), 4.93-4.54 (m, 6H, 3×CONCH2Ar), 4.54-4.40 (br s, 2H,

BocNCH2Ar), 3.72-3.05 (m, 28H, 4×CH2CH2OCH3, 4×CH2CH2OCH3 and

4×CH2CH2OCH3), 1.52-1.32 (2×br s, 9H, Boc) ppm. 13C NMR (75 MHz, CDCl3): δ = 172.1 (3Cq,

3×CON), 169.0, 168.7 (Cq, COO), 155.8, 155.5 (Cq, Boc), 139.2, 138.9, 138.8, 137.7, 137.6, 137.5,

136.6, 136.5, 136.1, 130.6, 130.3 (8Cq), 132.7, 132.5, 131.3, 129.4, 129.0, 128.8, 128.5, 128.2, 127.8,

126.3, 125.9, 125.5, 125.1 (16CH), 80.0 (Cq, Boc), 71.1, 70.7, 70.3 (4CH2, 4×CH2CH2OCH3), 58.7,

58.6 (4CH3, 4×CH2CH2OCH3), 53.7, 48.1, 48.0, 46.3, 46.2, 44.8, 44.5 (7CH2, 4×CH2CH2OCH3,

3×CONCH2Ar), 51.4, 50.5 (CH2, BocNCH2Ar), 28.3 (3CH3, Boc) ppm. HRMS (TOF MS ES+) calcd

for C49H63N4O11 [M + H]+ m/z 883.4488, found 883.4480.

Linear arylopeptoid mmmmmm-4: Colorless foam (414 mg, 65%, 95% purity).

mp = 53-56 °C. 1H NMR (300 MHz, CDCl3): δ = 8.04-7.74 (m, 2H, o/o’-

C6H4COO), 7.50-7.06 (m, 22H, Ar-H), 4.93-4.53 (m, 10H, 5×CONCH2Ar), 4.53-

4.42 (br s, 2H, BocNCH2Ar), 3.74-3.06 (m, 42H, 6×CH2CH2OCH3,

6×CH2CH2OCH3 and 6×CH2CH2OCH3), 1.51-1.32 (2×br s, 9H, Boc) ppm. 13C NMR (75 MHz,

CDCl3): δ = 172.1 (5Cq, 5×CON), 168.6, 168.2 (Cq, COO), 155.8, 155.5 (Cq, Boc), 139.3, 138.9,

137.7, 137.5, 136.5, 136.2, 130.7, 130.5 (12Cq), 132.5, 131.2, 128.8, 128.5, 128.2, 127.8, 126.3,

125.9, 125.5, 125.0 (24CH), 79.9 (Cq, Boc), 71.1, 70.6, 70.2, 69.8 (6CH2, 6×CH2CH2OCH3), 58.7,

58.6 (6CH3, 6×CH2CH2OCH3), 53.6, 48.0, 46.3, 46.1, 44.5 (11CH2, 6×CH2CH2OCH3,

5×CONCH2Ar), 51.4, 50.5 (CH2, BocNCH2Ar), 28.3 (3CH3, Boc) ppm. HRMS (TOF MS ES-) calcd

for C71H87N6O15 [M - H]- m/z 1263.6235, found 1263.6240.

Linear arylopeptoid mpmp-5: Colorless foam (319 mg, 72%, 95%

purity). mp = 46-49 °C. 1H NMR (300 MHz, CDCl3): δ = 8.04-7.78

(m, 2H, o/o’-C6H4COO), 7.60-7.08 (m, 14H, Ar-H), 4.91-4.54 (m, 6H,

3×CONCH2Ar), 4.54-4.42 (br s, 2H, BocNCH2Ar), 3.72-3.10 (m,

28H, 4×CH2CH2OCH3, 4×CH2CH2OCH3 and 4×CH2CH2OCH3), 1.51-1.30 (2×br s, 9H, Boc) ppm.

13C NMR (75 MHz, CDCl3): δ = 172.2 (3Cq, 3×CON), 169.2, 169.0 (Cq, COO), 155.8, 155.5 (Cq,

Boc), 140.4, 138.9, 138.5, 137.6, 136.5, 135.3, 134.6, 130.5 (8Cq), 132.6, 131.5, 129.0, 128.7, 128.3,

127.9, 127.5, 126.9, 126.1, 125.5 (16CH), 80.0 (Cq, Boc), 71.1, 70.7, 70.3 (4CH2, 4×CH2CH2OCH3),

58.7, 58.6 (4CH3, 4×CH2CH2OCH3), 53.7, 48.0, 46.3, 46.1, 44.6 (7CH2, 4×CH2CH2OCH3,

3×CONCH2Ar), 51.4, 50.5 (CH2, BocNCH2Ar), 28.3 (3CH3, Boc) ppm. HRMS (TOF MS ES+) calcd

for C49H63N4O11 [M + H]+ m/z 883.4488, found 883.4487.

Linear arylopeptoid mpmpmp-6: Colorless foam (380 mg, 60%,

95% purity). mp = 52-55 °C. 1H NMR (300 MHz, CDCl3): δ = 8.04-

7.74 (m, 2H, o/o’-C6H4COO), 7.60-7.06 (m, 22H, Ar-H), 6.72-6.20

(br s, 1H, COOH), 4.93-4.54 (m, 10H, 5×CONCH2Ar), 4.54-4.42 (br

s, 2H, BocNCH2Ar), 3.74-3.06 (m, 42H, 6×CH2CH2OCH3, 6×CH2CH2OCH3 and 6×CH2CH2OCH3),

1.51-1.30 (2×br s, 9H, Boc) ppm. 13C NMR (75 MHz, CDCl3): δ = 172.2 (5Cq, 5×CON), 168.8, 168.5

(Cq, COO), 155.8, 155.5 (Cq, Boc), 140.5, 138.9, 138.5, 137.8, 137.6, 136.4, 135.3, 134.7, 130.6,

130.4 (12Cq), 132.5, 131.5, 129.5, 128.9, 128.8, 128.7, 128.3, 127.9, 127.5, 126.9, 126.3, 125.9,

125.6, 125.4 (24CH), 80.0 (Cq, Boc), 71.1, 70.6, 70.2 (6CH2, 6×CH2CH2OCH3), 58.7, 58.6 (6CH3,

6×CH2CH2OCH3), 53.7, 48.0, 46.3, 46.2, 44.5 (11CH2, 6×CH2CH2OCH3, 5×CONCH2Ar), 51.4, 50.5

(CH2, BocNCH2Ar), 28.3 (3CH3, Boc) ppm. HRMS (TOF MS ES-) calcd for C71H87N6O15 [M - H]-

m/z 1263.6235, found 1263.6226.

Linear arylopeptoid pmpm-7: Colorless foam (288 mg, 65%, 95%

purity). mp = 48-51 °C. 1H NMR (300 MHz, CDCl3): δ = 8.06-7.92

(m, 2H, o-C6H4COO), 7.50-7.08 (m, 14H, Ar-H), 4.94-4.55 (m, 6H,

3×CONCH2Ar), 4.55-4.42 (br s, 2H, BocNCH2Ar), 3.75-3.07 (m,

28H, 4×CH2CH2OCH3, 4×CH2CH2OCH3 and 4×CH2CH2OCH3), 1.52-1.31 (2×br s, 9H, Boc) ppm.

13C NMR (75 MHz, CDCl3): δ = 172.1 (3Cq, 3×CON), 168.9 (Cq, COO), 155.8, 155.5 (Cq, Boc),

143.1, 142.5, 139.3, 138.9, 137.6, 136.5, 136.1, 135.1, 129.2, 129.1 (8Cq), 130.3, 129.5, 128.8, 128.5,

127.9, 127.7, 127.4, 127.2, 126.9, 126.5, 125.8, 125.5, 125.1 (16CH), 80.0 (Cq, Boc), 71.1, 70.8, 70.4

(4CH2, 4×CH2CH2OCH3), 58.7, 58.6 (4CH3, 4×CH2CH2OCH3), 53.7, 48.0, 46.3, 44.4 (7CH2,

4×CH2CH2OCH3, 3×CONCH2Ar), 51.4, 50.6 (CH2, BocNCH2Ar), 28.3 (3CH3, Boc) ppm. HRMS

(TOF MS ES+) calcd for C49H63N4O11 [M + H]+ m/z 883.4488, found 883.4489.

Linear arylopeptoid pmpmpm-8: Colorless foam (354 mg, 56%,

95% purity). mp = 53-56 °C. 1H NMR (300 MHz, CDCl3): δ = 8.05-

7.90 (m, 2H, o-C6H4COO), 7.50-7.04 (m, 22H, Ar-H), 6.65-6.00 (br s,

1H, COOH), 4.95-4.54 (m, 10H, 5×CONCH2Ar), 4.54-4.41 (br s, 2H,

BocNCH2Ar), 3.76-3.05 (m, 42H, 6×CH2CH2OCH3, 6×CH2CH2OCH3 and 6×CH2CH2OCH3), 1.52-

1.30 (2×br s, 9H, Boc) ppm. 13C NMR (75 MHz, CDCl3): δ = 172.3 (5Cq, 5×CON), 168.8, 168.6 (Cq,

COO), 155.9, 155.7 (Cq, Boc), 142.5, 139.3, 139.1, 137.9, 137.6, 136.6, 136.3, 135.3, 129.5 (12Cq),

130.4, 128.9, 128.6, 128.1, 127.5, 127.3, 127.0, 126.6, 126.4, 126.0, 125.7, 125.2 (24CH), 80.1 (Cq,

Boc), 71.2, 70.8, 70.5, 70.3 (6CH2, 6×CH2CH2OCH3), 58.9, 58.8 (6CH3, 6×CH2CH2OCH3), 53.8,

48.1, 46.5, 46.3, 44.6 (11CH2, 6×CH2CH2OCH3, 5×CONCH2Ar), 51.5, 50.7 (CH2, BocNCH2Ar), 28.4

(3CH3, Boc) ppm. HRMS (TOF MS ES-) calcd for C71H87N6O15 [M - H]- m/z 1263.6235, found

1263.6233.

Linear arylopeptoid momo-9: Colorless foam (333 mg, 75%, 96%

purity). mp = 54-57 °C. 1H NMR (300 MHz, CDCl3): δ = 8.17-7.80 (m,

2H, o/o’-C6H4COO), 7.68-7.02 (m, 14H, Ar-H), 5.34-4.14 (m, 8H,

3×CONCH2Ar and BocNCH2Ar), 3.84-2.96 (m, 28H, 4×CH2CH2OCH3,

4×CH2CH2OCH3 and 4×CH2CH2OCH3), 1.56-1.30 (2×br s, 9H, Boc)

ppm. 13C NMR (75 MHz, CDCl3): δ = 172.6, 172.4, 172.3, 171.9, 171.7, 171.6, 171.3, 171.2, 170.8,

170.7, 170.5 (3Cq, 3×CON), 168.9, 168.7 (Cq, COO), 155.9 (Cq, Boc), 137.8, 137.7, 137.1, 137.0,

136.9, 136.7, 136.5, 135.6, 135.6, 135.4, 135.2, 135.0, 134.9, 134.3, 134.1, 130.6, 130.6 (8Cq), 132.8,

132.6, 131.9, 129.5, 129.2, 129.1, 129.0, 128.8, 128.7, 128.6, 128.1, 127.7, 127.5, 127.4, 127.3, 127.1,

126.9, 126.7, 126.7, 126.3, 126.0, 125.9, 125.5, 125.2, 125.1 (16CH), 80.0, 79.9 (Cq, Boc), 70.5, 70.2,

69.9, 69.5, 69.4 (4CH2, 4×CH2CH2OCH3), 58.7, 58.6, 58.6, 58.6, 58.5 (4CH3, 4×CH2CH2OCH3),

53.1, 53.0, 52.8, 51.0, 50.7, 49.2, 49.1, 48.5, 48.1, 47.9, 47.8, 47.5, 46.7, 46.4, 45.4, 45.1, 44.2, 44.0,

43.6 (8CH2, 4×CH2CH2OCH3, 3×CONCH2Ar and BocNCH2Ar), 28.3 (3CH3, Boc) ppm. HRMS

(TOF MS ES+) calcd for C49H63N4O11 [M + H]+ m/z 883.4488, found 883.4495.

Linear arylopeptoid momomo-10: Colorless foam (464 mg, 73%, 95%

purity). mp = 61-64 °C. 1H NMR (300 MHz, CDCl3): δ = 8.15-7.78 (m,

2H, o/o’-C6H4COO), 7.58-6.90 (m, 22H, Ar-H), 5.32-4.12 (m, 12H,

5×CONCH2Ar and BocNCH2Ar), 3.80-2.90 (m, 42H, 6×CH2CH2OCH3,

6×CH2CH2OCH3 and 6×CH2CH2OCH3), 1.56-1.30 (2×br s, 9H, Boc)

ppm. 13C NMR (75 MHz, CDCl3): δ = 172.5, 172.3, 171.6, 171.4, 171.2, 170.8, 170.5 (5Cq, 5×CON),

168.5, 168.4 (Cq, COO), 156.0, 155.9 (Cq, Boc), 137.7, 137.6, 136.8, 136.7, 136.4, 136.2, 135.4,

135.2, 134.9, 134.3, 134.0, 133.8, 133.6, 130.7 (12Cq), 132.6, 132.5, 129.4, 129.3, 129.2, 129.0,

128.9, 128.7, 128.6, 127.8, 127.5, 127.4, 127.2, 127.1, 126.7, 126.5, 126.4, 126.3, 125.9, 125.7, 125.5,

125.3, 125.1 (24CH), 80.0 (Cq, Boc), 70.5, 70.5, 70.2, 70.0, 69.8, 69.4, 69.2 (6CH2,

6×CH2CH2OCH3), 58.6, 58.5 (6CH3, 6×CH2CH2OCH3), 53.0, 50.9, 48.5, 48.5, 48.0, 47.9, 47.8, 47.6,

46.8, 46.6, 46.3, 45.3, 45.1, 44.8, 44.0, 43.8, 43.6 (12CH2, 6×CH2CH2OCH3, 5×CONCH2Ar and

BocNCH2Ar), 28.3 (3CH3, Boc) ppm. HRMS (TOF MS ES+) calcd for C71H90N6O15 [M + 2H]2+ m/z

633.3227, found 633.3233.

Linear arylopeptoid popo-11: Colorless foam (298 mg, 67%, 95%

purity). mp = 62-65 °C. 1H NMR (300 MHz, CDCl3): δ = 8.08-7.88 (m,

2H, o-C6H4COO), 7.60-6.84 (m, 14H, Ar-H), 5.30-4.18 (m, 8H,

3×CONCH2Ar and BocNCH2Ar), 3.80-2.95 (m, 28H, 4×CH2CH2OCH3,

4×CH2CH2OCH3 and 4×CH2CH2OCH3), 1.57-1.30 (2×br s, 9H, Boc)

ppm. 13C NMR (75 MHz, CDCl3): δ = 172.5, 172.3, 172.1, 171.5, 171.4, 171.2, 170.7, 170.5, 169.6,

169.3 (4Cq, 3×CON and COO), 156.0, 155.9, 155.8 (Cq, Boc), 143.2, 142.8, 142.1, 139.1, 138.9,

138.3, 137.8, 135.5, 135.2, 135.1, 135.0, 134.9, 134.6, 134.4, 134.1, 134.0, 133.9, 128.9 (8Cq), 130.3,

130.0, 129.5, 129.3, 129.2, 129.0, 128.0, 127.8, 127.7, 127.6, 127.4, 127.2, 126.9, 126.6, 126.5, 126.2,

125.8, 125.6 (16CH), 80.0 (Cq, Boc), 70.5, 70.3, 70.1, 70.0, 69.8, 69.4, 69.3 (4CH2,

4×CH2CH2OCH3), 58.7, 58.7, 58.6, 58.5 (4CH3, 4×CH2CH2OCH3), 53.1, 48.4, 47.7, 46.8, 46.6, 46.2,

45.2, 45.0, 44.2, 43.8 (8CH2, 4×CH2CH2OCH3, 3×CONCH2Ar and BocNCH2Ar), 28.3 (3CH3, Boc)

ppm. HRMS (TOF MS ES+) calcd for C49H63N4O11 [M + H]+ m/z 883.4488, found 883.4497.

Linear arylopeptoid popopo-12: Colorless foam (412 mg, 65%, 95%

purity). mp = 64-67 °C. 1H NMR (300 MHz, CDCl3): δ = 8.07-7.86 (m,

2H, o-C6H4COO), 7.60-6.88 (m, 22H, Ar-H), 5.31-4.12 (m, 12H,

5×CONCH2Ar and BocNCH2Ar), 3.77-2.94 (m, 42H, 6×CH2CH2OCH3,

6×CH2CH2OCH3 and 6×CH2CH2OCH3), 1.54-1.30 (2×br s, 9H, Boc)

ppm. 13C NMR (75 MHz, CDCl3): δ = 172.4, 172.3, 172.1, 171.4, 171.2, 170.7, 170.5, 169.1, 168.9

(6Cq, 5×CON and COO), 155.9 (Cq, Boc), 143.1, 142.0, 138.9, 138.3, 137.8, 135.5, 135.3, 135.2,

134.8, 134.1, 133.9 (12Cq), 130.3, 129.9, 129.5, 129.3, 129.2, 129.0, 128.0, 127.9, 127.8, 127.6,

127.4, 127.2, 127.0, 126.5, 126.2, 126.0, 125.8, 125.6 (24CH), 80.0 (Cq, Boc), 70.5, 70.3, 70.1, 70.0,

69.3 (6CH2, 6×CH2CH2OCH3), 58.7, 58.6, 58.5 (6CH3, 6×CH2CH2OCH3), 53.1, 53.0, 51.2, 48.4,

47.8, 46.3, 45.3, 44.2, 43.8 (12CH2, 6×CH2CH2OCH3, 5×CONCH2Ar and BocNCH2Ar), 28.3 (3CH3,

Boc) ppm. HRMS (TOF MS ES+) calcd for C71H90N6O15 [M + 2H]2+ m/z 633.3227, found 633.3232.

Macrocyclic arylopeptoids cyclo-pppp-13 and cyclo-pppppppp-14: Macrocyclization of 1 (39.7 mg, 0.045 mmol) following the general procedure, performing the reaction at 2.5 mM, yielded a mixture of 13 and 14. Flash chromatography was performed in EtOAc/MeOH 80:20 until 13 had passed, followed by a change to CH2Cl2/MeOH 95:5, which allowed for isolation of 14. Data for cyclo-pppp-13: White solid

(16.1 mg, 47%, >99% purity). Rf (EtOAc/MeOH 80:20) = 0.40. Rf (CH2Cl2/MeOH 95:5) = 0.25. mp =

246-247 °C. 1H NMR (300 MHz, CDCl3): δ = 7.33 (br d, J = 5.8 Hz, 8H, Ar-H), 7.02 (br d, J = 5.8

Hz, 8H, Ar-H), 4.58-4.45 (br s, 8H, 4×CONCH2Ar), 3.71-3.55 (br m, 16H, 4×CH2CH2OCH3 and

4×CH2CH2OCH3), 3.36-3.29 (br s, 12H, 4×CH2CH2OCH3) ppm. 13C NMR (75 MHz, CDCl3): δ =

171.5 (4Cq, 4×CON), 138.4 (4Cq), 135.7 (4Cq), 127.0 (8CH), 126.8 (8CH), 71.0 (4CH2,

4×CH2CH2OCH3), 58.8 (4CH3, 4×CH2CH2OCH3), 53.5 (4CH2, 4×CONCH2Ar), 44.8 (4CH2,

4×CH2CH2OCH3) ppm. LC-MS: 786.9 (13, M+Na+), 784.3 (17), 765.0 (100, M+H+). HRMS (TOF

MS ES+) calcd for C44H53N4O8 [M + H]+ m/z 765.3858, found 765.3867. Data for cyclo-pppppppp-14:

White solid (6.6 mg, 19%, >98% purity). Rf (EtOAc/MeOH 80:20) = 0.13. Rf (CH2Cl2/MeOH 95:5) =

0.25. mp = 94-99 °C. 1H NMR (300 MHz, CDCl3): δ = 7.48-7.07 (m, 32H, Ar-H), 4.92-4.53 (m, 16H,

8×CONCH2Ar), 3.74-3.12 (m, 56H, 8×CH2CH2OCH3, 8×CH2CH2OCH3 and 8×CH2CH2OCH3) ppm.

13C NMR (75 MHz, CDCl3): δ = 172.0 (8Cq, 8×CON), 139.0, 138.8 (16Cq), 127.9, 127.2, 126.8

(32CH), 70.8, 70.3 (8CH2, 8×CH2CH2OCH3), 58.9 (8CH3, 8×CH2CH2OCH3), 53.8, 48.1, 44.8

(16CH2, 8×CH2CH2OCH3 and 8×CONCH2Ar) ppm. LC-MS: 1551.9 (18), 1550.9 (19, M+Na+),

1548.4 (13), 1532.0 (12), 1531.0 (35), 1530.0 (75), 1528.9 (69, M+H+), 787.1 (12), 784.8 (37), 784.1

(84), 776.7 (27), 776.0 (34), 765.9 (24), 765.1 (42), 760.7 (19), 760.1 (32), 750.0 (29), 749.2

(100), 733.8 (34), 733.2 (90), 382.9 (12), 309.9 (13). HRMS (TOF MS ES+) calcd for C88H105N8O16

[M + H]+ m/z 1529.7643, found 1529.7651.
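The isolated yields of 13 and 14 follow from the product masses and the 0.045 mmol of precursor 1. The sketch below is illustrative only; the function names and the rounded average atomic masses are assumptions, and the cyclodimer 14 is counted as consuming two precursor molecules per product molecule.

```python
# Standard average atomic masses (g/mol), rounded.
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molar_mass(counts):
    """Average molar mass (g/mol) from an {element: count} mapping."""
    return sum(AVERAGE_MASS[el] * n for el, n in counts.items())


def percent_yield(product_mg, product_molar_mass, precursor_mmol, precursors_per_product=1):
    """Isolated yield in %, given how many precursor molecules form one product molecule."""
    product_mmol = product_mg / product_molar_mass
    theoretical_mmol = precursor_mmol / precursors_per_product
    return 100.0 * product_mmol / theoretical_mmol


tetramer = {"C": 44, "H": 52, "N": 4, "O": 8}      # neutral cyclo-pppp-13
octamer = {"C": 88, "H": 104, "N": 8, "O": 16}     # neutral cyclo-pppppppp-14

yield_13 = percent_yield(16.1, molar_mass(tetramer), 0.045)
yield_14 = percent_yield(6.6, molar_mass(octamer), 0.045, precursors_per_product=2)

if __name__ == "__main__":
    print(round(yield_13), round(yield_14))
```

With these inputs the computed values round to the 47% and 19% quoted above.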

Macrocyclic arylopeptoid cyclo-pppppp-15: Macrocyclization of

pppppp-2 (38.0 mg, 0.030 mmol) following the general procedure

yielded cyclo-pppppp-15 as a white solid (21.2 mg, 62%, >97% HPLC

purity). Rf (EtOAc/MeOH 80:20) = 0.20. mp = 95-96 °C. 1H NMR (300

MHz, CDCl3): δ = 7.50-7.02 (m, 24H, Ar-H), 4.92-4.53 (m, 12H,

6×CONCH2Ar), 3.74-3.08 (m, 42H, 6×CH2CH2OCH3,

6×CH2CH2OCH3 and 6×CH2CH2OCH3) ppm. 13C NMR (75 MHz,

CDCl3): δ = 171.9 (6Cq, 6×CON), 138.9, 135.2 (12Cq), 127.2, 126.6

(24CH), 70.7, 70.2 (6CH2, 6×CH2CH2OCH3), 58.8 (6CH3, 6×CH2CH2OCH3), 53.6, 48.0, 44.7

(12CH2, 6×CH2CH2OCH3 and 6×CONCH2Ar) ppm. LC-MS: 1147.7 (16, M+H+), 574.4 (100), 558.3

(22). HRMS (TOF MS ES+) calcd for C66H79N6O12 [M + H]+ m/z 1147.5751, found 1147.5765.

Macrocyclic arylopeptoid cyclo-mmmm-16: Macrocyclization of

mmmm-3 (40.0 mg, 0.045 mmol) following the general procedure yielded

cyclo-mmmm-16 as a white solid (18.0 mg, 52%, >99% HPLC purity). Rf

(EtOAc/MeOH 90:10) = 0.27. mp = 144-145 °C. 1H NMR (400 MHz,

CDCl3): δ = 7.31 (br m, 8H, Ar-H), 7.12 (br m, 4H, Ar-H), 7.00 (br m,

4H, Ar-H), 4.80-4.31 (m, 8H, 4×CONCH2Ar), 3.78-3.52 (br m, 12H,

2×CH2CH2OCH3 and 4×CH2CH2OCH3), 3.45-2.94 (br s, 16H, 2×CH2CH2OCH3 and

4×CH2CH2OCH3) ppm. 13C NMR (100 MHz, CDCl3): δ = 172.0, 171.9, 171.8 (4Cq, 4×CON), 137.7

(4Cq), 136.9 (4Cq), 129.2 (4CH), 126.7, 126.1 (12CH), 71.0 (4CH2, 4×CH2CH2OCH3), 59.0 (4CH3,

4×CH2CH2OCH3), 53.8 (4CH2, 4×CONCH2Ar), 45.1 (4CH2, 4×CH2CH2OCH3) ppm. HRMS (TOF

MS ES+) calcd for C44H53N4O8 [M + H]+ m/z 765.3863, found 765.3839.

Macrocyclic arylopeptoid cyclo-mmmmmm-17: Macrocyclization

of mmmmmm-4 (57.0 mg, 0.045 mmol) following the general

procedure yielded cyclo-mmmmmm-17 as a white solid (31.0 mg,

60%, >87% HPLC purity). Rf (EtOAc/MeOH 90:10) = 0.26. mp =

77-78 °C. 1H NMR (400 MHz, CDCl3): δ = 7.49-7.05 (m, 24H, Ar-

H), 4.90-4.46 (m, 12H, 6×CONCH2Ar), 3.74-3.05 (m, 42H,

6×CH2CH2OCH3, 6×CH2CH2OCH3 and 6×CH2CH2OCH3) ppm.

13C NMR (100 MHz, CDCl3): δ = 171.8 (6Cq, 6×CON), 137.9, 137.7 (6Cq), 136.9 (6Cq), 129.0 (6CH),

126.3, 125.6 (18CH), 70.9, 70.0 (6CH2, 6×CH2CH2OCH3), 58.9 (6CH3, 6×CH2CH2OCH3), 53.7

(6CH2, 6×CONCH2Ar), 48.3 (3CH2, 3×CH2CH2OCH3), 44.8 (3CH2, 3×CH2CH2OCH3) ppm. HRMS

(TOF MS ES+) calcd for C66H79N6O12 [M + H]+ m/z 1147.5751, found 1147.5809.
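The agreement between calculated and found HRMS values is conveniently expressed as a mass error in ppm. The following sketch, with an assumed helper name `ppm_error`, applies this to the three macrocycles reported just above; the numbers are copied from the text.

```python
def ppm_error(found, calculated):
    """Signed mass error in parts per million."""
    return (found - calculated) / calculated * 1e6


# (calcd, found) m/z pairs for the [M + H]+ ions quoted above.
hrms = {
    "cyclo-pppppp-15": (1147.5751, 1147.5765),
    "cyclo-mmmm-16": (765.3863, 765.3839),
    "cyclo-mmmmmm-17": (1147.5751, 1147.5809),
}

errors = {name: ppm_error(found, calcd) for name, (calcd, found) in hrms.items()}

if __name__ == "__main__":
    for name, err in errors.items():
        print(f"{name}: {err:+.1f} ppm")
```

On these values the deviations are roughly +1, -3 and +5 ppm, respectively.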

Macrocyclic arylopeptoid cyclo-mpmp-18: Macrocyclization of mpmp-5

(39.7 mg, 0.045 mmol) following the general procedure yielded cyclo-mpmp-

18 as a pale yellow solid (25.1 mg, 73%, >99% HPLC purity). Rf

(EtOAc/MeOH 90:10) = 0.26. mp = 184-185 °C. 1H NMR (400 MHz,

CDCl3): δ = 7.62-7.14 (m, 16H, Ar-H), 4.85-4.40 (m, 8H, 4×CONCH2Ar),

3.76-2.88 (m, 28H, 4×CH2CH2OCH3, 4×CH2CH2OCH3 and

4×CH2CH2OCH3) ppm. 13C NMR (100 MHz, CDCl3): δ = 172.4, 171.9, 171.6 (4Cq, 4×CON), 137.7,

137.1 (8Cq), 129.1, 127.4, 126.1 (16CH), 71.1, 70.8 (4CH2, 4×CH2CH2OCH3), 59.0 (4CH3,

4×CH2CH2OCH3), 53.9, 53.8 (4CH2, 4×CONCH2Ar), 49.2, 48.4, 48.2 (4CH2, 4×CH2CH2OCH3) ppm.

HRMS (TOF MS ES+) calcd for C44H53N4O8 [M + H]+ m/z 765.3863, found 765.3835.

Macrocyclic arylopeptoid cyclo-mpmpmp-19:

Macrocyclization of mpmpmp-6 (51.0 mg, 0.040 mmol)

following the general procedure yielded cyclo-mpmpmp-19 as a

colorless amorphous solid (23.0 mg, 50%, 98% HPLC purity).

Rf (EtOAc/MeOH 80:20) = 0.30. 1H NMR (400 MHz, CDCl3):

δ = 7.54-7.02 (m, 24H, Ar-H), 4.96-4.48 (m, 12H,

6×CONCH2Ar), 3.79-3.07 (m, 42H, 6×CH2CH2OCH3, 6×CH2CH2OCH3 and 6×CH2CH2OCH3) ppm.

13C NMR (100 MHz, CDCl3): δ = 172.1 (6Cq, 6×CON), 138.7, 137.7, 136.9, 135.5 (12Cq), 129.1,

127.2, 126.3 (24CH), 70.9 (6CH2, 6×CH2CH2OCH3), 59.0 (6CH3, 6×CH2CH2OCH3), 53.8, 48.3, 44.8

(12CH2, 6×CH2CH2OCH3 and 6×CONCH2Ar) ppm. HRMS (TOF MS ES+) calcd for C66H80N6O12 [M

+ 2H]2+ m/z 574.2917, found 574.2890.

Macrocyclic arylopeptoid cyclo-pmpm-18: Macrocyclization of pmpm-7

(39.7 mg, 0.045 mmol) following the general procedure yielded cyclo-

pmpm-18 as a pale yellow solid (22.0 mg, 65%, >99% purity). Rf

(EtOAc/MeOH 90:10) = 0.26. mp = 185-186 °C. HRMS (TOF MS ES+)

calcd for C44H53N4O8 [M + H]+ m/z 765.3863, found 765.3835.

Macrocyclic arylopeptoid cyclo-pmpmpm-19: Macrocyclization

of pmpmpm-8 (51.0 mg, 0.040 mmol) following the general

procedure yielded cyclo-pmpmpm-19 as a white amorphous solid

(28.0 mg, 61%, 98% HPLC purity). Rf (EtOAc/MeOH 80:20) =

0.30. HRMS (TOF MS ES+) calcd for C66H80N6O12 [M + 2H]2+ m/z

574.2917, found 574.2893.

Macrocyclic arylopeptoid cyclo-momo-20: Macrocyclization of

momo-9 (39.7 mg, 0.045 mmol) following the general procedure yielded

cyclo-momo-20 as a colorless solid (27.3 mg, 79%, >98% purity). Rf

(EtOAc/MeOH 90:10) = 0.37. mp = 222-223 °C. 1H NMR (300 MHz,

CDCl3): δ = 7.74-6.72 (m, 16H, Ar-H), 6.04-2.50 (m, 36H,

4×CONCH2Ar, 4×CH2CH2OCH3, 4×CH2CH2OCH3 and

4×CH2CH2OCH3) ppm. 13C NMR (75 MHz, CDCl3): δ = 172.4, 171.8, 171.3, 170.3, 169.7 (4Cq,

4×CON), 138.2, 137.2, 136.6, 135.3, 134.5, 133.9, 133.6, 133.1 (8Cq), 129.5, 129.3, 129.0, 128.8,

128.6, 128.4, 128.0, 127.7, 127.3, 127.0, 126.6, 126.3, 126.0, 125.7, 125.3, 124.9, 124.7, 124.6

(16CH), 70.7, 70.4, 70.2, 70.0, 69.9, 69.4 (4CH2, 4×CH2CH2OCH3), 58.8, 58.7, 58.7, 58.5 (4CH3,

4×CH2CH2OCH3), 51.6, 51.3, 50.7, 48.2, 47.4, 46.4, 46.2, 44.0, 43.0 (8CH2, 4×CH2CH2OCH3 and

4×CONCH2Ar) ppm. LC-MS: 765.3 (100, M+H+). HRMS (TOF MS ES+) calcd for C44H53N4O8 [M +

H]+ m/z 765.3858, found 765.3872.

Macrocyclic arylopeptoid cyclo-momomo-21: Macrocyclization

of momomo-10 (38.0 mg, 0.030 mmol) following the general

procedure yielded cyclo-momomo-21 as a white solid (23.9 mg,

69%, >97% purity). Rf (EtOAc/MeOH 85:15) = 0.32. mp = 99-

102 °C. 1H NMR (300 MHz, CDCl3): δ = 7.58-6.85 (m, 24H, Ar-

H), 5.30-4.10 (m, 12H, 6×CONCH2Ar), 3.84-2.76 (m, 42H,

6×CH2CH2OCH3, 6×CH2CH2OCH3 and 6×CH2CH2OCH3) ppm.

13C NMR (75 MHz, CDCl3): δ = 172.2, 171.3, 170.4 (6Cq,

6×CON), 137.9, 137.0, 136.8, 135.0, 134.3, 133.8 (12Cq), 129.3, 129.1, 128.7, 127.5, 127.1, 126.5,

126.2, 126.1 (24CH), 70.1, 69.8 (6CH2, 6×CH2CH2OCH3), 58.7, 58.6 (6CH3, 6×CH2CH2OCH3), 50.7,

48.4, 47.3, 46.2, 46.0, 45.3, 45.2, 43.7, 43.6 (12CH2, 6×CH2CH2OCH3 and 6×CONCH2Ar) ppm. LC-

MS: 1147.7 (41, M+H+), 574.4 (100), 558.4 (42), 542.3 (22), 478.8 (13). HRMS (TOF MS ES+) calcd

for C66H79N6O12 [M + H]+ m/z 1147.5751, found 1147.5758.

Macrocyclic arylopeptoid cyclo-popo-22: Macrocyclization of

popo-11 (65.0 mg, 0.074 mmol) following the general procedure

yielded cyclo-popo-22 as a colorless solid (32 mg, 56%, >99%

HPLC purity). Rf (EtOAc/MeOH 90:10) = 0.39. mp = 228-229 °C.

1H NMR (400 MHz, CDCl3): δ = 7.55-7.22 (m, 8H, Ar-H), 7.19 (d, J

= 7.7 Hz, 4H, Ar-H), 7.01 (d, J = 7.7 Hz, 4H, Ar-H), 5.75 (d, JAB =

14.6 Hz, 2H, CONCH2Ar), 4.84 (d, JAB = 18.1 Hz, 2H, CONCH2Ar), 4.38 (d, JAB = 18.1 Hz, 2H,

CONCH2Ar), 4.56-4.32 (m, 2H, CH2CH2OCH3), 3.83 (m, 2H, CH2CH2OCH3), 3.70 (d, JAB = 14.6 Hz,

2H, CONCH2Ar), 3.71-3.64 (m, 2H, CH2CH2OCH3), 3.42 (s, 6H, 2×CH2CH2OCH3), 3.26-3.18 (m,

2H, CH2CH2OCH3), 3.17 (s, 6H, 2×CH2CH2OCH3), 3.14 (t, J = 5.3 Hz, 4H, 2×CH2CH2OCH3), 2.51

(dt, JAB = 14.9 Hz, J = 5.3 Hz, 2H, CH2CH2OCH3), 2.08 (dt, JAB = 14.9 Hz, J = 5.3 Hz, 2H,

CH2CH2OCH3) ppm. 13C NMR (100 MHz, CDCl3): δ = 172.2, 170.4 (4Cq, 4×CON), 138.6, 135.9,

135.8, 134.6, 133.5 (8Cq), 129.7, 129.5, 128.6, 128.0, 127.9, 127.4, 126.8, 126.5, 126.3, 125.1

(16CH), 70.8, 70.3 (4CH2, 4×CH2CH2OCH3), 59.2, 59.0, 58.9, 58.8 (4CH3, 4×CH2CH2OCH3), 51.6,

47.4, 46.5, 46.0, 45.0, 38.5 (8CH2, 4×CH2CH2OCH3 and 4×CONCH2Ar) ppm. HRMS (TOF MS ES+)

calcd for C44H53N4O8 [M + H]+ m/z 765.3858, found 765.3843.
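A quick consistency check on a fully resolved spectrum such as that of cyclo-popo-22 is to sum the 1H integrations and compare them with the hydrogen count of the neutral formula (C44H52N4O8, i.e. 52 H). The sketch below is illustrative; the signal list is transcribed from the data above and the grouping labels are informal.

```python
# (number of protons, assignment) for each 1H NMR signal of cyclo-popo-22,
# in the order listed above.
signals = [
    (8, "Ar-H"), (4, "Ar-H"), (4, "Ar-H"),
    (2, "CONCH2Ar"), (2, "CONCH2Ar"), (2, "CONCH2Ar"),
    (2, "side chain"), (2, "side chain"),
    (2, "CONCH2Ar"),
    (2, "side chain"), (6, "side chain"), (2, "side chain"),
    (6, "side chain"), (4, "side chain"), (2, "side chain"), (2, "side chain"),
]

EXPECTED_H = 52  # hydrogens in neutral C44H52N4O8

total = sum(n for n, _ in signals)

by_group = {}
for n, group in signals:
    by_group[group] = by_group.get(group, 0) + n

if __name__ == "__main__":
    print(total, by_group, total == EXPECTED_H)
```

The group totals (16 aromatic, 8 benzylic and 28 side-chain protons) match four repeating units of the macrocycle.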

Macrocyclic arylopeptoid cyclo-popopo-23: Macrocyclization

of popopo-12 (57.0 mg, 0.045 mmol) following the general

procedure yielded cyclo-popopo-23 as a colorless solid (36 mg,

69%, 98% HPLC purity). Rf (CH2Cl2/MeOH 95:5) = 0.50. mp =

122-123 °C. 1H NMR (400 MHz, CDCl3): δ = 7.70-6.86 (m,

24H, Ar-H), 4.90-4.35 (m, 12H, 6×CONCH2Ar), 3.82-2.70 (m,

42H, 6×CH2CH2OCH3, 6×CH2CH2OCH3 and 6×CH2CH2OCH3)

ppm. 13C NMR (100 MHz, CDCl3): δ = 172.4, 171.4, 170.8

(6Cq, 6×CON), 135.8, 133.9 (12Cq), 129.5, 127.9, 127.1, 126.6

(24CH), 70.4, 69.8 (6CH2, 6×CH2CH2OCH3), 58.9 (6CH3, 6×CH2CH2OCH3), 51.1, 48.1, 46.1, 45.6,

42.9 (12CH2, 6×CH2CH2OCH3 and 6×CONCH2Ar) ppm. HRMS (TOF MS ES+) calcd for

C66H79N6O12 [M + H]+ m/z 1147.5751, found 1147.5731.

Macrocyclization Optimization study

The optimization of the macrocyclization process was carried out using the following four linear arylopeptoids as model systems: pppp-1 (the tetramer with the highest number of atoms between the C- and N-termini), pppppp-2 (the hexamer with the highest number of atoms between the C- and N-termini), momo-9 (the tetramer with the lowest number of atoms between the C- and N-termini) and momomo-10 (the hexamer with the lowest number of atoms between the C- and N-termini). Following Boc removal with TFA, the coupling reagents HATU, COMU and PyBOP showed similar efficiency for the macrocyclization of linear precursors 2, 9 and 10. However, PyBOP generally proved to be the least convenient reagent due to the formation of tripyrrolidinophosphine oxide, which was difficult to separate from the macrocyclic products by column chromatography. While the cyclization of the linear precursors 2, 9 and 10 provided the derived macrocycles in 61-82% yield, the cyclization of tetramer 1 proved more challenging. Thus, 3:2 mixtures of the derived macrocyclic tetramer and octamer were obtained in approximately 50% yield regardless of the coupling reagent used. Fortunately, the two macrocycles were easily separable by flash chromatography. A higher proportion of the cyclotetramer was obtained by increasing the dilution to 2.5 mM; however, the cyclooctamer remained present in ~20% yield.

Table S1. Macrocyclization of linear arylopeptoids. Key: (a) CH2Cl2/TFA 1:1, 0 °C, 3 h. (b) coupling reagent (1.2 equiv), DIPEA (5.0 equiv), CH2Cl2, 2.5-5.0 mM, 0 °C to rt.

| Linear precursor | Backbone | Reagent (equiv) [a] | Solvent | Conc. (mM) | Temp. | Macrocycle / dimer | Macrocycle yield [b] (%) | Macrocycle purity [c] (%) | Dimer yield [b] (%) | Dimer purity [c] (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | p-p-p-p | HATU (1.2) | CH2Cl2 | 5.0 | 0 °C to rt | 13/14 | 32 | >95 | 20 | >90 |
| 1 | p-p-p-p | PyBOP (1.2) | CH2Cl2 | 5.0 | 0 °C to rt | 13/14 | 32 [d] | >91 [e] | 21 | >91 |
| 1 | p-p-p-p | COMU (1.2) | CH2Cl2 | 5.0 | 0 °C to rt | 13/14 | 30 | >99 | 16 | >97 |
| 1 | p-p-p-p | COMU (1.2) | CH2Cl2 | 5.0 | rt | 13/14 | 38 | >99 | 23 | >98 |
| 1 | p-p-p-p | COMU (1.2) | CH2Cl2 | 2.5 | 0 °C to rt | 13/14 | 47 | >99 | 19 | >98 |
| 2 | p-p-p-p-p-p | HATU (1.2) | CH2Cl2 | 5.0 | 0 °C to rt | 15 | 61 | >99 | - | - |
| 2 | p-p-p-p-p-p | PyBOP (1.2) | CH2Cl2 | 5.0 | 0 °C to rt | 15 | 65 | >98 | - | - |
| 2 | p-p-p-p-p-p | COMU (1.2) | CH2Cl2 | 5.0 | 0 °C to rt | 15 | 62 | >97 | - | - |
| 9 | m-o-m-o | HATU (1.2) | CH2Cl2 | 5.0 | 0 °C to rt | 20 | 78 | >98 | - | - |
| 9 | m-o-m-o | PyBOP (1.2) | CH2Cl2 | 5.0 | 0 °C to rt | 20 | 82 | >98 | - | - |
| 9 | m-o-m-o | COMU (1.2) | CH2Cl2 | 5.0 | 0 °C to rt | 20 | 79 | >98 | - | - |
| 10 | m-o-m-o-m-o | HATU (1.2) | CH2Cl2 | 5.0 | 0 °C to rt | 21 | 69 | >97 | - | - |
| 10 | m-o-m-o-m-o | PyBOP (1.2) | CH2Cl2 | 5.0 | 0 °C to rt | 21 | 65 [d] | >97 [e] | - | - |
| 10 | m-o-m-o-m-o | COMU (1.2) | CH2Cl2 | 5.0 | 0 °C to rt | 21 | 71 | >97 | - | - |

[a] Tetramers on 0.045 mmol scale and hexamers on 0.030 mmol scale.
[b] Isolated yield of pure product after purification by flash chromatography unless otherwise stated.
[c] Measured by analytical LC-MS.
[d] Calculated yield since the product was isolated as a 1:1 mixture with tripyrrolidinophosphine oxide as judged by NMR.
[e] Tripyrrolidinophosphine oxide not detected on LC-MS.
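The reaction scales and concentrations in Table S1 translate directly into solvent volumes and reagent amounts. The sketch below works through that arithmetic and also recomputes the tetramer/octamer split for precursor 1 from the tabulated yields. It is illustrative only; the helper names are hypothetical and the volumes are derived here, not stated in the original text.

```python
def solvent_volume_ml(scale_mmol, conc_mm):
    """Solvent volume (mL) giving a concentration of conc_mm (mmol/L) at the given scale."""
    return scale_mmol / conc_mm * 1000.0

def reagent_mmol(scale_mmol, equiv):
    """Amount of reagent (mmol) for a given number of equivalents."""
    return scale_mmol * equiv

def product_split(yield_macrocycle_pct, yield_dimer_pct):
    """Combined yield (%) and the fraction of it isolated as the smaller macrocycle."""
    total = yield_macrocycle_pct + yield_dimer_pct
    return total, yield_macrocycle_pct / total

# Tetramer scale (0.045 mmol) at the two concentrations used for precursor 1
for conc in (5.0, 2.5):
    print(f"{conc} mM: {solvent_volume_ml(0.045, conc):.1f} mL CH2Cl2")

# Coupling reagent (1.2 equiv) and DIPEA (5.0 equiv) on the tetramer scale
print(f"coupling reagent: {reagent_mmol(0.045, 1.2):.3f} mmol")
print(f"DIPEA: {reagent_mmol(0.045, 5.0):.3f} mmol")

# HATU entry for precursor 1: 32% tetramer 13 and 20% octamer 14
total, fraction = product_split(32, 20)
print(f"combined yield {total}%, tetramer fraction {fraction:.2f}")
```

For the HATU entry this gives a combined yield of 52% with about 62% of it as the cyclotetramer, consistent with the approximately 3:2 ratio and ~50% yield quoted in the text.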


HPLC and LC-MS profiles of synthesised arylopeptoids

Linear arylopeptoid pppp-1 Linear arylopeptoid pppppp-2

Linear arylopeptoid mmmm-3 Linear arylopeptoid mmmmmm-4

Linear arylopeptoid mpmp-5 Linear arylopeptoid mpmpmp-6

Linear arylopeptoid pmpm-7 Linear arylopeptoid pmpmpm-8


Linear arylopeptoid momo-9 Linear arylopeptoid momomo-10

Linear arylopeptoid popo-11 Linear arylopeptoid popopo-12

Cyclization of arylopeptoid pppp-1 (crude).

Cyclic arylopeptoid cyclo-pppp-13 Cyclic arylopeptoid cyclo-pppppppp-14


Cyclic arylopeptoid cyclo-pppppp-15

[HPLC chromatogram, sample "cyclic arylopeptoid" #SF827A, UV detection at 215 nm; x-axis: time (min), y-axis: absorbance (mAU); labelled peaks at 7.458, 9.100, 10.608, 14.775 and 16.942 min]

[HPLC chromatogram, sample "cyclic arylopeptoid" #SF828A, UV detection at 215 nm; x-axis: time (min), y-axis: absorbance (mAU); labelled peaks at 7.317, 8.300, 13.083 and 15.508 min]

Cyclic arylopeptoid cyclo-mmmm-16 Cyclic arylopeptoid cyclo-mmmmmm-17

[HPLC chromatogram, sample "cyclic arylopeptoid" #SF822A, UV detection at 215 nm; x-axis: time (min), y-axis: absorbance (mAU); labelled peaks at 9.958, 10.625 and 14.292 min]

[HPLC chromatogram, sample "cyclic arylopeptoid" #SF826A, UV detection at 215 nm; x-axis: time (min), y-axis: absorbance (mAU); labelled peaks at 11.833, 12.450, 14.525 and 15.617 min]

Cyclic arylopeptoid cyclo-mpmp-18 Cyclic arylopeptoid cyclo-mpmpmp-19

[HPLC chromatogram, sample "cyclic arylopeptoid" #SF823A, UV detection at 215 nm; x-axis: time (min), y-axis: absorbance (mAU); labelled peaks at 5.858, 8.375 and 10.267 min]

[HPLC chromatogram, sample "cyclic arylopeptoid" #SF825A, UV detection at 215 nm; x-axis: time (min), y-axis: absorbance (mAU); labelled peaks at 10.767, 11.867, 12.483 and 14.567 min]

Cyclic arylopeptoid cyclo-pmpm-18 Cyclic arylopeptoid cyclo-pmpmpm-19


Cyclic arylopeptoid cyclo-momo-20 Cyclic arylopeptoid cyclo-momomo-21

Cyclic arylopeptoid cyclo-popo-22 Cyclic arylopeptoid cyclo-popopo-23


NMR spectra of linear arylopeptoid pppp-1 (CDCl3)

13C NMR peak labels (δ, ppm): 172.108, 169.352, 169.158, 155.757, 155.488, 143.071, 142.560, 140.439, 140.110, 139.066, 138.631, 135.134, 134.636, 130.344, 129.139, 127.873, 127.488, 127.291, 126.919, 126.269, 79.987, 71.094, 70.706, 70.313, 58.939, 58.895, 58.745, 58.623, 53.676, 51.384, 50.524, 48.085, 46.389, 46.206, 44.660, 28.262.

NMR spectra of linear arylopeptoid pppppp-2 (CDCl3)

13C NMR peak labels (δ, ppm): 172.067, 168.888, 168.720, 155.732, 155.507, 142.966, 142.513, 142.352, 140.255, 139.015, 138.701, 135.180, 134.616, 131.445, 130.310, 129.234, 128.954, 127.839, 127.455, 127.020, 126.907, 126.279, 79.966, 71.097, 70.691, 70.271, 70.172, 58.745, 58.622, 54.330, 53.672, 51.400, 50.564, 48.270, 48.072, 46.376, 46.185, 44.649, 28.261.

NMR spectra of linear arylopeptoid mmmm-3 (CDCl3)

13C NMR peak labels (δ, ppm): 172.120, 169.019, 168.661, 155.783, 155.473, 139.223, 138.889, 138.844, 137.729, 137.589, 137.455, 136.633, 136.489, 136.139, 132.666, 132.535, 131.301, 130.565, 130.292, 129.376, 128.995, 128.750, 128.508, 128.163, 127.762, 126.262, 125.893, 125.547, 125.085, 79.980, 71.056, 70.652, 70.260, 58.735, 58.623, 53.685, 51.385, 50.540, 48.096, 47.984, 46.266, 46.150, 44.757, 44.502, 28.280.

NMR spectra of linear arylopeptoid mmmmmm-4 (CDCl3)

13C NMR peak labels (δ, ppm): 172.112, 168.634, 168.231, 155.831, 155.457, 139.255, 138.899, 137.744, 137.460, 136.474, 136.213, 132.540, 131.191, 130.734, 130.473, 128.800, 128.507, 128.184, 127.781, 126.341, 125.927, 125.513, 125.047, 79.932, 71.073, 70.634, 70.244, 69.793, 58.744, 58.638, 53.649, 51.394, 50.529, 47.977, 46.301, 46.088, 44.506, 28.288.

NMR spectra of linear arylopeptoid mpmp-5 (CDCl3)

13C NMR peak labels (δ, ppm): 172.197, 169.237, 168.966, 155.804, 155.532, 140.411, 138.936, 138.461, 137.593, 136.492, 135.279, 134.639, 132.645, 131.542, 130.478, 129.006, 128.732, 128.319, 127.878, 127.472, 126.949, 126.107, 125.497, 79.990, 71.080, 70.665, 70.325, 58.739, 58.620, 53.671, 51.358, 50.521, 48.025, 46.330, 46.137, 44.614, 28.263.

NMR spectra of linear arylopeptoid mpmpmp-6 (CDCl3)

13C NMR peak labels (δ, ppm): 172.153, 168.751, 168.530, 155.753, 155.494, 140.526, 138.921, 138.508, 137.812, 137.577, 136.432, 135.258, 134.695, 132.524, 131.487, 130.601, 130.405, 129.478, 128.918, 128.786, 128.699, 128.282, 127.879, 127.498, 126.942, 126.276, 125.902, 125.564, 125.417, 79.953, 71.082, 70.637, 70.221, 58.731, 58.618, 53.661, 51.388, 50.523, 48.039, 46.321, 46.155, 44.532, 28.256.

NMR spectra of linear arylopeptoid pmpm-7 (CDCl3)

13C NMR peak labels (δ, ppm): 172.128, 168.859, 155.774, 155.455, 143.088, 142.474, 139.307, 138.911, 137.618, 136.485, 136.134, 135.127, 130.347, 129.469, 129.223, 129.073, 128.827, 128.529, 127.886, 127.666, 127.374, 127.155, 126.871, 126.464, 125.811, 125.477, 125.105, 79.980, 71.078, 70.762, 70.350, 58.747, 58.633, 53.735, 51.380, 50.596, 48.003, 46.293, 44.442, 28.285.

NMR spectra of linear arylopeptoid pmpmpm-8 (CDCl3)

13C NMR peak labels (δ, ppm): 172.270, 168.821, 168.593, 155.912, 155.676, 142.525, 139.289, 139.087, 137.916, 137.597, 136.622, 136.276, 135.315, 130.427, 129.480, 128.946, 128.632, 128.063, 127.537, 127.327, 126.997, 126.574, 126.416, 125.988, 125.659, 125.187, 80.055, 71.200, 70.801, 70.533, 70.289, 58.867, 58.752, 53.789, 51.533, 50.720, 48.131, 46.451, 46.269, 44.591, 28.402.

NMR spectra of linear arylopeptoid momo-9 (CDCl3)

13C NMR peak labels (δ, ppm): 172.560, 172.408, 172.287, 171.892, 171.743, 171.613, 171.258, 171.201, 170.786, 170.735, 170.542, 168.914, 168.689, 155.936, 137.776, 137.675, 137.076, 137.024, 136.885, 136.724, 136.453, 135.634, 135.580, 135.445, 135.223, 135.012, 134.875, 134.280, 134.057, 132.774, 132.617, 131.905, 130.613, 130.568, 129.470, 129.220, 129.107, 129.008, 128.844, 128.743, 128.606, 128.148, 127.713, 127.541, 127.439, 127.266, 127.098, 126.920, 126.724, 126.683, 126.281, 126.017, 125.922, 125.504, 125.248, 125.114, 80.046, 79.928, 70.476, 70.234, 69.866, 69.532, 69.402, 58.701, 58.644, 58.604, 58.552, 58.519, 53.097, 52.961, 52.811, 51.038, 50.746, 49.221, 49.090, 48.528, 48.097, 47.947, 47.806, 47.549, 46.674, 46.393, 45.411, 45.068, 44.196, 44.012, 43.551, 28.316.

NMR spectra of linear arylopeptoid momomo-10 (CDCl3)

13C NMR peak labels (δ, ppm): 172.540, 172.289, 171.585, 171.426, 171.174, 170.776, 170.497, 168.520, 168.360, 156.019, 155.912, 137.734, 137.629, 136.808, 136.698, 136.437, 136.209, 135.404, 135.232, 134.877, 134.258, 134.028, 133.798, 133.637, 132.646, 132.547, 130.674, 129.427, 129.320, 129.154, 129.043, 128.853, 128.732, 128.554, 127.812, 127.513, 127.446, 127.178, 127.069, 126.681, 126.541, 126.432, 126.253, 125.927, 125.718, 125.530, 125.344, 125.068, 79.959, 70.492, 70.453, 70.212, 69.952, 69.850, 69.392, 69.170, 58.631, 58.537, 52.967, 50.884, 48.544, 48.453, 48.050, 47.884, 47.767, 47.551, 46.768, 46.569, 46.281, 45.274, 45.071, 44.818, 43.970, 43.798, 43.563, 28.293.

NMR spectra of linear arylopeptoid popo-11 (CDCl3)

13C NMR peak labels (δ, ppm): 172.503, 172.293, 172.078, 171.502, 171.412, 171.186, 170.707, 170.466, 169.569, 169.292, 156.027, 155.913, 155.824, 143.222, 142.785, 142.077, 139.108, 138.852, 138.265, 137.804, 135.467, 135.202, 135.147, 134.983, 134.851, 134.600, 134.386, 134.102, 134.021, 133.915, 130.287, 129.951, 129.470, 129.332, 129.192, 129.044, 128.913, 127.976, 127.788, 127.668, 127.587, 127.397, 127.205, 126.930, 126.603, 126.474, 126.207, 125.777, 125.559, 79.996, 70.505, 70.264, 70.114, 69.962, 69.758, 69.393, 69.282, 58.711, 58.665, 58.608, 58.546, 53.124, 48.373, 47.714, 46.764, 46.621, 46.221, 45.226, 44.998, 44.154, 43.775, 28.289.

NMR spectra of linear arylopeptoid popopo-12 (CDCl3)

13C NMR peak labels (δ, ppm): 172.432, 172.260, 172.076, 171.440, 171.216, 170.706, 170.481, 169.096, 168.886, 155.882, 143.120, 142.010, 138.918, 138.288, 137.831, 135.546, 135.307, 135.210, 134.770, 134.086, 133.890, 130.259, 129.929, 129.473, 129.315, 129.193, 129.023, 127.995, 127.930, 127.820, 127.642, 127.405, 127.196, 126.983, 126.494, 126.216, 126.019, 125.837, 125.571, 79.980, 70.528, 70.289, 70.147, 70.027, 69.336, 58.665, 58.609, 58.542, 53.131, 53.016, 51.222, 48.405, 47.751, 46.287, 45.271, 44.217, 43.837, 28.296.

NMR spectra of macrocyclic arylopeptoid cyclo-pppp-13 (CDCl3)

13C NMR peak labels (δ, ppm): 171.502, 138.367, 135.701, 127.022, 126.776, 71.026, 58.810, 53.482, 44.817.

NMR spectra of macrocyclic arylopeptoid cyclo-pppppppp-14 (CDCl3)

13C NMR peak labels (δ, ppm): 171.979, 139.021, 138.810, 135.443, 127.887, 127.177, 126.764, 70.836, 70.337, 58.857, 53.755, 48.052, 44.762.

NMR spectra of macrocyclic arylopeptoid cyclo-pppppp-15 (CDCl3)

13C NMR peak labels (δ, ppm): 171.883, 138.905, 135.193, 127.184, 126.632, 70.723, 70.224, 58.764, 53.618, 48.004, 44.683.

NMR spectra of macrocyclic arylopeptoid cyclo-mmmm-16 (CDCl3)


NMR spectra of macrocyclic arylopeptoid cyclo-mmmmmm-17 (CDCl3)


NMR spectra of macrocyclic arylopeptoid cyclo-pmpm-18 (CDCl3)


NMR spectra of macrocyclic arylopeptoid cyclo-pmpmpm-19 (CDCl3)


NMR spectra of macrocyclic arylopeptoid cyclo-momo-20 (CDCl3)

13C NMR peak labels (δ, ppm): 172.363, 171.767, 171.312, 170.324, 169.691, 138.167, 137.170, 136.591, 135.325, 134.535, 133.855, 133.601, 133.090, 129.540, 129.276, 129.028, 128.802, 128.566, 128.372, 127.954, 127.689, 127.270, 126.991, 126.556, 126.250, 125.954, 125.743, 125.268, 124.885, 124.746, 124.561, 70.706, 70.378, 70.171, 70.047, 69.903, 69.375, 58.800, 58.729, 58.656, 58.464, 51.620, 51.326, 50.691, 48.244, 47.413, 46.424, 46.179, 43.978, 42.951.

NMR spectra of macrocyclic arylopeptoid cyclo-momomo-21 (CDCl3)

13C NMR peak labels (δ, ppm): 172.185, 171.257, 170.435, 137.926, 136.996, 136.800, 134.959, 134.269, 133.829, 129.317, 129.122, 128.731, 127.505, 127.118, 126.500, 126.249, 126.126, 70.149, 69.831, 58.703, 58.591, 50.712, 48.392, 47.264, 46.169, 45.968, 45.309, 45.154, 43.724, 43.556.

NMR spectra of macrocyclic arylopeptoid cyclo-popo-22 (CDCl3)


NMR spectra of macrocyclic arylopeptoid cyclo-popopo-23 (CDCl3)


NMR study

Figure S1: 1H NMR spectra of pppp-13 at different concentrations in CDCl3 at 298 K (black curve 0.2 mM, blue curve 1 mM, purple curve 5 mM).

Figure S2: Variable temperature study of pppp-13 in CDCl3 (5 mM): 278 K (black curve), 288 K (blue curve), 298 K (purple curve).

Figure S3: 1H NMR spectra of pppp-13 in different solvents at 298 K (black curve CDCl3, blue curve CD3CN, purple curve CD3OD).


Figure S4: Variable temperature study of pppp-13 in CD3CN (5 mM): 278 K (black curve), 288 K (blue curve), 298 K (purple curve), 308 K (green curve), 318 K (red curve), 328 K (orange curve), 338 K (pale green curve) and 343 K (pink curve).

Figure S5: Variable temperature study of momo-20 in CD3CN (2 mM): 268 K (black curve), 278 K (blue curve), 288 K (purple curve), 298 K (green curve), 308 K (red curve), 318 K (orange curve), 328 K (pale green curve).


Figure S6: Variable temperature study of popo-22 in CD3CN (5 mM): 278 K (black curve), 288 K (blue curve), 298 K (purple curve), 308 K (green curve), 318 K (red curve), 328 K (orange curve), 338 K (pale green curve) and 343 K (pink curve).

Figure S7: 4-6 ppm region of the variable temperature study of popo-22 in CD3CN (5 mM): 278 K (black curve), 288 K (blue curve), 298 K (purple curve), 308 K (green curve), 318 K (red curve), 328 K (orange curve), 338 K (pale green curve) and 343 K (pink curve).


Figure S8: 1H NMR spectrum of the macrocycle popo-22 in CDCl3 at 298 K.

Figure S9: NOESY spectrum of the macrocycle popo-22 in CDCl3 at 298 K, with an expanded region.


Figure S10: Variable temperature study of mmmm-16 in CD3CN (5 mM): 298 K (black curve), 288 K (blue curve), 278 K (purple curve), 268 K (green curve).


X-ray crystallography

Figure S11: Crystal packing of X-ray crystal structure of mmmm-16 in CDCl3 at 298 K and zoom region.


Molecular modelling (ref. S2)

On the basis of the crystallographic structures obtained for the macrocycles pppp-13, mmmm-16, momo-20 and popo-22, we propose two molecular models for the pmpm system: the first built as a centrosymmetric form and the second as a C2-symmetric form, both with a cis-trans-cis-trans conformation of the backbone amides (Figure S12). Each model was constructed using the UCSF Chimera software (ref. S3). Molecular dynamics simulations were performed using NAMD (ref. S4) with the CHARMM22 force field parameters (ref. S5). From each trajectory, the lowest-energy structures were optimized by DFT at the B3LYP/6-31G(d,p) level of theory using the Gaussian 09 software (ref. S6).

Monitoring the variation of the two dihedral angles ψ and θ of each monomer (Figure S13) over each trajectory, together with a comparison of the energies calculated by DFT, suggests that the second model (Figure S12b) is the more likely structure.
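Following a dihedral angle such as ψ or θ along a trajectory amounts to evaluating, for each frame, the torsion defined by four atom positions. The sketch below shows that calculation in plain Python for a single set of coordinates. It is an illustration under stated assumptions: the coordinates are invented test points, not atoms from the models, and the function `dihedral_deg` is a generic helper, not part of Chimera, NAMD or Gaussian.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees, -180 to 180) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Components of the outer bonds perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

# Hypothetical coordinates (arbitrary units), chosen only to exercise the function
cis = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
gauche = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
trans = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(cis, gauche, trans)
```

For a real analysis the four positions would be the atoms listed for ψ and θ in Figure S13, read frame by frame from the trajectory.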

Figure S12. Computational models for cyclotetramer pmpm-18: (a) model built as a centrosymmetric form and (b) model built as a C2-symmetric form.

Figure S13. Dihedral angles. For the ortho series: ψ [Caa; Cab; C(i); N(i+1)], θ [Cab; Caa; Ca(i); N(i)]. For the meta series: ψ [Cab; Cac; C(i); N(i+1)], θ [Cab; Caa; Ca(i); N(i)]. For the para series: ψ [Cac or ae; Cad; C(i); N(i+1)], θ [Cab or af; Caa; Ca(i); N(i)]. Panels: ortho-arylopeptoid, meta-arylopeptoid, para-arylopeptoid.

Complexation study

The hosts were tested in CDCl3 solution at a concentration of 0.010 M (i.e. 3.82 mg of macrocycle dissolved in 0.500 mL of CDCl3). NaTFPB was added to the host solutions as a solid in aliquots of 0.5 equivalents. After each addition, the sample was sonicated for 4 minutes until the salt had dissolved. In case of solubility issues, the sample was heated at 60 °C for 5 minutes and then sonicated again.
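The quantities in this procedure follow from simple molar arithmetic, sketched below. Two inputs are assumptions made for this example and are not stated in the text: the monomer formula C11H13NO2 (one sixth of the neutral hexamer C66H78N6O12 implied by the HRMS data) and the formula C32H12BF24Na for NaTFPB, taken here to be sodium tetrakis[3,5-bis(trifluoromethyl)phenyl]borate. Average atomic masses are standard rounded values.

```python
# Average atomic masses (g/mol), rounded standard values
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999,
               "B": 10.81, "F": 18.998, "Na": 22.990}

def molar_mass(formula):
    """Molar mass (g/mol) from an element-count dictionary."""
    return sum(ATOMIC_MASS[el] * n for el, n in formula.items())

# Assumption: every cyclotetramer is built from four C11H13NO2 units
monomer = {"C": 11, "H": 13, "N": 1, "O": 2}
tetramer = {el: 4 * n for el, n in monomer.items()}

# Assumption: NaTFPB = Na[B(C6H3(CF3)2)4]
natfpb = {"C": 32, "H": 12, "B": 1, "F": 24, "Na": 1}

host_mmol = 0.010 * 0.500                       # mol/L x mL = mmol
host_mg = host_mmol * molar_mass(tetramer)      # mmol x g/mol = mg
aliquot_mg = 0.5 * host_mmol * molar_mass(natfpb)

print(f"host: {host_mmol * 1000:.1f} umol, {host_mg:.2f} mg")
print(f"NaTFPB per 0.5 equiv aliquot: {aliquot_mg:.2f} mg")
```

Under these assumptions the host mass reproduces the 3.82 mg quoted above, and each 0.5 equivalent aliquot of guest corresponds to roughly 2.2 mg of solid.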

Figure S14. 1H NMR spectra of the free pppp-13 macrocycle (CDCl3 solution, 298 K, concentration: 0.010 M, 600 MHz) and in the presence of 0.5 eq. of NaTFPB*.

*The addition of NaTFPB was stopped after 0.5 equivalents due to the insolubility of the guest.

Figure S15. 1H NMR spectra of the free pmpm-18 macrocycle (CDCl3 solution, 298 K, concentration: 0.010 M, 600 MHz) and in the presence of increasing amounts of NaTFPB.

Figure S16. 1H NMR spectra of the free popo-22 macrocycle (CDCl3 solution, 298 K, concentration: 0.010 M, 600 MHz) and in the presence of increasing amounts of NaTFPB.

Figure S17. 1H NMR spectra of the free momo-20 macrocycle (CDCl3 solution, 298 K, concentration: 0.010 M, 600 MHz) and in the presence of increasing amounts of NaTFPB.

References

S1 T. Hjelmgaard, S. Faure, E. De Santis, D. Staerk, B. D. Alexander, A. A. Edwards, C. Taillefumier, J. Nielsen, Tetrahedron 2012, 68, 4444-4454.

S2 Computations have been performed on the supercomputer facilities of the Mésocentre Clermont Auvergne.

S3 E. F. Pettersen, T. D. Goddard, C. C. Huang, G. S. Couch, D. M. Greenblatt, E. C. Meng, T. E. Ferrin, UCSF Chimera--a visualization system for exploratory research and analysis, J. Comput. Chem. 2004, 25, 1605-1612.

S4 J. C. Phillips, R. Braun, W. Wang, J. Gumbart, E. Tajkhorshid, E. Villa, C. Chipot, R. D. Skeel, L. Kale, K. Schulten, Scalable molecular dynamics with NAMD, J. Comput. Chem. 2005, 26, 1781-1802.

S5 A. D. MacKerell, M. Feig, C. L. Brooks III, Extending the treatment of backbone energetics in protein force fields: limitations of gas-phase quantum mechanics in reproducing protein conformational distributions in molecular dynamics simulations, J. Comput. Chem. 2004, 25, 1400-1415.

S6 Gaussian 09, Revision D.01, M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, M. A. Robb, J. R. Cheeseman, G. Scalmani, V. Barone, G. A. Petersson, H. Nakatsuji, X. Li, M. Caricato, A. Marenich, J. Bloino, B. G. Janesko, R. Gomperts, B. Mennucci, H. P. Hratchian, J. V. Ortiz, A. F. Izmaylov, J. L. Sonnenberg, D. Williams-Young, F. Ding, F. Lipparini, F. Egidi, J. Goings, B. Peng, A. Petrone, T. Henderson, D. Ranasinghe, V. G. Zakrzewski, J. Gao, N. Rega, G. Zheng, W. Liang, M. Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai, T. Vreven, K. Throssell, J. A. Montgomery, Jr., J. E. Peralta, F. Ogliaro, M. Bearpark, J. J. Heyd, E. Brothers, K. N. Kudin, V. N. Staroverov, T. Keith, R. Kobayashi, J. Normand, K. Raghavachari, A. Rendell, J. C. Burant, S. S. Iyengar, J. Tomasi, M. Cossi, J. M. Millam, M. Klene, C. Adamo, R. Cammi, J. W. Ochterski, R. L. Martin, K. Morokuma, O. Farkas, J. B. Foresman, and D. J. Fox, Gaussian, Inc., Wallingford CT, 2016.