New Approaches for the Dereplication, Isolation and Structural Elucidation
of New Bioactive Marine Natural Products.
Marcel Jaspars
marine biodiscovery centre
Tools: Chromatography, Spectroscopy, Molecular biology, Synthesis, Molecular modelling, New software
Marine invertebrates, Marine microorganisms & Symbionts
Functions: Chemical ecology, Symbiosis, Biosynthesis, Bioinorganic chemistry
Applications: Pharmaceutical, Cancer, Infection, Inflammation, Parasites, Epilepsy, Alzheimer’s
People: Prof Marcel Jaspars, Dr Rainer Ebel, Dr Hai Deng, Dr Wael Houssen, Prof Jörg Feldmann, Dr Eva Krupp, Dr Laurent Trembleau, Dr Andrea Raab, Mr Russell Gray
The Marine Biodiscovery Centre
Aim: The discovery of novel pharmaceutical candidates and research tools from extreme environments.
Natural Product Chemistry, Pharmacy & Pharmacognosy, Chemical Biology, Synthetic Biology, Analytical Methods, Organic Synthesis, Mass spectrometry, Nuclear Magnetic Resonance
The Biodiscovery Pipeline
Stages: Sampling, Curation, Biomass, Extraction, Assay, Purification, Active NCE, Development
Bottlenecks:
• Access to resources (physical and legal)
• Taxonomic identification
• Storage with associated metadata
• Whole organism collection
• Microbial cultivation/cryptic pathways
• Extraction process
• Early dereplication
• Assay throughput vs information obtained
• False positives/negatives
• Purification process
• Removal of nuisance compounds
• Structure determination
• Late dereplication
• Supply: scale-up/process intensification
Compound Isolation
Solvent-solvent partition
Size-exclusion chromatography
Medium pressure liquid chromatography
High pressure liquid chromatography
Bioassay Guided Isolation
[Fractionation tree; separation steps: Partition fraction, Size exclusion, Column chromatography, HPLC]
Crude extract: Active
S1: Inactive
S2: Inactive
S3: Inactive
S4: Inactive
S5: Active
S6: Inactive
S6F1: Inactive
S6F2: Inactive
S6F3: Inactive
S6F4: Active
S6F4H1: Inactive
S6F4H2: Inactive
Pure Compound: Active
Problems with bioassay guided isolation
• Poor assay reproducibility using crude/semi-purified extracts
• False positives/negatives
• Long times between cycles of purification/bioassay are common
• Repeated rediscovery of known compounds (bioactives)
One Solution – Compound Libraries
Collection/Taxonomy, Curation/Informatics
Crude extracts
Purified extracts
Pure compounds
Pure compounds with structure
Library size
Library generation
Library value
SCREEN (in-house/partner)
Dereplication
Has your compound been reported before and how do you find out?
Isolation of known compounds is time consuming and costly
John Blunt et al. in Handbook of Marine Natural Products, 2012
Available databases:
• 30,000 - MarinLit (RSC)
• 36,000 - Antibase
• 50,000 - AntiMarin
• 160,000 - Dictionary of natural products
Early stage dereplication: identifying known compounds in extracts; prioritising samples for further work
Late stage dereplication: after compounds have been purified.
Hyphenated Data
UV (DAD)
LCMS: ES+ (HR), ES-, MS/MS (LR)
Sample submission: [Crude] = 0.5 mg/ml, [Pure] = 0.05 mg/ml
• Coupling of LC and MS increases resolution of both LC and MS methods
• Can monitor selected ions
• Can monitor selected reactions
• Using high resolution, molecular formula can be determined and searched in database.
• Useful for early dereplication – at extract stage
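The formula search described in the bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-entry `TOY_DB` is hypothetical (its formulas are ones quoted elsewhere in this deck, the entry names are invented), ions are assumed to be [M+H]+, and the 5 ppm window is an arbitrary choice rather than the group's actual setting.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647  # mass of H+ (hydrogen atom minus one electron)


def mono_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C14H16N2O4'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONO[element] * (int(count) if count else 1)
    return total


def search(mz, database, ppm=5.0):
    """Return entry names whose [M+H]+ lies within `ppm` of the measured m/z."""
    hits = []
    for name, formula in database.items():
        theoretical = mono_mass(formula) + PROTON
        if abs(mz - theoretical) / theoretical * 1e6 <= ppm:
            hits.append(name)
    return hits


# Hypothetical database; entry names are placeholders, not real records.
TOY_DB = {"entry_1": "C14H16N2O4", "entry_2": "C10H18O", "entry_3": "C10H10N2O2"}

# m/z 277.1190 is the ES+ value shown on the "Determining the MF" slide.
hits = search(277.1190, TOY_DB)
print(hits)
```

A real search would run against MarinLit, AntiMarin or a similar database and would also consider other adducts and ES- data.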
Targeted Dereplication
Searching for Every Database Compound in the LC-MS
Unidentified Peak
Aspergillus indologenus extract
Use database of all known Aspergillus and Penicillium compounds and retention times to identify known peaks and locate unknowns
Tetrahedron Lett. 2015, 56, 1847
MBC Streptomyces Database
5070 Streptomyces compounds
Jioji Tabudravu
MBC Spectroscopic Database
665 natural products from marine and terrestrial sources representing several classes of natural products including peptides, alkaloids, terpenes and others
Case Study
Dereplication of Streptomyces albus, ΔlgnC
Tamarindus indica, Legon, Ghana
Legonmycin A [structure diagram]
Streptomyces albus
Angew. Chem. Int. Ed. 2015, 54, 12697
Knowns and Unknowns
6 new possible compound candidates
New Compounds
[Structure diagrams: compounds 1 to 4, annotated as Known or New]
MBC Spectroscopic Database
Similarity Match
[Structure diagrams: similarity match]
Ross Sea Pseudomonas sp. BTN1
Ross Sea
Untargeted Dereplication
Searching Every LC-MS peak in the Database
Analysis of the Blank
Jioji Tabudravu
Untargeted Dereplication
Blank and Medium Removed
Untargeted Dereplication
Blank, Medium and PharmaSea Database (28,000 compounds) removed
533.368
Untargeted Dereplication
After Removal of Blank/Medium/Knowns, Remainder are New Compounds
[Structure diagram]
Marine Drugs, 2016, 14, 83
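The subtraction sequence shown on the Untargeted Dereplication slides (blank, then medium, then known compounds) can be sketched as successive filters on a peak list. This is a minimal illustration, not the workflow actually used: all peak values, retention times and tolerances below are hypothetical, except that m/z 533.368 is the peak label shown on the slide (its retention time here is invented).

```python
def matches(feature, reference, ppm=10.0, rt_tol=0.2):
    """True if an (m/z, rt) feature matches a reference within tolerance.

    A reference retention time of None means "match on mass only",
    as for a database entry with no recorded retention time.
    """
    mz, rt = feature
    ref_mz, ref_rt = reference
    same_mass = abs(mz - ref_mz) / ref_mz * 1e6 <= ppm
    same_rt = ref_rt is None or abs(rt - ref_rt) <= rt_tol
    return same_mass and same_rt


def subtract(features, references, **tol):
    """Keep only the features that match none of the references."""
    return [f for f in features
            if not any(matches(f, r, **tol) for r in references)]


# Hypothetical LC-MS features as (m/z, retention time in min).
sample = [(149.023, 1.1), (301.141, 5.0), (415.212, 7.3), (533.368, 9.8)]
blank = [(149.023, 1.1)]        # solvent/system peaks
medium = [(301.142, 5.05)]      # culture medium components
known = [(415.2115, None)]      # database of known compounds, mass only

remaining = subtract(subtract(subtract(sample, blank), medium), known)
print(remaining)
```

Whatever survives all three filters is a candidate for isolation; it still needs late stage dereplication once purified.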
Monorhamnolipids
Activity Against Burkholderia
[Structure diagrams: monorhamnolipids 1 to 3]
Mar. Drugs 2016, 14, 83
Novelty, Complexity and Diversity Metrics
Jioji Tabudravu
Novelty for Fijian Invertebrate Extracts
[Bar chart: Novelty index (y-axis, 0.00 to 0.90) against Fractions (x-axis). Fraction labels: Tavarua-2-50, Lau-1-50, Tavaniko-5-50, Tavaniko-4-50, Buago-7A-25, Tavaniko-7-50, Eden-4-50, Tavarua-2-100, Eden-5-100, Pinaccle-5-50, Buago-B-50, Buago-B-100+TFA, Buago-7A-100+TFA, Pinaccle-5-25, Tavarua-8.9-100+TFA, Eden-5-100+TFA, Balolo-3.14-100, Lau-1-100+TFA, Buago-7A-100+TFA, Tavaniko-5-100, Nam-2-25, Tavaniko-4-100, Tavaniko-5-25, Nam-2-100, Pinaccle-5-100+TFA, Tavaniko-4-100+TFA, Eden-6-100, Eden-6-25]
New Compound Discovered
• Analysis of MS/MS data is time consuming
• No publicly available dereplication databases
GNPS
• Compares fragmentation patterns
• Clusters compounds with similar fragmentation patterns based on similarity
• Dereplication by comparing newly generated MS/MS data with deposited data
Sylvia Soldatou/Kevin Miranda
Molecular Network of Marine Endophytic Aspergillus sp. Extract
Node (parent ion)
Edge (cosine similarity score)
Sylvia Soldatou/Kevin Miranda
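To make the edge definition concrete, here is a minimal Python sketch of a cosine similarity score between two MS/MS peak lists. It is a simplified stand-in, not GNPS's own scoring code: peaks are paired greedily within a fixed m/z tolerance, no precursor-mass shift is considered, and the two spectra and the tolerance are hypothetical.

```python
import math


def cosine_score(spec_a, spec_b, tol=0.02):
    """Cosine similarity between two peak lists [(mz, intensity), ...].

    Fragment peaks are paired greedily (highest intensity product first)
    when their m/z values agree within `tol`; each peak is used once.
    """
    pairs = [(ia * ib, i, j)
             for i, (ma, ia) in enumerate(spec_a)
             for j, (mb, ib) in enumerate(spec_b)
             if abs(ma - mb) <= tol]
    pairs.sort(reverse=True)

    used_a, used_b, dot = set(), set(), 0.0
    for product, i, j in pairs:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            dot += product

    norm_a = math.sqrt(sum(inten ** 2 for _, inten in spec_a))
    norm_b = math.sqrt(sum(inten ** 2 for _, inten in spec_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical fragment spectra: one shared fragment, one different.
spectrum_1 = [(100.00, 3.0), (150.00, 4.0)]
spectrum_2 = [(100.01, 3.0), (200.00, 4.0)]
score = cosine_score(spectrum_1, spectrum_2)
print(round(score, 2))
```

In a molecular network, two nodes (parent ions) would be joined by an edge when this score exceeds a chosen threshold, so structurally related compounds cluster together.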
Molecular Network of Marine Endophytic Aspergillus sp. Extract
GNPS spectral library hit: Emericellamide A
Confirmed by MS-MS Fragmentation
[Structure diagram: Emericellamide A]
Circles: potentially new analogues
Sylvia Soldatou/Kevin Miranda
Late Stage Dereplication
From Prioritising Crude Extract to Identifying Pure Compound
[Figure: crude 1H and 13C NMR spectra (JTSD114_FD, CDCl3) and pure 1H and 13C NMR spectra (JT_SD114_FD_S1_F9_H5)]
[Figure: 1H NMR spectrum with integrals, f1 (ppm); water and CD3OD signals marked]
Using Simple Features
N-Me
S-Me
-CH(Me2)
[Structure diagram]
Chemical Structure
2D Framework
1 Carbon skeleton
2 Functional groups
3 Location of functional groups
3D Framework
1 Relative stereo
2 Absolute stereo
3 Conformational features
[Structure diagrams: example compounds]
OK
But what about:
Structure Determination Steps
[Flowchart, reconstructed from the slide labels]
Pure compound
Molecular formula (MS, NMR)
Unsaturation Number (UN)
Dereplicate by MF
Functional groups (NMR, IR, UV)
Substructures (NMR)
Draw all isomers
Working 2D structures
List of working 2D structures
Dereplicate by structure (NMR, MS, IR, UV)
Known molecular structure
New 2D molecular structure
Reasonable 3D molecular structure (NMR, ORD, total synthesis, molecular modeling)
Very secure 3D molecular structure (X-RAY)
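The Unsaturation Number (UN) step in the flowchart follows directly from the molecular formula: UN = (2C + 2 + N - H - X) / 2, where X counts halogens and oxygen and sulfur do not contribute. A minimal Python sketch, applied to formulas quoted elsewhere in this deck; the function name and the simple formula parser are illustrative assumptions.

```python
import re


def unsaturation_number(formula):
    """Rings plus pi bonds for a formula containing C, H, N, O, S, halogens."""
    counts = {}
    for element, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(digits) if digits else 1)

    carbons = counts.get("C", 0)
    hydrogens = counts.get("H", 0)
    nitrogens = counts.get("N", 0)
    halogens = sum(counts.get(x, 0) for x in ("F", "Cl", "Br", "I"))
    # Divalent atoms (O, S) do not change the count.
    return (2 * carbons + 2 + nitrogens - hydrogens - halogens) / 2


for mf in ("C10H18O", "C14H16N2O4", "C10H10N2O2"):
    print(mf, unsaturation_number(mf))
```

For C10H18O the result is 2, so any working structure must account for exactly two rings and/or multiple bonds.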
Structure Determination Steps
Strategy Based on C-H Connectivity
MF = C10H18O
1.) Obtain C/H pairs from HSQC (1JCH)
sp3: Me, Me, Me
sp3: CH2, CH2, CH2
sp3: CH; sp2: CH
sp3: C-OH; sp2: C
2.) Determine functional groups (NMR, IR etc)
3.) Generate substructures (1JCH & 3JHH)
4.) Assemble working structures (1JCH & 2-3JCH)
5.) Make structural proposal
[Structure diagrams: C-H connectivity fragments and the proposed structure]
Determining the MF
[Figure: UV and mass spectra, KSA1_P1-A-3_01_12578.d; axes: Intens. against Wavelength [nm] (UV) and m/z (MS)]
UV, 1.4 min: peaks labelled 224 and 272
+MS, 1.4 min: m/z 277.1190, 299.1007, 575.2114, 622.0290
+MS2 (277.1190), 18.6-46.6 eV: m/z 168.0808, 186.0915, 204.1023, 214.0867, 230.0815, 247.1082, 277.1188
+MS2 (299.1007), 18.9-47.3 eV: m/z 187.0944, 208.0738, 226.0847, 269.0916, 299.1000
Data: Kojo Acquah
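One way to read the +MS spectrum is to look for adduct pairs: an [M+H]+ and [M+Na]+ ion of the same compound are separated by the mass difference between Na+ and H+ (about 21.9819). The Python sketch below applies that check to the four m/z values from the slide. Treating 277.1190 and 299.1007 as such a pair is an interpretation, not something the slide states, and the 0.002 tolerance is an arbitrary choice.

```python
PROTON = 1.00727647       # mass of H+
SODIUM_ION = 22.98922070  # mass of Na+ (Na atom minus one electron)


def find_adduct_pairs(peaks, tol=0.002):
    """Find ([M+H]+, [M+Na]+, neutral mass) triples in a list of m/z values."""
    spacing = SODIUM_ION - PROTON
    ordered = sorted(peaks)
    pairs = []
    for i, low in enumerate(ordered):
        for high in ordered[i + 1:]:
            if abs((high - low) - spacing) <= tol:
                pairs.append((low, high, low - PROTON))
    return pairs


# m/z values read from the +MS spectrum on the slide.
ms1_peaks = [575.2114, 622.0290, 277.1190, 299.1007]
pairs = find_adduct_pairs(ms1_peaks)
for protonated, sodiated, neutral in pairs:
    print(protonated, sodiated, round(neutral, 4))
```

The neutral mass obtained this way (about 276.1117) is what would be submitted to a formula calculator or database search; it lies within about 3 ppm of the monoisotopic mass of C14H16N2O4, the formula given later in the deck.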
13C NMR Data
[13C NMR spectrum, f1 (ppm), 20 to 170 ppm; peaks labelled A to N]
Atom  13C/ppm
A     172.75
B     136.13
C     132.61
D     126.45
E     120.74
F     118.33
Data: Kojo Acquah
HSQC Data - C-H (1 bond)
[HSQC spectrum: KSA032_WB_SF8_G, HSQC, 303K, DMSO-d6; f1 (ppm); cross-peaks labelled with carbons E, F, G, H, I, N and protons e, f, n, n']
Red = CH/CH3, Blue = CH2
Atom  13C/ppm  Mult  1H/ppm
A     172.75   C
B     136.13   C
C     132.61   C
D     126.45   C
E     120.74   CH    7.06
F     118.33   CH    6.98
G     117.29   CH    7.42
H     111.15   CH    7.36
I     107.77   C
J     70.51    CH    4.18
K     62.94    CH2   3.60, 3.54
L     56.09    CH    3.62
M     54.65    CH    4.43
N     24.62    CH2   3.04, 2.69
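The HSQC multiplicities also allow a quick bookkeeping check against the molecular formula. The Python sketch below tallies the carbon-bound protons from the table for atoms A to N and compares them with C14H16N2O4, the formula given later in the deck; the variable names are illustrative.

```python
# Multiplicity of each carbon, A to N, as tabulated from the HSQC data.
MULTIPLICITY = {
    "A": "C", "B": "C", "C": "C", "D": "C",
    "E": "CH", "F": "CH", "G": "CH", "H": "CH",
    "I": "C", "J": "CH", "K": "CH2", "L": "CH",
    "M": "CH", "N": "CH2",
}
PROTONS_PER_TYPE = {"C": 0, "CH": 1, "CH2": 2, "CH3": 3}

TOTAL_H_IN_FORMULA = 16  # from C14H16N2O4

carbon_count = len(MULTIPLICITY)
carbon_bound_h = sum(PROTONS_PER_TYPE[m] for m in MULTIPLICITY.values())
exchangeable_h = TOTAL_H_IN_FORMULA - carbon_bound_h

print(carbon_count, carbon_bound_h, exchangeable_h)
```

The count gives 14 carbons and 11 carbon-bound protons, leaving 5 protons that must sit on heteroatoms (OH or NH) in a formula containing only C, H, N and O.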
1H NMR Data
[1H NMR spectrum, f1 (ppm); signals labelled NH, g, h, e, f, m, j, k, k', l, n, n']
Data: Kojo Acquah
COSY Data - H-C-H, H-C-C-H
[COSY spectrum: KSA032_WB_SF8_G, COSY, 303K, DMSO-d6; f1 and f2 (ppm), 2.4 to 4.6 ppm; labelled cross-peaks: j-m, k-j, n-n']
Atom  13C/ppm  Mult  1H/ppm      COSY
A     172.75   C
B     136.13   C
C     132.61   C
D     126.45   C
E     120.74   CH    7.06        f, h
F     118.33   CH    6.98        e, g
G     117.29   CH    7.42        f
H     111.15   CH    7.36        e
I     107.77   C
J     70.51    CH    4.18        m, k
K     62.94    CH2   3.60, 3.54  j
L     56.09    CH    3.62        n
M     54.65    CH    4.43        j, n
N     24.62    CH2   3.04, 2.69  l, m
Data: Kojo Acquah
HMBC Data - C-C-H, C-C-C-H
[HMBC spectrum: KSA032_WB_SF8_G, HMBC, 303K, DMSO-d6; labelled correlations: B-NH, C-NH, B-g, B-e, D-h, D-f, H-f]
Atom  13C/ppm  Mult  1H/ppm      COSY    HMBC
A     172.75   C                         l, n
B     136.13   C                         HN, g, e
C     132.61   C                         HN, m, j, n
D     126.45   C                         HN, h, f, n
E     120.74   CH    7.06        f, h    g, f
F     118.33   CH    6.98        e, g    h
G     117.29   CH    7.42        f       e, f
H     111.15   CH    7.36        e       f, g, e
I     107.77   C                         HN, g, m, n
J     70.51    CH    4.18        m, k    m, k
K     62.94    CH2   3.60, 3.54  j       j, m
L     56.09    CH    3.62        n       n
M     54.65    CH    4.43        j, n    l, k
N     24.62    CH2   3.04, 2.69  l, m    l
Data: Kojo Acquah
Substructures
C14H16N2O4
[Structure diagrams: substructures from COSY Data (protons e to n') and from HMBC Data (carbons A to N)]
Combining the Pieces
[Structure diagram: combined structure with carbons labelled A to N]
Computer Aided Structure Elucidation
HSQC analysis - assign C-H
Tabulate data
#  dC     dH
1  173.3  9.27
2  121.5  6.87
3  110.8  6.09
COSY/HMBC analysis
Generate connectivities
[Fragment diagrams] = MF (C10H10N2O2)
13C shifts: 173.3, 118.7, 92.2, 134.4, 124.7, 109.8, 110.8, 121.5, 59.0, 160.1
Structure generation
[Structure diagrams: three candidate structures & 386 others]
Calculate match with 13C shifts
[Structure diagram]
Correct structure
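The final "calculate match with 13C shifts" step can be sketched as a ranking of candidates by how closely their predicted shifts agree with the observed ones. In the Python sketch below the observed shifts are the ten values from the slide; the two candidates and their predicted shifts are hypothetical, and comparing sorted shift lists by mean absolute deviation is a simplifying assumption rather than the scoring used by any particular CASE program.

```python
def mean_abs_dev(observed, predicted):
    """Mean absolute deviation (ppm) between sorted shift lists."""
    if len(observed) != len(predicted):
        raise ValueError("shift lists must have the same length")
    paired = zip(sorted(observed), sorted(predicted))
    return sum(abs(o - p) for o, p in paired) / len(observed)


def rank_candidates(observed, candidates):
    """Candidate names ordered from best to worst match."""
    return sorted(candidates,
                  key=lambda name: mean_abs_dev(observed, candidates[name]))


# Observed 13C shifts (ppm) from the slide.
OBSERVED = [173.3, 118.7, 92.2, 134.4, 124.7, 109.8, 110.8, 121.5, 59.0, 160.1]

# Hypothetical predicted shifts for two generated structures.
CANDIDATES = {
    "candidate_a": [172.0, 119.5, 93.0, 133.8, 125.5,
                    110.5, 111.0, 120.9, 58.2, 161.0],
    "candidate_b": [168.0, 130.2, 101.5, 140.1, 128.3,
                    115.0, 105.2, 126.8, 55.1, 150.4],
}

ranking = rank_candidates(OBSERVED, CANDIDATES)
print(ranking)
```

With several hundred generated structures, the same ranking would surface the few candidates worth checking against the COSY and HMBC correlations by hand.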
The Determinator
• Fast
• Easy to use
• Generally applicable
• Reliable
• Inexpensive
Solving Structures Using Atomic Force Microscopy
Nat. Chem. 2010
The Determinator
AFM of Unknown
Applying the Determinator to Breitfussin A
Prerequisite
C15H8N3O1
Br I OMe
[Figure slides: Applying the Determinator to Breitfussin A, with Br and I positions labelled]
“If you have a strange substance and you want to know what it is, you go through a long and complicated process of chemical analysis. ……. It would be very easy to make an analysis of any complicated chemical substance; all one would have to do would be to look at it and see where the atoms are.”
Plenty of Room at the Bottom Richard P. Feynman
December 1959