dream olfaction challenge poster
TRANSCRIPT
[Figure: Pleasantness vs. Fitted Pleasantness (R = 0.493, P < 0.001)]
[Figure: Pleasantness vs. Complexity]
[Figure: Kermen Model Prediction; axis: Complexity]
Previous Models
Challenge Overview
[1]
Odor Intensity
Boelens model
Predicting olfactory perception from chemical structure
Chung Wen Yu1, Yusuke Ihara1,2, Joel D. Mainland1,3
1Monell Chemical Senses Center, Philadelphia, PA
2Institute for Innovation, Ajinomoto Co., Inc., Kawasaki, Japan
3University of Pennsylvania, Philadelphia, PA
• Goal: Predict olfactory perception using chemical structure
• Data: 49 human subjects rated the perceptual features of 476 odorants
• Subchallenge 1: Predict the ratings of every subject
• Subchallenge 2: Predict the mean and standard deviation of ratings across subjects
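As a rough illustration of the Subchallenge 2 targets, the sketch below computes the mean and standard deviation of ratings across subjects for each odorant. The odorant names and rating values are made up, and the use of the sample (n - 1) standard deviation is an assumption; the poster does not say which estimator was used.

```python
from statistics import mean, stdev

# Hypothetical toy data: ratings[odorant] lists one rating per subject.
# Values are invented for illustration only.
ratings = {
    "odorant_A": [10.0, 20.0, 30.0],
    "odorant_B": [50.0, 50.0, 50.0],
}

def summarize(ratings_by_odorant):
    """Return {odorant: (mean, sample standard deviation)} across subjects."""
    return {
        name: (mean(values), stdev(values))
        for name, values in ratings_by_odorant.items()
    }

targets = summarize(ratings)
for name, (mu, sigma) in targets.items():
    print(f"{name}: mean={mu:.1f}, sd={sigma:.1f}")
```

Subchallenge 1 would instead keep each subject's rating as its own prediction target.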
• Khan et al. (2007) predicted the pleasantness of odorants using a linear model with the first seven principal components of physicochemical space [1].
• This model had similar performance on the DREAM challenge dataset.
• Kermen et al. [2] found that molecular complexity (a combination of size and symmetry) predicted odor pleasantness.
• On the DREAM challenge dataset, this model does not perform as well as Khan's model.
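To make the comparison of these models concrete, here is a minimal sketch of how a single-predictor model (such as complexity predicting pleasantness) can be fitted by least squares and scored with a Pearson correlation, the R statistic reported on the poster. The numbers are invented, the fit is in-sample, and the helper names `fit_line` and `pearson_r` are hypothetical; this is not the authors' code.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx

# Made-up values, for illustration only.
complexity = [50.0, 120.0, 200.0, 310.0, 420.0]
pleasantness = [35.0, 42.0, 40.0, 55.0, 58.0]

slope, intercept = fit_line(complexity, pleasantness)
predicted = [slope * x + intercept for x in complexity]
r = pearson_r(predicted, pleasantness)
print(f"slope={slope:.3f}, intercept={intercept:.2f}, r={r:.3f}")
```

With one predictor and a positive slope, the correlation between predicted and observed values equals the correlation between the predictor and the observed values. A seven-component model like Khan's would need a multivariate fit instead.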
Khan et al., 2007
Kermen et al., 2011
[Figure panel titles: Khan et al. (2007) Figure 5C; Khan Model Prediction; Complexity Model (Kermen et al., 2011). Axis: Pleasantness. Correlations shown: R = 0.304, P < 0.03; R = 0.286, P < 0.001]
• The intensity model relies on structural features to make predictions.
• Boelens' model predicts whether or not a molecule has an odor based on its volatility and lipophilicity [4].
• Previous studies predicting olfactory thresholds modeled air-to-receptor transport [5,6].
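The idea of deciding odorous versus odorless from volatility and lipophilicity can be sketched as a simple window in boiling point and logP. Everything specific here is an assumption: the rectangular window, the threshold values, and the example molecules are placeholders chosen for illustration and are not Boelens' published boundaries.

```python
from dataclasses import dataclass

@dataclass
class Molecule:
    name: str
    boiling_point_c: float  # proxy for volatility
    log_p: float            # proxy for lipophilicity

# Hypothetical window, NOT taken from Boelens (1983).
BP_RANGE_C = (-50.0, 300.0)
LOGP_RANGE = (-1.0, 6.0)

def predicted_odorous(m, bp_range=BP_RANGE_C, logp_range=LOGP_RANGE):
    """True if the molecule falls inside both the volatility and lipophilicity windows."""
    in_bp = bp_range[0] <= m.boiling_point_c <= bp_range[1]
    in_logp = logp_range[0] <= m.log_p <= logp_range[1]
    return in_bp and in_logp

# Invented example points, one per region of the plot.
examples = [
    Molecule("too_volatile", -160.0, 1.0),
    Molecule("too_hydrophilic", 290.0, -3.0),
    Molecule("inside_window", 150.0, 2.0),
]
labels = {m.name: predicted_odorous(m) for m in examples}
print(labels)
```

A real boundary need not be rectangular; the window is only the simplest shape that uses both properties.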
[Figure: logP vs. Boiling Point (°C). Labeled compounds: ethylene glycol, water, glycerin, sorbitol, maltitol, caffeine, L-Arginine, L-Histidine, TNT, ethyl salicylate, ethane, propane, butane, methane, krypton, acetone, ethyl mercaptan, ethanol, carbon monoxide, hexane, pentane. Group label: Alkanes]
[Figure: r value per descriptor; legend: CV, LB. Descriptors: INTENSITY, SWEET, VALENCE, FRUIT, CHEMICAL, BAKERY, GARLIC, DECAYED, BURNT, SOUR, FLOWER, SWEATY, ACID, MUSKY, FISH, COLD, SPICES, AMMONIA, WOOD, GRASS, WARM]
[Figure: LB r value - CV r value per descriptor. Descriptors: INTENSITY, SWEET, VALENCE, FRUIT, CHEMICAL, BAKERY, GARLIC, DECAYED, BURNT, SOUR, FLOWER, SWEATY, ACID, MUSKY, FISH, COLD, SPICES, AMMONIA, WOOD, GRASS, WARM]
Generate Predictive models
• Features: we used physicochemical descriptors [3], Morgan fingerprints, and NSPDK fingerprints.
• Initial data cleaning: we removed non-informative variables, and performed cube-root transformation and normalization.
• Model building: we built predictive models using the Extra-Trees algorithm.
• Cross-validation: we performed 5-fold CV, repeated twice.
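The cleaning and cross-validation steps above can be sketched with the standard library alone. Several details are assumptions, since the poster does not specify them: "non-informative" is taken to mean constant columns, "normalization" is taken to mean z-scoring, and the function names are hypothetical. The Extra-Trees model itself is not shown; each train/test split would be handed to whatever regressor is used.

```python
import math
import random
from statistics import mean, pstdev

def cube_root(x):
    """Signed cube root, so negative descriptor values are handled."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def preprocess(columns):
    """columns maps feature name -> list of values, one per molecule."""
    cleaned = {}
    for name, values in columns.items():
        if len(set(values)) <= 1:  # assumption: constant column = non-informative
            continue
        transformed = [cube_root(v) for v in values]
        mu, sd = mean(transformed), pstdev(transformed)
        cleaned[name] = [(v - mu) / sd for v in transformed]  # assumption: z-score
    return cleaned

def repeated_kfold(n_samples, n_splits=5, n_repeats=2, seed=0):
    """Yield (train_indices, test_indices) for k-fold CV repeated with reshuffling."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        folds = [indices[i::n_splits] for i in range(n_splits)]
        for k, test in enumerate(folds):
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train, test

# Made-up descriptor table for illustration.
demo = preprocess({
    "constant": [1.0, 1.0, 1.0, 1.0],
    "size": [0.0, 8.0, 27.0, -8.0],
})
splits = list(repeated_kfold(n_samples=20))
print(sorted(demo), len(splits))
```

Five folds repeated twice gives ten train/test splits, and every molecule is held out exactly once per repeat.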
[Figure: Importance of Similarity Descriptors, per descriptor. Descriptors: BURNT, BAKERY, WOOD, WARM, SPICES, FRUIT, GRASS, CHEMICAL, SWEATY, SOUR, SWEET, ACID, AMMONIA, COLD, MUSKY, GARLIC, INTENSITY, FISH, FLOWER, DECAYED, VALENCE]
Similarity Features
Physicochemical Features
[Figure: Variable Importance Score. Variables: X1v, X0sol, VvdwMG, Vx, MW, X1sol, ATS1p, X0v, AMR, Dilution]
[Figure: r value vs. Number of Variables (ticks at 0, 50, 100, 150, 200, and 6823). Series: CV, LB, Test-Retest. Annotated value: 0.76]
[Figure: Musks. Labeled compounds: Moskone, Musk ketone, Musk xylene, Musk ambrette. Panel labels: A to O]
[Figure: Variable Importance Score. Variables: WiA_B.m., NSPDK155708, SpMaxA_B.m., Ho_Dz.Z., Ho_Dz.m., NSPDK8767, NSPDK8768, NSPDK56642963, SpMax7_Bh.s., NSPDK61229]
Odor Pleasantness
NSPDK61229: Acetylvanillin
NSPDK155708: Ethyl vanillin acetate
2D Structural Matrix: atomic mass, polarizability, charges, etc.
[Figure: r value vs. Number of Variables (ticks at 0, 50, 100, 150, 200, and 6823). Series: CV, LB, Test-Retest. Annotated value: 0.71]
• 20 molecules were rated twice by each subject, allowing us to calculate the test-retest correlation.
• Test-retest sets the ceiling for predictive models.
• Averaged subject ratings are more reliable than individual subject ratings.
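A test-retest correlation of this kind can be sketched as the Pearson correlation between the first and second ratings of the repeated molecules, computed either per subject or on ratings averaged across subjects. The subject IDs and rating values below are invented, and only four repeated molecules are shown for brevity; this is an illustration, not the authors' analysis.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up ratings: one list per subject, one entry per repeated molecule.
first = {"s1": [10.0, 50.0, 30.0, 80.0], "s2": [5.0, 60.0, 25.0, 90.0]}
second = {"s1": [20.0, 40.0, 35.0, 70.0], "s2": [15.0, 55.0, 20.0, 85.0]}

# Individual reliability: one correlation per subject.
per_subject = {s: pearson_r(first[s], second[s]) for s in first}

# Group reliability: average across subjects first, then correlate.
avg_first = [mean(col) for col in zip(*first.values())]
avg_second = [mean(col) for col in zip(*second.values())]
group_r = pearson_r(avg_first, avg_second)

print(per_subject, round(group_r, 3))
```

Averaging before correlating cancels some subject-specific noise, which is one way to see why averaged ratings tend to be more reliable than individual ones.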
[Figure: r value per descriptor. Series: Test-Retest, Subchallenge 1, Subchallenge 2, Best DREAM model Subchallenge 2. Descriptors: FRUIT, SWEET, VALENCE, GARLIC, INTENSITY, BAKERY, DECAYED, CHEMICAL, AMMONIA, FLOWER, SOUR, ACID, SPICES, FISH, BURNT, MUSKY, WOOD, COLD, WARM, SWEATY, GRASS]
Test-Retest
1. Khan M.K., Luk C., Flinker A., Aggarwal A., Lapid H., Haddad R., Sobel N. (2007). Predicting Odor Pleasantness from Odorant Structure: Pleasantness as a Reflection of the Physical World. The Journal of Neuroscience. 27(37):10015-10023.
2. Kermen F., Chakirian A., Sezille C., Joussain P., Le Goff G., Ziessel A., Chastrette M., Mandairon N., Didier A., Rouby C. & Bensafi M. (2011). Molecular complexity determines the number of olfactory notes and the pleasantness of smells. Scientific Reports. DOI: 10.1038/srep00206
3. Talete srl, DRAGON (Software for Molecular Descriptor Calculation) Version 6.0 - 2012 - http://www.talete.mi.it/
4. Boelens H. (1983). Structure-activity relationships in chemoreception by human olfaction. TIPS. 421-426.
5. Abraham MH., Sanchez-Moreno R., Cometto-Muniz JE., Cain WS. (2012). An Algorithm for 353 Odor Detection Thresholds in Humans. Chem. Senses 37: 207-218.
6. Hau KM., Connell DW. (1998). Quantitative Structure-Activity Relationships (QSARs) for Odor Thresholds of Volatile Organic Compounds (VOCs). Indoor Air 8: 23-33.
Bakery - Vanillin
NSPDK539829: Vanillin Isobutyrate
Morgan8467: Ethylvanillin
NSPDK1183: Vanillin
Burnt - Furfuryl
NSPDK61660: S-Furfuryl thioacetate (sulfurous)
NSPDK62131: Furfuryl Methyl Disulfide (alliaceous)
NSPDK7363: Furfuryl Mercaptan (coffee)
Decayed - Thiol
NSPDK62444: S-Methyl thiobutyrate (cheesy)
NSPDK13582: 4-Butyrothiolactone (garlic)
NSPDK7969: Benzenethiol (meaty)
Chemical - Phenol
morgan6997: 2-Ethylphenol
morgan18515: 2-Butylphenol
morgan6943: 2-Isopropylphenol
Conclusion
• Averaged subject ratings are reliable for both odor intensity and pleasantness. Most of the perceptual descriptors show large within-subject and across-subject variance.
• The intensity model performed comparably to the test-retest correlation, while the pleasantness model performed worse than the test-retest correlation.
• Models for the 19 descriptors performed worse than the intensity and pleasantness models.
• Some models rely on physicochemical features, whereas others use molecular template-matching.
Literature Cited
Predicting Odor from Structure