dream olfaction challenge poster


Upload: wendy-yu

Post on 14-Apr-2017



Predicting olfactory perception from chemical structure

Chung Wen Yu (1), Yusuke Ihara (1,2), Joel D. Mainland (1,3)

(1) Monell Chemical Senses Center, Philadelphia, PA
(2) Institute for Innovation, Ajinomoto Co., Inc., Kawasaki, Japan
(3) University of Pennsylvania, Philadelphia, PA

Challenge Overview

• Goal: Predict olfactory perception using chemical structure
• Data: 49 human subjects rated the perceptual features of 476 odorants
• Subchallenge 1: Predict the ratings of every subject
• Subchallenge 2: Predict the mean and standard deviation of ratings across subjects

Previous Models

Khan et al., 2007

• Khan et al., 2007 predicted the pleasantness of odorants using a linear model with the first seven principal components of physicochemical space [1].
• This model had similar performance on the DREAM challenge dataset.

Kermen et al., 2011

• Kermen’s model [2] found that molecular complexity (a combination of size and symmetry) predicted odor pleasantness.
• This model does not perform as well as Khan’s model.

[Figure panels: Khan et al. (2007) Figure 5C [1]; Khan Model Prediction; Complexity Model (Kermen et al., 2011); Kermen Model Prediction. Axes: Pleasantness against Fitted Pleasantness or Complexity. Correlations printed on the panels: R = 0.493, P < 0.001; R = 0.304, P < 0.03; R = 0.286, P < 0.001.]

Odor Intensity

• The intensity model relies on structural features to make predictions.
• Boelens model predicts whether or not molecules have an odor based on their volatility and lipophilicity [4].
• Previous studies predicting olfactory thresholds modeled air to receptor transport [5,6].

Boelens model

[Figure: logP plotted against boiling point (°C). Labeled compounds: ethylene glycol, water, glycerin, sorbitol, maltitol, caffeine, L-Arginine, L-Histidine, TNT, ethyl salicylate, Krypton, acetone, ethyl mercaptan, ethanol, carbon monoxide, and the alkanes methane, ethane, propane, butane, pentane, and hexane.]

[Figure: r value per descriptor for CV and LB. Descriptors: INTENSITY, SWEET, VALENCE, FRUIT, CHEMICAL, BAKERY, GARLIC, DECAYED, BURNT, SOUR, FLOWER, SWEATY, ACID, MUSKY, FISH, COLD, SPICES, AMMONIA, WOOD, GRASS, WARM.]

[Figure: LB r value - CV r value for the same descriptors.]

Generate Predictive Models

• Features: we used physicochemical descriptors [3], Morgan fingerprints, and NSPDK fingerprints.
• Initial data cleaning: we removed non-informative variables, and performed cube-root transformation and normalization.
• Model building: we built predictive models using the Extra-Trees algorithm.
• Cross-validation: we performed 5-fold CV, repeated twice.
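The data cleaning and cross-validation steps listed above can be sketched in plain Python. This is only an illustration under stated assumptions: the poster does not say how non-informative variables were identified (constant columns are used here), which values the cube root was applied to, or which normalization was used (z-scoring here). All function names and the toy matrix are hypothetical, and the Extra-Trees model itself is not included.

```python
import math
import random
from statistics import mean, pstdev


def signed_cube_root(x):
    """Cube root that keeps the sign, so negative values are handled."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def drop_constant_columns(rows):
    """Drop columns that never vary (an assumed notion of 'non-informative')."""
    keep = [j for j in range(len(rows[0])) if len({r[j] for r in rows}) > 1]
    return [[r[j] for j in keep] for r in rows], keep


def normalize_columns(rows):
    """Z-score each column; assumes constant columns were already removed."""
    stats = [(mean(c), pstdev(c)) for c in zip(*rows)]
    return [[(v - m) / s for v, (m, s) in zip(r, stats)] for r in rows]


def repeated_kfold(n_samples, n_splits=5, n_repeats=2, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold CV repeated n_repeats times."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh shuffle for every repeat
        for k in range(n_splits):
            test = sorted(order[k::n_splits])
            held_out = set(test)
            train = [i for i in range(n_samples) if i not in held_out]
            yield train, test


# Toy descriptor matrix: 3 molecules x 3 descriptors (made-up numbers).
raw = [[8.0, 1.0, -27.0],
       [64.0, 1.0, 0.0],
       [0.0, 1.0, 27.0]]
informative, kept = drop_constant_columns(raw)
transformed = [[signed_cube_root(v) for v in row] for row in informative]
features = normalize_columns(transformed)

# 5 folds x 2 repeats gives 10 train/test splits.
splits = list(repeated_kfold(n_samples=10))
```

In a full pipeline, a model would be fit on each `train` index list and scored on the matching `test` list, for example with the correlation between predicted and observed ratings.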

[Figure: Importance of Similarity Descriptors, by descriptor (BURNT, BAKERY, WOOD, WARM, SPICES, FRUIT, GRASS, CHEMICAL, SWEATY, SOUR, SWEET, ACID, AMMONIA, COLD, MUSKY, GARLIC, INTENSITY, FISH, FLOWER, DECAYED, VALENCE). Legend: Similarity Features, Physicochemical Features.]

[Figure: Variable Importance Score (0.0 to 0.2) for the variables X1v, X0sol, VvdwMG, Vx, MW, X1sol, ATS1p, X0v, AMR, and Dilution.]

[Figure: r value against Number of Variables (0 to 200, and all 6823) for CV and LB. Test-Retest: 0.76.]

[Figure: Musks. Structures shown: Moskone, Musk ketone, Musk xylene, Musk ambrette. Panels labeled A to O.]

Odor Pleasantness

[Figure: Variable Importance Score (−0.1 to 0.1) for the variables WiA_B.m., NSPDK155708, SpMaxA_B.m., Ho_Dz.Z., Ho_Dz.m., NSPDK8767, NSPDK8768, NSPDK56642963, SpMax7_Bh.s., and NSPDK61229. Annotated features: NSPDK61229: Acetylvanillin; NSPDK155708: Ethyl vanillin acetate. Panel note: 2D Structural Matrix + atomic mass, polarizability, charges, etc.]

[Figure: r value against Number of Variables (0 to 200, and all 6823) for CV and LB. Test-Retest: 0.71.]

• 20 molecules were rated twice by each subject, allowing us to calculate the test-retest correlation.
• Test-retest sets the ceiling for predictive models.
• Averaged subject ratings are more reliable than individual subject ratings.
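The test-retest calculation described above can be sketched as a Pearson correlation between the first and second ratings of the repeated molecules, computed per subject and again on the subject-averaged ratings. This is a minimal sketch, not the authors' code: the function names are hypothetical, and the ratings are made-up numbers chosen so that individual noise cancels in the average.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length rating lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def retest_r_per_subject(first, second):
    """Correlate each subject's first and second ratings of the repeats."""
    return {s: pearson_r(first[s], second[s]) for s in first}


def retest_r_of_mean(first, second):
    """Average across subjects first, then correlate the two averages."""
    subjects = list(first)
    n = len(first[subjects[0]])
    avg1 = [mean(first[s][i] for s in subjects) for i in range(n)]
    avg2 = [mean(second[s][i] for s in subjects) for i in range(n)]
    return pearson_r(avg1, avg2)


# Made-up ratings: 2 subjects x 4 repeated molecules, two sessions.
first = {"s1": [10, 30, 50, 70], "s2": [20, 40, 40, 80]}
second = {"s1": [20, 20, 60, 60], "s2": [10, 50, 30, 90]}

per_subject = retest_r_per_subject(first, second)
averaged = retest_r_of_mean(first, second)
```

With this toy data each subject's own correlation is below 1, while the correlation of the averaged ratings is essentially 1, which mirrors the claim that averaged ratings are more reliable. Real data would not cancel this cleanly.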

[Figure: Test-Retest. r value per descriptor (FRUIT, SWEET, VALENCE, GARLIC, INTENSITY, BAKERY, DECAYED, CHEMICAL, AMMONIA, FLOWER, SOUR, ACID, SPICES, FISH, BURNT, MUSKY, WOOD, COLD, WARM, SWEATY, GRASS). Series: Test-Retest, Subchallenge 1, Subchallenge 2, Best DREAM model Subchallenge 2.]

Predicting Odor from Structure

[Figure: fingerprint features and the molecules they match, grouped by descriptor.]

• Bakery - Vanillin: NSPDK539829: Vanillin Isobutyrate; Morgan8467: Ethylvanillin; NSPDK1183: Vanillin
• Burnt - Furfuryl: NSPDK61660: S-Furfuryl thioacetate (sulfurous); NSPDK62131: Furfuryl Methyl Disulfide (alliaceous); NSPDK7363: Furfuryl Mercaptan (coffee)
• Decayed - Thiol: NSPDK62444: S-Methyl thiobutyrate (cheesy); NSPDK13582: 4-Butyrothiolactone (garlic); NSPDK7969: Benzenethiol (meaty)
• Chemical - Phenol: morgan6997: 2-Ethylphenol; morgan18515: 2-Butylphenol; morgan6943: 2-Isopropylphenol

Conclusion

• Averaged subject ratings are reliable for both odor intensity and pleasantness. Most of the perceptual descriptors show large within- and across-subject variance.
• The intensity model performed comparably to the test-retest correlation, while the pleasantness model performed worse than the test-retest correlation.
• Models for the 19 descriptors performed worse than the intensity and pleasantness models.
• Some models rely on physicochemical features, whereas others use molecular template-matching.

Literature Cited

1. Khan M.K., Luk C., Flinker A., Aggarwal A., Lapid H., Haddad R., Sobel N. (2007). Predicting Odor Pleasantness from Odorant Structure: Pleasantness as a Reflection of the Physical World. The Journal of Neuroscience. 27(37):10015-10023.
2. Kermen F., Chakirian A., Sezille C., Joussain P., Le Goff G., Ziessel A., Chastrette M., Mandairon N., Didier A., Rouby C. & Bensafi M. (2011). Molecular complexity determines the number of olfactory notes and the pleasantness of smells. Scientific Reports. DOI: 10.1038/srep00206
3. Talete srl, DRAGON (Software for Molecular Descriptor Calculation) Version 6.0 - 2012 - http://www.talete.mi.it/
4. Boelens H. (1983). Structure-activity relationships in chemoreception by human olfaction. TIPS. 421-426.
5. Abraham MH., Sanchez-Moreno R., Cometto-Muniz JE., Cain WS. (2012). An Algorithm for 353 Odor Detection Thresholds in Humans. Chem. Senses 37: 207-218.
6. Hau KM., and Connell DW. (1998). Quantitative Structure-Activity Relationships (QSARs) for Odor Thresholds of Volatile Organic Compounds (VOCs). Indoor Air 8: 23–33.