Prediction of NMR Chemical Shifts. A Chemometrical Approach. K.A. Blinov, Y.D. Smurnyy, T.S. Churanova, M.E. Elyashberg. Advanced Chemistry Development (ACD)

Post on 14-Jan-2016

Page 1

Prediction of NMR Chemical Shifts.

A Chemometrical Approach

K.A. Blinov, Y.D. Smurnyy, T.S. Churanova, M.E. Elyashberg

Advanced Chemistry Development (ACD)

Page 2

Structure and its spectral data

[Figure: a structure and its spectra. Panels: COSY (F2 vs. F1 chemical shift, ppm); HMQC (F2 vs. F1 chemical shift, ppm); 13C spectrum (normalized intensity vs. chemical shift, ppm) with peak labels 26.85, 31.61, 42.46, 42.86, 48.22, 50.32, 51.94, 52.67, 60.10, 60.18, 64.59, 76.78, 77.03, 77.29, 77.60; 1H spectrum (normalized intensity vs. chemical shift, ppm).]

Page 3

Sometimes the solution is not obvious

• In many cases we obtain several structures that are consistent with the spectral data.

• In this case we need a method to rank the structures.

• The most powerful method: compare the experimental and predicted 13C NMR spectra.

Page 4

13C NMR spectral data

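[Figure: two candidate structures compared by their experimental and predicted 13C spectra; the values 2.00 and 9.62 are shown with the structures.]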

Page 5

How to find the best structure?

• In most cases the predicted spectrum of the “correct structure” fits the experimental spectrum best.

• In practice, the “correct structure” has an average deviation of 2-3 ppm between the predicted and experimental spectra.
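
The ranking step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the candidate names, and all shift values are made up, and pairing the shifts by sorting both lists is a simplifying assumption (the real peak-to-atom assignment is not given in the slides).

```python
def average_deviation(experimental, predicted):
    """Mean absolute difference (ppm) between two shift lists of equal length.

    Assumption: shifts are paired by sorting both lists, because the
    assignment of peaks to atoms is not known in advance.
    """
    if len(experimental) != len(predicted):
        raise ValueError("spectra must contain the same number of shifts")
    pairs = zip(sorted(experimental), sorted(predicted))
    return sum(abs(e - p) for e, p in pairs) / len(experimental)


def rank_candidates(experimental, candidates):
    """Return (name, deviation) tuples, best fit first."""
    scored = [(name, average_deviation(experimental, shifts))
              for name, shifts in candidates.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical data, for illustration only.
experimental = [26.9, 42.5, 60.1, 77.0]
candidates = {
    "candidate_A": [27.5, 41.0, 62.0, 78.5],
    "candidate_B": [35.0, 50.0, 70.0, 90.0],
}
ranking = rank_candidates(experimental, candidates)
for name, deviation in ranking:
    print(f"{name}: {deviation:.2f} ppm")
```

With this made-up input, the candidate whose deviation falls in the 2-3 ppm range or below would be ranked first.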

Page 6

The role of the spectra prediction

• Real-world task: an unknown structure with molecular formula C29H32N2O5 and spectral data (1D and 2D NMR).

• 20 min to generate all structures (> 12 000)

• 24 hours to predict the 13C NMR spectra of all the obtained structures

• The speed of spectra prediction should be increased

Page 7

Methods of the prediction of NMR spectra

• Quantum Mechanics: extremely slow

• Database approach: accurate but slow
– HOSE Codes
– Maximum Common Substructure

• Rule-based: fast but inaccurate
– Additive scheme
– Neural Networks

• Our choice: improve the accuracy of a fast method

Page 8

Additive scheme

$\delta = \sum_i a_i x_i$

[Figure: example structure (atoms C, O, CH3, C, CH2, CH, CH2, CH2, CH2) with a value shown at each atom.]

153.71 - 1.85 - 4.49 - 1.39 - 2.79 + 1.43 + 0.52 + 0.52 - 1.35 = 144.31

Main problem: find the correct values of the atom increments
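
A minimal sketch of the additive calculation, using the numbers from this slide. The variable names are placeholders (the slides do not say which increment belongs to which atom), and treating the two equal 0.52 terms as one variable that occurs twice is an assumption made only to show the role of the counts x_i.

```python
def additive_shift(increments, counts):
    """Chemical shift as the sum of a_i * x_i over all input variables."""
    return sum(increments[name] * counts[name] for name in increments)


# a_i: values taken from the slide; the names are hypothetical labels.
increments = {
    "base": 153.71,
    "var_1": -1.85,
    "var_2": -4.49,
    "var_3": -1.39,
    "var_4": -2.79,
    "var_5": 1.43,
    "var_6": 0.52,   # appears twice in the slide's sum
    "var_7": -1.35,
}

# x_i: how many times each variable occurs around the atom of interest.
counts = {name: 1 for name in increments}
counts["var_6"] = 2

shift = additive_shift(increments, counts)
print(f"{shift:.2f} ppm")
```

The prediction itself is a single dot product, which is why the scheme is fast; all the difficulty lies in obtaining good values of a_i.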

Page 9

Available data

• We have a database of 1.5 million 13C chemical shifts.

• We can try to obtain the correct increment values from it!

Page 10

How to encode atom environment

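Input variables: the number of atoms of each type in each sphere.

[Figure: example structure (atoms C, O, CH3, C, CH2, CH, CH2, CH2, CH2) with the 1st and 2nd spheres marked.]

Sphere       Atom type   Number of atoms
1st sphere   CH2         1
1st sphere   CH          1
1st sphere   C           1
2nd sphere   CH2         2
2nd sphere   CH3         1
2nd sphere   O           1
…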
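
The sphere-based encoding can be sketched as a breadth-first walk over the molecular graph. The graph below is a hypothetical toy fragment, chosen only so that it gives the counts that appear to be shown on the slide (one CH2, one CH and one C in the 1st sphere; two CH2, one CH3 and one O in the 2nd); the atom identifiers and the function name are illustrative.

```python
from collections import Counter, deque


def sphere_counts(labels, bonds, center, max_sphere):
    """Count atoms of each type at each topological distance from `center`.

    Returns a Counter keyed by (sphere, atom_type).
    """
    adjacency = {atom: [] for atom in labels}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)

    # Breadth-first search gives the number of bonds to the central atom.
    distance = {center: 0}
    queue = deque([center])
    while queue:
        atom = queue.popleft()
        if distance[atom] == max_sphere:
            continue  # do not expand beyond the last sphere
        for neighbor in adjacency[atom]:
            if neighbor not in distance:
                distance[neighbor] = distance[atom] + 1
                queue.append(neighbor)

    counts = Counter()
    for atom, d in distance.items():
        if d > 0:
            counts[(d, labels[atom])] += 1
    return counts


# Hypothetical fragment: c0 is the atom whose shift is predicted.
labels = {"c0": "C", "n1": "CH2", "n2": "CH", "n3": "C",
          "n4": "CH2", "n5": "CH2", "n6": "CH3", "n7": "O"}
bonds = [("c0", "n1"), ("c0", "n2"), ("c0", "n3"),
         ("n1", "n4"), ("n2", "n5"), ("n3", "n6"), ("n3", "n7")]

x = sphere_counts(labels, bonds, center="c0", max_sphere=2)
for key in sorted(x):
    print(key, x[key])
```

Each (sphere, atom type) key corresponds to one input variable x_i, and the count is its value.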

Page 11

Data for PLS regression

[Figure: data matrices for the regression. X: atom environment encoding, one row per sample. Y: chemical shifts.]
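
To make the regression step concrete, here is a small pure-Python sketch of PLS with a single response (the PLS1/NIPALS variant). It is an illustration under assumptions, not the authors' code: the slides do not say which PLS variant or software was used, and the data below are synthetic (shifts generated exactly from made-up increments), so the model can reproduce them.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fit_pls1(X, y, n_components):
    """Fit PLS1 by NIPALS. X: list of rows (samples), y: list of shifts."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # X residual
    f = [value - y_mean for value in y]                        # y residual
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X space most covariant with y.
        w = [dot([E[i][j] for i in range(n)], f) for j in range(m)]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:
            break  # nothing left to explain
        w = [value / norm for value in w]
        t = [dot(row, w) for row in E]                         # scores
        tt = dot(t, t)
        p = [dot([E[i][j] for i in range(n)], t) / tt for j in range(m)]
        q = dot(f, t) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def predict_pls1(model, x):
    """Predict one shift from one encoded atom environment."""
    x_mean, y_mean, components = model
    e = [a - b for a, b in zip(x, x_mean)]
    shift = y_mean
    for w, p, q in components:
        t = dot(e, w)
        shift += q * t
        e = [a - t * b for a, b in zip(e, p)]
    return shift


# Synthetic example: three count variables, made-up increments.
true_increments = [-1.85, -4.49, 1.43]
X = [[1, 0, 2], [0, 1, 1], [2, 1, 0], [1, 2, 1], [0, 0, 1], [3, 1, 2]]
y = [150.0 + dot(true_increments, row) for row in X]

model = fit_pls1(X, y, n_components=3)
predicted = predict_pls1(model, [2, 2, 2])
print(f"{predicted:.2f} ppm")
```

With as many components as independent variables, PLS reproduces the ordinary least-squares fit; using fewer components is what makes it useful when the count variables are strongly correlated.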

Page 12

Find best structure encoding

• Initially, the best scheme of structure representation is not evident

• We should find the scheme that gives the best accuracy

• We should optimize
– the substituent coding scheme
– the number of “spheres” used

Page 13

Data used

• 210 K chemical shifts were used as the training set.

• 170 K chemical shifts from recent literature were used as an external validation set.

Page 14

How to describe atom type

• Atom type (C, O, etc.).

• Hybridization (sp3, sp2, etc).

• Valence

• Number of neighbor H.

• Charge

• Distance to “central” atom (bonds)

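[Figure: fragment with atoms CH3, CH, CH, NH2; one atom is marked as the “central” atom and one as the “substituent”. Descriptor values shown for the substituent, in the order of the list above: 7 (N), 1 (sp3), 3, 2, 0, 3.]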
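
One way to hold such a description in code is a small immutable record whose fields can be combined into a variable key at different levels of detail. This is a sketch under assumptions: the field names are invented, the NH2 values (7, 1, 3, 2, 0, 3) are read from the slide figure, and the mapping of detail levels to fields ("atoms only" meaning distance alone, and so on) is an interpretation rather than something the slides state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomDescriptor:
    element: int        # atomic number, e.g. 7 for N
    hybridization: int  # code as on the slide: 1 = sp3
    valence: int
    hydrogens: int      # number of neighbor H
    charge: int
    distance: int       # bonds to the "central" atom


# Assumed reading of the encoding levels compared in the slides.
LEVELS = {
    "atoms only": ("distance",),
    "+ element type": ("distance", "element"),
    "+ hybridization": ("distance", "element", "hybridization"),
    "+ all other": ("distance", "element", "hybridization",
                    "valence", "hydrogens", "charge"),
}


def variable_key(atom, level):
    """Key identifying the input variable this substituent contributes to."""
    return tuple(getattr(atom, field) for field in LEVELS[level])


nh2 = AtomDescriptor(element=7, hybridization=1, valence=3,
                     hydrogens=2, charge=0, distance=3)

for level in LEVELS:
    print(level, variable_key(nh2, level))
```

A more detailed key splits the substituents into more distinct variables, which is the trade-off explored when comparing encodings.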

Page 15

Result for different atom encoding

[Bar chart, values in ppm]

Encoding          Average deviation   Standard deviation
Atoms only        7.17                10.96
+ Element type    5.36                8.76
+ Hybridization   4.39                6.57
+ All other       3.52                5.37

Page 16

Result for number of spheres

[Bar chart, values in ppm]

Number of “spheres”   Average deviation   Standard deviation
1                     5.43                7.69
2                     3.97                5.88
3                     3.66                5.51
4                     3.52                5.37
5                     3.51                5.37
6                     3.53                5.40

Page 17

Is it the best possible accuracy?

• The best average deviation reached so far is 3.5 ppm.

• We need less than 3 ppm (2 is preferable).

• Should we use additional variables?

• We should be very careful when adding variables.

Page 18

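Substituent interference (cross effect)

[Figure: 13C shifts (ppm) of the substituted carbon in CH2=CXY.]

X, Y = H, H: 122.90
X, Y = H, CH3: 134.16 (+11.26)
X, Y = H, Cl: 125.38 (+2.48)
X, Y = CH3, CH3: 141.48; sum of increments 145.42
X, Y = Cl, Cl: 125.90; sum of increments 127.86
X, Y = Cl, CH3: 138.30; sum of increments 136.64

Deviations labelled on the slide: +1.34, -1.94, -3.94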
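
The cross effect can be checked with simple arithmetic on the values from this slide. The assignment of each number to a structure is inferred from the slide layout, so treat the mapping as an assumption; the computed differences are not guaranteed to match every deviation printed on the slide.

```python
base = 122.90  # CH2=CH2, shift of the carbon that will carry X and Y

# Single-substituent increments, from the mono-substituted compounds.
increment = {
    "CH3": 134.16 - base,  # +11.26
    "Cl": 125.38 - base,   # +2.48
}

# Shifts reported for the disubstituted compounds CH2=CXY.
observed = {
    ("CH3", "CH3"): 141.48,
    ("Cl", "Cl"): 125.90,
    ("Cl", "CH3"): 138.30,
}

# Purely additive estimate: base value plus one increment per substituent.
additive = {pair: base + sum(increment[s] for s in pair) for pair in observed}

# What the additive scheme misses when two substituents interact.
cross_effect = {pair: observed[pair] - additive[pair] for pair in observed}

for pair in observed:
    print(pair, f"additive {additive[pair]:.2f}",
          f"observed {observed[pair]:.2f}",
          f"difference {cross_effect[pair]:+.2f}")
```

The additive estimates reproduce the 145.42, 127.86 and 136.64 values on the slide; the remaining differences of a few ppm are what pair variables are meant to absorb.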

Page 19

Enhanced structure encoding

[Figure: the example structure (atoms C, O, CH3, C, CH2, CH, CH2, CH2, CH2) with pairs of atoms marked.]

Input variables: atoms, plus pairs of atoms (crosses)

Atom pair type    Number of pairs
CH2 and CH        1
C and O           1
…
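
Pair variables can be generated by counting, for every pair of substituents close enough to each other, the unordered pair of their atom types. The sketch below is one possible reading of "crosses": the distance cutoff between the two atoms of a pair, the toy graph, and all names are assumptions for illustration.

```python
from collections import Counter, deque
from itertools import combinations


def bfs_distances(adjacency, start):
    """Topological distance (number of bonds) from `start` to every atom."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in distance:
                distance[neighbor] = distance[atom] + 1
                queue.append(neighbor)
    return distance


def cross_counts(labels, bonds, center, max_sphere, max_pair_distance):
    """Count atom-type pairs among substituents within `max_sphere` bonds
    of `center` that are at most `max_pair_distance` bonds apart."""
    adjacency = {atom: [] for atom in labels}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)

    from_center = bfs_distances(adjacency, center)
    substituents = sorted(atom for atom, d in from_center.items()
                          if 0 < d <= max_sphere)

    counts = Counter()
    for a, b in combinations(substituents, 2):
        if bfs_distances(adjacency, a)[b] <= max_pair_distance:
            pair_type = tuple(sorted((labels[a], labels[b])))
            counts[pair_type] += 1
    return counts


# Hypothetical fragment: c0 is the atom whose shift is predicted.
labels = {"c0": "C", "n1": "CH2", "n2": "CH", "n3": "C",
          "n4": "CH2", "n5": "CH2", "n6": "CH3", "n7": "O"}
bonds = [("c0", "n1"), ("c0", "n2"), ("c0", "n3"),
         ("n1", "n4"), ("n2", "n5"), ("n3", "n6"), ("n3", "n7")]

crosses = cross_counts(labels, bonds, center="c0",
                       max_sphere=2, max_pair_distance=2)
for pair_type in sorted(crosses):
    print(pair_type, crosses[pair_type])
```

The two cutoffs in this sketch correspond to the two quantities varied in the results: the number of spheres and the distance between atoms within a cross.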

Page 20

Result for atom pairs (crosses)

[3D chart: mean error (ppm, axis range 2 to 3.6) as a function of the distance between atoms within a cross (1 to 4) and the number of spheres (1 to 4).]

Page 21

More enhancements?

• The accuracy is now good enough (2.3 ppm)

• But it is still bad in some cases

• Unfortunately, these cases are very important

• These “special” cases should be taken into account

Page 22

Stereo effects: double bonds

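[Figure: structure containing a double bond (atom labels CH3, O, OH, CH3, CH3, CH3), with the shift labels 25.7 and 17.6 and the distances 3.9 Å and 2.9 Å.]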

• We use the “topological” distance

• Sometimes equal topological distances correspond to different “real” distances

Page 23

Modified structure encoding

Variables: atoms, pairs of atoms (crosses), “stereo” effects

Page 24

Prediction of spectra by different methods (mean error, ppm)

Taken into account           All types of atoms   CH3    =C
Atoms only                   3.52                 1.55   8.03
+ pairs of atoms (crosses)   2.32                 1.50   3.22
+ “stereo” effects           2.27                 1.24   3.22
+ solvent                    2.25                 1.24   3.20
+ to be continued?

Page 25

Size of training set

• We have 1.5 million chemical shifts

• We should try to use all available data

• Only one problem: matrix size

• In many cases the matrix becomes larger than 2 GB
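
A quick estimate shows how soon the limit is reached. The sketch assumes a dense matrix of 8-byte floating-point values with one row per chemical shift; the slides do not state the storage format or the number of variables, so both are assumptions.

```python
def dense_matrix_bytes(rows, columns, bytes_per_value=8):
    """Memory needed for a dense rows x columns matrix."""
    return rows * columns * bytes_per_value


rows = 1_500_000          # one row per chemical shift in the database
limit = 2 * 1024 ** 3     # 2 GB, taken here as 2 GiB

# Largest number of columns (input variables) that still fits.
max_columns = limit // (rows * 8)

print(max_columns)
print(dense_matrix_bytes(rows, max_columns + 1) > limit)
```

Under these assumptions, fewer than two hundred variables already fill 2 GB, so an encoding with atom and pair variables over several spheres exceeds it easily.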

Page 26

Bigger dataset – smaller mean error!

[Chart: average deviation (ppm, axis range 0 to 5) vs. number of structures in the training set (thousands): 1, 2, 4, 8, 16, 32, 64, 128, 207.]

Page 27

The final results

Method                    Average deviation   Rate of calculation (shifts/sec)
Old method: HOSE Codes    1.87                6
New additive scheme       1.83                5800

Faster by three orders of magnitude!
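
The speed-up and its effect on the real-world task can be estimated from the rates above. This is a rough order-of-magnitude check: it assumes exactly 12 000 candidate structures (the slides say more than 12 000) with 29 carbons each (from C29H32N2O5) and ignores all overhead, so the times reported in the slides for the real task (> 24 hours and 2 min) are larger.

```python
hose_rate = 6          # shifts per second, HOSE codes
additive_rate = 5800   # shifts per second, additive scheme

speedup = additive_rate / hose_rate

structures = 12_000    # assumption: lower bound quoted in the slides
carbons = 29           # carbons per candidate, from C29H32N2O5
total_shifts = structures * carbons

hose_hours = total_shifts / hose_rate / 3600
additive_minutes = total_shifts / additive_rate / 60

print(f"speed-up: about {speedup:.0f}x")
print(f"HOSE codes: about {hose_hours:.1f} hours")
print(f"additive scheme: about {additive_minutes:.1f} minutes")
```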

Page 28

Prediction time: the past and present

[Figure: structure of the example compound, C29H32N2O5.]

Method            Average deviation   Time
HOSE Codes        1.72                > 24 hours
Additive scheme   1.63                2 min.

Page 29

Conclusions

• Combining a “new” method with an old, well-known algorithm can produce a very good (and unexpected) result