

Prediction of amines capacity for carbon dioxide absorption in gas sweetening processes

Mohammadreza Momeni, Siavash Riahi *

Institute of Petroleum Engineering, Faculty of Chemical Engineering, College of Engineering, University of Tehran, Tehran, Iran

Article info

Article history:
Received 23 July 2014
Received in revised form 30 August 2014
Accepted 1 September 2014
Available online 26 September 2014

Keywords: Gas sweetening; Rich loading; Carbon dioxide; Absorption; Amines; QSPR

Abstract

Almost all gas reservoirs around the world produce sour gas that contains considerable amounts of acid gases, including carbon dioxide and hydrogen sulfide. Because carbon dioxide in water tends to cause corrosion and the presence of CO2 in natural gas reduces its heating value, it must be removed before natural gas is prepared for marketing. Many technologies remove carbon dioxide from natural gas with regenerable amine-based solvents. To make these technologies more efficient and economical, further experimental and modeling research is required to identify the main parameters that influence the capacity of amines for CO2 absorption. Numerous studies of amines have shown that relationships exist between the structure of an amine and its capacity for carbon dioxide absorption. The Quantitative Structure-Property/Activity Relationship (QSPR/QSAR) approach provides an effective method for predicting the capacity of amines for CO2 absorption. In this paper, density functional theory (DFT) at the B3LYP level with the 6-311+G(d,p) basis set was first employed for molecular geometry optimization. Then the quantitative relationship between the absorption capacity data and the calculated descriptors was obtained by multiple linear regression (MLR), with the model variables selected by a genetic algorithm (GA). The accuracy of the model was verified by different statistical methods, and the results demonstrated its high statistical quality. Unlike other QSPR studies, the equation reported in this paper consists of simple, easily calculated descriptors that form a robust model for predicting the capacity of amines for carbon dioxide absorption.

© 2014 Elsevier B.V. All rights reserved.

* Corresponding author. University of Tehran, Tehran 11365-4563, Iran. Tel.: +98 21 61114714. E-mail address: riahi@ut.ac.ir (S. Riahi).

Journal of Natural Gas Science and Engineering 21 (2014) 442-450. Journal homepage: www.elsevier.com/locate/jngse. http://dx.doi.org/10.1016/j.jngse.2014.09.002. 1875-5100/© 2014 Elsevier B.V. All rights reserved.

1. Introduction

Amines are molecules containing nitrogen atoms attached to a carbon-based chain structure. They are applied in various fields of engineering and science. One of their most important applications is as an acid gas absorption liquid for removing carbon dioxide from natural gas or from oxygen-containing systems such as flue gas (Singh et al., 2007, 2009). The absorption capacity of amines is an important characteristic. Moreover, different aspects of the behavior of these molecules, from toxicity and environmental protection to technical issues, can be affected by this feature. The solubility and absorption rate of carbon dioxide in amine-based CO2 absorbents are important not only for technical reasons but also for environmental ones. Since experimental determination of absorption capacity (or rich loading) is very time-consuming and expensive, and the values are not always available in the literature, estimation plays an important role (Pourbasheer et al., 2011). Hence, developing capable methods for predicting the absorption capacity of different amines becomes an urgent task.

Gas sweetening, or acid gas removal (for instance of CO2 and H2S), is conventionally used in various industries (Bohloul et al., 2014). Almost all gas reservoirs around the world produce sour gas that contains considerable amounts of acid gases, including carbon dioxide and hydrogen sulfide. Because carbon dioxide in water tends to cause corrosion and the presence of CO2 in natural gas reduces its heating value, it must be removed prior to the preparation of natural gas for marketing (Mokhatab and Poe, 2012). The most common absorption media for this purpose are aqueous amine solutions. Amine derivatives including monoethanolamine (MEA), diethanolamine (DEA) and methyldiethanolamine (MDEA) are widely used in commercial and industrial applications (Kohl and Nielsen, 1997). Given the importance of amines in acid gas removal technologies, a novel descriptive model is needed from which the chemical properties of amines can be predicted.

There is evidence in the literature of relationships between the structure of an amine and its capacity for carbon dioxide absorption (rich loading). A significant contribution to the analysis of these relationships was made by Chakraborty et al., who showed that substituents at the α-carbon cause carbamate instability, which accelerates hydrolysis; as a result, the amount of bicarbonate increases, which leads to higher carbon dioxide loading (Chakraborty et al., 1986). Sartori and Savage explained that steric hindrance effects produced by the α-substituent are responsible for these instabilities (Sartori and Savage, 1983). In addition, Chakraborty studied the electronic effects of substituents and suggested that substitution at the α-carbon atom causes an interaction of the π and π* methyl group orbitals with the lone pair of the nitrogen. Since the nitrogen charge is reduced by this interaction, the strength of the N-H bond is reduced, which increases hydrolysis in aqueous solution. It seems that steric hindrance can reduce the rate of the initial reaction; however, the amount of amine available to react with CO2 grows noticeably (Chakraborty et al., 1988). Furthermore, Singh et al. conducted solvent screening experiments and investigated the effects of variables such as chain length, the number of functional groups, and the position of side chains and functional groups. They performed a semi-quantitative study of these effects on the capacity of amines for CO2 absorption (Singh et al., 2007, 2009). In addition, a computational study of the reactions between functionalized amines and CO2 was performed by Lee and Kitchin. They highlighted the molecular descriptors from which reactivity trends can be obtained. Their work revealed that electron-withdrawing and electron-donating groups tend to destabilize and stabilize CO2 reaction products, respectively (Lee and Kitchin, 2012). All of the results in this paper are based on mathematical calculations and model development. To the best of the authors' knowledge, this work is the first quantitative study of the capacity of amines for CO2 absorption based on a simple and robust model.

To achieve this goal, close observation of the relationship between the chemical structure and the activity of different amine-based solutions is required. The Quantitative Structure-Property/Activity Relationship (QSPR/QSAR) approach provides an effective method for processing, analyzing and predicting the characteristics of different molecules (Beheshti et al., 2012, 2009; Freire et al., 2010; Liang et al., 2013; Godavarthy et al., 2006; Riahi, 2009, 2008; Riahi et al., 2008). The quantitative structure-property relationship technique relates chemical or physical properties of compounds to their molecular structures. It is used to develop a quantitative correlation that can predict specific molecular properties, for example environmental functions or physico-chemical behaviors. The QSPR approach is based on the assumption that differences in the behavior of molecules can be correlated with variation in molecular features that are technically termed descriptors. The descriptors are numerical values that characterize the shape and structure of the molecule. To use the QSPR method, knowledge of the chemical structures of the molecules is sufficient, and there is no need to conduct experiments. QSPR requires a sequence of procedures; consequently, the following steps were taken (Fini et al., 2012):

1. A data set of molecules with their corresponding absorption capacities was taken from the literature.

2. The structural properties of the molecules were extracted and calculated using computer software.

3. The best model, containing an optimum number of descriptors, was selected by means of several algorithms, for example a genetic algorithm (GA) and MLR.

4. The selected model was validated using statistical tests and validation methods, for instance leave-one-out cross-validation.

In QSPR approaches, selecting the proper method for constructing a robust and precise model is very important. Multiple linear regression (MLR), principal component regression (PCR) and partial least squares (PLS) are the most widely used methods in QSPR modeling (Katritzky et al., 2000; Marengo et al., 1992). Variable selection for building a well-fitted model is a further step, and the genetic algorithm (GA) is one well-known method for accomplishing it. This paper focuses on the development of a novel descriptive QSPR model by which the absorption capacity (or rich loading) of various amines used in industrial carbon capture units can be predicted. The quantitative relationship between the absorption capacity data and the calculated descriptors is obtained by multiple linear regression (MLR), with the model variables selected by a genetic algorithm (GA) (Depczynski et al., 2000; Jouan-Rimbaud et al., 1995). The accuracy of the model was verified by different statistical methods, and the results demonstrated its high statistical quality. One of the main disadvantages of the QSPR technique is that, in most studies in this area, the final equation reported as the best model contains unfamiliar descriptors that are not only hard to calculate but also difficult or impossible to interpret. The equation reported in this paper, by contrast, consists of descriptors that are simple in terms of both calculation and interpretation. The model also demonstrates high statistical quality, by which its predictive power and robustness can be guaranteed.

2. Materials and methods

The absorption capacities (rich loading) of 23 amine-based solvents for carbon dioxide absorption (Table 1) were taken from the literature (Singh et al., 2007). First, density functional theory (DFT) at the B3LYP level with the 6-311+G(d,p) basis set was employed to perform geometry optimization (Cramer, 2005; da Silva and Svendsen, 2004). These calculations were performed with the Gaussian software (Frisch et al., 1998). The input to Gaussian consisted of molecular structures pre-optimized with the semi-empirical AM1 geometry optimization method. This process yields a group of precise and applicable descriptors that capture the electronic and quantum chemical properties of the molecules. Quantum chemical descriptors include properties such as dipole moment, sum of the electronic and thermal free energies, atomic charges, HOMO energy (highest occupied molecular orbital energy), LUMO energy (lowest unoccupied molecular orbital energy), exact polarizability, etc. In total, 31 quantum chemical descriptors were calculated for each molecule.

Table 1
The structures, experimental values and calculated values of amine capacities for CO2 absorption (rich loading). Structure drawings are not reproduced.

No.  Name                                       Exp.   Eq. (2) (Model)   Eq. (1)
1    1,2-Diaminopropane                         1.27   1.29              1.23
2    1,3-Diaminopropane                         1.30   1.29              1.25
3    1,4-Diaminobutane (T)                      1.26   1.37              1.29
4    2-Amino-1-butanol                          0.88   0.79              0.80
5    2-Methylpyridine                           0.06   0.09              0.08
6    2-Pyridylamine                             0.28   0.59              0.23
7    3-Amino-1-propanol                         0.88   0.71              0.72
8    4-Amino-1-butanol                          0.83   0.79              0.76
9    5-Amino-1-pentanol (T)                     0.84   0.87              0.85
10   Butylamine                                 0.86   0.79              0.84
11   Diethylenetriamine                         1.83   1.81              1.77
12   Ethylamine                                 0.91   0.63              0.82
13   Ethylenediamine                            1.08   1.21              1.20
14   Hexamethylenediamine                       1.48   1.53              1.46
15   Isobutylamine (T)                          0.78   0.79              0.82
16   Monoethanolamine                           0.72   0.63              0.61
17   N-(2-Hydroxyethyl)ethylenediamine          1.15   1.23              1.17
18   N,N′-bis(2-hydroxyethyl)ethylenediamine    1.20   1.25              1.27
19   N-Pentylamine                              0.72   0.87              0.90

Next, the geometrically optimized structure of each molecule was fed into the Dragon software developed by the Milano Chemometrics and QSAR Research Group (Todeschini et al., 2002). As a result, more than 1486 theoretical molecular descriptors were calculated for each molecule. These descriptors can be divided into different groups, for instance constitutional descriptors, topological descriptors, functional group counts, molecular properties, etc. Because such a large amount of numerical data makes further calculation imprecise and slow, the number of calculated descriptors was reduced by the following accepted procedure:

1. Constant and near-constant descriptors were eliminated (361 excluded).

2. Among collinear descriptors (R > 0.98), the one with the better correlation with absorption capacity was retained and the others were eliminated (611 excluded).

After applying these constraints, a total of 514 descriptors remained for each molecule as the output of this stage.

Finally, the calculated descriptors formed a (23 × 545) data matrix, where 23 is the number of compounds and 545 is the number of descriptors (the 514 retained Dragon descriptors together with the 31 quantum chemical descriptors).
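The two reduction steps of Section 2 (removing near-constant descriptors, then keeping one descriptor from each collinear group) can be sketched as follows. This is a minimal illustration with NumPy, not the authors' implementation: the function names, the variance tolerance `var_tol` and the greedy visiting order are assumptions; only the 0.98 collinearity threshold comes from the text.

```python
import numpy as np

def drop_near_constant(X, var_tol=1e-8):
    """Return indices of columns whose variance exceeds var_tol (assumed tolerance)."""
    return np.flatnonzero(X.var(axis=0) > var_tol)

def drop_collinear(X, y, r_threshold=0.98):
    """Greedy filter: within each group of columns with |r| > r_threshold,
    keep the column that correlates best with the target y."""
    n_cols = X.shape[1]
    target_corr = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_cols)]
    )
    kept = []
    # Visit descriptors from best to worst correlation with the target,
    # so the survivor of a collinear group is the best-correlated one.
    for j in np.argsort(-target_corr):
        if all(abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) <= r_threshold for k in kept):
            kept.append(int(j))
    return sorted(kept)

def reduce_descriptors(X, y, var_tol=1e-8, r_threshold=0.98):
    """Apply both filters and return the surviving column indices of X."""
    keep_var = drop_near_constant(X, var_tol)
    keep_col = drop_collinear(X[:, keep_var], y, r_threshold)
    return keep_var[keep_col]

# Hypothetical toy data: column 1 is constant, columns 0 and 2 are collinear.
x0 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = x0 + np.array([0.01, -0.01, 0.01, -0.01, 0.01, -0.01])
X_demo = np.column_stack([x0, np.full(6, 7.0), x2, [1.0, -1.0, 1.0, -1.0, -1.0, 1.0]])
y_demo = x2.copy()
print(reduce_descriptors(X_demo, y_demo))
```

In the toy data, column 2 correlates best with the target, so it is the one retained from the collinear pair (0, 2).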

3. Model development

After descriptor calculation, GA-MLR was applied as a variable selection and model development procedure to obtain the model with the highest predictive power based on the training set. The procedure for constructing the training and test sets is discussed in the Results section. The GA-MLR analysis led to the development of one model with three variables. The following linear equation was built from the molecules of the training set:

$$AC = 0.36 + 1.57\,\mathrm{Mor09v} - 0.32\,\mathrm{RDF035m} + 0.43\,\mathrm{nN} \tag{1}$$

Here AC stands for absorption capacity. Mor09v is one of the 3D-MoRSE descriptors and is defined as signal 09 weighted by van der Waals volume. RDF035m belongs to the group of RDF descriptors and describes the radial distribution function-035 weighted by mass, and nN represents the number of nitrogen atoms. The calculation of two of the descriptors in the above model is difficult, because it must be performed by computer. It is also not easy to describe the relationship between these two descriptors and the absorption capacity of amines. In QSPR studies, interpretation of the model and its descriptors is a necessary and important step. It was therefore decided to investigate new models with simple descriptors. In addition, given the chemical reaction of amines with carbon dioxide, the number of amino groups can be expected to affect the capacity of amines for carbon dioxide absorption. The chemistry of carbon dioxide reactions with amine-based solvents is presented in the Discussion section. After developing numerous simple equations and evaluating them with different statistical methods, the following model was selected:

$$AC = -0.19 + 0.04\,\mathrm{nH} + 0.54\,\mathrm{nRNH2} + 0.40\,\mathrm{nRNHR} \tag{2}$$

Table 2 shows some statistical factors to allow a better comparison between the two models. The first equation has higher statistical parameters, but the simplicity of the descriptors of the second model, in both calculation and interpretation, is more important. Therefore, we introduce the second equation as the preferred model for predicting the absorption capacity of amines, and the rest of this paper, including the Discussion and Conclusion sections, focuses on this model.

The molecular descriptors and their definitions are given in Table 3, and the correlation matrix of the descriptors is shown in Table 4. The linear correlation between each pair of descriptors is less than 0.65, which indicates that these descriptors are independent of each other and can be used to develop a QSPR model.

As can be observed, the three descriptors that appear in the model are easily calculated, so no computational calculation is needed. Moreover, this model demonstrates high statistical quality. Indeed, to the best of our knowledge, the above model is the simplest equation that can predict the capacity of amines for carbon dioxide absorption under specific conditions.
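Because Eq. (2) needs only three counts per molecule, it can be evaluated directly. The sketch below is an illustration, not code from the paper: the function name is hypothetical, the descriptor counts were hand-counted from the molecular formulas (for example, ethylamine C2H7N has 7 hydrogens and one primary amine), and the intercept is taken as -0.19, the sign that reproduces the Eq. (2) column of Table 1.

```python
def absorption_capacity(n_h, n_primary_amine, n_secondary_amine):
    """Evaluate Eq. (2); result is rich loading in mol CO2 / mol amine.

    n_h               -> nH    (number of hydrogen atoms)
    n_primary_amine   -> nRNH2 (number of primary amines)
    n_secondary_amine -> nRNHR (number of secondary amines)
    """
    return -0.19 + 0.04 * n_h + 0.54 * n_primary_amine + 0.40 * n_secondary_amine

# Hand-counted descriptor values (assumptions, not tabulated in the paper).
examples = {
    "Ethylamine": (7, 1, 0),
    "Monoethanolamine": (7, 1, 0),
    "Diethylenetriamine": (13, 2, 1),
    "Pyridine": (5, 0, 0),
}

for name, counts in examples.items():
    print(f"{name}: {absorption_capacity(*counts):.2f}")
```

The printed values can be compared with the Eq. (2) column of Table 1 for the same compounds (0.63, 0.63, 1.81 and 0.01).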

Table 1 (continued)

No.  Name                     Exp.   Eq. (2) (Model)   Eq. (1)
20   Propylamine (T)          0.77   0.71              0.80
21   Pyridine (T)             0.05   0.01              0.12
22   sec-Butylamine           0.84   0.79              0.87
23   Triethylenetetramine     2.51   2.41              2.47

All absorption capacity (rich loading) values are in mol CO2/mol amine. Names marked with (T) are test set molecules.

Table 2
Some basic statistical values for the two models

Model     Descriptors            R²      Q²      F        s
Eq. (1)   nN, Mor09v, RDF035m    0.979   0.971   300.96   0.082
Eq. (2)   nH, nRNH2, nRNHR       0.950   0.945   121.54   0.127

All statistical parameters in this table were calculated before the training and test procedure.

Table 3
The three molecular descriptors used in Eq. (2)

Descriptor   Type                      Definition
nH           Constitutional indices    Number of hydrogen atoms
nRNH2        Functional group counts   Number of primary amines (aliphatic)
nRNHR        Functional group counts   Number of secondary amines (aliphatic)

4. Results

One of the most critical factors influencing the quality of a regression model is how the training and test sets are selected and constructed so as to ensure molecular diversity in both. To take this into account, of the 23 amine-based carbon dioxide absorbents, 18 molecules (about 80%) were selected to construct the training set and 5 molecules (about 20%) formed the test set. The test set was used for external cross-validation of the model. One of the common techniques in the QSPR approach for constructing training and test sets under the constraint of structural diversity is principal component analysis (PCA) (Hu et al., 2009; Riahi et al., 2008). In the current work, PCA was employed to divide the data set of molecules into training and test sets. For this purpose, PC1 and PC2 were calculated from the descriptors in the model. These two principal components accounted for 57.5% and 33.5% of the variation in the data, respectively, and played the main roles. Fig. 1 shows the distribution of the data over PC1 and PC2; from this figure it can be concluded that the compounds in the training and test sets are representative of the whole data set.
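The PCA step described above can be sketched with a singular value decomposition. This is only an illustration under stated assumptions: the paper does not say how the descriptors were scaled, so autoscaling (zero mean, unit variance) is assumed here, the function name is hypothetical, and the input is random placeholder data rather than the real descriptor values.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Return (scores, explained_variance_ratio) for autoscaled data.

    scores has one row per molecule and n_components columns (PC1, PC2, ...);
    explained_variance_ratio covers all components and sums to 1.
    """
    X = np.asarray(X, dtype=float)
    # Autoscale each descriptor (assumed preprocessing).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the scaled matrix; rows of Vt are the principal axes.
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U * S
    explained = S**2 / np.sum(S**2)
    return scores[:, :n_components], explained

# Placeholder matrix: 23 molecules x 3 descriptors (nH, nRNH2, nRNHR in the paper).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(23, 3))
scores, explained = pca_scores(X_demo)
print(scores.shape, np.round(explained, 3))
```

Plotting the two score columns against each other, with training and test molecules marked differently, gives the kind of diagram described for Fig. 1.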

The training set was used to build the model, while the test set was used to validate its predictive power. During model development, the leave-one-out cross-validation (LOO-CV) method was applied to assess the performance of the resulting models. The LOO Q² was calculated for each equation obtained, and the best model was then selected based on a high value of this parameter. Several statistical tests and parameters need to be considered. The coefficient of determination (R²), the adjusted R², the coefficient of leave-one-out cross-validation (Q²), the slopes of the regression lines forced through zero (k, k′), the root mean square error (RMSE) and the standard error of the estimate (s) are the most important ones. The first five parameters should be close to unity, while RMSE and s should be low, close to zero. Furthermore, the intercept of the model should be close to zero. The Fisher function (F) is another vital statistical test: high values of the F-ratio indicate reliable models. The formulas of all statistical parameters used in this paper are given below.

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - \bar{y}\right)^2} \tag{3}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{n}} \tag{4}$$

$$F = \frac{\sum_{i=1}^{n}\left(\bar{y} - y_i^{\mathrm{calc}}\right)^2 / df_M}{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2 / df_E} \tag{5}$$

$$k = \frac{\sum y_i^{\mathrm{exp}}\, y_i^{\mathrm{calc}}}{\sum \left(y_i^{\mathrm{calc}}\right)^2} \tag{6}$$

$$k' = \frac{\sum y_i^{\mathrm{exp}}\, y_i^{\mathrm{calc}}}{\sum \left(y_i^{\mathrm{exp}}\right)^2} \tag{7}$$

$$s = \sqrt{\frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{df_E}} \tag{8}$$

where $\bar{y}$ is the mean of the experimental values, and $df_M$ and $df_E$ refer to the degrees of freedom of the model and of the error, respectively.
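Eqs. (3)-(8) translate directly into a few lines of NumPy. The sketch below is illustrative: the function name is hypothetical, and the degrees of freedom are assumed to be df_M = p and df_E = n - p - 1 for a model with p descriptors, a choice that is consistent with the ratio of s to RMSE in Table 5 but is not stated explicitly in the text. The input data are the Exp. and Eq. (2) columns of Table 1, read as two-decimal values.

```python
import numpy as np

def regression_statistics(y_exp, y_calc, n_descriptors):
    """Compute R2, RMSE, F, k, k' and s following Eqs. (3)-(8)."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_calc = np.asarray(y_calc, dtype=float)
    n = y_exp.size
    df_model = n_descriptors            # assumed df_M
    df_error = n - n_descriptors - 1    # assumed df_E
    sse = np.sum((y_exp - y_calc) ** 2)           # residual sum of squares
    sst = np.sum((y_exp - y_exp.mean()) ** 2)     # total sum of squares
    ssm = np.sum((y_exp.mean() - y_calc) ** 2)    # numerator of Eq. (5)
    return {
        "R2": 1.0 - sse / sst,
        "RMSE": np.sqrt(sse / n),
        "F": (ssm / df_model) / (sse / df_error),
        "k": np.sum(y_exp * y_calc) / np.sum(y_calc ** 2),
        "k_prime": np.sum(y_exp * y_calc) / np.sum(y_exp ** 2),
        "s": np.sqrt(sse / df_error),
    }

# Exp. and Eq. (2) columns of Table 1, all 23 amines in table order.
y_exp = [1.27, 1.30, 1.26, 0.88, 0.06, 0.28, 0.88, 0.83, 0.84, 0.86, 1.83, 0.91,
         1.08, 1.48, 0.78, 0.72, 1.15, 1.20, 0.72, 0.77, 0.05, 0.84, 2.51]
y_calc = [1.29, 1.29, 1.37, 0.79, 0.09, 0.59, 0.71, 0.79, 0.87, 0.79, 1.81, 0.63,
          1.21, 1.53, 0.79, 0.63, 1.23, 1.25, 0.87, 0.71, 0.01, 0.79, 2.41]

stats = regression_statistics(y_exp, y_calc, n_descriptors=3)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

With these rounded table values, R², RMSE and s come out close to the "Overall" row of Table 5 (0.950, 0.116 and 0.128); small differences in the other statistics are to be expected because the tabulated predictions are rounded to two decimals.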

In addition, the following criteria described by Golbraikh and Tropsha were applied to check the predictive ability of the QSPR model (Golbraikh and Tropsha, 2002):

$$Q^2 > 0.5 \tag{9}$$

$$R^2 > 0.6 \tag{10}$$

$$\frac{R^2 - R_0^2}{R^2} < 0.1 \quad \text{and} \quad 0.85 \le k \le 1.15 \tag{11}$$

where $R_0^2$ is the coefficient of determination characterizing linear regression with the Y-intercept set at zero. The prediction results for the training and test sets, together with the statistical parameters, are given in Table 5.
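The Q² value required by criterion (9) and the threshold checks of criteria (9)-(11) can be sketched as below. The function names are hypothetical, Q² is computed here as 1 - PRESS/SST with the overall mean in the denominator (one common convention, assumed rather than taken from the paper), and the R0² value passed in the usage example is a made-up placeholder because the paper does not report it.

```python
import numpy as np

def loo_q2(X, y):
    """Leave-one-out Q2 for a multiple linear regression with intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = y.size
    A = np.column_stack([np.ones(n), X])      # prepend the intercept column
    press = 0.0
    for i in range(n):
        mask = np.arange(n) != i              # leave molecule i out
        coef, *_ = np.linalg.lstsq(A[mask], y[mask], rcond=None)
        press += (y[i] - A[i] @ coef) ** 2    # squared prediction error for i
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def passes_golbraikh_tropsha(q2, r2, r0_2, k):
    """Check criteria (9)-(11) given precomputed statistics."""
    return bool(
        q2 > 0.5
        and r2 > 0.6
        and (r2 - r0_2) / r2 < 0.1
        and 0.85 <= k <= 1.15
    )

# Exactly linear toy data: every left-out point is predicted perfectly.
X_demo = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y_demo = 2.0 * X_demo[:, 0] + 1.0
print(round(loo_q2(X_demo, y_demo), 6))

# Q2 from Table 2, R2 and k from the "Overall" row of Table 5; R0^2 = 0.949 is a placeholder.
print(passes_golbraikh_tropsha(q2=0.945, r2=0.950, r0_2=0.949, k=0.999))
```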

Table 4
Correlation matrix of the three descriptors used in Eq. (2)

Descriptor   nH      nRNH2   nRNHR
nH           1.000   0.344   0.642
nRNH2        0.344   1.000   0.005
nRNHR        0.642   0.005   1.000

Fig. 1. Principal component analysis of the molecules in the training and test sets. Some points belong to more than one molecule.


The Y-scrambling test was applied to examine the robustness of the model (Tropsha et al., 2003); the results are given in Table 6. In the Y-scrambling test, the dependent variable (absorption capacity) is randomly reassigned to different amines, and new QSPR modeling is performed with the original matrix of independent variables. The newly developed QSPR models are expected to have low R² and Q² values. If they do not, the reported model is not acceptable for the particular data set and method of modeling.
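The Y-scrambling procedure can be sketched as follows: the response vector is permuted, an MLR model is refitted on the unchanged descriptor matrix, and the resulting R² is recorded. The function names, the number of iterations and the synthetic data are assumptions made for illustration; the real descriptor matrix is not tabulated in the paper.

```python
import numpy as np

def fit_r2(X, y):
    """R2 of an ordinary least-squares fit of y on X with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ coef
    return 1.0 - np.sum(residual**2) / np.sum((y - y.mean()) ** 2)

def y_scrambling(X, y, n_iterations=10, seed=0):
    """Refit the model on randomly permuted responses; return the R2 values."""
    rng = np.random.default_rng(seed)
    return [fit_r2(X, rng.permutation(y)) for _ in range(n_iterations)]

# Synthetic training set: 18 molecules, 3 descriptors, exactly linear response.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(18, 3))
y_demo = -0.19 + X_demo @ np.array([0.04, 0.54, 0.40])

r2_true = fit_r2(X_demo, y_demo)
r2_scrambled = y_scrambling(X_demo, y_demo)
print(f"original R2: {r2_true:.3f}")
print("scrambled R2:", [round(float(r), 3) for r in r2_scrambled])
```

A robust model shows a clear gap between the original R² and the scrambled values, which is the pattern reported in Table 6.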

The applicability domain of the model was studied with the Williams plot in Fig. 2 (OECD, 2007). In the Williams plot, the standardized residuals (R) are plotted against the leverage (hat diagonal) values (h). Leverage measures the distance of a compound from the centroid of X, where X is the descriptor matrix. The leverage of a compound is calculated by the following equation (Netzeva et al., 2005):

$$h_i = x_i^{T}\left(X^{T}X\right)^{-1}x_i \tag{12}$$

where $x_i$ is the descriptor vector of the compound in question. The warning leverage ($h^*$) is defined as (Eriksson et al., 2003):

$$h^* = \frac{3(p+1)}{n} \tag{13}$$

where n is the number of training objects and p is the number of descriptors in the model. The Williams plot is used to identify both response outliers and structurally influential chemicals in the model. A compound with $h_i > h^*$ influences the regression line, but it is not necessarily an outlier, since its standardized residual may be small. In this data set, the warning leverage is about 0.67. Furthermore, a compound with a standardized residual greater than three standard deviation units (>3s) is considered an outlier. It is common in the literature to use 3 as the accepted cut-off value for evaluating the prediction results of a model.
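Eqs. (12) and (13) can be sketched in a few lines. The function names are hypothetical, and the descriptor matrix is used exactly as written in Eq. (12), without centering or an added intercept column, which is an assumption about how X was prepared. The demonstration matrix is random placeholder data.

```python
import numpy as np

def leverages(X_train, X_query=None):
    """h_i = x_i^T (X^T X)^-1 x_i for each row of X_query (default: X_train)."""
    X_train = np.asarray(X_train, dtype=float)
    X_query = X_train if X_query is None else np.asarray(X_query, dtype=float)
    xtx_inv = np.linalg.inv(X_train.T @ X_train)
    # Row-wise quadratic form x_i^T (X^T X)^-1 x_i.
    return np.einsum("ij,jk,ik->i", X_query, xtx_inv, X_query)

def warning_leverage(n_train, n_descriptors):
    """Warning leverage h* = 3(p + 1) / n from Eq. (13)."""
    return 3.0 * (n_descriptors + 1) / n_train

# Training set size and descriptor count from the paper: n = 18, p = 3.
h_star = warning_leverage(18, 3)
print(f"h* = {h_star:.2f}")

# Placeholder descriptor matrix with the same shape as the training set.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(18, 3))
h = leverages(X_demo)
print("compounds above h*:", int(np.sum(h > h_star)))
```

For n = 18 and p = 3 the warning leverage evaluates to 0.67, the value quoted in the text.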

Fig. 2 shows that no chemical has a leverage higher than the warning value h* of 0.67. It also shows that there is no outlier in the training or test sets: all compounds lie between the two horizontal lines.

The experimental absorption capacity (rich loading) values of the amines are plotted in Fig. 3 against the corresponding values calculated by the QSPR model.

Furthermore, the mean effect (MF) helps to interpret the results, showing the effect of each descriptor individually or relative to the other descriptors. Fig. 4 shows the standardized coefficients (also called beta coefficients) (XLSTAT, 2013), which are used to compare the relative weights of the descriptors. The higher the standardized coefficient of a descriptor, the greater the weight of the corresponding variable in the model. This figure thus demonstrates the mean effect of the descriptors in the model.

From this figure it can be concluded that the number of primary amines (nRNH2) and the number of secondary amines (nRNHR) play the main roles, in that order, in the capacity of amines for carbon dioxide absorption, while the number of hydrogen atoms (nH) has the least effect. The figure also shows that all descriptors in the model have positive effects, so the capacity of amines for CO2 absorption is directly related to each of these descriptors.
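Standardized (beta) coefficients of the kind compared in Fig. 4 are commonly obtained by rescaling each regression coefficient by the ratio of the descriptor's standard deviation to that of the response. The sketch below follows that textbook definition; whether XLSTAT computes them in exactly this way is an assumption, and the function name and toy data are hypothetical.

```python
import numpy as np

def standardized_coefficients(X, y):
    """Beta coefficients: b_j * sd(x_j) / sd(y) for an MLR fit with intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    slopes = coef[1:]                       # drop the intercept
    return slopes * X.std(axis=0, ddof=1) / y.std(ddof=1)

# Toy check: with a single descriptor and an exact linear response,
# the standardized coefficient equals 1.
x = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 3.0 * x[:, 0] + 2.0
beta = standardized_coefficients(x, y)
print(np.round(beta, 6))
```

Because the rescaling removes the units of each descriptor, the resulting values can be compared directly, which is what makes them suitable for ranking descriptor importance.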

Finally, it should be noted that the present work focuses exclusively on developing a simple model by which the capacity of amines for carbon dioxide absorption can be predicted. The predominant difference between this study and previous ones is that this work concentrates first on a quantitative and then on a qualitative representation of structural effects on the capacity of amines for CO2 absorption.

5 Discussion

Although high statistical parameters are significant in demonstrating the capability of the model, QSPR should also provide insight into the mechanism of carbon dioxide solubility in amine-based solvents. For this reason, an acceptable interpretation of the descriptors in the QSPR model should be provided. It is useful to identify which parameters affect the amines' capacity and which descriptors could be expected to appear in the model on the basis of the principal chemical reactions between carbon dioxide and an amine-based solvent.

The overall reaction mechanism for chemical absorption of CO2 in amine solvent systems is still under debate. A mechanism that proceeds through the formation of a zwitterion intermediate, followed by proton removal by a base B, was suggested by Caplow (1968) and is given by reactions (1) and (2):

\[ \mathrm{CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NH^{+}COO^{-}} \tag{1} \]

\[ \mathrm{R_1R_2NH^{+}COO^{-} + B \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \tag{2} \]

R1 and R2 denote the substituent groups attached to the amine group, and B is a base molecule, which can be a water molecule. The intermediate in the reaction is the zwitterion. More recent studies, however, showed that the zwitterion seems to be short-lived and may be an entirely transient state (da Silva and Svendsen, 2004). This led to the assumption of a single-step mechanism for these reactions (reaction (3)). A termolecular single-step mechanism was suggested by Crooks and Donnellan (1989):

\[ \mathrm{B + CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \tag{3} \]

where B is again the base molecule. In this mechanism, the NH group is attacked by the base molecule and deprotonation of the amine occurs. The bonding between the amine and carbon dioxide takes place simultaneously.

Table 6
R2 (train) values after several Y-scrambling tests.

Iteration   R2 (train)
1           0.060
2           0.074
3           0.119
4           0.027
5           0.188
6           0.102
7           0.119
8           0.209
9           0.096
10          0.039

Table 5
Validation parameters and statistical results of the GA-MLR model.

          n    R2      R2adj   RMSE    F        k       k'      s
Train     18   0.942   0.930   0.127   76.50    1.004   0.984   0.144
Test      5    0.976   0.904   0.060   17.31    0.962   1.035   0.135
Overall   23   0.950   0.942   0.116   123.79   0.999   0.990   0.128

M. Momeni, S. Riahi / Journal of Natural Gas Science and Engineering 21 (2014) 442–450


As can be seen, the reaction between CO2 and an amine-based solvent takes place because of the existence of the N–H bond. The NH group is therefore an active site of the amine molecule, where the base molecule (water) takes part in the termolecular reaction. Consequently, the number of N–H bonds, or in other words the number of primary and secondary amine groups in the amine molecule, plays an important role in the capacity of amines for CO2 absorption. The numbers of primary (nRNH2) and secondary (nRNHR) amines are the two main descriptors appearing in the model. According to Fig. 4, these two descriptors have a positive effect and a higher mean effect. All these results demonstrate that the proposed model is consistent with the chemical reaction mechanism.

The model also contains the number of hydrogen atoms (nH) as another descriptor. Fig. 4 shows that the nH descriptor has a positive effect that is considerably smaller than that of the two other descriptors. The presence of the nH descriptor in the model can be explained by the experimental work of Singh et al., who showed that an increase in the chain length between the amine and other functional groups in the amine structure results in an increase in the amine capacity for CO2 absorption (Singh et al., 2007). Increasing the chain length increases the number of hydrogen atoms, so a positive effect of nH is consistent with the experimental work.

Finally, it should be noted that the simplicity of the model is attractive and the results are quite acceptable for predicting amines' capacity for carbon dioxide absorption. Although the accuracy of the model is good for linear amine compounds, it is poorer for unsaturated cyclic amines. This can be explained by two main reasons. First, the three descriptors in the model are not sensitive to ring-type functional groups and just count the number of hydrogen atoms and of primary and secondary amines. Second, unsaturated cyclic amines show poor absorption rate and capacity, and they are not potential absorbents for CO2 absorption (2). Therefore, from the industrial point of view, it is preferable to use linear amines for CO2 absorption, and it is more important for the model to predict the CO2 absorption capacity of linear amines than that of unsaturated cyclic amines.

Fig. 2. Williams plot of GA-MLR model development.

Fig. 3. Experimental vs. predicted rich loading values (mol CO2/mol amine), with the regression line.

Fortunately, the results of the first equation (Eq. (1)) for predicting amines' capacity for CO2 absorption are largely acceptable for both linear and aromatic ring-type amines (items labeled 5, 6 and 21). This is because of the presence of the RDF descriptor in this model. RDF descriptors are based on the distance distribution in the geometrical representation of a molecule. This function is independent of the number of atoms and is invariant against translation and rotation of the entire molecule. The RDF code provides valuable information, e.g. about bond distances, ring types, planar and non-planar systems and atom types, so it is sensitive to aromatic rings (Todeschini and Consonni, 2008).

6. Conclusions

One of the main concerns of the natural gas industry is to have a robust and accurate model that can predict the chemical behavior of amines in gas treatment processes. This study attempted to identify the effects of the chemical structure of amines on their capacity for carbon dioxide absorption and to develop a model for this purpose that is not only robust and accurate but also simple and applicable. Therefore, the QSPR approach was chosen as the modeling technique, and the model was developed with a linear method for its simplicity. As a result, two linear equations were developed. The first model demonstrates high predictive power, while the second is notably simpler and readily interpretable in terms of the chemistry of the amine reaction with carbon dioxide. Consequently, the second equation was introduced as the preferred model of this study. The most important descriptors appearing in the model, in order of the weight of the corresponding variable, are the number of primary aliphatic amines (nRNH2), the number of secondary aliphatic amines (nRNHR) and the number of hydrogen atoms (nH). The accuracy and predictive performance of the model, validated with various statistical tests and examined with a test set of five molecules, permit using this model to estimate the rich loading of other amines under specific conditions. According to the results, it could be argued that a good amine solvent for carbon dioxide absorption should have a linear structure with a high number of primary and secondary amine groups as side chains. In other words, increasing the number of primary and secondary amine groups increases the number of N–H bonds, the active sites at which the amine reaction with CO2 takes place.

The promising results of this study might aid other researchers in the fields of chemistry and natural gas engineering to design and synthesize new potential amine-based solvents and investigate the feasibility of using them in gas removal processes. New improved solvents should also be compared with more conventional ones from the corrosivity, energy efficiency and operability points of view.

Acknowledgment

The authors would like to gratefully acknowledge the support from the Institute of Petroleum Engineering (IPE), University of Tehran.

List of symbols

CO2         carbon dioxide
QSPR/QSAR   quantitative structure property/activity relationship
DFT         density functional theory
MLR         multiple linear regression
GA          genetic algorithms
PCR         principal component regression
PLS         partial least squares
HOMO        highest occupied molecular orbital
LUMO        lowest unoccupied molecular orbital
AC          absorption capacity
PCA         principal component analysis
LOO-CV      leave-one-out cross-validation
RMSE        root mean square error
df M        degrees of freedom of the model
df E        degrees of freedom of the error

Fig. 4. Mean effects of model descriptors (standardized coefficient values).

References

Beheshti, Abolghasem, Riahi, Siavash, Ganjali, Mohammad Reza, 2009. Quantitative structure–property relationship study on first reduction and oxidation potentials of donor-substituted phenylquinolinylethynes and phenylisoquinolinylethynes: quantum chemical investigation. Electrochim. Acta 54 (23), 5368–5375.

Beheshti, A., Norouzi, P., Ganjali, M.R., 2012. A simple and robust model for predicting the reduction potential of quinones family; electrophilicity index effect. Int. J. Electrochem. Sci. 7, 4811–4821.

Bohloul, M.R., Vatani, A., Peyghambarzadeh, S.M., 2014. Experimental and theoretical study of CO2 solubility in N-methyl-2-pyrrolidone (NMP). Fluid Phase Equilibr. 365, 106–111.

Caplow, Michael, 1968. Kinetics of carbamate formation and breakdown. J. Am. Chem. Soc. 90 (24), 6795–6803.

Chakraborty, A.K., et al., 1988. Molecular orbital approach to substituent effects in amine-CO2 interactions. J. Am. Chem. Soc. 110 (21), 6947–6954.

Chakraborty, A.K., Astarita, G., Bischoff, K.B., 1986. CO2 absorption in aqueous solutions of hindered amines. Chem. Eng. Sci. 41 (4), 997–1003.

Cramer, Christopher J., 2005. Essentials of Computational Chemistry: Theories and Models. Wiley.com.

Crooks, John E., Donnellan, J. Paul, 1989. Kinetics and mechanism of the reaction between carbon dioxide and amines in aqueous solution. J. Chem. Soc. Perkin Trans. 2 (4), 331–333.

da Silva, Eirik F., Svendsen, Hallvard F., 2004. Ab initio study of the reaction of carbamate formation from CO2 and alkanolamines. Indust. Eng. Chem. Res. 43 (13), 3413–3418.

Depczynski, Uwe, Frost, V.J., Molt, K., 2000. Genetic algorithms applied to the selection of factors in principal component regression. Anal. Chim. Acta 420 (2), 217–227.

Eriksson, Lennart, Jaworska, Joanna, Worth, Andrew P., Cronin, Mark T.D., McDowell, Robert M., Gramatica, Paola, 2003. Methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based QSARs. Environ. Health Perspect. 111 (10), 1361.

Fini, Mojtaba Fallah, Riahi, Siavash, Bahramian, Alireza, 2012. Experimental and QSPR studies on the effect of ionic surfactants on n-decane–water interfacial tension. J. Surfact. Deterg. 15 (4), 477–484.

Freire, Mara G., et al., 2010. Solubility of non-aromatic ionic liquids in water and correlation using a QSPR approach. Fluid Phase Equilibr. 294 (1), 234–240.

Frisch, Michael J., Nielsen, Alice B., Frisch, Aeleen (Eds.), 1998. Gaussian 98. Gaussian, Incorporated.

Godavarthy, Srinivasa S., Robinson Jr., Robert L., Gasem, Khaled A.M., 2006. SVRC–QSPR model for predicting saturated vapor pressures of pure fluids. Fluid Phase Equilibr. 246 (1), 39–51.

Golbraikh, Alexander, Tropsha, Alexander, 2002. Beware of q2! J. Mol. Graph. Model. 20 (4), 269–276.

Hu, Rongjing, et al., 2009. QSAR models for 2-amino-6-arylsulfonylbenzonitriles and congeners HIV-1 reverse transcriptase inhibitors based on linear and nonlinear regression methods. Eur. J. Med. Chem. 44 (5), 2158–2171.

Jouan-Rimbaud, Delphine, et al., 1995. Genetic algorithms as a tool for wavelength selection in multivariate calibration. Anal. Chem. 67 (23), 4295–4301.

Katritzky, Alan R., et al., 2000. QSPR correlation and predictions of GC retention indexes for methyl-branched hydrocarbons produced by insects. Anal. Chem. 72 (1), 101–109.

Kohl, Arthur L., Nielsen, Richard, 1997. Gas Purification (access online via Elsevier).

Lee, Anita S., Kitchin, John R., 2012. Chemical and molecular descriptors for the reactivity of amines with CO2. Indust. Eng. Chem. Res. 51 (42), 13609–13618.

Liang, Guijie, Xu, Jie, Liu, Li, 2013. QSPR analysis for melting point of fatty acids using genetic algorithm based multiple linear regression (GA-MLR). Fluid Phase Equilibr. 353, 15–21.

Marengo, Emilio, et al., 1992. Comparative study of different structural descriptors and variable selection approaches using partial least squares in quantitative structure-activity relationships. Chemometr. Intell. Lab. Syst. 14 (1), 225–233.

Mokhatab, Saeid, Poe, William A., 2012. Handbook of Natural Gas Transmission and Processing (access online via Elsevier).

Netzeva, Tatiana I., Worth, Andrew P., Aldenberg, Tom, Benigni, Romualdo, Cronin, Mark T.D., Gramatica, Paola, Jaworska, Joanna S., et al., 2005. Current status of methods for defining the applicability domain of (quantitative) structure–activity relationships. ATLA 33, 155–173.

OECD, 2007. Guidance Document on the Validation of (Quantitative) Structure-Activity Relationships [(Q)SAR] Models. Organisation for Economic Co-Operation and Development, Paris.

Pourbasheer, Eslam, et al., 2011. Prediction of solubility of fullerene C60 in various organic solvents by genetic algorithm-multiple linear regression. Fullerenes Nanotubes Carbon Nanostruct. 19 (7), 585–598.

Riahi, Siavash, Ganjali, Mohammad Reza, Norouzi, Parviz, Jafari, Fatemeh, 2008. Application of GA-MLR, GA-PLS and the DFT quantum mechanical (QM) calculations for the prediction of the selectivity coefficients of a histamine-selective electrode. Sens. Actuat. B Chem. 132 (1), 13–19.

Riahi, Siavash, Pourbasheer, Eslam, Ganjali, Mohammad Reza, Norouzi, Parviz, 2009. Investigation of different linear and nonlinear chemometric methods for modeling of retention index of essential oil components: concerns to support vector machine. J. Hazard. Mater. 166 (2), 853–859.

Riahi, Siavash, Beheshti, Abolghasem, Ganjali, Mohammad Reza, Norouzi, Parviz, 2008. A novel QSPR study of normalized migration time for drugs in capillary electrophoresis by new descriptors: quantum chemical investigation. Electrophoresis 29 (19), 4027–4035.

Riahi, Siavash, et al., 2008. QSAR study of 2-(1-propylpiperidin-4-yl)-1H-benzimidazole-4-carboxamide as PARP inhibitors for treatment of cancer. Chem. Biol. Drug Design 72 (6), 575–584.

Sartori, Guido, Savage, David W., 1983. Sterically hindered amines for carbon dioxide removal from gases. Indust. Eng. Chem. Fundam. 22 (2), 239–249.

Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2007. Structure and activity relationships for amine based CO2 absorbents-I. Int. J. Greenhouse Gas Control 1 (1), 5–10.

Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2009. Structure and activity relationships for amine-based CO2 absorbents-II. Chem. Eng. Res. Design 87 (2), 135–144.

Todeschini, R., Consonni, V., Mauri, A., Pavan, M., 2002. DRAGON-Software for the calculation of molecular descriptors, version 2.1.

Todeschini, Roberto, Consonni, Viviana, 2008. Handbook of Molecular Descriptors. John Wiley & Sons.

Tropsha, Alexander, Gramatica, Paola, Gombar, Vijay K., 2003. The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb. Sci. 22 (1), 69–77.

XLSTAT, 2013. Software, XLSTAT-CCR module, trial version.


capacity for carbon dioxide absorption (rich loading). A significant contribution to analyzing the relationships between the structure and absorption capacity of amines has been made by Chakraborty et al. Their work showed that substituents at the α-carbon cause carbamate instability, which results in accelerated hydrolysis; as a result, the amount of bicarbonate increases, which leads to higher carbon dioxide loading (Chakraborty et al., 1986). In addition, Sartori and Savage explained that steric hindrance effects produced by the α-substituent are responsible for these instabilities (Sartori and Savage, 1983). Chakraborty also studied the electronic effects of substituents and suggested that substitution at the α-carbon atom causes an interaction of the π and π* methyl group orbitals with the lone pair of the nitrogen. Since the nitrogen charge is reduced by this interaction, the strength of the N–H bond is reduced, which results in increased hydrolysis in aqueous solution. It seems that the rate of the initial reaction can be reduced by the steric hindrance effects; however, the amount of amine available to react with CO2 grows noticeably (Chakraborty et al., 1988). Furthermore, solvent screening experiments and an investigation of the effects of variables such as chain length, the number of functional groups, and the position of side chains and functional groups were conducted by Singh et al., who performed a semi-quantitative study of these effects on the capacity of amines for CO2 absorption (Singh et al., 2007, 2009). In addition, a computational study of the reactions between functionalized amines and CO2 was performed by Lee and Kitchin. They highlighted the molecular descriptors from which reactivity trends can be obtained. Their work revealed that electron-withdrawing and electron-donating groups tend to destabilize and stabilize CO2 reaction products, respectively (Lee and Kitchin, 2012). All of the results in this paper are based on mathematical calculations and model development. To the best of the authors' knowledge, this work is the first quantitative research on amines' capacity for CO2 absorption based on a simple and robust model.

To achieve this goal, a close observation of the relationship between the chemical structure and the activity of different amine-based solutions is required. The Quantitative Structure Property/Activity Relationship (QSPR/QSAR) provides an effective method for processing, analyzing and predicting the characteristics of different molecules (Beheshti et al., 2012, 2009; Freire et al., 2010; Liang et al., 2013; Godavarthy et al., 2006; Riahi, 2009, 2008; Riahi et al., 2008). The quantitative structure–property relationship technique relates chemical or physical properties of compounds to their molecular structures. It is used to develop a quantitative correlation that can predict specific molecular properties, for example environmental functions or physico-chemical behaviors. The QSPR approach is based on the assumption that differences in the behavior of molecules can be correlated with variation in molecular features that are technically termed descriptors. The descriptors are numerical values that relate to the shape and structure of the molecule. To use the QSPR method, knowledge of the molecules' chemical structures is sufficient, and there is no need to conduct experiments. QSPR requires a sequence of procedures; the following steps were taken (Fini et al., 2012):

1. A data set of molecules was taken from the literature with their corresponding absorption capacities.

2. The structural properties of the molecules were extracted and calculated using computer software.

3. The best model, containing an optimum number of descriptors, was selected by means of several alternative algorithms, for example genetic algorithm (GA) and MLR.

4. The selected model was validated using statistical tests and validation methods, for instance the leave-one-out cross-validation method.
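Step 3 above can be illustrated with a toy genetic algorithm that searches for a fixed-size subset of descriptor columns and scores each subset by the R2 of a multiple linear regression. This is only a sketch under stated assumptions, not the GA-MLR software used by the authors: the fitness function (plain R2), the population size, the crossover and mutation rules, and all function names are choices made here for illustration, and it assumes more descriptor columns than selected variables.

```python
import numpy as np

def fit_r2(X, y):
    """R^2 of an ordinary least-squares fit with intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def ga_select(X, y, n_vars=3, pop_size=30, n_gen=40, p_mut=0.2, seed=0):
    """Toy GA: each individual is a sorted tuple of n_vars column indices."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    n_desc = X.shape[1]

    def random_individual():
        idx = rng.choice(n_desc, size=n_vars, replace=False)
        return tuple(sorted(int(i) for i in idx))

    def fitness(ind):
        return fit_r2(X[:, list(ind)], y)

    pop = [random_individual() for _ in range(pop_size)]
    for _ in range(n_gen):
        # Selection: keep the better half of the distinct individuals
        ranked = sorted(set(pop), key=fitness, reverse=True)
        parents = ranked[: max(2, pop_size // 2)]
        children = []
        while len(parents) + len(children) < pop_size:
            i, j = rng.integers(len(parents), size=2)
            # Crossover: draw the child's genes from the union of both parents
            pool = sorted(set(parents[i]) | set(parents[j]))
            child = [int(c) for c in rng.choice(pool, size=n_vars, replace=False)]
            if rng.random() < p_mut:
                # Mutation: replace one gene with a currently unused descriptor
                unused = [k for k in range(n_desc) if k not in child]
                child[int(rng.integers(n_vars))] = int(rng.choice(unused))
            children.append(tuple(sorted(child)))
        pop = parents + children
    return max(set(pop), key=fitness)
```

In practice a cross-validated criterion such as Q2 (leave-one-out) would be a more robust fitness than R2, since R2 alone rewards overfitting.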

In QSPR approaches, selecting the proper method for constructing a robust and precise model is very important. Multiple linear regression (MLR), principal component regression (PCR) and partial least squares (PLS) are the most widely used methods in QSPR modeling (Katritzky et al., 2000; Marengo et al., 1992). Variable selection for building a well-fitted model is a further step; the genetic algorithm (GA) is one well-known method by which this task can be accomplished. This paper focuses on the development of a descriptive, novel QSPR model by which the absorption capacity (or rich loading) of various amines used in industrial carbon capture units can be predicted. The quantitative relationship between the absorption capacity data and the calculated descriptors was obtained by multiple linear regression (MLR), and the model variables were selected by genetic algorithm (GA) (Depczynski et al., 2000; Jouan-Rimbaud et al., 1995). The accuracy of the model was verified by different statistical methods, and the results showed the high statistical quality of the model. One of the main disadvantages of the QSPR technique is that, in most studies in this area, the final equation reported as the best model contains unfamiliar descriptors that are not only hard to calculate but also difficult or impossible to interpret. Fortunately, the equation reported in this paper consists of descriptors that are simple in terms of both calculation and interpretation. The model also demonstrates high statistical qualities, which support its predictive power and robustness.

2. Materials and methods

The absorption capacities (rich loading) of 23 amine-based solvents for carbon dioxide absorption (Table 1) were taken from the literature (Singh et al., 2007). First, density functional theory (DFT) at the B3LYP level with the 6-311+G(d,p) basis set was employed to perform geometry optimization (Cramer, 2005; da Silva and Svendsen, 2004). These calculations were performed with the Gaussian software (Frisch et al., 1998). The input to Gaussian was the molecular structures pre-optimized with the semi-empirical AM1 geometry optimization method. This process yields a group of precise and applicable descriptors describing the electronic and quantum chemical properties of the molecules. Quantum chemical descriptors include properties such as dipole moment, sum of the electronic and thermal free energies, atomic charges, HOMO energy (highest occupied molecular orbital energy), LUMO energy (lowest unoccupied molecular orbital energy), exact polarizability, etc. In total, 31 quantum chemical descriptors were calculated for each molecule.

Next, the geometrically optimized structure of each molecule was fed into the Dragon software developed by the Milano Chemometrics and QSAR research group (Todeschini et al., 2002). As a result, more than 1486 theoretical molecular descriptors were calculated for each molecule. These descriptors can be divided into different groups, for instance constitutional descriptors, topological descriptors, functional group counts, molecular properties, etc. Because such a large amount of numerical data makes further calculation imprecise and slow, the number of calculated descriptors was reduced by the following accepted procedure:

1. Constant and near-constant descriptors were eliminated (361 excluded).

2. Of each set of collinear descriptors (R > 0.98), the one that had the better correlation with absorption capacity was retained and the others were eliminated (611 excluded).

Table 1
The structure, experimental and calculated values of amines' capacities for CO2 absorption (rich loading).

No.  Name                                       Exp.   Eq. (2) (Model)   Eq. (1)
1    1,2-Diamino propane                        1.27   1.29              1.23
2    1,3-Diamino propane                        1.30   1.29              1.25
3    1,4-Diamino butane (T)                     1.26   1.37              1.29
4    2-Amino-1-butanol                          0.88   0.79              0.80
5    2-Methyl pyridine                          0.06   0.09              0.08
6    2-Pyridylamine                             0.28   0.59              0.23
7    3-Amino-1-propanol                         0.88   0.71              0.72
8    4-Amino-1-butanol                          0.83   0.79              0.76
9    5-Amino-1-pentanol (T)                     0.84   0.87              0.85
10   Butylamine                                 0.86   0.79              0.84
11   Diethylenetriamine                         1.83   1.81              1.77
12   Ethylamine                                 0.91   0.63              0.82
13   Ethylenediamine                            1.08   1.21              1.20
14   Hexamethylenediamine                       1.48   1.53              1.46
15   Isobutylamine (T)                          0.78   0.79              0.82
16   Monoethanolamine                           0.72   0.63              0.61
17   N-(2-Hydroxyethyl)ethylenediamine          1.15   1.23              1.17
18   N,N′-bis(2-hydroxyethyl)ethylenediamine    1.20   1.25              1.27
19   N-Pentylamine                              0.72   0.87              0.90

After applying the above constraints, a total of 514 descriptors remained for each molecule as the output of this stage.

Finally, together with the 31 quantum chemical descriptors, the calculated descriptors formed a (23 × 545) data matrix, where 23 is the number of compounds and 545 is the number of descriptors.
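The two-step descriptor reduction described above (dropping constant columns, then keeping one representative of each collinear group) can be sketched as follows. This is a minimal illustration under assumptions: the near-constant threshold `var_tol` and the greedy order of the collinearity filter are choices made here, and the function name is hypothetical; only the 0.98 correlation limit comes from the text.

```python
import numpy as np

def reduce_descriptors(X, y, var_tol=1e-8, r_max=0.98):
    """Return the indices of descriptor columns that survive both filters.

    Step 1: drop constant / near-constant columns (std <= var_tol).
    Step 2: among collinear columns (|R| > r_max), keep the one best
            correlated with the property y and drop the others.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Step 1: remove (near-)constant descriptors
    keep = [j for j in range(X.shape[1]) if X[:, j].std() > var_tol]

    # Absolute correlation of each remaining descriptor with the property
    r_y = {j: abs(np.corrcoef(X[:, j], y)[0, 1]) for j in keep}

    # Step 2: visit descriptors from most to least correlated with y and
    # accept one only if it is not collinear with an already accepted one
    selected = []
    for j in sorted(keep, key=lambda j: r_y[j], reverse=True):
        if all(abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) <= r_max
               for k in selected):
            selected.append(j)
    return sorted(selected)
```

Visiting the columns in order of decreasing correlation with the property guarantees that, within any collinear group, the retained column is the one most correlated with absorption capacity.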

3. Model development

After descriptor calculation, GA-MLR was applied as the variable selection and model development procedure to obtain the model with the highest predictive power based on the training set. The procedure for constructing the training and test sets is discussed in the Results section. The GA-MLR analysis led to the development of one model with three variables. The following linear equation was built from the molecules in the training set:

\[ \mathrm{AC} = 0.36 + 1.57\,\mathrm{Mor09v} - 0.32\,\mathrm{RDF035m} + 0.43\,\mathrm{nN} \tag{1} \]

AC denotes the absorption capacity. Mor09v is one of the 3D-MoRSE descriptors and is defined as signal 09 weighted by van der Waals volume. RDF035m belongs to the group of RDF descriptors and describes the radial distribution function-035 weighted by mass, and nN is the number of nitrogen atoms. As can be seen, calculating two of the descriptors in the above model is difficult because it requires computer software. It is also not easy to describe the relationship between these two descriptors and the absorption capacity of amines. In QSPR studies, interpretation of the model and its descriptors is a necessary and important step, so it was decided to investigate new models with simpler descriptors. In addition, given the chemical reaction of amines with carbon dioxide, the number of amino groups can be expected to affect amines' capacity for carbon dioxide absorption. The chemistry of carbon dioxide reactions with amine-based solvents is presented in the Discussion section. After developing numerous simple equations and evaluating them with different statistical methods, the following model was selected:

\[ \mathrm{AC} = -0.19 + 0.04\,\mathrm{nH} + 0.54\,\mathrm{nRNH2} + 0.40\,\mathrm{nRNHR} \tag{2} \]
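Because Eq. (2) uses only atom and group counts, it can be evaluated by hand or with a few lines of code. The sketch below takes the intercept as -0.19, the value that reproduces the Eq. (2) column of Table 1; the descriptor counts in the examples were read off the molecular formulas here and are not taken from the paper, and the function name is hypothetical.

```python
def predict_rich_loading(n_h, n_rnh2, n_rnhr):
    """Eq. (2): absorption capacity in mol CO2 / mol amine.

    n_h:    number of hydrogen atoms (nH)
    n_rnh2: number of primary aliphatic amine groups (nRNH2)
    n_rnhr: number of secondary aliphatic amine groups (nRNHR)
    """
    return -0.19 + 0.04 * n_h + 0.54 * n_rnh2 + 0.40 * n_rnhr

# (nH, nRNH2, nRNHR) counted from the molecular formulas (assumed here)
examples = {
    "Ethylamine": (7, 1, 0),            # C2H7N
    "Diethylenetriamine": (13, 2, 1),   # C4H13N3
    "Triethylenetetramine": (18, 2, 2), # C6H18N4
}

for name, counts in examples.items():
    print(name, round(predict_rich_loading(*counts), 2))
# Ethylamine 0.63
# Diethylenetriamine 1.81
# Triethylenetetramine 2.41
```

These values agree with the Eq. (2) entries for compounds 12, 11 and 23 in Table 1.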

Table 2 shows some statistical factors to provide a better comparison between the two models. The first equation has better statistical parameters, but the simpler descriptors of the second model, in both calculation and interpretation of results, are more important. Therefore, we introduce the second equation as the preferred model for predicting the absorption capacity of amines, and the rest of this paper, including the Discussion and Conclusions sections, focuses on this model.

Molecular descriptors and their definitions are given in Table 3. The correlation matrix of the descriptors is shown in Table 4. The linear correlation coefficient for each pair of descriptors is less than 0.65, which demonstrates that these descriptors are independent of each other and can be used to develop a QSPR model.

As can be observed, the three descriptors appearing in the model are easily calculated, so no computational software is needed. Moreover, this model demonstrates high statistical qualities. Indeed, to the best of our knowledge, the above model is the simplest equation that can predict the capacity of amines for carbon dioxide absorption under specific conditions.

4. Results

One of the most critical factors influencing the quality of a regression model is how the training and test sets are selected and constructed so as to ensure molecular diversity in both of them. To take this into account, of the total 23 amine-based carbon dioxide absorbents, 18 molecules (about 80% of the molecules) were selected to construct the training set and 5 molecules formed the test set (about 20%). The test set was used for external cross-validation of

Table 1 (continued)

No.  Name                     Exp.   Eq. (2) (Model)   Eq. (1)
20   Propylamine (T)          0.77   0.71              0.80
21   Pyridine (T)             0.05   0.01              0.12
22   sec-Butylamine           0.84   0.79              0.87
23   Triethylenetetramine     2.51   2.41              2.47

All absorption capacity (rich loading) values are in mol CO2/mol amine. Names marked with (T) are test set molecules.

Table 2
Some basic statistical values for the two models.

Model     Descriptors              R2      Q2      F        s
Eq. (1)   nN, Mor09v, RDF035m      0.979   0.971   300.96   0.082
Eq. (2)   nH, nRNH2, nRNHR         0.950   0.945   121.54   0.127

All statistical parameters in this table were calculated before the training and test procedure.

Table 3
The three molecular descriptors used in Eq. (2).

Descriptor   Type                      Definition
nH           Constitutional indices    Number of hydrogen atoms
nRNH2        Functional group counts   Number of primary amines (aliphatic)
nRNHR        Functional group counts   Number of secondary amines (aliphatic)


the model. One of the common techniques in the QSPR approach for constructing training and test sets under the constraint of structural diversity is principal component analysis (PCA) (Hu et al., 2009; Riahi et al., 2008). In the current work, PCA was employed to divide the data set of molecules into training and test sets. For this purpose, PC1 and PC2 were calculated based on the descriptors in the model. The results showed that these two principal components accounted for 57.5% and 33.5% of the variation in the data, respectively, and played the main roles. Fig. 1 shows the distribution of the data over PC1 and PC2; from this figure it can be concluded that the compounds in the training and test sets were representative of the whole data set.
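The PC1/PC2 scores used for such a diversity check can be computed with a singular value decomposition of the scaled descriptor matrix. The sketch below assumes autoscaling (zero mean, unit variance per descriptor) before PCA, which the text does not state; the function name is hypothetical and the descriptor counts in the usage lines are illustrative values counted from molecular formulas, not data from the paper.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Return (scores, explained_fraction) for the leading components.

    X is the (n_molecules x n_descriptors) matrix; columns are autoscaled
    before the decomposition (an assumption made for this sketch).
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S ** 2 / np.sum(S ** 2)          # variance fraction per PC
    scores = U[:, :n_components] * S[:n_components]
    return scores, explained[:n_components]

# Illustrative (nH, nRNH2, nRNHR) rows for four amines
X_demo = np.array([[7, 1, 0], [13, 2, 1], [18, 2, 2], [5, 0, 0]])
scores, explained = pca_scores(X_demo)
print(scores.shape, explained.sum() <= 1.0 + 1e-12)
```

Plotting the two score columns against each other, with training and test compounds marked differently, gives a figure of the same kind as Fig. 1.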

The training set was used to build the model, while the test set was used to validate its predictive power. During model development, the leave-one-out cross-validation (LOO-CV) method was applied to assess the performance of the different resulting models. Q2LOO was calculated for each equation obtained, and the best model was then selected on the basis of a high value of this parameter. Several statistical tests and parameters need to be considered. The coefficient of determination (R2), the adjusted R2, the coefficient of leave-one-out cross-validation (Q2), the slopes of the regression lines forced through zero (k, k'), the root mean square error (RMSE) and the standard error of the estimate (s) are the most important ones. The first five parameters should be close to unity, while RMSE and s should be close to zero. Furthermore, the intercepts of the model should be close to zero. The Fisher function (F) is another vital statistical test: high values of the F-ratio indicate reliable models. The formulas for all statistical parameters used in this paper are given below.

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - \bar{y}^{\mathrm{exp}}\right)^2} \tag{3}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{n}} \tag{4}$$

$$F = \frac{\sum_{i=1}^{n}\left(\bar{y}^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2 / df_M}{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2 / df_E} \tag{5}$$

$$k = \frac{\sum y_i^{\mathrm{exp}}\, y_i^{\mathrm{calc}}}{\sum \left(y_i^{\mathrm{calc}}\right)^2} \tag{6}$$

$$k' = \frac{\sum y_i^{\mathrm{exp}}\, y_i^{\mathrm{calc}}}{\sum \left(y_i^{\mathrm{exp}}\right)^2} \tag{7}$$

$$s = \sqrt{\frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{df_E}} \tag{8}$$

where df_M and df_E refer to the degrees of freedom of the model and of the error, respectively.
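As a worked check, Eqs. (3) to (8) can be evaluated directly on the five test-set molecules of Table 1 (experimental values against the Eq. (2) column, decimals restored). This is a sketch under two assumptions that the source does not spell out: df_M is taken as the number of descriptors p, and df_E as n − p − 1. With these choices the result agrees with the Test row of Table 5 (R² ≈ 0.976, RMSE ≈ 0.060, F ≈ 17.3, k ≈ 0.962, k′ ≈ 1.035, s ≈ 0.135).

```python
import numpy as np

def regression_stats(y_exp, y_calc, n_descriptors):
    """Evaluate Eqs. (3)-(8) for one set of experimental and calculated values."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_calc = np.asarray(y_calc, dtype=float)
    n = y_exp.size
    df_m = n_descriptors            # assumption: model degrees of freedom = p
    df_e = n - n_descriptors - 1    # assumption: error degrees of freedom = n - p - 1
    ss_res = np.sum((y_exp - y_calc) ** 2)
    ss_tot = np.sum((y_exp - y_exp.mean()) ** 2)
    ss_reg = np.sum((y_exp.mean() - y_calc) ** 2)
    return {
        "R2": 1.0 - ss_res / ss_tot,                              # Eq. (3)
        "RMSE": np.sqrt(ss_res / n),                              # Eq. (4)
        "F": (ss_reg / df_m) / (ss_res / df_e),                   # Eq. (5)
        "k": np.sum(y_exp * y_calc) / np.sum(y_calc ** 2),        # Eq. (6)
        "k_prime": np.sum(y_exp * y_calc) / np.sum(y_exp ** 2),   # Eq. (7)
        "s": np.sqrt(ss_res / df_e),                              # Eq. (8)
    }

# Test-set molecules of Table 1 (Nos. 3, 9, 15, 20, 21): Exp. and Eq. (2) columns.
y_exp_test = [1.26, 0.84, 0.78, 0.77, 0.05]
y_calc_test = [1.37, 0.87, 0.79, 0.71, 0.01]
stats = regression_stats(y_exp_test, y_calc_test, n_descriptors=3)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Adjusted R² and Q² are not covered here because the section gives no formula for them.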

In addition, the following criteria described by Golbraikh and Tropsha were applied to check the predictive ability of the QSPR model (Golbraikh and Tropsha, 2002):

1. $$Q^2 > 0.5 \tag{9}$$

2. $$R^2 > 0.6 \tag{10}$$

3. $$\frac{R^2 - R_0^2}{R^2} < 0.1 \quad \text{and} \quad 0.85 \le k \le 1.15 \tag{11}$$

where R₀² is the coefficient of determination characterizing linear regression with the Y-intercept set at zero. The results for the molecules of the training and test sets, with their statistical parameters, are given in Table 5.
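The three criteria can be checked mechanically. The sketch below is illustrative only: R² follows Eq. (3), k follows Eq. (6), and R₀² is computed for the line through the origin y = k·y_calc, which is an assumption because the section gives no explicit formula for R₀². The function name and the dictionary keys are hypothetical. The inputs are the test-set values of Table 1 and the Q² reported for Eq. (2) in Table 2.

```python
import numpy as np

def golbraikh_tropsha(y_exp, y_calc, q2):
    """Return the outcome of criteria (9)-(11) as a dict of booleans."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_calc = np.asarray(y_calc, dtype=float)
    ss_tot = np.sum((y_exp - y_exp.mean()) ** 2)
    r2 = 1.0 - np.sum((y_exp - y_calc) ** 2) / ss_tot
    k = np.sum(y_exp * y_calc) / np.sum(y_calc ** 2)
    # Assumed definition: fit of the zero-intercept line y = k * y_calc.
    r2_0 = 1.0 - np.sum((y_exp - k * y_calc) ** 2) / ss_tot
    return {
        "Q2 > 0.5": bool(q2 > 0.5),
        "R2 > 0.6": bool(r2 > 0.6),
        "(R2 - R0^2)/R2 < 0.1": bool((r2 - r2_0) / r2 < 0.1),
        "0.85 <= k <= 1.15": bool(0.85 <= k <= 1.15),
    }

y_exp_test = [1.26, 0.84, 0.78, 0.77, 0.05]
y_calc_test = [1.37, 0.87, 0.79, 0.71, 0.01]
checks = golbraikh_tropsha(y_exp_test, y_calc_test, q2=0.945)
print(checks)
```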

Table 4
Correlation matrix of the three descriptors used in Eq. (2).

Descriptor   nH      nRNH2   nRNHR
nH           1.000   0.344   0.642
nRNH2        0.344   1.000   0.005
nRNHR        0.642   0.005   1.000

Fig. 1. Principal component analysis of the molecules in the training and test sets. Some points belong to more than one molecule.

M. Momeni, S. Riahi / Journal of Natural Gas Science and Engineering 21 (2014) 442–450


Table 6 reports the results of the Y-scrambling test, which was applied to examine the robustness of the model (Tropsha et al., 2003). In the Y-scrambling test, the dependent variable (absorption capacity) is randomly reassigned to different amines and a new QSPR model is built on the unchanged matrix of independent variables. The newly developed QSPR models are expected to have low R² and Q² values. If they do not, the reported model is not reliable for the particular data set and modeling method.
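The procedure just described can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the descriptor matrix and responses are synthetic (generated from hypothetical counts and coefficients), ordinary least squares stands in for the MLR step, and only R² is computed for each scrambled response.

```python
import numpy as np

def fit_r2(X, y):
    """R2 of an ordinary least-squares fit of y on X with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    return 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)

def y_scrambling(X, y, n_iter=10, seed=0):
    """Refit the model n_iter times on randomly permuted responses."""
    rng = np.random.default_rng(seed)
    return [fit_r2(X, rng.permutation(y)) for _ in range(n_iter)]

# Synthetic data for illustration only (18 hypothetical compounds, 3 descriptors).
rng = np.random.default_rng(42)
X_demo = np.column_stack([
    rng.integers(5, 19, size=18),   # hypothetical hydrogen counts
    rng.integers(0, 3, size=18),    # hypothetical primary-amine counts
    rng.integers(0, 3, size=18),    # hypothetical secondary-amine counts
]).astype(float)
y_demo = -0.19 + X_demo @ np.array([0.04, 0.54, 0.40]) + rng.normal(0, 0.05, size=18)

r2_true = fit_r2(X_demo, y_demo)
r2_scrambled = y_scrambling(X_demo, y_demo)
print(r2_true, max(r2_scrambled))
```

A large gap between the R² of the original fit and every scrambled R² is the pattern that Table 6 is meant to show.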

The applicability domain of the model was studied with the Williams plot in Fig. 2 (OECD, 2007). In a Williams plot, the standardized residuals (R) are plotted against the leverage (hat diagonal) values (h). Leverage expresses the distance of a compound from the centroid of X, where X is the descriptor matrix. The leverage of a compound is calculated by the following equation (Netzeva et al., 2005):

$$h_i = x_i^{T}\left(X^{T}X\right)^{-1}x_i \tag{12}$$

where x_i is the descriptor vector of the compound considered. The warning leverage (h*) is defined as (Eriksson et al., 2003):

$$h^{*} = \frac{3(p+1)}{n} \tag{13}$$

where n is the number of training compounds and p is the number of descriptors in the model. The Williams plot is used to identify both response outliers and structurally influential chemicals in the model. A compound with h_i > h* influences the regression line, but it is not necessarily an outlier, because its standardized residual may be small. For this data set the warning leverage is about 0.67. Furthermore, a compound with a standardized residual greater than three standard deviation units (> 3s) is considered an outlier. A cut-off value of 3 is commonly used in the literature for evaluating the prediction results of a model.

Fig. 2 shows that no chemical has a leverage higher than the warning value h* = 0.67. It also shows that there is no outlier in the training or test set: all compounds lie between the two horizontal lines.
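Eqs. (12) and (13) translate directly into code. The sketch below makes one assumption that the source leaves implicit: a column of ones is added to the descriptor matrix, which is what the p + 1 in Eq. (13) suggests. The demo matrix is random and purely hypothetical; only the threshold uses the paper's numbers (n = 18 training compounds, p = 3 descriptors).

```python
import numpy as np

def leverages(X_train, X_query):
    """Leverage h_i = x_i^T (X^T X)^-1 x_i of each query compound, Eq. (12)."""
    A = np.column_stack([np.ones(len(X_train)), X_train])  # assumed intercept column
    Q = np.column_stack([np.ones(len(X_query)), X_query])
    xtx_inv = np.linalg.pinv(A.T @ A)
    return np.einsum("ij,jk,ik->i", Q, xtx_inv, Q)

def warning_leverage(n_train, n_descriptors):
    """Warning leverage h* = 3(p + 1)/n, Eq. (13)."""
    return 3 * (n_descriptors + 1) / n_train

h_star = warning_leverage(18, 3)
print(round(h_star, 2))  # 0.67

# Hypothetical descriptor matrix, used only to exercise the function.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(18, 3))
h = leverages(X_demo, X_demo)
influential = np.flatnonzero(h > h_star)  # indices of compounds with h_i > h*
print(influential)
```

Passing the test-set descriptors as `X_query` gives their leverages relative to the training set, which is how both sets can be placed on one Williams plot.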

The experimental absorption capacity (rich loading) values of the amines are plotted in Fig. 3 against the corresponding values calculated with the QSPR model.

Furthermore, the mean effect (MF) is another term that helps to interpret the results; it shows the effect of each descriptor individually or relative to the other descriptors. Fig. 4 shows the standardized coefficients (also called beta coefficients) (XLSTAT, 2013), which are used to compare the relative weights of the descriptors. The higher the standardized coefficient of a descriptor, the greater the weight of the corresponding variable in the model. The figure thus illustrates the mean effect of the descriptors in the model.
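For readers who want to reproduce this kind of comparison, a standardized (beta) coefficient is commonly obtained by rescaling each regression coefficient with the ratio of the standard deviation of its descriptor to that of the response. The sketch below uses that common definition as an assumption; the source does not state how XLSTAT computes the values in Fig. 4, and the data here are synthetic and hypothetical.

```python
import numpy as np

def ols_coefficients(X, y):
    """Least-squares coefficients; element 0 is the intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def standardized_coefficients(X, y):
    """beta_j = b_j * sd(x_j) / sd(y), an assumed (common) definition."""
    b = ols_coefficients(X, y)[1:]
    return b * X.std(axis=0, ddof=1) / y.std(ddof=1)

# Synthetic, hypothetical data: 18 compounds, 3 descriptors of different spread.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(18, 3)) * np.array([4.0, 0.7, 0.7])
y_demo = -0.19 + X_demo @ np.array([0.04, 0.54, 0.40]) + rng.normal(0, 0.05, size=18)
beta = standardized_coefficients(X_demo, y_demo)
print(beta)
```

Because the coefficients are rescaled to a common unit, their magnitudes can be compared even when the raw descriptors span different ranges, as nH and the amine counts do.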

From this figure it can be concluded that the number of primary amines (nRNH2) and the number of secondary amines (nRNHR) play the largest and second largest roles, respectively, in the capacity of amines for carbon dioxide absorption, and that the number of hydrogen atoms (nH) has the least effect. The figure also shows that all descriptors in the model have positive effects, so the capacity of the amines for CO2 absorption increases with each of these descriptors.

Finally, it should be noted that the present work focuses exclusively on developing a simple model with which the capacities of amines for carbon dioxide absorption can be predicted. The main difference between this study and previous ones is that this work concentrates first on a quantitative and then on a qualitative representation of structural effects on the capacity of amines for CO2 absorption.

5. Discussion

Although good statistical parameters are important in demonstrating the capability of the model, a QSPR should also provide insight into the mechanism of carbon dioxide solubility in amine-based solvents. For this reason, an acceptable interpretation of the descriptors in the QSPR model should be provided. It is useful to determine which parameters affect the capacity of the amines and which descriptors could be expected to appear in the model because of the principal chemical reactions between carbon dioxide and an amine-based solvent.

The overall reaction mechanism for the chemical absorption of CO2 in amine solvent systems is still under debate. Caplow (1968) suggested a mechanism in which a zwitterion intermediate is formed and then deprotonated by a base B, according to reactions (1) and (2) below:

$$\mathrm{CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NH^{+}COO^{-}} \tag{1}$$

$$\mathrm{R_1R_2NH^{+}COO^{-} + B \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \tag{2}$$

R1 and R2 denote the substituent groups attached to the amine group, and B is a base molecule, which can be a water molecule. The intermediate in this reaction is the zwitterion. More recent studies, however, showed that the zwitterion appears to be short-lived and may be an entirely transient state (da Silva and Svendsen, 2004). This led to the assumption of a single-step mechanism (reaction (3)). A termolecular single-step mechanism was suggested by Crooks and Donnellan (1989):

$$\mathrm{B + CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \tag{3}$$

where B is again the base molecule. In this mechanism, the NH group is attacked by the base molecule and the amine is deprotonated, while the bond between the amine and carbon dioxide forms simultaneously.

Table 6
R² (train) values after several Y-scrambling tests.

Iteration   R² (train)
1           0.060
2           0.074
3           0.119
4           0.027
5           0.188
6           0.102
7           0.119
8           0.209
9           0.096
10          0.039

Table 5
Validation parameters and statistical results of the GA-MLR model.

          n    R²      R²adj   RMSE    F        k       k′      s
Train     18   0.942   0.930   0.127   76.50    1.004   0.984   0.144
Test      5    0.976   0.904   0.060   17.31    0.962   1.035   0.135
Overall   23   0.950   0.942   0.116   123.79   0.999   0.990   0.128


As can be seen, the reaction between CO2 and an amine-based solvent takes place because of the N–H bond. The NH group is thus the active site of the amine molecule, where the base molecule (water) takes part in the termolecular reaction. Consequently, the number of N–H bonds, or in other words the number of primary and secondary amine groups in the amine molecule, plays an important role in the capacity of amines for CO2 absorption. The numbers of primary (nRNH2) and secondary (nRNHR) amines are the two main descriptors appearing in the model. According to Fig. 4, these two descriptors have positive effects and the highest mean effects. These results show that the proposed model is consistent with the chemical reaction mechanism.

The model also contains the number of hydrogen atoms (nH) as a descriptor. Fig. 4 shows that nH has a positive effect that is considerably smaller than that of the other two descriptors. The presence of nH in the model can be explained by the experimental work of Singh et al., who showed that an increase in the chain length between the amine and other functional groups in the amine structure results in an increase in the capacity for CO2 absorption (Singh et al., 2007). Increasing the chain length increases the number of hydrogen atoms, so a positive effect of nH is consistent with the experimental work.

Finally, it should be noted that the model is attractively simple and that its results are quite acceptable for predicting the capacity of amines for carbon dioxide absorption. Although the accuracy of the model is good for linear amine compounds, it is poorer for unsaturated cyclic amines. This can be explained by two main reasons. First, the three descriptors in the model are not sensitive to ring-type functional groups; they only count the number of hydrogen atoms, primary amines, and secondary amines. Second, unsaturated cyclic amines show poor absorption rate and capacity and are not potential absorbents for CO2 absorption (2). Therefore, from an industrial point of view it is preferable to use linear amines for CO2 absorption, and it is more important for the model to predict the CO2 absorption capacity of linear amines than that of unsaturated cyclic amines.

Fig. 2. Williams plot of GA-MLR model development.

Fig. 3. Experimental vs. predicted rich loading values (mol CO2/mol amine), with the regression line.

Fortunately, the results of the first equation (Eq. (1)) for predicting the CO2 absorption capacity of amines are largely acceptable for both linear and aromatic ring-type amines (items 5, 6 and 21). This is because of the RDF descriptor in that model. RDF descriptors are based on the distance distribution in the geometrical representation of a molecule. This function is independent of the number of atoms and is invariant to translation and rotation of the entire molecule. The RDF code provides valuable information about, for example, bond distances, ring types, planar and non-planar systems, and atom types, so it is sensitive to aromatic rings (Todeschini and Consonni, 2008).

6. Conclusions

One of the main concerns of the natural gas industry is to have a robust and accurate model that can predict the chemical behavior of amines in gas treatment processes. This study attempted to identify the effects of the chemical structure of amines on their capacity for carbon dioxide absorption and to develop a model for this purpose that is not only robust and accurate but also simple and applicable. The QSPR approach was therefore chosen as the modeling technique, and the model was developed with a linear method for the sake of simplicity. As a result, two linear equations were developed. The first model shows high predictive power, while the second is notably simpler and readily interpretable in terms of the chemistry of the reaction of amines with carbon dioxide. Consequently, the second equation was introduced as the preferred model of this study. The most important descriptors in the model, in decreasing order of the weight of the corresponding variable, are the number of primary aliphatic amines (nRNH2), the number of secondary aliphatic amines (nRNHR), and the number of hydrogen atoms (nH). The accuracy and predictive performance of the model, validated with various statistical tests and examined with a test set of five molecules, permit the use of this model to estimate the rich loading of other amines under specific conditions. According to the results, it can be argued that a good amine solvent for carbon dioxide absorption should have a linear structure with a high number of primary and secondary amine groups as side chains. In other words, increasing the number of primary and secondary amine groups increases the number of N–H bonds, the active sites at which the reaction of the amine with CO2 takes place.

The promising results of this study might help other researchers in chemistry and natural gas engineering to design and synthesize new potential amine-based solvents and to investigate the feasibility of using them in gas removal processes. New, improved solvents should also be compared with more conventional ones in terms of corrosivity, energy efficiency, and operability.

Acknowledgment

The authors gratefully acknowledge the support of the Institute of Petroleum Engineering (IPE), University of Tehran.

List of symbols

CO2         carbon dioxide
QSPR/QSAR   quantitative structure property/activity relationship
DFT         density functional theory
MLR         multiple linear regression
GA          genetic algorithms
PCR         principal component regression
PLS         partial least squares
HOMO        highest occupied molecular orbital
LUMO        lowest unoccupied molecular orbital
AC          absorption capacity
PCA         principal component analysis
LOO-CV      leave-one-out cross-validation
RMSE        root mean square error
df_M        degrees of freedom of the model
df_E        degrees of freedom of the error

Fig. 4. Mean effects of model descriptors (standardized coefficient values).

References

Beheshti, Abolghasem, Riahi, Siavash, Ganjali, Mohammad Reza, 2009. Quantitative structure–property relationship study on first reduction and oxidation potentials of donor-substituted phenylquinolinylethynes and phenylisoquinolinylethynes: quantum chemical investigation. Electrochim. Acta 54 (23), 5368–5375.

Beheshti, A., Norouzi, P., Ganjali, M.R., 2012. A simple and robust model for predicting the reduction potential of quinones family: electrophilicity index effect. Int. J. Electrochem. Sci. 7, 4811–4821.

Bohloul, M.R., Vatani, A., Peyghambarzadeh, S.M., 2014. Experimental and theoretical study of CO2 solubility in N-methyl-2-pyrrolidone (NMP). Fluid Phase Equilibr. 365, 106–111.

Caplow, Michael, 1968. Kinetics of carbamate formation and breakdown. J. Am. Chem. Soc. 90 (24), 6795–6803.

Chakraborty, A.K., et al., 1988. Molecular orbital approach to substituent effects in amine-CO2 interactions. J. Am. Chem. Soc. 110 (21), 6947–6954.

Chakraborty, A.K., Astarita, G., Bischoff, K.B., 1986. CO2 absorption in aqueous solutions of hindered amines. Chem. Eng. Sci. 41 (4), 997–1003.

Cramer, Christopher J., 2005. Essentials of Computational Chemistry: Theories and Models. Wiley.com.

Crooks, John E., Donnellan, J. Paul, 1989. Kinetics and mechanism of the reaction between carbon dioxide and amines in aqueous solution. J. Chem. Soc. Perkin Trans. 2 (4), 331–333.

da Silva, Eirik F., Svendsen, Hallvard F., 2004. Ab initio study of the reaction of carbamate formation from CO2 and alkanolamines. Indust. Eng. Chem. Res. 43 (13), 3413–3418.

Depczynski, Uwe, Frost, V.J., Molt, K., 2000. Genetic algorithms applied to the selection of factors in principal component regression. Anal. Chim. Acta 420 (2), 217–227.

Eriksson, Lennart, Jaworska, Joanna, Worth, Andrew P., Cronin, Mark T.D., McDowell, Robert M., Gramatica, Paola, 2003. Methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based QSARs. Environ. Health Perspect. 111 (10), 1361.

Fini, Mojtaba Fallah, Riahi, Siavash, Bahramian, Alireza, 2012. Experimental and QSPR studies on the effect of ionic surfactants on n-decane–water interfacial tension. J. Surfact. Deterg. 15 (4), 477–484.

Freire, Mara G., et al., 2010. Solubility of non-aromatic ionic liquids in water and correlation using a QSPR approach. Fluid Phase Equilibr. 294 (1), 234–240.

Frisch, Michael J., Nielsen, Alice B., Frisch, Aeleen (Eds.), 1998. Gaussian 98. Gaussian, Incorporated.

Godavarthy, Srinivasa S., Robinson Jr., Robert L., Gasem, Khaled A.M., 2006. SVRC–QSPR model for predicting saturated vapor pressures of pure fluids. Fluid Phase Equilibr. 246 (1), 39–51.

Golbraikh, Alexander, Tropsha, Alexander, 2002. Beware of q2! J. Mol. Graph. Model. 20 (4), 269–276.

Hu, Rongjing, et al., 2009. QSAR models for 2-amino-6-arylsulfonylbenzonitriles and congeners HIV-1 reverse transcriptase inhibitors based on linear and nonlinear regression methods. Eur. J. Med. Chem. 44 (5), 2158–2171.

Jouan-Rimbaud, Delphine, et al., 1995. Genetic algorithms as a tool for wavelength selection in multivariate calibration. Anal. Chem. 67 (23), 4295–4301.

Katritzky, Alan R., et al., 2000. QSPR correlation and predictions of GC retention indexes for methyl-branched hydrocarbons produced by insects. Anal. Chem. 72 (1), 101–109.

Kohl, Arthur L., Nielsen, Richard, 1997. Gas Purification (access online via Elsevier).

Lee, Anita S., Kitchin, John R., 2012. Chemical and molecular descriptors for the reactivity of amines with CO2. Indust. Eng. Chem. Res. 51 (42), 13609–13618.

Liang, Guijie, Xu, Jie, Liu, Li, 2013. QSPR analysis for melting point of fatty acids using genetic algorithm based multiple linear regression (GA-MLR). Fluid Phase Equilibr. 353, 15–21.

Marengo, Emilio, et al., 1992. Comparative study of different structural descriptors and variable selection approaches using partial least squares in quantitative structure-activity relationships. Chemometr. Intell. Lab. Syst. 14 (1), 225–233.

Mokhatab, Saeid, Poe, William A., 2012. Handbook of Natural Gas Transmission and Processing (access online via Elsevier).

Netzeva, Tatiana I., Worth, Andrew P., Aldenberg, Tom, Benigni, Romualdo, Cronin, Mark T.D., Gramatica, Paola, Jaworska, Joanna S., et al., 2005. Current status of methods for defining the applicability domain of (quantitative) structure–activity relationships. ATLA 33, 155–173.

OECD, 2007. Guidance Document on the Validation of (Quantitative) Structure-Activity Relationships [(Q)SAR] Models. Organisation for Economic Co-Operation and Development, Paris.

Pourbasheer, Eslam, et al., 2011. Prediction of solubility of fullerene C60 in various organic solvents by genetic algorithm-multiple linear regression. Fullerenes Nanotubes Carbon Nanostruct. 19 (7), 585–598.

Riahi, Siavash, Ganjali, Mohammad Reza, Norouzi, Parviz, Jafari, Fatemeh, 2008. Application of GA-MLR, GA-PLS and the DFT quantum mechanical (QM) calculations for the prediction of the selectivity coefficients of a histamine-selective electrode. Sens. Actuat. B Chem. 132 (1), 13–19.

Riahi, Siavash, Pourbasheer, Eslam, Ganjali, Mohammad Reza, Norouzi, Parviz, 2009. Investigation of different linear and nonlinear chemometric methods for modeling of retention index of essential oil components: concerns to support vector machine. J. Hazard. Mater. 166 (2), 853–859.

Riahi, Siavash, Beheshti, Abolghasem, Ganjali, Mohammad Reza, Norouzi, Parviz, 2008. A novel QSPR study of normalized migration time for drugs in capillary electrophoresis by new descriptors: quantum chemical investigation. Electrophoresis 29 (19), 4027–4035.

Riahi, Siavash, et al., 2008. QSAR study of 2-(1-propylpiperidin-4-yl)-1H-benzimidazole-4-carboxamide as PARP inhibitors for treatment of cancer. Chem. Biol. Drug Design 72 (6), 575–584.

Sartori, Guido, Savage, David W., 1983. Sterically hindered amines for carbon dioxide removal from gases. Indust. Eng. Chem. Fundam. 22 (2), 239–249.

Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2007. Structure and activity relationships for amine based CO2 absorbents-I. Int. J. Greenhouse Gas Control 1 (1), 5–10.

Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2009. Structure and activity relationships for amine-based CO2 absorbents-II. Chem. Eng. Res. Design 87 (2), 135–144.

Todeschini, R., Consonni, V., Mauri, A., Pavan, M., 2002. DRAGON-Software for the calculation of molecular descriptors, version 2.1.

Todeschini, Roberto, Consonni, Viviana, 2008. Handbook of Molecular Descriptors. John Wiley & Sons.

Tropsha, Alexander, Gramatica, Paola, Gombar, Vijay K., 2003. The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb. Sci. 22 (1), 69–77.

XLSTAT, 2013. Software, XLSTAT-CCR module, trial version.


Table 1
The structure, experimental and calculated values of amines capacities for CO2 absorption (rich loading).

No.   Name                                         Exp.   Eq. (2) (Model)   Eq. (1)
1     1,2-Diaminopropane                           1.27   1.29              1.23
2     1,3-Diaminopropane                           1.30   1.29              1.25
3     1,4-Diaminobutane (T)                        1.26   1.37              1.29
4     2-Amino-1-butanol                            0.88   0.79              0.80
5     2-Methylpyridine                             0.06   0.09              0.08
6     2-Pyridylamine                               0.28   0.59              0.23
7     3-Amino-1-propanol                           0.88   0.71              0.72
8     4-Amino-1-butanol                            0.83   0.79              0.76
9     5-Amino-1-pentanol (T)                       0.84   0.87              0.85
10    Butylamine                                   0.86   0.79              0.84
11    Diethylenetriamine                           1.83   1.81              1.77
12    Ethylamine                                   0.91   0.63              0.82
13    Ethylenediamine                              1.08   1.21              1.20
14    Hexamethylenediamine                         1.48   1.53              1.46
15    Isobutylamine (T)                            0.78   0.79              0.82
16    Monoethanolamine                             0.72   0.63              0.61
17    N-(2-Hydroxyethyl)ethylenediamine            1.15   1.23              1.17
18    N,N′-bis(2-hydroxyethyl)ethylenediamine      1.20   1.25              1.27
19    N-Pentylamine                                0.72   0.87              0.90


relation with absorption capacity was kept and the other descriptors were eliminated (611 excluded).

After the above constraints were applied, a total of 514 descriptors were selected for each molecule as the output of this stage.

Finally, the calculated descriptors formed a (23 × 545) data matrix, where 23 is the number of compounds and 545 is the number of descriptors.

3. Model development

After the descriptors were calculated, GA-MLR was applied as the variable selection and model development procedure to obtain the model with the highest predictive power based on the training set. The procedure for constructing the training and test sets is discussed in the Results section. The GA-MLR analysis led to the development of one model with three variables. The following linear equation was built from the molecules of the training set:

$$AC = 0.36 + 1.57\,\mathrm{Mor09v} - 0.32\,\mathrm{RDF035m} + 0.43\,nN \tag{1}$$

AC stands for absorption capacity. Mor09v is one of the 3D-MoRSE descriptors and is defined as signal 09 weighted by van der Waals volume; RDF035m belongs to the group of RDF descriptors and is the radial distribution function 035 weighted by mass; and nN is the number of nitrogen atoms. Two of the descriptors in the above model are difficult to calculate, because these calculations have to be performed by computer. It is also not easy to describe the relationship between these two descriptors and the absorption capacity of amines. In QSPR studies, interpretation of the model and its descriptors is a necessary and important step, so it was decided to investigate new models with simpler descriptors. In addition, given the chemical reaction of amines with carbon dioxide, it was expected that the number of amino groups may affect the capacity of amines for carbon dioxide absorption. The chemistry of the reactions of carbon dioxide with amine-based solvents is presented in the Discussion section. After numerous simple equations had been developed and evaluated with different statistical methods, the following model was selected:

$$AC = -0.19 + 0.04\,nH + 0.54\,nRNH2 + 0.40\,nRNHR \tag{2}$$

Table 2 lists some statistical factors to allow a better comparison between the two models. The first equation has better statistical parameters, but the simpler descriptors of the second model, both in calculation and in interpretation of the results, are more important. We therefore introduce the second equation as the preferred model for predicting the absorption capacity of amines, and the rest of this paper, including the Discussion and Conclusions sections, focuses on this model.

Molecular descriptors and their definitions are given in Table 3. The correlation matrix of the descriptors is shown in Table 4. The linear correlation coefficient for each pair of descriptors is less than 0.65, which indicates that the descriptors are sufficiently independent of each other to be used in a QSPR model.

As can be seen, the three descriptors in the model are easily calculated, so no computer calculation is needed. Moreover, this model shows high statistical quality. Indeed, to the best of our knowledge, the above model is the simplest equation that can predict the capacity of amines for carbon dioxide absorption under specific conditions.
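Because the descriptors of Eq. (2) are simple counts, the model can be evaluated by hand or in a few lines of code. The sketch below takes the intercept as −0.19, the sign that reproduces the Eq. (2) column of Table 1; the function name is hypothetical, and the descriptor counts for the three example amines are read from their molecular formulas rather than taken from the source.

```python
def absorption_capacity(n_h, n_rnh2, n_rnhr):
    """Rich loading (mol CO2/mol amine) predicted by Eq. (2).

    n_h    : number of hydrogen atoms (nH)
    n_rnh2 : number of primary aliphatic amine groups (nRNH2)
    n_rnhr : number of secondary aliphatic amine groups (nRNHR)
    """
    return -0.19 + 0.04 * n_h + 0.54 * n_rnh2 + 0.40 * n_rnhr

# Descriptor counts derived from molecular formulas (illustrative).
examples = {
    "Ethylamine (C2H7N)": (7, 1, 0),
    "Diethylenetriamine (C4H13N3)": (13, 2, 1),
    "Triethylenetetramine (C6H18N4)": (18, 2, 2),
}
for name, counts in examples.items():
    print(f"{name}: {absorption_capacity(*counts):.2f}")
# Ethylamine (C2H7N): 0.63
# Diethylenetriamine (C4H13N3): 1.81
# Triethylenetetramine (C6H18N4): 2.41
```

These three values match the Eq. (2) column of Table 1 for compounds 12, 11 and 23.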

4. Results

One of the most critical factors that influence the quality of a regression model is how the training and test sets are selected and constructed so as to ensure molecular diversity in both of them. To take this into account, 18 of the 23 amine-based carbon dioxide absorbents (about 80% of the molecules) were selected to construct the training set, and the remaining 5 molecules (about 20%) formed the test set. The test set was used for external validation of the model. One of the common techniques in the QSPR approach for constructing training and test sets under the constraint of structural diversity is principal component analysis (PCA) (Hu et al., 2009; Riahi et al., 2008). In the current work, PCA was employed to divide the data set of molecules into training and test sets. For this purpose, PC1 and PC2 were calculated from the descriptors in the model. The results showed that these two principal components accounted for 57.5% and 33.5% of the variation in the data, respectively, and played the main roles. Fig. 1 shows the distribution of the data on PC1 and PC2; from this figure it can be concluded that the compounds in the training and test sets are representative of the whole data set.

Table 1 (continued)

No.   Name                        Exp.   Eq. (2) (Model)   Eq. (1)
20    Propylamine (T)             0.77   0.71              0.80
21    Pyridine (T)                0.05   0.01              0.12
22    sec-Butylamine              0.84   0.79              0.87
23    Triethylenetetramine        2.51   2.41              2.47

All absorption capacity (rich loading) values are in mol CO2/mol amine. Names marked with (T) are test set molecules.

Table 2
Some basic statistical values for the two models.

Models    Descriptors             R²      Q²      F        s
Eq. (1)   nN, Mor09v, RDF035m     0.979   0.971   300.96   0.082
Eq. (2)   nH, nRNH2, nRNHR        0.950   0.945   121.54   0.127

All statistical parameters in this table were calculated before the training and test procedure.

Table 3
The three molecular descriptors used in Eq. (2).

Descriptor   Type                      Definition
nH           Constitutional indices    Number of hydrogen atoms
nRNH2        Functional group counts   Number of primary amines (aliphatic)
nRNHR        Functional group counts   Number of secondary amines (aliphatic)

The training set was used to build the model while the test set

was used to validate the predicting power During the model

development procedure leave-one-out cross-validation (LOO-CV)method was applied to assess the performances of different

resulting models The Q 2LOO was calculated for each obtained

equation and then the best model was selected based on the high

value of this parameter There are some statistical tests and pa-

rameters that need to be considered Coef 1047297cient of determination

(R2) adjusted R2 Coef 1047297cient of leave-one-out cross validation (Q 2)

the slopes of regression lines forced through zero (k k0) root mean

square error (RMSE) and standard error of the estimate (s) are the

most important ones The 1047297rst 1047297ve parameters should be near to

unity while RMSE and s should be low enough near to zero

Furthermore the intercepts of the model should be close to zero

Moreover the Fisher function (F ) is another vital statistical test

High values of the F -ratio test indicate reliable models All statistical

parameters formulas used in this paper are mentioned below

R2 frac14 1

Pnifrac141

yexp

i ycalc

i

2

Pnifrac141

yexp

i y2

(3)

RMSE frac14

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiPnifrac141

yexp

i ycalc

i

2

n

v uut (4)

F frac14

Pnifrac141

yexp ycalc

i

2

df M

Pnifrac141

yexpi ycalci

2df E

(5)

k frac14

P yexp

i ycalc

iP ycalc

i

2 (6)

k0 frac14

P yexp

i ycalc

iP yexp

i

2 (7)

s frac14

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiPnifrac141

yexp

i ycalc

i

2

df E

v uut (8)

where df M and df E refer to the degrees of freedom of the model and

error respectively

Also the following criteria described by Golbaraikh and Tropsha

were applied to check the predictability of the QSPR model

(Golbraikh and Tropsha 2002)

1

Q 2 gt05 (9)

2

R2gt 06 (10)

3 R2 R2

0R2 lt0

1 and 0

85 k 1

15 (11)

where R20 is the coef 1047297cient of determination characterizing linear

regression with Y -intercept set at zero The predicted result of all

molecules either in training or test set with statistical parameters

are given in Table 5

Table 4

Correlation matrix of three descriptors used in Eq (2)

Descriptor nH nRNH2 nRNHR

nH 1000 0344 0642

nRNH2 0344 1000 0005

nRNHR 0642 0005 1000

Fig 1 The principal component analysis of the molecules in training and test sets Some points belong to more than one molecule

M Momeni S Riahi Journal of Natural Gas Science and Engineering 21 (2014) 442e450446

7182019 Prediction of Amines Capacity for Carbon Dioxide Absorption in Gas

httpslidepdfcomreaderfullprediction-of-amines-capacity-for-carbon-dioxide-absorption-in-gas 69

In Table6 Y-scrambling test was applied in order to examine the

robustness of the model (Tropsha et al 2003) In Y-scrambling test

the dependent variable (Absorption Capacity) is randomly dedi-

cated to different amines and new QPSR modeling is performed

based on the previous matrix of independent variables It is ex-

pectedthat newly developedQSPR models shouldhave lowenough

R2 and Q 2 values If it happens differently the reported model is not

accurate for the particular data set and method of modeling

The applicability domain of the model was studied by Williams

plot in Fig 2 (OECD 2007) In Williams plot the standardized re-

siduals (R) versus the leverage (hat diagonal) values (h) were

plotted Leverage demonstrates the distance of a compound from

the centroid of the X where X is the descriptor matrix The leverage

of a compound is calculated by the following equation (Netzeva

et al 2005)

hi frac14 xT i

X T X

1 xi (12)

where xi is the descriptor vector of the relevant compound The

warning leverage (h) is de1047297ned as (Eriksson et al 2003)

h frac143eth p thorn 1THORN

n (13)

n is the number of training objects and p is the number of de-

scriptors in the model Williams plot is used to identify both the

response outlier and the structurally in1047298uential chemicals in the

model A compound with hi gt h in1047298uence the regression line but it

does not consider as an outlier as its corresponding standardized

residual might be small In this data set the warning value of leverage is around 067 Furthermore compounds with standard-

ized residual rather than three standard residual unit (gt3s) is

considered as an outlier compound It is common in the literature

to use 3 as an accepted cut-off value for evaluating prediction re-

sults of the model

Fig. 2 shows that no chemical has a leverage higher than the warning value h* of 0.67. It also shows that there is no outlier in the training or test sets: all compounds lie between the two horizontal lines.

The experimental absorption capacity (rich loading) values of the amines are plotted in Fig. 3 against the corresponding values calculated by the QSPR model.

Furthermore, the mean effect (MF) is another term that helps to interpret the results; it shows the effect of each descriptor individually or relative to the other descriptors. Fig. 4 shows the standardized coefficients (also called beta coefficients) (XLSTAT, 2013), which are used to compare the relative weights of the descriptors. The higher the standardized coefficient of a descriptor, the greater the weight of the corresponding variable in the model. This figure therefore demonstrates the mean effect of the descriptors in the model.

From this figure it can be concluded that the number of primary amines (nRNH2) and the number of secondary amines (nRNHR) play the main roles, in that order, in the amine capacity for carbon dioxide absorption, and that the number of hydrogen atoms (nH) has the least effect. The figure also shows that all descriptors in the model have positive effects, so the amine capacity for CO2 absorption increases with each of these descriptors.

Finally, it should be noted that the present work focuses exclusively on developing a simple model by which amine capacities for carbon dioxide absorption can be predicted. The main difference between this study and previous ones is that this work concentrates first on a quantitative and then on a qualitative representation of structural effects on the capacity of amines for CO2 absorption.

5. Discussion

Although high statistical parameters are important in demonstrating the capability of the model, a QSPR model should also provide insight into the mechanism of carbon dioxide solubility in amine-based solvents. For this reason, an acceptable interpretation of the descriptors in the QSPR model should be provided. It is useful to identify which parameters affect the amine capacity and which descriptors could appear in the model because of the principal chemical reactions between carbon dioxide and an amine-based solvent.

The overall reaction mechanism for chemical absorption of CO2 in amine solvent systems is still under debate. A mechanism based on the formation of a zwitterion intermediate, which is then deprotonated by a base B, was suggested by Caplow (1968) and proceeds through reactions (1) and (2) below:

$$\mathrm{CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NH^{+}COO^{-}} \tag{1}$$

$$\mathrm{R_1R_2NH^{+}COO^{-} + B \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \tag{2}$$

R1 and R2 denote substituent groups attached to the amine group, and B is a base molecule, which can be a water molecule. The intermediate in this reaction is the zwitterion. More recent studies, however, showed that the zwitterion appears to be short-lived and may be an entirely transient state (da Silva and Svendsen, 2004). This led to the assumption of a single-step mechanism (reaction (3)). A termolecular single-step mechanism was suggested by Crooks and Donnellan (1989):

$$\mathrm{B + CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \tag{3}$$

where B is again the base molecule. In this mechanism, the NH group is attacked by the base molecule and deprotonation of the amine occurs. The bond between the amine and carbon dioxide forms simultaneously.

Table 6
R² (training set) values after several Y-scrambling tests.

Iteration   R² train
1           0.060
2           0.074
3           0.119
4           0.027
5           0.188
6           0.102
7           0.119
8           0.209
9           0.096
10          0.039

Table 5
Validation parameters and statistical results of the GA-MLR model.

Set       n    R²      R²adj   RMSE    F        k       k′      s
Train     18   0.942   0.930   0.127   76.50    1.004   0.984   0.144
Test      5    0.976   0.904   0.060   17.31    0.962   1.035   0.135
Overall   23   0.950   0.942   0.116   123.79   0.999   0.990   0.128
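As a quick consistency check on Table 5, the adjusted R² can be recomputed from R², n and the number of descriptors. The sketch below assumes the standard definition R²adj = 1 − (1 − R²)(n − 1)/(n − p − 1) with p = 3, and reads the table entries as three-decimal values (0.942, 0.976, 0.950). The paper does not state this formula explicitly, so it is an assumption here.

```python
def adjusted_r2(r2, n, p):
    """Standard adjusted R^2 for n observations and p descriptors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# (n, R^2) pairs as read from Table 5; p = 3 descriptors in Eq. (2)
table5 = {"Train": (18, 0.942), "Test": (5, 0.976), "Overall": (23, 0.950)}
r2_adj = {name: round(adjusted_r2(r2, n, p=3), 3) for name, (n, r2) in table5.items()}
```

Under these assumptions the recomputed values agree with the R²adj column of the table (0.930, 0.904 and 0.942).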

M. Momeni, S. Riahi / Journal of Natural Gas Science and Engineering 21 (2014) 442–450


As can be seen, the reaction between CO2 and an amine-based solvent takes place because of the N–H bond. The NH group is therefore the active site of the amine molecule, where the base molecule (water) takes part in the termolecular reaction. Consequently, the number of N–H bonds, or in other words the number of primary and secondary amine groups in the amine molecule, plays an important role in the capacity of amines for CO2 absorption. The numbers of primary (nRNH2) and secondary (nRNHR) amines are the two main descriptors appearing in the model. According to Fig. 4, these two descriptors have a positive effect and the higher mean effects. These results demonstrate that the proposed model is consistent with the chemical reaction mechanism.

The model also contains the number of hydrogen atoms (nH) as another descriptor. Fig. 4 shows that nH has a positive effect that is considerably smaller than that of the other two descriptors. The presence of nH in the model can be explained by the experimental work of Singh et al., who showed that an increase in the chain length between the amine and other functional groups in the amine structure results in an increase in amine capacity for CO2 absorption (Singh et al., 2007). Increasing the chain length increases the number of hydrogen atoms, so a positive effect of nH is consistent with the experimental work.

Finally, it should be noted that the model is attractively simple and that its results are quite acceptable for predicting amine capacity for carbon dioxide absorption. Although the accuracy of the model is good for linear amine compounds, it is poorer for unsaturated cyclic amines. This can be explained by two main reasons. First, the three descriptors in the model are not sensitive to ring-type functional groups and only count the number of hydrogen atoms, primary amines and secondary amines. Second, unsaturated cyclic amines show poor absorption rate and capacity, and they are not potential absorbents for CO2 absorption (2). Therefore, from an industrial point of view it is preferable to use linear amines for CO2 absorption, and it is more important for the model to predict CO2 absorption capacity for linear amines than for unsaturated cyclic amines.

Fig. 2. Williams plot of GA-MLR model development.

Fig. 3. Experimental vs. predicted rich loading values (mol CO2/mol amine) – regression line.

Fortunately, the results of the first equation (Eq. (1)) for predicting amine capacity for CO2 absorption are largely acceptable for both linear and aromatic ring-type amines (items labeled 5, 6 and 21). This is because of the RDF descriptor in this model. RDF descriptors are based on the distance distribution in the geometrical representation of a molecule. This function is independent of the number of atoms and is invariant to translation and rotation of the entire molecule. The RDF code provides valuable information, e.g. about bond distances, ring types, planar and non-planar systems and atom types, so it is sensitive to aromatic rings (Todeschini and Consonni, 2008).

6. Conclusions

One of the main concerns of the natural gas industry is to have a robust and accurate model that can predict the chemical behavior of amines in gas treatment processes. This study attempted to identify the effects of the chemical structure of amines on their capacity for carbon dioxide absorption and to develop a model for this purpose that is not only robust and accurate but also simple and applicable. Therefore, the QSPR approach was chosen as the modeling technique, and the model was developed with a linear method for its simplicity. As a result, two linear equations were developed. The first model demonstrates high predictive power, while the second is notably simpler and readily interpretable in terms of the chemistry of the amine reaction with carbon dioxide. Consequently, the second equation was introduced as the preferred model of this study. The most important descriptors appearing in the model, in order of the weight of the corresponding variable, are the number of primary aliphatic amines (nRNH2), the number of secondary aliphatic amines (nRNHR) and the number of hydrogen atoms (nH). The accuracy and predictive performance of the model, validated with various statistical tests and examined with a test set of five molecules, permit using this model to estimate the rich loading of other amines under specific conditions. According to the results, it could be argued that a good amine solvent for carbon dioxide absorption should have a linear structure with a high number of primary and secondary amine groups as side chains. In other words, increasing the number of primary and secondary amine groups increases the number of N–H bonds, the active sites at which the amine reaction with CO2 takes place.

The promising results of this study might help other researchers in chemistry and natural gas engineering to design and synthesize new potential amine-based solvents and to investigate the feasibility of using them in gas removal processes. New improved solvents should also be compared with more conventional ones from the corrosivity, energy efficiency and operability points of view.

Acknowledgment

The authors gratefully acknowledge the support of the Institute of Petroleum Engineering (IPE), University of Tehran.

List of symbols

CO2 carbon dioxide

QSPR/QSAR quantitative structure property/activity relationship

DFT Density Functional Theory

MLR Multiple Linear Regression

GA Genetic Algorithms

PCR principal component regression

PLS partial least square

HOMO Highest Occupied Molecular Orbital

LUMO Lowest Unoccupied Molecular Orbital

AC absorption capacity

PCA principal component analysis

LOO-CV Leave-one-out cross-validation

RMSE root mean square error

df_M degrees of freedom of the model

df_E degrees of freedom of the error

Fig. 4. Mean effects of model descriptors (standardized coefficient values).

References

Beheshti, Abolghasem, Riahi, Siavash, Ganjali, Mohammad Reza, 2009. Quantitative structure–property relationship study on first reduction and oxidation potentials of donor-substituted phenylquinolinylethynes and phenyl-isoquinolinylethynes: quantum chemical investigation. Electrochim. Acta 54 (23), 5368–5375.

Beheshti, A., Norouzi, P., Ganjali, M.R., 2012. A simple and robust model for predicting the reduction potential of quinones family: electrophilicity index effect. Int. J. Electrochem. Sci. 7, 4811–4821.

Bohloul, M.R., Vatani, A., Peyghambarzadeh, S.M., 2014. Experimental and theoretical study of CO2 solubility in N-methyl-2-pyrrolidone (NMP). Fluid Phase Equilibr. 365, 106–111.

Caplow, Michael, 1968. Kinetics of carbamate formation and breakdown. J. Am. Chem. Soc. 90 (24), 6795–6803.

Chakraborty, A.K., et al., 1988. Molecular orbital approach to substituent effects in amine-CO2 interactions. J. Am. Chem. Soc. 110 (21), 6947–6954.

Chakraborty, A.K., Astarita, G., Bischoff, K.B., 1986. CO2 absorption in aqueous solutions of hindered amines. Chem. Eng. Sci. 41 (4), 997–1003.

Cramer, Christopher J., 2005. Essentials of Computational Chemistry: Theories and Models. Wiley.com.

Crooks, John E., Donnellan, J. Paul, 1989. Kinetics and mechanism of the reaction between carbon dioxide and amines in aqueous solution. J. Chem. Soc. Perkin Trans. 2 (4), 331–333.

da Silva, Eirik F., Svendsen, Hallvard F., 2004. Ab initio study of the reaction of carbamate formation from CO2 and alkanolamines. Indust. Eng. Chem. Res. 43 (13), 3413–3418.

Depczynski, Uwe, Frost, V.J., Molt, K., 2000. Genetic algorithms applied to the selection of factors in principal component regression. Anal. Chim. Acta 420 (2), 217–227.

Eriksson, Lennart, Joanna, Jaworska, Worth, Andrew P., Cronin, Mark T.D., McDowell, Robert M., Gramatica, Paola, 2003. Methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based QSARs. Environ. Health Perspect. 111 (10), 1361.

Fini, Mojtaba Fallah, Riahi, Siavash, Alireza, Bahramian, 2012. Experimental and QSPR studies on the effect of ionic surfactants on n-Decane–water interfacial tension. J. Surfact. Deterg. 15 (4), 477–484.

Freire, Mara G., et al., 2010. Solubility of non-aromatic ionic liquids in water and correlation using a QSPR approach. Fluid Phase Equilibr. 294 (1), 234–240.

Frisch, Michael J., Nielsen, Alice B., Frisch, Aeleen (Eds.), 1998. Gaussian 98. Gaussian, Incorporated.

Godavarthy, Srinivasa S., Robinson Jr., Robert L., Gasem, Khaled A.M., 2006. SVRC–QSPR model for predicting saturated vapor pressures of pure fluids. Fluid Phase Equilibr. 246 (1), 39–51.

Golbraikh, Alexander, Tropsha, Alexander, 2002. Beware of q2! J. Mol. Graph. Model. 20 (4), 269–276.

Hu, Rongjing, et al., 2009. QSAR models for 2-amino-6-arylsulfonylbenzonitriles and congeners HIV-1 reverse transcriptase inhibitors based on linear and nonlinear regression methods. Eur. J. Med. Chem. 44 (5), 2158–2171.

Jouan-Rimbaud, Delphine, et al., 1995. Genetic algorithms as a tool for wavelength selection in multivariate calibration. Anal. Chem. 67 (23), 4295–4301.

Katritzky, Alan R., et al., 2000. QSPR correlation and predictions of GC retention indexes for methyl-branched hydrocarbons produced by insects. Anal. Chem. 72 (1), 101–109.

Kohl, Arthur L., Nielsen, Richard, 1997. Gas Purification (access online via Elsevier).

Lee, Anita S., Kitchin, John R., 2012. Chemical and molecular descriptors for the reactivity of amines with CO2. Indust. Eng. Chem. Res. 51 (42), 13609–13618.

Liang, Guijie, Jie, Xu, Li, Liu, 2013. QSPR analysis for melting point of fatty acids using genetic algorithm based multiple linear regression (GA-MLR). Fluid Phase Equilibr. 353, 15–21.

Marengo, Emilio, et al., 1992. Comparative study of different structural descriptors and variable selection approaches using partial least squares in quantitative structure-activity relationships. Chemometr. Intell. Lab. Syst. 14 (1), 225–233.

Mokhatab, Saeid, Poe, William A., 2012. Handbook of Natural Gas Transmission and Processing (access online via Elsevier).

Netzeva, Tatiana I., Worth, Andrew P., Aldenberg, Tom, Romualdo, Benigni, Cronin, Mark T.D., Gramatica, Paola, Jaworska, Joanna S., et al., 2005. Current status of methods for defining the applicability domain of (quantitative) structure–activity relationships. ATLA 33, 155–173.

OECD, 2007. Guidance Document on the Validation of (Quantitative) Structure-Activity Relationships [(Q)SAR] Models. Organisation for Economic Co-Operation and Development, Paris.

Pourbasheer, Eslam, et al., 2011. Prediction of solubility of fullerene C60 in various organic solvents by genetic algorithm-multiple linear regression. Fullerenes Nanotubes Carbon Nanostruct. 19 (7), 585–598.

Riahi, Siavash, Ganjali, Mohammad Reza, Norouzi, Parviz, Jafari, Fatemeh, 2008. Application of GA-MLR, GA-PLS and the DFT quantum mechanical (QM) calculations for the prediction of the selectivity coefficients of a histamine-selective electrode. Sens. Actuat. B Chem. 132 (1), 13–19.

Riahi, Siavash, Pourbasheer, Eslam, Ganjali, Mohammad Reza, Norouzi, Parviz, 2009. Investigation of different linear and nonlinear chemometric methods for modeling of retention index of essential oil components: concerns to support vector machine. J. Hazard. Mater. 166 (2), 853–859.

Riahi, Siavash, Beheshti, Abolghasem, Ganjali, Mohammad Reza, Norouzi, Parviz, 2008. A novel QSPR study of normalized migration time for drugs in capillary electrophoresis by new descriptors: quantum chemical investigation. Electrophoresis 29 (19), 4027–4035.

Riahi, Siavash, et al., 2008. QSAR study of 2-(1-Propylpiperidin-4-yl)-1H-Benzimidazole-4-Carboxamide as PARP inhibitors for treatment of cancer. Chem. Biol. Drug Design 72 (6), 575–584.

Sartori, Guido, Savage, David W., 1983. Sterically hindered amines for carbon dioxide removal from gases. Indust. Eng. Chem. Fundam. 22 (2), 239–249.

Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2007. Structure and activity relationships for amine based CO2 absorbents-I. Int. J. Greenhouse Gas Control 1 (1), 5–10.

Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2009. Structure and activity relationships for amine-based CO2 absorbents-II. Chem. Eng. Res. Design 87 (2), 135–144.

Todeschini, R., Consonni, V., Mauri, A., Pavan, M., 2002. DRAGON-Software for the calculation of molecular descriptors, version 2.1.

Todeschini, Roberto, Consonni, Viviana, 2008. Handbook of Molecular Descriptors. John Wiley & Sons.

Tropsha, Alexander, Gramatica, Paola, Gombar, Vijay K., 2003. The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb. Sci. 22 (1), 69–77.

XLSTAT, 2013. Software, XLSTAT-CCR module, Trial version.


relation with absorption capacity was saved and other descriptors were eliminated (611 excluded).

After the above constraints, a total of 514 descriptors were selected for each molecule as the output of this stage.

Finally, the calculated descriptors formed a (23 × 545) data matrix, where 23 is the number of compounds and 545 is the number of descriptors.

3. Model development

After descriptor calculation, GA-MLR was applied as a variable selection and model development procedure to obtain the best model with the highest predictive power based on the training set. The procedure for constructing the training and test sets is discussed in the Results section. The GA-MLR analysis led to the development of one model with three variables. The following linear equation was built on the molecules of the training set:

$$\mathrm{AC} = 0.36 + 1.57\,\mathrm{Mor09v} - 0.32\,\mathrm{RDF035m} + 0.43\,\mathrm{nN} \tag{1}$$

AC denotes absorption capacity. Mor09v is one of the 3D-MoRSE descriptors and is defined as signal 09 weighted by van der Waals volume; RDF035m belongs to the group of RDF descriptors and describes the radial distribution function-035 weighted by mass; and nN represents the number of nitrogen atoms. As can be seen, two of the descriptors in the above model are difficult to calculate because they must be computed by software. It is also not easy to describe the relationship between these two descriptors and the absorption capacity of amines. In QSPR studies, interpretation of the model and its descriptors is a necessary and important step. It was therefore decided to investigate new models with new, simple descriptors. In addition, because of the chemical reaction of amines with carbon dioxide, it was concluded that the number of amino groups may affect the amine capacity for carbon dioxide absorption. The chemistry of carbon dioxide reactions with amine-based solvents is presented in the Discussion section. After developing numerous simple equations and evaluating them with different statistical methods, the following model was selected:

$$\mathrm{AC} = -0.19 + 0.04\,\mathrm{nH} + 0.54\,\mathrm{nRNH2} + 0.40\,\mathrm{nRNHR} \tag{2}$$
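Eq. (2) is simple enough to evaluate by hand, and a short sketch makes the check explicit. Two assumptions are made here: the intercept is read as −0.19 (the minus sign is the reading that reproduces the Eq. (2) column of Table 1), and the descriptor counts are taken from the molecular formulas of the compounds (for example, propylamine C3H9N has nH = 9 and one primary amine group). The function name `predict_ac` is hypothetical.

```python
def predict_ac(n_h, n_rnh2, n_rnhr):
    """Eq. (2), read with a negative intercept (assumption, see text)."""
    return -0.19 + 0.04 * n_h + 0.54 * n_rnh2 + 0.40 * n_rnhr

# name: ((nH, nRNH2, nRNHR) from the molecular formula, Eq. (2) value in Table 1)
examples = {
    "Propylamine": ((9, 1, 0), 0.71),
    "Pyridine": ((5, 0, 0), 0.01),
    "sec-Butylamine": ((11, 1, 0), 0.79),
    "Triethylenetetramine": ((18, 2, 2), 2.41),
}

predictions = {
    name: round(predict_ac(*counts), 2) for name, (counts, _) in examples.items()
}
```

For these four compounds the rounded predictions coincide with the tabulated Eq. (2) values, in mol CO2/mol amine.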

Table 2 shows some statistical factors for comparing the two models. The first equation has better statistical parameters, but the simpler descriptors of the second model, in both calculation and interpretation of results, are more important. Therefore, we introduce the second equation as the preferred model for predicting the absorption capacity of amines, and the rest of this paper, including the Discussion and Conclusions sections, focuses on this model.

Molecular descriptors and their definitions are given in Table 3. The correlation matrix of the descriptors is shown in Table 4. The linear correlation value for each pair of descriptors is less than 0.65, which indicates that these descriptors are independent of each other and can be used to develop a QSPR model.

As can be observed, the three descriptors appearing in the model are easily calculated, so no computational chemistry calculation is needed. Moreover, this model demonstrates high statistical quality. Indeed, to the best of our knowledge, the above model is the simplest equation that can predict the capacity of amines for carbon dioxide absorption under specific conditions.

Table 1 (continued)

No.   Name                    Exp.   Eq. (2) (Model)   Eq. (1)
20    Propylamine (T)         0.77   0.71              0.80
21    Pyridine (T)            0.05   0.01              0.12
22    sec-Butylamine          0.84   0.79              0.87
23    Triethylenetetramine    2.51   2.41              2.47

All absorption capacity (rich loading) values are in mol CO2/mol amine. Names with a (T) superscript are test set molecules.

Table 2
Some basic statistical values for the two models.

Model     Descriptors            R²      Q²      F        s
Eq. (1)   nN, Mor09v, RDF035m    0.979   0.971   300.96   0.082
Eq. (2)   nH, nRNH2, nRNHR       0.950   0.945   121.54   0.127

All statistical parameters in this table were calculated before the training and test procedure.

Table 3
The three molecular descriptors used in Eq. (2).

Descriptor   Type                      Definition
nH           Constitutional indices    Number of hydrogen atoms
nRNH2        Functional group counts   Number of primary amines (aliphatic)
nRNHR        Functional group counts   Number of secondary amines (aliphatic)

4. Results

One of the most critical factors influencing the quality of a regression model is how the training and test sets are selected and constructed so as to ensure molecular diversity in both of them. To take this into account, from the total of 23 amine-based carbon dioxide absorbents, 18 molecules (about 80% of the molecules) were selected to construct the training set and 5 molecules formed the test set (about 20%). The test set was used for external cross-validation of the model. One of the common techniques in the QSPR approach for constructing training and test sets under the constraint of structural diversity is principal component analysis (PCA) (Hu et al., 2009; Riahi et al., 2008). In the current work, PCA was employed to divide the data set of molecules into training and test sets. For this purpose, PC1 and PC2 were calculated from the descriptors in the model. The results showed that these two principal components accounted for 57.5% and 33.5% of the variation in the data, respectively, and played the main roles. Fig. 1 shows the distribution of the data for PC1 and PC2; from this figure it can be concluded that the compounds in the training and test sets were representative of the whole data set.
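The PCA step used for the training/test split can be sketched as follows. This is a minimal illustration, assuming that the descriptors are autoscaled before the decomposition (the paper does not say how they were preprocessed) and using a random stand-in matrix instead of the actual descriptor values. The function name `pca_scores` is hypothetical.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project autoscaled descriptors onto the leading principal components.

    Returns the scores and the fraction of variance explained by each
    retained component.
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # autoscaling (assumption)
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S ** 2 / np.sum(S ** 2)
    return Z @ Vt[:n_components].T, explained[:n_components]

# Random stand-in for the 23 x 3 descriptor matrix (NOT the paper's data)
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(23, 3))
scores, explained = pca_scores(X_demo)
```

Plotting the two columns of `scores` against each other gives a map like Fig. 1, from which test compounds can be picked so that they are spread over the same region as the training compounds.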

The training set was used to build the model, while the test set was used to validate its predictive power. During model development, the leave-one-out cross-validation (LOO-CV) method was applied to assess the performance of the different resulting models. Q²LOO was calculated for each equation obtained, and the best model was then selected on the basis of a high value of this parameter. Several statistical tests and parameters need to be considered. The coefficient of determination (R²), the adjusted R², the coefficient of leave-one-out cross-validation (Q²), the slopes of the regression lines forced through zero (k, k′), the root mean square error (RMSE) and the standard error of the estimate (s) are the most important ones. The first five parameters should be close to unity, while RMSE and s should be close to zero. Furthermore, the intercepts of the model should be close to zero. Moreover, the Fisher function (F) is another vital statistical test; high values of the F-ratio indicate reliable models. The formulas of all statistical parameters used in this paper are given below.

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - \bar{y}\right)^2} \tag{3}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{n}} \tag{4}$$

$$F = \frac{\sum_{i=1}^{n}\left(\bar{y}^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2 / df_M}{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2 / df_E} \tag{5}$$

$$k = \frac{\sum y_i^{\mathrm{exp}}\, y_i^{\mathrm{calc}}}{\sum \left(y_i^{\mathrm{calc}}\right)^2} \tag{6}$$

$$k' = \frac{\sum y_i^{\mathrm{exp}}\, y_i^{\mathrm{calc}}}{\sum \left(y_i^{\mathrm{exp}}\right)^2} \tag{7}$$

$$s = \sqrt{\frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}}\right)^2}{df_E}} \tag{8}$$

where $df_M$ and $df_E$ refer to the degrees of freedom of the model and of the error, respectively.
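The statistics in Eqs. (3)–(8) can be computed directly from paired experimental and calculated values. The sketch below is illustrative only: it assumes df_M = p and df_E = n − p − 1 for a model with p descriptors and an intercept (the paper does not give the numerical values of the degrees of freedom), the function name `regression_stats` is hypothetical, and the four data points are made up for the example.

```python
import numpy as np

def regression_stats(y_exp, y_calc, p):
    """Evaluate Eqs. (3)-(8) for experimental and calculated responses."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_calc = np.asarray(y_calc, dtype=float)
    n = y_exp.size
    df_m, df_e = p, n - p - 1            # assumed degrees of freedom
    y_mean = y_exp.mean()
    sse = np.sum((y_exp - y_calc) ** 2)  # residual sum of squares
    return {
        "R2": 1.0 - sse / np.sum((y_exp - y_mean) ** 2),            # Eq. (3)
        "RMSE": np.sqrt(sse / n),                                   # Eq. (4)
        "F": (np.sum((y_mean - y_calc) ** 2) / df_m) / (sse / df_e),  # Eq. (5)
        "k": np.sum(y_exp * y_calc) / np.sum(y_calc ** 2),          # Eq. (6)
        "k_prime": np.sum(y_exp * y_calc) / np.sum(y_exp ** 2),     # Eq. (7)
        "s": np.sqrt(sse / df_e),                                   # Eq. (8)
    }

# Made-up example values (not from the paper), one descriptor
stats = regression_stats([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8], p=1)
```

For this toy input the residual sum of squares is 0.10 and the total sum of squares is 5.0, so R² = 0.98, and both slopes k and k′ are close to unity, as expected for a well-behaved model.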

In addition, the following criteria described by Golbraikh and Tropsha were applied to check the predictability of the QSPR model (Golbraikh and Tropsha, 2002):

1. $$Q^2 > 0.5 \tag{9}$$

2. $$R^2 > 0.6 \tag{10}$$

3. $$\frac{R^2 - R_0^2}{R^2} < 0.1 \quad \text{and} \quad 0.85 \le k \le 1.15 \tag{11}$$

where R0² is the coefficient of determination characterizing linear regression with the Y-intercept set at zero. The predicted results for all molecules in the training and test sets, together with the statistical parameters, are given in Table 5.
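Criteria (9)–(11) amount to a simple pass/fail check, sketched below. The function name is hypothetical. In the usage line, Q² and R² are the Eq. (2) values from Table 2 and k is the overall value from Table 5, whereas the R0² value of 0.94 is a placeholder chosen for illustration, since the paper does not report it.

```python
def passes_golbraikh_tropsha(q2, r2, r0_2, k):
    """Return True when criteria (9), (10) and (11) are all satisfied."""
    return (
        q2 > 0.5                      # criterion (9)
        and r2 > 0.6                  # criterion (10)
        and (r2 - r0_2) / r2 < 0.1    # criterion (11), first part
        and 0.85 <= k <= 1.15         # criterion (11), second part
    )

# r0_2 = 0.94 is a hypothetical placeholder, not a reported value
ok = passes_golbraikh_tropsha(q2=0.945, r2=0.950, r0_2=0.94, k=0.999)
```

A model that fails any one of the conditions, for instance one with Q² below 0.5 or a slope k outside the 0.85–1.15 window, is rejected by this check.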

Table 4
Correlation matrix of the three descriptors used in Eq. (2).

Descriptor   nH      nRNH2   nRNHR
nH           1.000   0.344   0.642
nRNH2        0.344   1.000   0.005
nRNHR        0.642   0.005   1.000

Fig. 1. The principal component analysis of the molecules in the training and test sets. Some points belong to more than one molecule.

M Momeni S Riahi Journal of Natural Gas Science and Engineering 21 (2014) 442e450446

7182019 Prediction of Amines Capacity for Carbon Dioxide Absorption in Gas

httpslidepdfcomreaderfullprediction-of-amines-capacity-for-carbon-dioxide-absorption-in-gas 69

In Table6 Y-scrambling test was applied in order to examine the

robustness of the model (Tropsha et al 2003) In Y-scrambling test

the dependent variable (Absorption Capacity) is randomly dedi-

cated to different amines and new QPSR modeling is performed

based on the previous matrix of independent variables It is ex-

pectedthat newly developedQSPR models shouldhave lowenough

R2 and Q 2 values If it happens differently the reported model is not

accurate for the particular data set and method of modeling

The applicability domain of the model was studied by Williams

plot in Fig 2 (OECD 2007) In Williams plot the standardized re-

siduals (R) versus the leverage (hat diagonal) values (h) were

plotted Leverage demonstrates the distance of a compound from

the centroid of the X where X is the descriptor matrix The leverage

of a compound is calculated by the following equation (Netzeva

et al 2005)

hi frac14 xT i

X T X

1 xi (12)

where xi is the descriptor vector of the relevant compound The

warning leverage (h) is de1047297ned as (Eriksson et al 2003)

h frac143eth p thorn 1THORN

n (13)

n is the number of training objects and p is the number of de-

scriptors in the model Williams plot is used to identify both the

response outlier and the structurally in1047298uential chemicals in the

model A compound with hi gt h in1047298uence the regression line but it

does not consider as an outlier as its corresponding standardized

residual might be small In this data set the warning value of leverage is around 067 Furthermore compounds with standard-

ized residual rather than three standard residual unit (gt3s) is

considered as an outlier compound It is common in the literature

to use 3 as an accepted cut-off value for evaluating prediction re-

sults of the model

Fig 2 demonstrates that there is no chemical with leverage

higher than the warning h value of 067 It also shows that there is

no outlier in training or test sets and all compounds lie between the

two horizontal lines

The experimental absorption capacity (rich loading) values of

amines are plotted in Fig 3 against corresponding calculated values

for QSPR model

Furthermore mean effect (MF) is another term that helps to

interpret the result and shows the effect of each descriptor

individually or relative to other descriptors Fig 4 shows the stan-

dardized coef 1047297cients (also called beta coef 1047297cients) (XLSTAT 2013)

The 1047297gure is used to compare the relative weights of the de-

scriptors The higher the standardized coef 1047297cients value of a

descriptor the more important the weight of the corresponding

variable in the model This 1047297gure demonstrates the mean effect of

the descriptors in the model

By observing this 1047297gure it can be concluded that the number of

primary amines (nRNH2) and secondary amines (nRNHR) de-

scriptors play the main role in the amines capacity for carbon di-

oxide absorption respectively and the number of hydrogen (nH) has

the least effect This 1047297gure shows all descriptors in the model have

positive effects and the amines capacity for CO2 absorption is

directly related to each of these descriptors

At the last part of this section it should be noticed that the

present work focuses exclusively on developing a simple model by

which amines capacities for carbon dioxide absorption can be

predicted In fact the predominant difference between this study

and the previous ones is that this work concentrates 1047297rstly on

quantitative and then on qualitative representation of structural

effects on the capacity of amines for CO2 absorption

5 Discussion

Although high statistical parameters are signi1047297cant in demon-

strating the capability of the model QSPR should provide powerful

insight for the mechanism of carbon dioxide solubility in amine

based solvent For this reason an acceptable interpretation of de-

scriptors in the QSPR model should be provided It is better to di-

agnose which parameters affect the amines capacity and which

descriptors could appear in the model due to principal chemical

reactions between carbon dioxide and an amine-based solvent

The overall reaction mechanism for chemical absorption of CO2

in amine solvent systems is still under debate A mechanism for this

reaction which supports the formation of zwitterion intermediate

theory and by proton-remover base B through reactions (1) and (2)

below suggested by Caplow (1968)

CO2 thorn R 1R 2NH4 R 1R 2NHthorn COO (1)

R 1R 2NHthorn COO thorn B4 R 1R 2COO thorn BHthorn (2)

R 1 and R 2 demonstrate substituted group attached to amine

group B is a base molecule which can be a water molecule The

intermediate in the reaction is zwitterion But more recent studies

showed zwitterion seemed to be short-lived and may be an entirely

transient state (da Silva and Svendsen 2004) It led to the

assumption of the single-step mechanism of these reactions (re-

action (3)) A termolecular single step mechanism suggested by

Crooks and Donnellan (1989)

B thorn CO2 thorn R 1R 2NH4 R 1R 2NCOO thorn BHthorn (3)

where B is again the base molecule In this mechanism NH group is

attacked by base molecule and deprotonation of amine occurs The

bonding between amine and carbon dioxide also takes place

simultaneously

Table 6

R2 train values after several Y-scrambling tests

Iteration R2 train

1 0060

2 0074

3 0119

4 0027

5 0188

6 0102

7 0119

8 0209

9 0096

10 0039

Table 5

Validation parameters and statistical result of GA-MLR model

n R2 R2adj RMSE F k k0 s

Train 18 0942 0930 0127 7650 1004 0984 0144

Test 5 0976 0904 0060 1731 0962 1035 0135

Overall 23 0950 0942 0116 12379 0999 0990 0128

M Momeni S Riahi Journal of Natural Gas Science and Engineering 21 (2014) 442e450 447

7182019 Prediction of Amines Capacity for Carbon Dioxide Absorption in Gas

httpslidepdfcomreaderfullprediction-of-amines-capacity-for-carbon-dioxide-absorption-in-gas 79

As can be seen, the reaction between CO2 and an amine-based solvent takes place because of the existence of the N–H bond. The NH group is therefore the active site of the amine molecule, where the base molecule (water) takes part in the termolecular reaction. Consequently, the number of N–H bonds, or in other words the number of primary and secondary amine groups in the amine molecule, plays an important role in the capacity of amines for CO2 absorption. The numbers of primary (nRNH2) and secondary (nRNHR) amines are the two main descriptors appearing in the model. According to Fig. 4, these two descriptors have a positive effect and a higher mean effect than the third descriptor. These results demonstrate that the chemical reaction mechanism is consistent with the proposed model.

The model also contains the number of hydrogen atoms (nH) as another descriptor. Fig. 4 shows that the nH descriptor has a positive effect, which is considerably smaller than that of the two other descriptors. The presence of the nH descriptor in the model can be explained by the experimental work of Singh et al., who showed that an increase in the chain length between the amine and other functional groups in the amine structure results in an increase in amine capacity for CO2 absorption (Singh et al., 2007). Increasing the chain length increases the number of hydrogen atoms, so a positive effect of nH is consistent with that experimental work.

Finally, it should be noted that the model is attractively simple and that its results are quite acceptable for predicting the capacity of amines for carbon dioxide absorption. Although the accuracy of the model is good for linear amine compounds, it is lower for unsaturated cyclic amines. This can be explained by two main reasons. First, the three descriptors in the model are not sensitive to ring-type functional groups and only count the number of hydrogen atoms and of primary and secondary amines. Second, unsaturated cyclic amines show poor absorption rate and capacity, and they are not potential absorbents for CO2 absorption (2). Therefore, from the industrial point of view, it is preferable to use linear amines for CO2 absorption, and it is more important for the model to predict CO2 absorption capacity for linear amines than for unsaturated cyclic amines.

Fig. 2. Williams plot of GA-MLR model development.

Fig. 3. Experimental vs. predicted rich loading values (mol CO2/mol amine); the line is the regression line.

Fortunately, the results of the first equation (Eq. (1)) for predicting the capacity of amines for CO2 absorption are largely acceptable for both linear and aromatic ring-type amines (items labeled 5, 6 and 21). This is because of the presence of an RDF descriptor in this model. RDF descriptors are based on the distance distribution in the geometrical representation of a molecule. This function is independent of the number of atoms and is invariant against translation and rotation of the entire molecule. The RDF code provides valuable information, e.g., about bond distances, ring types, planar and non-planar systems, and atom types, so it is sensitive to aromatic rings (Todeschini and Consonni, 2008).

6. Conclusions

One of the main concerns of the natural gas industry is to have a robust and accurate model which can predict the chemical behavior of amines in gas treatment processes. This study attempted to identify the effects of the chemical structure of amines on their capacity for carbon dioxide absorption and to develop a model for this purpose which is not only robust and accurate, but also simple and applicable. Therefore, the QSPR approach was chosen as the modeling technique, and the model was developed with a linear method for its simplicity. As a result, two linear equations were developed. The first model demonstrates high prediction power, while the second one is notably simpler and readily interpretable in terms of the chemistry of the amine reaction with carbon dioxide. Consequently, the second equation is introduced as the preferred model of this study. The most important descriptors appearing in the model, in order of the weight of the corresponding variable, are the number of primary aliphatic amines (nRNH2), the number of secondary aliphatic amines (nRNHR), and the number of hydrogen atoms (nH). The accuracy and predictive performance of the model, validated with various statistical tests and examined with a test set of five molecules, permit using this model to estimate the rich loading of other amines under specific conditions. According to the results, it could be argued that a good amine solvent for carbon dioxide absorption should have a linear structure with a high number of primary and secondary amine groups as side chains. In other words, increasing the number of primary and secondary amine groups increases the number of N–H bonds, the active sites at which the amine reaction with CO2 takes place.

The promising results of this study might aid other researchers in the fields of chemistry and natural gas engineering to design and synthesize new potential amine-based solvents and to investigate the feasibility of using them in gas removal processes. New improved solvents should also be compared to more conventional ones from the corrosivity, energy efficiency, and operability points of view.

Acknowledgment

The authors would like to gratefully acknowledge the support from the Institute of Petroleum Engineering (IPE), University of Tehran.

List of symbols

CO2         carbon dioxide
QSPR/QSAR   quantitative structure property/activity relationship
DFT         Density Functional Theory
MLR         Multiple Linear Regression
GA          Genetic Algorithms
PCR         principal component regression
PLS         partial least squares
HOMO        Highest Occupied Molecular Orbital
LUMO        Lowest Unoccupied Molecular Orbital
AC          absorption capacity
PCA         principal component analysis
LOO-CV      Leave-one-out cross-validation
RMSE        root mean square error
df_M        degrees of freedom of the model
df_E        degrees of freedom of the error

Fig. 4. Mean effects of model descriptors (standardized coefficient values).

References

Beheshti, Abolghasem, Riahi, Siavash, Ganjali, Mohammad Reza, 2009. Quantitative structure–property relationship study on first reduction and oxidation potentials of donor-substituted phenylquinolinylethynes and phenylisoquinolinylethynes: quantum chemical investigation. Electrochim. Acta 54 (23), 5368–5375.

Beheshti, A., Norouzi, P., Ganjali, M.R., 2012. A simple and robust model for predicting the reduction potential of quinones family; electrophilicity index effect. Int. J. Electrochem. Sci. 7, 4811–4821.

Bohloul, M.R., Vatani, A., Peyghambarzadeh, S.M., 2014. Experimental and theoretical study of CO2 solubility in N-methyl-2-pyrrolidone (NMP). Fluid Phase Equilibr. 365, 106–111.

Caplow, Michael, 1968. Kinetics of carbamate formation and breakdown. J. Am. Chem. Soc. 90 (24), 6795–6803.

Chakraborty, A.K., et al., 1988. Molecular orbital approach to substituent effects in amine-CO2 interactions. J. Am. Chem. Soc. 110 (21), 6947–6954.

Chakraborty, A.K., Astarita, G., Bischoff, K.B., 1986. CO2 absorption in aqueous solutions of hindered amines. Chem. Eng. Sci. 41 (4), 997–1003.

Cramer, Christopher J., 2005. Essentials of Computational Chemistry: Theories and Models. Wiley.com.

Crooks, John E., Donnellan, J. Paul, 1989. Kinetics and mechanism of the reaction between carbon dioxide and amines in aqueous solution. J. Chem. Soc. Perkin Trans. 2 (4), 331–333.

da Silva, Eirik F., Svendsen, Hallvard F., 2004. Ab initio study of the reaction of carbamate formation from CO2 and alkanolamines. Indust. Eng. Chem. Res. 43 (13), 3413–3418.

Depczynski, Uwe, Frost, V.J., Molt, K., 2000. Genetic algorithms applied to the selection of factors in principal component regression. Anal. Chim. Acta 420 (2), 217–227.

Eriksson, Lennart, Jaworska, Joanna, Worth, Andrew P., Cronin, Mark T.D., McDowell, Robert M., Gramatica, Paola, 2003. Methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based QSARs. Environ. Health Perspect. 111 (10), 1361.

Fini, Mojtaba Fallah, Riahi, Siavash, Bahramian, Alireza, 2012. Experimental and QSPR studies on the effect of ionic surfactants on n-Decane–water interfacial tension. J. Surfact. Deterg. 15 (4), 477–484.

Freire, Mara G., et al., 2010. Solubility of non-aromatic ionic liquids in water and correlation using a QSPR approach. Fluid Phase Equilibr. 294 (1), 234–240.

Frisch, Michael J., Nielsen, Alice B., Frisch, Aeleen (Eds.), 1998. Gaussian 98. Gaussian, Incorporated.

Godavarthy, Srinivasa S., Robinson Jr., Robert L., Gasem, Khaled A.M., 2006. SVRC–QSPR model for predicting saturated vapor pressures of pure fluids. Fluid Phase Equilibr. 246 (1), 39–51.

Golbraikh, Alexander, Tropsha, Alexander, 2002. Beware of q2! J. Mol. Graph. Model. 20 (4), 269–276.

Hu, Rongjing, et al., 2009. QSAR models for 2-amino-6-arylsulfonylbenzonitriles and congeners HIV-1 reverse transcriptase inhibitors based on linear and nonlinear regression methods. Eur. J. Med. Chem. 44 (5), 2158–2171.

Jouan-Rimbaud, Delphine, et al., 1995. Genetic algorithms as a tool for wavelength selection in multivariate calibration. Anal. Chem. 67 (23), 4295–4301.

Katritzky, Alan R., et al., 2000. QSPR correlation and predictions of GC retention indexes for methyl-branched hydrocarbons produced by insects. Anal. Chem. 72 (1), 101–109.

Kohl, Arthur L., Nielsen, Richard, 1997. Gas Purification (access online via Elsevier).

Lee, Anita S., Kitchin, John R., 2012. Chemical and molecular descriptors for the reactivity of amines with CO2. Indust. Eng. Chem. Res. 51 (42), 13609–13618.

Liang, Guijie, Xu, Jie, Liu, Li, 2013. QSPR analysis for melting point of fatty acids using genetic algorithm based multiple linear regression (GA-MLR). Fluid Phase Equilibr. 353, 15–21.

Marengo, Emilio, et al., 1992. Comparative study of different structural descriptors and variable selection approaches using partial least squares in quantitative structure-activity relationships. Chemometr. Intell. Lab. Syst. 14 (1), 225–233.

Mokhatab, Saeid, Poe, William A., 2012. Handbook of Natural Gas Transmission and Processing (access online via Elsevier).

Netzeva, Tatiana I., Worth, Andrew P., Aldenberg, Tom, Benigni, Romualdo, Cronin, Mark T.D., Gramatica, Paola, Jaworska, Joanna S., et al., 2005. Current status of methods for defining the applicability domain of (quantitative) structure–activity relationships. ATLA 33, 155–173.

OECD, 2007. Guidance Document on the Validation of (Quantitative) Structure-Activity Relationships [(Q)SAR] Models. Organisation for Economic Co-Operation and Development, Paris.

Pourbasheer, Eslam, et al., 2011. Prediction of solubility of fullerene C60 in various organic solvents by genetic algorithm-multiple linear regression. Fullerenes Nanotubes Carbon Nanostruct. 19 (7), 585–598.

Riahi, Siavash, Ganjali, Mohammad Reza, Norouzi, Parviz, Jafari, Fatemeh, 2008. Application of GA-MLR, GA-PLS and the DFT quantum mechanical (QM) calculations for the prediction of the selectivity coefficients of a histamine-selective electrode. Sens. Actuat. B Chem. 132 (1), 13–19.

Riahi, Siavash, Pourbasheer, Eslam, Ganjali, Mohammad Reza, Norouzi, Parviz, 2009. Investigation of different linear and nonlinear chemometric methods for modeling of retention index of essential oil components: concerns to support vector machine. J. Hazard. Mater. 166 (2), 853–859.

Riahi, Siavash, Beheshti, Abolghasem, Ganjali, Mohammad Reza, Norouzi, Parviz, 2008. A novel QSPR study of normalized migration time for drugs in capillary electrophoresis by new descriptors: quantum chemical investigation. Electrophoresis 29 (19), 4027–4035.

Riahi, Siavash, et al., 2008. QSAR study of 2-(1-Propylpiperidin-4-yl)-1H-Benzimidazole-4-Carboxamide as PARP inhibitors for treatment of cancer. Chem. Biol. Drug Design 72 (6), 575–584.

Sartori, Guido, Savage, David W., 1983. Sterically hindered amines for carbon dioxide removal from gases. Indust. Eng. Chem. Fundam. 22 (2), 239–249.

Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2007. Structure and activity relationships for amine based CO2 absorbents-I. Int. J. Greenhouse Gas Control 1 (1), 5–10.

Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2009. Structure and activity relationships for amine-based CO2 absorbents-II. Chem. Eng. Res. Design 87 (2), 135–144.

Todeschini, R., Consonni, V., Mauri, A., Pavan, M., 2002. DRAGON-Software for the calculation of molecular descriptors, version 2.1.

Todeschini, Roberto, Consonni, Viviana, 2008. Handbook of Molecular Descriptors. John Wiley & Sons.

Tropsha, Alexander, Gramatica, Paola, Gombar, Vijay K., 2003. The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb. Sci. 22 (1), 69–77.

XLSTAT, 2013. Software, XLSTAT-CCR module, Trial version.


the model. One of the common techniques in the QSPR approach for constructing training and test sets under the constraint of structural diversity is principal component analysis (PCA) (Hu et al., 2009; Riahi et al., 2008). In the current work, PCA was employed to divide the data set of molecules into training and test sets. For this purpose, PC1 and PC2 were calculated based on the descriptors in the model. The result showed that these two principal components accounted for 57.5% and 33.5% of the variation in the data, respectively, and played the main roles. Fig. 1 shows the distribution of the data over PC1 and PC2; from this figure it can be concluded that the compounds in the training and test sets were representative of the whole data set.

The training set was used to build the model, while the test set was used to validate its predictive power. During the model development procedure, the leave-one-out cross-validation (LOO-CV) method was applied to assess the performance of the different resulting models. Q²(LOO) was calculated for each obtained equation, and the best model was then selected based on a high value of this parameter. Several statistical tests and parameters need to be considered. The coefficient of determination (R²), the adjusted R², the coefficient of leave-one-out cross-validation (Q²), the slopes of the regression lines forced through zero (k, k′), the root mean square error (RMSE), and the standard error of the estimate (s) are the most important ones. The first five parameters should be close to unity, while RMSE and s should be low, close to zero. Furthermore, the intercepts of the model should be close to zero. Moreover, the Fisher function (F) is another vital statistical test: high values of the F-ratio indicate reliable models. The formulas of all statistical parameters used in this paper are given below.

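\[ R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}} \right)^2}{\sum_{i=1}^{n} \left( y_i^{\mathrm{exp}} - \bar{y}^{\mathrm{exp}} \right)^2} \qquad (3) \]

\[ \mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left( y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}} \right)^2}{n}} \qquad (4) \]

\[ F = \frac{\sum_{i=1}^{n} \left( \bar{y}^{\mathrm{exp}} - y_i^{\mathrm{calc}} \right)^2 / df_M}{\sum_{i=1}^{n} \left( y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}} \right)^2 / df_E} \qquad (5) \]

\[ k = \frac{\sum y_i^{\mathrm{exp}} \, y_i^{\mathrm{calc}}}{\sum \left( y_i^{\mathrm{calc}} \right)^2} \qquad (6) \]

\[ k' = \frac{\sum y_i^{\mathrm{exp}} \, y_i^{\mathrm{calc}}}{\sum \left( y_i^{\mathrm{exp}} \right)^2} \qquad (7) \]

\[ s = \sqrt{\frac{\sum_{i=1}^{n} \left( y_i^{\mathrm{exp}} - y_i^{\mathrm{calc}} \right)^2}{df_E}} \qquad (8) \]

where \bar{y}^{\mathrm{exp}} is the mean of the experimental values, and df_M and df_E refer to the degrees of freedom of the model and of the error, respectively.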
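The statistics in Eqs. (3)–(8) can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function name, the choice df_M = p and df_E = n − p − 1, and the sample values are all assumptions made here for demonstration.

```python
import math

def regression_stats(y_exp, y_calc, p):
    """Return R2, RMSE, F, k, k' and s for a model with p descriptors (Eqs. (3)-(8))."""
    n = len(y_exp)
    mean_exp = sum(y_exp) / n
    sse = sum((e - c) ** 2 for e, c in zip(y_exp, y_calc))   # residual sum of squares
    sst = sum((e - mean_exp) ** 2 for e in y_exp)            # total sum of squares
    ssm = sum((mean_exp - c) ** 2 for c in y_calc)           # model sum of squares
    df_m, df_e = p, n - p - 1                                # assumed degrees of freedom
    cross = sum(e * c for e, c in zip(y_exp, y_calc))
    return {
        "R2": 1.0 - sse / sst,
        "RMSE": math.sqrt(sse / n),
        "F": (ssm / df_m) / (sse / df_e),
        "k": cross / sum(c * c for c in y_calc),
        "k_prime": cross / sum(e * e for e in y_exp),
        "s": math.sqrt(sse / df_e),
    }

# Hypothetical experimental and calculated values, for illustration only
y_exp = [1.0, 2.0, 3.0, 4.0, 5.0]
y_calc = [1.1, 1.9, 3.0, 4.1, 4.9]
stats = regression_stats(y_exp, y_calc, p=1)
```

Because s divides the residual sum of squares by df_E rather than n, it is always at least as large as RMSE for the same fit.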

Also, the following criteria described by Golbraikh and Tropsha were applied to check the predictability of the QSPR model (Golbraikh and Tropsha, 2002):

1. \[ Q^2 > 0.5 \qquad (9) \]

2. \[ R^2 > 0.6 \qquad (10) \]

3. \[ \frac{R^2 - R_0^2}{R^2} < 0.1 \quad \text{and} \quad 0.85 \le k \le 1.15 \qquad (11) \]

where R_0^2 is the coefficient of determination characterizing linear regression with the Y-intercept set at zero. The predicted results for all molecules, in both the training and test sets, are given with their statistical parameters in Table 5.
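Criteria (9)–(11) amount to a simple pass/fail check, sketched below. The function name is illustrative, and the Q² and R_0² inputs in the usage lines are hypothetical placeholders, since those two values are not quoted in this section; only R² and k are taken from the training-set row of Table 5.

```python
def passes_golbraikh_tropsha(q2, r2, r0_2, k):
    """Check criteria (9)-(11): Q2 > 0.5, R2 > 0.6,
    (R2 - R0^2)/R2 < 0.1 and 0.85 <= k <= 1.15."""
    return (
        q2 > 0.5
        and r2 > 0.6
        and (r2 - r0_2) / r2 < 0.1
        and 0.85 <= k <= 1.15
    )

# r2 and k: training-set values from Table 5; q2 and r0_2: hypothetical placeholders
ok = passes_golbraikh_tropsha(q2=0.90, r2=0.942, r0_2=0.940, k=1.004)
```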

Table 4
Correlation matrix of the three descriptors used in Eq. (2).

Descriptor   nH      nRNH2   nRNHR
nH           1.000   0.344   0.642
nRNH2        0.344   1.000   0.005
nRNHR        0.642   0.005   1.000

Fig. 1. The principal component analysis of the molecules in training and test sets. Some points belong to more than one molecule.


The Y-scrambling test, reported in Table 6, was applied in order to examine the robustness of the model (Tropsha et al., 2003). In the Y-scrambling test, the dependent variable (absorption capacity) is randomly reassigned to different amines, and new QSPR modeling is performed based on the previous matrix of independent variables. The newly developed QSPR models are expected to have low R² and Q² values. If this is not the case, the reported model is not reliable for the particular data set and method of modeling.
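The Y-scrambling procedure described above can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a single hypothetical descriptor and a closed-form least-squares R² instead of the three-descriptor GA-MLR model, and the function names, data and seed are invented for the example. Any routine that refits the model and returns R² could be passed as `fit_r2`.

```python
import random

def r2_single_descriptor(x, y):
    """R^2 of a least-squares line y = a + b*x (squared Pearson correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def y_scrambling(x, y, fit_r2, n_iter=10, seed=0):
    """Refit the model on randomly permuted responses; return the R^2 of each refit."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_iter):
        y_perm = list(y)
        rng.shuffle(y_perm)          # break the link between descriptors and response
        results.append(fit_r2(x, y_perm))
    return results

# Hypothetical data: one descriptor with a nearly linear response
x = [float(i) for i in range(20)]
y = [0.1 * xi + 0.05 * ((-1) ** i) for i, xi in enumerate(x)]
r2_true = r2_single_descriptor(x, y)
r2_scrambled = y_scrambling(x, y, r2_single_descriptor, n_iter=10, seed=0)
```

A robust model shows scrambled R² values well below the R² of the original fit, which is the pattern reported in Table 6.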

The applicability domain of the model was studied with the Williams plot in Fig. 2 (OECD, 2007). In the Williams plot, the standardized residuals (R) are plotted versus the leverage (hat diagonal) values (h). Leverage indicates the distance of a compound from the centroid of X, where X is the descriptor matrix. The leverage of a compound is calculated by the following equation (Netzeva et al., 2005):

\[ h_i = x_i^{T} \left( X^{T} X \right)^{-1} x_i \qquad (12) \]

where x_i is the descriptor vector of the relevant compound. The warning leverage (h*) is defined as (Eriksson et al., 2003):

\[ h^{*} = \frac{3(p + 1)}{n} \qquad (13) \]

where n is the number of training objects and p is the number of descriptors in the model. The Williams plot is used to identify both the response outliers and the structurally influential chemicals in the model. A compound with h_i > h* influences the regression line, but it is not necessarily considered an outlier, as its corresponding standardized residual might be small. In this data set, the warning value of leverage is around 0.67. Furthermore, a compound with a standardized residual greater than three standard deviation units (>3s) is considered an outlier. It is common in the literature to use 3 as the accepted cut-off value for evaluating the prediction results of a model.
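Eqs. (12) and (13) can be evaluated without any numerical library, as in the sketch below. It is an illustration under assumptions: the helper names are invented, the small design matrix is hypothetical, and X is taken to include a column of ones for the intercept, which the text does not state explicitly.

```python
def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z

def leverages(X):
    """h_i = x_i^T (X^T X)^{-1} x_i for every row x_i of X (Eq. (12))."""
    cols = len(X[0])
    xtx = [[sum(row[j] * row[k] for row in X) for k in range(cols)] for j in range(cols)]
    return [sum(xi * zi for xi, zi in zip(row, solve(xtx, row))) for row in X]

def warning_leverage(p, n):
    """h* = 3(p + 1)/n (Eq. (13))."""
    return 3.0 * (p + 1) / n

# Hypothetical design matrix: a column of ones (intercept) plus one descriptor
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
h = leverages(X)
h_star = warning_leverage(p=3, n=18)   # three descriptors, 18 training compounds
```

With p = 3 and n = 18, Eq. (13) gives h* = 12/18 ≈ 0.67, the warning value quoted in the text.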

Fig. 2 demonstrates that there is no chemical with a leverage higher than the warning h* value of 0.67. It also shows that there is no outlier in the training or test sets, as all compounds lie between the two horizontal lines.

The experimental absorption capacity (rich loading) values of the amines are plotted in Fig. 3 against the corresponding values calculated by the QSPR model.

Furthermore, the mean effect (MF) is another term that helps to interpret the results, showing the effect of each descriptor individually or relative to the other descriptors. Fig. 4 shows the standardized coefficients (also called beta coefficients) (XLSTAT, 2013). The figure is used to compare the relative weights of the descriptors: the higher the standardized coefficient of a descriptor, the more important the weight of the corresponding variable in the model. This figure thus demonstrates the mean effect of the descriptors in the model.
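Standardized (beta) coefficients are conventionally obtained by rescaling each regression coefficient with the ratio of the descriptor's standard deviation to that of the response. The sketch below uses that textbook definition; whether XLSTAT applies exactly the same scaling is an assumption here, and the descriptor columns and coefficients are hypothetical, not the values of Eq. (2).

```python
import statistics

def standardized_coefficients(x_cols, y, coefs):
    """beta_j = b_j * sd(x_j) / sd(y) for each descriptor column x_j."""
    sd_y = statistics.stdev(y)
    return [b * statistics.stdev(col) / sd_y for b, col in zip(coefs, x_cols)]

# Hypothetical descriptor columns and unstandardized MLR coefficients
n_h = [7.0, 9.0, 11.0, 13.0]
n_rnh2 = [1.0, 1.0, 2.0, 2.0]
y = [0.5, 0.6, 1.1, 1.2]
betas = standardized_coefficients([n_h, n_rnh2], y, coefs=[0.05, 0.5])
```

Because the betas are unitless, descriptors measured on different scales, such as atom counts and group counts, can be compared directly.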

From this figure it can be concluded that the number of primary amines (nRNH2) and the number of secondary amines (nRNHR) play the main roles in the capacity of amines for carbon dioxide absorption, in that order, and that the number of hydrogen atoms (nH) has the least effect. The figure also shows that all descriptors in the model have positive effects, so the capacity of amines for CO2 absorption increases with each of these descriptors.

Finally, it should be noted that the present work focuses exclusively on developing a simple model by which the capacities of amines for carbon dioxide absorption can be predicted. In fact, the predominant difference between this study and previous ones is that this work concentrates first on the quantitative and then on the qualitative representation of structural effects on the capacity of amines for CO2 absorption.

5. Discussion

Although high statistical parameters are significant in demonstrating the capability of the model, a QSPR study should also provide insight into the mechanism of carbon dioxide solubility in amine-based solvents. For this reason, an acceptable interpretation of the descriptors in the QSPR model should be provided. It is useful to identify which parameters affect the capacity of amines and which descriptors could appear in the model due to the principal chemical reactions between carbon dioxide and an amine-based solvent.

The overall reaction mechanism for chemical absorption of CO2

in amine solvent systems is still under debate A mechanism for this

reaction which supports the formation of zwitterion intermediate

theory and by proton-remover base B through reactions (1) and (2)

below suggested by Caplow (1968)

CO2 thorn R 1R 2NH4 R 1R 2NHthorn COO (1)

R 1R 2NHthorn COO thorn B4 R 1R 2COO thorn BHthorn (2)

R 1 and R 2 demonstrate substituted group attached to amine

group B is a base molecule which can be a water molecule The

intermediate in the reaction is zwitterion But more recent studies

showed zwitterion seemed to be short-lived and may be an entirely

transient state (da Silva and Svendsen 2004) It led to the

assumption of the single-step mechanism of these reactions (re-

action (3)) A termolecular single step mechanism suggested by

Crooks and Donnellan (1989)

B thorn CO2 thorn R 1R 2NH4 R 1R 2NCOO thorn BHthorn (3)

where B is again the base molecule In this mechanism NH group is

attacked by base molecule and deprotonation of amine occurs The

bonding between amine and carbon dioxide also takes place

simultaneously

Table 6

R2 train values after several Y-scrambling tests

Iteration R2 train

1 0060

2 0074

3 0119

4 0027

5 0188

6 0102

7 0119

8 0209

9 0096

10 0039

Table 5

Validation parameters and statistical result of GA-MLR model

n R2 R2adj RMSE F k k0 s

Train 18 0942 0930 0127 7650 1004 0984 0144

Test 5 0976 0904 0060 1731 0962 1035 0135

Overall 23 0950 0942 0116 12379 0999 0990 0128

M Momeni S Riahi Journal of Natural Gas Science and Engineering 21 (2014) 442e450 447

7182019 Prediction of Amines Capacity for Carbon Dioxide Absorption in Gas

httpslidepdfcomreaderfullprediction-of-amines-capacity-for-carbon-dioxide-absorption-in-gas 79

As can be noticed the reaction between CO2 and amine based

solvent takes place because of the existence of NH bond So NH

group is an active site of the amine molecule where base molecule

(water) undergoes a chemical termolecular reaction Consequently

the amount of NH bonds or in other words the number of primary

and secondary amine groups in the amine molecule plays an

important role in the capacity of amines for CO2 absorption

Number of primary (nRNH2) and secondary (nRNHR) amines is two

main descriptors appearing in the model According to Fig 4 these

two descriptors have a positive effect and a higher mean effect All

these results demonstrate that the chemical reaction mechanism

coordinates with the proposed model

The model also contains number of Hydrogen atoms (nH) as

another descriptor Fig 4 shows nH descriptor has a positive effect

which is considerably less than two other descriptors The reason of

nH descriptor presence in the model can be explained by the result

of experimental work performed by Singh et al They showed that

an increase in the chain length between amines and other func-

tional groups in the amine structure result in an increase in amine

capacity for CO2 absorption (Singh et al 2007) Increasing with

chain length results in increasing numbers of hydrogen atoms so

apparently it seems it should have a positive effect due to the

experimental work

At last it should be noted that the simplicity of the model is

interesting and the results are quite acceptable for predicting

amines capacity for carbon dioxide absorption Although the ac-

curacy of the model is good for linear amine compounds it is not

better for unsaturated cyclic amines This can be explained by two

Fig 2 Williams plot of GA-MLR model development

Fig 3 Experimental vs predicted rich loading values (mol CO 2mol amine) e

regression line

M Momeni S Riahi Journal of Natural Gas Science and Engineering 21 (2014) 442e450448

7182019 Prediction of Amines Capacity for Carbon Dioxide Absorption in Gas

httpslidepdfcomreaderfullprediction-of-amines-capacity-for-carbon-dioxide-absorption-in-gas 89

main reasons First the three descriptors in model are not sensitive

to ring type functional group and just count the number of

hydrogen atoms primary and secondary amines Second unsatu-

rated cyclic amines show poor absorption rate and capacity and

they are not potential absorbents for CO2 absorption (2) Therefore

according to the industrial point of view it is preferable to use

linear amine for CO2 absorption and it is more important for the

model to predict CO2 absorption capacity for linear amines rather

than unsaturated cyclic amines

Fortunately the results of the 1047297rst equation (Eq (1)) for pre-

dicting amines capacity of CO2 absorption are largely accepted

either for linear or aromatic ring type amines (items labeled 5 6

and 21) This is because of the presence of RDF descriptor in this

model RDFdescriptorsare based on the distance distribution in the

geometrical representation of a molecule This function is inde-

pendent of the number of atoms and is invariant against translation

and rotation of the entire molecule The RDFcode provides valuable

information eg about bond distances ring types planar and non-

planar systems and atom types so it is sensitive to aromatic rings

(Todeschini and Consonni 2008)

6 Conclusions

One of the main concerns of the natural gas industry is to have a

robust and accuratemodel which canpredict the chemical behavior

of amines for gas treatment process This study is attempted toidentify the effects chemical structure of amine on their capacity for

carbon dioxide absorption and develop a model for this purpose

which is not only robust and accurate but also simple and appli-

cable Therefore QSPR approach has been chosen as a modeling

technique and model has been developed based on linear method

for its simplicity As a result two linear equations were developed

First model demonstrate high prediction powerwhile second one is

notably simpler and powerfully interpretable due to the chemistry

of amines reaction with carbon dioxide Consequently second

equation introduced as a preferred model of this study The most

important descriptors appearing in the model due to the weight of

the corresponding variable are number of primary aliphatic amines

(nRNH2) number of secondary aliphatic amines (nRNHR) and

number of hydrogen atoms (nH) respectively The accuracy and

predictive performance of the model validated with various sta-

tistical tests and examined with the test set of 1047297ve molecules

permits using this model to estimate other amines rich loading

under speci1047297c conditions According to the results it could be

argued that a good amine solvent for carbon dioxide absorption

should have a linear structure with a high number of primary and

secondary amine groups as side chains In other words increasing

the number of primary and secondary amine groups results in

increasing the number of NH bonds active sites which causes the

amine reaction with CO2 to happen

The promising results of this study might aid other researchers

in the 1047297eld of chemistry and natural gas engineering to design and

synthesis new potential amine-based solvents and investigate the

feasibility of using them in gas removal processes New improved

solvents should also be compared to more conventional ones from

corrosively energy ef 1047297ciency and operability point of view

Acknowledgment

The authors would like to gratefully acknowledge the support

from Institute of Petroleum Engineering (IPE) University of Tehran

List of symbols

CO2 carbon dioxide

QSPRQSAR quantitative structure propertyactivity relationship

DFT Density Functional Theory

MLR Multiple Linear Regression

GA Genetic Algorithms

PCR principle component regression

PLS partial least square

HOMO Highest Occupied Molecular Orbital

LUMO Lowest Unoccupied Molecular Orbital

AC absorption capacity

PCA principal component analysis

LOO-CV Leave-one-out cross-validation

RMSE root mean square errordf M degrees of freedom of the model

df E degrees of freedom of the error

References

Beheshti Abolghasem Riahi Siavash Ganjali Mohammad Reza 2009 Quantitativestructureeproperty relationship study on 1047297rst reduction and oxidation poten-tials of donor-substituted phenylquinolinylethynes and phenyl-isoquinolinylethynes quantum chemical investigation Electrochim Acta 54(23) 5368e5375

Beheshti A Norouzi P Ganjali MR 2012 A simple and robust model for pre-dicting the reduction potential of quinones family electrophilicity index effectInt J Electrochem Sci 7 4811e4821

Bohloul MR Vatani A Peyghambarzadeh SM 2014 Experimental and theo-retical study of CO2 solubility in N-methyl-2-pyrrolidone (NMP) Fluid PhaseEquilibr 365 106e111

Caplow Michael 1968 Kinetics of carbamate formation and breakdown J AmChem Soc 90 (24) 6795e6803

Chakraborty AK et al 1988 Molecular orbital approach to substituent effects inamine-CO2 interactions J Am Chem Soc 110 (21) 6947e6954

Chakraborty AK Astarita G Bischoff KB 1986 CO2absorption in aqueous so-lutions of hindered amines Chem Eng Sci 41 (4) 997e1003

Cramer Christopher J 2005 Essentials of Computational Chemistry Theories andModels Wileycom

Crooks John E Donnellan J Paul 1989 Kinetics and mechanism of the reactionbetween carbon dioxide and amines in aqueous solution J Chem Soc PerkinTrans 2 (4) 331e333

da Silva Eirik F Svendsen Hallvard F 2004 Ab initio study of the reaction of carbamate formation from CO2 and alkanolamines Indust Eng Chem Res 43(13) 3413e3418

Depczynski Uwe Frost VJ Molt K 2000 Genetic algorithms applied to the se-lection of factors in principal component regression Anal Chim Acta 420 (2)217e227

Eriksson Lennart Joanna Jaworska Worth Andrew P Cronin Mark TD

McDowell Robert M Gramatica Paola 2003 Methods for reliability and

Fig 4 Mean effects of model descriptors (standardized coef 1047297cient values)

M Momeni S Riahi Journal of Natural Gas Science and Engineering 21 (2014) 442e450 449


In Table 6, a Y-scrambling test was applied to examine the robustness of the model (Tropsha et al., 2003). In the Y-scrambling test, the dependent variable (absorption capacity) is randomly reassigned to different amines and a new QSPR model is built on the unchanged matrix of independent variables. The newly developed QSPR models are expected to have low R² and Q² values. If they do not, the reported model is not reliable for the particular data set and modeling method.
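The Y-scrambling procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the descriptor rows and responses are synthetic (they are not the paper's data), and the helper names `solve`, `r_squared` and `y_scrambling` are hypothetical. The fit is an ordinary least-squares regression with an intercept, solved through the normal equations.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X plus an intercept."""
    rows = [[1.0] + [float(v) for v in x] for x in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    pred = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def y_scrambling(X, y, iterations=10, seed=0):
    """Refit the model on randomly permuted responses; return one R^2 per iteration."""
    rng = random.Random(seed)
    scores = []
    for _ in range(iterations):
        shuffled = list(y)
        rng.shuffle(shuffled)  # break the link between descriptors and response
        scores.append(r_squared(X, shuffled))
    return scores

# Synthetic example (assumption): three count-type descriptors per compound.
X = [[1, 0, 7], [2, 0, 8], [0, 1, 9], [1, 1, 11], [2, 1, 12], [0, 2, 13],
     [3, 0, 10], [1, 2, 15], [2, 2, 16], [0, 0, 5], [3, 1, 14], [1, 0, 9]]
# Synthetic response: a linear function of the descriptors plus a small perturbation.
y = [0.45 * a + 0.35 * b + 0.01 * c + 0.02 * (-1) ** i
     for i, (a, b, c) in enumerate(X)]

true_r2 = r_squared(X, y)
scrambled_r2 = y_scrambling(X, y, iterations=10)
print(f"R2 of the real model: {true_r2:.3f}")
print("R2 after scrambling:", [round(s, 3) for s in scrambled_r2])
```

If the scrambled fits reached R² values close to that of the real model, the original correlation would be suspect; a large gap is the behavior the test looks for.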

The applicability domain of the model was studied using the Williams plot in Fig. 2 (OECD, 2007). In a Williams plot, the standardized residuals (R) are plotted against the leverage (hat diagonal) values (h). Leverage measures the distance of a compound from the centroid of X, where X is the descriptor matrix. The leverage of a compound is calculated by the following equation (Netzeva et al., 2005):

$$ h_i = x_i^{T}\left(X^{T}X\right)^{-1}x_i \tag{12} $$

where $x_i$ is the descriptor vector of the relevant compound. The warning leverage ($h^{*}$) is defined as (Eriksson et al., 2003):

$$ h^{*} = \frac{3(p+1)}{n} \tag{13} $$

Here n is the number of training objects and p is the number of descriptors in the model. The Williams plot is used to identify both response outliers and structurally influential chemicals in the model. A compound with $h_i > h^{*}$ influences the regression line, but it is not necessarily an outlier, because its standardized residual may be small. In this data set, the warning leverage is around 0.67. Furthermore, a compound whose standardized residual is greater than three standard deviation units (>3σ) is considered an outlier. It is common in the literature to use 3 as the cut-off value when evaluating the prediction results of a model.
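As a rough illustration of Eqs. (12) and (13), the sketch below computes hat-diagonal leverages and the warning leverage in plain Python. The descriptor values are hypothetical, the function names are not from the paper, and adding an intercept column to X is an assumption (it is consistent with the p + 1 term in Eq. (13), but the text does not state it explicitly).

```python
def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        d = m[col][col]
        m[col] = [v / d for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]

def leverages(X, intercept=True):
    """Hat diagonal h_i = x_i^T (X^T X)^-1 x_i for every row of X (Eq. (12))."""
    rows = [([1.0] if intercept else []) + [float(v) for v in x] for x in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    inv = invert(xtx)
    return [sum(r[i] * inv[i][j] * r[j] for i in range(k) for j in range(k))
            for r in rows]

def warning_leverage(n, p):
    """Warning leverage h* = 3 (p + 1) / n (Eq. (13))."""
    return 3.0 * (p + 1) / n

# Hypothetical descriptor matrix: 8 compounds, 3 descriptors each.
X = [[1, 0, 7], [2, 0, 10], [0, 1, 9], [1, 1, 12],
     [2, 1, 14], [0, 2, 11], [3, 0, 13], [1, 2, 16]]
h = leverages(X)
h_star = warning_leverage(n=len(X), p=3)
influential = [i for i, hi in enumerate(h) if hi > h_star]

# With the paper's training set size (n = 18) and three descriptors (p = 3):
# 3 * (3 + 1) / 18 = 0.667, which matches the quoted warning value of about 0.67.
h_star_paper = warning_leverage(n=18, p=3)
print(round(h_star_paper, 3))
```

A useful sanity check on any leverage calculation is that the leverages sum to the number of fitted parameters (here four: three descriptors plus the intercept).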

Fig. 2 shows that no chemical has a leverage higher than the warning value h* of 0.67. It also shows that there are no outliers in the training or test sets: all compounds lie between the two horizontal lines.

The experimental absorption capacity (rich loading) values of the amines are plotted in Fig. 3 against the corresponding values calculated by the QSPR model.

Furthermore, the mean effect (MF) helps to interpret the result by showing the effect of each descriptor individually or relative to the other descriptors. Fig. 4 shows the standardized coefficients (also called beta coefficients) (XLSTAT, 2013), which are used to compare the relative weights of the descriptors. The higher the standardized coefficient of a descriptor, the greater the weight of the corresponding variable in the model. The figure thus demonstrates the mean effect of the descriptors in the model.
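For readers unfamiliar with standardized (beta) coefficients, the following minimal sketch shows the usual textbook definition, in which each raw regression coefficient is rescaled by the ratio of the descriptor's standard deviation to that of the response. Whether XLSTAT uses exactly this convention is an assumption here, and the function name and numbers are purely illustrative.

```python
from statistics import stdev

def standardized_coefficients(coefs, X, y):
    """Rescale raw MLR coefficients b_j to beta_j = b_j * sd(x_j) / sd(y).

    coefs: raw coefficients, one per descriptor (intercept excluded)
    X:     list of descriptor rows
    y:     list of responses
    """
    sd_y = stdev(y)
    columns = list(zip(*X))  # one tuple of values per descriptor
    return [b * stdev(col) / sd_y for b, col in zip(coefs, columns)]

# Illustrative check: for an exact one-descriptor relationship y = 2 x,
# the standardized coefficient is 1 regardless of the units of x and y.
X_demo = [[1.0], [2.0], [3.0], [4.0]]
y_demo = [2.0, 4.0, 6.0, 8.0]
beta_demo = standardized_coefficients([2.0], X_demo, y_demo)
print(beta_demo)
```

Because the rescaling removes the units of each descriptor, the resulting values can be compared directly, which is what makes them suitable for ranking descriptor importance.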

From this figure it can be concluded that the number of primary amines (nRNH2) and the number of secondary amines (nRNHR) play the main roles, in that order, in the capacity of amines for carbon dioxide absorption, and that the number of hydrogen atoms (nH) has the least effect. The figure also shows that all descriptors in the model have positive effects, so the capacity of amines for CO2 absorption increases with each of these descriptors.

Finally, it should be noted that the present work focuses exclusively on developing a simple model by which the capacities of amines for carbon dioxide absorption can be predicted. The predominant difference between this study and previous ones is that this work concentrates first on a quantitative and then on a qualitative representation of structural effects on the capacity of amines for CO2 absorption.

5. Discussion

Although high statistical parameters are significant in demonstrating the capability of the model, a QSPR model should also provide insight into the mechanism of carbon dioxide solubility in amine-based solvents. For this reason, an acceptable interpretation of the descriptors in the QSPR model should be provided. It is useful to identify which parameters affect the amine capacity and which descriptors could appear in the model because of the principal chemical reactions between carbon dioxide and an amine-based solvent.

The overall reaction mechanism for the chemical absorption of CO2 in amine solvent systems is still under debate. A mechanism that proceeds through a zwitterion intermediate, which is then deprotonated by a base B, was suggested by Caplow (1968) and is given by reactions (1) and (2) below:

$$ \mathrm{CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NH^{+}COO^{-}} \tag{1} $$

$$ \mathrm{R_1R_2NH^{+}COO^{-} + B \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \tag{2} $$

R1 and R2 denote the substituent groups attached to the amine group, and B is a base molecule, which can be a water molecule. The intermediate in the reaction is a zwitterion. More recent studies, however, showed that the zwitterion appears to be short-lived and may be an entirely transient state (da Silva and Svendsen, 2004). This led to the assumption of a single-step mechanism (reaction (3)). A termolecular single-step mechanism was suggested by Crooks and Donnellan (1989):

$$ \mathrm{B + CO_2 + R_1R_2NH \rightleftharpoons R_1R_2NCOO^{-} + BH^{+}} \tag{3} $$

where B is again the base molecule. In this mechanism, the NH group is attacked by the base molecule and the amine is deprotonated, while the bond between the amine and carbon dioxide forms simultaneously.

Table 6
R² (train) values after several Y-scrambling tests.

Iteration   R² (train)
1           0.060
2           0.074
3           0.119
4           0.027
5           0.188
6           0.102
7           0.119
8           0.209
9           0.096
10          0.039

Table 5
Validation parameters and statistical results of the GA-MLR model.

          n    R²     R²adj  RMSE   F       k      k′     s
Train     18   0.942  0.930  0.127  76.50   1.004  0.984  0.144
Test      5    0.976  0.904  0.060  17.31   0.962  1.035  0.135
Overall   23   0.950  0.942  0.116  123.79  0.999  0.990  0.128
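The adjusted R² column of Table 5 can be cross-checked from R², n and the number of descriptors with the standard formula R²adj = 1 − (1 − R²)(n − 1)/(n − p − 1). The short sketch below does this under two assumptions: that the model has p = 3 descriptors (nRNH2, nRNHR and nH), and that the table entries are read with their decimal points restored (for example 0.942 for the training R²). The function name is illustrative.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a linear model with p descriptors fitted on n compounds."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# (label, n, R2) as read from Table 5, decimal points restored (assumption).
rows = [("Train", 18, 0.942), ("Test", 5, 0.976), ("Overall", 23, 0.950)]
checked = {label: round(adjusted_r2(r2, n, p=3), 3) for label, n, r2 in rows}
for label, value in checked.items():
    print(label, value)
```

Under these assumptions the recomputed values agree with the tabulated R²adj entries to three decimals, which supports reading the model as a three-descriptor regression.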


As can be seen, the reaction between CO2 and an amine-based solvent takes place because of the NH bond. The NH group is therefore the active site of the amine molecule, where the base molecule (water) takes part in the termolecular reaction. Consequently, the number of NH bonds, or in other words the number of primary and secondary amine groups in the amine molecule, plays an important role in the capacity of amines for CO2 absorption. The numbers of primary (nRNH2) and secondary (nRNHR) amines are the two main descriptors appearing in the model. According to Fig. 4, these two descriptors have a positive effect and the highest mean effects. These results show that the chemical reaction mechanism is consistent with the proposed model.

The model also contains the number of hydrogen atoms (nH) as another descriptor. Fig. 4 shows that the nH descriptor has a positive effect that is considerably smaller than that of the other two descriptors. The presence of nH in the model can be explained by the experimental work of Singh et al., who showed that an increase in the chain length between the amine and other functional groups in the amine structure results in an increase in amine capacity for CO2 absorption (Singh et al., 2007). Increasing the chain length increases the number of hydrogen atoms, so a positive effect of nH is consistent with the experimental work.

Finally, it should be noted that the model is attractively simple and its results are quite acceptable for predicting the capacity of amines for carbon dioxide absorption. Although the accuracy of the model is good for linear amine compounds, it is poorer for unsaturated cyclic amines. This can be explained by two main reasons. First, the three descriptors in the model are not sensitive to ring-type functional groups; they just count the number of hydrogen atoms and of primary and secondary amines. Second, unsaturated cyclic amines show poor absorption rate and capacity, and they are not potential absorbents for CO2 (2). Therefore, from an industrial point of view it is preferable to use linear amines for CO2 absorption, and it is more important for the model to predict CO2 absorption capacity for linear amines than for unsaturated cyclic amines.

Fig. 2. Williams plot of GA-MLR model development.

Fig. 3. Experimental vs. predicted rich loading values (mol CO2/mol amine) – regression line.

Fortunately, the results of the first equation (Eq. (1)) for predicting the capacity of amines for CO2 absorption are largely acceptable for both linear and aromatic ring-type amines (items labeled 5, 6 and 21). This is because of the presence of an RDF descriptor in this model. RDF descriptors are based on the distance distribution in the geometrical representation of a molecule. This function is independent of the number of atoms and is invariant against translation and rotation of the entire molecule. The RDF code provides valuable information, e.g., about bond distances, ring types, planar and non-planar systems, and atom types, so it is sensitive to aromatic rings (Todeschini and Consonni, 2008).

6. Conclusions

One of the main concerns of the natural gas industry is to have a robust and accurate model that can predict the chemical behavior of amines in gas treatment processes. This study attempted to identify the effects of the chemical structure of amines on their capacity for carbon dioxide absorption and to develop a model for this purpose that is not only robust and accurate but also simple and applicable. The QSPR approach was therefore chosen as the modeling technique, and the model was developed with a linear method for its simplicity. As a result, two linear equations were developed. The first model demonstrates high predictive power, while the second is notably simpler and readily interpretable in terms of the chemistry of the reaction of amines with carbon dioxide. Consequently, the second equation is introduced as the preferred model of this study. The most important descriptors appearing in the model, ranked by the weight of the corresponding variable, are the number of primary aliphatic amines (nRNH2), the number of secondary aliphatic amines (nRNHR) and the number of hydrogen atoms (nH), respectively. The accuracy and predictive performance of the model, validated with various statistical tests and examined with a test set of five molecules, permit using this model to estimate the rich loading of other amines under specific conditions. According to the results, it could be argued that a good amine solvent for carbon dioxide absorption should have a linear structure with a high number of primary and secondary amine groups as side chains. In other words, increasing the number of primary and secondary amine groups increases the number of NH bonds, the active sites at which the reaction of the amine with CO2 takes place.

The promising results of this study might aid other researchers in the fields of chemistry and natural gas engineering to design and synthesize new potential amine-based solvents and investigate the feasibility of using them in gas removal processes. New improved solvents should also be compared with more conventional ones from the corrosivity, energy efficiency and operability points of view.

Acknowledgment

The authors would like to gratefully acknowledge the support from the Institute of Petroleum Engineering (IPE), University of Tehran.

List of symbols

CO2        carbon dioxide
QSPR/QSAR  quantitative structure property/activity relationship
DFT        density functional theory
MLR        multiple linear regression
GA         genetic algorithms
PCR        principal component regression
PLS        partial least squares
HOMO       highest occupied molecular orbital
LUMO       lowest unoccupied molecular orbital
AC         absorption capacity
PCA        principal component analysis
LOO-CV     leave-one-out cross-validation
RMSE       root mean square error
dfM        degrees of freedom of the model
dfE        degrees of freedom of the error

Fig. 4. Mean effects of model descriptors (standardized coefficient values).

References

Beheshti, Abolghasem, Riahi, Siavash, Ganjali, Mohammad Reza, 2009. Quantitative structure–property relationship study on first reduction and oxidation potentials of donor-substituted phenylquinolinylethynes and phenylisoquinolinylethynes: quantum chemical investigation. Electrochim. Acta 54 (23), 5368–5375.

Beheshti, A., Norouzi, P., Ganjali, M.R., 2012. A simple and robust model for predicting the reduction potential of quinones family; electrophilicity index effect. Int. J. Electrochem. Sci. 7, 4811–4821.

Bohloul, M.R., Vatani, A., Peyghambarzadeh, S.M., 2014. Experimental and theoretical study of CO2 solubility in N-methyl-2-pyrrolidone (NMP). Fluid Phase Equilibr. 365, 106–111.

Caplow, Michael, 1968. Kinetics of carbamate formation and breakdown. J. Am. Chem. Soc. 90 (24), 6795–6803.

Chakraborty, A.K., et al., 1988. Molecular orbital approach to substituent effects in amine-CO2 interactions. J. Am. Chem. Soc. 110 (21), 6947–6954.

Chakraborty, A.K., Astarita, G., Bischoff, K.B., 1986. CO2 absorption in aqueous solutions of hindered amines. Chem. Eng. Sci. 41 (4), 997–1003.

Cramer, Christopher J., 2005. Essentials of Computational Chemistry: Theories and Models. Wiley.com.

Crooks, John E., Donnellan, J. Paul, 1989. Kinetics and mechanism of the reaction between carbon dioxide and amines in aqueous solution. J. Chem. Soc. Perkin Trans. 2 (4), 331–333.

da Silva, Eirik F., Svendsen, Hallvard F., 2004. Ab initio study of the reaction of carbamate formation from CO2 and alkanolamines. Indust. Eng. Chem. Res. 43 (13), 3413–3418.

Depczynski, Uwe, Frost, V.J., Molt, K., 2000. Genetic algorithms applied to the selection of factors in principal component regression. Anal. Chim. Acta 420 (2), 217–227.

Eriksson, Lennart, Jaworska, Joanna, Worth, Andrew P., Cronin, Mark T.D., McDowell, Robert M., Gramatica, Paola, 2003. Methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based QSARs. Environ. Health Perspect. 111 (10), 1361.

Fini, Mojtaba Fallah, Riahi, Siavash, Bahramian, Alireza, 2012. Experimental and QSPR studies on the effect of ionic surfactants on n-decane–water interfacial tension. J. Surfact. Deterg. 15 (4), 477–484.

Freire, Mara G., et al., 2010. Solubility of non-aromatic ionic liquids in water and correlation using a QSPR approach. Fluid Phase Equilibr. 294 (1), 234–240.

Frisch, Michael J., Nielsen, Alice B., Frisch, Aeleen (Eds.), 1998. Gaussian 98. Gaussian, Incorporated.

Godavarthy, Srinivasa S., Robinson Jr., Robert L., Gasem, Khaled A.M., 2006. SVRC–QSPR model for predicting saturated vapor pressures of pure fluids. Fluid Phase Equilibr. 246 (1), 39–51.

Golbraikh, Alexander, Tropsha, Alexander, 2002. Beware of q2! J. Mol. Graph. Model. 20 (4), 269–276.

Hu, Rongjing, et al., 2009. QSAR models for 2-amino-6-arylsulfonylbenzonitriles and congeners HIV-1 reverse transcriptase inhibitors based on linear and nonlinear regression methods. Eur. J. Med. Chem. 44 (5), 2158–2171.

Jouan-Rimbaud, Delphine, et al., 1995. Genetic algorithms as a tool for wavelength selection in multivariate calibration. Anal. Chem. 67 (23), 4295–4301.

Katritzky, Alan R., et al., 2000. QSPR correlation and predictions of GC retention indexes for methyl-branched hydrocarbons produced by insects. Anal. Chem. 72 (1), 101–109.

Kohl, Arthur L., Nielsen, Richard, 1997. Gas Purification (access online via Elsevier).

Lee, Anita S., Kitchin, John R., 2012. Chemical and molecular descriptors for the reactivity of amines with CO2. Indust. Eng. Chem. Res. 51 (42), 13609–13618.

Liang, Guijie, Xu, Jie, Liu, Li, 2013. QSPR analysis for melting point of fatty acids using genetic algorithm based multiple linear regression (GA-MLR). Fluid Phase Equilibr. 353, 15–21.

Marengo, Emilio, et al., 1992. Comparative study of different structural descriptors and variable selection approaches using partial least squares in quantitative structure-activity relationships. Chemometr. Intell. Lab. Syst. 14 (1), 225–233.

Mokhatab, Saeid, Poe, William A., 2012. Handbook of Natural Gas Transmission and Processing (access online via Elsevier).

Netzeva, Tatiana I., Worth, Andrew P., Aldenberg, Tom, Benigni, Romualdo, Cronin, Mark T.D., Gramatica, Paola, Jaworska, Joanna S., et al., 2005. Current status of methods for defining the applicability domain of (quantitative) structure–activity relationships. ATLA 33, 155–173.

OECD, 2007. Guidance Document on the Validation of (Quantitative) Structure-Activity Relationships [(Q)SAR] Models. Organisation for Economic Co-Operation and Development, Paris.

Pourbasheer, Eslam, et al., 2011. Prediction of solubility of fullerene C60 in various organic solvents by genetic algorithm-multiple linear regression. Fullerenes Nanotubes Carbon Nanostruct. 19 (7), 585–598.

Riahi, Siavash, Ganjali, Mohammad Reza, Norouzi, Parviz, Jafari, Fatemeh, 2008. Application of GA-MLR, GA-PLS and the DFT quantum mechanical (QM) calculations for the prediction of the selectivity coefficients of a histamine-selective electrode. Sens. Actuat. B Chem. 132 (1), 13–19.

Riahi, Siavash, Pourbasheer, Eslam, Ganjali, Mohammad Reza, Norouzi, Parviz, 2009. Investigation of different linear and nonlinear chemometric methods for modeling of retention index of essential oil components: concerns to support vector machine. J. Hazard. Mater. 166 (2), 853–859.

Riahi, Siavash, Beheshti, Abolghasem, Ganjali, Mohammad Reza, Norouzi, Parviz, 2008. A novel QSPR study of normalized migration time for drugs in capillary electrophoresis by new descriptors: quantum chemical investigation. Electrophoresis 29 (19), 4027–4035.

Riahi, Siavash, et al., 2008. QSAR study of 2-(1-propylpiperidin-4-yl)-1H-benzimidazole-4-carboxamide as PARP inhibitors for treatment of cancer. Chem. Biol. Drug Design 72 (6), 575–584.

Sartori, Guido, Savage, David W., 1983. Sterically hindered amines for carbon dioxide removal from gases. Indust. Eng. Chem. Fundam. 22 (2), 239–249.

Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2007. Structure and activity relationships for amine based CO2 absorbents-I. Int. J. Greenhouse Gas Control 1 (1), 5–10.

Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2009. Structure and activity relationships for amine-based CO2 absorbents-II. Chem. Eng. Res. Design 87 (2), 135–144.

Todeschini, R., Consonni, V., Mauri, A., Pavan, M., 2002. DRAGON-Software for the calculation of molecular descriptors, version 2.1.

Todeschini, Roberto, Consonni, Viviana, 2008. Handbook of Molecular Descriptors. John Wiley & Sons.

Tropsha, Alexander, Gramatica, Paola, Gombar, Vijay K., 2003. The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb. Sci. 22 (1), 69–77.

XLSTAT, 2013. Software, XLSTAT-CCR module, Trial version.

Page 7: Prediction of Amines Capacity for Carbon Dioxide Absorption in Gas

7182019 Prediction of Amines Capacity for Carbon Dioxide Absorption in Gas

httpslidepdfcomreaderfullprediction-of-amines-capacity-for-carbon-dioxide-absorption-in-gas 79

As can be noticed the reaction between CO2 and amine based

solvent takes place because of the existence of NH bond So NH

group is an active site of the amine molecule where base molecule

(water) undergoes a chemical termolecular reaction Consequently

the amount of NH bonds or in other words the number of primary

and secondary amine groups in the amine molecule plays an

important role in the capacity of amines for CO2 absorption

Number of primary (nRNH2) and secondary (nRNHR) amines is two

main descriptors appearing in the model According to Fig 4 these

two descriptors have a positive effect and a higher mean effect All

these results demonstrate that the chemical reaction mechanism

coordinates with the proposed model

The model also contains number of Hydrogen atoms (nH) as

another descriptor Fig 4 shows nH descriptor has a positive effect

which is considerably less than two other descriptors The reason of

nH descriptor presence in the model can be explained by the result

of experimental work performed by Singh et al They showed that

an increase in the chain length between amines and other func-

tional groups in the amine structure result in an increase in amine

capacity for CO2 absorption (Singh et al 2007) Increasing with

chain length results in increasing numbers of hydrogen atoms so

apparently it seems it should have a positive effect due to the

experimental work

At last it should be noted that the simplicity of the model is

interesting and the results are quite acceptable for predicting

amines capacity for carbon dioxide absorption Although the ac-

curacy of the model is good for linear amine compounds it is not

better for unsaturated cyclic amines This can be explained by two

Fig 2 Williams plot of GA-MLR model development

Fig 3 Experimental vs predicted rich loading values (mol CO 2mol amine) e

regression line

M Momeni S Riahi Journal of Natural Gas Science and Engineering 21 (2014) 442e450448

7182019 Prediction of Amines Capacity for Carbon Dioxide Absorption in Gas

httpslidepdfcomreaderfullprediction-of-amines-capacity-for-carbon-dioxide-absorption-in-gas 89

main reasons First the three descriptors in model are not sensitive

to ring type functional group and just count the number of

hydrogen atoms primary and secondary amines Second unsatu-

rated cyclic amines show poor absorption rate and capacity and

they are not potential absorbents for CO2 absorption (2) Therefore

according to the industrial point of view it is preferable to use

linear amine for CO2 absorption and it is more important for the

model to predict CO2 absorption capacity for linear amines rather

than unsaturated cyclic amines

Fortunately the results of the 1047297rst equation (Eq (1)) for pre-

dicting amines capacity of CO2 absorption are largely accepted

either for linear or aromatic ring type amines (items labeled 5 6

and 21) This is because of the presence of RDF descriptor in this

model RDFdescriptorsare based on the distance distribution in the

geometrical representation of a molecule This function is inde-

pendent of the number of atoms and is invariant against translation

and rotation of the entire molecule The RDFcode provides valuable

information eg about bond distances ring types planar and non-

planar systems and atom types so it is sensitive to aromatic rings

(Todeschini and Consonni 2008)

6 Conclusions

One of the main concerns of the natural gas industry is to have a

robust and accuratemodel which canpredict the chemical behavior

of amines for gas treatment process This study is attempted toidentify the effects chemical structure of amine on their capacity for

carbon dioxide absorption and develop a model for this purpose

which is not only robust and accurate but also simple and appli-

cable Therefore QSPR approach has been chosen as a modeling

technique and model has been developed based on linear method

for its simplicity As a result two linear equations were developed

First model demonstrate high prediction powerwhile second one is

notably simpler and powerfully interpretable due to the chemistry

of amines reaction with carbon dioxide Consequently second

equation introduced as a preferred model of this study The most

important descriptors appearing in the model due to the weight of

the corresponding variable are number of primary aliphatic amines

(nRNH2) number of secondary aliphatic amines (nRNHR) and

number of hydrogen atoms (nH) respectively The accuracy and

predictive performance of the model validated with various sta-

tistical tests and examined with the test set of 1047297ve molecules

permits using this model to estimate other amines rich loading

under speci1047297c conditions According to the results it could be

argued that a good amine solvent for carbon dioxide absorption

should have a linear structure with a high number of primary and

secondary amine groups as side chains In other words increasing

the number of primary and secondary amine groups results in

increasing the number of NH bonds active sites which causes the

amine reaction with CO2 to happen

The promising results of this study might aid other researchers

in the 1047297eld of chemistry and natural gas engineering to design and

synthesis new potential amine-based solvents and investigate the

feasibility of using them in gas removal processes New improved

solvents should also be compared to more conventional ones from

corrosively energy ef 1047297ciency and operability point of view

Acknowledgment

The authors would like to gratefully acknowledge the support

from Institute of Petroleum Engineering (IPE) University of Tehran

List of symbols

CO2 carbon dioxide

QSPRQSAR quantitative structure propertyactivity relationship

DFT Density Functional Theory

MLR Multiple Linear Regression

GA Genetic Algorithms

PCR principle component regression

PLS partial least square

HOMO Highest Occupied Molecular Orbital

LUMO Lowest Unoccupied Molecular Orbital

AC absorption capacity

PCA principal component analysis

LOO-CV Leave-one-out cross-validation

RMSE root mean square errordf M degrees of freedom of the model

df E degrees of freedom of the error

References

Beheshti Abolghasem Riahi Siavash Ganjali Mohammad Reza 2009 Quantitativestructureeproperty relationship study on 1047297rst reduction and oxidation poten-tials of donor-substituted phenylquinolinylethynes and phenyl-isoquinolinylethynes quantum chemical investigation Electrochim Acta 54(23) 5368e5375

Beheshti A Norouzi P Ganjali MR 2012 A simple and robust model for pre-dicting the reduction potential of quinones family electrophilicity index effectInt J Electrochem Sci 7 4811e4821

Bohloul MR Vatani A Peyghambarzadeh SM 2014 Experimental and theo-retical study of CO2 solubility in N-methyl-2-pyrrolidone (NMP) Fluid PhaseEquilibr 365 106e111

Caplow Michael 1968 Kinetics of carbamate formation and breakdown J AmChem Soc 90 (24) 6795e6803

Chakraborty AK et al 1988 Molecular orbital approach to substituent effects inamine-CO2 interactions J Am Chem Soc 110 (21) 6947e6954

Chakraborty AK Astarita G Bischoff KB 1986 CO2absorption in aqueous so-lutions of hindered amines Chem Eng Sci 41 (4) 997e1003

Cramer Christopher J 2005 Essentials of Computational Chemistry Theories andModels Wileycom

Crooks John E Donnellan J Paul 1989 Kinetics and mechanism of the reactionbetween carbon dioxide and amines in aqueous solution J Chem Soc PerkinTrans 2 (4) 331e333

da Silva Eirik F Svendsen Hallvard F 2004 Ab initio study of the reaction of carbamate formation from CO2 and alkanolamines Indust Eng Chem Res 43(13) 3413e3418

Depczynski Uwe Frost VJ Molt K 2000 Genetic algorithms applied to the se-lection of factors in principal component regression Anal Chim Acta 420 (2)217e227

Eriksson Lennart Joanna Jaworska Worth Andrew P Cronin Mark TD

McDowell Robert M Gramatica Paola 2003 Methods for reliability and

Fig 4 Mean effects of model descriptors (standardized coef 1047297cient values)

M Momeni S Riahi Journal of Natural Gas Science and Engineering 21 (2014) 442e450 449


main reasons. First, the three descriptors in the model are not sensitive to ring-type functional groups and just count the number of hydrogen atoms and of primary and secondary amines. Second, unsaturated cyclic amines show poor absorption rate and capacity, and they are not potential absorbents for CO2 absorption (2). Therefore, from an industrial point of view, it is preferable to use linear amines for CO2 absorption, and it is more important for the model to predict CO2 absorption capacity for linear amines than for unsaturated cyclic amines.

Fortunately, the results of the first equation (Eq. (1)) for predicting amine capacity for CO2 absorption are largely acceptable for both linear and aromatic ring-type amines (items labeled 5, 6 and 21). This is because of the presence of an RDF descriptor in this model. RDF descriptors are based on the distance distribution in the geometrical representation of a molecule. This function is independent of the number of atoms and is invariant against translation and rotation of the entire molecule. The RDF code provides valuable information, e.g., about bond distances, ring types, planar and non-planar systems, and atom types, so it is sensitive to aromatic rings (Todeschini and Consonni, 2008).
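The invariance described above follows from the RDF code depending only on interatomic distances. The following is a minimal Python sketch, assuming the commonly cited form g(r) = sum over atom pairs i<j of w_i w_j exp(-B (r - r_ij)^2). The function names, the smoothing parameter `beta`, the unit atomic weights and the toy coordinates are illustrative assumptions, not the settings used for the RDF descriptor in Eq. (1).

```python
import math


def rdf_code(coords, weights, radii, beta=100.0):
    """Evaluate g(r) = sum_{i<j} w_i * w_j * exp(-beta * (r - r_ij)**2) at each r.

    coords  -- list of (x, y, z) atomic positions
    weights -- one atomic property per atom (assumed: unit weights in the demo)
    radii   -- sampling radii; the output length equals len(radii),
               whatever the number of atoms
    """
    values = []
    for r in radii:
        total = 0.0
        for i in range(len(coords)):
            for j in range(i + 1, len(coords)):
                r_ij = math.dist(coords[i], coords[j])  # interatomic distance
                total += weights[i] * weights[j] * math.exp(-beta * (r - r_ij) ** 2)
        values.append(total)
    return values


def rigid_transform(coords, angle, shift):
    """Rotate about the z axis by `angle` (radians), then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in coords
    ]


# Toy three-atom geometry (hypothetical coordinates, arbitrary units).
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.5, 0.0)]
toy_weights = [1.0, 1.0, 1.0]
toy_radii = [0.5, 1.0, 1.5, 2.0]

original = rdf_code(toy_coords, toy_weights, toy_radii)
moved = rdf_code(rigid_transform(toy_coords, 0.7, (3.0, -2.0, 5.0)), toy_weights, toy_radii)
print(all(abs(a - b) < 1e-9 for a, b in zip(original, moved)))
```

Because a rigid rotation or translation leaves every r_ij unchanged, the two vectors agree to rounding error, and the vector length is fixed by the sampling radii rather than by the size of the molecule.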

6. Conclusions

One of the main concerns of the natural gas industry is to have a robust and accurate model that can predict the chemical behavior of amines in gas treatment processes. This study attempted to identify the effects of the chemical structure of amines on their capacity for carbon dioxide absorption and to develop a model for this purpose that is not only robust and accurate but also simple and applicable. Therefore, the QSPR approach was chosen as the modeling technique, and the model was developed with a linear method for its simplicity. As a result, two linear equations were developed. The first model demonstrates high predictive power, while the second is notably simpler and readily interpretable in terms of the chemistry of the amine reaction with carbon dioxide. Consequently, the second equation is introduced as the preferred model of this study. The most important descriptors appearing in the model, in order of the weight of the corresponding variable, are the number of primary aliphatic amines (nRNH2), the number of secondary aliphatic amines (nRNHR) and the number of hydrogen atoms (nH), respectively. The accuracy and predictive performance of the model, validated with various statistical tests and examined with a test set of five molecules, permit using this model to estimate the rich loading of other amines under specific conditions. According to the results, it could be argued that a good amine solvent for carbon dioxide absorption should have a linear structure with a high number of primary and secondary amine groups as side chains. In other words, increasing the number of primary and secondary amine groups increases the number of N-H bonds, which are the active sites at which the amine reaction with CO2 takes place.
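As a reading aid, the structure of the preferred three-descriptor linear model can be sketched as below. The intercept b_0 and the coefficients b_1, b_2 and b_3 are placeholder symbols introduced here for illustration; their fitted values belong to the second equation of the paper and are not reproduced in this section.

```latex
\mathrm{AC} = b_0 + b_1\, n\mathrm{RNH2} + b_2\, n\mathrm{RNHR} + b_3\, n\mathrm{H}
```

Here AC is the absorption capacity (rich loading), and the three descriptors are simple counts obtained from the molecular structure.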

The promising results of this study might aid other researchers in the fields of chemistry and natural gas engineering to design and synthesize new potential amine-based solvents and to investigate the feasibility of using them in gas removal processes. New improved solvents should also be compared to more conventional ones from the corrosivity, energy efficiency and operability points of view.

Acknowledgment

The authors gratefully acknowledge the support of the Institute of Petroleum Engineering (IPE), University of Tehran.

List of symbols

CO2: carbon dioxide
QSPR/QSAR: quantitative structure-property/activity relationship
DFT: density functional theory
MLR: multiple linear regression
GA: genetic algorithms
PCR: principal component regression
PLS: partial least squares
HOMO: highest occupied molecular orbital
LUMO: lowest unoccupied molecular orbital
AC: absorption capacity
PCA: principal component analysis
LOO-CV: leave-one-out cross-validation
RMSE: root mean square error
df_M: degrees of freedom of the model
df_E: degrees of freedom of the error

References

Beheshti, Abolghasem, Riahi, Siavash, Ganjali, Mohammad Reza, 2009. Quantitative structure–property relationship study on first reduction and oxidation potentials of donor-substituted phenylquinolinylethynes and phenylisoquinolinylethynes: quantum chemical investigation. Electrochim. Acta 54 (23), 5368–5375.

Beheshti, A., Norouzi, P., Ganjali, M.R., 2012. A simple and robust model for predicting the reduction potential of quinones family; electrophilicity index effect. Int. J. Electrochem. Sci. 7, 4811–4821.

Bohloul, M.R., Vatani, A., Peyghambarzadeh, S.M., 2014. Experimental and theoretical study of CO2 solubility in N-methyl-2-pyrrolidone (NMP). Fluid Phase Equilibr. 365, 106–111.

Caplow, Michael, 1968. Kinetics of carbamate formation and breakdown. J. Am. Chem. Soc. 90 (24), 6795–6803.

Chakraborty, A.K., et al., 1988. Molecular orbital approach to substituent effects in amine-CO2 interactions. J. Am. Chem. Soc. 110 (21), 6947–6954.

Chakraborty, A.K., Astarita, G., Bischoff, K.B., 1986. CO2 absorption in aqueous solutions of hindered amines. Chem. Eng. Sci. 41 (4), 997–1003.

Cramer, Christopher J., 2005. Essentials of Computational Chemistry: Theories and Models. Wiley.com.

Crooks, John E., Donnellan, J. Paul, 1989. Kinetics and mechanism of the reaction between carbon dioxide and amines in aqueous solution. J. Chem. Soc. Perkin Trans. 2 (4), 331–333.

da Silva, Eirik F., Svendsen, Hallvard F., 2004. Ab initio study of the reaction of carbamate formation from CO2 and alkanolamines. Indust. Eng. Chem. Res. 43 (13), 3413–3418.

Depczynski, Uwe, Frost, V.J., Molt, K., 2000. Genetic algorithms applied to the selection of factors in principal component regression. Anal. Chim. Acta 420 (2), 217–227.

Eriksson, Lennart, Jaworska, Joanna, Worth, Andrew P., Cronin, Mark T.D., McDowell, Robert M., Gramatica, Paola, 2003. Methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based QSARs. Environ. Health Perspect. 111 (10), 1361.

Fini, Mojtaba Fallah, Riahi, Siavash, Bahramian, Alireza, 2012. Experimental and QSPR studies on the effect of ionic surfactants on n-decane–water interfacial tension. J. Surfact. Deterg. 15 (4), 477–484.

Freire, Mara G., et al., 2010. Solubility of non-aromatic ionic liquids in water and correlation using a QSPR approach. Fluid Phase Equilibr. 294 (1), 234–240.

Frisch, Michael J., Nielsen, Alice B., Frisch, Aeleen (Eds.), 1998. Gaussian 98. Gaussian, Incorporated.

Godavarthy, Srinivasa S., Robinson Jr., Robert L., Gasem, Khaled A.M., 2006. SVRC–QSPR model for predicting saturated vapor pressures of pure fluids. Fluid Phase Equilibr. 246 (1), 39–51.

Golbraikh, Alexander, Tropsha, Alexander, 2002. Beware of q2! J. Mol. Graph. Model. 20 (4), 269–276.

Hu, Rongjing, et al., 2009. QSAR models for 2-amino-6-arylsulfonylbenzonitriles and congeners HIV-1 reverse transcriptase inhibitors based on linear and nonlinear regression methods. Eur. J. Med. Chem. 44 (5), 2158–2171.

Jouan-Rimbaud, Delphine, et al., 1995. Genetic algorithms as a tool for wavelength selection in multivariate calibration. Anal. Chem. 67 (23), 4295–4301.

Katritzky, Alan R., et al., 2000. QSPR correlation and predictions of GC retention indexes for methyl-branched hydrocarbons produced by insects. Anal. Chem. 72 (1), 101–109.

Kohl, Arthur L., Nielsen, Richard, 1997. Gas Purification (access online via Elsevier).

Lee, Anita S., Kitchin, John R., 2012. Chemical and molecular descriptors for the reactivity of amines with CO2. Indust. Eng. Chem. Res. 51 (42), 13609–13618.

Liang, Guijie, Xu, Jie, Liu, Li, 2013. QSPR analysis for melting point of fatty acids using genetic algorithm based multiple linear regression (GA-MLR). Fluid Phase Equilibr. 353, 15–21.

Marengo, Emilio, et al., 1992. Comparative study of different structural descriptors and variable selection approaches using partial least squares in quantitative structure-activity relationships. Chemometr. Intell. Lab. Syst. 14 (1), 225–233.

Mokhatab, Saeid, Poe, William A., 2012. Handbook of Natural Gas Transmission and Processing (access online via Elsevier).

Netzeva, Tatiana I., Worth, Andrew P., Aldenberg, Tom, Benigni, Romualdo, Cronin, Mark T.D., Gramatica, Paola, Jaworska, Joanna S., et al., 2005. Current status of methods for defining the applicability domain of (quantitative) structure–activity relationships. ATLA 33, 155–173.

OECD, 2007. Guidance Document on the Validation of (Quantitative) Structure-Activity Relationships [(Q)SAR] Models. Organisation for Economic Co-Operation and Development, Paris.

Pourbasheer, Eslam, et al., 2011. Prediction of solubility of fullerene C60 in various organic solvents by genetic algorithm-multiple linear regression. Fullerenes Nanotubes Carbon Nanostruct. 19 (7), 585–598.

Riahi, Siavash, Ganjali, Mohammad Reza, Norouzi, Parviz, Jafari, Fatemeh, 2008. Application of GA-MLR, GA-PLS and the DFT quantum mechanical (QM) calculations for the prediction of the selectivity coefficients of a histamine-selective electrode. Sens. Actuat. B Chem. 132 (1), 13–19.

Riahi, Siavash, Pourbasheer, Eslam, Ganjali, Mohammad Reza, Norouzi, Parviz, 2009. Investigation of different linear and nonlinear chemometric methods for modeling of retention index of essential oil components: concerns to support vector machine. J. Hazard. Mater. 166 (2), 853–859.

Riahi, Siavash, Beheshti, Abolghasem, Ganjali, Mohammad Reza, Norouzi, Parviz, 2008. A novel QSPR study of normalized migration time for drugs in capillary electrophoresis by new descriptors: quantum chemical investigation. Electrophoresis 29 (19), 4027–4035.

Riahi, Siavash, et al., 2008. QSAR study of 2-(1-propylpiperidin-4-yl)-1H-benzimidazole-4-carboxamide as PARP inhibitors for treatment of cancer. Chem. Biol. Drug Design 72 (6), 575–584.

Sartori, Guido, Savage, David W., 1983. Sterically hindered amines for carbon dioxide removal from gases. Indust. Eng. Chem. Fundam. 22 (2), 239–249.

Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2007. Structure and activity relationships for amine based CO2 absorbents-I. Int. J. Greenhouse Gas Control 1 (1), 5–10.

Singh, Prachi, Niederer, John P.M., Versteeg, Geert F., 2009. Structure and activity relationships for amine-based CO2 absorbents-II. Chem. Eng. Res. Design 87 (2), 135–144.

Todeschini, R., Consonni, V., Mauri, A., Pavan, M., 2002. DRAGON-Software for the calculation of molecular descriptors, version 2.1.

Todeschini, Roberto, Consonni, Viviana, 2008. Handbook of Molecular Descriptors. John Wiley & Sons.

Tropsha, Alexander, Gramatica, Paola, Gombar, Vijay K., 2003. The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb. Sci. 22 (1), 69–77.

XLSTAT, 2013. Software, XLSTAT-CCR module. Trial version.
