
This article was downloaded by: [York University Libraries]
On: 12 August 2014, At: 17:17
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Statistical Theory and Practice
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/ujsp20

Heuristically Deciding Between Normal and Skew Normal Distributions for Describing the Data on a Response Variable and an Explanatory Variable
Subir Ghosh a & Debarshi Dey a
a Department of Statistics, University of California, Riverside, California, USA
Accepted author version posted online: 08 Aug 2013. Published online: 23 Dec 2013.

To cite this article: Subir Ghosh & Debarshi Dey (2014) Heuristically Deciding Between Normal and Skew Normal Distributions for Describing the Data on a Response Variable and an Explanatory Variable, Journal of Statistical Theory and Practice, 8:1, 126-137, DOI: 10.1080/15598608.2013.823581

To link to this article: http://dx.doi.org/10.1080/15598608.2013.823581


Journal of Statistical Theory and Practice, 8:126–137, 2014
Copyright © Grace Scientific Publishing, LLC
ISSN: 1559-8608 print / 1559-8616 online
DOI: 10.1080/15598608.2013.823581

Heuristically Deciding Between Normal and Skew Normal Distributions for Describing the Data on a Response Variable and an Explanatory Variable

SUBIR GHOSH AND DEBARSHI DEY

Department of Statistics, University of California, Riverside, California, USA

Normal and skew normal distributions of a response variable Y for a given value x of an explanatory variable X are considered when the means of the distributions are linear functions of x. Deciding between these distributions for describing data is possible with the shape parameter of the skew normal distribution. The shape parameter can be either positive or negative. When the shape parameter is zero, a skew normal distribution becomes a normal distribution. Larger magnitude of the shape parameter provides a better recognition of the distribution for describing the data. It is therefore important to estimate the shape parameter of the skew normal distribution along with the location and dispersion parameters. A linear approximation of the ratio of the standard normal density and distribution functions in the presence of the shape parameter of skew normal distribution is used for this purpose. A heuristic method is proposed to determine the sign and estimate the magnitude of shape parameter, and to estimate the location parameters: intercept and slope, and the dispersion parameter based on this linear approximation. Simulation studies for performance evaluation of the proposed heuristic method are presented.

AMS Subject Classifications 2010: 62J05; 62J86; 62F10.

Keywords: Likelihood; Linear approximation; Maximum likelihood; Mill’s ratio; Model selection; Normal distribution; Shape parameter; Skew normal distribution.

1. Introduction

Data do not often arise in practice exactly from a normal distribution or even a symmetric distribution. Many univariate distributions are known in the literature for describing the skewness present in the data (Arnold, Beaver, Groeneveld, and Meeker 1993; Arnold and Beaver 2000; Azzalini 1985, 1986; Genton 2004; Ma and Genton 2004; Mudholkar and Hutson 2000; Rao 1985). The most fundamental skew normal distribution (Azzalini 1985, 1986) is compared in this article against the normal distribution for describing the data. A shape parameter plays the key role in deciding between the skew normal and normal distributions.

Received 7 January 2013; accepted 6 July 2013.
Address correspondence to: Professor Subir Ghosh, Department of Statistics, University of California, Riverside, CA 92521-0138, USA. Email: [email protected]


Consider a random variable Y for a given value x of a random variable X and two models:

$$M1:\ Y \text{ is normally distributed as } N(\gamma_0^{M1} + \gamma_1^{M1} x,\ \sigma_{M1}), \tag{1}$$

$$M2:\ Y \text{ is skew normally distributed as } SN(\gamma_0^{M2} + \gamma_1^{M2} x,\ \sigma_{M2},\ \lambda), \tag{2}$$

where $\gamma_0^{Mi}$, $\gamma_1^{Mi}$, $\sigma_{Mi}$, i = 1, 2, and λ are unknown parameters. The location parameters $\gamma_0^{Mi}$ and $\gamma_1^{Mi}$, i = 1, 2, are the intercept and slope of the line describing the dependence of the mean of Y on x. The dispersion parameters $\sigma_{Mi}$, i = 1, 2, are > 0, and the shape parameter λ is a real number that can be either positive or negative. When λ = 0, the two models M1 and M2 become identical in the sense that $\gamma_0^{M1} = \gamma_0^{M2}$, $\gamma_1^{M1} = \gamma_1^{M2}$, and $\sigma_{M1} = \sigma_{M2}$.

The probability density function of Y for M2 is expressed as

$$f(y; \lambda, \gamma_0^{M2}, \gamma_1^{M2}, \sigma_{M2}) = \frac{2}{\sigma_{M2}}\,\phi\!\left(\frac{y - \gamma_0^{M2} - \gamma_1^{M2}x}{\sigma_{M2}}\right)\Phi\!\left(\lambda\,\frac{y - \gamma_0^{M2} - \gamma_1^{M2}x}{\sigma_{M2}}\right), \tag{3}$$

where φ is the standard normal N(0, 1) density function and Φ is the standard normal N(0, 1) distribution function. For a real number z, define

$$R_\lambda(z) = \frac{\phi(\lambda z)}{\Phi(\lambda z)}. \tag{4}$$

Clearly, $R_\lambda(0) = \sqrt{2/\pi} = 0.7978846$ and $R_\lambda(z) = R_{-\lambda}(-z)$. The function $R_{-\lambda}(z) = R_\lambda(-z)$ is known in the statistics and econometrics literature as the inverse of Mill’s ratio (Mills 1926) and is also known in reliability analysis as well as in survival analysis as the hazard function for the special case of the standard normal distribution.
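The two properties of $R_\lambda(z)$ stated above are easy to check numerically. The following is a minimal sketch using only the Python standard library; the function name `R` is chosen here for illustration and does not come from the article.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal N(0, 1)

def R(lam: float, z: float) -> float:
    """R_lambda(z) = phi(lambda * z) / Phi(lambda * z), as in Eq. (4)."""
    u = lam * z
    return _STD.pdf(u) / _STD.cdf(u)

# R_lambda(0) does not depend on lambda and equals sqrt(2 / pi).
print(R(1.5, 0.0), math.sqrt(2.0 / math.pi))

# Symmetry: R_lambda(z) = R_{-lambda}(-z).
print(R(2.0, 0.7), R(-2.0, -0.7))
```

For large negative values of λz the denominator underflows, so a production implementation would work with the logarithm of Φ instead of this direct ratio.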

The data consist of n independent observations $(y_i, x_i)$, i = 1, ..., n, from either M1 or M2. For deciding between the models M1 and M2 based on the data, the estimation of λ and drawing other inferences on it are fundamental tasks that need immediate attention. The log-likelihood for the skew normal model is

$$l_{M2} = n\log 2 - n\log\sigma_{M2} + \sum_{i=1}^{n}\log\phi\!\left(\frac{y_i - \gamma_0^{M2} - \gamma_1^{M2}x_i}{\sigma_{M2}}\right) + \sum_{i=1}^{n}\log\Phi\!\left(\lambda\,\frac{y_i - \gamma_0^{M2} - \gamma_1^{M2}x_i}{\sigma_{M2}}\right).$$

The log-likelihood for the normal model is

$$l_{M1} = -n\log\sigma_{M1} + \sum_{i=1}^{n}\log\phi\!\left(\frac{y_i - \gamma_0^{M1} - \gamma_1^{M1}x_i}{\sigma_{M1}}\right).$$
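As a concrete reading of the two log-likelihoods, the sketch below evaluates them for given parameter values. It is an illustration only: the function names are hypothetical, and the tail expansion used for log Φ at very negative arguments is a numerical safeguard added here, not something taken from the article.

```python
import math

LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)

def log_phi(u: float) -> float:
    """Log of the standard normal density."""
    return -0.5 * u * u - LOG_SQRT_2PI

def log_Phi(u: float) -> float:
    """Log of the standard normal distribution function.

    For u <= -30 the direct formula would underflow, so a standard
    tail expansion Phi(u) ~ phi(u) / |u| * (1 - 1 / u^2) is used.
    """
    if u > -30.0:
        return math.log(0.5 * math.erfc(-u / math.sqrt(2.0)))
    return log_phi(u) - math.log(-u) + math.log1p(-1.0 / (u * u))

def loglik_normal(y, x, g0, g1, sigma):
    """l_M1 for intercept g0, slope g1, and dispersion sigma."""
    n = len(y)
    return -n * math.log(sigma) + sum(
        log_phi((yi - g0 - g1 * xi) / sigma) for yi, xi in zip(y, x))

def loglik_skew_normal(y, x, g0, g1, sigma, lam):
    """l_M2 for intercept g0, slope g1, dispersion sigma, and shape lam."""
    n = len(y)
    z = [(yi - g0 - g1 * xi) / sigma for yi, xi in zip(y, x)]
    return (n * math.log(2.0) - n * math.log(sigma)
            + sum(log_phi(zi) for zi in z)
            + sum(log_Phi(lam * zi) for zi in z))

# Hypothetical toy data, not from the article.
y = [1.0, 2.1, 2.9]
x = [0.0, 1.0, 2.0]
print(loglik_normal(y, x, 1.0, 1.0, 0.5))
print(loglik_skew_normal(y, x, 1.0, 1.0, 0.5, 0.0))  # same value when lam = 0
```

With λ = 0 the term n log 2 cancels against the n copies of log Φ(0) = log(1/2), so the two functions return the same value, in line with the remark that M1 and M2 coincide at λ = 0.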

The maximum likelihood estimating equations (MLEEs) for M2 are:

$$\sum_{i=1}^{n}(y_i - \gamma_0 - \gamma_1 x_i) = \lambda\sigma\sum_{i=1}^{n}R_\lambda\!\left(\frac{y_i - \gamma_0 - \gamma_1 x_i}{\sigma}\right), \tag{5}$$

$$\sum_{i=1}^{n}x_i(y_i - \gamma_0 - \gamma_1 x_i) = \lambda\sigma\sum_{i=1}^{n}x_i R_\lambda\!\left(\frac{y_i - \gamma_0 - \gamma_1 x_i}{\sigma}\right), \tag{6}$$

$$\sum_{i=1}^{n}(y_i - \gamma_0 - \gamma_1 x_i)\,R_\lambda\!\left(\frac{y_i - \gamma_0 - \gamma_1 x_i}{\sigma}\right) = 0, \tag{7}$$

$$\sum_{i=1}^{n}(y_i - \gamma_0 - \gamma_1 x_i)^2 = n\sigma^2. \tag{8}$$
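For readers who want the intermediate step, Eqs. (5)–(8) can be recovered by differentiating $l_{M2}$. The sketch below uses the shorthand $z_i = (y_i - \gamma_0 - \gamma_1 x_i)/\sigma$, which is introduced here for brevity and is not the article's notation, together with $\frac{d}{du}\log\Phi(u) = \phi(u)/\Phi(u)$:

$$
\begin{aligned}
\frac{\partial l_{M2}}{\partial \gamma_0} &= \frac{1}{\sigma}\sum_{i=1}^{n} z_i - \frac{\lambda}{\sigma}\sum_{i=1}^{n} R_\lambda(z_i) = 0,\\
\frac{\partial l_{M2}}{\partial \gamma_1} &= \frac{1}{\sigma}\sum_{i=1}^{n} x_i z_i - \frac{\lambda}{\sigma}\sum_{i=1}^{n} x_i R_\lambda(z_i) = 0,\\
\frac{\partial l_{M2}}{\partial \lambda} &= \sum_{i=1}^{n} z_i R_\lambda(z_i) = 0,\\
\frac{\partial l_{M2}}{\partial \sigma} &= -\frac{n}{\sigma} + \frac{1}{\sigma}\sum_{i=1}^{n} z_i^2 - \frac{\lambda}{\sigma}\sum_{i=1}^{n} z_i R_\lambda(z_i) = 0.
\end{aligned}
$$

Multiplying the first two lines by $\sigma^2$ gives Eqs. (5) and (6), multiplying the third by σ gives Eq. (7), and substituting Eq. (7) into the fourth line leaves $\sum_i z_i^2 = n$, which is Eq. (8).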

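The solutions of $\gamma_0$, $\gamma_1$, λ, and $\sigma^2$ for the MLEEs in Eqs. (5)–(8) are denoted by $\hat\gamma_0^{M2}$, $\hat\gamma_1^{M2}$, $\hat\lambda$, and $\hat\sigma_{M2}^2$, respectively. The solutions of the MLEEs in Eqs. (5)–(8) when λ = 0 are denoted by $\hat\gamma_0^{M1}$, $\hat\gamma_1^{M1}$, and $\hat\sigma_{M1}^2$, satisfying Eqs. (9)–(11):

$$\sum_{i=1}^{n}(y_i - \hat\gamma_0^{M1} - \hat\gamma_1^{M1}x_i) = 0, \tag{9}$$

$$\sum_{i=1}^{n}x_i(y_i - \hat\gamma_0^{M1} - \hat\gamma_1^{M1}x_i) = 0, \tag{10}$$

$$\sum_{i=1}^{n}(y_i - \hat\gamma_0^{M1} - \hat\gamma_1^{M1}x_i)^2 = n\hat\sigma_{M1}^2. \tag{11}$$

When λ ≠ 0, Eqs. (5), (6), (7), and (8) can be expressed as

$$\sum_{i=1}^{n}(y_i - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}x_i) = \hat\lambda\hat\sigma_{M2}\sum_{i=1}^{n}R_{\hat\lambda}\!\left(\frac{y_i - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}x_i}{\hat\sigma_{M2}}\right) = n(\hat\gamma_0^{M1} - \hat\gamma_0^{M2}) + (\hat\gamma_1^{M1} - \hat\gamma_1^{M2})\sum_{i=1}^{n}x_i, \tag{12}$$

$$\sum_{i=1}^{n}x_i(y_i - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}x_i) = \hat\lambda\hat\sigma_{M2}\sum_{i=1}^{n}x_i R_{\hat\lambda}\!\left(\frac{y_i - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}x_i}{\hat\sigma_{M2}}\right) = (\hat\gamma_0^{M1} - \hat\gamma_0^{M2})\sum_{i=1}^{n}x_i + (\hat\gamma_1^{M1} - \hat\gamma_1^{M2})\sum_{i=1}^{n}x_i^2, \tag{13}$$

$$\sum_{i=1}^{n}(y_i - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}x_i)\,R_{\hat\lambda}\!\left(\frac{y_i - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}x_i}{\hat\sigma_{M2}}\right) = 0, \tag{14}$$

$$\sum_{i=1}^{n}(y_i - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}x_i)^2 = n\hat\sigma_{M2}^2. \tag{15}$$

The second equalities in Eqs. (12) and (13) follow by writing each M2 residual as the corresponding M1 residual plus $(\hat\gamma_0^{M1} - \hat\gamma_0^{M2}) + (\hat\gamma_1^{M1} - \hat\gamma_1^{M2})x_i$ and applying Eqs. (9) and (10).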


The complexity of finding $\hat\gamma_0^{M2}$, $\hat\gamma_1^{M2}$, $\hat\lambda$, and $\hat\sigma_{M2}^2$ from Eqs. (5)–(8), or equivalently Eqs. (12)–(15), stems from the presence of $R_\lambda(z)$. Ghosh and Dey (2010) proposed a linear and a nonlinear approximation of $R_\lambda(z)$, and demonstrated the satisfactory performances of such approximations in the regions of interest. In the notation of this article, Ghosh and Dey (2010) considered the situation with only Y and without X, implying that $\gamma_1 = 0$ in Eqs. (1) and (2). The presence of X in addition to Y brings in additional complexity in implementing a quadratic approximation of $R_\lambda(z)$, and therefore only the use of the linear approximation is demonstrated here. The approximation of the tail probability of the scalar skew normal distribution is presented in Capitanio (2010). The issues with the maximum likelihood estimation have been pointed out in the stimulating papers of Sartori (2006), Lagos Álvarez and Jiménez Gamero (2012), Pewsey (2000), and Firth (1993) and are circumvented completely or close to it by the proposed method based on linear approximation.

In the next section, the linear approximation of Ghosh and Dey (2010) is given, and the solutions of $\gamma_0$, $\gamma_1$, σ, and λ in Eqs. (5)–(8) using this linear approximation involving an unknown constant α are denoted by $\hat\gamma_0^{M2}$, $\hat\gamma_1^{M2}$, $\hat\sigma_{M2}^2$, and $\hat\lambda$. The determined value of α is denoted by $\hat\alpha$. Section 2 presents Theorem 2.1, demonstrating the relations between $\hat\gamma_1^{M2}$ and $\hat\gamma_1^{M1}$, $\hat\gamma_0^{M2}$ and $\hat\gamma_0^{M1}$, $\hat\sigma_{M2}^2$ and $\hat\sigma_{M1}^2$, bounds on $\hat\alpha$ and $(\hat\alpha\hat\lambda)^2$, and a necessary and sufficient condition for determining the sign of $\hat\lambda$. A heuristic method is proposed in section 3 for determining the numerical values of $\hat\gamma_0^{M2}$, $\hat\gamma_1^{M2}$, $\hat\sigma_{M2}^2$, $\hat\alpha$, and $\hat\lambda$. Section 4 provides simulation studies for performance evaluation of the proposed heuristic method. Section 5 presents an application of the proposed method for deciding between normal and skew normal distributions with an illustrative example. Section 6 provides the conclusion of the article.

2. Linear Approximation of Rλ (z)

A linear approximation of $R_\lambda(z)$ for a given λ and −3 ≤ z ≤ 3 is given by

$$A_{\lambda,\alpha}(z) = 0.7978846 + \lambda\alpha z \approx R_\lambda(z). \tag{16}$$
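To get a feel for Eq. (16), the sketch below compares $A_{\lambda,\alpha}(z)$ with $R_\lambda(z)$ on a grid over [−3, 3]. In the article α is an unknown constant determined from the data; purely for illustration, this sketch assumes α = −2/π, the value for which the line is tangent to $R_\lambda$ at z = 0. The choice of λ = 0.5 and the grid spacing are also assumptions made here.

```python
import math
from statistics import NormalDist

STD = NormalDist()
C0 = math.sqrt(2.0 / math.pi)  # R_lambda(0) = 0.7978846...

def R(lam, z):
    """R_lambda(z) of Eq. (4)."""
    return STD.pdf(lam * z) / STD.cdf(lam * z)

def A(lam, alpha, z):
    """Linear approximation of Eq. (16)."""
    return C0 + lam * alpha * z

lam = 0.5
alpha = -2.0 / math.pi  # illustrative assumption: tangent slope of R at z = 0
grid = [i / 10.0 for i in range(-30, 31)]
errors = [abs(A(lam, alpha, z) - R(lam, z)) for z in grid]
print(f"max |A - R| on [-3, 3] for lambda = {lam}: {max(errors):.4f}")
```

The approximation is exact at z = 0 and degrades toward the ends of the interval, which is one reason the method treats α as a quantity to be determined and not as a fixed constant.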

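When z = 0, $A_{\lambda,\alpha}(0) = 0.7978846 = R_\lambda(0)$. In other words, $A_{\lambda,\alpha}(0)$ provides the exact value of $R_\lambda(0)$. Equations (12)–(15) using the preceding linear approximation become

$$\sum_{i=1}^{n}(y_i - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}x_i) = \hat\lambda\hat\sigma_{M2}\sum_{i=1}^{n}A_{\hat\lambda,\hat\alpha}\!\left(\frac{y_i - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}x_i}{\hat\sigma_{M2}}\right) = n(\hat\gamma_0^{M1} - \hat\gamma_0^{M2}) + (\hat\gamma_1^{M1} - \hat\gamma_1^{M2})\sum_{i=1}^{n}x_i, \tag{17}$$

$$\sum_{i=1}^{n}x_i(y_i - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}x_i) = \hat\lambda\hat\sigma_{M2}\sum_{i=1}^{n}x_i A_{\hat\lambda,\hat\alpha}\!\left(\frac{y_i - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}x_i}{\hat\sigma_{M2}}\right) = (\hat\gamma_0^{M1} - \hat\gamma_0^{M2})\sum_{i=1}^{n}x_i + (\hat\gamma_1^{M1} - \hat\gamma_1^{M2})\sum_{i=1}^{n}x_i^2, \tag{18}$$

$$\sum_{i=1}^{n}(y_i - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}x_i)\,A_{\hat\lambda,\hat\alpha}\!\left(\frac{y_i - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}x_i}{\hat\sigma_{M2}}\right) = 0, \tag{19}$$

$$\sum_{i=1}^{n}(y_i - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}x_i)^2 = n\hat\sigma_{M2}^2, \tag{20}$$

where $A_{\hat\lambda,\hat\alpha}(z) = 0.7978846 + \hat\lambda\hat\alpha z$. Denote

$$n\bar{x} = \sum_{i=1}^{n}x_i,\quad n\bar{y} = \sum_{i=1}^{n}y_i,\quad w_i = y_i - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}x_i,\quad \bar{w} = \bar{y} - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}\bar{x}. \tag{21}$$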

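It follows from Eqs. (17)–(20) that

$$(1 - \hat\alpha\hat\lambda^2)\sum_{i=1}^{n}w_i = 0.7978846\,n\hat\lambda\hat\sigma_{M2}, \tag{22}$$

$$(1 - \hat\alpha\hat\lambda^2)\sum_{i=1}^{n}x_i w_i = 0.7978846\,n\hat\lambda\hat\sigma_{M2}\bar{x}, \tag{23}$$

$$0.7978846\sum_{i=1}^{n}w_i = -\frac{\hat\alpha\hat\lambda}{\hat\sigma_{M2}}\sum_{i=1}^{n}w_i^2, \tag{24}$$

$$n\hat\sigma_{M2}^2 = \sum_{i=1}^{n}w_i^2. \tag{25}$$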
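The step from Eqs. (17) and (19) to Eqs. (22) and (24) is a direct substitution of the linear form of A. As a short sketch, with hats marking the solved values and $w_i$ as in Eq. (21):

$$
\sum_{i=1}^{n} w_i
= \hat\lambda\hat\sigma_{M2}\sum_{i=1}^{n}\left(0.7978846 + \frac{\hat\lambda\hat\alpha\,w_i}{\hat\sigma_{M2}}\right)
= 0.7978846\,n\hat\lambda\hat\sigma_{M2} + \hat\alpha\hat\lambda^{2}\sum_{i=1}^{n} w_i ,
$$

and moving the last term to the left-hand side gives Eq. (22). The same substitution with an extra factor $x_i$ inside the sum gives Eq. (23). For Eq. (19),

$$
0 = \sum_{i=1}^{n} w_i\left(0.7978846 + \frac{\hat\lambda\hat\alpha\,w_i}{\hat\sigma_{M2}}\right)
= 0.7978846\sum_{i=1}^{n} w_i + \frac{\hat\alpha\hat\lambda}{\hat\sigma_{M2}}\sum_{i=1}^{n} w_i^{2},
$$

which rearranges to Eq. (24).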

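The relations between $\hat\gamma_1^{M2}$ and $\hat\gamma_1^{M1}$, $\hat\gamma_0^{M2}$ and $\hat\gamma_0^{M1}$, and $\hat\sigma_{M2}^2$ and $\hat\sigma_{M1}^2$ are established in Theorem 2.1. It is also established that the sign of $\hat\lambda$ is the same as the sign of $\bar{w}$. The proposed method of determining numerical values of $\hat\gamma_0^{M2}$, $\hat\gamma_1^{M2}$, $\hat\sigma_{M2}^2$, $\hat\alpha$, and $\hat\lambda$ in section 3 is based on the results in Theorem 2.1.

Theorem 2.1. The results that follow hold:

a. $0.7978846\,\bar{w} = -\hat\alpha\hat\lambda\hat\sigma_{M2}$.

b. $(\hat\alpha - (\hat\alpha\hat\lambda)^2)\,\bar{w} = 0.7978846\,\hat\alpha\hat\lambda\hat\sigma_{M2} = -(0.7978846)^2\,\bar{w}$.

c. $\hat\gamma_1^{M2} = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \hat\gamma_1^{M1}$, when $(1 - \hat\alpha\hat\lambda^2) \neq 0$.

d. $\hat\gamma_0^{M2} = \bar{y} - \hat\gamma_1^{M2}\bar{x} + (0.7978846)^{-1}\hat\alpha\hat\lambda\hat\sigma_{M2} = \hat\gamma_0^{M1} + (0.7978846)^{-1}\hat\alpha\hat\lambda\hat\sigma_{M2}$, when $(1 - \hat\alpha\hat\lambda^2) \neq 0$.

e. $\hat\sigma_{M2}^2 - \hat\sigma_{M1}^2 = \bar{w}^2$, when $(1 - \hat\alpha\hat\lambda^2) \neq 0$.

f. $-(0.7978846)^2 \le \hat\alpha \le 0 \le (\hat\alpha\hat\lambda)^2 \le (0.7978846)^2$, when $\bar{w} \neq 0$ and $(1 - \hat\alpha\hat\lambda^2) \neq 0$.

g. $\hat\lambda > 0$ if and only if $\bar{w} > 0$, and $\hat\lambda < 0$ if and only if $\bar{w} < 0$.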


Proof. Part (a) follows by combining Eqs. (24) and (25) and dividing both sides by n. Part (b) is obtained by substituting in (a) the expression of $\bar{w}$ in Eq. (21). Part (c) follows by multiplying both sides of Eq. (22) by $\bar{x}$ and then subtracting it from Eq. (23). The first equality in (d) follows by multiplying both sides of Eq. (22) by $n^{-1}\hat\alpha$ and combining with (a). The second equality in (d) can be seen by noting $\hat\gamma_0^{M1} = \bar{y} - \hat\gamma_1^{M1}\bar{x}$ and then using (c). Notice that $(y_i - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}x_i) = (y_i - \hat\gamma_0^{M1} - \hat\gamma_1^{M1}x_i) + \bar{w}$. Squaring both sides and summing over i = 1, ..., n, and moreover observing that the cross-product term is zero because $\sum_{i=1}^{n}(y_i - \hat\gamma_0^{M1} - \hat\gamma_1^{M1}x_i) = 0$, the result in (e) can be obtained. When $\bar{w} \neq 0$, it follows from (b) that $\hat\alpha - (\hat\alpha\hat\lambda)^2 = -(0.7978846)^2$ and hence $-(0.7978846)^2 \le \hat\alpha \le (\hat\alpha\hat\lambda)^2$. From (e), $\bar{w}^2 \le \hat\sigma_{M2}^2$ and hence from (a) $(\hat\alpha\hat\lambda)^2 \le (0.7978846)^2$. Consequently, $\hat\alpha = (\hat\alpha\hat\lambda)^2 - (0.7978846)^2 \le 0$. Combining all these inequalities, the proof of (f) is complete. Since $\hat\sigma_{M2} > 0$ and $\hat\alpha \le 0$ from (f), the proof of (g) is clear from (a). □
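The relations in Theorem 2.1 can also be checked numerically. The sketch below builds the M2 quantities from the normal-model fit and a chosen value of b, using the parametrization that section 3 derives in Eqs. (27) and (29), and then evaluates the residuals of parts (a) and (e) and of Eqs. (22)–(25). The toy data and the value b = 0.3 are hypothetical, chosen only to exercise the algebra.

```python
import math

C0 = 0.7978846  # R_lambda(0), the constant used throughout the article

def ols(x, y):
    """Solutions of Eqs. (9)-(11): intercept, slope, and variance under M1."""
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    sxx = sum((xi - xb) ** 2 for xi in x)
    g1 = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y)) / sxx
    g0 = yb - g1 * xb
    s2 = sum((yi - g0 - g1 * xi) ** 2 for xi, yi in zip(x, y)) / n
    return g0, g1, s2

# Hypothetical toy data, not taken from the article.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.4, 3.8, 5.3, 5.7]
n = len(x)
g0_m1, g1_m1, s2_m1 = ols(x, y)

b = 0.3                                  # any value in [0, 1)
alpha = -C0 ** 2 * (1.0 - b)             # Eq. (27)
lam = math.sqrt(b) / (C0 * (1.0 - b))    # positive root, Eq. (29)
sigma = math.sqrt(s2_m1 / (1.0 - b))     # Eq. (27)
g1 = g1_m1                               # Theorem 2.1(c)
g0 = g0_m1 + alpha * lam * sigma / C0    # Theorem 2.1(d)

w = [yi - g0 - g1 * xi for xi, yi in zip(x, y)]
wbar = sum(w) / n
xbar = sum(x) / n
sw, sw2 = sum(w), sum(wi ** 2 for wi in w)
sxw = sum(xi * wi for xi, wi in zip(x, w))

# Each entry is "left side minus right side" and should be zero up to rounding.
checks = {
    "(a)": C0 * wbar + alpha * lam * sigma,
    "(e)": sigma ** 2 - s2_m1 - wbar ** 2,
    "(22)": (1 - alpha * lam ** 2) * sw - C0 * n * lam * sigma,
    "(23)": (1 - alpha * lam ** 2) * sxw - C0 * n * lam * sigma * xbar,
    "(24)": C0 * sw + alpha * lam / sigma * sw2,
    "(25)": n * sigma ** 2 - sw2,
}
for name, value in checks.items():
    print(name, f"{value:.2e}")
```

All six residuals vanish for any b in [0, 1), which illustrates that Eqs. (17)–(20) leave one degree of freedom; section 3 resolves it by a grid search over b.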

3. Determining the Numerical Values of $\hat\gamma_0^{M2}$, $\hat\gamma_1^{M2}$, $\hat\sigma_{M2}^2$, $\hat\alpha$, and $\hat\lambda$

Define

$$b = (0.7978846)^{-2}(\hat\alpha\hat\lambda)^2. \tag{26}$$

A numerical method of finding an optimum value $b^*$ of b is given next by determining $\hat\lambda$. The numerical values of $\hat\gamma_0^{M2}$, $\hat\gamma_1^{M2}$, $\hat\sigma_{M2}^2$, $\hat\alpha$, and $\hat\lambda$ are then obtained by the method described here.

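Step 1: Determining $\hat\lambda$. Clearly from parts (f), (b), (e), and (a) of Theorem 2.1, when $\bar{w} \neq 0$ and $(1 - \hat\alpha\hat\lambda^2) \neq 0$,

$$
\begin{aligned}
\hat\alpha &= -(0.7978846)^2 + (\hat\alpha\hat\lambda)^2 = -(0.7978846)^2(1 - b),\\
\bar{w}^2 &= (0.7978846)^{-2}(\hat\alpha\hat\lambda)^2\hat\sigma_{M2}^2 = b\,\hat\sigma_{M2}^2,\\
\hat\sigma_{M1}^2 &= \hat\sigma_{M2}^2 - \bar{w}^2 = (1 - b)\,\hat\sigma_{M2}^2, \qquad 0 \le b < 1.
\end{aligned}
\tag{27}
$$

From Eq. (27) the expression of $\hat\lambda^2$ can be obtained as

$$\hat\lambda^2 = \frac{b}{(0.7978846)^{-2}\hat\alpha^2} = \frac{b}{(0.7978846)^2(1 - b)^2}. \tag{28}$$

Denote two possible values of $\hat\lambda$ as

$$\hat\lambda_+ = \frac{\sqrt{b}}{0.7978846\,(1 - b)}, \qquad \hat\lambda_- = -\hat\lambda_+. \tag{29}$$

The challenge is now to determine the sign of $\hat\lambda$. If the determined sign of $\hat\lambda$ is positive, then $\hat\lambda$ becomes $\hat\lambda_+$, and if the determined sign of $\hat\lambda$ is negative, then $\hat\lambda$ becomes $\hat\lambda_-$.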
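Step 1 amounts to a closed-form map from b to the remaining quantities. The sketch below implements Eqs. (27) and (29) in a single hypothetical helper, `shape_from_b`, and evaluates it at b = 0.43 with the normal-model dispersion 0.03158322, both of which are values reported for the Table 3 example in section 5.

```python
import math

C0 = 0.7978846  # R_lambda(0), as used in Eq. (16)

def shape_from_b(b: float, sigma2_m1: float):
    """Return (alpha, lambda_plus, lambda_minus, sigma2_m2) for 0 <= b < 1."""
    if not 0.0 <= b < 1.0:
        raise ValueError("b must lie in [0, 1)")
    alpha = -C0 ** 2 * (1.0 - b)                  # Eq. (27)
    lam_plus = math.sqrt(b) / (C0 * (1.0 - b))    # Eq. (29)
    sigma2_m2 = sigma2_m1 / (1.0 - b)             # Eq. (27)
    return alpha, lam_plus, -lam_plus, sigma2_m2

alpha, lam_p, lam_m, s2_m2 = shape_from_b(0.43, 0.03158322 ** 2)
print(alpha, lam_p, lam_m, math.sqrt(s2_m2))
```

The printed α and λ₊ agree with the values −0.36287 and 1.44185 reported in section 5 to the precision given there, and the printed dispersion is close to the reported 0.04184.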

Step 2: A Method of Determining the Sign of $\hat\lambda$. To determine the sign of $\hat\lambda$, it is necessary to choose between $\hat\lambda_+$ and $\hat\lambda_-$. The question is now: How could one choose between $\hat\lambda_+$ and $\hat\lambda_-$? Denote $\hat\gamma_0^{M2}$ given in part (d) of Theorem 2.1 by $\hat\gamma_{0+}^{M2}$ for $\hat\lambda = \hat\lambda_+$ and $\hat\gamma_{0-}^{M2}$ for $\hat\lambda = \hat\lambda_-$; the estimated log-likelihood for the model M2 by $\hat{l}_{M2+}$ for $\hat\lambda = \hat\lambda_+$, $\hat\gamma_{0+}^{M2}$, $\hat\gamma_1^{M2}$, $\hat\sigma_{M2}^2$, and $\hat\alpha$; and $\hat{l}_{M2-}$ for $\hat\lambda = \hat\lambda_-$, $\hat\gamma_{0-}^{M2}$, $\hat\gamma_1^{M2}$, $\hat\sigma_{M2}^2$, and $\hat\alpha$. The $\hat{l}_{M2+}$ and $\hat{l}_{M2-}$ are given by

$$
\begin{aligned}
\hat{l}_{M2+} &= n\log 2 - n\log\hat\sigma_{M2} + \sum_{i=1}^{n}\log\phi\!\left(\frac{y_i - \hat\gamma_{0+}^{M2} - \hat\gamma_1^{M2}x_i}{\hat\sigma_{M2}}\right) + \sum_{i=1}^{n}\log\Phi\!\left(\hat\lambda_+\,\frac{y_i - \hat\gamma_{0+}^{M2} - \hat\gamma_1^{M2}x_i}{\hat\sigma_{M2}}\right),\\
\hat{l}_{M2-} &= n\log 2 - n\log\hat\sigma_{M2} + \sum_{i=1}^{n}\log\phi\!\left(\frac{y_i - \hat\gamma_{0-}^{M2} - \hat\gamma_1^{M2}x_i}{\hat\sigma_{M2}}\right) + \sum_{i=1}^{n}\log\Phi\!\left(\hat\lambda_-\,\frac{y_i - \hat\gamma_{0-}^{M2} - \hat\gamma_1^{M2}x_i}{\hat\sigma_{M2}}\right).
\end{aligned}
\tag{30}
$$

From the data $(y_i, x_i)$, i = 1, ..., n, the value of $\hat\gamma_1^{M2} = \hat\gamma_1^{M1}$ can be obtained from part (c) of Theorem 2.1. For a specified value of b, the values of $(\hat\alpha\hat\lambda)^2$ from Eq. (26), $\hat\alpha$ from Eq. (27), $\hat\lambda^2$ from Eq. (28), $\hat\lambda_+$ and $\hat\lambda_-$ from Eq. (29), $\hat\sigma_{M1}^2$ from Eq. (11), $\hat\sigma_{M2}^2$ and $\bar{w}^2$ from Eq. (27), and $\hat\gamma_{0+}^{M2}$ and $\hat\gamma_{0-}^{M2}$ from part (d) of Theorem 2.1 can be found. The values of $\hat{l}_{M2+}$ and $\hat{l}_{M2-}$ are then obtained from Eq. (30).

Step 2.1: The Sign of $\hat\lambda$. Consider now 100 values of b: 0.00, 0.01, ..., 0.99 within its range [0, 1) given in Eq. (27). For each of the 100 values of b, calculate the numerical value of the difference $\hat{l}_{M2+} - \hat{l}_{M2-}$. Determine the number of values of b giving a positive difference. If this number turns out to be more than 50, the sign of $\hat\lambda$ is determined to be positive, and if this number is less than 50, the sign is determined to be negative.

Step 3: Determining an Optimum Value of b. Suppose the determined sign of $\hat\lambda$ is positive. For each of the 100 values of b, 0.00, 0.01, ..., 0.99, calculate the numerical values of $\hat\gamma_{0+}^{M2}$, $\hat\gamma_1^{M2}$, $\hat\sigma_{M2}^2$, $\hat\alpha$, and $\hat\lambda_+$ and then the numerical value of $\hat{l}_{M2+}$ from Eq. (30). Choose the optimum value $b^*$ as the value out of the 100 values of b that gives the largest numerical value of $\hat{l}_{M2+}$. Choosing $b^*$ is similar when the determined sign of $\hat\lambda$ is negative.

Step 4: Determining $\hat\gamma_0^{M2}$, $\hat\gamma_1^{M2}$, $\hat\sigma_{M2}^2$, $\hat\alpha$, and $\hat\lambda$. For the determined optimum value $b^*$ of b from Step 3, the values of $\hat\gamma_0^{M2}$, $\hat\gamma_1^{M2}$, $\hat\sigma_{M2}^2$, $\hat\alpha$, and $\hat\lambda$ are then obtained.

4. Simulation Studies for Performance Evaluation

Simulation studies are done to evaluate the performance of the proposed method. Generate 20 independent observations $z_i$, i = 1, ..., 20, from SN(0, 1, λ) for a specified value of λ. Take 20 values of $x_i$, i = 1, ..., 20, as

100, 96, 88, 100, 100, 96, 80, 68, 92, 96, 88, 92, 68, 84, 84, 88, 72, 88, 72, 88.

For the specified values of $\gamma_0^{M2}$, $\gamma_1^{M2}$, and $\sigma_{M2}$, generate 20 observations $y_i$, i = 1, ..., 20, from

$$y_i = \gamma_0 + \gamma_1 x_i + \sigma z_i.$$
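The article does not say how the SN(0, 1, λ) variates were generated. One standard option, assumed here, is the representation $z = \delta|u_0| + \sqrt{1 - \delta^2}\,u_1$ with $\delta = \lambda/\sqrt{1 + \lambda^2}$ and independent standard normal $u_0$, $u_1$. The sketch below uses it to produce one simulated data set of the form described above; the function names and the seed are illustrative.

```python
import math
import random

def rskewnormal(lam, rng):
    """One draw from SN(0, 1, lam) via delta * |u0| + sqrt(1 - delta^2) * u1."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    u0, u1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return delta * abs(u0) + math.sqrt(1.0 - delta * delta) * u1

# The 20 values of x listed in the article.
X_VALUES = [100, 96, 88, 100, 100, 96, 80, 68, 92, 96,
            88, 92, 68, 84, 84, 88, 72, 88, 72, 88]

def simulate(g0, g1, sigma, lam, x=X_VALUES, seed=0):
    """Return a list of (y_i, x_i) pairs with y_i = g0 + g1 * x_i + sigma * z_i."""
    rng = random.Random(seed)
    return [(g0 + g1 * xi + sigma * rskewnormal(lam, rng), xi) for xi in x]

data = simulate(g0=-30.0, g1=5.0, sigma=0.05, lam=1.0)
for yi, xi in data[:3]:
    print(f"({yi:.4f}, {xi})")
```

Repeating `simulate` with different seeds and applying the heuristic to each data set is one way to approximate the proportions reported in the simulation study, although the exact figures depend on the random number generator.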


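Table 1. The proportion P of times the sign of λ is correctly determined for $\gamma_0^{M2} = -30$, $\gamma_1^{M2} = 5$, $\sigma_{M2} = 0.05$, λ = 0.5, 1, 2, 3 [a], and n = 20, 50, 100, 500

| λ | n | P |
|---|---|---|
| 0.5 | 20 | 0.5101 |
| 0.5 | 50 | 0.5312 |
| 0.5 | 100 | 0.5346 |
| 0.5 | 500 | 0.5891 |
| 1 | 20 | 0.5790 |
| 1 | 50 | 0.6369 |
| 1 | 100 | 0.7005 |
| 1 | 500 | 0.8832 |
| 2 | 20 | 0.7460 |
| 2 | 50 | 0.8842 |
| 2 | 100 | 0.9564 |
| 2 | 500 | 0.9999 |
| 3 | 20 | 0.8445 |
| 3 | 50 | 0.9653 |
| 3 | 100 | 0.9964 |
| 3 | 500 | 1.0000 |

[a] The results for negative values of λ are similar to those for their corresponding positive values.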

The data consist of 20 independent observations $(y_i, x_i)$, i = 1, ..., 20, from M2 for the specified values of $\gamma_0^{M2}$, $\gamma_1^{M2}$, $\sigma_{M2}$, and λ. The proposed method is used to determine the sign of λ. Repeating this 100,000 times by generating 100,000 sets of 20 independent observations $z_i$, i = 1, ..., 20, from SN(0, 1, λ), 100,000 data sets consisting of 20 independent observations $(y_i, x_i)$, i = 1, ..., 20, from M2 are obtained for the same specified values of $\gamma_0^{M2}$, $\gamma_1^{M2}$, $\sigma_{M2}$, and λ. The 100,000 data sets are also obtained for the numbers of observations 50, 100, and 500 for the same specified values of $\gamma_0^{M2}$, $\gamma_1^{M2}$, $\sigma_{M2}$, and λ. Different values of $\gamma_0^{M2}$, $\gamma_1^{M2}$, $\sigma_{M2}$, and λ are also considered.

Taking $\gamma_0^{M2} = -30$, $\gamma_1^{M2} = 5$, and $\sigma_{M2} = 0.05$, Table 1 presents the proportion P of times the sign of λ is correctly determined. As λ increases to 3 or −3, the proportion P becomes closer to 1 even with a sample of size 50. As the number of observations n increases to 500, the proportion P gets closer to 1 when λ is more than 1. Detecting correctly the sign of a small value of λ is naturally very challenging. The proposed method identifies the sign correctly more than 50% of the time even for a value of λ like 0.5 or −0.5.

Table 2 presents the descriptive statistics for the sampling distributions of $\hat\gamma_0^{M2}$, $\hat\gamma_1^{M2}$, $\hat\sigma_{M2}$, $\hat\lambda$, and $\hat\alpha$ generated from 100,000 values for each of them when $\gamma_0^{M2} = -30$, $\gamma_1^{M2} = 5$, $\sigma_{M2} = 0.05$, λ = 0.5, 3, and n = 20, 50, 100, 500. The difference between the mean/median of the sampling distribution of $\hat\gamma_0^{M2}$ and the true value −30 of $\gamma_0$ is very small for all values of λ and n considered. The same is true for the sampling distribution of $\hat\gamma_1^{M2}$. Both the mean and the median of the sampling distribution of $\hat\sigma_{M2}$ are sufficiently close to the true value 0.05 of σ, and the closeness increases as the value of n becomes larger for all values of λ considered. The median of the sampling distribution of $\hat\lambda$ is much closer to the true value of λ than the mean when the true value of λ is very small, but the differences in the closeness become small as the true value of λ increases and the value of n becomes large. These interesting observations are also present for a large number of simulated data sets generated with the other values of $\gamma_0^{M2}$, $\gamma_1^{M2}$, $\sigma_{M2}$, and λ, not reported because of their similarities with Tables 1 and 2 presented in this article.

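Table 2. Descriptive statistics for the sampling distributions of $\hat\gamma_0^{M2}$, $\hat\gamma_1^{M2}$, $\hat\sigma_{M2}$, $\hat\lambda$, and $\hat\alpha$ when $\gamma_0^{M2} = -30$, $\gamma_1^{M2} = 5$, $\sigma_{M2} = 0.05$, λ = 0.5, 3, and n = 20, 50, 100, 500

| λ | Estimate | n | Q1 | Median | Mean | Q3 | SD |
|---|---|---|---|---|---|---|---|
| 0.5 | $\hat\gamma_0^{M2}$ | 20 | −30.06 | −29.98 | −29.98 | −29.91 | 0.1066 |
| 0.5 | $\hat\gamma_0^{M2}$ | 50 | −30.03 | −29.98 | −29.98 | −29.94 | 0.0724 |
| 0.5 | $\hat\gamma_0^{M2}$ | 100 | −30.02 | −29.99 | −29.99 | −29.95 | 0.0550 |
| 0.5 | $\hat\gamma_0^{M2}$ | 500 | −30.01 | −29.99 | −29.99 | −29.96 | 0.0310 |
| 0.5 | $\hat\gamma_1^{M2}$ | 20 | 4.999 | 5 | 5 | 5.001 | 0.0010 |
| 0.5 | $\hat\gamma_1^{M2}$ | 50 | 5 | 5 | 5 | 5 | 0.0007 |
| 0.5 | $\hat\gamma_1^{M2}$ | 100 | 5 | 5 | 5 | 5 | 0.0005 |
| 0.5 | $\hat\gamma_1^{M2}$ | 500 | 5 | 5 | 5 | 5 | 0.0002 |
| 0.5 | $\hat\sigma_{M2}$ | 20 | 0.0510 | 0.0596 | 0.0601 | 0.0690 | 0.0127 |
| 0.5 | $\hat\sigma_{M2}$ | 50 | 0.0516 | 0.0585 | 0.0586 | 0.0652 | 0.0093 |
| 0.5 | $\hat\sigma_{M2}$ | 100 | 0.0509 | 0.0565 | 0.0567 | 0.0621 | 0.0075 |
| 0.5 | $\hat\sigma_{M2}$ | 500 | 0.0493 | 0.0522 | 0.0527 | 0.0556 | 0.0043 |
| 0.5 | $\hat\lambda$ | 20 | −1.827 | 0.1266 | 0.0434 | 1.883 | 1.7606 |
| 0.5 | $\hat\lambda$ | 50 | −1.321 | 0.2611 | 0.0956 | 1.485 | 1.4273 |
| 0.5 | $\hat\lambda$ | 100 | −1.011 | 0.2950 | 0.1025 | 1.210 | 1.1998 |
| 0.5 | $\hat\lambda$ | 500 | −0.571 | 0.3853 | 0.1631 | 0.808 | 0.7849 |
| 3 | $\hat\gamma_0^{M2}$ | 20 | −30.04 | −29.99 | −29.99 | −29.94 | 0.0716 |
| 3 | $\hat\gamma_0^{M2}$ | 50 | −30.03 | −30 | −29.99 | −29.96 | 0.0452 |
| 3 | $\hat\gamma_0^{M2}$ | 100 | −30.02 | −30 | −30 | −29.98 | 0.0310 |
| 3 | $\hat\gamma_0^{M2}$ | 500 | −30.01 | −30 | −30 | −29.99 | 0.0137 |
| 3 | $\hat\gamma_1^{M2}$ | 20 | 5 | 5 | 5 | 5 | 0.0007 |
| 3 | $\hat\gamma_1^{M2}$ | 50 | 5 | 5 | 5 | 5 | 0.0005 |
| 3 | $\hat\gamma_1^{M2}$ | 100 | 5 | 5 | 5 | 5 | 0.0003 |
| 3 | $\hat\gamma_1^{M2}$ | 500 | 5 | 5 | 5 | 5 | 0.0001 |
| 3 | $\hat\sigma_{M2}$ | 20 | 0.0367 | 0.0438 | 0.0440 | 0.0511 | 0.0104 |
| 3 | $\hat\sigma_{M2}$ | 50 | 0.0416 | 0.0464 | 0.0461 | 0.0509 | 0.0072 |
| 3 | $\hat\sigma_{M2}$ | 100 | 0.0443 | 0.0474 | 0.0473 | 0.0507 | 0.0050 |
| 3 | $\hat\sigma_{M2}$ | 500 | 0.0467 | 0.0483 | 0.0483 | 0.0497 | 0.0021 |
| 3 | $\hat\lambda$ | 20 | 1.283 | 2.066 | 1.427 | 2.273 | 1.3394 |
| 3 | $\hat\lambda$ | 50 | 1.772 | 2.066 | 1.835 | 2.201 | 0.6619 |
| 3 | $\hat\lambda$ | 100 | 1.883 | 2.066 | 1.972 | 2.132 | 0.3172 |
| 3 | $\hat\lambda$ | 500 | 2.002 | 2.066 | 2.035 | 2.066 | 0.0883 |

5. Deciding Between Normal and Skew Normal Based on a Simulated Data

When the shape parameter λ of the skew normal distribution in Eq. (2) is equal to zero, it becomes the normal distribution in Eq. (1). Consequently, with a closer magnitude of $\hat\lambda$ to zero, the data provide more evidence in support of the normal distribution. The further the magnitude of $\hat\lambda$ is from zero, the more the data give evidence in support of the skew normal distribution. The estimated log-likelihood for the skew normal model M2 is given by

$$\hat{l}_{M2} = n\log 2 - n\log\hat\sigma_{M2} + \sum_{i=1}^{n}\log\phi\!\left(\frac{y_i - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}x_i}{\hat\sigma_{M2}}\right) + \sum_{i=1}^{n}\log\Phi\!\left(\hat\lambda\,\frac{y_i - \hat\gamma_0^{M2} - \hat\gamma_1^{M2}x_i}{\hat\sigma_{M2}}\right). \tag{31}$$

The estimated log-likelihood for the normal model M1 is expressed as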

$$\hat{l}_{M1} = -n\log\hat\sigma_{M1} + \sum_{i=1}^{n}\log\phi\!\left(\frac{y_i - \hat\gamma_0^{M1} - \hat\gamma_1^{M1}x_i}{\hat\sigma_{M1}}\right). \tag{32}$$

Table 3. The $(y_i, x_i)$, i = 1, ..., 20, are independent observations generated from SN(−30 + 5x, 0.05, 1)

(470.0698, 100) (450.0979, 96) (409.9942, 88) (470.0207, 100) (470.0375, 100)
(450.0491, 96) (370.0371, 80) (310.0119, 68) (430.0793, 92) (450.0547, 96)
(409.9933, 88) (430.0319, 92) (310.0990, 68) (390.0083, 84) (389.9928, 84)
(410.0700, 88) (330.0358, 72) (410.0679, 88) (330.0494, 72) (410.0329, 88)

Table 3 presents a simulated data set $(y_i, x_i)$, i = 1, ..., 20, generated independently from SN(−30 + 5x, 0.05, 1) with $\gamma_0^{M2} = -30$, $\gamma_1^{M2} = 5$, $\sigma_{M2} = 0.05$, and λ = 1 in the skew normal model M2. The values of sample size n = 20 and shape parameter λ = 1 are small. We now fit the model M2 to the data in Table 3 by the proposed heuristic method, treating the parameters $\gamma_0^{M2}$, $\gamma_1^{M2}$, $\sigma_{M2}$, and λ as unknown. The proposed method of estimating the parameters in the model M2 works best when the sample size n and the magnitude of shape parameter λ are large. Thus, a spectacular performance of the proposed method is not expected in our fitting of the model M2 to the data in Table 3. We find that the numerical values of the difference $\hat{l}_{M2+} - \hat{l}_{M2-}$ are positive for all the 100 values of b: 0.00, 0.01, ..., 0.99. Hence, the sign of $\hat\lambda$ is positive with overwhelming evidence from the data. For the 100 values of b, 0.00, 0.01, ..., 0.99, the numerical values of $\hat\gamma_{0+}^{M2}$, $\hat\gamma_1^{M2}$, $\hat\sigma_{M2}^2$, $\hat\alpha$, $\hat\lambda_+$, and $\hat{l}_{M2+}$ are obtained. We then find that the value of b as $b^* = 0.43$ gives the largest numerical value of $\hat{l}_{M2+} = 40.77$. For $b = b^* = 0.43$, we have $\hat\gamma_0^{M2} = \hat\gamma_{0+}^{M2} = -30.00863$, $\hat\gamma_1^{M2} = \hat\gamma_1^{M1} = 5.00027$, $\hat\sigma_{M2} = 0.04184$, $\hat\alpha = -0.36287$, $\hat\lambda = 1.44185$, and $\hat{l}_{M2} = \hat{l}_{M2+} = 40.77$. The proposed method in this article almost perfectly estimates the intercept parameter $\gamma_0^{M2}$, the slope parameter $\gamma_1^{M2}$, and the dispersion parameter $\sigma_{M2}$ in M2 with very small deviations from their true values. The estimate of the shape parameter λ has a reasonably small deviation from its true value. Thus, the performance of our proposed method for estimating the unknown parameters $\gamma_0$, $\gamma_1$, and σ is spectacular, and for estimating the unknown parameter λ it is close to being spectacular.

The value 1.44185 of $\hat\lambda$ being away from zero provides strong evidence in support of the skew normal distribution for describing the data in Table 3. A comparison between $\hat{l}_{M2}$ and $\hat{l}_{M1}$ is possible following the Cox statistic (Cox 1961, 1962) and the work of many researchers (Ashkar and Aucoin 2010, 2012; Bain and Engelhardt 1980; Dumonceaux and Antle 1973; Dumonceaux, Antle, and Haas 1973; Gupta and Kundu 2003; Kappenman 1982; Kundu and Manglick 2004; White 1982). We obtain from the data in Table 3: $\hat\gamma_0^{M1} = -29.98154$, $\hat\gamma_1^{M1} = 5.00027$, $\hat\sigma_{M1} = 0.03158322$, and $\hat{l}_{M1} = 40.72$. We observe that $\hat{l}_{M2} > \hat{l}_{M1}$, which favors the skew normal distribution. We also observe with the true values of $\gamma_0^{M2} = -30$, $\gamma_1^{M2} = 5$, $\sigma_{M2} = 0.05$, and λ = 1:

$$20\log_e 2 - 20\log_e 0.05 + \sum_{i=1}^{20}\log_e\phi\!\left(\frac{y_i + 30 - 5x_i}{0.05}\right) + \sum_{i=1}^{20}\log_e\Phi\!\left(\frac{y_i + 30 - 5x_i}{0.05}\right) = 38.2615,$$


which is smaller than both $\hat{l}_{M2}$ and $\hat{l}_{M1}$ as an effect of noise present in the Table 3 data. The values of

$$20\log_e 2 + \sum_{i=1}^{20}\log_e\Phi\!\left(\frac{y_i + 30 - 5x_i}{0.05}\right) = 5.6923$$

and

$$20\log_e 2 + \sum_{i=1}^{20}\log_e\Phi\!\left(1.441847\,\frac{y_i + 30.00863 - 5.00027x_i}{0.04184}\right) = 5.6606$$

are positive and very close to each other.
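The normal-model part of this comparison is straightforward to reproduce from Table 3. The sketch below transcribes the 20 pairs, solves Eqs. (9)–(11) by least squares, and evaluates Eq. (32); the variable names are illustrative and the code is an independent check, not the authors' program.

```python
import math

# (y_i, x_i) pairs transcribed from Table 3 of the article.
TABLE3 = [
    (470.0698, 100), (450.0979, 96), (409.9942, 88), (470.0207, 100), (470.0375, 100),
    (450.0491, 96), (370.0371, 80), (310.0119, 68), (430.0793, 92), (450.0547, 96),
    (409.9933, 88), (430.0319, 92), (310.0990, 68), (390.0083, 84), (389.9928, 84),
    (410.0700, 88), (330.0358, 72), (410.0679, 88), (330.0494, 72), (410.0329, 88),
]
y = [p[0] for p in TABLE3]
x = [float(p[1]) for p in TABLE3]
n = len(y)

# Least-squares intercept and slope, Eqs. (9)-(10).
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
g1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
g0 = ybar - g1 * xbar

# Dispersion from Eq. (11) and the estimated log-likelihood of Eq. (32).
resid = [yi - g0 - g1 * xi for xi, yi in zip(x, y)]
sigma = math.sqrt(sum(r * r for r in resid) / n)
l_m1 = -n * math.log(sigma) + sum(
    -0.5 * (r / sigma) ** 2 - 0.5 * math.log(2.0 * math.pi) for r in resid)

print(f"intercept = {g0:.5f}, slope = {g1:.5f}, sigma = {sigma:.6f}, l_M1 = {l_m1:.2f}")
```

The printed values should agree, up to rounding, with the intercept −29.98154, slope 5.00027, dispersion 0.03158322, and log-likelihood 40.72 reported above for M1.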

6. Conclusions

The proposed computationally intensive heuristic method based on the linear approximation of $R_\lambda(z)$ for a given λ and −3 ≤ z ≤ 3 by $A_{\lambda,\alpha}(z)$ in Eq. (16) has a strong ability to decide between the models M1 and M2 by first determining the sign of the shape parameter and then estimating its magnitude along with the location and dispersion parameters under the linear approximation. The proposed heuristic method is simple and transparent with strong performances. An interesting feature of the proposed method is that the estimate of the slope parameter $\gamma_1$ remains the same for both the models M1 and M2, demonstrating a robustness property. The proposed method enjoys strong accuracy, particularly when the shape parameter is not too small and the number of observations is moderate to large. Strong accuracy of the proposed method in determining the sign of the shape parameter is also observed.

Acknowledgment

The authors thank two reviewers for their careful reading of the earlier version of this article and their constructive suggestions.

References

Arnold, B. C., and R. J. Beaver. 2000. Hidden truncation models. Sankhya Ser. A, 62, 23–35.

Arnold, B. C., R. J. Beaver, R. D. Groeneveld, and W. Q. Meeker. 1993. The nontruncated marginal of a truncated bivariate normal distribution. Psychometrika, 58, 471–478.

Ashkar, F., and F. Aucoin. 2010. Discriminating between the logistic and the normal distributions based on likelihood ratio. Interstat, http://interstat.statjournals.net

Ashkar, F., and F. Aucoin. 2012. Discriminating between the lognormal and the log-logistic distributions for hydrological frequency analysis. J. Hydrologic Eng., 17, 160–167.

Azzalini, A. 1985. A class of distributions which includes the normal ones. Scand. J. Stat., 12, 171–178.

Azzalini, A. 1986. Further results on a class of distributions which includes the normal ones. Statistica, 46, 199–208.

Bain, L. J., and M. Engelhardt. 1980. Probability of correct selection of Weibull versus gamma based on likelihood ratio. Commun. Stat. Theory Methods, 9, 375–381.

Capitanio, A. 2010. On the approximation of the tail probability of the scalar skew-normal distribution. Metron, LXVIII, 299–308.

Cox, D. R. 1961. Tests of separate families of hypotheses. Proc. Fourth Berkeley Symposium, Vol. 1, 105–123. Berkeley, CA: University of California Press.

Cox, D. R. 1962. Further results on tests of separate families of hypotheses. J. R. Stat. Soc. Ser. B, 24, 406–424.

Dumonceaux, R., and C. E. Antle. 1973. Discrimination between the log-normal and the Weibull distributions. Technometrics, 15, 923–926.

Dumonceaux, R., C. E. Antle, and G. Haas. 1973. Likelihood ratio test for discrimination between two models with unknown location and scale parameters. Technometrics, 15, 19–27.

Firth, D. 1993. Bias reduction of maximum likelihood estimates. Biometrika, 80, 27–38.

Genton, M. G., ed. 2004. Skew-elliptical distributions and their applications: a journey beyond normality. New York, NY: Chapman & Hall/CRC.

Ghosh, S., and D. Dey. 2010. Linear and nonlinear approximations of the ratio of the standard normal density and distribution functions for the estimation of the skew normal shape parameter. J. Indian Soc. Agric. Stat., 64(2), 237–242.

Gupta, R. D., and D. Kundu. 2003. Discriminating between Weibull and generalized exponential distributions. Comput. Stat. Data Anal., 43, 179–196.

Kappenman, R. F. 1982. On a method for selecting a distributional model. Commun. Stat. Theory Methods, 11, 663–672.

Kundu, D., and A. Manglick. 2004. Discriminating between the Weibull and log-normal distributions. Naval Res. Logistics, 51, 893–905.

Lagos Alvarez, B., and M. D. Jiménez Gamero. 2012. A note on bias reduction of the maximum likelihood estimates for the scalar skew t distribution. J. Stat. Plan. Inference, 142, 608–612.

Ma, Y., and M. G. Genton. 2004. Flexible class of skew-symmetric distributions. Scand. J. Stat., 31, 459–468.

Mills, J. P. 1926. Table of ratio: Area to bounding ordinate, for any portion of the normal curve. Biometrika, 18, 395–400.

Mudholkar, G. S., and A. D. Hutson. 2000. The epsilon-skew-normal distribution for analyzing nearly normal data. J. Stat. Plan. Inference, 83, 291–309.

Pewsey, A. 2000. Problems of inference for Azzalini’s skew normal distribution. J. Appl. Stat., 27, 859–870.

Rao, C. R. 1985. Weighted distributions arising out of methods of ascertainment: What population does a sample represent? In A celebration of statistics. The ISI centenary volume, ed. A. C. Atkinson and S. E. Fienberg, 543–569. New York, NY: Springer.

Sartori, N. 2006. Bias prevention of maximum likelihood estimates for scalar skew normal and skew t distributions. J. Stat. Plan. Inference, 136, 4259–4275.

White, H. 1982. Maximum likelihood estimation of misspecified models. Econometrica, 50, 1–25.
