Physica A 389 (2010) 3023–3038

Maximization of statistical heterogeneity: From Shannon's entropy to Gini's index

Iddo Eliazar (a), Igor M. Sokolov (b)

(a) Department of Technology Management, Holon Institute of Technology, P.O. Box 305, Holon 58102, Israel
(b) Institut für Physik, Humboldt-Universität zu Berlin, Newtonstr. 15, D-12489 Berlin, Germany


Article history: Received 12 January 2010; received in revised form 17 February 2010; available online 8 April 2010.

Keywords: Statistical heterogeneity; Dispersion; Entropy; Egalitarianism; Shannon's entropy; Gini's index; Subbotin's distribution; The logistic distribution.

Abstract

Different fields of Science apply different quantitative gauges to measure statistical heterogeneity. Statistical Physics and Information Theory commonly use Shannon's entropy, which measures the randomness of probability laws, whereas Economics and the Social Sciences commonly use Gini's index, which measures the evenness of probability laws. Motivated by the principle of maximal entropy, we explore the maximization of statistical heterogeneity, for probability laws with a given mean, in the four following scenarios: (i) Shannon entropy maximization subject to a given dispersion level; (ii) Gini index maximization subject to a given dispersion level; (iii) Shannon entropy maximization subject to a given Gini index; (iv) Gini index maximization subject to a given Shannon entropy. Analysis of these four scenarios results in four different classes of heterogeneity-maximizing probability laws, yielding an in-depth description of both the marked differences and the interplay between the Physical "randomness-based" and the Economic "evenness-based" approaches to the maximization of statistical heterogeneity.


1. Introduction

The Gaussian law – one of the most elemental probability laws in Science and Engineering – is well known to result from the principle of maximum entropy: within the class of probability laws supported on the real line, with a given mean and variance, the Gaussian law attains maximal Shannon entropy. Shannon's entropy [1] quantifies the randomness of probability laws, and is a measure of heterogeneity commonly applied in Statistical Physics and Information Theory [2]. On the other hand, in Economics and the Social Sciences a commonly applied measure of heterogeneity is the Gini index [3], which quantifies the evenness of probability laws.

In this article we shift from the "principle of maximum entropy" to the more general "principle of maximum heterogeneity", and explore the maximization of statistical heterogeneity – for probability laws with a given mean – in the four following scenarios:

P1. Shannon entropy maximization subject to a given dispersion level.
P2. Gini index maximization subject to a given dispersion level.
P3. Shannon entropy maximization subject to a given Gini index.
P4. Gini index maximization subject to a given Shannon entropy.

A comprehensive analysis of these four heterogeneity-maximization scenarios is conducted, yielding four different classes of heterogeneity-maximizing probability laws with markedly different statistical characteristics. The results obtained provide an explicit and in-depth description of both the dramatic differences and the interplay between the Physical "randomness-based" and the Economic "evenness-based" approaches to the maximization of statistical heterogeneity.

The remainder of the article is organized as follows. Section 2 concisely reviews three basic gauges of statistical heterogeneity: dispersion, Shannon's entropy, and Gini's index. The optimization problems P1 and P2 are studied, respectively, in Sections 3 and 4, followed by a discussion presented in Section 5. The optimization problems P3 and P4 are studied in Section 6. Section 7 re-investigates the optimization problems P2, P3 and P4 in the context of a variant of Gini's index. The proofs of the main results are outlined in the Appendix.

2. Statistical heterogeneity

How heterogeneous is a given probability law? This question is one of the elemental issues in various fields of Science. Consider, henceforth, a probability law governed by the probability density function f(x) (x real), with mean µ.

Perhaps the most basic approach to gauge statistical heterogeneity is the notion of dispersion: measuring the fluctuations of the probability law around its mean. The dispersion is given by the functional

D(f) = \left( \int_{-\infty}^{\infty} |x - \mu|^{p} f(x)\, dx \right)^{1/p},   (1)

where the parameter p is the dispersion exponent (p ≥ 1). The dispersion is positive valued: D(f) > 0. The greater the dispersion, the more scattered and heterogeneous the probability law; the smaller the dispersion, the more concentrated and homogeneous the probability law. In the case p = 2 – the most commonly applied dispersion exponent – the dispersion D(f) equals the probability law's standard deviation, and the squared dispersion D(f)^2 equals the law's variance.

A different and more profound approach to gauge statistical heterogeneity is based on the notion of entropy: measuring the uncertainty, or "randomness", of the probability law. Entropy is a fundamental concept linking together Information Theory and Statistical Physics [2]. Shannon's entropy [1] – the most commonly applied measure of randomness – is given by the functional

H(f) = -\int_{-\infty}^{\infty} \ln(f(x))\, f(x)\, dx,   (2)

and is real valued: -∞ < H(f) < ∞. The greater Shannon's entropy, the more uncertain and heterogeneous the probability law; the smaller Shannon's entropy, the more certain and homogeneous the probability law. Henceforth we shall use the terms "Shannon's entropy" and "entropy" interchangeably.

Yet another approach to gauge statistical heterogeneity is based on the notion of equality: measuring the evenness, or "egalitarianism", of the probability law. This approach follows from Economics and the Social Sciences, and stems from questions of the type: "How large is the inequality of the distribution of wealth in a given society?". Gini's index [3] – the most commonly applied measure of societal egalitarianism – is given by the functional

G(f) = 1 - \frac{1}{\mu} \int_{0}^{\infty} \left( \int_{x}^{\infty} f(x')\, dx' \right)^{2} dx.   (3)

Gini's index is defined for probability laws supported on the positive half-line (x ≥ 0), and takes values in the unit interval: 0 < G(f) < 1. The greater Gini's index, the more unequal and heterogeneous the probability law; the smaller Gini's index, the more even and homogeneous the probability law.

In the context of societal egalitarianism the bounds of the "Gini range" (0 < G(f) < 1) represent two polar social extremes: pure communism and pure monarchy. The lower bound 0 corresponds to a purely communist society in which wealth is distributed equally amongst all the society members. The upper bound 1 corresponds to a pure monarchy in which wealth is held, exclusively, by one single monarch, leaving all other society members impoverished.

Although stemming from an Economic origin, the Gini index is far from being confined to the measurement of societal egalitarianism alone. Rather, the Gini index is a quantitative gauge for the evenness of arbitrary positive-valued probability laws. Examples include: astrophysics (the analysis of galaxy morphology [4]); medical chemistry (the analysis of kinase inhibitors [5]); ecology (the effect of biodiversity on ecosystem functioning [6]); finance (the analysis of inter-trade time intervals [7]). On the other hand, Shannon's entropy is applied also in Economics, e.g., in the equilibrium theory of markets [8].
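For concreteness, the three gauges can be evaluated numerically. The following Python sketch (not part of the original derivations; it assumes NumPy, a truncated grid quadrature, and uses the exponential law with mean µ purely for illustration) computes D(f) of Eq. (1) with p = 2, H(f) of Eq. (2), and G(f) of Eq. (3), whose closed-form values for the exponential law are µ, 1 + ln(µ), and 1/2, respectively.

# Numerical sketch (assumes NumPy): the three heterogeneity gauges of Eqs. (1)-(3)
# for the exponential law f(x) = (1/mu) * exp(-x/mu), whose closed-form values are
# D(f) = mu (for p = 2), H(f) = 1 + ln(mu), and G(f) = 1/2.
import numpy as np

mu = 2.0                                    # mean of the illustrative exponential law
x = np.linspace(0.0, 60.0 * mu, 2_000_001)  # truncated integration grid
dx = x[1] - x[0]
f = np.exp(-x / mu) / mu                    # probability density function

p = 2                                       # dispersion exponent
D = (np.sum(np.abs(x - mu) ** p * f) * dx) ** (1.0 / p)          # Eq. (1)
H = -np.sum(np.where(f > 0, f * np.log(f), 0.0)) * dx            # Eq. (2)
F_surv = np.exp(-x / mu)                    # survival function of the exponential law
G = 1.0 - np.sum(F_surv ** 2) * dx / mu                          # Eq. (3)

print(f"D = {D:.4f}  (exact: {mu})")
print(f"H = {H:.4f}  (exact: {1.0 + np.log(mu):.4f})")
print(f"G = {G:.4f}  (exact: 0.5)")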

3. Entropy maximization

The principle of maximum entropy plays a central role in both Information Theory and Statistical Physics [2]. In Physics the principle of maximum entropy leads to the notion of equipartition: the equal spread of probabilities over all possible states of a system, the spread being in compliance with given system-constraints. Thus a system's equipartition corresponds to the most heterogeneous, or "disordered", probability law defined on the system's states.

Three key examples of probability laws that maximize Shannon's entropy are [9]:

E1. Within the class of probability laws supported on the unit interval (0 ≤ x ≤ 1) the entropy maximizer is the uniform law.

E2. Within the class of probability laws supported on the non-negative half-line (x ≥ 0), and possessing a given mean, the entropy maximizer is the exponential law.

E3. Within the class of probability laws supported on the real line (-∞ < x < ∞), and possessing a given mean and variance, the entropy maximizer is the Gaussian law.

The third example is a special case of the following entropy-maximization problem:

Problem 1. Within the class of probability laws supported on the real line (-∞ < x < ∞), with mean µ and a given dispersion, which is the law that maximizes Shannon's entropy?

The solution of the entropy-maximization Problem 1 is Subbotin's probability law [10], governed by the probability density function

f_1(x) = \frac{\phi(p)}{\sigma} \exp\left( -\frac{1}{p} \left| \frac{x - \mu}{\sigma} \right|^{p} \right)   (4)

(-∞ < x < ∞), where σ is a positive scale parameter and \phi(p) = p^{1 - 1/p} / (2\Gamma(1/p)). An outline of the derivation of Eq. (4) is given in the Appendix. The entropy-maximization Problem 1 was addressed, in the context of statistical equilibrium in economic competition, in Ref. [11].

The Subbotin probability density function f1(x): has an infinite support (-∞ < x < ∞); is symmetric around its mean µ; is unimodal, and attains its global maximum at its mean µ. The dispersion exponent p = 1 yields the bilateral exponential law (also referred to as the Laplace law), in which case the density f1(x) has a cusped maximum. The dispersion exponent p = 2 yields the Gaussian law. For all dispersion exponents in the range p > 1 Subbotin's density f1(x) admits the same qualitative shape: a bell curve centered at the mean µ. A schematic illustration of Subbotin's density f1(x) is depicted in Fig. 1.

Fig. 1. A schematic illustration of the function f1(x), the Subbotin probability density function of Eq. (4) which governs the probability law solving the entropy-maximization Problem 1. The function f1(x) is plotted against the variable (x - µ)/σ, with σ = 1, for the following dispersion-exponent values: (i) p = 1, depicted by the solid line (note the cusped maximum); (ii) p = 2, depicted by the dashed line; (iii) p = 3, depicted by the dashed-dotted line.

The corresponding dispersion D1 = D(f1) and entropy H1 = H(f1) are given, respectively, by:

D_1 = \sigma, \qquad H_1 = \left( \frac{1}{p} - \ln(\phi(p)) \right) + \ln(\sigma).   (5)

Both the dispersion D1 and the entropy H1 are monotone increasing with respect to the scale parameter σ, and span their entire ranges of values as the parameter σ is varied: 0 < D1 < ∞ and -∞ < H1 < ∞. Hence, the attainable levels of heterogeneity – in the context of the entropy-maximization Problem 1 – are unlimited. Moreover, Eq. (5) implies that the dependence of the entropy H1 on the dispersion D1 is logarithmic:

H_1 = \left( \frac{1}{p} - \ln(\phi(p)) \right) + \ln(D_1).   (6)
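A quick numerical check of Eqs. (4)–(5) can be carried out along the following lines (a Python sketch assuming NumPy; the parameter values p = 3, µ = 1 and σ = 0.7 are illustrative). It evaluates the Subbotin density on a grid and verifies that its dispersion equals σ and that its entropy matches H1 of Eq. (5).

# Numerical sketch (assumes NumPy): the Subbotin density f1(x) of Eq. (4), with an
# illustrative dispersion exponent p = 3; checks normalization, D1 = sigma, and H1 of Eq. (5).
import numpy as np
from math import gamma

p, mu, sigma = 3.0, 1.0, 0.7
phi = p ** (1.0 - 1.0 / p) / (2.0 * gamma(1.0 / p))   # normalization constant phi(p)

x = np.linspace(mu - 8.0 * sigma, mu + 8.0 * sigma, 400_001)
dx = x[1] - x[0]
f1 = (phi / sigma) * np.exp(-np.abs((x - mu) / sigma) ** p / p)   # Eq. (4)

norm = np.sum(f1) * dx
D1 = (np.sum(np.abs(x - mu) ** p * f1) * dx) ** (1.0 / p)         # Eq. (1)
H1 = -np.sum(np.where(f1 > 0, f1 * np.log(f1), 0.0)) * dx         # Eq. (2)

print(f"normalization = {norm:.6f}")
print(f"D1 = {D1:.6f}  (expected sigma = {sigma})")
print(f"H1 = {H1:.6f}  (expected {1.0 / p - np.log(phi) + np.log(sigma):.6f})")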

4. Gini maximization

In the previous Section we discussed the "principle of maximum entropy". What happens when shifting from the perspective of Information Theory and Statistical Physics to the perspective of Economics and the Social Sciences? Namely, what happens when applying Gini's index G(f) – rather than Shannon's entropy H(f) – as the underlying measure of statistical heterogeneity?

Let us examine first the "Gini counterparts" of the entropy-maximization examples noted in the previous section and leading, respectively, to the uniform, exponential, and Gaussian laws:

E1′. Within the class of probability laws supported on the unit interval (0 ≤ x ≤ 1) the maximal Gini index attainable is 1, corresponding to a pure monarchy.

E2′. Within the class of probability laws supported on the non-negative half-line (x ≥ 0), and possessing a given mean, the maximal Gini index attainable is 1, corresponding, again, to a pure monarchy.

E3′. Within the class of probability laws supported on the non-negative half-line (x ≥ 0), and possessing a given mean and variance, the Gini maximizer is the uniform law (supported on a sub-range of the non-negative half-line).

The third example is a special case of the following Gini-maximization problem:

Problem 2. Within the class of probability laws supported on the non-negative half-line (x ≥ 0), with mean µ and a given dispersion, which is the law that maximizes Gini's index?

The Gini-maximization Problem 2 is the counterpart of the entropy-maximization Problem 1, replacing "randomness" by "evenness" as the underlying measure of heterogeneity, and is well-defined for dispersion exponents in the range p > 1. The probability law solving the Gini-maximization Problem 2 is governed by the probability density function

f_2(x) = \frac{p - 1}{2\sigma} \left| \frac{x - \mu}{\sigma} \right|^{p - 2}   (7)

(µ - σ ≤ x ≤ µ + σ), where σ is a scale parameter taking values in the range 0 < σ ≤ µ. An outline of the derivation of Eq. (7) is given in the Appendix.

The probability density function f2(x): is localized and has a finite support (µ - σ ≤ x ≤ µ + σ); is symmetric around its mean µ; undergoes qualitative phase transitions as the dispersion exponent p is varied. In the range 1 < p < 2 the density f2(x) diverges at its mean µ. The dispersion exponent p = 2 yields the uniform law on the interval [µ - σ, µ + σ]. In the range p > 2 the density f2(x) attains a global minimum at its mean µ, the minimum being cusped in the sub-range 2 < p ≤ 3, and being U-shaped in the sub-range p > 3. A schematic illustration of the density f2(x) is depicted in Fig. 2.

Fig. 2. A schematic illustration of the function f2(x), the probability density function of Eq. (7) which governs the probability law solving the Gini-maximization Problem 2. The function f2(x) is plotted against the variable (x - µ)/σ, with σ = 1, for the following dispersion-exponent values: (i) p = 1.5, depicted by the dotted line; (ii) p = 2, depicted by the solid line y = 0.5; (iii) p = 2.5, depicted by the dashed line; (iv) p = 3, depicted by the solid line y = |(x - µ)/σ|; (v) p = 4, depicted by the dashed-dotted line.

The corresponding dispersion D2 = D(f2) and Gini index G2 = G(f2) are given, respectively, by:

D_2 = \left( \frac{p - 1}{2p - 1} \right)^{1/p} \sigma, \qquad G_2 = \left( \frac{p - 1}{2p - 1} \right) \frac{\sigma}{\mu}.   (8)

Both the dispersion D2 and the Gini index G2 are monotone increasing with respect to the scale parameter σ, but fail to span their entire ranges of values as the parameter σ is varied. Indeed, both the dispersion D2 and the Gini index G2 are bounded from above: 0 < D_2 < ((p-1)/(2p-1))^{1/p} µ and 0 < G_2 < (p-1)/(2p-1). Hence, the attainable levels of heterogeneity – in the context of the Gini-maximization Problem 2 – are limited. Moreover, Eq. (8) implies that the dependence of the Gini index G2 on the dispersion D2 is linear:

G_2 = \left( \frac{p - 1}{2p - 1} \right)^{\frac{p - 1}{p}} \frac{1}{\mu}\, D_2.   (9)
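The formulas of Eq. (8) can likewise be checked numerically. The sketch below (Python with NumPy; the values p = 3, µ = 2 and σ = 1.5 are illustrative) evaluates the density f2(x) of Eq. (7) on a grid and compares the resulting dispersion and Gini index with Eq. (8).

# Numerical sketch (assumes NumPy): the Gini-maximizing density f2(x) of Eq. (7), with an
# illustrative dispersion exponent p = 3; checks the dispersion and Gini formulas of Eq. (8).
import numpy as np

p, mu, sigma = 3.0, 2.0, 1.5                 # requires 0 < sigma <= mu
x = np.linspace(0.0, mu + sigma, 2_000_001)
dx = x[1] - x[0]
inside = np.abs(x - mu) <= sigma
f2 = np.where(inside, ((p - 1.0) / (2.0 * sigma)) * np.abs((x - mu) / sigma) ** (p - 2.0), 0.0)

D2 = (np.sum(np.abs(x - mu) ** p * f2) * dx) ** (1.0 / p)          # Eq. (1)
F_surv = 1.0 - np.cumsum(f2) * dx            # survival function built from the density
G2 = 1.0 - np.sum(F_surv ** 2) * dx / mu     # Eq. (3)

c = (p - 1.0) / (2.0 * p - 1.0)
print(f"D2 = {D2:.5f}  (expected {c ** (1.0 / p) * sigma:.5f})")
print(f"G2 = {G2:.5f}  (expected {c * sigma / mu:.5f})")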

5. Discussion

The entropy-maximization Problem 1 and the counterpart Gini-maximization Problem 2 are analogous. Both seek to maximize statistical heterogeneity, subject to a given mean and a given dispersion level. The former considers a Physical perspective and applies Shannon's entropy as its measure of statistical heterogeneity, whereas the latter considers an Economic perspective and applies Gini's index. Yet, these analogous optimization problems result in markedly different heterogeneity-maximizing probability laws. Indeed, the differences between the entropy-maximizing Subbotin density f1(x) of Eq. (4), and the Gini-maximizing density f2(x) of Eq. (7), are dramatic. These differences – regarding dispersion exponents in the range p > 1 – are summarized in Table 1.

Table 1. The differences between the entropy-maximizing and the Gini-maximizing probability laws.

                                      Entropy maximization    Gini maximization
  Density                             f1(x)                   f2(x)
  Density support                     Infinite                Finite
  Density form                        Exponential             Power law
  Density shape                       Bell curve              Varying, p-dependent
  Law for p = 2                       Gaussian                Uniform
  Heterogeneity-dispersion relation   Logarithmic             Linear
  Heterogeneity levels                Unbounded               Bounded from above

We note that the minimization – rather than the maximization – of statistical heterogeneity leads to pure homogeneity. Indeed, entropy-minimization yields the pure homogeneous lower bound inf_f H(f) = -∞ within the following classes of probability laws: (i) supported on the unit interval (0 ≤ x ≤ 1); (ii) supported on the non-negative half-line (x ≥ 0), and possessing a given mean; (iii) supported on the real line (-∞ < x < ∞), and possessing a given mean and dispersion. Also, Gini-minimization yields the pure communism lower bound inf_f G(f) = 0 within the following classes of probability laws: (i) supported on the unit interval (0 ≤ x ≤ 1); (ii) supported on the non-negative half-line (x ≥ 0), and possessing a given mean; (iii) supported on the non-negative half-line (x ≥ 0), and possessing a given mean and dispersion. We further note that in all the aforementioned heterogeneity-minimization problems the minimum is neither attainable nor unique: the pure homogeneous lower bounds are attained by infima (rather than by minima), and in a non-unique fashion. This is in sharp contrast with the counterpart heterogeneity-maximization problems, each resulting in a unique probability law which maximizes heterogeneity.

6. Entropy-Gini maximization

6.1. The entropy-Gini maximization problems

So far we have applied either a Physical perspective (Shannon's entropy) or an Economic perspective (Gini's index) to maximize heterogeneity, subject to a given mean and a given dispersion level. What happens when we abolish the notion of dispersion altogether and try to relate the Physical and the Economic perspectives – in the context of heterogeneity-maximization – to each other? Namely, what happens when maximizing Shannon's entropy (Gini's index) subject to a given mean and a given Gini index (Shannon entropy)? This query leads us to the following pair of optimization problems:

Problem 3. Within the class of probability laws supported on the non-negative half-line (x ≥ 0), with mean µ and a given Gini index, which is the law that maximizes Shannon's entropy?

Problem 4. Within the class of probability laws supported on the non-negative half-line (x ≥ 0), with mean µ and a given Shannon entropy, which is the law that maximizes Gini's index?

The probability law solving both the optimization Problems 3 and 4 is governed by the probability density function

f_3(x) = \frac{\psi(\sigma)}{\mu} \, \frac{\sigma \exp\!\left( \frac{\psi(\sigma)}{\mu} x \right)}{\left( \sigma \exp\!\left( \frac{\psi(\sigma)}{\mu} x \right) + (1 - \sigma) \right)^{2}}   (10)

(x ≥ 0), where σ is a positive parameter and ψ(σ) = ln(σ)/(σ - 1). The corresponding survival probability function F_3(x) = \int_{x}^{\infty} f_3(x')\, dx' is given by

F_3(x) = \frac{1}{\sigma \exp\!\left( \frac{\psi(\sigma)}{\mu} x \right) + (1 - \sigma)}   (11)

(x ≥ 0). An outline of the derivation of Eqs. (10)–(11) is given in the Appendix.

The density f3(x): has an infinite support (x ≥ 0); is asymmetric and skewed; undergoes qualitative phase transitions as the parameter σ is varied. In the range 0 < σ < 1/2 the density f3(x) is unimodal, and attains its global maximum at the value

x_{\mathrm{mode}} = \mu (1 - \sigma) \left( 1 - \frac{\ln(1 - \sigma)}{\ln(\sigma)} \right).   (12)

In the range σ ≥ 1/2 the density f3(x) is monotone decreasing, and attains its global maximum at the origin (i.e., x_mode = 0). At the parameter value σ = 1 the density f3(x) coincides with the density of the exponential law (with mean µ). The median of the density f3(x) is given by

x_{\mathrm{median}} = \mu (1 - \sigma) \left( 1 - \frac{\ln(1 + \sigma)}{\ln(\sigma)} \right).   (13)

A schematic illustration of the density f3(x) is depicted in Fig. 3.

Fig. 3. A schematic illustration of the function f3(x), the probability density function of Eq. (10) which governs the probability law solving both the optimization Problems 3 and 4. The function f3(x) is plotted against the variable x for the following parameter values: µ = 1 and (i) σ = 0.25, depicted by the solid line; (ii) σ = 0.5, depicted by the dashed-dotted line; (iii) σ = 1, depicted by the dashed line; (iv) σ = 2, depicted by the dotted line.

The corresponding entropy H3 = H(f3) and Gini index G3 = G(f3) are given, respectively, by:

H_3 = (2 + \ln(\mu)) - \sigma \psi(\sigma) - \ln(\psi(\sigma)), \qquad G_3 = 1 + \frac{\psi(\sigma) - 1}{\ln(\sigma)} = 1 + \frac{1}{\sigma - 1} - \frac{1}{\ln(\sigma)}.   (14)

The parameter limit σ → 0 yields pure communism: lim_{σ→0} G3 = 0 (and lim_{σ→0} H3 = -∞). On the other hand, the parameter limit σ → ∞ yields pure monarchy: lim_{σ→∞} G3 = 1 (and lim_{σ→∞} H3 = -∞). Schematic illustrations of the entropy H3 and the Gini index G3 – as functions of the parameter σ – are depicted in Fig. 4.

Fig. 4. A schematic illustration of H3 and G3, the Shannon entropy and the Gini index of Eq. (14) which correspond to the probability law solving both the optimization Problems 3 and 4. The Shannon entropy H3, as a function of the parameter σ, is depicted by the solid line. The Gini index G3, as a function of the parameter σ, is depicted by the dashed line.

There is no closed-form formula expressing the direct connection between the entropy H3 and the Gini index G3. Yet, from both Eq. (14) and Fig. 4 it is evident that the functional relation between the entropy H3 and the Gini index G3 is not monotone, and is highly non-linear.
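The closed-form expressions of Eqs. (12) and (14) can be verified numerically. The following Python sketch (assuming NumPy; the parameter values µ = 1 and σ = 0.25 are illustrative, and the grid truncation introduces a negligible error) evaluates f3(x) and F3(x) of Eqs. (10)–(11) and compares the numerical mean, mode, entropy and Gini index with the formulas above.

# Numerical sketch (assumes NumPy): the density f3(x) of Eq. (10) with illustrative
# parameters mu = 1 and sigma = 0.25; checks the mean, the mode of Eq. (12), and the
# entropy/Gini expressions of Eq. (14).
import numpy as np

mu, sigma = 1.0, 0.25                          # 0 < sigma < 1/2, so f3 is unimodal
psi = np.log(sigma) / (sigma - 1.0)            # psi(sigma) = ln(sigma)/(sigma - 1)

x = np.linspace(0.0, 60.0 * mu, 4_000_001)
dx = x[1] - x[0]
e = sigma * np.exp(psi * x / mu)
f3 = (psi / mu) * e / (e + (1.0 - sigma)) ** 2   # Eq. (10)
F3 = 1.0 / (e + (1.0 - sigma))                   # Eq. (11)

mean = np.sum(x * f3) * dx
mode_numeric = x[np.argmax(f3)]
mode_formula = mu * (1.0 - sigma) * (1.0 - np.log(1.0 - sigma) / np.log(sigma))   # Eq. (12)
H3 = -np.sum(f3 * np.log(f3)) * dx
H3_formula = (2.0 + np.log(mu)) - sigma * psi - np.log(psi)                        # Eq. (14)
G3 = 1.0 - np.sum(F3 ** 2) * dx / mu
G3_formula = 1.0 + 1.0 / (sigma - 1.0) - 1.0 / np.log(sigma)                       # Eq. (14)

print(f"mean = {mean:.4f} (should be {mu})")
print(f"mode: numeric {mode_numeric:.4f} vs Eq. (12) {mode_formula:.4f}")
print(f"H3:   numeric {H3:.4f} vs Eq. (14) {H3_formula:.4f}")
print(f"G3:   numeric {G3:.4f} vs Eq. (14) {G3_formula:.4f}")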

6.2. A probabilistic approach

In this subsection we present a probabilistic approach yielding the probability law solving both the optimization Problems 3 and 4. Henceforth, let X denote a positive-valued random variable whose probability law is governed by the probability density function f(x) (x ≥ 0). The survival probability of the random variable X is given by F(x) = \int_{x}^{\infty} f(x')\, dx' (x ≥ 0), and the hazard rate of the random variable X is given by

R(x) := \lim_{\delta \to 0} \frac{1}{\delta} \Pr\left( X \le x + \delta \mid X > x \right) = \frac{f(x)}{F(x)}   (15)


(x ≥ 0). Considering the random variable X as a random time, the hazard rate R(x) represents the likelihood that the random time be realized right after time x, given that it was not realized up to time x. Hazard rates play a key role in Applied Probability [12] and in the Theory of Reliability [13].

The probability law solving both the optimization Problems 3 and 4 is characterized by the following affine relation between the hazard rate R(x) and the survival probability F(x):

R(x) = a F(x) + b.   (16)

Namely, the affine relation of Eq. (16) holds if and only if the survival probability F(x) admits the "entropy-Gini" form of Eq. (11), where the transformations between the parameters (a, b) and (σ, µ) are given by:

a = \frac{\ln(\sigma)}{\mu} \quad \text{and} \quad b = \frac{\ln(\sigma)}{\mu(\sigma - 1)}; \qquad \sigma = 1 + \frac{a}{b} \quad \text{and} \quad \mu = \frac{1}{a} \ln\left( 1 + \frac{a}{b} \right).   (17)

In the previous subsection we noted that at the parameter value σ = 1 the exponential law (with mean µ) is attained. Now we note that at the parameter limit b → 0 the Zipfian law is attained. Namely, in the parameter limit b → 0 we obtain the Zipfian survival probability F(x) = 1/(ax + 1) (x ≥ 0), where a is a positive parameter. The proofs of the assertions made in this subsection are given in the Appendix.
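The affine relation of Eq. (16), with the parameter transformations of Eq. (17), can be checked directly. The short Python sketch below (assuming NumPy; µ = 1 and σ = 2 are illustrative) evaluates the hazard rate of the density f3(x) at a few points and confirms that it coincides with aF(x) + b.

# Numerical sketch (assumes NumPy): the affine hazard-rate relation R(x) = a*F(x) + b
# of Eq. (16) for the density f3(x) of Eq. (10), with a and b taken from Eq. (17).
import numpy as np

mu, sigma = 1.0, 2.0
psi = np.log(sigma) / (sigma - 1.0)
a = np.log(sigma) / mu                                   # Eq. (17)
b = np.log(sigma) / (mu * (sigma - 1.0))                 # Eq. (17)

x = np.linspace(0.0, 5.0 * mu, 7)                        # a few check points
e = sigma * np.exp(psi * x / mu)
f3 = (psi / mu) * e / (e + (1.0 - sigma)) ** 2           # Eq. (10)
F3 = 1.0 / (e + (1.0 - sigma))                           # Eq. (11)
R = f3 / F3                                              # hazard rate, Eq. (15)

print(np.allclose(R, a * F3 + b))                        # the affine relation of Eq. (16)
for xi, Ri, rhs in zip(x, R, a * F3 + b):
    print(f"x = {xi:.2f}:  R(x) = {Ri:.5f},  a*F(x) + b = {rhs:.5f}")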

7. From Gini’s index to Gini’s distance

The dispersion and Shannon's entropy are defined for probability laws supported on the real line, whereas Gini's index is defined for probability laws supported on the positive half-line. In this section we shift from Gini's index to a functional we term "Gini's distance": a measure of statistical heterogeneity which Gini's index is based upon, and which is well-defined also for probability laws supported on the real line.

The Gini distance of a probability law governed by the probability density function f(x) (-∞ < x < ∞) is given by

K(f) = \frac{1}{2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |x - y|\, f(x) f(y)\, dx\, dy.   (18)

Namely, K(f) is half the average distance between two independent random variables whose probability law is governed by the probability density function f(x) (-∞ < x < ∞). Gini's distance is positive valued, K(f) > 0, and is bounded from above by the dispersion (with dispersion exponent p = 1): K(f) ≤ D(f). In the case of probability laws supported on the positive half-line (i.e., with f(x) = 0 for all x < 0) the connection between Gini's index G(f) and Gini's distance K(f) is given by

G(f) = \frac{1}{\mu} K(f).   (19)

The derivation of Eq. (19) is given in the Appendix.

The "Gini's distance" counterpart of the Gini-maximization Problem 2 is the following optimization problem:

Problem 5. Within the class of probability laws supported on the real line (-∞ < x < ∞), with mean µ and a given dispersion, which is the law that maximizes Gini's distance?

As in the case of the Gini-maximization Problem 2, the counterpart optimization Problem 5 is well-defined for dispersion exponents in the range p > 1, and the probability law solving it is governed by the probability density function f2(x) of Eq. (7):

f_2(x) = \frac{p - 1}{2\sigma} \left| \frac{x - \mu}{\sigma} \right|^{p - 2}   (20)

(µ - σ ≤ x ≤ µ + σ), where σ is a positive scale parameter. The only difference between the maximization Problems 2 and 5 is that in the latter the support of the probability density function f2(x) is not restricted to the positive half-line, implying, in turn, that the scale parameter σ can assume all positive values (rather than being restricted to the range 0 < σ ≤ µ). An outline of the derivation of Eq. (20) is given in the Appendix.

The "Gini's distance" counterparts of the entropy-Gini maximization Problems 3 and 4 are, respectively, the following optimization problems:

Problem 6. Within the class of probability laws supported on the real line (-∞ < x < ∞), with mean µ and a given Gini distance, which is the law that maximizes Shannon's entropy?

Problem 7. Within the class of probability laws supported on the real line (-∞ < x < ∞), with mean µ and a given Shannon entropy, which is the law that maximizes Gini's distance?

The solution of both optimization Problems 6 and 7 is the logistic probability law [14], governed by the probability density function

f_4(x) = \frac{1}{\sigma} \, \frac{\exp\!\left( \frac{x - \mu}{\sigma} \right)}{\left( 1 + \exp\!\left( \frac{x - \mu}{\sigma} \right) \right)^{2}}   (21)

(-∞ < x < ∞), where σ is a positive scale parameter. The corresponding survival probability function F_4(x) = \int_{x}^{\infty} f_4(x')\, dx' is given by

F_4(x) = \frac{1}{1 + \exp\!\left( \frac{x - \mu}{\sigma} \right)}   (22)

(-∞ < x < ∞). An outline of the derivation of Eqs. (21)–(22) is given in the Appendix.

The logistic probability density function f4(x) is qualitatively similar to the Subbotin probability density function f1(x) (solving the entropy-maximization Problem 1): it has an infinite support (-∞ < x < ∞); it is symmetric around its mean µ; it is unimodal, and attains its global maximum at its mean µ. Moreover, the Gini distance K4 = K(f4) and the entropy H4 = H(f4) of the logistic density f4(x) are analogous, respectively, to the dispersion D1 = D(f1) and entropy H1 = H(f1) of Subbotin's density f1(x) (given by Eq. (5)):

K_4 = \sigma, \qquad H_4 = 2 + \ln(\sigma).   (23)

And the dependence of the entropy H4 on the Gini distance K4 is analogous to the logarithmic dependence of the entropy H1 on the dispersion D1 (given by Eq. (6)):

H_4 = 2 + \ln(K_4).   (24)

Both the Gini distance K4 and the entropy H4 are monotone increasing with respect to the scale parameter σ, and span their entire ranges of values as the parameter σ is varied: 0 < K4 < ∞ and -∞ < H4 < ∞. Hence, the attainable levels of heterogeneity – in the context of the optimization Problems 6 and 7 – are unlimited.
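Eqs. (23)–(24) can be checked numerically as well. The following Python sketch (assuming NumPy; µ = 3 and σ = 0.8 are illustrative) draws pairs of independent logistic variates by inverting the survival function of Eq. (22), estimates the Gini distance K4 of Eq. (18) by Monte Carlo, and evaluates the entropy H4 by grid quadrature.

# Numerical sketch (assumes NumPy): the logistic law of Eqs. (21)-(22) with illustrative
# mu = 3 and sigma = 0.8; Monte Carlo estimate of the Gini distance K4 and a quadrature
# estimate of the entropy H4, both compared with Eq. (23).
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 3.0, 0.8, 2_000_000

# Sample the logistic law by inverting its survival function, Eq. (22).
u = rng.uniform(size=(2, n))
xy = mu + sigma * np.log((1.0 - u) / u)
K4_mc = 0.5 * np.mean(np.abs(xy[0] - xy[1]))     # Gini distance, Eq. (18)

x = np.linspace(mu - 40.0 * sigma, mu + 40.0 * sigma, 2_000_001)
dx = x[1] - x[0]
z = np.exp((x - mu) / sigma)
f4 = z / (sigma * (1.0 + z) ** 2)                # Eq. (21)
H4 = -np.sum(np.where(f4 > 0, f4 * np.log(f4), 0.0)) * dx

print(f"K4 (Monte Carlo) = {K4_mc:.4f}   vs sigma = {sigma}")
print(f"H4 (quadrature)  = {H4:.4f}   vs 2 + ln(sigma) = {2.0 + np.log(sigma):.4f}")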

8. Conclusions

In this letter we explored heterogeneity-maximizing probability laws, our motivation stemming from the "principle of maximum entropy". To that end, we considered three different measures of statistical heterogeneity: dispersion – given by the functional D(f) of Eq. (1) – which measures the fluctuations of probability laws around their means; Shannon's entropy – given by the functional H(f) of Eq. (2) – which measures the randomness of probability laws via an information-based approach; Gini's index – given by the functional G(f) of Eq. (3) – which measures the evenness of probability laws via the Economic notion of "societal equality".

Maximizing Shannon's entropy and Gini's index – subject to a given mean and a given dispersion level – led, respectively, to the classes of probability laws governed by the probability density functions of Eqs. (4) and (7). These probability density functions are both symmetric around their means, but display markedly different statistical behaviors: infinite vs. finite support; exponential vs. power-law functional structure; bell-curve shape vs. a varying density shape (with respect to the dispersion exponent p); Gaussian law vs. uniform law in the case p = 2 (the dispersion equaling the standard deviation); logarithmic vs. linear dependence of the measure of heterogeneity (Shannon's entropy / Gini's index) on the dispersion level; unlimited vs. limited attainable heterogeneity levels (Shannon's entropy / Gini's index).

Maximizing Shannon's entropy subject to a given mean and a given Gini index, and maximizing Gini's index subject to a given mean and a given Shannon entropy, led to a third class of probability laws. This third class of probability laws is governed by the asymmetric and skewed probability density function of Eq. (10), is a generalization of both the exponential law and Zipf's law, and is characterized by an affine relation between the corresponding hazard rates and survival probabilities. In this class of probability laws the connection between Shannon's entropy and Gini's index is non-monotone and non-linear.

Last, shifting from Gini's index to Gini's distance – given by the functional K(f) of Eq. (18) – we re-investigated heterogeneity maximization involving Gini's index. Maximizing Gini's distance subject to a given mean and a given dispersion level led to the same class of probability laws obtained from the counterpart Gini-index maximization problem. On the other hand, maximizing Shannon's entropy subject to a given mean and a given Gini distance, and maximizing Gini's distance subject to a given mean and a given Shannon entropy, led to the logistic distribution.

The maximization of statistical heterogeneity is thus highly dependent on the gauges of heterogeneity applied, both as maximization target-functions and as maximization constraint-functions. This letter presented both the marked differences and the interplay between the Physical "randomness-based" and Economic "evenness-based" approaches to the maximization of statistical heterogeneity.

Acknowledgements

The authors gratefully acknowledge the anonymous reviewers for their insightful suggestions and remarks.

Appendix

In the Appendix we will make use of the following functionals.

Linear functionals. Let (a, b) be an interval (with lower bound a ≥ -∞ and upper bound b ≤ ∞), and let ξ(x) (a < x < b) be a real-valued function defined on the interval (a, b). Consider the linear functional

L(f) = \int_{a}^{b} \xi(x)\, f(x)\, dx.   (25)

The first variation of the functional L(f) is given by

\Delta[L(f)](\phi) = \int_{a}^{b} [\xi(x)]\, \phi(x)\, dx,   (26)

where φ(x) (a < x < b) is an arbitrary test function.

Shannon functionals. Let (a, b) be an interval (with lower bound a ≥ -∞ and upper bound b ≤ ∞). Consider the convex functional

\tilde{H}(f) = \int_{a}^{b} \ln(f(x))\, f(x)\, dx,   (27)

where f(x) (a < x < b) is a probability density function. The first variation of the functional \tilde{H}(f) is given by

\Delta[\tilde{H}(f)](\phi) = \int_{a}^{b} [1 + \ln(f(x))]\, \phi(x)\, dx,   (28)

where φ(x) (a < x < b) is an arbitrary test function.

Gini's functional. Consider the convex functional

\tilde{G}(f) = \int_{0}^{\infty} F(x)^{2}\, dx,   (29)

where f(x) (x ≥ 0) is a probability density function, and where F(x) = \int_{x}^{\infty} f(x')\, dx' (x ≥ 0) is the corresponding survival probability function. The first variation of the functional \tilde{G}(f) is given by

\Delta[\tilde{G}(f)](\phi) = \int_{0}^{\infty} \left[ 2 \int_{0}^{x} F(u)\, du \right] \phi(x)\, dx,   (30)

where φ(x) (x ≥ 0) is an arbitrary test function.
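The first-variation formulas above can be probed by a finite-difference check. The Python sketch below (assuming NumPy; the exponential base density and the test function phi(x) = (x - 1)exp(-x) are illustrative choices) compares the Gateaux derivative of Gini's functional, computed by central differences, with the expression of Eq. (30).

# Numerical sketch (assumes NumPy): a central finite-difference check of the first
# variation of Gini's functional, Eq. (30). The base density f (exponential) and the
# test function phi are illustrative choices.
import numpy as np

x = np.linspace(0.0, 40.0, 1_000_001)
dx = x[1] - x[0]
f = np.exp(-x)                       # base density (exponential with mean 1)
phi = (x - 1.0) * np.exp(-x)         # an arbitrary test function

def survival(g):
    """F(x) = integral from x to infinity of g, evaluated on the grid."""
    return np.cumsum(g[::-1])[::-1] * dx

def gini_functional(g):
    """Gini's functional of Eq. (29): the integral of the squared survival function."""
    return np.sum(survival(g) ** 2) * dx

eps = 1e-4
finite_diff = (gini_functional(f + eps * phi) - gini_functional(f - eps * phi)) / (2.0 * eps)

F = survival(f)
variation = np.sum(2.0 * np.cumsum(F) * dx * phi) * dx    # Eq. (30)

print(f"finite difference : {finite_diff:.5f}")
print(f"Eq. (30)          : {variation:.5f}")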


A.1. Entropy maximization: Eq. (4)

The entropy-maximization Problem 1 is equivalent to the convex optimization problem:

\min \int_{-\infty}^{\infty} \ln(f(x))\, f(x)\, dx
\text{s.t. } \int_{-\infty}^{\infty} f(x)\, dx = 1, \quad \int_{-\infty}^{\infty} x f(x)\, dx = \mu, \quad \int_{-\infty}^{\infty} |x - \mu|^{p} f(x)\, dx = \delta   (31)

(the target functional is convex, and the constraint functionals are linear). The corresponding Lagrangian is given by:

\mathcal{L}_1(f, \lambda) = \int_{-\infty}^{\infty} \ln(f(x))\, f(x)\, dx + \lambda_1 \left( \int_{-\infty}^{\infty} f(x)\, dx - 1 \right) + \lambda_2 \left( \int_{-\infty}^{\infty} x f(x)\, dx - \mu \right) + \lambda_3 \left( \int_{-\infty}^{\infty} |x - \mu|^{p} f(x)\, dx - \delta \right),   (32)

where λ = (λ1, λ2, λ3) is the vector of Lagrange multipliers. Using Eqs. (26) and (28) (with (a, b) = (-∞, ∞)), the first variation of the Lagrangian \mathcal{L}_1(f, \lambda) is given by:

\Delta[\mathcal{L}_1(f, \lambda)](\phi) = \int_{-\infty}^{\infty} \left[ (1 + \ln(f(x))) + \lambda_1 + \lambda_2 x + \lambda_3 |x - \mu|^{p} \right] \phi(x)\, dx.   (33)

Equating the first variation of the Lagrangian \mathcal{L}_1(f, \lambda) to zero yields:

1 + \ln(f(x)) + \lambda_1 + \lambda_2 x + \lambda_3 |x - \mu|^{p} = 0.   (34)

Eq. (34), in turn, yields:

f(x) = \exp\left( -1 - \lambda_1 - \lambda_2 x - \lambda_3 |x - \mu|^{p} \right).   (35)

The first constraint (i.e., probability normalization) implies that λ2 = 0, and hence we arrive at the probability density function f1(x) of Eq. (4). Since the optimization problem of Eq. (31) is convex, a global maximum is attained at the critical point f1(x).

A.2. Gini maximization: Eq. (7)

Set F(x) = \int_{x}^{\infty} f(x')\, dx' (x ≥ 0) to be the survival probability function corresponding to the probability density function f(x) (x ≥ 0). The Gini-maximization Problem 2 is equivalent to the convex optimization problem:

\min \int_{0}^{\infty} F(x)^{2}\, dx
\text{s.t. } \int_{0}^{\infty} f(x)\, dx = 1, \quad \int_{0}^{\infty} x f(x)\, dx = \mu, \quad \int_{0}^{\infty} |x - \mu|^{p} f(x)\, dx = \delta   (36)

(the target functional is convex, and the constraint functionals are linear). The corresponding Lagrangian is given by:

\mathcal{L}_2(f, \lambda) = \int_{0}^{\infty} F(x)^{2}\, dx + \lambda_1 \left( \int_{0}^{\infty} f(x)\, dx - 1 \right) + \lambda_2 \left( \int_{0}^{\infty} x f(x)\, dx - \mu \right) + \lambda_3 \left( \int_{0}^{\infty} |x - \mu|^{p} f(x)\, dx - \delta \right),   (37)

where λ = (λ1, λ2, λ3) is the vector of Lagrange multipliers. Using Eqs. (26) and (30) (with (a, b) = (0, ∞)), the first variation of the Lagrangian \mathcal{L}_2(f, \lambda) is given by:

\Delta[\mathcal{L}_2(f, \lambda)](\phi) = \int_{0}^{\infty} \left[ \left( 2 \int_{0}^{x} F(u)\, du \right) + \lambda_1 + \lambda_2 x + \lambda_3 |x - \mu|^{p} \right] \phi(x)\, dx.   (38)

Equating the first variation of the Lagrangian \mathcal{L}_2(f, \lambda) to zero yields:

2 \int_{0}^{x} F(u)\, du + \lambda_1 + \lambda_2 x + \lambda_3 |x - \mu|^{p} = 0.   (39)

Differentiating both sides of Eq. (39) twice further yields:

f(x) = \frac{\lambda_3}{2}\, p (p - 1)\, |x - \mu|^{p - 2}.   (40)

The first constraint (i.e., probability normalization) implies that the support of the function f(x) must be bounded. Moreover, since the right hand side of Eq. (40) is symmetric around µ, the second constraint (mean = µ) implies that the support must also be symmetric around µ. Hence, we arrive at the probability density function f2(x) of Eq. (7). Since the optimization problem of Eq. (36) is convex, a global maximum is attained at the critical point f2(x).

A.3. Entropy-Gini maximization: Eqs. (10)–(11)

Set F(x) = \int_{x}^{\infty} f(x')\, dx' (x ≥ 0) to be the survival probability function corresponding to the probability density function f(x) (x ≥ 0).

A.3.1. The entropy-Gini maximization Problem 3

The entropy-Gini maximization Problem 3 is equivalent to the optimization problem:

\min \int_{0}^{\infty} \ln(f(x))\, f(x)\, dx
\text{s.t. } \int_{0}^{\infty} f(x)\, dx = 1, \quad \int_{0}^{\infty} x f(x)\, dx = \mu, \quad \int_{0}^{\infty} F(x)^{2}\, dx = \gamma.   (41)

The corresponding Lagrangian is given by:

\mathcal{L}_3(f, \lambda) = \int_{0}^{\infty} \ln(f(x))\, f(x)\, dx + \lambda_1 \left( \int_{0}^{\infty} f(x)\, dx - 1 \right) + \lambda_2 \left( \int_{0}^{\infty} x f(x)\, dx - \mu \right) + \lambda_3 \left( \int_{0}^{\infty} F(x)^{2}\, dx - \gamma \right),   (42)

where λ = (λ1, λ2, λ3) is the vector of Lagrange multipliers. Using Eqs. (26), (28) and (30) (with (a, b) = (0, ∞)), the first variation of the Lagrangian \mathcal{L}_3(f, \lambda) is given by:

\Delta[\mathcal{L}_3(f, \lambda)](\phi) = \int_{0}^{\infty} \left[ (1 + \ln(f(x))) + \lambda_1 + \lambda_2 x + \lambda_3 \left( 2 \int_{0}^{x} F(u)\, du \right) \right] \phi(x)\, dx.   (43)

Equating the first variation of the Lagrangian \mathcal{L}_3(f, \lambda) to zero yields:

1 + \ln(f(x)) + \lambda_1 + \lambda_2 x + 2\lambda_3 \int_{0}^{x} F(u)\, du = 0.   (44)

Differentiating Eq. (44) further yields:

\frac{f'(x)}{f(x)} + \lambda_2 + 2\lambda_3 F(x) = 0
\iff f'(x) + \lambda_2 f(x) + 2\lambda_3 F(x) f(x) = 0
\iff -F''(x) - \lambda_2 F'(x) - \lambda_3 \left( F(x)^{2} \right)' = 0
\iff F'(x) + \lambda_2 F(x) + \lambda_3 F(x)^{2} = c,   (45)

where c is a (real valued) integration constant. Thus, we arrive at the Riccati equation:

F'(x) = c - \lambda_2 F(x) - \lambda_3 F(x)^{2}.   (46)

A.3.2. The entropy-Gini maximization Problem 4

The entropy-Gini maximization Problem 4 is equivalent to the optimization problem:

\min \int_{0}^{\infty} F(x)^{2}\, dx
\text{s.t. } \int_{0}^{\infty} f(x)\, dx = 1, \quad \int_{0}^{\infty} x f(x)\, dx = \mu, \quad \int_{0}^{\infty} \ln(f(x))\, f(x)\, dx = \eta.   (47)

The corresponding Lagrangian is given by:

\mathcal{L}_4(f, \lambda) = \int_{0}^{\infty} F(x)^{2}\, dx + \lambda_1 \left( \int_{0}^{\infty} f(x)\, dx - 1 \right) + \lambda_2 \left( \int_{0}^{\infty} x f(x)\, dx - \mu \right) + \lambda_3 \left( \int_{0}^{\infty} \ln(f(x))\, f(x)\, dx - \eta \right),   (48)

where λ = (λ1, λ2, λ3) is the vector of Lagrange multipliers. Using Eqs. (26), (28) and (30) (with (a, b) = (0, ∞)), the first variation of the Lagrangian \mathcal{L}_4(f, \lambda) is given by:

\Delta[\mathcal{L}_4(f, \lambda)](\phi) = \int_{0}^{\infty} \left[ \left( 2 \int_{0}^{x} F(u)\, du \right) + \lambda_1 + \lambda_2 x + \lambda_3 (1 + \ln(f(x))) \right] \phi(x)\, dx.   (49)

Equating the first variation of the Lagrangian \mathcal{L}_4(f, \lambda) to zero yields:

2 \int_{0}^{x} F(u)\, du + \lambda_1 + \lambda_2 x + \lambda_3 (1 + \ln(f(x))) = 0.   (50)

Differentiating Eq. (50) further yields:

2 F(x) + \lambda_2 + \lambda_3 \frac{f'(x)}{f(x)} = 0
\iff 2 F(x) f(x) + \lambda_2 f(x) + \lambda_3 f'(x) = 0
\iff -\left( F(x)^{2} \right)' - \lambda_2 F'(x) - \lambda_3 F''(x) = 0
\iff F(x)^{2} + \lambda_2 F(x) + \lambda_3 F'(x) = c,   (51)

where c is a (real valued) integration constant. Thus, we arrive at the Riccati equation:

F'(x) = \frac{c}{\lambda_3} - \frac{\lambda_2}{\lambda_3} F(x) - \frac{1}{\lambda_3} F(x)^{2}.   (52)

A.3.3. Solution of the Riccati equations (46) and (52)

Both Eqs. (46) and (52) admit the Riccati form:

F'(x) = c_0 + c_1 F(x) + c_2 F(x)^{2},   (53)

where c_0, c_1 and c_2 are arbitrary real coefficients. The corresponding boundary conditions are F(0) = 1 and lim_{x→∞} F(x) = 0. Since F(x) is a survival probability function the coefficient c_0 must vanish (i.e., c_0 = 0), and the Riccati equation (53) reduces to the Bernoulli equation:

F'(x) = c_1 F(x) + c_2 F(x)^{2}.   (54)

The solution of the Bernoulli equation (54), in turn, is given by:

F(x) = \frac{1}{\sigma \exp(\rho x) + (1 - \sigma)}   (55)

(x ≥ 0), where σ and ρ are positive valued parameters. The corresponding mean is given by:

\mu = \int_{0}^{\infty} F(x)\, dx = \frac{\ln(\sigma)}{\sigma - 1} \cdot \frac{1}{\rho}.   (56)

Hence, setting ψ(σ) = ln(σ)/(σ - 1), we obtain that the parameter ρ is given by:

\rho = \frac{\psi(\sigma)}{\mu}.   (57)

Substituting Eq. (57) into Eq. (55) yields the survival probability function F3(x) of Eq. (11), which, after differentiation, further yields the probability density function f3(x) of Eq. (10).
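The closed form of Eq. (55) can be confirmed by integrating the Bernoulli equation (54) numerically. The Python sketch below (assuming NumPy; σ = 0.4 and ρ = 1.3 are illustrative, and c1 = -ρ, c2 = ρ(1 - σ) is the corresponding choice of coefficients obtained by substituting Eq. (55) into Eq. (54)) integrates the equation with a Runge-Kutta scheme from F(0) = 1 and compares the result with Eq. (55).

# Numerical sketch (assumes NumPy): integrating the Bernoulli equation (54),
# F' = c1*F + c2*F^2 with F(0) = 1, and comparing with the closed form of Eq. (55),
# F(x) = 1/(sigma*exp(rho*x) + (1 - sigma)), where c1 = -rho and c2 = rho*(1 - sigma).
import numpy as np

sigma, rho = 0.4, 1.3
c1, c2 = -rho, rho * (1.0 - sigma)

def rhs(F):
    return c1 * F + c2 * F ** 2

# Fourth-order Runge-Kutta integration of the ODE.
h, steps = 1e-3, 6000
F, xs, Fs = 1.0, [0.0], [1.0]
for i in range(steps):
    k1 = rhs(F)
    k2 = rhs(F + 0.5 * h * k1)
    k3 = rhs(F + 0.5 * h * k2)
    k4 = rhs(F + h * k3)
    F += (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    xs.append((i + 1) * h)
    Fs.append(F)

xs, Fs = np.array(xs), np.array(Fs)
closed_form = 1.0 / (sigma * np.exp(rho * xs) + (1.0 - sigma))
print("max |numerical - closed form| =", np.max(np.abs(Fs - closed_form)))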

A.3.4. The probabilistic approach: proofs

Noting that F'(x) = -f(x) and substituting Eq. (15) into Eq. (16) yields the Bernoulli equation:

F'(x) = -a F(x)^{2} - b F(x).   (58)

The corresponding boundary conditions are F(0) = 1 and lim_{x→∞} F(x) = 0. The solution of the Bernoulli equation (58) is given by:

F(x) = \frac{1}{\left( 1 + \frac{a}{b} \right) \exp(bx) - \frac{a}{b}}   (59)

(x ≥ 0), where the parameters a and b take values in the ranges a > -b and b > 0. Comparing Eqs. (11) and (59) implies that

1 + \frac{a}{b} = \sigma \quad \text{and} \quad b = \frac{\ln(\sigma)}{\mu(\sigma - 1)},   (60)

which, in turn, yields Eq. (17). Taking the limit b → 0 in Eq. (59) we obtain the Zipfian survival probability F(x) = 1/(ax + 1) (x ≥ 0; the parameter a admitting positive values).

A.4. Gini’s distance: Eqs. (19), (21) and (22)

A.4.1. The connection between Gini's index and Gini's distance: Eq. (19)

Set F(x) = \int_{x}^{\infty} f(x')\, dx' (x ≥ 0) to be the survival probability function corresponding to the probability density function f(x) (x ≥ 0). Using the definition of Gini's index (Eq. (3)) we have:

G(f) = 1 - \frac{1}{\mu} \int_{0}^{\infty} F(u)^{2}\, du = \frac{1}{\mu} \left( \mu - \int_{0}^{\infty} F(u)^{2}\, du \right)   (61)

(using the fact that \mu = \int_{0}^{\infty} F(u)\, du)

= \frac{1}{\mu} \left( \int_{0}^{\infty} F(u)\, du - \int_{0}^{\infty} F(u)^{2}\, du \right) = \frac{1}{\mu} \int_{0}^{\infty} F(u) (1 - F(u))\, du
= \frac{1}{\mu} \int_{0}^{\infty} \left( \int_{u}^{\infty} f(y)\, dy \right) \left( \int_{0}^{u} f(x)\, dx \right) du = \frac{1}{\mu} \iiint_{0 < x < u < y < \infty} f(x) f(y)\, dx\, dy\, du
= \frac{1}{\mu} \iint_{0 < x < y < \infty} (y - x) f(x) f(y)\, dx\, dy = \frac{1}{2\mu} \int_{0}^{\infty} \int_{0}^{\infty} |x - y|\, f(x) f(y)\, dx\, dy.   (62)

Representing the probability density function f(x) (x ≥ 0) as supported on the entire real line (-∞ < x < ∞) – namely, with f(x) = 0 for all x < 0 – we conclude that

G(f) = \frac{1}{2\mu} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |x - y|\, f(x) f(y)\, dx\, dy = \frac{1}{\mu} K(f).   (63)
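The identity of Eq. (63) can be illustrated by a Monte Carlo check. The Python sketch below (assuming NumPy; the Gamma law with shape 2 and scale 1.5 is an illustrative positive-valued law) estimates the Gini distance K(f) from sample pairs and compares K(f)/µ with the Gini index computed from the empirical survival function via Eq. (3).

# Numerical sketch (assumes NumPy): a Monte Carlo check of Eq. (63), G(f) = K(f)/mu,
# for an illustrative positive-valued law (a Gamma law with shape 2 and scale 1.5).
import numpy as np

rng = np.random.default_rng(1)
shape, scale, n = 2.0, 1.5, 1_000_000
mu = shape * scale
samples = rng.gamma(shape, scale, size=(2, n))          # independent pairs

K = 0.5 * np.mean(np.abs(samples[0] - samples[1]))      # Gini distance, Eq. (18)
G_from_K = K / mu                                       # Eq. (63)

# Gini index via Eq. (3), using the empirical survival function on a grid.
x = np.linspace(0.0, 40.0 * scale, 20_001)
dx = x[1] - x[0]
sorted_s = np.sort(samples[0])
F_emp = 1.0 - np.searchsorted(sorted_s, x, side="right") / n
G_direct = 1.0 - np.sum(F_emp ** 2) * dx / mu

print(f"K(f)/mu          = {G_from_K:.4f}")
print(f"Gini via Eq. (3) = {G_direct:.4f}")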


A.4.2. The Gini-distance functional: first variation

The Gini distance K(f) admits the representation

K(f) = \frac{1}{2} \int_{-\infty}^{\infty} \left( \int_{u}^{\infty} f(y)\, dy \right) \left( \int_{-\infty}^{u} f(x)\, dx \right) du   (64)

(the proof of this representation is via calculations analogous to the ones conducted in Eq. (62)). The first variation of the functional K(f) is given by

\Delta[K(f)](\phi) = \frac{1}{2} \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{x} F_{<}(u)\, du + \int_{x}^{\infty} F_{>}(u)\, du \right] \phi(x)\, dx,   (65)

where φ(x) (-∞ < x < ∞) is an arbitrary test function, and where F_{<}(u) = \int_{-\infty}^{u} f(u')\, du' and F_{>}(u) = \int_{u}^{\infty} f(u')\, du' (-∞ < u < ∞).

A.4.3. The optimization Problem 5

Consider the concave optimization Problem 5:

\max K(f)
\text{s.t. } \int_{-\infty}^{\infty} f(x)\, dx = 1, \quad \int_{-\infty}^{\infty} x f(x)\, dx = \mu, \quad \int_{-\infty}^{\infty} |x - \mu|^{p} f(x)\, dx = \delta   (66)

(the target functional is concave, and the constraint functionals are linear). The corresponding Lagrangian is given by:

\mathcal{L}_5(f, \lambda) = K(f) + \lambda_1 \left( \int_{-\infty}^{\infty} f(x)\, dx - 1 \right) + \lambda_2 \left( \int_{-\infty}^{\infty} x f(x)\, dx - \mu \right) + \lambda_3 \left( \int_{-\infty}^{\infty} |x - \mu|^{p} f(x)\, dx - \delta \right),   (67)

where λ = (λ1, λ2, λ3) is the vector of Lagrange multipliers. Using Eq. (26) (with (a, b) = (-∞, ∞)) and Eq. (65), the first variation of the Lagrangian \mathcal{L}_5(f, \lambda) is given by:

\Delta[\mathcal{L}_5(f, \lambda)](\phi) = \int_{-\infty}^{\infty} \left[ \frac{1}{2} \left( \int_{-\infty}^{x} F_{<}(u)\, du + \int_{x}^{\infty} F_{>}(u)\, du \right) + \lambda_1 + \lambda_2 x + \lambda_3 |x - \mu|^{p} \right] \phi(x)\, dx.   (68)

Equating the first variation of the Lagrangian \mathcal{L}_5(f, \lambda) to zero yields:

\frac{1}{2} \left( \int_{-\infty}^{x} F_{<}(u)\, du + \int_{x}^{\infty} F_{>}(u)\, du \right) + \lambda_1 + \lambda_2 x + \lambda_3 |x - \mu|^{p} = 0.   (69)

Differentiating both sides of Eq. (69) twice further yields:

f(x) = -\lambda_3\, p (p - 1)\, |x - \mu|^{p - 2}.   (70)

The first constraint (i.e., probability normalization) implies that the support of the function f(x) must be bounded. Moreover, since the right hand side of Eq. (70) is symmetric around µ, the second constraint (mean = µ) implies that the support must also be symmetric around µ. Hence, we arrive at the probability density function f2(x) of Eq. (20). Since the optimization problem of Eq. (66) is concave, a global maximum is attained at the critical point f2(x).

A.4.4. The optimization Problem 6

The optimization Problem 6 is equivalent to the optimization problem:

\min \int_{-\infty}^{\infty} \ln(f(x))\, f(x)\, dx
\text{s.t. } \int_{-\infty}^{\infty} f(x)\, dx = 1, \quad \int_{-\infty}^{\infty} x f(x)\, dx = \mu, \quad K(f) = \gamma.   (71)

The corresponding Lagrangian is given by:

\mathcal{L}_6(f, \lambda) = \int_{-\infty}^{\infty} \ln(f(x))\, f(x)\, dx + \lambda_1 \left( \int_{-\infty}^{\infty} f(x)\, dx - 1 \right) + \lambda_2 \left( \int_{-\infty}^{\infty} x f(x)\, dx - \mu \right) + \lambda_3 \left( K(f) - \gamma \right),   (72)

where λ = (λ1, λ2, λ3) is the vector of Lagrange multipliers. Using Eqs. (26) and (28) (with (a, b) = (-∞, ∞)) and Eq. (65), the first variation of the Lagrangian \mathcal{L}_6(f, \lambda) is given by:

\Delta[\mathcal{L}_6(f, \lambda)](\phi) = \int_{-\infty}^{\infty} \left[ (1 + \ln(f(x))) + \lambda_1 + \lambda_2 x + \frac{\lambda_3}{2} \left( \int_{-\infty}^{x} F_{<}(u)\, du + \int_{x}^{\infty} F_{>}(u)\, du \right) \right] \phi(x)\, dx.   (73)

Equating the first variation of the Lagrangian \mathcal{L}_6(f, \lambda) to zero yields:

1 + \ln(f(x)) + \lambda_1 + \lambda_2 x + \frac{\lambda_3}{2} \left( \int_{-\infty}^{x} F_{<}(u)\, du + \int_{x}^{\infty} F_{>}(u)\, du \right) = 0.   (74)

Differentiating Eq. (74) further yields:

\frac{f'(x)}{f(x)} + \lambda_2 + \frac{\lambda_3}{2} \left( F_{<}(x) - F_{>}(x) \right) = 0
\iff 2 f'(x) + 2\lambda_2 f(x) + \lambda_3 (1 - 2 F_{>}(x)) f(x) = 0
\iff -2 F_{>}''(x) - (2\lambda_2 + \lambda_3) F_{>}'(x) + \lambda_3 \left( F_{>}(x)^{2} \right)' = 0
\iff -2 F_{>}'(x) - (2\lambda_2 + \lambda_3) F_{>}(x) + \lambda_3 F_{>}(x)^{2} = -c,   (75)

where c is a (real valued) integration constant. Thus, we arrive at the Riccati equation:

F_{>}'(x) = \frac{c}{2} - \frac{2\lambda_2 + \lambda_3}{2} F_{>}(x) + \frac{\lambda_3}{2} F_{>}(x)^{2}.   (76)

A.4.5. The optimization Problem 7

Consider the optimization Problem 7:

\max K(f)
\text{s.t. } \int_{-\infty}^{\infty} f(x)\, dx = 1, \quad \int_{-\infty}^{\infty} x f(x)\, dx = \mu, \quad \int_{-\infty}^{\infty} \ln(f(x))\, f(x)\, dx = \eta.   (77)

The corresponding Lagrangian is given by:

\mathcal{L}_7(f, \lambda) = K(f) + \lambda_1 \left( \int_{-\infty}^{\infty} f(x)\, dx - 1 \right) + \lambda_2 \left( \int_{-\infty}^{\infty} x f(x)\, dx - \mu \right) + \lambda_3 \left( \int_{-\infty}^{\infty} \ln(f(x))\, f(x)\, dx - \eta \right),   (78)

where λ = (λ1, λ2, λ3) is the vector of Lagrange multipliers. Using Eqs. (26) and (28) (with (a, b) = (-∞, ∞)) and Eq. (65), the first variation of the Lagrangian \mathcal{L}_7(f, \lambda) is given by:

\Delta[\mathcal{L}_7(f, \lambda)](\phi) = \int_{-\infty}^{\infty} \left[ \frac{1}{2} \left( \int_{-\infty}^{x} F_{<}(u)\, du + \int_{x}^{\infty} F_{>}(u)\, du \right) + \lambda_1 + \lambda_2 x + \lambda_3 (1 + \ln(f(x))) \right] \phi(x)\, dx.   (79)

Equating the first variation of the Lagrangian \mathcal{L}_7(f, \lambda) to zero yields:

\frac{1}{2} \left( \int_{-\infty}^{x} F_{<}(u)\, du + \int_{x}^{\infty} F_{>}(u)\, du \right) + \lambda_1 + \lambda_2 x + \lambda_3 (1 + \ln(f(x))) = 0.   (80)

Differentiating Eq. (80) further yields:

\frac{1}{2} \left( F_{<}(x) - F_{>}(x) \right) + \lambda_2 + \lambda_3 \frac{f'(x)}{f(x)} = 0
\iff (1 - 2 F_{>}(x)) f(x) + 2\lambda_2 f(x) + 2\lambda_3 f'(x) = 0
\iff \left( F_{>}(x)^{2} \right)' - (1 + 2\lambda_2) F_{>}'(x) - 2\lambda_3 F_{>}''(x) = 0
\iff F_{>}(x)^{2} - (1 + 2\lambda_2) F_{>}(x) - 2\lambda_3 F_{>}'(x) = -c,   (81)

where c is a (real valued) integration constant. Thus, we arrive at the Riccati equation:

F_{>}'(x) = \frac{c}{2\lambda_3} - \frac{1 + 2\lambda_2}{2\lambda_3} F_{>}(x) + \frac{1}{2\lambda_3} F_{>}(x)^{2}.   (82)

A.4.6. Solution of the Riccati equations (76) and (82)

Both Eqs. (76) and (82) admit the Riccati form:

F_{>}'(x) = c_0 + c_1 F_{>}(x) + c_2 F_{>}(x)^{2},   (83)

where c_0, c_1 and c_2 are arbitrary real coefficients. The corresponding boundary conditions are lim_{x→-∞} F_{>}(x) = 1 and lim_{x→∞} F_{>}(x) = 0. Since F_{>}(x) is a survival probability function the coefficient c_0 must vanish (i.e., c_0 = 0), and the Riccati equation (83) reduces to the Bernoulli equation:

F_{>}'(x) = c_1 F_{>}(x) + c_2 F_{>}(x)^{2}.   (84)

The solution of the Bernoulli equation (84), in turn, is given by:

F_{>}(x) = \frac{1}{1 + a \exp(bx)}   (85)

(-∞ < x < ∞), where a and b are positive parameters. The corresponding mean is given by:

\mu = \int_{0}^{\infty} F_{>}(x)\, dx - \int_{-\infty}^{0} \left( 1 - F_{>}(x) \right) dx = \frac{1}{b} \ln\left( \frac{1}{a} \right).   (86)

Hence, setting σ = 1/b and substituting a = exp(-µ/σ) into Eq. (85) yields the survival probability function F4(x) of Eq. (22), which, after differentiation, further yields the probability density function f4(x) of Eq. (21).
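The mean identity underlying Eq. (86) can be checked numerically. The Python sketch below (assuming NumPy; a = 0.3 and b = 1.4 are illustrative) evaluates the two half-line integrals of the survival function F>(x) = 1/(1 + a exp(bx)) and compares their difference with (1/b) ln(1/a).

# Numerical sketch (assumes NumPy): the mean identity behind Eq. (86) for the survival
# function F>(x) = 1/(1 + a*exp(b*x)) of Eq. (85), with illustrative a = 0.3 and b = 1.4.
import numpy as np

a, b = 0.3, 1.4

xp = np.linspace(0.0, 60.0, 1_000_001)        # positive half-line (truncated)
xn = np.linspace(-60.0, 0.0, 1_000_001)       # negative half-line (truncated)
dxp, dxn = xp[1] - xp[0], xn[1] - xn[0]

F_pos = 1.0 / (1.0 + a * np.exp(b * xp))
F_neg = 1.0 / (1.0 + a * np.exp(b * xn))

mean_numeric = np.sum(F_pos) * dxp - np.sum(1.0 - F_neg) * dxn
print(f"numerical mean  = {mean_numeric:.5f}")
print(f"(1/b)*ln(1/a)   = {np.log(1.0 / a) / b:.5f}")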

References

[1] C.E. Shannon, W. Weaver, The Mathematical Theory of Communication, University of Illinois Press, Urbana, 1949.
[2] E.T. Jaynes, Phys. Rev. 106 (1957) 620.
[3] C. Gini, Econom. J. 31 (1921) 124.
[4] R.G. Abraham, S. van den Bergh, P. Nair, Astrophys. J. 558 (2003) 218.
[5] P.P. Graczyk, J. Med. Chem. 50 (2007) 5773.
[6] L. Wittebolle, et al., Nature 458 (2009) 623; S. Naeem, Nature 458 (2009) 579.
[7] N. Sazuka, J. Inoue, Physica A 383 (2007) 49; N. Sazuka, J. Inoue, E. Scalas, Physica A 388 (2009) 2839.
[8] D.K. Foley, J. Econom. Theory 62 (1994) 321.
[9] T.M. Cover, Elements of Information Theory, 2nd ed., Wiley, New York, 2006.
[10] M.F. Subbotin, Mat. Sb. 31 (1923) 296.
[11] S. Alfarano, M. Milakovic, Econom. Lett. 101 (2008) 272 (see also: http://econstor.eu/bitstream/10419/22055/1/EWP-2008-10.pdf).
[12] S.M. Ross, Introduction to Probability Models, 8th ed., Academic Press, Boston, 2002.
[13] E. Barlow, F. Proschan, Mathematical Theory of Reliability, reprint ed., in: Classics in Applied Mathematics, vol. 17, Society for Industrial and Applied Mathematics, 1996.
[14] N. Balakrishnan, Handbook of the Logistic Distribution, Marcel Dekker, New York, 1992.